Блог пользователя MikeMirzayanov

Автор MikeMirzayanov, 5 лет назад, По-английски

Hello, Codeforces.

I hope you wash your hands and feel great.

I added the support of 64-bit g++. If you are using Windows, you can easily install it via our minimalistic package manager PBOX running the command line pbox install msys2-mingw64-9.

Your solutions are compiled with the command line g++ -static -DONLINE_JUDGE -Wl,--stack=268435456 -O2 -std=c++17 program.cpp.

Now you can try to use int128 and other 64-bit specific features! In fact, I am slightly worried that the presence of such features may widen the gap between C ++ and other languages. Wait and see.

Currently, support for 64-bit C++ is experimental. For example, I would not be surprised if IO on it works slower in some cases (it is necessary to test!). I invite you to join the testing and experimentations. Share your impressions in the comments!

  • Проголосовать: нравится
  • +2511
  • Проголосовать: не нравится

»
5 лет назад, # |
  Проголосовать: нравится +651 Проголосовать: не нравится

I see quarantine is doing its magic. :)

»
5 лет назад, # |
  Проголосовать: нравится +507 Проголосовать: не нравится

I would donate for this.

  • »
    »
    5 лет назад, # ^ |
      Проголосовать: нравится +1 Проголосовать: не нравится

    Agree, if Mike would start a patreon for this, the goal would be achieved in no time

»
5 лет назад, # |
  Проголосовать: нравится +10 Проголосовать: не нравится

i would pray for this

»
5 лет назад, # |
  Проголосовать: нравится +9 Проголосовать: не нравится

how to initialize int128?

»
5 лет назад, # |
Rev. 2   Проголосовать: нравится +285 Проголосовать: не нравится

What's something to say when starting a blog or getting a handjob?

Mike: I hope you wash your hands and feel great.

»
5 лет назад, # |
  Проголосовать: нравится -15 Проголосовать: не нравится

Finally! I can't wait to try this out.

Thanks Mike

»
5 лет назад, # |
  Проголосовать: нравится +83 Проголосовать: не нравится

FYI, Rust supports int128. I once tried to learn it, and... now I'm washing my hands.

»
5 лет назад, # |
  Проголосовать: нравится -124 Проголосовать: не нравится

"I am slightly worried that the presence of such features may widen the gap between C ++ and other languages. Wait and see."

It'd anyway compensate if time multipliers would be considered for slower languages. (Feature Request :P)

  • »
    »
    5 лет назад, # ^ |
      Проголосовать: нравится +18 Проголосовать: не нравится

    The Church of C strikes out with the infamous downvote :P

  • »
    »
    5 лет назад, # ^ |
    Rev. 2   Проголосовать: нравится +19 Проголосовать: не нравится

    I'm always stunned by how much people dislike the idea of time multipliers. It decreases the gap between languages. I often want to give bigger TL in div2-ABC for Python, and sometimes slightly bigger TL for Java in hard problems.

    Time multipliers can help to maximize the ratio slowestIntendedSolution / fastestNaiveSolution (assuming that running time is divided by time multiplier).

    • »
      »
      »
      5 лет назад, # ^ |
        Проголосовать: нравится +71 Проголосовать: не нравится

      This reminds me of a local contest we hosted on HackerRank, the expected solution for the hardest problem (related to strings) was O(nlog(n)), but we had several Bruteforce (O(n^2)) solutions accepted due to the large multiplier for python language (x10) on HackerRank.

      And it turns out that python is not so slow with strings.

      • »
        »
        »
        »
        5 лет назад, # ^ |
          Проголосовать: нравится +12 Проголосовать: не нравится

        This reminds me of thousands of times when people can't solve Codeforces problems in Python.

        Yes, organizers need to be careful with multipliers. It's stupid to give Python a x10 multipier. Same for Java, some simple operations can be equally fast as in C++.

        • »
          »
          »
          »
          »
          5 лет назад, # ^ |
            Проголосовать: нравится +9 Проголосовать: не нравится

          I wasn't taking any side. I am not some learned person to wisely put forward my views on this matter.

          It was just some incident that I thought might be worth noting in this debate.

          Sure there must be workarounds or ways to handle this situation, but we didn't consider this as a possible problem before setting up.

          • »
            »
            »
            »
            »
            »
            5 лет назад, # ^ |
            Rev. 7   Проголосовать: нравится +7 Проголосовать: не нравится

            You bring up a valid point, that a constant multiplier can be too blunt: the performance gap between Python and C++ isn't constant, so a constant multiplier in some cases would be too much, and in other cases too little — i.e. you wouldn't be able to write an accepted solution in Python with a multiplier that's too low.

            Personally I love using Python, and I wish solving this problem was simpler.

            One possible solution is to ensure that each problem has an acceptable optimal solution in Python, but if that requires a multiplier, also ensure any suboptimal solution won't pass. Obviously that's a lot of work for the setter, and very hard to get right: a Python expert may be able to write a faster naive solution than the setter's fastest naive solution, and then get AC with the multiplier. Even if the setter spends all that effort and gets it exactly right for Python, competitors will complain that their favorite dynamic language doesn't get the same treatment, so the setter will have to spend the same effort on all 20 supported languages, which is simply unrealistic.

            So as much as I love Python, I must agree with riadwaw below that the most practical solution is to standardize on C++. Then the problem setters only have to check the above (optimal solution passes, suboptimal solution TLEs) for 1 language instead of 20.

            We should also mention that it's possible for a language to be so fast that a naive solution would pass. In practice if you make sure that doesn't happen for C++, then you are pretty safe for all other languages since they tend to be slower than C++ across the board. That feature of C++ is a big reason to standardize on it.

        • »
          »
          »
          »
          »
          5 лет назад, # ^ |
            Проголосовать: нравится +58 Проголосовать: не нравится

          I think any constant multiplier will be bad, because in some problems it's possible to write solution that will spend most time in the compiled arts written in C/C++, which will take time close to time in C++.

          This leaves with the option to set the TLs manually, but that requires authors to write solutions in 20 languages and they better know them well, not to leave these loopholes.

          I think that leaving hard problems c++-only is lesser of evil. (And for easier problems you can often just reduce the constraints anyway)

          • »
            »
            »
            »
            »
            »
            5 лет назад, # ^ |
              Проголосовать: нравится +20 Проголосовать: не нравится

            It just happens a lot that organizers have solutions in Python for easy problems and in Java for hard problems and they see that slightly different TL for a different language would be better. Time multipliers shouldn't be default for sure.

            in some problems it's possible to write solution that will spend most time in the compiled arts written in C/C++, which will take time close to time in C++.

            I agree that this is dangerous and a good argument to avoid time multipliers.

        • »
          »
          »
          »
          »
          5 лет назад, # ^ |
            Проголосовать: нравится +76 Проголосовать: не нравится

          After all, "do something so that our computer can answer these questions(tests) in 2s each" sounds reasonable, while "do something so that our computer can answer these questions(tests) in 2s each, but if you know only python, 10s is fine" is more strange

        • »
          »
          »
          »
          »
          5 лет назад, # ^ |
            Проголосовать: нравится 0 Проголосовать: не нравится

          Yeah, that's why it's wrong to give Java x2 multiplier!

    • »
      »
      »
      5 лет назад, # ^ |
        Проголосовать: нравится +36 Проголосовать: не нравится

      Yes, but languages has some dignities and some shortcomings, and language speed it's one of shortcomings, you can choose other language if you do not like this.

      • »
        »
        »
        »
        5 лет назад, # ^ |
          Проголосовать: нравится +1 Проголосовать: не нравится

        A lot of programmers know Python and they are discouraged from participating in Codeforces because TL is adjusted to C++ and Java. Are you sure that "learn a new language" is the best solution here?

        • »
          »
          »
          »
          »
          5 лет назад, # ^ |
          Rev. 3   Проголосовать: нравится +13 Проголосовать: не нравится

          Hmmmm, what if I know only C++, but python has some functions what i need and C++ has not it? Codeforces must import this functions in C++? No, because it is shortcomings of C++ and I can choose other language if I do not like this.

          • »
            »
            »
            »
            »
            »
            5 лет назад, # ^ |
              Проголосовать: нравится +20 Проголосовать: не нравится

            C++ is anyway better suited than Python (for CP). You're saying that we can't make two languages equivalent so we shouldn't decrease the gap even if it's easy. I strongly disagree. Making Python viable in div2 would greatly broaden the competitive programming community in the long run.

            • »
              »
              »
              »
              »
              »
              »
              5 лет назад, # ^ |
              Rev. 2   Проголосовать: нравится 0 Проголосовать: не нравится

              You're saying that we can't make two languages equivalent so we shouldn't decrease the gap even if it's easy.

              The problem is that it's not easy.

              You need to get the multiplier exactly right, so the optimal solution passes but all suboptimal solutions TLE. That's very hard to do even just in Python, and you'll have to be a Python expert to know how to write the fastest naive code. Now multiply by the 20 languages Codeforces supports and it's obviously impractical.

              Even just for Python it may be impractical. Because of extreme differences between certain optimal and suboptimal techniques. For some problems, I may be able to use techniques that rely heavily on CPython's C routines and thus approach C performance, and with the multiplier I will AC naive solutions, but you will not be able to remove the multiplier, because if someone doesn't know these specific techniques they won't be able to get an optimal solution to AC without it.

              • »
                »
                »
                »
                »
                »
                »
                »
                5 лет назад, # ^ |
                  Проголосовать: нравится 0 Проголосовать: не нравится

                No, you don't need to get the multiplier exactly right. Almost always it's true that naive C++ solution is faster than naive Python solution, and intended C++ is faster than intended Python. Then multiplier of 1.001 for Python always improves the ratio slowestIntended/fastestNaive. And almost always the multiplier of 1.25 will be good too, sometimes even 2 or 5.

                Even if the setter spends all that effort and gets it exactly right for Python, competitors will complain that their favorite dynamic language doesn't get the same treatment, so the setter will have to spend the same effort on all 20 supported languages, which is simply unrealistic.

                (this is quote from your other comment above)
                Helping users of just Python is much better than not helping at all. Other languages are screwed in both cases.

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  5 лет назад, # ^ |
                  Rev. 5   Проголосовать: нравится +4 Проголосовать: не нравится

                  And almost always the multiplier of 1.25 will be good too, sometimes even 2 or 5.

                  That right there is the problem:

                  A multiplier of 1.25 sounds nice and harmless. Indeed, you could probably apply a 1.25 constant multiplier safely right now. Why? Because it does almost nothing.

                  Python is generally slower than C++ by a factor that's far larger than 1.25. So 1.25 won't do much harm (you won't be able to cheese with it). It also won't do much good. If a problem was unsolvable in Python, a multiplier of 1.25 would almost never help you.

                  A multiplier that would actually make a difference would be closer to 5, which I believe is the multiplier on CodeChef. But then of course you are risking cheese if I can delegate most of the work to C routines.

                  I'm personally all for improving Python usability on Codeforces, but we should be aware of the potential issues. Incidentally, there's a 3rd-party project to do that, PyRival.

                  The best solution an OJ could implement would be to compile all solutions to the same Intermediate Representation (IR) that runs on an identical virtual machine. You can then calculate the exact computational cost of each solution, in terms of primitive operations. You could also just run it and expect the running time to be equivalent.

                  A second fundamentally correct approach would be to compute the actual time complexity of the solution by running it through multiple input sizes and calculating the running time growth curve.

                  The first approach is a pretty big project, but its result will be a drop-in replacement for the various platforms currently used by all OJs. The second one is a bit more realistic, but of course requires some more work from the setters and will not immediately work on existing test suites.

                  Notice that by standardizing on C++, as the CP community has effectively done, we get all the benefits of the first approach without the cost of an ambitious and difficult VM project. That's why most people are pushing to keep it as the status quo.

                  My guess is that we either stay in this current status quo, or we start applying multipliers to Python, and then we will have cases when people cheese naive solutions by delegating to C routines. Personally, I'm sympathetic to the argument that this is a small price to pay for the great benefit of making CP more widely accessible and inviting to newbies. Perhaps the best realistic approach would be to apply multipliers in div2 but not in the more competitive divisions and contests.

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  5 лет назад, # ^ |
                    Проголосовать: нравится 0 Проголосовать: не нравится

                  What I want as a setter is an ability to choose a different TL for Java and Python sometimes. It doesn't matter what be be a good constant multiplier, my examples with numbers were there to show that we don't need to get the number exactly right.

            • »
              »
              »
              »
              »
              »
              »
              5 лет назад, # ^ |
                Проголосовать: нравится +3 Проголосовать: не нравится

              I agree to that, I have good experience with Python because I use it for development hence I use it for CP as well as it's easy for me to just start writing code in python without even thinking much (Time is of utmost importance here, right? :p).
              It is almost like a third language to me after Hindi and English. It took me a long time to get to this comfort zone, learning another language to this fluency will take up a long time for me.

              I believe people like me who don't wanna learn a whole new language because they enjoy doing CP for fun in their native language, multipliers should be taken care of at least for div2 because that's where we are most of the time.

»
5 лет назад, # |
Rev. 3   Проголосовать: нравится +65 Проголосовать: не нравится

With this improvement, we can now cheese https://codeforces.me/contest/1305/problem/F with a fast factoring implementation (taken from KACTL + Montgomery reduction):

https://codeforces.me/contest/1305/submission/73826085

Thanks Mike!

»
5 лет назад, # |
Rev. 2   Проголосовать: нравится +65 Проголосовать: не нравится

time to

#define int long long

in my template

»
5 лет назад, # |
  Проголосовать: нравится +55 Проголосовать: не нравится

Wow! Now we need more contests on the quarantine please.

»
5 лет назад, # |
Rev. 2   Проголосовать: нравится +16 Проголосовать: не нравится

Main question: how do 64-bit instructions perform in this mode? Are they still much slower than on 64-bit systems?

Second attempt: 73831074, judgement failed — system failed to run file. Dunno why, but I get the feeling that's not supposed to happen.

  • »
    »
    5 лет назад, # ^ |
      Проголосовать: нравится -8 Проголосовать: не нравится

    I also have a similar problem

    73994883 and 73994843 are absolutely the same, but C++17 (64) gives WA, while C++17 (32) passes.

    • »
      »
      »
      5 лет назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      That could be undefined behaviour. Either that or a compiler bug, but it makes sense for UB to happen only in one version.

    • »
      »
      »
      5 лет назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      O_o

      const int N = 502;
      ...
      int16_t dp[N][N][N];
      ...
      for (int len = 1; len <= 10000; ++len)
           checkmax(dp[i][j][len], dp[i][j][len - 1]);
      
      • »
        »
        »
        »
        5 лет назад, # ^ |
          Проголосовать: нравится 0 Проголосовать: не нравится

        You obviously miss the point. In each for-loop under it I check i + len <= n and j + len <= m.

        • »
          »
          »
          »
          »
          5 лет назад, # ^ |
            Проголосовать: нравится 0 Проголосовать: не нравится

          Ah yes. By default skipped these lines "there is a usual for"
          So, this is really strange behaviour

    • »
      »
      »
      5 лет назад, # ^ |
        Проголосовать: нравится +14 Проголосовать: не нравится

      You use the value of R.p without initializing it, which is undefined behavior.

»
5 лет назад, # |
  Проголосовать: нравится +21 Проголосовать: не нравится

This is the best thing happen in 2020

»
5 лет назад, # |
Rev. 2   Проголосовать: нравится +62 Проголосовать: не нравится

Imagine how wide the gap between C++ and other languages will be if we have compilers that are not obsoleted (eg 19.24 for MSVC and 9.0 for clang) with C++2a turned on...

»
5 лет назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Now I would definitely not meet my ancestor because long long isn't enabled. :)

»
5 лет назад, # |
  Проголосовать: нравится +5 Проголосовать: не нравится

Can someone please explain ,what this 64 bit g++ means.

Under which computer science topic ,does this knowledge come. (presently ,i only know how to code in c++,and topics like OS, COA basics are undergoing in my college)

I want to know what is this ,so that i can join this experimentation and testing.

  • »
    »
    5 лет назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    The topics are CPU/low-level programming and OS.

  • »
    »
    5 лет назад, # ^ |
    Rev. 3   Проголосовать: нравится 0 Проголосовать: не нравится

    The relevant topics are probably processor architectures and other low level stuff. 32 bit and 64 bit refer to the word size of the processor.

»
5 лет назад, # |
  Проголосовать: нравится +19 Проголосовать: не нравится

well done, what goal for dark mode?

»
5 лет назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

xalanq, I would be super happy to see that option in cf-tool Thanks for the work!

»
5 лет назад, # |
  Проголосовать: нравится +16 Проголосовать: не нравится

Just curious, but will this support available in Polygon as well?
Speaking of which, I'd be glad if Polygon supports PyPy, since its behavior compared to Python is completely different (other than just "significantly-faster") and Python solutions should be tested equally on both when preparing problems. ;)

»
5 лет назад, # |
Rev. 2   Проголосовать: нравится 0 Проголосовать: не нравится

Can anyone tell me how to add

<boost/multiprecision/cpp_int.hpp> and

library for big integers on mac(c++) ?

P.s. I am currently using sublime for coding with olympic fast coding add on.

  • »
    »
    5 лет назад, # ^ |
    Rev. 2   Проголосовать: нравится +3 Проголосовать: не нравится

    Boost.multiprecision is a header-only library so brew install boost should place the required headers in boost subdirectory under /usr/include.

    Alternatively, you can download the headers from Boost.Multiprecision's GitHub repository and place it wherever you want but make sure to add the path of the downloaded directory or use -I compiler flag instead.

  • »
    »
    5 лет назад, # ^ |
      Проголосовать: нравится +11 Проголосовать: не нравится

    You can't use boost or other third party libraries on Codeforces or any other competitive programming site that I know of though.

»
5 лет назад, # |
Rev. 4   Проголосовать: нравится -34 Проголосовать: не нравится

...

»
5 лет назад, # |
Rev. 2   Проголосовать: нравится 0 Проголосовать: не нравится

Now I can use #define int __int128 ????

  • »
    »
    5 лет назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    Note that you have to implement I/O by your own, because __int128 is supported by neither cin/cout nor scanf/printf.

    • »
      »
      »
      5 лет назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      And how exactly to do that? Can you give a simple example?

      • »
        »
        »
        »
        5 лет назад, # ^ |
          Проголосовать: нравится 0 Проголосовать: не нравится

        If you've ever used fast read-in, you may feel familiar with this:

        __int128 read()
        {
            __int128 ans = 0;
            int sgn = 1;
            char c = getchar();
            while (!isdigit(c))
            {
                if (c == '-')
                    sgn *= -1;
                c = getchar();
            }
            while (isdigit(c))
            {
                ans = ans * 10 + c - '0';
                c = getchar();
            }
            return ans * sgn;
        }
        

        As for output:

        void print(__int128 x)
        {
            if (x > 9)
                print(x / 10);
            putchar(x % 10 + '0');
        }
        
»
5 лет назад, # |
Rev. 2   Проголосовать: нравится +25 Проголосовать: не нравится

YES YES YES! Finally convex hull trick isn't obnoxious to code! Finally we can factor large integers! Best feature ever!

»
5 лет назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

What about for the Linux users? :)

  • »
    »
    5 лет назад, # ^ |
      Проголосовать: нравится +5 Проголосовать: не нравится

    Just install g++, your system already supports it. Windows users have to resort to Cygwin or MinGW to get GCC because GCC primarily targets Linux.

    You could use alternative compilers such as MSVC, but then the resulting binary might be significantly different between your local machine and the judge.

»
5 лет назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

On Linux, gdc is shipped with gcc9. Does gdc work in given msys2 distribution? If it does, it is possible to consider adding this alternative DLang compiler on Codeforces in a separate slot?

»
5 лет назад, # |
  Проголосовать: нравится +5 Проголосовать: не нравится

** Solution to include bits/stdc++.h in visual c++ **

Because I really suffered from remembering all libraries you can add #include <bits/stdc++.h> which includes all libraries

  • This is just a method i used to make me able to include bits/stdc++.h
  • in visual c++.
  • for those had minGW installed on PC :
  • C:\MinGW\mingw32\lib\gcc\mingw32\4.8.1\include\c++\mingw32\bits
  • copy this folder and then go to this adress
  • C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include
  • paste your folder. go to your visual studio project type bits you will see
  • the auto-complete for the library and then choose stdc++.h
  • for those don't have minGW:
  • you should write your own header file and include all libraries in it then
  • go to C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include
  • make new folder name it "bits" and name the header file stdc++.h
  • then paste it in "bits" folder.
  • Hope this helps you!
  • Happy coding
  • »
    »
    5 лет назад, # ^ |
      Проголосовать: нравится +6 Проголосовать: не нравится

    Alternative: don't use bits/stdc++.h because it takes way too long to compile, just use a few common includes.

»
5 лет назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

Finally! Thanks, Mike

»
5 лет назад, # |
Rev. 3   Проголосовать: нравится -23 Проголосовать: не нравится

emmmm,Sorry

  • »
    »
    5 лет назад, # ^ |
      Проголосовать: нравится +1 Проголосовать: не нравится

    Butthurt with other languages' fancy libraries?
    Keep in mind that referring to other libraries and calling methods from class will be slightly slower, not to mention big integer arithmetic by strings cannot match actual ALU-in-depth arithmetic support. That explains how BigInteger (Java) and Python int with large absolute value is significantly slower when used.

    • »
      »
      »
      5 лет назад, # ^ |
        Проголосовать: нравится 0 Проголосовать: не нравится

      Sorry Sir,I am too excited with int128's appearance in codeforce ,I think I should apologize for every pyer and javaer~ Sorry bros!

»
5 лет назад, # |
  Проголосовать: нравится +1 Проголосовать: не нравится

Thanks MikeMirzayanov for the kind correspondence and for this good news.

P.S. Please consider fixing the following typo in the first prompt of the pbox command

PBOX is abscent or not the last release. Downloading...

It should be written in English as

PBOX is absent or not the last release. Downloading...

»
5 лет назад, # |
Rev. 2   Проголосовать: нравится 0 Проголосовать: не нравится

Is scanf("%I64d", &a); also available while reading a 64-bit integer in this mode, or we have to change it?

  • »
    »
    5 лет назад, # ^ |
    Rev. 4   Проголосовать: нравится 0 Проголосовать: не нравится

    Backward compatibility is often among the features that software developers and engineers aim to maintain when introducing new versions of computer language compilers.

    Check the following performance comparison using the same code:

    1. GNU C++17 9.2.0 (64-bit) 30 msec and 12 KB 74014633
    2. GNU C++17 7.3.0 (32-bit) 31 msec and 8 KB 74017705
»
5 лет назад, # |
  Проголосовать: нравится +34 Проголосовать: не нравится
#include <bits/stdc++.h>
using namespace std;
const int nax = 2e5;
using ll = long long;
bitset <nax>b;
int main() {
	ll ans = 0;
	int p = 0;
	for (int i = 0; i < nax; ++i) {
		b[p] = 1;
		p += 99827;
		if (p >= nax) p -= nax;
		ans += b.count();
	}
	printf("%lld\n", ans);
}

I tried running following code in custom test. Results:

gnu g++17 (64bit):

20000100000
=====
Used: 2807 ms, 20 KB

gnu g++17 (32bit):

20000100000
=====
Used: 2448 ms, 28 KB

Shouldn't bitsets be faster on 64bit machines?

  • »
    »
    5 лет назад, # ^ |
    Rev. 3   Проголосовать: нравится -30 Проголосовать: не нравится

    Anything using bitset is profoundly slow. Any container in STL is profoundly slow, actually, but bitset is one of the slowest containers. I stopped using it after getting TLE on several problems from using the STL.

    If it's really necessary for an 8x memory optimization, it's possible to write a bitset by hand, with an array of uint8_t.

    For reference, this code

    #include <bits/stdc++.h>
    using namespace std;
    const int nax = 2e5;
    using ll = long long;
    uint8_t b[(nax+7)>>3];
    int main() {
    	ll ans = 0;
    	int p = 0, bc = 0;
    	for (int i = 0; i < nax; ++i) {
    	    bc += !(bool)(b[p>>3]&(1<<(p&7)));
    		b[p>>3] |= 1<<(p&7);
    		p += 99827;
    		if (p >= nax) p -= nax;
    		ans += bc;
    	}
    	printf("%lld\n", ans);
    }
    

    uses the same amount of memory and runs in 0ms.

    20000100000
    
    =====
    Used: 0 ms, 20 KB
    
  • »
    »
    5 лет назад, # ^ |
    Rev. 14   Проголосовать: нравится -10 Проголосовать: не нравится

    The following implementation of your program using vector<uint64_t> is much faster.

    • »
      »
      »
      5 лет назад, # ^ |
        Проголосовать: нравится +11 Проголосовать: не нравится

      Yes. I do know it is possible to implement bitset in a way, that count works in constant time (and by the way slow down all other operations by some constant). It doesn't matter here. I intentionally implemented something in $$$O(n^2)$$$ to measure time on 32-bit and 64-bit machine.

      • »
        »
        »
        »
        5 лет назад, # ^ |
        Rev. 4   Проголосовать: нравится -10 Проголосовать: не нравится

        The following update evaluates the performance when bit_set::count() uses __builtin_popcountll() to count the number of ones in the vector 64-bit integer words instead of updating the number of ones in each bit set/reset/flip function call.

»
5 лет назад, # |
  Проголосовать: нравится -6 Проголосовать: не нравится

Today I experienced an extraordinary submission. I ran 74250513 on my machine, it worked in ~5 seconds on the largest input. After I submitted it, it showed me TLE. Tested it on the custom test, it showed me 11 seconds. Previously I remember always CF was faster than my machine.

Several minutes later, I tried to submit my code with the new compiler (74251728) and boom! It worked in 4.8 seconds (~3 times faster). I didn't use __int128 or any other features of the 64-bit compiler. Maybe it's because of the new GCC optimizations.

  • »
    »
    5 лет назад, # ^ |
      Проголосовать: нравится +10 Проголосовать: не нравится

    Your inner loop in the NTT uses long longs to do modmul

    (ll) w * a[i + j + k] % p;
    

    That is probably the reason you get a speed up from the 64 bit version.

»
5 лет назад, # |
  Проголосовать: нравится +8 Проголосовать: не нравится

Compare these two submissions: 78189783 78187029

The differences between these submissions are the compiler and removing multiple comments (which will not affect the outcome).

The one that compiled with the 64-bit compiler received TLE, but the other one received runtime error.(and the main problem with code is array size)

Is it ok?

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

In this problem I got compilation error on this submission. It says that

Compilation Error

Why this happened? Can anyone please explain? Thank you.

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится +3 Проголосовать: не нравится

    If you want to use __int128 (and other 64-bit features), you should choose "GNU C++17 9.2.0 (64 bit, msys 2)" as a compiler when you submit, not "GNU C++17 7.3.0" or other options (those are 32-bit versions).

»
4 года назад, # |
Rev. 2   Проголосовать: нравится 0 Проголосовать: не нравится

MikeMirzayanov, Can CodeForces add support for other C++ compilers like LLVM's Clang 10.0?

The reason being that sometimes because of a bug in GCC, code fails to compiler and MSVC is trash when it comes to compiler optimizations.

86342066 AC (MSVC)

86342552 G++14 (Can't compile)

86342114 G++17(64) (Can't compile)

Moreover, I think compiler memory should be increased as compiler failed to allocate memory here: 86342465

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    Moreover, I think compiler memory should be increased as compiler failed to allocate memory here: 86342465

    Rather you should work on reducing compile time MLE.

    Hint
»
4 года назад, # |
Rev. 6   Проголосовать: нравится -11 Проголосовать: не нравится

I have a question regarding the function sqrt; it does not support 128 values (submission https://codeforces.me/contest/325/submission/104269483)

here lll = __int128;

lll b=8*nn+(a-3)*(a-3);
     lll c=sqrt(b);

It gives a compilation error as follow:

program.cpp:53:22: error: call of overloaded 'sqrt(__int128&)' is ambiguous 53 | lll c=sqrt(b);

And it works after converting b into (long double)b but I think it loses precision?

  • »
    »
    3 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    For using __int128 i think we can't use any pre-defined functions. You have to write your own square root function.

»
4 года назад, # |
  Проголосовать: нравится +1 Проголосовать: не нравится

I was trying to use int128 but my compiler shows me the problem — " ISO c++ does not support 'int128' [-Wpedantic]

ide — farmanager -std=c++14 -pendatic (rest are some common flags)

Can you please Help me out ??