Блог пользователя majk

Автор majk, 6 лет назад, По-английски

I realised I have too much contribution anyway, so I write this blog about G of today's contest.

This problem was plagued by technical issues from start to finish. When you read some of them, you will probably ask yourself how the hell did I get red in the first place.

Originally, the problem had n a product of just 2 primes. We then decided to raise it up to 10 primes. I was really confused that I had two types of runs — either they passed with runtime 30ms, or they got TLE or ILE. I added time measurement to the interactor itself, and found out that the 30ms solutions in fact took more than couple seconds to run. The same solutions reported run time of couple seconds in Gym contest that we used for testing. During post contest discussions with MikeMirzayanov I realised that I still don't understand how Polygon and Codeforces measure time for interactive problems.

I am not the fastest duck in the pond, so I investigated my Tonelli Shanks implementation for bugs, found and fixed a couple, but it was still too slow. Only at this point I actually calculated the number of operations needed to find a square root modulo a prime and facepalmed. I realised that only about 10 queries can fit into time limit in the worst case. 10 queries were too few to achieve reasonable success probability even for 2 prime factors.

Luckily, if the prime is of format 4k + 3, one can find the square root with just 1 modular exponentiation instead of 5 - 10 for Tonelli Shanks solving the general case. So the problem was changed to only support these primes, the TL and success probability was suddenly more than enough.

We made a way for the contestants to understand the implications of the runtime of the interactor, and were happy with the problem. Oh, how wrong we were! (Side note: After the contest, Mike suggested a better way of doing that, and I will change the problem for practice using this suggestion.)

Fast forward to contest — there is a crashing solution of Shik getting TLE on interactor. I don't see the interaction log in CF interface, so at first glance it seemed to me that he simply used too many queries and the time accumulated. Mike proved me wrong, as his logs showed that my interactor never returned answer to some of the queries.

I took the test case and the queries and modified the interactor so that it just uses stdin/stdout, and I can test and debug it manually outside of Polygon. For some strange reason the GCD was not returning the answer in a timely manner.

In the ensuing chaos we misjudged the situation and introduced the query limit of 100, but it only made matters worse.

I was looking at my GCD implementation like an idiot. More precisely, it is not my GCD implementation -- I am using someone else's library for big integers! I've made a few changes to it, but I considered it good, as it never failed me in a contest. Everything seemed fine except it wasn't. Then arsijo pointed out this loop:

do {
    b >>= b.count_trailing_zeros();            
    if (cmp_abs(a, b) > 0) a.words.swap(b.words);
    sub_unsigned_overwrite(b, a);
} while (b.size() > 0);

You don't need to know too much about my implementation to understand that this is roughly equivalent of:

do {
   if (a>b) swap(a,b);
   b -= a;
} while (b != 0);

You don't need to be a freaking red to understand that this is really stupid way of writing gcd.

I sighed, changed it to a = Num::mod(a, b) and ran it against Petr's and mine solution in Polygon. You don't need to be freaking red to understand that this is an incorrect fix. You need to have some Polygon experience to understand that this will use your CPU quota in Polygon and lock you out for 5 minutes, which is definitely something you appreciate when you have 10 potentially frustrated/angry contestants getting Denial of judgement.

I sighed again, changed it to b = Num::mod(b, a), committed the problem and asked arsijo to test it. He managed to lock himself out for some reason as well, and then the polygon returned HTTP 502 for some time. Magic MikeMirzayanov fixed it, saw that it runs correctly and pushed the change to Codeforces. It didn't change the verdicts in the slightest, but luckily it was just a case of a stale cache and it was fixed promptly.

We added a few minutes to the contest duration, but I understand that for some people the contest was already quite unfair or broken. I hope that during the post-contest investigation we found the least possible unfair solution. I would like to thank arsijo and MikeMirzayanov for the immense help with this difficult situation.

For anyone who managed to read this far: the downvote button is just below this sentence, and it looks like a triangle standing on one of its vertices.

Update: I realised I forgot to talk about one more important subject. Why the hell did we not find the issue with gcd during contest preparation? I think I know why. All of our solutions apparently did something along the lines of:

  • find random x in [1, N)
  • ask for square root of x * x

The crucial thing is that in expectation x is N / 2, hence x2 is N / 4. For these values, the subtraction gcd seems to run fast in expectation, and we never noticed the issue. When contestants did something else, suddenly things broke. For me, this is an important lesson about testing my problems.

  • Проголосовать: нравится
  • +336
  • Проголосовать: не нравится

»
6 лет назад, # |
  Проголосовать: нравится +19 Проголосовать: не нравится

Can I ask what is "a better way" suggested by Mike?

Also, I don't know why, but when someone asks to be downvoted it doesn't work this way.

  • »
    »
    6 лет назад, # ^ |
      Проголосовать: нравится +46 Проголосовать: не нравится

    Probably because that sincere remark in the end of the article just reminds us that everyone is human and is prone to making mistakes from time to time..

  • »
    »
    6 лет назад, # ^ |
      Проголосовать: нравится +63 Проголосовать: не нравится

    Can I ask what is "a better way" suggested by Mike?

    It's so simple once you know it. You have 6000 coins. Asking query of type +, -, ... costs 1 coin, query of type sqrt costs 80 coins etc. When you run out of coins, you get Wrong Answer.

    Also, I don't know why, but when someone asks to be downvoted it doesn't work this way.

    The only reliable way to get downvotes is to call people names and add ROFLLOOOLLMAOOOO!!!! between every two words.

»
6 лет назад, # |
  Проголосовать: нравится +37 Проголосовать: не нравится

relevant

»
6 лет назад, # |
Rev. 3   Проголосовать: нравится +153 Проголосовать: не нравится

Hey majk and all!

I think the loop that arsijo pointed out isn't actually slow. It's running the binary GCD algorithm, and the key line you're missing is b >>= b.count_trailing_zeros().

The way the algorithm works is that we first count and remove all common powers of 2 (trailing zeros) from a and b. WLOG, assume a is now odd. Then, in each iteration, we remove all trailing zeros from b (because a is odd, so the GCD doesn't divide 2), so that a and b are both odd. Then, we go from (a, b) to (a, b - a), so that we're back to having an even number and an odd number. This way, in each round, the number of binary digits decreases by one in the next round, so it takes many rounds.

This is actually faster than the plain-old Euclidean algorithm: we perform many subtractions instead of many modulo operations.

All this being said, I'm not sure what the slowdown is anymore. Do you have any thoughts?

EDIT: On a second read of Wikipedia, it looks like this is known to have comparable performance to the standard Euclidean algorithm. Perhaps the mod optimization indeed made it faster, though that's a bit of an odd hybrid approach.

  • »
    »
    6 лет назад, # ^ |
    Rev. 2   Проголосовать: нравится +8 Проголосовать: не нравится

    Thanks for the link!

    The problem with binary GCD is that it should in fact say: a >>= a.count_trailing_zeros(); b >>= b.count_trailing_zeros();

    In particular, if the smaller number is even, it is never divided by two, and the much larger integer remained odd. It degenerated to performance of .

    • »
      »
      »
      6 лет назад, # ^ |
        Проголосовать: нравится +3 Проголосовать: не нравится

      Prior to the loop, was a guaranteed to be odd? If that's the case, we always do b = odd - odd = even, while a stays odd the whole time.

      • »
        »
        »
        »
        6 лет назад, # ^ |
          Проголосовать: нравится +8 Проголосовать: не нравится

        Prior to the loop it is guaranteed that at least one of the numbers is odd by doing

        size_t n = std::min(a.count_trailing_zeros(), b.count_trailing_zeros());
        a >>= n;
        b >>= n;
        

        Since the larger integer is a product of odd primes, the smaller is in fact the only one that can be even, triggering the bug.

        • »
          »
          »
          »
          »
          6 лет назад, # ^ |
            Проголосовать: нравится +6 Проголосовать: не нравится

          Interesting, that's quite a subtle bug then. Replacing those lines with

          if (a.count_trailing_zeros() > b.count_trailing_zero()) 
              a.words.swap(b.words);
          size_t n = a.count_trailing_zeros();
          a >>= n;
          b >>= n;
          

          could fix the algorithm then.

»
6 лет назад, # |
  Проголосовать: нравится +12 Проголосовать: не нравится

If x is N/2, should x² be N²/4 instead of N/4?

  • »
    »
    6 лет назад, # ^ |
      Проголосовать: нравится +10 Проголосовать: не нравится

    Yeah, x2 should be N2 / 4, I think that's a typo.

    I think what's important for the statement is actually , which is around N / 2 in expectation if x is chosen randomly.

»
6 лет назад, # |
Rev. 2   Проголосовать: нравится +10 Проголосовать: не нравится

For those curious: The big-integer library can be found on GitHub. The faulty GCD code is here.

Someone should definitely make a pull request to fix the bug.

»
6 лет назад, # |
Rev. 2   Проголосовать: нравится 0 Проголосовать: не нравится

I didn't really get the issue you mentioned first: so you expected that interactor time is counted as part of TL but it wasn't?