I was attempting to implement the solution to 1770G - Koxia and Bracket after reading the editorial recently but I kept getting TLE on test case 12. However, after changing the compiler from GNU C++17 to GNU C++17 (64), I got AC in 1185ms, which is very far off from the time limit of 5 seconds.
GNU C++17: 188211296
GNU C++17 (64): 188211331
I tried testing it on errorgorn solution in the editorial as well and there was the same problem.
GNU C++17: 188211673
GNU C++17 (64): 188211705
I thought that it might be because of the NTT implementation that we used (KACTL), however, I even tried on Radewoosh submission which uses a different NTT implementation, but still faces the same issue.
GNU C++17: 188203821
GNU C++20 (64): 187349301
I have seen some cases where 64-bit compiler runs faster than 32-bit before, but never to such a large extent. Does anyone know the reason why? Does NTT run a lot faster on 64-bit compiler? Or is it something about our implementation?
If anyone know the reason why, I will appreciate it very much if you could explain it to me down in the comments. I guess it is about time that I shift from GNU C++17 to GNU C++17 (64) 😢