I've been working on 577B - Modulo Sum and got it accepted in 218 ms.
I looked at the fastest solutions and found that they use the same idea, yet mine runs much slower.
Here are the two submissions:
285195320 62ms
285195748 218ms
We both use cin/cout with ios::sync_with_stdio(false) and cin.tie(0), so I/O shouldn't be the problem. If anything, I think mine should be faster, because each loop iteration does only one % operation and I have that break statement.
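For readers who have not opened the submissions, here is a minimal sketch of a loop of the kind described: one % per element, plus an early exit as soon as residue 0 is reachable. It is not a copy of either submission. The function name modulo_sum, the stream parameter, and the variable names are made up for illustration, and it assumes the inputs are non-negative.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <vector>

// Hypothetical helper (not taken from either submission): reads n and m, then
// up to n non-negative numbers from `in`, and returns true if some non-empty
// subsequence has a sum divisible by m.
bool modulo_sum(std::istream& in) {
    int n, m;
    in >> n >> m;
    assert(m >= 1);

    // reach[r] != 0 means: some non-empty subsequence read so far has sum % m == r.
    std::vector<char> reach(m, 0), next(m, 0);

    for (int i = 0; i < n; ++i) {
        long long x;
        in >> x;
        int r = static_cast<int>(x % m);  // the only % in this iteration

        next = reach;   // subsequences that skip x
        next[r] = 1;    // the subsequence consisting of x alone
        for (int s = 0; s < m; ++s) {
            if (reach[s]) {
                int t = s + r;
                if (t >= m) t -= m;  // s and r are both < m, so one subtraction is enough
                next[t] = 1;
            }
        }
        reach.swap(next);

        if (reach[0]) return true;  // early exit: the rest of the input is never read
    }
    return false;
}
```

From main, after ios::sync_with_stdio(false) and cin.tie(0), this would be called as modulo_sum(std::cin), printing YES or NO depending on the result. To try it locally without a file, a std::istringstream such as "3 5\n1 2 3\n" can be passed instead. By the pigeonhole principle the early exit always triggers within the first m elements, so the loop body runs at most m times regardless of n.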
And check this submission:
You can see that in test 13 the first 3 numbers alone already satisfy the condition, so my program skips the remaining 1e6 loop iterations. Still, it takes 186 ms on this test case. Why? I'm extremely confused >_<