This time, Auroraa_ is solving 524E - Rooks and Rectangles. After some trial, he got
and then received TLE: 260160406
He was surprised, so he tried it again with exact same code, with C++17 (32). You know what? The AC magic happened again: 260161644.
At first glance, we do think some code has triggered undefined behaviours like Access out of bounds. But Auroraa_ denied that, because all the data could be fit in int32_t.
Then came the dramatic scene: by simply changing the code vector<int>v1[N], v2[N];
to vector<long long>v1[N], v2[N];
, the strange TLE problem disappeared: 260164936.
After that, my further investigation(260199113) pointed out that the block causes TLE is:
for (int i = 1; i <= q; i++) {
int x1, y1, x2, y2;
cin >> x1 >> y1 >> x2 >> y2;
vec.push_back({ x1, y1, x2, y2, i });
}
It took over 2000ms(260199205) to write data into this vec. And only took about 100ms(260199447) to do the same thing if we just changed vector<int>v1[N], v2[N];
to vector<long long>v1[N], v2[N];
. The weirdest thing is, there is no direct connection between v1,v2 and vec.
Recalling the process just described, coincidentally, this modification process is really similar to the blog Strange TLE by cin using GNU C++20 (64), and some weird subtle changes to (surprisingly) fix it.
And like this blog Slowdown bug affecting C++ (64 bit) on Codeforces, this time we encountered a Pandora's Box that just used a different magic number to open. Although MikeMirzayanov has updated the CF system to
, there is still something fishy here.
The new version of GCC and OS passed the "test" on You won't believe how this simple trick defeated the unexplained bug destroying every top LGM; also included other languages such as Rust. The script mentioned there does not work for Rust now (both in custom test and on my Windows 11 computer).
I translate the script to C++:
#include "bits/stdc++.h"
int main() {
constexpr size_t n = 30000;
auto vec = std::vector<std::vector<int>>(n);
for (size_t i = 10000; i < 20000; i++) {
vec.resize(i);
for (size_t j = 0; j < i; j++) {
vec[j].reserve(7);
}
auto start = std::chrono::system_clock::now();
for (size_t j = 0; j < 10000; j++) {
vec[i - 1].reserve(7);
vec[i - 1].shrink_to_fit();
}
auto elapsed = std::chrono::duration<double, std::milli>(std::chrono::system_clock::now() - start).count();
if (elapsed > 40) {
std::cout << i << " " << elapsed << "\n";
}
}
}
and also the bomb:
#include "bits/stdc++.h"
int main() {
constexpr size_t n = 30000;
constexpr size_t evil = 17182;
auto vec = std::vector<std::vector<int>>(n);
vec.resize(evil);
for (size_t i = 0; i < evil; i++) {
vec[i].reserve(7);
}
for (size_t i = 0; i < size_t(1e6); i++) {
vec[evil - 1].reserve(7);
vec[evil - 1].shrink_to_fit();
}
}
And it generated the output Used: 5514 ms, 0 KB
As usual, I change constexpr size_t evil = 17182;
to constexpr size_t evil = 17183;
, and then Used: 93 ms, 0 KB
OOPS, MikeMirzayanov! Did you forget to upgrade compilers after migrating the CF system to Windows Server 2022? If that's true, please, upgrade those compilers to UCRT version, and I suggest you could simply use the one with LLVM for the simplest configuration for clang18 with GCC's libstdc++.