I was solving this problem and was getting TLE (3s) when using vector<vector<ll>>
as my 2x2 matrix, then I converted it to array<ll,4>
and I passed in 500 ms. Why is it so much faster?
Update: 2D vector TLE code, array AC code
Some notes: I wasn't sure why I was TLE so at first I tried using int instead of long long. I also stopped maintaining 2 versions of the matrices per node and did the flip transform mentioned in the editorial. Both of these optimizations didn't work, so I then tried converting my 2d vector to 1d array. I remember that trick working on some other matrix problem on CF that I solved from long time ago, but I can't remember which one it is.