Why are 2x2 matrices that are flattened much faster? - Codeforces

limabeans's blog

By limabeans, history, 5 years ago, In English

I was solving this problem and getting TLE (3s) when using vector<vector<ll>> as my 2x2 matrix. After I converted it to array<ll,4>, it passed in 500 ms. Why is it so much faster?

Update: 2D vector TLE code, array AC code

Some notes: I wasn't sure why I was getting TLE, so at first I tried using int instead of long long. I also stopped maintaining 2 versions of the matrices per node and did the flip transform mentioned in the editorial. Neither of these optimizations worked, so I then tried converting my 2D vector to a 1D array. I remember that trick working on some other matrix problem on CF that I solved a long time ago, but I can't remember which one it was.

Comments (17)

»

5 years ago, # |

+21

A vector has some overhead, so for small sizes it's better to use arrays than vectors.

»

»

5 years ago, # ^ |

0

So I guess array<array<ll,2>,2> will be fine as well? In other words, the speed-up is due to predefined sizing?

»

»

»

5 years ago, # ^ |

0

I would guess so but you can check.
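
A minimal sketch of the two fixed-size layouts discussed here, for anyone who wants to check. The modulus `MOD` and the function names are assumptions for illustration and are not taken from the original submissions. Both types are plain aggregates with no heap storage, and on mainstream compilers they occupy the same number of bytes.

```cpp
#include <array>
#include <cassert>

using ll = long long;
using Mat  = std::array<std::array<ll, 2>, 2>;  // nested, fixed size
using Flat = std::array<ll, 4>;                 // flattened, row-major

const ll MOD = 1000000007;  // hypothetical modulus, not from the problem

// 2x2 product with the nested layout.
Mat mul(const Mat& a, const Mat& b) {
    Mat c{};  // value-initialized to zeros
    for (int i = 0; i < 2; ++i)
        for (int k = 0; k < 2; ++k)
            for (int j = 0; j < 2; ++j)
                c[i][j] = (c[i][j] + a[i][k] * b[k][j]) % MOD;
    return c;
}

// Same product with the flattened layout: element (i, j) lives at index 2*i + j.
Flat mulFlat(const Flat& a, const Flat& b) {
    return {(a[0] * b[0] + a[1] * b[2]) % MOD, (a[0] * b[1] + a[1] * b[3]) % MOD,
            (a[2] * b[0] + a[3] * b[2]) % MOD, (a[2] * b[1] + a[3] * b[3]) % MOD};
}

// Usage check: squaring [[1,1],[1,0]] gives [[2,1],[1,1]] in both layouts.
void checkBothLayouts() {
    Mat a = {{{1, 1}, {1, 0}}};
    Flat f = {1, 1, 1, 0};
    assert((mul(a, a) == Mat{{{2, 1}, {1, 1}}}));
    assert((mulFlat(f, f) == Flat{2, 1, 1, 1}));
}
```

Any remaining speed difference between the two would come from how the compiler unrolls the loops, not from memory allocation, so it is worth benchmarking rather than assuming.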

»

»

5 years ago, # ^ |

0

I've seen a lot of reds use vectors easily without getting TLE. Whenever I use them I get TLE, and after changing to an array it becomes AC. How can we figure out whether using vectors will cause TLE?

»

»

»

5 years ago, # ^ |

+32

If you use a lot of small vectors then they are slow.

»

»

»

»

5 years ago, # ^ |

-22

WHAT DO YOU MEAN? I've heard this answer a thousand times... Please replace the words (a lot of), (small), and (slow) with integers!

»

»

»

»

»

5 years ago, # ^ |

+11

That's a general rule of thumb. If you want to know numbers you can try doing some benchmarks.

»

»

»

5 years ago, # ^ |

+7

Just a guess, you might be passing your vectors to functions by value instead of by reference.
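
To illustrate the difference, here is a small sketch. The function names are hypothetical and chosen only for this example. Passing by value deep-copies every row, so the callee sees different storage than the caller; passing by const reference sees the caller's own buffers.

```cpp
#include <cassert>
#include <vector>

using ll = long long;
using VMat = std::vector<std::vector<ll>>;

// By value: `m` is a deep copy, so each call allocates and copies every row.
bool sameStorageByValue(VMat m, const ll* callerRow0) {
    assert(!m.empty());
    return m[0].data() == callerRow0;
}

// By const reference: no copy, `m` refers to the caller's matrix.
bool sameStorageByRef(const VMat& m, const ll* callerRow0) {
    assert(!m.empty());
    return m[0].data() == callerRow0;
}
```

Calling both with `m[0].data()` from the caller's matrix returns `false` for the by-value version and `true` for the by-reference version.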

»

5 years ago, # |

+15

An array of such a small size can be allocated and manipulated on the call stack efficiently.

But vectors cause heap allocation, construction, possibly initialisation, destruction, and heap deallocation. That sounds like too much overhead, but it isn't unless you repeatedly create vectors.
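
One way to put a number on this is to count allocations. The sketch below assumes a hypothetical `CountingAlloc` allocator and a global counter `g_allocs`, neither of which is from the original submissions; it only wraps `std::allocator` and increments the counter on each request.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

using ll = long long;

std::size_t g_allocs = 0;  // number of allocate() calls seen so far

// Hypothetical minimal allocator that counts requests and forwards to std::allocator.
template <class T>
struct CountingAlloc {
    using value_type = T;
    CountingAlloc() = default;
    template <class U>
    CountingAlloc(const CountingAlloc<U>&) {}
    T* allocate(std::size_t n) {
        ++g_allocs;
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }
};
template <class T, class U>
bool operator==(const CountingAlloc<T>&, const CountingAlloc<U>&) { return true; }
template <class T, class U>
bool operator!=(const CountingAlloc<T>&, const CountingAlloc<U>&) { return false; }

using Row  = std::vector<ll, CountingAlloc<ll>>;
using VMat = std::vector<Row, CountingAlloc<Row>>;

// Returns how many heap requests building one 2x2 vector-of-vectors triggers.
std::size_t allocsForVectorMatrix() {
    std::size_t before = g_allocs;
    VMat m(2, Row(2, 0));
    assert(m.size() == 2);
    return g_allocs - before;
}
```

Each 2x2 matrix built this way needs at least three heap requests (one outer buffer plus one per row), while a `std::array<ll, 4>` needs none. If a matrix product creates a fresh result at every segment tree node or every query, those requests add up.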

»

5 years ago, # |

0

Can you post your submissions? There might be something slightly specific regarding how you are using the arrays/vectors.

»

»

5 years ago, # ^ |

0

Sure, I updated the post.

»

»

»

5 years ago, # ^ |

+5

Ah yeah, so as retrograd mentioned there will be many heap allocations and the rows of the matrix will not necessarily be next to each other. The performance hit from this is actually multiplied by the fact that you are doing matrix multiplication, which will probably incur many cache misses.
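
A rough way to see the layout difference is to measure the byte distance between the first elements of row 0 and row 1. The helper name `rowGapBytes` is an assumption for this sketch. For a nested `std::array` the gap is exactly one row on mainstream implementations; for a vector of vectors it depends on where the allocator happened to place each row.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using ll = long long;
using Mat  = std::array<std::array<ll, 2>, 2>;
using VMat = std::vector<std::vector<ll>>;

// Byte distance from the start of row 0 to the start of row 1.
template <class M>
std::ptrdiff_t rowGapBytes(const M& m) {
    assert(m.size() >= 2 && !m[0].empty() && !m[1].empty());
    auto r0 = reinterpret_cast<std::uintptr_t>(&m[0][0]);
    auto r1 = reinterpret_cast<std::uintptr_t>(&m[1][0]);
    return static_cast<std::ptrdiff_t>(r1 - r0);
}
```

With `Mat` the result is `2 * sizeof(ll)`, so all four entries fit in one small contiguous block. With `VMat` the two rows are separate heap blocks, and the gap can be anything.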

»

5 years ago, # |

0

Why are the submissions to this problem not viewable? Submissions to 1252K

»

»

5 years ago, # ^ |

+8

I think for some ICPC contests, they set submissions and/or test cases to not-viewable.

»

5 years ago, # |

+20

Not sure how the latest compilers do their magic, but most of the time 2D data structures like std::vector<std::vector<T>> allocate on the heap. On top of that, they do not necessarily allocate contiguous memory for the elements on different rows (although it should be the case with malloc in practice). Now, heap allocations are expensive, and the overhead shows more and more as you copy vectors around. On the other hand, small arrays are probably kept on the stack, so allocation and copying cost almost nothing, thanks to the cache locality of the stack.

»

5 years ago, # |

0

If you are allocating a vector for a 2x2 matrix a lot, you could declare it once and reset the entries every time you need to use it. This should be just as fast as an array. Also, I think even vector<ll>(4) should be fast enough to pass in most cases.
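
A sketch of the reuse idea, assuming a hypothetical `mulInto` helper and modulus `MOD` (neither is from the original code). The output matrix is allocated once by the caller and overwritten on every call, so the multiplication itself performs no heap allocation.

```cpp
#include <cassert>
#include <vector>

using ll = long long;
using VMat = std::vector<std::vector<ll>>;

const ll MOD = 1000000007;  // hypothetical modulus, not from the problem

// Writes a * b into `out`. `out` must already be 2x2 and must not alias a or b.
void mulInto(const VMat& a, const VMat& b, VMat& out) {
    assert(out.size() == 2 && out[0].size() == 2 && out[1].size() == 2);
    for (int i = 0; i < 2; ++i) {
        for (int j = 0; j < 2; ++j) {
            ll s = 0;
            for (int k = 0; k < 2; ++k) s = (s + a[i][k] * b[k][j]) % MOD;
            out[i][j] = s;  // overwrite in place, no reallocation
        }
    }
}
```

The caller keeps one scratch matrix, for example `VMat out(2, std::vector<ll>(2));`, and passes it to every call. This removes the allocation cost but not the pointer indirection per row, so a fixed-size array may still be faster.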

»

5 years ago, # |

0

Auto comment: topic has been updated by limabeans (previous revision, new revision, compare).
