If we want to use a 2D dynamic array, some C++ user writes following 2D vector:
vector<vector<int>> a(n);
// add y to a[x]
a[x].push_back(y);
// walkthrough a[x]
for(auto &nx : a[x]){
cout << nx << "\n";
}
But we can do the equivalent thing with handwritten linked list (when we know the total size) by using 1D vector<pair<int,int>>:
pair<int,int> empty={-1,-1};
vector<pair<int,int>> b(n+total_size,empty);
int add_point=n;
// add y to b[x]
if(b[x]!=empty){
b[add_point]=b[x];
b[x].second=add_point;
add_point++;
}
b[x].first=y;
// walkthrough b[x]
int current=x;
while(current!=empty.second && b[currnt]!=empty){
cout << b[current].first << "\n";
current=b[current].second;
}
So, what's the benefit for the later implementation? The merit is the generation time of later one is very fast.
vector size = $$$10^5$$$, Total elements = $$$10^6$$$ (this means we need vector<vector<int>>(1e5)
and the sum of the size of $$$10^5$$$ vectors is $$$10^6$$$)
generate | walkthrough | |
---|---|---|
vector<vector<int>> | 49.04 ms | 3.12 ms |
vector<pair<int,int>> | 12.78 ms | 19.52 ms |
vector size = $$$10^6$$$, Total elements = $$$10^6$$$
generate | walkthrough | |
---|---|---|
vector<vector<int>> | 138.56 ms | 10.92 ms |
vector<pair<int,int>> | 17.4 ms | 7.8 ms |
vector size = $$$10^6$$$, Total elements = $$$10^7$$$
generate | walkthrough | |
---|---|---|
vector<vector<int>> | 1406.84 ms | 32.86 ms |
vector<pair<int,int>> | 120.48 ms | 437.28 ms |
( on CF judge(C++20 64bit), x10 average, experiment code )
So, when the vector size is large and need small number of walkthrough, the vector<pair<int,int>>
implementation have an advantage.
I use this implementation for 303C - Наименьший модуль (because the package is old, current TL is 1s). 185222734
I've seen the latter implementation being used by a lot of Chinese competitors in graph problems (so I tend to call it Chinese adjacency list). It's used in the fastest solutions to a lot of problems on various judges (that involve graph traversals — since DFS is very bad for cache anyway, the vector approach loses its advantage).
One thing that is much faster than
vector<vector<int>>
isvector<basic_string<int>>
, and this is because of short string optimization (SSO). However, due to howbasic_string
is implemented, if you write complex stuff likepair<int, int>
instead ofint
, it will not compile/give UB. To fix that, you can just define a custom struct without any constructors/initial values (so stuff liketemplate <class T, class U> struct Pair { T a; U b; };
works for "simple types"T
andU
that are either plain old datatypes or of this form recursively).Is it because fetching data from the adjacency list is usually (if not mostly) more cache-friendly for traversing through the graph?
I found https://codeforces.me/blog/entry/67001 pretty insightful on how to minimize cache misses
True, I have used that optimization myself once but this is the first time I have seen it mentioned on codeforces. Thanks for the reference! However it is not applicable to this problem so much, since it needs you to do a DFS anyway (so as to prepare for other potentially costly DFSes).
Bro, this is scary. I swapped vector<vector> for vector<basic_string> in my template and got double speed on todays E.
without basic string (296ms): https://codeforces.me/contest/1774/submission/185685343
with basic string (186ms): https://codeforces.me/contest/1774/submission/185736069
This is insane
The difference is even more prominent if you make the IO faster. 185739871 with
basic_string
(109ms) and 185739827 withvector
(202ms). This is mostly because for graphs where most vertices have a small degree, multiple allocations are bad.12 years ago it was my typical graph implementation