Hello mates, I'm trying to solve this DP question Coin Combinations II from CSES: https://cses.fi/problemset/task/1636/
Initially, I define an array dp[i][j]: meaning number of ways to form sum i, starting choosing coins from j-th index. And this solution using this DP array get me TLE.
But if I switch the two dimension, array dp[i][j]: number of ways to form sum j, starting choosing coins from i-th index. This solution give me accepted. But why?
Note:
Two solution is 99% the same the only different is that I switch two dimensions.
Sum i at most 10^6 and there are at most 10^2 coins.
Thank in advance.
Accepted code
TLE code
Auto comment: topic has been updated by lvisbl_ (previous revision, new revision, compare).
In C/C++ (see here), a 2D array (or vector) will be stored in a row major format, i.e., an array
arr[N][M]
will be stored asN
collections ofM
sized contiguous memory blocks. This might not be the case for other languages like Java, see here.Now, when your program is run by the processor, and you access
arr[i][j]
, the cache (which holds copies of the main memory for fast access by the processor) loads some contiguous blocks of memory due to their spatial locality, and thus,arr[i][j+1], arr[i][j+2], ...
will be accessed much faster than sayarr[i+1][j] or arr[i+2][j]
. If the array size is large enough it can cause a significant difference in the efficiency of your code.Hope this answers your query!!
Wow! Thats cool.
Would using a 2D vector help?
No, even for a dynamically allocated vector, each row will be stored in a contiguous fashion, see here, so accessing
arr[i][j+1]
is still faster thatarr[i+1][j]
, ifarr[i][j]
has been accessed.Thank you so much for clear explanation, this is what I'm looking for.