Here's a nice optimization for linear recurrence problems. This mostly works when the you're trying to find the nth term of a sequence of the form $$$f(n) = a_1 \cdot f(n-1) + a_2 \cdot f(n-2) + \dots + a_m \cdot f(n-m)$$$, usually modulo some big number, where n is too large to use the normal dynamic programming solution, usually $$$ n \approx 10^{18}$$$. If $$$m \leq 100$$$, the normal solution involves matrix multiplication in $$$O(m^3 log(n))$$$. But under some conditions we can do better using dynamic programming and divide and conquer.
The idea involves reducing the problem into finding how many arrays with numbers 1 to m and sum n can be constructed. To do this, we will divide the problem of finding $$$f(n)$$$ into finding $$$f(k)$$$, where k is in the neighborhood of $$$\frac n 2$$$.
All problems shown here can be solved using matrix exponentiation, but due to it's cubic nature, if m was larger it would not be possible.
Magic Gems — 1117D
This was the first problem that I solved using this approach, so I will be using to demonstrate it: You have gems that are either worth 1 or M and your objective is to find how many arrays modulo $$$10^9+7$$$ have worth N.
As an example, if N = 10 and M = 3, you would try to find in how many ways you could either write 1 or 3 and add up to N. 11111111, 11111113, 131131, 3331, 1333 are 5 of the possible examples.
Let's look at the largest prefix that doesn't exceed $$$\frac n 2$$$. In the example, this subarray can either have 3, 4 or 5 elements. Moreover, if it has 3 or 4 elements, the next one will be a 3, and then will come a suffix with 4 or 3 elements respectively, but if we have 5, the suffix can be any array with sum 5. With this we can derive:
$$$f(10)=f(5)\cdot f(5) + f(4) \cdot f(3) + f(3) \cdot f(4)$$$
More generally, if $$$k = \lfloor {\frac n 2} \rfloor$$$
$$$f(n) = f(k)\cdot f(n-k) + f(k-1) \cdot f(n-k-2) + f(k-2) \cdot f(n-k-1)$$$
And if m wasn't 3, we get
$$$f(n) = f(k) \cdot f(n-k) + \sum _{i=1}^{M-1} f(k+1) \cdot f(n-k-m+1)$$$
It can be proven that the ith iteration of dividing by 2, the range of values for to calculate are never greater than $$$\frac n {2^i} +2$$$ and are all at least $$$\frac n {2^i} - 2\cdot m$$$, meaning that in each one of the $$$log n$$$ iterations, we will need to calculate $$$O(m)$$$ values. Given that the previous terms were already computed, we can compute that next term in $$$O(m)$$$. Meaning the total complexity is $$$O(m^2 \cdot log(n))$$$ by using a hashmap to save the dp, or $$$O(m^2 \cdot (log (n))^2)$$$ by using a treemap.
Due to the numbers coming in batches, I will assume that hacking this resulting in multiple hash collitions and TLE is not possible. If you don't trust the analysis, add an extra $$$O(log n)$$$ term to all results from now on.
[Matrix solution](https://codeforces.me/contest/1117/submission/259507773) — AC 452ms
[DP solution](https://codeforces.me/contest/1117/submission/259506862) — AC 78ms
Fibonacci Numbers — CSES 1722
After seeing the last solution, you may have realized that if M=2, the terms were all fibonacci numbers. This is because they follow the same recurrence $$$f(n) = f(n-1) + f(n-2)$$$.
You may be tempted to just return hardcode M=2, and send it as a solution, but this will give WA due to $$$F_0 = 0$$$, while our solution will assume it equals 1. Changing the values on the dp map so that $$$f(0) = 0, f(1) = 1$$$, also won't solve our problem. Our solution assumes that this is a construction and so by definition, we can never construct an array with sum less than 0, and we can only construct an array with sum 0 in one way. This forces us to leave $$$f(0) = 1, f(1) = 1$$$ and then call $$$F_n = f(n-1)$$$
As both solutions are equally as fast, I will only be linking the dp solution [here](https://cses.fi/paste/8c714b560d7d4b168beb1a/)
Throwing Dice — CSES 1096
This problem can be easily translated as adding up to N using only numbers 1, 2, 3, 4, 5 or 6. Let's use the same trick as before; we will grab the longest prefix that doesn't add up to more than half of N.
Let's supposed that N = 100. The halfway mark is 50, so the last prefix could add up to 45, 46, 47, 48, 49 or 50. In the case that it adds up to 45, I know the next dice throw was a 6, and then comes a suffix of 49. If it added up to 46, the next dice throw was either a 5 or a 6, and then would either come a suffix of 48 or 49. The next one over, if it added up to 47, the next throw would either be 4, 5, or 6, followed by a suffix of 47, 48 or 49. So on and so forth. We derive the next equation:
$$$f(n) = f(k-5)\cdot f(n-k-1) + f(k-4) \cdot (f(n-k-2) + f(n-k-1))+\dots$$$
$$$f(n) = \sum _{i=1} ^{6} (f(k-6+i) \cdot (\sum_{j=1}^i f(n-k-j)))$$$
If the dice had M faces, then the general equation would be:
$$$f(n) = \sum _{i=1} ^{M} (f(k-M+i) \cdot (\sum_{j=1}^i f(n-k-j)))$$$
Note that we can compute $$$\sum_{j=1}^i f(n-k-j)$$$ on each iteration fast. If we didn't, then computing $$$f(n)$$$ would take $$$O(m^2)$$$, which would make the whole algorithm $O(m^3 \cdot log(n))
Despite there not being a significant time difference, I leave both solutions here with the dice face count(M) being a variable in case someone wants to compare speed.
[DP](https://cses.fi/problemset/result/9176929/)
[Matrix](https://cses.fi/problemset/result/9176928/)
<div>90c3350648510e54bde811ef2e4802b6</div>