There is a problem that I have noticed recently to be a subproblem in many string dp problems, especially one where you want to either "include" or "exclude" a certain substring when considering your answers.↵
↵
This problem is "maximum prefix of S that matches given i characters already match and we add the character c". Basically it is useful because at each point in our dp we want to know how many characters we are matching (how much we progress to a good/invalid state).↵
↵
For example:↵
↵
S = "lololol"↵
↵
i = 3, c = 'z' would represent "lolz", and the match would be 0↵
↵
i = 3, c = 'o' would represent "lolo", and the match would be 4 because it matches with the prefix " **lolo** lol"↵
↵
<spoiler summary="Slightly better example)">↵
↵
S = "aab"↵
↵
i = 2, c = 'a' would represent "aaa" and the match would be 2 because the suffix a **aa** matches with the prefix **aa** b↵
↵
Which means once we add a bad character we cannot just set match to 0.↵
↵
</spoiler>↵
↵
Naive O(alphabet * N^3) solution is to loop every size, (a-z), then try every prefix size, and compare string in O(N). This can become O(N^2) with hashing or some string algorithm (what I had to resort to, but it is a pain).↵
↵
However, there seems to be an easy to write DP solution.↵
↵
The DP is briefly mentioned in this div3 tutorial (but with only 2 characters): ↵
http://codeforces.me/contest/1015/problem/F↵
↵
However, solution code does not contain an implementation of it.↵
↵
After some digging, I found tutorial to [this problem](http://www.usaco.org/index.php?page=viewproblem2&cpid=267) used exactly same dp.↵
↵
DP[i][char] represents value as defined above.↵
↵
Implementation of tutorial looks something like this (notice pattern is zero indexed, so i-1 is the i'th character): ↵
↵
<spoiler summary="Code">↵
for (int i = 1; i < m; ++i) {↵
int prev = next[i - 1][pattern[i - 1]];↵
next[i - 1][pattern[i - 1]] = i;↵
for (int j = 0; j < 26; ++j) {↵
next[i][j] = next[prev][j];↵
}↵
}↵
next[m - 1][pattern[m - 1]] = m;↵
</spoiler>↵
↵
So maybe I am just slow, but it really does seem like magic right now!↵
↵
Now I want to ask, what is the idea behind this O(N) dp. For example, why is it always optimal to build off prev, and what exactly does prev represent (I'm assuming it is the biggest match beforehand excluding a full match).↵
↵
↵
This problem is "maximum prefix of S that matches given i characters already match and we add the character c". Basically it is useful because at each point in our dp we want to know how many characters we are matching (how much we progress to a good/invalid state).↵
↵
For example:↵
↵
S = "lololol"↵
↵
i = 3, c = 'z' would represent "lolz", and the match would be 0↵
↵
i = 3, c = 'o' would represent "lolo", and the match would be 4 because it matches with the prefix " **lolo** lol"↵
↵
<spoiler summary="Slightly better example)">↵
↵
S = "aab"↵
↵
i = 2, c = 'a' would represent "aaa" and the match would be 2 because the suffix a **aa** matches with the prefix **aa** b↵
↵
Which means once we add a bad character we cannot just set match to 0.↵
↵
</spoiler>↵
↵
Naive O(alphabet * N^3) solution is to loop every size, (a-z), then try every prefix size, and compare string in O(N). This can become O(N^2) with hashing or some string algorithm (what I had to resort to, but it is a pain).↵
↵
However, there seems to be an easy to write DP solution.↵
↵
The DP is briefly mentioned in this div3 tutorial (but with only 2 characters): ↵
http://codeforces.me/contest/1015/problem/F↵
↵
However, solution code does not contain an implementation of it.↵
↵
After some digging, I found tutorial to [this problem](http://www.usaco.org/index.php?page=viewproblem2&cpid=267) used exactly same dp.↵
↵
DP[i][char] represents value as defined above.↵
↵
Implementation of tutorial looks something like this (notice pattern is zero indexed, so i-1 is the i'th character): ↵
↵
<spoiler summary="Code">↵
for (int i = 1; i < m; ++i) {↵
int prev = next[i - 1][pattern[i - 1]];↵
next[i - 1][pattern[i - 1]] = i;↵
for (int j = 0; j < 26; ++j) {↵
next[i][j] = next[prev][j];↵
}↵
}↵
next[m - 1][pattern[m - 1]] = m;↵
</spoiler>↵
↵
So maybe I am just slow, but it really does seem like magic right now!↵
↵
Now I want to ask, what is the idea behind this O(N) dp. For example, why is it always optimal to build off prev, and what exactly does prev represent (I'm assuming it is the biggest match beforehand excluding a full match).↵
↵