You are given a string (|S| <= 10^5) and have to find such substring that repeats one after another maximum number of times. Or, if there are some such substrings, you must choose the longest/shortest/lexicografically_something.
For example: for test bacacacb
answer is ac
.
How to solve it? Thanks!
Make a suffix array, and an LCP array. Now your task is to find the longest non-zero contiguous subsequence in the LCP array, which can easily be done in linear time.edit: sorry, I thought it just said maximum number of occurrences.
aclwawalaciac
Your solution will give
ac
as an answer, but the correct answer iswa
.It seems that you didn't notice "such substring that repeats one after another maximum number of times"
This post mentioned this paper which I think solves the problem.
Nevermind, a easier solution can be found here:
Iterate through all possible "period" (the length of the repeated string) of the answer $$$p$$$ from $$$1$$$ to $$$|S|$$$. For each $$$p$$$, check
LCP(i * p, (i + 1) * p)
andLCS(i * p - 1, (i + 1) * p - 1)
to get the length of the whole repetition substring.Iterating through all $$$i$$$ takes $$$O(n / i)$$$, thus the total complexity is $$$O(n \log n)$$$ since LCP and LCS can be calculated from SA (LCS's calculation is based on reversed string) and sparse table.