.-.-.-'s blog

By .-.-.-, 15 hours ago, In English

Any ideas?

  • Vote: I like it
  • +6
  • Vote: I do not like it

»
15 hours ago, # |
  Vote: I like it -26 Vote: I do not like it

For a string of length $$$O(\sqrt{n\log{n}})$$$ you can bruteforce all endpoints of substrings, calculate their hashes and count them in unordered map. The answer is the sum of $$$\frac{v (v - 1)}{2}$$$ over all pairs $$$(k,v)$$$ in the map. Total complexity is $$$O(\sqrt{n\log{n}}^2) = O(n\log{n})$$$ time and $$$O(distinct\ substrings)$$$ memory.

  • »
    »
    14 hours ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Jokes aside, this solution isn't even $$$\mathcal{O}(n^2)$$$. It's $$$\mathcal{O}(n^3)$$$. Each hash calculation takes linear time. Unless you write a custom hash calculation method which can be updated incrementally as you add chars.

    • »
      »
      »
      13 hours ago, # ^ |
        Vote: I like it -6 Vote: I do not like it

      Read about rolling hashes, bro, my analysis if fully correct.

»
14 hours ago, # |
  Vote: I like it +75 Vote: I do not like it

Isn't this just Suffix Array?

»
5 hours ago, # |
  Vote: I like it 0 Vote: I do not like it

Build a suffix array, the answer for some suffix starting from index i will be the sum of lcp(i,j) for all j such that j<i. You can calculate the answer using a lazy segtree which maintains the counts of lcp(i,j) for j<i, when you transition from i to i+1, just update the answer according to the value of lcp(i,i+1). If the pairs are ordered just multiple the final ans by 2.

»
5 hours ago, # |
  Vote: I like it +1 Vote: I do not like it

we can do it with suffix array and lcp

imagine the lcp array form a histogram where the height is the length and the width is the frequency then if we used monotonic stack to get the prev greater, next greater we will be able to know for each substr how many times it exists

then for each length from 1 to lcp[i] it should has duplicates equal to the width in the histogram