Omar_Mohammad's blog

By Omar_Mohammad, history, 15 months ago, In English

I have been stuck for days on this problem with no progress so any help will be appreciated.

the problem if the image is not clear: given two strings s and t(|t|, |s| <= 1e5) and q queries (q <= 1e6). let's denote by t(a, b) as the substring of t that starts at a and ends at b. each query is of the form: l, r, i, j find the number of occurrences of t(l, r) + t(i, j) in s. where "+" is the concatenation operator. the sum of |t|, |s|, and q over all test cases <= 1e6.

  • Vote: I like it
  • +3
  • Vote: I do not like it

»
14 months ago, # |
  Vote: I like it -14 Vote: I do not like it

interested to know a suitable solution for this.

»
14 months ago, # |
Rev. 5   Vote: I like it +23 Vote: I do not like it

We'll answer the queries offline in $$$O((|s|+|t|)log(|s|+|t|)+(|s|+q)log(|s|+q))$$$

So we have $$$z_i=t[l,r]+t[i,j]$$$.

First of all assume $$$|s|=n$$$ and $$$|t|=m$$$.

Let us have a new string $$$d$$$ such as $$$d=s+t$$$. Build Suffix Array and LCP array of string $$$d$$$.
Now, on using these Suffix Array and LCP array, we can compare $$$s[i,n]$$$ and $$$t[j,m]$$$ by finding the length of the longest common prefix of $$$s[i,n]$$$ and $$$t[j,m]$$$ for all $$$1 \leq i \leq n, 1 \leq j \leq m$$$. Complexity for this part is $$$O((|s|+|t|)log(|s|+|t|))$$$.

Now consider an array(say $$$a$$$) of strings of length $$$n+q$$$.
First of all, $$$a$$$ contains all suffixes of string $$$s$$$. Remaining elements of $$$a$$$ are $$$z_i$$$ for $$$1 \leq i \leq q$$$.

Now, we can sort this array $$$a$$$, using a custom comparator. You can compare any two elements of $$$a$$$ using Suffix Array and LCP array of string $$$d$$$ (How to do it is left as an exercise for readers).
The good thing is that you can compare in $$$O(1)$$$ if you use a sparse table. Complexity for this part is $$$O((|s|+q)log(|s|+q))$$$.

Now we have array $$$a$$$ sorted.

Build LCP array for this array $$$a$$$ too. Complexity for this part is $$$O(|s|+q)$$$.

Let us see how we can answer for string $$$z_i$$$. Assume length of $$$z_i$$$ is $$$len$$$.
Find the position of $$$z_i$$$ in array $$$a$$$. Let us assume that $$$z_i$$$ occurs at position $$$p$$$.
The answer for string $$$z_i$$$ is the sum of the following two values.

  1. number of $$$l$$$ such that $$$F(l,p)=len$$$, where $$$1 \leq l < p$$$ and $$$a_l$$$ is a suffix of string $$$s$$$.(We can find this using stack)
  2. number of $$$r$$$ such that $$$F(p,r)=len$$$, where $$$p < r \leq |a|$$$ and $$$a_r$$$ is a suffix of string $$$s$$$.(We can also find this using stack)

Here, $$$F(x,y)$$$ gives the length of the longest common prefix of strings $$$a_x, a_{x+1}, \ldots a_y$$$.

Is this problem available for practice anywhere?

Edit: I realised that it is possible to solve this problem online(in $$$O(|d| \cdot log(|d|)+q \cdot log(|d|))$$$) as well. Instead of having array $$$a$$$, we can just find the number of suffixes of $$$d$$$ which are smaller(using binary search) than $$$z_i$$$. After that we can use binary search and sparse table to find leftmost $$$l$$$ and rightmost $$$r$$$ in $$$O(log(|d|))$$$ for each query.

»
14 months ago, # |
  Vote: I like it +17 Vote: I do not like it

I have library code that does exactly this