Help needed in a String Problem

→ Pay attention

Before contest
Rayan Programming Contest 2024 - Selection (Codeforces Round 989, Div. 1 + Div. 2)
19:40:21
Register now »

*has extra registration

→ Top rated

#	User	Rating
1	tourist	3993
2	jiangly	3743
3	orzdevinwang	3707
4	Radewoosh	3627
5	jqdai0815	3620
6	Benq	3564
7	Kevin114514	3443
8	ksun48	3434
9	Rewinding	3397
10	Um_nik	3396

Countries | Cities | Organizations

View all →

→ Top contributors

#	User	Contrib.
1	cry	167
2	Um_nik	163
3	maomao90	162
3	atcoder_official	162
5	adamant	159
6	-is-this-fft-	158
7	awoo	155
8	TheScrasse	154
9	Dominater069	153
10	djm03178	152

View all →

→ Find user

→ Recent actions

Detailed →

Omar_Mohammad's blog

Help needed in a String Problem

By Omar_Mohammad, history, 15 months ago, In English

I have been stuck for days on this problem with no progress so any help will be appreciated.

the problem if the image is not clear: given two strings s and t(|t|, |s| <= 1e5) and q queries (q <= 1e6). let's denote by t(a, b) as the substring of t that starts at a and ends at b. each query is of the form: l, r, i, j find the number of occurrences of t(l, r) + t(i, j) in s. where "+" is the concatenation operator. the sum of |t|, |s|, and q over all test cases <= 1e6.

string suffix structures

Omar_Mohammad
15 months ago
5

Comments (5)

Write comment?

gholyo

14 months ago, # |

-14

interested to know a suitable solution for this.

→ Reply

satyam343

14 months ago, # |

← Rev. 5 →

+23

We'll answer the queries offline in $$$O((|s|+|t|)log(|s|+|t|)+(|s|+q)log(|s|+q))$$$

So we have $$$z_i=t[l,r]+t[i,j]$$$.

First of all assume $$$|s|=n$$$ and $$$|t|=m$$$.

Let us have a new string $$$d$$$ such as $$$d=s+t$$$. Build Suffix Array and LCP array of string $$$d$$$.
Now, on using these Suffix Array and LCP array, we can compare $$$s[i,n]$$$ and $$$t[j,m]$$$ by finding the length of the longest common prefix of $$$s[i,n]$$$ and $$$t[j,m]$$$ for all $$$1 \leq i \leq n, 1 \leq j \leq m$$$. Complexity for this part is $$$O((|s|+|t|)log(|s|+|t|))$$$.

Now consider an array(say $$$a$$$) of strings of length $$$n+q$$$.
First of all, $$$a$$$ contains all suffixes of string $$$s$$$. Remaining elements of $$$a$$$ are $$$z_i$$$ for $$$1 \leq i \leq q$$$.

Now, we can sort this array $$$a$$$, using a custom comparator. You can compare any two elements of $$$a$$$ using Suffix Array and LCP array of string $$$d$$$ (How to do it is left as an exercise for readers).
The good thing is that you can compare in $$$O(1)$$$ if you use a sparse table. Complexity for this part is $$$O((|s|+q)log(|s|+q))$$$.

Now we have array $$$a$$$ sorted.

Build LCP array for this array $$$a$$$ too. Complexity for this part is $$$O(|s|+q)$$$.

Let us see how we can answer for string $$$z_i$$$. Assume length of $$$z_i$$$ is $$$len$$$.
Find the position of $$$z_i$$$ in array $$$a$$$. Let us assume that $$$z_i$$$ occurs at position $$$p$$$.
The answer for string $$$z_i$$$ is the sum of the following two values.

number of $$$l$$$ such that $$$F(l,p)=len$$$, where $$$1 \leq l < p$$$ and $$$a_l$$$ is a suffix of string $$$s$$$.(We can find this using stack)
number of $$$r$$$ such that $$$F(p,r)=len$$$, where $$$p < r \leq |a|$$$ and $$$a_r$$$ is a suffix of string $$$s$$$.(We can also find this using stack)

Here, $$$F(x,y)$$$ gives the length of the longest common prefix of strings $$$a_x, a_{x+1}, \ldots a_y$$$.

Is this problem available for practice anywhere?

Edit: I realised that it is possible to solve this problem online(in $$$O(|d| \cdot log(|d|)+q \cdot log(|d|))$$$) as well. Instead of having array $$$a$$$, we can just find the number of suffixes of $$$d$$$ which are smaller(using binary search) than $$$z_i$$$. After that we can use binary search and sparse table to find leftmost $$$l$$$ and rightmost $$$r$$$ in $$$O(log(|d|))$$$ for each query.

→ Reply