Generalized Suffix Tree

By Enchom, 10 years ago, In English

Hello everybody,

Recently I faced yet another problem, and I thought asking the Codeforces community was the best decision.

Lately I've been learning Ukkonen's algorithm for building suffix trees in linear time. However, as I was reading about some applications, I noticed that you often have to build a generalized suffix tree of usually two, but sometimes more, strings. In the papers about Ukkonen's algorithm I only managed to find how to build a suffix tree for a single string.

My initial thought was to build a suffix tree for each string and then try to merge them in linear time, but that seemed very annoying to implement. Looking around the web, I also saw many people say that Ukkonen's algorithm can be used to produce a generalized suffix tree.

I was wondering if someone could outline a solution that keeps the structure and suffix links correct and builds a generalized suffix tree based on Ukkonen's algorithm.

Thanks in advance! :)

Comments (7)

gawry

10 years ago

+24

The "standard" method (as far as I know) is to concatenate all the strings, separated by special characters that do not occur in any of them. Say, for two strings s1 and s2, run Ukkonen's algorithm on s1$s2.
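
A minimal sketch of the concatenation step, in Python. The helper names `join_with_separators` and `owner` are hypothetical, not from any library, and the separators are assumed to be distinct characters that occur in none of the inputs. The suffix tree construction itself is not shown here; Ukkonen's algorithm would simply be run on the returned text, and `owner` lets you map a suffix start back to the string it came from.

```python
from bisect import bisect_right


def join_with_separators(strings, separators):
    """Concatenate the strings, appending a distinct separator after each one.

    Returns the concatenated text and the start position of every string.
    """
    if len(separators) < len(strings):
        raise ValueError("need one separator per string")
    if len(set(separators)) != len(separators):
        raise ValueError("separators must be pairwise distinct")
    used = set("".join(strings))
    parts, starts, pos = [], [], 0
    for s, sep in zip(strings, separators):
        if sep in used:
            raise ValueError(f"separator {sep!r} occurs in the input")
        starts.append(pos)
        parts.append(s + sep)
        pos += len(s) + 1
    return "".join(parts), starts


def owner(starts, i):
    """Map position i of the concatenation to (string index, offset in it)."""
    k = bisect_right(starts, i) - 1
    return k, i - starts[k]


text, starts = join_with_separators(["abc", "def"], "#$")
print(text)              # abc#def$
print(owner(starts, 5))  # (1, 1): position 5 is offset 1 of the second string
```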

Enchom

10 years ago

True, but if we have "abc#" and "def$", concatenating them gives "abc#def$", which includes suffixes such as "c#def$". Most generalized suffix trees do not contain such suffixes, but I guess they won't really hurt the structure, so that's a solution I will keep in mind.
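
To make the concern concrete, here is a small Python sketch that lists the suffixes of the concatenation that run across a separator into the next string. The function name `crossing_suffixes` is made up for this illustration, and the text is assumed to end with a separator.

```python
def crossing_suffixes(text, separators):
    """Suffixes of text that contain a separator before their last character."""
    seps = set(separators)
    return [
        text[i:]
        for i in range(len(text))
        if any(ch in seps for ch in text[i:-1])
    ]


print(crossing_suffixes("abc#def$", "#$"))
# ['abc#def$', 'bc#def$', 'c#def$', '#def$']
```

These are exactly the suffixes that a generalized suffix tree of "abc#" and "def$" would store as "abc#", "bc#", "c#" and "#" instead.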

Edit: I meant to post this comment as a reply to gawry.

--Pavel--

10 years ago

+16

I think you can just remove the vertices that lie below a separator character, i.e. whose path from the root passes through one.
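
A rough sketch of this pruning in Python. To keep it short it uses a naive, uncompressed suffix trie built in quadratic time as nested dicts, not Ukkonen's algorithm; all function names are hypothetical. In a compressed suffix tree with pairwise distinct separators, a separator can only appear on a leaf edge (any string containing it occurs once), so the same operation should amount to cutting each leaf edge label right after its first separator.

```python
def build_suffix_trie(text):
    """Naive O(n^2) uncompressed suffix trie as nested dicts (not Ukkonen)."""
    root = {}
    for i in range(len(text)):
        node = root
        for ch in text[i:]:
            node = node.setdefault(ch, {})
    return root


def prune_below_separators(node, separators):
    """Drop every vertex hanging below an edge labelled with a separator."""
    for ch, child in node.items():
        if ch in separators:
            child.clear()  # the separator edge stays, its subtree goes
        else:
            prune_below_separators(child, separators)


def leaf_paths(node, prefix=""):
    """Collect the strings spelled by all root-to-leaf paths."""
    if not node:
        return {prefix}
    paths = set()
    for ch, child in node.items():
        paths |= leaf_paths(child, prefix + ch)
    return paths


trie = build_suffix_trie("abc#def$")
prune_below_separators(trie, "#$")
print(sorted(leaf_paths(trie)))
# ['#', '$', 'abc#', 'bc#', 'c#', 'def$', 'ef$', 'f$']
```

After pruning, the leaves spell exactly the suffixes of "abc#" and of "def$", which is what a generalized suffix tree of the two strings is expected to contain.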

Enchom

10 years ago

I'm quite inexperienced with suffix trees, so are you sure that this transformation will keep all the suffix links correct?

--Pavel--

10 years ago

What's the problem with suffix links?

We remove some subtrees together with all their outgoing suffix links.

On the other hand, if the path from the root to a vertex does not contain a separator, then neither does the path from the root to the target of its suffix link, since that path spells a suffix of the vertex's string.
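
The argument can be checked by brute force on a small example. The Python sketch below identifies each vertex of an uncompressed suffix trie with the string spelled on its root path and gives every non-root vertex a suffix link to that string minus its first character. This is a simplification: in a real suffix tree only internal vertices carry suffix links. The names `substrings`, `survives` and `suffix_link` are hypothetical, and "survives" follows the pruning rule that the separator edge itself is kept while everything below it is removed.

```python
def substrings(text):
    """All substrings of text, i.e. all vertices of its suffix trie."""
    n = len(text)
    return {text[i:j] for i in range(n) for j in range(i, n + 1)}


def survives(path, separators):
    """A vertex survives pruning if no separator lies strictly above it."""
    return not any(ch in separators for ch in path[:-1])


def suffix_link(path):
    """Suffix link target: the same string without its first character."""
    return path[1:]


text = "abc#def$"
kept = {w for w in substrings(text) if survives(w, "#$")}

# Every surviving non-root vertex links to another surviving vertex.
assert all(suffix_link(w) in kept for w in kept if w)
```

So no suffix link of a surviving vertex points into a removed subtree, at least in this simplified model.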

Enchom

10 years ago

Well that sounds good enough, thanks a lot :)

--Pavel--

10 years ago

You are welcome :)
