lrvideckis's blog

on minimizing the expected number of recursive calls for segment tree

By lrvideckis, history, 18 hours ago, In English

For the longest time, this comment https://codeforces.me/blog/entry/112755#comment-1004966 was bugging me, specifically the part about the 'complete' segment tree being 20% slower. Instead of benchmarking, this blog calculates and compares the expected number of recursive calls made by the two segment tree implementations.

Thanks to camc for giving valuable feedback.

This blog will study 2 implementations of segment trees.

'standard' segment tree

void dfs(int v, int l, int r) { // [l, r)
  if (r - l == 1) {
    return;
  }
  int m = (l + r + 1) / 2;
  dfs(2 * v, l, m);
  dfs(2 * v + 1, m, r);
}
// dfs(1, 0, n);

And

'complete' segment tree

int split(int l, int r) {
  int pw2 = 1 << __lg(r - l);
  return min(l + pw2, r - pw2 / 2);
}
void dfs(int v, int l, int r) { // [l, r)
  if (r - l == 1) {
    return;
  }
  int m = split(l, r);
  dfs(2 * v, l, m);
  dfs(2 * v + 1, m, r);
}
// dfs(1, 0, n);

It's called 'complete' because it’s a complete binary tree. Also, tourist uses the iterative version of this.

Structure

Consider the structures of 'standard' segment trees on n=1,2,3,...

(credit to this tool for making the pictures; here are the pictures in higher resolution)

Notice that going from one tree to the next, exactly one leaf node turns into an internal node and “grows” 2 new leaves. Now consider the sequence of the nodes which “grow” 2 new leaves:

1, 2, 3, 4, 6, 5, 7, 8, 12, 10, 14, 9, 13, 11, 15, …

It is exactly this sequence https://oeis.org/A059893. But this sequence is about reversing bits, so how is it related?

As you increment n, the child (left or right) of the root whose range-size increments will alternate (so it’s based on parity/LSB of n), then the child’s range-size is half of the root’s range-size and you repeat to determine which grandchild’s range-size increments (it also alternates), and so on. Eventually you get to a leaf, and this leaf’s range-size increments (i.e. “grows” 2 new leaves).

If you split up the sequence into subarrays [1,2), [2,4), [4,8), ..., [2^(i-1),2^i),..., then:

the i-th subarray represents all nodes at depth i
Every node at depth i grows its leaves before all nodes at depth (i+1), and after all nodes at depth (i-1)
So as n increases, the standard segment tree grows its leaves layer by layer, i.e. each layer is completed fully before the next layer’s leaves start growing.
abs(depth(leaf_i)-depth(leaf_j))<=1 for any pairs of leaves
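
The bit-reversal connection can be checked by brute force. Here's a minimal sketch (the helper names collect_leaves, growing_leaf and reverse_below_msb are made up for this example): it finds the leaf of the tree on n elements which is no longer a leaf in the tree on n+1 elements, and compares it with n after reversing all bits below the most significant one.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Collect the heap indices of all leaves of the 'standard' tree on [l, r).
void collect_leaves(int v, int l, int r, set<int>& leaves) {
  if (r - l == 1) {
    leaves.insert(v);
    return;
  }
  int m = (l + r + 1) / 2;
  collect_leaves(2 * v, l, m, leaves);
  collect_leaves(2 * v + 1, m, r, leaves);
}

// The leaf of the tree on n elements which becomes an internal node
// (i.e. "grows" 2 new leaves) in the tree on n + 1 elements.
int growing_leaf(int n) {
  set<int> before, after;
  collect_leaves(1, 0, n, before);
  collect_leaves(1, 0, n + 1, after);
  int found = -1;
  for (int v : before) {
    if (!after.count(v)) {
      assert(found == -1);  // exactly one leaf should grow
      found = v;
    }
  }
  return found;
}

// Keep the most significant bit of n, reverse the bits below it.
int reverse_below_msb(int n) {
  int res = 1;
  for (; n > 1; n >>= 1) res = 2 * res + (n & 1);
  return res;
}
```

Reading the bits of n from least significant to most significant is the same as walking from the root down to the growing leaf: a 0 bit goes to the left child, a 1 bit goes to the right child.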

Point updates/queries

Expected number of recursive calls of point update

= expected depth of leaf node (where depth of root=1)

= sum_of_leaf_depths/n

So how to work out the sum of depths of leaves?

Well there are x leaves of depth __lg(n)+1 and y leaves of depth __lg(n)+2. We have:

x+y=n
y = 2*(bit_floor(n)-x) because there are bit_floor(n)-x internal nodes at depth __lg(n)+1, each having 2 leaf children.

Then sum of depths of leaves = x*(__lg(n)+1)+y*(__lg(n)+2).

Finally notice, this math is exactly the same for both the complete segment tree and the standard segment tree, so they have the same sum of leaf depths.
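
To sanity check this, here's a small sketch which solves the two equations above for x and y (x = 2*bit_floor(n) - n, y = 2*(n - bit_floor(n))) and compares the closed form with a direct dfs over both trees. The function names are made up for this example.

```cpp
#include <bits/stdc++.h>
using namespace std;

int split_standard(int l, int r) { return (l + r + 1) / 2; }

int split_complete(int l, int r) {
  int pw2 = 1 << __lg(r - l);
  return min(l + pw2, r - pw2 / 2);
}

// Sum of leaf depths computed directly (depth of root = 1).
long long sum_leaf_depths(int l, int r, int depth, int (*split)(int, int)) {
  if (r - l == 1) return depth;
  int m = split(l, r);
  return sum_leaf_depths(l, m, depth + 1, split) +
         sum_leaf_depths(m, r, depth + 1, split);
}

// Closed form: x leaves at depth __lg(n)+1 and y leaves at depth __lg(n)+2.
long long sum_leaf_depths_formula(int n) {
  long long lg = __lg(n), floor_pow2 = 1LL << lg;
  long long x = 2 * floor_pow2 - n;    // from x + y = n and y = 2 * (floor_pow2 - x)
  long long y = 2 * (n - floor_pow2);
  return x * (lg + 1) + y * (lg + 2);
}
```

Dividing the result by n gives the expected number of recursive calls of a point update on a uniformly random index.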

Also for merge sort tree: (sum of array lengths) = (sum of leaf depths). So the respective merge sort trees have the same sum of array lengths.

Range updates/queries

Expected number of recursive calls of range update

= total_recursive_calls_over_all_possible_updates/(n*(n+1)/2)

Test comparing total number of recursive calls

#include <bits/stdc++.h>
using namespace std;

int split(int tl, int tr) {
  int pw2 = 1 << __lg(tr - tl);
  return min(tl + pw2, tr - pw2 / 2);
}

void update_complete_tree(int l, int r, int tl, int tr, int& num_calls) {
  num_calls++;
  if (r <= tl || tr <= l) return;
  if (l <= tl && tr <= r) return;
  int tm = split(tl, tr);
  update_complete_tree(l, r, tl, tm, num_calls);
  update_complete_tree(l, r, tm, tr, num_calls);
}

void update_standard_tree(int l, int r, int tl, int tr, int& num_calls) {
  num_calls++;
  if (r <= tl || tr <= l) return;
  if (l <= tl && tr <= r) return;
  int tm = (tl + tr + 1) / 2;
  update_standard_tree(l, r, tl, tm, num_calls);
  update_standard_tree(l, r, tm, tr, num_calls);
}

int main() {
  for (int n = 1; n < 1100; n++) {
    int complete_calls = 0, standard_calls = 0;
    for (int l = 0; l < n; l++) {
      for (int r = l + 1; r <= n; r++) {
        update_complete_tree(l, r, 0, n, complete_calls);
        update_standard_tree(l, r, 0, n, standard_calls);
      }
    }
    assert(complete_calls <= standard_calls);  // unexpected!!!
    cout << n << ' ' << complete_calls << ' ' << standard_calls << endl;
  }

  return 0;
}

I was blown away by this. I thought that the complete segment tree should have more recursive calls because the ranges aren’t split in the middle. But somehow this is wrong? Let’s see why.

I was able to come up with this formula for the total number of recursive calls (derivation left to the reader haha).

formula

#include <bits/stdc++.h>
using namespace std;

int split(int tl, int tr) {
  // return (tl + tr + 1) / 2;  // try this too!
  // return tl + 1;  // the formula also works for unbalanced trees
  int pw2 = 1 << __lg(tr - tl);
  return min(tl + pw2, tr - pw2 / 2);
}

void update(int l, int r, int tl, int tr, int& num_calls_naive) {
  num_calls_naive++;
  if (r <= tl || tr <= l) return;
  if (l <= tl && tr <= r) return;
  int tm = split(tl, tr);
  update(l, r, tl, tm, num_calls_naive);
  update(l, r, tm, tr, num_calls_naive);
}

void dfs(int tl, int tr, int depth, int n, int& num_calls_fast) {
  num_calls_fast -= (tr - tl) * (tr - tl);
  if (tr - tl == 1) {
    num_calls_fast += depth * (2 * n + 3);
    return;
  }
  int tm = split(tl, tr);
  dfs(tl, tm, depth + 1, n, num_calls_fast);
  dfs(tm, tr, depth + 1, n, num_calls_fast);
}

int main() {
  for (int n = 1; n <= 1000; n++) {
    int num_calls_naive = 0;
    for (int l = 0; l < n; l++) {
      for (int r = l + 1; r <= n; r++) {
        update(l, r, 0, n, num_calls_naive);
      }
    }
    int num_calls_fast = (-7 * n * n - 3 * n + 4) / 2;
    dfs(0, n, 1, n, num_calls_fast);
    assert(num_calls_naive == num_calls_fast);
    cout << n << "    " << num_calls_naive << ' ' << num_calls_fast << endl;
  }
  return 0;
}

Comparing this formula for complete versus standard segment tree:

(-7n^2-3n+4)/2 is the same
sum of [depth*(2n+3)] over leaves is the same as they both have the same number of leaves at depths __lg(n)+1 and __lg(n)+2

The only difference is we subtract f(n) = n^2 + f(left child range-size) + f(right child range-size) from the total number of calls. And f(n) increases when abs(left child range-size - right child range-size) increases, i.e. as the tree becomes less balanced.
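
Here's a small sketch of just that subtracted term, assuming f(1) = 1 (which matches the dfs above subtracting (tr - tl)^2 at leaves too). The name subtracted_term is made up for this example.

```cpp
#include <bits/stdc++.h>
using namespace std;

int split_standard(int l, int r) { return (l + r + 1) / 2; }

int split_complete(int l, int r) {
  int pw2 = 1 << __lg(r - l);
  return min(l + pw2, r - pw2 / 2);
}

// f(n) = n^2 + f(left child range-size) + f(right child range-size), f(1) = 1.
// This is the sum of (range-size)^2 over all nodes of the tree on [l, r).
long long subtracted_term(int l, int r, int (*split)(int, int)) {
  long long len = r - l;
  if (len == 1) return 1;
  int m = split(l, r);
  return len * len + subtracted_term(l, m, split) + subtracted_term(m, r, split);
}
```

For example with n = 6, the standard tree splits into 3 + 3 and gives f = 68, while the complete tree splits into 4 + 2 and gives f = 70. So the complete tree subtracts 2 more, and ends up with 2 fewer total recursive calls.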

To me, this result is unintuitive. I’d be interested if anyone can come up with some intuition for this.

But then why is the complete segment tree slower...

Calculating the midpoint has a larger constant.

https://judge.yosupo.jp/submission/270739 (288 ms, 15.82 MiB): complete segment tree
https://judge.yosupo.jp/submission/270737 (298 ms, 16.27 MiB): standard segment tree, but with artificially inflated constant factor of calculating midpoint

[tutorial] string matching: binary search over suffix array in O(log(|s|)+|p|)

By lrvideckis, history, 6 months ago, In English

Recently I was learning about how to binary search over suffix array to solve string matching (specifically, single text, multiple patterns, solve it online). These resources (1 ("Pattern Query" section), 2, 3) describe how to solve it in O(|p| * log(|s|)), but I'll describe how to improve this to O(|p| + log(|s|)). There already exist resources online about this, but I will try to simplify it.

visualizing suffix array

Let's take the text s="banana", and consider all suffixes (written vertically):

a
n a
a n a
n a n a
a n a n a
b a n a n a
0 1 2 3 4 5

now let's sort it:

      a
    a n
    n a   a
  a a n   n
  n n a a a
a a a b n n
5 3 1 0 4 2

s's suffix array is [5, 3, 1, 0, 4, 2].

Now take any substring of s (like "an"). Observe there exists a maximal-length subarray of s's suffix array ([3,1]) representing all the suffixes (3 -> "ana", 1 -> "anana") where "an" is a prefix. I like to call this the "subarray of matches": as this subarray represents all the starting indexes (in s) of a match.

Let's define a function which calculates this: subarray_of_matches(s[s_l:s_r]) = suffix_array[suffix_array_l:suffix_array_r].

Now observe that subarray_of_matches(s[s_l:s_r+1]) is nested inside subarray_of_matches(s[s_l:s_r]). This is because every spot where s[s_l:s_r+1] is a match, s[s_l:s_r] is also a match.

But in particular, we can take some suffix (like "anana") and plug in all of its prefixes to subarray_of_matches and we get a sequence of nested subarrays:

subarray_of_matches("a") = [5,3,1]
subarray_of_matches("an") = [3,1]
subarray_of_matches("ana") = [3,1]
subarray_of_matches("anan") = [1]
subarray_of_matches("anana") = [1]
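
To make subarray_of_matches concrete, here's a minimal sketch of the slow baseline: a naive suffix array build (sorting the suffixes directly) plus two binary searches which compare the whole pattern at every step, so O(|p| * log(|s|)) per query. It's not the improved algorithm from this blog, and build_suffix_array_naive is a name made up for this example.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Naive build: sort suffix start indexes by comparing the suffixes directly.
vector<int> build_suffix_array_naive(const string& s) {
  vector<int> sa(s.size());
  iota(sa.begin(), sa.end(), 0);
  sort(sa.begin(), sa.end(), [&](int i, int j) {
    return s.compare(i, string::npos, s, j, string::npos) < 0;
  });
  return sa;
}

// Returns the half-open range [lo, hi) of the suffix array such that
// p is a prefix of s[sa[i]:] exactly for lo <= i < hi.
pair<int, int> subarray_of_matches(const string& s, const vector<int>& sa,
                                   const string& p) {
  // first suffix whose first |p| characters are >= p
  auto lo = lower_bound(sa.begin(), sa.end(), p, [&](int i, const string& pat) {
    return s.compare(i, pat.size(), pat) < 0;
  });
  // first suffix whose first |p| characters are > p
  auto hi = upper_bound(lo, sa.end(), p, [&](const string& pat, int i) {
    return s.compare(i, pat.size(), pat) > 0;
  });
  return {int(lo - sa.begin()), int(hi - sa.begin())};
}
```

For s="banana" and p="an" this returns the range [1, 3), i.e. the subarray [3,1] from above. A pattern which doesn't occur gives an empty range.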

in general, you can visualize it like this:

One more point: consider any subarray of the suffix array: suffix_array[suffix_array_l:suffix_array_r] and let lcp_length = the length of the longest common prefix of these suffixes. Formally: for each i in range [suffix_array_l,suffix_array_r) the strings s[suffix_array[i]:suffix_array[i]+lcp_length] are all equal.

Now consider the set of next letters: s[suffix_array[i]+lcp_length], they are sorted. We can visualize it like:

visualizing the binary search

Given text s, s's suffix array, and query string p, our goal is to calculate the minimum index i such that p is a prefix of s[suffix_array[i]:] (the suffix of s starting at suffix_array[i])

so we can start the binary search as usual with l=0,r=size(s),m=(l+r)/2, and compare p to s[suffix_array[m]:]. if p is less, search lower (r=m;), else search higher (l=m;).

Let's also keep track of the "best" suffix so far, i.e. the suffix which matches the most characters in p. Let's store it as a pair {best_i_so_far,count_matched} with the invariant: p[:count_matched] == s[best_i_so_far:best_i_so_far+count_matched].

So now, depending on whether p[:count_matched+1] is less/greater than s[best_i_so_far:best_i_so_far+count_matched+1] we want to look for the green section which is before/after best_i_so_far. And here, we have the cases taken from here:

the middle red section will not contain the answer because the pattern p didn't match with that letter g, so it still won't match anywhere in that range.
the green sections will contain a match which is either better or the same as the middle red section
the outer red sections won't contain the answer because the LCP is too low

So back to our binary search, we have l,r,m=(l+r)/2.

If m lies in either of the red sections, then we can check for this in O(1) using an LCP query, and continue the search "towards" the green section.
If m is already in the green section, then continue matching p[count_matched:] with s[best_i_so_far+count_matched:] (and also update our best match {best_i_so_far,count_matched})

we start comparing characters in p starting from count_matched, so we only compare characters in p once, and achieve complexity O(log(|s|) + |p|).

code: https://codeforces.me/edu/course/2/lesson/2/3/practice/contest/269118/submission/277693465


[tutorial] O(n),O(1) level ancestor, not method of 4 russians

By lrvideckis, history, 12 months ago, In English

I recently read “Still Simpler Static Level Ancestors by Torben Hagerup” describing how to process a rooted tree in O(n) time/space to be able to answer online level ancestor queries in O(1). I would like to explain it here. Thank you to camc for proof-reading & giving feedback.

background/warmup

Prerequisites: ladder decomposition: https://codeforces.me/blog/entry/71567?#comment-559299, and <O(n),O(1)> offline level ancestor https://codeforces.me/blog/entry/52062?#comment-360824

First, to define a level ancestor query: For a node u, and integer k, let LA(u, k) = the node k edges “up” from u. Formally a node v such that:

v is an ancestor of u
distance(u, v) = k (distance here is number of edges)

For example LA(u, 0) = u; LA(u, 1) = u’s parent

Now the slowest part of ladder decomposition is the O(n log n) binary lifting. Everything else is O(n). So the approach will be to swap out the binary lifting part for something else which is O(n).

We can do the following, and it will still be O(n):

Store the answers to O(n) level ancestor queries of our choosing (answered offline during the initial build)
Normally in ladder decomposition, length(ladder) = 2 * length(vertical path). But we can change this to length(ladder) = c * length(vertical path) for any constant c (of course the smaller the better).

The key observation about ladders: Given any node u and integer k (0 <= k <= depth[u] / 2): The ladder which contains LA(u, k) also contains LA(u, 2*k); or generally LA(u, c*k) when we extend the vertical paths by the multiple of c.

the magic sequence

Let’s take a detour to the following sequence a(i) = (i & -i) = 1 << __builtin_ctz(i) for i >= 1 https://oeis.org/A006519

1 2 1 4 1 2 1 8 1 2 1 4 1 2 1 16 1 2 1 4 1 2 1 8 1 2 1 4 1 2 1 32 1 2 1 4 1 2 1 …

Observe: for every value 2^k, it shows up first at index 2^k (1-based), then every 2^(k+1)-th index afterwards.

Q: Given index i >=1, and some value 2^k, I can move left or right. What’s the minimum steps I need to travel to get to the nearest value of 2^k?

A: at most 2^k steps. The worst case is when I start at a(i) = 2^l > 2^k, i.e. exactly in the middle between the previous and the next occurrence of 2^k.
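
Here's a small sketch of that Q/A (nearest_index_with_value is a name made up for this example, and it's just one way to do it). The value 2^k shows up exactly at indexes which are 2^k mod 2^(k+1), so we can round i down to a multiple of 2^(k+1) and add 2^k. Note it ignores any upper bound on the index, which matters later when the sequence is cut off at the length of the euler tour.

```cpp
#include <bits/stdc++.h>
using namespace std;

// a(i) = largest power of 2 dividing i, for i >= 1
int a(int i) { return i & -i; }

// An index j >= 1 with a(j) == 2^k which is nearest to i (ties are possible).
int nearest_index_with_value(int i, int k) {
  int period = 1 << (k + 1);
  return i / period * period + (1 << k);
}
```

The low k+1 bits of i are somewhere in [0, 2^(k+1)), and we replace them with exactly 2^k, so we move at most 2^k steps.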

the algorithm

Let’s do a 2n-1 euler tour; let the i’th node be euler_tour[i]; i >= 1. Let’s calculate an array jump[i] = LA(euler_tour[i], a(i)) offline, upfront.
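
Here's a minimal sketch of just this setup step. Note the assumptions: the tree is given as children lists, jump[i] is calculated by naively walking parent pointers (so it's not the O(n) offline approach from the prerequisites), and jump[i] = -1 when the jump would go above the root. The names EulerJumps and build_euler_jumps are made up for this example.

```cpp
#include <bits/stdc++.h>
using namespace std;

struct EulerJumps {
  vector<int> euler_tour;  // 1-based; euler_tour[0] is a dummy -1
  vector<int> jump;        // jump[i] = LA(euler_tour[i], i & -i), or -1 if above the root
};

EulerJumps build_euler_jumps(const vector<vector<int>>& children, int root) {
  int n = children.size();
  vector<int> parent(n, -1);
  EulerJumps res;
  res.euler_tour.push_back(-1);
  // 2n-1 euler tour: write down a node on entry, and again after each child
  function<void(int)> dfs = [&](int u) {
    res.euler_tour.push_back(u);
    for (int v : children[u]) {
      parent[v] = u;
      dfs(v);
      res.euler_tour.push_back(u);
    }
  };
  dfs(root);
  int len = res.euler_tour.size();
  res.jump.assign(len, -1);
  for (int i = 1; i < len; i++) {
    int u = res.euler_tour[i];
    // naive walk of a(i) = i & -i edges upwards
    for (int steps = i & -i; steps > 0 && u != -1; steps--) u = parent[u];
    res.jump[i] = u;
  }
  return res;
}
```

For the path 0-1-2 rooted at 0, the euler tour is 0,1,2,1,0 and the only jump which stays inside the tree is jump[3] = LA(2, 1) = 1.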

how to use the “jump” array to handle queries?

Let node u, and integer k be a query. We can know i = u’s index in the euler tour (it can show up multiple times; any index will work).

key idea: We want to move either left or right in the euler tour to find some “close”-ish index j with a “big” jump upwards. But not too big: we want to stay in the subtree of LA(u,k). Then we use the ladder containing jump[j] to get to LA(u,k). The rest of the blog will be all the math behind this.

It turns out we want to find closest index j such that 2*a(j) <= k < 4*a(j). Intuition: we move roughly k/2 steps away in the euler tour to get to a node with an upwards jump of size roughly k/2.

Note if we move to j: abs(depth[euler_tour[i]] - depth[euler_tour[j]]) <= abs(i - j) <= a(j)

how to calculate j in O(1)

Note we’re not creating a ladder of length c*a(i) starting from every node because that sums to c*(a(1)+a(2)+...+a(2n-1)) = O(nlogn). Rather it’s a vertical path decomposition (sum of lengths is exactly n), and each vertical path is extended upwards to c*x its original length into a ladder (sum of lengths <= c*n)

finding smallest `c` for ladders to be long enough

Let’s prove some bounds on d = depth[euler_tour[j]] - depth[LA(u,k)]

Note 2*a(j) <= k from earlier. So a(j) = 2*a(j) - a(j) <= k - a(j) <= d. This implies jump[j] stays in the subtree of LA(u,k).

Note k < 4*a(j) from earlier. So d <= a(j) + k < a(j) + 4*a(j) = 5*a(j)

Remember, we can choose a constant c such that length(ladder) = c * length(vertical path). Now let’s figure out the smallest c such that the ladder containing jump[j] will also contain LA(u,k):

if we choose c=5 then length(ladder) = 5 * length(vertical path) >= 5 * a(j) > d

I mean that constant factor kinda sucks :( Well, at least we’ve shown a way to do the almighty, all-powerful theoretical O(n)O(1) level ancestor, all hail to the almighty. If thou is tempted by method of 4 russians, thou shall receive eternal punishment

here’s a submission with everything discussed so far: https://judge.yosupo.jp/submission/194335

a cool thing

Here’s some intuition for why we chose the sequence a(i): It contains arbitrarily large jumps which appear regularly, sparsely. Sound familiar to any algorithm you know? It feels like linear jump pointers https://codeforces.me/blog/entry/74847. Let’s look at the jump sizes for each depth:

1,1,3,1,1,3,7,1,1,3,1,1,3,7,15,1,1,3,...

Map x -> (x+1)/2 and you get https://oeis.org/A182105 . https://oeis.org/A006519 is kinda like the in-order traversal of a complete binary tree, and https://oeis.org/A182105 is like the post-order traversal.

improving the constant factor

Let’s introduce a constant kappa (kappa >= 2).

Instead of calculating jump[i] as LA(euler_tour[i], a(i)), calculate it as LA(euler_tour[i], (kappa-1) * a(i))

Let node u, and integer k be a query. We want to find nearest j such that kappa*a(j) <= k < 2*kappa*a(j)

Then you can bound d like: (kappa-1) * a(j) <= d < (2 * kappa + 1) * a(j)

And you can show c >= (2 * kappa + 1) / (kappa - 1)

The catch is when k < kappa, you need to calculate LA(u,k) naively (or maybe store them).

The initial explanation is for kappa = 2 btw
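
Here's a tiny sketch which brute-forces the bounds on d for a given kappa. It assumes d can be anything in [k - a(j), k + a(j)] (from the note that the depth changes by at most a(j) when moving from i to j), and the function names are made up for this example.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Lower bound on c for a given kappa: (2 * kappa + 1) / (kappa - 1)
double min_ladder_multiple(int kappa) {
  assert(kappa >= 2);
  return double(2 * kappa + 1) / (kappa - 1);
}

// Brute force check of (kappa-1) * a <= d < (2*kappa+1) * a over all
// a = 1, 2, 4, ..., max_a, all k with kappa*a <= k < 2*kappa*a,
// and all d with k - a <= d <= k + a.
bool bounds_hold(int kappa, int max_a) {
  for (int a = 1; a <= max_a; a *= 2) {
    for (int k = kappa * a; k < 2 * kappa * a; k++) {
      for (int d = k - a; d <= k + a; d++) {
        if (d < (kappa - 1) * a || d >= (2 * kappa + 1) * a) return false;
      }
    }
  }
  return true;
}
```

With kappa = 2 this gives c >= 5 as before, kappa = 4 gives c >= 3, and as kappa grows the bound approaches 2 (but then more queries with k < kappa need the naive fallback).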

submission with kappa: https://judge.yosupo.jp/submission/194336


O(n),O(1) lca/rmq; not method of 4 russians

By lrvideckis, history, 13 months ago, In English

I came across this paper

On Finding Lowest Common Ancestors: Simplification and Parallelization by Baruch Schieber, Uzi Vishkin April, 1987

so naturally I tried to code golf it

lca https://judge.yosupo.jp/submission/188189

rmq https://judge.yosupo.jp/submission/188190

edit: minor golf


2n memory segment tree by choosing midpoint

By lrvideckis, history, 2 years ago, In English

Hi Codeforces! If you calculate midpoint like

int get_midpoint(int l, int r) {//[l, r)
	int pow_2 = 1 << __lg(r-l);//bit_floor(unsigned(r-l));
	return min(l + pow_2, r - pow_2/2);
}

then your segment tree requires only $$$2 \times n$$$ memory.
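
For example, here's a minimal sketch of a point-add, range-sum segment tree stored in an array of size 2*n. SumSegTree is a name made up for this example, and the writes use at() so that an out-of-range node index would throw instead of silently corrupting memory.

```cpp
#include <bits/stdc++.h>
using namespace std;

int get_midpoint(int l, int r) {  // [l, r)
  int pow_2 = 1 << __lg(r - l);
  return min(l + pow_2, r - pow_2 / 2);
}

struct SumSegTree {
  int n;
  vector<long long> tree;  // node v is at tree[v]; root is node 1; size 2 * n
  SumSegTree(int size) : n(size), tree(2 * size, 0) {}

  // a[pos] += delta
  void update(int pos, long long delta) { update(1, 0, n, pos, delta); }
  // sum of a[ql], ..., a[qr - 1]
  long long query(int ql, int qr) { return query(1, 0, n, ql, qr); }

 private:
  void update(int v, int l, int r, int pos, long long delta) {
    tree.at(v) += delta;  // at(): throws if v >= 2 * n
    if (r - l == 1) return;
    int m = get_midpoint(l, r);
    if (pos < m) update(2 * v, l, m, pos, delta);
    else update(2 * v + 1, m, r, pos, delta);
  }
  long long query(int v, int l, int r, int ql, int qr) {
    if (qr <= l || r <= ql) return 0;
    if (ql <= l && r <= qr) return tree[v];
    int m = get_midpoint(l, r);
    return query(2 * v, l, m, ql, qr) + query(2 * v + 1, m, r, ql, qr);
  }
};
```

The only difference from a usual recursive segment tree is the call to get_midpoint instead of (l+r)/2.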

test

#include <bits/stdc++.h>
using namespace std;


int get_midpoint(int l, int r) {//[l, r)
	int pow_2 = 1 << __lg(r-l);//bit_floor(unsigned(r-l));
	return min(l + pow_2, r - pow_2/2);
}


int n;
set<int> internal_nodes, leaf_nodes;
map<int, int> depth_of_segments_not_pow2;

void build(int v, int l, int r) {//[l, r)
	if(r-l == 1) {
		const int depth_leaf = __lg(v), max_depth = __lg(2*n-1);
		if(l == 0) {//left-most leaf
			assert(v == int(bit_ceil(unsigned(n))));
			assert(depth_leaf == max_depth);
		}
		assert(depth_leaf == max_depth || depth_leaf == max_depth - 1);
		if((n&(n-1)) == 0) assert(depth_leaf == max_depth);
		assert(n <= v && v < 2*n);
		assert(!leaf_nodes.count(v));
		leaf_nodes.insert(v);
		return;
	}
	if(((r-l)&(r-l-1)) == 0) assert(get_midpoint(l,r) == (l+r)/2);
	else depth_of_segments_not_pow2[__lg(v)]++;
	assert(1 <= v && v < n);
	assert(!internal_nodes.count(v));
	internal_nodes.insert(v);

	int m = get_midpoint(l, r);
	build(2*v, l, m);
	build(2*v+1, m, r);
}

int main() {
	for(n = 1; n <= 520; n++) {
		cout << "n: " << n << endl;
		internal_nodes.clear();
		leaf_nodes.clear();
		depth_of_segments_not_pow2.clear();

		build(1, 0, n);

		assert(ssize(internal_nodes) == n-1);
		assert(ssize(leaf_nodes) == n);
		for(int i = 1; i < n; i++) assert(internal_nodes.count(i));
		for(int i = n; i < 2*n; i++) assert(leaf_nodes.count(i));
		//at most one "bad" segment per depth
		//either left child or right child (or both) will have segment length
		//a power of 2; then their subtrees are a perfect binary tree
		for(auto [depth, cnt] : depth_of_segments_not_pow2) assert(cnt == 1);
		assert(ssize(depth_of_segments_not_pow2) <= __lg(n));
	}
	return 0;
}

proof

induction assumption: the segment tree with root segment $$$[0,n)$$$ turns into a complete binary tree which has $$$2 \times n - 1$$$ nodes and max depth = __lg(2*n-1).

notes about induction assumption

observe:

0 = __lg(1)
1 = __lg(2) = __lg(3)
2 = __lg(4) = __lg(5) = __lg(6) = __lg(7)
...

so __lg(v) = depth of node v
max depth of complete binary tree = depth of any node on lowest level
node 2*n-1 is always on the lowest level

Also note: get_midpoint(l + x, r + x) = get_midpoint(l, r) + x so we can "shift" any segment $$$[l,r)$$$ to $$$[0,r-l)$$$ and the corresponding segment trees have the same structure, hint: min(a + x, b + x) = min(a, b) + x

Induction step

case 0: $$$r-l$$$ is a power of 2

details

case 1: $$$l + pow2 < r - \frac{pow2}{2}$$$

details

From this, the left and right children have the same max depth; the left child is a perfect binary tree; the right child is a complete binary tree, so overall it's a complete binary tree.

case 2: $$$l + pow2 \ge r - \frac{pow2}{2}$$$

details

From this, the left child has max depth = 1 + right child max depth. The left child is a complete binary tree; the right child is a perfect binary tree, so overall, it's a complete binary tree.

I was inspired by ecnerwala's in_order_layout https://github.com/ecnerwala/cp-book/blob/master/src/seg_tree.hpp

I'll be waiting for some comment "It's well known in china since 2007" 😂


Why do you do CP?

By lrvideckis, history, 5 years ago, In English

Here are some reasons I believe people do CP:

Enjoyment
Preparation for job (coding) interviews
Preparation for Competitions

CP seems to be a low priority activity. For example, school and job responsibilities usually take higher priority. It’s not possible (realistically) to make a living from CP. Thus, people taking part in CP usually have good reasons.

The nihilist viewpoint says CP is just solving contrived made-up problems. Who cares? There’s little/no benefit to society. Unlike CP, programming for a job creates services (value) for people. Unlike CP, Computer Science research pushes the boundaries of knowledge of the field (value). Why spend time in an activity which doesn’t produce relative value? Again, it seems the people doing CP must have good reasons.

I’m wondering what people’s reasons are for doing CP.

