O(n),O(1) lca/rmq; not method of 4 russians

→ Pay attention

Before contest
Codeforces Round (Div. 1)
7 days

Before contest
Codeforces Round (Div. 2)
7 days

→ Streams

Codeforces Blitz Cup 2025: aryanc403 vs 244mhq (R1)

By aryanc403

Before stream 18:02:52

View all →

→ Top rated

#	User	Rating
1	tourist	3857
2	jiangly	3747
3	orzdevinwang	3706
4	jqdai0815	3682
5	ksun48	3591
6	gamegame	3477
7	Benq	3468
8	Radewoosh	3463
9	ecnerwala	3451
10	heuristica	3431

Countries | Cities | Organizations

View all →

→ Top contributors

#	User	Contrib.
1	cry	165
2	-is-this-fft-	161
3	Qingyu	160
4	Dominater069	158
5	atcoder_official	157
6	adamant	155
7	Um_nik	151
8	djm03178	150
8	luogu_official	150
10	awoo	148

View all →

→ Find user

→ Recent actions

Detailed →

lrvideckis's blog

O(n),O(1) lca/rmq; not method of 4 russians

By lrvideckis, history, 13 months ago, In English

I came across this paper

On Finding Lowest Common Ancestors: Simplification and Parallelization by Baruch Schieber, Uzi Vishkin April, 1987

so naturally I tried to code golf it

lca https://judge.yosupo.jp/submission/188189

rmq https://judge.yosupo.jp/submission/188190

edit: minor golf

lrvideckis
13 months ago
11

Comments (11)

Write comment?

brunomont

13 months ago, # |

Cool! This algorithm seems to be almost the same size of the one described here, although maybe a bit slower.

→ Reply

lrvideckis

13 months ago, # ^ |

rip

→ Reply

lrvideckis

13 months ago, # ^ |

although looking at the test cases, this approach does worse on the small width query where the usual 4-russians approach would have an advantage: you hit the case with less operations

→ Reply

lrvideckis

13 months ago, # ^ |

on cses rmq: your implementation: .10s, my implementation: 0.12s

→ Reply

lrvideckis

7 months ago, # ^ |

← Rev. 2 →

+16

revisiting this, I code golfed it another way, and now it is a bit faster

https://judge.yosupo.jp/submission/222465

→ Reply

brunomont

7 months ago, # ^ |

Whoa, pretty small code! which algorithm is it using?

→ Reply

lrvideckis

7 months ago, # ^ |

+10

So it uses the same alg as in the paper; but you calculate the cartesian tree of the array, then the index of RMQ of [le, ri] is the LCA of nodes le, ri in the cartesian tree.

So the paper describes how to calculate 3 arrays INLABEL, ASCENDANT, HEAD from a rooted tree; and then how to answer O(1) LCA using these arrays. I thought the paper describes the alg quite well, so I won't re-explain it here. But I can mention a few key points:

So our task is to calculate these arrays INLABEL, ASCENDANT, HEAD from the cartesian tree of the array.

For a node u, the paper defines INLABEL[u] as (considering all nodes in u's subtree) the dfs-time-in value with maximum count of trailing zeros (__builtin_ctz) in it's binary representation.

And noting, among the numbers le, le+1, ..., ri-1, ri, the number with maximum count of trailing zeros is given by ri & -bit_floor(ri^(le-1)).

So we could calculate dfs-time-in of cartesian tree, then use the above formula with: le = time_in[u], ri = time_in[u]+subtree_size[u]-1; but there's a better way:

the paper has this remark:

Remark: The above computation is based on PREORDER numbering of the vertices of T. This numbering has the property that the numbers assigned to the subtree rooted at any vertex of T provide a consecutive series of integers. In fact, any alternative numbering having this property (e.g. POSTORDER, INORDER) will produce INLABEL numbers which will be suitable for our preprocessing stage.

And for cartesian tree, It makes more sense to calculate INLABEL over INORDER traversial instead of PREORDER, because the INORDER value of node u is just u.

And for cartesian tree, the INORDER values of the subtree of u is the range [index_of_previous_smaller[u]+1, index_of_next_smaller[u]-1], where the previous and next smaller indexes can be calculated with the monotonic stack alg.

Finally one more point is I don't actually build the cartesian tree, rather it is thought of as being "implicitly" used; just like for rectangular-array-graph problems, where you just loop over the 4 (or 8) neighbors "implicitly" in a loop instead of building the graph "explicitly"

So if you take the code:

vector<int> st;
for(int i = 0; i < n; i++) {
  int prev = -1;
  while(!empty(st) && a[st.back()] > a[i]) {
    // (1)
    prev = st.back();
    st.pop_back();
  }
  // (2)
  st.push_back(i);
}

It's kind of like doing an "implicit" dfs on cartesian tree: the order which nodes are popped off the stack is POSTORDER of cartesian tree for example. Also there's a bunch of invariants:

at spot 1, we have: