A simple introduction to "Segment tree beats"

#	User	Rating
1	jiangly	3976
2	tourist	3815
3	jqdai0815	3682
4	ksun48	3614
5	orzdevinwang	3526
6	ecnerwala	3514
7	Benq	3482
8	hos.lyric	3382
9	gamegame	3374
10	heuristica	3357

#	User	Contrib.
1	cry	169
2	-is-this-fft-	165
3	Um_nik	161
3	atcoder_official	161
5	djm03178	157
6	Dominater069	156
7	adamant	154
8	luogu_official	152
9	awoo	151
10	TheScrasse	147

Hi, I’d like to introduce a simple trick about segment tree in this blog as I promised in [this comment](http://codeforces.me/blog/entry/54750?#comment-387957). Sorry for the long delay, as a sophomore in Peking University, I've just finished a tired semester and a painful final exam. And now I finally have enough time to do a simple introduction to this interesting algorithm.↵
↵
It may be a huge project for me since my English is not good. I think I will finish this blog in several steps and I will try to finish it as soon as possible :)↵
↵
In China, all of the 15 candidates for the Chinese National Team are asked to write a simple research report about algorithms in informatics Olympiad, and the score will be counted in the final selection. There are many interesting ideas and algorithms in these reports. And I find that some of them are quite new for competitors in CF although they are well known in China from the final standings of some recent contests. For example, In the last contest CF Round #458, the problem G can be easily solved using "power series beats" (Thanks for [user:matthew99,2018-01-25] to give this name :)) in $O(2^{n} n^2)$ which was imported in China by [user:Vfleaking,2018-01-24] in 2015.↵
↵
This blog is about my report which is written about two years ago. I am satisfied with this work although there is a flaw about time complexity. [user:xyz111,2018-01-24] gave me a lot of inspiration, and the name “Segment tree beats” is given by [user:C_SUNSHINE,2018-01-24] which is from a famous Japanese anime “Angel Beats”.↵
↵
Here is [the link](http://www.doc88.com/p-6744902151779.html) about the Chinese version of this report.↵
↵
### Part1. What it can do.↵
↵
This work has two parts:↵
↵
1. Transform interval max/min (for $i \in [l,r], a_i = \max(a_i,x)$) operations into interval add/minus operations in $O(1)$ or $O(\log n)$↵
2. Transform history max/min/sum queries into interval max/min operations in $O(1)$.↵
↵
Here are some sample problems I used in my report. At first you have a array $A$ of length $n$ and two auxiliary arrays $B,C$. Initially, $B$ is equal to $A$ and $C$ is all zero.↵
↵
**Task 1**. There are two kinds of operations:↵
↵
1. For all $i \in [l,r]$, change $A_i$ to $\max(A_i, x)$↵
2. Query for the sum of $A_i$ in $[l,r]$↵
↵
It can be solved in $O(n \log n)$↵
↵
**Task 2**. We can add more operations to Task 1: ↵
↵
1. For all $i \in [l,r]$, change $A_i$ to $\max(A_i, x)$↵
2. For all $i \in [l,r]$, change $A_i$ to $\min(A_i, x)$↵
3. For all $i \in [l,r]$, change $A_i$ to $A_i + x$, $x$ can be a negative number↵
4. Query for the sum of $A_i$ in $[l,r]$↵
↵
It can be solved in $O(n \log^2 n)$ and I could not proved the exact time complexity yet, maybe It is still $O(n \log n)$ ;)↵
↵
**Task 3**. And we can query for some other things:↵
↵
1. For all $i \in [l,r]$, change $A_i$ to $\max(A_i, x)$↵
2. For all $i \in [l,r]$, change $A_i$ to $\min(A_i, x)$↵
3. For all $i \in [l,r]$, change $A_i$ to $A_i + x$, $x$ can be a negative number↵
4. Query for the sum of $B_i$ in $[l,r]$↵
↵
After each operation, for each $i$, if $A_i$ changed in this operation, add $1$ to $B_i$.↵
↵
It’s time complexity is the same as Task 2.↵
↵
**Task 4**. We can also deal with several arrays synchronously. Assume there are two arrays $A_1$ and $A_2$ of length $n$.↵
↵
1. For all $i \in [l,r]$, change $A_{a,i}$ to $\min(A_{a,i}, x)$↵
2. For all $i \in [l,r]$, change $A_{a,i}$ to $A_{a,i} + x$, $x$ can be a negative number↵
3. Query for the max $A_{1,i} + A_{2,i}$ in $[l,r]$↵
↵
It can be solved in $O(n \log^2 n)$ and if there are $k$ arrays, the time complexity will be raised to $O(n \log^2 n 2^k)$.↵
↵
**Task 5**.  And there are some tasks about historic information.↵
↵
1. For all $i \in [l,r]$, change $A_{i}$ to $A_{i} + x$, $x$ can be a negative number.↵
2. Query for the sum of $B_i$ in $[l,r]$.↵
3. Query for the sum of $C_i$ in $[l,r]$↵
↵
After each operation, for each $i$, change $B_i$ to $\max(B_i,A_i)$ and add $A_i$ to $C_i$.↵
↵
The query for $B_i$ can be solved in $O(n \log^2 n)$ and the query for $C_i$ can be solved in $O(n \log n)$.↵
↵
**Task 6**. We can even merged the two parts together. ↵
↵
1. For all $i \in [l,r]$, change $A_{i}$ to $\max(A_{i}, x)$↵
2. For all $i \in [l,r]$, change $A_{i}$ to $A_{i} + x$, $x$ can be a negative number.↵
3. Query for the sum of $B_i$ in $[l,r]$.↵
↵
After each operation, for each $i$, change $B_i$ to $\min(B_i,A_i)$.↵
↵
It can be solved in $O(n \log^3 n)$.↵
↵
There are $11$ sample tasks in my report and here are $6$ of them. All of them are interesting and are hard to solve using the traditional techniques such as lazy tags. ↵
↵
### Part2. The main idea↵
↵
#### Interval min/max operations↵
↵
To make the description clearer, I think it’s better to introduce an extended segment tree template. ↵
↵
I think most of the competitors’ templates of the lazy tag is like this ($[l, r]$ is the node's interval and $[ll, rr]$ is the operation's interval):↵
↵
```c++↵
void modify(int node, int l, int r, int ll, int rr) {↵
  if (l > rr || r < ll) return;↵
  if (l >= ll && r <= rr) {↵
      puttag(node); return;↵
  }↵
  pushdown(node);↵
  int mid = (l + r) >> 1;↵
  modify(node * 2, l, mid, ll, rr);↵
  modify(node * 2 + 1, mid + 1, r, ll ,rr);↵
  update();↵
}↵
```↵
↵
The main idea is **return whenever we can, put the tag whenever we can**:↵
↵
1. When the operation's interval and the node's interval are no longer intersected, the information inside this subtree must not be affected. So we can return immediately.↵
2. When the node's interval is contained by the operation's interval, all the information inside the subtree will be changed together. So we can put the tag on it and return.↵
↵
In other words, we can replace the two conditions arbitrarily, i.e., we can extend the template like this:↵
↵
```c++↵
void modify(int node, int l, int r, int ll, int rr) {↵
  if (break_condition()) return;↵
  if (tag_condition()) {↵
      puttag(node); return;↵
  }↵
  pushdown(node);↵
  int mid = (l + r) >> 1;↵
  modify(node * 2, l, mid, ll, rr);↵
  modify(node * 2 + 1, mid + 1, r, ll ,rr);↵
  update();↵
}↵
```↵
↵
What's the use of such a modification? In some advanced data structure tasks, it's impossible for us to put tags in such a weak condition `l >= ll && r <= rr`. But we can put it when the condition is stronger. We can use this template to deal with this kind of tasks but we need to analyze the time complexity carefully. ↵
↵
**Simple task:** There are three kinds of operations:↵
↵
1. For all $i \in [l,r]$, change $A_i$ to $A_i \ \text{mod}\ x$↵
2. For all $i \in [l,r]$, change $A_i$ to $x$↵
3. Query for the sum of $A_i$ in $[l,r]$↵
↵
It's a classic problem (it's the simple extension of [problem:438D]) and the traditional solution is to use balanced tree such as splay/treap to maintain the continuous segments with the same $A_i$ and for operation $2$, we find out all the segments with $A_i \geq x$ and change the value of each one.↵
↵
But if we use segment tree, we can get a much simpler solution: let `break_condition` be `l > rr || r < ll || max_value[node] < x ` and let `tag_condition` be `l >= ll && r <= rr and max_value[node] == min_value[node]`. And we can find that the time complexity of this segment tree is also $O(n \log^2 n)$.↵
↵
And now, we can easily describe the main idea of "segment tree beats". For each node, we maintain the maximum value `max_value[node]` and the strict second maximum value `second_value[node]`. ↵
↵
When we are doing the interval min operation for number $x$. let `break_condition` be `l > rr || r < ll || max_value[node] <= x` and let `tag_condition` be `l >= ll && r <= rr && second_value[node] < x`. Under such an condition, after put this tag, all of the maximum values inside this subtree will be changed to $x$ and they are still the maximum values of this subtree. ↵
↵
To make it easier to merge with other operations, we can maintain the values in this way: For each node, we maintain the maximum value inside this subtree and other values separately (the maximum values are the first kind and others are the second). Then, interval max operation will be changed to "add a number to the first kind values in some intervals". Keep the meanings of each kinds of values and the tags, you will find that the processes of pushdown and update will be much clearer.↵
↵
That's the main idea of the first part of "segment tree beats". It is very simple, right? And it also has a very nice time complexity. (I will give out the proof of the time complexity in the third part of this article).↵
↵
And let's see the first two sample tasks in part 1.↵
↵
**Task 1**. In this task, we can maintain the number of the first kind of values inside each node `t[node]`, and when we put tag "add $x$ to the first kind values in this subtree", the sum will be added by `t[node] * x`. Then we can easily maintain the information we need. ↵
↵
**Task 2**. In this task, we can maintain the maximum values/ minimum values and others separately, i.e., there are three kinds of numbers now, and we will use three sets of tags for each kind of values. Also, to deal with the queries of the interval's sum, we need to maintain the numbers of the first kind and the second kind of values. Pay attention to some boundary conditions, if there are just two different values in this subtree, there will be no third kind of values and if all the values are the same, the set of the first kind and the second kind will be the same↵
↵
Since I'm still on the journey, I‘m going to stop here today. And I will show the second part of this work in the next update. If there is no accident, I think I can finish this article this week#### Historic information↵
↵
In this part, we will focus on three special values which I named "historic information":↵
↵
1. historic maximal value: after each operation, change $B_i$ to $\max(B_i,A_i)$↵
2. historic minimal value: after each operation, change $B_i$ to $\min(B_i,A_i)$↵
3. historic sum: after each operation, add $C_i$ by $A_i$.↵
↵
You may wonder why we need to consider these values. But in China, there had been several problems about these values before I wrote this report. And maintaining these values is a much harder task than it looks like. Here is a data structure task in Tsinghua University Training 2015:↵
↵
**Sample Task**: [This](http://uoj.ac/problem/164) is the link of this task. There are five kinds of operations ($x$ is a positive integer.):↵
↵
1. For all $i \in [l,r]$, change $A_i$ to $A_i + x$.↵
2. For all $i \in [l,r]$, change $A_i$ to $\max(A_i + x,0)$. ↵
3. For all $i \in [l,r]$, change $A_i$ to $x$. ↵
4. Query for $A_i$.↵
5. Query for $B_i$.↵
↵
After each operation, change $B_i$ to $\max(B_i, A_i)$.↵
↵
To solve this task, ordinary segment tree is enough. But It's not easy to deal with the relationship between the lazy tags and the historic information. Since I do not have enough time to write today, I will update the solution later. ↵
↵
We have already get the way to deal with the interval min/max operations. And they are good tools to maintain historic information. ↵
↵
Now, let use consider such kind of problems (Part1. Task 5): interval add/minus, query for the interval sum of historic information. It's quite hard since we can't use the traditional lazy tag technique to solve it. But we can use an auxiliary array $D_i$ with the initial value of all zero to do some transformation:↵
↵
1. Historic maximal value: let $D_i = B_i - A_i$. Then if we change $A_i$ to $A_i + x$, $D_i$ will be changed to $\max(D_i - x, 0)$.↵
2. Historic maximal value: let $D_i = B_i - A_i$. Then if we change $A_i$ to $A_i + x$, $D_i$ will be changed to $\min(D_i - x, 0)$.↵
3. Historic sum: let $D_i = C_i - tA_i$ while $t$ is the current operation id. Then if we change $A_i$ to $A_i + x$, $D_i$ will be changed to $D_i - (t-1)x$.↵
↵
We can use the technique in the previous part about interval min/max operations to maintain the sum of $D_i$. And then we could get the sum of $B_i$ or $C_i$.↵
↵
What's more, since we have already transformed interval min/max operations to interval add/minus operations. So we can also maintain the interval sum of historic information under interval min/max operations (Part1. Task 6). This is a simple expansion and I think it's better to leave it as an exercise.↵
↵
That's all the main idea of "segment tree beats". And I will add the proof about time complexity in the next update. It's a little hard for me to explain the proof in English so maybe it will take me some time to form words. I think that will be the last update, and I will try to finish it soon :).↵
↵
I've never thought that this article will be so popular. Thanks a lot for the encouragement and the support!

Rev.	By	When	Δ	Comment
en21	jiry_2	2018-01-28 04:43:36	12	Tiny change: 'ven by [usr:xyz111] ' -> 'ven by [user:xyz111] '
en20	jiry_2	2018-01-27 14:17:29	117
en19	jiry_2	2018-01-27 14:08:14	1735	Tiny change: '90605.png)\n\nThus, ' -> '90605.png){:height="100px" width="400px"}\n\nThus, '
en18	jiry_2	2018-01-27 13:40:43	7	Tiny change: ', we will also use the t' -> ', we will still use the t'
en17	jiry_2	2018-01-27 13:39:20	1607	Tiny change: 'the total number of $\Phi(' -> 'the total increase of $\Phi('
en16	jiry_2	2018-01-27 12:56:03	365
en15	jiry_2	2018-01-27 12:43:04	4659	Add the proof for task 1
en14	jiry_2	2018-01-27 12:35:19	33
en13	jiry_2	2018-01-27 11:56:33	5	Tiny change: '& r <= rr and max_value' -> '& r <= rr && max_value'
en12	jiry_2	2018-01-27 10:29:51	49
en11	jiry_2	2018-01-27 10:12:20	4	Tiny change: 'Historic maximal value' -> 'Historic minimal value'
en10	jiry_2	2018-01-26 17:01:38	2806
en9	jiry_2	2018-01-25 17:21:34	136
en8	jiry_2	2018-01-25 17:17:52	5446	Tiny change: 'affected. \n2. When ' -> 'affected. So we can return immediately.\n2. When '
en7	jiry_2	2018-01-25 13:07:08	12
en6	jiry_2	2018-01-24 18:05:58	1	Tiny change: 'rged the to parts to' -> 'rged the two parts to'
en5	jiry_2	2018-01-24 18:02:56	3	Tiny change: 'in China for the final' -> 'in China from the final'
en4	jiry_2	2018-01-24 17:40:31	5	Tiny change: '2)$ which is import in China ' -> '2)$ which was imported in China '
en3	jiry_2	2018-01-24 17:26:29	2	Tiny change: ') in $O(2^n n^2)$ whi' -> ') in $O(2^{n} n^2)$ whi'
en2	jiry_2	2018-01-24 17:25:28	1	Tiny change: ' in $O(2^nn^2)$ whic' -> ' in $O(2^n n^2)$ whic'
en1	jiry_2	2018-01-24 17:24:51	5058	Initial revision (published)

History