Merging Queries Trick

#	User	Rating
1	tourist	4009
2	jiangly	3823
3	Benq	3738
4	Radewoosh	3633
5	jqdai0815	3620
6	orzdevinwang	3529
7	ecnerwala	3446
8	Um_nik	3396
9	ksun48	3390
10	gamegame	3386

#	User	Contrib.
1	cry	167
2	Um_nik	163
3	maomao90	162
3	atcoder_official	162
5	adamant	159
6	-is-this-fft-	158
7	awoo	157
8	TheScrasse	154
9	Dominater069	153
9	nor	153

Credits to FLEA for teaching me this trick.

Introduction

Recently I learnt an interesting trick I wanted to share with others. I'm not sure if this trick is well known, but I didn't know about it and didn't find any other articles on it so decided to write this blog.

Problem statement

We have a connected undirected graph with $$$n$$$ nodes and $$$m$$$ edges between them, each node having a value $$$w_i$$$ ($$$1 <= n, m <= 10^5$$$), ($$$1 <= w_i <= 10^9$$$).

Let's denote $$$f(a, b)$$$ as the minimum value of a node on a path from node $$$a$$$ to node $$$b$$$.

We have to answer $$$q$$$ queries about this graph. Each query contains $$$2$$$ nodes $$$a$$$, $$$b$$$ and asks for the maximum $$$f(a, b)$$$ over all possible paths from node $$$a$$$ to node $$$b$$$. ($$$1 <= q <= 10^5$$$).

Notes

This problem can be solved in another (more optimal) way, which is described in these comments: 1, 2. Thanks to valeriu and geniucos for mentioning it!

A similar idea from stefdasca.

Prerequisites

In order to fully understand the idea you have to know about DSU (disjoint set union) and small to large merging.

Solution

We will do some kind of DSU (disjoint set union). For each node, we don't only keep its parent, but also the queries it appears in (we store them in a map/set). Then we go through all the $$$n$$$ nodes in decreasing order of their values. When we are at a node — $$$u$$$, we "activate" it and then go through all the already activated neighbors the node has and we merge these two nodes.

How do we merge them?

First, we do the simple merge of the $$$2$$$ components with the classic $$$par[a] = b$$$. Then, we go through all queries the node is responsible for and check if it also appears in the query of the other node. If it does, we set the answer of that query to the weight of the node we are currently considering, since we know it will be the minimum on the path (because we are going through nodes in decreasing order of their values), if they don't both appear we just insert the the given query to the combined component and continue. But, since it would take too long, we merge the query sets with small to large merging (merge the node which has a smaller (in size) query set to the one that has a larger one). So, we do all this in $$$O((n + m) log^2n)$$$.

Some code

Note: ans[i] is the answer for query with index $$$i$$$.

First of all, we need the DSU functions — get (which returns the node responsible for the whole component) and union, which merges $$$2$$$ different components.

Since the only things we care about for a node are its parent and set of queries it's responsible for we can keep a pair of an integer and a map/set.

The union, as discussed in the above solution part involves small to large merging. If both nodes are responsible for a query then the answer for the query is the weight of the other node, otherwise we insert the query to the combined component.

Code for this part

const int N = 1e5 + 10;

vector<pair<int,int>> adj[N];

pair<int, map<int, int>> d[N];

int get(int u) {
    if(u == d[u].first) return u;
    return d[u].first = get(d[u].first);
}

void uni(int a, int b, int c) {
    a = get(a), b = get(b);

    if(d[a].second.size() > d[b].second.size()) {
        swap(a, b);
    }

    d[a].first = b;
    for(auto i: d[a].second) {
        if(d[b].second.count(i.first)) {
            ans[i.first] = c;
        } else d[b].second[i.first] = 1;
    }
}

After that, we need to set the initial components. Initially, nodes should have an empty query set and their parent should be themself only, and we can sort the nodes by decreasing order of their values in the same time (vector v will contain the node index on the second position). And, we can go also go through queries and add the queries responsible for a given node to a list.

Code for this part

vector<pair<ll, int>> v;

for(int i = 0;i < n; ++i) {
   v.pb({dist[i], i});
   d[i] = {i, {}};
   d[i].second.clear();
}
sort(v.rbegin(), v.rend());
int qq;
cin >> qq; // scanning the number of queries and node a, b of each query
vector<pair<int,int>> qr(qq);
vector<int> responsible_queries[N];
for(int i = 0;i < qq; ++i) {
   cin >> qr[i].first >> qr[i].second;
   --qr[i].first, --qr[i].second;
    responsible_queries[qr[i].first].pb(i);
    responsible_queries[qr[i].second].pb(i);
}

Now for the iteration in decreasing order of the values, we keep the boolean activated for each node, which tells us whether the node was already "activated" (gone through) or not.

Code for this part

vector<bool> activated(n, false);
    for(auto ff : v) {
        int u = ff.second, w = ff.first; //u - node, w - its value
        int l = get(u); //l - parent responsible for the component of the node
        activated[u] = true;
        for(int z: responsible_queries[u]) {
            if(!d[l].second.count(z))
                d[l].second[z] = 1; //we add the query to the set of queries the node is responsible for.
            else ans[z] = w; //The node is an endpoint of a query and its the smallest when merged to it, so the answer for the query is just the value of the node.
        }
        for(auto yy: adj[u]) {
            int ng = yy.first;
            if(activated[ng]){
                uni(u, ng, w); //if the node is activated we merge them in the manner described above.
            }
        }
    }

Then, we are left with outputting the answer for each query.

Variations

This trick can solve many variations of the problem, the most obvious one being that $$$f(a, b)$$$ would be the maximum value on the path instead of the minimum, which would require their little modifications

Comments (16)

Show archived | Write comment?

_Vanilla_

3 years ago, # |

Thank you sir for sharing this trick.

→ Reply

mesanu

based trick

qwexd

SlavicG top contributor poggers

TimDee

-7

Thanks, SlavicG, with help of your blogs I can get better in cp!

valeriu

← Rev. 2 →

+48

Why can't it be done like this?

SlavicG

3 years ago, # ^ |

+18

You are right! The MST solution is correct but I thought the trick is still worth sharing. I will add the MST solution to the blog as well.

In any case, remembered some problems that could be solved using this idea:

IZhO18_plan (can be solved with paralel binary search)

The problem you described, but poorer quality because of matrices (can be solved with paralel binary search, and the MST thing, I guess)

PurpleThinker

This problem is also very similar: link

geniucos

+39

You can do the same by computing a MST where the cost of an edge $$$(u, v)$$$ is given by $$$\max(val_u, val_v)$$$. Then what you're interested in is just the maximum value on a path (which can be done online in $$$\log(N)$$$ per query in several ways. I thought it's worth mentioning because this is a less known (though not difficult to prove, especially by your construction) property of MST

4fecta

+28

Hi there, thanks for sharing this trick. In addition to this method and the $$$w = \min (val_u, val_v)$$$ method, I would also like to mention the connections between your DSU and Reachability Tree/Kruskal Reconstruction Tree. In fact, you are building the vertex-based reachability tree in your method as you solve the queries! It is well known that the maximum value on a path from $$$u$$$ to $$$v$$$ in the reachability tree is simply $$$\text{LCA}(u, v)$$$, so we can also solve the problem not by merging the queries but by simply outputting $$$\text{LCA}(u, v)$$$ for each query after the tree is constructed. This speeds the solution up to $$$\mathcal{O}((n+q)\log n)$$$ with binary lifting, for example.

kozliklekarsky

Funnily, in his own contest this trick was employed to solve this problem

Bungmint

Another similar question CSAcademy Mountain Time

Carmel_Ab1

Yet another ~~tree~~ related problem

Olympia

orz

mmaxio

This code is pretty weird tbh.

You use map instead of set.
I don't get why we need sets in the first place, during "query merging" you can just check if the query endpoints were in the different components before and will be in the same component after(yes, that means the other "clone" of this query becomes a dead weight). Doing it this way also removes a log from the complexity.

Best trick ever!

SlavicG's blog

Introduction

Problem statement

Notes

Prerequisites

Solution

Some code

Variations