DAG to SCC - Codeforces

→ Pay attention

Before contest
Codeforces Round (Div. 3)
5 days

→ Streams

Greedy Algorithms — Topic Stream

By Shayan

Before stream 19:13:06

View all →

→ Top rated

#	User	Rating
1	tourist	3856
2	jiangly	3747
3	orzdevinwang	3706
4	jqdai0815	3682
5	ksun48	3591
6	gamegame	3477
7	Benq	3468
8	Radewoosh	3462
9	ecnerwala	3451
10	heuristica	3431

Countries | Cities | Organizations

→ Top contributors

#	User	Contrib.
1	cry	167
2	-is-this-fft-	162
3	Dominater069	160
4	Um_nik	158
5	atcoder_official	156
6	adamant	152
6	djm03178	152
8	Qingyu	151
9	luogu_official	149
10	awoo	147

View all →

→ Find user

→ Recent actions

Detailed →

faresbasbas's blog

DAG to SCC

By faresbasbas, history, 5 years ago, In English

In English

Recently I've read about minimum number of needed edges to construct SCC from a DAG. And it was the MAX(x,y) where x is the number vertices with 0 in-degree and y is the number of vertices with 0 out-degree. But i am wondering how to find such combination of edges in O(N) time. Thanks in advance :)

edit : solved. thank you very much for all people who helped me :)

Tags

#graph theory, graph algorithm, bipartite matching, matchings

+55

faresbasbas
5 years ago
36

Comments

Comments (36)

Write comment?

»

5 years ago, # |

← Rev. 3 →

Vote: I like it

+6

Vote: I do not like it

Let $$$s_1, s_2, ..., s_n$$$ will be $$$0$$$ in-degree vertices and $$$e_1, e_2, ..., e_m$$$ vertices with $$$0$$$ out-degree. Assume that $$$m \geq n$$$ (if it's wrong we can, for example, change direction of all edges).

Notice, that out-degree of every $$$e_i$$$ vertex should be at least one after adding edges. But we can add only $$$m$$$ edges. So, every edge should begins in one of $$$e$$$ vertices and out-degree of every vertex from $$$e$$$ will be exactly $$$1$$$. Let's build these edges.

Run dfs from $$$s_1$$$ and find all $$$0$$$ out-degree vertices, which achievable from $$$s_1$$$. Let's call these vertices $$$c_1, c_2, ..., c_k$$$. Now add edges $$$(c_1, c_2)$$$, $$$(c_2, c_3), ..., (c_{k-1}, c_{k})$$$ and $$$(c_k, s_2)$$$. Then repeat this process with other vertices from $$$s$$$. On every step we shouldn't touch vertices from $$$e$$$, which were visited on previous steps. And don't forget about edge $$$(c_{nk_n}, s_1)$$$ in the end.

Total complexity is $$$O(N+M)$$$.

→ Reply

»

»

5 years ago, # ^ |

← Rev. 5 →

Vote: I like it

+10

Vote: I do not like it

Ok it looks very clear for me thank you very much except this part (cnkn,s1) which is the last edge, but how can we deal with it if we had multiple DAGs in the same graph?

→ Reply

»

»

»

5 years ago, # ^ |

← Rev. 2 →

Vote: I like it

+16

Vote: I do not like it

It also works with multiple DAGs in the same graph. After adding edges there will be cycle $$$(s_1, c_{11}), (c_{11}, c_{12}), ... (c_{1k_{1}-1}, c_{1k_1}), (c_{1k_1}, s_2), (s_2, c_{21}), (c_{21}, c_{22}), ...,$$$ $$$(c_{2k_{2}-1}, c_{2k_2}), (c_{2k_2}, s_3), ..., (c_{(n-1)k_{n-1}}, s_n), ... , (c_{nk_{n}}, s_1)$$$. This cycle covers all $$$0$$$-in degree verteces, from which we can reach any vertex.

→ Reply

»

»

»

»

5 years ago, # ^ |

Vote: I like it

+22

Vote: I do not like it

What's your response for Radewoosh's counter example?

→ Reply

»

»

»

»

»

5 years ago, # ^ |

Vote: I like it

+5

Vote: I do not like it

His counter example works :)

→ Reply

»

»

»

»

»

»

5 years ago, # ^ |

Vote: I like it

+5

Vote: I do not like it

why ? won't you connect 3 -> 4 , 4 -> 1 which is wrong ?

→ Reply

»

»

»

»

»

»

»

5 years ago, # ^ |

Vote: I like it

+13

Vote: I do not like it

There are will be 2 CSS: (2) and (1, 3, 4)

→ Reply

»

»

»

»

»

»

»

»

5 years ago, # ^ |

← Rev. 2 →

Vote: I like it

+5

Vote: I do not like it

each vertex should be in a different SCC in the example 1,3,4 can't reach each other out only 1 can reach 3,4 but 3,4 can't reach anything

→ Reply

»

»

»

»

»

»

»

»

»

5 years ago, # ^ |

Vote: I like it

+13

Vote: I do not like it

Graph after adding 3->4 and 4->1 edges.

→ Reply

»

»

»

»

»

»

»

»

»

5 years ago, # ^ |

Vote: I like it

+5

Vote: I do not like it

i mean it's still not a SCC but i need to be an SCC

→ Reply

»

»

»

»

»

»

»

»

»

5 years ago, # ^ |

Vote: I like it

+5

Vote: I do not like it

the solution of this example should be 4 -> 1 , 3 -> 2 while your idea's logic say the answer is 3 -> 4 , 4 -> 1 which is wrong

→ Reply

»

»

»

»

»

»

»

»

»

5 years ago, # ^ |

Vote: I like it

+10

Vote: I do not like it

I meant that Radewoosh's counter example works against my solution. So, this approach doesn`t work.

→ Reply

»

»

»

»

»

»

»

»

»

5 years ago, # ^ |

Vote: I like it

0

Vote: I do not like it

ohh ok i misunderstoond you sorry :)

but i appreciate you helping me :)

thank you very much

→ Reply

»

»

5 years ago, # ^ |

Vote: I like it

+34

Vote: I do not like it

And what do you propose we do if all vertices reachable from $$$s_i$$$ were previously visited?

→ Reply

»

»

5 years ago, # ^ |

← Rev. 3 →

Vote: I like it

+53

Vote: I do not like it

n=4, m=4, edges:

→ Reply

»

»

»

5 years ago, # ^ |

← Rev. 2 →

Vote: I like it

0

Vote: I do not like it

Also thank you for the edge case lol

→ Reply

»

5 years ago, # |

Vote: I like it

+3

Vote: I do not like it

Any other ideas of how to make it work :)?

→ Reply

»

»

5 years ago, # ^ |

Vote: I like it

+13

Vote: I do not like it

I have a solution which I think should work. Let
Number of nodes with 0 in-degree = x
Number of nodes with 0 out-degree = y
I am considering y $$$\geq$$$ x. If y $$$<$$$ x then we can reverse the edges.

First, colour all the x nodes with colours from 0 to x-1. Now do a dfs from each of these x nodes. While doing dfs from ith node, colour all nodes you visit with the colour of the root node (the node with 0 in-degree from which we are running dfs currently). In dfs, whenever you encounter a node which has already been coloured, simply return as it is guaranteed that all nodes in it's subtree are already coloured. Now our graph will have these properties-

Here,
root node of ith colour is node which has colour i and in-degree 0 and,
leaf nodes of ith colour are nodes which have colour i and out-degree 0.

Each node will be coloured by some colour c.
There will be atleast 1 node of each colour.

Now let us consider all colours c such that there exist atleast 1 node with colour c and out-degree = 0. Let there are p such colours f₁, f₂....f_p. Also, select one leaf node of each of these p colours. Now add edges like this-
leaf(f₁) -> root(f₂), leaf(f₂) -> root(f₃), .... leaf(f_p) -> root(f₁). Total p edges would be added like this.

Now, there are x-p colours left which do not have any leaf. Let these be c₁, c₂ ... c_x-p. We have y-p leaf nodes left. Now out of these y-p leaves, choose any x-p leaves and let those be l₁, l₂....l_x-p. Add edges like this-
l₁ -> root(c₁), l₂ -> root(c₂) ... l_x-p -> root(c_x-p). Total x-p edges will be added like this.

Total edges added till now = p+x-p = x.

We still have y-x leaves left. Add edges from these leaves to root of any colour. So total y edges would be added. It can easily be shown that we can reach from any node of colour f_i to any node of colour f_j, from any node of colour f_i to any node of colour c_j and from any node of colour c_i to any node of colour c_j.

I do feel that it might fail on a case like this-
We have to go from some node of colour c_i to some node of colour f_j. It is possible in my algorithm that from each node of colour c_i, I am only able to reach a leaf node of colour f_j which is connected to a root node of colour c_k and thus I won't be able to reach every node of colour f_k. In this case, I will choose a colour k such that it has a leaf node connected to root(f_i) and at least 1 leaf node connected to root(c_j). I will swap these two leaf nodes and the problem would be solved.

It is a very long solution, I am sorry for any typos in between or any mistake. I think this should solve your problem in linear time. Please correct me if I am wrong somewhere.

→ Reply

»

5 years ago, # |

Vote: I like it

+13

Vote: I do not like it

Hello, this is known as the strong connectivity augmentation problem. You can find a solution described in this paper.

→ Reply

»

»

5 years ago, # ^ |

← Rev. 2 →

Vote: I like it

0

Vote: I do not like it

Actually, I've read this paper more than once but the idea isn't very clear to me. Also I tried to just implement it, And it didn't work, probably there is something I missed, But I am not able to figure it out.

→ Reply

»

5 years ago, # |

← Rev. 2 →

Vote: I like it

+21

Vote: I do not like it

There is a pretty nice and neat algorithm for this.

The algorithm implemented in Python

def DAG_minimum_augmentation_strong_connectivity(graph, indeg, outdeg):
    n = len(graph)
    if n <= 1:
        return []

    found = [0] * n
    def dfs(node):
        if found[node]:
            return
        found[node] = 1
        if outdeg[node] == 0:
            # unmatched sink found
            return node
        for nei in graph[node]:
            ret = dfs(nei)
            if ret is not None:
                # unmatched sink found
                return ret

    matched_sources = []
    matched_sinks = []

    never_matched_sources = []
    never_matched_sinks = []

    for node in range(n):
        if indeg[node] == 0:
            matching_sink = dfs(node)
            if matching_sink is not None:
                matched_sources.append(node)
                matched_sinks.append(matching_sink)
            else:
                never_matched_sources.append(node)
    
    for node in range(n):
        if outdeg[node] == 0 and not found[node]:
            never_matched_sinks.append(node)
    
    sources = matched_sources + never_matched_sources
    sinks = matched_sinks + never_matched_sinks

    P = len(matched_sources)
    N = len(sources)
    M = len(sinks)
    K = min(N, M)
        
    ret =  [(0, P - 1)]
    ret += ((i, i - 1) for i in range(1, P))
    ret += ((i, i)     for i in range(P, K))
    ret += ((0, i)     for i in range(K, N))
    ret += ((i, 0)     for i in range(K, M))
    return [(sinks[i], sources[j]) for i,j in ret]

The basic idea of the algorithm is to greedily try to match sources with sinks using dfs. Then when no more matches can be found, the sources/sinks that never found a match must be able to reach/be reachable by some node in matched_sinks/matched_sources. You can then strongly connect everything using $$$\text{max}(\text{size}(\text{sources}), \text{size}(\text{sinks}))$$$ edges.

Properties of the greedy matching

The process of greedily matching source and sinks has the following properties:

matched_sources[i] can reach matched_sinks[i]
Every source in never_matched_sources can reach at least one sink in matched_sinks.
Every sink in never_matched_sinks can be reached by at least one source in matched_sources.

How to add the edges

To make the DAG strongly connected.

Add edges to create one big SCC out of all of the matched sources and sinks. (Note that from the properties of the greedy matching, all of the unmatched sources can now reach this SCC, and the unmatched sinks can all be reached from this SCC.)
Start adding edges to strongly connect pairs of unmatched sources and sinks. If there are more unmatched sources than sinks, or vice versa, then just strongly connect the excess to any node in the SCC.

→ Reply

»

»

5 years ago, # ^ |

Vote: I like it

0

Vote: I do not like it

so as i did understand

firstly:

for any source i will try to connect it to exactly one sink which it can reach if not i will add it to never_matched_sources

secondly:

i will find all of the unmatched sinks and put them in never_matched_sinks

thirdly:

strongly connect the never_matched_sinks to the never_matched_sources

fourthly:

i have to strongly connect the rest

is it this ?

→ Reply

»

»

»

5 years ago, # ^ |

Vote: I like it

+8

Vote: I do not like it

"firstly: for any source i will try to connect it to exactly one sink which it can reach if not i will add it to never_matched_sources"

Note that the dfs never visits the same node twice. So the formulation should be "it can reach using unvisited nodes"

"thirdly: strongly connect the never_matched_sinks to the never_matched_sources"

I start by strongly connecting all of the matched sources and sinks into one big SCC. Think of it like making one big cycle.

    ret =  [(0, P - 1)]
    ret += ((i, i - 1) for i in range(1, P))

Once I've made the big SCC, I then strongly connect the unmatched sources and sinks to it.

    ret += ((i, i) for i in range(P, K))
    ret += ((0, i) for i in range(K, N))
    ret += ((i, 0) for i in range(K, M))

→ Reply

»

»

»

»

5 years ago, # ^ |

Vote: I like it

0

Vote: I do not like it

Yeah it's very clear to me i did implement it and it looks to work for single component graph, but can you solution be generalized for several DAGs in the graph ?

→ Reply

»

»

»

»

»

5 years ago, # ^ |

Vote: I like it

+10

Vote: I do not like it

Still not sure what you are talking about. What do you even mean by single component graph when it comes to directed graphs? The algorithm works for any DAG.

→ Reply

»

»

»

»

»

»

5 years ago, # ^ |

Vote: I like it

0

Vote: I do not like it

take this for example

n = 4 , m = 2

1 2

2 3

you will connect 3 -> 1 then you will have 4 as a sink and source at the same time and you can add only 1 more edge what would you do in this case ?

→ Reply

»

»

»

»

»

»

»

5 years ago, # ^ |

← Rev. 3 →

Vote: I like it

+5

Vote: I do not like it

For that graph matched_sources will become [1,4] and matched_sinks will become [3,4] (note this means (source) 1 is matched with (sink) 3, and (source) 4 is matched with (sink) 4, so 4 is matched with itself). There are no unmatched sources or sinks.

To create one big SCC out of the matched sources and sinks, I run this code

    ret =  [(0, P - 1)]
    ret += ((i, i - 1) for i in range(1, P))

which connects 3->4 and 4->1. It does not connect 3->1. Think of this process as making one big cycle.

→ Reply

»

»

»

»

»

»

»

»

5 years ago, # ^ |

Vote: I like it

0

Vote: I do not like it

ohh ok

→ Reply

»

»

»

»

»

»

»

»

5 years ago, # ^ |

Vote: I like it

+5

Vote: I do not like it

is it possible to explain the last 4 loops cause i am not used to python :)

→ Reply

»

»

»

»

»

»

»

»

»

5 years ago, # ^ |

Vote: I like it

+5

Vote: I do not like it

ret =  [(0, P - 1)]
ret += ((i, i - 1) for i in range(1, P))

this is the same as

ret =  [(0, P - 1)]
for i in range(1, P):
  ret.append((i, i - 1))

which in C++ would look something like

vector<pair<int,int>> ret{{0, P - 1}};
for (int i = 1; i < P; ++i)
  ret.push_back({i, i - 1});

→ Reply

»

»

»

»

»

»

»

»

»

5 years ago, # ^ |

Vote: I like it

+5

Vote: I do not like it

ohh yeah i just got it all right now thank you very much :))

→ Reply

»

»

»

»

»

»

5 years ago, # ^ |

Vote: I like it

+5

Vote: I do not like it

is my point clear to you ?

→ Reply

»

»

5 years ago, # ^ |

Vote: I like it

+5

Vote: I do not like it

Also how would you deal with it if the graph is made of multiple DAGs ?

→ Reply

»

»

»

5 years ago, # ^ |

Vote: I like it

+8

Vote: I do not like it

The definition of DAG does not require the nodes to be connected. For example a graph with no edges is technically a DAG. So multiple DAGs is still a DAG.

→ Reply

»

5 years ago, # |

Vote: I like it

0

Vote: I do not like it

Auto comment: topic has been updated by faresbasbas (previous revision, new revision, compare).

→ Reply

»

7 days ago, # |

← Rev. 3 →

Vote: I like it

0

Vote: I do not like it

[Sorry for the necroposting, but I found this method quite useful]

As the author said, to make a directed graph strongly connected with the minimum number of edges, we need to add

x = max(source SCCs, sink SCCs) number of edges.

A source SCC has no incoming edges from other SCCs, and a sink SCC has no outgoing edges. Using Kosaraju’s or Tarjan’s algorithm, we find all SCCs and classify them into:

S = {s1, s2, ..., sk} (source SCCs)
T = {t1, t2, ..., tm} (sink SCCs)

Now we run a loop from 0 to x in cyclic manner, and joining the nodes

u = sinks[i % sinks.size()] and v = sources[(i + 1) % sources.size()]

We connect the i-th sink component to the (i+1)-th source component, ensuring a directed path between SCCs. Since direct connections within the same SCC could create self-loops, shifting the index by +1 prevents this issue.

Thus, it can be done in O(n) time.

Why Self Loops?

Recommended Problem to apply the above idea: https://codeforces.me/contest/22/problem/E

Implementation: https://codeforces.me/contest/22/submission/305805098

→ Reply