Better solution to hard graph problem?

#	User	Rating
1	tourist	3856
2	jiangly	3747
3	orzdevinwang	3706
4	jqdai0815	3682
5	ksun48	3591
6	gamegame	3477
7	Benq	3468
8	Radewoosh	3462
9	ecnerwala	3451
10	heuristica	3431

#	User	Contrib.
1	cry	167
2	-is-this-fft-	162
3	Dominater069	160
4	Um_nik	158
5	atcoder_official	156
6	Qingyu	153
7	djm03178	152
7	adamant	152
9	luogu_official	150
10	awoo	147

I recently solved the problem Playlist on Kattis: https://open.kattis.com/problems/playlist. In brief the statement is:

Given a directed graph with $$$n \leq 100$$$ nodes and the outdegree of each node is less than 40. All the vertices are colored. Find a path of length 9 in the graph such that no pair of nodes in the path has the same color, and output this path. If there're multiple solutions, output any, output "fail" if there is no such path. The time limit is very big (12 seconds).

My solution involved meet-in-the-middle:

Split the path into a middle node and two paths of length 4 connected to this middle node.

Now we first calculate for each node all the subsets of 4 colors such that there's exist a path of these four colors going into this node and going out of this node. This can be done by a DFS with extra state. The state is of the form: (current node, subset of colours we visited until know). A state can be extended by putting the colour of the current node in the subset and going to a neighbour of a color that isn't in our subset. With hashtables I assure that no path is calculated twice (like normal DFS). We do this for the given graph, and the reverse graph. A simple bound of the number of states is $$$\binom{100}{4} \cdot 100 \approx 4 \cdot 10^8$$$, and for each state we loop through all the out edges. So the number of operations of this step is about $$$40 \cdot |\mathrm{States}|$$$.

In the combining step we loop through all candidate middle nodes. Now we have two collections of subsets (one for the ingoing paths, one for the outgoing). We want to find two sets in both collections that are disjoint. I used Inclusion-Exclusion to count for a given set in the first collection, how many sets in the second collection have at least one element in common, and with this we can detect disjoint sets. This adds $$$2^4 \cdot |\mathrm{States}|$$$ to the runtime.

Because of all the hash tables and the huge number of states, my solution ran in 11 seconds.

Here's my code: https://ideone.com/JEZoNF

Is there a simpler or faster solution? I couldn't find an editorial from the contest (Spotify Challenge 2011).

Comments (4)

Write comment?

jeroenodb

4 years ago, # |

+11

Auto comment: topic has been updated by jeroenodb (previous revision, new revision, compare).

→ Reply

Errichto

$$${100} \choose {4}$$$ fits in space so it might be possible to use an array instead of a map. You can't have hundred such arrays but a single array should be enough when we consider one vertex as the middle one.

4 years ago, # ^ |

Although we could only consider one vertex as the middle one at once, and do a DFS for that, I don't think we can shrink the number of states of the DFS to $$$\binom{100}{4}$$$. Because the transitions need also to know which node we came from (A bit like hamiltonian path DP).

First I also stored parent pointers of the DFS for the reconstruction, but they're not needed. Therefore I don't need maps, but sets. This could be done with 100 bitsets of $$$\binom{100}{4}$$$, but it seems hard to give a unique id to each subset that can be computed quickly. Now I use bloomfilters, and with a few other optimizations it now gets accepted in 6 seconds.

When you are at any node in DFS, store up to three previous colors. Don't include the color of the final node. That's $$$100 \cdot {{100} \choose 3}$$$ states, or let's just make it $$$100^4$$$. You can compress further when considering the very last node.

You can do the whole thing with for-loops too, no DFS needed.

it seems hard to give a unique id to each subset that can be computed quickly

That's a problem of counting lexicographically smaller sorted sequences of fixed length. It's indeed non-trivial $$$O(4)$$$ with precomputed binomial coefficients and prefix sums. It's be easier and faster to precompute an array int id[a][b][c] so you could say that hash of $$$(a, b, c, d)$$$ is $$$id[a][b][c]+d$$$. So, $$$id[a][b][c]$$$ is lexicographical id of sequence $$$(a, b, c, 0)$$$, minus $$$c$$$.

jeroenodb's blog