I recently solved the problem Playlist on Kattis: https://open.kattis.com/problems/playlist. In brief the statement is:
Given a directed graph with $$$n \leq 100$$$ nodes and the outdegree of each node is less than 40. All the vertices are colored. Find a path of length 9 in the graph such that no pair of nodes in the path has the same color, and output this path. If there're multiple solutions, output any, output "fail" if there is no such path. The time limit is very big (12 seconds).
My solution involved meet-in-the-middle:
Split the path into a middle node and two paths of length 4 connected to this middle node.
Now we first calculate for each node all the subsets of 4 colors such that there's exist a path of these four colors going into this node and going out of this node. This can be done by a DFS with extra state. The state is of the form: (current node, subset of colours we visited until know). A state can be extended by putting the colour of the current node in the subset and going to a neighbour of a color that isn't in our subset. With hashtables I assure that no path is calculated twice (like normal DFS). We do this for the given graph, and the reverse graph. A simple bound of the number of states is $$$\binom{100}{4} \cdot 100 \approx 4 \cdot 10^8$$$, and for each state we loop through all the out edges. So the number of operations of this step is about $$$40 \cdot |\mathrm{States}|$$$.
In the combining step we loop through all candidate middle nodes. Now we have two collections of subsets (one for the ingoing paths, one for the outgoing). We want to find two sets in both collections that are disjoint. I used Inclusion-Exclusion to count for a given set in the first collection, how many sets in the second collection have at least one element in common, and with this we can detect disjoint sets. This adds $$$2^4 \cdot |\mathrm{States}|$$$ to the runtime.
Because of all the hash tables and the huge number of states, my solution ran in 11 seconds.
Here's my code: https://ideone.com/JEZoNF
Is there a simpler or faster solution? I couldn't find an editorial from the contest (Spotify Challenge 2011).
Auto comment: topic has been updated by jeroenodb (previous revision, new revision, compare).
$$${100} \choose {4}$$$ fits in space so it might be possible to use an array instead of a map. You can't have hundred such arrays but a single array should be enough when we consider one vertex as the middle one.
Although we could only consider one vertex as the middle one at once, and do a DFS for that, I don't think we can shrink the number of states of the DFS to $$$\binom{100}{4}$$$. Because the transitions need also to know which node we came from (A bit like hamiltonian path DP).
First I also stored parent pointers of the DFS for the reconstruction, but they're not needed. Therefore I don't need maps, but sets. This could be done with 100 bitsets of $$$\binom{100}{4}$$$, but it seems hard to give a unique id to each subset that can be computed quickly. Now I use bloomfilters, and with a few other optimizations it now gets accepted in 6 seconds.
When you are at any node in DFS, store up to three previous colors. Don't include the color of the final node. That's $$$100 \cdot {{100} \choose 3}$$$ states, or let's just make it $$$100^4$$$. You can compress further when considering the very last node.
You can do the whole thing with for-loops too, no DFS needed.
That's a problem of counting lexicographically smaller sorted sequences of fixed length. It's indeed non-trivial $$$O(4)$$$ with precomputed binomial coefficients and prefix sums. It's be easier and faster to precompute an array
int id[a][b][c]
so you could say that hash of $$$(a, b, c, d)$$$ is $$$id[a][b][c]+d$$$. So, $$$id[a][b][c]$$$ is lexicographical id of sequence $$$(a, b, c, 0)$$$, minus $$$c$$$.