heuristica's blog

By heuristica, history, 4 years ago, In English

(It's a translated version of my blog at https://peehs-moorhsum.blog.uoj.ac/blog/6384. )

Finding a Hamiltonian path in a directed graph is a well-known NP problem.

However, about a year ago, I came up with the following heuristic algorithm which has GREAT performance on random graphs(by first generating a hamiltonian path, adding random edges, then randomly permuting indices) and many CP problems.

Here is the idea of the algorithm:

We maintain a subset $$$S$$$ of edges, ensuring there's no cycle and each vertex has in-degree and out-degree at most 1. Whenever $$$|S|$$$ reaches $$$n-1$$$, we find a hamiltonian path.

We start with $$$S$$$ empty. Each time we pick a random edge $$$e \notin S$$$. If we can add $$$e$$$ to $$$S$$$ without violating any rule, we simply add it to $$$S$$$. Otherwise, if $$$e$$$ won't create a cycle and violate with exactly one edge e', we replace e' with $$$e$$$ with 50% probability. Repeat until $$$|S|$$$ reaches $$$n-1$$$.

We can use LCT to check cycles. Here's my code, which takes less than 1 second on most random graphs with $$$|V|=100000, |E|=500000$$$:

namespace hamil {
    template <typename T> bool chkmax(T &x,T y){return x<y?x=y,true:false;}
    template <typename T> bool chkmin(T &x,T y){return x>y?x=y,true:false;}
    #define vi vector<int>
    #define pb push_back
    #define mp make_pair
    #define pi pair<int, int>
    #define fi first
    #define se second
    #define ll long long
    namespace LCT {
        vector<vi> ch;
        vi fa, rev;
        void init(int n) {
            ch.resize(n + 1);
            fa.resize(n + 1);
            rev.resize(n + 1);
            for (int i = 0; i <= n; i++)
                ch[i].resize(2), 
                ch[i][0] = ch[i][1] = fa[i] = rev[i] = 0;
        }
        bool isr(int a)
        {
            return !(ch[fa[a]][0] == a || ch[fa[a]][1] == a);
        } 
        void pushdown(int a)
        {
            if(rev[a])
            {
                rev[ch[a][0]] ^= 1, rev[ch[a][1]] ^= 1;
                swap(ch[a][0], ch[a][1]);
                rev[a] = 0;
            }
        }
        void push(int a)
        {
            if(!isr(a)) push(fa[a]);
            pushdown(a); 
        }
        void rotate(int a)
        {
            int f = fa[a], gf = fa[f];
            int tp = ch[f][1] == a;
            int son = ch[a][tp ^ 1];
            if(!isr(f)) 
                ch[gf][ch[gf][1] == f] = a;    
            fa[a] = gf;

            ch[f][tp] = son;
            if(son) fa[son] = f;

            ch[a][tp ^ 1] = f, fa[f] = a;
        }
        void splay(int a)
        {
            push(a);
            while(!isr(a))
            {
                int f = fa[a], gf = fa[f];
                if(isr(f)) rotate(a);
                else
                {
                    int t1 = ch[gf][1] == f, t2 = ch[f][1] == a;
                    if(t1 == t2) rotate(f), rotate(a);
                    else rotate(a), rotate(a);    
                } 
            } 
        }
        void access(int a)
        {
            int pr = a;
            splay(a);
            ch[a][1] = 0;
            while(1)
            {
                if(!fa[a]) break; 
                int u = fa[a];
                splay(u);
                ch[u][1] = a;
                a = u;
            }
            splay(pr);
        }
        void makeroot(int a)
        {
            access(a);
            rev[a] ^= 1;
        }
        void link(int a, int b)
        {
            makeroot(a);
            fa[a] = b;
        }
        void cut(int a, int b)
        {
            makeroot(a);
            access(b);
            fa[a] = 0, ch[b][0] = 0;
        }
        int fdr(int a)
        {
            access(a);
            while(1)
            {
                pushdown(a);
                if (ch[a][0]) a = ch[a][0];
                else {
                    splay(a);
                    return a;
                }
            }
        }
    }
    vi out, in;
    vi work(int n, vector<pi> eg, ll mx_ch = -1) { 
        // mx_ch : max number of adding/replacing  default is (n + 100) * (n + 50) 
        // n : number of vertices. 1-indexed. 
        // eg: vector<pair<int, int> > storing all the edges. 
        // return a vector<int> consists of all indices of vertices on the path. return empty list if failed to find one.  
        out.resize(n + 1), in.resize(n + 1);
        LCT::init(n);
        for (int i = 0; i <= n; i++) in[i] = out[i] = 0;
        if (mx_ch == -1) mx_ch = 1ll * (n + 100) * (n + 50); //default
        vector<vi> from(n + 1), to(n + 1);
        for (auto v : eg)
            from[v.fi].pb(v.se), 
            to[v.se].pb(v.fi);
        unordered_set<int> canin, canout;
        for (int i = 1; i <= n; i++)
            canin.insert(i), 
            canout.insert(i); 
        mt19937 x(chrono::steady_clock::now().time_since_epoch().count());
        int tot = 0;
        while (mx_ch >= 0) {
        //    cout << tot << ' ' << mx_ch << endl;
            vector<pi> eg;
            for (auto v : canout)
                for (auto s : from[v])
                    if (in[s] == 0) {
                        assert(canin.count(s));
                        continue;
                    }
                    else eg.pb(mp(v, s));
            for (auto v : canin)
                for (auto s : to[v])
                    eg.pb(mp(s, v));
            shuffle(eg.begin(), eg.end(), x);
            if (eg.size() == 0) break;
            for (auto v : eg) {
                mx_ch--;
                if (in[v.se] && out[v.fi]) continue;
                if (LCT::fdr(v.fi) == LCT::fdr(v.se)) continue;
                if (in[v.se] || out[v.fi]) 
                    if (x() & 1) continue;
                if (!in[v.se] && !out[v.fi]) 
                    tot++;
                if (in[v.se]) {
                    LCT::cut(in[v.se], v.se);
                    canin.insert(v.se);
                    canout.insert(in[v.se]);
                    out[in[v.se]] = 0;
                    in[v.se] = 0;
                }
                if (out[v.fi]) {
                    LCT::cut(v.fi, out[v.fi]);
                    canin.insert(out[v.fi]);
                    canout.insert(v.fi);
                    in[out[v.fi]] = 0;
                    out[v.fi] = 0;
                }
                LCT::link(v.fi, v.se);
                canin.erase(v.se);
                canout.erase(v.fi);
                in[v.se] = v.fi;
                out[v.fi] = v.se;
            }
            if (tot == n - 1) {
                vi cur;
                for (int i = 1; i <= n; i++) 
                    if (!in[i]) {
                        int pl = i;
                        while (pl) {
                            cur.pb(pl), 
                            pl = out[pl];
                        }
                        break;
                    } 
                return cur;
            }
        }
        //failed to find a path
        return vi();
    }
}

Note that if the start vertex $$$S$$$ and/or end vertex $$$T$$$ of the path is given, we simply create two new vertices $$$A, B$$$ and add edges from $$$A$$$ to $$$S$$$, from $$$T$$$ to $$$B$$$. Any Hamiltonian path in the new graph automatically has $$$A\to S$$$ as the first edge and $$$T\to B$$$ as the last edge. If we want to find a Hamiltonian cycle, we simply enumerate an edge, then convert the problem to finding a path given start/end vertices. For bidirectional case, we can simply add each edge twice.

  • Vote: I like it
  • +455
  • Vote: I do not like it

»
4 years ago, # |
  Vote: I like it +4 Vote: I do not like it

Auto comment: topic has been updated by heuristica (previous revision, new revision, compare).

»
4 years ago, # |
  Vote: I like it +11 Vote: I do not like it

I assume you meant $$$|V|=10^4$$$? With $$$|V|=10^5$$$ and $$$|E|=5\cdot 10^5$$$ directed edges the expected number of vertices with out-degree 0 is quite large ... ($$$\approx |V|\cdot e^{-5}\approx 674$$$)

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +23 Vote: I do not like it

    He did say "by first generating a hamiltonian path, adding random edges, then randomly permuting indices" which I guess means he ensures there are no such vertices (and I guess there are ~674 vertices with degree 1 then?)

    • »
      »
      »
      4 years ago, # ^ |
      Rev. 3   Vote: I like it +14 Vote: I do not like it

      aha, can't read

      Well assuming $$$4\cdot 10^5$$$ edges in addition to the hamiltonian path then that's $$$|V|\cdot e^{-4}\approx 1832$$$ with out-degree 1.

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it +8 Vote: I do not like it

        Yeah, that's true. I also tried many other random graphs, and it turned out that the performance is approximate $$$O(|E|\log|V|)$$$, not depends on $$$|E|/|V|$$$ very much (but it sometimes stuck in cases $$$|E|~3|V|$$$). I guess there're many paths in random graphs, so random guesses work most of the time?

»
4 years ago, # |
  Vote: I like it +8 Vote: I do not like it

Very cool approach!

Instead of time(0) it's better to use chrono::steady_clock::now().time_since_epoch().count() inside mt19937, because we can call function work() several times for better probabilities. If we call this function with the same arguments several times per second, time(0) returns the same value, which is bad.

Or we can move mt19937(time(0)) out of work()'s body.

»
4 years ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

You can test any input with the code above! Just generate graphs (store edges as vector<pair<int, int>>, where (u, v) denotes an edge from $$$u$$$ to $$$v$$$), and put them into hamil::work. Initialization is done automatically, so you can test graphs in series.

»
4 years ago, # |
Rev. 3   Vote: I like it +24 Vote: I do not like it

This idea is also useful in some problems which are related to Hamiltonian path! In JOISC 2021 Day1 T1 Aerobatics, applying this idea will get >80 points easily.

»
4 years ago, # |
  Vote: I like it +20 Vote: I do not like it

I tried to apply this idea here and looks like it doesn't perform great for the bidirectional case, where not so many random edges added, so it won't work for very simple undirected graphs. For example, it fails to find the path for 5000 vertices and undirected edges $$$1-2-3-\ldots-N$$$ in 2 seconds.

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +40 Vote: I do not like it

    I see. Seems like the algorithm works bad when there are lots of directional paths but few Hamiltonian paths(which is often true when bidirectional).

    (It also depends on seed a lot. For 1-2-3-...-N with N = 5000, it runs rather fast with seed = 377921677545000 (the 1 out of 100 seeds that success at mx_ch = 1e6 lol))

    But bidirectional case should be easier, as we can only restrict vertices to have $$$deg \le 2$$$ rather than $$$out \le 1$$$ and $$$in \le 1$$$. I'll write another code for bidirectional later. :)