
Anthony_Smith's blog

By Anthony_Smith, history, 2 years ago, In English

UPDATE: I actually managed to hack the Rust code with this test case. It is the same as CSES test case #27, but with the order of the first two edges switched. The code fails if it sees the 1 2 -100 edge after the 1 3 1 edge: it traverses edges in reverse order of how they appear in the input, so it traverses the 1 2 -100 edge first, and apparently that is enough to make it output 3 instead of the correct answer, -1. According to the CSES guidelines, all submissions will be regraded. I wonder how this will turn out!

Problem Summary: Given a directed and weighted graph, find the longest path from node 1 to node N, given that edges can be traversed multiple times. If there is an arbitrarily long path, then print -1. The problem link is here.

My method to solve this problem came straight from the CSES Book: use the Bellman-Ford algorithm to determine which nodes (if any) are on a positive cycle. If any positive cycle is reachable from node 1 and can itself reach node N, then return -1. Otherwise, just return the longest distance from node 1 to node N. Another method is to topologically sort the graph and find the longest distance from node 1 to node N that way.
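Roughly, that Bellman-Ford approach looks like the sketch below (my own illustrative code, not the book's): relax with max for n-1 rounds, then keep relaxing for another n rounds and flag every node that still improves, since such improvements can only come from a positive cycle reachable from node 1; the flag is propagated forward, so the answer is -1 exactly when node N ends up flagged.

```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    int n, m;
    cin >> n >> m;
    vector<array<long long, 3>> edges(m);
    for (auto &[a, b, w] : edges) cin >> a >> b >> w;

    const long long UNSEEN = LLONG_MIN;      // not yet reachable from node 1
    vector<long long> dist(n + 1, UNSEEN);
    vector<bool> boosted(n + 1, false);      // distance can grow without bound
    dist[1] = 0;

    // Rounds 0..n-2 settle every simple path; the later rounds detect nodes
    // that still improve (only possible via a positive cycle) and propagate
    // that flag to everything reachable from such a cycle.
    for (int round = 0; round < 2 * n; round++) {
        for (auto &[a, b, w] : edges) {
            if (dist[a] == UNSEEN) continue;
            if (boosted[a] || dist[a] + w > dist[b]) {
                dist[b] = max(dist[b], dist[a] + w);
                if (round >= n - 1) boosted[b] = true;
            }
        }
    }

    cout << (boosted[n] ? -1LL : dist[n]) << '\n';
}
```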

However, the top-performing code for this problem comes from Rust, and seems to use a completely different method. Here is the code. My main language is C++, so I can only partially interpret the code. The program seems to keep extra variables first_edge and first_reverse_edge to access edges, instead of using an adjacency list. The function mark_useful_edges() seems to mark all edges that can reach node N as useful. But I have no clue how the search() function works.

Can someone interpret this code, or reveal the algorithm or logic the code-writer used?

Note: I understand that caring only about raw performance is a narrow-minded way of learning algorithms. I am interested in this solution because it seems to be a simple approach I have never seen before, and it still solves the problem quickly. If this code involves a new technique, I want to learn it.


»
2 years ago, # |

Before I say anything, I want to thank you for one of the best-written questions on Codeforces.

The method used in that solution is called (at least by me) an "adjacency edge list", and I learned about it from a blog that called it the same thing. It isn't anything more special than a normal adjacency list: you can think of it as storing the edges out of each vertex in a linked list. I'll find the blog and link it here. Found it!

I've gotten the fastest solutions on some problems with this method, and I've seen it used on other judges as well (I believe https://judge.yosupo.jp). Found a submission of mine! I encourage people reading this to try to beat my time on that problem; it seems quite unoptimized.

I'll show the general idea here with a DFS floodfill implementation.

Code
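In outline it looks like the sketch below (the array names head, nxt, and to are just what I happen to use; any names work): add_edge links each new edge to the previous edge out of the same node, and the DFS walks that chain until it hits -1.

```cpp
#include <cstdio>
#include <cstring>

const int N = 200005, M = 200005;   // adjust to the problem's limits

int head[N];               // head[v] = last edge added out of v, or -1 if none
int nxt[M], to[M];         // nxt[e] = the edge added out of the same node before e
int edge_cnt;
bool vis[N];

void add_edge(int a, int b) {
    to[edge_cnt] = b;
    nxt[edge_cnt] = head[a];    // link to the previous edge out of a
    head[a] = edge_cnt++;       // edges end up traversed in reverse input order
}

void dfs(int v) {
    vis[v] = true;
    for (int j = head[v]; ~j; j = nxt[j])   // ~j == 0 exactly when j == -1
        if (!vis[to[j]])
            dfs(to[j]);
}

int main() {
    memset(head, -1, sizeof head);  // every adjacency list starts out empty
    int n, m;
    scanf("%d %d", &n, &m);
    for (int i = 0; i < m; i++) {
        int a, b;
        scanf("%d %d", &a, &b);
        add_edge(a, b);
    }
    dfs(1);
    // vis[v] now tells whether v is reachable from node 1
    return 0;
}
```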

The key idea is that, for each edge, you store an index pointing to the previous edge you added for the same node. That index starts out as -1, which is why we memset it at the beginning, and why the stopping condition in the loop uses ~j (you can also write j != -1, but I like how it makes the code more mysterious :)

How do you use :) at the end of a parenthesized statement?

Note: It's also worth mentioning that this idea can be used to speed up any situation where you use vector<T>[N] and keep adding elements to the vectors, as long as you don't need random-access references to the elements. Just replace the to array with data storing whatever you want. For example, when I use this on weighted graphs, I often add an extra array cost.
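A weighted version only needs one extra parallel array next to to (again, the names are just illustrative):

```cpp
const int N = 200005, M = 200005;

int head[N], nxt[M], to[M], edge_cnt;   // same arrays as before; head starts filled with -1
long long cost[M];                      // extra per-edge payload, here the weight

void add_edge(int a, int b, long long w) {
    to[edge_cnt] = b;
    cost[edge_cnt] = w;          // any extra data just rides along in a parallel array
    nxt[edge_cnt] = head[a];
    head[a] = edge_cnt++;
}

// Visiting the weighted edges out of v:
// for (int j = head[v]; ~j; j = nxt[j]) { /* use to[j] and cost[j] */ }
```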

  • »
    »
    2 years ago, # ^ |

Whether it's faster or not is doubtful. For example, if you use it in min cost max flow problems, I think it'll end up being slower because of cache misses.