egor.okhterov's blog

By egor.okhterov, history, 8 years ago, In English

We have a tree with n (2 ≤ n ≤ 1000) nodes and we need to find k (1 ≤ k ≤ 100 and k ≤ n - 1) vertices such that the path going through all of these vertices has the maximum length.

Note:

  • The path should start at vertex 1 and finish also at vertex 1.
  • The algorithm has to return just the length of the maximum path (the path itself is not needed).
  • The weight for an edge has following restrictions: 0 ≤ w ≤ 100000.

Let's say we have the following tree with n = 5 vertices:

We need to find k = 3 vertices which will give us the longest path.

For this tree the maximal path is the following:
15241

It has the total length of 13 + 18 + 10 + 5 = 46.
So, for that particular tree we have to print 46 as our result.


I came up with a following greedy/dp-like solution. First we solve the problem for k = 1 and we remember this solution in a linked list 1v1. After that we try to solve the problem for k = 2 by trying all of the remaining n - 2 vertices and insert it in the current path: 1uv1 and 1vu1. After going through all of the vertices we choose the one which gave us the best result. Then we proceed to solve k = 3.

The problem is that it looks like this solution is not correct, because it fails the tests for this problem. I cannot prove that my solution is correct and I cannot disprove it. All I managed to do is to generate millions of different random trees and in all of these cases my clever solution matched bruteforce solution.

For now all my effort is directed towards generating a counter example in which my method will fail, but if it is correct I'd be glad to see the reasoning why is it correct.

  • Vote: I like it
  • +17
  • Vote: I do not like it

»
8 years ago, # |
Rev. 3   Vote: I like it 0 Vote: I do not like it

Are all the k vertices required to be distinct?

  • »
    »
    8 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    You are correct. We have to choose k distinct vertices, but we are allowed to cross a vertex more than once while walking along the path.

    • »
      »
      »
      8 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      In your solution, are you maintaining distinct vertices somehow?

      • »
        »
        »
        »
        8 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        Yes, I am maintaining the largest path that I currently have in the array:

        int next[1001] = {}; // linked list
        

        And I maintain the vertices that are already in the path in the array:

        bool used[1001] = {};
        
        • »
          »
          »
          »
          »
          8 years ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          Your method is greedy.

          Suppose k=5, and you find the longest path 1->2->3->4 at k=4, now you can't choose these vertices again as targets. What if 1->2->5->3->4 is longer than 1->2->3->4->5 but 1->2->5 is shorter than 1->2->3 ?

          • »
            »
            »
            »
            »
            »
            8 years ago, # ^ |
              Vote: I like it 0 Vote: I do not like it

            What if 1->2->5->3->4 is longer than 1->2->3->4->5 but 1->2->5 is shorter than 1->2->3 ?

            This is actually the level of my current progress =)
            I wasn't able to construct a real tree which fails this solution.

            • »
              »
              »
              »
              »
              »
              »
              8 years ago, # ^ |
              Rev. 3   Vote: I like it 0 Vote: I do not like it

              You are picking the longest path, and then back to 1, then the second longest and back to 1, third longest and so on, keeping in mind that you don't re-target a vertex. But you don't need to check 'back to 1' every time. Just reaching the target might be sufficient.

              Maybe a simple fix like terminating greedy at k-1 iterations, and then checking kth iteration + path to 1 for every un-targeted vertex will work. But I'm not sure. I haven't proved it yet.

              • »
                »
                »
                »
                »
                »
                »
                »
                8 years ago, # ^ |
                  Vote: I like it 0 Vote: I do not like it

                Maybe a simple fix like terminating greedy at k-1 iterations, and then checking kth iteration + path to 1 for every un-targeted vertex will work.

                Is my solution wrong? If it is, how does the tree look like which fails my solution?

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  8 years ago, # ^ |
                  Rev. 2   Vote: I like it 0 Vote: I do not like it

                  I don't know if I can help you find a test case right now ( I am sleepy ) but I think the order of choosing vertices matter. So if optimal order has 3 before 4, and 1->2->3 > 1->2->4 but 1->2->3->1 < 1->2->4->1, you choose 1->2->4->1, therefore changing the optimal order.

                  I mean, instead of checking 1->v->u->1, check only 1->...->v->u until k-1. At final k, check u->w->1.

                  Not proven. Just an idea. Good luck :)

                  Another idea : graph and k-cycle. Good luck :)

                  And let us know if you get AC :)

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  8 years ago, # ^ |
                    Vote: I like it 0 Vote: I do not like it

                  You should also consider cases where your candidate paths have equal values. As the order of choosing matters, we might make a mistake here.

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  8 years ago, # ^ |
                    Vote: I like it +3 Vote: I do not like it

                  This argument seems right, but a real tree that breaks this algorithm will help tremendously.

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  8 years ago, # ^ |
                  Rev. 4   Vote: I like it 0 Vote: I do not like it

                  Looks like I have one :)

                  1 2 10
                  2 3 20
                  2 5 20
                  1 4 20

                  and k = 3

                  Your algorithm chooses as follows :

                  1->3->1 or 1->5->1 . Let's say it chose 1->3->1.

                  Now it has option 1->3->4->1 = 100 and 1->3->5->1 = 100.

                  Now your algorithm chooses one of these without any other criteria, but we can see that the optimal order is 1->3->4->5->1.

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  8 years ago, # ^ |
                    Vote: I like it 0 Vote: I do not like it

                  My algorithm returns the length of 160 for that tree: http://ideone.com/NO769r.

                  Is that correct?

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  8 years ago, # ^ |
                  Rev. 6   Vote: I like it 0 Vote: I do not like it

                  Lol, yes! But only accidentally.

                  You must've chosen 1->3->4->1 instead of 1->3->5->1, but they have equal probability of picking by your algorithm. :)

                  The point is, your algorithm doesn't handle equal candidates well. I hope you get the point. I can't spend any more time on this :)

                  I added some debug lines in your code your code

                  The output suggests your algorithm doesn't do what you said it does.

                  max_path_length=60
                  best_v=4 best_increase=40 max_path_length=100
                  best_v=5 best_increase=60 max_path_length=160
                  Case #1: 160

                  If in the start, max_path_length=60, then when best_v=4, why is increase=40? Then when best_v=5, increase is 60? These values are not what I expected to see. Btw, the input for code is slightly different. I wanted to show that your code gives wrong answer on slightly changing the above example I gave.

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  8 years ago, # ^ |
                  Rev. 2   Vote: I like it 0 Vote: I do not like it

                  I can't spend any more time on this :)

                  I am spending my time on that, because it might be so that there has to made just a small incremental change and suddenly the algorithm becomes correct.

                  For example, Dijkstra's algorithm is also greedy, but it works because we are traversing all the vertices in the right way. Maybe if we always choose the correct vertex (among all of the vertices that give the path with the same length), the algorithm suddenly becomes correct =)

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  8 years ago, # ^ |
                  Rev. 2   Vote: I like it 0 Vote: I do not like it

                  your algorithm doesn't do what you said it does.

                  It does exactly what I said it does =)

                  the input for code is slightly different.

                  The input is exactly the same as what you have suggested =)

                  Each line describes an edge:
                  parent weight
                  parent weight

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  8 years ago, # ^ |
                  Rev. 2   Vote: I like it 0 Vote: I do not like it

                  No, I changed the input.

                  Swap positions of 4 and 5, like here :

                  1 2 10
                  2 3 20
                  2 4 20
                  1 5 20

                  and k = 3

                  The input format is different than how I presented. So I made some assumptions about the input format. Try this modified example.

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  8 years ago, # ^ |
                    Vote: I like it 0 Vote: I do not like it

                  Try this modified example.

                  The result is 160: http://ideone.com/q2NzDS

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  8 years ago, # ^ |
                  Rev. 2   Vote: I like it 0 Vote: I do not like it

                  Output using your algo should be :

                  max_path_length=60
                  best_v=4 best_increase=40 max_path_length=100
                  best_v=5 best_increase=40 max_path_length=140
                  Case #1: 140

                  According to your algo, answer is 140, which is wrong. According to your code, the answer is 160, which is correct, but your code != your algo.

                  Just try with pen and paper and you will see why this is the output for your described algorithm.

                  Your code is doing something else. Find the bug, and your algo will give wrong answer on this test case.

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  8 years ago, # ^ |
                  Rev. 2   Vote: I like it 0 Vote: I do not like it

                  I can't find a bug: http://ideone.com/uQebyB

                  The algo constructs the paths like that:

                  131  + 60
                  1431  + 40
                  14531  + 60

                  Your code is doing something else.

                  My code does exactly what I have described.

  • »
    »
    8 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Btw, for the given tree, it's 13+12+10+5

    No. The distance between vertex 2 and vertex 5 is 5 + 3 + 10 = 18.

»
8 years ago, # |
Rev. 2   Vote: I like it +14 Vote: I do not like it

solution forgetting about K below :((

wrong
  • »
    »
    8 years ago, # ^ |
      Vote: I like it +11 Vote: I do not like it

    How would you use K in your solution?

    • »
      »
      »
      8 years ago, # ^ |
        Vote: I like it +4 Vote: I do not like it

      hahah that was what I was missing :) sorry for that, I don't really know how to use it :((

  • »
    »
    8 years ago, # ^ |
      Vote: I like it +3 Vote: I do not like it

    Not sure what you meant exactly, but is it this :

    Make a graph with edge from each vertex to every other vertex and find the largest cycle of size k+1 in this graph starting and finishing at 1.

»
8 years ago, # |
  Vote: I like it -18 Vote: I do not like it

Here is a solution:

First, consider the problem where we do not need to return to the first vertex at the end. Then, one can notice that the optimal solution is to, at each step, visit an arbitrary farthest vertex from the current vertex (and mark the current vertex as visited).

The complexity of the solution to this problem is O(N * K)

However, now we can iterate over all possible ending vertices and perform the same algorithm. Total complexity would be O(N^2 * K).

  • »
    »
    8 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Why is everyone downvoting? If there's something wrong with the solution, could someone explain what it is?

    • »
      »
      »
      8 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      I have upvoted your answer, because I appreciate any involvement in solving the problem.
      However, I don't understand your solution. Maybe that's the case for the others to downvote (but I'm not sure).

      You could write the program and see if it'll be accepted on the online judge.

    • »
      »
      »
      8 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      I'm not sure but isn't this the same as solving the TSP greedily?

»
8 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Since for each node of the route we only care about the next node(I will call it matching a node with another one), now we can do this:
Dp[v][i][false,true]=the maximum lenght of a route in v's subtree such that we have i not matched nodes yet and true=the second node of the route is in this subtree/false=otherwise.
O(n*k^2)

  • »
    »
    8 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    How do you incorporate the fact that we have to go back to node 1 finally too ???

  • »
    »
    8 years ago, # ^ |
      Vote: I like it +6 Vote: I do not like it

    Can you explain in more details?

  • »
    »
    8 years ago, # ^ |
      Vote: I like it +5 Vote: I do not like it

    How do you keep track of which nodes you have already visited? You might have to get re-enter a subtree a few times to increase the length.

»
8 years ago, # |
  Vote: I like it +3 Vote: I do not like it

managed to get a very large input where your solution fails (it's smaller than answer shown in udebug) https://puu.sh/uEbNc/f1d8e8f44c.in

  • »
    »
    8 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Great! That is the first progress for the whole time and all what was needed is a little help of the red guy =)

    Did you just generate a random tree or it has some special properties?

    • »
      »
      »
      8 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      It is k=6 (or some near integers) chains attached to root 1.

»
8 years ago, # |
  Vote: I like it +16 Vote: I do not like it

All I managed to do is to generate millions of different random trees and in all of these cases my clever solution matched bruteforce solution.

You likely made some mistake in the generator/script. I have run your solution on hundreds of small tests and it was enough.

1
6 3
1 2
2 2
3 2
3 1
4 3

Your solution finds a path 1-5-2-6-1 with total length 24. The optimal path is 1-6-2-4-1 with total length 26.

(drawn with CS-Academy tool)

  • »
    »
    8 years ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    You likely made some mistake in the generator/script. I have run your solution on hundreds of small tests and it was enough.

    It looks like I had a bad generator.

    Now, looking at this tree it doesn't seem to me that my algorithm can be cured, though I'll try to think about it a little bit more.


    Well, what I notice is that this optimal path 1-6-2-4-1 can be built by looking for the furthest vertex from the current one. Again, I'm not sure that this will work in general =)

    • »
      »
      »
      8 years ago, # ^ |
        Vote: I like it -8 Vote: I do not like it

      Now that's exactly what I said :PPP

    • »
      »
      »
      8 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Well, what I notice is that this optimal path 1-6-2-4-1 can be built by looking for the furthest vertex from the current one. Again, I'm not sure that this will work in general =)

      This approach is easier to break. Consider a graph with edges: 1-2 of length 10, 1-3 of length 1, 3-4 of length 7, 3-5 of length 7. The optimal path is 1-4-2-5-1, while the greedy starts with 1-2-...

      • »
        »
        »
        »
        8 years ago, # ^ |
        Rev. 2   Vote: I like it -27 Vote: I do not like it

        Checkmate =)
        I have no more new ideas.


        Wow, maybe we can run these 2 algorithms and choose the max between them? =)

»
8 years ago, # |
Rev. 2   Vote: I like it +31 Vote: I do not like it

root your tree in node 1

let us say there is an edge between node u and node v , if you choose to visit [ x ] nodes in the subtree of v then there is x nodes there and K — x nodes outside

then you will traverse the edge min(x , K — x) * 2 times if you behave optimally and you can proof that you can always do that

it turned out to be a normal knapsack on a tree

dp[node][visited nodes in its subtree]

call dp[1][K-1]

  • »
    »
    8 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    How to prove that number of traversals are min(x , K — x) * 2? It is easy for a line tree, but I am not so sure how to prove for a general tree(or extend it). Thanks in advance!

    • »
      »
      »
      8 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      It's easy. If x > k-x then you will go outside the subtree k-x times, as after going outside k-x times, you have no more nodes to visit outside the subtree. The remaining x — ( k — x ) nodes are inside the subtree, so you will not go outside the subtree now. Similarly for x < k-x.

  • »
    »
    8 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    The Kth traversal is to the parent of 1(outside 1's subtree), and edge_cost(1,parent of 1) = 0. Is this right?

  • »
    »
    8 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I still don't understand. In a knapsack problem we have a set of things which have 2 attributes: weight and value. What is our set of things and what are theirs attributes?

    • »
      »
      »
      8 years ago, # ^ |
        Vote: I like it +3 Vote: I do not like it

      It's about distributing the nodes on subtrees.

      the value is calculated as i mentioned above

      BTW this is the "only" cool problem in all Arab Region contests :D

  • »
    »
    8 years ago, # ^ |
    Rev. 2   Vote: I like it +3 Vote: I do not like it

    So your time complexity is O(N*K^2)?

    If your answer is yes then I think it's not good enough (there are t<=100 test cases in the problem).

    UP: My mistake. It's OK.

  • »
    »
    8 years ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    1) Why do we call dp[1][k-1] instead of dp[1][k] ???? 2) We traverse an edge min(k,k-x)*2 times ,but say all the k nodes lie in the same subtree ,Then the edge joining will be traversed (0) times ???? Please correct me if I got anything wrong . Thanks in advance..