never_giveup's blog

By never_giveup, history, 3 years ago, translation, In English

I'm unable to find any resources(with some proofs) on how to find certificate for lambda(a.k.a. Lagrange) dp optimization for non-strict convex functions. Any help is highly appreciated!

  • Vote: I like it
  • +168
  • Vote: I do not like it

»
3 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Is it really possible? I have always thought that this is not solved in the general case.

  • »
    »
    3 years ago, # ^ |
      Vote: I like it +23 Vote: I do not like it

    A lot of different ways exist. I have yet to find a problem where it is impossible to restore the answer. The technique from my comment below can be easily extended to a variety of problems.

»
3 years ago, # |
Rev. 2   Vote: I like it +70 Vote: I do not like it

For simplicity let's assume that the problem is to split an array into $$$k$$$ segments with the minimum cost and $$$f(k)$$$ is the answer function. For Aliens trick to work function should be concave or convex (i.e. $$$f(k+1)-f(k) \leq f(k+2)-f(k+1)$$$ for all $$$k$$$ or $$$f(k+1)-f(k) \geq f(k+2)-f(k+1)$$$ for all $$$k$$$)

When we use the Aliens trick, for the fixed value of $$$\lambda$$$ we should find the minimum value of

Unable to parse markup [type=CF_MATHJAX]

(in case there are several minimums, we need to find the smallest $$$k$$$)

Sometimes we are lucky, and we can find $$$\lambda$$$ where the optimal choice is the required $$$k$$$, and we will always find it if $$$f(k+1) - f(k) \neq f(k) - f(k-1)$$$ for all $$$k$$$.

Why is $$$f(k+1)-f(k)=f(k)-f(k-1)$$$ a problem?

Let $$$\lambda=f(k)-f(k-1)$$$, then for this value of $$$\lambda$$$, $$$f(k) - \lambda (k) = f(k-1) - \lambda (k-1)$$$, so we will always prefer $$$k-1$$$ to $$$k$$$ when $$$\lambda \leq f(k+1)-f(k)$$$, and in the other case we will prefer $$$k+1$$$.

The source of this problem is that we have several $$$k$$$'s with the minimum value of $$$f(k) - \lambda k$$$

Our function is convex/concave, so the set of $$$k$$$'s is always going to be $$$\{l, l + 1, \ldots, r\}$$$ (contiguous interval).

In the DP that we use for the fixed $$$\lambda$$$, let's maintain the smallest and the largest $$$k$$$ that lead to the minimum answer (we will call them $$$l(\lambda), r(\lambda)$$$ (and we know that all $$$k$$$'s between them will also have the minimum value).

Then once we found the value of $$$\lambda$$$ such that $$$l(\lambda) \leq k \leq r(\lambda)$$$, we just need to restore the answer.

It is easy to do if we store $$$l(\lambda), r(\lambda)$$$ along with all the DP values, every time we just need to jump into the previous position with the minimum possible answer and suitable $$$(l(\lambda),r(\lambda))$$$ interval (we can do it in a naive way, resembling two-pointers).

  • »
    »
    3 years ago, # ^ |
    Rev. 3   Vote: I like it -11 Vote: I do not like it

    whoops, misread

    • »
      »
      »
      3 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      I think that restoring certificate will be incorrect in your method, since now we don't have correct way of getting $$$f(k)$$$ (i.e. dividing array into $$$k$$$ segments to get minimum function value), only $$$f(l)$$$ and $$$f(r)$$$.

      • »
        »
        »
        »
        3 years ago, # ^ |
        Rev. 2   Vote: I like it -8 Vote: I do not like it

        Yes, sorry, I misunderstood

      • »
        »
        »
        »
        3 years ago, # ^ |
        Rev. 3   Vote: I like it -8 Vote: I do not like it

        Actually, now that I understand the problem correctly I think there is some merit to what I said; if you compute at $$$\lambda - 1/2$$$ and $$$\lambda + 1/2$$$ you indeed recover the values of $$$l(\lambda)$$$ and $$$r(\lambda)$$$ for each of the DP's intermediate states. Of course the transitions will not necessarily be the same as for $$$\lambda$$$, so from here we proceed as in the original comment.

        This is slightly easier since otherwise you have to maintain $$$l$$$ and $$$r$$$ in whatever structure you're using and make sure to minimise / maximise them during transitions.

        (This obviously requires the intermediate states to be convex, but this is required for the original comment too, and I haven't seen a problem that can be solved using Aliens trick where this isn't the case.)

        • »
          »
          »
          »
          »
          3 years ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          I still don't understand correctness of your solution, what's the difference between doing what you said and like taking $$$l(-\infty)$$$ and $$$r(\infty)$$$, which is wrong?

          • »
            »
            »
            »
            »
            »
            3 years ago, # ^ |
              Vote: I like it -8 Vote: I do not like it

            Basically at each intermediate state we assume that the value will be a convex function of $$$k$$$ (same as in the original comment), and for a fixed $$$\lambda$$$ our DP is guaranteed to find us $$$\text{min}_k f(k) - \lambda k$$$ at every state (even though we don't know which transitions it will take). So, just like how querying at $$$\lambda \pm 1/2$$$ will get us the values of $$$l(\lambda)$$$ and $$$r(\lambda)$$$ for the final result of the DP, it will also get us the correct values for all intermediate states, by exactly the same reasoning.

            The only thing this does is replace the explicit computation of $$$l(\lambda)$$$ and $$$r(\lambda)$$$; after doing this you would still have to recover the solution exactly the same way as in the original comment, which requires computing the actual values of $$$\text{min}_k f(k) - \lambda k$$$ at all states (although it's easy to recover these from the values at $$$\lambda \pm 1/2$$$).

            I think it's still worth pointing out, because any other method for computing $$$l$$$ and $$$r$$$ would probably need two data structures or two passes for minimising / maximising $$$k$$$ for fixed $$$\lambda$$$, both of which require more thought than doubling everything and querying at $$$\lambda \pm 1/2$$$.

»
3 years ago, # |
  Vote: I like it +20 Vote: I do not like it

My solution is tricky to implement and I don't have a proof of correctness. I once did a lot of stress-testing and it was ok but maybe it was specific to that problem.

We will binary search $$$\lambda$$$ and a special index $$$z$$$ ($$$z \leq n$$$).

For a fixed $$$\lambda$$$, you can choose whether to break ties by preferring more or fewer segments. In the dp, if you are at index $$$i$$$, let's prefer more segments if $$$i \leq z$$$ and fewer segments otherwise. Bigger value of $$$z$$$ yields more segments. For correctness, we need to prove that increasing $$$z$$$ by $$$1$$$ increases the number of segments by at most $$$1$$$. This might be problem-dependent.

  • »
    »
    3 years ago, # ^ |
      Vote: I like it +26 Vote: I do not like it

    Sorry for necroposting but I wanted to mention that there was such a discussion a couple of years ago, and this comment proposed the same idea.