adamant's blog

By adamant, history, 7 years ago, In English

Hi everyone!

Perhaps you heard about github project on translating e-maxx. The thing is that project is actually more than just translating it. You see, there are bunch of algorithms and approaches which either do not have proper elaborations or have but they're written in some weird uncommon languages like, you know, russian or chinese. And there are some sites which just don't fit for this purpose for some reasons.

Years ago when I started doing competitive programming e-maxx.ru was the main resource to learn things. Things changed a bit now. E-maxx is Russian only and it wasn't updated for years. And now I hope that e-maxx-eng will manage to fill this gap of common resource for everyone to learn new things and keep updated on recent competitive programming tricks and new algorithms.

So I encourage everyone to collaborate in making e-maxx-eng comprehensive guide into competitive programming, and not only on hacktoberfests :). And to begin with I would like to share with you my article on convex hull trick and Li Chao tree I wrote for this resource. Enjoy!

If you were too lazy to read it thoroughly: There is a link to CHT and Li Chao tree article just above this sentence!

  • Vote: I like it
  • +191
  • Vote: I do not like it

| Write comment?
»
7 years ago, # |
  Vote: I like it +8 Vote: I do not like it

Oh, I totally forgot, if you have any topics on which you would like to get an article, you can write it here :)

  • »
    »
    7 years ago, # ^ |
      Vote: I like it +14 Vote: I do not like it
    1. Ji Driver segment tree
    2. Gomory Hu trees
    3. Maybe some math serious topic.
  • »
    »
    7 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    SGT Beats use of segment tree(mentioned here).

    Also, I think your article should mention that Li-chao can easily be extended for parabolic functions much easily(useful for problem KILLER)

    • »
      »
      »
      7 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Well, I mentioned that it only assumed that each pair of functions intersects at most once..

  • »
    »
    7 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    (O(n), O(1)) RMQ algorithm by Fischer and Heun (link).

»
7 years ago, # |
Rev. 2   Vote: I like it +3 Vote: I do not like it

WPS Binary Search

(or I guess it is also called Lagrange optimization in section 5 of this article)

»
7 years ago, # |
  Vote: I like it +6 Vote: I do not like it

On behalf of all newbies like me I really thank you! Looking forward for more articles!

»
7 years ago, # |
  Vote: I like it +19 Vote: I do not like it

Here's a straightforward dynamic CHT/Li Chao tree problem if anyone wants to test their template: https://csacademy.com/contest/archive/task/squared-ends

»
7 years ago, # |
  Vote: I like it +3 Vote: I do not like it

It is first time to hear about Li Chao tree, it is so awesome and easy to code ^_^

However if the numbers are very large, we can use ranking (instead of dynamic segment tree) as it will save space, memory and easier to code (as the code will be the same, only difference is instead of calling f(x), we call f(rev_rnk[x])), however this works only in offline queries (since we need to rank both input and numbers in queries).

Thank you for the nice article :)

»
7 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Thanks for the great article adamant! Just to notice, the images/figures on the article are broken.

  • »
    »
    7 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    adamant I'll look into it. Looks like all images are broken on e-maxx-eng.

»
7 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Are exercise problems sorted in increasing order of difficulty?

»
6 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Is it working now? I can't enter the site. (Or problem with my vpn?)

»
6 years ago, # |
  Vote: I like it +26 Vote: I do not like it

While solving 1179D - Fedor Runs for President with Li Chao tree I came up with a idea to improve its memory usage.
If queries are integers from $$$1$$$ to $$$n$$$, then you only need $$$n$$$ nodes.

Your implementation uses $$$4n$$$ nodes because it's segment tree with binary heap indexing.
Segment tree can use only $$$2n-1$$$ nodes with proper indexing.
We can further improve it to $$$n$$$ by replacing segment tree with binary search tree.

Let $$$t[l,r]$$$ denote the node of range $$$[l,r]$$$ and $$$m=\lfloor \frac{l+r}{2} \rfloor$$$. In the Li Chao tree algorithm, $$$t[l,r]$$$ stores a line that has the smallest $$$y$$$-value at $$$x=m$$$ in its subtree. Therefore, when you are querying $$$x=m$$$, and you reach the node $$$t[l,r]$$$, you can just return instead of going down the subtree.

In the original algorithm, $$$leftchild(t[l,r]) = t[l,m], rightchild(t[l,r]) = t[m,r]$$$.
We can modify the algorithm and make $$$leftchild(t[l,r]) = t[l,m-1], rightchild(t[l,r]) = t[m+1,r]$$$.

So we have a bijection between nodes and integers from $$$1$$$ to $$$n$$$, and thus we only need $$$n$$$ nodes.

If you use $$$m$$$ as the index of the node $$$t[l,r]$$$ and draw the tree with index on each node, then you can see it's a binary search tree of integers from $$$1$$$ to $$$n$$$.

This is my implementation.55968228

  • »
    »
    6 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    will there be any operation that only li-chao segement tree could handle?

  • »
    »
    6 years ago, # ^ |
      Vote: I like it -10 Vote: I do not like it

    Just as you describe, we can use binary search trees(ex. implicit treaps) to replace any segment tree with only n nodes. But with the high constant they are unlikely to be any faster/have less memory usage than a segment tree. (Also atleast for me, a segment tree is much faster/easier to code.)

    • »
      »
      »
      6 years ago, # ^ |
      Rev. 2   Vote: I like it 0 Vote: I do not like it

      oToToT If functions are not line but segment, Li Chao segment tree can solve it in $$$O(\log^2 n)$$$ per update and $$$O(\log n)$$$ per query. I dont think Li Chao BST works in this case.

      amnesiac_dusk In this case, the constant decreases. The only difference between the original algorithm and mine is that my left child and right child's ranges are shorter.

      You can also see it by modifying adamant's code.
      just add if (r < l) return; and replace every l, m to l, m-1, every m, r to m+1, r. After that it became my algorithm

      • »
        »
        »
        »
        6 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        Ok. I understood it. It will be a nice constant optimisation for Lichao-tree. Thanks

»
6 years ago, # |
  Vote: I like it 0 Vote: I do not like it

some weird uncommon languages :(

»
6 years ago, # |
Rev. 2   Vote: I like it -25 Vote: I do not like it

There is some misleading in the article about DSU. It is said there, that if you link a to b randomly, the complexity will not be worse. In fact, if you use this, it is easy to construct a test, where expected value of tree height is n / 2, where n is number of sets. In the article attached, there is another way of random linking — there we assign all the priorities at random and link a node with bigger priority to a node with smaller priority, but do not update the priorities after that, which can be a bit easier to implement.

By the way, adamant, do you know how to construct a test, where dsu with only path compression without rollbacks works in O(log n) on average? I heard that it is not that trivial, and i could not do find this by myself.

Another thing that i believe is worth mentioning, is that you can link each node to it's grandparent, which will have the same complexity, but a bit faster in practice, as said here: link.

  • »
    »
    6 years ago, # ^ |
      Vote: I like it +1 Vote: I do not like it

    I actually can't see the reason for this exact comment being downvoted. I can see reasoning behind downvoting most of the comments from this account, as most of them are rude or unpleasant in some way. But here i make 3 points and each of which seems ok:

    1. I mentioned about a mistake in an article and asked to extend it. You want exact test case where random linking performs badly? It's just easy-peasy: let's link 1 to 2, then 2 to 3, ... 3 to n. Then, expected value of tree height is n / 2 which is easy to prove by induction. I also mentioned, how priorities are chosen in the original paper, which corrects the original articl on cp-algorithm. This seems more than OK to me.

    2. I asked a cool coder whether he knows a test where upper bound for DSU with path compression only is achieved. I googled some information on this topic, but failed, so it is at least not too easy to find it and it is good to ask this question in public, in case there will be other people interested in this topic.

    3. Finally, i mentioned a technique, which may make your life a bit easier and your DSU a little faster and provided a link to an article.

    I do not see why this comment was so heavily downvoted. I pointed to a mistake in an article and proposed some ways to improve it. Even if at won`t, i am strongly convinced that a comment upwards has educational value itself and is useful for the cf community.

    • »
      »
      »
      6 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      It is probably because your comment directly contradicts the linked paper. I'm more inclined to trust the paper. You basically just said "Consider a line" and concluded that somehow the algorithm is worse time complexity. Which isn't true at all, plus, the union by size would do the exact same thing on your line, so they would still give the same resulting complexity.

      • »
        »
        »
        »
        6 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        Not "somehow concluded". Each time you with equal probability link a tree to one node and increase depth by 1, link a node to an existing tree and it does not change your depth (it may increase it by one, but may not). The expected value is n / 2.

        • »
          »
          »
          »
          »
          6 years ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          Maybe I misunderstand the case. It is not a line?

    • »
      »
      »
      6 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Sorry for getting so many downvotes. I'm guessing the reason for it is, that this discussion about DSU doesn't really belong to this blog post. adamant wrote this blog post to promote mostly his own article about the convex hull trick, and to motivate new people into writing articles. In fact adamant has nothing to do with the DSU article.

      (Also being unrated doesn't help much for the upvote/downvote situation.)

      Discussion about mistakes in articles is usually done on Github, where we develop/manage the articles.

      I've created an issue for your concerns: Issue 437. I will fix the article tomorrow after work. Btw, it is also possible for yourself to fix the article, by creating a pull request on Github.

  • »
    »
    6 years ago, # ^ |
      Vote: I like it +22 Vote: I do not like it

    Yes, this is one of the most common mistakes with randomized linking. Here is a 3 year old discussion about this that's work reading. To summarize the discussion:

    • There afaik is no proof that shows $$$O(n\ \alpha (n))$$$ runtime for randomized linking by flipping a fair coin, i.e. by if(rand()%2). In fact, a benchmark from the linked page suggest otherwise, namely that the runtime is close to $$$\Theta(n \log n)$$$.
    • The paper on randomized linking talks about giving every node a random weight and always linking the node with lower weight to the one with higher weight. This is equivalent to flipping a biased coin with weights equal to the sizes of the individual trees, which makes it closely related to union by size.
    • The paper also conjectures that randomized linking by flipping a fair coin needs $$$\Omega\left(n \frac{\log n}{\log \log n}\right)$$$ time w.h.p.

    To answer your question on worst-case inputs for path compression:

    • In the paper Worst-Case Analysis of Set Union Algorithms, it is shown that if we start with a binomial tree of height $$$h = \lg(\frac{n}{2})$$$ (so with $$$2^h = \frac{n}{2}$$$ nodes), link it to a single node and then do a find operation on a deepest leaf, we end up with our binomial tree of height $$$h$$$ (and a node attached to the root of it). This allows us to repeat these steps $$$\frac{n}{2}$$$ times (using the remaining $$$\frac{n}{2}$$$ nodes). Each find operation takes $$$\Theta(h) = \Theta(\log n)$$$ time, resulting in $$$\Theta(n \log n)$$$ total runtime.
    • There are some nice pictures of this in the paper and they make understanding the behavior of binomial trees under path compression a lot easier.
    • I haven't thought too much about this, but if we allow later queries to depend on the answer to earlier find queries (i.e. something like an iterative union-find problem), then the following might show a $$$\Omega\left(n \frac{\log n}{\log \log n}\right)$$$ bound for randomized linking by flipping a fair coin:

      1. This construction is based on the case $$$m = n \log n$$$ from the paper, where $$$j = \log n$$$ and $$$i \in \Theta\left(\frac{\log n}{\log \log n}\right)$$$.
      2. In the paper, they fix a parameter $$$j$$$ and then for $$$k = i \cdot j$$$ define trees $$$T_k$$$ on which we can do $$$1$$$ unite operation and $$$j$$$ find operations so that we get back $$$T_k$$$ with an additional (useless) leaf and so that each find operation takes $$$\Theta(i)$$$ time. $$$T_k$$$ contains at most $$$(j+1)^{i-1}$$$ nodes, so picking $$$i = \log_j(n) - \Theta(1)$$$ makes $$$T_k$$$ contain at most $$$\frac{n}{2}$$$ nodes.
      3. Similar to the paper, we fix a parameter $$$j$$$. Our goal is to define trees $$$S_k$$$ which can be built with randomized unite operations. Moreover, we want $$$S_k$$$ to contain a copy of $$$T_k$$$.
      4. We define $$$S_k$$$ recursively. If $$$k \leq j$$$, then $$$S_k$$$ consists of a single node. Otherwise it consist of a copy of $$$S_{k-j}$$$ to the root of which we attach $$$j-1$$$ copies of $$$S_{k-j}$$$.
      5. We can build $$$S_k$$$ recursively by building $$$j$$$ copies of $$$S_{k-j}$$$, connecting them with $$$j-1$$$ unite operations. Finally, we do a a find operation on one of the $$$S_{k-j}$$$ to find the root of the $$$S_k$$$.
      6. If you compare this construction to figure $$$11.a$$$ from the paper, it is clear that $$$S_k$$$ contains $$$T_k$$$. To figure out where to query, we need an explicit embedding from $$$T_k$$$ to $$$S_k$$$. For $$$k \leq j$$$, this is the identity, for $$$k > j$$$, we can compute such an embedding out of the embedding of the $$$T_{k-j}$$$ in the recursively built $$$S_{k-j}$$$. (Note that an embedding of $$$T_{k-j}$$$ into $$$S_{k-j}$$$ also gives us one from $$$T_{k-j-l}$$$ into $$$S_{k-j}$$$ for any $$$l > 0$$$.
      7. With this construction, $$$T_k$$$ contains at most $$$(j+1)^{i}$$$ nodes, so if we pick $$$i = \log_j(n) - \Theta(1)$$$ we use at most $$$\frac{n}{2}$$$ nodes.
      8. Once we constructed $$$S_k$$$, we only keep the copy of $$$T_k$$$ contained in it and forget all the other nodes from $$$S_k$$$.
      9. In the paper, it show that we can perform one unite query followed by $$$j$$$ find queries so that we get $$$T_k$$$ back and each find query needs $$$\Theta(i)$$$ time. Let's call such a batch of queries a phase. To deal with the fair coin flips, we do the single unite by "trying $$$2 \log_2(n)$$$ times". Let $$$a$$$ denote the current root, then repeat the following steps $$$2 \log_2(n)$$$ times: Take a single node tree $$$b$$$, unite $$$a$$$ with $$$b$$$ and set $$$a$$$ to $$$\operatorname{find}(a)$$$. This does not change the shape of the $$$T_k$$$ and with probability $$$\frac{1}{2}$$$, it add an upward edge from the root of the $$$T_k$$$. The probability of this not happening in any of the steps is $$$\frac{1}{n^2}$$$.
      10. Set $$$j = \log_2(n)$$$, $$$i$$$ as above so that $$$S_k$$$ contains at most $$$\frac{n}{2}$$$ nodes. With the remaining $$$\frac{n}{2}$$$ nodes, we can do $$$\frac{n}{4 \log_2(n)}$$$ phases. In every phase, we do $$$\Theta(j)$$$ queries needing $$$\Theta(i)$$$ time each, so a phase needs $$$\Theta(j i) = \Theta\left(\frac{(\log n)^2}{\log \log n}\right)$$$ time. As there are $$$\Theta\left(\frac{n}{\log n}\right)$$$ phases in total, we get a runtime of $$$\Theta\left(n \frac{\log n}{\log \log n}\right)$$$.
      11. The probability of failure of one phase is $$$\frac{1}{n^2}$$$, so by union bound, the the probability of failing at least once is at most $$$\frac{1}{n}$$$, so with high probability, we indeed get the desired lower bound.

    Let me know it you have questions regarding the proof or if you find a mistake it it. There are probably some typos in the LaTeX parts.