kalimm's blog

By kalimm, 10 years ago, In English

Hi everyone! I want to share a very cute LIS(Longest Increasing Subsequence) implementation.

I found it on here.

Also thanks to mnbvmar for improve the implementation.

But this assumes no duplicates in array.

set < int > s;
set < int > :: iterator it;

...

FOR(i, 1, n)
{
    s.insert(a[i]);
    
    it = s.upper_bound(a[i]);

    if(it != s.end())
        s.erase(it);
}

cout << s.size() << endl;

So I changed it to multiset, for duplicate.

multiset < int > s;
multiset < int > :: iterator it;

...

FOR(i, 1, n)
{
    s.insert(a[i]);
    
    it = s.upper_bound(a[i]);
    
    if(it != s.end())
        s.erase(it);
}

cout << s.size() << endl;

There are seems work, and it's easy to proof them.

Thanks for your reading :)

UPD : I just found a similar approach for Longest Strictly Increasing Subsequence.

multiset < int > s;
multiset < int > :: iterator it;

...

FOR(i, 1, n)
{
    s.insert(a[i]);

    it = s.lower_bound(a[i]);

    it++;

    if(it != s.end())
        s.erase(it);
}

cout << s.size() << endl;

UPD2 : You can solve LMIS(LIS) and ELIS(LSIS) with this approach.

  • Vote: I like it
  • +22
  • Vote: I do not like it

| Write comment?
»
10 years ago, # |
Rev. 2   Vote: I like it +15 Vote: I do not like it

It may not work now depending on your luck and multiset implementation (but there's an easy workaround). I assume we're talking about a longest non-decreasing sequence.

For example, take 1 3 1 2. After two first steps S is obviously {1,3}. After third insert it's {1,1,3}. However, you have no control over which one (literally, 1) gets selected by find routine. If it's second one, everything is OK (it's {1,1}, that is the LIS of the first three elements), however if it finds the first one, the second element gets deleted and we end up with incorrect set {1,3}. In first case, we find a correct longest non-decreasing sequence of length 3, in second case we find only the one with length 2.

It means we should modify the search a bit. We should look for the first position containing the value bigger than a[i] (and not, as before, the next position after any value equal to a[i]). So it should be like

FOR(i, 1, n){
    S.insert(a[i]);
    it = S.upper_bound(a[i]);   // this does exactly what I said
    if(it != S.end()) S.erase(it);
}
»
10 years ago, # |
  Vote: I like it +8 Vote: I do not like it

what if we need to print the lis?

  • »
    »
    10 years ago, # ^ |
    Rev. 3   Vote: I like it -29 Vote: I do not like it

    Ignore ...

    • »
      »
      »
      10 years ago, # ^ |
      Rev. 3   Vote: I like it +12 Vote: I do not like it

      Actually it's not true. Look at this example we have : 1 1 0 10 3

      S contains : 0 1 3

      Actually i-th element in the set is the least value for end of LIS of length i(or i+1 if you think zero-based). If you think i am wrong please tell me.I am not very sure.

      edit:a typo corrected

      • »
        »
        »
        »
        10 years ago, # ^ |
        Rev. 2   Vote: I like it -19 Vote: I do not like it

        Ignore...

        • »
          »
          »
          »
          »
          10 years ago, # ^ |
            Vote: I like it +3 Vote: I do not like it

          Guys, think twice before you post misleading comments.

          Again, not true, take a look at 2, 3, 1. The sets contain numbers 1,3, which is not a valid subsequence.

          I'm not sure if it's straightforward to recover the subsequence from this approach, anybody a clue?

          • »
            »
            »
            »
            »
            »
            10 years ago, # ^ |
            Rev. 4   Vote: I like it +13 Vote: I do not like it

            Sure, it can be done. The code becomes a bit longer because instead of storing just the values you have to store the indices.

            In the simple code, you have the following invariant after processing each element: Consider the values currently stored in the set S, in increasing order. Let's call them S[0], S[1], S[2], etc. Then, for each i, S[i] is the smallest value such that the sequence we already processed has an increasing subsequence of length i+1 that ends in the value S[i].

            If you want to reconstruct the sequence, you have to remember indices of those values instead of the values themselves. Then, whenever you insert a new index x into the position i, you look at the index currently at position i-1 in the set and remember it as the previous element of the optimal sequence that ends at index x.

            Here is a sample implementation. The function returns the indices of elements that form one possible longest strictly increasing subsequence of the input. (The input is assumed to be non-empty. It is allowed to contain duplicates.)

            vector<int> LIS(const vector<int> &elements) {
                // declare the set with a custom comparison function
                // that compares values instead of indices
                auto compare = [&](int x, int y) {
                    return elements[x] < elements[y];
                };
                set< int, decltype(compare) > S(compare);
            
                // process the elements; for each length
                // maintain the index of the best end so far
                vector<int> previous( elements.size(), -1 );
                for (int i=0; i<int( elements.size() ); ++i) {
                    auto it = S.insert(i).first;
                    if (it != S.begin())
                        previous[i] = *prev(it);
                    if (*it == i && next(it) != S.end()) 
                        S.erase(next(it));
                }
            
                // reconstruct the indices that form 
                // one possible optimal answer
                vector<int> answer;
                answer.push_back( *S.rbegin() );
                while ( previous[answer.back()] != -1 )
                    answer.push_back( previous[answer.back()] );
                reverse( answer.begin(), answer.end() );
                return answer;
            }
            

            It's much easier if the input is guaranteed to have no duplicates. In that case, just use a map and for each value remember the immediately preceding one in an optimal solution.

            • »
              »
              »
              »
              »
              »
              »
              10 years ago, # ^ |
                Vote: I like it 0 Vote: I do not like it
              auto compare = [&](int x, int y) {
              	return elements[x] < elements[y];
              };
              

              i have never seen a piece of code like this. can u explain what it does?

              • »
                »
                »
                »
                »
                »
                »
                »
                10 years ago, # ^ |
                  Vote: I like it +3 Vote: I do not like it

                It's a lambda function. It's an anonymous function with a few abilities (for example, it can access every variable that is given to it as parameter and every variable that's visible from the point of definition).

                Why is it better to use lambdas? If you want to sort a vector of ints by increasing value of x3 + 17x, you can just write

                sort(vec.begin(), vec.end(), [](int a, int b){
                    return a*a*a+17*a < b*b*b+17*b;
                });
                

                ...instead of some boring and lengthy defining new comparison functions etc.

                In the example above the advantages are even a bit larger — when not using lambdas, you would have to write a functor (way 1 here: http://ideone.com/191tt3) that is as long as a function where we use it! Lambda is way more concise — and moreover, we can define it exactly where we need it (if the function was 200 lines long, we wouldn't have to scroll up above this function to find the comparators!).

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  7 years ago, # ^ |
                    Vote: I like it -11 Vote: I do not like it

                  A very underrated coment by mnbvmar

  • »
    »
    9 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Print all elements of set :/

»
10 years ago, # |
Rev. 2   Vote: I like it +3 Vote: I do not like it

What is x in the second code? It is fixed now!Thanks!

»
10 years ago, # |
  Vote: I like it 0 Vote: I do not like it
»
10 years ago, # |
Rev. 2   Vote: I like it +6 Vote: I do not like it

in first two code snippets, what is x? (i think it should be a[i])

»
10 years ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

It looks like magic, why does it works?

  • »
    »
    10 years ago, # ^ |
      Vote: I like it -8 Vote: I do not like it

    It's itself of magic!

  • »
    »
    10 years ago, # ^ |
      Vote: I like it +5 Vote: I do not like it

    Forget it is a (multi)set, think of it as a simply vector (look at my comment). When you insert a value x at position i it means that there exists a LIS ending with x has length i. Furthermore, if magic[i] = x it means x is the smallest value which can end a LIS of length i. Sketch of proof by induction:

    What you do (with lower_bound function) is you look for the first index i such that the number magic[i-1] is less than x. By induction it means that there is a LIS ending with magic[i-1] of length i-1 and it's the "smallest" one. So you can extend it by one obtaining a LIS of length i and it will also be the "smallest" one (i.e. ending with the smallest number).

    But it does NOT mean, that the vector contains a valid subsequence.

»
10 years ago, # |
  Vote: I like it +1 Vote: I do not like it

JuanMata, Mr.ink, thanks, fixed.

»
10 years ago, # |
Rev. 5   Vote: I like it +3 Vote: I do not like it

I would like to share my solution, which can be extended to recover the LIS (the // OPTIONAL part). My solution works for Longest Strictly Increasing Subsequence, however it is a matter of some minor changes to make it work for the other version. I am sorry for using some C++ 11 tricks, but it's more convenient and readable for me.


int solve(vector<int> input, vector<int>& solution) { vector<int> magic; vector< vector<int> > history; // OPTIONAL for (int x: input) { vector<int>::iterator it = lower_bound(magic.begin(), magic.end(), x); if (it == magic.end()) { magic.push_back(x); history.push_back(vector<int>(1, x)); // OPTIONAL } else { *it = x; history[it-magic.begin()].push_back(x); // OPTIONAL } } // OPTIONAL { int result = magic.size(); solution.resize(result); solution.back() = magic.back(); for (int i=result-2; i>=0; i--) { auto it = lower_bound(history[i].rbegin(), history[i].rend(), solution[i+1]); it --; solution[i] = *it; } } return magic.size(); }

A short comment on the code:

Instead of a (multi)set I use simply a vector.

For recovery, I use the following fact: if some number x is in the optimal subsequence on position i, it was once added to the magic vector at position i. This is why I record the history of all numbers which were inserted on each position in the vector. Then I traverse this structure from the end to the beginning and select the appropriate number. It requires several simple lemmas to prove it works: e.g. all vectors history[i] are sorted by both value (reverse order) and position in the input sequence!

Edit: I can see my solution (its basic part) is equivalent to dalex's solution in this comment

  • »
    »
    10 years ago, # ^ |
      Vote: I like it +5 Vote: I do not like it

    I think that my code is better.

    Construct F[i] and output LIS length (max element in array F):

       for (int i=1; i<=n; i++){
          f[i]=lower_bound(b+1, b+answer+1, a[i])-b;
          maximize(answer, f[i]);
          b[f[i]]=a[i];
       }
       printf("%d\n", answer);
    

    If we need to output the sequence, add this code:

       int require = answer;
       for (int i=n; i>=1; i--)
       if (f[i]==require){
          T.push_back(a[i]);
          require--;
       }
       reverse(T.begin(), T.end());
    
       for (int i=0; i<T.size(); i++)
       printf("%d ", T[i]);
    
    • »
      »
      »
      10 years ago, # ^ |
        Vote: I like it +3 Vote: I do not like it

      It's hard to say which code is "better" (afaik there is no order on codes based on their quality) :) But indeed, your solution is shorter. Thank you for the feedback, I really like it!

    • »
      »
      »
      5 years ago, # ^ |
        Vote: I like it -10 Vote: I do not like it

      Nice implementation

      But can I reconstruct the LMIS array in O(LIS.size) ?

»
9 years ago, # |
  Vote: I like it -11 Vote: I do not like it

LIS IMPLEMENTATION

ll lis(vector &vec){ ll an=0,n=vec.size(); n++; ll arr[n]; arr[0]=-1e9; repp(i,1,n)arr[i]=1e9; rep(i,n-1){ ll pos=upper_bound(arr,arr+n,vec[i])-arr; an=max(an,pos); arr[pos]=vec[i]; } return an; }

»
6 years ago, # |
  Vote: I like it -11 Vote: I do not like it

Can anyone tell me why this is correct ?

»
6 years ago, # |
  Vote: I like it 0 Vote: I do not like it

yeah really cute i love this it's very simple

»
6 years ago, # |
Rev. 2   Vote: I like it +13 Vote: I do not like it

This is an old post, but it seems no one has explicitly mentioned yet that you do not need to use a set or multiset for this implementation, a vector works fine (and can be faster). For example,

vector<int> v;
for (int i = 0; i < n; i++) {
    auto it = lower_bound(v.begin(), v.end(), a[i]);
    if (it != v.end()) *it = a[i];
    else v.push_back(a[i]);
}
return v.size();

This finds a strictly-increasing LIS, but you can easily modify it to find longest nondecreasing sequences by changing lower_bound to upper_bound.

  • »
    »
    6 years ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    Code in my comment above is 2 lines shorter and does the same thing.

    But you need to fill v with infinities.

  • »
    »
    6 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    It's from 5 years ago and just an another implementation. So many people already knew vector approach so that's why nobody didn't mention it.

    Also in my solution, you do not even need to use a vector for that implementation, a set works fine(and is shorter).

»
6 years ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

Use BIT

»
5 years ago, # |
  Vote: I like it 0 Vote: I do not like it

How can this be used in the case of pairs?

»
4 years ago, # |
  Vote: I like it -20 Vote: I do not like it

What if the input array is a = [ 8, 2, 2, 1] Answer is 2 (2,2) but algorithm gives (1,2)

  • »
    »
    3 years ago, # ^ |
      Vote: I like it +8 Vote: I do not like it

    Sorry to Necropost, but my G, do you know what the word "increasing" means?

»
3 years ago, # |
Rev. 2   Vote: I like it -10 Vote: I do not like it

Just to make this blog little more useful — We can also find total no of $$$LIS$$$ (of all version increasing/non-increasing...) in $$$O(nlogn)$$$.

Hint