lucky_clover_'s blog

By lucky_clover_, history, 5 weeks ago, In English

Today I was attempting to hack this submission for 2061B. In this solution, the code uses s1.count(x) inside the input-reading loop. Since the time complexity of multiset's count function is $$$O(\log n + k)$$$, where $$$k$$$ is the number of occurrences of the element, the code's time complexity should become $$$O(n^2)$$$ if all elements $$$a_i$$$ are set to $$$1$$$.

However, when I submitted the hack, this code only took 109 ms. Local testing showed that it indeed results in a TLE without -O2, but runs very quickly with -O2 enabled. Does this indicate that the compiler optimizes std::multiset.count() when it is used as boolean?

  • Vote: I like it
  • +49
  • Vote: I do not like it

»
5 weeks ago, # |
Rev. 3   Vote: I like it +13 Vote: I do not like it

I think so.

Putting the code in compiler explorer, you can see the assembly generated when using count() as boolean has one less std::_Rb_tree_increment(std::_Rb_tree_node_base const*) loop.

Meanwhile the modified one that uses count(a) == 10 has

        call    std::_Rb_tree_increment(std::_Rb_tree_node_base const*)
        add     rbp, 1
        mov     rdi, rax
        cmp     r12, rax
        jne     .L71
        cmp     r13, r14
        mov     rax, r14
        cmovge  rax, r13
        cmp     rbp, 10
        cmove   r13, rax
        jmp     .L75

which seems to be the process of count() as it has a cmp rbp, 10 (comparing if the count == 10).

»
2 weeks ago, # |
Rev. 6   Vote: I like it +11 Vote: I do not like it

lucky_clover_ Bedge

It was really, and I did a test.

#include <bits/stdc++.h>
using namespace std;

int main() {
    multiset<int> s;
    int x, y;
    cin >> x >> y;
    // Do not let the compiler know the values of x and y in advance, and you should input them equal to 1.
    // This method may be a bit foolish
    for (int i = 1; i <= 100000; ++i) s.insert(min(rand() % x + 1, x));
    // Confusing the compiler, in fact, this always equals to 1.
    for (int i = 1; i <= 100000; ++i) {
        if (s.count(y));
        else return -1; // Never return -1,because x = y = 1.
        y = max(rand() % y + 1, y); //The same
    }
    return 0;
}

When entering 1 1, the result of not turning on O2 in this code is stuck, while when turned on, it ends quickly.

Similarly, I think O2 has turned if (s.count(y)) into if (s.find(y) != s.end()). This greatly improves the running speed of the program. Oh, I didn't know O2 was so powerful before.

Finally, I am a middle school student. And if there are any errors, please feel free to point them out.

»
2 weeks ago, # |
Rev. 2   Vote: I like it +7 Vote: I do not like it

When I checked this myself, I noticed a very unusual behavior. I tried submitting the following two programs:

Code-1:

#include <bits/stdc++.h>
using namespace std;
int main()
{
    multiset<int> s;
    for (int i = 1; i <= 100000; ++i)
        s.insert(1);
    int x = 0;
    for (int i = 1; i <= 100000; ++i)
         x = s.count(1);
    cout<<x<<'\n';
    return 0;
}


Code-2:

#include <bits/stdc++.h>
using namespace std;

int main()
{
    multiset<int> s;
    for (int i = 1; i <= 100000; ++i)
        s.insert(1);
    int x = 0;
    for (int i = 1; i <= 100000; ++i)
         x = s.count(1);
    // cout<<x<<'\n';
    return 0;
}

The only difference between 2 codes is I commented cout<<x<<'\n; line. I checked both the programs on Codeforces and Codechef compilers.

Runtime Comparison:

Code-1: Runtime = 15000+ ms (causing TLE)

Code-2: Runtime = 62 ms

I seek to understand the reasons for this drastic increase in time complexity and runtime caused by this seemingly minor modification.

  • »
    »
    2 weeks ago, # ^ |
      Vote: I like it +7 Vote: I do not like it

    Because now you calculate something, but don't output any result. It means that compiler optimize it to:

    int main() { return 0; }

  • »
    »
    2 weeks ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    The as-if rule states that “Implementations are free to reorder or even rewrite a program’s instructions, so long as the resulting program still behaves as if it were exactly as written.”. In the second case, the result is not printed, so it does not even need to be calculated.