raghav_19's blog

By raghav_19, history, 4 years ago, In English

In this problem, i was using fft to solve it but found very strange behaviour between two case. submission1 in which my fft code and its function are written are written normally which gave me TLE in mostly all cases but in submission2 when I put all fft function into struct and my code get AC.

struct is the only change between two submission. Can someone explain the reason behind this.

  • Vote: I like it
  • +57
  • Vote: I do not like it

| Write comment?
»
4 years ago, # |
  Vote: I like it +45 Vote: I do not like it

The difference is that the compiler deduced that mod was in fact a constant when it was in struct scope, and avoided using the expensive idiv instruction. ll mod=M; is a very bad idea, a single const would speed your code up a lot without relying on the compiler being that smart.

  • »
    »
    4 years ago, # ^ |
      Vote: I like it +11 Vote: I do not like it

    Thanks for the help, You are right I just checked again by adding const and it speed up the code by around 4 times. but I have 2 doubt here.

    1. why compiler deduced mod as constant in struct scope. I mean mod value can be changed, right?

    2. Either mod is const or not , how it is effecting the runtime?.

    I tried to search both question but couldn't find anything. If you have anything to read can you please share it.

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it +6 Vote: I do not like it

      For (2), you can try to use godbolt to try to see the assembly output of mod a constant or not. You can see the compiler will optimize and not use the idiv mentioned above.

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it +22 Vote: I do not like it

      It can be deduced because it's a member variable of a struct, and there isn't any code which modifies it. Interestingly, just having a function in that struct returning a non-const pointer to mod will break the assumption and your program will be slower. You may ask the question why the same can't be said for the global variable? It has to do with the concept of linkage. Global variables by default have external linkage, meaning that, in theory, some other code (for example, in another cpp file) can access it and modify it while the program is running, and that's why the compiler can't assume its value won't change, and therefore has to emit the slower idiv.

      As I said, if you don't know the modulo in advance, you must use the expensive idiv, but if you do, there are some tricks that allow you to compute the modulo faster. As a trivial example, if you want to compute modulo $$$2$$$, you just look at the last binary digit. What's cool is that similar tricks exist for all modulos.

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it +3 Vote: I do not like it

        Thanks a lot for the explanation. conceptually it is more clear now.

        To get into more detail for understanding purpose I used golbolt. In first i used mod as int, it is using idiv instruction and in second i assigned it as const int now it start using some trick without idiv instruction.

        To check the same in struct as well. I used with const int in struct but in this case it is still using idiv instruction which is not what you told above. Can you check this

        • »
          »
          »
          »
          »
          4 years ago, # ^ |
          Rev. 2   Vote: I like it 0 Vote: I do not like it

          Here the mod is a non-static (const) member. The compiler still can't assume that the mod will be same for every call of the method since there can be a different instances of the struct with different mods. For example, you can have

          check one;     // one.mod == 654562
          check two{42}; // two.mod == 42
          one.square(100);
          two.square(100);
          

          You can make the mod a static member by static const int mod = 654562 for the compiler to optimize it.

          • »
            »
            »
            »
            »
            »
            4 years ago, # ^ |
              Vote: I like it 0 Vote: I do not like it

            I understood your point but in Accepted submission, I only wrote ll mod=M in struct and according to godbolt compiler it is using idiv instruction in same case then how it gets optimized.

            • »
              »
              »
              »
              »
              »
              »
              4 years ago, # ^ |
                Vote: I like it 0 Vote: I do not like it

              That's because it is compiled with -O2 flag. Passing -O2 or even -O1 on the goldbolt example completely removes the method call and does compile time calculations.
              https://godbolt.org/z/6s8TY68sT

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        btw, even if you don't know the modulo during compilation time, but have a batch of mods by the same modulo, you can preprocess something(basically the same thing that compiler does during the compilation) and then divide without idiv. There's a library called libdivide that does that

        • »
          »
          »
          »
          »
          4 years ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          Could you elaborate on this?

          • »
            »
            »
            »
            »
            »
            4 years ago, # ^ |
              Vote: I like it 0 Vote: I do not like it

            in what way?

            • »
              »
              »
              »
              »
              »
              »
              4 years ago, # ^ |
                Vote: I like it 0 Vote: I do not like it

              but have a batch of mods by the same modulo, you can preprocess something and then divide without idiv.

              I don't understand what we can preprocess, and avoid idiv.

              • »
                »
                »
                »
                »
                »
                »
                »
                4 years ago, # ^ |
                Rev. 2   Vote: I like it 0 Vote: I do not like it

                the same thing the compiler preprocesses. I do not know the details from the top of my head, but it should be googleable or you can check sources of libdivide

  • »
    »
    4 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I dont underestand very well, in few words, initialize a variable like mod is faster if i declare as const?

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it +5 Vote: I do not like it

      No, it's not faster. What's faster is division by a constant compared to division by a variable. That is why:

      const int MOD = 1000000009;
      ...
      a % MOD
      

      and

      a % 1000000009
      

      are both much faster than

      int MOD = 1000000009;
      ...
      a % MOD
      

      That is because when MOD is constant, the compiler can optimize division to a multiplication and a shift. If it's not a constant, this optimization can't be done. (in theory it can be done, but calculating how much to shift and what to multiply by is slower than just performing a division)