In this problem, i was using fft to solve it but found very strange behaviour between two case. submission1 in which my fft code and its function are written are written normally which gave me TLE in mostly all cases but in submission2 when I put all fft function into struct and my code get AC.
struct is the only change between two submission. Can someone explain the reason behind this.
The difference is that the compiler deduced that
mod
was in fact a constant when it was in struct scope, and avoided using the expensiveidiv
instruction.ll mod=M;
is a very bad idea, a singleconst
would speed your code up a lot without relying on the compiler being that smart.Thanks for the help, You are right I just checked again by adding
const
and it speed up the code by around 4 times. but I have 2 doubt here.why compiler deduced
mod
as constant in struct scope. I meanmod
value can be changed, right?Either
mod
isconst
or not , how it is effecting the runtime?.I tried to search both question but couldn't find anything. If you have anything to read can you please share it.
For (2), you can try to use godbolt to try to see the assembly output of mod a constant or not. You can see the compiler will optimize and not use the
idiv
mentioned above.thanks for telling the tool to check this.
oh Nice.
It can be deduced because it's a member variable of a struct, and there isn't any code which modifies it. Interestingly, just having a function in that struct returning a non-const pointer to
mod
will break the assumption and your program will be slower. You may ask the question why the same can't be said for the global variable? It has to do with the concept of linkage. Global variables by default have external linkage, meaning that, in theory, some other code (for example, in another cpp file) can access it and modify it while the program is running, and that's why the compiler can't assume its value won't change, and therefore has to emit the sloweridiv
.As I said, if you don't know the modulo in advance, you must use the expensive
idiv
, but if you do, there are some tricks that allow you to compute the modulo faster. As a trivial example, if you want to compute modulo $$$2$$$, you just look at the last binary digit. What's cool is that similar tricks exist for all modulos.Thanks a lot for the explanation. conceptually it is more clear now.
To get into more detail for understanding purpose I used golbolt. In first i used mod as
int
, it is using idiv instruction and in second i assigned it asconst int
now it start using some trick withoutidiv
instruction.To check the same in struct as well. I used with
const int
in struct but in this case it is still usingidiv
instruction which is not what you told above. Can you check thisHere the mod is a non-static (const) member. The compiler still can't assume that the mod will be same for every call of the method since there can be a different instances of the struct with different mods. For example, you can have
You can make the mod a static member by
static const int mod = 654562
for the compiler to optimize it.I understood your point but in Accepted submission, I only wrote
ll mod=M
in struct and according to godbolt compiler it is usingidiv
instruction in same case then how it gets optimized.That's because it is compiled with -O2 flag. Passing -O2 or even -O1 on the goldbolt example completely removes the method call and does compile time calculations.
https://godbolt.org/z/6s8TY68sT
btw, even if you don't know the modulo during compilation time, but have a batch of mods by the same modulo, you can preprocess something(basically the same thing that compiler does during the compilation) and then divide without idiv. There's a library called libdivide that does that
Could you elaborate on this?
in what way?
I don't understand what we can preprocess, and avoid
idiv
.the same thing the compiler preprocesses. I do not know the details from the top of my head, but it should be googleable or you can check sources of libdivide
I dont underestand very well, in few words, initialize a variable like mod is faster if i declare as const?
No, it's not faster. What's faster is division by a constant compared to division by a variable. That is why:
and
are both much faster than
That is because when
MOD
is constant, the compiler can optimize division to a multiplication and a shift. If it's not a constant, this optimization can't be done. (in theory it can be done, but calculating how much to shift and what to multiply by is slower than just performing a division)