I was practising FFT with some problems, but I found out that my implementation of FFT seems to take longer than what it takes to fit into the TL.
For the problem 528D. Fuzzy Search (I understand that the problem also has a bitmask solution, but I'd like to practise FFT), I implemented the FFT algorithm (code here) for the string comparison, yet it takes more than 3 seconds to compute 12 FFT(and inverse) of a vector with a size of (2^18).
I wonder how I can improve my implementation's runtime?
Try getting rid of allocating O(NlogN) memory units and O(NlogN) cos/sin computations.
Thanks for the suggestion, the runtime has improved a lot! =D
Try to search for
fft
on codeforces, I remember there were few topics with useful comments.Here's one: http://codeforces.me/blog/entry/18543?#comment-235293
Also you can reduce number of calls to
ifft
(4 -> 1
) by doing it only in the end and before just accumulatingvs[i]
.You can see my implementation of the problem using FFT.