dimss's blog

By dimss, history, 19 months ago, In English

There is such an implementation of the sieve of Eratosthenes:

usual

Then replace the second line with bitset used;

bitset

For comparison, let 's take another linear sieve of Eratosthenes.

linear

Running time table on compiler GNU G++20 11.2.0 (64 bit, winlibs) with #pragma GCC optimize("O3") in all three solutions.

\begin{array}{|c|c|c|c|} \hline N & usual & linear & bitset \cr \hline 2e7 & 202ms & 140ms & 78ms \cr \hline 5e7 & 467ms & 374ms & 249ms \cr \hline 1e8 & 1138ms & 732ms & 686ms \cr \hline 2e8 & 2308ms & ML & 1669ms \cr \hline \end{array}

I guess it's because of the cache. The data takes up 8 times less memory and is cached better. The more cache there is on the computer, the more noticeable the acceleration. For example, I have a 3-fold acceleration on $$$N = 1e8$$$ compared to a conventional sieve.

Unfortunately, this is useless in practice, because almost always in the sieve we want to get some more information.

  • Vote: I like it
  • +85
  • Vote: I do not like it

»
18 months ago, # |
  Vote: I like it 0 Vote: I do not like it

great blog ! I just have a few questions for the following line:

for (int j = min((ll)INT32_MAX, i * 1ll * i); j < N; j += i)

to be more spesific this part :

int j = min((ll)INT32_MAX, i * 1ll * i)

for some reason it seems like what you did there makes it the code 300ms faster for N = 1e8 , and I really couldnt understood it , can you elaborate that part

  • »
    »
    18 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    When you consider all multiples of i larger than i, you do not need to consider j * i for j less than i, because you already considered them before (for j < i). But one must avoid overflow of integers, that is why one need to cast i to long long. If I'm not mistaken this gives O(n log(log(n))) complexity for the sieve. Sorry, if I misunderstood your question.

»
18 months ago, # |
Rev. 2   Vote: I like it +52 Vote: I do not like it

If you just want primes, this is pretty fast too:

Spoiler

Update: Just benchmarked, seems to be 3x faster than the version in the blog.

  • »
    »
    18 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    so cool! could you suggest any blog on this technique ?

    • »
      »
      »
      18 months ago, # ^ |
        Vote: I like it +1 Vote: I do not like it

      It is just noting that all primes other than $$$2, 3$$$ are $$$\pm 1 \pmod 6$$$, and iterating over only those candidates.

»
18 months ago, # |
Rev. 5   Vote: I like it 0 Vote: I do not like it

i like to do this in sieve..
1. vector<char> instead of vector<bool> / bitset