I decided to write this blog because I was doing a problem on a local judge and wanted to speed up my brute force code. However, it was quite difficult to find resources on SIMD vectorization, so I compiled the resources I found, hopefully allowing more people to learn to scam brute force solutions.
Introduction
SIMD stands for single instruction, multiple data. SIMD lets us issue vector instructions, which can make the code run faster. Vector instructions handle short vectors (2-16 elements) of integers / floats / characters in parallel: several values are packed side by side into one wide register, and a single instruction operates on all of them at once.
To make use of SIMD, we have to add the following lines at the top of the code.
#include <nmmintrin.h>
#pragma GCC target("avx2")
There are three 128-bit data types: one for integers, one for floats and one for doubles.
__m128i i; // stores 4 integers since each integer is 32 bit and 128/32=4
__m128 f; // stores 4 floats since each float is 32 bit and 128/32=4
__m128d d; // stores 2 doubles since each double is 64 bit and 128/64=2
The integer data type can also be used to store 16 characters, since each character is 8 bits and 128/8=16.
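As a quick sanity check on the lane counts, here is a minimal sketch that derives them from the type sizes. The `lanes_*` names are made up for illustration, and it assumes the usual x86-64 sizes (32-bit int, 8-bit char). Every 128-bit register is 16 bytes, so the number of elements is just 16 divided by the element size.

```cpp
#include <nmmintrin.h>

// All three types are 128 bits = 16 bytes wide.
static_assert(sizeof(__m128i) == 16, "__m128i should be 16 bytes");
static_assert(sizeof(__m128)  == 16, "__m128 should be 16 bytes");
static_assert(sizeof(__m128d) == 16, "__m128d should be 16 bytes");

// Hypothetical constants: how many elements of each kind fit in one register.
constexpr int lanes_int    = sizeof(__m128i) / sizeof(int);     // 32-bit lanes
constexpr int lanes_float  = sizeof(__m128)  / sizeof(float);   // 32-bit lanes
constexpr int lanes_double = sizeof(__m128d) / sizeof(double);  // 64-bit lanes
constexpr int lanes_char   = sizeof(__m128i) / sizeof(char);    // 8-bit lanes
```

On a typical x86-64 compiler these work out to 4, 4, 2 and 16 respectively.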
In order to do operations on these data types, we will make use of SIMD intrinsics.
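To give a first taste, here is a small sketch that adds four pairs of ints with a single vector add. The helper name `add4` is made up for illustration. The intrinsics used (`_mm_loadu_si128`, `_mm_add_epi32`, `_mm_storeu_si128`) are all SSE2, which is part of the x86-64 baseline, so this particular snippet should compile even without the pragma.

```cpp
#include <nmmintrin.h>

// Hypothetical helper: out[i] = a[i] + b[i] for i = 0..3.
// Each pointer must point to at least 4 ints; no alignment is required
// because the unaligned load/store variants are used.
void add4(const int* a, const int* b, int* out) {
    // Load 4 ints (128 bits) from each input array.
    __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));
    // One instruction adds all four 32-bit lanes.
    __m128i vsum = _mm_add_epi32(va, vb);
    // Write the 4 results back to memory.
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), vsum);
}
```

Calling `add4` on {1, 2, 3, 4} and {10, 20, 30, 40} should leave {11, 22, 33, 44} in `out`. In a real brute force you would call something like this inside a loop that steps through the arrays 4 elements at a time.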