[Tutorial] GCD Convolution

№	Пользователь	Рейтинг
1	tourist	4009
2	jiangly	3823
3	Benq	3738
4	Radewoosh	3633
5	jqdai0815	3620
6	orzdevinwang	3529
7	ecnerwala	3446
8	Um_nik	3396
9	ksun48	3390
10	gamegame	3386

№	Пользователь	Вклад
1	cry	167
2	Um_nik	163
3	maomao90	162
3	atcoder_official	162
5	adamant	159
6	-is-this-fft-	158
7	awoo	157
8	TheScrasse	154
9	Dominater069	153
9	nor	153

On today's POI training camp round I have learnt a nice technique that could possibly be useful in some number theory problems. I couldn't find any CF article on it, so I think it's fair enough to share it on my own.

Remark on used notation

In some sums I will use an Iverson notation

Problem: Squarefree Function

Let's define a Squarefree Function $$$f(x)$$$ for any positive integer $$$x$$$ as $$$x$$$ divided by a greatest perfect square, that divides $$$x$$$.

For example: $$$f(1) = 1$$$, $$$f(2) = 2$$$, $$$f(4) = 1$$$, $$$f(6) = 6$$$, $$$f(27) = 3$$$, $$$f(54) = 6$$$, $$$f(800) = 2$$$

Given an array $$$a$$$ of $$$n \leq 10^5$$$ positive integers, where each $$$a_i \leq 10^5$$$ compute sum

\begin{gather} \sum\limits_{1 \leq i < j \leq n}f(a_i\cdot a_j) (mod 10^9 + 7) \end{gather}

Technique: GCD Convolution

You might probably heard about a Sum Convolution. For two arrays $$$b$$$, and $$$c$$$, it is defined as an array $$$d$$$ such that \begin{gather} d_k = \sum\limits_{i + j = k}b_i\cdot c_j \end{gather} If not, it's basically the same thing as a polynomial multiplication. If $$$B(x) = b_0 + b_1x + b_2x^2 + ... + b_nx^n$$$, and $$$C(x) = c_0 + c_1x + c_2x^2 + ... + c_nx^n$$$, then $$$(B \cdot C)(x) = d_0 + d_1x + d_2x^2 + ... + d_{2n}x^{2n}$$$

Let's define GCD Convolution by analogy

Definition

A GCD Convolution of two arrays $$$b$$$, and $$$c$$$, consisting of positive integers, is an array $$$d$$$ such that \begin{gather} d_k = \sum\limits_{gcd(i,j) = k}b_i\cdot c_j \end{gather}

Algorithm to find GCD Convolution

Of course, we can compute it naively by iterating over all pairs of indicies. If $$$b$$$ and $$$c$$$ consists of $$$n$$$ elements then the overall complexity would be $$$O(n^2log(max(b) + max(c)))$$$. But it turns out, that we can do better.

Let's look at the sum $$$d_k$$$ values, with indicies divisible by some integer $$$g$$$, so that is $$$k = gm$$$ for some integer m. \begin{gather} \sum\limits_{m=1}^{n/g}d_{gm} = \sum\limits_{m=1}^{n/g}\sum\limits_{gcd(i,j) = gm}b_i\cdot c_j = \sum\limits_{g | gcd(i,j)}b_i\cdot c_j \end{gather}

From the definition of gcd, we know that $$$g | gcd(i,j) \Leftrightarrow g | i \wedge g | j$$$ \begin{gather} \sum\limits_{g | gcd(i,j)}b_i\cdot c_j = \sum\limits_{i,j}b_i\cdot c_j[g \;|\; gcd(i,j)] = \sum\limits_{i,j}b_i\cdot c_j[g \;|\; i][g \;|\; j] = \end{gather} \begin{gather} =\sum\limits_{i,j}\left(b_i[g \;|\; i]\right)\cdot \left(c_j[g \;|\; j]\right) = \left(\sum\limits_{g|i}b_i\right)\left(\sum\limits_{g|j}c_j\right) \end{gather}

We can define $$$B_g = \sum_{m=1}^{n/g}b_{gm}$$$, and $$$C_g = \sum_{m=1}^{n/g}c_{gm}$$$, and $$$D_g = \sum_{m=1}^{n/g}d_{gm}$$$. From above equation one could easily derive $$$D_g = B_g\cdot C_g$$$. Knowing that $$$O(n + \frac{n}{2} + \frac{n}{3} + ...) = O(n\log n)$$$, arrays $$$B$$$ and $$$C$$$ can be computed directly from their definitions in $$$O(n\log n)$$$.

Recovering a $$$d_k$$$ values from D array is simple. All we need is just subtract all the summands of $$$D_i$$$ except for the smallest. So, formally, we have \begin{gather} d_k = D_k - \sum\limits_{m=2}^{n/k}d_{km} \end{gather} Which can be computed using dynamic programming, starting from $$$k = n$$$.

So, the overall complexity of computing a GCD Convolution of two arrays of size $$$n$$$ is $$$O(n\log n)$$$.

Back to original problem

We can see, that \begin{gather} f(a_i\cdot a_j) = \frac{f(a_i)\cdot f(a_j)}{gcd(f(a_i), f(a_j))^2} \end{gather}

So, having an array $$$w_{f(a_i)} = \sum\limits_if(a_i)$$$ all we need is just to compute a GCD Convolution of $$$w$$$ with itself. Let's denote that convolution by $$$d$$$. Then, from definition \begin{gather} \sum\limits_{i,j :\;gcd(f(a_i), f(a_j)) = k} \frac{f(a_i)\cdot f(a_j)}{gcd(f(a_i), f(a_j))^2} = \frac{d_k}{k^2} \end{gather}

So answer to our problem is just a sum \begin{gather} \sum\limits_{k=1}^{max(f(a_i))}\frac{d_k}{k^2} \end{gather}

If we denote $$$A = max(a_i)$$$ and $$$F = max(f(a_i))$$$, then overall complexity of our solution is $$$O(n\log A + F \log F)$$$

Practice problems

Actually, I don't have any. I will be glad if you share some problems in comments. All I have is just this:

GCD Convolution

Rev.	Кто	Когда	Δ	Комментарий
en16	szaranczuk	2023-02-05 16:37:54	33
en15	szaranczuk	2023-02-05 01:19:42	0	(published)
en14	szaranczuk	2023-02-05 01:17:32	108
en13	szaranczuk	2023-02-05 01:01:55	653
en12	szaranczuk	2023-02-05 00:10:36	15
en11	szaranczuk	2023-02-04 23:51:22	151
en10	szaranczuk	2023-02-04 23:37:19	54
en9	szaranczuk	2023-02-04 23:33:55	7
en8	szaranczuk	2023-02-04 23:31:29	118
en7	szaranczuk	2023-02-04 22:50:30	340
en6	szaranczuk	2023-02-04 22:44:31	682
en5	szaranczuk	2023-02-04 22:15:46	1037
en4	szaranczuk	2023-02-04 21:39:05	627
en3	szaranczuk	2023-02-04 21:18:33	233
en2	szaranczuk	2023-02-04 21:06:12	325
en1	szaranczuk	2023-02-04 20:59:01	1344	Initial revision (saved to drafts)