On reducing the times of Moebius transform for subset convolution

№	Пользователь	Рейтинг
1	jiangly	3976
2	tourist	3815
3	jqdai0815	3682
4	ksun48	3614
5	orzdevinwang	3526
6	ecnerwala	3514
7	Benq	3482
8	hos.lyric	3382
9	gamegame	3374
10	heuristica	3357

№	Пользователь	Вклад
1	cry	169
2	-is-this-fft-	165
3	Um_nik	161
3	atcoder_official	161
5	djm03178	157
6	Dominater069	156
7	adamant	154
8	luogu_official	152
9	awoo	151
10	TheScrasse	147

Hi, Codeforces! I want to introduce a simple way to understand the subset convolution and a simple optimization from scratch.

Actually, I do not know this trick is well-known or just out-of-date.

If there are any mistakes, I am happy to fix them (if I am able to) and thanks for pointing out.

We know that subset convolution is equivalent to truncated multivariate polynomial multiplication like

$$$ A(x_1,\dots ,x_n)B(x_1,\dots ,x_n)\bmod (x_1^2,\dots ,x_n^2) $$$

simply written

$$$ A(\mathbf{x})B(\mathbf{x})\bmod{\mathbf{x}^2} $$$

we know that we could compute

$$$ A(\mathbf{x})B(\mathbf{x})\bmod{\left(\mathbf{x}^2-\mathbf{x}\right)} $$$

by using the FFT-like way (Computing $$$A(\mathbf{x})\bmod{(\lbrace x_1,x_1-1\rbrace \times \cdots \times\lbrace x_n,x_n-1\rbrace)}$$$ and using Chinese remainder algorithm to restore the result. This is called the Zeta transform and the Moebius transform respectively). But the coefficients are somehow “mixed”, we can not tell the coefficients of $$$\mathbf{x}^2$$$ and $$$\mathbf{x}$$$. An idea is that we could introduce a new variable $$$t$$$ and computing $$$A(\mathbf{x})B(\mathbf{x})\bmod{\left(\mathbf{x}^2-\mathbf{x}t\right)}$$$, simply let $$$t\gets 0$$$ to get the result of subset convolution. Here is the main idea of the whole algorithm we may familar with:

First, we define $$$s$$$ for each term of $$$A(\mathbf{x}),B(\mathbf{x})$$$:

$$$ s\left(c x_1^{d_1}x_2^{d_2}\cdots x_n^{d_n}\right):=\sum_{k=1}^nd_k $$$

and

$$$ A_k(\mathbf{x})\text{ such that }s(y)=k\text{ for any term }y\text{ of }A_k(\mathbf{x}) $$$

then we compute

$$$ A(\mathbf{x})B(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x}t)}=\left(\sum_i A_i(\mathbf{x})\right)\left(\sum_i B_i(\mathbf{x})\right)\bmod{(\mathbf{x}^2-\mathbf{x}t)} $$$

For the $$$O(n^2)$$$ times of computation of form $$$A_i(\mathbf{x})B_j(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x}t)}$$$, we have an invariant that $$$s(y)+\deg_t(y)=i+j$$$ for any term $$$y$$$ of $$$A_i(\mathbf{x})B_j(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x}t)}$$$. Even we only do the computation of $$$A_i(\mathbf{x})B_j(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x})}$$$, we could restore the result of $$$A_i(\mathbf{x})B_j(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x}t)}$$$, and we do not use these coefficients that with variable $$$t$$$.

Another observation is that $$$\deg_t(A_i(\mathbf{x})B_j(\mathbf{x})\bmod{(\mathbf{x}^2-\mathbf{x}t)})\leq (i+j)/2$$$. For example, if we have $$$\left(\sum_{i+j=4}A_i(\mathbf{x})B_j(\mathbf{x})\right)+\left(\sum_{i+j=9}A_i(\mathbf{x})B_j(\mathbf{x})\right)\bmod{(\mathbf{x}^2-\mathbf{x}t)}$$$, it is still possible for us to tell apart. With this simple trick, we could reduce the times of Moebius transform for about a half.

This is a submission for the optimized code. I did not implement it carefully.

upd: submission reinplemented.

Блог пользователя hly1204