Computing n! modulo pᵏ for small p

Hi everyone!

There is an article on cp-algorithms about how to compute $$$n!$$$ modulo a prime number $$$p$$$. Today I was wondering how to do it if we need to compute it modulo $$$p^k$$$ rather than just $$$p$$$. It turns out one can do it efficiently if $$$p$$$ is small. Great thanks to Endagorion for sharing it with me!

Task formulation and outline of result

To clarify the task a bit, our ultimate goal here is to be able to compute e.g. binomial coefficients modulo $$$p^k$$$. Thus, what we really need is to represent $$$n! = p^t a$$$, where $$$\gcd(a, p)=1$$$, and then report $$$t$$$ and $$$a$$$ modulo $$$p^k$$$. This is sufficient to also compute $$$\binom{n}{r}$$$ modulo $$$p^k$$$.
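
To see why the pair $$$(t, a)$$$ is enough for binomial coefficients, here is a minimal sketch. The names `fact_parts_naive` and `binom_mod_pk` are made up for this illustration, and the factorial decomposition is done by an $$$O(n)$$$ brute force, so it only stands in for the fast algorithms discussed in this post.

```python
from math import comb

def fact_parts_naive(n, p, k):
    """Return (t, a mod p^k) with n! = p^t * a and gcd(a, p) = 1.
    Brute-force O(n) placeholder for a fast decomposition."""
    mod = p**k
    t, a = 0, 1
    for j in range(1, n + 1):
        while j % p == 0:      # strip factors of p and count them
            j //= p
            t += 1
        a = a * j % mod
    return t, a

def binom_mod_pk(n, r, p, k, fact_parts=fact_parts_naive):
    """C(n, r) mod p^k from the (t, a) pairs of n!, r! and (n-r)!."""
    mod = p**k
    tn, an = fact_parts(n, p, k)
    tr, ar = fact_parts(r, p, k)
    ts, a_s = fact_parts(n - r, p, k)
    e = tn - tr - ts           # exponent of p in C(n, r), never negative
    if e >= k:
        return 0
    # ar * a_s is coprime to p, hence invertible modulo p^k (Python 3.8+)
    return pow(p, e, mod) * an * pow(ar * a_s, -1, mod) % mod

print(binom_mod_pk(10, 3, 2, 5), comb(10, 3) % 2**5)
```

Any faster routine with the same return convention could be passed as `fact_parts` instead.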

We will show that, assuming that polynomial multiplication of size $$$n$$$ requires $$$O(n \log n)$$$ operations, and assuming that arithmetic operations modulo $$$p^k$$$ take $$$O(1)$$$, we can find $$$t$$$ and $$$a$$$ in $$$O(d^2 + dk\log k)$$$, where $$$d=\log_p n$$$. It requires $$$O(pk\log^2 k)$$$ pre-computation that takes $$$O(pk \log k)$$$ memory.

Motivational example

Xiaoxu Guo Contest 3 — Binomial Coefficient. Given $$$1 \leq n,k \leq 10^{18}$$$, find $$$\binom{n}{k}$$$ modulo $$$2^{32}$$$.

There are also some Project Euler problems that require it, including ones with several queries for distinct $$$n$$$ and $$$k$$$.

Algorithm with polynomials

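Let $$$k=\lfloor \frac{n}{p} \rfloor$$$ (in the next few formulas $$$k$$$ temporarily denotes this quotient rather than the exponent in $$$p^k$$$), then we can represent $$$n!$$$ as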

$$$ n! = p^k k! \prod\limits_{\substack{1 \leq j \leq n \\ \gcd(j, p)=1}}j. $$$

So, there is a part contributed by numbers divisible by $$$p$$$, which can be reduced to computing $$$k!$$$, and a part contributed by numbers not divisible by $$$p$$$. Let $$$n = a_0 + a_1 p + \dots + a_d p^d$$$, then the latter can be represented as follows:

$$$ \prod\limits_{\substack{1 \leq j \leq n \\ \gcd(j, p)=1}}j = \prod\limits_{i=0}^d \prod\limits_{\substack{1 \leq j \leq a_i p^i \\\gcd(j, p)=1}} \left(\left\lfloor\frac{n}{p^{i+1}} \right\rfloor p^{i+1}+j\right). $$$
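
The block decomposition above can be sanity-checked with exact integers. The sketch below uses two hypothetical helper names: one multiplies the numbers coprime to $$$p$$$ directly, and the other walks over the base-$$$p$$$ digits $$$a_i$$$ and multiplies the shifted blocks.

```python
def coprime_product_direct(n, p):
    """Product of all j in [1, n] that are not divisible by p."""
    prod = 1
    for j in range(1, n + 1):
        if j % p:
            prod *= j
    return prod

def coprime_product_blocks(n, p):
    """Same product, grouped by base-p digits a_i of n."""
    prod = 1
    i = 0
    while p**i <= n:
        a_i = (n // p**i) % p                    # i-th base-p digit of n
        base = (n // p**(i + 1)) * p**(i + 1)    # higher digits of n
        for j in range(1, a_i * p**i + 1):
            if j % p:
                prod *= base + j
        i += 1
    return prod

print(coprime_product_direct(50, 3) == coprime_product_blocks(50, 3))
```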

To compute it quickly, we can define a family of polynomials $$$P_{i,a}(x)$$$ for $$$0 \leq a \leq p$$$ such that

$$$ P_{i, a}(x) = \prod\limits_{\substack{1 \leq j\leq a p^{i-1} \\ \gcd(j, p)=1}}(xp^{i}+j), $$$

so the value of the factorial would be represented as

$$$ n! = p^k k! \prod\limits_{i=0}^d P_{i+1,a_i}\left(\left\lfloor\frac{n}{p^{i+1}} \right\rfloor\right), $$$

and expanding $$$k!$$$ it then rewrites into

$$$ n! = \prod\limits_{i=0}^{d} p^{\left\lfloor \frac{n}{p^{i+1}} \right\rfloor} \prod\limits_{j=0}^{d-i} P_{j+1,a_{i+j}}\left(\left\lfloor\frac{n}{p^{i+j+1}} \right\rfloor\right), $$$

which simplifies as

$$$ \boxed{n! = \prod\limits_{i=0}^{d} p^{\left\lfloor \frac{n}{p^{i+1}} \right\rfloor} \prod\limits_{j=0}^{i} P_{j+1,a_{i}}\left(\left\lfloor\frac{n}{p^{i+1}} \right\rfloor\right)} $$$

Now, what would it take us to use this setup? First of all, notice that $$$P_{i,a}(x)$$$ can be computed from one another:

$$$ P_{i,a}(x) = \prod\limits_{\substack{1 \leq j\leq a p^{i-1} \\ \gcd(j, p)=1}}(xp^{i}+j) = \prod\limits_{t=0}^{a-1}\prod\limits_{\substack{1 \leq j\leq p^{i-1} \\ \gcd(j, p)=1}}(xp^{i}+tp^{i-1}+j)=\prod\limits_{t=0}^{a-1} P_{i-1,p}(px+t). $$$

Note that, for a shorter and more consistent implementation, this recurrence also mostly works for $$$i=1$$$ if we put $$$P_{0, p}(x) = x+1$$$, but for $$$P_{1,p}$$$ the product over $$$t$$$ should go up to $$$p-2$$$ instead of $$$p-1$$$. We should also note that for $$$P_{i,a}(x)$$$, we only care about coefficients up to $$$x^{\lfloor k/i \rfloor}$$$: every factor has the form $$$xp^i+j$$$, so the coefficient of $$$x^m$$$ is divisible by $$$p^{im}$$$, and the larger ones are divisible by $$$p^k$$$.

This allows us to compute $$$P_{i,a}(x)$$$ in $$$O(\frac{pk \log k}{i})$$$ for all $$$a$$$ for a given $$$i$$$. Over all $$$i$$$ from $$$1$$$ to $$$k$$$ it sums up to $$$O(pk \log^2 k)$$$ time and $$$O(p k \log k)$$$ memory. Then, evaluating all the polynomials for any specific $$$n$$$ requires $$$O(d^2 + dk \log k)$$$ operations, where $$$d = \log_p n$$$.

As it requires some manipulations with polynomials, I implemented it in Python with sympy just as a proof of concept:

Code

from sympy import poly
from sympy.abc import x

p = 2
k = 6
mod = p**k

# replace P(x) with P(arg)
def subs_expand(P, arg):
    return poly(P.subs(x, arg).as_expr().expand(), x, modulus=mod)

P = [[poly(x+1, modulus=mod)]]

# P[i] is constant for larger i
for i in range(1, k+2):
    P.append([poly(1, x, modulus=mod)])
    for j in range(p-(i==1)):
        P[-1].append(P[-1][-1]*subs_expand(P[-2][-1], p*x+j) % poly(x**(k//i+1), modulus=mod))

def fct(n, h=0):
    if n == 0:
        return 0, 1
    t = n // p
    d = n % p
    k, a = fct(t, h+1)
    k += t
    for i in range(h+1):
        a = a * P[min(len(P)-1, i+1)][d].subs(x, t) % mod
    return k, a
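
For readers without sympy, the same tables can be sketched with plain coefficient lists. This is only an illustration under stated assumptions: `build_tables`, `evaluate` and `fact_parts` are hypothetical names, polynomial multiplication is the naive quadratic one (so the complexity bounds above do not apply), and, like the code above, it reuses the last table for all deeper levels.

```python
def build_tables(p, k):
    """P[i][a] = coefficients of P_{i,a}(x) mod p^k, truncated to degree k // i."""
    mod = p**k

    def mul(A, B, deg):
        # naive product of coefficient lists, truncated to degree <= deg
        C = [0] * (min(len(A) + len(B) - 2, deg) + 1)
        for ia, ca in enumerate(A):
            for ib, cb in enumerate(B):
                if ia + ib <= deg:
                    C[ia + ib] = (C[ia + ib] + ca * cb) % mod
        return C

    def shift(A, t, deg):
        # A(p*x + t) by Horner's rule, truncated to degree <= deg
        R = [0]
        for c in reversed(A):
            R = mul(R, [t, p], deg)
            R[0] = (R[0] + c) % mod
        return R

    P = [[[1, 1]]]                       # P[0][-1] plays the role of P_{0,p}(x) = x + 1
    for i in range(1, k + 2):
        deg = k // i
        row = [[1]]
        for t in range(p - (i == 1)):    # one factor fewer for i = 1
            row.append(mul(row[-1], shift(P[-1][-1], t, deg), deg))
        P.append(row)
    return P

def evaluate(A, x, mod):
    r = 0
    for c in reversed(A):
        r = (r * x + c) % mod
    return r

def fact_parts(n, p, k, P, h=0):
    """Call with h = 0 to get (t, a mod p^k) such that n! = p^t * a."""
    if n == 0:
        return 0, 1
    mod = p**k
    q, d = divmod(n, p)
    t, a = fact_parts(q, p, k, P, h + 1)
    t += q
    for i in range(h + 1):
        a = a * evaluate(P[min(len(P) - 1, i + 1)][d], q, mod) % mod
    return t, a

print(fact_parts(10, 3, 4, build_tables(3, 4)))
```

Note that for `h > 0` the recursive calls do not return the decomposition of the factorial of their argument; they already include the extra factors from the boxed formula, exactly as in the sympy version.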

Algorithm with $$$p$$$-adic logarithms and exponents

The algorithm above requires some heavy machinery on polynomials. I also found an alternative approach that computes $$$t$$$ and $$$a$$$ for any given $$$n$$$ in $$$O(dk^2)$$$ with $$$O(pk)$$$ precomputation, where $$$d = \log_p n$$$. It doesn't rely on polynomial operations, except for Lagrange interpolation.

Let $$$n=pt+b$$$, where $$$0 \leq b < p$$$, then we can represent its factorial as

$$$ n! = p^t t! \prod\limits_{i=1}^{b} \prod\limits_{j=0}^{t} \left(pj+i\right)\prod\limits_{i=b+1}^{p-1} \prod\limits_{j=0}^{t-1} \left(pj+i\right). $$$

We can further rewrite it as

$$$ n! = p^t t! (p-1)!^t b! \prod\limits_{i=1}^{b} \prod\limits_{j=0}^{t} \left(1+\frac{j}{i}p\right)\prod\limits_{i=b+1}^{p-1} \prod\limits_{j=0}^{t-1} \left(1+\frac{j}{i}p\right) $$$

Let's learn how to compute the product

$$$ A(b, t) = \prod\limits_{i=1}^b \prod\limits_{j=0}^t \left(1+\frac{j}{i}p\right), $$$

as with it we can represent the factorial as

$$$ n! = p^t t! (p-1)!^t b! A(b, t)\frac{A(p-1,t-1)}{A(b,t-1)}. $$$
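
This identity holds over the rationals, so it can be checked exactly before any $$$p$$$-adic machinery is involved. A minimal sketch with `fractions.Fraction` follows; the function names are illustrative only.

```python
from fractions import Fraction
from math import factorial

def A(b, t, p):
    """Exact value of prod_{i=1..b} prod_{j=0..t} (1 + j*p/i); empty products are 1."""
    prod = Fraction(1)
    for i in range(1, b + 1):
        for j in range(0, t + 1):
            prod *= 1 + Fraction(j * p, i)
    return prod

def factorial_via_A(n, p):
    """Right-hand side of the identity for n = p*t + b."""
    t, b = divmod(n, p)
    return (p**t * factorial(t) * factorial(p - 1)**t * factorial(b)
            * A(b, t, p) * A(p - 1, t - 1, p) / A(b, t - 1, p))

print(factorial_via_A(17, 5) == factorial(17))
```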

We can take a $$$p$$$-adic logarithm from the product to get

$$$ \boxed{\log A(b, t) = \sum\limits_{i=1}^{b} \sum\limits_{j=0}^{t} \log\left(1+\frac{j}{i}p\right)=\sum\limits_{z=1}^\infty \frac{(-1)^{z+1}p^z}{z} \sum\limits_{i=1}^{b} i^{-z} \sum\limits_{j=0}^{t} j^z} $$$

We can precompute sums of $$$i^{-z}$$$ for each $$$z$$$ up to $$$k$$$ and $$$b$$$ up to $$$p-1$$$ in $$$O(pk)$$$, and we can find the sum of $$$j^z$$$ for $$$j$$$ from $$$0$$$ to $$$t$$$ in $$$O(z)$$$ via Lagrange interpolation. Therefore, with $$$O(pk)$$$ precomputation, we can compute $$$\log A(b, t)$$$ in $$$O(k^2)$$$, which then allows us to find $$$n!$$$ in $$$O(dk^2)$$$, where $$$d = \log_p n$$$. I implemented it in Python as a proof of concept, but I didn't bother with faster sums over $$$i$$$ and $$$j$$$:
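
The precomputation of the sums over $$$i$$$ could look roughly like this. It is a sketch, not part of the proof of concept: `inverse_power_prefix_sums` and its `zmax` parameter (how many values of $$$z$$$ to tabulate) are assumptions made for the illustration, and inverses are taken with Python's `pow(i, -1, mod)`, which needs Python 3.8+.

```python
def inverse_power_prefix_sums(p, k, zmax):
    """S[z][b] = sum_{i=1..b} i^(-z) mod p^k for 1 <= z <= zmax, 0 <= b <= p-1."""
    mod = p**k
    inv = [0] + [pow(i, -1, mod) for i in range(1, p)]
    cur = [1] * p                  # cur[i] holds inv[i]**z for the current z
    S = [[0] * p]                  # row z = 0 is an unused placeholder
    for z in range(1, zmax + 1):
        row = [0] * p
        for i in range(1, p):
            cur[i] = cur[i] * inv[i] % mod
            row[i] = (row[i - 1] + cur[i]) % mod
        S.append(row)
    return S

S = inverse_power_prefix_sums(5, 3, 6)
print(S[2][4])  # sum of i^(-2) for i = 1..4 modulo 125
```

With such a table, a call like `sumi(z, b)` in the code below could be replaced by a lookup `S[z][b]`.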

Code

p = 3
k = 7
mod = p**k
phi = (p-1)*p**(k-1)

def inv(z):
    return pow(z, phi-1, mod)

def valuation(z):
    ans = 0
    while z % p == 0:
        ans += 1
        z //= p
    return ans

def sumi(z, b):
    return sum(pow(inv(i), z, mod) for i in range(1, b+1)) % mod

def sumj(z, t):
    return sum(pow(j, z, mod) for j in range(t+1)) % mod

def logA(b, t):
    ans = 0
    for z in range(1, 40):
        pw = valuation(z)
        ans += (-1)**(z+1) * p**(z-pw) * inv(z//p**pw) * sumi(z, b) * sumj(z, t) % mod
    return ans % mod

F = [1]
for i in range(1, p):
    F.append(F[-1]*i % mod)

def pexp(lg):
    assert lg % p == 0
    ans = 0
    fct = 1
    for z in range(40):
        pw = valuation(fct)
        ans += pow(p, z-pw, mod) * inv(fct//p**pw) * pow(lg//p, z, mod) % mod
        fct = fct*(z+1)
    assert ans % p == 1
    return ans % mod

def fct(n):
    if n == 0:
        return 0, 1
    t, b = n // p, n % p
    k, a = fct(t)
    k += t
    a = a * pow(F[-1], t, mod) * F[b] % mod
    a = a * pexp(logA(b, t) + logA(p-1, t-1) - logA(b, t-1)) % mod
    return k, a

Note: Due to inherent properties of $$$p$$$-adic logarithms and exponents, the algorithm only works with $$$p>2$$$, and it might return $$$-n!$$$ instead with $$$p=2$$$. This is because $$$\exp \log z = -z$$$ with $$$p=2$$$ when $$$z$$$ has remainder $$$3$$$ modulo $$$4$$$, and it should be tracked separately.
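
The sign issue for $$$p=2$$$ can be observed numerically. The sketch below sums truncated series for the $$$2$$$-adic logarithm and exponent with exact fractions and reduces the result modulo $$$2^{16}$$$. The helper names, the modulus and the number of terms are arbitrary choices for the illustration.

```python
from fractions import Fraction
from math import factorial

K = 16
MOD = 2**K

def reduce_mod(q):
    """Reduce a fraction with odd denominator modulo 2^K."""
    return q.numerator * pow(q.denominator, -1, MOD) % MOD

def log2adic(z, terms=64):
    """log(z) mod 2^K for odd z, via log(1+u) = sum (-1)^(n+1) u^n / n."""
    u = z - 1
    return reduce_mod(sum(Fraction((-1)**(n + 1) * u**n, n) for n in range(1, terms)))

def exp2adic(y, terms=64):
    """exp(y) mod 2^K for y divisible by 4, via sum y^n / n!."""
    return reduce_mod(sum(Fraction(y**n, factorial(n)) for n in range(terms)))

print(exp2adic(log2adic(5)) == 5)          # 5 = 1 (mod 4): recovered as is
print(exp2adic(log2adic(3)) == MOD - 3)    # 3 = 3 (mod 4): sign is flipped
```

So a value with remainder $$$3$$$ modulo $$$4$$$ comes back negated, which is the sign that would have to be tracked separately.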

Some other approaches

There is also a blog by prabowo, and a scholarly article by Andrew Granville. I'm not sure whether they talk about the same algorithms as described here.

