Hi everyone!
The task of finding the minimum linear recurrence for the given starting sequence is typically solved with the Berlekamp-Massey algorithm. In this article I would like to highlight another possible approach, with the use of the extended Euclidean algorithm.
Tl;dr.
If you need to find the minimum linear recurrence for a given sequence $$$F_0, F_1, \dots, F_m$$$, do the following:
Let $$$F(x) = F_m + F_{m-1} x + \dots + F_0 x^m$$$ be the generating function of the reversed $$$F$$$.
Compute the sequence of remainders $$$r_{-2}, r_{-1}, r_0, \dots, r_k$$$ such that $$$r_{-2} = F(x)$$$, $$$r_{-1}=x^{m+1}$$$ and

$$$r_k = r_{k-2} \bmod r_{k-1}.$$$
Let $$$a_k(x)$$$ be a polynomial such that $$$r_k = r_{k-2} - a_k r_{k-1}$$$.
Compute the auxiliary sequence $$$q_{-2}, q_{-1}, q_0, \dots, q_k$$$ such that $$$q_{-2} = 1$$$, $$$q_{-1} = 0$$$ and

$$$q_k = q_{k-2} + a_k q_{k-1}.$$$
Pick $$$k$$$ to be the first index such that $$$\deg r_k < \deg q_k$$$. Let $$$q_k(x) = b_0 x^d - \sum\limits_{i=1}^d b_i x^{d-i}$$$, then it also holds that

$$$b_0 F_n = \sum\limits_{i=1}^d b_i F_{n-i}$$$

for any $$$n \geq d$$$, and $$$d$$$ is the minimum possible. Thus, $$$q_k(x)$$$ divided by $$$b_0$$$ is the characteristic polynomial of the minimum linear recurrence for $$$F$$$.
More generally, one can say that

$$$F(x) q_k(x) \equiv (-1)^k r_k(x) \pmod{x^{m+1}}.$$$
The procedure above essentially corresponds to the extended Euclidean algorithm done on $$$F(x)$$$ and $$$x^{m+1}$$$.
Linear recurrence interpolation
In the previous article on linear recurrences we derived that the generating function of a linear recurrence always looks like

$$$\sum\limits_{n=0}^\infty F_n x^n = \frac{P(x)}{Q(x)},$$$

where $$$P(x)$$$ and $$$Q(x)$$$ are polynomials and $$$\deg P < \deg Q$$$. In this representation, $$$Q(x) = 1 - \sum\limits_{k=1}^d a_k x^k$$$ corresponds to the recurrence

$$$F_n = \sum\limits_{k=1}^d a_k F_{n-k}.$$$
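For example, for the Fibonacci sequence $$$F_n = F_{n-1} + F_{n-2}$$$ with $$$F_0 = 0$$$ and $$$F_1 = 1$$$, it is well known that

$$$\sum\limits_{n=0}^\infty F_n x^n = \frac{x}{1-x-x^2},$$$

so here $$$P(x) = x$$$ and $$$Q(x) = 1 - x - x^2$$$.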
A typical competitive programming task of recovering a linear recurrence is formulated as follows:
Find Linear Recurrence. You're given $$$F_0, \dots, F_{m}$$$. Find $$$a_1, \dots, a_d$$$ with minimum $$$d$$$ such that $$$F_n = \sum\limits_{k=1}^d a_k F_{n-k}$$$.
In formal power series terms it means that we're given $$$F(x) = F_0 + F_1 x + \dots + F_{m}x^{m}$$$ and we need to find $$$\frac{P(x)}{Q(x)}$$$ such that

$$$F(x) \equiv \frac{P(x)}{Q(x)} \pmod{x^{m+1}}$$$
and $$$d = \deg Q(x)$$$ is the minimum possible. In other words, what we're asked to do is to find the Padé approximant of $$$F(x)$$$.
Padé approximants
Given a formal power series $$$f(x) = \sum\limits_{k=0}^\infty f_k x^k$$$, its Padé approximant of order $$$[p/q]$$$ is the rational function

$$$R(x) = \frac{P(x)}{Q(x)}, \quad \deg P \leq p, \quad \deg Q \leq q,$$$

such that $$$f(x) \equiv R(x) \pmod{x^{p+q+1}}$$$. The Padé approximant is denoted $$$[p/q]_{f}(x) = R(x)$$$.
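As a simple standalone example (a classical one, not specific to recurrences), the $$$[1/1]$$$ Padé approximant of $$$e^x = 1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \dots$$$ is

$$$[1/1]_{e^x}(x) = \frac{1 + x/2}{1 - x/2} = 1 + x + \frac{x^2}{2} + \frac{x^3}{4} + \dots,$$$

which agrees with $$$e^x$$$ modulo $$$x^{1+1+1} = x^3$$$, but not modulo $$$x^4$$$.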
By definition, for Padé approximants it generally holds that

$$$f(x) \equiv [p/q]_f(x) \pmod{x^{p+q+1}}.$$$
From this it follows that the generating function of the linear recurrence is also the Padé approximant $$$[\deg P / \deg Q]_F$$$. Moreover, since

$$$F(x) \equiv \frac{P(x)}{Q(x)} \pmod{x^{m+1}},$$$
it also holds that $$$\frac{P(x)}{Q(x)} = [(\deg P + \alpha)/(\deg Q + \beta)]_F$$$ for any $$$\alpha + \beta \leq m - \deg P - \deg Q$$$.
Therefore, the needed fraction $$$\frac{P(x)}{Q(x)}$$$ can be found among $$$[0/m]_F, [1/(m-1)]_F,\dots,[m/0]_F$$$.
Extended Euclidean algorithm
Let's look again at the condition

$$$F(x) \equiv \frac{P(x)}{Q(x)} \pmod{x^{m+1}}.$$$

It translates into the following Bézout's identity:

$$$Q(x) F(x) - K(x) x^{m+1} = P(x),$$$
where $$$K(x)$$$ is a formal power series. When $$$P(x)$$$ is divisible by $$$\gcd(F(x), x^{m+1})$$$, a solution to this equation can be found with the extended Euclidean algorithm. It turns out that the extended Euclidean algorithm can also be used to enumerate all $$$[p/q]_F$$$ with $$$p+q=m$$$.
Let's formalize the extended Euclidean algorithm on $$$A(x)$$$ and $$$B(x)$$$. Starting with $$$r_{-2} = A$$$ and $$$r_{-1} = B$$$, we compute the sequence

$$$r_k = r_{k-2} - a_k(x) r_{k-1} = r_{k-2} \bmod r_{k-1},$$$

where $$$a_k(x)$$$ is the quotient of $$$r_{k-2}$$$ divided by $$$r_{k-1}$$$.
If you're familiar with continued fractions, you could recognize that this sequence corresponds to the representation

$$$\frac{A(x)}{B(x)} = a_0(x) + \cfrac{1}{a_1(x) + \cfrac{1}{\dots + \cfrac{1}{a_k(x)}}} = [a_0; a_1, \dots, a_k].$$$
Same as with rational numbers, for such a sequence it is possible to define the sequence of convergents

$$$\frac{p_k}{q_k} = [a_0; a_1, \dots, a_k], \quad p_k = p_{k-2} + a_k p_{k-1}, \quad q_k = q_{k-2} + a_k q_{k-1},$$$

starting with

$$$\frac{p_{-2}}{q_{-2}} = \frac{0}{1}, \quad \frac{p_{-1}}{q_{-1}} = \frac{1}{0}.$$$
Now, one can prove the following identity:

$$$A(x) q_k(x) - B(x) p_k(x) = (-1)^k r_k(x).$$$
In other words, we get three sequences $$$r_k$$$, $$$p_k$$$ and $$$q_k$$$ that define a family of solutions to Bézout's identity.
Using $$$A(x) = F(x)$$$ and $$$B(x) = x^{m+1}$$$, we conclude that

$$$F(x) q_k(x) - x^{m+1} p_k(x) = (-1)^k r_k(x).$$$

Therefore,

$$$F(x) \equiv \frac{(-1)^k r_k(x)}{q_k(x)} \pmod{x^{m+1}}.$$$
Thus, $$$\frac{(-1)^kr_k}{q_k}$$$ corresponds to $$$[p/q]_F$$$ for any $$$q$$$ from $$$\deg q_k$$$ to $$$m - \deg r_k$$$; for $$$k = 0$$$ this range starts with $$$\deg q_0 = 0$$$ and ends with $$$m - \deg F$$$.
Note that $$$\deg q_{k} - \deg q_{k-1} = \deg a_k = \deg r_{k-2} - \deg r_{k-1}$$$, therefore, telescoping from $$$\deg q_0 = 0$$$,

$$$\deg q_k = \deg r_{-1} - \deg r_{k-1} = m + 1 - \deg r_{k-1}$$$

and

$$$\deg q_{k+1} = m + 1 - \deg r_k.$$$

Therefore, the fraction $$$\frac{(-1)^k r_k}{q_k}$$$ covers all possible approximants for $$$q$$$ from

$$$\deg q_k = m + 1 - \deg r_{k-1}$$$

to

$$$m - \deg r_k = \deg q_{k+1} - 1,$$$

so consecutive ranges are adjacent,
which altogether constitutes the full set of approximants.
Thus, if you find $$$k$$$ such that $$$\deg q_k \leq q$$$ and $$$\deg q_{k+1} > q$$$, assuming $$$p+q=m$$$, it will hold that

$$$[p/q]_F = \frac{(-1)^k r_k(x)}{q_k(x)}.$$$
Note that it is not guaranteed that $$$q_k(0) = 1$$$. Moreover, it is possible that $$$q_k(0)=0$$$. This is due to the fact that with the Euclidean algorithm we can only guarantee the normal form of the denominator

$$$Q(x) = x^d - \sum\limits_{k=1}^d a_k x^{d-k}$$$

rather than

$$$Q(x) = 1 - \sum\limits_{k=1}^d a_k x^k.$$$
To deal with it, one should execute the algorithm for $$$x^m F(x^{-1}) = F_{m} + F_{m-1} x + \dots + F_0 x^m$$$ and reverse the resulting $$$Q(x)$$$.
Finding the linear recurrence
In these terms, the minimum recurrence is defined by the first $$$k$$$ such that $$$\deg r_k < \deg q_k$$$, as this is what corresponds to the condition $$$\deg P < \deg Q$$$.
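As a sanity check, take $$$F = (1, 1, 2, 3, 5, 8)$$$, so $$$m = 5$$$ and the reversed generating function is $$$F(x) = x^5 + x^4 + 2x^3 + 3x^2 + 5x + 8$$$. Running the Euclidean algorithm on $$$F(x)$$$ and $$$x^6$$$ yields the quotients $$$a_0 = 0$$$, $$$a_1 = x - 1$$$, $$$a_2 = -x$$$ and

$$$q_0 = 1, \quad q_1 = x - 1, \quad q_2 = -x^2 + x + 1, \quad r_2 = 13x + 8.$$$

Here $$$\deg r_2 = 1 < 2 = \deg q_2$$$ for the first time, and $$$q_2$$$ divided by its leading coefficient $$$-1$$$ gives $$$x^2 - x - 1$$$, the characteristic polynomial of the Fibonacci recurrence $$$F_n = F_{n-1} + F_{n-2}$$$.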
I have implemented the $$$O(n^2)$$$ algorithm as a part of my polynomial algorithms library:
```cpp
// Returns the characteristic polynomial
// of the minimum linear recurrence for
// the first d+1 elements of the sequence
poly min_rec(int d = deg()) const {
    // R1 = the series reversed mod x^{d+1}, R2 = x^{d+1}
    auto R1 = mod_xk(d + 1).reverse(d + 1), R2 = xk(d + 1);
    auto Q1 = poly(T(1)), Q2 = poly(T(0));
    while(!R2.is_zero()) {
        auto [a, nR] = R1.divmod(R2); // R1 = a*R2 + nR, deg nR < deg R2
        tie(R1, R2) = make_tuple(R2, nR);
        tie(Q1, Q2) = make_tuple(Q2, Q1 + a * Q2);
        if(R2.deg() < Q2.deg()) {
            return Q2 / Q2.lead(); // make the leading coefficient equal to 1
        }
    }
    assert(0);
}
```
You can see this Library Checker submission for further details.
Also here is a library-free version, if you prefer it.
Half-GCD algorithm
The notion above provides the basis for constructing an $$$O(n \log^2 n)$$$ divide and conquer algorithm for computing $$$\gcd(P, Q)$$$ of polynomials and finding the minimum linear recurrence. I have a truly marvelous demonstration of this proposition that this margin is, unfortunately, too narrow to contain. I hope I will be able to formalize the process for good and write an article about it sometime...
As a teaser, here's an example of a problem that (probably) requires Half-GCD:
Library Checker — Inv of Polynomials. You're given $$$f(x)$$$ and $$$h(x)$$$. Compute $$$f^{-1}(x)$$$ modulo $$$h(x)$$$.
Let $$$\frac{f(x)}{h(x)} = [a_0; a_1, \dots, a_k]$$$ and $$$\frac{p_{k-1}}{q_{k-1}} = [a_0; a_1, \dots, a_{k-1}]$$$, then $$$f q_{k-1} - h p_{k-1} = (-1)^{k-1} r_{k-1} = (-1)^{k-1} \gcd(f, h)$$$.
Therefore, if $$$r_{k-1} = \gcd(f, h)$$$ is non-constant, the inverse doesn't exist. Otherwise, the inverse is $$$(-1)^{k-1} q_{k-1}(x)$$$ divided by the constant $$$r_{k-1}$$$.
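As a small illustration (my own example, not from the problem), take $$$f(x) = x + 1$$$ and $$$h(x) = x^2 + 1$$$ over the rationals. The Euclidean algorithm gives $$$\frac{f}{h} = \left[0; x - 1, \frac{x+1}{2}\right]$$$, so $$$k = 2$$$, $$$q_{k-1} = x - 1$$$, $$$p_{k-1} = 1$$$ and

$$$f q_1 - h p_1 = (x+1)(x-1) - (x^2+1) = -2 = (-1)^1 r_1.$$$

The gcd $$$r_1 = 2$$$ is a nonzero constant, so the inverse exists: $$$f^{-1} \equiv \frac{(-1)^1 (x-1)}{2} = \frac{1-x}{2} \pmod{h}$$$, which can be checked directly, as $$$(x+1) \cdot \frac{1-x}{2} = \frac{1-x^2}{2} \equiv 1 \pmod{x^2+1}$$$.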
In the problem, $$$n \leq 5 \cdot 10^4$$$, so you need to do something better than the $$$O(n^2)$$$ Euclidean algorithm.