Recovering a linear recurrence with the extended Euclidean algorithm

№	Пользователь	Рейтинг
1	tourist	3856
2	jiangly	3747
3	orzdevinwang	3706
4	jqdai0815	3682
5	ksun48	3591
6	gamegame	3477
7	Benq	3468
8	Radewoosh	3462
9	ecnerwala	3451
10	heuristica	3431

№	Пользователь	Вклад
1	cry	168
2	-is-this-fft-	162
3	Dominater069	160
4	Um_nik	159
5	atcoder_official	156
6	djm03178	153
6	adamant	153
8	luogu_official	149
9	awoo	148
10	TheScrasse	146

Hi everyone!

The task of finding the minimum linear recurrence for the given starting sequence is typically solved with the Berlekamp-Massey algorithm. In this article I would like to highlight another possible approach, with the use of the extended Euclidean algorithm.

Great thanks to nor for the proofreading and all the useful comments to make the article more accessible and rigorous.

Tl'dr.

The procedure below is essentially a formalization of the extended Euclidean algorithm done on $$$F(x)$$$ and

Unable to parse markup [type=CF_MATHJAX]

If you need to find the minimum linear recurrence for a given sequence

Unable to parse markup [type=CF_MATHJAX]

, do the following:

Let

Unable to parse markup [type=CF_MATHJAX]

be the generating function of the reversed $$$F$$$.

Compute the sequence of remainders

Unable to parse markup [type=CF_MATHJAX]

such that

Unable to parse markup [type=CF_MATHJAX]

and

Unable to parse markup [type=CF_MATHJAX]

Let

Unable to parse markup [type=CF_MATHJAX]

be a polynomial such that

Unable to parse markup [type=CF_MATHJAX]

Compute the auxiliary sequence

Unable to parse markup [type=CF_MATHJAX]

such that

Unable to parse markup [type=CF_MATHJAX]

and

Unable to parse markup [type=CF_MATHJAX]

Pick $$$k$$$ to be the first index such that $$$\deg r_k < \deg q_k$$$. Let

Unable to parse markup [type=CF_MATHJAX]

, then it also holds that

Unable to parse markup [type=CF_MATHJAX]

for any

Unable to parse markup [type=CF_MATHJAX]

and $$$d$$$ is the minimum possible. Thus,

Unable to parse markup [type=CF_MATHJAX]

divided by $$$a_0$$$ is the characteristic polynomial of the minimum linear for $$$F$$$.

More generally, one can say for such $$$k$$$ that

Unable to parse markup [type=CF_MATHJAX]

Linear recurrence interpolation

In the previous article on linear recurrences we derived that the generating function of the linear recurrence always looks like

$$$ G(x) = \frac{P(x)}{Q(x)}, $$$

where $$$P(x)$$$ and $$$Q(x)$$$ are polynomials and $$$\deg P < \deg Q$$$. In this representation, $$$Q(x) = 1 - \sum\limits_{k=1}^d a_k x^k$$$ corresponds to

$$$ F_n = \sum\limits_{k=1}^d a_k F_{n-k}. $$$

Typical competitive programming task of recovering linear recurrence is formulated as follows:

Library Checker — Find Linear Recurrence. You're given $$$F_0, \dots, F_{m}$$$. Find $$$a_1, \dots, a_d$$$ with minimum $$$d$$$ such that $$$ F_n = \sum\limits_{k=1}^d a_k F_{n-k}. $$$

In formal power series terms it means that we're given $$$F(x) = F_0 + F_1 x + \dots + F_{m}x^{m}$$$ and we need to find $$$\frac{P(x)}{Q(x)}$$$ such that

$$$ F(x) \equiv \frac{P(x)}{Q(x)} \pmod{x^{m+1}} $$$

and $$$d$$$ is the minimum possible. Note that in this particular problem it is not required for

Unable to parse markup [type=CF_MATHJAX]

to be $$$0$$$, hence

Unable to parse markup [type=CF_MATHJAX]

In this terms, as we will see later, what the problem asks us to find is essentially the Padé approximant of $$$F(x)$$$.

Padé approximants

Given a formal power series $$$f(x) = \sum\limits_{k=0}^\infty f_k x^k$$$, its Padé approximant of order $$$[p/q]$$$ is the rational function

Unable to parse markup [type=CF_MATHJAX]

such that

Unable to parse markup [type=CF_MATHJAX]

. The Padé approximant is denoted $$$[p/q]_{f}(x) = R(x)$$$.

For normalization purposes, we will demand

Unable to parse markup [type=CF_MATHJAX]

, that is

Unable to parse markup [type=CF_MATHJAX]

, but it is acceptable to have

Unable to parse markup [type=CF_MATHJAX]

Explanation: The definition of Padé approximants requires some normalization, as you could obtain different representations of the same rational function by multiplying its numerator and denominator with the same polynomial. Most commonly used normalization is

Unable to parse markup [type=CF_MATHJAX]

. However, to guarantee

Unable to parse markup [type=CF_MATHJAX]

and to easier deal with possible trailing zeros in the recurrence, we will use the normalization

Unable to parse markup [type=CF_MATHJAX]

These definitions are different in terms of whether the Padé approximant for the given normalization exists. However, when $$$\deg P < \deg Q$$$, they're equivalent in a sense that when

Unable to parse markup [type=CF_MATHJAX]

defines the linear recurrence

Unable to parse markup [type=CF_MATHJAX]

the reversed polynomial

Unable to parse markup [type=CF_MATHJAX]

defines the recurrence

Unable to parse markup [type=CF_MATHJAX]

Formal proof

Thus to find the first kind recurrence for

Unable to parse markup [type=CF_MATHJAX]

, we could find the second kind recurrence for

Unable to parse markup [type=CF_MATHJAX]

instead.

Online Judge — Rational Approximation. Given $$$f(x)$$$, compute $$$p(x)$$$ and $$$q(x)$$$ of degrees at most $$$m-1$$$ and $$$n-1$$$ such that

Unable to parse markup [type=CF_MATHJAX]

Extended Euclidean algorithm

Let's look again on the condition

Unable to parse markup [type=CF_MATHJAX]

It translates into the following Bézout's identity:

$$$ F(x) Q(x) = P(x) + x^{m+1} K(x), $$$

where $$$K(x)$$$ is a formal power series. When $$$P(x)$$$ is divisible by $$$\gcd(F(x), x^{m+1})$$$ the solution to this equation can be found with the extended Euclidean algorithm. Turns out, the extended Euclidean algorithm can also be used to enumerate all $$$[p/q]_F$$$ with $$$p+q=m$$$.

Formalizing the algorithm

Let's formalize the extended Euclidean algorithm of $$$A(x)$$$ and $$$B(x)$$$. Starting with $$$r_{-2} = A$$$ and $$$r_{-1} = B$$$, we compute the sequence

$$$ r_{k} = r_{k-2} \mod r_{k-1} = r_{k-2} - a_{k} r_{k-1}. $$$

If you're familiar with continued fractions, you could recognize, that this sequence corresponds to the representation

$$$ \frac{A(x)}{B(x)} = a_0(x) + \frac{1}{a_1(x) + \frac{1}{a_2(x) + \frac{1}{\dots}}} = [a_0(x); a_1(x), a_2(x), \dots]. $$$

Same as with rational numbers, for such sequence it is possible to define the sequence of convergents

$$$ \frac{p_k(x)}{q_k(x)} = [a_0(x); a_1(x), \dots, a_k(x)] = \frac{a_{k}(x) p_{k-1}(x) + p_{k-2}(x)}{a_{k}(x) q_{k-1}(x) + q_{k-2}(x)}, $$$

starting with

$$$ \begin{matrix} \frac{p_{-2}(x)}{q_{-2}(x)} = \frac{0}{1}, & \frac{p_{-1}(x)}{q_{-1}(x)} = \frac{1}{0}, & \frac{p_{0}(x)}{q_{0}(x)} = \frac{a_0(x)}{1},& \dots \end{matrix} $$$

Now, one can prove the following identity:

$$$ (-1)^{k}r_k(x) = q_k(x) A(x) - p_k(x) B(x). $$$

Explanation

In other words, we get three sequences $$$r_k$$$, $$$p_k$$$ and $$$q_k$$$ that define a family of solutions to the Bézout's identity.

Using

Unable to parse markup [type=CF_MATHJAX]

and

Unable to parse markup [type=CF_MATHJAX]

, we conclude that $$$ F(x) q_k(x) \equiv (-1)^k r_k(x) \pmod{x^{m+1}}. $$$

Enumerating approximants

Let's estimate the degrees of $$$r_k$$$ and $$$q_k$$$ to estimate how they relate with Padé approximants.

It follows from the recurrence that

Unable to parse markup [type=CF_MATHJAX]

, therefore

Unable to parse markup [type=CF_MATHJAX]

and

Unable to parse markup [type=CF_MATHJAX]

Multiplying both the numerator and the denominator of

Unable to parse markup [type=CF_MATHJAX]

, we get the Padé approximants $$$[p/q]_F$$$ for all $$$q$$$ from $$$\deg q_k$$$ to

Unable to parse markup [type=CF_MATHJAX]

, while also maintaining the inequality

Unable to parse markup [type=CF_MATHJAX]

Joining it together for all $$$q_k$$$ we see that all

Unable to parse markup [type=CF_MATHJAX]

for $$$q$$$ from $$$0$$$ to $$$m$$$ are covered.

Thus, if you find $$$k$$$ such that $$$\deg q_k \leq q$$$ and $$$\deg q_{k+1} > q$$$, assuming $$$p+q=m$$$, it will hold that

Unable to parse markup [type=CF_MATHJAX]

Note that the Padé approximant is not necessarily unique (for example,

Unable to parse markup [type=CF_MATHJAX]

). However, if

Unable to parse markup [type=CF_MATHJAX]

then it is also true that

Unable to parse markup [type=CF_MATHJAX]

Assuming that both approximants satisfy the definition of

Unable to parse markup [type=CF_MATHJAX]

, we can conclude that

Unable to parse markup [type=CF_MATHJAX]

and

Unable to parse markup [type=CF_MATHJAX]

otherwise the identity

Unable to parse markup [type=CF_MATHJAX]

won't hold. That being said,

Unable to parse markup [type=CF_MATHJAX]

found by the Euclidean algorithm provides the minimum possible

Unable to parse markup [type=CF_MATHJAX]

for a fixed

Unable to parse markup [type=CF_MATHJAX]

Finding the linear recurrence

Let

Unable to parse markup [type=CF_MATHJAX]

for some $$$P(x)$$$ and $$$Q(x)$$$ such that

Unable to parse markup [type=CF_MATHJAX]

then $$$\frac{P}{Q}$$$ satisfies the definition of

Unable to parse markup [type=CF_MATHJAX]

. Otherwise,

Unable to parse markup [type=CF_MATHJAX]

would provide another pair of $$$P(x)$$$ and $$$Q(x)$$$ with the same degree of $$$Q$$$ but even smaller degree of $$$P$$$.

Thus, a representation defining a minimum linear recurrence is present among the approximants generated by the Euclidean algorithm.

In this notion, the minimum recurrence is defined by the first $$$k$$$ such that $$$\deg r_k < \deg q_k$$$ and has a characteristic polynomial $$$q_k$$$.

I have implemented the $$$O(n^2)$$$ algorithm as a part of my polynomial algorithms library:

// Returns the characteristic polynomial
// of the minimum linear recurrence for
// the first d+1 elements of the sequence
poly min_rec(int d = deg()) const {
    // R1 = reversed (Q(x) mod x^{d+1}), R2 = x^{d+1}
    auto R1 = mod_xk(d + 1).reverse(d + 1), R2 = xk(d + 1);
    auto Q1 = poly(T(1)), Q2 = poly(T(0));
    while(!R2.is_zero()) {
        auto [a, nR] = R1.divmod(R2); // R1 = a*R2 + nR, deg nR < deg R2
        tie(R1, R2) = make_tuple(R2, nR);
        tie(Q1, Q2) = make_tuple(Q2, Q1 + a * Q2);
        if(R2.deg() < Q2.deg()) {
            return Q2 / Q2.lead(); // guarantee that the highest coefficient is 1
        }
    }
    assert(0);
}

You can see this Library Judge submission for further details.

Also here is a library-free version, if you prefer it.

Half-GCD algorithm

The notion above provides the basis to construct the $$$O(n \log^2 n)$$$ divide and conquer algorithm of computing $$$\gcd(P, Q)$$$ in polynomials and finding the minimum linear recurrence. I have a truly marvelous demonstration of this proposition that this margin is, unfortunately, too narrow to contain. I hope, I will be able to formalize the process for good and write an article about it sometime...

As a teaser, here's an example of the problem, that (probably) requires Half-GCD:

Library Checker — Inv of Polynomials. You're given $$$f(x)$$$ and $$$h(x)$$$. Compute

Unable to parse markup [type=CF_MATHJAX]

modulo $$$h(x)$$$.

Let

Unable to parse markup [type=CF_MATHJAX]

and $$$\frac{p_{k-1}}{q_{k-1}} = [a_0; a_1, \dots, a_{k-1}]$$$, then

Unable to parse markup [type=CF_MATHJAX]

Therefore, if

Unable to parse markup [type=CF_MATHJAX]

is non-constant, the inverse doesn't exist. Otherwise, the inverse is

Unable to parse markup [type=CF_MATHJAX]

In the problem,

Unable to parse markup [type=CF_MATHJAX]

, therefore you need to do something better than $$$O(n^2)$$$ Euclidean algorithm.

Rev.	Кто	Когда	Δ	Комментарий
en45	adamant	2022-07-22 14:30:59	12
en44	adamant	2022-07-22 14:30:30	31
en43	adamant	2022-07-21 21:44:21	7000
en42	adamant	2022-07-21 16:37:28	18
en41	adamant	2022-04-08 17:18:50	140
en40	adamant	2022-04-08 17:17:39	491
en39	adamant	2022-04-08 17:07:37	37
en38	adamant	2022-04-08 17:06:22	21
en37	adamant	2022-04-08 17:04:53	773
en36	adamant	2022-04-08 15:39:20	10
en35	adamant	2022-04-08 15:38:39	5
en34	adamant	2022-04-08 15:27:29	74
en33	adamant	2022-04-08 15:24:39	22
en32	adamant	2022-04-08 15:22:01	5090	explaining on pade approximants, b0=1 and bq=1
en31	adamant	2022-04-07 22:03:19	253	rephrase tldr
en30	adamant	2022-04-06 23:37:11	13
en29	adamant	2022-04-06 22:19:33	334	problem example
en28	adamant	2022-04-06 21:22:31	13
en27	adamant	2022-04-06 21:21:31	105
en26	adamant	2022-04-06 21:14:41	1082	Tldr
en25	adamant	2022-04-06 18:50:52	34
en24	adamant	2022-04-06 18:50:22	114
en23	adamant	2022-04-06 17:23:32	1953
en22	adamant	2022-04-06 17:05:46	1249
en21	adamant	2022-04-06 16:43:26	89
en20	adamant	2022-04-06 16:32:56	838	explanation
en19	adamant	2022-04-06 14:57:04	216	it seems it is not necessarily unique?
en18	adamant	2022-04-06 14:02:54	509	less continued fractions, they probably scare away people
en17	adamant	2022-04-06 13:59:17	85
en16	adamant	2022-04-06 13:57:34	47
en15	adamant	2022-04-06 13:56:19	9
en14	adamant	2022-04-06 13:53:57	1338	finally got the hang of it
en13	adamant	2022-04-06 06:46:22	113	some edge case still need to be considered (published)
en12	adamant	2022-04-06 06:00:12	0	(saved to drafts)
en11	adamant	2022-04-06 03:49:58	15
en10	adamant	2022-04-06 03:41:10	345
en9	adamant	2022-04-06 03:23:06	77
en8	adamant	2022-04-06 03:14:13	5
en7	adamant	2022-04-06 02:13:21	86
en6	adamant	2022-04-06 01:37:01	13
en5	adamant	2022-04-06 01:28:10	57
en4	adamant	2022-04-06 01:23:00	29
en3	adamant	2022-04-06 00:10:26	9	I'm not sure
en2	adamant	2022-04-05 23:29:13	22
en1	adamant	2022-04-05 23:27:31	6370	Initial revision (published)

Tl'dr.

Linear recurrence interpolation

Padé approximants

Extended Euclidean algorithm

Formalizing the algorithm

Enumerating approximants

Finding the linear recurrence

Half-GCD algorithm

История