In this blog, we build on the introduction from my previous one and then see some tricks I used to solve PIBO2. For this, you only need to read the first section there; the "Clacking Balls" section and the probability material are not required.
Thanks bashkort for the Month of Blog Posts challenge!
The basis of falling factorials
The $$$k$$$-th falling factorial is the polynomial defined as

$$$(x)_k = x(x-1)(x-2)\cdots(x-k+1),$$$

with $$$(x)_0 = 1$$$. It has degree $$$k$$$, so the falling factorials $$$1, x, (x)_2, \dots, (x)_k, \dots$$$ form a basis of the vector space of polynomials, just like the monomial basis $$$1, x, x^2, \dots, x^k, \dots$$$ does. We note that

$$$\binom{n}{k} = \frac{(n)_k}{k!}$$$
for non-negative integers $$$n$$$ and $$$k$$$. Hence the binomial coefficient can also be seen as a polynomial in $$$n$$$, and we can replace $$$n$$$ by an indeterminate variable $$$x$$$. From the combinatorial identity

$$$\binom{n+1}{k+1} = \binom{n}{k+1} + \binom{n}{k}$$$

we get

$$$\Delta \binom{x}{k+1} = \binom{x+1}{k+1} - \binom{x}{k+1} = \binom{x}{k},$$$

and thus

$$$\Delta (x)_{k+1} = (k+1)\,(x)_k.$$$
Hence $$$\Delta$$$ acts on $$$(x)_k$$$ like $$$\frac{d}{dx}$$$ acts on $$$x^k$$$.
One of the first things one can do when given a linear operator is to find a suitable basis for it, one that makes it easier to understand. And that's what we just did for $$$\Delta$$$, as a linear operator on the vector space of polynomials. This basis allows us to try to "invert" $$$\Delta$$$. Note that $$$\Delta$$$ is not invertible, since $$$\Delta 1 = 0$$$. But if $$$\Delta p(x) = (x)_k$$$, then we know that $$$p(x) = \frac{1}{k + 1} (x)_{k+1} + C$$$, where $$$C$$$ is some constant. This is similar to integration.
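As a quick sanity check, here is a small Python sketch (the degrees and sample points are arbitrary choices) verifying that $$$\Delta$$$ acts on $$$(x)_k$$$ the way $$$\frac{d}{dx}$$$ acts on $$$x^k$$$:

```python
def falling(x, k):
    """(x)_k = x (x-1) ... (x-k+1), with (x)_0 = 1."""
    r = 1
    for i in range(k):
        r *= x - i
    return r

def delta(f):
    """Forward difference: (Δf)(x) = f(x+1) - f(x)."""
    return lambda x: f(x + 1) - f(x)

# Δ (x)_k = k * (x)_{k-1}, mirroring d/dx x^k = k x^{k-1}
for k in range(1, 6):
    for x in range(10):
        assert delta(lambda t, k=k: falling(t, k))(x) == k * falling(x, k - 1)
```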
A quick application
Consider a random walk over a path with $$$n$$$ vertices. Let's say we start on vertex $$$1$$$ and each time we go to an adjacent vertex with equal probability. That is, if we are on vertex $$$x$$$ with $$$1 < x < n$$$, then we go to either $$$x - 1$$$ or $$$x + 1$$$, each with probability 1/2. If we are at $$$1$$$, we automatically go to vertex $$$2$$$. Let's calculate the expected number of steps until we reach vertex $$$n$$$ for the first time.
Let $$$E_k$$$ denote the expected number of steps until we reach vertex $$$n$$$ if we start at vertex $$$k$$$. Then $$$E_1 = 1 + E_2$$$, $$$E_n = 0$$$ and

$$$E_x = 1 + \frac{1}{2}\left(E_{x-1} + E_{x+1}\right)$$$

for $$$1 < x < n$$$. Rearranging these, we get a set of equations, like in the previous blog:

$$$\Delta^2 E_{x-1} = E_{x+1} - 2E_x + E_{x-1} = -2, \qquad \Delta E_1 = E_2 - E_1 = -1, \qquad E_n = 0.$$$
"Integrating" once, we get $$$\Delta E_x = -2(x)_1 + C = -2x + C$$$, for some constant $$$C$$$. The second equation yields $$$C = 1$$$. "Integrating" again, we get

$$$E_x = -(x)_2 + (x)_1 + C' = -x^2 + 2x + C'$$$

for another constant $$$C'$$$, which equals $$$n^2 - 2n$$$ by the third equation in the set. In particular, starting from vertex $$$1$$$ the answer is $$$E_1 = (n-1)^2$$$.
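A short Python check that the closed form $$$E_x = -x^2 + 2x + n^2 - 2n$$$ obtained by "integrating" satisfies all three equations ($$$n = 10$$$ is an arbitrary choice):

```python
# Closed form from the "integration": E_x = -x^2 + 2x + n^2 - 2n
def E(x, n):
    return -x * x + 2 * x + n * n - 2 * n

n = 10
assert E(n, n) == 0                      # boundary condition at vertex n
assert E(1, n) == 1 + E(2, n)            # forced move at vertex 1
for x in range(2, n):                    # interior recurrence
    assert 2 * E(x, n) == 2 + E(x - 1, n) + E(x + 1, n)
assert E(1, n) == (n - 1) ** 2           # the classic (n-1)^2 answer
```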
PIBO2
Now let's solve PIBO2.
Since we're working with (discrete) derivatives, let's try to compute $$$\Delta Pib(n)$$$ with the recurrence and see what we find:

$$$\Delta Pib(n) = Pib(n+1) - Pib(n) = \Delta Pib(n-1) + \Delta Pib(n-2) + \Delta P(n).$$$

This is the exact same recurrence but with $$$Pib$$$ and $$$P$$$ replaced by $$$\Delta Pib$$$ and $$$\Delta P$$$ respectively. So we can iterate! This looks useful, because we reduced the degree of the polynomial by $$$1$$$. Iterating it $$$d$$$ more times, we get

$$$\Delta^{d+1} Pib(n) = \Delta^{d+1} Pib(n-1) + \Delta^{d+1} Pib(n-2),$$$
since $$$P$$$ has degree $$$d$$$ and $$$\Delta^{d + 1} P = 0$$$. We got rid of the polynomial here, which seems useful. We also got a Fibonacci-like recurrence, which is very simple for us to solve with matrix exponentiation, given $$$\Delta^{d+1} Pib(0)$$$ and $$$\Delta^{d + 1} Pib(1)$$$.
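A small numerical check of this in Python. The polynomial $$$P$$$ and the initial values $$$Pib(0) = Pib(1) = 1$$$ below are arbitrary choices for illustration, not taken from the problem statement:

```python
d = 2
P = lambda x: 3 * x * x + x + 7          # example polynomial of degree d = 2

N = 20
pib = [1, 1]                             # illustrative initial values
for n in range(2, N):
    pib.append(pib[-1] + pib[-2] + P(n)) # Pib(n) = Pib(n-1) + Pib(n-2) + P(n)

def diff(a):
    return [a[i + 1] - a[i] for i in range(len(a) - 1)]

g = pib
for _ in range(d + 1):                   # g becomes the sequence Δ^{d+1} Pib
    g = diff(g)

# Δ^{d+1} Pib satisfies the pure Fibonacci-like recurrence
for n in range(2, len(g)):
    assert g[n] == g[n - 1] + g[n - 2]
```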
On the other hand, just rearranging the recurrence equation from the problem statement, we get

$$$\Delta Pib(n-1) = Pib(n) - Pib(n-1) = Pib(n-2) + P(n)$$$

and, changing variables,

$$$\Delta Pib(n) = Pib(n-1) + P(n+1).$$$

We can iterate this, applying $$$\Delta$$$ to both sides and plugging the same formula back in, until we reach $$$\Delta^{d + 1}Pib$$$:

$$$Pib(n) = \Delta^{d+1} Pib(n + d + 1) - \sum_{i=0}^{d} \Delta^i P(n + 2 + i).$$$
Now, as we've seen above, $$$\Delta^{d+1} Pib(n + d + 1)$$$ can be computed with matrix exponentiation, given the values $$$\Delta^{d+1} Pib(0)$$$ and $$$\Delta^{d + 1} Pib(1)$$$. These values, along with the terms in the summation above, can be computed directly from the definitions in $$$O(d^2)$$$. We simply need to compute the required values of $$$Pib$$$ and $$$P$$$ and to successively apply a function that takes an array $$$(a_1, \dots, a_m)$$$ and returns the array of differences $$$(a_2 - a_1, \dots, a_m - a_{m - 1})$$$ (one element shorter).
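The difference function mentioned above is one line. Here is a sketch of it, together with a check that applying it $$$d+1$$$ times annihilates a degree-$$$d$$$ polynomial (the example polynomial is arbitrary):

```python
def diffs(a):
    """(a_2 - a_1, ..., a_m - a_{m-1}); one element shorter than a."""
    return [y - x for x, y in zip(a, a[1:])]

# Δ applied d+1 times kills a polynomial of degree d
d = 4
P = lambda x: 2 * x**4 - x**3 + 5 * x - 3   # example polynomial, degree 4
vals = [P(x) for x in range(d + 3)]          # d+3 consecutive samples
for _ in range(d + 1):
    vals = diffs(vals)
assert all(v == 0 for v in vals)             # Δ^{d+1} P = 0
```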
We remark that there is another way to compute the summation above, also in $$$O(d^2)$$$, which is more thematic and requires less thinking about indices. Write $$$P$$$ in the basis of falling factorials. To do that, first compute $$$(x)_d$$$. Since the falling factorials are monic (also remember we're working over $$$\mathbb{Z}_{1111111111}$$$), the coefficient of $$$(x)_d$$$ is simply the leading coefficient of $$$P$$$, so subtracting it from $$$P$$$ we get a polynomial of degree $$$d - 1$$$. Computing $$$(x)_{d-1}$$$ from $$$(x)_d$$$ can be done in $$$O(d)$$$ with naive division. With $$$P$$$ written in the basis of falling factorials, we can go from $$$\Delta^k P$$$ to $$$\Delta^{k+1}P$$$ in $$$O(d)$$$ (we saw how $$$\Delta$$$ acts on $$$(x)_k$$$). Evaluating a polynomial written in the basis of falling factorials can also be done in $$$O(d)$$$ per point.
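A sketch of this $$$O(d^2)$$$ change of basis, over the integers for simplicity; over $$$\mathbb{Z}_{1111111111}$$$ the same peeling works with coefficients reduced modulo the modulus, precisely because the falling factorials are monic and no division by a coefficient is ever needed:

```python
def to_falling(p):
    """Monomial coefficients of P (low to high) -> coefficients alpha_k
    with P(x) = sum_k alpha_k * (x)_k. O(d^2) by peeling leading terms."""
    d = len(p) - 1
    # fall[k] = monomial coefficients of (x)_k
    fall = [[1]]
    for k in range(1, d + 1):            # (x)_k = (x)_{k-1} * (x - (k-1))
        prev, cur = fall[-1], [0] * (k + 1)
        for j, c in enumerate(prev):
            cur[j + 1] += c              # multiply by x
            cur[j] -= (k - 1) * c        # multiply by -(k-1)
        fall.append(cur)
    p = list(p)
    alpha = [0] * (d + 1)
    for k in range(d, -1, -1):           # (x)_k is monic, so alpha_k = p[k]
        alpha[k] = p[k]
        for j, c in enumerate(fall[k]):
            p[j] -= alpha[k] * c
    return alpha

def falling(x, k):
    r = 1
    for i in range(k):
        r *= x - i
    return r

# x^2 = (x)_2 + (x)_1
assert to_falling([0, 0, 1]) == [0, 1, 1]
# round-trip check on an arbitrary polynomial 3 - 2x + 5x^3
p = [3, -2, 0, 5]
a = to_falling(p)
for x in range(6):
    assert sum(a[k] * falling(x, k) for k in range(4)) \
        == sum(p[j] * x**j for j in range(4))
```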
In the rest of this blog, we'll see more stuff to solve this problem in $$$O(d \log^2 d)$$$. Some familiarity with generating functions will be useful.
Evaluating, interpolating and change of basis
At this point we'll assume we're working over a field, as opposed to the ring $$$\mathbb{Z}_{1111111111}$$$. Given $$$d + 1$$$ values $$$y_0, y_1, \dots, y_d$$$, we know there exists a unique polynomial $$$P$$$ of degree $$$d$$$ such that $$$P(i) = y_i$$$ for each $$$i = 0, 1, \dots, d$$$. Evaluating from and interpolating to the monomial basis is well known. In this section, we'll learn how to do that for the basis of falling factorials. As a result, we'll also have a way of converting between these two bases.
Evaluation looks simpler, so let's start with that. Let's suppose we have a polynomial $$$P$$$ written like

$$$P(x) = \sum_{k=0}^{d} \alpha_k (x)_k$$$

and we want to compute $$$P(0), P(1), \dots, P(d)$$$. Well, for a point $$$m$$$,

$$$P(m) = \sum_{k=0}^{m} \alpha_k (m)_k = \sum_{k=0}^{m} \alpha_k \frac{m!}{(m-k)!},$$$

that is,

$$$\frac{P(m)}{m!} = \sum_{k=0}^{m} \alpha_k \cdot \frac{1}{(m-k)!}.$$$

One can recognize the right-hand side as a convolution. What we get is that the exponential generating function of the $$$P(m)$$$ equals a simple product:

$$$\sum_{m \ge 0} \frac{P(m)}{m!} x^m = \left( \sum_{k=0}^{d} \alpha_k x^k \right) \exp(x).$$$
Interpolation, a.k.a. inverting this, is just as simple. Just multiply both sides by $$$\exp(-x)$$$ and we can get the coefficients $$$\alpha_k$$$ from the values $$$P(m)$$$. So both operations can be done in $$$O(d \log d)$$$.
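The following Python sketch implements both directions with naive $$$O(d^2)$$$ convolutions over exact rationals; replacing the inner loops with an FFT-based multiplication gives the $$$O(d \log d)$$$ bound from the text:

```python
from fractions import Fraction
from math import factorial

def eval_falling(alpha):
    """Given P = sum_k alpha_k (x)_k of degree d, return P(0), ..., P(d)
    via sum_m P(m) x^m / m! = e^x * sum_k alpha_k x^k."""
    d = len(alpha) - 1
    inv_fact = [Fraction(1, factorial(j)) for j in range(d + 1)]
    vals = []
    for m in range(d + 1):
        s = sum(Fraction(alpha[k]) * inv_fact[m - k] for k in range(m + 1))
        vals.append(s * factorial(m))      # P(m) = m! * [x^m] of the product
    return vals

def interp_falling(vals):
    """Inverse direction: multiply by exp(-x) instead."""
    d = len(vals) - 1
    neg_exp = [Fraction((-1) ** j, factorial(j)) for j in range(d + 1)]
    alpha = []
    for k in range(d + 1):
        s = sum(Fraction(vals[m], factorial(m)) * neg_exp[k - m]
                for m in range(k + 1))
        alpha.append(s)
    return alpha

# P(x) = (x)_2 + 3 (x)_1 + 2  ->  values at 0, 1, 2 are 2, 5, 10
alpha = [2, 3, 1]
assert eval_falling(alpha) == [2, 5, 10]
assert interp_falling(eval_falling(alpha)) == [2, 3, 1]
```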
Shifting
Prerequisite: OGFs, EGFs, differentiation and Taylor shifts.
Given a polynomial $$$P$$$ written in the basis of falling factorials, how do we compute $$$P(x + c)$$$? Note that

$$$P(x + 1) = P(x) + \Delta P(x) = (1 + \Delta) P(x).$$$

Iterating this, for integer $$$c$$$, we get

$$$P(x + c) = (1 + \Delta)^c P(x) = \sum_{k \ge 0} \binom{c}{k} \Delta^k P(x).$$$
If $$$P$$$ has degree $$$d$$$, then we only need the first $$$d + 1$$$ terms in the sum above, the others are zero. Note that both sides are polynomials in $$$c$$$, so we can extend this for non-integer values of $$$c$$$ as well.
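A quick check of the shift formula in Python; the polynomial and the shift $$$c = 7$$$ are arbitrary examples:

```python
from math import comb

def falling(x, k):
    r = 1
    for i in range(k):
        r *= x - i
    return r

def P(x):                                 # example: P = (x)_3 - 4 (x)_1 + 6
    return falling(x, 3) - 4 * falling(x, 1) + 6

def delta_pow(f, k):
    """Δ^k f as a function, by repeated forward differences."""
    for _ in range(k):
        f = (lambda g: (lambda x: g(x + 1) - g(x)))(f)
    return f

d, c = 3, 7
for x in range(5):
    # only the first d+1 terms are needed: Δ^{d+1} P = 0
    shifted = sum(comb(c, k) * delta_pow(P, k)(x) for k in range(d + 1))
    assert shifted == P(x + c)
```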
How to apply a polynomial (or power series) of $$$\Delta$$$ to $$$P$$$? The linked prerequisite teaches how to do that for the standard derivative operator $$$D$$$. We know that $$$\Delta$$$ acts on $$$(x)_k$$$ like $$$D$$$ acts on $$$x^k$$$. So just replace $$$P(x) = \sum_k \alpha_k (x)_k$$$ by $$$\sum_k \alpha_k x^k$$$ and $$$\sum_k \binom{c}{k} \Delta^k$$$ by $$$\sum_k \binom{c}{k} D^k$$$ and apply what we already know to get the coefficients that we want.
With this section and the previous one, you should be able to solve Shift of Sampling Points of Polynomial from Library Checker. Also check out Pow of Formal Power Series (Sparse), as it can speed up the computation of $$$(1 + \Delta)^c$$$ (I learned how to solve it from this blog).
Explicit formula for $$$\Delta^k f$$$
Let's say we have a function $$$f$$$, not necessarily a polynomial. We want to compute $$$\Delta^k f$$$. To get an intuition, if we try manually computing the first few values,

$$$\begin{aligned} \Delta f(x) &= f(x+1) - f(x), \\ \Delta^2 f(x) &= f(x+2) - 2f(x+1) + f(x), \\ \Delta^3 f(x) &= f(x+3) - 3f(x+2) + 3f(x+1) - f(x), \end{aligned}$$$
we see that the coefficients look like the beginning of Pascal's triangle, with some alternating signs.
We can prove by induction on $$$k$$$ that

$$$\Delta^k f(x) = \sum_{i=0}^{k} (-1)^{k-i} \binom{k}{i} f(x + i).$$$
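A check that the explicit alternating-binomial formula matches repeated first differences, including for a non-polynomial $$$f$$$ (the example function is arbitrary):

```python
from math import comb

def f(x):                                 # any function works, e.g.
    return x * x * x + 2 ** x

def delta_k(f, k, x):
    """Δ^k f(x) by the explicit alternating-binomial formula."""
    return sum((-1) ** (k - i) * comb(k, i) * f(x + i) for i in range(k + 1))

def delta_once(f):
    return lambda x: f(x + 1) - f(x)

# cross-check against k repeated first differences
for k in range(6):
    g = f
    for _ in range(k):
        g = delta_once(g)
    for x in range(5):
        assert delta_k(f, k, x) == g(x)
```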
PIBO2, but faster
At several points above, we assumed we were working over a field instead of simply a ring. The prime factorization of $$$1111111111$$$ is

$$$1111111111 = 11 \cdot 41 \cdot 271 \cdot 9091.$$$
So we should solve the problem for each prime $$$p$$$ independently, in $$$\mathbb{Z}_p$$$ (a field), and join the results using the Chinese remainder theorem. Moreover, when working over $$$\mathbb{Z}_p$$$, we should use Fermat's little theorem ($$$a^p \equiv a \mod p$$$ for all $$$a$$$), to transform $$$P$$$ into a polynomial of degree at most $$$p - 1$$$. This is required because large factorials (more specifically $$$p!$$$ and subsequent ones) do not have inverses modulo $$$p$$$, and they would be needed without this transformation.
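A sketch of the degree-reduction step for a single prime $$$p$$$ (here $$$p = 11$$$, one of the factors above; the polynomial is an arbitrary example). Since $$$x^p \equiv x$$$ for every $$$x \in \mathbb{Z}_p$$$, each exponent $$$e > 0$$$ can be folded down to $$$((e-1) \bmod (p-1)) + 1$$$, which keeps the value at $$$x = 0$$$ correct as well:

```python
def reduce_poly_mod_p(coeffs, p):
    """Fold x^e into x^{((e-1) mod (p-1)) + 1} using x^p = x in Z_p,
    so the result has degree <= p-1. coeffs are low-to-high."""
    out = [0] * min(len(coeffs), p)
    for e, c in enumerate(coeffs):
        r = 0 if e == 0 else (e - 1) % (p - 1) + 1
        out[r] = (out[r] + c) % p
    return out

p = 11
coeffs = [3] + [0] * 11 + [5]                # 3 + 5 x^12
red = reduce_poly_mod_p(coeffs, p)
for x in range(p):                           # the two agree on all of Z_p
    lhs = sum(c * pow(x, e, p) for e, c in enumerate(coeffs)) % p
    rhs = sum(c * pow(x, e, p) for e, c in enumerate(red)) % p
    assert lhs == rhs
```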
To use the explicit formula to compute $$$\Delta^{d+1} Pib(0)$$$ and $$$\Delta^{d + 1} Pib(1)$$$, we need the first $$$d + 3$$$ values $$$Pib(0), Pib(1), \dots, Pib(d + 2)$$$. In turn, calculating these requires the $$$d + 1$$$ values $$$P(2), P(3), \dots, P(d + 2)$$$, which can be computed with standard multipoint evaluation in $$$O(d \log^2 d)$$$. This is the slowest part of the algorithm. In the next part, we will need $$$P(0)$$$ and $$$P(1)$$$, so we might as well evaluate these together.
Now we are left with the sum

$$$\sum_{i=0}^{d} \Delta^i P(n + 2 + i).$$$

This looks hard, because not only is the polynomial changing, but the points at which we evaluate it are changing along with it, and each application of $$$\Delta$$$ and each evaluation costs $$$O(d)$$$. But with Taylor shifts, we can transform this into an evaluation at a single point:

$$$\sum_{i=0}^{d} \Delta^i P(n + 2 + i) = \sum_{i=0}^{d} \Delta^i (1 + \Delta)^i P(n + 2) = \left( \sum_{i=0}^{d} \big( \Delta (1 + \Delta) \big)^i \right) P(n + 2).$$$
We are now done, because we've seen how to apply a power series of $$$\Delta$$$ to a polynomial written in the basis of falling factorials, we've seen how to interpolate $$$P$$$ from $$$P(0), P(1), \dots, P(d)$$$ and we already have these values.
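A small check of the single-point identity $$$\sum_i \Delta^i P(n+2+i) = \sum_i \big(\Delta(1+\Delta)\big)^i P(n+2)$$$, with an arbitrary degree-2 polynomial as the example:

```python
def falling(x, k):
    r = 1
    for i in range(k):
        r *= x - i
    return r

def P(x):                                 # example polynomial, d = 2
    return 2 * falling(x, 2) + falling(x, 1) + 3

d = 2

def delta(f):
    return lambda x: f(x + 1) - f(x)

def op(f):                                # Δ(1+Δ) f = Δ f + Δ^2 f
    return lambda x: delta(f)(x) + delta(delta(f))(x)

for n in range(5):
    lhs, g = 0, P
    for i in range(d + 1):                # sum_i Δ^i P(n + 2 + i)
        lhs += g(n + 2 + i)
        g = delta(g)
    rhs, h = 0, P
    for i in range(d + 1):                # sum_i (Δ(1+Δ))^i P, all at n + 2
        rhs += h(n + 2)
        h = op(h)
    assert lhs == rhs
```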
I think it's much easier to work with $$$D : f(k) \mapsto f(k+1)$$$ (or actually even $$$f(k-1)$$$, since most recurrences are defined from prior values, rather than from the next ones), rather than with $$$\Delta = D-1$$$. In particular,
Shifting formulas. You can see immediately that

$$$\Delta^k f(x) = (D - 1)^k f(x) = \sum_{i=0}^{k} (-1)^{k-i} \binom{k}{i} f(x + i).$$$
Of course, the shifting formula for $$$f(x+c) = D^c f(x)$$$ via $$$\Delta$$$ is obtained in a similar manner because, in turn, $$$D = \Delta+1$$$.
Recurrences. $$$E_x = 1 + \frac{1}{2}(E_{x-1}+E_{x+1})$$$ turns into

$$$(D - 1)^2 E_{x-1} = -2.$$$
Yes, this particular example can look easier with $$$\Delta$$$, but what if it was a different recurrence that wouldn't factorize as nicely? For example, if it was $$$(3-D)^2$$$, I would immediately see that $$$E_k = (a+bk)3^{k}$$$ for some constants $$$a$$$ and $$$b$$$, but it wouldn't be as obvious with $$$\Delta$$$, and you would need to solve two equations of type $$$2E = \Delta E$$$.
PIBO2. We can rewrite it as $$$(1-D-D^2) Pib(n) = P(n)$$$. Then, applying $$$(1-D)^{d+1}$$$ to it will turn it into

$$$(1 - D)^{d+1} (1 - D - D^2)\, Pib(n) = (1 - D)^{d+1} P(n) = 0.$$$
Then, general solutions to linear recurrences say that $$$Pib(n) = a(n) + b(n)$$$, where $$$(1-D)^{d+1} a(n) = 0$$$, i.e. $$$a(n)$$$ is a polynomial of degree $$$d$$$, and $$$(1-D-D^2) b(n) = 0$$$, i.e. $$$b(n)$$$ is a Fibonacci-like sequence. Using this, we get

$$$(1 - D)^{d+1} Pib(n) = (1 - D)^{d+1} b(n),$$$

which tells us right away that $$$(1-D)^{d+1} Pib(n)$$$ satisfies a Fibonacci-like recurrence, and also allows us to recover $$$a(n)$$$ and $$$b(n)$$$.
As a side note, PIBO2 can also be solved in $$$O(d \log d)$$$ with this particular mod, because when you have a small prime number $$$p$$$, you can find $$$P(x)$$$ for all possible $$$x$$$ modulo $$$p$$$ in $$$O(d \log d)$$$ via chirp z-transform (see #1 here).
very interesting stuff, I can definitely see the similarities between the falling factorial and the normal factorial