Trying to solve this problem. I got the last non-zero digit by working with $$$2$$$-s and $$$5$$$-s, but here's an easier solution by using $$$10$$$ directly.
Can anyone explain why this works? How taking modulo $$$10^9$$$ does not make it wrong?
Code
Thanks.
we're interested in the last digit so even taking modulo $$$10$$$ is right
But it gets wrong answer.
Taking modulo 10 actually does not work.
For example, take 125*8, and you will get 1000, with a last non-zero digit of 1. But what if it was actually 125*18, which is equal to 2250, with a last non-zero digit of 5?
oh yes i am wrong
I think it's modulo 1e9 because the zeros will never go past that far on a single iteration, for example instead it could be 1000 if N went up to 99 because the most the zeros will go up at once is twice because of 25 (4*25 = 100).
boocoo What I don't understand is when we do modulo, we lose some information about the number.
It is true that by doing modulo, we lose some information about the actual number. However, we are more interested in the least significant digit (rightmost non zero digit). Taking modulo removes part of the most significant digits (leftmost digits are lost).
Here, the only purpose of having modulo 1e9 is to keep the answer in such a range that multiplying a big number like N ( <= 2e7 according to the question ), the product still fits in a long long variable.
By repeatedly dividing ans by 10 inside the while loop, we are losing information about the least significant digits, only if these digits are 0 (ans % 10 == 0). In the same way, we don't need to keep track of unnecessary most significant digits. Doing so, we are focussed on the requirement, i.e., the last (rightmost) non zero digit.
I want to know exactly the reason for that, you haven't told how is it enough.
Let's say at some point you have ans = 123 (after removing the right hand side 0s).
Instead, when you do modulo 1e9 at each step, it will look something like this
Notice that doing the modulus operation doesn't change the last digits, It only gets rid of the digits to the left. With modulo and without modulo, in both cases the last digits are the same since the modulus is done with a power of 10.
But why it does not work on $$$10^5$$$?
Because you have N <= 2e7. If you take any modulo lesser than that, it will cause the multiplying value to be truncated. The nearest power of 10 above that value of N is 1e8. That might work.
Ok, thanks. But I'm still little bit confused.
Maybe you need to calculate factorial[n] * inverse(factorial[m])?
Modulo 10 inverses should not work, it is composite.
oh yes, didn't notice that
what is the range of n and m?
both at most 20000000
I think the reason they choose the mod depends on the range of n and m, since the maximum is 20000000, the product after each steps always has no more than 8 trailing zeroes, so we need a number that larger than 10^8. Besides if we choose 10^10, the product each steps can be larger than 2^64-1, which leads to error. That's my opinion :)
But I first remove those 0-s and then take the mod, not before.
Take a minor example, 3125 mod 10^3 = 125, 3125 mod 10^2 = 25, multiply both by 4 and we get 500 and 100, after moving the zeroes we get two different solutions, so we need a large number to decrease this probability As I mentioned in the last comment, 10^9 is good enough since 10^10 is risky :)
Any better explanations are still welcome.
It doesn't work for $$$5^{10}=9765625$$$ as far as I understand
What do you mean does not work? There are two values in the problem $$$n$$$ and $$$m$$$.
I meant n=5^10, m=n so you are calculating factorial, basically
It seems like it's correct.
actually I bugged in my reimplementation, it's $$$n=m=5^{10}+1$$$, it prints 8 instead of 4
Ok, the interesting thing is how did u find it.
well, it's clear that the problem is with 2's and 5's , because without them (I mean if 10 wouldn't have divisors) everything would work, but not with 10's (because zeros will be handler OK automatically). So, tried nearby $$$5^{10}$$$
$$$n=5^{10}+1$$$
$$$m=11$$$
The correct answer is 8, but the program outputs 6.
There is actually a smaller counterexample:
$$$n=5^9+13$$$
$$$m=14$$$
Thanks for the correct counterexamples. I also checked on my another code (with 2-s and 5-s) and the correct answer is 8 indeed.
But the question is, how did you find it?
I compared the correct answer(using $$$mod=10^{18}$$$ and __int128) and the answer given by the code written in the blog on values of n close to a power of 5 and small values of m.
I was convinced this was true and was trying to prove it, but got stuck in some part of the proof. I'll leave my attempt with a mark in the error.
Let's rephrase the problem a litte bit, note that after the $$$i$$$-th iteration of the algorithm we get some $$$9$$$-digit number, $$$a_{i}$$$, that isn't divisible by $$$10$$$. If we prove that every step yields a number whose last digit is the last non-zero digit of $$$P_{i}^{N}$$$, then the algorithm works for all $$$1 \leq i \leq N$$$. This screams induction, but focusing only on the last non-zero digit will take us nowhere because we will be missing a lot of information whenever the current last digit becomes zero and this would make retrieving previous digits impossible. So, instead of proving that the last non-zero digit is correct after each step of the algorithm, we will prove that also the $$$8$$$ digits to the left of the last non-zero digit are correct in the sense that $$$a_{i}$$$ is a substring of $$$P_{i}^{N}$$$.
Let's define $$$f(n)$$$ as the $$$k$$$-digit number (we are interested in $$$k = 9$$$, but let's do it more generally) obtained after removing all trailing zeros from $$$n$$$ and then taking the $$$k$$$ less significant digits, we may have leading zeroes.
We proceed by induction on $$$i$$$, the base case $$$i = 1$$$ is clear since $$$N < 10^{9}$$$. Now we only need to prove \begin{align}f(P_{i + 1}^{N}) = f(a_{i}\cdot(N − i))\end{align} But notice that $$$P_{i + 1}^{N}$$$ is equal to $$$P_{i}^{N} \cdot (N - i)$$$ by definition and $$$f(P_{i}^{N}) = a_{i}$$$ by the induction hypothesis. Using the following claim we conclude the proof.
$$$\text{Claim:}$$$ Let $$$n$$$ and $$$m$$$ be positive integers. If the number of digits of $$$m$$$ is less than $$$k$$$, then $$$f(nm) = f(f(n)\cdot m)$$$, i.e. the less $$$k$$$ significant digits, excluding trailing zeros, of $$$nm$$$ and $$$f(n)\cdot m$$$ are the same.
Suppose the last non-zero digit of $$$n$$$ is at position $$$j$$$ from right to left, then we are only interested in the digits at positions $$$j, j + 1, \dots , j + k - 1$$$. How exactly does $$$nm$$$ look like? Well, if $$$n = \sum\limits_{s = 0}^{k}10^{s}b_{s}$$$ and $$$m = \sum\limits_{s = 0}^{r}10^{s}c_{s}$$$, for some $$$r < k$$$, then the $$$t$$$ digit of $$$nm$$$ is obtained after taking modulo $$$10$$$ the sum of $$$\sum\limits_{s = 0}^{\min(t, r)}b_{t - s}c_{s}$$$ and the carrying which comes from the right. Suppose $$$nm$$$ has $$$q \geq 0$$$ more trailing zeros than $$$n$$$, then $$$f(nm)$$$ is formed by the digits of $$$nm$$$ at positions $$$j + q, j + q + 1, \dots , j + q + k - 1$$$, using the previous formula we notice two things:
Trailing zeros of $$$n$$$ doesn't affect the value of any digit of $$$nm$$$.
(This is false) Digits of $$$n$$$ at positions $$$t > (j + k - 1)$$$ are not needed to compute the digits of $$$f(nm)$$$, that's because all the digits of interest are at positions of the form $$$(j + q + p)$$$ with $$$ 0 \leq p \leq k - 1$$$; plugging that into the formula yields \begin{align}\sum\limits_{s = 0}^{\min(j + q + p,\, r)}b_{j + q + p − s}c_{s}\end{align} so we only need to prove that \begin{align} j + q + p − s &\leq j + k − 1 \newline \iff q + p &\leq s + (k − 1) \end{align} which follows from the fact that $$$s = \min(j + q + p,\, r)$$$, note that if $$$\min(j + q + p,\, r) = j + q + p$$$ just substitute and the result is true, else $$$s = r$$$ and see that there is not way $$$nm$$$ has more than $$$r$$$ trailing zeros than $$$n$$$ (This last part is totally wrong).
The above suggests that $$$f(f(n)\cdot m)$$$ should be equal to $$$f(nm)$$$ since we don't involve digits of $$$n$$$ not included in $$$f(n)$$$ to compute the $$$k$$$ less significant digits of $$$nm$$$, excluding trailing zeros.
The solution only works for $$$n<5^9$$$, so proving that it works for $$$n<10^9$$$ is impossible.