The Intuition Behind NIM and Grundy Numbers in Combinatorial Game Theory

Правка en3, от Shisuko, 2019-04-15 13:49:11

There are already many wonderful resources online for learning the basics to combinatorial game theory, but the intention I have for this guide in particular is to try to make them more intuitive. Why does XOR work for Nim? Why does mex produce the Grundy numbers? When it was taught to me, it was more or less just as 'magic' to memorize, but now I think I have a decent understanding of why they are true, so now that I have a grasp of the intuition behind it, I decided I should record my thoughts into a blog.

Nim games

Suppose we are playing a Nim game. As we know, Nim games usually follow this sort of template:

  • There are $$$n$$$ piles of stones, and each pile contains $$$p_1, p_2, p_3, \cdots p_n$$$ stones, where each of the $$$p_i$$$ is some nonnegative integer.

  • There are two players who take turns removing stones. The current turn player may remove as many stones as they wish, so long as all the stones come from the same pile. More formally, they choose a pile $$$i$$$ and a number of stones to remove $$$0 < s \leq p_i$$$, then replace $$$p_i$$$ with $$$p_i-s$$$. You cannot pass your turn doing nothing. Then, it is the other player's turn to remove stones, and so on.

  • The game ends when there are no stones left on the board. The winner is the one who took the last stone; alternative, the loser is the first one who cannot take a stone on their turn.

The question usually goes something like this: Given an initial board state $$$p_1, p_2, p_3, \cdots p_n$$$, is there a winning strategy the first player can take so that they always win, or is it always possible for the second player to win?

We could attack this problem using DP, however, a lot of Nim-type problem will have bounds such as $$$N$$$ up to $$$2\cdot10^5$$$ and $$$p_i$$$ up to $$$10^9$$$ or even $$$10^{18}$$$. That's obviously impossible to DP.

There exists a fairly standard $$$O(N)$$$ formula to solving the Nim game, and here is my attempt to naturally and intuitively derive for ourselves what this magic formula is.

You can only take stones from one pile at a time, so it seems like a good idea to find some property for each pile of the game, then find some way to compose that information together to describe the board state as a whole. It's rather mathy, but as an analogy, think about multiplicative functions---to evaluate a multiplicative function at $$$n$$$, we decompose $$$n$$$ into its prime powers, more easily evaluate the function with each prime power, then find some way to compose the answers for each $$$p^k$$$ together to get the answer for $$$n$$$.

So, what we will do is:

  • For each game state $$$g$$$, let us assign to it some integer "Nim-value" $$$G$$$ which tells us whether this is a winning state or a losing state.

  • Given two game states $$$a$$$ and $$$b$$$, let $$$a+b$$$ be the game state that is simply the piles of $$$a$$$ together with the piles of $$$b$$$. If their Nim-values are $$$A$$$ and $$$B$$$ respectively, then define $$$A\oplus B$$$ to be the Nim value of the game state $$$a+b$$$. We are assuming (and hoping) that there does in fact exist some operator that allows us to compose different games together easily.

Now, let's tackle the simplest possible game---one that consists of a single pile with $$$p$$$ stones in it.

  • Intuitively, since the only information that distinguishes piles from one another is $$$p$$$, let's say that the Nim-value of a game that consists of only a single pile is $$$p$$$.

  • We see that a single pile is a winning state if $$$p>0$$$, and is a losing state if $$$p=0$$$. That's something we'd like to see reflected in the Nim-values as well---a Nim-value $$$G$$$ corresponds to a winning game state when $$$G>0$$$, and corresponds to a losing game state when $$$G=0$$$.

Next step, let's find a way to compose Nim-values together with this operator $$$A \oplus B$$$. We should notice the following properties about it though:

  • $$$(A \oplus B) \oplus C = A\oplus(B \oplus C)$$$, and $$$A \oplus B = B \oplus A$$$. That is to say, $$$\oplus$$$ is associative and commutative. Since we're just combining the piles together, order doesn't really matter, so this should be apparent.

  • $$$A \oplus 0 = A$$$. That is to say, the identity for this operator is 0. We obviously want this to be true since if you place an empty pile next to an existing game, you should still effectively have the same game.

  • $$$A \oplus A = 0$$$. That is to say, the inverse of each game state is itself. Think about why this is true---suppose we have two piles that each have $$$p$$$ stones in them. The second player always has a winning strategy here, which is to just mirror the first player's move! For some generic game state $$$A$$$, note that the game state $$$A\oplus A$$$ guarantees a respective 'mirror image' for each pile; if the first player makes a move on some pile, the second player can always take the same number of stones from the 'mirror image' of that pile. Thus, the first player will always be the one to run out of moves first, therefore this is a losing state, and therefore $$$A \oplus A = 0$$$.

You'll notice that all of these properties define a group over the nonnegative integers! In fact, take a look at that last property in particular, that $$$A \oplus A = 0$$$. The inverse of every element is itself, i.e. every non-zero element in the group has order 1. This is actually a huge hint because it turns out the only group that has this property is the set $$${0,1}$$$ with the operation of addition modulo 2, or some ordered $$$n$$$-tuple of $$${0,1}$$$ with the operation of addition modulo 2 done between each corresponding component; aka, the logical XOR and the bitwise XOR.

Bingo! It seems our operator $$$A\oplus B$$$ is just the bitwise XOR of the two Nim-values. In fact, since XOR is just addition modulo 2, in literature the Nim-values are usually called the Nim-sums of the game state. However most of what I gave consists of heuristics and intuitions---now that we have this operator as a candidate, let's try to formally prove that our magic formula works.

**Given a game of Nim whose current game state consists of $$$n$$$ piles with $$$p_1, p_2, p_3, \cdots, p_n$$$, then the current game state is a winning state if the Nim-sum $$$p_1 \oplus p_2 \oplus p_3 \oplus \cdots \oplus p_n$$$ is non-zero, and a losing state if the Nim-sum is zero, where $$$\oplus$$$ is the bitwise XOR operation.

  1. The empty board is a losing state with Nim-sum 0. This follows from our observations earlier.

  2. If the current Nim-sum is 0, then any move will make it non-zero. Recall that a move replaces $$$p_i$$$ with $$$p_i-s$$$. So, keeping in mind that the XOR-inverse of each integer is itself, that means the new Nim-sum after a move is made becomes $$$p_1 \oplus p_2 \oplus p_3 \oplus \cdots \oplus p_n \oplus p_i \oplus (p_i-s)$$$; we remove $$$p_i$$$ from the Nim-sum, then add in $$$p_i-s$$$. But, if the current Nim-sum was 0, then this is just $$$p_i \oplus (p_i-s)$$$, and this is only equal to 0 when $$$p_i=p_i-s$$$, but the rules state you have to always take at least one stone on your turn, so $$$s \neq 0$$$.

  3. If the current Nim-sum is non-zero, then there always exists a move to make it equal to 0. Suppose the current Nim-sum is $$$N>0$$$. We need to choose $$$p_i$$$ and $$$s$$$ such $$$p_i \oplus (p_i-s)=N$$$, subject to the constraint that $$$0 < s \leq p_i$$$. Consider the fact that $$$N$$$, since it is nonzero, must have a greatest significant bit, the $$$g$$$th bit. Furthermore, at least one of the piles $$$p_i$$$ will have the $$$g$$$th bit as a 1 as well; if all the piles had a 0 there, then it couldn't have been the greatest significant bit of their XORsum after all. So, we choose one such pile as our $$$p_i$$$, and we construct our $$$p_i-s$$$ as the following:

  • All bits greater than the $$$g$$$th bit we keep the same as $$$p_i$$$.

  • The $g$th bit is set to 0.

  • All bits less than the $$$g$$$th bit we make match those of $$$N \oplus p_i$$$.

Notice that this construction is always less than $$$p_i$$$, since we know their first difference is at the $$$g$$$th bit, therefore this is a valid choice of $$$p_i-s$$$, from which we can easily recover what the $$$s$$$ should be. We can see that the new Nim-sum is 0 because

  • All bits in $$$N$$$ greater than the $$$g$$$th bit were already 0 (since the $$$g$$$th bit was defined to be the greatest significant bit). Then, we know that all bits in $$$p_i \oplus (p_i-s)$$$ greater than the $g$th bit are also 0, since that was how we constructed it.

  • The $$$g$$$th bit in $$$N$$$ 1, by definition as the greatest significant bit, and the $$$g$$$th bit in $$$p_i \oplus (p_i-s)$$$ is also 1, since it is 1 in $$$p_i$$$ and 0 in $$$p_i-s$$$.

  • All bits in $$$p_i \oplus (p_i-s)$$$ less than the $$$g$$$th bit, are guaranteed to match that of $$$N$$$, since $$$p_i \oplus (p_i-s) = p_i \oplus (p_i \oplus N) = N$$$ in this range.

The best way to convince yourself and see that this does in fact set the new Nim-sum to 0 is to write it out yourself and try it with some concrete values.

Now, with these tools in hand, we can see that:

  • If the current turn player has a zero state, they have no choice but to 'disturb' it into a nonzero state.

  • The next player then, given a nonzero state, always is able to 'restore' it back into a zero state.

  • Thus, if the players know what they are doing, the first player is always stuck with a game state whose Nim-sum is 0; since this remains invariant and there are only a finite number of moves, eventually when the game ends with the empty game whose Nim-sum is 0, we know the current turn player, and loser, is the first player.

  • Similarly, if the current turn player has a nonzero state, they can restore it to a zero state and pass the board over to the second player, who by our logic, must be the loser.

This proves our theorem. Thus, determining the winner given some initial game state is linear on the number of piles.

Extensions on NIM and Grundy Numbers

No self-respecting problem-setter would ever give just a vanilla Nim game in 2019, though. Usually there will be some variation on the rules somehow. Consider the following, for instance.

Say you are similarly given a game state of $$$n$$$ piles with stones $$$p_1, p_2, p_3, \cdots p_n$$$. The normal rules of Nim apply, except there is an extra move: instead of removing stones from a pile, the current turn player may choose to spend their turn adding any number of stones to a pile instead. They can add as many stones as they want, so long as, again, they are all added to the same pile. Assume this addition move may only be done a finite number of times per player.

It turns out that... this is exactly the same as normal Nim! Why? Assume that the current turn player is in a losing state, under the normal set of Nim rules. However, if they add $$$s$$$ stones to some pile, the other player can spend their turn removing $$$s$$$ stones from that same pile, effectively undoing the last move, and then the losing player is back exactly where he started. A winning player, meanwhile, can just play the game like normal Nim, only deviating from the strategy to undo any additions the opposing player makes, as necessary.

Other games though are not quite as simple. A common tweak is putting a limitation on the number of stones you can get from a pile. What if there is a condition that you can only take a perfect square number of stones from a pile? What if you can take no more than half the stones in a pile?

Let us turn back to our original strategy---decompose the game state into several piles, find some property for each piles, then compose them again together to get the answer.

So, given only a single pile with $$$p$$$ stones, how do you know if it is a winning state or a losing stats? Earlier, we simply operated on $$$p$$$, but we'll need something a bit more sophisticated here.

Let's reframe this as a directed graph. Let there be $$$p+1$$$ vertices in this graph, labelled $$$0$$$ to $$$p$$$. Each vertex corresponds to some game state, where its label indicated how many stones are in the pile at this state. Then, draw a directed edge from $$$u$$$ to $$$v$$$ if it is possible to transition from $$$u$$$ stones to $$$v$$$ with a valid move. For instance, if you may only take a square number of stones from the pile, then vertex 11 would have a directed edge to vertices 10, 7, and 2, since I could take 1, 4, or 9 stones from that pile. The terminal vertices---the ones with outdegree 0---are our loss states.

We could imagine solving this with DP then. Let the terminal vertices be, by definition, loss states. Then, if a vertex points to at least one loss state, it is a win (the current turn player takes that moves and puts the opposing player in a loss state). If a vertex only points to winning states, then it is a loss state (the current turn player has no ways out).

Let's take this a step further though and treat it like our Nim-values from earlier; instesd of just assigning win/loss, let's assign some integer to each vertex such that 0 is a loss and nonzero is a win. This is usually called the Grundy numbers, or more cutely, the Nimber of a pile.

Let the $$$G(u)$$$ be the Grundy number of a pile with $$$u$$$ stones. The Grundy number of a terminal state is 0; otherwise, $$$G(u)$$$ is the minimum excludant of the Grundy numbers of all the states it points to. We usually write minimum excludant as mex, and it basically means "return the smallest nonnegative integer that is not in this set." So, for instance, $$$\mex(0,1,3,4,5)=2$$$.

The important Sprague-Grundy theorem states that these games are equivalent to playing Nim, but instead of taking the XOR of the piles, we take the XOR of their Grundy numbers.

Let's consider a basic example, the Nim game where you can only take a square number of stones from a pile. We see that the Grundy numbers for $$$0, 1, 2, 3, 4, 5, 6$$$ are $$$0, 1, 0, 1, 2, 0, 1$$$. You can verify this. Therefore, if we have a game state with piles of $$$1, 2, 3, 4, 6$$$ stones, the Nim-sum of this game is $$$G(1)\oplus G(2) \oplus G(3) \oplus G(4) \oplus G(6) = 1\oplus 0 \oplus 1 \oplus 2 \oplus 1 = 3$$$, which is non-zero, therefore the first player has a winning strategy.

Note that in our normal game of Nim, each state points to every state smaller than it (since taking any number of stones is a valid move). Therefore, $$$G(p)=p$$$, which is consistent with our results about the original Nim.

When I first learned this, I thought ut was magic! But the helpful commenters over at the math subreddit helped give me this wonderful intuition. The mex function is so weird, what's that all about?

Basically, the mex function is a promise. The promise is that: if $$$G(p)=g$$$, then it means every integer from $$$0$$$ to $$$g-1$$$ is accounted for in the list of valid moves. Imagine that every pile in our game was transformed from having $$$p$$$ stones to $$$G(p)=g$$$ stones. You can see that we can do this because in Nim, we can transition from a pile of size $$$p$$$ to any other pile of size less than $$$p$$$; in our transformed Grundy number world, the mex property guarantees that we can transition from our 'pile' of size $$$g$$$ to any 'pile' of size less than $$$g$$$. So, by looking at this transformed mex world, we really are just playing Nim!

However, unlike in Nim, it's possible to transition from a pile of size $$$g$$$ to a pile of size greater than $$$g$$$. Consider in the square number game earlier, if we have a pile of size 5 and we take away one stone, we end up with 4; in the transformed Grundy number world, we went from a pile of size 0 to one of size 2!

But not to worry, we already accounted for this! Remember we said that Nim where you can add stones is just the same as normal Nim. Suppose I started with a Grundy number of $$$g$$$, then after making a move I ended up with a larger Grundy number in the new pile; but, recall that the Grundy number of some state is the minimum excludant of all the states it can transition to. It's our promise from earlier. So, if we transition from $$$g$$$ to some larger number, we have been promised that we can always transition back to $$$g$$$. Just like in our Nim-with-addition from earlier, we can always undo any additions.

Therefore, if we use Grundy numbers, Nim with any weird restrictions on the amount of stones we can take per pile can be reduced to Nim with addition, which itself can be reduced to normal Nim.

The complexity in these kinds of programming problems in contests usually then come downs to finding some efficient formula for the Grundy numbers.

Теги #game-theory, #nim-game, #xor, grundy, sprague-grundy

История

 
 
 
 
Правки
 
 
  Rev. Язык Кто Когда Δ Комментарий
en18 Английский Shisuko 2020-08-04 12:41:11 0 (published)
en17 Английский Shisuko 2020-07-30 17:18:24 21 Tiny change: 'rundy:\n\n**Project Euler**\n\nhttps:' -> 'rundy:\n\nhttps:'
en16 Английский Shisuko 2020-07-30 17:13:26 733 Added list of problems (saved to drafts)
en15 Английский Shisuko 2020-07-30 11:53:45 204
en14 Английский Shisuko 2020-03-03 14:52:04 1424 Revised the wording of some parts.
en13 Английский Shisuko 2019-04-16 05:53:18 0 Grammar fixes (published)
en12 Английский Shisuko 2019-04-16 05:45:09 39 (saved to drafts)
en11 Английский Shisuko 2019-04-15 18:20:26 0 (published)
en10 Английский Shisuko 2019-04-15 18:18:57 1 Tiny change: ' 0 to $h-1. And, si' -> ' 0 to $h-1$. And, si'
en9 Английский Shisuko 2019-04-15 18:17:56 173
en8 Английский Shisuko 2019-04-15 18:13:53 688 Tiny change: 'I thought ut was magi' -> 'I thought it was magi'
en7 Английский Shisuko 2019-04-15 17:57:42 1232 Tiny change: 'the set $\left\{0,1\right\}$ and th' -> 'the set $\\{0,1\\}$ and th'
en6 Английский Shisuko 2019-04-15 17:30:05 1510
en5 Английский Shisuko 2019-04-15 13:57:07 27
en4 Английский Shisuko 2019-04-15 13:50:06 1 Tiny change: 'nstance, $\mex(0,1,3,' -> 'nstance, $mex(0,1,3,'
en3 Английский Shisuko 2019-04-15 13:49:11 25622
en2 Английский Shisuko 2019-04-15 11:59:56 13982
en1 Английский Shisuko 2019-03-19 18:43:00 4359 Initial revision (saved to drafts)