Tue 15 Oct 2013
Everyone knows the age-old game of rock, paper, scissors. It’s been played pretty much all over the world for a century or so, and it originated in China roughly 2000 years ago.
Generally the game is approached in one of two ways. The first is to try to out-think your opponent, exploiting the inherent non-randomness of people when they play the game. This is essentially applied psychology.
The other way to approach the game is to assume that you and your opponent are totally rational, i.e. both of you know everything about the game and there is no ‘human’ element to exploit. This is essentially game theory.
From the game-theory point of view, the two players in RPS are perfectly matched: the game is symmetric, so neither one has any particular advantage over the other, and both have the same chance to win each game. It turns out the best strategy is to choose rock, paper, or scissors at random each time, with equal probability for each.
Using game theory and probability, you can actually prove this. Here’s one way it can be done. Define the following: if you win a game, you get 1 point; if you lose a game, you lose 1 point. $r$, $p$, and $s$ are the probabilities of you choosing rock, paper, and scissors respectively. Similarly $r'$, $p'$, and $s'$ are the probabilities your opponent chooses them. Obviously $r + p + s = 1$, $r' + p' + s' = 1$, and each probability lies between 0 and 1.
Now we calculate the expected point gain per game by taking every possible game outcome, multiplying the point gain/loss of that particular outcome by the probability of it occurring, and then summing them all together. So the points won per game is the following:

$$E = r s' + p r' + s p' - r p' - p s' - s r'$$
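This sum over outcomes is easy to check numerically. Here’s a short Python sketch of the same calculation (the function name `expected_gain` and the dictionary layout are my own choices, not anything from the game itself):

```python
from itertools import product

# Which move beats which: each key beats its value.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def expected_gain(mine, theirs):
    """Expected points per game for me: +1 for a win, -1 for a loss,
    0 for a tie, each weighted by the probability of that outcome."""
    total = 0.0
    for m, t in product(BEATS, repeat=2):
        if BEATS[m] == t:        # my move beats theirs
            total += mine[m] * theirs[t]
        elif BEATS[t] == m:      # their move beats mine
            total -= mine[m] * theirs[t]
    return total

uniform = {"rock": 1/3, "paper": 1/3, "scissors": 1/3}
print(expected_gain(uniform, uniform))  # 0.0
```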
Since we know that the best choice is to choose $r$, $p$, and $s$ with equal probability, then $r = p = s = 1/3$. Since our opponent is also perfectly rational, he will also choose the same probabilities: $r' = p' = s' = 1/3$. So the points won per game is:

$$E = \frac{1}{9} + \frac{1}{9} + \frac{1}{9} - \frac{1}{9} - \frac{1}{9} - \frac{1}{9}$$
Everything sums up to $0$, which is what you expect when the game is evenly matched. However, this choice of $r = p = s = 1/3$ lends itself to a very interesting consequence. Let’s say your opponent chooses some other $r'$, $p'$, and $s'$. What happens then? Let’s say he’s really stupid and chooses rock every time, or $r' = 1$, $p' = s' = 0$. Of course the obvious choice for us to win would be to choose paper every time, or $p = 1$. But if we don’t change our strategy, the following happens:

$$E = \frac{1}{3}(0) + \frac{1}{3}(1) + \frac{1}{3}(0) - \frac{1}{3}(0) - \frac{1}{3}(0) - \frac{1}{3}(1) = 0$$
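The all-rock case is small enough to compute directly as a sanity check. A minimal sketch (the helper name `gain_vs_rock` is my own):

```python
# Against an all-rock opponent, only my paper (win, +1) and my
# scissors (loss, -1) matter; my rock always ties.
def gain_vs_rock(r, p, s):
    return p - s

print(gain_vs_rock(1/3, 1/3, 1/3))  # 0.0 -- the uniform mix stays even
print(gain_vs_rock(0, 1, 0))        # 1  -- all-paper wins every game
```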
Of course, this is obvious without having to calculate it. If your opponent chooses rock every time, and you are choosing each one 1/3 of the time, then 1/3 of the time you’ll tie with rock, 1/3 of the time win with paper, and 1/3 of the time lose with scissors. So let’s see what happens when we choose $r = p = s = 1/3$, but $r'$, $p'$, and $s'$ are still unknown:

$$E = \frac{1}{3}s' + \frac{1}{3}r' + \frac{1}{3}p' - \frac{1}{3}p' - \frac{1}{3}s' - \frac{1}{3}r'$$
Rearrange the terms a bit:

$$E = \frac{1}{3}(r' - r') + \frac{1}{3}(p' - p') + \frac{1}{3}(s' - s') = 0$$
So we see that when we choose $r = p = s = 1/3$, it doesn’t matter what strategy or mix of probabilities our opponent chooses: the expected point gain will always be zero, and we will always be evenly matched with our opponent.
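This cancellation can also be spot-checked numerically: pick random opponent mixes, and the expected gain for the uniform strategy stays at zero every time. A small sketch (the names are mine):

```python
import random

# Expected gain for the uniform 1/3 strategy against an arbitrary
# opponent mix (r2, p2, s2): each of my moves wins against exactly one
# of the opponent's moves and loses against exactly one.
def uniform_gain(r2, p2, s2):
    wins   = (s2 + r2 + p2) / 3   # rock>scissors, paper>rock, scissors>paper
    losses = (p2 + s2 + r2) / 3   # rock<paper, paper<scissors, scissors<rock
    return wins - losses

# Try a handful of random opponent mixes; the gain is always zero.
for _ in range(5):
    a, b = sorted((random.random(), random.random()))
    r2, p2, s2 = a, b - a, 1 - b
    assert abs(uniform_gain(r2, p2, s2)) < 1e-12
print("evenly matched against every opponent mix")
```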
This is what’s called a Nash equilibrium, named after John Nash, the subject of A Beautiful Mind. As I best understand it, for a zero-sum game (which RPS certainly is), there always exists a strategy with which one player can force the game to remain in equilibrium, no matter what the other player does. For standard RPS, this comes out to be $r = p = s = 1/3$. If you choose to play with this distribution, you and your opponent will always be in equilibrium no matter what strategy your opponent tries.
So that brings us to variants of RPS. This whole line of inquiry was inspired by a homework problem that my younger daughter had a few weeks ago. Basically it had a variant of RPS where if you win with rock you get 1 point, win with scissors you get 2 points, and win with paper you get 3 points. Since my daughter is in 1st grade, the questions were very easy, i.e. “if you have 10 points and win with scissors, how many points do you have?” etc. But this got me thinking about how this game would actually work. Of course you want to win with paper, but you run into the same problem as regular RPS: you and your opponent are both trying to out-guess each other, so there’s no simple strategy that’s guaranteed to win.
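The expected-value calculation from before carries over to the weighted scoring with one change: each win is worth its move’s point value instead of 1. Here’s a sketch; note that I’m assuming the loser loses the same number of points the winner gains (keeping the game zero-sum), since the worksheet only described the winner’s score:

```python
from itertools import product

# Weighted variant: a win with rock scores 1, with scissors 2, with
# paper 3. Assumed zero-sum: the loser loses the winner's score.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}
WIN_POINTS = {"rock": 1, "scissors": 2, "paper": 3}

def variant_gain(mine, theirs):
    """Expected points per game for me under the weighted scoring."""
    total = 0.0
    for m, t in product(BEATS, repeat=2):
        if BEATS[m] == t:        # I win m vs t
            total += WIN_POINTS[m] * mine[m] * theirs[t]
        elif BEATS[t] == m:      # I lose m vs t
            total -= WIN_POINTS[t] * mine[m] * theirs[t]
    return total

# The plain 1/3 mix is no longer safe here: an all-paper opponent
# comes out ahead against it.
uniform = {"rock": 1/3, "paper": 1/3, "scissors": 1/3}
all_paper = {"rock": 0.0, "paper": 1.0, "scissors": 0.0}
print(variant_gain(uniform, all_paper))  # negative: about -1/3 per game
```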
So the next question is, is there a Nash equilibrium for this version of RPS, and if so what is it?
This post is already too long, so I’ll show how to find the Nash equilibrium for this version in my next post.