The Gambler's Fallacy

I believe that when you are trying to learn a concept, the more ways you hear it expressed, the more likely it is that one of them clicks. So, because it came up here recently, here is my attempt to explain away the gambler's fallacy. There are many other explanations out there if this one doesn't make sense. I encourage everybody to keep trying until they have internalized the logic behind why the gambler's fallacy is wrong.

I am going to end up flipping this coin 100 times. It will probably land on heads about 50 times. I just flipped it ten times and got ten tails. In order to end up seeing 50 heads after my 100 flips, I expect to see 50 heads and only 40 tails from now on. That way, I end up with 50 heads and 50 tails.

The gambler's fallacy refers to the belief that something is "due" to happen because recent events have gone the other way even though those past events have no impact on future events. In the above line of reasoning, because the first ten flips landed on tails AND because there was an expectation of 50 total tails after 100 flips, there is a thought that there will be more heads than tails over the remaining 90 flips to make up the difference.

That line of thinking is wrong because it does not grasp the full consequences of what statistical independence means. To say that a coin will land on heads 50% of the time means that if we took a sequence of one million coin flips, we'd expect to see close to 50% of them heads. Most people understand that. However, the important concept to get is that we ALSO expect to see 50% heads in ANY subset of those one million flips, as long as the subset is picked without peeking at the results of the flips in it. It doesn't matter if you looked at only the first ten, the first hundred, only the prime number flips, only the flips corresponding to the numbers on your lottery ticket, etc. EVERY ONE of those subsets of the million throws would be expected to see 50% heads. The probability is independent of everything else.
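If you would rather see the subset claim than take my word for it, here is a rough simulation sketch. Everything in it is my own choice for illustration: the seed, the helper names `heads_rate` and `primes_below`, and the particular subsets. The last subset is the one that matters most for the fallacy, since it looks only at flips that come right after three tails in a row.

```python
import random

def heads_rate(flips, indices):
    """Fraction of heads among the flips at the given positions."""
    picked = [flips[i] for i in indices]
    return sum(picked) / len(picked)

def primes_below(n):
    """Sieve of Eratosthenes: all primes less than n."""
    is_prime = bytearray([1]) * n
    is_prime[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            # Cross out every multiple of p starting at p*p.
            is_prime[p * p::p] = bytearray(len(range(p * p, n, p)))
    return [i for i in range(n) if is_prime[i]]

N = 1_000_000
rng = random.Random(2024)  # arbitrary fixed seed so reruns match
flips = [rng.getrandbits(1) for _ in range(N)]  # 1 = heads, 0 = tails

# Flip number k lives at index k - 1.
subsets = {
    "all flips": range(N),
    "first 100,000 flips": range(100_000),
    "every 7th flip": range(6, N, 7),
    "prime-numbered flips": [p - 1 for p in primes_below(N + 1)],
    "flips right after three tails in a row": [
        i for i in range(3, N)
        if flips[i - 3] == flips[i - 2] == flips[i - 1] == 0
    ],
}

rates = {name: heads_rate(flips, idx) for name, idx in subsets.items()}
for name, rate in rates.items():
    print(f"{name:40s} {rate:.4f}")
```

Every line it prints should sit very close to 0.5, including the one for flips that follow a run of tails. Small subsets like the first ten flips will bounce around more, but their expected rate is still 50%.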

One cause of the confusion might be that we often express probabilities not as percentages but as expected totals. Watch what happens when we write them as rates instead.

I am going to end up flipping this coin 100 times. It will probably land on heads about 50% of the time. I just flipped it ten times and got ten tails. In order to end up seeing 50% heads after my 100 flips, I expect to see 56% heads and only 44% tails from now on. That way, I end up with 50% heads and 50% tails.

That should immediately jump out to you as incorrect. A fair coin always has a 50% chance of landing on heads. It does not matter what it landed on last time, this time, the next time, or any time in the past or future. Every time you flip the coin the probability is 50-50. Expecting it to be 56-44 is flat out mistaken.
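For anyone who wants the arithmetic spelled out, here is a small sketch of where the 56% comes from and what independence says instead. The variable names are mine, and the coin is assumed to be perfectly fair.

```python
from fractions import Fraction

total_flips = 100
heads_so_far = 0
tails_so_far = 10
p_heads = Fraction(1, 2)  # assumption: a fair coin

remaining = total_flips - (heads_so_far + tails_so_far)  # 90 flips left

# Fallacy: the coin "owes" us enough heads to finish at 50 total.
fallacy_rate = Fraction(50 - heads_so_far, remaining)

# Independence: each remaining flip is still 50-50.
expected_future_heads = p_heads * remaining
expected_final_heads = heads_so_far + expected_future_heads

print(f"Fallacy needs heads on {float(fallacy_rate):.1%} of remaining flips")
print(f"Independence expects {expected_future_heads} more heads")
print(f"Expected final total: {expected_final_heads} heads in {total_flips} flips")
```

The fallacy needs 50 heads in 90 flips, which is 5/9, or about 56%. Independence expects 45 more heads, so after ten tails the best guess for the final total is 45 heads out of 100, not 50.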

Now baseball games are more complex than simply flipping a coin, but the concept above remains. If you assume that baseball games are independent events, then the probability of a team winning the next game has nothing to do with what they did in the previous game, or the current game, or any individual game. Therefore, if you thought the Mariners would win 76 games (read: 47%) this season, then you should think they will win 47% of their remaining games.

Think back to the coin above and the one million flips. Now make it 162 flips and read the part about subsets again. If we thought the Mariners would win 47% of their 162 games then we would expect them to win 47% of any subset of those 162 games. That includes the first ten, the first hundred, only the prime-numbered games, only the games corresponding to the numbers on your lottery ticket, etc. EVERY ONE of those subsets of the 162 games would be expected to see 47% Mariner wins. The probability is independent of everything else.

Now, are baseball games actually independent events? No. Teams change, they face different teams, the rosters turn over and so on. But if you have no reason to change your initial prediction of 47% wins, then you can treat them as independent for the purposes of predicting what the Mariners' record will be after 50 games, 100 games or 162 games. You just take your initial predicted winning percentage, multiply it by the number of games left to get the expected number of wins going forward and then add it to the number of wins in past games that you want to include.
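That recipe fits in a few lines. This is only a sketch: the function name `projected_wins` and its parameters are made up for illustration, and it assumes your predicted winning percentage has not changed since the preseason.

```python
def projected_wins(wins_so_far, games_played, win_pct, season_games=162):
    """Wins already banked plus expected wins in the games still to play."""
    games_left = season_games - games_played
    return wins_so_far + win_pct * games_left

preseason_pct = 76 / 162  # the 76-win (47%) projection

# Before any games are played, the projection is just the 76 wins.
print(round(projected_wins(0, 0, preseason_pct), 1))

# After a 2-7 start: 2 banked wins plus 47% of the remaining 153 games.
print(round(projected_wins(2, 9, preseason_pct), 1))
```

The second number comes out to about 73.8 wins. A bad start lowers the projected final total because the losses are already in the books and nothing in the remaining games is expected to make up for them.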

Sticking with that 47% win (76 per 162) projection, here are the implications of believing in the fallacy. If the Mariners started 2-7 (22%), you are now expecting them to go 74-79 (48%) the rest of the way so that they "make up" those missing wins in the first nine games.
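Here is a quick check of those numbers, written as a sketch with variable names of my own choosing. It works out what record the fallacy demands over the remaining games and compares that rate with the original projection.

```python
season_games = 162
preseason_wins = 76
wins, losses = 2, 7  # the 2-7 start

games_left = season_games - (wins + losses)

# Fallacy: the team must still finish with exactly 76 wins.
needed_wins = preseason_wins - wins
needed_losses = games_left - needed_wins
fallacy_pct = needed_wins / games_left

# Original belief about the team's true winning percentage.
preseason_pct = preseason_wins / season_games

print(f"Fallacy record the rest of the way: {needed_wins}-{needed_losses}")
print(f"Fallacy winning percentage: {fallacy_pct:.1%}")
print(f"Preseason winning percentage: {preseason_pct:.1%}")
```

The fallacy rate (about 48.4%) is higher than the preseason rate (about 46.9%), even though the only new information is a losing stretch.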

Pre-season: "I think the Mariners will win 47% of their games."
Mariners lose seven of first nine.
Now: "I think the Mariners will win 48% of their remaining games."

You should be uncomfortable with that kind of reasoning. Why would watching the Mariners fail, which should, if anything, make you think they are WORSE than you initially thought, lead you to decide that the Mariners will from now on be BETTER than you initially thought? It doesn't make sense, and it doesn't make sense because it's wrong.