This month we kicked off this series with an uber-basic, nearly math-free look at Win Probability Added and Wins Above Replacement. Today we are going to add a new metric: Pythagorean Win-Loss (Pythag). Pythag is often used as a way of measuring a team’s supposed “true talent,” and you’ll often see it referenced around here if a team is outperforming or, if you’re the 2015 M’s, underperforming their Pythag. We’ll look at how it’s calculated, and also why we should be wary of drawing such wide-ranging conclusions from this rather simple stat.
*Mathematics disclaimer from a sociology major: My senior yearbook superlative was “Most Spirited”, and I will fight anyone who says I did not earn that title. The superlative I definitely did not earn was “Most Likely to Teach Mathematical Equations on the Internet”, so once again bear in mind, if this is new to you, you’re gonna be able to handle it. If it is still old hat to some of you, I appreciate your patience.
Pythag is based on the understanding that there is a significant correlation between run differential and winning percentage. Run differential is just what it sounds like: number of runs scored minus number of runs allowed. Greater positive differentials correlate to a better team record, while greater negative differentials likely indicate a worse record. Winning percentage is self-explanatory. Winning by a wider margin consistently would logically imply that you are a stronger team than one that ekes out game after game. In 2016, we could have just as easily called this the official stat of “Screw the Texas Rangers”, but we should be careful there. Pythag is also integral to converting individual players’ WAR into team-wide projections. More on both of those thoughts in a moment.
The equation for a team’s Pythagorean winning percentage is, simply:
Runs Scored^2 / (Runs Scored^2 + Runs Allowed^2) = Pythagorean Winning %
So, for the 2016 Seattle Mariners, who went 86-76 (a .531 winning percentage), let’s do a little *heaves* math.
768 runs scored^2 / (768 runs scored^2 + 707 runs allowed^2) = .541
That .541 works out to about 87.7 wins over 162 games, or roughly 88-74, which would seemingly indicate the Mariners’ 86-76 record was about as good as their ability to score and prevent runs would suggest. Simply put, they performed within a couple of games of their Pythag. Hip hip hoo-booooring. Let’s rip into something more wonky.
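For anyone who would rather let a computer do the heaving, here is a minimal Python sketch of the same arithmetic. The function names `pythag_pct` and `pythag_wins` are made up for this example, and the exponent of 2 and the 162-game season are simply the assumptions used in the equation above; nothing here is an official implementation.

```python
def pythag_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean winning percentage from season run totals.

    The exponent defaults to 2, matching the equation in this article.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def pythag_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins over `games` games, left unrounded."""
    return pythag_pct(runs_scored, runs_allowed, exponent) * games


# 2016 Mariners figures from the article: 768 scored, 707 allowed, 86-76 actual.
pct = pythag_pct(768, 707)
wins = pythag_wins(768, 707)
print(f"Pythag: {pct:.3f}, or about {wins:.1f} expected wins")
print(f"Actual: {86 / 162:.3f}, 86 wins")
```

Swap in any team’s runs scored and runs allowed to see how far its actual record sits from what its run totals alone would suggest.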
It is easy to point at the run differential discrepancy between the Rangers and, say, the Nationals, who shared the same record but allowed nearly a full run fewer each game, and say that Pythagorean Win Expectancy shows Texas was lucky. Hell, it’s even highlighted for you, go ahead, it feels good and it must be true. Look at Rougned Odor - he’s either lucky or imbued with the spirit of Dolos.
Here’s the problem: that’s a very shallow way of looking at Pythagorean wins. Shame on you for trusting me.
Texas was an extreme enough case to say yes, this team was somewhat fortunate (and being 36-11 in one-run games would back that hypothesis). Drawing a conclusion about a team’s true talent, however, or even about what a team is liable to do in the future, by looking at its Pythag mid-season is tenuous. Pythag doesn’t account for players getting injured or returning from injury. It doesn’t account for teams making a win-now move or selling.
Both Pythag and WAR are used to attempt to convey the true talent of a team. WAR attempts to account for a wide number of variables. You can look at a team’s combined WAR and know that attempts are made to properly weight offensive and defensive production, baserunning, the varying stadiums in which each individual event occurred, and how that all compares to what is standard in that specific season. Pythag is the far simpler, and thus more flawed, metric. It is built from a single input, run totals: how many runs a team scored and how many it gave up. It then leaves you to extrapolate from that information how you see fit. Using Pythag to identify a team outperforming or underperforming the record its run differential would predict is an excellent way to begin to formulate a question. It’s not capable of giving you the answer.
Think of this series as the production of our own iLLuminati-esque pyramid. We’re starting you with the simple, basic metrics so that you can recognize the flaws in some early statistics and better understand the flurry of acronyms floating around this site. Next round we’ll be looking at the equations for WAR and, with your newfound knowledge of Pythag, you’ll have a better understanding of why WAR is the more reliable true talent metric.