An Idiot’s Guide to Advanced Statistics: FIP > ERA

Simple but important.

Texas Rangers v Seattle Mariners
James Paxton’s outta here if y’all are gonna judge him by his ERA.
Photo by Otto Greule Jr/Getty Images

Editor’s Note: In the latest segment of our awards-eligible series on basic statistical evaluation, we look at the most basic, integral number for evaluating pitchers: FIP. Our first piece looked at Wins Above Replacement and Win Probability Added, and the second attempted to shed some light on the use (and dangers) of Pythagorean Win/Loss. The most recent looked at our most used hitting metrics: wRC+ and wOBA. On deck is a look at spin rates and effective velocity. Enjoy!

Here at Lookout Landing, we try not to use ERA (earned run average) as a measure of whether or not a pitcher is performing well. The reasoning behind this is that for most pitchers, ERA tends to fluctuate from year to year and is not always predictive of a pitcher’s future success. A pitcher’s ERA one year has actually been shown to have a relatively low level of correlation with his ERA the next year.

ERA also relies heavily upon the defense behind a pitcher. True, a run that scores as the result of an error is not counted as a black mark on his ERA. However, the error is a subjective stat, often determined at the whim of the official scorer. Beyond this, consider two pitchers who each give up an identical ground ball some ten feet to the right of the shortstop. One of these pitchers has Francisco Lindor, certified Good Shortstop, as his shortstop. Lindor easily ranges over to the ball and records an out. The other pitcher has Alexei Ramirez, certified Bad Shortstop. Ramirez doesn’t even get close to the ball. Because Ramirez didn’t bobble the ball or anything, he isn’t charged with an error, so the pitcher is charged with a hit. If that runner scores, it counts as an earned run, and ERA will show the pitcher with Alexei Ramirez behind him to be worse than the pitcher with Francisco Lindor, despite both having given up the exact same ground ball.

This is why Fielding Independent Pitching, or FIP, is a much better measure of a pitcher’s true ability than ERA. FIP assumes that once the batter has hit a ball into play, the pitcher no longer has any ability to determine the outcome of the play. Therefore, FIP is a metric that strips out any balls in play and assumes that they will all average out over time. FIP holds the pitcher directly responsible only for home runs, walks, hit batters, and strikeouts. The pitcher is docked for home runs, walks, and hit batters. The pitcher is credited for strikeouts.

The equation for Fielding Independent Pitching is as follows:
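
In its standard published form (the version FanGraphs uses), where HR, BB, HBP, K, and IP are the pitcher’s home runs, walks, hit batters, strikeouts, and innings pitched, and $C_{\mathrm{FIP}}$ is the FIP constant:

$$\mathrm{FIP} = \frac{13 \cdot \mathrm{HR} + 3 \cdot (\mathrm{BB} + \mathrm{HBP}) - 2 \cdot \mathrm{K}}{\mathrm{IP}} + C_{\mathrm{FIP}}$$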

As you can see, it’s not terribly scary. Home runs carry a much heavier weight (13) than walks and hit batters (3) or strikeouts (2), which makes sense. A home run is a guaranteed run. The FIP constant, meanwhile, exists only to put FIP on the same scale as ERA: it is set each season so that league-average FIP equals league-average ERA. If a pitcher has an exactly league average defense and league average luck with timing, FIP estimates what that pitcher’s ERA would be expected to be.
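
For anyone who would rather see the arithmetic than take our word for it, here is a minimal Python sketch of the calculation. The function names (`fip_constant`, `fip`) and every number in it are made up for illustration, not real league or player totals, and it assumes the standard 13/3/2 weights with the constant defined as league ERA minus the league’s unscaled FIP.

```python
def fip_constant(lg_era, lg_hr, lg_bb, lg_hbp, lg_k, lg_ip):
    """Constant that puts FIP on the ERA scale for a given season.

    It is league ERA minus the league's raw (unscaled) FIP component,
    so league-average FIP comes out equal to league-average ERA.
    """
    raw = (13 * lg_hr + 3 * (lg_bb + lg_hbp) - 2 * lg_k) / lg_ip
    return lg_era - raw


def fip(hr, bb, hbp, k, ip, constant):
    """Fielding Independent Pitching for one pitcher.

    `ip` is innings pitched as a true decimal (121 1/3 innings is
    121.333..., not the box-score shorthand 121.1).
    """
    if ip <= 0:
        raise ValueError("ip must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical league totals, chosen only to keep the arithmetic easy.
c = fip_constant(lg_era=4.00, lg_hr=100, lg_bb=300, lg_hbp=50,
                 lg_k=800, lg_ip=1000.0)

# Hypothetical pitcher: 10 HR, 30 BB, 5 HBP, 100 K in 100 innings.
print(round(c, 2))
print(round(fip(hr=10, bb=30, hbp=5, k=100, ip=100.0, constant=c), 2))
```

Feeding the league totals themselves into `fip` returns the league ERA, which is the whole point of the constant.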

Additionally, FIP has been shown to correlate more strongly from year to year than ERA. The reasoning for this is simple: ERA involves much more luck than does FIP. Of course, there are some exceptions. Some pitchers have been shown to consistently record ERAs much lower than their FIPs. For example, R.A. Dickey and Jamie Moyer posted ERAs lower than their FIPs would predict in nearly every year of their careers. These soft-tossing pitchers generally invited fairly weak contact, so balls in play off of them were more likely to result in outs. Despite these exceptions, FIP has generally been shown to be a far more accurate indicator of a pitcher’s success (and quality) than ERA.

A clear example of FIP vs. ERA comes from the Mariners in 2016. James Paxton was clearly the best pitcher in the M’s rotation last year, yet his 3.79 ERA would indicate he was only about average. That number was roughly equivalent to Félix Hernández’s 3.82 ERA in 2016, but no casual observer, nor any 30-year scout, would have watched the two last year and said their performances were equivalent. Félix had a FIP of 4.63 due to his high walk rate and lower strikeout numbers. His ERA was saved largely by hitters having a batting average of just .271 on balls put in play against him (league average is around .300). Paxton, meanwhile, had a FIP of 2.80, or essentially a full run better than his actual earned run average. Hitters were fortunate against him, running a .347 batting average on balls in play (BABIP). Teams with extremely strong defenses (as the Mariners project to have this year) often see their ERAs outperform their FIPs, and poor defenses naturally have the opposite effect. Keep that in mind while watching this team and evaluating future rosters.
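
Since BABIP is doing a lot of work in that comparison, here is a rough Python sketch of how it is usually computed. It assumes the common definition, (H - HR) / (AB - K - HR + SF), which takes home runs and strikeouts out of the picture because neither puts a ball in play for a fielder. The function name `babip` and the sample totals are hypothetical, not Paxton’s or Félix’s actual lines.

```python
def babip(h, hr, ab, k, sf):
    """Batting average on balls in play allowed by a pitcher.

    Numerator: hits that stayed in the park (H - HR).
    Denominator: at-bats that ended with a ball in play, plus
    sacrifice flies (AB - K - HR + SF).
    """
    balls_in_play = ab - k - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play to average over")
    return (h - hr) / balls_in_play


# Hypothetical season line: 150 H, 20 HR, 600 AB, 150 K, 5 SF.
print(round(babip(h=150, hr=20, ab=600, k=150, sf=5), 3))
```

A figure well above roughly .300 hints that more balls in play fell for hits than usual, and one well below hints at the opposite.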

I should probably mention xFIP while we’re here. It’s more or less the same as FIP. Here’s the equation.
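
In its standard published form, where FB is the number of fly balls the pitcher allowed, lgHR/FB is the league average rate of home runs per fly ball, and every other symbol means the same thing it does in FIP:

$$\mathrm{xFIP} = \frac{13 \cdot (\mathrm{FB} \cdot \mathrm{lgHR/FB}) + 3 \cdot (\mathrm{BB} + \mathrm{HBP}) - 2 \cdot \mathrm{K}}{\mathrm{IP}} + C_{\mathrm{FIP}}$$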

As you can see, the only change is in the “home run” area. xFIP stands for “expected Fielding Independent Pitching,” and is based on the assertion that a pitcher’s home run rate is unstable over time, while his fly ball rate is much more stable from year to year. This is shown in that same piece I linked above. xFIP simply replaces the home runs in FIP with the number of fly balls a pitcher gave up multiplied by the league average home run per fly ball rate. Basically, instead of using the home runs a pitcher actually gave up, like FIP does, xFIP uses how many home runs a pitcher “should” have given up.
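
The home run swap is easy to sketch in Python. As before, this is an illustration under stated assumptions rather than anyone’s official implementation: the function name `xfip`, the 10% league HR/FB rate, the 3.10 constant, and the pitcher’s totals are all invented for the example.

```python
def xfip(fb, lg_hr_per_fb, bb, hbp, k, ip, constant):
    """Expected FIP: FIP with actual home runs replaced by
    fly balls allowed times the league HR-per-fly-ball rate.

    `lg_hr_per_fb` is a fraction (0.10 means 10%), and `ip` is
    innings pitched as a true decimal.
    """
    if ip <= 0:
        raise ValueError("ip must be positive")
    expected_hr = fb * lg_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical pitcher who allowed 100 fly balls in 100 innings.
# At an assumed 10% league HR/FB rate he "should" have allowed 10 homers,
# no matter how many actually left the yard.
print(round(xfip(fb=100, lg_hr_per_fb=0.10, bb=30, hbp=5, k=100,
                 ip=100.0, constant=3.10), 2))
```

Two pitchers with identical walks, strikeouts, and fly balls get the same xFIP even if one of them gave up twice as many actual home runs, which is exactly the luck xFIP is trying to strip out.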

You can see in that correlation piece that neither FIP nor xFIP is perfect. Neither one correlates perfectly from year to year. Of course, why should they? Pitchers get better and worse over time. However, the overall theme of these (and nearly all other) metrics is luck and expectation. Baseball is an intrinsically flighty mistress [Ed. note: or mister, or theyster]. We’ll never be able to eliminate all of the noise, but we can do our best. FIP does its best to cut some of the luck out of the equation. ERA doesn’t even try, which is why we encourage phasing it out of usage in any evaluative sense.