Where Fangraphs is wrong about the Mariners

Dustin Ackley is better than Fangraphs' numbers indicate. Tom Wilhelmsen, maybe not so much.

In the world of online baseball statistics, there are really only two big names. Sure, sites like Brooks Baseball, Baseball Savant, and Minor League Central offer great sets of exclusive numbers, but personally I consider them to be supplementary sources to internet sabermetrics' Big Two. You know who I'm talking about: Baseball Reference and Fangraphs.

Each site has its pros and cons; I imagine if you ask a different sportswriter you'll get a different preference. It doesn't help that each site has its own flagship defensive statistic (UZR vs. DRS), offensive statistic (wRC+ vs. OPS+), and pitching statistic (FIP-WAR vs. RA9-WAR). I personally come down on the side of Fangraphs in each case, as well as just generally preferring their website layout, but there's definitely a case to be made in the other direction. In fact, I'm about to make a case in the other direction.

I've been banging this drum for quite a while now, but it's time to bang it some more. When the Mariners moved the fences in two years ago, Fangraphs and BBREF took different approaches to changing their park factors in response. BBREF stuck with its rolling-total methodology, using data from pre-fence-move years in conjunction with the new data to calculate park adjustments for OPS+. Fangraphs decided to take a more intellectually honest route, resetting the park factors to league average and regressing incoming results by up to 90%. Unfortunately, in this case, it does not appear that "intellectually honest" was equal to "more effective".
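To make the difference between the two approaches concrete, here is a minimal sketch in Python. The function names, the sample run indexes, and the exact weighting are illustrative assumptions, not either site's published method; only the "up to 90%" regression figure comes from the description above.

```python
from statistics import mean

LEAGUE_AVERAGE = 100.0  # a neutral park on the usual index scale


def rolling_park_factor(yearly_factors):
    """Rolling-total style: average every season in the window,
    including the seasons played before the fences moved."""
    return mean(yearly_factors)


def regressed_park_factor(post_change_factors, regression=0.9):
    """Reset-and-regress style: drop the pre-change seasons, then pull
    the new observations toward league average by `regression`."""
    observed = mean(post_change_factors)
    return LEAGUE_AVERAGE + (1.0 - regression) * (observed - LEAGUE_AVERAGE)


# Hypothetical single-season run indexes for a pitcher-friendly park.
# These are made-up numbers for illustration, not real Safeco data.
before_move = [92.0, 94.0, 93.0]
after_move = [94.0, 96.0]

print("rolling:  ", rolling_park_factor(before_move + after_move))
print("regressed:", regressed_park_factor(after_move))
```

If the park keeps playing the way it did before the renovation, the rolling number stays close to the truth while the heavily regressed number sits near neutral until more seasons accumulate. If the renovation really does change the park, the opposite holds.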

For two years running now, Tony Blengino's granular-batted-ball park factors have pegged the new-look Safeco Field as one of the worst places to hit in the sport. Especially to left center field, where the meat of the fence move took place, the situation is bad. While moving in the fences may have marginally improved the rate of home runs to left center, the drop in the doubles rate has almost entirely negated that improvement. It just doesn't look like the big offseason renovation has had any impact, and with two years of data now in the books, the odds that 2015 will bring a sharp turnaround are getting long.

OK, so the fence move doesn't look like it did much. We've known that for a while; data indicating such has been available for months. Why is this article being written now? Because now we have an admission of error from Fangraphs' managing editor himself. Dave Cameron, take it away:

Whenever a park radically realigns its dimensions, as the Mariners did to Safeco in 2013, the earlier park factor data becomes somewhat less useful, and so our calculations (based on five years of data) add in a larger share of regression to the mean, since we don’t know if the park will still play like it did before the redesign.

In cases where the realignment doesn’t end up changing things that much, then the result is that our park factors probably overcorrect for the changes. I think there’s a pretty good case to be made that Safeco is still extremely pitcher friendly, and our park factors just haven’t picked up on that after only two years of the "new Safeco" design.

I think, in general, all Mariners pitchers are a little overrated by FG data, and all Mariners hitters are a little underrated, because the park isn’t as neutral as our current PFs suggest.

Now, what Dave doesn't say is that the methodology used by Fangraphs is wrong. He doesn't say that, of course, because he can't, but he also doesn't say that because it isn't true. In many park renovation cases (like when the Mets redid Citi Field around the same time as the Mariners changed Safeco), the effects turn out to be quite significant. In those cases, regressing as Fangraphs did is absolutely the right thing to do, and using old data like BBREF did is absolutely going to generate worse results. This just wasn't one of those cases.

So what's a number-loving Mariners fan to do? Well, you could use BBREF's OPS+ to evaluate the Mariners. Which, incidentally, paints a somewhat different picture of their offense than wRC+ does.

Name wRC+ OPS+
Cano 136 142
Seager 126 127
Miller 86 88
Ackley 97 99
Morrison 110 111
Taylor 103 102
Saunders 126 128
Zunino 86 88
Team 94 96

Better across the board, mostly, with the exception of Chris Taylor. But this isn't a perfect solution, either. For one, OPS+ is a metric with known flaws - mostly in its core of OPS, which overvalues power (hence the Taylor misevaluation). For two, if we limit ourselves to using BBREF's statistics, we're stuck with DRS as our fielding component, which is a little bit less stable than UZR.

A more complicated (but also more accurate) thing to do would be to just refactor Fangraphs' wRC+ and WAR for the difference in park factors. BBREF assigns Safeco a Park Factor of 95, while Fangraphs has it at 97. Using the formulas for Batting Runs and wRC+ from Fangraphs' Library and constants from its guts page, I reevaluated the Mariners' 2014 offensive performance:

Name Old wRC+ New wRC+
Cano 136 139
Seager 126 128
Miller 86 89
Ackley 97 100
Morrison 110 113
Taylor 103 105
Saunders 126 128
Zunino 86 89
Team 94 96
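For anyone who wants to try the same adjustment, the sketch below follows the general form of the wRC+ formula in Fangraphs' Library, where the park term is lgR/PA minus PF times lgR/PA. The function names and the league constants used here are placeholders I've assumed for illustration; the real 2014 values live on the guts page and should be substituted in.

```python
def wrc_plus(wraa_per_pa, lg_r_per_pa, park_factor, lg_wrc_per_pa):
    """wRC+ in the form given by the Fangraphs Library.

    park_factor is a decimal (0.97, not 97).
    """
    park_adjustment = lg_r_per_pa - park_factor * lg_r_per_pa
    return 100.0 * (wraa_per_pa + lg_r_per_pa + park_adjustment) / lg_wrc_per_pa


def reweight_wrc_plus(old_wrc_plus, old_pf, new_pf, lg_r_per_pa, lg_wrc_per_pa):
    """Shift an existing wRC+ from one park factor to another.

    Only the park term changes, so every hitter on the team moves by
    the same amount before rounding.
    """
    shift = 100.0 * lg_r_per_pa * (old_pf - new_pf) / lg_wrc_per_pa
    return old_wrc_plus + shift


# Placeholder league constants (assumed, NOT the actual 2014 values).
LG_R_PER_PA = 0.110
LG_WRC_PER_PA = 0.110

# Move a 97 wRC+ hitter from a 0.97 park factor to a 0.95 park factor.
print(round(reweight_wrc_plus(97.0, 0.97, 0.95, LG_R_PER_PA, LG_WRC_PER_PA), 1))
```

Because the placeholder constants are not the real ones, the output here need not match any row of the table above. What does carry over is the shape of the result: a flat bump of a couple of points for everyone, with the occasional one-point difference between players coming from rounding.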

...and thus their oWAR:

Name Old WAR New WAR
Cano 5.2 5.3
Seager 5.5 5.6
Miller 1.4 1.5
Ackley 2.1 2.2
Morrison 1.0 1.1
Taylor 1.4 1.4
Saunders 1.9 2.0
Zunino 1.7 1.8

As the WAR table should make pretty clear, the distinction here is almost totally academic. WAR isn't a stat designed to have the difference between 0.2 and 0.3 actually mean anything. I could perform the same calculations for pitchers, but you'd see the same results, just in reverse: knock everyone's WAR down by 0.1. On an individual scale, it doesn't really mean all that much.
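The size of that 0.1 can be sanity-checked with a rough sketch. In the Fangraphs Library's Batting Runs formula the park term is (lgR/PA minus PF times lgR/PA) times PA, so swapping park factors moves a hitter's runs in proportion to his plate appearances. The function name and both constants below are assumptions for illustration, not the actual 2014 guts-page values.

```python
def war_shift(pa, old_pf, new_pf, lg_r_per_pa, runs_per_win):
    """Approximate change in a hitter's WAR from swapping park factors.

    Park factors are decimals (0.97, not 97). The change in Batting
    Runs is lgR/PA * (old_pf - new_pf) * PA; dividing by runs per win
    converts it to wins.
    """
    extra_runs = lg_r_per_pa * (old_pf - new_pf) * pa
    return extra_runs / runs_per_win


# Placeholder constants (assumed, NOT the actual 2014 values).
LG_R_PER_PA = 0.110
RUNS_PER_WIN = 9.5

# A full-time hitter with a hypothetical 650 plate appearances.
print(round(war_shift(650, 0.97, 0.95, LG_R_PER_PA, RUNS_PER_WIN), 2))
```

Under those assumptions a full season of plate appearances is worth somewhere between 0.1 and 0.2 wins, and a part-timer gets proportionally less, which fits the pattern in the table.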

On the team scale, though, all those 0.1s add up. If you used Fangraphs' data to evaluate the Mariners last year, you probably got the idea that the team's pitching was a couple wins better than it actually was, and you probably paired that with the idea that the team's batting was a couple wins worse than it actually was. In other words, Fangraphs' Team WAR leaderboards convey the idea that the 2014 Mariners were a great-pitching, poor-hitting team, when actually they were a good-pitching, slightly-below-average-hitting team. This might shift your view of what the Mariners' offseason plans should be, if only a little. For me, it reinforces the team's need to bolster its starting rotation.

The effect here isn't as extreme as it was last year, when Fangraphs' park factor for Safeco was the hyper-regressed 99 instead of its current 97. Last year, park factor analysis was the difference between Kendrys Morales being worth a QO and Kendrys Morales being a decent stopgap DH. This year, the difference is 3 points of wRC+ and 0.1 WAR per player. Still, it's a good difference to keep in mind.

Honestly, in the long term, I think that park factors are probably destined to join OPS on the sabermetric scrap heap of techniques that used to be useful approximations but have since been outmoded. One-size-fits-all PFs ignore the quirks of individual parks; Kyle Seager and Mike Zunino get credited equally for playing in a tough park, even though Zunino (as a right-hander) has the much tougher job. Once everyone has granular batted ball data - which may become available with Statcast - we'll be able to break offensive performance down to speed off the bat and launch angle, and park adjustments will become a thing of the past.

Until then, though, we're stuck with the system we have. May as well get to know it.