The Problem With Sabermetrics

Once upon a time, sabermetrics was an interesting field. Better, it meant something. Those curious about how baseball worked were lifting the veil and understanding the mechanics of the game. New metrics were developed that gave us a better idea of not only what a player was worth but how to puzzle that particular question out. Following the reasoning behind the new wave of baseball statistics was a ride through the logical skeleton of the game. Understand the stats, and you understood baseball. And there was a bevy of talented writers to guide you down that route.

Now, things are more than a little different. Sabermetrics seems to have lost its way.

I was reading a very good book the other day about the development of quantum mechanics. One of the many things that struck me as particularly fascinating was the process of science itself - the delicate balance between experimental results and theory and the way they've interacted over the past century in perhaps the most exciting field in physics.

Sometimes experiment races ahead and the theorists struggle to catch up, and, more rarely, theory moves into areas where experiment can't go. Quantum physics has somehow been in the second zone for about 25 years now, and scientists are spending a lot of time and money trying to come up with experiments to prove or disprove the theories that have sprouted up since then.

Science requires experiment and theory to be on the same page. Without experiment, we see everything dissolve into mathematical chaos. Metaphysics rules the day with nothing to anchor the work being done in the field to any glimmer of reality. But without theory, experimental results are nothing but nonsensical gobbledegook. Real science - good science - is a very tricky balancing act between the theorists and the experimentalists.

It's tempting to appropriate this idea wholesale and drop it into the context of sabermetrics. After all, practitioners of baseball analysis do often compare their work to science. But, despite the claims to the contrary, sabermetrics is most emphatically not science. Proper science requires, at the very least, controlled experimentation, which is something impossible to manage in baseball analysis.

Realistically, whether people want to believe it or not, sabermetrics is a branch of applied philosophy. Without experimentation, all we can do is observe, hypothesise, theorise and then check back with the (flawed) observations we've conducted. That's not good enough for science, and it certainly means we can't reasonably apply the same sort of techniques that serve as checks and balances in the worlds of physics, biology and chemistry.

So if the theory plus experiment model is out, what's actually important? Well... forgive me for sounding exceptionally cheesy here, but the important thing about baseball analysis is baseball.

At its best, sabermetrics flows directly from the innate logic of the game, and then fits the observed data in an agreeable way. Stuff like win probability, linear weights and baseruns aren't really statistical constructs but logical ones. Thinking about the game in a rigorous enough manner gets you to those concepts, whether or not you can do the maths involved to nail down the minutiae. The ideas are what's important, and they all come from baseball.

I suspect someone with a proper understanding of the game and none of mathematics could, after some prodding, give us a fairly excellent definition of regression, but no amount of statistical theory will help a mathematician understand baseball. There's an old story about an attempt to determine the run value of specific hits by running a regression analysis that illustrates this perfectly.

If you're not familiar with the term, a regression analysis essentially involves determining how much certain variables impact a single outcome. Doing one is fairly easy these days (thanks, Excel!), but it used to be quite challenging due to the numberwork that was involved. Anyway, in this case, total team runs in a season were tested against home runs, triples, walks, errors, etc. All sounds good and sensible, right? Wrong. When the numbers came back, it turned out that hitting triples cost a team runs.

Obviously, this result is insane. Triples are innately good things, and one would imagine that teams would be rather glad to hit more of them. What happened in the example above is that the wrong effect was picked up. Teams that hit more triples are generally worse at run scoring on account of being small and speedy instead of power-hitting monsters, but the analysis blamed the triples for that rather than treating it as a symptom. This is the sort of thing that happens when you start by picking up the wrong end of the stick.
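
For anyone who wants to see that mechanism in action, here is a minimal sketch in Python. It is a toy simulation, not a reconstruction of the original study: the latent "power" trait, every coefficient and every noise level are made-up assumptions, and the names simulate_team_seasons and fit are hypothetical helpers. Each triple is built to add one run, yet a regression that looks at triples alone should still hand them a negative value, because triples travel with a speedy, low-power style of team.

```python
import numpy as np


def simulate_team_seasons(n=2000, seed=0):
    """Generate made-up team seasons in which every triple truly adds runs."""
    rng = np.random.default_rng(seed)
    # Hypothetical latent trait: 0 = small and speedy, 1 = power-hitting.
    power = rng.uniform(0.0, 1.0, n)
    # Speedy teams hit more triples; power teams hit more home runs.
    triples = 45 - 25 * power + rng.normal(0, 3, n)
    home_runs = 100 + 120 * power + rng.normal(0, 10, n)
    # Assumed true run values: +1.0 per triple, +1.4 per home run.
    runs = 400 + 1.0 * triples + 1.4 * home_runs + rng.normal(0, 15, n)
    return triples, home_runs, runs


def fit(runs, *predictors):
    """Ordinary least squares; returns [intercept, coef_1, coef_2, ...]."""
    X = np.column_stack([np.ones(len(runs)), *predictors])
    coefs, *_ = np.linalg.lstsq(X, runs, rcond=None)
    return coefs


triples, home_runs, runs = simulate_team_seasons()

# Data-first: regress runs on triples alone and read off the coefficient.
naive_triples = fit(runs, triples)[1]
# Baseball-first: account for the team's power as well.
full_triples = fit(runs, triples, home_runs)[1]

print(f"triples coefficient, triples only:   {naive_triples:+.2f}")
print(f"triples coefficient, with home runs: {full_triples:+.2f}")
```

Under these assumptions the first coefficient comes out negative, while the second recovers a positive value close to the one that was built in. The triples never changed; only the question being asked of them did.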

Proper sabermetrics is something that has to come from the top down (baseball-driven) rather than the bottom up (mathematics- or data-driven), and losing sight of that causes a whole host of issues that are plaguing the field at present. Every single formula must be explainable without recourse to ridiculous numbers. Every analyst must be open to thinking about the game in new ways. Every number, every graph in a sabermetric piece must tell a baseball story*, because otherwise we're no longer writing about the sport but indulging in blind number-crunching for its own sake.

*"This author really likes baseball numbers and graphs" doesn't count.

Surveying the field, I no longer believe that those essential precepts hold sway over the sabermetric community. Data analysis methods are being misapplied and sold to readers as the next big thing. Articles are being written for the sake of sharing irrelevant changes in irrelevant metrics. Certain personalities are so revered that their word is taken as gospel when fighting dogma was what brought them the respect they're now given in the first place. Sabermetrics is in a sorry state.

How do we fix it? Well, the answer seems simple. Sabermetrics shouldn't be so incomprehensible that it fails to call up the smell of fresh-mown grass in midsummer, the crack of the ball off the bat, or the blur of seams as an outfielder whips a throw in towards his cutoff man. Statistics shouldn't be sterile and clean and shiny and soulless. They shouldn't just be about baseball; they should evoke it. Otherwise, they run the risk of losing the language which makes them so special.