
The Problem With Sabermetrics

Once upon a time, sabermetrics was an interesting field. Better, it meant something. Those curious about how baseball worked were lifting the veil and understanding the mechanics of the game. New metrics were developed that gave us a better idea of not only what a player was worth but how to puzzle that particular question out. Following the logic behind the new wave of baseball statistics was a ride through the logical skeleton of the game. Understand the stats, and you understood baseball. And there was a bevy of talented writers to guide you down that route.

Now, things are more than a little different. Sabermetrics seems to have lost its way.

I was reading a very good book the other day about the development of quantum mechanics. One of the many things that struck me as particularly fascinating was the process of science itself - the delicate balance between experimental results and theory and the way they've interacted over the past century in perhaps the most exciting field in physics.

Sometimes experiment races ahead and the theorists struggle to catch up, and, more rarely, theory moves into areas where experiment can't go. Quantum physics has somehow been in the second zone for about 25 years now, and scientists are spending a lot of time and money trying to come up with experiments to prove or disprove the theories that have sprouted up since then.

Science requires experiment and theory to be on the same page. Without experiment, we see everything dissolve into mathematical chaos. Metaphysics rules the day with nothing to anchor the work being done in the field to any glimmer of reality. But without theory, experimental results are nothing but nonsensical gobbledegook. Real science - good science - is a very tricky balancing act between the theorists and the experimentalists.

It's tempting to appropriate this idea wholesale and drop it into the context of sabermetrics. After all, practitioners of baseball analysis do often compare their work to science. But, despite the claims to the contrary, sabermetrics is most emphatically not science. Proper science requires, at the very least, controlled experimentation, which is something impossible to manage in baseball analysis.

Realistically, whether people want to believe it or not, sabermetrics is a branch of applied philosophy. Without experimentation, all we can do is observe, hypothesise, theorise and then check back with the (flawed) observations we've conducted. That's not good enough for science, and it certainly means we can't reasonably apply the same sort of techniques that serve as checks and balances in the worlds of physics, biology and chemistry.

So if the theory plus experiment model is out, what's actually important? Well... forgive me for sounding exceptionally cheesy here, but the important thing about baseball analysis is baseball.

At its best, sabermetrics flows directly from the innate logic of the game, and then fits the observed data in an agreeable way. Stuff like win probability, linear weights and baseruns aren't really statistical constructs but logical ones. Thinking about the game in a rigorous enough manner gets you to those concepts, whether or not you can do the maths involved to nail down the minutiae. The ideas are what's important, and they all come from baseball.

I suspect someone with a proper understanding of the game and none of mathematics could, after some prodding, give us a fairly excellent definition of regression, but no amount of statistical theory will help a mathematician understand baseball. There's an old story about an attempt to determine the run value of specific hits by running a regression analysis that illustrates this perfectly.

If you're not familiar with the term, a regression analysis essentially involves determining how much certain variables impact a single outcome. Doing one is fairly easy these days (thanks, Excel!), but it used to be quite challenging due to the numberwork that was involved. Anyway, in this case, total team runs in a season were tested against home runs, triples, walks, errors, etc. All sounds good and sensible, right? Wrong. When the numbers came back, it turned out that hitting triples cost a team runs.

Obviously, this result is insane. Triples are innately good things, and one would imagine that teams would be rather glad to hit more of them. What happened in the example above is that the wrong effect was picked up. Teams that hit more triples are generally worse at run scoring on account of being small and speedy instead of power-hitting monsters, but the analysis blamed the triples for that rather than treating it as a symptom. This is the sort of thing that happens when you start by picking up the wrong end of the stick.
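
To see how a regression can pick up the wrong effect, here is a minimal Python sketch. It is a toy, not a reconstruction of the study in the story: the 30 teams, the hidden `power` trait, and the run values of 1.4 per home run and 1.0 per triple are all assumptions made up for illustration. Every triple is worth a full run by construction, yet regressing runs on triples alone returns a negative slope, because the teams with the most triples also hit the fewest home runs. Putting home runs in the model alongside triples recovers the values that were built in.

```python
# Toy illustration only: every number below is invented, not real team data.

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Population covariance; cov(xs, xs) is the variance."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def simple_slope(x, y):
    """Least-squares slope of y regressed on x alone."""
    return cov(x, y) / cov(x, x)

def two_predictor_slopes(x1, x2, y):
    """Least-squares slopes of y regressed on x1 and x2 together."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

N_TEAMS = 30  # assumed league size

# Hypothetical hidden trait: 0.0 = small and speedy, 1.0 = power-hitting monsters.
power = [i / (N_TEAMS - 1) for i in range(N_TEAMS)]

# Assumed relationships: more power means more home runs and fewer triples.
# The alternating +3/-3 keeps triples from being a perfect mirror of home runs.
home_runs = [100 + 100 * p for p in power]
triples = [40 - 20 * p + (3 if i % 2 == 0 else -3) for i, p in enumerate(power)]

# By construction, every triple ADDS one run and every home run adds 1.4.
runs = [400 + 1.4 * hr + 1.0 * t for hr, t in zip(home_runs, triples)]

# Bottom-up: regress runs on triples and nothing else.
naive = simple_slope(triples, runs)

# Baseball-driven: include home runs so triples stop standing in for low power.
adj_triples, adj_hr = two_predictor_slopes(triples, home_runs, runs)

print(f"runs per triple, triples only:      {naive:+.2f}")
print(f"runs per triple, with home runs:    {adj_triples:+.2f}")
print(f"runs per home run, with triples:    {adj_hr:+.2f}")
```

The maths did nothing wrong in the first fit; it answered a question nobody meant to ask. Knowing the game is what tells you which of the two models to trust.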

Proper sabermetrics is something that has to come from the top down (baseball-driven) rather than the bottom up (mathematics/data driven), and to lose sight of that causes a whole host of issues that are plaguing the field at present. Every single formula must be explainable without recourse to using ridiculous numbers. Every analyst must be open to thinking about the game in new ways. Every number, every graph in a sabermetric piece must tell a baseball story*, because otherwise we're no longer writing about the sport but indulging in blind number-crunching for its own sake.

*"This author really likes baseball numbers and graphs" doesn't count.

Surveying the field, I no longer believe that those essential precepts hold sway over the sabermetric community. Data analysis methods are being misapplied and sold to readers as the next big thing. Articles are being written for the sake of sharing irrelevant changes in irrelevant metrics. Certain personalities are so revered that their word is taken as gospel when fighting dogma was what brought them the respect they're now given in the first place. Sabermetrics is in a sorry state.

How do we fix it? Well, the answer seems simple. Sabermetrics shouldn't be so incomprehensible that it fails to call up the smell of fresh-mown grass in midsummer, or the crack of the ball off the bat, or the blur of seams as an outfielder whips a throw in towards his cutoff man. Statistics shouldn't be sterile and clean and shiny and soulless. They shouldn't just be about baseball; they should evoke it. Otherwise, they run the risk of losing the language which makes them so special.

Comments

This itches at the sort of direction I've taken with stats recently.

There are complicated methods of trying to measure subjects like defense, park effects, etc, but I’ve been trying to find and explain ways that are (I hope) baseball intuitive at their core. Personally, I’ve always likened sabr to a sibling of applied economics.

by Matthew on Jul 20, 2011 11:07 PM PDT reply actions  

Glad you said it.

Sabermetrics deals with the same problem that economists/econometricians face.

How do you measure results when you cannot use controlled experiments?

You collect all of the observations you can and isolate those that are your “experiment”. Sort of like the minimum wage change in New Jersey (or was it Pennsylvania?).

When doing so, you can’t just throw regressions at numbers and then infer the meaning, relying on R^2 to tell you whether you have a good fit — the point of the regression is to test a theory. I theorize that A results from B, C, and D for reasons X, Y, and Z. If the results do not make sense with the theory or intuitive sense (triples reduce runs?), then you have to consider what the results are telling you, consider whether the theory still makes sense, and revise your model and/or theory in response.

by Trickman on Jul 21, 2011 7:58 AM PDT up reply actions  

I would agree and add that a good question, in addition to "does this make sense with theory?" is

Does the theory or model have much explanatory power for data sets other than the one I developed it on?

by Mike Fast on Jul 21, 2011 8:04 AM PDT up reply actions  

Bingo.

I have a series about psychology I’m writing up and one of the tenets is the big problem with finding your hypothesis from within the data analysis.

by Matthew on Jul 21, 2011 10:12 AM PDT up reply actions  

I would enjoy reading that.

It seems to me that one of the more common mistakes I see being made by MDs venturing into the world of hypothesis-driven research falls into this area.

"I’d love to walk in and hug everybody every day, but that’s not critical to us winning." - Jon Daniels

by GhettoBear04 on Jul 25, 2011 4:13 PM PDT up reply actions  

The same quandary occurred when I was studying Astrophysics.

My professors described Astrophysics to me as the only physical science where it was impossible to do experiments. To test hypotheses, you needed to predict a natural event and then wait to see if you’d predicted it correctly.

Hell, even relativity gets experiments like Gravity Probe B. What do we get?

So, I’m not so willing to say that baseball analysis isn’t science, because that would mean that Astrophysics isn’t science, and I think Astrophysics is science.

Though, I’m also a fairly strict Popperian, so that might be relevant.

I like using semi-colons; they make me feel smart.

by Llewdor on Jul 21, 2011 4:43 PM PDT up reply actions  

The Science/Not-Science thing is a false dichotomy

It’s not really that there’s “Science” and “Not Science.” It’s that different disciplines have different ways of getting at truth. Sometimes you have the ability to set up experiments and do observations. Other disciplines (economics is the prime candidate) you have to rely on “natural experiments” to arrive at your learning. Sabermetrics is a lot like economics that way.

But when you’re relying on natural experiments you have to be a whole lot more careful with your data. Because you can find things that are not true. (Like: Triples don’t help you score runs. Or ice cream causes drowning deaths.)

(Disclosure: I do chemistry for a living, in an industrial setting.)

by robbbbbb on Jul 25, 2011 7:35 AM PDT up reply actions   1 recs

Overall, I like what you wrote

But I’m not sure that I understand what you’re actually driving at. It sounds like an anti-stat piece, only written by a guy who understands and appreciates, well, stats: it’s like saying what we read on BRef player cards doesn’t embody enough of the soul of the game anymore, but thank goodness it’s there anyway.

What, exactly, would be the solution? Or is there no solution; viz, the saber movement has simply swung too far away from the mainstream and we need to collectively chill? And if that’s the case, then it opens a discussion about whether authors — many of whom were inspired by and now produce straight number-crunching articles — should change what they write, or if audiences should change what they want to consume.

I guess the natural compromise would be ’don’t play by (or write, or tweet according to) the numbers all the time,’ but … I suspect that’s somewhat less impactful of a conclusion than what you were going for.

@PadmanJones

by Paddy McMahon on Jul 20, 2011 11:19 PM PDT reply actions  

Baseball-Reference is fantastic

The vast majority of the numbers on there are meaningful, and they translate back to the game itself in a fairly direct way. I love that sort of thing. I love wOBA (not as much as baseRuns, though). I love Matthew’s work on team defence. But I am seeing a lot of people start analysis by looking at the numbers and never really relating that back to baseball, and it’s gotten worse and worse lately.

So, the piece is (I hope) definitely not anti-stats. Instead, it questions just why stats/analyses that don’t advance our understanding of the game are so prevalent.

by Graham MacAree on Jul 20, 2011 11:25 PM PDT up reply actions  

Out of all the articles I have read on why sabermetrics is "ruining" baseball, this is the most thought-out

Rather than flat out denying the concept, you wrote this in a way that made me think about things instead of just being baffled by the writer’s stupidity. I still hate the idea of Pitcher Wins and RBIs and Runs Scored and all that, although this made me think of the fact that these are real humans playing out there, not computers that generate numbers that we can all dissect. Correct me if I’m wrong, but I have noticed that the stereotypical sabermetrician flat out denies the existence of “clutch”. This is wrong. It’s a whole different mindset batting up there when there are runners on first and second and two outs instead of the bases being empty and two outs, which is why I love the Fangraphs “high leverage” and “low leverage” numbers. I think those types of things are places where sabermetrics does not stray too far from the casual fan’s perspective, and stats like that keep sabermetrics connected to the human element of the game. By the way, does anybody know if you can find a “high leverage” or “low leverage” leader board on Fangraphs?

by MilesC on Jul 20, 2011 11:32 PM PDT reply actions  

I don't think sabermetrics is ruining baseball, and I don't think that's Graham's point either (correct me if I'm wrong).

I think the issue that Graham is bringing up is a crucial one — and one that should be at the forefront of our minds. He’s questioning the misapplication of sabermetric thought, if that makes sense. Instead of theorizing and advancing our understanding of the game a la 3-4 years ago, we’ve settled for little to no research and more spewing out of what we already know. This has resulted in number-crunching for the sake of number-crunching, and the results don’t tell us anything. If you look around the blogosphere, you’ll see articles written that don’t tell you much if anything about a topic. They could just say “look at his tRA, he’s a great pitcher this year but he should regress” or “this guy’s tAV is below the Mendoza line; why is he playing?”

Graham thinks that sabermetrics as a group has strayed from what it was, and I kind of agree with him. We can’t be thinking statistics-driven first. We have to take baseball for what it is, and logically follow what we know about baseball into our understanding of its statistics, not the other way around.

"Satisfaction is the enemy of success." SanFranPreps Twitter: @d_quazzo

by perfectstrat on Jul 21, 2011 12:29 AM PDT up reply actions   1 recs

You lost me at
I was reading a very good book the other day about the development of quantum mechanics between the early 1900s and the present.

Kidding. The last paragraph really invoked my own feelings on why I often tune out sabermetrics. It really can’t be divorced from the actual baseball experience.

by Kirsten Schlewitz on Jul 20, 2011 11:50 PM PDT reply actions  

And you too!
And it really does emphasise how lucky Mariners fans are to have Jeff and Matthew

Though I do miss mean Graham a little.

"Satisfaction is the enemy of success." SanFranPreps Twitter: @d_quazzo

by perfectstrat on Jul 21, 2011 12:46 AM PDT up reply actions  

I do want to emphasise that this is my stance as well.

I still believe that at its best, sabermetrics is interesting, worthwhile and educational. I disagree wholeheartedly with anyone who thinks stats are useless or make the game less fun. However, I also disagree with the direction more recent sabermetric analysis is going, for the reasons I explained in the piece.

I guess all of the above is more obvious if you know about my history with this stat work (obviously you do, Matthew), but I seem to have confused some of the people who weren’t really around for the bulk of my LLing. Hello new people! Much love!

by Graham MacAree on Jul 21, 2011 1:00 AM PDT up reply actions  

My reading of sabr-minded blogs has certainly colored how I see other areas of life.

Personally, I think I’ve always been brought up to have a skeptical approach to the world. This probably suggests why analytical blogs appealed to me in the first place. What your writing and the writings of others have bolstered, though, is more how to question things, namely consideration for all contributing factors. I think of it as Seeing ERA, Crediting Defenders.

Concepts like regression to the mean and replacement levels have also bled into my views of politics, society, and pop culture, giving me yet another angle from which to see and possibly better understand things.

by iheartjavelinas on Jul 21, 2011 11:13 AM PDT up reply actions  

Yes,

once you start thinking about choice in probabilistic terms, it becomes difficult to not apply that to other areas. Though, it does seem to me that having a healthy skepticism of Jenny McCarthy’s claims doesn’t exactly require an analytical mind.

"I’d love to walk in and hug everybody every day, but that’s not critical to us winning." - Jon Daniels

by GhettoBear04 on Jul 25, 2011 4:29 PM PDT up reply actions  

It's funny

I feel like plenty of us cared mightily about it when sabermetrics was initially introduced because it added a new layer (of sorts) to the game.

Now, however, sabermetrics is much more about itself and less about baseball. I’ve been suffering from the same apathy you described – it’s gotten less and less interesting to me, to the point where a couple months ago I just stopped trying to read about all the ‘new’ data. I just took what I already knew, and continued to apply it to the game of baseball.

It feels like it is continually trying to expand itself simply for the sake of expanding, rather than expanding in an effort to help people better understand what goes on during a baseball season. And, once that direction is lost, newly discovered sabermetric methods no longer serve a purpose for most of us.

by cwel87 on Jul 22, 2011 4:58 AM PDT up reply actions  

I think there's just a lot more sabermetrics these days in general

And I would grossly simplify the situation by dividing it into two groups. One is the group that wants to take the research that’s been done and evangelize with it. The other is the group that wants to keep researching to learn more about the game.

There is a lot more research over the last few years into very fundamental concepts of the game (e.g., what causes the platoon differential, what causes flyballs/groundballs, etc.) than there was 5 or 10 years ago.

There is also a lot more widespread application of the older ideas (e.g. DIPS, fielding metrics based on granular batted ball data), and this shows up everywhere because of fantasy baseball interest.

I mostly don’t find sabermetrics interesting unless I’m doing it for myself. That may be what you’re saying, too. That doesn’t mean I don’t enjoy reading other people’s work, but it’s my own research that gives me the most enjoyment and the most interest in learning more. One of the things I’ve wanted to know for a long time is how much the hit-and-run play affects our valuation of stolen bases and caught stealing. At some point I’ll dig in and mine the data for that answer. There are a ton of other unanswered questions about the game where because of Retrosheet or PITCHf/x the data is now available to find an answer.

I see it as a really exciting time for sabermetrics, even though there may be a lot more fluff in the field than there was a few years back.

by Mike Fast on Jul 22, 2011 6:23 AM PDT up reply actions  

Yes, yes it is.

But I think on the whole it’s been quite good. It gives a lot of young folks the chance to cut their teeth on baseball analysis in front of everyone, which isn’t always very pretty, but for the ones who stick around and up their game, we all get to reap the benefits.

I wrote some poor and wrong-headed stuff when I was starting out, too, and I at least had the benefit of already being in my 30s and having made some of my stupider mistakes in private. Hopefully I’m doing less of that now. I don’t know any remedy for poor writing other than more writing practice and good feedback, and I don’t know any remedy for bad analysis other than continuing to grapple with the data until you really learn what it does and does not tell you, as well as good feedback from other analysts, too, of course. All of that takes time.

I don’t find it very helpful to pick one or two sites and read through all the articles every day as if each one of them should tell me something new and fascinating about the game. If there were really 10 new insights every day, we’d know so much about baseball we wouldn’t know what to do with ourselves. Those kinds of insights take time. I find that there are, oh, I don’t know, a dozen or so analysts whose articles I make a point to read every time they write one. Everything else I skim when I’m in the mood or bored, and I don’t expect too much from it. Then sometimes I’m pleasantly surprised.

by Mike Fast on Jul 22, 2011 7:28 PM PDT up reply actions  

I like your point and it's important you brought it up, but I don't think the problem lies with sabermetrics itself.

I believe the problem you speak of lies, as with most problems, with humans themselves. The philosophy that once was is still here. There’s nothing preventing us from thinking like we did; questioning the theory and application behind a new or existing stat. Instead of using statistics to understand baseball statistics, we should use baseball again. Why don’t we?

"Satisfaction is the enemy of success." SanFranPreps Twitter: @d_quazzo

by perfectstrat on Jul 21, 2011 12:37 AM PDT reply actions  

One of the things I think is interesting has popped up this year.

Maybe… about 2 years ago or so, there was a pretty widespread belief that the eyes lie, and the stats are telling us what we are unable to see. Nowadays, people are bringing up stats (most notably the defensive stats, but not exclusively), and correcting those stats with what they see, because people are no longer convinced that stats are necessarily a better judge of skill than the eye test – provided that the person knows what to look for. With defensive stats, people now look closer at things like range, and less at things like errors. Sabermetrics successfully taught people that defense and pitching were being evaluated incorrectly by pointing out the issues that they have, which taught the eye what to look for, which allowed the eye to become a better measure of talent than it used to be, and has allowed people to doubt other aspects of the sabermetrics.

It’s been a nifty little circle.

...and now I'm here

by CapSea on Jul 21, 2011 2:21 AM PDT reply actions   6 recs

I agree with most of what you wrote here

But I have a few questions and thoughts.

1. I’ve never really been able to put a finger on what discipline or label I would put on sabermetrics. For one thing, it has a number of facets and draws from many diverse disciplines, from math to physics to psychology to economics. I find your label of “applied philosophy” to be helpful, but I’m not sure it’s all-encompassing, either. Nor does it really do justice to the specific nature of what is being done with logic. As I mentioned on Twitter, I think astronomy might also be labeled “applied philosophy” under that definition. I do like your emphasis that everything flows from the game and how the game works.

2. There is some actual experimentation being done in baseball research, in baseball aerodynamics and pitcher kinematics at least, and perhaps other areas. We wouldn’t have PITCHf/x if that hadn’t been the case.

3. Some specific examples of what you believe advances or doesn’t advance our understanding of the game would be helpful, especially if they are recent. I imagine you are reluctant to criticize specific authors or articles, but even so, if you could give some examples of recent work that does illuminate the game, that might put some flesh to the skeleton you’ve so elegantly constructed here.

The reason I say that is that people have various beefs with sabermetrics, and if they don’t have a history with you, it would be very easy to read their own beefs into this article. Or if they have a sabermetric viewpoint already, to read the beefs of their critics into this article.

by Mike Fast on Jul 21, 2011 7:25 AM PDT reply actions  

1. I have a lot of respect for philosophy, done right, so I don't necessarily understand the issue there. That may be a semantics/definition thing.

2. There is, certainly. But neither aerodynamics nor kinematics really fall under the umbrella of sabermetrics as we really know it, unless the definition has changed while I’ve been off playing soccer writer (entirely possible). The focus of the piece is on statistical work, not pf/x and the like.

3. One (good) piece of work that’s stuck in my head over the last year was Colin Wyers’ work on how accurate our ball in play data is. Fundamental, discipline-shaking stuff that was both statistically rigorous and never lost that connection between baseball and the data we’re getting out. I’m not going to get into specific stuff I don’t like.

by Graham MacAree on Jul 21, 2011 10:34 AM PDT up reply actions  

Thanks, Graham

1. My point about astronomy is not a criticism of philosophy, it’s that basically the whole discipline of astronomy relies on observational data over which it has no control rather than controlled, repeatable experiments. That’s basically the same situation we find ourselves in with analyzing sports. One can go out in the back yard and test a few ideas with a bat and ball, but for most baseball questions that’s not an option, and one must use whatever unrepeatable events the athletes stage.

2. I’m thinking of sabermetrics = “the search for objective knowledge about baseball”, under which definition learning about why the baseball moves how it does, either from the pitcher or off the bat, is very important. I understand that wasn’t the focus of your criticism. However, there is plenty of awful PITCHf/x-related analysis out there. It suffers from all the same problems that you point out above.

3. Thanks for the example. It really frustrates me how much the sabermetric community has pushed back against Colin for that effort as if he is some destroyer of knowledge rather than taking his criticisms to heart and trying to evaluate their validity. (Sean Smith being an exception to that.)

by Mike Fast on Jul 21, 2011 10:56 AM PDT up reply actions  

To give a specific example of what I am thinking about with my point #2

Freddy Garcia threw a splitter this year that moved in a very unusual way. It happened to be captured on slow-mo video. The physics community has come up with a theory for why it moved the way it did, and some experiments are planned to see if we can reproduce that effect.

It seems to me that that sort of understanding, broader than just that pitch but applied to how the baseball moves and bounces in every interaction in the game, is rather important for understanding how the game works. With better understanding of the physics of the game, we can produce better statistical models, too.

by Mike Fast on Jul 21, 2011 11:01 AM PDT up reply actions  

I think, obviously, that there are differences between baseball analysis and astronomy

Astronomy as a science tends to be an upscaling of actual science tested down on the ground. We assume that the rules up there are the same as the rules down here, and we apply those rules to up there. In baseball, we don’t have a lab testing ground for most of our concepts.

by Graham MacAree on Jul 21, 2011 11:13 AM PDT up reply actions  

We do too!

It’s called T-Ball.

I write for Stumptown Footy, SB Nation's Portland Timbers blog.

by thehemogoblin on Jul 21, 2011 11:19 PM PDT up reply actions  

Hear hear!

The balls-in-play data quality argument of the past year is incredibly important, and goes a long way towards disproving a lot of the overheated criticism of sabermetrics, and, I’d argue, inflicts a sizeable dent in your theory.

Really, this gets down to how you define “science” – but as Mike points out, we’ve got best-fittism, “is this sustainable?” pieces being published alongside cutting-edge pitch fx stuff, interesting analysis of hit fx data, and really meaty discussions of data quality. I see people questioning what we know in fairly strident terms. That may not map 100% to your definition of “science” but it’s something, and it’s pretty good.

There’s so much stuff being published because there’s suddenly a market for it. Fantasy baseball is most of it, gamblers may be another segment, but even if this is 90% of what’s out there, it doesn’t discredit the entire enterprise. The 17th century saw Newton and Kepler, a ton of angels-on-the-head-of-a-pin philosophizing, and phlogiston theory.

by marc w on Jul 22, 2011 11:28 PM PDT up reply actions  

I don't know

The article generalises, certainly, but if 90% of what comes out is garbage, do I need to not generalise? There are still good analysts doing good analysty things, but there’s a lot of terrible analysis and most consumers can’t tell the difference.

by Graham MacAree on Jul 23, 2011 12:12 PM PDT up reply actions  

I think you're right on all points there.

I’m less sure what to do about it.

Articles that are critical of other people’s analysis get major negative feedback. Just look at Colin’s piece on discontinuing SIERA at BPro, for example.

My “Internet Cried” article on PITCHf/x analysis mistakes was pretty well-received, but anything I’ve written that’s been critical of BIS or UZR has come under major fire from other sabermetricians.

That doesn’t mean I won’t write those sorts of pieces again, but when I’m doing this as a hobby, I’ll admit that sometimes I don’t feel like enduring criticism for simply investigating and telling the truth about what I find.

by Mike Fast on Jul 23, 2011 12:38 PM PDT up reply actions  

I don't get that, at all

Obviously my work on pitching is one of the things most impacted by BIP metrics turning out to be… far less good than we thought. But that’s a good thing, surely? I did my work, did a good job on it, and if it turns out wrong because of bad data, so what? We learn and move on.

Being wrong about something you’ve worked on is a blessing, not a curse, and people are so invested in being right that that gets lost. It’s crazy.

by Graham MacAree on Jul 23, 2011 3:12 PM PDT up reply actions   8 recs

That's true

When something I’ve worked on has turned out to be wrong, I’ve felt a little stupid at first, but in the long run, I’ve learned so much from it, it is definitely worth it.

If people don’t take it that way, it’s their loss, I guess. And if I’m wrong about their being wrong, I’d rather them show me that than just yell at me.

by Mike Fast on Jul 23, 2011 3:31 PM PDT up reply actions  

This is a great comment, Graham.

Even if it admits “Why Graham is Probably Wrong.” :)

by Decatur on Jul 29, 2011 5:43 AM PDT up reply actions  

That most consumers can't tell the difference is irrelevant

(If true; I don’t know.)

Yes, there are hundreds more saber pieces published nowadays, and many probably wouldn’t fall under even the most ecumenical definition of science. But I just don’t know that this means that the entire enterprise is in a sorry state. At some point 10 or 20 years ago, there seemed to be a schism in, well, I don’t really know what to call it… people who used baseball data. Elias Sports Bureau sort of “won” and to this day, many people think that sabermetrics produced things like WHIP and batting-average-in-night-games-on-thursdays-versus-lefty-Dominicans. Does that mean nothing happened in the past 20 years?

I’m sympathetic to your argument, but I still think it’s a really exciting time. I don’t have any use for fantasy baseball tools, and I don’t much care if 90% of what comes out is geared at that audience. It’s actually pretty easy to tune out. You can still find important discussions and really innovative work. 1000 other people doing something else doesn’t invalidate that, just as every position player on the Mariner 40 man doesn’t somehow invalidate Dustin Ackley.

I kind of hope the fight SIERA’s stirred up produces another schism, or a realignment to use a less weighted term.

by marc w on Jul 23, 2011 10:44 PM PDT up reply actions  

No, I would say that Dustin Ackley is good, and just because 10 other guys can't hit

doesn’t mean that there’s no actual baseball being played in Seattle.

I sympathize here, but I think the problem is with gatekeeping, and gatekeeping is the way it is because of fantasy (and maybe betting, but I don’t really know about that). That means we’re more aware of bad or insignificant pieces now, but I don’t think the publishing/posting decisions of a group of people (even a large group) invalidates what another, smaller, group is doing.

I will say this: one thing that gets me worried is the desire of so many people to have all of sabermetrics, whatever that means, resolve some fundamental question. For there to be ONE measure of WAR, or ONE defensive metric. That we haven’t got there, and that there really isn’t a lot of movement towards that (BP going to FIP aside) is a really good sign. I don’t want people to compromise on one WAR measure just for the sake of ESPN. I want people constantly questioning the assumptions in fWAR or rWAR, and hey, that’s what we’ve got. Those debates just happen out of the spotlight, and maybe that’s OK. Your complaint that what the spotlight illuminates isn’t important is true as far as it goes, but it doesn’t prove the thesis.

Fuck, I sound optimistic and naive.

by marc w on Jul 24, 2011 9:56 PM PDT up reply actions  

Sorry for being terse here, but I by no means intended to say that there's no good sabermetrics being conducted

Personally, I think the debate you mention is the interesting, illustrative part of what researchers do, and the results… well, they’re useful, but they’re not enlightening.

by Graham MacAree on Jul 24, 2011 10:10 PM PDT up reply actions  

I think I would say Sabermetrics is closest to behavioral economics

You generate theories and test them with empirical data. Then again, there is also a lot of physics involved, which would complicate the matter.

by vivaelpujols on Jul 26, 2011 5:52 PM PDT up reply actions  

Graham, I pull two arguments out of your writing.

Tell me if you didn’t intend both or don’t see a difference.

1. The “science” of sabermetrics has problems. Conclusions may be useless or even wrong, and we’re not as worried about that as we should be.

2. The findings and applications of sabermetrics aren’t entertaining enough. Stories are too dry and/or too separated from baseball.

by Sky Kalkman on Jul 21, 2011 7:34 AM PDT reply actions  

As for #2, recent applications of sabermetrics have certainly lost their luster in my opinion

For instance I still read Fangraphs occasionally but mostly what it is now is people pointing to someone experiencing poor luck/good luck and telling us why that will normalize. I get that as a general concept, I got it years ago, yet things haven’t really progressed beyond that point. I think we have reached or are getting close to the point of diminishing returns for applying statistics to baseball, at least until major innovations are made in measuring stuff, and I think that’s part of what Graham was saying too. Things are getting too granular because the major concepts are pretty well hashed out now. Not that the major concepts weren’t good discoveries and not that they don’t remain useful, but there’s not much room for growing the field if the focus remains so heavily anchored in statistical analysis.

by OlSalty on Jul 21, 2011 8:13 AM PDT up reply actions  

I would add that most of the Fangraphs articles aren't analysis.

It’s “This guy has been way better than expected. Is this his new baseline?” and the answer is always “Maybe.” The analysis is limited to a few stats and there is rarely depth beyond the “Maybe.”

by abender20 on Jul 21, 2011 10:22 AM PDT up reply actions   5 recs

Ignoring any specific site, I agree that there are a ton of these articles.

And I’m not innocent, it’s a lot of what I used to write and what I still sometimes tweet about. Maybe part of this issue is that we haven’t done a good enough job of moving on to the next thing.

And a lot of the next things take more knowledge and skills than things we’ve already addressed.

by Sky Kalkman on Jul 21, 2011 12:37 PM PDT up reply actions  

To clarify, certainly not all of the next things need crazy sql or physics knowledge.

It’s a cliche, but a lot of learning is simply the creativity and flexibility to frame things differently, taking a different point of view and adding a little bit more information.

by Sky Kalkman on Jul 21, 2011 12:39 PM PDT up reply actions  

Or as a professor of mine in college used to say

“Learning is not linear”. It’s about wandering around, seeing how what you know can be applied to other things, and advancing in small bits most of the time, not so much learning A, B, and C and then forcing yourself to learn D through Z because that’s the “next logical step”.

by pdb on Jul 21, 2011 12:45 PM PDT up reply actions  

Good point.

We find lots of cool stuff by standing on the shoulders of giants. But you can also find cool stuff by jumping on a flying unicorn (other than the flying unicorn.)

by Sky Kalkman on Jul 21, 2011 12:47 PM PDT up reply actions  

While I understand the frustration with FanGraphs not being enough on the leading edge,

I wonder if that’s truly a bad thing? To me, people like tango and others are the ones who are pushing the field forward in interesting ways. Reading his stuff is incredibly thought-provoking to me because it causes me to actually think about the game in different ways.

To me, FanGraphs is the place for people who are slowly moving away from traditional stats and want to know about how the most famous and new, up-and-coming players are doing and how they are likely to do in the future. After familiarization with the approach, the topics become somewhat routine (boring?) as individuals can do the same level of analysis by themselves more quickly. But until that familiarization occurs, I think I am happy that there is a place like FanGraphs to help people make that transition. I’m sure that there are many for whom baseball is just a casual interest that never fully make the transition and need others to continually do the analysis for them.

For those that feel that FanGraphs isn’t doing interesting work anymore, I think that this is common and had assumed that the answer was to move on to more thought-provoking and challenging websites as opposed to giving up on the current trends in sabermetrics altogether.

"I’d love to walk in and hug everybody every day, but that’s not critical to us winning." - Jon Daniels

by GhettoBear04 on Jul 25, 2011 4:48 PM PDT up reply actions  

Yep, more or less

I had to get to 75 words somehow, hence the length :)

by Graham MacAree on Jul 21, 2011 10:35 AM PDT up reply actions  

2 problems with this

1. Your main counter argument is built on a subject that is most likely non-linear in nature, and applying straight linear concepts to it will produce a regression line that will not make intuitive sense.

2. Modeling is only as useful as the idea at its backbone. There are two totally different sets of ideals at hand on this. Hypothesis testing to prove the idea is statistically significant and then the actual form fitting to the model. If you walk in with a massively varied data set and expect simple regression to tell you the secrets of life you are wildly mistaken. Baseball is a complex system and running simple regression on it will have major difficulties with the interactions of various variables and tends to lead you on the wrong path.

JD’s like, "you want some f*&#ing pitching? Here’s all the pitching you can stand. Now choke on it, b*#&hes!"- RCCook

LSB: "Oh s#*t, JD. You crazy!"

by laxtonto on Jul 21, 2011 7:55 AM PDT reply actions  

In my experience

Researchers who use more complex regressions on baseball data usually come up with even worse answers that make less baseball sense. There are a few VERY rare exceptions. Having a lot of math knowledge often seems to be an impediment to good baseball thinking and research. Most often there’s a happy medium—knowing enough to avoid some statistical blunders but not knowing so much that you trust your math training ahead of your understanding of how the game works.

by Mike Fast on Jul 21, 2011 8:01 AM PDT up reply actions  

I have to say,

in my own attempts at applying non-linear regressions to some of the current sabermetric concepts, I have ended up with truly absurd results. I’m not convinced this means that more advanced non-linear models wouldn’t be useful, just that it’s likely to be quite difficult.

"I’d love to walk in and hug everybody every day, but that’s not critical to us winning." - Jon Daniels

by GhettoBear04 on Jul 25, 2011 4:41 PM PDT up reply actions  

Coding knowledge on the other hand...

Is priceless.

Although, some high level math can be very useful and necessary. Logistic regressions for example.

by vivaelpujols on Jul 26, 2011 5:55 PM PDT up reply actions  

I do believe #2 is in fact the crux of Graham's entire article.

If you walk in with a massively varied data set and expect simple regression to tell you the secrets of life you are wildly mistaken

That seems to be at the heart of what Graham was saying, does it not?

by pdb on Jul 21, 2011 8:04 AM PDT up reply actions  

my retort is that if you are a true statistician with a true research background

the simple regression concept on a non-linear complex system is only used in a best case scenario to see if the data is pliable. No real statistician would just run with it in a simple linear regression on a complex system and stop because it would fail the test phase of their analysis if the data set is partitioned. The real problem is that there are very few true statisticians working on these problems, not people anointed by themselves or others as statisticians.

I love the direction baseball has taken in regards to application of mathematics and models to baseball, but the problem is that there are too many guys that know just enough to not get in trouble doing analysis that is not as thorough as it should be. To compound the problem those results are getting accepted as fact.

JD’s like, "you want some f*&#ing pitching? Here’s all the pitching you can stand. Now choke on it, b*#&hes!"- RCCook

LSB: "Oh s#*t, JD. You crazy!"

by laxtonto on Jul 21, 2011 8:18 AM PDT up reply actions  
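
The partition check laxtonto describes above can be sketched in a few lines. This is only an illustration under stated assumptions: the data is synthetic and noise-free (y = x^2, nothing to do with baseball), the partition is a simple split by range, and `fit_line` and `r_squared` are hypothetical helper names written for this example.

```python
# Synthetic, noise-free data from a deliberately non-linear process (y = x^2).
# This is NOT baseball data; it only illustrates the holdout idea.
xs = list(range(20))
ys = [x * x for x in xs]

# Partition by range (an assumption): fit on the first half, test on the second.
train_x, test_x = xs[:10], xs[10:]
train_y, test_y = ys[:10], ys[10:]


def fit_line(x, y):
    """Ordinary least squares for a single predictor; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x, y, slope, intercept):
    """Coefficient of determination of a fixed line on the given data."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


slope, intercept = fit_line(train_x, train_y)
train_r2 = r_squared(train_x, train_y, slope, intercept)
test_r2 = r_squared(test_x, test_y, slope, intercept)

print(f"fit partition R^2:     {train_r2:.3f}")
print(f"held-out partition R^2: {test_r2:.3f}")
```

On the fitting partition the straight line looks respectable (R^2 of roughly 0.93), but on the held-out partition R^2 goes negative, meaning the line does worse than simply guessing the holdout mean. That is the sense in which a simple linear fit "fails the test phase" on a non-linear system.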

I would say that making a more complex model work for baseball analysis

requires a lot of care and a far deeper understanding of how the game actually works than I see most true statisticians display when they attempt to analyze baseball. Most often the result is very bad analysis cloaked in mathematical gobbledygook that sounds impressive.

In theory, I think it would be great to have more sophisticated techniques applied to baseball analysis. In practice, I don’t see many “true” statisticians willing to put in the years of work to get themselves familiar with the game and getting their hands dirty with the data. Thus they, and the field of sabermetrics, never see a payoff from all their learning. Russell Carleton is one prominent exception to this, and J-Doug Mathewson is working on getting there. I can think of a much longer list of folks who have tried and failed.

by Mike Fast on Jul 21, 2011 8:29 AM PDT up reply actions  

I'd add Brian Mills to the list of up-and-comers among "true" statisticians analyzing baseball

and Phil Birnbaum could make the list of true stat guys, too, although his brilliance comes in showing how simple techniques produce much more helpful and accurate results than sophisticated techniques. If you follow Phil’s work, it would be very tough to argue that what we need in baseball analysis is mainly more true statisticians working in the field.

by Mike Fast on Jul 21, 2011 8:34 AM PDT up reply actions  

My base premise in all of this is that they need fewer people in the field, period

and stick to those who understand the subject matter and have the necessary statistical background. The entire problem now is that you have the two extremes producing large amounts of research and very few in the middle.

It is an unattainable idea due to the love of the sport and human nature, but even a progressive step of understanding that by those who consume the data as instantaneous fact would be a step in the right direction.

The part I can’t fathom is why it is acceptable for a true statistician working in baseball not to have a baseball background when it is a prerequisite in all other statistical application fields. You get some crossover in fields, but very rarely do you see a psychometrician doing large-scale business-oriented research. You don’t see the management science guys doing research in sociology and the output getting widespread acceptance on merit.

Things will change, but people keep forgetting that as far as a statistical field goes, sports research is still in its infancy. The best news I can give on this topic is that some of the sports statistic journals are now considered “A” rated journals for several of the statistical disciplines.

JD’s like, "you want some f*&#ing pitching? Here’s all the pitching you can stand. Now choke on it, b*#&hes!"- RCCook

LSB: "Oh s#*t, JD. You crazy!"

by laxtonto on Jul 21, 2011 8:47 AM PDT up reply actions  

I think that the ones who are willing to do so

work for Major League organizations.

"I’d love to walk in and hug everybody every day, but that’s not critical to us winning." - Jon Daniels

by GhettoBear04 on Jul 25, 2011 4:39 PM PDT up reply actions  

I was an English major in college 20 years ago and math is, to understate it grossly, not my strong suit

so much so that your first sentence has me curling up in the fetal position in the corner. But I do think you’re right when you say

there are too many guys that know just enough to not get in trouble doing analysis that is not as thorough as it should be.

This is a problem with all sophisticated concepts – my example is always people using “the wisdom of crowds” to mean “this is what a bunch of people think” when in reality “the wisdom of crowds” is a much more complex concept than that. You can’t take one leaf and assume you have an entire tree, and that seems to be a lot of what the current wave of sabermetricians are doing. I am obviously not the guy to debunk the numbers or even talk about them intelligently, but even my layperson brain knows that things with sabermetrics are a bit wonky these days.

by pdb on Jul 21, 2011 8:42 AM PDT up reply actions  

If it is, then I completely missed the point

This is where my request came from to “please clarify because I think one can read into the criticism whatever one doesn’t like about baseball analysis.”

I see a lot of bad analysis being done with complex multivariate regression.

I do believe that laxtonto’s statement that “modeling is only as useful as the idea at its backbone” is correct and is consistent with one of Graham’s main points. I thought the idea at the heart of what he was saying is that any analysis that loses a connection to how the game works has gone off the track. It’s not the simplicity or the complexity of the model that matters, it’s the explanatory power and the foundation of logic underneath it. Simple regression models can have a lot of explanatory power in baseball and can illuminate a lot if they are driven by thinking that’s well-grounded in how the game works.

by Mike Fast on Jul 21, 2011 8:20 AM PDT up reply actions  

1. I hope that that story is read as more of an interesting supporting anecdote rather than a counter-argument

2. I don’t understand how this is a disagreement – you’ve just framed the question in a different way. In your two ideals, I think I pretty clearly fall into camp one, and I don’t understand anyone who’s happy with using the second.

by Graham MacAree on Jul 21, 2011 10:38 AM PDT up reply actions  

I like the FG_as_Tweets twitter account

Does a good job at describing how silly some of the “analysis” has gotten lately.

by Vegasexpat on Jul 21, 2011 9:12 AM PDT reply actions   1 recs

Nice article.

It reminded me of Thomas Kuhn’s “The Structure of Scientific Revolutions,” if you want to get into that philosophy/sociology of science stuff. Sabermetrics emerged as a type of revolutionary paradigm of baseball analysis and, at least online, has been established as a “normal” form of analysis (or what Kuhn calls normal science). Much progress occurs under this time of normalcy, but anomalies in the paradigm eventually emerge. This post clearly points out some anomalies that emerge when taking a sabermetric paradigm to a certain degree — triples are bad being the prime example. Eventually, Kuhn argues that this “normal” science is replaced with a new revolutionary paradigm. We don’t necessarily have to believe this, or think this means the end of sabermetrics in baseball, just that new ways of understanding the relationship between objects and tools of analysis might emerge to move beyond anomalies and myopic analyses.

Yeah, I can’t believe I just wrote all that. Thanks for making me churn some mental wheels Graham.

by boomdonkey on Jul 21, 2011 9:14 AM PDT reply actions  

I would want to categorize things a bit differently (though I'm thrilled to see mention of Kuhn here).

The distinction between sabermetrics and older, less sophisticated approaches to analysis is more like the distinction between good science and bad science. The former is better than the latter precisely because it uses analytical methods to avoid bias, control for randomness inasmuch as is possible, and so on. Because sabermetrics is essentially a methodological approach to understanding baseball, rather than a theory or set of background assumptions about how the game works, it isn’t susceptible to the sorts of anomalies you mention. Thus, for example, the sabermetrician will take the goodness of triples as a desideratum for her theory. The fact that a certain (naive) sabermetric theory says that triples are bad is a reason to reject that theory, not a reason to reject the methodological approach itself.

by ty540 on Jul 21, 2011 1:11 PM PDT up reply actions  

Perhaps,

but I’m not sure using pejorative terms to classify the traditional or outgoing school of thought is going to encourage open-mindedness regarding new and potentially better theories.

"I’d love to walk in and hug everybody every day, but that’s not critical to us winning." - Jon Daniels

by GhettoBear04 on Jul 25, 2011 5:01 PM PDT up reply actions  

But you just used the term 'better'...

(Just messing with you. I see your point.)

by ty540 on Jul 25, 2011 6:12 PM PDT up reply actions  

For a while sabermetrics seemed to be heading dangerously down the elevator analysis path

Where results were merely explained in reference to other results (i.e. this stat went up so this stat went down) without much analysis into underlying factors causing them. (Which is what I think Graham is getting at in his post.) With the advent of pitchFX and hitFX and a slew of other advanced measuring tools, however, I’m starting to see more complex analysis being performed in sabermetric circles. Instead of simply saying a batter is underperforming because of a low BABIP, we can now better analyze why his BABIP is abnormally low instead of blaming it on the metaphysical vagaries of balls in play.

This, I think, is the ultimate endgame of sabermetrics, a sort of GUT where statistical analysis is married with observational analysis in an effort to determine the hows and whys of the baseball universe.

by ThomasG on Jul 21, 2011 12:37 PM PDT reply actions  

I am a nothing on this site and even less in the world of baseball thought

but for what it’s worth this was wonderful to read and I greatly enjoyed it. Although you appear to have steered away from awe-inspiring, nuclear grade personal takedowns. Which I understand and respect but still lament.

by TheBishop on Jul 21, 2011 2:13 PM PDT reply actions  

be more passive aggressive

use the letters in “fuck all of you” to start the first word of your first 12 paragraphs!

by pdb on Jul 21, 2011 2:25 PM PDT up reply actions  

If you actually feel bad that means you've lost the hatred. And once that goes it never comes back.

I say that as a boring 29 year old who used to be so full of piss and vinegar. Now I’m just happy and shit.

by TheBishop on Jul 21, 2011 2:48 PM PDT up reply actions  

Crap. Outfoxed by someone much smarter than me.

So hard to foresee. Oh well, enough of my tomfoolery. Thanks again for the article. The combination of content/source gives it a lot of weight in my eyes and, like all good writing, is applicable well outside its subject matter. Cheers!

by TheBishop on Jul 21, 2011 2:58 PM PDT up reply actions  

Thanks

I do appreciate that. Glad you enjoyed.

by Graham MacAree on Jul 21, 2011 3:00 PM PDT up reply actions  

That's really excellent.

(I know some philosophers X for whom philosophical X entails angry X.)

by ty540 on Jul 21, 2011 2:23 PM PDT up reply actions  

Great writeup

Similar stuff has been troubling me too but I haven’t put it all together as eloquently as you did.

I think my main issue always comes back to our ability to test hypotheses like you mentioned. Without clear and controlled experiments there is no way to test and determine the limits of a theory. Without this we can’t ever “believe” a theory’s result unless it is infallibly logical.

Furthermore, it prevents you from taking a theory and extrapolating it beyond the data available. Would Chone Figgins be more valuable if he stopped swinging at any pitches and tried to walk every AB? We could probably argue so but logically it doesn’t really make sense. Just one small example but when you move to more complex examples it only gets worse.

by Edgar for Pres on Jul 21, 2011 7:08 PM PDT reply actions  

I always appreciate when people are able to clearly share their feelings in a well thought-out process.

Thank you for sharing, Graham.

My problem with advanced baseball stats has always been player development. Humans are not constants, they are always changing. Baseball players find themselves in three different phases in their careers: maturing, prime, and declining. This career cycle has different lengths for each player ranging from one game to 20 years. The progression through these phases is not typically smooth and is often unpredictable. We like to justify our ability to predict a player’s career with statistics, but it is simply too complex. There are too many variables you cannot plug into a statistic. Players get injured. Players have off-the-field issues. Players tweak their batting stance or pitching motion.

Baseball statistics are great to value a player’s worth in a given time frame. Going forward, however, it can be a crapshoot to predict a player’s value with statistics. Nobody thought Figgins would be this horrible. Nobody thought Jose Bautista would turn into the best hitting player in the game. Reasonable expectations are assigned to each player, but, outside of superstars, few players stay near the median line on the bell curve. Aaron Hill hit 36 home runs in 2009. Ichiro is currently hitting .265. Outliers in a player’s career happen each year and we don’t know if it is a shift in talent level. We can explain what went right or wrong in the past with statistics, but predicting a player’s future performance is an inexact science.

by Wilder. on Jul 22, 2011 5:22 PM PDT reply actions  

Someone beat me to this... nuts...

but I don’t think I would have said it anywhere as well as you.

Seriously though, that’s the problem I have with sabermetrics now. While the numbers suggest a trend or perhaps a course of action, we are still dealing with people – not machines. People can be irrational, people can fold under pressure, people can rise under pressure. A team can be affected by winning, affected by losing. You get the point.

And in that sense, we can look at the numbers all the live long day, but then look out into the field and see something completely different. I too think that there has to be a marriage of sabermetrics and observational analysis to determine what the best course of action is, or how a player projects. It can also perhaps determine what one should do in the macro view of baseball. But I personally would go so far as to say that in the micro view certain out-of-the-box things can be, and may be, warranted that throw sabermetrics and conventional US baseball thinking out the window.

by KaminaAyato on Jul 23, 2011 8:02 PM PDT reply actions  

I think you should read it again.
People can be irrational, people can fold under pressure, people can rise under pressure. A team can be affected by winning, affected by losing. You get the point.

This was not at all the point of the piece. I have seen some responses that believe this was an argument against the use of statistics or even interpreted as anti-intellectualism. This wasn’t a Joe Morgan rant.

by abender20 on Jul 24, 2011 8:04 AM PDT up reply actions   5 recs

Fair enough...

I see what you mean. But at the same time, I am not arguing to be anti-stats either. I just think stats are useful when applied properly, or looked at in a proper light.

Just because someone is supposed to regress in any given year doesn’t mean they do. Just because a team sabermetrically on paper is supposed to suck doesn’t mean they wind up doing so. Looking at the stats by themselves void of everything else I don’t think fully works. There are human elements in baseball, precisely because it’s played by humans, that we can’t quantify that can (and have) thrown analyses out the window. Sabermetrics to me help define a macro view of baseball. But there is still the human element that can affect the micro views (i.e. game level, even situational level).

Earlier in the reply thread it was stated:

Graham thinks that sabermetrics as a group has strayed from what it was, and I kind of agree with him. We can’t be thinking statistics-driven first. We have to take baseball for what it is, and logically follow what we know about baseball into our understanding of its statistics, not the other way around.

I think a lot of analysis on what we should do in game or in personnel is because of our analysis of statistics. But I also believe that what we’ve learned we’ve assumed that we must do all the time. And that’s where I disagree.

For instance, I wrote a post wondering if we should adopt a Japanese-style approach for the M’s this year. The approach would include more bunting, which is fundamentally frowned upon by the sabermetric community. But does it make a difference when you don’t have a league average offense? I have to assume that’s the case (though I don’t have the time to figure that out).

I also made the argument in the July 23rd gamethread that with bases loaded in the 8th with nobody out and Carp up we may have wanted to consider a squeeze bunt. It’s something that would get me thrown out immediately since sabermetrics would completely disallow it, but if you’re depending on an offense such as ours that can’t even get a sac fly, I consider it to be a possibility worth considering. Yes, there has to be the assumption that it’s successful, but I don’t think you could say given the current circumstances of the team you could throw the possibility out altogether.

This is where I think the micro view of baseball may wind up being different than what the macro view would suggest we should do. In other words, in any normal situation people would yell at me and point to the table saying that you’d never squeeze with bases loaded with no outs while down 2. But if you’re in a 13 game losing streak and the offense is unable to score “normally” in leveraged situations, can you really throw out doing something different? Looking to bank a run via squeeze, then going so far as to say trying it again should it work is completely unconventional, but in my opinion given our team shouldn’t be thrown out.

Anyways, if I’m going off base again, I’m sorry but I just wanted to clarify my stand on things, and in some ways how I interpret the article.

by KaminaAyato on Jul 25, 2011 12:36 PM PDT up reply actions  

Regarding bunting

It’s not frowned upon nearly as much by the sabermetric community as it used to be. Additional research of looking at actual results as opposed to simply subtracting two values from a run expectancy table has shown that bunting has much better results than previously believed by sabermetrics.

It’s my opinion that a lot more of this kind of work is needed, and that the minutiae of baseball, the micro view as you call it, isn’t as divorced from sabermetrics when sabermetrics is done well.

With Retrosheet we now have the data to do a lot of these kinds of analyses well.

by Mike Fast on Jul 25, 2011 12:44 PM PDT up reply actions  
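
The difference between the two approaches Mike Fast contrasts above (subtracting two cells of a run expectancy table versus weighting what actually happens on bunt attempts) can be sketched as follows. Every number here is a HYPOTHETICAL placeholder chosen for illustration, not a value from Retrosheet or any published table, and the three-outcome breakdown is a simplifying assumption.

```python
# HYPOTHETICAL run expectancy values keyed by (runners, outs).
# Placeholders for illustration only, not a real published table.
RUN_EXPECTANCY = {
    ("1st", 0): 0.85,
    ("1st", 1): 0.50,
    ("2nd", 1): 0.65,
    ("1st+2nd", 0): 1.45,
}

start = ("1st", 0)

# Approach 1: assume every bunt is a clean sacrifice and subtract two cells.
naive_delta = RUN_EXPECTANCY[("2nd", 1)] - RUN_EXPECTANCY[start]

# Approach 2: weight each resulting state by how often it occurs.
# HYPOTHETICAL frequencies; a real study would estimate these from play-by-play data.
BUNT_OUTCOMES = [
    (("2nd", 1), 0.70),      # successful sacrifice
    (("1st", 1), 0.20),      # lead runner forced out, batter safe at first
    (("1st+2nd", 0), 0.10),  # bunt hit or error, everyone safe
]

total_frequency = sum(freq for _, freq in BUNT_OUTCOMES)
expected_after = sum(RUN_EXPECTANCY[state] * freq for state, freq in BUNT_OUTCOMES)
weighted_delta = expected_after - RUN_EXPECTANCY[start]

print(f"naive table subtraction:  {naive_delta:+.2f} runs")
print(f"outcome-weighted estimate: {weighted_delta:+.2f} runs")
```

With these placeholder inputs the outcome-weighted estimate is less negative than the simple subtraction, because some attempts end with everyone safe. Whether that holds with real data, and by how much, depends entirely on the observed frequencies.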

The views towards bunting have changed.

I think it is now accepted that, assuming a decent success rate, while bunting is always going to limit your ability to score multiple runs, there are times that it increases your ability to score one run. Since there are specific times in baseball where all you are looking for is one run, there are times where bunting is not a bad idea when all other factors (park, team quality, etc) are assumed equal.

Personally, I think there are aspects to run environment on a micro level that the manager should take into account when making the decision. Playing in a very pitcher friendly park with a poor offense against a great pitcher (and great bullpen) behind him can make the value of that one run quite high in terms of what it does towards winning that game.

"I’d love to walk in and hug everybody every day, but that’s not critical to us winning." - Jon Daniels

by GhettoBear04 on Jul 25, 2011 8:10 PM PDT up reply actions  
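
The "assuming a decent success rate" condition in the comment above can be made explicit. When one run is all you need, the quantity that matters is the probability of scoring at least once, and the break-even success rate s solves s * P(success) + (1 - s) * P(failure) = P(no bunt). The sketch below assumes a bunt attempt has only two outcomes, and all three probabilities are HYPOTHETICAL placeholders; `break_even_success_rate` is a name made up for this example.

```python
def break_even_success_rate(p_no_bunt, p_after_success, p_after_failure):
    """Success rate at which bunting leaves P(score at least one run) unchanged.

    Solves s * p_after_success + (1 - s) * p_after_failure = p_no_bunt for s.
    """
    if p_after_success <= p_after_failure:
        raise ValueError("a successful bunt must improve on a failed one")
    return (p_no_bunt - p_after_failure) / (p_after_success - p_after_failure)


# HYPOTHETICAL probabilities of scoring at least one run in the inning.
P_NO_BUNT = 0.61        # runner on 2nd, none out, swinging away
P_AFTER_SUCCESS = 0.65  # runner on 3rd, one out
P_AFTER_FAILURE = 0.40  # runner still on 2nd, one out

rate = break_even_success_rate(P_NO_BUNT, P_AFTER_SUCCESS, P_AFTER_FAILURE)
print(f"break-even bunt success rate: {rate:.0%}")
```

Under these made-up inputs the bunter has to succeed well over three times in four before the bunt helps. A lower run environment (weak offense, pitcher-friendly park, great opposing pitcher) would lower `P_NO_BUNT` and with it the break-even rate, which is the adjustment the comment argues a manager should make.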

Great article Graham. Nothing I could really add at the moment just because I am very tired after my first night of legal partying.

Just wanted to drop in and congratulate you on a fine piece of work.

"Tell my tale to those who ask. Tell it truly, the ill deeds along with the good and let me be judged accordingly. The rest is silence." ~ Dinobot

by beastwarking on Jul 24, 2011 1:39 AM PDT reply actions  

Some clarification for those of you who read this far

This post was meant primarily for LL readers, where my expectation is that you know who I am. That’s not really egotistical, but this is the place that knows my work the best, so it was in a much more familiar style.

So, since it got much bigger than I really intended it to…

1. No, I am not anti-stats. Yes, I can do maths at a reasonably competent level.

2. If this was intended as an attack on Fangraphs, SIERA, or Matt Swartz, I would have made that abundantly clear. But, it’s not, and I don’t think that going down that path would have been particularly useful. I think this is a cultural problem, and not specific to a statistic, person, or site. Something triggering me to write doesn’t mean that it’s all I’m writing about.

3. No, I don’t give a damn about tRA vs. SIERA. I think batted ball data is broken, which means I think that UZR, xFIP and tRA are all broken because the data is garbage.

by Graham MacAree on Jul 24, 2011 10:22 PM PDT reply actions   3 recs

Having read through all this and having my head spin, now I come onto something that I can grasp.

And it’s something that I can relate to and agree with: batted ball data is broken.

That’s not to say I don’t think there are some valuable things we can learn from what we have; it’s just hard to be really sure that what we’re looking at is telling us as much as we’d like it to. More accurate, granular data would help out a lot.

And I’m pretty sure I’m never going to get my wish that HitF/X will be made public. Maybe the best we can hope for is release of data several years after the fact, with which people could build studies and metrics on, but can’t be used for near real-time analysis.

by nathaniel dawson on Jul 25, 2011 3:08 PM PDT up reply actions  

Based on what?
"I think batted ball data is broken, which means I think that UZR, xFIP and tRA are all broken because the data is garbage."

by Matthew on Jul 25, 2011 3:35 PM PDT up reply actions  

Wyers' work over the past year or so

I don’t have a link handy, but you could probably track it down pretty easily.

by Graham MacAree on Jul 25, 2011 4:11 PM PDT up reply actions  

Just because the data is flawed/biased

Doesn’t mean we know how much it affects the metrics.

And beyond that, the thought processes behind tRA and UZR are examples of great sabermetrics.

by vivaelpujols on Jul 26, 2011 5:58 PM PDT up reply actions  

See the comments

here and Colin’s piece here are good starting points.

by marc w on Jul 25, 2011 4:13 PM PDT up reply actions  

NOTE: the above links open in this window. Right click 'em!

Here’s another of Colin’s pieces on batted ball data, from April of ’10.

by marc w on Jul 25, 2011 4:29 PM PDT up reply actions  

I've read all those before (and just did again) and find no reason to call batted ball data garbage or broken.

I see reasons to be wary of hit location data, but then again, I always have and tRA and xFIP and the like don’t rely on it.

Also, I don’t think the systemic bias — when dealing with batted ball types — is all that troublesome to handle. It is, after all, cooked into the whole system. We don’t have a platonic ground ball, fly ball, line drive, etc that we use to judge run and out values off of. We figure out our weights based on the measured (bias and all) results.

That’s why I went to such lengths to present component factors for every park, because it’s my belief that those capture both the effects of the park and the scorers at the park and luckily, we don’t want either of their biases.

by Matthew on Jul 25, 2011 5:22 PM PDT up reply actions  
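For readers following along, here is a minimal sketch of the kind of park component factor described in the comment above. It is not StatCorner's actual method: the function names, the 50/50 home/road split, and every count are hypothetical, chosen only to show how a park-and-scorer bias could be estimated from measured rates and then scaled back out.

```python
def park_component_factor(home_events, home_bip, away_events, away_bip):
    """Ratio of a batted-ball type's recorded rate at one park to its rate elsewhere."""
    home_rate = home_events / home_bip
    away_rate = away_events / away_bip
    return home_rate / away_rate


def park_adjusted_count(observed, factor, home_share=0.5):
    """Scale an observed count back toward a neutral park.

    Only the home share of a pitcher's batted balls is exposed to the park
    (and its scorer), so only that share is deflated by the factor.
    The 0.5 default is an assumption, not a measured value.
    """
    blended = home_share * factor + (1.0 - home_share)
    return observed / blended


# Hypothetical counts: line drives recorded at one park vs. in the same teams' road games.
ld_factor = park_component_factor(home_events=880, home_bip=4000,
                                  away_events=800, away_bip=4000)

# A pitcher credited with 105 line drives, half of them at this park.
adjusted_ld = park_adjusted_count(observed=105, factor=ld_factor)

print(round(ld_factor, 3), round(adjusted_ld, 1))
```

Note what this does and does not capture: a factor built this way folds the park and its scorer together, but it cannot see a bias that depends on whether an individual ball was caught.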

What the park factors don't do is address the bias that arises when a fielding play is made/not made

It’s my understanding that the same trajectory will be classified differently depending on whether a ball is caught or not, etc.

by Graham MacAree on Jul 25, 2011 5:45 PM PDT up reply actions  

Anyway - Matthew, your argument is what I've used to defend tRA in the past

I’ve just never had a particularly good feeling about it. You may well be right, and I know we’ve integrated tRA on Statcorner in the best way possible, but I’ve never managed to convince myself.

by Graham MacAree on Jul 25, 2011 5:48 PM PDT up reply actions  

Nor do I think it's really possible to prove.

You’d need a good sample of similar batted ball trajectories, and we don’t have that data in the first place. Maybe someone with access to field F/X could share the output, but even then we’d have to trust their math because we couldn’t have access to the inputs.

It may well be true (I’d bet it is to some degree), but that’s not a satisfactory reason for me to ditch batted ball types

by Matthew on Jul 25, 2011 9:31 PM PDT up reply actions  

And that's your prerogative

You’re in a much better position to judge than I am, because you’re still actively working on this, and I write about soccer. But for me, the data is in question, and that’s enough to make me very leery about using it, even with the safeguards we made sure Statcorner was equipped with. I probably overstated when I said things were ‘broken’, though.

by Graham MacAree on Jul 25, 2011 10:40 PM PDT up reply actions  

I think it's fair to say the data is broken when it comes to judging fielders

But that’s a bit different question than whether park factors are sufficient to correct the batted ball types for something like tRA. I don’t know the answer to the latter, and it seems that was what Matt was referring to.

I have found catch/no-catch bias at the groundball-linedrive boundary using April 2009 HITf/x data, but I’m not confident enough about disentangling spin effects from that to publish my findings. No one has figured out yet, of course, how to use that HITf/x data to measure anything in the outfield.

by Mike Fast on Jul 26, 2011 6:03 AM PDT up reply actions  

Matthew, I meant

As in Carruth. Not Matt, as in Swartz. Dammit, I need more sleep.

by Mike Fast on Jul 26, 2011 1:03 PM PDT up reply actions  

"We figure out our weights based on the measured (bias and all) results."

And what do you get when you do that? Do we know that this leads to improvement either in explaining results or in predicting future results? I mean, kwERA seems to better predict ERA than bbFIP. Why?

by marc w on Jul 26, 2011 10:26 AM PDT up reply actions  

What you get, in my belief, is more accurate representations of what actually happened.

Know is a strong word, but it should be an improvement in explaining results. What’s the alternative? I am highly doubtful that ignoring batted ball types completely leads to a better understanding of past results. Why not use them when there are ways (the various factors I’ve posted for one) to estimate their deviation from expected and simply scale our results back based on those?

And touching on your last question, and with a nod toward the outside debate on SIERA and stuff, does it matter how well something predicts future ERA? I have never understood the interest in predicting ERA. It’s a team stat, not an individual one so why should I want individual metrics to predict it? If I had a formula that exactly predicted a pitcher’s wins in 2012, would it be useful for anything outside fantasy baseball?

Here’s my thing with ERA predictors. Imagine there somehow was a perfect ERA predictor. That’s the ideal for ERA predictors right? And say we’re looking at Felix Hernandez in 2012 and this system predicts a 3.16 ERA for Felix and if nothing changes, that is exactly what he will post.

Now, the Mariners go out and somehow acquire Brett Gardner to play LF in 2012. In all likelihood, that would dramatically improve the Mariners defense and thus, Felix’s future ERA should go down. Nothing about Felix changed. Nothing about the factors under his control have changed, but his ERA will change. So does this ERA predictor now change?

If no, then it’s clearly not a perfect ERA predictor and it opens the door for something less perfect, but more lucky to end up “better” because the less perfect system erred in a way that had Felix at a lower ERA and lucked into the Mariners acquiring Brett Gardner.

If yes, then what does it tell you about the pitcher? If the ERA predictor is subject to change based on the pitcher’s park, run and defensive environment, then it doesn’t tell you much about how good the pitcher is because he could be pitching in front of a great defense in San Diego or a shitty defense in Chicago or anywhere in between.

I guess that interests people, but not me. I’d rather have separate estimators for the pitcher, for the park, for the defense, etc so that we can do our best to puzzle out where success and failure are coming from and not pin it all on the pitcher.

by Matthew on Jul 26, 2011 12:08 PM PDT up reply actions   1 recs
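The thought experiment above can be put in toy form. This is only a sketch: the multiplicative split into pitcher, park, and defense components and the 0.95 defense factor are assumptions made for illustration, not anything from tRA or SIERA. The 3.16 figure is the one used in the comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Everything outside the pitcher's control, as multipliers around 1.0."""
    park: float = 1.0
    defense: float = 1.0


def context_era(pitcher_rate, ctx):
    """ERA-like figure implied by a fixed pitcher component in a given context."""
    return pitcher_rate * ctx.park * ctx.defense


pitcher_rate = 3.16  # the pitcher's own component; it never changes below

before = context_era(pitcher_rate, Context())
# Hypothetical: a better left fielder trims runs allowed by 5 percent.
after = context_era(pitcher_rate, Context(defense=0.95))

print(round(before, 3), round(after, 3))
```

Under these assumptions the predicted ERA moves while the pitcher component stays put, which is the argument for reporting the components separately.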

I second all your comments about ERA estimators/predictors.

But the reason I’m not sure you’re right about this:

"I am highly doubtful that ignoring batted ball types completely leads to a better understanding of past results. Why not use them when there are ways (the various factors I’ve posted for one) to estimate their deviation from expected and simply scale our results back based on those?"

is that the line drive category of batted ball is what is so hard to define “correctly” (whatever that may mean—consistently, I guess?) and it has so much of a different run value from the other types, that the effect of any error is magnified. If it were simply a matter of comparing pure groundballs to pure flyballs and not having to worry about any of those nasty line drives in the middle, I’m pretty sure that you would be right.

by Mike Fast on Jul 26, 2011 1:08 PM PDT up reply actions  
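A quick arithmetic sketch of the magnification point above. The run values and counts are invented purely to show the shape of the problem (line drives worth far more than grounders or flies); they are not measured weights.

```python
# Hypothetical run values per batted ball, for illustration only.
RUN_VALUE = {"GB": 0.05, "FB": 0.10, "LD": 0.40}


def expected_runs(counts, run_value=RUN_VALUE):
    """Sum of count times run value over batted-ball types."""
    return sum(counts[kind] * run_value[kind] for kind in counts)


def reclassify(counts, src, dst, n):
    """Return a copy of counts with n balls moved from one category to another."""
    moved = dict(counts)
    moved[src] -= n
    moved[dst] += n
    return moved


true_counts = {"GB": 200, "FB": 150, "LD": 80}
base = expected_runs(true_counts)

# The same ten scoring disagreements, at two different boundaries.
gb_fb_shift = expected_runs(reclassify(true_counts, "GB", "FB", 10)) - base
ld_fb_shift = expected_runs(reclassify(true_counts, "LD", "FB", 10)) - base

print(round(gb_fb_shift, 2), round(ld_fb_shift, 2))
```

With these made-up weights, ten balls mislabeled across the GB/FB line move the estimate by about half a run, while the same ten across the LD/FB line move it by about three.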

I think StatCorner tRA is implemented about as well as it can be

To a certain extent, most of the problems with ERA estimation, park biases, etc were discussed when Matthew and I came up with the plan for making StatCorner work, to the point of essentially anticipating this conversation.

Right now, it’s just a question about how well-defined the LD boundaries are, and if there’s enough bias to make statistics that incorporate them worthwhile. I think that if pitchers can influence GB and FB, they’re almost certainly able to exert some control over whether a ball is in that LD range (just from thinking about the collision physics), but it’s a question about whether our measurement tools are misleading us.

At this point, there’s reason to believe they might be, but not proof. I don’t know what to do about that apart from hope that better data comes along at some point.

by Graham MacAree on Jul 26, 2011 4:10 PM PDT up reply actions  

Yes, I regret the ERA thing.

Your last paragraph is spot-on. I’m just trying to figure out how to test whether we actually ARE getting a better explanation of results. It definitely seems like we should be, but are we?

If you’re going to have separate estimators for pitching and for defense, you’ve got to make sure that the batted ball data is giving you something meaningful. Right now, the Mariners have one of the better team DERs in baseball. At the same time, they’re rated below average by UZR. Is the M’s defense good or below average? Not “what’s their true talent.” Have the M’s had good defensive results this year or not? How do we know?

I think the idea of blending the air balls together makes a lot of sense, as you’ve done, and can’t wait to see what that shows us.

by marc w on Jul 26, 2011 3:15 PM PDT up reply actions  
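For reference, DER can be computed from box score totals alone, which is part of why it can disagree with a metric built on classified batted balls. The sketch below uses one common formulation (treating reached-on-error as a non-conversion is an assumption; some versions omit it), and the team totals are hypothetical.

```python
def defensive_efficiency(pa, bb, so, hbp, hr, hits, roe=0):
    """Share of balls in play that the defense converted into outs."""
    balls_in_play = pa - bb - so - hbp - hr
    not_converted = hits - hr + roe
    return 1.0 - not_converted / balls_in_play


# Hypothetical team totals, not any real club's line.
der = defensive_efficiency(pa=4000, bb=300, so=800, hbp=30, hr=70, hits=880, roe=30)

print(round(der, 3))
```

Nothing here depends on whether a ball was scored as a line drive or a fly ball, but nothing here separates the fielders from the park or the pitchers either.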

The ERA thing was far more something that formed in my mind this morning in the shower.

Your comment just gave me a pertinent opening.

As for the defensive thing, is there a way to know? I guess it comes down to figuring out what measurements we believe are the best ones to judge a defense on. I know that adding up the scouting reports, previous season success and the superb run prevention of the Mariners this season, I find it far more intuitive that the Mariners have had a good defense than a bad one, even with help from Safeco.

by Matthew on Jul 26, 2011 4:00 PM PDT up reply actions  

I could not agree any more with this:
"And touching on your last question, and with a nod toward the outside debate on SIERA and stuff, does it matter how well something predicts future ERA? I have never understood the interest in predicting ERA. It’s a team stat, not an individual one so why should I want individual metrics to predict it? If I had a formula that exactly predicted a pitcher’s wins in 2012, would it be useful for anything outside fantasy baseball?"

At the very least, try to predict park-, league- and maybe defense-adjusted ERA.

"I’d love to walk in and hug everybody every day, but that’s not critical to us winning." - Jon Daniels

by GhettoBear04 on Jul 26, 2011 10:33 PM PDT up reply actions  

Wait, no, I don't think it would.

It would potentially turn park factors into an adjustment for consistent batted ball classification errors.

...and now I'm here

by CapSea on Jul 25, 2011 5:16 PM PDT up reply actions  

Yes

But only park by park classification errors

by Graham MacAree on Jul 25, 2011 5:16 PM PDT up reply actions  

Broken?

But do you agree with the approaches?

"I’d love to walk in and hug everybody every day, but that’s not critical to us winning." - Jon Daniels

by GhettoBear04 on Jul 25, 2011 4:30 PM PDT up reply actions  

I'm with you.

I know that this isn’t exactly the place that you would choose for this, but I wondered if your concerns/criticisms of SIERA matched my own or if there are other issues entirely that I’m not seeing. Is there any way you could mention in brief suggested critiques?

"I’d love to walk in and hug everybody every day, but that’s not critical to us winning." - Jon Daniels

by GhettoBear04 on Jul 25, 2011 5:02 PM PDT up reply actions  

I'm not really sure how best to respond to this

Obviously, I think that the starting point in SIERA is wrong, and all the mistakes I see in it are a result of taking a philosophy that I see as invalid and applying it to measuring pitchers.

by Graham MacAree on Jul 25, 2011 5:50 PM PDT up reply actions  

Ok, I agree with that.

I had seen your thoughts alluded to but not stated explicitly. Thanks.

"I’d love to walk in and hug everybody every day, but that’s not critical to us winning." - Jon Daniels

by GhettoBear04 on Jul 25, 2011 7:19 PM PDT up reply actions  

Yes.

I think we are in agreement. I was trying to explain why I didn’t like SIERA over at Lonestar Ball and I realized that it was difficult to summarize thoroughly.

"I’d love to walk in and hug everybody every day, but that’s not critical to us winning." - Jon Daniels

by GhettoBear04 on Jul 25, 2011 7:42 PM PDT up reply actions  

Patriot's got a good

piece here that might be useful. Not sure if this is the angle that you were going for, but I think the general point about anchoring the system with a run estimator is a good, if less evocative, restatement of Graham’s point about having the thing make baseball sense.

by marc w on Jul 26, 2011 10:31 AM PDT up reply actions  

Is there a place to get year by year splits on UZR data?

Fangraphs only covers career splits.

...and now I'm here

by CapSea on Jul 25, 2011 5:21 PM PDT up reply actions  

Excellent piece, and I agree for the most part

Just one question, which is a bit off-topic: which book on quantum physics were you reading? Thanks!

The idiot formerly known as pkyankeefan! Now in Technicolour!

by Hasan Paliwala on Jul 25, 2011 1:05 PM PDT via mobile reply actions  

Thanks Graham. I'll be sure to check it out.

Yours were one of the first comments I’d read on LL, and I have to admit, I miss angry Graham too.

The idiot formerly known as pkyankeefan! Now in Technicolour!

by Hasan Paliwala on Jul 25, 2011 3:01 PM PDT via mobile up reply actions  

*some of the first

The idiot formerly known as pkyankeefan! Now in Technicolour!

by Hasan Paliwala on Jul 25, 2011 3:14 PM PDT via mobile up reply actions  

I'm curious on your thoughts

as to whether front offices known to be forward thinking in terms of statistics would be able to function as some sort of quasi-experimental group while the control group would be those front offices that followed the more traditional approach.

"I’d love to walk in and hug everybody every day, but that’s not critical to us winning." - Jon Daniels

by GhettoBear04 on Jul 25, 2011 4:16 PM PDT reply actions  

Eh, I think most every FO claims to use a balanced approach.

Which group would you put the M’s in? Why? How about the Rangers?
What is a “traditional approach?”

by marc w on Jul 25, 2011 4:30 PM PDT up reply actions  

Yeah, I realize the inherent issue in determining the buckets,

but perhaps if you looked at the tail ends of the distribution instead of trying to classify the nebulous middle?

"I’d love to walk in and hug everybody every day, but that’s not critical to us winning." - Jon Daniels

by GhettoBear04 on Jul 25, 2011 4:31 PM PDT up reply actions  

Maybe you could sketch something out, but you'd be doing some sort of controlled experiment

when the control group and the experimental group are at least partially defined through public statements, managerial decisions, and what, player acquisitions? What’s the hypothesis again?

by marc w on Jul 25, 2011 4:44 PM PDT up reply actions  

Well, the hypothesis would be one that I don't actually believe,

but something like:

Front offices that use advanced statistics and probabilistic thought in making decisions end up being more successful than those who avoid these modes of thought.

Classification would occur through finding organizations that have publicly hired statisticians and could perhaps be further confirmed through public statements. Similarly, it seems that there are quite a few organizations that feel that these approaches are incorrect, or perhaps incomplete.

"I’d love to walk in and hug everybody every day, but that’s not critical to us winning." - Jon Daniels

by GhettoBear04 on Jul 25, 2011 4:54 PM PDT up reply actions  

x

The more I think about this, the more I realize it can’t be done externally. I am of the belief that clubs can succeed in a variety of ways. The Royals and Braves don’t appear to be that forward-thinking, but the quality of their scouting makes up for that (or will soon in the case of the Royals).

However, the core issue of what I’m getting at still holds, I think. For organizations that have statistics departments and are trying to leverage advanced analysis to gain an upper hand, I am willing to bet that there is a certain amount of experimentation being done as the statisticians try to implement policies based on the conclusions drawn from their work. The data from this is almost certainly tested.

I just don’t know that we will ever be able to see inside the black box.

"I’d love to walk in and hug everybody every day, but that’s not critical to us winning." - Jon Daniels

by GhettoBear04 on Jul 25, 2011 4:55 PM PDT up reply actions  

Is there ANY org at this point w/o a statistician/database guy?

I don’t know, but it sure doesn’t seem like it. The M’s had Mat Olkin in the dark days of the Bavasi era and we’ve got Tango now, but what does that get us? The major difference between the orgs seems to be in the scouting department, and it’s going to be damn hard to separate out THAT in your results. The whole “did you draft an awesome baseball player” thing skews the results.

by marc w on Jul 25, 2011 4:56 PM PDT up reply actions  

Yeah, it's true.

Plus parsing out differences in resources, both financial and from being at the top of the draft…

"I’d love to walk in and hug everybody every day, but that’s not critical to us winning." - Jon Daniels

by GhettoBear04 on Jul 25, 2011 4:58 PM PDT up reply actions  

I do think that there are still organizations

that don’t value the advances that have been made in baseball analysis, though. How else can one explain Dan Haren Trade pt. 2?

"I’d love to walk in and hug everybody every day, but that’s not critical to us winning." - Jon Daniels

by GhettoBear04 on Jul 25, 2011 4:59 PM PDT up reply actions  

constructive criticism, then extrapolation

I think you’re wrong about controlled experiment and science. There are many scientific achievements that are perfectly credible that don’t depend on controlled experiment. Lots of biology, geology, and medicine use observation and data without controlled experiment. An important difference between science and non-science is the use of data, but not all data comes from a controlled experiment. Controlled experiment is wonderful stuff. Nevertheless, controlled experiment fails too: we once proved with the Poisson spot that light was a wave and not a particle (it’s neither and both); Rutherford’s demonstration that atoms have a positive central charge relies on false assumptions about the electrodynamics of atoms. There are less famous examples too. It seems to me that when you look across the history of science, no one method jumps out as the method. Even knowing what data matters is a difficult problem. The early history of chemistry, for example, systematically confuses heat and combustion, which, when you think about it, is pretty understandable; there was a lot of interesting science done under that rubric, but even after oxygen was discovered, heat was mostly regarded as a material substance, not a kinetic property of material bodies.

I think what you’ve basically argued is that sabermetrics is no longer making progress, and I think you’ve nearly identified why. It’s not about experiment. It’s about data. One problem is that we’ve nearly exhausted the usefulness of box score data. Another is that additional sorts of data, like bucketed batted ball data, are corrupt or simply of limited usefulness. Perhaps more precise data, like real batted ball velocity vectors, will be more revealing, but maybe they won’t. Most likely, they will help with some things but not with others.

There’s one more thing I want to add: if you take the one hundred year history of quantum mechanics and condense it into one book, you see a fascinating, progressive science in which even failures are illuminating. (This may be less true of contemporary quantum physics, but see the final point of this paragraph.) But if you actually stop and look at all the quantum mechanical results and publications, tons of it looks like number crunching quantum mechanics for quantum mechanics’ sake. Ooooh, Sommerfeld’s introduction of elliptical orbits to Bohr’s model of hydrogen reduces his error in the derivation of the Balmer series by one third! Most of the results between 1913 and 1925 are less significant than that. Scientists in the midst of research often make the very complaint you have made, but about their own sciences. Standing in the middle of a forest, one only sees trees. It’s entirely possible that we are making progress, but just won’t see it for what it is until a decade passes.

by philosofool on Jul 29, 2011 3:48 PM PDT reply actions   4 recs

Comments For This Post Are Closed

