A Common Mistake
It wasn't very long ago that Graham and Matthew launched StatCorner and, in so doing, made tRA available to the public. tRA, as we should all know by now, is a pitching metric designed to evaluate pitchers based on the run values of the pitch outcomes that they generate on the mound. It should be better than FIP and xFIP because it adjusts for things like grounders and line drives. It should be better than ERC because it doesn't fool around with hits and constants. It should be better than ERA because it doesn't suck. One could make a fairly convincing argument that, at least as far as the non-PITCHf/x realm is concerned, tRA (or, if you prefer, tRA*) is about as good a pitching metric as anyone could design.
This leads to an obvious follow-up question: if tRA is so good, then shouldn't we be evaluating hitters the same way?
The answer is, yes, we should. But all too often people don't even try, because it's pretty freaking hard.
It's not a trivial thing. You can try to apply the same principles to hitters as you do for pitchers, but you'll find that you run into trouble in a hurry. It's easy to figure out the average run value of a line drive allowed by, say, John Lackey or Aaron Harang. That's because we can make the safe assumption that, over a long enough period of time, both Lackey and Harang face a ~representative sample of hitters throughout the league, so the run value of a LD will simply be (or approximate) the league average run value of a LD. Pitchers face everybody. And when you have everybody blended together, you approach the league average. Obviously there will be little variations if you have a guy in, say, a really strong or really weak offensive division, but this generally isn't a big deal. Run values of outcomes for pitchers are easy to determine because the batters that pitchers face even out over time.
It's not so for hitters. Hitters are individual. Each one will put a unique spin on every ball he puts in play. While the average line drive allowed by John Lackey will be worth ~the same as the average line drive allowed by Aaron Harang, the average line drive hit by Albert Pujols is not worth the same as the average line drive hit by Miguel Cairo. Pujols is stronger. He hits the ball harder. So his line drives will rather obviously be better than Cairo's. You can see this reflected in their career splits; Pujols' career BA on line drives is .816 with a bunch of home runs, against .725 and two for Cairo. I think this is pretty intuitive.
So you can't apply the same principles to hitters that you apply to pitchers, because the whole averaging-out phenomenon doesn't take place. You can't treat a groundball hit by David Ortiz the same way you treat a groundball hit by Ichiro. It wouldn't make sense. They're clearly two different types of balls in play, and they should therefore be treated as such.
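To make the difference concrete, here's a rough sketch of how the average run value of a line drive changes with the hitter. Everything in it is an assumption for illustration: the run values are ballpark linear weights rather than anything tRA actually uses, and the outcome counts are invented so that the two hit rates come out to the .816 and .725 quoted above. The split among singles, doubles, triples, and homers is made up.

```python
# Ballpark linear-weight run values per outcome (assumed for illustration).
RUN_VALUES = {"out": -0.27, "1B": 0.47, "2B": 0.77, "3B": 1.04, "HR": 1.40}


def line_drive_run_value(outcome_counts):
    """Average run value of a line drive, given how often each outcome occurred."""
    total = sum(outcome_counts.values())
    if total == 0:
        raise ValueError("need at least one line drive")
    weighted = sum(RUN_VALUES[outcome] * n for outcome, n in outcome_counts.items())
    return weighted / total


# Hypothetical outcomes per 1000 line drives. Only the overall hit rates
# (.816 and .725) come from the text; the breakdown by hit type is invented.
strong_hitter = {"out": 184, "1B": 500, "2B": 230, "3B": 16, "HR": 70}
weak_hitter = {"out": 275, "1B": 560, "2B": 150, "3B": 13, "HR": 2}

strong_value = line_drive_run_value(strong_hitter)
weak_value = line_drive_run_value(weak_hitter)
print(f"strong hitter: {strong_value:.3f} runs per line drive")
print(f"weak hitter:   {weak_value:.3f} runs per line drive")
```

With a pitcher you could plug in one league-wide set of counts and be done. With hitters you need a separate set for every player, which is exactly where the sample size problem comes from.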
This is where we begin to understand why nothing like tRA has ever really been attempted with hitters. It hasn't been attempted because to do so, you need to (A) calculate run values on a player-by-player basis, rather than applying the average to everyone, and (B) accumulate a large enough sample to calculate those run values in the first place. Sound fun? You're a nerd. And a dreamer. Such a project would be unfathomably complicated. PrOPS tried, sort of, and because it's better than raw OPS it's fun to look at from time to time, but it leaves a lot to be desired. A lot that I'm not sure is possible to obtain.
So we're stuck. Tango's wOBA is the perfect (and I mean literally perfect) measure of what a hitter has already done in terms of results, but why should we care that much about the results? If we never talk about a pitcher's BAA, why should we have to talk about a hitter's BA? Okay, yeah, so a hitter's batting average is more meaningful than that of a pitcher, but I think the point should remain. If we're so ready to accept that a pitcher should be evaluated on the immediate results of his pitches, why aren't we the same way with hitters?
In an ideal world, hitters would be evaluated not on their BA/OBP/SLG slash lines, but rather on the balls they put in play (and, of course, the balls they don't). This is the more relevant information, right? Not all hits deserve to be hits. Not all outs deserve to be outs. The ball is out of a hitter's control the instant it leaves his bat. After all, you can't aim your line drives. So shouldn't we be judging hitters on how well they do their jobs, rather than on some combination of that, defense, and luck?
Just because we can't have the same level of accuracy as we can with tRA doesn't mean we shouldn't at least try to look at things this way. Educated estimates are better than nothing. Adrian Beltre's current batting average is .247. That sucks. But he's also hit just .622 on his line drives, against a career average of .748. Which seems more likely - that Beltre's suddenly gotten weaker than ever before in his age-29 season, to the point at which he's turning fewer line drives into base hits than Willie Ballgame, or that he's simply the victim of unsustainable bad luck? If you regress Beltre's BA on line drives to his career rate, his 2008 average jumps from .247 to .270. Right there you're talking about an OPS difference of 50-70 points. By the traditional measures, Beltre hasn't been a very good hitter so far this season, but in reality, he's actually done his job fairly well. It just hasn't worked out the way he and the rest of us had hoped.
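For anyone who wants to see the arithmetic behind that kind of adjustment, here's a minimal sketch. The function name and inputs are my own, and the counts are NOT Beltre's actual totals; they're made-up numbers chosen only so that they reproduce the rates quoted above (.247 overall, .622 on line drives, .748 career on line drives).

```python
def luck_adjusted_ba(at_bats, hits, ld_at_bats, ld_hits, career_ld_ba):
    """Replace observed line-drive hits with the number expected at the career rate."""
    if not 0 <= ld_hits <= ld_at_bats <= at_bats:
        raise ValueError("inconsistent counts")
    expected_ld_hits = career_ld_ba * ld_at_bats
    # Hits on everything else stay as they are; only the line drives get regressed.
    return (hits - ld_hits + expected_ld_hits) / at_bats


# Hypothetical season line, invented to match the quoted rates:
# 111 / 450 = .247 overall, 51 / 82 = .622 on line drives.
observed = 111 / 450
adjusted = luck_adjusted_ba(at_bats=450, hits=111, ld_at_bats=82, ld_hits=51,
                            career_ld_ba=0.748)
print(f"observed BA: {observed:.3f}")
print(f"adjusted BA: {adjusted:.3f}")
```

This regresses all the way to the career rate, which is the bluntest version of the idea. A more careful take would only go part of the way, depending on how many line drives the guy has actually hit.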
Isn't that important information? And forget about Beltre; isn't this the sort of approach we should be taking with everyone? Luck doesn't only happen to pitchers. It happens to hitters, too, and it can often have a significant effect on their results. If we're going to try to eliminate it when evaluating one, we should do the same for the other. Ultimately, what we're after is an accurate measure of how well a guy has performed, and accepting that a hitter's slash line automatically reflects how good he's been just doesn't strike me as being good enough.
HITf/x is going to be a godsend. Where currently we have to deal with the limitations of human observation when it comes to the quality of balls put in play, down the road we'll be able to look at a hit with a given trajectory and assign it a 40% chance of dropping in for a double, then look at another hit with another given trajectory and assign that one a 90% chance of being caught. It's going to open so many doors that, over time, it may bring about the death of the slash line. At least among us dorks. There are going to be real stats - BA, OBP, and SLG - and there are going to be theoretical stats - tBA, tOBP, and tSLG - that tell a more accurate story. This will represent the pinnacle of hitter analysis. Once you can assign a ball in play an accurate probability of turning into any given event, there is no further room for growth.
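As a rough sketch of what those theoretical stats might look like once every ball in play carries its own probabilities, here's one way tBA and tSLG could be computed. The `BattedBall` class and `theoretical_line` function are hypothetical names of mine, not part of HITf/x or any real data feed, and the sketch ignores walks, sacrifices, and errors, so tOBP is left out.

```python
from dataclasses import dataclass


@dataclass
class BattedBall:
    # Probability of each hit type for one ball in play.
    # Whatever probability is left over is the chance of an out.
    p_single: float = 0.0
    p_double: float = 0.0
    p_triple: float = 0.0
    p_homer: float = 0.0


def theoretical_line(batted_balls, strikeouts=0):
    """Return (tBA, tSLG), treating at-bats as balls in play plus strikeouts."""
    at_bats = len(batted_balls) + strikeouts
    if at_bats == 0:
        raise ValueError("no at-bats")
    expected_hits = sum(
        b.p_single + b.p_double + b.p_triple + b.p_homer for b in batted_balls
    )
    expected_bases = sum(
        b.p_single + 2 * b.p_double + 3 * b.p_triple + 4 * b.p_homer
        for b in batted_balls
    )
    return expected_hits / at_bats, expected_bases / at_bats


# Two balls in play, loosely following the examples in the text: one with a
# 40% chance of a double, one with a 90% chance of being caught (assumed here
# to be a single the other 10% of the time).
balls = [BattedBall(p_double=0.4), BattedBall(p_single=0.1)]
t_ba, t_slg = theoretical_line(balls)
print(f"tBA {t_ba:.3f}, tSLG {t_slg:.3f}")
```

The point is that each ball in play contributes a fraction of a hit instead of a 0 or a 1, so one bloop or one diving catch can't swing the line.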
But just because we're not there yet doesn't mean that, given present constraints, we can't try to be as accurate as possible. Question the slash line. Ask yourself whether a guy's BA truly reflects how well he's performed at the plate. Dig deeper. Because to not do so is to do yourself a disservice, and there is no more noble endeavor than the pursuit of new knowledge.
Comments
===
there are going to be theoretical stats – tBA, tOBP, and tSLG – that tell a more accurate story. This will represent the pinnacle of hitter analysis. Once you can assign a ball in play an accurate probability of turning into any given event, there is no further room for growth.
There is no floor, but there is a ceiling?
by JI on Aug 21, 2008 3:19 PM PDT
Funny I was bringing up this idea to Graham not two days ago
I think my phrasing was along the lines of “We need to find a way to regress wOBA so that it better strips out luck. So we’d have to find a way to regress each of the components of wOBA in some fashion. Holy hell that’s hard to do.”
I’m still going to take a look at it though. Obviously HITf/x will do it well but who knows how far off that is (or even if it will be public once it’s available), but there’s got to be a 90/10 method that’s better than what we currently have.
by Matthew on Aug 21, 2008 3:51 PM PDT
A 5% improvement on PrOPS would still give us the best measure available.
Painstaking, sure, but worthwhile if you can figure out a way to do it that doesn’t make you want to kill yourself.
by Jeff on Aug 21, 2008 3:54 PM PDT
My initial thoughts are decision trees
where, just like everything else, we regress heavily toward league averages until the sample size expands and as it does, we leave more and more of the player’s career (or last 3 years) run values in. Basically, we need tons of correlation studies to see how we might build in a proxy model.
Can we use stolen bases to categorize speedy types in order to put them in a higher RBOE and 1B/GB talent level for instance? Tons and tons of these questions.
by Matthew on Aug 21, 2008 3:59 PM PDT
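The "regress heavily toward league averages until the sample size expands" part of the comment above can be sketched with a simple shrinkage formula. To be clear, this is not the decision-tree approach being floated, just the regression piece on its own, and the league rate and the `stabilizer` constant are placeholder numbers rather than anything derived from real data.

```python
def regress_to_league(successes, trials, league_rate, stabilizer):
    """Blend a player's observed rate with the league rate.

    `stabilizer` acts like that many league-average trials added to the
    player's sample, so the player's own rate dominates as `trials` grows.
    """
    if trials < 0 or successes < 0 or successes > trials:
        raise ValueError("inconsistent counts")
    if stabilizer <= 0:
        raise ValueError("stabilizer must be positive")
    return (successes + league_rate * stabilizer) / (trials + stabilizer)


LEAGUE_LD_BA = 0.720  # placeholder league BA on line drives (assumed)
STABILIZER = 200      # placeholder; the right value needs a correlation study

# Same .800 observed rate, very different sample sizes.
small_sample = regress_to_league(8, 10, LEAGUE_LD_BA, STABILIZER)
large_sample = regress_to_league(800, 1000, LEAGUE_LD_BA, STABILIZER)
print(f"10 line drives:   {small_sample:.3f}")
print(f"1000 line drives: {large_sample:.3f}")
```

The small sample barely moves off the league rate while the large one ends up most of the way to .800. Picking the stabilizer for each component is where all those correlation studies would come in.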
I can't see how you can have one without the other.
by JI on Aug 21, 2008 4:36 PM PDT
Yes it does.
There’s a certain point beyond which you are no longer on Earth.
by Matthew on Aug 21, 2008 4:53 PM PDT
Kind of like a box with no lid
The box itself has no ceiling, but keep going up and you’re eventually out of the box.
by Gomez on Aug 21, 2008 10:35 PM PDT
That is an arbitrary definition
There is no certain point.
by Edgar for Pres on Aug 22, 2008 8:40 AM PDT
Yes, Jeff, parabolae are a fine example.
I like using semi-colons; they make me feel smart.
by Llewdor on Aug 22, 2008 11:24 AM PDT
Dumb people.
There’s a firm limit on what they can achieve, but their dumbness knows no bounds.
I like using semi-colons; they make me feel smart.
by Llewdor on Aug 22, 2008 11:23 AM PDT
There has to be a floor for their dumb if there is a ceiling.
by JI on Aug 22, 2008 1:55 PM PDT
There's a saying...
…that it’s hard to idiot-proof things because idiots are just so darn ingenuitive.
by GhettoBear04 on Aug 22, 2008 2:35 PM PDT
I see what you did there.
I like using semi-colons; they make me feel smart.
by Llewdor on Aug 22, 2008 3:24 PM PDT
So why do I bother including slash lines in my wrap-ups?
by Gomez on Aug 21, 2008 4:17 PM PDT
And... not being a wise ass here
This is a serious topic you bring up… so I’m seriously asking here, because in light of the more advanced analytical methods out there, I often ask myself the same question, and I’m still not exactly sure of the answer.
by Gomez on Aug 21, 2008 4:28 PM PDT
Because at the minor league level they're the best we can do
by Jeff on Aug 21, 2008 4:36 PM PDT
There's a big part of me that doesn't want to reduce a player's offensive merit into one number
and I can’t explain exactly why.
by JI on Aug 21, 2008 4:37 PM PDT
Probably because
If you lump everyone offensively into a single number, you are taking away the different styles of player. Ichiro gets put into the same group as A-Rod even though they’re vastly different in their approaches.
Midnight Baseball - No Lights - Only in Alaska!
by MfaninAlaska on Aug 21, 2008 4:50 PM PDT
That answer works for me
I do try to go beyond the slash when I can, but there are so many variables. I’d hope that, as we advance statistical analysis and tools for it at the MLB level, the tools that allow this trickle down the way Gameday, pitch counts and other features have started to trickle down into minors coverage.
by Gomez on Aug 21, 2008 4:42 PM PDT
Would you somehow include pitches/PA into the perfect hitting stat?
by Last Fan Of Jose Lopez on Aug 21, 2008 9:08 PM PDT
Are there two types of patient hitters?
One that sees pitches but still intends to hit his way on base vs one that thinks that walking to first is just as good as a single?
by GhettoBear04 on Aug 22, 2008 2:37 PM PDT
I don't know off the top of my head
but what I do know is that P/PA is worthless if a guy isn’t able to recognize a hittable pitch and drive it.
Patience is overrated. Discipline isn’t.
by Jeff on Aug 22, 2008 2:41 PM PDT
I'm not sure I agree.
Patience is overrated, but it’s not worthless. Those extra 8 pitches a game can matter, especially if your whole team is like that.
I like using semi-colons; they make me feel smart.
by Llewdor on Aug 22, 2008 3:26 PM PDT
Making people throw more pitches really doesn't accomplish anything
by Jeff on Aug 22, 2008 3:42 PM PDT
If all 9 guys do it it can lead to high pitch counts.
by JI on Aug 22, 2008 4:05 PM PDT
High pitch counts don't really accomplish anything unless the starter is an ace
by Jeff on Aug 22, 2008 5:45 PM PDT
Which could lead to a better pitcher coming in.
MGL had an interesting take on this whole thing.
by Teej on Aug 22, 2008 5:47 PM PDT
That said,
I still think that, over a three- or four-game series, there has to be some value in running starters out of the game early. Maybe in that last game or two the bullpen is more tired and more apt to throw a few meatballs.
by Teej on Aug 22, 2008 5:52 PM PDT
Here's what I'm wondering
Is there any merit to considering more than just the outcomes of an at-bat (by this I don’t mean standard slashline stats, but what’s being talked about here: different types of balls in play/walks/strikeouts/whatever)? Is it possible that we might be able to learn something about a player’s ability within the context of that at-bat? While the currency of baseball in most cases is outs (or at least plate appearances), each individual pitch is a discrete event. I wonder if at some point we might be able to glean information even from the pitches that come before the one that ends the plate appearance.
I see some of this here, when we talk about how often a guy swings at pitches out of the zone, or takes in the zone, or whatever.
I think LFoJL might be on to something (small, perhaps), because while we talk about outs as being the only limiting factor, in a practical sense pitches are limited as well (at least for any individual pitcher), both because pitchers eventually lose effectiveness when they reach a certain point (variable, of course) and because a manager will eventually replace that pitcher (with someone either better or worse, depending).
If the goal is to create one offensive stat that encapsulates a hitter’s abilities (or offensive value, I guess), it seems important to consider everything they do to further their team’s chances to win. In general, it would seem to me that a home run on the 10th pitch of an at-bat is (marginally) more valuable than a home run on the first pitch. Similarly, a three-pitch strikeout is even worse than a six-pitch one.
Perhaps the added accuracy from such considerations pales in comparison to what we’d get from truly accurate PITCHf/x data, and as such will only be a thought exercise. However, it might be worth a look from someone who can actually do math.
I'd rather know a little about a lot than a lot about a little
by Sportszilla on Aug 21, 2008 9:34 PM PDT
Things I will never hear again in my life:
“…And that’s another home run for Willie Harris”
by JI on Aug 22, 2008 2:55 PM PDT
I'm still waiting to see
somebody examine how people can have their ISO suffer from bad luck. We focus on BABIP so much, but it doesn’t tell us about the 2B that are turned into 1B by bad luck.
by Edgar for Pres on Aug 22, 2008 4:03 PM PDT
