A Thought
One of the most frequently repeated criticisms of defensive stats in general and UZR in particular is that on a number of occasions they aren't very stable on a year-to-year basis. According to UZR, for example, Bobby Abreu dropped from being a -4 run RF in 2007 to a -26 run RF in 2008, while Mark Teixeira jumped from being a -4 run 1B in 2007 to a +10 run 1B in 2008. For a lot of people, this sort of inconsistency seems sufficient to invalidate the entire system.
It's not. For one thing, nobody familiar with defensive metrics would advise arriving at conclusions based on individual seasons (or even individual metrics). It's far better to look at a three- or four-year window in order to gauge a guy's true ability. But for another thing - and this is the bigger point, I think - inconsistency in no way means that defensive stats are bullshit any more than it means offensive or pitching stats are bullshit. Since 2006, Abreu's bat has been worth anywhere from +11 runs to +33 runs, for a range of 22. Teixeira also has a 22-run range. 38 runs for Manny Ramirez. 21 runs for Lance Berkman. 21 runs for Albert Pujols. 34 runs for Jason Bay. And so on and so forth. Are our offensive stats broken?
By tRA, looking over the same span, CC Sabathia has a range of 27 runs. 25 runs for Roy Halladay. 15 runs for Derek Lowe. 29 runs for Dan Haren. 50 runs for Josh Beckett. 30 runs for Carlos Silva. Are our pitching stats broken?
Statistical instability is a part of the game, and just because we don't understand defense as well as we do offense and pitching doesn't mean it shouldn't be susceptible to the same sort of annual fluctuation. Yeah, it's weird to think that Abreu could've dropped from being worth -4 runs in the field in 2007 to -26 runs in 2008 while remaining the same person with the same skillset, but then, he also dropped from being worth 33 runs with the bat in 2006 to 11 runs in 2007 while remaining the same person with the same skillset, too, so who cares? Nobody's accusing wOBA of being a waste of time.
Single-season performances can be inconsistent. At the plate, on the mound, and in the field. That's a fact, and the only people who disagree are the people who don't like what the numbers are telling them. We may not yet have the perfect measure of defensive ability, but we've still got some great stuff, and as long as you interpret the numbers properly, there is valuable and accurate information to be gleaned.
Comments
Also
I don’t think this would be a problem if people saw the defensive stats updated game-to-game like offensive stats are. We all know that sometimes a player gets an unlucky out or a lucky hit, but we still “believe” in on-base percentage because it travels with us throughout the season, and since we can keep an eye on it, we know that it’s not doing anything crazy behind our back.
I think it will just be a matter of time before people can make the distinction between “he’s a bad fielder” and “he had a bad year in the field” the way that they already make the distinction between “he’s a bad hitter” and “he had a bad year at the plate.”
by ubelmann on
Jan 13, 2009 3:08 PM PST
Heck, just look at RBI splits if you're trying to convince "traditionalists"
I bet errors made don’t stay entirely constant either, for that matter.
---
Juuuust a bit outside!!
http://www.rightfieldbleachers.com
by jhmoore on
Jan 13, 2009 3:14 PM PST
Well, we all know the lousy stats are inconsistent
But even the stuff we know to be accurate can vary quite a bit season to season.
by Jeff on
Jan 13, 2009 3:17 PM PST
Could any minor intangibles affect these stats from year to year?
Such as a team being in contention, or something like that.
by Fin on
Jan 13, 2009 11:07 PM PST
Yes and no
We all understand that the range of offensive stats is large, but we generally know which sign is going to come before the number. For many fielders, we have no clue.
For most good hitters/pitchers, the range is quite wide, but it’s generally moving between ‘above average-ish’ and ‘insanely great.’
Curtis Granderson’s (and I know this was discussed on FanGraphs) UZR went from 11.7 and 10.9 to -10.2 from 2006-08. It’s not so much that the range is so much wider, it’s that it encompasses (centers on?) zero. People can’t get used to that… I can’t get used to that.
I think there are explanations for this – the sample size is generally smaller than ABs, and the average is always moving. But there are reasons why people will see these stats differently, and I think we just need to work on explanations for why a ‘defensive star’ (according to a multi-year UZR, for example) can pop up as a lead-glove every now and again.
by marc w on
Jan 13, 2009 3:35 PM PST
I think injury and age are often overlooked when it comes to defense.
How that applies to Curtis Granderson, I’m not sure. He was 27 at the start of 2008, so that doesn’t explain it. Playing with injury?
Also, IMO, with UZR now easily accessible to the public, people will begin using a single metric for evaluating defense, when it’s much better to look at all of the advanced metrics and see if they agree. So if PMR and +/- agree with UZR, then it’s probably legit (subject to SSS, of course). If not, then UZR could be off for some reason.
Perfect case for this is Orlando Hudson. UZR says he’s below average the past 3 years, +/- and PMR disagree along with the fans’ opinions. The conclusion should be that Hudson is still an above average 2B. I know it’s a 3 year sample and not a 1 year discrepancy like Granderson, but it’s a decent example.
by JLC on
Jan 13, 2009 3:53 PM PST
He wasn't injured to my knowledge.
Hudson is older than Grandy, and it’s not inconceivable that UZR’s ‘correct’ and the fans’ perceptions are lagging – this is basically what we saw with Betancourt. Nobody knows for sure, of course.
by marc w on
Jan 13, 2009 3:57 PM PST
Well, the problem with Betancourt is that his accuracy blows.
From 2006 to 2007, fans liked his speed, instincts, and first step alright, but gradually began to dislike his hands and accuracy. In 2008, fans just took a complete dump on him and rated him poorly across the board (NB: agreement level is .52 in ’08)
I mean, this may be deserving of a separate topic altogether, but is Yuniesky Betancourt’s range really THAT bad? The reason why fans don’t understand Betancourt’s decline is because he’s shown at least average range for a SS, but often can’t complete the play for some reason.
As for Hudson, it’s possible that UZR is correct, but unlikely. I’ve written up a draft on why I think Hudson would be a good signing for the M’s, and basically, I think his defense is still very good. PMR has him at +23, +18, +7 in ‘06-’08 and +/- has him at +8, +15, 3 in ‘06-’08. And, unfortunately I don’t have access to yearly UZR based on STATS Inc. data, but from ‘03-’07, “sUZR” rates Hudson as +12 per 150 games, much higher than the BIS-based UZR.
by JLC on
Jan 13, 2009 4:20 PM PST
It's worth noting
that the best and worst defensive players will generally be +/- 20 or 25, whereas the best and worst offensive players will fall anywhere from, say, -25 to +60 or +70. So, yeah, it’s a lot easier to predict the sign that comes before the number for a great hitter, because great hitters will be further above zero than great defenders.
There are still tons of guys who bounce around zero at the plate. Miguel Tejada. Michael Young. Nick Swisher. Cristian Guzman. Etc.
by Jeff on
Jan 13, 2009 3:59 PM PST
Cristian Guzman?
But you can pick out several players who simply do not bounce anywhere around zero. Pujols, Manny, etc.
If you could do the same for fielders – try to pick out the guys who were clearly, year in and year out, above average – who’d be on the list? Before this year, Granderson would’ve been on my list. Adam Everett, Crawford, Ellis too. It’s not just great fielders having a bad year, it’s guys with established track records of meh fielding becoming UZR (or PMR) superstars: think Randy Winn. He was a decent LF in 2003, though nothing to write home about. Now, after turning 32, he’s an elite defender?
Part of the problem is sample size, part of the problem is clearly park effects, and part may be the whole STATS/BIS difference. People are always going to be skeptical when advanced metrics vary so much on guys like Grady Sizemore or Ichiro. There are reasons why, and over time I think we’ll have fewer and fewer anomalies, but until then, I still think the fact that the range seems to center on zero is going to make a lot of people skeptical.
It’s possible that it’s simply much, much, much harder to be a consistently elite defender than to be a consistently elite hitter, but that’s somewhat counterintuitive, especially given replacement level.
by marc w on
Jan 13, 2009 4:26 PM PST
But hitters can be further above average than fielders
That’s just the nature of the issue. The great ones won’t bounce around zero because they’re not anywhere even close to zero to begin with. Fielders get fewer opportunities to make a difference, which leads to a narrower range and greater instability as a result of smaller sample sizes.
All I’m trying to say is that variation is no reason to throw a metric away. Players fluctuate in everything they do. Defense is no different.
by Jeff on
Jan 13, 2009 5:14 PM PST
Yes, I'm getting it now...
the more I’ve thought about it, the more it makes sense. Still, why IS Randy Winn an elite RF in his mid 30s? How did that happen? It seems to me that it’s one of three things: 1) that he made a quantum leap in his ability to get jumps, 2) it’s partially a park effect deal, or 3) it’s got to do with the nature of his FB chances. The last two aren’t terribly satisfactory, and the first isn’t terribly likely. HITf/x will solve everything, I hope.
And no one (here) is talking about scrapping a metric. Defensive metrics give us a tremendous advantage over conventional wisdom, and often do spot counterintuitive results well before fans/scouts. But I think for a number of reasons, we need to regress them just as we would with small-sample offensive results, or park-specific offensive results. As we get 2, 3, 4 seasons of data, the amount of regression needed obviously goes down. Still, the fact that the range is compressed AND we need to regress means that it’s really difficult to take something like “Abreu gave up 2.5 wins with his glove last year” at face value. I hope I’m making sense.
by marc w on
Jan 13, 2009 5:24 PM PST
I don't know on Winn
He was pretty solid in center before switching to right, so maybe he’s always been really good? I don’t think anyone’s going to tell you he’s +17 good – clearly, some regression is in order – but he may just be a late bloomer or going through an exceptional aging process. If it can happen to hitters and pitchers, I don’t see why it can’t happen for fielders. They’re all equally counterintuitive.
HITf/x would almost certainly solve everything. Somebody fund this please.
I think you should always view historical defensive numbers as a range, rather than as a point (say, -30 < x < -20 for Abreu rather than -26). However, I don’t think it’s a problem to be of the opinion that both (1) Abreu was terrible last year, and (2) Abreu doesn’t project to be as terrible going forward. It’s like looking at Ryan Ludwick and saying, okay, he was +41 at the plate last year, but that’s probably not going to happen again in 2009. There’s a difference between reflection and projection, and it’s important to keep that in mind when looking at the numbers. Abreu very well may have given up 2.5 wins or so with the glove in 2008. That’s a data point. I think we only really have to play around with it if we’re trying to predict how he’ll do in the future.
by Jeff on
Jan 13, 2009 5:42 PM PST
It’s like looking at Ryan Ludwick and saying, okay, he was +41 at the plate last year, but that’s probably not going to happen again in 2009.
shut up shut up shut up
by JI on
Jan 13, 2009 6:07 PM PST
he better not
---
Juuuust a bit outside!!
http://www.rightfieldbleachers.com
by jhmoore on
Jan 13, 2009 10:59 PM PST
He was OK in CF, nothing special
and he’s been poor there in limited trials for the Giants (again, just using UZR).
I agree that we’ve got a data point, and I agree that Abreu was horrible last year. The problem comes in trying to peg his value, or rather a reasonable range of his value. The positional adjustments for moving from LF/RF to DH are so high, it only makes sense for truly abysmal fielders (Dunn, for example). Should Abreu’s next team leave him in the OF to regress, or has he fallen off a cliff? Yes, yes, this can happen with hitting as well (for every Yuni Betancourt there is a Travis Hafner), but if fielding is subject to, what, 2 win swings from year to year… what are we up to with total player value? The error bars are getting damned far apart.
by marc w on
Jan 14, 2009 9:36 AM PST
If you're trying to peg someone's current value
I think you absolutely have to regress outlandish data points with a kind of defensive Marcel. So a team signing Abreu should consider him a -15 or -20 defensive OF, rather than -25 or -30. If he proves in the early going that he really is as bad as he looked last year, then that’s something you have to deal with, but regression is a fact of life, in all areas.
by Jeff on
Jan 14, 2009 9:42 AM PST
Despite the fact that he was ~-4 in RF in 2007?
Again, if his true talent is -20<x<-15, then moving him to DH maybe makes sense (still a tough call). If it’s more like -28<x<-5, then, uh, what?
Regression IS a fact of life, and every team probably does it subconsciously even if they’ve never heard the word. I know Tango and others are still really interested in the question of when do you have ‘enough’ data – at what point is it ‘OK’ to use a batter’s home/road splits as opposed to generic splits to see how a park affects a specific person, for example. I’m pretty sure one season of defensive data isn’t enough to move the expected range for Abreu all that much, although I totally agree that it IS a data point and is a damned useful indicator not only that he sucked last year, but that the trend may be pointing towards washed up.
by marc w on
Jan 14, 2009 9:53 AM PST
General consensus seems to be that you need at *least* 2-3 seasons of defensive data to get a good idea
for Abreu, you’re looking at this picture (by UZR only):
-15
-4
-26
As a player who turns 35 in March, that looks an awful lot like a guy who’s 15 to 20 runs below average, and it’s supported by +/-, PMR, and RZR. So I think that’s how a team should view him going forward. It could end up being inaccurate, but the same goes for any metric measuring any part of performance.
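To show roughly how three seasons like these collapse into a single estimate, here is a minimal sketch. It assumes the three UZR figures above run oldest to newest (2006 through 2008), borrows Marcel-style 3/4/5 recency weights, and includes an optional `regression` knob; the function name, the weights, and the knob are all assumptions for illustration, not anything UZR or an actual "defensive Marcel" prescribes.

```python
def weighted_estimate(seasons, weights=(3, 4, 5), regression=0.0):
    """Recency-weighted average of seasonal runs, oldest season first.

    `weights` and `regression` are illustrative assumptions: the weights
    favor recent seasons, and `regression` is the fraction of the result
    pulled back toward a league-average value of 0.
    """
    if len(seasons) != len(weights):
        raise ValueError("need exactly one weight per season")
    average = sum(w * s for w, s in zip(weights, seasons)) / sum(weights)
    return average * (1.0 - regression)


# The three UZR seasons listed above, assumed to be 2006, 2007, 2008.
abreu_uzr = [-15, -4, -26]
estimate = weighted_estimate(abreu_uzr)
print(round(estimate, 1))  # -15.9
```

With no regression at all, the weighted figure already sits inside the 15-to-20-runs-below-average window, and any pull toward zero would be working against whatever aging penalty you think a 35-year-old deserves.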
by Jeff on
Jan 14, 2009 10:17 AM PST
I could see how the zones would favor a RF.

by JLC on
Jan 13, 2009 5:52 PM PST
See, now this makes a lot of sense
and it’s also really troubling if true.
by marc w on
Jan 14, 2009 9:36 AM PST
UZR includes a park factor that should in theory adjust for this
by Jeff on
Jan 14, 2009 10:20 AM PST
Randy Winn flipped out offensively in 2005 but no one ever calls bullshit on that
by JI on
Jan 14, 2009 10:59 AM PST
EVERYONE calls bullshit on that
and look, he’s not a .680 slugger anymore (and besides, that was a sample of 58 games).
by marc w on
Jan 14, 2009 11:01 AM PST
Exactly nobody projects Winn as a .600+ slugging outfielder.
We had enough data to know that it was a SSS fluke, and that’s exactly what it was shown to be.
The UZR thing is different. Again, maybe it’s possible, but it would be great to figure out how/why, and, more importantly, if it’s repeatable or a real skill.
by marc w on
Jan 14, 2009 11:24 AM PST
I think you misunderstand
I’m just using it as an example of how a +3 outfielder could flip out, post a +17, and then return to normal levels.
by JI on
Jan 14, 2009 11:44 AM PST
I think the problem here is that fielding is relative to league average, whereas hitting is relative to replacement
if you wanted to define some “replacement level fielding,” you’d probably start with an Ibanez or Burrell, say -20, all the way up to the best, say an Endy Chavez(!) or Franklin Gutierrez(!) at like +15.
So if we set Ibanez to replacement, or 0, all of a sudden Chavez/Gutierrez becomes a +35 run defender instead of +15
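The re-baselining is just a subtraction, sketched below with the ballpark figures from this comment. The function name `rebaseline` is made up for illustration, and the -20 "replacement level" is the rough guess above, not an established value.

```python
def rebaseline(runs_vs_average, replacement_level):
    """Express a runs-above-average figure relative to a different baseline."""
    return runs_vs_average - replacement_level


# Rough figures from the comment: Ibanez around -20, Chavez/Gutierrez around +15.
replacement = -20
print(rebaseline(15, replacement))   # 35
print(rebaseline(-20, replacement))  # 0
```

The spread between any two fielders stays the same either way; only the zero point moves.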
by seattlebruin on
Jan 13, 2009 5:59 PM PST
However, from observation, defense appears to be more consistent than offense.
That is, offense is more susceptible to fluctuations. This is well illustrated by the many offensive “slumps” that can be observed frequently. A guy suddenly seems to forget how to hit the ball and almost can’t buy a hit for a couple weeks. But same guy doesn’t forget how to catch a ball or run for it, and we won’t see someone seem so clueless on defense in such a “slump” (or brilliantly burst out of it to go on a defensive spree/onslaught suddenly, as seen often for hitting).
Such simple observation would indicate that offense inherently has more fluctuations than defense, and comparing the two cannot be a defense for the glaring inconsistencies in UZR.
On a yearly scale too, we have career years for offense. I’ve never heard of some guy having a career year for defense. Same guy is most likely defensively good the year after also.
(Of course you can expand the timeline to many years to smooth out the inconsistencies in UZR, but that doesn’t say much about UZR’s value as a predictive tool.)
My own main reason for not trusting UZR (the inconsistencies hurt it too of course) is that unless he changed it, I think the guy who does it assigned a park factor detrimental to defenders in Safeco. That is, UZR claims that Safeco is easy to defend, which is ridiculous in my opinion and unfair to Mariners.
I don’t trust UZR.
by Sam Regens on
Jan 13, 2009 4:42 PM PST
You have good reason to say Safeco isn't easy to defend in?
by Matthew on
Jan 13, 2009 5:02 PM PST
Why wouldn't a defender slump?
No hitter literally forgets how to hit, just as no defender literally forgets how to defend. Shit just happens over small sample sizes that makes people look better or worse than they really are. That’s the case with everything on the planet; why should fielding be any different? Just because it’s more difficult to understand doesn’t mean we shouldn’t assume that it’s true nonetheless.
Also, Safeco is an easy place to field, relative to the rest of the league.
by Jeff on
Jan 13, 2009 5:23 PM PST
Any idea why this is?
I’d think that larger ballparks would be harder to field in due to more space in the OF.
by seattlebruin on
Jan 13, 2009 6:00 PM PST
Safeco generates more fly balls and the wind knocks them down
by Graham on
Jan 13, 2009 6:04 PM PST
Follow-up
so are ballpark factors as much a result of environmentals as actual ballpark dimensions then? A bigger factor in overall ballpark factors?
by seattlebruin on
Jan 13, 2009 6:06 PM PST
Prevailing weather is critically important in terms of park effects
Safeco’s murder on righties because of the wind and the dimensions, not just the latter.
by Graham on
Jan 13, 2009 6:22 PM PST
Yeah
I was really surprised when I compared the dimensions of Safeco Field to the Ballpark in Arlington. Granted, TBIA doesn’t have the crazy death alley in LCF, but if dimensions were the only important factor, that should be largely balanced by TBIA being harder on LHB.
by ubelmann on
Jan 13, 2009 7:11 PM PST
Jeff, is it that defense is harder to understand (not the same as observing btw) or simply that the actual fluctuations are smaller compared to hitting?
Nobody’s saying that there are no fluctuations in defense, just that it’s hard to sell that UZR’s big inconsistencies reflect reality.
by Sam Regens on
Jan 13, 2009 8:01 PM PST
It's difficult to understand what a defensive slump would entail
but I don’t think that’s reason to believe they don’t exist, and that they can’t on occasion be severe.
by Jeff on
Jan 13, 2009 8:03 PM PST
I find it much easier to explain defensive slumps than offensive slumps.
I can see minor injury or muscle fatigue affecting defensive performance (specifically range) and having little to no effect on offensive performance.
by acblue on
Jan 13, 2009 10:07 PM PST
It all comes down to chances.
The number of balls a hitter’s going to see at the plate is far more than the number of balls a fielder is going to have hit to him.
Let’s take Jose Lopez as an example. In 2008, Lopez had 448 balls hit in his zone, according to THT. Comparatively, Lopez saw 3.7 pitches/plate appearance in 2008. With 687 PAs, that means Lopez saw 2,542 pitches the entire season. That’s more than 5 times as many balls seen at the plate as in the field.
It’s easier to recoup missed opportunities with the bat since you’re given more chances to see a pitcher’s pitch, thereby increasing the number of mistakes the pitcher is likely to make, as well as the number of successes you can have. When we see a hitter “slump”, we see him missing opportunities on easy-to-hit balls, or getting “fooled” a lot. But we’re able to see those cases because of the number of pitches a hitter sees in a season.
For a fielder, it’s a one-shot opportunity. You either field the groundball/flyball or you don’t. You don’t get “3 strikes” to try and glove it. Is it easier to field than to hit a pitch? Sure. But you need to account for the variance too. If a groundball happens to take a “bad hop”, or the ground is wet that day, or the sun got in the fielder’s eye, then the chance is missed and the fielder gets penalized. Over one season, a lot of fluke misplays can add up.
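For anyone who wants to check the Lopez arithmetic, here is a quick sketch using only the numbers quoted above. The last step is an added assumption, not something from THT: if each chance is treated as an independent trial, the noise in an observed rate shrinks with the square root of the number of chances, so the smaller sample should be noisier by about the square root of the ratio.

```python
import math

# Figures quoted above for Jose Lopez, 2008.
zone_chances = 448        # balls hit in his zone (per THT)
pitches_per_pa = 3.7
plate_appearances = 687

pitches_seen = pitches_per_pa * plate_appearances
ratio = pitches_seen / zone_chances

# Assumption for illustration: chances are independent trials, so the
# standard error of a rate scales with 1 / sqrt(number of chances).
noise_ratio = math.sqrt(ratio)

print(round(pitches_seen))    # 2542
print(round(ratio, 2))        # 5.67
print(round(noise_ratio, 2))  # 2.38
```

Under that simplified assumption, a rate measured over 448 fielding chances would be a bit more than twice as noisy as one measured over roughly 2,500 pitches, before even getting into bad hops and wet grass.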
by JLC on
Jan 13, 2009 5:25 PM PST
But the thing is
when you talk about offense, you’re talking about the results; when you talk about observing a guy in the field, that’s more akin to watching his swing. It’s like saying, “I don’t believe he’s in a slump — his swing’s as pretty as ever.”
by The Ancient Mariner on
Jan 13, 2009 5:39 PM PST
Good old "common knowledge"
For the entirety of baseball history, it has been a self-evident “fact” that pitching and hitting can fluctuate, but speed and defense never slump. When over 75% of the people who (try to) control the discussion believe Jim Rice is a Hall of Famer, you realize that advanced defensive metrics have a long way to go before they become mainstream.
It’s funny how even casual NFL fans can trust QB rating, but so incredibly few baseball commentators will bother quoting anything more complicated than OPS.
by AnotherAaron on
Jan 13, 2009 5:51 PM PST
I've been wanting someone to address this.
Thanks, great post.
We need to be reminded every now and then that ‘true talent level X’ can produce seasons that fall into a surprisingly wide range of productivity levels. It’s amazing what a few lucky bloopers, infield singles, and wind-aided flyballs can do to a triple-slash line.
by Manzanillos Cup on
Jan 13, 2009 6:54 PM PST
Good point, and don't throw out the baby with the bath water
The points being made are good ones.
Here’s an inference that sends you to logical hell: information x is of limited reliability, therefore, I am totally justified in completely ignoring it. This is a horrible thought process that people are far too ready to engage in. If you have information that is of limited reliability, the thing to do is understand those limitations as you assimilate the information, not ignore it completely. It seems to me that most of the rejection of UZR is based on exactly this sort of egregious thinking.
by philosofool on
Jan 13, 2009 10:19 PM PST