FanPost

Run Estimators Discussion

There is a very interesting article on the Hardball Times website about run estimators.  It might help satisfy that statistics craving you have had, and you might just learn something.

http://www.hardballtimes.com/main/article/a-closer-look-at-run-estimation/

I thought this was especially interesting:
"As a result, the differences in estimations for any one team are fairly small, as in this case here where it varies by 18 runs. Variables that tend to throw these estimates off include especially good or especially poor hitting with runners on base (the 2002 Phils hit .250/.338/.398 with runners on and .266/.340/.441 with the bases empty), or offenses where one or two individuals carry the lion's share of the load."

I think this last part means that a horrible lineup with one or two really good hitters (Ichiro and Sexson) underperforms its estimate because their contributions are not taken advantage of.  Ichiro can hit all the singles he wants, but that doesn’t mean he will create runs unless a teammate gets a hit too.  Sexson’s solo home runs are worth only one run each, which is less than what our theories tell us a home run should be worth.  This means that a team like the A’s, with solid but not great hitters, will outperform a team that has a few great hitters and mostly horrible ones.  The Mariners last year were definitely a team stacked with garbage players for most of the year.

Another topic of interest:
"In both cases, however, we first have to answer the question of whether run estimation formulas that are designed and validated at the team level can actually be applied to individuals.
At first glance the answer should obviously be yes. After all, if a team can be projected to score X number of runs given a specific number of at bats, hits, doubles, and so on, then a player can be said to have "created" (contributed, produced, etc.) Y number of runs given his at bats, hit, doubles etc. However, statisticians are quick to point out that inferences about individuals based on aggregate data don’t always hold. This is the core of the so-called "ecological fallacy"."

I think this may be a problem with current "runs created" type models, because you may not be able to apply a theory that holds for teams to individual players.  I think it works better than using metrics such as OPS to compare players’ worth, which may be all that we want from it.  It may be good enough for what we are doing, but if you look at the extremes, things start to break down.  If you have a team where everybody hits one home run a game and doesn’t get any other hits, each of those home runs is worth exactly one run, so the team (and hence the players) will tend to be overrated by metrics that value a home run at more than one run.  Players could be underrated by the metrics just as easily.  If I put Reed in the 2 hole and he gets a bunch of walks, those walks are more likely to turn into runs, because Sexson could hit a double or a HR behind him, than if I put him in the 6 hole, where Betancourt will probably strike out and leave him stranded.  A single is much more valuable if Ichiro hits it because he is faster.  The biggest factor seems to be that a player’s real contribution to the team’s runs scored depends on all the other players in the lineup (probably most on the players right before and after him).  The list goes on and on.  On a balanced team, these effects are probably small and the general "runs created" theories probably work fine, but when you have a team where the production is very polarized, I wonder if they really tell us the whole story.
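The dependence on teammates shows up even in the simplest run estimator. Here is a rough sketch using Bill James's basic Runs Created formula, RC = (H + BB) * TB / (AB + BB), which is not necessarily the formula the Hardball Times article relies on. The stat lines are made up for illustration, and the names `runs_created` and `marginal_rc` are my own. It adds the same 30 walks to a strong-hitting team and a weak-hitting team and compares how much the estimate moves.

```python
def runs_created(ab, h, bb, tb):
    """Basic Runs Created: (H + BB) * TB / (AB + BB)."""
    return (h + bb) * tb / (ab + bb)


def marginal_rc(team, extra_bb):
    """Increase in the team's RC estimate from adding extra_bb walks."""
    before = runs_created(**team)
    after = runs_created(team["ab"], team["h"], team["bb"] + extra_bb, team["tb"])
    return after - before


# Hypothetical season totals, not real teams.
strong = dict(ab=5000, h=1400, bb=500, tb=2300)
weak = dict(ab=5000, h=1200, bb=500, tb=1700)

gain_strong = marginal_rc(strong, 30)
gain_weak = marginal_rc(weak, 30)

print(f"30 extra walks on the strong team: +{gain_strong:.1f} RC")
print(f"30 extra walks on the weak team:   +{gain_weak:.1f} RC")
```

With these invented numbers, the same 30 walks add about 8 runs to the strong team's estimate and about 6 to the weak team's. The formula says nothing about batting order, so it cannot capture the 2 hole versus 6 hole difference, but it does show that the value of getting on base goes up when the hitters around you have more power.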

I think these metrics are usually pretty good at comparing individual players’ abilities, but I don’t know if they are very good at showing how important the players are to the team.  Maybe they aren’t really meant to show that.  If I have Reed hitting in front of Sexson, I need Reed to get on base (by the way, I think Reed will see his OBP jump).  Sure, it would be nice if Reed hit a few HRs, but it would help the team much more if he could consistently get on base.  The same thing could be said for a leadoff hitter, because in about a quarter of his at-bats nobody is on base.  When Ichiro starts a game, a single is just as good as a walk, but singles are always worth more than walks in all these methods, so a leadoff hitter like Ichiro, who rarely walks, may actually see a boost to his RC numbers even though a single in that spot helps the team no more than a walk would.

I think a lot of this statistics stuff is pretty interesting, but it seems like little is gained by using some of the pretty complicated stats out there.  By far, the most useful stats are going to be the simple stuff like OBP or SLG, plus a peek at HR and 2B rates, along with some scouting info on the player’s defense.  These RC-style stats are interesting because they make us reconsider how valuable we think a single is compared to a HR and help us make better decisions when comparing players, but I think it is also important to keep in mind that if we use RC, the individual players’ numbers may not add up to the team’s total.
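To see the sum-of-the-parts issue in numbers, here is a small sketch using Bill James's basic Runs Created formula, RC = (H + BB) * TB / (AB + BB). That is an assumption on my part; the article may use other estimators. The lineup is invented: two stars and seven weak hitters, with hypothetical stat lines. It computes RC for each player separately, adds them up, and compares the result to RC computed once on the team totals.

```python
def runs_created(ab, h, bb, tb):
    """Basic Runs Created: (H + BB) * TB / (AB + BB)."""
    return (h + bb) * tb / (ab + bb)


# Hypothetical stat lines, not real players.
star = dict(ab=600, h=190, bb=80, tb=340)
scrub = dict(ab=500, h=115, bb=30, tb=160)
lineup = [star] * 2 + [scrub] * 7

# RC applied player by player, then summed.
sum_of_individuals = sum(runs_created(**p) for p in lineup)

# RC applied once to the team's combined totals.
team_totals = {k: sum(p[k] for p in lineup) for k in ("ab", "h", "bb", "tb")}
team_rc = runs_created(**team_totals)

print(f"Sum of individual RC: {sum_of_individuals:.1f}")
print(f"RC on team totals:    {team_rc:.1f}")
```

For this made-up lineup the individual numbers add up to about 576 runs while the team-level estimate is about 552, so the players get credit for roughly 24 runs the formula does not expect the team to score. The gap can go the other way with a different mix of hitters. The point is only that the formula is not additive, so the parts and the whole do not have to agree.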

Oh yeah, I really like the "Win Probability Added" stuff, so keep up the good work, Jeff.  I think it’s one of the cooler approaches to tracking performance, and it gives us some pretty graphs to look at, although hopefully this season more of them will finish going up to one than down to zero.