Baseruns applied to pitchers using batted ball data
First off, this is not a completely novel or creative idea. It has been tossed around here and there for a while. Graham and I exchanged a few comments about it a while back, which ended with me saying it would be cool and Graham saying he didn't have the time to do it. From there I expected it to fade away and not be thought about again in the near future.
The basic idea is that Baseruns is a very good runs estimator. Some would probably consider it the best for a variety of reasons. The most important one is that it can account for basically any run environment. For example, if we use it to analyze a single game where one of the teams gets one hit and it's a home run, then Baseruns predicts 1 run instead of the 1.4 runs linear weights predicts. If you want to read more about it, I suggest looking at The Book Blog, or ask if you want more details.
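To make the one-hit example concrete, here is a minimal sketch in Python. It assumes one commonly cited simple version of the Baseruns formula, A*B/(B+C) + D, with the coefficients shown in the code. The post does not say which version is meant, and the function name `base_runs` is only illustrative.

```python
def base_runs(ab, h, tb, hr, bb):
    """Estimate runs with a simple Baseruns form: A * B / (B + C) + D.

    Assumption: the coefficients follow one commonly cited basic version
    of Baseruns, not necessarily the version used for this post.
    """
    a = h + bb - hr                                          # baserunners
    b = (1.4 * tb - 0.6 * h - 3.0 * hr + 0.1 * bb) * 1.02    # advancement
    c = ab - h                                               # outs
    d = hr                                                   # automatic runs
    return a * b / (b + c) + d


# A team whose only hit in 28 at-bats is a solo home run.
one_hit_game = base_runs(ab=28, h=1, tb=4, hr=1, bb=0)
print(one_hit_game)
```

With no other baserunners the A term is zero, so the estimate collapses to the home run itself, which is the 1 run described above. A linear estimator cannot do this because it always credits the home run with its average value.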
The major problem with Baseruns as a run estimator is that it is difficult to apply to individual players because it is a team runs estimator. A great player on an average team is a great player living in an average offensive environment. If you take that great player's stats and enter them into a Baseruns equation, you will overestimate the production of the great player because Baseruns thinks he is a great player playing in a great offensive environment. Likewise, a poor hitter will be undervalued by Baseruns. One of the ways to use Baseruns to calculate the value of a hitter is to use Baseruns to calculate the team production and then calculate the team production without the hitter on it. Linear weights is a more commonly used offensive metric because you can calculate the production of a hitter, independent of the run environment (team) they play on. Instead, linear weights assumes that the hitter plays in an average offensive environment which is fair and makes a lot of sense for a hitter.
The great thing for Baseruns is that pitchers play in a run environment determined mostly by themselves (as well as defense, park and competition level). Felix Hernandez has a run scoring environment which is much different than Livan Hernandez's, and this changes the run value of different outcomes. I will be ignoring the effect of the defense and park for now, although these are important. This means that we can take the stats from a pitcher's performance and input them into a Baseruns equation and it will spit out the expected runs while accurately taking into account the value of strikeouts, walks, and every hit type in that specific run environment. This is something that pretty much none of the metrics out there do (FIP, xFIP, tRA, etc.). For example, Felix's home runs are less harmful than average because he allows fewer baserunners than average, and Livan's strikeouts are worth more than average since he likely has baserunners on. Overall this is a pretty small effect because almost all pitchers have roughly similar run environments, except for a few exceptionally good and exceptionally bad pitchers. To sum up this paragraph, a pitcher's performance (runs) is not linearly related to our measured variables (K, BB, etc.). Baseruns takes this into account and FIP, tRA, etc. don't. Since most pitchers are actually very similar, with small absolute differences in ability, these non-linearities only show themselves at the extremes.
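The non-linearity can be seen by adding one extra home run to two different stat lines and comparing how many runs it adds. This is only a sketch: it assumes a commonly cited simple Baseruns form, and both stat lines are invented for illustration rather than taken from any real pitcher.

```python
def base_runs(singles, doubles, triples, hr, bb, outs):
    """Simple Baseruns form A * B / (B + C) + D (assumed coefficients)."""
    h = singles + doubles + triples + hr
    tb = singles + 2 * doubles + 3 * triples + 4 * hr
    a = h + bb - hr
    b = (1.4 * tb - 0.6 * h - 3.0 * hr + 0.1 * bb) * 1.02
    return a * b / (b + outs) + hr


def marginal_hr_value(line):
    """Runs added by one extra home run on top of a stat line."""
    with_extra = dict(line, hr=line["hr"] + 1)
    return base_runs(**with_extra) - base_runs(**line)


# Hypothetical per-27-outs lines, invented for illustration only.
few_baserunners = dict(singles=4, doubles=1, triples=0, hr=1, bb=1, outs=27)
many_baserunners = dict(singles=7, doubles=2, triples=0, hr=1, bb=4, outs=27)

print(marginal_hr_value(few_baserunners))
print(marginal_hr_value(many_baserunners))
```

The extra home run adds fewer runs to the line with few baserunners, which is the Felix case described above. A linear estimator would give both home runs the same value.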
Interestingly, I stumbled across a thread over at Tango's blog where he was discussing BP's new pitching metric, SIERA. There, I found Patriot (a well known Saber/blogger) had recently done something pretty similar to what I was thinking about using some data (Comment #28 - Colin Wyers) that gave 1B, 2B, etc probabilities for batted ball types. Patriot was trying to recreate SIERA using this sort of data so his aims were a little different than what I was looking for.
I was interested in using this data to convert a player's batted ball profile (FB, GB, etc.) along with BB and K rates into expected runs allowed using the Baseruns equation. The data posted by Colin Wyers gave me all the batted ball outcomes I needed. With this data I could take a pitcher that gave up 10 groundballs, 10 flyballs, 2 popups and 5 line drives and estimate how many singles, doubles, triples and home runs he would give up in an average park in front of an average defense. The next step is to take the projected number of 1B, 2B, 3B and HR along with the number of K and BB the pitcher allowed and plug all this info into the Baseruns equation. Baseruns then estimates the number of runs allowed by the pitcher based on his batted ball profile, independent of park or defense. (Not completely independent, but it's close. More work could fix the bias.)
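The two steps described above can be sketched as follows. The hit probabilities in `HIT_PROBS` are made-up round numbers standing in for the Colin Wyers data, which is not reproduced in this post, and the Baseruns form is a commonly cited simple version rather than the one in my spreadsheet. The function names are illustrative too.

```python
# Hypothetical placeholder probabilities of each hit type per batted ball.
# Made-up round numbers for illustration; they are NOT Colin Wyers' values.
HIT_PROBS = {
    "GB": {"1B": 0.20, "2B": 0.02, "3B": 0.00, "HR": 0.00},
    "FB": {"1B": 0.05, "2B": 0.05, "3B": 0.01, "HR": 0.10},
    "LD": {"1B": 0.50, "2B": 0.20, "3B": 0.02, "HR": 0.02},
    "PU": {"1B": 0.01, "2B": 0.00, "3B": 0.00, "HR": 0.00},
}


def expected_hits(profile, probs=HIT_PROBS):
    """Turn batted-ball counts into expected counts of each hit type."""
    hits = {"1B": 0.0, "2B": 0.0, "3B": 0.0, "HR": 0.0}
    for ball_type, count in profile.items():
        for hit_type, p in probs[ball_type].items():
            hits[hit_type] += count * p
    return hits


def base_runs_from_profile(profile, strikeouts, walks):
    """Expected runs allowed from a batted-ball profile plus K and BB.

    Assumes a commonly cited simple Baseruns form and ignores errors,
    hit batters, double plays, and sacrifices.
    """
    hits = expected_hits(profile)
    h = sum(hits.values())
    tb = hits["1B"] + 2 * hits["2B"] + 3 * hits["3B"] + 4 * hits["HR"]
    outs = sum(profile.values()) - h + strikeouts  # batted-ball outs plus K
    a = h + walks - hits["HR"]
    b = (1.4 * tb - 0.6 * h - 3.0 * hits["HR"] + 0.1 * walks) * 1.02
    return a * b / (b + outs) + hits["HR"]


# Batted-ball counts from the example in the text; K and BB are invented.
profile = {"GB": 10, "FB": 10, "PU": 2, "LD": 5}
print(expected_hits(profile))
print(base_runs_from_profile(profile, strikeouts=6, walks=2))
```

Swapping in real probabilities, or park specific ones, only changes the `HIT_PROBS` table. The rest of the pipeline stays the same.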
I think this is pretty interesting and we will probably see this sort of thing pop up in some form. However, the differences between using Baseruns and a linear run estimator turn out to be pretty small. Graham has talked about trying to use Baseruns with tRA as a way to improve it, but really the improvement would be small and it's tough to motivate coding and gathering all the data to implement it. It would still be cool, though.
A couple of small notes about my implementation. The values Colin Wyers gave were for a few years ago and appear to overpredict hits and home runs, which is probably because offense has declined since then. To take care of this I just applied a fudge factor to push the hit and home run totals down in line with league performance last year. The fudge factor isn't ideal but I haven't taken the time to master play-by-play databases to be able to calculate this sort of stuff for myself. I am posting the spreadsheet right here if people are interested. I intend to post a couple of things building on this, but I wanted to get this out there first to hear any thoughts. If we think this sort of thing is valid and works then I'll throw up a post with some more analysis.
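For anyone curious what a fudge factor like this might look like, here is one simple way to do it: a single multiplier that makes the projected league total match the actual league total. This is an assumption about the general approach, not a copy of what is in the spreadsheet, and the numbers are invented.

```python
def fudge_factor(projected_league_total, actual_league_total):
    """Multiplier that makes a projected league total match the actual one."""
    return actual_league_total / projected_league_total


def rescale(projected_by_pitcher, factor):
    """Apply the same multiplier to every pitcher's projected total."""
    return {name: value * factor for name, value in projected_by_pitcher.items()}


# Invented numbers for illustration only.
projected_hr = {"Pitcher A": 22.0, "Pitcher B": 30.0, "Pitcher C": 28.0}
actual_league_hr = 72.0

factor = fudge_factor(sum(projected_hr.values()), actual_league_hr)
adjusted_hr = rescale(projected_hr, factor)
print(factor, adjusted_hr)
```

A single multiplier keeps every pitcher's ranking intact and only shifts the overall level. It cannot correct a batted ball type whose hit rates drifted more than the others.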
The spreadsheet (2009 data) including leaderboards (Pitchers w/ 50+ IP) and all the data can be downloaded here (I hope you can download this). There are lots of numbers and I haven't explained too much of the details so let me know if you have questions about what is going on.
Comments
I think this is a great idea
When the batted ball data first became available on THT, one of the first things I did was try to create an estimated batting line using the data and apply base runs to it. I think it’s great because you can then use park factors and park specific hit values.
One of the things I would like to see is a pitcher projection using a method similar to this, using regressed inputs (K%, BB%, GB%, etc).
It would be tRA but also uses a different way to generate the run values for each outcome.
Baseruns is non-linear and more accurate for pitchers than the current linear weights used. (I think this is right, correct me if I’m wrong.)
by Edgar for Pres on May 1, 2010 1:15 PM PDT
Very, very nice
If I remember right, this is essentially the same thing David Gassko did with DIPS 3.0.
One thing to take into consideration (if you haven’t yet) is to fudge the “B” factor to maximize accuracy. For example, the best-fit for 2009 (using your inputs) would be:
B = .779*1B + 2.045*2B + 3.312*3B + 1.753*HR + 0.097*TBB
We can take it a step further and include even more data:
B = .765*1B + 2.052*2B + 3.301*3B + 1.784*HR + 0.841*ROE + 0.055*NIBB + 0.172*HBP – 0.509*IBB – 0.04*(AB – H – K – ROE + SH + SF) – 0.060*K
The “A” term would have to incorporate ROE, also.
It’d be interesting to see how much stronger the correlation or RMSE would be.
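As a rough sketch, the simpler of the two fitted B terms above could be dropped into a Baseruns calculation like this. Only the B coefficients come from the fit quoted above. The A, C, and D terms are assumed to take their common simple forms, and the stat line is invented for illustration.

```python
def fitted_b(singles, doubles, triples, hr, tbb):
    """B term using the 2009 best-fit coefficients quoted in this comment."""
    return (0.779 * singles + 2.045 * doubles + 3.312 * triples
            + 1.753 * hr + 0.097 * tbb)


def base_runs_fitted(singles, doubles, triples, hr, tbb, outs):
    """Baseruns with the fitted B term.

    Assumption: A, C, and D take their common simple forms
    (A = H + BB - HR, C = outs, D = HR); the comment only specifies B.
    """
    h = singles + doubles + triples + hr
    a = h + tbb - hr
    b = fitted_b(singles, doubles, triples, hr, tbb)
    return a * b / (b + outs) + hr


# Invented stat line for illustration only.
print(base_runs_fitted(singles=6, doubles=2, triples=0, hr=1, tbb=3, outs=27))
```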
Triples Alley: Analysis of the San Francisco Giants, Baseball, and Sabermetrics.
Whoops
That should read as “-0.004” for non-K outs.
Triples Alley: Analysis of the San Francisco Giants, Baseball, and Sabermetrics.
Yeah I wasn't really too worried about making this really complex
because I feel like the major source of inaccuracy is the numbers from Colin Wyers that gave 1B, 2B, etc. probabilities for batted ball types, because they don’t really apply to 2009. You definitely could play these sorts of games and increase the total accuracy.
by Edgar for Pres on May 1, 2010 4:05 PM PDT
Also, looking at it closer, DIPS 3.0 is basically the same thing.
That’s cool and good to know. It would be interesting to compare DIPS 3.0 and tRA to see what differences there are.
by Edgar for Pres on May 1, 2010 4:07 PM PDT
This seems like a good post to ask:
Is it possible/easy to integrate tRA with a defensive metric in order to get an estimated ERA for a given player on a given team?
Ignoring the fact that the “E” part of that would be kind of stupid, it does seem like this would be a great way to make the metric more mainstream. In some ways it might even be more interesting, because although ERA is a bad way to judge pitching talent, it is an interesting way to see what actually happened when a pitcher pitched.
Additional Question: Could tRA be used to create a defensive metric, by performing some equation on the difference between a pitcher’s actual RA and their tRA? It would have to be team wide, but it could still be interesting.
...and now I'm here
3rd Question:
What is the point of tERA, since it assumes an average defense?
...and now I'm here
Tells you how good the pitcher is. Basically it tries to subtract out everything like defense or the park.
If you want to compare pitchers then you want to only be comparing the pitchers and not anything about where they play or who they play in front of. This is the basic thinking behind all of the modern pitching metrics (FIP, xFIP etc.).
by Edgar for Pres on May 2, 2010 2:43 AM PDT
I was under the impression that it was simply tRA multiplied by .93
In order to get the number updated for earned runs. But that would be pointless, since it doesn’t give you an estimated ERA since ERA is largely based on defense. It is the same reason you don’t multiply FIP by .93 to get eFIP.
...and now I'm here
Oh the difference between tRA and tERA is just the factor of .92 or .93 (I forget which exactly)
Graham likes RA more because errors are stupid. Fangraphs switched from tRA to tERA simply because we all think in the ERA scale since it’s so common and well understood. FIP and xFIP are on a scale that is equivalent to ERA. This way you can compare tERA, FIP, xFIP and ERA because they are all scaled the same. On average ERA, tERA and FIP come out to the same number (league average) and there are a ton of reasons why one might be higher or lower than another.
FIP is on the same scale as ERA already.
by Edgar for Pres on May 2, 2010 3:19 AM PDT
I appreciate the responses, but I do worry that we're not talking about the same thing.
I’m saying that there is no point in “scaling to ERA” since the number itself is meaningless when the two are compared. Let’s look at another example:
Average wOBA is .330
Average BA is… let’s say .270
So if I wanted to, I could multiply wOBA by .8181 and I’d get a wOBA that is “scaled to BA.” But that still tells me nothing since the calculations involved in wOBA have essentially nothing/little to do with the calculation involved with BA. It appears that the same thing is true of tERA. I understand the basic idea of all runs vs. earned runs, but why bother scaling to earned runs at all if the number you get is still not going to tell you anything new about the pitcher, or the expected results of that pitcher, or anything.
Let me know if I’m missing the mark in any way, but as far as I can tell I can’t think of one good reason to scale tRA to ERA even if you believe that earned runs are important. It still doesn’t tell you anything new. It would make more sense to me to use tRA and a defensive metric to come up with an estimated ERA, which can then be compared to the actual ERA. That I can see a use for. tERA just seems worthless.
...and now I'm here
I fully admit I might be missing something in your explanation though.
...and now I'm here
I guess maybe the idea of tERA is to say
“Hey, if the defense was average, here’s what his ERA would have been” but that still seems weird.
...and now I'm here
I think this is a good way to think about it but you also have to include environment and luck in that statement too.
I agree with you that ERA is a bad metric but by scaling all our pitching metrics to ERA I don’t think we are necessarily saying that ERA is better than it is. We are just comfortable with the ERA scale. We could just as easily put tERA on a scale where 100 is average. It’s all about readers being able to process a number into something that means something to them.
The same can be said for wOBA. OBA doesn’t tell us completely how good a hitter is, and it is biased toward overvaluing players who don’t hit for power. We scale wOBA to an OBA scale because people know what a good OBA is and what a bad OBA is, so they automatically know what a good wOBA is and what a bad wOBA is.
by Edgar for Pres on May 2, 2010 11:57 AM PDT
Good questions
1. You could do that. Basically all you have to do is adjust the run values you are using. Defense has a good outfield? Make flyballs have a lower run value. That sort of stuff is actually pretty easy. I would assume projection systems include a lot of this stuff already so you can go to fangraphs and look at their projected FIP and ERA. I’d assume the differences are from park factors and adjustments for defense.
2. Statcorner already kind of has something like this. They now call it xRR. It is a convolution of park effects and defense. You could normalize it for park effects if you wanted but I think the other widely available metrics would be better although I think xRR does track with team defense decently well. Team defense is pretty easy to measure though.
by Edgar for Pres on May 2, 2010 2:41 AM PDT
