As most of you know, I'm inordinately fond of adjusting a pitcher's runs given up by the outcomes a pitcher himself controls, namely K/BB/HBP/GB/OFB/IFB/HR. I'm going to assume everyone reading is familiar with my work on this here and here (and if you're not, I just linked it for you).
There's a problem with this approach, however, especially when we then translate it into a per-9-IP measurement: although we've adjusted the runs the pitcher has given up, we still haven't adjusted for outs, so using his actual IP might give a less-than-accurate result. Obviously, this is a problem for tRA, which is an R/9IP analogue.
How do we fix this? The same way we adjust for runs: find out how many outs we should expect for each batted ball type, and then use those expected outs rather than actual innings pitched.
I use values as follows:
K: 1.00 (α)
BB: 0.00 (β)
HBP: 0.00 (γ)
LD: 0.26 (δ)
GB: 0.81 (ε)
OFB: 0.867 (ζ)
IFB: 0.971 (η)
HR: 0.00 (θ)
These numbers were derived from a THT article and a Replacement-Level-Yankees post. You'll note that the only values that are exactly the same as in the sources I cited are LD and IFB. This is because a) double plays aren't accounted for (i.e. the numbers in the links are likelihood of at least 1 out, rather than the number of outs) and b) HR are bundled in with outfield flies.
I got around the first problem by summing double plays over an average season (2006, to be specific, because all the other stats I use are from that season) and dividing by total grounders, giving an extra 0.078 outs per ground ball. The more observant amongst you will have noticed that this technique ignores the distinction between ground ball double plays and every other type - I did it like this because otherwise it would be impossible. I don't think the assumption that every double play comes off a ground ball will lead to much inaccuracy.
The second problem's just some pretty simple algebra:
yields ζ=0.867 given rough league averages for all of those values.
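One way to write out that algebra, assuming (my reading, not something the sources confirm) that the published fly ball figure, called p here, is outs per outfield fly with home runs counted among the flies and worth zero outs, and that OFB and HR are league totals:

```latex
p = \frac{\zeta \cdot \mathrm{OFB} + \theta \cdot \mathrm{HR}}{\mathrm{OFB} + \mathrm{HR}},
\qquad \theta = 0
\quad\Longrightarrow\quad
\zeta = p \cdot \frac{\mathrm{OFB} + \mathrm{HR}}{\mathrm{OFB}}
```

Since every home run in the bundle is a guaranteed non-out, stripping them out has to push the out rate on the remaining flies up, so ζ comes out larger than p.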
Now that we have expected outs per event, we can sum up expected outs as follows:
xOuts = Batters_Faced*[K%*α+BB%*β+...+HR%*θ]
which simplifies (since lots of our out values are 0 or 1) to:
xOuts = Batters_Faced*[K%+LD%*δ+GB%*ε+OFB%*ζ+IFB%*η]
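As a rough sketch of that sum in Python: the out values are the ones listed above, but the function name and the sample pitcher line are made up for illustration. It works on raw event counts, which is the same thing as Batters_Faced times each rate.

```python
# Expected outs per event, copied from the list of values above.
OUT_VALUES = {
    "K": 1.00, "BB": 0.00, "HBP": 0.00, "LD": 0.26,
    "GB": 0.81, "OFB": 0.867, "IFB": 0.971, "HR": 0.00,
}


def expected_outs(counts):
    """Sum expected outs over raw event counts (count = Batters_Faced * rate)."""
    return sum(OUT_VALUES[event] * n for event, n in counts.items())


# Hypothetical pitcher line, invented purely for illustration.
sample = {"K": 150, "BB": 60, "HBP": 5, "LD": 120,
          "GB": 280, "OFB": 170, "IFB": 25, "HR": 20}

x_outs = expected_outs(sample)
print(round(x_outs, 3))
```

BB, HBP and HR stay in the dictionary even though they contribute nothing, so the code lines up with the unsimplified formula.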
Sorted? Not quite. Not all of a pitcher's outs are made as the result of a plate appearance - there's baserunning to be considered. Here's where I make the more worrying of my two assumptions: pitchers are all equally adept at controlling the running game. This is obviously untrue, but I have no pitcher-by-pitcher data for PO/CS, and so this is the way I had to do it. So few outs are made this way compared to actually pitching that it's unlikely to skew the results very much.
Instead, I just took total CS (again from 2006) and divided it by total not-outs (there's an iterative process somewhere in here, but it's messy and I don't want to talk about it). This gives us 0.016 (κ) CS for every not-out*. Our modified xOuts is then as follows:
xOuts' = Batters_Faced*[1-(1-(K%+LD%*δ+GB%*ε+OFB%*ζ+IFB%*η))*(1-κ)]
And then we can finally get tRA again by doing a calculation that gives us expected runs per 27 expected outs, like this:
tRA = [xRuns/xOuts']*27
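Here's a minimal sketch of the last two steps in Python. The value of κ is the 0.016 from above; the function names and the input numbers are hypothetical, and xRuns is taken as given since its calculation isn't covered in this post.

```python
KAPPA = 0.016  # caught stealing per not-out, as quoted above


def expected_outs_with_running(batters_faced, out_rate, kappa=KAPPA):
    """xOuts': outs at the plate plus runners later caught stealing.

    out_rate is the bracketed sum K% + LD%*delta + GB%*epsilon + ...
    A batter avoids becoming an out only if he is not retired at the
    plate (1 - out_rate) and is not then caught stealing (1 - kappa).
    """
    return batters_faced * (1 - (1 - out_rate) * (1 - kappa))


def tra(x_runs, x_outs_prime):
    """Expected runs per 27 expected outs."""
    return x_runs / x_outs_prime * 27


# Made-up inputs for illustration only.
x_outs_prime = expected_outs_with_running(batters_faced=800, out_rate=0.70)
example_tra = tra(x_runs=100, x_outs_prime=x_outs_prime)
print(round(x_outs_prime, 2), round(example_tra, 2))
```

Setting kappa to 0 gets you back the plain xOuts, which is a handy sanity check on the bracket.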
This ends up being a much more robust calculation than the previous methods, because it more or less nullifies the effect defence and park** have on how many innings a pitcher works, just as tRA was already doing for runs. It should, therefore, be significantly more accurate than previous versions of tRA.
Let's show you an example of what it all ends up looking like, using Mariner data from last year:
xIP is expected innings pitched (xOuts'/3), O-xO is Outs-xOuts', and the rest of those columns should be familiar to you. Isn't it interesting how badly our defence managed to screw Jeff Weaver? At least they got outs for Ho while giving up a tonne of runs, but they completely bailed on both making plays and preventing runs for poor Weaver.
NB: The xRuns here is park adjusted and thus not what tRA is calculated with. Just so you know.
An interesting offshoot of this work is that it's generated team Outs-xOuts and xRuns-Runs. These could be useful as measures of team defence, and what's even more curious is that the relationship between O-xO and xR-R isn't nearly as strong as I was expecting. That's a story for another day, however.
*Unless you are the 2008 Mariners. >:(
**This is not entirely true. Some parks reduce the likelihood of