The tRA Post
I've been promising everyone a full tRA writeup for some time now, so here it is. Enjoy.
Introduction
We don't really have a widely available, coherent metric that tells us how good a pitcher is independent of his home park and the defence behind him (and if anyone feels tempted to say 'ERA' here, read Dave Cameron's article on pitcher evaluation first). FIP and xFIP are the most commonly used general pitching stats we have, and they're not really good enough, as they only look at three possible outcomes of an at-bat: K, BB, HR.
There is therefore a distinct motivation for the construction of a metric which takes into account every action a pitcher is responsible for, and turns those numbers into runs, based around a highly logical and transparent mathematical framework.
Theory and Method
There are essentially only eight possibilities for the state of a baseball at the instant the contest between batter and pitcher has resolved itself. The batter may walk, strike out, or be hit by a pitch. He may also hit a line drive, a ground ball, an outfield fly, a popup, or a home run. Others, such as bunts and intentional walks, are essentially subsets of the more important outcomes. These possibilities can be regarded as being governed by the pitcher, provided that there is a large enough sample size.
tRA is built around knowing how many runs and outs each of these events is worth, and ideally we want to know this for each year so as to account for different run-scoring environments. Before we delve into the metric itself, let's see how we might accomplish this, with a little help from Matthew the Data Fairy and his magic.
Using play-by-play data we can, in any given year, determine the average number of outs made on a given type of play by simply going through the logs, counting the outs recorded on that type of play, and dividing by the number of times it occurred. Fairly straightforward, although a small correction factor has to be introduced to deal with outs made on the bases. An example table for 2008 is shown below:

Runs are slightly more tricky. We have to introduce a run expectancy matrix in order to work out how many runs should score from any given game situation (bases empty, no out, etc.). In general, they look something like this, but it's certainly possible to build your own, again based on play-by-play data. Once the matrix is derived, we can work out the run value of any given play as follows:
play_run_value = runs_scored + (run_expectancy_after - run_expectancy_before)
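For anyone who wants to try this at home, here is a minimal sketch of the formula above applied to a handful of plays and averaged by event type. The run expectancy numbers, the base-state labels, and the play log are all made up for illustration; they are not the 2008 values, and this is not the actual code behind tRA.

```python
from collections import defaultdict

# Hypothetical run expectancy matrix keyed by (base state, outs).
# These values are invented for illustration only.
RUN_EXPECTANCY = {
    ("---", 0): 0.50,
    ("1--", 0): 0.90,
    ("---", 1): 0.27,
}


def play_run_value(runs_scored, re_before, re_after):
    """Run value of one play: runs scored plus the change in run expectancy."""
    return runs_scored + (re_after - re_before)


def average_run_values(plays):
    """Average the run value of each event type over a list of plays.

    Each play is (event, state_before, state_after, runs_scored).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for event, before, after, runs in plays:
        value = play_run_value(runs, RUN_EXPECTANCY[before], RUN_EXPECTANCY[after])
        totals[event] += value
        counts[event] += 1
    return {event: totals[event] / counts[event] for event in totals}


# A tiny made-up play log: one walk, one ground out, one ground-ball single.
plays = [
    ("BB", ("---", 0), ("1--", 0), 0),
    ("GB", ("---", 0), ("---", 1), 0),
    ("GB", ("---", 0), ("1--", 0), 0),
]

print(average_run_values(plays))
```

The same group-and-average loop would work for outs: swap the run value for the number of outs recorded on each play.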
With a little effort/fairy dust, the yearly average value of each type of play can be determined. So far for 2008 it looks a little like this:

If these are combined with the frequency with which a pitcher gives up each outcome (after making some park adjustments based on these numbers [spreadsheet]) and multiplied by total batters faced (TBF), we can determine how many runs that pitcher would have given up in a neutral park in front of an average defence. From the outs table shown earlier we can also figure out how many outs/innings he would have been expected to pitch through. tRA can then be determined as follows:
tRA = expected_runs/expected_outs*27
which gives us the expected runs a pitcher will give up per 9 innings pitched.
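As a rough sketch of the arithmetic, the calculation might look something like the following. The event labels, rates, run values and out values are placeholders I've invented for the example (not the 2008 tables), and the park adjustment step is left out entirely.

```python
def expected_totals(rates, run_values, out_values, tbf):
    """Expected runs and outs for a pitcher, given per-batter outcome rates."""
    expected_runs = tbf * sum(rates[e] * run_values[e] for e in rates)
    expected_outs = tbf * sum(rates[e] * out_values[e] for e in rates)
    return expected_runs, expected_outs


def tra(rates, run_values, out_values, tbf):
    """tRA = expected_runs / expected_outs * 27 (runs per nine innings)."""
    expected_runs, expected_outs = expected_totals(rates, run_values, out_values, tbf)
    return expected_runs / expected_outs * 27


# Hypothetical outcome rates per batter faced (they sum to 1).
rates = {"K": 0.20, "BB": 0.08, "HBP": 0.01, "LD": 0.14,
         "GB": 0.32, "OFFB": 0.18, "IFFB": 0.04, "HR": 0.03}
# Hypothetical run and out values per event, for illustration only.
run_values = {"K": -0.11, "BB": 0.30, "HBP": 0.33, "LD": 0.38,
              "GB": 0.05, "OFFB": 0.04, "IFFB": -0.10, "HR": 1.40}
out_values = {"K": 1.0, "BB": 0.0, "HBP": 0.0, "LD": 0.30,
              "GB": 0.80, "OFFB": 0.85, "IFFB": 0.98, "HR": 0.0}

print(round(tra(rates, run_values, out_values, tbf=800), 2))
```

Note that TBF cancels out of the ratio, so it only matters for the expected run and out totals themselves.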
tROA measures how many runs are saved by a given pitcher compared to an average pitcher from the same grouping (e.g. NL SP, etc.). This metric ignores the work done on determining how many outs a pitcher should be expected to record, but it does allow for measuring pitchers in terms of their overall run value (which can then be converted to wins). tRA+ is simply pitcher_tRA/league_tRA, facilitating a quick evaluation of a pitcher compared to league average.
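A quick sketch of how these two might be computed. tRA+ follows the ratio given above; the tROA function is only one plausible reading of the description (league-average runs per batter faced, scaled to the pitcher's TBF, minus his expected runs), and the function names and example numbers are hypothetical.

```python
def tra_plus(pitcher_tra, league_tra):
    """tRA+ as described above: pitcher_tRA / league_tRA (lower is better)."""
    return pitcher_tra / league_tra


def troa(expected_runs, tbf, league_runs_per_batter):
    """Assumed form of tROA: runs an average pitcher from the same group
    would allow over the same number of batters, minus this pitcher's
    expected runs. Positive means runs saved."""
    return league_runs_per_batter * tbf - expected_runs


# Made-up example numbers.
print(tra_plus(3.6, 4.5))
print(troa(expected_runs=80.0, tbf=800, league_runs_per_batter=0.12))
```

Because tROA works per batter faced rather than per out, it never touches the expected-outs side of the calculation, which matches the caveat above.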
Another point worth considering is regression towards average. Certain pitching stats are known to fluctuate quite wildly from year to year, and in order to correct for this every outcome is regressed towards the mean based on its year-by-year correlation value and the total batters that a pitcher has faced on the season, with less regression applied the larger the sample size. The actual values to which regression is applied are as follows:
K%, BB%, HBP%, GB per ball in play%, IFF per ball in air%, LD per ball in air%, and HR per FB%
The order is extremely important, as influencing GB% will have an effect on LD% later, and so on, sometimes causing regression away from the mean in unusual situations.
Once a pitcher's line has been regressed, the same algorithms used to generate tRA and tROA are applied again to give tRA* and tROA*.
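To give a flavour of what regressing a single rate looks like, here is a sketch using a standard shrinkage form. This is a swapped-in textbook approach, not necessarily the exact method used for tRA*: the `stabilizer` constant is hypothetical, and in practice it would be chosen from the year-to-year correlation of the stat (a less stable stat gets a bigger constant and so more regression).

```python
def regress(observed_rate, sample_size, league_rate, stabilizer):
    """Shrink an observed rate towards the league rate.

    The weight on the observed rate grows with sample size, so larger
    samples are regressed less. `stabilizer` is a hypothetical constant.
    """
    weight = sample_size / (sample_size + stabilizer)
    return weight * observed_rate + (1 - weight) * league_rate


# Made-up example: a 30% K rate against an 18% league rate.
small_sample = regress(0.30, sample_size=200, league_rate=0.18, stabilizer=200)
large_sample = regress(0.30, sample_size=800, league_rate=0.18, stabilizer=200)
print(small_sample, large_sample)
```

Since the later rates in the list are expressed per ball in play or per ball in the air, each one would be regressed in turn and the pitcher's line rebuilt from the regressed pieces before recomputing tRA.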
Results
Unfortunately I can't give you guys any spreadsheets to play around with this time, but Matthew and I (ok, mostly Matthew) are working on making everything accessible online. We'll be sure to let you know when it's ready.
Hopefully we can appease you with some 2008 leaderboards and old 2007 player cards, though...




2007 Player Cards: Batista, Bedard, Felix, Silva, Washburn.
If there are any questions, that's what the comments section is for. Oh, and many thanks to Matthew for his work on this too.
Comments
2 interesting points I took away when making the leaderboards, 1 visible and 1 not.
Visible: Arizona has four (4!!) SP in the top ten, Colorado and the Dodgers both have three RP in the NL top ten. These are incredible groupings.
Not visible: Morrow is 11th in AL RP at 2.7 tRA.
by Matthew on Jun 23, 2008 12:44 PM PDT reply actions 0 recs
KC also has three RP's in the top ten.
How close do the expected runs match up with the actual runs? I’m assuming the error % is small, but I’m curious.
by Jed MC on Jun 23, 2008 4:12 PM PDT up reply actions 0 recs
Well, it is an awesome writeup, and Graham is also awesome.
Makes sense.
I reject your reality and substitute my own!
Also, I'm always down for some online Grand Theft Auto IV or Rock Band. Gamertag: Phildopip
by Phildopip on Jun 23, 2008 1:02 PM PDT up reply actions 0 recs
I picture it as a very dramatic pause
as if the authorship was in question.
by Matthew on Jun 23, 2008 1:07 PM PDT up reply actions 0 recs
Ah.
I see it as something similar to, “Police close roads, bridges”.
I reject your reality and substitute my own!
Also, I'm always down for some online Grand Theft Auto IV or Rock Band. Gamertag: Phildopip
by Phildopip on Jun 23, 2008 1:23 PM PDT up reply actions 1 recs
there's a word for this but it escapes me
the artist formerly known as Mere Tantalisers.
by Bearskin Rugburn on Jun 23, 2008 1:36 PM PDT up reply actions 0 recs
Yeah, in a headline, the comma could function as an "and."
But in regular writing, the comma is good when addressing someone. Awesome punctuation, Brian!
Also: Awesome writeup, Graham. This is good stuff. Even a lay person such as myself can see the inherent logic of tRA. And it’s not like some crazy, secret BP metric in which the formula is unknown to the reader, so it’s much easier to trust the numbers. Good work.
by Teej on Jun 23, 2008 9:31 PM PDT up reply actions 0 recs
I read inherent as incoherent the first time
and was about to unleash rage.
by Matthew on Jun 23, 2008 9:49 PM PDT up reply actions 0 recs
at least you have power to delete your own misguided rants
Your= "belonging to you" You're= "You are" (like the song)
by JI on Jun 23, 2008 10:31 PM PDT up reply actions 0 recs
I knew Felix has taken a step backwards this year, but how far...? What's his current 2008 tRA?
Yesterday's Pants
A blog-thingy about the Mariners and stuff.
by BrettJMiller on Jun 23, 2008 1:26 PM PDT reply actions 0 recs
Wow, I'm up to 6,000 comments on LL
Awesome work, Graham and Matthew. Did you go through the FanGraphs data again one by one or were you able to get it automated this time?
by seattlebruin on Jun 23, 2008 1:35 PM PDT reply actions 0 recs
Maybe you should change your username
by seattlebruin on Jun 23, 2008 1:45 PM PDT up reply actions 0 recs
Fairy.
Oh, snap.
I reject your reality and substitute my own!
Also, I'm always down for some online Grand Theft Auto IV or Rock Band. Gamertag: Phildopip
by Phildopip on Jun 23, 2008 1:57 PM PDT up reply actions 0 recs
When I think of Matthew
I think of large boats filled with spreadsheets.
by BrianL on Jun 23, 2008 1:58 PM PDT up reply actions 0 recs
So the run values for the various outcomes are based on this season's play by play?
That seems like a fairly small sample size, but then the out expectancy seems almost in line with what I remember from Tango’s charts of a couple years ago so I guess it evens out pretty quickly.
I wonder, actually, how much this could change over the course of the season, as the air warms up and more OF flies go to the wall?
Also, I notice that you have four distinct categories of balls hit in the air – pop up, fly ball, home run, line drive – each with a very different run/out value. Is the play by play available out there any nearer to giving better characterizations of grounders? I imagine that the difference between a DP ball to short and a screamer up the line that just grazes the grass is considerable. And while I don’t know whether a pitcher can consistently induce weak grounders, you seem to be taking defense and park out of the equation, not so much repeatability. Great work, I look forward to seeing the season’s end tables.
the artist formerly known as Mere Tantalisers.
by Bearskin Rugburn on Jun 23, 2008 1:50 PM PDT reply actions 0 recs
I imagine BIP and run value data will get a shitload more refined once bat f/x gets into full swing
by Fett42 on Jun 23, 2008 2:03 PM PDT up reply actions 0 recs
I hope you mean ball f/x
but yeah, that would be awesome. for now we just have to do with hittrackeronline
the artist formerly known as Mere Tantalisers.
by Bearskin Rugburn on Jun 23, 2008 2:36 PM PDT up reply actions 0 recs
It is a small sample size, but if the object is to normalize values to the league environment, which is what Graham is trying to do,
then it’s necessary and not really a concern.
The values themselves change very little, but the impact of those changes is actually sizable.
BIS provides a velocity measurement (on a 1-3 scale), but as with all these things it’s entirely subjective. Plus BIS data costs money. MLB’s does not. You’re going to have issues with batted ball classifications no matter who you go by.
by Matthew on Jun 23, 2008 2:42 PM PDT up reply actions 0 recs
figures
I realize you have to stop with the categorizing at some point, else you end up like a JL Borges story (bonus points if you know which one).
But it seems unfair to give pitchers a bonus for pop flies while grouping double play balls with doubles up the line (unless inducing pop ups is a highly repeatable skill, which it isn… is it?). There is a ~.15 run difference between the two types of fly ball, and I imagine the difference between an infield grounder and one that gets to the grass is much greater.
In any case, what you guys have put together is great. It’s a work-related habit for me to try to find flaws in good research, even if there aren’t any to find.
the artist formerly known as Mere Tantalisers.
by Bearskin Rugburn on Jun 24, 2008 12:43 PM PDT up reply actions 0 recs
A while back I decided that whenever I'm in the company of Matthew
I need to buy him at least one beer. The same goes for Graham. This information is awesome.
Felix Hernandez is just... too... sweeeeeeeeeeeeet!
by Katal LM on Jun 23, 2008 2:48 PM PDT reply actions 0 recs
Spending the weekend in New England
=(
Felix Hernandez is just... too... sweeeeeeeeeeeeet!
by Katal LM on Jun 23, 2008 3:02 PM PDT up reply actions 0 recs
Lame.
Though there’s another chance two or three weeks later in Portland.
by Matthew on Jun 23, 2008 3:07 PM PDT up reply actions 0 recs
Yeah?
An excuse to visit Portland + that excuse being beer = I can’t miss that.
Felix Hernandez is just... too... sweeeeeeeeeeeeet!
by Katal LM on Jun 23, 2008 3:10 PM PDT up reply actions 0 recs
Fair warning
Summer Brewfest in Portland is unbelievably annoying. Way too many people crammed into way too small a space (it’s in Waterfront Park), which means hideous lines for small servings of beer, unless you want a full pint, which costs 3 tokens ($6, I think). If you’re coming to Portland to drink you’re better off going to Henry’s (douchey crowd, incredible beer list, ice bar) or to any of the Lompoc bars (awesome crowd, awesome beers).
Nice Guys Finish Third - Hopelessly lost, but makin' good time.
by pdb on Jun 23, 2008 3:13 PM PDT up reply actions 0 recs
yikes
Thanks for the heads up. I have very little experience with Portland, but I’ll write down the name “Lompoc” in order to remember.
Felix Hernandez is just... too... sweeeeeeeeeeeeet!
by Katal LM on Jun 23, 2008 3:16 PM PDT up reply actions 0 recs
The Lompoc bars are awesome.
The one in NW Portland (the New Old Lompoc) has a fantastic outdoor beer garden, and really good food. The Oaks Bottom Public House in Sellwood has lots of oft-rotating local craft brewers (like, every couple weeks they have different ones) in addition to the Lompoc beers (which are always awesome), and the one in N Portland (The 5th Quadrant) is a little slicker/more corporate (it’s a lot newer than the others, in a newly built space) but it is still really good.
Nice Guys Finish Third - Hopelessly lost, but makin' good time.
by pdb on Jun 23, 2008 3:20 PM PDT up reply actions 0 recs
You keep forgetting to mention
Tater tots at Oaks Bottom.
Go Fo Broke!
by eknpdx on Jun 23, 2008 3:42 PM PDT up reply actions 0 recs
Oh no, not just tater tots.
TOTCHOS!!!!! Tater tot nachos.
Nice Guys Finish Third - Hopelessly lost, but makin' good time.
by pdb on Jun 23, 2008 3:48 PM PDT up reply actions 0 recs
Tater tots at any McMenamins!
Yes, I'm a girl. Yes, I know baseball. Yes, I even drink beer.
by NOLAmarinergirl on Jun 24, 2008 7:22 AM PDT up reply actions 0 recs
McMenamins is the Burger King of bars
there’s too many of ‘em, they all serve the same bland food, and the beer’s meh. But I will give them credit for the tots, which are miles better than any McMenamins french fry.
Nice Guys Finish Third - Hopelessly lost, but makin' good time.
by pdb on Jun 24, 2008 10:12 AM PDT up reply actions 0 recs
But the tots!
I take a lot of pleasure in those tots.
Yes, I'm a girl. Yes, I know baseball. Yes, I even drink beer.
by NOLAmarinergirl on Jun 24, 2008 11:55 AM PDT up reply actions 0 recs
Next time you're in town
go to the Oaks Bottom and get the totchos. You will never think about McMenamins again.
Nice Guys Finish Third - Hopelessly lost, but makin' good time.
by pdb on Jun 24, 2008 1:50 PM PDT up reply actions 0 recs
Don't go to Henry's.
You’re better off sticking with the Lompoc, or the Blue Monk in SE Portland.
I reject your reality and substitute my own!
Also, I'm always down for some online Grand Theft Auto IV or Rock Band. Gamertag: Phildopip
by Phildopip on Jun 23, 2008 3:52 PM PDT up reply actions 0 recs
Henry's is good when it's not Friday or Saturday night
the douchiness quotient is low, and the beer selection is quite amazing.
Nice Guys Finish Third - Hopelessly lost, but makin' good time.
by pdb on Jun 23, 2008 9:20 PM PDT up reply actions 0 recs
You're probably right.
I’ve been there twice, and hated it both times…but both times I went there it was a weekend night.
Another bar with a good beer selection and a decent crowd is North 45 near 21st and Glisan.
I reject your reality and substitute my own!
Also, I'm always down for some online Grand Theft Auto IV or Rock Band. Gamertag: Phildopip
by Phildopip on Jun 24, 2008 8:08 AM PDT up reply actions 0 recs
I used to live right by there
and that place is way too trendy for me. We always used to go next door to O’Briens, great burgers but a so-so beer selection. But they have Dead Guy.
Nice Guys Finish Third - Hopelessly lost, but makin' good time.
by pdb on Jun 24, 2008 10:13 AM PDT up reply actions 0 recs
Random saber-?
Has anybody tried to predict K% with swinging strike % and called strike %? Has anybody analyzed called strike %, swinging strike %, etc. as a function of count? Seems like these would be very useful in trying to predict and analyze a pitcher’s performance and how a pitcher goes about his business.
Has HitTracker been used to see which pitchers had lucky or unlucky HR totals?
by Edgar for Pres on Jun 23, 2008 4:21 PM PDT reply actions 0 recs
Has anybody tried to predict K% with swinging strike % and called strike %?
Matthew has
by Graham on Jun 23, 2008 4:56 PM PDT up reply actions 0 recs
I thought he had done some work with it
I couldn’t remember if it worked. On THT?
by Edgar for Pres on Jun 23, 2008 5:08 PM PDT up reply actions 0 recs
No, senior year stat concentration research project.
It’s unpublished because I wanted to refine it and then pitch f/x came out.
by Matthew on Jun 23, 2008 6:00 PM PDT up reply actions 0 recs
You picked a shitty day to post this.
Your= "belonging to you" You're= "You are" (like the song)
by JI on Jun 23, 2008 7:19 PM PDT reply actions 0 recs
It's cute you spell defense with a c.
Your= "belonging to you" You're= "You are" (like the song)
by JI on Jun 23, 2008 7:34 PM PDT up reply actions 0 recs
The one time I did my uncle mocked me.
Your= "belonging to you" You're= "You are" (like the song)
by JI on Jun 23, 2008 7:41 PM PDT up reply actions 0 recs
No surprise to see Webb and Haren, and I guess RJ, but Doug Davis? Weird.
Felix Hernandez hit a Grand Slam off Johan Santana.
by Goose on Jun 23, 2008 7:25 PM PDT reply actions 0 recs
RJ is so underappreciated.
Your= "belonging to you" You're= "You are" (like the song)
by JI on Jun 23, 2008 7:34 PM PDT up reply actions 0 recs
Cool stuff.
I wish I could look at baseball statistics and do something with them. But I’m satisfied just watching you and Matthew do awesome stuff with it. Keep up the good work.
Also, I’m surprised to see Thornton ahead of Papelbon.
I will not let negativity bring me down.
"Do the right thing."
by LantermanC on Jun 23, 2008 8:17 PM PDT reply actions 0 recs
Out of Curiosity...
Have you ever regressed tRA against ERA or FIP?
by PLU Tim on Jun 23, 2008 8:21 PM PDT reply actions 0 recs
It, as well as the below question
is on the agenda for this summer when we’re finally able to work together at the same time that does not involve either me skimping on my day job or me staying up until 7am.
by Matthew on Jun 23, 2008 9:51 PM PDT up reply actions 0 recs
Would you consider this to be useful for predicting future performance?
I understand that as a theoretically neutralizing stat it would have some use in projection, but overall do you see it as something that is going to vary significantly from year to year?
Determined, Jonesing Commentor
by I'm NOT Corco on Jun 23, 2008 9:40 PM PDT reply actions 0 recs
So, with regards to Doug Davis . . .
Dude’s walking more than 5 guys per 9 innings this year, and his LD% is right around 25%, so how does he end up in the top 10? My dumb guess:
a. A solid GB% limiting the damage.
b. Some good fortune when it comes to homers. (0.41 HR/9, which doesn’t seem at all sustainable)
Am I in the ballpark?
by Teej on Jun 23, 2008 11:12 PM PDT reply actions 0 recs
different LD rate.
MLB has him at ~20%, a tick above average,
GB% is actually below average. He has a solid K rate, and gets a boost in the K/BB department from park factors, but overall, yes, it’s all about his low HR/BIA% which no, probably isn’t sustainable at the level he’s at.
by Matthew on Jun 23, 2008 11:58 PM PDT up reply actions 0 recs
Oh, OK, I was looking at Fangraphs.
And as far as GB%, I thought 44% was a bit better than average. But again, I guess we’re looking at two sets of data, so that’s probably it.
Anyway, thanks for the clarification.
by Teej on Jun 24, 2008 12:11 AM PDT up reply actions 0 recs
Also, where are you finding MLB's LD%?
And is their stuff more accurate than Fangraphs right now?
by Teej on Jun 24, 2008 12:14 AM PDT up reply actions 0 recs
MLB doesn't publish it in easy to read form. I get all facts and figures from my database.
Accurate is again subjective. See my comment above re: batted ball types.
by Matthew on Jun 24, 2008 12:55 AM PDT up reply actions 0 recs
Dumb question
But I’m still relatively inexperienced with Sabermetrics so hopefully you won’t hold it against me too much.
Looking at your valuation of tRA+ above, which is pitcher_tRA/league_tRA, wouldn’t this give the top pitchers a
Free Stephen Awesome Strasburg!
by thejew4u on Jul 2, 2008 1:33 PM PDT reply actions 0 recs
Ack, forgot I can't use lcaret
Wouldn’t this give the top pitchers a below 100 tRA+, and the Miguel Batistas of the world a sky-high one? Is that on purpose, or is there some other reason it goes against (what to me is conventional wisdom) of ERA+, OPS+, etc. above 100 meaning above avg?
Free Stephen Awesome Strasburg!
by thejew4u on Jul 2, 2008 1:40 PM PDT up reply actions 0 recs
