tRAr and wOBAr

I have been working on these two stats for a while now, along with overhauling StatCorner. While the overhaul has stalled a bit because of a personal time crunch, the new stats are ready, and I'm finally at the point where I want to at least get them out into the wild. Much more detailed explanations of both are forthcoming.

tRAr is basically what tRA* was intended to be. It takes each of the inputs to tRA (GB, FB, LD, IF, Bt, K, BB, HBP, HR) and regresses them first toward the pitcher's recent (max three year) historical rates and then factors in the league average if that 3-year sample is too small.

wOBAr is geared toward the same goal, but it builds more on PrOPS than on wOBA itself. PrOPS took a player's batted ball rates and applied the league average hit rates on each type to come up with a modified OPS for the hitter, a way to spot check whether someone was getting unlucky on his batted balls. wOBAr takes that basic concept and, like tRAr, uses the player's own recent past as the baseline, filling in league rates only if needed.

In other words, say Player X hit 100 fly balls in 2009, and 300 over 2007-9 combined. Over those 300, 15 have gone for singles, 30 for doubles, 0 for a triple and five for a home run. The league average would say that 17 of those would go for a single, 24 for a double, three for a triple and three for a home run. Based on how meaningful a sample of 300 is, a mix of the player and league rates is used to come up with regressed rates for 1B, 2B, 3B and HRs for those 100 fly balls in 2009. Repeat across all batted ball types and then you can add up all the singles, all the doubles, etc and end up with the inputs to wOBA.
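To make that arithmetic concrete, here is a minimal Python sketch of the blending step. The exact weighting StatCorner uses is not given in this post, so the `n / (n + k)` shrinkage form and the constant `k = 300` are assumptions chosen purely for illustration; only the fly ball counts come from the example above.

```python
# Hypothetical sketch: blend a player's own hit rates on fly balls with
# league rates. The weight n / (n + k) and k = 300 are assumptions,
# not StatCorner's actual formula.

def regressed_rates(player_counts, league_counts, n, k):
    """Return per-fly-ball outcome rates shrunk toward league average."""
    w = n / (n + k)  # share of the weight given to the player's own rates
    return {
        outcome: w * player_counts[outcome] / n
        + (1 - w) * league_counts[outcome] / n
        for outcome in player_counts
    }

# Player X, 2007-9 combined: 300 fly balls and what they turned into.
player = {"1B": 15, "2B": 30, "3B": 0, "HR": 5}
# What league average rates would expect from those same 300 fly balls.
league = {"1B": 17, "2B": 24, "3B": 3, "HR": 3}

rates = regressed_rates(player, league, n=300, k=300)

# Apply the regressed rates to the 100 fly balls hit in 2009.
expected_2009 = {outcome: 100 * rate for outcome, rate in rates.items()}
print(expected_2009)
```

With these assumed settings the player and league rates get equal weight; a larger `k` would pull the result further toward the league line. Repeating this for each batted ball type and summing the outcomes would give the wOBA inputs.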

Both of these are available on the redesigned (and not fully done redesigning, so suggestions welcome, but holier-than-thou critiques are not) player pages on StatCorner, such as RRS and Ichiro! Both of these might have some coding bugs in them, so I would not yet go around treating them as gospel. The point is to get the idea out there.

Important note:

Because I currently only have three years of data for this to run against, the earlier years (2007, even 2008) of tRAr and wOBAr are not going to be great because they end up getting heavily regressed.

For instance, estimating Erik Bedard’s 2007, the formula wants to look at his 2004-7 data in order to figure out his recent rates. But I don’t have 2004-6 data, so it ends up only using 2007, which means his sample size is small, which means he (like everyone else) gets heavily regressed toward league average.

I’m still working on getting past retrosheet years into my new DB schema. Once I do that, then those earlier years will be more meaningful.


Comments

Wow

These are amazing. Thank you Matthew. tRAr is exactly what I have been clamoring for in a DIPS stat. One question… are the individual components in tRAr (and wOBAr) regressed at different magnitudes (like in tRA*)?

by vivaelpujols on Mar 8, 2010 6:16 PM PST reply actions  

Not precisely, no.

Each input for wOBA has its own sample size requirement.
tRAr is purely done on an overall BF basis.

Regressions for y2y correlations will come in tRAp and wOBAp, the (hopefully) more predictive forms of these.

by Matthew on Mar 8, 2010 6:22 PM PST up reply actions  

I think you might be better off doing split pair correlations rather than year to year

If you can get a 50% split pair confidence level you know exactly how many plate appearances of league average performance to add to a stat line for regression purposes.

by Graham MacAree on Mar 8, 2010 6:24 PM PST up reply actions   1 recs
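The regression Graham describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the strikeout example, and the ballast of 150 plate appearances are all hypothetical, standing in for whatever sample size a split pair study would find gives a 0.5 correlation.

```python
# Hypothetical sketch of regression by adding league-average "ballast".
# If split-half samples of a stat correlate at 0.5 when each half holds
# `ballast` plate appearances, then adding `ballast` PA of league-average
# performance to the observed line regresses it by the matching amount.

def regress_with_ballast(successes, trials, league_rate, ballast):
    """Blend an observed rate with `ballast` trials at the league rate."""
    return (successes + league_rate * ballast) / (trials + ballast)

# Made-up inputs: 60 strikeouts in 200 PA, league K rate of 18%,
# and an assumed ballast of 150 PA.
estimate = regress_with_ballast(60, 200, league_rate=0.18, ballast=150)
print(round(estimate, 4))
```

When the observed sample is exactly as large as the ballast, the estimate lands halfway between the observed and league rates, which is the 50% point Graham refers to.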

And preferably the correlation to ERA in Year N + 1?

I know that you don’t like that as a measure of predictability, but I think it’s valid.

by vivaelpujols on Mar 8, 2010 6:27 PM PST up reply actions  

I'd prefer correlation to tRA in n+1 as a measure of predictability

tRA isolates performance better than ERA does, so why not try to predict that?

by Graham MacAree on Mar 8, 2010 6:29 PM PST up reply actions  

This is a really good point that came up at Royals Review,

I think you were the one who brought it up, Graham

Maybe it’s been done somewhere else, but I’d love to see one or more metric-inventors or conceivers such as yourself, Graham, or whomever else, talk about what exactly we want from an “advanced” pitching metric like FIP, tRA, SIERA, EqRA, tFIP, or whatever. I think I just got lost in all the different “tests.”

Exactly why does “prediction” of the current or future year need to be a test of a pitching metric, and what is supposed to be predicted? I think I know some of the answers that some people would give, but I’m curious as to how much weight it has. If ERA (or RA or whatever) is problematic in its conceptual foundations as most of us think, why do we care about predicting it at all?

Again, I’m not saying there aren’t any answers, even good ones, but sometimes when following these debates, I lose the forest for the trees.

Anyway, sorry if this is irrelevant here, it just came to mind and I wanted to type it out before I forgot. Commence mockery now.

I'm not a sabermetrician, but I do play one at FanGraphs.

Can't get enough of me? Check out my Twitter feed.

by Matt Klaassen on Mar 8, 2010 6:36 PM PST up reply actions  

You weren't addressing me, but I have a pretty clear opinion on this matter, so I'll speak up anyway ;)

The problem with ERA or RA is that it contains a ton of information that is out of the pitcher’s control. Voros obviously discovered that pitchers have little control over their Linear Weights on BIP, so he (and Tango with FIP) only used the three true outcomes to value pitchers. tRA is pretty similar to that; however, it also breaks down BIP into 4 different categories.

My problem with those metrics (and SIERA and LIPS and the rest) is the assumptions that they make. FIP gives each pitcher 100% credit for his strikeouts, 100% credit for his walks, 100% credit for his home runs and 0% credit for his Linear Weights on BIP. tRA is similar in that it gives each pitcher 100% control for the K, BB, HR and his batted ball rates, but 0% control over the outcomes of those rates.

In reality, pitchers don’t have 100% control over their strikeouts, walks, home runs, and batted ball rates (as batters, umpires, ballparks, etc. all have an effect on that). And they don’t have 0% control over what happens when the ball is put in play – just very little. So if you are trying to estimate skill, then each stat should be regressed somewhere between 0% and 100% based on the amount of skill involved (which is what tRA* and tRAp do). And really, the true measure of how well a pitching stat measures skill is to correlate it to ERA in year n + 1.

If you are trying to measure value, then I think the current DIPS estimators are fine – with the exception of the fact that they perhaps regress LW on BIP too much, as well as timing.

by vivaelpujols on Mar 8, 2010 6:49 PM PST up reply actions  

I was hoping a number of people would jump in

So I know which bandwagon to jump on

by Matt Klaassen on Mar 8, 2010 6:50 PM PST up reply actions  

Would you use wins?

The only difference between wins and ERA is that wins contain pitching, offence, and defence, while ERA ignores offence.

by Graham MacAree on Mar 8, 2010 6:53 PM PST up reply actions  

No, because wins don't add anything that ERA doesn't have

Offense is completely irrelevant to the performance of the pitcher. So is defense, but ERA isn’t just DIPS + defense, it’s DIPS + other things that are pitcher-influenced but hard to quantify + defense.

by vivaelpujols on Mar 8, 2010 6:55 PM PST up reply actions  

Either way, your information is screwed up

You have to choose between defensive contamination or losing things like clutch pitching and more precise ball in play information.

Considering that pitchers generally play in front of the same team, I have to think that the former is a lot more onerous than the latter.

by Graham MacAree on Mar 8, 2010 7:02 PM PST up reply actions  

I agree

Like I said, as a measure of value, the current DIPS stats are pretty close to the top of our capabilities with boxscore stats. The only real things left are dealing with correlated skills (like GB% affecting the value of a walk) and timing.

As a measure of skill, however, I’m somewhat unsatisfied (and I recognize that tRA isn’t necessarily trying to predict skill), and I’d like to see some more precise methods for the regression used – and that’s what tRA* and tRAp do I think.

by vivaelpujols on Mar 8, 2010 7:04 PM PST up reply actions  

So basically, there are two types of things that you are going for in a DIPS stat

One is skill, and to test that you should look at ERA in n+1.

The other is value, and I’m not really sure of a way to test that.

Incidentally, this is why I hated BPro’s testing of SIERA. Because they aren’t including stuff like LD% or HR’s, they are going for a skill stat and not a value stat. So comparing the correlation in N+1 to tRA is unfair to tRA because the two are measuring two different things.

But SIERA also gives 100% credit to a pitcher’s strikeouts and walks and GB% and whatnot, so it’s not a true skill metric either. It’s just some kind of bastardized DIPS, with the only redeeming quality being the interrelating events. That is also why I dislike xFIP, but at least studes just calls it a quick and dirty metric and isn’t parroting it around like the best thing since sliced bread.

by vivaelpujols on Mar 8, 2010 7:10 PM PST up reply actions  

Enh

I think the test of a ‘skill’ stat is to predict value. If tRA is a better value estimator than ERA, use that as your baseline.

by Graham MacAree on Mar 8, 2010 7:13 PM PST up reply actions  

Well fair enough

But at any rate, you are either trying to predict the future (skill) or trying to explain the past (value).

You can test the “skill” stats by comparing it against ERA (or tRA) in year N+1, but how do you test the value stats (“how many runs would this pitcher have given up given neutral defensive support”)?

by vivaelpujols on Mar 8, 2010 7:20 PM PST up reply actions  

Ah, the correlated skills thing.

Frankly, that’s a load of complete and utter tosh.

Why would ground ball percentage affect the value of walks? Clearly, it does, but only indirectly. What’s happening is groundballs mean more baserunners (and presumably a change in average baserunner advancement), and that change means that the value of a walk changes. Looking at it any other way is pretty nonsensical. Just use Base Runs.

by Graham MacAree on Mar 8, 2010 7:11 PM PST up reply actions  

BsR, of course, isn't perfect

But it has the benefit of handling the situation logically and robustly.

by Graham MacAree on Mar 8, 2010 7:14 PM PST up reply actions  

Not really

When you allow a lot of ground balls, it means that walks aren’t as harmful on average because the baserunners are more likely to be doubled up. I think that is a pretty real effect (although I’m not sure how large).

by vivaelpujols on Mar 8, 2010 7:21 PM PST up reply actions  

Right, but it's not the walks that are having a direct effect

It’s that they’re changing the game environment. Skipping that step means you’re missing the forest for the trees. It makes sense to bear these effects in mind – it makes absolutely no sense to include them directly. It’s essentially cheating.

Why not linearise baserun weights to incorporate this?

by Graham MacAree on Mar 8, 2010 7:25 PM PST up reply actions  

In essence, I hate it when people say the skills are correlated

It misleads the reader from what’s actually happening. I know you know what’s going on, but the shorthand (and the way some people have converted that shorthand into an actual pitching statistic) is really grating, and probably unfair to everyone else reading.

by Graham MacAree on Mar 8, 2010 7:28 PM PST up reply actions  

I don't know enough about BsR to know how well this would work

I do think that SIERA’s way of handling the run environment affecting events (see, that’s just a lot less pretty than “correlating”) isn’t particularly good, and would like to see something a bit more… logical than just regression.

by vivaelpujols on Mar 8, 2010 7:34 PM PST up reply actions  

Well because the correlation of tRAr to tRA in n+1

shows how well it predicts future tRA. But tRA is not a perfect model of pitcher-isolated ERA, it’s just an estimate (although a good one!). It takes out a bit of pitcher skill, so you are not really measuring how well it predicts future performance. ERA contains all of the pitcher skill + other variables. Assuming that those variables aren’t biased against one metric (say you were comparing the correlation to ERA in n+1 of tRAr and SIERA), it’s a better measure of the stat’s predictability in my opinion. Although I can see the obvious points of contention.

by vivaelpujols on Mar 8, 2010 6:36 PM PST up reply actions  

Well some guys like to test it against team runs scored

Which is kind of stupid in my opinion, because then you get nonsensical calibrated formulas like extrapolated runs.

I think the best test is against runs scored in a particular inning, like Colin did here:

http://www.hardballtimes.com/main/article/the-great-run-estimator-shootout-part-2/

That’s basically like testing it against batter ERA.

by vivaelpujols on Mar 8, 2010 6:53 PM PST up reply actions  

I could, but they're mostly meaningless at this point until I get more years of data in.

As pointed out: http://www.lookoutlanding.com/2010/3/8/1362878/trar-and-wobar#32207629

I can say that, even with too little past data, 2008 tRAr predicts 2009 tRA and 2009 RA just as well as 2008 tRA does. (Both of which predict RA in Y+1 vastly better than RA in Y does, for what it’s worth.) I expect 2009 tRAr to outpredict 2010 tRA and 2010 RA.

wOBAr is a little shakier, being less predictive of Y+1 wOBA at the moment, but again, I really need a few more years of data to be able to say anything of value.

by Matthew on Mar 8, 2010 7:17 PM PST up reply actions  

And I was dumb, comparing to straight wOBA instead of park adjusted wOBA (since wOBAr is park-neutralized)

2008 wOBA* to 2009 wOBA* has a RMSE of 40 points
2008 wOBAr to 2009 wOBA* has a RMSE of 38 points

2008 RA to 2009 tRA has a RMSE of 1.50
2008 tRA to 2009 tRA has a RMSE of 1.29
2008 tRAr to 2009 tRA has a RMSE of 1.08

2008 RA to 2009 RA has a RMSE of 1.71
2008 tRA to 2009 RA has a RMSE of 1.54
2008 tRAr to 2009 RA has a RMSE of 1.34

by Matthew on Mar 8, 2010 7:31 PM PST up reply actions  
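For anyone wanting to run the same kind of comparison, RMSE here is the square root of the average squared gap between the prior-year stat and the following-year result across all pitchers (or hitters) in the sample. A minimal Python sketch follows; the three-pitcher numbers are made up for illustration and are not from StatCorner.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between paired predictions and outcomes."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up example: prior-year estimator vs. next-year RA for three pitchers.
estimator_2008 = [4.0, 5.0, 3.0]
ra_2009 = [4.5, 4.0, 3.5]
print(round(rmse(estimator_2008, ra_2009), 3))
```

A lower RMSE means the year-one stat sat closer, on average, to what happened in year two. For wOBA, an RMSE of "40 points" corresponds to 0.040 on the wOBA scale.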

Looks like tRAr is a solid improvement in terms of measuring skill

Can you plug in the formula for SIERA as well?

6.262 – 18.055*(SO/PA) + 11.292*(BB/PA) – 1.721*((GB-FB-PU)/PA) +10.169*((SO/PA)^2) – 7.069*(((GB-FB-PU)/PA)^2) + 9.561*(SO/PA)*((GB-FB-PU)/PA) – 4.027*(BB/PA)*((GB-FB-PU)/PA)

by vivaelpujols on Mar 8, 2010 7:38 PM PST up reply actions  
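For anyone who would rather not fight with SQL, the formula as posted above transcribes directly into a short Python function. This is a sketch of the formula exactly as written in this comment, not a checked implementation of BPro's published SIERA; the function and argument names are my own, and the sample stat line is invented.

```python
def siera(so, bb, gb, fb, pu, pa):
    """SIERA as transcribed from the formula posted in this thread."""
    k_rate = so / pa                 # strikeouts per plate appearance
    bb_rate = bb / pa                # walks per plate appearance
    net_gb = (gb - fb - pu) / pa     # net ground ball rate
    return (
        6.262
        - 18.055 * k_rate
        + 11.292 * bb_rate
        - 1.721 * net_gb
        + 10.169 * k_rate ** 2
        - 7.069 * net_gb ** 2
        + 9.561 * k_rate * net_gb
        - 4.027 * bb_rate * net_gb
    )

# Invented stat line: 200 SO, 80 BB, 300 GB, 200 FB, 20 PU in 1000 PA.
print(round(siera(200, 80, 300, 200, 20, 1000), 3))
```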

Here's the full version

SIERA = 6.262 – 18.055*(SO/PA) + 11.292*(BB/PA) – 1.721*((GB-FB-PU)/PA) +10.169*((SO/PA)^2) – 7.069*(((GB-FB-PU)/PA)^2) + 9.561*(SO/PA)/PA) – 4.027(BB/PA)*((GB-FB-PU)/PA)

by OlSalty on Mar 8, 2010 7:44 PM PST up reply actions  

SBN strips out some of the figures when you post formulas for some reason

That’s why I added the pre tag. If you highlight the formula in my post, you can copy the entire formula even if the rest doesn’t appear on the screen.

by vivaelpujols on Mar 8, 2010 8:00 PM PST up reply actions  

Okay, got it to work by copying more than that line.

And I spent 10 minutes figuring out the SQL query for it and got back nonsense. Fuck you, SIERA

by Matthew on Mar 8, 2010 8:09 PM PST up reply actions  

Okay, I just did it in Excel instead

2008 SIERA to 2009 RA has a RMSE of 1.36

Data set was same as used above, pitchers with at least 100 outs recorded in both 2008 and 2009 in the same league.

by Matthew on Mar 8, 2010 8:29 PM PST up reply actions  

So about as good as tRAr

I think that tRAr will be a lot better when you regress individual components differently though.

by vivaelpujols on Mar 8, 2010 8:37 PM PST up reply actions  

Looks cool so far

I’m not sure what a “holier than thou” critique would be, but if I think of one, I’ll be sure and pass it along

by Matt Klaassen on Mar 8, 2010 6:38 PM PST reply actions  

Very cool

Can’t wait to dig into it and see what cool stuff pops out. Might just be a marginal improvement on what we already have for most players but I think it might be pretty cool.

by Edgar for Pres on Mar 8, 2010 6:40 PM PST reply actions  

Absolutely awesome

Out of curiosity, how are you handling the minors?

by Jeff Sullivan on Mar 8, 2010 6:54 PM PST reply actions  

Each player is contained within the league

so 2009 wOBAr for a player in AAA looks at 2006-9 stats from AAA only.

I spent about 15 seconds thinking about doing league translations in order to make all past data valid and then decided that was a stupid use of my time.

by Matthew on Mar 8, 2010 7:11 PM PST up reply actions  

Salaciously

De Gutibus non disputandum est

by Bearskin Rugburn on Mar 8, 2010 8:54 PM PST up reply actions   4 recs

Important note:

Because I currently only have three years of data for this to run against, the earlier years (2007, even 2008) of tRAr and wOBAr are not going to be great because they end up getting heavily regressed.

For instance, estimating Erik Bedard’s 2007, the formula wants to look at his 2004-7 data in order to figure out his recent rates. But I don’t have 2004-6 data, so it ends up only using 2007, which means his sample size is small, which means he (like everyone else) gets heavily regressed toward league average.

I’m still working on getting past retrosheet years into my new DB schema. Once I do that, then those earlier years will be more meaningful.

by Matthew on Mar 8, 2010 7:10 PM PST reply actions   1 recs

Branyan had a huge HR/FB ratio (25% compared to league avg of 11%)

but without a big enough sample size, that gets regressed a ton (to ~14.5%) and that’s your difference. He loses 14 home runs because of it.

by Matthew on Mar 8, 2010 8:02 PM PST up reply actions  
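As a rough back-of-the-envelope check on those Branyan numbers, the rounded figures quoted above imply how much weight the regression left on his own rate. The Python sketch below assumes the 25% is the rate being regressed and that the quoted percentages are exact, neither of which is confirmed here, so treat the outputs as approximate.

```python
# Rounded figures quoted in the comment above.
observed_hr_fb = 0.25    # Branyan's HR/FB
league_hr_fb = 0.11      # league average HR/FB
regressed_hr_fb = 0.145  # his rate after regression (approximate)
home_runs_lost = 14

# Share of the weight left on the player's own rate after regression.
weight_on_player = (regressed_hr_fb - league_hr_fb) / (observed_hr_fb - league_hr_fb)

# Fly ball total implied by losing 14 home runs to the lower rate.
implied_fly_balls = home_runs_lost / (observed_hr_fb - regressed_hr_fb)

print(round(weight_on_player, 2), round(implied_fly_balls))
```

Under those assumptions only about a quarter of the weight stays on his own HR/FB rate, which is why the drop in home runs is so large.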

Matthew, you can download an SQL dump of retrosheet for the 2000 decade here

http://www.wantlinux.net/2009/04/retrosheet-baseball-mysql-database-download/

I’m not sure if that schema has everything you need, but it should be a hell of a lot easier than building up the database from scratch.

by vivaelpujols on Mar 8, 2010 7:32 PM PST up reply actions  

It will not.

I need the data formatted in a very particular way in order to run these formulas.

by Matthew on Mar 8, 2010 7:36 PM PST up reply actions  

So let me see if I understand how this works

I am looking at Clifford Lee’s page. His tRAr in 2008 is much higher than his tRA in 2008 because it is incorporating his stats from 2007, when he was bad. In 2009, his tRAr stays low because it now also uses his stats from 2008, when he became really good. Is this correct?

by Dewey N on Mar 8, 2010 7:41 PM PST up reply actions  

Site Design Suggestions

I hope I’m not out of line putting these here, but I’m not sure how else to send them.

I like the new comprehensive layout of player pages as opposed to the tabbed approach of the previous design. However, I would like to have each broad category with a heading like fangraphs: there’s a “plate discipline” section that’s clearly labeled, a “traditional” section that’s labeled, and so forth. About 60% of the time, when I head to a stat page it’s because there’s a specific question I want answered (e.g. “Does his wOBAr match his wOBA? And if not, what should I expect in the future?”), so labeling the stats page will make my using them easier. Others might be the same.

I realize that programming the text-recognizing drop down in the search bar is probably a fucking pain in the ass, so I’ll understand if you’re like “Sorry, that’s staying the way it is.” It’s not like I can’t use it the way you’ve made it. But anyway, if I type “Felix” I get guys with the last name “Felix”. If I type “Hernandez” I get guys with the last name “Hernandez” and there’s a whole shit ton of them. “Felix Hernandez” and “Hernandez, Felix” get me no results at all. It would be nice if “Hernandez, F” gave me all the Hernandezes with a first name beginning with “F”. It’s not a huge deal, obviously, but it would be sweet to have the drop down work that way from an “oh my god what a sweetly super-usable website” point of view.

That’s all I can think of right now. It’s always been a well-designed site overall. Thanks.

by philosofool on Mar 8, 2010 9:35 PM PST reply actions  

Thinking about it now

How is this fundamentally different from a projection system?

by vivaelpujols on Mar 8, 2010 10:25 PM PST reply actions  

Okay, let me just get your opinion on what the stat is trying to measure

Are you trying to isolate skill in past performance, or eliminate the impact of fielders? tRA doesn’t regress anything besides what happens after the ball is put in play, and it uses just one year of data, so it is decidedly a value metric.

tRAr regresses each component to the pitcher’s estimated mean by using multiple years of data. So I gather that it is trying to estimate skill in past performance. But by then, it’s basically a projection system (with somewhat arbitrary regression-to-the-mean components) without the aging factor.

I’m not indicting the stat – I’m curious as to what you would say the point of it is.

by vivaelpujols on Mar 8, 2010 10:42 PM PST up reply actions  

I urge you to exercise caution

It’s only a matter of time before these statistics become self-aware and decide that we humans must be eliminated.

by Fett42 on Mar 9, 2010 5:23 AM PST reply actions   4 recs

I for one welcome our post-apocalyptic mathematical overlords

And look forward to battling them with guerrilla warfare tactics!

Mariners/D Broncos/BSU Broncos fan in Seattle

by appleshampoo on Mar 9, 2010 5:19 PM PST up reply actions  

This is fabulous.

It seems like every year brings one very cool new statistic; this must be it for 2010. I look forward to spending hours of my time at StatCorner over the coming days, checking out various players’ tRAr/wOBAr. Thank you, Matthew.

Might wOBAr be used to create a revised WAR?

A suggestion for redesigning StatCorner: I used to really like the banner that featured Puget Sound, a ferry, and the Seattle skyline. You should bring it back.

by katal on Mar 9, 2010 8:17 AM PST reply actions  

So let me make sure I understand this on a layman's level

well, a layman who appreciates linear weights, anyhow.

tRAr and wOBAr aren’t meant to be “Well this is a look at how good his 2009 REALLY was” but rather “Based on what we’ve seen the past 3 years, this is about what we expect him to do in 2010”

I know I’m trying to oversimplify things, but I write for a simple audience.

Purple Row: Take this personally
http://www.youtube.com/user/rockiesmagicnumber
Learn about Batting Metrics
Learn about Pitching Metrics

by Andrew Martin on Mar 9, 2010 8:23 AM PST reply actions  

Not really.

I would need to add projection components (adjustments for age, and a further regression to league for example) for the goal to really be about accurate projection. The goal is more closely related to “Based on what we’ve seen in the player’s recent history, this is about the level of skill he displayed in ”

A side effect of that might be better projections, since players’ skills do not tend to fluctuate wildly year to year.

by Matthew on Mar 9, 2010 10:19 AM PST up reply actions  

Yeah, I think that these are bordering on the thin line between projections and retrospective measures of skill

I think the only difference really is that you specifically aren’t trying to predict future RA – although it’s mostly semantics.

by vivaelpujols on Mar 9, 2010 10:55 AM PST up reply actions  

But I think the main difference is in the order you do it

Establish the mean first, then regress past years’ stats to it, rather than use past years’ stats and regression to the mean to predict the current mean.

by vivaelpujols on Mar 9, 2010 10:58 AM PST up reply actions  

ok, so use it as a "skill" metric

I’ll keep that in mind. Probably too complex to present to Purple Row, but it’s good to know these things.

I guess I’m sort of confused what you were presenting the 2008→2009 RMSD for then.

by Andrew Martin on Mar 9, 2010 12:54 PM PST up reply actions  

gotcha.

Thanks for clarifying.

by Andrew Martin on Mar 9, 2010 1:07 PM PST up reply actions  

Matthew great work.

Since you’re doing work on StatCorner, may I request that you set it up so we can have a side-by-side function? I’m not sure how hard that is, but it would help in comparing one player vs. another.

Racer X. You have to love those amarillo hops.

p.s. fuck you angels

by InSpokane on Mar 9, 2010 10:51 AM PST reply actions  

Can we get a top ten leaderboard for each?

Also maybe players with the biggest differences. I always think these sort of things are cool.

by Edgar for Pres on Mar 9, 2010 1:06 PM PST reply actions  
