# Recreating VORP

Baseball statistics are a powerful tool for player analysis, but quite a few conventional ones (W-L, ERA, AVG, etc.) are actually fairly useless for evaluating or projecting a player. One of the more advanced websites around, Baseball Prospectus, uses a batting stat called VORP (Value Over Replacement Player, basically how many runs a major leaguer is worth over some dude in AAA whom you've never heard of and who will play for peanuts). Unfortunately, I only have it in book form, so it's hard to actually use. Hrm.

And so, armed with only Excel and my trusty SQL database covering the past 20 years, I set out on a great undertaking to recreate the stat so that I could actually use it (I needed -something- to do during the final innings of that fiasco in Baltimore on Friday, after all).

Since this is a counting stat, I used plate appearances (PA) as a baseline.

PA: At Bats (AB) + Walks (BB) + Hit by Pitch (HBP) + Sacrifice Fly (SF) + Sacrifice Bunt (SH).

Then I decided to use a Runs Created metric to determine total player value.

RC: [(2.4C + A) x (3C + B) / (9C)] - 0.9C

A: Hits (H) + BB - Caught Stealing (CS) + HBP - Double Plays (GIDP)
B: Total Bases (TB) + (.24 * (BB - Intentional BB + HBP)) + (.62 * Stolen Bases (SB)) + (.5 * (SH + SF)) - (.03 * Strikeouts (K))
C: AB + BB + HBP + SH + SF

A little complicated, perhaps, but it's one of the most accurate models of how many runs each player contributes to his team per year, taking into account basic hitting, baserunning, etc.
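For anyone who would rather see the PA and RC formulas as code, here is a minimal Python sketch of them. The `BattingLine` class and its field names are my own illustrative choices, not anything from the database, and the zero-PA guard is an assumption added to avoid dividing by zero.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """One player's season counting stats (field names are illustrative)."""
    ab: int    # at bats
    h: int     # hits
    tb: int    # total bases
    bb: int    # walks
    ibb: int   # intentional walks
    hbp: int   # hit by pitch
    sf: int    # sacrifice flies
    sh: int    # sacrifice bunts
    sb: int    # stolen bases
    cs: int    # caught stealing
    gidp: int  # grounded into double play
    k: int     # strikeouts


def plate_appearances(s: BattingLine) -> int:
    # PA = AB + BB + HBP + SF + SH
    return s.ab + s.bb + s.hbp + s.sf + s.sh


def runs_created(s: BattingLine) -> float:
    # A, B, and C follow the definitions given in the text.
    a = s.h + s.bb - s.cs + s.hbp - s.gidp
    b = (s.tb
         + 0.24 * (s.bb - s.ibb + s.hbp)
         + 0.62 * s.sb
         + 0.5 * (s.sh + s.sf)
         - 0.03 * s.k)
    c = plate_appearances(s)  # C is the same sum as PA
    if c == 0:
        return 0.0  # assumption: a player with no PA creates no runs
    return ((2.4 * c + a) * (3 * c + b)) / (9 * c) - 0.9 * c
```

Note that C is exactly the PA total, so the same helper serves both purposes.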

So we tabulate these for every non-pitcher for 2005, break them up by position most played (an approximation I hate to make, but I don't have actual play-by-play data; I'd rather have been able to split up an individual player's stats by position too), and stick them in Excel.

Then we total up all the RC for a position and divide it by the total PA for that position. This gets you an average major leaguer at a certain position for a year (unsurprisingly, the more demanding defensive positions, such as C and SS, have a lower RC/PA than the outfield corners and first base). But 'average major leaguer' isn't what we're looking to compare someone against. So I looked at an average AAA player instead. Or pretended to, anyway.
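I did this step in Excel, but the same positional average can be sketched in a few lines of Python. The function name and the `(position, pa, rc)` tuple layout are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def positional_rc_per_pa(players):
    """Return {position: total RC / total PA} for an iterable of
    (position, pa, rc) tuples, one tuple per player-season."""
    totals = defaultdict(lambda: [0.0, 0.0])  # position -> [rc_sum, pa_sum]
    for position, pa, rc in players:
        totals[position][0] += rc
        totals[position][1] += pa
    # Skip positions with no plate appearances to avoid dividing by zero.
    return {pos: rc_sum / pa_sum
            for pos, (rc_sum, pa_sum) in totals.items()
            if pa_sum > 0}
```

Summing RC and PA before dividing weights each player by playing time, which is what "total up all the RC and divide by PA" does; averaging each player's individual RC/PA would give a different number.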

Various studies (I believe this involves another set of advanced metrics which I don't understand, and thus won't explain) show that a random AAA catcher will hit at about 85% ML average if brought up to the big leagues. That number's around 75% for the power positions - 1B, 3B, LF, RF, and 80% for the rest.

So then what we do is take a player's modified RC numbers, divide by PA, then subtract the replacement-level RC/PA for his position (the positional average RC/PA times the percentage above). Once you have that, multiply by PA again to get RCORP, which is Runs Created over Replacement Player. Say it out loud! It's really fun. ARRR-CORE!
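Here is a rough Python sketch of that last step, assuming the replacement-level rate is simply the positional average RC/PA scaled by the percentages quoted above. The names `REPLACEMENT_FACTOR`, `DEFAULT_FACTOR`, and `rcorp` are made up for this example, and the position codes are assumed to be the usual abbreviations.

```python
# Share of major-league-average production expected from a AAA call-up,
# using the percentages quoted in the text.
REPLACEMENT_FACTOR = {"C": 0.85, "1B": 0.75, "3B": 0.75, "LF": 0.75, "RF": 0.75}
DEFAULT_FACTOR = 0.80  # every other position


def rcorp(rc, pa, position, league_rc_per_pa):
    """Runs Created over Replacement Player.

    rc, pa: the player's runs created and plate appearances.
    league_rc_per_pa: {position: average RC/PA} for the season.
    """
    if pa == 0:
        return 0.0  # assumption: no playing time means no value either way
    factor = REPLACEMENT_FACTOR.get(position, DEFAULT_FACTOR)
    replacement_rate = factor * league_rc_per_pa[position]
    # Per-PA surplus over replacement, scaled back up to a counting stat.
    return (rc / pa - replacement_rate) * pa
```

Algebraically this is the same as the player's RC minus what a replacement player would create in the same number of PA, so a shortstop with 50 RC in 500 PA, at a positional average of 0.11 RC/PA, comes out around 6 runs over replacement.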

My numbers for 2005 match VORP to within 5%, despite the rather crude approximations I had to make. I'm pretty pleased with the results.

Note that this will completely ignore defense, but that's ok, because everyone else's batting stats do as well.

