Last time I examined some of the issues with OPS and why, although it's unsurpassed as a quick and dirty measure of offense, there is room for improvement if you want to allow more complexity. GPA is one such method, adding a bit of complexity in exchange for a bit more accuracy, but it is only a stepping stone between the beautiful simplicity of OPS and the more accurate but mind-bogglingly complex models out there. Now let us move to the other end of the spectrum, to the most accurate model we can find.

*Important Note: Most of the explanation for BaseRuns comes from Brandon Heipp's site, here. I have only tried to re-word some of the stuff in the hopes of providing the simplest explanation possible that covers what's useful to know. All credit for originality belongs to Smyth and Heipp.*

Many people here are at least passingly familiar with the term linear weights, but may not be aware of what it actually means. Linear weights is, in a nutshell, a linear regression equation. There are certain inputs to the system (hits, home runs, etc.) and each input is assigned its own importance (or weight) toward overall run scoring. With all apologies to Tim McCarver's ignorance, as far as run scoring goes, a walk is not, in fact, as good as a homerun. A homerun contributes more runs and so it deserves more importance in a run estimating model.

Linear weights formulas apply a static run value to each event. To borrow Heipp's example, many linear weights systems say that a single walk is worth about 0.33 runs. Over the course of a season, this is a good estimate of the value, in runs, of a walk. The problem is that in a game in which a team draws one walk and makes 27 outs, the walk will not have the same value. Most systems apply a value of about -0.1 runs for every out, so the system's prediction for this game would be about -2.4 runs (0.33 - 2.7 = -2.37). This is clearly incorrect, since a team cannot score fewer than zero runs.
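To make the arithmetic concrete, here is a minimal Python sketch of that one-walk game. The two weights are the rough values quoted above; the function name and the decision to ignore every other event are my own simplifications, not part of any published linear weights system.

```python
# Rough linear weights quoted in the text (illustrative, not an official set).
WALK_VALUE = 0.33   # runs credited per walk
OUT_VALUE = -0.10   # runs debited per out


def linear_weights_runs(walks, outs):
    """Estimate runs using only the walk and out weights (hypothetical helper)."""
    return walks * WALK_VALUE + outs * OUT_VALUE


# One walk and 27 outs: the linear estimate goes negative.
estimate = linear_weights_runs(walks=1, outs=27)
print(round(estimate, 2))  # -2.37
```

The estimate drops below zero because every out subtracts a fixed amount no matter how few runners are on base to strand.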

This is an example of the shortcomings of most linear weights systems: they were designed to work within a certain range. Granted, the range they were designed for corresponds to the range in which most major league teams perform, but baseball has shifted dramatically in just the last 25 years, and while it's nice to have an accurate measure of major league performance, we might also want a system that can be applied to minor league games, which can have vastly different run environments.

BaseRuns (BsR) is not a linear weighting system, but rather a multiplicative weighting system. It was developed by David Smyth about 15 years ago and has undergone some minor tweaks since its inception, but the theory behind it is sound. Every plate appearance ends in one of three ways: the batter is out, the batter reaches base (note: fielder's choices fall into this category) or he hits a homerun. If the batter does reach base, there are three more potential outcomes: he will score, he will make an out (here's where the fielder's choice out comes in) or he will be stranded. With that established, we can write an equation for the number of runs scored as follows:

Runs Scored = (#Baserunners * % of baserunners that score) + Homeruns

Now, David Smyth broke up his model into four parts: A, B, C and D. A is just the number of baserunners. B is what's called the "advance factor" and describes how important certain events are toward advancing runners. C is the number of outs and D is the number of homeruns. In equation form:

A = H + BB + HBP - HR - .5*IBB

B = [1.4*TB -.6*H -3*HR +.1*(BB+HBP-IBB) +.9*(SB-CS-GDP)] * 1.1

C = AB - H + CS + GDP

D = HR

BaseRuns = A*(B/(B + C)) + D

*Note: IBB are cut in half because while normal walks come in randomly distributed situations, intentional walks are usually issued only in favorable situations, that is, where the run value of adding that batter is much lower than normal.*
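For anyone who wants to play with the numbers, here is a minimal Python sketch of the four components exactly as written above. The function name, argument names and the zero-denominator guard are my own choices for illustration; the coefficients are the ones listed in this post.

```python
def base_runs(ab, h, tb, hr, bb=0, hbp=0, ibb=0, sb=0, cs=0, gdp=0):
    """BaseRuns estimate from the A, B, C, D components given in the text."""
    a = h + bb + hbp - hr - 0.5 * ibb            # baserunners
    b = (1.4 * tb - 0.6 * h - 3 * hr
         + 0.1 * (bb + hbp - ibb)
         + 0.9 * (sb - cs - gdp)) * 1.1          # advance factor
    c = ab - h + cs + gdp                        # outs
    d = hr                                       # homeruns
    if b + c == 0:
        # Nothing happened at all; guard against 0/0 (my assumption).
        return float(d)
    return a * (b / (b + c)) + d


# The one-walk, 27-out game from earlier: 27 AB, no hits, one walk.
print(round(base_runs(ab=27, h=0, tb=0, hr=0, bb=1), 3))  # 0.004
```

Where the linear estimate for that game came out around -2.4 runs, this version gives a tiny positive number, which is about what you would expect from one stranded walk.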

The benefit of BaseRuns (BsR) over other models is that the integrity of the model holds up even in extreme environments. Again borrowing from Heipp, note that when a solo homerun is hit, BsR is alone among models in correctly predicting that a single run is scored. Most linear weights systems will predict 1.4 runs from the same situation. This is because most of those models are built to survive only in the major leagues, while BsR has the unique ability to adapt to any league, even little league!
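The solo homerun case is easy to check by hand or in a few lines of Python. This sketch plugs a single plate appearance (1 AB, 1 H, 4 TB, 1 HR) into the components above; the 1.4 homerun weight on the linear side is the figure quoted in this post, not a value from any specific system.

```python
# One plate appearance: a solo homerun and nothing else.
ab, h, tb, hr = 1, 1, 4, 1

a = h - hr                                # runners on base besides the batter: 0
b = (1.4 * tb - 0.6 * h - 3 * hr) * 1.1   # advance factor, about 2.2
c = ab - h                                # outs: 0
d = hr                                    # homeruns: 1

bsr = a * (b / (b + c)) + d               # no runners to advance, so only D counts
lw = 1.4 * hr                             # typical linear weight for a homerun

print(bsr)  # 1.0
print(lw)   # 1.4
```

With no one on base, A is zero, so the advance factor has nothing to multiply and the estimate collapses to exactly the one run the homerun itself produced.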

The great thing is that this adaptability does not come at a price when modeling the actual major leagues, which is what we care about the most. In comparisons between various models and actual run scoring, BsR is routinely among the models with the highest correlation and the lowest error rates.

To finish it off, here's a chart of 2007 totals for BaseRuns and the Pythag records based off them.

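| Division | Team | Pyth W/L | RS-RA |
|---|---|---|---|
| AL East | Boston | 102-60 | 900-681 |
| AL East | Yankees | 97-65 | 965-791 |
| AL East | Toronto | 88-74 | 757-689 |
| AL East | Orioles | 76-86 | 772-824 |
| AL East | Tampa Bay | 71-91 | 818-924 |
| AL Central | Cleveland | 92-70 | 823-717 |
| AL Central | Detroit | 88-74 | 881-807 |
| AL Central | Minnesota | 76-86 | 709-754 |
| AL Central | White Sox | 70-92 | 701-809 |
| AL Central | KC Royals | 67-95 | 677-816 |
| AL West | Oakland | 85-77 | 777-735 |
| AL West | Anaheim | 85-77 | 781-745 |
| AL West | Seattle | 78-84 | 783-814 |
| AL West | Texas | 75-87 | 779-837 |
| NL East | NY Mets | 88-74 | 817-748 |
| NL East | Atlanta | 88-74 | 800-732 |
| NL East | Phillies | 87-75 | 906-839 |
| NL East | Marlins | 75-87 | 821-889 |
| NL East | Nationals | 68-94 | 672-797 |
| NL Central | Chi Cubs | 86-76 | 759-707 |
| NL Central | Milwaukee | 85-77 | 807-770 |
| NL Central | Cincinnati | 76-86 | 789-846 |
| NL Central | St. Louis | 73-89 | 711-789 |
| NL Central | Houston | 70-92 | 732-846 |
| NL Central | Pittsburgh | 68-94 | 711-838 |
| NL West | Colorado | 91-71 | 848-749 |
| NL West | San Diego | 87-75 | 723-663 |
| NL West | LA Dodgers | 85-77 | 741-698 |
| NL West | Arizona | 77-85 | 709-750 |
| NL West | SF Giants | 74-88 | 666-733 |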
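If you want to turn a runs scored and runs allowed pair into a Pythagorean record yourself, a sketch along these lines will do it. I am assuming the classic exponent of 2 and a 162-game season here purely for illustration; other exponents are in common use and the chart may have been built with a different one, so the output need not match it exactly. The function name is hypothetical.

```python
def pythag_record(rs, ra, games=162, exponent=2.0):
    """Pythagorean W-L from runs scored and allowed (exponent is an assumption)."""
    win_pct = rs ** exponent / (rs ** exponent + ra ** exponent)
    wins = round(win_pct * games)
    return wins, games - wins


# Boston's BaseRuns totals from the chart.
wins, losses = pythag_record(900, 681)
print(f"{wins}-{losses}")
```

Passing a different `exponent` is the easiest way to see how sensitive the projected record is to that one choice.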