Sabermetrics 101: Park Factors

Park effects are one of the easiest concepts to understand, but actually deriving them is one of the most difficult processes I can think of.

Prerequisites for understanding: Environment, win/run conversion

Prerequisites for derivations: Data

Derivation

It's a little strange what effect the playing field can have on the game. We're all aware of the ballpark's impact on home runs, due to different dimensions, elevations, and temperatures. What's harder to wrap one's head around is a stadium lowering strikeout rates, or raising the run value of line drives. There are mechanisms for explaining some of these things physically (pitches break less in drier air, for example, lowering the efficacy of curves and sliders), but some effects are probably psychological - pitchers might be more inclined to pitch up in the strike zone in a park with a deep outfield, raising fly ball rate and strikeouts. No matter the cause, the varied park effects are real (or they're doing such a good job of pretending that we should just run with it). This means we have to deal with them, because otherwise they're skewing our measurements.

How do we remove park effects from our evaluations of players? The first step is to measure them. This is more problematic than one might think, mainly because you have to tease out what's being caused by the park from what's caused by the talents of the home nine, and the latter typically dominates. So we take multi-year samples, do some recursive analysis to strip away some of the bias, and regress heavily, and we're typically left with a blanket park factor which we're fairly confident in. We then apply this equally to all players, which works in one way, but not really so well in another.
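To make the multi-year sample and the regression step concrete, here is a minimal sketch in Python. It assumes the common simplified approach of comparing runs per game (both teams combined) in a club's home games against its road games, then shrinking the result toward a neutral 1.0. The function names, the run totals and the regression weight are all hypothetical, and the recursive bias-stripping step mentioned above is left out entirely.

```python
def raw_park_factor(home_runs, home_games, road_runs, road_games):
    """Ratio of runs per game (both teams) at home to runs per game on the road.

    Using both teams' runs in the same set of games is a simple way to
    cancel out some of the home club's own talent. It is an assumption of
    this sketch, not a full derivation.
    """
    return (home_runs / home_games) / (road_runs / road_games)


def regressed_park_factor(raw, weight):
    """Shrink a raw factor toward the neutral value 1.0.

    `weight` is a hypothetical number between 0 and 1 expressing how much
    we trust the sample; 0 returns 1.0, 1 returns the raw factor unchanged.
    """
    return 1.0 + weight * (raw - 1.0)


# Hypothetical three-season totals for one club (illustrative only).
raw = raw_park_factor(home_runs=2400, home_games=243,
                      road_runs=2100, road_games=243)
blanket_factor = regressed_park_factor(raw, weight=0.6)

print(f"raw factor: {raw:.3f}, regressed factor: {blanket_factor:.3f}")
```

The heavier the regression (the smaller the weight), the closer the blanket factor sits to neutral, which is the price of not fully trusting a noisy sample.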

Applications

There are two reasons to have good park factors when we're looking at players. The first is to gauge their worth in that specific environment, and the second is to gauge their talent level independent of environment (i.e. their expected worth if they move elsewhere). We can use blanket park values - specifically the run factor - to determine which run environment to use for our run-to-win conversion, or we can go the slightly easier route of dividing the run contribution of a player by the run factor of the park to normalise our results. This tells us about a player's value. What it absolutely does not do is tell us about the player's talent.
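The easier route, dividing by the run factor, can be sketched in a few lines of Python. The function name and the numbers are hypothetical, and the sketch assumes the run factor has already been scaled to reflect how many of the player's games were actually played in that park.

```python
def park_adjust(run_contribution, run_factor):
    """Normalise a player's run contribution by the park's blanket run factor."""
    if run_factor <= 0:
        raise ValueError("run_factor must be positive")
    return run_contribution / run_factor


# Hypothetical players, each credited with 90 runs of offence.
hitter_park = park_adjust(90.0, 1.08)   # run-inflating park: adjusted downward
pitcher_park = park_adjust(90.0, 0.94)  # run-suppressing park: adjusted upward

print(f"hitter's park: {hitter_park:.1f}, pitcher's park: {pitcher_park:.1f}")
```

Both players get the same treatment as everyone else in their park, which is exactly why the result speaks to value rather than talent.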

One of the biggest headaches in baseball analysis is that parks play differently depending on both handedness and hitting or pitching style. We've already discussed blanket factors above, and I hope the reader has a grasp on how difficult they are to compute. Imagine complicating an already very difficult analysis by adding handedness and batter type, only to still not be convinced your numbers are perfect. You've just discovered why this sort of analysis hasn't gained much traction yet! Still, it's pretty clear that this is probably a fruitful area of analysis for projecting how players will do in different environments, something we currently have a qualitative handle on while the numbers lag far behind.

Examples

My favourite park factors were derived by The Hardball Times several years ago. Although they're blanket factors, they go into a level of detail that's fairly useful in adjusting hitting and pitching statistics.

What Follows

Park-adjusted statistics.