Sabermetrics 101: Pitching
We're into the stretch run now. I'm not going to go into individual statistics - the idea was never to walk through absolutely everything but rather to provide a solid foundation that facilitates good, logical thinking about sabermetrics. So instead of talking about strikeouts, wins, tRA, xFIP, whatever over the next few days, I'll describe how I think pitching/batting/defence should be evaluated - but in general. We'll start with pitching.
Prerequisites for Understanding: The Isolation Problem, Linear Weights, Base Runs, Replacement Level, Expected Wins/Losses, The Run-Win Conversion, Value, Regression, Correlation, Park Effects, Environment, WPA and LI, Data.
Evaluating Pitching
What makes a good pitcher? What makes a bad one? How do we evaluate them? Pitching is a deceptive area of study - our first generation numbers told us that we knew how many games pitchers were responsible for winning, and how many runs they gave up. For a very long time, we were content with this.
And then, quite suddenly, we weren't. What are wins, we ask? And what exactly does ERA tell you? Well, wins tell you how often your position players score more runs over the course of a game than the pitcher and the position players allow. ERA is similarly bizarre when you think about it: how many runs do a pitcher and his defence concede per nine innings, discounting runs that the scorers think ought not to have counted?
This would, of course, be all well and good if the impact of defence were negligible, or if pitchers had any real control over whether batted balls find gloves or not. But defence matters. It can make average pitchers look like world beaters, and replacement level pitchers look alarmingly valuable. Whenever the ball enters the field of play, the defence is involved, and understanding that and seeking to adjust for it is absolutely critical.
So, what do we need in order to measure pitchers? Ideally, we'd be able to judge them without the defence clouding the issue. We certainly shouldn't involve bats, so wins and losses are out. We must build upon the things that the pitcher is solely responsible for: strikeouts (mostly), walks, HBP, and home runs. After these are taken into account, we can start looking at batted balls - third generation data gives us some insight into how difficult a ball is to field, although it's not as accurate as we'd like. Anyway, my belief is that if a pitcher gives up a drive that's an out 90% of the time with an average defence, we should give him 0.9 of an out. Credit for the quality of the defence should probably go to the actual defenders rather than the pitcher. Actually doing this is non-trivial, but it's the direction in which we should be steering our statistics.
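To make the fractional-out idea concrete, here is a minimal sketch. The function name and the out probabilities are made up for illustration; real probabilities would have to come from batted ball data, and estimating them well is the non-trivial part.

```python
def expected_outs(strikeouts, out_probabilities):
    """Credit a pitcher with his strikeouts plus a fractional out for each
    ball in play, based on how often an average defence converts it."""
    return strikeouts + sum(out_probabilities)


# Hypothetical outing: 5 strikeouts and four balls in play. Each number is
# an assumed chance that an average defence turns that ball into an out.
balls_in_play = [0.9, 0.9, 0.25, 0.6]
credited_outs = expected_outs(strikeouts=5, out_probabilities=balls_in_play)
print(round(credited_outs, 2))
```

Whether the defence actually caught any of those four balls makes no difference to the pitcher's credit here, which is the whole point.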
We then need to turn this information into runs and outs. Outs are essentially trivial with good enough defensive data, but the run conversion can come from linear weights, Base Runs (this is, of course, more accurate than linear weights), or any other run expectancy tool you can think of. With expected runs and outs, you can figure out how many runs you'd expect a pitcher to give up per nine innings... to a point: we've neglected 'situational pitching'. Personally, I think this is an acceptable oversight, but it may well be that it can make a significant difference to our evaluation of pitchers. Certainly, it's something that will be fairly important to look into down the line.
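As a rough sketch of the linear weights route: the run values below are illustrative placeholders, not a published set of weights, and the function names are my own. The weights are assumed to be zero-centred (an average pitcher sums to zero), so a league-average baseline is added back before scaling to 27 outs. Treat it as a simplification rather than a finished method.

```python
# Placeholder run values per event, assumed to be centred on league average.
ILLUSTRATIVE_WEIGHTS = {
    "BB": 0.30, "HBP": 0.33, "1B": 0.47, "2B": 0.77,
    "3B": 1.04, "HR": 1.40, "OUT": -0.27,
}


def runs_above_average(expected_events, weights=ILLUSTRATIVE_WEIGHTS):
    """Sum of run value times expected count (counts may be fractional)."""
    return sum(weights[event] * count for event, count in expected_events.items())


def expected_ra9(expected_events, league_ra9, weights=ILLUSTRATIVE_WEIGHTS):
    """Expected runs allowed per nine innings (27 outs)."""
    outs = expected_events["OUT"]
    baseline_runs = league_ra9 / 27 * outs  # what an average pitcher allows
    expected_runs = baseline_runs + runs_above_average(expected_events, weights)
    return 27 * expected_runs / outs


# Hypothetical expected event counts over 27 expected outs.
line = {"BB": 3, "HBP": 0, "1B": 6.5, "2B": 1.5, "3B": 0, "HR": 1, "OUT": 27}
print(round(expected_ra9(line, league_ra9=4.5), 2))
```

Swapping in Base Runs would only change how the expected runs are computed; the conversion to a per-nine figure stays the same.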
So. Expected runs per nine. Combining this with some function of batters faced gives you a certain number of runs above average, which eventually leads to wins (and don't forget to park/league adjust!). But many prefer a quite elegant shortcut: if we know the expected runs allowed per nine and the league average figure, we can use the Pythagorean formula to derive expected winning percentage. This is the number that WAR for pitchers is based on, and ultimately what we want to know. Getting to that point is just a matter of refining the method and using better data: our general theory is laid out pretty cleanly.
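The shortcut can be sketched in a few lines. The assumptions here are mine: the pitcher's team scores at the league average rate, and the exponent is the classic 2 (other exponents are in common use). The function name is hypothetical.

```python
def pythagorean_win_pct(league_ra9, pitcher_ra9, exponent=2.0):
    """Expected winning percentage for a pitcher backed by a league-average
    offence, using the Pythagorean formula with an assumed exponent."""
    scored = league_ra9 ** exponent
    allowed = pitcher_ra9 ** exponent
    return scored / (scored + allowed)


# Hypothetical pitcher expected to allow 3.72 runs per nine in a 4.50 league.
print(round(pythagorean_win_pct(league_ra9=4.5, pitcher_ra9=3.72), 3))
```

A pitcher who matches the league figure comes out at exactly .500, and anything lower than the league figure pushes him above it.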
We should be careful to regress our numbers pretty severely when dealing with pitchers, as some of their outcomes are highly luck-dependent (notably home runs per fly ball, and to a lesser extent some ball in play classifications). As with everything we look at, remember that the data at hand never tells the whole story. Regression is the name of the game, and we want to apply it mercilessly when non-correlative statistics come into play. But we should also remember that there is real value in measuring what a pitcher actually has done.
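One common way to regress a rate is to blend it with the league average, weighting the observed rate by sample size. A minimal sketch follows; the regression weight of 300 fly balls is a made-up number for illustration, not an estimate of how noisy home runs per fly ball really is.

```python
def regress_to_mean(observed_rate, sample_size, league_rate, regression_weight):
    """Blend an observed rate with the league rate. The larger the
    regression weight relative to the sample, the harder we regress."""
    numerator = observed_rate * sample_size + league_rate * regression_weight
    return numerator / (sample_size + regression_weight)


# Hypothetical pitcher: 6% HR/FB over 150 fly balls in a 10% league.
regressed = regress_to_mean(
    observed_rate=0.06, sample_size=150,
    league_rate=0.10, regression_weight=300,  # assumed, for illustration
)
print(round(regressed, 4))
```

With no sample at all the estimate is simply the league rate, and it drifts towards the observed rate as the sample grows.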
Things to Remember
- The transition between the rotation and the bullpen sees bullpen pitchers give up more walks, but fewer home runs and strikeouts. This results in replacement level being higher in the bullpen.
- The National League doesn't see the designated hitter; the AL does. Keep this in mind when making league adjustments.
- Therefore, National League pitchers have some value tied up in hitting. Including a pitcher's batting ability can make a big difference in their valuation.
- An interesting technique for measuring bullpen effectiveness is to consider the average leverage of each 'role' in the pen. This has a multiplicative effect on the value of each player - a closer might see his LI at 2.00, for example, meaning that every run saved above average is really counting as double. We can use LI here because it is what is defining bullpen roles in the first place (admittedly, not that well, thanks to the saves rule).
- Remember to include 'scouting-style' information when you think about player value. Stuff, command... these are extremely important things to consider. Don't ever use one number as a crutch - that way lies dogma.
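The leverage point above boils down to a single multiplication, sketched here. The closer's LI of 2.00 is the example from the list; the other role values and the function name are hypothetical.

```python
# Average leverage by bullpen role. Only the closer figure comes from the
# example above; the other two are invented for illustration.
ROLE_LI = {"closer": 2.00, "setup": 1.50, "long relief": 0.75}


def leverage_adjusted_runs(runs_saved_above_average, average_li):
    """Scale runs saved above average by the average leverage of the role."""
    return runs_saved_above_average * average_li


for role, li in ROLE_LI.items():
    # The same 5 runs saved are worth more in a higher-leverage role.
    print(role, leverage_adjusted_runs(5.0, li))
```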
Comments
Would you really say pitchers are responsible for how many home runs they give up?
I think it might be clearer to say they are responsible for how many fly balls they give up, right?
No it definitely would not
Just because something is not particularly correlative doesn’t mean it’s not their responsibility. Pitchers give up home runs and pretending like they’re all fly balls is crazy.
by Graham MacAree on Mar 3, 2010 2:30 PM PST
I have never fully understood why FIP is considered accurate/relevant
I don’t know if such a player actually exists, but consider a hypothetical pitcher who doesn’t give up many home runs, has decent strikeout and walk rates, but for whatever reason gives up a ton of balls in play (and therefore hits). Questions I have:
1. This hypothetical player will have much better FIP than he should, since at the end of the day he’s still giving up hits and runs. Do people realize and/or care about this?
2. Is this even a realistic scenario? I see how if a pitcher has good K and BB rates then it’s unlikely that he’s giving up a ton of hits, but it seems like there are bound to be some exceptions. I’m thinking maybe sinkerballer type guys who rely a lot on defense and balls in play to get outs.
3. By only focusing on 3 PA outcomes (K, BB, HR), FIP really ignores a very large chunk of what goes on during a baseball game. Why then do people rely on FIP as a relevant measure of a pitcher’s ability?
Couple of things, if I may
First, as you mentioned, if a pitcher isn’t giving up home runs and is striking out a lot of players, he’s not going to have many balls put in play, which is good. Second, the point of FIP is in its name: Fielding Independent Pitching. So you could have a guy who gives up a ton of fly balls that don’t go for home runs, but has the Mariners’ outfield defense, so he doesn’t give up a lot of hits on those fly balls. Then he’s traded and has Adam Dunn in the outfield. He pitches at the same level that he used to, but now he’s giving up huge amounts of hits that sail over the head of the concrete statue positioned in left.
All FIP tries to do is isolate the outcomes that a pitcher has control over, and not punish him for having a lousy defense or reward him for an elite one. It has more predictive power than ERA.
by controlled_slide on Mar 4, 2010 8:24 AM PST
Good explanation
I’ll add that FIP actually has the leaguewide run value of the average batted ball implicit in its formula. Essentially, what FIP is doing is taking the average ERA and tweaking it upwards or downwards in response to certain events – balls in play are already embedded into that average ERA.
by Graham MacAree on Mar 4, 2010 8:27 AM PST