Identifying Candidates for Regression
There are two main areas you should look at when judging whether a pitcher is due for regression (either good or bad). First, you look for extreme values in areas that we know are under little or no control of the pitcher. BABIP (average value for SP = .296, RP = .293) is the premier example. If a starting pitcher has a BABIP less than .275 or greater than .315, then you'd do well to bet on overall regression next season. There are other such statistics, pointed out here as the components with low correlation, but the other major one listed is HR/FB% (average value for SP = 13.9%, RP = 12.6%). Because of its direct impact on runs allowed, extreme swings in HR/FB% can radically change a pitcher's performance by traditional metrics. The percentage of runners left on base (average value for SP = 70.4%, RP = 72.8%) is the last such key statistic, but we have to be more careful with that one because it actually does have some year-to-year correlation (r=0.18), tied mainly to a pitcher's strikeout rate.
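To make the first check concrete, here is a minimal Python sketch of the BABIP screen described above. The league averages and the .275/.315 cutoffs come straight from the paragraph; the function names and the wording of the labels are my own and purely illustrative.

```python
# League averages for starting pitchers, taken from the paragraph above.
SP_LEAGUE_AVG = {"babip": 0.296, "hr_fb": 0.139, "lob": 0.704}

# BABIP cutoffs quoted in the text for starters.
BABIP_LOW, BABIP_HIGH = 0.275, 0.315


def babip_flag(babip: float) -> str:
    """Classify a starter's BABIP against the article's cutoffs."""
    if babip < BABIP_LOW:
        return "likely to rise (pitcher was fortunate)"
    if babip > BABIP_HIGH:
        return "likely to fall (pitcher was unfortunate)"
    return "within normal range"


def gap_from_league(stat: str, value: float) -> float:
    """Signed distance from the SP league average for one stat."""
    return value - SP_LEAGUE_AVG[stat]


# Jered Weaver's 2006 BABIP of .237, as cited later in the article.
print(babip_flag(0.237))
print(round(gap_from_league("babip", 0.237), 3))
```

The same `gap_from_league` helper works for HR/FB% and LOB%, though LOB% deserves a softer reading since it carries some real year-to-year signal.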
The second way is much more subtle and admittedly not possible for most people, but it can bring important evidence to the table regarding who might regress. This technique is somewhat the opposite of the above. Instead of looking at the variables that a pitcher has little control over and regressing those to the league mean, we look at the variables that a pitcher has a high degree of control over and inspect the related statistics. This is best illustrated with an example. There's a solidly direct relationship between the number of missed bats a pitcher generates and the number of strikeouts that he gets (r=0.71), and swinging-strike rate is slightly more consistent year over year than strikeout rate (r=.77 versus .75). Granted, it's not much, but what this gives us is a relationship to look at. If a pitcher has many more or fewer strikeouts than his percentage of missed bats would suggest, that provides another hint toward regression. The same holds true for the percentage of pitches that are balls and walk rate.
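Here is a rough Python sketch of that second approach: fit a line from missed bat percentage to strikeout percentage across the league, then see how far one pitcher's actual strikeouts sit from what the line predicts. The sample numbers and the pitcher are invented for illustration, and a plain least-squares line is my assumption about how "expected" strikeouts could be computed, not necessarily how it was done for this post.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def k_rate_surplus(swstr, k_rate, slope, intercept):
    """Actual K rate divided by the K rate the line predicts, minus 1."""
    expected = slope * swstr + intercept
    return k_rate / expected - 1.0


# Hypothetical league sample: (swinging-strike %, strikeout %) per pitcher.
sample = [(6.0, 12.5), (7.0, 14.0), (8.0, 16.0), (9.0, 17.5), (10.0, 20.0)]
slope, intercept = fit_line([s for s, _ in sample], [k for _, k in sample])

# A made-up pitcher with an average whiff rate but a high strikeout rate.
surplus = k_rate_surplus(8.0, 19.0, slope, intercept)
print(f"{surplus:+.1%} versus expected strikeouts")
```

A large positive surplus is the kind of thing that hints at fewer strikeouts next year. Swapping in ball percentage and walk rate gives the equivalent check for walks.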
To ferret out these regression candidates, I looked at each of the five categories mentioned above: Ks, BBs, HR/FB%, BABIP and LOB%. If a pitcher was above or below the expected amount by more than half a standard deviation, I made a note of that figure. Most pitchers end up with a grab bag of assorted categories. For example, Erik Bedard appears four times for 2007, for everything but his BABIP. But while Bedard's regressed values suggest he should do worse in the strikeout and LOB% categories, they also suggest he should do better in the walk and HR/FB% categories, which is pretty much a wash. The pitchers we are looking for are those who over- or underperformed in an overwhelming majority of their flagged categories and who appear in more than a few of them.
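The screening rule above can be sketched in a few lines of Python. Everything numeric here is invented: the per-category standard deviations, the sample pitcher (built to mimic the Bedard-style "wash"), and the cutoffs I use for "more than a few" (three flags) and "overwhelming majority" (75%). Only the half-standard-deviation threshold and the five categories come from the post.

```python
# +1 means a higher value helps the pitcher's results; -1 means it hurts.
# These signs are my reading of the five categories, not the author's code.
HELPS = {"k": +1, "bb": -1, "hr_fb": -1, "babip": -1, "lob": +1}


def regression_flags(actual, expected, spread, threshold=0.5):
    """Flag categories more than `threshold` standard deviations off.

    'worse' means the pitcher beat expectation, so regression should hurt
    him next season; 'better' means the opposite.
    """
    flags = {}
    for cat, sign in HELPS.items():
        z = (actual[cat] - expected[cat]) / spread[cat]
        if abs(z) > threshold:
            flags[cat] = "worse" if z * sign > 0 else "better"
    return flags


def verdict(flags, min_flags=3, min_share=0.75):
    """Summarize the flags; min_flags and min_share are assumed cutoffs."""
    if len(flags) < min_flags:
        return "too few flags"
    worse = sum(1 for v in flags.values() if v == "worse")
    better = len(flags) - worse
    if max(worse, better) / len(flags) < min_share:
        return "mixed"
    return "expect decline" if worse > better else "expect improvement"


# Invented rates for a pitcher flagged in four categories, two each way.
actual = {"k": 0.24, "bb": 0.09, "hr_fb": 0.16, "babip": 0.298, "lob": 0.78}
expected = {"k": 0.20, "bb": 0.07, "hr_fb": 0.139, "babip": 0.296, "lob": 0.704}
spread = {"k": 0.04, "bb": 0.02, "hr_fb": 0.03, "babip": 0.02, "lob": 0.05}

flags = regression_flags(actual, expected, spread)
print(flags)
print(verdict(flags))
```

A pitcher with two flags pointing each way comes out as "mixed", which is the wash described above. The interesting names are the ones where nearly every flag points the same direction.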
PAST EXAMPLES
Jered Weaver, 2006.
-Weaver is perhaps the prototypical example. In 2006, Weaver struck out 23% more batters than you'd expect given his percentage of missed bats, left 22% more men on base than normal, allowed a hit on 20% fewer balls in play than the league and had 21% fewer flyballs turn into home runs than you'd expect. He appeared on four of the measurements, and each one portended a worse 2007. So what happened? In 2007, Weaver saw his missed bat percentage drop to 7.83%, below league average (7.95%), and his K rate fell to 15.97%, barely above league average (15.57%). His BABIP went from .237 to .313. He went from stranding 86.2% of baserunners to 73.6%. His HR/FB% actually dropped, making it the only statistic he appears on for 2007, but again, one that suggests further regression. In 2006, Jered Weaver posted a 2.56 ERA coupled with a 3.99 FIP. In 2007, Weaver's ERA skyrocketed to 3.91 while his FIP remained relatively stable at 4.14.
Kris Benson, 2003.
-Benson had improved strikeout, LOB and BABIP rates to look forward to according to regression. Benson's LOB% actually fell a bit, but the expected rebounds in strikeout and hit rate did occur and more than offset the extra bad luck on runners scoring, just in time to land Benson an absurd-at-the-time contract in late 2004.
Chris Carpenter, 2004.
-Just to show you that these measurements do not preclude a pitcher from improving, Carpenter exceeded expected rates in walks, LOB% and BABIP while underperforming in HR/FB% in 2004. Sounds like a recipe for a step backward in 2005, right? Well, 2005 saw Carpenter drop his FIP from 3.71 to 2.86 and his ERA from 3.46 to 2.83 as he pitched 242 innings and grabbed himself a nifty Cy Young trophy. But guess what? His walk rate rose, his LOB% fell, his BABIP rose and his HR/FB% fell: 4 for 4.
Shawn Chacon, 2005.
-Hit the trifecta in the "triple crown," as I call the three weakly correlated stats: higher than expected LOB%, lower than expected BABIP and HR/FB%. All three regressed in 2006 and he was predictably terrible.
Shawn Estes, 2003 and Casey Fossum, 2004.
-The reverse triple crown, underperforming in the three stats mentioned with Chacon. All three regressed positively for Estes in 2004 and he shaved nearly a run per 9. Ditto Fossum, except he shaved nearly two full runs.
Tom Glavine, 2004.
-Expected more walks, home runs and hits, and fewer stranded runners, in 2005. Swing and miss on three of four. The hits came back in force, but the other three did not change and in fact have been very stable for the past few years. Glavine is a good example of a pitcher who breaks the mold.
Oliver Perez, 2003, 2004 and 2006.
-Perez has more entries than any other pitcher during the time period. Regression analysis predicted Perez's improvement from 2003 to 2004 (though not by that much; see next), his downfall from 2004 to 2005 (too much good luck in '04) and again the improvement in 2007 over 2006 (and it expects him to take a step back in 2008, but not a dramatic one).
Jaret Wright, 2004.
-Yeah, who didn't see that one coming? I mean besides the Yankees.
2008 CANDIDATES
Josh Beckett - Beckett was a full standard deviation above the expected K and LOB rates. It's not to say Beckett isn't great, but, and this should come as no surprise, expect a regression away from his 2007 Cy Young effort.
Joe Blanton - For the second year in a row, Blanton's walk and HR/FB rates are below what you would expect. Blanton may just end up being a pitcher who always walks less than normal and the home run rates can be partially explained by Oakland, but it's worth keeping an eye on.
Lenny DiNardo - Can expect improvement in his K/BB ratio and should strand a few more runners.
Zack Greinke - Look for a worsening K/BB ratio and more runners scoring, but fewer home runs allowed should his groundball ratio stay constant.
Jeremy Guthrie - Too low of a BABIP and too high of a LOB% coupled with a higher than expected K rate spells uh-oh for oh-eight.
Ted Lilly - Same as Guthrie above though he should see some improvement in the walk department that could offset a change in strikeouts.
Scott Olsen - Would expect fewer walks and an improvement in each fluky stat: fewer hits and home runs and more stranded baserunners. Too bad nobody would notice since he plays for Florida.
Jake Peavy - See Beckett. Anytime you have a season like Peavy's, there's a good chance that you had a dose of good luck to go along with immense talent. It's the nature of how good the MLB talent pool is as a whole. It's remarkably unusual for any one player to be head and shoulders above everyone else.
Justin Verlander - Could be in line for fewer Ks and stranded runners and a slight uptick in hits and home runs per flyball, but this could fly out the window as he continues to take steps forward in his talent level each year in the rotation.
Comments
Snooping
Hey, I'm just snooping around, coming from McCovey Chronicles, and this is awesome. I love reading articles where the writer peels back another layer for their analysis. This reminds me of what Marc Normandin does when he writes his Player Profiles.
by marcello on Mar 27, 2008 8:56 PM PDT 0 recs
Minor nitpick
Carp led the Cards to the WS title the season after.
Signatures are for Communists.
by JI on Mar 27, 2008 10:31 PM PDT 0 recs
I don't know why I continually
refuse to acknowledge that 2005 ever happened.
by Matthew on
Mar 27, 2008 10:38 PM PDT
0 recs
the 2005 team kicked the shit out of the 2006 team....
Signatures are for Communists.
by JI on
Mar 27, 2008 10:53 PM PDT
0 recs
Before I became a super-educated fan
I always found this shit perplexing.
Signatures are for Communists.
by JI on Mar 27, 2008 10:34 PM PDT 0 recs
This was helpful, thanks.
I understand regression, but I didn't get what it is you regress.
...and now I'm here
by Librocrat on Mar 27, 2008 10:39 PM PDT 0 recs
Great
Most of the guys on your 2008 list are on one of my fantasy teams or another. I appreciate the education snuck in between beer threads and GTE. Terrific analysis once again Matthew.
by AZSEAfan on Mar 27, 2008 10:43 PM PDT 0 recs
Well they're not all slated to regress backwards
and really, that's always going to be the general principle. Regression to the mean is universal and that means the good players are going to get pulled down and the terrible players pushed up or replaced.
And thanks for the compliment.
by Matthew on
Mar 27, 2008 11:09 PM PDT
0 recs
are some of them slated to regress forwards?
cuz that's be pretty amazing.
Nice Guys Finish Third - Hopelessly lost, but makin' good time.
by pdb on
Mar 28, 2008 8:02 AM PDT
0 recs
or, that'd be pretty amazing
Nice Guys Finish Third - Hopelessly lost, but makin' good time.
by pdb on
Mar 28, 2008 8:02 AM PDT
0 recs
well since whenever we talk about "regression"
we actually should be saying "regression to the mean", yes, some are slated to regress forwards.
Stupid distinction I think and even I'm not consistent with it since I use progression sometimes.
by Matthew on
Mar 28, 2008 8:41 AM PDT
0 recs
Perchance...
... did you discover any Mariner pitchers on the list this year?
by Doc Baseball on Mar 28, 2008 1:56 AM PDT 0 recs
On that note
I know it's kinda stupid and probably not too cared about in general, but year-to-year scatterplots for all these components would be interesting, since you can have low correlation overall and still have a small group of pitchers with high correlation.
Any sleeper candidates we should be watching for major improvement this year? How are the M's gonna do? Washburn finally hitting the wall?
by Edgar for Pres on
Mar 28, 2008 5:39 AM PDT
0 recs
I'll be getting around to scatterplots at some point
mainly because I love graphs.
The 2008 candidates include everyone I found who looked like a good shot for a change in traditional performance (ERA). I'll go back and see if I can tease out anything more specific regarding sleepers.
No Ms made a prominent appearance in terms of a majority of pluses or minuses. Felix was the closest with just an expectation of lowered BABIP and HR/FB%, but we know the story there.
by Matthew on
Mar 28, 2008 8:44 AM PDT
0 recs
didn't look at RP
RP are a completely different beast, as I tried to illustrate above by showing the differing league averages. Plus, the smaller sample sizes inherent in relieving increase the variance even more.
All in all, it's just a mathematical way of saying what we all know, but seemingly few GMs do: relief performance is crazy unpredictable.
by Matthew on
Mar 28, 2008 9:39 AM PDT
0 recs
Greetings from Over the Monster
Good stuff!
I'd be interested to find out who is likely to excel this year. Most of the "regressions" seem to be negative, save Scott Olsen (injured last year) and Verlander (mixed bag).
"You know you're having a bad day when the fifth inning rolls around and they drag the warning track." - Mike Flanagan, Baltimore Orioles pitcher, 1992.
by SoxDevil on Mar 28, 2008 9:58 AM PDT 0 recs
This isn't a method to identify breakout candidates; that's another post
That being said, here's a list of pitchers who are in line to improve a little on their 2007s based on the above five statistics. Keep in mind these are very weak candidates. If any of them were strong bets for improvement, I would have listed them in the main post.
Matt Albers
Scott Baker
Matt Belisle
Jeremy Bonderman
Jose Contreras
Zach Duke
Adam Eaton
Felix Hernandez
Kei Igawa
Edwin Jackson
Cliff Lee
Paul Maholm
Horacio Ramirez
Jason Simontacchi
Andy Sonnanstine
Robinson Tejeda
Josh Towers
Kip Wells
by Matthew on
Mar 28, 2008 10:59 AM PDT
0 recs
The presence of HoRam in that list
doesn't fill me with confidence.
by Llewdor on
Mar 28, 2008 11:07 AM PDT
0 recs
Not coincidentally...
...that's the inspirational slogan painted on the players' tunnel at the new Wembley.
Nice Guys Finish Third - Hopelessly lost, but makin' good time.
by pdb on
Mar 28, 2008 11:11 AM PDT
0 recs
True enough.
BTW, why did Beckham have "100th cap" written in gold under his England badge when nobody else had it? Does the poor boy need that much ego-stroking?
Nice Guys Finish Third - Hopelessly lost, but makin' good time.
by pdb on
Mar 28, 2008 11:21 AM PDT
0 recs
The 100th cap is a pretty big deal
His shoes blow the shirt away for ostentatiousness though, apparently. BBC commentary wouldn't shut up about them, apart from to tell us how poor we looked.
by Graham on
Mar 28, 2008 11:23 AM PDT
0 recs
I guess it's kind of a big deal
but then I'm not that obsessed with nice round numbers in my sporting fandom.
My problem is that nobody else has ever had "100th cap" on their shirt - I hope that if it starts with Becks, it will continue. If not it's just another arrow in my "Reasons Why David Beckham Is Vastly Overrated As A Player" quiver.
Nice Guys Finish Third - Hopelessly lost, but makin' good time.
by pdb on
Mar 28, 2008 11:38 AM PDT
0 recs
Well...
England has had a grand total of 4 centurions pre-Beckham. I think the most recent was Peter Shilton. Unsurprising that the shirt thing hadn't been done before (although Shilton was made captain for a game).
by Graham on
Mar 28, 2008 11:44 AM PDT
0 recs
I didn't realize it was that few
my snark is thus retracted. And I also didn't realize Shilton was the most recent.
Nice Guys Finish Third - Hopelessly lost, but makin' good time.
by pdb on
Mar 28, 2008 11:46 AM PDT
0 recs
This is where you could make money Matthew
Publish a list of "Matthew's Locks" to improve on their previous year. Fantasy players would eat it up. I kept looking at Edwin Jackson and his sexy strike out rate in the late rounds but ultimately passed on drafting him.
Some of these guys, like HoRam just sucked so hard that they're bound to be better this year (assuming they find a job) but that isn't as interesting as seeing who got "unlucky" last year. Who seems to have good missed bat percentages (stuff) and good called ball percentages (control) but suffered from a crappy BABIP or HR/FB ratio. That is, who will break out from being a mediocre performer into a potential fantasy star? Predict that consistently, and the universe will be yours.
And BTW, great article.
by johnbai on
Mar 28, 2008 11:26 AM PDT
0 recs
Improvement
"Prince" Felix (sorry, Seattle, he ain't a king yet) should be primed for a very good season if he's due for a rebound. Much like Verlander, I expect his skill to increase as well.
Zach Duke is apparently having a good spring. It'll be interesting to see if he returns to 2005 form (I think that was his good year).
Sonnanstine is another intriguing guy.
"You know you're having a bad day when the fifth inning rolls around and they drag the warning track." - Mike Flanagan, Baltimore Orioles pitcher, 1992.
by SoxDevil on
Mar 28, 2008 2:01 PM PDT
0 recs
Sonnanstine was the pitcher most hurt by his defense
His ERA was horrible last year, but that had almost nothing to do with him.
by Graham on
Mar 28, 2008 2:03 PM PDT
0 recs
he looks like he ate spring training
Nice Guys Finish Third - Hopelessly lost, but makin' good time.
by pdb on
Mar 28, 2008 2:05 PM PDT
0 recs
I'm not sure you get to do that
Nice Guys Finish Third - Hopelessly lost, but makin' good time.
by pdb on
Mar 28, 2008 2:04 PM PDT
0 recs
Great analysis, Matthew
Would have liked to have this at my side for fantasy drafts (though it's nice to know I didn't wind up with any land mines).
Also, love the "what a nerd" tag. I hope that sticks around.
"People ask me what I do in winter when there's no baseball. I'll tell you what I do. I stare out the window and wait for spring." ~Rogers Hornsby
by thejew4u on Mar 28, 2008 11:54 AM PDT 0 recs
Oh yeah
If you get around to it, I'd love to see a similar article for batters. It'd be nice to have a comprehensive list of their baseline statistics for regression and who to keep an eye on this year.
"People ask me what I do in winter when there's no baseball. I'll tell you what I do. I stare out the window and wait for spring." ~Rogers Hornsby
by thejew4u on
Mar 28, 2008 11:57 AM PDT
0 recs
Here's a candidate for regression in 2008:
The Seattle Mariners
by Gomez on Mar 28, 2008 6:19 PM PDT 0 recs













