On Pitcher Contact Rate And Strikeouts

If you've ever wondered why I spend so much time talking about swinging strikes (or contact rate, which are basically the same idea), this is why:

[Chart: K% plotted against contact rate]

2005-2008 data, based on 567 pitchers who threw at least 100 innings in a season. As expected, there is a very strong correlation between missing bats and racking up strikeouts. I've looked at this sort of thing before, and it comes as no surprise.

However, the correlation isn't 1 (or -1, as it were), and what's always interested me is how certain pitchers can exceed their expected strikeout rate, while other pitchers undershoot. For example, last year AJ Burnett struck out 24.2% of the batters he faced even though, based on his contact rate, we would've expected him to come in at 22.1%. This isn't just an anomaly. It does appear to be at least somewhat within the pitcher's control.
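
For anyone who wants to play along at home, here is a minimal sketch of one way to get an "expected" K% out of contact rate. It assumes a plain straight-line (least squares) fit, which the post doesn't actually specify, and the numbers are toy values made up for illustration, not the 2005-2008 sample. The function names are hypothetical too.

```python
# Toy numbers for illustration only; NOT the 2005-2008 data from the post.
contact_pct = [75.0, 78.0, 80.0, 82.0, 85.0]  # contact rate, in percent
k_pct = [24.0, 21.0, 18.5, 17.0, 13.5]        # strikeout rate, in percent


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def expected_k(contact, slope, intercept):
    """Expected K% for a given contact rate under the fitted line."""
    return slope * contact + intercept


slope, intercept = fit_line(contact_pct, k_pct)

# K% minus expK%: positive means the pitcher beat his expected strikeout rate.
residuals = [
    k - expected_k(c, slope, intercept) for c, k in zip(contact_pct, k_pct)
]
```

A pitcher's K% minus expK% is just his residual from the fitted line, which is the quantity the rest of the post is about.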

[Chart: K% minus expected K% in one season vs. the same difference in the following season]

This is a chart showing 283 matched pairs of consecutive pitcher seasons with 100+ innings pitched (2005-2008). On the x axis is the difference between K% and expected K% in one year, while on the y axis is the same difference in the year following. What you see is that, though the correlation isn't as strong as in the first chart, it's still very much significant. It's clear that, though swinging strikes are important, they aren't the only factor when it comes to generating strikeouts.
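
The matched-pairs check described above can be sketched roughly like this. The pitchers and values are hypothetical placeholders, the `consecutive_pairs` and `pearson_r` helpers are names invented for the example, and the 100+ innings filter is assumed to have been applied already.

```python
from math import sqrt

# Hypothetical (pitcher, season) -> K% minus expK%, in percentage points.
# Made-up values for illustration; assumes the 100+ IP filter is already applied.
diffs = {
    ("Pitcher A", 2005): 2.1, ("Pitcher A", 2006): 1.4,
    ("Pitcher B", 2006): -1.8, ("Pitcher B", 2007): -0.9,
    ("Pitcher C", 2007): 0.3, ("Pitcher C", 2008): 0.8,
    ("Pitcher D", 2005): -0.5,  # no qualifying 2006 season, so no pair
}


def consecutive_pairs(diffs):
    """Return (year x, year x+1) value pairs for pitchers with both seasons."""
    pairs = []
    for (name, year), value in sorted(diffs.items()):
        following = diffs.get((name, year + 1))
        if following is not None:
            pairs.append((value, following))
    return pairs


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


pairs = consecutive_pairs(diffs)
r = pearson_r([a for a, _ in pairs], [b for _, b in pairs])
```

If the difference were pure noise, `r` would hover around zero across a large sample; a clearly positive value is what suggests the pitcher has some control over it.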

Based on some preliminary investigation, the following factors are correlated to the difference between K% and expK%:

  • First-pitch strike% (positive correlation)
  • Zone% (positive correlation)
  • Fastball% (positive correlation)
  • Fastball velocity (positive correlation)
  • Curveball% (positive correlation)
  • Changeup% (negative correlation)
  • Called strike% (positive correlation)

Called strike%, curveball% and changeup% have the strongest correlations among those listed. That is, pitchers who throw a lot of curveballs or get a lot of called strikes may be able to exceed their expected strikeout rate, while pitchers who throw a lot of changeups may fall short of theirs.

There's a lot more work to be done on this matter, though. Just maybe not by me.

In case you're curious, here are the pitchers who, from 2005 to 2008, showed the biggest differences between K% and expK%.

Top Five

1) Erik Bedard (+4.4%, three-year average)
2) Mike Mussina (+4.3%, four-year average)
3) Josh Beckett (+3.8%, four-year average)
4) Esteban Loaiza (+3.5%, two-year average)
5) Curt Schilling (+3.5%, two-year average)

Bottom Five

1) Runelvys Hernandez (-4.6%, two-year average)
2) Brandon Backe (-4.4%, two-year average)
3) Ramon Ortiz (-3.7%, three-year average)
4) Kelvim Escobar (-3.5%, two-year average)
5) Brian Burres (-3.3%, two-year average)

So far in 2009, the biggest positive differences belong to Tim Lincecum, Justin Verlander, Josh Beckett, Zack Greinke, and Jon Lester, while the biggest negative differences belong to Trevor Cahill, Micah Owings, Ryan Dempster, Francisco Liriano, and Armando Galarraga.

Comments

Yeah, this harkens back to our conversation earlier in the season.

It would be ideal to separate out the power curves like Felix’s that are similar to vertical sliders and seem to get more swinging strikes, as opposed to the loopy curves thrown by Bedard, Mussina, and Beckett.

I’d be curious to see called strike % correlated to curveball horizontal and vertical movement as well as velocity.

by abender20 on Aug 16, 2009 8:23 PM PDT

Out of curiosity...

what is the correlation between K% and ERA, FIP, or tRA?

The reason I ask is that I ran this once and got a negative number. Didn’t seem to make sense.

by PLU Tim on Aug 16, 2009 9:06 PM PDT

Oops...let me rephrase...

I was getting a positive correlation between the two, which seemed counterintuitive. It was a very small constant, but positive nonetheless.

by PLU Tim on Aug 17, 2009 8:22 AM PDT

Yeah, there's no way it would be positive

unless some weird shit was going on with pitchers who’d thrown a couple of innings. Did you use an IP cutoff PLU Tim?

Join the Lookout Landing Premier League Fantasy....League. Use code 1391901-278108.

by marc w on Aug 17, 2009 9:49 AM PDT

I can't recall...

I know that I did use some point of reference. I didn’t allow any schmuck into the sample. I know that the cutoff was lower than Jeff’s and that the sample included relief pitchers. I wonder if the relief pitchers threw the entire sample off, because they are statistically volatile by nature.

by PLU Tim on Aug 17, 2009 10:21 AM PDT

Adding relievers shouldn't matter. I'd just check it again, because there are counterintuitive results

that are interesting and point out something new and exciting, and then there are counterintuitive results that don’t make sense and point out errors.
Especially the FIP thing… I mean, you see the equation for FIP; HOW can that be positively correlated to K%?

by marc w on Aug 17, 2009 10:24 AM PDT

Well..when I did this..

was like 2-5-3 years ago so it was based on ERA. FIP wasn’t terribly “mainstream” as far as advanced metrics go at the time.

If I used FIP the result would likely make sense. ERA has enough noise in it to screw everything up anyways.

by PLU Tim on Aug 17, 2009 11:00 AM PDT

Considering the lowest three year average is -2.1 and the highest is 2.0

Couldn’t we just do a general +/- 2% to the expectancy and call it good?

Fans are typically idiots.

by The Typical Idiot Fan on Aug 16, 2009 9:07 PM PDT

But that makes for a fairly large swing

The average number of batters faced in a season (for qualified starters over 2007-2008) is 833 (per FanGraphs TBF numbers).

So +/- 2% for 833 batters faced works out to +/- 16.66 K/Season.

Your swing there is 33.32 K/Season (twice 16.66), which is certainly not an insignificant amount, and not one you’d want to apply to every pitcher when predicting K rate.

by Robert Lintott on Aug 17, 2009 6:43 AM PDT

Wow.

Any way you can post p-values on your graphs? I always wonder about level of significance.

From one interested fan to another.

br

by sirbrianwilson on Aug 17, 2009 6:41 AM PDT

Significance F for Contact% and K% is 1.4E-172

Significant!

The other chart is from a sheet I have at home.

by Jeff Sullivan on Aug 17, 2009 11:08 AM PDT

Congrats...

Now tell me what you get….

Run tRA against Swinging Strike%, Ground Ball%, and the average Tuesday temperature in Dublin, Ireland.

by PLU Tim on Aug 17, 2009 11:27 AM PDT

Easy...

Tools → Add Ins → Analysis Toolpak

Once that is done..

Tools → Data Analysis → ANOVA: Single Factor

Plug in the cells and go.

by PLU Tim on Aug 17, 2009 11:16 AM PDT

I was just looking at Pujols' numbers to see

what a perfect hitter’s contact% looks like and indeed, he is awesome. One thing surprised me, though: he hits a lot of infield fly balls. I don’t know why this is surprising, but when I think of the type of guys who hit fly balls, I usually think of guys like Jose Lopez, not amazing hitters.

by Edgar for Pres on Aug 16, 2009 11:18 PM PDT

Pujols gets a lot of backspin on his swings

When he misses them, he often pops it up; when he doesn’t….

F*** Billy Beane... actually, I kinda like Holliday

by vivaelpujols on Aug 17, 2009 5:04 AM PDT

I'll use FIP because it's easier

The r value for FIP and K-expK is -0.3948. However, we’d expect a correlation like that, because pitchers with a higher K-expK will generally have a higher K%, which improves their FIP.

by Jeff Sullivan on Aug 17, 2009 9:33 AM PDT

This is fantastic. I'd love to see more graphs like this.

They really help the statistical layman get a sense of the math behind a lot of the conclusions about sustainability and sample size that the LL authors come to.

by Decatur on Aug 17, 2009 9:31 AM PDT

By the way

As one might expect, strikeout rate in year x is a slightly better predictor of strikeout rate in year x+1 than swinging strikes in year x.

by Jeff Sullivan on Aug 17, 2009 9:43 AM PDT

It'll be multivariate

I’d bet that you could figure out y+1 K% with some permutation of y contact% and zone% more accurately than with y K%

by Graham MacAree on Aug 17, 2009 9:45 AM PDT
