Swinging Strikes
Before I begin, let me state that I am not a mathematician, nor do I always completely understand some of the statistics on this website. I do the best I can, but I'm not sure I'd have the ability to test this theory if I wanted to back up my hypothesis.
Graham and Matthew's tRA is a model based around the idea that a ball in play can only have 5 outcomes (LD, GB, OFB, IFB, HR). After that, the result is beyond the pitcher's control. Therefore, a pitcher who posts consistent rates across these five outcomes will, in a park-neutral environment with neutral defense, post a consistent BABIP.
However, another key point that they raise is the regression towards the mean of the batted ball results. To quote the tRA explained page (the tRA Primer): "The order is extremely important, as influencing GB% will have an effect on LD% later, and so on, sometimes causing regression away from the mean in unusual situations."
Part of my understanding of this (again, I could be wrong) is that a pitcher's batted ball results will tend to logically distribute around his tendencies. For example, a groundball pitcher who suddenly posts a high IFB rate, without seeing a raised OFB or HR rate, is due for regression towards his most frequent outcome, a GB. Therefore, the IFB rate is unsustainable, and it is likely his OFB and HR rates will rise at the expense of the IFB rate. (Again, I could be misinterpreting, but this seems logical to me.)
So my question is: is this also true of swinging strikes, and if so, how could it be tested?
For example, we can assume that when a batter swings and misses, there are four possible reasons for the missed contact.
1. The batter swung over the ball (I'll call this mO, for Missed Over)
2. The batter swung under the ball (mU, for Missed Under)
3. The batter swung too early (mE, for Missed Early)
4. The batter swung too late (mL, for Missed Late)
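If labels like these were ever recorded, tallying them into a distribution would be the easy part. Here is a minimal Python sketch. The `MissType` enum and the `miss_distribution` helper are hypothetical names of my own, and the ten sample labels are invented purely for illustration, not real data.

```python
from collections import Counter
from enum import Enum


class MissType(Enum):
    """The four hypothetical swing-and-miss categories from the list above."""
    OVER = "mO"
    UNDER = "mU"
    EARLY = "mE"
    LATE = "mL"


def miss_distribution(misses):
    """Return each miss type's share of all swinging strikes (shares sum to 1)."""
    counts = Counter(misses)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one swinging strike")
    # Counter returns 0 for categories that never occurred.
    return {m: counts[m] / total for m in MissType}


# Invented labels for ten swinging strikes, NOT real data.
sample = [MissType.OVER] * 4 + [MissType.UNDER] * 4 + [MissType.EARLY, MissType.LATE]
dist = miss_distribution(sample)
for miss_type, share in dist.items():
    print(f"{miss_type.value}: {share:.0%}")
```

The hard part, of course, is deciding which label each swing gets in the first place.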
Just for the sake of my argument, I'm going to use JJ Putz as an example. I do not have the PITCHf/x data, but in 2006, JJ Putz posted a 15.0% swinging strike rate (SwStr%) across all his pitches. Let's assume (feel free to find the data) that JJ's splitter also got a 15% SwStr%. My guess (complete guess) is that when batters swing and miss against JJ's splitter, it is because the bottom falls out of the pitch and they swing too high (mO). Logic would then say that, looking at the swinging strikes against JJ's splitter in depth, an overwhelming percentage would be mO.
Now, the reason I raise this point is simple. Let's take Jarrod Washburn for a moment and look at his SwStr%. Washburn has (SSS applies) posted a 7.3% SwStr% so far in 2009, up from 6% in 2008. A 7.3% rate would be a career high for Washburn if he were to continue it. However, if we had data available to analyze his career swinging strike breakdown, we could evaluate whether this success is sustainable or due for a regression to the mean.
Hypothetically (Read: I'm pulling these numbers out of my ass), let's say Washburn has a swinging strike distribution of 40% mO, 40% mU, 10% mE and 10% mL. Now let's assume that so far in 2009, the distribution has shifted to 35% mO, 45% mU, 10% mE and 10% mL. If this is true, then, as the tRA model suggests, these changes must be reflected somewhere else.
More batters swinging under Washburn's pitches, and fewer batters missing over them, would suggest that batters are getting underneath more pitches, and this would logically be reflected throughout his batted ball distribution (more FBs, fewer GBs, etc.). If everything else were to remain constant, including, for example, the percentage of TOTAL pitches that were mO (with a higher SwStr%, mO makes up a smaller share of the swinging strikes), it would be logical to conclude that the extra missed bats classified as mU are unsustainable and due for a regression towards the mean. Otherwise the entire batted ball distribution would shift.
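To make the "percentage of TOTAL pitches" point concrete, here is a rough Python sketch that multiplies each miss type's share of swinging strikes by the overall swinging strike rate. Every number is one of the made-up ones from this post. Pairing the 40/40/10/10 split with the 6% rate from 2008 is an extra assumption of mine, and `per_pitch_rates` is a hypothetical helper name.

```python
def per_pitch_rates(swstr_rate, shares):
    """Convert miss-type shares of swinging strikes into shares of all pitches."""
    return {label: swstr_rate * share for label, share in shares.items()}


# ASSUMPTION: the hypothetical 40/40/10/10 split is paired with the 6% rate,
# and the hypothetical 35/45/10/10 split with the 7.3% rate. None of the
# splits are real data.
baseline = per_pitch_rates(0.06, {"mO": 0.40, "mU": 0.40, "mE": 0.10, "mL": 0.10})
current = per_pitch_rates(0.073, {"mO": 0.35, "mU": 0.45, "mE": 0.10, "mL": 0.10})

for label in baseline:
    change = current[label] - baseline[label]
    print(f"{label}: {baseline[label]:.2%} -> {current[label]:.2%} ({change:+.2%} of pitches)")
```

Under these invented numbers, mO goes from 2.4% of all pitches to roughly 2.6%, while mU goes from 2.4% to roughly 3.3%, so most of the extra swinging strikes come from mU.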
This example is a gross oversimplification, but I believe the logic is sound. To the best of my knowledge, there is no system in place to further classify swinging strikes, and my guess would be that doing so would be difficult. The classifications, for example, would be arbitrary at best.
For example, say a batter is fooled on a changeup, gets out on his front foot, and, in an attempt to slow down his swing, misses over the pitch. If the swing was slowed enough to be properly timed, the strict definition would call this an mO, when the real reason for the swinging strike was that the batter swung early (mE) and swung over the pitch as he slowed his swing to compensate. How do you determine what to call this scenario?
To me it seems logical that if a pitcher who misses most bats by having batters swing too high suddenly pitches a 90-pitch game with 15 swinging strikes, but 10 of those strikes were mU, then we can safely assume the game was an SSS anomaly, and not the result of "________ finally figuring it out!!!"
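One rough way to put a number on how odd that game would be is a binomial tail probability: if each swinging strike were independently an mU with the pitcher's usual mU share, how often would 10 or more of 15 come up mU? A minimal Python sketch follows. The 30% career mU share is a made-up assumption, and `binomial_tail` is a hypothetical helper name.

```python
from math import comb


def binomial_tail(n, k, p):
    """Probability of k or more successes in n independent trials, each with chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# ASSUMPTION: a made-up career mU share of 30% for a pitcher who mostly
# gets misses over the ball. Not real data.
career_mu_share = 0.30
prob = binomial_tail(n=15, k=10, p=career_mu_share)
print(f"P(at least 10 of 15 misses are mU) = {prob:.4f}")
```

With that assumed share the probability comes out well under 1%. That only says the game's split is unusual relative to the assumed career share. It does not say whether the extra mU misses are luck or a real change, and it treats every miss as independent, which is a simplification.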
This entire fanpost is just idle-minded speculation, but I do think it warrants some discussion. This is my first statistically based FP here, so hopefully it was interesting. Looking forward to some discussion, however harsh :)
Cheers
BQueezy
Comments
It's an interesting idea but I have no idea how you'd get the data
Or what to do with the data if we could hypothetically get it. Do we try to determine if some swinging strikes are more predictive than others? I guess if we could do that we could then look at which pitches generate which type of strikes…
by Graham on Apr 24, 2009 6:08 PM PDT
I don't have any idea where we'd get the data either
Which is why I said this was simple speculation and not something I knew how to test.
As for which swinging strikes are more predictive, my guess would be that it would be better to analyze any data on a pitcher-by-pitcher basis. The important part of this analysis would be to take a guy who has increased his SwStr% and determine whether the success is repeatable or just luck.
Generating the data would be incredibly difficult, but as a concept, I think my argument would work in theory.
Formerly Mariners124M... Username was sorta bland, so I'm changin it up
by BQueezy on Apr 24, 2009 6:23 PM PDT
It seems to me it would not help measure luck. A slider should generate misses over the top,
and the SwStr% on that pitch should be as good a measure as the type of miss. This sort of thing could help you determine if a pitcher has actually changed the break of their pitches (a different mix of swinging strikes), but PITCHf/x covers that anyway.
Also, a study such as this would only get further scuppered by mixed effects: Johan Santana’s changeup would net you both an early swing and a swing over the top.
by abender20 on Apr 24, 2009 6:29 PM PDT
This is why I did talk about the difficulty in classifying swinging strikes
Also why I feel like this hypothesis would be extremely difficult to test.
PITCHf/x may not tell the whole story. Sometimes better command could lead to more swinging strikes. A pitcher who consistently gets ahead 0-1 or 0-2 will probably get more swinging strikes because batters cannot sit in predictable counts.
In part you’ve reinforced some of my initial point. A slider should generate a miss over the top. If Player Y gets 5 swinging strikes on a slider, but only 2 were over the top, and through his career his slider has gotten primarily over-the-top misses, then we can determine that 3 of the 5 swinging strikes are unrepeatable.
by BQueezy on Apr 24, 2009 7:31 PM PDT
RE: last paragraph
First you’d have to demonstrate that they are unrepeatable. I don’t think you can just assume it.
by Graham on Apr 24, 2009 8:12 PM PDT
Good catch, didn't mean to assume this was true.
That was basically the purpose of the post. My hypothesis is that this type of swinging strike is unsustainable. I shouldn’t have concluded as if I had already proven it to be true.
I wish there were some way to determine this. It seems like it would take either days of mlb.tv archives, or hoping for MLB or some other organization to start tracking these things.
by BQueezy on Apr 24, 2009 9:38 PM PDT
