I should probably kick off this article with a disclaimer: I may be the biggest Brad Miller fan on the internet. Certainly I am the biggest Brad Miller fan with a platform to publish articles about baseball statistics. I've been expecting big things from Crazy Legs since before he hit AAA. After a rough early 2014, I was among the first to point out his resurgence. I even (inadvertently) gave him the awesome vegetable-themed nickname that's inspired two of my very favorite LL 2.0 GIFs. I wear my belief in Miller's talent on my metaphorical internet sleeve, and that belief absolutely colors my analysis.
With that said... the fact that my position on Miller is even a little bit controversial boggles my freakin' mind. Most of the time, when I find myself in possession of an unpopular opinion, I can understand and respect the thought process that informs the views of the majority. This time? Nope. Zip. Nada. I simply cannot understand why Mariners fans do not like Brad Miller.
And before you ask, no, Mariners fans do not like Brad Miller. I'm not just making that up. That's science.
Recently, both for my summer job as a data analyst at a sports science startup and out of personal curiosity, I've begun to explore the computer science subfield of machine learning. The core pursuit of machine learning is the development of algorithms that can "learn" from data to build ever more accurate statistical models of various phenomena. Machine learning algorithms construct models by maximizing their performance on a set of training data, check the models by evaluating their performance on a set of testing data, and thereafter use the models to make predictions about new data.
The classic example is an email spam filter. Gmail's spam filters read your email, extract useful pieces of data (Have you received an email from this person before? How many spelling errors are there? How many capital letters?), and use that data to decide whether or not the email should be blocked as spam. But there wasn't some programmer who sat down and wrote a spam blocker. Writing a computer program capable of intelligently handling all of that data and making the best possible decision in every case is simply too hard for a human to do. So instead, Google got a whole bunch of data - millions of emails, flagged as "Spam" or "Not Spam" by users - and used it to train a machine learning model. Then they got a whole bunch more data - millions more flagged emails - and used it to evaluate the model they'd built. Once they decided their model was making the right decision on the testing data at a high enough rate, they put it into action on Gmail, and now your inbox is defended by a continuously learning, self-improving spam filter. As an added bonus, because the model can learn from new data, Google's programmers don't have to write new code every time spambots get a little better. Instead, every time you click "Mark as Spam," your spam filter gets smarter on its own.
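To make the "train on labeled data, then predict on new data" loop concrete, here's a toy Naive Bayes spam classifier in a few dozen lines of Python. Everything here is invented for illustration - the six "emails," the word counts, all of it - and real spam filters are enormously more sophisticated, but the shape of the process is the same: count patterns in labeled examples, then score new messages against those counts.

```python
import math
from collections import Counter

# Tiny labeled corpus standing in for "millions of flagged emails."
# All messages and labels here are made up for illustration.
train = [
    ("win free money now", "spam"),
    ("free viagra click now", "spam"),
    ("claim your free prize", "spam"),
    ("lunch meeting at noon", "ham"),
    ("project update attached", "ham"),
    ("are we still on for dinner", "ham"),
]

def fit(data):
    """The 'learning' step: count word frequencies per class."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = Counter()
    for text, label in data:
        for word in text.split():
            counts[label][word] += 1
        totals[label] += 1
    return counts, totals

def predict(model, text):
    """Score a new message under each class (with add-one smoothing)
    and return the likelier label."""
    counts, totals = model
    vocab = {w for c in counts.values() for w in c}
    best, best_score = None, float("-inf")
    for label in counts:
        # log prior + sum of log word likelihoods
        score = math.log(totals[label] / sum(totals.values()))
        n = sum(counts[label].values())
        for word in text.split():
            score += math.log((counts[label][word] + 1) / (n + len(vocab)))
        if score > best_score:
            best, best_score = label, score
    return best

model = fit(train)
print(predict(model, "free money prize"))           # → spam
print(predict(model, "meeting about the project"))  # → ham
```

The key point is the one the paragraph above makes: nobody hand-wrote a rule that says "free money prize" is spam. The rule fell out of the labeled data.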
How does this relate to Brad Miller? Well, one of the other classic examples of machine learning is "sentiment analysis". Here's where it gets crazy: by training an algorithm on a dataset of text samples, each scored in terms of positivity or negativity (for example, IMDB's database of movie reviews), you can build a computer program that reads text and determines how positively the author feels about the topic at hand. One tool you can use to do this quickly, easily, and for free is provided by data science startup Indico. Indico's mission is to make machine learning accessible for non-specialists, and so using their Sentiment Analysis tool doesn't even require any programming knowledge - just go to their homepage, scroll down a bit, and paste your text into their demo.
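If you're curious what "text in, 0-to-1 score out" looks like, here's an extremely crude sketch. This is emphatically not how Indico's tool works - theirs is a real model trained on labeled corpora, while this one just counts hits against a tiny hand-written word list I made up - but it shows the interface such a model presents.

```python
# A crude stand-in for a trained sentiment model: score text on a 0-1
# scale using a tiny invented lexicon. Indico's actual tool is a trained
# model; this sketch exists only to show the input/output shape.
POSITIVE = {"love", "great", "awesome", "favorite", "best", "good"}
NEGATIVE = {"hate", "awful", "terrible", "bad", "worst", "bust"}

def sentiment(text):
    """Return a score in [0, 1]: 0 = very negative, 1 = very positive."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos + neg == 0:
        return 0.5  # no signal: call it neutral
    return pos / (pos + neg)

print(sentiment("Brad Miller is my favorite player, he is awesome"))  # 1.0
print(sentiment("Miller is a bust, his defense is terrible"))         # 0.0
```

A real model learns which words (and phrases, and contexts) carry sentiment from the training data instead of being handed a list, which is exactly the spam-filter story again.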
According to Indico's Sentiment Analysis tool, the sentiment of my first disclaimer (the one about how I'm biased in favor of Brad Miller) is 0.884. The scale runs from 0 on the negative end to 1 on the positive end, so a 0.884 means I must really like Brad Miller. But the sentiment of the next paragraph, the one about how I don't understand why no one else likes Brad Miller, is only 0.055. Obviously, I am not a fan of people who are themselves not fans of Brad Miller.
Got it? OK. Indico has also collaborated with fellow startup Blockspring to build a tool that lets users easily search through Tweets, perform sentiment analysis, and determine how Twitter feels about the searched topic. You can read the tutorial here - it's stupidly easy to do. The tool isn't perfect, mostly because Twitter limits automated searches to the last ~week and the last 100 Tweets, but it's a good indicator of the pulse of public opinion. And what does the Indico/Blockspring Twitter Sentiment Analysis Spreadsheet tell us?
Twitter does not like Brad Miller.
By contrast, here are the Twitter Sentiment Analysis pie charts for Felix Hernandez and Robinson Cano. Tweets about Felix, for obvious reasons, have mostly been "very positive". Cano's having an absolutely awful year, but Tweets about him have been mostly "neutral". Brad Miller? Only 27% positive tweets.
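Under the hood, pie charts like these amount to scoring each tweet, bucketing the scores into categories, and reporting the shares. Here's a minimal sketch of that aggregation step - the cutoffs are my guesses (the Indico/Blockspring tool's actual thresholds may differ), and the per-tweet scores below are invented, not real data.

```python
from collections import Counter

def bucket(score):
    """Map a 0-1 sentiment score to a pie-chart category.
    These cutoffs are assumptions, not the tool's real thresholds."""
    if score < 0.2:
        return "very negative"
    if score < 0.4:
        return "negative"
    if score < 0.6:
        return "neutral"
    if score < 0.8:
        return "positive"
    return "very positive"

# Hypothetical per-tweet sentiment scores for some searched player.
scores = [0.91, 0.12, 0.55, 0.33, 0.48, 0.85, 0.07, 0.52, 0.61, 0.29]

shares = Counter(bucket(s) for s in scores)
for category, n in shares.most_common():
    print(f"{category}: {100 * n / len(scores):.0f}%")
```

So "only 27% positive tweets" is just this kind of tally over the ~100 most recent tweets the Twitter API hands back.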
To be sure, some of this is because of his run-costing error against the Giants a few nights ago - although you will remember that in the previous game he hit a home run, and for the last week he has a 101 wRC+. But my experience with Mariners Twitter gives me the impression that, had I run this analysis at just about any time in the last calendar year, I would've gotten the same results. With the empiricism of the best sentiment analysis algorithm in the world behind me, I feel confident in my assertion that Mariners fandom at large is unimpressed with our asparagus-loving friend Brad. In my mind, this gives the anecdotal evidence I've happened upon before more weight. That guy last week in Jeff Sullivan's Fangraphs chat who asked if Miller was a bust? Not an outlier. The SBN authors on the internal Slack channel who, when I asked them to guess the top five offensive shortstops of the last calendar year, guessed Alexi Amarista before they guessed Brad Miller? Not outliers. The first round of @LookoutLanding Twitter mentions in response to this article? Also not outliers. People really just don't like Brad Miller.
And I don't get it, guys. It doesn't make sense.
Here's the leaderboard for American League shortstops, last calendar year (dating back from Sunday night), minimum 150 PA, sorted by fWAR.
| Name | fWAR | Off | Def | PA |
| --- | --- | --- | --- | --- |
| Brad Miller | 2.8 | 5.7 | 7.2 | 407 |
| Jose Reyes | 2.7 | 4.9 | 1.2 | 574 |
| Alcides Escobar | 2.4 | -6.6 | 9.5 | 591 |
| J.J. Hardy | 2.1 | -11.0 | 15.6 | 454 |
| Erick Aybar | 2.1 | -10.8 | 9.3 | 626 |
Over the last calendar year - that is to say, the most recent fifty percent of his career - and with a minimum of 150 PA, Brad Miller leads all AL shortstops in BB%. He leads all AL shortstops in ISO. He's fourth in wRC+, behind one guy with a .363 BABIP and two fringy-at-best defenders. He's in the top 33% by both UZR and BsR. Sure, the AL shortstop crop is weak, but if you expand to the AL and NL, Miller's fWAR ranking drops all the way to... third. His ISO falls to second. His wRC+ is seventh, tied with Troy Tulowitzki. Oh no!
To think that Brad Miller is a bad baseball player requires completely ignoring a growing mountain of statistical evidence. Even when I try to put myself in the shoes of a less statistically literate fan, I can't come up with anything reasonable to hold against him. To run through a brief list:
- His batting average is bad, sitting at .244 over the last calendar year. Well, OK, but it's not the '90s. .300 hitters don't grow on trees these days, especially not in Safeco Field. Since 2008, only Ichiro and Robinson Cano have hit over .300 while qualifying for the batting title with the Mariners. And yeah, these have been some pretty bad offenses, but Kyle Seager doesn't hit .300, and people like him just fine.
- His throwing arm at short is inconsistent, and he occasionally makes boneheaded mistakes in the field. True, but he's also rangier than the average shortstop, and his good performance on plays most shortstops don't make counterbalances his poor performance on plays that most shortstops do make. This is actually captured in UZR, which scores Miller poorly in its Errors component but well in its Range component. Even if you don't believe in advanced defensive metrics at all, and your eye test informs you that Miller's no good with the glove... it's OK to be a below average defensive shortstop when you're a top-five offensive shortstop.
- He had a two-month-long really bad stretch at the beginning of 2014. Every hitter has bad stretches. Robinson Cano had a two-month-long really bad stretch at the beginning of 2015. Robinson Cano is not a bad hitter.
- He also had a month-long bad stretch at the beginning of 2015. Not dissimilarly, Kyle Seager had a month-long bad stretch at the beginning of 2014. Kyle Seager, like Robinson Cano, is not a bad hitter.
- He inherited the shortstop job from Brendan Ryan, who was a defensive wizard, and his glovework looks bad by comparison. OK, but does anyone remember watching Brendan Ryan hit? No? Just me? It was terrible!
- Dave Cameron likes him. Mariners fans do not like Dave Cameron. I don't think I need to explain how stupid this one is.
If Miller were just some slightly-below-average player who happened to look bad in the field and hit in hot and cold streaks, I could see why fans wouldn't like him. But Brad's on pace for almost 4 WAR this year, guys. There is a legitimate case to be made that he should be the American League All-Star at shortstop. He's the best position player the Mariners have developed since Kyle Seager, who is in turn the best position player the Mariners have developed since A-Rod. Over the last calendar year, he's been the third or fourth best player on the Mariners, and the best shortstop in the American League.
And yet the fans don't like him.
What gives?