A respected scientist once told me that statistics only confirm what you see. His words kept popping into my head as I read well-known writers justifying their Cy Young ballots. In particular, I was shocked to see Dave Cameron of FanGraphs, whose work I greatly admire, leave Jon Lester off his top-five ballot.
I empathize with fans who think these dumb advanced statistics are worthless, especially when most who’ve watched Lester all year believe he was truly one of the best NL pitchers. This post highlights how I believe Cameron misinterpreted a few advanced baseball statistics and how we can use his example as a reminder to talk appropriately about baseball analytics.
Many, including Cameron and me, weigh FIP and other isolation metrics heavily. One reason Cameron left Lester out of his top five was that Lester’s 3.41 FIP was significantly worse than his 2.44 ERA. Since FIP is more reliable than ERA, it should be more reflective of player value. Even so, FIP is not immune to limitations. Tom Tango created the stat as a means to evaluate pitchers alongside existing stats, not in place of them. It is possible for a pitcher to truly outperform his FIP.
“Just because FIP doesn’t consider a player’s performance on batted balls in play or with holding runners, etc,” Tango said, “doesn’t mean that it thinks that all players are equal in those respects.”
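For readers who have never seen what goes into FIP, here is a minimal sketch of the commonly published formula: home runs, walks, hit batters, and strikeouts per inning, plus a league constant that puts the result on the ERA scale. The function name, the example stat line, and the default constant of 3.10 are my own placeholders (the real constant changes every season), so treat this as an illustration rather than a recalculation of anyone's actual numbers. The only real figures used are the 3.41 FIP and 2.44 ERA quoted above.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching from the commonly published formula.

    `constant` is a league-wide value that varies by season; 3.10 is an
    assumed placeholder, not the figure for any particular year.
    `ip` is innings pitched as a true decimal (one third of an inning = 0.333...).
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# A made-up stat line, purely for illustration.
example = fip(hr=20, bb=50, hbp=5, k=200, ip=200.0)

# The gap the argument turns on, using the ERA and FIP quoted in this post.
lester_era = 2.44
lester_fip = 3.41
era_minus_fip = round(lester_era - lester_fip, 2)

print(f"example FIP: {example:.2f}")
print(f"Lester ERA minus FIP: {era_minus_fip:+.2f}")
```

Notice that nothing about balls in play appears in the formula. That is Tango's point: FIP stays silent on what happens after contact, which is different from claiming every pitcher is the same once the ball is put in play.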
Cameron believes the discrepancy between Lester’s FIP and ERA is due to the exemplary work of the teammates surrounding him.
For Lester, it was the Cubs defense, and their impact on his run prevention, which made it difficult for me to give him full credit for the run prevention that occurred while he was pitching. While we don’t have perfect tools for separating pitching from fielding yet, the metrics we do have suggest that Lester was a little bit less responsible for the outs on balls in play he got than the other Cubs starter who was part of my top eight.
And the impact of those outs on balls in play couldn’t have been larger; Lester ran a .215 BABIP with men in scoring position, and the contact numbers just don’t suggest that Lester had a lot to do with that. So while his run prevention numbers are excellent, I didn’t want to reward him for the performance of his teammates.
He hammers his message home by citing a “quality contact” stat on a 1-100 scale.
…Lester recorded an Unadjusted Contact Score (75) well below his adjusted mark (88). There’s that Cub defense again. Lester’s big variance was on line drives, where he allowed an amazingly low .519 AVG-.615 SLG; that’s 58 Unadjusted Liner Production, marked up to 90 for context. His BIP mix is relatively unremarkable, but he limits authority quite well and has no glaring weaknesses anywhere in his profile.
Cameron essentially argued that Lester gave up more hard contact than his counterparts, and that the Cubs defense bailed him out as a result. But this isn’t completely true.
Andrew Perpetua’s expected stats show Lester actually didn’t get hit very hard. Perpetua’s model takes each batted ball’s velocity, launch angle, and spray angle and converts its run value into predictive stats, such as wOBA, batting average, and slugging. Lester’s expected wOBA (xOBA) ranks second in MLB, behind only Max Scherzer (minimum 200 innings). This conflicts with Cameron’s belief that Lester gave up too much hard contact. Had Lester indeed done so, his xOBA would not be so good.
Our current baseball tools sometimes fail to tell the whole story. That is particularly true in this case, where no argument is strong enough to justify leaving Lester off the Cy Young ballot. Trying to argue that the Cubs lefty wasn’t a top-five pitcher is silly; some might even say futile. Maybe in the near future we can say definitively that statistics confirm what we see in baseball.
But for now, even the best of the best sometimes get it wrong. Jon Lester was really good this year. I saw it. You saw it. Most everyone saw it.