Monday, July 21, 2008

Is Cliff Lee overperforming?

For my first post here on SABR Rattling, I will focus on my favorite player of the 2008 season, Cliff Lee. Early in the 2008 season, Lee was throwing absolutely ridiculous baseball. In his first 4 starts he gave up 2 ER, walked 2, and struck out 29, over 33 1/3 IP. Obviously, people responded to this outrageous start by saying that he would quickly fall to earth and regress back to his career averages, or at least close to them. After a few more starts, it became apparent that Lee was going to continue to throw the ball far better than he has in the past, at least based on the traditional stats of Wins, Strikeouts, and ERA. However, the more inquisitive or statistically minded fan may say "Wins are meaningless, and ERA is fuzzy, as it is influenced by relief pitchers and the players in the field. Cliff Lee is nowhere near as good as his numbers are." Is that true?

Well, let's look a bit deeper. The only way to truly analyze how a pitcher is performing is to remove as much luck and team defensive influence as possible. Anyone bothering to read this post already knows about DIPS, but here's a quick summary: In 1999, Voros McCracken came to the startling conclusion that nearly anything that happens to a ball put in play (home runs aside) is essentially luck. He created a formula for analyzing the individual performance of a pitcher, and called it DIPS: Defense Independent Pitching Statistics. The sabermetric community, of course, responded by analyzing his conclusion and ultimately agreed, and DIPS proved to be a breakthrough method of analyzing pitchers' performances. DIPS looks at all the things that are within a pitcher's control: strikeouts, walks, hit batters, and HR allowed, and produces dERA, a far more accurate "ERA" than the traditional measurement.
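To make the "defense independent" idea concrete, here is a minimal sketch. Note that it is NOT the DIPS formula behind dERA, which is considerably more involved. It swaps in Tom Tango's simpler FIP (Fielding Independent Pitching), which uses the same inputs: home runs, walks, hit batters, and strikeouts. The constant of 3.20 is an assumption (it shifts a little from season to season), and the stat line is hypothetical rather than any real pitcher's numbers.

```python
def fip(hr, bb, hbp, k, ip, constant=3.20):
    """FIP: an ERA-scaled estimate built only from HR, BB, HBP, and K.

    `constant` is an assumed league constant; it varies by season.
    `ip` is innings pitched as a decimal (33 1/3 innings -> 33.333).
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical season line, for illustration only.
example = fip(hr=10, bb=40, hbp=5, k=150, ip=180.0)
print(f"FIP: {example:.2f}")
```

Nothing in the function looks at hits allowed or at what the fielders did, which is the whole point: the estimate moves only when the pitcher's own outcomes move.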

So let's look at Cliff Lee: after 19 starts (so more or less 2/3 of his season), he is posting an ERA of 2.29, the 3rd best in MLB for starting pitchers. As mentioned previously, however, this number isn't ideal - we want to look at his dERA - which is an MLB-best 2.57. Along with this, his BABIP (Batting Average on Balls In Play) is a fairly high .283, which puts him at 67th in MLB for SPs. What does this mean? It basically means Lee isn't getting lucky. When batters put the ball in play against Lee, it falls for a hit more often than it does against half of the other SPs in the game. So what is Lee doing that makes him so effective, even if he is less lucky than most in terms of BABIP?
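For anyone who wants to compute BABIP themselves, here is a small sketch using the standard definition: hits minus home runs, divided by at-bats minus strikeouts minus home runs plus sacrifice flies. The inputs below are hypothetical, since the component totals for Lee aren't listed in this post.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies=0):
    """Batting average on balls in play, from a pitcher's totals allowed."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    # Home runs are hits, but they never reach the fielders, so they are
    # removed from both the numerator and the denominator.
    return (hits - home_runs) / balls_in_play


# Hypothetical totals allowed, for illustration only.
value = babip(hits=180, home_runs=20, at_bats=700, strikeouts=150, sac_flies=5)
print(f"BABIP: {value:.3f}")
```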

Most importantly, perhaps, he isn't walking anyone. He has 20 walks in his 19 starts, which ties him for 3rd fewest in MLB. A number of pitchers are posting similar walk totals, however, and only a few have been nearly as effective as Lee; what those few have in common is a high number of strikeouts. In addition, Lee has allowed only 5 home runs all season, tying him for 2nd fewest. Together, these individual metrics explain why Lee's overall numbers are so superior.

It might prove interesting to look at some of the top pitchers by traditional metrics and set those numbers beside the more telling ones that we used to analyze Lee:

Pitcher             Wins  ERA   dERA  BABIP  BB  K    HR
Cliff Lee           13    2.29  2.57  .283   20  110  5
Justin Duchscherer  10    1.87  3.63  .202   26  70   7
Rich Harden         5     2.19  2.78  .282   34  102  5
Edinson Volquez     12    2.49  3.39  .278   59  129  6
Tim Lincecum        12    2.79  3.15  .298   51  143  9


Consistent themes here (out of an obviously small set): Low numbers of HR allowed, high numbers of strikeouts, fairly low numbers of walks, and a BABIP that is consistently average or above average. This leads to ERAs that are close to their dERA, or "real" ERA. Do you notice the outlier? Duchscherer, obviously. His dERA is nearly 2 full runs higher than his current ERA (3.63 vs. 1.87), most likely because of his fairly modest K:BB ratio and extremely low BABIP (lowest in the majors, even). This indicates that Duchscherer is quite likely to regress to an ERA of 3 or above. There hasn't been a single starting pitcher in the 2000s to throw a full season with a BABIP as low as .202 - for example, the best in 2007 was Orlando Hernandez's .214.
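If you'd rather check the table than eyeball it, here is a quick sketch that computes each pitcher's K:BB ratio and the gap between dERA and ERA, using only the numbers in the table above. The variable names are mine, and picking the outlier by largest gap is just one simple way to do it.

```python
# (name, ERA, dERA, BB, K), copied from the table above.
top_pitchers = [
    ("Cliff Lee", 2.29, 2.57, 20, 110),
    ("Justin Duchscherer", 1.87, 3.63, 26, 70),
    ("Rich Harden", 2.19, 2.78, 34, 102),
    ("Edinson Volquez", 2.49, 3.39, 59, 129),
    ("Tim Lincecum", 2.79, 3.15, 51, 143),
]

# Strikeouts per walk for each pitcher.
k_per_bb = {name: k / bb for name, _, _, bb, k in top_pitchers}

# How far each pitcher's dERA sits above his ERA.
gap = {name: dera - era for name, era, dera, _, _ in top_pitchers}

for name in k_per_bb:
    print(f"{name:20s} K:BB {k_per_bb[name]:.2f}  dERA-ERA {gap[name]:+.2f}")

# The pitcher whose ERA flatters him the most.
outlier = max(gap, key=gap.get)
print("Largest gap:", outlier)
```

Lee comes out with the best K:BB of the group, and Duchscherer with by far the largest gap.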

Here is a table showing a few starting pitchers who are currently underperforming in traditional metrics relative to their actual pitching (an ERA at least 1 run higher than their dERA):

Pitcher        Wins  ERA   dERA  BABIP  BB  K    HR
A.J. Burnett   10    4.83  3.77  .311   60  132  12
Andrew Miller  5     5.63  3.87  .333   50  80   6
Randy Johnson  6     5.13  3.95  .319   28  95   15
Kevin Millwood 6     5.23  3.97  .360   38  78   10


Consistent themes here? Fairly obvious: much higher than average BABIP (an astonishing .360 for Millwood!), combined with fairly decent K:BB ratios and/or low numbers of HR allowed. These are SPs who are likely to lower their ERA (and might make for good fantasy pickups if your league is deep).

Besides Duchscherer, who else is overperforming? Yet another poorly constructed table tells the tale:

Pitcher            Wins  ERA   dERA  BABIP  BB  K   HR
Joe Saunders       12    3.05  4.59  .232   34  64  15
John Lannan        6     3.29  4.58  .260   38  63  12
Armando Galarraga  7     3.41  4.69  .231   37  68  13
Gavin Floyd        10    3.52  5.11  .212   50  81  18


Themes? Low BABIP, fairly poor K:BB ratios, higher numbers of HR allowed. These fellows are more likely to regress, so they may make good trade bait (although it's unlikely people have or want Lannan or Galarraga on their rosters).
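The sorting into "underperforming" and "overperforming" can be sketched in a few lines. The 1-run cutoff is the one used for the underperformers table; applying the same cutoff in the other direction for overperformers is my assumption, as are the label names. The ERA and dERA figures are copied from the three tables in this post.

```python
THRESHOLD = 1.0  # runs of difference between ERA and dERA

# name: (ERA, dERA), copied from the three tables in this post.
pitchers = {
    "Cliff Lee": (2.29, 2.57),
    "Justin Duchscherer": (1.87, 3.63),
    "Rich Harden": (2.19, 2.78),
    "Edinson Volquez": (2.49, 3.39),
    "Tim Lincecum": (2.79, 3.15),
    "A.J. Burnett": (4.83, 3.77),
    "Andrew Miller": (5.63, 3.87),
    "Randy Johnson": (5.13, 3.95),
    "Kevin Millwood": (5.23, 3.97),
    "Joe Saunders": (3.05, 4.59),
    "John Lannan": (3.29, 4.58),
    "Armando Galarraga": (3.41, 4.69),
    "Gavin Floyd": (3.52, 5.11),
}


def label(era, dera, threshold=THRESHOLD):
    """Compare results (ERA) against defense-independent pitching (dERA)."""
    diff = era - dera
    if diff >= threshold:
        return "underperforming"  # results worse than the pitching
    if diff <= -threshold:
        return "overperforming"   # results better than the pitching
    return "in line"


labels = {name: label(era, dera) for name, (era, dera) in pitchers.items()}
for name, verdict in labels.items():
    print(f"{name:20s} {verdict}")
```

By this cutoff Lee, Harden, Volquez, and Lincecum all land "in line", with Volquez the closest to tipping over.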

However, there is the possibility that pitchers like Floyd, Duchscherer, Saunders, etc., are just BETTER at inducing ground outs, which would lead to a lower BABIP than average. As the DIPS 2.0 formula used to calculate dERA does not incorporate such factors, the dERA it produces may be off for these pitchers. In 2005, David Gassko wrote an article entitled "Batted Balls and DIPS" for The Hardball Times, in which he attempts to build a better version of DIPS by incorporating things like ground ball/fly ball rates. Even so, research by McCracken and others shows that BABIP is hugely volatile from season to season (especially compared to the consistency of HR and BB allowed), indicating that it is unlikely that a pitcher has much control over BABIP. However, pitchers DO show large differences in career BABIP, and in many cases these career numbers account for their success.

So while McCracken's findings still hold up quite well, there is likely more work to be done on DIPS. Even working with what we have access to now, though, we can analyze whether pitchers are over- or underperforming, and make predictions about their future performance. So, is Cliff Lee overperforming? A tiny bit: his 2.29 ERA sits just under his 2.57 dERA. By any measure, however, he is an elite pitcher in 2008.
