Over the last few days, various reputable baseball analysis sites have been digging into the relationship between infield fly ball rates (IFFB%) and home run per fly ball rates (HR/FB). The discussion was prompted by a blog post by Rory Paap at Paapfly.com called “Matt Cain ignores xFIP, again and again,” which generated a response from Dave Cameron at Fangraphs.
Paap suggested FIP and xFIP do Cain a disservice because they don’t give him his due credit for possessing the “unique skill” of inducing harmless fly ball contact, a theory that David Pinto at Baseball Musings attempted to quantify last October. Cameron’s response included some interesting analysis that looked at the best pitchers from 2002-2007 in terms of HR/FB rate and compared their IFFB% over that span to what they posted the next three seasons. His conclusion?
Is there some skill to allowing long fly outs? Maybe. But if you can identify which pitchers are likely to keep their home run rates low while giving up a lot of fly balls before they actually do it, then you could make a lot of money in player forecasting.
Since my mind automatically places all arguments regarding statistical trends into the context of fantasy baseball, I decided to throw my hat into the ring and see if I could find a trend between IFFB% and HR/FB rate. My reasoning was that if the two were correlated, then plotting HR/FB rate as a function of IFFB% would show a clear inverse trend (meaning that a higher IFFB% would tend to come with a lower HR/FB rate, and vice versa).
To do this, I looked at all pitchers from 2008 to 2010 who threw at least 162 innings and plotted their IFFB% and HR/FB rate as described above. This three-year range produced 257 data points, and these were the results:
Note: IFFB% is on the x-axis and HR/FB rate is on the y-axis.
Just by looking at the chart, it’s tough to visually decipher any sort of trend. If there actually is an inverse relationship between IFFB% and HR/FB rates, we would expect the data points to slope from the top-left (low IFFB%, high HR/FB rate) to the bottom-right (high IFFB%, low HR/FB rate).
By adding a best-fit trend line to the data set, we see that there is a very slight slope in the direction we anticipated, but to say it shows any sort of useful relationship is a stretch. The fit has an R-squared value of just 0.0126. If you don’t know what R-squared is, it measures how much of the variation in one variable a straight-line fit on the other can account for. R-squared values range from 0 to 1, and the closer they are to 1, the stronger the linear relationship between the two sets of data. A value of 0.0126 means IFFB% accounts for only about 1.3 percent of the variation in HR/FB rate, which works out to a correlation coefficient of roughly -0.11: in the expected direction, but very weak.
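For readers who want to run this kind of check themselves, here is a minimal sketch of a best-fit line and its R-squared, using only the Python standard library. The function name linear_fit_r_squared and the sample numbers are made up for illustration; they are not the 257 data points plotted above, and this is not necessarily how the original trend line was produced.

```python
from statistics import mean


def linear_fit_r_squared(x, y):
    """Return (slope, intercept, r_squared) for the least-squares line
    y = slope * x + intercept. Hypothetical helper, for illustration only."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    # Sums of squared deviations and cross-deviations from the means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain more than one distinct value")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # For a one-variable linear fit, R-squared is the squared correlation.
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared


# Made-up sample values (percentages), NOT the article's data.
iffb_pct = [6.0, 8.5, 10.0, 12.5, 14.0, 9.0]
hr_fb_pct = [11.0, 9.5, 12.0, 8.0, 10.5, 10.0]

slope, intercept, r2 = linear_fit_r_squared(iffb_pct, hr_fb_pct)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r2:.4f}")
```

A negative slope would match the inverse trend we were looking for, while an R-squared near zero would say the line explains almost none of the spread in HR/FB rate.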
What conclusions can we draw from this? Perhaps it is possible to tell whether a pitcher like Cain is more prone to lower HR/FB rates by virtue of his ability to induce weaker contact, but IFFB% alone is not enough to draw any conclusions. More sophisticated analysis, like that provided by Pinto’s article at Baseball Musings, might unveil some usable relationships, but we cannot simply look at Clayton Kershaw’s 4.1 percent HR/FB rate in 2009 and say his 13.5 percent IFFB rate explains it. The data just doesn’t support that.