# The Prediction of Batting Averages in Major League Baseball

## Abstract

## 1. Introduction

## 2. Data Analysis

#### 2.1. Using 2015 Data and Predictions

#### 2.2. Using 2016 Data and Predictions

## 3. Assessing 2017 Predictions

#### Approximate Standard Errors

## 4. Discussion

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

**Figure 2.** Statcast and PECOTA predictions for the 2016 season. The line $y=x$ has been added to aid interpretation.

**Figure 3.** Scatterplots of the Statcast and PECOTA predictions for the 2017 season plotted against actual batting averages. Simple linear regression lines have been added to both plots.

**Table 1.** Comparison of the three prediction methods in 2017. We include the mean absolute error (MAE), a 95% CI for the MAE based on 100 bootstrap samples, the mean error (ME), the average prediction (Avg), the standard deviation of the predictions (SD), and the 5th and 95th percentiles of the predictions.

Method | MAE | 95% CI for MAE | ME | Avg | SD | 5th percentile | 95th percentile |
---|---|---|---|---|---|---|---|
Statcast | 0.0236 | (0.0229, 0.0233) | −0.0009 | 0.260 | 0.026 | 0.212 | 0.299 |
PECOTA | 0.0209 | (0.0205, 0.0209) | −0.0017 | 0.261 | 0.018 | 0.234 | 0.291 |
Combined | 0.0208 | (0.0198, 0.0202) | 0.0000 | 0.262 | 0.017 | 0.233 | 0.287 |
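
The summary statistics named in the Table 1 caption can be sketched in a few lines. The following is a minimal illustration, not the authors' code: the function names, the toy data, the sign convention for ME (prediction minus actual), and the use of a percentile bootstrap over batters are all assumptions, since the surviving text does not specify them.

```python
import random
from statistics import mean, quantiles, stdev


def mae(pred, actual):
    """Mean absolute error between predicted and observed batting averages."""
    return mean(abs(p - a) for p, a in zip(pred, actual))


def mean_error(pred, actual):
    """Mean signed error; 'prediction minus actual' is an assumed convention."""
    return mean(p - a for p, a in zip(pred, actual))


def bootstrap_mae_ci(pred, actual, n_boot=100, level=0.95, seed=0):
    """Percentile bootstrap interval for the MAE (assumed method).

    Batters are resampled with replacement n_boot times and the MAE is
    recomputed on each resample.
    """
    rng = random.Random(seed)
    n = len(pred)
    draws = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        draws.append(mean(abs(pred[i] - actual[i]) for i in idx))
    draws.sort()
    alpha = (1.0 - level) / 2.0
    lo = draws[int(alpha * n_boot)]
    hi = draws[min(n_boot - 1, int((1.0 - alpha) * n_boot))]
    return lo, hi


def summarize(pred, actual):
    """Collect the columns of Table 1 for one prediction method."""
    cuts = quantiles(pred, n=20)  # 19 cut points at 5%, 10%, ..., 95%
    return {
        "MAE": mae(pred, actual),
        "MAE CI": bootstrap_mae_ci(pred, actual),
        "ME": mean_error(pred, actual),
        "Avg": mean(pred),
        "SD": stdev(pred),
        "5th percentile": cuts[0],
        "95th percentile": cuts[-1],
    }


# Hypothetical predictions and outcomes for six batters (not data from the paper).
toy_pred = [0.245, 0.240, 0.267, 0.269, 0.224, 0.292]
toy_actual = [0.266, 0.249, 0.285, 0.303, 0.234, 0.312]
for name, value in summarize(toy_pred, toy_actual).items():
    print(name, value)
```

With only 100 resamples the interval endpoints are themselves noisy, so a fixed seed is used here to make the sketch reproducible.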

**Table 2.** Predictions and approximate standard errors for the first 10 batters (alphabetically) in the 2017 season.

Batter | $y_j^{(C)}$ | $\mathrm{SE}\left(y_j^{(C)}\right)$ |
---|---|---|
Aaron Hicks (New York Yankees) | 0.245 | 0.0162 |
Adam Duvall (Cincinnati Reds) | 0.240 | 0.0157 |
Adam Jones (Baltimore Orioles) | 0.267 | 0.0160 |
Adam Lind (Seattle Mariners) | 0.269 | 0.0164 |
Adam Rosales (San Diego Padres) | 0.224 | 0.0157 |
Addison Russell (Chicago Cubs) | 0.241 | 0.0159 |
Adeiny Hechavarria (Miami Marlins) | 0.262 | 0.0158 |
Adrian Gonzalez (Los Angeles Dodgers) | 0.272 | 0.0158 |
Adrian Beltre (Texas Rangers) | 0.292 | 0.0158 |
Albert Pujols (Los Angeles Angels) | 0.269 | 0.0158 |
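
One way to read the standard errors in Table 2 is to turn each into a rough interval around the prediction. The sketch below assumes an approximately normal prediction error, which the surviving text does not state; the helper name `approx_interval` and the default multiplier `z = 1.96` are illustrative choices only.

```python
def approx_interval(pred, se, z=1.96):
    """Rough interval pred +/- z * se, assuming approximately normal error."""
    return pred - z * se, pred + z * se


# Prediction and standard error for Aaron Hicks, taken from Table 2.
lower, upper = approx_interval(0.245, 0.0162)
print(f"Aaron Hicks: ({lower:.3f}, {upper:.3f})")
```

Because the standard errors in Table 2 are all close to 0.016, intervals built this way are about 0.06 wide for every batter listed.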

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Bailey, S.R.; Loeppky, J.; Swartz, T.B.
The Prediction of Batting Averages in Major League Baseball. *Stats* **2020**, *3*, 84-93.
https://doi.org/10.3390/stats3020008
