Winning one or two games during a Major League Baseball (MLB) season is often the difference between a team advancing to post-season play or “waiting until next year”. Technological advances have made it feasible to augment historical data with in-game contextual data to give managers immediate insight into an opponent’s next move, thereby providing a competitive edge. We developed statistical models of pitcher behavior using pitch sequences thrown during three recent MLB seasons (2011–2013). The purpose of these models was to predict the next pitch type, for each pitcher, based on the data available at that moment in each at-bat. Independent models were developed for each player’s four most frequent pitches. The overall predictability of next pitch type is 74.5%. Additional analyses of pitcher predictability within specific game situations are discussed. Finally, using linear regression analysis, we show that an index of pitch sequence predictability may be used to project player performance in terms of Earned Run Average (ERA) and Fielding Independent Pitching (FIP) over a longer term. On a restricted range of the independent variable, reduced complexity in pitch selection is correlated with higher values of both FIP and ERA for the players represented in the sample. Both models were significant at the α = 0.05 level (ERA: p = 0.022; FIP: p = 0.0114). With further development, such models may reduce the risk faced by management in evaluating potential trades, or by scouts in assessing unproven emerging talent. Pitchers themselves might benefit from awareness of their individual statistical tendencies and adapt their behavior on the mound accordingly. To our knowledge, the predictive model relating pitch-wise complexity and long-term performance is novel.
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.