A Study of Winning Percentage in the MLB Using Fuzzy Markov Regression
Abstract
1. Introduction
2. Methodology
2.1. Factor Analysis
2.2. Fuzzy Partition on the Winning Percentage
2.3. Fuzzy Probability of Future Win Percentage
2.4. Fuzzy Markov Regression
3. Results and Discussion
3.1. Fuzzy Regression for Winning Percentage
3.2. Fuzzy Markov Regression for Winning Percentage
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Category | Group | Variables | Variables excluded as a result of discriminant analysis |
---|---|---|---|
32 Batting Variables | 11 var | RBI, BB, SB, E | G, PA, AB, R, CS, OPS, Fld%, HBP, SF, SH, Def, IBB |
 | | SO, DP | |
 | | HR, 2B | |
 | | 3B, H, Sh | |
 | 9 var | BA, OBP, wOBA, wRC+ | |
 | | ISO, SLG | |
 | | OFF, WAR, BsR | |
29 Pitching Variables | 10 var | BK, HBP, BB, WP | G, IP, R, ER, SO, HR9, DP, Fld%, Page, ShO, GS, IBB, E |
 | | CG, SV, tSHO | |
 | | H, HR | |
 | | SO9 | |
 | 6 var | ERA, FIP, WHIP | |
 | | BABIP, LOB% | |
 | | WAR | |
8 Fielding Variables | 3 var | E | PO, PB, SB, CS |
 | | DP | |
 | | A | |
 | 1 var | DEP | |
N | Minimum | Median | Maximum | Variance | Standard Deviation(s) | |
---|---|---|---|---|---|---|
1902 | 0.235 | 0.506 | 0.512 | 0.763 | 0.007 | 0.085 |
Before \ After | State 1 | State 2 | State 3 | State 4 | State 5 | Total |
---|---|---|---|---|---|---|
State 1 | 17 | 44 | 4 | 2 | 0 | 67 |
State 2 | 39 | 345 | 149 | 91 | 0 | 624 |
State 3 | 9 | 147 | 165 | 162 | 5 | 488 |
State 4 | 0 | 89 | 167 | 391 | 24 | 671 |
State 5 | 0 | 1 | 2 | 26 | 7 | 36 |
Total | 65 | 626 | 487 | 672 | 36 | 1886 |
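As a rough illustration of how the Before/After counts above could feed a Markov model, the sketch below row-normalizes them into empirical transition probabilities. It assumes the table is a 5 × 5 transition-count matrix (rows: class before, columns: class after) and indexes the five classes 0 to 4 in table order, since their original labels are not shown here. This is an ordinary (crisp) relative-frequency estimate, not necessarily the fuzzy probability used in the paper; the names `counts` and `P` are illustrative.

```python
# Counts copied from the Before/After table (rows: before, columns: after).
# Classes are indexed 0-4 in table order; this indexing is an assumption.
counts = [
    [17,  44,   4,   2,  0],
    [39, 345, 149,  91,  0],
    [ 9, 147, 165, 162,  5],
    [ 0,  89, 167, 391, 24],
    [ 0,   1,   2,  26,  7],
]

# Marginal totals, which should reproduce the table's "Total" column and row.
row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
grand_total = sum(row_totals)

# Row-normalize: P[i][j] estimates P(after = j | before = i).
P = [[c / total for c in row] for row, total in zip(counts, row_totals)]

for row in P:
    print("  ".join(f"{p:.3f}" for p in row))
```

Each row of `P` sums to 1, so it can be read as the estimated distribution over next-period classes given the current class.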
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Choi, S.H.; Ji, S.-K. A Study of Winning Percentage in the MLB Using Fuzzy Markov Regression. Mathematics 2025, 13, 1008. https://doi.org/10.3390/math13061008