A Validation Study of a Deep Learning-Based Doping Drug Text Recognition System to Ensure Safe Drug Use among Athletes
Abstract
1. Introduction
2. Materials and Methods
2.1. Data
2.1.1. Test Data
2.1.2. Data Augmentation
2.1.3. Database
2.2. Optical Character Recognition-Based Doping Drug-Recognition System
2.2.1. Composition and Mechanism
2.2.2. Tesseract OCR
2.2.3. Text Recognition by Tesseract OCR
2.3. Data Processing and Analysis Method
3. Results
3.1. Doping Drug-Recognition System
3.2. Character Recognition Accuracy in the Developed System
3.3. Validation of the Doping Drug-Recognition System
4. Discussion
5. Conclusions and Suggestions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References



| Data Search | Source |
|---|---|
| Google | https://www.google.co.kr/ (accessed on 20 March 2023) |
| Korea Pharmaceutical Information Center | https://www.health.kr/main.as (accessed on 20 March 2023) |
| WADA | https://www.wada-ama.org/en (accessed on 20 March 2023) |

| OCR Engine | Number of Drug Substances | Correct | Error | Accuracy |
|---|---|---|---|---|
| Google Tesseract OCR | 323 | 311 | 12 | 96.3% |

| Binary Classification: Prediction Categories | Reference Classification: Banned | Reference Classification: Safe or Acceptable |
|---|---|---|
| Banned drugs | True master (TM): 10 | False master (FM): 2 |
| Acceptable drugs | False non-master (FN): 0 | True non-master (TN): 10 |

Calculation for the system’s classification accuracy:

1. Accuracy: the proportion of all prescription and drug substance images in which banned drugs are correctly recognized as banned (true master [TM]) or acceptable drugs are correctly recognized as acceptable (true non-master [TN]). Example from this study: (TM + TN)/(TM + FN + FM + TN) = (10 + 10)/(10 + 0 + 2 + 10) ≈ 0.91
2. Sensitivity: the proportion of banned-substance images that are correctly classified as banned (TM). Example from this study: TM/(TM + FN) = 10/(10 + 0) = 1.0
3. Specificity: the proportion of acceptable-substance images that are correctly classified as acceptable (TN). Example from this study: TN/(FM + TN) = 10/(2 + 10) ≈ 0.83

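
The three formulas above can be checked with a few lines of code. The following is a minimal sketch, not the authors' implementation: the function name `classification_metrics` and its dictionary return value are assumptions made for illustration, while the TM/FM/FN/TN naming follows the study.

```python
def classification_metrics(tm: int, fm: int, fn: int, tn: int) -> dict[str, float]:
    """Compute accuracy, sensitivity, and specificity from confusion-matrix counts.

    tm: banned drugs classified as banned (true master)
    fm: acceptable drugs classified as banned (false master)
    fn: banned drugs classified as acceptable (false non-master)
    tn: acceptable drugs classified as acceptable (true non-master)
    """
    if min(tm, fm, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    if tm + fn == 0 or fm + tn == 0:
        raise ValueError("each reference class needs at least one image")
    return {
        "accuracy": (tm + tn) / (tm + fn + fm + tn),
        "sensitivity": tm / (tm + fn),
        "specificity": tn / (fm + tn),
    }


# Worked example using the counts from the table above.
example = classification_metrics(tm=10, fm=2, fn=0, tn=10)
print({name: round(value, 2) for name, value in example.items()})
# → {'accuracy': 0.91, 'sensitivity': 1.0, 'specificity': 0.83}
```

Keeping the four counts as named arguments makes it harder to swap FM and FN by accident, which matters here because the two errors have very different consequences for an athlete.
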
| Step | Use Procedure |
|---|---|
| Step 1 | Log in to the system |
| Step 2 | Enter a drug substance image |
| Step 3 | Analyze substances after image upload |
| Step 4 | Produce the text as an output of the analysis |

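
Steps 3 and 4 can be pictured as a lookup of the recognized text against a list of banned substances. The sketch below is illustrative only and makes several assumptions: the OCR step is taken to have already produced a plain string (no OCR library is called), the names `BANNED_SUBSTANCES`, `find_banned`, and `classify` are hypothetical, the two listed substances are examples rather than the study's database, and only English-letter tokens are matched.

```python
import re

# Hypothetical stand-in for the study's drug database. The two entries are
# illustrative examples only; a real system would load the current list.
BANNED_SUBSTANCES = frozenset({"stanozolol", "furosemide"})


def tokenize(ocr_text: str) -> list[str]:
    """Lowercase the recognized text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", ocr_text.lower())


def find_banned(ocr_text: str, banned: frozenset[str] = BANNED_SUBSTANCES) -> list[str]:
    """Return banned substance names found in the text, in order of first appearance."""
    found: list[str] = []
    for token in tokenize(ocr_text):
        if token in banned and token not in found:
            found.append(token)
    return found


def classify(ocr_text: str) -> str:
    """Label the text 'banned' if any listed substance appears, else 'acceptable'."""
    return "banned" if find_banned(ocr_text) else "acceptable"


# Assumed OCR output for a prescription image (made up for this example).
sample = "Rx: Furosemide 40 mg tablet, take once daily"
print(find_banned(sample), classify(sample))
# → ['furosemide'] banned
```

Exact token matching is the simplest possible choice; it fails as soon as OCR misreads a single character, so a production system would likely need fuzzy matching or a spelling-tolerant index.
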
| | Frequency of Words Extracted (n) | % |
|---|---|---|
| Total | 5379 | 100 |
| Error | 91 | 1.6 |
| Accuracy | | 98.3 |

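
The word-level accuracy in the table above follows directly from the two counts. The snippet below is a small sketch of that arithmetic; the function name `word_accuracy` is an assumption, not something defined in the study. Note that 91/5379 evaluates to about 1.69%, so the 1.6% shown in the table appears to be truncated rather than rounded, while the 98.3% accuracy matches.

```python
def word_accuracy(total_words: int, error_words: int) -> tuple[float, float]:
    """Return (error rate, accuracy) as percentages of the extracted words."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    if not 0 <= error_words <= total_words:
        raise ValueError("error_words must be between 0 and total_words")
    error_rate = 100 * error_words / total_words
    return error_rate, 100 - error_rate


# Counts taken from the table above.
error_rate, accuracy = word_accuracy(total_words=5379, error_words=91)
print(f"error rate = {error_rate:.2f}%, accuracy = {accuracy:.2f}%")
# → error rate = 1.69%, accuracy = 98.31%
```
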
| Binary Classification: Prediction Categories | Reference Classification: Banned | Reference Classification: Safe or Acceptable |
|---|---|---|
| Banned drugs | True master (TM): 218 | False master (FM): 44 |
| Acceptable drugs | False non-master (FN): 0 | True non-master (TN): 624 |

| Accuracy of the Doping Drug-Recognition System’s Classification | Value |
|---|---|
| Accuracy | 0.95 |
| Sensitivity | 1.00 |
| Specificity | 0.93 |

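
Substituting the validation counts (TM = 218, FM = 44, FN = 0, TN = 624) into the study's formulas reproduces the reported values. The block below is a worked check under the study's own notation, not an additional result.

```latex
\begin{align*}
\text{Accuracy}    &= \frac{TM + TN}{TM + FN + FM + TN}
                    = \frac{218 + 624}{218 + 0 + 44 + 624}
                    = \frac{842}{886} \approx 0.95 \\
\text{Sensitivity} &= \frac{TM}{TM + FN}
                    = \frac{218}{218 + 0} = 1.00 \\
\text{Specificity} &= \frac{TN}{FM + TN}
                    = \frac{624}{44 + 624}
                    = \frac{624}{668} \approx 0.93
\end{align*}
```

With FN = 0, no banned drug in the validation set was classified as acceptable; all of the classification error comes from the 44 acceptable drugs flagged as banned.
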
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Lee, S.-Y.; Park, J.-H.; Yoon, J.; Lee, J.-Y. A Validation Study of a Deep Learning-Based Doping Drug Text Recognition System to Ensure Safe Drug Use among Athletes. Healthcare 2023, 11, 1769. https://doi.org/10.3390/healthcare11121769





