DeepThal: A Deep Learning-Based Framework for the Large-Scale Prediction of the α+-Thalassemia Trait Using Red Blood Cell Parameters
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Subjects
- Normal (not thalassemia): cases with normal Hb analysis results (Hb AA2 pattern, A2 ≤ 3.5%) and negative PCR results for α0-thalassemia, α+-thalassemia, Hb CS and Hb Pakse mutations;
- α+-Thalassemia trait: heterozygous α+-thalassemia, Hb CS trait, Hb Pakse trait;
- Two-allele α-thalassemia mutation: heterozygous α0-thalassemia, homozygous α+-thalassemia, and α+-thalassemia/Hb CS or α+-thalassemia/Hb Pakse.
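The three-group definition above can be expressed as a simple lookup. The following is a minimal sketch, not the authors' code: the genotype strings, the `Group` enum, and the `assign_group` helper are hypothetical names introduced here for illustration, since the source does not specify how genotypes were coded.

```python
from enum import Enum


class Group(Enum):
    NORMAL = "Normal (not thalassemia)"
    ALPHA_PLUS_TRAIT = "alpha+-thalassemia trait"
    TWO_ALLELE = "Two-allele alpha-thalassemia mutation"


# Hypothetical genotype labels (assumption); the grouping itself follows
# the three definitions listed in Section 2.1.
GENOTYPE_TO_GROUP = {
    "normal": Group.NORMAL,
    "heterozygous alpha+-thalassemia": Group.ALPHA_PLUS_TRAIT,
    "Hb CS trait": Group.ALPHA_PLUS_TRAIT,
    "Hb Pakse trait": Group.ALPHA_PLUS_TRAIT,
    "heterozygous alpha0-thalassemia": Group.TWO_ALLELE,
    "homozygous alpha+-thalassemia": Group.TWO_ALLELE,
    "alpha+-thalassemia/Hb CS": Group.TWO_ALLELE,
    "alpha+-thalassemia/Hb Pakse": Group.TWO_ALLELE,
}


def assign_group(genotype: str) -> Group:
    """Return the study group for a genotype label; raise if it is unknown."""
    try:
        return GENOTYPE_TO_GROUP[genotype]
    except KeyError:
        raise ValueError(f"Unrecognized genotype label: {genotype!r}") from None
```

For example, `assign_group("Hb CS trait")` returns `Group.ALPHA_PLUS_TRAIT`, while an unlisted label raises `ValueError` rather than silently defaulting to the normal group.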
2.2. Sample Size
2.3. Statistical Analysis
2.3.1. Conventional Statistical Analysis
2.3.2. Model Construction and Development
2.3.3. Performance Evaluation
3. Results
3.1. Baseline Data and Comparisons of the Hematologic Parameters
3.2. Performance Evaluation of Conventional Models
3.3. Performance Evaluation of Different ML-Based Models
3.4. Mechanistic Interpretation of DeepThal
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
α-Thalassemia Status | Group | Frequency | Percentage |
---|---|---|---|
Normal (not thalassemia trait) | Normal | 229 | 38.5 |
α+-thalassemia trait | α+-thal trait | 124 | 20.9 |
Hb Constant Spring trait | α+-thal trait | 36 | 6.1 |
Homozygous α+-thalassemia | Two-allele α-thal mutation | 5 | 0.8 |
α+-thalassemia trait/Hb Constant Spring trait | Two-allele α-thal mutation | 3 | 0.5 |
α0-thalassemia trait | Two-allele α-thal mutation | 197 | 33.2 |
Total | | 594 | 100.0 |
Clinical and Red Blood Cell Parameters | Control | α+-Thal Trait | p-Value Control vs. α+-Thal Trait | Two-Allele α-Thal Mutation | p-Value Control vs. Two-Allele α-Thal Mutation | p-Value Two-Allele α-Thal Mutation vs. α+-Thal Trait |
---|---|---|---|---|---|---|
Age | 29.8 ± 6.8 | 29.3 ± 6.1 | 0.531 | 30.2 ± 6.6 | 0.525 | 0.235 |
Female/male | 121/108 | 69/55 | 0.616 | 121/84 | 0.252 | 0.615 |
Hb (g/dL) | 13.7 ± 1.7 | 13.0 ± 1.7 | 0.002 | 12.3 ± 1.4 | <0.001 | <0.001 |
Hct (%) | 40.4 ± 5.1 | 39.7 ± 5.1 | 0.206 | 39.3 ± 4.8 | 0.046 | 0.571 |
MCV (fL) | 86.7 ± 4.3 | 81.6 ± 4.2 | <0.001 | 67.2 ± 4.2 | <0.001 | <0.001 |
MCH (pg) | 29.4 ± 1.8 | 26.9 ± 1.7 | <0.001 | 21.1 ± 1.6 | <0.001 | <0.001 |
MCHC (g/dL) | 34.0 ± 2.8 | 32.9 ± 1.1 | <0.001 | 31.4 ± 1.1 | <0.001 | <0.001 |
RDW (%) | 12.9 ± 0.9 | 13.5 ± 1.0 | <0.001 | 16.7 ± 2.0 | <0.001 | <0.001 |
RBC count (×10⁶/mm³) | 4.7 ± 0.6 | 4.9 ± 0.6 | 0.002 | 5.9 ± 1.3 | <0.001 | <0.001 |
Red Blood Cell Parameters | AUC (95% CI) | Cutoff | Sn (%) | Sp (%) |
---|---|---|---|---|
Hb (g/dL) | 0.600 (0.538–0.663) | 12.15 | 36.3 | 80.8 |
Hct (%) | 0.539 (0.476–0.602) | 44.95 | 83.9 | 25.8 |
MCV (fL) | 0.801 (0.753–0.850) | 83.95 | 70.2 | 77.7 |
MCH (pg) | 0.857 (0.816–0.899) | 28.95 | 78.1 | 70.4 |
MCHC (g/dL) | 0.767 (0.714–0.820) | 33.3 | 75.8 | 70.7 |
RBC count (×10⁶/mm³) | 0.599 (0.539–0.660) | 4.50 | 72.6 | 45.4 |
RDW (%) | 0.692 (0.636–0.749) | 12.95 | 78.2 | 54.6 |
Combined parameters | 0.868 (0.830–0.906) | 0.316 | 80.1 | 75.1 |
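How a single-parameter cutoff such as those in the table above yields a sensitivity (Sn) and specificity (Sp) can be sketched as follows. This is an illustrative sketch only: the MCH values and labels are made-up toy data, not study data, and the direction of the rule (values at or below the cutoff are flagged as positive) is an assumption, since the table does not state it. Only the cutoff of 28.95 pg is taken from the MCH row.

```python
def sensitivity_specificity(values, labels, cutoff, positive_if_below=True):
    """Apply a threshold rule and return (sensitivity, specificity).

    labels: 1 for alpha+-thalassemia trait, 0 for normal.
    positive_if_below: assumed direction of the rule (not stated in the source).
    """
    tp = fn = tn = fp = 0
    for value, label in zip(values, labels):
        predicted = (value <= cutoff) if positive_if_below else (value >= cutoff)
        if label == 1 and predicted:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("Both classes must be present to compute Sn and Sp.")
    return tp / (tp + fn), tn / (tn + fp)


# Toy MCH values in pg (invented for illustration, NOT from the study).
toy_mch = [26.1, 27.5, 29.3, 28.0, 30.2, 29.8, 28.5, 31.0]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

sn, sp = sensitivity_specificity(toy_mch, toy_labels, cutoff=28.95)
print(f"Sn = {sn:.2%}, Sp = {sp:.2%}")
```

For parameters that tend to be higher in carriers, such as RDW or RBC count in the group comparison table, the same helper could be called with `positive_if_below=False`; that choice is again an assumption rather than something the source specifies.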
Method | ACC (%) | Sn (%) | Sp (%) | MCC | AUC |
---|---|---|---|---|---|
DL (DeepThal) * | 80.69 | 79.87 | 81.23 | 0.604 | 0.851 |
SVM | 80.68 | 74.40 | 84.94 | 0.598 | 0.855 |
MLP | 79.72 | 74.77 | 83.27 | 0.581 | 0.854 |
RF | 79.41 | 75.73 | 82.16 | 0.577 | 0.857 |
PLS | 79.39 | 70.31 | 85.48 | 0.567 | 0.846 |
LR | 79.07 | 68.44 | 85.87 | 0.556 | 0.854 |
ET | 78.15 | 73.81 | 81.16 | 0.549 | 0.831 |
LGBM | 78.11 | 74.35 | 80.96 | 0.549 | 0.842 |
XGB | 77.79 | 76.94 | 78.28 | 0.547 | 0.840 |
DT | 72.33 | 70.63 | 73.16 | 0.433 | 0.719 |
KNNs | 69.43 | 62.58 | 74.09 | 0.367 | 0.683 |
Method | ACC (%) | Sn (%) | Sp (%) | MCC | AUC |
---|---|---|---|---|---|
DL (DeepThal) * | 80.77 | 70.59 | 88.64 | 0.608 | 0.856 |
RF | 78.21 | 70.59 | 84.09 | 0.554 | 0.838 |
LR | 76.92 | 58.82 | 90.91 | 0.534 | 0.860 |
PLS | 76.92 | 58.82 | 90.91 | 0.534 | 0.865 |
SVM | 75.64 | 58.82 | 88.64 | 0.504 | 0.862 |
ET | 75.64 | 64.71 | 84.09 | 0.501 | 0.849 |
LGBM | 75.64 | 61.76 | 86.36 | 0.502 | 0.824 |
XGB | 74.36 | 58.82 | 86.36 | 0.475 | 0.815 |
MLP | 73.08 | 58.82 | 84.09 | 0.447 | 0.866 |
DT | 69.23 | 47.06 | 86.36 | 0.368 | 0.667 |
KNNs | 67.95 | 52.94 | 79.55 | 0.339 | 0.662 |
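The ACC, Sn, Sp and MCC columns in the tables above follow the standard confusion-matrix definitions, which can be sketched as below. The counts used in the example (TP = 24, FN = 10, TN = 39, FP = 5) are not reported in the source; they are a hypothetical split, back-calculated here as an assumption because they reproduce the percentages in the DeepThal row of the table directly above. AUC is not included because it depends on the ranked prediction scores, not on a single confusion matrix.

```python
import math


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute accuracy, sensitivity, specificity and MCC from raw counts."""
    total = tp + fn + tn + fp
    acc = (tp + tn) / total
    sn = tp / (tp + fn)  # sensitivity: true positive rate
    sp = tn / (tn + fp)  # specificity: true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when a marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "Sn": sn, "Sp": sp, "MCC": mcc}


# Hypothetical counts (assumption, not taken from the paper).
m = binary_metrics(tp=24, fn=10, tn=39, fp=5)
print(
    f"ACC = {m['ACC'] * 100:.2f}%, Sn = {m['Sn'] * 100:.2f}%, "
    f"Sp = {m['Sp'] * 100:.2f}%, MCC = {m['MCC']:.3f}"
)
```

Because MCC uses all four cells of the confusion matrix, it is less affected by class imbalance than accuracy alone, which is one reason it is often reported alongside ACC, Sn and Sp.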
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Phirom, K.; Charoenkwan, P.; Shoombuatong, W.; Charoenkwan, P.; Sirichotiyakul, S.; Tongsong, T. DeepThal: A Deep Learning-Based Framework for the Large-Scale Prediction of the α+-Thalassemia Trait Using Red Blood Cell Parameters. J. Clin. Med. 2022, 11, 6305. https://doi.org/10.3390/jcm11216305