Obstructive Sleep Apnea (OSA) and COVID-19: Mortality Prediction of COVID-19-Infected Patients with OSA Using Machine Learning Approaches
Abstract
1. Introduction
2. Literature Review
3. Methods and Materials
3.1. Dataset Collection
3.2. Data Statistical Analysis
3.2.1. Inclusion Exclusion Criteria for the Dataset
3.2.2. Pre-Processing the Dataset
3.2.3. Developing Correlation Matrix
3.2.4. Balancing the Dataset
3.2.5. Distribution of Data
3.3. Machine Learning Algorithms
3.3.1. Random Forest Classifier
3.3.2. Decision Tree Classifier
3.3.3. Support Vector Machine
3.3.4. Gradient Descent Classifier
3.3.5. Logistic Regression
3.3.6. K-Nearest Neighbor
3.3.7. Extreme Gradient Boosting
3.3.8. AdaBoost
3.3.9. Light Gradient Boosting Machine
3.3.10. Naive Bayes
3.3.11. Artificial Neural Network
3.4. Parameter Optimization and Cross-Validation
3.5. Evaluation Metrics
4. Result and Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Programs and Packages | Application | Version |
---|---|---|
Python | Programming language for encoding and decoding. | Python 3.6 |
NumPy | Creation of array objects and applying functions of linear algebra. | numpy 1.22.4 |
Pandas | Importing and analyzing the dataset. | pandas 1.4.2 |
Scikit Learn | Machine learning prediction modeling; developing correlation between variables; analyzing performance metrics of the models. | scikit-learn 1.1.1 |
Seaborn | Visualizing the dataset and other statistics. | seaborn 0.11.2 |
Algorithm | Parameters Used |
---|---|
Logistic Regression | C = 0.001, random_state = 0 |
KNeighborsClassifier | n_neighbors = 13, metric = 'minkowski', p = 1, weights = 'uniform' |
SVC | kernel = 'rbf', probability = True, C = 0.1, gamma = 0.01, random_state = 0 |
GaussianNB | var_smoothing = 0.012328467394420659 |
DecisionTreeClassifier | criterion = 'gini', max_depth = 5, max_leaf_nodes = 11, min_samples_split = 3 |
RandomForestClassifier | criterion = 'gini', max_depth = 7, max_features = 'sqrt', n_estimators = 8 |
XGBClassifier | colsample_bytree = 0.7, max_depth = 15, n_estimators = 2, reg_alpha = 1.1, reg_lambda = 1.1, subsample = 0.7 |
AdaBoostClassifier | base_estimator = DecisionTreeClassifier(max_depth = 2, max_leaf_nodes = 5), learning_rate = 0.01, n_estimators = 100 |
lgb.LGBMClassifier | colsample_bytree = 0.7, max_depth = 15, min_split_gain = 0.4, n_estimators = 400, num_leaves = 50, reg_lambda = 1.1, subsample = 0.7, subsample_freq = 20 |
GradientBoostingClassifier | criterion = 'friedman_mse', learning_rate = 0.05, loss = 'deviance', max_depth = 3, max_features = 'log2', min_samples_leaf = 0.1, min_samples_split = 0.5, n_estimators = 10, subsample = 0.618 |
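As an illustration of how the tabulated settings map onto code, the following minimal sketch instantiates a subset of the scikit-learn estimators with the listed parameters and scores them with 5-fold cross-validation. The synthetic data from `make_classification`, the helper names `build_models` and `cross_validate_models`, and the choice of five folds are assumptions made for this example, not details taken from the study. The boosting models are left out because some of their listed arguments (`loss = 'deviance'`, `base_estimator`) were renamed in scikit-learn releases after 1.1.1.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def build_models():
    """Return estimators configured with the parameters from the table."""
    return {
        "Logistic Regression": LogisticRegression(C=0.001, random_state=0),
        "K-NN": KNeighborsClassifier(
            n_neighbors=13, metric="minkowski", p=1, weights="uniform"
        ),
        "SVM": SVC(kernel="rbf", probability=True, C=0.1, gamma=0.01, random_state=0),
        "Naive Bayes": GaussianNB(var_smoothing=0.012328467394420659),
        "Decision Tree": DecisionTreeClassifier(
            criterion="gini", max_depth=5, max_leaf_nodes=11, min_samples_split=3
        ),
        "Random Forest": RandomForestClassifier(
            criterion="gini", max_depth=7, max_features="sqrt", n_estimators=8
        ),
    }


def cross_validate_models(models, X, y, folds=5):
    """Mean cross-validated accuracy per model (fold count is an assumption)."""
    return {
        name: cross_val_score(model, X, y, cv=folds, scoring="accuracy").mean()
        for name, model in models.items()
    }


# Synthetic stand-in data; the clinical dataset is not reproduced here.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
cv_scores = cross_validate_models(build_models(), X, y)

for name, score in cv_scores.items():
    print(f"{name}: mean CV accuracy = {score:.3f}")
```

Scores obtained on the synthetic data say nothing about the clinical dataset; the sketch only shows how the configurations could be wired together.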
True Class | Predicted Class 0 | Predicted Class 1 |
---|---|---|
0 | True Negative (TN) | False Positive (FP) |
1 | False Negative (FN) | True Positive (TP) |
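The four confusion matrix cells are sufficient to derive the usual evaluation metrics. The sketch below uses the standard definitions of accuracy, precision, recall, and F1 score; the function name `classification_metrics` and the example counts are hypothetical and are not results from the study.

```python
def classification_metrics(tn, fp, fn, tp):
    """Compute standard binary metrics from confusion matrix counts."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts chosen only to illustrate the arithmetic.
example = classification_metrics(tn=48, fp=2, fn=1, tp=49)
for metric, value in example.items():
    print(f"{metric}: {value:.4f}")
```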
WHO Progression Scale Used in the Paper | Explanation |
---|---|
WHO-Cat_3 | Mild disease and infection. Ambulatory care is needed. |
WHO-Cat_4 | Patient is moderately diseased and infected. Patient may require hospitalization. |
WHO_Cat_5 | Patient is severely infected. Patient may require the intensive care unit (ICU) and have a severe mortality risk (death). |
Attributes | Frequency Distribution Before Balancing Dataset | Frequency Distribution After Balancing Dataset |
---|---|---|
Gender | Male = 66.9%; Female = 33.1% | Male = 74.5%; Female = 25.5% |
Admission In Hospital | Yes = 65.4%; No = 34.6% | Yes = 67.7%; No = 32.3% |
Treatment OSAS | Yes = 82.4%; No = 17.6% | Yes = 78.6%; No = 21.4% |
Smoking | Yes = 4.4%; No = 95.6% | Yes = 3.1%; No = 96.9% |
Smoking Entire Life | Ever = 52.2%; Never = 46.3%; Present = 1.5% | Ever = 51.6%; Never = 57.4%; Present = 1% |
DM | No = 66.9%; Yes = 33.1% | No = 73.4%; Yes = 26.6% |
CVD | Yes = 66.9%; No = 30.1% | Yes = 62.5%; No = 37.5% |
COPD Asthma | No = 66.2%; Yes = 33.8% | No = 72.4%; Yes = 27.6% |
Chronic Kidney Disease | No = 84.6%; Yes = 15.4% | No = 89.1%; Yes = 10.9% |
Immunosuppression | No = 94.1%; Yes = 5.9% | No = 96.4%; Yes = 3.6% |
Active Malignancy | No = 94.1%; Yes = 5.9% | No = 93.8%; Yes = 6.2% |
Who_Cat_4 | 2 = 38.2%; 1 = 30.1%; 4 = 27.9%; 3 = 3.7% | 4 = 49%; 2 = 27.1%; 1 = 21.4%; 3 = 2.6% |
Who_Cat_5 | 3 = 38.2%; 1 = 30.1%; 5 = 27.9%; 4 = 3.7% | 5 = 49%; 3 = 27.1%; 1 = 21.4%; 4 = 2.6% |
Who_Cat_3 | 2 = 38.2%; 3 = 31.6%; 1 = 30.1% | 3 = 57.6%; 2 = 27.1%; 1 = 21.4% |
Death | No = 70.6%; Yes = 29.4% | No = 50%; Yes = 50% |
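The shift in the Death attribute from 70.6%/29.4% to 50%/50% can be traced with simple arithmetic. The sketch below assumes that balancing was done by adding synthetic minority-class records until both classes were equal in size, and it takes the sample size of 136 from the degrees of freedom (df = 135) reported in the one-sample test table. The helper name `balance_by_oversampling` is hypothetical.

```python
def balance_by_oversampling(n_total, class_shares):
    """Estimate class counts before balancing and records added per class.

    Assumes every class is oversampled up to the size of the largest class.
    """
    counts = {label: round(n_total * share) for label, share in class_shares.items()}
    majority = max(counts.values())
    added = {label: majority - count for label, count in counts.items()}
    balanced_total = majority * len(counts)
    return counts, added, balanced_total


# n = 136 follows from df = 135; the shares are the "before balancing" Death values.
counts, added, balanced_total = balance_by_oversampling(
    136, {"No": 0.706, "Yes": 0.294}
)
print(counts)          # class counts before balancing
print(added)           # synthetic records needed per class
print(balanced_total)  # dataset size after balancing
```

Under these assumptions, the balanced dataset would hold an equal number of records for each outcome, which matches the 50%/50% split in the table.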
One-Sample Test (Test Value = 0) | t | df | Sig. (2-Tailed) | Mean Difference | 95% Confidence Interval of the Difference: Lower | 95% Confidence Interval of the Difference: Upper |
---|---|---|---|---|---|---|
Age | 53.722 | 135 | 0.0001 | 65.82353 | 63.4003 | 68.2467 |
Hospital Admission Days | 7.004 | 135 | 0.0001 | 5.93529 | 4.2594 | 7.6111 |
ICU Admission Days | 2.579 | 135 | 0.011 | 1.71324 | 0.3995 | 3.0270 |
Intubation Days | 2.740 | 135 | 0.007 | 1.39706 | 0.3887 | 2.4054 |
AHI | 14.885 | 135 | 0.0001 | 27.11103 | 23.5089 | 30.7131 |
LSAT | 120.979 | 135 | 0.0001 | 82.32206 | 80.9763 | 83.6678 |
ODI | 15.197 | 135 | 0.0001 | 27.88897 | 24.2597 | 31.5183 |
RDI | 14.622 | 135 | 0.0001 | 33.31853 | 28.8122 | 37.8249 |
BMI | 57.142 | 135 | 0.0001 | 32.02709 | 30.9186 | 33.1355 |
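The columns of the one-sample test are linked by two relations: the standard error equals the mean difference divided by t, and the confidence limits equal the mean difference plus or minus the critical t value times that standard error. The sketch below reproduces two rows as a consistency check. The critical value 1.9777 for df = 135 is an approximation supplied for this example, and the function name `summarize_one_sample` is hypothetical.

```python
def summarize_one_sample(mean_diff, t_stat, t_crit):
    """Recover the standard error and confidence limits from reported values."""
    standard_error = mean_diff / t_stat
    half_width = t_crit * standard_error
    return standard_error, mean_diff - half_width, mean_diff + half_width


# Approximate two-sided 95% critical value of Student's t for df = 135 (assumed).
T_CRIT_DF135 = 1.9777

age_se, age_lower, age_upper = summarize_one_sample(65.82353, 53.722, T_CRIT_DF135)
bmi_se, bmi_lower, bmi_upper = summarize_one_sample(32.02709, 57.142, T_CRIT_DF135)

print(f"Age: SE = {age_se:.4f}, 95% CI = ({age_lower:.4f}, {age_upper:.4f})")
print(f"BMI: SE = {bmi_se:.4f}, 95% CI = ({bmi_lower:.4f}, {bmi_upper:.4f})")
```

Both intervals agree with the tabulated limits to within rounding, which suggests the reported t values, mean differences, and confidence limits are mutually consistent.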
Algorithms | Accuracy | Precision | Recall | F1 Score | AUC |
---|---|---|---|---|---|
Logistic Regression | 0.9867 | 0.9882 | 0.9866 | 0.9865 | 1.0000 |
K-NN | 0.9933 | 0.9938 | 0.9938 | 0.9933 | 1.0000 |
SVM | 0.9867 | 0.9875 | 0.9875 | 0.9867 | 1.0000 |
Naive Bayes | 0.9867 | 0.9889 | 0.9857 | 0.9864 | 0.9875 |
Decision Tree | 0.9867 | 0.9826 | 0.9795 | 0.9731 | 0.9804 |
Random Forest | 0.9999 | 0.9999 | 0.9999 | 0.9999 | 0.9999 |
XGBoost | 0.9800 | 0.9819 | 0.9804 | 0.9799 | 1.0000 |
AdaBoost | 0.9867 | 0.9764 | 0.9732 | 0.9731 | 0.9938 |
LightGBM | 0.9800 | 0.9819 | 0.9804 | 0.9799 | 1.0000 |
Gradient Boosting | 0.9933 | 0.9819 | 0.9804 | 0.9799 | 1.0000 |
ANN | 0.9899 | 1.0000 | 0.9899 | 0.9899 | 1.0000 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Tasmi, S.T.; Raihan, M.M.S.; Shams, A.B. Obstructive Sleep Apnea (OSA) and COVID-19: Mortality Prediction of COVID-19-Infected Patients with OSA Using Machine Learning Approaches. COVID 2022, 2, 877-894. https://doi.org/10.3390/covid2070064
Tasmi ST, Raihan MMS, Shams AB. Obstructive Sleep Apnea (OSA) and COVID-19: Mortality Prediction of COVID-19-Infected Patients with OSA Using Machine Learning Approaches. COVID. 2022; 2(7):877-894. https://doi.org/10.3390/covid2070064
Chicago/Turabian Style: Tasmi, Sidratul Tanzila, Md. Mohsin Sarker Raihan, and Abdullah Bin Shams. 2022. "Obstructive Sleep Apnea (OSA) and COVID-19: Mortality Prediction of COVID-19-Infected Patients with OSA Using Machine Learning Approaches" COVID 2, no. 7: 877-894. https://doi.org/10.3390/covid2070064