Gene-Based Predictive Modelling for Enhanced Detection of Systemic Lupus Erythematosus Using CNN-Based DL Algorithm
Abstract
1. Introduction
2. Materials and Methods
2.1. Dataset
2.2. Feature Selection
Algorithm 1. Feature selection using LASSO regression.
Inputs:
- Vector of observed target values.
- Regularization parameter for LASSO regression.
- Number of folds for cross-validation.
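Algorithm 1 keeps the genes whose LASSO coefficients remain non-zero once the regularization strength has been chosen by k-fold cross-validation. Below is a minimal scikit-learn sketch of that idea; the names (X, y, k), the use of LassoCV and StandardScaler, and the synthetic data are illustrative assumptions rather than the authors' exact procedure.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

def lasso_select(X, y, k=5):
    """Return indices of features with non-zero LASSO coefficients, plus the chosen alpha."""
    X_scaled = StandardScaler().fit_transform(X)            # LASSO is sensitive to feature scale
    model = LassoCV(cv=k, random_state=0).fit(X_scaled, y)  # regularization chosen by k-fold CV
    selected = np.flatnonzero(model.coef_)                  # features the penalty did not zero out
    return selected, model.alpha_

# Synthetic example: 200 samples x 1,000 genes, signal in the first 5 genes.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1000))
y = X[:, :5].sum(axis=1) + 0.1 * rng.normal(size=200)
idx, alpha = lasso_select(X, y, k=5)
print(f"{len(idx)} features kept, alpha = {alpha:.4f}")
```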
2.3. Stacked Deep Learning Classifier (SDLC)
2.4. Attention-Based CNN Model (ACNN)
2.5. Stacked Bi-LSTM Architecture (SBLSTM)
2.6. Meta Classifiers
3. Results
3.1. Data Partitioning
3.2. Performance Evaluation Metrics
3.3. Performance of ACNN
3.4. Performance of Stacked Bi-LSTM
3.5. Performance of Voting Classifier
4. Discussion
Ablation Study
Strengths observed in the ablation experiments:
- Robust feature selection trains the model on the most informative features, increasing predictive power.
- Preprocessing and data integration improve the quality and consistency of the input data.
- Hyperparameter tuning improves the model's efficiency.
- The stacked deep learning classifier combines the strengths of several classifiers, increasing accuracy and resilience.
- A balanced dataset prevents bias towards any class, supporting high accuracy and reliability.
- The attention mechanism helps the model focus on the most relevant features, improving prediction accuracy.

Limitations:
- Performance depends heavily on both the amount and the quality of the input data.
- Training and inference require substantial computational resources.
- Deep learning models are harder to interpret than simpler models.
- The choice of hyperparameters can affect performance.
- Predictions made from missing or partial data can be unreliable.
- High noise levels in the input data may degrade performance.
- Class imbalance can skew the model towards the majority class.
- Datasets with feature distributions that differ from the training data may cause problems.
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
GEO Accession | SLE | Healthy | Total |
---|---|---|---|
GSE138458 | 307 | 23 | 330 |
GSE154851 | 38 | 32 | 70 |
GSE50635 | 33 | 16 | 49 |
GSE61635 | 99 | 30 | 129 |
GSE99967 | 38 | 17 | 55 |
GSE185047 | 87 | 0 | 87 |
GSE110685 | 36 | 17 | 53 |
GSE112087 | 62 | 58 | 120 |
GSE72509 | 99 | 18 | 117 |
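The per-series counts above can be tallied to check the overall case/control balance before the expression matrices themselves are merged. The short sketch below only re-encodes the table; how the actual GEO expression data were downloaded and integrated is not shown here and would require additional steps.

```python
import pandas as pd

# Per-series case/control counts copied from the table above.
cohorts = pd.DataFrame(
    [("GSE138458", 307, 23), ("GSE154851", 38, 32), ("GSE50635", 33, 16),
     ("GSE61635", 99, 30), ("GSE99967", 38, 17), ("GSE185047", 87, 0),
     ("GSE110685", 36, 17), ("GSE112087", 62, 58), ("GSE72509", 99, 18)],
    columns=["accession", "sle", "healthy"],
)
cohorts["total"] = cohorts["sle"] + cohorts["healthy"]
overall = cohorts[["sle", "healthy", "total"]].sum()
print(cohorts.to_string(index=False))
print(f"Overall: {overall['sle']} SLE vs {overall['healthy']} healthy "
      f"(imbalance ratio {overall['sle'] / overall['healthy']:.2f}:1)")
```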
Layer (Type) | Output Shape | Param # |
---|---|---|
input_1 | [(None, 32, 32, 3)] | 0 |
conv2d | (None, 32, 32, 32) | 896 |
max_pooling2d | (None, 16, 16, 32) | 0 |
conv2d_1 | (None, 16, 16, 64) | 18,496 |
max_pooling2d_1 | (None, 8, 8, 64) | 0 |
conv2d_2 | (None, 8, 8, 128) | 73,856 |
max_pooling2d_2 | (None, 4, 4, 128) | 0 |
dot_product_attention | (None, 4, 4, 128) | 0 |
conv2d_3 | (None, 4, 4, 256) | 295,168 |
max_pooling2d_3 | (None, 2, 2, 256) | 0 |
dot_product_attention_1 | (None, 2, 2, 256) | 0 |
conv2d_4 | (None, 2, 2, 512) | 1,180,160 |
max_pooling2d_4 | (None, 1, 1, 512) | 0 |
global_average_pooling2d | (None, 512) | 0 |
concatenate | (None, 1280) | 0 |
dense | (None, 512) | 655,872 |
dense_1 | (None, 2) | 1,026 |
Total params: | | 2,225,474 |
Trainable params: | | 2,225,474 |
Non-trainable params: | | 0 |
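As a rough guide to how the layer summary above translates into code, here is a minimal Keras sketch of the attention-based CNN. The scaled dot-product attention helper, the dropout placement, and the choice of tensors feeding the concatenation (the summary reports a 1,280-unit fused vector, which is not fully recoverable from the layer list alone) are assumptions; the filter counts, pooling, and dense head follow the table.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def dot_product_attention(x):
    """Scaled dot-product self-attention over spatial positions, shape-preserving."""
    _, h, w, c = x.shape
    seq = layers.Reshape((h * w, c))(x)                  # (batch, positions, channels)
    att = layers.Attention(use_scale=True)([seq, seq])   # query = value = feature map
    return layers.Reshape((h, w, c))(att)

inputs = layers.Input(shape=(32, 32, 3))
x = inputs
skips = []
for filters in [32, 64, 128, 256, 512]:
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.MaxPooling2D(2)(x)
    if filters in (128, 256):                            # attention after the 3rd and 4th blocks
        x = dot_product_attention(x)
        skips.append(layers.GlobalAveragePooling2D()(x))

x = layers.GlobalAveragePooling2D()(x)
x = layers.Concatenate()(skips + [x])                    # fused multi-scale descriptor
x = layers.Dropout(0.4)(x)                               # dropout rate from the hyperparameter table
x = layers.Dense(512, activation="relu")(x)
outputs = layers.Dense(2, activation="softmax")(x)

acnn = models.Model(inputs, outputs)
acnn.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
             loss="sparse_categorical_crossentropy",
             metrics=["accuracy"])
acnn.summary()
```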
Hyperparameter | Attention-Based CNN | Stacked Bi-LSTM |
---|---|---|
Number of Convolutional Layers | 5 | - |
Number of Filters (Convolutional Layers) | [32, 64, 128, 256, 512] | - |
Filter Size (Convolutional Layers) | (3, 3) | - |
Pooling Size (Max Pooling Layers) | (2, 2) | - |
Dropout Rate | 0.4 | - |
Learning Rate | 0.001 | 0.01 |
Batch Size | 32 | 64 |
Number of Bi-LSTM Layers | - | 3 |
Number of Units (Bi-LSTM Layers) | - | [64, 128, 64] |
Dropout Rate (Bi-LSTM Layers) | - | 0.3 |
Recurrent Dropout Rate (Bi-LSTM Layers) | - | 0.2 |
Activation Function (Output Layer) | Softmax | Softmax |
Loss Function | Sparse Categorical Crossentropy | Sparse Categorical Crossentropy |
Optimizer | Adam | Adam |
Number of Epochs | 100 | 100 |
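The stacked Bi-LSTM branch can be sketched directly from the hyperparameters in the table above (three bidirectional layers with 64/128/64 units, dropout 0.3, recurrent dropout 0.2, softmax output, Adam at 0.01, sparse categorical cross-entropy). How the gene-expression vector is framed as a sequence (the n_timesteps × n_features shape below) is an assumption not specified here.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_sblstm(n_timesteps, n_features, n_classes=2):
    model = models.Sequential([
        layers.Input(shape=(n_timesteps, n_features)),
        layers.Bidirectional(layers.LSTM(64, return_sequences=True,
                                         dropout=0.3, recurrent_dropout=0.2)),
        layers.Bidirectional(layers.LSTM(128, return_sequences=True,
                                         dropout=0.3, recurrent_dropout=0.2)),
        layers.Bidirectional(layers.LSTM(64, dropout=0.3, recurrent_dropout=0.2)),
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

sblstm = build_sblstm(n_timesteps=64, n_features=16)     # illustrative sequence framing
sblstm.summary()
# Training, per the table: sblstm.fit(X_train, y_train, batch_size=64, epochs=100)
```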
Metric (ACNN) | Value |
---|---|
Accuracy | 0.95 |
Precision | 0.92 |
Recall | 0.96 |
F1 Score | 0.94 |
Specificity | 0.93 |
AUC-ROC Score | 0.97 |
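The metrics reported in the tables of this section can be computed as follows with scikit-learn; y_true and y_prob are placeholders for held-out labels and a model's positive-class probabilities, not the study's actual data.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, confusion_matrix)

def evaluate(y_true, y_prob, threshold=0.5):
    """Binary-classification metrics from labels and positive-class probabilities."""
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return {
        "Accuracy":    accuracy_score(y_true, y_pred),
        "Precision":   precision_score(y_true, y_pred),
        "Recall":      recall_score(y_true, y_pred),          # sensitivity
        "F1 Score":    f1_score(y_true, y_pred),
        "Specificity": tn / (tn + fp),                        # true-negative rate
        "AUC-ROC":     roc_auc_score(y_true, y_prob),
    }

# Toy usage with placeholder values.
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_prob = [0.1, 0.4, 0.8, 0.9, 0.65, 0.2, 0.7, 0.55]
print(evaluate(y_true, y_prob))
```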
Metric (Stacked Bi-LSTM) | Value |
---|---|
Accuracy | 0.92 |
Precision | 0.88 |
Recall | 0.94 |
F1 Score | 0.91 |
Specificity | 0.90 |
AUC-ROC Score | 0.94 |
Metric (Voting Classifier) | Value |
---|---|
Accuracy | 0.996 |
Precision | 0.992 |
Recall | 0.997 |
F1 Score | 0.994 |
Specificity | 0.99 |
AUC-ROC Score | 0.998 |
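A simple way to realize the voting meta classifier is to average the softmax outputs of the two base learners and take the arg-max, as sketched below. Equal weights are an assumption; the paper's meta classifier may combine the base predictions differently.

```python
import numpy as np

def soft_vote(prob_acnn, prob_sblstm, weights=(0.5, 0.5)):
    """prob_* : arrays of shape (n_samples, n_classes) from each branch's predict()."""
    w1, w2 = weights
    fused = w1 * np.asarray(prob_acnn) + w2 * np.asarray(prob_sblstm)
    return fused.argmax(axis=1), fused            # predicted labels and fused probabilities

# Usage once both branches are trained:
# labels, fused = soft_vote(acnn.predict(X_test), sblstm.predict(X_test))
```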
Fold | Model | Accuracy | Precision | Recall | F1 Score | Specificity | AUC-ROC |
---|---|---|---|---|---|---|---|
1 | ACNN | 0.95 | 0.92 | 0.94 | 0.93 | 0.93 | 0.95 |
1 | SBLSTM | 0.90 | 0.88 | 0.90 | 0.89 | 0.90 | 0.93 |
1 | Voting | 0.99 | 0.93 | 0.95 | 0.94 | 0.95 | 0.97 |
2 | ACNN | 0.96 | 0.91 | 0.94 | 0.92 | 0.93 | 0.95 |
2 | SBLSTM | 0.91 | 0.89 | 0.91 | 0.90 | 0.91 | 0.94 |
2 | Voting | 0.98 | 0.92 | 0.94 | 0.93 | 0.94 | 0.96 |
3 | ACNN | 0.94 | 0.90 | 0.93 | 0.91 | 0.92 | 0.94 |
3 | SBLSTM | 0.89 | 0.87 | 0.89 | 0.88 | 0.89 | 0.92 |
3 | Voting | 0.99 | 0.93 | 0.95 | 0.94 | 0.95 | 0.97 |
4 | ACNN | 0.95 | 0.92 | 0.94 | 0.93 | 0.93 | 0.95 |
4 | SBLSTM | 0.90 | 0.88 | 0.90 | 0.89 | 0.90 | 0.93 |
4 | Voting | 0.99 | 0.93 | 0.95 | 0.94 | 0.95 | 0.97 |
5 | ACNN | 0.96 | 0.91 | 0.94 | 0.92 | 0.93 | 0.95 |
5 | SBLSTM | 0.91 | 0.89 | 0.91 | 0.90 | 0.91 | 0.94 |
5 | Voting | 0.99 | 0.93 | 0.95 | 0.94 | 0.95 | 0.97 |
6 | ACNN | 0.95 | 0.92 | 0.94 | 0.93 | 0.93 | 0.95 |
6 | SBLSTM | 0.90 | 0.88 | 0.90 | 0.89 | 0.90 | 0.93 |
6 | Voting | 0.99 | 0.93 | 0.95 | 0.94 | 0.95 | 0.97 |
7 | ACNN | 0.96 | 0.91 | 0.94 | 0.92 | 0.93 | 0.95 |
7 | SBLSTM | 0.91 | 0.89 | 0.91 | 0.90 | 0.91 | 0.94 |
7 | Voting | 0.99 | 0.93 | 0.95 | 0.94 | 0.95 | 0.97 |
8 | ACNN | 0.94 | 0.90 | 0.93 | 0.91 | 0.92 | 0.94 |
8 | SBLSTM | 0.89 | 0.87 | 0.89 | 0.88 | 0.89 | 0.92 |
8 | Voting | 0.99 | 0.93 | 0.95 | 0.94 | 0.95 | 0.97 |
9 | ACNN | 0.95 | 0.92 | 0.94 | 0.93 | 0.93 | 0.95 |
9 | SBLSTM | 0.90 | 0.88 | 0.90 | 0.89 | 0.90 | 0.93 |
9 | Voting | 0.99 | 0.93 | 0.95 | 0.94 | 0.95 | 0.97 |
10 | ACNN | 0.95 | 0.92 | 0.94 | 0.93 | 0.93 | 0.95 |
10 | SBLSTM | 0.90 | 0.88 | 0.90 | 0.89 | 0.90 | 0.93 |
10 | Voting | 0.99 | 0.93 | 0.95 | 0.94 | 0.95 | 0.97 |
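The fold-wise results above come from 10-fold cross-validation; a generic evaluation loop of that kind is sketched below. Stratified splits, the fixed random seed, and the accuracy-only scoring are assumptions, and build_model stands in for whichever compiled branch (ACNN, SBLSTM) or ensemble is being assessed.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

def cross_validate(X, y, build_model, n_splits=10, epochs=100, batch_size=32):
    """Score a freshly built model on each stratified fold and report accuracy."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=42)
    fold_scores = []
    for fold, (train_idx, test_idx) in enumerate(skf.split(X, y), start=1):
        model = build_model()                             # fresh weights for every fold
        model.fit(X[train_idx], y[train_idx],
                  epochs=epochs, batch_size=batch_size, verbose=0)
        _, acc = model.evaluate(X[test_idx], y[test_idx], verbose=0)  # assumes metrics=["accuracy"]
        fold_scores.append(acc)
        print(f"Fold {fold}: accuracy = {acc:.3f}")
    print(f"Mean accuracy: {np.mean(fold_scores):.3f}")
    return fold_scores

# Usage: cross_validate(X, y, build_model=lambda: build_sblstm(64, 16))
```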
Experiment | Accuracy (%) | Precision (%) | Recall (%) |
---|---|---|---|
Full Model | 99.6 | 99.2 | 99.7 |
Without ACNN | 98.6 | 98.3 | 99.0 |
Without SBLSTM | 98.8 | 98.4 | 99.2 |
Without Attention Mechanism | 99.0 | 98.6 | 99.5 |
Without Max Pooling Layers | 99.3 | 98.9 | 99.6 |
Without Voting | 96.7 | 96.2 | 97.5 |