A Reliable Deep Learning Model for ECG Interpretation: Mitigating Overconfidence and Direct Uncertainty Quantification
Abstract
1. Introduction
2. Related Work
2.1. Deep Learning in ECG Interpretation
2.2. Uncertainty Estimation in ECG Classification
2.3. Dirichlet-Based Uncertainty Modeling in Deep Learning
3. Materials and Methods
3.1. Feature Extraction
3.2. Probability Prediction and Uncertainty Estimation
3.3. Optimization Objectives
3.4. Source of Data and Preprocessing
3.5. Baseline Models
- PCNN: The architecture is adapted from [42]. Multiple convolutional layers are applied sequentially to the ECG recordings, utilizing shared weights across the layers. The kernel size is progressively halved from the first to the last layer, so the stacked features resemble a 'pyramid'. Each convolutional layer is followed by an activation function, batch normalization, max pooling, and dropout. A fully connected layer and a softmax layer are used for prediction.
- SE-PCNN: We integrated an attention mechanism into the PCNN by calculating the weights across different channels in each convolutional layer, following the Squeeze-and-Excitation approach [43].
- RNN: A bidirectional GRU extracts features from the ECG samples. A batch normalization layer, a fully connected layer, and a softmax layer are then applied to the top hidden layer.
- AttRNN: The RNN baseline with an added attention mechanism [44].
- PCRNN: We modified the architecture presented in [45] by substituting the original convolutional structure with that of the PCNN and replacing the LSTM with the GRU used in the RNN baseline. The batch normalization, dense, and softmax layers are the same as in the RNN baseline.
- AttPCRNN: The PCRNN baseline with an added attention mechanism [44].
- MP-ResNet: The model is adapted from [12] and is composed of multiple convolutional layers, each followed by batch normalization, activation, and dropout layers. Every two convolutional layers are linked by a residual connection implemented via max pooling. The model concludes with a dense layer and a softmax layer. We also tried incorporating an attention module and an RNN module into this model, but neither produced a significant change in performance.
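Two of the mechanisms named in the list above, the halving kernel schedule of the PCNN and the channel reweighting of SE-PCNN, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the initial kernel size of 32, the five layers, and the toy weights are all assumptions chosen for illustration, and biases are omitted. The squeeze, excitation, and scale steps follow the general Squeeze-and-Excitation recipe of [43].

```python
import math


def pyramid_kernel_sizes(initial_kernel, num_layers):
    """Halve the kernel size from one layer to the next, never going below 1."""
    sizes = []
    kernel = initial_kernel
    for _ in range(num_layers):
        sizes.append(kernel)
        kernel = max(1, kernel // 2)
    return sizes


def squeeze_excite(feature_map, w_reduce, w_expand):
    """Reweight the channels of a (channels x time) feature map, SE style.

    feature_map: list of channels, each a list of floats over time.
    w_reduce:    (hidden x channels) weights of the bottleneck layer.
    w_expand:    (channels x hidden) weights of the expansion layer.
    """
    # Squeeze: global average pooling over time, one value per channel.
    squeezed = [sum(channel) / len(channel) for channel in feature_map]
    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w_reduce
    ]
    # Excitation, step 2: expansion layer followed by a sigmoid gate in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w_expand
    ]
    # Scale: multiply every time step of a channel by that channel's gate.
    scaled = [[g * x for x in channel] for g, channel in zip(gates, feature_map)]
    return scaled, gates


if __name__ == "__main__":
    # Hypothetical schedule: 5 layers starting from a kernel of 32 samples.
    print(pyramid_kernel_sizes(32, 5))

    # Toy feature map with 2 channels and 2 time steps (illustrative values).
    features = [[1.0, 3.0], [0.0, 0.0]]
    scaled, gates = squeeze_excite(features, [[1.0, 1.0]], [[1.0], [-1.0]])
    print(gates)
```

In a real network the two weight matrices would be learned jointly with the convolutional filters, and the gate would be recomputed for every input recording, which is what lets the model emphasize different channels for different ECG samples.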
3.6. Implementation Details
3.7. Metrics
4. Results
5. Discussion
5.1. Diagnostic Feature Requirements and Model Capabilities
5.2. Overconfidence Mitigation
5.3. Uncertainty Quantification for Dynamic Triage
5.4. Limitations
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Roth, G.A.; Mensah, G.A.; Johnson, C.O.; Addolorato, G.; Ammirati, E.; Baddour, L.M.; Barengo, N.C.; Beaton, A.Z.; Benjamin, E.J.; Benziger, C.P.; et al. Global burden of cardiovascular diseases and risk factors, 1990–2019: Update from the GBD 2019 study. J. Am. Coll. Cardiol. 2020, 76, 2982–3021.
- Gupta, V.; Mittal, M.; Mittal, V. Chaos theory and ARTFA: Emerging tools for interpreting ECG signals to diagnose cardiac arrhythmias. Wirel. Pers. Commun. 2021, 118, 3615–3646.
- Romero, I.; Serrano, L. ECG frequency domain features extraction: A new characteristic for arrhythmias classification. In Proceedings of the 2001 Conference Proceedings of the 23rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Istanbul, Turkey, 25–28 October 2001; IEEE: New York, NY, USA, 2001; Volume 2, pp. 2006–2008.
- Nasiri, J.A.; Naghibzadeh, M.; Yazdi, H.S.; Naghibzadeh, B. ECG arrhythmia classification with support vector machines and genetic algorithm. In Proceedings of the 2009 Third UKSim European Symposium on Computer Modeling and Simulation, Athens, Greece, 25–27 November 2009; IEEE: New York, NY, USA, 2009; pp. 187–192.
- Escalona-Morán, M.A.; Soriano, M.C.; Fischer, I.; Mirasso, C.R. Electrocardiogram classification using reservoir computing with logistic regression. IEEE J. Biomed. Health Inform. 2014, 19, 892–898.
- Guglin, M.E.; Thatai, D. Common errors in computer electrocardiogram interpretation. Int. J. Cardiol. 2006, 106, 232–237.
- Shah, A.P.; Rubin, S.A. Errors in the computerized electrocardiogram interpretation of cardiac rhythm. J. Electrocardiol. 2007, 40, 385–390.
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
- Luong, M.T.; Pham, H.; Manning, C.D. Effective approaches to attention-based neural machine translation. arXiv 2015, arXiv:1508.04025.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Hannun, A.Y.; Rajpurkar, P.; Haghpanahi, M.; Tison, G.H.; Bourn, C.; Turakhia, M.P.; Ng, A.Y. Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nat. Med. 2019, 25, 65–69.
- Li, Y.; Pang, Y.; Wang, K.; Li, X. Toward improving ECG biometric identification using cascaded convolutional neural networks. Neurocomputing 2020, 391, 83–95.
- Schwab, P.; Scebba, G.C.; Zhang, J.; Delai, M.; Karlen, W. Beat by beat: Classifying cardiac arrhythmias with recurrent neural networks. In Proceedings of the 2017 Computing in Cardiology (CinC), Rennes, France, 24–27 September 2017; IEEE: New York, NY, USA, 2017; pp. 1–4.
- Liu, M.; Kim, Y. Classification of heart diseases based on ECG signals using long short-term memory. In Proceedings of the 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Honolulu, HI, USA, 17–21 July 2018; IEEE: New York, NY, USA, 2018; pp. 2707–2710.
- Li, F.; Wu, J.; Jia, M.; Chen, Z.; Pu, Y. Automated heartbeat classification exploiting convolutional neural network with channel-wise attention. IEEE Access 2019, 7, 122955–122963.
- Yang, T.; Gregg, R.E.; Babaeizadeh, S. Detection of strict left bundle branch block by neural network and a method to test detection consistency. Physiol. Meas. 2020, 41, 025005.
- Rawi, A.A.; Elbashir, M.K.; Ahmed, A.M. Deep learning models for multilabel ECG abnormalities classification: A comparative study using TPE optimization. J. Intell. Syst. 2023, 32, 20230002.
- Sensoy, M.; Kaplan, L.; Kandemir, M. Evidential deep learning to quantify classification uncertainty. Adv. Neural Inf. Process. Syst. 2018, 31, 1–11.
- Van Amersfoort, J.; Smith, L.; Teh, Y.W.; Gal, Y. Uncertainty estimation using a single deep deterministic neural network. In Proceedings of the International Conference on Machine Learning, Virtual Event, 13–18 July 2020; PMLR: Warrington, UK, 2020; pp. 9690–9700.
- Han, Z.; Zhang, C.; Fu, H.; Zhou, J.T. Trusted multi-view classification with dynamic evidential fusion. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 2551–2566.
- Leibig, C.; Allken, V.; Ayhan, M.S.; Berens, P.; Wahl, S. Leveraging uncertainty information from deep neural networks for disease detection. Sci. Rep. 2017, 7, 1–14.
- Zhang, W.; Di, X.; Wei, G.; Geng, S.; Fu, Z.; Hong, S. A deep Bayesian neural network for cardiac arrhythmia classification with rejection from ECG recordings. arXiv 2022, arXiv:2203.00512.
- Li, H.; Lin, Z.; An, Z.; Zuo, S.; Zhu, W.; Zhang, Z.; Mu, Y.; Cao, L.; Garcia, J.D.P. Automatic electrocardiogram detection and classification using bidirectional long short-term memory network improved by Bayesian optimization. Biomed. Signal Process. Control. 2022, 73, 103424.
- Islam, M.F.; Zabeen, S.; Mehedi, M.H.K.; Iqbal, S.; Rasel, A.A. Monte carlo dropout for uncertainty analysis and ecg trace image classification. In Proceedings of the Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR), Montreal, QC, Canada, 26–27 August 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 173–182.
- Essa, E.; Xie, X. An ensemble of deep learning-based multi-model for ECG heartbeats arrhythmia classification. IEEE Access 2021, 9, 103452–103464.
- Alsayat, A.; Mahmoud, A.A.; Alanazi, S.; Mostafa, A.M.; Alshammari, N.; Alrowaily, M.A.; Shabana, H.; Ezz, M. Enhancing cardiac diagnostics: A deep learning ensemble approach for precise ECG image classification. J. Big Data 2025, 12, 7.
- Yu, B.; Liu, Y.; Wu, X.; Ren, J.; Zhao, Z. Trustworthy diagnosis of Electrocardiography signals based on out-of-distribution detection. PLoS ONE 2025, 20, e0317900.
- Wu, Q.; Li, H.; Li, L.; Yu, Z. Quantifying intrinsic uncertainty in classification via deep Dirichlet mixture networks. arXiv 2019, arXiv:1906.04450.
- Hobbhahn, M.; Kristiadi, A.; Hennig, P. Fast predictive uncertainty for classification with Bayesian deep networks. In Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence, Eindhoven, The Netherlands, 1–5 August 2022; PMLR: Warrington, UK, 2022; Volume 180, pp. 822–832.
- Şensoy, M.; Kaplan, L.M.; Julier, S.; Saleki, M.; Cerutti, F. Risk-aware classification via uncertainty quantification. Expert Syst. Appl. 2025, 265, 125906.
- Bishop, C.M.; Nasrabadi, N.M. Pattern Recognition and Machine Learning; Springer: Berlin/Heidelberg, Germany, 2006; Volume 4.
- Malinin, A.; Mlodozeniec, B.; Gales, M. Ensemble distribution distillation. arXiv 2019, arXiv:1905.00076.
- Malinin, A.; Gales, M. Predictive uncertainty estimation via prior networks. Adv. Neural Inf. Process. Syst. 2018, 31, 1–12.
- Taboga, M. Lectures on Probability Theory and Mathematical Statistics; Amazon Publishing: Seattle, WA, USA, 2017.
- Jøsang, A. Subjective Logic: A Formalism for Reasoning Under Uncertainty; Springer: Berlin/Heidelberg, Germany, 2018.
- Bao, W.; Yu, Q.; Kong, Y. Evidential deep learning for open set action recognition. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Virtual, 11–17 October 2021; pp. 13349–13358.
- Higgins, I.; Matthey, L.; Pal, A.; Burgess, C.P.; Glorot, X.; Botvinick, M.M.; Mohamed, S.; Lerchner, A. beta-vae: Learning basic visual concepts with a constrained variational framework. ICLR 2017, 3.
- Clifford, G.D.; Liu, C.; Moody, B.; Li-wei, H.L.; Silva, I.; Li, Q.; Johnson, A.; Mark, R.G. AF classification from a short single lead ECG recording: The PhysioNet/computing in cardiology challenge 2017. In Proceedings of the 2017 Computing in Cardiology (CinC), Rennes, France, 24–27 September 2017; IEEE: New York, NY, USA, 2017; pp. 1–4.
- Brandes, A.; Stavrakis, S.; Freedman, B.; Antoniou, S.; Boriani, G.; Camm, A.J.; Chow, C.K.; Ding, E.; Engdahl, J.; Gibson, M.M.; et al. Consumer-led screening for atrial fibrillation: Frontier review of the AF-SCREEN international collaboration. Circulation 2022, 146, 1461–1474.
- Crawford, M.H.; Bernstein, S.J.; Deedwania, P.C.; DiMarco, J.P.; Ferrick, K.J.; Garson, A., Jr.; Green, L.A.; Greene, H.L.; Silka, M.J.; Stone, P.H.; et al. ACC/AHA guidelines for ambulatory electrocardiography: Executive summary and recommendations: A report of the American College of Cardiology/American Heart Association task force on practice guidelines (committee to revise the guidelines for ambulatory electrocardiography) developed in collaboration with the north American society for pacing and electrophysiology. Circulation 1999, 100, 886–893.
- Jiang, M.; Lu, Y.; Li, Y.; Xiang, Y.; Zhang, J.; Wang, Z. Research on electrocardiogram classification using deep residual network with pyramid convolution structure. J. Biomed. Eng. 2020, 37, 692–698.
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
- Zhou, P.; Shi, W.; Tian, J.; Qi, Z.; Li, B.; Hao, H.; Xu, B. Attention-based bidirectional long short-term memory networks for relation classification. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Berlin, Germany, 7–12 August 2016; Volume 2, pp. 207–212.
- Zihlmann, M.; Perekrestenko, D.; Tschannen, M. Convolutional recurrent neural networks for electrocardiogram classification. In Proceedings of the 2017 Computing in Cardiology (CinC), Rennes, France, 24–27 September 2017; IEEE: New York, NY, USA, 2017; pp. 1–4.
- Goodfellow, I. Deep Learning; MIT Press: Cambridge, MA, USA, 2016.
- Karpathy, A.; Fei-Fei, L. Deep visual-semantic alignments for generating image descriptions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3128–3137.
- Zagoruyko, S. Wide residual networks. arXiv 2016, arXiv:1605.07146.
- Xie, S.; Girshick, R.; Dollár, P.; Tu, Z.; He, K. Aggregated residual transformations for deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1492–1500.
Type | # Recordings | Duration Mean (s) | Duration StDev (s) | Duration Max (s) | Duration Median (s) | Duration Min (s) |
---|---|---|---|---|---|---|
Normal | 5076 | 31.9 | 10.0 | 61.0 | 30.0 | 9.0 |
AF | 758 | 31.6 | 12.5 | 60.0 | 30.0 | 10.0 |
Other | 2557 | 34.1 | 11.8 | 60.9 | 30.0 | 9.1 |
Noisy | 279 | 27.1 | 9.0 | 60.0 | 30.0 | 10.2 |
Total | 8528 | 32.5 | 10.9 | 61.0 | 30.0 | 9.0 |

Model | ACC | F1 | PR-AUC | ROC-AUC | OC |
---|---|---|---|---|---|
PCNN | 79.42 | 71.97 | 76.86 | 89.72 | 17.26 |
SE-PCNN | 79.95 | 72.58 | 78.96 | 90.34 | 16.31 |
RNN | 76.11 | 70.60 | 72.21 | 85.77 | 21.93 |
AttRNN | 77.80 | 73.22 | 76.74 | 88.36 | 17.39 |
PCRNN | 82.79 | 79.13 | 83.87 | 90.82 | 16.37 |
AttPCRNN | 83.45 | 79.86 | 84.99 | 91.85 | 15.31 |
MP-ResNet | 85.53 | 82.36 | 85.28 | 92.08 | 12.27 |
Ours | 86.12 | 83.14 | 85.27 | 92.87 | 0.59 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Li, X.; Zheng, Q.; Zhang, S.; Fu, S.; Chen, Y.; Ye, K. A Reliable Deep Learning Model for ECG Interpretation: Mitigating Overconfidence and Direct Uncertainty Quantification. Symmetry 2025, 17, 794. https://doi.org/10.3390/sym17050794