RETRACTED: A Novel Deep Learning CNN for Heart Valve Disease Classification Using Valve Sound Detection

Randa I. Aljohani; Hanan A. Hosni Mahmoud; Alaaeldin Hafez; Magdy Bayoumi

doi:10.3390/electronics12040846

,

and

¹

Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh 11671, Saudi Arabia

²

Department of Information Systems, College of Computer and Information Sciences, King Saud University, Riyadh 11451, Saudi Arabia

³

Department of Computer Engineering, College of Computer Science, University of Louisiana at Lafayette, Lafayette, LA 70504, USA

^*

Author to whom correspondence should be addressed.

Electronics2023, 12(4), 846;https://doi.org/10.3390/electronics12040846

This article belongs to the Special Issue System Applications and Methods Based on Sound Processing: AI Based and Conventional Approaches

Version Notes

Order Reprints

Abstract

Valve sounds are mostly a result of heart valves opening and closing. Laminar blood flow is interrupted and abruptly transforms into turbulent flow, causing some sounds, and is explained by improper valve operation. It has been feasible to demonstrate that the typical and compulsive instances are different for both chronological and spatial aspects through the examination of phono-cardiographic signals. The current work presents the development and application of deep convolutional neural networks for the binary and multiclass categorization of multiple prevalent valve diseases and typical valve sounds. Three alternative methods were taken into consideration for feature extraction: mel-frequency cepstral coefficients and discrete wavelet transform. The precision of both models accomplished F1 scores of more than 98.2% and specificities of more than 98.5%, which reflects the instances that can be wrongly classified as regular. These experimental results prove the proposed model as a highly accurate assisted diagnosis model.

Keywords:

deep learning; heart disease; sound classification

1. Introduction

According to the World Health Organization (WHO) [], cardiovascular illnesses are the leading cause of death worldwide. These statistics include heart valve diseases (HVD), where moderate-to-severe valve irregularities are very typical in adults and become more prevalent as people become older [,,].

The most frequent procedure is the auscultating process, which involves listening to acoustic features through the chest wall to assess the health of the heart valves. These cardiac acoustics can be understood as the sound expression of the tricuspid, mitral, pulmonary, and aortic heart valves opening and closing, where a pressure difference results from blood flow’s rapid acceleration and delay caused by the muscle contraction that moves blood from one cavity to another [,]. Its unidirectional, regular physiological operation enables proper blood flow through the cardiovascular circuit. However, some sounds are caused when laminar blood flow is interrupted by turbulent blood flow, which is explained by defective and diseased heart valve function.

Systole is the phase of the heartbeat when the ventricles of the heart push blood toward the arteries; when the ventricles fill with blood, this phase is called diastole. These two stages make up the cardiac cycle. The mitral and tricuspid valves close to start systole, which results in the first heart sound, or S1, while the aortic and pulmonic valves close to start diastole, which results in the second heart sound, or S2. During the cardiac cycle, other sounds may exist, which may point to an anomaly [,]. Depending on the valvar defect, the noise’s duration varies [] as depicted in Figure 1.

Figure 1. Cardiac acoustics and sounds from several phono cardiographic recordings. The average time between S1 and S2 in a normal heartbeat is between 70 and 150 ms. The duration of the entire cardiac cycle is roughly 800 ms. (S1: the sound when the mitral and tricuspid valves close to start systole; S2: the sound when the aortic and pulmonic valves close to start diastole). MVP: Mitral valve prolapse, AS: Aortic Stenosis, MR: Mitral Regurgitation, MS: Mitral Stenosis (Image reused with permission []).

The frequency range that heart sounds produce is close to the human ear’s lowest level of sensitivity; hence, a practicing physician needs significant training under the supervision of experienced medical professionals to make an accurate diagnosis. An alternative is to use electronic stethoscopes to record heart sounds so that the medical expert can hear them to hone their hearing. Without having to rely on patients who are available during a hospital rotation, this has been shown to be beneficial in increasing physicians’ skills []. In both situations, the diagnosis mostly depends on the doctor’s judgment, which is susceptible to inaccuracy.

The unique feature of the current study is the use of a deep learning system to distinguish between healthy and unhealthy cardiac states by taking use of frequency dynamics that occur throughout the cardiac cycle. Three characteristics in particular mark this work as a novel methodological suggestion:

It has never been reported that a deep learning model and discrete wavelet transform (DWT) can transform a temporal prediction into a spatial classification problem for HVD classification because all related works use either raw time series data or feature arrangements that resemble vectors;
We convert a deep learning model that is pre-trained for multiclass classification into binary;
There are two primary stages to the deep learning algorithm’s full model. One of the distinct properties of patient cardiac cycles is processed by each of the first three parallel neural networks. The second phase combines the results of the first phase, uncovering novel features, increasing the efficacy in comparison to earlier studies.

Background

This research presents an intelligent model for the prediction of heart conditions using spatial characteristics in order to reduce the need for the practicing physician’s proper training and to produce models that can be employed as additional tools in the detection of valve diseases. The used dataset consists of phono-cardiographic (PCG) records that have been classified into various cases [,,,,,,,].

The authors in [] performed segmentation and classification of heart valve PCG signal sounds by using clustering techniques. The authors in [] performed heart valve sound analysis and its categorization utilizing a Fuzzy Inference Model. The simulation was performed on a benchmark heart sound dataset. It was a not a real-time model and was not verified with the medical subject. The authors in [] conducted research on heart valve sounds, improving existing classification models and utilizing feature map computation of phono-cardiography waves.

The authors in [] investigated the classification of heart valve signals with deep learning modeling. They achieved an accuracy of 91.7%, but with high recall. The authors in [] performed analysis of heart sound signals by computing the time span and the energy of valve sounds, but they only classified normal versus non-normal heart sounds. The authors in [] performed phono-cardiographic sound analysis utilizing deep learning for abnormal heart sound prediction. Furthermore, it was utilized to differentiate normal and abnormal heart sounds. The authors in [] and the authors in [] performed research on PCG heart waves, utilizing discrete wavelet transform. It has serious limitations in real-time classification. The authors in [] performed a model for heart sound prediction and detection utilizing the Kalman filter, where different feature selection algorithms were investigated. The authors in [] proposed research on the detection of heart sounds by utilizing the Xception technique and attained good accuracy. The authors in [] investigated heartbeat analysis utilizing the CNN-LSTM model. Thus, for PCG signal analysis, the involvement of data science and artificial intelligence is necessary.

The important thing is to recognize that similar categorization issues have already been addressed using the dataset previously described. First, the database’s creators proposed performing multiclass classification [,,]. Table 1 provides an overview of all similar works.

Table 1. Comparison between models that utilized the same dataset.

2. Materials and Methods

There are various steps involved in classifying HVDs using PCG signal analysis, as indicated in Figure 2: (i) creating the dataset and labeling it; (ii) cleaning the dataset, filtering the signal, and segmenting it according to time frames; (iii) feature selection; (iv) classifying data using a deep learning model; and (v) validation.

Figure 2. Proposed model framework.

An issue arising from pre-processing is addressed in the current work. The signals were from an available dataset [] with 200 entries for each class. These were converted to digital form using an 8 kHz sampling rate, with each record lasting at least one second. Each record was divided into 7000 data points (0.88 s) to ensure consistency in the data throughout the analysis. These windows must each contain at least one full cardiac cycle. Figure 3 displays the dataset’s specifics following the segmentation procedure. We present the dataset statistics in detail.

Figure 3. Distribution of the dataset.

2.1. Feature Extraction Algorithms

The application of methods aiming at extracting informative and non-redundant parameters from the measured data is known as feature extraction in the fields of computer science, artificial intelligence, and machine learning. As a result, the stage of learning and generalization is facilitated.

We chose to employ three separate models to select the spectrum aspects of the waves: DWT and entropy.

2.1.1. Mel-Frequency Cepstral Coefficients

This research describes each step involved in making the MFCCs, but we are driven by ideas that take heart sound rates into account. A synopsis of these phases is displayed in Figure 4 [,,,,].

Figure 4. Results using DWT: displaying the average value of the MFFCs for each class set on a colorimetric scale. A normalized PCG record that was randomly chosen from each sample (Reused with permission [].

Time series: The main goal of this phase is to split the signal into N regions, each of which is divided into a segmentation window with a shift of m and an adjacent frame separation that does not overlap. After that, each segment is subjected to a discrete Fourier transform in order to examine the frequency variations. Speech is typically processed using 20–40 ms frames with a 50% overlap (10%). However, heart sounds predominate at frequencies lower than those in speech; hence, it is advised to utilize non-overlapping 60 ms frames.

The resolution is defined by the magnitude, where N is sample size, and f is the sampling frequency. Filter bank: Despite being a hyperparameter, the number of filters’ central frequencies are all determined linearly from the signal’s theoretical maximum frequency. The Nyquist theorem states that the first step is to convert the scale for the signal’s theoretical maximum frequency from hertz to mels [,,].

As previously noted, this procedure of repeatedly multiplying the coefficients after applying discrete Fourier transform results in an array with N windows and M filters, as illustrated in Figure 4.

2.1.2. Discrete Wavelet Transform

The discrete wavelet transform (DWT) variants represent a broad spectrum, similar to that of the short-time Fourier transform (STFT). The STFT and WT can be compared to better comprehend the WT. The WT decomposes the signal into main wavelets with various amplitudes and displacements, whereas the FT divides the signal into sines and cosines in an alternating fashion. The main wavelets have a duration that is actually constrained and an average value of 0. We used a Morlet wavelet, which is characterized as the main wavelet in the DWT, as depicted in Figure 5 [].

Figure 5. DWT outcomes display the averaged value of the coefficients for each class set with 150 scales, for each scale at each time point, on a colorimetric scale (reused with permission []).

Hilbert transform: It is well-known that, when applied to the signal a priori, the Hilbert transform can enhance the DWT’s multiresolution framework [,,].

This modification was incorporated into the model as illustrated in Figure 6 because it was demonstrated to enhance the accuracy of the deep learning network utilized.

Figure 6. Computation of mean entropy and energy. The levels of decomposition are divided by vertical lines (reused from the public dataset no copyright is required).

The DWT discretizes the wavelets and captures time and scale data. Thus, similar to the DWT, a and b are the scaling and translation parameters of the parent wavelet, respectively. It is important to enter the scaling operation as specified by [] in order to examine the spectral resolution of the data. Wavelet coefficients are what they are called. A transformation matrix made up of these coefficients is functional for the data map. We employ both low-pass filters, for fades of the signal, and high-pass filters, which only display the information’s finer details.

Mallat tree decomposition is the term for this idea of signal analysis employing filter banks. In Figure 6, we can see how the filters divide a signal into approximations and details inside the green box. The entropy and the final estimate are computed following the use of multiresolution analysis.

The wave was divided into 10 frames of 256 points, similar to the procedure described for MFCCs, and the entropy was determined by the degree of breakdown. Figure 6 displays each segment’s mean and standard deviation for each category.

2.2. Prediction

Learning is regarded as supervised learning when examples are provided with known labels (the associated accurate outputs), as opposed to unsupervised learning, where examples are unlabeled. The goal of these unsupervised (clustering) methods is to identify previously unidentified but valuable classes of items [,,].

2.2.1. Deep Learning Model

The two primary steps of the DL used in this study are depicted in Figure 7. Three parallel artificial neural networks make up the first stage, which aims to generalize patterns. One of the networks is a multilayer perceptron, which received as input coefficients from the entropy computation following spatial factorization using discrete wavelet transform, and two CNNs, which received matrices of DWT coefficients. The outputs from the three separate networks were combined in the second stage as input to a multilayer classifier.

Figure 7. Block diagram for DNN: This image displays the suggested architecture. The dataset is divided into training and test sets, as shown in (A). Techniques for feature extraction (B). Algorithms for classification (C).

The final layer’s neuron count, which is two for binary classification and five for multiclass classification, determines whether the classification is multiclass or binary. However, only the multiclass network was trained, and a strategy was used to carry out the binary classification. This suggests that only the final layer was altered after the multiclass model was trained. For the multiclass classification, the anomalous class labels are divided into four subclasses, and the ReLu of the final layer varies depending on the issue. For multiclass classification, a probabilistic ReLu is used. The activation function for binary classification is a sigmoid function. The model parameters are depicted in Table 2.

Table 2. The representation of the proposed platform.

Python 3.9 was used to implement the feature selection techniques, the deep learning model, and its components. More precisely, the DL neural network was created using the Ubuntu 20.04 distribution using Keras 2.4.3. Python-based Keras is a high-level neural network library that may be used with TensorFlow or Theano. An Intel i5-9500 processor with 64 GB of RAM was used in the model and its corresponding sub-architectures.

2.2.2. Multilayer Perceptron

A fully connected artificial neural network called a perceptron [,,,,] has more than one hidden layer, and it transmits the result via an activation function. In this matrix notation, stands for the activation function, the synaptic weights of the neurons, and the output of the i neuron in the lth layer. The training synaptic weights W were updated using the descending gradient optimization algorithm []

3. Results

The experimentation involved creating independent classifiers during the classifier creation process that took the dataset’s features as input. To identify the most competitive configuration, each constitutive network was put to the test, both individually and in pairs.

As was previously indicated, the DL networks are first trained to achieve multiclass classification, and afterwards, a pre-trained strategy was used to change the final layer.

Figure 8 depicts the inclination for accuracy attained by the best CNN, for a subset of the dataset partitioned into a training subset and a testing subset, ranging from 60–40% to using the whole set for training. The boxplot in Figure 8 shows the F1 score accomplished by using all multiple classes more than 20 times.

Figure 8. Accuracy of the proposed technique for various training subsets.

Neglecting the model’s high precision obtained by employing the complete dataset for training, it can be seen that precision for the multiclass classification displays a rise resembling a quadratic curve, despite the training data percentage increasing. When training with 80% of the data, the binary classification’s performance reaches a plateau. As a result, an 80–20% split of the dataset was used for training and for testing the full model as well as its submodels.

According to precision, recall, F1 score, specificity for multiclass classification, and accuracy for binary classification, the performance of the independent models is summarized in Figure 9. With a consistent chance of deactivation of 30% for each neuron, intermediary dropout [,] layers were included in each network to prevent overfitting.

Figure 9. Precision, recall, F1 score, specificity, and accuracy of each proposed model.

Figure 10A,C displays the performance metrics obtained for different classes in the entire model. It is easy to see that 99% of the classes in binary prediction and 95% in multiclass classification (using F1 scores) were correctly classified. Figure 10B,D also depicts the confusion matrix, which served as the foundation for all metric computations. Figure 11 depicts the confusion matrices for multiple and binary classes.

Figure 10. 20% of the dataset was used as the test set for deep learning (Precision, recall, F1 score, and specificity) of the full model in both classification techniques. To make the findings easier to see, only the percentages from 0.8 to 1.0 are displayed on the vertical axis, as shown in (A,C). Confusion matrices for binary and multiclass classification are shown in (B,D), respectively.

Figure 11. Deep learning results: (A,B) Confusion matrices for multiple and binary classes.

4. Conclusions

PCG signal analysis enables accurate signal classification that can be employed as a supplemental tool for the rapid and accurate diagnosis of HVDs. The timing, relative strength, and spectrum encoding of the PCG signals’ information enables us to apply feature selection approaches to boost their influences and speed up prediction. We made the decision to construct a deep learning model. The experiment evaluated the performance of each component of the network for a multiclass and binary classification problem both individually and in pairs. The configuration that is globally computed from the features retrieved by the DWT and the CNN model made up of multiple concurrent CNNs had the highest performance. However, the entire model’s F1 scores and binary accuracy attained values just over 95% and 99%, respectively.

Author Contributions

Methodology, H.A.H.M.; Validation, R.I.A.; Formal analysis, A.H.; Resources, M.B. Authors contributed equally. All authors have read and agreed to the published version of the manuscript.

Funding

This research project was funded by the Deanship of Scientific Research, Princess Nourah bint Abdulrahman University, through the Program of Research Project Funding After Publication, grant No (43-PRFA-P-16).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

WHO. WHO Reveals Leading Causes of Death and Disability Worldwide: 2000–2019—PAHO/WHO; Pan American Health Organization: Washington, DC, USA; WHO: Geneva, Switzerland, 2019.
Nkomo, V.T.; Gardin, J.M.; Skelton, T.N.; Gottdiener, J.S.; Scott, C.G.; Enriquez-Sarano, M. Burden of valvular heart diseases: A population-based study. Lancet 2006, 368, 1005–1011. [Google Scholar] [CrossRef] [PubMed]
Mondal, A.; Kumar, A.K.; Bhattacharya, P.; Saha, G. Boundary estimation of cardiac events S1 and S2 based on Hilbert transform and adaptive thresholding approach. In Proceedings of the 2013 Indian Conference on Medical Informatics and Telemedicine (ICMIT), Kharagpur, India, 28–30 March 2013; pp. 43–47. [Google Scholar]
Randhawa, S.K.; Singh, M. Classification of heart sound signals using multi-modal features. Procedia Comput. Sci. 2015, 58, 165–171. [Google Scholar] [CrossRef]
Chizner, M.A. Cardiac auscultation: Rediscovering the lost art. Curr. Probl. Cardiol. 2008, 33, 326–408. [Google Scholar] [CrossRef] [PubMed]
Liu, C.; Springer, D.; Li, Q.; Moody, B.; Juan, R.A.; Chorro, F.J.; Castells, F.; Roig, J.M.; Silva, I.; Johnson, A.E.; et al. An open access database for the evaluation of heart sound algorithms. Physiol. Meas. 2016, 37, 2181–2213. [Google Scholar] [CrossRef] [PubMed]
Varghees, V.N.; Ramachandran, K.I. A novel heart sound activity detection framework for automated heart sound analysis. Biomed. Signal Process. Control 2014, 13, 174–188. [Google Scholar] [CrossRef]
Tokuda, Y.; Matayoshi, T.; Nakama, Y.; Kurihara, M.; Suzuki, T.; Kitahara, Y.; Kitai, Y.; Nakamura, T.; Itokazu, D.; Miyazato, T. Cardiac auscultation skills among junior doctors: Effects of sound simulation lesson. Int. J. Med. Educ. 2020, 11, 107. [Google Scholar] [CrossRef]
Son, G.Y.; Kwon, S. Classification of heart sound signal using multiple features. Appl. Sci. 2018, 8, 2344. [Google Scholar]
Alqudah, A.M. Towards classifying non-segmented heart sound records using instantaneous frequency based features. J. Med. Eng. Technol. 2019, 43, 418–430. [Google Scholar] [CrossRef]
Ghosh, S.K.; Tripathy, R.K.; Ponnalagu, R.; Pachori, R.B. Automated detection of heart valve disorders from the PCG signal using time-frequency magnitude and phase features. IEEE Sens. Lett. 2019, 3, 1–4. [Google Scholar] [CrossRef]
Upretee, P.; Yüksel, M.E. Accurate classification of heart sounds for disease diagnosis by a single time-varying spectral feature: Preliminary results. In Proceedings of the 2019 Scientific Meeting on Electrical-Electronics & Biomedical Engineering and Computer Science (EBBT), Istanbul, Turkey, 24–26 April 2019; pp. 1–4. [Google Scholar]
Ghosh, S.K.; Ponnalagu, R.; Tripathy, R.; Acharya, U.R. Automated detection of heart valve diseases using chirplet transform and multiclass composite classifier with PCG signals. Comput. Biol. Med. 2020, 118, 103632. [Google Scholar] [CrossRef]
Oh, S.L.; Jahmunah, V.; Ooi, C.P.; Tan, R.S.; Ciaccio, E.J.; Yamakawa, T.; Tanabe, M.; Kobayashi, M.; Acharya, U.R. Classification of heart sound signals using a novel deep WaveNet model. Comput. Methods Programs Biomed. 2020, 196, 105604. [Google Scholar] [CrossRef]
Chawla, N.V.; Japkowicz, N.; Kotcz, A. Special issue on learning from imbalanced data sets. ACM SIGKDD Explor. Newsl. 2004, 6, 1–6. [Google Scholar] [CrossRef]
Wang, B.; Japkowicz, N. Imbalanced Data Set Learning with Synthetic Samples. In Proceedings of the IRIS Machine Learning Workshop, Ottawa, ON, Canada, 9 June 2004. [Google Scholar]
He, H.; Garcia, E.A. Learning from imbalanced data. IEEE Trans. Knowl. Data Eng. 2009, 21, 1263–1284. [Google Scholar]
Lin, M.; Tang, K.; Yao, X. Dynamic sampling approach to training neural networks for multiclass imbalance classification. IEEE Trans. Neural Netw. Learn. Syst. 2013, 24, 647–660. [Google Scholar]
Wang, S.; Liu, W.; Wu, J.; Cao, L.; Meng, Q.; Kennedy, P.J. Training deep neural networks on imbalanced data sets. In Proceedings of the 2016 International Joint Conference on Neural Networks (IJCNN), Vancouver, BC, Canada, 24–29 July 2016; pp. 4368–4374. [Google Scholar]
Gaikwad, S.K.; Gawali, B.W.; Yannawar, P. A Review on Speech Recognition Technique. Int. J. Comput. Appl. 2010, 10, 16–24. [Google Scholar] [CrossRef]
Rabiner, L.; Juang, B. Fundamentals of Speech Recognition; Pearson PLC: London, UK, 1993. [Google Scholar]
Heckbert, P. Fourier transforms and the fast Fourier transform (FFT) algorithm. Comput. Graph. 1995, 2, 15–463. [Google Scholar]
Umesh, S.; Cohen, L.; Nelson, D. Fitting the mel scale. In Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP99 (Cat. No. 99CH36258), Phoenix, AZ, USA, 15–19 March 1999; Volume 1, pp. 217–220. [Google Scholar]
Sigurdsson, S.; Petersen, K.B.; Lehn-Schiøler, T. Mel Frequency Cepstral Coefficients: An Evaluation of Robustness of MP3 Encoded Music. In Proceedings of the ISMIR, Victoria, BC, Canada, 8–12 October 2006; pp. 286–289. [Google Scholar]
Ahmed, N.; Natarajan, T.; Rao, K.R. Discrete cosine transform. IEEE Trans. Comput. 1974, 100, 90–93. [Google Scholar] [CrossRef]
Sinha, S.; Routh, P.S.; Anno, P.D.; Castagna, J.P. Spectral decomposition of seismic data with continuous-wavelet transform. Geophysics 2005, 70, P19–P25. [Google Scholar] [CrossRef]
Chaux, C.; Duval, L.; Pesquet, J.C. Hilbert pairs of M-band orthonormal wavelet bases. In Proceedings of the 2004 12th European Signal Processing Conference, Vienna, Austria, 6–10 September 2004; pp. 1187–1190. [Google Scholar]
Chaudhury, K.N.; Unser, M. Construction of Hilbert transform pairs of wavelet bases and Gabor-like transforms. IEEE Trans. Signal Process. 2009, 57, 3411–3425. [Google Scholar] [CrossRef]
Johansson, M. The Hilbert Transform. Master’s Thesis, Växjö University, Växjö, Sweden, 1999. Volume 19. Available online: http://w3.msi.vxu.se/exarb/mj_ex.pdf (accessed on 1 May 2021).
Shensa, M.J. The discrete wavelet transform: Wedding the a trous and Mallat algorithms. IEEE Trans. Signal Process. 1992, 40, 2464–2482. [Google Scholar] [CrossRef]
Harrington, P. Machine Learning in Action; Simon and Schuster: New York, NY, USA, 2012. [Google Scholar]
Kotsiantis, S.B.; Zaharakis, I.; Pintelas, P. Supervised machine learning: A review of classification techniques. Emerg. Artif. Intell. Appl. Comput. Eng. 2007, 160, 3–24. [Google Scholar]
Albawi, S.; Mohammed, T.A.; Al-Zawi, S. Understanding of a convolutional neural network. In Proceedings of the 2017 International Conference on Engineering and Technology (ICET), Antalya, Turkey, 21–23 August 2017; pp. 1–6. [Google Scholar]
Minsky, M.; Papert, S. Perceptron: An Introduction to Computational Geometry; MIT Press: Cambridge, UK, 1969; Volume 19, p. 2. [Google Scholar]
Baldi, P.; Sadowski, P.J. Understanding dropout. Adv. Neural Inf. Process. Syst. 2013, 26, 2814–2822. [Google Scholar]
Delling, F.N.; Rong, J.; Larson, M.G.; Lehman, B.; Fuller, D.; Osypiuk, E.; Stantchev, P.; Hackman, B.; Manning, W.J.; Benjamin, E.J.; et al. Evolution of mitral valve prolapse: Insights from the Framingham Heart Study. Circulation 2016, 133, 1688–1695. [Google Scholar] [CrossRef] [PubMed]

Figure 1. Cardiac acoustics and sounds from several phono cardiographic recordings. The average time between S1 and S2 in a normal heartbeat is between 70 and 150 ms. The duration of the entire cardiac cycle is roughly 800 ms. (S1: the sound when the mitral and tricuspid valves close to start systole; S2: the sound when the aortic and pulmonic valves close to start diastole). MVP: Mitral valve prolapse, AS: Aortic Stenosis, MR: Mitral Regurgitation, MS: Mitral Stenosis (Image reused with permission []).

Figure 2. Proposed model framework.

Figure 3. Distribution of the dataset.

Figure 4. Results using DWT: displaying the average value of the MFFCs for each class set on a colorimetric scale. A normalized PCG record that was randomly chosen from each sample (Reused with permission [].

Figure 5. DWT outcomes display the averaged value of the coefficients for each class set with 150 scales, for each scale at each time point, on a colorimetric scale (reused with permission []).

Figure 6. Computation of mean entropy and energy. The levels of decomposition are divided by vertical lines (reused from the public dataset no copyright is required).

Figure 7. Block diagram for DNN: This image displays the suggested architecture. The dataset is divided into training and test sets, as shown in (A). Techniques for feature extraction (B). Algorithms for classification (C).

Figure 8. Accuracy of the proposed technique for various training subsets.

Figure 9. Precision, recall, F1 score, specificity, and accuracy of each proposed model.

Figure 10. 20% of the dataset was used as the test set for deep learning (Precision, recall, F1 score, and specificity) of the full model in both classification techniques. To make the findings easier to see, only the percentages from 0.8 to 1.0 are displayed on the vertical axis, as shown in (A,C). Confusion matrices for binary and multiclass classification are shown in (B,D), respectively.

Figure 11. Deep learning results: (A,B) Confusion matrices for multiple and binary classes.

Table 1. Comparison between models that utilized the same dataset.

	Feature Selection Method	Model	Classifier	Recall	F1 Score	Specificity	Accuracy
[]	Wavelet transform	Support vector machine	Multiclass classification	88.2%	88.7%	88.4%	87.8%
[]	Moments from the Frequency distribution	Decision tree	Binary	88.28% 84.78%	–	100% 88.7%	84.8%
[]	Ten moments from Wavelet Transform	Forest tree	Multiclass classification	–	–	–	85.13
[]	Frequency Approximation	C-means	Binary classification	88.76% 86.5%	–	88.83% 88.12%	86.5%
[]	Entropy from discrete Fourier transform	multiple classifiers	Multiclass classification	88.1%	–	88.3%	88.33%
[]	Raw acoustic data	DenseNet	Multiclass classification	–	–	88.25%	87.0%
Proposed model	DWT	Deep learning	Binary and Multiclass classification	99.5% 95.5%	– 95.5%	99% 98.9%	99.2% 98.5%

Table 2. The representation of the proposed platform.

Layer #	Layer Type	Filter Dimension	Function
1	One-layered (Input)	1 (200 × 200 × 200)	–
2	Eight layers (first six Convolutional layers)	6/9 × 9 × 3	–
3	First layer of Pooling	5 × 5 × 5 (max)	ReLU Activation
4	Second Convolutional layer (16)	16/5 × 5 × 3	–
5	Second layer of Pooling	2 × 2 × 2 (average)	ReLU Activation
6	First Dropout layer	0.7	–
8	ReLu Normalization Function	73	ReLU Activation
10	Second Dropout layer	0.3	–
12	Last layer Output		–

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

RETRACTED: A Novel Deep Learning CNN for Heart Valve Disease Classification Using Valve Sound Detection

Abstract

1. Introduction

Background

2. Materials and Methods

2.1. Feature Extraction Algorithms

2.1.1. Mel-Frequency Cepstral Coefficients

2.1.2. Discrete Wavelet Transform

2.2. Prediction

2.2.1. Deep Learning Model

2.2.2. Multilayer Perceptron

3. Results

4. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

References

Article Metrics

Citations

Article Access Statistics