Article

Comparison of Machine Learning Models in Nonlinear and Stochastic Signal Classification

by Elzbieta Olejarczyk 1,2,* and Carlo Massaroni 3,4
1 Nalecz Institute of Biocybernetics and Biomedical Engineering, Polish Academy of Sciences, 02-109 Warsaw, Poland
2 Faculty of Electrical Engineering, Automatics, Computer Science, and Biomedical Engineering, AGH University of Krakow, 30-059 Krakow, Poland
3 Departmental Faculty of Engineering, Università Campus Bio-Medico di Roma, 00128 Rome, Italy
4 Fondazione Policlinico Universitario Campus Bio-Medico di Roma, 00128 Rome, Italy
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(20), 11226; https://doi.org/10.3390/app152011226
Submission received: 23 September 2025 / Revised: 14 October 2025 / Accepted: 17 October 2025 / Published: 20 October 2025
(This article belongs to the Special Issue New Advances in Electrocardiogram (ECG) Signal Processing)

Featured Application

The algorithm, based on an optimized ensemble RUSBoosted Trees classifier and a set of statistical and nonlinear measures, may be helpful in single-channel wearable ECG devices to detect artifacts occurring in real-time ECG recordings.

Abstract

This study aims to compare different classifiers in the context of distinguishing two classes of signals: nonlinear electrocardiography (ECG) signals and stochastic artifacts occurring in ECG signals. The ECG signals from a single-lead wearable Movesense device were analyzed with a set of eight features: variance (VAR), three fractal dimension measures (Higuchi fractal dimension (HFD), Katz fractal dimension (KFD), and Detrended Fluctuation Analysis (DFA)), and four entropy measures (approximate entropy (ApEn), sample entropy (SampEn), and multiscale entropy (MSE) for scales 1 and 2). The minimum-redundancy maximum-relevance algorithm was applied for evaluation of feature importance. A broad spectrum of machine learning models was considered for classification. The proposed approach allowed for a comparison of classifiers and features, as well as providing broader insight into the characteristics of the signals themselves. The most important features for classification were VAR, DFA, ApEn, and HFD. The best performance among 34 classifiers was obtained using an optimized RUSBoosted Trees ensemble classifier (sensitivity, specificity, positive predictive value, and negative predictive value were 99.8%, 73.7%, 99.8%, and 74.3%, respectively). The classification accuracy on the Movesense recordings was very high (99.6%). Moreover, the multifractality of ECG during sleep was observed in the relationship between SampEn (or ApEn) and MSE.

1. Introduction

Electrocardiography (ECG) is a fundamental tool in cardiac diagnostics, but the interpretation of ECG signals remains a challenging task due to frequent contamination by artifacts such as baseline wander, motion artifacts, and muscle noise [1].
Traditional time- and frequency-domain methods (e.g., bandpass filters, Wiener and Kalman filters, or wavelet decompositions) have been widely used to suppress noise [2,3,4,5,6,7]. Moreover, adaptive and hybrid techniques, such as empirical mode decomposition (EMD), ensemble EMD, and empirical wavelet transform (EWT), have been proposed [8,9,10,11,12]. However, these linear techniques often fail when the noise overlaps in frequency with the signal or when the ECG exhibits nonlinear dynamics [2].
Given these challenges, nonlinear analysis methods have emerged as powerful alternatives for both noise reduction and feature extraction. Measures such as Higuchi fractal dimension (HFD), Katz fractal dimension (KFD), approximate entropy (ApEn), sample entropy (SampEn), and Detrended Fluctuation Analysis (DFA) allow for the quantification of complexity, irregularity, and self-similarity in ECG signals. Recent studies have demonstrated that such nonlinear measures are not only effective in distinguishing between normal and diseased cardiac dynamics, but also in identifying and filtering out artifacts in real-time applications. For instance, Sharanya and Arjunan (2023) used a combination of fractal dimension techniques (HFD and KFD) and entropy-based features to effectively differentiate diabetic patients with cardiac autonomic neuropathy from healthy controls using only short ECG segments [13]. Similarly, the work by Chen et al. (2024) introduced a fast sample entropy algorithm optimized for wearable ECG devices, showing that SampEn can be implemented in real time with reduced computational cost and high discriminative power for atrial fibrillation detection [14]. Additionally, Olejarczyk et al. (2024) used nonlinear and statistical features to identify noisy ECG segments recorded with Movesense wearable devices [15].
In parallel, deep learning, especially convolutional neural networks (CNNs) and transformer-based architectures, has revolutionized automated ECG classification by learning complex spatiotemporal patterns from raw or preprocessed signals. Hybrid models like DeepECG-Net have achieved high accuracy (~98.2%) while maintaining some robustness to noise [16]. Additionally, wavelet-transformed inputs combined with Swin Transformers have enhanced arrhythmia classification under artifact-heavy conditions [14].
Despite their superior performance, deep learning (DL) methods face several limitations in real-time, resource-constrained environments: (1) high computational cost: DL models often require GPUs and are not feasible on ultra-low-power microcontrollers without significant model compression; (2) lack of interpretability: DL models are largely black-boxes and require post hoc explainability tools (e.g., SHAP and saliency maps), which may not yield physiologically meaningful insights; (3) dependence on large datasets: DL requires thousands of labeled examples, which may not be available for rare cardiac conditions or personal adaptations; (4) sensitivity to artifacts: unless explicitly trained with corrupted data, deep models may misclassify noisy signals; (5) DL architectures and hyper-parameters often vary widely between studies, making reproducibility and regulatory approval more challenging.
By contrast, nonlinear features offer several advantages: (1) they are lightweight in terms of computation and memory footprint [14,17] and interpretable in terms of physiological dynamics [18,19]; (2) they can be computed in real time on edge devices without cloud support [14,17]; (3) they do not require large labeled datasets for training and can generalize better in small-sample settings [20]; (4) some features (e.g., SampEn and DFA) are robust to mild artifacts and nonstationarity [21]; (5) many nonlinear measures are well-established and validated across decades of the biomedical signal processing literature, with standardized parameter settings and performance benchmarks [13,15,18,19,22].
Thus, despite the growing dominance of deep learning, nonlinear features remain highly relevant, particularly in wearable applications where power efficiency, real-time processing, and explainability are critical. Deep learning and nonlinear analysis can be seen as complementary approaches, i.e., nonlinear features can act as input to lightweight classifiers or as preprocessing steps that enhance signal quality prior to deep learning.
Multiple comparative studies confirm the value of nonlinear features in ECG quality assessment and disease classification. For example, a study by Abdelrazik et al. (2025) demonstrated that nonlinear features, including HFD and KFD, can serve as effective inputs to lightweight machine learning models (e.g., random forest and SVM) for wearable arrhythmia detection [22]. Similarly, Ribeiro et al. (2024) used discrete wavelet transforms and nonlinear features for multi-class cardiovascular disease classification from the PTB ECG database, achieving high accuracy [23]. Noitz et al. (2024) showed that machine learning models can still extract meaningful features in ECG data corrupted by synthetic artifacts [24].
During the last ten years, a tendency to consider a large number of features and classifiers to find the optimal set for assessing the quality of the ECG signal has been observed. The Web of Science database contains 294 papers published from January 2014 to the present related to the classification of ECG artifacts. However, only a few of these studies used nonlinear measures, even though the ECG signal, unlike artifacts, is nonlinear. In most studies, the authors analyze heart rate variability (HRV), derived from the ECG signal, not the signal itself, which significantly limits the information contained therein. Previous studies have compared the performance of several varieties of two standard classifiers—random forest and support vector machine (SVM)—using as many as 27 linear and nonlinear features, including approximate entropy, permutation entropy, and Lempel–Ziv complexity [25]. Considering that the nonlinear features significantly improved the classification performance, another study compared three classifiers, standard SVM, least-squares SVM, and long short-term memory (LSTM), using a combination of six features: approximate entropy, sample entropy, fuzzy measure entropy, Hurst exponent, kurtosis, and power spectral density [26]. The results indicated that the performance of LSTM is higher than the performance of the other two SVM classifiers. Recently, some investigators reported that applying a set of as many as seventy-seven time- and frequency-domain features of HRV, including nonlinear measures, allows for the classification of different patient conditions such as normal sinus rhythm, sudden cardiac death, coronary artery disease, congestive heart failure, and atrial fibrillation [27].
This study aims to comprehensively compare machine learning models in terms of performance, scalability, interpretability, and utility in biomedical signal analysis, using ECG signals and the artifacts occurring in these signals. We performed an extensive comparative study of thirty-four classifiers using a set of the eight most promising nonlinear features, including measures of fractal dimension that have not previously been considered as indices of ECG signal quality—Higuchi fractal dimension and Katz fractal dimension—as well as multiscale entropy.
Importantly, much of the literature focuses on heart rate variability (HRV), derived from ECG, rather than the ECG signal itself [27,28,29,30,31,32]. This can limit the richness of information, especially when assessing waveform morphology changes such as ST segment shifts. In this study, we compared 34 classifiers using a concise set of nonlinear features applied directly to ECG signals rather than HRV. The importance of matching specific features and classifiers to context-specific goals was emphasized.

2. Materials and Methods

2.1. ECG Registration and Preprocessing

A prospective cohort study was performed on six healthy volunteers (aged 46 ± 17 years; three women and three men; all without musculoskeletal, cardiovascular, or respiratory diseases; all European) at the Università Campus Bio-Medico di Roma (UCBM). The recruitment was accomplished in adherence to the Declaration of Helsinki and after Ethics Committee approval from the UCBM institution (Prot. PAR 04.22 OSS) [33]. All volunteers provided written informed consent.
About 8 h of ECG signals during sleep were recorded using a Movesense ECG device. The sampling frequency was 256 Hz. ECG signals were filtered using a notch filter and a band-pass Butterworth filter from 0.5 Hz to 70 Hz. The signal was then normalized in 2 s windows to guarantee maximum R peak amplitude in the artifact-free ECG segments throughout the whole recording. All artifacts were manually annotated in EDFBrowser and exported to ASCII files for further analysis.
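The preprocessing described above can be sketched with standard Signal Processing Toolbox calls, as shown below. This is a minimal illustration rather than the exact processing code: the filter orders, the width of the stop band used here in place of the notch filter, the per-window normalization rule, and all variable names are assumptions.
% Minimal preprocessing sketch (assumed filter orders and variable names).
fs = 256;                                            % sampling frequency [Hz]
ecg = randn(8*3600*fs, 1);                           % placeholder for about 8 h of raw ECG
[bN, aN] = butter(2, [48 52]/(fs/2), 'stop');        % band-stop around 50 Hz mains (assumed)
[bB, aB] = butter(4, [0.5 70]/(fs/2), 'bandpass');   % 0.5-70 Hz band-pass, as in the text
ecgF = filtfilt(bB, aB, filtfilt(bN, aN, ecg));      % zero-phase filtering
win  = 2*fs;                                         % 2 s normalization windows
nWin = floor(numel(ecgF)/win);
for w = 1:nWin                                       % scale each window to unit peak amplitude
    idx = (w-1)*win + (1:win);
    ecgF(idx) = ecgF(idx) / max(abs(ecgF(idx)));
end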

2.2. ECG Analysis

The ECG analysis was performed using eight measures, including variance and the following seven nonlinear measures: Higuchi fractal dimension, Katz fractal dimension, Detrended Fluctuation Analysis, sample entropy, approximate entropy, and multiscale entropy for scales 1 and 2 (MSE1 and MSE2). All measures were calculated in 4 s windows. Outlier detection was then performed: of the 217,246 points, 4 points whose values deviated from the mean by much more than one standard deviation were rejected from further analysis.
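For illustration, the windowed feature extraction can be organized as in the sketch below. The function names refer to the illustrative implementations given in the following subsections, the filtered signal ecgF comes from the preprocessing sketch above, and the outlier-screening threshold is an arbitrary placeholder; none of these reflect the authors' actual code.
% Sketch of the 4 s windowed feature extraction (illustrative names and threshold).
fs   = 256;
wlen = 4*fs;                                 % 1024 samples per 4 s window
nWin = floor(numel(ecgF)/wlen);
feat = zeros(nWin, 8);                       % VAR, HFD, KFD, DFA, ApEn, SampEn, MSE1, MSE2
for w = 1:nWin
    x   = ecgF((w-1)*wlen + (1:wlen));
    mse = multiscale_entropy(x, 2, 2, 0.2);  % scales 1 and 2
    feat(w, :) = [var(x), higuchi_fd(x, 8), katz_fd(x), dfa_alpha(x, 1:10), ...
                  approx_entropy(x, 2, 0.2), sample_entropy(x, 2, 0.2), mse(1), mse(2)];
end
z    = (feat - mean(feat)) ./ std(feat);     % standardized feature values
keep = all(abs(z) < 10, 2);                  % outlier screening (placeholder threshold)
feat = feat(keep, :);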

2.2.1. Higuchi Fractal Dimension (HFD)

Higuchi fractal dimension (HFD) is a measure of signal complexity [34]. The more complex the curve (from line to white noise), the greater the HFD value. HFD is 1 for lines and 2 for white noise. From a given time series, X(i), i = 1,…, N, where N is the total number of samples, k new time series are constructed:
X_m^k:\quad X(m),\, X(m+k),\, X(m+2k),\, \ldots,\, X\!\left(m + \operatorname{int}\!\left(\frac{N-m}{k}\right) k\right)
where k is the time interval; m is the initial time in the range from 1 to k; and int(r) is the integer part of a real number r.
The length of each of the k time series is calculated as a normalized sum of the absolute value of difference between a pair of samples distant k starting from sample m:
L_m(k) = \frac{1}{k}\left[\left(\sum_{i=1}^{\operatorname{int}\left(\frac{N-m}{k}\right)} \left| X(m+ik) - X\big(m+(i-1)k\big) \right|\right) \frac{N-1}{\operatorname{int}\!\left(\frac{N-m}{k}\right) k}\right]
The length of curve L(k) is calculated as the average Lm(k) for m = 1,…, k.
HFD is calculated as the coefficient of linear regression, which relates L(k) to the inverse of the k parameter as follows:
\ln L(k) \sim \mathrm{HFD} \cdot \ln\!\left(\frac{1}{k}\right)
The k parameter is the only parameter of the HFD algorithm. In this study, an optimal k value was set to 8.
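A compact MATLAB sketch of the HFD computation, following the definitions above, is given below. The function name and the vectorized indexing are illustrative; only kmax (set to 8 in this study) is a real parameter of the method.
function hfd = higuchi_fd(x, kmax)
% Higuchi fractal dimension (sketch following the equations above; kmax = 8 in this study).
x = x(:);
N = numel(x);
L = zeros(kmax, 1);
for k = 1:kmax
    Lm = zeros(k, 1);
    for m = 1:k
        idx  = m:k:N;                           % X(m), X(m+k), X(m+2k), ...
        nSeg = numel(idx) - 1;                  % int((N-m)/k)
        Lm(m) = sum(abs(diff(x(idx)))) * (N - 1) / (nSeg * k) / k;
    end
    L(k) = mean(Lm);                            % average over m = 1, ..., k
end
p = polyfit(log(1./(1:kmax)'), log(L), 1);      % ln L(k) ~ HFD * ln(1/k)
hfd = p(1);
end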

2.2.2. Katz Fractal Dimension (KFD)

Katz fractal dimension (KFD) [35] is defined as outlined below:
\mathrm{KFD} = \frac{\log(L)}{\log(d)}
where L is a sum of Euclidean distances between successive points; and d is the diameter estimated as the maximum distance between the first point and any other sequence point.
To avoid the dependence of KFD on the measurement units, an average distance between successive points a is used as a general unit to normalize the distances L and d:
\mathrm{KFD} = \frac{\log(L/a)}{\log(d/a)}
In contrast to HFD, KFD does not require any parameter.
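The KFD computation can be sketched as follows. Treating the waveform as a planar curve with unit spacing on the time axis is an assumption of this sketch, and the function name is illustrative.
function kfd = katz_fd(x)
% Katz fractal dimension (sketch following the normalized formula above).
x = x(:);
n = numel(x) - 1;                                    % number of steps along the curve
L = sum(sqrt(1 + diff(x).^2));                       % total curve length (unit time step)
a = L / n;                                           % average distance between successive points
d = max(sqrt((1:n)'.^2 + (x(2:end) - x(1)).^2));     % diameter: farthest point from the first one
kfd = log(L/a) / log(d/a);
end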

2.2.3. Detrended Fluctuation Analysis (DFA)

Detrended Fluctuation Analysis (DFA) is a modification of root-mean-square analysis of random walks applied to nonstationary signals [36].
First, a cumulative sum of a given time series X(i) is calculated as follows:
y(k) = \sum_{i=1}^{k} \left[ X(i) - X_{\mathrm{mean}} \right]
where Xmean is the average value of the entire time series.
Next, the integrated time series y(k) is divided into segments of equal length n, and the least-squares line yn(k) is fitted to the data in each segment.
The root-mean-square fluctuation in an integrated and detrended time series F(n) in the function of window size n is given by the following:
F(n) = \sqrt{\frac{1}{N} \sum_{k=1}^{N} \left[ y(k) - y_n(k) \right]^2}
The scaling exponent α is defined as the slope of a least-squares regression line, which relates log(F(n)) to log(n) for n from 1 to 10.
\ln F(n) \sim \alpha \cdot \ln(n)
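A sketch of the DFA scaling exponent is given below. In this sketch, window sizes smaller than four samples are skipped, because a least-squares line fitted to one or two points leaves no fluctuation to measure; apart from this restriction the scale range follows the text, and the function name is illustrative.
function alpha = dfa_alpha(x, scales)
% Detrended Fluctuation Analysis scaling exponent (sketch following the equations above).
x = x(:);
y = cumsum(x - mean(x));                    % integrated, mean-removed series
scales = scales(scales >= 4);               % a linear detrend needs more than two points
F = zeros(numel(scales), 1);
for s = 1:numel(scales)
    n    = scales(s);
    nSeg = floor(numel(y)/n);
    res  = zeros(nSeg, 1);
    for j = 1:nSeg
        seg = y((j-1)*n + (1:n));
        t   = (1:n)';
        p   = polyfit(t, seg, 1);           % least-squares line y_n(k) in each segment
        res(j) = mean((seg - polyval(p, t)).^2);
    end
    F(s) = sqrt(mean(res));                 % root-mean-square fluctuation F(n)
end
p = polyfit(log(scales(:)), log(F), 1);     % ln F(n) ~ alpha * ln(n)
alpha = p(1);
end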

2.2.4. Approximate Entropy (ApEn)

Approximate entropy (ApEn) is a measure of the irregularity of the signal [37].
ApEn is defined as follows:
\mathrm{ApEn}(m, r, N) = \frac{1}{N-m+1} \sum_{i=1}^{N-m+1} \ln C_i^m(r) \;-\; \frac{1}{N-m} \sum_{i=1}^{N-m} \ln C_i^{m+1}(r)
where
C_i^m(r) = \frac{1}{N-m+1} \sum_{j=1}^{N-m+1} \Theta\!\left( r - d\big(X(i), X(j)\big) \right)
is the correlation integral with the Heaviside step function Θ; d(X(i), X(j)) is the distance between the ith and jth points of the time series of length N; m is the embedding dimension, i.e., the length of the sub-sequences to be compared (set to 2); and r is the tolerance level, i.e., the threshold radius distance (set to 0.2).
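A sketch of ApEn with the settings reported above (m = 2, r = 0.2) is given below. Interpreting r as a fraction of the standard deviation of the analyzed window, and the use of the Chebyshev (maximum) distance, are common conventions assumed here; the function name is illustrative.
function ae = approx_entropy(x, m, r)
% Approximate entropy (sketch following the definition above; m = 2, r = 0.2 in this study).
x = x(:);
N = numel(x);
r = r * std(x);                                   % tolerance relative to the window's SD (assumption)
phi = zeros(1, 2);
for k = 0:1                                       % template lengths m and m + 1
    mm = m + k;
    M  = N - mm + 1;                              % number of template vectors
    tmpl = zeros(M, mm);
    for i = 1:mm
        tmpl(:, i) = x(i:i+M-1);                  % embedded sub-sequences of length mm
    end
    C = zeros(M, 1);
    for i = 1:M
        d = max(abs(tmpl - tmpl(i, :)), [], 2);   % Chebyshev distance to all templates
        C(i) = sum(d <= r) / M;                   % correlation integral (self-match included)
    end
    phi(k+1) = mean(log(C));
end
ae = phi(1) - phi(2);                             % ApEn = Phi^m - Phi^(m+1)
end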

2.2.5. Sample Entropy (SampEn)

Sample entropy (SampEn) is a modification of ApEn that excludes self-matches, i.e., comparisons with itself [38].
Thus, the correlation integral has the following form:
C_i^m(r) = \frac{1}{N-m+1} \sum_{\substack{j=1 \\ j \neq i}}^{N-m+1} \Theta\!\left( r - d\big(X(i), X(j)\big) \right)
SampEn is defined as
\mathrm{SampEn}(m, r, N) = \ln \frac{\Phi^m(r)}{\Phi^{m+1}(r)}
where
\Phi^m(r) = \frac{1}{N-m+1} \sum_{i=1}^{N-m+1} C_i^m(r)
The advantage of SampEn over ApEn is its independence of data length. However, SampEn is only an approximate measure of information because it directly uses correlation integrals.
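The sketch below computes SampEn in the equivalent match-counting form of Richman and Moorman, i.e., the negative logarithm of the ratio of the numbers of template matches of length m + 1 and m, with self-matches excluded; the tolerance convention and the function name follow the ApEn sketch above and are assumptions.
function se = sample_entropy(x, m, r)
% Sample entropy (sketch; self-matches are excluded, no logarithm inside the average).
x = x(:);
N = numel(x);
r = r * std(x);
M = N - m;                                     % number of template vectors of length m + 1
tm  = zeros(M, m);                             % templates of length m
tm1 = zeros(M, m + 1);                         % templates of length m + 1
for i = 1:m+1
    if i <= m, tm(:, i) = x(i:i+M-1); end
    tm1(:, i) = x(i:i+M-1);
end
B = 0; A = 0;                                  % match counts for lengths m and m + 1
for i = 1:M-1
    dB = max(abs(tm(i+1:end, :)  - tm(i, :)),  [], 2);
    dA = max(abs(tm1(i+1:end, :) - tm1(i, :)), [], 2);
    B = B + sum(dB <= r);
    A = A + sum(dA <= r);
end
se = -log(A / B);                              % SampEn = -ln(A/B) = ln(B/A)
end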

2.2.6. Multiscale Entropy (MSE)

The multiscale entropy (MSE) was introduced by [39].
A given time series, X(i), i = 1,…, N, where N is the total number of samples, is divided into nonoverlapping windows of length τ. Then, the data inside each window are averaged. Next, the coarse-grained time series are constructed for every scale factor τ according to the following equation:
y^{(\tau)}(j) = \frac{1}{\tau} \sum_{i=(j-1)\tau + 1}^{j\tau} X(i), \qquad 1 \le j \le N/\tau
The length of each coarse-grained time series is equal to the size of the original time series divided by the scale factor. SampEn is calculated for each scale factor. MSE is the relationship between SampEn and τ.
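The MSE computation reduces to coarse-graining followed by SampEn at each scale, as sketched below; the function reuses the sample_entropy sketch from the previous subsection, and its name and argument order are illustrative.
function mse = multiscale_entropy(x, maxScale, m, r)
% Multiscale entropy (sketch): SampEn of the coarse-grained series at each scale factor tau.
x = x(:);
mse = zeros(maxScale, 1);
for tau = 1:maxScale
    nSeg = floor(numel(x)/tau);
    y = mean(reshape(x(1:nSeg*tau), tau, nSeg), 1)';   % average within nonoverlapping windows of length tau
    mse(tau) = sample_entropy(y, m, r);
end
end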

2.3. Feature Selection and Classification

The feature selection and classification were performed using the Classification Learner App implemented in the Statistics and Machine Learning Toolbox in Matlab R2023a.

2.3.1. Feature Selection

The minimum-redundancy maximum-relevance (MRMR) algorithm was applied for feature selection [40]. The MRMR method, based on mutual information of pairs of features and mutual information of a feature and the response, allows for minimizing the redundancy of a feature set and maximizing its relevance to the response.
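The same ranking can be reproduced outside the Classification Learner App with fscmrmr, the MRMR implementation in the Statistics and Machine Learning Toolbox; in the sketch below, feat and classes denote the feature matrix and the expert labels assembled earlier, and the names are illustrative.
% MRMR feature ranking (sketch; feat and classes are illustrative variable names).
featureNames = {'VAR', 'HFD', 'KFD', 'DFA', 'ApEn', 'SampEn', 'MSE1', 'MSE2'};
[idx, scores] = fscmrmr(feat, classes);                 % indices sorted by importance, plus scores
disp(table(featureNames(idx)', scores(idx)', ...
    'VariableNames', {'Feature', 'MRMRScore'}));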

2.3.2. Feature Classification

The binary classification of ECG segments with and without artifacts was performed separately for 34 classifiers to choose the best classifier. Several groups of classifiers were considered, such as Tree, Discriminant, Logistic Regression, Support Vector Machine (SVM), Naïve Bayes, k-Nearest Neighbors (k-NN), Neural Network, and Ensemble Trees. A stratified 5-fold cross-validation was performed before training all the classification models to avoid overfitting.
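As an example of how one of these models can be trained outside the App, the sketch below fits a cross-validated RUSBoosted Trees ensemble with fitcensemble; the hyper-parameter values are those reported for the optimized model in Table 1, and the variable names are illustrative.
% Stratified 5-fold cross-validated RUSBoosted Trees ensemble (sketch).
cvp = cvpartition(classes, 'KFold', 5);          % stratified partition over the two classes
lrn = templateTree('MaxNumSplits', 139269);      % optimized value from Table 1
mdl = fitcensemble(feat, classes, ...
    'Method', 'RUSBoost', ...
    'NumLearningCycles', 325, ...
    'LearnRate', 0.50351, ...
    'Learners', lrn, ...
    'CVPartition', cvp);
pred = kfoldPredict(mdl);                        % out-of-fold class predictions
CM   = confusionmat(classes, pred);              % pooled confusion matrix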

2.3.3. Hyper-Parameter Optimization

Every group of classifiers was optimized to find the best set of hyper-parameters for a given model. Table 1 provides the ranges of hyper-parameters used for training individual models. The values obtained for the optimized classifiers are also reported.

2.3.4. Classification Performance

The classification performance was evaluated for all classifiers using five metrics: sensitivity, specificity, precision (PPV), negative predictive value (NPV), and detection accuracy estimated according to the following formulas:
\mathrm{sensitivity} = \frac{TP}{TP + FN}
\mathrm{specificity} = \frac{TN}{TN + FP}
\mathrm{precision} = \frac{TP}{TP + FP}
\mathrm{NPV} = \frac{TN}{TN + FN}
\mathrm{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
where TP—artifacts marked manually and identified automatically; TN—no artifacts marked manually and not identified automatically; FP—no artifacts marked manually but identified automatically; FN—artifacts marked manually but not identified automatically.
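For completeness, the five metrics can be computed from the confusion-matrix counts as in the short helper below (the function name and the struct output are illustrative).
function metrics = performance_metrics(TP, FP, TN, FN)
% Five classification metrics from confusion-matrix counts, following the formulas above.
metrics.sensitivity = TP / (TP + FN);
metrics.specificity = TN / (TN + FP);
metrics.precision   = TP / (TP + FP);                 % positive predictive value (PPV)
metrics.NPV         = TN / (TN + FN);                 % negative predictive value
metrics.accuracy    = (TP + TN) / (TP + TN + FP + FN);
end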

3. Results

3.1. Distributions of Nonlinear ECG Measures in Healthy Persons

ECG segments with artifacts are characterized by higher values of variance and entropy (SampEn and ApEn) and lower values of KFD than those without artifacts. Interestingly, the HFD of segments with artifacts can take lower and higher values than segments without artifacts. In comparison, MSE1 and MSE2 do not allow for differentiation of the two classes. The ranges covering 99.5% of values typical for artifact-free ECG segments are provided in Table 2. Both non-standardized and standardized values are reported.

3.2. Feature Importance Scores

The most important feature is variance (score = 0.0267), followed by two measures with the same importance scores of 0.0210 (DFA and ApEn). Next, HFD contributes a slightly higher score than MSE1 (0.0166 vs. 0.0160). SampEn, KFD, and MSE2 have much lower scores than other features (0.0090, 0.0050, and 0.0016, respectively).

3.3. Choice of the Best Classifier

The training results, including accuracy, total cost, error rate, prediction speed, training time, and model size, are summarized in Table 3. The performance of thirty-four classifiers evaluated using five metrics is provided in Table 4. The average classification accuracy on the Movesense ECG recordings was 99.4%. Among all classifiers, an optimized ensemble RUSBoosted Trees classifier offers the best performance (c.f., Table 4) at a relatively low total cost (c.f., Table 3). However, the weighted k-NN classifier allows us to find the highest number of segments with artifacts (TP). The ROC curves for both classifiers are shown in Figure 1. The area under the curve (AUC), which measures the discriminatory power of a classifier, is larger for the ensemble RUSBoosted Trees classifier than for the weighted k-NN classifier. Both classifiers, optimized weighted k-NN and optimized ensemble RUSBoosted Trees, are characterized by a short training time (14 s and 130 s, respectively), which is much lower than the time needed for training of Naïve Bayes, SVM, or Neural Network classifiers (c.f., Table 3). The maximal time was over an hour for non-optimized Kernel Naïve Bayes and Cubic SVM. The optimization of the hyper-parameters allowed for a significant reduction in training time. However, training an optimized Kernel Naïve Bayes or Neural Network still requires much more time than other classifiers, which is reflected in up to 348 times (!) greater energy consumption (see Appendix A).
The detailed procedure for comparing performance metrics between each pair of optimized classifiers is provided in Appendix B.

3.4. Comparison of Classifiers: The Ensemble RUSBoosted Trees and the Weighted k-NN Classifier

In Figure 2, the scatter plots of variance in relation to the selected best features (DFA, ApEn, and HFD) are compared for two non-optimized classifiers: the ensemble RUSBoosted Trees and the weighted k-NN.
In the scatter plots of the weighted k-NN classifier, the space dominated by feature values for segments with artifacts is “contaminated” by blue points, i.e., values belonging to segments that the expert has classified as segments without artifacts. At the same time, the ensemble RUSBoosted Trees classifier can eliminate points not located in a space dominated by points belonging to the same class. However, the optimization of the ensemble RUSBoosted Trees classifier made the distributions of both classifiers similar.

3.5. Relationships Between Variance and Nonlinear Measures

The points in the scatter plots of variance against each of the nonlinear measures for ECG segments without artifacts occupy an area in the shape of an elongated blue disk (c.f., Figure 3), indicating that these feature pairs are correlated. Regardless of the value of the nonlinear feature, the variance remains at the same level (horizontal disks), except for KFD, for which we observe a slight increase in variance with the rise in KFD (c.f., Figure 3B). For this reason, applying a variance threshold above 0.02 allows for the elimination of most artifacts, which are observed mainly in the scatter plots of variance as a function of DFA (c.f., Figure 3C) and MSE, independently of the scale (c.f., MSE1 in Figure 3F and MSE2 in Figure 3G). An additional threshold for HFD greater than 1.26 is needed in the case of variance and HFD (Figure 3A). In the relationships between the variance and entropy measures, thresholds for SampEn, ApEn, MSE1, and MSE2 are required above the following values: 1.84, 2.10, 1.4, and 2.6, respectively.

3.6. Relationships Between Entropy and Fractal Dimension Measures

Surprisingly, we do not observe any correlation between ApEn and individual fractal dimension measures. Artifacts have higher entropy and lower DFA than a low-noise ECG signal (c.f., Figure 4C). Unlike DFA, the other two measures of fractal dimension (mainly HFD) can take both higher and lower values for artifacts than for the artifact-free ECG signal (c.f., Figure 4A,B).

3.7. Relationships Between Entropy Measures

As expected, strong correlations between SampEn and ApEn (c.f., Figure 5A) and between MSE1 and MSE2 can be observed (c.f., Figure 5B). Applying a threshold to higher entropy values, especially to SampEn (above 1.84) and ApEn (above 2.10), allows us to eliminate a large number of artifacts. We observe an interesting pattern in the scatter plots of SampEn (or ApEn) in relation to MSE (c.f., Figure 5C,D). In the artifact-free ECG-signal-value area, three subspaces are observed. This pattern is better visible in MSE2 than in MSE1 (c.f., Figure 5D).

4. Discussion

We performed an extensive comparative study of as many as thirty-four classifiers using a set of the eight most promising nonlinear features, including measures of fractal dimension that have not been considered previously as indices of ECG signal quality, i.e., Higuchi fractal dimension and Katz fractal dimension, as well as multiscale entropy.
The application of other nonlinear methods, such as Correlation Dimension (CD) or Lyapunov exponents (LEs), is limited because of their high computational load, sensitivity to noise, and dependence on long signals. In contrast, HFD and KFD calculate complexity directly from the time series without phase-space embedding, ApEn and SampEn use fixed template lengths and tolerance windows, and DFA uses a detrending and scaling procedure, which makes them much faster than CD or LE. Thus, the proposed methods (HFD, KFD, DFA, ApEn, and SampEn) are faster, more robust, and better suited to real-time or resource-constrained environments, like wearable ECG devices. In this study, ECG segments with and without artifacts were classified using a set of nonlinear features of the ECG signal itself, not HRV derived from the ECG, as used in most previous studies [27,28,29,30,31,32].
Moreover, unlike other studies, we do not indiscriminately use many features and classifiers [25,27]. Instead, we discuss the role of individual features and classifiers in a specific context and provide a physiological interpretation of those that yield the best results.
The MRMR method [40] was used to evaluate the importance of individual features for classification. Unlike other methods, this method does not depend on the classifier but only on the class characteristics. This allowed for more effective classification and the discovery of interesting relationships between features. The variance was selected by MRMR as the best feature to characterize the artifacts because of their high variability compared to the variability of artifact-free ECG signals. However, the classification performance improved when nonlinear measures were included. Measures of fractal dimension and entropy are beneficial in the analysis of the ECG signal due to its nonlinear nature. These measures also allowed for differentiation between an artifact-free ECG signal and stochastic noise. Entropy is a measure of the degree of disorder of a system; thus, the presence of artifacts in the ECG causes an increase in signal entropy. Meanwhile, the fractal dimension depends on the signal complexity. HFD takes values from one for deterministic curves (line and sinusoid) to two for white noise. Previous studies have shown that HFD allows for the differentiation of movement and muscle artifacts [15]: HFD is lower than 1.15 for movement artifacts, while it is greater than 1.26 for muscle artifacts. This is consistent with the low complexity of movement artifacts causing a decrease in HFD, whereas the high complexity of muscle artifacts leads to an increase in the HFD of the signal. HFD is also a much simpler way of eliminating muscle artifacts than the shifted rank-1 reconstruction proposed by other authors [41].
Although the sensitivity and precision (PPV) values for all classifiers were over 99%, specificity and NPV depended on the choice of the classifier. The best performance was obtained using an optimized ensemble RUSBoosted Trees classifier, with a specificity and NPV of 73.7% and 74.3%, respectively. The highest NPV value of 83.5%, corresponding to identifying the largest number of segments with artifacts (TP), was obtained using an optimized weighted k-NN classifier. However, this came at the expense of a lower specificity of 64.3%, which was related to the rejection of a larger number of segments without artifacts.
A characteristic pattern with three subspaces distinguishable in the area of the artifact-free ECG-signal values, observed in the scatter plots of SampEn (or ApEn) in relation to MSE, may indicate the multifractal nature of the ECG, which is related to the occurrence of different sleep stages [42].

5. Conclusions

The aim of this study was to provide a comprehensive comparison of a broad spectrum of machine learning models in the context of differentiating two classes of signals, nonlinear ECG signals and stochastic artifacts. In this specific case, an optimized ensemble RUSBoosted Trees classifier guaranteed the best classification performance results.
Both classifiers, optimized weighted k-NN and optimized ensemble RUSBoosted Trees, are characterized by a short training time (14 s and 130 s, respectively). However, they require more memory (k-NN: 24 MB, ensemble RUSBoosted Trees: 53 MB) than other classifiers, such as Decision Tree (13 kB), Discriminant (5 kB), SVM (314 kB), or Neural Network (31 kB). The Discriminant classifier has the lowest training time (2 s) and the smallest model size (5 kB), but at the cost of lower specificity (59.5% for Discriminant vs. 73.7% for ensemble RUSBoosted Trees). Thus, in wearable devices, where minimal memory usage is important, the Discriminant classifier would be preferable.
The importance of features for classification was assessed using the MRMR algorithm, which is much more computationally efficient and more general than standard feature selection methods because the set of selected features does not depend on the classifier but only on the class characteristics. The application of MRMR revealed the particular importance of the variance in classifying artifacts occurring in a nonlinear signal, such as ECG, regardless of the type of classifier. Among the nonlinear measures, the most important were DFA, ApEn, and HFD.
DFA and entropy measures are more effective than HFD in distinguishing between stochastic (random) and chaotic (nonlinear deterministic) signals. HFD quantifies the self-similarity or complexity of a signal across scales but is not sensitive to the source of complexity. Consequently, both stochastic and chaotic signals can sometimes have similar HFD values, whereas DFA can distinguish better between correlated (e.g., deterministic or fractional Brownian) and uncorrelated (white noise) signals. On the other hand, entropy measures evaluate the unpredictability or irregularity in a time series. Stochastic signals (especially white noise) have high entropy due to their lack of structure. Chaotic systems have lower entropy than stochastic systems because they follow deterministic rules, even though they appear complex. Thus, entropy allows for distinguishing noise from structured chaos.
Moreover, we noticed an interesting relationship between sample entropy (or approximate entropy) and multiscale entropy, revealing the possible multifractality of ECG during sleep.
Additionally, the algorithm based on an optimized ensemble RUSBoosted Trees classifier and a set of several statistical and nonlinear measures may be helpful in single-channel wearable ECG devices to detect artifacts occurring in real-time ECG recordings.
A limitation of this study is the use of 8 h ECG signals recorded during sleep from a small number of healthy volunteers. Further studies will be extended to a larger group of people, taking into account everyday conditions and other wearable devices.
Another limitation of this study is related to the subjective factor of the expert labeling process of artifact-free segments. To partially address this, labeling would need to be performed by multiple experts.

Author Contributions

Conceptualization, E.O.; methodology, E.O.; software, E.O.; validation, E.O.; formal analysis, E.O.; investigation, E.O.; resources, C.M.; data curation, C.M.; writing—original draft preparation, E.O.; writing—review and editing, C.M.; visualization, E.O.; project administration, E.O. and C.M.; funding acquisition, E.O. and C.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the European Health and Digital Executive Agency (HaDEA), grant no. 101128983. Views and opinions expressed are, however, those of the authors only and do not necessarily reflect those of the European Union or the European Health and Digital Executive Agency. Neither the European Union nor the granting authority can be held responsible for them. The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Institutional Review Board Statement

This study was conducted in accordance with the Declaration of Helsinki, and approved by the Ethics Committee of the Fondazione Policlinico Universitario Campus Bio-Medico on 6 February 2022 (Prot. PAR 04.22 OSS).

Informed Consent Statement

Informed consent was obtained from all subjects involved in this study.

Data Availability Statement

The datasets of this manuscript are publicly available in the RepOD repository: https://doi.org/10.18150/O7QQNQ.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ApEnApproximate entropy
CDCorrelation dimension
CNNConvolutional Neural Network
DFADetrended Fluctuation Analysis
DLDeep learning
ECGElectrocardiography
EMDEmpirical mode decomposition
EWTEmpirical wavelet transform
FNFalse negative
FPFalse positive
HFDHiguchi fractal dimension
HRVHeart rate variability
ICAIndependent component analysis
KFDKatz fractal dimension
k-NNk-Nearest Neighbors
LELyapunov exponent
LSTMLong short-term memory
MRMRMinimum redundancy maximum relevance
MSEMultiscale entropy
NPVNegative predictive value
PPVPositive predictive value (precision)
SampEnSample entropy
SVMSupport vector machine
TPTrue positive
TNTrue negative
VARVariance

Appendix A. Evaluation of Energy Consumption Due to Model Training

The carbon footprint expressed in terms of carbon dioxide equivalent (CO2eq) was estimated using the Machine Learning Emissions Calculator (https://mlco2.github.io/impact/, accessed on 22 September 2025) to evaluate the energy consumption due to model training.
The model training was performed using an Intel® Core™ i7-13700K Processor. According to the company specifications (https://www.intel.com/content/www/us/en/products/sku/230500/intel-core-i713700k-processor-30m-cache-up-to-5-40-ghz/specifications.html, accessed on 22 September 2025), the processor’s base power is 125 W and maximum turbo power is 253 W.
Assuming the value of carbon efficiency equal to 0.432 kg/kWh and offset bought set to 0%, the carbon footprint of 100 min of model training corresponds to 0.09–0.18 kg eq. CO2.
For example, the training of an optimized weighted k-NN classifier lasting 14 s corresponds to the following:
125 W · 14 s = 0.125 kW · 0.0039 h = 0.00049 kWh; 0.00049 kWh · 0.432 kg eq. CO2/kWh = 0.00021 kg eq. CO2
The training of an optimized ensemble RUSBoosted Trees classifier lasting 130 s corresponds to the following:
125 W · 130 s = 0.125 kW · 0.036 h = 0.0045 kWh; 0.0045 kWh · 0.432 kg eq. CO2/kWh = 0.00195 kg eq. CO2
The training of an optimized SVM lasting 3629 s corresponds to the following:
125 W · 3629 s = 0.125 kW · 1.01 h = 0.126 kWh; 0.126 kWh · 0.432 kg eq. CO2/kWh = 0.055 kg eq. CO2
Whereas the training of an optimized Neural Network classifier lasting 4837 s corresponds to the following:
125 W · 4837 s = 0.125 kW · 1.34 h = 0.168 kWh; 0.168 kWh · 0.432 kg eq. CO2/kWh = 0.073 kg eq. CO2
Therefore, the carbon footprint for an optimized Neural Network is 348 times (!) higher than for an optimized weighted k-NN.
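The same estimate can be reproduced with a few lines of MATLAB, as sketched below using the base-power and carbon-efficiency values quoted above.
% Carbon-footprint estimate for model training (values from the text).
P_kW   = 0.125;                            % processor base power: 125 W
carbon = 0.432;                            % carbon efficiency [kg CO2eq per kWh]
t_s    = [14 130 3629 4837];               % training times [s]: weighted k-NN, RUSBoost, SVM, Neural Network
E_kWh  = P_kW * t_s / 3600;                % energy per training run
CO2_kg = carbon * E_kWh;                   % approx. 0.0002, 0.002, 0.055, 0.073 kg CO2eq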

Appendix B. Comparison of Model Performance

The performance of seven optimized models (Tree, Discriminant, Naïve Bayes, SVM, k-NN, ensemble RUSBoost Tree, and Neural Network) calculated for 5 folds is reported in Table A1.
To extract values from 5-fold classification, each model was trained using an appropriate fitc* function in MATLAB R2023a with the option “KFold”.
For example, for the optimized k-NN classifier, the following procedure was applied:
H.NumNeighbors = 14;
H.Distance = 'euclidean';
H.DistanceWeight = 'squaredinverse';
H.Standardize = 1;
k = 5; % number of folds
Mdl = fitcknn(features, classes, 'Distance', char(H.Distance), ...
    'DistanceWeight', char(H.DistanceWeight), ...
    'NumNeighbors', H.NumNeighbors, ...
    'Standardize', H.Standardize, 'KFold', k);
The confusion matrices (CM) for each fold were extracted as follows:
for i = 1:k
    Labels = classes(Mdl.Partition.test(i));
    Pred = predict(Mdl.Trained{i}, features(Mdl.Partition.test(i), :));
    CM{i} = confusionmat(Labels, Pred);
end
Then, five performance metrics (sensitivity, specificity, PPV, NPV, and accuracy) were calculated from the confusion matrices (CM) for each of the five folds.
Next, the paired-sample t-test was applied to compare each pair of classifiers. The results are presented in Table A2.
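For illustration, one such pairwise comparison can be written as below, where spec_knn and spec_rus are hypothetical 5-by-1 vectors holding the per-fold specificity of the optimized k-NN and RUSBoosted Trees models.
% Paired-sample t-test on a single metric between two classifiers (sketch).
[h, p] = ttest(spec_knn, spec_rus);        % paired test across the 5 folds
fprintf('Paired t-test on specificity: p = %.6f (h = %d)\n', p, h);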
Table A1. Performance of seven optimized classifiers evaluated using five metrics: sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and detection accuracy; TP—artifacts marked manually and identified automatically; TN—no artifacts marked manually and not identified automatically; FP—no artifacts marked manually but identified automatically; FN—artifacts marked manually but not identified automatically.
Optimized Model | TP | FP | TN | FN | Sensitivity | Specificity | PPV | NPV | Accuracy
Tree | 43,016 | 140 | 219 | 73 | 0.998 | 0.610 | 0.997 | 0.750 | 0.995
 | 43,028 | 161 | 199 | 61 | 0.999 | 0.553 | 0.996 | 0.765 | 0.995
 | 43,015 | 152 | 208 | 74 | 0.998 | 0.578 | 0.996 | 0.738 | 0.995
 | 43,011 | 133 | 227 | 77 | 0.998 | 0.631 | 0.997 | 0.747 | 0.995
 | 43,025 | 167 | 192 | 64 | 0.999 | 0.535 | 0.996 | 0.750 | 0.995
mean |  |  |  |  | 0.998 | 0.581 | 0.997 | 0.750 | 0.995
Discriminant | 42,989 | 160 | 199 | 100 | 0.998 | 0.554 | 0.996 | 0.666 | 0.994
 | 42,993 | 141 | 219 | 96 | 0.998 | 0.608 | 0.997 | 0.695 | 0.995
 | 42,998 | 150 | 210 | 91 | 0.998 | 0.583 | 0.997 | 0.698 | 0.994
 | 43,004 | 144 | 216 | 84 | 0.998 | 0.600 | 0.997 | 0.720 | 0.995
 | 43,008 | 129 | 230 | 81 | 0.998 | 0.641 | 0.997 | 0.740 | 0.995
mean |  |  |  |  | 0.998 | 0.597 | 0.997 | 0.704 | 0.995
Naïve Bayes | 42,672 | 67 | 292 | 417 | 0.990 | 0.813 | 0.998 | 0.412 | 0.989
 | 42,662 | 62 | 298 | 427 | 0.990 | 0.828 | 0.999 | 0.411 | 0.989
 | 42,640 | 68 | 292 | 449 | 0.990 | 0.811 | 0.998 | 0.394 | 0.988
 | 42,665 | 62 | 298 | 423 | 0.990 | 0.828 | 0.999 | 0.413 | 0.989
 | 42,656 | 57 | 302 | 433 | 0.990 | 0.841 | 0.999 | 0.411 | 0.989
mean |  |  |  |  | 0.990 | 0.824 | 0.999 | 0.408 | 0.989
SVM | 43,040 | 148 | 211 | 49 | 0.999 | 0.588 | 0.997 | 0.812 | 0.995
 | 43,031 | 157 | 203 | 58 | 0.999 | 0.564 | 0.996 | 0.778 | 0.995
 | 43,046 | 147 | 213 | 43 | 0.999 | 0.592 | 0.997 | 0.832 | 0.996
 | 43,024 | 148 | 212 | 64 | 0.999 | 0.589 | 0.997 | 0.768 | 0.995
 | 43,022 | 150 | 209 | 67 | 0.998 | 0.582 | 0.997 | 0.757 | 0.995
mean |  |  |  |  | 0.999 | 0.583 | 0.997 | 0.789 | 0.995
k-NN | 43,037 | 125 | 234 | 52 | 0.999 | 0.652 | 0.997 | 0.818 | 0.996
 | 43,048 | 135 | 225 | 41 | 0.999 | 0.625 | 0.997 | 0.846 | 0.996
 | 43,056 | 125 | 235 | 33 | 0.999 | 0.653 | 0.997 | 0.877 | 0.996
 | 43,038 | 134 | 226 | 50 | 0.999 | 0.628 | 0.997 | 0.819 | 0.996
 | 43,048 | 123 | 236 | 41 | 0.999 | 0.657 | 0.997 | 0.852 | 0.996
mean |  |  |  |  | 0.999 | 0.643 | 0.997 | 0.842 | 0.996
Ensemble | 42,617 | 47 | 312 | 472 | 0.989 | 0.869 | 0.999 | 0.398 | 0.988
 | 42,644 | 60 | 300 | 445 | 0.990 | 0.833 | 0.999 | 0.403 | 0.988
 | 42,579 | 47 | 313 | 510 | 0.988 | 0.869 | 0.999 | 0.380 | 0.987
 | 42,633 | 62 | 298 | 455 | 0.989 | 0.828 | 0.999 | 0.396 | 0.988
 | 42,643 | 43 | 316 | 446 | 0.990 | 0.880 | 0.999 | 0.415 | 0.989
mean |  |  |  |  | 0.989 | 0.856 | 0.999 | 0.398 | 0.988
Neural Network | 43,022 | 141 | 218 | 67 | 0.998 | 0.607 | 0.997 | 0.765 | 0.995
 | 43,023 | 148 | 212 | 66 | 0.998 | 0.589 | 0.997 | 0.763 | 0.995
 | 43,036 | 147 | 213 | 53 | 0.999 | 0.592 | 0.997 | 0.801 | 0.995
 | 43,024 | 147 | 213 | 64 | 0.999 | 0.592 | 0.997 | 0.769 | 0.995
 | 43,040 | 139 | 220 | 49 | 0.999 | 0.613 | 0.997 | 0.818 | 0.996
mean |  |  |  |  | 0.999 | 0.598 | 0.997 | 0.783 | 0.995
Table A2. Comparison of classification performance for each pair of models separately for each of the five metrics (sensitivity, specificity, PPV, NPV, and accuracy). The statistically significant differences between models for p-values less than 0.05 are marked in red.
Sensitivity | Discriminant | Naïve Bayes | SVM | k-NN | Ensemble | Neural Network
Tree | 0.012654 | 0.000001 | 0.097824 | 0.002349 | 0.000003 | 0.087904
Discriminant |  | 0.000001 | 0.009894 | 0.000472 | 0.000006 | 0.000494
Naïve Bayes |  |  | 0.000001 | 0.000002 | 0.020519 | 0.000002
SVM |  |  |  | 0.054314 | 0.000015 | 0.587531
k-NN |  |  |  |  | 0.000007 | 0.004663
Ensemble |  |  |  |  |  | 0.000007
Specificity | Discriminant | Naïve Bayes | SVM | k-NN | Ensemble | Neural Network
Tree | 0.609934 | 0.000317 | 0.918646 | 0.040432 | 0.000335 | 0.426127
Discriminant |  | 0.000018 | 0.438621 | 0.048048 | 0.000131 | 0.941055
Naïve Bayes |  |  | 0.000011 | 0.000041 | 0.061452 | 0.000003
SVM |  |  |  | 0.000524 | 0.000010 | 0.062232
k-NN |  |  |  |  | 0.000001 | 0.000620
Ensemble |  |  |  |  |  | 0.000005
PPV | Discriminant | Naïve Bayes | SVM | k-NN | Ensemble | Neural Network
Tree | 0.614120 | 0.000320 | 0.913403 | 0.039895 | 0.000338 | 0.424345
Discriminant |  | 0.000018 | 0.449108 | 0.046823 | 0.000132 | 0.926907
Naïve Bayes |  |  | 0.000011 | 0.000044 | 0.061923 | 0.000003
SVM |  |  |  | 0.000514 | 0.000010 | 0.063177
k-NN |  |  |  |  | 0.000001 | 0.000614
Ensemble |  |  |  |  |  | 0.000005
NPV | Discriminant | Naïve Bayes | SVM | k-NN | Ensemble | Neural Network
Tree | 0.027501 | 0.000000 | 0.078218 | 0.002126 | 0.000000 | 0.075628
Discriminant |  | 0.000020 | 0.025051 | 0.000682 | 0.000010 | 0.001404
Naïve Bayes |  |  | 0.000023 | 0.000007 | 0.056455 | 0.000009
SVM |  |  |  | 0.021545 | 0.000032 | 0.749565
k-NN |  |  |  |  | 0.000006 | 0.002727
Ensemble |  |  |  |  |  | 0.000005
Accuracy | Discriminant | Naïve Bayes | SVM | k-NN | Ensemble | Neural Network
Tree | 0.242112 | 0.000001 | 0.087122 | 0.004320 | 0.000018 | 0.115168
Discriminant |  | 0.000013 | 0.082095 | 0.001745 | 0.000009 | 0.009532
Naïve Bayes |  |  | 0.000010 | 0.000006 | 0.031457 | 0.000005
SVM |  |  |  | 0.003499 | 0.000043 | 0.795631
k-NN |  |  |  |  | 0.000014 | 0.000643
Ensemble |  |  |  |  |  | 0.000011

References

  1. Clifford, G.D.; Azuaje, F.; McSharry, P.E. Advanced Methods for ECG Analysis; Artech House: London, UK, 2006. [Google Scholar]
  2. Chatterjee, S.; Thakur, R.S.; Yadav, R.N.; Gupta, L.; Raghuvanshi, D.K. Review of noise removal techniques in ECG signals. IET Signal Proc. 2020, 14, 569–590. [Google Scholar] [CrossRef]
  3. Van der Bijl, K.; Elgendi, M.; Menon, C. Automatic ECG Quality Assessment Techniques: A Systematic Review. Diagnostics 2022, 12, 2578. [Google Scholar] [CrossRef]
  4. Siddiah, N.; Srikanth, T.; Kumar, Y.S. Nonlinear filtering in ECG Signal Enhancement. Int. J. Comput. Sci. Commun. Netw. 2012, 2, 123–128. [Google Scholar]
  5. Sarafan, S.; Vuong, H.; Jilani, D.; Malhotra, S.; Lau, M.P.H.; Vishwanath, M.; Ghirmai, T.; Cao, H. A Novel ECG Denoising Scheme Using the Ensemble Kalman Filter. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. 2022, 2022, 2005–2008. [Google Scholar]
  6. Khavas, Z.R.; Asl, B.M. Robust heartbeat detection using multimodal recordings and ECG quality assessment with signal amplitudes dispersion. Comput. Methods Programs Biomed. 2018, 163, 169–182. [Google Scholar] [CrossRef]
  7. Zhao, Z.; Zhang, Y. SQI Quality Evaluation Mechanism of Single-Lead ECG Signal Based on Simple Heuristic Fusion and Fuzzy Comprehensive Evaluation. Front. Physiol. 2018, 9, 727. [Google Scholar] [CrossRef]
  8. Huang, N.E.; Shen, Z.; Long, S.R.; Wu, M.C.; Shih, H.H.; Zheng, Q.; Yen, N.-C.; Tung, C.C.; Liu, H.H. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. R. Soc. A Math. Phys. Eng. Sci. 1998, 454, 903–995. [Google Scholar] [CrossRef]
  9. Wu, Z.H.; Huang, N.E. A study of the characteristics of white noise using the empirical mode decomposition method. Proc. R. Soc. A Math. Phys. Eng. Sci. 2004, 460, 1597–1611. [Google Scholar] [CrossRef]
  10. Chang, K.M. Arrhythmia ECG noise reduction by ensemble empirical mode decomposition. Sensors 2010, 10, 6063–6080. [Google Scholar] [CrossRef] [PubMed]
  11. Xu, X.; Liang, Y.; He, P.; Yang, J. Adaptive Motion Artifact Reduction Based on Empirical Wavelet Transform and Wavelet Thresholding for the Non-Contact ECG Monitoring Systems. Sensors 2019, 19, 2916. [Google Scholar] [CrossRef]
  12. Elouaham, S.; Dliou, A.; Jenkal, W.; Louzazni, M.; Zougagh, H.; Dlimi, S. Empirical Wavelet Transform Based ECG Signal Filtering Method. J. Electr. Comput. Eng. 2024, 2024, 9050909. [Google Scholar] [CrossRef]
  13. Sharanya, S.; Arjunan, P.D. Fractal Dimension Techniques for Analysis of Cardiac Autonomic Neuropathy. Biomed. Eng. Appl. Basis Commun. 2023, 35, 2350003. [Google Scholar] [CrossRef]
  14. Chen, C.; da Silva, B.; Ma, C.; Li, J.; Liu, C. Fast Sample Entropy Atrial Fibrillation Analysis Towards Wearable Device. In Proceedings of the 12th Asian-Pacific Conference on Medical and Biological Engineering. APCMBE 2023, Suzhou, China, 18–21 May 2023; Wang, G., Yao, D., Gu, Z., Peng, Y., Tong, S., Liu, C., Eds.; IFMBE Proceedings. Springer: Cham, Switzerland, 2024; Volume 103. [Google Scholar]
  15. Olejarczyk, E.; Raus-Jarzabek, E.; Massaroni, C. Automatic identification of movement and muscle artifacts in ECG based on statistical and nonlinear measures. In Proceedings of the 2024 IEEE International Workshop on Metrology for Industry 4.0 & IoT, Florence, Italy, 29–31 May 2024. [Google Scholar]
  16. Alghieth, M. DeepECG-Net: A hybrid transformer-based deep learning model for real-time ECG anomaly detection. Sci. Rep. 2025, 15, 20714. [Google Scholar] [CrossRef]
  17. Wang, Y.-H.; Chen, I.-Y.; Chiueh, H.; Liang, S.-F. A Low-Cost Implementation of Sample Entropy in Wearable Embedded Systems: An Example of Online Analysis for Sleep EEG. IEEE Trans. Instrum. Meas. 2021, 70, 4002412. [Google Scholar] [CrossRef]
  18. Gomolka, R.S.; Kampusch, S.; Kaniusas, E.; Thurk, F.; Szeles, J.C.; Klonowski, W. Higuchi Fractal Dimension of Heart Rate Variability During Percutaneous Auricular Vagus Nerve Stimulation in Healthy and Diabetic Subjects. Front. Physiol. 2018, 9, 1162. [Google Scholar] [CrossRef] [PubMed]
  19. Horie, T.; Burioka, N.; Amisaki, T.; Shimizu, E. Sample Entropy in Electrocardiogram During Atrial Fibrillation. Yonago Acta Med. 2018, 61, 49–57. [Google Scholar] [CrossRef]
  20. Zhao, L.; Liu, C.; Wei, S.; Shen, Q.; Zhou, F.; Li, J. A New Entropy-Based Atrial Fibrillation Detection Method for Scanning Wearable ECG Recordings. Entropy 2018, 20, 904. [Google Scholar] [CrossRef]
  21. Alcan, V. Sample Entropy Analysis of heart rate variability in RR interval detection. Muhendis. Bilim. Ve Tasarım Derg. 2020, 8, 783–790. [Google Scholar] [CrossRef]
  22. Abdelrazik, A.; Eldesouky, M.; Antoun, I.; Lau, E.Y.M.; Koya, A.; Vali, Z.; Suleman, S.A.; Donaldson, J.; Ng, G.A. Wearable Devices for Arrhythmia Detection: Advancements and Clinical Implications. Sensors 2025, 25, 2848. [Google Scholar] [CrossRef]
  23. Ribeiro, P.; Sa, J.; Paiva, D.; Rodrigues, P.M. Cardiovascular Diseases Diagnosis Using an ECG Multi-Band Non-Linear Machine Learning Framework Analysis. Bioengineering 2024, 11, 58. [Google Scholar] [CrossRef] [PubMed]
  24. Noitz, M.; Mortl, C.; Bock, C.; Mahringer, C.; Bodenhofer, U.; Dunser, M.W.; Meier, J. Detection of Subtle ECG Changes Despite Superimposed Artifacts by Different Machine Learning Algorithms. Algorithms 2024, 17, 360. [Google Scholar] [CrossRef]
  25. Zhang, Y.; Wei, S.; Zhang, L.; Liu, C. Comparing the Performance of Random Forest, SVM and Their Variants for ECG Quality Assessment Combined with Nonlinear Features. J. Med. Biol. Eng. 2019, 39, 381–392. [Google Scholar] [CrossRef]
  26. Fu, F.; Xiang, W.; An, Y.; Liu, B.; Chen, X.; Zhu, S.; Li, J. Comparison of Machine Learning Algorithms for the Quality Assessment of Wearable ECG Signals Via Lenovo H3 Devices. J. Biol. Eng. 2021, 41, 231–240. [Google Scholar] [CrossRef]
  27. Karimulla, S.; Patra, D. An Optimal Methodology for Early Prediction of Sudden Cardiac Death Using Advanced Heart Rate Variability Features of ECG Signal. Arab. J. Sci. Eng. 2024, 49, 6725–6741. [Google Scholar] [CrossRef]
  28. Rasmussen, J.H.; Rosenberger, K.; Langbein, J.; Easie, R.R. An open-source software for non-invasive heart rate variability assessment. Methods Ecol. Evol. 2020, 11, 773–782. [Google Scholar] [CrossRef]
  29. El-Yaagoubi, M.; Goya-Esteban, R.; Jabrane, Y.; Munoz-Romero, S.; Garcia-Alberola, A.; Rojo-Alvarez, J.L. On the Robustness of Multiscale Indices for Long-Term Monitoring in Cardiac Signals. Entropy 2019, 21, 594. [Google Scholar] [CrossRef]
  30. Stapelberg, N.J.C.; Neumann, D.L.; Shum, D.H.K.; McConnell, H.; Hamilton-Craig, I. The sensitivity of 38 heart rate variability measures to the addition of artifact in human and artificial 24-hr cardiac recordings. Ann. Noninvasive Electrocardiol. 2018, 23, e12483. [Google Scholar] [CrossRef]
  31. Giles, D.A.; Draper, N. Heart rate variability during exercise: A comparison of artefact correction methods. J. Strength Cond. Res. 2018, 32, 726–735. [Google Scholar] [CrossRef]
  32. Ernst, G. Hidden Signals-The History and Methods of Heart Rate Variability. Front. Public Health 2017, 5, 265. [Google Scholar] [CrossRef]
  33. Massaroni, C.; Olejarczyk, E.; Lo Presti, D.; Schena, E.; Nusca, A.; Ussia, G.P.; Silvestri, S. Indirect Respiratory Monitoring via Single-Lead Wearable ECG: Influence of Motion Artifacts and Devices on Respiratory Rate Estimations. In Proceedings of the 2024 IEEE International Workshop on Metrology for Industry 4.0 & IoT, Florence, Italy, 29–31 May 2024. [Google Scholar]
  34. Higuchi, T. Approach to an irregular time series on the basis of the fractal theory. Phys. D 1988, 31, 277–283. [Google Scholar] [CrossRef]
  35. Katz, M.J. Fractals and the analysis of waveforms. Comput. Biol. Med. 1988, 18, 145–156. [Google Scholar] [CrossRef]
  36. Peng, C.K.; Havlin, S.; Hausdorff, J.M.; Mietus, J.E.; Stanley, H.E.; Goldberger, A.L. Fractal mechanisms and heart rate dynamics: Long-range correlations and their breakdown with disease. J. Electrocardiol. 1996, 28 (Suppl. S1), 59–64. [Google Scholar] [CrossRef] [PubMed]
  37. Pincus, S.M. Approximate entropy as a measure of system complexity. Proc. Natl. Acad. Sci. USA 1991, 88, 2297–2301. [Google Scholar] [CrossRef] [PubMed]
  38. Richman, J.S.; Moorman, R.J. Physiological time-series analysis using approximate entropy and sample entropy. Am. J. Physiol. Heart Circ. Physiol. 2000, 278, 2039–2049. [Google Scholar] [CrossRef] [PubMed]
  39. Costa, M.; Goldberger, A.L.; Peng, C.K. Multiscale entropy analysis of complex physiologic time series. Phys. Rev. Lett. 2002, 89, 068102. [Google Scholar] [CrossRef]
  40. Ding, C.; Peng, H. Minimum redundancy feature selection from microarray gene expression data. J. Bioinform. Comput. Biol. 2005, 3, 185–205. [Google Scholar] [CrossRef]
  41. Chen, X.; Zheng, S.; Peng, L.; Zhong, Q.; He, L. A novel method based on shifted rank-1 reconstruction for removing EMG artifacts in ECG signals. Biomed. Signal Process. Control 2023, 85, 104967. [Google Scholar] [CrossRef]
  42. Costa, M.; Goldberger, A.L.; Peng, C.K. Multiscale entropy analysis of biological signals. Phys. Rev. E 2005, 71, 021906. [Google Scholar] [CrossRef]
Figure 1. The ROC curves for two classifiers: (A) the ensemble RUSBoosted Trees classifier; (B) the weighted k-NN classifier.
Figure 2. Comparison of the scatter plots of variance in relation to the selected best features: (A,B) DFA, (C,D) ApEn, and (E,F) HFD, for two non-optimized classifiers: the ensemble RUSBoosted Trees classifier (A,C,E) and the weighted k-NN classifier (B,D,F).
Figure 3. The scatter plots of variance as a function of nonlinear measures: (A) HFD; (B) KFD; (C) DFA; (D) ApEn; (E) SampEn; (F) MSE for scale 1; (G) MSE for scale 2.
Figure 4. The scatter plots of ApEn as a function of fractal dimension measures: (A) HFD; (B) KFD; (C) DFA.
Figure 5. The scatter plots illustrating relationships between entropy measures: (A) SampEn vs. ApEn; (B) MSE2 vs. MSE1; (C) MSE2 vs. SampEn; (D) MSE2 vs. ApEn.
Table 1. The ranges of hyper-parameters used for training individual models.
Classifier Group | Ranges of Hyper-Parameters
Tree | split criterion: Gini's diversity index; surrogate decision splits: off; maximum number of splits: 100 (fine), 20 (medium), 7 (coarse)
 | optimized: maximum number of splits: 36
Discriminant | in both linear and quadratic discriminant, a full covariance structure is used
 | optimized: linear
Efficient Logistic Regression and Efficient Linear SVM | solver, regularization, and regularization strength (lambda) are automatic; relative coefficient tolerance (beta tolerance): 0.0001; multi-class coding: one-vs.-one
Naïve Bayes | standardize data; kernel, in contrast to Gaussian distribution for numeric predictors, uses unbounded support
 | optimized: kernel type: triangle
SVM | multi-class coding: one-vs.-one; standardize data; box constraint level: 1; linear, quadratic, and cubic kernel function use automatic scale; Gaussian SVM scale: 0.71 (fine), 2.8 (medium), and 11 (coarse)
 | optimized: kernel function: Gaussian; kernel scale: 1.7037
k-NN | standardize data; distance metric: Euclidean (fine, medium, coarse, and weighted k-NN), Cosine (cosine k-NN), Minkowski (cubic k-NN); number of neighbors: k = 10, except fine k-NN (k = 1) and coarse k-NN (k = 100); distance weight: equal, except the weighted k-NN (squared inverse distance)
 | optimized: distance metric: Euclidean; weighted k-NN; number of neighbors: k = 14
Ensemble Trees | number of learners: 30; learner type: Decision Tree for AdaBoost, Bag, and RUSBoost, while Discriminant or Nearest Neighbors for Subspace Ensemble with subspace dimension equal to 4; all predictors to sample are used by the Decision Tree learner; maximum number of splits: 20 for AdaBoost and RUSBoost with learning rate equal to 0.1, or 217,241 for Bag
 | optimized: for RUSBoost: number of learners: 325; maximum number of splits: 139,269; learning rate: 0.50351; for Bagged Tree: number of learners: 112; maximum number of splits: 2451
Neural Networks | standardize data; regularization strength (lambda): 0; activation: ReLU; iteration limit: 1000; number of layers: 1 (narrow, medium, and wide), 2 (bilayered), 3 (trilayered); layer size: 10, except medium (25) and wide (100) Neural Network
 | optimized: lambda: 4.7042 × 10^−6; activation: ReLU; number of layers: 1; layer size: 296
Kernel | SVM or Logistic Regression Kernel; regularization strength (lambda): automatic; multi-class coding: one-vs.-one; kernel scale: automatic; iteration limit: 1000; number of expansion dimensions: automatic
Table 2. Ranges of feature values covering 99.5% of values typical for artifact-free ECG segments: non-standardized and standardized values.
Non-Standardized Data
Measure | Variance | HFD | KFD | DFA | SampEn | ApEn | MSE1 | MSE2
min | 0.008 | 1.04 | 1.0000 | 1.31 | 0.78 | 1.17 | 0.02 | 0.04
max | 0.097 | 1.49 | 1.0003 | 2.95 | 2.22 | 2.32 | 2.25 | 3.38
Standardized Data
Measure | Variance | HFD | KFD | DFA | SampEn | ApEn | MSE1 | MSE2
min | −0.0042 | −7.61 | −0.11 | −10.17 | −3.74 | −4.57 | −1.56 | −1.33
max | −0.0040 | 15.86 | 0.20 | 2.53 | 5.55 | 4.49 | 17.95 | 15.13
Table 3. Training results in terms of accuracy, total cost, error rate, prediction speed, training time, and model size for individual models. The results are reported separately for non-optimized and for optimized classifiers. The prediction speed and training time were provided for a group of optimized classifiers and the best-optimized classifiers in each group.
ClassifierAccuracy [%]Total CostPrediction Speed [obs/s]Training Time [s]Model Size [kB]
Fine Tree99.511091,400,0001129
Medium Tree99.51097830,000108
Coarse Tree99.41294870,00095
Linear Discriminant99.51169610,00095
Quadratic Discriminant98.53193580,00088
Binary GLM Logistic Regression99.4not applicable730,0002139,000
Efficient Logistic Regression99.41354770,0001612
Efficient Linear SVM99.31436730,0001912
Gaussian Naïve Bayes98.43562610,000147
Kernel Naïve Bayes98.92477260472953,000
Linear SVM99.41229180,000911222
Quadratic SVM99.51109360,0003192202
Cubic SVM99.51048540,0005681188
Fine Gaussian SVM99.3151626,00026931000
Medium Gaussian SVM99.599378,000908232
Coarse Gaussian SVM99.5112460,000452202
Fine k-NN99.5109813,0005924,000
Medium k-NN99.51042520014524,000
Coarse k-NN99.51155240038224,000
Cosine k-NN99.41242140074818,000
Cubic k-NN99.51041290031724,000
Weighted k-NN99.6871560016024,000
Ensemble Boosted Trees99.5105069,000376273
Ensemble Bagged Trees99.688551,00010925000
Ensemble Subspace Discriminant99.4124636,00083120
Ensemble Subspace k-NN99.510455100349492,000
Ensemble RUSBoosted Trees98.1407698,000130273
Narrow Neural Network99.51077710,00013507
Medium Neural Network99.51023780,00020728
Wide Neural Network99.510461,100,000341914
Bilayered Neural Network99.51156710,00018018
Trilayered Neural Network99.51070850,000216510
SVM Kernel99.41334120,000113811
Logistic Regression Kernel99.41406110,00054711
Optimized Classifier GroupAccuracy [%]Total CostPrediction Speed [obs/s]Training Time [s]Model Size [kB]
Tree99.510833,600,0004413
Discriminant99.511692,200,000355
SVM99.6927130,00029,587314
Naïve Bayes98.9246627051,16653,000
k-NN99.687060,000414024,000
Ensemble Bagged Trees99.689933,00032,67718,000
Ensemble RUSBoosted Trees99.693119,000237253,000
Neural Network99.6957590,00039,44431
Optimized ClassifierAccuracy [%]Total CostPrediction Speed [obs/s]Training Time [s]Model Size [kB]
Tree99.510833,200,000213
Discriminant99.511692,100,00025
SVM99.6927210,000184314
Naïve Bayes98.92466250362953,000
k-NN99.687082,0001424,000
Ensemble Bagged Trees99.689951,00031618,000
Ensemble RUSBoosted Trees99.693320,00013053,000
Neural Network99.6953600,00048331
Table 4. Performance of thirty-four classifiers evaluated using five metrics: sensitivity, specificity, precision (PPV), negative predictive value (NPV), and detection accuracy; TP—artifacts marked manually and identified automatically; TN—no artifacts marked manually and not identified automatically; FP—no artifacts marked manually but identified automatically; FN—artifacts marked manually but not identified automatically.
Classifier | TP | FP | TN | FN | Sensitivity [%] | Specificity [%] | PPV [%] | NPV [%] | Accuracy [%]
1. Fine Tree | 215,058 | 723 | 1075 | 386 | 99.8 | 59.8 | 99.7 | 73.6 | 99.5
2. Medium Tree | 215,073 | 726 | 1072 | 371 | 99.8 | 59.6 | 99.7 | 74.3 | 99.5
3. Coarse Tree | 215,029 | 879 | 919 | 415 | 99.8 | 51.1 | 99.6 | 68.9 | 99.4
Optimized Tree | 215,090 | 729 | 1069 | 354 | 99.8 | 59.5 | 99.7 | 75.1 | 99.5
4. Linear Discriminant | 214,993 | 718 | 1080 | 451 | 99.8 | 60.1 | 99.7 | 70.5 | 99.5
5. Quadratic Discriminant | 212,575 | 324 | 1474 | 2869 | 99.7 | 82.0 | 98.8 | 33.9 | 99.5
6. Binary GLM Logistic Regression | 215,159 | 914 | 884 | 285 | 99.9 | 49.2 | 99.6 | 75.6 | 99.4
7. Efficient Logistic Regression | 215,163 | 1073 | 725 | 281 | 99.9 | 40.3 | 99.5 | 72.1 | 99.4
8. Efficient Linear SVM | 215,337 | 1329 | 469 | 107 | 100.0 | 26.1 | 99.4 | 81.4 | 99.3
9. Gaussian Naïve Bayes | 212,158 | 276 | 1522 | 3286 | 98.5 | 84.6 | 99.9 | 31.7 | 98.4
10. Kernel Naïve Bayes | 213,283 | 316 | 1482 | 2161 | 99.0 | 82.4 | 99.9 | 40.7 | 98.9
Optimized Naïve Bayes | 213,291 | 313 | 1485 | 2153 | 99.0 | 82.6 | 99.9 | 40.8 | 98.9
11. Linear SVM | 215,240 | 1025 | 773 | 204 | 99.9 | 43.0 | 99.5 | 79.1 | 99.4
12. Quadratic SVM | 215,215 | 880 | 918 | 229 | 99.9 | 51.1 | 99.6 | 80.0 | 99.5
13. Cubic SVM | 215,210 | 814 | 984 | 234 | 99.9 | 54.7 | 99.6 | 80.8 | 99.5
14. Fine Gaussian SVM | 215,435 | 1507 | 291 | 9 | 100.0 | 16.2 | 99.3 | 97.0 | 99.3
15. Medium Gaussian SVM | 215,152 | 701 | 1097 | 292 | 99.9 | 61.0 | 99.7 | 79.0 | 99.5
16. Coarse Gaussian SVM | 215,189 | 869 | 929 | 255 | 99.9 | 51.7 | 99.6 | 78.5 | 99.5
Optimized Gaussian SVM | 215,170 | 653 | 1145 | 274 | 99.9 | 63.7 | 99.7 | 80.7 | 99.6
17. Fine k-NN | 214,980 | 634 | 1164 | 464 | 99.8 | 64.7 | 99.7 | 71.5 | 99.5
18. Medium k-NN | 215,184 | 782 | 1016 | 260 | 99.9 | 56.5 | 99.6 | 79.6 | 99.5
19. Coarse k-NN | 215,202 | 913 | 885 | 242 | 99.9 | 49.2 | 99.6 | 78.5 | 99.5
20. Cosine k-NN | 215,132 | 930 | 868 | 312 | 99.9 | 48.3 | 99.6 | 73.6 | 99.4
21. Cubic k-NN | 215,181 | 778 | 1020 | 263 | 99.9 | 56.7 | 99.6 | 79.5 | 99.5
22. Weighted k-NN | 215,213 | 640 | 1158 | 231 | 99.9 | 64.4 | 99.7 | 83.4 | 99.6
Optimized Weighted k-NN | 215,216 | 642 | 1156 | 228 | 99.9 | 64.3 | 99.7 | 83.5 | 99.6
23. Ensemble Boosted Trees | 215,078 | 684 | 1114 | 366 | 99.8 | 62.0 | 99.7 | 75.3 | 99.5
24. Ensemble Bagged Trees | 215,151 | 592 | 1206 | 293 | 99.9 | 67.1 | 99.7 | 80.5 | 99.6
25. Ensemble Subspace Discriminant | 215,918 | 720 | 1078 | 526 | 99.8 | 60.0 | 99.7 | 67.2 | 99.9
26. Ensemble Subspace k-NN | 215,277 | 878 | 920 | 167 | 99.9 | 51.2 | 99.6 | 84.6 | 99.5
27. Ensemble RUSBoosted Trees | 211,553 | 185 | 1613 | 3891 | 98.2 | 89.7 | 99.9 | 29.3 | 98.1
Optimized Ensemble Bagged Trees | 215,144 | 599 | 1199 | 300 | 99.9 | 66.7 | 99.7 | 80.0 | 99.6
Optimized Ensemble RUSBoosted Trees | 214,985 | 472 | 1326 | 459 | 99.8 | 73.7 | 99.8 | 74.3 | 99.6
28. Narrow Neural Network | 215,076 | 709 | 1089 | 368 | 99.8 | 60.6 | 99.7 | 74.7 | 99.5
29. Medium Neural Network | 215,080 | 659 | 1139 | 364 | 99.8 | 63.3 | 99.7 | 75.8 | 99.5
30. Wide Neural Network | 215,020 | 622 | 1176 | 424 | 99.8 | 65.4 | 99.7 | 73.5 | 99.5
31. Bilayered Neural Network | 215,076 | 688 | 1110 | 368 | 99.8 | 61.7 | 99.7 | 75.1 | 99.5
32. Trilayered Neural Network | 215,068 | 694 | 1104 | 376 | 99.8 | 61.4 | 99.7 | 74.6 | 99.5
Optimized Neural Network | 215,098 | 611 | 1187 | 346 | 99.8 | 66.0 | 99.7 | 77.4 | 99.6
33. SVM Kernel | 215,183 | 1073 | 725 | 261 | 99.9 | 40.3 | 99.5 | 73.5 | 99.4
34. Logistic Regression Kernel | 215,127 | 1089 | 709 | 317 | 99.9 | 39.4 | 99.5 | 69.1 | 99.4
