1. Introduction
Artificial intelligence (AI) and machine learning (ML) are currently key catalysts in the development life cycle of applied technologies associated with Industry 3.0. Furthermore, this revolution is setting new standards for IoT and serving as a cross-domain driving force for Industry 4.0. AI/ML has transformed the traditional data processing and prediction mechanisms of numerous crucial sectors, including finance, healthcare, energy, seismology, and future 5G/6G communication networks [1,2,3,4,5]. For example, a new industry backed by AI/ML, financial technology (FinTech), is already shaping decision-making in digital commerce, particularly in market analysis and fraud detection [1]. Similarly, in the medical field, novel frameworks, such as AI-enhanced robotic desktop automation, have been introduced [2]. They can be used to classify healthcare data across multi-domain government platforms, demonstrating the impact of AI/ML on data workflow management. AI/ML-based blockchain technologies have also benefited energy markets by providing decentralized, secure data management [3], transforming the way transparency, transactions, and trust are analyzed for resource predictions. Additionally, integrating AI/ML with earthquake risk mitigation technologies has substantially enhanced early warning systems [4] and improved disaster-resilience strategies through AI/ML-backed predictive analytics. Digital communication, including IoT, robotics, vehicular networks, and secure 5G/6G technologies, has emerged as the most critical sector for data processing through AI/ML [5]. From an application perspective, AI/ML has become a prominent ingredient in developing new engineering tools across engineering design, manufacturing, agriculture, and healthcare [6,7,8,9]. We are now witnessing paradigm shifts in data processing techniques for cybersecurity and intelligent mobile communication systems, driven by the advent of federated learning and explainable AI.
AI/ML has recently gained prominence across all disciplines. Noise, especially impulsive noise, has always been a crucial factor in modeling relevant data and designing physical systems across the applied sciences. The careful handling of noise in diverse domains determines the outcomes that can be extracted from it for applied technologies. FinTech uses noise to capture market fluctuations, such as aggregation and discretization, to model time-series data for high-frequency prices [10]. Primarily, financial tools like fact-impulsive noise stick-balancing have been discovered by integrating stochastic noise with artificial news [11]. Healthcare technologies have long relied on noise sources, such as electrical interference and motion artifacts, to enhance diagnostic features extracted from Electroencephalogram (EEG) and Electrocardiogram (ECG) imaging [12]. Medical learning and prediction algorithms heavily rely on impulsive noise to gauge uncertainty and quantify biomedical data [13]. Similarly, the accuracy of noisy phasor measurement units and sensor data is a crucial factor in triggering false alarms and mitigating noise-induced errors in power systems, such as smart grids [14]. Specifically, designing mitigation strategies while modeling different noises (Gaussian, impulsive, and colored) significantly enhances the efficiency of broadband power-line communications [15]. Various earthquake prediction technologies, such as virtual Green functions, also use ambient seismic noise in tomographic inversion to predict subsurface velocity structures [16]. Moreover, real-time monitoring of geophysical environments is fundamentally based on event discrimination and situational awareness, which are achievable through seismic impulsive noise across networks [17]. Significantly, in 5G/6G communication technologies, noise-based chips, such as ORBGRAND [18], are developed to maintain throughput by self-adjusting to fluctuating noise levels in real time for dynamic encoding and decoding. Moreover, unconventional digital communication frameworks based on inverse-impulsive-noise systems are already providing secure, fast transmission for current 5G and future 6G systems [19,20]. The presence of noise in all the above fields is not incidental. It has been explicitly used for modeling (disturbance and errors) and harnessing (data processing and filtering) to improve overall system performance.
Research gap: AI and ML algorithms provide effective predictions and classifications when dealing with conventional noisy data. Currently, unconventional impulsive noise is the most crucial factor for designing future complex devices and state-of-the-art technologies. Therefore, the growing use of impulsive noise in multidisciplinary devices also requires new AI- and ML-based prediction and estimation techniques. Despite the long-standing use of impulsive noise in cross-domain systems, no comprehensive study has examined the complexity and performance of AI and ML algorithms when processing impulsive distributions. This gap has begun to restrict the integration of current AI and ML algorithms with technologies that require fast decision-making while dealing with impulsive noise at multiple levels [21]. This limitation motivates the exploration of supervised ML algorithms for parameter estimation of impulsive noise distributions. Given the research gap, our main objectives in this investigation are to:
Evaluate the behavior of renowned supervised ML models under impulsive environments.
Obtain results for ML classifiers applicable to cross-domain physical systems dealing with impulsive noise.
Provide trade-offs between complexity and performance to help choose suitable ML models for future AI-integrated technologies.
Based on the objectives, the key contributions of this paper are as follows:
Explicit generalization of ML under α-stable noise: Unlike existing studies, which often focus on Gaussian or light-tailed noise assumptions, we explicitly generalized classical ML models under impulsive α-stable noise, which, to the best of our knowledge, has not yet been studied. We methodically analyzed classical ML algorithms (DT, RF, SVM, kNN, NB) for predicting crucial parameters governing the generation and prediction of α-stable noise.
Utilization of versatile synthetic datasets: We generated and utilized synthetic datasets for symmetric and skewed α-stable noise. Using synthetically generated α-stable noise allows precise control over distribution parameters, enabling systematic and reproducible evaluation of model complexity and performance. Therefore, the derived results and trade-offs are valid for applied domains such as energy, finance, healthcare, seismology, and cybersecurity.
Generalized complexity and performance: We sought to identify trends in accuracy and scalability across different dataset types and sizes, aiming at determining the trade-off between performance and complexity. The latter is valuable for embedded technologies operating in impulsive, noise-prone environments.
Discovered benchmarks: We derived precision–recall (PR) and receiver operating characteristic (ROC) curves to assess ML classifier performance under impulsive conditions. By averaging across multiple Monte Carlo simulations, we aimed at establishing reliable benchmarks for applicability in real-world scenarios.
Interpretability for real time: Our work goes beyond benchmarking by interpreting the observed patterns in performance and computational behavior in light of impulsive noise characteristics, model assumptions, and real-time constraints.
The paper is structured as follows:
Section 2 examines the lack of research in the domain, followed by a discussion of the adopted approach.
Section 3 explains the implemented methodology and essential concepts associated with it.
Section 4 presents in detail the results of nine tests performed related to complexity and performance. Moreover, it discusses the results and trade-offs for each task.
Lastly, Section 5 provides a focused conclusion and outlines directions for multipurpose future research.
3. Methodology
Our proposed approach comprises two phases, as shown in Figure 1. The first phase is the generation of synthetic training and test datasets of α-stable noise. The second phase, Noise Parameter Classification, is performed in two steps. The first step, Computational Complexity Analysis (CCA), consists of tests on the complexity of the selected ML classifiers for direct estimation of α-stable noise parameters. The second step is performance analysis (PA), in which we analyzed the performance of the ML classifiers for binary and multi-class classification of α-stable noise parameters using the synthetic training and testing datasets. We used tests in both steps to extract significant results, trends, and trade-offs.
Our adopted methodology focuses on increasing the likelihood that each classifier is integrated into environments that best suit its complexity and performance. Moreover, the approach aligns with the demands of future AI-integrated miniature technologies [30]. This section defines the adopted two-phase methodology and the associated tests structured under it. It also explains the concepts that form the basis of the paper.
3.1. Dataset Generation
In this phase, we explain the method for generating and regenerating synthetic datasets tailored to α-stable noise distributions.
α-Stable Symmetric and Skewed Noise Distributions
The α-stable distribution is the most widely used mathematical model for impulsive noise across disciplines. Due to its unique properties, it helps researchers model digital devices operating in noisy environments. Many areas, such as vehicular networks, body-centric communications, space, robotics, IoT, and cybersecurity, which are the backbones of future AI-embedded technologies, use α-stable distributions for channel modeling [31,32,33,34,35].
In this paper, we denote the α-stable distribution as A ~ S_α(β, γ, μ). It is mathematically defined by its characteristic function, derived in [36] and given as Equation (1), which is mainly governed by the control parameters α, β, γ, and μ, defined as follows:
The impulsiveness parameter (α), or characteristic exponent, controls the impulsiveness of the resultant distribution and lies in (0, 2];
The skewness or bias parameter (β) introduces positive or negative bias in the resultant distribution and lies in [−1, 1];
The shift, mean, or location parameter (μ) moves the resultant distribution left or right along the x-axis and lies in (−∞, ∞);
The dispersion or mixing parameter (γ) controls the spread of the resultant distribution and lies in (0, ∞).
The sign function in Equation (1) returns −1, 0, or 1 for a negative, zero, or positive argument, respectively, and the imaginary unit is the square root of −1. The α-stable distribution is classified into two types: impulsive α-stable (IαS) and skewed α-stable (SkαS). More importantly, the α-stable noise family includes some specially studied cases, i.e., Gaussian noise A_G ~ S_α(β = 0, γ, μ) with α = 2, Lévy noise A_L ~ S_α(β = 1, γ, μ) with α = 0.5, and Cauchy noise A_C ~ S_α(β = 0, γ, μ) with α = 1. To better reflect these special cases and the role of the control parameters, we have plotted probability density functions (PDFs) for key cases in Figure 2.
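For orientation, one widely used form of the α-stable characteristic function is sketched below. The parametrization (dispersion γ multiplying |t|^α), the sign convention, and the symbol j for the imaginary unit are assumptions made here for illustration; the exact form of Equation (1) in [36] may differ.

```latex
\varphi(t) =
\begin{cases}
\exp\!\left\{ j\mu t - \gamma |t|^{\alpha}
  \left[ 1 - j\beta\,\operatorname{sign}(t)\tan\!\left(\tfrac{\pi\alpha}{2}\right) \right] \right\},
  & \alpha \neq 1, \\[6pt]
\exp\!\left\{ j\mu t - \gamma |t|
  \left[ 1 + j\beta\,\tfrac{2}{\pi}\,\operatorname{sign}(t)\ln|t| \right] \right\},
  & \alpha = 1,
\end{cases}
\qquad
\operatorname{sign}(t) =
\begin{cases}
1, & t > 0, \\
0, & t = 0, \\
-1, & t < 0.
\end{cases}
```

Under this form, setting α = 2 makes the skewness term vanish and leaves a Gaussian law with variance 2γ, which is consistent with the Gaussian special case listed above.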
Based on Equation (1) and the method given in [29], we generated various synthetic training and testing datasets. The training dataset is constituted as given in Equation (2).
The values of α and β are varied to generate different datasets, with γ = 1 and μ = 0 used throughout the investigation. We fixed γ = 1 and μ = 0 to normalize the spread and remove any shift in the α-stable distribution, respectively. Fixing them is the usual practice for isolating the effect of the key parameters of interest (α and β) and ensuring a controlled and comparable experimental setup. The sign of β (Sβ), i.e., Sβ = 0 if β < 0 and Sβ = 1 if β ≥ 0, is used only to classify the negative or positive skewness of the noise. The approximate α is used only to classify high or low impulsiveness of the noise. The resulting noise sample ‘a’ is drawn from the SkαS distribution if β ≠ 0 and from the IαS distribution if β = 0. The size of the dataset is denoted as N.
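A minimal sketch of how standardized α-stable samples (γ = 1, μ = 0) and the skewness-sign label could be produced is given below. It uses the Chambers-Mallows-Stuck transformation, which is a swapped-in technique assumed here for illustration; the generator from [29] used by the authors is not reproduced. The function names `sample_alpha_stable` and `skew_sign_label` are hypothetical.

```python
import math
import random


def sample_alpha_stable(alpha, beta, n, seed=0):
    """Draw n standardized alpha-stable samples (gamma = 1, mu = 0).

    Uses the Chambers-Mallows-Stuck transformation (an assumption of this
    sketch, not necessarily the generator used in the paper).
    """
    if not (0 < alpha <= 2 and -1 <= beta <= 1):
        raise ValueError("need 0 < alpha <= 2 and -1 <= beta <= 1")
    rng = random.Random(seed)
    half_pi = math.pi / 2
    samples = []
    for _ in range(n):
        u = rng.uniform(-half_pi, half_pi)  # uniform angle
        w = rng.expovariate(1.0)            # unit-mean exponential
        if alpha != 1:
            t = beta * math.tan(half_pi * alpha)
            b = math.atan(t) / alpha
            s = (1 + t * t) ** (1 / (2 * alpha))
            x = (s * math.sin(alpha * (u + b)) / math.cos(u) ** (1 / alpha)
                 * (math.cos(u - alpha * (u + b)) / w) ** ((1 - alpha) / alpha))
        else:
            x = (2 / math.pi) * (
                (half_pi + beta * u) * math.tan(u)
                - beta * math.log(half_pi * w * math.cos(u) / (half_pi + beta * u))
            )
        samples.append(x)
    return samples


def skew_sign_label(beta):
    """S_beta as defined in the text: 0 if beta < 0, otherwise 1."""
    return 0 if beta < 0 else 1
```

With α = 2 the sampler reduces to a Gaussian with variance 2, and with α = 1, β = 0 it reduces to the tangent of a uniform angle (Cauchy), which gives two quick sanity checks on the special cases.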
Since real-life technological devices operate in noisy channel environments, we focused on using a testing dataset composed of a mixture of α-stable noise and channel noise, expressed as in Equation (3). We used similar parameter values as in the training dataset. However, the resulting noise sample ‘c’ comes from the relationship in Equation (4), where g is the channel noise sample produced from the Gaussian distribution A_G ~ S_α(β = 0, γ, μ) with α = 2. To quantify the intensity of the mixture defined in Equation (4), we used the mixed signal-to-noise ratio (MSNR) in decibels, defined in Equation (5). It is the standard criterion for quantifying the Gaussian channel noise corrupting α-stable random noise samples [29,36]. Many studies have used MSNR levels reaching −10 dB, spanning the most severe to less severe conditions, as the channel noise. Importantly, we fixed the MSNR at a single severe level to test ML classifier performance under severe external noise degradation. The MSNR provides the dispersion ratio of the mixture comprising IαS or SkαS noise A ~ S_α(β, γ, μ) and Gaussian noise A_G ~ S_α(β = 0, γ, μ) with α = 2. We used it to analyze the effects of the Gaussian channel on ML classifiers, especially when they are embedded in technological devices operating in Gaussian channels.
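The mixing step can be sketched as follows, under two loudly stated assumptions: that the mixture in Equation (4) is additive (c = a + g), and that the MSNR is ten times the base-10 logarithm of the ratio of the α-stable dispersion to the Gaussian dispersion. Neither formula is quoted from the text, and the function names are hypothetical.

```python
import math
import random


def gaussian_dispersion_for_msnr(gamma_stable, msnr_db):
    """Gaussian dispersion that yields the requested MSNR.

    Assumes MSNR_dB = 10 * log10(gamma_stable / gamma_gauss); this ratio of
    dispersions is an assumption of the sketch.
    """
    return gamma_stable / (10 ** (msnr_db / 10))


def mix_with_gaussian(stable_samples, gamma_gauss, seed=0):
    """Return c = a + g for each alpha-stable sample a (additive mix assumed).

    A Gaussian written as an alpha = 2 stable law with dispersion gamma_gauss
    has variance 2 * gamma_gauss, hence the standard deviation used below.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(2 * gamma_gauss)
    return [a + rng.gauss(0.0, sigma) for a in stable_samples]
```

For example, with γ = 1 and an MSNR of −10 dB, this convention gives a Gaussian dispersion of 10, i.e., channel noise ten times more dispersed than the α-stable component.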
3.2. Noise Parameter Classification
Phase 2 comprises Computational Complexity Analysis (CCA) and performance analysis (PA). We used DT, RF, SVM, kNN, and NB for binary and multi-class classification. The binary classification is based on estimating the sign of β (Sβ) and the approximate α, given in Equations (6) and (7), respectively. In contrast, multi-class classification is based on estimating the exact parameter values, given in Equations (8) and (9).
The rationale behind the formulation: The binary classification of Sβ in Equation (6) helps determine whether the test data is positively or negatively skewed. Predicting the sign of skewness (Sβ) rather than the exact skewness value is critical for real-time threshold-based devices. It helps categorize the severity of impulsive noise in communication channels (e.g., severe or not severe) or detect anomaly types in sensor networks. It also helps apply AI models for rapid adaptation in both positive and negative directions in financial and environmental applications [37,38]. Similarly, binary classification of the approximate α in Equation (7), rather than the actual value of impulsiveness, enables modern devices to apply rapid impulsive or non-impulsive processing mechanisms in medical and cybersecurity scenarios [39,40]. Both binary classifications in Equations (6) and (7) help build robust, accurate AI-embedded models for multidisciplinary technologies such as healthcare, robotics, social behavior, and space systems [41,42,43,44]. Multi-class classification is required when incorporating AI-driven signal filters to stabilize real-world systems that cannot rely on strong prior assumptions about noise [45,46].
Following this approach, we trained the ML classifiers on the training dataset to classify the targets in Equations (6)–(9) from the testing dataset. Below, we explain the mechanism used by each classifier to process noise samples governed by the SkαS and IαS distributions. These descriptions correspond to the experimental evaluation carried out in the next section.
3.2.1. Decision Tree (DT)
Based on the classification and regression trees (CART) algorithm, the DT classifier builds a hierarchical tree. We focused on minimizing node impurity when splitting the tree. The minimization criterion relies on the Gini index (GI), given in Equation (10), where DT calculates the probability of each class i in the region under consideration. The process continues by splitting the dataset (parent node) into two child sub-datasets (subtrees/child nodes), which correspond to the left and right subtrees. Then, DT calculates the split quality from the weighted GI (GIS), given in Equation (11), where j1, j2, and j are the sizes of the two child nodes and the parent node, respectively. The DT continues to progress in the direction of the split with the lowest GIS. In this way, we trained DT to classify the targets given in Equations (6)–(9) from the testing dataset.
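The impurity computations described for the DT can be sketched as below, assuming the standard CART definitions: the Gini index is one minus the sum of squared class fractions, and the split score weights each child's Gini index by its share of the parent's samples. The helper names and the single-feature threshold scan are illustrative choices, not the authors' implementation.

```python
from collections import Counter


def gini_index(labels):
    """Gini impurity: 1 - sum_i p_i^2, with p_i the class fractions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(left, right):
    """Size-weighted impurity of a split: (j1/j)*GI(left) + (j2/j)*GI(right)."""
    j1, j2 = len(left), len(right)
    j = j1 + j2
    return (j1 / j) * gini_index(left) + (j2 / j) * gini_index(right)


def best_threshold(values, labels):
    """Scan thresholds on one feature and keep the split with the lowest GIS."""
    best = (None, float("inf"))
    for thr in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= thr]
        right = [y for x, y in zip(values, labels) if x > thr]
        score = weighted_gini(left, right)
        if score < best[1]:
            best = (thr, score)
    return best
```

For instance, `best_threshold([1, 2, 3, 4], [0, 0, 1, 1])` selects the threshold 2, since it separates the two classes perfectly and gives a weighted Gini index of zero.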
3.2.2. Random Forest (RF)
Extending the basic hierarchical structure of DT given in Equations (10) and (11), we adopted an ensemble method to build an ML model with improved accuracy. We ensembled many DTs, where each k-th tree is trained on its own bootstrap sample. Every DT increases the diversity of the ensemble by considering a random subset of features. The main RF classification for a sample is carried out as in Equation (12), which aggregates the predictions of the k-th DT over the F trees used. We ensured that the RF maintains the interpretability of DTs while achieving ensemble accuracy, as each DT uses the same GI and GIS criteria specified in Equations (10) and (11). We used F = 100, following standard default settings, to ensure a good balance between predictive performance and computational efficiency.
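The two ensemble ingredients named above, bootstrap resampling and aggregation of tree predictions, can be sketched as follows. The sketch assumes that Equation (12) is a majority vote over the F trees, which is the usual RF classification rule but is not confirmed by the visible text; trees are represented as plain callables for illustration.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (one bootstrap sample per tree)."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(tree_predictions):
    """Return the class predicted by most trees (first seen wins ties)."""
    return Counter(tree_predictions).most_common(1)[0][0]


def forest_predict(trees, x):
    """Aggregate the votes of F trees, each a callable mapping x to a class."""
    return majority_vote([tree(x) for tree in trees])
```

As a usage example, three hypothetical one-rule trees such as `lambda x: int(x > 0)`, `lambda x: int(x > 1)`, and `lambda x: int(x > -1)` vote 1, 0, and 1 for the input 0.5, so the forest returns class 1.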
3.2.3. k-Nearest Neighbors (kNN)
Based on standardized feature vectors, we used kNN to classify the targets in Equations (6)–(9) for a test instance by aggregating the weights of its k nearest neighbors from the training dataset, according to their Euclidean distance. Hence, we performed classification using the weighted averaging principle given in Equation (13), in which each neighbor receives a distance-based weight and a small constant ϵ avoids division by zero when a distance is 0. We ensured that kNN considers all dimensions by calculating the distance function from z-score-normalized feature vectors. We used the default configuration for the number of neighbors k to ensure consistency with the standardized experimental setup used across all models.
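A minimal sketch of a distance-weighted kNN vote is shown below. It assumes inverse-distance weights of the form 1 / (d + ϵ) and treats the weighted aggregation as a per-class vote; the values k = 5 and ϵ = 1e-8 are placeholders chosen here, since the paper's values are not visible in the text.

```python
import math
from collections import defaultdict


def knn_predict(train_x, train_y, query, k=5, eps=1e-8):
    """Distance-weighted k-nearest-neighbor vote.

    Each of the k nearest neighbors contributes weight 1 / (d + eps), where d
    is its Euclidean distance to the query; eps (value assumed here) avoids
    division by zero when a neighbor coincides with the query.
    """
    neighbors = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    votes = defaultdict(float)
    for distance, label in neighbors:
        votes[label] += 1.0 / (distance + eps)
    return max(votes, key=votes.get)
```

Because every prediction scans all stored training points, the cost of a single query grows with the size of the training set, which is the behavior the inference-time experiment later attributes to kNN.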
3.2.4. Support Vector Machine (SVM)
We relied on the input features for SVM to provide non-linear mappings to every target parameter in Equations (6)–(9). Then we performed classification for each target by approximating the underlying function given in Equation (14).
Here, the dual variables were obtained during model fitting, and the input feature vectors were obtained while training on the training dataset. We achieved non-linear learning for the trained SVM using the kernel function through implicit transformation into a high-dimensional feature space. We selected the radial basis function (RBF) kernel after testing various kernels because of its superior adaptability to IαS and SkαS distributions. We used 5-fold cross-validation to optimize hyperparameters, setting the box constraint C = 10 and the kernel coefficient γ = 0.01 (distinct from the dispersion parameter γ of the noise). Afterwards, the SVM was able to classify the targets from the testing dataset.
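The kernel expansion behind an RBF-kernel SVM can be sketched as below, assuming the usual form in which the decision value is a bias plus a sum of signed dual coefficients times kernel evaluations against the support vectors. The kernel coefficient 0.01 comes from the text; the support vectors and coefficients in the usage note are made-up numbers, and the function names are hypothetical.

```python
import math


def rbf_kernel(x, z, gamma=0.01):
    """Radial basis function kernel exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def svm_decision(query, support_vectors, dual_coefs, bias, gamma=0.01):
    """Kernel expansion: bias + sum_i coef_i * K(x_i, query).

    dual_coefs stands for the signed dual variables found during fitting;
    they are inputs here because this sketch does not perform the training.
    """
    return bias + sum(
        c * rbf_kernel(sv, query, gamma)
        for sv, c in zip(support_vectors, dual_coefs)
    )
```

For example, with support vectors (0, 0) and (10, 0), coefficients +1 and −1, and zero bias, the decision value at the origin is 1 − exp(−1), which is positive, so the query is assigned to the class of the first support vector.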
3.2.5. Naïve Bayes (NB)
We classified the target parameters in Equations (6)–(9) by NB after calculating expectations of the posterior probabilities given the input features. For a given target, NB performs the approximation given in Equation (15).
Due to known dependencies between the variables involved, we relaxed the standard assumption of conditional independence in NB. Empirical calculations indicate that NB provided reasonable posterior estimates. We performed the training on the training dataset and the classifications on the testing dataset. This resulted in lightweight estimation of the targets under a resource-constrained scenario.
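For reference, a textbook Gaussian naive Bayes classifier is sketched below. It keeps the standard conditional-independence assumption and Gaussian per-feature likelihoods, so it illustrates the baseline that the paragraph above departs from rather than the authors' relaxed variant; the function names and the variance floor are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(features, labels, var_floor=1e-9):
    """Store the prior plus per-feature mean and variance for each class."""
    grouped = defaultdict(list)
    for x, y in zip(features, labels):
        grouped[y].append(x)
    model = {}
    for y, rows in grouped.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(zip(*rows), means)
        ]
        model[y] = (n / len(labels), means, variances)
    return model


def predict_gaussian_nb(model, x):
    """Pick the class with the largest log-posterior (up to a constant)."""
    def log_posterior(params):
        prior, means, variances = params
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )
    return max(model, key=lambda y: log_posterior(model[y]))
```

The fitted model holds only a prior, a mean, and a variance per class and feature, which is why its size does not depend on the number of training samples.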
Based on the two-phase methodology described, we conducted a thorough experimental evaluation of the ML classifiers listed above. The derived results and analysis are given in the next section.
4. Experimental Evaluation
In this section, we present the results of the experimental evaluation in two parts: CCA and PA. The CCA comprises four experiments that highlight the computational complexity of ML classifiers when classifying IαS and SkαS noise parameters. The PA consists of five experiments that reflect the classification abilities of ML classifiers for binary and multi-class classification of the parameters α and β. Before presenting the results of CCA and PA, we summarize the standard settings for all experiments:
Training dataset: Synthetic A ~ S_α(β, γ = 1, μ = 0) datasets with N samples.
Testing dataset: Synthetic A ~ S_α(β, γ = 1, μ = 0) datasets with N samples, corrupted by Gaussian noise A_G ~ S_α(β = 0, γ, μ = 0) with α = 2.
Channel noise: The Gaussian dispersion was selected to set the channel noise in the testing dataset to the fixed MSNR level.
Statistical significance: We performed 20 independent Monte Carlo runs (with different random training subsets in each) and reported the mean performance with standard deviations and confidence intervals to assess statistical reliability.
Training fraction: We set it to 0.8 (i.e., models trained on 80% of the training dataset, unless stated otherwise) in experiments 1–4 for CCA. Experiments 5–9 were conducted with separate training and testing datasets, except for experiment 7, which was based on different training fractions.
ML Classifiers: DT, RF, SVM, NB, kNN.
Employed Platform: We carried out experiments 1–9 on macOS 12 installed on an Apple MacBook Pro (8-core CPU @ 2.3 GHz, 16 GB RAM). In phase 1, we generated the training and testing datasets using libraries and custom α-stable noise scripts in MATLAB R2023b. In phase 2, we conducted the experiments in Python 3.9 using scikit-learn 1.2 (for model training and prediction). We applied NumPy 1.21 and SciPy 1.7 for binary and multi-class classification in PA.
Although absolute timings depend on implementation and hardware (e.g., Python, scikit-learn, Apple M1 CPU), the relative trends in training/inference time and resource usage across models remain consistent and generalizable, reflecting the underlying algorithmic complexity. We ensured consistency and fair comparison across ML classifiers by training them with default hyperparameters. This approach reflects real algorithmic performance under α-stable noise by avoiding bias introduced by model-specific tuning.
Note: While this study is empirical in its core methodology, each experimental finding is accompanied by theoretical interpretation grounded in model design principles. This dual approach allows us to connect observed behaviors to the underlying mechanisms of each algorithm under impulsive noise conditions.
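The Monte Carlo reporting described in the settings (mean, standard deviation, and confidence interval over repeated runs) can be sketched as follows. The normal-approximation multiplier z = 1.96 for a 95% interval is an assumption of this sketch; the authors' exact interval construction is not stated, and a Student-t quantile would be slightly wider for 20 runs.

```python
import math
import statistics


def mean_with_ci(values, z=1.96):
    """Return (mean, sample std, half-width of a normal-approximation CI)."""
    mean = statistics.mean(values)
    std = statistics.stdev(values) if len(values) > 1 else 0.0
    half_width = z * std / math.sqrt(len(values))
    return mean, std, half_width
```

Calling `mean_with_ci` on the 20 per-run accuracies or timings of one classifier yields the center and the error bar for a single point on the corresponding curve.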
4.1. Computational Complexity Analysis (CCA)
The CCA examines how each ML classifier scales in terms of computational resource usage by measuring training time, inference time, peak memory usage, and model storage size as the dataset size N increases. In experiments 1–4, we tested each aspect of resource usage individually. We ran each measurement multiple times, which reduced variability from system noise and provided average readings with tight confidence intervals as reliable complexity estimates. We focused on discovering computational cost trade-offs to categorize ML classifiers as complex or straightforward. We discuss every experiment individually.
4.1.1. Experiment 1: Training Time vs. Dataset Size
We focused on discovering the scaling trend of ML classifiers by measuring training time as the dataset size increased. We used training datasets with N ranging from 10 k to 50 k samples and averaged the wall-clock training time over 20 Monte Carlo runs for stable results. A clear separation in training-time scalability is observed, classifying ML classifiers as slow or fast. Based on the results in Figure 3, we found kNN, DT, and NB to be the quickest, with training times in the millisecond (ms) range.
NB and kNN took approximately 1.1–2.8 ms and 1.8–8.5 ms, respectively, reflecting simple closed-form estimates. Similarly, DT showed a linear relationship with the number of splits developed and finished training in roughly 2.0–7.2 ms. In contrast, we found moderate scalability for SVM, with training times ranging from 10 ms to 50 ms due to its expensive iterative optimization of the margin-based objective. The heaviest model is RF, with a training time of 0.15–0.62 s. We believe that this is due to the need to construct and aggregate 100 trees. Despite the variation, we observed low timing variance across Monte Carlo runs for all models, with confidence intervals ranging from 10^−5 for the fastest models to 10^−3 for the slower models, indicating reliable measurements. These observed distinctions are essential for the development of small- or large-scale time-sensitive embedded technologies, with the fundamental trade-off lying between fast training and complexity.
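A wall-clock timing harness of the kind described in this experiment might look like the sketch below. The `fit` callable is a stand-in for any classifier's training call, and the use of `time.perf_counter` is an assumption, since the timer used by the authors is not named.

```python
import statistics
import time


def time_training(fit, n_runs=20):
    """Average wall-clock time of fit() over n_runs repetitions, in seconds."""
    durations = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fit()
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations), durations


# Hypothetical stand-in for a classifier's training call.
mean_seconds, runs = time_training(lambda: sorted(range(10_000), reverse=True))
```

The same harness can wrap a prediction call instead of a training call to obtain inference times.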
4.1.2. Experiment 2: Inference Time vs. Dataset Size
We focused on measuring the prediction times of the trained ML models as N scales from 10 k to 50 k. As shown in Figure 4, DT and NB maintained prediction times below 0.6 ms for all N, thanks to shallow tree paths and closed-form computations. We found SVM operating with medium latency, ranging approximately from 2.9 ms to 7.3 ms, as a growing set of support vectors takes part in the kernel evaluations during prediction. In contrast, the need to aggregate 100 trees increased the inference time of RF from 5.5 ms to 18.4 ms. Moreover, kNN scaled poorly, from 3.4 ms to 25.9 ms, and was the slowest of all.
The inference cost of kNN is O(N) per query due to its lazy learning nature. Conversely, parametric models such as NB and DT achieved low-latency inference. We observed high repeatability across all models, as indicated by tight confidence intervals. We believe DT and NB are preferable for adaptive real-time systems that cannot tolerate decision-making bottlenecks. Technologies dealing with large-scale IαS and SkαS noise data can face decision risks due to slow inference.
4.1.3. Experiment 3: Peak Memory Usage vs. Dataset Size
We quantified each ML classifier’s peak memory usage during training. In Figure 5, the same datasets, ranging from 10 k to 50 k samples, were used with Python’s tracemalloc module to derive the maximum allocated memory, averaged over 20 Monte Carlo trials.
Regardless of the different absolute levels, we observed almost linear growth with N. The most memory-friendly classifier was NB, requiring just ~0.44 MB and ~2.03 MB for the 10 k and 50 k datasets, respectively, as it derives its classification from basic statistical functions. We observed DT, kNN, and, surprisingly, SVM as middle-tier models in terms of memory usage. DT and kNN grew from ~0.72 MB to ~3.56 MB and from ~0.75 MB to ~3.71 MB, respectively. The reason is that the memory of a tree scales with its number of nodes, which grows with N. Similarly, storing all instances produces linear memory growth for kNN. SVM, despite relying on cached kernel structures and support vectors, consumed ~0.69 MB to ~3.41 MB, slightly less than kNN and DT. On the other hand, we found RF to be the most memory-intensive classifier, ranging from ~1 MB to ~4.72 MB, since it must build an ensemble of 100 trees for classification.
We verified the consistency and reproducibility of the results through tight confidence intervals of ≤ 0.01 MB. Importantly, the memory consumption of all tested ML classifiers remained below 5 MB, making them suitable for embedded systems. We have highlighted the complexity and memory trade-offs: RF shows increased memory consumption, while NB and DT demonstrate their suitability for future nanodevices. This experiment is significant for applied fields such as communications or IoT sensing, where strict resource constraints coincide with the presence of IαS and SkαS noise.
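Peak memory can be captured with the standard-library `tracemalloc` module roughly as sketched below. The helper name `peak_memory_mb` and the list allocation used as a stand-in for training are illustrative assumptions; note that `tracemalloc` only traces allocations made through Python's memory allocator.

```python
import tracemalloc


def peak_memory_mb(task):
    """Run task() and return the peak traced Python allocation in MB."""
    tracemalloc.start()
    try:
        task()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak / (1024 * 1024)


# Hypothetical stand-in for training: allocate a list of 100,000 entries.
peak = peak_memory_mb(lambda: [0] * 100_000)
```

Repeating the call over several runs and dataset sizes, and averaging the results, mirrors the measurement procedure described for this experiment.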
4.1.4. Experiment 4: Model Size vs. Dataset Size
Apart from peak memory usage during training, real-world deployment depends on the actual storage size of trained ML classifiers on embedded devices. Therefore, in experiment 4, we recorded the serialized file size of each trained model as a function of dataset size. Figure 6 shows the results. We observed negligible variance across Monte Carlo runs due to the deterministic nature of serialization.
From a deployment perspective, we found NB and DT to be space-efficient and independent of N, requiring only 0.002 MB each. The reason is that once a tree structure is developed, DTs store only a fixed set of splits, independent of the data, and NB stores only the mean and variance of each feature. Similarly, the SVM maintains a size close to ~0.003 MB, because regularization keeps the number of support vectors needed for classification small and independent of N. Hence, SVM, DT, and NB are ideal for embedded technologies when model sizes independent of dataset size are required for system-on-chip development. In contrast, we observed growing storage requirements for RF and kNN, with model size clearly dependent on N. Since the depth of the ensemble trees and the number of nodes depend on dataset size, RF showed moderate sublinear growth from ~0.44 MB to ~0.63 MB as we increased N from 10 k to 50 k. However, RF will plateau once the trees reach their maximum depth. It is still more space-efficient than kNN, which scaled linearly from ~0.42 MB to ~2.07 MB in direct proportion to N because it stores all data points. In summary, we observed that ensemble (RF) and instance-based (kNN) ML classifiers require more storage space than parametric ML classifiers (SVM, DT, and NB). This fact is cost-critical where ML classifier portability is required for resource-limited technologies.
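The contrast between a model that stores summary statistics and one that stores every training point can be illustrated with the sketch below. It assumes `pickle` as the serialization format, which the text does not specify, and uses two hypothetical toy models rather than the actual classifiers.

```python
import pickle


def serialized_size_mb(model):
    """Size of the pickled object in MB (pickle is an assumed format)."""
    return len(pickle.dumps(model)) / (1024 * 1024)


def summary_model(samples):
    """Parametric-style model: keeps only the mean and variance."""
    n = len(samples)
    mean = sum(samples) / n
    return {"mean": mean, "var": sum((s - mean) ** 2 for s in samples) / n}


def instance_model(samples):
    """Instance-based model: keeps every training sample, as kNN does."""
    return {"data": list(samples)}
```

Serializing both toy models for 1000 and 5000 samples shows the summary model staying at a fixed size while the instance model grows roughly in proportion to the number of samples.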
CCA Summary: We found that the key factors in experiments 1–4 (training, inference, memory, and storage) help differentiate simpler models from complex models. NB and DT are highly efficient in terms of computational complexity. We declare NB to be the lightest classifier, offering fast training and minimal storage requirements. Similarly, DT also offers rapid training and model compactness, but at a higher training cost than NB. kNN is also simple during and after training, but its instance-based nature affects its inference time and model size. Although the CCA is based on empirical results, the observed trends align well with theoretical expectations. We noticed this at various points: (i) RF training cost scales with the number of trees and data splits, as expected for ensemble methods. (ii) Due to the instance-based nature of kNN, its inference time grows with dataset size. (iii) The closed-form updates and minimal model storage of NB result in near-constant model size. These indicators show that the behavior of the ML models is not an implementation artifact but a consistent reflection of their known computational structure.
Modeling perspective: The ensemble process and kernel-based classification in RF and SVM demanded more resources. In particular, the steepest increase in training cost and the linear growth in memory usage with increasing dataset size make RF the least computationally efficient among all ML classifiers. Note that the narrow confidence intervals (low measurement variability) in experiments 1–4 support the reliability of the observations. In practical deployment scenarios, the reported results can be used to anticipate, e.g., that embedding kNN instead of NB will require more space, or that DT can be trained more rapidly in adaptive environments than SVM. We believe that such results and trade-offs are crucial for the design of AI-embedded systems that deal with IαS and SkαS noise in resource-constrained scenarios. Moreover, they will help in balancing decisions between performance and complexity.
4.2. Performance Assessment (PA)
In this step of Noise Parameter Classification, we carried out Performance Assessment to analyze the classification performance of ML classifiers while predicting the control parameters (α, β) of the α-stable noise. In experiments 5–9, we perform binary classification by predicting the sign of β (Sβ) and the approximate α to categorize the underlying noise as positive/negative SkαS and high/low IαS, respectively. Moreover, we tried to precisely predict the noise parameters themselves using multi-class classification to find the exact parameter values used to generate the data. In contrast to the CCA in step 1, which used only the clean training dataset, we used the training dataset for training and the testing dataset at the fixed MSNR level for testing the classification abilities of DT, SVM, RF, NB, and kNN to predict SkαS and IαS noise distributions corrupted by Gaussian noise. Experiments 5–9 assess the performance of the ML classifiers for predicting the noise control parameters. Key evaluation criteria include accuracy, F1-score, precision–recall (PR) curves, and receiver operating characteristic (ROC) curves. In each experiment, we incorporated low-difficulty (binary classification) and high-difficulty (multi-class classification) tasks to derive each classifier’s strengths and weaknesses.
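The scalar evaluation criteria listed above can be computed from predicted and true binary labels as in the following sketch, which uses the standard definitions of accuracy, precision, recall, and F1-score. The function name and the choice of 1 as the positive class are assumptions made for illustration.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1-score for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": correct / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}
```

For the true labels [1, 1, 1, 0, 0, 0] and predictions [1, 1, 0, 0, 0, 1], there are two true positives, one false positive, and one false negative, so accuracy, precision, recall, and F1-score all equal 2/3.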
4.2.1. Experiment 5: Accuracy vs. Dataset Size
In this experiment, we investigated how the binary and multi-class classification accuracy of the ML classifiers evolves as the training dataset size increases from 10 k to 50 k. All sub-experiments, shown in Figure 7, are supervised classification tasks that assess the performance of DT, SVM, RF, NB, and kNN. The results are averaged over 20 Monte Carlo runs and reported with 95% confidence intervals.
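As an illustration of the averaging step, the sketch below computes a mean and a 95% confidence half-width from per-run scores. It assumes a Student-t interval, which is one common choice and not necessarily the procedure used here. The name `mean_with_ci` is hypothetical, and the default critical value only applies to exactly 20 runs.

```python
import math
import statistics

# Two-sided 95% Student-t critical value for 19 degrees of freedom (20 runs).
T_CRIT_95_DF19 = 2.093


def mean_with_ci(values, t_crit=T_CRIT_95_DF19):
    """Return (mean, half_width) of a t-based confidence interval.
    The default t_crit is only valid for exactly 20 runs."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two runs")
    mean = statistics.fmean(values)
    half_width = t_crit * statistics.stdev(values) / math.sqrt(n)
    return mean, half_width
```

A narrow half-width relative to the mean corresponds to the low measurement variability discussed in the text.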
We have also summarized the results in Table 1. In the binary classification of the skewness sign (β) and the impulsiveness level (α), DT and RF achieved 100% accuracy across all dataset sizes. This shows that positive/negative SkαS and low/high IαS noise distributions can be separated, even in the presence of severe Gaussian noise, if an appropriate threshold is selected. The SVM accuracy rose from ~74% at 10 k to 100% at 50 k for the skewness sign and from ~72% at 10 k to ~97% at 50 k for the impulsiveness level. Following the same trend, kNN improved from 73% to 100% for the skewness sign and from ~75% to ~98% for the impulsiveness level. NB lagged at 10 k, where it reached ~83% for the skewness sign, before reaching 100% at larger sizes. In contrast, its accuracy stayed flat at ~51% for the impulsiveness level, showing its inability to capture the impulsive nature of IαS noise.
The ML classifiers struggled with the multi-class classification of β and α. For β, SVM performed comparatively better than the others, achieving an accuracy of ~39.9% at 50 k, up from ~27.7% at 10 k. The other classifiers improved but hovered between 34 and 35%. The case of α is similar: none of the classifiers exceeded an accuracy of ~40% at any dataset size. NB, DT, and RF showed similar and stable accuracy of ~37% across all dataset sizes. SVM and kNN also performed identically, starting at ~23% at 10 k and reaching ~37% at 50 k.
We concluded that most ML classifiers can perform binary classification of SkαS and IαS noise parameters even under severe corruption by Gaussian noise. Notably, RF and DT consistently achieved ~100% accuracy even with the smallest dataset. SVM and kNN can match this performance, but only with larger datasets, whereas NB is inconsistent, especially in predicting IαS noise. The results differ significantly for multi-class classification: even with 50 k samples, all classifiers saturated below 40% and 35% when predicting SkαS and IαS noise parameters, respectively. These results can guide the choice of ML classifier for a given technology, depending on the accuracy it requires.
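The remark that the binary classes become separable once an appropriate threshold is selected can be illustrated with a depth-one decision tree (a decision stump). This is only a sketch under stated assumptions: it works on a single hypothetical scalar feature, and it is not the DT or RF configuration used in the experiments.

```python
def fit_stump(features, labels):
    """Find the threshold and polarity on one scalar feature that maximise
    training accuracy for binary labels in {0, 1}.
    Returns (accuracy, threshold, polarity)."""
    values = sorted(set(features))
    # Candidate cut points: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])] or values
    best = (-1.0, None, None)
    for thr in candidates:
        for polarity in (1, 0):  # label predicted when feature > thr
            correct = sum(
                (polarity if x > thr else 1 - polarity) == y
                for x, y in zip(features, labels)
            )
            acc = correct / len(labels)
            if acc > best[0]:
                best = (acc, thr, polarity)
    return best


def predict_stump(x, threshold, polarity):
    """Apply the learned threshold rule to one feature value."""
    return polarity if x > threshold else 1 - polarity
```

When the two classes do not overlap along the feature, a single cut reaches 100% training accuracy, which matches the behavior reported for the tree-based models in the binary tasks. When the classes overlap, as in the multi-class setting, no single cut can do so.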
4.2.2. Experiment 6: F1 Score vs. Dataset Size
Based on the derived accuracies, we focused on testing the ML classifiers' F1-scores as the size of the SkαS and IαS noise datasets increased. For experiment 6, Figure 8 shows the F1-score performance of DT, SVM, RF, NB, and kNN as a function of dataset size. Aside from some model variances and failures, we observed trends similar to those in the previous experiment.
The binary classifications of the skewness sign (β) and the impulsiveness level (α), given on the left side of Figure 8, reflect flawless performance of DT and RF, which achieved F1 = 1.00 across dataset sizes of 10 k–50 k. The tree-based models appear insensitive to the positive/negative bias and high/low impulsiveness of the SkαS and IαS noises. SVM and kNN performed similarly to each other, despite minor fluctuations at a few points. They initially obtained F1 ≈ 0.74–0.75 at 10 k, which later improved to F1 ≈ 0.99–1.00 as the dataset size approached 50 k. NB came close to the performance of SVM and kNN, at least for the skewness sign, but it proved extremely fragile in classifying high/low impulsiveness in IαS noises. We found NB to be a poor fit for the non-linear, biased, and impulsive structure of the α-stable distribution. In the multi-class classification shown on the right side of Figure 8, F1-scores remain low throughout, exposing the weak performance of ML classifiers in predicting the exact SkαS and IαS noise parameters. For β, all ML classifiers followed a similar trend with a marginal change in overall performance, starting from F1 ≈ 0.29–0.30 at 10 k and saturating at F1 ≈ 0.35–0.36 at 50 k. For α, however, the classifiers fluctuated more across dataset sizes, with overall F1-scores ranging from F1 ≈ 0.20 at 10 k to F1 ≈ 0.36 at 50 k. We did not observe a consistent link between performance and data size, underscoring that predicting the exact noise parameter remains a challenge.
To conclude the experiment, we believe binary classification of datasets governed by SkαS and IαS noises is easier than multi-class classification. Additionally, tree-based models are more robust and better suited to applied technologies that rely on threshold-based decision-making. However, classifiers like NB are not suitable for devices operating in complex or non-linear settings.
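For reference, the F1-score used in this experiment can be computed as in the sketch below. It assumes macro averaging for the multi-class case (an unweighted mean of one-vs-rest F1-scores), since the averaging scheme is not stated here. The function names are hypothetical.

```python
def f1_per_class(y_true, y_pred, positive):
    """F1 for one class treated as 'positive' (one-vs-rest)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_per_class(y_true, y_pred, c) for c in classes) / len(classes)
```

Because macro averaging weights every class equally, a classifier that confuses neighboring parameter values is penalized on each affected class, which is consistent with the low multi-class scores reported above.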
4.2.3. Experiment 7: F1 Score vs. Training Data Fraction
We performed this experiment to quantify how the ML classifiers scale with the amount of training data. Using the largest dataset, i.e., 50 k, we trained DT, SVM, RF, NB, and kNN on fractions of it (10% to 100%). The resulting F1-scores for binary and multi-class classifications against the training data percentage are reported in Figure 9.
As in the previous experiments, we carried out binary classifications of the skewness sign (β) and the impulsiveness level (α). The results are presented on the left side of Figure 9. For the skewness sign, following the trend observed earlier, RF and DT reached F1 ≈ 1.0 using only 10% of the dataset. SVM and kNN achieved similar F1-scores with 20% of the dataset, followed by NB, which reached this level at 40%. For the impulsiveness level, DT and RF again confirmed their superior stability and data efficiency over the other ML classifiers by achieving F1 ≈ 1.0 throughout. SVM and kNN followed the same path but achieved F1 ≈ 0.7 with 20% of the dataset, saturating at 80%. NB underperformed across all data fractions, with F1 remaining below ≈ 0.40. In contrast, in the more complex multi-class classification of β and α, we observed near-chance performance for all ML classifiers, even with 100% of the dataset. The F1-scores stayed between ~0.18 at 10% and ~0.28 at 100% for β, and between ~0.22 at 10% and ~0.33 at 100% for α. However, the ranking of the ML classifiers remains the same: DT and RF are on top, followed by SVM and kNN, with NB lagging.
In summary, we conclude that binary classification was straightforward for the considered ML classifiers with α-stable noise distributions. However, multi-class classification offers limited returns as the training fraction increases, and performance tends to plateau after a point. NB turned out to be the worst-performing classifier, and it remained insensitive to increased data for most SkαS and IαS noise parameters.
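The training-fraction protocol can be sketched as follows. This is an illustration under stated assumptions, not the experimental code: the subsets are stratified and nested, the data are (feature, label) pairs with one scalar feature, and a toy midpoint-threshold classifier stands in for DT, SVM, RF, NB, and kNN. All function names are hypothetical.

```python
import random


def fit_midpoint(xs, ys):
    """Toy 1-D classifier: threshold halfway between the two class means."""
    mean0 = sum(x for x, y in zip(xs, ys) if y == 0) / ys.count(0)
    mean1 = sum(x for x, y in zip(xs, ys) if y == 1) / ys.count(1)
    return (mean0 + mean1) / 2, mean1 > mean0


def predict_midpoint(model, x):
    threshold, one_is_high = model
    return int((x > threshold) == one_is_high)


def binary_f1(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def stratified_subset(data, frac, seed):
    """Take the same fraction of every class; a fixed seed keeps the
    subsets nested as the fraction grows."""
    by_class = {}
    for x, y in data:
        by_class.setdefault(y, []).append((x, y))
    subset = []
    for label in sorted(by_class):
        items = by_class[label]
        random.Random(seed).shuffle(items)
        subset += items[: max(1, round(frac * len(items)))]
    return subset


def f1_vs_fraction(train, test, fractions, seed=0):
    """Train on growing fractions of `train` and report F1 on `test`."""
    x_test = [x for x, _ in test]
    y_test = [y for _, y in test]
    scores = {}
    for frac in fractions:
        subset = stratified_subset(train, frac, seed)
        model = fit_midpoint([x for x, _ in subset], [y for _, y in subset])
        preds = [predict_midpoint(model, x) for x in x_test]
        scores[frac] = binary_f1(y_test, preds)
    return scores
```

Keeping the test set fixed while only the training subset grows means that any change in F1 is attributable to the amount of training data, which is the quantity this experiment varies.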
4.2.4. Experiment 8: Precision–Recall (PR)
One key factor in ranking ML classifiers under SkαS and IαS noise distributions is the associated precision–recall (PR) curve. Therefore, in experiment 8, we computed the average precision of each classifier using the largest dataset size of 50 k. The results are reported in Figure 10.
As depicted in Figure 10, the results of the binary classification tasks (skewness sign, impulsiveness level) reflect a sharp performance gap between NB and the other models. DT, RF, kNN, and SVM performed near-best, achieving a precision ≈ 1.00, which demonstrates their ability to separate the positive class. NB showed similar performance for the skewness sign but performed drastically worse for the impulsiveness level, with precision ≈ 0.53. This confirms that NB is ill-suited to classification in IαS noise environments. The results also reinforce the importance of tree-based models for AI-driven technologies that require confidence-based decisions. In contrast, multi-class classification (β, α) again exposed the weak ability of ML classifiers to distinguish fine-grained SkαS and IαS noise parameters. SVM still leads the rankings with precision ≈ 0.39 and 0.35 for β and α, respectively, due to its margin-based decision function. RF and kNN showed the second-best performance for both parameters. NB followed the trend observed in binary classification, performing much better for β than for α. DT lagged in all instances. Overall, we deduce that tree-based classifiers can be incorporated into threshold-based technologies. However, multi-class classification still lacks good options; complex classifiers like SVM and kNN can be tuned to yield well-ranked outputs. NB remains the weak link among all classifiers when it comes to classifying SkαS and IαS noise parameters.
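For reference, average precision summarizes a PR curve as the mean of the precision values at the ranks where a positive sample is retrieved. The sketch below is a minimal implementation for binary labels that assumes all scores are distinct. It is not necessarily the routine used to produce Figure 10, and the function name is hypothetical.

```python
def average_precision(y_true, scores):
    """Average precision for binary labels in {0, 1}.
    Assumes distinct scores (ties are not treated specially)."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("no positive samples")
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos
```

An uninformative scorer yields an average precision close to the fraction of positive samples, so if the two classes are balanced (an assumption here), a value near 0.5, such as the ≈ 0.53 reported for NB on the impulsiveness task, is close to random ranking.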
4.2.5. Experiment 9: Receiver Operating Characteristics (ROC)—True Positive Rate (TPR) vs. False Positive Rate (FPR)
In this experiment, we further evaluate the ML classifiers' performance using receiver operating characteristic (ROC) curves, which plot the true positive rate (TPR) vs. the false positive rate (FPR) at various thresholds. Results across binary and multi-class classifications are presented in Figure 11. Together with the PR curves in experiment 8, they give further insight into the models' separability for SkαS and IαS noise.
As in previous experiments, we obtained similar ML classifier performance rankings for the binary classification of the skewness sign (β) and the impulsiveness level (α). RF, DT, kNN, and SVM all achieved near-best performance, with an area under the curve (AUC) ≈ 1.00, indicating excellent separability. However, NB again performed poorly on IαS noise distributions, with an AUC ≈ 0.50, which is the level of random guessing. In contrast, it performed very well in classifying the skewness of SkαS noises. These results summarize the binary separability achieved by the ML classifiers under SkαS and IαS environments.
Modeling perspective: We observed that all ML classifiers struggle to predict the SkαS and IαS noise parameters accurately. SVM again led, with AUC ranging between 0.75 and 0.83 for both β and α. Interestingly, NB comes second for β but again lagged for α, highlighting its calibration problems (as seen in the PR results). RF and kNN almost matched their previous performances for both parameters. However, DT performed the worst overall, with its AUC remaining below ~0.58. This shows that multi-class classification remains a problem for ML classifiers. SVM turns out to be the best-performing classifier and the best suited to applications requiring exact classification of SkαS and IαS noise parameters. The ROC-AUC results reaffirm that binary classification is not a concern for most ML classifiers, even with severe α-stable noise, since they reached near-best AUC. In contrast, multi-class classification remains a significant challenge, and we obtained only a moderate AUC with SVM.
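The AUC values discussed above can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes the AUC directly from that definition, with a macro-averaged one-vs-rest extension for the multi-class case. The averaging scheme is an assumption, since it is not stated here, and the function names are hypothetical.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random
    negative, counting ties as one half (Mann-Whitney statistic)."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(y_true, class_scores):
    """Macro-averaged one-vs-rest AUC (assumed scheme); class_scores[c]
    lists the score assigned to class c for every sample."""
    aucs = [
        roc_auc([int(y == c) for y in y_true], s)
        for c, s in class_scores.items()
    ]
    return sum(aucs) / len(aucs)
```

The pairwise comparison costs a number of operations proportional to the product of the positive and negative counts, so it suits small illustrations rather than 50 k samples. Under this definition, a scorer that assigns the same value to every sample obtains exactly 0.5, the chance level observed for NB on IαS noise.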
PA Summary: In experiments 5 to 9, we derived the performance of ML classifiers under SkαS and IαS noise distributions. We varied the classification difficulty from low (binary classification) to high (multi-class classification). The general trend is that almost all ML classifiers performed comfortably in binary classification. However, as we shifted to multi-class classification, weaknesses started to emerge. The saturation of multi-class performance at around 30–40% reflects the intrinsic difficulty of distinguishing α-stable parameter classes under heavy-tailed noise. In such conditions, different parameter values produce highly overlapping observations, reducing separability in the feature space. Overall, the more resource-intensive classifiers, i.e., RF and SVM, show consistent dominance over the others, achieving higher F1-scores and precision in both binary and multi-class classifications. kNN and DT, although less resource-intensive, performed very well in binary classification but struggled in multi-class classification. NB failed in most instances and could not compete with its peers. We observed that the learnability of α-stable noise parameters in multi-class classification is tied to their statistical identifiability. The inability to accurately classify α and β from finite noisy samples stems from their connection with higher-order properties (e.g., tail behavior and asymmetry). Observations generated by different values of α and β therefore overlap, which lowers class separability and makes multi-class classification inherently ill-posed under the given conditions. However, the same property makes α-stable noise a suitable candidate for many covert and noise mitigation technologies [19, 20, 34, 41, 42].
In all experiments, the improvements start to saturate as the dataset size reaches 50 k, which indicates convergence toward model-specific limits under α-stable noise. The derived results therefore reveal trade-offs between computational cost and performance. Lightweight models are appropriate for real-world technologies that rely on threshold-based mechanisms. However, devices that require precise noise parameter estimation should deploy larger models for better results.
5. Conclusions
We methodically examined the performance of prominent supervised ML classifiers for classifying SkαS and IαS noise parameters. Based on experiments designed to evaluate binary and multi-class parameter classification, we identified new trends and rankings regarding the complexity and performance of the considered ML classifiers. The first experiments (1–4) ranked the ML classifiers using various complexity benchmarks. Key points include the following: RF had the longest training time, kNN had the longest inference time, and RF peaked in both memory usage and model size, followed by kNN. DT and NB emerged as winners on these benchmarks, with the lowest memory use, training time, and model size. In contrast, SVM and RF yielded moderate trade-offs with acceptable costs. On the performance benchmarks measured through binary classification, we observed consistently high performance for RF and DT, both tree-structured classifiers, which achieved near-perfect results. In multi-class classification, SVM turned out to be the best classifier at approximating the α-stable noise parameters for impulsiveness and skewness. The lightweight NB consistently underperformed. kNN performed competitively, but at the cost of high inference latency and memory usage. With these benchmarking results, our focus is on highlighting the importance of selecting appropriate classifiers for AI-integrated technologies operating in α-stable noise environments. Importantly, supervised ML classifiers can provide robust decision-making for structured binary tasks that rely on mechanisms such as ON/OFF, HIT/MISS, or LOW/HIGH. However, exact parameter estimation might require more robust, noise-aware classifiers.
By exploiting the classification abilities of the considered ML classifiers under α-stable noise, future devices can perform better by adapting to real-time channel impairments. In addition to considering Gaussian channel noise, we are currently studying the behavior of ML classifiers operating in Rician–Rayleigh fading channels. Similarly, it will be interesting to investigate the performance and complexity of advanced algorithms, such as deep learning-based alternatives and transfer learning, under SkαS and IαS noises. Such work will reveal new trends and benefit multidisciplinary applications. Future work will also explore systematic hyperparameter optimization strategies to further assess performance gains across models. Representation learning can also be explored to capture α-stable characteristics and improve multi-class classification. Nevertheless, we expect the trends identified in the current study to open avenues for research on ML and α-stable noise and to act as a catalyst for future AI-integrated technologies whose decision mechanisms depend heavily on their classification performance.