Sensors | Article | Open Access | 5 December 2025

WISEST: Weighted Interpolation for Synthetic Enhancement Using SMOTE with Thresholds †

1 Graduate School of Information Sciences, Tohoku University, Aramaki Aza Aoba 6-3-09, Aoba-ku, Sendai 980-8579, Miyagi, Japan
2 Department of Research, Science and Technology, Universidad San Francisco Xavier de Chuquisaca, Junin Esq. Estudiantes, Sucre, Chuquisaca, Bolivia
3 National Institute of Technology, Sendai College, 4-16-1 Ayashi-Chuo, Aoba-ku, Sendai 989-3128, Miyagi, Japan
4 Cyberscience Center, Tohoku University, 6-3, Aramaki Aza Aoba, Aoba-ku, Sendai 980-8578, Miyagi, Japan
Sensors 2025, 25(24), 7417; https://doi.org/10.3390/s25247417
This article belongs to the Special Issue Advances in Security of Mobile and Wireless Communications

Highlights

What are the main findings?
  • Consistent recall and F1 gains: WISEST increased recall and F1 across multiple benchmarks by generating weighted locality-constrained synthetic minority samples that reduce unsafe interpolation.
  • Robustness to moderate noise and borderline structure: WISEST outperformed or matched alternatives when sufficient samples formed borderline pockets or subclusters, delivering stronger minority detection with modest precision trade-offs.
What are the implications of the main findings?
  • The approach is a practical choice for recall-critical applications: for tasks where detecting minority events matters (e.g., intrusion detection and rare-fault detection), WISEST offers a reliable oversampling option without aggressive sample generation or complex training.
  • There is no “one oversampler that rules them all” for imbalanced datasets: WISEST should be used alongside dataset diagnostics. Datasets with limited minority support, extreme separability, highly noisy borderlines, or many categorical features may still benefit more from other methods.

Abstract

Imbalanced learning occurs when rare but critical events are missed because classifiers are trained primarily on majority-class samples. This paper introduces WISEST, a locality-aware weighted-interpolation algorithm that generates synthetic minority samples within a controlled threshold near class boundaries. Benchmarked on more than a hundred real-world imbalanced datasets from the KEEL repository, covering different imbalance ratios, noise levels, and geometries, as well as on security and IoT datasets (IoT-23 and BoT–IoT), WISEST improved minority detection in at least one metric on about half of those datasets, achieving up to a 25% relative increase in recall and up to an 18% increase in F1 compared to training on the original data and to other approaches. In most cases, these gains come at the cost of some accuracy and precision, depending on the dataset and classifier. These results indicate that WISEST is a practical and robust option when minority support and borderline structure permit safe synthesis, although no single sampler uniformly outperforms others across all datasets.

1. Introduction

Machine learning (ML) and Artificial Intelligence (AI) have recently become widespread, helping to create robust solutions across fields ranging from medical care [1] to cyberattack detection [2]. ML systems require high-quality datasets for learning or training. However, not all datasets contain high-quality, representative data, and the sample distributions are often imbalanced. The issue with imbalanced datasets, which are structured to favor a majority class, is that models trained on them show high apparent accuracy yet fail to detect anomalies, because most of the training data comes from the majority class. Consequently, misclassification is unavoidable [3], decreasing detection precision, especially at the boundary between classes.
The traditional approach to handling imbalanced datasets is “oversampling”. Oversampling increases the number of samples in the minority class, typically using algorithms such as SMOTE [4] and ADASYN [5]. However, the synthetic samples are created at random locations, even in areas where they might increase the chance of misclassification, that is, at the borderline between the majority and minority classes. Approaches such as Borderline-SMOTE [6], SMOTEENN [7], and WCOM-KKNBR [8] propose more selective strategies to overcome the imbalance issue. Nevertheless, all these traditional and emerging works highlight two points: first, the class imbalance problem is still an open issue, and, second, no single method is universally the “best” for all situations. In fact, to the best of our knowledge, no existing method increases the minority-class count without generating extra noise at the borderline, or considers various dataset elements, such as geometry or local class overlaps.
In our prior work [9], we designed a basic algorithm that considered the elements above; that is, a locally aware minority-class oversampling method at the borderline, thereby avoiding unsafe synthetic generation that would harm precision. However, the evaluation was primarily conducted on synthetic datasets, which may not reflect actual performance on real-world datasets. Therefore, this paper enhances our initial proposal, and its main contributions are threefold:
  • We present WISEST, an oversampling algorithm that assigns weights to near neighbors and interpolates synthetic samples using SMOTE within a threshold.
  • We extensively test WISEST against traditional and state-of-the-art oversampling methods using real-world datasets.
  • Finally, we present a thorough analysis of the conditions by which WISEST performs better than existing work.
Based on our results across various real-world imbalanced datasets, we concluded that, when the minority class has a moderate number of samples and is not isolated (i.e., it has both minority and majority neighbors nearby), WISEST can interpolate within local neighborhoods to increase recall without creating many outliers. Thus, its per-sample weighting avoids oversampling dense subclusters and instead enhances sparser border regions, which helps models to detect minority sub-classes more reliably. Therefore, if F1 or recall is the evaluation metric, WISEST commonly yields the best scores in more than half of the datasets tested.
Before delving into the (proposed) WISEST oversampling method presented in Section 3 and the idea behind it, we describe the related work on traditional and modern oversampling methods in Section 2. Section 4 presents and discusses the benchmark results from extensive testing. Finally, Section 5 concludes this paper with a brief summary of our findings and future lines of work.

3. Proposed WISEST Oversampling

3.1. Overview

This section details the proposed oversampling method, WISEST (Weighted Interpolation for Synthetic Enhancement using SMOTE with Thresholds). The main features can be summarized as follows:
  • WISEST creates minority synthetic samples in the “boundary area”, which is located between the majority and minority classes. This will avoid excessive and indiscriminate creation of samples that might hinder classification. For illustration purposes, we created a synthetic binary classification dataset using the following code:
    from sklearn.datasets import make_classification

    X, y = make_classification(n_samples=10000, n_features=3,
                               n_redundant=0, n_clusters_per_class=1,
                               weights=[0.99], flip_y=0, random_state=1)
    The make_classification function from the sklearn.datasets library in Python 3.10.15 creates n_samples=10000 samples with three non-redundant features (independent variables); the minority and majority classes each consist of a single cluster with no label noise, and the imbalance ratio (IR) is 99:1. That is, 1% of the samples belong to the minority class.
    Figure 1 depicts the generated dataset (majority-class samples in red; minority in blue) and the boundary area (shaded blue). Note that this area varies depending on the feature pair being compared and the dataset’s geometry.
    Figure 1. Feature pairwise scatterplots from a synthetic dataset with their boundary area highlighted in shaded blue.
  • Some feature pairs might have high separability (i.e., Figure 1b), while others might have overlapping classes (i.e., Figure 1c). Thus, we apply a weighted locally aware interpolation within a threshold. To do so, we use a custom threshold distance to limit the area around each nearest minority class. This distance is used to assign weights to each minority point. Then, we apply conditional branching based on this weight; thus, even when the minority class data is sparsely distributed in the dataset, we can generate samples from the closest minority neighbors.
  • Contrary to traditional approaches, such as SMOTE, ADASYN, and Borderline-SMOTE, which create the same number of samples as the majority-class data, we adopt a more conservative approach, producing only samples that fall within the boundary area, thereby avoiding over-inflating the dataset with samples that might affect its performance.

3.2. The WISEST Algorithm

Algorithm 1 presents the pseudocode of the proposed WISEST oversampling based on the premises outlined in Section 3.1. As the first step, we calculate the weight (w) for each data point in the minority class, determined by the ratio of the number of points that belong to the majority class (N_N) among the k neighboring points, as shown in Equation (1).
$w_i = \frac{N_N}{k}, \quad i = 1, 2, \ldots, k$  (1)
Algorithm 1. Proposed Algorithm
 1: Input: Training dataset T, threshold θ
 2: Output: Resampled dataset T
 3: for each minority-class sample i in T do
 4:     Find its k nearest neighbors in T
 5:     N_N ← number of neighbors from the minority class
 6:     w_i ← N_N / k
 7:     if w_i < 1 then                          ▹ No majority samples near or fewer than k.
 8:         do nothing
 9:     else if w_i = 1 then                     ▹ Exactly the minimum number of neighbors.
10:         for each neighbor j do               ▹ j = 1, 2, …, k
11:             Compute distance dif_ij to the nearest minority neighbor
12:             if dif_ij ≤ θ then
13:                 Generate sample t using SMOTE in range 0 ≤ r ≤ θ
14:             end if
15:         end for
16:     else                                     ▹ There are enough majority neighbors.
17:         Calculate the number of neighbors n in range 0 < j < k
18:         for each neighbor j do               ▹ j = 1, 2, …, n
19:             Compute distance dif_ij^r to the nearest minority neighbor, 0 ≤ r ≤ θ
20:             Compute distance dif_ij^s to the nearest majority neighbor, 0 ≤ s ≤ θ/2
21:             if dif_ij^r ≤ dif_ij^s then
22:                 Generate t samples using SMOTE in range 0 ≤ r ≤ θ        ▹ Wider range.
23:             else
24:                 Generate t samples using SMOTE in range 0 ≤ s ≤ θ/2      ▹ Narrower range.
25:             end if
26:             Add t to T
27:         end for
28:     end if
29: end for
30: return T
As the next step, WISEST applies conditional branching using the following criteria:
  • When no majority-class points (N_N) exist among the k neighboring points (measured by the distance dif) that are within the threshold (θ), no synthetic data is generated, as it would not lie within the boundary area, as detailed in lines 7 and 8 of Algorithm 1.
  • When the number of majority samples (N_N) among the k selected neighbors equals k, that is, w = 1, the case is treated separately because the dataset may be very sparse or contain an isolated boundary. Thus, we adopt a conservative approach and create only k synthetic points toward the minority class. To do so, we check whether the distance is within the threshold (dif ≤ θ) and generate samples with SMOTE such that the distance to the closest minority neighbor (r) satisfies 0 ≤ r ≤ θ, as shown in lines 9 to 15 of Algorithm 1.
  • Otherwise, since there are enough majority-class neighbors, we must first determine the number of samples to create (n); we do this based on the number of nearest minority samples, as shown in Equation (2):
    $n = \frac{w_j \cdot n_j}{\sum_j w_j}, \quad \text{for } 0 < j < k$  (2)
    Then, n synthetic samples are generated in the range 0 ≤ r ≤ θ if the distance from the sample to its closest minority neighbor is less than or equal to the distance to its closest majority neighbor (s); otherwise, they are generated within the range 0 ≤ s ≤ θ/2 of the majority class, as described in Algorithm 1, lines 17 to 29. The rationale is that we prefer to create samples near the minority class when it is closer, but not too close to the majority neighbor. A simplified code sketch of the whole procedure is given after this list.
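To make the procedure concrete, below is a minimal Python sketch of the WISEST loop under our reading of Algorithm 1: N_N is taken as the count of majority-class points among the k nearest neighbors, at most one synthetic point is generated per minority seed, and interpolation follows the standard SMOTE rule along the direction of the nearest minority neighbor, capped by θ (or θ/2 when the majority class is closer). The function name, the simplified branching, and the NumPy-array inputs are ours; the defaults k = 5 and θ = 0.5 match the values used later in the evaluation. Treat it as an illustration of the branching logic, not the reference implementation.

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def wisest_oversample(X, y, minority_label, k=5, theta=0.5, rng=None):
        """Simplified WISEST-style sketch: weighted, threshold-bounded interpolation near the border."""
        rng = np.random.default_rng(rng)
        X_min, X_maj = X[y == minority_label], X[y != minority_label]
        if len(X_min) < 2 or len(X_maj) < 1:
            return X, y

        nn_all = NearestNeighbors(n_neighbors=k + 1).fit(X)      # +1: the query point returns itself
        nn_min = NearestNeighbors(n_neighbors=2).fit(X_min)      # nearest distinct minority neighbor
        nn_maj = NearestNeighbors(n_neighbors=1).fit(X_maj)      # nearest majority neighbor

        synthetic = []
        for x in X_min:
            _, idx = nn_all.kneighbors(x[None, :])
            w = np.mean(y[idx[0][1:]] != minority_label)         # fraction of majority neighbors (cf. Eq. (1))
            if w == 0:                                           # not a border point: generate nothing
                continue

            d_min_arr, i_min_arr = nn_min.kneighbors(x[None, :])
            d_min, x_nn = d_min_arr[0][1], X_min[i_min_arr[0][1]]
            d_maj = nn_maj.kneighbors(x[None, :])[0][0][0]
            if d_min > theta:                                    # closest minority seed lies outside the threshold
                continue

            limit = theta if d_min <= d_maj else theta / 2.0     # shrink the range when the majority class is closer
            step = rng.uniform(0.0, min(limit, d_min))
            synthetic.append(x + step * (x_nn - x) / (d_min + 1e-12))

        if not synthetic:
            return X, y
        X_res = np.vstack([X, np.asarray(synthetic)])
        y_res = np.concatenate([y, np.full(len(synthetic), minority_label)])
        return X_res, y_res

Called as wisest_oversample(X, y, minority_label=1) on the synthetic dataset from Section 3.1, the sketch only adds points that lie within θ of an existing minority sample, by construction.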

3.3. WISEST Applied on the Synthetic Dataset

To illustrate how WISEST works, Figure 2 depicts the results of applying our proposed oversampling to the synthetic dataset presented in Section 3.1 and its corresponding class distribution. Note that WISEST generates synthetic points within the threshold-defined boundary, with minimal outlier samples.
Figure 2. Pairplot of features 1 to 3 with classes oversampled by our WISEST approach. Note the shaded areas represent the imbalance between the majority and minority classes.

3.4. Preliminary Analysis of Oversampling Methods on the Synthetic Dataset Compared to WISEST

As a preliminary experiment, we ran other algorithms, both traditional (SMOTE [4] and Borderline-SMOTE [6]) and from existing work (SMOTEENN [7] and Radius-SMOTE [13]), on the synthetic dataset presented in Section 3.1. For brevity, we only considered two features (i.e., feature 1 vs. feature 2). The goal of this preliminary evaluation is to (visually) assess how different strategies produce minority synthetic samples and to compare them with the original. Figure 3 shows that SMOTE and SMOTEENN aggressively populate samples all over the dataset. Borderline-SMOTE is less intensive, but it may still cause misclassification in some areas. On the other hand, Radius-SMOTE and WISEST are the most conservative approaches.
Figure 3. Comparative visualization of oversampling methods for features 1 and 2.
Additionally, we conducted a Kernel Density Estimate (KDE) analysis of the effects of each method on the number of generated samples and the distribution of the dataset. To do so, we performed a Principal Component Analysis (PCA) to reduce the dataset’s dimensionality. The PCA-1 projections of the original and oversampled datasets are shown in Figure 4. The legend at the top shows the synthetic samples generated per approach.
Figure 4. Comparative density analysis of oversampling methods in PCA-1 projection showing the distributional impact in the minority class.
As observed, the original dataset, shown in Figure 4a, is composed of 9900 majority and 100 minority samples and presents a clear distinction between the class distributions. Oversampling the dataset with SMOTE (Figure 4b), Borderline-SMOTE (Figure 4c), and SMOTEENN (Figure 4d) produces almost as many minority samples as there are majority samples, and the newly produced samples alter the minority distribution. On the other hand, Radius-SMOTE (Figure 4e) and WISEST (Figure 4f) adopt a more conservative approach, producing roughly 5% as many synthetic samples as the other methods.
However, note that, among the tested approaches, WISEST alters the original distribution the least; this might help preserve the original dataset’s accuracy while creating only “safe” samples. Of course, since this is a straightforward synthetic dataset, these observations might not hold for real-world datasets. Therefore, in the next section, we present a thorough evaluation of these approaches under formal performance metrics and settings.
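For readers who want to reproduce a comparison of this kind, the snippet below oversamples the synthetic dataset with the imblearn baselines and plots PCA-1 kernel density estimates of the minority class, in the spirit of Figure 4. It is a sketch of the setup, not the exact script behind Figures 3 and 4; Radius-SMOTE and WISEST are omitted here because they are not shipped with imblearn.

    import matplotlib.pyplot as plt
    import seaborn as sns
    from sklearn.datasets import make_classification
    from sklearn.decomposition import PCA
    from imblearn.over_sampling import SMOTE, BorderlineSMOTE
    from imblearn.combine import SMOTEENN

    X, y = make_classification(n_samples=10000, n_features=3, n_redundant=0,
                               n_clusters_per_class=1, weights=[0.99],
                               flip_y=0, random_state=1)

    samplers = {"SMOTE": SMOTE(random_state=42),
                "Borderline-SMOTE": BorderlineSMOTE(random_state=42),
                "SMOTEENN": SMOTEENN(random_state=42)}

    pca = PCA(n_components=1).fit(X)                      # shared PCA-1 projection
    fig, axes = plt.subplots(1, len(samplers) + 1, figsize=(16, 3), sharey=True)

    sns.kdeplot(x=pca.transform(X[y == 1]).ravel(), ax=axes[0], fill=True)
    axes[0].set_title("Original (100 minority)")

    for ax, (name, sampler) in zip(axes[1:], samplers.items()):
        X_res, y_res = sampler.fit_resample(X, y)
        n_new = len(y_res) - len(y)                       # synthetic samples generated
        sns.kdeplot(x=pca.transform(X_res[y_res == 1]).ravel(), ax=ax, fill=True)
        ax.set_title(f"{name} (+{n_new})")

    plt.tight_layout()
    plt.show()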

4. Evaluation and Discussion

This section benchmarks WISEST against existing traditional and state-of-the-art oversampling algorithms and discusses its implications.

4.1. Experimental Setup

4.1.1. Datasets

We used two groups of well-known real-world datasets: KEEL [16] and other IoT/security-related datasets (IoT-23 [17], BoT–IoT [18], and Air Quality and Pollution Assessment [19]).
The KEEL repository comprises a variety of imbalanced datasets, as detailed in Table 1. Note that we have only included the binary class datasets and subdivided the tests depending on the IR as follows:
Table 1. Representative dataset groups from KEEL used for evaluation.
  • Datasets with an IR between 1.5 and 9: 20 dataset subgroups (e.g., iris, glass, pima, new-thyroid, vehicle, some yeast subset variants, some ecoli subset variants, and shuttle).
  • Datasets with an IR higher than 9: 70 dataset subgroups (e.g., some abalone subsets, winequality subsets, poker class-pair problems, KDD attack pairs, some ecoli subsets, and some yeast subset variants).
  • Noisy and borderline examples: 30 dataset subgroups (e.g., 03subcl5, 04clover5z, paw02a families and their noise levels variants).
The other datasets contained multiple class imbalances, detailed as follows:
  • IoT-23 [17] contains data on attacks on IoT devices such as Amazon Echo. The dataset is classified into six classes with an IR of 57:18:14:10:1:0.2. We used the following columns: orig_bytes, orig_pkts, orig_ip_bytes, resp_bytes, resp_pkts, and resp_ip_bytes, and the classification labels were PortOfHorizontalPortScan, Okiru, Benign, DDoS, and C&C. Note that we only used the first 10,000 samples for both training (80%) and testing (20%).
  • BoT–IoT [18] is a dataset from a realistic network environment that classifies the network data into four classes: normal, backdoor, XSS, and scanning, with an IR of 84:14:1.8:0.2. We used the following columns: FC1_Read_Input_Register, FC2_Read_Discrete_Value, FC3_Read_Holding_Register, and FC4_Read_Coil. We used the same number of samples and distributions as with IoT-23.
  • Air Quality and Pollution Assessment [19] is a dataset containing environmental and demographics data regarding air quality, which is separated into four classes (Good, Moderate, Poor, and Hazardous), with an IR of 4:3:2:1. We used the Temperature, Humidity, PM2.5, PM10, NO2, SO2, CO, Promixity_to_Industrial_Areas, and Population_Density columns and 5000 samples.
Note that the datasets above differ in terms of IR, sample distributions (e.g., borderline and noise levels), and class separability. This variety allowed thorough testing.
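As an illustration of how one of these subsets can be prepared, the sketch below loads a hypothetical local CSV export of BoT–IoT, keeps the four register/coil columns listed above and the first 10,000 rows, and performs the 80/20 split. The file name and the "label" column name are assumptions for the example, not details from the original dataset description.

    import pandas as pd
    from sklearn.model_selection import train_test_split

    # Hypothetical local export of the BoT-IoT subset described above; the actual
    # file name and format depend on how the dataset was downloaded.
    df = pd.read_csv("bot_iot_subset.csv").head(10_000)

    features = ["FC1_Read_Input_Register", "FC2_Read_Discrete_Value",
                "FC3_Read_Holding_Register", "FC4_Read_Coil"]
    X, y = df[features].to_numpy(), df["label"].to_numpy()   # "label" column name assumed

    # 80% training / 20% testing; stratification preserves the class imbalance (one reasonable choice).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42)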

4.1.2. Oversampling Techniques

We used the following approaches to benchmark WISEST:
  • SMOTE [4]
    Base technique that generates a new minority-class sample using linear interpolation between a minority sample and one of its k nearest minority neighbors.
  • Borderline-SMOTE [6]
    A variant of SMOTE that creates synthetic samples near the class boundary instead of between minority points.
  • ADASYN [5]
    Generates more synthetic minority samples near hard-to-learn samples, so classifiers focus on difficult regions.
  • SMOTE+Tomek [14]
    Creates synthetic minority samples (SMOTE) and then removes overlapping borderline pairs (Tomek links) to balance and clean the data.
  • SMOTEENN [7]
    An approach that combines SMOTE with Edited Nearest Neighbor (ENN), which removes samples whose labels disagree with the majority of their k nearest neighbors.
  • Radius-SMOTE [13]
    Another SMOTE variant that restricts or selects interpolation inside a fixed neighborhood radius, which reduces unsafe extrapolation and limits generation in sparse/noisy regions.
  • WCOM-KKNBR [8]
    Uses conditional Wasserstein CGAN-based oversampling, which generates minority class samples to balance the majority using a K-means and nearest neighbor-based method.
  • WISEST
    Our approach.
These approaches were selected because, except for SMOTE, they generate synthetic samples in regions near the borders between the majority and minority classes as WISEST does.
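For reference, the imblearn-based baselines in this list can be instantiated roughly as sketched below, using the k = 5 and seed settings detailed in Section 4.1.4; Radius-SMOTE, WCOM-KKNBR, and WISEST itself are not part of imblearn and were implemented separately.

    from imblearn.over_sampling import SMOTE, BorderlineSMOTE, ADASYN
    from imblearn.combine import SMOTETomek, SMOTEENN

    # imblearn implementations of the baselines above (k = 5 neighbors, fixed seed).
    samplers = {
        "SMOTE": SMOTE(k_neighbors=5, random_state=42),
        "Borderline-SMOTE": BorderlineSMOTE(kind="borderline-1", k_neighbors=5, random_state=42),
        "ADASYN": ADASYN(n_neighbors=5, random_state=42),
        "SMOTE+Tomek": SMOTETomek(smote=SMOTE(k_neighbors=5, random_state=42), random_state=42),
        "SMOTEENN": SMOTEENN(smote=SMOTE(k_neighbors=5, random_state=42), random_state=42),
    }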

4.1.3. Implementation and Setup

SMOTE, BorderlineSMOTE, ADASYN, SMOTE+Tomek, and SMOTEENN are already implemented in the imblearn library in Python, while Radius-SMOTE, WCOM-KKNBR, and our algorithm were implemented in Python 3.10.15 in a locally deployed Jupyter Notebook (version 7.4.6). The experiments were performed on a MacBook Pro with an Apple silicon M1 Max chip with 30 GPU cores and 64 GB of memory.

4.1.4. Methodology

The datasets with nominal variables (e.g., KDD variants and Poker) were preprocessed by converting them to numerical values using Dummy Encoding. The evaluation comprised five-fold cross-validation to estimate variance, using the same random number generator (RNG) seed (RNG = 42) for all algorithms.
The common and specific parameters used for each algorithm are described below:
  • SMOTE: k = 5 neighbors. The rationale for this number was to use “enough” nearest neighbors to ensure the newly generated samples would lie safely within the boundary area. Based on preliminary experiments, it was determined to be the optimal value; a thorough analysis of how this value affects each dataset is left for future work.
  • Borderline-SMOTE: variant 1, k = 5 neighbors.
  • ADASYN: Same as SMOTE, sampling adaptivity enabled.
  • SMOTE+Tomek: SMOTE component same as above, and Tomek link step has no tunable hyperparameter.
  • SMOTEENN: Same SMOTE parameters as above; ENN with k = 5.
  • Radius-SMOTE: As defined in [13].
  • WCOM-KKNBR: latent dimension latent_dim = 16; epochs = 50; batch_size = 32; learning rates $\eta_G = 2 \times 10^{-4}$ (generator) and $\eta_D = 1 \times 10^{-4}$ (discriminator); the number of generated samples n equals the original minority count, as defined in [8].
  • WISEST (proposed): k = 5 neighbors and threshold distance θ = 0.5 . Note that the threshold value has been statically set based on preliminary experiments with the synthetic dataset shown in Section 3.1.
Once the datasets were resampled, we recorded the number of synthetic samples generated by each method. Then, we classified each resampled dataset using Random Forest. Finally, the benchmark metrics (precision, recall, accuracy, and F1) were calculated for both the original and resampled datasets.
We conducted the above procedure for each category of the KEEL datasets described in Section 4.1.1. The experiments were run on datasets with IR of less than 9 and greater than 9, as well as variants with controlled noise levels.
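One way to realize this protocol with imblearn is sketched below: the oversampler and the Random Forest are wrapped in a pipeline so that resampling is fitted only on the training folds of the stratified five-fold split (seed 42). This mirrors the description above but is our own sketch, not the exact evaluation script.

    from imblearn.pipeline import Pipeline
    from imblearn.over_sampling import SMOTE
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import StratifiedKFold, cross_validate

    def evaluate(sampler, X, y, seed=42):
        """Five-fold CV of oversampler + Random Forest; resampling is fit on training folds only."""
        pipe = Pipeline([("sampler", sampler),
                         ("clf", RandomForestClassifier(random_state=seed))])
        cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
        scores = cross_validate(pipe, X, y, cv=cv,
                                scoring=["accuracy", "precision", "recall", "f1"])
        return {m: scores[f"test_{m}"].mean() for m in ["accuracy", "precision", "recall", "f1"]}

    # e.g. evaluate(SMOTE(k_neighbors=5, random_state=42), X, y)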

4.2. Results

4.2.1. Results on KEEL Datasets with IR Less than or Equal to 9

In this section, we tested each benchmark method on real-world KEEL datasets with an IR of 9 or less (22 datasets in total).
First, we analyzed the number of synthetic samples generated per approach as a reference for how each dataset’s characteristics affect sample generation. Table 2 summarizes the number of synthetic points created per approach, where the lowest value is highlighted in bold as a reference.
Table 2. Number of new synthetic minority samples produced per oversampling method (KEEL datasets with IR less than 9).
As observed, our WISEST approach generated the fewest synthetic minority samples only on the ecoli-0_vs_1 dataset, while WCOM-KKNBR created the least in about 60% of the tested datasets. However, note that we do not intend to produce the smallest number but only those that are within k minority-class points of the boundary. Therefore, we analyzed the characteristics of the datasets for which WISEST generated the fewest and most synthetic samples, as detailed in Table 3.
Table 3. Dataset characteristics of instances where WISEST created the fewest (above the mid line) and most synthetic samples (below the mid line).
According to the examined datasets, when the minority class has high class overlap and the distance to the majority class at the borderline is low, i.e., low separability between classes, as is the case with the ecoli variants and yeast, WISEST finds fewer interpolation targets it considers “safe” and therefore generates fewer new points. In contrast, when there is distinct class separation or compact minority clusters, WISEST creates up to 5 times more points than its counterparts, as these are considered “safe” to add, even when some datasets exhibit low separability or imbalance.
Next, we measured the benchmark metrics (accuracy, precision, recall, and F1) for KEEL datasets with an IR less than 9, highlighting the highest value for each metric in bold. Table 4 shows the results of selected datasets where WISEST performed the best in at least one of the metrics. However, the full results are available in Appendix A.
Table 4. KEEL dataset benchmark results for datasets with an IR less than or equal to 9.
Based on the above results, WISEST presents the best or comparable performance in the following datasets: glass1, ecoli-0_vs_1, iris0, yeast1, haberman, glass-0-1-2-3_vs_4-5-6, new-thyroid1, new-thyroid2, and glass6. The improvements were, on average, 8% in recall (peak at 16% on yeast1) and 3% in F1 (peak at 10% on haberman) compared to the original (relative), and about 2% compared to the other methods (relative). However, in the same datasets, the decrease in accuracy and precision was about 1 to 4% on average, respectively.
After running diagnostics on the characteristics of each dataset, we can conclude that, in datasets with a nontrivial fraction of minority points near class boundaries (i.e., glass1, yeast1, pima, haberman, glass-0-1-2-3_vs_4-5-6, and glass6), or if the minority class presents multiple small minority subclusters near majority modes (i.e., glass1, yeast1, glass-0-1-2-3_vs_4-5-6, glass6, vehicle1, and vehicle3), or moderate label noise or borderline examples (i.e., yeast1, haberman, pima, glass1, and glass6), WISEST performs well. That is almost two-thirds of the tested datasets. However, on the contrary, WISEST underperforms or is even rendered unnecessary when the datasets have either extremely low minority support (i.e., ecoli_0vs_1, ecoli1, ecoli2, ecoli3, and page-blocks0), or nearly-separable classes (i.e., glass0, vehicle2, vehicle0, and segment0), or the nearest-neighbor geometry highly depends on the distance metrics (e.g., ecoli* variants and page-blocks0), causing unsafe interpolation. Nevertheless, the difference from the best performers is not that far, even in these datasets.
Note that the number of newly created synthetic minority samples did not influence the performance. Take new-thyroid2, for example, where WISEST generated almost 4 times as many samples as the rest; however, it performed better across nearly all metrics.
To sum up, for datasets with an IR less than 9, if a dataset pre-diagnostic shows a sizable frac_border_k (e.g., 0.15–0.4 or higher), mean minority-to-majority distances comparable to the minority internal spacing, and a small precision drop (approx. 4%) is affordable in exchange for larger recall/F1 gains (about 10%), WISEST would be preferred over the existing approaches.
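The frac_border_k diagnostic used above can be approximated with a short helper such as the one below (our own utility, assuming NumPy arrays): it measures the fraction of minority samples that have at least one majority-class point among their k nearest neighbors.

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def frac_border_k(X, y, minority_label, k=5):
        """Fraction of minority samples with >= 1 majority neighbor among their k nearest neighbors."""
        X_min = X[y == minority_label]
        # k + 1 neighbors because each minority point is returned as its own nearest neighbor.
        _, idx = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X_min)
        neighbor_labels = y[idx[:, 1:]]                   # drop the query point itself
        has_majority = (neighbor_labels != minority_label).any(axis=1)
        return has_majority.mean()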

4.2.2. Results on KEEL Datasets with an IR Greater than 9

Next, we benchmarked WISEST against existing methods using KEEL datasets with an IR greater than 9 (i.e., highly imbalanced). As in the prior case, we tested the influence of each oversampling method on both the number of synthetic samples and performance. In total, we could test 69 such datasets from the KEEL repository. Table 5 presents the minority synthetic count results. However, due to space constraints, we show only the top 10 datasets for which WISEST generated the fewest points, including ties; the full results are in Appendix B. Note that we could not test some datasets (i.e., winequality-white-9_vs_4, zoo-3, shuttle-c2-vs-c4, lymphography-normal-fibrosis, and kddcup-land_vs_portsweep) that have very few minority samples (sometimes single-digit counts) and very few samples overall (under 200); on these, most of the sampling algorithms did not execute.
Table 5. Number of new synthetic minority samples per oversampling method for KEEL datasets. Top 10 WISEST lowest above the mid line and top 10 highest below.
Across the 69 datasets, WISEST produced the fewest synthetic samples in almost half of them; these included variants of abalone, cleveland, ecoli, glass, poker, winequality (red and white), and yeast. Similarly to the prior experiment, we also analyzed the characteristics of randomly selected datasets in which WISEST created the fewest (and the most) synthetic samples, as shown in Table 6.
Table 6. Dataset characteristics of instances where WISEST created the fewest (above the mid line) and most synthetic samples (below the mid line) for datasets with IR greater than 9.
In contrast to the experiment in Section 4.2.1, neither the distance to the majority class at the borderline nor the IR is a determining factor. However, note that, in cases of high class granularity, such as KEEL class-subset or one-vs.-rest variants (yeast, abalone, glass, and ecoli), where minority labels come from narrow, structurally distinct subpopulations, WISEST produces fewer “unsafe” points and does so with more certainty (frac_with_majority_neighbor_k5 = 1).
On the other hand, WISEST produces the most synthetic samples when the minority class is distributed across many locally mixed (majority and minority) borderline neighborhoods and expanding minority coverage at those boundaries is useful; this results in larger synthetic counts than bulk oversamplers that apply a uniform rule across the minority class.
Next, we measured the metrics (accuracy, precision, recall, and F1). Again, for space’s sake, Table 7 shows only the results where WISEST performs the highest or tied for highest in at least one metric. A complete list of all the datasets in this category can be found in Appendix C.
Table 7. KEEL benchmark results (accuracy, precision, recall, and F1) for all datasets whose IR is greater than 9. Selected results where WISEST achieved the highest value in at least one of the metrics (including ties).
As observed, across all datasets tested in this category, WISEST performed best and achieved competitive values in more than half of the datasets (36 out of 69 in total). The best results were achieved in terms of recall and F1, which were the highest (or competitive) in various yeast variants, vowel0, glass-0-1-6_vs_5, ecoli variants, shuttle-c0_vs_c4, page-blocks-1-3_vs_4, dermatology-6, and others. The results yield a recall improvement of up to 25% (5% AVG) relative to the original; for F1, the improvements were up to 18% (4% AVG) relative to the classifier trained on the original dataset or different strategies.
As for the datasets’ characteristics, many are class-subset or one-vs.-rest KEEL variants (ecoli, yeast, abalone, and glass) with locally complex boundaries that benefit from targeted, boundary-focused sampling, as in our approach. Therefore, if the minority class is of moderate size (not tiny), there are enough seeds for WISEST to generate local, useful, and diverse synthetics across boundary regions. With these results, we can confirm that WISEST usually trades small losses in precision and accuracy (0–7% on average, relative to the original or other approaches) for better minority detection (i.e., recall and F1). We also confirmed that, even when WISEST did not produce the fewest samples, it still achieved gains compared to other approaches unless the original performance was already high.
To sum up, in KEEL datasets with IR greater than 9, WISEST performs best when the minority class is distributed across many locally mixed neighborhoods (borderline points) and when recall/F1 are the metrics of focus. As evident in the expanded results, WISEST often ties with Radius-SMOTE and WCOM-KKNBR in various metrics or achieves slightly higher F1 when safe boundary-focused interpolation is needed. Furthermore, on perfectly separable datasets, WISEST ties with other methods, yielding no practical advantage except for the processing time compared to heavy-processing approaches, such as WCOM-KKNBR (CGAN-based).

4.2.3. Results on KEEL’s Noisy and Borderline Datasets

The last set of experiments run on KEEL datasets tested tolerance to varying levels of noise. To this aim, KEEL provides preprocessed datasets that alter the base dataset (i.e., 03subcl5, 04clover5z, and paw02a) in the number of samples (i.e., 600 or 800), cluster setup (e.g., 5 or 7 parameters), and noise or borderline level (from 0 to 70%), making a total of 30 variations. For all variations, we used the binary-imbalanced version to ensure consistency with prior experiments. For instance, the dataset 03subcl5-600-5-30-BI is a binary variant of the 03subcl5 dataset family, generated with 600 cases, five parameters, and 30% noise or borderline examples.
Table 8 displays the benchmark results for all parameters (accuracy, precision, recall, and F1). However, once again for space’s sake, we only show the datasets where WISEST performs the best in any of the results; full results can be found in Appendix D.
Table 8. KEEL benchmark results (accuracy, precision, recall, and F1) for all noisy/borderline datasets. Selected results where WISEST achieved the highest value in at least one of the metrics (including ties).
The results show that WISEST improves recall and, in some cases, the F1-score on datasets with pronounced borderline structure and moderate noise levels (0–30%). However, it falls short compared to the others on datasets with noise levels above the 30% threshold. Datasets where minority points have enough nearby minority neighbors to permit safe interpolation (the 600–800-sample, 5–7-parameter synthetic datasets) let WISEST generate useful localized synthetics. In those cases, WISEST’s weighted, location-aware synthesis increases true-positive detection near decision boundaries without producing many unsafe samples, thereby lifting recall and, in turn, F1. In terms of recall, we observed improvements of up to 23% (13% on average) relative to the original but −5% on average compared to the others, especially WCOM-KKNBR. For F1, improvements of up to 12% (3% on average) relative to the original dataset and up to 10% (−1% on average) relative to the other strategies were observed. Regarding datasets with high or very high noise rates (50–70%), the policies applied by SMOTEENN or Borderline-SMOTE, which intensively clean the minority class, can improve precision. Surprisingly, WCOM-KKNBR performed best on the noisy variants of the paw02a dataset but not on the other datasets. Nevertheless, WISEST remains conservative in this situation, creating fewer (but safer) synthetic points, which makes it perform lower than the others while remaining competitive.

4.2.4. Results on Other Datasets Using Different Classifiers

This section evaluates the performance of WISEST when training ML classifiers on datasets other than KEEL, namely IoT-23 [17], BoT–IoT [18], and Air Quality and Pollution Assessment [19]. In particular, we used K-Nearest Neighbor (KNN), Random Forest (RF), and LightGBM, trained on 80% of the oversampled dataset with the remaining 20% used for testing. The goal of this experiment was to observe the difference compared to the original dataset. Thus, as with the other experiments, we measured accuracy, precision, recall, and F1. Note that, since WISEST was conceived with binary-class datasets in mind, this experiment also serves to analyze its behavior in multi-class environments. Nevertheless, a full test of WISEST’s robustness on multi-class datasets is left for future work.
The results are summarized in Table 9. As observed, the oversampling strategy applied by WISEST consistently improved most ML models. For instance, on the IoT-23 and BoT–IoT datasets, KNN achieved up to a 50% improvement over the original data, and LightGBM achieved about a 15% improvement, especially in recall and F1. On the other hand, WISEST performed slightly worse when applied to the Air Quality and Pollution Assessment dataset, about 1–2% lower than the algorithms trained on the original data. Unlike BoT–IoT and IoT-23, which exhibit moderate to extreme imbalance depending on the class and where many borderline samples resemble benign traffic, the Air Quality and Pollution Assessment dataset has an IR of at most 4:1 with very few borderline examples at the threshold; thus, time-series or regression models would be more appropriate than classification models. As described in the prior experiments, WISEST creates many samples when class separability is high, which decreased performance here by adding minority samples near the centroid rather than the borderline.
Table 9. Benchmark results by dataset, metric, and classifier (original vs. WISEST).

4.3. Sensitivity Analysis of the Threshold Distance

Up to this point, the distance threshold (θ) was set using prior experiments on the synthetic dataset described in Section 3.1. This section evaluates how the input threshold affects model performance. We varied the distance parameter dif from 0 to 3 and measured accuracy, precision, recall, and macro-F1 on the BoT–IoT dataset using a Random Forest classifier. Figure 5 shows accuracy and precision, and Figure 6 shows recall and F1.
Figure 5. Quantitative differences as a function of the threshold distance. On the left: accuracy; on the right: precision.
Figure 6. Quantitative differences as a function of the threshold distance. On the left: recall; on the right: F1-score.
As observed, accuracy shows little variation across different dif values (approximately 0.5% relative change on average), whereas precision, recall, and macro-F1 vary more substantially (roughly 5–10% relative change), peaking at dif = 1.5. Therefore, for the BoT–IoT dataset with a Random Forest classifier, the optimal threshold θ is 1.5. From this, we can conclude that it is important to perform similar preliminary analyses for other datasets and classifiers before running the oversampling procedure. However, dynamic threshold selection is left for future work.
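A sweep of this kind can be scripted roughly as below, reusing the wisest_oversample sketch from Section 3.2 on a held-out train/test split (e.g., the one prepared in Section 4.1.1) and a fixed Random Forest. The variable names and the single minority_label simplification are ours (the BoT–IoT experiment is multi-class), so treat it as an outline of the procedure rather than the original script.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

    results = {}
    for theta in np.arange(0.0, 3.01, 0.5):                  # candidate threshold distances
        X_res, y_res = wisest_oversample(X_train, y_train, minority_label=1,
                                         k=5, theta=theta, rng=42)
        clf = RandomForestClassifier(random_state=42).fit(X_res, y_res)
        y_pred = clf.predict(X_test)
        results[theta] = {
            "accuracy": accuracy_score(y_test, y_pred),
            "precision": precision_score(y_test, y_pred, average="macro", zero_division=0),
            "recall": recall_score(y_test, y_pred, average="macro", zero_division=0),
            "f1": f1_score(y_test, y_pred, average="macro"),  # macro-F1, as in Figure 6
        }

    best_theta = max(results, key=lambda t: results[t]["f1"])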

4.4. Discussion

The prior sections described extensive testing across different datasets, from synthetic to real-world, with varying IR, types, contexts, and noise/borderline levels. From these results, we conclude that no single approach is suitable for all cases. Even vanilla SMOTE performed better than other methods for specific datasets under certain conditions.
WISEST uses a weighted interpolation approach to create synthetic samples within a threshold between the majority and minority classes, which has been shown to improve performance (especially the recall and F1 measures) in various scenarios. For instance, as shown in Section 4.2.1, Section 4.2.2 and Section 4.2.3, when there is a significant fraction of minority samples with majority neighbors (borderline points), WISEST is designed to adapt sampling based on local class composition by targeting synthetic samples that safely expand minority coverage around boundaries, improving recall and F1. On the other hand, in very sparse minority structures, highly noisy borders, or clearly separable classes, WISEST will perform as if no oversampling had been applied (i.e., like the original dataset) or will underperform compared to other alternatives, both SMOTE-based and otherwise.
Moreover, in Section 4.2.4, we showed that the WISEST oversampling strategy improved performance when used with multi-class imbalanced datasets. As with KEEL datasets, the number of borderline samples influenced precision, as expected in imbalanced datasets [20,21]. Note that, even if the IR differs for each pair of classes, WISEST uses a single (static) threshold across all the tested datasets in the current implementation, which is expected to yield better performance in the binary case. At the same time, the multi-class setup would require a more careful analysis for each pair of classes. Therefore, it is a limitation we plan to study and overcome as future work. However, we believe that dynamic threshold selection will make WISEST a more robust oversampling method. Nevertheless, as shown in the results, there was an improvement of up to 20% compared to algorithms trained on the original data across most tested datasets.
Regarding time and space complexity, WISEST adds modest computational overhead over vanilla SMOTE and other SMOTE-based approaches, as it performs weighting and distance-threshold checks. In practice, these operations depend on the number of neighbors (k) and the dataset topology. However, most of the operations can be performed in constant time, while the highest cost (i.e., calculating the nearest neighbors) is the same for all SMOTE-like methods: building an NN index (O(n · d)) and performing m k-neighbor queries (typical cost O(m · k · log n · d); worst case O(m · n · d)), where n is the total number of samples, d is the number of features, and m is the number of minority-class samples. The extra operations per returned neighbor are inexpensive vector arithmetic and a few scalar tests, so the asymptotic time complexity remains aligned with other NN-based samplers. Still, the constant factors increase relative to plain SMOTE due to weighting, conditional branches, and the occasional generation of multiple candidates per seed. Memory overhead is also comparable: beyond the storage for the NN index plus O(n_syn · d) for the synthetic points, WISEST may produce fewer or more synthetics than SMOTE depending on the local weights, which temporarily affects peak memory usage. Compared to cleaning pipelines (e.g., SMOTEENN and SMOTE+Tomek), WISEST can be faster overall because it often avoids an expensive second NN pass over the combined dataset. Compared to CGAN-based approaches (e.g., WCOM-KKNBR), WISEST is orders of magnitude cheaper in both runtime and memory since it does not need to train deep networks. Therefore, WISEST’s runtime is attractive for small-to-moderate n and moderate d.
WISEST provides a viable robust alternative to traditional oversampling methods, such as SMOTE and SMOTE-based approaches. However, as the results show, there is no universal solution for imbalanced datasets. All methods have their own potential contributions and limitations. Therefore, we believe it is necessary to run a pre-diagnostic on various parameters, such as IR, fraction border, and number of nearest neighbors (from minority to majority), sample distribution (i.e., local class overlap/noisy borders), and silhouette scores for minority clusters beforehand.

5. Conclusions

Imbalanced datasets significantly influence ML models. However, traditional oversampling methods, such as SMOTE, tend to generate unnecessary synthetic samples, including borderline samples, which can hinder detection. This paper introduces WISEST, a novel oversampling approach that uses a weighted, location-aware strategy to increase sample counts near decision boundaries within a threshold, without generating many unsafe points.
Through extensive experimentation, this paper showed that WISEST is effective on various datasets. Across the complete KEEL collection (low-imbalance, high-imbalance, and noisy/borderline variants), its primary strengths are consistent increases in recall and often net F1 gains. This was also the case on other multi-class datasets (i.e., IoT-23 and BoT–IoT) using different ML models. Thus, we can conclude that the WISEST conditional branching approach can help to address the dataset imbalance problem under the conditions described above.
Future directions for this work include a deep and formal sensitivity analysis and the testing of dynamic thresholds and variables (e.g., θ and k) for multi-class datasets and their effects across different public real-world datasets, such as Credit Card Fraud (European cardholders), Mammography, and NSL-KDD (network intrusion; improved KDD99), among others. We also plan to include other ML models, such as SVM, CNN, and DNN, in the evaluation. Finally, correlational and ablation analyses might shed further light on the influence of each part of the strategy (e.g., weighting and threshold distance).

Author Contributions

Conceptualization, R.M.; data curation, R.M. and L.G.; investigation, L.G.; methodology, R.M. and L.G.; software, R.M. and L.G.; supervision, S.I. and T.S.; writing—original draft, R.M. and L.G.; writing—review and editing, T.S., L.G. and S.I. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The original datasets presented in the study are openly available from the KEEL (Knowledge Extraction based on Evolutionary Learning) repository at https://sci2s.ugr.es/keel/ (accessed on 27 November 2025) or Kaggle at https://www.kaggle.com/ (accessed on 27 November 2025).

Acknowledgments

We would like to thank Takaaki Mizuki and Toru Abe from Tohoku University, Japan, and the anonymous reviewers for their useful discussions and advice, which helped to improve this work.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Full KEEL dataset benchmark results for datasets with an IR less than or equal to 9.
Dataset | Metric | Original | SMOTE | Borderline-SMOTE | ADASYN | SMOTE+Tomek | SMOTEENN | Radius-SMOTE | WCOM-KKNBR | WISEST
glass1Accuracy0.85980.85020.85040.82210.84100.77540.82690.85480.8032
Precision0.89140.82570.80410.77210.81150.67600.76620.88750.7188
Recall0.71000.76330.79000.75000.75000.73580.76330.69750.8025
F10.77920.78580.79080.75110.77250.70140.75910.77650.7492
ecoli-0_vs_1Accuracy0.99090.99090.97730.97270.98640.99090.99090.99090.9864
Precision1.00001.00000.96240.95190.98821.00001.00001.00000.9882
Recall0.97330.97330.97330.97330.97330.97330.97330.97330.9733
F10.98620.98620.96730.96160.98010.98620.98620.98620.9801
wisconsinAccuracy0.97360.97360.96780.97070.97510.97360.97660.96780.9737
Precision0.95120.94450.94350.94360.94460.94460.94820.93700.9444
Recall0.97480.98320.96640.97470.98740.98320.98730.97480.9832
F10.96270.96320.95460.95870.96540.96320.96720.95520.9633
pimaAccuracy0.76300.74730.74340.76040.76040.73040.74730.76430.7421
Precision0.68790.63170.61960.64430.64960.58550.62110.69210.6073
Recall0.58970.67160.70150.70880.68650.81330.72390.58570.7574
F10.63420.65010.65700.67450.66660.67920.66760.63430.6729
iris0All1.00001.00001.0000 1.00001.00001.00001.00001.0000
glass0Accuracy0.86930.86000.85990.85060.85530.81790.86000.87390.8553
Precision0.81250.75890.75500.72450.75950.66350.75150.82430.7360
Recall0.80000.84290.84290.88570.81430.90000.85710.80000.8714
F10.80000.79810.79570.79620.78420.76280.80030.80650.7978
yeast1Accuracy0.78170.76420.76690.75540.77090.72910.77220.78230.7668
Precision0.66820.59060.59450.57190.59770.52330.60320.67030.5878
Recall0.49640.62470.62240.63630.64340.74590.63400.49420.6596
F10.56730.60530.60720.60110.61900.61410.61710.56640.6202
habermanAccuracy0.67300.66000.66000.66320.64690.65010.67960.67310.6894
Precision0.34170.35610.34270.37150.33350.38290.38920.34350.4083
Recall0.25740.34490.31910.39410.33160.50370.35740.24630.3824
F10.29250.34990.32960.38170.33160.43210.37150.28610.3935
vehicle2Accuracy0.98700.98460.98340.98580.98460.97280.98580.98580.9787
Precision0.98170.97340.96850.97280.97340.92040.96930.98160.9401
Recall0.96780.96780.96780.97230.96780.98150.97710.96320.9814
F10.97440.97010.96790.97240.97010.94920.97270.97210.9597
vehicle1Accuracy0.79320.78840.78600.78600.78960.74710.78600.79430.7766
Precision0.63300.58000.57560.58090.58340.50610.57920.63670.5567
Recall0.47480.64510.63570.61240.63110.82000.61710.47470.6448
F10.53990.60850.60300.59230.60440.62470.59330.54280.5955
vehicle3Accuracy0.78960.76710.78250.77190.77660.73280.77660.79320.7707
Precision0.62180.53480.56450.54370.55180.48320.55630.62970.5405
Recall0.40620.59480.62260.59460.59980.79730.55710.41590.6092
F10.48850.56160.59090.56660.57290.60060.55520.49750.5714
glass-0-1-2-
3_vs_4-5-6
Accuracy0.93920.94390.94850.93910.94390.92510.95320.95320.9483
Precision0.88590.87030.88080.85440.87030.80510.89290.87470.8488
Recall0.86180.92000.92000.92000.92000.94000.92000.94000.9800
F10.87170.89010.89690.88130.89010.86150.90420.90520.9067
vehicle0Accuracy0.97160.96810.97520.97040.96690.93500.96330.96930.9551
Precision0.94470.91000.93300.91120.90950.79370.90500.93980.8580
Recall0.93450.95960.96460.96960.95460.98000.94450.92940.9696
F10.93930.93390.94820.93920.93130.87690.92370.93420.9102
ecoli1Accuracy0.91080.91680.89890.89310.91680.87510.90790.90480.9049
Precision0.82310.79940.74880.71970.79250.66710.79220.81170.7717
Recall0.80330.88250.88170.92170.89500.93420.84170.79000.8683
F10.79420.83140.80150.80260.83330.77500.80530.78560.8052
new-
thyroid1
Accuracy0.99070.98140.97210.97670.98140.99070.98140.98140.9860
Precision1.00000.97500.93060.95560.97500.97500.97500.97500.9306
Recall0.94290.91430.91430.91430.91430.97140.91430.91431.0000
F10.96920.93790.91290.92630.93790.97130.93790.93790.9617
new-
thyroid2
Accuracy0.98600.97670.98140.97670.97670.98140.98140.98140.9860
Precision0.97500.97500.97500.95560.97500.97500.97500.97500.9500
Recall0.94290.88570.91430.91430.88570.91430.91430.91430.9714
F10.95330.91670.93790.92630.91670.93790.93790.94050.9579
ecoli2Accuracy0.94640.94940.95540.92860.94940.95830.95830.95230.9494
Precision0.89670.86890.90440.74850.84950.85210.90630.91810.8452
Recall0.73270.79090.78910.80910.80910.88730.80910.75270.8273
F10.80080.81820.83640.77420.82330.86740.84690.81830.8252
segment0Accuracy0.99740.99700.99700.99740.99650.99570.99650.99740.9957
Precision0.99690.99090.99090.99100.99080.98490.98810.99690.9821
Recall0.98480.98780.98780.99090.98480.98480.98780.98480.9878
F10.99080.98930.98930.99090.98780.98480.98790.99080.9849
glass6Accuracy0.96270.98130.97200.97660.98130.96730.96730.97190.9720
Precision0.96670.96670.96670.93810.96670.93330.96670.94290.9667
Recall0.76670.90000.83330.90000.90000.83330.80000.86670.8333
F10.84360.92730.88730.91190.92730.87210.86550.89390.8873
yeast3Accuracy0.94880.95080.94740.95150.94880.94210.94810.94810.9488
Precision0.81730.75720.73470.75460.74970.68530.78140.79880.7675
Recall0.69870.82770.83410.84020.82180.90190.74770.72350.7786
F10.74820.78810.77830.79280.77990.77650.76000.75410.7704
ecoli3Accuracy0.93450.91370.90470.89580.91370.88390.92270.92260.9167
Precision0.80000.56750.53360.50500.56750.47030.66940.73290.6152
Recall0.51430.71430.57140.68570.71430.85710.60000.45710.6000
F10.61510.62590.54620.57070.62590.60620.61920.54080.6005
page-blocks0Accuracy0.97620.97200.97060.96860.97150.96070.97610.97640.9746
Precision0.88930.82470.81400.80010.82110.73840.88090.88830.8597
Recall0.87660.92310.92310.92490.92310.95350.88550.88010.8981
F10.88280.87080.86490.85740.86870.83210.88300.88400.8784

Appendix B

Table A2. Number of new synthetic minority samples per oversampling method for KEEL datasets with an IR greater than 9.
Dataset | SMOTE | Borderline-SMOTE | ADASYN | SMOTE+Tomek | SMOTEENN | Radius-SMOTE | WCOM-KKNBR | WISEST
yeast-2_vs_43303303333293221254161
yeast-0-5-6-7-9_vs_4341341339338321724165
vowel064664664664664629072540
glass-0-1-6_vs_2126126126125115141414
shuttle-c0-vs-c41266760126612661264486982410
yeast-1_vs_7319319321316301212421
glass4150150150150150251037
ecoli4237237236236235591622
page-blocks-1-3_vs_43333333333333307022186
abalone9-18518518519511467283428
glass-0-1-6_vs_513313313213313312712
shuttle-c2-vs-c400000848
yeast-1-4-5-8_vs_7506506504505489112411
glass515715715715715711711
yeast-2_vs_8354283353351338441610
yeast411061106111011031086494147
yeast-1-2-8-9_vs_7710710710708689132413
yeast5111711171115111711151013587
yeast611311131113111291114662855
abalone19328832883285328432453263
cleveland-0_vs_4118118118113996106
ecoli-0-1_vs_2-3-51571571561561486419260
ecoli-0-1_vs_51601601601591566016256
ecoli-0-1-4-6_vs_51921921921921886116269
ecoli-0-1-4-7_vs_2-3-5-62222222222212126723167
ecoli-0-1-4-7_vs_5-62262262252252226420164
ecoli-0-2-3-4_vs_51301301291291266116253
ecoli-0-2-6-7_vs_3-51441441431431385118131
ecoli-0-3-4_vs_51281281281271236116253
ecoli-0-3-4-7_vs_5-61661661641641596420172
ecoli-0-4-6_vs_51301301301301276116257
ecoli-0-6-7_vs_3-51421421441421375118139
ecoli-0-6-7_vs_51441441441441424916141
glass-0-1-4-6_vs_2137137137136128161416
glass-0-1-5_vs_2110110110109101151415
glass-0-4_vs_5595958595917733
glass-0-6_vs_5727273727113713
led7digit-0-2-4-5-6-7-8-9_vs_12952952952951052630128
yeast-0-2-5-6_vs_3-7-8-964564564863860018979144
yeast-0-2-5-7-9_vs_3-6-864564564764363029179112
yeast-0-3-5-9_vs_7-8325325324322301684055
abalone-3_vs_1137830237837837852124
abalone-17_vs_7-8-9-1017781778177617761758374637
abalone-19_vs_10-11-12-13124612461246124111935265
abalone-20_vs_8-9-10149114911492148914789219
abalone-21_vs_8442442442442436121112
car-good12721272127112721130415569
car-vgood12781278128112781117495278
dermatology-62542032542542547316333
flare-F784784789783680163438
kddcup-buffer_overflow_vs_back173869517391738173811524563
kddcup-guess_passwd_vs_satan12290012291229212421060
kddcup-land_vs_portsweep815008158158117378
kr-vs-k-one_vs_fifteen16701670167116701670304621428
kr-vs-k-zero-one_vs_draw21532153215221522149358841542
lymphography-normal-fibrosis00000444
poker-8_vs_6115411541154115411543143
poker-8-9_vs_5162016201618162016157207
poker-9_vs_7182182182182181565
shuttle-2_vs_52574257425752574257419139923
shuttle-6_vs_2-3168168134168168388182
winequality-red-3_vs_5537537538532488181
winequality-red-4119411941183117910628428
winequality-red-8_vs_6-76555246576435581143
winequality-red-8_vs_64964964954854222144
winequality-white-3_vs_76886886886795988168
winequality-white-3-9_vs_511461146114211219493203
winequality-white-9_vs_400000101
zoo-300000404

Appendix C

Table A3. Full KEEL benchmark results (accuracy, precision, recall, and F1) for datasets with IR greater than 9.
Dataset | Metric | Original | SMOTE | Borderline-SMOTE | ADASYN | SMOTE+Tomek | SMOTEENN | Radius-SMOTE | WCOM-KKNBR | WISEST (Ours)
yeast-
2_vs_4
Accuracy0.96310.95730.94950.94950.95530.94170.95920.95920.9592
Precision0.90100.78180.77330.73060.76970.65470.83730.86390.8465
Recall0.72730.80550.72550.82730.80550.90180.74730.72730.7473
F10.79480.79210.73910.76980.78510.75410.78280.77350.7842
yeast-0-5-
6-7-9_vs_4
Accuracy0.91850.92230.90900.91850.91660.88820.92420.91860.9166
Precision0.75000.59300.52310.56760.56420.46110.66400.75330.5906
Recall0.27450.64550.52550.64360.62550.76550.46910.29450.4673
F10.39810.61710.52060.60070.59260.57100.54480.41000.5135
vowel0Accuracy0.99490.99600.99600.99490.99600.99600.99490.99190.9959
Precision0.98950.97890.97890.96950.97890.97890.97840.98950.9684
Recall0.95560.97780.97780.97780.97780.97780.96670.92220.9889
F10.97140.97780.97780.97260.97780.97780.97210.95290.9781
glass-0-1-
6_vs_2
Accuracy0.91160.90120.89580.88530.89600.82280.90650.91160.9012
Precision0.00000.53330.43330.45710.50000.25350.06670.00000.0667
Recall0.00000.35000.23330.36670.35000.51670.05000.00000.0500
F10.00000.38670.28380.35050.37330.33690.05710.00000.0571
shuttle-c0-vs-c4All1.00001.00001.00001.00001.00001.00001.00001.00001.0000
yeast-
1_vs_7
Accuracy0.94340.91510.92160.90640.91070.84740.93900.94120.9346
Precision0.45000.34240.35000.28030.29670.20470.45000.43330.3300
Recall0.26670.40000.36670.33330.36670.46670.30000.23330.2667
F10.33330.36300.35710.29930.32420.28180.35180.30220.2927
glass4Accuracy0.96260.96710.98130.96710.96710.95290.97190.96260.9672
Precision0.70000.68330.80000.68330.68330.62670.80000.70000.6500
Recall0.50000.83330.93330.83330.83330.83330.76670.50000.7333
F10.56000.74480.85330.74480.74480.70050.75330.56000.6648
ecoli4Accuracy0.97620.97030.96140.96440.97030.96740.97920.97620.9792
Precision0.93330.80000.69670.73000.80000.76330.88330.96000.8667
Recall0.65000.65000.65000.65000.65000.70000.75000.65000.7500
F10.75240.70950.66750.67700.70950.72460.80710.74920.8000
page-blocks-
1-3_vs_4
Accuracy0.99160.99580.99790.99790.99580.99581.00000.99160.9958
Precision0.96671.00001.00001.00001.00001.00001.00000.96670.9381
Recall0.90000.93330.96670.96670.93330.93331.00000.90001.0000
F10.92360.96360.98180.98180.96360.96361.00000.92360.9664
abalone9-18Accuracy0.94800.91520.92890.91380.91930.87690.94940.95080.9467
Precision0.50000.38200.42330.37820.43200.24760.67670.70000.5333
Recall0.12220.50560.41110.48060.52780.52780.24440.16940.2444
F10.19640.40920.39490.39870.43690.32510.33640.26550.3216
glass-0-1-
6_vs_5
Accuracy0.97820.98350.98350.98350.98350.98890.97810.98360.9782
Precision0.80000.80000.80000.80000.80000.80000.80000.90000.8000
Recall0.80000.70000.70000.70000.70000.80000.60000.80000.8000
F10.76670.73330.73330.73330.73330.80000.66670.80000.7667
shuttle-c2-
vs-c4
Accuracy0.9923-----0.99230.99231.0000
Precision1.0000-----1.00001.00001.0000
Recall0.9000-----0.90000.90001.0000
F10.9333-----0.93330.93331.0000
yeast-1-4-
5-8_vs_7
Accuracy0.95670.92490.94080.92780.92640.88890.95670.95670.9553
Precision0.00000.13000.19050.13330.08330.12050.00000.00000.0000
Recall0.00000.13330.13330.10000.06670.26670.00000.00000.0000
F10.00000.12990.15040.11110.07330.16540.00000.00000.0000
glass5Accuracy0.97670.97670.97210.97670.97670.98140.97210.97670.9721
Precision0.73330.53330.53330.53330.53330.73330.53330.73330.5333
Recall0.60000.60000.50000.60000.60000.70000.50000.60000.5000
F10.62670.56000.49330.56000.56000.69330.49330.62670.4933
yeast-2_vs_8Accuracy0.97510.96060.95640.93980.95850.93570.97510.97920.9751
Precision0.95000.56760.46670.30000.57500.32940.95000.95000.9500
Recall0.45000.50000.15000.30000.50000.55000.45000.55000.4500
F10.56140.47310.21710.28890.46380.38840.56140.64330.5614
yeast4Accuracy0.96500.95210.95550.95220.95280.91780.96900.96770.9656
Precision0.33330.32680.33980.31470.33310.22610.76670.56670.5600
Recall0.07820.40360.34550.38360.40360.54360.15640.11640.1764
F10.12560.35610.33000.34060.35560.31620.25500.18840.2605
yeast-1-2-
8-9_vs_7
Accuracy0.96940.95040.95140.94510.94620.90700.96940.97150.9704
Precision0.50000.24970.21780.15820.19670.09690.61670.73330.6333
Recall0.13330.23330.16670.16670.20000.23330.23330.23330.2333
F10.20710.23090.17860.15990.19050.13440.32490.33650.3294
yeast5Accuracy0.98110.98380.98250.98250.98380.98050.98520.98180.9852
Precision0.82500.70600.67690.68390.70600.63150.77390.81000.7589
Recall0.49170.81110.80830.78610.81110.88060.72220.56110.7444
F10.56310.74000.71300.70940.74000.72540.73690.62120.7406
yeast6Accuracy0.98180.97780.97980.97710.97710.96500.98450.98250.9825
Precision0.75000.55180.60830.55170.53790.38700.76000.73670.6795
Recall0.34290.60000.54290.60000.57140.65710.48570.40000.4857
F10.46060.56460.56010.56420.53890.47610.58540.50460.5550
abalone19Accuracy0.99230.97440.98590.97410.97510.94300.99230.99190.9923
Precision0.00000.04580.05830.04670.04730.02380.00000.00000.0000
Recall0.00000.12860.06190.12860.12860.15240.00000.00000.0000
F10.00000.06760.05930.06840.06900.04110.00000.00000.0000
cleveland-
0_vs_4
Accuracy0.94810.97110.95970.97110.96540.94820.94820.94820.9539
Precision0.60000.60000.55000.60000.60000.50000.60000.60000.6000
Recall0.30000.60000.53330.60000.53330.53330.33330.33330.4000
F10.39330.60000.53140.60000.56000.49330.40000.40000.4600
ecoli-0-
1_vs_2-3-5
Accuracy0.96310.95090.95510.95090.95090.94680.96320.95900.9591
Precision0.91000.78100.86670.78330.75760.74330.92000.84330.8933
Recall0.71000.76000.68000.80000.80000.80000.72000.71000.7200
F10.77980.74890.73750.77200.76440.75420.77980.76560.7653
ecoli-0-
1_vs_5
Accuracy0.97500.96670.96250.97080.96670.96670.96670.97500.9792
Precision0.76000.89330.73330.84330.89330.82670.72000.76000.9600
Recall0.75000.75000.65000.85000.75000.80000.70000.75000.8000
F10.74920.75110.66480.82110.75110.78540.68890.74920.8444
ecoli-0-1-4-
6_vs_5
Accuracy0.97860.96790.96790.96790.96790.97140.97140.97140.9750
Precision0.92000.78000.81670.75000.78000.79000.87000.87000.8700
Recall0.80000.80000.70000.85000.80000.85000.75000.75000.8000
F10.84760.78250.75000.79440.78250.81670.78810.78810.8167
ecoli-0-1-4-
7_vs_2-3-5-6
Accuracy0.96430.96440.97020.96440.96730.96130.96430.96130.9732
Precision0.95000.83110.86000.82500.84500.77260.92000.88000.9000
Recall0.62000.78670.79330.78670.78670.82670.65330.65330.7933
F10.74180.79880.82120.79740.80830.79130.75520.74060.8358
ecoli-0-1-4-
7_vs_5-6
Accuracy0.97590.96680.96980.96680.96980.96380.97590.97590.9820
Precision0.95000.77000.84330.79440.80330.76430.95000.95000.9333
Recall0.72000.80000.76000.84000.80000.80000.72000.72000.8400
F10.81110.78060.78880.79940.79880.77170.81110.81110.8732
ecoli-0-2-
3-4_vs_5
Accuracy0.97060.96550.97050.96550.96550.96550.97060.97060.9754
Precision0.88330.86000.93330.81000.86000.81000.88330.88330.9500
Recall0.80000.80000.75000.85000.80000.85000.80000.80000.8000
F10.83570.82060.82860.82780.82060.82780.83570.83570.8643
ecoli-0-2-
6-7_vs_3-5
Accuracy0.96440.95100.95550.93740.95550.93310.96440.96000.9643
Precision0.90000.72670.76000.65330.74670.65830.90000.90000.8600
Recall0.71000.79000.70000.79000.79000.79000.71000.66000.7400
F10.77860.74780.71110.69850.76110.70440.77860.74050.7683
ecoli-0-3-
4_vs_5
Accuracy0.97000.96500.97500.97000.97000.96000.97000.96500.9800
Precision0.89330.86000.96000.87000.86000.84330.89330.88330.9100
Recall0.80000.80000.80000.85000.85000.80000.80000.75000.9000
F10.83490.82060.85400.84840.84920.79250.83490.80710.8992
ecoli-0-3-
4-7_vs_5-6
Accuracy0.95730.95320.96100.94560.95320.95320.96510.96510.9571
Precision0.92000.77950.85950.74780.77950.77950.88330.92000.8295
Recall0.64000.76000.76000.76000.76000.76000.76000.72000.7600
F10.73430.75180.78180.72800.75180.75180.80660.79780.7685
ecoli-0-4-
6_vs_5
Accuracy0.97540.95560.96550.95560.95560.96060.97040.97050.9704
Precision0.96000.81670.91000.79430.81670.82670.91000.92000.8533
Recall0.80000.80000.75000.85000.80000.85000.80000.80000.9000
F10.85400.77480.80400.79940.77480.80250.83250.83170.8584
ecoli-0-6-
7_vs_3-5
Accuracy0.96400.95960.95060.94160.95960.94150.95950.96400.9640
Precision0.96000.84330.76670.72670.84330.67670.92000.96000.8933
Recall0.68000.78000.74000.78000.78000.78000.68000.68000.7800
F10.77670.78670.74100.72510.78670.71660.75440.77670.8033
ecoli-0-6-
7_vs_5
Accuracy0.97270.95910.95910.95910.95910.93180.97730.97270.9727
Precision0.96000.80760.80330.75760.80760.70430.92000.96000.8933
Recall0.75000.85000.80000.90000.85000.80000.85000.75000.8500
F10.80060.78800.74560.80470.78800.66750.86030.80060.8425
glass-0-1-
4-6_vs_2
Accuracy0.91220.87800.87320.90730.87800.81460.91710.90730.9024
Precision0.00000.37330.35900.50330.37330.25910.15000.00000.2333
Recall0.00000.38330.38330.43330.38330.50000.20000.00000.2000
F10.00000.30780.29440.41100.30780.30380.17140.00000.2133
glass-0-1-
5_vs_2
Accuracy0.90150.84290.84860.84290.84290.77870.91290.90130.9072
Precision0.20000.19050.28330.25000.19050.16510.63330.20000.6000
Recall0.06670.23330.28330.28330.23330.35000.28330.06670.2333
F10.10000.19430.27860.26430.19430.22240.38100.10000.3333
glass-0-4_vs_5All1.00001.00001.00001.00001.00001.00001.00001.00001.0000
glass-0-
6_vs_5
Accuracy0.99050.98180.97230.99090.98180.99090.99050.99050.9905
Precision0.80000.93330.73331.00000.93331.00000.80000.80000.8000
Recall0.80000.90000.70000.90000.90000.90000.80000.80000.8000
F10.80000.89330.69330.93330.89330.93330.80000.80000.8000
led7digit-
0-2-4-5-6-
7-8-9_vs_1
Accuracy0.96620.96840.96620.96170.96840.93240.96620.96620.9594
Precision0.79840.80120.79840.76340.80120.86670.79840.79840.7523
Recall0.80710.83210.80710.80710.83210.30000.80710.80710.8071
F10.80050.81370.80050.78110.81370.42150.80050.80050.7737
yeast-0-2-
5-6_vs_3-7-
8-9
Accuracy0.93820.92930.92830.91930.92930.90440.93720.93730.9323
Precision0.81430.64960.66620.59820.65350.51720.76820.77900.7099
Recall0.48420.63530.56530.60530.62530.69580.53530.51470.5447
F10.60300.63730.60670.59590.63420.59050.62610.61680.6129
yeast-0-2-
5-7-9_vs_3-
6-8
Accuracy0.96410.95920.96510.95720.95820.95320.96310.96610.9641
Precision0.85870.78370.85880.76860.78070.72660.84390.85840.8328
Recall0.76950.81950.78890.82950.81950.85000.77950.80000.8095
F10.80660.79820.81650.79310.79520.78080.80490.82130.8159
yeast-0-3-5-
9_vs_7-8
Accuracy0.91110.87750.88530.87940.88730.81820.90720.91510.9131
Precision0.63330.38750.40950.41950.43890.30460.49330.63330.5950
Recall0.18000.46000.36000.46000.48000.62000.24000.24000.3200
F10.26200.40990.37390.42410.44720.40450.31500.33810.4076
abalone-
3_vs_11
Accuracy0.99800.99800.99800.99800.99800.99800.99801.00000.9940
Precision0.95000.95000.95000.95000.95000.95000.95001.00000.9000
Recall1.00001.00001.00001.00001.00001.00001.00001.00001.0000
F10.97140.97140.97140.97140.97140.97140.97141.00000.9333
abalone-
17_vs_7-8-
9-10
Accuracy0.97560.95770.96150.95680.95640.93970.97650.97480.9748
Precision0.63330.28730.30540.29070.28110.24310.56670.36670.5100
Recall0.10450.44700.41360.43030.44850.58640.15910.05150.1561
F10.17610.34510.34600.33540.34270.33600.24570.08860.2349
abalone-
19_vs_10-
11-12-13
Accuracy0.98030.94270.96050.94640.94270.89640.98030.98030.9803
Precision0.00000.09260.08890.08600.06710.06220.00000.00000.0000
Recall0.00000.22380.12860.20000.16670.32380.00000.00000.0000
F10.00000.13020.10280.11990.09550.10370.00000.00000.0000
abalone-
20_vs_8-9-10
Accuracy0.98590.97340.97860.97230.97080.96400.98490.98590.9854
Precision0.00000.22710.28850.22530.18330.17750.00000.00000.0000
Recall0.00000.48000.43330.48000.40000.51330.00000.00000.0000
F10.00000.30510.34380.30340.24930.26240.00000.00000.0000
abalone-
21_vs_8
Accuracy0.97760.96900.97760.96380.96900.96210.97590.97760.9794
Precision0.33330.50830.60330.39170.50830.34880.30000.33330.5500
Recall0.20000.56670.56670.56670.56670.66670.13330.20000.3333
F10.23330.46620.53000.43350.46620.43610.18000.23330.3714
car-goodAccuracy0.94910.94790.94850.95020.94790.93460.94970.95020.9485
Precision0.32280.35490.35940.37170.35490.35350.36110.34250.3363
Recall0.24730.31870.33300.33410.31870.75160.31870.26150.2747
F10.27800.32870.33960.34850.32870.47820.33520.29190.2996
car-vgoodAccuracy0.96180.95950.96010.96010.95950.96120.96180.96060.9589
Precision0.51020.45850.47760.47760.45850.49540.50160.49690.4734
Recall0.33850.33850.35380.35380.33850.78460.33850.32310.3692
F10.39130.38120.39910.39910.38120.60550.39070.37710.4006
dermatology-6All1.00001.00001.00001.00001.00001.00001.00001.00001.0000
flare-FAccuracy0.95030.93810.93900.93530.94000.91560.94750.94930.9437
Precision0.13330.14710.16500.13790.17890.24490.12500.15500.1817
Recall0.06940.14170.16670.14170.19170.51390.06940.09440.1667
F10.09080.14210.16380.13640.18290.32990.08760.11580.1705
kddcup-
buffer_overflow_
vs_back
All1.00001.00001.00001.00001.00001.00001.00001.00001.0000
kddcup-
guess_passwd_
vs_satan
All1.00001.00001.0000-1.00001.00001.00001.00001.0000
kddcup-
land_vs_
portsweep
All1.00001.00001.0000-1.00001.00001.00000.99911.0000
kr-vs-k-
one_vs_fifteen
All1.00001.00001.00001.00001.00001.00001.00001.00001.0000
kr-vs-k-zero-
one_vs_draw
Accuracy0.99590.99590.99550.99620.99660.99450.99760.99660.9983
Precision0.97950.94450.96090.94370.96230.88820.98100.99050.9814
Recall0.90480.94290.91430.95240.94290.97140.95240.91430.9714
F10.94000.94280.93660.94770.95190.92740.96610.95020.9761
lymphography-
normal-
fibrosis
Accuracy0.9864-----0.98640.98640.9864
Precision0.6000-----0.60000.60000.6000
Recall0.6000-----0.60000.60000.6000
F10.6000-----0.60000.60000.6000
poker-8_vs_6Accuracy0.98850.99460.99320.99390.99460.99460.98850.98850.9885
Precision0.00000.80000.60000.80000.80000.8000---
Recall0.00000.55000.43330.48330.55000.5500---
F10.00000.61330.49330.57330.61330.6133---
poker-8-9_vs_5Accuracy0.98800.98550.98700.98600.98550.98270.98800.98800.9880
Precision0.00000.00000.00000.00000.00000.00000.00000.00000.0000
Recall0.00000.00000.00000.00000.00000.00000.00000.00000.0000
F10.00000.00000.00000.00000.00000.00000.00000.00000.0000
poker-9_vs_7Accuracy0.97130.97130.96730.97540.97130.96720.97130.97130.9713
Precision0.20000.20000.00000.40000.20000.20000.20000.20000.2000
Recall0.20000.10000.00000.20000.10000.10000.20000.20000.2000
F10.20000.13330.00000.26670.13330.13330.20000.20000.2000
shuttle-2_vs_5All1.00001.00001.00001.00001.00001.00001.00001.00001.0000
shuttle-6_vs_2-3All1.00001.00001.00001.00001.00001.00001.00001.00001.0000
winequality-
red-3_vs_5
Accuracy0.98550.96530.97680.96670.96670.95660.98550.98550.9855
Precision---------
Recall---------
F1---------
winequality-
red-4
Accuracy0.96690.94750.96000.94750.94810.91370.96620.96690.9656
Precision0.10000.18310.06670.17540.14580.12680.00000.10000.0000
Recall0.01820.16910.02000.13270.11270.26180.00000.01820.0000
F10.03080.17480.03080.14730.12680.16970.00000.03080.0000
winequality-
red-8_vs_6-7
Accuracy0.98130.96020.98010.96020.96140.92160.98130.98130.9813
Precision0.40000.13330.40000.13330.14670.11030.40000.40000.4000
Recall0.10000.20000.10000.20000.20000.38330.10000.10000.1000
F10.16000.16000.16000.16000.16890.17000.16000.16000.1600
winequality-
red-8_vs_6
Accuracy0.97410.95430.97260.95270.94810.92380.97410.97260.9726
Precision0.40000.24440.36670.23050.21330.14460.30000.30000.3000
Recall0.11670.30000.16670.30000.30000.41670.11670.11670.1167
F10.18000.24760.21330.24430.22580.20620.16000.16000.1600
winequality-
white-3_vs_7
Accuracy0.98110.97110.97780.97000.96780.94780.98110.98000.9800
Precision0.40000.16670.30000.16670.12860.18120.60000.40000.6000
Recall0.15000.15000.15000.15000.10000.40000.20000.15000.2000
F10.21330.14670.18000.14670.10300.24570.29330.21330.2933
winequality-
white-3-9_vs_5
Accuracy0.98250.97300.97840.97170.97230.95750.98250.98250.9818
Precision0.20000.11190.25000.08330.10670.09930.20000.20000.2000
Recall0.04000.12000.08000.08000.12000.20000.04000.04000.0400
F10.06670.11410.11110.08080.11270.13070.06670.06670.0667
winequality-
white-9_vs_4
Accuracy0.9763-----0.97630.97630.9763
Precision0.2000-----0.20000.20000.2000
Recall0.2000-----0.20000.20000.2000
F10.2000-----0.20000.20000.2000
zoo-3Accuracy0.9705-----0.97050.97050.9705
Precision0.4000-----0.40000.40000.4000
Recall0.4000-----0.40000.40000.4000
F10.4000-----0.40000.40000.4000
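
For readers who wish to reproduce comparisons of this kind, the following is a minimal, hypothetical sketch of the evaluation loop behind the table above, built on the publicly available imbalanced-learn implementations of SMOTE, Borderline-SMOTE, ADASYN, SMOTE + Tomek, and SMOTE ENN. The random-forest classifier, the 80/20 stratified split, the fixed seed, and the file name yeast-2_vs_4.csv are illustrative assumptions, not details taken from this article; Radius-SMOTE, WCOM KKNB, and WISEST itself are not part of imbalanced-learn and are therefore omitted from the sketch.

```python
# Minimal sketch (assumptions noted above): oversample the training split only,
# train a classifier, and report the four metrics used in the appendix tables.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from sklearn.model_selection import train_test_split
from imblearn.over_sampling import SMOTE, BorderlineSMOTE, ADASYN
from imblearn.combine import SMOTETomek, SMOTEENN


def evaluate(X, y, oversampler=None, seed=42):
    """Train on (optionally oversampled) data and return Accuracy/Precision/Recall/F1."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed)
    if oversampler is not None:
        # Synthetic samples are generated from the training fold only.
        X_tr, y_tr = oversampler.fit_resample(X_tr, y_tr)
    clf = RandomForestClassifier(random_state=seed).fit(X_tr, y_tr)
    y_hat = clf.predict(X_te)
    return {
        "Accuracy": accuracy_score(y_te, y_hat),
        "Precision": precision_score(y_te, y_hat, zero_division=0),
        "Recall": recall_score(y_te, y_hat, zero_division=0),
        "F1": f1_score(y_te, y_hat, zero_division=0),
    }


# Example usage on a KEEL-style CSV whose minority class is labelled 1:
# df = pd.read_csv('yeast-2_vs_4.csv')
# X, y = df.drop(columns='Class').values, df['Class'].values
# samplers = {
#     "Original": None,
#     "SMOTE": SMOTE(random_state=42),
#     "Borderline-SMOTE": BorderlineSMOTE(random_state=42),
#     "ADASYN": ADASYN(random_state=42),
#     "SMOTE + Tomek": SMOTETomek(random_state=42),
#     "SMOTE ENN": SMOTEENN(random_state=42),
# }
# results = {name: evaluate(X, y, s) for name, s in samplers.items()}
# print(pd.DataFrame(results).T.round(4))
```

Running the commented example on any KEEL-style CSV produces one Accuracy/Precision/Recall/F1 row per sampler, in the same layout as the columns of Table A3.
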

Appendix D

Table A4. Full KEEL benchmark results (accuracy, precision, recall, and F1) for all noisy/borderline datasets. Best results in bold.
Dataset | Metric | Original | SMOTE | Borderline-SMOTE | ADASYN | SMOTE + Tomek | SMOTE ENN | Radius-SMOTE | WCOM KKNB | WISEST (Ours)
03subcl5-
600-5-0-BI
Accuracy0.94830.94670.94500.94830.94500.92330.94000.95330.9483
Precision0.83780.82350.81310.81810.81760.72630.79810.87710.7810
Recall0.86000.87000.89000.90000.87000.88000.87000.84000.9700
F10.84570.84300.84540.85420.84030.79400.82920.85510.8641
03subcl5-
600-5-30-BI
Accuracy0.88500.85170.84830.83500.83670.83500.87000.89170.8667
Precision0.70520.54470.53630.50560.51530.50700.60170.72220.5890
Recall0.57000.74000.76000.76000.70000.81000.66000.60000.7000
F10.62630.62500.62780.60550.59060.62270.62890.65210.6387
03subcl5-
600-5-50-BI
Accuracy0.86000.81500.81500.81000.81000.78000.83500.85830.8350
Precision0.62450.47300.46920.45820.46060.42420.51430.61560.5125
Recall0.46000.68000.68000.71000.67000.83000.52000.46000.5800
F10.52780.55420.55360.55610.54330.55910.51570.52420.5432
03subcl5-
600-5-60-BI
Accuracy0.83500.80500.79330.80330.80330.78500.80500.83330.8000
Precision0.51500.44360.42460.44510.43840.42500.41850.49920.4201
Recall0.37000.65000.63000.71000.63000.81000.46000.35000.5400
F10.42430.52590.50540.54580.51610.55720.43690.41020.4710
03subcl5-
600-5-70-BI
Accuracy0.83670.80500.79170.79670.81330.77000.81000.84000.7983
Precision0.54420.45160.42670.44310.46670.40660.43710.52730.4177
Recall0.35000.73000.66000.79000.76000.80000.46000.37000.4800
F10.41420.55650.51630.56570.57640.53810.44430.43150.4415
03subcl5-
800-7-0-BI
Accuracy0.95380.95250.95380.95750.95500.94620.95500.95630.9550
Precision0.82120.77440.78960.79630.78780.72740.79200.85600.7662
Recall0.81000.88000.87000.89000.88000.91000.87000.78000.9300
F10.81490.82240.82350.83910.82970.80670.82830.81550.8387
03subcl5-
800-7-30-BI
Accuracy0.91130.86130.85880.85750.85750.82630.90250.90880.9025
Precision0.69940.46470.46180.47070.45520.40120.60790.69190.6117
Recall0.54000.69000.69000.74000.68000.74000.63000.52000.6600
F10.59900.55370.55140.57070.54340.51850.61490.58730.6324
03subcl5-
800-7-50-BI
Accuracy0.88380.82000.83130.81380.80880.77130.87380.88630.8625
Precision0.58830.37510.40400.36880.36040.31800.49530.60700.4525
Recall0.34000.64000.68000.67000.64000.72000.47000.33000.4800
F10.42460.47200.50540.47400.45900.44010.48200.42440.4650
03subcl5-
800-7-60-BI
Accuracy0.87870.82000.82000.80880.80880.78750.85880.87500.8412
Precision0.53830.37470.36840.36240.35660.33710.43550.51100.3693
Recall0.30000.61000.57000.67000.64000.73000.38000.28000.3700
F10.37800.46170.44540.46910.45720.46070.40090.35730.3636
03subcl5-
800-7-70-BI
Accuracy0.86000.80500.80380.80000.79250.75130.84630.86000.8412
Precision0.45650.34610.34260.34770.32930.30280.38710.44120.3815
Recall0.25000.63000.61000.68000.63000.73000.39000.21000.3900
F10.30820.44660.43800.45970.43180.42630.38380.27370.3836
04clover5z-
600-5-0-BI
Accuracy0.92000.91500.92000.90830.92000.89670.91500.93500.9150
Precision0.78020.71810.71660.68190.72350.64460.72560.84380.6931
Recall0.74000.83000.87000.87000.87000.89000.81000.76000.9000
F10.75590.76830.78440.76250.78670.74510.76230.79690.7816
04clover5z-
600-5-30-BI
Accuracy0.87170.86670.86000.86000.86000.82830.86830.87000.8700
Precision0.65730.58750.56720.56150.56900.50110.60670.64460.5955
Recall0.49000.72000.73000.75000.71000.80000.63000.51000.7100
F10.55690.64380.63650.64060.62940.61130.61130.56420.6451
04clover5z-
600-5-50-BI
Accuracy0.84330.83500.83670.85330.83500.80500.85170.84500.8417
Precision0.55590.51550.51260.54870.51790.46130.55980.55350.5259
Recall0.35000.69000.70000.78000.69000.82000.55000.36000.5800
F10.42490.58510.59030.64210.58490.58800.54840.43220.5480
04clover5z-
600-5-60-BI
Accuracy0.82170.81170.79830.80000.80500.76500.81170.82670.8067
Precision0.43080.45940.43680.44010.44710.39350.44720.46150.4315
Recall0.30000.66000.63000.66000.65000.74000.47000.34000.4900
F10.34970.53970.51330.52660.52700.51210.45420.38770.4557
04clover5z-
600-5-70-BI
Accuracy0.83330.81330.79830.80170.81330.76830.82330.82000.8083
Precision0.52370.46510.42630.43790.46600.40930.47810.47650.4309
Recall0.32000.66000.60000.64000.71000.78000.47000.31000.4700
F10.38740.54170.49530.51810.55830.53180.46870.36530.4442
04clover5z-
800-7-0-BI
Accuracy0.94880.94630.95500.94500.94870.92000.94500.94500.9325
Precision0.87750.74420.79180.74120.76090.62820.79320.88000.6957
Recall0.69000.87000.88000.86000.86000.89000.76000.65000.8200
F10.76930.80120.83020.79510.80670.73560.77410.74580.7513
04clover5z-
800-7-30-BI
Accuracy0.91000.88620.88130.88250.88250.85750.90000.90750.8975
Precision0.73190.53690.51920.52320.52450.45680.62270.72300.5958
Recall0.44000.70000.72000.70000.69000.75000.53000.43000.5800
F10.54890.60600.60250.59770.59480.56690.56970.53750.5857
04clover5z-
800-7-50-BI
Accuracy0.87380.83380.83750.83630.83130.80870.87000.87130.8500
Precision0.49950.39870.40080.41160.39100.36900.47810.46050.3998
Recall0.24000.62000.57000.68000.62000.73000.41000.20000.3900
F10.32230.48290.46850.51160.47840.48910.43970.27680.3915
04clover5z-
800-7-60-BI
Accuracy0.85130.82120.83250.82750.82500.80250.84250.85750.8363
Precision0.35210.38390.40000.39920.38920.36780.37260.39190.3566
Recall0.20000.62000.61000.65000.62000.74000.33000.22000.3600
F10.25190.47180.48190.49200.47560.48950.34340.27330.3556
04clover5z-
800-7-70-BI
Accuracy0.86870.82370.82000.82620.81880.78630.85250.87000.8525
Precision0.45460.36560.35530.38010.36400.34140.40550.48830.4055
Recall0.23000.59000.55000.64000.62000.75000.37000.20000.3900
F10.30080.44950.43130.47510.45710.46770.38430.28090.3972
paw02a-
600-5-0-BI
Accuracy0.97000.96670.97170.97330.97000.96000.97000.97000.9650
Precision0.91320.87140.90120.89090.88290.83440.89040.91320.8594
Recall0.91000.94000.94000.96000.95000.95000.94000.91000.9500
F10.91040.90320.91850.92330.91320.88790.91250.91040.9011
paw02a-
600-5-30-BI
Accuracy0.92830.90330.89830.88830.89670.87500.91330.93000.8967
Precision0.88210.68840.66980.63000.66730.60480.75520.87540.6753
Recall0.69000.81000.81000.86000.82000.84000.76000.71000.8200
F10.75860.73780.72610.72310.72910.69580.74440.76880.7290
paw02a-
600-5-50-BI
Accuracy0.89000.87000.86330.85830.86170.84670.88000.89830.8733
Precision0.71530.59070.57680.55730.56750.52920.63040.75290.6142
Recall0.58000.76000.77000.79000.76000.80000.69000.59000.7000
F10.63420.66010.65270.64950.64650.63490.65560.65550.6490
paw02a-
600-5-60-BI
Accuracy0.86830.83500.82830.83170.84000.80830.84500.86170.8583
Precision0.62490.50430.48980.49660.51430.45480.53020.60400.5711
Recall0.53000.73000.72000.78000.75000.75000.59000.52000.6300
F10.56930.59530.58230.60590.60950.56510.55770.55290.5964
paw02a-
600-5-70-BI
Accuracy0.86830.83000.83330.82330.82500.82000.83500.86670.8350
Precision0.62040.49590.50170.48230.48700.47920.50410.61290.5068
Recall0.54000.72000.72000.72000.75000.82000.65000.54000.6600
F10.57560.58550.58950.57590.58880.60380.56710.57330.5720
paw02a-
800-7-0-BI
Accuracy0.97120.96880.97370.97000.96880.96500.97000.97130.9675
Precision0.88840.84170.87930.84490.84170.80720.85920.89200.8285
Recall0.89000.93000.92000.94000.93000.95000.92000.89000.9500
F10.88640.88200.89810.88790.88200.87170.88560.88700.8816
paw02a-
800-7-30-BI
Accuracy0.94630.89880.91130.88630.90000.87630.93750.94630.9250
Precision0.87700.57200.62410.53250.58240.50120.77490.87200.6850
Recall0.66000.77000.78000.81000.75000.83000.71000.67000.7400
F10.74830.65410.68820.64120.65160.62460.73550.74980.7105
paw02a-
800-7-50-BI
Accuracy0.92380.86000.85750.85750.86500.84630.90250.92500.8900
Precision0.78900.46110.45290.45660.47540.43930.61340.77410.5553
Recall0.53000.72000.68000.74000.73000.81000.62000.56000.6500
F10.63280.56160.54190.56450.57490.56910.61270.64670.5961
paw02a-
800-7-60-BI
Accuracy0.90000.85380.85750.84130.85380.81750.89130.89880.8787
Precision0.64640.44740.46210.42590.44760.38560.58410.64870.5283
Recall0.47000.70000.70000.73000.71000.77000.55000.47000.5400
F10.53630.54520.55530.53680.54810.51310.56070.53590.5292
paw02a-
800-7-70-BI
Accuracy0.88880.85130.85630.83750.84750.81250.87870.88000.8663

References

1. Shimizu, H.; Nakayama, K.I. Artificial intelligence in oncology. Cancer Sci. 2020, 111, 1452–1460.
2. Shinde, P.; Shah, S. A Review of Machine Learning and Deep Learning Applications. In Proceedings of the 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA), Pune, India, 16–18 August 2018; pp. 1–6.
3. Mazurowski, M.A.; Habas, P.A.; Zurada, J.M.; Lo, J.Y.; Baker, J.A.; Tourassi, G.D. Training neural network classifiers for medical decision making: The effects of imbalanced datasets on classification performance. Neural Netw. 2008, 21, 427–436.
4. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Intell. Res. 2002, 16, 321–357.
5. He, H.; Bai, Y.; Garcia, E.A.; Li, S. ADASYN: Adaptive synthetic sampling approach for imbalanced learning. In Proceedings of the 2008 IEEE International Joint Conference on Neural Networks (IJCNN), Hong Kong, China, 1–8 June 2008; pp. 1322–1328.
6. Han, H.; Wang, W.Y.; Mao, B.H. Borderline-SMOTE: A New Over-Sampling Method in Imbalanced Data Sets Learning. In Proceedings of the Advances in Intelligent Computing: ICIC 2005 Proceedings, Hefei, China, 23–26 August 2005; pp. 878–887.
7. Batista, G.; Prati, R.C.; Monard, M.C. A study of the behavior of several methods for balancing machine learning training data. ACM SIGKDD Explor. Newsl. 2004, 6, 20–29.
8. Zhou, H.; Pan, H.; Zheng, K.; Wu, Z.; Xiang, Q. A novel oversampling method based on Wasserstein CGAN for imbalanced classification. Cybersecurity 2025, 8, 2025.
9. Matsui, R.; Guillen, L.; Izumi, S.; Mizuki, T.; Suganuma, T. An Oversampling Method Using Weight and Distance Thresholds for Cyber Attack Detection. In Proceedings of the 1st International Conference on Artificial Intelligence Computing and Systems (AICompS 2024), Jeju, Republic of Korea, 16–18 December 2024; pp. 48–52.
10. Aditsania, A.; Adiwijaya; Saonard, A.L. Handling imbalanced data in churn prediction using ADASYN and backpropagation algorithm. In Proceedings of the 2017 3rd International Conference on Science in Information Technology (ICSITech), Bandung, Indonesia, 25–26 October 2017; pp. 533–536.
11. Gameng, H.A.; Gerardo, G.B.; Medina, R.P. Modified Adaptive Synthetic SMOTE to Improve Classification Performance in Imbalanced Datasets. In Proceedings of the 2019 IEEE 6th International Conference on Engineering Technologies and Applied Sciences (ICETAS), Kuala Lumpur, Malaysia, 20–21 December 2019; pp. 1–5.
12. Dai, Q.; Liu, J.; Zhao, J.L. Distance-based arranging oversampling technique for imbalanced data. Neural Comput. Appl. 2023, 35, 1323–1342.
13. Pradipta, G.A.; Wardoyo, R.; Musdholifah, A.; Sanjaya, I.N.H. Radius-SMOTE: A New Oversampling Technique of Minority Samples Based on Radius Distance for Learning From Imbalanced Data. IEEE Access 2021, 9, 74763–74777.
14. Batista, G.; Prati, R.C.; Monard, M.C. Balancing training data for automated annotation of keywords: A case study. In Proceedings of the II Brazilian Workshop on Bioinformatics, Macaé, RJ, Brazil, 3–5 December 2003; pp. 10–18.
15. Guan, H.; Zhang, Y.; Xian, M.; Cheng, H.D.; Tang, X. SMOTE-WENN: Solving class imbalance and small sample problems by oversampling and distance scaling. Appl. Intell. 2021, 51, 1394–1409.
16. Fernandez, A.; Garcia, S.; del Jesus, M.J.; Herrera, F. A study of the behaviour of linguistic fuzzy rule based classification systems in the framework of imbalanced data-sets. Fuzzy Sets Syst. 2008, 159, 2378–2398.
17. Garcia, S.; Parmisano, A.; Erquiaga, M.J. IoT-23: A Labeled Dataset with Malicious and Benign IoT Network Traffic (Version 1.0.0) [Dataset]. Zenodo. 2020. Available online: https://zenodo.org/records/4743746 (accessed on 27 November 2025).
18. Koroniotis, N.; Moustafa, N.; Sitnikova, E.; Turnbull, B. Towards the development of realistic botnet dataset in the internet of things for network forensic analytics: Bot-iot dataset. Future Gener. Comput. Syst. 2019, 100, 779–796.
19. Mateen, M. Air Quality and Pollution Assessment. Kaggle. [Dataset]. 2024. Available online: https://www.kaggle.com/datasets/mujtabamatin/air-quality-and-pollution-assessment (accessed on 27 November 2025).
20. He, H.; Garcia, E.A. Learning from imbalanced data. IEEE Trans. Knowl. Data Eng. 2009, 21, 1263–1284.
21. Chowdhury, R.H.; Hossain, Q.D.; Ahmad, M. Automated method for uterine contraction extraction and classification of term vs pre-term EHG signals. IEEE Access 2024, 12, 49363–49375.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
