Article

An Ensemble Framework to Forest Optimization Based Reduct Searching

School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212100, China
*
Author to whom correspondence should be addressed.
Symmetry 2022, 14(6), 1277; https://doi.org/10.3390/sym14061277
Submission received: 2 June 2022 / Revised: 15 June 2022 / Accepted: 18 June 2022 / Published: 20 June 2022
(This article belongs to the Special Issue Recent Advances in Granular Computing for Intelligent Data Analysis)

Abstract

Essentially, the solution to an attribute reduction problem can be viewed as a reduct searching process. Among the various searching strategies currently in use, meta-heuristic searching has received extensive attention. As an emerging meta-heuristic approach, the forest optimization algorithm (FOA) is introduced to the problem solving of attribute reduction in this study. To further improve the classification performance of the attributes selected in the reduct, an ensemble framework is also developed: firstly, multiple reducts are obtained by FOA and data perturbation, and the structure of those multiple reducts is symmetrical, which indicates that no order exists among them; secondly, the multiple reducts are used to execute voting classification over testing samples. Finally, comprehensive experiments on 20 UCI datasets clearly validate the effectiveness of our framework: it not only yields reducts with superior classification accuracies and classification stabilities but is also suitable for pre-processing data with noise. This improvement enables FOA to bring greater benefits to data processing in the life, health, medical and other fields.

1. Introduction

In the era of big data, with the rapid growth in the amount of data, high dimensionality has become a representative characteristic of data. Nevertheless, it is well known that not all features in data are useful for providing valuable learning ability [1,2,3]. That is why feature selection has been validated to be one of the crucial data pre-processing techniques in the fields of machine learning, knowledge discovery and so on.
Attribute reduction, from the perspective of rough set, is one state-of-the-art feature selection technique [4,5,6]. One of the important advantages of attribute reduction is that it possesses rich semantic explanations based on various measures related to rough set. For instance, a reduct can be regarded as one minimal subset of attributes that satisfies the pre-defined constraint constructed by using measures such as dependency, conditional entropy, conditional discrimination index and so on [7,8].
Without a loss in generality, the problem solving of attribute reduction can be considered as a searching optimization procedure. Up to now, with respect to different requirements, many approaches have been reported [9].
A careful review of the previous literature shows that most searching methods can be categorized into two groups: exhaustive searching [10] and heuristic searching [11,12].
  • Exhaustive searching. The fundamental advantage of exhaustive searching is that it can seek out all qualified reducts. Nevertheless, its apparent limitation is unacceptable complexity. For example, the discernibility matrix [13,14] and backtracking [15] are two representative mechanisms of exhaustive searching. Although some simplification and pruning schemes have been proposed to improve the efficiency of exhaustive searching, it still faces a big challenge in dimension reduction for large-scale data.
  • Heuristic searching. Different from exhaustive searching, heuristic searching acquires one and only one reduct per round of execution. The dominating superiority of heuristic searching is its low complexity, owing to the guidance of heuristic information throughout the searching process. Take forward greedy searching [16,17,18] as an example: the variation of the measures used in the definition of attribute reduction serves as the heuristic information, and such a variation can be employed to identify the importance of candidate attributes.
Since heuristic searching is superior to exhaustive searching in most cases, the former has been widely accepted by many researchers. Among various heuristic searches, meta-heuristic searching is especially popular [19,20]. Meta-heuristic searching mainly draws on behaviors observed in the natural world; the inherent computational intelligence mechanism is useful for seeking out optimal solutions to complex optimization problems. Different from forward greedy searching, meta-heuristic searching combines both random searching and local searching strategies [21,22]; the global optimal solution can then be gradually approached over the whole process of searching. However, it should also be emphasized that the random factor in meta-heuristic searching may involve the following limitations.
  • The random factor may result in poor adaptability of the reduct. This is mainly because each iteration of identifying qualified attributes is equipped with strong randomness. As reported by Li et al., a feature selection algorithm without a guarantee of stability usually leads to significantly different feature subsets if the data varies [23]. Such a case implies that the classification results based on the selected features will shake the confidence of domain experts.
  • The random factor may generate an ineffective reduct when high-dimensional data is faced. This is mainly because, for hundreds of attributes in data, a large number of possible reducts exist; the random factor may then identify a reduct without preferable learning ability [24,25].
To fill the gaps mentioned above, a new strategy for performing meta-heuristic searching has become necessary. First, in this study, the forest optimization algorithm (FOA) [26,27] is selected as the problem-solving approach for attribute reduction. Secondly, to further improve the effectiveness of the reducts derived by FOA, an ensemble framework is developed, which aims to generate multiple reducts. It should be emphasized that the structure of those multiple reducts is symmetrical, which indicates that no order exists among them. The advantage of such a symmetric structure is that our approach can easily be performed over parallel platforms. The multiple obtained reducts are then used to execute voting classification over testing samples, which is the key to achieving the objective of improving effectiveness [28,29]. The main contributions of our study are elaborated as follows.
  • It is the first attempt to introduce forest optimization-based searching into the problem solving of attribute reduction. Different from the forward greedy searching widely used in the literature, this research is useful to further push forward the application of meta-heuristic searching in data pre-processing.
  • The ensemble framework proposed in this research is a general form; though it is combined with FOA in this article, it can also be embedded into other searching strategies. From this point of view, the discussed framework possesses the advantage of being plug-and-play.
The remainder of this paper is organized as follows. In Section 2, we briefly review the notions related to our study. Attribute reduction via FOA and the proposed ensemble framework are addressed in Section 3. Comparative experiments were conducted on 20 classical datasets from the UCI repository; the experimental results are shown and analyzed in Section 4. Finally, the article is summarized in Section 5 and the future work plan is given.

2. Preliminaries

2.1. Granular Computing and Rough Set

Thus far, the principle of dividing complex data/information into several minor blocks to perform problem solving has been widely accepted in the era of big data. This thinking is referred to as information granulation in the framework of Granular Computing (GrC) [30,31]. As the fundamental operational element in GrC, the concept of the information granule has been thoroughly investigated [32,33,34]. The degree that characterizes the coarser or finer structure of information granules, referred to as granularity, can also be revealed.
Presently, it has been widely accepted that rough set is a general mathematical tool for carrying out GrC. The reason can be attributed to the following two facts:
  • the fundamental characteristic of rough set is to approximate the objective target by using information granules;
  • different structures of information granules imply different values of granularity, and then the results of rough approximations may vary.
In rough set theory, a data or decision system [35] can be denoted by $\mathrm{DS} = \langle U, AT, d \rangle$, in which $U = \{x_1, x_2, \ldots, x_n\}$ is a nonempty finite set of samples, $AT$ is a set of condition attributes, and $d$ is the decision that records the labels of samples, i.e., $\forall x_i \in U$, $d(x_i)$ is the label of sample $x_i$. Given a decision system $\mathrm{DS}$, to derive the result of information granulation over $U$, various binary relations have been developed with respect to different requirements. Two representative forms are illustrated as follows.
  • To deal with categorical data, the equivalence relation or the so-called indiscernibility relation proposed by Pawlak can be used. For example, if the classification task is considered, then by decision $d$, the corresponding equivalence relation is $\mathrm{IND}_d = \{(x_i, x_j) \in U^2 \mid d(x_i) = d(x_j)\}$. Therefore, a partition over $U$ is derived such that $U/\mathrm{IND}_d = \{X_1, X_2, \ldots, X_m\}$. $\forall X \in U/\mathrm{IND}_d$, $X$ is actually a collection of samples that possess the same label.
  • To handle continuous or non-structured data, parameterized binary relations can be employed. For instance, Hu et al. [36] have presented a neighborhood relation such as $N_A^{\delta} = \{(x_i, x_j) \in U^2 : \Delta_A(x_i, x_j) \le \delta\}$, in which $\Delta_A(x_i, x_j)$ is the distance between samples $x_i$ and $x_j$ over $A$ and $\delta$ is the radius (one form of parameter). Following $N_A^{\delta}$, each sample in $U$ can induce a related neighborhood, which is a parameterized form of an information granule [37,38].
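To make these two granulation schemes concrete, the following is a minimal NumPy sketch (an illustrative assumption rather than part of the original paper): `equivalence_classes` builds the partition induced by $\mathrm{IND}_d$, and `neighborhood` computes $N_A^{\delta}(x_i)$ with the Euclidean distance standing in for $\Delta_A$.

```python
import numpy as np

def equivalence_classes(y):
    """Partition U by IND_d: group sample indices that share the same label."""
    return {label: np.flatnonzero(y == label) for label in np.unique(y)}

def neighborhood(X, i, attrs, delta):
    """N_A^delta(x_i): indices of samples within distance delta of x_i over attrs."""
    dist = np.linalg.norm(X[:, attrs] - X[i, attrs], axis=1)  # Delta_A(x_i, .)
    return np.flatnonzero(dist <= delta)

# toy usage: 5 samples, 3 condition attributes, binary decision
X = np.random.rand(5, 3)
y = np.array([0, 0, 1, 1, 0])
print(equivalence_classes(y))
print(neighborhood(X, 0, attrs=[0, 1, 2], delta=0.4))
```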

2.2. Attribute Selection

Note that since continuous data is more popular than categorical data in real-world applications, we mainly focus on parameterized binary relations in the context of this study. Furthermore, it has been pointed out that different parameter values lead to different results of information granulation, and then the related values of granularity may also differ. For example, Zhang et al. fused multi-granularity decision-theoretic rough sets with hesitant fuzzy linguistic group decision-making and expanded the application of multi-granularity three-way decision-making in information analysis and information fusion by introducing an adjustable parameter of expected risk preference [39].
From the discussions above, it is interesting to explore one of the crucial topics in the field of rough set based on the parameter; that is, attribute reduction. Many researchers have performed outstanding work in this field. Xu et al. proposed assignment reduction and approximate reduction for inconsistent ordered information systems to enhance the effectiveness of attribute reduction in complex information systems [40]. Chen et al. proposed a granular-ball-based solution in the process of information granule attribute reduction, which improved the effectiveness of searching attributes [41]. We first present the following general definition of attribute reduction.
Definition 1.
 Given a decision system $\mathrm{DS}$ and a parameter $\omega$, assuming that the $\rho_\omega$-constraint is a constraint related to a specific measure $\rho$ and parameter $\omega$, $A \subseteq AT$ is referred to as a $\rho_\omega$-reduct if and only if:
1.
$A$ satisfies the $\rho_\omega$-constraint;
2.
$\forall B \subset A$, $B$ does not satisfy the $\rho_\omega$-constraint.
Without a loss of generality, in the scenario of rough set, the value of measure $\rho$ is closely related to the given parameter $\omega$. Therefore, the semantic explanation of a $\rho_\omega$-reduct is a minimal subset of raw condition attributes that satisfies the pre-defined constraint [42,43].
In most state-of-the-art studies about attribute reduction, the value of measure ρ may be equipped with the following two cases.
  • If a measure is positive-preferred, that is, the measure value is expected to be as high as possible (e.g., the measures called approximation quality and classification accuracy), then the $\rho_\omega$-constraint is usually expressed as “$\rho_\omega(A) \ge \rho_\omega(AT)$”, where $\rho_\omega(A)$ is the value of the measure derived based on condition attributes $A$ and parameter $\omega$.
  • If a measure is negative-preferred, that is, the measure value is expected to be as low as possible (e.g., the measures called conditional entropy and classification error rate), then the $\rho_\omega$-constraint is usually expressed as “$\rho_\omega(A) \le \rho_\omega(AT)$”.
Following Definition 1, how to select qualified attributes and construct the required reduct becomes a problem worth exploring. As a classic algorithm of the heuristic strategy, forward greedy searching has received extensive attention due to its low complexity and high efficiency. The key to such searching is to select the most significant attribute in each iteration. The detailed algorithm is shown as follows.
The qualified attribute is obtained in Step 5 of Algorithm 1, and this selected attribute is closely related to the measure used in the algorithm. In detail, if a positive-preferred measure is used in the attribute searching process, then the attribute $b$ with the higher measure value is qualified, i.e., $b = \arg\max\{\rho_\omega(A \cup \{a\}) : a \in AT \setminus A\}$; conversely, if a negative-preferred measure is used, the lower measure value should be considered to qualify attribute $b$, that is, $b = \arg\min\{\rho_\omega(A \cup \{a\}) : a \in AT \setminus A\}$.
Algorithm 1: Forward greedy searching to select attributes.
When the number of samples is $|U|$ and the number of condition attributes is $|AT|$, the time complexity of Algorithm 1 is $O(|U|^2 \times |AT|^2)$.
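The pseudocode of Algorithm 1 is reproduced only as an image in the published version, so the following is a hedged Python sketch of forward greedy attribute selection as described above; the callable `measure` (e.g., approximation quality for a positive-preferred measure) and all names are illustrative assumptions rather than the authors' implementation.

```python
def forward_greedy_reduct(all_attrs, measure, higher_is_better=True):
    """Forward greedy search: repeatedly add the most significant attribute
    until the rho_omega-constraint (Definition 1) is met or all attributes are used."""
    target = measure(list(all_attrs))                         # rho_omega(AT)
    satisfied = (lambda v: v >= target) if higher_is_better else (lambda v: v <= target)
    pick = max if higher_is_better else min
    A = []
    while len(A) < len(all_attrs):
        candidates = [a for a in all_attrs if a not in A]
        b = pick(candidates, key=lambda a: measure(A + [a]))  # argmax/argmin of rho_omega(A U {a})
        A.append(b)
        if satisfied(measure(A)):                             # constraint reached: stop
            break
    return A
```

The nested scan over the remaining candidates in each iteration is what drives the quadratic dependence on $|AT|$ in the complexity stated above.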

3. FOA and the Proposed Framework

3.1. FOA

The forest optimization algorithm (FOA) is an evolutionary algorithm developed by Ghaemi in 2014 [26]. Such an algorithm is inspired by the phenomenon that a few trees in a forest can subsist for a long time while other trees can only survive for a limited time. Presently, FOA has been introduced into the problem solving of feature selection. Note that attribute reduction can be regarded as a rough set-based feature selection tool; it is then an interesting topic to perform the searching of the qualified attributes required in the reduct by the principles of FOA.
Generally, in FOA, each tree represents a possible solution, i.e., a subset of condition attributes. To simplify our discussion, such a tree can be denoted by a vector of length $1 + |AT|$; that is, the vector consists of $1 + |AT|$ variables. The first variable in such a vector is the “Age” of the tree, and the remaining variables denote the existence/nonexistence of attributes. For example, the value “1” in a vector indicates that the corresponding attribute is involved in the subsequent iteration, while the value “0” implies that the corresponding attribute is removed for the subsequent iteration.
Following classical FOA, five main steps are used and elaborated as follows.
  • Initializing trees. The variable “Age” of each tree is set to be ‘0’. Further, other variables of each tree are initialized randomly with either “0” or “1”. The following stage, called local seeding, will increase the values of “Age” of all trees except newly generated trees.
  • Local seeding. For each tree with “Age” 0 in the forest, some variables are selected randomly (the “LSC” parameter determines the number of selected variables). Such a tree is split into “LSC” new trees, and for each split tree the value of one distinguished variable is changed from 0 to 1 or vice versa. Figure 1 shows an example of the local seeding operator on one tree. In this example, $|AT| = 5$ and the value of “LSC” is set to 2.
  • Population limiting. In this stage, to form the candidate population, the following two types of trees will be removed from the forest: (1) the “Age” of a tree is bigger than a parameter called “life time”; (2) the extra trees that exceed a parameter called “area limit” by sorting trees via their fitness values.
  • Global seeding. For trees in the candidate population, the number of variables to be changed is determined by a parameter named “GSC”, and these variables are selected at random. Immediately, the values of those selected variables are changed (from 0 to 1 or vice versa). An example of performing global seeding is shown in Figure 2. In this example, the value of “GSC” is set to 3.
  • Updating the best tree. In this stage, by sorting the trees in the candidate population based on their fitness values, the tree with the greatest fitness value is identified as the best one, and its “Age” is set to “0”. These stages are performed iteratively until the $\rho_\omega$-constraint in attribute reduction is satisfied.
Algorithm 2 illustrates a detailed process to select attributes and then construct a reduct by using FOA.
Algorithm 2: FOA to select attributes.
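The pseudocode of Algorithm 2 also appears only as an image in the published version, so the following is a compressed sketch of the FOA stages listed above (initialization, local seeding, population limiting, global seeding and best-tree update). The parameter names LSC, GSC, life time and area limit follow the text; the fitness function, the $\rho_\omega$-constraint check and every identifier here are illustrative assumptions, and the stage details are simplified relative to the original algorithm.

```python
import random

def foa_select(n_attrs, fitness, constraint_met, lsc=2, gsc=3,
               life_time=5, area_limit=30, max_iter=100):
    """Forest-optimization-style search over 0/1 attribute masks.
    Each tree is (age, mask); fitness(mask) scores a candidate attribute subset
    (it should also handle the all-zero mask) and constraint_met(mask) tests
    the rho_omega-constraint of Definition 1."""
    trees = [(0, [random.randint(0, 1) for _ in range(n_attrs)])
             for _ in range(area_limit)]
    best = max(trees, key=lambda t: fitness(t[1]))
    for _ in range(max_iter):
        # local seeding: trees of age 0 spawn LSC children, each with one flipped bit
        children = []
        for age, mask in trees:
            if age == 0:
                for j in random.sample(range(n_attrs), min(lsc, n_attrs)):
                    child = mask[:]
                    child[j] ^= 1
                    children.append((0, child))
        trees = [(age + 1, mask) for age, mask in trees] + children
        # population limiting: drop over-aged trees, keep the `area_limit` fittest
        trees = [t for t in trees if t[0] <= life_time]
        trees.sort(key=lambda t: fitness(t[1]), reverse=True)
        removed, trees = trees[area_limit:], trees[:area_limit]
        # global seeding: a portion of the removed trees re-enter with GSC flipped bits
        for _, mask in removed[:max(1, len(removed) // 10)]:
            child = mask[:]
            for j in random.sample(range(n_attrs), min(gsc, n_attrs)):
                child[j] ^= 1
            trees.append((0, child))
        # update the best tree and reset its age to 0
        top = max(trees, key=lambda t: fitness(t[1]))
        if fitness(top[1]) > fitness(best[1]):
            best = (0, top[1])
        if constraint_met(best[1]):
            break
    return [i for i, bit in enumerate(best[1]) if bit == 1]
```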

3.2. Ensemble FOA

By carefully reviewing the process shown in Algorithm 2, we observe that some specific steps involve random characteristics. For this reason, it may be argued that the reduct derived by FOA is unstable. In other words, two different reducts may be obtained even though the same searching procedure is executed twice over the same data.
Furthermore, following the research reported by Yang et al. [44], an unstable reduct may fail to provide robust learning, i.e., the stability of the classification results may be far from what we expect. Such a study has indicated that not only the learning accuracy but also the learning stability should receive considerable attention.
To fill the gaps mentioned above, a new approach to carrying out FOA has become necessary. That is why an ensemble strategy is developed to further improve the performance of FOA in the problem solving of attribute reduction. Formally, the details of our ensemble strategy are elaborated as follows. First, the searching process of FOA is executed $N$ times; it follows that $N$ reducts may be derived, such as $A_1, A_2, \ldots, A_N$. Secondly, each testing sample is comprehensively predicted by those derived reducts. Finally, voting is used to determine the final prediction of the testing sample.
Nevertheless, it is not difficult to observe that two main limitations may emerge for the above ensemble strategy. First, though different reducts can be obtained, the diversity over those derived reducts is still unsatisfactory. Secondly, the searching efficiency is a big problem because the reduct must be computed $N$ times. Considering such limitations, a data perturbation mechanism is introduced into the whole process [45,46]. Suppose there are $|AT|$ attributes in the given raw training data set. Our data perturbation is performed by randomly identifying $\lambda\%$ ($0 \le \lambda \le 100$) of the raw attributes, i.e., $\lambda\% \cdot |AT|$ attributes; the reduct is then generated from those attributes and the universe $U$. The first advantage of such data perturbation is that the expected diversity of the reducts can be induced, because different subsets of attributes are used to calculate the reducts. Moreover, since only a subset of the attributes is employed in each round of computing a reduct, the time consumption can be reduced.
From the discussions above, a detailed framework of our proposed ensemble FOA is illustrated in Figure 3.
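As a rough sketch of the ensemble strategy in Figure 3 (assumed names; the reduct searcher, the base classifier and the data layout are placeholders rather than the paper's implementation): each round keeps a random $\lambda\%$ of the raw attributes, searches one reduct over them, and the final label of each testing sample is decided by majority voting.

```python
import random
from collections import Counter
import numpy as np

def ensemble_reducts(attrs, search_reduct, n_rounds=10, keep_ratio=0.6):
    """Derive N reducts, each searched over a random lambda% subset of the raw attributes."""
    reducts = []
    for _ in range(n_rounds):
        kept = random.sample(list(attrs), max(1, int(keep_ratio * len(attrs))))
        reducts.append(search_reduct(kept))   # e.g. an FOA-style search restricted to `kept`
    return reducts                            # symmetrical: no order among the reducts

def vote_predict(reducts, fit_predict, X_train, y_train, X_test):
    """Train one base classifier per reduct and majority-vote the per-sample predictions."""
    all_preds = [fit_predict(X_train[:, r], y_train, X_test[:, r]) for r in reducts]
    return np.array([Counter(votes).most_common(1)[0][0] for votes in zip(*all_preds)])
```

Because the $N$ rounds are mutually independent, the loop parallelizes trivially, which matches the symmetric (orderless) structure of the reducts emphasized above.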

4. Experimental Analysis

4.1. Data Sets

To verify the superiority of the proposed framework, experiments are conducted over 20 real-world datasets from the UCI Machine Learning Repository. Table 1 summarizes some details of these datasets. Additionally, the values in all datasets have been normalized by column. In addition, besides using the raw datasets shown in Table 1, we also conducted comparative experiments with injected label noise. The specific operation is as follows: if the given label noise ratio is $\omega\%$ ($0 \le \omega \le 100$), then we randomly select $\omega\% \cdot |U|$ samples in the raw dataset and perform noise injection by replacing the labels of these selected samples with other labels. The purpose of this is to further test the robustness of our method.
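A small sketch of the label-noise injection just described, assuming a NumPy label vector; the function name and the seeding are illustrative.

```python
import numpy as np

def inject_label_noise(y, ratio, seed=0):
    """Randomly relabel ratio*|U| samples with a label different from their original one."""
    rng = np.random.default_rng(seed)
    y_noisy = y.copy()
    labels = np.unique(y)
    flipped = rng.choice(len(y), size=int(ratio * len(y)), replace=False)
    for i in flipped:
        y_noisy[i] = rng.choice(labels[labels != y[i]])  # any label other than the original
    return y_noisy

# e.g. a 20% noise ratio
y = np.array([0, 1, 0, 2, 1, 0, 2, 1, 0, 1])
print(inject_label_noise(y, ratio=0.2))
```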

4.2. Experimental Setup and Configuration

All experiments were performed on a Windows 10 system personal computer configured with an Intel Core i7-6700HQ CPU (2.60 GHz) and 16.00 GB of memory. The programming language is Matlab (MathWorks Inc., Natick, MA, USA), and the integrated development environment version used is R2019b.
In the following experiments, the neighborhood rough set is used to perform our framework; when multiple different radii are used, neighborhood rough sets can form multi-grained structures. Therefore, we specify a set of 20 radii in ascending order, $R = \{\delta_1 = 0.02, \delta_2 = 0.04, \ldots, \delta_{20} = 0.40\}$. Furthermore, the experiments use 10-fold cross-validation; that is, in each run, 90% of the samples of the dataset are used to solve the attribute reduction, and the remaining samples are used to test the classification performance of the reduct.
Note that two approaches have been tested in our experiment: the primitive forest optimization algorithm and the ensemble forest optimization algorithm, denoted as PF and EF, respectively. Furthermore, to comprehensively compare our approaches, forward greedy searching (FG) has also been tested. KNN and CART are simple and mature classifiers that have been widely accepted in various learning tasks; therefore, we used the KNN and CART classifiers in the experiments to measure the classification ability of the reducts obtained by the above three methods. These three methods are compared based on the following four measures.
  • Approximation Quality (AQ) [47]: reflects the uncertainty of the sample space characterized by the information granules derived from an attribute subset. Given a radius $\delta_m \in R$, the approximation quality related to $d$ over $A \subseteq AT$ can be denoted by $AQ_{\delta_m}(A)$:
    $AQ_{\delta_m}(A) = \dfrac{\left|\{x \in U : N_A^{\delta_m}(x) \subseteq [x]_d\}\right|}{|U|},$
    where $N_A^{\delta_m}(x) = \{y \in U : \Delta_A(x, y) \le \delta_m\}$ is the neighborhood of sample $x$ related to $A$ in terms of $\delta_m$, $[x]_d = \{y \in U : d(x) = d(y)\}$ is the decision class of $x$, and $|\cdot|$ is the cardinality of a set.
    The higher the value of $AQ_{\delta_m}(A)$, the better the performance of the condition attribute subset $A$. From this point of view, the constraint is set to “$AQ_{\delta_m}(A) \ge AQ_{\delta_m}(AT)$” in Algorithms 1 and 2 for deriving the Approximation Quality Reduct (AQR).
  • Conditional Entropy (CE) [48]: reflects the uncertainty of the information granules extracted from the attribute subset. Given a radius $\delta_m \in R$, the conditional entropy related to $d$ over $A \subseteq AT$ can be denoted by $CE_{\delta_m}(A)$:
    $CE_{\delta_m}(A) = -\dfrac{1}{|U|}\sum_{x \in U} \left|N_A^{\delta_m}(x) \cap [x]_d\right| \log \dfrac{\left|N_A^{\delta_m}(x) \cap [x]_d\right|}{\left|N_A^{\delta_m}(x)\right|}.$
    The lower the value of $CE_{\delta_m}(A)$, the better the performance of the condition attribute subset $A$. From this point of view, the constraint is set to “$CE_{\delta_m}(A) \le CE_{\delta_m}(AT)$” in Algorithms 1 and 2 for deriving the Conditional Entropy Reduct (CER).
  • Regularization Loss (RL) [49]: the regularizer is useful to robustly evaluate the significance of candidate attributes and then reasonably identify a valuable attribute. Given a radius $\delta_m \in R$, the regularization loss related to $d$ over $A \subseteq AT$ can be denoted by $RL_{\delta_m}(A)$:
    $RL_{\delta_m}(A) = \left(1 - AQ_{\delta_m}(A)\right) + \alpha \cdot G(A),$
    in which $\alpha$ is a hyper-parameter that balances the loss of approximation quality and the regularizer $G(A)$. In the context of this paper, the regularizer $G(A)$ is defined as a specific form of granularity [50] such that $G(A) = -\sum_{x \in U} \dfrac{|N_A^{\delta_m}(x)|}{|U|^2} \log \dfrac{|N_A^{\delta_m}(x)|}{|U|^2}$.
    The lower the value of $RL_{\delta_m}(A)$, the better the performance of the condition attribute subset $A$. From this point of view, the constraint is set to “$RL_{\delta_m}(A) \le RL_{\delta_m}(AT)$” in Algorithms 1 and 2 for deriving the Regularization Loss Reduct (RLR).
  • Neighborhood Discrimination Index (NDI) [51]: this measure reflects the discrimination ability of attribute sets for different decision classes. Given a radius $\delta_m \in R$, the neighborhood discrimination index related to $d$ over $A \subseteq AT$ can be denoted by $NDI_{\delta_m}(A)$:
    $NDI_{\delta_m}(A) = \log \dfrac{\left|N_A^{\delta_m}\right|}{\left|N_A^{\delta_m} \cap \mathrm{IND}_d\right|},$
    where $N_A^{\delta_m} = \{(x_i, x_j) \in U^2 : \Delta_A(x_i, x_j) \le \delta_m\}$ denotes the neighborhood relation over $U$.
    The lower the value of $NDI_{\delta_m}(A)$, the better the performance of the condition attribute subset $A$. From this point of view, the constraint is set to “$NDI_{\delta_m}(A) \le NDI_{\delta_m}(AT)$” in Algorithms 1 and 2 for deriving the Neighborhood Discrimination Index Reduct (NDIR). A computational sketch of these four measures is given right after this list.
With these measures, four groups of comparative experiments are carried out on the three algorithms (the primitive forest optimization algorithm, the ensemble forest optimization algorithm and the forward greedy algorithm): PF-AQR, EF-AQR and FG-AQR form a group of comparisons based on the measure of approximation quality; PF-CER, EF-CER and FG-CER form a group based on conditional entropy; PF-RLR, EF-RLR and FG-RLR form a group based on regularization loss; and PF-NDIR, EF-NDIR and FG-NDIR form a group based on the neighborhood discrimination index.
Moreover, three types of results will be reported: (1) the classification accuracy, (2) the classification stability and (3) the AUC.
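As a usage sketch of the evaluation protocol above, the snippet below scores a derived reduct with the KNN and CART classifiers under 10-fold cross-validation. It uses scikit-learn for brevity, whereas the experiments reported in the paper were implemented in MATLAB R2019b; the function and parameter choices (e.g., k = 3 for KNN) are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

def score_reduct(X, y, reduct, n_folds=10):
    """Evaluate a reduct (list of attribute indices) with KNN and CART via 10-fold CV."""
    Xr = X[:, reduct]                                  # keep only the selected attributes
    knn = cross_val_score(KNeighborsClassifier(n_neighbors=3), Xr, y, cv=n_folds)
    cart = cross_val_score(DecisionTreeClassifier(random_state=0), Xr, y, cv=n_folds)
    return {"KNN": knn.mean(), "CART": cart.mean()}

# toy usage with random data; real runs would load a normalized UCI dataset
X = np.random.rand(100, 12)
y = np.random.randint(0, 2, size=100)
print(score_reduct(X, y, reduct=[0, 3, 7]))
```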

4.3. Comparisons among Classification Accuracies of the Derived Reducts

In this subsection, the classification accuracies under three methods based on four measures will be shown. These results are shown for the raw dataset and the dataset with 10%, 20%, 30% and 40% different label noise ratios. Note that CART and KNN classifiers are employed to evaluate the classification performance of the derived reducts.

4.3.1. Classification Accuracies (Raw Data)

Table 2 and Table 3 below show the classification accuracies of the raw dataset with different classifiers.
With a deep investigation of Table 2 and Table 3, it is not difficult to observe that, regardless of the classifier used, the classification accuracies associated with our method dominate those of the compared methods on most datasets. Take the data “Statlog (Heart) (ID = 13)” as an example: with the KNN classifier, the classification accuracies related to our method are 0.8457, 0.8407, 0.8419 and 0.8426, respectively, while the classification accuracies of the other comparison methods are no more than 0.8.

4.3.2. Classification Accuracies (10%, 20%, 30% and 40% Label Noise)

The classification accuracies in terms of four different noise ratios are reported in Table 4, Table 5, Table 6, Table 7, Table 8, Table 9, Table 10 and Table 11.
From the above results, it is not difficult to observe that the performance of both the CART and KNN classifiers decreases as the ratio of label noise increases. Taking “Connectionist Bench (Vowel Recognition—Deterding) (ID = 2)” as an example, with the CART classifier and approximation quality used to define the constraint of attribute reduction, if the noise ratio increases from 10% to 40%, the values of EF-AQR in Tables 5, 7, 9 and 11 are 0.7764, 0.7419, 0.6925 and 0.6234. Obviously, the classification accuracies are significantly reduced. Such observations suggest that more label noise does have a negative impact on the classification performance of the attributes selected in the reducts.
At the same time, when the label noise ratios are 20% and 30%, the results in Tables 6 and 8 show that, when using the KNN classifier, our method has a slight disadvantage in classification accuracy compared with the other methods on a small number of datasets. This indicates that our method is better suited to the CART classifier under 20% and 30% label noise ratios.
Note that for all four ratios of label noise we tested, the classification accuracies associated with our method are also superior to the results associated with the compared methods. Such observations also imply that our proposed strategy is more robust on noisy data.

4.4. Comparisons among Classification Stabilities of the Derived Reducts

In this subsection, we will show the classification stability values related to three different methods under four metrics. These comparative experiments are not only performed on the raw dataset but also performed on the dataset with ratios of label noises of 10%, 20%, 30% and 40%. The CART and KNN classifiers are used to test the performance of all algorithms.

4.4.1. Classification Stabilities (Raw Data)

Table 12 and Table 13 report different classification stabilities obtained over the raw dataset.
By carefully observing the data in the tables, it is not difficult to find that, no matter which classifier is used, our method is more stable than the other methods on most datasets. Take “LSVT Voice Rehabilitation (ID = 8)” as an example: using the CART classifier, the classification stabilities related to EF under the four measures are 0.8340, 0.8488, 0.8372 and 0.8488, respectively, while the classification stabilities related to the other methods are lower under the same measures.
Of course, by comparing Table 12 and Table 13, we also found that, on the raw datasets, when the KNN classifier is used, deriving the reduct with the neighborhood discrimination index as the measure yields the best classification stability. When the other three measures are used, the classification stability of our method is slightly inferior to that of the other two methods on the datasets with IDs 3, 5, 14 and 15. This also shows that our method is better suited to the CART classifier; when using the KNN classifier, we recommend the neighborhood discrimination index as the measure for attribute reduction.

4.4.2. Classification Stabilities (10%, 20%, 30% and 40% Label Noise)

Table 14, Table 15, Table 16, Table 17, Table 18, Table 19, Table 20 and Table 21 report different classification stabilities obtained on datasets with different noise ratios, on which the stabilities are tested for KNN and CART classifiers.
A detailed investigation of Tables 14–21 shows that, on all experimental datasets, whether the CART or KNN classifier is used, the stability decreases as the label noise increases. Taking “Statlog (Image Segmentation) (ID = 14)” as an example, based on the CART classifier, in Table 15, where the noise ratio is only 10%, the classification stabilities related to EF-AQR, EF-CER, EF-NDIR and EF-RLR are 0.9599, 0.9474, 0.9449 and 0.9441, respectively; these values are reduced to 0.7553, 0.7502, 0.7597 and 0.7495 in Table 21, where the noise ratio is 40%. Such observations suggest that more label noise does have a negative impact on the classification stabilities of the attributes selected in the reducts.
It is also easy to see that, no matter what proportion of noise is injected into the raw dataset, the classification stabilities of our method are higher than those of the other two comparison methods. The results show that our method is more robust in a noisy data environment.

4.5. Comparisons among AUC of the Derived Reducts

The receiver operating characteristic (ROC) curve has been used in machine learning to describe the trade-off between the hit rate and the false alarm rate of classifiers. Because it is a two-dimensional description, it is not easy to use directly to evaluate the performance of a classifier, so the area under the ROC curve (AUC) is usually used in experiments. Since the ROC curve lies within the unit square, the AUC value is between 0 and 1.0 [52]. In this subsection, the AUC values of the three different approaches to deriving reducts under the four measures will be compared. These are obtained on the raw datasets and on the datasets with label noise ratios of 10%, 20%, 30% and 40%.

4.5.1. AUC (Raw Data)

Table 22 and Table 23 below show the AUC of the raw dataset with different classifiers.
A thorough investigation of Table 22 and Table 23 shows that, no matter which measure our framework uses as the constraint for terminating the search, the reducts derived by our method perform better under both classifiers. Take “Urban Land Cover (ID = 18)” as an example: the AUC values of EF-CER based on the KNN and CART classifiers are 0.9227 and 0.9188, respectively, and under the same classifier the AUC value of EF-CER is higher than those of PF-CER and FG-CER.

4.5.2. AUC (10%, 20%, 30% and 40% Label Noise)

Table 24, Table 25, Table 26, Table 27, Table 28, Table 29, Table 30 and Table 31 report the different AUCs obtained on datasets with different noise ratios, which were tested on KNN and CART classifiers.
From Tables 24–31, it is not difficult to draw the following conclusion: whether the CART or KNN classifier is used, the AUC value decreases as the label noise increases. Take “Wine_Nor (ID = 20)” as an example: for the KNN classifier, with regularization loss used to define the constraint of attribute reduction, if the noise ratio increases from 10% to 40%, the values of EF-RLR are 0.9584, 0.9358, 0.9180 and 0.8790. Obviously, the AUC is significantly reduced. Such observations suggest that more label noise does have a negative impact on the AUC.
Note that no matter which ratio of label noise is injected into the data, the AUC values provided by the EF method are higher than those of the other methods on most datasets. Therefore, our ensemble FOA approach can not only improve the stability of the reduct but also bring better classification performance. It also shows that our strategy is more robust on noisy data. However, it is undeniable that our method has a slight disadvantage in AUC value on a small number of datasets, which also provides a direction for our future research.

5. Conclusions and Future Plans

This research is inspired by introducing the forest optimization algorithm into the problem solving of attribute reduction. To further improve the effectiveness of the attributes selected by the forest optimization algorithm, an ensemble framework is developed, which performs ensemble classification based on multiple derived reducts. Our comparative experiments have clearly demonstrated that our framework outperforms the other popular algorithms for four widely used measures in rough set. In the fields of medicine and health, large amounts of data are used for predictions about drugs and diseases; our improvements can bring good benefits to such applications in the future. The following topics deserve further study.
  • Our framework can be introduced into other rough set models to perform on complex data under different scenarios, e.g., semi-supervised or unsupervised data.
  • Trying to combine our framework with some other effective feature selection techniques is also a challenge to data pre-processing.

Author Contributions

Conceptualization, X.Y.; methodology, X.Y.; software, Y.L.; validation, X.Y.; formal analysis, Y.L.; investigation, J.C.; resources, J.C.; data curation, J.W.; writing—original draft preparation, J.W.; writing—review and editing, J.W.; visualization, J.W.; supervision, J.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China (Nos. 62076111, 62176107).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Gheyas, I.A.; Smith, L.S. Feature Subset Selection in Large Dimensionality Domains. Pattern Recognit. 2010, 43, 5–13. [Google Scholar] [CrossRef] [Green Version]
  2. Hosseini, E.S.; Moattar, M.H. Evolutionary Feature Subsets Selection Based on Interaction Information for High Dimensional Imbalanced Data Classification. Appl. Soft Comput. 2019, 82, 105581. [Google Scholar] [CrossRef]
  3. Sang, B.; Chen, H.; Li, T.; Xu, W.; Yu, H. Incremental Approaches for Heterogeneous Feature Selection in Dynamic Ordered Data. Inf. Sci. 2020, 541, 475–501. [Google Scholar] [CrossRef]
  4. Xu, W.; Li, Y.; Liao, X. Approaches to Attribute Reductions Based on Rough Set and Matrix Computation in Inconsistent Ordered Information Systems. Knowl. Based Syst. 2012, 27, 78–91. [Google Scholar] [CrossRef]
  5. Zhang, X.; Chen, J. Three-Hierarchical Three-Way Decision Models for Conflict Analysis: A Qualitative Improvement and a Quantitative Extension. Inf. Sci. 2022, 587, 485–514. [Google Scholar] [CrossRef]
  6. Zhang, X.; Yao, Y. Tri-Level Attribute Reduction in Rough Set Theory. Exp. Syst. Appl. 2022, 190, 116187. [Google Scholar] [CrossRef]
  7. Yang, X.; Liang, S.; Yu, H.; Gao, S.; Qian, Y. Pseudo-Label Neighborhood Rough Set: Measures and Attribute Reductions. Int. J. Approx. Reason. 2019, 105, 112–129. [Google Scholar] [CrossRef]
  8. Liu, K.; Yang, X.; Yu, H.; Mi, J.; Wang, P.; Chen, X. Rough Set Based Semi-Supervised Feature Selection via Ensemble Selector. Knowl. Based Syst. 2019, 165, 282–296. [Google Scholar] [CrossRef]
  9. Sun, L.; Wang, L.; Ding, W.; Qian, Y.; Xu, J. Feature Selection Using Fuzzy Neighborhood Entropy-Based Uncertainty Measures for Fuzzy Neighborhood Multigranulation Rough Sets. IEEE Trans. Fuzzy Syst. 2021, 29, 19–33. [Google Scholar] [CrossRef]
  10. Pendharkar, P.C. Exhaustive and Heuristic Search Approaches for Learning a Software Defect Prediction Model. Eng. Appl. Artif. Intell. 2010, 23, 34–40. [Google Scholar] [CrossRef]
  11. Hu, Q.; Yu, D.; Liu, J.; Wu, C. Neighborhood Rough Set Based Heterogeneous Feature Subset Selection. Inf. Sci. 2008, 178, 3577–3594. [Google Scholar] [CrossRef]
  12. Jia, X.; Shang, L.; Zhou, B.; Yao, Y. Generalized Attribute Reduct in Rough Set Theory. Knowl. Based Syst. 2016, 91, 204–218. [Google Scholar] [CrossRef]
  13. Chen, D.; Zhao, S.; Zhang, L.; Yang, Y.; Zhang, X. Sample Pair Selection for Attribute Reduction with Rough Set. IEEE Trans. Knowl. 2012, 24, 2080–2093. [Google Scholar] [CrossRef]
  14. Dai, J.; Hu, H.; Wu, W.-Z.; Qian, Y.; Huang, D. Maximal-Discernibility-Pair-Based Approach to Attribute Reduction in Fuzzy Rough Sets. IEEE Trans. Fuzzy Syst. 2018, 26, 2174–2187. [Google Scholar] [CrossRef]
  15. Yang, X.; Qi, Y.; Song, X.; Yang, J. Test Cost Sensitive Multigranulation Rough Set: Model and Minimal Cost Selection. Inf. Sci. 2013, 250, 184–199. [Google Scholar] [CrossRef]
  16. Ju, H.; Yang, X.; Yu, H.; Li, T.; Yu, D.-J.; Yang, J. Cost-Sensitive Rough Set Approach. Inf. Sci. 2016, 355–356, 282–298. [Google Scholar] [CrossRef]
  17. Qian, Y.; Liang, J.; Pedrycz, W.; Dang, C. An Efficient Accelerator for Attribute Reduction from Incomplete Data in Rough Set Framework. Pattern Recognit. 2011, 44, 1658–1670. [Google Scholar] [CrossRef]
  18. Wang, X.; Wang, P.; Yang, X.; Yao, Y. Attribution Reduction Based on Sequential Three-Way Search of Granularity. Int. J. Mach. Learn. Cybern. 2021, 12, 1439–1458. [Google Scholar] [CrossRef]
  19. Tan, K.C.; Teoh, E.J.; Yu, Q.; Goh, K.C. A Hybrid Evolutionary Algorithm for Attribute Selection in Data Mining. Expert Syst. Appl. 2009, 36, 8616–8630. [Google Scholar] [CrossRef]
  20. Zhang, X.; Lin, Q. Three-Learning Strategy Particle Swarm Algorithm for Global Optimization Problems. Inf. Sci. 2022, 593, 289–313. [Google Scholar] [CrossRef]
  21. Xie, X.; Qin, X.; Zhou, Q.; Zhou, Y.; Zhang, T.; Janicki, R.; Zhao, W. A Novel Test-Cost-Sensitive Attribute Reduction Approach Using the Binary Bat Algorithm. Knowl. Based Syst. 2019, 186, 104938. [Google Scholar] [CrossRef]
  22. Ju, H.; Ding, W.; Yang, X.; Fujita, H.; Xu, S. Robust Supervised Rough Granular Description Model with the Principle of Justifiable Granularity. Appl. Soft Comput. 2021, 110, 107612. [Google Scholar] [CrossRef]
  23. Li, Y.; Si, J.; Zhou, G.; Huang, S.; Chen, S. FREL: A Stable Feature Selection Algorithm. IEEE Trans. Neural Networks Learn. Syst. 2015, 26, 1388–1402. [Google Scholar] [CrossRef] [PubMed]
  24. Li, S.; Harner, E.J.; Adjeroh, D.A. Random KNN Feature Selection—A Fast and Stable Alternative to Random Forests. BMC Bioinform. 2011, 12, 450. [Google Scholar] [CrossRef] [Green Version]
  25. Sarkar, C.; Cooley, S.; Srivastava, J. Robust Feature Selection Technique Using Rank Aggregation. Appl. Artif. Intell. 2014, 28, 243–257. [Google Scholar] [CrossRef]
  26. Ghaemi, M.; Feizi-Derakhshi, M.-R. Forest Optimization Algorithm. Exp. Syst. Appl. 2014, 41, 6676–6687. [Google Scholar] [CrossRef]
  27. Ghaemi, M.; Feizi-Derakhshi, M.-R. Feature Selection Using Forest Optimization Algorithm. Pattern Recognit. 2016, 60, 121–129. [Google Scholar] [CrossRef]
  28. Hu, Q.; An, S.; Yu, X.; Yu, D. Robust Fuzzy Rough Classifiers. Fuzzy Sets Syst. 2011, 183, 26–43. [Google Scholar] [CrossRef]
  29. Hu, Q.; Pedrycz, W.; Yu, D.; Lang, J. Selecting Discrete and Continuous Features Based on Neighborhood Decision Error Minimization. IEEE Trans. Syst. Man Cybern. Part B 2010, 40, 137–150. [Google Scholar] [CrossRef]
  30. Xu, W.; Pang, J.; Luo, S. A Novel Cognitive System Model and Approach to Transformation of Information Granules. Int. J. Approx. Reason. 2014, 55, 853–866. [Google Scholar] [CrossRef]
  31. Liu, D.; Li, T.; Ruan, D. Probabilistic Model Criteria with Decision-Theoretic Rough Sets. Inf. Sci. 2011, 181, 3709–3722. [Google Scholar] [CrossRef]
  32. Pedrycz, W.; Succi, G.; Sillitti, A.; Iljazi, J. Data Description: A General Framework of Information Granules. Knowl. Based Syst. 2015, 80, 98–108. [Google Scholar] [CrossRef]
  33. Wu, W.-Z.; Leung, Y. A Comparison Study of Optimal Scale Combination Selection in Generalized Multi-Scale Decision Tables. Int. J. Mach. Learn. Cybern. 2020, 11, 961–972. [Google Scholar] [CrossRef]
  34. Jiang, Z.; Yang, X.; Yu, H.; Liu, D.; Wang, P.; Qian, Y. Accelerator for Multi-Granularity Attribute Reduction. Knowl. Based Syst. 2019, 177, 145–158. [Google Scholar] [CrossRef]
  35. Wang, W.; Zhan, J.; Zhang, C. Three-Way Decisions Based Multi-Attribute Decision Making with Probabilistic Dominance Relations. Inf. Sci. 2021, 559, 75–96. [Google Scholar] [CrossRef]
  36. Hu, Q.; Yu, D.; Xie, Z. Neighborhood Classifiers. Expert Syst. Appl. 2008, 34, 866–876. [Google Scholar] [CrossRef]
  37. Liu, K.; Li, T.; Yang, X.; Yang, X.; Liu, D.; Zhang, P.; Wang, J. Granular Cabin: An Efficient Solution to Neighborhood Learning in Big Data. Inf. Sci. 2022, 583, 189–201. [Google Scholar] [CrossRef]
  38. Jiang, Z.; Liu, K.; Yang, X.; Yu, H.; Fujita, H.; Qian, Y. Accelerator for Supervised Neighborhood Based Attribute Reduction. Int. J. Approx. Reason. 2020, 119, 122–150. [Google Scholar] [CrossRef]
  39. Zhang, C.; Li, D.; Liang, J. Multi-Granularity Three-Way Decisions with Adjustable Hesitant Fuzzy Linguistic Multigranulation Decision-Theoretic Rough Sets over Two Universes. Inf. Sci. 2020, 507, 665–683. [Google Scholar] [CrossRef]
  40. Xu, W.; Zhang, W. Knowledge Reduction and Matrix Computation in Inconsistent Ordered Information Systems. Int. J. Bus. Intell. Data Min. 2008, 3, 409–425. [Google Scholar] [CrossRef]
  41. Chen, Y.; Wang, P.; Yang, X.; Mi, J.; Liu, D. Granular Ball Guided Selector for Attribute Reduction. Knowl. Based Syst. 2021, 229, 107326. [Google Scholar] [CrossRef]
  42. Liu, K.; Yang, X.; Fujita, H.; Liu, D.; Yang, X.; Qian, Y. An Efficient Selector for Multi-Granularity Attribute Reduction. Inf. Sci. 2019, 505, 457–472. [Google Scholar] [CrossRef]
  43. Ba, J.; Liu, K.; Ju, H.; Xu, S.; Xu, T.; Yang, X. Triple-G: A New MGRS and Attribute Reduction. Int. J. Mach. Learn. Cybern. 2022, 13, 337–356. [Google Scholar] [CrossRef]
  44. Yang, X.; Yao, Y. Ensemble Selector for Attribute Reduction. Appl. Soft Comput. 2018, 70, 1–11. [Google Scholar] [CrossRef]
  45. Sun, D.; Zhang, D. Bagging Constraint Score for Feature Selection with Pairwise Constraints. Pattern Recognit. 2010, 43, 2106–2118. [Google Scholar] [CrossRef]
  46. Xu, S.; Yang, X.; Yu, H.; Yu, D.-J.; Yang, J.; Tsang, E.C.C. Multi-Label Learning with Label-Specific Feature Reduction. Knowl. Based Syst. 2016, 104, 52–61. [Google Scholar] [CrossRef]
  47. Liang, J.; Li, R.; Qian, Y. Distance: A More Comprehensible Perspective for Measures in Rough Set Theory. Knowl. Based Syst. 2012, 27, 126–136. [Google Scholar] [CrossRef]
  48. Zhang, X.; Mei, C.; Chen, D.; Li, J. Feature Selection in Mixed Data: A Method Using a Novel Fuzzy Rough Set-Based Information Entropy. Pattern Recognit. 2016, 56, 1–15. [Google Scholar] [CrossRef]
  49. Lianjie, D.; Degang, C.; Ningling, W.; Zhanhui, L. Key Energy-Consumption Feature Selection of Thermal Power Systems Based on Robust Attribute Reduction with Rough Sets. Inf. Sci. 2020, 532, 61–71. [Google Scholar] [CrossRef]
  50. Xu, W.; Liu, S.; Zhang, X.; Zhang, W. On Granularity in Information Systems Based on Binary Relation. Intell. Inf. Manag. 2011, 3, 75–86. [Google Scholar] [CrossRef] [Green Version]
  51. Wang, C.; Hu, Q.; Wang, X.; Chen, D.; Qian, Y.; Dong, Z. Feature Selection Based on Neighborhood Discrimination Index. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 2986–2999. [Google Scholar] [CrossRef] [PubMed]
  52. Wang, S.; Li, D.; Petrick, N.; Sahiner, B.; Linguraru, M.G.; Summers, R.M. Optimizing Area under the ROC Curve Using Semi-Supervised Learning. Pattern Recognit. 2015, 48, 276–287. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Figure 1. An example of local seeding with “LSC” = 2.
Figure 2. An example of global seeding with “GSC” = 3.
Figure 3. Ensemble FOA-based searching and classification.
Table 1. Datasets description.
ID | Dataset | # Samples | # Attributes | # Labels | Domain
1 | Climate Model Simulation Crashes | 540 | 20 | 2 | Physical
2 | Connectionist Bench (Vowel Recognition—Deterding) | 990 | 13 | 11 | Geography
3 | Diabetic Retinopathy Debrecen | 1151 | 19 | 2 | Life
4 | German | 1000 | 24 | 2 | Business
5 | Glass Identification | 214 | 9 | 6 | Physical
6 | Ionosphere | 351 | 34 | 2 | Physical
7 | Libras Movement | 360 | 90 | 15 | Astronomy
8 | LSVT Voice Rehabilitation | 126 | 256 | 10 | Computer
9 | Musk (Version 1) | 476 | 166 | 2 | Physical
10 | QSAR biodegradation | 1055 | 41 | 2 | Biology
11 | Quality Assessment of Digital Colposcopies | 287 | 62 | 2 | Life
12 | Sonar | 208 | 60 | 2 | Physical
13 | Statlog (Heart) | 270 | 13 | 2 | Life
14 | Statlog (Image Segmentation) | 2310 | 18 | 7 | Life
15 | Statlog (Vehicle Silhouettes) | 846 | 18 | 4 | Life
16 | Synthetic Control Chart Time Series | 600 | 60 | 6 | Management
17 | Ultrasonic flowmeter diagnostics - Meter D | 180 | 44 | 4 | Biology
18 | Urban Land Cover | 675 | 147 | 9 | Geography
19 | Wdbc | 569 | 30 | 2 | Life
20 | Wine | 178 | 13 | 3 | Physical
Table 2. Comparison among KNN classification accuracies (raw data and higher values are in bold).
Classifier: KNN
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.88700.90750.9048 0.88680.90740.9038 0.88810.90740.9030 0.88710.90740.8955
20.88010.88960.8644 0.88370.88840.8830 0.88520.89070.8829 0.88060.88730.8860
30.65900.67240.6505 0.65870.67050.6551 0.65440.67310.6297 0.66230.67480.6253
40.68610.72320.6886 0.69820.72250.6986 0.70140.72630.6956 0.69450.72760.7129
50.63860.67790.6502 0.63810.66510.6349 0.63930.67490.6514 0.63860.66790.6360
60.82490.82500.8381 0.83170.82470.8294 0.82700.82640.8350 0.82790.83040.8370
70.78640.79600.7634 0.81200.83300.7965 0.76520.76330.7834 0.79300.80010.7784
80.79600.82520.8084 0.80000.82840.8172 0.81440.82800.8340 0.80120.81880.8012
90.82200.83390.7529 0.81940.83380.7422 0.82110.83010.7696 0.82240.82720.7615
100.83540.88190.8340 0.81310.82540.7856 0.86700.84300.8080 0.81230.82430.7680
110.75720.79970.7131 0.76210.80140.7340 0.77100.79950.7290 0.76410.79980.7883
120.82660.85000.7417 0.82680.84590.7376 0.82510.84780.7715 0.81730.84780.7700
130.79310.84570.7917 0.78800.84070.7943 0.79760.84190.7970 0.79000.84260.7907
140.93970.95240.9414 0.94170.95080.9429 0.94580.95530.9503 0.93850.94950.9444
150.66410.69210.6343 0.66660.69330.6491 0.66640.69470.6563 0.66790.70010.6676
160.96280.98330.7844 0.96420.98580.7700 0.96580.98380.8721 0.96450.98630.9374
170.80930.81120.8106 0.81340.82430.8201 0.83420.84560.8231 0.79800.80120.7893
180.84470.87130.7442 0.84510.87140.6624 0.84700.87060.7202 0.85100.87340.8236
190.91340.91350.9017 0.93420.90320.9128 0.92760.95410.9281 0.93670.95120.9097
200.93000.95510.9280 0.92860.95200.9446 0.93340.95830.9509 0.93430.95710.9360
Table 3. Comparison among CART classification accuracies (raw data and higher values are in bold).
Classifier: CART
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.86930.90980.9048 0.87060.91000.9015 0.86290.90940.9000 0.86520.90910.9065
20.68110.80880.6961 0.69680.80840.6998 0.69570.81090.7021 0.69230.80560.7029
30.61000.65230.6165 0.61390.65780.6266 0.61480.66120.6146 0.60530.64890.5959
40.67900.73800.6739 0.68960.74460.6902 0.68990.74030.6822 0.68760.74480.7027
50.70860.73490.7091 0.71020.73880.7202 0.69160.73160.6802 0.70860.73770.6674
60.85960.88690.8470 0.86640.88610.8546 0.86560.88300.8590 0.87260.88770.8569
70.75860.78600.7231 0.73240.77230.7143 0.75630.78650.7326 0.72220.74800.7210
80.76040.81240.7692 0.74640.80480.7724 0.75840.81160.7524 0.75280.81960.7632
90.76720.85200.7382 0.76580.84670.7182 0.76540.85180.7454 0.76510.84990.7392
100.81100.83630.8106 0.82650.85760.8032 0.82310.83260.8154 0.82300.83210.8381
110.73600.79170.6847 0.74190.79050.6945 0.74710.79260.6917 0.74210.79480.7509
120.72170.83050.6763 0.71800.80900.6707 0.69710.81000.7134 0.69780.81000.7007
130.74280.79740.7344 0.73670.80130.7296 0.73520.80070.7261 0.74020.79670.7278
140.95050.96100.9501 0.95200.96140.9476 0.95400.96260.9477 0.95220.96010.9511
150.68860.73640.6475 0.69180.73650.6569 0.69570.73590.6715 0.70000.74080.6879
160.89030.97140.8268 0.89280.96750.8043 0.89110.97120.8570 0.88890.97230.8861
170.76780.78900.7798 0.73540.75470.7413 0.76100.76450.7578 0.74230.73890.7880
180.80310.86300.7700 0.80080.86080.6640 0.80280.86100.7344 0.80250.85920.8107
190.89210.89900.8769 0.91030.92300.8819 0.91200.92100.9001 0.89760.88730.9023
200.88830.93800.8869 0.88430.93770.8689 0.88400.94260.8926 0.88260.94200.8900
Table 4. Comparison among KNN classification accuracies (10% label noise and higher values are in bold).
Classifier: KNN
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.92280.93520.9337 0.92320.93530.9291 0.92440.93520.9331 0.92180.93520.9368
20.84420.85890.8234 0.85630.85750.8473 0.85320.85800.8449 0.85060.84680.8311
30.60870.65000.6080 0.61020.64880.6161 0.61170.66030.6019 0.61230.64560.5708
40.69840.71300.6967 0.70100.71550.6965 0.70010.71470.6952 0.70260.71540.7098
50.60140.62090.5940 0.60160.62000.5765 0.60000.62050.5660 0.61490.63600.5726
60.86660.86900.8631 0.87240.87240.8679 0.86500.87860.8730 0.86940.87630.8713
70.72310.75010.7420 0.71910.73580.7219 0.74270.75530.7502 0.72180.74580.7310
80.77080.80080.7136 0.77560.80600.7300 0.76600.80240.7480 0.77760.79760.7284
90.79960.79840.7213 0.79670.80230.7303 0.79720.80130.7533 0.79510.79980.7575
100.84020.86660.8247 0.84020.86860.8329 0.84250.86420.8191 0.84300.86860.8322
110.77170.80190.7348 0.77570.80710.7357 0.77570.80720.7352 0.76720.80430.7733
120.79270.80610.7102 0.79100.80850.7020 0.79290.81320.7422 0.80320.80630.7468
130.72350.76280.7231 0.73370.76410.7233 0.72560.75740.7307 0.72930.75850.7322
140.91000.93200.9241 0.94870.95310.9447 0.95720.95810.9449 0.94700.95870.9194
150.64320.66890.5927 0.65740.67070.6316 0.65200.66840.6328 0.65590.66920.6199
160.92110.96050.7358 0.91620.95830.7258 0.91870.96000.8008 0.92480.95850.9380
170.78670.79080.7697 0.78860.78920.7686 0.78330.79330.7181 0.79000.79420.7603
180.80760.85190.6489 0.80790.85030.5962 0.81030.85030.6539 0.80680.85670.8040
190.93810.95940.9310 0.94130.95650.9335 0.93910.95680.9325 0.94030.95530.9418
200.92060.94490.9069 0.92830.94400.9231 0.92770.94490.9266 0.91310.94290.9191
Table 5. Comparison among CART classification accuracies (10% label noise and higher values are in bold).
Classifier: CART
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.89770.93530.9168 0.88950.93420.9098 0.89520.93420.9095 0.88850.93440.9160
20.63630.77640.6266 0.65040.77350.6289 0.64500.77220.6459 0.64670.76170.6249
30.61140.65140.6120 0.60530.65370.6147 0.60340.65770.6071 0.61330.64960.5720
40.67730.72770.6863 0.68640.73090.6874 0.68140.72790.6879 0.68410.73140.6848
50.62070.67560.6181 0.62000.67840.6193 0.62140.67700.5926 0.62840.68140.5893
60.89870.94100.8797 0.89790.94000.8980 0.89110.93760.9063 0.89060.94070.8949
70.73210.77500.7475 0.73780.75810.7491 0.75820.76530.7630 0.76190.76080.7567
80.70600.78760.7044 0.70120.77920.6804 0.71120.78120.7068 0.70320.78640.6956
90.74230.81680.7041 0.73370.81530.7026 0.73950.81170.7186 0.73180.81400.7200
100.79060.84010.7851 0.79240.83890.7865 0.79420.83820.7712 0.79160.83620.7891
110.74930.81570.7047 0.75760.81860.7247 0.75790.81900.7233 0.75140.81780.7547
120.68200.76460.6602 0.68000.77630.6478 0.67800.76850.6746 0.68290.76800.6749
130.68690.72740.6696 0.68700.72390.6783 0.68040.72690.6796 0.68570.72410.6815
140.93230.94010.9351 0.95840.96410.9619 0.93170.95670.9581 0.93080.95010.9476
150.64200.69720.6018 0.64520.69730.6211 0.64340.69640.6210 0.64980.69840.6208
160.80780.95900.7553 0.81230.95900.7505 0.80650.95500.7763 0.81020.96140.8024
170.82330.85060.8111 0.83530.85000.7847 0.83190.85190.7556 0.82920.85170.8097
180.73440.84560.6717 0.73760.84390.5941 0.73430.84390.6565 0.73470.84400.7292
190.87250.91980.8800 0.87550.91780.8697 0.87930.91680.8770 0.87260.91680.8744
200.86460.92490.8540 0.86660.91770.8703 0.86830.92660.8763 0.85430.91830.8686
Table 6. Comparison among KNN classification accuracies (20% label noise and higher values are in bold).
Classifier: KNN
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.85050.85190.8739 0.85180.85190.8619 0.85280.85190.8676 0.84920.85190.8695
20.81700.81890.7900 0.82130.83640.8180 0.81710.82360.8202 0.82590.89590.7546
30.56280.58810.5637 0.56400.58770.5623 0.57130.59510.5835 0.56790.59130.5337
40.68720.69380.6823 0.69270.69620.6906 0.69230.69730.6878 0.69010.69490.7032
50.64490.64420.6349 0.65050.64300.6449 0.64420.64300.5316 0.65210.64000.5886
60.84400.84170.8569 0.84930.83810.8530 0.85240.84940.8686 0.84700.84290.8543
70.75420.76890.7672 0.75810.77710.7658 0.73090.75210.7492 0.76420.76890.7598
80.65960.68580.6685 0.67380.68730.6750 0.67000.68960.6912 0.66650.68230.6623
90.70070.69470.6592 0.69410.69860.6698 0.69610.69180.6701 0.70390.69770.6755
100.80110.81770.7898 0.80200.81540.8005 0.80060.81180.7548 0.80180.81550.7990
110.72780.72880.6748 0.72190.72840.6766 0.72400.72600.6769 0.72100.72600.7293
120.74370.75200.6741 0.75320.74760.6615 0.74100.75050.7066 0.74020.74120.7161
130.78630.82630.7670 0.79780.82350.7813 0.79670.82260.7824 0.80130.82560.8135
140.87470.90930.8695 0.87490.90850.8766 0.87520.91410.8797 0.87440.90340.6407
150.63980.67820.6078 0.64140.67870.6328 0.64660.68160.6343 0.64110.68210.5967
160.86200.91880.6714 0.86520.92420.6555 0.86030.92230.7259 0.86230.91750.8933
170.77690.78110.7494 0.77310.77500.7506 0.77140.77920.6944 0.77470.78170.7531
180.72180.81540.6999 0.74510.83210.7017 0.72890.80830.6893 0.74310.84880.6598
190.87620.90580.8804 0.88120.90190.8748 0.87720.90900.8802 0.87610.90000.8815
200.88110.91830.8660 0.88630.91200.8863 0.88140.91230.8823 0.87490.91230.8843
Table 7. Comparison among CART classification accuracies (20% label noise and higher values are in bold).
Classifier: CART
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.84030.85190.8662 0.83930.85310.8490 0.83940.86380.8569 0.83940.86280.8637
20.62360.74190.6070 0.62730.74000.6287 0.62120.74790.6287 0.63440.71210.5780
30.57630.60350.5766 0.58370.60390.5866 0.58110.60540.5804 0.57880.60100.5407
40.68750.70180.6861 0.69090.70320.6931 0.68720.70250.6928 0.68720.70390.6936
50.57860.64120.5688 0.57600.64880.5635 0.57950.65300.5337 0.57530.64580.5528
60.84440.90970.8529 0.84940.91000.8537 0.85810.91700.8631 0.85690.91030.8620
70.72420.76690.7632 0.71810.76510.7388 0.73090.74210.7217 0.73420.76560.7548
80.62770.69690.6446 0.64270.68690.6442 0.63810.68850.6388 0.63040.68230.6542
90.68660.73390.6565 0.68220.73990.6498 0.68530.73780.6666 0.68920.73450.6713
100.76800.80550.7555 0.76240.80620.7554 0.76500.80180.7352 0.76440.80280.7591
110.71760.73030.6491 0.71380.73050.6805 0.71620.73030.6810 0.71330.73000.7155
120.65170.68370.6044 0.65150.67980.6141 0.64730.68830.6463 0.65150.68390.6337
130.72330.78570.7174 0.72570.78800.7131 0.72330.78940.7094 0.72650.78350.7263
140.81450.90670.8158 0.81780.90700.8190 0.82190.91510.8314 0.81520.89960.6186
150.62950.69310.5897 0.63350.69540.6052 0.62570.69860.6128 0.62880.69820.5940
160.75210.93010.7096 0.75470.92730.6886 0.75230.92750.7244 0.74630.93010.7453
170.76190.82060.7461 0.76500.81330.7350 0.76860.82500.7053 0.75920.81060.7436
180.73180.82140.6859 0.73780.81210.6890 0.76590.81330.6943 0.78710.85980.6668
190.79180.84740.7951 0.78710.84780.7920 0.79230.85150.7978 0.79160.83600.7957
200.79940.86710.7960 0.80230.86370.8014 0.80230.85940.8094 0.79860.86540.8026
Table 8. Comparison among KNN classification accuracies (30% label noise and higher values are in bold).
Classifier: KNN
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.91760.92590.9254 0.91690.92590.9192 0.91730.92600.9231 0.91630.92590.9219
20.73870.74030.7135 0.74660.74740.7474 0.74230.75190.7432 0.75110.77350.6369
30.54200.55660.5438 0.54040.55430.5386 0.54900.56100.5558 0.54420.55300.5190
40.70980.73750.7108 0.70570.73600.7117 0.70590.73610.7082 0.71030.73450.7137
50.54930.59330.5545 0.54900.58760.5371 0.54570.61480.5029 0.55450.59810.5145
60.77990.76970.7741 0.78660.77540.7810 0.78810.77400.8063 0.78510.77090.7841
70.71430.73590.7102 0.73510.73910.7388 0.73870.74520.7288 0.73540.76180.7588
80.63850.62350.6262 0.64350.62730.6385 0.64880.63080.6469 0.63690.62850.6358
90.69540.69950.6480 0.69890.69970.6585 0.70020.68970.6654 0.70380.68800.6701
100.77580.78100.7619 0.77620.78010.7697 0.77370.77550.7115 0.77650.78240.7707
110.85120.86020.7868 0.84790.85790.8046 0.84910.85950.8032 0.84790.85670.8453
120.69120.69510.6300 0.70490.71510.6261 0.70880.71850.6539 0.69460.69930.6734
130.67430.69630.6570 0.68020.68980.6656 0.67890.69540.6730 0.67090.69810.6728
140.88250.91150.8263 0.81780.90700.8190 0.82190.91510.8314 0.81520.89960.6186
150.58210.61290.5532 0.58540.61190.5702 0.58640.60730.5701 0.59010.60860.5440
160.80300.89380.5997 0.80430.89750.6081 0.80530.89500.6588 0.80220.89960.8313
170.70940.71670.6925 0.71390.71430.6903 0.71190.71690.6581 0.71780.72190.6742
180.73180.82140.6859 0.73780.81210.6890 0.76590.81330.6943 0.78710.85980.6668
190.80130.81910.8013 0.79790.82080.7918 0.79870.82390.7964 0.79750.81620.8025
200.83140.89140.8240 0.83090.89430.8169 0.83860.89030.8389 0.81690.89060.8443
Table 9. Comparison among CART classification accuracies (30% label noise and higher values are in bold).
Classifier: CART
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.89310.92510.9136 0.89440.92420.9044 0.88770.92660.9088 0.89130.92490.9115
20.56030.69250.5410 0.56950.68520.5695 0.56910.70200.5710 0.57780.66450.4805
30.55690.56760.5568 0.55650.56890.5517 0.55720.57070.5600 0.55410.56780.5171
40.69880.74210.7055 0.69780.74070.7010 0.69790.74270.6967 0.70160.74320.7000
50.52810.59740.5355 0.51810.59620.5171 0.52950.61190.4912 0.53000.60070.4960
60.79230.82290.7767 0.79160.82190.8024 0.79340.82660.8051 0.80040.82270.7904
70.72540.74670.7268 0.71350.74120.7303 0.73870.75760.7648 0.74210.77310.7567
80.60920.62270.6058 0.60580.62120.6015 0.60730.62380.6250 0.60380.61690.6162
90.66350.69440.6536 0.67260.69510.6456 0.66510.69600.6540 0.67590.69760.6598
100.74560.76820.7433 0.74810.76730.7458 0.75380.76650.7153 0.75330.76860.7465
110.79350.85260.7642 0.79700.85400.7656 0.80120.85190.7665 0.79840.85350.7953
120.59830.62780.5807 0.61340.61370.5939 0.61980.62200.6039 0.60340.62240.5876
130.62650.65890.6169 0.61960.62760.6126 0.62590.63020.6256 0.62310.63810.6257
140.83250.91950.8457 0.82760.93120.8586 0.87310.91090.8637 0.85810.91820.8378
150.57080.65530.5353 0.57300.64960.5488 0.56920.65070.5551 0.57440.65020.5398
160.66480.88220.6252 0.65760.88560.6204 0.67250.88330.6468 0.66060.88900.6602
170.70110.76830.6883 0.70580.76560.6786 0.70110.77060.6964 0.70640.75810.6747
180.76340.81560.6631 0.74610.82730.6759 0.75880.81760.6818 0.77240.84720.6731
190.72800.76180.7337 0.72500.75640.7321 0.72460.76720.7382 0.72540.75680.7248
200.76830.85430.7626 0.76940.84460.7749 0.76540.85060.7663 0.76030.84660.7643
Table 10. Comparison among KNN classification accuracies (40% label noise and higher values are in bold).
Classifier: KNN
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.87690.87960.8774 0.87470.90960.8703 0.87690.87960.8746 0.87640.88960.8766
20.64340.65350.6362 0.65300.65460.6360 0.65840.66960.6628 0.60830.62320.5275
30.51060.54240.5100 0.51030.52380.5120 0.51860.52860.5122 0.51200.53470.4909
40.75220.77180.7482 0.74930.77220.7506 0.75030.77160.7495 0.75040.77250.7497
50.45860.48950.4581 0.47260.49420.4707 0.46950.49860.4160 0.46630.48740.4412
60.75540.75690.7561 0.75660.77840.7627 0.75460.79010.7720 0.75740.77010.7581
70.74640.75170.7188 0.72670.74670.7413 0.73870.75760.7648 0.74640.75370.7261
80.59400.59480.5708 0.56200.57320.5676 0.58920.59440.5804 0.58600.58840.5732
90.63600.64580.6240 0.66540.66710.6257 0.66840.66880.6389 0.65390.66270.6412
100.75990.76020.7497 0.75950.75960.7564 0.73930.75480.6992 0.76330.76920.7561
110.80380.80980.7734 0.79760.80780.7824 0.79950.80880.7824 0.80140.80970.7914
120.53730.59880.5744 0.58900.60680.5673 0.63460.66950.6029 0.62150.69000.6300
130.63260.64480.6098 0.63890.64780.6231 0.63570.63780.6296 0.63700.66260.6417
140.72810.78530.7187 0.73020.78230.7270 0.72660.78800.7248 0.72850.78360.5948
150.52770.54980.5028 0.52990.55500.5195 0.53110.56140.5206 0.53550.55990.4989
160.71920.83580.5413 0.72190.83440.5438 0.72420.83890.5898 0.72280.84080.7518
170.62890.63390.5853 0.62330.62940.5850 0.62750.63360.5728 0.63470.63110.5800
180.85990.91330.7413 0.86450.90880.7170 0.86490.90990.7504 0.86590.90970.8447
190.70290.69120.7058 0.70650.69470.6992 0.70390.69600.6938 0.71090.69260.7098
200.81250.85500.7922 0.81110.85860.8156 0.81560.86670.8103 0.80750.86060.8292
Table 11. Comparison among CART classification accuracies (40% label noise and higher values are in bold).
Classifier: CART
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.85530.87940.8791 0.86090.87930.8686 0.86310.88030.8768 0.85780.87990.8801
20.49150.62340.4861 0.50640.62500.5036 0.51010.63930.5089 0.51650.59350.3981
30.53470.53820.5284 0.53290.55770.5383 0.52800.52880.5213 0.52900.54530.4950
40.74270.77300.7416 0.74420.77510.7415 0.74420.77470.7436 0.74270.77320.7451
50.46230.51910.4693 0.46790.52930.4542 0.45980.53300.4158 0.46880.53090.4342
60.77840.78910.7570 0.78490.78590.7819 0.77260.78160.7754 0.77600.77870.7719
70.75830.76820.7399 0.74130.74680.7142 0.73510.77310.7148 0.73130.74120.7372
80.58480.59960.5604 0.58160.58800.5700 0.59040.61000.5888 0.59640.59280.5704
90.65120.66110.6196 0.64970.65910.6345 0.65130.65930.6409 0.64800.65890.6442
100.73750.74910.7311 0.74140.74820.7347 0.74000.74910.7027 0.74290.75090.7393
110.79100.81240.7493 0.78140.81140.7497 0.78410.81170.7509 0.78550.81020.7817
120.57070.55980.5510 0.57270.56100.5490 0.57610.55930.5661 0.56200.55120.5612
130.58980.62310.6000 0.59610.62240.5843 0.59300.60260.5848 0.60070.62740.5956
140.66900.82820.6640 0.67110.82420.6634 0.67180.83320.6790 0.66840.82300.5654
150.52710.60120.4941 0.52190.60040.5087 0.52310.60200.5097 0.52970.60140.5020
160.58890.83160.5535 0.59120.82730.5558 0.59200.83390.5760 0.58650.83470.5854
170.60140.67860.5714 0.60080.67170.5844 0.59640.68420.5994 0.60000.66640.5786
180.85350.91830.7673 0.87250.91090.7264 0.87380.91890.7608 0.88120.91960.8688
190.64340.64500.6619 0.65520.65800.6533 0.66770.68330.6560 0.65390.69540.6725
200.73530.82470.7428 0.73080.81440.7500 0.72420.82750.7406 0.72140.81640.7469
Table 12. Comparison among KNN classification stabilities (raw data and higher values are in bold).
Classifier: KNN
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.94230.99980.9348 0.94201.00000.9276 0.94401.00000.9279 0.93781.00000.9378
20.86750.87930.8589 0.87710.87980.8780 0.87870.88040.8789 0.87250.87320.8804
30.80010.78640.8018 0.81990.79120.8221 0.74120.78640.7529 0.81600.79170.7925
40.71580.86170.7687 0.77110.86610.7973 0.76920.86220.7811 0.75920.86130.8539
50.84910.84930.8658 0.86260.85880.8809 0.82720.84700.8465 0.86740.85880.8137
60.93440.95770.8884 0.94110.96530.9156 0.93590.96010.9083 0.92870.96040.9083
70.73780.75200.7275 0.74180.74570.7256 0.74710.76210.7356 0.72760.75210.7289
80.74760.83320.7472 0.75800.82240.7848 0.76880.82320.8028 0.75080.82440.7388
90.82790.85580.7146 0.82320.86130.7422 0.82540.85970.7414 0.82250.85720.7309
100.88960.92240.8807 0.90130.92320.9002 0.89370.91970.8866 0.90280.92400.8934
110.82120.93620.7800 0.82120.92760.7986 0.83880.93310.7991 0.82910.93190.8912
120.80270.85370.7049 0.79800.85150.7515 0.79290.84730.7522 0.79220.84290.7271
130.81760.88300.8509 0.84090.87740.8669 0.83570.87570.8448 0.83540.88000.8609
140.96560.96520.9748 0.97090.96300.9729 0.96940.96820.9774 0.97060.96260.9719
150.80530.80950.7922 0.82510.81490.8199 0.81660.81890.8117 0.82490.81580.8298
160.94510.97770.7463 0.94840.97910.7402 0.94760.97730.8302 0.94440.97970.9158
170.90060.91530.8708 0.89500.91940.8689 0.90330.91310.8194 0.90720.91920.8825
180.86410.91330.7413 0.86450.90880.7170 0.86490.90990.7504 0.86590.90970.8447
190.96530.97770.9659 0.96460.97700.9716 0.96510.97850.9645 0.96230.97660.9648
200.93770.97400.9474 0.92740.96430.9600 0.93600.97630.9574 0.92630.96970.9520
Table 13. Comparison among CART classification stabilities (raw data and higher values are in bold).
Classifier: CART
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.86650.98970.9144 0.86640.99140.9058 0.85460.98820.9049 0.86250.98980.9157
20.66890.78880.6967 0.68890.79400.6913 0.68770.79290.6990 0.68460.79070.6968
30.62470.73970.6337 0.62520.74180.6442 0.61770.74460.6815 0.62630.74230.6857
40.67530.83680.7170 0.69200.83290.7062 0.69440.82960.7021 0.69060.83110.7128
50.82020.84370.8293 0.79070.84160.8298 0.77740.83090.7865 0.81980.84770.7774
60.86060.92740.8707 0.86710.92240.8781 0.85470.92510.8853 0.86660.92870.8757
70.73170.74860.7275 0.71750.75120.7476 0.75270.75870.7444 0.73160.76170.7523
80.71680.83400.7212 0.72080.84880.7488 0.73600.83720.6928 0.72040.84880.7200
90.70620.85570.6783 0.71430.84830.6513 0.70750.85490.6816 0.71950.85430.6817
100.80790.90790.7998 0.81280.90450.8100 0.81760.90630.8007 0.82410.90270.8121
110.76210.86980.7207 0.76260.86980.7124 0.77140.86670.7179 0.76690.86410.7943
120.64800.80680.6088 0.66320.79510.6139 0.63020.78930.6446 0.63050.77660.6488
130.75810.84570.7728 0.74670.84570.7467 0.75170.84390.7433 0.74560.83980.7472
140.94360.96480.9468 0.95040.96520.9450 0.95100.96690.9455 0.95100.96440.9488
150.71470.79290.7001 0.73140.79230.7042 0.72580.79020.7093 0.74800.79630.7353
160.84790.96390.7726 0.85050.95990.7473 0.85040.96520.7992 0.84180.96770.8573
170.81890.87580.7994 0.83420.87330.8328 0.83640.87000.7900 0.83310.87890.8072
180.79900.90710.7684 0.79820.90350.6857 0.80300.90620.7659 0.79670.90460.8130
190.92460.95260.9193 0.91790.95040.9202 0.91820.95490.9201 0.91680.95460.9182
200.84800.94090.8729 0.84370.93090.9077 0.85510.93260.8760 0.84000.93860.8780
Table 14. Comparison among KNN classification stabilities (10% label noise and higher values are in bold).
Classifier: KNN
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.95481.00000.9375 0.95630.99980.9399 0.95521.00000.9472 0.95311.00000.9595
20.81230.82440.7969 0.82680.82400.8189 0.82180.82610.8130 0.82390.81510.8019
30.72540.75520.7203 0.74130.75470.7449 0.68310.75620.6977 0.73880.75430.6765
40.74520.90740.7965 0.76830.89420.8003 0.77160.89790.7982 0.77190.88810.8293
50.78120.80420.7784 0.77810.79770.8051 0.75950.80530.7644 0.78600.80670.7116
60.90340.93340.8549 0.91140.93670.9027 0.89910.93960.8867 0.90660.93460.8940
70.71890.74720.7385 0.71830.75390.7318 0.71090.74870.7411 0.72090.75870.7519
80.72200.75560.6184 0.72600.76600.6532 0.72400.76200.6628 0.71720.74720.6348
90.81480.84000.6957 0.80730.84690.7322 0.80710.84570.7369 0.81610.84350.7273
100.86920.91010.8648 0.87530.91590.8796 0.87680.90930.8633 0.87860.91260.8710
110.87500.94360.8207 0.86970.94340.8222 0.87260.94400.8228 0.86760.94740.8731
120.75710.81800.6649 0.74660.81980.6710 0.74730.82490.6890 0.75730.81100.6951
130.76500.83440.7869 0.80370.83850.7935 0.78850.83500.7933 0.79540.82590.8274
140.93910.95920.9428 0.93270.95640.9466 0.94500.95420.9371 0.94990.94990.9378
150.73470.75450.7324 0.74890.75250.7474 0.73590.75720.7515 0.74930.75600.7482
160.88580.95140.6691 0.87950.95200.6917 0.88200.95340.7466 0.89000.95240.9189
170.89940.90810.8436 0.89580.90470.8764 0.89500.91310.7994 0.89940.91530.8081
180.82060.88200.6529 0.81370.88310.6694 0.81380.87660.6766 0.81500.88140.8420
190.92020.95280.9111 0.92860.95160.9226 0.92240.95390.9171 0.92650.95110.9286
200.90770.94460.8991 0.91860.93690.9237 0.91770.94260.9151 0.89770.94290.9066
Table 15. Comparison among CART classification stabilities (10% label noise and higher values are in bold).
Classifier: CART
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.88190.98810.9023 0.86900.99210.8929 0.87440.98770.8949 0.86890.99240.9006
20.62150.72900.6282 0.64560.72430.6401 0.63840.72490.6387 0.64860.71700.6241
30.58740.68950.5859 0.58710.69350.5972 0.58190.70260.6453 0.59160.68970.6443
40.69040.87740.7312 0.69700.86530.7128 0.70150.86840.7104 0.69610.86380.7127
50.70740.75280.6963 0.68980.75210.7319 0.68880.75950.7181 0.71930.75560.6670
60.85770.93170.8380 0.85730.92790.8587 0.84590.92770.8693 0.85030.93010.8580
70.72750.73720.7255 0.72830.75390.7318 0.73910.75170.7411 0.75890.77170.7690
80.62720.75600.6020 0.62160.74440.6048 0.63520.73680.6000 0.61320.75640.5976
90.67580.79520.6233 0.66760.79360.6338 0.67030.78630.6429 0.67150.78980.6361
100.77570.89010.7723 0.78810.88340.7782 0.78010.88660.7665 0.78230.88210.7781
110.74220.89970.7398 0.74100.90660.7495 0.74670.89310.7493 0.73760.90120.7369
120.60560.71830.5932 0.61150.73150.5922 0.60590.73100.6063 0.60850.72610.5971
130.71000.80610.7026 0.70630.79460.7146 0.70630.81460.7181 0.70480.79930.7120
140.91760.95990.9380 0.93980.94740.9441 0.92830.94490.9301 0.94300.94410.9366
150.65750.75230.6436 0.66150.75000.6566 0.65780.75370.6637 0.66140.75110.6691
160.71810.93120.6601 0.72250.93130.6771 0.71510.92440.7038 0.72280.93360.7238
170.81030.86280.7900 0.81810.86610.7983 0.80690.87250.7717 0.81440.86610.7992
180.71570.87350.6346 0.71420.86490.6195 0.71290.86430.6728 0.70650.86000.7125
190.84110.92810.8454 0.83950.92130.8434 0.84380.92580.8444 0.83600.92140.8374
200.82170.88690.8183 0.82030.88430.8171 0.81660.88970.8226 0.79460.88510.8183
Table 16. Comparison among KNN classification stabilities (20% label noise and higher values are in bold).
Classifier: KNN
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.97521.00000.9306 0.96601.00000.9511 0.97311.00000.9414 0.96671.00000.9501
20.75840.75540.7407 0.76460.75480.7693 0.75770.76260.7642 0.77050.74200.7651
30.70550.77640.7094 0.71770.77330.7250 0.67670.78630.6954 0.72020.77620.6839
40.77970.93840.8039 0.79480.92560.8079 0.79540.92570.8015 0.79200.92590.8243
50.76190.78910.7491 0.76490.79630.7530 0.75000.80120.6960 0.78070.78930.6760
60.86760.89340.8551 0.87300.88900.8709 0.86690.88900.8649 0.86970.89270.8661
70.73190.74120.7175 0.73130.74190.7288 0.74810.75370.7381 0.74990.76270.7570
80.61770.68000.6100 0.61810.68150.6304 0.64460.68730.6358 0.62540.67000.6004
90.78240.82830.6682 0.77230.82620.7092 0.77800.81980.6948 0.77800.82340.7002
100.83270.86820.8287 0.83820.86620.8374 0.82900.86180.7926 0.84100.86070.8332
110.88020.92090.8788 0.87600.91570.8755 0.87430.92020.8759 0.87120.91660.9067
120.70410.76610.6300 0.71880.77560.6388 0.70660.76830.6588 0.69680.76710.6671
130.74090.81910.7452 0.76500.81700.7644 0.76430.81930.7606 0.76850.81810.8007
140.85340.89060.8481 0.85710.89180.8574 0.84700.89580.8531 0.86010.88460.8154
150.69660.72110.6888 0.70020.71860.7037 0.69990.72100.6992 0.70140.71750.7093
160.80510.91290.6488 0.81010.91260.6654 0.80060.91390.6996 0.80070.90680.8541
170.81030.81470.7567 0.80610.80670.7658 0.80000.81030.6961 0.79720.81500.7436
180.87210.91470.8729 0.89730.91420.8607 0.87960.91040.8694 0.88350.90610.8438
190.80790.86660.8126 0.81830.86030.8104 0.81370.87040.8128 0.81370.86230.8185
200.86230.91830.8394 0.86510.91000.8700 0.85710.91570.8620 0.85490.91540.8674
Table 17. Comparison among CART classification stabilities (20% label noise and higher values are in bold).
Classifier: CART
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.87990.99450.8974 0.88320.99650.8868 0.89310.99280.8852 0.89720.99560.8812
20.61800.70850.6036 0.62030.70930.6193 0.61580.71310.6212 0.63250.68780.6298
30.58770.71690.5883 0.58970.71840.5975 0.58170.72520.6144 0.59930.72260.7118
40.73330.89720.7702 0.73850.88840.7438 0.73630.88730.7481 0.74090.88790.7456
50.66600.73910.6595 0.67090.74650.6533 0.67630.77350.6500 0.68440.75090.6274
60.80370.89300.8087 0.80200.89170.8113 0.81340.90140.8131 0.81600.89230.8127
70.73740.74260.7183 0.73760.74290.7283 0.74570.75410.7347 0.74760.76820.7549
80.61770.68000.6100 0.61810.68150.6304 0.64460.68730.6358 0.62540.67000.6004
90.62570.71970.6061 0.63010.72660.6049 0.62840.72290.6049 0.63270.71340.6135
100.75550.85250.7513 0.75630.85250.7499 0.75770.85130.7443 0.75990.84680.7512
110.75930.89030.8036 0.75360.89670.7728 0.76000.89000.7716 0.75590.89620.7703
120.59170.67510.5556 0.58370.66240.5798 0.58880.68020.5927 0.58610.68370.5700
130.66410.75850.6613 0.66980.76190.6630 0.66590.76540.6709 0.67480.74980.6735
140.74190.87690.7417 0.74860.87930.7481 0.75170.89120.7617 0.74770.87220.7479
150.61650.71100.6056 0.62330.71300.6128 0.62140.71070.6237 0.62560.71540.6381
160.65920.88980.6258 0.65450.88490.6238 0.65580.88320.6433 0.64960.89050.6550
170.69190.77190.6825 0.69690.77640.6889 0.70190.77940.6714 0.69390.76190.6825
180.85360.92650.8302 0.87290.92260.9121 0.88720.92040.8951 0.91420.91420.8627
190.71500.80550.7139 0.71200.80400.7128 0.71180.80880.7167 0.71680.79040.7195
200.73570.82710.7429 0.74260.81140.7323 0.73340.81830.7449 0.72230.82630.7451
Table 18. Comparison among KNN classification stabilities (30% label noise and higher values are in bold).
Classifier: KNN
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.96471.00000.9399 0.96721.00000.9512 0.96880.99980.9463 0.96371.00000.9461
20.66580.67120.6593 0.67480.67660.6811 0.66920.67950.6743 0.67980.66110.6804
30.69880.77550.7001 0.70730.77400.7044 0.68400.78030.6740 0.70730.77190.7200
40.80100.94840.8301 0.81070.94200.8265 0.81440.94190.8255 0.81230.93790.8261
50.67140.69430.6638 0.67930.68260.6924 0.66710.69480.6452 0.68450.69210.6402
60.84560.85740.7967 0.84040.86100.8309 0.83260.85840.8234 0.83540.85970.8214
70.72540.73760.7283 0.74160.74780.7363 0.75070.75790.7477 0.75160.75720.7419
80.71650.72420.6062 0.69730.73380.6081 0.71120.72690.6427 0.69850.73380.6212
90.73960.77670.6704 0.73730.77350.6955 0.74650.77430.6969 0.74250.77370.6925
100.82940.85930.8188 0.83570.85910.8359 0.82700.86360.7943 0.83710.86210.8287
110.91440.94600.8402 0.90580.95210.8742 0.91020.94770.8747 0.90700.94840.8982
120.67320.74950.6251 0.68460.74070.6298 0.68680.73560.6317 0.67410.74020.6595
130.75310.78940.7239 0.77440.78890.7648 0.76390.80000.7570 0.76350.78850.7761
140.92720.95280.9402 0.93290.94910.9194 0.91900.94440.9325 0.93570.94020.9372
150.64910.65380.6476 0.65960.65610.6615 0.65720.65930.6557 0.66600.65830.6580
160.72720.86290.6004 0.72920.86710.6268 0.72660.86720.6334 0.72520.87130.7550
170.64500.65580.6261 0.65140.66060.6486 0.65140.65220.6289 0.64670.65750.6186
180.84950.92900.9017 0.90100.91950.8601 0.91790.91870.8977 0.90800.91790.8704
190.73040.76360.7286 0.72530.76150.7196 0.72250.76710.7191 0.72420.76050.7302
200.75140.83570.7591 0.75740.83460.7586 0.76600.83060.7666 0.73290.83570.7714
Table 19. Comparison among CART classification stabilities (30% label noise and higher values are in bold).
Classifier: CART
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.89190.99470.9047 0.89330.99250.9033 0.88820.99370.9035 0.89430.99510.8996
20.56690.65140.5665 0.57370.64480.5718 0.57020.66060.5719 0.58370.63730.5940
30.59990.73070.5958 0.59370.72960.5969 0.60070.73940.6150 0.59880.72630.7142
40.74840.92450.7845 0.75450.91350.7645 0.75310.91620.7604 0.75760.91600.7615
50.59430.64480.6069 0.59240.64000.6062 0.59120.66050.5893 0.60760.64620.5676
60.77300.82700.7569 0.77300.83090.7746 0.76900.83360.7796 0.77000.83110.7607
70.72810.73210.7248 0.74210.75880.7351 0.74170.75170.7383 0.74760.75380.7467
80.58120.66620.5646 0.57420.65460.5781 0.58120.64730.5704 0.56880.65920.5885
90.61590.70540.6051 0.63060.70250.6005 0.61430.70630.6054 0.62040.70150.6053
100.77980.86320.7787 0.78360.86060.7772 0.78130.86390.7825 0.78530.85680.7774
110.78300.92540.7772 0.77350.92460.7775 0.77880.92020.7825 0.77910.92670.7698
120.56980.68370.5546 0.56610.67850.5671 0.57510.67590.5722 0.57340.67680.5566
130.67200.75370.6656 0.67430.74930.6606 0.67500.76570.6659 0.67130.75670.6689
140.91770.95960.9248 0.91480.95620.9434 0.91290.94850.9361 0.91560.94340.9262
150.58000.65420.5794 0.58210.65450.5816 0.57760.65130.5831 0.58280.65150.6005
160.58240.82370.5613 0.57330.82780.5704 0.58480.82630.5772 0.58180.83190.5809
170.61110.68640.6083 0.61750.68360.6008 0.61190.68690.6253 0.61920.66920.5997
180.88840.90590.8795 0.88100.89980.8938 0.86970.89730.8366 0.84400.89380.8325
190.65950.71180.6615 0.65480.70940.6608 0.66180.71840.6653 0.65590.71320.6491
200.66570.77110.6637 0.67200.76570.6771 0.66540.76540.6609 0.66060.76510.6671
Table 20. Comparison among KNN classification stabilities (40% label noise and higher values are in bold).
Classifier: KNN
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.98271.00000.9348 0.98151.00000.9598 0.98031.00000.9476 0.97971.00000.9484
20.59080.60350.5991 0.59450.59870.5889 0.59700.60780.6007 0.59690.59710.6555
30.73030.81460.7257 0.73070.81070.7277 0.71360.82050.7059 0.73130.80930.6990
40.85570.98690.8631 0.85880.98180.8666 0.85670.98110.8605 0.85900.97940.8660
50.65650.67880.6609 0.66580.68020.6619 0.64440.67400.6377 0.67810.67770.6314
60.83140.86160.8111 0.83230.86170.8324 0.83390.86190.8093 0.83870.86170.8197
70.74110.75110.7378 0.74290.75170.7231 0.73570.74870.7413 0.74160.75180.7437
80.66920.73160.6056 0.66360.73440.6192 0.67080.72360.6100 0.66880.73240.6136
90.74650.79000.6806 0.74620.78990.6995 0.74620.79000.7119 0.74630.78750.7157
100.82300.85660.8226 0.83130.85340.8345 0.82420.86010.7913 0.83340.85370.8285
110.93790.97050.8945 0.93710.97050.9114 0.93530.97000.9140 0.94220.96900.9248
120.67880.75510.6376 0.67560.74880.6629 0.67100.74780.6622 0.67730.75000.6802
130.73220.78430.7191 0.75090.77740.7406 0.74350.78610.7378 0.75410.78350.7541
140.66000.71900.6565 0.66010.70980.6588 0.65320.71930.6524 0.66280.72170.6646
150.60060.60780.6035 0.60670.61210.6103 0.60210.61700.6056 0.60280.61240.6134
160.64080.78650.5683 0.63510.78630.5897 0.63850.79820.5913 0.64100.80150.6633
170.58190.60280.5575 0.57830.59580.5719 0.58940.59720.5519 0.58860.60140.5447
180.88470.91140.8801 0.85740.90170.8797 0.87770.89360.8647 0.89200.91200.8895
190.64880.66230.6510 0.65890.66460.6573 0.65410.66750.6534 0.65910.66480.6568
200.72470.79330.7119 0.71530.79830.7278 0.73190.80920.7339 0.72640.79830.7347
Table 21. Comparison among CART classification stabilities (40% label noise and higher values are in bold).
Classifier: CART
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.90000.99610.9057 0.90960.99640.9022 0.90860.99670.9010 0.90730.99610.8985
20.54520.59790.5520 0.54570.59750.5480 0.54660.60560.5519 0.54870.58800.6156
30.61370.76480.6187 0.61250.76700.6277 0.61190.77720.6336 0.61560.76290.7420
40.79550.96450.8220 0.80030.95770.8131 0.80240.95660.8118 0.80900.95320.8103
50.58210.61370.5851 0.56600.61770.5753 0.57210.61210.5865 0.58090.60770.5823
60.76730.83040.7630 0.77810.83060.7701 0.77530.83700.7749 0.77200.82910.7709
70.73440.74660.7361 0.73830.74430.7372 0.73850.74140.7333 0.73300.73850.7371
80.56680.62400.5776 0.55480.64000.5428 0.55400.63440.5548 0.55920.63880.5740
90.62690.72970.6080 0.63550.73330.6149 0.63490.73290.6228 0.63280.73060.6234
100.78630.86910.7816 0.79230.86670.7866 0.78360.87040.8002 0.79170.86390.7891
110.84430.96570.8160 0.83830.96430.8326 0.83810.96640.8324 0.84240.96120.8367
120.58200.72410.5856 0.59780.73050.5902 0.58490.72540.5941 0.58980.73020.5839
130.69440.78830.6820 0.69940.79390.6898 0.69220.80040.6863 0.69020.78780.6839
140.58170.75530.5853 0.58200.75020.5807 0.58520.75970.5927 0.58330.74950.6139
150.54670.59730.5518 0.54770.59600.5563 0.55090.60240.5537 0.54210.59650.5777
160.54080.76040.5303 0.53830.75490.5439 0.53580.75810.5421 0.53840.76130.5404
170.55580.61220.5508 0.55500.60580.5533 0.54970.60810.5425 0.54580.61080.5431
180.83820.92720.8660 0.83950.91960.8863 0.88730.90900.8924 0.84170.89240.8703
190.62790.65450.6318 0.62630.65180.6232 0.63430.64960.6237 0.62450.65450.6339
200.64360.74030.6500 0.63420.73330.6633 0.62390.74580.6497 0.62030.72750.6586
Table 22. Comparison among AUC (raw data and higher values are in bold).
Classifier: KNN
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.51570.50050.6476 0.51510.50000.6224 0.51630.50000.6161 0.52030.50000.5724
20.93370.93890.9251 0.93570.93830.9354 0.93650.93950.9353 0.93400.93770.9370
30.65840.67520.6498 0.65790.67300.6547 0.65390.67660.6288 0.66170.67750.6210
40.60280.58000.6080 0.62180.58270.6146 0.62820.58710.6141 0.69450.59260.6432
50.75570.77350.7591 0.75490.76370.7472 0.75510.77120.7625 0.75730.76700.7497
60.78260.78170.8053 0.79230.78130.7918 0.78610.78380.7974 0.78660.78940.8008
70.73260.75700.7458 0.73720.75640.7279 0.73510.75580.7430 0.72830.74580.7331
80.82980.86320.8111 0.83440.86490.8269 0.84860.86390.8488 0.83200.85580.8045
90.83540.84890.7558 0.83200.84920.7473 0.83470.84580.7752 0.83450.84250.7696
100.81480.82580.7987 0.81560.82700.8040 0.81530.82800.8011 0.81390.82560.8087
110.60970.61020.5348 0.61610.61650.5512 0.62080.61230.5417 0.61660.61250.6437
120.81900.85320.7370 0.82000.84760.7280 0.81970.84910.7643 0.81120.84990.7651
130.79050.84350.7894 0.78500.83860.7919 0.79490.84000.7951 0.78730.84030.7882
140.96510.97270.9660 0.96620.97180.9669 0.96860.97430.9712 0.96440.97100.9678
150.77560.79460.7559 0.77730.79530.7656 0.77720.79630.7705 0.77820.79980.7780
160.97740.98960.8710 0.97810.99120.8624 0.97920.98990.9231 0.97840.99140.9623
170.90710.90870.8457 0.90580.90410.8804 0.90690.90450.8281 0.91090.91270.8507
180.90780.92270.8504 0.90790.92270.8034 0.90920.92220.8359 0.91140.92410.8957
190.95620.96210.9555 0.95540.96030.9599 0.95900.96270.9573 0.95530.96060.9576
200.94900.96740.9473 0.94790.96520.9595 0.95150.96970.9640 0.95190.96900.9531
Table 23. Comparison among AUC (raw data and higher values are in bold).
Classifier: CART
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.57600.52060.6589 0.87060.51980.6633 0.57470.52220.6629 0.56610.51890.6629
20.82440.89470.8327 0.69680.89440.8347 0.83250.89580.8360 0.83060.89280.8364
30.60820.65490.6150 0.61390.66050.6244 0.61370.66370.6135 0.60330.65130.5887
40.61450.61860.6016 0.68960.63140.6277 0.62990.62660.6169 0.68760.63340.6469
50.80330.81480.8024 0.71020.81740.8089 0.79060.81200.7819 0.80310.81720.7740
60.84530.86850.8319 0.86640.86670.8393 0.85380.86450.8461 0.86080.86850.8433
70.73740.75670.7358 0.73950.75560.7323 0.72840.75480.7420 0.72380.74830.7252
80.73750.78600.7542 0.74640.77040.7638 0.72810.77840.7336 0.72390.79160.7458
90.76840.85660.7324 0.76580.85120.7079 0.76680.85630.7395 0.76390.85370.7319
100.78250.80800.7735 0.80630.80900.7683 0.78420.81240.7710 0.78570.80930.7771
110.62070.63850.5359 0.74190.63740.5438 0.63980.64640.5396 0.63440.64540.6256
120.71610.82620.6700 0.71800.80560.6676 0.68840.80430.7084 0.68760.80670.6941
130.74040.79620.7319 0.73670.79980.7273 0.73260.79950.7234 0.73780.79510.7251
140.97130.97760.9711 0.95200.97780.9696 0.97330.97850.9697 0.97230.97700.9717
150.79180.82400.7645 0.69180.82400.7705 0.79650.82360.7804 0.79940.82690.7913
160.93450.98300.8962 0.89280.98070.8829 0.93490.98280.9142 0.93370.98360.9320
170.90490.92980.8778 0.88110.93000.8988 0.91370.92800.8546 0.91210.93180.8860
180.88620.92010.8669 0.80080.91880.8069 0.88590.91890.8463 0.88580.91790.8900
190.92230.94030.9199 0.92730.93890.9217 0.91920.94230.9244 0.91650.93890.9206
200.91520.95350.9147 0.88430.95280.9004 0.91230.95670.9173 0.91100.95620.9172
Table 24. Comparison among AUC (10% label noise and higher values are in bold).
Classifier: KNN
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.53520.50000.6514 0.53350.50070.6071 0.54010.50000.6245 0.53340.50000.6092
20.91390.92200.9024 0.92050.92120.9156 0.91890.92150.9143 0.91740.91530.9066
30.60930.65130.6087 0.61090.65000.6168 0.61230.66160.6023 0.61290.64680.5720
40.60800.56850.6004 0.61140.57520.6010 0.60950.57360.6000 0.61240.57680.6199
50.71470.72250.7105 0.71550.72100.7010 0.71420.72070.6935 0.72390.73250.6957
60.79770.80070.8006 0.80680.80590.8028 0.79560.81520.8114 0.80170.81170.8087
70.72200.75930.7296 0.74790.75420.7221 0.74540.74980.7426 0.74440.74790.7309
80.78720.81590.7087 0.78510.82240.7225 0.78120.82030.7424 0.78660.81730.7312
90.81610.81890.7358 0.81370.82320.7470 0.81470.82220.7720 0.81340.82090.7760
100.82630.84780.8085 0.82630.85050.8161 0.82820.84200.7960 0.82900.85040.8161
110.55140.55430.5091 0.55830.56270.5131 0.56150.56150.5121 0.55420.55860.5649
120.79750.82150.7148 0.79440.82440.7053 0.79840.82930.7464 0.80730.82350.7521
130.73560.77990.7335 0.74480.78050.7336 0.73750.77250.7401 0.74140.77550.7432
140.94290.95940.9147 0.91590.95540.9193 0.94370.94590.9397 0.94340.94370.9297
150.75560.77280.7220 0.76520.77400.7479 0.76160.77240.7485 0.76420.77290.7406
160.95270.97640.8421 0.94990.97510.8360 0.95150.97610.8810 0.95510.97530.9629
170.85850.85910.8452 0.85950.85990.8446 0.85590.86070.8099 0.86090.86150.8387
180.88660.91150.7962 0.88710.91050.7658 0.88830.91070.7987 0.88610.91450.8847
190.94320.96140.9340 0.94660.95900.9384 0.94400.95840.9353 0.94590.95780.9481
200.94140.96010.9317 0.94730.95930.9437 0.94710.96010.9459 0.93580.95840.9407
Table 25. Comparison among AUC (10% label noise and higher values are in bold).
Classifier: CART
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.60430.52270.6969 0.88950.51010.6653 0.61290.51810.6784 0.58270.51090.7031
20.79930.87660.7939 0.65040.87490.7953 0.80410.87430.8045 0.80500.86840.7930
30.61160.65230.6122 0.60530.65460.6149 0.60370.65870.6074 0.61350.65050.5732
40.60580.60040.6063 0.68640.60910.6160 0.60760.60340.6174 0.61260.61000.6155
50.74050.77070.7392 0.62000.77170.7396 0.74140.77090.7199 0.74600.77460.7200
60.87660.92360.8575 0.89790.92380.8783 0.86680.91840.8880 0.86690.92350.8732
70.72900.75490.7389 0.73850.74710.7362 0.72850.74650.7338 0.74410.74410.7380
80.69030.78010.6930 0.70120.77230.6694 0.69390.77830.7026 0.69330.78260.6847
90.74790.82590.7059 0.73370.82460.7057 0.74700.82060.7203 0.73780.82370.7233
100.76520.80560.7572 0.79240.80560.7590 0.76890.80330.7407 0.76520.80240.7615
110.61410.61650.5072 0.75760.61650.5530 0.61630.62060.5493 0.61820.61490.6241
120.68540.78050.6650 0.68000.79240.6533 0.68450.78600.6792 0.68760.78260.6806
130.69950.74750.6842 0.68700.74490.6912 0.69310.74770.6921 0.69890.74460.6936
140.94470.95860.9201 0.93080.95290.9102 0.91130.95140.9497 0.94110.94970.9451
150.75650.79320.7301 0.64520.79310.7430 0.75740.79270.7426 0.76180.79410.7423
160.88490.97550.8540 0.81230.97550.8510 0.88410.97320.8663 0.88630.97700.8818
170.88450.90200.8760 0.83530.90200.8587 0.89100.90290.8394 0.88880.90250.8746
180.84670.90980.8113 0.73760.90930.7669 0.84720.90900.8028 0.84730.90920.8436
190.88720.92860.8942 0.87550.92760.8839 0.89220.92620.8915 0.88510.92560.8882
200.89820.94340.8900 0.86660.93840.9014 0.90070.94500.9063 0.88970.93870.9009
Table 26. Comparison among AUC (20% label noise and higher values are in bold).
Classifier: KNN
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.51600.50000.6134 0.52520.50000.5601 0.52200.50000.5831 0.51930.50000.5793
20.88910.89830.8843 0.90150.89870.8997 0.89920.90270.9009 0.90400.88740.8645
30.57350.60490.5742 0.57430.60420.5721 0.58180.61250.5936 0.57810.60790.5477
40.57920.53820.5729 0.58650.59510.5832 0.58470.59580.5802 0.58370.64380.6007
50.75150.74260.7458 0.75580.74120.7523 0.75040.73970.6636 0.75600.73950.7102
60.75790.75400.7813 0.76690.74890.7738 0.77160.76570.7976 0.76310.75600.7751
70.74400.75800.7516 0.73640.75740.7327 0.74040.75380.7472 0.74460.75160.7437
80.66830.70020.6279 0.67710.70300.6297 0.67360.70910.6579 0.67210.69920.6404
90.71490.71120.6698 0.70830.71510.6831 0.71050.70840.6823 0.71820.71430.6877
100.77130.78110.7568 0.77210.77870.7684 0.76960.77180.7204 0.77130.77940.7675
110.60150.59290.5304 0.59630.59350.5349 0.59850.58930.5352 0.59670.58980.5971
120.76390.78540.6914 0.77240.78190.6776 0.75980.78460.7273 0.75870.77680.7332
130.78630.82630.7670 0.79780.82350.7813 0.79670.82260.7824 0.80130.82560.8135
140.92710.94730.9240 0.92720.94680.9282 0.92740.95010.9300 0.92690.94380.7907
150.76150.78700.7401 0.76260.78740.7568 0.76600.78930.7577 0.76230.78960.7327
160.91820.95240.8044 0.92010.95560.7953 0.91730.95450.8372 0.91840.95160.9368
170.84580.84950.8280 0.84310.84560.8287 0.84240.84830.7980 0.84400.84960.8315
180.85700.92940.8391 0.88570.92130.8397 0.88500.91320.8759 0.84760.90250.9025
190.90260.92430.9056 0.90440.92060.9003 0.90160.92630.9041 0.90200.91900.9070
200.91080.94030.8989 0.91500.93570.9148 0.91150.93570.9118 0.90620.93580.9141
Table 27. Comparison among AUC (20% label noise and higher values are in bold).
Classifier: CART
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.57480.50420.6349 0.83930.50510.5941 0.56390.50910.6196 0.55880.50490.6446
20.79300.85800.7838 0.62730.85700.7957 0.79170.86140.7958 0.79900.84160.7682
30.58180.61610.5823 0.58370.61670.5933 0.58630.61850.5876 0.58430.61360.5555
40.59620.56230.5874 0.69090.56860.6055 0.59770.56760.6011 0.59760.56960.6050
50.71450.74700.7094 0.57600.74930.7106 0.71680.75280.6787 0.71120.75010.6952
60.80090.87240.8078 0.84940.87360.8139 0.81670.88220.8227 0.81240.87360.8229
70.74210.75110.7311 0.72990.75090.7236 0.72680.74400.7273 0.73520.74210.7233
80.59590.67040.6107 0.64270.65900.5996 0.60940.66730.6139 0.60550.66170.6204
90.69310.74490.6621 0.68220.75080.6557 0.69120.74870.6718 0.69580.74510.6767
100.72820.75420.7151 0.76240.75610.7167 0.72560.74910.6899 0.72530.75160.7193
110.62430.59960.5222 0.71380.59950.5699 0.62310.59830.5705 0.62030.59760.6192
120.66590.71730.6218 0.65150.71700.6280 0.66470.72450.6623 0.66710.72330.6482
130.72330.78570.7174 0.72570.78800.7131 0.72330.78940.7094 0.72650.78350.7263
140.89190.94560.8927 0.81780.94580.8946 0.89630.95050.9018 0.89240.94150.7778
150.75430.79680.7277 0.63350.79830.7381 0.75180.80040.7432 0.75380.80010.7306
160.85140.95840.8260 0.75470.95660.8137 0.85130.95680.8349 0.84770.95830.8471
170.83640.87510.8263 0.76500.87050.8204 0.84120.87840.7999 0.83540.86950.8246
180.86010.90030.8723 0.83630.88910.8662 0.83800.88590.8715 0.84220.88510.8851
190.83630.88110.8347 0.78710.88080.8322 0.83490.88430.8385 0.83490.87270.8352
200.83990.89460.8387 0.80230.89220.8421 0.84230.88850.8493 0.83960.89330.8429
Table 28. Comparison among AUC (30% label noise and higher values are in bold).
Classifier: KNN
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.52660.50000.6187 0.52160.50000.5596 0.51840.50060.5864 0.52360.50000.5893
20.85590.85660.8420 0.86020.85980.8607 0.85780.86300.8584 0.86270.87180.7996
30.56270.58760.5642 0.56140.58520.5593 0.57060.59250.5768 0.56480.58380.5459
40.55260.52380.5500 0.54750.52560.5522 0.54690.56500.5461 0.55310.59580.5617
50.68960.71390.6939 0.68830.71110.6808 0.68940.72840.6583 0.69280.71860.6673
60.70340.78610.7005 0.71350.79430.7058 0.71660.79190.7374 0.71050.78790.7110
70.75140.75950.7419 0.72500.75890.7331 0.73140.75600.7307 0.73020.75140.7447
80.65680.65880.6434 0.66360.67350.6498 0.66930.67610.6636 0.65790.65890.6546
90.73280.73290.6834 0.73660.73300.6951 0.73710.73820.7035 0.74100.73070.7076
100.70370.71040.6887 0.70350.79910.6961 0.70210.70920.6418 0.70410.70680.6977
110.59570.58110.5159 0.59330.57170.5206 0.59130.57760.5180 0.59010.57050.5931
120.72430.73460.6645 0.73740.73870.6608 0.74190.73630.6895 0.72710.73870.7089
130.70060.72980.6813 0.70520.76410.6917 0.70510.79990.6991 0.69680.70150.6985
140.91230.95500.9392 0.92800.94700.9434 0.93000.94420.9415 0.91600.94340.9120
150.72040.74110.7011 0.72260.74050.7125 0.72320.73740.7123 0.72570.73830.6951
160.88210.93650.7605 0.88290.93870.7656 0.88350.93710.7958 0.88160.93990.8992
170.81380.81830.8027 0.81640.81600.8011 0.81510.81850.7808 0.81900.82170.7910
180.88780.92830.8834 0.84830.92280.9098 0.89860.91630.8614 0.91580.91580.8392
190.85430.86850.8558 0.85210.86900.8462 0.85150.87080.8495 0.85020.86460.8550
200.87280.91920.8665 0.87180.92090.8617 0.87790.91820.8786 0.86130.91800.8822
Table 29. Comparison among AUC (30% label noise and higher values are in bold).
Classifier: CART
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.56850.50530.6498 0.89440.50540.5982 0.55990.51240.6345 0.55490.50410.6480
20.75830.83080.7476 0.56950.82680.7632 0.76300.83610.7641 0.76780.81530.7148
30.57190.59510.5716 0.55650.59610.5664 0.57250.59850.5767 0.56960.59480.5437
40.56480.54010.5622 0.69780.54410.5655 0.56210.54490.5629 0.56580.54580.5654
50.69050.72470.6956 0.51810.72500.6861 0.69050.73340.6656 0.69210.72750.6710
60.74080.76120.7233 0.79160.75910.7547 0.73990.76430.7515 0.75030.76020.7388
70.75020.75750.7207 0.73280.75340.7396 0.73830.75190.7279 0.74300.75020.7359
80.61910.64710.6120 0.60580.64660.6043 0.61830.65030.6252 0.61550.64210.6216
90.68700.73080.6755 0.67260.73130.6675 0.68910.73330.6763 0.69930.73410.6822
100.66530.66850.6620 0.74810.76820.6649 0.67500.68550.6229 0.67250.68090.6663
110.59750.59840.5264 0.59700.69010.5403 0.62600.69110.5364 0.61630.68670.6202
120.62520.68000.6092 0.61340.66940.6211 0.64440.67390.6349 0.62910.67600.6120
130.65740.66060.6451 0.61960.65010.6433 0.65740.68300.6555 0.65490.65900.6571
140.91410.95180.9112 0.92400.94850.9174 0.93210.94480.9199 0.91550.93780.9293
150.71260.76930.6890 0.57300.76550.6979 0.71160.76620.7022 0.71500.76590.6920
160.79910.92940.7751 0.65760.93150.7723 0.80370.93010.7882 0.79660.93350.7963
170.80980.85240.8023 0.70580.85070.7955 0.81060.85380.8056 0.81340.84580.7932
180.83200.92620.8435 0.83590.92330.8482 0.90690.91570.9094 0.88410.90940.8948
190.79720.82730.8002 0.72500.82310.7964 0.79420.83050.7994 0.79630.82430.7932
200.82170.88840.8173 0.76940.88040.8270 0.81900.88510.8192 0.81560.88260.8189
Table 30. Comparison among AUC (40% label noise and higher values are in bold).
Classifier: KNN
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.50900.50000.5997 0.50520.50000.5419 0.51180.50000.5669 0.51080.50000.5740
20.80320.80860.7993 0.80850.81930.8157 0.81150.81760.8139 0.81140.79170.7394
30.55500.55970.5540 0.55450.56060.5552 0.56230.56670.5550 0.55600.56150.5365
40.54230.54810.5414 0.53930.57070.5415 0.54220.55010.5415 0.53980.54210.5394
50.64510.65770.6435 0.65480.66080.6511 0.65080.66410.6094 0.65130.65720.6296
60.66160.66690.6663 0.66230.69030.6716 0.66170.70110.6860 0.66590.68160.6664
70.73040.75820.7379 0.73660.74720.7237 0.74170.74390.7288 0.74090.74350.7398
80.61700.64640.6229 0.63270.65440.6257 0.65500.66320.6391 0.64910.64980.6289
90.68650.66940.6425 0.68590.67040.6454 0.68870.67230.6590 0.68460.66640.6615
100.67190.68050.6596 0.65030.66100.6660 0.63110.65060.6107 0.64560.66300.6565
110.55260.54380.5230 0.54190.54230.5274 0.54680.55280.5265 0.53530.54460.5393
120.65870.66630.6233 0.66050.67310.6229 0.65540.66670.6542 0.67360.66750.6825
130.66810.68120.6448 0.67360.65290.6586 0.67090.64450.6650 0.67190.69830.6758
140.84190.87550.8364 0.84310.87380.8413 0.84110.87700.8399 0.84210.87450.7642
150.68380.69890.6673 0.68530.70240.6783 0.68610.70660.6791 0.68910.70560.6648
160.83210.90190.7259 0.83390.90110.7274 0.83510.90380.7548 0.83440.90500.8517
170.75050.75380.7209 0.74660.75090.7207 0.74950.75370.7128 0.75430.75200.7174
180.83980.92780.8710 0.90650.91930.9016 0.91280.91430.8965 0.84190.91280.9029
190.78960.80440.7944 0.79270.80780.7880 0.78560.78870.7827 0.79500.82450.7961
200.83640.87470.8209 0.83550.87810.8390 0.83950.88460.8333 0.83180.87900.8500
Table 31. Comparison among AUC (40% label noise and higher values are in bold).
Classifier: CART
ID | PF-AQR | EF-AQR | FG-AQR | PF-CER | EF-CER | FG-CER | PF-NDIR | EF-NDIR | FG-NDIR | PF-RLR | EF-RLR | FG-RLR
10.54230.50280.6229 0.86090.50240.5837 0.55230.50470.6049 0.54040.50410.6314
20.72030.79260.7172 0.50640.79360.7269 0.73050.80160.7298 0.73400.77610.6696
30.56550.57640.5600 0.53290.57900.5706 0.55810.57880.5558 0.56000.57560.5416
40.56450.51990.5503 0.74420.52630.5562 0.56270.52710.5597 0.55800.52590.5621
50.65930.68650.6625 0.46790.69200.6532 0.65720.69380.6235 0.66280.69450.6370
60.70950.71480.6829 0.78490.69290.7143 0.71540.69240.7033 0.71210.68950.7011
70.74230.75760.7311 0.74870.75170.7218 0.73320.74520.7419 0.74560.75180.7387
80.62050.65850.6060 0.58160.65070.6045 0.62680.66860.6241 0.63050.65420.6131
90.66410.68180.6333 0.64970.67990.6478 0.66510.68000.6545 0.66160.67960.6575
100.64540.65540.6391 0.63140.64560.6428 0.62920.63560.6002 0.65190.68940.6471
110.58680.59440.5309 0.78140.55310.5237 0.57810.55330.5257 0.58110.55200.5754
120.60690.62840.5918 0.57270.63060.5908 0.61290.63010.6066 0.60170.62260.5985
130.62940.65420.6348 0.59610.63370.6227 0.63100.63370.6225 0.63860.65880.6334
140.80730.90020.8042 0.67110.89790.8039 0.80890.90320.8131 0.80690.89730.7469
150.68320.73290.6613 0.52190.73230.6710 0.68060.73340.6716 0.68490.73310.6665
160.75380.89920.7324 0.59120.89670.7337 0.75560.90060.7459 0.75230.90110.7516
170.73260.78400.7117 0.60080.77910.7213 0.72910.78790.7304 0.73160.77590.7171
180.86430.92480.8673 0.83160.90550.8676 0.85600.90050.8960 0.88650.89600.8664
190.74460.75100.7517 0.65520.75350.7459 0.75900.75060.7464 0.74790.74980.7619
200.76900.84540.7784 0.73080.83580.7808 0.76000.84810.7710 0.75740.83820.7777