Incorporating Grey Total Influence into Tolerance Rough Sets for Classification Problems

Abstract: Tolerance-rough-set-based classifiers (TRSCs) are known to operate effectively on real-valued attributes for classification problems. This involves creating a tolerance relation that is defined by a distance function to estimate proximity between any pair of patterns. However, distance may not always be the most appropriate means of estimating similarity when the aim is to improve the classification performance of the TRSC. As certain relations hold among the patterns, it is interesting to consider similarity from the perspective of these relations. Thus, this study uses grey relational analysis to identify direct influences and then generates a total influence matrix to verify the interdependence among patterns. In particular, to maintain the balance between the direct and the total influence matrices, an aggregated influence matrix is proposed to form the basis for the proposed grey-total-influence-based tolerance rough set (GTI-TRS) for pattern classification. A real-valued genetic algorithm is designed to generate the grey tolerance class of a pattern to yield high classification accuracy. The results of experiments showed that the classification accuracy obtained by the proposed method was comparable to those obtained by other rough-set-based methods.

Direct relationships have been effectively measured in grey TRSC [19] through grey relational analysis (GRA) from the viewpoint of relationships obtained between patterns. However, direct as well as indirect relationships can exist between patterns. The widely used Decision-Making Trial and Evaluation Laboratory (DEMATEL) can effectively verify interdependencies among patterns or variables [20][21][22]. Furthermore, the total influence matrix plays an important role in the DEMATEL, and can be used to indicate direct/indirect influences among patterns. This is why the DEMATEL has been widely applied to various decision problems [24,25]. This motivated us to use the total influence matrix to realize direct/indirect relationships when constructing a TRS-based classifier. This article thus proposes a grey-total-influence-based TRS (GTI-TRS) for pattern classification. Furthermore, a genetic algorithm is used to determine the parameters required to construct the proposed classifier with high classification accuracy.
The remainder of this paper is organized as follows. Section 2 introduces a traditional similarity measure for the TRS and its computation steps. In Section 3, we present the proposed grey total influence matrix and GTI-TRS for pattern classification. In Section 4, we provide a genetic algorithm to construct the proposed GTI-TRS-based classifier (GTI-TRSC). Some real-world datasets were used to determine the classification accuracy of the proposed method. The experiments and results described in Section 5 show that the proposed GTI-TRSC can perform well compared to rough-set-based methods considered. Section 6 contains a discussion of the results and the conclusions of this study.

Tolerance Rough Sets
The rough set is briefly introduced in Section 2.1. TRS with a similarity measure is described in Section 2.2. In Section 2.3, we detail the classification procedure for the TRSC.

Rough Set Theory
Uncertainty and vagueness can be handled by rough set theory. Let $S = (U, A \cup \{d\})$ be a decision table, where $U$ and $A$ are nonempty finite sets. $U$ is the universe of objects, $A$ is a set of conditional attributes, and $d \notin A$ is a decision attribute. An information function $a: U \to V_a$ can be defined for $a \in A$, where $V_a$ is the set of values of $a$, called the domain of $a$. An indiscernibility relation $Ind(P)$ is defined for any $P \subseteq A$:

$$Ind(P) = \{(x_i, x_j) \in U^2 \mid \forall a \in P,\ a(x_i) = a(x_j)\},$$

where $x_i$ and $x_j$ are indiscernible if $(x_i, x_j)$ belongs to $Ind(P)$. $Ind(P)$ is called the P-indiscernibility relation, and its equivalence classes (the elementary sets) are denoted by $[x]_P = \{y \in U \mid a(x) = a(y), \forall a \in P\}$. A P-definable set denotes any finite union of elementary sets [26]. In a classification problem, a concept $X$ is composed of elements with the same class label such that $X \in U/\{d\}$. Sometimes, $X \subseteq U$ is not P-definable. If $X$ is a vague concept, a pair of precise concepts, the P-upper ($\overline{P}X$) and P-lower approximations ($\underline{P}X$), can be used to approximate $X$ [1,8]:

$$\underline{P}X = \{x \in U \mid [x]_P \subseteq X\}, \qquad \overline{P}X = \{x \in U \mid [x]_P \cap X \neq \emptyset\}.$$

It is clear that $\underline{P}X \subseteq \overline{P}X$. Elements that certainly belong to $X$ constitute $\underline{P}X$, whereas those possibly belonging to $X$ constitute $\overline{P}X$. The tuple $\langle \underline{P}X, \overline{P}X \rangle$ is called a rough set, and $\underline{P}X$ and $\overline{P}X$ are singleton approximations. When $\underline{P}X = \overline{P}X$, $X$ is definable because $X$ is precise with respect to $P$. In contrast, $\underline{P}X \neq \overline{P}X$ means that $X$ is undefinable.
A boundary region $BND_P(X)$ defined for a vague concept $X$ is as follows:

$$BND_P(X) = \overline{P}X - \underline{P}X.$$

A rough membership function defines the degree of inclusion of $x$ within $X$ with respect to $P$:

$$\mu_P^X(x) = \frac{|[x]_P \cap X|}{|[x]_P|},$$

where $\mu_P^X(x) \in [0, 1]$, and $|[x]_P|$ denotes the cardinality of $[x]_P$.
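The definitions above can be illustrated with a minimal Python sketch. The toy decision table, the attribute names `a` and `b`, and the helper names are hypothetical and serve only to show how the lower approximation, upper approximation, boundary region, and rough membership follow from the equivalence classes.

```python
from collections import defaultdict

def equivalence_classes(table, attrs):
    """Group object ids by their values on attrs (the P-indiscernibility classes)."""
    blocks = defaultdict(set)
    for obj, row in table.items():
        blocks[tuple(row[a] for a in attrs)].add(obj)
    return list(blocks.values())

def approximations(table, attrs, concept):
    """Return the P-lower and P-upper approximations of a concept."""
    lower, upper = set(), set()
    for block in equivalence_classes(table, attrs):
        if block <= concept:      # block certainly inside the concept
            lower |= block
        if block & concept:       # block possibly inside the concept
            upper |= block
    return lower, upper

def rough_membership(table, attrs, concept, x):
    """Share of the equivalence class of x that lies inside the concept."""
    block = next(b for b in equivalence_classes(table, attrs) if x in b)
    return len(block & concept) / len(block)

# Hypothetical toy decision table (not taken from the paper).
table = {
    "x1": {"a": 0, "b": 1},
    "x2": {"a": 0, "b": 1},
    "x3": {"a": 1, "b": 0},
    "x4": {"a": 1, "b": 1},
}
concept = {"x1", "x3"}
lower, upper = approximations(table, ["a", "b"], concept)
boundary = upper - lower
```

Here x1 and x2 are indiscernible but only x1 belongs to the concept, so both fall in the boundary region and each has a rough membership of 0.5.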

Traditional Similarity Measure
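Let $m$ and $n$ denote the numbers of patterns and attributes, respectively, and let $x_i$ and $x_j$ ($1 \le i, j \le m$) be objects in $U$. A similarity measure $S_a(x_i, x_j)$ based on a simple distance function can be defined to measure the closeness between $a(x_i)$ and $a(x_j)$ as in [14,15]:

$$S_a(x_i, x_j) = 1 - \frac{|a(x_i) - a(x_j)|}{d_{\max}},$$

where $a(x_i)$ and $a(x_j)$ are attribute values of $x_i$ and $x_j$, respectively, in $V_a$, and $d_{\max}$ is the maximum value among $|a(x_i) - a(x_j)|$. $d_{\max}$ can be replaced by $(\max_a - \min_a)$ [17], where $\max_a$ and $\min_a$ are the maximum and minimum values, respectively, of the domain interval of $a$ [12]. Then, $x_i$ and $x_j$ are similar with respect to $a$ when $S_a(x_i, x_j) \ge t_a$, where $t_a \in [0, 1]$ is the similarity threshold with respect to $a$. The tolerance relation $R_a$ is related to $S_a(x_i, x_j)$ as follows:

$$a(x_i) R_a a(x_j) \Leftrightarrow S_a(x_i, x_j) \ge t_a.$$

This means that $x_i$ and $x_j$ are similar with respect to attribute $a$ when $a(x_i) R_a a(x_j)$. $S_A(x_i, x_j)$, an overall similarity measure, can be further defined as:

$$S_A(x_i, x_j) = \frac{1}{|A|} \sum_{a \in A} S_a(x_i, x_j),$$

where $|A|$ is the number of attributes in $A$, and $|A| = n$ here. Kim and Bang [14] used $x_i \tau_A x_j$ to denote the above similarity between objects $x_i$ and $x_j$ with respect to all attributes $A$. As a result, the tolerance relation $\tau_A$ can be related to $S_A$ as

$$x_i \tau_A x_j \Leftrightarrow S_A(x_i, x_j) \ge t_A,$$

where $t_A \in [0, 1]$ is a similarity threshold based on $A$.
Patterns that have a tolerance relation with $x_i$ form a tolerance class $TC(x_i)$ of $x_i$:

$$TC(x_i) = \{x_j \in U \mid x_i \tau_A x_j\}.$$

$X$ can be approximated by the lower ($\underline{\tau_A}X$) and upper approximations ($\overline{\tau_A}X$). For subset approximations, $\underline{\tau_A}X$ and $\overline{\tau_A}X$ are defined as [26]:

$$\underline{\tau_A}X = \bigcup \{TC(x) \mid x \in U,\ TC(x) \subseteq X\}, \qquad \overline{\tau_A}X = \bigcup \{TC(x) \mid x \in U,\ TC(x) \cap X \neq \emptyset\}.$$

For concept approximations, $\underline{\tau_A}X$ and $\overline{\tau_A}X$ are defined as:

$$\underline{\tau_A}X = \bigcup \{TC(x) \mid x \in X,\ TC(x) \subseteq X\}, \qquad \overline{\tau_A}X = \bigcup \{TC(x) \mid x \in X,\ TC(x) \cap X \neq \emptyset\}.$$

The tuple $\langle \underline{\tau_A}X, \overline{\tau_A}X \rangle$ is the tolerance rough set. The main difference between these two approximations is whether the objects $x$ are drawn from $U$ or from $X$.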
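As a minimal sketch of the distance-based tolerance relation, the following Python code computes the overall similarity and the resulting tolerance classes. It assumes the (max − min) variant of the normalizing distance, estimated from the data at hand; the function names and the toy patterns are hypothetical.

```python
def attribute_similarity(u, v, d_max):
    """S_a: one minus the normalized distance on a single attribute."""
    return 1.0 - abs(u - v) / d_max if d_max > 0 else 1.0

def overall_similarity(xi, xj, ranges):
    """S_A: average of the per-attribute similarities."""
    sims = [attribute_similarity(a, b, r) for a, b, r in zip(xi, xj, ranges)]
    return sum(sims) / len(sims)

def tolerance_classes(patterns, t_A):
    """TC(x_i) for every pattern, as sets of pattern indices."""
    n = len(patterns[0])
    # Assumption: d_max is taken as (max_a - min_a) over the given patterns.
    ranges = [max(p[k] for p in patterns) - min(p[k] for p in patterns)
              for k in range(n)]
    return [
        {j for j, xj in enumerate(patterns)
         if overall_similarity(xi, xj, ranges) >= t_A}
        for xi in patterns
    ]

# Hypothetical toy patterns with two real-valued attributes.
patterns = [(1.0, 10.0), (1.5, 12.0), (5.0, 30.0)]
classes = tolerance_classes(patterns, t_A=0.8)
```

With the threshold 0.8, the first two patterns tolerate each other (overall similarity 0.8875), while the third forms a class of its own.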

Computational Steps of a TRS-Based Classifier
The computational steps of a TRSC [14,15] can be briefly described as follows:
Step 1. Determine $\underline{\tau_A}TC(x)$ and $\overline{\tau_A}TC(x)$. For a pattern $x$, $\underline{\tau_A}TC(x)$ is composed of patterns certainly similar to $x$, and $\overline{\tau_A}TC(x)$ is composed of patterns possibly similar to $x$. For both subset and concept approximations, $\underline{\tau_A}TC(x)$ is identical to $TC(x)$, but $\overline{\tau_A}TC(x)$ is not.
Step 2. Classification using lower approximations. If $\underline{\tau_A}TC(x) = \{x\}$, the classification of $x$ is left to the next step. If the cardinality of $\underline{\tau_A}TC(x)$ is at least two, $\underline{\tau_A}TC(x) - \{x\}$ is used to determine the relative frequency of the class inclusion of the training patterns in $\underline{\tau_A}TC(x) - \{x\}$. Then, $x$ is assigned to the class with the highest relative frequency by majority vote. However, if the highest relative frequency is not unique, the classification of $x$ is left to the next step.
Step 3. Classification using upper approximations. The boundary region $BND_A(TC(x)) = \overline{\tau_A}TC(x) - \underline{\tau_A}TC(x)$ of $x$ can be used to determine the class label of $x$. Assume that patterns belonging to class $C_i$ constitute $X_i$. When $BND_A(TC(x)) \neq \emptyset$, the rough membership function $\mu_{C_i}(y)$ of each $y$ in $BND_A(TC(x))$ is defined as:

$$\mu_{C_i}(y) = \frac{|TC(y) \cap X_i|}{|TC(y)|},$$

where $|TC(y)|$ denotes the cardinality of $TC(y)$. Then, the average rough membership function of $x$ regarding $C_i$ is computed as:

$$\bar{\mu}_{C_i}(x) = \frac{1}{m} \sum_{y \in BND_A(TC(x))} \mu_{C_i}(y),$$

where $m$ here denotes the cardinality of $BND_A(TC(x))$. $x$ is assigned to the class with the largest degree of average rough membership. However, the class label of $x$ cannot be confirmed if $BND_A(TC(x)) = \emptyset$.
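The three steps can be sketched in Python as follows. This is a hedged illustration rather than the authors' implementation: it assumes the tolerance classes are already available as index sets, takes the lower approximation of TC(x) to be TC(x) itself, uses the subset-style upper approximation (the union of all tolerance classes that meet TC(x)), and marks the pattern to be classified with the label `None`. All names and the toy data are hypothetical.

```python
from collections import Counter

def upper_approximation(tc, i):
    """Union of all tolerance classes that intersect TC(x_i)."""
    upper = set()
    for cls in tc:
        if cls & tc[i]:
            upper |= cls
    return upper

def classify(tc, labels, i):
    """Return the class assigned to pattern i, or None if it stays undecided."""
    lower = tc[i]  # Step 1: the lower approximation is taken to equal TC(x_i)
    # Step 2: majority vote among the labeled members of lower - {x_i}.
    votes = Counter(labels[j] for j in lower - {i} if labels[j] is not None)
    ranked = votes.most_common(2)
    if ranked and (len(ranked) == 1 or ranked[0][1] > ranked[1][1]):
        return ranked[0][0]
    # Step 3: average rough membership over the boundary region.
    boundary = upper_approximation(tc, i) - lower
    if not boundary:
        return None
    scores = {}
    for c in {lab for lab in labels if lab is not None}:
        memberships = [
            sum(1 for z in tc[y] if labels[z] == c) / len(tc[y])
            for y in boundary
        ]
        scores[c] = sum(memberships) / len(memberships)
    return max(scores, key=scores.get)

# Hypothetical tolerance classes; pattern 4 is the one to classify.
tc = [{0, 1, 4}, {0, 1}, {2, 3, 4}, {2, 3}, {0, 2, 4}]
labels = ["A", "A", "B", "A", None]
predicted = classify(tc, labels, 4)
```

In this toy case the vote in Step 2 is tied (one "A" and one "B"), so the decision falls to Step 3, where the boundary patterns 1 and 3 give class "A" an average rough membership of 0.75.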

Grey-Total-Influence-Based Tolerance Rough Sets
The proposed GTI-TRS plays an important role in designing the proposed classifier. Thus, related studies with respect to the measurement of total influence are first described in Section 3.1. Three main components constitute the GTI-TRS: the GRA introduced in Section 3.2, the proposed grey total influence presented in Section 3.3, and the GTI-based tolerance relation described in Section 3.4.

Studies Related to Measuring Total Influence
Pattern classification refers to the problem of partitioning a pattern space into classes and assigning a pattern to one of them [15]. As mentioned above, to improve the classification performance of the TRSC, this study addresses direct relationships measured in the grey TRSC [19]. The main issue addressed is that as a pattern is likely to influence another directly and/or indirectly, indirect relationships among patterns should be studied as well. To develop novel similarity measures for the TRS by means of relationships, this study focuses on ways of leveraging direct relationships among patterns measured by GRA to further generate the total influence, consisting of indirect and direct relationships, for a TRSC. It should be noted that many studies (e.g., [19,27-29]) have shown the effectiveness of GRA in measuring relationships among attributes and patterns.
The use of grey theory to measure direct influences among patterns for the DEMATEL has been addressed in recent studies, such as [30][31][32][33][34][35][36]. These studies derived the total influence matrix by representing a direct influence as a grey number [27]. As the proposed method derives the total influence matrix by using GRA to automatically generate direct influences from crisp-valued data, the focus is completely different from that of the aforementioned grey DEMATEL methods.

Grey Relational Analysis
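Unlike statistical correlation measuring the relationship between random variables, GRA explores relationships between data sequences [27,29] by treating one of these sequences as the goal [28]. Assume that $n$ denotes the number of attributes. Let $x_i = (x_{i1}, x_{i2}, \ldots, x_{in})$ be a reference sequence and $x_j = (x_{j1}, x_{j2}, \ldots, x_{jn})$ ($1 \le j \le m$) be a comparative sequence. The grey relational coefficient $\xi_k(x_i, x_j)$ can be used to measure the relationship between these two sequences on attribute $k$ ($1 \le k \le n$) [37]:

$$\xi_k(x_i, x_j) = \frac{\Delta_{\min} + \rho \Delta_{\max}}{|x_{ik} - x_{jk}| + \rho \Delta_{\max}},$$

where

$$\Delta_{\min} = \min_j \min_k |x_{ik} - x_{jk}|, \qquad \Delta_{\max} = \max_j \max_k |x_{ik} - x_{jk}|,$$

and $\rho$ is a discriminative coefficient, commonly specified as 0.5 [29], although this is not necessarily an optimal setting. $\xi_k(x_i, x_j)$ falls between zero and one. The grey relational grade (GRG) $\Upsilon(x_i, x_j)$ can be used to measure the overall relationship between $x_i$ and $x_j$:

$$\Upsilon(x_i, x_j) = \sum_{k=1}^{n} w_k \xi_k(x_i, x_j),$$

where $\Upsilon(x_i, x_j) \in [0, 1]$. $w_k$ denotes the relative importance of attribute $k$, and $w_1, w_2, \ldots, w_n$ satisfy $w_1 + w_2 + \cdots + w_n = 1$.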
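A minimal Python sketch of the grey relational grade is given below. It assumes the standard form of the grey relational coefficient, with the minimum and maximum differences taken over all comparative sequences and attributes for one reference sequence, and a strictly positive discriminative coefficient. The function name and the toy sequences are hypothetical.

```python
def grey_relational_grades(reference, comparatives, weights, rho=0.5):
    """GRG of every comparative sequence with respect to one reference sequence."""
    # Absolute differences between the reference and each comparative sequence.
    deltas = [[abs(r - c) for r, c in zip(reference, comp)]
              for comp in comparatives]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        # All sequences coincide with the reference: maximal relationship.
        return [1.0] * len(comparatives)
    grades = []
    for row in deltas:
        # Grey relational coefficient on each attribute (rho assumed > 0).
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        # Weighted average of the coefficients gives the grade.
        grades.append(sum(w * xi for w, xi in zip(weights, coeffs)))
    return grades

# Hypothetical toy data: one reference and two comparative sequences.
grades = grey_relational_grades(
    reference=(1.0, 2.0),
    comparatives=[(1.0, 2.0), (2.0, 4.0)],
    weights=(0.5, 0.5),
)
```

The first comparative sequence coincides with the reference and receives a grade of 1, while the second receives 0.5 × (1/2) + 0.5 × (1/3) = 5/12.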

Generating a Direct Influence Matrix Using GRA
The total influence matrix in the DEMATEL can be used to indicate causal relationships among patterns. Prior to obtaining the total influence matrix T = [t ij ] m×m , a direct influence matrix, Z = [z ij ] m×m is constructed, where z ij (1 ≤ i, j ≤ m) represents the extent to which x i influences x j . The values of zero and one represent "no effect" and "very strong effect," respectively, when z ij ranges from zero to one. The higher the value of z ij , the more x i is likely to directly influence x j .
In particular, as $z_{ij}$ represents the impact of $x_i$ on $x_j$, it is reasonable to attribute such an impact to a relationship between $x_i$ and $x_j$. This implies that the stronger the relationship between $x_i$ and $x_j$, the greater the direct impact of $x_i$ on $x_j$. As GRA is an appropriate technique to identify relationships among patterns, this inspired us to determine the grey total influence using GRA to generate the total influence matrix. Compared with the traditional method, the distinctive feature of determining the grey total influence is that the impacts $z_{1v}, z_{2v}, \ldots, z_{uv}$ of $x_1, x_2, \ldots, x_u$ on $x_v$ can be determined automatically and all at once by means of GRA, when $x_1, x_2, \ldots, x_u$ act as comparative sequences and $x_v$ is a reference sequence, such that $z_{iv} = \Upsilon(x_i, x_v)$ ($1 \le i \le u$). To obtain $Z$, $x_1, x_2, \ldots, x_u$ act as reference sequences in turn. Thus, this study calls $Z$ a grey direct influence matrix.

Generating a Grey Direct Influence Matrix for Pattern Classification
For a multiclass problem, to verify the performance of a classifier, it is necessary to partition the collected patterns into training (Category 1) and testing (Category 2) data. That is, each pattern (e.g., $x_i$, $x_j$) can be categorized into either Category 1 or 2. As a result, four matrix segments, $Z_{11}$, $Z_{12}$, $Z_{21}$, and $Z_{22}$, make up a partitioned matrix

$$Z = \begin{bmatrix} Z_{11} & Z_{12} \\ Z_{21} & Z_{22} \end{bmatrix},$$

where each matrix segment represents a relationship between categories in a classification system: $Z_{11}$ and $Z_{22}$ describe the inner impacts of the patterns on those in Categories 1 and 2, respectively, whereas $Z_{12}$ and $Z_{21}$ describe the outer impacts of Category 1 on Category 2, and vice versa. Let pattern $p$ in Category 1, represented by $x_{1p} = (x_{1p1}, x_{1p2}, \ldots, x_{1pn})$ ($1 \le p \le m_1$), be a reference pattern, and pattern $q$ in Category 2, represented by $x_{2q} = (x_{2q1}, x_{2q2}, \ldots, x_{2qn})$ ($1 \le q \le m_2$), be a comparative pattern, where $m_1$ and $m_2$ denote the number of patterns in Categories 1 and 2, respectively. Therefore, $m_1 + m_2 = m$. $Z_{11}$, $Z_{12}$, $Z_{21}$, and $Z_{22}$ are derived as follows:
(1) $Z_{11}$: Use $x_{11}, x_{12}, \ldots, x_{1m_1}$ as comparative sequences and $x_{1i}$ as a reference sequence, so that $z(x_{1l}, x_{1i}) = \Upsilon(x_{1l}, x_{1i})$ ($1 \le l, i \le m_1$).
(2) $Z_{12}$: Use $x_{11}, x_{12}, \ldots, x_{1m_1}$ as comparative sequences and $x_{2j}$ as a reference sequence, so that $z(x_{1p}, x_{2q}) = \Upsilon(x_{1p}, x_{2q})$.
(3) $Z_{21}$: As the testing patterns are unseen by the training patterns, they do not have any impact on the training patterns. Therefore, $z(x_{2q}, x_{1p})$ is set to zero, so that $Z_{21} = 0$.
(4) $Z_{22}$: As the testing patterns are unseen, they do not have any impact on themselves. Therefore, $z(x_{2k}, x_{2q})$ ($1 \le k \le m_2$) is set to zero, so that $Z_{22} = 0$.
Of course, $Z$ is not required to be symmetric. Note that during the training phase, only training patterns are considered, so that $Z_{12} = 0$.
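The construction of the partitioned matrix can be sketched in Python as follows. This is an illustrative reading of the four rules above, not the authors' code: each pattern in turn acts as the reference sequence, the training patterns act as comparative sequences, and the rows belonging to testing patterns stay at zero. The helper `grg_column`, the function names, and the toy data are hypothetical, and the diagonal is left as computed here.

```python
def grg_column(reference, comparatives, weights, rho):
    """Grey relational grades of the comparatives for one reference sequence."""
    deltas = [[abs(r - c) for r, c in zip(reference, comp)]
              for comp in comparatives]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        return [1.0] * len(comparatives)
    return [
        sum(w * (d_min + rho * d_max) / (d + rho * d_max)
            for w, d in zip(weights, row))
        for row in deltas
    ]

def grey_direct_influence(train, test, weights, rho=0.5):
    """Build Z = [[Z11, Z12], [Z21, Z22]] with Z21 = Z22 = 0."""
    train, test = list(train), list(test)
    m1, m = len(train), len(train) + len(test)
    Z = [[0.0] * m for _ in range(m)]
    for v, reference in enumerate(train + test):
        # Column v holds the impact of every training pattern on pattern v.
        column = grg_column(reference, train, weights, rho)
        for u in range(m1):
            Z[u][v] = column[u]
    return Z  # rows of testing patterns remain zero

# Hypothetical toy data: two training patterns and one testing pattern.
Z = grey_direct_influence(
    train=[(1.0, 2.0), (2.0, 4.0)],
    test=[(1.0, 3.0)],
    weights=(0.5, 0.5),
)
```

Calling the function with an empty `test` list corresponds to the training phase, in which only $Z_{11}$ is populated.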

Generating a Grey Total Influence Matrix
All diagonal elements of $Z$ should first be set to zero [24]. $Z$ is then normalized to produce a normalized direct influence matrix $X$:

$$X = \frac{Z}{\lambda},$$

where

$$\lambda = \max\left\{ \max_{1 \le i \le m} \sum_{j=1}^{m} z_{ij},\ \max_{1 \le j \le m} \sum_{i=1}^{m} z_{ij} \right\}.$$

Finally, the grey total influence matrix $T = [t_{ij}]_{m \times m}$ can be generated by $T = X(I - X)^{-1}$. Unlike $Z$, $T$ considers not only direct, but also indirect relationships for each pair of patterns. As $Z$ and $T$ express different perspectives regarding the impact of $x_i$ on $x_j$ through $z_{ij}$ and $t_{ij}$, it is reasonable to aggregate $Z$ and $T$ into a new hybrid matrix $G = [g_{ij}]_{m \times m}$. By balancing $Z$ and $T$, $G$ is defined as follows:

$$G = \alpha Z + (1 - \alpha) T.$$

This means that

$$g_{ij} = \alpha z_{ij} + (1 - \alpha) t_{ij},$$

where $0 \le \alpha \le 1$, and the relative importance of the two terms is measured by $\alpha$. $g_{ij}$ leans toward direct influence when $\alpha > 0.5$ and toward total influence when $\alpha < 0.5$. The higher the value of $g_{ij}$, the greater the degree to which $x_i$ influences $x_j$.
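The normalization, the total influence matrix, and the aggregated matrix can be sketched in pure Python as below. Two points are assumptions made for this illustration: the normalizing constant is taken as the largest row or column sum of $Z$ (a common DEMATEL choice), and $I - X$ is assumed to be invertible. The sketch uses the identity $X(I - X)^{-1} = (I - X)^{-1} - I$ and a small Gauss-Jordan routine; all function names are hypothetical.

```python
def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting, for small dense matrices."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def aggregated_influence(Z, alpha):
    """Return (X, T, G) for a direct influence matrix Z with a nonzero entry."""
    m = len(Z)
    # Diagonal elements are set to zero before normalization.
    Z = [[0.0 if i == j else Z[i][j] for j in range(m)] for i in range(m)]
    row_sums = [sum(row) for row in Z]
    col_sums = [sum(Z[i][j] for i in range(m)) for j in range(m)]
    lam = max(row_sums + col_sums)  # assumed normalizing constant
    X = [[z / lam for z in row] for row in Z]
    inv = invert([[float(i == j) - X[i][j] for j in range(m)]
                  for i in range(m)])
    # T = X (I - X)^{-1} = (I - X)^{-1} - I
    T = [[inv[i][j] - float(i == j) for j in range(m)] for i in range(m)]
    G = [[alpha * Z[i][j] + (1 - alpha) * T[i][j] for j in range(m)]
         for i in range(m)]
    return X, T, G

# Hypothetical 2 x 2 direct influence matrix.
X, T, G = aggregated_influence([[0.0, 0.5], [0.25, 0.0]], alpha=0.5)
```

For this toy matrix the total influence of the first pattern on the second is 2, four times the direct influence of 0.5, because the indirect paths through the second pattern are accumulated in $T$.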

Grey-Total-Influence-Based Tolerance Relation
The overall relationship index ($g_{ij}$) forms the foundation of the proposed grey-total-influence-based tolerance rough set (GTI-TRS). $S_A^{GTI}(x_i, x_j)$, an overall GTI-based similarity measure, is defined as

$$S_A^{GTI}(x_i, x_j) = g_{ij},$$

and the GTI-based tolerance relation $\tau_A^{GTI}$ is related to $S_A^{GTI}$ as

$$x_i \tau_A^{GTI} x_j \Leftrightarrow S_A^{GTI}(x_i, x_j) \ge t_A^{GTI},$$

where $t_A^{GTI} \in [0, 1]$ is a similarity threshold based on $A$. GTI-TC($x_i$), a GTI-based tolerance class of $x_i$, can be generated by considering patterns that have a GTI-based tolerance relation with $x_i$:

$$\text{GTI-TC}(x_i) = \{x_j \in U \mid x_i \tau_A^{GTI} x_j\}.$$

The higher the value of $t_{ij}$, the more likely it is that $x_j$ can be included in GTI-TC($x_i$).
The lower ($\underline{\tau_A^{GTI}}X$) and upper approximations ($\overline{\tau_A^{GTI}}X$) of $X$ can be determined by subset and concept approximations by replacing $\underline{\tau_A}$, $\overline{\tau_A}$, and $TC(x)$ with $\underline{\tau_A^{GTI}}$, $\overline{\tau_A^{GTI}}$, and GTI-TC($x$), respectively. The tuple $\langle \underline{\tau_A^{GTI}}X, \overline{\tau_A^{GTI}}X \rangle$ is called a GTI-TRS. As shown in Figure 1, once the grey total influence has been found, the proposed GTI-TRSC can be set up by incorporating the GTI-TRS into the computational steps of the TRSC.
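A GTI-based tolerance class can be read directly off the aggregated matrix, as in the short Python sketch below. One assumption is made explicit: each pattern is always placed in its own class, because the classification steps of the TRSC treat a class equal to {x} as the smallest possible case. The function name and the matrix are hypothetical.

```python
def gti_tolerance_classes(G, threshold):
    """GTI-TC(x_i) for every pattern, as sets of pattern indices."""
    m = len(G)
    return [
        # Assumption: x_i always belongs to its own tolerance class.
        {j for j in range(m) if j == i or G[i][j] >= threshold}
        for i in range(m)
    ]

# Hypothetical aggregated influence matrix.
G_example = [[0.5, 0.9], [0.6, 0.4]]
gti_classes = gti_tolerance_classes(G_example, threshold=0.8)
```

Because the aggregated matrix need not be symmetric, the first pattern can tolerate the second without the converse being true, as happens in this toy case.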

Illustrative Example
To explain the generation of a grey direct influence matrix and its total influence matrix during the training and testing phases, a small decision table is shown in Table 1, where x 1 , x 2 , and x 3 are training patterns, and x 4 , x 5 , and x 6 are used for testing. In a practical problem, such as bankruptcy prediction, each pattern may be a firm, and conditional attributes may be explanatory financial ratios. Each pattern is composed of four real-valued conditional attributes. Let ρ be 0.5 and w k be 1/4 (1 ≤ k ≤ 4).

Testing Phase
During the testing phase, Z 11 and Z 12 need to be generated. For Z 12 , when we use x 4 as the reference pattern, ∆ max and ∆ min are 27 and zero, respectively. For Υ(x 1 , x 4 ), ξ 1 (x 1 , x 4 ) can be computed as follows: Therefore, the grey direct influence matrix Z is generated as follows: As λ = 3.121 occurs in the first row of Z, the normalized matrix X derived from Z is: The grey total influence matrix T can be easily generated by X(I − X) −1 :

Genetic-Algorithm-Based Learning Algorithm
Basic genetic operations such as selection, crossover, and mutation [37][38][39] are involved in the construction of the proposed GTI-TRSC. To construct a GTI-TRSC with high classification accuracy, the $n + 3$ parameters constituting a chromosome ($w_1, w_2, \ldots, w_n$, $\rho$, $\alpha$, and the similarity threshold $\tau_A^G$) are determined by a real-valued genetic algorithm (GA). The related GA parameters are the probability of crossover $Pr_c$, the probability of mutation $Pr_m$, the total number of generations $n_{\max}$, the population size $n_{size}$, and the number of elite chromosomes $n_{del}$ ($0 \le n_{del} \le n_{size}$). The pseudo-code of the learning algorithm is as follows.

Algorithm 1: The pseudo-code of the learning algorithm
    Set 0 to k; // generation counter, k ≤ n_max
    Initialize population (k, n_size);
    Evaluate chromosomes (k, n_size);
    While not satisfying the stopping rule do
        Set k + 1 to k;
        Select (k, n_size); // Select generation k from generation k − 1
        Crossover (k, n_size);
        Mutation (k, n_size);
        Elitist (k, n_size);
        Evaluate chromosomes (k, n_size);
    End while

The function of each operation is as follows:
(1) Initialize population: The most common population size is between 50 and 500. Generate an initial population of $n_{size}$ chromosomes. Each parameter in a chromosome is assigned a real random value ranging from zero to one.
(2) Evaluate chromosomes: Each chromosome corresponds to a GTI-TRSC that can be generated by the process shown in Figure 1. For each pattern, determine the lower and upper approximations for a GTI-based tolerance class. Classification accuracy serves as the fitness function: it is the number of correct predictions divided by the total number of predictions made, multiplied by 100 to turn it into a percentage.
(3) Select: To produce generation $k$, randomly select two chromosomes from generation $k − 1$ by a binary tournament and place the one with higher fitness in a mating pool.
(4) Crossover: Let $w_{i1}^k w_{i2}^k \ldots w_{in}^k \rho_i^k \alpha_i^k \tau_i^k$ and $w_{j1}^k w_{j2}^k \ldots w_{jn}^k \rho_j^k \alpha_j^k \tau_j^k$ be randomly selected chromosomes ($1 \le i, j \le n_{size}$) from generation $k$. $Pr_c$ determines whether crossover is performed on any two real-valued parameters. Two new chromosomes are generated and added to $P_{k+1}$. The related crossover operations use $a_w$, $b$, $c$, and $d$, which are random numbers ranging from zero to one.
(5) Mutation: $Pr_m$ determines whether a mutation is performed on each real-valued parameter of a newly generated chromosome. With a mutation, the affected gene is altered by adding a random number selected from a prespecified interval, such as (−0.01, 0.01). A smaller $Pr_m$ is required to avoid excessive perturbation.
(6) Elitist strategy: Randomly remove $n_{del}$ chromosomes from generation $k$. Insert the $n_{del}$ chromosomes with the maximum fitness from generation $k − 1$. A smaller $n_{del}$ is required to generate a smaller perturbation in generation $k$.
(7) Stopping rule: When $n_{\max}$ generations have been created, the algorithm reaches the stopping condition.
When the algorithm is terminated, the chromosome with the maximum fitness among all successive generations can be used to examine the generalization capability of the proposed GTI-TRSC.
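The learning algorithm can be sketched in Python as below. This is a simplified illustration under stated assumptions, not the authors' implementation: crossover is taken to be an arithmetic (convex) combination of the two parents' genes with a fresh random coefficient per gene, mutated genes are clipped to [0, 1], and the fitness function is supplied by the caller (in the paper it would be the classification accuracy of the GTI-TRSC encoded by the chromosome). The names `run_ga` and `toy_fitness` are hypothetical.

```python
import random

def run_ga(fitness, n_genes, n_size=50, n_max=500, n_del=2,
           pr_c=0.8, pr_m=0.01, seed=0):
    """Maximize `fitness` over chromosomes of n_genes values in [0, 1]."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_genes)] for _ in range(n_size)]
    fit = [fitness(c) for c in pop]
    best_fit, best = max(zip(fit, pop), key=lambda pair: pair[0])

    def tournament():
        # Binary tournament: the fitter of two random chromosomes is copied.
        a, b = rng.randrange(n_size), rng.randrange(n_size)
        return list(pop[a] if fit[a] >= fit[b] else pop[b])

    for _ in range(n_max):
        children = []
        while len(children) < n_size:
            p, q = tournament(), tournament()
            for g in range(n_genes):
                if rng.random() < pr_c:  # arithmetic crossover (assumed form)
                    a = rng.random()
                    p[g], q[g] = (a * p[g] + (1 - a) * q[g],
                                  (1 - a) * p[g] + a * q[g])
            for child in (p, q):
                for g in range(n_genes):
                    if rng.random() < pr_m:  # small additive mutation, clipped
                        step = rng.uniform(-0.01, 0.01)
                        child[g] = min(1.0, max(0.0, child[g] + step))
                children.append(child)
        children = children[:n_size]
        # Elitist strategy: overwrite n_del random slots with the best
        # chromosomes of the previous generation.
        elite = sorted(range(n_size), key=lambda i: fit[i], reverse=True)[:n_del]
        for slot, e in zip(rng.sample(range(n_size), n_del), elite):
            children[slot] = list(pop[e])
        pop = children
        fit = [fitness(c) for c in pop]
        gen_fit, gen_best = max(zip(fit, pop), key=lambda pair: pair[0])
        if gen_fit > best_fit:
            best_fit, best = gen_fit, gen_best
    return list(best), best_fit

def toy_fitness(chromosome):
    """Hypothetical stand-in for classification accuracy."""
    return -sum((g - 0.5) ** 2 for g in chromosome)

best, best_fit = run_ga(toy_fitness, n_genes=3, n_size=20, n_max=30)
```

The sketch does not rescale the weight genes so that they sum to one; how the weights are normalized before they enter the grey relational grade is left open here.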

Computer Simulations
As shown in Table 2, the generalization capability of the proposed GTI-TRSC was examined by experiments on some practical datasets available from the UCI Machine Learning Repository (http://www.ics.uci.edu/~mlearn/MLRepository.html). In Section 5.1, the performance of different rough-set-based classification methods is reported. Section 5.2 describes a statistical analysis to compare the different rough-set-based methods considered.

Evaluating Classification Performance
There is no optimal setting for the parameters of genetic algorithms, but we can refer to the principles introduced in [40,41]. The parameters were specified for all experiments as follows: $n_{size} = 50$, $n_{\max} = 500$, $n_{del} = 2$, $Pr_c = 0.8$, and $Pr_m = 0.01$. Five-fold cross-validation (5-CV) was performed ten times for each classification method by means of a distribution-balanced stratified CV (DBSCV) [42]. In each run, all patterns were divided into five disjoint subsets of equal size such that four served as training patterns and one as test data. This procedure was iterated until each subset had been used for testing.
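The fold construction can be sketched in Python as follows. Note that this is plain stratified k-fold splitting, swapped in as a simplification: it balances class proportions across folds but does not reproduce the distribution balancing of DBSCV. The function name and the toy labels are hypothetical.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=5, seed=0):
    """Split pattern indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    folds = [[] for _ in range(k)]
    offset = 0
    for members in by_class.values():
        rng.shuffle(members)
        # Deal the members of each class round-robin over the folds.
        for pos, idx in enumerate(members):
            folds[(offset + pos) % k].append(idx)
        offset += len(members)
    return folds

# Hypothetical labels: ten patterns of class "A" and five of class "B".
labels = ["A"] * 10 + ["B"] * 5
folds = stratified_folds(labels, k=5)
splits = [
    (sorted(i for f, fold in enumerate(folds) if f != t for i in fold),
     sorted(folds[t]))
    for t in range(5)
]
```

Each entry of `splits` pairs the training indices with the test indices of one iteration; repeating the procedure with ten different seeds would mirror the ten repetitions described above.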
The classification performance of the proposed GTI-TRSC was compared with that of several representative rough-set-based classification methods: a rule-based method with shortening optimization (RSES-O) using the Rough Set Exploration System (RSES) [3][4][5], a hierarchical version of the lattice machine (HLM) [43,44], a hierarchical form of RSES-O (RSES-H) [44,45], and the Rule Induction with Optimal Neighborhood Algorithm (RIONA) [46]. These classification methods are briefly introduced as follows: (1) HLM: The lattice machine generates hypertuples as a model of the data. Some more general hypertuples can be used in the hierarchy that covers objects covered by the hypertuples. The covering hypertuples are located at various levels of the hierarchy. (4) RIONA: RIONA is also implemented in RSES. It uses the nearest neighbor method to induce distance-based rules. For a new pattern, the patterns most similar to it can vote for its decision, but patterns that do not match any rule are excluded from voting.
The classification performance of the above methods, reported in [44], is summarized in Table 3. Variants of TRSC were also considered: TRSC with subset approximations (TRSC-SU), TRSC with concept approximations (TRSC-CO), flow-based TRSC (FTRSC) with subset approximations (FTRSC-SU), flow-based TRSC with concept approximations (FTRSC-CO) [22], grey-tolerance-rough-set-based classifier (GTRSC) with subset approximations (GTRSC-SU), and GTRSC with concept approximations (GTRSC-CO) [19]. The GTRSC and FTRSC were chosen because it is interesting to investigate whether different measures of similarity or relationship can influence classification accuracy. To clarify the basic differences between GTI-TRSC, GTRSC, and FTRSC, GTRSC and FTRSC are briefly introduced as follows: (1) GTRSC: Instead of a simple distance measure used to evaluate the proximity of any two patterns, the GRG (grey relational grade) is used here to implement a relationship-based similarity measure that generates a tolerance class for each pattern. As mentioned above, only direct relationships were considered in the GTRSC. (2) FTRSC: The FTRSC uses preference information expressed by flows among patterns to measure similarity between patterns. The flow of each pattern is computed by the well-known preference ranking organization methods for enrichment evaluations (PROMETHEE) [47,48].
The results are summarized in Tables 3 and 4. It is clear that the classification performance of GTI-TRSC was comparable to that of FTRSC and GTRSC. This means that the choice of similarity or relationship measure can have an impact on classification results.

Statistical Analysis
The nonparametric Friedman test [49] was used to statistically analyze the aforementioned classification methods. Under the null hypothesis, whereby the ranks of the classification methods were identical on average, the $F_F$ statistic, distributed as an F distribution with $k_1 − 1$ and $(k_1 − 1)(k_2 − 1)$ degrees of freedom, can be defined as follows [20]:

$$F_F = \frac{(k_2 - 1)\chi_F^2}{k_2(k_1 - 1) - \chi_F^2},$$

where $r_j$, $k_1$, and $k_2$ are the average rank of method $j$, the number of methods, and the number of datasets considered, respectively. $\chi_F^2$ is defined as

$$\chi_F^2 = \frac{12 k_2}{k_1(k_1 + 1)} \left[ \sum_{j=1}^{k_1} r_j^2 - \frac{k_1(k_1 + 1)^2}{4} \right].$$

$F_F$ is 14.09 because $k_1 = 12$, $k_2 = 10$, and $\chi_F^2 = 67.13$. The null hypothesis was rejected at the 5% level as $F_F$ was far above the critical value of approximately 1.89 (i.e., F(11, 99)).
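The reported value of $F_F$ can be checked with a few lines of Python. The sketch assumes the standard Friedman and Iman-Davenport formulas; the function names are hypothetical, and since the average ranks themselves are not listed here, only the reported $\chi_F^2$ is used as input.

```python
def friedman_chi2(avg_ranks, k2):
    """Friedman chi-square from the average ranks of k1 methods on k2 datasets."""
    k1 = len(avg_ranks)
    return 12 * k2 / (k1 * (k1 + 1)) * (
        sum(r * r for r in avg_ranks) - k1 * (k1 + 1) ** 2 / 4
    )

def friedman_f(chi2_f, k1, k2):
    """F_F statistic derived from the Friedman chi-square."""
    return (k2 - 1) * chi2_f / (k2 * (k1 - 1) - chi2_f)

# Values reported in the text: 12 methods, 10 datasets, chi-square of 67.13.
f_f = friedman_f(67.13, k1=12, k2=10)
```

Rounded to two decimals, `f_f` gives 14.09, matching the value stated above. When every method has the same average rank, `friedman_chi2` returns zero, which corresponds to the null hypothesis.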
Subsequently, the Nemenyi test [50] was used to detect significant differences among the classification methods. The classification accuracies of two methods were considered significantly different when the difference in their average ranks was greater than CD:

$$CD = q_\alpha \sqrt{\frac{k_1(k_1 + 1)}{6 k_2}},$$

where CD is a critical difference, and CD = 4.89 because $q_{0.10} = 3.03$ at the 10% level, so that $3.03 \times \sqrt{12 \times 13 / 60} \approx 4.89$. We summarize the results as follows:

Discussion and Conclusions
From the perspective of numerous or few relationships between any pair of patterns, this study used GRA to identify direct influences among patterns. A total influence matrix was generated to verify the direct/indirect influences among them. The total influence formed the foundation of the proposed GTI-TRS associated with the construction of the TRSC. The idea is that the higher the value of g ij , the more similar x i is to x j . We noted that [19] had proposed grey tolerance rough sets (GTRS), which defined an overall similarity measure S G A (x i , x j ), so that S G A (x i , x j ) = Υ(x i , x j ). Therefore, the main difference between the GTRS and the GTI-TRS is that the former considers only direct relationships, but the latter considers direct as well as indirect relationships using the proposed grey total influence matrix. In particular, a GA was implemented to determine the optimal parameter specification for the proposed GTI-TRSC that cannot be easily determined by users.
Even though parameter specifications for GA are subjective, experimental results showed that the chosen parameters were acceptable. Indeed, the proposed GTI-TRSC is sufficiently simple to implement as a computer program without conforming to any statistical assumptions. Experimental results obtained by the proposed GTI-TRSC are promising. We see that GTI-TRSC-CO and GTI-TRSC-SU produced satisfactory results compared with the other rough-set-based classification methods considered. In particular, these two classifiers were superior in terms of classification performance to the TRSC. However, it should be noted that there is no best classifier [42].
This study has motivated us to investigate the subject further. As mentioned above, the grey DEMATEL has been an important issue for multiple attribute decision making. It is interesting to examine a distinctive version of the grey DEMATEL built on the proposed grey direct influence matrix. First, a novel DEMATEL-based Analytic Network Process (DANP) proposed in [20] was used to avoid agonizing pairwise comparisons for the ANP by directly using the total influence matrix produced by the DEMATEL as the unweighted supermatrix of the ANP. The DANP has gained considerable research attention in recent years due to its convenience. It is interesting to explore its applicability to practical problems by incorporating the new version of the grey DEMATEL into DANP.
Second, considering decision-making in a fuzzy environment, the linguistic interpretation of elements of the grey direct influence matrix is challenging. In practical applications, it may be desirable that a linguistic term be generated from numerical data [41]. Linguistic values such as "low influence", "medium influence", and "high influence" associated with fuzzy sets can be considered. Such a linguistic grey direct influence matrix can then be incorporated into the DEMATEL for further processing.
Finally, the traditional GRG is implemented using the weighted-average method, where dependency among the attributes is not considered. However, an assumption of additivity may not be realistic in practical applications [53]. Thus, it is more useful to employ a nonadditive grey total influence matrix with a nonadditive GRG [54], and check the resultant impact on the performance of GTI-TRSC. The aforementioned grey DANP, fuzzy-grey-DEMATEL, and nonadditive grey DEMATEL will be investigated in future studies.