Article

Nonparametric Hyperbox Granular Computing Classification Algorithms

1 Center of Computing, Xinyang Normal University, Xinyang 464000, China
2 School of Computer and Information Technology, Xinyang Normal University, Xinyang 464000, China
* Author to whom correspondence should be addressed.
Information 2019, 10(2), 76; https://doi.org/10.3390/info10020076
Submission received: 8 January 2019 / Revised: 1 February 2019 / Accepted: 15 February 2019 / Published: 24 February 2019

Abstract

Compared with nonparametric algorithms, parametric granular computing classification algorithms suffer from parameter selection, repeated runs over candidate parameter values, and higher algorithmic complexity. We present nonparametric hyperbox granular computing classification algorithms (NPHBGrCs). First, each granule is a hyperbox whose beginning point and endpoint are induced by any two vectors in N-dimensional (N-D) space. Second, a novel distance between an atomic hyperbox and a hyperbox granule is defined to determine whether the two are joined. Third, the designed NPHBGrC is verified on classification problems. Experiments on benchmark datasets demonstrate the feasibility and superiority of NPHBGrC over parametric algorithms such as HBGrC.

1. Introduction

The classification algorithm is a traditional data analysis method that is widely applied in many fields, including computer vision [1], DNA analysis [2], and physical chemistry [3]. For classification problems, the main approach is the parameter-based learning method, whereby the relation between inputs and outputs is learned in order to predict the class label of an input whose class label is unknown. The parameter-based learning method includes the analytic function method and the discrete inclusion relation method. The analytic function method establishes a mapping relationship between the inputs and outputs of the training datasets, and the trained mapping is used to predict the class labels of inputs with unknown class labels. The support vector machine (SVM) and the multilayer perceptron (MLP) are representative methods of this kind, in which linear or nonlinear mappings are formed to predict the class labels of unlabeled inputs. The discrete inclusion relation method estimates the class label of an input based on the discrete inclusion relation between inputs with determined class labels and inputs without class labels, and includes techniques such as random forest (RF) and granular computing (GrC). In this paper, we mainly study the classification algorithm based on GrC, especially GrC with granules in the form of hyperboxes, the superiority and feasibility of which are shown in references [4,5,6,7,8,9,10,11].
As a classification and clustering method, GrC involves a computationally intelligent theory and method, and jumps back-and-forth between different granularity spaces [12,13,14]. Being fundamentally a data analysis method, GrC is commonly studied from the perspectives of theory and application, the latter of which includes pattern recognition, image processing, and industrial applications [12,13,14,15,16,17,18,19]. The main research issues of GrC include shape, operation, relation, granularity, etc.
A granule is a set of objects whose elements are regarded as objects with similar properties [17]. Binary granular computing considers a conventional binary relation between two sets. Correspondingly, operations between two sets, such as the intersection and the union, are converted into operations between two granules. Another research issue in GrC is how to define the distance between two granules. Chen and his colleagues introduced the Hamming distance to measure distances between binary granules for rough sets. Moreover, the granule swarm distance is used to measure the uncertainty between two granules [20].
Operations between two granules are expressed as the equivalent form of membership grades, which are produced by the two triangular norms [15]. Kaburlasos defined the join operation and the meet operation as inducing granules with different granularity in terms of the theory of lattice computing [5,6]. Kaburlasos defined the fuzzy inclusion measure between two granules on the basis of the defined join operation and meet operation, and the fuzzy lattice reasoning classification algorithm was designed based on the distance between the beginning point and the endpoint of the hyperbox granule [7].
The relation between two granules is mainly used to generate the rules of association between inputs and outputs for classification problems and regression problems. A specialized version of this general framework is proposed by GrC theory in order to mine the potential relations behind data [21]. Kaburlasos and his colleague embed the lattice computing, including GrC, into a fuzzy inference system (FIS), and preliminary industrial applications have demonstrated the advantages of their proposed GrC methods [4].
Granularity is an index measuring the size of a granule, and how the granularity of a granule can be measured is one of the foundational issues in GrC. Yao regarded a granule as a set and defined its granularity as a strictly monotonic function of the cardinality of the set [14]. As a classification algorithm, GrC is concerned with human information processing procedures, including both data abstraction and the derivation of knowledge from information. To induce and deduce knowledge from data, parameters are introduced to capture suitable prior knowledge from the given data, such as the granularity threshold, the parameter λ of the positive valuation function used to construct the fuzzy inclusion measure between two granules, and the maximal number of data belonging to a granule; this results in some redundant granules during the training process. On one hand, these parameters improve the performance of GrC classification and clustering algorithms. On the other hand, they also have negative impacts, such as the higher time consumption of parametric GrC compared with nonparametric GrC algorithms.
The proposed nonparametric hyperbox GrC has two main advantages for classification tasks. First, the nonparametric hyperbox GrC achieves better performance than the parametric hyperbox GrC. Second, parametric hyperbox GrC classification algorithms must be run multiple times to select parameters, such as the parameter of the positive valuation function and the granularity threshold, which is time-consuming. The nonparametric hyperbox granular computing classification algorithm (NPHBGrC) includes the following steps. First, each granule has a regular hyperbox shape, whose beginning point and endpoint are induced by any two vectors in N-dimensional (N-D) space; second, the distance between two hyperbox granules is introduced to determine their join process; and third, NPHBGrC is designed and verified on benchmark datasets in comparison with hyperbox granular computing classification algorithms (HBGrCs).

2. Nonparametric Granular Computing

In this section, we discuss nonparametric granular computing, including the representation of granules, the operations between two granules, and the distance between two granules.

2.1. Representation of Hyperbox Granule

For granular computing in N-D space, we suppose a granule has a regular shape, such as a hyperbox with a beginning point x and an endpoint y that satisfy the partial order relation x ⪯ y. The beginning point x and the endpoint y are vectors in N-D space, and the hyperbox granule has the form G = [Bp, Ep], where Bp is the beginning point and Ep is the endpoint. For any two vectors x = (x1, x2, …, xN) and y = (y1, y2, …, yN), if x and y satisfy the partial order relation x ⪯ y, then Bp = x and Ep = y; otherwise Bp = x ∧ y and Ep = x ∨ y. The partial order relation between two vectors in N-D space is defined as follows:
x ⪯ y ⟺ x1 ≤ y1, x2 ≤ y2, …, xN ≤ yN.
The meet operation ∧ and the join operation ∨ between two vectors are defined as follows:
x ∧ y = (x1 ∧ y1, x2 ∧ y2, …, xN ∧ yN), x ∨ y = (x1 ∨ y1, x2 ∨ y2, …, xN ∨ yN),
where the meet and join operations between two scalars are a ∧ b = min{a, b} and a ∨ b = max{a, b}.
Obviously, any two vectors x and y in N-D space form a hyperbox granule G = [Bp, Ep], where Bp is the beginning point and Ep is the endpoint of the granule. In the following sections, we represent a hyperbox granule in N-D space by G = [x, y]. In 2-D space, the granule G = [x, y] is a box, and in N-D space, the granule G = [x, y] is a hyperbox.
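To make this construction concrete, the following Python sketch (our illustration, not code from the paper; the helper names meet_vec, join_vec, and make_hyperbox are hypothetical) builds the beginning point and the endpoint of a hyperbox granule from two arbitrary, possibly unordered, vectors:

```python
# Illustrative sketch (not code from the paper): constructing a hyperbox granule
# G = [Bp, Ep] from two arbitrary vectors x and y in N-D space.
# The helper names meet_vec, join_vec, and make_hyperbox are hypothetical.
from typing import List, Tuple

def meet_vec(x: List[float], y: List[float]) -> List[float]:
    """Component-wise meet (minimum) of two N-D vectors."""
    return [min(a, b) for a, b in zip(x, y)]

def join_vec(x: List[float], y: List[float]) -> List[float]:
    """Component-wise join (maximum) of two N-D vectors."""
    return [max(a, b) for a, b in zip(x, y)]

def make_hyperbox(x: List[float], y: List[float]) -> Tuple[List[float], List[float]]:
    """Return (Bp, Ep) with Bp = x meet y and Ep = x join y, so Bp precedes Ep
    component-wise even when x and y are not ordered."""
    return meet_vec(x, y), join_vec(x, y)

# Example in 2-D: two unordered vectors still induce a valid box.
Bp, Ep = make_hyperbox([0.4, 0.1], [0.2, 0.3])   # Bp = [0.2, 0.1], Ep = [0.4, 0.3]
```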

2.2. Operations between Two Hyperbox Granules

For two hyperbox granules G1 = [x1, y1] and G2 = [x2, y2], the join hyperbox granule is obtained by the join operation as
G1 ∨ G2 = [x1 ∧ x2, y1 ∨ y2],
where x1 = (x11, x12, …, x1N) and y1 = (y11, y12, …, y1N) are vectors, x1 ∧ x2 = (x11 ∧ x21, x12 ∧ x22, …, x1N ∧ x2N), and y1 ∨ y2 = (y11 ∨ y21, y12 ∨ y22, …, y1N ∨ y2N).
The join hyperbox granule has greater granularity than the original hyperbox granules. The original hyperbox granules and the join hyperbox granule satisfy the following relations:
G1 ⊆ G1 ∨ G2, G2 ⊆ G1 ∨ G2.
The meet hyperbox granule is obtained by the meet operation as
G1 ∧ G2 = [x1 ∨ x2, y1 ∧ y2] if x1 ∨ x2 ⪯ y1 ∧ y2, and G1 ∧ G2 = ∅ otherwise.
The meet hyperbox granule has smaller granularity than the original hyperbox granules. The meet hyperbox granule and the original hyperbox granules satisfy the following relations:
G1 ∧ G2 ⊆ G1, G1 ∧ G2 ⊆ G2.
For example, in 2-D space, G1 = [0.05, 0.15, 0.48, 0.68] and G2 = [0.1, 0.2, 0.5, 0.7] are two hyperbox granules, and their join hyperbox granule is G1 ∨ G2 = [0.05, 0.15, 0.5, 0.7], which is induced by the join operation above. These three hyperboxes are shown in Figure 1.
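A sketch of these two operations in Python, reusing meet_vec and join_vec from the previous sketch (Box, join_box, and meet_box are our hypothetical names), reproduces the Figure 1 example:

```python
# Sketch of the join and meet operations between two hyperbox granules,
# reusing meet_vec/join_vec from the previous sketch; Box, join_box, and
# meet_box are hypothetical names.
from typing import List, Optional, Tuple

Box = Tuple[List[float], List[float]]   # (Bp, Ep)

def join_box(g1: Box, g2: Box) -> Box:
    """Smallest hyperbox containing both granules: [Bp1 meet Bp2, Ep1 join Ep2]."""
    return meet_vec(g1[0], g2[0]), join_vec(g1[1], g2[1])

def meet_box(g1: Box, g2: Box) -> Optional[Box]:
    """Overlap of the two granules, or None (the empty granule) if they are disjoint."""
    bp, ep = join_vec(g1[0], g2[0]), meet_vec(g1[1], g2[1])
    return (bp, ep) if all(b <= e for b, e in zip(bp, ep)) else None

# The 2-D example of Figure 1:
G1 = ([0.05, 0.15], [0.48, 0.68])
G2 = ([0.10, 0.20], [0.50, 0.70])
print(join_box(G1, G2))   # ([0.05, 0.15], [0.5, 0.7])
print(meet_box(G1, G2))   # ([0.1, 0.2], [0.48, 0.68])
```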

2.3. Novel Distance between Two Hyperbox Granules

The atomic hyperbox granule is a point in N-D space and is represented as a hyperbox whose beginning point and endpoint are identical. We can therefore measure the distance relation between a point and a hyperbox granule.
For N-D space, a distance function is a mapping from N-D vector space to 1-D real space. From a visual point of view, distance is a numerical description of how far two objects are from one another. The distance function between two hyperbox granules is a mapping from hyperbox granule space to 1-D space, and a larger distance means a smaller overlap between the two hyperbox granules. The distance function on the granule space S is a function
d: S × S → R,
where R denotes the set of real numbers. We define the distance between two hyperbox granules G1 = [Bp1, Ep1] and G2 = [Bp2, Ep2] as follows.
Definition 1.
The distance between a point P and a hyperbox granule G = [Bp, Ep] is defined as
D(P, G) = d(P, Bp) + d(P, Ep) − d(Bp, Ep),
where Bp = (x1, x2, …, xN) is the beginning point, Ep = (y1, y2, …, yN) is the endpoint, and d(·, ·) is the Manhattan distance between two points:
d(Bp, Ep) = ‖Bp − Ep‖1 = |x1 − y1| + … + |xN − yN|.
Suppose P = (p1, p2, …, pN) is a point in N-D space and G is a hyperbox granule in granule space. The distance between P and G is a mapping from the granule space to the real space that satisfies the following non-negativity property:
D(P, G) = d(P, Bp) + d(P, Ep) − d(Bp, Ep)
= |p1 − x1| + … + |pN − xN| + |y1 − p1| + … + |yN − pN| − (|y1 − x1| + … + |yN − xN|)
= (|p1 − x1| + |p1 − y1| − |y1 − x1|) + … + (|pN − xN| + |pN − yN| − |yN − xN|) ≥ 0,
since each bracketed term is non-negative by the triangle inequality.
The distance between a point and a hyperbox granule G is illustrated in 2-D space. For G = [0.1, 0.2, 0.4, 0.3] and the point P = (0.3, 0.4), d(P, Bp) = 0.4, d(P, Ep) = 0.2, and d(Bp, Ep) = 0.4, so D(P, G) = 0.2 > 0. The locations of P and G are shown in Figure 2. As shown in Figure 2, the point P is outside the hyperbox granule G.
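A short Python sketch of Definition 1 (our illustration; manhattan and dist_point_box are hypothetical names) reproduces this example and previews Theorem 1 below:

```python
# Sketch of the distance of Definition 1 (hypothetical helper names).
def manhattan(p, q):
    """Manhattan (L1) distance between two points."""
    return sum(abs(a - b) for a, b in zip(p, q))

def dist_point_box(p, box):
    """D(P, G) = d(P, Bp) + d(P, Ep) - d(Bp, Ep); zero exactly when P lies inside G."""
    bp, ep = box
    return manhattan(p, bp) + manhattan(p, ep) - manhattan(bp, ep)

# The 2-D example above: G = [0.1, 0.2, 0.4, 0.3] and P = (0.3, 0.4).
G = ([0.1, 0.2], [0.4, 0.3])
print(dist_point_box([0.3, 0.4], G))    # ~0.2  -> P lies outside G
print(dist_point_box([0.2, 0.25], G))   # ~0.0  -> this point lies inside G
```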
Theorem 1.
In N-D space, the point P is inside the hyperbox granule G if and only if D(P, G) = 0.
Proof: 
Suppose Bp = (x1, x2, …, xN), Ep = (y1, y2, …, yN), and P = (p1, p2, …, pN).
If the point P is inside the hyperbox granule G = [Bp, Ep], then Bp ⪯ P and P ⪯ Ep, so d(P, Bp) = (p1 − x1) + (p2 − x2) + … + (pN − xN) and d(P, Ep) = (y1 − p1) + (y2 − p2) + … + (yN − pN). Hence
d(P, Bp) + d(P, Ep) = (y1 − x1) + (y2 − x2) + … + (yN − xN) = d(Bp, Ep),
namely, D(P, G) = d(P, Bp) + d(P, Ep) − d(Bp, Ep) = 0.
Conversely, if D(P, G) = 0, then
D(P, G) = (|y1 − p1| + |x1 − p1| − |y1 − x1|) + (|y2 − p2| + |x2 − p2| − |y2 − x2|) + … + (|yN − pN| + |xN − pN| − |yN − xN|) = 0.
Because each term satisfies |yi − pi| + |xi − pi| − |yi − xi| ≥ 0 and the sum is zero, every term must be zero, i.e., |yi − pi| + |xi − pi| − |yi − xi| = 0 for each i. We discuss the relation between xi and pi and the relation between yi and pi in two situations.
If pi < xi, then pi < yi owing to xi ≤ yi, and
|yi − pi| + |xi − pi| − |yi − xi| = (yi − pi) + (xi − pi) − (yi − xi) = 2(xi − pi) > 0.
This contradicts D(P, G) = 0; hence xi ≤ pi.
If pi > yi, then pi > xi owing to xi ≤ yi, and
|yi − pi| + |xi − pi| − |yi − xi| = (pi − yi) + (pi − xi) − (yi − xi) = 2(pi − yi) > 0.
This again contradicts D(P, G) = 0; hence pi ≤ yi.
Therefore, xi ≤ pi and pi ≤ yi for every i, namely, P is included in G. □
Definition 2.
The distance between two hyperbox granules G1 = [Bp1, Ep1] and G2 = [Bp2, Ep2] is defined as
D(G1, G2) = max{D(Bp1, G2), D(Ep1, G2)}.
Obviously, D(G1, G2) ≥ 0, and the distance between two hyperbox granules has the following properties.
Theorem 2.
D(G1, G2) = 0 if and only if G1 ⊆ G2.
Proof: 
If D(G1, G2) = 0, then because D(Bp1, G2) ≥ 0, D(Ep1, G2) ≥ 0, and D(G1, G2) = max{D(Bp1, G2), D(Ep1, G2)} = 0, we have D(Bp1, G2) = D(Ep1, G2) = 0. According to Theorem 1, Bp1 is inside the hyperbox granule G2 and Ep1 is inside the hyperbox granule G2, namely Bp1 ∈ G2 and Ep1 ∈ G2. Hence G1 ⊆ G2.
Conversely, if G1 ⊆ G2, both Bp1 and Ep1 are inside the hyperbox granule G2. According to Theorem 1, D(Bp1, G2) = 0 and D(Ep1, G2) = 0, so the maximum of D(Bp1, G2) and D(Ep1, G2) is zero, namely,
D(G1, G2) = max{D(Bp1, G2), D(Ep1, G2)} = 0. □
Theorem 3.
In general, D(G1, G2) ≠ D(G2, G1); that is, the distance between two hyperbox granules is not symmetric.
Proof: 
D(G1, G2) = max{D(Bp1, G2), D(Ep1, G2)} = max{d(Bp1, Bp2) + d(Bp1, Ep2) − d(Bp2, Ep2), d(Ep1, Bp2) + d(Ep1, Ep2) − d(Bp2, Ep2)},
D(G2, G1) = max{D(Bp2, G1), D(Ep2, G1)} = max{d(Bp2, Bp1) + d(Bp2, Ep1) − d(Bp1, Ep1), d(Ep2, Bp1) + d(Ep2, Ep1) − d(Bp1, Ep1)}.
Owing to d(Bp1, Ep1) ≠ d(Bp2, Ep2) in general, the two expressions subtract different quantities, so D(G1, G2) ≠ D(G2, G1) in general. □
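A brief Python sketch of Definition 2 (our illustration, reusing dist_point_box from the previous sketch; the granules G1 and G2 below are of our own choosing) shows the behavior described by Theorems 2 and 3:

```python
# Sketch of Definition 2 (reusing dist_point_box from the previous sketch):
# the distance from G1 to G2 is the larger point-to-granule distance of G1's corners.
def dist_box_box(g1, g2):
    bp1, ep1 = g1
    return max(dist_point_box(bp1, g2), dist_point_box(ep1, g2))

G1 = ([0.10, 0.20], [0.48, 0.68])
G2 = ([0.05, 0.15], [0.50, 0.70])
print(dist_box_box(G1, G2))   # ~0.0, since G1 is contained in G2 (Theorem 2)
print(dist_box_box(G2, G1))   # ~0.2 > 0, illustrating the asymmetry noted in Theorem 3
```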

2.4. Nonparametric Granular Computing Classification Algorithms

For a classification problem, the training set is the set S, and NPHBGrC forms the granule set GS, composed of hyperbox granules, by the following steps. First, a sample is selected at random to form an atomic hyperbox granule. Second, another sample with the same class label as the hyperbox granule in GS is selected, and the join hyperbox is formed by the join operation. Third, the hyperbox granule is updated if the join hyperbox granule does not include any sample with another class label. The NPHBGrC algorithms include a training process and a testing process, which are listed as Algorithms 1 and 2.
Algorithm 1: Training process
Input: Training set S
Output: Hyperbox granule set GS, the class labels lab corresponding to GS
 S1. Initialize the hyperbox granule set GS = ∅ and lab = ∅;
 S2. i = 1;
 S3. Select the samples with class label i and generate the set X;
 S4. Initialize the hyperbox granule set GSt = ∅;
 S5. If GSt = ∅, a sample xj in X is selected to construct the corresponding atomic hyperbox granule Gj and xj is removed from X; otherwise j = 1;
 S6. A sample xk is selected from X and forms the atomic hyperbox granule Gk;
 S7. If the join hyperbox granule Gj ∨ Gk of Gj and Gk does not include any sample of another class, Gj is replaced by the join hyperbox granule Gj ∨ Gk and the samples included in Gj ∨ Gk with class label i are removed from X, namely, Gj = Gj ∨ Gk; otherwise GS and lab are updated as GS = GS ∪ {Gk} and lab = lab ∪ {i};
 S8. j = j + 1;
 S9. If i = n, output GS and the class labels lab; otherwise i = i + 1.
Algorithm 2: Testing process
Input: Input of an unknown datum x, the trained hyperbox granule set GS, and the class labels lab
 Output: Class label of x
 S1. For i = 1 : |GS|;
 S2. Compute the distance D(x, Gi) between x and Gi in GS;
 S3. Find the minimal distance D(x, Gi);
 S4. Assign the class label corresponding to that Gi as the label of x.
We take a training set including 10 training data as an example to explain the training algorithm. Suppose the training set is
S = {(x1, y1), (x2, y2), (x3, y3), (x4, y4), (x5, y5), (x6, y6), (x7, y7), (x8, y8), (x9, y9), (x10, y10)},
where the inputs of the data are
x1 = (4, 7), x2 = (7, 6), x3 = (8, 2), x4 = (2, 4), x5 = (5, 5), x6 = (5, 9), x7 = (6, 4), x8 = (5, 7), x9 = (7, 3), x10 = (3, 7),
and the corresponding class labels are
y1 = 1, y2 = 1, y3 = 2, y4 = 2, y5 = 2, y6 = 1, y7 = 2, y8 = 1, y9 = 2, y10 = 2.
We explain the generation of GS by Algorithm 1. The sample x1 = (4, 7) is selected to form the atomic hyperbox granule G1 = [4, 7, 4, 7] with granularity 0 and class label 1, as shown in Figure 3a. The second datum x2 = (7, 6), with the same class label as G1, is selected to generate the atomic hyperbox granule [7, 6, 7, 6], which is joined with G1 to form the join hyperbox granule [x2, x2] ∨ G1 = [4, 6, 7, 7]. Since there are no data with the other class label lying in the join hyperbox granule [4, 6, 7, 7], G1 is replaced by the join hyperbox granule, namely G1 = [x2, x2] ∨ G1 = [4, 6, 7, 7], as shown in Figure 3b. The third datum x6, with the same class label as G1, is selected to generate the atomic hyperbox granule [x6, x6] = [5, 9, 5, 9], which is joined with G1 to form the join hyperbox granule [x6, x6] ∨ G1 = [4, 6, 7, 9]. As there are no data with the other class label lying in this join hyperbox granule, G1 is replaced by [x6, x6] ∨ G1, namely G1 = [4, 6, 7, 9], as shown in Figure 3c. During the join process, a datum that has the same class label as the hyperbox granule and already lies inside it is not considered for joining, such as the datum x8 with class label 1. In this way, the hyperbox granule G1 = [4, 6, 7, 9], drawn with blue lines, is generated for the data with class label 1. The same strategy is adopted for the data with class label 2; two hyperbox granules G2 = [2, 2, 8, 5] and G3 = [3, 7, 3, 7] are generated, as shown in Figure 3d. For the training set S, the achieved granule set is GS = {G1, G2, G3} and the corresponding class labels are lab = {1, 2, 2}. The granules in GS are shown in Figure 3d; the granule drawn with blue lines has class label 1, and the granules drawn with red lines have class label 2.
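The training and testing procedures can be sketched compactly in Python. The sketch below is our greedy reading of Algorithms 1 and 2 (the sample processing order and the rule that the first safely expandable granule of a class absorbs a sample are assumptions, and the helper names are hypothetical); on the 10-point example above it reproduces GS = {G1, G2, G3} with the class labels lab = {1, 2, 2}.

```python
# Hedged sketch of Algorithms 1 and 2 as a greedy loop (our reading of the pseudocode;
# the sample processing order and the "first safely expandable granule absorbs the
# sample" rule are assumptions). Reuses dist_point_box and join_box from above.
def contains(box, p):
    return dist_point_box(p, box) <= 1e-12       # Theorem 1: inside iff the distance is 0

def train(X, y):
    """Return (granules, labels): hyperbox granules and their class labels."""
    granules, labels = [], []
    for cls in sorted(set(y)):
        others = [p for p, l in zip(X, y) if l != cls]          # samples of the other classes
        for p, l in zip(X, y):
            if l != cls:
                continue
            if any(lb == cls and contains(g, p) for g, lb in zip(granules, labels)):
                continue                                         # already covered, skip
            for i, (g, lb) in enumerate(zip(granules, labels)):
                if lb != cls:
                    continue
                candidate = join_box(g, (list(p), list(p)))      # join with the atomic box [p, p]
                if not any(contains(candidate, q) for q in others):
                    granules[i] = candidate                      # safe expansion: accept the join
                    break
            else:
                granules.append((list(p), list(p)))              # otherwise start a new atomic granule
                labels.append(cls)
    return granules, labels

def predict(x, granules, labels):
    """Algorithm 2: assign the label of the granule nearest to x."""
    return min(zip(granules, labels), key=lambda gl: dist_point_box(x, gl[0]))[1]

X = [(4, 7), (7, 6), (8, 2), (2, 4), (5, 5), (5, 9), (6, 4), (5, 7), (7, 3), (3, 7)]
y = [1, 1, 2, 2, 2, 1, 2, 1, 2, 2]
GS, lab = train(X, y)
# GS -> [([4, 6], [7, 9]), ([2, 2], [8, 5]), ([3, 7], [3, 7])], lab -> [1, 2, 2]
print(predict((6, 8), GS, lab))   # 1: the point falls inside the first granule
```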

3. Experiments

The effectiveness of NPHBGrC is evaluated with a series of empirical studies, including classification problems in 2-D space and classification problems in N-D space. We compare NPHBGrC with parametric GrC, such as HBGrC [22], and evaluate the performance of the classification algorithms by the granularity threshold of HBGrC (Par.), the number of hyperbox granules (Ng), the time cost (T(s)) including the training and testing processes, the training accuracy (TAC), and the testing accuracy (AC).

3.1. Classification Problems in 2-D Space

In the first benchmark study, the two-spiral curve classification problem [23], the Ripley classification problem [24], and the sensor2 classification problem (wall-following robot navigation data from http://archive.ics.uci.edu/ml/datasets.html), all defined in two dimensions, were used to assess the efficacy of the classification algorithms and to visualize the classification boundaries. The details of the datasets and the classification performance are summarized in Table 1, which lists the number of training data (#Tr), the number of testing data (#Ts), and the performances of NPHBGrC and HBGrC. From Table 1, it can be seen that NPHBGrC has greater or equal testing accuracies and less time cost compared with HBGrC. NPHBGrC has less time cost than HBGrC because HBGrC produces some redundant hyperbox granules. Figure 4 and Figure 5 show the boundaries of NPHBGrC and HBGrC for the Ripley dataset.

3.2. Classification Problems in N-dimensional (N-D) Space

In this section, we verify the performance of the proposed classification algorithms, extended to N-D space, against HBGrC on benchmark datasets selected from http://archive.ics.uci.edu/ml/. These datasets are among the most popular datasets since 2007; their characteristics and the corresponding performance are listed in Table 2 and Table 3.
For the parametric algorithm, in order to facilitate the selection of the granularity threshold, the R^N space is normalized into the [0, 1]^N space, and the granularity parameters of HBGrC are set between 0 and 0.5 with a step of 0.01 for the n-class classification problems.
A 10-fold cross-validation is used to evaluate the parametric and nonparametric classification algorithms. For each dataset, the nonparametric and parametric algorithms are performed for each fold, and the parametric algorithms are performed 51 times for each fold due to the selection of granularity threshold parameters.
The performances of the classification algorithms include the maximal testing accuracy, the mean testing accuracy, the minimal testing accuracy, and the standard deviation of the testing accuracies. The superiority of the algorithms is evaluated by the mean testing accuracies, and the stability of the algorithms is verified by the standard deviation of the testing accuracies, which are shown in Table 3. From Table 3, it can be seen that the NPHBGrC algorithms are superior to the HBGrC algorithms in terms of the maximum testing accuracy (max), the mean testing accuracy (mean), and the minimum testing accuracy (min). On the other hand, it can also be seen from Table 3 that the standard deviations of the 10-fold cross-validation by NPHBGrC are less than those of HBGrC, which shows that the NPHBGrC algorithms are more stable than the HBGrC algorithms.
The testing accuracies are the main evaluation indices for the classification algorithms. A t-test was used to statistically compare the testing accuracies achieved by the nonparametric and parametric algorithms. If h = 0, the testing accuracies achieved by NPHBGrC and HBGrC are not significantly different statistically; however, if h = 0 but p is relatively small and close to 0.05, we regard the achieved testing accuracies as significantly different. If h = 1, the testing accuracies achieved by NPHBGrC and HBGrC are significantly different, and the superiority of an algorithm can then be judged by its mean testing accuracy; conversely, if h = 1 but p is close to 0.05, we regard the achieved testing accuracies as not significantly different.
For the datasets Iris, Wine, Cancer1, Sensor4, and Cancer2, h = 0, as shown in Table 4. Statistically, the testing accuracies obtained by NPHBGrC and HBGrC show no significant difference according to the h values of the t-test listed in Table 4, while the testing accuracies of NPHBGrC are slightly higher than those of HBGrC in terms of the maximal, mean, and minimal testing accuracies listed in Table 3.
For the datasets Phoneme, Car, and Semeion, h = 1, as shown in Table 4. Statistically, the testing accuracies obtained by NPHBGrC and HBGrC are significantly different, and we determine which of the two is the better classification algorithm from the mean testing accuracies in Table 3. The NPHBGrC algorithms are better than the HBGrC algorithms, since the mean testing accuracies obtained by NPHBGrC are greater than those obtained by HBGrC, as shown in Table 3.
The computational complexities are evaluated by the time cost, including the training and testing time. Obviously, the NPHBGrC algorithms have lower computational complexities than HBGrC, owing to the redundant hyperbox granules produced by HBGrC and its parameter selection.

3.3. Classification for Imbalanced Datasets

For imbalanced datasets, a dataset called yeast, including 1484 data, was used to verify the performance of the proposed algorithm, where the positive data belong to the class NUC (class label 1 in this paper) and the negative data belong to the rest (class label 2 in this paper). The dataset can be downloaded from http://keel.es/. Five-fold cross-validation was used to evaluate the performance of NPHBGrC and HBGrC in terms of the testing accuracy and the class-based testing accuracies. The accuracies are listed in Table 5, and the histogram of the accuracies is shown in Figure 6. For the testing set, AC is the total accuracy, C1AC is the accuracy of the data with class label 1, and C2AC is the accuracy of the data with class label 2. For the five tests, named Test 1, Test 2, Test 3, Test 4, and Test 5, NPHBGrC achieved better total accuracies (AC) than HBGrC for the imbalanced class problem yeast. The geometric mean (GM) of the true rates, defined in [22], attempts to maximize the accuracy of each of the two classes with a good balance. From Table 5, it can be seen that the mean GM of NPHBGrC is 74.2023, which is superior to the mean GM of HBGrC (64.8344), to the fuzzy rule-based classification systems (69.66) of Fernández [25], and to the weighted extreme learning machine (73.19) of Akbulut [26].

4. Conclusions

Motivated by the computational complexity caused by redundant hyperbox granules, we presented NPHBGrC. A novel distance was introduced to measure the distance between two hyperbox granules and to determine the join process between two hyperbox granules. The feasibility and superiority of NPHBGrC were demonstrated on benchmark datasets in comparison with HBGrC. There is room for improvement in NPHBGrC, for example, regarding the overfitting problem and the effect of the data order on the classification accuracy. The purpose of using the distance in this paper was to determine the positional relationship between points and the hyperbox (such as points inside and outside the hyperbox). For the interval set and the fuzzy set, the operations between two granules were designed based on the fuzzy relation between two granules. For the fuzzy set, further research is needed to determine how the proposed distance between two granules can be used to design classification algorithms. For the classification of imbalanced datasets, the superiority of NPHBGrC was verified on the yeast dataset. In the future, the superiority and feasibility of GrC need to be verified using more metrics, such as the receiver operating characteristic (ROC) curve and the area under the curve (AUC), on more imbalanced datasets, and the computing theory of GrC needs further study for imbalanced datasets to achieve better performance.

Author Contributions

Conceptualization, H.L.; Methodology, H.L. and X.D.; Validation, H.L., X.D., and H.G.; Data Curation, X.D.; Writing—Original Draft Preparation, H.L.

Funding

This work was supported in part by the Henan Natural Science Foundation Project (182300410145, 182102210132).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ristin, M.; Guillaumin, M.; Gall, J.; Van Gool, L. Incremental Learning of Random Forests for Large-Scale Image Classification. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 38, 490–503. [Google Scholar] [CrossRef] [PubMed]
  2. Garro, B.A.; Rodríguez, K.; Vázquez, R.A. Classification of DNA microarrays using artificial neural networks and ABC algorithm. Appl. Soft Comput. 2016, 38, 548–560. [Google Scholar] [CrossRef]
  3. Yousef, A.; Charkari, N.M. A novel method based on physicochemical properties of amino acids and one class classification algorithm for disease gene identification. J. Biomed. Inform. 2015, 56, 300–306. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  4. Kaburlasos, V.G.; Kehagias, A. Fuzzy Inference System (FIS) Extensions Based on the Lattice Theory. IEEE Trans. Fuzzy Syst. 2014, 22, 531–546. [Google Scholar] [CrossRef]
  5. Kaburlasos, V.G.; Pachidis, T.P. A Lattice—Computing ensemble for reasoning based on formal fusion of disparate data types, and an industrial dispensing application. Inform. Fusion 2014, 16, 68–83. [Google Scholar] [CrossRef]
  6. Kaburlasos, V.G.; Papadakis, S.E.; Papakostas, G.A. Lattice Computing Extension of the FAM Neural Classifier for Human Facial Expression Recognition. IEEE Trans. Neural Netw. Learn. Syst. 2013, 24, 1526–1538. [Google Scholar] [CrossRef] [PubMed]
  7. Kaburlasos, V.G.; Papakostas, G.A. Learning Distributions of Image Features by Interactive Fuzzy Lattice Reasoning in Pattern Recognition Applications. IEEE Comput. Intell. Mag. 2015, 10, 42–51. [Google Scholar] [CrossRef]
  8. Liu, H.; Li, J.; Guo, H.; Liu, C. Interval analysis-based hyperbox granular computing classification algorithms. Iranian J. Fuzzy Sys. 2017, 14, 139–156. [Google Scholar]
  9. Guo, H.; Wang, W. Granular support vector machine: A review. Artif. Intell. Rev. 2017, 51, 19–32. [Google Scholar] [CrossRef]
  10. Wang, Q.; Nguyen, T.-T.; Huang, J.Z.; Nguyen, T.T. An efficient random forests algorithm for high dimensional data classification. Adv. Data Anal. Classif. 2018, 12, 953–972. [Google Scholar] [CrossRef]
  11. Kordos, M.; Rusiecki, A. Reducing noise impact on MLP training—Techniques and algorithms to provide noise-robustness in MLP network training. Soft Comput. 2016, 20, 49–65. [Google Scholar] [CrossRef]
  12. Zadeh, L.A. Some reflections on soft computing, granular computing and their roles in the conception, design and utilization of information/intelligent systems. Soft Comput. 1998, 2, 23–25. [Google Scholar] [CrossRef]
  13. Yao, Y.; She, Y. Rough set models in multigranulation spaces. Inform. Sci. 2016, 327, 40–56. [Google Scholar] [CrossRef]
  14. Yao, Y.; Zhao, L. A measurement theory view on the granularity of partitions. Inform. Sci. 2012, 213, 1–13. [Google Scholar] [CrossRef]
  15. Savchenko, A.V. Fast multi-class recognition of piecewise regular objects based on sequential three-way decisions and granular computing. Knowl. Based Syst. 2016, 91, 252–262. [Google Scholar] [CrossRef]
  16. Kerr-Wilson, J.; Pedrycz, W. Design of rule-based models through information granulation. Exp. Syst. Appl. 2016, 46, 274–285. [Google Scholar] [CrossRef]
  17. Bortolan, G.; Pedrycz, W. Hyperbox classifiers for arrhythmia classification. Kybernetes 2007, 36, 531–547. [Google Scholar] [CrossRef]
  18. Hu, X.; Pedrycz, W.; Wang, X. Comparative analysis of logic operators: A perspective of statistical testing and granular computing. Int. J. Approx. Reason. 2015, 66, 73–90. [Google Scholar] [CrossRef]
  19. Pedrycz, W. Granular fuzzy rule-based architectures: Pursuing analysis and design in the framework of granular computing. Intell. Decis. Tech. 2015, 9, 321–330. [Google Scholar] [CrossRef]
  20. Chen, Y.; Zhu, Q.; Wu, K.; Zhu, S.; Zeng, Z. A binary granule representation for uncertainty measures in rough set theory. J. Intell. Fuzzy Syst. 2015, 28, 867–878. [Google Scholar]
  21. Hońko, P. Association discovery from relational data via granular computing. Inform. Sci. 2013, 234, 136–149. [Google Scholar] [CrossRef]
  22. Eastwood, M.; Jayne, C. Evaluation of hyperbox neural network learning for classification. Neurocomputing 2014, 133, 249–257. [Google Scholar] [CrossRef]
  23. Sossa, H.; Guevara, E. Efficient training for dendrite morphological neural networks. Neurocomputing 2014, 131, 132–142. [Google Scholar] [CrossRef] [Green Version]
  24. Ripley, B.D. Pattern Recognition and Neural Networks; Cambridge University Press: Cambridge, UK, 1996. [Google Scholar]
  25. Fernández, A.; Garcia, S.; Del Jesus, M.J.; Herrera, F. A study of the behaviour of linguistic fuzzy rule-based classification systems in the framework of imbalanced data-sets. Fuzzy Sets Syst. 2008, 159, 2378–2398. [Google Scholar] [CrossRef]
  26. Akbulut, Y.; Şengür, A.; Guo, Y.; Smarandache, F. A Novel Neutrosophic Weighted Extreme Learning Machine for Imbalanced Data Set. Symmetry 2017, 9, 142. [Google Scholar] [CrossRef]
Figure 1. Join process of two hyperbox granules in 2-D space.
Figure 2. Distance between a point and a hyperbox granule in 2-D space.
Figure 3. The example of Algorithm 1. (a) The atomic hyperbox granule formed by x1; (b) the join hyperbox granule of G1 and the atomic hyperbox granule [7, 6, 7, 6]; (c) the join hyperbox granule of G1 and the atomic hyperbox granule [5, 9, 5, 9]; and (d) the granule set including three hyperbox granules with class label 1 (blue) and class label 2 (red).
Figure 4. Boundary produced by the nonparametric hyperbox granular computing classification algorithm (NPHBGrC) for the Ripley dataset.
Figure 5. Boundary produced by the hyperbox granular computing classification algorithm (HBGrC) for the Ripley dataset.
Figure 6. Histogram of the performance of NPHBGrC and HBGrC for the yeast dataset.
Table 1. The classification problems and their performances in 2-D space.

Dataset   #Tr    #Ts    Algorithms   Par.   Ng    TAC   AC      T(s)
Spiral    970    194    NPHBGrC      -      58    100   99.48   0.6864
                        HBGrC        0.08   161   100   99.48   1.6380
Ripley    250    1000   NPHBGrC      -      32    100   90.2    0.0625
                        HBGrC        0.27   67    96    90.1    0.1159
Sensor2   4487   569    NPHBGrC      -      4     100   99.47   1.0764
                        HBGrC        4      8     100   99.47   1.365
Table 2. The classification problems in N-dimensional (N-D) space.

Datasets   N     Classes   Samples
Iris       4     3         150
Wine       13    3         178
Phoneme    5     2         5404
Sensor4    4     4         5456
Car        6     5         1728
Cancer2    30    2         532
Semeion    256   10        1593
Table 3. The performances in N-D space.

Dataset    Algorithms   Testing Accuracy (%)                            T(s)
                        max        mean       min        std
Iris       NPHBGrC      100        98.6667    93.3333    3.4427    0.0265
           HBGrC        100        97.3333    93.3333    2.8109    1.1560
Wine       NPHBGrC      100        96.8750    93.7500    3.2940    0.0406
           HBGrC        100        96.2500    87.5000    4.3700    1.0140
Phoneme    NPHBGrC      91.6512    89.8236    88.3117    1.1098    22.4844
           HBGrC        87.5696    85.9350    83.1169    1.3704    422.3009
Cancer1    NPHBGrC      100        98.5075    95.5224    1.7234    0.9064
           HBGrC        100        97.6362    92.5373    2.6615    69.8214
Sensor4    NPHBGrC      100        99.4551    97.4217    0.8621    1.0670
           HBGrC        100        99.2157    96.6851    0.9944    71.8509
Car        NPHBGrC      97.6608    91.1445    81.8713    5.3834    8.7532
           HBGrC        94.7368    85.9593    77.7778    5.5027    1166.5
Cancer2    NPHBGrC      100        98.0769    92.3077    2.3985    0.4602
           HBGrC        100        97.4159    94.2308    1.9107    7.5676
Semeion    NPHBGrC      100        98.7512    97.4026    0.7177    6.7127
           HBGrC        97.4026    94.9881    92.2078    1.4397    533.2691
Table 4. The t-test values of the comparison of NPHBGrC and HBGrC.

Dataset    h-value   p-value
Iris       0         0.3553
Wine       0         0.7222
Phoneme    1         0
Cancer1    0         0.3963
Sensor4    0         0.5722
Car        1         0.0472
Cancer2    0         0.5041
Semeion    1         0
Table 5. Performance of NPHBGrC and HBGrC for the imbalanced dataset "yeast".

Tests    AC (%)                C1AC (%)              C2AC (%)              GM (%)
         NPHBGrC    HBGrC      NPHBGrC    HBGrC      NPHBGrC    HBGrC      NPHBGrC    HBGrC
Test 1   78.7879    76.4310    58.1395    54.6512    87.2038    85.3081    75.4026    68.2802
Test 2   74.0741    73.7374    48.8372    44.1860    84.3602    85.7820    72.8299    61.5659
Test 3   76.7677    74.0741    55.8140    54.6512    85.3081    81.9905    74.7530    66.9394
Test 4   74.0741    73.4007    51.1628    47.6744    83.4123    83.8863    74.7382    63.2395
Test 5   76.0135    74.6622    57.6471    48.2353    83.4123    85.3081    73.2875    64.1472
Mean     75.9435    74.4611    54.3201    49.8796    84.7393    84.4550    74.2023    64.8344
