NS-k-NN: Neutrosophic Set-Based k-Nearest Neighbors Classifier

k-nearest neighbors (k-NN) is a simple and efficient non-parametric supervised classifier. It determines the class label of an unknown sample from its k nearest neighbors in a stored training set, where the neighbors are found with a distance function. Although k-NN produces successful results, several extensions have been proposed to improve its precision. The neutrosophic set (NS) defines three memberships, namely T, I, and F, which denote the truth membership degree, the indeterminacy membership degree, and the falsity membership degree, respectively. In this paper, the NS memberships are adopted to improve the classification performance of the k-NN classifier. A new, straightforward k-NN approach is proposed based on NS theory. It calculates the NS memberships with a supervised neutrosophic c-means (NCM) algorithm. A final belonging membership U is calculated from the NS triples as U = T + I − F. A final voting scheme similar to that of fuzzy k-NN is used for class label determination. Extensive experiments on several toy and real-world datasets are conducted to evaluate the proposed method's performance. We further compare the proposed method with k-NN, fuzzy k-NN, and two weighted k-NN schemes. The results are encouraging: the proposed method achieved the best accuracy on 27 of 39 KEEL datasets and on eight of 11 UCI datasets.


Introduction
The k-nearest neighbors (k-NN) classifier, which is known as one of the oldest and simplest approaches, is a non-parametric supervised classifier [1,2]. It determines the class label of an unknown sample from its k nearest neighbors in a stored training set, where the neighbors are found with a distance function. Because of its simplicity, k-NN has been used in many data mining and pattern recognition applications, such as ventricular arrhythmia detection [3], bankruptcy prediction [4], diagnosis of diabetes diseases [5], human action recognition [6], and text categorization [7].
Although k-NN produces successful results, several extensions have been proposed to improve its precision. Fuzzy theory-based k-NN (fuzzy k-NN) has been among the most successful. Whereas k-NN assigns crisp memberships to the training data samples, fuzzy k-NN replaces them with a continuous range of memberships, which enhances class label determination. Keller et al. [8] incorporated fuzzy theory into the k-NN approach. The authors proposed three different methods for assigning fuzzy memberships to the labeled samples. After the fuzzy memberships were determined, a distance function was used to weight them for the final class label determination of the test sample. The membership assignment of the conventional fuzzy k-NN algorithm has the disadvantage that it depends on the choice of distance function. To alleviate this drawback, Pham et al. [9] proposed an optimally-weighted fuzzy k-NN approach, introducing a computational scheme for determining optimal weights that improved the efficiency of fuzzy k-NN. Denoeux et al. [10] proposed a k-NN method in which Dempster-Shafer theory was used to calculate the memberships of the training data samples. Each neighbor of a sample to be classified was considered an item of evidence, and the degree of support was defined as a function of the distance. The final class label assignment was handled by Dempster's rule of combination. Another evidential theory-based k-NN approach, denoted Ek-NN, was proposed by Zouhal et al. [11]. In addition to the belonging degree, the authors introduced an ignorant class to model the uncertainty. Zouhal et al. [12] then proposed a generalized Ek-NN approach, denoted FEk-NN, which adopted fuzzy theory to improve the Ek-NN classification performance. The motivation for FEk-NN arose from the fact that each training sample was considered to have some degree of membership to each class.
In addition, Liu et al. [13] proposed an evidential reasoning-based fuzzy-belief k-nearest neighbor (FBK-NN) classifier. In FBK-NN, each labeled sample was assigned a fuzzy membership to each class according to its neighborhood, and the test sample's class label was determined from the K basic belief assignments obtained from the distances between the object and its K nearest neighbors. A belief theory-based k-NN, denoted the BK-NN classifier, was introduced by Liu et al. [14]. The authors aimed to deal with uncertain data using the meta-class. Although the method produced successful results, its computational complexity and sensitivity to k make it inconvenient for many classification applications. Derrac et al. [15] proposed an evolutionary fuzzy k-NN approach in which interval-valued fuzzy sets were used. The authors defined not only a new membership function but also a new voting scheme. Dudani et al. [16] proposed a weighted voting method for k-NN called the distance-weighted k-NN (WKNN), in which closer neighbors were weighted more heavily than farther ones by a distance-weighted function. Gou et al. [17] proposed a distance-weighted k-NN (DWKNN) method in which a dual distance-weighted function was introduced. The method improved on traditional k-NN's performance by using a new method for selecting the k value.
Smarandache proposed neutrosophic theories in [18–21]. Neutrosophy was introduced as a new branch of philosophy that deals with the origin, nature, and scope of neutralities, and their interactions with different ideational spectra [19]. Neutrosophy is the basis for the neutrosophic set (NS), neutrosophic logic, neutrosophic probability, neutrosophic statistics, and so on. In NS theory, every event has not only a certain degree of truth, but also a falsity degree and an indeterminacy degree, which have to be considered independently of each other [20]. Thus, an event or entity {A} is considered together with its opposite {Anti-A} and its neutrality {Neut-A}. NS provides a powerful tool for dealing with indeterminacy. In this paper, a new straightforward k-NN approach based on NS theory is developed. We adopt the NS memberships to improve the classification performance of the k-NN classifier. To do so, the neutrosophic c-means (NCM) algorithm is applied in a supervised manner, where the labeled training data are used to obtain the cluster centers. A final belonging membership degree U is calculated from the NS triples as U = T + I − F. A final voting scheme similar to that of fuzzy k-NN is employed for class label determination.
The paper is organized as follows. The next section briefly reviews the theories of k-NN and fuzzy k-NN. Section 3 introduces the proposed method and its algorithm, which is tabulated in Table 1. Section 4 gives the experimental results and related comparisons. Section 5 concludes the paper.

k-Nearest Neighbor (k-NN) Classifier
As mentioned earlier, k-NN is a simple, popular, supervised, and non-parametric classification method that was proposed in 1951 [1]. It is a distance-based classifier that measures the similarity of the test data to the data samples stored in the training set. The test data is then labelled by a majority vote of its k nearest neighbors in the training set.
Let $X = \{x_1, x_2, \ldots, x_N\}$ denote the training set, where $x_i \in \mathbb{R}^n$ is a training data point in the n-dimensional feature space, and let $Y = \{y_1, y_2, \ldots, y_N\}$ denote the corresponding class labels. Given a test data point x whose class label is unknown, the label can be determined as follows:
• Calculate the similarity measures between the test sample and the training samples by using a distance function (e.g., Euclidean distance).
• Find the test sample's k nearest neighbors among the training data samples according to the similarity measure, and determine the class label by the majority vote of these nearest neighbors.
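The two steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the implementation used in the experiments; the function name `knn_predict` and the six-point toy data are assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    # Step 1: Euclidean distance from the test point to every training point
    dists = [(math.dist(xi, x), yi) for xi, yi in zip(train_X, train_y)]
    # Step 2: keep the labels of the k closest points and count them
    nearest = sorted(dists, key=lambda t: t[0])[:k]
    votes = Counter(label for _, label in nearest)
    # Ties are resolved in favor of the label encountered first (nearest first)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated classes in the plane
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(train_X, train_y, (0.15, 0.15), k=3))
```

Every neighbor contributes one equal vote here, which is the limitation that the fuzzy and neutrosophic variants discussed later try to address.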

Fuzzy k-Nearest Neighbor (k-NN) Classifier
In k-NN, a training data sample x is assumed to belong to exactly one of the given classes, so the membership U of that sample to each class in C is given by an array of values in {0, 1}. If training data sample x belongs to class $c_1$, then $U_{c_1}(x) = 1$ and $U_{c}(x) = 0$ for every other class c. In fuzzy k-NN, by contrast, a continuous range of memberships replaces the crisp ones, in line with fuzzy theory [8]. The membership of a training data sample is calculated from the class counts among its neighbors, where $k_{c_1}$ is the number of instances belonging to class $c_1$ found among the k neighbors of x, and k is an integer in the range [3, 9].
After the fuzzy memberships are calculated, a test sample's class label is determined as follows. The k nearest neighbors of the test sample are found via the Euclidean distance, and a vote is produced for each class and each neighbor from the Euclidean norm and the memberships. In this vote, $k_j$ denotes the jth nearest neighbor and m = 2 is a parameter. The votes of the neighbors are then added to obtain the final classification.
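The membership and voting formulas of this section are not reproduced in the text, so the sketch below falls back on the commonly cited initialization of Keller et al. [8], which is an assumption here: a training sample receives 0.51 + 0.49·k_c/k for its own class and 0.49·k_c/k for every other class, and the vote weights each neighbor by 1/d^(2/(m−1)). All function names and the toy data are hypothetical.

```python
import math

def neighbors(train_X, x, k, skip=None):
    """Indices of the k training points closest to x (optionally skipping one index)."""
    idx = [i for i in range(len(train_X)) if i != skip]
    return sorted(idx, key=lambda i: math.dist(train_X[i], x))[:k]

def init_memberships(train_X, train_y, classes, k):
    """Assumed Keller-style soft labels built from class counts among k neighbors."""
    U = []
    for i, (x, y) in enumerate(zip(train_X, train_y)):
        nn = neighbors(train_X, x, k, skip=i)  # a sample is not its own neighbor
        row = {}
        for c in classes:
            k_c = sum(1 for j in nn if train_y[j] == c)
            row[c] = 0.49 * k_c / k + (0.51 if c == y else 0.0)
        U.append(row)
    return U

def fuzzy_knn_predict(train_X, U, classes, x, k, m=2.0, eps=1e-12):
    """Distance-weighted sum of neighbor memberships; returns (label, votes)."""
    nn = neighbors(train_X, x, k)
    # Inverse-distance weights 1 / d^(2/(m-1)); eps guards against d = 0
    w = {j: 1.0 / (math.dist(train_X[j], x) ** (2.0 / (m - 1.0)) + eps) for j in nn}
    total = sum(w.values())
    votes = {c: sum(U[j][c] * w[j] for j in nn) / total for c in classes}
    return max(votes, key=votes.get), votes

# Hypothetical toy data
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = ["a", "a", "a", "b", "b", "b"]
U = init_memberships(train_X, train_y, ["a", "b"], k=3)
label, votes = fuzzy_knn_predict(train_X, U, ["a", "b"], (0.15, 0.15), k=3)
print(label, votes)
```

Under this assumed initialization each training sample's memberships sum to one, so the normalized votes for a test sample also sum to one.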

Proposed Neutrosophic-k-NN Classifier
Traditional k-NN suffers from assigning equal weights to class labels in the training dataset, so neutrosophic memberships are adopted in this work to overcome this limitation. Neutrosophic memberships reflect a data point's significance in its class, and these memberships can be used as a new procedure for the k-NN approach.
A neutrosophic set determines a sample's truth, falsity, and indeterminacy memberships. The neutrosophic c-means (NCM) algorithm, which is an unsupervised neutrosophic clustering algorithm, is used here in a supervised manner [22,23]. Crisp clustering methods assume that every data point belongs to exactly one cluster according to its nearness to the cluster centers. Fuzzy clustering methods assign fuzzy memberships to each data point according to its nearness to the cluster centers. Neutrosophic clustering assigns memberships (T, I, and F) to each data point not only according to its nearness to a cluster center, but also according to its nearness to the mean of two cluster centers. Readers may refer to [22] for detailed information about NCM clustering. As the labels of the training samples are known in supervised learning, the cluster centers can be calculated from the labeled samples. Then, the truth (T), indeterminacy (I), and falsity (F) memberships are calculated by Equations (3)-(5), where m is a constant, δ is a regularization parameter, and $c_j$ is the center of cluster j. For each point i, $c_{i\max}$ is the mean of the two cluster centers whose truth membership values are greater than the others. $T_{ij}$ is the truth membership value of point i for class j, $F_i$ is the falsity membership of point i, and $I_i$ is the indeterminacy membership value of point i. A larger $T_{ij}$ means that point i is near a cluster and less likely to be noise. A larger $I_i$ means that point i lies between two clusters, and a larger $F_i$ indicates that point i is likely to be noise. A final membership value for point i is obtained by adding the indeterminacy membership value to the truth membership value and subtracting the falsity membership value, as shown in Equation (6): $U_{ij} = T_{ij} + I_i - F_i$.
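The expressions for T, I, and F are not reproduced in this text, so the sketch below rests on an assumption: it uses the unweighted NCM-style form in which each membership is a power of a distance, with exponent −2/(m−1), divided by a shared normalizer, so that all T values of a point plus its I and F sum to one. Taking each cluster center as the mean of one class is also an assumption, as are the function names and toy data.

```python
import math

def class_centers(train_X, train_y):
    """Supervised 'clustering' (assumed): each center is the mean of one class."""
    groups = {}
    for x, y in zip(train_X, train_y):
        groups.setdefault(y, []).append(x)
    return {c: tuple(sum(col) / len(pts) for col in zip(*pts))
            for c, pts in groups.items()}

def ns_memberships(x, centers, m=2.0, delta=0.1, eps=1e-12):
    """Return (T, I, F) for one point; T is a dict keyed by class.

    Assumed form: T_ij ~ d(x, c_j)^p, I_i ~ d(x, c_imax)^p, F_i ~ delta^p,
    with p = -2/(m-1) and a common normalizer. Requires at least two classes.
    """
    p = -2.0 / (m - 1.0)
    dist = {c: math.dist(x, ctr) + eps for c, ctr in centers.items()}
    # c_imax: mean of the two centers with the largest truth membership,
    # which are the two centers closest to x
    c1, c2 = sorted(dist, key=dist.get)[:2]
    c_imax = tuple((a + b) / 2.0 for a, b in zip(centers[c1], centers[c2]))
    t_raw = {c: d ** p for c, d in dist.items()}
    i_raw = (math.dist(x, c_imax) + eps) ** p
    f_raw = delta ** p
    denom = sum(t_raw.values()) + i_raw + f_raw
    T = {c: v / denom for c, v in t_raw.items()}
    return T, i_raw / denom, f_raw / denom

def final_membership(T, I, F):
    """Equation (6): U_ij = T_ij + I_i - F_i."""
    return {c: t + I - F for c, t in T.items()}

# Hypothetical toy data
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = ["a", "a", "a", "b", "b", "b"]
centers = class_centers(train_X, train_y)
T, I, F = ns_memberships((0.1, 0.12), centers)
print(final_membership(T, I, F))
```

In this form a point midway between two class centers gets an indeterminacy close to one, and a small δ inflates the falsity term for every point, which is consistent with the roles the text assigns to I and F.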
After the neutrosophic membership triples are determined, the membership of an unknown sample $x_u$ to class label j is calculated by Equation (7) [9], where $d_i$ is the distance function that measures the distance between $x_i$ and $x_u$, k is the number of nearest neighbors, and q is an integer. After the neutrosophic membership grades of $x_u$ to all class labels are assigned, the neutrosophic k-NN assigns $x_u$ to the class whose neutrosophic membership is maximum. The following steps are used to construct the proposed NS-k-NN method:
Step 1: Initialize the cluster centers according to the labelled dataset and employ Equations (3)-(5) to calculate the T, I, and F values for each training data point.
Step 2: Compute the membership grades of the test data samples according to Equations (6) and (7).
Step 3: Assign each unknown test data point to the class whose neutrosophic membership is maximum.
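The three steps can be put together as a rough end-to-end sketch. Because Equations (3)-(5) and (7) are not reproduced in this text, the code assumes an unweighted NCM-style form for T, I, and F, class means as cluster centers, and a fuzzy k-NN style vote in which neighbor i contributes with weight 1/‖x_u − x_i‖^(2/(q−1)). The names `fit_ns_knn` and `predict_ns_knn`, the default parameter values, and the toy data are hypothetical.

```python
import math

def fit_ns_knn(train_X, train_y, m=2.0, delta=1.0, eps=1e-12):
    """Step 1: class centers from labels, then U_ij = T_ij + I_i - F_i per point."""
    groups = {}
    for x, y in zip(train_X, train_y):
        groups.setdefault(y, []).append(x)
    centers = {c: tuple(sum(col) / len(pts) for col in zip(*pts))
               for c, pts in groups.items()}
    p = -2.0 / (m - 1.0)
    U = []
    for x in train_X:
        dist = {c: math.dist(x, ctr) + eps for c, ctr in centers.items()}
        c1, c2 = sorted(dist, key=dist.get)[:2]  # two closest centers (needs >= 2 classes)
        c_imax = tuple((a + b) / 2.0 for a, b in zip(centers[c1], centers[c2]))
        t_raw = {c: d ** p for c, d in dist.items()}
        i_raw = (math.dist(x, c_imax) + eps) ** p
        f_raw = delta ** p
        denom = sum(t_raw.values()) + i_raw + f_raw
        # Equation (6) applied to the normalized memberships
        U.append({c: (t + i_raw - f_raw) / denom for c, t in t_raw.items()})
    return U

def predict_ns_knn(train_X, U, x_u, k=3, q=2.0, eps=1e-12):
    """Steps 2-3: distance-weighted average of neighbor memberships, then argmax."""
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(train_X[i], x_u))[:k]
    # Assumed weight d_i = 1 / ||x_u - x_i||^(2/(q-1)); eps guards against zero distance
    w = {i: 1.0 / (math.dist(train_X[i], x_u) ** (2.0 / (q - 1.0)) + eps) for i in order}
    total = sum(w.values())
    mu = {c: sum(w[i] * U[i][c] for i in order) / total for c in U[0]}
    return max(mu, key=mu.get), mu

# Hypothetical toy data
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = ["a", "a", "a", "b", "b", "b"]
U = fit_ns_knn(train_X, train_y)
print(predict_ns_knn(train_X, U, (0.15, 0.15), k=3)[0])
print(predict_ns_knn(train_X, U, (1.0, 0.95), k=3)[0])
```

Since I and F are shared by all classes of a given training point, the ranking of classes for a single neighbor is driven by T; the I − F offset changes how much each neighbor contributes once the weighted average is taken.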

Experimental Works
The efficiency of the proposed method was evaluated with several toy and real datasets. Two toy datasets were used to test the proposed method and to investigate the effect of parameter changes on classification accuracy. Several real datasets were used to compare the proposed method with the traditional k-NN and fuzzy k-NN methods. We further compared the proposed method with weighted k-NN methods, namely weighted k-NN (WKNN) and distance-weighted k-nearest neighbor (DWKNN).
The toy datasets used in the experiments are shown in Figure 1a,b. Both toy datasets contain two-dimensional data with four classes. A randomly selected half of each toy dataset was used for training, and the other half was used for testing. The k value was chosen to be 5, 10, and 15, and the δ parameter was chosen to be 0.01, 0.1, and 1, respectively. The obtained results are shown in Figure 2. As seen in the first row of Figure 2, the proposed method obtained 100% classification accuracy with k = 10 and δ = 0.01 for both toy datasets. However, 100% correct classification was not obtained for the other parameter values, as shown in the second and third rows of Figure 2. This shows that the proposed method needs a parameter tuning mechanism in the k vs. δ space. So, k was set to an integer value in [2, 15], and the δ parameter was searched on $\{2^{-10}, 2^{-8}, \ldots, 2^{8}, 2^{10}\}$.
Symmetry 2017, 9, 179
We conducted further experiments on 39 real-world datasets, which were downloaded from the KEEL dataset repository [24]. Each dataset was already partitioned according to the cross-validation procedure (five folds or 10 folds). Table 1 shows several characteristics of each dataset, such as the number of samples, number of features, and number of classes. All feature values were normalized to [−1, 1], and a five-fold cross-validation procedure was adopted in all experiments. The accuracies were calculated as the ratio of the number of correctly classified samples to the total number of samples.
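The tuning grid and evaluation conventions described above can be written down compactly. The sketch below is illustrative only: the helper names are hypothetical, and `score` stands for any user-supplied function that returns a validation accuracy for a given (k, δ) pair, for example the mean accuracy over the five folds.

```python
def normalize(column):
    """Linearly rescale one feature column to [-1, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant feature: map to the midpoint
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in column]

def accuracy(y_true, y_pred):
    """Ratio of correctly classified samples to the total number of samples."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Search space from the text: k in [2, 15], delta in {2^-10, 2^-8, ..., 2^8, 2^10}
K_VALUES = list(range(2, 16))
DELTA_VALUES = [2.0 ** e for e in range(-10, 11, 2)]

def grid_search(score):
    """Return the (k, delta) pair that maximizes the hypothetical score(k, delta)."""
    grid = ((k, d) for k in K_VALUES for d in DELTA_VALUES)
    return max(grid, key=lambda kd: score(*kd))

print(len(K_VALUES) * len(DELTA_VALUES), "parameter combinations")
print(normalize([0, 5, 10]))
```

An exhaustive search over this grid evaluates 154 combinations per dataset, which stays cheap as long as a single evaluation is fast.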
Table 1. Characteristics of the KEEL datasets.
Dataset             Samples   Features   Classes
Balance             625       4          3
Banana              5300      2          2
Bands               365       19         2
Bupa                345       6          2
Cleveland           297       13         5
Dermatology         358       34         6
Ecoli               336       7          8
Glass               214       9          7
Haberman            306       3          2
Hayes-roth          160       4          3
Heart               270       13         2
Hepatitis           80        19         2
Ionosphere          351       33         2
Iris                150       4          3
Mammographic        830       5          2
Monk-2              432       6          2
Movement            360       90         15
New thyroid         215       5          3
Phoneme             5404      5          2
Pima                768       8          2
Ring                7400      20         2
Satimage            6435      36         7
Segment             2310      19         7
Sonar               208       60         2
Spectfheart         267       44         2
Tae                 151       5          3
Texture             5500      40         11
Thyroid             7200      21         3
Twonorm             7400      20         2
Vehicle             846       18         4
Vowel               990       13         11
Wdbc                569       30         2
Wine                178       13         3
Winequality-red     1599      11         11
Winequality-white   4898      11         11
Yeast               1484      8

We also compared our results with those of k-NN and fuzzy k-NN on the same datasets. The obtained results are tabulated in Table 2, where the best results are indicated in boldface. As seen in Table 2, the proposed method performed better than the other methods on 27 of the 39 datasets, whereas k-NN and fuzzy k-NN performed better on six and seven datasets, respectively. Our proposal obtained 100% accuracy on two datasets (new thyroid and wine). Moreover, for 13 datasets, the proposed method obtained accuracy values higher than 90%. On the other hand, the worst result was recorded for the "Wine quality-white" dataset, where the accuracy was 33.33%. In total, there were three datasets where the accuracy was lower than 50%.
We conducted further experiments on several datasets from the UCI data repository [25]. In total, 11 datasets were considered in these experiments, and the results were compared with those of two weighted k-NN approaches, namely WKNN and DWKNN. The characteristics of each UCI dataset are shown in Table 3, and all of the obtained results are tabulated in Table 4. The boldface in Table 4 indicates the highest accuracy value for each dataset.
Table 3. Characteristics of the UCI datasets.
Dataset     Features   Samples   Classes   Training samples   Testing samples
Glass       10         214       7         140                74
Wine        13         178       3         100                78
Sonar       60         208       2         120                88
Parkinson   22         195       2         120                75
Iono        34         351       2         200                151
Musk        166        476       2         276                200
Vehicle     18         846       4         500                346
Image       19         2310      7         1310               1000
Cardio      21         2126      10        1126               1000
Landsat     36         6435      7         3435               3000
Letter      16         20,000    26        10,000             10,000

As seen in Table 4, the proposed method performed better than the other methods on eight of the 11 datasets, and DWKNN performed better on the remaining datasets. For three datasets (Parkinson, Image, and Landsat), the proposed method yielded accuracy values higher than 90%, and its worst result was found for the 'Glass' dataset, where the accuracy was 60.81%. DWKNN and WKNN produced almost the same accuracy values and performed significantly better than the proposed method on the 'Letter' and 'Glass' datasets.
We further compared the running times of each method on each KEEL dataset; the obtained running times are tabulated in Table 5. We used MATLAB 2014b (The MathWorks Inc., Natick, MA, USA) on a computer with an Intel Core i7-4810 CPU and 32 GB of memory. As seen in Table 5, for some datasets, the k-NN and fuzzy k-NN methods achieved lower running times than our proposal. However, when the average running times are taken into consideration, the proposed method achieved the lowest running time, 0.69 s. The k-NN method obtained the second lowest average running time, 1.41 s, and the fuzzy k-NN approach was the slowest on average, at 3.17 s.
Generally speaking, the proposed NS-k-NN method can be considered successful in view of the accuracy values tabulated in Tables 2 and 4. The NS-k-NN method obtained these high accuracies because it incorporated NS theory with distance learning to construct an efficient supervised classifier. The running-time evaluation also showed that NS-k-NN was, on average, more efficient than the other classifiers compared.

Conclusions
In this paper, we propose a novel supervised classification method based on NS theory, called neutrosophic k-NN (NS-k-NN). The proposed method assigns memberships to the training samples with the supervised NCM clustering algorithm and classifies the test samples based on their neutrosophic memberships. This approach can be seen as an extension of the previously proposed fuzzy k-NN method that incorporates the falsity and indeterminacy sets. The efficiency of the proposed method was demonstrated with extensive experimental results, which were also compared with those of other improved k-NN methods. According to the obtained results, the proposed method can be used in various classification applications. In future work, we plan to apply the proposed NS-k-NN to imbalanced dataset problems. We would like to analyze the experimental results with non-parametric statistical methods, such as the Friedman test and the Wilcoxon signed-ranks test. In addition, other evaluation metrics such as AUC will be used for comparison purposes. We will also explore a k-NN method in which Dezert-Smarandache theory is used to calculate the data samples' memberships, replacing Dempster's rule with the Proportional Conflict Redistribution Rule #5 (PCR5), which performs better when handling the assignment of the final class.