Interval Intuitionistic Fuzzy Clustering Algorithm Based on Symmetric Information Entropy

Abstract: Based on the continuous optimal aggregation operator, a novel distance measure is proposed to deal with interval intuitionistic fuzzy clustering problems. The optimal ordered weighted intuitionistic fuzzy quasi-averaging (OOWIFQ) operator and the continuous OOWIFQ (C-OOWIFQ) operator are presented to aggregate all the values in an interval intuitionistic fuzzy number, and some of their desirable properties are studied. The OOWIFQ operator can describe the fuzzy state of things more realistically and present their fuzzy properties more accurately. Because expert opinions are important, the OOWIFQ operators take expert preferences into account to reduce systematic errors. To account for hesitation and to avoid distortion of information, we put forward a distance measure for interval intuitionistic fuzzy numbers based on symmetric information entropy. Based on the continuous OOWIFQ operator and the proposed distance measure, a new interval intuitionistic fuzzy clustering (IIFC) algorithm is proposed. An application to soil clustering shows the validity and practicability of the IIFC algorithm.

In the real world, the boundaries between many objective things are often vague. Categorizing such things is inevitably accompanied by ambiguity, which motivates fuzzy clustering analysis. Ruspini [27] put forward the concept of fuzzy partition. Many scholars have since proposed a variety of clustering methods, such as methods based on similarity relations [28] and fuzzy relations [29,30], the transitive closure method based on fuzzy equivalence relations [31,32], the maximum tree method based on fuzzy graph theory [33], and dynamic programming [34].
In recent years, considering the degree of hesitation between things, research on intuitionistic fuzzy sets and interval intuitionistic fuzzy sets has become a hot topic. Intuitionistic fuzzy sets and interval intuitionistic fuzzy sets can describe the ambiguous nature of the objective world more delicately. However, most interval intuitionistic fuzzy clustering algorithms consider only a single value in the interval, which causes information to be lost and distorted. In addition, these clustering algorithms do not take the preference of the decision maker into account, so the result may easily be inconsistent with the expected one. In the current literature there are few studies on interval intuitionistic fuzzy clustering algorithms, although such research has important practical significance. The proposed distance measure considers all the information in the continuous interval, so the distortion and loss of information are avoided. Aiming at the shortcomings of existing algorithms and combining the distance measure based on symmetric information entropy, an interval intuitionistic fuzzy clustering algorithm with preference is proposed. The algorithm considers not only the preference of the decision maker but also all the values in the interval. The algorithm is applied to soil clustering to provide guidance for scientific fertilization.
The rest of this paper is organized as follows. Section 2 introduces some basic knowledge of intuitionistic fuzzy sets and relevant aggregation operators. In Section 3, a continuous optimal aggregation operator based on Chi-squared deviation is proposed. In Section 4, the new distance measure based on symmetric information entropy is proposed. The interval intuitionistic fuzzy clustering (IIFC) algorithm and its application are analyzed in Section 5 and Section 6, respectively. This work is concluded in the last section.

Some Basic Concepts of Intuitionistic Fuzzy Set
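Intuitionistic fuzzy set theory was developed from fuzzy set theory by Atanassov [35]. An intuitionistic fuzzy set considers the membership, non-membership, and hesitation of the input data. Therefore, in practical applications, an intuitionistic fuzzy set has greater power to represent fuzzy and uncertain information than a fuzzy set. In the following, we briefly review some basic concepts of intuitionistic fuzzy sets and introduce the distance measure of intuitionistic fuzzy sets.

Definition 1. Let $X$ be a fixed set. An intuitionistic fuzzy set $\alpha$ on $X$ is defined as [35]:

$$\alpha = \{\langle x, \mu_\alpha(x), \nu_\alpha(x)\rangle \mid x \in X\},$$

where the functions $\mu_\alpha: X \to [0,1]$ and $\nu_\alpha: X \to [0,1]$ denote the membership degree and the non-membership degree of $x$ to $\alpha$, respectively, and satisfy $0 \le \mu_\alpha(x) + \nu_\alpha(x) \le 1$ for any $x \in X$. The third parameter of the intuitionistic fuzzy set $\alpha$ is $\pi_\alpha(x) = 1 - \mu_\alpha(x) - \nu_\alpha(x)$, called the hesitancy degree of $x$ to $\alpha$.

For two interval intuitionistic fuzzy numbers $\tilde a$ and $\tilde b$, the relationship between $\tilde a$ and $\tilde b$ is defined as in [36].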
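The basic quantities of Definition 1 can be illustrated with a minimal Python sketch. The class name `IFN` and its field names are hypothetical and chosen only for this illustration; the sketch assumes the standard constraint $\mu + \nu \le 1$ and the hesitancy degree $\pi = 1 - \mu - \nu$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IFN:
    """Illustrative intuitionistic fuzzy number (mu, nu); names are assumptions."""
    mu: float  # membership degree
    nu: float  # non-membership degree

    def __post_init__(self):
        # Both degrees must lie in [0, 1] and their sum must not exceed 1.
        if not (0.0 <= self.mu <= 1.0 and 0.0 <= self.nu <= 1.0):
            raise ValueError("mu and nu must lie in [0, 1]")
        if self.mu + self.nu > 1.0 + 1e-12:
            raise ValueError("mu + nu must not exceed 1")

    @property
    def hesitancy(self) -> float:
        # Hesitancy degree: pi = 1 - mu - nu
        return 1.0 - self.mu - self.nu


a = IFN(mu=0.5, nu=0.3)
print(a.hesitancy)
```

A pair such as `IFN(0.8, 0.5)` is rejected because its degrees sum to more than 1.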

Continuous Aggregation Operators
On a continuous interval, Yager [37] proposed a continuous ordered weighted average operator (C-OWA) based on the OWA operator as follows:

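Definition 4. A C-OWA operator is a mapping $F: \Gamma \to R^+$ associated with a BUM function $Q$, such that

$$F_Q([a,b]) = \int_0^1 \frac{dQ(y)}{dy}\,\bigl(b - y(b-a)\bigr)\,dy,$$

where $\Gamma$ is the set of all positive interval numbers and $R^+$ is the set of positive real numbers. $Q$ is the basic unit-interval monotonic (BUM) function and satisfies $Q(0) = 0$, $Q(1) = 1$, and $Q(x) \ge Q(y)$ whenever $x \ge y$.

Based on the idea of the geometric mean, Yager and Xu [38] proposed the continuous ordered weighted geometric (C-OWG) operator: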

Definition 5. A C-OWG operator is a mapping $G: \Gamma \to R^+$ associated with a BUM function $Q$, such that

$$G_Q([a,b]) = b\left(\frac{a}{b}\right)^{\int_0^1 \frac{dQ(y)}{dy}\, y\, dy}.$$

In 2008, Chen et al. [39] proposed the following continuous ordered weighted harmonic (C-OWH) operator based on the harmonic mean and the C-OWA operator:

Definition 6. A C-OWH operator is a mapping
In addition, for continuous interval numbers, Liu et al. [40] proposed the continuous quasi-ordered weighted averaging (C-QOWA) operator.
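To make the continuous operators above more concrete, the following Python sketch evaluates the C-OWA operator numerically. It assumes Yager's standard form $F_Q([a,b]) = \int_0^1 Q'(y)\,(b - y(b-a))\,dy$ and compares it with the equivalent closed form $\lambda b + (1-\lambda)a$, where $\lambda = \int_0^1 Q(y)\,dy$ is the attitudinal character. The function names and the choice $Q(y) = y^2$ are illustrative assumptions, not taken from the paper.

```python
def c_owa(a, b, Q, steps=10_000):
    """Approximate the C-OWA value of [a, b] with a Riemann-Stieltjes sum.

    Each slice contributes (Q(y1) - Q(y0)) * (b - y_mid * (b - a)),
    which approximates Q'(y) * (b - y * (b - a)) dy on that slice.
    """
    total = 0.0
    for i in range(steps):
        y0, y1 = i / steps, (i + 1) / steps
        y_mid = 0.5 * (y0 + y1)
        total += (Q(y1) - Q(y0)) * (b - y_mid * (b - a))
    return total


def attitudinal_character(Q, steps=10_000):
    """Approximate lambda = integral of Q over [0, 1] with the midpoint rule."""
    return sum(Q((i + 0.5) / steps) for i in range(steps)) / steps


def Q(y):
    # Example BUM function (an assumption): Q(0) = 0, Q(1) = 1, non-decreasing.
    return y ** 2


lam = attitudinal_character(Q)
print(c_owa(2.0, 5.0, Q), lam * 5.0 + (1.0 - lam) * 2.0)
```

The two printed values should agree up to the discretization error, which illustrates how the BUM function encodes the decision maker's preference between the interval endpoints.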

Definition 7. A C-QOWA operator is a mapping
Take the partial derivatives of $J$ with respect to $u$ and $v$, respectively.

In real life, data tend to be continuous rather than discrete. An intuitionistic fuzzy number considers only one value in a continuous interval, which may lose important information and bias the clustering result. To overcome this shortcoming, information fusion is needed over all values in the continuous interval. Therefore, we propose a continuous OOWIFQ (C-OOWIFQ) operator based on the OOWIFQ operator. Here, $\tilde I$ denotes the set of all interval intuitionistic fuzzy numbers.
Next, we introduce the derivation of Equation (13).

(a) When $f$ is strictly monotonically increasing, $\delta \ge 0$ and $\eta \ge 0$ hold. Since $f^{-1}$ is also strictly monotonically increasing, the corresponding expression follows.

(b) Conversely, when $f$ is strictly monotonically decreasing, the analogous expression is obtained.

According to Equation (11), the approximation can then be derived, where $w_j$ and $w_i$ are the associated weights of the ordered weighted average operator. Using the BUM function, the associated weights $w_j$ and $w_i$ are obtained as in [41].

The C-OOWIFQ operator has the following properties, which are proved as follows:

Property 1. For all strictly monotonic continuous function
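Proof. Consider the different cases of the function $f$. (a) When $f$ is strictly monotonically increasing, the above two inequalities can be further rewritten; since $f^{-1}$ is also strictly monotonically increasing, the required inequality holds, and the remaining case can be proved similarly. (b) When $f$ is strictly monotonically decreasing, $f^{-1}$ is also strictly monotonically decreasing, and the result follows similarly. Therefore, Property 1 is proved. □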

Proof.
(a) When $f$ is strictly monotonically increasing, the required inequality holds for any $u \in [0,1]$. Because $f^{-1}$ is also strictly monotonically increasing, the inequality carries over to the aggregated values, and the remaining components are handled by a similar argument.

(b) When $f$ is strictly monotonically decreasing, the corresponding inequality holds for any $u \in [0,1]$. Because $f^{-1}$ is also strictly monotonically decreasing, the same conclusion follows by a similar argument.

Combining the two cases gives the stated result, and the remaining equations are obtained in a similar way.

Moreover, (a) when $f$ is strictly monotonically increasing, $f^{-1}$ is strictly monotonically increasing as well, so the inequality is preserved, and the other component follows by a similar process. (b) When $f$ is strictly monotonically decreasing, $f^{-1}$ is also strictly monotonically decreasing, and the same conclusion is reached. In summary, Property 4 is proved. □

Through the above proofs, the C-OOWIFQ operator has desirable properties such as boundedness, monotonicity, identity, and monotonicity with respect to the function $Q$. The C-OOWIFQ operator can aggregate all points in a closed interval and can also take into account the preferences of experts, so that clustering based on the C-OOWIFQ operator can be analyzed more comprehensively and effectively. The proposed IIFC algorithm thereby improves the reliability of the analysis results.

Proof. By Property 1, the aggregated result is an intuitionistic fuzzy number. □
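The associated weights mentioned in the derivation of the C-OOWIFQ operator are generated from the BUM function $Q$. The following Python sketch assumes the commonly used rule $w_i = Q(i/n) - Q((i-1)/n)$; the function name `owa_weights` and the example $Q(y) = y^2$ are illustrative assumptions, and the exact expression used in [41] may differ.

```python
def owa_weights(Q, n):
    """Return n OWA weights w_i = Q(i/n) - Q((i-1)/n) for a BUM function Q.

    Because Q(0) = 0, Q(1) = 1 and Q is non-decreasing, the weights are
    non-negative and sum to 1.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    return [Q(i / n) - Q((i - 1) / n) for i in range(1, n + 1)]


weights = owa_weights(lambda y: y ** 2, 4)
print(weights, sum(weights))
```

A convex $Q$ such as $y^2$ places more weight on the later positions of the ordered arguments, which is one way an expert's attitude can enter the aggregation.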

Distance Measure of Interval Intuitionistic Fuzzy Numbers Based on Symmetric Information Entropy
An interval intuitionistic fuzzy number describes uncertainty in a fuzzy setting. Since entropy can effectively measure uncertainty, we propose an interval intuitionistic fuzzy distance measure based on symmetric information entropy, given by Equation (22), which can be abbreviated to Equation (23).

Let $\tilde a$, $\tilde b$, and $\tilde c$ be three interval intuitionistic fuzzy numbers. The distance measure $d$ must satisfy properties (a)-(d).

Proof. For the proposed distance measure, the four properties (a)-(d) should be satisfied, and the proof is as follows. The partial derivatives of Equation (23) are obtained first, and the remaining ones follow similarly. (c) According to Equation (23), the symmetry of the distance measure obviously holds.
In the same way, $d(\tilde b, \tilde c) \le d(\tilde a, \tilde c)$ also holds. □
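The general idea of a symmetric entropy-based distance can be illustrated with a short Python sketch. The formula below is a generic symmetric cross-entropy (Jensen-Shannon type) divergence over the triple $(\mu, \nu, \pi)$ of an intuitionistic fuzzy number; it is a stand-in chosen for illustration and is not claimed to be the paper's Equation (22) or (23). The function names are hypothetical.

```python
import math


def as_triple(mu, nu):
    """Return (mu, nu, pi) with pi = 1 - mu - nu, clamped at 0 against rounding."""
    return (mu, nu, max(0.0, 1.0 - mu - nu))


def _term(x, y):
    # x * ln(2x / (x + y)), using the convention 0 * ln(0) = 0
    if x == 0.0:
        return 0.0
    return x * math.log(2.0 * x / (x + y))


def symmetric_entropy_distance(p, q):
    """Symmetric cross-entropy between two triples (assumed stand-in formula)."""
    return sum(_term(x, y) + _term(y, x) for x, y in zip(p, q))


p = as_triple(0.6, 0.3)
q = as_triple(0.2, 0.5)
print(symmetric_entropy_distance(p, q), symmetric_entropy_distance(q, p))
```

This stand-in is non-negative, symmetric, and zero for identical arguments; whether it also satisfies the monotonicity property stated for the proposed measure is not examined here.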

Interval Intuitionistic Fuzzy Clustering Algorithm
For the multiple attribute clustering problem with interval intuitionistic fuzzy information, consider a set of $m$ objects. Based on the C-OOWIFQ operator and the symmetric information entropy based distance measure, we propose an interval intuitionistic fuzzy clustering (IIFC) algorithm as follows. The flow chart of the IIFC algorithm is illustrated in Figure 1.
Step 1 Select a strictly monotonic function $f$.
Step 2 Use the C-OOWIFQ operator to aggregate the interval intuitionistic fuzzy information into $\tilde B$.
Step 3 Select a random sample from $\tilde B$ as the initial clustering center $c_1$.
Step 4 First, calculate the shortest distance between each object and the existing clustering centers, denoted by $D(x)$; then calculate the probability that each object is selected as the next cluster center. Finally, select a cluster center according to the roulette method.
Step 5 Repeat Step 4 until $k$ cluster centers are selected.
Step 6 For each object in $\tilde B$, calculate its symmetric entropy distance to each of the $k$ cluster centers by Equation (23), and assign the object to the class of the cluster center with the smallest distance.
Step 7 For each category $c_i$, recalculate its cluster center.
Step 8 Repeat Steps 6 and 7 until the positions of the cluster centers no longer change.
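Steps 3 to 8 above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each object is represented as a tuple of floats, the selection probability in the roulette step is taken proportional to $D(x)$, the cluster center is updated as the component-wise mean of its members, and the distance is passed in as a parameter `dist`. The function name `iifc_sketch` and all parameter names are hypothetical.

```python
import random


def iifc_sketch(points, k, dist, seed=0, max_iter=100):
    """Cluster `points` (tuples of floats) into k groups; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]  # Step 3: random initial center
    while len(centers) < k:  # Steps 4-5: roulette selection of further centers
        # D(x): shortest distance from each object to the existing centers
        d = [min(dist(x, c) for c in centers) for x in points]
        if sum(d) == 0.0:
            # All objects coincide with a center; fall back to a uniform pick.
            centers.append(rng.choice(points))
        else:
            # Assumption: selection probability proportional to D(x).
            centers.append(rng.choices(points, weights=d, k=1)[0])
    labels = None
    for _ in range(max_iter):  # Step 8: iterate until nothing changes
        # Step 6: assign each object to its nearest center
        new_labels = [min(range(k), key=lambda j: dist(x, centers[j])) for x in points]
        if new_labels == labels:
            break  # unchanged assignments imply unchanged centers
        labels = new_labels
        for j in range(k):  # Step 7: recompute centers (assumed component-wise mean)
            members = [x for x, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


pts = [(0.0,), (0.1,), (0.9,), (1.0,)]
labels, centers = iifc_sketch(pts, k=2, dist=lambda a, b: abs(a[0] - b[0]))
print(labels, centers)
```

In the usage example a simple absolute difference stands in for the symmetric entropy distance; any non-negative distance function with the same signature could be substituted.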

Numerical Example
This article focuses on the theoretical research of the algorithm. The IIFC algorithm can be applied to many fields such as data mining, image segmentation, feature extraction, and soil attribute analysis. This paper verifies the feasibility and effectiveness of the IIFC algorithm through a clustering example. The soil composition can be measured by a conventional soil agrochemical analysis method. Assume that the following experimental sample attributes are selected: total nitrogen (TN), total phosphorus (TP), organic matter (OM), available nitrogen (AN), available phosphorus (AP), and available potassium (AK). Suppose that fifteen soil samples are considered, and the attribute values of these soil samples are given as interval intuitionistic fuzzy numbers, shown in Table 1.
Step 4. Randomly select a sample from $\tilde B$ as the initial cluster center No. 1.
Step 5. Calculate the shortest distance between each object and the existing clustering centers, denoted by $D(x)$; then calculate the probability that each object is selected as the next cluster center. A cluster center is selected according to the roulette method.
Step 6. Repeat Step 5 until $k$ cluster centers are selected. The initial clustering centers are shown in Table 3.
Step 7. For each object in $\tilde B$, the symmetric entropy distance to each of the $k$ cluster centers is calculated by Equation (22), and the object is assigned to the class of the cluster center with the smallest distance.
Step 8. For each category $c_i$, recalculate its cluster center. After three iterations, the cluster centers no longer change, and the final cluster centers are shown in Table 4. The soil samples were successfully divided into four categories: (a) very poor: 2, 5, 11; (b) poor: 4, 6, 12, 15; (c) fertile: 1, 3, 8, 10; (d) very fertile: 7, 9, 13, 14. The clustering results derived by the IIFC algorithm are illustrated in Figure 2. By analyzing the soil composition of the samples, effective and scientific fertilization guidance can be given for them.

Conclusions
In this paper, a new continuous optimal aggregation operator based on Chi-squared deviation is proposed, which can effectively convert an interval intuitionistic fuzzy number into an intuitionistic fuzzy number. The distance between interval intuitionistic fuzzy numbers is calculated by constructing a new distance measure based on symmetric information entropy. The C-OOWIFQ operator and the distance measure based on symmetric information entropy are applied to soil clustering. The main advantages of this paper are as follows:
(1) Compared with traditional clustering and fuzzy clustering, interval intuitionistic fuzzy clustering describes the fuzzy nature of things more delicately.
(2) The symmetric information entropy based distance measure considers all the information in the continuous interval. Thus, the distortion and loss of information are avoided, and the result is more accurate and effective.
(3) The C-OOWIFQ operator takes into account the preferences of decision makers.
In addition, the IIFC algorithm can effectively solve the problem of soil clustering. In a follow-up study, we will apply the distance measure and symmetric information entropy to pattern recognition, data mining, medical diagnosis, and other fields.

Conflicts of Interest:
The authors declare no conflict of interest.