Relative Knowledge Distance Measure of Intuitionistic Fuzzy Concept

Abstract: Knowledge distance is used to measure the difference between granular spaces; it is an uncertainty measure with strong distinguishing ability in rough sets. However, existing knowledge distances fail to take the relative difference between granular spaces into account under a given perspective of uncertain concepts. To solve this problem, this paper studies the relative knowledge distance of the intuitionistic fuzzy concept (IFC). Firstly, a micro-knowledge distance (md) based on information entropy is proposed to measure the difference between intuitionistic fuzzy information granules. Then, based on md, a macro-knowledge distance (MD) with strong distinguishing ability is constructed, and it is revealed that MD varies monotonically as the granularity becomes finer in multi-granularity spaces. Furthermore, the relative MD is proposed to analyze the relative differences between granular spaces from multiple perspectives. Finally, the effectiveness of the relative MD is verified by relevant experiments, which show that it successfully measures the differences between granular spaces from multiple perspectives. Compared with other attribute reduction algorithms, the size of the attribute subset produced by our algorithm is moderate, and its mean-square error is appropriate.


Introduction
Granular computing (GrC) [1][2][3][4] is a new type of computing that solves problems by simulating the cognitive mechanisms of humans. The information granule is the fundamental element in GrC for constructing granular spaces. A granular space consists of several information granules and their relationships, while a granular structure consists of many granular spaces and their relationships. By fusing the structure and optimization approach of granularity, Pedrycz [2] introduced the notion of justifiable granularity. Yao [5,6] examined the two fields of three-way decision and GrC, as well as their interplay. Wang [7,8] reviewed GrC work from three aspects: granularity optimization, granularity switching, and multi-granulation computing.
As the main GrC model, rough set [9] is a useful tool for handling uncertain knowledge by utilizing existing information granules. Uncertainty measure is a crucial tool for data analysis in a rough set. Wang [10] introduced a series of uncertainty measures for selecting the optimal features effectively. Li [11] offered the axiom definition of uncertainty measure for covering information systems by using its information structures. Sun [12] investigated the fuzzy neighborhood multigranulation rough set model to construct uncertainty measures. In generalized rough set models, Wang [13] described new uncertainty measures from the perspectives of the upper and lower approximations. Nevertheless, these uncertainty measures struggle to distinguish the differences between granular spaces when they possess the same uncertainty. To address this issue, Qian [14,15] first introduced the concept of knowledge distance, and there have been several works on knowledge distance in recent years. Li [16] proposed an interval-valued intuitionistic fuzzy set to describe fuzzy granular structure distance, and proved that knowledge distance is a special form of intuitionistic fuzzy granular structure distance. Yang [17,18] proposed a partition-based knowledge distance based on the Earth Mover's Distance and further established the fuzzy knowledge distance. Chen [19] presented a new measure formula of knowledge distance by using Jaccard distance to replace set similarity. To measure the uncertainty derived from the disparities between local upper and lower approximation sets, Xia [20] introduced the local knowledge distance.
In practical applications, the target concept may be vague or uncertain. As a classical soft computing tool, the intuitionistic fuzzy set [21] extends the membership from a single value to an interval value. For uncertain information, the intuitionistic fuzzy set is more expressive than the fuzzy set [22], and it is currently applied extensively in fields such as decision-making [23][24][25], pattern recognition [26,27], control and reasoning [28,29], and fuzzy reasoning [30,31]. In rough set theory, an intuitionistic fuzzy concept (IFC) can be characterized by a pair of lower and upper approximation fuzzy sets. There are many research works [32][33][34][35][36][37] on the combination of rough sets and intuitionistic fuzzy sets. In particular, the uncertainty measure of an IFC in granular spaces becomes a basic issue. A novel concept of an intuitionistic fuzzy rough set based on two universes was proposed by Zhang [32], along with a specification of the associated operators. On the basis of the rough set, Dubey [35] presented an intuitionistic fuzzy c-means clustering algorithm and applied it to the segmentation of magnetic resonance brain images. Zheng [36] proposed an improved roughness method to measure the uncertainty of covering-based rough intuitionistic fuzzy sets. These works indicate that intuitionistic fuzzy sets and rough sets are suitable mathematical methods for studying vagueness and uncertainty. However, current uncertainty measures fail to distinguish different rough granular spaces with the same uncertainty when they are used to describe an IFC; that is, they cannot reflect the differences between those spaces. In some situations, such as attribute reduction or granularity selection, it is nevertheless necessary to distinguish the different rough granular spaces describing an IFC.
To solve this problem, based on our previous works [17,18], two-layer knowledge distance measures, namely the micro-knowledge distance (md) and the macro-knowledge distance (MD), are constructed to reflect the difference between granular spaces for describing an IFC. Finally, in order to analyze the relative differences between rough granular spaces under certain prior granular spaces, the concept of relative MD, applied to data analysis, is also proposed.
The following are the main contributions of our paper: (1) Based on information entropy, md is designed to measure the difference among intuitionistic fuzzy information granules. (2) On the basis of md, MD with strong distinguishing ability is further constructed, which can calculate the difference between rough granular spaces for describing an IFC. (3) The relative MD is proposed to analyze the relative difference between two rough granular spaces from multiple perspectives. (4) An algorithm of attribute reduction based on MD or relative MD is presented, and its effectiveness is verified by relevant experiments.
The rest of this paper is arranged as follows. Section 2 introduces related preliminary concepts. In Section 3, the two types of information-entropy-based distance measures (md and MD) are presented. Section 4 presents the concept of relative MD. The relevant experiments are reported in Section 5. Finally, in Section 6, conclusions are drawn.

Preliminaries
This part reviews some core concepts. Let S = (U, C ∪ D, V, f) be an information system, where U, C, D and V represent the universe of discourse, the condition attribute set, the decision attribute set and the set of attribute values of the objects, respectively, and f : U × C → V is an information function that specifies the attribute value of each object x in U.

Definition 1 (Intuitionistic fuzzy set).
Assume that U is the universe of discourse. An intuitionistic fuzzy set I on U is defined as

I = {⟨x, γ_I(x), υ_I(x)⟩ | x ∈ U},

where γ_I(x) and υ_I(x) denote the degree of membership and the degree of non-membership of x in I, respectively, both taking values in the interval [0, 1] and satisfying the condition 0 ≤ γ_I(x) + υ_I(x) ≤ 1 for all x ∈ U.
Note: For convenience, all I below are represented as intuitionistic fuzzy sets on U.
Definition 2 (Average step intuitionistic fuzzy set [38]). Assume that in S = (U, C ∪ D), R ⊆ C induces the granular space U/R. The intuitionistic fuzzy set obtained by assigning to every object of each granule in U/R the average membership and non-membership degrees over that granule is referred to as an average step intuitionistic fuzzy set on U/R.
As is well known, information entropy has been proposed as an uncertainty measure in rough set theory. Let U be a nonempty universe and I be an intuitionistic fuzzy set on U. The information entropy of I can be expressed as follows:
(1) When U is continuous, E(I) = −∫ μ log2 μ dμ;
(2) When U is discrete, the integral is replaced by the corresponding sum over the objects x_i ∈ U,
where μ denotes the membership degree of x_i in the intuitionistic fuzzy set I.
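To make the discrete case concrete, here is a minimal Python sketch that averages the per-object term −μ log2 μ over U. This particular normalization is an assumption made for illustration; the exact form of the paper's Formula (1) may differ.

```python
import math

def if_entropy(memberships):
    # Entropy contribution of each object: -mu * log2(mu), taking the
    # limit value 0 when mu == 0, averaged over the universe U.
    # (Assumed simplification of the paper's Formula (1).)
    def term(mu):
        return 0.0 if mu == 0.0 else -mu * math.log2(mu)
    return sum(term(mu) for mu in memberships) / len(memberships)

# A maximally uncertain set (all memberships 0.5) has entropy 0.5,
# while a crisp set (memberships 0 or 1) has entropy 0.
print(if_entropy([0.5, 0.5, 0.5, 0.5]))  # 0.5
print(if_entropy([1.0, 0.0, 1.0, 0.0]))  # 0.0
```

The two printed values illustrate the usual behavior of entropy as an uncertainty measure: it is largest for maximal vagueness and vanishes for crisp sets.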
To measure the information entropy of the rough granular space U/R of the IFC, this paper further proposes the definition of average information entropy: when U is continuous, the average information entropy of the rough granular space of I is obtained by integrating the per-object entropy term e_I^R(x) over U; when U is discrete, it is denoted E_Ī and obtained by averaging e_I^R(x) over the objects of U.

Definition 3 (Distance measure [39]). Assume that U is the universe of discourse and that Y, P and Q are three finite sets on U. When d(·, ·) meets the following criteria, it is considered a distance measure: (1) positivity: d(P, Q) ≥ 0; (2) symmetry: d(P, Q) = d(Q, P); (3) triangle inequality: d(Y, Q) ≤ d(Y, P) + d(P, Q).

Definition 4 (Granularity measure [40]). Assume that in S = (U, C ∪ D), G is a mapping from the power set of C to the real numbers. For any R1, R2 ⊆ C, when G meets the corresponding criteria (in particular, G is non-negative and does not increase as the granular space becomes finer), it is considered a granularity measure.

Definition 5 (Information measure [41]). Assume that in S = (U, C ∪ D), H is a mapping from the power set of C to the real numbers. For any R1, R2 ⊆ C, when H meets the corresponding criteria (in particular, H is non-negative and does not decrease as the granular space becomes finer), it is considered an information measure.
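The axioms of Definition 3 can be checked mechanically. The sketch below does so for the Jaccard distance on finite sets (the set-similarity replacement used by Chen [19]); the helper name and the sample sets are illustrative only.

```python
from itertools import combinations

def jaccard_distance(p, q):
    # d(P, Q) = 1 - |P ∩ Q| / |P ∪ Q|; the distance of two empty sets is 0.
    union = p | q
    return 0.0 if not union else 1.0 - len(p & q) / len(union)

sets = [frozenset(s) for s in ({1, 2}, {2, 3}, {1, 3, 4}, {2, 4})]

# Axioms of Definition 3: positivity, symmetry, triangle inequality.
for p, q in combinations(sets, 2):
    assert jaccard_distance(p, q) >= 0.0
    assert jaccard_distance(p, q) == jaccard_distance(q, p)
for y, p, q in combinations(sets, 3):
    assert jaccard_distance(y, q) <= (jaccard_distance(y, p)
                                      + jaccard_distance(p, q) + 1e-12)
print("Jaccard distance satisfies Definition 3 on these sets")
```

The same checks can be reused for any candidate md to confirm it is a distance measure in the sense of Definition 3.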

Information-Entropy-Based Two-Layer Knowledge Distance Measure
Although there are many research works [42][43][44][45] on distance measures between intuitionistic fuzzy sets from different perspectives, when an IFC is characterized by different rough granular spaces, the existing fuzzy set distance measures fail to capture the differences between these granular spaces. In addition, as explained in Section 1, when an IFC is described by two granular spaces, the measured result (fuzziness or information entropy) may be the same. Nevertheless, this does not mean that these two granular spaces are identical, and the difference between them in characterizing an IFC cannot be reflected. To tackle the difficulties listed above, this paper proposes the micro-knowledge distance and the macro-knowledge distance based on information entropy, which together constitute the two-layer knowledge distance measure of this section.
Formula (2) shows that the average information entropy does not necessarily distinguish two different rough granular spaces. Although the average information entropy values of U/R1 and U/R2 are the same, U/R2 is superior to U/R1 in terms of granularity selection, since U/R2 has a coarser granularity and a stronger generalization ability for describing the IFC.
Assume that in S = (U, C ∪ D), A is a finite set on U. Then, the intuitionistic fuzzy set generated by A is called the intuitionistic fuzzy information granule of A, denoted IFG_A.

Definition 6 (Micro-knowledge distance). Assume that in S = (U, C ∪ D), IFG_P and IFG_Q are two intuitionistic fuzzy information granules on U. Then, the md formula is defined as follows.

Theorem 1. md(·, ·) is a distance measure.

Proof of Theorem 1. Let IFG_Y, IFG_P and IFG_Q be three intuitionistic fuzzy information granules. Then md(Y, P) + md(P, Q) ≥ md(Y, Q). According to Definition 3, conditions (1) and (2) are obviously satisfied. Therefore, md(·, ·) is a distance measure.
For example, assume that I is an intuitionistic fuzzy set on U, and that A = {x1, x3, x4, x6} and B = {x3, x4, x5, x7} are two finite sets on U; the md between the intuitionistic fuzzy information granules generated by A and B can then be computed by Definition 6.

Theorem 2. Let Y, P and Q be three intuitionistic fuzzy sets on U. If Y ⊆ P ⊆ Q, then md(Y, P) ≤ md(Y, Q).
Proof of Theorem 2. Because Y ⊆ P ⊆ Q, the inequality follows directly from Definition 6.

Theorem 3. Let Y, P and Q be three intuitionistic fuzzy sets on U. If Y ⊆ P ⊆ Q, then md(Y, Q) = md(Y, P) + md(P, Q).

Proof of Theorem 3. Theorem 3 obviously holds.
Based on md, this paper further constructs MD, formulated as follows, to express the difference between two rough granular spaces characterizing an IFC.
Definition 7 (Macro-knowledge distance). Assume that in S = (U, C ∪ D), R1, R2 ⊆ C, and U/R1 = {g1, g2, ..., gn} and U/R2 = {g'1, g'2, ..., g'm} are two granular spaces induced by R1 and R2, respectively. Then, the MD between U/R1 and U/R2 is defined as follows:
where md_ij = md(g_i, g'_j) and f_ij = g_i ∩ g'_j. Figure 1 shows the relationship between md and MD.
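Reading the definition as an intersection-weighted aggregation of pairwise md values (an assumption consistent with md_ij = md(g_i, g'_j) and f_ij = g_i ∩ g'_j), the computation can be sketched as follows, with a Jaccard-style distance standing in for the entropy-based md. The sketch is illustrative, not the paper's exact Formula (3).

```python
def macro_distance(space1, space2, md, n):
    # MD(U/R1, U/R2): pairwise md(g_i, g'_j) weighted by |g_i ∩ g'_j| / |U|.
    # Pairs with empty intersection contribute nothing (f_ij = ∅).
    return sum(len(g & h) * md(g, h) for g in space1 for h in space2) / n

def jaccard_md(p, q):
    # Stand-in micro-distance: 1 - |P ∩ Q| / |P ∪ Q| (an assumption).
    return 1.0 - len(p & q) / len(p | q)

U = {1, 2, 3, 4}
fine = [frozenset({1}), frozenset({2}), frozenset({3, 4})]
coarse = [frozenset({1, 2}), frozenset({3, 4})]

print(macro_distance(coarse, coarse, jaccard_md, len(U)))  # 0.0: identical spaces
print(macro_distance(coarse, fine, jaccard_md, len(U)))    # 0.25
```

Note that only intersecting granule pairs contribute, matching the role of f_ij in the definition, and that the distance between a granular space and itself is zero.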
In fact, md measures the difference between two sets, while MD measures the difference between two rough granular spaces by integrating the md values over all pairs of sets from the two spaces. According to Theorem 1, Theorem 4 and Formula (4), as long as the md used in MD is a distance measure, MD is a distance measure.
Proof of Theorem 6. Suppose that R1, R2 ⊆ C; the result follows by the same argument as Theorem 5.

Proof of Theorem 8. For simplicity, based on the proof of Theorem 5 (and similarly for the second granular space), because g1 = g'1 ∪ g'2, g'1 = g''1 ∪ g''2 and g'2 = g''3, the conclusion follows from Theorem 3.

According to Theorem 8, from the perspective of distance, the granular spaces in a hierarchical granular structure are linearly additive, which is explained intuitively by Figure 2. Moreover, the following corollaries hold, including Corollary 5, whose proof follows directly.
From Corollary 3 and Theorem 6, for an IFC, the larger the granularity difference between granular spaces in a hierarchical granular structure, the larger the MD between them. From Corollary 4 and Theorem 7, for an IFC, the larger the information difference between granular spaces in a hierarchical granular structure, the larger the MD between them. From Corollary 5, the larger the information measure, the smaller the granularity measure, and either measure value can be deduced from the other.
Note: By using a suitable md in Formula (3), the method of this paper can be extended to quantify the difference between any type of granular spaces. The specifics are outside the scope of this paper.

Relative Macro-Knowledge Distance
Section 3 constructed an MD based on md, which describes the difference between two rough granular spaces of an IFC; we regard this knowledge distance as absolute. In data analysis, however, some conditions are often known in advance, and it is then necessary to analyze the differences between rough granular spaces under different prior granular spaces. Inspired by Wang [46], this section proposes the concept of relative MD and analyzes its properties.
Definition 8 (Relative macro-knowledge distance). Assume that in S = (U, C ∪ D), R1, R2 ⊆ C, U/R is the prior granular space on U, and U/R1 = {g1, g2, · · · , gn} and U/R2 = {g'1, g'2, · · · , g'm} are two granular spaces induced by R1 and R2, respectively. Then, the relative MD of U/R1 and U/R2 under U/R is defined as follows. Based on the original MD, this definition adds the prior granular space U/R, which reflects the relative differences between two rough granular spaces from different perspectives.

Theorem 9. RMD(·, ·/·) is a distance measure.
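One hedged way to realize this definition in code is to restrict both granular spaces to each prior granule and aggregate the restricted MDs weighted by granule size. This construction is an assumption, but it reproduces the properties stated in this section: with {U} as the prior it reduces to the absolute MD, and with the most refined prior it vanishes.

```python
def jaccard_md(p, q):
    # Stand-in micro-distance (an assumption): 1 - |P ∩ Q| / |P ∪ Q|.
    return 1.0 - len(p & q) / len(p | q)

def macro_distance(space1, space2, md, n):
    # Intersection-weighted sum of pairwise md values (sketch of MD).
    return sum(len(g & h) * md(g, h) for g in space1 for h in space2) / n

def relative_md(space1, space2, prior, md, n):
    # Sketch of RMD: restrict both granular spaces to each prior granule k,
    # compute the MD of the restrictions, and weight by |k| / |U|.
    total = 0.0
    for k in prior:
        r1 = [g & k for g in space1 if g & k]
        r2 = [h & k for h in space2 if h & k]
        total += (len(k) / n) * macro_distance(r1, r2, md, len(k))
    return total

U = {1, 2, 3, 4}
s1 = [frozenset({1, 2}), frozenset({3, 4})]
s2 = [frozenset({1}), frozenset({2}), frozenset({3, 4})]
whole = [frozenset(U)]                   # no prior knowledge
finest = [frozenset({x}) for x in U]     # most refined prior

print(relative_md(s1, s2, whole, jaccard_md, 4))   # equals the absolute MD
print(relative_md(s1, s2, finest, jaccard_md, 4))  # 0.0
```

Under this sketch, finer priors can only shrink the relative distance, in line with the sandwich property discussed below Corollary 7.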
From Examples 5 and 6, after adding the prior granular space, the difference between the two rough granular spaces may change; moreover, when the prior granular space differs, the obtained results may also differ.
From Theorem 11, under the same prior granular space, the relative MD is linearly additive.
Proof of Corollary 7.
Note: From Example 6, the relative MD may also be zero when the prior granular space is not the most refined. Therefore, the prior granular space being the most refined granular space is only a sufficient condition for the relative MD to be zero, not a necessary one.
According to Corollary 6, the absolute MD is the relative MD without any prior granular space; that is, the absolute MD can be viewed as a special case of the relative MD. By Corollary 7, when the prior granular space is fine enough, the relative MD between two different rough granular spaces becomes arbitrarily small, or even zero. Combining these with Theorem 12, it follows that RMD((U/R1, U/R2)/ω) ≤ RMD((U/R1, U/R2)/(U/R)) ≤ RMD((U/R1, U/R2)/σ) holds when ω ≺ U/R ≺ σ.
Proof of Theorem 13. The result follows from Theorem 3.

From Theorem 13, an absolute MD is divided into the sum of two unidirectional relative MDs in different directions. That is, the absolute MD of two granular spaces equals the relative MD obtained when the prior granular space is one of the two spaces, plus the relative MD obtained when the prior granular space is the other. This theoretically explains the dialectical unity of relative MD and absolute MD.

Experiment and Analysis
This section verifies through relevant experiments that MD has a clear advantage when describing an IFC in a multi-granularity space. The experimental environment is Windows 10 with an Intel Core (TM) i5-10500 CPU (3.10 GHz) and 16 GB RAM, and the experimental platform is MATLAB 2022a. We selected nine datasets with decision attributes and a sufficient number of conditional attributes from UCI [47] and Dryad. Meanwhile, we removed attributes that are completely independent of the decision attributes, such as serial number and date, from some datasets. The datasets' basic information is recorded in Table 1, and the experiments use the following formula [48] to convert numerical values to discrete values. For convenience, the ID numbers in Table 1 will be used to represent the datasets.
where α(x) represents the attribute value, min α represents the minimum value of α(x), and σ_α represents the standard deviation of the attribute.
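The conversion step can be sketched as follows. Since the exact formula from [48] is not reproduced here, the common standard-deviation binning ⌈(α(x) − min α)/σ_α⌉ is assumed purely for illustration.

```python
import math
import statistics

def discretize(values):
    # Map each attribute value to a bin index based on how many population
    # standard deviations it lies above the minimum (assumed form of the
    # paper's discretization formula).
    lo = min(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:                       # constant attribute: single bin
        return [0 for _ in values]
    return [math.ceil((v - lo) / sigma) for v in values]

print(discretize([1.0, 2.0, 3.0, 4.0]))  # [0, 1, 2, 3]
```

A constant attribute is mapped to a single bin rather than dividing by zero; attributes removed in preprocessing (serial number, date) would never reach this step.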

Monotonicity Experiment
In this experiment, some attributes of the datasets in Table 1 were selected. Suppose GL = (GL 1, GL 2, GL 3, GL 4, GL 5) is a hierarchical quotient space structure consisting of five granularity layers. Figure 3 shows that the behavior of each dataset is similar; that is, MD increases as the granularity difference between two granular spaces increases, and conversely, MD decreases as the granularity difference decreases. Table 2 summarizes the changes in the two MD-based measures (granularity measure and information measure) in a hierarchical quotient space structure as the granularity layer becomes finer. The findings indicate that these two measures can provide additional information for assessing the uncertainty of fuzzy concepts, supporting Theorems 6 and 7: the granularity measure decreases as the available information increases, while the information measure increases as the available information increases. According to Table 2, Corollary 5 can also be verified: the sum of the granularity measure and the information measure is fixed, and the result is (|U|−1)/|U|.

Attribute Reduction
So-called attribute reduction deletes irrelevant or unimportant attributes under the condition that the classification ability of the knowledge base remains unchanged. In data analysis, deleting unnecessary attributes can greatly improve efficiency, and the subset derived from attribute reduction with a prior granular space may differ from the subset derived without one. Aiming at this problem, this section compares attribute reduction based on relative MD under different prior granular spaces with attribute reduction based on absolute MD; in this paper, the attributes that divide the prior granular space are called prior conditions. Some attributes of the datasets in Table 1 were selected in the experiment. Taking attribute reduction based on relative MD as an example, Algorithm 1 is the algorithm used in the experiment. Attribute reduction based on absolute MD only needs to change the fourth step in Algorithm 1 to delete the first and last items in conT, that is, to proceed without any prior conditions; in this paper, Algorithm 2 denotes attribute reduction based on absolute MD. Suppose an information system S = (U, C ∪ D, V, f); then the attribute importance is calculated as

Sig(c_i) = MD(U/(C − {c_i}), U/C). (6)

As shown in Figure 4, the attribute importance represents the MD between the granular space obtained after removing attribute c_i from the dataset and the granular space without removing it; the larger the distance, the higher the attribute importance, where c_i represents attribute i. Therefore, in this paper, the attributes with the largest and smallest attribute importance in each dataset are selected as prior conditions, and attribute reduction based on relative MD is carried out. Moreover, attribute reduction based on absolute MD is also performed without any prior conditions.

Algorithm 1 Attribute reduction based on relative MD
Input: An information system S = (U, C ∪ D, V, f)
Output: Attribute subset R obtained after attribute reduction
1: Let R = C and conT = C
2: Calculate the information entropy of each instance by Formula (1)
3: Calculate the attribute importance of all attributes in conT by Formula (6), and sort them in ascending order; the result is recorded as conT_rank
4: Take conT_rank(1) or conT_rank(length(conT)) as a prior condition; that is, delete the last or first item in conT_rank and ensure that the first or last item in conT_rank always exists
5: while conT_rank ≠ ∅ do
(Note: Figure 4 is only used to analyze the importance of the conditional attributes of a single system, so there is no correlation between the heights of the line graphs of different systems.)
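A runnable sketch of the reduction loop follows, assuming the elided body of step 5 greedily drops the least important remaining attribute while the MD between the reduced space and the full-attribute space stays within ξ. The partition helper, the Jaccard-based stand-in for md, and the toy table are illustrative assumptions rather than the paper's exact procedure.

```python
def partition(table, attrs):
    # Granular space U/R: group object indices by their values on attrs.
    groups = {}
    for i, row in enumerate(table):
        groups.setdefault(tuple(row[a] for a in attrs), set()).add(i)
    return [frozenset(g) for g in groups.values()]

def jaccard_md(p, q):
    # Stand-in micro-distance (an assumption).
    return 1.0 - len(p & q) / len(p | q)

def macro_distance(s1, s2, n):
    # Intersection-weighted aggregation of pairwise md values (MD sketch).
    return sum(len(g & h) * jaccard_md(g, h) for g in s1 for h in s2) / n

def reduce_attributes(table, attrs, xi):
    n = len(table)
    full = partition(table, attrs)
    kept = list(attrs)
    changed = True
    while changed and len(kept) > 1:
        changed = False
        current = partition(table, kept)
        # Importance Sig(c) = MD(U/(R - {c}), U/R): try least important first.
        sig = {c: macro_distance(partition(table, [a for a in kept if a != c]),
                                 current, n) for c in kept}
        for c in sorted(kept, key=sig.get):
            trial = [a for a in kept if a != c]
            if macro_distance(partition(table, trial), full, n) <= xi:
                kept = trial
                changed = True
                break
    return kept

# Toy table: attribute 1 is constant and attribute 2 duplicates attribute 0,
# so a single informative attribute survives the reduction.
table = [(0, 1, 0), (0, 1, 0), (1, 1, 1), (2, 1, 2)]
print(reduce_attributes(table, [0, 1, 2], xi=0.0))  # [2]
```

Setting ξ > 0 relaxes the stopping criterion, which mirrors the role of ξ in the comparison experiments of Table 3.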
As shown in Table 3, in the attribute reduction based on absolute MD, ξ is the maximum absolute MD between the granular space divided by attribute subsets after attribute reduction and the granular space divided by all attributes. In attribute reduction based on relative MD, ξ is the maximum relative MD between the granular space divided by attribute subsets after attribute reduction and the granular space divided by all attributes. This paper sets ξ to 0.003 and 0.006 for comparison. In the table, numbers are directly used to represent the serial numbers of the conditional attributes.
According to the analysis of Figure 4 and Table 3: (1) When the prior conditions are more important attributes, the number of attributes after reduction is significantly smaller than with attribute reduction based on absolute MD, which shows that selecting more important attributes increases the cognitive ability of the system, consistent with Theorem 12.
(2) When the prior condition is an unimportant attribute, the subset after attribute reduction is usually larger than when the prior condition is an important attribute, which also indicates that the more important the prior condition is, the more it improves the attributes' cognitive ability for the system.
(3) When ξ differs, that is, when the maximum allowed MD between the granular space divided by the retained attributes after reduction and the granular space divided by all attributes changes, the subsets after attribute reduction may differ, which illustrates the flexibility of this algorithm: it obtains different attribute subsets as the requirement is tightened or relaxed.
(4) The removed attributes all have low attribute importance, which shows the effectiveness of this algorithm in calculating attribute importance. Table 3. Attribute reduction on each dataset under different situations.

Conclusions and Discussion
In this paper, the macro-knowledge distance of intuitionistic fuzzy sets is proposed to measure the difference between granular spaces effectively. Under a given perspective of uncertain concepts, existing knowledge distances fail to account for the relative difference between granular spaces. As a result, we further propose the relative macro-knowledge distance and demonstrate its practicability through relative attribute reduction experiments. These results provide a new perspective for current knowledge distance research by measuring the relative differences between granular spaces under prior granular spaces. The conclusions are as follows: (1) The macro-knowledge distance increases as the granularity difference between two granular spaces increases, and vice versa. The sum of the granularity measure and the information measure is always (|U|−1)/|U|.
(2) After attribute reduction, the size of the attribute subset obtained by our algorithm is appropriate, and its mean-square error is suitable in comparison with other algorithms. In data analysis, the more important the prior condition is, the more it improves the cognitive ability of the attributes.
Under specific circumstances, the relative macro-knowledge distance can remove unnecessary attributes in practical applications, which can significantly increase the accuracy of attribute reduction and the effectiveness of data analysis. The characteristics of the data are understood more thoroughly during the attribute reduction process.