Cost-Sensitive Multigranulation Approximation in Decision-Making Applications

Yang, Jie; Kuang, Juncheng; Liu, Qun; Liu, Yanmin

doi:10.3390/electronics11223801

Open AccessArticle

Cost-Sensitive Multigranulation Approximation in Decision-Making Applications

by

Jie Yang

^1,2,

Juncheng Kuang

²,

Qun Liu

² and

Yanmin Liu

^1,*

¹

School of Physics and Electronic Science, Zunyi Normal University, Zunyi 563002, China

²

Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China

^*

Author to whom correspondence should be addressed.

Electronics 2022, 11(22), 3801; https://doi.org/10.3390/electronics11223801

Submission received: 25 October 2022 / Revised: 13 November 2022 / Accepted: 16 November 2022 / Published: 18 November 2022

(This article belongs to the Special Issue Advances in Intelligent Data Analysis and Its Applications)

Download

Browse Figures

Review Reports Versions Notes

Abstract

:

A multigranulation rough set (MGRS) model is an expansion of the Pawlak rough set, in which the uncertain concept is characterized by optimistic and pessimistic upper/lower approximate boundaries, respectively. However, there is a lack of approximate descriptions of uncertain concepts by existing information granules in MGRS. The approximation sets of rough sets presented by Zhang provide a way to approximately describe knowledge by using existing information granules. Based on the approximation set theory, this paper proposes the cost-sensitive multigranulation approximation of rough sets, i.e., optimistic approximation and pessimistic approximation. Their related properties were further analyzed. Furthermore, a cost-sensitive selection algorithm to optimize the multigranulation approximation was performed. The experimental results show that when multigranulation approximation sets and upper/lower approximation sets are applied to decision-making environments, multigranulation approximation produces the least misclassification costs on each dataset. In particular, misclassification costs are reduced by more than 50% at each granularity on some datasets.

Keywords:

multigranulation rough sets; optimistic approximation; pessimistic approximation; cost-sensitive; decision-making applications

1. Introduction

As a human-inspired paradigm, granular computing (GrC) solves complex problems by utilizing multiple granular layers [1,2,3,4]. Zadeh [1] noted that information in granules refer to pieces, classes, and groups, into which complex information is divided in accordance with the characteristics and processes of understanding and decision-making. From the different views, GrC models mainly cover four types: fuzzy sets [5], rough sets [6], quotient spaces [7], and cloud models [8]. As representative models of GrC, rough sets describe uncertain concepts by upper and lower approximation boundaries, which have been applied to data mining [9,10], medical systems [11], attribute reductions [12,13], decision systems [14,15], and machine learning [16].

Regarding similarity, Zhang [17,18,19,20] presented the approximation set of rough sets, vague sets, rough fuzzy sets, rough vague sets, etc. These approximation models were developed by utilizing the existing equivalence classes to describe uncertain concepts. The approximation model has a higher similarity with the target concept than the upper/lower approximations. Furthermore, the approximation model has been applied in attribute reduction [21], image segmentation [22], the optimization algorithm [23], etc. Based on the approximation set theory, Yang [24,25] developed the approximation model of rough sets based on misclassification costs. In the process of cost-sensitive learning, the smaller misclassification costs will help to improve the decision-making qualities in real applications. Recently, from the perspective of three-way decisions [26,27,28,29], Yao [30] constructed a symbol–meaning–value (SMV) model for data analysis. In the three-way decision model, the equivalence classes in a boundary region will produce misclassification costs when they are used as approximation sets. Hence, the approximation model that is constructed from the perspective of similarity is no longer applicable to cost-sensitive scenarios. To minimize the misclassification costs of constructing the approximation set, we proposed the multigranulation approximation, i.e., the optimistic approximation model and pessimistic approximation model. Moreover, to search the optimal approximation layer for multigranulation rough sets [31] under the constraints, the algorithm of the cost-sensitive multigranulation approximation selection is further proposed to be applied to decision-making environments.

The following sections are arranged as follows: Section 2 presents the related works. Section 3 introduces the relevant definitions of the multigranulation rough set and approximation set. Section 4 introduces an approximate representation of the rough sets. Section 5 presents the cost-sensitive multigranulation approximations of rough sets and further introduces the optimal multigranulation approximation algorithm. To verify the availability of the proposed model, the related experiments and discussion are presented in Section 6. Ultimately, in Section 7, the conclusions are presented.

2. Literature Review

Rough sets are typically constructed based on a single binary relation. However, in many cases, they may be described in multiple granularity structures. In order to extend single granularity to multi-granularity in rough approximations, Qian [31] proposed the multigranulation rough set model (MGRS), where the upper/lower approximations were defined by multi-equivalence relations (multiple granulations) in the universe [32,33]. For the lower approximation of optimistic MGRS, at least one granular space was obtained, such that objects completely belonged to the target concept. For the lower approximation of pessimistic MGRS, objects completely belong to target concepts in each granular space. MGRS has two advantages: (1) In the process of decision-making applications, the decision of each decision maker may be independent of the same project (or an element) in the universe [34]. In this situation, the intersection operations between any two granularity structures will be redundant for decision-making [35]. (2) Extract decision rules from distributive information systems and groups of intelligent agents by using rough set approaches [34,36].

There are many works [33,34,35,37,38,39,40,41,42] on multigranulation rough sets. To extend the MGRS to the neighborhood information system, Hu [43,44] presented matrix-based incremental approaches to update knowledge about neighborhood information systems by changing the granular structures. From the perspective of uncertainty measure, Sun [39] proposed a feature selection based on fuzzy neighborhood multigranulation rough sets. Xu [38] proposed a dynamic approximation update mechanism of a multigranulation neighborhood rough set from a local viewpoint. Liu [35] introduced a parameter-free multi-granularity attribute reduction scheme, which is more effective for microarray data than other well-established attribute reductions. Based on the three-way decision theory, She [40] presented a five-valued logic approach for the multigranulation rough set model. From the above, however, the method of approximately describing the target concept with existing information granules is not given, which limits the application of the multigranulation rough set theory. Li [41] presented two kinds of local multigranulation rough set models in the ordered decision system by extending the single granulation environment to a multigranulation case. Zhang [42] constructed hesitant fuzzy multigranulation rough sets to handle the hesitant fuzzy information and group decision-making for person–job fit.

3. Preliminaries

In this section, some necessary definitions related to the multigranulation rough set and approximation set are reviewed to facilitate the framework of this paper. Let

S = (U, C \cup D, V, f)

be a decision information table, where U is a non-empty finite set of objects, C is a non-empty finite set of condition attributes, D is a set of decision attributes, V is the set of all attribute values, and

f : U \times C

is an information function.

Definition 1

((Rough Sets) [6]). Let

S = (U, C \cup D, V, f)

be a decision information table,

A \subseteq C

and

X \subseteq U

, the lower and upper approximation sets of X are given as follows:

\begin{matrix} \underset{̲}{A} (X) = {x \in U| {[x]}_{A} \subseteq X}, \\ \bar{A} (X) = {x \in U| {[x]}_{A} \cap X \neq ϕ} . \end{matrix}

where

{[x]}_{A}

denotes the equivalence class induced by

U / A

, namely,

U / A = {{[x]}_{R}} = {{[x]}_{1}, {[x]}_{2}, \dots, {[x]}_{l}}

.

Based on the lower and upper approximations, the universe U can be divided into three disjoint regions, which are expressed as follows:

\begin{matrix} P O S_{A} (X) = \underset{̲}{A} (X), \\ B N D_{A} (X) = \bar{A} (X) - \underset{̲}{A} (X), \\ N E G_{A} (X) = U - \bar{A} (X) . \end{matrix}

Definition 2

((Optimistic multigranulation rough sets) [36]). Let

S = (U, C \cup D, V, f)

be a decision information table,

A_{1}, A_{2}, \dots, A_{m} \subseteq C

and

X \subseteq U

, then the lower and upper approximation sets of X related to

A_{1}, A_{2}, \dots, A_{m}

are given as follows:

\begin{matrix} \underset{̲}{\sum_{i = 1}^{m} A_{i}^{O}} (X) = {x| {[x]}_{A_{1}} \subseteq X \lor {[x]}_{A_{2}} \subseteq X \lor \dots \lor {[x]}_{A_{m}} \subseteq X, x \in U}, \end{matrix}

(1)

\begin{matrix} \bar{\sum_{i = 1}^{m} A_{i}^{O}} (X) = \sim \underset{̲}{\sum_{i = 1}^{m} A_{i}^{O}} (\sim X) . \end{matrix}

(2)

Then,

(\underset{̲}{\sum_{i = 1}^{m} A_{i}^{O}} (X), \bar{\sum_{i = 1}^{m} A_{i}^{O}} (X))

is called optimistic multigranulation rough sets. The lower and upper approximation sets of X in optimistic multigranulation rough sets are presented by multiple independent approximation spaces. The boundary regions are defined as follows:

\begin{matrix} B N D_{\sum_{i = 1}^{m} A_{i}}^{O} (X) = \bar{\sum_{i = 1}^{m} A_{i}^{O}} (X) - \underset{̲}{\sum_{i = 1}^{m} A_{i}^{O}} (X) . \end{matrix}

(3)

Definition 3

((Pessimistic multigranulation rough sets) [36]). Let

S = (U, C \cup D, V, f)

be a decision information table,

A_{1}, A_{2}, \dots, A_{m} \subseteq C

, and

X \subseteq U

. The lower and upper approximation sets of X related to

A_{1}, A_{2}, \dots, A_{m}

are given as follows:

\begin{matrix} \underset{̲}{\sum_{i = 1}^{m} A_{i}^{P}} (X) = {x| {[x]}_{A_{1}} \subseteq X \land {[x]}_{A_{2}} \subseteq X \land \dots \land {[x]}_{A_{m}} \subseteq X, x \in U}, \end{matrix}

(4)

\begin{matrix} \bar{\sum_{i = 1}^{m} A_{i}^{P}} (X) = \sim \underset{̲}{\sum_{i = 1}^{m} A_{i}^{P}} (\sim X) . \end{matrix}

(5)

Then,

(\underset{̲}{\sum_{i = 1}^{m} A_{i}^{P}} (X), \bar{\sum_{i = 1}^{m} A_{i}^{P}} (X))

is called pessimistic multigranulation rough sets. The lower and upper approximation sets of X in pessimistic multigranulation rough sets are presented by multiple independent approximation spaces. However, the strategy is different from optimistic multigranulation rough sets. The boundary region is defined as follows:

\begin{matrix} B N D_{\sum_{i = 1}^{m} A_{i}}^{P} (X) = \bar{\sum_{i = 1}^{m} A_{i}^{P}} (X) - \underset{̲}{\sum_{i = 1}^{m} A_{i}^{P}} (X) . \end{matrix}

(6)

Definition 4

((Approximation of rough sets) [17]). Let

S = (U, C \cup D, V, f)

be a decision information table,

A \subseteq C

and

X \subseteq U

.

U / A = {{[x]}_{1}, {[x]}_{2}, \dots, {[x]}_{l}}

is a granularity layer on U, then the α-approximation of X on

U / A

is defined as follows:

\begin{matrix} A_{α} (X) = \cup \{{[x]}_{i} |\bar{μ} ({[x]}_{i}) \geq α, {[x]}_{i} \subseteq U\} . \end{matrix}

(7)

where

0 \leq α \leq 1

,

\bar{μ} ({[x]}_{i}) = \frac{|{[x]}_{i} \cap X|}{|{[x]}_{i}|}

denotes the membership degree of the equivalence class

{[x]}_{i}

belongs to X.

Example 1.

Let Table 1 be a decision information table,

A_{1}, A_{2}, A_{3} \subseteq C

and

X \subseteq U

. For the element

x_{4}

, the equivalence classes

{[x_{4}]}_{i} (i = 1, 2, 3)

belonging to X in the multigranulation approximation space are as follows:

\begin{matrix} {[x_{4}]}_{1} = {x_{1}, x_{2}, x_{3}, x_{4}}; \\ {[x_{4}]}_{2} = {x_{3}, x_{4}, x_{7}, x_{8}}; \\ {[x_{4}]}_{3} = {x_{2}, x_{4}, x_{6}, x_{8}} . \end{matrix}

Accordingly, the membership degrees are computed:

\begin{matrix} \bar{μ} ({[x_{4}]}_{1}) = \frac{0 + 0 + 0 + 1}{4} = 0.25; \\ \bar{μ} ({[x_{4}]}_{2}) = \frac{0 + 1 + 1 + 1}{4} = 0.75; \\ \bar{μ} ({[x_{4}]}_{3}) = \frac{0 + 1 + 1 + 1}{4} = 0.75 . \end{matrix}

If α is set to 0.5, considering the optimistic approximation, element

x_{4}

will be classified into the optimistic lower approximation sets of X due to one of its membership degrees being greater than 0.5. However, if considering the pessimistic approximation, element

x_{4}

will only be classified into the pessimistic upper approximation sets of X.

Based on the given conditions, we have:

\begin{matrix} X & = \frac{0.25 + 0.25 + 0.25}{x_{1}} + \frac{0.25 + 0.25 + 0.75}{x_{2}} + \frac{0.25 + 0.75 + 0.05}{x_{3}} + \frac{0.25 + 0.75 + 0.75}{x_{4}} \\ + \frac{0.75 + 0.25 + 0.25}{x_{5}} + \frac{0.75 + 0.25 + 0.75}{x_{6}} + \frac{0.75 + 0.75 + 0.25}{x_{7}} + \frac{0.75 + 0.75 + 0.75}{x_{8}} . \end{matrix}

Then, the results of the optimistic approximations are shown as follows:

\begin{matrix} \underset{̲}{\sum_{i = 1}^{m} A_{i}^{O}} (X) = {x_{2}, x_{3}, x_{4}, x_{5}, x_{6}, x_{7}, x_{8}}, \\ \bar{\sum_{i = 1}^{m} A_{i}^{O}} (X) = {x_{1}, x_{2}, x_{3}, x_{4}, x_{5}, x_{6}, x_{7}, x_{8}}, \\ B N D_{\sum_{i = 1}^{m} A_{i}}^{O} (X) = {x_{1}} . \end{matrix}

Moreover, the results of the pessimistic approximations, in this case, are changed as follows:

\begin{matrix} \underset{̲}{\sum_{i = 1}^{m} A_{i}^{P}} (X) = {x_{8}}, \\ \bar{\sum_{i = 1}^{m} A_{i}^{P}} (X) = {x_{1}, x_{2}, x_{3}, x_{4}, x_{5}, x_{6}, x_{7}, x_{8}}, \\ B N D_{\sum_{i = 1}^{m} A_{i}}^{P} (X) = {x_{1}, x_{2}, x_{3}, x_{4}, x_{5}, x_{6}, x_{7}} . \end{matrix}

4. Cost-Sensitive Approximation Methods of the Rough Sets

Let

S = (U, C \cup D, V, f)

be a decision information table,

A \subseteq C

and

X \subseteq U

.

U / A = {{[x]}_{1}, {[x]}_{2}, \dots, {[x]}_{l}}

is a granularity layer on U.

λ_{12}

represents the cost generated by taking an element belonging to X as the approximation,

λ_{21}

represents the cost generated by taking an element that does not belong to X as the approximation. Furthermore, misclassification costs incurred by the equivalence classes in characterizing X on

U / A

are given in the following:

\begin{matrix} λ^{Y} = λ_{12} (1 - \bar{μ} ({[x]}_{i})) |{[x]}_{i}| . \end{matrix}

(8)

Misclassification costs incurred by the equivalence classes when not characterizing X on

U / A

are given in the following:

\begin{matrix} λ^{N} = λ_{21} \bar{μ} ({[x]}_{i}) |{[x]}_{i}| . \end{matrix}

(9)

Herein,

\bar{μ} ({[x]}_{i}) (i = 1, 2, \dots, l)

denotes the membership degree of

{[x]}_{i}

belonging to X. Based on the Bayesian decision procedure, the minimum cost decision rules are expressed as follows:

(P): If $λ^{Y} \leq λ^{N}$ , ${[x]}_{i} \subseteq A (X)$ ;
(N): If $λ^{Y} > λ^{N}$ , ${[x]}_{i} ⊄ A (X)$ .

It is clear that the above rules are only relevant to the loss function

\bar{μ} ({[x]}_{i})

. From Formulas (8) and (9), the decision rules are re-expressed in the following:

(P1): If $\bar{μ} ({[x]}_{i}) \geq \frac{λ_{12}}{λ_{12} + λ_{21}}$ , ${[x]}_{i} \subseteq A (X)$ ;
(N1): If $\bar{μ} ({[x]}_{i}) < \frac{λ_{12}}{λ_{12} + λ_{21}}$ , ${[x]}_{i} ⊄ A (X)$ .

Supposing

γ = \frac{λ_{12}}{λ_{12} + λ_{21}}

, then we have the following decision rules:

(P2): If $\bar{μ} ({[x]}_{i}) \geq γ$ , ${[x]}_{i} \subseteq A (X)$ ;
(N2): If $\bar{μ} ({[x]}_{i}) < γ$ , ${[x]}_{i} ⊄ A (X)$ .

Definition 5.

Let

S = (U, C \cup D, V, f)

be a decision information table,

A \subseteq C

and

X \subseteq U

.

U / A = {{[x]}_{1}, {[x]}_{2}, \dots, {[x]}_{l}}

is a granularity layer on U, then the cost-sensitive approximation model (CSA) of rough sets is defined as follows:

\begin{matrix} A (X) = \cup \{{[x]}_{i} |\bar{μ} ({[x]}_{i}) \geq \frac{λ_{12}}{λ_{12} + λ_{21}}, {[x]}_{i} \subseteq U\} . \end{matrix}

(10)

Suppose

0 \leq λ_{12}, λ_{21} \leq 1

, boundary region I and boundary region II are denoted by

B N 1 (X) = {{[x]}_{i}| \frac{λ_{12}}{λ_{12} + λ_{21}} \leq \bar{μ} ({[x]}_{i}) < 1}

and

B N 2 (X) = {{[x]}_{i}| 0 < \bar{μ} ({[x]}_{i}) < \frac{λ_{12}}{λ_{12} + λ_{21}}}

, respectively, then

B N (X) = B N 1 (X) \cup B N 2 (X)

, and

A (X) = B N 1 (X) \cup P O S (X)

. Figure 1 shows the CSA of rough sets, where

B N 1 (X)

is the dark blue region, which denotes the region in the boundary region used as the approximation.

B N 2 (X)

is the light blue region, which denotes the region in the boundary region not used as the approximation. Therefore, the region surrounded by the green broken line in Figure 1 constructs the approximations of rough sets, and the misclassification costs come from two uncertain regions, defined as follows:

\begin{matrix} D C (A (X)) = \sum_{{[x]}_{i} \in B N 1 (X)} λ^{Y} + \sum_{{[x]}_{i} \in B N 2 (X)} λ^{N} . \end{matrix}

(11)

Theorem 1.

Let

S = (U, A \cup D, V, f)

be a decision information table,

A_{1}, A_{2}, \dots,

A_{m} \subseteq C

, then

D C (A_{1} (X)) \geq D C (A_{2} (X))

.

Proof of Theorem 1.

Let U be a non-empty finite domain,

{U / A}_{1} = {E_{1}, E_{2}, \dots, E_{l}}

and

{U / A}_{2} = {F_{1}, F_{2}, \dots, F_{m}}

. Because

A_{1} \subseteq A_{2}

,

U / A_{2} ≼ U / A_{1}

. According to the condition, for simplicity, supposing only one granule

E_{1}

can be subdivided into two finer sub-granules by

Δ A = A_{2} - A_{1}

(the more complicated cases can be transformed into this case, so we will not repeat them here). Without loss of generality, let

E_{1} = F_{1} \cup F_{2}, E_{2} = F_{3},

E_{3} = F_{4}, \dots, E_{l} = F_{m}

(

m = l + 1

), namely,

U / A_{2} = {F_{1}, F_{2}, E_{2}, E_{3}, \dots, E_{l}}

. There are two cases to prove the theorem as follows:

(1): Suppose $\bar{μ} (E_{1}) \geq γ$ , obviously, $E_{1} \subseteq A_{1} (X)$ .

Case 1.

\bar{μ} (F_{1}) \geq γ

and

\bar{μ} (F_{2}) \geq γ

. Namely,

F_{1} \subseteq A (X)

and

F_{2} \subseteq A (X)

. Case 1, in which the granules are subdivided in

B N 1 (X)

, can be shown in Figure 2a, then

\begin{matrix} Δ D C_{A_{1} - A_{2}} (X) & = D C (A_{1} (X)) - D C (A_{2} (X)) \\ = λ_{12} (1 - \bar{μ} (E_{1})) |E_{1}| - λ_{12} (1 - \bar{μ} (F_{1})) |F_{1}| - λ_{12} (1 - \bar{μ} (F_{2})) |F_{2}| . \\ = (|E_{1}| - |F_{1}| - |F_{2}| + \sum_{x_{i} \in F_{1}} μ (x_{i}) + \sum_{x_{i} \in F_{2}} μ (x_{i}) - \sum_{x_{i} \in E_{1}} μ (x_{i})) λ_{12} . \end{matrix}

Because

\sum_{x_{i} \in E_{1}} μ (x_{i}) = \sum_{x_{i} \in F_{1}} μ (x_{i}) + \sum_{x_{i} \in F_{2}} μ (x_{i})

and

|E_{1}| = |F_{1}| + |F_{2}|

, then

Δ D C_{A_{1} - A_{2}} = 0

. Therefore,

D C (A_{1} (X)) = D C (A_{2} (X))

.

Case 2.

\bar{μ} (F_{1}) \geq γ

and

\bar{μ} (F_{2}) < γ

. Namely,

F_{1} \subseteq A (X)

and

F_{2} ⊄ A (X)

. Case 2, in which the granules are subdivided in

B N 1 (X)

, can be shown in Figure 2b, then

\begin{matrix} Δ D C_{A_{1} - A_{2}} & = D C (A_{1} (X)) - D C (A_{2} (X)) \\ = λ_{12} (1 - \bar{μ} (E_{1})) |E_{1}| - λ_{12} (1 - \bar{μ} (F_{1})) |F_{1}| - λ_{21} \bar{μ} (F_{2}) |F_{2}| \\ = |F_{2}| (λ_{12} - \bar{μ} (F_{2}) (λ_{21} + λ_{12})) \end{matrix}

Because

\bar{μ} (F_{2}) < γ = \frac{λ_{12}}{λ_{12} + λ_{21}}

and

|E_{1}| = |F_{1}| + |F_{2}|

, then

Δ D C_{A_{1} - A_{2}} (X) \geq 0

. Therefore,

D C (A_{1} (X)) > D C (A_{2} (X))

.

(2): Suppose $\bar{μ} (E_{1}) < γ$ , obviously, $E_{1} ⊄ A_{1} (X)$ .

Case 1.

\bar{μ} (F_{1}) \geq γ

and

\bar{μ} (F_{2}) < γ

. Namely,

F_{1} \subseteq A (X)

and

F_{2} ⊄ A (X)

. Case 1, in which the granules are subdivided in

B N 2 (X)

, can be shown in Figure 2c, then

\begin{matrix} Δ D C_{A_{1} - A_{2}} & = D C (A_{1} (X)) - D C (A_{2} (X)) \\ = \bar{μ} (E_{1}) |E_{1}| λ_{21} - \bar{μ} (F_{2}) |F_{2}| λ_{21} - (1 - \bar{μ} (F_{1})) |F_{1}| λ_{12} \\ = |F_{1}| \bar{μ} (F_{1}) ((λ_{21} + λ_{12}) - λ_{12}) . \end{matrix}

Because

\bar{μ} (F_{1}) \geq γ = \frac{λ_{12}}{λ_{12} + λ_{21}}

, then

Δ D C_{A_{1} - A_{2}} \geq 0

. Therefore,

D C (A_{1} (X)) \geq D C (A_{2} (X))

.

Case 2.

\bar{μ} (F_{1}) < γ

and

\bar{μ} (F_{2}) < γ

. Namely,

F_{1} ⊄ R (X)

and

F_{2} ⊄ R (X)

. Case 2, in which the granules are subdivided in

B N 2 (X)

, can be shown in Figure 2d, then

\begin{matrix} Δ D C_{A_{1} - A_{2}} & = D C (A_{1} (X)) - D C (A_{2} (X)) \\ = \bar{μ} (E_{1}) |E_{1}| λ_{21} - \bar{μ} (F_{1}) |F_{1}| λ_{21} - \bar{μ} (F_{2}) |F_{2}| λ_{21} \\ = (\sum_{x_{i} \in E_{1}} μ (x_{i}) - \sum_{x_{i} \in F_{1}} μ (x_{i}) - \sum_{x_{i} \in F_{2}} μ (x_{i})) λ_{21} . \end{matrix}

Because

\sum_{x_{i} \in E_{1}} μ (x_{i}) = \sum_{x_{i} \in F_{1}} μ (x_{i}) + \sum_{x_{i} \in F_{2}} μ (x_{i})

, then

Δ D C_{A_{1} - A_{2}} = 0

. Therefore,

D C (A_{1} (X)) = D C (A_{2} (X))

. □

Theorem 1 shows that the misclassification costs in the approximation model will monotonously decrease with the changing approximation space, in accordance with human cognitive mechanisms.

5. Cost-Sensitive Multigranulation Approximations and Optimal Granularity Selection Method

The multigranulation rough set model (MGRS) [43] extends single granularity to multi-granularity in rough approximations to describe an uncertain concept. MGRS is an expansion of the classical rough set, and the target concept is characterized by optimistic and pessimistic upper/lower approximation boundaries in MGRS, respectively. However, there is a lack of an approximate description of an uncertain concept by utilizing the equivalence classes in MGRS. In this section, based on the model proposed in Section 3, we further construct the approximations of MGRS.

5.1. Cost-Sensitive Multigranulation Approximations of Rough Sets

Definition 6.

Let

S = (U, C \cup D, V, f)

be a decision information table,

A_{1}, A_{2}, \dots, A_{m} \subseteq C

and

X \subseteq U

. The optimistic membership degree of

x \in U

related to

A_{1}, A_{2}, \dots, A_{m}

is given as follows:

\begin{matrix} {\bar{μ}}_{\sum_{i = 1}^{m} A_{i}}^{O} (x) = max {\bar{μ} ({[x]}_{A_{i}}) |i = 1, 2, \dots, m} . \end{matrix}

(12)

Definition 7.

Let

S = (U, C \cup D, V, f)

be a decision information table,

A_{1}, A_{2}, \dots, A_{m} \subseteq C

and

X \subseteq U

. The approximation model of the optimistic MGRS of X related to

A_{1}, A_{2}, \dots, A_{m}

is given as follows:

\begin{matrix} \sum_{i = 1}^{m} A_{i}^{O} (X) = {x| \bar{μ} ({[x]}_{A_{1}}) \geq γ \lor \bar{μ} ({[x]}_{A_{2}}) \geq γ \lor \dots \lor \bar{μ} ({[x]}_{A_{m}}) \geq γ, x \in U} . \end{matrix}

(13)

From the perspective of the optimistic membership degree,

\sum_{i = 1}^{m} A_{i}^{O} (X)

can be =expressed as follows:

\begin{matrix} \sum_{i = 1}^{m} A_{i}^{O} (X) = {x| {\bar{μ}}_{\sum_{i = 1}^{m} A_{i}}^{O} (x) \geq γ, x \in U} . \end{matrix}

(14)

The corresponding decision regions are defined as follows:

\begin{matrix} B N 1_{\sum_{i = 1}^{m} A_{i}}^{O} (X) = {x| 1 > {\bar{μ}}_{\sum_{i = 1}^{m} A_{i}}^{O} (x) \geq γ, x \in U}, \\ B N 2_{\sum_{i = 1}^{m} A_{i}}^{O} (X) = {x| 0 < {\bar{μ}}_{\sum_{i = 1}^{m} A_{i}}^{O} (x) < γ, x \in U}, \\ P O S_{\sum_{i = 1}^{m} A_{i}}^{O} (X) = {x| {\bar{μ}}_{\sum_{i = 1}^{m} A_{i}}^{O} (x) = 1, x \in U}, \\ N E G_{\sum_{i = 1}^{m} A_{i}}^{O} (X) = {x| {\bar{μ}}_{\sum_{i = 1}^{m} A_{i}}^{O} (x) = 0, x \in U} . \end{matrix}

We have

\begin{matrix} \sum_{i = 1}^{m} A_{i}^{O} (X) = B N 1_{\sum_{i = 1}^{m} A_{i}}^{O} (X) \cup P O S_{\sum_{i = 1}^{m} A_{i}}^{O} (X) . \end{matrix}

(15)

The misclassification costs of approximations of optimistic MGRS come from two uncertain regions

B N 1_{\sum_{i = 1}^{m} A_{i}}^{O} (X)

and

B N 2_{\sum_{i = 1}^{m} A_{i}}^{O} (X)

, which are defined in the following:

\begin{matrix} D C (\sum_{i = 1}^{m} A_{i}^{O} (X)) = \sum_{x \in B N 1_{\sum_{i = 1}^{m} A_{i}}^{O} (X)} λ^{Y} + \sum_{x \in B N 2_{\sum_{i = 1}^{m} A_{i}}^{O} (X)} λ^{N} . \end{matrix}

(16)

Theorem 2.

Let

S = (U, C \cup D, V, f)

be a decision information table,

A_{1}, A_{2}, \dots, A_{m} \subseteq C

,

X \subseteq U

and

X_{1}, X_{2}, \dots, X_{n} \subseteq U

. The following properties hold:

(1): $\sum_{i = 1}^{m} A_{i}^{O} (X) = ⋃_{i = 1}^{m} A_{i} (X)$ ;
(2): $\sum_{i = 1}^{m} A_{i}^{O} (⋂_{j = 1}^{n} X_{j}) = ⋃_{i = 1}^{m} (⋂_{j = 1}^{n} A_{i}^{O} (X_{j}))$ ;
(3): $\sum_{i = 1}^{m} A_{i}^{O} (⋂_{j = 1}^{n} X_{j}) \subseteq ⋂_{j = 1}^{n} (\sum_{i = 1}^{m} A_{i}^{O} (X_{j}))$ ;
(4): $\sum_{i = 1}^{m} A_{i}^{O} (⋃_{j = 1}^{n} X_{j}) \supseteq ⋃_{j = 1}^{n} (\sum_{i = 1}^{m} A_{i}^{O} (X_{j}))$ ;
(5): $\underset{̲}{\sum_{i = 1}^{m} A_{i}^{O}} (X) \subseteq \sum_{i = 1}^{m} A_{i}^{O} (X) \subseteq \bar{\sum_{i = 1}^{m} A_{i}^{O}} (X)$ .

Proof of Theorem 2.

(1): From Formula (14), $\sum_{i = 1}^{m} A_{i}^{O} (X) = ⋃_{i = 1}^{m} A_{i} (X)$ obviously holds.
(2): $\sum_{i = 1}^{m} A_{i}^{O} (⋂_{j = 1}^{n} X_{j}) = ⋃_{i = 1}^{m} A_{i}^{O} (⋂_{j = 1}^{n} X_{j}) = ⋃_{i = 1}^{m} (⋂_{j = 1}^{n} A_{i}^{O} (X_{j}))$ .
(3): $\begin{array}{l} \sum_{i = 1}^{m} A_{i}^{O} (⋂_{j = 1}^{n} X_{j}) = ⋃_{i = 1}^{m} (⋂_{j = 1}^{n} A_{i} (X_{j})) \\ = ⋂_{j = 1}^{n} (⋃_{i = 1}^{m} A_{i} (X_{j}) \cap \dots \\ = ⋂_{j = 1}^{n} (\sum_{i = 1}^{m} A_{i}^{O} (X_{j})) \cap \dots \\ = ⋂_{j = 1}^{n} (\sum_{i = 1}^{m} A_{i}^{O} (X_{j})) . \end{array}$
(4): From $X_{j} \subseteq ⋃_{j = 1}^{n} X_{j}$ , we have $\sum_{i = 1}^{m} A_{i}^{O} (X_{j}) \subseteq \sum_{i = 1}^{m} A_{i}^{O} (⋃_{j = 1}^{n} X_{j})$ . Therefore, $\sum_{i = 1}^{m} A_{i}^{O} (⋃_{j = 1}^{n} X_{j}) \supseteq ⋃_{j = 1}^{n} (\sum_{i = 1}^{m} A_{i}^{O} (X_{j}))$ .
(5): It is easy to prove by Formulas (1), (2), and (14).

□

Definition 8.

Let

S = (U, C \cup D, V, f)

be a decision information table,

A_{1}, A_{2}, \dots, A_{m} \subseteq C

and

X \subseteq U

. The pessimistic membership degree of

x \in U

related to

A_{1}, A_{2}, \dots, A_{m}

is given as follows:

\begin{matrix} {\bar{μ}}_{\sum_{i = 1}^{m} A_{i}}^{P} (x) = min {\bar{μ} ({[x]}_{A_{i}}) |i = 1, 2, \dots, m} . \end{matrix}

(17)

Definition 9.

Let

S = (U, C \cup D, V, f)

be a decision information table,

A_{1}, A_{2}, \dots, A_{m} \subseteq C

and

X \subseteq U

. The approximation model of pessimistic MGRS of X related to

A_{1}, A_{2}, \dots, A_{m}

is given as follows:

\begin{matrix} \sum_{i = 1}^{m} A_{i}^{P} (X) = {x| \bar{μ} ({[x]}_{A_{1}}) \geq γ \land \bar{μ} ({[x]}_{A_{2}}) \geq γ \land \dots \land \bar{μ} ({[x]}_{A_{m}}) \geq γ, x \in U} . \end{matrix}

(18)

From the perspective of the pessimistic membership degree,

\sum_{i = 1}^{m} A_{i}^{P} (X)

can be expressed as follows:

\begin{matrix} \sum_{i = 1}^{m} A_{i}^{P} (X) = {x| {\bar{μ}}_{\sum_{i = 1}^{m} A_{i}}^{P} (x) \geq γ, x \in U} . \end{matrix}

(19)

The corresponding decision regions are expressed in the following:

\begin{matrix} B N 1_{\sum_{i = 1}^{m} A_{i}}^{P} (X) = {x| 1 > {\bar{μ}}_{\sum_{i = 1}^{m} A_{i}}^{P} (x) \geq γ, x \in U}, \\ B N 2_{\sum_{i = 1}^{m} A_{i}}^{P} (X) = {x| 0 < {\bar{μ}}_{\sum_{i = 1}^{m} A_{i}}^{P} (x) < γ, x \in U}, \\ P O S_{\sum_{i = 1}^{m} A_{i}}^{P} (X) = {x| {\bar{μ}}_{\sum_{i = 1}^{m} A_{i}}^{P} (x) = 1, x \in U}, \\ N E G_{\sum_{i = 1}^{m} A_{i}}^{P} (X) = {x| {\bar{μ}}_{\sum_{i = 1}^{m} A_{i}}^{P} (x) = 0, x \in U} . \end{matrix}

We have

\begin{matrix} \sum_{i = 1}^{m} A_{i}^{P} (X) = B N 1_{\sum_{i = 1}^{m} A_{i}}^{P} (X) \cup P O S_{\sum_{i = 1}^{m} A_{i}}^{P} (X) . \end{matrix}

(20)

The misclassification costs of approximations of optimistic MGRS come from two uncertain regions

B N 1_{\sum_{i = 1}^{m} A_{i}}^{P} (X)

and

B N 2_{\sum_{i = 1}^{m} A_{i}}^{P} (X)

, which are defined in the following:

\begin{matrix} D C (\sum_{i = 1}^{m} A_{i}^{P} (X)) = \sum_{x \in B N 1_{\sum_{i = 1}^{m} A_{i}}^{P} (X)} λ^{Y} + \sum_{x \in B N 1_{\sum_{i = 1}^{m} A_{i}}^{P} (X)} λ^{N} . \end{matrix}

(21)

Theorem 3.

Let

S = (U, C \cup D, V, f)

be a decision information table,

A_{1}, A_{2}, \dots, A_{m} \subseteq C

,

X \subseteq U

and

X_{1}, X_{2}, \dots, X_{n} \subseteq U

. The following properties hold:

(1): $\sum_{i = 1}^{m} A_{i}^{P} (X) = ⋂_{i = 1}^{m} (A_{i} (X));$
(2): $\sum_{i = 1}^{m} A_{i}^{P} (⋂_{j = 1}^{n} X_{j}) = ⋂_{j = 1}^{n} (\sum_{i = 1}^{m} A_{i}^{P} (X_{j}));$
(3): $\sum_{i = 1}^{m} A_{i}^{P} (⋃_{j = 1}^{n} X_{j}) \supseteq ⋃_{j = 1}^{n} (\sum_{i = 1}^{m} A_{i}^{P} (X_{j}));$
(4): $\underset{̲}{\sum_{i = 1}^{m} A_{i}^{P}} (X) \subseteq \sum_{i = 1}^{m} A_{i}^{P} (X) \subseteq \bar{\sum_{i = 1}^{m} A_{i}^{P}} (X) .$

Proof of Theorem 3.

(1): $\forall x \in \sum_{i = 1}^{m} A_{i}^{P} (X)$ , according to Definition 9, $\bar{μ} ({[x]}_{A_{i}}) \geq γ$ holds, $i = 1, 2, \dots, m$ . According to Definition 5, $x \in A_{i} (X)$ , $i = 1, 2, \dots, m$ and $x \in ⋂_{i = 1}^{m} (A_{i} (X))$ holds, i.e., $\sum_{i = 1}^{m} A_{i}^{P} (X) \subseteq ⋂_{i = 1}^{m} (A_{i} (X))$ . $\forall x \in ⋂_{i = 1}^{m} (A_{i} (X))$ , $\bar{μ} ({[x]}_{A_{i}}) \geq γ$ , $i = 1, 2, \dots, m$ . According to Definition 7, $x \in \sum_{i = 1}^{m} A_{i}^{P} (X)$ holds, i.e., $⋂_{i = 1}^{m} (A_{i} (X)) \subseteq \sum_{i = 1}^{m} A_{i}^{P} (X)$ . Therefore, we have $\sum_{i = 1}^{m} A_{i}^{P} (X) = ⋂_{i = 1}^{m} (A_{i} (X))$ .
(2): From the proof of (1), $\sum_{i = 1}^{m} A_{i}^{P} (⋂_{j = 1}^{n} X_{j}) = ⋂_{i = 1}^{m} A_{i} (⋂_{j = 1}^{n} X_{j})$ . According to Definition 7, $⋂_{i = 1}^{m} A_{i} (⋂_{j = 1}^{n} X_{j}) = ⋂_{i = 1}^{m} ⋂_{j = 1}^{n} A_{i} (X_{j})$ . Because $⋂_{i = 1}^{m} A_{i} (X_{j}) = \sum_{i = 1}^{m} A_{i}^{P} (X_{j})$ , $\sum_{i = 1}^{m} A_{i}^{P} (⋂_{j = 1}^{n} X_{j}) = ⋂_{i = 1}^{m} A_{i} (⋂_{j = 1}^{n} X_{j}) = ⋂_{j = 1}^{n} ⋂_{i = 1}^{m} A_{i} (X_{j}) = ⋂_{j = 1}^{n}$ $(\sum_{i = 1}^{m} A_{i}^{P} (X_{j}))$ .
(3): $\forall x \in ⋃_{j = 1}^{n} (\sum_{i = 1}^{m} A_{i}^{P} (X_{j}))$ , $\exists X_{k} (k \in {1, 2, \dots, n})$ , $x \in \sum_{i = 1}^{m} A_{i}^{P} (X_{j})$ . According to Definition 7, $\forall X_{j}, j = 1, 2, \dots, n$ , $\bar{μ} ({[x]}_{A_{i}}) \geq γ$ , $, i = 1, 2, \dots, m$ , $x \in \sum_{i = 1}^{m} A_{i}^{P} (⋃_{j = 1}^{n} (X_{j}))$ holds. Therefore, $\sum_{i = 1}^{m} A_{i}^{P} (⋃_{j = 1}^{n} X_{j}) \supseteq ⋃_{j = 1}^{n} (\sum_{i = 1}^{m} A_{i}^{P} (X_{j}))$ .
(4): It is easy to prove by Formulas (4), (5), and (19).

□

Theorem 4.

Let

S = (U, C \cup D, V, f)

be a decision information table,

A_{1}, A_{2}, \dots, A_{m} \subseteq C

and

X \subseteq U

,

A = A_{1} \cup A_{2} \cup \dots \cup A_{m}

and

X_{1}, X_{2}, \dots, X_{n} \subseteq U

. The following properties hold:

(1): $\sum_{i = 1}^{m} A_{i}^{P} (X) \subseteq \sum_{i = 1}^{m} A_{i}^{O} (X) \subseteq A (X);$
(2): $\sum_{i = 1}^{m - 1} A_{i}^{O} (X) \subseteq \sum_{i = 1}^{m} A_{i}^{O} (X)$ and $\sum_{i = 1}^{m - 1} A_{i}^{P} (X) \supseteq \sum_{i = 1}^{m} A_{i}^{P} (X) .$

Proof of Theorem 4.

(1): According to Definition 6, we only need to prove $\sum_{i = 1}^{m} A_{i}^{P} (X) \subseteq \sum_{i = 1}^{m} A_{i}^{O} (X)$ . $\forall x \in \sum_{i = 1}^{m} A_{i}^{P} (X)$ , according to Definition 7, $\bar{μ} ({[x]}_{A_{i}}) \geq γ$ . From Definition 5, we have $x \in \sum_{i = 1}^{m} A_{i}^{O} (X)$ .
(2): It is easy to prove according to Definitions 5 and 7.

□

Lemma 1.

Let

S = (U, C \cup D, V, f)

be a decision information table,

A \subseteq C

and

X \subseteq U

,

U / A = {E_{1}, E_{2}, . . ., E_{l}}

is a granularity layer on U. The following properties hold:

(1): $\sum_{E_{i} \in B N 1 (X)} λ_{E_{i}}^{Y} \leq \sum_{E_{i} \in B N 1 (X)} λ_{E_{i}}^{N};$
(2): $\sum_{E_{i} \in B N 2 (X)} λ_{E_{i}}^{N} \leq \sum_{E_{i} \in B N 2 (X)} λ_{E_{i}}^{Y} .$ Here, $E_{i} \in U / A (i = 1, 2, \dots, l)$ .

Proof of Lemma 1.

(1): $λ_{E_{i}}^{Y} - λ_{E_{i}}^{N} = λ_{12} (1 - \bar{μ} (E_{i})) |E_{i}| - λ_{21} \bar{μ} (E_{i}) |E_{i}| = |E_{i}| (λ_{12} - (λ_{12} + λ_{21})) \bar{μ} (E_{i})$ , because $E_{i} \in B N 1 (X)$ , we have $\frac{λ_{12}}{λ_{12} + λ_{21}} \leq \bar{μ} (E_{i}) < 1$ , then $λ_{E_{i}}^{Y} \leq λ_{E_{i}}^{N}$ , therefore $\sum_{E_{i} \in B N 1 (X)} λ_{E_{i}}^{Y} \leq \sum_{E_{i} \in B N 1 (X)} λ_{E_{i}}^{N}$ .
(2): $λ_{E_{i}}^{Y} - λ_{E_{i}}^{N} = λ_{12} (1 - \bar{μ} (E_{i})) |E_{i}| - λ_{21} \bar{μ} (E_{i}) |E_{i}| = |E_{i}| (λ_{12} - (λ_{12} + λ_{21})) \bar{μ} (E_{i})$ . Because $E_{i} \in B N 2 (X)$ , we have $0 < \bar{μ} (E_{i}) < \frac{λ_{12}}{λ_{12} + λ_{21}}$ , then $λ_{E_{i}}^{Y} \geq λ_{E_{i}}^{N}$ . Therefore, $\sum_{E_{i} \in B N 2 (X)} λ_{E_{i}}^{N} \leq \sum_{E_{i} \in B N 2 (X)} λ_{E_{i}}^{Y}$ .

□

Lemma 1 shows that the misclassification costs incurred by the equivalence classes in characterizing X are not more than the misclassification costs incurred by the equivalence classes when they do not characterize X in

B N 1 (X)

. Moreover, misclassification costs incurred by the equivalence classes when they do not characterize X are not more than the misclassification costs incurred by the equivalence classes in characterizing X in

B N 2 (X)

.

Theorem 5.

Let

S = (U, C \cup D, V, f)

be a decision information table,

A \subseteq C

and

X \subseteq U

, the following properties hold:

(1): $\sum_{x \in B N 1_{\sum_{i = 1}^{m} A_{i}}^{O} (X)} λ_{}^{Y} \leq \sum_{x \in B N 1_{\sum_{i = 1}^{m} A_{i}}^{O} (X)} λ_{}^{N}$ and $\sum_{x \in B N 1_{\sum_{i = 1}^{m} A_{i}}^{P} (X)} λ_{}^{Y} \leq \sum_{x \in B N 1_{\sum_{i = 1}^{m} A_{i}}^{P} (X)} λ_{}^{N};$
(2): $\sum_{x \in B N 2_{\sum_{i = 1}^{m} A_{i}}^{O} (X)} λ^{N} \leq \sum_{x \in B N 2_{\sum_{i = 1}^{m} A_{i}}^{O} (X)} λ^{Y}$ and $\sum_{x \in B N 1_{\sum_{i = 1}^{m} A_{i}}^{P} (X)} λ^{N} \leq \sum_{x \in B N 1_{\sum_{i = 1}^{m} A_{i}}^{P} (X)} λ^{Y} .$

Proof of Theorem 5.

From Lemma 1, Theorem 5 holds. □

Theorem 5 shows that misclassification costs incurred by the equivalence classes in characterizing X are not more than misclassification costs incurred by the equivalence classes when they do not characterize X in

B N 1_{\sum_{i = 1}^{m} A_{i}}^{O} (X)

and

B N 1_{\sum_{i = 1}^{m} R_{i}}^{P} (X)

. Moreover, misclassification costs incurred by the equivalence classes when they do not characterize X are not more than misclassification costs incurred by the equivalence classes in characterizing X in

B N 2_{\sum_{i = 1}^{m} A_{i}}^{O} (X)

and

B N 2_{\sum_{i = 1}^{m} A_{i}}^{P} (X)

.

D C (\sum_{i = 1}^{m} A_{i}^{O} (X))

,

D C (\underset{̲}{\sum_{i = 1}^{m} A_{i}^{O}} (X))

and

D C (\bar{\sum_{i = 1}^{m} A_{i}^{O}} (X))

denote the misclassification costs generated when

\sum_{i = 1}^{m} A_{i}^{O} (X)

,

\underset{̲}{\sum_{i = 1}^{m} A_{i}^{O}} (X)

and

\bar{\sum_{i = 1}^{m} A_{i}^{O}} (X)

are approximated to X, respectively. Then, the following theorem holds:

Theorem 6.

Let

S = (U, C \cup D, V, f)

be a decision information table,

A \subseteq C

and

X \subseteq U

, Then,

D C (\sum_{i = 1}^{m} A_{i}^{O} (X)) \leq D C (\underset{̲}{\sum_{i = 1}^{m} A_{i}^{O}} (X))

and

D C (\sum_{i = 1}^{m} A_{i}^{O} (X)) \leq D C (\bar{\sum_{i = 1}^{m} A_{i}^{O}} (X))

.

Proof of Theorem 6.

When

\underset{̲}{\sum_{i = 1}^{m} A_{i}^{O}} (X)

is taken as the approximation of X,

D C (\underset{̲}{\sum_{i = 1}^{m} A_{i}^{O}} (X)) = \sum_{x \in B N (X)} λ^{N}

; when

\bar{\sum_{i = 1}^{m} A_{i}^{O}} (X)

is considered the approximation of X,

D C (\bar{\sum_{i = 1}^{m} A_{i}^{O}} (X)) = \sum_{x \in B N (X)} λ^{Y}

; when

D C (\sum_{i = 1}^{m} A_{i}^{O} (X))

is considered the approximation of X,

D C (\sum_{i = 1}^{m} A_{i}^{O} (X)) = \sum_{x \in B N 1 (X)} λ^{Y} + \sum_{x \in B N 2 (X)} λ^{N}

.

Because

B N (X) = B N 1 (X) \cup B N 2 (X)

, we have:

\begin{matrix} D C (\underset{̲}{\sum_{i = 1}^{m} A_{i}^{O}} (X)) = \sum_{x \in B N 1 (X)} λ^{N} + \sum_{x \in B N 2 (X)} λ^{N}, \\ D C (\bar{\sum_{i = 1}^{m} A_{i}^{O}} (X)) = \sum_{x \in B N 1 (X)} λ^{Y} + \sum_{x \in B N 2 (X)} λ^{Y} . \end{matrix}

Therefore, according to Theorem 5,

\begin{matrix} D C (\sum_{i = 1}^{m} A_{i}^{O} (X)) \leq D C (\underset{̲}{\sum_{i = 1}^{m} A_{i}^{O}} (X)), \\ D C (\sum_{i = 1}^{m} A_{i}^{O} (X)) \leq D C (\bar{\sum_{i = 1}^{m} A_{i}^{O}} (X)) . \end{matrix}

Theorem 6 indicates that when

D C (\sum_{i = 1}^{m} A_{i}^{O} (X))

,

D C (\underset{̲}{\sum_{i = 1}^{m} A_{i}^{O}} (X))

and

D C (\bar{\sum_{i = 1}^{m} A_{i}^{O}} (X))

are used as approximations of X, respectively,

D C (\sum_{i = 1}^{m} A_{i}^{O} (X))

generates the least misclassification costs. □

D C (\sum_{i = 1}^{m} A_{i}^{P} (X))

,

D C (\underset{̲}{\sum_{i = 1}^{m} A_{i}^{P}} (X))

and

D C (\bar{\sum_{i = 1}^{m} A_{i}^{P}} (X))

denote the misclassification costs generated when

\sum_{i = 1}^{m} A_{i}^{P} (X)

,

\underset{̲}{\sum_{i = 1}^{m} A_{i}^{P}} (X)

and

\bar{\sum_{i = 1}^{m} A_{i}^{P}} (X)

are approximated to X, respectively.

Theorem 7.

Let

S = (U, C \cup D, V, f)

be a decision information table,

A \subseteq C

and

X \subseteq U

. Then,

D C (\sum_{i = 1}^{m} A_{i}^{P} (X)) \leq D C (\underset{̲}{\sum_{i = 1}^{m} A_{i}^{P}} (X))

and

D C (\sum_{i = 1}^{m} A_{i}^{P} (X)) \leq D C (\bar{\sum_{i = 1}^{m} A_{i}^{P}} (X))

.

Proof of Theorem 7.

Similar to Theorem 6, Theorem 7 is easy to prove. □

From Theorem 7, when

D C (\sum_{i = 1}^{m} A_{i}^{P} (X))

,

D C (\underset{̲}{\sum_{i = 1}^{m} A_{i}^{P}} (X))

and

D C (\bar{\sum_{i = 1}^{m} A_{i}^{P}} (X))

are used as approximations of X, respectively,

D C (\sum_{i = 1}^{m} A_{i}^{P} (X))

generates the least misclassification costs. Theorems 6 and 7 reflect the advantages of the multigranulation approximation sets that are used for approximating the target concept.

5.2. The Optimal Multigranulation Approximation Selection Method

The objects in the boundary region may be reclassified under different granularities. As a result, the equivalence classes used to represent the approximation set will be changed in the boundary region. In practical applications, the optimal approximation selection should consider both the misclassification and test costs. In MGRS, uncertain concepts characterized in a finer approximation layer result in lower misclassification costs, and test costs increase with the added attributes. Therefore, it is essential to find a balance between misclassification and test costs.

Lemma 2.

Let

S = (U, C \cup D, V, f)

be a decision information table,

A_{1}, A_{2}, \dots, A_{m} \subseteq C

,

A_{1} \subseteq A_{2} \subseteq, \dots, \subseteq A_{m}

, and

X \subseteq U

, then

\forall x \in U

,

{\bar{μ}}_{\sum_{i = 1}^{m - 1} A_{i}}^{O} (x) \leq {\bar{μ}}_{\sum_{i = 1}^{m} A_{i}}^{O} (x)

.

Lemma 3.

Let

S = (U, C \cup D, V, f)

be a decision information table,

A_{1}, A_{2}, \dots, A_{m} \subseteq C

,

A_{1} \subseteq A_{2} \subseteq, \dots, \subseteq A_{m}

, and

X \subseteq U

, then

\forall x \in U

,

{\bar{μ}}_{\sum_{i = 1}^{m - 1} A_{i}}^{P} (x) \geq {\bar{μ}}_{\sum_{i = 1}^{m} A_{i}}^{P} (x)

.

Theorem 8.

Let

S = (U, C \cup D, V, f)

be a decision information table,

A_{1}, A_{2}, \dots, A_{m} \subseteq C

,

A_{1} \subseteq A_{2} \subseteq, \dots, \subseteq A_{m}

and

X \subseteq U

, only when

\bar{μ} (x)

(

x \in B N 1_{\sum_{i = 1}^{m - 1} A_{i}}^{O} (X)

) changed with the attribute increased, then

D C (\sum_{i = 1}^{m - 1} A_{i}^{O} (X)) \geq D C (\sum_{i = 1}^{m} A_{i}^{O} (X))

.

Proof of Theorem 8.

\begin{matrix} D C (\sum_{i = 1}^{m} A_{i}^{O} (X)) & = \sum_{x \in B N 1_{\sum_{i = 1}^{m} A_{i}}^{O} (X)} λ^{Y} + \sum_{x \in B N 2_{\sum_{i = 1}^{m} A_{i}}^{O} (X)} λ^{N} \\ = \sum_{x \in B N 1_{\sum_{i = 1}^{m} A_{i}}^{O} (X)} λ_{12} (1 - \bar{μ} (x)) + \sum_{x \in B N 2_{\sum_{i = 1}^{m} A_{i}}^{O} (X)} λ_{21} \bar{μ} (x) . \end{matrix}

According to Lemma 2, we have

\begin{matrix} {\bar{μ}}_{\sum_{i = 1}^{m - 1} A_{i}}^{O} (x) \leq {\bar{μ}}_{\sum_{i = 1}^{m} A_{i}}^{O} (x) . \end{matrix}

Obviously,

D C (\sum_{i = 1}^{m - 1} A_{i}^{O} (X)) \geq D C (\sum_{i = 1}^{m} A_{i}^{O} (X))

. □

From Theorem 7, for optimistic MGRS, to reduce the misclassification costs of the approximation, we can add the attribute that only changes the membership of objects in

x \in B N 1_{\sum_{i = 1}^{m - 1} A_{i}}^{O} (X)

.

Theorem 9.

Let

S = (U, C \cup D, V, f)

be a decision information table,

A_{1}, A_{2}, \dots, A_{m} \subseteq C

and

A_{1} \subseteq A_{2} \subseteq, \dots, \subseteq A_{m}

,

X \subseteq U

, only when

\bar{μ} (x)

(

x \in B N 2_{\sum_{i = 1}^{m - 1} A_{i}}^{P} (X)

) changed with the attributes increased, then

D C (\sum_{i = 1}^{m - 1} A_{i}^{P} (X)) \geq D C (\sum_{i = 1}^{m} A_{i}^{P} (X))

.

Proof of Theorem 9.

\begin{matrix} D C (\sum_{i = 1}^{m} A_{i}^{P} (X)) & = \sum_{x \in B N 1_{\sum_{i = 1}^{m} A_{i}}^{P} (X)} λ^{Y} + \sum_{x \in B N 2_{\sum_{i = 1}^{m} A_{i}}^{P} (X)} λ^{N} \\ = \sum_{x \in B N 1_{\sum_{i = 1}^{m} A_{i}}^{P} (X)} λ_{12} (1 - \bar{μ} (x)) + \sum_{x \in B N 2_{\sum_{i = 1}^{m} A_{i}}^{P} (X)} λ_{21} \bar{μ} (x) . \end{matrix}

According to Lemma 3, we have

\begin{matrix} {\bar{μ}}_{\sum_{i = 1}^{m - 1} A_{i}}^{P} (x) \geq {\bar{μ}}_{\sum_{i = 1}^{m} A_{i}}^{P} (x) . \end{matrix}

Obviously,

D C (\sum_{i = 1}^{m - 1} A_{i}^{P} (X)) \geq D C (\sum_{i = 1}^{m} A_{i}^{P} (X))

. □

From Theorem 9, for pessimistic MGRS, to reduce the misclassification costs of the approximation, we can add the attribute that only changes the membership of objects in

x \in B N 2_{\sum_{i = 1}^{m - 1} R_{i}}^{O} (X)

.

In practical applications, on the one hand, the factors included in the test cost, such as money, time, environment, etc., are hard to evaluate objectively. On the other hand, these factors are hard to be integrated because of their different dimensions. In this section, we will evaluate test costs in an attribute-driven form, which are more objective.

Definition 10.

Let

S = (U, C \cup D, V, f)

be a decision information table,

a \in C

and

X \subseteq U

, the significance of a is defined as follows:

\begin{matrix} S i g (a, C, D) = D C_{C - {a}} - D C_{C} . \end{matrix}

(22)

Definition 11.

Let

S = (U, C \cup D, V, f)

be a decision information table,

a \in C

,

A \subseteq C

and

X \subseteq U

; the test cost to construct

A (X)

is defined as follows:

\begin{matrix} T C_{A} = \sum_{a \in A} S i g (a, C, D) . \end{matrix}

(23)

In this paper, for simplicity, to present the optimal granularity selection of the multigranulation approximation, we only use the optimistic MGRS as an example.

Definition 12.

Let

S = (U, C \cup D, V, f)

be a decision information table,

A_{1}, A_{2}, \dots, A_{m} \subseteq C

and

X \subseteq U

; the test cost to construct

\sum_{i = 1}^{m} A_{i}^{O} (X)

can be defined as follows:

\begin{matrix} T C_{\sum_{i = 1}^{m} A_{i}^{O}} = \sum_{i = 1}^{m} T C_{A_{i}} . \end{matrix}

(24)

In this paper, the misclassification and test costs for user requirements are represented as

D C_{u}

and

T C_{u}

, respectively. A multigranulation approximation

\sum_{i = 1}^{k} A_{i}^{O} (X)

is selected to meet the constraints

D C_{\sum_{i = 1}^{k} A_{i}^{O} (X)} \leq D C_{u}

and

T C_{\sum_{i = 1}^{k} A_{i}^{O} (X)} \leq T C_{u}

, then the related decision are made on

\sum_{i = 1}^{k} A_{i}^{O} (X)

. Figure 3 presents the optimal multigranulation approximation selection of optimistic MGRS. Herein,

\sum_{i = 1}^{3} A_{i}^{O} (X)

complies with the requirements of the misclassification costs and fails to comply with the requirements of the test costs;

A_{1}^{O} (X)

complies with the requirements of the test cost and fails to comply with the requirements of misclassification costs;

\sum_{i = 1}^{2} A_{i}^{O} (X)

complies with both requirements of misclassification costs and test costs, enabling effective calculations according to granularity optimization. Similarly, the optimal approximation selection of pessimistic MGRS is the same. We formalize the computation as an optimization problem:

\underset{k}{arg min} C o s t_{\sum_{i = 1}^{k} A_{i}^{O} (X)},

(25)

\begin{matrix} s . t . \\ ξ D C_{\sum_{i = 1}^{k} A_{i}^{O} (X)} \leq D C_{u}; \\ T C_{\sum_{i = 1}^{k} A_{i}^{O} (X)} \leq T C_{u} . \end{matrix}

where

C o s t_{\sum_{i = 1}^{k} A_{i}^{O} (X)} = ξ D C_{\sum_{i = 1}^{k} A_{i}^{O}} + T C_{\sum_{i = 1}^{k} A_{i}^{O}}

, and

C o s t_{\sum_{i = 1}^{k} A_{i}^{O} (X)}

denotes the total cost for constructing

\sum_{i = 1}^{k} A_{i}^{O} (X)

.

ξ = \frac{|U|}{|\sum_{i = 1}^{m} A_{i}^{O} (X)|} * \frac{1}{D C (A_{m} (X))}

reflects the contribution degree of the multigranulation approximation layer for the misclassification costs of

\sum_{i = 1}^{k} A_{i}^{O} (X)

.

5.3. Case Study

Table 2 is an evaluation form of the company venture capital given by five experts.

U = {x_{1}, x_{2}, \dots, x_{900}}

is 900 investment plans, which are evaluated by 5 experts. The risk level is divided into three categories, including high and low. Suppose that

D C_{u} = 5

and

T C_{u} = 2.5

.

(1): According to the above conditions, the attribute significance can be computed by Formula (25), which is shown in Table 3.
(2): Each multigranulation approximation is obtained by adding attributes in ascending order of attribute significance, $E_{4} \to E_{3} \to E_{2} \to E_{1} \to E_{5}$ in each stage, respectively. We represent the attributes as follows: $A_{1} = E_{4}$ , $A_{2} = E_{3}$ , $A_{3} = E_{2}$ , $A_{4} = E_{1}$ , and $A_{5} = E_{5}$ .
(3): For each multigranulation approximation layer, $D C_{\sum_{i = 1}^{k} A_{i}^{O}}$ , $T C_{\sum_{i = 1}^{k} A_{i}^{O}}$ , and $C o s t_{\sum_{i = 1}^{k} A_{i}^{O} (X)}$ are computed by Formulas (16), (24), and (25), respectively, where $k = 1, 2, \dots, 5$ and the results are displayed in Table 4.

C o s t_{\sum_{i = 1}^{k} A_{i}^{O} (X)}

changes with the increased attributes and only

C o s t_{\sum_{i = 1}^{3} A_{i}^{O} (X)}

and

C o s t_{\sum_{i = 1}^{4} A_{i}^{O} (X)}

satisfy

D C_{\sum_{i = 1}^{k} A_{i}^{O} (X)} \leq D C_{u}

and

T C_{\sum_{i = 1}^{k} A_{i}^{O} (X)} \leq T C_{u}

at the same time. According to the Formula (25), we choose the multigranulation approximation layer with the lowest total cost from the above layers; its corresponding approximation layer is

\sum_{i = 1}^{3} A_{i}^{O} (X)

. Therefore,

\sum_{i = 1}^{3} A_{i}^{O} (X)

is the optimal multigranulation approximation used for deciding investment plans, because it possesses lower misclassification costs, i.e., from the perspective of optimistic MGRS,

E_{4}

,

E_{3}

, and

E_{2}

are reasonable expert sets. The analysis of the case study shows that the proposed method can search for a reasonable approximation under the constraint conditions.

6. Simulation Experiment and Result Analysis

6.1. Simulation Experiment

In this section, the effectiveness and rationality of our model are demonstrated as shown by illustrative experiments. The computer used in the experiments was a WIN 10 operating system with 3.10-GHz CPU and 16.0 GB RAM, and the programming software was MATLAB R2022a. The capability of the proposed model was evaluated on six UCI datasets, which are shown in Table 5. In our experiments, we randomly took away some known attribute values from datasets 10–12 to create incomplete decision systems. The missing values are randomly distributed on all conditional attributes.

From Figure 4, for classical rough sets, the misclassification costs of the approximation model monotonously decrease with the granularity being finer, which complies with human cognitive habits.

In Figure 5,

O D C 1

,

O D C 2

,

O D C 3

and

O D C 4

represent

\sum_{x \in B N 1_{\sum_{i = 1}^{m} A_{i}}^{O} (X)} λ^{Y}

,

\sum_{x \in B N 1_{\sum_{i = 1}^{m} A_{i}}^{O} (X)} λ^{N}

,

\sum_{x \in B N 2_{\sum_{i = 1}^{m} A_{i}}^{O} (X)} λ^{N}

and

\sum_{x \in B N 2_{\sum_{i = 1}^{m} A_{i}}^{O} (X)} λ^{Y}

, respectively.

P D C 1

,

P D C 2

,

P D C 3

and

P D C 4

represent

\sum_{x \in B N 1_{\sum_{i = 1}^{m} A_{i}}^{P} (X)} λ^{Y}

,

\sum_{x \in B N 1_{\sum_{i = 1}^{m} A_{i}}^{P} (X)} λ^{N}

,

\sum_{x \in B N 2_{\sum_{i = 1}^{m} A_{i}}^{P} (X)} λ^{N}

and

\sum_{x \in B N 1_{\sum_{i = 1}^{m} A_{i}}^{P} (X)} λ^{Y}

, respectively. From Figure 5, under different granular layers, misclassification costs incurred by the equivalence classes in approximating X are always less than or equal to misclassification costs incurred by the equivalence classes when they do not characterize X in

B N 1_{\sum_{i = 1}^{m} A_{i}}^{O} (X)

and

B N 1_{\sum_{i = 1}^{m} A_{i}}^{P} (X)

. Moreover, misclassification costs incurred by equivalence classes when they do not characterize X are less than or equal to the misclassification costs incurred by the equivalence classes in approximating X in

B N 2_{\sum_{i = 1}^{m} A_{i}}^{O} (X)

and

B N 2_{\sum_{i = 1}^{m} A_{i}}^{P} (X)

. This is consistent with Theorem 4.

In Figure 6, the horizontal and vertical axes denote the granularity and misclassification costs, respectively.

O D C_l o w e r

,

O D C_u p p e r

and

O D C

represent

D C (\underset{̲}{\sum_{i = 1}^{m} A_{i}^{O}} (X))

,

D C (\bar{\sum_{i = 1}^{m} A_{i}^{O}} (X))

and

D C (\sum_{i = 1}^{m} A_{i}^{O} (X))

.

P D C_l o w e r

,

P D C_u p p e r

and

P D C

represent

D C (\underset{̲}{\sum_{i = 1}^{m} A_{i}^{P}} (X))

,

D C (\bar{\sum_{i = 1}^{m} A_{i}^{P}} (X))

and

D C (\sum_{i = 1}^{m} A_{i}^{P} (X))

, respectively, namely, the misclassification costs generated when

\underset{̲}{\sum_{i = 1}^{m} A_{i}^{O}} (X)

,

\bar{\sum_{i = 1}^{m} R_{i}^{O}} (X)

and

\sum_{i = 1}^{m} A_{i}^{O} (X)

are approximated to X and the misclassification costs generated when

\underset{̲}{\sum_{i = 1}^{m} A_{i}^{P}} (X)

,

\bar{\sum_{i = 1}^{m} A_{i}^{P}} (X)

and

\sum_{i = 1}^{m} A_{i}^{P} (X)

are approximated to X. Obviously, compared with

\underset{̲}{\sum_{i = 1}^{m} A_{i}^{O}} (X)

and

\bar{\sum_{i = 1}^{m} A_{i}^{O}} (X)

, the misclassification costs of

\sum_{i = 1}^{m} A_{i}^{O} (X)

are the least on each granularity. Similarly, compared with

\underset{̲}{\sum_{i = 1}^{m} A_{i}^{P}} (X)

,

\bar{\sum_{i = 1}^{m} A_{i}^{P}} (X)

, the misclassification costs of

\sum_{i = 1}^{m} A_{i}^{P} (X)

are the least on each granularity. This is consistent with Theorems 5 and 6.

6.2. Results and Discussions

According to the above experiments, compared with the upper/lower approximation sets, we can conclude that the multigranulation approximations have the following advantages when applied to decision-making environments:

(1): The misclassification costs of the approximation model monotonously decrease with the granularity being finer;
(2): In multigranulation approximations, under different granular layers, the misclassification costs incurred by the equivalence classes in approximating X are less than or equal to the misclassification costs incurred by the equivalence classes when they do not characterize X in boundary region I of optimistic and pessimistic rough sets. Moreover, the misclassification costs incurred by equivalence classes when they do not characterize X are less than or equal to the misclassification costs incurred by the equivalence classes in approximating X in boundary region II of optimistic and pessimistic rough sets;
(3): Compared with the upper/lower approximation sets, the misclassification costs of the multigranulation approximations are the least on each granularity.

7. Conclusions

In MGRS, optimistic and pessimistic upper/lower approximation boundaries are utilized to characterize uncertain concepts. They still cannot take advantage of the known equivalence classes to establish the approximation of an uncertain concept. To handle the problem, cost-sensitive multigranulation approximations of rough sets were constructed. Furthermore, an optimization mechanism of the multigranulation approximations is proposed, which selects the optimal approximation to obtain the minimum misclassification costs under the conditions. The case study shows that the proposed algorithm is capable of searching for a rational approximation under restraints. Finally, the experiments demonstrate that the multigranulation approximations possess the least misclassification costs. In particular, our models apply to the decision-making environment where each decision-maker is independent. Moreover, our models are useful for extracting decision rules from distributive information systems and groups of intelligent agents through rough set approaches [34,36]. Figure 7 presents a diagram that summarizes the works conducted in this paper. Herein, we present the process of the cost-sensitive multigranulation approximations of rough sets; according to different granulation mechanisms, our approach can be extended to uncertainty models, i.e., vague sets, shadow sets, and neighborhood rough sets. These results will be important to contribute to the progress of the GrC theory.

Our future work will focus on the following two aspects: (1) We hope to build a more reasonable three-way decision model based on our model from optimistic and pessimistic perspectives; (2) we wish to combine the model with the cloud model theory to extend our model to construct a multigranulation approximation model with bidirectional cognitive computing. This will offer more cognitive advantages and benefits in application fields with uncertainty from multiple perspectives, i.e., image segmentation, clustering, and recommendation systems.

Author Contributions

Conceptualization, J.Y.; methodology, J.Y., J.K. and Q.L.; writing—original draft, J.Y. and J.K.; writing—review and editing, J.Y., J.K., Q.L. and Y.L.; data curation, J.Y., Q.L. and Y.L.; supervision, Y.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the National Science Foundation of China (No. 6206049), Excellent Young Scientific and Technological Talents Foundation of Guizhou Province (QKH-platform talent (2021) No. 5627), the Key Cooperation Project of Chongqing Municipal Education Commission (HZ2021008), Guizhou Provincial Science and Technology Project (QKH-ZK (2021) General 332), Science and Technology Top Talent Project of Guizhou Education Department (QJJ2022(088)), Key Laboratory of Evolutionary Artificial Intelligence in Guizhou (QJJ[2022] No. 059) and the Key Talens Program in digital economy of Guizhou Province, Electronic Manufacturing Industry University Research Base of Ordinary Colleges and Universities in Guizhou Province (QJH-KY Zi (2014) No. 230-2).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

This study was mainly completed at the Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, and the authors would like to thank the laboratory for its assistance.

Conflicts of Interest

The authors declare no conflict of interest.

References

Zadeh, L.A. Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic. Fuzzy Sets Syst. 1997, 90, 111–127. [Google Scholar] [CrossRef]
Bello, M.; Nápoles, G.; Vanhoof, K.; Bello, R. Data quality measures based on granular computing for multi-label classification. Inf. Sci. 2021, 560, 51–67. [Google Scholar] [CrossRef]
Pedrycz, W.; Chen, S. Interpretable Artificial Intelligence: A Perspective of Granular Computing; Springer Nature: Berlin/Heidelberg, Germany, 2021; Volume 937. [Google Scholar]
Li, J.; Mei, C.; Xu, W.; Qian, Y. Concept learning via granular computing: A cognitive viewpoint. IEEE Trans. Fuzzy Syst. 2015, 298, 447–467. [Google Scholar] [CrossRef] [PubMed]
Zadeh, L. Fuzzy sets. Inf. Control. 1965, 8, 338–353. [Google Scholar] [CrossRef] [Green Version]
Pawlak, Z. Rough sets. Int. J. Comput. Inf. Sci. 1982, 11, 341–356. [Google Scholar] [CrossRef]
Zhang, L.; Zhang, B. The quotient space theory of problem solving. In Proceedings of the International Workshop on Rough Sets, Fuzzy Sets, Data Mining, and Granular-Soft Computing, Chongqing, China, 26–29 May 2003; pp. 11–15. [Google Scholar]
Li, D.Y.; Meng, H.J.; Shi, X.M. Membership clouds and membership cloud generators. J. Comput. Res. Dev. 1995, 32, 15–20. [Google Scholar]
Colas-Marquez, R.; Mahfouf, M. Data Mining and Modelling of Charpy Impact Energy for Alloy Steels Using Fuzzy Rough Sets. IFAC-Pap. 2017, 50, 14970–14975. [Google Scholar] [CrossRef]
Hasegawa, K.; Koyama, M.; Arakawa, M.; Funatsu, K. Application of data mining to quantitative structure-activity relationship using rough set theory. Chemom. Intell. Lab. Syst. 2009, 99, 66–70. [Google Scholar] [CrossRef]
Santra, D.; Basu, S.K.; Mandal, J.K.; Goswami, S. Rough set based lattice structure for knowledge representation in medical expert systems: Low back pain management case study. Expert Syst. Appl. 2020, 145, 113084. [Google Scholar] [CrossRef] [Green Version]
Chebrolu, S.; Sanjeevi, S.G. Attribute Reduction in Decision-Theoretic Rough Set Model using Particle Swarm Optimization with the Threshold Parameters Determined using LMS Training Rule. Procedia Comput. Sci. 2015, 57, 527–536. [Google Scholar] [CrossRef] [Green Version]
Abdolrazzagh-Nezhad, M.; Radgohar, H.; Salimian, S.N. Enhanced cultural algorithm to solve multi-objective attribute reduction based on rough set theory. Math. Comput. Simul. 2020, 170, 332–350. [Google Scholar] [CrossRef]
Beaubier, S.; Defaix, C.; Albe-Slabi, S.; Aymes, A.; Galet, O.; Fournier, F.; Kapel, R. Multiobjective decision making strategy for selective albumin extraction from a rapeseed cold-pressed meal based on Rough Set approach. Food Bioprod. Process. 2022, 133, 34–44. [Google Scholar] [CrossRef]
Landowski, M.; Landowska, A. Usage of the rough set theory for generating decision rules of number of traffic vehicles. Transp. Res. Procedia 2019, 39, 260–269. [Google Scholar] [CrossRef]
Tawhid, M.; Ibrahim, A. Feature selection based on rough set approach, wrapper approach, and binary whale optimization algorithm. Int. J. Mach. Learn. Cybern. 2020, 11, 573–602. [Google Scholar] [CrossRef]
Zhang, Q.H.; Wang, G.Y.; Yu, X. Approximation sets of rough sets. J. Softw. 2012, 23, 1745–1759. [Google Scholar] [CrossRef]
Zhang, Q.H.; Wang, J.; Wang, G.Y. The approximate representation of rough-fuzzy sets. Chin. J. Comput. Jisuanji Xuebao 2015, 38, 1484–1496. [Google Scholar]
Zhang, Q.; Wang, J.; Wang, G.; Yu, H. The approximation set of a vague set in rough approximation space. Inf. Sci. 2015, 300, 1–19. [Google Scholar] [CrossRef]
Zhang, Q.H.; Zhang, P.; Wang, G.Y. Research on approximation set of rough set based on fuzzy similarity. J. Intell. Fuzzy Syst. 2017, 32, 2549–2562. [Google Scholar] [CrossRef]
Zhang, Q.H.; Yang, J.J.; Yao, L.Y. Attribute reduction based on rough approximation set in algebra and information views. IEEE Access 2016, 4, 5399–5407. [Google Scholar] [CrossRef]
Yao, L.Y.; Zhang, Q.H.; Hu, S.P.; Zhang, Q. Rough entropy for image segmentation based on approximation sets and particle swarm optimization. J. Front. Comput. Sci. Technol. 2016, 10, 699–708. [Google Scholar]
Zhang, Q.H.; Liu, K.X.; Gao, M. Approximation sets of rough sets and granularity optimization algorithm based on cost-sensitive. J. Control. Decis. 2020, 35, 2070–2080. [Google Scholar]
Yang, J.; Yuan, L.; Luo, T. Approximation set of rough fuzzy set based on misclassification cost. J. Chongqing Univ. Posts Telecommun. (Nat. Sci. Ed.) 2021, 33, 780–791. [Google Scholar]
Yang, J.; Luo, T.; Zeng, L.J.; Jin, X. The cost-sensitive approximation of neighborhood rough sets and granular layer selection. J. Intell. Fuzzy Syst. 2022, 42, 3993–4003. [Google Scholar] [CrossRef]
Siminski, K. 3WDNFS—Three-way decision neuro-fuzzy system for classification. Fuzzy Sets Syst. 2022, in press. [Google Scholar] [CrossRef]
Subhashini, L.; Li, Y.; Zhang, J.; Atukorale, A.S. Assessing the effectiveness of a three-way decision-making framework with multiple features in simulating human judgement of opinion classification. Inf. Process. Manag. 2022, 59, 102823. [Google Scholar] [CrossRef]
Subhashini, L.; Li, Y.; Zhang, J.; Atukorale, A.S. Integration of semantic patterns and fuzzy concepts to reduce the boundary region in three-way decision-making. Inf. Sci. 2022, 595, 257–277. [Google Scholar] [CrossRef]
Mondal, A.; Roy, S.K.; Pamucar, D. Regret-based three-way decision making with possibility dominance and SPA theory in incomplete information system. Expert Syst. Appl. 2023, 211, 118688. [Google Scholar] [CrossRef]
Yao, Y.Y. Symbols-Meaning-Value (SMV) space as a basis for a conceptual model of data science. Int. J. Approx. Reason. 2022, 144, 113–128. [Google Scholar] [CrossRef]
Qian, Y.H.; Liang, J.Y.; Dang, C.Y. Incomplete multigranulation rough set. IEEE Trans. Syst. Man-Cybern.-Part Syst. Humans 2009, 40, 420–431. [Google Scholar] [CrossRef]
Huang, B.; Guo, C.X.; Zhuang, Y.L.; Li, H.X.; Zhou, X.Z. Intuitionistic fuzzy multigranulation rough sets. Inf. Sci. 2014, 277, 299–320. [Google Scholar] [CrossRef]
Li, F.J.; Qian, Y.H.; Wang, J.T.; Liang, J. Multigranulation information fusion: A Dempster-Shafer evidence theory-based clustering ensemble method. Inf. Sci. 2017, 378, 389–409. [Google Scholar] [CrossRef]
Liu, X.; Qian, Y.H.; Liang, J.Y. A rule-extraction framework under multigranulation rough sets. Int. J. Mach. Learn. Cybern. 2014, 5, 319–326. [Google Scholar] [CrossRef]
Liu, K.; Li, T.; Yang, X.; Ju, H.; Yang, X.; Liu, D. Hierarchical neighborhood entropy based multi-granularity attribute reduction with application to gene prioritization. Int. J. Approx. Reason. 2022, 148, 57–67. [Google Scholar] [CrossRef]
Qian, Y.H.; Liang, X.Y.; Lin, G.P.; Guo, Q.; Liang, J. Local multigranulation decision-theoretic rough sets. Int. J. Approx. Reason. 2017, 82, 119–137. [Google Scholar] [CrossRef]
Qian, Y.H.; Zhang, H.; Sang, Y.L.; Liang, J. Multigranulation decision-theoretic rough sets. Int. J. Approx. Reason. 2014, 55, 225–237. [Google Scholar] [CrossRef]
Xu, W.; Yuan, K.; Li, W. Dynamic updating approximations of local generalized multigranulation neighborhood rough set. Appl. Intell. 2022, 52, 9148–9173. [Google Scholar] [CrossRef]
Sun, L.; Wang, L.; Ding, W.; Qian, Y.; Xu, J. Feature selection using fuzzy neighborhood entropy-based uncertainty measures for fuzzy neighborhood multigranulation rough sets. IEEE Trans. Fuzzy Syst. 2020, 29, 19–33. [Google Scholar] [CrossRef]
She, Y.H.; He, X.L.; Shi, H.X.; Qian, Y. A multiple-valued logic approach for multigranulation rough set model. Int. J. Approx. Reason. 2017, 82, 270–284. [Google Scholar] [CrossRef]
Li, W.; Xu, W.; Zhang, X.Y.; Zhang, J. Updating approximations with dynamic objects based on local multigranulation rough sets in ordered information systems. Artif. Intell. Rev. 2021, 55, 1821–1855. [Google Scholar] [CrossRef]
Zhang, C.; Li, D.; Zhai, Y.; Yang, Y. Multigranulation rough set model in hesitant fuzzy information systems and its application in person-job fit. Int. J. Mach. Learn. Cybern. 2019, 10, 717–729. [Google Scholar] [CrossRef]
Hu, C.; Zhang, L.; Wang, B.; Zhang, Z.; Li, F. Incremental updating knowledge in neighborhood multigranulation rough sets under dynamic granular structures. Knowl.-Based Syst. 2019, 163, 811–829. [Google Scholar] [CrossRef]
Hu, C.; Zhang, L. Dynamic dominance-based multigranulation rough sets approaches with evolving ordered data. Int. J. Mach. Learn. Cybern. 2021, 12, 17–38. [Google Scholar] [CrossRef]

Figure 1. The approximation of rough sets (surrounded by the broken line).

Figure 2. The granules subdivided in

B N 1 (X)

and

B N 2 (X)

of the cost-sensitive approximation model of rough sets. And all the red circles in the figure represent the set X. In addition, (a) shows the case 1 in which the granules are subdivided in

B N 1 (X)

; (b) shows the case 2 in which the granules are subdivided in

B N 1 (X)

; (c) shows the case 1 in which the granules are subdivided in

B N 2 (X)

; (d) shows the case 2 in which the granules are subdivided in

B N 2 (X)

.

Figure 2. The granules subdivided in

B N 1 (X)

and

B N 2 (X)

of the cost-sensitive approximation model of rough sets. And all the red circles in the figure represent the set X. In addition, (a) shows the case 1 in which the granules are subdivided in

B N 1 (X)

; (b) shows the case 2 in which the granules are subdivided in

B N 1 (X)

; (c) shows the case 1 in which the granules are subdivided in

B N 2 (X)

; (d) shows the case 2 in which the granules are subdivided in

B N 2 (X)

.

Figure 3. The optimal granularity selection of the optimistic multigranulation approximation. And the red circles in the figure represent the set X.

Figure 4. The misclassification cost with the changing granularity on each dataset.

Figure 5. The misclassification cost of two boundary regions under different granularities.

Figure 6. The misclassification costs of

\underset{̲}{\sum_{i = 1}^{m} A_{i}^{O}} (X), \bar{\sum_{i = 1}^{m} A_{i}^{O}} (X), \sum_{i = 1}^{m} A_{i}^{O} (X), \underset{̲}{\sum_{i = 1}^{m} A_{i}^{P}} (X)

,

\bar{\sum_{i = 1}^{m} A_{i}^{P}} (X)

,

\sum_{i = 1}^{m} A_{i}^{P} (X)

with the changing granularity of each dataset.

Figure 6. The misclassification costs of

\underset{̲}{\sum_{i = 1}^{m} A_{i}^{O}} (X), \bar{\sum_{i = 1}^{m} A_{i}^{O}} (X), \sum_{i = 1}^{m} A_{i}^{O} (X), \underset{̲}{\sum_{i = 1}^{m} A_{i}^{P}} (X)

,

\bar{\sum_{i = 1}^{m} A_{i}^{P}} (X)

,

\sum_{i = 1}^{m} A_{i}^{P} (X)

with the changing granularity of each dataset.

Figure 7. Diagram of works conducted in this paper.

Table 1. A decision information table.

	$A_{1}$	$A_{2}$	$A_{3}$	X
$x_{1}$	0	0	0	0
$x_{2}$	0	0	1	0
$x_{3}$	0	1	0	0
$x_{4}$	0	1	1	1
$x_{5}$	1	0	0	0
$x_{6}$	1	0	1	1
$x_{7}$	1	1	0	1
$x_{8}$	1	1	1	1

Table 2. Evaluation form of the company’s venture capital.

Firm	$E_{1}$	$E_{2}$	$E_{3}$	$E_{4}$	$E_{5}$	D
$x_{1}$	3	3	3	3	1	High
$x_{2}$	2	1	2	3	2	High
$x_{3}$	2	1	2	1	2	High
·	·	·	·	·	·	·
·	·	·	·	·	·	·
·	·	·	·	·	·	·
$x_{898}$	3	3	2	2	3	Low
$x_{899}$	3	1	3	3	1	Low
$x_{900}$	1	1	3	3	1	Low

Table 3. Description of attribute significance.

Attribute	$E_{1}$	$E_{2}$	$E_{3}$	$E_{4}$	$E_{5}$
$S i g (a, C, D)$	0.74	0.56	0.54	0.36	0.77

Table 4. The description of cost in each multigranulation approximation.

	$A_{1}^{O} (X)$	$\sum_{i = 1}^{2} A_{i}^{O} (X)$	$\sum_{i = 1}^{3} A_{i}^{O} (X)$	$\sum_{i = 1}^{4} A_{i}^{O} (X)$	$\sum_{i = 1}^{5} A_{i}^{O} (X)$
$ξ D C$	8.3	5.9	4.7	3.6	3.5
$T C$	0.36	0.9	1.46	2.2	2.97
$C o s t$	8.66	6.8	6.16	5.8	6.47

Table 5. The description of datasets.

ID	Dataset	Attribute Characteristics	Instances	Condition Attributes
1	Bank	Integer	39	12
2	Breast-Cancer	Integer	699	9
3	Car	Integer	1728	6
4	ENB2012data	Real	768	8
5	Mushroom	Integer	8124	22
6	Tic	Integer	958	9
7	Air Quality	Real	9358	12
8	Concrete	Real	1030	8
9	Hcv	Real	569	10
10	Wisconsin	Real	699	9
11	Zoo	Integer	101	16
12	Balance	Integer	625	4

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Yang, J.; Kuang, J.; Liu, Q.; Liu, Y. Cost-Sensitive Multigranulation Approximation in Decision-Making Applications. Electronics 2022, 11, 3801. https://doi.org/10.3390/electronics11223801

AMA Style

Yang J, Kuang J, Liu Q, Liu Y. Cost-Sensitive Multigranulation Approximation in Decision-Making Applications. Electronics. 2022; 11(22):3801. https://doi.org/10.3390/electronics11223801

Chicago/Turabian Style

Yang, Jie, Juncheng Kuang, Qun Liu, and Yanmin Liu. 2022. "Cost-Sensitive Multigranulation Approximation in Decision-Making Applications" Electronics 11, no. 22: 3801. https://doi.org/10.3390/electronics11223801

APA Style

Yang, J., Kuang, J., Liu, Q., & Liu, Y. (2022). Cost-Sensitive Multigranulation Approximation in Decision-Making Applications. Electronics, 11(22), 3801. https://doi.org/10.3390/electronics11223801

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Cost-Sensitive Multigranulation Approximation in Decision-Making Applications

Abstract

1. Introduction

2. Literature Review

3. Preliminaries

4. Cost-Sensitive Approximation Methods of the Rough Sets

5. Cost-Sensitive Multigranulation Approximations and Optimal Granularity Selection Method

5.1. Cost-Sensitive Multigranulation Approximations of Rough Sets

5.2. The Optimal Multigranulation Approximation Selection Method

5.3. Case Study

6. Simulation Experiment and Result Analysis

6.1. Simulation Experiment

6.2. Results and Discussions

7. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI