# Feature Selection with Conditional Mutual Information Considering Feature Interaction

^{1}

^{2}

^{*}

## Abstract

**:**

## 1. Introduction

## 2. Basic Information-Theoretic Concepts

## 3. Related Work

## 4. Some Definitions about Feature Relationships

**Definition**

**1.**

**Definition**

**2.**

**Corollary**

**1.**

**Definition**

**3.**

**Definition**

**4.**

_{i}is redundant if and only if it’s weakly relevant and has a Markov blanket M

_{i}within F.

**Definition**

**5.**

**Definition**

**6.**

**Definition**

**7.**

**Definition**

**8.**

**Definition**

**9.**

## 5. A New Feature Smethod Considering Feature Interaction

_{redu}, S

_{inde}and S

_{inte}.

_{inde}doesn’t influence the selection of candidate feature ${f}_{i}$ and could be ignored. Therefore, the deviated function DF could be replaced by

_{redu}, more features would intensify redundancy, thus

_{inte}, more features would intensify interaction

_{redu}, S

_{inte}are all implicit, we should generalize the above DF into the whole subset S. It’s easy to prove that $\mathrm{min}\underset{{f}_{j}\in {S}_{redu}}{I}({f}_{i};C|{f}_{j})=\mathrm{min}\underset{{f}_{j}\in S}{I}({f}_{i};C|{f}_{j})$. Therefore, the first part of DF is denoted as $\mathrm{min}[[\mathrm{min}\underset{{f}_{j}\in S}{I}({f}_{i};C|{f}_{j})-I({f}_{i};C)],0]$, comparing with zero in case of ${S}_{redu}=\varnothing $. Similarly, the second part of DF is denoted as $\mathrm{max}[[\mathrm{max}\underset{{f}_{j}\in S}{I}({f}_{i};C|{f}_{j})-I({f}_{i};C)],0]$.

Algorithm 1 CMIFSI algorithm |

Input: A training dataset D with a full feature set F = {f_{1},f_{2},…f_{n}} and class vector C |

A predefined threshold K |

Output: The selected feature sequence |

1. Initialize parameters: the selected feature subset S = Ø, k = 0, deviated function DF(f_{i}, S) = 0 for all candidate features; |

2. for i = 1 to n do |

3. Calculate I(f_{i};C) |

4. end |

5. While k < K do |

6. For each candidate feature f_{i} ∈ F do |

7. Update DF(f_{i}, S) according to Equation (36) |

8. Calculate the evaluation value |

J_{CMI}(f_{i}) = I(f_{i};C) + DF(f_{i}, S) |

9. End |

10. Select the feature f_{j} with the largest J_{CMI}(f_{i}) |

S = S∪f_{j} |

F = F − f_{j} |

11. k = k + 1 |

12. end |

## 6. Experiments and Results

#### 6.1. Experiment Setup

#### 6.2. Experiment on Synthetic Datasets

#### 6.2.1. Synthetic Datasets

_{0},a

_{1},a

_{2}…,a

_{9}with 2 classes. Six of these ten features are irrelevant and the target concept was defined as

_{0},a

_{1},a

_{2}…,a

_{9}with 2 classes. Five of the ten features are irrelevant and the target concept was defined as

_{0},a

_{1},a

_{2}…,a

_{9}with 2 classes. Six of the ten features are irrelevant and the target concept was defined as

_{0},a

_{1},a

_{2}…,a

_{14}with 3 classes. Eleven of the fifteen features are irrelevant and the target concept was defined as

_{0},a

_{1},a

_{2}…,a

_{9}with 2 classes. Six of the ten features are irrelevant and the target concept was defined as

#### 6.2.2. Results on Synthetic Datasets

_{12}is a relevant feature in the optimal subset which is interactive with a

_{2}, a

_{8}, traditional algorithms remove it or rank it at the back of the feature sequence mainly because its correlation with the class is low individually and its interaction with other features is ignored. However, during the selection process of CMIFSI, once a

_{2}and a

_{8}are selected, the evaluation criterion value of a

_{12}would be compensated because of its interaction with a

_{2}, a

_{8}, which would increase its probability of being selected. Therefore, the result of CMIFSI is found to be superior to others.

#### 6.3. Experiment on Benchmark Datasets

#### 6.3.1. Benchmark Datasets

#### 6.3.2. Results on Benchmark Datasets

## 7. Discussion and Conclusions

## Author Contributions

## Funding

## Conflicts of Interest

## References

- Guyon, I.; Elisseeff, A. An Introduction to Variable and Feature Selection. J. Mach. Learning Res.
**2002**, 3, 1157–1182. [Google Scholar] - Maldonado, S.; Weber, R. A wrapper method for feature selection using support vector machines. Inf. Sci.
**2009**, 179, 2208–2217. [Google Scholar] [CrossRef] - Bolón-Canedo, V.; Sánchez-Maroño, N.; Alonso-Betanzos, A. Recent advances and emerging challenges of feature selection in the context of big data. Knowledge-Based Syst.
**2015**, 86, 33–45. [Google Scholar] [CrossRef] - Dash, M.; Liu, H. Feature Selection for Classification. Intel. Data Anal.
**1997**, 1, 131–156. [Google Scholar] [CrossRef] - Jakulin, A.; Bratko, I. Testing the Significance of Attribute Interactions. In Proceedings of the twenty-first international conference on Machine learning, Banff, AB, Canada, 4–8 July 2004. [Google Scholar]
- Zhao, Z.; Liu, H. Searching for interacting features in subset selection. Intel. Data Anal.
**2009**, 13, 207–228. [Google Scholar] [CrossRef] - Zeng, Z.; Zhang, H.; Zhang, R.; Yin, C. A novel feature selection method considering feature interaction. Pattern Recogn.
**2015**, 48, 2656–2666. [Google Scholar] [CrossRef] - Tao, H.; Hou, C.; Nie, F.; Jiao, Y.; Yi, D. Effective Discriminative Feature Selection with Nontrivial Solution. IEEE Trans. Neural Netw. Learning Syst.
**2016**, 27, 3013–3017. [Google Scholar] [CrossRef] [PubMed] - Murthy, C.A.; Chanda, B. Generation of Compound Features based on feature Interaction for Classification. Expert Syst. Appl.
**2018**, 108, 61–73. [Google Scholar] - Chang, X.; Ma, Z.; Lin, M.; Yang, Y.; Hauptmann, A.G. Feature Interaction Augmented Sparse Learning for Fast Kinect Motion Detection. IEEE Trans. Image Process.
**2017**, 26, 3911–3920. [Google Scholar] [CrossRef] [PubMed] - Yin, Y.; Zhao, Y.; Zhang, B.; Li, C.; Guo, S. Enhancing ELM by Markov Boundary Based Feature Selection. Neurocomputing
**2017**, 261, 57–69. [Google Scholar] [CrossRef] - Brown, G.; Pocock, A.; Zhao, M.J.; Luján, M. Conditional Likelihood Maximisation: A Unifying Framework for Information Theoretic Feature Selection. J. Mach. Learning Res.
**2012**, 13, 27–66. [Google Scholar] - Lewis, D.D. Feature Selection and Feature Extraction for Text Categorization. In Proceedings of the workshop on Speech and Natural Language, New York, NY, USA, 23–26 February 1992; pp. 212–217. [Google Scholar]
- Press, W. Numerical Recipes in FORTRAN: The Art of Scientific Computing. Available online: https://doi.org/10.5860/choice.30-5638a (accessed on 29 May 2019).
- Battiti, R. Using mutual information for selecting features in supervised neural net learning. IEEE Trans. Neural Netw.
**1994**, 5, 537–550. [Google Scholar] [CrossRef] [PubMed] [Green Version] - Peng, H.; Long, F.; Ding, C. Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intel.
**2005**, 27, 1226–1238. [Google Scholar] [CrossRef] [PubMed] - Yang, H.H.; Moody, J. Data visualization and feature selection: New algorithms for non-gaussian data. Advances in Neural Information Processing Systems. 2000, pp. 687–693. Available online: https://www.researchgate.net/publication/2460722 (accessed on 29 May 2019).
- Fleuret, F. Fast Binary Feature Selection with Conditional Mutual Information. J. Mach. Learn. Res.
**2004**, 5, 1531–1555. [Google Scholar] - Yu, L.; Liu, H. Efficient Feature Selection via Analysis of Relevance and Redundancy. J. Mach. Learn. Res.
**2004**, 5, 1205–1224. [Google Scholar] - Tesmer, M.; Estevez, P.A. AMIFS: Adaptive feature selection by using mutual information. In Proceedings of the 2004 IEEE International Joint Conference on Neural Networks, Budapest, Hungary, 25–29 July 2004; pp. 303–308. [Google Scholar]
- Cheng, G.; Qin, Z.; Feng, C.; Wang, Y.; Li, F. Conditional Mutual Information-Based Feature Selection Analyzing for Synergy and Redundancy. Etri J.
**2011**, 33, 210–218. [Google Scholar] [CrossRef] - John, G.H.; Kohavi, R.; Pfleger, K. Irrelevant Features and the Subset Selection Problem. In Proceedings of the Machine Learning Proceedings, New Brunswick, NJ, USA, 10–13 July 1994; pp. 121–129. [Google Scholar]
- Wang, G.; Lochovsky, F.H. Feature selection with conditional mutual information maximin in text categorization. In Proceedings of the 13th ACM international conference on Information and knowledge management, Washington, DC, USA, 8–13 November 2004; pp. 342–349. [Google Scholar]
- Kira, K.; Rendell, L.A. A Practical Approach to Feature Selection. In Proceedings of the Ninth International Workshop on Machine Learning, Aberdeen, Scotland, 1–3 July 1992; pp. 249–256. [Google Scholar]
- Hall, M.A. Correlation-based Feature Selection for Discrete and Numeric Class Machine Learning. In Proceedings of the Seventeenth International Conference on Machine Learning, Stanford, CA, USA, 29 June–2 July 2000; pp. 359–366. [Google Scholar]
- UCI Repository of Machine Learning Databases. Available online: http://www.ics.uci.edu/~mlearn/MLRepository.html (accessed on 29 May 2019).
- Liu, H.; Sun, J.; Liu, L.; Zhang, H. Feature selection with dynamic mutual information. Pattern Recognit.
**2009**, 42, 1330–1339. [Google Scholar] [CrossRef]

Dataset | Relevant Features | Interactive Features |
---|---|---|

Data1 | a_{1},a_{5},a_{6} | (a_{1},a_{6}) |

Data2 | a_{0},a_{1},a_{5},a_{6},a_{8} | (a_{0},a_{1},a_{5}),(a_{0},a_{6},a_{8}) |

Data3 | a_{1},a_{5},a_{6},a_{8} | (a_{1},a_{6},a_{8}) |

Data4 | a_{2},a_{8},a_{12},a_{13} | (a_{2},a_{8},a_{12}) |

Data5 | a_{0},a_{1},a_{5},a_{6} | (a_{1},a_{5}),(a_{0},a_{6}) |

Algorithm | Data1 | Data2 |

GA | a,_{1}a,a_{2}_{5},a_{0},a,a_{6}_{9},a_{4},a_{7},a_{3},a_{8} | a,_{8}a,_{0}a,_{6}a,a_{5}_{7},a_{3},a_{2},a,a_{1}_{9},a_{4} |

SU | a,_{1}a,a_{5}_{2},a_{0},a_{7},a_{9},a_{8},a_{4},a_{3},a_{6} | a,_{8}a,_{0}a,a_{6}_{7},a,_{1}a,a_{5}_{4},a_{3},a_{2},a_{9} |

Relief | a,_{1}a,_{5}a,a_{6}_{2},a_{7},a_{7},a_{4},a_{9},a_{0},a_{3}* | a,_{8}a,_{0}a,_{6}a,_{1}a,a_{5}_{7},a_{2},a_{3},a_{4},a_{9}* |

CFS | a,a_{1}_{2},a_{5} | a,_{0}a,a_{6}_{7},a_{8} |

MRMR | a,a_{1}_{0},a_{7},a_{8},a,_{6}a,a_{5}_{2},a_{9},a_{4},a_{3} | a,_{8}a,_{6}a,a_{5}_{3},a_{2},a,a_{0}_{7},a,a_{1}_{9},a_{4} |

CMIM | a,_{1}a,_{6}a,a_{5}_{2},a_{7},a_{9},a_{0},a_{8},a_{3},a_{4}* | a,_{8}a,_{0}a,_{6}a,a_{5}_{7},a,a_{1}_{4},a_{3},a_{9},a_{2} |

CMIFSI | a,_{1}a,_{6}a,a_{5}_{4},a_{2},a_{9},a_{7},a_{8},a_{3},a_{0}* | a,_{8}a,_{0}a,_{1}a,_{6}a,a_{5}_{7},a_{4},a_{2},a_{9},a_{3}* |

Optimal subset | a_{1},a_{5},a_{6} | a_{0},a_{1},a_{5},a_{6},a_{8} |

Algorithm | Data3 | Data4 |

GA | a,_{5}a,a_{6}_{9},a,a_{1}_{3},a_{2},a_{0},a_{7},a,a_{8}_{4} | a,_{2}a,a_{13}_{11},a,a_{8}_{1},a_{14},a_{5},a,a_{12}_{0},a_{6},a_{10},a_{7},a_{3},a_{4},a_{9} |

SU | a,_{5}a,_{1}a,a_{6}_{7},a_{0},a_{3},a_{4},a_{9},a,a_{8}_{2} | a,_{2}a,_{8}a,a_{13}_{10},a_{6},a_{11},a_{0},a_{1},a_{14},a_{4},a_{7},a_{5},a_{3},a_{9},a_{12} |

Relief | a,_{5}a,_{1}a,a_{6}_{2},a_{0},a_{9},a_{4},a,a_{8}_{7},a_{3} | a,_{2}a,_{8}a,a_{13}_{11},a_{14},a_{10},a_{7},a_{4},a,a_{12}_{1},a_{6},a_{3},a_{5},a_{0},a_{9} |

CFS | a_{0},a,_{1}a,a_{5}_{7} | a_{0},a,_{2}a,a_{8}_{10},a_{11},a,a_{13}_{14} |

MRMR | a,_{5}a,a_{6}_{3},a_{2},a_{9},a,a_{1}_{0},a_{4},a_{7},a_{8} | a,_{2}a,a_{13}_{11},a_{1},a,a_{8}_{7},a_{5},a,a_{12}_{10},a_{6},a_{0},a_{14},a_{4},a_{3},a_{9} |

CMIM | a,_{5}a,_{6}a,a_{1}_{0},a_{7},a_{2},a_{4},a,a_{8}_{9},a_{3} | a,_{2}a,_{8}a,a_{13}_{4},a_{10},a_{1},a_{0},a_{14},a_{11},a_{6},a,a_{12}_{3},a_{5},a_{7},a_{9} |

CMIFSI | a,_{5}a,_{6}a,_{1}a,a_{8}_{2},a_{4},a_{0},a_{9},a_{7},a_{3}* | a,_{2}a,_{8}a,a_{13}_{4},a,a_{12}_{11},a_{5},a_{1},a_{14},a_{10},a_{3},a_{0},a_{6},a_{9},a_{7} |

Optimal subset | a_{1},a_{5},a_{6},a_{8} | a_{2},a_{8},a_{12},a_{13} |

Algorithm | Data5 | |

GA | a,_{1}a,a_{6}_{9},a,a_{5}_{3},a_{2},a_{8},a_{4},a,a_{0}_{7} | |

SU | a,_{1}a,_{5}a,a_{6}_{2},a_{9},a,a_{0}_{7},a_{3},a_{4},a_{8} | |

Relief | a,_{5}a,_{1}a,a_{6}_{2},a_{9},a_{4},a_{8},a,a_{0}_{7},a_{3} | |

CFS | a,a_{6}_{2},a,_{1}a_{5} | |

MRMR | a,_{1}a,a_{6}_{2},a_{9},a,a_{5}_{3},a,a_{0}_{8},a_{7},a_{4} | |

CMIM | a,_{5}a,_{6}a,a_{1}_{9},a,a_{0}_{2},a_{4},a_{7},a_{8},a_{3} | |

CMIFSI | a,_{5}a,_{6}a,a_{1}_{2},a,a_{0}_{9},a_{4},a_{8},a_{7},a_{3} | |

Optimal subset | a_{0},a_{1},a_{5},a_{6} |

No. | Datasets | Features | Instances | Classes |
---|---|---|---|---|

1 | Wine | 13 | 178 | 3 |

2 | Kr-vs-kp | 36 | 3196 | 2 |

3 | SPECTF-heart | 44 | 267 | 2 |

4 | Zoo | 16 | 101 | 7 |

5 | Credit Approval | 15 | 690 | 2 |

6 | Optical Recognition of Handwritten Digits | 64 | 1797 | 10 |

7 | Contraceptive Method Choice | 9 | 1473 | 3 |

8 | Congressional Voting Records | 16 | 435 | 2 |

9 | Waveform | 21 | 5000 | 3 |

10 | Waveform+noise | 40 | 5000 | 3 |

No. | C4.5 | ||||||||
---|---|---|---|---|---|---|---|---|---|

Total | GA | SU | Relief | CFS | MRMR | CMIM | CMIFSI | AVE. | |

1 | 13 | 6 | 3 | 5 | 11 | 3 | 3 | 3 | 4.86 |

2 | 36 | 15 | 13 | 15 | 7 | 14 | 19 | 19 | 14.57 |

3 | 44 | 3 | 2 | 1 | 21 | 1 | 2 | 3 | 4.71 |

4 | 16 | 10 | 14 | 9 | 9 | 10 | 4 | 4 | 8.57 |

5 | 15 | 8 | 6 | 9 | 6 | 4 | 3 | 7 | 6.14 |

6 | 64 | 20 | 15 | 20 | 38 | 17 | 20 | 20 | 21.43 |

7 | 9 | 8 | 7 | 7 | 8 | 8 | 7 | 4 | 7.0 |

8 | 16 | 12 | 13 | 9 | 5 | 12 | 10 | 6 | 9.57 |

9 | 21 | 17 | 17 | 16 | 16 | 17 | 16 | 9 | 15.43 |

10 | 40 | 16 | 13 | 11 | 15 | 16 | 11 | 13 | 13.57 |

No. | SVM | ||||||||
---|---|---|---|---|---|---|---|---|---|

Total | GA | SU | Relief | CFS | MRMR | CMIM | CMIFSI | AVE. | |

1 | 13 | 2 | 5 | 8 | 11 | 9 | 6 | 8 | 7.0 |

2 | 36 | 13 | 13 | 20 | 7 | 12 | 15 | 15 | 13.57 |

3 | 44 | 15 | 18 | 5 | 21 | 14 | 8 | 19 | 14.29 |

4 | 16 | 9 | 9 | 8 | 9 | 7 | 9 | 5 | 8.0 |

5 | 15 | 6 | 5 | 1 | 6 | 6 | 4 | 3 | 4.43 |

6 | 64 | 20 | 19 | 13 | 38 | 17 | 11 | 11 | 18.43 |

7 | 9 | 8 | 5 | 5 | 8 | 9 | 3 | 7 | 6.43 |

8 | 16 | 6 | 2 | 1 | 5 | 3 | 5 | 4 | 3.71 |

9 | 21 | 19 | 19 | 18 | 16 | 18 | 20 | 19 | 18.43 |

10 | 40 | 20 | 20 | 19 | 15 | 18 | 17 | 16 | 18.29 |

No. | C4.5 | |||||||
---|---|---|---|---|---|---|---|---|

Unselected | GA | SU | Relief | CFS | MRMR | CMIM | CMIFSI | |

1 | 93.80 | 92.67 | 96.07 | 94.38 | 93.80 | 97.19* | 97.19* | 97.19* |

2 | 99.31* | 95.48 | 96.62 | 97.70 | 94.09 | 96.65 | 96.53 | 97.74 |

3 | 74.90 | 79.40 | 79.78 | 79.40 | 77.90 | 79.40 | 79.78 | 80.15* |

4 | 92.08 | 95.05* | 95.05* | 95.05* | 93.07 | 95.05* | 94.06 | 94.06 |

5 | 84.93 | 86.24 | 86.09 | 86.81 | 86.81 | 86.96* | 85.65 | 85.94 |

6 | 87.42 | 86.16 | 87.48* | 86.94 | 87.42 | 87.20 | 86.92 | 87.26 |

7 | 53.22 | 55.53 | 55.53 | 55.53 | 54.58 | 45.96 | 55.53 | 56.21* |

8 | 96.32 | 95.86 | 96.32 | 96.32 | 94.94 | 96.78* | 96.32 | 96.32 |

9 | 75.94 | 76.82 | 77.04 | 76.92 | 76.76 | 77.04 | 77.16 | 77.22* |

10 | 75.08 | 77.40 | 77.82 | 77.92* | 77.30 | 76.58 | 77.82 | 77.82 |

Ave. | 83.30 | 84.06 | 84.78 | 84.71 | 83.67 | 83.88 | 84.70 | 84.99 |

W/T/L | 1/1/8 | 2/0/8 | 3/2/5 | 3/2/5 | 2/0/8 | 3/1/6 | 0/4/6 |

No. | SVM | |||||||
---|---|---|---|---|---|---|---|---|

Unselected | GA | SU | Relief | CFS | MRMR | CMIM | CMIFSI | |

1 | 94.44 | 97.85 | 97.22 | 98.89 | 95.56 | 98.89 | 98.33 | 99.40* |

2 | 94.91 | 95.03 | 95.68 | 97.80* | 94.87 | 96.21 | 95.03 | 96.94 |

3 | 79.78 | 82.57 | 83.70 | 83.33 | 81.48 | 82.22 | 82.59 | 84.81* |

4 | 93.07 | 97.27 | 94.54 | 96.36 | 88.18 | 97.27 | 98.18* | 97.27 |

5 | 85.51 | 87.79 | 88.10 | 87.82 | 88.50 | 87.39 | 87.10 | 88.55* |

6 | 95.11* | 88.28 | 90.78 | 88.39 | 88.28 | 89.44 | 88.00 | 88.00 |

7 | 51.28 | 53.85 | 54.32 | 53.85 | 54.79 | 50.81 | 54.46 | 55.34* |

8 | 96.09 | 94.31 | 95.90 | 95.45 | 93.86 | 96.59 | 96.59 | 97.27* |

9 | 84.08 | 85.59 | 86.22 | 85.70 | 85.20 | 86.12 | 86.56* | 86.24 |

10 | 86.02 | 86.12 | 86.18 | 86.12 | 85.50 | 86.42* | 85.82 | 86.18 |

Ave. | 86.03 | 86.87 | 87.26 | 87.37 | 85.62 | 87.14 | 87.27 | 88.00 |

W/T/L | 1/0/9 | 1/1/8 | 1/1/8 | 1/0/9 | 1/0/9 | 2/1/7 | 2/2/6 |

No. | C4.5 | |||||||
---|---|---|---|---|---|---|---|---|

Unselected | GA | SU | Relief | CFS | MRMR | CMIM | CMIFSI | |

1 | [93.41,94.35] | [92.04,92.96] | [95.73,96.70] | [93.50,94.46] | [93.20,94.16] | [96.77,97.70] | [96.53,97.70] | [96.84,97.86] |

2 | [98.62,99.58] | [94.92,95.81] | [95.95,97.00] | [97.55,98.50] | [93.49,94.36] | [95.59,96.62] | [95.81,96.76] | [97.68,98.66] |

3 | [74.45,75.44] | [78.76,79.71] | [79.84,80.90] | [79.10,80.08] | [77.59,78.66] | [78.78,79.80] | [77.96,79.80] | [79.68,80.65] |

4 | [91.92,92.86] | [94.48,95.46] | [94.55,95.55] | [94.68,95.59] | [92.53,93.52] | [94.19,95.23] | [93.58,94.61] | [93.56,94.51] |

5 | [84.03,85.05] | [85.80,86.73] | [85,58,86.40] | [86.50,87.55] | [86.04,87.17] | [86.84,87.72] | [85.13,86.10] | [85.14,86.00] |

6 | [86.87,87.73] | [85.57,86.60] | [87.20,88.09] | [86.40,87.49] | [87.05,87.92] | [86.71,87.74] | [86.53,87.44] | [86.36,87.40] |

7 | [52.47,53.93] | [54.29,55.75] | [55.06,56.35] | [55.13,56.60] | [53.82,55.66] | [45.45,46.71] | [53.85,55.38] | [55.34,56.33] |

8 | [96.32,96.74] | [95.56,96.37] | [95.81,96.72] | [95.56,96.38] | [94.08,94.87] | [96.16,96.95] | [96.06,96.87] | [96.16,97.05] |

9 | [75.19,76.16] | [76.63,77.52] | [76.67,77.56] | [76.58,77.51] | [76.33,77.30] | [76.51,77.47] | [76.45,77.31] | [77.09,78.09] |

10 | [74.58,75.58] | [77.01,77.97] | [76.92,77.70] | [77.86,78.77] | [76.97,77.86] | [75.96,76,91] | [77.10,78.06] | [77.27,78.15] |

No. | SVM | |||||||
---|---|---|---|---|---|---|---|---|

Unselected | GA | SU | Relief | CFS | MRMR | CMIM | CMIFSI | |

1 | [93.95,94.87] | [97.71,98.52] | [96.80,97.64] | [98.59,99.35] | [95.14,95.95] | [98.48,99.27] | [97.61,98.36] | [98.47,99.52] |

2 | [94.40,95.27] | [94.81,95.60] | [95.14,95.92] | [97.50,98.32] | [94.22,94.96] | [95.35,96.23] | [94.46,95.31] | [96.83,97.60] |

3 | [78.68,79.49] | [82.39,83.31] | [83.05,83.84] | [82.94,83.73] | [80.81,81.55] | [81.98,82.88] | [82.50,83.39] | [84.32,85.13] |

4 | [92.38,93.33] | [96.71,97.42] | [94.31,95.20] | [95.56,96.35] | [87.56,88.47] | [96.77,97.61] | [96.78,98.55] | [96.88,97.77] |

5 | [85.26,86.08] | [87.52,88.27] | [87,58,88.46] | [87.01,87.85] | [88.22,89.14] | [86.77,87.52] | [86.49,87.38] | [88.56,89.40] |

6 | [94.68,95.60] | [87.69,88.44] | [90.39,91.22] | [87.87,88.77] | [87.88,88.65] | [89.07,89.98] | [87.66,88.46] | [87.20,88.03] |

7 | [51.24,52.02] | [53.41,54.22] | [53.74,54.59] | [53.63,54.46] | [54.34,55.17] | [50.53,51.38] | [54.02,54.79] | [55.18,55.98] |

8 | [95.82,96.60] | [93.64,94.47] | [95.34,96.21] | [95.30,96.18] | [93.60,94.38] | [96.13,96.97] | [96.37,97.23] | [96.53,97.28] |

9 | [83.79,84.59] | [85.13,86.00] | [85.43,86.40] | [85.06,85.80] | [84.89,85.80] | [85.77,86.63] | [86.18,86.98] | [86.10,86.90] |

10 | [85.69,86.58] | [86.33,87.13] | [86.68,87.60] | [87.45,88.21] | [84.86,85.69] | [86.94,87.77] | [86.80,87.62] | [87.71,88.56] |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Liang, J.; Hou, L.; Luan, Z.; Huang, W.
Feature Selection with Conditional Mutual Information Considering Feature Interaction. *Symmetry* **2019**, *11*, 858.
https://doi.org/10.3390/sym11070858

**AMA Style**

Liang J, Hou L, Luan Z, Huang W.
Feature Selection with Conditional Mutual Information Considering Feature Interaction. *Symmetry*. 2019; 11(7):858.
https://doi.org/10.3390/sym11070858

**Chicago/Turabian Style**

Liang, Jun, Liang Hou, Zhenhua Luan, and Weiping Huang.
2019. "Feature Selection with Conditional Mutual Information Considering Feature Interaction" *Symmetry* 11, no. 7: 858.
https://doi.org/10.3390/sym11070858