# Label-Driven Learning Framework: Towards More Accurate Bayesian Network Classifiers through Discrimination of High-Confidence Labels


## Abstract


## 1. Introduction

**x** $=({x}_{1},\dots ,{x}_{n})$ (${x}_{i}\in {\mathsf{\Omega}}_{{X}_{i}}$), where ${x}_{i}$ is the value of an attribute ${X}_{i}$ and ${c}_{i}$ is a value of the class variable C. Among numerous classification techniques, Bayesian network classifiers (BNCs) are well-known for their model interpretability, comparable classification performance and the ability to directly handle multi-class classification problems [1]. A BNC $\mathcal{B}$ provides a confidence measure in the form of a posterior probability for each class label and performs a classification by assigning the label with the maximum posterior probability to **x**, that is:
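A minimal sketch of this decision rule, assuming the posterior probabilities are already available as a mapping from label to probability. The function name `map_label` and the dictionary representation are illustrative assumptions, not the paper's notation; the numbers are the posteriors reported for testing instance 2 in Table 2.

```python
def map_label(posteriors):
    """Return the label whose posterior probability is largest."""
    if not posteriors:
        raise ValueError("posteriors must not be empty")
    return max(posteriors, key=posteriors.get)


# Posteriors reported for testing instance 2 in Table 2 (estimated by TAN).
instance_2 = {
    "c1": 3.9812e-4,
    "c2": 2.5073e-5,
    "c3": 5.3492e-7,
    "c4": 0.5051,
    "c5": 0.4945,
}
predicted = map_label(instance_2)
```

For this instance the rule selects ${c}_{4}$, although ${c}_{5}$ trails it by only about 0.01.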

## 2. Preliminaries

#### 2.1. Information Theory

**Definition 1** ([18]).

**Definition 2** ([18]).

**Definition 3** ([18]).

**Definition 4** ([18]).

#### 2.2. Bayesian Network Classifiers

#### 2.2.1. Naive Bayes

#### 2.2.2. TAN and WATAN

Algorithm 1: The TAN learning algorithm.

Input: $CPT$s.

Output: The built TAN model ${\mathcal{B}}_{TAN}$.

1. Let $\mathcal{N}$ be an $n\times n$ matrix of $CMI$, where ${\mathcal{N}}_{ij}=I({X}_{i};{X}_{j}|C)$ if $i\ne j$;
2. $\mathcal{Y}\leftarrow treeConstruction\left(\mathcal{N}\right)$; // Algorithm 2
3. ${\mathcal{B}}_{TAN}\leftarrow $ add the class node C to $\mathcal{Y}$ and add an arc from C to each attribute node;
4. return ${\mathcal{B}}_{TAN}$;
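Step 1 of Algorithm 1 needs the conditional mutual information $I({X}_{i};{X}_{j}|C)$ for every attribute pair. The sketch below estimates it from raw discrete data using empirical frequencies. This is an assumption made for illustration: Algorithm 1 takes $CPT$s as input, and the sketch applies no smoothing. The function names and the base-2 logarithm are likewise illustrative choices.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(xi, xj, c):
    """Empirical I(Xi; Xj | C) in bits from three equal-length sequences."""
    n = len(c)
    n_ijc = Counter(zip(xi, xj, c))  # joint counts of (xi, xj, c)
    n_ic = Counter(zip(xi, c))       # joint counts of (xi, c)
    n_jc = Counter(zip(xj, c))       # joint counts of (xj, c)
    n_c = Counter(c)                 # class counts
    total = 0.0
    for (a, b, k), count in n_ijc.items():
        # P(a,b,k) * log[ P(a,b,k) P(k) / (P(a,k) P(b,k)) ];
        # the 1/n factors cancel inside the logarithm.
        ratio = count * n_c[k] / (n_ic[(a, k)] * n_jc[(b, k)])
        total += (count / n) * log2(ratio)
    return total


def cmi_matrix(columns, c):
    """Symmetric matrix N with N[i][j] = I(Xi; Xj | C); the diagonal is left at 0."""
    n = len(columns)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            value = conditional_mutual_information(columns[i], columns[j], c)
            matrix[i][j] = matrix[j][i] = value
    return matrix


# Toy data: x2 duplicates x1, while x3 is independent of x1 within each class.
c = [0, 0, 0, 0, 1, 1, 1, 1]
x1 = [0, 0, 1, 1, 0, 0, 1, 1]
x2 = list(x1)
x3 = [0, 1, 0, 1, 0, 1, 0, 1]
N = cmi_matrix([x1, x2, x3], c)
```

In the toy data the duplicated pair receives the largest weight and the conditionally independent pair receives zero, which is the behaviour the tree construction step relies on.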

Algorithm 2: treeConstruction($\mathcal{W}$).

Input: The $n\times n$ edge weight matrix $\mathcal{W}$, whose element is denoted as ${\mathcal{W}}_{ij}$.

Output: The built directed MST $\mathcal{D}$.

1. Let $\mathcal{Y}$ be a complete undirected graph whose vertices are the attributes and in which the edge connecting ${X}_{i}$ to ${X}_{j}$ has weight ${\mathcal{W}}_{ij}$;
2. $\mathcal{U}\leftarrow Prim\left(\mathcal{Y}\right)$; // perform Prim's algorithm to find an MST in $\mathcal{Y}$
3. Transform the resulting undirected tree $\mathcal{U}$ into a directed tree $\mathcal{D}$ by choosing a root attribute and setting the direction of all edges to be outward from it;
4. return $\mathcal{D}$;
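The tree construction can be sketched in a few lines. The sketch assumes that "MST" means a maximum weighted spanning tree, as in the original TAN formulation [2], and that the root is simply the attribute with index 0; both are assumptions, since the text above does not fix them. Because Prim's algorithm grows the tree from the root, recording the vertex through which each attribute was reached directs every edge outward from the root. The helper `tan_parent_sets` and the node name `"C"` are hypothetical.

```python
def tree_construction(weights, root=0):
    """Return parent[i] for each attribute in a maximum spanning tree.

    parent[root] is None; every other attribute points to the vertex
    through which Prim's algorithm attached it to the tree.
    """
    n = len(weights)
    parent = [None] * n
    in_tree = [False] * n
    best = [float("-inf")] * n  # heaviest known edge linking v to the tree
    if n:
        best[root] = 0.0        # forces the root to be picked first
    for _ in range(n):
        # Pick the vertex outside the tree with the heaviest connecting edge.
        u = max((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        for v in range(n):
            if not in_tree[v] and weights[u][v] > best[v]:
                best[v] = weights[u][v]
                parent[v] = u
    return parent


def tan_parent_sets(parent, class_node="C"):
    """Add the class node as a parent of every attribute (Algorithm 1, step 3)."""
    return {
        i: [class_node] + ([] if p is None else [p])
        for i, p in enumerate(parent)
    }


W = [
    [0.0, 0.9, 0.1, 0.2],
    [0.9, 0.0, 0.8, 0.3],
    [0.1, 0.8, 0.0, 0.7],
    [0.2, 0.3, 0.7, 0.0],
]
parents = tree_construction(W)
structure = tan_parent_sets(parents)
```

With these toy weights the tree is the chain ${X}_{0}\to {X}_{1}\to {X}_{2}\to {X}_{3}$, so each non-root attribute ends up with exactly one attribute parent in addition to the class node.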

#### 2.2.3. KDB and LKDB

**Definition 5.**

**Definition 6.**

#### 2.3. Bayesian Multinet Classifiers

Algorithm 3: The BMC${}^{CL}$ learning algorithm.

## 3. Label-Driven Learning Framework

#### 3.1. Motivation

- When posterior probabilities of some class labels are close to the maximum posterior probability, there is a high risk of misclassification.
- When the maximum posterior probability is far greater than the posterior probabilities of other labels, the classification result is more credible—in other words, more likely to be correct.

**x**, the set of labels ${\mathsf{\Omega}}_{C}^{{}^{\prime}}$ whose posterior probabilities are close to or equal to the maximum posterior probability can be obtained by:
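The selection of near-maximal labels can be illustrated with a short sketch. It assumes that "close to" means at least a fraction `delta` of the maximum posterior. That criterion, the parameter name `delta` and its default value are hypothetical choices for illustration and should not be read as the paper's own filtering equation. The posteriors are those reported for the two testing instances in Table 2.

```python
def filter_labels(posteriors, delta=0.5):
    """Keep labels whose posterior is at least delta times the maximum posterior.

    NOTE: the ratio criterion and the default delta are assumptions made
    for this sketch only.
    """
    if not 0.0 < delta <= 1.0:
        raise ValueError("delta must lie in (0, 1]")
    p_max = max(posteriors.values())
    return {label for label, p in posteriors.items() if p >= delta * p_max}


# Posteriors reported in Table 2 (estimated by TAN).
instance_1 = {"c1": 2.8394e-4, "c2": 1.7242e-5, "c3": 5.8891e-5,
              "c4": 1.2269e-3, "c5": 0.9984}
instance_2 = {"c1": 3.9812e-4, "c2": 2.5073e-5, "c3": 5.3492e-7,
              "c4": 0.5051, "c5": 0.4945}

kept_1 = filter_labels(instance_1)
kept_2 = filter_labels(instance_2)
```

Under this assumed criterion, instance 1 keeps only ${c}_{5}$ (a high-confidence prediction), whereas instance 2 keeps both ${c}_{4}$ and ${c}_{5}$, which is the situation the first bullet above describes as risky.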

#### 3.2. Label Filtering Stage

**Definition 7.**

#### 3.3. Label Specialization Stage

**Definition 8.**

#### 3.4. Overall Structure and Complexity Analysis

Algorithm 4: The label-driven learning framework applied to TAN.

## 4. Empirical Study

#### 4.1. Selection of the Threshold for Label Filtering

#### 4.2. Comparisons in Terms of Zero-One Loss

#### 4.3. Analysis of the Label-Driven Learning Framework

**x** is a testing instance, $\mathcal{Q}$ is the testing set and ${\mathsf{\Omega}}_{C}^{{}^{\prime}}$ can be obtained using Equation (14). Thus, the label-driven learning percentage ($LLP$) can be defined as follows:

**Definition 9.**

**Definition 10.**

**x**, ${\widehat{c}}_{G}$ and ${\widehat{c}}_{L}$ are the labels predicted by TAN and LTAN, respectively.

**Definition 11.**

#### 4.3.1. Effects of the Label Filtering Stage

#### 4.3.2. Effects of the Label Specialization Stage

**Definition 12.**

#### 4.4. Time Comparisons of TAN, LTAN and AKDB

## 5. Discussion

## 6. Conclusions

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## References

1. Bielza, C.; Larrañaga, P. Discrete bayesian network classifiers: A survey. ACM Comput. Surv. **2014**, 47.
2. Friedman, N.; Geiger, D.; Goldszmidt, M. Bayesian network classifiers. Mach. Learn. **1997**, 29, 131–163.
3. Sahami, M. Learning Limited Dependence Bayesian Classifiers. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, Portland, Oregon, USA, 2–4 August 1996; pp. 335–338.
4. Song, H.; Xu, Q.; Yang, H.; Fang, J. Interpreting out-of-control signals using instance-based Bayesian classifier in multivariate statistical process control. Commun. Stat. Simul. Comput. **2017**, 46, 53–77.
5. Wang, L.; Zhao, H.; Sun, M.; Ning, Y. General and local: Averaged k-dependence bayesian classifiers. Entropy **2015**, 17, 4134–4154.
6. Zheng, F.; Webb, G.I.; Suraweera, P.; Zhu, L. Subsumption resolution: An efficient and effective technique for semi-naive Bayesian learning. Mach. Learn. **2012**, 87, 93–125.
7. Webb, G.I.; Boughton, J.R.; Wang, Z. Not so naive Bayes: Aggregating one-dependence estimators. Mach. Learn. **2005**, 58, 5–24.
8. Jiang, L.; Cai, Z.; Wang, D.; Zhang, H. Improving tree augmented naive bayes for class probability estimation. Knowl. Based Syst. **2012**, 26, 239–245.
9. Libal, U.; Hasiewicz, Z. Risk upper bound for a NM-type multiresolution classification scheme of random signals by Daubechies wavelets. Eng. Appl. Artif. Intell. **2017**, 62, 109–123.
10. Das, N.; Sarkar, R.; Basu, S.; Saha, P.K.; Kundu, M.; Nasipuri, M. Handwritten bangla character recognition using a soft computing paradigm embedded in two pass approach. Pattern Recogn. **2015**, 48, 2054–2071.
11. Liu, K.-H.; Yan, S.; Kuo, C.-C.J. Age estimation via grouping and decision fusion. IEEE Trans. Inf. Forensics Secur. **2015**, 10, 2408–2423.
12. Grossi, G.; Lanzarotti, R.; Lin, J. Robust face recognition providing the identity and its reliability degree combining sparse representation and multiple features. Int. J. Pattern Recogn. **2016**, 30, 1656007.
13. Godbole, S.; Sarawagi, S.; Chakrabarti, S. Scaling multi-class support vector machines using inter-class confusion. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Edmonton, AB, Canada, 23–25 July 2002; pp. 513–518.
14. Bache, K.; Lichman, M. UCI Machine Learning Repository. Available online: https://archive.ics.uci.edu/ml/datasets.html (accessed on 1 December 2017).
15. Shannon, C. A mathematical theory of communications, I and II. Bell Syst. Tech. J. **1948**, 27, 379–423.
16. Chen, S.; Martínez, A.M.; Webb, G.I.; Wang, L. Selective AnDE for large data learning: A low-bias memory constrained approach. Knowl. Inf. Syst. **2017**, 50, 475–503.
17. Peng, H.; Fan, Y. Feature selection by optimizing a lower bound of conditional mutual information. Inf. Sci. **2017**, 418, 652–667.
18. Cover, T.M.; Thomas, J.A. Elements of Information Theory, 2nd ed.; Wiley: Hoboken, NJ, USA, 2012; pp. 1–22.
19. Liu, H.; Zhou, S.; Lam, W.; Guan, J. A new hybrid method for learning Bayesian networks: Separation and reunion. Knowl. Based Syst. **2017**, 121, 185–197.
20. Bartlett, M.; Cussens, J. Integer linear programming for the Bayesian network structure learning problem. Artif. Intell. **2017**, 244, 258–271.
21. Prim, R.C. Shortest connection networks and some generalizations. Bell Syst. Tech. J. **1957**, 36, 1389–1401.
22. Martínez, A.M.; Webb, G.I.; Chen, S.; Zaidi, N.A. Scalable learning of bayesian network classifiers. J. Mach. Learn. Res. **2016**, 17, 1515–1549.
23. Pensar, J.; Nyman, H.; Lintusaari, J.; Corander, J. The role of local partial independence in learning of Bayesian networks. Int. J. Approx. Reason. **2016**, 69, 91–105.
24. Geiger, D.; Heckerman, D. Knowledge representation and inference in similarity networks and Bayesian multinets. Artif. Intell. **1996**, 82, 45–74.
25. Huang, K.; King, I.; Lyu, M.R. Discriminative training of Bayesian chow-liu multinet classifiers. In Proceedings of the International Joint Conference on Artificial intelligence, Acapulco, Mexico, 19–25 August 2003; pp. 484–488.
26. Chow, C.; Liu, C. Approximating discrete probability distributions with dependence trees. IEEE Trans. Inf. Theory **1968**, 14, 462–467.
27. Fayyad, U.M.; Irani, K.B. Multi-interval Discretization of Continuous-Valued Attributes for Classification Learning. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, Chambery, France, 28 August–3 September 1993; pp. 1022–1029.
28. Zaidi, N.A.; Cerquides, J.; Carman, M.J.; Webb, G.I. Alleviating naive bayes attribute independence assumption by attribute weighting. J. Mach. Learn. Res. **2013**, 14, 1947–1988.
29. Cestnik, B. Estimating probabilities: a crucial task in machine learning. In Proceedings of the Ninth European Conference on Artificial Intelligence, Stockholm, Sweden, 6–10 August 1990; pp. 147–149.
30. Friedman, M. The use of ranks to avoid the assumption of normality implicit in the analysis of variance. J. Am. Stat. Assoc. **1937**, 32, 675–701.
31. Demšar, J. Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. **2006**, 7. Available online: http://dl.acm.org/citation.cfm?id=1248547.1248548 (accessed on 1 December 2017).
32. Nemenyi, P. Distribution-Free Multiple Comparisons. Ph.D. Thesis, Princeton University, Princeton, NJ, USA, 1963.
33. Özöğür-Akyüz, S.; Windeatt, T.; Smith, R. Pruning of error correcting output codes by optimization of accuracy-diversity trade off. Mach. Learn. **2015**, 101, 253–269.
34. Díez-Pastor, J.-F.; García-Osorio, C.; Rodríguez, J.J. Tree ensemble construction using a grasp-based heuristic and annealed randomness. Inf. Fusion **2014**, 20, 189–202.

**Figure 1.** Examples of network structures with four attributes for the following BN classifiers: (**a**) NB; (**b**) TAN; (**c**) KDB (k = 2).

**Figure 13.** Time comparisons of TAN, LTAN and AKDB per dataset. TAN: tree-augmented naive Bayes. LTAN: label-driven TAN. AKDB: averaged k-dependence Bayesian classifier. (**a**) training time; (**b**) classification time.

**Figure 14.** Network structures of TAN${}^{IE}$ and E${}^{{c}_{5}}$ on the testing instance in Table 7: (**a**) TAN${}^{IE}$; (**b**) E${}^{{c}_{5}}$.

| No. | $\mathbf{x}=({x}_{1},{x}_{2},{x}_{3},{x}_{4},{x}_{5},{x}_{6},{x}_{7},{x}_{8})$ |
|---|---|
| 1 | $\mathbf{x}=(usual,very\_crit,completed,3,convenient,inconv,problematic,priority)$ |
| 2 | $\mathbf{x}=(great\_pret,proper,incomplete,2,less\_conv,convenient,nonprob,priority)$ |

**Table 2.** The class membership probabilities of the testing instances in Table 1 estimated by TAN.

| No. | ${P}_{\theta}({c}_{1}\mid \mathbf{x})$ | ${P}_{\theta}({c}_{2}\mid \mathbf{x})$ | ${P}_{\theta}({c}_{3}\mid \mathbf{x})$ | ${P}_{\theta}({c}_{4}\mid \mathbf{x})$ | ${P}_{\theta}({c}_{5}\mid \mathbf{x})$ | $\widehat{c}$ | $\tilde{c}$ |
|---|---|---|---|---|---|---|---|
| 1 | $2.8394\times {10}^{-4}$ | $1.7242\times {10}^{-5}$ | $5.8891\times {10}^{-5}$ | $1.2269\times {10}^{-3}$ | 0.9984 | ${c}_{5}$ | ${c}_{5}$ |
| 2 | $3.9812\times {10}^{-4}$ | $2.5073\times {10}^{-5}$ | $5.3492\times {10}^{-7}$ | 0.5051 | 0.4945 | ${c}_{4}$ | ${c}_{5}$ |


| No. | Dataset | Instance | Att. | Class | No. | Dataset | Instance | Att. | Class |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Lung Cancer | 32 | 56 | 3 | 21 | Segment | 2310 | 19 | 7 |
| 2 | Labor-negotiations | 57 | 16 | 2 | 22 | Hypothyroid | 3163 | 25 | 2 |
| 3 | Post-operative | 90 | 8 | 3 | 23 | Kr-vs-Kp | 3196 | 36 | 2 |
| 4 | Zoo | 101 | 16 | 7 | 24 | Hypo | 3772 | 29 | 4 |
| 5 | Promoters | 106 | 57 | 2 | 25 | Waveform-5000 | 5000 | 40 | 3 |
| 6 | Iris | 150 | 4 | 3 | 26 | Phoneme | 5438 | 7 | 50 |
| 7 | Teaching-ae | 151 | 5 | 3 | 27 | Page-blocks | 5473 | 10 | 5 |
| 8 | Sonar | 208 | 60 | 2 | 28 | Optdigits | 5620 | 64 | 10 |
| 9 | Heart | 270 | 13 | 2 | 29 | Mushrooms | 8124 | 22 | 2 |
| 10 | Hungarian | 294 | 13 | 2 | 30 | Thyroid | 9169 | 29 | 20 |
| 11 | Heart-disease-c | 303 | 13 | 2 | 31 | Pendigits | 10,992 | 16 | 10 |
| 12 | Dermatology | 366 | 34 | 6 | 32 | Sign | 12,546 | 8 | 3 |
| 13 | Musk1 | 476 | 166 | 2 | 33 | Nursery | 12,960 | 8 | 5 |
| 14 | Cylinder-bands | 540 | 39 | 2 | 34 | Letter-recog | 20,000 | 16 | 26 |
| 15 | Chess | 551 | 39 | 2 | 35 | Shuttle | 58,000 | 9 | 7 |
| 16 | Syncon | 600 | 60 | 6 | 36 | Waveform | 100,000 | 21 | 3 |
| 17 | Soybean | 683 | 35 | 19 | 37 | Census-income | 299,285 | 41 | 2 |
| 18 | Breast-cancer-w | 699 | 9 | 2 | 38 | Covtype | 581,012 | 54 | 7 |
| 19 | Tic-Tac-Toe | 958 | 9 | 2 | 39 | Poker-hand | 1,025,010 | 10 | 10 |
| 20 | Vowel | 990 | 13 | 11 | 40 | Donation | 5,749,132 | 11 | 2 |

| No. | Dataset | LTAN | NB | TAN | KDB | AODE | WATAN | AKDB |
|---|---|---|---|---|---|---|---|---|
| 1 | Lung Cancer | 0.5938 | 0.4375 | 0.5938 | 0.5938 | 0.4688 | 0.6250 | 0.6562 |
| 2 | Labor-negotiations | 0.0702 ∘ | 0.0702 | 0.1053 | 0.0351 | 0.0526 | 0.1053 | 0.0702 |
| 3 | Post-operative | 0.3444 ∘ | 0.3444 | 0.3667 | 0.3444 | 0.3444 | 0.3667 | 0.3333 |
| 4 | Zoo | 0.0099 | 0.0297 | 0.0099 | 0.0495 | 0.0198 | 0.0198 | 0.0297 |
| 5 | Promoters | 0.0849 ∘ | 0.0755 | 0.1321 | 0.1321 | 0.1038 | 0.1132 | 0.0943 |
| 6 | Iris | 0.0867 • | 0.0867 | 0.0800 | 0.0867 | 0.0867 | 0.0800 | 0.0867 |
| 7 | Teaching-ae | 0.5099 ∘ | 0.4967 | 0.5497 | 0.5430 | 0.4570 | 0.5364 | 0.5033 |
| 8 | Sonar | 0.2115 | 0.2308 | 0.2212 | 0.2308 | 0.2260 | 0.2212 | 0.2212 |
| 9 | Heart | 0.1778 ∘ | 0.1778 | 0.1926 | 0.1963 | 0.1704 | 0.1926 | 0.2037 |
| 10 | Hungarian | 0.1565 ∘ | 0.1599 | 0.1701 | 0.1701 | 0.1667 | 0.1735 | 0.1497 |
| 11 | Heart-disease-c | 0.1980 | 0.1815 | 0.2079 | 0.2079 | 0.1947 | 0.2046 | 0.2013 |
| 12 | Dermatology | 0.0137 ∘ | 0.0191 | 0.0328 | 0.0301 | 0.0219 | 0.0328 | 0.0219 |
| 13 | Musk1 | 0.1071 ∘ | 0.1660 | 0.1134 | 0.1113 | 0.1366 | 0.1134 | 0.1071 |
| 14 | Cylinder-bands | 0.1704 ∘ | 0.2148 | 0.2833 | 0.2278 | 0.1889 | 0.2463 | 0.2148 |
| 15 | Chess | 0.0980 • | 0.1125 | 0.0926 | 0.0998 | 0.1053 | 0.0926 | 0.0998 |
| 16 | Syncon | 0.0117 • | 0.0283 | 0.0083 | 0.0100 | 0.0133 | 0.0083 | 0.0150 |
| 17 | Soybean | 0.0425 ∘ | 0.0893 | 0.0469 | 0.0644 | 0.0542 | 0.0527 | 0.0498 |
| 18 | Breast-cancer-w | 0.0372 ∘ | 0.0258 | 0.0415 | 0.0486 | 0.0386 | 0.0415 | 0.0300 |
| 19 | Tic-Tac-Toe | 0.2328 | 0.3069 | 0.2286 | 0.2463 | 0.2683 | 0.2265 | 0.2683 |
| 20 | Vowel | 0.1394 • | 0.4242 | 0.1303 | 0.2343 | 0.1747 | 0.1263 | 0.2182 |
| 21 | Segment | 0.0364 ∘ | 0.0788 | 0.0390 | 0.0403 | 0.0329 | 0.0394 | 0.0385 |
| 22 | Hypothyroid | 0.0092 ∘ | 0.0149 | 0.0104 | 0.0107 | 0.0130 | 0.0104 | 0.0092 |
| 23 | Kr-vs-Kp | 0.0576 ∘ | 0.1214 | 0.0776 | 0.0544 | 0.0854 | 0.0776 | 0.0507 |
| 24 | Hypo | 0.0130 ∘ | 0.0138 | 0.0141 | 0.0077 | 0.0106 | 0.0130 | 0.0087 |
| 25 | Waveform-5000 | 0.1630 ∘ | 0.2006 | 0.1844 | 0.1820 | 0.1462 | 0.1844 | 0.1644 |
| 26 | Phoneme | 0.2378 ∘ | 0.2615 | 0.2733 | 0.2120 | 0.2100 | 0.2345 | 0.1885 |
| 27 | Page-blocks | 0.0369 ∘ | 0.0619 | 0.0415 | 0.0433 | 0.0322 | 0.0418 | 0.0347 |
| 28 | Optdigits | 0.0345 ∘ | 0.0767 | 0.0407 | 0.0416 | 0.0278 | 0.0406 | 0.0400 |
| 29 | Mushrooms | 0.0001 | 0.0196 | 0.0001 | 0.0006 | 0.0002 | 0.0001 | 0.0006 |
| 30 | Thyroid | 0.0681 ∘ | 0.1111 | 0.0720 | 0.0693 | 0.0719 | 0.0723 | 0.0674 |
| 31 | Pendigits | 0.0225 ∘ | 0.1181 | 0.0321 | 0.0362 | 0.0187 | 0.0328 | 0.0286 |
| 32 | Sign | 0.2659 | 0.3586 | 0.2755 | 0.2881 | 0.2822 | 0.2752 | 0.2826 |
| 33 | Nursery | 0.0590 ∘ | 0.0973 | 0.0654 | 0.0654 | 0.0733 | 0.0654 | 0.0633 |
| 34 | Letter-recog | 0.1043 ∘ | 0.2525 | 0.1300 | 0.1285 | 0.0863 | 0.1300 | 0.1203 |
| 35 | Shuttle | 0.0009 ∘ | 0.0039 | 0.0015 | 0.0015 | 0.0011 | 0.0014 | 0.0010 |
| 36 | Waveform | 0.0195 | 0.0220 | 0.0202 | 0.0226 | 0.0180 | 0.0202 | 0.0200 |
| 37 | Census-income | 0.0542 ∘ | 0.2363 | 0.0628 | 0.0619 | 0.1013 | 0.0628 | 0.0513 |
| 38 | Covtype | 0.2378 ∘ | 0.3158 | 0.2517 | 0.2451 | 0.2385 | 0.2516 | 0.2445 |
| 39 | Poker-hand | 0.2266 ∘ | 0.4988 | 0.3295 | 0.3291 | 0.4812 | 0.3295 | 0.0763 |
| 40 | Donation | 0.0000 | 0.0002 | 0.0000 | 0.0000 | 0.0002 | 0.0000 | 0.0000 |

| W/D/L | NB | TAN | KDB | AODE | WATAN | AKDB |
|---|---|---|---|---|---|---|
| TAN | 26/3/11 | | | | | |
| KDB | 26/4/10 | 7/22/11 | | | | |
| AODE | 28/7/5 | 21/4/15 | 20/8/12 | | | |
| WATAN | 28/1/11 | 5/33/2 | 13/20/7 | 13/7/20 | | |
| AKDB | 27/7/6 | 21/9/10 | 20/15/5 | 16/8/16 | 22/10/8 | |
| LTAN | 30/6/4 | 27/9/4 | 26/9/5 | 22/6/12 | 25/11/4 | 18/15/7 |

| No. | Dataset | Instance | Class | $\mathbf{CP}$ | $\mathbf{LP}$ |
|---|---|---|---|---|---|
| 1 | Iris | 150 | 3 | 0.00% | 7.14% |
| 2 | Chess | 551 | 2 | 4.05% | 5.41% |
| 3 | Syncon | 600 | 6 | 0.00% | 66.67% |
| 4 | Vowel | 990 | 11 | 7.09% | 10.14% |

**Table 7.** A testing instance in Vowel which is misclassified by LTAN but correctly classified by TAN.

| $\mathbf{x}=({x}_{1},{x}_{2},{x}_{3},{x}_{4},{x}_{5},{x}_{6},{x}_{7},{x}_{8},{x}_{9},{x}_{10},{x}_{11},{x}_{12},{x}_{13})$ | $\tilde{c}$ |
|---|---|
| x = (1, 14, 1, (−2.3195,−2.1045], >3.1765, ≤0.4725, ≤−0.0225, (−0.7345,0.4995], ≤1.2250, (−1.1270,−0.6575], (−0.5970,0.1185], ≤−0.9465, known) | ${c}_{5}$ |


**Table 8.** The class membership probabilities of the testing instance in Table 7 estimated by TAN, TAN${}^{IE}$, IBMC${}^{CL}$ and LTAN.

| Classifier | ${P}_{\theta}({c}_{5}\mid \mathbf{x})$ | ${P}_{\theta}({c}_{4}\mid \mathbf{x})$ | $\widehat{c}$ |
|---|---|---|---|
| TAN | 0.7783 | 0.1691 | ${c}_{5}$ |
| TAN${}^{IE}$ | 0.8215 | 0.1785 | ${c}_{5}$ |
| IBMC${}^{CL}$ | 0.0185 | 0.9815 | ${c}_{4}$ |
| LTAN | 0.4200 | 0.5800 | ${c}_{4}$ |
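The rows of Table 8 can be related to one another by simple arithmetic. The sketch below is an observation about the reported numbers, not a statement of the paper's definitions. Renormalizing TAN's posteriors over the two labels ${c}_{5}$ and ${c}_{4}$ reproduces the TAN${}^{IE}$ row, and the unweighted mean of the TAN${}^{IE}$ and IBMC${}^{CL}$ rows reproduces the LTAN row, both to four decimal places. The function names are hypothetical.

```python
def renormalize(posteriors, labels):
    """Rescale the posteriors of the given labels so that they sum to 1."""
    total = sum(posteriors[label] for label in labels)
    return {label: posteriors[label] / total for label in labels}


def average(p, q):
    """Unweighted mean of two posterior distributions over the same labels."""
    return {label: (p[label] + q[label]) / 2 for label in p}


# Values reported in Table 8.
tan = {"c5": 0.7783, "c4": 0.1691}
tan_ie_reported = {"c5": 0.8215, "c4": 0.1785}
ibmc_cl_reported = {"c5": 0.0185, "c4": 0.9815}
ltan_reported = {"c5": 0.4200, "c4": 0.5800}

tan_ie = renormalize(tan, ["c5", "c4"])
ltan = average(tan_ie_reported, ibmc_cl_reported)
```

On this instance the averaged estimate moves the decision from ${c}_{5}$ to ${c}_{4}$, because the IBMC${}^{CL}$ estimate for ${c}_{4}$ (0.9815) outweighs the TAN${}^{IE}$ estimate for ${c}_{5}$ (0.8215).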

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Sun, Y.; Wang, L.; Sun, M.
Label-Driven Learning Framework: Towards More Accurate Bayesian Network Classifiers through Discrimination of High-Confidence Labels. *Entropy* **2017**, *19*, 661.
https://doi.org/10.3390/e19120661
