# Towards Fire Prediction Accuracy Enhancements by Leveraging an Improved Naïve Bayes Algorithm

## Abstract

## 1. Introduction

## 2. Naive Bayesian Classifier

## 3. Practical Enhancements in the Naive Bayesian Algorithm to Augment Fire Prediction Accuracy

#### 3.1. Laplace Smoothing

#### 3.2. Logarithmic Operation

#### 3.3. Double Weighting of Characteristic Attribute

#### 3.4. Prior Probability Compensation

## 4. Fire Prediction Model

#### 4.1. Development of the Classification Model

- (1) Divide the dataset into two groups: the training set and the test set. Data preprocessing is applied, including data discretization and normalization.
- (2) Based on the samples from the training set, calculate the probability $p\left({C}_{i}\right)$ of the decision category ${C}_{i}$, and the conditional probability $p({x}_{j},k|{C}_{i})$ of the characteristic attribute ${X}_{k}$ with the value of ${x}_{j}$ under the decision category ${C}_{i}$.
- (3) Calculate the characteristic attribute weighting coefficient and the characteristic attribute value weighting coefficient according to (6) and (7), respectively.
- (4) Determine the prior probability compensation coefficient. The decision category model is developed based on the weighting of characteristic attributes, and a complete weighted naive Bayes classifier is obtained.
- (5) For data point Y to be classified, apply the developed naive Bayes classifier to classify the data.
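
The steps above can be sketched in code. The following is a minimal illustration under stated assumptions, not the paper's implementation: attributes are already discretized, conditional probabilities use Laplace smoothing, scores are summed in the log domain, the weighting coefficients enter as multipliers on the log conditional probabilities, and the compensation coefficient ${\xi}_{i}$ multiplies the prior $p\left({C}_{i}\right)$. The function names `train` and `predict`, the `weight` callback, the toy samples, and the value `2.0` used for ${\xi}_{\mathrm{OF}}$ are all hypothetical.

```python
import math
from collections import Counter


def train(samples, labels):
    """Estimate log priors and Laplace-smoothed log conditionals from discrete data."""
    n_attrs = len(samples[0])
    class_count = Counter(labels)
    # Distinct values seen for each attribute X_k.
    values = [sorted({s[k] for s in samples}) for k in range(n_attrs)]
    counts = {c: [Counter() for _ in range(n_attrs)] for c in class_count}
    for s, c in zip(samples, labels):
        for k, v in enumerate(s):
            counts[c][k][v] += 1
    log_prior = {c: math.log(n_c / len(labels)) for c, n_c in class_count.items()}
    log_cond = {
        c: [
            # Laplace smoothing: (count + 1) / (class size + number of values).
            {v: math.log((counts[c][k][v] + 1) / (n_c + len(values[k])))
             for v in values[k]}
            for k in range(n_attrs)
        ]
        for c, n_c in class_count.items()
    }
    return {"log_prior": log_prior, "log_cond": log_cond}


def predict(model, x, xi=None, weight=None):
    """Return the class with the highest compensated, weighted log score.

    xi     -- assumed form: dict class -> compensation coefficient (default 1.0)
    weight -- assumed form: callable (k, value, cls) -> weight (default 1.0)
    Every value in x is assumed to have been seen during training.
    """
    xi = xi or {}
    best_class, best_score = None, -math.inf
    for c, log_p in model["log_prior"].items():
        score = log_p + math.log(xi.get(c, 1.0))
        for k, v in enumerate(x):
            w = weight(k, v, c) if weight else 1.0
            score += w * model["log_cond"][c][k][v]
        if score > best_score:
            best_class, best_score = c, score
    return best_class


# Toy, made-up training data: (temperature bin, smoke bin, CO bin).
samples = [(2, 2, 2), (2, 0, 1), (0, 2, 1), (0, 1, 1), (0, 0, 0), (0, 0, 0)]
labels = ["OF", "OF", "SF", "SF", "NF", "NF"]
model = train(samples, labels)

print(predict(model, (0, 2, 2)))                  # unweighted, no compensation
print(predict(model, (0, 2, 2), xi={"OF": 2.0}))  # prior of OF boosted
```

On this toy data the borderline sample `(0, 2, 2)` moves from SF to OF once the OF prior is boosted, which illustrates how a compensation coefficient can shift a decision boundary.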

#### 4.2. Determination of the Compensation Coefficient ${\xi}_{i}$

The orthogonal table $L_{16}(4^{3})$ was used to construct the preliminary determination test for ${\xi}_{i}$. The test results are shown in Table 2.

Based on the results of the preliminary test with $L_{16}(4^{3})$, the second orthogonal test was conducted to determine the final value of ${\xi}_{i}$. Three factors, ${\xi}_{\mathrm{OF}}$, ${\xi}_{\mathrm{SF}}$, and ${\xi}_{\mathrm{NF}}$, were chosen. Each factor was divided into five levels, giving a 3-factor, 5-level orthogonal test. The orthogonal table selected was $L_{25}(5^{3})$. The factor levels and the test results are shown in Table 3 and Table 4, respectively.

The influence of ${\xi}_{\mathrm{NF}}$ is still small. The optimal levels of ${\xi}_{\mathrm{OF}}$, ${\xi}_{\mathrm{SF}}$, and ${\xi}_{\mathrm{NF}}$ are 1, 4, and 2, respectively. Therefore, the ${\xi}_{i}$ values are determined as ${\xi}_{\mathrm{OF}}$ = 1.1, ${\xi}_{\mathrm{SF}}$ = 2.7, and ${\xi}_{\mathrm{NF}}$ = 3.3.
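
The range analysis behind this choice can be sketched as follows. This is an illustrative script, not the authors' code: it takes the factor levels from Table 3 and the accuracies from Table 4, averages the accuracy over the runs at each level of a factor, and reports the best level and the range (largest minus smallest level average). The names `build_runs` and `range_analysis` are hypothetical. Averages recomputed from the rows can differ in the last digit from the printed values. The ${\xi}_{\mathrm{NF}}$ averages at 3.3 and 3.5 agree to two decimals, so the best level reported for that factor depends on rounding.

```python
from collections import defaultdict

# Factor levels copied from Table 3.
XI_OF = [1.1, 1.3, 1.5, 1.7, 1.9]
XI_SF = [2.1, 2.3, 2.5, 2.7, 2.9]
XI_NF = [3.1, 3.3, 3.5, 3.7, 3.9]

# Accuracy A (%) of runs 1-25 from Table 4, one row per xi_OF level.
ACCURACY = [
    [96.87, 96.77, 96.82, 97.18, 96.72],
    [96.87, 96.77, 96.77, 96.82, 96.72],
    [96.72, 96.77, 96.77, 96.56, 96.56],
    [95.07, 95.43, 96.61, 97.02, 96.56],
    [94.92, 95.33, 95.38, 95.74, 96.10],
]


def build_runs():
    """Rebuild the 25 runs; in Table 4 the xi_NF level index is (i + j) mod 5."""
    runs = []
    for i, of in enumerate(XI_OF):
        for j, sf in enumerate(XI_SF):
            nf = XI_NF[(i + j) % 5]
            runs.append((of, sf, nf, ACCURACY[i][j]))
    return runs


def range_analysis(runs, n_factors=3):
    """Return, for each factor, (level averages, best level, range R)."""
    summary = []
    for f in range(n_factors):
        by_level = defaultdict(list)
        for run in runs:
            by_level[run[f]].append(run[-1])
        means = {lvl: sum(acc) / len(acc) for lvl, acc in by_level.items()}
        best = max(means, key=means.get)
        spread = max(means.values()) - min(means.values())
        summary.append((means, best, spread))
    return summary


runs = build_runs()
for name, (means, best, spread) in zip(("xi_OF", "xi_SF", "xi_NF"), range_analysis(runs)):
    rounded = {lvl: round(m, 2) for lvl, m in means.items()}
    print(name, rounded, "best level:", best, "R:", round(spread, 2))
```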

#### 4.3. Classification Model Performance Analysis

## 5. Experimental Verification

## 6. Conclusions

## Nomenclature

Symbol | Description |
---|---|
$P$ | Probability |
$X$ | Characteristic attribute |
$C$ | Categories |
${S}_{j}$ | Number of all the values of the characteristic attribute ${X}_{k}$ |
${\sigma}_{k,i}{}^{2}$ | Variance of the characteristic attribute ${X}_{k}$ under the decision category ${C}_{i}$ |
$\overline{{x}_{k,i}}$ | Average value of the characteristic attribute ${X}_{k}$ under the decision category ${C}_{i}$ |
${\xi}_{i}$ | Prior probability compensation coefficient |
${N}_{r}$ | Total number of data that can be correctly identified for open flame, smoldering fire, and no fire |
$N$ | Total number of data in the test set |
$TP$ | True Positive |
$FP$ | False Positive |
$TN$ | True Negative |
$FN$ | False Negative |
$\mathrm{DWCNB}$ | Double weighted naive Bayes with compensation coefficient |
$\mathrm{DWNB}$ | Double weighted naive Bayesian algorithm |
$\mathrm{NB}$ | Naive Bayes |
$P\left(C|X\right)$ | Basic naive Bayesian model |
${C}_{NB}\left(X\right)$ | Naive Bayesian classification model |
${v}_{k,i}$ | Characteristic attribute weighting coefficient |
${w}_{k,j,i}$ | Weight coefficient of the value ${x}_{j}$ of the characteristic attribute ${X}_{k}$ under the decision category ${C}_{i}$ |
$A$ | Accuracy rate |
$P$ | Accuracy rate of model |
$R$ | Recall rate |
$F$ | F-measure |

Temperature | Smoke Concentration | Carbon Monoxide Concentration | Category |
---|---|---|---|
0.134993447 | 0.731509228 | 0.491863445 | SF |
0.057899902 | 0.11399056 | 0.279757662 | NF |
0.530814524 | 0.91432897 | 0.917676408 | OF |
0.051104374 | 0.171237491 | 0.408244787 | SF |
0.056918548 | 0.114881111 | 0.274411974 | NF |
0.145478375 | 0.586159577 | 0.378634151 | SF |
0.138925295 | 0.811801608 | 0.43573411 | SF |
0.054955839 | 0.11399056 | 0.249465431 | NF |
0.137614679 | 0.781443414 | 0.298299892 | SF |
0.052993131 | 0.089945676 | 0.293121882 | NF |
0.846657929 | 0.118553785 | 0.696103857 | OF |
0.056918548 | 0.114881111 | 0.285994298 | NF |
0.058033781 | 0.205166522 | 0.522469442 | SF |
0.545690775 | 0.840545804 | 0.870615962 | OF |
0.871559633 | 0.067020472 | 0.299467674 | OF |
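
The sample values above all lie in [0, 1], which is what the normalization step in Section 4.1 would produce. The sketch below shows one common way to carry out that preprocessing. Min-max scaling and equal-width binning are assumptions here, since the text does not state which normalization or discretization method was used. The raw temperature readings, the bin count of 4, and the function names are hypothetical.

```python
def min_max_normalize(column):
    """Scale a list of raw readings to [0, 1] (assumed normalization method)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to 0.0.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def discretize(value, n_bins=4):
    """Map a normalized value in [0, 1] to an equal-width bin index 0..n_bins-1."""
    # The value 1.0 would land in bin n_bins, so clamp it into the last bin.
    return min(int(value * n_bins), n_bins - 1)


# Hypothetical raw temperature readings in degrees Celsius.
raw_temperature = [20.0, 35.0, 80.0]
normalized = min_max_normalize(raw_temperature)
bins = [discretize(v) for v in normalized]
print(normalized, bins)
```

The bin indices produced this way are the kind of discrete attribute values that a count-based naive Bayes classifier takes as input.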

Number | ${\xi}_{\mathrm{OF}}$ | ${\xi}_{\mathrm{SF}}$ | ${\xi}_{\mathrm{NF}}$ | Evaluation Result A (%) |
---|---|---|---|---|
1 | 1.5 | 1.5 | 1.5 | 95.54 |
2 | 1.5 | 2.5 | 2.5 | 96.56 |
3 | 1.5 | 3.5 | 3.5 | 95.07 |
4 | 1.5 | 4.5 | 4.5 | 95.07 |
5 | 2.5 | 1.5 | 2.5 | 94.87 |
6 | 2.5 | 2.5 | 1.5 | 95.07 |
7 | 2.5 | 3.5 | 4.5 | 95.23 |
8 | 2.5 | 4.5 | 3.5 | 94.97 |
9 | 3.5 | 1.5 | 3.5 | 94.61 |
10 | 3.5 | 2.5 | 4.5 | 94.67 |
11 | 3.5 | 3.5 | 1.5 | 94.97 |
12 | 3.5 | 4.5 | 2.5 | 93.43 |
13 | 4.5 | 1.5 | 4.5 | 92.92 |
14 | 4.5 | 2.5 | 3.5 | 94.10 |
15 | 4.5 | 3.5 | 2.5 | 92.97 |
16 | 4.5 | 4.5 | 1.5 | 92.30 |
Level average value $k_{1}$ | 95.56 | 94.48 | 94.09 | |
Level average value $k_{2}$ | 95.04 | 95.18 | 94.33 | |
Level average value $k_{3}$ | 93.98 | 94.18 | 94.69 | |
Level average value $k_{4}$ | 93.07 | 93.82 | 94.55 | |
Range $R$ | 2.49 | 1.36 | 0.60 | |

Level | ${\xi}_{\mathrm{OF}}$ | ${\xi}_{\mathrm{SF}}$ | ${\xi}_{\mathrm{NF}}$ |
---|---|---|---|
1 | 1.1 | 2.1 | 3.1 |
2 | 1.3 | 2.3 | 3.3 |
3 | 1.5 | 2.5 | 3.5 |
4 | 1.7 | 2.7 | 3.7 |
5 | 1.9 | 2.9 | 3.9 |

Number | ${\xi}_{\mathrm{OF}}$ | ${\xi}_{\mathrm{SF}}$ | ${\xi}_{\mathrm{NF}}$ | Evaluation Result A (%) |
---|---|---|---|---|
1 | 1.1 | 2.1 | 3.1 | 96.87 |
2 | 1.1 | 2.3 | 3.3 | 96.77 |
3 | 1.1 | 2.5 | 3.5 | 96.82 |
4 | 1.1 | 2.7 | 3.7 | 97.18 |
5 | 1.1 | 2.9 | 3.9 | 96.72 |
6 | 1.3 | 2.1 | 3.3 | 96.87 |
7 | 1.3 | 2.3 | 3.5 | 96.77 |
8 | 1.3 | 2.5 | 3.7 | 96.77 |
9 | 1.3 | 2.7 | 3.9 | 96.82 |
10 | 1.3 | 2.9 | 3.1 | 96.72 |
11 | 1.5 | 2.1 | 3.5 | 96.72 |
12 | 1.5 | 2.3 | 3.7 | 96.77 |
13 | 1.5 | 2.5 | 3.9 | 96.77 |
14 | 1.5 | 2.7 | 3.1 | 96.56 |
15 | 1.5 | 2.9 | 3.3 | 96.56 |
16 | 1.7 | 2.1 | 3.7 | 95.07 |
17 | 1.7 | 2.3 | 3.9 | 95.43 |
18 | 1.7 | 2.5 | 3.1 | 96.61 |
19 | 1.7 | 2.7 | 3.3 | 97.02 |
20 | 1.7 | 2.9 | 3.5 | 96.56 |
21 | 1.9 | 2.1 | 3.9 | 94.92 |
22 | 1.9 | 2.3 | 3.1 | 95.33 |
23 | 1.9 | 2.5 | 3.3 | 95.38 |
24 | 1.9 | 2.7 | 3.5 | 95.74 |
25 | 1.9 | 2.9 | 3.7 | 96.10 |
Level average value $k_{1}$ | 96.87 | 96.09 | 96.42 | |
Level average value $k_{2}$ | 96.79 | 96.21 | 96.52 | |
Level average value $k_{3}$ | 96.68 | 96.47 | 96.52 | |
Level average value $k_{4}$ | 96.14 | 96.66 | 96.38 | |
Level average value $k_{5}$ | 95.50 | 96.53 | 96.13 | |
Range $R$ | 0.73 | 0.57 | 0.39 | |

Actual Class | Positive Prediction | Negative Prediction |
---|---|---|
Positive Class | True Positive (TP) | False Negative (FN) |
Negative Class | False Positive (FP) | True Negative (TN) |
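
The four cells of this confusion matrix are enough to compute the evaluation metrics listed in the Nomenclature. The sketch below uses the standard textbook definitions, which are assumed rather than taken from the paper: accuracy $A = (TP + TN)/N$, precision $P = TP/(TP + FP)$, recall $R = TP/(TP + FN)$, and the balanced F-measure $F = 2PR/(P + R)$. The function name and the example counts are hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, precision, recall, f_measure) from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Guard against empty denominators when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return accuracy, precision, recall, f_measure


# Hypothetical counts for one class treated as "positive" (e.g. open flame vs. the rest).
a, p, r, f = classification_metrics(tp=90, fp=10, tn=95, fn=5)
print(f"A={a:.3f} P={p:.3f} R={r:.3f} F={f:.3f}")
```

For a three-class problem such as OF/SF/NF, one common approach is to apply these formulas once per class in a one-vs-rest fashion.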

Test Set | NB (%) | DWNB (%) | DWCNB (%) |
---|---|---|---|
1 | 90.49 | 94.43 | 97.24 |
2 | 92.96 | 95.23 | 96.84 |
3 | 91.82 | 96.18 | 98.13 |
4 | 92.01 | 93.82 | 96.84 |
5 | 93.82 | 94.24 | 97.44 |
Average | 92.22 | 94.78 | 97.30 |

Test Fire | NB (%) | DWNB (%) | DWCNB (%) |
---|---|---|---|
Wood smoldering fire | 92.27 | 94.13 | 97.98 |
Cotton rope smoldering fire | 92.67 | 95.67 | 98.13 |
Polyurethane plastic open flame | 93.33 | 93.27 | 97.53 |
Ethanol open flame | 92.53 | 93.00 | 97.40 |
Average | 92.70 | 94.02 | 97.76 |

Interference Source | NB (%) | DWNB (%) | DWCNB (%) |
---|---|---|---|
Cigarette lighter | 93.20 | 97.47 | 98.53 |
Dust | 95.47 | 96.20 | 99.47 |
Cigarette smoke | 90.73 | 92.20 | 96.73 |
Average | 93.13 | 95.29 | 98.24 |

