# Classification of Contaminated Insulators Using k-Nearest Neighbors Based on Computer Vision

## Abstract

## 1. Introduction

- The first contribution is an improved diagnosis of contaminated insulators by means of an artificial intelligence model that is efficient and can be used in several applications.
- The second contribution is a computer vision analysis of insulators based on non-soluble deposit density (NSDD). Using this measure to analyze the contamination of electrical power system insulators is a novel approach.
- The third contribution is showing that the k-nearest neighbors (k-NN) model outperforms the decision tree, ensemble, support vector machine, and multilayer perceptron models for this application.

## 2. Insulator Contamination

#### 2.1. Contaminated Insulator Samples

#### 2.2. Image Preprocessing

#### 2.3. Feature Extraction

## 3. Nearest Neighbors Method

#### 3.1. Model Architecture

- $e$: error of the optimal Bayes classifier;
- $e_{1nn}(D)$: error of 1-NN;
- $e_{knn}(D)$: error of k-NN.

- $\lim_{n\to\infty} e_{1nn}(D) \le 2e$;
- $\lim_{n\to\infty,\ k\to n} e_{knn}(D) = e$.
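
The 1-NN and k-NN classifiers whose errors are defined above can be illustrated with a minimal sketch. It assumes Euclidean distance and a plain majority vote; the function name `knn_predict` and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k=1):
    """Label each query row by majority vote among its k nearest training rows."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    X_query = np.asarray(X_query, dtype=float)
    # Pairwise Euclidean distances, shape (n_query, n_train). Assumed metric.
    dists = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    # Indices of the k closest training points for each query row.
    nearest = np.argsort(dists, axis=1, kind="stable")[:, :k]
    preds = []
    for row in nearest:
        labels, counts = np.unique(y_train[row], return_counts=True)
        # On a tie, argmax picks the first entry, i.e. the smallest label.
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)

# Toy data, made up for illustration: two small 2-D clusters.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]]
y = [0, 0, 0, 1, 1, 1]
queries = [[0.05, 0.05], [0.95, 1.0]]

print(knn_predict(X, y, queries, k=1))  # 1-NN
print(knn_predict(X, y, queries, k=3))  # k-NN with k = 3
```

Setting `k=1` gives the 1-NN rule, and larger `k` averages the vote over more neighbors.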

#### 3.2. Neighbor Distance Method

#### 3.3. Holdout

#### 3.4. Cross-Validation

#### 3.5. Benchmarking

## 4. Analysis of Results

#### Benchmarking

## 5. Conclusions

## Abbreviations

Abbreviation | Definition |
---|---|
ANFIS | adaptive neuro-fuzzy inference system |
ANN | artificial neural network |
CAPES | Coordination for the Improvement of Higher Education Personnel |
CBIE | Canadian Bureau for International Education |
CNN | convolutional neural network |
ELAP | Emerging Leaders in the Americas Program |
ESDD | equivalent salt deposit density |
GMDH | group method of data handling |
k-NN | k-nearest neighbors |
LSTM | long short-term memory |
NSDD | non-soluble deposit density |
SVM | support vector machine |

## References

Kaolin (g/L) | 6 | 8 | 10 | 16 | 20 | 25 |
---|---|---|---|---|---|---|
Max. $W_f$ | 2.778 | 3.279 | 3.483 | 4.297 | 4.201 | 4.485 |
Min. $W_f$ | 2.408 | 2.542 | 2.615 | 2.946 | 3.580 | 3.956 |
Max. $W_i$ | 2.226 | 2.415 | 2.204 | 2.522 | 2.385 | 2.461 |
Min. $W_i$ | 2.121 | 2.134 | 2.088 | 2.123 | 2.088 | 2.145 |
Max. NSDD | 0.868 | 1.503 | 1.688 | 2.618 | 2.637 | 2.656 |
Min. NSDD | 0.377 | 0.393 | 0.633 | 1.029 | 1.568 | 1.661 |

Kaolin (g/L) | NSDD < 1.0 (mg/cm$^2$) | 1.0 < NSDD < 2.0 (mg/cm$^2$) | NSDD > 2.0 (mg/cm$^2$) |
---|---|---|---|
6 | 100% | 0% | 0% |
8 | 80% | 20% | 0% |
10 | 60% | 40% | 0% |
16 | 0% | 40% | 60% |
20 | 0% | 40% | 60% |
25 | 0% | 20% | 80% |
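
The three NSDD ranges in the table above can be turned into class labels with a short helper. This is a minimal sketch: the function name `nsdd_class` is hypothetical, and because the table does not say how values of exactly 1.0 or 2.0 mg/cm$^2$ are handled, assigning them to the middle range is an assumption.

```python
def nsdd_class(nsdd_mg_cm2):
    """Map an NSDD value (mg/cm^2) to one of the three ranges used in the table."""
    if nsdd_mg_cm2 < 1.0:
        return "NSDD < 1.0"
    # Assumption: the boundary values 1.0 and 2.0 fall in the middle range.
    if nsdd_mg_cm2 <= 2.0:
        return "1.0 < NSDD < 2.0"
    return "NSDD > 2.0"

# Example values taken from the Max./Min. NSDD rows reported earlier.
for value in (0.377, 1.503, 2.656):
    print(value, "->", nsdd_class(value))
```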

Distance Weight | 5-Fold Accuracy (%) | 6-Fold Accuracy (%) | 7-Fold Accuracy (%) | 8-Fold Accuracy (%) | 9-Fold Accuracy (%) | 10-Fold Accuracy (%) |
---|---|---|---|---|---|---|
Equal | 82.85 | 81.69 | 82.85 | 81.69 | 82.85 | 79.94 |
Inverse | 79.36 | 81.98 | 82.85 | 80.52 | 84.59 | 81.69 |
Squared inverse | 81.69 | 84.30 | 82.85 | 83.14 | 84.58 | 83.14 |
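
The three distance weights compared in the table above can be sketched as a weighted neighbor vote. The sketch assumes the common definitions (equal: weight 1; inverse: 1/d; squared inverse: 1/d²); the paper's exact formulas are not shown in this section, and the function names, the small `eps` guard against division by zero, and the example labels are all hypothetical.

```python
import numpy as np

def neighbor_weights(dists, scheme="equal", eps=1e-12):
    """Return one vote weight per neighbor distance (assumed definitions)."""
    dists = np.asarray(dists, dtype=float)
    if scheme == "equal":
        return np.ones_like(dists)
    if scheme == "inverse":
        return 1.0 / (dists + eps)       # eps avoids division by zero
    if scheme == "squared_inverse":
        return 1.0 / (dists ** 2 + eps)
    raise ValueError(f"unknown scheme: {scheme}")

def weighted_vote(labels, dists, scheme="equal"):
    """Pick the label with the largest summed weight among the neighbors."""
    totals = {}
    for label, weight in zip(labels, neighbor_weights(dists, scheme)):
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)

# Three neighbors of one query: one very close "low", two farther "high".
labels = ["low", "high", "high"]
dists = [0.1, 0.5, 0.6]
for scheme in ("equal", "inverse", "squared_inverse"):
    print(scheme, "->", weighted_vote(labels, dists, scheme))
```

With equal weights the two farther neighbors win the vote, while both inverse schemes let the single close neighbor dominate, which is why the choice of weight can change the accuracy.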

Distance Weight | Max. Accuracy (%) | Min. Accuracy (%) | Mean Accuracy (%) | Std. Dev. |
---|---|---|---|---|
Equal | 85.17 | 79.65 | 82.39 | $8.70 \times 10^{-3}$ |
Inverse | 85.17 | 77.91 | 82.26 | $1.11 \times 10^{-2}$ |
Squared inverse | 84.88 | 79.36 | 82.23 | $9.6 \times 10^{-3}$ |

Method | Max. Accuracy (%) | Min. Accuracy (%) | Mean Accuracy (%) | Std. Dev. |
---|---|---|---|---|
Decision Tree (gdi) | 56.98 | 47.09 | 52.92 | $1.26 \times 10^{-2}$ |
Decision Tree (deviance) | 61.63 | 56.40 | 60.02 | $7.43 \times 10^{-3}$ |
Ensemble (subspace) | 67.44 | 63.95 | 65.53 | $6.42 \times 10^{-3}$ |
SVM (onevsone) | 47.67 | 43.31 | 45.51 | $6.34 \times 10^{-3}$ |
SVM (allpairs) | 47.38 | 42.73 | 45.40 | $7.84 \times 10^{-3}$ |
Multilayer perceptron | 76.25 | 66.25 | 70.87 | $2.27 \times 10^{-2}$ |

