# Optimal Classifier to Detect Unit of Measure Inconsistency in Gas Turbine Sensors

## Abstract

## 1. Introduction

#### 1.1. Problem Statement and Literature Survey

#### 1.2. Contribution and Outline of This Paper

## 2. Supervised Machine Learning Classifiers

#### 2.1. Overview

#### 2.2. Basics of Support Vector Machine

where N_tr is the number of training data. Data for testing, i.e., x, are labeled as the sign provided by Equation (2), where b is the intercept of the decision hyperplane. The quantity γ is the distance between the data and the decision hyperplane.
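As an illustrative sketch (not the authors' implementation), the scikit-learn `SVC` class reproduces this setup: an RBF kernel with one-vs-one decomposition, i.e., the "SVM RO" configuration analyzed later in the paper. The synthetic feature values below are hypothetical.

```python
# Sketch of an RBF-kernel SVM with one-vs-one decomposition ("SVM RO").
# The two synthetic clusters emulate pressure-like readings in different units.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(100.0, 5.0, (50, 2)),   # hypothetical "kPa absolute"-like values
               rng.normal(1.0, 0.05, (50, 2))])   # hypothetical "bar absolute"-like values
y = np.array([0] * 50 + [1] * 50)

# decision_function_shape="ovo" applies the one-vs-one decomposition;
# probability=True enables Platt-scaled posterior probabilities.
clf = SVC(kernel="rbf", decision_function_shape="ovo",
          probability=True, random_state=0).fit(X, y)

# A test point is labeled according to the sign of the decision function
# (Equation (2)), exposed here through predict().
pred = clf.predict([[99.0, 101.0]])
```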

#### 2.3. Benefits and Drawbacks of Support Vector Machine

#### 2.4. Basics of Naïve Bayes

Naïve Bayes assigns each data point x_i to the most probable class C, i.e., the one that accounts for the highest posterior probability (or conditional probability), calculated as in Equation (7). The posterior probability that x_i belongs to class C_k depends on the class prior probability P(C_k) and the likelihood P(x_i|C_k). Based on the conditional independence assumption, the likelihood P(x_i|C_k) can be calculated separately for each variable, by reducing a multidimensional problem to a one-dimensional problem, as in Equation (8):
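A minimal Gaussian Naïve Bayes sketch illustrates how the likelihood factorizes over features under the conditional independence assumption (Equation (8)) and how the posterior of Equation (7) follows; the class statistics below are hypothetical.

```python
# Gaussian Naive Bayes sketch: per-feature one-dimensional likelihoods are
# multiplied together (independence assumption), then weighted by the prior.
import math

def gaussian_pdf(x, mean, std):
    """One-dimensional Gaussian likelihood for a single feature."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def nb_posteriors(x, classes):
    """classes: {label: (prior, [(mean, std) per feature])} (hypothetical stats).
    Returns normalized posteriors P(C_k | x), as in Equation (7)."""
    scores = {}
    for label, (prior, feats) in classes.items():
        like = 1.0
        for xi, (m, s) in zip(x, feats):
            like *= gaussian_pdf(xi, m, s)  # 1-D likelihood per variable
        scores[label] = prior * like
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()}

# Example: two UOM classes described by (mean, std) of one pressure feature.
classes = {"kPa absolute": (0.5, [(100.0, 5.0)]),
           "bar absolute": (0.5, [(1.0, 0.05)])}
post = nb_posteriors([98.0], classes)
```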

#### 2.5. Benefits and Drawbacks of Naïve Bayes

#### 2.6. Basics of K-Nearest Neighbors

where x_i and y_i are the coordinates of each unlabeled and labeled data point in the f-dimensional space, respectively. If z is set equal to 1, d is named Manhattan distance, whereas Equation (9) calculates the Euclidean distance if z is equal to 2.
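The Minkowski distance of Equation (9) can be sketched in a few lines; z = 1 yields the Manhattan distance and z = 2 the Euclidean distance.

```python
# Minkowski distance in the f-dimensional feature space (Equation (9)).
def minkowski(x, y, z):
    return sum(abs(xi - yi) ** z for xi, yi in zip(x, y)) ** (1.0 / z)

d1 = minkowski([0.0, 0.0], [3.0, 4.0], 1)  # Manhattan distance (z = 1)
d2 = minkowski([0.0, 0.0], [3.0, 4.0], 2)  # Euclidean distance (z = 2)
```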

#### 2.7. Benefits and Drawbacks of K-Nearest Neighbors

Since N_tr is the number of available labeled data, Cheng et al. [22] suggested setting K equal to N_tr^0.5 if the dataset is larger than 100 samples. However, the same authors also stated that this rule of thumb has not been proved suitable for all datasets. Thus, the optimal setting of the K value is still an ongoing research topic.
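The rule of thumb of Cheng et al. [22] amounts to a one-line heuristic; the function name and the rounding choice below are illustrative assumptions.

```python
# K = N_tr ** 0.5 (rounded), proposed for datasets larger than 100 samples;
# the paper also tests K = 1 and K = (N_tr / c) ** 0.5.
import math

def k_rule_of_thumb(n_tr: int) -> int:
    if n_tr <= 100:
        raise ValueError("heuristic proposed for datasets larger than 100 samples")
    return round(math.sqrt(n_tr))

k = k_rule_of_thumb(10_000)
```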

## 3. Training and Testing

## 4. Field Data

## 5. Analysis of Classification Performance

**Data quality.** In the current paper, the influence of data quality is evaluated by means of different analyses. First, each classifier is trained by means of correctly labeled data only, i.e., filtered data (Figure 1). Then, each classifier is trained by means of non-filtered data (Figure 1), in which all data acquired from Site #11 (approximately 13% of the dataset) are experimentally affected by the label noise issue. The effect of UMIs on classification capability is further investigated by means of a sensitivity analysis in which the rate of UMIs is progressively increased, as outlined in Table 3. At each step, the rate of UMIs is increased by roughly 10%, so that the label noise experimentally affects all data acquired from Site #11, as well as all data collected from additional sites, in which noisy labels were implanted. As at Site #11, implanted UMIs were labeled as kPa absolute, though bar absolute was the correct label. These analyses are aimed at assessing the robustness of each classifier as data quality varies.
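To make the injection procedure concrete, the following sketch (hypothetical values and function name, not the authors' code) emulates an implanted UMI: a reading whose true unit is bar absolute keeps the wrong kPa absolute label, so its numerical value is off by the kPa-to-bar scale factor of 100.

```python
# Emulate label-noise injection: convert a fraction of readings to bar while
# keeping the kPa absolute label, so the stored value no longer matches the unit.
KPA_PER_BAR = 100.0  # 1 bar = 100 kPa (exact)

def implant_umi(values_kpa, fraction):
    """Rescale the first `fraction` of the readings to bar (labels unchanged).
    Returns the corrupted list and the number of noisy samples."""
    n_noisy = int(len(values_kpa) * fraction)
    noisy = [v / KPA_PER_BAR for v in values_kpa[:n_noisy]]  # value now in bar
    return noisy + list(values_kpa[n_noisy:]), n_noisy

data, n = implant_umi([500.0, 510.0, 495.0, 505.0], 0.5)
```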

**Data quantity.** The amount of data used for training a classifier usually affects its classification capability. In this paper, the influence of data quantity is assessed by means of two analyses.

**Number of classes.** The classification capability of the classifiers is tested by means of twelve classes. The selected UOMs were identified by means of engineering practice. In fact, since all field data were originally labeled as kPa absolute, five additional UOMs, i.e., mmH2O, mbar, inH2O, psi, and bar, are accounted for in absolute terms. As a result, six absolute UOMs in total are considered, whose scale factors with respect to kPa absolute are highlighted in Figure 3 (full circles). Such scale factors are independent of the dataset under analysis.

The corresponding gauge UOMs complete the set of twelve classes, some of which are challenging to distinguish: for instance, the scale factor between kPa absolute and inH2O gauge is equal to just 1.2. In addition, regardless of the number of classes, when non-filtered data are used, each classifier is also trained by means of incorrectly labeled data.
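The scale factors of the five additional absolute UOMs with respect to kPa are standard unit-conversion constants (cf. the full circles in Figure 3); the dictionary and function name below are illustrative.

```python
# Standard conversion factors from kPa to the five additional absolute UOMs.
SCALE_FROM_KPA = {
    "mmH2O": 101.97,    # 1 kPa ≈ 101.97 mmH2O
    "mbar": 10.0,       # 1 kPa = 10 mbar (exact)
    "inH2O": 4.0146,    # 1 kPa ≈ 4.0146 inH2O
    "psi": 0.145038,    # 1 kPa ≈ 0.145038 psi
    "bar": 0.01,        # 1 kPa = 0.01 bar (exact)
}

def convert_from_kpa(value_kpa, uom):
    """Convert a pressure reading from kPa to the requested UOM."""
    return value_kpa * SCALE_FROM_KPA[uom]

p_bar = convert_from_kpa(250.0, "bar")
```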

## 6. Indices of Classification Performance

**Classification accuracy.** In the field of ML, the confusion matrix is usually exploited to evaluate the effectiveness of supervised classifiers. Based on the classifier predictions, four metrics can be calculated, i.e., the rate of true positives (TPs), false positives (FPs), true negatives (TNs), and false negatives (FNs), which can in turn be used to calculate classification accuracy, precision, recall, and specificity.
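These four metrics combine into the standard indices as follows (a one-vs-all view of a single class; the counts below are hypothetical):

```python
# Standard classification indices derived from confusion-matrix counts.
def classification_metrics(tp, fp, tn, fn):
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),        # also called true positive rate
        "specificity": tn / (tn + fp),   # also called true negative rate
    }

m = classification_metrics(tp=90, fp=10, tn=85, fn=15)
```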

**Posterior probability.** The posterior probability represents the confidence that a given data point belongs to a given class. Considering all classes, for each data point the sum of the c posterior probabilities is equal to 100%.

A score matrix **P** is calculated for each pair of classes; each element p_ij (p_ij ∈ [0,1]) is the confidence that a given data point belongs to class i. Consequently, p_ji = 1 − p_ij. The comprehensive procedure to calculate the score matrix **P** is described in Platt [41].
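The pairwise structure of the score matrix can be sketched as follows (the pairwise confidences are hypothetical inputs; how they are obtained from the classifier is the Platt [41] procedure, not reproduced here):

```python
# Build the pairwise score matrix P from per-pair confidences p_ij,
# exploiting the symmetry p_ji = 1 - p_ij.
import numpy as np

def score_matrix(upper):
    """upper: {(i, j): p_ij} for i < j (hypothetical pairwise confidences).
    The diagonal is left at the neutral value 0.5."""
    c = max(max(pair) for pair in upper) + 1
    P = np.full((c, c), 0.5)
    for (i, j), p in upper.items():
        P[i, j] = p
        P[j, i] = 1.0 - p  # complementary confidence for the other class
    return P

P = score_matrix({(0, 1): 0.9, (0, 2): 0.7, (1, 2): 0.4})
```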

**Receiver Operating Characteristic curve.** The capability of each ML classifier is also assessed by means of the receiver operating characteristic (ROC) curve, which plots the true positive rate (TPR) against the false positive rate (FPR). Thus, the ROC curve shows the trade-off between detecting true positives and incurring false positives. For an accurate ML classifier, the ROC curve should climb steeply [42].

**Area Under the Curve.** The area under the curve (AUC) is the area under the ROC curve, which ranges from 0 to 1. The higher the AUC value, the better the performance of the ML classifier [26].
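Under the definitions above, a ROC curve and its AUC can be sketched with a simple threshold sweep (hypothetical scores; ties between scores are not treated separately in this sketch):

```python
# ROC curve via a descending threshold sweep, and AUC by the trapezoidal rule.
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs, one per threshold at each score."""
    pairs = sorted(zip(scores, labels), reverse=True)
    P = sum(labels)
    N = len(labels) - P
    tp = fp = 0
    pts = [(0.0, 0.0)]
    for _, label in pairs:
        if label:
            tp += 1
        else:
            fp += 1
        pts.append((fp / N, tp / P))
    return pts

def auc(points):
    """Trapezoidal integration of the ROC curve."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

pts = roc_points([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])  # perfectly separated scores
a = auc(pts)
```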

## 7. Results and Discussion

For the K-NN classifier, since N_tr is the number of training data (Figure 1), three analyses were carried out, in which: (i) K = 1, (ii) K = (N_tr/c)^0.5, and (iii) K = N_tr^0.5. In these analyses, the ratio N_tr/c corresponds to the number of “Raw data for training” (Figure 1). Among the three K value settings, the best results were obtained when K = N_tr^0.5, especially when non-filtered data were accounted for. This outcome confirms the analysis reported in Cheng et al. [22], which suggested setting K = N_tr^0.5.

#### 7.1. Training with Filtered Data

The inH2O gauge label slightly challenges the classifiers. In fact, for SVM RO and NB a limited fraction of data (between 3% and 9%) is classified as inH2O gauge; this rate increases up to 23% for the K-NN classifier.

Indeed, the scale factor between kPa absolute and inH2O gauge is only equal to 1.2, while the scale factor between bar absolute and bar gauge is roughly equal to 5. As a result, the assignment of the correct UOM is more challenging for Sites #1 through #10 than for Site #11.

#### 7.2. Training with Non-Filtered Data

The inH2O gauge label is the second-most probable label because of the challenging scale factor.

The mmH2O absolute data of Site #11 slightly overlap the kPa absolute data acquired from the other sites. Thus, the area that identifies the mmH2O absolute label is shifted toward the kPa absolute label. As a consequence, the posterior probability of mmH2O absolute and kPa absolute is higher than that of the other UOMs. However, this outcome cannot be clearly grasped from Figure 6 because few outliers are included within the dataset.

For Sites #1 through #10, the posterior probability of mmH2O absolute, mbar absolute, inH2O gauge, and kPa gauge is higher than that of the other UOMs. Similarly, for Site #11, the posterior probability of all UOMs is null, with the exception of bar gauge and psi gauge.

The highest posterior probabilities correspond to mmH2O absolute, inH2O gauge, mbar gauge, and kPa absolute (the true UOM). This outcome can be explained by considering that wrongly assumed mmH2O absolute data approximately overlap data that are correctly labeled as kPa absolute. In addition, the inH2O gauge and mbar gauge labels are the classes that are nearest and second-nearest to the true UOM (see Figure 3). A similar result is confirmed for Site #11, in which the kPa absolute label is the only incorrect UOM whose posterior probability is not null. In fact, wrongly assumed kPa absolute data numerically overlap data correctly labeled as bar absolute.

#### 7.3. ROC Curve and AUC

#### 7.4. Sensitivity Analysis

#### 7.5. Discussion and Guidelines

The suitability of each classifier is rated as very positive (**✓✓**), positive (**✓**), or not acceptable (**✕**).

## 8. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Nomenclature

Symbol/Acronym | Meaning
---|---
A | accuracy
b | intercept of the SVM decision hyperplane
C | class
c | number of classes
d | Minkowski distance
f | number of features
g | function
K | number of neighbors (K-Nearest Neighbors)
k | non-linear kernel function (Support Vector Machine)
l | label
N | number
P | posterior probability
x | unlabeled data
y | labeled data
z | exponent of the Minkowski distance (K-Nearest Neighbors)
α | Lagrangian multiplier
γ | distance between data and the SVM decision hyperplane
σ | standard deviation
Φ | mapping function
tr | training (subscript)
AUC | Area Under the Curve
FN | False Negative
FP | False Positive
FPR | False Positive Rate
GT | Gas Turbine
ML | Machine Learning
K-NN | K-Nearest Neighbors
NB | Naïve Bayes
OVA | One-vs-All
OVO | One-vs-One
RBF | Radial Basis Function
RO | Radial Basis Function with OVO decomposition
ROC | Receiver Operating Characteristic
SVM | Support Vector Machine
TN | True Negative
TP | True Positive
TPR | True Positive Rate
UMI | Unit of Measure Inconsistency
UOM | Unit Of Measure

## References

1. Frénay, B.; Verleysen, M. Classification in the Presence of Label Noise: A Survey. IEEE Trans. Neural Netw. Learn. Syst. **2013**, 25, 845–869. [Google Scholar] [CrossRef]
2. Cappozzo, A.; Greselin, F.; Murphy, T.B. Anomaly and Novelty detection for robust semi-supervised learning. Stat. Comput. **2020**, 30, 1545–1571. [Google Scholar]
3. Guan, D.; Chen, K.; Han, G.; Huang, S.; Yuan, W.; Guizani, M.; Shu, L. A Novel Class Noise Detection Method for High-Dimensional Data in Industrial Informatics. IEEE Trans. Ind. Inform. **2021**, 17, 2181–2190. [Google Scholar] [CrossRef]
4. Manservigi, L.; Murray, D.; de la Iglesia, J.A.; Ceschini, G.F.; Bechini, G.; Losi, E.; Venturini, M. Detection of Unit of Measure Inconsistency by means of a Machine Learning Model. In Proceedings of the ASME Turbo Expo 2020, London, UK, 22–26 June 2020. GT2020-16094. [Google Scholar]
5. Feng, W.; Quan, Y.; Dauphin, G. Label Noise Cleaning with an Adaptive Ensemble Method Based on Noise Detection Metric. Sensors **2020**, 20, 6718. [Google Scholar] [CrossRef]
6. Pu, X.; Li, C. Probabilistic Information-Theoretic Discriminant Analysis for Industrial Label-Noise Fault Diagnosis. IEEE Trans. Ind. Inform. **2021**, 17, 2664–2674. [Google Scholar] [CrossRef]
7. Liu, J.; Song, C.; Zhao, J.; Ji, P. Manifold-Preserving Sparse Graph-Based Ensemble FDA for Industrial Label-Noise Fault Classification. IEEE Trans. Instrum. Meas. **2020**, 69, 2621–2634. [Google Scholar] [CrossRef]
8. Zhang, K.; Tang, B.; Deng, L.; Tan, Q.; Yu, H. A fault diagnosis method for wind turbines gearbox based on adaptive loss weighted meta-ResNet under noisy labels. Mech. Syst. Signal Process. **2021**, 161, 107963. [Google Scholar] [CrossRef]
9. Wang, Y.; Liu, N.; Guo, H.; Wang, X. An engine-fault-diagnosis system based on sound intensity analysis and wavelet packet pre-processing neural network. Eng. Appl. Artif. Intell. **2020**, 94, 103765. [Google Scholar] [CrossRef]
10. Chen, Q.; Wei, H.; Rashid, M.; Cai, Z. Kernel extreme learning machine based hierarchical machine learning for multi-type and concurrent fault diagnosis. Measurement **2021**, 184, 109923. [Google Scholar] [CrossRef]
11. Si, L.; Wang, Z.; Liu, X.; Tan, C. A sensing identification method for shearer cutting state based on modified multi-scale fuzzy entropy and support vector machine. Eng. Appl. Artif. Intell. **2019**, 78, 86–101. [Google Scholar] [CrossRef]
12. Manservigi, L.; Murray, D.; de la Iglesia, J.A.; Ceschini, G.F.; Bechini, G.; Losi, E.; Venturini, M. Detection of Unit of Measure Inconsistency in gas turbine sensors by means of Support Vector Machine classifier. ISA Trans. **2021**. [Google Scholar] [CrossRef]
13. Kim, T.-W.; Oh, J.; Min, C.; Hwang, S.-Y.; Kim, M.-S.; Lee, J.-H. An Experimental Study on Condition Diagnosis for Thrust Bearings in Oscillating Water Column Type Wave Power Systems. Sensors **2021**, 21, 457. [Google Scholar] [CrossRef]
14. Aralikatti, S.S.; Ravikumar, K.N.; Kumar, H.; Nayaka, H.S.; Sugumaran, V. Comparative Study on Tool Fault Diagnosis Methods Using Vibration Signals and Cutting Force Signals by Machine Learning Technique. Struct. Durab. Health Monit. **2020**, 14, 127–145. [Google Scholar] [CrossRef]
15. Niazi, K.A.K.; Akhtar, W.; Khan, H.A.; Yang, Y.; Athar, S. Hotspot diagnosis for solar photovoltaic modules using a Naive Bayes classifier. Sol. Energy **2019**, 190, 34–43. [Google Scholar] [CrossRef]
16. Da Silva, P.R.N.; Gabbar, H.A.; Junior, P.V.; Junior, C.T.D.C. A new methodology for multiple incipient fault diagnosis in transmission lines using QTA and Naïve Bayes classifier. Int. J. Electr. Power Energy Syst. **2018**, 103, 326–346. [Google Scholar] [CrossRef]
17. Aker, E.; Othman, M.L.; Veerasamy, V.; Aris, I.B.; Wahab, N.I.A.; Hizam, H. Fault Detection and Classification of Shunt Compensated Transmission Line Using Discrete Wavelet Transform and Naive Bayes Classifier. Energies **2020**, 13, 243. [Google Scholar] [CrossRef] [Green Version]
18. Shi, M.; Zhao, R.; Wu, Y.; He, T. Fault diagnosis of rotor based on Local-Global Balanced Orthogonal Discriminant Projection. Measurement **2020**, 168, 108320. [Google Scholar] [CrossRef]
19. Aslinezhad, M.; Hejazi, M.A. Turbine blade tip clearance determination using microwave measurement and k-nearest neighbour classifier. Measurement **2020**, 151, 107142. [Google Scholar] [CrossRef]
20. Kužnar, D.; Možina, M.; Giordanino, M.; Bratko, I. Improving vehicle aeroacoustics using machine learning. Eng. Appl. Artif. Intell. **2012**, 25, 1053–1061. [Google Scholar] [CrossRef]
21. Bhavani, D.; Vasavi, A.; Keshava, P.T. Machine Learning: A Critical Review of Classification Techniques. IJARCCE **2016**, 2319–5940. [Google Scholar] [CrossRef]
22. Cheng, D.; Zhang, S.; Deng, Z.; Zhu, Y.; Zong, M. kNN algorithm with data-driven k value. In Advanced Data Mining and Applications; ADMA 2014—Lecture Notes in Computer Science; Luo, X., Yu, J.X., Li, Z., Eds.; Springer: Cham, Switzerland, 2014; Volume 8933. [Google Scholar] [CrossRef]
23. Kumar, Y.; Sahoo, G. Analysis of Parametric & Non Parametric Classifiers for Classification Technique using WEKA. Int. J. Inf. Technol. Comput. Sci. **2012**, 4, 43–49. [Google Scholar] [CrossRef] [Green Version]
24. Sinha, V.K.; Patro, K.K.; Pławiak, P.; Prakash, A.J. Smartphone-Based Human Sitting Behaviors Recognition Using Inertial Sensor. Sensors **2021**, 21, 6652. [Google Scholar] [CrossRef] [PubMed]
25. Scholkopf, B.; Smola, A.J. Learning with Kernels, 7th ed.; MIT Press: Cambridge, MA, USA, 2002. [Google Scholar]
26. Doan, Q.H.; Le, T.; Thai, D.-K. Optimization strategies of neural networks for impact damage classification of RC panels in a small dataset. Appl. Soft Comput. **2021**, 102, 107100. [Google Scholar] [CrossRef]
27. Bedi, P.; Mewada, S.; Vatti, R.A.; Singh, C.; Dhindsa, K.S.; Ponnusamy, M.; Sikarwar, R. Detection of attacks in IoT sensors networks using machine learning algorithm. Microprocess. Microsyst. **2021**, 82, 103814. [Google Scholar] [CrossRef]
28. Saidi, L.; Ali, J.B.; Fnaiech, F. Application of higher order spectral features and support vector machines for bearing faults classification. ISA Trans. **2015**, 54, 193–206. [Google Scholar] [CrossRef] [PubMed]
29. Mountrakis, G.; Im, J.; Ogole, C. Support vector machines in remote sensing: A review. ISPRS J. Photogramm. Remote Sens. **2011**, 66, 247–259. [Google Scholar] [CrossRef]
30. Safavi, S.; Safavi, M.A.; Hamid, H.; Fallah, S. Multi-Sensor Fault Detection, Identification, Isolation and Health Forecasting for Autonomous Vehicles. Sensors **2021**, 21, 2547. [Google Scholar] [CrossRef]
31. Krawczyk, B.; Galar, M.; Woźniak, M.; Bustince, H.; Herrera, F. Dynamic ensemble selection for multi-class classification with one-class classifiers. Pattern Recognit. **2018**, 83, 34–51. [Google Scholar] [CrossRef]
32. Petschke, D.; Staab, T.E. A supervised machine learning approach using naive Gaussian Bayes classification for shape-sensitive detector pulse discrimination in positron annihilation lifetime spectroscopy (PALS). Nucl. Instrum. Methods Phys. Res. Sect. A Accel. Spectrom. Detect. Assoc. Equip. **2019**, 947, 162742. [Google Scholar] [CrossRef]
33. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning, 2nd ed.; Springer: New York, NY, USA, 2008. [Google Scholar]
34. Sanchez, J.S.; Pla, F.; Ferri, F.J. On the use of neighbourhood-based non-parametric classifiers. Pattern Recognit. Lett. **1997**, 18, 1179–1186. [Google Scholar]
35. Peng, K.; Tang, Z.; Dong, L.; Sun, D. Machine Learning Based Identification of Microseismic Signals Using Characteristic Parameters. Sensors **2021**, 21, 6967. [Google Scholar] [CrossRef] [PubMed]
36. Zhang, Z.; Han, H.; Cui, X.; Fan, Y. Novel application of multi-model ensemble learning for fault diagnosis in refrigeration systems. Appl. Therm. Eng. **2020**, 164, 114516. [Google Scholar] [CrossRef]
37. Ho, T.K. The random subspace method for constructing decision forests. IEEE Trans. Pattern Anal. Mach. Intell. **1998**, 20, 832–844. [Google Scholar] [CrossRef] [Green Version]
38. Tran, T.M.; Thi Le, X.M.; Nguyen, H.T.; Huynh, V.N. A novel non-parametric method for time series classification based on k-Nearest Neighbors and Dynamic Time Warping Barycenter Averaging. Eng. Appl. Artif. Intell. **2019**, 78, 173–185. [Google Scholar] [CrossRef]
39. Manservigi, L.; Venturini, M.; Ceschini, G.F.; Bechini, G.; Losi, E. Development and Validation of a General and Robust Methodology for the Detection and Classification of Gas Turbine Sensor Faults. J. Eng. Gas Turbines Power **2020**, 142, 1071961. [Google Scholar] [CrossRef]
40. Yu, Y.; Yao, H.; Liu, Y. Structural dynamics simulation using a novel physics-guided machine learning method. Eng. Appl. Artif. Intell. **2020**, 96, 103947. [Google Scholar] [CrossRef]
41. Platt, J. Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In Advances in Large Margin Classifiers; The MIT Press: Cambridge, MA, USA, 2000. [Google Scholar]
42. Subasi, O.; Di, S.; Bautista-Gomez, L.; Balaprakash, P.; Unsal, O.; Labarta, J.; Cristal, A.; Krishnamoorthy, S.; Cappello, F. Exploring the capabilities of support vector machines in detecting silent data corruptions. Sustain. Comput. Inform. Syst. **2018**, 19, 277–290. [Google Scholar] [CrossRef] [Green Version]
43. Manservigi, L. Detection and Classification of Faults and Anomalies in Gas Turbine Sensors by Means of Statistical Filters and Machine Learning Models. Ph.D. Thesis, Università degli Studi di Ferrara, Ferrara, Italy, 2021. [Google Scholar]
44. Bhattacharya, G.; Ghosh, K.; Chowdhury, A.S. kNN classification with an outlier informative distance measure. In Pattern Recognition and Machine Intelligence; PReMI 2017—Lecture Notes in Computer Science; Springer: Berlin, Germany, 2017. [Google Scholar]

**Figure 1.** Flowchart of the procedure for identifying the data used for training and testing the classifiers.

**Figure 4.** Posterior probability for twelve UOMs: training with 10% of filtered data (full bar: absolute UOMs, dotted bar: gauge UOMs; kPa absolute sites (**a**), bar absolute site (**b**)).

**Figure 5.** Posterior probability for twelve UOMs: site cross-validation by means of filtered data (full bar: absolute UOMs, dotted bar: gauge UOMs; kPa absolute sites (**a**), bar absolute site (**b**)).

**Figure 6.** Posterior probability for twelve UOMs: training with 10% of non-filtered data (full bar: absolute UOMs, dotted bar: gauge UOMs; kPa absolute sites (**a**), bar absolute site (**b**)).

**Figure 7.** Posterior probability for twelve UOMs: site cross-validation by means of non-filtered data (full bar: absolute UOMs, dotted bar: gauge UOMs; kPa absolute sites (**a**), bar absolute site (**b**)).

**Figure 8.** ROC curve for the SVM RO (**a**), NB (**b**), and K-NN (**c**) classifiers: training with 10% of non-filtered data and twelve UOMs (continuous line: kPa absolute sites, dashed line: bar absolute site).

Site | Number of GTs | Share | True UOM
---|---|---|---
1 | 10 | 21.1% | kPa absolute
2 | 1 | 6.1% | kPa absolute
3 | 1 | 2.7% | kPa absolute
4 | 2 | 12.1% | kPa absolute
5 | 1 | 6.1% | kPa absolute
6 | 1 | 0.7% | kPa absolute
7 | 1 | 2.8% | kPa absolute
8 | 7 | 24.2% | kPa absolute
9 | 2 | 7.9% | kPa absolute
10 | 1 | 2.9% | kPa absolute
11 | 3 | 13.4% | bar absolute
TOTAL | 30 | 100.0% |

Site | Feature #1 (Mean Value): Skewness | Feature #1 (Mean Value): Kurtosis | Feature #2 (Standard Deviation): Skewness | Feature #2 (Standard Deviation): Kurtosis
---|---|---|---|---
#1 | −0.11 | 2.60 | 1.32 | 3.78
#2 | 0.41 | 1.98 | 1.33 | 5.29
#3 | 1.19 | 3.67 | 3.81 | 21.55
#4 | −0.08 | 2.97 | 5.52 | 34.27
#5 | 1.06 | 3.35 | 1.88 | 7.06
#6 | 0.15 | 2.22 | −0.18 | 1.69
#7 | 0.61 | 1.88 | 3.55 | 16.81
#8 | −0.23 | 2.58 | 0.90 | 3.20
#9 | −0.29 | 2.21 | 1.74 | 5.87
#10 | 1.15 | 2.87 | 4.50 | 24.75
#11 | 1.18 | 3.14 | 6.81 | 65.48

Rate of UMIs | Sites Affected by UMIs
---|---
0% | /
13.4% | Site #11
20.2% | Site #2, Site #6, Site #11
30.8% | Site #2, Site #3, Site #6, Site #9, Site #11
39.8% | Site #2, Site #3, Site #5, Site #6, Site #9, Site #10, Site #11
51.9% | Site #2, Site #3, Site #4, Site #5, Site #6, Site #9, Site #10, Site #11
60.9% | Site #1, Site #2, Site #3, Site #5, Site #6, Site #9, Site #10, Site #11
73.0% | Site #1 through #6 and Site #9 through #11
97.2% | Site #1 through #6 and Site #8 through #11
100.0% | All sites

 | SVM RO | NB | K-NNs
---|---|---|---
kPa absolute sites | 0.992 | 0.993 | 0.990
bar absolute site | 1.000 | 1.000 | 0.992

Site | Data Quality | Training | SVM RO | NB | K-NNs
---|---|---|---|---|---
Site #1–#10 | Filtered | 10% of data | 83% | 90% | 62%
Site #1–#10 | Filtered | Site cross-validation | 81% | 85% | 77%
Site #1–#10 | Non-filtered | 10% of data | 64% | 79% | 54%
Site #1–#10 | Non-filtered | Site cross-validation | 65% | 76% | 63%
Site #11 | Filtered | 10% of data | 99% | 100% | 99%
Site #11 | Filtered | Site cross-validation | 75% | 100% | 100%
Site #11 | Non-filtered | 10% of data | 77% | 95% | 72%
Site #11 | Non-filtered | Site cross-validation | 65% | 100% | 100%

 | SVM RO | NB | K-NNs
---|---|---|---
Robustness in the presence of UMIs | ✓ | ✓✓ | ✕
Computational time | ✕ | ✓ | ✓✓

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Manservigi, L.; Venturini, M.; Losi, E.; Bechini, G.; Artal de la Iglesia, J.
Optimal Classifier to Detect Unit of Measure Inconsistency in Gas Turbine Sensors. *Machines* **2022**, *10*, 228.
https://doi.org/10.3390/machines10040228
