# Phase I Analysis of Nonlinear Profiles Using Anomaly Detection Techniques


## Abstract


## 1. Introduction

## 2. Related Work

## 3. Theoretical Background

#### 3.1. Anomaly Detection

#### 3.2. Local Outlier Factor

#### 3.3. Isolation Forest

#### 3.4. Elliptic Envelope

## 4. Methods

#### 4.1. Smoothing Profiles

#### 4.2. Determining the Contamination Rate

#### 4.3. Detecting Outliers

## 5. Results and Discussion

#### 5.1. Data

#### 5.2. Evaluation Metrics

#### 5.3. Results

## 6. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## References

1. Chicken, E.; Pignatiello, J.J., Jr.; Simpson, J.R. Statistical process monitoring of nonlinear profiles using wavelets. *J. Qual. Technol.* **2009**, *41*, 198–212.
2. Jensen, W.A.; Grimshaw, S.D.; Espen, B. Nonlinear profile monitoring for oven-temperature data. *J. Qual. Technol.* **2016**, *48*, 84–97.
3. Woodall, W.H.; Spitzner, D.J.; Montgomery, D.C.; Gupta, S. Using control charts to monitor process and product quality profiles. *J. Qual. Technol.* **2004**, *36*, 309–320.
4. Maleki, M.R.; Amiri, A.; Castagliola, P. An overview on recent profile monitoring papers (2008–2018) based on conceptual classification scheme. *Comput. Ind. Eng.* **2018**, *126*, 705–728.
5. Stover, F.S.; Brill, R.V. Statistical quality control applied to ion chromatography calibrations. *J. Chromatogr. A* **1998**, *804*, 37–43.
6. Kang, L.; Albin, S.L. On-line monitoring when the process yields a linear profile. *J. Qual. Technol.* **2000**, *32*, 418–426.
7. Kim, K.; Mahmoud, M.A.; Woodall, W.H. On the monitoring of linear profiles. *J. Qual. Technol.* **2003**, *35*, 317–328.
8. Mahmoud, M.A.; Parker, P.A.; Woodall, W.H.; Hawkins, D.M. A change point method for linear profile data. *Qual. Reliab. Eng. Int.* **2007**, *23*, 247–268.
9. Williams, J.D.; Woodall, W.H.; Birch, J.B. Statistical monitoring of nonlinear product and process quality profiles. *Qual. Reliab. Eng. Int.* **2007**, *23*, 925–941.
10. Gardner, M.M.; Lu, J.C.; Gyurcsik, R.S.; Wortman, J.J.; Hornung, B.E.; Heinisch, H.H.; Rying, E.A.; Rao, S.; Davis, J.C.; Mozumder, P.K. Equipment fault detection using spatial signatures. *IEEE Trans. Compon. Packag. Manuf. Technol. Part C* **1997**, *20*, 295–304.
11. Fan, J. Test of significance based on wavelet thresholding and Neyman's truncation. *J. Am. Stat. Assoc.* **1996**, *91*, 674–688.
12. Jin, J.; Shi, J. Automatic feature extraction of waveform signals for in-process diagnostic performance improvement. *J. Intell. Manuf.* **2001**, *12*, 257–268.
13. Jeong, M.K.; Lu, J.C.; Wang, N. Wavelet-based SPC procedure for complicated functional data. *Int. J. Prod. Res.* **2006**, *44*, 729–744.
14. Woodall, W.H. Controversies and contradictions in statistical process control. *J. Qual. Technol.* **2000**, *32*, 341–350.
15. Ding, Y.; Zeng, L.; Zhou, S. Phase I analysis for monitoring nonlinear profiles in manufacturing processes. *J. Qual. Technol.* **2006**, *38*, 199–216.
16. Hodge, V.; Austin, J. A survey of outlier detection methodologies. *Artif. Intell. Rev.* **2004**, *22*, 85–126.
17. Pattisahusiwa, A.; Purqon, A. Comparison of outliers and novelty detection to identify ionospheric TEC irregularities during geomagnetic storm and substorm. *J. Phys. Conf. Ser.* **2016**, *739*, 012015.
18. Miljković, D. Review of novelty detection methods. In Proceedings of the 33rd International Convention MIPRO, Opatija, Croatia, 29 July 2010; pp. 593–598.
19. Chalapathy, R.; Chawla, S. Deep learning for anomaly detection: A survey. *arXiv* **2019**, arXiv:1901.03407.
20. Pang, G.; Shen, C.; Cao, L.; Hengel, A.V.D. Deep learning for anomaly detection: A review. *ACM Comput. Surv.* **2021**, *54*, 1–38.
21. Ruff, L.; Kauffmann, J.R.; Vandermeulen, R.A.; Montavon, G.; Samek, W.; Kloft, M.; Müller, K.R. A unifying review of deep and shallow anomaly detection. *Proc. IEEE* **2021**, *109*, 756–795.
22. Thudumu, S.; Branch, P.; Jin, J.; Singh, J.J. A comprehensive survey of anomaly detection techniques for high dimensional big data. *J. Big Data* **2020**, *7*, 42.
23. Ebadi, M.; Chenouri, S.; Steiner, S.H. Phase I analysis of high-dimensional multivariate processes in the presence of outliers. *arXiv* **2021**, arXiv:2110.13689.
24. Blázquez-García, A.; Conde, A.; Mori, U.; Lozano, J.A. A review on outlier/anomaly detection in time series data. *ACM Comput. Surv.* **2021**, *54*, 1–33.
25. Chandola, V.; Banerjee, A.; Kumar, V. Anomaly detection: A survey. *ACM Comput. Surv.* **2009**, *41*, 1–58.
26. Pimentel, M.A.; Clifton, D.A.; Clifton, L.; Tarassenko, L. A review of novelty detection. *Signal Process.* **2014**, *99*, 215–249.
27. Al-amri, R.; Murugesan, R.K.; Man, M.; Abdulateef, A.F.; Al-Sharafi, M.A.; Alkahtani, A.A. A review of machine learning and deep learning techniques for anomaly detection in IoT data. *Appl. Sci.* **2021**, *11*, 5320.
28. Choi, K.; Yi, J.; Park, C.; Yoon, S. Deep learning for anomaly detection in time-series data: Review, analysis, and guidelines. *IEEE Access* **2021**, *9*, 120043–120065.
29. McLachlan, G.; Peel, D. *Finite Mixture Models*; John Wiley & Sons, Ltd.: New York, NY, USA, 2000.
30. Chen, Y.; Birch, B.; Woodall, W.H. Effect of Phase I estimation on Phase II control chart performance with profile data. *Qual. Reliab. Eng. Int.* **2016**, *32*, 79–87.
31. Chen, Y.; Birch, J.B.; Woodall, W.H. Cluster-based profile analysis in phase I. *J. Qual. Technol.* **2015**, *47*, 14–29.
32. Saremian, D.; Noorossana, R.; Raissi, S.; Soleimani, P. Robust cluster-based method for monitoring generalized linear profiles in phase I. *J. Ind. Eng. Int.* **2021**, *17*, 88–97.
33. Nie, B.; Liu, D.; Liu, X.; Ye, W. Phase I non-linear profiles monitoring using a modified Hausdorff distance algorithm and clustering analysis. *Int. J. Qual. Reliab. Manag.* **2021**, *38*, 536–550.
34. Mao, W.; Shi, H.; Wang, G.; Liang, X. Unsupervised deep multitask anomaly detection with robust alarm strategy for online evaluation of bearing early fault occurrence. *IEEE Trans. Instrum. Meas.* **2022**, *71*, 3520713.
35. Velasco-Gallego, C.; Lazakis, I. RADIS: A real-time anomaly detection intelligent system for fault diagnosis of marine machinery. *Expert Syst. Appl.* **2022**, *204*, 117634.
36. Du, W.; Guo, Z.; Li, C.; Gong, X.; Pu, Z. From anomaly detection to novel fault discrimination for wind turbine gearboxes with a sparse isolation encoding forest. *IEEE Trans. Instrum. Meas.* **2022**, *71*, 2512710.
37. Tian, Y.; Mirzabagheri, M.; Bamakan, S.M.H.; Wang, H.; Qu, Q. Ramp loss one-class support vector machine: A robust and effective approach to anomaly detection problems. *Neurocomputing* **2018**, *310*, 223–235.
38. Shieh, A.D.; Kamm, D.F. Ensembles of one class support vector machines. In *Multiple Classifier Systems*; Benediktsson, J.A., Kittler, J., Roli, F., Eds.; Springer: Berlin/Heidelberg, Germany, 2009; Volume 5519, pp. 181–190.
39. Fernández, A.; Bella, J.; Dorronsoro, J.R. Supervised outlier detection for classification and regression. *Neurocomputing* **2022**, *486*, 77–92.
40. Roig, M.; Catalan, M.; Gastón, B. Ensembled outlier detection using multi-variable correlation in WSN through unsupervised learning techniques. In Proceedings of the 4th International Conference on Internet of Things, Big Data and Security (IoTBDS), Heraklion, Crete, Greece, 2–4 May 2019; pp. 38–48.
41. Cheng, Z.; Zou, C.; Dong, J. Outlier detection using isolation forest and local outlier factor. In Proceedings of the International Conference on Research in Adaptive and Convergent Systems, Chongqing, China, 24–27 September 2019; pp. 161–168.
42. Dentamaro, V.; Convertini, V.N.; Galantucci, S.; Giglio, P.; Palmisano, T.; Pirlo, G. Ensemble consensus: An unsupervised algorithm for anomaly detection in network security data. In Proceedings of the Italian Conference on Cybersecurity (ITASEC), Virtual, 7–9 April 2021; pp. 309–318.
43. Schölkopf, B.; Platt, J.C.; Shawe-Taylor, J.C.; Smola, A.J.; Williamson, R.C. Estimating the support of a high-dimensional distribution. *Neural Comput.* **2001**, *13*, 1443–1471.
44. Tax, D.M.J.; Duin, R.P.W. Support vector data description. *Mach. Learn.* **2004**, *54*, 45–66.
45. Ruff, L.; Vandermeulen, R.A.; Görnitz, N.; Deecke, L.; Siddiqui, S.A.; Binder, A.; Müller, E.; Kloft, M. Deep one-class classification. In Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; pp. 4393–4402.
46. Breunig, M.M.; Kriegel, H.P.; Ng, R.T.; Sander, J. LOF: Identifying density-based local outliers. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, Dallas, TX, USA, 15–18 May 2000; pp. 93–104.
47. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. *J. Mach. Learn. Res.* **2011**, *12*, 2825–2830.
48. Xu, Z.; Kakde, D.; Chaudhuri, A. Automatic hyperparameter tuning method for local outlier factor. In Proceedings of the 2019 IEEE International Conference on Big Data, Los Angeles, CA, USA, 9–12 December 2019; pp. 4201–4207.
49. Rousseeuw, P.J. Least median of squares regression. *J. Am. Stat. Assoc.* **1984**, *79*, 871–880.
50. Rousseeuw, P.J.; Driessen, K.V. A fast algorithm for the minimum covariance determinant estimator. *Technometrics* **1999**, *41*, 212–223.
51. Virtanen, P.; Gommers, R.; Oliphant, T.E.; Haberland, M.; Reddy, T.; Cournapeau, D.; Burovski, E.; Peterson, P.; Weckesser, W.; Bright, J.; et al. SciPy 1.0: Fundamental algorithms for scientific computing in Python. *Nat. Methods* **2020**, *17*, 261–272.
52. Wang, K.; Jiang, W. High-dimensional process monitoring and fault isolation via variable selection. *J. Qual. Technol.* **2009**, *41*, 247–258.
53. Zou, C.; Qiu, P. Multivariate statistical process control using LASSO. *J. Am. Stat. Assoc.* **2009**, *104*, 1586–1596.
54. Zhang, H.; Albin, S. Detecting outliers in complex profiles using a $\chi^2$ control chart method. *IIE Trans.* **2009**, *41*, 335–345.
55. Zou, C.; Tseng, S.T.; Wang, Z. Outlier detection in general profiles using penalized regression method. *IIE Trans.* **2014**, *46*, 106–117.
56. Taha, A.A.; Hanbury, A. Metrics for evaluating 3D medical image segmentation: Analysis, selection, and tool. *BMC Med. Imaging* **2015**, *15*, 29.
57. Tharwat, A. Classification assessment methods. *Appl. Comput. Inform.* **2020**, *17*, 168–192.

**Figure 1.** Oven temperature profile studied by Jensen et al. [2]. The blue line represents the in-control (IC) profile and the red line represents the out-of-control (OOC) profile.

**Figure 2.** Comparison between statistical process control (SPC) and anomaly detection. UCL, CL, and LCL represent the upper control limit, center line, and lower control limit, respectively. UCL*, CL*, and LCL* denote the modified upper control limit, center line, and lower control limit, respectively.
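The contrast in Figure 2 can be made concrete with the three detectors compared in this paper, using the scikit-learn implementations the authors cite. This is only an illustrative sketch: the two-dimensional features and the shifted cluster below are assumptions standing in for real profile data, not the paper's simulation model.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
# 45 in-control observations plus 5 clearly shifted ones (hypothetical
# 2-D features standing in for profile summaries).
X = np.vstack([
    rng.normal(0.0, 1.0, size=(45, 2)),
    rng.normal(6.0, 1.0, size=(5, 2)),
])

contamination = 0.1  # assumed estimated beforehand, as in Section 4.2
detectors = {
    "LOF": LocalOutlierFactor(n_neighbors=20, contamination=contamination),
    "IF": IsolationForest(contamination=contamination, random_state=0),
    "EE": EllipticEnvelope(contamination=contamination, random_state=0),
}

flagged = {}
for name, det in detectors.items():
    labels = det.fit_predict(X)  # +1 = in-control, -1 = flagged outlier
    flagged[name] = set(np.flatnonzero(labels == -1))
print(flagged)
```

With a well-separated shift like this, all three detectors flag exactly the five contaminated observations; the interesting cases studied in the paper arise when the shift (coefficient $a$) is small.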

**Figure 3.** Reachability distance (RD) when $k=3$. Points ${p}_{1}$–${p}_{4}$ have the same RD, and ${p}_{5}$ is not a $k$-nearest neighbor.
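The reachability distance underlying LOF is defined (following Breunig et al.) as $\text{reach-dist}_k(p, o) = \max\{k\text{-distance}(o), d(p, o)\}$. A minimal NumPy sketch, with a hypothetical point layout chosen to mirror Figure 3:

```python
import numpy as np

# Points arranged as in Figure 3: o at the origin, p1-p4 at distance 1
# (within o's k-distance for k = 3), and p5 farther away.
X = np.array([
    [0.0, 0.0],   # o
    [1.0, 0.0],   # p1
    [0.0, 1.0],   # p2
    [-1.0, 0.0],  # p3
    [0.0, -1.0],  # p4
    [4.0, 0.0],   # p5
])

def k_distance(X, i, k):
    """Distance from point i to its k-th nearest neighbor (itself excluded)."""
    d = np.linalg.norm(X - X[i], axis=1)
    return np.sort(np.delete(d, i))[k - 1]

def reach_dist(X, p, o, k):
    """reach-dist_k(p, o) = max(k-distance(o), d(p, o))."""
    return max(k_distance(X, o, k), np.linalg.norm(X[p] - X[o]))

rds = [reach_dist(X, p, 0, k=3) for p in range(1, 6)]
print(rds)  # -> [1.0, 1.0, 1.0, 1.0, 4.0]
```

As the figure indicates, the four points inside $o$'s $k$-neighborhood share the same reachability distance (the $k$-distance of $o$), while $p_5$ keeps its actual distance.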

**Figure 4.** Concept of an isolation tree: (**a**) data points and isolation operations; (**b**) binary tree and the isolated point.
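Figure 4's idea, that anomalies are separated from the rest of the data after only a few random splits and therefore sit at shallow depths in isolation trees, can be demonstrated with scikit-learn's `IsolationForest`. The data here are hypothetical; lower `score_samples` values correspond to shorter average path lengths.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)
# 99 in-control points plus one point that random splits isolate quickly.
X = np.vstack([rng.normal(0.0, 1.0, size=(99, 2)), [[8.0, 8.0]]])

forest = IsolationForest(n_estimators=200, contamination=0.01, random_state=1)
labels = forest.fit_predict(X)    # -1 marks the isolated (anomalous) point
scores = forest.score_samples(X)  # lower score = shorter average path length

print(labels[-1], int(scores.argmin()))
```

The easily isolated point at (8, 8) receives the lowest score in the forest and is the one observation flagged at a 1% contamination rate.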

**Figure 6.** Nonlinear profiles examined in this study: (**a**) nonlinear profiles without noise; (**b**) nonlinear profiles with noise added.

**Figure 7.** Original profiles and smoothed profiles. Subplots (**a**–**d**) show example profiles with coefficient $a$ equal to 0.5, 0.7, 1.1, and 1.5, respectively; the blue lines denote the original profiles and the red lines the smoothed profiles.
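Section 4.1 smooths each noisy profile before outlier detection, as illustrated in Figure 7. The excerpt does not specify the smoother, so the sketch below uses SciPy's Savitzky–Golay filter as one plausible stand-in; the sinusoidal profile shape and noise level are assumptions, not the paper's generating model.

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 100)
clean = np.sin(2 * np.pi * x)                        # assumed noise-free profile shape
profile = clean + rng.normal(0.0, 0.2, size=x.size)  # observed noisy profile

# Savitzky-Golay smoothing: local cubic polynomial fits over a 15-point window.
smoothed = savgol_filter(profile, window_length=15, polyorder=3)

print(np.abs(profile - clean).mean(), np.abs(smoothed - clean).mean())
```

The smoothed profile tracks the underlying curve more closely than the raw one, which is the point of the preprocessing step: detectors then compare profile shapes rather than noise.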

**Table 1.** Type I error of each considered method under different estimated contamination rates and different magnitudes of coefficient $a$.

| Method | $m_0$ | $a=0.7$ | $a=0.9$ | $a=1.1$ | $a=1.3$ | $a=1.5$ |
|---|---|---|---|---|---|---|
| MHD-C | 20 | 0.142 (0.083) | 0.005 (0.006) | 0 (0.003) | 0 (0.002) | 0 (0.001) |
| PPOD | | 0.061 (0.020) | 0.067 (0.022) | 0.066 (0.023) | 0.067 (0.022) | 0.067 (0.022) |
| LOF | | 0.054 (0.002) | 0.054 (0.003) | 0.054 (0.002) | 0.054 (0.003) | 0.053 (0.004) |
| EE | | 0.056 (0) | 0.056 (0) | 0.056 (0) | 0.056 (0) | 0.056 (0) |
| IF | | 0.072 (0.007) | 0.056 (0) | 0.056 (0) | 0.056 (0) | 0.056 (0) |
| MHD-C | 40 | 0.057 (0.029) | 0.005 (0.006) | 0 (0.002) | 0 (0.001) | 0 (0.001) |
| PPOD | | 0.050 (0.021) | 0.067 (0.022) | 0.066 (0.023) | 0.066 (0.023) | 0.066 (0.023) |
| LOF | | 0.049 (0.002) | 0.049 (0.002) | 0.048 (0.003) | 0.048 (0.003) | 0.048 (0.003) |
| EE | | 0.050 (0) | 0.050 (0) | 0.050 (0) | 0.050 (0) | 0.050 (0) |
| IF | | 0.128 (0.008) | 0.081 (0.007) | 0.074 (0.007) | 0.066 (0.008) | 0.069 (0.006) |
| MHD-C | 60 | 0.045 (0.028) | 0.004 (0.006) | 0.001 (0.003) | 0 (0.002) | 0.001 (0.002) |
| PPOD | | 0.041 (0.018) | 0.065 (0.025) | 0.065 (0.025) | 0.064 (0.024) | 0.065 (0.024) |
| LOF | | 0.043 (0.003) | 0.042 (0.003) | 0.042 (0.002) | 0.041 (0.003) | 0.041 (0.003) |
| EE | | 0.044 (0.002) | 0.043 (0) | 0.043 (0) | 0.043 (0) | 0.043 (0) |
| IF | | 0.214 (0.018) | 0.171 (0.017) | 0.153 (0.014) | 0.154 (0.023) | 0.144 (0.019) |

**Table 2.** Type II error of each considered method under different estimated contamination rates and different magnitudes of coefficient $a$.

| Method | $m_0$ | $a=0.7$ | $a=0.9$ | $a=1.1$ | $a=1.3$ | $a=1.5$ |
|---|---|---|---|---|---|---|
| MHD-C | 20 | 0.006 (0.016) | 0.002 (0.009) | 0 (0) | 0 (0) | 0 (0) |
| PPOD | | 0.366 (0.141) | 0.001 (0.007) | 0 (0) | 0 (0) | 0 (0) |
| LOF | | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| EE | | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| IF | | 0.150 (0.067) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| MHD-C | 40 | 0.016 (0.021) | 0.002 (0.008) | 0 (0) | 0 (0) | 0 (0) |
| PPOD | | 0.510 (0.134) | 0.001 (0.005) | 0 (0) | 0 (0) | 0 (0) |
| LOF | | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| EE | | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| IF | | 0.310 (0.032) | 0.125 (0.026) | 0.098 (0.030) | 0.065 (0.034) | 0.075 (0.024) |
| MHD-C | 60 | 0.022 (0.020) | 0.001 (0.005) | 0 (0.002) | 0 (0.002) | 0 (0) |
| PPOD | | 0.724 (0.105) | 0.001 (0.004) | 0 (0) | 0 (0) | 0 (0) |
| LOF | | 0.002 (0.005) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| EE | | 0.002 (0.005) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| IF | | 0.398 (0.042) | 0.298 (0.040) | 0.257 (0.032) | 0.260 (0.053) | 0.235 (0.045) |

**Table 3.** $F_2$ score of each considered method under different estimated contamination rates and different magnitudes of coefficient $a$.

| Method | $m_0$ | $a=0.7$ | $a=0.9$ | $a=1.1$ | $a=1.3$ | $a=1.5$ |
|---|---|---|---|---|---|---|
| MHD-C | 20 | 0.792 | 0.990 | 1.000 | 1.000 | 1.000 |
| PPOD | | 0.612 | 0.892 | 0.899 | 0.892 | 0.892 |
| LOF | | 0.911 (0.003) | 0.911 (0.003) | 0.912 (0.003) | 0.912 (0.003) | 0.912 (0.003) |
| EE | | 0.909 (0) | 0.909 (0) | 0.909 (0) | 0.909 (0) | 0.909 (0) |
| IF | | 0.773 (0.061) | 0.909 (0) | 0.909 (0) | 0.909 (0) | 0.909 (0) |
| MHD-C | 40 | 0.944 | 0.994 | 1.000 | 1.000 | 1.000 |
| PPOD | | 0.522 | 0.948 | 0.950 | 0.950 | 0.950 |
| LOF | | 0.962 (0.001) | 0.962 (0.001) | 0.963 (0.002) | 0.963 (0.002) | 0.963 (0.002) |
| EE | | 0.962 (0) | 0.962 (0) | 0.962 (0) | 0.962 (0) | 0.962 (0) |
| IF | | 0.664 (0.041) | 0.841 (0.041) | 0.869 (0.037) | 0.893 (0.030) | 0.898 (0.024) |
| MHD-C | 60 | 0.962 | 0.997 | 0.999 | 1.000 | 0.999 |
| PPOD | | 0.316 | 0.970 | 0.971 | 0.971 | 0.971 |
| LOF | | 0.979 (0.005) | 0.981 (0.001) | 0.981 (0.001) | 0.981 (0.001) | 0.981 (0.001) |
| EE | | 0.979 (0.004) | 0.980 (0) | 0.980 (0) | 0.980 (0) | 0.980 (0) |
| IF | | 0.590 (0.039) | 0.688 (0.035) | 0.729 (0.035) | 0.726 (0.035) | 0.750 (0.035) |

**Table 4.** Type I error of the three outlier-detection algorithms under the assumption that the contamination rate is known.

| Method | $m_0$ | $a=0.7$ | $a=0.9$ | $a=1.1$ | $a=1.3$ | $a=1.5$ |
|---|---|---|---|---|---|---|
| LOF | 20 | 0.001 (0.002) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| EE | | 0.001 (0.002) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| IF | | 0.029 (0.008) | 0.011 (0.006) | 0.006 (0.004) | 0.003 (0.004) | 0.002 (0.003) |
| LOF | 40 | 0.002 (0.004) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| EE | | 0.003 (0.004) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| IF | | 0.093 (0.013) | 0.059 (0.010) | 0.041 (0.011) | 0.038 (0.008) | 0.037 (0.010) |
| LOF | 60 | 0.006 (0.007) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| EE | | 0.005 (0.007) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| IF | | 0.189 (0.012) | 0.148 (0.016) | 0.129 (0.012) | 0.130 (0.013) | 0.124 (0.012) |

**Table 5.** Type II error of the three outlier-detection algorithms under the assumption that the contamination rate is known.

| Method | $m_0$ | $a=0.7$ | $a=0.9$ | $a=1.1$ | $a=1.3$ | $a=1.5$ |
|---|---|---|---|---|---|---|
| LOF | 20 | 0.005 (0.016) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| EE | | 0.008 (0.018) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| IF | | 0.260 (0.070) | 0.100 (0.058) | 0.055 (0.037) | 0.025 (0.035) | 0.015 (0.024) |
| LOF | 40 | 0.010 (0.017) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| EE | | 0.011 (0.015) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| IF | | 0.370 (0.051) | 0.235 (0.039) | 0.165 (0.043) | 0.150 (0.033) | 0.149 (0.040) |
| LOF | 60 | 0.017 (0.018) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| EE | | 0.012 (0.016) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| IF | | 0.440 (0.029) | 0.345 (0.038) | 0.300 (0.039) | 0.303 (0.031) | 0.288 (0.027) |

**Table 6.** $F_2$ score of the three outlier-detection algorithms under the assumption that the contamination rate is known.

| Method | $m_0$ | $a=0.7$ | $a=0.9$ | $a=1.1$ | $a=1.3$ | $a=1.5$ |
|---|---|---|---|---|---|---|
| LOF | 20 | 0.995 (0.016) | 1.000 (0) | 1.000 (0) | 1.000 (0) | 1.000 (0) |
| EE | | 0.993 (0) | 1.000 (0) | 1.000 (0) | 1.000 (0) | 1.000 (0) |
| IF | | 0.580 (0.180) | 0.853 (0.082) | 0.935 (0.043) | 0.970 (0.034) | 0.983 (0.025) |
| LOF | 40 | 0.991 (0.017) | 1.000 (0) | 1.000 (0) | 1.000 (0) | 1.000 (0) |
| EE | | 0.989 (0.015) | 1.000 (0) | 1.000 (0) | 1.000 (0) | 1.000 (0) |
| IF | | 0.539 (0.103) | 0.753 (0.048) | 0.835 (0.043) | 0.850 (0.033) | 0.851 (0.033) |
| LOF | 60 | 0.984 (0.017) | 1.000 (0) | 1.000 (0) | 1.000 (0) | 1.000 (0) |
| EE | | 0.988 (0.016) | 1.000 (0) | 1.000 (0) | 1.000 (0) | 1.000 (0) |
| IF | | 0.504 (0.070) | 0.641 (0.046) | 0.690 (0.046) | 0.699 (0.032) | 0.703 (0.032) |

**Table 7.** Type I error of the three outlier-detection algorithms when the dataset is contaminated by profiles with different magnitudes of coefficient $a$.

| Contamination Rate | IF | EE | LOF |
|---|---|---|---|
| 0.1 (actual) | 0.013 (0.005) | 0.001 (0.002) | 0.001 (0.002) |
| 0.15 (estimated) | 0.059 (0.003) | 0.056 (0) | 0.053 (0.003) |
| 0.2 (actual) | 0.040 (0.014) | 0.000 (0.001) | 0.000 (0.001) |
| 0.24 (estimated) | 0.073 (0.011) | 0.050 (0) | 0.049 (0.003) |
| 0.3 (actual) | 0.095 (0.015) | 0.001 (0.003) | 0.000 (0.002) |
| 0.33 (estimated) | 0.118 (0.019) | 0.043 (0) | 0.041 (0.003) |

**Table 8.** Type II error of the three outlier-detection algorithms when the dataset is contaminated by profiles with different magnitudes of coefficient $a$.

| Contamination Rate | IF | EE | LOF |
|---|---|---|---|
| 0.1 (actual) | 0.120 (0.041) | 0.005 (0.015) | 0.005 (0.015) |
| 0.15 (estimated) | 0.035 (0.029) | 0 (0) | 0 (0) |
| 0.2 (actual) | 0.160 (0.054) | 0.001 (0.006) | 0.001 (0.006) |
| 0.24 (estimated) | 0.094 (0.042) | 0 (0) | 0 (0) |
| 0.3 (actual) | 0.223 (0.034) | 0.001 (0.004) | 0.001 (0.004) |
| 0.33 (estimated) | 0.174 (0.044) | 0 (0) | 0 (0) |

**Table 9.** $F_2$ score of the three outlier-detection algorithms when the dataset is contaminated by profiles with different magnitudes of coefficient $a$.

| Contamination Rate | IF | EE | LOF |
|---|---|---|---|
| 0.1 (actual) | 0.880 (0.041) | 0.995 (0.015) | 0.995 (0.015) |
| 0.15 (estimated) | 0.877 (0.026) | 0.909 (0) | 0.913 (0.005) |
| 0.2 (actual) | 0.840 (0.054) | 0.999 (0.006) | 0.999 (0.006) |
| 0.24 (estimated) | 0.871 (0.040) | 0.962 (0) | 0.963 (0.002) |
| 0.3 (actual) | 0.778 (0.034) | 0.999 (0.004) | 0.999 (0.004) |
| 0.33 (estimated) | 0.810 (0.047) | 0.980 (0) | 0.981 (0.002) |

Disclaimer/Publisher's Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Cheng, C.-S.; Chen, P.-W.; Wu, Y.-T.
Phase I Analysis of Nonlinear Profiles Using Anomaly Detection Techniques. *Appl. Sci.* **2023**, *13*, 2147.
https://doi.org/10.3390/app13042147
