AOC-OPTICS: Automatic Online Classification for Condition Monitoring of Rolling Bearing

Abstract: Bearings are essential components of rotating machines: they support rotation and power transmission. Real-time monitoring is therefore required to detect a possible anomaly, diagnose a rolling bearing failure and follow its evolution. This paper presents a methodology for the automatic online fault diagnosis of rolling bearings, AOC-OPTICS (automatic online classification monitoring based on ordering points to identify clustering structure, OPTICS). The algorithm consists of three phases: initialization, detection and follow-up. These phases combine feature extraction, smart ranking, feature weighting and classification by the OPTICS method. Two methods, the relief method and t-distributed stochastic neighbor embedding, are integrated in the dimension reduction step to improve the efficiency of the detection and the follow-up of the defect. The determination of the internal parameters of the OPTICS method is thereby improved. A regression model and an exponential model are used to track the fault. The analytical simulations discuss the influence of parameter automation. Experimental validation shows detection with 100% accuracy and regression models for monitoring reaching R² = 0.992.


Introduction
Automation is spreading worldwide in the manufacturing and processing industries [1,2]. For industrial rotating machines, the main idea of automation is monitoring without user-supplied input parameters. Human error and inexact input parameters limit the accuracy of monitoring, which motivates an autonomous method. There is a growing demand for real-time monitoring of rotating machines to facilitate advanced maintenance programs [3]. Rolling bearings are among the most significant and critical components of these machines [4].
The monitoring of rolling bearings has attracted considerable research attention, and many methods have been applied to detect defects, such as support vector machines [5], Bayesian networks [6] and clustering [7]. Numerous literature reviews are available on monitoring methods [8,9]. Among these methods, clustering analysis is one of the most remarkable approaches [10-12]. Density-based methods form one family, in which clusters are dense regions of data separated by less dense regions [11]. The OPTICS method (ordering points to identify clustering structure) belongs to this family and separates clusters by density [13]. In addition, it can find clusters of varying density. Clustering by OPTICS is an unsupervised learning method applied directly to vibration data. It can therefore be used in industrial environments without being trained on data measured on a machine under a fault condition [7].
Processes 2020, 8, 606
OPTICS is a hierarchical clustering algorithm that relies on a notion of density [13]. Its application is not limited to one field: it is used in biology, astronomy, topology and, recently, for the detection of defects in the rolling bearings of rotating machines [15]. The method arranges the database into an ordering of points that is valid for different parameter settings, and reveals meaningful groups of varied density because points that are spatially close to each other, and can thus become neighbors, end up adjacent in the ordering. It can separate significant objects from noise and identify all possible levels of clusters. The main idea of the OPTICS algorithm is that, for each point of a cluster, the neighborhood of a given radius (ε) has to contain at least a minimum number of points (MinPts), where ε and MinPts are input parameters. The algorithm builds clusters of arbitrary shape iteratively: points that are close to each other are added in ε-neighborhood order until the entire group is obtained.
The two components of OPTICS are the core distance, C_d, and the reachability distance, R_d, Equations (1) and (2). If the ε-neighborhood of an object p, N_ε(p), contains at least MinPts points, p is a core object and C_d is the distance from p to its MinPts-th neighbour, MinPts-distance(p); otherwise C_d is undefined. The reachability distance of an object o with respect to p, R_d, is the maximum of the core distance of p and the Euclidean distance between o and p. Figure 1a is a representation of the reachability distance and the core distance.
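For reference, the two distances are usually written as follows in the OPTICS literature. This is the standard formulation, which Equations (1) and (2) are assumed to follow, with d the Euclidean distance and the symbols defined in the text:

```latex
C_d(p) =
\begin{cases}
\text{undefined}, & |N_\varepsilon(p)| < MinPts \\
\text{MinPts-distance}(p), & \text{otherwise}
\end{cases}

R_d(o, p) =
\begin{cases}
\text{undefined}, & |N_\varepsilon(p)| < MinPts \\
\max\bigl(C_d(p),\; d(p, o)\bigr), & \text{otherwise}
\end{cases}
```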
The number of classes is determined from the reachability plot, Figure 1b. It corresponds to the number of valleys in the plot of R_d as a function of the ordered points o.
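One simple way to count valleys in a reachability plot is sketched below. It is an illustration only: the function name `count_valleys` and the fixed `threshold` are assumptions, and the paper does not state how valleys are extracted in practice.

```python
def count_valleys(reachability, threshold):
    """Count maximal runs of consecutive ordered points whose
    reachability distance lies below `threshold`.

    `reachability` follows the OPTICS ordering; None marks an
    undefined reachability distance (hypothetical convention).
    """
    valleys = 0
    inside = False
    for r in reachability:
        below = r is not None and r < threshold
        if below and not inside:
            valleys += 1  # a new valley starts here
        inside = below
    return valleys


# Two low-reachability runs separated by one high value
print(count_valleys([None, 0.1, 0.1, 0.9, 0.2, 0.2], threshold=0.5))
```

A threshold-based count is the crudest option; steepness-based extraction is another common choice.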

Global Architecture
The AOC-OPTICS method was developed to monitor the state of health of a bearing throughout its entire life. It is based on the physical manifestations involved in the deterioration of a bearing. Three automated phases are proposed, Figure 2. Phase 1 assumes that a newly fitted bearing is healthy during an interval T_h; this phase initializes the method. Phase 2 is the failure detection phase. It remains active as long as no failure is detected. Data agglomeration is used for early and reliable fault detection. Phase 3 is the follow-up of the evolution of the fault. Given the evolutionary nature of the fault, this phase is a loop that follows the state of the bearing through the geometric values of the second class. It runs until the bearing fails. Each phase is described in the following sections and Table 1 shows the associated pseudocode.
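The sequencing of the three phases can be sketched as a small loop. This is a minimal illustration, not the authors' implementation: `monitor`, `batches` and the callable `count_classes` (standing in for feature extraction, reduction and OPTICS clustering) are hypothetical names.

```python
def monitor(batches, t_h, count_classes):
    """Return the phase active at each iteration k = 1, 2, ...

    batches       : iterable of per-iteration data (format is hypothetical)
    t_h           : number of initial iterations assumed healthy
    count_classes : hypothetical callable returning the number of
                    classes found in all data seen so far (1 or 2)
    """
    history = []
    phases = []
    fault_detected = False
    for k, batch in enumerate(batches, start=1):
        history.append(batch)
        if k <= t_h:
            phases.append("initialization")
        elif not fault_detected:
            # Stay in detection until a second class appears
            if count_classes(history) >= 2:
                fault_detected = True
            phases.append("detection")
        else:
            phases.append("follow-up")
    return phases


# Toy run: a "1" in the history stands for faulty data
demo = monitor([0, 0, 0, 1, 1, 1], t_h=2,
               count_classes=lambda h: 2 if 1 in h else 1)
print(demo)
```

In this sketch the iteration at which the second class appears is still labelled as detection, and the follow-up starts at the next iteration.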

Table 1. Pseudocode for automatic online classification monitoring based on ordering points to identify clustering structure (AOC-OPTICS).

Inputs: T_h, n, ∆t
∆t is the time interval between two data collections
n is the number of signals collected at time k∆t
T_h is the duration of the initialization phase
Outputs: n_c, Plot GV = f(k∆t)
GV are the geometric values
n_c is the number of classes (= 1 for a healthy condition, = 2 for healthy and faulty conditions)

Phase 1, Initialization
The first phase is executed for a duration T_h, during which the bearing is assumed to be healthy. At every iteration k, n signals are collected and p = 17 features are extracted in the time, spectral and time-frequency domains. Multidomain features offer an effective diagnosis for different rolling bearing defects under varying speed and load. The time domain provides nine features in the form of descriptive statistics; such statistical indicators are widely used because of their relation to significant bearing damage [23]. The frequency domain makes it possible to localize the bearing defect and identify its nature [24]; six indicators are computed. The time-scale domain uses the wavelet method to extract two features [25], Table 2. These indicators are stored in a matrix [HI], where each column corresponds to a signal and each row to an indicator.

Table 2. Computed features. x is the sequence of samples obtained after digitizing the time domain signals, x_i is the signal series for i = 1, 2, ..., N. W_S(f_k) corresponds to the spectral density of the max coefficients of the continuous wavelet transform. s(k) is a spectrum for k = 1, 2, ..., K, K is the number of spectrum lines, f_k is the frequency value of the k-th spectrum line.

TIME DOMAIN: Root mean square; Crest factor, x_CF = X_PEAK / X_RMS; Skewness
FREQUENCY DOMAIN: Frequency root mean square; Weighted frequency root mean square; Frequency center; Weighted standard deviation frequency; Mean power envelope; Average value of the envelope amplitudes

Normalization transforms the computed indicators onto a similar scale, giving the normalized matrix [HI]norm of size i × (k+1)n.

A ranking step is then applied. Feature ranking eliminates unimportant features before dimension reduction. Computing features on a massive amount of data takes a long time; ranking reduces the number of features and speeds up the calculation without affecting the accuracy of defect detection. For AOC-OPTICS, two methods for eliminating unnecessary features, the relief method and the chi-square method [26], are compared in Section 4.3 for different numbers of features.
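As an illustration of the indicator and normalization steps, the sketch below computes three of the time-domain features named in Table 2 and rescales one row of the indicator matrix. The textbook definitions of RMS, crest factor and skewness are used. Min-max scaling is an assumption, since the normalization formula is not reproduced here, and all function names are hypothetical.

```python
import math


def rms(x):
    """Root mean square of a sampled signal."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def crest_factor(x):
    """Peak amplitude divided by the RMS value (X_PEAK / X_RMS)."""
    return max(abs(v) for v in x) / rms(x)


def skewness(x):
    """Third standardized moment (population form)."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    return sum((v - mean) ** 3 for v in x) / (n * std ** 3)


def min_max_normalize(row):
    """Rescale one indicator (one row of [HI]) to [0, 1].
    Assumed normalization; a constant row maps to zeros."""
    lo, hi = min(row), max(row)
    if hi == lo:
        return [0.0 for _ in row]
    return [(v - lo) / (hi - lo) for v in row]


signal = [3.0, -3.0, 3.0, -3.0]
print(rms(signal), crest_factor(signal))
print(min_max_normalize([2.0, 4.0, 6.0]))
```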
Although the curse of dimensionality poses serious problems, high-dimensional data has the advantage of carrying more information. t-distributed stochastic neighbor embedding (t-SNE) is a powerful dimension reduction tool that can reduce the feature dimensions while increasing the recognition rate. The dimension is reduced to three components, which gives better accuracy than two. The difference in accuracy between the two settings is visible in the representation of amplitude. Because three components are used in this paper, figures are shown in three dimensions.
Finally, ε is calculated after the dimension reduction. ε corresponds to the maximum distance between the center of the class, c_h, and the MinPts-th neighbor, Equation (4).
The resulting class is the so-called healthy class, noted C_h, with center c_h. This class corresponds to a reference state.

Phase 2, Detection
The second phase detects the mechanical failure. Its objective is to detect a new state called the defective class, noted C_f. At each new iteration k, the indicators are extracted, normalized, ranked and reduced as in the previous phase. These features, [FI] of size 3 × (k+1)n, are tested by the OPTICS method to determine whether a second class has appeared. If only one class is obtained, which corresponds to the reference state, the algorithm remains in the detection phase and the new data feed the reference state. If two classes are detected, the new class C_f is obtained in a projection plane B, which is kept for the follow-up phase.

Phase 3, Follow-up
The third phase is carried out in the plane B determined in the previous phase. It is important to keep the same plane in order to visualize the evolution of the characteristics; this plane is the best one for following the evolution of the bearing failure. With each new series of data, the indicators are extracted, normalized and projected into plane B. From these features, [FI] of size 3 × (k+1)n, five geometrical parameters GV_i are calculated and monitored over time.
The Calinski-Harabasz index, GV_1, is based on the density of the clusters and their separation, Equation (5). p is the number of features. c_f and c_h are the centers of the classes C_f and C_h, respectively. n_c is the number of clusters. d is the Euclidean distance between a point x and a center c_i.
The Davies-Bouldin index, GV_2, measures the average similarity between clusters; a lower index means a better cluster configuration. R_ij is the similarity measure of two clusters i and j, and n_c is the number of clusters. The third parameter, GV_3, is the distance between the center of the initial cluster C_h and the center of the fault cluster C_f, where d_M is the Manhattan distance.
Finally, the contour, GV_4, of the cluster is calculated from a convex hull, which is the smallest convex set that contains the points. The density, GV_5, is the number of points of the cluster C_f per unit of its volume V_f.
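The first three geometric values can be illustrated with the textbook definitions of the Calinski-Harabasz and Davies-Bouldin indices and the Manhattan distance between centers. The sketch below uses those standard formulas, which may differ in detail from the paper's own equations. All function names are hypothetical.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(p[j] for p in points) / n for j in range(len(points[0]))]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def calinski_harabasz(clusters):
    """Between-cluster dispersion over within-cluster dispersion,
    each divided by its degrees of freedom (standard definition)."""
    all_points = [p for c in clusters for p in c]
    n, k = len(all_points), len(clusters)
    g = centroid(all_points)
    between = sum(len(c) * euclidean(centroid(c), g) ** 2 for c in clusters)
    within = sum(euclidean(p, centroid(c)) ** 2 for c in clusters for p in c)
    return (between / (k - 1)) / (within / (n - k))


def davies_bouldin(clusters):
    """Average, over clusters, of the worst-case similarity
    R_ij = (s_i + s_j) / d(c_i, c_j) (standard definition)."""
    centers = [centroid(c) for c in clusters]
    scatter = [sum(euclidean(p, m) for p in c) / len(c)
               for c, m in zip(clusters, centers)]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max((scatter[i] + scatter[j]) / euclidean(centers[i], centers[j])
                     for j in range(k) if j != i)
    return total / k


healthy = [(0.0, 0.0), (0.0, 2.0)]   # toy stand-in for C_h
faulty = [(4.0, 0.0), (4.0, 2.0)]    # toy stand-in for C_f
print(calinski_harabasz([healthy, faulty]))
print(davies_bouldin([healthy, faulty]))
print(manhattan(centroid(healthy), centroid(faulty)))
```

As the two toy clusters move apart, the first index grows and the second shrinks, which matches the opposite trends described for GV_1 and GV_2.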

Simulated Model
A mathematical model is used to verify the methodology. It corresponds to the vibratory signature of a bearing with an outer race defect (x_BPFO). Equations (8)-(10) [27] describe the model, which represents the effect of each rolling element passing over the faulty outer race as a function of time t. The passage of the balls over the outer race defect creates impacts at the frequency f_BPFO. Each impact generates an impulse response of the structure with a natural frequency f_0 and a damping µ. The frequency f_BPFO depends on the rotation speed of the motor, f_r, and on the bearing geometry, Equation (10). Thus, the model is defined by four parameters: the amplitude A, the damping factor µ, the rotational speed f_r and the amplitude of the noise signal b(t). An exponential law, Equation (9), is used in place of a constant amplitude A. The simulated bearing is an SKF 6206, whose characteristics are listed in Table 3. Every signal contains N = 16,384 samples at a sampling rate of 51.2 kHz. To simulate the appearance and evolution of the defect, the database was built from fifty-one values of the amplitude A, noted A_i with i = 1, ..., 51. For each value, twenty signals were generated with a Gaussian variability of ±5% on the three parameters f_r, µ and f_0, Table 4. Thus, the database was made of 51 amplitude values × 20 signals, ordered by increasing amplitude. The amplitude A_i was equal to zero for i = 1 to 10, so these signals showed no variation. The amplitude then increased from i = 11 to i = 51 by introducing Equation (9) into Equation (8), Table 4.
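A signal of this kind can be sketched as a train of damped sinusoids repeated at f_BPFO plus Gaussian noise. The code below is a simplified illustration under stated assumptions, not a transcription of Equations (8)-(10). It uses the usual kinematic formula for the outer race frequency, keeps only the response to the most recent impact, and all function and parameter names are hypothetical. The geometry values in the example are placeholders, not the Table 3 data.

```python
import math
import random


def bpfo(f_r, n_balls, ball_d, pitch_d, contact_angle=0.0):
    """Usual kinematic ball pass frequency of the outer race (Hz)."""
    return 0.5 * n_balls * f_r * (1.0 - ball_d / pitch_d * math.cos(contact_angle))


def simulate_outer_race(amplitude, mu, f0, f_bpfo, noise_std,
                        n_samples=16384, fs=51200.0, seed=0):
    """Damped sinusoid restarted at every impact, plus Gaussian noise.

    Simplification: only the response to the latest impact is kept,
    which assumes the response has decayed within one impact period.
    """
    rng = random.Random(seed)
    period = 1.0 / f_bpfo
    signal = []
    for i in range(n_samples):
        t = i / fs
        tau = t % period  # time elapsed since the last impact
        impulse = amplitude * math.exp(-mu * tau) * math.sin(2.0 * math.pi * f0 * tau)
        signal.append(impulse + rng.gauss(0.0, noise_std))
    return signal


# Placeholder geometry and dynamics, for illustration only
f_out = bpfo(f_r=10.0, n_balls=8, ball_d=1.0, pitch_d=4.0)
x = simulate_outer_race(amplitude=1.0, mu=800.0, f0=3000.0,
                        f_bpfo=f_out, noise_std=0.1)
print(f_out, len(x))
```

Increasing `amplitude` from zero over successive batches mimics the appearance and growth of the defect described above.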

Effect of Internal Parameters of The OPTICS Method
OPTICS uses two parameters, ε and MinPts. ε was calculated in the initialization phase, after collecting all the data, and depends on the MinPts value. For the simulation, ε took values in the range 0.909-0.920 for MinPts in the range 2-20. Thus, this value varied only slightly during the initialization phase. Its value for the MinPts-th neighbour was kept for the rest of the algorithm. Figure 3 confirms the value of ε. After the initialization phase, ε increased abruptly.
MinPts was related to the number of signals n collected at each instant. Table 5 shows the effect of MinPts for three levels of noise. This table illustrates the effectiveness of the automated ε and MinPts, using the default Euclidean distance of the OPTICS algorithm and all seventeen features. From this table, the optimal value of MinPts was n/2. This value made it possible to detect the fault earlier than the others.
The choice of distance measure affects the results of clustering algorithms. The advantages and disadvantages of each distance used are given in Table 6. With the Euclidean distance, used by default in OPTICS to calculate the distance between two vectors, it was difficult to obtain even an approximation of the precise values of the data. Table 7 shows the effect of the distance used in the AOC-OPTICS method for three noise levels. The Manhattan distance detected the defect with a global accuracy of 96.7% over the different signal-to-noise ratios, followed by the Mahalanobis distance at 88.2%; for the other distances the accuracy was 85.5%.

Table 6. Advantages and disadvantages of the distance measures.

Euclidean
Advantage: (1) Easy to compute and suitable for datasets with separated clusters [28]. (2) Fast for small data [28].
Disadvantage: (1) Susceptible to outliers [28]. (3) Fails to classify massive data.

Mahalanobis (C is the covariance matrix)
Advantage: (3) Handles the distortion caused by a linear combination of attributes. (4) Takes account of the shape of the clusters by employing within-group correlation [30].
Disadvantage: (1) If the noise has a strong effect, it can mask the data provided and lead to misclassification [31]. (2) The inverse of the correlation matrix cannot be calculated when the variables are highly correlated [32]. (3) When the dimension is proportional, the eigenvalues of the covariance equal zero and the distance cannot be calculated [33].

Cityblock or Manhattan, Σ_i |x_i − y_i|
Advantage: (1) Shows better performance in terms of computation time. (3) Satisfies the triangle inequality and offers better data contrast than the Euclidean distance [30]. (4) Relatively good data contrast in high dimensions.

Advantage: (1) Useful in high dimensions of data [34]. (2) Useful for datasets with compact or isolated clusters.

Chebychev, max_i |x_i − y_i|
Advantage: (1) Takes less time to compute distances between data sets [36].
Disadvantage: (1) More sensitive to the scale of the feature magnitudes; this inherent weakness can be resolved by normalizing all features before the classification task [37].

Table 7. Effect of distances, with MinPts = n/2, 17 features, ε = 0.094.
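Three of the compared distances reduce to one line each, as the sketch below illustrates. The function name `distance` and the metric labels are illustrative choices. The Mahalanobis distance is left out because it additionally needs the inverse covariance matrix.

```python
import math


def distance(x, y, metric="cityblock"):
    """Distance between two feature vectors of equal length."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if metric == "euclidean":
        return math.sqrt(sum(d * d for d in diffs))
    if metric == "cityblock":   # Manhattan: sum of |x_i - y_i|
        return sum(diffs)
    if metric == "chebychev":   # max of |x_i - y_i|
        return max(diffs)
    raise ValueError(f"unknown metric: {metric}")


a, b = (0.0, 0.0), (3.0, 4.0)
for name in ("euclidean", "cityblock", "chebychev"):
    print(name, distance(a, b, name))
```

For the same pair of points the three measures give different values, which is why ε has to be recomputed whenever the distance is changed.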

Effect of Ranking Features
Feature ranking is usually used in data preprocessing to select a subset of features. The principle is to draw a random instance, find its nearest neighbors and update a vector of feature weights, which distinguishes the features that separate neighbors of different classes.
Two methods, chi-square and relief, were compared. Table 8 gives the results of the feature ranking. The comparative study shows the effectiveness of the relief method, which detected the defect with high accuracy from ten features onwards. The chi-square method reached the highest efficiency with twelve features. From Table 8, it can be concluded that relief was the best ranking method, since ten features were enough to obtain the highest accuracy.
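The weighting principle described above corresponds to the basic two-class Relief algorithm, sketched below. This is a minimal illustration, not the exact variant used in the paper. It assumes class labels are available (for example healthy/faulty in the simulated database), and `relief_weights` and its parameters are hypothetical names.

```python
import random


def relief_weights(samples, labels, n_iterations=None, seed=0):
    """Basic Relief: reward features that differ on the nearest
    sample of another class (miss) and penalize features that
    differ on the nearest sample of the same class (hit)."""
    rng = random.Random(seed)
    n_features = len(samples[0])
    # Feature ranges, used to scale differences to [0, 1]
    spans = []
    for j in range(n_features):
        col = [s[j] for s in samples]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(n_features))

    m = n_iterations or len(samples)
    weights = [0.0] * n_features
    for _ in range(m):
        i = rng.randrange(len(samples))  # random instance
        x, y = samples[i], labels[i]
        hits = [s for idx, s in enumerate(samples) if idx != i and labels[idx] == y]
        misses = [s for idx, s in enumerate(samples) if labels[idx] != y]
        near_hit = min(hits, key=lambda s: dist(x, s))
        near_miss = min(misses, key=lambda s: dist(x, s))
        for j in range(n_features):
            weights[j] += (diff(x, near_miss, j) - diff(x, near_hit, j)) / m
    return weights


# Feature 0 separates the two classes, feature 1 does not
data = [(0.0, 0.3), (0.1, 0.9), (0.2, 0.1), (0.9, 0.2), (1.0, 0.8), (0.8, 0.4)]
classes = [0, 0, 0, 1, 1, 1]
print(relief_weights(data, classes))
```

Features are then ranked by decreasing weight and only the top ones are kept before dimension reduction.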

Results
In this section, the results were obtained for the following parameters: relief method, Manhattan distance, MinPts = n/2 and ε = 0.094. A 3D visualization was chosen (three principal components). The 3D results gave a detection accuracy of 96.7%, whereas the 2D results gave an accuracy below 93.7%. The results of the BPFO (ball pass frequency outer) simulation showed that the fault was detected from signal A_11 for noise levels 0.1b(t) and 0.3b(t), and from A_12 for 0.5b(t) (Figures 4-6).
The follow-up starts after the end of the detection phase. GV monitors the growth of the fault as the signal amplitude varies. The evolution of GV was studied for the three noise levels 0.1b(t), 0.3b(t) and 0.5b(t), Figure 7.
Figure 7a represents the Calinski index calculated between the two clusters. The curve values increased with increasing amplitude values. The Calinski index for 0.1b(t) was the largest, and its curve was above the others. For a high noise level, the evolution was linear, GV_1 = 0.438k + 7.245 (R² = 0.980), while for the other two noise levels the evolution was exponential (R² = 0.977 and R² = 0.742).
Figure 7b represents the Davies-Bouldin index. Its curve was the opposite of the Calinski-Harabasz index: it decreased with the increasing amplitude of signals. The results showed that the curve for 0.1b(t) was above the other curves; it started near one and ended near zero. For the three noise levels, the regression was linear and the models were similar: GV_2 = −0.0236k + 0.954 (R² = 0.999), GV_2 = −0.0239k + 0.997 (R² = 0.997) and GV_2 = −0.0252k + 1.064 (R² = 0.994) for 0.1b(t), 0.3b(t) and 0.5b(t), respectively.
Figure 7c represents the density of the defective cluster (the second class). The density decreased with the amplitude of signals until it became constantly equal to zero, unlike the Davies-Bouldin index, which only approached zero at the end. The density for 0.1b(t) was larger than for the other noise levels. The evolution was exponential for 0.1b(t), 0.3b(t) and 0.5b(t), with a model of the form GV_3 = 809e^(−0.151k) (R² = 0.720). The correlation was poor for a low noise level.
Figure 7d represents the distance between the two clusters, which grew with amplitude for the three scenarios 0.1b(t), 0.3b(t) and 0.5b(t). The curve for 0.1b(t) was above the others at the end; initially, the three curves were merged, and they started to separate from k = 31. A linear model could be fitted from k = 31.
Figure 7e represents the contour of the second cluster, which increased with the amplitude of signals. Like the contour, the Calinski index kept increasing with amplitude. However, the contour values were similar for all noise levels at low amplitudes. The contour became relevant from a certain amplitude level, k = 31 for low noise levels and k = 41 for higher noise levels. The regression models starting from k = 31 were GV_5 = 0.373e^(−0.154k) (R² = 0.948), GV_5 = 0.059e^(0.240k) (R² = 0.986) and GV_5 = 0.034e^(0.215k) (R² = 0.924).
In summary, the Calinski index differentiated the noise levels for all amplitudes. However, its regression model depended on the noise level: linear in one case and exponential in the others. In contrast, the Davies-Bouldin index was little influenced by the noise level, so its linear regression models were relevant and similar. This shows the importance of the Calinski index, which separated the curves of the different scenarios: its value started at zero and grew directly with the amplitude, while the contour increased slowly with the amplitude. The density and distance parameters had values close to zero for either low or high amplitudes, so their evolution was only visible over limited ranges of amplitude. According to these simulations, the Calinski and Davies-Bouldin indexes were preferred. This numerical investigation made it possible to fix the internal parameters of OPTICS, ε = 0.094 (Equation (4)) and MinPts = n/2, and to optimize the methods involved in the AOC-OPTICS process (relief method, t-SNE and Manhattan distance).
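Linear and exponential trends of the kind reported above, GV = ak + b and GV = a·e^(bk), can be fitted with ordinary least squares. The sketch below is one possible approach. Fitting the exponential model on ln(GV) is an assumption, since the fitting procedure is not described, and the function names are hypothetical.

```python
import math


def linear_fit(k, gv):
    """Least-squares slope and intercept of gv = slope * k + intercept."""
    n = len(k)
    mk, mg = sum(k) / n, sum(gv) / n
    slope = (sum((a - mk) * (b - mg) for a, b in zip(k, gv))
             / sum((a - mk) ** 2 for a in k))
    return slope, mg - slope * mk


def exponential_fit(k, gv):
    """Fit gv = a * exp(b * k) by linear regression on ln(gv).
    Requires strictly positive gv values."""
    b, ln_a = linear_fit(k, [math.log(g) for g in gv])
    return math.exp(ln_a), b


def r_squared(gv, predicted):
    """Coefficient of determination R^2."""
    mean = sum(gv) / len(gv)
    ss_res = sum((g - p) ** 2 for g, p in zip(gv, predicted))
    ss_tot = sum((g - mean) ** 2 for g in gv)
    return 1.0 - ss_res / ss_tot


# Synthetic data following a known exponential trend
ks = list(range(11, 21))
gvs = [2.0 * math.exp(0.1 * k) for k in ks]
a, b = exponential_fit(ks, gvs)
print(a, b, r_squared(gvs, [a * math.exp(b * k) for k in ks]))
```

Comparing the R² of both fits on the same series is a simple way to choose between the linear and the exponential model for a given geometric value.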

Test Bench
The experimental bench consists of a crankcase connected to an electric motor of 10 kW maximum power through a shaft and two rolling bearings: a healthy one (6206 ball bearing) and a degraded one (N.206.E. G15 roller bearing), Figure 8. A hydraulic jack acting through a steel cable was used to vary the load on the shaft (Figure 8a). The rotational speed of the motor is controlled by a variable speed drive. The whole device was mounted on a concrete structure to isolate it from the low frequencies generated by the external environment. A piezoelectric sensor was placed radially on the bearing, which is considered the best measuring point. The data were collected with a sampling frequency of 51,200 Hz. Eight defects were created on the outer ring of the roller bearing with an electric pen. The defects were measured using a moulding paste ("plastiform"), Table 9. The roughness of the resulting profile was characterized with a Taylor-Hobson Surtronic 3P profilometer (Figure 8b). For each of the nine states of the bearing (one healthy and eight defect sizes), 10 randomly chosen operating conditions were applied among 5 loads ranging from 100 to 220 daN, with a 30 daN step, and 5 rotation speeds ranging from 1405 to 1560 rpm, with a 50 rpm step. The number of combinations was 9 × 10 = 90 (k = 90). For each combination, 8 signals were collected (n = 8) with 12,800 samples each. The total database was made of 90 × 8 = 720 signals.

The AOC-OPTICS method was applied. The inputs were ∆t = 1 (default value), T_ini = 10 and n = 8. Thus, at each iteration k, 8 new signals entered the algorithm. Eighty signals initiated the monitoring process.

Results
After the initialization phase (T_ini = 10), the detection phase ran until a second class was detected. This detection occurred at iteration k = 11. The input parameters were MinPts = n/2 = 4 and ε = 0.12. The results of the AOC-OPTICS method are represented in Figure 9. Cluster number 2 appeared at iteration 11 and was confirmed by the following iterations. Despite the variation of load and speed, the accuracy was 100%. The methodology could thus detect a tiny variation in the state of the bearing, which demonstrates its robustness.


Follow-Up
The follow-up runs from iteration 11 to iteration 90 for the five geometrical values (GV): the Calinski and Davies-Bouldin indices, the density, the contour and the distance, Figure 10. These variables behaved differently from one another, but each remained similar to the behavior established during the simulation. The Calinski index increased with the size of the defects. At iteration 65, the index had an exponential evolution of the form $GV_1 = 161.13\,e^{0.084k}$ ($R^2 = 0.982$), Figure 10a. The Davies-Bouldin index decreased linearly with the fault, Figure 10b. In this case, a linear regression $GV_2 = -0.0107k + 0.947$ ($R^2 = 0.994$) was proposed. The density decreased as the amplitude values increased, reaching approximately zero from signal number sixty to ninety; the mathematical model was $GV_3 = 197.15\,e^{-0.095k}$ ($R^2 = 0.940$). The distance increased with the amplitude values. The evolution was exponential, $GV_4 = 0.225\,e^{0.039k}$ ($R^2 = 0.828$). However, the trend was not clearly monotonic: there was a lot of variability around the average trend, Figure 10d. The contour shows two trends, Figure 10e. It evolved linearly over the first 60 iterations with a low slope, $GV_5 = 0.014k - 0.0697$ ($R^2 = 0.934$). From the 60th iteration onwards, the evolution remained linear but increased sharply, $GV_5 = 0.4791k - 3.518$ ($R^2 = 0.904$).

By comparing the evolution of these indicators, the Calinski index and the contour showed singularities at iteration 60, corresponding to defect 5. These parameters indicate the severity stage of the degradation in the rolling bearing. The Davies-Bouldin index was the index most correlated with the iteration number ($R^2 = 0.994$). In general, these indicators made it possible to make a prognosis on the evolution of these parameters with the iterations.
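The linear and exponential trend models above can be obtained by ordinary least squares, with the exponential case fitted on the logarithm of the indicator. The sketch below shows one way to do this in plain Python. It is an assumption about the fitting procedure, which the text does not specify, and it uses noise-free synthetic values generated from the reported Calinski and Davies-Bouldin models rather than the measured data.

```python
import math

def linear_fit(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def exponential_fit(xs, ys):
    """Fit y = a * exp(b * x) by linear regression on ln(y); needs y > 0."""
    b, ln_a = linear_fit(xs, [math.log(y) for y in ys])
    return math.exp(ln_a), b

def r_squared(ys, predictions):
    """Coefficient of determination of the predictions."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Follow-up iterations k = 11 ... 90.
ks = list(range(11, 91))

# Synthetic, noise-free indicator values built from the reported models.
calinski = [161.13 * math.exp(0.084 * k) for k in ks]
davies_bouldin = [-0.0107 * k + 0.947 for k in ks]

a, b = exponential_fit(ks, calinski)
slope, intercept = linear_fit(ks, davies_bouldin)
r2_exp = r_squared(calinski, [a * math.exp(b * k) for k in ks])

print(a, b, slope, intercept, r2_exp)
```

On noise-free inputs the fit recovers the model coefficients and $R^2$ is essentially 1; on measured indicators the scatter around the trend lowers $R^2$, as reported for the distance.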

Conclusions
This paper proposed an automatic online methodology for monitoring rolling bearings by optimizing the internal parameters of the OPTICS method and the dimension reduction step. The dynamic monitoring method, AOC-OPTICS, is divided into three phases: initialization, detection and follow-up of the defect. The methodology was tested on a simulated fault evolution and then on experimental data. The detection reached an accuracy of 100%. The follow-up relied on geometrical values whose trends followed linear or exponential models with coefficients of determination ($R^2$) up to 0.994. This methodology brings several improvements: (I) The automated methodology uses the best parameters for detecting and following defects with high accuracy. (II) Variations in speed and load alone do not lead to a fault being detected in the rolling bearing; only the amplitude leads to detection of the faulty state. (III) The relief method, used to remove unnecessary features, is efficient compared to chi-square and speeds up the computation of each iteration. (IV) The characteristic parameters related to the defect facilitate monitoring of its evolution over time. (V) The density and the Calinski and Davies-Bouldin indices are more effective than the other parameters for monitoring the defect growth trajectory. The main perspective is to add a diagnostic part to the methodology in order to improve the prognosis. This part must be based on prior knowledge provided by a digital twin or an expert.