Application of PCA and Classification for Fault Diagnosis of MAB Installed in Petrochemical Plant Process Facilities

In large systems, such as power plants or petrochemical plants, various equipment (e.g., compressors, pumps, and turbines) is typically deployed. Each piece of equipment generally operates under harsh conditions, depending on its purpose, and carries some probability of failure. Several sensors are therefore attached to each piece of equipment to monitor its condition; however, monitoring equipment with thresholds such as the maximum and minimum values of the data has many limitations. This study introduces a technology that can diagnose fault conditions by analyzing data from several sensors obtained from plant operation information systems. The equipment for the case study was a main air blower (MAB), an important cooling equipment in the plant process. Operating sensor data measured at the plant over approximately three years were analyzed, together with the fault history of the actual process. Because a large number of sensors are installed in the MAB system, principal component analysis (PCA) was applied as a dimension reduction method when analyzing the collected sensor data. Before applying PCA, the collected sensor data were analyzed statistically and data features were extracted. The features were then labeled and classified according to normal and fault operating conditions. The analyzed features were converted into a diagnosis model by dimensional reduction with PCA and by a classification algorithm. Finally, to validate the diagnosis model, the actual failure signal that occurred in the plant was applied to the suggested method. As a result, signs of failure were diagnosed even before the failure occurred. This paper explains the case study of fault diagnosis for MAB equipment with the suggested method and its results.


Introduction
Large-scale systems, such as onshore or offshore plant process equipment, should operate with high reliability and availability, because failure-induced downtime significantly affects manufacturing activities. To maintain stable and robust operating conditions, a method for monitoring, diagnosing, and predicting critical anomalies and determining appropriate maintenance measures is necessary. This concept is generally known as prognostics and health management (PHM) [1,2]. PHM focuses not only on fault detection and diagnostics of components but also on degradation monitoring and failure prediction. Generally, PHM can be treated as a method for reducing the uncertainty of maintenance activities [2]. In large-scale plant systems, any damage can lead to serious consequences. In this respect, PHM is a very reasonable method for an industry with high-valued physical assets [3,4].
The maintenance strategy is the process of taking timely, appropriate actions, and making accurate logistics decisions based on outputs from diagnostics and prognostics, available resources, and operational demands [5].
In this study, the PHM procedure was first defined in order to apply this strategy concept to the current maintenance method. The suggested procedure was applied to analyze the operation data of actual plant process equipment. Based on the proposed strategy, this study seeks to establish a system that diagnoses the condition of the target equipment from the data of several sensors generated during plant operations. In general, data-based state diagnostics requires analyzing and monitoring the acquired data [6], selecting the parameters to be monitored, determining the inspection frequency, and establishing the criteria for diagnosis. Furthermore, it is also necessary to develop a decision method that selects a reasonable maintenance method [7,8].
The maintenance method usually requires many tasks, combining sensing and interpretation of environmental, operational, and performance-related parameters [7,9,10] for assessing the reliability of a system. It can be carried out by (1) gathering system status data; (2) monitoring the system conditions based on the gathered data; (3) making a diagnosis of the system status; (4) predicting the remaining useful life of the system; and (5) executing appropriate actions, such as repair, replacement, and disposal, based on reasonable decision making [7,11]. Figure 1 provides an overview of the general procedure for maintenance methods that monitor equipment status: data acquisition, data manipulation, system definition, application, and validation/maintenance, which can serve as a basis for the PHM procedure [10,11]. In this study, the diagnosis procedure was defined in five stages, from data gathering to health maintenance, as shown in Figure 1.
In this study, the suggested approach is used to analyze sensor data obtained from the plant process and apply it to the condition diagnosis of the process equipment.
The plant process is generally equipped with several pieces of rotary equipment, such as compressors, pumps, motors, and turbines, as well as heat exchangers, separators, synthesizers, valves, piping, and pressure vessels. In particular, rotary devices have potential failure modes such as misalignment between components, overheating, plugging, liquid leakage, abrasion, wear-out, and corrosion [12,13]. The technology of monitoring operating conditions and identifying the fault status by attaching sensors to each piece of equipment has a long history. It is very common maintenance practice to accumulate the time series signals, feature values, and images obtained from the sensors in the data acquisition system and to analyze each signal to determine fault indications [14,15]. One widely used method flags a possible failure when a failure indication value, extracted from the measured signal data, exceeds an upper limit or falls below a lower limit. In other words, an upper or a lower limit (threshold) is set for each signal, the equipment is checked when the measured signal leaves the threshold band, and the possibility of failure is diagnosed based on the failure history.
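The threshold rule described above can be sketched in a few lines. This is only an illustration: the function name `out_of_limits`, the readings, and the limit values are hypothetical and are not taken from the plant data.

```python
def out_of_limits(signal, lower, upper):
    """Return the indices of samples that fall outside the [lower, upper] band."""
    return [i for i, value in enumerate(signal) if value < lower or value > upper]


# Hypothetical temperature readings and limits, for illustration only.
readings = [71.2, 70.8, 72.5, 95.1, 71.9, 48.0]
alarms = out_of_limits(readings, lower=50.0, upper=90.0)
print(alarms)
```

Such a rule checks each signal in isolation, which is the limitation that motivates the multivariate approach used in this study.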
Rotary machines are among the most commonly monitored pieces of equipment and are subject to various faults (e.g., vibration, shaft misalignment, bearing breakage, and overheating) [16]. Failure detection in rotary machines is possible by analyzing time series data of temperature, acceleration, displacement, stress, lubrication status, etc. Generally, a signal processing method, such as the wavelet transform or the Hilbert-Huang transform, can detect fault signals through time series signal analysis [17,18]. To apply these signal processing methods, the data collection frequency should be 10 kHz or more, but in practice the storage capacity of the sensor data acquisition device cannot accommodate the amount of data gathered at kHz frequencies. For this reason, in this study, sensor data of the onshore plant were collected at a frequency of less than 1/min and used to analyze data patterns through the statistical features of the data. Data patterns and statistical processing of error signals are suitable for overcoming the limitations of time series data and performing error diagnostics [18].
In this study, the diagnostic methodology was applied to a main air blower (MAB) unit operated in the fluid catalytic cracking (FCC) process. The main air blower is a major piece of equipment used for air circulation in the FCC process in petrochemical plants, and a failure can cause significant losses throughout the process. Stable operation of the air blower greatly influences the safe production of the entire FCC process [19,20]. Operation data (sensor data) obtained for the MAB equipment in the FCC process were collected and applied to the diagnosis method. The collected sensor signals were labeled based on the failure history of the MAB equipment, and the normal and failure conditions were patterned and classified, respectively. The statistical characteristics of the time series signals were used to overcome the limits of signals measured at a low sampling frequency. The statistical features derived from the classified data were used as representative values for each operating state. Once the data patterns of normal and failure conditions were defined, a dimensional reduction method was applied to simplify and establish the status criteria. The applied dimensional reduction method is principal component analysis (PCA), which represents multidimensional data in two dimensions using representative values. Then, a classifier-learning algorithm was applied to the data reduced by PCA to establish boundary criteria for each operating condition. The two-dimensional decision boundaries determined here are the criteria for determining the failure state of MABs proposed in this study. The data analysis process, the algorithm application process, and the results for the MAB equipment are described in detail in the next section. Figure 2 shows the detailed procedure for fault diagnostics applied in this study [21,22,23].
In the first step of fault diagnostics, it is necessary to obtain data under operating conditions. In general, time series data can be gathered from the various sensors attached to the target equipment during its operation. In the second step, each sensor signal is identified as representative of either the normal or the fault state, using the information on equipment failure, i.e., the type of failure mode [24,25]. Data features are then extracted from the sensor data by signal pre-processing. Among the statistical features, those with the greatest correlation with failure and abnormal status are selected to build the fault classification map. In this way, features potentially representing faults are extracted from the sensor data. After that, the dimension of the features is reduced and the fault classification map is built through several statistical methods, e.g., principal component analysis (PCA) and the Bayesian classifier [25,26,27]. In the classifier learning process, previous fault history information and normal operation status information are used as the learning data of the fault classifier in a supervised learning scheme. Finally, when sensor data for the current state are available, the learned classifier and fault classification map distinguish whether they represent a normal signal or a fault signal. If the current signal indicates a fault, the fault condition can be determined by the system. A more detailed procedure is as follows:

Detailed Procedure for Fault Diagnosis
Appl. Sci. 2021, 11, x FOR PEER REVIEW 4 of 15

Diagnosis System Definition and Sensor Data Acquisition
Fault diagnostics of plant equipment requires sensor data that show the characteristics of both the fault conditions and the normal operating conditions of that equipment. To analyze the data generated by the target equipment, a unit group of sensors must be defined, the list of sensors to be used for analysis must be set, and the signals of each sensor must be acquired. This step (i.e., defining the diagnosis system) is very important when building a diagnostic system because it is directly related to the maintenance strategy. For example, consider building a fault diagnosis system for two pieces of equipment with pressure (P), temperature (T), and flow (F) sensors attached in parallel, as shown in Figure 3. In such a case, it may be difficult to select a sensor list that determines the type of failure, and it may be necessary to manage unnecessary data that are not related to the failure diagnostics. It may then be efficient to configure and manage the equipment as one unit according to the operating conditions. Therefore, a list of sensors that can represent the status of the target system should be appropriately selected, excluding the unnecessary ones among the sensors included in the equipment. To this end, in this study, the Pearson correlation coefficient between each pair of sensor signals is calculated by Equation (1) to identify the sensors having a significant correlation; only the signals of these target sensors are acquired.

$$\rho_{x,y} = \frac{\mathrm{cov}(x, y)}{\sigma_x \sigma_y} \tag{1}$$

where $\rho_{x,y}$ is the Pearson correlation coefficient of the signals $x$ and $y$, and $\sigma$ and $\mu$ are the standard deviation and the mean value, respectively. Moreover, $\mathrm{cov}(x, y)$ is the covariance, as shown in Equation (2).

$$\mathrm{cov}(x, y) = E\left[(x - \mu_x)(y - \mu_y)\right] \tag{2}$$
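As a rough illustration of correlation-based sensor selection, the sketch below computes the Pearson matrix from the covariance and standard deviations and keeps the sensors that correlate strongly with at least one other sensor. The selection rule, the threshold of 0.7, the function names, and the synthetic data are all assumptions made for this example; the source does not state the exact criterion used to arrive at the selected sensors.

```python
import numpy as np


def pearson_matrix(data):
    """Pearson correlation between the columns of `data` (samples x sensors)."""
    centered = data - data.mean(axis=0)
    cov = centered.T @ centered / data.shape[0]   # covariance, population form
    sigma = data.std(axis=0)                      # population standard deviation
    return cov / np.outer(sigma, sigma)           # cov(x, y) / (sigma_x * sigma_y)


def select_sensors(data, threshold=0.7):
    """Indices of sensors whose |rho| with some other sensor reaches `threshold`.

    Both the rule and the default threshold are assumptions for illustration.
    """
    rho = np.abs(pearson_matrix(data))
    np.fill_diagonal(rho, 0.0)                    # ignore self-correlation
    return np.where(rho.max(axis=0) >= threshold)[0]


# Synthetic signals: sensor 1 tracks sensor 0, sensor 2 is unrelated.
rng = np.random.default_rng(0)
base = rng.normal(size=500)
data = np.column_stack([
    base,
    2.0 * base + 0.1 * rng.normal(size=500),
    rng.normal(size=500),
])
selected = select_sensors(data)
print(selected)
```

The hand-written matrix agrees with `np.corrcoef(data, rowvar=False)`, which could be used directly in practice.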

Pre-Processing
The time series data acquired from sensors are non-stationary and complex, and contain a large amount of primitive information about the operating conditions of the equipment. In general, the sensor data are mixed with nulls, missing values, or noise arising from various operating environments. Thus, data pre-processing is needed before features can be extracted from the sensor data. To this end, we divide the sensor data into regular intervals and replace the missing values in each interval using the mean and standard deviation of that interval.
To remove noise, data below $\mu - 5\sigma$ within an interval are regarded as sensor noise and replaced with the average of the data values in the interval. Moreover, since the data of each sensor have a different mean and variance, normalization using the standard score (Z-score) of Equation (3) or the min-max scaling method of Equation (4) is performed to moderate the influence of individual sensors.

$$z = \frac{x - \mu}{\sigma} \tag{3}$$

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{4}$$

where $x$ is a sensor value, $\mu$ and $\sigma$ are the mean and standard deviation of the sensor data, and $x_{\min}$ and $x_{\max}$ are its minimum and maximum values.
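The pre-processing steps can be sketched as follows for a single interval of one sensor. Two simplifications are assumed here and should be read as such: missing values are filled with the interval mean only, and the interval length and sample values are invented for the example.

```python
import numpy as np


def clean_interval(x):
    """Fill missing values and replace low outliers within one interval."""
    x = np.asarray(x, dtype=float).copy()
    x[np.isnan(x)] = np.nanmean(x)        # assumption: fill gaps with the interval mean
    mu, sigma = x.mean(), x.std()
    x[x < mu - 5.0 * sigma] = mu          # values below mu - 5*sigma are treated as noise
    return x


def z_score(x):
    """Standard score: zero mean and unit standard deviation."""
    return (x - x.mean()) / x.std()


def min_max(x):
    """Min-max scaling to the range [0, 1]."""
    return (x - x.min()) / (x.max() - x.min())


# Hypothetical interval: steady signal with one missing sample and one dropout.
raw = np.full(60, 10.0)
raw[7] = np.nan
raw[20] = 0.0
cleaned = clean_interval(raw)
z = z_score(cleaned)
scaled = min_max(cleaned)
```

Both scaling functions assume that the interval is not constant; a constant signal would need a guard against division by zero.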

Feature Extraction
Significant features were extracted from the collected sensor data to represent each operating state. Because the raw sensor data are difficult to use as they are, the feature extraction process requires selecting appropriate features that characterize the sensor data so that they can be applied to the diagnosis methods. To minimize both the number of features used and the classification errors, statistical features typically applied to fault analysis were reviewed [27]. Furthermore, since the sensor data in this study were collected at 1/min intervals, it was difficult to apply frequency-domain analysis methods. Therefore, various parameters were reviewed using statistical analysis methods. As a result, a total of six features suitable for sensor data analysis of the MAB equipment were selected, as shown in Table 1.
In Table 1, X is original sensor data and µ is the mean value of the sensor data.

[Table 1. Columns: Feature, Index, Description. First entry: Root mean square.]

Dimensionality Reduction
To simplify the features of sensor data that take a high-dimensional vector form, a dimension reduction process is required. Dimensional reduction is a data processing method that represents the data with a minimal number of dimensions while preserving their fundamental properties. PCA is a method applied to sensor data analysis and to the processing of multivariate data in order to reduce the data dimensions [26,27]. In this study, to minimize the dimensionality of the features and classify the fault conditions clearly, the two principal components with the largest eigenvalues are selected, as shown in Figure 4. The principal components can be represented by the following Equation (5).

$$PC_i = a_1 X_1 + a_2 X_2 + \cdots + a_n X_n \tag{5}$$

where $PC_i$ denotes principal component $i$, and $X_n$ and $a_n$ represent the original feature $n$ and the numerical coefficient for $X_n$, respectively.
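A minimal sketch of reducing a feature matrix to its two leading principal components is given below. It uses an eigendecomposition of the covariance matrix of mean-centered features; the centering step, the function name `pca_2d`, and the synthetic six-feature data are assumptions for illustration rather than details taken from the study.

```python
import numpy as np


def pca_2d(features):
    """Project the rows of `features` (samples x features) onto the top two PCs."""
    centered = features - features.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:2]      # indices of the two largest
    components = eigvecs[:, order]             # each column holds the coefficients a_n
    scores = centered @ components             # PC1 and PC2 values per sample
    return scores, components, eigvals[order]


# Synthetic feature matrix with clearly different variances per feature.
rng = np.random.default_rng(1)
features = rng.normal(size=(200, 6)) * np.array([5.0, 3.0, 1.0, 0.5, 0.2, 0.1])
scores, components, eigvals = pca_2d(features)
print(scores.shape, eigvals)
```

Each column of `components` plays the role of the coefficient vector in Equation (5), and the variance of each score column equals the corresponding eigenvalue.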


Classifier Learning
After the suitable features are identified, a classification process with a suitable classifier is needed to classify the sensor data patterned according to the state of each signal. The classifier was applied to distinguish the features of sensor data represented in the two-dimensional space. The classification process assigns the given data to a suitable pattern based on the criteria obtained by learning the sensor data of existing fault and normal operating states. Note that the dotted line on the right side of Figure 4 indicates the classifier. In this study, naive Bayesian classifiers were used. A naive Bayesian classifier assigns the most likely class to a given example described by its feature vector, as expressed in Equations (6) and (7). It places the decision boundary at the point where the classification error is minimized, given idealized probability density functions of the two data classes. The model is simplified by assuming that the features are independent of each other within each class [21].
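The classifier learning step can be illustrated with a small Gaussian naive Bayes implementation on two-dimensional points such as (PC1, PC2) pairs. The Gaussian form of the class-conditional densities, the class name `GaussianNaiveBayes`, and the synthetic clusters are assumptions for this sketch; the source states only that naive Bayesian classifiers were used.

```python
import numpy as np


class GaussianNaiveBayes:
    """Naive Bayes with one independent Gaussian per feature and class."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.mean_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        # A small constant keeps the variance strictly positive.
        self.var_ = np.array([X[y == c].var(axis=0) for c in self.classes_]) + 1e-9
        self.log_prior_ = np.log(np.array([(y == c).mean() for c in self.classes_]))
        return self

    def predict(self, X):
        # log p(c) + sum_j log N(x_j | mean_cj, var_cj); features treated as independent
        diff = X[:, None, :] - self.mean_[None, :, :]
        log_pdf = -0.5 * (np.log(2.0 * np.pi * self.var_)[None] + diff**2 / self.var_[None])
        scores = log_pdf.sum(axis=2) + self.log_prior_[None]
        return self.classes_[np.argmax(scores, axis=1)]


# Synthetic 2-D clusters standing in for normal and fault states.
rng = np.random.default_rng(2)
normal = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(100, 2))
fault = rng.normal(loc=[4.0, 3.0], scale=0.5, size=(100, 2))
X = np.vstack([normal, fault])
y = np.array(["normal"] * 100 + ["fault"] * 100)

model = GaussianNaiveBayes().fit(X, y)
pred = model.predict(np.array([[0.1, -0.2], [3.8, 3.1]]))
print(pred)
```

The decision boundary is the set of points where the two class scores are equal, which corresponds to the kind of dividing line indicated in Figure 4.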

Main Air Blower
In this study, we applied the proposed fault diagnostics procedure to the main air blower (MAB) in the fluid catalytic cracking (FCC) process, which is one of the plant processes. The MAB is a rotating device that blows air to the reaction tower in the FCC process; it is a major piece of equipment, and its failure can cause significant losses. Figure 5 shows a conceptual diagram of the MAB equipment. It is composed of various components, such as compressors and turbines. To monitor the condition of the equipment, a total of 236 sensors measuring temperature, pressure, flow rate, displacement, etc., are attached, as shown in Table 2.

Sensor Data Acquisition
Operational data related to the MAB equipment, covering about two years from May 2015 to August 2017, were analyzed to assess the operation status of the FCC process installed at a South Korean oil plant. A total of 236 sensors were associated with the MAB equipment; the types are shown in Table 2. Five MAB equipment failures occurred during the data collection period, and the operation information of the equipment (sensor data) during those failures was also included. The data from the 236 sensors were analyzed by Pearson correlation, and 23 sensors were selected to determine the actual operation status of the MAB. As shown in Table 2, the selected sensors are mainly temperature sensors and displacement sensors, which indicates that these are the typical sensors associated with the operating conditions of the MAB equipment. The history of the 23 selected sensors over the period of about two years is visualized together with the failure history information at the time of each failure in Figure 6. In Figure 6, the operation data marked as "Normal State" represent periods of normal operation. The sensor data for each failure condition ('oil level abnormal', 'filter plugging', 'steam turbine overheating', 'turbine overheating', 'fuel gas overheating') represent the labeled periods, based on when the actual failure occurred in the maintenance history of the MAB facility. However, as Figure 6 shows, the raw sensor history alone is of limited use in distinguishing the steady state from each failure condition.
Therefore, the data of the 23 selected sensors, collected over approximately two years, were labeled and classified in order to extract features and to establish the criteria for failure diagnosis. Figure 7 describes the labeling process of the classified sensor data for application to the fault diagnosis procedure. The data for December 2015 were set aside as test data because they contained both normal operation and failure conditions. All other labeled datasets were used as training data.
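The labeling and train/test split can be sketched as follows. The failure window, its dates, the generic label "fault", and the function names are hypothetical placeholders; in the study the labels come from the maintenance history of the MAB facility. Only the choice of December 2015 as the test period is taken from the text.

```python
from datetime import datetime

# Hypothetical failure window; real windows come from the maintenance history.
FAILURE_WINDOWS = [
    (datetime(2015, 12, 10), datetime(2015, 12, 12), "fault"),
]


def label(timestamp):
    """Return the failure label covering `timestamp`, or "normal" if none does."""
    for start, end, mode in FAILURE_WINDOWS:
        if start <= timestamp < end:
            return mode
    return "normal"


def is_test(timestamp):
    """December 2015 is held out as test data; everything else is training data."""
    return timestamp.year == 2015 and timestamp.month == 12


timestamps = [
    datetime(2015, 11, 30, 23),
    datetime(2015, 12, 5),
    datetime(2015, 12, 11),
    datetime(2016, 1, 2),
]
train = [(t, label(t)) for t in timestamps if not is_test(t)]
test = [(t, label(t)) for t in timestamps if is_test(t)]
```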

Feature Extraction of Sensor Data
To extract the features of the fault and normal operation signals during the MAB operation, the statistical feature values listed in Table 1 were extracted at intervals of 30 min. Figure 8a shows a split state for extracting features at regular time intervals from 23 sensor data histories. The extracted data features can be represented as time series data, with a total of 138 features for 23 sensors: 23 (sensors) times 6 (features). Figure 8b shows some of the extracted features as time series data.
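The windowed extraction can be sketched as below, producing 23 x 6 = 138 values per window. Only the root mean square is confirmed by the surviving part of Table 1; the other five statistics (mean, standard deviation, skewness, kurtosis, and peak-to-peak) are stand-ins assumed for illustration, as is the rate of one sample per minute, which gives 30 samples per 30 min window.

```python
import numpy as np


def window_features(window):
    """Six statistics per sensor for one window (samples x sensors)."""
    mu = window.mean(axis=0)
    sigma = window.std(axis=0)
    centered = window - mu
    rms = np.sqrt((window**2).mean(axis=0))            # root mean square
    # The five statistics below are assumed stand-ins, not the study's list.
    skew = (centered**3).mean(axis=0) / sigma**3
    kurt = (centered**4).mean(axis=0) / sigma**4
    peak_to_peak = window.max(axis=0) - window.min(axis=0)
    return np.concatenate([rms, mu, sigma, skew, kurt, peak_to_peak])


def extract(data, window_len=30):
    """Split `data` into consecutive windows and stack their feature vectors."""
    n_windows = data.shape[0] // window_len
    return np.array([
        window_features(data[i * window_len:(i + 1) * window_len])
        for i in range(n_windows)
    ])


# Synthetic two hours of data for 23 sensors at an assumed 1 sample per minute.
rng = np.random.default_rng(3)
data = rng.normal(loc=50.0, scale=2.0, size=(120, 23))
features = extract(data)
print(features.shape)
```

Each row of `features` is one point in the 138-dimensional space that is later reduced by PCA.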

The analysis results in Figure 10 show that the normal state and the filter plugging failure state have very similar data patterns. In the case of turbine overheating, the data resemble the normal state only in PC1. The other failures, such as abnormal oil level, fuel gas overheating, and steam turbine overheating, show relatively low similarity to the normal state in both PC1 and PC2.
This result indicates that, in the case of filter plugging failure, it is difficult to distinguish the steady state from the fault condition in the sensor data, even in actual operation. Therefore, to separate the operating states, and in particular the normal state from filter plugging, a naive Bayes classifier was applied to establish the decision boundary for each operating state. With these decision boundaries, a probability model was fitted to the distribution of each data cluster, so that each state is represented as a probability distribution.
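A naive Bayes classifier on the two principal component scores can be sketched as below. The Gaussian form of the per-class likelihood is an assumption, since the text names only a naive Bayes classifier; the function names, class labels, and training points are hypothetical and synthetic.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(points, labels):
    """Estimate a prior and per-dimension mean and variance for every class."""
    grouped = defaultdict(list)
    for point, label in zip(points, labels):
        grouped[label].append(point)
    model = {}
    for label, pts in grouped.items():
        dims = list(zip(*pts))  # one tuple of values per dimension (PC1, PC2)
        means = [sum(d) / len(d) for d in dims]
        # A small constant keeps the variance positive for degenerate clusters.
        variances = [sum((x - m) ** 2 for x in d) / len(d) + 1e-9
                     for d, m in zip(dims, means)]
        model[label] = (len(pts) / len(points), means, variances)
    return model

def predict_proba(model, point):
    """Posterior probability of each class for one (PC1, PC2) point."""
    log_post = {}
    for label, (prior, means, variances) in model.items():
        total = math.log(prior)
        for x, m, v in zip(point, means, variances):
            # Naive assumption: dimensions are independent given the class.
            total += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        log_post[label] = total
    top = max(log_post.values())  # subtract the maximum for numerical stability
    unnorm = {label: math.exp(v - top) for label, v in log_post.items()}
    norm = sum(unnorm.values())
    return {label: u / norm for label, u in unnorm.items()}

# Synthetic PC scores for two operating states (not the plant data).
normal = [[0.0, 0.0], [0.2, 0.1], [-0.1, 0.2], [0.1, -0.2]]
plugging = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.2], [1.1, 0.8]]
model = fit_gaussian_nb(normal + plugging,
                        ["normal"] * len(normal) + ["filter_plugging"] * len(plugging))
print(predict_proba(model, [0.05, 0.0]))
```

Because the classifier returns a probability for every state rather than a single label, its output can be tracked over time, which is how the diagnosis is evaluated in the verification step.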

Verification of the Diagnosis
To verify the proposed fault diagnosis procedure, test data containing a fault state, indicated by the red arrow in Figure 7, were used. The test data cover one month (December 2015) and include an initial normal state of about 10 days followed by a filter plugging condition of about 20 days. At the end of this period, the facility suddenly stopped due to a filter plugging failure of the actual MAB equipment. The filter plugging failure was selected as the test case because it provided the most suitable data set from an actual failure during operation of the MAB equipment. Moreover, as shown in Figure 10, the normal state and the filter plugging failure state are the most difficult pair for which to set decision criteria from sensor data alone. Therefore, in this validation phase, the one-month test data were used to verify whether the filter plugging condition could be detected.
The test data were in a normal state at the beginning and gradually transitioned to the fault state ('filter plugging state'). Figure 11 shows the result of applying the proposed procedure to the one-month test data, compared with the operating state criteria of the MAB equipment presented in Figure 10. The test results appear in both the normal state and the filter plugging fault regions, as shown in Figure 11, because the test data gradually change from the normal state to the filter plugging failure state.

Since the test data are initially distributed in the normal state and then gradually move toward the fault point, the normal state is diagnosed at approximately 95% at the beginning of the test data, as shown in Figure 12. At the last point of the test data, the 'filter plugging state' fault mode is diagnosed at more than 80%. The one-month test data were analyzed at each time step, and the normal operation status was clearly determined up to 5 December 2015, as shown in Figure 12. After the 10th day, however, the probability of the normal condition decreased rapidly to less than 50%, and abnormal signs of filter plugging failure appeared.
After the 15th day, the probability of filter plugging failure rose rapidly to more than 70 to 80%, and the decision value for the normal condition decreased. Therefore, it can be confirmed that abnormalities could be identified about 15 days prior to 30 December, when the actual plant process recognized the failure of the MAB equipment.
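The way a lead time is read from a probability history such as Figure 12 can be sketched as follows. The trace below is entirely made up for illustration and is not the plant data; the 50% threshold, the daily averaging over 48 windows of 30 min, and the function names are assumptions.

```python
def daily_mean(probabilities, windows_per_day):
    """Average per-window probabilities over each complete day."""
    n_days = len(probabilities) // windows_per_day
    return [
        sum(probabilities[d * windows_per_day:(d + 1) * windows_per_day]) / windows_per_day
        for d in range(n_days)
    ]

def first_day_below(daily_probs, threshold):
    """1-based index of the first day whose mean probability is below threshold, else None."""
    for day, p in enumerate(daily_probs, start=1):
        if p < threshold:
            return day
    return None

# Synthetic P(normal) trace, NOT the plant data: flat for 10 days, then a linear decline.
WINDOWS_PER_DAY = 48  # 30-min windows
SHUTDOWN_DAY = 30
trace = []
for day in range(1, SHUTDOWN_DAY + 1):
    p_normal = 0.95 if day <= 10 else max(0.05, 0.95 - 0.1 * (day - 10))
    trace.extend([p_normal] * WINDOWS_PER_DAY)

daily = daily_mean(trace, WINDOWS_PER_DAY)
alarm_day = first_day_below(daily, threshold=0.5)
lead_time_days = SHUTDOWN_DAY - alarm_day  # days between the alarm and the shutdown
print(alarm_day, lead_time_days)
```

The threshold and the averaging window trade early warning against false alarms, so in practice they would need to be tuned on additional failure histories.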

Conclusions
This paper proposed a diagnostic process for applying a PHM strategy. Recently, those in charge of process equipment that requires high safety, such as plant equipment, have become very interested in applying PHM technology to maintenance. As a case study of such technology, this study used sensor data from the operation of MAB equipment in a plant process. Sensor data collected from the operational information system of a plant can serve as major decision parameters for fault diagnosis in complex systems.
To diagnose the operating conditions of the MAB equipment, the collected sensor data were processed through data analysis and classification, feature extraction, dimension reduction using PCA, and classification. A total of 236 sensors were initially selected from the operation information system, and the 23 sensors that best represented the normal and failure conditions of the MAB equipment were then selected through Pearson correlation analysis. Statistical features were extracted from the 23 types of sensor data at regular time intervals. The operating states of the MAB equipment were then visualized in a two-dimensional plane through dimension reduction by PCA. As a result, decision criteria were prepared for diagnosing the operating status of the MAB equipment.
To validate the trained diagnostic process, sensor data for approximately one month (December 2015), including the normal state and the filter plugging failure state, were applied as test data. As a result, the filter plugging failure of the MAB equipment, which was confirmed around 30 December 2015, was detected about 15 days before the failure and was determined to be a complete failure about 10 days before it.
These results indicate that the diagnostic process proposed in this study is valid. However, since validation with a single test result is limited, the validation process and updates of the training model will need to continue with appropriate datasets, such as various failure histories. Nevertheless, the proposed process could be a potential direction for future maintenance strategies for the MAB equipment in the plant process presented in this study.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available because they are only partially available for research.