Article

Transient-State Fault Detection System Based on Principal Component Analysis for Distillation Columns

by
Gregorio Moreno-Sotelo
,
Adriana del Carmen Téllez-Anguiano
*,†,
Mario Heras-Cervantes
,
Ricardo Martínez-Parrales
and
Gerardo Marx Chávez-Campos
DEPI, TecNM, Instituto Tecnológico de Morelia, Av. Tecnológico No. 1500, Col. Lomas de Santiaguito, Morelia 58120, Michoacán, Mexico
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Mathematics 2025, 13(11), 1747; https://doi.org/10.3390/math13111747
Submission received: 21 April 2025 / Revised: 18 May 2025 / Accepted: 19 May 2025 / Published: 25 May 2025
(This article belongs to the Special Issue Control Theory and Computational Intelligence)

Abstract

This paper presents the design of a fault detection system (FDD) based on principal component analysis (PCA) to detect faults in the transient state of distillation processes. The FDD system detects faults due to changes in calorific power and pressure leaks that can occur while the mixture to be distilled is heated (the transient state), which mainly affect the quality of the distilled product and the safety of the process and operators. The proposed FDD system is based on PCA with a Hotelling T2 statistical approach, considering data from a real distillation pilot plant process. The FDD system is evaluated with two fault scenarios, performing power changes and pressure leaks in the pilot plant reboiler during the transient state. Finally, the results of the FDD system are analyzed using Accuracy, Precision, Recall, and Specificity metrics to validate its performance.

1. Introduction

Fault detection in processes is a topic of ongoing scientific interest. For a long time, this task depended only on the operator's experience; as modern processes grew in complexity, the number of variables to be monitored exceeded that experience, and plants ended up incurring higher costs [1].
The evolution of process automation has transformed the industry, allowing the systematic conversion of natural resources into final products without constant human supervision.
A key method that can help ensure that processes operate safely is the application of the efficient and continuous monitoring of variables, which, in conjunction with a fault detection system, can increase reliability and ensure proper operation in the different stages and process units [2,3,4].
Fault detection approaches can be classified as model-based or data-driven. Model-based approaches describe the physical and chemical background of the process, and the resulting models typically focus on ideal steady states [5]. However, mathematical models are computationally expensive and require precise knowledge of the model parameters, which can be difficult to obtain. Data-driven models, on the other hand, use data measured within the plant under normal and abnormal conditions and therefore describe the actual process behavior in the presence of faults; they are based on the idea that patterns and behaviors observed in the past can be used to establish a baseline for normal system operation [6].
In data-based fault detection, statistical techniques are used as monitoring methods [7,8]. Univariate and multivariate methods are useful for monitoring key measurements that help define the final quality of the process. The main difference lies in the number of variables analyzed to monitor a process [9].
Multivariate techniques are invaluable in industries with intricate and interdependent processes, such as chemical and petrochemical manufacturing, where a local failure can propagate through multiple variables, leading to consequences such as degradation, decreased productivity, and, in extreme cases, latent danger to the operator [10].
Conventional methods for monitoring multivariate processes include independent component analysis (ICA) [11], Partial Least Squares (PLS) [12], and principal component analysis (PCA) [13]; due to their capabilities of handling complex data and correlations, they are powerful tools when applied for this purpose.
ICA focuses on separating independent signals within correlated data; it is useful in processes where variables are mixed, and it is required to identify independent sources of variation, such as in the analysis of signals or noise in complex systems.
In [14], the authors propose a novel scheme based on the estimation of ICA model parameters at each decomposition depth, where the effectiveness of the proposed FD (fault detection) strategy, based on multi-scale independent component analysis (MSICA), is illustrated through three case studies: a multivariate dynamic process, a quadruple tank process, and a distillation column process. The results indicate that the proposed MSICA FD strategy can improve performance for higher levels of noise in the data.
However, ICA assumes that the data sources are statistically independent, which may not be true in industrial systems where variables are correlated. In contrast, PCA only requires linearly correlated data [15].
On the other hand, PLS models relationships between predictor variables X and response variables Y. It is used to monitor processes where the input and output variables are highly correlated, allowing the prediction of the behavior of the process and the detection of deviations.
PCA transforms multivariate data into uncorrelated variables, called principal components, that encapsulate the systematic variations in the data. During normal operation, data points cluster tightly in the transformed space. Faults are assessed by monitoring these components and evaluating the relationships between the variables to determine when the behavior deviates from the norm [16,17].
In industrial environments, PCA stands out for its ability to identify deviations and its simplicity [18], even in the presence of noise. Additionally, PCA is excellent for reducing data dimensionality while preserving as much variability as possible—a feature that is very necessary in industrial systems due to the handling of enormous amounts of data. Moreover, PCA allows the observation space to be separated into two subspaces: one that captures the systematic trend of the process and another related to random noise [19,20].
Statistical measures such as Hotelling’s T 2 and the squared prediction error (SPE) statistic are useful for defining these decision thresholds, which indicate the occurrence of a failure.
The application of PCA in fault detection systems is the subject of active research. In the literature, applications related to the predictive maintenance of industrial induced-draft fans can be found, considering faults such as high vibration in the internal diameter of fans and complete failure of the fan-motor system [21]. An application example that focuses on successfully detecting, isolating, and estimating incipient failure in sensors can be found in [22].
In [23], PCA is combined with a stacked autoencoder (SAE) and applied to the Tennessee Eastman (TE) process to account for both the linear and nonlinear relationships between the variables. In [24], PCA is used in combination with a wavelet transform based on moving windows; through performance analysis based on T 2 , squared prediction error statistics, and contribution graphs, it is possible to detect sensor bias-type faults and process faults in a stirred tank reactor.
Although numerous applications related to PCA are reported in the literature, most of these investigations use simulated data and, in addition, mainly consider failures in the steady state of the process, ignoring its transient state [25,26,27].
Detecting faults during the initialization phase of a distillation column is critical to ensuring operational efficiency, safety, and product quality.
The startup period is inherently unstable, with dynamic transitions in temperature, pressure, and composition, making the system highly susceptible to deviations that can escalate into significant operational failures if not promptly addressed.
For instance, improper handling of vapor flow during startup can lead to flooding, weeping, or foaming, which severely compromise separation efficiency and may necessitate costly shutdowns. Early fault detection allows for corrective actions before these issues propagate, minimizing energy waste, reducing downtime, and preventing off-spec products.
Moreover, startups are energy-intensive; optimizing this phase through fault monitoring can lead to substantial energy savings, as highlighted by studies on hybrid model-based control systems that reduce boiler heat duty during transient states.
In this work, a fault detection scheme for a binary ethanol–water batch distillation column is developed. PCA is used, taking advantage of its dimensionality reduction capability, to monitor the process performance over time and verify that the process remains within a normal operating control state. The Hotelling T 2 statistic is used as the process monitoring method. The resulting models account for the variation that the process variables exhibit under normal conditions, which is essentially unavoidable in the current process.

2. Methodology

2.1. Case Study: Distillation Column

A distillation column consists of a condenser, a boiler, and the column body, which consists of n-2 perforated plates. The actuator in the boiler provides the heat necessary to evaporate the liquid mixture that it contains. As the vapor rises through the plates of the column body, it is enriched with the light element (the element with the lowest boiling point in the mixture). The vapor that reaches the condenser condenses and, depending on the state of the reflux actuator, is extracted as a distilled product or re-enters the column. The liquid that re-enters through the reflux descends due to gravity within the column body, becoming enriched with the heavy element (the element with the highest boiling point).
Figure 1 shows a simplified diagram as well as a photograph of the distillation pilot plant considered in this case study, composed of 11 perforated plates, 7 of which have RTD (Pt100) temperature sensors: the plate located in the condenser (plate 1); plates 2, 4, 6, 8, and 10; and the boiler (plate 11). It also has 2 actuators, 1 in the boiler and 1 in the condenser. The actuator in the boiler is an adjustable direct-current voltage source that feeds the heating resistance inside the boiler tank. The actuator in the condenser is an open/close valve.

Variable Selection Strategy for Training Data

An ethanol–water mixture in equal proportions (1 L) was used, within an operating range of 158 to 160 W of electrical power in the heating resistance. The process data were acquired through a local interface and stored as CSV files; each file contained data covering the transient state and part of the steady state of the process.
Every file had 5000 samples, and the contained variables were as follows: condenser temperature, plate 2 temperature, plate 4 temperature, plate 6 temperature, plate 8 temperature, plate 10 temperature, boiler temperature, room temperature, DC voltage, electrical current, and heating power.
Data corresponding to the normal operation of the process, such as those indicated in Figure 2, were considered.
The transient state analysis was categorized into two phases, low- and high-transient, as illustrated in Figure 3. This classification was based on the observed thermal behavior of the system, where temperature trends correlate with vapor accumulation in the plate. Specifically, as the vapor content increases, the plate temperature exhibits a proportional increase.
The terms “low-transient” and “high-transient” were assigned to reflect these dynamics: the low-transient phase corresponds to a state of minimal vapor input, resulting in a gradual temperature increment, while the high-transient phase aligns with elevated vapor levels, driving a more pronounced thermal increase. This division ensured a systematic evaluation of thermal responses under transient conditions.
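For illustration, the fragment below is a minimal sketch of one way such a low/high split could be automated from a plate-temperature trend; the steepest-rise criterion and the smoothing window are assumptions made for this example, not the authors' exact rule.

```python
# Illustrative sketch (not the authors' criterion): split a plate-temperature
# series into "low-transient" and "high-transient" phases at the point where the
# temperature rise rate is steepest, which roughly marks the onset of vapor arrival.
import numpy as np

def split_transient(temp: np.ndarray, smooth: int = 50):
    """Return indices of the low- and high-transient sections of a temperature trend."""
    kernel = np.ones(smooth) / smooth
    temp_s = np.convolve(temp, kernel, mode="same")  # smooth to suppress sensor noise
    rate = np.gradient(temp_s)                       # rate of temperature increase
    split = int(np.argmax(rate))                     # steepest rise ~ vapor reaching the plate
    low_idx = np.arange(0, split)                    # gradual heating: low-transient phase
    high_idx = np.arange(split, len(temp))           # pronounced increase: high-transient phase
    return low_idx, high_idx
```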
The data for the training set were obtained by selecting samples within the variables of interest, named according to the following nomenclature:
  • $T_{TL_0}$ — Initial transient temperature;
  • $T_{TL_1}$ — Low-transient sample 1 temperature;
  • $T_{TL_2}$ — Low-transient sample 2 temperature;
  • …
  • $T_{TL_n}$ — Low-transient sample n temperature;
  • $T_{TH_0}$ — High-transient sample 0 temperature;
  • $T_{TH_1}$ — High-transient sample 1 temperature;
  • $T_{TH_2}$ — High-transient sample 2 temperature;
  • …
  • $T_{TH_n}$ — Final transient sample n temperature;
  • $Pot$ — Heating power.
where n indicates the number of samples ( n = 30 for this analysis).
Figure 4 shows an example of sample selection within the low and high transients for the temperature of plate 4.
The data were pre-processed to remove null and outlier values from the dataset and standardized using Python 3.11.4.
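The following Python fragment is a minimal pre-processing sketch consistent with this description; the assumption that all CSV columns are numeric and the 3-sigma outlier rule are illustrative choices, not details taken from the authors' code.

```python
# Minimal pre-processing sketch: load an acquisition file, remove null values and
# outliers, and standardize the variables as in Equation (1).
import pandas as pd

def load_and_standardize(csv_path: str) -> pd.DataFrame:
    df = pd.read_csv(csv_path)              # one 5000-sample acquisition file
    df = df.dropna()                        # remove null values
    z = (df - df.mean()) / df.std()         # column-wise z-scores
    df = df[(z.abs() <= 3).all(axis=1)]     # drop rows containing outliers (3-sigma rule)
    return (df - df.mean()) / df.std()      # standardized data
```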

2.2. PCA

PCA was proposed by Pearson in 1901 to reduce data dimensionality while preserving variance [28]. Hotelling [29] extended Pearson’s work and formalized PCA in more rigorous mathematical terms, introducing the use of covariance and correlation matrices to calculate the principal components.
By projecting process variables onto a lower-dimensional subspace, PCA reveals the inherent cross-correlation between process variables. PCA uses an orthogonal transformation to convert a sample set of possibly correlated variables into a set of linearly and statistically uncorrelated variable values called principal components [30].
In this sense, PCA latent variables, or principal components (PCs) (also called scores), are the directions in which the data have the largest variances and capture most of the information content of the data, as shown in Figure 5. Mathematically, they correspond to the eigenvectors associated with the largest eigenvalues in the autocorrelation matrix of the data vectors [31].
PCA-based methods are frequently applied in data compression [32], pattern recognition, data smoothing, classification [33], and fault detection [34].
PCA starts with an n × m matrix of observations X. The data are normalized using Equation (1) to avoid the effect of different magnitudes; the purpose of the usual scaling is to make the variances equal (i.e., to give standard units) [35].
$$X = \frac{x - \bar{x}}{\sigma} \qquad (1)$$
In Equation (1), X represents the normalized data, $\bar{x}$ the mean of the data x, and $\sigma$ the standard deviation of the data x.
To understand the relationship between variables and, thus, the values that represent how two variables change together, an analysis of covariance was performed. To obtain all possible covariance values between the different dimensions, the covariance matrix S (Equation (2)) was obtained.
$$S = \begin{bmatrix} s_1^2 & s_{12} & \cdots & s_{1p} \\ s_{12} & s_2^2 & \cdots & s_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ s_{1p} & s_{2p} & \cdots & s_p^2 \end{bmatrix} \qquad (2)$$
In the matrix S, $s_i^2$ is the variance of the i-th variable, and $s_{ij}$ is the covariance between the i-th and j-th variables. If the covariance is not equal to zero, there is a linear relationship between these two variables, and the strength of that relationship is represented by the correlation coefficient (Equation (3)).
$$r_{ij} = \frac{s_{ij}}{s_i s_j} \qquad (3)$$
The covariance s i j is calculated using Equation (4).
$$s_{ij} = \frac{n \sum_{k=1}^{n} x_{ik} x_{jk} - \sum_{k=1}^{n} x_{ik} \sum_{k=1}^{n} x_{jk}}{n(n-1)} \qquad (4)$$
If the covariance is positive, both variables tend to increase or decrease together; if it is negative, one increases while the other decreases.
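As a short sketch of Equations (2)–(4), the fragment below computes the sample covariance and the corresponding correlation matrix for a standardized observation matrix X (rows are observations, columns are variables); the function name is illustrative.

```python
# Sample covariance (Eq. (2)/(4)) and correlation (Eq. (3)) of an observation matrix.
import numpy as np

def covariance_and_correlation(X: np.ndarray):
    S = np.cov(X, rowvar=False)        # covariance matrix S, columns are variables
    s = np.sqrt(np.diag(S))            # standard deviations s_i
    R = S / np.outer(s, s)             # correlation coefficients r_ij = s_ij / (s_i s_j)
    return S, R
```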
PCA is based on a key result of matrix algebra: a symmetric, non-singular p × p matrix A, such as the covariance matrix S, can be reduced to a diagonal matrix L by premultiplying and postmultiplying it by a particular orthonormal matrix U, such that Equation (5) is obtained.
$$U' S U = L \qquad (5)$$
The diagonal elements of L ($l_1, l_2, \ldots, l_p$), called characteristic roots, latent roots, or eigenvalues of S, indicate the amount of variance in the data captured by each principal component; they are arranged along the diagonal of L in decreasing magnitude. A high eigenvalue means that the corresponding principal component explains more of the variability in the original data.
The columns of U ($u_1, u_2, \ldots, u_p$), called characteristic vectors or eigenvectors of S, represent the directions of the new axes in the data space. These vectors indicate how to combine the original variables to obtain the principal components: each principal component is a linear combination of the original variables, and the eigenvectors define the coefficients. Important points about eigenvalues and eigenvectors are discussed in [36].
The characteristic roots can be obtained from the solution of the determinantal equation (6), called the characteristic equation:
$$|S - lI| = 0 \qquad (6)$$
where I is the identity matrix. This equation produces a polynomial of degree p from which the values $l_1, l_2, \ldots, l_p$ are obtained. Then, the characteristic vectors can be obtained by solving Equations (7) and (8):
$$[S - l_i I]\, t_i = 0 \qquad (7)$$
$$u_i = \frac{t_i}{\sqrt{t_i' t_i}} \qquad (8)$$
The characteristic vectors form the columns of the matrix U, shown in Equation (9), which is orthonormal.
$$U = [\, u_1 \;\; u_2 \;\; \cdots \;\; u_p \,] \qquad (9)$$
The principal axis transformation transforms the p correlated variables $x_1, x_2, \ldots, x_p$ into p new uncorrelated variables $z_1, z_2, \ldots, z_p$. The coordinate axes of these new variables are described by the characteristic vectors $u_i$, which form the direction-cosine matrix U used in the transformation given by Equation (10):
$$z = U'(x - \bar{x}) \qquad (10)$$
where x and $\bar{x}$ are p × 1 vectors of observations on the original variables and their means. The transformed variables are the principal components (PCs) of X. The i-th principal component is $z_i = u_i'(x - \bar{x})$. The classical PCA algorithm can be seen in Figure 6.
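A compact sketch of the classical PCA algorithm of Figure 6 (center the data, form the covariance matrix, eigendecompose it, sort by decreasing eigenvalue, and project; Equations (5)–(10)) could read as follows; function and variable names are illustrative.

```python
# Classical PCA via eigendecomposition of the sample covariance matrix.
import numpy as np

def pca_fit_transform(X: np.ndarray, k: int):
    x_bar = X.mean(axis=0)
    Xc = X - x_bar                           # centered data, x - x_bar
    S = np.cov(Xc, rowvar=False)             # covariance matrix S
    eigvals, U = np.linalg.eigh(S)           # U'SU = L for symmetric S (Eq. (5))
    order = np.argsort(eigvals)[::-1]        # decreasing eigenvalue magnitude
    eigvals, U = eigvals[order], U[:, order]
    Z = Xc @ U[:, :k]                        # scores z = U'(x - x_bar), first k PCs (Eq. (10))
    return Z, U[:, :k], eigvals, x_bar
```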
The first principal components are sufficient to preserve the relevant information in the original data, according to the parameter k that determines the dimension of the extracted features; this parameter must satisfy $k < \min(m, n)$.
In the PCA algorithm, the eigenvectors corresponding to the first k eigenvalues of the sample covariance matrix are the orthogonal bases of the feature space. Generally, the variance contribution rate or explained variance refers to the proportion of the total variance in the original data captured by each principal component (Equation (11)).
$$R_i = \frac{\lambda_i}{\sum_{j=1}^{m} \lambda_j}, \qquad i = 1, 2, \ldots, k \qquad (11)$$
The parameter k should make the cumulative contribution rate greater than a threshold (usually 80–90%) [37], as shown in Equation (12). This value also represents the cumulative sum of the variance explained by the first principal components. This measure indicates how much of the total information from the original data has been retained when considering a specific set of principal components.
$$G(k) = \frac{\sum_{i=1}^{k} \lambda_i}{\sum_{j=1}^{m} \lambda_j} \qquad (12)$$
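A minimal sketch of this component-selection rule (Equations (11) and (12)) is shown below; it simply returns the smallest k whose cumulative explained variance exceeds a chosen threshold.

```python
# Per-component explained variance R_i and the smallest k with G(k) >= threshold.
import numpy as np

def select_components(eigvals: np.ndarray, threshold: float = 0.80) -> int:
    ratios = eigvals / eigvals.sum()         # R_i, Eq. (11)
    cumulative = np.cumsum(ratios)           # G(k), Eq. (12)
    return int(np.searchsorted(cumulative, threshold) + 1)
```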

2.3. Control Charts Based on Principal Components

The T 2 control chart combines information on the mean and dispersion of more than one variable [29].
Given a p × 1 vector x of measurements on p normally distributed variables with covariance matrix S, we can test whether the vector $\bar{x} = (x_1, x_2, \ldots, x_p)$ of the means of these variables is at its desired target $\bar{X}$ by computing the Hotelling statistic $T^2$.
For notational purposes, the i-th individual observation of the p characteristics of the reference sample is represented by the vector $X_i$:
$$X_i = [\, x_{i1} \;\; x_{i2} \;\; \cdots \;\; x_{ip} \,]$$
The estimated mean vector, whose components are the means of each feature, is
$$\bar{X}_m = [\, \bar{x}_1 \;\; \bar{x}_2 \;\; \cdots \;\; \bar{x}_p \,]$$
where $\bar{X}_m$ is obtained by Equation (13),
$$\bar{X}_m = \frac{1}{m} \sum_{i=1}^{m} X_i \qquad (13)$$
and the estimated covariance matrix is obtained by Equation (14).
$$S_m = \frac{1}{m-1} \sum_{i=1}^{m} (X_i - \bar{X}_m)(X_i - \bar{X}_m)' \qquad (14)$$
To construct a multivariate control chart based on Hotelling's $T^2$ statistic, for each observation the chart statistic given by Equation (15) is used:
$$T^2 = (x - \bar{x})' S^{-1} (x - \bar{x}) \qquad (15)$$
This statistic will be distributed as a central Chi-square distribution with q degrees of freedom if $\bar{x} = \bar{X}$. A multivariate Chi-square control chart can be constructed by plotting $T^2$ against time with an upper control limit (UCL) given in Equation (17),
$$UCL = \frac{(m-1)^2}{m} \cdot \frac{\big(p/(m-p-1)\big)\, F_{(\alpha/2);\, p,\, m-p-1}}{1 + \big(p/(m-p-1)\big)\, F_{(\alpha/2);\, p,\, m-p-1}} \qquad (17)$$
and a lower control limit (LCL) given in Equation (18) [38].
$$LCL = \frac{(m-1)^2}{m} \cdot \frac{\big(p/(m-p-1)\big)\, F_{(1-\alpha/2);\, p,\, m-p-1}}{1 + \big(p/(m-p-1)\big)\, F_{(1-\alpha/2);\, p,\, m-p-1}} \qquad (18)$$
where $F_{(\alpha);\, p,\, m-p-1}$ is the $1-\alpha$ percentile of the F distribution with p and $m-p-1$ degrees of freedom.
The relationship between the UCL and the data is determined by calculating the statistic T 2 . Each sample generates a value of T 2 , which measures the distance of the data from the multivariate mean of the process, and this calculation considers the covariance matrix used to evaluate the correlation between the variables.
It is also possible to establish that the distribution of the T 2 values, under normal conditions, follows a Fisher F distribution because the statistic T 2 is based on the relationship between the variability in the samples and the expected variability in the process, which establishes a control limit based on a predefined confidence level.
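A hedged sketch of the $T^2$ statistic and the control limits of Equations (15)–(18) for individual observations (Tracy, Young, and Mason [38]) is given below, using scipy's F quantiles; the exact numeric limits depend on the quantile convention adopted for $F_{(\alpha/2);\, p,\, m-p-1}$, so this is an illustration rather than a reproduction of the paper's values.

```python
# Hotelling T^2 for one observation and the beta/F-based control limits.
import numpy as np
from scipy.stats import f

def hotelling_t2(x, x_bar, S_inv):
    d = x - x_bar
    return float(d @ S_inv @ d)              # T^2 = (x - x_bar)' S^-1 (x - x_bar), Eq. (15)

def control_limits(m: int, p: int, alpha: float = 0.05):
    def limit(q):
        F = f.ppf(q, p, m - p - 1)           # F quantile with p and m-p-1 d.o.f.
        r = (p / (m - p - 1)) * F
        return ((m - 1) ** 2 / m) * r / (1 + r)
    ucl = limit(1 - alpha / 2)               # upper control limit, Eq. (17)
    lcl = limit(alpha / 2)                   # lower control limit, Eq. (18)
    return ucl, lcl
```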

2.4. Fault Detection with PCA- T 2

The main idea behind this method is to select the control limits whose function is to determine whether the monitored process is under statistical control. The UCL value defines the threshold at which multivariate observations are considered normal or within the expected limits of process variation. If an observation has a T 2 value that exceeds the UCL value, the process is outside the control limits. Figure 7 defines the methodology that describes the proposal for the PCA- T 2 FD.
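The end-to-end fragment below illustrates this PCA-$T^2$ detection logic, reusing the hypothetical helpers pca_fit_transform and control_limits sketched above; it is a sketch of the scheme in Figure 7 under those assumptions, not the authors' implementation.

```python
# PCA-T^2 detector sketch: fit on normal-operation data, then flag new samples.
import numpy as np

def train_detector(X_normal: np.ndarray, k: int = 2, alpha: float = 0.05):
    """X_normal: raw normal-operation observations (rows = observations)."""
    mu, sigma = X_normal.mean(axis=0), X_normal.std(axis=0)
    Xs = (X_normal - mu) / sigma                     # standardize, Eq. (1)
    Z, U_k, eigvals, _ = pca_fit_transform(Xs, k)    # scores on the first k PCs
    S_inv = np.linalg.inv(np.cov(Z, rowvar=False))   # score covariance for T^2
    ucl, _ = control_limits(m=X_normal.shape[0], p=k, alpha=alpha)
    return dict(mu=mu, sigma=sigma, U=U_k, S_inv=S_inv, ucl=ucl)

def is_faulty(x_new: np.ndarray, model: dict) -> bool:
    z = ((x_new - model["mu"]) / model["sigma"]) @ model["U"]  # project onto the PCs
    t2 = float(z @ model["S_inv"] @ z)               # Hotelling T^2 of the new sample
    return t2 > model["ucl"]                         # above the UCL -> fault flagged
```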

3. Results

This section presents the FD PCA-$T^2$ method, aimed at detecting faults in the transient state of a chemical process, specifically a distillation column. The matrix $X_{p2}$, used for the plate 2 analysis, contains the process variables; it is a 12 × 63 matrix. For reasons of space, only a portion of the resulting analysis matrices is shown.
$$X_{p2} = \begin{bmatrix}
PTO_0 & PTO_1 & PTO_2 & PTO_3 & \cdots & PTO_{59} & PTO_{60} & PTO_{61} & Pot \\
23.468 & 23.535 & 23.569 & 23.637 & \cdots & 71.278 & 71.793 & 72.033 & 158.073 \\
23.231 & 23.231 & 23.333 & 23.502 & \cdots & 71.484 & 71.827 & 72.067 & 158.073 \\
27.121 & 26.952 & 26.851 & 26.884 & \cdots & 74.744 & 74.984 & 75.156 & 158.073 \\
21.271 & 21.338 & 21.338 & 21.474 & \cdots & 72.788 & 72.856 & 72.925 & 158.073 \\
20.426 & 20.460 & 20.595 & 20.798 & \cdots & 71.861 & 72.444 & 72.959 & 161.150 \\
23.468 & 23.535 & 23.569 & 23.637 & \cdots & 71.278 & 71.793 & 72.033 & 158.073 \\
25.903 & 25.971 & 26.072 & 26.140 & \cdots & 71.381 & 71.793 & 72.136 & 157.534 \\
26.005 & 26.072 & 26.106 & 26.208 & \cdots & 71.141 & 71.587 & 71.861 & 157.487 \\
26.106 & 26.140 & 26.275 & 26.377 & \cdots & 72.033 & 72.136 & 72.307 & 158.181 \\
25.903 & 25.971 & 26.072 & 26.140 & \cdots & 71.107 & 71.655 & 71.964 & 157.534 \\
26.884 & 26.952 & 27.020 & 27.121 & \cdots & 71.141 & 71.964 & 72.342 & 158.466 \\
26.343 & 26.174 & 26.072 & 25.971 & \cdots & 70.798 & 71.827 & 72.513 & 158.945
\end{bmatrix}$$
where p in the matrix $X_p$ indicates the plate under analysis. For any plate p in the distillation column, this matrix is formed as
$$x_p = [\, n\ \text{samples (low transient)}, \quad n\ \text{samples (high transient)}, \quad \text{Power} \,]$$
where n indicates the number of samples considered for low- and high-transient sections corresponding to each temperature value.
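For illustration, a hypothetical helper that assembles this per-plate vector from the low- and high-transient indices (for example, those produced by the splitting sketch in Section 2.1) might look as follows; the uniform sub-sampling scheme is an assumption made for this example.

```python
# Illustrative construction of the per-plate analysis vector x_p.
import numpy as np

def build_plate_vector(temp: np.ndarray, power: float, low_idx, high_idx, n: int = 30):
    low = temp[np.linspace(low_idx[0], low_idx[-1], n, dtype=int)]    # n low-transient samples
    high = temp[np.linspace(high_idx[0], high_idx[-1], n, dtype=int)] # n high-transient samples
    return np.concatenate([low, high, [power]])                       # [low, high, Power]
```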
The vector of sample means for the analysis of plate 2 is indicated by $\bar{x}_{p2}$:
$$\bar{x}_{p2} = [\, 25.903 \;\; 25.971 \;\; 26.072 \;\; 26.055 \;\; \cdots \;\; 71.3300 \;\; 71.8274 \;\; 72.2219 \;\; 158.0735 \,]^T$$
The covariance matrix S p 2 resulting from the analysis data of plate 2 is of size 63 × 63.
$$S_{p2} = \begin{bmatrix}
1.2892 & 1.0567 & 1.0581 & \cdots & 0.1090 & 0.1014 & 0.5887 \\
1.0567 & 1.2872 & 1.0583 & \cdots & 0.0800 & 0.0716 & 0.5960 \\
1.0581 & 1.0583 & 1.292 & \cdots & 0.0591 & 0.0499 & 0.6033 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0.1090 & 0.0800 & 0.0591 & \cdots & 1.2008 & 0.9622 & 0.102 \\
0.1014 & 0.0716 & 0.0499 & \cdots & 0.9622 & 1.1967 & 0.1432 \\
0.5887 & 0.596 & 0.603 & \cdots & 0.1023 & 0.1432 & 0.7235
\end{bmatrix}$$
The eigenvector matrix U p 2 for plate 2 is defined as
$$U_{p2} = \begin{bmatrix}
0.1446 & 0.0543 & 0.1533 & 0.07214 & \cdots \\
0.1464 & 0.0513 & 0.1480 & 0.0810 & \cdots \\
0.1484 & 0.0495 & 0.1439 & 0.0872 & \cdots \\
0.1493 & 0.0496 & 0.1419 & 0.08727 & \cdots \\
0.1517 & 0.0456 & 0.1382 & 0.0847 & \cdots \\
0.1531 & 0.0454 & 0.1352 & 0.0827 & \cdots \\
0.0277 & 0.2349 & 0.0163 & 0.1828 & \cdots \\
0.0338 & 0.2197 & 0.0378 & 0.2306 & \cdots \\
0.0356 & 0.1829 & 0.0899 & 0.2718 & \cdots \\
0.0332 & 0.1494 & 0.1175 & 0.2996 & \cdots \\
0.0378 & 0.1295 & 0.1344 & 0.3064 & \cdots \\
0.0962 & 0.0765 & 0.0105 & 0.1463 & \cdots
\end{bmatrix}$$
The importance of each principal component resulting from the analysis was observed; the result is shown in Figure 8. Within the graph, the components are listed in order from highest (47.88%) to lowest (6.82%).
Figure 9 shows the variance explained according to the number of components selected for the analysis. The graph indicates that by only using the first and second components, 71.31% of the variance is explained.
Principal components that explain an acceptable cumulative percentage of variance are often selected. A commonly used value is between 70% and 90% of the explained variance. Too many components can overload the model with irrelevant details, while too few components can cause the loss of important information. This range allows for a balance between these extremes. In this work, only the first two principal components were used for the analysis, accounting for 71.31% of the explained variance.
The selection is validated using the matrix Loads$_{p2}$, which gives the variance in each variable explained by each component. From the load matrix, the squared cosine values are obtained.
$$\text{Loads}_{p2} = \begin{array}{l|cccc}
 & CP_1 & CP_2 & CP_3 & CP_4 \\
\hline
PTO_0 & 0.82968 & 0.21792 & 0.5568 & 0.1561 \\
PTO_1 & 0.84033 & 0.20593 & 0.5376 & 0.1754 \\
PTO_2 & 0.85143 & 0.19871 & 0.5226 & 0.1888 \\
PTO_3 & 0.85703 & 0.19939 & 0.5152 & 0.1889 \\
PTO_4 & 0.87031 & 0.18319 & 0.5018 & 0.1835 \\
PTO_5 & 0.87879 & 0.18257 & 0.4910 & 0.1792 \\
PTO_6 & 0.88374 & 0.17526 & 0.4852 & 0.1796 \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
PTO_{56} & 0.1015 & 0.9765 & 0.04033 & 0.3008 \\
PTO_{57} & 0.1591 & 0.9428 & 0.05945 & 0.3959 \\
PTO_{58} & 0.1944 & 0.8817 & 0.13728 & 0.4993 \\
PTO_{59} & 0.2042 & 0.7340 & 0.32649 & 0.5886 \\
PTO_{60} & 0.1905 & 0.5998 & 0.42685 & 0.6486 \\
PTO_{61} & 0.2169 & 0.5197 & 0.48817 & 0.6635 \\
POT & 0.5521 & 0.307 & 0.0381 & 0.3168
\end{array}$$
In Figure 10, the direction and importance of the original characteristics given by the first and second components are shown; within the graph, the magnitude of the vector is scaled 5 times for visualization purposes.
The first component describes each of the analysis variables for the low-transient state; on the other hand, the second component explains the analysis variables for the high-transient state, as shown in the matrix $\cos^2_{p2}$.
$$\cos^2_{p2} = \begin{array}{l|cccc}
 & CP_1 & CP_2 & CP_3 & CP_4 \\
\hline
PTO_0 & 68.8373 & 4.7491 & 31.0073 & 2.4397 \\
PTO_1 & 70.6155 & 4.2410 & 28.9042 & 3.0776 \\
PTO_2 & 72.4940 & 3.9487 & 27.3176 & 3.5648 \\
PTO_3 & 73.4515 & 3.9758 & 26.5497 & 3.5703 \\
PTO_4 & 75.7452 & 3.3559 & 25.1902 & 3.3703 \\
PTO_5 & 77.2282 & 3.3333 & 24.1145 & 3.2125 \\
PTO_6 & 78.1011 & 3.0717 & 23.5447 & 3.2265 \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
PTO_{55} & 1.0268 & 98.0731 & 0.0571 & 4.7623 \\
PTO_{56} & 1.0319 & 95.3692 & 0.1627 & 9.0524 \\
PTO_{57} & 2.5330 & 88.8979 & 0.3535 & 15.6796 \\
PTO_{58} & 3.7799 & 77.7478 & 1.8846 & 24.9353 \\
PTO_{59} & 4.1723 & 53.8899 & 10.6597 & 34.6493 \\
PTO_{60} & 3.6304 & 35.9816 & 18.2205 & 42.0792 \\
PTO_{61} & 4.7063 & 27.0090 & 23.8320 & 44.0294 \\
POT & 30.4826 & 9.4253 & 0.1459 & 10.0372
\end{array}$$
To define the reference threshold, statistical theory is taken into account. The idea is to map the transformed data to a univariate set using the Hotelling statistical parameter T 2 , and from this, establish a normal state threshold, based on the variability in the data.
This parameter represents the magnitude of each transformed data point, resulting in a univariate data set. Thus, it is possible to define the UCL control limits with a univariate process approach [38]. Assuming a β -type distribution, shown in Equation (15), with a confidence level of α = 0.05 , p = 2 , and m = 12 , the UCL threshold or upper control limit is given by
$$UCL = \frac{(m-1)^2}{m} \cdot \frac{\big(p/(m-p-1)\big)\, F_{(\alpha/2);\, p,\, m-p-1}}{1 + \big(p/(m-p-1)\big)\, F_{(\alpha/2);\, p,\, m-p-1}} = 5.09327$$
A fixed UCL provides a constant criterion for fault detection, avoiding dynamic adjustments that could cause false alarms or computational overload. If the UCL changes dynamically, it might respond to minor variations that do not represent actual faults, leading to unnecessary alerts. Since this research considered an operational range, a fixed limit allows for a more reliable assessment of significant deviations [39].
Figure 11 shows the data transformed into a control chart based on the T 2 statistic, the indicated region represents the normal state of the system within the UCL value.
For the classification process, a new sample must be standardized and then mapped onto the principal components obtained from the normal-state model; the Hotelling statistic is then calculated based on the covariance of the scores. If this value exceeds the UCL threshold, an abnormal condition is detected, which indicates a failure.
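As a usage illustration of this classification step, reusing the hypothetical train_detector and is_faulty helpers sketched in Section 2.4 (file names and data layout are assumptions):

```python
# Load normal-operation training data, fit the PCA-T^2 model, and classify one sample.
import numpy as np

X_normal = np.loadtxt("plate2_normal_training.csv", delimiter=",")  # hypothetical file
model = train_detector(X_normal, k=2, alpha=0.05)

x_new = np.loadtxt("plate2_fault_case.csv", delimiter=",")          # one observation vector
print("fault detected" if is_faulty(x_new, model) else "normal operation")
```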

Fault Scenarios

This study consisted of causing faults during the start-up of the process, such as pressure leaks and changes in the heating power (for this experiment, the start-up is considered to be the transient state of the process).
Figure 12 shows a failure due to a heating power increase during the transient state, starting at sample 1500 and finishing at sample 2000.
Under this failure, the temperature initially remains constant and does not reflect immediate changes. The effects of this fault appear in the range of samples 2010 to 2800; this delay changes depending on the magnitude of the fault.
Figure 13 shows a failure due to a pressure leak during the start-up of the process; the failures occurred between samples 1000 and 1500 and between samples 2500 and 3000.
To validate the FD-PCA T 2 model, files containing fault conditions were used, from which a vector X F a u l t was generated with the analysis data.
$$X_{Fault} = [\, T_{TL_0}, T_{TL_1}, \ldots, T_{TL_N}, T_{TH_0}, T_{TH_1}, \ldots, T_{TH_N}, Pot \,]$$
The data of the vector X F a u l t were standardized and projected onto the reduced basis of the principal components using Equation (20).
$$Y_p = x\, C_{p_{1,2}} \qquad (20)$$
where x represents the normalized $X_{Fault}$ vector data and $C_{p_{1,2}}$ the first two principal components.
Figure 14 shows an example of data to validate the model in failure due to a change in the heating power. The effect of this fault extends to sample 3196 in the boiler. This behavior is also visible at other plate temperatures, but with a different time extension. For example, for plate 2, the effect extends to sample 3850.
Figure 15 presents the result of FD-PCA T 2 ; the parameter T p 2 is greater than the normal operating UCL for the temperature trend of plate 2.
The results of FD-PCA T 2 for plates 2 and 6 and the condenser are shown in Figure 16 in this order.
The blue, green, and violet points indicate normal behavior in the process; the failure of these plates is represented by the red, orange, and yellow points.
It is possible to observe that for plate 6 (orange), the statistic does not detect the fault due to its lower value compared with the UCL threshold.
To validate the model under a pressure leakage fault, the data observed in Figure 17 were used.
Unlike faults due to heating power, the effect of the pressure leak fault is proportional to the magnitude of the leak present, which has a greater impact on the plate where the leak occurs.
Figure 18 shows the result of FD-PCA T 2 evaluated with a pressure leak fault. The graph indicates that the parameter T p 2 is greater than the normal operating UCL for the temperature trend of plate 2.
The results of FD-PCA T 2 for plates 2 and 6 and the condenser are shown in Figure 19 in this order. Blue, green, and violet points indicate normal behavior; the faults are represented as red, orange, and yellow points. In the case of the condenser (yellow), the statistic does not detect the fault due to its lower value regarding the UCL threshold.
The variance explained by each model applied to each plate with a sensor in the distillation column, as well as the T p 2 Hotelling values of the models used in the T p 2 graph, are described in Table 1.
According to Table 1, with only two components the model for the boiler explains 97.68% of the information in the data used, compared to plate 2, whose model explains 71.31%. Within the analysis, plate 2 retains the least information when considering only two components.
Finally, to validate the model, Accuracy, Precision, Sensitivity/Recall, and Specificity metrics are used. These results are presented in Table 2.
These results suggest that the models perform well using only two principal components. Overall, the average accuracy is 0.8386, indicating correct classification of normal cases and element faults in 83.86% of cases.
Regarding precision, an average value of 0.9523 indicates correct fault detection in 95.23% of the evaluated cases where a fault was indicated. On the other hand, an average specificity of 0.9642 suggests that the models responded adequately in cases where a fault did not occur; in other words, the fault detection models were correct in normal operating cases 96.42% of the time. On average, a sensitivity (recall) of 0.8666 indicates that the models correctly detected faults in 86.66% of the actual fault cases. Table 2 shows that the best model was that for plate 4, with 100% for each metric.
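For reference, the metrics in Table 2 follow directly from the entries of a confusion matrix, as sketched below; the counts passed to the function are placeholders, not the authors' raw results.

```python
# Accuracy, Precision, Recall (sensitivity), Specificity, and F1 from confusion-matrix counts.
def validation_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                      # sensitivity
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }
```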

4. Discussion

It is of utmost importance to correctly identify the low- and high-transient sections; when they are defined correctly, there is a substantial improvement in fault detection.
Regarding the failure cases not detected by the FD-PCA T 2 , it was observed that most of them are cases masked by the subspace formed by the training data, for example, the case shown in Figure 20.
The fault is visible in the process temperature measurement trend due to a power increase from 158 W to 165 W in samples 2070 to 3000, but the effect is not strong enough to be detected as a fault.
This can be seen more clearly in Figure 21, where the analyzed transient (red line) falls within the region of transients used for training, which causes incorrect categorization by the FD-PCA T 2 .
It is worth mentioning that these graphs are based only on real observations, which makes them less sensitive to small changes. Furthermore, detection thresholds in PCA methods are typically derived under the assumption that the data follow a Gaussian distribution [13,40].

5. Conclusions

This paper presents a fault detection strategy that uses principal component analysis with Hotelling’s T 2 statistical distribution to construct a multivariate control chart. Fault detection is performed by observing a UCL value that defines the threshold value at which multivariate observations are considered normal. If an observation has a T 2 value that exceeds the UCL, this indicates a fault.
Real data from a distillation column are used to train and validate the resulting models. The results indicate good performance of the applied method, achieving correct fault detection in 95.23% of the evaluated cases in which a fault was indicated. However, considering all actual fault cases, the model performs reliable detection in 86.66% of them, suggesting that there is room for improvement in detecting some real faults that could be masked by the training data. This research demonstrates that the applied method can detect faults in the transient state of the process. The proposed strategy meets the essential criteria for a reliable fault monitoring scheme.
The PCA-T2 fault detection system can be applied in real time: once the process data under normal operating conditions are obtained, the X matrix is generated, the principal component analysis is performed, and the statistical control thresholds are calculated, so no recalculation is needed during fault detection. With the PCA model available, process monitoring and fault detection can be performed by computing the T2 statistic for each new reading from the monitored process; if any value exceeds the threshold, the alarm is triggered.
However, the results can be improved. For this experiment, a limited dataset was used, divided into two subsets (test and validation data). In addition, the data were obtained from a specific operating range because of the characteristics of the distillation pilot plant; considering more variables in the analysis, such as pressure, temperature, and concentration, would improve the results obtained, since correlations may exist between them. By including these relationships, more complex patterns can be identified and the precision of the diagnosis can be improved. Moreover, when correlated variables are considered, the system can better differentiate between normal variations and real failures, which helps reduce false alarms and avoids unnecessary interventions in the process. However, the redundancy of information should be considered.

Author Contributions

Conceptualization, G.M.-S. and A.d.C.T.-A.; methodology, M.H.-C.; validation, R.M.-P. and G.M.C.-C.; formal analysis, M.H.-C.; writing—original draft preparation, G.M.-S.; writing—review and editing, R.M.-P.; supervision, A.d.C.T.-A. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Secihti and TecNM.

Data Availability Statement

Data can be provided upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Kini, K.R.; Madakyaru, M. Performance evaluation of independent component analysis-based fault detection using measurements corrupted with noise. J. Control Autom. Electr. Syst. 2021, 32, 642–655.
  2. Park, Y.J.; Fan, S.K.S.; Hsu, C.Y. A review on fault detection and process diagnostics in industrial processes. Processes 2020, 8, 1123.
  3. Bi, X.; Qin, R.; Wu, D.; Zheng, S.; Zhao, J. One step forward for smart chemical process fault detection and diagnosis. Comput. Chem. Eng. 2022, 164, 107884.
  4. Webert, H.; Döß, T.; Kaupp, L.; Simons, S. Fault handling in industry 4.0: Definition, process and applications. Sensors 2022, 22, 2205.
  5. Albertos, P.; Goodwin, G.C. Virtual sensors for control applications. Annu. Rev. Control 2002, 26, 101–112.
  6. Yin, S.; Ding, S.X.; Xie, X.; Luo, H. A review on basic data-driven approaches for industrial process monitoring. IEEE Trans. Ind. Electron. 2014, 61, 6418–6428.
  7. Chen, Z.; O’Neill, Z.; Wen, J.; Pradhan, O.; Yang, T.; Lu, X.; Lin, G.; Miyata, S.; Lee, S.; Shen, C.; et al. A review of data-driven fault detection and diagnostics for building HVAC systems. Appl. Energy 2023, 339, 121030.
  8. Li, W.; Yue, H.H.; Valle-Cervantes, S.; Qin, S.J. Recursive PCA for adaptive process monitoring. J. Process. Control 2000, 10, 471–486.
  9. Kini, K.R.; Madakyaru, M.; Harrou, F.; Menon, M.K.; Sun, Y. Improved Fault Detection in Chemical Engineering Processes via Non-Parametric Kolmogorov–Smirnov-Based Monitoring Strategy. ChemEngineering 2023, 8, 1.
  10. Ge, Z.; Song, Z. Multivariate Statistical Process Control: Process Monitoring Methods and Applications; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012.
  11. Harrou, F.; Kini, K.R.; Madakyaru, M.; Sun, Y. Uncovering sensor faults in wind turbines: An improved multivariate statistical approach for condition monitoring using SCADA data. Sustain. Energy Grids Netw. 2023, 35, 101126.
  12. Harrou, F.; Sun, Y.; Madakyaru, M.; Bouyedou, B. An improved multivariate chart using partial least squares with continuous ranked probability score. IEEE Sensors J. 2018, 18, 6715–6726.
  13. Harrou, F.; Nounou, M.N.; Nounou, H.N.; Madakyaru, M. Statistical fault detection using PCA-based GLR hypothesis testing. J. Loss Prev. Process. Ind. 2013, 26, 129–139.
  14. Kini, K.R.; Madakyaru, M. Improved process monitoring scheme using multi-scale independent component analysis. Arab. J. Sci. Eng. 2022, 47, 5985–6000.
  15. Camacho, O.; Padilla, D.; Gouveia, J.L. Diagnóstico de fallas utilizando técnicas estadísticas multivariantes. Rev. Tec. Fac. Ing. Univ. Zulia 2007, 30, 253–262.
  16. Zamarrón, A.M.C.; Prado, E.M.; Luis, F.Z. Monitoreo y control de un proceso normal multivariado. Concienc. Tecnol. 2012, 29–35. Available online: https://www.redalyc.org/pdf/944/94424470005.pdf (accessed on 21 April 2025).
  17. Olmos-Zepeda, J.R.; Ramírez-Valverde, G. Evaluation of a multivariate control chart based on data depth for non-normal observations in the presence of autocorrelation. Ing. Investig. Tecnol. 2020, 21. Available online: https://www.revistaingenieria.unam.mx/numeros/v21n3-03.php (accessed on 21 April 2025).
  18. Jolliffe, I.T. Principal Component Analysis for Special Types of Data; Springer: Berlin/Heidelberg, Germany, 2002.
  19. Anaya-Isaza, A.J.; Peluffo-Ordoñez, D.H.; Alvarado-Pérez, J.C.; Ivan-Rios, J.; Castro-Silva, J.A.; Rosero-Montalvo, P.D.; Peña-Unigarro, D.F.; Umaquinga-Criollo, A.C. Estudio Comparativo de Métodos Espectrales para Reducción de la Dimensionalidad: LDA versus PCA. INCISCOS 2016. 2017. Available online: https://www.researchgate.net/publication/311450410_Estudio_comparativo_de_metodos_espectrales_para_reduccion_de_la_dimensionalidad_LDA_versus_PCA_Comparative_study_between_spectral_methods_for_dimension_reduction_LDA_versus_PCA (accessed on 21 April 2025).
  20. Garcia-Alvarez, D.; Fuente, M. Estudio comparativo de técnicas de detección de fallos basadas en el Análisis de Componentes Principales (PCA). Rev. Iberoam. Autom. Inform. Ind. RIAI 2011, 8, 182–195.
  21. Sarita, K.; Devarapalli, R.; Kumar, S.; Malik, H.; Garcia Marquez, F.P.; Rai, P. Principal component analysis technique for early fault detection. J. Intell. Fuzzy Syst. 2022, 42, 861–872.
  22. Chai, Y.; Tao, S.; Mao, W.; Zhang, K.; Zhu, Z. Online incipient fault diagnosis based on Kullback-Leibler divergence and recursive principle component analysis. Can. J. Chem. Eng. 2018, 96, 426–433.
  23. Li, J.; Yan, X. Process monitoring using principal component analysis and stacked autoencoder for linear and nonlinear coexisting industrial processes. J. Taiwan Inst. Chem. Eng. 2020, 112, 322–329.
  24. Nawaz, M.; Maulud, A.S.; Zabiri, H.; Suleman, H.; Tufa, L.D. Multiscale framework for real-time process monitoring of nonlinear chemical process systems. Ind. Eng. Chem. Res. 2020, 59, 18595–18606.
  25. Kazemi, P.; Masoumian, A.; Martin, P. Fault Detection and Isolation for Time-Varying Processes Using Neural-Based Principal Component Analysis. Processes 2024, 12, 1218.
  26. de Carvalho Michalski, M.A.; de Souza, G.F.M. Comparing PCA-based fault detection methods for dynamic processes with correlated and Non-Gaussian variables. Expert Syst. Appl. 2022, 207, 117989.
  27. Malluhi, B.; Nounou, H.; Nounou, M. Enhanced multiscale principal component analysis for improved sensor fault detection and isolation. Sensors 2022, 22, 5564.
  28. Pearson, K. LIII. On lines and planes of closest fit to systems of points in space. Lond. Edinb. Dublin Philos. Mag. J. Sci. 1901, 2, 559–572.
  29. Hotelling, H. Analysis of a complex of statistical variables into principal components. J. Educ. Psychol. 1933, 24, 417.
  30. Abdi, H.; Williams, L.J. Principal component analysis. Wiley Interdiscip. Rev. Comput. Stat. 2010, 2, 433–459.
  31. Kong, X.; Hu, C.; Duan, Z. Principal Component Analysis Networks and Algorithms; Springer: Berlin/Heidelberg, Germany, 2017.
  32. Park, S.; Lee, J.J.; Yun, C.B.; Inman, D.J. Electro-mechanical impedance-based wireless structural health monitoring using PCA-data compression and k-means clustering algorithms. J. Intell. Mater. Syst. Struct. 2008, 19, 509–520.
  33. Festa, D.; Novellino, A.; Hussain, E.; Bateson, L.; Casagli, N.; Confuorto, P.; Del Soldato, M.; Raspini, F. Unsupervised detection of InSAR time series patterns based on PCA and K-means clustering. Int. J. Appl. Earth Obs. Geoinf. 2023, 118, 103276.
  34. Wen, S.; Zhang, W.; Sun, Y.; Li, Z.; Huang, B.; Bian, S.; Zhao, L.; Wang, Y. An enhanced principal component analysis method with Savitzky–Golay filter and clustering algorithm for sensor fault detection and diagnosis. Appl. Energy 2023, 337, 120862.
  35. Jackson, J.E. A User’s Guide to Principal Components; John Wiley & Sons: Hoboken, NJ, USA, 2005.
  36. Zhang, X.D. Matrix Analysis and Applications; Cambridge University Press: Cambridge, UK, 2017.
  37. Zhao, H.; Lai, Z.; Leung, H.; Zhang, X. Feature Learning and Understanding; Springer: Cham, Switzerland, 2020.
  38. Tracy, N.D.; Young, J.C.; Mason, R.L. Multivariate control charts for individual observations. J. Qual. Technol. 1992, 24, 88–95.
  39. Montgomery, D.C. Introduction to Statistical Quality Control; John Wiley & Sons: Hoboken, NJ, USA, 2020.
  40. Kini, K.R.; Madakyaru, M.; Harrou, F.; Vatti, A.K.; Sun, Y. Robust Fault Detection in Monitoring Chemical Processes Using Multi-Scale PCA with KD Approach. ChemEngineering 2024, 8, 45.
Figure 1. Simplified diagram and photograph of the distillation pilot plant.
Figure 2. Representation of process temperature data used for training.
Figure 3. Transient sections.
Figure 4. Data selection for the model based on the temperature trend of plate 4.
Figure 5. Principal components of a two-dimensional data set.
Figure 6. Classical PCA algorithm.
Figure 7. PCA-$T^2$ FD proposal.
Figure 8. Percentage of information captured by each component in the plate 2 model.
Figure 9. Representation of the explained variance according to the number of components considered in the analysis.
Figure 10. Direction and importance of the original features in the new principal component space.
Figure 11. Statistical control region considering data indicating normal behavior in plate 2.
Figure 12. Change in heating power during process start-up.
Figure 13. Failure due to pressure leak from the boiler during process start-up.
Figure 14. Data graphical trend considering a fault due to a power increase.
Figure 15. Data obtained from the statistical parameter $T_p^2$ and UCL threshold.
Figure 16. Statistics $T_p^2$ and the UCL threshold for plates P2 and P6 and the condenser for a heating power fault.
Figure 17. Graphical trend of data used with pressure leak fault.
Figure 18. Data obtained from the statistical parameter $T_p^2$ and UCL threshold for plate P2.
Figure 19. Statistics $T_p^2$ and UCL threshold for plates P2 and P6 and the condenser for the pressure leak fault.
Figure 20. Statistical parameter $T_p^2$ and UCL threshold for the undetected boiler fault.
Figure 21. The temperature trend with a fault in the boiler, masked by the training region.
Table 1. Explained variance and $T_p^2$ values for each applied model.

Model Plate    Variance Explained    $T^2$ (Heating Power Fault)    $T^2$ (Pressure Leak Fault)
Boiler         97.68%                8.0915                         12.5100
Plate 2        71.31%                20.9357                        11.45565
Plate 4        84.29%                23.2277                        47.9671
Plate 6        89.12%                1.7446                         50.1661
Plate 8        83.96%                57.0367                        65.0746
Plate 10       86.58%                4.4798                         7.56791
Condenser      88.59%                6.482                          4.20775
Table 2. Validation metric results for each model.

Model        Accuracy    Precision    Recall    Specificity    F1 Score
Boiler       0.8461      1.000        0.8333    1.000          0.9090
Plate 2      0.8461      1.000        0.8333    1.000          0.9090
Plate 4      1.000       1.000        1.000     1.000          1.000
Plate 6      0.8461      1.000        0.8000    1.000          0.8888
Plate 8      0.9230      0.8333       1.0000    0.8750         0.9090
Plate 10     0.9230      0.8333       1.0000    0.8750         0.9090
Condenser    0.8461      1.000        0.6000    1.0000         0.7499
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
