1. Introduction
Sensors serve as “the eyes and ears” of the HVAC control system, monitoring critical variables and providing data support for the control system actuator to make decisions and optimize control. The malfunctioning of sensors can have a negative impact on the system’s functionality, leading to a decrease in overall performance. A reliable sensor system plays a crucial role in risk management strategies to enhance safety and reliability. Specifically, sensor fault detection involves three aspects: (1) detection (determining whether any sensors have experienced faults); (2) isolation (identifying the faulty sensors); and (3) accommodation (replacing faulty data with approximate correct data transmitted to downstream systems) [1]. Therefore, establishing an effective HVAC sensor fault detection, identification, and accommodation (SFDIA) scheme is of significant importance for the long-term stable operation of sensors.
Currently, there is limited research on HVAC SFDIA. Most existing research focuses on sensor fault diagnosis and isolation and lacks fault compensation. With the advancement of data processing technologies [2], data-driven fault diagnosis methods have gained widespread application. Among them, statistical analysis methods [3,4,5] and artificial intelligence (AI) methods [6,7,8] are highly favored. Principal Component Analysis (PCA) is commonly employed in statistical analysis, yet it struggles with nonlinear problems. Its variant, Kernel PCA (KPCA) [9,10], can address nonlinear problems, but its computations are complex and unsuitable for large-scale industrial processes [11]. AI methods [12,13] can automatically extract and integrate crucial information from complex data, which makes them more suitable for nonlinear systems, so they have been applied to sensor fault diagnosis [14,15,16]. However, the above research focuses on fault diagnosis and isolation. Moreover, AI-based classification models require significant amounts of labeled historical fault data, which are often lacking in industrial databases [17,18]. AI-based prediction models often use the maximum prediction residual as the fault threshold without quantitatively evaluating the performance of that threshold. Therefore, there is a need for an approach that can realize SFDIA using fault-free data, as well as a quantifiable metric to evaluate fault thresholds. Combining mathematical models, data processing, and software techniques, the soft sensor uses fault-free data to generate a virtual measurement that replaces a real sensor measurement, thereby realizing SFDIA.
A soft sensor [19,20] is an estimation method that derives the physical variables of interest from available physical sensors and process parameters [21,22]. There are primarily three types of soft sensor models: mechanism-based, knowledge-based, and data-driven [23]. Due to the increasing complexity of industrial processes, data-driven soft sensor models are gaining popularity, especially when the necessary prior knowledge is lacking or the modeling process is complex. Chun et al. [24] proposed a data-driven dynamic soft measurement method based on Supervised Bidirectional Long Short-Term Memory to extract and utilize nonlinear dynamic latent information. Ding et al. [25] introduced a deep learning-based modeling framework to develop soft sensor models that are resilient to sensor faults and have a high recovery capability for sensor fault signals. Darvishi et al. [26] established a series of neural network estimators and classifiers. The estimators act as virtual sensors for all unreliable sensors and replace isolated faulty sensors within the system, while the classifiers perform the detection and isolation tasks, with the estimators reconstructing the faulty sensor data. Giovanni et al. [27] developed three kinds of soft sensors (temperature, airflow, and fan speed) by employing linear regression models, statistical models, and nonlinear regression models, and proposed a highly reliable sensor fault diagnosis method in which three sensors mutually corroborate each other. Hence, utilizing soft sensors to estimate the faulty physical sensor can not only realize sensor fault diagnosis but also reconstruct the fault data.
The soft-sensor-based HVAC SFDIA framework, which considers only one sensor fault at a time, consists of three parts. The first part is fault diagnosis. A soft sensor model combining a convolutional neural network, a bidirectional long short-term memory neural network, and an attention mechanism (CNN-BILSTM-ATTENTION) is constructed using multidimensional sensor data, excluding the diagnosed sensor, to estimate the output of the diagnosed sensor. The residual between the estimated and measured values of the diagnosed sensor is calculated. If the residual exceeds the fault threshold, either the diagnosed sensor or one of the input sensors is considered faulty. The fault threshold is determined in three steps. Firstly, several appropriate fault threshold combinations are identified based on the maximum residual. Subsequently, fault diagnosis test experiments are performed on data containing faults. Finally, the combination with the largest difference between the fault detection rate (FDR) and the false alarm rate (FAR) is chosen as the optimal fault threshold. The second part is fault isolation. Assuming that the diagnosed sensor is faulty, its estimated value is used instead of the faulty value to estimate and diagnose the second sensor. If the second sensor is normal, the diagnosed sensor is faulty. If the second sensor is faulty, the estimated value of the second sensor is used in its place. The procedure is repeated to diagnose all input sensors. The third part is fault compensation. A univariate-input BILSTM soft sensor model is built using normal data and historical data from before the sensor failure to estimate the sensor’s output and recover the fault data.
In summary, to realize HVAC sensor fault diagnosis, isolation, and accommodation, a soft-sensor-based method is proposed. The main contributions are as follows:
(1) Establishing an HVAC SFDIA scheme based on the soft sensor, which can realize fault sensor diagnosis, isolation, and fault data recovery.
(2) By utilizing the soft sensor to estimate the diagnosed sensor, the residual between the estimated and sensor measured value is computed and compared with a predetermined threshold. If the residual surpasses the threshold, the estimated value is used to replace the faulty sensor’s measurement, thus realizing the goals of sensor fault diagnosis, isolation, and accommodation.
(3) An evaluation metric for determining the fault threshold is proposed, avoiding the influence of subjective factors and enabling enhanced fault diagnosis effectiveness.
2. Basis of Soft-Sensor-Based HVAC SFDIA
A typical HVAC SFDIA procedure involves three parts.
(1) Diagnosis: utilize historical data to determine whether there has been a sensor malfunction at the current moment.
(2) Isolation: if there is a malfunctioning sensor, identify the faulty sensor’s location.
(3) Accommodation: determine the first fault data point of the faulty sensor, which is the starting point for data recovery. Use the historical normal data before this point to estimate the correct output value of the faulty sensor, ensuring the short-term normal operation of the HVAC system.
2.1. SFDIA Scheme Process Based on Soft Sensor
The soft-sensor-based HVAC SFDIA process is divided into six steps, as shown in Figure 1.
Step 1: Data preprocessing. The experimental data contain unstable data from the startup phase and a few missing data points. Missing data are filled in using a one-dimensional interpolation method, and the unstable data are removed. Subsequently, the data are normalized to eliminate the interference caused by different units of measurement. The data input and time series transformation are illustrated in Figure 2. The training set is used to construct the diagnosis soft sensor for fault diagnosis, the test set is used to determine the optimal combination of fault thresholds, and the validation set is used for fault diagnosis experiments.
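The preprocessing in Step 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the interpolation is taken to be linear, the normalization to be min-max scaling, and the window is aligned so that the label is the diagnosed sensor's value at the last step of each window. The function names and the window length are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def interpolate_missing(series: Sequence[Optional[float]]) -> List[float]:
    """Fill None gaps linearly between the nearest known neighbours (assumed method)."""
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("series has no valid samples")
    filled = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:          # gap at the start: copy the first known value
            filled[i] = series[right]
        elif right is None:       # gap at the end: copy the last known value
            filled[i] = series[left]
        else:                     # interior gap: linear interpolation
            w = (i - left) / (right - left)
            filled[i] = series[left] + w * (series[right] - series[left])
    return filled


def min_max_normalize(series: Sequence[float]) -> List[float]:
    """Scale one sensor channel to [0, 1] (assumed normalization)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(v - lo) / (hi - lo) for v in series]


def make_windows(
    inputs: Sequence[Sequence[float]], target: Sequence[float], window: int
) -> Tuple[List[List[Sequence[float]]], List[float]]:
    """Turn row-per-time-step inputs into overlapping windows.

    X[k] holds `window` consecutive rows of the input sensors; y[k] is the
    diagnosed sensor's value at the last step of that window (assumed alignment).
    """
    X, y = [], []
    for end in range(window, len(inputs) + 1):
        X.append(list(inputs[end - window:end]))
        y.append(target[end - 1])
    return X, y
```

Removing the unstable startup phase would amount to slicing off the leading samples before these functions are applied; how many samples to drop depends on the data set and is not specified here.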
Step 2: Constructing a diagnosis soft sensor. The diagnosed sensor is used as the label and the other sensors as model inputs. The model uses RMSE as the loss function and the Adam optimizer. Additionally, early stopping is applied to prevent overfitting of the network.
Step 3: Determining the fault threshold. Calculate the residual sequences between the soft sensor and the physical sensor outputs on the test set to determine the range of fault thresholds. Utilize the score metric from Section 3.2 to select the optimal combination of fault thresholds within the determined range.
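The threshold search in Step 3 can be illustrated with a small grid search. The sketch below assumes that the score is the difference between the fault detection rate and the false alarm rate, as stated in the Introduction, that both rates are computed per data point, and that ties are broken in favor of the narrower threshold range. The function names and the candidate grids are hypothetical.

```python
from itertools import product
from typing import Sequence, Tuple


def rates(
    residuals: Sequence[float], is_fault: Sequence[bool], upper: float, lower: float
) -> Tuple[float, float]:
    """Return (FDR, FAR) for one threshold pair.

    FDR = flagged faulty points / all faulty points
    FAR = flagged normal points / all normal points
    """
    flagged = [r > upper or r < lower for r in residuals]
    n_fault = sum(is_fault)
    n_normal = len(is_fault) - n_fault
    detected = sum(1 for f, t in zip(flagged, is_fault) if f and t)
    false_alarms = sum(1 for f, t in zip(flagged, is_fault) if f and not t)
    fdr = detected / n_fault if n_fault else 0.0
    far = false_alarms / n_normal if n_normal else 0.0
    return fdr, far


def best_thresholds(
    residuals: Sequence[float],
    is_fault: Sequence[bool],
    uppers: Sequence[float],
    lowers: Sequence[float],
) -> Tuple[float, float, float]:
    """Pick the (upper, lower) pair with the highest score = FDR - FAR.

    When scores are equal, the pair with the narrower range wins.
    """
    best_key, best_pair = None, None
    for upper, lower in product(uppers, lowers):
        fdr, far = rates(residuals, is_fault, upper, lower)
        key = (fdr - far, -(upper - lower))  # higher score first, then narrower range
        if best_key is None or key > best_key:
            best_key, best_pair = key, (upper, lower)
    return best_pair[0], best_pair[1], best_key[0]
```

In practice the residuals would come from a test set with injected faults, and the candidate grids would be drawn from the ranges suggested by the residual distribution.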
Step 4: Fault diagnosis. The trained soft sensor is used to predict the diagnosed sensor. The residual between the predicted value and the measurement is calculated and compared with the fault threshold to determine whether the diagnosed sensor has a fault. When the residuals of five consecutive soft sensor data points exceed the fault threshold, a fault is considered to have occurred at that data position. This also identifies the initial fault data point, from which fault accommodation starts.
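The five-consecutive-points rule in Step 4 can be sketched as below. The sketch assumes that the initial fault data point is the first point of the first run of five consecutive out-of-threshold residuals; the function name and the zero-based indexing are illustrative choices.

```python
from typing import Optional, Sequence


def first_fault_index(
    residuals: Sequence[float], upper: float, lower: float, run_length: int = 5
) -> Optional[int]:
    """Return the zero-based index where the first qualifying run starts.

    A run qualifies when `run_length` consecutive residuals lie outside
    [lower, upper]. Returns None if no such run exists.
    """
    run = 0
    for i, r in enumerate(residuals):
        if r > upper or r < lower:
            run += 1
            if run == run_length:
                return i - run_length + 1
        else:
            run = 0  # an in-threshold point breaks the run
    return None
```

Requiring a run rather than a single exceedance trades a short detection delay for fewer false alarms caused by isolated prediction errors.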
Step 5: Fault isolation. To identify the location of the faulty sensor, repeat steps 2–4 for each sensor individually. Figure 3 shows a subgraph of the fault isolation portion of Figure 1, illustrating how the faulty sensor is identified. More specifically, the $T_1$ sensor is first taken as the diagnostic target. The $T_2$~$T_7$ measurements $y_{\mathrm{physics},2}$~$y_{\mathrm{physics},7}$ are input into the $T_1$ soft sensor to predict the output of $T_1$. If the residual $\varepsilon_1$ between the predicted data and the measurement is within the fault threshold, all sensors are considered normal. Otherwise, either the $T_1$ sensor or one of the input sensors may be faulty. Assuming a fault in the $T_1$ sensor, replace the $T_1$ measurement $y_{\mathrm{physics},1}$ with the predicted data $y_{\mathrm{prediction},1}$. Input $y_{\mathrm{prediction},1}$ and $y_{\mathrm{physics},3}$~$y_{\mathrm{physics},7}$ into the $T_2$ soft sensor to calculate the residual $\varepsilon_2$. If $\varepsilon_2$ is within the fault threshold, the assumption is true, which indicates a fault in the $T_1$ sensor while the others are normal. Otherwise, if $\varepsilon_2$ exceeds the threshold, the assumption is false, which indicates that the $T_1$ sensor is normal while a faulty sensor exists among $T_2$~$T_7$. Assuming a fault in the $T_2$ sensor, replace $y_{\mathrm{physics},2}$ with $y_{\mathrm{prediction},2}$ and continue by diagnosing $T_3$. Repeat this process to isolate a fault among the $T_1$~$T_7$ sensors.
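The isolation procedure of Step 5 can be sketched as a loop over the sensors under the single-fault assumption. This is a simplified illustration with hypothetical names: each soft sensor is represented by a callable that maps the other sensors' values to an estimate, and the residual is taken as estimate minus measurement. One detail is an assumption not stated in the text: once the hypothesis that a sensor is faulty is rejected, the sketch restores that sensor's measurement before continuing.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

Estimator = Callable[[Dict[str, float]], float]


def isolate_fault(
    measurements: Dict[str, float],
    estimators: Dict[str, Estimator],
    thresholds: Dict[str, Tuple[float, float]],
    order: Sequence[str],
) -> Optional[str]:
    """Return the name of the faulty sensor, or None if all look normal."""
    values = dict(measurements)  # working copy; suspects are replaced by estimates

    def check(name: str) -> Tuple[float, bool]:
        inputs = {k: v for k, v in values.items() if k != name}
        estimate = estimators[name](inputs)
        lower, upper = thresholds[name]
        residual = estimate - measurements[name]
        return estimate, lower <= residual <= upper

    estimate, ok = check(order[0])
    if ok:
        return None  # first residual within threshold: all sensors considered normal

    suspect = order[0]
    values[suspect] = estimate  # hypothesis: the first sensor is the faulty one
    for name in order[1:]:
        estimate, ok = check(name)
        if ok:
            return suspect  # hypothesis confirmed
        # Hypothesis rejected: restore the cleared sensor (assumption), move on.
        values[suspect] = measurements[suspect]
        suspect = name
        values[suspect] = estimate
    return suspect  # last remaining candidate under the single-fault assumption
```

With trained models, each estimator would wrap the corresponding diagnosis soft sensor, and the thresholds would be the optimal pairs selected per sensor.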
Step 6: Fault accommodation. The soft sensor data recovery model is trained using the normal data from the faulty sensor as input, and its output values are used as reconstructed data to recover the faulty data.
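One way to picture Step 6 is a recursive forecast that starts from the last normal window before the first fault point. The sketch below is an assumption-laden illustration: the text does not state how inputs are formed once the fault has begun, so feeding each reconstructed value back as input is a choice made here, and `model`, `window`, and `recover` are hypothetical names. The model is any callable that maps a window of past values to the next value.

```python
from typing import Callable, List, Sequence


def recover(
    history: Sequence[float],
    fault_start: int,
    n_steps: int,
    model: Callable[[Sequence[float]], float],
    window: int,
) -> List[float]:
    """Reconstruct `n_steps` values starting at index `fault_start`.

    Only samples before `fault_start` are trusted; reconstructed values are
    appended to the buffer and reused as inputs (assumed recursive scheme).
    """
    if fault_start < window:
        raise ValueError("not enough normal history before the fault")
    buffer = list(history[fault_start - window:fault_start])
    recovered = []
    for _ in range(n_steps):
        y = model(buffer[-window:])
        recovered.append(y)
        buffer.append(y)
    return recovered
```

Because prediction errors accumulate in a recursive scheme, such a reconstruction is suited to short-term recovery, in line with the short-term operation goal stated in the text.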
2.2. Soft Sensor Construction
Soft sensors, as a core component of the SFDIA scheme, play a crucial role in estimating sensor data. The estimated values are used to accomplish the diagnosis, isolation, and accommodation processes.
The main difference between diagnosis and accommodation soft sensors lies in the external input. Therefore, the design of their internal core models also differs. Considering the time-varying nature of sensor data, both soft sensors need to effectively store the temporal information of input signals. Additionally, diagnosis soft sensors deal with complex multidimensional data that exhibit strong coupling, which requires feature extraction from the input information. To help the network capture crucial information while handling complex data, an attention mechanism is introduced, which uses a weighted operation to make the network focus on information highly relevant to the task.
The structure of the CNN-BILSTM-ATTENTION diagnosis soft sensor is depicted in Figure 4. It is primarily composed of the input layer, CNN layer, BILSTM layer, attention layer, and output layer. Each layer is described as follows:
Input layer: The input layer receives the time-serialized data and feeds them into the convolutional layer, defining the input size. A single data point is represented as Equations (1) and (2):
where $x_t^n$ represents the $n$th sensor sampled at time $t$.
CNN layer: This mainly consists of the convolutional layer and pooling layer. The convolutional layer performs sliding convolutions of the multidimensional input signals along the time sequence to achieve feature extraction. The calculation process is shown in Equations (3) and (4).
where $x_j^{l}$ represents the $j$th feature output of the $l$th layer, $f(\cdot)$ is the activation function, $N$ is the number of input signals, $*$ denotes the convolution operator, $k_{ij}^{l}$ is the convolution kernel, $b_j^{l}$ is the bias term, and $a_j^{l}$ is the output of the activation function. The ReLU activation function is used to perform a nonlinear transformation on the output.
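For reference, a common way to write the convolution and activation steps that matches the quantities described above is sketched below. The symbols $x_j^{l}$, $a_j^{l}$, $k_{ij}^{l}$, and $b_j^{l}$ are assumed notation for this illustration and may differ from the notation of Equations (3) and (4).

```latex
\begin{aligned}
x_j^{l} &= \sum_{i=1}^{N} a_i^{l-1} * k_{ij}^{l} + b_j^{l}, \\
a_j^{l} &= f\left(x_j^{l}\right) = \max\left(0,\; x_j^{l}\right).
\end{aligned}
```

Here the first line sums the convolutions of the $N$ input signals with their kernels and adds the bias, and the second line applies the ReLU activation.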
The pooling layer utilizes downsampling to merge highly similar features, reducing spatial dimensions and lowering computational complexity while preserving feature invariance and locality. The max-pooling operation is adopted, and the calculation formula is shown as Equation (5).
where Pl[l] represents the output of the pooling layer.
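The effect of max pooling on a one-dimensional feature sequence can be shown with a few lines of code. This is a generic illustration of the operation, with a hypothetical pool size and stride; it is not taken from the model configuration.

```python
from typing import List, Optional, Sequence


def max_pool_1d(
    features: Sequence[float], pool_size: int = 2, stride: Optional[int] = None
) -> List[float]:
    """Keep the maximum of each window of `pool_size` values.

    The stride defaults to the pool size, giving non-overlapping windows;
    trailing values that do not fill a whole window are dropped.
    """
    step = stride if stride is not None else pool_size
    last_start = len(features) - pool_size
    return [max(features[i:i + pool_size]) for i in range(0, last_start + 1, step)]
```

With a pool size of 2, a sequence of six features is reduced to three, each being the larger value of a neighboring pair.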
BILSTM layer: Adjacent sensor data often exhibit strong correlations. Data at any time point are highly correlated with the data at its neighboring time points. The BILSTM network consists of two LSTM networks in opposite directions, providing more comprehensive information for each time point. The calculation formula is as follows:
where $H_t$ represents the total output of the hidden layer at time $t$, $\overrightarrow{h_t}$ is the forward output of the LSTM hidden layer at time $t$, $\overleftarrow{h_t}$ is the backward output of the LSTM hidden layer at time $t$, $h_t$ is the output of the hidden layer at time $t$, $C_t$ is the state value of the LSTM hidden layer’s state unit, and $x_t$ is the input of the model at time $t$.
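A standard formulation of a bidirectional LSTM that is consistent with these definitions is sketched below. It assumes that the forward and backward hidden outputs are concatenated to form $H_t$; other combinations, such as a weighted sum, are also used in practice, so this should be read as an illustration rather than the exact equations of the model.

```latex
\begin{aligned}
\left(\overrightarrow{h_t},\, \overrightarrow{C_t}\right) &= \mathrm{LSTM}\left(x_t,\; \overrightarrow{h_{t-1}},\; \overrightarrow{C_{t-1}}\right), \\
\left(\overleftarrow{h_t},\, \overleftarrow{C_t}\right) &= \mathrm{LSTM}\left(x_t,\; \overleftarrow{h_{t+1}},\; \overleftarrow{C_{t+1}}\right), \\
H_t &= \left[\, \overrightarrow{h_t} \,;\, \overleftarrow{h_t} \,\right].
\end{aligned}
```

The forward pass reads the window from past to future and the backward pass reads it in reverse, so $H_t$ carries context from both sides of time $t$.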
Attention layer: The attention mechanism is employed to weight the output vectors of the BILSTM network and perform a weighted sum. The softmax function is used to assign different weights to the outputs at different time steps, aiming to maximize the extraction of temporal feature information and achieve better evaluation results.
where $Q$, $K$, and $V$ represent the query matrix, key matrix, and value matrix, respectively. $W_Q$, $W_K$, and $W_V$ are weight matrices, and $d_k$ represents the vector dimension of $Q$.
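The definitions above match the widely used scaled dot-product attention, which can be written as follows. The matrix $H$, standing for the stacked BILSTM outputs, is notation introduced here for illustration, and the exact form used in the model may differ.

```latex
\begin{aligned}
Q &= H W_Q, \qquad K = H W_K, \qquad V = H W_V, \\
\mathrm{Attention}(Q, K, V) &= \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V.
\end{aligned}
```

The softmax turns the scaled similarity scores into weights that sum to one over the time steps, and the product with $V$ forms the weighted sum described in the text.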
Output layer: a fully connected layer is used with the ReLU activation function to generate a single evaluated value.
Accommodation soft sensors use the historical data of the faulty sensor itself, which correlate more directly and tightly with the sensor values to be predicted, allowing more accurate prediction results. The structure of the BILSTM accommodation soft sensor is shown in Figure 5, consisting of the input layer, BILSTM layer, and output layer. The input variable is one-dimensional sensor data, as shown in Equation (9). The BILSTM layer and output layer are the same as in the diagnosis soft sensor and are not elaborated further.
4. Results Analysis
4.1. Analysis of Fault Diagnosis Results Based on Soft Sensor
A CNN-BILSTM-ATTENTION-based soft sensor is constructed using the training data of the six sensors other than the diagnosed sensor. The predictive performance of different models is compared on the test set to identify the model with the best predictive results. The optimal fault threshold is determined using the test set. Finally, the feasibility of the approach is validated by conducting fault diagnosis on the validation set.
4.1.1. Analysis of Prediction Results
On the test set, diagnosis soft sensors are employed to predict the values of seven sensors. The comparison chart of predicted results for the $T_1$ sensor is presented in Figure 7. The soft sensor built with CNN-BILSTM-ATTENTION exhibits a better representation of the changing trends of the true values. Table 3 presents a comparative analysis of prediction evaluation metrics for soft sensors built using different models. The CNN-BILSTM-ATTENTION model consistently achieves the best results across all metrics, with average MAE and RMSE reductions of 30.51% and 25.03%, respectively, compared to other optimal results. As a result, CNN-BILSTM-ATTENTION stands as the optimal diagnosis soft sensor prediction model.
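The MAE and RMSE metrics and the percentage reductions quoted above follow standard definitions, which can be sketched as below. The function names are illustrative, and the relative reduction is assumed to be computed against the best competing model's value.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def reduction_pct(baseline: float, proposed: float) -> float:
    """Relative reduction of an error metric with respect to a baseline, in percent."""
    return 100.0 * (baseline - proposed) / baseline
```

For example, an error metric that drops from 2.0 for a baseline model to 1.5 for the proposed model corresponds to a 25% reduction.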
4.1.2. Analysis of Fault Threshold and Diagnosis Results
To maximize the fault diagnosis performance, the score is used to determine the optimal fault threshold. Firstly, the residual between the output of the soft sensor and the measurement of the physical sensor is calculated, as in Equation (19), to determine the difference caused by the predictive model. Figure 8 illustrates the residuals of the $T_1$ sensor. Currently, many diagnostic methods use the maximum residual value as the fault threshold. Although this achieves a lower false alarm rate, it also decreases the fault detection rate and therefore fails to achieve the best diagnostic outcome. To select a more suitable fault threshold, the range of the fault threshold is first determined based on the distribution of residuals. For instance, the positive threshold range for the $T_1$ sensor is [0.4, 0.6], and the negative threshold range is [−0.4, −0.2]. Faults are then introduced into the test set, and a test is conducted to determine the optimal fault threshold. By adjusting the thresholds within the defined positive and negative threshold ranges, false alarm rates and fault detection rates are computed for different threshold combinations. The fault threshold yielding the best score is chosen. Figures 9–15 present a comparison of the score results for different threshold combinations. When the scores are equal, the threshold combination with the narrower range is selected to achieve more precise fault diagnosis. Using the optimal threshold, fault diagnosis is performed on the validation set to validate the performance of the best threshold combination and the effectiveness of the proposed fault diagnosis method. The validation set’s fault diagnosis results are depicted in Figures 16–18.
where $y_{\mathrm{prediction}}$ represents the output of the soft sensor and $y_{\mathrm{physics}}$ represents the measurement of the physical sensor.
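Given these two quantities, the residual and the resulting threshold test can be written as below. The sign convention (prediction minus measurement) and the symbols $\theta^{+}$ and $\theta^{-}$ for the positive and negative fault thresholds are assumptions made for this illustration.

```latex
\begin{aligned}
\varepsilon &= y_{\mathrm{prediction}} - y_{\mathrm{physics}}, \\
\text{fault suspected} &\iff \varepsilon > \theta^{+} \ \ \text{or} \ \ \varepsilon < \theta^{-}.
\end{aligned}
```

A signed residual with separate positive and negative thresholds is consistent with the asymmetric threshold ranges reported for the $T_1$ sensor.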
Therefore, soft sensors can effectively predict sensor data, which enables fault diagnosis with good performance.
4.2. Analysis of Fault Accommodation Results Based on Soft Sensor
Faulty sensor data can impact the operational performance of the HVAC system. Thus, accurate identification of sensor fault points is essential, since data recovery is initiated from this point. Figure 19 and Figure 20 illustrate the results of identifying fault locations for the $T_1$ and $T_2$ sensors. In the validation dataset, faults are introduced from the 100th data point, and the first five consecutive data points exceeding the fault threshold should correspond to the 102nd data point, as shown in the $T_2$ sensor identification result. For some sensors, such as the $T_1$ sensor, normal states are identified as faults. In such cases, recovery is performed on normal data that have been falsely identified as faulty. However, this case occurs rarely, and the recovered data remain within the fault threshold, so they can be treated as normal data. The recovered data from each sensor fall within the fault threshold range. The data recovery results for the $T_1$ and $T_2$ sensors are shown in Figure 21 and Figure 22. The reconstructed data represent the output of the soft sensor and closely resemble the true data. Figure 22 indicates that the reconstructed data fall within the fault threshold range and are considered normal data. This implies that the reconstructed data can effectively recover faulty data. Table 4 provides the evaluation results of using the soft sensor for data recovery.
The results indicate that when sensors experience faults, accommodation soft sensors yield smaller prediction error metrics than diagnosis soft sensors. In other words, their predictive performance is better, and their predictions are closer to the true data. They can effectively substitute for the true data, thereby achieving the reconstruction and recovery of faulty data.
4.3. Discussion
Based on the experimental results of the aforementioned fault diagnosis and accommodation, the following conclusions can be drawn:
(1) The fault diagnosis method can achieve good diagnostic results for faults of different levels. The larger the fault level, the more pronounced the fault manifestation, resulting in better diagnostic outcomes. The diagnostic performance for fault level 1 is noticeably superior to that of fault level 2.
(2) The utilization of the best fault thresholds selected using the score obtains better fault diagnosis outcomes, with an average false alarm rate of 1.57%, an average positive deviation fault detection rate of 96.88%, and an average negative deviation fault detection rate of 96.13%.
(3) For faulty sensors, the method is capable of effectively identifying the fault location. Utilizing single-variable input soft sensors, the faulty data can be successfully recovered. The reconstructed data fall within the fault threshold range, enabling short-term data recovery and maintaining the proper functioning of the HVAC systems.