Article

Fault Detection and Diagnosis in Air-Handling Unit (AHU) Using Improved Hybrid 1D Convolutional Neural Network

1
Department of Industrial & Systems Engineering, Dongguk University, Seoul 04620, Republic of Korea
2
Department of AI and Big Data, Woosong University, Daejeon 34606, Republic of Korea
*
Author to whom correspondence should be addressed.
Systems 2025, 13(5), 330; https://doi.org/10.3390/systems13050330
Submission received: 30 March 2025 / Revised: 23 April 2025 / Accepted: 26 April 2025 / Published: 1 May 2025
(This article belongs to the Special Issue Data-Driven Analysis of Industrial Systems Using AI)

Abstract

The air-handling unit (AHU) is an essential component of heating, ventilation, and air-conditioning (HVAC) systems. Hence, detecting the faults in AHUs is essential for maintaining continuous HVAC operation and preventing system breakdowns. The advent of artificial intelligence has transformed the AHU fault diagnosis techniques. Specifically, deep learning has obviated the necessity for manual feature extraction and selection, thereby streamlining the fault diagnosis process. While conventional convolutional neural networks (CNNs) effectively detect defects, incorporating more spatial variables could enhance their performance further. This paper presents a hybrid architecture combining a CNN model with a long short-term memory (LSTM) model to diagnose the faults in AHUs. The advantages of the LSTM model and convolutional layers are combined to identify significant patterns in the input data, which considerably facilitates the detection of AHU defects. The hybrid design enhances the network’s capability to capture both local and global characteristics, thus improving its ability to differentiate between normal and abnormal circumstances. The proposed approach achieves strong diagnostic accuracy, exhibiting high sensitivity to nuanced fault patterns. Furthermore, its efficacy is corroborated through comparisons with state-of-the-art AHU fault identification techniques.

1. Introduction

The building industry accounts for over 33% of the global final energy consumption [1]. In both commercial and residential buildings, heating, ventilation, and air-conditioning (HVAC) systems constitute the highest proportion of energy consumption, surpassing other systems, such as refrigeration, water heating, and lighting [2]. Thus, inefficient HVAC systems not only increase greenhouse gas emissions but also diminish the occupant thermal comfort and cause significant energy wastage. Such significant quantities of wasted energy could be utilized more effectively through advanced technologies and efficient corporate procedures [3]. Consequently, sophisticated HVAC systems are becoming increasingly popular. These systems employ adaptive options instead of the traditional rule-based control and are equipped with built-in, data-based advanced failure detection and diagnosis (FDD) methods [4,5].
Smart FDD techniques are increasingly utilizing data-driven methodologies to circumvent the limitations associated with conventional rule-based physical models. The reliance of classic FDD systems on empirical recommendations or mechanical details to identify faults significantly undermines their efficacy, and model-based approaches may not consistently offer sufficient diagnostic accuracy or ease of application [6]. Meanwhile, improvements in computational capability, data collection efficiency, and algorithmic design have enabled the incorporation of machine learning (ML) techniques into modern HVAC systems [7]. Deep learning (DL) is a solid alternative to traditional ML methods, even though the latter have been widely used in the past. DL models are effective because of their capacity to autonomously learn and model the complex behaviors associated with system problems; thus, they represent an efficient tool for improving FDD in HVAC systems [8].
In [9], the author presented an extreme gradient boosting regression model to detect and analyze different fault occurrences. In [10], the researchers investigated a novel ML feature extraction technique for managing extensive HVAC datasets. In [11], a multiclass support vector machine (SVM) algorithm was utilized for automatic FDD in intelligent HVAC systems. A recent study introduced a statistical machine-learning strategy for identifying and categorizing problems in rooftop HVAC units; the results demonstrated that SVMs can outperform linear models, such as discriminant analysis models, in classification tasks. Mirnaghi et al. employed a big-data-based methodology to investigate the applicability of FDD models trained on erroneous datasets in practical scenarios [12]. An automated FDD technique was developed in [13], which employs an extensive data architecture and SVMs to identify problems in terminal HVAC devices.
In [14], the author presented a data-driven FDD methodology for the air-handling units (AHUs) of HVAC systems in buildings. This strategy enhances the maintenance reliability by clarifying ambiguous system situations. An FDD technique based on Gaussian process regression was presented in [6,15] to ascertain the response of an HVAC system to variables such as temperature, time, and occupancy levels. In [16], the author employed a back-tracing feature selection approach to examine feature selection, a crucial component of ML-based problem identification and diagnosis. Yen et al. [17] utilized a generative adversarial network, an unsupervised learning technique, to detect and rectify the problems in AHUs. In [18], the authors developed a comprehensive FDD framework comprising an SVM and principal component analysis (PCA) to forecast the system performance under unprecedented scenarios. For an extensive examination of FDD models in buildings utilizing computational intelligence, see [19].
Notwithstanding the advancements achieved in various data-driven approaches, inherent structural limitations may restrict the choice of machine-learning models in HVAC applications. PCA-based models are particularly vulnerable to outliers until supplemented with other methods, such as wavelet analysis [20]. Likewise, unsupervised learning techniques heavily rely on feature extraction or selection performed by humans, a process that requires considerable expertise and is difficult to execute effectively [21]. Additionally, models such as SVMs and artificial neural networks (ANNs) possess straightforward structures, which impede their ability to replicate intricate nonlinear interactions. Basic ML techniques are ineffective for generalization, particularly in challenging FDD scenarios [22]. Taheri et al. [23] developed a deep recurrent neural network (DRNN), an unsupervised learning model, to detect and diagnose problems in AHUs. In [24], FDD techniques based on data-driven approaches, such as SVMs and random forests, were developed to detect and diagnose the problems in AHUs. In [25], the author developed a deep neural network (DNN)–based real-time defect diagnosis model for AHUs to improve their operational efficiency and thereby reduce the energy consumption of building HVAC systems. In [26], Gao et al. proposed a new adaptive federated learning method that improves the fault diagnosis in pumping units, enhancing the accuracy and communication efficiency by addressing the label heterogeneity and redundancy. In [27], Mehta et al. proposed a federated learning framework with a duplet classifier that enables the accurate, privacy-preserving mixed fault diagnosis in rotating machinery across factories with heterogeneous, unbalanced data.
A significant issue with the aforementioned methodologies is their inability to adequately capture the temporal evolution of faults or the correlations of faults with other occurrences. To address this gap, we developed a hybrid convolutional neural network (CNN). Hybrid CNNs employ a recursive architecture, which enables them to effectively capture the temporal correlations in data as well as manage the noise, nonlinearities, and variations in FDD systems. As stated in [28], hybrid CNNs are proficient at learning intricate implicit data distributions, detecting detrimental occurrences that are challenging to discern owing to noise, and recognizing the temporal patterns within extensive datasets. Studies demonstrate that hybrid CNNs are superior to and more dependable than non-recursive ML algorithms.
This study introduces a combination of CNNs and long short-term memory (LSTM) networks for DL along with an effective data-processing method to sort actuators, AHUs, and control problems into groups. Together with the CNN-LSTM architecture, the data-processing strategy ensures optimal and efficient performance. The developed FDD system can detect multiple faults and can identify and diagnose the various issues associated with distinct subsystems. The proposed model is grounded in a detailed explanation of a complex framework designed to interpret FDD predictions. This is highly relevant as it allows us to understand model behavior and facilitates proactive strategies to enhance the operational efficiency and minimize greenhouse gas (GHG) emissions. This research article presents the following key contributions:
  • A 13-layer CNN-LSTM model is proposed for AHU FDD in HVAC systems, which uses batch normalization for quicker training and efficient autonomous feature learning;
  • The capabilities of the GB, RF, DRNN, DNN, and CNN-LSTM models are evaluated in comparison with traditional neural networks and machine-learning models;
  • A hybrid model (1D-CNN-LSTM) is created that leverages the best features of the CNN and LSTM methods;
  • The hybrid model offers a comprehensive approach to FDD learning;
  • According to the experimental data, the proposed model outperforms the baseline models in terms of recall, accuracy, and precision.
The remainder of this paper is organized as follows: Section 2 provides a theoretical background for CNNs and LSTM. Section 3 outlines the methods utilized to develop the proposed fault detection and categorization system. Section 4 presents the results obtained using the proposed methods. Finally, Section 5 concludes the paper by delineating the benefits of the proposed approach and highlighting the directions for further research.

2. Data-Driven Method

The proposed approach for identifying faults in the AHUs of HVAC systems incorporates historical and real-time data to detect abnormalities and inefficiencies. ML algorithms, statistical models, and pattern recognition are used to assess sensor data, such as temperature, pressure, and airflow, and detect deviations from the normal operating ranges. Unlike model-based methods, data-driven approaches do not require detailed physical models of the system, which enhances their ability to adapt to complex and evolving HVAC systems. These techniques can identify potential issues, enhance the performance, and reduce the energy consumption, thereby improving the system reliability and maintenance efficiency. To tackle this significant issue, several methodologies have arisen, particularly those grounded in AI and ML. These approaches do not necessitate any mathematical modeling of the system, regardless of its complexity. Data-driven systems can function in real time, offering millisecond-level reactions [29].
The advent of data-driven approaches was enabled by developments in computer technology, which provided more powerful machines capable of supporting computationally intensive tasks, alongside the availability of extensive data. ML methods, which are often data-driven, directly collect real-world data for decision making rather than depending on mathematical models that simulate reality. The research in [24] focused on FDD for sensors via data-driven methodologies, including SVM and random forest techniques. Nevertheless, these techniques evidently become constrained when other aspects must be considered.
To address the aforementioned challenge, we focused primarily on DNNs. As a fundamental concept necessary for comprehending them, DNNs are neural networks comprising several interconnected hidden layers. Although they were conceived in the 1980s and 1990s, their popularity surged later for various reasons, including the availability of extensive data and advancements in technology that produced more powerful processors and graphics cards.

2.1. Long Short-Term Memory (LSTM)

LSTM is a type of RNN that integrates long-term temporal connections and solves vanishing-gradient issues [30]. The hidden layers of RNNs are replaced with LSTM units, which comprise memory cells and gates. The memory cells (C) are regulated by the gates to retain information. The input gate, output gate, and forget gate regulate the inflow and outflow of information within the memory cells. LSTM is efficient in sequential modeling applications, including classification, time-series analysis, and prediction problems [28]. Figure 1 illustrates the architecture of LSTM, while Equations (1)–(8) provide a mathematical representation of an LSTM block.
$i_t = \sigma\left(w_i \left[h_{t-1}, x_t\right] + b_i\right)$, (1)
$\bar{c}_t = \tanh\left(w_c \left[h_{t-1}, x_t\right] + b_c\right)$, (2)
$c_t = f_t \odot c_{t-1} + i_t \odot \bar{c}_t$, (3)
$f_t = \sigma\left(w_f \left[h_{t-1}, x_t\right] + b_f\right)$, (4)
$o_t = \sigma\left(w_o \left[h_{t-1}, x_t\right] + b_o\right)$, (5)
$h_t = o_t \odot \tanh\left(c_t\right)$, (6)
$\sigma(x) = \dfrac{1}{1 + e^{-x}}$, (7)
$\tanh(x) = \dfrac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$, (8)
where $x_t$ is the input vector; $i_t$, $f_t$, and $o_t$ are the input, forget, and output gate vectors; $w_i$, $w_f$, and $w_o$ denote the weight matrices of the input, forget, and output gates, respectively; $b_i$, $b_f$, and $b_o$ are the corresponding bias vectors; $c_t$ and $\bar{c}_t$ denote the memory cell state and its candidate update; and $h_t$ is the output (hidden-state) vector.
The core principle of LSTM is that the network can ascertain which part of the data should be preserved in the long-term state, which part should be discarded, and which part should be accessed. As the long-term state $c_{t-1}$ traverses the network from left to right, it first passes through a forget gate, which results in the loss of certain memories. Subsequently, an addition operation adds new memories (those selected by the input gate). Thereafter, $c_t$ is transmitted without any additional modification. Consequently, at each time step, specific memories are eliminated while others are incorporated. After the addition operation, the long-term state is passed through the tanh function, and the output gate $o_t$ then filters the result, yielding the short-term state $h_t$, which is the cell's output at the present step.
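As a concrete illustration of Equations (1)–(8), the following minimal NumPy sketch computes a single LSTM step; the gate dimensions, toy weights, and random initialization are assumptions made for the example rather than values taken from the proposed model.

```python
import numpy as np

def sigmoid(x):
    # Equation (7): logistic activation
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step following Equations (1)-(6).

    W and b are dicts of weight matrices and bias vectors for the input (i),
    forget (f), output (o), and candidate (c) gates; shapes are illustrative.
    """
    z = np.concatenate([h_prev, x_t])       # [h_{t-1}, x_t]
    i_t = sigmoid(W["i"] @ z + b["i"])      # input gate, Eq. (1)
    c_bar = np.tanh(W["c"] @ z + b["c"])    # candidate memory, Eq. (2)
    f_t = sigmoid(W["f"] @ z + b["f"])      # forget gate, Eq. (4)
    o_t = sigmoid(W["o"] @ z + b["o"])      # output gate, Eq. (5)
    c_t = f_t * c_prev + i_t * c_bar        # cell-state update, Eq. (3)
    h_t = o_t * np.tanh(c_t)                # hidden state / output, Eq. (6)
    return h_t, c_t

# Toy usage with 3 input features, 2 hidden units, and random weights.
rng = np.random.default_rng(0)
n_in, n_hidden = 3, 2
W = {g: rng.normal(size=(n_hidden, n_hidden + n_in)) for g in "ifoc"}
b = {g: np.zeros(n_hidden) for g in "ifoc"}
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
h, c = lstm_step(rng.normal(size=n_in), h, c, W, b)
print(h, c)
```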

2.2. Convolutional Neural Network (CNN)

The CNN method is a popular DL technique, commonly utilized in time-series analysis, image classification, and text categorization [31]. Also known as a space-invariant ANN [32], it is characterized by a shared-weight design and translation-invariant behavior.
The CNN is a regularized variant of the multilayer perceptron, i.e., a fully connected network [33]. The full connectivity of such networks frequently renders them susceptible to overfitting, a problem typically tackled by adding regularization to the network. CNNs take a different approach: they exploit the hierarchical structure of the input data, assembling complex patterns from smaller and simpler ones, and therefore require far fewer connections and parameters. CNNs were inspired by the connectivity between neurons in the animal visual cortex, where specific neurons respond only to stimuli within a restricted region of the visual field. The various layers in a CNN are defined below as follows:
(1)
Convolutional Layer
The convolutional layer (CL) extracts meaningful patterns from the input data, establishing a fundamental relationship between the inputs and the outputs. It utilizes filters, or kernels, to systematically capture features from the raw input. Figure 2 illustrates the basic structure of a CNN: each element of the kernel is applied to the input data through multiplication, and the results are then aggregated. In the CL, the input data and feature maps are convolved with trainable parameters at several locations (a toy numerical illustration follows the layer descriptions below). The training process for a CL is similar to that in a conventional backpropagation neural network. However, unlike in standard ANNs, in CNNs, only the kernels in each CL are trained.
(2)
Pooling Layer
CNNs frequently incorporate local and global pooling layers (PLs) in conjunction with conventional CLs. PLs reduce the data dimensionality by consolidating the outputs from the clusters of neurons in one layer into a singular neuron in the subsequent layer. Local pooling is performed on diminutive clusters, occasionally with tiling dimensions such as 1 × 1. Conversely, global pooling considers all the neurons within a feature map [34]. The two most common pooling techniques are max pooling, which determines the largest value from each localized region of neurons in the feature map, and average pooling, which calculates the mean value [35].
(3)
Fully Connected Layer
A fully connected layer, similar to a standard multilayer perceptron neural network, establishes connections between every neuron in one layer and all the neurons in the subsequent layer. The flattened feature matrix is passed through a fully connected layer to produce predictions, for example in image classification.
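To make the three layer types above concrete, the toy NumPy sketch below applies a 1D convolution (slide, multiply, and sum), local max/average pooling, and a small fully connected layer; the input sequence, kernel, and dense weights are assumed values chosen purely for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 0.0, 3.0, 1.0, 4.0])   # toy input sequence
k = np.array([0.5, 1.0, 0.5])                   # kernel (trainable in a real CNN)

# Convolutional layer: slide the kernel, multiply element-wise, and sum.
conv = np.array([np.sum(x[i:i + k.size] * k)
                 for i in range(x.size - k.size + 1)])
print("feature map     :", conv)

# Pooling layer: reduce each local region to a single value.
pool = 2
regions = conv[: conv.size // pool * pool].reshape(-1, pool)
print("max pooling     :", regions.max(axis=1))
print("average pooling :", regions.mean(axis=1))

# Fully connected layer: flatten the pooled map and apply dense weights.
flat = regions.max(axis=1).ravel()
W_fc = np.full((2, flat.size), 0.1)             # assumed dense weights
print("dense output    :", W_fc @ flat)
```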

2.3. The Proposed Model

A hybrid CNN is a DL architecture that combines the best features of CNNs with those of other neural network models to enhance the speed, flexibility, or efficiency in certain tasks. CNNs are adept at extracting features from grid-like data, such as images, leveraging their CLs to identify patterns and spatial hierarchies. However, their inability to handle sequential data, global contexts, or complex linkages has necessitated the creation of hybrid models. A CNN-RNN hybrid model utilizes CNNs to extract spatial features and RNNs or LSTMs to describe temporal relationships. Thus, this architecture is useful for tasks such as time-series forecasting or video analysis. CNN-transformer models combine CNNs with the self-attention mechanism of transformer designs. This combination improves the model’s understanding of the global context and long-range interdependence, which is especially useful in medical image analysis and image captioning. CNNs can also be combined with graph neural networks to handle non-Euclidean data, such as chemical structures or social networks. Classification problems can be solved using hybrid CNNs based on traditional ML techniques, such as decision tree (DT) or SVM methods. Feature representation can be achieved using hybrid CNNs based on unsupervised learning models, such as autoencoders. These hybrid models are often designed to serve specific purposes, such as accelerating the computation, preventing overfitting, or enhancing data interpretability. Real-time applications on edge devices utilize lightweight hybrid CNNs with depthwise separable convolutions or attention modules. Hybrid CNNs are an example of flexible and innovative DL frameworks. They can be combined with various other approaches to solve complex, multi-modal, or domain-specific problems more effectively.
The proposed model comprises a CNN with several hidden layers, also known as feature extraction layers. The input layers receive the variables fed into the network, and the output layers extract the features of the input data from the LSTM cells. The hidden layer generally comprises a CL, a PL, an activation function layer, and a dropout layer. The dropout layer mitigates overfitting, a common problem affecting 1D CNNs. Unlike ANNs, CNNs can capture the sequential features of the input. The LSTM layer incorporates these elements by managing long-range connections to predict the output variable.
In the CNN, the flattened layer processes the data before transferring them to the LSTM layer. Overfitting is a common occurrence in DNNs, and among the strategies available to address this issue, dropout is one of the simplest and most successful. In dropout [36], during the training of DNNs, units are temporarily removed from the network with a specific probability. We therefore introduce a dropout layer to prevent overfitting; its output is linked to the LSTM layer for prediction, which is then connected to a fully connected layer. Figure 3 illustrates the proposed framework.
The activation function introduces nonlinearity into a network, enhancing the responsiveness of DNNs and addressing nonlinearity issues that linear networks cannot resolve [37]. Various activation functions are available, including the sigmoid function, tanh function, and ReLU function. The ReLU activation function can assist in addressing the gradient-vanishing problem, thereby enhancing the network convergence and accelerating the computation relative to alternative functions. It is defined by Equation (9):
$\mathrm{ReLU}(A) = \max\left(0, A\right)$. (9)
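The sketch below assembles a 1D CNN-LSTM stack in Keras whose layer shapes are consistent with those later reported in Table 1; the input window length (16 samples, 1 channel), the number of output classes, the dropout rate and placement, and the optimizer are assumptions, and the batch normalization mentioned among the contributions is omitted for brevity.

```python
from tensorflow.keras import layers, models

num_classes = 5   # assumed number of fault categories (not specified here)
time_steps = 16   # input window length consistent with the shapes in Table 1

model = models.Sequential([
    layers.Input(shape=(time_steps, 1)),
    layers.Conv1D(64, kernel_size=3, activation="relu"),   # (None, 14, 64)
    layers.MaxPooling1D(pool_size=2),                      # (None, 7, 64)
    layers.Conv1D(128, kernel_size=3, activation="relu"),  # (None, 5, 128)
    layers.MaxPooling1D(pool_size=2),                      # (None, 2, 128)
    layers.Dropout(0.3),                                   # mitigates overfitting (rate assumed)
    layers.LSTM(50),                                       # (None, 50)
    layers.Flatten(),                                      # (None, 50)
    layers.Dense(100, activation="relu"),                  # (None, 100)
    layers.Dense(num_classes, activation="softmax"),       # class probabilities
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```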

3. Data Analysis and Results

The no-fault category, marked “1”, represents normal system operation. This analysis focused on two fault types: delayed onset of the setback error and early termination of the setback error. Their low occurrence may be compensated for by HVAC control programming. We considered variable-air-volume AHUs in both single-zone and multi-zone HVAC designs. The datasets utilized in this study were acquired from three building systems situated at FLEXLab, a research facility at Lawrence Berkeley National Laboratory in Berkeley, CA, USA [38]. The detailed test setup for the proposed methodology is shown in Figure 4. The data were collected at 1 min intervals from January to December 2017. Overall, 272,160 measurements were recorded, 80% of which constituted the training dataset. Apart from historical data, some operational metrics were also included. The temperature-related data comprised the supply air temperature, supply air temperature setpoints, outside air temperature, mixed air temperature, and return air temperature. Additionally, the recorded control signals encompassed the supply and return air fan velocities, the positions of the outside and return air dampers, the controls for the cooling and heating coil valves, and the setpoint for the supply air duct static pressure.
Temporal variables provide extra information for hybrid CNN models. Such variables include the minute within an hour, the hour within a day, the day of the month, and the month of the year. A collection of 30 binary variables, each corresponding to a numerical value in the dataset, can be used to represent a particular day within a month [39]. One-hot encoding is a technique for encoding categorical variables whereby all bits are set to zero, except for a single binary indicator denoting the particular day. A corresponding encoding is applied to the other date-related variables.
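A minimal pandas sketch of the one-hot treatment of the day-of-month variable is given below; the timestamps and column names are invented for illustration, and pd.get_dummies stands in for whichever encoder the original pipeline used.

```python
import pandas as pd

# Toy timestamps at the 1 min sampling interval used for the dataset.
timestamps = pd.date_range("2017-01-01 00:00", periods=4, freq="1min")
df = pd.DataFrame({
    "minute": timestamps.minute,
    "hour": timestamps.hour,
    "day": timestamps.day,
    "month": timestamps.month,
})

# One binary indicator per day of the month: every bit is zero except the
# one matching the sample's day value.
day_onehot = pd.get_dummies(df["day"], prefix="day")
features = pd.concat([df[["minute", "hour", "month"]], day_onehot], axis=1)
print(features.head())
```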
The data were divided into training, testing, and validation sets, constituting 70%, 15%, and 15% of the total data, respectively. The splitting of the data across these sets must not create any bias toward specific fault types or normal conditions. The dataset was divided into training and testing subsets using the stratified split function in Python version 3.4, which ensures that the proportions of the diverse fault categories are accurately replicated in both sets. This process facilitates the creation of uniform training and testing sets, thereby minimizing the sampling bias. Furthermore, the SimpleImputer function was employed to address missing data by substituting incomplete values with the corresponding feature median. Subsequently, each feature was normalized by subtracting the sample mean and dividing by the standard deviation. This procedure, known as “standardization”, ensures that the resulting distribution has unit variance [40].
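The following scikit-learn sketch mirrors the preprocessing steps named above (stratified splitting, median imputation with SimpleImputer, and standardization); the feature matrix, labels, and the single 70/30 split are placeholders, whereas the paper uses a 70/15/15 training/testing/validation split.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

# Placeholder AHU feature matrix and fault labels (randomly generated).
rng = np.random.default_rng(0)
X = rng.random((1000, 16))
X[rng.random(X.shape) < 0.01] = np.nan          # simulate missing readings
y = rng.integers(0, 5, size=1000)               # assumed fault categories

# Stratified split: fault-class proportions are preserved in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Replace missing values with the per-feature median, then standardize to
# zero mean and unit variance (statistics fitted on the training split only).
imputer = SimpleImputer(strategy="median")
scaler = StandardScaler()
X_train = scaler.fit_transform(imputer.fit_transform(X_train))
X_test = scaler.transform(imputer.transform(X_test))
```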
The HVAC data were utilized to train, test, and validate the proposed hybrid CNN model, which offers several benefits, including effective global feature extraction, the mitigation of overfitting, and enhanced feature learning efficiency. The model was trained over many sessions, its performance was assessed repeatedly, and it was optimized to achieve the maximum performance. The model’s performance was evaluated based on a confusion matrix (CM), which visualizes the performance of a classification model in a tabular format. The performance was also assessed based on the accuracy, precision, recall, and F1-score metrics, which are calculated as follows:
$\mathrm{Accuracy} = \dfrac{\text{True positives} + \text{True negatives}}{\text{True positives} + \text{True negatives} + \text{False positives} + \text{False negatives}}$, (10)
$\mathrm{Precision} = \dfrac{\text{True positives}}{\text{True positives} + \text{False positives}}$, (11)
$\mathrm{Recall} = \dfrac{\text{True positives}}{\text{True positives} + \text{False negatives}}$, (12)
$\mathrm{F1\ Score} = \dfrac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$. (13)
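For reference, the metrics above can be computed directly with scikit-learn; the labels below are placeholders, and macro averaging is assumed for the multi-class case.

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, confusion_matrix)

y_true = [0, 1, 2, 2, 1, 0, 2, 1]   # placeholder ground-truth fault labels
y_pred = [0, 1, 2, 1, 1, 0, 2, 2]   # placeholder model predictions

print("confusion matrix:\n", confusion_matrix(y_true, y_pred))
print("accuracy :", accuracy_score(y_true, y_pred))
# Multi-class fault labels, so precision, recall, and F1 are macro-averaged.
print("precision:", precision_score(y_true, y_pred, average="macro"))
print("recall   :", recall_score(y_true, y_pred, average="macro"))
print("F1 score :", f1_score(y_true, y_pred, average="macro"))
```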
The hybrid CNN-based FDD model required approximately 100 epochs to achieve convergence. Figure 5 illustrates the accuracy curves for the training and validation phases, whereas Figure 6 presents the corresponding loss curves. The model achieved an average fault-classification accuracy exceeding 97%. Figure 7 illustrates the confusion matrix (CM) of the model, which characterizes the per-class behavior in detail and confirms an average accuracy of 97% for the hybrid CNN model. The results of the performance measures, namely accuracy, precision, recall, and F1-score, are presented in Table 2.
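A training sketch consistent with the roughly 100-epoch convergence and the accuracy/loss curves of Figures 5 and 6 is shown below; it assumes the model and preprocessed arrays from the earlier sketches, and the batch size and early-stopping settings are assumptions.

```python
import matplotlib.pyplot as plt
from tensorflow.keras.callbacks import EarlyStopping

# X_train/y_train come from the preprocessing sketch; a channel axis is
# added so the shape matches the Conv1D input (samples, time_steps, 1).
history = model.fit(
    X_train[..., None], y_train,
    validation_split=0.15,
    epochs=100,
    batch_size=64,                     # assumed batch size
    callbacks=[EarlyStopping(patience=10, restore_best_weights=True)],
)

# Reproduce accuracy curves in the style of Figure 5.
plt.plot(history.history["accuracy"], label="training")
plt.plot(history.history["val_accuracy"], label="validation")
plt.xlabel("epoch")
plt.ylabel("accuracy")
plt.legend()
plt.show()
```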

4. Comparison with the State-of-the-Art Methods

To evaluate the efficacy of the proposed hybrid CNN model, it was benchmarked against five baseline models that utilize distinct AI-based methodologies: gradient boosting (GB) [41], random forests [24], a DNN [25], a DRNN [23], and a 1D-CNN. For FDD in the AHU, random forests (precision 0.81, recall 0.89, accuracy 0.83) outperformed gradient boosting (precision 0.79, recall 0.81, accuracy 0.79). This is consistent with the outcome of the study in [24], which concluded that the RF model outperformed the GB model in FDD; unlike RF, GB is not a quick, robust model with minimal tuning and is less resistant to overfitting. Likewise, the DNN model (precision 0.79, recall 0.95, accuracy 0.95) outperformed the DRNN model (precision 0.89, recall 0.92, accuracy 0.91), which is consistent with the outcome of the study in [23]. While DNNs rely on fully connected layers and activation functions to learn patterns, DRNNs leverage memory elements such as recurrent layers (e.g., LSTM or GRU) to retain past information, capturing temporal dependencies in the data. However, DRNNs are more difficult to train owing to vanishing gradients and longer dependency chains, whereas DNNs are generally faster and more efficient for non-sequential tasks. As shown in Table 2, the proposed model (precision 0.9844, recall 0.9885, accuracy 0.97) performed very well as a standalone model for FDD in the HVAC system compared with the other models. The comparison was conducted uniformly, ensuring the absence of any noise. Table 3 compares the accuracy of the proposed approach with that of the benchmark models, highlighting that the proposed model, which combines the advantages of CNNs and LSTM, exhibited the best accuracy.

4.1. Robustness Analysis

To evaluate the robustness of the model, we introduced two noise conditions to simulate real-world sensor disturbances:
  • Gaussian noise with μ = 0, σ = 0.05;
  • Random missing values replaced with the column mean (up to 10%).
The robustness analysis of the proposed method under the two noise conditions is shown in Table 4.
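A small NumPy sketch of the two perturbations above is given below; the evaluation matrix is randomly generated here and simply stands in for the standardized test features.

```python
import numpy as np

rng = np.random.default_rng(0)
X_eval = rng.random((200, 16))                  # placeholder test features

# Condition 1: additive Gaussian noise with mu = 0 and sigma = 0.05.
X_gauss = X_eval + rng.normal(loc=0.0, scale=0.05, size=X_eval.shape)

# Condition 2: up to 10% of the entries dropped and replaced with the
# corresponding column mean.
mask = rng.random(X_eval.shape) < 0.10
col_means = np.broadcast_to(X_eval.mean(axis=0), X_eval.shape)
X_missing = np.where(mask, col_means, X_eval)

print(X_gauss.shape, X_missing.shape)
```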

4.2. Shortcomings and Limitations

Despite its strong performance, the improved hybrid 1D-CNN (IH-1DCNN) has a few limitations:
  • Interpretability: While the model performs well, it lacks transparency. Future work may explore explainable AI methods (e.g., SHAP, LIME).
  • Computation Cost: The hybrid architecture has a higher training time and memory requirements compared to simpler ML models.
  • Limited Fault Diversity: The performance is strong on the existing dataset, but may not generalize to systems with different AHU configurations without retraining.
  • Data Dependency: Requires large and labeled datasets. Rare fault types may be underrepresented, affecting the recall for minority classes.
Consequently, we may delineate some technical discoveries derived from the aforementioned observations:
  • Because many air quality data parameters can affect the FDD accuracy and efficiency, data-driven approaches are needed to quickly identify the key components and to build single and hybrid models for comparison. A standalone model contains just one DL model, whereas a hybrid model combines the best aspects of both kinds. The results show that the hybrid model detects faults in the HVAC system more accurately than isolated models. Thus, the hybrid model should be utilized for multi-feature data.
  • There are advantages and disadvantages to almost every model. FDD requires a CNN-LSTM hybrid model, whereby the former extracts the relevant characteristics from the existing AHU system data, and the latter performs the FDD. The findings demonstrate that the proposed model halves the training time while increasing the FDD accuracy.

5. Conclusions

This study introduced an innovative CNN-LSTM hybrid model, which combines the benefits of CNNs and LSTM to achieve effective FDD for HVAC systems. The proposed architecture demonstrated robust efficacy in classifying diverse fault types, leveraging the LSTM layers for temporal sequence analysis and CLs of the CNNs for feature extraction. The model was evaluated using data collected from the HVAC systems of three real-world buildings. Based on the results, the hybrid CNN model exhibited superior accuracy compared with state-of-the-art fault detection models. The main conclusions of this study are outlined below as follows:
  • The hybrid ML architecture proposed herein combines the benefits of LSTM and 1D CNN algorithms to detect the faults in an HVAC system, thereby enhancing the system efficacy.
  • The proposed model offers rapid convergence, autonomous feature learning, and enhanced learning capabilities, in addition to being more computationally efficient than the other models.
  • The dropout layers in the model facilitate feature extraction and noise reduction, providing crucial information to the LSTM layers.
The hybrid CNN model represents an efficient black-box modeling approach for FDD in HVAC systems. However, data variability exposed some of its deficiencies. While the model could largely manage the disparities in the dataset distribution, such discrepancies still limit its performance, as is the case with the existing black-box algorithms. The proposed model attempts to address the data variability by integrating supplementary modeling techniques, which extend the model’s accurate prediction horizons to several days, enabling long-term use.

Author Contributions

Conceptualization, P. and B.Y.; Methodology, P., B.Y., P.K.; Software, P. and B.Y.; Validation, P.; Formal analysis, P., B.Y. and P.K.; Investigation, B.Y.; Resources, P. and P.K.; Data curation, P.; Writing—original draft, P., B.Y. and P.K.; Writing—review & editing, P., B.Y. and P.K.; Supervision, B.Y.; Funding acquisition, B.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Research Foundation of Korea under Grant NRF-2021R1I1A2045721.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Santamouris, M.; Vasilakopoulou, K. Present and future energy consumption of buildings: Challenges and opportunities towards decarbonisation. e-Prime—Adv. Electr. Eng. Electron. Energy 2021, 1, 100002. [Google Scholar] [CrossRef]
  2. Che, W.W.; Tso, C.Y.; Sun, L.; Ip, D.Y.; Lee, H.; Chao, C.Y.; Lau, A.K. Energy consumption, indoor thermal comfort and air quality in a commercial office with retrofitted heat, ventilation and air conditioning (HVAC) system. Energy Build. 2019, 201, 202–215. [Google Scholar] [CrossRef]
  3. Nižetić, S.; Djilali, N.; Papadopoulos, A.; Rodrigues, J.J. Smart technologies for promotion of energy efficiency, utilization of sustainable resources and waste management. J. Clean. Prod. 2019, 231, 565–591. [Google Scholar] [CrossRef]
  4. Chen, J.; Zhang, L.; Li, Y.; Shi, Y.; Gao, X.; Hu, Y. A review of computing-based automated fault detection and diagnosis of heating, ventilation and air conditioning systems. Renew. Sustain. Energy Rev. 2022, 161, 112395. [Google Scholar] [CrossRef]
  5. Shi, Z.; O’Brien, W. Development and implementation of automated fault detection and diagnostics for building systems: A review. Autom. Constr. 2019, 104, 215–229. [Google Scholar] [CrossRef]
  6. Li, W.; Li, H.; Gu, S.; Chen, T. Process fault diagnosis with model- and knowledge-based approaches: Advances and opportunities. Control Eng. Pract. 2020, 105, 104637. [Google Scholar] [CrossRef]
  7. Zhou, X.; Du, H.; Xue, S.; Ma, Z. Recent advances in data mining and machine learning for enhanced building energy management. Energy 2024, 307, 132636. [Google Scholar] [CrossRef]
  8. Zhao, Y.; Li, T.; Zhang, X.; Zhang, C. Artificial intelligence-based fault detection and diagnosis methods for building energy systems: Advantages, challenges and the future. Renew. Sustain. Energy Rev. 2019, 109, 85–101. [Google Scholar] [CrossRef]
  9. Trizoglou, P.; Liu, X.; Lin, Z. Fault detection by an ensemble framework of Extreme Gradient Boosting (XGBoost) in the operation of offshore wind turbines. Renew. Energy 2021, 179, 945–962. [Google Scholar] [CrossRef]
  10. Zhang, L.; Wen, J.; Li, Y.; Chen, J.; Ye, Y.; Fu, Y.; Livingood, W. A review of machine learning in building load prediction. Appl. Energy 2021, 285, 116452. [Google Scholar] [CrossRef]
  11. Martinez-Viol, V.; Urbano, E.M.; Kampouropoulos, K.; Delgado-Prieto, M.; Romeral, L. Support vector machine based novelty detection and FDD framework applied to building AHU systems. In Proceedings of the 2020 25th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), Vienna, Austria, 8–11 September 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1749–1754. [Google Scholar]
  12. Mirnaghi, M.S.; Haghighat, F. Fault detection and diagnosis of large-scale HVAC systems in buildings using data-driven methods: A comprehensive review. Energy Build. 2020, 229, 110492. [Google Scholar] [CrossRef]
  13. Chen, Z.; O’Neill, Z.; Wen, J.; Pradhan, O.; Yang, T.; Lu, X.; Lin, G.; Miyata, S.; Lee, S.; Shen, C.; et al. A review of data-driven fault detection and diagnostics for building HVAC systems. Appl. Energy 2023, 339, 121030. [Google Scholar] [CrossRef]
  14. Yun, W.-S.; Hong, W.-H.; Seo, H. A data-driven fault detection and diagnosis scheme for air handling units in building HVAC systems considering undefined states. J. Build. Eng. 2021, 35, 102111. [Google Scholar] [CrossRef]
  15. Van Every, P.M.; Rodriguez, M.; Jones, C.B.; Mammoli, A.A.; Martínez-Ramón, M. Advanced detection of HVAC faults using unsupervised SVM novelty detection and Gaussian process models. Energy Build. 2017, 149, 216–224. [Google Scholar] [CrossRef]
  16. Bezyan, Y.; Nasiri, F.; Nik-Bakht, M. A Feature Selection Approach for Unsupervised Steady-State Chiller Fault Detection. In Multiphysics and Multiscale Building Physics, Proceedings of the 9th International Building Physics Conference (IBPC 2024), Toronto, ON, Canada, 25–27 July 2024; International Association of Building Physics; Springer: Singapore, 2025; pp. 148–153. [Google Scholar]
  17. Yan, K.; Chong, A.; Mo, Y. Generative adversarial network for fault detection diagnosis of chillers. Build. Environ. 2020, 172, 106698. [Google Scholar] [CrossRef]
  18. Pule, M.; Matsebe, O.; Samikannu, R. Application of PCA and SVM in fault detection and diagnosis of bearings with varying speed. Math. Probl. Eng. 2022, 2022, 5266054. [Google Scholar] [CrossRef]
  19. Li, Y.; O’Neill, Z. A critical review of fault modeling of HVAC systems in buildings. Build. Simul. 2018, 11, 953–975. [Google Scholar] [CrossRef]
  20. Yang, X.; Chen, J.; Gu, X.; He, R.; Wang, J. Sensitivity analysis of scalable data on three PCA related fault detection methods considering data window and thermal load matching strategies. Expert Syst. Appl. 2023, 234, 121024. [Google Scholar] [CrossRef]
  21. Zebari, R.; Abdulazeez, A.; Zeebaree, D.; Zebari, D.; Saeed, J. A comprehensive review of dimensionality reduction techniques for feature selection and feature extraction. J. Appl. Sci. Technol. Trends 2020, 1, 56–70. [Google Scholar] [CrossRef]
  22. Abid, A.; Khan, M.T.; Iqbal, J. A review on fault detection and diagnosis techniques: Basics and beyond. Artif. Intell. Rev. 2021, 54, 3639–3664. [Google Scholar] [CrossRef]
  23. Taheri, S.; Ahmadi, A.; Mohammadi-Ivatloo, B.; Asadi, S. Fault detection diagnostic for HVAC systems via deep learning algorithms. Energy Build. 2021, 250, 111275. [Google Scholar] [CrossRef]
  24. Masdoua, Y.; Boukhnifer, M.; Adjallah, K.H. Fault detection and diagnosis in AHU system with data driven approaches. In Proceedings of the 2022 8th International Conference on Control, Decision and Information Technologies (CoDIT), Istanbul, Turkey, 17–20 May 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1375–1380. [Google Scholar]
  25. Lee, K.-P.; Wu, B.-H.; Peng, S.-L. Deep-learning-based fault detection and diagnosis of air-handling units. Build. Environ. 2019, 157, 24–33. [Google Scholar] [CrossRef]
  26. Gao, Z.-W.; Xiang, Y.; Lu, S.; Liu, Y. An optimized updating adaptive federated learning for pumping units collaborative diagnosis with label heterogeneity and communication redundancy. Eng. Appl. Artif. Intell. 2025, 152, 110724. [Google Scholar] [CrossRef]
  27. Mehta, M.; Chen, S.; Tang, H.; Shao, C. A federated learning approach to mixed fault diagnosis in rotating machinery. J. Manuf. Syst. 2023, 68, 687–694. [Google Scholar] [CrossRef]
  28. Prince; Hati, A.S. Convolutional neural network-long short term memory optimization for accurate prediction of airflow in a ventilation system. Expert Syst. Appl. 2022, 195, 116618. [Google Scholar] [CrossRef]
  29. Furia, C.A.; Mandrioli, D.; Morzenti, A.; Rossi, M. Modeling Time in Computing; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012. [Google Scholar]
  30. Hochreiter, S.; Schmidhuber, J. Long Short-term Memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
  31. Liu, X.; Tang, Z.; Yang, B. Predicting Network Attacks with CNN by Constructing Images from NetFlow Data. In Proceedings of the 2019 IEEE 5th Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS), Washington, DC, USA, 27–29 May 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 61–66. [Google Scholar]
  32. Yalçın, M.E.; Ayhan, T.; Yeniçeri, R. Artificial Neural Network Models. In Reconfigurable Cellular Neural Networks and Their Applications; Springer: Cham, Switzerland, 2020; pp. 5–22. [Google Scholar]
  33. Lee, K.B.; Cheon, S.; Kim, C.O. A convolutional neural network for fault classification and diagnosis in semiconductor manufacturing processes. IEEE Trans. Semicond. Manuf. 2017, 30, 135–142. [Google Scholar] [CrossRef]
  34. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems; NIPS: Kolkata, India, 2012; pp. 1097–1105. [Google Scholar]
  35. Yu, F.; Koltun, V. Multi-scale context aggregation by dilated convolutions. arXiv 2015, arXiv:1511.07122. [Google Scholar]
  36. Wu, H.; Gu, X. Towards dropout training for convolutional neural networks. Neural Netw. 2015, 71, 1–10. [Google Scholar] [CrossRef]
  37. Cuéllar, M.P.; Delgado, M.; Pegalajar, M. An application of non-linear programming to train recurrent neural networks in time series prediction problems. In Enterprise Information Systems VII, Proceedings of the Seventh International Conference on Enterprise Information Systems (ICEIS 2005), Miami, FL, USA, 24–28 May 2005; Springer: Berlin/Heidelberg, Germany, 2007; pp. 95–102. [Google Scholar]
  38. Li, S. Development and validation of a dynamic air handling unit model, Part I. ASHRAE Trans. 2010, 116, 57–73. [Google Scholar]
  39. Cassel, M.; Lima, F. Evaluating one-hot encoding finite state machines for SEU reliability in SRAM-based FPGAs. In Proceedings of the 12th IEEE International On-Line Testing Symposium (IOLTS’06), Lake Como, Italy, 10–12 July 2006; IEEE: Piscataway, NJ, USA, 2006; p. 6. [Google Scholar]
  40. Granderson, J.; Lin, G.; Harding, A.; Im, P.; Chen, Y. Building fault detection data to aid diagnostic algorithm creation and performance testing. Sci. Data 2020, 7, 65. [Google Scholar] [CrossRef] [PubMed]
  41. Ahmadi, A.; Nabipour, M.; Mohammadi-Ivatloo, B.; Amani, A.M.; Rho, S.; Piran, M.J. Long-term wind power forecasting using tree-based learning algorithms. IEEE Access 2020, 8, 151511–151522. [Google Scholar] [CrossRef]
Figure 1. Structure of LSTM model.
Figure 2. The architecture of the 1D-CNN model.
Figure 3. Structure of the proposed (1D CNN-LSTM) model.
Figure 4. Test setup for proposed methodology [38], where Toa, Qoa, and Pioa are the temperature, flow, and pressure sensors at the outdoor air unit. Tra and Qra are the temperature and flow sensors at the return air unit. Tsa and Qsa are the temperature and flow sensors at the supply air unit.
Figure 5. Training and validation accuracy curves of proposed model with respect to number of training epochs.
Figure 6. Training and validation loss curves of proposed model with respect to number of training epochs.
Figure 7. CM of proposed hybrid CNN model.
Table 1. Structure of the proposed model.
Layer | Output Shape | Param #
conv1d (Conv1D) | (None, 14, 64) | 256
max_pooling1d (MaxPooling1D) | (None, 7, 64) | 0
conv1d_1 (Conv1D) | (None, 5, 128) | 24,704
max_pooling1d_1 (MaxPooling1D) | (None, 2, 128) | 0
lstm (LSTM) | (None, 50) | 35,800
flatten (Flatten) | (None, 50) | 0
dense (Dense) | (None, 100) | 5100
Note: # denotes the number of parameters used by the program.
Table 2. Performance metrics of proposed model.
Precision | Recall | F1-Score | Accuracy
0.9844 | 0.98857 | 0.98648 | 0.97
Table 3. Performance comparison between the proposed model and state-of-the-art models.
Model | Precision | Recall | Accuracy
Gradient Boost [41] | 0.79 | 0.81 | 0.79
Random forests [24] | 0.81 | 0.89 | 0.83
DRNN [23] | 0.89 | 0.92 | 0.91
DNN [25] | 0.79 | 0.95 | 0.95
1D-CNN | 0.874 | 0.865 | 0.892
Proposed model | 0.9844 | 0.98857 | 0.97
Table 4. Performance comparison between the proposed model and baseline 1D-CNN.
Model | Clean Data | Gaussian Noise | Missing Value
Proposed Model | 0.97 | 96.72 | 91.3
Baseline 1D-CNN | 89.2 | 83.7 | 81.6

