Article

A Cluster- and Temperature-Aware Auto-Ensemble Model for Airport Cooling Load Forecasting

1
Department of Energy and Resources Engineering, School of Mechanics and Engineering Science, Peking University, Beijing 100871, China
2
Nanchang Innovation Institute, Peking University, Nanchang 330000, China
3
Ordos Research Institute of Energy, Peking University, Ordos 017000, China
*
Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Energies 2026, 19(5), 1375; https://doi.org/10.3390/en19051375
Submission received: 30 January 2026 / Revised: 24 February 2026 / Accepted: 5 March 2026 / Published: 9 March 2026

Abstract

Accurate cooling load forecasting supports energy-efficient operation in large public buildings such as airports. Cooling load time series are often nonlinear and temporally dependent, with frequent operating condition changes and pronounced thermal inertia, which limits the reliability of single-model forecasting. This study proposes a cluster- and temperature-aware auto-ensemble model (CATS-Ens) for short- and long-term cooling load prediction. CATS-Ens learns condition-dependent model contributions within temperature-based operating intervals and distinct load regimes, enabling collaborative prediction across complementary experts and avoiding reliance on a single globally optimal predictor. The proposed model is evaluated on a real-world hourly cooling load dataset collected from an airport terminal. Results show that CATS-Ens achieves consistently better performance than representative baselines under multiple metrics, including MAE, RMSE, MAPE, sMAPE, and $R^2$. Compared with the best individual baseline, CATS-Ens reduces MAE by 8.5%, RMSE by 8.4%, MAPE by 12.6%, and sMAPE by 7.1%, with an $R^2$ of 0.967. The model maintains stable accuracy under varying operating conditions and alleviates false-positive predictions during zero-load and low-load periods, demonstrating its practical value for cooling load forecasting in complex building energy systems.

1. Introduction

1.1. Research Background

Cooling load forecasting is a fundamental task in building energy management systems. It plays a vital role in optimal control, demand response, and energy efficiency improvement [1]. Accurate short- and long-term forecasting enables advanced control strategies. These techniques reduce unnecessary equipment cycling and support energy-efficient HVAC system operation. This is particularly crucial for large-scale and high-energy-demand buildings, such as airport terminals [1,2].
Compared with other types of building energy demand, the cooling load exhibits several distinctive characteristics. First, cooling systems frequently operate in ON/OFF modes, resulting in intermittent zero-load periods, which significantly challenge traditional regression-based and statistical forecasting models [3]. Second, cooling load demand is highly sensitive to outdoor temperature and occupancy conditions, leading to strong nonlinearity and nonstationary behavior [4]. Third, cooling load patterns vary considerably under different operating conditions, such as part-load operation and peak cooling periods, reflecting pronounced operational heterogeneity in real-world systems [3,5]. These characteristics pose substantial challenges for conventional forecasting methods. Linear and multiple regression models often generate false-positive predictions during system shutdown periods and fail to capture nonlinear temperature–load relationships [3]. Physical-model-based simulation tools, while theoretically accurate, require extensive system information and are difficult to deploy in complex buildings with uncertain operating conditions [6]. Meanwhile, single forecasting models struggle to maintain stable performance under varying operational regimes, particularly when frequent mode switching occurs in real-world cooling systems [7].
In recent years, data-driven approaches have been widely adopted for cooling load forecasting. Recurrent neural networks and long short-term memory models have demonstrated superior nonlinear modeling capability compared with traditional statistical methods [1,8]. More recently, Transformer-based architectures have further improved temporal dependency modeling in complex load time series [9]. Hybrid and decomposition-based approaches, such as SSA- and EMD-assisted deep learning models, enhance prediction accuracy by mitigating noise and capturing multi-scale temporal features [2,10]. Tree-based ensemble methods, including random forest and gradient boosting, remain competitive due to their robustness under limited data conditions [11].
Despite these advances, cooling load forecasting remains challenging in real-world applications. No single model consistently performs best under all operating conditions, especially when cooling systems frequently switch between ON and OFF states. Moreover, many existing studies treat cooling load time series as a homogeneous process and overlook operational heterogeneity and regime transitions [5,12]. Recent review studies emphasize that auto-ensemble learning and regime-aware modeling are promising directions for improving robustness in complex energy systems [7,13].

1.2. Related Work

Cooling load and thermal load forecasting is a fundamental task in building energy management, HVAC system operation, and low-carbon energy system optimization. Early studies mainly relied on statistical time-series models, such as autoregressive and regression-based approaches. Although these methods are computationally efficient, their ability to capture nonlinear dynamics, operational variability, and intermittent cooling demand is limited, especially for complex buildings and large-scale systems [14,15].
With the rapid development of machine learning, data-driven approaches have been widely applied to cooling load forecasting. Tree-based models, including random forests and gradient boosting machines, have demonstrated strong nonlinear representation ability and robustness to noisy measurements [16]. In particular, XGBoost has been extensively adopted in building energy forecasting due to its high efficiency, scalability, and stable performance across different datasets [17]. Homogeneous ensemble trees, such as random forest, have also been validated for hourly building energy prediction and can deliver strong generalization under noisy measurements; however, they typically learn a globally pooled mapping and may be insensitive to regime-dependent dynamics [18].
Deep learning techniques further advanced cooling load forecasting by enabling end-to-end temporal modeling. Artificial neural network (ANN) methods have been systematically reviewed for building cooling load demand forecasting, covering representative network structures and modeling strategies [19]. Recurrent neural networks and their variants, such as LSTM and GRU, have been widely used to capture short- and medium-term temporal dependencies in cooling demand [20]. Dong et al. further proposed an improved LSTM model for hourly cooling load forecasting and reported superior performance compared with several LSTM variants [21]. Beyond architectures, multi-step forecasting strategies have been systematically assessed in building energy prediction, showing that inference design (recursive vs direct vs MIMO) significantly affects stability and error accumulation over extended horizons [22]. More recently, Transformer-based architectures have attracted increasing attention for their ability to model long-range temporal correlations and complex nonlinear patterns in building energy time series [23,24,25]. In addition, hybrid and physics-informed approaches, such as RC-model-assisted learning and physics-informed neural networks, have been introduced to improve multi-step forecasting stability and physical consistency, particularly under limited data conditions [26]. To provide a concise comparison of representative approaches and highlight their strengths and limitations, Table 1 summarizes several influential studies in building and cooling load forecasting.
In parallel, uncertainty has been increasingly recognized as a critical issue in practical load forecasting. Probabilistic forecasting methods that incorporate weather forecasting uncertainty and abnormal peak uncertainty can enhance reliability under real operational conditions [27]. Bayesian deep neural networks have also been explored to quantify epistemic and aleatoric uncertainty for building load prediction, providing uncertainty-aware outputs rather than purely deterministic point forecasts [28].
Despite these advances, several studies have highlighted that the cooling load exhibits strong operational heterogeneity and regime-dependent characteristics. Cooling demand is highly sensitive to occupancy patterns, solar gains, control settings, and operating schedules, leading to distinct load regimes and characteristic daily profiles [29]. Cluster-based analysis and load pattern classification have therefore been proposed to characterize cooling behavior prior to forecasting [30]. However, most existing forecasting studies focus on improving individual model architectures under globally aggregated data. Although ensemble learning has been explored to enhance robustness, many existing ensemble methods rely on static or global weighting schemes, which are insufficient to address condition-dependent cooling dynamics.
Motivated by the above limitations, this study develops a cluster- and temperature-aware auto-ensemble model for cooling load prediction. Cooling load heterogeneity is explicitly characterized through clustering of operating regimes, while temperature-based segmentation captures thermal driving conditions. Model contributions are automatically adjusted across joint regimes, enabling accurate and robust forecasting for complex building energy systems.

1.3. Major Contributions and Novelties

To overcome the challenges of intermittency induced by ON/OFF operation, regime heterogeneity, and temperature-driven nonlinear behavior in airport cooling load time series, this paper develops a cooling load forecasting model that integrates operational awareness with adaptive ensemble learning. The main contributions are summarized as follows:
  • A comprehensive feature engineering pipeline is developed for cooling load forecasting, including correlation-based feature screening, temporal dependency representation, and temperature-derived indicator construction.
  • Representative deep learning and machine learning baselines are implemented and evaluated under a unified experimental protocol, ensuring consistent data preprocessing and leakage-free training–testing settings for both short-term (day-ahead) and long-term (season-wide) forecasting.
  • An ON/OFF gating mechanism is introduced to explicitly identify operating states, effectively reducing structural mispredictions and ensuring physical consistency during intermittent zero-load periods.
  • A joint regime partitioning strategy is developed by integrating operational clustering and quantile-based temperature thresholds, systematically characterizing heterogeneous operating modes and thermal dynamics.
  • CATS-Ens is proposed to support regime-specific expert selection and adaptive blending, improving forecasting robustness under complex operational states and peak-risk thermal conditions.

2. Methods

2.1. Dataset

The dataset used in this study consists of historical cooling load measurements collected from the cooling system of a large airport facility located in Ordos, China, together with corresponding meteorological variables obtained from an open-access European meteorological database. The meteorological variables include outdoor air temperature, relative humidity, wind speed, wind direction, atmospheric pressure, solar radiation, and dew point temperature [31,32]. Ordos is situated in a mid-temperate semi-arid continental climate zone, characterized by typical hot-summer and cold-winter conditions. Summers are relatively short, hot, and dry, whereas winters are long and severe. Annual precipitation is limited and mainly concentrated in the summer months, resulting in arid conditions and relatively low humidity. Transitional seasons are brief, often accompanied by frequent sand-dust events and relatively high annual average wind speeds.
The data cover two full cooling seasons and represent real-world operating conditions, providing sufficient variability in both cooling demand and environmental factors. To ensure consistency with practical cooling load forecasting scenarios, only cooling-season data are used for model development. All samples are preserved in strict chronological order to reflect realistic prediction conditions and to avoid information leakage. The dataset is divided into training, validation, and testing subsets using a time-ordered splitting strategy, with 80%, 10%, and 10% of the data allocated to each subset, respectively [33]. For visualization purposes in Figure 1, the cooling load is normalized using a maximum-value scaling approach. Specifically, the normalized cooling load $y_t^{\mathrm{norm}}$ is defined as
\[ y_t^{\mathrm{norm}} = \frac{y_t}{\max(y)}, \tag{1} \]
where $y_t$ denotes the observed cooling load at time $t$, and $\max(y)$ represents the maximum cooling load observed within the selected cooling seasons (June–August, 2023 and 2024).
This normalization rescales the load values to the range $[0, 1]$ to facilitate comparative visualization across different periods, while preserving the relative temporal variation patterns. It is noted that normalization is applied solely for visualization and does not affect the model training or forecasting procedures, which are performed using the original load values.
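As a minimal sketch of the time-ordered split and the maximum-value scaling described above, the following Python snippet may be helpful. The function names and the synthetic load values are illustrative assumptions and are not taken from the study's code; only the 80/10/10 proportions and the scaling formula come from the text.

```python
import numpy as np

def chronological_split(n_samples, train_frac=0.8, val_frac=0.1):
    """Return index arrays for a time-ordered train/validation/test split."""
    n_train = int(n_samples * train_frac)
    n_val = int(n_samples * val_frac)
    idx = np.arange(n_samples)  # indices stay in chronological order (no shuffling)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def max_scale(y):
    """Scale a non-negative load series by its maximum (used for plotting only)."""
    y = np.asarray(y, dtype=float)
    return y / y.max()

# Synthetic hourly load with zero-load periods (illustrative values only).
load = np.array([0.0, 0.0, 120.0, 300.0, 450.0, 600.0, 380.0, 150.0, 0.0, 0.0])
train_idx, val_idx, test_idx = chronological_split(len(load))
load_norm = max_scale(load)
```

Because the indices are never shuffled, every validation and test sample lies strictly after the training period, which is the property the leakage-free protocol relies on.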
Figure 1 illustrates the representative characteristics of the cooling load during the cooling seasons of 2023 and 2024. The time series exhibits pronounced intermittency, with frequent zero-load periods corresponding to system shutdown or standby states. To further highlight this behavior, several typical time windows are enlarged in the figure. The highlighted regions clearly demonstrate abrupt transitions between active operation and near-zero load states, revealing a distinct regime-switching pattern rather than smooth and continuous temporal evolution.
During low-demand periods, the cooling load frequently drops to zero, indicating shutdown or reduced-capacity operation under mild weather or nighttime conditions. Even during peak-demand intervals, intermittent fluctuations and short-duration interruptions are observed, suggesting the influence of operational constraints and ON/OFF control strategies. These patterns confirm that the cooling load dynamics are not governed solely by continuous thermal inertia and environmental drivers but are also significantly affected by discrete operational states and control logic.
Consequently, the cooling load exhibits hybrid dynamics that combine continuous thermal processes with discrete regime transitions. This state-dependent behavior implies that treating cooling load as a purely continuous regression target may overlook important structural characteristics. The empirical observations from the highlighted regions therefore provide direct justification for incorporating state-aware feature engineering and regime-sensitive forecasting strategies in the proposed framework.

2.2. Baseline Models and Evaluation Metrics

Given the time-series nature of cooling load data, forecasting models capable of capturing temporal dependencies and nonlinear relationships are selected as baseline methods. Previous review studies on load forecasting and time-series prediction have shown that recurrent neural networks, attention-based models, and tree-based machine learning approaches are among the most effective techniques for short-term load forecasting [20].
Based on this research consensus, five representative forecasting models are employed as base learners in this study: recurrent neural networks (RNN), long short-term memory networks (LSTM), gated recurrent units (GRU), Transformer, and XGBoost [34]. The RNN family is well suited for modeling sequential dependencies, while Transformer-based models capture long-range temporal relationships through self-attention mechanisms [25]. In addition, XGBoost is included as a strong nonlinear baseline due to its robustness and effectiveness in modeling complex feature interactions. All forecasting models considered in this study can be generally formulated as a nonlinear mapping from historical observations and exogenous variables to future cooling load:
\[ \hat{y}_t = f_\theta\left(x_t, x_{t-1}, \ldots, x_{t-L}\right), \tag{2} \]
where $\hat{y}_t$ denotes the predicted cooling load at time $t$, $x_t$ represents the input feature vector, $L$ is the look-back window length, and $f_\theta(\cdot)$ denotes a forecasting model parameterized by $\theta$.
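To make the look-back formulation concrete, the sketch below builds the input windows $(x_{t-L}, \ldots, x_t)$ that a sequence model would consume. It is only an illustration under stated assumptions: the helper name `make_windows` and the toy arrays are hypothetical, and the study's actual data loader is not described in the text.

```python
import numpy as np

def make_windows(features, target, lookback):
    """Pair each target y_t with the window x_{t-L}, ..., x_t (oldest step first)."""
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    X, y = [], []
    for t in range(lookback, len(features)):
        X.append(features[t - lookback:t + 1])  # L + 1 consecutive feature vectors
        y.append(target[t])
    return np.stack(X), np.array(y)

# Toy example: 6 time steps, 2 features per step, look-back L = 2.
features = np.arange(12, dtype=float).reshape(6, 2)
target = np.arange(6, dtype=float) * 10.0
X, y = make_windows(features, target, lookback=2)
# X has shape (samples, L + 1, n_features); y has shape (samples,)
```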
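The forecasting performance is evaluated using several commonly adopted metrics, including the mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE), symmetric mean absolute percentage error (sMAPE), and coefficient of determination ($R^2$) [35,36]. The formulas for these metrics are as follows:
\[ \mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - \hat{y}_i \right|. \tag{3} \]
\[ \mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}. \tag{4} \]
\[ \mathrm{MAPE} = \frac{100\%}{N} \sum_{i=1}^{N} \left| \frac{y_i - \hat{y}_i}{y_i} \right|. \tag{5} \]
\[ \mathrm{sMAPE} = \frac{100\%}{N} \sum_{i=1}^{N} \frac{\left| y_i - \hat{y}_i \right|}{\left( |y_i| + |\hat{y}_i| \right)/2}. \tag{6} \]
\[ R^2 = 1 - \frac{\sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{N} \left( y_i - \bar{y} \right)^2}, \tag{7} \]
where $\bar{y}$ denotes the mean value of the observed cooling load.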
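The five metrics can be sketched in a few lines of NumPy. One point requires an assumption: MAPE is undefined when the observed load is zero, and the text does not state how zero-load hours are treated, so the snippet below simply excludes them from the percentage-based metrics. The function name and the `eps` tolerance are illustrative.

```python
import numpy as np

def forecast_metrics(y_true, y_pred, eps=1e-8):
    """Compute MAE, RMSE, MAPE, sMAPE (both in percent), and R^2."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    # Assumption: hours with zero observed load are skipped, since MAPE divides by y_i.
    nonzero = np.abs(y_true) > eps
    mape = 100.0 * np.mean(np.abs(err[nonzero] / y_true[nonzero]))
    # Assumption: hours where both observation and prediction are zero are skipped.
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2.0
    valid = denom > eps
    smape = 100.0 * np.mean(np.abs(err[valid]) / denom[valid])
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "sMAPE": smape, "R2": r2}

# Illustrative values only.
metrics = forecast_metrics([100.0, 200.0, 300.0, 400.0], [110.0, 190.0, 300.0, 420.0])
```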

2.3. Algorithm Framework and Workflow

Figure 2 illustrates the overall framework of the proposed cooling load forecasting approach. The framework is systematically designed to address the intrinsic characteristics of cooling load, including intermittency (frequent ON/OFF switching), operational heterogeneity (multiple operating regimes), and strong temperature-dependent nonlinearity. The workflow begins with comprehensive feature analysis and enhancement, where temporal, meteorological, and operational features are extracted and augmented. This includes constructing lagged variables inspired by autocorrelation analysis, adding polynomial and piecewise temperature features to capture nonlinear thermal responses, and incorporating cooling degree-hour (CDH) indicators as physics-informed drivers [37]. Spearman correlation screening is applied to retain strongly relevant features, reducing noise and dimensionality while preserving the most informative predictors for cooling demand dynamics.
All models were implemented in Python (version 3.10.18). Deep learning models were developed using PyTorch (version 2.5.1), while tree-based models were implemented using XGBoost. Hyperparameter optimization (HPO) was conducted using the Optuna framework with a Tree-structured Parzen Estimator sampler. Experiments were performed on a workstation equipped with an Intel Core i9-13900HX CPU, 16 GB DDR5 RAM, and an NVIDIA RTX 4060 GPU (8 GB VRAM). CUDA acceleration was enabled when available. Random seeds were fixed and early stopping based on validation performance was applied to ensure reproducibility and stable convergence.
Building upon the refined feature set, a diverse pool of forecasting models is trained as base learners. To ensure each model operates near its optimal capacity, a structured Hyperparameter Optimization procedure is conducted using Optuna. For each model type and target variable, HPO searches the defined parameter space by minimizing the validation RMSE [38], and the best-found parameters are saved and injected into the main training pipeline. This step guarantees that every expert model is fairly and robustly calibrated, establishing a high-performance baseline for the subsequent ensemble stage. The core innovation of the framework lies in the adaptive ensemble module, which integrates two complementary gating mechanisms: a Cooling ON/OFF Gating layer that uses a binary classifier to suppress false-positive predictions during system shutdown periods and a Joint Temperature–Cluster Gating mechanism that routes samples into distinct regimes based on both outdoor temperature segments and identified operating clusters. For each resulting joint regime, the top-two performing expert models are selected, and an optimal convex blending weight is learned on the validation set. The final prediction is the output of this regime-specific top-two blend, further masked by the ON/OFF gate.
Unlike conventional monolithic forecasting methods, CATS-Ens performs context-aware decision-making during prediction. Hyperparameter-optimized experts are coordinated through a dual-gating and regime-adaptive fusion mechanism so that the modeling strategy matches each thermal and operational state. The ON/OFF gating component reduces structural errors during zero-load periods, while the cluster- and temperature-aware auto-ensemble mechanism improves the representation of regime-dependent nonlinear patterns and peak-load behaviors.

2.4. Problem Formulation and Notations

This section introduces the mathematical formulation of the forecasting problem, explicitly states the physical assumptions governing the cooling dynamics, and defines the core equations for the regime-aware ensemble mechanism.
Let $\{y_t\}_{t=1}^{N}$ denote the measured hourly cooling load, and let $z_t$ denote the exogenous inputs available at time $t$ (e.g., weather and calendar variables). The fundamental forecasting task is to learn a mapping $F$ from historical information $x_t$ to the future cooling load:
\[ \hat{y}_{t+h} = F\left(x_t\right), \quad h \in \mathcal{H}, \tag{8} \]
where $x_t$ is the constructed feature vector, and $\mathcal{H}$ is the prediction horizon set. In this study, short-term forecasting refers to day-ahead prediction at the aggregated 24 h scale, while long-term forecasting refers to season-wide forecasting at the original hourly resolution.
The design of our forecasting framework is driven by several key physical and operational assumptions regarding airport cooling systems: (1) thermal inertia induces strong temporal dependence; (2) the system load is highly intermittent, containing extended zero-demand periods bounded by physical shutdown constraints; (3) operational conditions switch frequently, leading to heterogeneous operating modes; and (4) the temperature–load relationship becomes increasingly nonlinear under high thermal stress.
To mathematically address the intermittency assumption, we define a cooling load ON/OFF gating mechanism. Let $p_{\mathrm{on},t}$ denote the predicted probability of the system being in the ON state. A binary gating variable $g_t \in \{0, 1\}$ is introduced using a decision threshold $\tau$:
\[ g_t = \mathbb{I}\left(p_{\mathrm{on},t} \ge \tau\right), \tag{9} \]
where $\mathbb{I}(\cdot)$ is the indicator function. This gate enforces a strict zero-load constraint during predicted OFF periods.
To address the heterogeneous conditions and nonlinear thermal stress, we define a joint regime partitioning strategy. First, thermal regimes are delineated using quantile-based thresholds computed from the validation outdoor temperature series $T_{\mathrm{val}}$:
\[ \theta_1 = Q_{p_1}\left(T_{\mathrm{val}}\right), \quad \theta_2 = Q_{p_2}\left(T_{\mathrm{val}}\right). \tag{10} \]
The thermal regime index $r_t$ at time $t$ with temperature $T_t$ is defined as
\[ r_t = \begin{cases} 1, & T_t < \theta_1, \\ 2, & \theta_1 \le T_t < \theta_2, \\ 3, & T_t \ge \theta_2. \end{cases} \tag{11} \]
Subsequently, clustering analysis is utilized to assign an operational mode label $c_t \in \{0, 1, 2, 3\}$ for each timestamp. The combination of thermal and operational states forms the joint regime index:
\[ s_t = (r_t, c_t), \tag{12} \]
which serves as the foundational condition tag for the auto-ensemble mechanism.
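The joint regime tag can be sketched as follows. The quantile levels `p1` and `p2`, the temperature values, and the cluster labels are hypothetical placeholders, since the text does not report the values used; only the three-way thermal split and the pairing with a cluster label follow the formulation above.

```python
import numpy as np

def thermal_regime(temp, theta1, theta2):
    """Return 1 if T < theta1, 2 if theta1 <= T < theta2, and 3 if T >= theta2."""
    temp = np.asarray(temp, dtype=float)
    return 1 + (temp >= theta1).astype(int) + (temp >= theta2).astype(int)

# Hypothetical quantile levels and validation temperatures (degrees Celsius).
p1, p2 = 0.33, 0.66
temp_val = np.array([18.0, 21.0, 24.0, 27.0, 30.0, 33.0])
theta1, theta2 = np.quantile(temp_val, [p1, p2])

r = thermal_regime(temp_val, theta1, theta2)
c = np.array([0, 0, 1, 2, 3, 3])  # placeholder cluster labels in {0, 1, 2, 3}
joint_regime = list(zip(r.tolist(), c.tolist()))  # s_t = (r_t, c_t)
```

Representing each regime as a tuple makes it usable directly as a dictionary key when per-regime expert pairs and blending weights are stored.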
Based on the defined joint regimes, we formulate an adaptive ensemble strategy to combine predictions from a pool of $M$ expert forecasters $\{f_m\}_{m=1}^{M}$. Let $\hat{y}_{m,t}$ denote the point prediction from the $m$-th expert. For a specific joint regime $s$, we define its corresponding validation subset as $I_s = \{t \in D_{\mathrm{val}} : s_t = s\}$. The performance of each expert in this regime is quantified by the Root Mean Square Error (RMSE):
\[ \mathrm{RMSE}_{m,s} = \sqrt{\frac{1}{|I_s|} \sum_{t \in I_s} \left( y_t - \hat{y}_{m,t} \right)^2}. \tag{13} \]
To maintain deployment efficiency, a compact Top-2 selection strategy is adopted to identify the best-performing experts for each regime:
\[ \left( m_s^{(1)}, m_s^{(2)} \right) = \operatorname{Top2}\left( \left\{ \mathrm{RMSE}_{m,s} \right\}_{m=1}^{M} \right). \tag{14} \]
A regime-specific blending weight $\alpha_s \in [0, 1]$ is then learned by minimizing the validation squared error:
\[ \alpha_s = \arg\min_{\alpha \in [0,1]} \sum_{t \in I_s} \left( y_t - \alpha\, \hat{y}_{m_s^{(1)},t} - (1 - \alpha)\, \hat{y}_{m_s^{(2)},t} \right)^2. \tag{15} \]
Finally, the intermediate ensemble prediction $\hat{y}_t^{\mathrm{ens}}$ is constrained by the ON/OFF gate $g_t$ (Equation (9)) to yield the physically consistent final output $\hat{y}_t^{\mathrm{final}}$:
\[ \hat{y}_t^{\mathrm{ens}} = \alpha_{s_t}\, \hat{y}_{m_{s_t}^{(1)},t} + \left(1 - \alpha_{s_t}\right) \hat{y}_{m_{s_t}^{(2)},t}, \qquad \hat{y}_t^{\mathrm{final}} = g_t \cdot \hat{y}_t^{\mathrm{ens}}. \tag{16} \]
The detailed engineering implementation and algorithm workflow based on these formulations will be systematically elaborated in Section 3.
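For a single regime, the Top-2 selection, the convex blending weight, and the ON/OFF masking can be sketched as below. Two choices are assumptions rather than details given in the text: the blending weight is obtained from the closed-form least-squares solution clipped to $[0, 1]$ (valid because the objective is a one-dimensional convex quadratic), and the gate threshold defaults to 0.5. The function names and all numerical values are illustrative.

```python
import numpy as np

def fit_regime_blend(y_val, preds_val):
    """Select the two lowest-RMSE experts in one regime and fit the weight alpha."""
    y_val = np.asarray(y_val, dtype=float)
    rmse = {m: float(np.sqrt(np.mean((y_val - p) ** 2))) for m, p in preds_val.items()}
    first, second = sorted(rmse, key=rmse.get)[:2]  # lowest validation RMSE first
    a, b = preds_val[first], preds_val[second]
    diff = a - b
    denom = float(np.dot(diff, diff))
    # Unconstrained minimizer of sum((y - b) - alpha * (a - b))^2, then clipped to [0, 1].
    alpha = 1.0 if denom == 0.0 else float(np.dot(y_val - b, diff) / denom)
    return first, second, float(np.clip(alpha, 0.0, 1.0))

def gated_blend(pred_first, pred_second, alpha, p_on, tau=0.5):
    """Blend two expert forecasts and force zero output when the gate predicts OFF."""
    gate = (np.asarray(p_on, dtype=float) >= tau).astype(float)
    blend = alpha * np.asarray(pred_first) + (1.0 - alpha) * np.asarray(pred_second)
    return gate * blend

# Illustrative validation data for one regime (values are not from the study).
y_val = np.array([100.0, 200.0, 300.0])
preds_val = {
    "lstm": np.array([108.0, 208.0, 308.0]),
    "xgboost": np.array([90.0, 190.0, 290.0]),
    "gru": np.array([150.0, 250.0, 350.0]),
}
first, second, alpha = fit_regime_blend(y_val, preds_val)
final = gated_blend(preds_val[first], preds_val[second], alpha, p_on=[0.9, 0.2, 0.7])
```

In this toy case the two selected experts err in opposite directions, so the fitted weight cancels their biases, while the second time step is set to zero because its ON probability falls below the threshold.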

3. Experimental Analysis

This section implements the methodological framework defined in Section 2 and evaluates its effectiveness on real-world airport cooling load data. Following the unified regime-and-constraint formulation, we first construct informative features to represent temporal dependency and environmental sensitivity. We then develop regime identification and adaptive ensemble mechanisms and finally assess the performance under short-term and long-term forecasting scenarios.

3.1. Feature Engineering

Feature engineering aims to encode the structural characteristics discussed in Section 2, including thermal inertia, operational heterogeneity, and temperature-dependent nonlinearity. A multi-stage feature analysis and selection pipeline is therefore developed.

3.1.1. Temporal Dependency Analysis and Feature Construction

To explicitly represent the temporal dependence induced by thermal inertia, autocorrelation analysis is conducted to examine the intrinsic time-series structure of cooling load data. For a discrete cooling load series $\{y_t\}_{t=1}^{N}$, the autocorrelation function (ACF) at lag $k$ is defined as
\[ \mathrm{ACF}(k) = \frac{\sum_{t=1}^{N-k} \left( y_t - \bar{y} \right)\left( y_{t+k} - \bar{y} \right)}{\sum_{t=1}^{N} \left( y_t - \bar{y} \right)^2}, \tag{17} \]
where $\bar{y}$ denotes the sample mean of the cooling load series, and $N$ is the total number of observations.
Analysis of the ACF reveals several dominant lag intervals that correspond to distinct physical and operational mechanisms within the cooling system [39]. As illustrated in Figure 3, the airport cooling load series exhibits significant temporal dependencies across multiple time scales.
Firstly, strong autocorrelation at short lags (e.g., 1–3 h) indicates pronounced short-term persistence. This behavior is primarily attributable to system thermal inertia and thermal storage effects within the building envelope and the chilled water network. Owing to the heat capacity of the building structure and the circulating fluid, the cooling demand cannot adjust instantaneously, leading to gradual load evolution. Additionally, equipment start-up dynamics and inherent delays in control responses further contribute to this short-term autocorrelation.
Secondly, a clear and dominant peak around a 24 h lag corroborates the presence of a daily periodicity. This daily cycle is driven by diurnal variations in outdoor temperature, patterns of solar radiation, occupancy schedules, and routine airport operational activities. The persistence observed at this lag reflects repetitive thermal and operational conditions across consecutive days.
Furthermore, a noticeable autocorrelation component around a 168 h (7-day) lag suggests a weekly pattern. This weekly dependence is primarily related to differences between weekday and weekend passenger flows, occupancy intensity, and adjustments in operational scheduling.
In summary, the identified dominant lags (short-term, 24 h, and 168 h) reveal that the cooling load dynamics are shaped by a combination of continuous thermal inertia processes and structured operational cycles. These findings provide a physically interpretable foundation for the selection of lagged features in the forecasting model.
To assess the statistical significance of the identified temporal dependencies, the ACF values are further evaluated using approximate 95% confidence bounds under the assumption of a white-noise process, given by
\[ \pm \frac{1.96}{\sqrt{N}}. \tag{18} \]
Autocorrelation values exceeding these bounds are considered statistically significant, indicating non-random temporal dependence in the cooling load series.
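The sample ACF and its white-noise bounds can be reproduced with a short NumPy sketch. The series below is a synthetic half-rectified sine with a 24 h period, chosen only to mimic a daytime-only load; it is not the airport data, and the function name is an assumption.

```python
import numpy as np

def acf(y, max_lag):
    """Sample autocorrelation for lags 0..max_lag, normalized by the lag-0 sum of squares."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    denom = np.sum(d ** 2)
    return np.array([np.sum(d[:len(y) - k] * d[k:]) / denom for k in range(max_lag + 1)])

# Synthetic two-week hourly series with a daily cycle and zero load at night.
t = np.arange(24 * 14)
y = np.maximum(0.0, np.sin(2.0 * np.pi * t / 24.0))

rho = acf(y, max_lag=48)
bound = 1.96 / np.sqrt(len(y))  # approximate 95% bound under white noise
significant_lags = np.flatnonzero(np.abs(rho) > bound)
```

For this synthetic series the 24 h lag stands well above the bound while the 12 h lag is negative, which is the qualitative signature of a daily cycle.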
Based on the ACF analysis, representative lagged cooling load features are constructed to explicitly capture dominant temporal dependencies at physically interpretable intervals (e.g., short-term lags, 24 h, and 168 h). These discrete lag features enable the model to directly learn the structured periodic patterns and short-term persistence identified from the statistical analysis.
However, relying solely on discrete lagged inputs may lead to a fragmented temporal representation, as they capture dependencies only at specific time delays while potentially overlooking gradual trend evolution between those points. To address this limitation, exponentially weighted moving average (EWMA) features are introduced as complementary temporal descriptors. By assigning exponentially decaying weights to historical observations, EWMA features emphasize recent load behavior while retaining information from earlier states, thereby forming a continuous and smoothed representation of system inertia and dynamic adjustment processes.
The combined use of ACF-guided lag features and EWMA features enhances temporal representation in two key aspects. First, the lag features preserve interpretable discrete dependencies associated with thermal inertia, daily cycles, and weekly operational patterns. Second, the EWMA features model the gradual evolution of load levels and transient dynamics that occur between discrete lag intervals. This hybrid temporal encoding strategy allows the model to simultaneously capture abrupt state persistence and smoothly evolving load trends, improving its ability to represent both short-term dynamics and medium-term transitions within the cooling system. Overall, these observations justify the proposed ACF-guided temporal feature construction strategy, in which short-term, daily, and weekly characteristics are simultaneously represented to enhance forecasting robustness under real-world intermittent operating conditions.
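A possible construction of the lag and EWMA features is sketched below. The smoothing factor `alpha` is a hypothetical value because the text does not report it, and the EWMA is deliberately computed from observations strictly before time $t$ so that the feature does not leak the target; whether the study uses exactly this recursion is an assumption.

```python
import numpy as np

def lag_feature(y, k):
    """Return y shifted by k steps (k >= 1); the first k entries are NaN."""
    y = np.asarray(y, dtype=float)
    out = np.full(len(y), np.nan)
    out[k:] = y[:-k]
    return out

def ewma_past(y, alpha):
    """EWMA of past values only: s_t = alpha * y_{t-1} + (1 - alpha) * s_{t-1}."""
    y = np.asarray(y, dtype=float)
    out = np.full(len(y), np.nan)
    if len(y) > 1:
        out[1] = y[0]  # initialize with the first observation
        for t in range(2, len(y)):
            out[t] = alpha * y[t - 1] + (1.0 - alpha) * out[t - 1]
    return out

# Small illustrative series.
y = np.array([0.0, 10.0, 20.0, 30.0])
lag1 = lag_feature(y, 1)
smooth = ewma_past(y, alpha=0.5)

# On an hourly series, the ACF-guided lags named in the text would be built as:
hourly = np.arange(200, dtype=float)
lag_features = {f"lag_{k}h": lag_feature(hourly, k) for k in (1, 24, 168)}
```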

3.1.2. Environmental Feature Selection via Spearman Analysis

Following the identification of temporal dependency features, Spearman rank correlation analysis is further employed to evaluate the monotonic relationships between candidate features and the target cooling load. Unlike linear correlation measures, Spearman correlation is capable of capturing nonlinear but monotonic dependencies, making it particularly suitable for analyzing relationships between cooling load and meteorological variables [40].
The Spearman rank correlation coefficient is defined as
\[ \rho_s = 1 - \frac{6 \sum_{i=1}^{N} d_i^2}{N\left(N^2 - 1\right)}, \tag{19} \]
where $d_i$ denotes the difference between the ranks of the $i$-th pair of observations, and $N$ is the sample size.
As shown in Figure 4, the bar chart presents the Spearman rank correlation coefficients between cooling-related features and the cooling load. The bar color encodes the direction of the monotonic relationship (blue for positive correlation and red for negative correlation), while the bar length and the numerical labels indicate the magnitude of the correlation coefficient ρ .
Temporal features derived from the cooling load itself exhibit the largest correlation magnitudes, as reflected by longer bars and higher $\rho$ values. In particular, short-term lag features (e.g., 1 h), daily periodic lags (24 h), weekly periodic lags (168 h), and the corresponding EWMA features show strong positive correlations. This quantitatively confirms the pronounced temporal inertia and persistence in cooling load dynamics.
To explicitly represent the nonlinear influence of the ambient temperature on the cooling demand, temperature-derived features are incorporated into the feature set. Specifically, polynomial transformations of the outdoor air temperature are introduced to capture nonlinear sensitivity under high-temperature conditions. In addition, cooling degree-hour indicators are constructed as physics-informed features to quantify the accumulated thermal driving effect, which has been widely adopted in building energy modeling. By embedding such temperature-related nonlinearities into the input space, the forecasting models can better differentiate cooling load behaviors across diverse thermal environments.
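As an illustration of the temperature-derived features described above, the following minimal sketch builds polynomial terms and a rolling cooling degree-hour (CDH) indicator from an hourly outdoor temperature series. The base temperature of 26 °C, the 24 h accumulation window, the cubic polynomial order, and the function name `temperature_features` are assumptions made for this example; the text does not specify them.

```python
def temperature_features(temps_c, base_temp_c=26.0, window_h=24):
    """Build polynomial and cooling degree-hour features (illustrative only).

    base_temp_c and window_h are assumed values, not taken from the paper.
    """
    rows = []
    for t, temp in enumerate(temps_c):
        # Instantaneous cooling degree: excess of temperature over the base.
        cooling_degree = max(temp - base_temp_c, 0.0)
        # CDH: sum of hourly excesses over the trailing window (inclusive of t).
        start = max(0, t - window_h + 1)
        cdh = sum(max(x - base_temp_c, 0.0) for x in temps_c[start:t + 1])
        rows.append({
            "T": temp,
            "T2": temp ** 2,   # quadratic term
            "T3": temp ** 3,   # cubic term
            "cooling_degree": cooling_degree,
            "cdh": cdh,
        })
    return rows


# Hypothetical hourly temperatures (°C), with a short 3 h window for readability.
rows = temperature_features([24.0, 27.0, 30.0, 25.0], base_temp_c=26.0, window_h=3)
for row in rows:
    print(row)
```

The polynomial terms let a linear or tree-based learner respond more strongly at high temperatures, while the CDH term carries the accumulated thermal driving effect over recent hours.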
Among environmental variables, the outdoor air temperature and solar radiation display relatively larger positive coefficients, indicating that increased thermal and radiative conditions monotonically drive higher cooling demand. Temperature-derived cooling degree indicators present moderate positive correlations, demonstrating their effectiveness in capturing the temperature-driven load behavior.
Conversely, several meteorological and calendar-related variables, such as wind direction, atmospheric pressure, and day-of-week indicators, exhibit coefficients close to zero, corresponding to very short bars in the chart. This suggests weak monotonic associations with cooling load variations.
Based on the quantified correlation magnitudes, this study adopts a feature selection threshold of $|\rho| \geq 0.05$ to retain variables with meaningful monotonic relevance for subsequent model development.
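The selection rule can be sketched as follows. This is a minimal, dependency-free illustration rather than the authors' implementation: Spearman's coefficient is computed as the Pearson correlation of average ranks, which coincides with the rank-difference formula when there are no ties. The feature names and sample values are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def pearson(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # constant series: treated here as no monotonic association
    return cov / (var_a * var_b) ** 0.5


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def select_features(features, target, threshold=0.05):
    """Keep features whose |rho| with the target reaches the threshold."""
    rho = {name: spearman(col, target) for name, col in features.items()}
    kept = [name for name, r in rho.items() if abs(r) >= threshold]
    return kept, rho


# Hypothetical samples: cooling load (kW) and two candidate features.
load = [0.0, 120.0, 340.0, 560.0, 410.0, 80.0]
features = {
    "outdoor_temp": [22.0, 26.0, 30.0, 35.0, 32.0, 24.0],
    "wind_dir": [350.0, 10.0, 120.0, 200.0, 270.0, 45.0],
}
kept, rho = select_features(features, load)
print(kept, rho)
```

In this toy sample the temperature feature is perfectly rank-aligned with the load and is retained, whereas the wind direction feature falls below the threshold and is dropped.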

3.1.3. Operational Regime Identification via Clustering

Based on the extracted temporal and environmental features, clustering analysis is further conducted to identify distinct operational regimes of cooling load during cooling seasons [41,42]. The clustering process groups samples with similar operating patterns by minimizing within-cluster variance, formulated as
$$\min \sum_{j=1}^{K} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2,$$
where $C_j$ denotes the $j$-th cluster, and $\mu_j$ represents the corresponding cluster centroid. In this work, $K = 4$ is adopted, resulting in four interpretable cooling load regimes. According to the cluster labels shown in Figure 5, the four operating conditions are summarized as follows:
  • Cluster 0: Medium-load Dynamic Operation (Medium dynamic);
  • Cluster 1: OFF / Near-zero Load (OFF-like);
  • Cluster 2: Peak-load / Stress Operation (Peak / stress);
  • Cluster 3: Low-load Steady Operation (Low steady).
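The within-cluster variance objective above is the one minimized by k-means. A minimal sketch with Lloyd's iterations is given below, assuming two already normalized features per sample and a deterministic farthest-point initialization; both choices are assumptions for illustration, since the text does not state the feature scaling or the initialization used. The cluster indices produced here are arbitrary and need not match the numbering in Figure 5.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic init: start at the first point, then repeatedly add the
    point farthest from all centroids chosen so far (assumed strategy)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, max_iter=100):
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each sample joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stable: converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    # Objective value: total within-cluster sum of squared distances.
    inertia = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, inertia


# Hypothetical (normalized load, normalized temperature) samples.
points = [
    (0.00, 0.20), (0.02, 0.25),   # OFF-like
    (0.25, 0.45), (0.28, 0.50),   # low steady
    (0.55, 0.70), (0.60, 0.65),   # medium dynamic
    (0.90, 0.95), (0.95, 0.90),   # peak / stress
]
labels, centroids, inertia = kmeans(points, k=4)
print(labels, round(inertia, 5))
```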
Figure 5a illustrates the temporal evolution of the clustered regimes in both the 2023 and 2024 cooling seasons. Frequent regime switching can be observed throughout the two seasons, indicating that the cooling system operates under multiple alternating operational states rather than a single stationary mode. In particular, Cluster 1 (OFF-like) appears intermittently across the entire season, corresponding to periods of system shutdown or standby operation when the cooling demand is near zero. Meanwhile, Cluster 3 (Low steady) typically occurs during normal active cooling periods with relatively stable but low demand, suggesting conservative operation or mild cooling requirements. Cluster 0 (Medium dynamic) is associated with moderate demand accompanied by noticeable fluctuations, which can be attributed to occupancy-related disturbances, internal heat gains, and control adjustments. Finally, Cluster 2 (Peak/stress) represents high-load conditions that emerge during extreme thermal stress or intensive operational demand, where the system is likely operating close to capacity.
The cluster-wise cooling load distributions are presented in Figure 5b. The four clusters demonstrate clearly differentiated load magnitudes and variability patterns, confirming that the clustering outcomes reflect meaningful operational regimes. Cluster 1 (OFF-like) shows a highly concentrated distribution around near-zero values, characterizing non-cooling or standby states. Cluster 3 (Low steady) exhibits low but non-zero cooling demand with relatively narrow dispersion, representing stable low-intensity operation. Cluster 0 (Medium dynamic) demonstrates a wider distribution with moderate load levels, indicating stronger temporal variability under dynamic operating conditions. In contrast, Cluster 2 (Peak/stress) is characterized by consistently high cooling loads with relatively concentrated dispersion, capturing high-demand states where the system is under sustained stress. This clear separation in both central tendency and dispersion suggests that modeling the entire dataset under a single stationary assumption may obscure regime-dependent variability structures.
Figure 5c maps the clustered samples in the temperature–load space. The regimes separate along the outdoor temperature dimension, indicating that they are strongly governed by ambient thermal driving: lower temperatures are predominantly associated with Cluster 1 (OFF-like) and Cluster 3 (Low steady), while higher temperatures correspond to Cluster 0 (Medium dynamic) and especially Cluster 2 (Peak/stress). However, the clusters partially overlap within similar temperature ranges, so temperature alone does not fully determine the load magnitude. This residual variability suggests that the regimes are also modulated by operational scheduling and system control behavior, and possibly by additional influencing factors that warrant further investigation in future studies. The interpretable regime identification therefore provides a structured basis for the subsequent cluster- and temperature-aware auto-ensemble forecasting approach.
Overall, the temporal switching behavior, differentiated statistical distributions, and temperature-conditioned separation collectively confirm that the cooling load exhibits multi-regime and nonstationary characteristics. These findings provide structural support for adopting regime-aware modeling approaches.

3.2. Development and Evaluation of the CATS-Ens Approach

3.2.1. Cooling Load ON/OFF Gating and Temperature-Based Regime Partitioning

To handle the structural zero-load issue and regime-dependent thermal dynamics formulated in Section 2, we implement a two-step conditioning mechanism: ON/OFF gating followed by temperature-based partitioning.
In the practical implementation of the ON/OFF gating mechanism, a binary classifier (or a rule-based status indicator derived from historical operational schedules) is first deployed to generate the predicted ON probability. Based on the gating logic defined previously, predictions during identified shutdown intervals are strictly forced to zero. This implementation explicitly suppresses false-positive cooling demands, thereby significantly improving the physical consistency of the subsequent regression outputs.
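A minimal sketch of the gating step is shown below. It assumes a hard threshold $\tau$ on the predicted ON probability and clips negative outputs to zero; the exact gating rule is defined in Section 2 and is not reproduced here, so the default `tau=0.5` and the function name `apply_on_off_gate` are illustrative assumptions.

```python
def apply_on_off_gate(ensemble_pred, p_on, tau=0.5):
    """Force predictions to zero whenever the system is classified as OFF.

    ensemble_pred: ungated load predictions (kW)
    p_on:          predicted ON probabilities, one per time step
    tau:           assumed probability threshold for the ON state
    """
    gated = []
    for y_hat, p in zip(ensemble_pred, p_on):
        g = 1 if p >= tau else 0           # binary gating variable
        gated.append(g * max(y_hat, 0.0))  # OFF -> 0; ON -> non-negative load
    return gated


# Hypothetical values: an ON hour, a shutdown hour, and a negative raw output.
print(apply_on_off_gate([350.0, 42.0, -5.0], [0.9, 0.1, 0.8]))
```

The second entry shows the intended effect: a small positive regression output during a shutdown interval is suppressed rather than reported as cooling demand.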
Beyond the binary ON/OFF states, the ambient temperature continuously drives the cooling load dynamics. Following the mathematical definition of thermal regimes, we adopt a quantile-based partitioning strategy on the validation outdoor temperature series. In this study, the quantile thresholds are empirically set to $(p_1, p_2) = (0.33, 0.66)$. This specific configuration partitions the temperature distribution into three approximately balanced regimes (low, medium, and high) and is selected for three main reasons:
  • Data-Adaptive Robustness: It provides a dynamic partition that is robust to climate distribution shifts across different seasons and geographical sites, avoiding the bias typically associated with fixed absolute temperature thresholds.
  • Sample Balance: It ensures sufficient and relatively balanced training samples within each thermal regime, which is crucial for the stable estimation of regime-wise expert blending weights.
  • Nonlinear Adaptation: A three-regime design aligns well with the physical reality of nonlinear temperature–cooling load responses, specifically enhancing the model’s sensitivity and robustness under high-temperature peak-demand conditions.
By integrating the structural ON/OFF gating with the continuous thermal regimes and operational clustering labels, the heterogeneous data space is effectively segmented. This joint partitioning lays a solid foundation for the subsequent CATS-Ens.
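The quantile-based partitioning and the joint regime index can be sketched as follows. The linear-interpolation quantile and the treatment of samples lying exactly on a threshold (assigned to the lower regime) are assumptions of this example; the helper names are hypothetical.

```python
def quantile(sorted_vals, p):
    """Linear-interpolation quantile of an already sorted list (assumed convention)."""
    pos = p * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def temperature_thresholds(val_temps, p1=0.33, p2=0.66):
    """Derive the two temperature thresholds from the validation series."""
    s = sorted(val_temps)
    return quantile(s, p1), quantile(s, p2)


def temperature_regime(temp, theta1, theta2):
    """Return 0 (low), 1 (medium), or 2 (high)."""
    if temp <= theta1:
        return 0
    if temp <= theta2:
        return 1
    return 2


def joint_regime(temp, cluster_label, theta1, theta2):
    """Joint regime index: (temperature regime, operational cluster label)."""
    return (temperature_regime(temp, theta1, theta2), cluster_label)


# Hypothetical validation temperatures: 20, 21, ..., 40 °C.
val_temps = [20.0 + i for i in range(21)]
theta1, theta2 = temperature_thresholds(val_temps)
print(theta1, theta2)
print(joint_regime(36.0, 2, theta1, theta2))
```

Because the thresholds are recomputed from the temperature series of each site or season, the same code yields different absolute boundaries in different climates while keeping the three regimes roughly balanced.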

3.2.2. Cluster- and Temperature-Aware Auto-Ensemble Mechanism

Building upon the partitioned data space and the operational clustering labels extracted in Section 3.1.3, the proposed CATS-Ens framework executes a regime-specific expert selection and adaptive blending pipeline. This execution strategy ensures that model contributions dynamically adjust across different external thermal conditions and internal operational modes, avoiding the limitations of a single global predictor.
The practical workflow of the auto-ensemble mechanism is implemented through the following sequential pipeline:
  • Step 1: Regime Matching. For each timestamp in the forecasting horizon, the system evaluates the real-time weather inputs and operational cluster tags to determine the active joint regime index.
  • Step 2: Top-2 Expert Selection. Instead of deploying all available models, which increases the computational overhead and the risk of overfitting, CATS-Ens retrieves the Top-2 best-performing experts for the currently active regime based on their historical validation RMSE. The expert pool comprises the baseline models discussed in Section 2.
  • Step 3: Adaptive Blending. The predictions from the selected Top-2 experts are fused using the optimized blending weights specific to that regime. This step ensures that models excelling in capturing peak loads dominate during high-stress regimes, while models better at capturing steady-state trends dominate during baseline operations.
  • Step 4: Final Constraint Integration. In the final step, the integrated ensemble prediction is filtered through the ON/OFF gate. This guarantees that the final output remains physically bounded and strictly non-negative.
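The four steps above can be sketched in a few lines. This is an illustrative outline under stated assumptions, not the authors' code: the blend is taken to be a convex combination of the two selected experts, the weight is found by a simple grid search on validation RMSE, and the expert names, data, and threshold `tau=0.5` are hypothetical.

```python
def rmse(pred, obs):
    return (sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)) ** 0.5


def fit_regime(val_preds, val_obs, grid_steps=20):
    """For one regime, pick the Top-2 experts by validation RMSE and
    grid-search a convex blending weight alpha (assumed blending form)."""
    ranked = sorted(val_preds, key=lambda m: rmse(val_preds[m], val_obs))
    first, second = ranked[0], ranked[1]
    best_alpha, best_err = 1.0, float("inf")
    for i in range(grid_steps + 1):
        alpha = i / grid_steps
        blend = [alpha * a + (1 - alpha) * b
                 for a, b in zip(val_preds[first], val_preds[second])]
        err = rmse(blend, val_obs)
        if err < best_err:
            best_alpha, best_err = alpha, err
    return first, second, best_alpha


def predict(expert_preds, regime_table, regime, p_on, tau=0.5):
    """Steps 1-4 for one timestamp: look up the regime, blend, then gate."""
    first, second, alpha = regime_table[regime]
    blend = alpha * expert_preds[first] + (1 - alpha) * expert_preds[second]
    return max(blend, 0.0) if p_on >= tau else 0.0


# Hypothetical validation data for a single regime, keyed (temperature regime, cluster).
val_obs = [100.0, 200.0, 300.0, 400.0]
val_preds = {
    "expert_a": [o + 20.0 for o in val_obs],    # mild overestimation
    "expert_b": [o - 30.0 for o in val_obs],    # mild underestimation
    "expert_c": [o + 100.0 for o in val_obs],   # poor expert, not selected
}
regime_table = {(2, 1): fit_regime(val_preds, val_obs)}
print(regime_table[(2, 1)])

step = {"expert_a": 520.0, "expert_b": 470.0, "expert_c": 600.0}
print(predict(step, regime_table, (2, 1), p_on=0.9))
print(predict(step, regime_table, (2, 1), p_on=0.1))
```

In this toy case the two selected experts have opposite biases, so the blend cancels most of the error, which is the complementarity that regime-wise blending is meant to exploit.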
This automated regime-aware pipeline mitigates both systematic bias and error accumulation. To demonstrate its effectiveness, the proposed method is evaluated in the subsequent sections under two practical scenarios defined by distinct prediction horizons and application objectives. Specifically, short-term forecasting refers to day-ahead prediction at a 24 h aggregated scale [43], emphasizing stability and bias reduction for operational planning. Conversely, long-term forecasting involves season-wide rolling prediction at the original hourly resolution, highlighting the model’s robustness against error propagation and intermittent operating behaviors across the entire cooling season.

3.2.3. Short-Term Cooling Load Forecasting Performance

Following the proposed two-stage decision framework (cooling ON/OFF gating + joint-regime adaptive ensemble), we first evaluate the short-term day-ahead forecasting performance with a 24 h horizon. This setting is particularly relevant for near-term operational scheduling and control, where stable daily-level forecasts are required. To quantify performance at the day-ahead scale, both predicted and observed cooling load series are aggregated to the 24 h timescale.
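The aggregation step can be illustrated with a short sketch. It assumes non-overlapping 24 h blocks summarized by their mean and drops an incomplete trailing day; the text does not state whether the mean or the sum is used, so this choice and the function name are assumptions.

```python
def aggregate_daily(hourly, hours_per_day=24):
    """Average an hourly series over consecutive, non-overlapping 24 h blocks."""
    n_days = len(hourly) // hours_per_day  # incomplete trailing day is dropped
    return [
        sum(hourly[d * hours_per_day:(d + 1) * hours_per_day]) / hours_per_day
        for d in range(n_days)
    ]


# Hypothetical series: two full days plus five extra hours.
hourly_load = [100.0] * 24 + [200.0] * 24 + [50.0] * 5
daily_load = aggregate_daily(hourly_load)
print(daily_load)
```

The same function would be applied to both the predicted and the observed series before computing day-ahead error statistics.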
As shown in Figure 6a, all evaluated models follow the overall day-to-day variation in cooling demand after temporal aggregation. To expose finer differences, Figure 6b provides two side-by-side zoomed-in views of 80 h periods characterized by significant load fluctuations. In these views, differences in bias and consistency among the models become clearly observable, which is expected because daily forecasts remain sensitive to intermittent shutdown intervals and regime-dependent load dynamics. CATS-Ens shows the tightest alignment with the actual load curve across the different operating conditions.
Benefiting from the ON/OFF gating that suppresses false-positive outputs during non-cooling periods and the regime-adaptive blending that adjusts model selection and weights across thermal/operational regimes, CATS-Ens exhibits the most concentrated error distribution around zero in Figure 6c, indicating superior day-ahead stability.
Table 2 summarizes the relative error statistics at the 24 h forecasting scale. CATS-Ens yields the lowest mean absolute relative error (MAE) of 0.96%, with 95% of absolute relative errors confined within 2.15%. The mean relative bias is limited to 0.19%, confirming nearly unbiased short-term forecasting performance. In contrast, the baseline models exhibit larger error dispersion and higher bias, suggesting reduced robustness when daily forecasts must reconcile varying thermal regimes and intermittent operating behaviors.
Overall, these results verify that the proposed gating-and-ensemble strategy translates into stable and accurate day-ahead cooling load forecasts, making CATS-Ens well suited for real-time operation planning and short-term control optimization.

3.2.4. Long-Term Cooling Load Forecasting Performance

To further examine the robustness under extended forecasting horizons, we evaluate the long-term cooling load forecasting performance over the entire cooling season at the original hourly resolution. Compared with the 24 h aggregated setting, this task is more challenging due to error accumulation over time, intermittent ON/OFF operation, and persistent diurnal cycling patterns.
As illustrated in Figure 7a, all evaluated models capture the pronounced diurnal cycling of cooling load throughout the cooling season, indicating that the fundamental temporal periodicity can generally be learned. However, the dense visualization of the hourly data over the entire season severely obscures the fine-grained tracking performance, especially during daily peaks and valleys. To reveal these critical details, Figure 7b provides two side-by-side micro-scale zoomed-in views focusing on specific 40 h windows. By visualizing the point-by-point predictions, the superior tracking capability of CATS-Ens becomes distinctly evident. Its predictions consistently align with the actual load data points, whereas baseline models often exhibit noticeable overestimation or underestimation.
Furthermore, Figure 7c shows that baseline models exhibit relatively large error dispersion, especially during rapid load transitions and intermittent operating periods, where false-positive outputs and peak underestimation may occur. By leveraging the proposed framework, CATS-Ens mitigates these long-horizon challenges from two complementary aspects: the ON/OFF gating suppresses unphysical non-zero predictions during shutdown intervals, while the cluster- and temperature-aware adaptive blending assigns regime-specific expert combinations to handle heterogeneous operating conditions and peak-demand regimes. As a result, CATS-Ens yields a more concentrated error distribution over the entire season and demonstrates improved robustness against error propagation.
The quantitative results at the original hourly scale are summarized in Table 3. Using the normalized relative error $(\hat{y} - y) / y_{\max}$, CATS-Ens achieves the lowest hourly MAE of 2.79% and a 95th-percentile absolute error (P95) of 8.18%. In comparison, baseline models show larger dispersion, such as the Transformer model with an hourly MAE of 3.66% and a noticeably higher tail error.
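For reference, the normalized error statistics can be computed as in the sketch below. The linear-interpolation percentile is an assumed convention (the text does not state which one is used), and the sample values are hypothetical.

```python
def normalized_error_stats(pred, obs):
    """Statistics of the normalized error (pred - obs) / max(obs)."""
    y_max = max(obs)
    errors = [(p - o) / y_max for p, o in zip(pred, obs)]
    abs_sorted = sorted(abs(e) for e in errors)
    n = len(errors)
    # 95th percentile of absolute errors via linear interpolation (assumed).
    pos = 0.95 * (n - 1)
    lo = int(pos)
    hi = min(lo + 1, n - 1)
    p95 = abs_sorted[lo] + (pos - lo) * (abs_sorted[hi] - abs_sorted[lo])
    return {
        "mae": sum(abs_sorted) / n,   # mean absolute normalized error
        "p95": p95,                   # tail error
        "bias": sum(errors) / n,      # signed mean: > 0 means overestimation
    }


# Hypothetical hourly loads (kW).
obs = [0.0, 500.0, 1000.0, 800.0, 200.0]
pred = [20.0, 480.0, 950.0, 840.0, 200.0]
stats = normalized_error_stats(pred, obs)
print({k: round(100 * v, 2) for k, v in stats.items()})  # values in percent
```

Normalizing by the maximum load rather than by each observation keeps the statistic well defined during zero-load hours.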
These results indicate that, although most models can track the seasonal trend, the proposed cluster- and temperature-aware auto-ensemble model significantly improves the long-term reliability by reducing both systematic deviation and extreme errors.
To further investigate the reliability of CATS-Ens under specific operational stresses, we extract three representative “significant scenarios” from the test set for a granular case-by-case analysis. As illustrated in Figure 8, these scenarios highlight the comparative advantages of the proposed framework:
Scenario A: Peak Demand Day. During extreme high-temperature periods, traditional neural network baselines (e.g., Transformer and RNN) often suffer from significant peak underestimation. In contrast, CATS-Ens tracks the load surges much more closely. By leveraging temperature-adaptive blending weights, the model effectively prioritizes expert components with better high-load regression capabilities, reducing the peak-shaving error seen in vanilla models.
Scenario B: High Volatility Day. When the system undergoes rapid load ramps, CATS-Ens demonstrates improved stability and tracking compared to recurrent baselines. The joint-regime gating $(r_t, c_t)$ assists the model in adjusting to regime shifts. While a slight delay is inherent in data-driven forecasting, CATS-Ens maintains the predicted magnitude closer to the ground truth, mitigating the severe “lag effect” or magnitude mismatch observed in traditional models.
Scenario C: Transitional Load Period. This scenario examines the model’s behavior during load shedding and shutdown. Although capturing the exact trigger of sudden drops is challenging, CATS-Ens shows superior capability in suppressing non-physical “tail” predictions (false positives) during shutdown intervals. Compared to baselines that maintain high residual outputs, the ON/OFF gating mechanism ensures better physical consistency with the zero-load state, effectively narrowing the gap between predictions and actual operations.
These observations illustrate how the model balances dynamic tracking with operational constraints, providing a physical foundation for the quantitative improvements discussed in the next section.

3.3. Results and Discussion

The experimental results demonstrate that the proposed auto-ensemble strategy consistently outperforms all baseline models across multiple evaluation metrics. In particular, CATS-Ens exhibits superior robustness under varying operating conditions and effectively mitigates false-positive predictions during zero-load and low-load periods.
Table 4 summarizes the forecasting performance of all compared models. CATS-Ens achieves the lowest MAE (203.92 kW) and RMSE (281.51 kW), along with the highest $R^2$ (0.967). Compared with the strongest recurrent baseline (RNN), CATS-Ens reduces the MAE from 222.34 kW to 203.92 kW (an 8.3% improvement) and lowers the RMSE from 307.19 kW to 281.51 kW. This reduction of 25.68 kW in RMSE underscores its effectiveness in mitigating large-deviation errors.
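For transparency, the relative improvements over the RNN baseline follow directly from the Table 4 values quoted above:

```latex
\[
\frac{222.34 - 203.92}{222.34} = \frac{18.42}{222.34} \approx 8.3\%,
\qquad
\frac{307.19 - 281.51}{307.19} = \frac{25.68}{307.19} \approx 8.4\%.
\]
```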
Notably, some baseline models exhibit competitive performance in certain relative metrics. For example, the Transformer achieves a lower MAPE (34.31%) than CATS-Ens (36.99%), but its RMSE (385.11 kW) and sMAPE (42.49%) are substantially higher. This discrepancy indicates that while the Transformer performs reasonably under moderate load conditions, it exhibits instability during peak or transitional intervals, leading to amplified squared errors. Similarly, GRU and XGBoost yield relatively acceptable $R^2$ values but higher RMSE, suggesting higher susceptibility to errors under nonlinear load amplification.
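To make the behavior of the five metrics concrete, the sketch below computes them with their standard textbook definitions. Two conventions are assumptions of this example and may differ from those used in the paper: zero-load observations are excluded from MAPE (where it is undefined), and an sMAPE term is set to zero when both the prediction and the observation are zero. The sample values are hypothetical.

```python
def forecast_metrics(pred, obs):
    n = len(obs)
    errors = [p - o for p, o in zip(pred, obs)]
    mae = sum(abs(e) for e in errors) / n
    rmse = (sum(e * e for e in errors) / n) ** 0.5

    # MAPE over non-zero observations only (assumed convention).
    nonzero = [(e, o) for e, o in zip(errors, obs) if o != 0]
    mape = 100.0 * sum(abs(e) / abs(o) for e, o in nonzero) / len(nonzero)

    # sMAPE with the 2|e| / (|y| + |y_hat|) form; 0/0 terms count as zero.
    smape_terms = []
    for p, o in zip(pred, obs):
        denom = abs(p) + abs(o)
        smape_terms.append(0.0 if denom == 0 else 2.0 * abs(p - o) / denom)
    smape = 100.0 * sum(smape_terms) / n

    mean_obs = sum(obs) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "rmse": rmse, "mape": mape, "smape": smape, "r2": r2}


# Hypothetical loads (kW): the first hour is a shutdown hour with a small
# false-positive prediction.
obs = [0.0, 100.0, 200.0, 400.0]
pred = [10.0, 110.0, 180.0, 400.0]
metrics = forecast_metrics(pred, obs)
print({k: round(v, 4) for k, v in metrics.items()})
```

In this toy sample a 10 kW false positive during a zero-load hour contributes the maximum possible sMAPE term, which illustrates why relative metrics and RMSE can rank models differently when shutdown intervals are present.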
The superiority of CATS-Ens lies not only in its average accuracy but also in its structured control of heterogeneous error sources. In real-world cooling systems, forecasting deviations primarily arise from two mechanisms: structural bias during near-zero or shutdown states and nonlinear amplification under thermal stress.
During off-like intervals, regression-based predictors tend to generate small positive outputs due to smoothing behavior around zero. This bias accumulates and inflates the relative error metrics. The ON/OFF gating mechanism (Equation (9)) eliminates this structural error by enforcing physical consistency, which explains the simultaneous reduction in MAE and sMAPE.
Under high-temperature conditions, cooling demand responds nonlinearly to thermal forcing. Small temperature deviations may propagate into disproportionately large load variations. Global predictors that average across regimes tend to underestimate this curvature, leading to large square errors and elevated RMSE. The temperature-based regime partitioning (Equations (10) and (11)) isolates thermal sensitivity levels, allowing experts with stronger nonlinear modeling capacity to dominate peak intervals. This targeted specialization reduces error magnification, accounting for the pronounced RMSE improvement.
Furthermore, clustering captures the operational heterogeneity induced by equipment scheduling and dynamic switching. The joint regime definition $s_t = (r_t, c_t)$ enables regime-wise expert evaluation and Top-2 blending. The learned weights $\alpha_s$ act as regime-specific regularizers, reducing the prediction variance in dynamic states while preserving responsiveness under stress regimes.
The relative improvement in RMSE (8.4%) is slightly larger than that in MAE (8.3%), which suggests that CATS-Ens suppresses extreme deviations rather than merely smoothing predictions. This characteristic is particularly important for capacity planning and operational reliability in large-scale cooling systems.
Overall, the performance gain of CATS-Ens stems from its mechanism-aware decomposition of intermittency, operational heterogeneity, and temperature-dependent nonlinearity. By aligning model structure with the physical and operational characteristics of cooling systems, the framework enhances robustness without increasing model complexity. Nevertheless, the current formulation remains deterministic and does not explicitly quantify predictive uncertainty under rare disturbances or extreme climate shifts. Adaptive recalibration of regime boundaries and probabilistic extensions constitute important directions for future research.

Economic Implications Based on Forecasting Accuracy

Beyond statistical accuracy improvements, the practical value of cooling load forecasting lies in its impact on operational decision-making and energy expenditure. In centralized cooling systems, hourly load forecasts are directly used for equipment scheduling, chiller staging, and capacity planning. Forecasting deviations may lead to conservative overestimation (resulting in unnecessary energy consumption) or underestimation (causing emergency adjustments and potential peak stress).
As shown in Table 4, the proposed CATS-Ens reduces the RMSE from 307.19 kW (RNN baseline) to 281.51 kW, corresponding to an absolute reduction of 25.68 kW (approximately 8.4%). Let $y_t$ denote the observed cooling load and $\hat{y}_t$ the predicted load at time $t$. The forecasting error is defined as
$$\varepsilon_t = \hat{y}_t - y_t.$$
In practical operation, the cooling load can be converted into electrical power consumption using the coefficient of performance (COP):
$$P_t = \frac{y_t}{\mathrm{COP}}.$$
Accordingly, the electrical deviation induced by forecasting error becomes
$$\Delta P_t = \frac{\varepsilon_t}{\mathrm{COP}}.$$
If $\pi_t$ represents the time-dependent electricity price, and $\Delta t$ denotes the time resolution, the incremental operational cost deviation over $N$ time steps can be approximated as
$$\Delta C = \sum_{t=1}^{N} \frac{|\varepsilon_t|}{\mathrm{COP}} \, \Delta t \, \pi_t.$$
Assuming a representative COP = 4.0 for large-scale chiller systems, the RMSE reduction of 25.68 kW corresponds to an electrical deviation reduction of approximately 25.68/4.0 ≈ 6.42 kW, i.e., about 6.42 kWh for each hour of operation. Over the entire cooling season, such reductions accumulate and contribute to measurable electricity cost savings.
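The cost approximation can be evaluated with a few lines of code. In the sketch below, the error series and the electricity prices are hypothetical placeholders; only the COP of 4.0 and the 25.68 kW RMSE reduction come from the text.

```python
def cost_deviation(errors_kw, prices, cop=4.0, dt_h=1.0):
    """Incremental cost deviation: sum over t of |error| / COP * dt * price.

    errors_kw: forecasting errors in cooling load (kW)
    prices:    electricity price at each step (currency per kWh, hypothetical)
    cop:       coefficient of performance (4.0 as assumed in the text)
    dt_h:      time resolution in hours
    """
    return sum(abs(e) / cop * dt_h * p for e, p in zip(errors_kw, prices))


# Hypothetical hourly errors (kW) and a two-level tariff ($/kWh).
errors_kw = [40.0, -20.0, 0.0, 80.0]
prices = [0.10, 0.10, 0.20, 0.20]
delta_c = cost_deviation(errors_kw, prices)
print(round(delta_c, 2))

# Electrical equivalent of the reported RMSE reduction.
electrical_kw = 25.68 / 4.0
print(round(electrical_kw, 2))
```

Because the errors enter through their absolute value, overestimation and underestimation are penalized symmetrically in this simplified estimate.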
Although this estimation is simplified and does not explicitly incorporate demand charges or detailed dispatch optimization, it quantitatively demonstrates that improved forecasting accuracy directly reduces avoidable operational deviations. Therefore, the performance gain achieved by CATS-Ens is not only evident in the error metrics but also economically meaningful in practical cooling system management.
Beyond economic cost reduction, the proposed forecasting framework also provides methodological support for building energy management. In centralized cooling systems, load forecasts serve as key inputs for operational planning and system-level energy coordination. Improving the forecasting stability reduces the uncertainty in daily load estimation and enhances the reliability of information used in energy management processes.
Moreover, enforcing physical consistency during OFF intervals and stabilizing predictions under high-temperature stress directly translates to fewer false alarms and more reliable performance assessment across varying thermal conditions. Therefore, the proposed cluster- and temperature-aware auto-ensemble model strengthens not only the predictive accuracy but also the informational robustness required in modern building energy management systems.

4. Conclusions

This study proposed CATS-Ens, a cluster- and temperature-aware auto-ensemble model, for cooling load forecasting in large public buildings such as airports. The model was designed to explicitly address three practical characteristics commonly observed in real-world cooling systems: structural intermittency induced by ON/OFF operation, operational heterogeneity across load regimes, and temperature-dependent nonlinear amplification under thermal stress.
By integrating ON/OFF gating, joint regime partitioning, and cluster- and temperature-aware top-2 expert blending, CATS-Ens transforms a globally nonstationary forecasting problem into structured regime-level subproblems. This mechanism-aware decomposition enables targeted control of both bias and variance across heterogeneous operating states. In particular, structural bias during shutdown intervals is suppressed through state-consistent gating, while nonlinear error amplification under high-temperature conditions is mitigated through regime-specific expert specialization.
Comprehensive experiments on an airport cooling load dataset demonstrate that CATS-Ens achieves the lowest MAE (203.92 kW) and RMSE (281.51 kW), along with the highest $R^2$ (0.967) among representative baselines. Compared to the strongest recurrent model, it reduces the MAE by 8.3% and RMSE by 25.68 kW. Beyond numerical improvements, CATS-Ens effectively mitigates false-positive predictions during zero- and low-load periods and exhibits superior stability during peak or transitional intervals. This pronounced reduction in extreme deviations is critical for operational stability and capacity planning in large-scale cooling systems.
From an application perspective, the proposed framework provides a practical and robust solution for both short-term (day-ahead) and long-term (season-wide) cooling load forecasting in airport energy management systems. Its modular architecture allows the flexible integration of additional expert models and facilitates transfer to other public building types with limited structural modification. Future work will extend CATS-Ens toward probabilistic forecasting, adaptive regime recalibration, and multi-building validation to further enhance deployment reliability in complex public building environments.

Author Contributions

Writing—original draft preparation, X.-Y.X.; data curation, J.-R.L.; methodology, X.-Y.X. and Y.-W.F.; resources, Y.-Z.W.; writing—review and editing, X.-R.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the National Natural Science Foundation of China (No. 52376041).

Data Availability Statement

Data available upon request.

Acknowledgments

This work is also supported by the Inner Mongolia Autonomous Region Science and Technology Major Project and Peking University Nanchang Innovation Institute.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
RNN: Recurrent neural networks
LSTM: Long short-term memory networks
GRU: Gated recurrent units
MAE: Mean absolute error
RMSE: Root mean square error
MAPE: Mean absolute percentage error
sMAPE: Symmetric mean absolute percentage error
ACF: Autocorrelation function
EWMA: Exponentially weighted moving average
CDH: Cooling degree-hour
COP: Coefficient of performance
HPO: Hyperparameter optimization
CATS-Ens: Cluster- and temperature-aware auto-ensemble model

Nomenclature

$y_t$: Observed cooling load at time $t$ (kW)
$y_t^{\mathrm{norm}}$: Normalized cooling load at time $t$ (-)
$\hat{y}_t$: Predicted cooling load at time $t$ (kW)
$\hat{y}_t^{\mathrm{ens}}$: Ensemble prediction before ON/OFF gating (kW)
$\hat{y}_t^{\mathrm{final}}$: Final gated prediction (kW)
$\hat{y}_{m,t}$: Prediction of the $m$-th expert model at time $t$ (kW)
$\varepsilon_t$: Forecasting error at time $t$ (kW)
$P_t$: Electrical power consumption at time $t$ (kW)
$\Delta P_t$: Electrical power deviation induced by forecasting error (kW)
$\Delta C$: Cumulative incremental operational cost deviation ($)
$\pi_t$: Electricity price at time $t$ ($/kWh)
$\Delta t$: Time resolution (h)
$\mathrm{COP}$: Coefficient of performance (-)
$\mathbf{x}_t$: Input feature vector at time $t$ (-)
$L$: Look-back window length (h)
$F$: Forecasting mapping function (-)
$f_\theta$: Forecasting function parameterized by $\theta$ (-)
$\theta$: Model parameters (-)
$N$: Number of samples (-)
$\bar{y}$: Mean observed cooling load (kW)
$\max(y)$: Maximum observed cooling load during the selected cooling seasons (kW)
$R^2$: Coefficient of determination (-)
$\mathrm{ACF}(k)$: Autocorrelation coefficient at lag $k$ (-)
$k$: Time lag (h)
$\rho_s$: Spearman rank correlation coefficient (-)
$d_i$: Rank difference of the $i$-th sample (-)
$K$: Number of clusters (-)
$C_j$: The $j$-th cluster (-)
$\mu_j$: Centroid of cluster $C_j$ (-)
$p_{\mathrm{on},t}$: Predicted ON probability at time $t$ (-)
$g_t$: ON/OFF gating variable at time $t$ (-)
$\tau$: Probability threshold for ON/OFF gating (-)
$T_t$: Outdoor air temperature at time $t$ (°C)
$T_{\mathrm{val}}$: Outdoor air temperature series of the validation set (°C)
$\theta_1, \theta_2$: Quantile-based temperature thresholds (°C)
$r_t$: Temperature regime index (-)
$c_t$: Cluster label (-)
$s_t$: Joint regime index $(r_t, c_t)$ (-)
$M$: Number of expert models (-)
$I_s$: Validation index set of regime $s$ (-)
$\alpha_s$: Optimal regime-specific blending weight (-)
$I(\cdot)$: Indicator function (-)
$Q_p(\cdot)$: Quantile function at probability $p$ (-)
$p_1, p_2$: Quantile probabilities for temperature partitioning (-)
$D_{\mathrm{val}}$: Validation dataset (-)
$f_m$: The $m$-th expert forecasting model (-)
$\mathrm{RMSE}_{m,s}$: Root mean square error of the $m$-th expert in regime $s$ (kW)
$m_s^{(1)}, m_s^{(2)}$: Indices of the Top-2 best-performing expert models for regime $s$ (-)
$h$: Prediction horizon (h)
$H$: Prediction horizon set (h)
$\mathbf{z}_t$: Exogenous input variables at time $t$ (-)

References

  1. Fan, C.; Xiao, F.; Zhao, Y. A short-term building cooling load prediction method using deep learning algorithms. Appl. Energy 2017, 195, 222–233.
  2. Chen, B.; Yang, W.; Yan, B.; Zhang, K. An advanced airport terminal cooling load forecasting model integrating SSA and CNN-Transformer. Energy Build. 2024, 309, 114000.
  3. Ciulla, G.; D’Amico, A. Building energy performance forecasting: A multiple linear regression approach. Appl. Energy 2019, 253, 113500.
  4. Wei, Y.; Zhang, X.; Shi, Y.; Xia, L.; Pan, S.; Wu, J.; Han, M.; Zhao, X. A review of data-driven approaches for prediction and classification of building energy consumption. Renew. Sustain. Energy Rev. 2018, 82, 1027–1047.
  5. Zhou, Y.; Zheng, S.; Hensen, J.L. Machine learning-based digital district heating/cooling with renewable integrations and advanced low-carbon transition. Renew. Sustain. Energy Rev. 2024, 199, 114466.
  6. Pan, Y.; Zhu, M.; Lv, Y.; Yang, Y.; Liang, Y.; Yin, R.; Yang, Y.; Jia, X.; Wang, X.; Zeng, F.; et al. Building energy simulation and its application for building performance optimization: A review of methods, tools, and case studies. Adv. Appl. Energy 2023, 10, 100135.
  7. Hasan, M.; Mifta, Z.; Papiya, S.J.; Roy, P.; Dey, P.; Salsabil, N.A.; Chowdhury, N.U.R.; Farrok, O. A state-of-the-art comparative review of load forecasting methods: Characteristics, perspectives, and applications. Energy Convers. Manag. X 2025, 26, 100922.
  8. Quanwei, T.; Guijun, X.; Wenju, X. MGMI: A novel deep learning model based on short-term thermal load prediction. Appl. Energy 2024, 376, 124209.
  9. Lim, B.; Zohren, S. Time-series forecasting with deep learning: A survey. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 2021, 379, 20200209.
  10. Li, K.; Duan, P.; Cao, X.; Cheng, Y.; Zhao, B.; Xue, Q.; Feng, M. A multi-energy load forecasting method based on complementary ensemble empirical model decomposition and composite evaluation factor reconstruction. Appl. Energy 2024, 365, 123283.
  11. Dai, S.; Meng, F.; Dai, H.; Wang, Q.; Chen, X.; Bai, W.; Shi, P.; Allmendinger, R.; Zhang, Y.; Liu, J. Machine learning in peak demand forecasting: Foundations, trends, and insights. Renew. Sustain. Energy Rev. 2026, 227, 116500.
  12. Fida, K.; Abbasi, U.; Adnan, M.; Iqbal, S.; Mohamed, S.E.G. A comprehensive survey on load forecasting hybrid models: Navigating the Futuristic demand response patterns through experts and intelligent systems. Results Eng. 2024, 23, 102773.
  13. Duan, P.; Zhao, X.; Hu, J.; Li, K.; Xue, Q.; Cao, X.; Wang, Y.; Zhao, B.; Zhang, C.; Yuan, X. Multi-energy load forecasting incorporating AI algorithms: Research status and trends in integrated energy systems. Renew. Sustain. Energy Rev. 2026, 229, 116611.
  14. Foucquier, A.; Robert, S.; Suard, F.; Stéphan, L.; Jay, A. State of the art in building modelling and energy performances prediction: A review. Renew. Sustain. Energy Rev. 2013, 23, 272–288.
  15. Frayssinet, L.; Merlier, L.; Kuznik, F.; Hubert, J.L.; Milliez, M.; Roux, J.J. Modeling the heating and cooling energy demand of urban buildings at city scale. Renew. Sustain. Energy Rev. 2018, 81, 2318–2327.
  16. Balali, Y.; Chong, A.; Busch, A.; O’Keefe, S. Energy modelling and control of building heating and cooling systems with data-driven and hybrid models—A review. Renew. Sustain. Energy Rev. 2023, 183, 113496.
  17. Chen, Z.; Xiao, F.; Guo, F.; Yan, J. Interpretable machine learning for building energy management: A state-of-the-art review. Adv. Appl. Energy 2023, 9, 100123.
  18. Wang, Z.; Wang, Y.; Zeng, R.; Srinivasan, R.S.; Ahrentzen, S. Random Forest based hourly building energy prediction. Energy Build. 2018, 171, 11–25.
  19. Runge, J.; Zmeureanu, R. Forecasting energy use in buildings using artificial neural networks: A review. Energies 2019, 12, 3254.
  20. Eren, Y.; Küçükdemiral, İ. A comprehensive review on deep learning approaches for short-term load forecasting. Renew. Sustain. Energy Rev. 2024, 189, 114031.
  21. Dong, F.; Yu, J.; Quan, W.; Xiang, Y.; Li, X.; Sun, F. Short-term building cooling load prediction model based on DwdAdam-ILSTM algorithm: A case study of a commercial building. Energy Build. 2022, 272, 112337.
  22. Fan, C.; Wang, J.; Gang, W.; Li, S. Assessment of deep recurrent neural network-based strategies for short-term building energy predictions. Appl. Energy 2019, 236, 700–710.
  23. Zhao, Y.; Gong, M.; Sun, J.; Han, C.; Jing, L.; Li, B.; Zhao, Z. A new hybrid optimization prediction strategy based on SH-Informer for district heating system. Energy 2023, 282, 129010.
  24. Zhang, Y.; Wang, D.; Wang, G.; Xu, P.; Zhu, Y. Data-driven building load prediction and large language models: Comprehensive overview. Energy Build. 2025, 326, 115001.
  25. Li, L.; Su, X.; Bi, X.; Lu, Y.; Sun, X. A novel Transformer-based network forecasting method for building cooling loads. Energy Build. 2023, 296, 113409.
  26. Chen, Y.; Wang, H.; Chen, Z. Sensitivity analysis of physical regularization in physics-informed neural networks (PINNs) of building thermal modeling. Build. Environ. 2025, 273, 112693.
  27. Xu, L.; Wang, S.; Tang, R. Probabilistic load forecasting for buildings considering weather forecasting uncertainty and uncertain peak load. Appl. Energy 2019, 237, 180–195.
  28. Xu, L.; Hu, M.; Fan, C. Probabilistic electrical load forecasting for buildings using Bayesian deep neural networks. J. Build. Eng. 2022, 46, 103853.
  29. Lin, L.; Liu, X.; Zhang, T.; Liu, X.; Rong, X. Cooling load characteristic and uncertainty analysis of a hub airport terminal. Energy Build. 2021, 231, 110619.
  30. Hu, M.; Ge, D.; Telford, R.; Stephen, B.; Wallom, D.C. Classification and characterization of intra-day load curves of PV and non-PV households using interpretable feature extraction and feature-based clustering. Sustain. Cities Soc. 2021, 75, 103380.
  31. Amin, A.; Mourshed, M. Weather and climate data for energy applications. Renew. Sustain. Energy Rev. 2024, 192, 114247.
  32. Xianliang, G.; Jingchao, X.; Zhiwen, L.; Jiaping, L. Analysis to energy consumption characteristics and influencing factors of terminal building based on airport operating data. Sustain. Energy Technol. Assessments 2021, 44, 101034.
  33. Zhang, Z.; Sun, H.; Huang, T. Spatiotemporal Analysis of Building Energy Performance and Urban Morphology Interactions Across Chinese Climate Zones: A Machine Learning Approach to Predicting Cooling Demand in High-Density Districts. Case Stud. Therm. Eng. 2025, 79, 107592.
  34. Liu, J.; Zhang, Y.; Zhai, Z.; Wang, Y.; Ye, L.; Ding, Y. Comparison and analysis of different machine learning models for predicting air-conditioning cooling loads in comprehensive large public buildings across various time scales. Appl. Therm. Eng. 2025, 278, 127328.
  35. Wu, S.; Wang, Y.; Zhang, H.; Hu, R.; Zhang, Y.; Du, J.; Liu, L. Dual-modal cross-attention integrated model for airport terminal cooling load prediction using variational mode decomposition. J. Build. Eng. 2025, 104, 112344.
  36. Chalapathy, R.; Khoa, N.L.D.; Sethuvenkatraman, S. Comparing multi-step ahead building cooling load prediction using shallow machine learning and deep learning models. Sustain. Energy Grids Netw. 2021, 28, 100543.
  37. Atalla, T.; Gualdi, S.; Lanza, A. A global degree days database for energy-related applications. Energy 2018, 143, 1048–1055.
  38. Zhang, L.; Cao, M.; Li, N.; Luo, L.; Chen, Y.; Li, Z. Machine learning prediction of heating and cooling loads based on Athenian residential buildings’ simulation dataset. Energy Build. 2025, 342, 115808.
  39. Erbey, M.M.; Ozdil, N.F.T. Forecasting the European Union’s Space Cooling Potential Using the Cooling Degree-Day Factor (1970–2024). Energy 2026, 344, 140113.
  40. Yu, M.; Niu, D.; Zhao, J.; Li, M.; Sun, L.; Yu, X. Building cooling load forecasting of IES considering spatiotemporal coupling based on hybrid deep learning model. Appl. Energy 2023, 349, 121547.
  41. Yu, D.; Liu, T.; Wang, K.; Li, K.; Mercangöz, M.; Zhao, J.; Lei, Y.; Zhao, R. Transformer based day-ahead cooling load forecasting of hub airport air-conditioning systems with thermal energy storage. Energy Build. 2024, 308, 114008.
  42. Huang, S.; Ali, N.A.M.; Shaari, N.; Noor, M.S.M. Multi-scene design analysis of integrated energy system based on feature extraction algorithm. Energy Rep. 2022, 8, 466–476.
  43. Zhu, C.; Yang, Y.; Chen, H.; Zeng, M. An Enhanced, Lightweight Large Language Model-Driven Time Series Forecasting Approach for Air Conditioning System Cooling Load Forecasting. Mathematics 2025, 13, 3887.
Figure 1. Cooling-season load profile for the airport case study (2023–2024). (a) Normalized load time series for the full cooling seasons (June–August) of 2023 and 2024; (b) zoomed view of 7–10 June highlighting low-load and zero-load intervals; (c) zoomed view of 19–21 August illustrating intermittent shutdown behavior during operational periods.
Figure 2. Overall framework of the proposed cooling load forecasting approach.
Figure 3. ACF of the airport cooling load during the cooling seasons. Horizontal dashed lines denote 95% confidence intervals. ACF values outside these bounds are statistically significant. Red dots highlight key lags at 1, 24, and 168 h, representing short-term inertia, daily periodicity, and weekly patterns.
Figure 4. Spearman rank correlation between cooling-related features and cooling load. Bar color indicates the direction of the correlation (blue: positive; red: negative), while bar length and the overlaid numerical labels represent the magnitude of the Spearman coefficient ρ .
Figure 5. Interpretable clustering of airport cooling load regimes during cooling seasons: (a) temporal switching of the identified regimes in 2023 and 2024; (b) cluster-wise cooling load distributions highlighting distinct magnitude and variability patterns; (c) temperature–load scatter visualization showing regime separation under different ambient thermal conditions.
Figure 6. Short-term (24 h-ahead) cooling load forecasting performance. (a) Predicted and observed cooling load overall trends at the daily scale, with shaded areas indicating two zoomed-in regions. (b1,b2) Zoomed-in detailed views of the load trends during two specific 80 h highly variable windows (hours 100–180 and hours 260–340). (c) Corresponding relative prediction errors at the 24 h timescale.
Figure 7. Long-term cooling load forecasting performance over the entire cooling season. (a) Predicted and observed cooling load time series at the original hourly resolution, with shaded areas indicating two zoomed-in micro regions. (b1,b2) Two side-by-side zoomed-in micro views (hours 120–160 and 280–320) illustrating point-by-point tracking details during specific 40 h windows. (c) Corresponding relative prediction errors over the full cooling season.
Figure 8. Forecasting performance under different operational scenarios. (a) Scenario A: Peak demand during extreme heatwave. (b) Scenario B: High volatility during rapid load transitions. (c) Scenario C: Transitional load and intermittent operation.
Table 1. Summary of representative studies in cooling/building load forecasting (pros and cons).

| Ref. | Main Idea/Method | Pros | Cons/Limitations | Relevance to This Work |
|---|---|---|---|---|
| [14] | Review of building modeling and energy performance prediction | Comprehensive taxonomy of building modeling methods and prediction tasks | Not focused on modern deep learning or condition-adaptive ensembles | Provides foundational background and motivates data-driven forecasting |
| [16] | Review of data-driven and hybrid models for building heating/cooling systems | Summarizes ML/hybrid modeling and control-oriented implications | Limited emphasis on regime-wise adaptive ensemble mechanisms | Supports the motivation of robustness and hybrid perspectives |
| [18] | Random forest for hourly building energy prediction | Strong generalization and robustness under noise; interpretable feature importance | Typically learns a global mapping; limited regime specialization | Representative tree-ensemble baseline family; contrasts with regime-wise adaptation |
| [22] | Systematic assessment of deep recurrent multi-step forecasting strategies | Clarifies inference strategies (recursive/direct/MIMO) and error accumulation effects | Not designed for regime-aware expert selection or intermittency gating | Supports the need for stability under long-horizon forecasting |
| [25] | Transformer-based network for building cooling load forecasting | Models long-range temporal dependencies; strong nonlinear representation | May still behave as a global predictor without explicit regime adaptation | Motivates including Transformer-style baselines and long-range modeling discussion |
| [27] | Probabilistic building load forecasting considering weather and peak uncertainty | Improves reliability under weather/peak uncertainties; quantifies predictive distributions | Increased modeling complexity; not focused on regime-wise expert blending | Complements reliability discussion; orthogonal to our deterministic regime-wise ensemble |
| [28] | Bayesian deep neural networks for probabilistic building load forecasting | Quantifies epistemic/aleatoric uncertainty with Bayesian deep learning | Focuses on uncertainty rather than condition-dependent expert selection | Provides uncertainty-aware perspective and contextualizes deterministic setting |
| [29] | Cooling load characteristics and uncertainty analysis of a hub airport terminal | Reveals heterogeneity and uncertainty sources in airport cooling demand | Does not propose an adaptive ensemble forecasting mechanism | Domain evidence motivating regime heterogeneity modeling in airports |
| [30] | Interpretable feature extraction and clustering for load curve classification | Effective mode discovery and interpretable clustering features | Not a forecasting framework; clustering not integrated with adaptive ensemble | Inspires the use of clustering to represent operational regimes |
Table 2. Relative error statistics at the 24 h rolling mean timescale.

| Model | Bias (Mean) | MAE | P95 | P99 |
|---|---|---|---|---|
| CATS-Ens | 0.19% | 0.96% | 2.15% | 2.63% |
| RNN | −0.52% | 1.06% | 2.68% | 3.11% |
| GRU | 0.85% | 1.21% | 2.91% | 3.25% |
| LSTM | 0.58% | 1.47% | 4.54% | 5.51% |
| Transformer | 0.61% | 1.54% | 2.89% | 3.70% |
| XGBoost | 0.41% | 2.04% | 4.16% | 5.62% |
Table 3. Relative error statistics at the original hourly scale.

| Model | Bias (Mean) | MAE | P95 | P99 |
|---|---|---|---|---|
| CATS-Ens | 0.21% | 2.79% | 8.18% | 11.60% |
| RNN | −0.47% | 3.04% | 8.96% | 13.00% |
| GRU | 0.89% | 3.15% | 8.55% | 13.10% |
| LSTM | 0.60% | 3.29% | 9.60% | 14.60% |
| XGBoost | 0.46% | 3.51% | 12.10% | 16.10% |
| Transformer | 0.53% | 3.66% | 11.10% | 16.90% |
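
Tables 2 and 3 report bias, MAE, P95, and P99 of the relative error at two timescales. The Python sketch below shows how statistics of this kind could be computed; it is an illustration under assumptions, not the article's evaluation code. The assumptions are: the error is normalized by the maximum observed load (so zero-load hours do not cause division by zero), P95 and P99 are percentiles of the absolute relative error, and the 24 h timescale is obtained by applying a trailing rolling mean to both series before the error is taken. The function names are hypothetical.

```python
import math


def rolling_mean(series, window):
    """Trailing moving average; returns len(series) - window + 1 values."""
    out, acc = [], 0.0
    for i, v in enumerate(series):
        acc += v
        if i >= window:
            acc -= series[i - window]  # drop the sample that left the window
        if i >= window - 1:
            out.append(acc / window)
    return out


def percentile(values, p):
    """Linear-interpolation percentile of a non-empty list, 0 <= p <= 100."""
    xs = sorted(values)
    pos = p / 100 * (len(xs) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def relative_error_stats(y_true, y_pred, window=1):
    """Bias, MAE, P95, P99 of the relative error in percent.

    window=1 keeps the original resolution; window=24 on hourly data gives
    the 24 h rolling-mean timescale (assumed smoothing order, see lead-in).
    """
    if window > 1:
        y_true = rolling_mean(y_true, window)
        y_pred = rolling_mean(y_pred, window)
    ref = max(y_true)  # assumed normalizing load; must be positive
    rel = [100.0 * (p - t) / ref for t, p in zip(y_true, y_pred)]
    abs_rel = [abs(e) for e in rel]
    return {
        "bias": sum(rel) / len(rel),
        "mae": sum(abs_rel) / len(abs_rel),
        "p95": percentile(abs_rel, 95),
        "p99": percentile(abs_rel, 99),
    }
```

Calling `relative_error_stats(y, y_hat)` and `relative_error_stats(y, y_hat, window=24)` on the same hourly series would give the two views side by side; smoothing typically lowers the tail percentiles because hour-level errors partly cancel within a day.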
Table 4. Overall performance comparison of different models for cooling load forecasting.

| Model | MAE | RMSE | MAPE (%) | sMAPE (%) | $R^2$ |
|---|---|---|---|---|---|
| RNN | 222.34 | 307.19 | 41.80 | 34.10 | 0.960 |
| LSTM | 240.01 | 342.49 | 41.14 | 34.22 | 0.951 |
| GRU | 229.78 | 314.57 | 45.31 | 34.13 | 0.958 |
| Transformer | 267.18 | 385.11 | 34.31 | 42.49 | 0.938 |
| XGBoost | 256.06 | 388.74 | 45.25 | 34.66 | 0.937 |
| CATS-Ens | 203.92 | 281.51 | 36.99 | 31.68 | 0.967 |
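
The relative reductions implied by Table 4 can be recomputed directly from its entries. The Python sketch below does this arithmetic, taking the reference baseline to be the individual model with the lowest MAE; that choice of reference is an assumption, and percentages quoted elsewhere in the article may rest on a different reference or rounding. The names `BASELINES`, `CATS_ENS`, `reduction_pct`, and `reductions_vs_best` are hypothetical.

```python
# Values copied from Table 4 (R^2 omitted because higher is better for it).
BASELINES = {
    "RNN": {"MAE": 222.34, "RMSE": 307.19, "MAPE": 41.80, "sMAPE": 34.10},
    "LSTM": {"MAE": 240.01, "RMSE": 342.49, "MAPE": 41.14, "sMAPE": 34.22},
    "GRU": {"MAE": 229.78, "RMSE": 314.57, "MAPE": 45.31, "sMAPE": 34.13},
    "Transformer": {"MAE": 267.18, "RMSE": 385.11, "MAPE": 34.31, "sMAPE": 42.49},
    "XGBoost": {"MAE": 256.06, "RMSE": 388.74, "MAPE": 45.25, "sMAPE": 34.66},
}
CATS_ENS = {"MAE": 203.92, "RMSE": 281.51, "MAPE": 36.99, "sMAPE": 31.68}


def reduction_pct(baseline, proposed):
    """Relative reduction of an error metric, in percent of the baseline value."""
    return 100.0 * (baseline - proposed) / baseline


def reductions_vs_best(baselines, proposed, rank_by="MAE"):
    """Pick the baseline with the lowest `rank_by` value and compare every metric to it."""
    best = min(baselines, key=lambda name: baselines[name][rank_by])
    return best, {
        metric: reduction_pct(baselines[best][metric], value)
        for metric, value in proposed.items()
    }


if __name__ == "__main__":
    name, gains = reductions_vs_best(BASELINES, CATS_ENS)
    print("reference baseline:", name)
    for metric, pct in gains.items():
        print(f"{metric}: {pct:.2f}% lower")
```

Ranking by a different metric changes the reference: the Transformer row has the lowest MAPE in Table 4, so `rank_by="MAPE"` would select it instead.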
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Xie, X.-Y.; Fan, Y.-W.; Wang, Y.-Z.; Li, J.-R.; Zhang, X.-R. A Cluster- and Temperature-Aware Auto-Ensemble Model for Airport Cooling Load Forecasting. Energies 2026, 19, 1375. https://doi.org/10.3390/en19051375

AMA Style

Xie X-Y, Fan Y-W, Wang Y-Z, Li J-R, Zhang X-R. A Cluster- and Temperature-Aware Auto-Ensemble Model for Airport Cooling Load Forecasting. Energies. 2026; 19(5):1375. https://doi.org/10.3390/en19051375

Chicago/Turabian Style

Xie, Xiao-Yu, Yu-Wei Fan, Yi-Zhou Wang, Jie-Ru Li, and Xin-Rong Zhang. 2026. "A Cluster- and Temperature-Aware Auto-Ensemble Model for Airport Cooling Load Forecasting" Energies 19, no. 5: 1375. https://doi.org/10.3390/en19051375

APA Style

Xie, X.-Y., Fan, Y.-W., Wang, Y.-Z., Li, J.-R., & Zhang, X.-R. (2026). A Cluster- and Temperature-Aware Auto-Ensemble Model for Airport Cooling Load Forecasting. Energies, 19(5), 1375. https://doi.org/10.3390/en19051375
