Thermal Load Predictions in Low-Energy Buildings: A Hybrid AI-Based Approach Integrating Integral Feature Selection and Machine Learning Models

El Mghouchi, Youness; Udristioiu, Mihaela Tinca

doi:10.3390/app15116348


by Youness El Mghouchi ¹ and Mihaela Tinca Udristioiu ²,*

¹ Department of Energetics, École Nationale Supérieure d’Arts et Métiers, Moulay Ismail University, Meknes 50050, Morocco

² Department of Physics, Faculty of Sciences, University of Craiova, 13 A.I. Cuza Street, 200585 Craiova, Romania

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(11), 6348; https://doi.org/10.3390/app15116348

Submission received: 21 April 2025 / Revised: 21 May 2025 / Accepted: 2 June 2025 / Published: 5 June 2025

(This article belongs to the Special Issue Renewable Energy in Smart Cities)


Abstract

A hybrid Artificial Intelligence (AI) framework centered on metamodeling, integrating simulation data with data-driven techniques, was implemented to enhance the accuracy and optimization of thermal load predictions in three distinct climates in Morocco. Initially, 13 machine learning (ML) models were assessed to predict heating and cooling loads. The best-performing models from this stage were then carried into the subsequent phase to identify the optimal combinations of inputs for predicting thermal loads. In this phase, an Integral Feature Selection (IFS) method was employed in conjunction with the best ML models. The resulting predictions were evaluated extensively using advanced statistical measures. The results reveal that, for each climate, numerous high-accuracy prediction pathways were identified for thermal load prediction, with R² exceeding 0.99. These results outperform those reported by other researchers for thermal load prediction in Low-Energy Buildings (LEBs).

Keywords: building energy optimization; low-energy buildings; metamodeling; artificial intelligence; integral feature selection; Morocco

1. Introduction

The need for sustainable and energy-efficient building strategies is becoming increasingly urgent, especially in regions like Morocco, which spans diverse climatic zones—from Mediterranean to semi-arid and cold mountainous regions. In line with its low carbon transition strategy, Morocco has pledged to reduce national greenhouse gas emissions by 40% by 2030 and 77% by 2050 [1,2]. Within this context, Low-Energy Buildings (LEBs) are crucial for meeting these goals. However, a persistent Energy Performance Gap (EPG)—the discrepancy between predicted and actual energy use—remains a challenge, particularly in residential buildings where geometry, insulation, and local climate play pivotal roles [3,4]. Bridging this gap necessitates accurate thermal load predictions using advanced methodologies tailored to Morocco’s climatic and socio-economic realities [1].

Artificial intelligence (AI), and more specifically machine learning (ML), has shown great promise in improving the precision of heating and cooling load predictions for LEBs. Hybrid ML models such as XGBoost and LightGBM have demonstrated high predictive performance (R² > 0.99) by integrating optimized feature selection techniques and climate-sensitive inputs [5]. Transformer-based architectures have also proven capable of multi-step forecasting with remarkably low errors in complex environments [6], while Physics-Informed Neural Networks (PINNs) provide robust predictions in thermodynamic applications, even with sparse datasets [7]. These methods are further supported by interpretable models like decision trees, which help identify critical predictors (e.g., glazing area, compactness) and align well with Morocco’s material efficiency and solar integration objectives [2,4].

The use of hybrid AI techniques is growing in building energy prediction. For instance, [8] combined Support Vector Regression (SVR) and XGBoost with six metaheuristic algorithms and identified SBO-XGBoost as the best-performing model for both heating (R² = 0.9380) and cooling (R² = 0.9583) loads. In [9], the authors evaluated four ML models, namely Multi-Layer Perceptron (MLP), Extreme Learning Machine (ELM), Radial Basis Function (RBF), and Response Surface Methodology (RSM), and found ELM to outperform the others (R² = 0.9850 for heating and R² = 0.9916 for cooling). Likewise, [10] used an enhanced SVR combined with Feature Selection (FS) to predict cooling loads in public buildings, relying on dispersion metrics for evaluation. In Morocco-specific research, [11] compared Artificial Neural Network (ANN) and Generalized Linear Model (GLM) models for heating load prediction across six climatic zones and found ANN to be more accurate (R² = 0.95), although it suffered from stochastic variability.

An extensive review in [12] highlights two critical gaps: (1) the underrepresentation of building-specific characteristics—such as geometry or orientation—as predictors, and (2) the lack of clear taxonomies distinguishing between ML, deep learning, and forecasting methods. Most models in the literature rely heavily on historical energy consumption data, overlooking the predictive value of architectural design variables. Furthermore, while numerous studies use AI to predict thermal loads using various features, none systematically compare all possible combinations of predictor variables to find optimal input configurations.

In contrast, the present study offers a holistic and comparative approach to thermal load prediction in LEBs using hybrid AI models. It investigates how different predictor variables interact across three distinct Moroccan climates, with the goal of optimizing both model accuracy and computational efficiency. This work integrates Integral Feature Selection (IFS) techniques with several advanced ML algorithms to discover the most relevant input features and ideal model structures.

The primary aim of this study is not merely to achieve accurate predictions of thermal loads using AI models, but to leverage these predictions to inform and optimize energy-efficient design decisions and operational strategies for LEBs. By identifying the most influential building and environmental parameters through hybrid IFS–ML frameworks, the findings provide actionable insights for architects, engineers, and energy managers. These insights support more informed design choices—such as optimized window-to-wall ratios, insulation levels, and orientation—which can contribute to substantial reductions in heating and cooling demand. While this study does not directly quantify the exact energy savings from each intervention, it lays the foundation for integrated building performance simulations and future work that links predictive accuracy to real-world energy savings and Heating, Ventilation and Air-Conditioning (HVAC) system optimization.

The main objectives and contributions of this study are:

To analyze the partial dependence and relative importance of predictor variables for heating and cooling loads;
To evaluate and compare various hybrid IFS–ML models for accurate thermal load prediction;
To identify the most influential predictor combinations for enhanced model performance across diverse climates;
To contribute actionable insights toward improving building energy management strategies in Morocco.

The structure of the paper is as follows:

Section 2 presents the methodology, including the building case study, climate data, ML models, and optimization framework;
Section 3 reports on the performance evaluation of the models and discusses the results;
Section 4 concludes the study, and outlines the study limitations and directions for future research.

2. Materials and Methods

2.1. Weather Data and Locations

This study used typical year data for three distinct Moroccan climates: Ifrane, Meknes, and Marrakech. These cities were selected based on their diverse climatic characteristics, ranging from cold to semi-arid conditions. Ifrane experiences a humid and temperate climate, characterized by higher rainfall in winter than in summer, classified as Csb on the Köppen–Geiger climate map, with an average annual temperature of 15 °C. Meknes exhibits a Mediterranean climate with moderate, rainy winters and hot, dry summers. The temperature ranges from 30 °C to 44 °C in the warmest month and from 0 °C to 7 °C in the coldest month. On the other hand, Marrakech has a semi-arid climate, featuring an average annual temperature of 20 °C and an average annual rainfall of 281 mm, which is lower than the average for the Mediterranean climatic zone. Table 1 provides details on specific climatic characteristics such as dry bulb temperature (DBT), cooling degree days (CDD), and heating degree days (HDD).

2.2. Building Description and Simulation

This study examined a multi-story building, whose geometric model is depicted in Figure 1. The building model under investigation was proposed by the National Agency for the Development of Renewable Energy and Energy Efficiency (ADEREE). It was segmented into four levels, with two apartments on each floor. On weekdays, each flat is occupied by 5 people from 05:00 p.m. to 07:30 a.m. and by 2 people during the remaining time. On weekends, each flat is occupied by 5 people. The overall window-to-wall ratio is 21%, and the net floor space covers 588 m². The windows are single-glazed with a heat-transfer coefficient (U-value) of 5.74 W/(m²·K) and a solar heat gain coefficient (SHGC) of 0.87. External shading devices (50%) are utilized during the summer from 07:30 a.m. to 05:00 p.m. Additional information about the building, architectural plan, floor areas, and external surfaces can be found in [13]. The building was simulated in TRNSYS 18 software for 8760 h (one year) with a 1 h time step. The assumptions employed to calculate the heating and cooling loads are given in [5,14].

The case study was based on a single prototypical residential LEB, selected to ensure a consistent basis for model comparison across different climatic zones. This reference building aligns with widely accepted design practices and performance standards in energy-efficient construction. The rationale for this selection lies in the need to isolate and analyze the effects of climate and predictor variables without introducing confounding variability from differing building geometries or operational profiles. Despite this focused approach, the framework developed is inherently flexible and can be applied to various building types, provided appropriate input data are available. Future studies will aim to validate the generalizability of the method across multiple building archetypes and use cases.

Table 2 outlines the characteristics of the envelope materials, with the thermo-physical properties sourced from the TRNSYS library. The set-point temperature is maintained at 20 °C in winter and 26 °C in summer, following the Moroccan standard (NM ISO 7730) [5,14].

2.3. Predictor Variables

Ten input parameters pertaining to the building envelope and HVAC system were chosen as predictor variables. Several of these parameters have been investigated in recent research, considering the anomalies observed in the current national thermal regulation. These parameters include the heat transfer coefficient of external walls, the coating of opaque elements, air change rate, windows-to-wall ratio, and type of glazing. The ranges of variation for these input factors are detailed in Table 3.

2.4. Methodology

The initial step involves enhancing the efficiency of building energy optimization by simultaneously reducing computation time and increasing accuracy. To achieve this, a MATLAB 2023a code generates 1000 quasi-random samples, serving as the initial sample size for the predictor variables outlined in Table 3. Subsequently, TRNSYS software is used as a computer simulation tool, coupled with MATLAB via TRNSYS Type155, to execute building energy simulations for each configuration and climate.
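As a rough illustration of the sampling step, the sketch below generates 1000 quasi-random points for ten predictors. The text does not state which low-discrepancy sequence the MATLAB code used, so the Halton sequence here is an assumption, and the unit-interval bounds are placeholders rather than the ranges of Table 3. The names `radical_inverse`, `halton_samples`, and `example_bounds` are hypothetical.

```python
def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


# One prime base per predictor variable (ten predictors in this study).
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]


def halton_samples(n_samples, bounds):
    """Return n_samples Halton points scaled to the (low, high) pairs in bounds."""
    if len(bounds) > len(PRIMES):
        raise ValueError("add more prime bases for this many variables")
    samples = []
    for i in range(1, n_samples + 1):  # start at 1; index 0 maps to the origin
        point = [low + radical_inverse(i, PRIMES[d]) * (high - low)
                 for d, (low, high) in enumerate(bounds)]
        samples.append(point)
    return samples


# Placeholder ranges (assumption): replace with the actual ranges from Table 3.
example_bounds = [(0.0, 1.0)] * 10
design = halton_samples(1000, example_bounds)
```

Each row of `design` would correspond to one building configuration passed to the simulation tool.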

The flowchart depicting the adopted methodology is illustrated in Figure 2.

The main steps are summarized as follows:

(1) The dataset, comprising predictor variables and the corresponding heating and cooling loads, was pre-processed by scaling and normalization. The pre-processed data were then randomly split into a training dataset (70%) and a test dataset (30%);
(2) Thirteen different ML models were tested for predicting thermal loads (heating, cooling, and total loads): artificial neural network (ANN), decision trees (DT), Support Vector Machine (SVM), Extreme Learning Machine (ELM), Extreme Gradient Boosting (XGBoost), random forest (RF), Tree Bagger (TreeBag), Generalized Linear Regression (GLR) model, Gaussian Process Regression (GR), Linear Regression (LR), Generalized Additive Model (GAM), Kernelized Ridge Regression (KRR) model, and Linear Ridge Regression (LRR);
(3) Following the selection of the best ML models in step 2, a comprehensive statistical analysis was conducted to discover the optimal combinations of predictor variables for accurately predicting heating and cooling loads. This was achieved through the proposed hybrid IFS–ML approach;
(4) Based on the best combinations of the predictor variables, the thermal loads were predicted using the corresponding best IFS–ML models.
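The pre-processing and splitting step can be sketched as follows. This is a minimal illustration, assuming min-max scaling (the text says only "scaling and normalization") and a seeded random shuffle. The helper names and the toy data are hypothetical.

```python
import random


def min_max_scale(rows):
    """Scale each column of a list of rows to [0, 1]; constant columns map to 0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [max(c) - min(c) for c in columns]
    return [[(v - lo) / sp if sp else 0.0 for v, lo, sp in zip(row, lows, spans)]
            for row in rows]


def train_test_split(rows, targets, train_fraction=0.7, seed=0):
    """Randomly split rows and targets into training and test parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    cut = round(train_fraction * len(rows))
    train, test = indices[:cut], indices[cut:]
    return ([rows[i] for i in train], [targets[i] for i in train],
            [rows[i] for i in test], [targets[i] for i in test])


# Toy data (synthetic, for illustration only).
toy_rows = [[float(i), float(2 * i)] for i in range(10)]
toy_targets = [sum(r) for r in toy_rows]
scaled = min_max_scale(toy_rows)
x_train, y_train, x_test, y_test = train_test_split(scaled, toy_targets)
```

With the 1000 simulated configurations, the same call would yield 700 training and 300 test samples.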

2.5. Hybrid AI Models and Evaluation Metrics

This section provides a short description of each hybrid AI technique employed. Readers can find more details about each technique in the given references.

2.5.1. Employed ML Models

The ML models employed for predicting thermal loads are summarized in Table 4. In this work, grid search and Bayesian optimization are used to find the optimal hyperparameter settings for each model [15,16].

2.5.2. Integral Feature Selection

In [30], the authors introduced a novel method within the Integral Variable Selection (IVS) framework. This method identifies the optimal combinations of predictor variables for modeling, prediction, and forecasting tasks. It systematically evaluates all possible combinations of input variables to select the most efficient and effective set. Moreover, it aims to maximize the prediction accuracy of the output variable (objective function). The number of possible input combinations is determined using the “n choose k” formula, represented by the sum of binomial coefficients in Equation (1), or simply as 2^{n} - 1:

Comb = \sum_{k=1}^{n} C_{n}^{k} = \sum_{k=1}^{n} \frac{n!}{(n-k)! \, k!} = 2^{n} - 1

(1)

where n represents the total number of input variables, and k represents the number of variables to be selected in each combination.
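As a quick check of Equation (1), the sketch below counts and enumerates the non-empty subsets for the ten predictors of this study. The variable names `X1` to `X10` follow the labels used in the results; the function names are hypothetical.

```python
from itertools import combinations
from math import comb


def count_subsets(n):
    """Sum of binomial coefficients C(n, k) for k = 1..n, as in Equation (1)."""
    return sum(comb(n, k) for k in range(1, n + 1))


def all_subsets(variables):
    """Yield every non-empty combination of the given variables, by size."""
    for k in range(1, len(variables) + 1):
        yield from combinations(variables, k)


names = [f"X{i}" for i in range(1, 11)]
total = count_subsets(len(names))           # 1023 for n = 10
enumerated = sum(1 for _ in all_subsets(names))
```

Both `total` and `enumerated` equal 2^10 - 1 = 1023, matching the closed form.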

The proposed method comprises several steps, as illustrated in Figure 3. The algorithm is initiated, and input and output data are imported. The data are then split into training and testing–validating sets (70/30 split) and pre-processed using normalization and autonomous anomaly detection techniques. The total number of possible input combinations is computed using Equation (1). The method then enters the first loop based on the size of the data, K1. Within this loop, the number of combinations is computed for each ith considered size, and a second loop is initiated for each value of K2. The combnk (V, K) function generates a matrix with K columns. The ML model is loaded, and the data are processed to predict the corresponding output parameter for each combination. The predicted values are then saved, and the algorithm moves on to the next iteration. After computing the predicted values for all possible combinations, a second algorithm performs statistical analysis to identify the best input combinations that give the best accuracy prediction. Finally, the algorithm concludes after determining the optimal combinations of the predictor variables.
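The nested loops described above can be sketched as an exhaustive subset search. This is an illustration only, not the authors' implementation: ordinary least squares stands in for the ML models of the study, the data are synthetic, and all function names are hypothetical. The retention threshold of R² ≥ 0.99 follows the criterion reported in the results.

```python
import random
from itertools import combinations


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_linear(rows, targets):
    """Least-squares fit with intercept via the normal equations (stand-in model)."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict_linear(coefs, rows):
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], r)) for r in rows]


def r_squared(predicted, measured):
    mean = sum(measured) / len(measured)
    ss_res = sum((p - m) ** 2 for p, m in zip(predicted, measured))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def select_columns(rows, subset):
    return [[r[j] for j in subset] for r in rows]


def exhaustive_search(x_train, y_train, x_test, y_test, threshold=0.99):
    """Evaluate every non-empty feature subset; keep those with test R2 >= threshold."""
    n = len(x_train[0])
    kept = []
    for k in range(1, n + 1):                       # outer loop: subset size
        for subset in combinations(range(n), k):    # inner loop: subsets of that size
            coefs = fit_linear(select_columns(x_train, subset), y_train)
            preds = predict_linear(coefs, select_columns(x_test, subset))
            score = r_squared(preds, y_test)
            if score >= threshold:
                kept.append((score, subset))
    return sorted(kept, reverse=True)


# Synthetic toy data: the target depends on columns 0 and 2 only.
rng = random.Random(1)
x_all = [[rng.random() for _ in range(3)] for _ in range(60)]
y_all = [3.0 * r[0] + 2.0 * r[2] for r in x_all]
ranked = exhaustive_search(x_all[:42], y_all[:42], x_all[42:], y_all[42:])
```

In this toy case only the subsets containing both informative columns pass the threshold, which mirrors how the method can return several acceptable input combinations rather than a single one.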

2.5.3. Statistical Accuracy Assessment

In [32], the authors introduced a modified version of the performance score to rank the effectiveness of the applied ML and hybrid IFS–ML models. The performance score (φ), defined by Equation (2), is used for this evaluation, where higher values of φ indicate poorer model performance.

\varphi = \mathrm{rank}(\mathrm{MBE}) + \mathrm{rank}(\mathrm{RMSE}) + \mathrm{rank}(\mathrm{MAPE}) + \mathrm{rank}(\sigma) + \mathrm{rank}(R^{2}),

(2)

The performance indicators in Equation (2) include the Mean Bias Error (MBE), Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), standard deviation (σ), and the Coefficient of Determination (R²). These indicators are detailed in Equations (3)–(7).
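A rank-sum score in the spirit of Equation (2) might be computed as below. Several details are assumptions, since the text does not spell them out: rank 1 is given to the best model on each indicator, MBE is ranked by absolute value, R² is ranked in descending order, and ties share the lowest rank. The model names and metric values are made up for illustration and are not results from the study.

```python
def rank_positions(values, reverse=False):
    """Rank 1 = best. Ties receive the same (lowest) rank."""
    ordered = sorted(values, reverse=reverse)
    return [ordered.index(v) + 1 for v in values]


def performance_scores(metrics):
    """metrics maps model name -> dict with keys MBE, RMSE, MAPE, sigma, R2."""
    names = list(metrics)
    columns = {
        # Assumption: a bias closer to zero is better, so rank |MBE|.
        "MBE": rank_positions([abs(metrics[n]["MBE"]) for n in names]),
        "RMSE": rank_positions([metrics[n]["RMSE"] for n in names]),
        "MAPE": rank_positions([metrics[n]["MAPE"] for n in names]),
        "sigma": rank_positions([metrics[n]["sigma"] for n in names]),
        # Higher R2 is better, so rank in descending order.
        "R2": rank_positions([metrics[n]["R2"] for n in names], reverse=True),
    }
    return {n: sum(col[i] for col in columns.values()) for i, n in enumerate(names)}


# Illustrative values only (not taken from the paper).
example = {
    "model_a": {"MBE": 0.1, "RMSE": 1.0, "MAPE": 2.0, "sigma": 0.9, "R2": 0.99},
    "model_b": {"MBE": -0.5, "RMSE": 2.0, "MAPE": 4.0, "sigma": 1.9, "R2": 0.95},
    "model_c": {"MBE": 0.3, "RMSE": 1.5, "MAPE": 3.0, "sigma": 1.4, "R2": 0.97},
}
scores = performance_scores(example)
```

Under these assumptions the model with the lowest score is the best performer, consistent with higher values of φ indicating poorer performance.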

\mathrm{MBE} = \frac{1}{K} \sum_{i=1}^{K} (v_{p}^{i} - v_{m}^{i}),

(3)

\mathrm{RMSE} = \left( \frac{1}{K} \sum_{i=1}^{K} (v_{p}^{i} - v_{m}^{i})^{2} \right)^{\frac{1}{2}},

(4)

\sigma = \left[ \frac{K (\mathrm{RMSE}^{2} - \mathrm{MBE}^{2})}{K - 1} \right]^{\frac{1}{2}},

(5)

\mathrm{MAPE} = \frac{100}{K} \sum_{i=1}^{K} \left| \frac{v_{p}^{i} - v_{m}^{i}}{v_{m}^{i}} \right|,

(6)

R^{2} = 1 - \frac{\sum_{i=1}^{K} (v_{p}^{i} - v_{m}^{i})^{2}}{\sum_{i=1}^{K} (v_{m}^{i} - \bar{v}_{m})^{2}},

(7)

Here, v_{p}^{i}, v_{m}^{i}, and \bar{v}_{m} represent the ith predicted value, the ith measured value, and the mean of the measured values, respectively, while K denotes the total number of measurements.

This paper gives only brief information on each statistical indicator; readers are referred to [30] for more details.
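For readers who want to reproduce the indicators, Equations (3)–(7) translate directly into a few lines of code. The sketch below is a plain transcription under two assumptions: all measured values are non-zero (required by MAPE), and small negative rounding errors under the square root in σ are clipped to zero. The function name and the sample values are hypothetical.

```python
from math import sqrt


def accuracy_metrics(predicted, measured):
    """Return MBE, RMSE, sigma, MAPE (in %) and R2 for paired value lists."""
    k = len(measured)
    errors = [p - m for p, m in zip(predicted, measured)]
    mbe = sum(errors) / k                                            # Eq. (3)
    rmse = sqrt(sum(e * e for e in errors) / k)                      # Eq. (4)
    sigma = sqrt(max(k * (rmse ** 2 - mbe ** 2) / (k - 1), 0.0))     # Eq. (5)
    mape = 100.0 / k * sum(abs(e / m) for e, m in zip(errors, measured))  # Eq. (6)
    mean_m = sum(measured) / k
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    r2 = 1.0 - sum(e * e for e in errors) / ss_tot                   # Eq. (7)
    return {"MBE": mbe, "RMSE": rmse, "sigma": sigma, "MAPE": mape, "R2": r2}


# Illustrative values only (not from the study).
measured_values = [10.0, 20.0, 30.0, 40.0]
predicted_values = [11.0, 19.0, 31.0, 39.0]
result = accuracy_metrics(predicted_values, measured_values)
```

For this toy input the errors alternate between +1 and -1, so the bias cancels (MBE = 0) while RMSE = 1 and R² = 0.992.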

3. Results and Discussion

3.1. Sensitivity Analysis: Evaluating Predictor Variable Impacts on Thermal Loads

Understanding the influence of predictor variables on heating and cooling loads is crucial for optimizing building energy management strategies. In this section, we delve into a sensitivity analysis using the coefficient of determination (R²) to assess the impacts of individual predictor variables on the heating (QHEAT) and cooling (QCOOL) thermal loads. Recent studies, such as [33], demonstrate the effectiveness of R² in quantifying the relative importance of factors like building geometry and material properties, while advanced sensitivity frameworks [34] highlight methodologies to address uncertainties in load prediction. This analysis is conducted across different climatic conditions, offering insights into the variable dependencies that contribute significantly to the energy dynamics of LEBs. The results found are illustrated in Figure 4, Figure 5 and Figure 6.

As a key metric, the coefficient of determination R² provides a quantitative measure of the proportion of variance in thermal loads explained by each predictor variable. By scrutinizing R² values, we aim to identify the most influential factors shaping QHEAT and QCOOL under diverse climate scenarios. Recent studies, such as [35], demonstrate that LightGBM models achieve R² values as high as 0.9959 when evaluating features like glazing distribution and insulation efficiency across climatic regions, outperforming methods like random forest (RF) and long short-term memory (LSTM) networks. Similarly, deep learning frameworks incorporating bidirectional gated recurrent units (Bi-GRU) show enhanced predictive accuracy for ultra-short-term heating loads, with R² values reflecting robust performance in time-shifted feature analysis [36]. This comprehensive examination allows us to discern the relative importance of each predictor variable and elucidate their unique contributions to the overall predictive accuracy of the ML models. For instance, interpretable classifiers like decision trees and rule induction (RI) leverage R² to prioritize factors such as relative compactness and glazing area in residential buildings [37].

The upcoming figures and paragraphs present a detailed breakdown of Pearson correlation (r) values for each predictor variable, unveiling the nuances of their impacts on heating and cooling loads. This sensitivity analysis not only refines our understanding of the complex interplay between predictors and thermal loads, but also lays the foundation for informed decision-making in designing energy-efficient solutions tailored to specific climates.

In a Mediterranean climate like that of Meknes (Figure 4), there is a strong correlation with QHEAT primarily associated with X6 (r = 0.93). X1, X7, and X2 exhibit very weak correlations, while other predictor variables show almost no correlation with QHEAT. Regarding QCOOL, a moderate correlation is observed with X8 and X7, a minor correlation is present with X6 and X2, and even less of a correlation is seen with X9 and X4. Other variables show a lack of correlation.

In a cold climate like that of Ifrane (Figure 5), the results closely resemble those for Meknes: QHEAT is again strongly correlated with X6. X1, X7, X2, and X3 exhibit very weak correlations, while other predictor variables show no correlation with QHEAT. Regarding QCOOL, a moderate correlation is observed with X8, followed by X7; a minor correlation is present with X6 and X2, and even less is seen with X4 and X9. Other variables indicate a lack of correlation.

In a semi-arid climate like that of Marrakech (Figure 6), X6 demonstrates a relatively strong correlation with QHEAT (r = 0.90). X1, X2, X7, and X3 exhibit low to very low correlations, respectively. Other variables show no significant correlation. Regarding QCOOL, a moderate correlation is identified with X7 and X8, a minor correlation is present with X2 and X6, and even less is seen with X4 and X9. Other variables indicate a lack of correlation.

In summary, for the three climates considered concerning QHEAT, it can be concluded that a strong correlation is evident with the air change rate (X6) (average r = 0.92). X1, X7, X2, and, to a lesser extent, X3 show low to very low correlations. No significant correlation is found for other variables. Regarding QCOOL, a moderate correlation is demonstrated by the east window-to-wall ratio (X8) (average r = 0.60), followed by X7 (average r = 0.52). X2, X6, X4, and X9 exhibit low to very low correlations, respectively, while others show no notable correlation.
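The filter-style ranking used in this analysis rests on the Pearson correlation between each predictor and the load. A minimal sketch is given below. The function names and the toy columns are hypothetical and do not reproduce the study's data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def rank_predictors(columns, target):
    """columns maps predictor name -> values; sort by |r| in descending order."""
    scores = {name: pearson_r(values, target) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Toy columns (synthetic): X_a is perfectly correlated, X_b only partly.
toy_target = [1.0, 2.0, 3.0, 4.0]
toy_columns = {"X_b": [1.0, 3.0, 2.0, 4.0], "X_a": [2.0, 4.0, 6.0, 8.0]}
ranking = rank_predictors(toy_columns, toy_target)
```

Because Pearson r captures only linear, pairwise dependence, a low value here does not rule out a contribution through interactions, which is the motivation for the exhaustive search in the later sections.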

The inclusion of all available predictor variables at the outset was intentional and aligned with the methodology of performing a comprehensive feature evaluation. The purpose of the sensitivity analysis was threefold, as follows:

To quantify and rank the relative importance of each predictor variable with respect to its contribution to thermal load prediction. This takes the form of a filter feature selection analysis based only on the Pearson correlation between variables;
To identify the predictor variables with a strong impact on thermal loads;
To support and validate the subsequent application of the Integral Feature Selection (IFS) method, which systematically eliminates redundant or non-informative features and searches for optimal combinations of predictor variables that can improve model performance and reduce complexity.

By including all potential variables, even those with seemingly weak direct correlations, we ensured that no latent interactions or higher-order effects were prematurely excluded. In some machine learning algorithms, especially nonlinear ones, variables that are weakly correlated individually may still contribute to model accuracy through interactions. Consequently, the sensitivity analysis results helped justify the exclusion of certain features during the optimization phase and provided insights into the relative impact of each variable, which ultimately guided the development of a more robust and interpretable predictive model.

3.2. ML Models: Accurate Predictions of Heating and Cooling Loads

This section presents the results of applying ML algorithms to the prediction of heating and cooling loads in buildings. It examines how effectively well-known ML models predict energy consumption, and the results offer insights into the performance of these models and their potential for heating and cooling load prediction in LEBs.

In Figure 7, Figure 8 and Figure 9, the predictive capabilities of each ML model are evaluated for both heating and cooling loads in three different climates. This assessment uses a combination of inputs that includes all predictor variables. The ranking of the ML models is determined according to the performance score. Additional analyses are conducted considering MAPE, σ, and R².

For the Mediterranean climate represented by Meknes City, the results show that during the training phase, the XGBoost model achieves the best performance in predicting both heating and cooling loads. In the testing phase, the ELM model demonstrates superior performance for heating load prediction, while the SVM model outperforms others in predicting cooling loads. Moreover, based on the high R² values obtained, alternative models like SVM display strong potential, exhibiting near-perfect correlations for both thermal loads along with low dispersion indicators. Further details are illustrated in Figure 7.

The results obtained for Ifrane (a cold climate) demonstrate that, during the training phase, the XGBoost model performs optimally for predicting thermal loads. During the testing phase, the SVM model emerges as the most effective, showcasing a perfect correlation and very low values for the dispersion indicators considered in this study. Additional insights can be obtained from Figure 8.

Similar to the findings in a cold climate, the results for Marrakech, representing a semi-arid climate, reveal that, during the training phase, the XGBoost model excels in predicting both heating and cooling thermal loads. However, the SVM model is the most effective in the testing phase. Additional insights can be derived from Figure 9.

The XGBoost model optimally predicts heating and cooling thermal loads across the three considered climates during the training phase. However, excluding the Mediterranean climate, the SVM model is the most effective for predicting the heating and cooling loads. In a Mediterranean climate, the ELM model slightly exceeds the SVM model only in predicting heating loads, while the SVM model outperforms all ML models in predicting cooling loads.

As a summary of the most accurate predictions of heating and cooling loads, we conclude that the XGBoost model consistently performed best during the training phase for all climates. In testing, model performance varied by region:

Mediterranean climate (Meknes)—ELM was most effective for heating load prediction, while SVM led in cooling load prediction;
Cold climate (Ifrane)—SVM outperformed others with near-perfect accuracy;
Semi-arid climate (Marrakech)—SVM again showed superior performance.

Overall, while XGBoost excelled in training, SVM emerged as the most robust model during testing, especially for Ifrane and Marrakech. In Meknes, ELM slightly outperformed SVM for heating loads, but SVM remained dominant for cooling predictions.

In the following subsection, only the top-performing model from the testing phase—specific to each thermal load and each city—will be utilized to identify the optimal combinations of predictor variables, aimed at enhancing both prediction accuracy and model simplicity.

3.3. Optimizing Thermal Load Predictions: Best Hybrid IFS–ML Models

In pursuing accurate and efficient thermal load predictions for LEBs, integrating Integral Feature Selection (IFS) with machine learning (ML) models has emerged as a promising avenue. This section unveils the culmination of our research efforts—the identification and validation of the best-performing hybrid IFS–ML models. While traditional input variable selection (IVS) methods often provide only a single “optimal” combination of predictors, the IFS methodology systematically explores all possible combinations (in our case study there are 2¹⁰ − 1 = 1023 possible combinations) to identify multiple high-performing feature subsets, ensuring robustness and generalizability.

The primary objective of this section is to showcase the prowess of hybrid models in discerning the most influential predictors for robust heating and cooling load predictions. By strategically applying IFS to exhaustively search the predictor space and subsequently integrating these refined variables into the top-performing model obtained in the previous subsection, we aim to achieve a high accuracy yet interpretable modeling framework.

The optimal combinations of predictors for heating and cooling loads across the three Moroccan climates are illustrated in Figure 10 and Figure 11. Only the combinations achieving R² ≥ 0.99 were retained as optimal.

In a Mediterranean climate (Meknes), 63 optimal combinations were found for heating load prediction. For the cold climate (Ifrane), there were 93 combinations, and for the semi-arid climate (Marrakech), 64 combinations were identified. Remarkably, the best combination across all climates included variables X1, X2, X6, and X7, indicating a core set of features critical to accurate load estimation.

For cooling loads, 12 optimal combinations were identified for Meknes and Ifrane, and 11 for Marrakech. While Mediterranean and cold climates shared the same best combinations, the semi-arid climate slightly differed in its top-ranked set. Specifically, for Meknes and Ifrane, the best predictor combination included X2, X3, X4, X6, X7, X8, and X9, while for Marrakech it included X1, X2, X4, X6, X7, X8, and X9.

A statistical validation of the top-ranked combinations was performed (Figure 12 and Figure 13). The results show that the R² consistently exceeded 0.998, while MAPE and σ values were exceptionally low, confirming the reliability and superiority of the proposed approach compared to benchmarks in the literature.

For heating load, the top-ranked combination was 955 for Meknes and Ifrane and 1001 for Marrakech, achieving R² = 0.998, MAPE = 0.284 kWh/m²/year, and σ = 1.115 kWh/m²/year. For cooling load, the best-performing combinations were 1013 for Meknes and Ifrane, and 1023 for Marrakech, with R² = 0.998, MAPE = 0.263 kWh/m²/year, and σ = 1.131 kWh/m²/year. Interestingly, the first-best combination for all climates includes X1, X2, X6, and X7. These findings affirm the effectiveness of the IFS–ML hybrid model in accurately identifying the most relevant variable combinations for thermal load prediction across diverse climates.

Additionally, Figure 12 and Figure 13 present a comprehensive statistical analysis of all optimal input combinations used for predicting heating and cooling thermal loads. The results highlight that the employed performance metrics—R², MAPE, and σ—consistently approach their ideal values (with R² nearing 1, and MAPE and σ approaching zero). This demonstrates that highly accurate thermal load predictions can be achieved using multiple combinations of predictor variables, rather than relying on a single “best” configuration. These outcomes outperform those reported in previous studies on thermal load prediction for LEBs, confirming the robustness and versatility of the proposed hybrid IFS–ML framework.

When we compare our findings with those of other research publications, we may conclude the following:

The incorporation of IFS significantly improved prediction accuracy by isolating critical predictors such as building geometry and material properties. This aligns with recent studies emphasizing feature optimization, including hybrid models combining metaheuristic algorithms (e.g., Particle Swarm Optimization) with XGBoost and SVR [8] and interpretable classifiers prioritizing variables like glazing area and relative compactness [37]. Our results extend these approaches by demonstrating that systematic feature engineering (via IFS) reduces overfitting while maintaining robustness across climates;
The XGBoost model emerged as the optimal predictor during training across all climates, corroborating its dominance in long-term load prediction tasks reported in prior work [38]. However, in Mediterranean climates, the SVM model outperformed others for cooling loads, while ELM showed niche superiority for heating loads. This climate-specific divergence contrasts with studies that prioritize general model performance (e.g., LightGBM achieving R² = 0.9959 globally [39]), highlighting the need for regionally tailored frameworks—a gap underexplored in recent literature [40];
Our models achieved near-perfect Pearson correlation (close to 1) and low dispersion, surpassing benchmarks set by state-of-the-art techniques such as hybrid CNN architectures (MAE < 2 MW [40]) and LightGBM ensembles (CVRMSE = 5.25% [39]). Unlike studies focusing on single-model superiority (e.g., TPE-LightGBM with R² = 0.9981 [8]), we identified multiple predictor combinations that maximize accuracy, offering flexibility for diverse design scenarios;
By bridging feature selection, model optimization, and climate adaptability, this work contributes to the operationalization of hybrid AI systems for sustainable architecture—a priority underscored in recent frameworks integrating ML with climate models [41]. Our methodology aligns with calls for interpretable, actionable tools to guide HVAC optimization and envelope design [8,37].

4. Conclusions, Limitations, Future Directions

4.1. Conclusions

This study proposed a hybrid AI-based framework for accurate thermal load prediction in Low-Energy Buildings (LEBs) across three distinct Moroccan climates. Thirteen machine learning models were initially evaluated using a comprehensive set of input features, revealing that while XGBoost demonstrated superior performance during training, SVM emerged as the most robust model during the testing phase—especially for cold and semi-arid climates. In the Mediterranean climate, the ELM model slightly outperformed SVM in heating load prediction.

Building upon these results, an Integral Feature Selection (IFS) approach was employed in conjunction with the top-performing models to identify the most influential predictor combinations. This not only enhanced the accuracy of thermal load prediction, but also contributed to reducing model complexity.
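
The selection step described above can be pictured as a wrapper search: each candidate subset of predictors is scored by a model, and the subsets are ranked. The sketch below is a simplified illustration rather than the IFS implementation used in this study. `rank_subsets`, `toy_score`, and `TOY_WEIGHTS` are hypothetical names, and the toy scorer stands in for "train the chosen ML model on this subset and return its test score".

```python
from itertools import combinations
from typing import Callable, Sequence

def rank_subsets(features: Sequence[str], score_fn: Callable, top_k: int = 3) -> list:
    """Score every non-empty feature subset and return the top_k (score, subset) pairs."""
    scored = [
        (score_fn(subset), subset)
        for size in range(1, len(features) + 1)
        for subset in combinations(features, size)
    ]
    # Higher score first; ties are broken in favour of smaller subsets,
    # which favours lower model complexity.
    scored.sort(key=lambda item: (-item[0], len(item[1])))
    return scored[:top_k]

# Toy stand-in for a real model evaluation. The weights are invented for
# illustration only and carry no physical meaning.
TOY_WEIGHTS = {"X1": 0.40, "X2": 0.10, "X6": 0.30, "X7": 0.15}

def toy_score(subset):
    return sum(TOY_WEIGHTS.get(name, 0.0) for name in subset)

features = [f"X{i}" for i in range(1, 11)]
best = rank_subsets(features, toy_score, top_k=1)[0]
print(best[1])  # ('X1', 'X2', 'X6', 'X7')
```

In practice `score_fn` would wrap model training and evaluation on held-out data, which makes the exhaustive loop expensive. With ten predictors it remains tractable at 1023 evaluations per model.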

The findings demonstrate the potential of integrating advanced ML models with feature selection techniques to support energy-efficient building design and operation. The proposed framework outperforms existing methods and offers a scalable, climate-adaptive solution for optimizing heating and cooling load predictions in LEBs. Future work will focus on incorporating additional parameters—such as window thermal performance—and extending the approach to real-time prediction scenarios.

The practical value of this research lies in its ability to provide architects, building engineers, and energy consultants with a robust decision-support framework for optimizing thermal load predictions in LEBs. By identifying the most relevant input features through the IFS method and selecting the best-performing machine learning models tailored to specific climates, our approach supports the early design phase of buildings, where decisions on envelope parameters influence energy performance.

Additionally, the proposed framework is applicable across diverse climatic zones, as demonstrated in the three representative Moroccan cities. This adaptability enhances its transferability to other regions with similar climate profiles, supporting national and regional energy efficiency goals.

4.2. Study Limitations

While the proposed hybrid AI framework demonstrated strong performance in predicting thermal loads across different climates, several limitations must be acknowledged, as follows:

First, the analysis did not include certain detailed building parameters such as the thermal performance of exterior windows, which may influence results in real-world applications;
Second, the models were developed using simulated data, which, while controlled and consistent, may not capture all the variabilities present in actual building operation;
Third, the framework has not yet been tested in real-time or online predictive environments, which are critical for practical implementation in Building Energy Management Systems (BEMS);
Lastly, while the study included three diverse climates in Morocco, the generalizability to other regions requires further validation.

4.3. Future Directions

Building on the promising results of this study, several future research directions are envisioned, as follows:

Incorporation of additional building parameters. Future work should include more detailed characteristics of building components—particularly the thermal performance of windows, shading devices, and occupancy schedules—to better reflect real-world thermal dynamics;
Validation with real-world data. While this study relied on simulation data for model training and evaluation, validating the proposed framework using real-world measurements from monitored buildings would enhance its reliability and practical applicability;
Dynamic and real-time prediction. The integration of the framework into BEMS for real-time thermal load prediction and control represents a valuable extension, especially for smart buildings and grid-responsive operations;
Cross-regional generalization. Although the study focused on three Moroccan climates, extending the framework to other geographical regions with different climate patterns and building typologies will further test its adaptability and scalability;
Integration with multi-objective optimization. Future research may explore combining predictive models with optimization algorithms (e.g., genetic algorithms, NSGA-II) to support the design of buildings that balance energy efficiency, cost, and thermal comfort;
Use of deep learning and hybrid architectures. Further investigation into deep learning models (e.g., CNNs, LSTMs, Transformers) and hybrid architectures that can automatically learn temporal and spatial patterns in energy data may enhance predictive performance.

Author Contributions

Conceptualization, Y.E.M. and M.T.U.; methodology, Y.E.M.; software, Y.E.M.; formal analysis, Y.E.M. and M.T.U.; investigation, Y.E.M. and M.T.U.; resources, M.T.U.; writing—original draft preparation, Y.E.M. and M.T.U.; writing—review and editing, Y.E.M. and M.T.U.; visualization, Y.E.M.; supervision, Y.E.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

(a) The dataset, models, or codes supporting this study’s findings are available from the corresponding author upon a reasonable request. (b) All data, models and code generated or used during this study appear in the submitted article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

ACH	Air change rate (h⁻¹)
ANN	Artificial neural network
CDD	Cooling degree days
HDD	Heating degree days
ML	Machine learning
N	North (building orientation)
R²	Coefficient of Determination
σ	Standard deviation
X	Vector of design variables
Q_COOL	Cooling load
Q_HEAT	Heating load
HVAC	Heating, Ventilation and Air Conditioning
WWR	Window-to-Wall Ratio
MBE	Mean Bias Error
RMSE	Root Mean Square Error
MAPE	Mean Absolute Percentage Error
φ	Performance score
DT	Decision Trees
SVM	Support Vector Machine
ELM	Extreme Learning Machine
XGBoost	Extreme Gradient Boosting
RF	Random Forest
TreeBag	Tree Bagger
GLR	Generalized Linear Regression
GR	Gaussian process Regression
LR	Linear Regression
GAM	Generalized Additive Model
KRR	Kernelized Ridge Regression
LRR	Linear Ridge Regression
IFS	Integral Feature Selection
IVS	Input Variable Selection
LEBs	Low-Energy Buildings
AI	Artificial Intelligence
SVR	Support Vector Regression
MLP	Multi-Layer Perceptron
RBF	Radial Basis Function
RSM	Response Surface Methodology

References

1. Slimani, J.; Kadrani, A.; El Harraki, I.; Ezzahid, E. Towards a sustainable energy future: Modeling Morocco’s transition to renewable power with enhanced OSeMOSYS model. Energy Convers. Manag. 2024, 317, 118857.
2. El Hafdaoui, H.; Khallaayoun, A.; Ouazzani, K. Long-term low carbon strategy of Morocco: A review of future scenarios and energy measures. Results Eng. 2024, 21, 101724.
3. Bai, Y.; Yu, C.; Pan, W. Systematic examination of energy performance gap in low-energy buildings. Renew. Sustain. Energy Rev. 2024, 202, 114701.
4. Smouh, S.; Gargab, F.Z.; Ouhammou, B.; Mana, A.A.; Saadani, R.; Jamil, A. A New Approach to Energy Transition in Morocco for Low Carbon and Sustainable Industry (Case of Textile Sector). Energies 2022, 15, 3693.
5. Abdou, N.; El Mghouchi, Y.; Jraida, K.; Hamdaoui, S.; Hajou, A.; Mouqallid, M. Prediction and optimization of heating and cooling loads for low energy buildings in Morocco: An application of hybrid machine learning methods. J. Build. Eng. 2022, 61, 105332.
6. Yu, D.; Liu, T.; Wang, K.; Li, K.; Mercangöz, M.; Zhao, J.; Lei, Y.; Zhao, R. Transformer based day-ahead cooling load forecasting of hub airport air-conditioning systems with thermal energy storage. Energy Build. 2024, 308, 114008.
7. Suh, Y.; Chandramowlishwaran, A.; Won, Y. Recent progress of artificial intelligence for liquid-vapor phase change heat transfer. npj Comput. Mater. 2024, 10, 65.
8. Dasi, H.; Ying, Z.; Ashab, M.F.B. Proposing hybrid prediction approaches with the integration of machine learning models and metaheuristic algorithms to forecast the cooling and heating load of buildings. Energy 2024, 291, 130297.
9. Afzal, S.; Shokri, A.; Ziapour, B.M.; Shakibi, H.; Sobhani, B. Building energy consumption prediction and optimization using different neural network-assisted models; comparison of different networks and optimization algorithms. Eng. Appl. Artif. Intell. 2024, 127, 107356.
10. Liu, H.; Yu, J.; Dai, J.; Zhao, A.; Wang, M.; Zhou, M. Hybrid prediction model for cold load in large public buildings based on mean residual feedback and improved SVR. Energy Build. 2023, 294, 113229.
11. El Alaoui, M.; Rougui, M.; Lamrani, A.; Mouhat, O. Building energy prediction using artificial neural networks and analysis of covariance in the six thermal zones of Morocco. Mater. Today Proc. 2023.
12. Al-Shargabi, A.A.; Almhafdy, A.; Ibrahim, D.M.; Alghieth, M.; Chiclana, F. Buildings’ energy consumption prediction models based on buildings’ characteristics: Research trends, taxonomy, and performance measures. J. Build. Eng. 2022, 54, 104577.
13. Sick, F.; Schade, S.; Mourtada, A.; Uh, D.; Grausam, M. Dynamic Building Simulations for the Establishment of a Moroccan Thermal Regulation for Buildings. J. Green Build. 2014, 9, 145–165.
14. Abdou, N.; El Mghouchi, Y.; Hamdaoui, S.; El Asri, N.; Mouqallid, M. Multi-objective optimization of passive energy efficiency measures for net-zero energy building in Morocco. Build. Environ. 2021, 204, 108141.
15. Wu, J.; Chen, X.-Y.; Zhang, H.; Xiong, L.-D.; Lei, H.; Deng, S.-H. Hyperparameter Optimization for Machine Learning Models Based on Bayesian Optimization. J. Electron. Sci. Technol. 2019, 17, 26–40.
16. Liashchynskyi, P.; Liashchynskyi, P. Grid Search, Random Search, Genetic Algorithm: A Big Comparison for NAS. arXiv 2019, arXiv:1912.06059.
17. Chakraborty, K.; Mehrotra, K.; Mohan, C.K.; Ranka, S. Forecasting the behavior of multivariate time series using neural networks. Neural Netw. 1992, 5, 961–970.
18. Quinlan, J.R. Induction of decision trees. Mach. Learn. 1986, 1, 81–106.
19. Cortes, C.; Vapnik, V. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297.
20. Huang, G.-B.; Zhu, Q.-Y.; Siew, C.-K. Extreme learning machine: Theory and applications. Neurocomputing 2006, 70, 489–501.
21. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016.
22. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
23. Abellán, J.; Masegosa, A.R. Bagging Decision Trees on Data Sets with Classification Noise. In Proceedings of the Foundations of Information and Knowledge Systems, Sofia, Bulgaria, 15–19 February 2010; Link, S., Prade, H., Eds.; Springer: Berlin/Heidelberg, Germany, 2010; pp. 248–265.
24. Lesaffre, E.; Marx, B.D. Collinearity in generalized linear regression. Commun. Stat.-Theory Methods 1993, 22, 1933–1952.
25. Najibi, F.; Apostolopoulou, D.; Alonso, E. Enhanced performance Gaussian process regression for probabilistic short-term solar output forecast. Int. J. Electr. Power Energy Syst. 2021, 130, 106916.
26. Maulud, D.; Abdulazeez, A.M. A Review on Linear Regression Comprehensive in Machine Learning. J. Appl. Sci. Technol. Trends 2020, 1, 140–147.
27. Hastie, T.J. Generalized Additive Models. In Statistical Models in S; Routledge: London, UK, 1992.
28. Vovk, V. Kernel Ridge Regression. In Empirical Inference: Festschrift in Honor of Vladimir N. Vapnik; Schölkopf, B., Luo, Z., Vovk, V., Eds.; Springer: Berlin/Heidelberg, Germany, 2013; pp. 105–116. ISBN 978-3-642-41136-6.
29. Liu, X.-Q.; Gao, F. Linearized Ridge Regression Estimator in Linear Regression. Commun. Stat.-Theory Methods 2011, 40, 2182–2192.
30. El Mghouchi, Y.; Chham, E.; Zemmouri, E.M.; El Bouardi, A. Assessment of different combinations of meteorological parameters for predicting daily global solar radiation using artificial neural networks. Build. Environ. 2019, 149, 607–622.
31. Udristioiu, M.T.; El Mghouchi, Y.; Yildizhan, H. Prediction, modelling, and forecasting of PM and AQI using hybrid machine learning. J. Clean. Prod. 2023, 421, 138496.
32. Badescu, V. Assessing the performance of solar radiation computing models and model selection procedures. J. Atmos. Sol.-Terr. Phys. 2013, 105–106, 119–134.
33. Mehdizadeh Khorrami, B.; Soleimani, A.; Pinnarelli, A.; Brusco, G.; Vizza, P. Forecasting heating and cooling loads in residential buildings using machine learning: A comparative study of techniques and influential indicators. Asian J. Civ. Eng. 2024, 25, 1163–1177.
34. Zhu, L.; Zhang, J.; Gao, Y.; Tian, W.; Yan, Z.; Ye, X.; Sun, Y.; Wu, C. Uncertainty and sensitivity analysis of cooling and heating loads for building energy planning. J. Build. Eng. 2022, 45, 103440.
35. Chen, Y.; Ye, Y.; Liu, J.; Zhang, L.; Li, W.; Mohtaram, S. Machine Learning Approach to Predict Building Thermal Load Considering Feature Variable Dimensions: An Office Building Case Study. Buildings 2023, 13, 312.
36. Lv, R.; Yuan, Z.; Lei, B.; Zheng, J.; Luo, X. Building thermal load prediction using deep learning method considering time-shifting correlation in feature variables. J. Build. Eng. 2022, 61, 105316.
37. Abdel-Jaber, F.; Dirks, K.N. Thermal Load Prediction in Residential Buildings Using Interpretable Classification. Buildings 2024, 14, 1989.
38. Wang, Z.; Hong, T.; Piette, M.A. Building thermal load prediction through shallow machine learning and deep learning. Appl. Energy 2020, 263, 114683.
39. Chen, Y.; Ye, Y.; Chen, Z.; Liu, J.; Su, L.; Ji, Y.; Li, W. Performance Comparison for Building Thermal Load Prediction of Office Buildings Using Machine Learning Methods. SSRN 2021.
40. Zhao, A.; Mi, L.; Xue, X.; Xi, J.; Jiao, Y. Heating load prediction of residential district using hybrid model based on CNN. Energy Build. 2022, 266, 112122.
41. Slater, L.J.; Arnal, L.; Boucher, M.-A.; Chang, A.Y.-Y.; Moulds, S.; Murphy, C.; Nearing, G.; Shalev, G.; Shen, C.; Speight, L.; et al. Hybrid forecasting: Blending climate predictions with AI models. Hydrol. Earth Syst. Sci. 2023, 27, 1865–1889.

Figure 1. Model simulation—(a) perspective, (b) east view, (c) south view, (d) north view, (e) west view.

Figure 2. The main steps in our methodology.

Figure 3. Flowchart for the employed IFS method [31].

Figure 4. Matrix of correlation between the considered predictor variables and thermal loads for Meknes.

Figure 5. Matrix of correlation between the considered predictor variables and thermal loads for Ifrane.

Figure 6. Matrix of correlation between predictor variables and thermal loads for Marrakech.

Figure 7. A comparative analysis of ML models in predicting thermal loads within a Mediterranean climate (Meknes City).

Figure 8. A comparative analysis of ML models in predicting thermal loads within a cold climate (Ifrane City).

Figure 9. A comparative analysis of ML models in predicting thermal loads within a hot climate (Marrakech City).

Figure 10. The best combinations of input for predicting heating loads.

Figure 11. The best combinations of input for predicting cooling loads.

Figure 12. A statistical examination of the best combinations of input for predicting heating loads.

Figure 13. A statistical examination of the best combinations of input for predicting cooling loads.

Table 1. Locations and climate characteristics.

Location (Morocco)	Climatic Zone	Climate Type	Minimum DBT (°C)	Mean DBT (°C)	Maximum DBT (°C)	CDD Base 24 °C	HDD Base 18 °C
Meknes	Z3	Mediterranean	0.12	17.66	43.95	223.45	1007.32
Ifrane	Z4	Cold	−4.0	15.09	34.1	156.65	1711.17
Marrakech	Z5	Semi-arid	2.4	20.3	43.7	459.12	565.85
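
As a reminder of what the CDD and HDD columns measure, the sketch below computes degree days from daily mean dry-bulb temperatures using the base temperatures of Table 1 (24 °C for cooling, 18 °C for heating). The daily-mean method is an assumption, since the procedure used to derive the tabulated values is not stated here, and the seven temperatures are made up for illustration.

```python
def heating_degree_days(daily_mean_temps_c, base_c=18.0):
    """Sum of (base - T) over days whose mean temperature is below the base."""
    return sum(max(0.0, base_c - t) for t in daily_mean_temps_c)

def cooling_degree_days(daily_mean_temps_c, base_c=24.0):
    """Sum of (T - base) over days whose mean temperature is above the base."""
    return sum(max(0.0, t - base_c) for t in daily_mean_temps_c)

# Hypothetical daily mean dry-bulb temperatures (°C) for one week.
week = [10.0, 12.5, 18.0, 21.0, 24.0, 26.5, 30.0]

hdd = heating_degree_days(week)  # 8.0 + 5.5 = 13.5
cdd = cooling_degree_days(week)  # 2.5 + 6.0 = 8.5
print(hdd, cdd)
```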

Table 2. Building construction materials.

Building Components	Material (layers)	Thickness (cm)	Thermal Conductivity (kJ/(h·m·K))	Density (kg/m³)	Thermal Capacity (kJ/(kg·K))	Overall U-Value (W/(m²·K))
Exterior wall	Cement plaster	2	4.152	1700	1	0.3–1
	Hollow brick	7	1.805	720	0.794
	Polystyrene	1–12	0.141	25	1.38
	Hollow brick	7	1.805	720	0.794
	Cement plaster	2	4.152	1700	1
Floor	Tile	0.7	1.227	790	0.801	0.3–2
	Mortar	5	4.152	2000	0.84
	Polystyrene	1–12	0.141	25	1.38
	Heavy concrete	20	6.318	2300	0.92
Roof	Cement plaster	2	4.152	1700	1	0.3–1
	Concrete Block	25	3.924	1300	0.65
	Polystyrene	1–12	0.141	25	1.38
	Heavy concrete	4	6.318	2300	0.92
Interior wall	Cement plaster	2	4.152	1700	1	2.904
	Hollow brick	7	1.805	720	0.794
	Cement plaster	2	4.152	1700	1
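
The conductivities in Table 2 are given in kJ/(h·m·K). Since 1 W = 3.6 kJ/h, dividing by 3.6 converts them to W/(m·K), after which the overall U-value follows from the layer resistances in series. The sketch below applies this to the interior wall. The combined surface (film) resistance of 0.17 m²·K/W is an assumption that is not stated in the table; it was chosen because it reproduces the tabulated interior-wall value. The function name `u_value` is hypothetical.

```python
KJ_PER_H_TO_W = 1.0 / 3.6  # 1 W = 3.6 kJ/h

def u_value(layers, surface_resistance=0.17):
    """Overall U-value in W/(m²·K).

    layers: (thickness in m, conductivity in kJ/(h·m·K)) pairs.
    surface_resistance: combined inside + outside film resistance in
    m²·K/W (assumed value, not taken from Table 2).
    """
    # Each layer contributes R = d / lambda, with lambda converted to W/(m·K).
    r_layers = sum(d / (k * KJ_PER_H_TO_W) for d, k in layers)
    return 1.0 / (surface_resistance + r_layers)

# Interior wall from Table 2: cement plaster, hollow brick, cement plaster.
interior_wall = [(0.02, 4.152), (0.07, 1.805), (0.02, 4.152)]

u = u_value(interior_wall)
print(round(u, 3))  # 2.904, matching the interior-wall entry in Table 2
```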

Table 3. Predictor variables and their ranges.

Parameter	Variable	Unit	Min. Value	Max. Value	Avg. Value
External walls’ transmission coefficient	X1	W/(m²·K)	0.3	1	0.65
Absorption coefficient of the solar radiation of the external walls	X2	-	0.2	0.8	0.5
Roof transmission coefficient	X3	W/(m²·K)	0.3	1	0.65
Absorption coefficient of the solar radiation of the roof	X4	-	0.2	0.8	0.5
Transmission coefficient of the floor	X5	W/(m²·K)	0.3	2	1.15
Air change rate	X6	1/h	0.5	2	1.05
South window-to-wall ratio	X7	%	10	40	25
East window-to-wall ratio	X8	%	10	40	25
West window-to-wall ratio	X9	%	10	40	25
North window-to-wall ratio	X10	%	10	40	25

Table 4. Employed ML models, key hyperparameters, and key references.

No.	Model	Description	Key Hyperparameters and Settings	Historical Background/Key References
1	Artificial Neural Networks (ANNs)	ML models inspired by the human brain, consisting of interconnected neurons organized into layers (input, hidden, output).	Hidden layers: 2. Neurons per layer: 10–50. Activation: ReLU. Optimizer: Adam. Learning rate: 0.01. Epochs: 1000. Batch size: 32.	Originated in 1940s–50s; gained popularity in the 1980s–1990s with the development of backpropagation [17].
2	Decision Trees (DT)	Hierarchical structures representing decisions or tests on features. Simple, interpretable, and effective for classification, regression, and feature importance tasks.	Split criterion: MSE. Maximum depth: 10. Minimum samples per leaf: 5. Split method: Best split.	Initially introduced in [18].
3	Support Vector Machine (SVM)	Supervised ML algorithm aiming to find the optimal hyperplane that separates data classes with maximum margin. Used in classification and regression.	Kernel: RBF. Box constraint (C): 1–10. Kernel scale: auto. Epsilon: 0.1–0.5 (tuned).	First proposed in [19].
4	Extreme Learning Machine (ELM)	Single hidden layer feedforward neural network with randomly generated neurons. Offers faster training compared to traditional methods.	Hidden neurons: 100. Activation function: Sigmoid. Input weights and biases: Random initialization.	Proposed by Huang Guang-Bin in 2006 [20].
5	Extreme Gradient Boosting (XGBoost)	Ensemble learning method based on gradient boosting, sequentially building trees to correct errors. Widely used for its efficiency and accuracy.	Learning rate: 0.1. Max depth: 6. Number of estimators: 100. Regularization: λ = 1.	Introduced by Tianqi Chen in 2014 [21].
6	Random Forest (RF)	Ensemble method combining bagging and random feature selection to build multiple decision trees for classification and regression.	Number of trees: 100. Max depth: auto. Min samples split: 2. Max features: sqrt.	Introduced by Leo Breiman in 2001 [22].
7	Tree Bagger (TB)	Ensemble technique building multiple bagged decision trees trained on bootstrap samples to improve prediction robustness.	Number of trees: 100. Leaf size: 5. Predictor selection: Random. Bootstrap aggregation: Enabled.	Described in [23].
8	Generalized Linear Regression Model (GLRM)	Extends linear regression to non-normal response distributions using a link function connecting predictors to the response mean.	Link function: Identity. Distribution: Normal. Regularization: L2 (λ = 0.1).	Detailed in [24].
9	Gaussian Process Regression (GPR)	Non-parametric, probabilistic model treating outputs as random variables following a multivariate Gaussian distribution.	Kernel function: Rational quadratic. Sigma: auto. Basis function: constant. Fit method: Exact Gaussian process.	Explained in [25].
10	Linear Regression (LR)	Predicts continuous outcomes assuming a linear relationship between input features and output. A foundational and widely applied model.	Intercept: Included. Regularization: None.	Dates back to early 19th century [26].
11	Generalized Additive Model (GAM)	Extends GLRM by allowing non-linear, additive relationships between predictors and response variables.	Spline order: 3. Number of spline terms per predictor: 5. Link function: Identity.	Introduced in [27].
12	Kernelized Ridge Regression Model (KRRM)	Combines ridge regression with the kernel trick for modeling non-linear relationships with regularization.	Kernel: RBF. Lambda: 0.1–1.0. Sigma (RBF scale): auto.	Presented in [28].
13	Linear Ridge Regression (LRR)	Linear regression method adding a regularization term to reduce overfitting. Provides a closed-form solution for coefficient estimation.	Regularization parameter (λ): 0.1–1.0. Solver: SVD.	Developed in [29].


© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

