Generalization of Machine Learning Surrogates Across Building Orientation and Roof Solar Absorptance in Naturally Ventilated Dwellings

Cintia Monreal Jiménez; Angel Jiménez-Godoy; Guillermo Barrios; Robert Jäckel; Alberto Ramos Blanco; Geydy Gutiérrez-Urueta

doi:10.3390/buildings16061245

Abstract

This study develops an interpretable machine learning (ML) surrogate to predict hourly indoor air temperature and discomfort indicators for a representative Mexican social-housing prototype in San Luis Potosí (cold semi-arid, Köppen–Geiger BSk). A four-zone EnergyPlus model with constant window opening (50%) and no internal gains was used to generate a parametric dataset spanning 24 building orientations, seven roof solar absorptance levels, and two neighborhood configurations (surrounded vs. corner). Zone-specific bagged-tree regression models were trained in MATLAB using weather predictors, temporal indicators, and weather-memory features (including outdoor temperature lags and rolling averages). Orientation and roof absorptance were included as explicit design predictors, enabling the surrogate model to generalize across the full combinatorial design space rather than requiring a separate model for each configuration. Interpretability was assessed with SHAP values. Evaluated on orientation–absorptance combinations deliberately held out during training, the surrogate achieved high accuracy across zones of the house (R² = 0.98–0.99; RMSE = 0.31–0.67 °C) with stable, near-zero-centered residuals. When propagated into adaptive-comfort metrics computed directly relative to the monthly neutral temperature $T_{n}$, ML predictions preserved the main cold and hot discomfort degree-hour patterns across the full design space. The proposed surrogate enables rapid, physically consistent comfort-oriented screening of roof finishes and orientation choices in naturally ventilated social housing.

Keywords: thermal comfort; surrogate modeling; machine learning; building energy simulation

1. Introduction

Approximately 30% of global final energy use and 28% of energy-related CO₂ emissions are attributable to buildings, emphasizing the built environment as a central sector for climate change mitigation [1,2]. In Mexico, the residential, commercial and public sectors account for approximately 18% of the national final energy consumption [3]. Rapid urbanization, particularly in developing countries, continues to increase housing demand while intensifying both cooling and heating requirements in residential buildings. In the absence of air conditioning or space heating, whether due to limited access or affordability constraints, occupants may be exposed to prolonged periods of thermal discomfort during hot and cold seasons. Evaluating indoor thermal conditions in naturally ventilated social housing should therefore be both an environmental and a social priority, especially in regions where mechanical conditioning systems remain economically inaccessible. In Mexico, recent federal housing policy has renewed emphasis on large-scale delivery of affordable dwellings: in an official government communiqué announcing the Programa de Vivienda y Regularización, President Claudia Sheinbaum stated that the country will build one million homes over the current six-year term [4].

This situation is particularly critical in social housing programs, where construction policies frequently prioritize affordability and rapid deployment over thermal performance. Consequently, many dwellings lack adequate passive or active thermal control strategies, exposing occupants to extended periods of discomfort. In Mexico, large numbers of social housing units have reportedly been abandoned, with poor thermal conditions identified as one contributing factor [5]. Meanwhile, national housing initiatives continue promoting large-scale construction, reinforcing the need to integrate thermal performance evaluation and criteria into residential design practices.

Although Mexico has mandatory energy-efficiency requirements for the building envelope of new residential buildings intended to be mechanically cooled [6], comparable regulatory guidance is limited for naturally ventilated housing, where overheating and cold-season discomfort are often addressed primarily through passive means. Post-occupancy evaluations provide quantitative evidence of persistent thermal discomfort in Mexican social housing. Post-disaster housing developed under the FONDEN program showed that 52% of occupants experienced extreme dissatisfaction during warm seasons and 28% during colder periods. Environmental monitoring further revealed that indoor conditions remained outside comfort ranges during 67% of the monitored time in warm periods, confirming a substantial gap between delivered housing performance and occupant comfort requirements [7].

Additionally, industrialized construction systems commonly employed in Mexican social housing developments are frequently adapted from other climatic contexts without sufficient regional adjustments, resulting in dwellings poorly suited to local environmental conditions [8]. Recent research has therefore emphasized bioclimatic and parametric evaluations of low-income and social housing in Mexico, showing that climate-responsive envelope and passive strategies can substantially reduce discomfort and, when cooling is present, reduce electricity demand [9,10,11,12]. However, many of these analyses are conducted under assumptions of mechanically conditioned operation or focus on energy savings, whereas studies explicitly addressing naturally ventilated or mixed-mode social housing remain comparatively limited despite their relevance for the majority of the housing stock [13]. Moreover, as of 2020, only about 17% of Mexican households were equipped with at least one air-conditioning unit [14], meaning that the large majority of the housing stock relies on natural ventilation as its primary thermal conditioning strategy.

San Luis Potosí, representative of semi-arid highland climates (Köppen BSk), presents particular thermal challenges, including high solar radiation levels, pronounced diurnal temperature swings, and seasonal transitions requiring both cooling and heating responses. Despite these conditions, technical guidance supporting thermally adequate, naturally ventilated social housing remains limited, even within public programs promoting incremental housing construction. Where such guidance exists, it seldom addresses the combined effect of building orientation and exterior surface finish, variables that significantly influence solar heat gains yet are rarely explored systematically across their full range of combinations. Improving the prediction and understanding of indoor thermal performance in such dwellings is therefore essential for developing effective passive design strategies.

Building energy simulation tools such as EnergyPlus enable detailed prediction of indoor thermal behavior and are widely employed in performance assessment studies [15,16]. However, individual simulation runs may require several minutes of computational time, making parametric analyses involving thousands of design alternatives computationally expensive and operationally complex [17]. Design optimization, sensitivity analysis, and uncertainty quantification typically demand large numbers of simulation evaluations, rendering exclusive reliance on physics-based models impractical in many applications [18]. Surrogate modeling offers a practical alternative by training machine learning models on simulation outputs to approximate the input–output relationships of physics-based models. Once trained, such models can produce predictions within milliseconds, enabling large-scale parametric exploration while retaining the physical consistency embedded in simulation data [18,19]. This approach bridges the gap between detailed simulation and rapid performance assessment, supporting performance-informed design decisions at scales otherwise computationally prohibitive.

Machine learning techniques have increasingly been applied to predict indoor thermal conditions and building energy performance. Tree-based ensemble methods, including Random Forest and Gradient Boosting algorithms, have demonstrated strong performance in regression tasks involving heterogeneous building and climate datasets [20,21].

Feature engineering strategies incorporating lag variables and rolling (moving) averages have proven effective in capturing delayed responses to outdoor forcing in building thermal datasets [21]. In parallel, interpretability techniques such as SHapley Additive exPlanations (SHAP) enhance model transparency by quantifying predictor contributions, improving physical interpretability and trust in machine learning predictions [22].

Recent indoor temperature prediction studies have also explored forecasting-oriented formulations based on sequential models such as LSTM and related deep-learning architectures, particularly when long histories of measured indoor variables are available [23,24]. More broadly, recent reviews of residential indoor temperature prediction confirm that both white-box and black-box approaches are being actively developed for building-performance applications [25]. However, these forecasting-oriented approaches typically address operational prediction from past observations, whereas the present study targets a different objective: surrogate emulation of a physics-based simulation model across a parametric design space using exogenous weather and design predictors. Accordingly, we prioritize an interpretable tree-based formulation that is well suited to tabular inputs and design-space screening rather than a benchmark of sequential forecasting architectures.

Nevertheless, important gaps remain. Existing studies often focus on mechanically conditioned commercial buildings, whereas naturally ventilated residential dwellings receive comparatively limited attention. Semi-arid climates are underrepresented, and neighborhood effects, such as differences between surrounded and corner dwellings, remain insufficiently studied. Furthermore, the capacity of ML models to generalize predictions across unseen design parameters, including orientation and roof absorptance, remains largely unexplored. Yet these two variables are among the most actionable early-stage design levers for naturally ventilated housing and therefore require models that remain accurate across their full ranges. Collectively, these gaps point to the need for interpretable surrogate models for naturally ventilated housing in semi-arid climates that also capture neighborhood effects and generalize across key design parameters.

This study advances research on data-driven thermal performance modeling by focusing on naturally ventilated Mexican social housing in semi-arid highland climates, a context that remains underrepresented in machine learning thermal prediction studies. To address the limited attention given to residential naturally ventilated buildings relative to mechanically conditioned commercial stock, we develop a machine-learning surrogate model to predict indoor thermal behavior and evaluate its performance under a structured parametric design space that varies building orientation, roof absorptance, and neighborhood configuration. In addition, without relying on past indoor temperatures, we account for outdoor forcing history through lagged and rolling outdoor temperature descriptors, yielding a more realistic characterization of dynamic indoor responses in typical Mexican residential construction. We also assess the ability of the models to generalize to unseen building orientations under natural ventilation, considering the influence of wind speed and wind direction on indoor conditions. This generalizable proxy model enables rapid screening across the full combinatorial design space of orientation–absorptance choices, which would otherwise require thousands of EnergyPlus evaluations. Further, the resulting models are interpreted using SHAP to identify the dominant drivers of indoor thermal performance, and thermal comfort implications are assessed using an adaptive neutrality temperature framework, thereby linking predictive accuracy with physically meaningful insights for design and planning. Overall, the study delivers an interpretable, generalizable ML proxy for indoor thermal behavior in naturally ventilated Mexican social housing, enabling comfort-oriented evaluation of orientation and roof-finish choices under semi-arid conditions and contrasting neighborhood exposures.

2. Case Study and Simulation Framework

This study evaluates the thermal performance of a representative Mexican social-housing prototype obtained from the public platform Decide y Construye, developed by Mexico’s Secretariat of Agrarian, Territorial and Urban Development (SEDATU). The platform provides open-access architectural and technical documentation (plans and manuals), enabling a reproducible definition of building geometry and construction systems for simulation-based research [26]. Among the available prototypes, Casa 3 was selected because it targets dry and semi-dry climates and provides sufficient information to define the dwelling layout, envelope assemblies, and room-based thermal zoning [27]. The case study is located in San Luis Potosí, a highland city in central Mexico with a cold semi-arid (BSk) climate under the Köppen–Geiger classification [28]. Selecting Casa 3 (dry/semi-dry) is consistent with Mexico-specific climatic regionalization based on adaptations of the Köppen scheme [29].

Weather boundary conditions were defined using a Typical Meteorological Year (TMY) file obtained from OneBuilding based on the period 2009–2023 [30,31]. The climate is characterized by moderate mean temperatures combined with pronounced diurnal swings and high solar availability during clear-sky seasons. A rainy period associated with the North American monsoon increases humidity and typically reduces the diurnal thermal amplitude. The main climatic indicators derived from the selected EPW file are summarized in Table 1. Seasonal trends of key boundary conditions are shown in Figure 1.

Table 1. Main climatic indicators for San Luis Potosí derived from the EPW weather file for the period 2009–2023.

Figure 1. Seasonal evolution of key outdoor boundary conditions for San Luis Potosí derived from the TMY (2009–2023) EPW dataset: (top) hourly outdoor air temperature with seasonal mean, (middle) hourly relative humidity with seasonal mean, and (bottom) Solar Energy with seasonal mean. Vertical dashed lines indicate seasonal boundaries.

The dwelling is a single-story residential unit (approximately 42 m² of built area) with an interior height of 2.8 m. The plan includes four primary zones: bathroom, bedroom, kitchen–dining area, and study room (Figure 2) [27]. For simulation purposes, each space was modeled as an independent thermal zone [15]. Envelope systems correspond to masonry walls and a lightweight roof solution typical of low-cost construction with no air conditioning. Envelope assemblies and thermophysical properties were defined directly from the official documentation [27] and are presented in Table 2.

Figure 2. Simplified floor plan of Casa 3 (Decide y Construye) showing the thermal zoning adopted in this study and the corresponding zone areas.

Table 2. Thermophysical properties and thickness of construction materials used in the EnergyPlus model, reported from outside to inside, including thermal conductivity $k$, density $\rho$, specific heat $c_{p}$, and thickness $L$.

Operational assumptions were kept constant across all parametric cases to isolate the influence of the design variables. Windows were maintained at a 50% open-area fraction throughout the year, and internal heat gains (occupants, lighting, appliances) were neglected. These assumptions provide a controlled setting to assess the effect of orientation and roof solar absorptance on indoor temperatures and comfort outcomes, and to evaluate whether the surrogate models capture physically meaningful drivers rather than operating as a black box.

Figure 3 shows the geometric model of the dwelling. Shading surfaces were added to represent the roof overhangs and the water tank volume.

Figure 3. Isometric view of the geometric model for Casa 3.

Two neighborhood configurations were considered: a surrounded dwelling, representing a mid-block condition with adjacent buildings on both side facades and the rear facade, and a corner dwelling, with adjacent buildings on one side and at the rear, leaving one facade fully exposed (Figure 4). Neighboring buildings were represented as shading objects to capture their impact on solar access without explicitly simulating their internal thermal behavior [15]. The shared walls were modeled as adiabatic boundaries, assuming similar indoor thermal conditions in the adjacent dwellings. These configurations allow quantification of how urban adjacency modifies radiative exposure and indirectly influences the indoor thermal response.

Figure 4. Schematic representation of the two neighborhood configurations considered in this study: (left) surrounded dwelling with adjacent buildings on both sides and at the rear, and (right) corner dwelling with adjacent buildings on one side and at the rear.

Annual simulations were conducted in EnergyPlus (Version 25.2.0, U.S. Department of Energy, Washington, DC, USA) with six timesteps per hour [15] using the selected EPW file. Simulation outputs were requested at an hourly reporting frequency for the four thermal zones (Kitchen, Study, Bathroom, Bedroom). Standard warm-up iterations were used to reduce sensitivity to initial conditions [15]. The principal output variable extracted for the analysis and training of the surrogate model was the mean air temperature of the zone, $T_{in}$, for each thermal zone. For the present case study, this zone-mean representation is appropriate because the dwelling is a small, single-story, naturally ventilated housing unit with four relatively compact thermal zones, a ceiling height of 2.8 m, and no mechanical air-conditioning system. Under these conditions, local stratification and recirculation effects are not expected to dominate the building-scale thermal response of interest here. In addition, the EnergyPlus AirflowNetwork formulation accounts for pressure-driven airflow through openings and between zones, capturing the main ventilation processes relevant to this study without requiring CFD resolution. CFD would be more appropriate for large open-plan spaces, tall atria, or studies focused on detailed local velocity fields or pollutant transport.

In this framework, EnergyPlus solves the transient zone heat balance by treating each thermal zone as a well-mixed air volume, meaning that a single mean air temperature characterizes each zone at every timestep [15]. This is the standard modeling approach in building energy simulation and forms the basis of internationally recognized validation procedures such as ASHRAE Standard 140 [32]. Each EnergyPlus release is systematically tested against the Standard 140 BESTEST suite, with published results falling within the range of other validated programs [33].

A parametric design was defined to quantify the combined impact of building orientation and roof solar absorptance and to generate the dataset required for developing the surrogate model. Building orientation was varied from 0° to 345° in 15° increments (24 orientations). Roof solar absorptance was varied from 0.2 to 0.8 in 0.1 increments (7 levels), spanning typical light-to-dark roof finishes. These variations were evaluated for both neighborhood configurations. Each simulation produced 8760 hourly records per thermal zone and variable. The EnergyPlus model was developed to provide physically consistent data for training and evaluating the surrogate machine learning workflow under controlled assumptions; therefore, the model was not calibrated against field measurements.
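The size of the parametric design follows directly from the ranges stated above. The short Python sketch below enumerates the cases and counts the hourly records per zone; the variable names are illustrative and are not taken from the authors' workflow.

```python
from itertools import product

# Design ranges as stated in the text.
orientations_deg = list(range(0, 360, 15))                        # 0, 15, ..., 345
roof_absorptances = [round(0.2 + 0.1 * i, 1) for i in range(7)]   # 0.2, ..., 0.8
neighborhoods = ["surrounded", "corner"]
HOURS_PER_YEAR = 8760

# One annual EnergyPlus run per (neighborhood, orientation, absorptance) case.
cases = list(product(neighborhoods, orientations_deg, roof_absorptances))
cases_per_neighborhood = len(orientations_deg) * len(roof_absorptances)
records_per_zone = cases_per_neighborhood * HOURS_PER_YEAR

print(len(cases), cases_per_neighborhood, records_per_zone)
# prints: 336 168 1471680
```

The last figure is the number of hourly records per zone and per neighborhood configuration before any rows are discarded.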

Thermal comfort was assessed using an adaptive approach based on the monthly neutral temperature $T_{n}$, which relates thermal neutrality to prevailing outdoor conditions in free-running (naturally ventilated) buildings [34]. In this study, $T_{n}$ for each month was computed from the mean monthly outdoor air temperature $\bar{T}_{o}$ as:

$$T_{n} = 13.5 + 0.54\,\bar{T}_{o} \tag{1}$$

where $\bar{T}_{o}$ was obtained from the same weather dataset used for the EnergyPlus simulations.
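As a minimal illustration of Equation (1), the following Python sketch computes a monthly neutral temperature from an hourly outdoor temperature series. The function name and the toy temperatures are assumptions made for this example and do not come from the EPW file used in the study.

```python
from collections import defaultdict


def monthly_neutral_temperature(months, t_out):
    """Return {month: T_n}, with T_n = 13.5 + 0.54 * monthly mean outdoor temperature."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for month, temp in zip(months, t_out):
        sums[month] += temp
        counts[month] += 1
    return {m: 13.5 + 0.54 * sums[m] / counts[m] for m in sums}


# Hypothetical toy series (deg C): three hours in January, three in May.
months = [1, 1, 1, 5, 5, 5]
t_out = [10.0, 14.0, 12.0, 20.0, 24.0, 22.0]
t_n = monthly_neutral_temperature(months, t_out)
# January mean is 12 deg C, May mean is 22 deg C.
```

With these toy values the monthly means are 12 °C and 22 °C, so Equation (1) gives neutral temperatures of about 19.98 °C and 25.38 °C, respectively.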

To avoid masking how temperature prediction errors propagate into comfort outcomes, hot and cold discomfort degree-hours were quantified directly with respect to the adaptive neutrality temperature $T_{n}$. For each thermal zone $z$ and annual simulation, the total cold and hot discomfort degree-hours over the full year (8760 h) were computed as:

$$\mathrm{DDH}_{cold}^{(z)} = \sum_{t=1}^{N} \max\left(0,\; T_{n}(t) - T_{in}^{(z)}(t)\right) \Delta t, \tag{2}$$

$$\mathrm{DDH}_{hot}^{(z)} = \sum_{t=1}^{N} \max\left(0,\; T_{in}^{(z)}(t) - T_{n}(t)\right) \Delta t, \tag{3}$$

where $t$ indexes the hourly timesteps, $N = 8760$, and $\Delta t = 1$ h. Here, $T_{in}^{(z)}(t)$ is the hourly zone mean indoor air temperature in thermal zone $z$. Annual cold and hot discomfort degree-hours were then obtained by summing the corresponding zone-level totals across all thermal zones, and total annual discomfort was defined as the sum of annual cold and annual hot discomfort degree-hours.

This procedure was applied consistently to: (i) indoor temperatures obtained from EnergyPlus simulations and (ii) indoor temperatures predicted by the ML surrogate models, enabling a direct comparison of comfort-relevant outcomes between physics-based and data-driven approaches.
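The degree-hour definitions in Equations (2) and (3) can be sketched in a few lines of Python. The function names, zone labels, and the four-hour toy series below are hypothetical; the same functions would be applied to either EnergyPlus or ML-predicted temperatures.

```python
def discomfort_degree_hours(t_in, t_n, dt_hours=1.0):
    """Cold and hot discomfort degree-hours of one zone relative to T_n (Eqs. 2-3)."""
    if len(t_in) != len(t_n):
        raise ValueError("t_in and t_n must have the same length")
    cold = sum(max(0.0, tn - ti) for ti, tn in zip(t_in, t_n)) * dt_hours
    hot = sum(max(0.0, ti - tn) for ti, tn in zip(t_in, t_n)) * dt_hours
    return cold, hot


def annual_discomfort(zone_temps, t_n):
    """Sum zone-level totals over all zones; returns (cold, hot, total)."""
    cold = hot = 0.0
    for series in zone_temps.values():
        c, h = discomfort_degree_hours(series, t_n)
        cold += c
        hot += h
    return cold, hot, cold + hot


# Hypothetical four-hour example with a constant neutral temperature of 20 deg C.
t_n = [20.0, 20.0, 20.0, 20.0]
zone_temps = {
    "zone_a": [18.0, 19.0, 21.0, 23.0],   # cold = 3.0, hot = 4.0
    "zone_b": [20.0, 20.5, 19.5, 20.0],   # cold = 0.5, hot = 0.5
}
cold, hot, total = annual_discomfort(zone_temps, t_n)
# cold = 3.5, hot = 4.5, total = 8.0 degree-hours
```

Because an hour contributes to at most one of the two sums, cold and hot discomfort never cancel each other, which is why prediction errors of either sign remain visible in the totals.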

3. Dataset and Machine Learning Surrogate Modeling

A supervised dataset was constructed from the EnergyPlus parametric simulations to train and evaluate surrogate ML models for indoor air temperature prediction. Each simulation produced hourly values of the target variable, defined as the zone mean air temperature $T_{in}$ for each thermal zone. Because thermal response differs among rooms due to geometry, envelope exposure, and airflow pathways, independent models were trained for each thermal zone and for each neighborhood configuration (surrounded vs. corner), resulting in eight ML models.

This zone- and configuration-specific formulation was adopted intentionally to preserve interpretability and to avoid conflating distinct thermal behaviors associated with room function, envelope exposure, and neighborhood boundary conditions. A single global model including all zones and configurations through additional categorical or geometric descriptors could be explored in future work; however, such an approach would require a substantially broader and more heterogeneous training dataset spanning multiple dwelling typologies, layouts, and operating scenarios in order to generalize reliably beyond the present case study.

Predictors were selected to be computable from the weather file and design parameters, avoiding dependency on indoor measurements or latent state variables unavailable at prediction time. The predictor set includes: (i) meteorological drivers (temperature, radiation, wind, humidity, pressure), (ii) design parameters such as orientation and roof solar absorptance, and (iii) temporal indicators (week-of-year). To represent delayed thermal response without introducing indoor autoregressive terms, lag features and rolling averages derived from outdoor air temperature were included to capture the time-delayed influence of outdoor forcing [21]. These features rely only on prior values of $T_{o}$ from the weather file (or forecast) and are therefore available at prediction time for unseen cases. Because they require neither past indoor temperatures nor previous model outputs, the formulation is non-autoregressive.
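The outdoor-temperature memory features can be illustrated with a short Python sketch. The lag horizon (2 h) and window length (3 h) are hypothetical values chosen only to keep the example small, and the trailing window is assumed here to include the current hour; the text above does not specify these details.

```python
def lag(series, k):
    """Value observed k hours earlier; None where the history is not yet available."""
    return [None if i < k else series[i - k] for i in range(len(series))]


def rolling_mean(series, window):
    """Trailing mean over the current hour and the previous window - 1 hours."""
    out, running = [], 0.0
    for i, value in enumerate(series):
        running += value
        if i >= window:
            running -= series[i - window]  # drop the hour that left the window
        out.append(running / window if i >= window - 1 else None)
    return out


# Hypothetical hourly outdoor temperatures (deg C), not taken from the EPW file.
t_o = [10.0, 12.0, 14.0, 16.0, 18.0]
t_o_lag2 = lag(t_o, 2)             # [None, None, 10.0, 12.0, 14.0]
t_o_roll3 = rolling_mean(t_o, 3)   # [None, None, 12.0, 14.0, 16.0]

# Keep only the hours for which every memory feature is defined.
rows = [
    (t, l, r)
    for t, l, r in zip(t_o, t_o_lag2, t_o_roll3)
    if l is not None and r is not None
]
```

Every feature is built from the outdoor series alone, so the same code can be run on a weather file for a design case that was never simulated.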

Thus, each hourly record was treated as one supervised-learning sample. For a given thermal zone and neighborhood configuration, the input to the model is therefore a tabular predictor vector of 18 features (weather variables, week of the year, design parameters, and outdoor temperature memory descriptors), while the output is a single scalar target: the hourly zone mean indoor air temperature $T_{in}$. The full list of predictors is shown in Table 3.

Table 3. Eighteen predictors used to train the ML surrogate models. Outdoor temperature memory predictors are derived from outdoor air temperature $T_{o}$. Building azimuth and roof solar absorptance are kept constant within each simulation.

For each neighborhood configuration, the parametric study includes 24 orientations and 7 roof solar absorptances, each case representing an annual simulation. Each case provides 8760 hourly records per zone, yielding a total of 1,471,680 records per zone and per neighborhood configuration. Design parameters ($\theta$, $\alpha_{roof}$) were replicated across all timesteps within each case, while lag/rolling predictors were computed from the hourly outdoor temperature series. Records with undefined lag/rolling values at the start of the year were removed to ensure consistent feature availability.

To evaluate generalization across unseen combinations of orientation and roof absorptance, data partitioning was performed in two stages. First, complete ($\theta$, $\alpha_{roof}$) combinations were withheld to create an independent generalization dataset composed of annual cases entirely absent from model fitting. This subset was used as the main out-of-sample benchmark. Second, within the remaining training–validation cases, the MATLAB Regression Learner app was used to perform internal model selection and evaluation through cross-validation together with a small held-out validation subset. Therefore, the principal assessment of generalization reported in this study is based on design-case combinations not seen during training, enabling a strict assessment of interpolation capability across the parametric design space.
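The first partitioning stage operates on whole annual cases rather than on individual hourly records. The Python sketch below shows one way to do this; the random selection and the `train_fraction` value of 0.2 are assumptions for illustration only, since the text does not state how the withheld combinations were chosen.

```python
import random
from itertools import product


def split_cases(orientations, absorptances, train_fraction, seed=0):
    """Split complete (orientation, absorptance) cases into fitting and held-out sets."""
    cases = list(product(orientations, absorptances))
    rng = random.Random(seed)
    rng.shuffle(cases)
    n_train = round(train_fraction * len(cases))
    return cases[:n_train], cases[n_train:]


orientations_deg = list(range(0, 360, 15))
roof_absorptances = [round(0.2 + 0.1 * i, 1) for i in range(7)]

# Hypothetical fraction: fit on 20% of the cases, hold out the rest.
fit_cases, generalization_cases = split_cases(orientations_deg, roof_absorptances, 0.2)

# No annual case may appear on both sides of the split.
assert not set(fit_cases) & set(generalization_cases)
```

Splitting by case prevents hourly records of a held-out combination from leaking into model fitting, which a purely row-wise random split would not guarantee.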

Surrogate regression models were developed using MATLAB (R2024b, MathWorks, Natick, MA, USA) [35]. A bagged decision-tree ensemble (Bagged Trees) was selected as the working surrogate formulation because it provides a favorable balance between predictive accuracy, robustness on tabular datasets, and model interpretability, which is central to the present study [36,37]. Bagging reduces variance by training multiple trees on bootstrap resamples and averaging predictions [36]. For the predictor vector $x$, the ensemble prediction is:

$$\hat{y}(x) = \frac{1}{M} \sum_{m=1}^{M} f_{m}(x), \tag{4}$$

where $M$ is the number of trees and $f_{m}$ is the $m$-th regression tree.
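Equation (4) can be illustrated with a deliberately small Python sketch. It is not the MATLAB Bagged Trees implementation used in the study: the base learner here is a one-split regression stump on a single predictor, chosen only to keep the example short, and the data are a hypothetical step function.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree on a single predictor (toy base learner)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # both sides of each split are non-empty
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        m_left, m_right = mean(left), mean(right)
        sse = sum((y - m_left) ** 2 for y in left) + sum((y - m_right) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, m_left, m_right)
    if best is None:  # resample contains a single distinct x: predict the mean
        m_all = mean(ys)
        return lambda x: m_all
    _, threshold, m_left, m_right = best
    return lambda x: m_left if x <= threshold else m_right


def fit_bagged(xs, ys, n_learners=30, seed=0):
    """Train each learner on a bootstrap resample drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    learners = []
    for _ in range(n_learners):
        idx = [rng.randrange(n) for _ in range(n)]
        learners.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return learners


def predict_bagged(learners, x):
    """Ensemble prediction: the average of the individual learners (Eq. 4)."""
    return sum(f(x) for f in learners) / len(learners)


# Hypothetical step-shaped data.
xs = [float(i) for i in range(10)]
ys = [0.0] * 5 + [10.0] * 5
learners = fit_bagged(xs, ys)
low, high = predict_bagged(learners, 0.0), predict_bagged(learners, 9.0)
```

Each stump places its split slightly differently depending on its bootstrap resample, and averaging the learners smooths those differences, which is the variance-reduction effect described above.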

Model selection and hyperparameter tuning were performed using cross-validation within Regression Learner [35]. Feature importance was quantified using SHAP, which attributes to each predictor a signed contribution indicating how it shifts a specific prediction upward or downward relative to the model’s typical output. The magnitude of the SHAP value reflects the strength of that predictor’s influence for a given sample; averaging the absolute SHAP values across samples provides a global importance ranking used here for interpretability and reduced-model selection [22]. Prediction error relevance was assessed by propagating temperature errors into discomfort degree-hours (Section 2), providing comfort-relevant evaluation beyond purely statistical indicators.
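The global ranking described above reduces to averaging absolute per-sample attributions. The Python sketch below assumes the SHAP values have already been computed by some explainer; the matrix entries and the three predictor names are hypothetical and serve only to show the aggregation step.

```python
def global_importance(shap_values, names):
    """Rank predictors by the mean absolute SHAP value across samples."""
    n = len(shap_values)
    mean_abs = {
        name: sum(abs(row[j]) for row in shap_values) / n
        for j, name in enumerate(names)
    }
    return sorted(mean_abs.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-sample SHAP values (deg C); one row per sample, one column per predictor.
names = ["T_o", "roof_absorptance", "wind_speed"]
shap_values = [
    [1.2, -0.4, 0.1],
    [-0.8, 0.6, -0.1],
    [1.0, -0.2, 0.0],
]
ranking = global_importance(shap_values, names)
# Order of predictors: T_o, roof_absorptance, wind_speed
```

Taking the absolute value before averaging matters: a predictor that pushes some predictions up and others down would otherwise appear unimportant.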

Table 4 shows the main configuration used in the MATLAB Regression Learner. For each model, the minimum leaf size was 5 and the number of learners was 30. The main difference between models was the number of predictors.

Table 4. Software environment and Bagged Trees hyperparameter configuration (MATLAB Regression Learner).

Table 5 summarizes the different predictors included in the selected model for each zone and dwelling configuration. The feature-selection procedure began by training each machine learning model with the full set of 18 predictors. Next, predictors were ranked according to their SHAP values, and a recursive elimination process was applied by removing the least important predictor one at a time. After each removal, the model was retrained, and its performance was compared using RMSE and R². If the RMSE variation remained below ∼2% and R² was not adversely affected, the predictor was permanently removed, and the procedure continued with the next SHAP-ranked variable. This process was repeated iteratively until no further predictors could be eliminated without degrading the model performance.

Table 5. Predictor inclusion matrix for the selected models by dwelling configuration and zone. An “x” indicates that the predictor was included in the corresponding model.

The objective of this recursive elimination procedure was not to analyze the full degradation trajectory of every intermediate model, but to identify a reduced predictor subset that preserved predictive performance within a practical tolerance. Accordingly, the manuscript reports the final selected predictor sets (Table 5) and the corresponding predictive performance of the retained models, rather than the full sequence of intermediate trials. A comprehensive benchmark of alternative feature-selection methods or elimination trajectories was considered beyond the scope of the present study.
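The recursive elimination described above can be sketched as a generic loop. In the Python example below, `evaluate` and `rank_least_important` are hypothetical callbacks standing in for model retraining and SHAP-based ranking. Comparing each trial against the most recently accepted model, and the `r2_tol` value, are assumptions; only the roughly 2% RMSE tolerance comes from the text.

```python
def recursive_elimination(predictors, evaluate, rank_least_important,
                          rmse_tol=0.02, r2_tol=0.001):
    """Drop predictors one at a time while RMSE and R2 stay within tolerance.

    evaluate(subset) -> (rmse, r2) after retraining on `subset` (hypothetical callback).
    rank_least_important(subset) -> subset ordered from least to most important.
    """
    selected = list(predictors)
    ref_rmse, ref_r2 = evaluate(selected)
    removed_one = True
    while removed_one and len(selected) > 1:
        removed_one = False
        for candidate in rank_least_important(selected):
            trial = [p for p in selected if p != candidate]
            rmse, r2 = evaluate(trial)
            if abs(rmse - ref_rmse) / ref_rmse < rmse_tol and r2 >= ref_r2 - r2_tol:
                selected, ref_rmse, ref_r2 = trial, rmse, r2
                removed_one = True
                break  # re-rank the reduced set before trying the next removal
    return selected


# Toy stand-ins: two predictors carry all the signal, the other two carry none.
ESSENTIAL = {"T_o", "roof_absorptance"}


def toy_evaluate(subset):
    missing = len(ESSENTIAL - set(subset))
    return 0.40 + 0.50 * missing, 0.99 - 0.10 * missing


def toy_rank(subset):
    order = ["pressure", "humidity", "roof_absorptance", "T_o"]
    return [p for p in order if p in subset]


kept = recursive_elimination(
    ["T_o", "roof_absorptance", "pressure", "humidity"], toy_evaluate, toy_rank
)
# kept == ["T_o", "roof_absorptance"]
```

In the toy run, the two uninformative predictors are removed without changing the error, while removing either essential predictor more than doubles the RMSE and is therefore rejected.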

4. Results and Discussion

To provide a first illustrative view of surrogate fidelity before moving to aggregated metrics, Figure 5 and Figure 6 juxtapose hourly EnergyPlus reference temperatures $T_{ref}$ with ML predictions $T_{ML}$ for the thermal zones over representative time periods, for both neighborhood configurations. Across these exemplars, the surrogate closely tracks the temporal dynamics of each zone, reproducing diurnal cycling, short-term fluctuations, and zone-specific amplitude differences with minimal visible phase lag. Agreement remains tight during both rising and falling temperature segments, including periods with sharper excursions, suggesting that the learned mapping captures the dominant thermal response of the dwelling under varying boundary-condition exposures.

Figure 5. Example hourly indoor air temperature time series for the corner neighbor configuration ($\theta$ = 90°, $\alpha_{roof}$ = 0.5), comparing EnergyPlus reference $T_{ref}$ (orange), surrogate predictions $T_{ML}$ (yellow), and outdoor air temperature $T_{o}$ (blue dotted line) over one week.

Figure 6. Example hourly indoor air temperature time series for the surrounded neighbor configuration ($\theta$ = 90°, $\alpha_{roof}$ = 0.5), comparing EnergyPlus reference $T_{ref}$ (orange), surrogate predictions $T_{ML}$ (yellow), and outdoor air temperature $T_{o}$ (blue dotted line) over one week.

The results evaluate the surrogate model from three complementary perspectives: (i) zone-level predictive accuracy of hourly indoor air temperature by means of parity plots and standard regression indicators (RMSE, MAE, and $R^{2}$), (ii) stability and distribution of prediction errors across dwelling configurations and zones, examining bias, dispersion, and outlier behavior, and (iii) propagation of temperature prediction uncertainty into thermal-discomfort metrics based on the monthly adaptive neutral temperature, $T_{n}$. The full corpus of simulated data comprises 1,453,504 hourly observations split into two disjoint sets. The training–validation dataset (n = 265,544) was used for model fitting, partitioned into a training split (95%, n = 251,568) and a validation split (5%, n = 13,976; reported as the held-out test subset in Table 6 and Table 7). The generalization dataset (n = 1,187,960), comprising dwelling configurations entirely absent from the training–validation dataset, was reserved as an independent benchmark to assess the model’s ability to generalize to unseen design conditions within the explored parameter ranges. The performance metrics reported throughout are MSE, RMSE, MAE, MAPE, and $R^{2}$, computed separately for the training and validation splits and recalculated on the generalization dataset. All metrics are further disaggregated by thermal zone (bathroom, kitchen, study, and bedroom) to assess whether predictive quality is uniform across functionally distinct spaces. SHAP analyses are subsequently used to interpret the dominant drivers learned by the models, before discomfort degree-hours are compared between EnergyPlus reference outputs and ML-based predictions.
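As an illustration of the five reported indicators, the following minimal Python sketch computes MSE, RMSE, MAE, MAPE, and $R^{2}$ for paired reference and predicted temperatures. It is not the MATLAB code used in this study; the function name `regression_metrics` and the sample values are hypothetical, and MAPE is assumed to be normalized by the reference temperature in °C.

```python
import math


def regression_metrics(t_ref, t_pred):
    """Return MSE, RMSE, MAE, MAPE (%) and R^2 for paired hourly temperatures.

    Assumptions (not taken from the paper): MAPE is relative to the
    reference value, and the reference series is not constant.
    """
    if len(t_ref) != len(t_pred) or not t_ref:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(t_ref)
    errors = [p - r for r, p in zip(t_ref, t_pred)]  # e = T_pred - T_ref
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / r) for e, r in zip(errors, t_ref)) / n
    mean_ref = sum(t_ref) / n
    ss_tot = sum((r - mean_ref) ** 2 for r in t_ref)  # total sum of squares
    ss_res = sum(e * e for e in errors)               # residual sum of squares
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": mae,
        "MAPE": mape,
        "R2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical four-hour example; real use would pass one zone's full series.
metrics = regression_metrics([20.0, 22.0, 24.0, 26.0], [20.5, 21.5, 24.5, 25.5])
print(metrics)
```

In practice the function would be called once per thermal zone and per split (training, test, generalization), which mirrors the disaggregation described above.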

To evaluate the agreement between EnergyPlus reference temperatures and ML predictions for the four thermal zones in both neighborhood configurations (corner and surrounded), parity plots were used because they provide a visual diagnostic of systematic bias and dispersion across the full annual dataset, while RMSE, MAE, and $R^{2}$ provide a compact quantitative summary of accuracy. Reporting results by zone and neighbor configuration is important because each space experiences distinct boundary-condition exposure (external walls, solar gains, and ventilation sensitivity), which can influence both thermal dynamics and predictive difficulty. In the generalization dataset, Figure 7 shows strong agreement between EnergyPlus-simulated and ML-predicted indoor temperatures for the corner neighbor configuration. Across all zones, the point clouds closely follow the 1:1 line, indicating low dispersion and limited systematic bias. Predictive accuracy remains high, with $R^{2}$ ranging from 0.985 to 0.991 and RMSE from 0.38 to 0.57 °C, suggesting that the surrogate captures the dominant indoor thermal response across the parametric design space. A consistent zone-dependent performance gradient is observed. The Bathroom model exhibits the highest fidelity (RMSE = 0.38 °C, $R^{2}$ = 0.991), whereas the Bedroom shows the largest errors (RMSE = 0.57 °C, $R^{2}$ = 0.985). This pattern is consistent with the Bedroom experiencing stronger sensitivity to solar and ventilation-driven boundary conditions in the corner configuration, which can increase temporal variability and reduce predictability.

Figure 7. Parity plots comparing EnergyPlus reference temperatures and ML predictions for the corner neighbor configuration across the four thermal zones for the generalization dataset.

As shown in Figure 8 for the generalization dataset, the surrogate models also achieve excellent performance for the surrounded neighbor configuration, with tight clustering around the 1:1 line for all zones. The models explain nearly all variance in the reference temperatures ($R^{2}$ = 0.981–0.994) with low absolute errors (RMSE = 0.31–0.60 °C), indicating that the learned mapping generalizes well within this neighborhood configuration. Zone-level differences persist. Bathroom and Study room yield the best agreement (RMSE = 0.31–0.36 °C; $R^{2}$ = 0.990–0.994), while Bedroom exhibits the largest residual spread (RMSE = 0.60 °C; $R^{2}$ = 0.981). Compared with the corner dwelling, errors are slightly reduced in most zones, consistent with the surrounded configuration imposing more constrained boundary conditions (reduced lateral exposure on shared facades), which tends to smooth indoor temperature dynamics.

Figure 8. Parity plots comparing EnergyPlus reference temperatures and ML predictions for the surrounded neighbor configuration across the four thermal zones for the generalization dataset.

In both configurations, training errors are marginally higher than test errors across all zones. The training subset (251,568 samples) is approximately 18 times larger than the test subset (13,976 samples) and therefore provides a more complete representation of all hourly conditions, including rare thermal transients that are harder to predict, whereas the smaller test subset may by chance slightly underrepresent these challenging hours. The observed differences (0.01 to 0.04 °C in RMSE) fall within expected sampling variability for subsets of such different sizes. In contrast, generalization errors are slightly larger than training errors across nearly all zones and both configurations. This is the expected ordering when unseen data are evaluated, and the small size of the gap suggests that the models are not substantially overfitting.

To complement the parity plots, Table 6 and Table 7 summarize the predictive performance of the zone-specific surrogate models under the three evaluation splits used in this study. Across both neighborhood configurations and all thermal zones, the test and generalization metrics remain close to the training values, indicating limited overfitting. The Study zone in the surrounded configuration achieved the lowest errors overall, with a training MAPE of 1.20% and a test RMSE of 0.26 °C (Table 6), while the Bathroom zone in the corner configuration yielded comparable accuracy with a generalization RMSE of 0.38 °C (Table 7). Importantly, the generalization metrics preserve high agreement with the EnergyPlus reference outputs, with only moderate degradation in error levels depending on the zone, indicating that the surrogate models retain predictive robustness when applied to orientation–roof absorptance combinations not encountered during training. The largest degradation was observed in the Bedroom zone of the surrounded configuration, where the generalization MAPE reached 2.78% and the RMSE increased to 0.61 °C compared to 0.54 °C on the training split (Table 6); a more contained degradation was observed for the same zone in the corner configuration, where the generalization RMSE was 0.57 °C (Table 7). Despite these zone-specific differences, the coefficient of determination $R^{2}$ remained at or above 0.98 across all zones, splits, and configurations, supporting the overall fidelity of the surrogate models.

Table 6. Model performance for the surrounded neighborhood configuration across the four thermal zones. Metrics are reported for the training subset (95% of the training dataset), the held-out test subset (5% of the training dataset), and the generalization dataset (unseen orientation–roof absorptance combinations).

Table 7. Model performance for the corner neighborhood configuration across the four thermal zones. Metrics are reported for the training subset (95% of the training dataset), the held-out test subset (5% of the training dataset), and the generalization dataset (unseen orientation–roof absorptance combinations).

To avoid bias from training and testing data and to emphasize out-of-sample performance, all subsequent figures and results are reported exclusively for the generalization dataset (previously unseen cases). Figure 9 reports the full distribution of prediction errors, defined as $e = T_{pred} - T_{ref}$, for each zone and neighbor configuration. The error distributions are strongly centered around zero for all zones and both configurations, indicating negligible global bias. Dispersion is zone dependent: Bedroom exhibits the widest distributions and heavier tails, whereas Bathroom and Study room show more concentrated errors. Differences between the surrounded and corner cases are visible as modest changes in distribution width, suggesting that the additional exposed facade in the corner configuration slightly affects predictability for some zones. Overall, these distributions complement point metrics (RMSE/MAE/$R^{2}$) by showing that most hourly predictions remain close to zero error, while a small fraction of timesteps contributes to the extreme tails.

Figure 9. Distribution of temperature prediction errors by thermal zone for the surrounded and corner neighbor configurations.

To assess whether the learned input–output mapping is consistent with expected physical drivers, SHAP feature importance was computed for each zone-specific model in both neighborhood configurations. Figure 10 summarizes normalized global SHAP importance across the eight models (four zones × two configurations).

Figure 10. SHAP feature importance across dwelling type and thermal zone. Higher values indicate predictors with stronger average influence on indoor temperature predictions. Values are normalized by the maximum SHAP value in each configuration.

The zone-specific predictor subsets in Table 5 can be further interpreted in light of the building plan (Figure 2), envelope assemblies (Table 2), and neighborhood exposure (Figure 4). Because the roof solar absorptance $\alpha_{roof}$ and building orientation $\theta$ are the parametric design variables of this study, they are included in all models by construction; nevertheless, SHAP analysis independently confirms their physical relevance by consistently ranking them among the most influential predictors across zones and configurations. The kitchen, which includes the courtyard-facing glass door (the largest openable aperture in the dwelling), requires both wind speed and wind direction in both configurations, consistent with a dominant ventilation pathway that remains sensitive to wind conditions regardless of neighborhood shielding. The bathroom, despite being the smallest zone, requires a large number of predictors (8 to 11), including multiple lag features and, in the surrounded case, wind direction; this higher predictor demand is attributable to the ceramic tile finishes on its walls and floor, which modify the zone’s effective thermal mass and transient response to radiative and convective forcing compared to the bare masonry surfaces in other zones. Conversely, the bedroom in the surrounded configuration uses the fewest predictors (6) and excludes wind variables entirely, consistent with its more interior position and limited facade exposure when flanked by adjacent dwellings; in the corner configuration, however, the bedroom model additionally requires wind direction and a 6 h temperature lag, reflecting the thermal influence of the newly exposed facade. The study room exhibits an intermediate and configuration-dependent pattern: it uses 10 predictors in the surrounded case but only 7 in the corner case, suggesting that direct facade exposure concentrates influence on fewer, stronger drivers and simplifies the predictor hierarchy. At the configuration level, the elevated importance of $\alpha_{roof}$ in the surrounded dwelling is consistent with adjacency rendering the roof as the primary radiative interface once lateral facades become adiabatic boundaries, whereas the corner dwelling’s greater sensitivity to $\theta$ and $T_{o, roll 6}$ reflects a faster thermal response through the additional exposed facade. These correspondences between SHAP-based feature importance and the spatial, constructive, and ventilation characteristics of each zone indicate that the surrogate has learned physically grounded input–output relationships rather than purely statistical associations.

Across all zones and both dwelling types, outdoor temperature history features dominate model predictions, particularly the 24 h rolling mean $T_{o, roll 24}$ and, to a lesser extent, the 6 h rolling mean $T_{o, roll 6}$. This ranking indicates that the surrogate relies primarily on predictors encoding outdoor temperature history, rather than responding only to instantaneous weather fluctuations. Roof solar absorptance $\alpha_{roof}$ and orientation $\theta$ also contribute substantially in several models, reflecting the role of radiative gains and exposure under natural ventilation. The week-of-year indicator $w$ shows a secondary but recurrent influence, acting as a proxy for slowly varying seasonal factors (solar geometry and synoptic weather regimes). Wind-related variables ($W_{s}$, $W_{d}$) and relative humidity (RH) contribute in a zone- and configuration-dependent manner, suggesting that ventilation-driving conditions modulate indoor temperature in some rooms, as expected given the use of the EnergyPlus AirflowNetwork model, though their overall influence remains smaller than that of the temperature-memory predictors.
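To make the weather-memory predictors concrete, the sketch below builds rolling-mean and lag features from an hourly outdoor temperature series in plain Python. It is illustrative only: the helper names `rolling_mean` and `lag` are hypothetical, and the window is assumed to be trailing and to include the current hour, which the text above does not specify.

```python
def rolling_mean(series, window):
    """Trailing mean over the current and previous (window - 1) samples.

    Returns None for positions where fewer than `window` samples exist.
    Assumption: trailing window including the current hour.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    out = []
    running = 0.0
    for i, value in enumerate(series):
        running += value
        if i >= window:
            running -= series[i - window]  # drop the sample leaving the window
        out.append(running / window if i >= window - 1 else None)
    return out


def lag(series, hours):
    """Shift the series forward by `hours`, padding the start with None."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    pad = min(hours, len(series))
    return [None] * pad + list(series[: len(series) - pad])


# Hypothetical outdoor temperatures (°C); a 3 h window keeps the example short.
t_o = [10, 12, 14, 16, 18, 20]
features = {
    "T_o_roll3": rolling_mean(t_o, 3),  # analogous to T_{o, roll 6} / T_{o, roll 24}
    "T_o_lag2": lag(t_o, 2),            # analogous to the 6 h temperature lag
}
print(features)
```

With hourly data, the 6 h and 24 h rolling means would correspond to `rolling_mean(t_o, 6)` and `rolling_mean(t_o, 24)`; rows containing `None` at the start of the series would need to be dropped or filled before training.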

Figure 11 summarizes global SHAP importance by neighborhood configuration (corner vs. surrounded) to compare the dominant drivers across dwelling types. For each predictor, we compute the mean absolute SHAP value and average it across the four thermal-zone models within a given dwelling type, yielding a single global importance score per predictor. The values are then normalized by the maximum mean SHAP value in each configuration; consequently, the top-ranked predictor has a normalized value of 1, and other bars should be interpreted as relative importance within the same panel. In both dwelling types, $T_{o, roll 24}$ is the dominant predictor, stressing the strong role of diurnal outdoor thermal memory. Secondary predictor importance differs between configurations: the corner dwelling shows relatively higher dependence on shorter-term outdoor history ($T_{o, roll 6}$) and orientation $\theta$, consistent with increased exposure and sensitivity to rapidly changing boundary conditions. In contrast, the surrounded neighborhood configuration exhibits comparatively stronger contributions from $\alpha_{roof}$ and seasonality ($w$) after $T_{o, roll 24}$, consistent with buffering effects from adjacent buildings that reduce lateral exposure.
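The aggregation described for Figure 11 (mean absolute SHAP per predictor, averaged over the zone models, then normalized by the maximum) can be sketched as follows. This is a simplified illustration rather than the study's implementation: the function names and SHAP values are hypothetical, and a predictor that is absent from a zone model is assumed to contribute zero to the average, a choice the text does not state.

```python
def mean_abs(values):
    """Mean absolute value of a non-empty list of SHAP values."""
    return sum(abs(v) for v in values) / len(values)


def normalized_global_importance(shap_by_zone):
    """Aggregate per-sample SHAP values into one normalized score per predictor.

    shap_by_zone: {zone: {predictor: [SHAP value per sample]}}
    Assumption: predictors missing from a zone model count as 0 for that zone.
    """
    zones = list(shap_by_zone)
    predictors = {p for z in zones for p in shap_by_zone[z]}
    scores = {}
    for p in predictors:
        per_zone = [
            mean_abs(shap_by_zone[z][p]) if p in shap_by_zone[z] else 0.0
            for z in zones
        ]
        scores[p] = sum(per_zone) / len(zones)  # average across zone models
    top = max(scores.values())
    if top == 0.0:
        raise ValueError("all SHAP values are zero; cannot normalize")
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return {p: s / top for p, s in ranked}  # top-ranked predictor equals 1


# Hypothetical SHAP values for two zone models of one dwelling type.
example = {
    "bedroom": {"T_o_roll24": [2.0, -2.0], "alpha_roof": [1.0, -1.0]},
    "kitchen": {"T_o_roll24": [4.0, -4.0], "theta": [1.0, 1.0]},
}
print(normalized_global_importance(example))
```

Because the normalization is applied per configuration, scores from the corner and surrounded panels are comparable only as within-panel rankings, not as absolute magnitudes.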

Figure 11. Global SHAP ranking by neighborhood configuration (corner vs. surrounded). Predictors are ranked by their mean absolute SHAP value averaged across the four thermal-zone models within each dwelling type. Values are normalized by the maximum mean SHAP in each configuration.

Although Figure 7, Figure 8 and Figure 9 quantify temperature-prediction accuracy, building-performance assessment often relies on comfort-relevant indicators. Therefore, the surrogate was evaluated by comparing discomfort degree-hours computed relative to the monthly neutral temperature $T_{n}$ using (i) EnergyPlus reference temperatures and (ii) ML-predicted temperatures.

To test whether the surrogate model preserves comfort-relevant outcomes beyond temperature accuracy, Figure 12 compares the total adaptive discomfort degree-hours computed from EnergyPlus (reference) and from the ML-predicted indoor temperatures for both dwelling configurations. For each orientation–roof absorptance combination, cold and hot discomfort are accumulated relative to the monthly neutral temperature $T_{n}$. The left panel reports the worst-case total-discomfort mismatch across all simulations, while the right panel reports the best-case mismatch. In both extremes, the surrogate reproduces the same partition between cold and hot components, indicating that the model does not merely match aggregate totals but also preserves the relative contribution of each discomfort mode. Over the full generalization space, the mean absolute error in total discomfort is $4.6 \times 10^{3}$ °C·h for the surrounded dwelling and $3.9 \times 10^{3}$ °C·h for the corner dwelling. These results support the use of the surrogate for rapid, comfort-oriented screening of envelope and orientation choices under naturally ventilated conditions.
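The accumulation of cold and hot discomfort relative to $T_{n}$ can be sketched in a few lines of Python. The sketch assumes an hourly timestep and no comfort band around $T_{n}$ (every degree below counts as cold discomfort, every degree above as hot discomfort); both are assumptions made for illustration, and the function names and numerical values are hypothetical.

```python
def discomfort_degree_hours(t_indoor, months, t_neutral):
    """Accumulate cold and hot discomfort degree-hours (°C·h).

    t_indoor:  hourly indoor temperatures (°C)
    months:    month number (1-12) for each hour
    t_neutral: {month: monthly neutral temperature T_n in °C}
    Assumptions: 1 h timestep, no tolerance band around T_n.
    """
    if len(t_indoor) != len(months):
        raise ValueError("t_indoor and months must have equal length")
    cold = 0.0
    hot = 0.0
    for temp, month in zip(t_indoor, months):
        diff = temp - t_neutral[month]
        if diff > 0:
            hot += diff    # above neutral: hot discomfort
        else:
            cold -= diff   # below neutral: cold discomfort (diff <= 0)
    return cold, hot


def abs_total_error(ddh_ml, ddh_ref):
    """|(DDH_cold + DDH_hot)_ML - (DDH_cold + DDH_hot)_ref| for one design case."""
    return abs(sum(ddh_ml) - sum(ddh_ref))


# Hypothetical four-hour example with two months and their neutral temperatures.
t_n = {1: 21.0, 7: 24.0}
ref = discomfort_degree_hours([18.0, 20.0, 25.0, 23.0], [1, 1, 7, 7], t_n)
ml = discomfort_degree_hours([18.5, 20.0, 25.2, 23.0], [1, 1, 7, 7], t_n)
print(ref, ml, abs_total_error(ml, ref))
```

Applying `discomfort_degree_hours` to both the reference and the predicted series for each orientation–roof absorptance combination, and then averaging `abs_total_error` over those combinations, corresponds to the kind of mean absolute error in total discomfort reported above.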

Figure 12. Stacked annual discomfort degree-hours (cold + hot) for the (a) corner and the (b) surrounded dwelling, computed relative to the monthly neutral temperature $T_{n}$: EnergyPlus reference vs. ML predictions for the cases with (left) maximum and (right) minimum absolute error in total discomfort across all simulated orientation–roof-absorptance combinations. Absolute error is defined as $|\Delta DDH_{tot}| = |(DDH_{cold} + DDH_{hot})_{ML} - (DDH_{cold} + DDH_{hot})_{ref}|$.

Furthermore, Figure 13 and Figure 14 compare reference and ML-based discomfort degree-hour maps as a function of orientation and roof absorptance, disaggregated by thermal zone. In these heatmaps, cells corresponding to training cases are left blank to emphasize model behavior on unseen orientation–absorptance combinations.

Figure 13. Corner dwelling—spatial distribution of discomfort degree-hours by thermal zone (orientation × roof solar absorptance): reference (left) vs. ML predictions (right). Heatmaps show (a) cold discomfort degree-hours and (b) hot discomfort degree-hours.

Figure 14. Surrounded dwelling—spatial distribution of discomfort degree-hours by thermal zone (orientation × roof solar absorptance): reference (left) vs. ML predictions (right). Heatmaps show (a) cold discomfort degree-hours and (b) hot discomfort degree-hours.

For the corner dwelling (Figure 13), reference maps show strong sensitivity of cold discomfort to roof absorptance: higher $\alpha_{roof}$ generally mitigates cold discomfort via increased absorbed solar gains, whereas lower $\alpha_{roof}$ increases cold discomfort. Hot discomfort shows the opposite tendency, increasing with higher $\alpha_{roof}$, and also exhibits clearer orientation dependence consistent with facade-specific solar exposure and ventilation differences. Across all zones, ML predictions reproduce the dominant gradients and hotspots observed in the reference maps.

For the surrounded configuration (Figure 14), adjacency modifies boundary conditions and reduces exposure relative to the corner configuration. Roof absorptance remains a primary driver, particularly for cold discomfort, while hot discomfort increases with higher $\alpha_{roof}$ and retains zone-specific orientation effects. As in the corner case, the ML model preserves the overall structure of the reference discomfort landscapes.

Figure 15 complements the discomfort analysis; the heatmaps report the difference between DDHs from ML predictions and from the reference for each thermal zone as a function of building orientation and roof solar absorptance, for both neighborhood configurations. The error fields remain centered near zero and exhibit no systematic spatial bias, indicating that the surrogate preserves the dominant cold and hot discomfort patterns predicted by EnergyPlus over unseen orientation–absorptance combinations. Larger local deviations occur only in limited regions of the design space (typically where discomfort magnitudes are highest), but these remain comparatively small relative to the corresponding reference DDH levels. The sign of $\Delta DDH$ further clarifies whether the surrogate slightly overestimates (positive) or underestimates (negative) discomfort, providing a compact diagnostic of where the model’s sensitivity to orientation and roof absorptance is most critical. For clarity and direct comparability, all subpanels within Figure 15 share the same color scale (identical minimum and maximum limits), whereas Figure 13 and Figure 14 use separate scales for hot and cold discomfort because the underlying quantities differ in magnitude range.

Figure 15. Spatial distribution of $\Delta DDH = DDH_{ML} - DDH_{ref}$ by thermal zone (orientation × roof solar absorptance). Heatmaps show (a) cold and hot DDHs for corner dwelling, (b) cold and hot DDHs for surrounded dwelling.

To complement the signed difference maps in Figure 15, a quantitative error screening was also performed at the configuration level using percentage-based DDH errors (Table 8). This analysis confirmed that the surrogate preserves the total discomfort signal with low average relative deviations in both neighborhood configurations (mean absolute total DDH error of ∼3.46% for the surrounded dwelling and ∼3.03% for the corner dwelling). By component, cold-discomfort errors remained comparable between configurations (both ∼4.7% mean absolute percentage error), whereas hot-discomfort errors were more sensitive, particularly for the surrounded case. This behavior is consistent with the lower absolute magnitude and stronger local variability of hot DDHs in some orientation–roof absorptance combinations, for which percentage metrics become more reactive to small absolute deviations. Overall, these results support the interpretation of Figure 15: the surrogate reproduces the dominant spatial patterns of discomfort while keeping configuration-level total DDH errors within a narrow range.
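The sensitivity of percentage-based errors to small reference magnitudes can be illustrated with a short Python sketch. The helper names and DDH values are hypothetical, and cases with a zero reference DDH are assumed to be skipped, which may differ from the screening actually used for Table 8.

```python
def pct_errors(ddh_ref, ddh_ml):
    """Absolute percentage error of DDH for each design case.

    Assumption: cases with zero (or negative) reference DDH are skipped,
    since the percentage error is undefined there.
    """
    if len(ddh_ref) != len(ddh_ml):
        raise ValueError("inputs must have equal length")
    return [100.0 * abs(m - r) / r for r, m in zip(ddh_ref, ddh_ml) if r > 0]


def mean_abs_pct_error(ddh_ref, ddh_ml):
    """Mean absolute percentage error over all retained design cases."""
    errs = pct_errors(ddh_ref, ddh_ml)
    if not errs:
        raise ValueError("no cases with positive reference DDH")
    return sum(errs) / len(errs)


# Hypothetical cases: the same 10 °C·h deviation on a large and a small reference.
ref = [1000.0, 100.0]
ml = [1010.0, 110.0]
print(pct_errors(ref, ml))
print(mean_abs_pct_error(ref, ml))
```

The same absolute deviation yields a tenfold larger percentage error on the smaller reference value, which is the mechanism invoked above for the more reactive hot-discomfort percentages.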

Table 8. Mean absolute percentage error of discomfort degree-hours (DDHs) by dwelling configuration, computed over the generalization cases.

Considering all cases, the bagged-tree surrogate models reproduce EnergyPlus indoor temperature fields with high fidelity across both neighborhood configurations and all four thermal zones (Figure 7, Figure 8 and Figure 9). SHAP analyses confirm that the models rely on physically meaningful predictors, with outdoor thermal memory ($T_{o, roll 24}$, $T_{o, roll 6}$) dominating and design variables ($\theta$, $\alpha_{roof}$) consistently contributing to predictions (Figure 10 and Figure 11). When propagated into comfort metrics, ML-predicted temperatures preserve both the magnitude and spatial trends of discomfort degree-hours relative to $T_{n}$ at the dwelling level and across the orientation–absorptance design space (Figure 13, Figure 14 and Figure 15), supporting the surrogate model as a reliable tool for rapid parametric comfort assessment.

5. Conclusions

This study developed and evaluated a machine-learning surrogate framework to predict hourly indoor air temperature in a naturally ventilated Mexican social-housing prototype under a semi-arid highland climate. The workflow explicitly tested generalization across building orientation and roof solar absorptance, and compared two neighborhood boundary conditions (surrounded vs. corner). From a total corpus of 1,453,504 hourly observations, only 18% (n = 265,544) was used for model training and validation, while the remaining 82% (n = 1,187,960), comprising orientation–roof absorptance combinations entirely absent from training, was reserved as an independent generalization benchmark. Across all thermal zones and both configurations, the bagged-tree surrogate models achieved high predictive accuracy on this unseen dataset ($R^{2}$ = 0.98–0.99; RMSE = 0.31–0.67 °C), with residuals centered near zero and stable dispersion, indicating robust performance rather than isolated best-case fits. Notably, this accuracy was achieved without autoregressive terms (no past indoor temperatures were used as predictors), meaning the surrogate can generate predictions from weather data and design parameters alone, without requiring prior indoor measurements.

SHAP-based interpretation suggests that the surrogate captures physically meaningful drivers of indoor temperature. In all models, the 24 h rolling mean of outdoor temperature ($T_{o, roll 24}$) was the single most influential predictor, followed by the 6 h rolling mean ($T_{o, roll 6}$), consistent with outdoor thermal memory effects in lightweight masonry construction. Roof solar absorptance and building orientation emerged as influential secondary drivers, reflecting radiative exposure and envelope forcing under naturally ventilated conditions. Notably, the relative importance of these drivers varied between neighborhood configurations: the corner dwelling exhibited higher sensitivity to shorter-term outdoor history and orientation, consistent with its greater facade exposure, whereas the surrounded configuration showed comparatively stronger contributions from roof absorptance and seasonality, consistent with the buffering effect of adjacent buildings that reduces lateral solar and wind exposure. Wind-related variables and relative humidity contributed in a zone- and configuration-dependent manner, indicating that ventilation-driving conditions modulate indoor temperature in specific rooms without dominating the global predictor hierarchy.

When translated into adaptive comfort outcomes computed relative to the monthly neutral temperature $T_{n}$, ML-predicted temperatures preserved the main cold- and hot-discomfort trends both (i) at the aggregated dwelling scale and (ii) across the full orientation–absorptance design space resolved by thermal zone. Beyond matching aggregate totals, the surrogate preserved the relative partition between cold and hot discomfort components, indicating that prediction errors do not systematically shift the perceived balance between seasonal discomfort modes. Over the full generalization space, the mean absolute error in total discomfort degree-hours was 4.6 × 10³ °C·h for the surrounded dwelling and 3.9 × 10³ °C·h for the corner dwelling. The spatial distribution of discomfort differences ($\Delta DDH$) remained centered near zero with no systematic bias across the orientation–absorptance domain; larger local deviations occurred only in limited regions where reference discomfort magnitudes were highest, remaining comparatively small in relative terms. The discomfort heatmaps further show that the surrogate reproduces the dominant spatial patterns and design trade-offs and that much of this agreement is achieved on orientation–absorptance combinations entirely absent from training, supporting interpolation capability within the explored parameter ranges. Overall, the proposed surrogate is not only accurate for hourly temperature prediction but also reliable for comfort-driven screening of design alternatives, enabling rapid exploration of roof-finish and orientation decisions under different neighborhood contexts without the computational cost of exhaustive simulation campaigns.

Finally, it should be noted that the present framework was developed under controlled assumptions—no calibration against field measurements, no internal heat gains from people or equipment, and a constant 50% window-opening fraction—intended to isolate envelope–climate interactions and support a clear evaluation of surrogate generalization. Extending the approach to include monitored indoor data, time-varying occupancy and window operation, additional dwelling typologies, encoded geometric descriptors, and other Mexican climate regions would further test robustness and move toward more global surrogate models applicable across broader housing scenarios.

Author Contributions

Conceptualization, G.B. and C.M.J.; methodology, G.B. and C.M.J.; software, A.J.-G. and C.M.J.; formal analysis, G.B., C.M.J. and R.J.; investigation, G.B., C.M.J. and R.J.; resources, G.B. and C.M.J.; data curation, G.B., A.J.-G. and C.M.J.; writing—original draft preparation, G.B. and C.M.J.; writing—review and editing, G.B., R.J., G.G.-U. and C.M.J.; visualization, A.J.-G. and C.M.J.; supervision, G.B. and A.R.B.; project administration, G.G.-U. and A.R.B.; funding acquisition, G.G.-U. and A.R.B. All authors have read and agreed to the published version of the manuscript.

Funding

Authors acknowledge the support provided by SECIHTI through the graduate scholarship awarded to A. Jiménez-Godoy.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request. The MATLAB code used to train and evaluate the surrogate models is also available from the authors upon reasonable request.

Acknowledgments

R. Jäckel would like to acknowledge the support from the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior–Brasil (CAPES).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

DDHs	Discomfort Degree Hours
MAE	Mean Absolute Error
MAPE	Mean Absolute Percentage Error
MSE	Mean Squared Error
R²	Coefficient of Determination
RH	Relative Humidity
RMSE	Root Mean Squared Error
SEDATU	Secretaría de Desarrollo Agrario, Territorial y Urbano
SHAP	SHapley Additive exPlanations
TMY	Typical Meteorological Year

Climate.OneBuilding.org. Climate and Weather Data for Building Simulation. Available online: https://climate.onebuilding.org/ (accessed on 30 August 2025).
Climate.OneBuilding.org. Climate/Weather Data Sources (EPW Datasets). Available online: https://climate.onebuilding.org/sources/ (accessed on 30 August 2025).
ASHRAE Standard 140-2017; Standard Method of Test for the Evaluation of Building Energy Analysis Computer Programs. American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE): Atlanta, GA, USA, 2017.
Henninger, R.H.; Witte, M.J. EnergyPlus Testing with Building Thermal Envelope and Fabric Load Tests from ANSI/ASHRAE Standard 140-2017; U.S. Department of Energy: Washington, DC, USA, 2017.
Nicol, J.F.; Humphreys, M.A. Adaptive Thermal Comfort and Sustainable Thermal Standards for Buildings. Energy Build. 2002, 34, 563–572. [Google Scholar] [CrossRef]
MathWorks. MATLAB R2024b Documentation: Regression Learner App. Available online: https://www.mathworks.com/help/stats/regression-learner-app.html (accessed on 15 October 2025).
Breiman, L. Bagging Predictors. Mach. Learn. 1996, 24, 123–140. [Google Scholar] [CrossRef]
Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed.; Springer: New York, NY, USA, 2009. [Google Scholar]

Figure 1. Seasonal evolution of key outdoor boundary conditions for San Luis Potosí derived from the TMY (2009–2023) EPW dataset: (top) hourly outdoor air temperature with seasonal mean, (middle) hourly relative humidity with seasonal mean, and (bottom) solar energy with seasonal mean. Vertical dashed lines indicate seasonal boundaries.

Figure 2. Simplified floor plan of Casa 3 (Decide y Construye) showing the thermal zoning adopted in this study and the corresponding zone areas.

Figure 3. Isometric view of the geometric model for Casa 3.

Figure 4. Schematic representation of the two neighborhood configurations considered in this study: (left) surrounded dwelling with adjacent buildings on both sides and at the rear, and (right) corner dwelling with adjacent buildings on one side and at the rear.

Figure 5. Example hourly indoor air temperature time series for the corner neighbor configuration ($θ$ = 90, $α_{roof}$ = 0.5), comparing EnergyPlus reference $T_{ref}$ (orange), surrogate predictions $T_{ML}$ (yellow), and outdoor air temperature $T_{o}$ (blue dotted line) over one week.

Figure 6. Example hourly indoor air temperature time series for the surrounded neighbor configuration ($θ$ = 90, $α_{roof}$ = 0.5), comparing EnergyPlus reference $T_{ref}$ (orange), surrogate predictions $T_{ML}$ (yellow), and outdoor air temperature $T_{o}$ (blue dotted line) over one week.

Figure 7. Parity plots comparing EnergyPlus reference temperatures and ML predictions for the corner neighbor configuration across the four thermal zones for the generalization dataset.

Figure 8. Parity plots comparing EnergyPlus reference temperatures and ML predictions for the surrounded neighbor configuration across the four thermal zones for the generalization dataset.

Figure 9. Distribution of temperature prediction errors by thermal zone for the surrounded and corner neighbor configurations.

Figure 10. SHAP feature importance by dwelling type and thermal zone. Higher values indicate predictors with a stronger average influence on indoor temperature predictions. Values are normalized by the maximum SHAP value in each configuration.

Figure 11. Global SHAP ranking by neighborhood configuration (corner vs. surrounded). Predictors are ranked by their mean absolute SHAP value averaged across the four thermal-zone models within each dwelling type. Values are normalized by the maximum mean SHAP in each configuration.
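
The ranking procedure described in the Figure 11 caption can be sketched as follows. This is a minimal illustration in Python rather than the MATLAB workflow used in the study; the function name, the predictor labels, and the numeric values are hypothetical, and treating a predictor that is absent from a zone model as contributing zero is an assumption not stated in the caption.

```python
from statistics import mean

def global_shap_ranking(zone_importance):
    """Average mean |SHAP| across zone models, then scale by the largest value.

    zone_importance maps zone name -> {predictor: mean absolute SHAP value}.
    A predictor absent from a zone model is counted as 0 for that zone
    (assumption; the source does not say how excluded predictors are handled).
    """
    predictors = sorted({p for imp in zone_importance.values() for p in imp})
    averaged = {
        p: mean(imp.get(p, 0.0) for imp in zone_importance.values())
        for p in predictors
    }
    top = max(averaged.values())
    normalized = {p: v / top for p, v in averaged.items()}
    # Highest normalized importance first.
    return sorted(normalized.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical mean |SHAP| values for two zone models, for illustration only.
example = {
    "Bathroom": {"T_o_roll24": 2.0, "T_o": 1.0, "alpha_roof": 0.5},
    "Kitchen": {"T_o_roll24": 1.0, "T_o": 1.0, "alpha_roof": 0.25},
}
ranking = global_shap_ranking(example)
```

With this normalization the top-ranked predictor always has a value of 1, so the bars of different configurations are comparable in relative terms only, not in absolute °C.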

Figure 12. Stacked annual discomfort degree-hours (cold + hot) for the (a) corner and the (b) surrounded dwelling, computed relative to the monthly neutral temperature $T_{n}$: EnergyPlus reference vs. ML predictions for the cases with (left) maximum and (right) minimum absolute error in total discomfort across all simulated orientation–roof-absorptance combinations. Absolute error is defined as $|Δ DDH_{tot}| = |(DDH_{cold} + DDH_{hot})_{ML} - (DDH_{cold} + DDH_{hot})_{ref}|$.

Figure 13. Corner dwelling—spatial distribution of discomfort degree-hours by thermal zone (orientation × roof solar absorptance): reference (left) vs. ML predictions (right). Heatmaps show (a) cold discomfort degree-hours and (b) hot discomfort degree-hours.

Figure 14. Surrounded dwelling—spatial distribution of discomfort degree-hours by thermal zone (orientation × roof solar absorptance): reference (left) vs. ML predictions (right). Heatmaps show (a) cold discomfort degree-hours and (b) hot discomfort degree-hours.

Figure 15. Spatial distribution of $Δ DDH = DDH_{ML} - DDH_{ref}$ by thermal zone (orientation × roof solar absorptance). Heatmaps show (a) cold and hot DDHs for the corner dwelling and (b) cold and hot DDHs for the surrounded dwelling.

Table 1. Main climatic indicators for San Luis Potosí derived from the EPW weather file for the period 2009–2023.

Parameter	Annual Mean	Winter Mean	Spring Mean	Summer Mean	Autumn Mean
Outdoor air temperature (°C)	17.6	14.1	20.5	19.4	16.3
Daily temperature range (°C)	15.3	17.0	16.8	13.2	14.1
Relative humidity (%)	60.0	54.0	54.0	64.0	67.0
Global horizontal irradiation (kWh/m²/day)	6.0	5.4	7.3	6.5	4.8

Table 2. Thermophysical properties and thickness of construction materials used in the EnergyPlus model, reported from outside to inside, including thermal conductivity k, density $ρ$, specific heat $c_{p}$, and thickness L.

Material	k (W/m·K)	$ρ$ (kg/m³)	$c_{p}$ (J/kg·K)	L (m)
Standard wall
Red clay brick	0.70	1600	900	0.12
Bathroom wall
Red clay brick	0.70	1600	900	0.12
CREST^® tile adhesive ^a	0.80	1600	800	0.003
White ceramic tile	1.20	2100	700	0.006
Kitchen wall (stove area)
Red clay brick	0.70	1600	900	0.12
Green smooth ceramic mosaic tile	1.20	2200	800	0.015
Column
Reinforced concrete (200 kg/cm²)	1.50	2200	900	0.12
Beam
Reinforced concrete (200 kg/cm²)	1.50	2200	900	0.12
Standard floor
Polished reinforced concrete (200 kg/cm²)	1.70	2300	850	0.15
Bathroom floor
Polished reinforced concrete (200 kg/cm²)	1.70	2300	850	0.15
CREST^® tile adhesive	0.80	1600	800	0.003
White ceramic tile	1.20	2100	700	0.006
Standard roof
IMPAC^® waterproofing membrane ^b	0.17	1100	1000	0.0035
Joist-and-vault slab system	0.47	1000	900	0.20
Bathroom roof
IMPAC^® waterproofing membrane	0.17	1100	1000	0.0035
Reinforced concrete (200 kg/cm²)	1.50	2200	900	0.12
Standard window
Clear glass	0.90	–	–	0.003
Courtyard door
Clear glass	0.90	–	–	0.003
Standard door
Black steel sheet (gauge 20)	50.00	7850	480	0.001

^a CREST^® tile adhesive; Grupo CREST, San Pedro Garza García, NL, Mexico. ^b IMPAC^® waterproofing membrane; Saint-Gobain, Monterrey, NL, Mexico.

Table 3. Eighteen predictors used to train the ML surrogate models. Outdoor temperature memory predictors are derived from outdoor air temperature $T_{o}$. Building azimuth and roof solar absorptance are kept constant within each simulation.

Category	Description	Symbol	Units
Time indicator	Calendar-related indicator capturing seasonal progression	w	–
Meteorological	Hourly outdoor dry-bulb temperature	$T_{o}$	°C
Meteorological	Hourly outdoor relative humidity	$R H$	%
Meteorological	Hourly atmospheric pressure	P	Pa
Meteorological	Hourly global horizontal solar radiation	$I_{g}$	W/m²
Meteorological	Hourly diffuse horizontal solar radiation	$I_{d}$	W/m²
Meteorological	Hourly direct normal solar radiation	$I_{b}$	W/m²
Meteorological	Hourly wind speed	$W_{s}$	m/s
Meteorological	Hourly wind direction	$W_{d}$	deg
Design	Building azimuth angle	$θ$	deg
Design	Roof solar absorptance	$α_{roof}$	–
Thermal memory	Outdoor temperature 1 h before	$T_{o, lag 1}$	°C
Thermal memory	Outdoor temperature 3 h before	$T_{o, lag 3}$	°C
Thermal memory	Outdoor temperature 6 h before	$T_{o, lag 6}$	°C
Thermal memory	Outdoor temperature 12 h before	$T_{o, lag 12}$	°C
Thermal memory	Outdoor temperature 24 h before	$T_{o, lag 24}$	°C
Thermal memory	Rolling mean of outdoor temperature over the previous 6 h	$T_{o, roll 6}$	°C
Thermal memory	Rolling mean of outdoor temperature over the previous 24 h	$T_{o, roll 24}$	°C
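
The seven thermal-memory predictors in Table 3 can be derived from the hourly outdoor temperature series alone. The sketch below is a minimal Python illustration, not the MATLAB code used in the study. The function and key names are hypothetical. It assumes that each rolling mean covers the hours strictly before the current one, and that hours without enough history are left undefined (`None`); the table does not specify either convention.

```python
def lag(series, k):
    """Value k steps earlier; None where no earlier value exists."""
    return [series[i - k] if i >= k else None for i in range(len(series))]

def rolling_mean_previous(series, window):
    """Mean of the `window` values strictly before each step (assumed convention)."""
    out = []
    for i in range(len(series)):
        if i < window:
            out.append(None)  # not enough history yet
        else:
            out.append(sum(series[i - window:i]) / window)
    return out

def thermal_memory_features(t_out):
    """Build the five lag and two rolling-mean predictors listed in Table 3."""
    features = {f"T_o_lag{k}": lag(t_out, k) for k in (1, 3, 6, 12, 24)}
    features["T_o_roll6"] = rolling_mean_previous(t_out, 6)
    features["T_o_roll24"] = rolling_mean_previous(t_out, 24)
    return features

# Hypothetical hourly outdoor temperatures: a simple 0, 1, 2, ... ramp in °C.
t_out = [float(h) for h in range(48)]
feats = thermal_memory_features(t_out)
```

Because every feature at hour i depends only on earlier hours, none of them leaks information from the future into the prediction for hour i.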

Table 4. Software environment and Bagged Trees hyperparameter configuration (MATLAB Regression Learner).

Setting	Value
Software	MATLAB R2024b
Tool	Regression Learner App
Validation Scheme (Cross-Validation)	5 folds
Set aside to evaluate model performance	5% (from training data set)
Model family	Ensemble regression trees
Preset	Bagged Trees
Minimum leaf size	5
Number of learners (trees)	30
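
The validation settings in Table 4 (5% of the training data set aside for testing, 5-fold cross-validation on the remainder) amount to a two-level partition of the sample indices. The following Python sketch illustrates that partition only; it does not reproduce the Regression Learner internals or the tree ensemble itself. The function name, the random shuffling, and the seed are assumptions.

```python
import random

def holdout_and_folds(n_samples, test_fraction=0.05, n_folds=5, seed=0):
    """Return (test_idx, folds): a held-out test set plus CV folds of the rest."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)  # random assignment is an assumption
    n_test = round(n_samples * test_fraction)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    # Deal the remaining indices round-robin into n_folds validation folds.
    folds = [train_idx[k::n_folds] for k in range(n_folds)]
    return test_idx, folds

test_idx, folds = holdout_and_folds(n_samples=1000)
```

In each cross-validation round one fold serves as validation data and the other four as training data, while the 5% test subset is never used for fitting.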

Table 5. Predictor inclusion matrix for the selected models by dwelling configuration and zone. An “x” indicates that the predictor was included in the corresponding model.

Dwelling	Zone	$T_{o}$	$T_{o, roll 6}$	$T_{o, roll 24}$	$α_{roof}$	$θ$	w	$W_{s}$	$W_{d}$	$T_{o, lag 1}$	$T_{o, lag 3}$	$T_{o, lag 6}$	$T_{o, lag 12}$	$T_{o, lag 24}$	$RH$
Surrounded	Bathroom	x	x	x	x	x	x		x	x		x		x	x
	Kitchen	x	x	x	x	x	x	x	x			x		x
	Study		x	x	x	x	x		x		x	x	x	x
	Bedroom		x	x	x	x	x								x
Corner	Bathroom	x	x	x	x	x	x			x				x
	Kitchen	x	x	x	x	x	x	x	x			x		x
	Study			x	x	x	x		x			x		x
	Bedroom		x	x	x	x	x		x			x		x	x

Table 6. Model performance for the surrounded neighborhood configuration across the four thermal zones. Metrics are reported for the training subset (95% of the training dataset), the held-out test subset (5% of the training dataset), and the generalization dataset (unseen orientation–roof absorptance combinations).

Metric	Bathroom			Kitchen			Study			Bedroom
Metric	Train	Test	Gen.	Train	Test	Gen.	Train	Test	Gen.	Train	Test	Gen.
MSE (°C²)	0.09816	0.092771	0.0968	0.2187	0.2061	0.2285	0.0763	0.0676	0.1315	0.2917	0.2632	0.4423
RMSE (°C)	0.31331	0.30458	0.3112	0.4677	0.4539	0.4781	0.2763	0.2601	0.3627	0.5401	0.5131	0.6651
MAE (°C)	0.23701	0.2320	0.2390	0.3545	0.3458	0.3615	0.2141	0.2049	0.2890	0.3926	0.3731	0.4610
MAPE (%)	1.30	1.30	1.34	1.90	1.80	1.91	1.20	1.10	1.55	2.00	1.90	2.78
$R^{2}$ (–)	0.99	0.99	0.994	0.99	0.99	0.9873	0.9902	0.9902	0.9902	0.98	0.99	0.9809
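
The generalization dataset referred to in the caption of Table 6 consists of whole orientation–roof absorptance combinations that the models never see during training. A minimal Python sketch of such a split is given below. The 24 orientations and seven absorptance levels come from the study, but the 15° spacing, the specific absorptance values, the 20% hold-out fraction, and the random selection are hypothetical choices made for illustration only.

```python
import itertools
import random

# 24 orientations in assumed 15-degree steps; seven hypothetical absorptance levels.
orientations = [15 * i for i in range(24)]
absorptances = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]

def split_design_grid(orientations, absorptances, holdout_fraction=0.2, seed=0):
    """Split (orientation, absorptance) pairs into training and generalization sets."""
    grid = list(itertools.product(orientations, absorptances))
    rng = random.Random(seed)
    n_holdout = round(len(grid) * holdout_fraction)
    held_out = set(rng.sample(grid, n_holdout))
    train = [combo for combo in grid if combo not in held_out]
    return train, sorted(held_out)

train_combos, gen_combos = split_design_grid(orientations, absorptances)
```

Splitting by design combination rather than by hour means that a good generalization score reflects interpolation across the design space, not memorization of a simulation the model has already seen.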

Table 7. Model performance for the corner neighborhood configuration across the four thermal zones. Metrics are reported for the training subset (95% of the training dataset), the held-out test subset (5% of the training dataset), and the generalization dataset (unseen orientation–roof absorptance combinations).

Metric	Bathroom			Kitchen			Study			Bedroom
Metric	Train	Test	Gen.	Train	Test	Gen.	Train	Test	Gen.	Train	Test	Gen.
MSE (°C²)	0.1393	0.1231	0.1426	0.2196	0.1923	0.2350	0.1429	0.1240	0.1875	0.3208	0.2769	0.3237
RMSE (°C)	0.3732	0.3509	0.3776	0.4686	0.4385	0.4848	0.3780	0.3521	0.4330	0.5664	0.5262	0.5689
MAE (°C)	0.2783	0.2658	0.2789	0.3562	0.3373	0.3626	0.2848	0.2691	0.3397	0.4299	0.4057	0.4370
MAPE (%)	1.50	1.50	1.54	1.90	1.80	1.89	1.50	1.40	1.79	2.10	2.00	2.21
$R^{2}$ (–)	0.99	0.99	0.9914	0.99	0.99	0.9886	0.99	0.99	0.9864	0.98	0.99	0.9848
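
For reference, the five metrics reported in Tables 6 and 7 can be computed from paired reference and predicted temperatures as sketched below. This is a plain-Python illustration using the standard definitions, which are assumed here to match those applied in the study; the function name and sample values are hypothetical.

```python
import math

def regression_metrics(ref, pred):
    """Return MSE, RMSE, MAE, MAPE (%) and R^2 for paired reference/predicted values."""
    n = len(ref)
    errors = [p - r for r, p in zip(ref, pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE is taken relative to the reference value, so ref must not contain 0.
    mape = 100.0 * sum(abs(e) / abs(r) for e, r in zip(errors, ref)) / n
    mean_ref = sum(ref) / n
    ss_tot = sum((r - mean_ref) ** 2 for r in ref)
    r2 = 1.0 - (mse * n) / ss_tot
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "MAPE": mape, "R2": r2}

# Hypothetical reference and predicted indoor temperatures (°C).
t_ref = [20.0, 22.0, 24.0, 26.0]
t_ml = [20.5, 21.5, 24.5, 25.5]
metrics = regression_metrics(t_ref, t_ml)
```

Since RMSE is the square root of MSE by definition, the two rows of each table can be cross-checked against one another.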

Table 8. Mean absolute percentage error of discomfort degree-hours (DDHs) by dwelling configuration, computed over the generalization cases.

Dwelling Configuration	Cold DDH (%)	Hot DDH (%)	Total DDH (%)
Surrounded	4.66	20.08	3.45
Corner	4.67	11.74	3.03
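
The percentage errors in Table 8 compare discomfort degree-hours derived from ML predictions with those derived from the EnergyPlus reference. A minimal Python sketch of that calculation for a single case follows. It assumes hourly data, cold DDH accumulated as $\max(T_{n} - T, 0)$ and hot DDH as $\max(T - T_{n}, 0)$ with no comfort band around $T_{n}$, and hypothetical temperature values; the function names are illustrative.

```python
def discomfort_degree_hours(t_indoor, t_neutral):
    """Return (cold, hot) degree-hours for hourly indoor and neutral temperatures."""
    cold = sum(max(tn - t, 0.0) for t, tn in zip(t_indoor, t_neutral))
    hot = sum(max(t - tn, 0.0) for t, tn in zip(t_indoor, t_neutral))
    return cold, hot

def percent_error(ml_value, ref_value):
    """Absolute percentage error of an ML-derived DDH against the reference."""
    return 100.0 * abs(ml_value - ref_value) / ref_value

# Hypothetical four-hour example with a constant neutral temperature of 24 °C.
t_n = [24.0] * 4
t_ref = [20.0, 23.0, 25.0, 27.0]
t_ml = [20.5, 23.0, 25.5, 27.0]

cold_ref, hot_ref = discomfort_degree_hours(t_ref, t_n)
cold_ml, hot_ml = discomfort_degree_hours(t_ml, t_n)
abs_err_total = abs((cold_ml + hot_ml) - (cold_ref + hot_ref))
```

In this toy case the cold and hot errors have opposite signs and cancel in the total, which illustrates how a total DDH error can be smaller than either component error. Averaging `percent_error` over all generalization cases would give a mean absolute percentage error of the kind reported in the table.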

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
