Article

Archetype Identification and Energy Consumption Prediction for Old Residential Buildings Based on Multi-Source Datasets

1
School of Architecture and Urban Planning, Guangzhou University, Guangzhou 510006, China
2
School of Civil Engineering and Transportation, Guangzhou University, Guangzhou 510006, China
*
Author to whom correspondence should be addressed.
Buildings 2025, 15(14), 2573; https://doi.org/10.3390/buildings15142573
Submission received: 21 June 2025 / Revised: 14 July 2025 / Accepted: 16 July 2025 / Published: 21 July 2025
(This article belongs to the Special Issue Enhancing Building Resilience Under Climate Change)

Abstract

Assessing energy consumption in existing old residential buildings is key for urban energy conservation and decarbonization. Previous studies on old residential building energy assessment face challenges due to data limitations and inadequate prediction methods. This study develops a novel approach integrating building energy simulation and machine learning to predict large-scale old residential building energy use using multi-source datasets. Using Guangzhou as a case study, open-source building data was collected to identify 31,209 old residential buildings based on age thresholds and areas of interest (AOIs). Key building form parameters (i.e., long side, short side, number of floors) were then classified to identify residential archetypes. Building energy consumption data for each prototype was generated using EnergyPlus (V23.2.0) simulations. Furthermore, XGBoost and Random Forest machine learning algorithms were used to predict city-scale old residential building energy consumption. Results indicated that five representative prototypes exhibited cooling energy use ranging from 17.32 to 21.05 kWh/m2, while annual electricity consumption ranged from 60.10 to 66.53 kWh/m2. The XGBoost model demonstrated strong predictive performance (R2 = 0.667). SHAP (Shapley Additive Explanations) analysis identified the Building Shape Coefficient (BSC) as the most significant positive predictor of energy consumption (SHAP value = 0.79). This framework enables city-level energy assessment for old residential buildings, providing critical support for retrofitting strategies in sustainable urban renewal planning.

1. Introduction

In 2021, urban populations accounted for 56% of the global total, and this proportion is projected to rise to 68% by 2050 [1,2]. Urban buildings account for approximately 30% of total energy consumption and contribute to 26% of greenhouse gas emissions [3]. By the end of 2024, China’s permanent urbanization rate reached 67% [4], with urban residential floor area totaling 33.1 billion m2 [5]. A substantial portion of residential buildings constructed during the initial stages of urbanization (1980–2000) consume substantially more energy than modern structures. Implementing energy efficiency retrofits in these old residential buildings is essential for achieving decarbonization targets [6,7].
To achieve urban building energy savings, conducting energy consumption assessments of old residential buildings is essential [8]. Existing methodologies for urban building energy modeling are primarily categorized as top-down or bottom-up approaches [9]. Top-down approaches estimate energy use at the sector level [10], utilizing statistical models, regression analyses, building stock data, technology adoption models, and economic models. However, they typically lack precise temporal or spatial resolution results of individual building energy consumption, and it is often difficult to obtain dynamic data [11,12]. In contrast, the bottom-up approach is usually used to calculate energy consumption for individual buildings. Bottom-up methodologies can be classified into three types based on application method: statistical models, physical models, and hybrid models [13]. Statistical models establish correlations between building characteristics, energy usage, and socio-economic indicators. Physical models, conversely, compute energy consumption by simulating a building’s physical attributes (e.g., insulation, orientation) and technical specifications (e.g., HVAC systems), though this process demands substantial computational resources. The hybrid approach integrates the advantages of both methods, offering heightened flexibility across diverse building energy application scenarios [9].
Obtaining city-scale building data presents significant challenges, hindering comprehensive energy consumption evaluation at this scale [14]. The inherent complexity of modeling energy use for individual buildings further compounds this difficulty. Prototype energy modeling is regarded as an effective method for large-scale building energy assessment [15,16,17]. This approach can be implemented using building code-based, data-driven, and hybrid methods [18]. EnergyPlus [19] and DeST [20] are the predominant simulation tools for prototype buildings. Numerous studies have established prototype models to assess the energy use of specific building types. For example, Carnieletto et al. [21] developed 46 prototypes (16 single-family, 16 multi-family, 14 office buildings) for energy consumption analysis. Liu and Gou [22] established six residential building prototypes and an energy consumption model, identifying climate, population, and income as significant Energy Use Intensity (EUI) influencers. Deng et al. [23] constructed 66 prototype models for Changsha city from geographic information system (GIS) data (68,966 buildings), covering residential, commercial, and office types. They subsequently employed clustering and Random Forest (RF) models to identify 22 representative prototypes and analyze their energy consumption [24]. Song et al. [25] built various prototypes based on multi-source open data (such as maps and satellite images) and simulated their energy intensity with EnergyPlus. An et al. [26] developed 151 building prototypes for China across five climate zones and construction periods aligned with national standards. Peng et al. [27] constructed seven typical residential prototypes from the spatial data and envelope structures of 120 residential areas (involving 1300 buildings, 900 units, 350 types), referring to the latest energy-saving specifications to support energy prediction models. Xia et al. [28] developed eight prototype residential units in Guangzhou using population, urban residential data, and basic unit information. In addition, Yu [29] created a community-scale residential energy performance model based on a prototype, applied to 1963 Shanghai communities, identifying plot ratio and building coverage ratio as key influencing factors. Similarly, Alasmar et al. [30] built 21 prototype models to assess the energy consumption of Jordan’s national housing stock based on available building datasets. Li et al. [31] developed a residential building shape prototype via satellite images and clustering for urban-scale building stock energy calculation. Previous studies indicate that prototype models are widely used to assess old residential building energy. However, their ability to characterize city-scale energy performance remains limited under variable data conditions.
Machine learning has demonstrated significant efficacy in city-scale building energy assessment [32,33], utilizing algorithms such as extreme gradient boosting (XGBoost), Random Forest (RF), Artificial Neural Network (ANN), Gradient Boosting Decision Tree (GBDT), and Support Vector Regression (SVR). Notably, the XGBoost model often outperforms conventional regression methods in predicting residential building loads [34] and energy consumption [35]. Machine learning models can capture nonlinear relationships between urban morphological parameters and energy performance [36,37,38,39]. RF exhibits strengths in energy consumption prediction [40], especially when leveraging multi-source data (e.g., IoT sensor networks and physical energy models) [41,42,43]. While XGBoost typically achieves higher accuracy than RF at the community scale, both models enable efficient identification of energy-saving potentials in urban buildings through morphology-driven feature interpretation [36,44]. However, the literature still lacks a framework that integrates open-source building data with individual energy characteristics to predict city-scale energy consumption of old residential buildings.
To address these research gaps, this study develops a novel approach integrating building energy simulation with machine learning to predict large-scale old residential building energy consumption using multi-source datasets. First, the K-means clustering algorithm was employed to analyze key urban building features (length, width, number of floors, aspect ratio), identifying representative building prototypes. Second, the energy use intensity (EUI) for each prototype was generated using EnergyPlus simulations, and the relationships between urban morphological indicators and energy consumption were analyzed. Third, an XGBoost model was developed to predict city-scale energy consumption, and the results and their spatial patterns across different scales were visualized. This study provides an efficient framework for analyzing the energy consumption of old urban residential buildings, supporting urban planning and energy-saving strategies.

2. Materials and Methods

The research framework for archetype identification and energy consumption prediction of old residential buildings is shown in Figure 1. First, data processing involves collecting multi-source urban building datasets and identifying old residential buildings through house age thresholds and areas of interest (AOIs). The approximate rectangle method is applied to derive key building parameters, including building length and elevation. Second, old residential buildings are categorized into distinct archetypes based on building morphology parameters and clustering algorithms. EnergyPlus, a widely used energy consumption simulation program, is employed to calculate the energy consumption data of each archetype. The archetype energy consumption data are then assigned to the corresponding residential buildings according to their cluster numbers, thereby providing diverse energy simulation data for the energy consumption prediction model. Third, a building energy prediction model is developed by integrating machine learning algorithms (i.e., XGBoost and RF) with building morphological parameters. City-scale energy consumption of old residential buildings is subsequently estimated using this model. Finally, spatial mapping of urban energy consumption patterns is conducted, and key influencing factors are quantified through interpretable machine learning analysis. This study offers a methodological framework for calculating city-scale energy consumption in old residential buildings, with the results intended to inform targeted retrofit strategies and advance sustainable urban development.

2.1. Study Area and Data Collection

2.1.1. Case Study Area

Guangzhou (120°52′ E to 121°52′ E, 22°40′ N to 23°53′ N) (Figure 2) is situated in south-central Guangdong Province, on the northern edge of China’s Pearl River Delta [45]. The city features a marine subtropical monsoon climate, classified as a hot summer and warm winter climate zone [46]. Guangzhou’s central urban area encompasses the districts of Yuexiu, Liwan, Tianhe, and Haizhu. This area contains Guangzhou’s most densely developed old residential neighborhoods and serves as the city’s economic, cultural, and commercial core.
These neighborhoods primarily consist of courtyard-style housing constructed from the 1950s to the 1990s and commercial housing built during the 1980s and 1990s. During rapid urbanization, numerous old residential buildings emerged in these areas. The layout is relatively regular and simple, with most buildings featuring rectangular designs.
Due to early construction dates and low initial building standards, the thermal performance of building envelope structures in old residential buildings has significantly deteriorated over time [47]. This degradation necessitates substantially higher energy consumption to maintain indoor thermal comfort during Guangzhou’s high-temperature, high-humidity summers [48,49]. Consequently, accurate energy consumption assessment in these aging communities represents a critical research priority. This study focuses on Guangzhou’s central urban districts (Yuexiu, Liwan, Tianhe, Haizhu) to investigate city-scale energy consumption in old residential communities. These areas contain numerous residential buildings constructed before 2003 that require energy-saving retrofits.

2.1.2. Data Collection and Pre-Processing

Datasets include spatial data, attribute data, and weather data, as shown in Table 1. Building outline data is obtained using Baidu Map API v3.0; road network data is obtained from the OpenStreetMap website. Community age information for developments constructed between 1980 and 2003 was sourced from the Anjuke (www.anjuke.com) platform. Typical Meteorological Year (TMY) and design day data for South China were incorporated into the analysis.
Data preprocessing was conducted as follows. First, building outlines, road networks, and AOIs were processed in ArcGIS. Then, residential construction dates from Anjuke website text records were batch-converted to georeferenced point data via the Baidu Coordinate Picking System. Non-rectangular structures were geometrically standardized through minimum bounding rectangle extraction. Key morphological parameters (length, width, aspect ratio, floor count) were programmatically derived using Python’s GeoPandas library (v0.5.1) within the ArcGIS 10.6 environment, generating a structured geodatabase for subsequent analysis.
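As a rough illustration of the minimum bounding rectangle step, the following Python sketch derives the long side, short side, and aspect ratio from a footprint polygon using only the standard library. It is not the GeoPandas/ArcGIS workflow used in the study: the function names, the example footprint, and the definition of aspect ratio as long side divided by short side are assumptions made for illustration, and coordinates are assumed to be in a projected system with metre units.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def min_bounding_rectangle(points):
    """Return (long_side, short_side) of the minimum-area bounding rectangle.

    The minimum-area rectangle shares a direction with one convex hull edge,
    so each hull edge is tried in turn. Assumes at least three
    non-collinear vertices.
    """
    hull = convex_hull(points)
    best = None
    for i in range(len(hull)):
        (x0, y0), (x1, y1) = hull[i], hull[(i + 1) % len(hull)]
        edge_len = math.hypot(x1 - x0, y1 - y0)
        ux, uy = (x1 - x0) / edge_len, (y1 - y0) / edge_len  # unit vector along the edge
        along = [px * ux + py * uy for px, py in hull]     # projection on the edge direction
        across = [-px * uy + py * ux for px, py in hull]   # projection on the edge normal
        w, h = max(along) - min(along), max(across) - min(across)
        if best is None or w * h < best[0]:
            best = (w * h, max(w, h), min(w, h))
    return best[1], best[2]


def morphology(points, floors):
    """Collect the form parameters used for clustering (aspect ratio definition assumed)."""
    long_side, short_side = min_bounding_rectangle(points)
    return {
        "long_side": long_side,
        "short_side": short_side,
        "aspect_ratio": long_side / short_side,
        "floors": floors,
    }


# Hypothetical 12 m x 8 m footprint with 7 floors
footprint = [(0.0, 0.0), (12.0, 0.0), (12.0, 8.0), (0.0, 8.0)]
params = morphology(footprint, floors=7)
```

In a GIS workflow the same quantities would normally come from a library routine for the minimum rotated rectangle; the manual version here only makes the geometry explicit.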

2.2. Prototype of Old Residential Buildings

2.2.1. K-Means Method

K-means clustering (Figure 3) was selected for its efficiency in grouping buildings by morphology. Building prototypes are established by categorizing structures into multi-story and high-rise types based on morphological features. The classification parameters include building height, long side, short side, aspect ratio, and floor count. A K-means clustering algorithm is employed to group the building dataset. A random sampling heuristic method initializes the cluster centers by selecting them from the dataset. The Euclidean distance metric is used, and the formula is as follows:
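$$d(x_1, x_2) = \sqrt{\sum_{k=1}^{d} \left( x_{k1} - x_{k2} \right)^2}$$
where $x_1$ and $x_2$ are two sample points, $d$ in the summation limit represents the number of feature dimensions (including building height, long side, short side, and aspect ratio), and $x_{k1}$ and $x_{k2}$ represent the values of the $k$th feature of the two samples.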
Samples are assigned to the nearest cluster center according to the minimum distance principle. The new cluster center calculation formula is as follows:
$$\mu_k = \frac{1}{N_k} \sum_{j \in \mathrm{cluster}_k} x_j$$
where $\mu_k$ is the new center of the $k$th cluster, $N_k$ is the number of samples belonging to the $k$th cluster, and $x_j$ is the feature vector of the $j$th sample. Clustering ends when the change in the cluster centers between two successive iterations is less than the set threshold, which means that the clustering criterion function has converged, or when the iteration reaches the maximum number of steps. The sample adjustment then ends and the final prototype building classification result is obtained. The clustering criterion function is
$$J_m = \sum_{j=1}^{k} \sum_{x_i \in c_j} \left\| x_i - \mu_j \right\|^2$$
where $k$ in the summation limit is the number of clusters, $c_j$ is the set of samples assigned to the $j$th cluster, and $\mu_j$ is the center of that cluster.
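The clustering procedure described above (random initialization from the dataset, nearest-center assignment by Euclidean distance, center update by the cluster mean, and termination by threshold or maximum iterations) can be sketched in plain Python as follows. This is a minimal sketch, not the implementation used in the study: the function names, default threshold, and sample values are hypothetical, and feature scaling, which would usually precede clustering on mixed units, is omitted.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def kmeans(samples, k, max_iter=100, tol=1e-6, seed=0):
    """Return (centers, labels) for a list of feature tuples."""
    rng = random.Random(seed)
    centers = rng.sample(samples, k)  # initial centers drawn from the dataset
    labels = []
    for _ in range(max_iter):
        # Assignment step: each sample goes to its nearest center
        clusters = [[] for _ in range(k)]
        labels = []
        for x in samples:
            idx = min(range(k), key=lambda c: euclidean(x, centers[c]))
            clusters[idx].append(x)
            labels.append(idx)
        # Update step: each center becomes the mean of its members
        new_centers = []
        for c, members in enumerate(clusters):
            if members:
                new_centers.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        shift = max(euclidean(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # centers stopped moving: criterion function has converged
            break
    return centers, labels


def criterion(samples, centers, labels):
    """Sum of squared distances of samples to their assigned centers."""
    return sum(euclidean(x, centers[l]) ** 2 for x, l in zip(samples, labels))


# Hypothetical (long side m, short side m, floors) samples: three mid-rise, three high-rise
buildings = [
    (20.0, 10.0, 6.0), (22.0, 11.0, 6.0), (21.0, 10.0, 7.0),
    (60.0, 20.0, 18.0), (62.0, 21.0, 18.0), (61.0, 20.0, 17.0),
]
centers, labels = kmeans(buildings, k=2)
```

With well-separated groups such as these, the two recovered clusters correspond to the mid-rise and high-rise samples; on real data the number of clusters would be chosen with indices such as the Davies–Bouldin index reported later in the article.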

2.2.2. Parameters of Old Residential Prototypes

The heat transfer coefficients of the exterior wall, roof, and exterior window of the residential building are defined in accordance with the energy-saving design standard JGJ 75-2003 (Design Standard for Energy Efficiency of Residential Buildings in Hot Summer and Warm Winter Zone) [50] and previous studies [24]. The exterior walls use clay solid bricks as the primary material, which have low thermal resistance and poor insulation. They also absorb high levels of solar radiation, resulting in increased indoor heat gain during summer. The exterior windows feature aluminum alloy frames, which significantly affect the heat transfer coefficient, causing it to reach 5.0–6.0 W/(m2·K). The roof is a flat roof with a large-step brick overhead ventilation layer and an expanded perlite insulation layer on top [51]. While this structure primarily optimizes the roof’s U-value, its high solar radiation absorption coefficient still leads to excessively high indoor temperatures on the top floor during summer. The thermal parameters of the old residential building prototypes are defined in Table 2.
China’s standard weather data and design day data were selected as input data. The building energy consumption was simulated using the EnergyPlus simulation program. Cooling energy consumption and annual electricity consumption data for different archetypes were then obtained. These energy consumption results for the old residential building archetypes are used to train and test the prediction model.
$$\mathrm{EUI}_i^{cool/ele} = \sum_{h \in 8760} \mathrm{Load}_{i,h}^{cool/ele} \times \frac{1}{\mathrm{Floorage}_i}$$
where $\mathrm{Load}_{i,h}^{cool/ele}$ denotes the cooling and electricity loads of building $i$ at hour $h$, and $\mathrm{Floorage}_i$ is the floor area of building $i$, which is used to calculate the cooling and electricity energy use intensity per unit area of the whole building.
$$\mathrm{EU}_i^{cool/ele} = \sum_{h}^{8760} \mathrm{Load}_{i,h}^{cool/ele} \times 1$$
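The relationship between hourly loads, annual energy use, and energy use intensity can be illustrated with a short Python sketch. The function names and the numeric values (a constant 0.5 kWh hourly cooling load and a 219 m2 floor area) are hypothetical and chosen only so that the arithmetic is easy to follow; they are not simulation outputs from this study.

```python
HOURS_PER_YEAR = 8760


def annual_energy_use(hourly_loads_kwh):
    """Total annual energy use: the sum of the hourly loads (kWh)."""
    return sum(hourly_loads_kwh)


def energy_use_intensity(hourly_loads_kwh, floor_area_m2):
    """Annual energy use divided by floor area (kWh/m2)."""
    if floor_area_m2 <= 0:
        raise ValueError("floor area must be positive")
    return annual_energy_use(hourly_loads_kwh) / floor_area_m2


# Hypothetical building: constant 0.5 kWh cooling load every hour, 219 m2 floor area
hourly_cooling = [0.5] * HOURS_PER_YEAR
eu_cool = annual_energy_use(hourly_cooling)            # 4380.0 kWh
eui_cool = energy_use_intensity(hourly_cooling, 219.0)  # 20.0 kWh/m2
```

In practice the hourly series would come from the EnergyPlus output for each archetype, with one series for cooling and one for total electricity.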

2.3. Energy Consumption Prediction Model

This study presents a method for generating urban morphological parameters using open-source data (Figure 4). This method provides an approach to evaluating urban form and its relationship to energy performance at the city scale. The data acquisition and processing phase begins with extracting road network data from OpenStreetMap (OSM), including urban trunk roads, branch roads, and internal roads. These datasets serve as the basis for delineating block boundaries, and the resulting block surfaces are then used for energy analysis. Block feature analysis reveals that larger urban plots frequently incorporate natural landscape elements such as mountains, water bodies, and green spaces, all of which significantly influence the spatial distribution of energy consumption patterns.
To address data gaps in old residential buildings, additional interest surface data was added to construct a more comprehensive block-level dataset. This enhanced dataset captures both the physical characteristics of urban blocks and their functional attributes, enabling accurate energy performance assessments across diverse urban blocks. For spatial visualization, the study employs grid-based techniques to map energy consumption across Guangzhou’s central urban area. This gridded representation not only facilitates intuitive interpretation of energy use patterns but also supports comparative analysis between different urban blocks. Such visualization maps serve as valuable tools for urban planning decisions and targeted energy management strategies, particularly in dense urban environments.
The boundary definitions and building form variables of old residential areas are shown in Table 3. These variables serve as inputs for both the XGBoost model and the RF model. Hyperparameter tuning was applied to optimize model performance, with the best parameter combination selected based on evaluation via K-Fold Cross-Validation. The energy consumption prediction accuracy of the two models is compared, and the superior one is chosen for energy analysis of urban old residential buildings.
Shapley Additive Explanations (SHAP) was utilized to analyze the optimal model’s performance, assessing the contribution of six features to the model output. SHAP, derived from cooperative game theory, measures the marginal contribution of features to the model’s output by calculating Shapley values. This approach constructs an additive interpretation model where all features are considered contributors. Each feature’s contribution to the predicted value is quantified through the Shapley value, ultimately determining the feature importance ranking of the optimal model.
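To make the idea of a marginal contribution concrete, the sketch below computes exact Shapley values for a small model by enumerating all feature subsets. It is a simplified illustration and not the SHAP library's tree-based algorithm: features outside a subset are replaced by a single baseline value (an assumption), and the toy model, its coefficients, and the input values are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of each feature of x relative to a baseline input.

    Features not in a coalition take their baseline value. The cost grows
    as 2^n, so this is only practical for a handful of features.
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight of a coalition of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical energy model with one interaction term."""
    return 100.0 + 50.0 * z[0] + 20.0 * z[1] + 10.0 * z[0] * z[1]


x = [2.0, 3.0]
baseline = [0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)  # [130.0, 90.0]
```

The two contributions sum to the difference between the prediction for `x` (320) and the prediction for the baseline (100), which is the additive property that allows the per-feature contributions to be ranked.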
The study selected parameters such as R2 and Mean Squared Error (MSE) as model performance comparison parameters. R2 measures the proportion of variance explained by the model for the target variable Energy. The R2 value range is between 0 and 1. In terms of explanatory power, the closer R2 is to 1, the better the model fits the data. MSE is the average of the squares of the difference between the predicted value and the true value. The calculation formula is as follows:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}$$
where $n$ represents the number of data points for calculating $R^2$, $y_i$ is the true value of the $i$th sample, $\hat{y}_i$ represents the predicted value of the $i$th sample, and $\bar{y}$ represents the average of all true values.
$$MSE = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2$$
where $n$ represents the number of data points, $y_i$ is the true value of the $i$th sample, and $\hat{y}_i$ represents the predicted value of the $i$th sample.
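The two metrics can be computed directly from their definitions, as in the following Python sketch. The function names and the sample values are hypothetical. The second half of the example rescales the same data by a factor of 1000 to show that MSE depends on the magnitude of the target variable while R2 does not.

```python
def mse(y_true, y_pred):
    """Mean of the squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """One minus the ratio of residual to total sum of squares."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical true and predicted values
y_true = [60, 62, 64, 66]
y_pred = [61, 62, 63, 66]
print(r2_score(y_true, y_pred), mse(y_true, y_pred))  # R2 = 0.9, MSE = 0.5

# Same data expressed in units 1000 times larger
y_true_big = [v * 1000 for v in y_true]
y_pred_big = [v * 1000 for v in y_pred]
print(r2_score(y_true_big, y_pred_big), mse(y_true_big, y_pred_big))  # R2 unchanged, MSE x 10^6
```

This scale dependence is why a model evaluated on building-level annual energy totals can show a very large MSE alongside a moderate R2.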

2.4. Mapping Energy Consumption

The energy usage of old residential buildings at the city scale is obtained by summarizing the energy usage of each building. The EUI of old residential buildings and its spatial distribution differences were visually analyzed using a heat map. Three grid units (i.e., 220 m × 220 m, 500 m × 500 m, and 1 km × 1 km) were selected to establish a product calculation model based on building archetype area and EUI. Color gradient coding (bright colors indicating low energy consumption, dark colors indicating high energy consumption) was used for multi-scale energy consumption mapping. The 1 km grid reflects the overall energy consumption level of the urban area, the 500 m grid identifies high-consumption clustered areas, and the 220 m grid analyzes block-level distribution characteristics.
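The grid-based aggregation can be sketched as follows: each building's energy use is taken as floor area multiplied by EUI and summed into the grid cell that contains the building. This is an illustrative sketch under stated assumptions, not the GIS procedure used in the study: the field names, the use of a single point coordinate per building in a projected (metre) coordinate system, and the sample values are all hypothetical.

```python
from collections import defaultdict


def grid_energy(buildings, cell_size_m):
    """Sum floor area x EUI per square grid cell, keyed by (column, row)."""
    totals = defaultdict(float)
    for b in buildings:
        col = int(b["x"] // cell_size_m)
        row = int(b["y"] // cell_size_m)
        totals[(col, row)] += b["floor_area_m2"] * b["eui_kwh_per_m2"]
    return dict(totals)


# Hypothetical buildings with projected coordinates in metres
sample = [
    {"x": 100.0, "y": 100.0, "floor_area_m2": 1000.0, "eui_kwh_per_m2": 60.0},
    {"x": 400.0, "y": 300.0, "floor_area_m2": 2000.0, "eui_kwh_per_m2": 65.0},
    {"x": 900.0, "y": 100.0, "floor_area_m2": 1500.0, "eui_kwh_per_m2": 62.0},
]

by_scale = {size: grid_energy(sample, size) for size in (220, 500, 1000)}
```

For these sample buildings, the 1 km grid places all three in one cell, the 500 m grid separates the third building from the first two, and the 220 m grid places each building in its own cell, mirroring how finer grids expose block-level differences.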

3. Results

3.1. Energy Consumption of Prototypes

Satellite images of old residential areas in various districts of Guangzhou from the 1980s to the 1990s were selected to verify the screening results, and the accuracy rate of old residential building recognition reached 90%. Due to the absence of a small number of building outlines and errors in the number of floors in the original building outline data, a small number of old residential buildings could not be identified.
Regarding prototype clustering, the K-means model performed well when the overall data volume exceeded 20,000 samples. The silhouette coefficient derived from the algorithm was at a medium level, suggesting the presence of a clustering structure. The Davies–Bouldin index of 0.798 indicated substantial distances between clusters while maintaining close intra-cluster distances. The prototype categories and quantity statistics of old residential buildings are shown in Table 4 and Figure 5.
In practice, energy consumption measurements for old residential buildings are unavailable. Thus, the energy consumption results of the old residential building archetypes were validated against previous studies. Cooling energy and EUI results of different prototypes are shown in Figure 6. The average specific power consumption for civil buildings in Guangdong Province in 2012 was 72.13 kWh/(m2·a) [55]. Since civil buildings encompass both residential and public structures, this average is notably higher than the energy consumption values reported in this study. The discrepancies between Prototypes B and C in this study’s results were 10–11%, while the cooling energy consumption of Prototypes A, E, and F closely aligned with the simulated air conditioning energy consumption. Pre-2001 energy consumption figures for old mid-rise residential buildings in hot summer and warm winter regions were 52 kWh/m2, and for old high-rise buildings, 38 kWh/m2 [24]. The mid-rise buildings exhibited a 22% difference compared to the results of this study.

3.2. Prediction of Building Energy Consumption at City Scale

3.2.1. Comparison of Different Prediction Models

Machine learning prediction analysis uses building-scale and city-scale cooling energy consumption and electricity usage data. Comparing the performance of XGBoost and RF on the test set shows that XGBoost’s R2 (0.667) is slightly higher than RF’s (0.619), indicating that the boosting method has stronger nonlinear fitting ability on this dataset. Because the predicted energy consumption values are large in magnitude, the MSE values are also large: 1.0 × 10^11 for XGBoost and 1.1 × 10^11 for RF. For comparison, the R2 of XGBoost models predicting the complex relationship between urban spatial morphology and land surface temperature in the plain city of Chengdu and the plateau city of Lhasa was 0.515 and 0.429, respectively [56]. A similar study reported a test-set R2 of 0.685 when predicting cooling loads in large urban communities [34], and a study of the relationship between urban form and urban building energy consumption reported an R2 of 0.4854 [57]. In summary, the XGBoost model’s R2 of 0.667 is within a reasonable range.
This study uses Random Forest as an alternative model for cross-validation to evaluate the robustness of the XGBoost model. The hyperparameters of the Random Forest (including the number of decision trees, the minimum number of samples per leaf node, and the maximum number of features) were optimized with the RandomizedSearchCV method, and the model was selected based on 3-fold cross-validation with a mean squared error minimization objective. The performance of the optimal Random Forest model on the test set was compared with that of the XGBoost model. The results show that the two models are highly consistent in their ordering of feature importance (Spearman correlation coefficient = 0.82, p < 0.01), indicating that the identification of key features is stable. In addition, the robustness of the model was verified through data splitting experiments with different random seeds (42, 123, 456, 789), and the R2 fluctuation range was within ±0.03, confirming the reliability of the model’s prediction performance.
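The consistency check between the two feature importance orderings relies on the Spearman rank correlation, which can be sketched in plain Python as the Pearson correlation of the ranks. The function names and the importance scores below are hypothetical and are not the values obtained in this study; they only illustrate how two rankings that differ by a pair of swaps still yield a high coefficient.

```python
import math


def ranks(values):
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(a, b):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    ra, rb = ranks(a), ranks(b)
    mean_a, mean_b = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical importance scores for six features from two models
features = ["SA", "NoB", "FAR", "BAC", "BSC", "ANF"]
importance_model_1 = [0.02, 0.05, 0.20, 0.25, 0.30, 0.18]
importance_model_2 = [0.04, 0.03, 0.22, 0.24, 0.21, 0.26]
rho = spearman(importance_model_1, importance_model_2)
```

A coefficient close to 1 means the two models rank the features in nearly the same order even if the raw importance scores differ.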

3.2.2. Analysis of Influencing Factors

Based on the urban morphological parameters and building energy consumption datasets generated from the residential building energy consumption prototypes and blocks, this study used the SHAP method to quantify the marginal contribution of each feature to the prediction results, and constructed a building energy consumption prediction framework that integrates ensemble learning and interpretable analysis for old residential buildings in Guangzhou.
The feature correlation matrix (Figure 7) shows the linear correlations between the features in the dataset. SA and ANF are strongly negatively correlated (about −0.99), indicating that when SA increases, ANF almost always decreases. Regarding correlations with the target variable (energy), BAC shows a high positive correlation (about 0.79), which means that BAC is an important factor affecting energy, and BSC shows a positive correlation of 0.52, indicating that BSC has a positive impact on the change in energy. The derived features ANF_BSC_product (ANF × BSC) and log_BSC (the logarithmic transformation of the building shape coefficient) are moderately correlated with energy, indicating that housing project characteristics such as BAC and BSC can help improve the prediction effect.
The feature importance plots in Figure 8 show that BSC has a relatively high impact, while SA has the lowest impact. The SHAP values (impact on model output) show that BSC, ANF, and FAR have the greatest impact on model output and contribute positively to the predicted energy consumption.
Figure 9 reveals that building morphology parameters particularly ANF, FAR, and BSC are the dominant positive drivers of energy consumption, contributing SHAP values of +414,305.15, +393,685.13, and +268,099.79, respectively. Conversely, Surface Area (SA) and Number of Buildings (NoB) emerge as negative factors, reducing energy consumption predictions by −8678.71 and −5184.37, respectively. This directional influence is visually evident in the distribution of feature impacts, where higher values of positive drivers (red data points) consistently push predictions above the baseline output (E[f(x)] = 402,997), while higher values of negative factors (blue) correlate with reduced consumption estimates.
Figure 10 delineates the SHAP value distributions for six critical features (a–f) in an XGBoost model, revealing distinct characteristic patterns. In Figure 10a, the SA feature demonstrates a bimodal distribution of SHAP values (range: −1 to 1), with medium-value samples (blue) exhibiting predominantly negative contributions, while high-value samples (red) show heterogeneous positive effects. Figure 10b discloses a pronounced clustering of high NoB values (red) at SHAP ≈ 0.5, indicating a systematic positive impact beyond specific thresholds. The FAR feature in Figure 10c exhibits a weak linear tendency, where low-value samples (blue) concentrate in negative SHAP space, contrasting with high-value samples (red) in positive regions. Figure 10d manifests a polarized pattern for BAC: low values drive strong negative contributions (SHAP < −0.5), whereas high values correlate with robust positive effects (SHAP > 1). The BSC feature in Figure 10e displays minimal dispersion (range: ±0.25 SHAP) without clear value-gradient dependency, suggesting stable predictive influence. Notably, Figure 10f reveals an abrupt transition for ANF—low values yield negative contributions, but marginal increments trigger a phase shift to sustained positive SHAP values (>0.6), evidencing a critical nonlinear boundary effect.

3.3. Analysis of Old Residential Building Energy Consumption

3.3.1. Energy Consumption Analysis at Urban Scale

The spatial distribution of energy consumption in old residential areas reveals a concentration in peri-urban areas surrounding Guangzhou’s historic core. As shown in Figure 11, at the 1 km grid scale, Huadu Central District demonstrates significantly higher energy intensity (27.1–94.3 × 10^5 kWh/m2), contrasting with lower intensity values (27.1 × 10^5 kWh/m2) observed in Zengcheng and Conghua urban centers. The least energy-intensive areas, corresponding to minimal old housing stock, are primarily located in Nansha District and Huangpu District. Within the four central historic districts, energy consumption displays high spatial clustering with intensities predominantly below 9.69 × 10^5 kWh/m2, necessitating finer-scale analysis (e.g., 500 m grid) to delineate precise energy intensity zones.

3.3.2. Energy Consumption Differences Under Different Grids

In Figure 12, this study analyzed urban energy consumption patterns in Guangzhou’s central districts (Liwan, Yuexiu, and Haizhu) using a multi-scale grid approach (220 m × 220 m, 500 m × 500 m, and 1 km × 1 km). Total energy consumption in old residential areas was estimated by combining EUI values with the corresponding floor areas of prototype buildings. At the 1 km × 1 km scale, high-consumption zones (8.7–15.8 × 10^7 kWh/m2·a) clustered predominantly in Yuexiu District, moderate-consumption areas (2.6–8.7 × 10^7 kWh/m2·a) concentrated in Tianhe and Haizhu, while Liwan District showed low consumption (0.9 × 10^7 kWh/m2·a). The 500 m × 500 m resolution revealed finer distributions: Yuexiu’s sub-districts exhibited moderate consumption (3.0–4.6 × 10^6 kWh/m2·a), Tianhe contained more low-consumption areas (0.3–1.8 × 10^6 kWh/m2·a) than Haizhu, and minimal consumption zones (0.3 × 10^6 kWh/m2·a) dominated Liwan. Neighborhood-level analysis at 220 m × 220 m resolution identified peak intensity clusters (5.7–18.3 × 10^5 kWh/m2·a) in Yuexiu’s Jianshe Sub-district, effectively capturing block-level variations in old residential areas.

3.3.3. Energy Consumption of Different Prototypes

Two old residential communities of different ages in Guangzhou, Suihua New Village (1980s) and Liuyun Community (1990s), were selected to analyze the distribution and energy consumption characteristics of different residential prototypes. The number and distribution of building prototypes in each community were counted; Prototypes A, C, and D together accounted for more than 80%. The impact of envelope defects on energy consumption was simulated with Lingnan climate parameters, revealing energy consumption patterns common to building prototypes from different eras.
Figure 13 presents the two communities analyzed: Suihua New Village in Haizhu District and Liuyun Community in Tianhe District. Suihua New Village, constructed in the early 1980s, exhibits typical characteristics of old residential areas: the building structure is deteriorating, the envelope has significant defects, and supporting facilities are relatively outdated, leading to elevated energy consumption in the hot and humid Lingnan climate. Liuyun Community, built in the early 1990s, represents a typical open, purely residential community developed by private investors. The buildings’ exteriors are in poor condition, and their energy consumption patterns are broadly representative. Because both communities underwent some renovation during urban development, the energy consumption simulation parameters were not differentiated by era. The buildings in these two communities predominantly correspond to Prototypes A, C, and D, reflecting commonalities in the prototype characteristics of older communities.
From the perspective of urban renewal, the energy consumption conditions of old residential areas can be incorporated into urban renovation and renewal guidelines to steer old communities toward energy-saving, low-carbon development. As shown in Figure 14, various strategies are available for renovating urban residential buildings. At the block and building scales, these include strengthening the promotion of energy-saving and low-carbon measures in communities, opening up public activity spaces in neighborhoods, establishing community ventilation micro-channels, raising residents’ awareness of low-carbon and energy-saving practices, upgrading roof and exterior wall constructions, and installing more environmentally friendly, energy-efficient doors and windows.

4. Discussion

The old residential prototype buildings are determined based on building length and width, aspect ratio, and number of floors, and OpenStudio is used to efficiently simulate their energy consumption. Compared with the energy consumption of old residential buildings in different climate zones reported in the previous literature, the simulation results for the prototype buildings fall within a reasonable range, although measured building energy consumption data are still needed to calibrate the models further.
Because measured energy consumption data for old residential buildings were unavailable, the study verifies its results against data from previous research, relevant documents, and the Construction Yearbook. According to the 2015 Guangdong Construction Yearbook, the average electricity consumption per unit area of civil buildings in the 2012 Guangdong Provincial Energy Audit was 72.13 kWh/(m²·a) [55]. The energy audit covers both residential and public buildings, so this figure is significantly higher than the energy consumption values obtained in this study. A previous simulation study of residential buildings in the Guangzhou Asian Games City reported an air-conditioning cooling energy consumption of 19.3 kWh/m². This value differs from the cooling energy consumption of Prototype B and Prototype C by 10–11%, while the cooling energy consumption of Prototype A, Prototype E, and Prototype F is very close to that result [58]. The annual electricity EUI of the mid-rise residential prototype building in Changsha, located in the hot summer and cold winter region, is 50.2 kWh/m² [24], which differs by 16.4% to 24.5% from the annual electricity EUI of the prototype buildings obtained in this study. This gap is expected, because cooling demand is higher throughout the year in the hot summer and warm winter region. The simulated building energy data are therefore within a reasonable error range.
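The comparison with the Changsha prototype can be checked with a short calculation. The sketch below assumes the difference is expressed relative to this study's EUI values; the text does not state the denominator, so this is an assumption, and the helper name `relative_difference` is hypothetical.

```python
def relative_difference(reference, value):
    """Return |value - reference| / value as a percentage.

    Assumption: the difference is normalized by this study's value,
    which is not stated explicitly in the text.
    """
    return abs(value - reference) / value * 100.0


changsha_eui = 50.2          # kWh/m2, Changsha mid-rise prototype [24]
this_study = (60.10, 66.53)  # kWh/m2, annual electricity EUI range in this study

low, high = (relative_difference(changsha_eui, v) for v in this_study)
print(f"{low:.1f}% to {high:.1f}%")
```

Under this assumption the calculation gives roughly 16.5% and 24.5%, in line with the reported range of 16.4% to 24.5% up to rounding.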
Urban morphological parameters and energy consumption data are used as inputs for machine learning algorithms. The XGBoost and RF models are then employed to estimate the energy consumption of old residential buildings. XGBoost achieves higher prediction accuracy than the RF model. The energy consumption of old urban areas is analyzed at three different scales (220 m, 500 m, and 1 km), providing a reference for further identifying energy consumption patterns and informing energy planning.
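For reference, the two metrics used to compare such regression models (R2 and MSE) can be computed as in the sketch below. The function names and the sample values are illustrative assumptions; they are not the study's data, and the models themselves are not reproduced here.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared residuals between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up EUI values (kWh/m2) for illustration only.
y_true = [60.0, 62.0, 64.0, 66.0]
y_pred = [61.0, 62.0, 63.0, 66.0]

mse = mean_squared_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)
print(f"MSE = {mse:.3f}, R2 = {r2:.3f}")
```

An R2 of 1 means the predictions explain all of the variance in the observed values, while an R2 near 0 means the model does no better than predicting the mean, so a higher R2 and a lower MSE both indicate a better fit.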
This study still has some limitations. First, typical meteorological year data were used as input for the building energy model, yet extreme high temperatures, the urban heat island effect, occupant usage habits, and population density can all significantly affect building energy consumption results [59,60,61]. Future work should account for these factors to further improve the energy performance simulation of old residential buildings [62]. Second, the predictive accuracy of the XGBoost model remains limited, and future research should focus on improving it by optimizing the prototype datasets. Third, the applicability of the proposed residential building energy prediction model to other cities remains to be validated; the model will require adaptive adjustments based on the local meteorological parameters and the structural characteristics of old residential buildings in each study area.

5. Conclusions

This study developed a novel approach that integrates building energy simulation and machine learning to predict city-scale energy use in old residential buildings, utilizing multi-source datasets. It extends building-scale simulations to estimate energy consumption at the city scale. The main conclusions include:
  • Pre-2003 residential buildings were identified from 706,188 building outlines in Guangzhou using AOI data and internet housing data. This identification method was validated with 90% accuracy, yielding a total of 31,209 confirmed pre-2003 residential buildings;
  • Five representative prototypes exhibited cooling energy use of 17.32–21.05 kWh/m² and annual electricity EUI of 60.10–66.53 kWh/m²; these results can be scaled up to city-level energy assessment for low-carbon planning;
  • A prediction model relating urban morphological factors to energy consumption was developed. The XGBoost model was evaluated through cross-validation and outperformed the RF model (R² = 0.667). Furthermore, SHAP analysis identified BSC as the most significant positive predictor of the energy consumption of old residential buildings (SHAP value = 0.79);
  • Spatial analysis of Guangzhou’s old residential buildings revealed distinct energy consumption patterns: Huadu Central District exhibited peak intensity (27.1–94.3 × 10⁵ kWh/m²), while Nansha and Huangpu showed relatively lower consumption levels. Multi-scale grid analysis (ranging from 220 m to 1 km) identified Yuexiu as the highest energy consumption zone, with peak values reaching up to 18.3 × 10⁵ kWh/m²·a at 220 m resolution. Prototype energy simulations of 1980s–1990s communities indicated that the building envelope was the key inefficiency factor.
This study establishes an efficient framework for analyzing the energy consumption of urban old residential buildings. This framework facilitates the development of urban planning strategies and energy-saving policies. Potential applications include: (1) At city scale, it can support the design of policies, standards, and guidelines to promote energy efficiency in urban renewal projects. (2) At block scale, it can enhance public awareness and transform public spaces within high-energy-consumption communities to encourage energy-saving and low-carbon practices among residents, fostering both promoted and spontaneous energy-conserving behaviors. (3) At building scale, it can guide the retrofitting of building envelope components (e.g., roofs, windows, exterior walls) to reduce the energy consumption of individual structures.

Author Contributions

Conceptualization, C.F. and R.L.; methodology, C.F., R.L., and Y.L.; formal analysis, C.F., R.L., and Y.L.; resources, Y.L.; data curation, C.F. and R.L.; writing—original draft preparation, R.L. and C.F.; writing—review and editing, Y.L.; visualization, supervision, C.F. and Y.L.; funding acquisition, C.F. All authors have read and agreed to the published version of the manuscript.

Funding

The research was supported by the Guangdong Basic and Applied Basic Research Foundation (No. 2023A1515012138; 2025A1515012875) and the Guangzhou Science and Technology Project (No. 2024A04J3355).

Data Availability Statement

The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Abbreviations

XGBoost | eXtreme Gradient Boosting
RF | Random Forest
SHAP | Shapley Additive Explanations
BSC | Building Shape Coefficient
EUI | Energy Use Intensity
SN | Building Serial Number
SA | Site Area
SHGC | Solar Heat Gain Coefficient
SEER | Seasonal Energy Efficiency Ratio
ANF | Average number of building floors
NoB | Number of Buildings
BCR | Building Coverage Ratio
FAR | Floor Area Ratio
MSE | Mean Squared Error

References

  1. WCR PR Chinese Press Release. wuf.unhabitat. 2022. Available online: https://wuf.unhabitat.org/sites/default/files/2022-06/files/WCR-PR-Chinese-press-release-29-06-2022.pdf (accessed on 20 June 2025).
  2. Zou, B.; Fan, C.; Wang, M.; Li, J.; Zhou, X.; Liao, Y. How Do Urban-Rural and Regional Summer Heat Exposures Evolve? A Case Study of 301 Cities in China from 2000 to 2020. Build. Environ. 2025, 271, 112584.
  3. Wang, Z.; Hong, Y.; Huang, L.; Zheng, M.; Yuan, H.; Zeng, R. A Comprehensive Review and Future Research Directions of Ensemble Learning Models for Predicting Building Energy Consumption. Energy Build. 2025, 335, 115589.
  4. Statistical Communiqué of the People’s Republic of China on National Economic and Social Development in 2024; National Bureau of Statistics: Beijing, China, 2025.
  5. Annual Research Report on China Building Energy Efficiency 2025 (Urban Residential Buildings Special Topic); Tsinghua University Building Energy Research Center: Beijing, China, 2025.
  6. Action Plan for Carbon Dioxide Peaking Before 2030; Department of Resource Conservation and Environmental Protection: Beijing, China, 2021.
  7. Peng, Z.; Sun, Q.; Li, P.; Sun, F.; Ren, S.; Guan, R. The Environmental Impact of the Entire Renovation Process of Urban Aged Residential Buildings in China. IJBPA 2024.
  8. Dahlström, L.; Broström, T.; Widén, J. Advancing Urban Building Energy Modelling through New Model Components and Applications: A Review. Energy Build. 2022, 266, 112099.
  9. Ferrando, M.; Causone, F.; Hong, T.; Chen, Y. Urban Building Energy Modeling (UBEM) Tools: A State-of-the-Art Review of Bottom-up Physics-Based Approaches. Sustain. Cities Soc. 2020, 62, 102408.
  10. Hong, T.; Chen, Y.; Luo, X.; Luo, N.; Lee, S.H. Ten Questions on Urban Building Energy Modeling. Build. Environ. 2020, 168, 106508.
  11. Abbasabadi, N.; Ashayeri, M. Urban Energy Use Modeling Methods and Tools: A Review and an Outlook. Build. Environ. 2019, 161, 106270.
  12. Gan, L.; Liu, Y.; Shi, Q.; Cai, W.; Ren, H. Regional Inequality in the Carbon Emission Intensity of Public Buildings in China. Build. Environ. 2022, 225, 109657.
  13. Johari, F.; Peronato, G.; Sadeghian, P.; Zhao, X.; Widén, J. Urban Building Energy Modeling: State of the Art and Future Prospects. Renew. Sustain. Energy Rev. 2020, 128, 109902.
  14. Koral Iseri, O.; Duran, A.; Canlı, I.; Meral Akgul, C.; Kalkan, S.; Gursel Dino, I. A Method for Zone-Level Urban Building Energy Modeling in Data-Scarce Built Environments. Energy Build. 2025, 337, 115620.
  15. Yang, J.; Zhang, Q.; Peng, C.; Chen, Y. AutoBPS-Prototype: A Web-Based Toolkit to Automatically Generate Prototype Building Energy Models with Customizable Efficiency Values in China. Energy Build. 2024, 305, 113880.
  16. Oraiopoulos, A.; Hsieh, S.; Schlueter, A. Energy Futures of Representative Swiss Communities under the Influence of Urban Development, Building Retrofit, and Climate Change. Sustain. Cities Soc. 2023, 91, 104437.
  17. Cerezo, C.; Sokol, J.; AlKhaled, S.; Reinhart, C.; Al-Mumin, A.; Hajiah, A. Comparison of Four Building Archetype Characterization Methods in Urban Building Energy Modeling (UBEM): A Residential Case Study in Kuwait City. Energy Build. 2017, 154, 321–334.
  18. Shen, P.; Wang, H. Archetype Building Energy Modeling Approaches and Applications: A Review. Renew. Sustain. Energy Rev. 2024, 199, 114478.
  19. Henninger, R.H.; Witte, M.J.; Crawley, D.B. Analytical and Comparative Testing of EnergyPlus Using IEA HVAC BESTEST E100–E200 Test Suite. Energy Build. 2004, 36, 855–863.
  20. Yan, D.; Zhou, X.; An, J.; Kang, X.; Bu, F.; Chen, Y.; Pan, Y.; Gao, Y.; Zhang, Q.; Zhou, H.; et al. DeST 3.0: A New-Generation Building Performance Simulation Platform. Build. Simul. 2022, 15, 1849–1868.
  21. Carnieletto, L.; Ferrando, M.; Teso, L.; Sun, K.; Zhang, W.; Causone, F.; Romagnoni, P.; Zarrella, A.; Hong, T. Italian Prototype Building Models for Urban Scale Building Performance Simulation. Build. Environ. 2021, 192, 107590.
  22. Liu, M.; Gou, Z. A Regional Domestic Energy Consumption Model Based on LoD1 to Assess Energy-Saving Potential. Adv. Eng. Inform. 2025, 65, 103247.
  23. Deng, Z.; Chen, Y.; Pan, X.; Peng, Z.; Yang, J. Integrating GIS-Based Point of Interest and Community Boundary Datasets for Urban Building Energy Modeling. Energies 2021, 14, 1049.
  24. Deng, Z.; Chen, Y.; Yang, J.; Chen, Z. Archetype Identification and Urban Building Energy Modeling for City-Scale Buildings Based on GIS Datasets. Build. Simul. 2022, 15, 1547–1559.
  25. Song, C.; Deng, Z.; Zhao, W.; Yuan, Y.; Liu, M.; Xu, S.; Chen, Y. Developing Urban Building Energy Models for Shanghai City with Multi-Source Open Data. Sustain. Cities Soc. 2024, 106, 105425.
  26. An, J.; Wu, Y.; Gui, C.; Yan, D. Chinese Prototype Building Models for Simulating the Energy Performance of the Nationwide Building Stock. Build. Simul. 2023, 16, 1559–1582.
  27. Peng, H.; Li, M.; Lou, S.; He, M.; Huang, Y.; Wen, L. Investigation on Spatial Distribution and Thermal Properties of Typical Residential Buildings in South China’s Pearl River Delta. Energy Build. 2020, 206, 109555.
  28. Xia, D.; Wu, Z.; Zou, Y.; Chen, R.; Lou, S. Developing a Bottom-up Approach to Assess Energy Challenges in Urban Residential Buildings of China. Front. Archit. Res. 2025, S209526352500041X.
  29. Yu, H.; Wang, M.; Lin, X.; Guo, H.; Liu, H.; Zhao, Y.; Wang, H.; Li, C.; Jing, R. Prioritizing Urban Planning Factors on Community Energy Performance Based on GIS-Informed Building Energy Modeling. Energy Build. 2021, 249, 111191.
  30. Alasmar, R.; Schwartz, Y.; Burman, E. Developing a Housing Stock Model for Evaluating Energy Performance: The Case of Jordan. Energy Build. 2024, 308, 114010.
  31. Li, X.; Yao, R.; Liu, M.; Costanzo, V.; Yu, W.; Wang, W.; Short, A.; Li, B. Developing Urban Residential Reference Buildings Using Clustering Analysis of Satellite Images. Energy Build. 2018, 169, 417–429.
  32. Bourdeau, M.; Zhai, X.Q.; Nefzaoui, E.; Guo, X.; Chatellier, P. Modeling and Forecasting Building Energy Consumption: A Review of Data-Driven Techniques. Sustain. Cities Soc. 2019, 48, 101533.
  33. Sheng, Y.; Arbabi, H.; Ward, W.O.C.; Álvarez, M.A.; Mayfield, M. City-Scale Residential Energy Consumption Prediction with a Multimodal Approach. Sci. Rep. 2025, 15, 5313.
  34. Jiang, Q.; Huang, C.; Wu, Z.; Yao, J.; Wang, J.; Liu, X.; Qiao, R. Predicting Building Energy Consumption in Urban Neighborhoods Using Machine Learning Algorithms. FURP 2024, 2, 6.
  35. Wang, R.; Lu, S.; Li, Q. Multi-Criteria Comprehensive Study on Predictive Algorithm of Hourly Heating Energy Consumption for Residential Buildings. Sustain. Cities Soc. 2019, 49, 101623.
  36. Liu, K.; Xu, X.; Zhang, R.; Kong, L.; Wang, W.; Deng, W. Impact of Urban Form on Building Energy Consumption and Solar Energy Potential: A Case Study of Residential Blocks in Jianhu, China. Energy Build. 2023, 280, 112727.
  37. Liu, K.; Xu, X.; Zhang, R.; Kong, L.; Wang, X.; Lin, D. An Integrated Framework Utilizing Machine Learning to Accelerate the Optimization of Energy-Efficient Urban Block Forms. Build. Simul. 2024, 17, 2017–2042.
  38. Wang, W.; Liu, K.; Zhang, M.; Shen, Y.; Jing, R.; Xu, X. From Simulation to Data-Driven Approach: A Framework of Integrating Urban Morphology to Low-Energy Urban Design. Renew. Energy 2021, 179, 2016–2035.
  39. Feng, Y.; Duan, Q.; Chen, X.; Yakkali, S.S.; Wang, J. Space Cooling Energy Usage Prediction Based on Utility Data for Residential Buildings Using Machine Learning Methods. Appl. Energy 2021, 291, 116814.
  40. Chen, P.; Wu, Y.; Zhong, H.; Long, Y.; Meng, J. Exploring Household Emission Patterns and Driving Factors in Japan Using Machine Learning Methods. Appl. Energy 2022, 307, 118251.
  41. Wei, Z.; Zhang, T.; Yue, B.; Ding, Y.; Xiao, R.; Wang, R.; Zhai, X. Prediction of Residential District Heating Load Based on Machine Learning: A Case Study. Energy 2021, 231, 120950.
  42. Qavidelfardi, Z.; Tahsildoost, M.; Zomorodian, Z.S. Using an Ensemble Learning Framework to Predict Residential Energy Consumption in the Hot and Humid Climate of Iran. Energy Rep. 2022, 8, 12327–12347.
  43. Luo, C.; Feng, C.; Zhong, H.; Liu, Y.; Dou, M. Design Optimization of Climate-Responsive Rural Residences in Solar Rich Areas Considering Sustainability and Occupant Comfort. Energy Build. 2025, 336, 115546.
  44. Ye, Z.; Cheng, K.; Hsu, S.-C.; Wei, H.-H.; Cheung, C.M. Identifying Critical Building-Oriented Features in City-Block-Level Building Energy Consumption: A Data-Driven Machine Learning Approach. Appl. Energy 2021, 301, 117453.
  45. Liu, Y.; Zhang, W.; Liu, W.; Tan, Z.; Hu, S.; Ao, Z.; Li, J.; Xing, H. Exploring the Seasonal Effects of Urban Morphology on Land Surface Temperature in Urban Functional Zones. Sustain. Cities Soc. 2024, 103, 105268.
  46. Zou, B.; Fan, C.; Li, J.; Wang, M.; Liao, Y.; Zhou, X. Assessing the Impact of Land Use Changes on Urban Heat Risk under Different Development Scenarios: A Case Study of Guangzhou in China. Sustain. Cities Soc. 2025, 130, 106532.
  47. Zhou, X.; Deng, S.; Cui, Y.; Fan, C. Developing a Co-Benefits Evaluation Model to Optimize Greening Coverage Designs on University Campuses in Hot and Humid Areas. Energy Build. 2025, 328, 115214.
  48. Cui, Y.; Fan, C.; Zhou, X.; Liao, Y. Analysis of Green Belt Designs in Mitigating Anthropogenic Heat Emission from Buildings by Considering Cooling Efficiency and Nurture Costs. Indoor Built Environ. 2024, 34, 438–459.
  49. Zou, B.; Fan, C.; Li, J. Quantifying the Influence of Different Block Types on the Urban Heat Risk in High-Density Cities. Buildings 2024, 14, 2131.
  50. JGJ 75-2003; Design Standard for Energy Efficiency of Residential Buildings in Hot Summer and Warm Winter Zone. Ministry of Housing and Urban-Rural Development of China (MOHURD) China Architecture & Building Press: Beijing, China, 2003.
  51. Zhang, X.; Chen, Z.; Yue, Y.; Qi, X.; Zhang, C.H. Fusion of Remote Sensing and Internet Data to Calculate Urban Floor Area Ratio. Sustainability 2019, 11, 3382.
  52. Song, S.; Leng, H.; Xu, H.; Guo, R.; Zhao, Y. Impact of Urban Morphology and Climate on Heating Energy Consumption of Buildings in Severe Cold Regions. IJERPH 2020, 17, 8354.
  53. Perera, A.T.D.; Javanroodi, K.; Nik, V.M. Climate Resilient Interconnected Infrastructure: Co-Optimization of Energy Systems and Urban Morphology. Appl. Energy 2021, 285, 116430.
  54. Ratti, C.; Baker, N.; Steemers, K. Energy Consumption and Urban Texture. Energy Build. 2005, 37, 762–776.
  55. Guangdong Construction Yearbook Compilation Committee. Guangdong Construction Yearbook 2015; Guangdong People’s Publishing House: Guangzhou, China, 2015; ISBN 978-7-218-10509-3.
  56. Wang, Z.; Zhou, R.; Rui, J.; Yu, Y. Revealing the Impact of Urban Spatial Morphology on Land Surface Temperature in Plain and Plateau Cities Using Explainable Machine Learning. Sustain. Cities Soc. 2025, 118, 106046.
  57. Li, Z.; Ma, J.; Jiang, F.; Zhang, S.; Tan, Y. Assessing the Impacts of Urban Morphological Factors on Urban Building Energy Modeling Based on Spatial Proximity Analysis and Explainable Machine Learning. J. Build. Eng. 2024, 85, 108675.
  58. Wang, J.; Ren, C. Analysis of residential energy consumption ratio from Guangzhou Asian Games City project. Water Wastewater Eng. 2011, 37, 92–96.
  59. Demir Dilsiz, A.; Ng, K.; Kämpf, J.; Nagy, Z. Ranking Parameters in Urban Energy Models for Various Building Forms and Climates Using Sensitivity Analysis. Build. Simul. 2023, 16, 1587–1600.
  60. Fan, C.; Zou, B.; Li, J.; Wang, M.; Liao, Y.; Zhou, X. Exploring the Relationship between Air Temperature and Urban Morphology Factors Using Machine Learning under Local Climate Zones. Case Stud. Therm. Eng. 2024, 55, 104151.
  61. Chen, Y.; Fan, C.; Nie, Y.; Wu, H.; Zhu, Y.; Liao, Y.; Li, H.; Wu, L.; Lao, M. Integrating Bottom-up Energy Model with Urban Parameterization for Fine-Scale Anthropogenic Heat Estimation. Case Stud. Therm. Eng. 2025, 73, 106627.
  62. Oh, S.; Ahn, H.; Bae, M.; Kang, J. Development and Analysis of Easy-to-Implement Green Retrofit Technologies for Windows to Reduce Heating Energy Use in Older Residential Buildings. Sustainability 2025, 17, 3307.
Figure 1. Research framework of energy consumption prediction for old residential buildings.
Figure 2. Satellite map of Guangzhou city and old town.
Figure 3. K-Means cluster analysis. (a): before K-Means; (b): after K-Means.
Figure 4. The process of generating the street surface of old residential areas.
Figure 5. Proportion of prototypes of old residential buildings in Guangzhou.
Figure 6. Annual energy consumption and cooling energy consumption of different prototypes.
Figure 7. Feature correlation matrix for building energy prediction.
Figure 8. Feature importance plots for building energy prediction.
Figure 9. SHAP decision plot for building energy prediction.
Figure 10. SHAP interaction summary plots: (a) SHAP value for SA; (b) SHAP value for NoB; (c) SHAP value for FAR; (d) SHAP value for BAC; (e) SHAP value for BSC; (f) SHAP value for ANF.
Figure 11. Visualization of energy consumption in old residential areas in Guangzhou.
Figure 12. Energy consumption at different scales: (a) 220 m × 220 m grids; (b) 500 m × 500 m grids; (c) 1 km × 1 km grids.
Figure 13. Spatial distribution of building energy consumption: (a) old residential areas from the 1980s; (b) old residential areas from the 1990s.
Figure 14. Multi-scale energy-saving and low-carbon renewal measures for old residential areas.
Table 1. Data sources.
Dataset Name | Data Source | Format | Attributes
2023 Guangzhou Building Footprint | Baidu Map (https://map.baidu.com/; accessed 15 May 2023) | Shapefile | Polygon data
Guangzhou Road Network | OpenStreetMap (https://www.openstreetmap.org/) | Shapefile | Line data
Residential Areas with Year Built | Anjuke (https://www.anjuke.com/) | Xlsx | Point data with year tags
Areas of interest | Amap (https://ditu.amap.com/) | Shapefile | Polygon data
Community Coordinates | Baidu Coordinate Picker (https://api.map.baidu.com/lbsapi/getpoint/) | Shapefile | Point data
Meteorological Data | EnergyPlus (https://energyplus.net/) | Epw | Climate parameters
Table 2. Prototype thermal parameter settings.
Parameters | Prototype A | Prototype B | Prototype C | Prototype D | Prototype E
Equipment power density (W/m²) | 4.3 | 4.3 | 4.3 | 4.3 | 4.3
Lighting power density (W/m²) | 7 | 7 | 7 | 7 | 7
Occupancy (people/m²) | 0.07 | 0.07 | 0.07 | 0.05 | 0.05
Exterior wall U-value (W/(m²·K)) | 2.47 | 2.47 | 2.47 | 1.62 | 1.62
Roof U-value (W/(m²·K)) | 1.8 | 1.8 | 1.8 | 1.66 | 1.66
Window U-value (W/(m²·K)) | 6.4 | 6.4 | 6.4 | 4.99 | 4.99
Window SHGC | 0.85 | 0.85 | 0.85 | 0.85 | 0.85
Room Air Conditioner SEER | 2.7 | 2.7 | 2.7 | 2.7 | 2.7
Cooling/heating setpoints (°C) | 26/16 | 26/16 | 26/16 | 26/16 | 26/16
Table 3. Parameter calculation formula.
Variable | Definition | Formula | Reference
Building Serial Number (SN) | Unique identifier of a building, used to distinguish and mark different buildings. | - | -
Site Area (SA) | Refers to the area of the plot of land on which the building belongs. | - | -
Average number of building floors (ANF) | The average number of floors in a building’s total height, usually expressed as an integer or decimal, used to estimate the building’s mass. | $\mathrm{ANF} = \frac{\mathrm{FAR}}{\mathrm{BCR}}$ | [51]
Number of Buildings (NoB) | The number of buildings within the site calculation unit. | $\mathrm{NoB} = \text{Count of Buildings in the Area}$ | -
Building Coverage Ratio (BCR) | The ratio of building area to total land area. | $\mathrm{BCR} = \frac{\text{building area}}{\text{site area}}$ | [52]
Floor Area Ratio (FAR) | The ratio of the total building area in a region to the total area of the region. | $\mathrm{FAR} = \frac{\text{building floor area}}{\text{site area}}$ | [53]
Building Shape Coefficient (BSC) | Used to measure the complexity of a building’s outline; the ratio of the building’s surface area to the unit building volume. | $\mathrm{BSC} = \frac{\sum_{i}^{n} S_i}{\sum_{i}^{n} V_i}$ | [54]
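As an illustration of the definitions in Table 3, the morphological parameters of a single site can be computed as in the sketch below. It is a minimal example under stated assumptions: the function `block_morphology`, the record keys, and the sample values are hypothetical, and the total floor area of each building is assumed to equal its footprint multiplied by its number of floors.

```python
def block_morphology(site_area, buildings):
    """Compute NoB, BCR, FAR, ANF, and BSC for one site.

    `buildings` is a list of dicts with hypothetical keys:
    footprint (m2), floors, surface_area (m2), volume (m3).
    Assumption: floor area of a building = footprint x floors.
    """
    footprint_total = sum(b["footprint"] for b in buildings)
    floor_area_total = sum(b["footprint"] * b["floors"] for b in buildings)
    bcr = footprint_total / site_area   # building area / site area
    far = floor_area_total / site_area  # building floor area / site area
    return {
        "NoB": len(buildings),
        "BCR": bcr,
        "FAR": far,
        "ANF": far / bcr,               # ANF = FAR / BCR
        "BSC": sum(b["surface_area"] for b in buildings)
        / sum(b["volume"] for b in buildings),
    }


# Made-up site with two box-shaped buildings, for illustration only.
sample_buildings = [
    {"footprint": 500.0, "floors": 6, "surface_area": 2120.0, "volume": 9000.0},
    {"footprint": 300.0, "floors": 8, "surface_area": 1980.0, "volume": 7200.0},
]
metrics = block_morphology(site_area=10000.0, buildings=sample_buildings)
for name, value in metrics.items():
    print(name, round(value, 4))
```

Because ANF is derived as FAR divided by BCR, it is a footprint-weighted average of floor counts rather than a simple mean across buildings.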
Table 4. Prototype categories of old residential buildings.
Prototype Category | Number | Building Aspect Ratio | Length × Width
Prototype A | 9708 | 3.10 | 42 × 17
Prototype B | 2394 | 4.01 | 78 × 26
Prototype C | 16,114 | 1.94 | 21 × 12
Prototype D | 11,057 | 2.09 | 30 × 17
Prototype E | 3575 | 3.21 | 73 × 27
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Fan, C.; Liu, R.; Liao, Y. Archetype Identification and Energy Consumption Prediction for Old Residential Buildings Based on Multi-Source Datasets. Buildings 2025, 15, 2573. https://doi.org/10.3390/buildings15142573
