Balancing Solar Energy, Thermal Comfort, and Emissions: A Data-Driven Urban Morphology Optimization Approach

Bian, Chenhang; Hu, Panpan; Li, Chun Yin; Lee, Chi Chung; Chen, Xi

doi:10.3390/en18133421


by Chenhang Bian ¹, Panpan Hu ¹, Chun Yin Li ¹, Chi Chung Lee ¹,* and Xi Chen ²,*

¹ School of Science and Technology, Hong Kong Metropolitan University, Hong Kong, China

² Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong, China

* Authors to whom correspondence should be addressed.

Energies 2025, 18(13), 3421; https://doi.org/10.3390/en18133421

Submission received: 21 May 2025 / Revised: 26 June 2025 / Accepted: 27 June 2025 / Published: 29 June 2025

(This article belongs to the Section B: Energy and Environment)


Abstract

Urban morphology critically shapes environmental performance, yet few studies integrate multiple sustainability targets within a unified modeling framework for its design optimization. This study proposes a data-driven, multi-scale approach that combines parametric simulation, artificial neural network-based multi-task learning (MTL), SHAP interpretability, and NSGA-II optimization to assess and optimize urban form across 18 districts in Hong Kong. Four key sustainability targets—photovoltaic generation (PVG), accumulated urban heat island intensity (AUHII), indoor overheating degree (IOD), and carbon emission intensity (CEI)—were jointly predicted using an artificial neural network-based MTL model. The prediction results outperform single-task models, achieving R² values of 0.710 (PVG), 0.559 (AUHII), 0.819 (IOD), and 0.405 (CEI), respectively. SHAP analysis identifies building height, density, and orientation as the most important design factors, revealing trade-offs between solar access, thermal stress, and emissions. Urban form design strategies are informed by the multi-objective optimization, with the optimal solution featuring a building height of 72.11 m, building centroid distance of 109.92 m, and east-facing orientation (183°). The optimal configuration yields the highest PVG (55.26 kWh/m²), lowest CEI (359.76 kg/m²/y), and relatively acceptable AUHII (294.13 °C·y) and IOD (92.74 °C·h). This study offers a balanced path toward carbon reduction, thermal resilience, and renewable energy utilization in compact cities for either new town planning or existing district renovation.

Keywords:

urban morphology factors; thermal and energy environment; multi-task learning model; SHAP method; nonlinear relationship

1. Introduction

1.1. Background

Over the past three years, global environmental changes have accelerated at an alarming rate. According to the Copernicus Climate Change Service, 2024 marked the warmest year on record, with global mean temperatures exceeding the pre-industrial baseline by more than 1.5 °C for the first time [1]. Over the same period, the World Meteorological Organization reported that glacier mass loss between 2021 and 2024 reached historically unprecedented levels, while severe marine heatwaves triggered the most extensive coral bleaching event on record, affecting over 84% of coral reef areas worldwide [2]. In response, the global renewable energy sector experienced a surge in deployment, with 585 GW of new capacity installed in 2024 alone, accounting for more than 90% of total energy additions [3]. National and regional governments intensified their climate commitments, including updated nationally determined contributions (NDCs), enhanced renewable energy targets, and urban adaptation programs [4]. China’s renewable energy plan aims to increase annual renewable energy consumption to 1 billion tons of standard coal equivalent by 2025 and 5 billion tons by 2030 [5]. Hong Kong has also set ambitious goals, such as Climate Action Plan 2050, which focuses on net-zero electricity generation, energy-saving and green buildings, green transport, and waste reduction [6].

Despite these advances, urban areas, where over half of the global population now resides, face escalating challenges. The intensification of the urban heat island effect, particularly in densely built environments, has increased the vulnerability of populations to extreme heat, leading to heightened public health risks, increased cooling energy demands, and exacerbated socioeconomic inequalities [7]. In parallel, demographic shifts have further reshaped urban landscapes. The global population surpassed 8.2 billion by 2024, with growth concentrated in emerging economies, intensifying pressure on urban ecosystems and infrastructure [8]. Therefore, there is an urgent need to systematically investigate the integrated impacts of urban form on environmental sustainability outcomes. Such insights are essential not only for informing future urban planning and architectural design but also for guiding policy interventions aimed at achieving net-zero and climate-resilient cities, as well as the transformation of existing urban areas to better adapt to environmental changes.

1.2. Literature Review

1.2.1. Single-Factor Studies

Urban morphology has long been recognized as a fundamental determinant of urban environmental quality. Early research typically adopted a single-factor, single-task approach to reveal how one morphological attribute influences a specific environmental metric. For example, Stewart & Oke [9] and Zuo et al. [10] found that building footprint density significantly affects land surface temperature (LST) across different functional zones. Natanian & Auer [11] analyzed the relationship between street layout and natural ventilation, while Yuan et al. [12] emphasized the dominant role of 3D metrics—such as average building height—in regulating LST. Chen et al. [13] identified threshold effects of vegetation-to-building volume differences on urban heat island intensity (UHII) during heatwave events. At the street scale, Hong et al. [14] and Zhu et al. [15] demonstrated that building coverage, green space, and surface albedo significantly influence nighttime street-level UHII and LST. Liu et al. [16] pointed out that parameters such as sky view factor or building height may exert opposite influences across different local climate zones (LCZs), highlighting the need for context-sensitive planning strategies. Zhang et al. [17] focused on heating and cooling loads and energy optimization for different building typologies through parametric simulation and machine learning models. Jia et al. [18] extended the understanding of indoor comfort by quantifying the short-term effects of thermal sensation and CO₂ concentration in Hong Kong. However, these studies concentrate on isolated environmental metrics (e.g., UHI, LST, ventilation efficiency, or energy consumption), employing single-task learning models for analysis or prediction without accounting for interdependencies among multiple metrics—such as energy use, solar potential, indoor thermal comfort, and carbon emissions—and thus overlook the multi-objective nature essential for sustainable urban design.

1.2.2. Comprehensive Multi-Factor Analysis of Urban Morphology

Since the 2010s, driven by the push for urban sustainability and climate action, integrated multi-factor analyses of urban form have become more prevalent. Advances in remote sensing and GIS have enabled the quantification of both 2D and 3D morphological indicators at citywide scales and their linkage to diverse performance outcomes. For instance, Kim et al. [19] leveraged GIS coupled with satellite thermal data to relate indicators such as floor area ratio, green cover, and sky view factor to urban thermal conditions, while Tian et al. [20] evaluated how 3D metrics (e.g., building volume and configuration) influence carbon mitigation strategies. It is now well established that a city’s layout of buildings, open spaces, and infrastructure directly affects urban heat island intensity, carbon emission patterns, indoor thermal comfort, and the feasibility of renewable energy integration [21]. A systematic review covering 258 studies identified critical morphological factors: higher building density, coverage ratio, and canyon aspect ratio tend to reduce solar irradiance at façades, whereas a larger sky view factor and strategic orientation enhance photovoltaic potential [22]. Urban form also governs indoor overheating risk [23]; for example, Hamdy et al. [24] proposed an operative-temperature-based IOD index for standard indoor comfort assessment, and in hot–humid climates, alternative formulations using Heat Index have been introduced for extreme-heat resilience analysis [25]. Although two-dimensional indicators (e.g., building density, land use) have been widely examined, there remains a lack of comprehensive frameworks that integrate and interpret complex three-dimensional morphological attributes (such as building height, centroid spacing, and sky view factor) to predict and optimize performance outcomes in dense urban environments.

1.2.3. Machine Learning-Based Approaches and Optimization Strategies

The emergence of data-driven methods has introduced machine learning (ML) techniques into urban environment modeling, significantly accelerating research in this domain. Algorithms such as Random Forests, Support Vector Machines, and Artificial Neural Networks have been widely applied to predict building energy consumption [26], urban heat island intensity [27], and renewable energy generation [28]. For example, Zheng et al. [29] quantified 26 morphological indicators affecting building energy use using spatial proximity metrics and explainable AI (SHAP), uncovering the consistent high-impact features of different building types. Li et al. [30] combined Graph Neural Networks (GNNs) with a Genetic Algorithm (GA) to optimize peak-hour traffic carbon emissions in Shanghai via urban form improvements. Zhang et al. [31] employed machine learning models including Random Forest and XGBoost to investigate the influence of 3D morphological parameters on street-level PM₂.₅ in Shenzhen. Liu et al. [32] used Gradient Boosting with SHAP analysis to uncover the nonlinear impacts of built environment attributes on urban heat resilience in Beijing. Chen et al. [33] developed a GCN-LSTM model to predict rooftop photovoltaic potential, explicitly incorporating spatial shading effects. Xu et al. [34] proposed a Functional–Spatial–Temporal GCN (FST-GCN) to estimate the near-surface air temperature across multiple Chinese cities. Okumus and Akay [35] and Li et al. [36] demonstrated the region-specific nonlinear influences of morphological metrics (such as ground area ratio and road network structure) on urban heat and carbon emissions using XGBoost-SHAP and geographically weighted regression models. These ML-based approaches generally achieve higher predictive accuracy than traditional statistical models; however, they predominantly focus on single targets, and most remain at an evaluation stage, offering limited guidance for multi-objective decision-making in urban design.

1.3. Research Gap and Objectives

Although considerable progress has been made in examining urban morphology through single-factor, multi-factor, and ML-based studies, an integrated framework that simultaneously captures the interdependencies among multiple environmental and energy metrics is still lacking. Prior works have often treated metrics (such as energy consumption, urban heat island intensity, solar potential, indoor thermal comfort, and carbon emissions) in isolation or within single-task models, failing to fully leverage 3D morphological attributes in a unified, multi-objective optimization-oriented context. To address these gaps, this study aims to develop a novel integrated framework that combines parametric urban modeling, multi-task learning (MTL) for the joint prediction of key interdependent environmental indicators (e.g., the energy demand, urban heat island intensity, solar potential), SHAP analysis, and multivariate optimization to evaluate and determine the optimal environmental performance under varied urban morphology.

The rest of the paper is organized as follows. Section 2 describes the research framework, including the study area, data sources, urban morphology indicators, simulation methods, and model development. Section 3 presents the results, including model performance, feature importance, optimization, and clustering analysis. Section 4 provides the main conclusions, discusses limitations, and suggests directions for future research.

2. Methodology

2.1. Study Area and Data Sources

The study focuses on the 18 districts of Hong Kong as the research area (Figure 1), owing to the city's high-density urban development, steep terrain, and complex microclimatic conditions. As a compact, high-rise metropolis in a subtropical coastal setting, Hong Kong exhibits significant variations in solar access, ventilation, and thermal environments across districts. These characteristics make it an ideal context for investigating the interactions between urban morphology and environmental performance, particularly in terms of energy demand, thermal stress, and carbon emissions. In recent years, Hong Kong has actively promoted green building standards, photovoltaic integration, and carbon neutrality goals through policy instruments such as the Feed-in Tariff (FiT) Scheme [37], the Energy Saving Plan [38], and the Climate Action Plan 2050 [39]. This supportive regulatory and technological landscape underscores the practical necessity of incorporating urban microclimate and morphological dynamics into energy and sustainability simulations. The analysis is restricted to residential buildings, with data collected from all 18 districts to cover various urban block morphologies. The building footprints were classified into five generations based on construction year ranges (Generation 1: before 1986; Generation 2: 1986–1992; Generation 3: 1992–2003; Generation 4: 2003–2013; Generation 5: after 2013), and their spatial distribution was mapped across the 18 Hong Kong districts. The map is provided as Figure 2.

The geospatial datasets were acquired from multiple reliable sources, including Landsat 8 imagery accessed via Google Earth Engine (GEE) [40] and Hong Kong geospatial data (HK Geodata). A representative typical meteorological year supplied the meteorological inputs for the simulation [41]. Historical building data were categorized into five construction periods to account for variances in building materials, construction methods, HVAC systems, window-to-wall ratios (WWR), and U-values. Urban morphological indicators, including building density (BD), average building height (aBH), mean building centroid distance (mBCD), building shape coefficient (BSC), sky view factor (SVF), floor area ratio (FAR), plot size (PS), aspect ratio (AR), compactness (CN), building orientation (BO), vegetation coverage ratio (VCR), and vegetation albedo (Alb_v), were extracted through integrated processing using QGIS, Rhino, and Grasshopper.

Urban morphological data—including building footprints, heights, road networks, vegetation coverage, and land use types—were obtained from the Hong Kong Geodata Portal [42], a public platform for exploring and downloading open spatial data (https://geodata.gov.hk/ (accessed on 24 April 2025)). Information on building construction year and architectural typology was sourced from two official repositories: BRAVO [43]—online building records (https://bravo.bd.gov.hk (accessed on 24 April 2025)), and the list of Public Housing Estates in Hong Kong [44].

All boundary conditions were assembled from peer-reviewed sources that cover Hong Kong’s public-housing stock, its urban microclimate, and local photovoltaic yield. The envelope thermophysical properties, internal load densities, HVAC efficiencies, and infiltration rates for the five housing “generations” were extracted directly from the standard-block analysis of Wang et al. [45] and cross-checked against Xue et al.’s [46] historical survey of public buildings (1960–2006). These two studies supply the U-values, SHGC, WWR, lighting/equipment densities, and split-unit COPs shown in Table 1; when ranges were reported, their mid-points were adopted as baseline inputs and the extremes were used for sensitivity tests. The baseline standards were referenced to ASHRAE Standard 90.1 [47] and the Building Energy Code (BEC) of the Electrical and Mechanical Services Department (EMSD) of Hong Kong [48,49]. Urban-microclimate parameters, vegetation coverage ratio, and albedo were calculated from Landsat 8 imagery [50] before being imported into Dragonfly. The PV generation settings were as follows: module efficiency = 15%, temperature coefficient = −0.5% K⁻¹, and inverter efficiency = 85%.

2.2. Urban Morphology Indicators and Targets

Urban morphology plays a pivotal role in shaping the environmental performance of built environments. Specifically, indicators such as aBH, SVF, and BO directly influence solar exposure and shading patterns, thereby affecting PVG. Parameters including BD, mBCD, and VCR are closely linked to urban ventilation and thermal accumulation, exerting substantial influence on the outdoor microclimate and thus on AUHII. For indoor thermal comfort, variables like PS, BSC, and AR regulate the spacing and form of buildings, which impact solar penetration, heat retention, and cooling loads—key determinants of IOD. In terms of CEI, the indicators FAR, CN, and Alb_v reflect the overall spatial intensity, thermal inertia, and reflectivity of urban materials, which together influence both operational energy demand and environmental load.

In this study, 12 types of urban morphology indicators were analyzed, including BD, aBH, mBCD, BSC, SVF, FAR, PS, AR, CN, BO, VCR, and Alb_v, as shown in Table 2. These parameters were automatically computed using parametric modeling workflows in Grasshopper, ensuring spatial consistency and high-resolution morphological characterization across all sampled districts.

To identify the most accurate urban dataset among the 18 districts for renewable energy, thermal comfort, microclimate, and carbon simulation, this study examined four key sustainability targets to comprehensively assess the environmental performance of urban form. The four objective functions were as follows:

Photovoltaic generation (PVG) represents the annual photovoltaic electricity generation potential per unit roof area (kWh/m²), as shown in (Equation (1)). It is simulated using the Ladybug-Radiance plugin in Grasshopper, based on hourly solar radiation, surface tilt, orientation, and local shading conditions. The model considers rooftop and facade availability to estimate PV output:

$$PVG = S \times E_s \times \eta \qquad (1)$$

where $S$ represents the active area of the photovoltaic panels utilized for electricity generation (m²), $E_s$ denotes the solar energy received by the building surface within a specific time period (kWh/m²), and $\eta$ signifies the efficiency of the PV system in converting solar radiation into electrical energy (%).
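As a worked illustration of Equation (1), the sketch below evaluates the product for one roof. The input values are hypothetical, and combining the module and inverter efficiencies from Section 2.1 into a single efficiency term is an assumption made here for illustration; the temperature coefficient is ignored. Dividing the result by the active area gives a per-square-metre figure.

```python
def pv_generation(active_area_m2: float, irradiation_kwh_m2: float, efficiency: float) -> float:
    """Equation (1): PVG = S * Es * eta, returned in kWh for the period covered by Es."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be a fraction between 0 and 1")
    return active_area_m2 * irradiation_kwh_m2 * efficiency


# Hypothetical inputs for illustration only (not values taken from the study).
roof_area = 200.0      # S: active panel area, m2
irradiation = 1300.0   # Es: solar energy received over the year, kWh/m2

# Assumption: eta is the product of module (15%) and inverter (85%) efficiency.
eta = 0.15 * 0.85

annual_kwh = pv_generation(roof_area, irradiation, eta)
annual_kwh_per_m2 = annual_kwh / roof_area
```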

Accumulated urban heat island intensity (AUHII) [51] quantifies the annual cumulative intensity of the urban heat island effect (°C·y), defined as the temperature difference between urban and rural areas over time, as shown in (Equation (2)). It is derived from the Dragonfly plugin in Grasshopper.

$$AUHII = \sum_{t=1}^{T} \max\left(0,\; T_{urban,n}(t) - T_{rural,n}(t)\right) \qquad (2)$$

where $T_{urban,n}$ refers to the urban temperature and $T_{rural,n}$ refers to the rural temperature; $T$ is the total number of time steps in the analysis period, and $n$ is the weather station index.

Indoor Overheating Degree (IOD) [25] quantifies the cumulative indoor heat-stress exposure (°C·h) by summing, over all time steps, the amount by which the Heat Index (rather than the operative temperature) exceeds a critical threshold, as shown in Equation (3). This formulation better captures combined temperature–humidity stress under extreme conditions. It is computed using the EnergyPlus simulation engine via Honeybee:

$$IOD = \sum_{i=1}^{N} \max\left(0,\; HI_i - HI_{CL}\right) \qquad (3)$$

where $HI_i$ refers to the value of the heat index at moment $i$, while $HI_{CL}$ represents the critical level of the heat index.
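Equations (2) and (3) share one structure: only positive exceedances above a baseline are accumulated. The sketch below shows that shared computation on short hypothetical series. The temperatures, heat index values, and the critical level of 32 are illustrative assumptions, not values from the study; in the paper these series come from Dragonfly and EnergyPlus simulations.

```python
from typing import Sequence


def accumulated_exceedance(values: Sequence[float], baseline: Sequence[float]) -> float:
    """Sum of max(0, value - baseline) over paired time steps."""
    if len(values) != len(baseline):
        raise ValueError("series must have the same length")
    return sum(max(0.0, v - b) for v, b in zip(values, baseline))


def auhii(t_urban: Sequence[float], t_rural: Sequence[float]) -> float:
    """Equation (2): accumulated positive urban-rural temperature difference."""
    return accumulated_exceedance(t_urban, t_rural)


def iod(heat_index: Sequence[float], critical_level: float) -> float:
    """Equation (3): accumulated heat index excess over a fixed critical level."""
    return accumulated_exceedance(heat_index, [critical_level] * len(heat_index))


# Hypothetical four-step series for illustration only.
t_urban = [28.0, 29.5, 31.0, 27.0]
t_rural = [27.0, 29.0, 31.5, 27.0]
heat_index = [30.0, 33.5, 36.0, 31.0]

auhii_value = auhii(t_urban, t_rural)   # the third step is negative and contributes 0
iod_value = iod(heat_index, 32.0)       # 32.0 is an assumed critical level
```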

Carbon Emission Intensity (CEI) (kg/m²·y) [25] reflects annual building-related carbon emissions per unit floor area, as shown in (Equation (4)). It is based on operational energy consumption simulated in EnergyPlus and converted into emissions using local emission factors:

$$CEI = EUI \times GFA \times f \times (VCR \times A) \times f_n \qquad (4)$$

where $EUI$ refers to energy use intensity; $GFA$ refers to gross floor area (m²); $f$ refers to the electrical power carbon emission factor (0.801 kg/kWh); $VCR$ refers to the vegetation coverage ratio; $A$ refers to plant area; and $f_n$ refers to the carbon sink factor of the planting method (0.112 kg CO₂/m²/y).

2.3. Development of ANN-Based Multi-Task Learning Model

Multi-task learning (MTL) is a machine learning paradigm where a single model is trained to solve multiple related tasks simultaneously by sharing common hidden representations while maintaining task-specific outputs. Architecturally, MTL networks typically consist of shared input layers followed by branched output heads. During training, subsets of network parameters are updated jointly across tasks, inducing an inductive bias that promotes the learning of generalized features relevant to multiple objectives [52]. The fundamental hypothesis underpinning MTL is that learning multiple correlated tasks in parallel acts as a form of regularization, reducing overfitting and enhancing model generalization, particularly in domains with limited labeled data. In contrast, single-task learning (STL) models train an independent network for each target, thereby missing opportunities to exploit task synergies and often requiring substantially larger training data to achieve comparable performance.

Drawing from the categorization of MTL architectures described by Meyer [53], this study develops an artificial neural network-based multi-task learning (ANN-MTL) model. The structure of the model is shown in Figure 3. The input layer accepts a tensor of shape (batch_size, 12), representing the 12 urban morphology features (BD, aBH, mBCD, BSC, SVF, FAR, PS, AR, CN, BO, VCR, Alb_v). Before model training, the input features were subjected to a standardized preprocessing pipeline with random-seed control for reproducibility, involving log1p transformation to stabilize variance, outlier removal using the interquartile range (IQR) method, and z-score normalization fitted on the training data and applied consistently to validation/test sets. The resulting feature vectors were then fed into a shared hidden representation network, comprising a dense layer with 128 neurons (ReLU activation), yielding output tensors of shape (batch_size, 128). This was followed by a dropout layer (rate = 0.3) and a subsequent dense layer with 64 neurons (ReLU activation), which reduces the shared representation to shape (batch_size, 64). Following the shared layers, the network diverged into four task-specific branches, each tailored to a distinct target variable: photovoltaic generation (PVG), accumulated urban heat island intensity (AUHII), indoor overheating degree (IOD), and carbon emission intensity (CEI). Each branch was composed of additional dense and dropout layers, optimized according to the complexity of each task. The final outputs of each branch were single-node dense layers corresponding to continuous regression targets. Training was configured with the Adam optimizer (learning rate = 1 × 10⁻³), batch size = 32, EarlyStopping and ModelCheckpoint callbacks, and MSE losses with appropriate task-specific weights to balance learning.
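To make the tensor shapes concrete, the following framework-free sketch traces one inference pass through a shared trunk (12 → 128 → 64) and four single-output heads. It is not the authors' implementation: the weights are random, dropout is omitted because it acts as the identity at inference time, and the head width of 32 units (`head_units`) is a hypothetical choice, since the text does not specify the branch layout.

```python
import random

TASKS = ("PVG", "AUHII", "IOD", "CEI")


def init_layer(n_in, n_out, rng):
    """Return (weights, bias); weights holds one length-n_in vector per output unit."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(batch, weights, bias, relu=True):
    """Fully connected layer applied row by row, with optional ReLU."""
    out = []
    for row in batch:
        z = [sum(x * w for x, w in zip(row, unit)) + b for unit, b in zip(weights, bias)]
        out.append([max(0.0, v) for v in z] if relu else z)
    return out


def build_model(seed=0, head_units=32):
    """Shared trunk 12 -> 128 -> 64, then one small head per task (head size assumed)."""
    rng = random.Random(seed)
    shared = [init_layer(12, 128, rng), init_layer(128, 64, rng)]
    heads = {t: [init_layer(64, head_units, rng), init_layer(head_units, 1, rng)] for t in TASKS}
    return {"shared": shared, "heads": heads}


def forward(model, batch):
    """Inference pass: shared layers first, then each task-specific branch."""
    h = batch
    for weights, bias in model["shared"]:
        h = dense(h, weights, bias)
    outputs = {}
    for task, (hidden, final) in model["heads"].items():
        z = dense(h, *hidden)
        outputs[task] = dense(z, *final, relu=False)  # linear output for regression
    return outputs


rng = random.Random(42)
batch = [[rng.random() for _ in range(12)] for _ in range(4)]  # (batch_size=4, 12)
model = build_model()
predictions = forward(model, batch)
shapes = {task: (len(rows), len(rows[0])) for task, rows in predictions.items()}
```

Each entry in `shapes` is `(batch_size, 1)`, matching the single-node regression output described for every branch.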

2.4. Evaluation Criteria and SHAP Explanation

The evaluation of model performance was based on three complementary metrics: mean squared error (MSE), root mean square error (RMSE), and the coefficient of determination (R²) [54]. These indicators jointly offer a comprehensive perspective on the accuracy, stability, and explanatory power of the developed MTL model. Lower values of MSE and RMSE indicate higher predictive accuracy, while a higher R² value, approaching 1, signifies stronger explanatory power of the model. These metrics were computed on an independent validation dataset (20% of the original samples) to ensure the unbiased evaluation of model generalization.
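For reference, the three metrics can be computed from paired true and predicted values as in the short sketch below. The sample values are hypothetical and serve only to show the arithmetic.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, and R2 for paired true and predicted values."""
    n = len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    mse = ss_res / n
    return {"MSE": mse, "RMSE": math.sqrt(mse), "R2": 1.0 - ss_res / ss_tot}


# Hypothetical validation values for illustration only.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
```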

To further enhance the interpretability of the model predictions, SHapley Additive exPlanations (SHAP) [55] were utilized. SHAP provides a theoretically consistent framework based on cooperative game theory, quantifying the contribution of each input feature to the model output by evaluating marginal contributions across all possible feature subsets. KernelExplainer [56] was specifically employed due to its flexibility with different model types, providing detailed insights into feature interactions and nonlinear effects.

During training, the MTL model was optimized by minimizing a composite loss function, defined as the weighted sum of task-specific MSE losses across the four target variables. Task-specific weights were introduced to account for differences in target value scales and relative importance. Model optimization was carried out using the Adam optimizer with an initial learning rate of 0.001, and early stopping was applied based on the validation loss to prevent overfitting and enhance generalization ability.

Following the NSGA-II multi-objective optimization (Section 2.5), the generated Pareto-optimal solutions underwent clustering analysis to derive the representative urban development strategies. The K-means clustering algorithm [57] was used to classify solutions into distinct typologies based on urban morphological indicators and associated performance outcomes. Each typology represented a specific combination of urban features optimized toward particular sustainability objectives. Statistical analysis identified characteristic patterns and performance trade-offs within each cluster, enabling clear delineation and interpretation.
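The composite training loss described in this section, a weighted sum of per-task MSE terms, can be sketched as follows. The weights and sample values are hypothetical, since the text does not report the task weights actually used.

```python
def composite_loss(y_true, y_pred, task_weights):
    """Weighted sum of per-task mean squared errors."""
    total = 0.0
    for task, weight in task_weights.items():
        squared_errors = [(t - p) ** 2 for t, p in zip(y_true[task], y_pred[task])]
        total += weight * sum(squared_errors) / len(squared_errors)
    return total


# Hypothetical weights and values for illustration only.
weights = {"PVG": 1.0, "AUHII": 1.0, "IOD": 0.5, "CEI": 2.0}
y_true = {"PVG": [1.0, 2.0], "AUHII": [0.0, 0.0], "IOD": [2.0, 2.0], "CEI": [1.0, 1.0]}
y_pred = {"PVG": [1.0, 3.0], "AUHII": [1.0, 1.0], "IOD": [2.0, 4.0], "CEI": [1.0, 1.0]}

loss = composite_loss(y_true, y_pred, weights)
```

Raising a task's weight makes its errors count more in the shared-layer updates, which is one way to offset differences in target scale.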

2.5. Framework

To comprehensively assess the environmental impacts of urban morphology and identify optimized urban form strategies, this study proposes a multi-source analytical framework. As illustrated in Figure 4, the framework integrates four main components: (1) urban morphology data acquisition and processing, (2) multi-dimensional environmental performance simulation, (3) interpretable multi-task learning model development, and (4) multi-objective optimization and clustering analysis. First, 12 key urban morphological parameters, including building density (BD), average building height (aBH), sky view factor (SVF), and vegetation coverage ratio (VCR), were extracted using a parametric modeling workflow built in QGIS [58], Rhino, and Grasshopper [59]. Second, environmental performance indicators were computed using physics-based simulation engines such as Radiance, Dragonfly, and Honeybee. Third, a multi-task learning (MTL) model was developed to predict all four indicators simultaneously, using shared feature representation and task-specific output branches. To enhance interpretability, SHAP (SHapley Additive exPlanations) analysis was applied to quantify the importance and interaction effects of morphological features across tasks. Finally, the trained MTL model was embedded into an NSGA-II multi-objective optimization framework to identify the Pareto-optimal urban form solutions that balance trade-offs among PVG, AUHII, IOD, and CEI. The resulting solutions were then clustered using K-means to extract typical morphological strategies. This integrated workflow enables not only the accurate prediction of performance outcomes but also actionable urban design guidance that aligns energy, thermal comfort, and carbon goals. NSGA-II is suitable for high-dimensional and complex urban design problems and can identify well-balanced trade-offs between objectives [60].
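The notion of Pareto optimality that NSGA-II relies on can be illustrated with a small non-dominated filter. This sketch covers only the dominance test, not the full NSGA-II algorithm (no crowding distance, crossover, or mutation). It assumes PVG is maximized and the other three targets are minimized, and the candidate values are hypothetical.

```python
# Direction of each objective: +1 means higher is better, -1 means lower is better.
DIRECTIONS = {"PVG": +1, "AUHII": -1, "IOD": -1, "CEI": -1}


def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = all(DIRECTIONS[k] * a[k] >= DIRECTIONS[k] * b[k] for k in DIRECTIONS)
    better = any(DIRECTIONS[k] * a[k] > DIRECTIONS[k] * b[k] for k in DIRECTIONS)
    return no_worse and better


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates if other is not c)]


# Hypothetical predicted performance of three urban forms, for illustration only.
candidates = [
    {"name": "A", "PVG": 55.0, "AUHII": 290.0, "IOD": 90.0, "CEI": 360.0},
    {"name": "B", "PVG": 50.0, "AUHII": 300.0, "IOD": 95.0, "CEI": 370.0},
    {"name": "C", "PVG": 45.0, "AUHII": 250.0, "IOD": 100.0, "CEI": 380.0},
]

front_names = [c["name"] for c in pareto_front(candidates)]
```

Here B is worse than A on all four objectives and is dropped, while C survives because its lower AUHII is a trade-off that A cannot match.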

3. Results and Discussion

3.1. Urban Morphology Input Parameters

Prior to the development of the machine learning model, all 12 urban form indicators underwent a standardized preprocessing process to ensure consistency, comparability, and model robustness. Figure 5 presents the distributions of standardized urban morphological features used in this study, highlighting variations across the investigated parameters. BD, aBH, and mBCD show relatively balanced distributions, indicating varied urban densities and forms within the dataset. BSC exhibits a highly skewed distribution, predominantly concentrated at lower values, suggesting a preference or prevalence of more compact urban forms in Hong Kong. SVF and VCR distributions are approximately normal, reflecting balanced variations in openness and vegetation coverage within urban environments. FAR and PS demonstrate skewed distributions, implying a higher frequency of densely built-up plots with limited open space. AR, CN, and Alb_v show varying degrees of skewness and multimodality, signifying diverse building layouts, compactness, and vegetation reflectivity levels. Lastly, BO indicates a relatively uniform distribution, illustrating the randomness and diverse directional arrangements of buildings in Hong Kong. For indicators that presented significant skewness, log1p transformation was selectively applied to reduce the skewness and stabilize the variance.
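The preprocessing steps for a single skewed feature (log1p transformation, IQR-based outlier removal, and z-score scaling fitted on the training data) can be sketched as below. The 1.5 × IQR fence is the conventional default and is an assumption here, as are the sample values; the study's exact thresholds are not stated.

```python
import math
import statistics


def iqr_bounds(values, k=1.5):
    """Lower and upper fences at k * IQR beyond the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def fit_preprocessor(train_values):
    """Fit on training data only: log1p, drop IQR outliers, then record mean and std."""
    logged = [math.log1p(v) for v in train_values]
    low, high = iqr_bounds(logged)
    kept = [v for v in logged if low <= v <= high]
    return {"low": low, "high": high,
            "mean": statistics.fmean(kept), "std": statistics.pstdev(kept)}


def transform(values, params):
    """Apply log1p and the training-set z-score to any split, without refitting."""
    return [(math.log1p(v) - params["mean"]) / params["std"] for v in values]


# Hypothetical feature values with one extreme sample, for illustration only.
train = [0.8, 1.0, 1.2, 1.1, 0.9, 1.3, 1.0, 25.0]
params = fit_preprocessor(train)
inliers = [v for v in train if params["low"] <= math.log1p(v) <= params["high"]]
scaled = transform(inliers, params)
```

Fitting the mean and standard deviation on the training split and reusing them for validation and test data avoids leaking information across splits.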

The distribution of target values obtained by simulation is shown in Figure 6. Specifically, PVG exhibited a relatively balanced and interpretable distribution, and thus no transformation was applied. In contrast, AUHII demonstrated significant right-skewness and a wide value range, and was therefore transformed using a logarithmic function to improve linearity and reduce variance. For IOD and CEI, a log1p transformation was employed to handle potential zero or near-zero values while addressing their skewed distributions. This allowed for better convergence and prediction accuracy during training. Additionally, all input features were rescaled using MinMaxScaler to normalize their values between 0 and 1, facilitating consistent learning dynamics across different scales. The resulting value ranges are reasonable and consistent with peer-reviewed results for similar subtropical, high-density urban settings: Bian et al. [51] reported an average PVG of 54.27 kWh/m² and a mean AUHII of 263.8 °C·h in Hong Kong’s high-density districts, which supports the reliability of the simulation outputs.

Identifying the most influential morphological drivers of each sustainability indicator provides guidance for further model development and SHAP-based interpretation. Figure 7 presents the Pearson correlation matrix between raw urban morphological features and the four target indicators: PVG, AUHII, IOD, and CEI. Notably, BD shows strong correlations with AUHII (−0.81) and IOD (0.77) and a moderate correlation with PVG (0.38), indicating that building density is closely tied to outdoor heat accumulation, indoor discomfort, and solar exposure. Meanwhile, aBH exhibits a strong negative correlation with PVG (−0.71), suggesting that taller buildings reduce rooftop PV potential due to shading or limited roof area. SVF also shows opposing trends, positively correlating with AUHII (0.53) but negatively with IOD (−0.52), highlighting its complex impact on thermal conditions. Interestingly, aspect ratio (AR) is negatively related to PVG (−0.44), and PS negatively correlates with IOD (−0.25), revealing spatial form implications for both energy access and heat mitigation. While some variables (e.g., BSC, CN, BO) exhibit weak or inconsistent correlations, they may still have nonlinear or interaction effects captured by advanced models.

To reduce the risk of conflicting or mutually exclusive objectives and to ensure the accuracy and stability of the model, the correlations among the target variables must also be examined. Figure 7 also shows the Pearson correlation matrix between the four sustainability target variables. Most notably, AUHII and IOD exhibit a very strong negative correlation (−0.88), indicating that areas with higher outdoor heat accumulation tend to have better-controlled indoor thermal environments, possibly due to adaptive building technologies or effective insulation. This relationship implies that sharing feature representations in an MTL framework could be beneficial, as it allows the model to capture shared underlying patterns that influence both indoor and outdoor thermal environments. PVG shows a weak positive correlation with CEI (0.28), suggesting that locations with higher solar potential might also be associated with higher energy consumption and related emissions, potentially due to increased urban activity or roof area availability. The correlations between PVG and both AUHII (−0.14) and IOD (0.08) are minimal, indicating that these objectives are sufficiently distinct and are not expected to produce “negative transfer” between tasks. At the same time, each task-specific head (branch) can be optimized independently, which limits interference between tasks. Similarly, CEI has weak or negligible correlations with the other three variables, highlighting its distinct influencing factors.
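For readers who want to reproduce a correlation matrix of this kind, the following is a minimal sketch of the Pearson coefficient in plain Python. The function names and the five-row toy table are hypothetical and are not the study's data; they only show how feature-target and target-target coefficients are obtained.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def correlation_matrix(columns):
    """Nested dict of pairwise Pearson coefficients for named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}

# Hypothetical toy table (one morphological feature, two targets).
data = {
    "BD":  [0.20, 0.25, 0.30, 0.35, 0.40],
    "PVG": [58.0, 57.0, 55.5, 54.0, 52.0],
    "IOD": [60.0, 72.0, 80.0, 95.0, 110.0],
}
corr = correlation_matrix(data)
```

A Pearson coefficient measures only linear association, which is why weakly correlated features can still matter in a nonlinear model.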

3.2. Model Performance

3.2.1. Single-Task Learning Model

To establish a performance baseline and gain a deeper understanding of each individual sustainability target, single-task learning (STL) models were first developed separately for PVG, AUHII, IOD, and CEI. In the STL framework, each objective was trained independently, allowing for the evaluation of the inherent learnability and modeling complexity associated with each task.

Figure 8 presents the single-task learning (STL) performance for each of the four target variables. The left column illustrates the loss curves (MSE loss) for both the training and validation sets across training epochs, while the right column displays the corresponding scatter plots comparing predicted and true values for each task. For PVG (Figure 8a), the model demonstrates steady convergence, with both training and validation losses decreasing consistently over epochs. The scatter plot indicates a moderate fit, though some variance remains, particularly for higher PVG values. AUHII (Figure 8b) shows a similar trend, with stable training convergence and acceptable validation loss, though the prediction scatter suggests room for improvement in capturing outlier behavior. In contrast, the IOD model (Figure 8c) achieves excellent alignment, with near-parallel and low-magnitude loss curves and a highly concentrated prediction distribution along the 1:1 line, reflecting high predictive accuracy. For CEI (Figure 8d), although the training curve shows improvement, the validation loss remains relatively high, and the prediction scatter is more dispersed, indicating greater model uncertainty, likely due to the complex or noisy nature of CEI data. However, given the observed correlation structure among the targets—particularly the strong interdependence between AUHII and IOD—it became apparent that exploiting shared information across tasks could further enhance the predictive performance.

3.2.2. Multi-Task Learning Model

MTL can leverage shared knowledge between tasks such as AUHII and IOD while maintaining task-specific modeling for weaker-related outputs like CEI and PVG, ultimately improving overall predictive robustness. Figure 9 and Table 3 present a comparative evaluation of model performance between STL and MTL across the four target variables: PVG, AUHII, IOD, and CEI. MTL consistently outperformed STL in terms of R² across all targets, with the most substantial improvements observed for IOD (from 0.789 to 0.825, p = 0.0035) and PVG (from 0.687 to 0.712, p = 0.0125), both achieving statistically significant gains. AUHII also shows a marginally significant improvement (from 0.546 to 0.559, p = 0.0499), while the modest increase in CEI (from 0.430 to 0.451) is not statistically meaningful (p = 0.9537), indicating no reliable enhancement for this target. Across most metrics (MAE, MSE, RMSE, and Norm_RMSE), MTL achieved lower or comparable error values than STL. For instance, MTL reduced the RMSE of IOD from 0.0262 to 0.024, and that of AUHII from 0.078 to 0.076. These results highlight the advantage of multi-task learning in leveraging shared morphological features and capturing cross-domain patterns, particularly between AUHII and IOD, which are known to exhibit strong environmental interdependencies in high-density urban contexts.
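The comparison metrics named above follow standard definitions, sketched here in plain Python. The prediction vectors are hypothetical and merely stand in for an STL and an MTL model. The normalised RMSE and the significance test behind the reported p-values are not specified in this section, so they are deliberately left out.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R2, MSE, RMSE and MAE for one prediction task."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mse = ss_res / n
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }

# Hypothetical values, NOT results from the study.
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
stl_pred = [1.5, 1.5, 3.5, 3.5, 5.5]   # stand-in for a single-task model
mtl_pred = [1.2, 1.9, 3.1, 3.8, 5.1]   # stand-in for a multi-task model

stl = regression_metrics(y_true, stl_pred)
mtl = regression_metrics(y_true, mtl_pred)
```

Because R² is relative to the variance of each target, it can be compared across tasks, whereas MSE, RMSE, and MAE depend on each target's (possibly transformed) scale.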

3.3. SHAP-Based Feature Importance and Interactions

Although the MTL framework demonstrated improved predictive accuracy by leveraging shared information across tasks, the inherent complexity and black-box nature of deep learning models necessitate further exploration to ensure transparency and explainability [61]. To address this, Shapley Additive exPlanations (SHAP) were employed to quantify the contribution of each morphological feature to the predicted outcomes. Figure 10 provides the SHAP feature importance and summary plots for all four sustainability targets: PVG, AUHII, IOD, and CEI. For PVG (Figure 10a), aBH emerges as the most influential factor, followed by SVF and BD. The SHAP summary plot indicates that higher aBH values are associated with lower PVG due to shading or reduced roof exposure, while higher SVF values positively impact PVG, supporting better solar access. For AUHII (Figure 10b), BD dominates the influence, strongly increasing urban heat accumulation, followed by AR and BSC. High AR and BSC tend to exacerbate heat intensity, reflecting the effect of dense, compact environments on thermal buildup. For IOD (Figure 10c), BD again plays a decisive role, reinforcing its cross-domain impact on both outdoor and indoor thermal conditions. Additionally, PS and BSC significantly influence IOD, where higher PS and more compact forms (lower BSC) are associated with reduced indoor overheating. For CEI (Figure 10d), aBH is the most critical feature, with taller buildings generally associated with lower emissions, likely due to vertical efficiency in cooling/heating loads. BD, SVF, and FAR also contribute meaningfully, revealing how dense development, solar exposure, and built intensity shape operational energy emissions. Overall, these SHAP results show both convergences and conflicts across the four targets. BD consistently ranks as a high-impact factor across AUHII, IOD, and CEI, while aBH is critical to both PVG and CEI. 
This analysis highlights the trade-offs embedded in urban form: for instance, increasing building height may reduce PV performance but also lower emissions, whereas increasing BD may raise AUHII and IOD.
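To make the attribution idea concrete, the sketch below computes exact Shapley values for a three-feature toy function by averaging marginal contributions over all feature orderings. It is not the SHAP library and not the study's model: `toy_pvg`, its coefficients, and the baseline and sample values are hypothetical, chosen only so that taller buildings lower the predicted PVG.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values: average marginal contribution of each feature
    over every ordering in which features switch from baseline to x."""
    names = list(x)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = x[name]
            value = model(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}

def toy_pvg(f):
    """Hypothetical surrogate with one interaction term (BD x SVF)."""
    return 60.0 - 0.10 * f["aBH"] + 8.0 * f["SVF"] - 5.0 * f["BD"] * f["SVF"]

baseline = {"aBH": 70.0, "SVF": 0.5, "BD": 0.3}   # reference block
sample = {"aBH": 100.0, "SVF": 0.6, "BD": 0.4}    # block being explained

phi = shapley_values(toy_pvg, sample, baseline)
gap = toy_pvg(sample) - toy_pvg(baseline)          # equals sum of phi
```

The attributions sum to the difference between the explained prediction and the baseline prediction. Exact enumeration scales factorially with the number of features, so practical tools rely on approximations for larger models.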

Many secondary metrics correlate strongly with dominant features (e.g., BSC nearly collinear with AR, r = 0.93; moderate correlations between mBCD and PS, r = 0.60; aBH and FAR, r = 0.49). Additionally, within our data ranges, some variables (e.g., VCR, Alb_v) exhibit weak direct correlations with the target outcomes, limiting their independent variation. In the high-density public housing development, dominant predictors like BD and aBH account for the most variation in performance indicators, while other factors only become influential under morphological extremes not present in our sample. Thus, the low SHAP importance for these features reflects context-specific collinearity and distributional constraints rather than fundamental irrelevance.

3.4. Optimization Results and Pareto Fronts

To translate the predictive and interpretive results into actionable urban design strategies, a multi-objective optimization process was subsequently conducted. To better interpret the diversity of the Pareto solutions, K-means clustering was applied to the full set of optimized solutions (including input features and predicted outcomes). The number of clusters was set to four, determined via silhouette analysis [62].
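Silhouette analysis scores how well each point sits in its assigned cluster compared with the nearest other cluster. The sketch below computes the mean silhouette coefficient for a given labelling in plain Python. The points and labels are hypothetical, and the labels are assumed to come from any clustering routine, such as K-means run with different cluster counts.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient (Euclidean); needs at least 2 clusters."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    scores = []
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            scores.append(0.0)          # common convention for singletons
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(p, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, points[j]) for j in members) / len(members)
            for lab, members in clusters.items() if lab != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical 2-D points forming two tight groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]   # matches the visible groups
bad_labels = [0, 1, 0, 1, 0, 1]    # mixes the groups

good = silhouette_score(points, good_labels)
bad = silhouette_score(points, bad_labels)
```

To choose the number of clusters, the same score is computed for each candidate count and the count with the highest mean silhouette is kept.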

Figure 11 illustrates the cluster-wise SHAP analysis across four sustainability objectives (PVG, AUHII, IOD, CEI). Across all four clusters, several consistent trends emerge. aBH is the dominant variable influencing PVG in every cluster, reflecting its critical role in determining solar access across urban forms. Similarly, aBH and BO are repeatedly identified as key drivers for AUHII and IOD, although their directional impacts differ—high aBH tends to improve PVG and reduce CEI, but may increase AUHII in some configurations. Cluster 0 emphasizes a strong dependence on aBH, BO, and mBCD, suggesting a form type characterized by vertical expansion and regular spacing. Cluster 1 shows a similar pattern but places greater weight on BO for IOD, indicating sensitivity to solar exposure angles. Cluster 2 exhibits a shift, where BO becomes the dominant driver for both AUHII and IOD, reinforcing the importance of directional control and spacing (mBCD) in compact layouts. Meanwhile, Cluster 3 is defined by the strong and sometimes extreme influence of BO on IOD and AUHII.

The 200 Pareto-optimal solutions were grouped into four distinct clusters, each representing a typical urban design strategy. The cluster centers of key urban morphological parameters of the different clusters are shown in Table 4.
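A Pareto-optimal set contains only solutions that no other solution beats on every objective at once. The sketch below shows that dominance filter, which is the core comparison inside NSGA-II but not the full algorithm. It assumes PVG is maximised and AUHII, IOD, and CEI are minimised, and the four candidate solutions are hypothetical.

```python
def to_minimisation(solution):
    """Express all objectives as 'smaller is better' (PVG is negated)."""
    return (-solution["PVG"], solution["AUHII"],
            solution["IOD"], solution["CEI"])

def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better once."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(solutions):
    """Keep only the solutions that no other solution dominates."""
    objs = [to_minimisation(s) for s in solutions]
    return [
        s for i, s in enumerate(solutions)
        if not any(dominates(objs[j], objs[i])
                   for j in range(len(objs)) if j != i)
    ]

# Hypothetical candidates, NOT solutions from the study.
candidates = [
    {"id": "A", "PVG": 55.0, "AUHII": 294.0, "IOD": 93.0, "CEI": 360.0},
    {"id": "B", "PVG": 50.0, "AUHII": 280.0, "IOD": 95.0, "CEI": 365.0},
    {"id": "C", "PVG": 49.0, "AUHII": 300.0, "IOD": 96.0, "CEI": 370.0},
    {"id": "D", "PVG": 52.0, "AUHII": 290.0, "IOD": 90.0, "CEI": 380.0},
]
front_ids = [s["id"] for s in pareto_front(candidates)]
```

Candidate C is worse than A on all four objectives and is removed. A, B, and D each win on at least one objective, so they remain as trade-off options.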

PCA (principal component analysis) was applied to project the high-dimensional solutions onto a 2D plane, enabling the intuitive visualization of cluster boundaries and solution density. PCA visualization confirmed distinct separation among clusters, suggesting that urban form has a clear influence on multi-objective performance. Figure 12 provides deeper insights into the multi-objective trade-offs among optimal solutions. Figure 12a presents a PCA projection of the optimal solutions, where each point is color-coded according to its K-means cluster label. The distribution demonstrates that different clusters occupy distinct regions in the PCA-reduced feature space, implying that urban morphological strategies leading to Pareto trade-offs are clearly differentiable. Figure 12b illustrates a 3D Pareto front visualization, plotting AUHII, IOD, and CEI, with PVG values represented by a color gradient. The plot shows a distinct positive relationship between CEI and AUHII, with lower AUHII generally associated with lower CEI. Solutions with higher PVG (warmer colors) are concentrated in regions with higher AUHII and CEI, suggesting that maximizing solar energy potential often incurs penalties in outdoor thermal conditions and carbon emissions. Conversely, solutions exhibiting lower AUHII and IOD tend to have lower PVG values (cooler colors), reinforcing the earlier findings that achieving superior thermal environments often aligns with lower energy production but improved carbon efficiency.

To comprehensively evaluate the performance trade-offs among the different cluster-based strategies and selected optimal solutions, four clusters were further analyzed. Figure 13 revealed that Cluster 2 achieved lower AUHII and CEI without compromising PVG and IOD, while Cluster 3 excelled in PVG and indoor comfort but at the cost of high emissions.

Combined with the results of the cluster analysis mentioned above, the characteristics of each cluster are as follows: Cluster 0 is characterized by moderately high buildings (average aBH = 97.86 m) with compact layouts (average mBCD = 107.81 m) and predominantly west-facing orientations (average BO = 274°). It demonstrates the lowest PVG and the highest AUHII, while IOD and CEI remain low. Such configurations resemble the dense traditional urban cores of Hong Kong, such as Central and Causeway Bay, where limited ventilation and strong shading effects exacerbate urban heat accumulation and constrain photovoltaic utilization. Cluster 1 consists of medium-rise buildings (average aBH = 74.24 m) with larger inter-building distances (average mBCD = 123.99 m) and mainly southwest-oriented layouts (average BO = 276°). This cluster achieves moderate PVG, reduced AUHII, and moderate IOD, but higher CEI, compared to Cluster 0. This pattern mirrors the suburban and peripheral areas of Hong Kong, such as Lantau Island and Clearwater Bay, and is ideal for developments prioritizing residential comfort and natural ventilation over maximizing energy generation efficiency or carbon emission reduction. Cluster 2 features taller buildings (average aBH = 74.23 m) with relatively large separations (average mBCD = 109.31 m) and mainly west-facing orientations (average BO = 268°). It results in the lowest AUHII, indicating the strongest mitigation of urban heat island effects, although PVG and CEI remain at intermediate levels. This form is comparable to well-ventilated coastal districts like Tseung Kwan O and parts of Kai Tak, where ventilation corridors are emphasized to counteract heat accumulation. Cluster 2 is suitable for regions prioritizing heat island mitigation rather than solely maximizing photovoltaic generation. 
Cluster 3 exhibits a well-balanced morphology with mid-to-high building heights (average aBH = 72.11 m), moderate building distances (average mBCD = 109.92 m), and more east-facing orientations (average BO = 183°). It achieves the highest PVG, the lowest CEI, and competitive performance in AUHII and IOD.

Taking Hong Kong as an example, Gan et al. [63] optimized building orientation and ventilation for a 40-story residential project; their results show that adopting a southward-biased orientation and appropriate inter-building spacing can reduce annual cooling energy consumption by 30–40% (with peak cooling load reduced by approximately 0.5–1 kW). Regarding photovoltaic potential, Anand et al.’s simulation and empirical work indicates that an orientation 10–20° off true south can improve annual PV efficiency by about 5–10%. Cluster 3’s average orientation of 3° off south lies close to this near-south range; combined with a larger roof area, this aids year-round solar utilization [64]. Although Cluster 3 exhibits relatively higher carbon emissions (CEI), it strikes a balance between maximizing PV generation (PVG), significantly reducing outdoor heat accumulation (AUHII), and maintaining indoor thermal comfort (IOD within acceptable limits), aligning with the core requirements of sustainable integrated design in subtropical high-density cities. In summary, the mid-to-high-rise form with moderate spacing and an eastward, slightly southern orientation represented by Cluster 3 offers synergistic benefits in photovoltaic utilization, heat-island mitigation, and ventilation cooling in subtropical coastal cities such as Hong Kong, Singapore, and Shenzhen, and can serve as an important reference model for green redevelopment or new district design.
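The offset from south quoted above is simple angular arithmetic. The helper below is a hypothetical illustration that assumes BO is a compass bearing in degrees measured clockwise from north, a convention not stated in this section.

```python
def offset_from_south(bearing_deg):
    """Smallest angle in degrees between a bearing and due south (180)."""
    diff = (bearing_deg - 180.0) % 360.0
    return min(diff, 360.0 - diff)

# Cluster-average orientations quoted in the text.
cluster_3_offset = offset_from_south(183.0)   # 3 degrees from south
cluster_0_offset = offset_from_south(274.0)   # 94 degrees from south
```

The modulo step keeps the result correct for bearings on either side of south, so 177° and 183° both give a 3° offset.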

While this study focuses on the unique high-density, high-rise morphology of Hong Kong, the proposed methodology and findings are broadly applicable to other cities with similar climatic and urban characteristics. Many rapidly urbanizing cities in subtropical or coastal regions, such as Singapore [65], Shenzhen [66], Bangkok [67], Manila [68], Miami [69], and Sydney [70], face comparable challenges related to urban heat island effects, renewable energy deployment, and carbon mitigation. The use of universally relevant morphological indicators (e.g., building height, density, orientation) and climate-responsive performance metrics (PVG, AUHII, IOD, CEI) enhances the transferability of the model. Moreover, the modular simulation–ML–optimization workflow can be adapted to local conditions using regional geospatial data, typical meteorological years, and representative building stock. Thus, this integrated framework offers a scalable tool for climate-resilient urban form analysis, supporting data-driven policy and sustainable design strategies in diverse global cities.

4. Conclusions

This study developed a multi-source data framework that integrates multi-task learning (MTL), SHAP-based interpretability, and multi-objective optimization to systematically evaluate and optimize urban morphology for sustainability in Hong Kong public housing. Focusing on four critical targets—photovoltaic generation (PVG), accumulated urban heat island intensity (AUHII), indoor overheating degree (IOD), and carbon emission intensity (CEI)—the study provides a comprehensive, interpretable, and actionable modeling approach. The main contributions are highlighted below:

(1): Quantitative evaluation showed that the MTL model achieved moderate to high predictive accuracy, with R² values of 0.712 for PVG, 0.559 for AUHII, 0.825 for IOD, and 0.451 for CEI. Compared with STL models, MTL improved performance by leveraging shared representations, particularly for tasks with strong underlying correlations such as AUHII and IOD.
(2): The SHAP-based analysis identified average building height (aBH), building density (BD), and building orientation (BO) as the dominant morphological factors, explaining up to 65% of the total predictive variance across tasks. Specifically, a higher aBH increased PVG by enhancing rooftop solar access but simultaneously exacerbated AUHII and IOD under dense urban conditions, reflecting clear trade-offs among environmental goals.
(3): Among the identified configurations, Cluster 3 demonstrated the most robust and transferable urban morphology strategy for integrated sustainability. It featured mid-to-high building heights (aBH = 72.11 m), moderate inter-building distances (mBCD = 109.92 m), and east-southeast orientations (BO = 183°). This morphology consistently achieved superior outcomes across multiple objectives, with the highest PVG (55.26 kWh/m²), the lowest CEI (359.76 kg/m²/y), and competitive AUHII (294.1 °C·y) and IOD (92.7 °C·h) values. These results suggest that urban blocks combining moderate height, sufficient spacing, and optimized orientation can effectively balance energy efficiency, thermal comfort, and carbon reduction goals, offering valuable guidance for sustainable urban development across diverse climatic and urban contexts.

Despite these contributions, this study has several limitations: the simulations relied on typical meteorological year (TMY) data, which may not fully capture extreme weather under future climate change scenarios. Although the model demonstrates generalizability in subtropical high-density contexts, further validation in diverse geographical and climatic settings is still needed. In future work, we will integrate this framework with agent-based or digital-twin platforms to capture dynamic interactions between built form, occupant behavior, and environmental conditions, and incorporate economic and life-cycle assessments to evaluate cost–benefit trade-offs and inform stakeholder decision-making, further enhancing the applicability of our approach in evidence-based urban sustainability planning.

Author Contributions

Conceptualization, C.B.; Methodology, C.B., P.H. and X.C.; Software, C.B.; Validation, P.H.; Formal analysis, C.B.; Investigation, P.H.; Resources, C.Y.L., C.C.L. and X.C.; Data curation, C.B. and P.H.; Writing—original draft, C.B.; Writing—review & editing, C.B., C.C.L. and X.C.; Visualization, C.B.; Supervision, C.Y.L., C.C.L. and X.C.; Project administration, C.Y.L. and C.C.L.; Funding acquisition, C.C.L. and X.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work was sponsored by the Research Grants Council of the Hong Kong Special Administrative Region, China (Grant Nos: 14200524, UGC/FDS16/E04/21, UGC/FDS16/E10/22, and UGC/FDS16/E05/23), and RGC Research Matching Grant Scheme (Project No. 2021/3008), and the CUHK Direct Grant for Research No. 4055230.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding authors.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Abbreviations

aBH	Average building height
ANN-MTL	Artificial neural network-based multi-task learning
Alb_v	Vegetation albedo
AR	Aspect ratio
AUHII	Accumulated urban heat island intensity (°C∙y)
BD	Building density
BO	Building orientation
BSC	Building shape coefficient
CEI	Carbon emission intensity (kg/m²/y)
CN	Compactness
COP	The coefficient of performance
FAR	Floor area ratio
FiT	Feed-in tariff
FST-GCN	Functional–spatial–temporal GCN
GIS	Geographic Information System
GNN-GA	GNN plus Genetic Algorithm
GCN-LSTM	Graph convolution network (GCN) embedded long short-term memory network (LSTM)
GEE	Google Earth Engine
HK Geodata	Hong Kong geospatial data
HVAC	Heating, ventilation, and air conditioning
IOD	Indoor overheating degree (°C·h)
IQR	Interquartile range method
LCZs	Local climate zones
A_b	Total building footprint area
H_i	Height of building i
d_ij	Distance between centroids of buildings i and j
A_v	Total building volume
A_t	Total sky hemisphere area
N	Number of plots/buildings/weather stations
W	Width of street canyon
$\theta_i$	Angle of building i
Es	The solar energy received by the building surface within a specific time period
$T_{c,t}$	The urban temperature at time-step t
$HI_i$	Hourly heat index
N	Total number of time-steps
GFA	Gross floor area
$f_n$	The carbon sink factors
LST	Land surface temperature
mBCD	Mean building centroid distance
ML	Machine learning
MTL	Multi-task learning
MSE	Mean squared error
NDCs	Nationally determined contributions
NSGA-II	Non-dominated sorting genetic algorithm II
PCA	Principal component analysis
PS	Plot size
PV	Photovoltaics
PVG	Photovoltaic generation (kWh/m²)
UHII	Urban heat island intensity
UHR	Urban heat resilience
VCR	Vegetation coverage ratio
WWR	Window-to-wall ratios
RMSE	Root mean square error
R²	Coefficient of determination
SHAP	SHapley Additive exPlanations
SHGC	Solar heat gain coefficient
STL	Single-task learning
SVF	Sky view factor
TMY	Typical meteorological year
A_t	Total study area
n_b/n_p	Total number of buildings/number of plots
A_e	Total exterior surface area
A_o	Obstructed sky area
A_f	Total floor area
L	Length of street canyon
P	Perimeter of building footprints
S	Active area
η	Efficiency of the PV system
$T_{v,t}$	The rural temperature at time-step t
$HI_{CL}$	Critical heat index threshold
EUI	Energy use intensity
$f$	Electrical power carbon emissions factor

References

Global Climate Highlights 2024|Copernicus. Available online: https://climate.copernicus.eu/global-climate-highlights-2024?utm_source=chatgpt.com (accessed on 26 April 2025).
State of the Global Climate. 2024. Available online: https://wmo.int/publication-series/state-of-global-climate-2024 (accessed on 26 April 2025).
Record-Breaking Annual Growth in Renewable Power Capacity. 2025. Available online: https://www.irena.org/News/pressreleases/2025/Mar/Record-Breaking-Annual-Growth-in-Renewable-Power-Capacity?utm_source=chatgpt.com (accessed on 26 April 2025).
CAT 2035 Climate Target Update Tracker. Available online: https://climateactiontracker.org/climate-target-update-tracker-2035/ (accessed on 26 April 2025).
China’s New Renewable Energy Plan: Key Insights for Businesses. China Briefing News 2024. Available online: https://www.china-briefing.com/news/chinas-new-renewable-energy-plan-key-insights-for-businesses/ (accessed on 26 April 2025).
Hong Kong’s Climate Action Plan 2050-Climate Change Laws of the World. Available online: https://climate-laws.org/documents/hong-kong-s-climate-action-plan-2050_b5b6 (accessed on 17 May 2025).
Yuan, Y.; Santamouris, M.; Xu, D.; Xiaolei, G.; Li, C.; Cheng, W.; Su, L.; Xiong, P.; Fan, Z.; Wang, X.; et al. Surface urban heat island effects intensify more rapidly in lower income countries. Npj Urban Sustain. 2025, 5, 11.
World Population by Year. Worldometer. Available online: http://www.worldometers.info/world-population/world-population-by-year/ (accessed on 26 April 2025).
Stewart, I.D.; Oke, T.R. Local Climate Zones for Urban Temperature Studies. Bull. Am. Meteorol. Soc. 2012, 93, 1879–1900.
Zuo, M.; Li, M.; Li, H.; Chen, T. Discovering morphological impact discrepancies on thermal environment among urban functional zones using essential urban land use categories and machine learning. Urban Clim. 2025, 61, 102423.
Natanian, J.; Auer, T. Beyond nearly zero energy urban design: A holistic microclimatic energy and environmental quality evaluation workflow. Sustain. Cities Soc. 2020, 56, 102094.
Yuan, B.; Zhou, L.; Hu, F.; Wei, C. Effects of 2D/3D urban morphology on land surface temperature: Contribution, response, and interaction. Urban Clim. 2024, 53, 101791.
Chen, Y.; Ma, W.; Shao, Y.; Wang, N.; Yu, Z.; Li, H.; Hu, Q. The impacts and thresholds detection of 2D/3D urban morphology on the heat island effects at the functional zone in megacity during heatwave event. Sustain. Cities Soc. 2025, 118, 106002.
Hong, T.; Yim, S.H.L.; Heo, Y. Interpreting complex relationships between urban and meteorological factors and street-level urban heat islands: Application of random forest and SHAP method. Sustain. Cities Soc. 2025, 126, 106353.
Zhu, S.; Yan, Y.; Zhao, B.; Wang, H. Assessing the impact of adjacent urban morphology on street temperature: A multisource analysis using random forest and SHAP. Build. Environ. 2025, 267, 112326.
Liu, Q.; Hang, T.; Wu, Y. Unveiling differential impacts of multidimensional urban morphology on heat island effect across local climate zones: Interpretable CatBoost-SHAP machine learning model. Build. Environ. 2025, 270, 112574.
Zhang, L.; Cao, M.; Li, N.; Luo, L.; Chen, Y.; Li, Z. Machine learning prediction of heating and cooling loads based on Athenian residential buildings’ simulation dataset. Energy Build. 2025, 342, 115808.
Jia, L.-R.; Li, Q.-Y.; Chen, X.; Lee, C.-C.; Han, J. Indoor Thermal and Ventilation Indicator on University Students’ Overall Comfort. Buildings 2022, 12, 1921.
Kim, S.W.; Brown, R.D. Development of a micro-scale heat island (MHI) model to assess the thermal environment in urban street canyons. Renew. Sustain. Energy Rev. 2023, 184, 113598.
Tian, P.; Cai, M.; Sun, Z.; Liu, S.; Wu, H.; Liu, L.; Peng, Z. Effects of 3D urban morphology on CO2 emissions using machine learning: Towards spatially tailored low-carbon strategies in Central Wuhan, China. Urban Clim. 2024, 57, 102122.
Li, Y.; Schubert, S.; Kropp, J.P.; Rybski, D. On the influence of density and morphology on the Urban Heat Island intensity. Nat. Commun. 2020, 11, 2647.
Liu, B.; Liu, Y.; Cho, S.; Chow, D.H.C. Urban morphology indicators and solar radiation acquisition: 2011–2022 review. Renew. Sustain. Energy Rev. 2024, 199, 114548.
Morales, R.D.; Audenaert, A.; Verbeke, S. Thermal comfort and indoor overheating risks of urban building stock - A review of modelling methods and future climate challenges. Build. Environ. 2025, 269, 112363.
Hamdy, M.; Carlucci, S.; Hoes, P.-J.; Hensen, J.L.M. The impact of climate change on the overheating risk in dwellings—A Dutch case study. Build. Environ. 2017, 122, 307–323.
Shi, Q.; Luo, W.; Xiao, C.; Wang, J.; Zhu, H.; Chen, X. Investigating urban-scale building thermal resilience under compound heat waves and power outage events based on urban morphology analysis. Build. Environ. 2025, 276, 112747.
Lei, L.; Shao, S.; Liang, L. An evolutionary deep learning model based on EWKM, random forest algorithm, SSA and BiLSTM for building energy consumption prediction. Energy 2024, 288, 129795.
Oukawa, G.Y.; Krecl, P.; Targino, A.C. Fine-scale modeling of the urban heat island: A comparison of multiple linear regression and random forest approaches. Sci. Total Environ. 2022, 815, 152836.
Tang, H.; Chai, X.; Chen, J.; Wan, Y.; Wang, Y.; Wan, W.; Li, C. Assessment of BIPV power generation potential at the city scale based on local climate zones: Combining physical simulation, machine learning and 3D building models. Renew. Energy 2025, 244, 122688.
Li, Z.; Ma, J.; Jiang, F.; Zhang, S.; Tan, Y. Assessing the impacts of urban morphological factors on urban building energy modeling based on spatial proximity analysis and explainable machine learning. J. Build. Eng. 2024, 85, 108675.
Li, J.; Zhang, Y.; Yu, S.; Qin, H.; Xu, Z. AI-Driven urban planning optimization: A graph neural network and genetic algorithm framework for tackling peak-hour challenges. Sustain. Cities Soc. 2025, 126, 106407.
Zhang, J.; Wan, Y.; Tian, M.; Li, H.; Chen, K.; Xu, X.; Yuan, L. Comparing multiple machine learning models to investigate the relationship between urban morphology and PM2.5 based on mobile monitoring. Build. Environ. 2024, 248, 111032.
Liu, Q.; Wang, J.; Bai, B. Unveiling nonlinear effects of built environment attributes on urban heat resilience using interpretable machine learning. Urban Clim. 2024, 56, 102046.
Yang, C.; Li, S.; Gou, Z. Spatiotemporal prediction of urban building rooftop photovoltaic potential based on GCN-LSTM. Energy Build. 2025, 334, 115522.
Xu, Z.; Yi, Z.; Wang, Y.; Wang, D.; Zhang, L.; Huo, H. Estimating near-surface air temperature in urban functional zones in China using spatial-temporal attention. Build. Environ. 2025, 276, 112860.
Erdem Okumus, D.; Akay, M. Quantitative assessment of non-stationary relationship between multi-scale urban morphology and urban heat. Build. Environ. 2025, 272, 112669.
Li, L.; Sun, S.; Zhong, L.; Han, J.; Qian, X. Novel spatiotemporal nonlinear regression approach for unveiling the impact of urban spatial morphology on carbon emissions. Sustain. Cities Soc. 2025, 125, 106381.
GovHK: Feed-in Tariff. Available online: https://www.gov.hk/en/residents/environment/sustainable/renewable/feedintariff.htm (accessed on 24 April 2025).
C40 Good Practice Guides: Hong Kong-Energy Saving Plan 2015-2025+. C40 Cities. Available online: https://www.c40.org/case-studies/c40-good-practice-guides-hong-kong-energy-saving-plan-2015-2025/ (accessed on 17 May 2025).
GovHK: Climate Change. Available online: https://www.gov.hk/en/residents/environment/global/climate.htm (accessed on 24 April 2025).
Google Earth Engine. Available online: https://earthengine.google.com (accessed on 26 April 2025).
Chan, A.L.S.; Chow, T.T.; Fong, S.K.F.; Lin, J.Z. Generation of a typical meteorological year for Hong Kong. Energy Convers. Manag. 2006, 47, 87–96.
Common Spatial Data Infrastructure (CSDI) Portal for Government. Available online: https://portal.csdi.gov.hk/csdi-webpage/ (accessed on 24 April 2025).
Login-BRAVO. Available online: https://bravo.bd.gov.hk/login.action (accessed on 24 April 2025).
List of Public Housing Estates in Hong Kong. Wikipedia. 2025. Available online: https://en.wikipedia.org/wiki/List_of_public_housing_estates_in_Hong_Kong (accessed on 10 March 2025).
Wang, L.; Cheng, J.C.; Mazan, W.; Jacoby, S. Standard Block and Modular Dwelling Designs in Hong Kong’s Public Housing. Architecture 2024, 4, 89–111.
Xue, C.Q.L.; Hui, K.C.; Zang, P. Public buildings in Hong Kong: A short account of evolution since the 1960s. Habitat Int. 2013, 38, 57–69.
Standard 90.1. Available online: https://www.ashrae.org/technical-resources/bookstore/standard-90-1 (accessed on 9 December 2023).
Electrical and Mechanical Services Department (EMSD). Code of Practice for Energy Efficiency of Building Services Installation (BEC 2024); EMSD: Hong Kong, China, 2024. Available online: https://www.emsd.gov.hk/beeo/en/pee/BEC_2024_ENG.pdf (accessed on 30 December 2024).
Electrical and Mechanical Services Department (EMSD). Codes & Technical Guidelines: Buildings Energy Efficiency Ordinance–EMSD. Available online: https://www.emsd.gov.hk/beeo/en/mibec_beeo_codtechguidelines.html (accessed on 24 June 2025).
Landsat 8-9 OLI/TIRS Collection 2 Level 1 Data Format Control Book | U.S. Geological Survey. Available online: https://www.usgs.gov/media/files/landsat-8-9-olitirs-collection-2-level-1-data-format-control-book (accessed on 24 April 2025).
Bian, C.; Cheung, K.L.; Chen, X.; Lee, C.C. Integrating microclimate modelling with building energy simulation and solar photovoltaic potential estimation: The parametric analysis and optimization of urban design. Appl. Energy 2025, 380, 125062. [Google Scholar] [CrossRef]
Torbarina, L.; Ferkovic, T.; Roguski, L.; Mihelcic, V.; Sarlija, B.; Kraljevic, Z. Challenges and Opportunities of Using Transformer-Based Multi-Task Learning in NLP Through ML Lifecycle: A Position Paper. Nat. Lang. Process. J. 2024, 7, 100076. [Google Scholar] [CrossRef]
Meyer, J. Multi-Task and Transfer Learning in Low-Resource Speech Recognition. Ph.D. Thesis, University of Arizona, Tucson, AZ, USA, September 2019. [Google Scholar]
Zhai, Z.; Chen, F.; Yu, H.; Hu, J.; Zhou, X.; Xu, H. PS-MTL-LUCAS: A partially shared multi-task learning model for simultaneously predicting multiple soil properties. Ecol. Inform. 2024, 82, 102784. [Google Scholar] [CrossRef]
Welcome to the SHAP Documentation—SHAP Latest Documentation. Available online: https://shap.readthedocs.io/en/latest/ (accessed on 26 April 2025).
shap.KernelExplainer—SHAP Latest Documentation. Available online: https://shap.readthedocs.io/en/latest/generated/shap.KernelExplainer.html (accessed on 26 April 2025).
Hu, Y.-S.; Lo, K.-Y.; Hsieh, I.-Y.L. AI-driven short-term load forecasting enhanced by clustering in multi-type university buildings: Insights across building types and pandemic phases. J. Build. Eng. 2025, 104, 112417. [Google Scholar] [CrossRef]
Spatial without Compromise · QGIS Web Site. Available online: https://qgis.org/ (accessed on 26 April 2025).
Costantino, D.; Grimaldi, A.; Pepe, M. 3D MODELLING OF BULDINGS AND URBAN AREAS USING GRASSHOPPER AND RHINOCERSOS. Geogr. Tech. 2022, 17, 167–176. [Google Scholar] [CrossRef]
Castrejon-Esparza, N.M.; González-Trevizo, M.E.; Martínez-Torres, K.E.; Santamouris, M. Optimizing urban morphology: Evolutionary design and multi-objective optimization of thermal comfort and energy performance-based city forms for microclimate adaptation. Energy Build. 2025, 115750. [Google Scholar] [CrossRef]
La Gatta, V.; Sperlì, G.; De Cegli, L.; Moscato, V. From single-task to multi-task: Unveiling the dynamics of knowledge transfers in disinformation detection. Inf. Sci. 2025, 696, 121735. [Google Scholar] [CrossRef]
Abdulnassar, A.A.; Nair, L.R. Performance analysis of K-means with modified initial centroid selection algorithms and developed K-means9+ model. Meas. Sens. 2023, 25, 100666. [Google Scholar] [CrossRef]
Feng, W.; Chen, J.; Yang, Y.; Gao, W.; Zhao, Q.; Xing, H.; Yu, S. The Impact of Building Morphology on Energy Use Intensity of High-Rise Residential Clusters: A Case Study of Hangzhou, China. Buildings 2024, 14, 2245. [Google Scholar] [CrossRef]
Chew, L.W.; Norford, L.K. Pedestrian-level wind speed enhancement with void decks in three-dimensional urban street canyons. Build. Environ. 2019, 155, 399–407. [Google Scholar] [CrossRef]
Aydin, E.E.; Ortner, F.P.; Peng, S.; Yenardi, A.; Chen, Z.; Tay, J.Z. Climate-responsive urban planning through generative models: Sensitivity analysis of urban planning and design parameters for urban heat island in Singapore’s residential settlements. Sustain. Cities Soc. 2024, 114, 105779. [Google Scholar] [CrossRef]
Gao, X.; Yu, H.; Li, L.; Yu, J. A county-level analysis of spatiotemporal variation and human causes of urban heat islands in the Guangdong-Hong Kong-Macao Greater Bay Area. City Environ. Interact. 2025, 26, 100194. [Google Scholar] [CrossRef]
Ngamsiriudom, T.; Tanaka, T. Making an urban environmental climate map of the Bangkok Metropolitan Region, Thailand: Analysis of air temperature, wind distributions, and spatial environmental factors. World Dev. Sustain. 2023, 3, 100105. [Google Scholar] [CrossRef]
Llorin, A.G.A.; Olaguera, L.M.P.; Cruz, F.A.T.; Villarin, J.R.T. Improved WRF simulation of surface temperature and urban heat island intensity over Metro Manila, Philippines. Atmospheric Res. 2024, 310, 107644. [Google Scholar] [CrossRef]
Singh, M.; Sharston, R.; Murtha, T. Critical evaluation of the spatiotemporal behavior of UHI, through correlation analyses based on multi-city heterogeneous dataset. Sustain. Cities Soc. 2024, 110, 105576. [Google Scholar] [CrossRef]
Kong, J.; Zhao, Y.; Strebel, D.; Gao, K.; Carmeliet, J.; Lei, C. Understanding the impact of heatwave on urban heat in greater Sydney: Temporal surface energy budget change with land types. Sci. Total Environ. 2023, 903, 166374. [Google Scholar] [CrossRef]

Figure 1. Number and models of public housing in 18 districts of Hong Kong. (a) Number of existing public housing estates. (b) Representative building morphologies of public housing estates across 18 districts.

Figure 2. Spatial distribution of public housing estates from five different eras in 18 districts of Hong Kong; panels (a–r) represent the individual districts.

Figure 3. ANN-based multi-task learning model architecture and workflow: from input features to SHAP analysis and results aggregation.

Figure 4. Research framework.

Figure 5. Distribution of different urban morphology parameters and targets.

Figure 6. Distributions of target variables.

Figure 7. Correlation analysis of parameters and targets.

Figure 8. STL model performance for four targets: (a) PVG, (b) AUHII, (c) IOD, (d) CEI. Each subplot shows the training/validation loss curve (left) and predicted vs. true values (right). (The red dashed line indicates the 1:1 line between predicted and true values.)

Figure 9. Comparison of the calculation performance of STL and MTL models.

Figure 10. SHAP plot for the MTL model using all features. (a) PVG, (b) AUHII, (c) IOD, (d) CEI.

Figure 11. Feature importance for each cluster evaluated using mean SHAP values from the MTL model. Feature importance for different targets: (a–d) Cluster 0; (e–h) Cluster 1; (i–l) Cluster 2; (m–p) Cluster 3. In each bar chart, the x-axis represents mean SHAP values, and the y-axis lists feature names in descending order of importance. Clusters are derived from K-means on Pareto-optimal solutions, and the MTL model is trained on the full dataset.

Figure 12. Visualization of Pareto-optimal solutions under different clusters. (a) PCA projection of Pareto-optimal solutions for visualization only: points represent individual solutions, colored by K-means cluster labels. (b) 3D Pareto plot showing trade-offs among AUHII (x-axis), IOD (y-axis), and CEI (z-axis), with marker colors representing PVG values; each point corresponds to one optimal solution.

Figure 13. Radar chart for selected representative solutions: each polygon corresponds to a chosen optimal solution (e.g., solution with maximal PVG, minimal AUHII, minimal IOD, minimal CEI, and overall best trade-off).

Table 1. Model setting parameters.

Generation	First	Second	Third	Fourth	Fifth
Year	Before 1986	1986–1992	1992–2003	2003–2012	2012–present
Building type	Slab, Cruciform Block, Trident	Trident, Harmony	Concord, Harmony	Concord, Trident	Cruciform
U_value of roof	0.58–1.13	0.58	0.58	0.55–0.58	1.8
U_value of wall	2.16–3.33	2.88–3.33	2.75–2.88	2.75–3.85	1.9
U_value of floor	2.48	2.48	2.48	2.48	2.48
U_value of windows	1.13–5.75	5.75	5.75	5.75–5.78	5.8
SHGC	0.37–0.72	0.57	0.681	0.6–0.775	0.82
WWR	0.304–0.382	0.305–0.4	0.65	0.148	0.143
COP	2.5–5	2.5	2.5	2.4–2.5	3
People density (W/m²)	0.83	0.83	0.83	0.83	0.83
Lighting density (W/m²)	46.395	15	15–19	19	15
Equipment density (W/m²)	543.53	142	142	142	100
Cooling (°C)	25	25	24–26	20–29	25
Infiltration (ACH)	1.5	1.5	0.5	0.5	0.3–0.6

Table 2. Calculation method of urban morphological parameters.

Indicator	Abbreviation	Formula	Description
Building density	BD	BD = A_b/A_t	A_b: Total building footprint area; A_t: Total study area
Average building height	aBH	$aBH = \frac{1}{n_b}\sum_{i=1}^{n_b} H_i$	H_i: Height of building i; n_b: Total number of buildings
Mean building centroid distance	mBCD	$mBCD = \frac{1}{n_b(n_b - 1)}\sum_{i=1}^{n_b}\sum_{j=1, j \neq i}^{n_b} d_{ij}$	d_ij: Distance between centroids of buildings i and j; n_b: Total number of buildings
Building shape coefficient	BSC	BSC = A_e/A_v	A_e: Total exterior surface area; A_v: Total building volume
Sky view factor	SVF	SVF = 1 − (A_o/A_t)	A_o: Obstructed sky area; A_t: Total sky hemisphere area
Floor area ratio	FAR	FAR = A_f/A_t	A_f: Total floor area; A_t: Total study area
Plot size	PS	PS = A_t/n_p	A_t: Total study area; n_p: Number of plots
Aspect ratio	AR	AR = L/W	L: Length of street canyon; W: Width of street canyon
Compactness	CN	CN = A_b/P²	A_b: Total building footprint area; P: Perimeter of building footprints
Building orientation	BO	$BO = \frac{1}{n_b}\sum_{i=1}^{n_b} \theta_i$	$\theta_i$: Orientation angle of building i; n_b: Total number of buildings
Vegetation coverage ratio	VCR	$VCR = \frac{NDVI_{mean} - NDVI_{min}}{NDVI_{max} - NDVI_{min}}$	NDVI_mean: Mean NDVI of the study area; NDVI_min: Bare-soil reference NDVI; NDVI_max: Full-vegetation reference NDVI
Vegetation albedo	Alb_v	$Alb_v = \sum_i (\omega_i \times \rho_i)$	$\omega_i$: The surface reflectivity of different bands; $\rho_i$: The weighting coefficient of each band
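
As an illustration of how several Table 2 indicators could be evaluated for one plot, the following is a minimal Python sketch, not the authors' implementation. The function names and the three-building input values are hypothetical and chosen only so that the arithmetic is easy to check by hand; mBCD is averaged over all ordered building pairs, matching the n_b(n_b − 1) denominator in the table.

```python
import math
from itertools import permutations


def building_density(footprint_areas, site_area):
    """BD = A_b / A_t: total footprint area over total study area."""
    return sum(footprint_areas) / site_area


def average_building_height(heights):
    """aBH: arithmetic mean of the building heights."""
    return sum(heights) / len(heights)


def mean_centroid_distance(centroids):
    """mBCD: mean centroid distance over all ordered pairs (i, j), i != j."""
    n_b = len(centroids)
    total = sum(math.dist(p, q) for p, q in permutations(centroids, 2))
    return total / (n_b * (n_b - 1))


def building_orientation(angles_deg):
    """BO: arithmetic mean of orientation angles, as written in Table 2.

    A plain mean ignores the 0/360 degree wrap-around.
    """
    return sum(angles_deg) / len(angles_deg)


# Hypothetical three-building plot (illustrative values, not study data).
footprints = [400.0, 600.0, 500.0]                 # m^2
heights = [60.0, 75.0, 90.0]                       # m
centroids = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]   # m
angles = [90.0, 180.0, 270.0]                      # degrees

bd = building_density(footprints, site_area=5000.0)
abh = average_building_height(heights)
mbcd = mean_centroid_distance(centroids)
bo = building_orientation(angles)
print(bd, abh, mbcd, bo)
```

In practice the footprints, heights, and centroids would come from GIS building polygons rather than hand-typed lists; the sketch only fixes the arithmetic of the formulas.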

Table 3. Comparison of MTL and STL model performance (R²) across four environmental indicators. Paired t-tests were conducted to assess statistical significance. Significant differences (p < 0.05) are marked with an asterisk (*).

Target	Metric	STL_Mean	MTL_Mean	Test	p_Value
PVG (kWh/m²)	R²	0.687 ± 0.018	0.712 ± 0.015	paired t-test	0.0125 (*)
AUHII (°C·y)	R²	0.546 ± 0.022	0.559 ± 0.019	paired t-test	0.0499 (*)
IOD (°C·h)	R²	0.789 ± 0.010	0.825 ± 0.008	paired t-test	0.0035 (*)
CEI (kg/m²/y)	R²	0.430 ± 0.050	0.451 ± 0.045	paired t-test	0.9537
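
The paired t-test named in Table 3 can be sketched as follows. This is an assumption-laden illustration rather than the authors' procedure: the per-run R² lists are hypothetical values invented for the example (they are not the data behind Table 3), and the helper name `paired_t_statistic` is likewise made up. The sketch uses only the Python standard library, so it compares the statistic against a tabulated critical value instead of computing a p-value.

```python
import math
import statistics


def paired_t_statistic(a, b):
    """Return (t, degrees of freedom) for paired samples a and b."""
    if len(a) != len(b):
        raise ValueError("paired samples must have equal length")
    diffs = [y - x for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    return mean_d / (sd_d / math.sqrt(n)), n - 1


# Hypothetical per-run R^2 values, NOT taken from the paper.
stl_r2 = [0.68, 0.70, 0.67, 0.69, 0.71]
mtl_r2 = [0.71, 0.72, 0.70, 0.71, 0.73]

t_stat, dof = paired_t_statistic(stl_r2, mtl_r2)

# Two-sided 5% critical value of Student's t for 4 degrees of freedom.
T_CRIT_DF4 = 2.776
significant = abs(t_stat) > T_CRIT_DF4
print(f"t = {t_stat:.3f}, df = {dof}, significant at 0.05: {significant}")
```

If SciPy is available, `scipy.stats.ttest_rel(mtl_r2, stl_r2)` returns the same statistic together with the p-value.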

Table 4. Clustering centers of key urban morphological parameters for the different clusters.

Parameters	Cluster 0	Cluster 1	Cluster 2	Cluster 3
BD	0.29	0.30	0.36	0.37
aBH (m)	97.86	74.24	74.23	72.11
mBCD (m)	107.81	123.99	109.31	109.92
PS	10.97	10.91	10.90	11.52
AR	0.93	1.12	0.93	0.93
BO (°)	274	276	268	183
PVG (kWh/m²)	52.51	52.85	54.42	55.26
AUHII (°C·y)	346.0	331.2	304.8	294.1
IOD (°C·h)	87.0	86.8	90.7	92.7
CEI (kg/m²/y)	216.92	290.75	337.87	359.76
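
The trade-offs among the four cluster centers in Table 4 can be checked with a simple Pareto dominance test. The sketch below is illustrative only and is not the NSGA-II procedure used in the study; it assumes that PVG is maximized while AUHII, IOD, and CEI are minimized, and the helper names are hypothetical. The objective values are copied from Table 4.

```python
# Cluster centers from Table 4: (PVG, AUHII, IOD, CEI).
centers = {
    "Cluster 0": (52.51, 346.0, 87.0, 216.92),
    "Cluster 1": (52.85, 331.2, 86.8, 290.75),
    "Cluster 2": (54.42, 304.8, 90.7, 337.87),
    "Cluster 3": (55.26, 294.1, 92.7, 359.76),
}


def to_minimization(objectives):
    """Negate PVG so that every objective is 'smaller is better' (assumed)."""
    pvg, auhii, iod, cei = objectives
    return (-pvg, auhii, iod, cei)


def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


minimized = {name: to_minimization(obj) for name, obj in centers.items()}

# Keep every center that no other center dominates.
non_dominated = [
    name
    for name, obj in minimized.items()
    if not any(
        dominates(other, obj)
        for other_name, other in minimized.items()
        if other_name != name
    )
]
print(non_dominated)
```

Under these assumed objective directions none of the four centers dominates another: moving from Cluster 0 to Cluster 3 raises PVG and lowers AUHII while CEI increases, so the choice among them depends on how the objectives are weighted.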

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
