Automated Local Climate Zone Mapping via Multi-Parameter Synergistic Optimization and High-Resolution GIS-RS Fusion

Li, Wenbo; Liu, Ximing; Samat, Alim; Gamba, Paolo

doi:10.3390/rs17122038

¹ Department of Electrical, Computer and Biomedical Engineering, University of Pavia, 27100 Pavia, Italy

² State Key Laboratory of Ecological Safety and Sustainable Development in Arid Lands, Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, Urumqi 830011, China

³ China-Kazakhstan Joint Laboratory for RS Technology and Application, Al-Farabi Kazakh National University, Almaty 050012, Kazakhstan

* Author to whom correspondence should be addressed.

Remote Sens. 2025, 17(12), 2038; https://doi.org/10.3390/rs17122038

Submission received: 14 April 2025 / Revised: 25 May 2025 / Accepted: 10 June 2025 / Published: 13 June 2025

(This article belongs to the Special Issue Remote Sensing Applications in Urban Ecosystem Services (Second Edition))

Abstract

Local Climate Zone (LCZ) classification is essential for urban microclimate modeling and heat mitigation planning. Traditional methods relying on manual sampling face limitations in scalability, objectivity, and handling spatial heterogeneity. This study presents an automated framework for LCZ sample generation, facilitating efficient large-scale LCZ mapping and LCZ-based urban climate analysis and geospatial applications. To this aim, it proposes a dual-path automated framework integrating GIS-driven sample generation to enhance LCZ classification accuracy: a multi-parameter Synergistic Optimization approach for urban LCZs and a Distance-driven Maximum Coverage method for natural LCZs. Specifically, urban samples are selected via multi-objective optimization and Pareto front screening for quality and representativeness, while the selection of natural samples prioritizes spatial coverage and diversity. Combining urban morphological parameters with Sentinel-2 imagery and a Random Forest classifier yielded a final accuracy of 0.95 in our test site, confirming the framework’s effectiveness.

Keywords:

Local Climate Zones; urban morphology analysis; spatial analysis; high-resolution GIS-RS fusion; automated sampling; urban sustainability

1. Introduction

Global urbanization continues to accelerate: 4 billion people currently live in urban areas, and two-thirds of the global population is projected to reside in cities by 2050 [1,2]. The expansion of urban areas has profoundly impacted the natural environment, making cities critical areas for studying human–environment interactions and climate change. Urban morphological changes driven by urbanization can alter land-cover and land-use patterns, leading to significant modifications in the local climate, particularly by intensifying the urban heat island effect and influencing urban precipitation patterns [3,4,5]. Some studies have primarily focused on urban climate modeling, inter-city comparisons, and macroscale trends [6,7,8], while others have shifted attention to the intra-urban scale, emphasizing how local variations in urban morphology influence microclimatic conditions.

In order to more accurately characterize intra-urban heterogeneity, researchers have proposed various urban spatial classification frameworks, including Urban Terrain Zones (UTZs), the Davenport roughness classification, and Urban Climate Zones (UCZs) [9,10,11]. Building on these, Stewart and Oke, by extending the UCZ classification method, formally proposed the Local Climate Zone classification system [12], which enables the standardization of urban climate studies. Local Climate Zones (LCZs) are a standardized set of UCZs designed to describe local urban climate characteristics. They effectively reveal the impact of different urban morphologies (e.g., compact high-rise, open low-rise, industrial zones) on urban microclimates. The LCZ system divides urban surfaces into 17 categories, including 10 built-up types and 7 natural land-cover types, providing a consistent way to relate urban form with local climate phenomena such as temperature, wind, and precipitation [13] (see Figure 1 for a graphical representation of different LCZs).

Since its inception, the LCZ framework has been widely adopted in diverse domains of urban climate research. One of its most prominent applications lies in the study of urban heat islands (UHIs). The LCZ classification provides a systematic framework for comparing how different LCZ types respond to UHI effects. Studies have shown that UHI intensity varies significantly across LCZ types, with LCZ 1 (compact high-rise) consistently linked to higher intensities, while variations in building morphology and density can significantly mitigate heat accumulation [14,15]. In heatwave and heat risk assessments, the LCZ framework has proven instrumental in evaluating spatial variations in thermal exposure and vulnerability. For instance, one study applied the LCZ framework in combination with heat index measurements and field surveys to investigate outdoor thermal comfort in a tropical planning region in eastern India. The findings indicated that the morphological patterns and geometric configurations of outdoor spaces significantly determine thermal comfort levels and climatic conditions across different LCZ types [16]. Another study demonstrated that the incorporation of LCZ classifications significantly improved the spatial accuracy of urban heatwave simulations, particularly by enhancing the estimation of air temperature within urban areas [17]. LCZs also play a pivotal role in climate-sensitive urban design and planning. For instance, ref. [18] developed a methodology that integrates LCZ parameterization with urban canopy layer modeling to optimize urban spatial morphology for climate adaptability. Similarly, ref. [19] utilized LCZ classifications to analyze urban expansion and densification patterns in Kunming, China, between 2005 and 2017. Their findings highlight the utility of LCZs in assessing the impacts of urban planning policies on city morphology and local climate, emphasizing the importance of integrating LCZ frameworks into sustainable urban development strategies.
Furthermore, other studies have also explored the application of LCZs in various domains, including urban meteorology, urban pollution, and human health [20,21,22]. As such, LCZ mapping plays a critical role in urban climate research, serving as an essential tool for analyzing urban thermal environments, guiding climate-responsive urban planning, and informing health and risk mitigation strategies.

Conventionally, LCZ classification is performed by manually collecting samples of different LCZ types and using them to classify remote sensing (RS) images, which are sometimes integrated with geographic information system (GIS) layers. Manual sampling is labor-intensive and time-consuming, especially in large cities with complex urban morphologies, and is prone to human bias, making it unsuitable as a standard method for LCZ mapping at the global scale. Nevertheless, it remains in use; notable examples include the pixel-based supervised classification techniques integrated into the World Urban Database and Access Portal Tools (WUDAPT) and the object-based classification methods driven by deep learning in [23,24,25].

Since the classification results depend heavily on the quality of the training samples, an effective solution to this problem may be to exploit available GIS layers (e.g., building footprints and heights in cadastral databases) to establish a standardized LCZ workflow [26,27]. Therefore, more and more studies have explored automated or semi-automated methods to improve the efficiency and accuracy of LCZ mapping. These methods can be broadly categorized into three main types:

(1) GIS-based rule-driven methods. In [28], the authors found that the semi-automatic GIS-LCZ method had a higher classification accuracy in Berlin than the traditional WUDAPT method. By calculating seven parameters for each basic spatial unit—including the sky view factor, mean building height or mean vegetation height, aspect ratio, building surface fraction, impervious surface fraction, anthropogenic heat flux, and roughness length—LCZs were classified using a fuzzy logic algorithm.

(2) Supervised or learning-based classification methods. Ref. [29] explored two supervised classification methods—spatial indices and spatial clustering—for mapping Local Climate Zones in Xi’an, and the study showed that the spatial index method was more effective. Similarly, ref. [30] proposed an automatic LCZ classification method (AutoLCZ) based on key parameters extracted from LiDAR data (including the building surface fraction, impervious surface fraction, pervious surface fraction, and building height), using estimated thresholds derived from sample labels. Ref. [31] applied the YOLOv8 deep learning model to classify Local Climate Zones (LCZs) in Istanbul using high-resolution Google Earth imagery. The results indicated that the model exhibited confusion among LCZ types 1, 2, and 3, as well as between types B–C and E–8, highlighting the need for enhanced training data.

(3) Hybrid strategies integrating rule-based and statistical methods. Ref. [32] describes a two-stage hybrid approach that combines decision rules with a Random Forest classifier to automatically generate land-cover LCZ training samples using multi-source RS data. The method was validated across multiple climatic zones globally, demonstrating its classification effectiveness. However, it is applicable only to land-cover LCZ types (A–G), and its reliance on manually defined thresholds limits its generalization capability.

In summary, automated sample generation still faces several key challenges:

Inadequate adaptation to spatial heterogeneity: The block-based method proposed in [29] for Xi’an struggles to capture intra-block heterogeneity; AutoLCZ employs uniform grids, which are not adaptive to morphological gradients; and although the Berlin method attempts to identify transition zones using k-means, it fails to address the issue of spatial autocorrelation.
Lack of multi-parameter synergistic constraints: LCZ classification requires satisfying multiple threshold conditions simultaneously. AutoLCZ extracts only a limited set of parameters from LiDAR data; the method applied in Xi’an relies heavily on manual judgment; and the Berlin GIS-LCZ approach considers seven parameters, but lacks a mechanism for considering synergistic constraints among them.
Trade-off between computational efficiency and classification accuracy: The Xi’an method involves significant manual intervention; AutoLCZ depends on high-quality LiDAR data; and the fuzzy logic algorithm in the Berlin GIS-LCZ approach has high computational complexity and strong data dependency. None of the three methods effectively balances efficiency and accuracy. In addition, the Istanbul study emphasizes that both remote sensing image quality and annotation accuracy have a significant impact on the model’s performance.

To address these issues, this study proposes a dual-path framework for automated LCZ sample generation, integrating multi-objective optimization theory with morphology-driven spatial strategies, to enable efficient and high-precision sample extraction. The main contributions of this study are summarized as follows:

First, a multi-parameter Synergistic Optimization-based window selection strategy is proposed for built LCZ types. By employing multi-objective function evaluation and Pareto front screening, the method achieves optimal sample selection under the coordinated constraints of multiple urban morphological parameters. Furthermore, a dynamic threshold adjustment mechanism and a three-stage global-to-local fine-tuning strategy are introduced to effectively address the parameter conflicts and spatial heterogeneity inherent in traditional methods.
Second, a Distance-driven Maximum Coverage method is developed for land-cover LCZ types. Based on land-use classification data, this method utilizes chessboard distance transformation and boundary constraints to generate candidate sampling windows. It prioritizes the selection of large, non-overlapping areas, significantly enhancing spatial coverage diversity and computational efficiency.
Third, this study is the first to integrate Pareto front screening with a dynamic tolerance mechanism to mitigate premature convergence in multi-objective optimization. This innovation ensures robust and adaptive parameter optimization for built-up LCZ sample generation. The proposed dual-path framework balances strict morphological constraints with optimized spatial coverage, contributing to scalable, automated, and high-quality LCZ sample production at a global scale.

2. Materials and Methods

2.1. Materials

2.1.1. Study Area

Milan is the capital of the Lombardy region and the second most populous city in Italy, with approximately 1.4 million residents (https://www.comune.milano.it) (Figure 2). Milan faces significant challenges related to urbanization, environmental sustainability, and climate adaptability. The city’s unique geographic and climatic characteristics make it an ideal case study for exploring sustainable urban development strategies, particularly through the lens of Local Climate Zones (LCZs) and urban green spaces.

Indeed, Milan’s urban landscape exhibits diverse land-use types, including residential, commercial, and industrial areas, as well as airports, forests, agricultural fields, grasslands, and lakes, corresponding to various LCZ classifications. The city also features a mix of low-density and high-density urban structures. This diversity offers unique opportunities to study the interactions between urban morphology and climate. These characteristics allow for a comprehensive analysis of the spatial distribution of green spaces within the LCZ classification map and their role in climate regulation. As such, Milan serves as an exemplary case study for investigating sustainable urban development strategies, particularly regarding the influence of urban green spaces on Local Climate Zones.

2.1.2. Data

The data used in this study comprise multi-source heterogeneous data collected between 2020 and 2022, and are listed in Table 1. The building height and area information used for calculating the building height (BH) and building surface fraction (BSF) were derived from the EUBUCCO 2022 database. This dataset records individual building footprints, building types, and heights for over 200 million buildings across 27 EU countries and Switzerland [33]. The Corine Land Cover plus Backbone (CLCplus Backbone) was provided by the European Space Agency and is based on Sentinel-2 time series; it constitutes a global land-cover product with a 10 m resolution for the years 2020 and 2021 [34]. These data were used to calculate the pervious surface fraction (PSF) and to classify land-cover LCZ types (LCZ A–G). The Global Canopy Height Maps dataset, provided by Meta and the World Resources Institute, accurately predicted global canopy presence and height for the period of 2009–2020, with a mean absolute error of 2.8 m [35]. The Istituto Nazionale di Geofisica e Vulcanologia released a 10 m seamless digital elevation model (DEM) covering the entire Italian territory in 2023 [36]. This dataset, combined with building height and canopy height data, was used to construct a digital surface model (DSM) for the calculation of the sky view factor (SVF) [36]. Sentinel-2 images for 2020 and 2021, provided by the European Space Agency, were combined into median composites from which RS features were extracted [37]. Finally, the Impervious Built-up layer, a 10 m resolution thematic product derived from the 2018 Imperviousness Density layer, provides binary information on building presence (class 1) or absence (class 0) within the sealing outline and was used to calculate the impervious surface fraction (ISF) [38].

In this study, QGIS was used to process the spatial data. First, the Building Dataset was converted from vector format to 1 m resolution raster format, resulting in two raster layers: building height and building area. Then, the datasets CLCplus Backbone, TINITALY DEM, and Impervious Built-Up were resampled to 1 m resolution using the “r.resample” function. Finally, the raster calculator in QGIS was used to combine the Building Dataset, TINITALY DEM, and the Global Canopy Height Maps to generate the DSM. Since the remote sensing imagery used for classification was Sentinel-2 imagery, the Urban Morphological Parameters (UMPs) datasets—including BSF, ISF, PSF, SVF, and BH—were downsampled to 10 m resolution using the function “r.resamp.stats”, to ensure consistency with the Sentinel-2 spatial resolution.

2.2. Urban Morphological Parameters (UMPs)

The LCZ system comprises 10 urban types, which are combinations of buildings, roads, vegetation, soil, rocks, and water within a region. Their characteristics can be quantified using standardized parameters for urban morphology, surface radiation properties, surface thermal properties, and human activity features [12]. Based on the complementarity of these parameters and data availability, this study selects five parameters for LCZ mapping in the study area: building height, building surface fraction, pervious surface fraction, impervious surface fraction, and sky view factor [26,39,40].

(1): Building Height (BH)
The mean building height refers to the average height of the buildings within an analysis unit, computed as the arithmetic mean in Equation (1). It is a key parameter used to identify high-rise areas (BH > 25 m), mid-rise areas (BH = 15–25 m), or low-rise areas (BH < 15 m).

$BH = \frac{\sum_{i = 1}^{n} B_{height,i}}{n}$ (1)

where $B_{height,i}$ represents the height of the i-th building, and n is the count of buildings within the unit.
(2): Building Surface Fraction (BSF)
The building surface fraction is the ratio of the total building area within a unit to the total unit area. BSF is a critical parameter for distinguishing compact urban areas (BSF > 0.4) from open urban areas (BSF < 0.4).

$BSF = \frac{\sum_{i = 1}^{n} B_{area,i}}{U_{area}}$ (2)

where $B_{area,i}$ represents the area of the i-th building, and $U_{area}$ is the area of the unit.
(3): Pervious Surface Fraction (PSF)
The pervious surface fraction is the ratio of the total pervious area within a unit to the total unit area. Pervious surfaces include land-use types such as vegetation, bare soil, and water. In this study, PSF was calculated based on the sum of water and vegetation areas from land-use data provided by the ESA.

$PSF = \frac{\sum_{i = 1}^{n} P_{area,i}}{U_{area}}$ (3)

where $P_{area,i}$ represents the area of the i-th pervious surface, and $U_{area}$ is the area of the unit.
(4): Impervious Surface Fraction (ISF)
The impervious surface fraction refers to the ratio of the total impervious area within a unit to the total unit area. Here, impervious surfaces exclude building areas. ISF in this study was calculated using impervious surface area data provided by Wuhan University.

$ISF = \frac{\sum_{i = 1}^{n} I_{area,i}}{U_{area}}$ (4)

where $I_{area,i}$ represents the area of the i-th impervious surface, and $U_{area}$ is the area of the unit.
(5): Sky View Factor (SVF)
The sky view factor is the average ratio of the visible sky hemisphere from the ground, with values ranging from 0 to 1. In this study, SVF was calculated using DSM data for terrain, the global canopy height data, and building data, input into the System for Automated Geoscientific Analyses (SAGA) software [41]. The input parameters were set to their default values: a maximum search radius of 100 units and 16 directional sectors.
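As an illustration of Equations (1)–(4), the following minimal sketch computes BH, BSF, PSF, and ISF for a single analysis unit and applies the height and density thresholds quoted above. The function names, the input values, and the handling of the boundary case BSF = 0.4 (treated here as open) are assumptions made for this example, not part of the study's processing chain.

```python
def unit_parameters(building_heights, building_areas,
                    pervious_area, impervious_area, unit_area):
    """Return BH, BSF, PSF, and ISF for one analysis unit (Equations (1)-(4))."""
    n = len(building_heights)
    bh = sum(building_heights) / n if n else 0.0   # arithmetic mean height (m)
    bsf = sum(building_areas) / unit_area          # building surface fraction
    psf = pervious_area / unit_area                # pervious surface fraction
    isf = impervious_area / unit_area              # impervious (non-building) fraction
    return {"BH": bh, "BSF": bsf, "PSF": psf, "ISF": isf}


def describe(params):
    """Coarse label from the BH and BSF thresholds given in the text."""
    if params["BH"] > 25:
        height = "high-rise"
    elif params["BH"] >= 15:
        height = "mid-rise"
    else:
        height = "low-rise"
    # Assumption: BSF exactly equal to 0.4 is counted as open.
    density = "compact" if params["BSF"] > 0.4 else "open"
    return f"{density} {height}"


# Illustrative unit of 10,000 m^2 with three buildings (made-up numbers).
example = unit_parameters(
    building_heights=[12.0, 18.0, 30.0],
    building_areas=[400.0, 500.0, 600.0],
    pervious_area=2500.0,
    impervious_area=6000.0,
    unit_area=10000.0,
)
print(example, describe(example))
```

In this example the three fractions sum to 1 because the impervious area excludes building footprints, as stated in the ISF definition.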

2.3. Methods

This paper defines an LCZ mapping framework based on the integration of GIS and RS data. The core of this framework is a pixel-level supervised classification of Sentinel-2 satellite imagery using a Random Forest model (Figure 3). The specific implementation steps are as follows:

Step 1: Implementation of the dual-path automated framework for the automatic selection of LCZ samples (Algorithm 1). Path I, comprising Steps 1.1 to 1.3, is dedicated to built LCZ types, while Path II, comprising Step 1.4, targets land-cover LCZ types.

Step 1.1: Data Processing. First, the five urban morphological raster datasets, including BSF, PSF, ISF, SVF, and BH, are processed to ensure spatial consistency through spatial alignment and resampling. Based on the resampled datasets, the first-order integral image ($I$) and the squared integral image ($I^{2}$) are computed for each variable to facilitate efficient calculation of window-based statistical metrics [42]. The integral image is computed as follows:

$I(x, y) = \sum_{i = 0}^{x} \sum_{j = 0}^{y} D(i, j)$ (5)

where $D(i, j)$ represents the value of the pixel at coordinates $(i, j)$ in the original image. The integral image ($I$) enables mean calculations, while the squared integral image ($I^{2}$) is used for variance computation within a window, significantly improving the efficiency of large-scale data processing.
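To make the role of the two integral images concrete, the sketch below builds a summed-area table following Equation (5) and uses four lookups per table to obtain the mean and variance of a rectangular window. It is a minimal pure-Python illustration; the function names and the zero-padded table layout are choices made for this example and are not taken from the study's implementation.

```python
def integral_image(grid):
    """Summed-area table with an extra zero row and column to simplify lookups."""
    rows, cols = len(grid), len(grid[0])
    table = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            table[r + 1][c + 1] = (grid[r][c] + table[r][c + 1]
                                   + table[r + 1][c] - table[r][c])
    return table


def window_sum(table, r0, c0, r1, c1):
    """Sum over rows r0..r1-1 and columns c0..c1-1 using four table lookups."""
    return table[r1][c1] - table[r0][c1] - table[r1][c0] + table[r0][c0]


def window_mean_var(table, table_sq, r0, c0, r1, c1):
    """Mean and (population) variance of a window from I and the squared table."""
    n = (r1 - r0) * (c1 - c0)
    mean = window_sum(table, r0, c0, r1, c1) / n
    var = window_sum(table_sq, r0, c0, r1, c1) / n - mean * mean
    return mean, var


grid = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
I = integral_image(grid)                                    # built once per raster
I_sq = integral_image([[v * v for v in row] for row in grid])
mean, var = window_mean_var(I, I_sq, 1, 1, 3, 3)            # window holding 5, 6, 8, 9
print(mean, var)  # 7.0 2.5
```

Because both tables are built once per raster, each candidate window costs a constant number of lookups regardless of its size, which is what makes repeated evaluation during optimization affordable.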

According to the parameter thresholds for each LCZ type listed in Table 2, a sensitivity-based weighting matrix is constructed. For instance, higher weights (1.5×) are assigned to LCZ types such as LCZ 6 and LCZ 9 due to their greater sensitivity to the PSF parameter.

Step 1.2: Design of the Optimization Objectives and Scoring Functions. To enable automated construction of the LCZ sampling windows, two key functions are considered: (1) an optimization objective function to guide the search and adjustment of the candidate windows, and (2) a composite scoring function to comprehensively evaluate the quality of the candidate solutions.

(1): Design of the Optimization Objective Function

A baseline objective function $L$ is first defined to guide the search and adjustment of the sampling windows, incorporating the following four optimization objectives:

Threshold adherence ensures that the mean values within each window align with the predefined LCZ class ranges, with a dynamic tolerance ($T_{tol}$) enhancing adaptability;
Spatial uniformity constrains the local feature space by minimizing the variance within the window to ensure internal consistency;
Window shape regularity restricts the window aspect ratio ( $R_{aspect} \leq 2.0$ ) and enforces a minimum area ( $A_{\min} \geq 100 m^{2}$ ) to avoid irregular geometries;
The overlap constraint prohibits overlap between windows of different LCZ classes, while allowing only minimal overlap among windows of the same class.

The optimization function is eventually formulated as follows:

$L = \sum_{i = 1}^{5} \left[ \omega_{i} \cdot \Psi\left(\mu_{i}, [\mu_{i,\min}, \mu_{i,\max}]\right) + \lambda_{var} \cdot \sigma_{i}^{2} \right] + \lambda_{shape} \cdot \max(0, R_{aspect} - 2.0)$ (6)

where $\Psi$ denotes the out-of-bound penalty function, $\omega_{i}$ is the band-specific weight depending on the LCZ class, and $\lambda_{var}$ and $\lambda_{shape}$ are tuning coefficients controlling the contributions of the variance and shape-regularity terms, respectively.
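A possible reading of Equation (6) is sketched below. The exact form of the penalty $\Psi$ is not given in the text, so the sketch assumes a simple linear distance outside the threshold interval, widened by a tolerance; the coefficient values, threshold intervals, and weights are likewise illustrative placeholders rather than the settings used in the study.

```python
def out_of_bound_penalty(mu, lower, upper, tol=0.0):
    """Assumed form of Psi: linear distance of mu outside [lower - tol, upper + tol]."""
    if mu < lower - tol:
        return (lower - tol) - mu
    if mu > upper + tol:
        return mu - (upper + tol)
    return 0.0


def objective(means, variances, bounds, weights, height, width,
              lam_var=0.1, lam_shape=1.0, tol=0.0):
    """Loss L of Equation (6) for one candidate window (lower is better)."""
    aspect = max(height, width) / min(height, width)
    total = 0.0
    for mu, var, (lo, hi), w in zip(means, variances, bounds, weights):
        total += w * out_of_bound_penalty(mu, lo, hi, tol) + lam_var * var
    return total + lam_shape * max(0.0, aspect - 2.0)


# Hypothetical threshold intervals for five parameters (not the values of Table 2).
bounds = [(0.4, 0.7), (0.0, 0.2), (0.2, 0.5), (0.2, 0.6), (10.0, 25.0)]
weights = [1.0, 1.5, 1.0, 1.0, 1.0]   # one parameter up-weighted by 1.5x
no_var = [0.0] * 5

# All means inside their intervals, square window: no penalty.
inside = objective([0.5, 0.1, 0.3, 0.4, 15.0], no_var, bounds, weights, 10, 10)
# Second mean exceeds its interval by 0.1 and the window is 1:3 elongated.
outside = objective([0.5, 0.3, 0.3, 0.4, 15.0], no_var, bounds, weights, 10, 30)
print(inside, outside)
```

In the second call the loss is the weighted threshold violation (1.5 × 0.1) plus the shape penalty (3.0 − 2.0), which shows how the terms of Equation (6) add up.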

(2): Composite Scoring Function for Candidate Solutions

To identify the most representative samples, a composite scoring function $F$ is proposed, integrating both feature and geometric criteria. It includes four components:

The optimization confidence reflects the performance of the candidate in the objective function optimization process, defined as the inverse of the loss value;
The threshold coverage represents the proportion of spectral bands within the window that meet the predefined LCZ threshold conditions;
The shape score penalizes deviations from an ideal rectangular form based on the aspect ratio;
The threshold adherence calculates the weighted distance between band means and the center of threshold ranges to assess sample representativeness.

The composite scoring function is eventually defined as follows:

$F = \mathrm{confidence} \times \mathrm{coverage} \times \mathrm{shape\_score} + \lambda_{adherence} \cdot \mathrm{adherence}$ (7)

In terms of formulas, the optimization confidence is

$\mathrm{confidence} = \frac{1}{1 + L},$

and the threshold coverage is

$\mathrm{coverage} = \frac{N_{valid}}{N_{total}} \in [0, 1],$

where $N_{valid}$ and $N_{total}$ denote the number of threshold-satisfying bands and the total number of bands, respectively. The shape score is

$\mathrm{shape\_score} = \frac{1}{1 + \max(0, r - r_{max})}, \quad r = \frac{\max(h, w)}{\min(h, w)},$

where $h$ and $w$ represent the height and width of the window, and $r_{max}$ is the maximum acceptable aspect ratio (e.g., 2.0). The threshold adherence is

$\mathrm{adherence} = - \sum_{i = 1}^{5} \omega_{i} \cdot \left| \frac{\mu_{i} - m_{i}}{\Delta_{i}} \right|,$

where $\mu_{i}$ is the mean of the i-th band, $m_{i}$ is the midpoint of its threshold interval, $\Delta_{i}$ is half the width of that interval, and $\omega_{i}$ is a weight depending on the LCZ class.

Finally, the weight coefficient $\lambda_{adherence}$ controls the influence of adherence in the overall score, and was set to 1.0 in this study.
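The four components of Equation (7) can be combined as in the following sketch. It follows the formulas as written in this section; the function name, the argument layout, and the example intervals are assumptions introduced for illustration.

```python
def composite_score(loss, means, bounds, weights, height, width,
                    r_max=2.0, lam_adherence=1.0):
    """Composite score F of Equation (7) for one candidate window."""
    # Optimization confidence: inverse-type transform of the loss L.
    confidence = 1.0 / (1.0 + loss)
    # Threshold coverage: share of bands whose mean lies inside its interval.
    n_valid = sum(lo <= mu <= hi for mu, (lo, hi) in zip(means, bounds))
    coverage = n_valid / len(means)
    # Shape score: penalize aspect ratios beyond r_max.
    r = max(height, width) / min(height, width)
    shape_score = 1.0 / (1.0 + max(0.0, r - r_max))
    # Threshold adherence: negative weighted distance to the interval midpoints,
    # normalized by the half-width of each interval.
    adherence = -sum(
        w * abs((mu - (lo + hi) / 2.0) / ((hi - lo) / 2.0))
        for mu, (lo, hi), w in zip(means, bounds, weights)
    )
    return confidence * coverage * shape_score + lam_adherence * adherence


# Hypothetical intervals: every band is allowed to range over [0, 1].
bounds = [(0.0, 1.0)] * 5
weights = [1.0] * 5

centered = composite_score(0.0, [0.5] * 5, bounds, weights, 20, 20)
off_center = composite_score(0.0, [0.75] * 5, bounds, weights, 20, 20)
print(centered, off_center)  # 1.0 -1.5
```

Since adherence is non-positive, a window whose band means sit exactly at the interval midpoints attains the largest possible score for a given confidence, coverage, and shape.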

Step 1.3: Multi-stage Optimization Strategy. Building upon the optimization objectives and scoring functions defined in Step 1.2, a multi-stage optimization strategy was designed to ensure both the efficiency and robustness of LCZ sampling-window extraction. The proposed framework integrates global and local optimization in a sequential manner, with the following four components:

Global exploration: A dual annealing algorithm is employed to heuristically explore the complex and high-dimensional solution space [44,45]. Its ability to escape local minima makes it particularly suitable for identifying diverse initial candidate windows across varying LCZ types [46].
Dynamic variance constraint: To enhance local refinement, a dynamic threshold is applied to the standard deviation of each spectral band. The 85th percentile of global candidates’ variance distributions is used to define the maximum allowable standard deviation (max_std), effectively filtering out noisy candidates while maintaining sufficient sampling diversity. This approach strikes a balance between avoiding overly strict thresholds that may lead to sample scarcity, and avoiding overly lenient thresholds that may introduce noise.
Hybrid Optimization Strategy: To further enhance the precision of candidate window refinement, this study introduces a three-stage local optimization mechanism. First, the L-BFGS-B algorithm is employed to perform rapid bounded optimization on initial candidate windows, ensuring spatial validity through explicit constraints [47]. If the resulting confidence is insufficient, a Differential Evolution (DE) algorithm is introduced to perform a global search and escape from local optima. Finally, L-BFGS-B is applied again to finely adjust the DE results with high precision, ensuring strict compliance with LCZ-specific prior constraints such as parameter consistency and geometric regularity.
Pareto-based candidate selection: To simultaneously satisfy multiple objectives—statistical, geometric, and spatial—this study adopts a Pareto-optimal selection strategy [48]. By using non-dominated sorting and Pareto front extraction, candidate windows are evaluated based on the mean error, variance, coverage rate, and aspect ratio. This avoids subjective weighting and preserves optimal candidates that balance feature representation and geometric regularity. The parameter settings are listed in Appendix A, Table A1.
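The Pareto-based selection described in the last item can be sketched as follows. Each candidate window is represented by the four criteria named above, with coverage negated so that every objective is minimized. The sketch extracts only the first non-dominated front with a simple pairwise comparison; the candidate values are invented, and the study's actual sorting procedure may differ.

```python
def dominates(a, b):
    """True if a is no worse than b on all objectives and better on at least one.

    All objectives are assumed to be minimized.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(candidates):
    """Indices of candidates that no other candidate dominates."""
    front = []
    for i, ci in enumerate(candidates):
        dominated = any(dominates(cj, ci)
                        for j, cj in enumerate(candidates) if j != i)
        if not dominated:
            front.append(i)
    return front


# Tuples of (mean_error, variance, -coverage, aspect_ratio); illustrative values.
candidates = [
    (0.10, 0.02, -1.0, 1.0),  # 0: balanced candidate
    (0.05, 0.04, -1.0, 1.5),  # 1: lowest mean error, higher variance
    (0.12, 0.03, -0.8, 1.2),  # 2: worse than candidate 0 on every criterion
    (0.20, 0.01, -0.6, 1.0),  # 3: lowest variance, poorer coverage
]
front = pareto_front(candidates)
print(front)  # [0, 1, 3]
```

Candidate 2 is discarded without any weighting of the criteria, while candidates 0, 1, and 3 are kept because each is best on at least one trade-off.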

Step 1.4: For land-cover LCZ types, the Distance-driven Maximum Coverage method is applied. Based on the Chessboard Distance Transform, the maximum reachable radius of each target pixel is computed as $r_{\max} = \min(d_{edge}, d_{chessboard})$.

Candidate windows exceeding 0.5 km² are prioritized in a sampling queue, while an orthogonal overlap detection algorithm ensures that the final set of selected windows maintains an overall overlap ratio below 5%.
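The radius computation of Step 1.4 can be illustrated with a two-pass chessboard distance transform on a binary class mask. In this sketch, $d_{chessboard}$ is taken as the chessboard distance to the nearest non-target pixel and $d_{edge}$ as the distance to the first position outside the raster; both conventions, along with the function names and the toy mask, are assumptions made for the example. The area-based prioritization and overlap check are not shown.

```python
def chessboard_distance(mask):
    """Chessboard distance from each target pixel (1) to the nearest 0 pixel."""
    rows, cols = len(mask), len(mask[0])
    far = rows + cols  # exceeds any chessboard distance inside the grid
    dist = [[far if mask[r][c] else 0 for c in range(cols)] for r in range(rows)]
    passes = (
        # Forward pass: propagate from neighbours above and to the left.
        (range(rows), range(cols), ((-1, -1), (-1, 0), (-1, 1), (0, -1))),
        # Backward pass: propagate from neighbours below and to the right.
        (range(rows - 1, -1, -1), range(cols - 1, -1, -1),
         ((1, 1), (1, 0), (1, -1), (0, 1))),
    )
    for row_order, col_order, offsets in passes:
        for r in row_order:
            for c in col_order:
                for dr, dc in offsets:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        dist[r][c] = min(dist[r][c], dist[rr][cc] + 1)
    return dist


def max_radius(mask):
    """r_max = min(d_edge, d_chessboard) for every target pixel, 0 elsewhere."""
    rows, cols = len(mask), len(mask[0])
    d_chess = chessboard_distance(mask)
    radius = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                d_edge = min(r, c, rows - 1 - r, cols - 1 - c) + 1
                radius[r][c] = min(d_edge, d_chess[r][c])
    return radius


# 5 x 5 mask of one land-cover class with a single foreign pixel in a corner.
mask = [[1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1],
        [1, 1, 1, 1, 0]]
radius = max_radius(mask)
print(radius[2][2], radius[0][0], radius[3][3])  # 2 1 1
```

Under these conventions, a square window centred on a pixel with half-width $r_{\max} - 1$ stays inside the raster and contains only target pixels, so pixels with large $r_{\max}$ are natural centres for large candidate windows.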

Step 2: Compute the Feature Parameter Set for RS Classification. The feature parameter set is a collection of spectral, morphological, and spatial characteristics derived from multi-source RS and GIS data. The urban morphology parameters (BH, BSF, ISF, PSF, and SVF) extracted from the Building Dataset, land-cover product, impervious product, and DSM effectively characterized the internal structure of cities. Therefore, Sentinel-2 spectral bands and GIS-derived urban morphology parameters were selected as the input feature set for the Random Forest (RF) classifier.

Step 3: RF Classifier Based on Feature Optimization. RF is an ensemble learning algorithm based on decision trees that is efficient, robust, and highly adaptable to high-dimensional data [49]. Several studies have demonstrated that RF achieves higher accuracy in LCZ mapping, and it has become one of the most widely used and robust classification algorithms for pixel-level classification [50,51,52]. Feature optimization plays a critical role in understanding the importance of different features in the classification process, improving computational efficiency, eliminating noisy variables, and reducing model complexity.

Algorithm 1: Dual-path Framework for Automated LCZ Sample Generation

In RF models, the Out-of-Bag Error, derived from the Bagging algorithm, provides an effective method for model evaluation and feature selection [53]. Specifically, each decision tree is trained using the Bootstrap resampling technique, where approximately two-thirds of the original dataset is randomly sampled for modeling, while the remaining one-third of the samples, not selected, form the Out-of-Bag (OOB) dataset. These OOB samples serve as a natural independent validation set. The model computes the importance score of each feature by evaluating the change in OOB prediction error when randomly permuting the feature values. This quantifies the global impact of each feature on predictions. After the optimal feature set was selected, the LCZ sample set was divided into training and testing samples at a 7:3 ratio. Finally, the RF classifier was used for LCZ mapping, followed by accuracy validation.
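The permutation principle behind the importance score can be sketched in a few lines. For brevity, the example below replaces the Random Forest and its per-tree OOB samples with a hypothetical stand-in classifier evaluated on a fixed validation set, so it illustrates only the "permute one feature, measure the increase in error" step; all data and names are invented.

```python
import random


def error_rate(predict, rows, labels):
    """Fraction of samples for which the prediction differs from the label."""
    wrong = sum(predict(row) != y for row, y in zip(rows, labels))
    return wrong / len(rows)


def permutation_importance(predict, rows, labels, seed=0):
    """Increase in error after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = error_rate(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        rng.shuffle(column)  # break the link between feature j and the labels
        permuted = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
        importances.append(error_rate(predict, permuted, labels) - baseline)
    return importances


def predict(row):
    """Stand-in classifier (not a Random Forest): thresholds feature 0 only."""
    return int(row[0] > 0.5)


# Toy validation set: feature 0 drives the class, feature 1 is unrelated noise.
rows = [[i / 39, (i * 7 % 13) / 13] for i in range(40)]
labels = [int(row[0] > 0.5) for row in rows]

importance = permutation_importance(predict, rows, labels)
print(importance)
```

Shuffling the informative feature raises the error, whereas shuffling the ignored feature leaves it unchanged, which is the behaviour that allows uninformative variables to be ranked low and removed.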

3. Results

3.1. LCZ Map Validation

According to the official WUDAPT guidelines, “fit for purpose” urban data should be produced using globally consistent methods and openly accessible input data and tools. In alignment with this principle, WUDAPT provides the LCZ Generator, a tool designed to streamline the mapping process of Local Climate Zones (LCZs) and enhance the accuracy and accessibility of information regarding urban-induced human and environmental impacts [54].

Therefore, we validated the sample results for the city of Milan, generated using the proposed dual-pathway framework, against those generated by the LCZ Generator. The sampling frequency and validation results are shown in Figure 4. The overall accuracy (OA) of 0.77 met the minimum average accuracy threshold of 0.5 required by WUDAPT’s automated quality control, indicating that the classification results were reliable and suitable for subsequent LCZ mapping [55]. The urban-only overall accuracy (OAu = 0.68) fell within a reasonable range for LCZ classification in urban microclimate studies. This level of accuracy is generally sufficient for medium-precision research scenarios, such as urban morphology analysis and urban heat island assessment. The overall accuracy for built vs. natural LCZ classes (OAbu = 0.91) reflected an excellent level of classification performance between built-up and natural types. This result confirmed that the generated samples possessed reliable discriminatory power at the macro scale. The weighted overall accuracy (OAw = 0.94), which incorporated class similarity weights into the evaluation, demonstrated the strong overall performance of the samples when spatial correlations among LCZ types were taken into account.
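For readers unfamiliar with these metrics, the sketch below computes OA, OAu, and OAbu from paired reference and predicted labels. The definitions used here are assumptions for illustration: OAu is taken as the accuracy over samples whose reference class is a built type (LCZ 1–10), and OAbu as the agreement on the built-versus-natural grouping. OAw is omitted because it requires a class-similarity weight matrix that is not reproduced in this section. The labels are invented.

```python
BUILT = {str(k) for k in range(1, 11)}  # LCZ 1-10; letters A-G are land-cover types


def overall_accuracy(reference, predicted):
    """OA: share of samples whose predicted class equals the reference class."""
    return sum(r == p for r, p in zip(reference, predicted)) / len(reference)


def urban_accuracy(reference, predicted):
    """OAu (assumed definition): accuracy over samples with a built reference class."""
    pairs = [(r, p) for r, p in zip(reference, predicted) if r in BUILT]
    return sum(r == p for r, p in pairs) / len(pairs)


def built_natural_accuracy(reference, predicted):
    """OAbu (assumed definition): agreement on the built vs. natural grouping."""
    agree = sum((r in BUILT) == (p in BUILT) for r, p in zip(reference, predicted))
    return agree / len(reference)


# Invented validation labels for eight samples.
reference = ["2", "2", "6", "8", "A", "D", "D", "G"]
predicted = ["2", "3", "6", "8", "A", "D", "B", "G"]

oa = overall_accuracy(reference, predicted)
oa_u = urban_accuracy(reference, predicted)
oa_bu = built_natural_accuracy(reference, predicted)
print(oa, oa_u, oa_bu)  # 0.75 0.75 1.0
```

The example shows why OAbu is typically the highest of the three: confusing LCZ 2 with LCZ 3, or LCZ D with LCZ B, lowers OA but still counts as correct at the built-versus-natural level.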

3.2. LCZ Mapping Results

In this study, the automatically generated samples were further used to explore the potential of urban morphological parameters (UMPs) as auxiliary data in LCZ mapping. UMPs characterize the physical structure of urban environments, including the building height, vegetation canopy height, sky view factor, and land-cover types. These parameters can provide additional spatial–structural information to classification models and have demonstrated enhanced discriminative power, particularly when distinguishing morphologically similar urban LCZ classes.

Relative to a baseline classification model using only Sentinel-2 imagery, incorporating UMP data into the training and prediction process improved classification performance. The overall accuracy increased from 0.942 to 0.953 (a gain of 1.1 percentage points), well above the minimum accuracy threshold recommended by WUDAPT. As shown in Figure 5, the inclusion of UMPs notably improved both the classification accuracy and the spatial consistency of built LCZ types.

Figure 6 presents the confusion matrix of the LCZ classification results generated by the model after integrating Sentinel-2 imagery with UMP data. Overall, the classification accuracy across LCZ types was consistently high, indicating a stable model performance. Notably (Table 3), LCZ D performed very strongly, with an F1-score of 0.99 (second only to LCZ G at 1.00), demonstrating the model’s strong capability in identifying natural areas. Typical built classes such as LCZ 2, 6, 8, and 9 also achieved high F1-scores (all at or above 0.89), indicating the model’s robust discriminative power in distinguishing between different urban functional zones.
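For reference, the per-class metrics in Table 3 follow the standard definitions, written here with TP, FP, and FN denoting the true positives, false positives, and false negatives of a class. The worked line uses the rounded LCZ D values from Table 3; recomputing F1 from rounded precision and recall can differ from the tabulated value in the last digit for other classes.

```latex
\begin{align}
P &= \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}, \qquad F_1 = \frac{2PR}{P + R} \\
F_1^{\text{LCZ D}} &= \frac{2 \times 0.99 \times 1.00}{0.99 + 1.00} = \frac{1.98}{1.99} \approx 0.99
\end{align}
```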

In summary, the LCZ mapping results for the city of Milan (Figure 7) validate the effectiveness of the proposed dual-pathway framework for both automated sample generation and high-accuracy LCZ classification. Moreover, they highlight the crucial role of UMP data in enhancing the classification precision of urban morphological types, thereby laying a solid foundation for future applications of LCZs in urban climate modeling and urban planning.

4. Discussion

4.1. The Influence of High-Resolution UMPs on the Construction of LCZ Samples

The fundamental role of high-resolution GIS data in quantifying urban morphological characteristics is thoroughly demonstrated in this study. Such data offer significant advantages in capturing the complexity of urban forms, and can effectively improve the accuracy and quality of automated LCZ sample generation.

The feature importance analysis based on Out-of-Bag (OOB) scores in the Random Forest model indicates that SVF, DSM, and PSF—as high-resolution urban structural parameters—demonstrate the highest discriminative power in LCZ classification (Figure 8). Among all UMPs, SVF ranks as the most important feature, with an OOB score significantly outperforming that of other structural variables such as DSM and BSF.

SVF effectively captures the degree of spatial openness and building obstruction in urban environments, making it a key indicator for distinguishing between compact high-density built-up zones and open low-density areas. Notably, SVF cannot be directly extracted from traditional remote sensing imagery; instead, it must be derived from high-resolution GIS datasets, highlighting its independent and irreplaceable role in urban morphological analysis. Therefore, SVF is regarded as the most representative and explanatory variable when analyzing the impact of fine-scale urban structure on LCZ classification performance.

Based on this finding, SVF was used as the representative variable in an experiment comparing SVF data at two spatial resolutions (30 m and 1 m). This allowed for a quantitative evaluation of how the GIS data resolution affected the LCZ classification accuracy, thereby validating the critical role of high-resolution structural data in urban climate modeling. All SVF values were generated using the SAGA GIS software. Figure 9a presents the SVF results calculated from 30 m resolution DSM data; this resolution fails to adequately capture the spatial openness within complex urban environments. In contrast, Figure 9b shows the SVF results derived from higher-resolution inputs, including a 10 m DEM resampled to 1 m, 1 m vegetation canopy height data, and 1 m building height data obtained through vector-to-raster conversion. Compared with the coarse-resolution output, the high-resolution results more accurately reveal the fine-scale openness of the urban spatial structure, clearly demonstrating the advantages of high-resolution data in SVF extraction for urban areas.
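The resolution effect can be illustrated with a deliberately idealized example. The sketch below is not the SAGA GIS algorithm used in this study: it assumes a common discrete approximation, SVF ≈ 1 − mean(sin γ) over a set of azimuths, where γ is the horizon elevation angle, and it uses a synthetic scene of 20 m buildings separated by 6 m wide streets. All function names and scene dimensions are hypothetical.

```python
import math
from typing import List

Grid = List[List[float]]


def sky_view_factor(dsm: Grid, row: int, col: int, cell_size: float,
                    n_dirs: int = 16, max_steps: int = 50) -> float:
    """Approximate SVF at one cell as 1 - mean(sin(horizon angle)) over n_dirs azimuths."""
    n_rows, n_cols = len(dsm), len(dsm[0])
    z0 = dsm[row][col]
    sin_sum = 0.0
    for k in range(n_dirs):
        azimuth = 2.0 * math.pi * k / n_dirs
        d_row, d_col = -math.cos(azimuth), math.sin(azimuth)
        max_tan = 0.0  # the horizon is never taken below horizontal
        for step in range(1, max_steps + 1):
            r = int(round(row + d_row * step))
            c = int(round(col + d_col * step))
            if not (0 <= r < n_rows and 0 <= c < n_cols):
                break  # the ray has left the grid
            dist = cell_size * math.hypot(r - row, c - col)
            max_tan = max(max_tan, (dsm[r][c] - z0) / dist)
        sin_sum += math.sin(math.atan(max_tan))
    return 1.0 - sin_sum / n_dirs


def block_mean(dsm: Grid, factor: int) -> Grid:
    """Aggregate a grid to a coarser resolution by averaging factor x factor blocks."""
    out = []
    for br in range(len(dsm) // factor):
        out_row = []
        for bc in range(len(dsm[0]) // factor):
            vals = [dsm[r][c]
                    for r in range(br * factor, (br + 1) * factor)
                    for c in range(bc * factor, (bc + 1) * factor)]
            out_row.append(sum(vals) / len(vals))
        out.append(out_row)
    return out


# Synthetic 90 m x 90 m scene at 1 m: 20 m buildings with a 6 m street every 30 m.
fine = [[0.0 if 12 <= c % 30 < 18 else 20.0 for c in range(90)] for _ in range(90)]
coarse = block_mean(fine, 30)  # the same scene aggregated to 30 m cells

svf_fine = sky_view_factor(fine, 45, 44, cell_size=1.0)      # street centre at 1 m
svf_coarse = sky_view_factor(coarse, 1, 1, cell_size=30.0)   # same location at 30 m
print(f"SVF at 1 m: {svf_fine:.2f}, SVF at 30 m: {svf_coarse:.2f}")
```

In this toy scene every 30 m cell averages to the same height, so the coarse surface is flat and the street canyon disappears entirely, whereas the 1 m grid yields a clearly reduced SVF at the street centre.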

With the increasing availability of high-resolution geospatial data, the generation of samples for urban LCZ classes is expected to gradually move away from dependence on manual intervention and low-resolution legacy datasets. Urban categories that were previously difficult to extract automatically due to data limitations can now be processed more efficiently and accurately. This trend indicates that high-resolution GIS data will play an increasingly critical role in advancing the automation of LCZ sample generation and enhancing the overall quality of the resulting samples.

4.2. Interpretation of the LCZ Mapping Results

To further validate the effectiveness of the automated LCZ classification results in this study, we compared our findings with existing research. Specifically, the proposed automated workflow identified four built LCZ types in the Milan area: LCZ 2, 6, 8, and 9. In contrast, ref. [56] reports a study based on expert interpretation, identifying five built LCZ types: LCZ 2, 3, 5, 6, and 8.

This discrepancy reflects the complexity of LCZ delineation in Milan and highlights the challenges associated with automated sample extraction for certain urban categories. For example, LCZ 3 (compact low-rise) and LCZ 5 (open mid-rise), although expected in Milan, were not successfully detected in the automated classification results. This may be attributed to factors such as the sample-window size, feature extraction scale, and degree of land-cover mixing.

In addition to this qualitative comparison, a quantitative assessment was also performed to benchmark the classification performance of our proposed method against the PRISMA product developed in [56]. Table 4 presents the precision, recall, and F1-score values for each LCZ class, along with the overall accuracy. The results demonstrate that our method achieves a higher overall accuracy (0.95) than that reported in [56] (0.88), with higher F1-scores for LCZ 2 (compact mid-rise) and LCZ 6 (open low-rise), whereas [56] scores higher for LCZ 8 (large low-rise; 0.99 vs. 0.90). However, the results in [56] provide coverage for LCZ 3 and LCZ 5, which were not detected by the automated workflow, further confirming the previously discussed limitations.

Therefore, although the automated workflow adopted in this study has demonstrated high accuracy in multi-class LCZ recognition, there remains room for improvement in achieving comprehensive urban class coverage. Future research could focus on adjusting sampling-window sizes, enhancing boundary-detection capabilities, and incorporating additional high-resolution auxiliary datasets to improve the distinction between morphologically similar classes—such as LCZ 3 and 5—and further advance the refinement of automated LCZ sample-extraction methods.

4.3. Scalability and Generalizability to Diverse Urban Forms

This study aimed to develop a standardized and automated workflow for LCZ sample generation and mapping, with the goal of improving the generality, efficiency, and accuracy of LCZ classification research. Accordingly, the selected UMPs—including BSF, BH, ISF, PSF, and SVF—were chosen for their high availability and accessibility.

In regions where the specific datasets used in this study may not be readily available, alternative global products can be used. For example, Microsoft Maps provides a global building footprint dataset derived from Bing Maps imagery (2014–2023), including 1.2 billion building footprints and 174 million building height estimates, based on sources such as Maxar, Airbus, and IGN France (Microsoft Global ML Building Footprints). Similarly, DSM data can be substituted using globally available products such as AW3D30 DSM [57]. Although Section 4.1 discusses the impact of spatial resolution differences on UMP computation, such alternatives can still serve as effective substitutes in regions where high-resolution GIS data are unavailable. In summary, although this study employed high-resolution GIS-derived UMPs as inputs, the proposed automated LCZ mapping framework is theoretically transferable to other cities.

While the case study focuses on Milan, which typifies the compact mid-rise urban form commonly found in European cities, it is important to note the considerable variation in urban morphology worldwide. Cities with high-density cores (e.g., Hong Kong, Tokyo), low-rise sprawl (e.g., Los Angeles), rapid urbanization (e.g., Shenzhen), informal settlements in tropical climates (e.g., Mumbai), or lush vegetation in equatorial cities (e.g., Kuala Lumpur) all exhibit unique spatial structures, urban heat island patterns, and green-space configurations.

Although Milan differs from these cities in many respects, the proposed framework is based on LCZ-specific parameter thresholds and GIS data inputs, and thus can be adapted for LCZ sample generation and mapping in such diverse urban environments. However, caution is warranted when applying the framework to cities in different climatic zones, which represents a limitation of this study. Large variations in urban morphology may lead to differences in LCZ threshold values. Therefore, adjusting the parameters of the framework based on local urban characteristics will be essential to enhance its applicability and robustness.

4.4. Challenges and Prospects of Automated LCZ Sample Generation

As a key classification system for investigating the relationship between urban climate and urban structure, the accurate delineation of LCZs is critical for studies in areas such as urban heat environment modeling and climate-adaptive urban planning. However, due to the strong spatial heterogeneity and high degree of land-cover mixing within urban environments, the automated extraction of high-quality and representative LCZ samples remains a significant challenge.

The dual-pathway automated sample extraction framework proposed in this study aims to improve both the efficiency and accuracy of LCZ sample generation by incorporating high-resolution GIS data and physically based parameters. By combining global optimization and local refinement based on surface physical characteristics and LCZ parameter thresholds, the method enables physically interpretable LCZ identification. The framework demonstrates high reliability across most LCZ types, confirming that the integration of high-resolution GIS data significantly enhances the scientific robustness and stability of automated LCZ sample generation, while also laying a foundation for future quantitative analyses of urban morphology.
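For urban samples, candidate windows are screened on a Pareto front. As a minimal illustration of that idea, the sketch below keeps only the non-dominated candidates when two error terms are minimized at once. The choice of objectives (mean error and variance error, both named in Table A1), the candidate values, and the function name are assumptions for illustration; the framework's actual objective functions are not reproduced here.

```python
from typing import List, Tuple

# (window id, mean error, variance error); both errors are to be minimized.
Candidate = Tuple[str, float, float]


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates."""
    front = []
    for a in candidates:
        dominated = any(
            b[1] <= a[1] and b[2] <= a[2] and (b[1] < a[1] or b[2] < a[2])
            for b in candidates
        )
        if not dominated:
            front.append(a)
    return front


# Hypothetical candidate windows.
candidates = [
    ("w1", 0.10, 0.30),
    ("w2", 0.20, 0.10),
    ("w3", 0.25, 0.35),  # worse than w1 on both objectives
    ("w4", 0.05, 0.40),
]
front_ids = [c[0] for c in pareto_front(candidates)]
print(front_ids)
```

Here w3 is discarded because w1 is better on both objectives, while w1, w2, and w4 each represent a different trade-off and are all retained.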

Nevertheless, the experimental results reveal some limitations in identifying morphologically similar urban classes (e.g., LCZ 3 and LCZ 5), particularly in areas with fuzzy class boundaries or significant spatial mixing, where classification accuracy tends to decrease. This suggests that the current framework still has room for improvement in terms of fine-scale classification within complex urban settings.

Future research can build upon this framework to further optimize the sample recognition process. On the one hand, machine learning models can be introduced to enhance the model’s ability to distinguish complex spatial structures. On the other hand, prior knowledge from existing high-quality LCZ datasets can be leveraged to guide window selection and class determination. Additionally, the integration of multi-source RS data—such as SAR and nighttime light imagery—holds great potential for improving the distinguishability of LCZ types. Through these advancements, automated LCZ sample generation is expected to evolve toward more intelligent and fine-grained methodologies, providing a more solid data foundation for urban climate research and morphological modeling.

5. Conclusions

The dual-pathway automated sampling framework developed in this study significantly improves the efficiency and accuracy of LCZ sample generation by integrating high-resolution GIS data and urban physical parameter constraints. The key contributions are as follows:

Methodological innovation: This study is the first to combine Pareto front screening with a dynamic tolerance mechanism, effectively addressing the problem of premature convergence in multi-objective optimization. The proposed dual-pathway framework demonstrates strong adaptability and high accuracy in generating samples for both urban and land-cover LCZ types.
Advantages of data fusion: By integrating high-resolution GIS data (specifically, urban morphological parameters) with RS imagery, the framework significantly enhances sample representativeness. This confirms the core value of high-resolution datasets in supporting automated LCZ construction.
Outstanding classification performance: In the case study of Milan, the overall classification accuracy reached 95.3%, with stable performance in typical urban types such as LCZ 2, 6, 8, and 9, indicating strong discriminatory power across different urban functional zones.

Some limitations remain: morphologically similar classes such as LCZ 3 (compact low-rise) and LCZ 5 (open mid-rise) were not detected by the automated workflow. Future research could further optimize the framework by leveraging even higher-resolution data, intelligent window-selection strategies, and multi-source RS fusion. Overall, this framework establishes a solid data and methodological foundation for urban climate studies and smart city planning.

Author Contributions

Conceptualization, W.L. and P.G.; methodology, W.L.; software, W.L.; validation, W.L. and X.L.; formal analysis, X.L.; investigation, X.L.; resources, W.L. and X.L.; data curation, W.L. and A.S.; writing—original draft preparation, W.L.; writing—review and editing, W.L., A.S. and P.G.; visualization, W.L. and X.L.; supervision, A.S. and P.G.; project administration, P.G.; funding acquisition, P.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the China Scholarship Council (202404910092) and the European Space Agency through the HEATWISE project.

Data Availability Statement

Data are available from the corresponding author upon reasonable request.

Acknowledgments

We are particularly grateful to all the researchers and institutions that provided data support for this study. Special thanks are extended to Alberto Vavassori and his team at Politecnico di Milano for their valuable support of this research.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Parameter settings used for built LCZ type processing in the dual-pathway automated sampling framework.

Parameter Type	Parameter Name	Value	Usage
Global Parameters	HUGE_PENALTY	$10^{10}$	A penalty value is applied to invalid solutions (e.g., constraint violations) to force the optimizer away from infeasible regions
	LOCAL_REFINE_THRESHOLD	0.5	The confidence threshold for triggering local refinement; activates the three-stage local optimization if the confidence is below this value
	MAX_WINDOW_RATIO	2.0	The maximum allowed aspect ratio of a sampling window to avoid overly elongated shapes (e.g., ratio > 2:1)
	MIN_WINDOW_AREA	100	The minimum window area (in pixels); filters out excessively small candidate windows
	SHAPE_WEIGHT	50.0	The shape penalty weight; applies a linear penalty to windows exceeding the aspect ratio limit
	UNIFORMITY_REWARD_WEIGHT	0.05	The reward weight for spatial uniformity; encourages low-variance windows
	COVERAGE_THRESHOLD	1	The posterior validation threshold: requires 100% of bands to meet their respective mean constraints
	ALLOWED_OVERLAP_RATIO	0	The maximum allowed overlap ratio for same-class windows (0 indicates strict non-overlapping)
	LAMBDA_ADHERENCE	1.0	The weight for threshold adherence; emphasizes the closeness between band means and the threshold center in the composite score
Custom Parameters	MAX_ITERATIONS	500	The maximum number of iterations for the dual annealing algorithm during the global search phase
	CANDIDATE_CONFIDENCE	0.9	The minimum confidence threshold for candidate solutions; candidates below this value are discarded
	MIN_CONFIDENCE	0.05	The lower confidence bound for retaining candidate solutions to avoid over-filtering
	VAR_WEIGHT	0.05	The weight assigned to variance error in the objective function (balanced with the weight for mean error)
	MEAN_PENALTY_WEIGHT	8.0	The penalty weight for mean deviation; controls the sensitivity of the objective function to deviations from threshold means
	MAX_STD	–	The initial maximum allowed standard deviation for each band (prior to dynamic adjustment), corresponding to `SVF:0.5`, `BSF:0.5`, `ISF:0.55`, `PSF:0.6`, and `BH:15.0` respectively.
Adaptive parameters	EFFECTIVE_MAX_STD	85th	Dynamically constrains within-window uniformity by excluding the top 15% of high-variance outliers
Adaptive parameters	TOL_FACTOR	0.15	The dynamic tolerance range for posterior mean validation, allowing slight deviation from the threshold (e.g., ±15%)
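To make the roles of some of these parameters concrete, the sketch below shows one way the shape penalty and the posterior mean validation could be written. The constant names and values come from Table A1, but the formulas are assumptions: the penalty is taken to grow linearly with the excess aspect ratio, and the ±15% tolerance is taken relative to each threshold bound. The function names and the example window are hypothetical.

```python
from typing import Dict, Tuple

MAX_WINDOW_RATIO = 2.0    # maximum allowed aspect ratio
MIN_WINDOW_AREA = 100     # minimum window area in pixels
SHAPE_WEIGHT = 50.0       # weight of the linear shape penalty
TOL_FACTOR = 0.15         # tolerance for posterior mean validation
COVERAGE_THRESHOLD = 1.0  # share of bands that must pass validation


def shape_penalty(width: int, height: int) -> float:
    """Assumed form: linear penalty on the aspect ratio in excess of MAX_WINDOW_RATIO."""
    ratio = max(width, height) / min(width, height)
    return SHAPE_WEIGHT * max(0.0, ratio - MAX_WINDOW_RATIO)


def is_large_enough(width: int, height: int) -> bool:
    """Filter out windows smaller than MIN_WINDOW_AREA pixels."""
    return width * height >= MIN_WINDOW_AREA


def passes_posterior_check(band_means: Dict[str, float],
                           thresholds: Dict[str, Tuple[float, float]]) -> bool:
    """Assumed form: each band mean must lie within its range widened by TOL_FACTOR."""
    passed = 0
    for band, (low, high) in thresholds.items():
        if low * (1.0 - TOL_FACTOR) <= band_means[band] <= high * (1.0 + TOL_FACTOR):
            passed += 1
    return passed / len(thresholds) >= COVERAGE_THRESHOLD


# LCZ 2 ranges from Table 2, with "<20" written as (0, 20).
lcz2 = {"SVF": (0.3, 0.6), "BSF": (40, 70), "ISF": (30, 50), "PSF": (0, 20), "BH": (10, 25)}
window = {"SVF": 0.62, "BSF": 55, "ISF": 45, "PSF": 12, "BH": 18}

print(shape_penalty(30, 10), is_large_enough(30, 10), passes_posterior_check(window, lcz2))
```

With these assumptions, a mean SVF of 0.62 is accepted for LCZ 2 even though it slightly exceeds the nominal upper bound of 0.6, which is the kind of slight deviation the tolerance is meant to allow.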

Table A2. Reference table for Local Climate Zone (LCZ) classification.

LCZ Code	Name	Description
1	Compact high-rise	Dense, tall buildings with little vegetation
2	Compact mid-rise	Dense, mid-height buildings with limited vegetation
3	Compact low-rise	Dense, low buildings with paved surfaces
4	Open high-rise	Tall buildings with open spacing and vegetation
5	Open mid-rise	Mid-height buildings with moderate open space
6	Open low-rise	Detached low buildings with vegetation
7	Lightweight low-rise	Informal or temporary low buildings
8	Large low-rise	Industrial or commercial buildings with large footprints
9	Sparsely built	Scattered buildings with substantial open land
10	Heavy industry	Large-scale industrial complexes
A	Dense trees	Forests or wooded areas with closed canopy
B	Scattered trees	Open tree cover with grass or bare soil
C	Bush, scrub	Shrubs, low woody plants, sparse trees
D	Low plants	Grasslands, herbaceous vegetation
E	Bare rock or paved	Hard, non-vegetated surfaces
F	Bare soil or sand	Loose soil, sand, or dry ground
G	Water	Rivers, lakes, reservoirs, and other water bodies

References

1. United Nations. World Urbanization Prospects: The 2014 Revision, Highlights; Department of Economic and Social Affairs, Population Division, United Nations: New York, NY, USA, 2014; Volume 32.
2. Acuto, M.; Parnell, S.; Seto, K.C. Building a Global Urban Science. Nat. Sustain. 2018, 1, 2–4.
3. Collier, C.G. The Impact of Urban Areas on Weather. Q. J. R. Meteorol. Soc. 2006, 132, 1–25.
4. Miao, S.; Chen, F.; Li, Q.; Fan, S. Impacts of Urban Processes and Urbanization on Summer Precipitation: A Case Study of Heavy Rainfall in Beijing on 1 August 2006. J. Appl. Meteorol. Climatol. 2011, 50, 806–825.
5. Oke, T.R.; Mills, G.; Christen, A.; Voogt, J.A. Urban Climates; Cambridge University Press: Cambridge, UK, 2017.
6. Sun, Y.; Zhang, X.; Ren, G.; Zwiers, F.W.; Hu, T. Contribution of Urbanization to Warming in China. Nat. Clim. Chang. 2016, 6, 706–709.
7. Yan, Z.W.; Wang, J.; Xia, J.J.; Feng, J.M. Review of Recent Studies of the Climatic Effects of Urbanization in China. Adv. Clim. Chang. Res. 2016, 7, 154–168.
8. Manoli, G.; Fatichi, S.; Schläpfer, M.; Yu, K.; Crowther, T.W.; Meili, N.; Burlando, P.; Katul, G.G.; Bou-Zeid, E. Magnitude of Urban Heat Islands Largely Explained by Climate and Population. Nature 2019, 573, 55–60.
9. Oke, T.R. Initial Guidance to Obtain Representative Meteorological Observations at Urban Sites; World Meteorological Organization: Geneva, Switzerland, 2004; Volume 81.
10. Ellefsen, R. Mapping and Measuring Buildings in the Canopy Boundary Layer in Ten U.S. Cities. Energy Build. 1991, 16, 1025–1049.
11. Davenport, A.G.; Grimmond, C.S.B.; Oke, T.R.; Wieringa, J. Estimating the Roughness of Cities and Sheltered Country. In Proceedings of the 12th Conference on Applied Climatology, Asheville, NC, USA, 8–11 May 2000; Volume 96, p. 99.
12. Stewart, I.D.; Oke, T.R. Local Climate Zones for Urban Temperature Studies. Bull. Am. Meteorol. Soc. 2012, 93, 1879–1900.
13. Demuzere, M.; Hankey, S.; Mills, G.; Zhang, W.; Lu, T.; Bechtel, B. Combining Expert and Crowd-Sourced Training Data to Map Urban Form and Functions for the Continental US. Sci. Data 2020, 7, 264.
14. Bechtel, B.; Demuzere, M.; Mills, G.; Zhan, W.; Sismanidis, P.; Small, C.; Voogt, J. SUHI Analysis Using Local Climate Zones—A Comparison of 50 Cities. Urban Clim. 2019, 28, 100451.
15. Liu, Y.; Li, Q.; Yang, L.; Mu, K.; Zhang, M.; Liu, J. Urban Heat Island Effects of Various Urban Morphologies under Regional Climate Conditions. Sci. Total Environ. 2020, 743, 140589.
16. Das, M.; Das, A. Exploring the Pattern of Outdoor Thermal Comfort (OTC) in a Tropical Planning Region of Eastern India during Summer. Urban Clim. 2020, 34, 100708.
17. Patel, P.; Jamshidi, S.; Nadimpalli, R.; Aliaga, D.G.; Mills, G.; Chen, F.; Demuzere, M.; Niyogi, D. Modeling Large-Scale Heatwave by Incorporating Enhanced Urban Representation. J. Geophys. Res. Atmos. 2022, 127, e2021JD035316.
18. Liu, L.; Liu, J.; Jin, L.; Liu, L.; Gao, Y.; Pan, X. Climate-Conscious Spatial Morphology Optimization Strategy Using a Method Combining Local Climate Zone Parameterization Concept and Urban Canopy Layer Model. Build. Environ. 2020, 185, 107301.
19. Vandamme, S.; Demuzere, M.; Verdonck, M.L.; Zhang, Z.; Van Coillie, F. Revealing Kunming’s (China) Historical Urban Planning Policies through Local Climate Zones. Remote Sens. 2019, 11, 1731.
20. Shi, Y.; Ren, C.; Lau, K.K.L.; Ng, E. Investigating the Influence of Urban Land Use and Landscape Pattern on PM2.5 Spatial Variation Using Mobile Monitoring and WUDAPT. Landsc. Urban Plan. 2019, 189, 15–26.
21. Brousse, O.; Georganos, S.; Demuzere, M.; Vanhuysse, S.; Wouters, H.; Wolff, E.; Linard, C.; van Lipzig, N.P.M.; Dujardin, S. Using Local Climate Zones in Sub-Saharan Africa to Tackle Urban Health Issues. Urban Clim. 2019, 27, 227–242.
22. Yang, X.; Peng, L.L.H.; Chen, Y.; Yao, L.; Wang, Q. Air Humidity Characteristics of Local Climate Zones: A Three-Year Observational Study in Nanjing. Build. Environ. 2020, 171, 106661.
23. Mills, G.; Ching, J.; See, L.; Bechtel, B.; Foley, M. An Introduction to the WUDAPT Project. In Proceedings of the 9th International Conference on Urban Climate, Toulouse, France, 20–24 July 2015; pp. 20–24.
24. Kim, M.; Jeong, D.; Kim, Y. Local Climate Zone Classification Using a Multi-Scale, Multi-Level Attention Network. ISPRS J. Photogramm. Remote Sens. 2021, 181, 345–366.
25. Wu, Q.; Ma, X.; Sui, J.; Pun, M.O. DF4LCZ: A SAM-Empowered Data Fusion Framework for Scene-Level Local Climate Zone Classification. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5406216.
26. Zheng, Y.; Ren, C.; Xu, Y.; Wang, R.; Ho, J.; Lau, K.; Ng, E. GIS-based Mapping of Local Climate Zone in the High-Density City of Hong Kong. Urban Clim. 2018, 24, 419–448.
27. Hidalgo, J.; Dumas, G.; Masson, V.; Petit, G.; Bechtel, B.; Bocher, E.; Foley, M.; Schoetter, R.; Mills, G. Comparison between Local Climate Zones Maps Derived from Administrative Datasets and Satellite Observations. Urban Clim. 2019, 27, 64–89.
28. Muhammad, F.; Xie, C.; Vogel, J.; Afshari, A. Inference of Local Climate Zones from GIS Data, and Comparison to WUDAPT Classification and Custom-Fit Clusters. Land 2022, 11, 747.
29. Xu, D.; Zhang, Q.; Zhou, D.; Yang, Y.; Wang, Y.; Rogora, A. Local Climate Zone in Xi’an City: A Novel Classification Approach Employing Spatial Indicators and Supervised Classification. Buildings 2023, 13, 2806.
30. Liu, C.; Song, H.; Shreevastava, A.; Albrecht, C.M. AutoLCZ: Towards Automatized Local Climate Zone Mapping from Rule-Based Remote Sensing. In Proceedings of the IGARSS 2024—2024 IEEE International Geoscience and Remote Sensing Symposium, Athens, Greece, 7–12 July 2024; pp. 2023–2027.
31. Nicancı Sinanoğlu, M.; Kaya, Ş. Local Climate Zone Classification Using YOLOV8 Modeling in Instance Segmentation Method. Int. J. Environ. Geoinform. 2024, 11, 1–9.
32. Islam, M.D.; Di, L.; Zhang, C.; Yang, R.; Qu, J.J.; Tong, D.; Guo, L.; Lin, L.; Pandey, A. A Decision Rule and Machine Learning-Based Hybrid Approach for Automated Land-Cover Type Local Climate Zones (LCZs) Mapping Using Multi-Source Remote Sensing Data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2024, 17, 8271–8290.
33. Milojevic-Dupont, N.; Wagner, F.; Nachtigall, F.; Hu, J.; Brüser, G.B.; Zumwald, M.; Biljecki, F.; Heeren, N.; Kaack, L.H.; Pichler, P.P.; et al. EUBUCCO v0.1: European Building Stock Characteristics in a Common and Open Database for 200+ Million Individual Buildings. Sci. Data 2023, 10, 147.
34. Copernicus Land Monitoring Service. CLCplus Backbone 2021 (Raster 10 m), Europe, 3-Yearly, Jun. 2024; European Environment Agency: Copenhagen, Denmark, 2024.
35. Tolan, J.; Yang, H.I.; Nosarzewski, B.; Couairon, G.; Vo, H.V.; Brandt, J.; Spore, J.; Majumdar, S.; Haziza, D.; Vamaraju, J.; et al. Very High Resolution Canopy Height Maps from RGB Imagery Using Self-Supervised Vision Transformer and Convolutional Decoder Trained on Aerial Lidar. Remote Sens. Environ. 2024, 300, 113888.
36. Tarquini, S.; Isola, I.; Favalli, M.; Battistini, A.; Dotta, G. TINITALY, a Digital Elevation Model of Italy with a 10 Meters Cell Size (Version 1.1); Istituto Nazionale di Geofisica e Vulcanologia (INGV): Roma, Italy, 2023.
37. European Space Agency. Sentinel-2 MSI: MultiSpectral Instrument, Level-2A. Available online: https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2_SR_HARMONIZED (accessed on 14 April 2025).
38. Copernicus Land Monitoring Service. Impervious Built-up 2018 (Raster 10 m), Europe, 3-Yearly, Aug. 2020; European Environment Agency: Copenhagen, Denmark, 2020.
39. Quan, J. Enhanced Geographic Information System-Based Mapping of Local Climate Zones in Beijing, China. Sci. China Technol. Sci. 2019, 62, 2243–2260.
40. Yonghuan, M.A.; Linlin, L.U.; Da, X.; Meng, C.A.I.; Chao, R.E.N.; Meiling, Z.; Wenhua, H.U.I.; Qingting, L.I. Urban Thermal Environment Analysis by Local Climate Zone in Beijing. J. Beijing Norm. Univ. (Nat. Sci.) 2022, 58, 901–909.
41. Conrad, O.; Bechtel, B.; Bock, M.; Dietrich, H.; Fischer, E.; Gerlitz, L.; Wehberg, J.; Wichmann, V.; Böhner, J. System for Automated Geoscientific Analyses (SAGA) v. 2.1.4. Geosci. Model Dev. 2015, 8, 1991–2007.
42. Viola, P.; Jones, M. Rapid Object Detection Using a Boosted Cascade of Simple Features. In Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2001, Kauai, HI, USA, 8–14 December 2001; Volume 1, pp. I-511–I-518.
43. Demuzere, M.; Bechtel, B.; Middel, A.; Mills, G. Mapping Europe into Local Climate Zones. PLoS ONE 2019, 14, e0214474.
44. Xiang, Y.; Sun, D.Y.; Fan, W.; Gong, X.G. Generalized Simulated Annealing Algorithm and Its Application to the Thomson Model. Phys. Lett. A 1997, 233, 216–220.
45. Abouelsaad, O.; Hassan, A.; Omar, M.; Hinkelmann, R. Identifying Manning Roughness Coefficient Using Automatic Calibration Method and Simulation of Pollution Incidents in the Nile River, Egypt. J. Hydrol. Reg. Stud. 2024, 55, 101908.
46. Soonjun, B.; Krityakierne, T. GDESA: Gradient Differential Evolution-Simulated Annealing Hybrid. IEEE Access 2024, 12, 165555–165581.
47. Kästner, J.; Sherwood, P. Superlinearly Converging Dimer Method for Transition State Search. J. Chem. Phys. 2008, 128, 014106.
48. Li, X.; Gao, B.; Pan, Y.; Bai, Z.; Gao, Y.; Dong, S.; Li, S. Multi-Objective Optimization Sampling Based on Pareto Optimality for Soil Mapping. Geoderma 2022, 425, 116069.
49. Belgiu, M.; Drăguţ, L. Random Forest in Remote Sensing: A Review of Applications and Future Directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31.
50. Mushore, T.D.; Dube, T.; Manjowe, M.; Gumindoga, W.; Chemura, A.; Rousta, I.; Odindi, J.; Mutanga, O. Remotely Sensed Retrieval of Local Climate Zones and Their Linkages to Land Surface Temperature in Harare Metropolitan City, Zimbabwe. Urban Clim. 2019, 27, 259–271.
51. Chung, L.C.H.; Xie, J.; Ren, C. Improved Machine-Learning Mapping of Local Climate Zones in Metropolitan Areas Using Composite Earth Observation Data in Google Earth Engine. Build. Environ. 2021, 199, 107879.
52. Huang, F.; Jiang, S.; Zhan, W.; Bechtel, B.; Liu, Z.; Demuzere, M.; Huang, Y.; Xu, Y.; Ma, L.; Xia, W.; et al. Mapping Local Climate Zones for Cities: A Large Review. Remote Sens. Environ. 2023, 292, 113573.
53. Breiman, L. Bagging Predictors. Mach. Learn. 1996, 24, 123–140.
54. Demuzere, M.; Kittner, J.; Bechtel, B. LCZ Generator: A Web Application to Create Local Climate Zone Maps. Front. Environ. Sci. 2021, 9, 637455.
55. Bechtel, B.; Alexander, P.J.; Beck, C.; Böhner, J.; Brousse, O.; Ching, J.; Demuzere, M.; Fonte, C.; Gál, T.; Hidalgo, J.; et al. Generating WUDAPT Level 0 Data–Current Status of Production and Evaluation. Urban Clim. 2019, 27, 24–45.
56. Vavassori, A.; Oxoli, D.; Venuti, G.; Brovelli, M.A.; Siciliani de Cumis, M.; Sacco, P.; Tapete, D. A Combined Remote Sensing and GIS-based Method for Local Climate Zone Mapping Using PRISMA and Sentinel-2 Imagery. Int. J. Appl. Earth Obs. Geoinf. 2024, 131, 103944.
57. Tadono, T.; Nagai, H.; Ishida, H.; Oda, F.; Naito, S.; Minakawa, K.; Iwamoto, H. Generation of the 30 M-Mesh Global Digital Surface Model by Alos Prism. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2016, XLI-B4, 157–162.

Figure 1. LCZ typology figure [13].

Figure 2. Study area.

Figure 3. Overall technical flow chart.

Figure 4. LCZ generation verification result.

Figure 5. Comparison of classification metrics.

Figure 6. Confusion matrix of LCZ classification results.

Figure 7. LCZ mapping result.

Figure 8. Feature importances.

Figure 9. Comparison of SVF using different input data.

Table 1. Data sources and their usage in this study.

Data	Time	Data Type	Resolution	Source	Usage
Building Dataset	2022	Vector	–	EUBUCCO [33]	Building location, building area, calculation of BH and BSF
CLCplus Backbone	2021	Raster	10 m	ESA [34]	Classification of land-cover LCZ types, calculation of PSF
TINITALY DEM	2023	Raster	10 m	INGV [36]	Calculation of SVF
Global Canopy Height Maps	2009–2020	Raster	1 m	Meta and WRI [35]	Calculation of SVF
Sentinel 2 Image	2020–2021	Raster	10 m	ESA [37]	Band features used for classification
Impervious Built-Up	2018	Raster	10 m	ESA [38]	Calculation of ISF

Note: ESA—European Space Agency; INGV—Istituto Nazionale di Geofisica e Vulcanologia.

Table 2. LCZ characteristics.

LCZ	SVF	BSF (%)	ISF (%)	PSF (%)	MBH (m)
LCZ 1	0.2–0.4	40–60	40–60	<10	≥25
LCZ 2	0.3–0.6	40–70	30–50	<20	10–<25
LCZ 3	0.2–0.6	40–70	20–50	<30	3–<10
LCZ 4	0.5–0.7	20–40	30–40	30–40	≥25
LCZ 5	0.5–0.8	20–40	30–50	20–40	10–<25
LCZ 6	0.6–0.9	20–40	20–50	30–60	3–<10
LCZ 7	0.2–0.5	60–90	<20	<30	2–4
LCZ 8	>0.7	30–50	40–50	<20	3–<10
LCZ 9	>0.8	10–20	<20	60–80	3–<10
LCZ 10	0.6–0.9	20–30	20–40	40–50	5–<15
LCZ A–G	Defined by land-use classification

Note: Data based on European thresholds from Demuzere et al. (2019) [43].
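
As an illustration of how the threshold ranges in Table 2 could be applied, the following minimal Python sketch lists the built LCZ types whose ranges contain a given set of UMP values. Only three rows of Table 2 are encoded. The function name `matching_lczs`, the sample values, the inclusive treatment of range endpoints, and the use of 0 or infinity for open-ended entries are assumptions made for illustration; this is not the sample selection procedure of this study, which relies on multi-objective optimization and Pareto front screening.

```python
import math

# (min, max) ranges per parameter, copied from Table 2 for three LCZ types.
# Open-ended entries such as "<10" or ">=25" are encoded with 0 or inf
# as the missing bound (an assumption of this sketch).
LCZ_RANGES = {
    "LCZ 1": {"SVF": (0.2, 0.4), "BSF": (40, 60), "ISF": (40, 60),
              "PSF": (0, 10), "MBH": (25, math.inf)},
    "LCZ 2": {"SVF": (0.3, 0.6), "BSF": (40, 70), "ISF": (30, 50),
              "PSF": (0, 20), "MBH": (10, 25)},
    "LCZ 6": {"SVF": (0.6, 0.9), "BSF": (20, 40), "ISF": (20, 50),
              "PSF": (30, 60), "MBH": (3, 10)},
}


def matching_lczs(umps, ranges=LCZ_RANGES):
    """Return the LCZ types whose ranges contain every UMP value.

    Endpoints are treated as inclusive, which is a simplification of the
    strict bounds written in Table 2.
    """
    return [
        lcz
        for lcz, limits in ranges.items()
        if all(lo <= umps[name] <= hi for name, (lo, hi) in limits.items())
    ]


# Hypothetical grid cell: mid-rise, compact, mostly impervious.
sample = {"SVF": 0.45, "BSF": 55, "ISF": 40, "PSF": 10, "MBH": 18}
print(matching_lczs(sample))
```

Because several ranges in Table 2 overlap, a lookup of this kind can return more than one candidate type or none at all for a given cell.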

Table 3. Transposed classification report for selected LCZ classes.

Metric	2	6	8	9	A	B	C	D	E	F	G
Precision	0.98	0.87	0.95	0.89	0.95	1.00	0.97	0.99	0.97	0.93	1.00
Recall	0.93	0.93	0.85	0.88	0.98	0.62	0.74	1.00	0.87	0.84	1.00
F1-score	0.96	0.90	0.90	0.89	0.97	0.76	0.84	0.99	0.92	0.89	1.00
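
The F1-score is the harmonic mean of precision and recall. The short Python sketch below recomputes it from the rounded precision and recall values of three classes in Table 3; the function name `f1_score` and the choice of classes are illustrative assumptions, not part of the original workflow.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from Table 3 for three classes.
table3 = {"6": (0.87, 0.93), "C": (0.97, 0.74), "D": (0.99, 1.00)}

for lcz, (p, r) in table3.items():
    print(lcz, round(f1_score(p, r), 2))
```

For these three classes the recomputed values agree with the F1-scores listed in Table 3. Since the table reports precision and recall to two decimals, recomputing from them can differ from a published F1-score in the last digit for other classes.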

Table 4. Comparison of accuracy metrics.

LCZ Type	Proposed Precision	Proposed Recall	Proposed F1-Score	[56] Precision	[56] Recall	[56] F1-Score
2	0.98	0.93	0.96	0.82	0.89	0.85
3	–	–	–	0.77	0.73	0.75
5	–	–	–	0.77	0.78	0.78
6	0.87	0.93	0.90	0.82	0.78	0.80
8	0.95	0.85	0.90	0.98	0.99	0.99
9	0.89	0.88	0.89	–	–	–
A	0.95	0.98	0.97	0.98	0.73	0.84
B	1.00	0.62	0.76	0.83	0.93	0.88
C	0.97	0.74	0.84	–	–	–
D	0.99	1.00	0.99	0.98	0.92	0.95
E	0.97	0.87	0.92	0.93	0.97	0.95
F	0.93	0.84	0.89	0.90	0.99	0.94
G	1.00	1.00	1.00	1.00	1.00	1.00
Overall Accuracy	0.95			0.88

Note: The precision, recall, F1-score, and overall accuracy (OA) of the proposed method were calculated using LCZ maps derived from median-composited Sentinel-2 imagery (2018–2020), while the reference experiment metrics from [56] were based on June 2023 LCZ maps.


© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Li, W.; Liu, X.; Samat, A.; Gamba, P. Automated Local Climate Zone Mapping via Multi-Parameter Synergistic Optimization and High-Resolution GIS-RS Fusion. Remote Sens. 2025, 17, 2038. https://doi.org/10.3390/rs17122038

