Combination of Sentinel-2 and PALSAR-2 for Local Climate Zone Classiﬁcation: A Case Study of Nanchang, China

: Local climate zone (LCZ) maps have been used widely to study urban structures and urban heat islands. Because remote sensing data enable automated LCZ mapping on a large scale, there is a need to evaluate how well remote sensing resources can produce ﬁne LCZ maps to assess urban thermal environments. In this study, we combined Sentinel-2 multispectral imagery and dual-polarized (HH + HV) PALSAR-2 data to generate LCZ maps of Nanchang, China using a random forest classiﬁer and a grid-cell-based method. We then used the classiﬁer to evaluate the importance scores of different input features (Sentinel-2 bands, PALSAR-2 channels, and textural features) for the classiﬁcation model and their contribution to each LCZ class. Finally, we investigated the relationship between LCZs and land surface temperatures (LSTs) derived from summer nighttime ASTER thermal imagery by spatial statistical analysis. The highest classiﬁcation accuracy was 89.96% when all features were used, which highlighted the potential of Sentinel-2 and dual-polarized PALSAR-2 data. The most important input feature was the short-wave infrared-2 band of Sentinel-2. The spectral reﬂectance was more important than polarimetric and textural features in LCZ classiﬁcation. PALSAR-2 data were beneﬁcial for several land cover LCZ types when Sentinel-2 and PALSAR-2 were combined. Summer nighttime LSTs in most LCZs differed signiﬁcantly from each other. Results also demonstrated that grid-cell processing provided more homogeneous LCZ maps than the usual resampling methods. This study provided a promising reference to further improve LCZ classiﬁcation and quantitative analysis of local climate.


Introduction
With continuous urbanization and the increasing settlement in global cities, natural landscapes are constantly converted to impervious surfaces in urban areas, altering the natural surface energy and water balances, which often results in altered climatic conditions in urban areas and the formation of the urban heat island (UHI) phenomenon [1][2][3]. As a key topic in urban climate studies, the concept of a "local climate zone" (LCZ) was introduced in 2012 by Stewart and Oke [4] to quantify the relationship between urban morphology and the UHI phenomenon. LCZs provide a standardized framework to link land cover types and urban morphology with corresponding thermal properties, so LCZs have been the systematic criteria for UHI comparisons [5]. Notably, the World Urban Database and Access Portal Tools (WUDAPT) project was developed as a new global initiative to produce standardized LCZ maps [6][7][8]. Because remote sensing data are

Study Area
Nanchang City, which is located between 115 • 27 -116 • 35 E and 28 • 09 -29 • 11 N (Figure 1), was selected as our study area to fill the research gap for LCZ maps in this region. Nanchang is the capital city of Jiangxi Province in southeastern China. It is one of the central cities in the middle reaches of the Yangtze River and covers about 7402 km 2 . Since the 1980s, Nanchang has experienced rapid economic development, industrialization, and urbanization [52]. The Gan River runs through Nanchang from south to north and divides it into two parts. The eastern bank of the Gan River is the old urban district, while the western bank of the Gan River is the emerging urban district. As of the end of 2019, the permanent population of Nanchang was 5.6 million. The area on the eastern bank of the Gan River in Nanchang has a higher population density than other areas. In addition, Nanchang is one of the hottest cities in China, with a strong urban heat island effect [53]. Nanchang is located on the southwest shore of Poyang Lake, China's largest freshwater lake and the link between the Gan River and the Yangtze River. Nanchang lies within the Poyang Plain, which is rich in vegetation, rivers, and lakes. The city has rolling hills to the northwest and relatively flat terrain to the southeast. Nanchang has a subtropical, humid monsoon climate, with annual precipitation of 1613.3 mm, an average annual temperature of 19.1 • C, the highest temperature of 37.5 • C, and the lowest temperature of 0 • C, based on the meteorological statistics of 2019 [54]. Nanchang has a large diversity of land use and land cover types, which mainly includes urban and industrial land, rural settlements, paved land, rivers and bottomlands, ponds and reservoirs, cultivated land, forests, bush, grassland, and bare land [52]. The main types of buildings in Nanchang are residential buildings (e.g., elevator buildings, walk-up buildings, townhouses, bungalows, and villas), public buildings, industrial buildings, and agricultural buildings.

Remote Sensing Data
We chose Sentinel-2 MSI imagery and PALSAR-2 data to generate LCZ maps in the study area. To minimize classification errors due to different acquisition dates, we chose PALSAR-2 data with the acquisition date closest to that of Sentinel-2. The coverage of the PALSAR-2 scene is not the same as that of Sentinel-2. Therefore, we selected PALSAR-2 acquired in summer and late spring, which are closest to the acquisition date of Sentinel-2, to cover the whole study area. ASTER land surface temperature products (AST_08) were selected to investigate the relationship between the LCZs and the LST. Table 1 provides the details of the remote sensing images used in this study. All remote sensing data were acquired in 2019 and were projected to the same coordinate system by transforming projection (universal transverse mercator (UTM) zone 50 north map projection, World Geodetic System 84 (WGS-84) datum). For each source of remote sensing data, multiple scenes were mosaicked using a histogram-matching method.

Sentinel-2 MSI Imagery
Four Sentinel-2 MSI level-2A images (bottom-of-atmosphere reflectance) acquired in September 2019 were selected to generate a cloud-free image of the study area (https: //scihub.copernicus.eu/dhus/#/home, (last accessed on 8 May 2021)). Sentinel-2 data are acquired in 13 spectral bands ranging from the visible and near-infrared (VNIR) to the short wave infrared (SWIR) at spatial resolutions of either 10 m, 20 m, or 60 m [55]. Band 10 (SWIR/cirrus) was excluded because it does not contain information about the land surface. To maintain consistency and facilitate calculations, we resampled bands with 20 m and 60 m resolutions to 10 m using a bilinear interpolation method based on Sentinel application platform (SNAP) 7.0 software (https://step.esa.int/main/download/snap-download/, (last accessed on 8 May 2021)).

PALSAR-2 Data
The L-band PALSAR-2 level 3.1 products were produced by the Japan Aerospace Exploration Agency (JAXA) (https://auig2.jaxa.jp/ips/homepalsar, (last accessed on 8 May 2021)) [56,57]. The data were acquired in stripmap fine beam dual (FBD) mode (HH and HV) during an ascending orbit with a right-looking observation direction, a pixel spacing of 6.25 m, and off-nadir angles of 28.6 • (for 19 May 2019) and 32.9 • (for 28 July 2019). To combine the PALSAR-2 data with the Sentinel-2 imagery at the pixel level, we transformed the PALSAR-2 data into the same coordinate system as the Sentinel-2 imagery and resampled it to a spatial resolution of 10 m using a bilinear interpolation method. The PALSAR-2 data were coregistered by using dispersed ground control points selected from Sentinel-2 imagery and applying a quadratic polynomial transformation and bilinear interpolation. The root-mean-square error of the ground control points was less than 0.5 pixels.

ASTER Land Surface Temperature Products
ASTER level-2 AST_08 (surface kinetic temperature) products are generated from ASTER's five thermal infrared bands at 90 m resolution and produced by the temperature and emissivity separation (TES) algorithm [58]. The AST_08 products were downloaded from the Land Processes Distributed Active Archive Center (https://lpdaac.usgs.gov/ products/ast_08v003/, (last accessed on 8 May 2021)) and processed by the science scalable scripts-based science processor for missions (S4PM) algorithm (Version 3.4) [59]. Because a very small amount of data covering the study area were missing in 2019, we used the values of their nearest neighbors according to Euclidean distance to substitute for the missing data based on the nibble tool in ArcGIS 10.8 software.

Local Climate Zones Scheme
LCZs are climate-related regions that span hundreds of meters to several kilometers on a horizontal scale and are functions of surface cover, structures, construction material, and human activity [4]. As depicted in Figure 2a, the standard LCZ scheme comprises two major types: built types (LCZ classes 1-10) and land cover types (LCZ classes A-G). The 17 standard classes of LCZs are determined by surface characteristics; each provides a unique thermal environment that is most apparent in areas of simple relief, over dry surfaces, and on calm nights [4].

Training and Test Datasets
To collect field-based land cover observations, we conducted field surveys in Nanchang from May to September 2019. To reduce the effects caused by the imbalance of classes [60], roughly balanced ground reference samples of 13 LCZ classes were randomly collected throughout the study area based on this field investigation and visual interpretation of high-spatial-resolution Google Earth imagery from May to September 2019. The reference samples were then randomly split into two sets of disjoint training and test pixels to ensure spatial separation of training and test sites [61] (Table 2). Figure 2b shows Google Earth images of typical samples of the LCZ classes in Nanchang. It should be pointed out that LCZ 1 (compact high-rise) was not included because there was almost no LCZ 1 in our study area. Furthermore, we merged LCZ B (scattered trees) and LCZ C (bush, scrub) into a new class LCZ B C (scattered trees with bush and scrub) because in most cases, shrubs, short trees, and scattered trees were mixed.

Input Features
The 12 spectral bands of Sentinel-2 MSI imagery (bottom-of-atmosphere reflectance), four backscattering intensity features obtained from dual-polarized PALSAR-2 (HH and HV backscattering coefficients, and the difference and ratio between the two polarization bands), and 24 textural features were used for the LCZ classification (Table 3). To explore the effects of different combinations of input features on classification accuracy, we set up six datasets designated as D1-D6 using these 40 features (Table 3). The textural features were extracted by using ENVI 5.5 software as follows: First, we performed a minimum noise fraction (MNF) transformation [62] on four bands at 10 m (bands 2, 3, 4, and 8) in the Sentinel-2 image. Second, the gray-level co-occurrence matrix (GLCM) [63] was computed considering a processing window of 3 × 3, the grayscale quantization level of 64, and the distance of 1. For the Sentinel-2 based GLCM, we selected the first MNF component (MNF 1) as the input. For the PALSAR-2 based GLCM, we selected the two polarization bands (HH and HV) as the input, respectively. Third, based on the obtained GLCM, we averaged eight textural features (contrast, correlation, dissimilarity, entropy, homogeneity, mean, angular second moment, and variance) in four directions (0 • , 45 • , 90 • , and 135 • ) to achieve rotational invariance.

Random Forest Classification
The RF [22] is a parallel ensemble based on a classification and regression tree and can be generated simultaneously without strong dependencies between individual learners [64]. We implemented the RF classifier by using the scikit-learn library [65] and the Geospatial Data Abstraction Library (GDAL, https://gdal.org/, (last accessed on 8 May 2021)) in Python.
We used out-of-bag (OOB) samples for selecting the hyperparameters of the model. Before launching the RF classifier, two important hyperparameters that determine the randomness of the RF model had to be set: the number of trees (T) and the number of features (as listed in Table 3) randomly selected at each node (nr). We kept the other hyperparameters of the RF classifier as defaults and performed a grid search. The searching range of T was between 100 and 2000 using intervals of 100, whereas the searching range of nr was between the total number of features in intervals of 1. Based on the OOB scores of different RF models using various combinations of hyperparameters, we selected the optimal combination of hyperparameters (Table 3).

Grid-Cell Processing and Postprocessing
Because LCZs are defined at the local scale (10 2 -10 4 ) [4,49], we used a grid-cell (100 m × 100 m) process for pixel aggregation. First, we used ArcGIS 10.8 software to create nets of grid cells with sizes of 100 m × 100 m covering the entire study area. The 100 m × 100 m grid cells were intersected with the LST data (90 m spatial resolution), and the area of each intersected portion was calculated. The LST attribute of a grid cell was then obtained by the weighted average of the LST values of the intersected portion according to the area percentage. Next, for each grid cell, the area of each LCZ class within a grid cell was calculated and stored in the attribute table. To calculate the percentage of each LCZ class within each grid cell, we divided the area of each LCZ class by the area of the grid cell. For a single grid cell, we assigned the dominant LCZ class that accounted for the largest area to the corresponding grid cell. Finally, we used a 3 × 3 majority filter for LCZ classification maps to include more contextual information.

Usual Resampling Methods
To explore the differences between the grid-cell-based method and the usual resampling methods, we used the D6 dataset to generate 100 m LCZ maps based on ArcGIS 10.8 software. We performed majority resampling and nearest neighbor resampling on the classified LCZ map. For the classified LCZ map (categorical data), we did not include bilinear interpolation or cubic convolution in the comparison because they alter the pixel values so that the original categories are not maintained. In addition, we applied nearest neighbor resampling, bilinear interpolation resampling, and cubic convolution resampling to the original images before executing the classification. To ensure the consistency of the comparison, the 100 m LCZ classification results obtained by each resampling method were subjected to the same 3 × 3 majority filter as those obtained by the grid-cell-based method. Finally, we used the grid-cell-based method as the baseline to compare the differences between the other five resampling methods and the grid-cell-based method.

Feature Importance for the RF Model and Feature Contributions for Each Class
To understand how each feature affected the RF classification model, we used the mean decrease in Gini/Gini importance and the mean decrease in accuracy/permutation importance [22] based on the training set. The Gini importance of a feature was obtained by averaging the decrease of the Gini impurity at all nodes where this feature was used in all trees. The permutation importance was expressed as the value of the change in the accuracy of a trained model when the values of a feature in the dataset were randomly permuted. For the second of these calculations, we performed 100 repeated shuffles for all features separately and averaged the decrease of accuracy to reduce randomness.
To explore the impact of each feature on each class, we employed a feature contribution method using the tree-interpreter package. The feature contribution method is based on decision paths through each tree in a forest and can reveal the relationship between features and predictions [36,37].
For classification tasks, consider a dataset of m samples where T is the total number of trees. For a single input x i , there is a decision path from the root node to the leaf node in each tree. At a node in the decision path, if this node (parent node) is split into child nodes by feature j, then the contribution of feature j to class k is defined as: if the split in a parent node is performed over the feature j; 0, otherwise, where p child j,k is the fraction of samples that belong to class k at the child node, corresponding to the feature j, and p parent j,k is the fraction of samples that belong to class k at the parent node, corresponding to the feature j.
The predicted probability P k that x i belongs to class k can be written as: where p (t,root) k is the fraction of samples that belong to class k at the root node in the t-th tree and FC (t) j,k is the sum of FC j,k over all nodes on the decision path in the t-th tree. To obtain the feature contributions of each class, we averaged the results computed from all training samples belonging to the same class.

Statistical Analysis for Nighttime LST within LCZs
To examine the spatial autocorrelation of nighttime LST, we used the global Moran's I statistic and the Anselin local Moran's I statistic based on ArcGIS 10.8 software. For grid cells, we used an inverse distance conceptualization to generate a spatial weight matrix with a default threshold value of 270 m. Subsequently, the global Moran's I index for all grid cells was computed based on the spatial weight matrix. The Anselin local Moran's I analysis for all grid cells was also based on the spatial weight matrix. In addition, to explore the differences in LST among LCZ classes, we carried out statistical analysis using SPSS Statistics 26 software. First, we examined the normality of the LST in each LCZ class by using histogram comparisons, Q-Q plots, and Kolmogorov-Smirnov tests. We then performed Levene's test to examine the homogeneity of variances. Based on the results of these two tests, to estimate the statistical significance of the LST differences between LCZ classes, we finally chose nonparametric tests, including the Kruskal-Wallis one-way analysis of variance (ANOVA) test followed by all pairwise multiple comparisons and a median test followed by all pairwise multiple comparisons. Figure 3a shows the LCZ maps obtained with different datasets (D1-D6), respectively. The percentages of the area occupied by each LCZ class in different datasets (D1-D6) are shown in Figure 3b. The accuracies of the classification were evaluated in terms of user's accuracy (UA), producer's accuracy (PA), and overall accuracy (OA), which were derived from the confusion matrix based on test pixels [61]. The confusion matrices of LCZ maps obtained using different datasets (D1-D6) are shown in Figure 4. Figure 5 also shows the differences in the PAs and UAs of each LCZ class for the different LCZ maps.   Compared to using the D1 dataset, the OAs were improved by 2.24% using D2, 2.32% using D5, and 4.03% using D6. There was a small improvement in the OA after using textural features. For example, the D2 dataset improved 2.24% over the D1 dataset, the D4 dataset improved 7.14% over the D3 dataset, and the D6 dataset improved 1.71% over the D5 dataset. When using the D5 dataset, the OAs were improved by 51.9% over the D3 dataset and 44.76% over the D4 dataset. The LCZ map derived using only the dualpolarized PALSAR-2 data (the D3 dataset) had the lowest OA. Using the D3 or D4 dataset, either the OAs were relatively low, or the land cover was not satisfactorily categorized. The highest OA was 89.96%, obtained from the D6 dataset by using all 40 input features.

Accuracy Assessment of LCZ Maps
For the D6 dataset, land cover LCZ types were generally classified with higher accuracy than built LCZ types (except for LCZs E and F). The confusion was manifested mainly among the built LCZ types. For the land cover LCZ types, LCZs E (bare rock or paved) and F (bare soil or sand) tended to be confused with built LCZ types. For the D6 dataset, LCZs A (dense tree), G (water), and D (low plants) had relatively high PA and UA among the land cover types. Among the built types, LCZs 2 (compact mid-rise) and 4 (open high-rise) had relatively high PA and UA. For the D6 dataset, open buildings (LCZs 4-6) were generally more difficult to distinguish than compact buildings (LCZs 2 and 3). This difficulty reflects that compact buildings are clustered in high-to-medium-spatial-resolution (10-m to 100-m) satellite imagery, whereas open buildings are scattered and occupy small pixels.
To measure the compliance or divergence of individual LCZ classifications, we computed the number of the same classes for a given location (individual cells of the grid) ( Figure 6). The most obvious differences among the six LCZ maps were located in the northeastern part (close to Poyang Lake), the eastern part, and the urban district. A total of 86.2% of the grid cells showed good compliance for all datasets (Figure 6b). To visualize the discrimination of LCZ classes using the datasets D1-D6, we extracted a subregion A in the urban district of Nanchang (Figure 7). This subregion is a typical urban region consisting of different types of buildings and land cover. It could be visually observed that the classification using PALSAR-2 polarimetric data alone did not yield a satisfactory result. Using the D3 dataset, most LCZ classes were under-represented. When using D4 by adding textural features to D3, there was a slight improvement in the classification of built LCZ types. Nevertheless, worse performance on LCZ classification was obtained using the D3 or D4 dataset. When using D2 by adding textural features to D1, the discrimination among built LCZ types was notably improved, especially for LCZ 4 (open high-rise). Compared to the classification results obtained from datasets D5 and D6, LCZ E (bare rock or paved) was under-represented using the D1 or D2 dataset. Compared to the D6 dataset, LCZ 4 was under-represented, while LCZ 6 (open low-rise) was over-represented using the D5 dataset. The most desirable result was produced when all 40 input features (the D6 dataset) were used because the confusion among LCZ classes was markedly reduced.

Comparison of the Grid-Cell-Based Method and Resampling Methods
As shown in Figure 8, using the nearest neighbor resampling, bilinear interpolation resampling, and cubic convolution resampling produced salt-and-pepper noise. Visually, the grid-cell-based method generated a more homogeneous result. The result of the majority resampling lay between those of the grid-cell-based method and the other resampling methods. Compared with other resampling methods, the difference between the majority resampling after classification and the grid-cell-based method was relatively small. As shown in Figure 8b-e, the basic patterns of these maps were relatively similar.

Importance and Contributions of Features for LCZ Classification
As mentioned above, the best LCZ classification was obtained using the D6 dataset. Therefore, we analyzed the feature importance of the RF model trained by all features (the D6 dataset) (Figure 9). The patterns of these two importance measures differed slightly from each other. In general, spectral features showed greater importance than polarimetric features and textural features. For both measures of importance, the most beneficial feature in the LCZ classification was S2_B12. Polarimetric features were also helpful for LCZ classification, especially the backscattering intensity at the HV polarization. In the eight textural features, GLCM_Mean was found to be the most useful feature. In addition, GLCM_Mean at the HV polarization of PALSAR-2 was more important than those extracted by Sentinel-2.   Figure 10 shows the feature contributions for each LCZ class for the RF model trained by all features (the D6 dataset). Land cover LCZ types exhibited more variability across input features than built LCZ types. In general, the same trends appeared in the feature importance for the RF model ( Figure 9) and in the feature contributions for each LCZ class ( Figure 10). For instance, S2_B12 was a beneficial feature for most of the LCZ classes. However, the contributions of a feature to each LCZ class differed to varying degrees. For example, compared with built LCZ types, the HV polarization band made a higher contribution to land cover LCZ types, especially LCZs A (dense trees), G (water), and E (bare rock or paved). As shown in Figure 10  As shown in Figures 9 and 10, the combinations of features with low importance and contributions for LCZ classification did not have high classification accuracies. For the D3 and D4 datasets, except for P2_HV_GLCM_Mean and P2_HH_GLCM_Mean, the importance and contributions of the remaining features were not high, and therefore, their classification accuracies were not high. The importance and contributions of the 12 features of Sentinel-2 were relatively high, explaining the good classification accuracy achieved by using only Sentinel-2 imagery (the D1 dataset).

Relationships between LCZs and Nighttime LST
As shown in Figure 11a-b, high nighttime LSTs were dominant mainly in urban areas and water bodies. The fact that the global Moran's I index for all grid cells was 0.78 (p < 0.001) indicated a strong positive spatial autocorrelation for LST. The LST of both LCZ A (dense trees) and LCZ D (low plants) was clustered mostly as low-low, whereas the LST of LCZs G (water), E (bare rock or paved), and built LCZ types (LCZs 2-6, 8, and 10) was clustered mostly as high-high (Figure 11c). As shown in Figure 11d, different LST variations within a single LCZ class were observed. In general, there were large differences in nighttime LSTs between LCZ classes, especially between land cover LCZ types. The nighttime LST was generally higher for the built LCZ types than for the land cover LCZ types. Residential buildings (LCZs 2-6) had higher nighttime LSTs than nonresidential buildings (LCZs 8 and 10), except for LCZ 3 (compact low-rise). Both Kruskal-Wallis one-way ANOVA test and median tests showed statistically significant differences (p < 0.001). Overall, there were statistically significant LST differences for most LCZ classes (Figure 11e).

LCZ Classification Using Sentinel-2 Imagery and PALSAR-2 Data
The fact that the accuracies of the LCZ map obtained by the D6 dataset were acceptable for further studies (as expected) revealed the potential of combining optical and SAR data for LCZ classification in urban areas. The LCZ classification using only PALSAR-2 was not satisfactory, especially for land cover LCZ types, such as those exhibiting many LCZ G (water) in the D3 dataset and LCZ E-bare rock or paved (stripes) in the D4 dataset. This reveals the limitations of using only SAR data for land cover classification in complex urban and peri-urban environments [10,15,66]. Comparing the LCZ classification maps obtained from the four datasets (D1, D2, D5, and D6), we found that the basic patterns of these LCZ maps were generally similar. It can be concluded that optical data are still dominant in LCZ classification compared to SAR data [10].
Compared to the study of La et al. [16] that combined Sentinel-2 with full-polarized PALSAR-2, this study introduced textural features derived from Sentinel-2 and dualpolarized PALSAR-2 but did not consider the contribution of polarimetric parameters. When fully polarimetric SAR data are available, adding various types of information obtained by polarimetric target decomposition methods to the classification will help to improve classification accuracy [16,67]. However, the fully polarimetric data are not always available due to its limited swath width [21]. As an alternative, our results showed the attractiveness of dual-polarized SAR data for LCZ classification. In addition, we showed the advantages of a majority rule-based grid-cells process in generating LCZ maps with generalized urban patterns.
For the D6 dataset, there are still limitations in the separability between LCZ classes with similar spectral characteristics, such as LCZs B and C; LCZ E (bare rock or paved) and built LCZ types; and LCZs 8 (large low-rise) and 10 (heavy industry). These problems can be solved by adding more discriminatory data in the classification or by improving classification algorithms. The information on building height is beneficial for the discrimination between built LCZ types. Further research on combining height data with other datasets for LCZ classification will be required in the future. It has been shown that the inclusion of LiDAR data in the classification can assist in urban land cover classification [68,69]. Moreover, deep learning methods, such as convolutional neural networks, have recently shown promising performance in LCZ classification [19,70].
Considering that the inherent speckle noise in SAR data makes individual pixels unreliable, the textural features from SAR data can provide attractive information [15]. However, there is a need to further investigate the effect of textural features from both optical and SAR data on LCZ classification. It is important to select the optimal combination of textural features for LCZ classification. Because various combinations of different bands, window sizes, and texture measures will produce many textural features, using these massive features may lead to the curse of dimensionality and reduce the accuracy in the classification using a finite-sized training set [71].

Implications of the Grid-Cell-Based Method
For the nearest neighbor resampling, the difference between the LCZ maps obtained before and after classification was small, which illustrates the applicability of resampling before classification, as implemented by the WUDAPT method [8]. The effect of the different resampling rules on the LCZ maps is more significant than whether the resampling is implemented before or after classification. As the postprocessing approach for categorical data, the LCZ map obtained by the majority resampling was not as homogeneous as those obtained by the majority rule-based grid-cells method, which may be limited by the built-in default filter window in ArcGIS software.

Assessment of Interpretability of Features
The significant importance and contribution of features can be explained by the good representation of those features in the characteristics of LCZ classes [11,12]. Not all of the features were equally beneficial to the LCZ classification for the RF models. The fact that the overall importance of the PALSAR-2 derived features was not as prominent as the Sentinel-2 derived features was probably due to the inconsistent dates of PALSAR-2 data and Sentinel-2 imagery. The seasonal variation of vegetation may make polarimetric features contribute little to identifying these changing ground objects. For PALSAR-2 dualpolarization data, the HV polarization band is more conducive to land cover classification than the HH polarization band, which may be due to the unique scattering information about ground objects provided by cross-polarization [15,25]. Our study also indicated that the GLCM textural features have limited ability in land cover classification. This limited ability may reflect the spatial resolution of Sentinel-2 MSI imagery and PALSAR-2 data. This problem will probably be resolved with the improvement of spatial resolution [72].
The fact that most features performed better in land cover LCZ types than built LCZ types indicated that there were still limitations in discriminating built LCZ types for these features. Interestingly, band 1 (coastal aerosol) made a significant contribution to LCZs 8 (large low-rise) and 10 (heavy industry); and band 9 (water vapor) contributed significantly to LCZs 2 (compact mid-rise), D (low plants), and 6 (open low-rise). These results showed that both band 1 (dedicated for aerosol retrieval) and band 9 (dedicated for water vapor correction) in Sentinel-2 imagery were beneficial for LCZ classification, despite their relatively low spatial resolution (60 m). In addition, for LCZ 5 (open mid-rise), S2_GLCM_Mean made the highest contribution. This observation highlights the fact that features that are generally unimportant for the model may be important for a specific class [37].
It is worth noting that there were many features that have very low permutation importance (Figure 9b), probably because of the correlation between features. When features are correlated, permutating one feature has little impact on the model's performance because it can obtain the same information from the correlated features. In the future, it will be necessary to evaluate additional features to provide more information on how to allow LCZ classes to be better differentiated. In addition, it is also important to analyze the correlation in the features extracted from the remote sensing data.

LST Differentiation of LCZs
In general terms, the fact that there were statistically significant nighttime LST differences between most LCZs indicated that different LCZ classes exhibited thermal environments associated with their surface characteristics [4,49]. For example, built LCZ types (LCZs 2-6, 8, and 10) and LCZ E (bare rock or paved) were clustered as high-high on nighttime LST (Figure 11c), probably because of the thermal properties of impervious surfaces [45]. Compared to other land cover LCZ types, LCZs E and G (water) had relatively high nighttime LSTs, probably because they cool off more slowly during nighttime. The fact that LCZ A (dense trees) had lower nighttime LSTs than LCZ B C (scattered trees with bush and scrub) indicates that aggregated vegetation cools better than dispersed vegetation [73]. The fact that the nighttime LSTs of LCZ D (low plants) were lower than that of LCZ A was probably because dense trees have greater shading coverage that influences the penetration of solar radiation [74]. For buildings located in urban areas with the same heights, open buildings had lower nighttime LSTs than compact buildings. The former may benefit from the surrounding vegetation and good ventilation [75].
However, the fact that the nighttime LSTs of several LCZ classes were not significantly different statistically from other classes may have been related to the influence of local or regional climate [43]. In addition, the intra-LCZ variability of nighttime LST revealed the potential effects of heterogeneous surrounding environments [76]. For example, the nighttime LSTs of both LCZs 3 (compact low-rise) and B C were generally higher in urban areas than in rural areas. Similarly, LCZ F (Bare soil or sand) was warmer near water than near dense trees during nighttime. Buildings surrounded by large areas of vegetation tended to have lower nighttime LSTs than buildings surrounded by a small amount of vegetation.
Furthermore, several issues need to be further explored in studies of LSTs or surface urban heat islands using LCZ maps, including seasonal changes in LCZs, the effects of multitemporal (day and night), seasonal, and thermal anisotropy on LST variations [42,43]. Considering that it is not the focus of this study, we only used two dates of LST data in summer to analyze the relationship between LCZ and nighttime LST. Many studies have shown LST differences within LCZs using many dates of LST data [38,40,42,43]. Therefore, the applicability of the inclusion of thermal remote sensing images from different sensors (e.g., Landsat and ASTER) over multiple periods in LCZ classification can be investigated in the future. However, this could potentially lead to methodological bias in LST analysis [44].

Conclusions
The combination of Sentinel-2 MSI imagery and dual-polarized (HH + HV) PALSAR-2 data was found to be promising and beneficial for LCZ mapping. The quantitative analysis of input features based on the RF classifier showed that in LCZ classification, band 12-SWIR 2 is crucial for Sentinel-2 imagery, whereas the HV polarization is important for dual-polarized PALSAR-2 data. By using the feature contribution approach based on decision paths, each input feature was found to contribute differently to LCZ classes. These different contributions may not be detected by a standard feature importance analysis. Through this class-based analysis of feature contributions, it is possible to reveal the effective features in distinguishing different LCZ classes. In addition, our comparative results showed that the grid-cell-based method produced more homogeneous LCZ maps than the usual resampling methods.
Spatial analysis of LCZs and summer nighttime LST showed that high LSTs were concentrated mostly in the built LCZ types, LCZ E, and LCZ G, whereas low LSTs were mostly concentrated in LCZs A and D. Statistical analysis showed that the summer nighttime LST differences between most LCZ classes were statistically significant, but this phenomenon needs to be further investigated using more dates of thermal remote sensing images. Considering the thermal differentiation within LCZs, the effect of thermal remote sensing data in LCZ classification can also be further explored.
This study provided insights into the performance of RF classifiers in LCZ mapping and feature assessment that could contribute to future LCZ mapping. In addition, this study highlighted the potential of the LCZ map and the grid-cell-based method for urban climate research that could contribute to a better understanding of the impact of urban morphology defined by LCZs on local climatic conditions.