Combination of Sentinel-2 and PALSAR-2 for Local Climate Zone Classification: A Case Study of Nanchang, China

Chen, Chaomin; Bagan, Hasi; Xie, Xuan; La, Yune; Yamagata, Yoshiki

doi:10.3390/rs13101902



by Chaomin Chen ¹, Hasi Bagan ¹,²,*, Xuan Xie ¹, Yune La ³,⁴ and Yoshiki Yamagata ²

¹ School of Environmental and Geographical Sciences, Shanghai Normal University, Shanghai 200234, China

² Center for Global Environmental Research, National Institute for Environmental Studies, Ibaraki 305-8506, Japan

³ Cryosphere Research Station on the Qinghai-Tibetan Plateau, State Key Laboratory of Cryospheric Science, Northwest Institute of Eco-Environment and Resource, Chinese Academy of Sciences, Lanzhou 730000, China

⁴ University of Chinese Academy of Sciences, Beijing 100049, China

* Author to whom correspondence should be addressed.

Remote Sens. 2021, 13(10), 1902; https://doi.org/10.3390/rs13101902

Submission received: 16 March 2021 / Revised: 20 April 2021 / Accepted: 5 May 2021 / Published: 13 May 2021

(This article belongs to the Special Issue ALOS-2/PALSAR-2 Calibration, Validation, Science and Applications)


Abstract:

Local climate zone (LCZ) maps have been used widely to study urban structures and urban heat islands. Because remote sensing data enable automated LCZ mapping on a large scale, there is a need to evaluate how well remote sensing resources can produce fine LCZ maps to assess urban thermal environments. In this study, we combined Sentinel-2 multispectral imagery and dual-polarized (HH + HV) PALSAR-2 data to generate LCZ maps of Nanchang, China using a random forest classifier and a grid-cell-based method. We then used the classifier to evaluate the importance scores of different input features (Sentinel-2 bands, PALSAR-2 channels, and textural features) for the classification model and their contribution to each LCZ class. Finally, we investigated the relationship between LCZs and land surface temperatures (LSTs) derived from summer nighttime ASTER thermal imagery by spatial statistical analysis. The highest classification accuracy was 89.96% when all features were used, which highlighted the potential of Sentinel-2 and dual-polarized PALSAR-2 data. The most important input feature was the short-wave infrared-2 band of Sentinel-2. The spectral reflectance was more important than polarimetric and textural features in LCZ classification. PALSAR-2 data were beneficial for several land cover LCZ types when Sentinel-2 and PALSAR-2 were combined. Summer nighttime LSTs in most LCZs differed significantly from each other. Results also demonstrated that grid-cell processing provided more homogeneous LCZ maps than the usual resampling methods. This study provided a promising reference to further improve LCZ classification and quantitative analysis of local climate.

Keywords:

local climate zone; random forest; feature importance; land surface temperature; grid cells; Sentinel-2; PALSAR-2; ASTER


1. Introduction

With continuous urbanization and increasing settlement in cities worldwide, natural landscapes are constantly converted to impervious surfaces in urban areas. This conversion alters the natural surface energy and water balances, which often results in altered climatic conditions in urban areas and the formation of the urban heat island (UHI) phenomenon [1,2,3]. As a key topic in urban climate studies, the concept of a “local climate zone” (LCZ) was introduced in 2012 by Stewart and Oke [4] to quantify the relationship between urban morphology and the UHI phenomenon. LCZs provide a standardized framework to link land cover types and urban morphology with corresponding thermal properties, so LCZs have become a systematic criterion for UHI comparisons [5]. Notably, the World Urban Database and Access Portal Tools (WUDAPT) project was developed as a new global initiative to produce standardized LCZ maps [6,7,8]. Because remote sensing data are widely available, they have been routinely used for LCZ mapping and have shown great potential for that purpose [9,10,11,12]. It is therefore necessary to explore the combination of multi-source remote sensing data for generating LCZ maps.

Because of the heterogeneity and complexity of the composition and configuration of urban pixels in remote sensing images, urban land cover maps based on remote sensing data are characterized by inherent uncertainties [13]. Unlike optical sensors that capture the spectral characteristics of objects on the ground, synthetic aperture radar (SAR) sensors record the characteristics of microwave signals backscattered by objects on the ground. Previous studies have demonstrated that the synergistic use of optical imagery and SAR data can facilitate urban land cover classification [14,15,16]. The cost-free, high-spatial-resolution imagery from the Sentinel-2 multispectral instrument (MSI) has been found to be suitable for large-scale LCZ mapping [17,18,19,20]. In addition, high-spatial-resolution phased array L-band SAR-2 (PALSAR-2) data have been used for large-scale land use and land cover mapping [21]. The use of a combination of Sentinel-2 imagery and PALSAR-2 data, therefore, has the potential to produce large-scale LCZ classification maps.

Random forest (RF) models [22] have become popular in the classification of land cover using remote sensing data because their classifications are highly accurate, their computational costs are low, and they can handle high-dimensional datasets [23,24]. Various studies have examined the importance of input features for the classifier [15,25,26,27,28,29] and for each class [30,31,32,33] in the context of RF classification. However, the contributions of the different bands and features of remote sensing data to the classification model and its classes have not been systematically studied in the case of LCZ classification. Only a few studies have examined the importance of features for LCZ mapping [12,17,34,35]. The feature contribution method based on decision paths [36,37] must be further investigated to take advantage of the RF model in LCZ land cover classification.

The land surface temperature (LST) observed by satellites is widely used for urban climate research, where pixel values are time-synchronized and spatially continuous [38,39,40]. Medium-resolution thermal satellite imagery is readily available and can provide a better alternative for urban land surface thermal analysis (e.g., surface UHI) than in situ thermal data [41]. Many studies have recently applied the LCZ classification scheme to understand the thermal characteristics of cities based on LSTs retrieved from thermal remote-sensing data [38,42,43,44,45]. Previous studies have indicated that nighttime LST could observe climatic conditions more accurately than daytime LST [46,47]. Given that summer nighttime is a crucial temporal period for surface UHI [48], it is important to explore the relationship between summer nighttime LST and LCZs.

Typically, LCZs range in scale from about a hundred meters to several kilometers and represent relatively homogeneous urban surfaces that share a similar energy budget [4,49]. To achieve a suitable resolution for LCZ classification, the common approach is either to preprocess the remote sensing images by resampling or to post-process the classified LCZ maps by resampling. Because LCZ maps and LST data have different spatial resolutions, the grid-cell-based method offers an alternative: it has been found to be a powerful tool for linking data at different spatial resolutions, and it enables the user to fine-tune an analysis of data from multiple sources and to strike a compromise between the need for details and the feasibility of computations [50,51].

In this study, we combined Sentinel-2 MSI imagery and PALSAR-2 data to generate LCZ maps of Nanchang City, Jiangxi Province, China, based on the RF classifier. The main objectives of this study were (i) to classify different combinations of spectral, backscattering, and textural features in Sentinel-2 and PALSAR-2, (ii) to assess the importance and contribution of the input features from Sentinel-2 MSI imagery and PALSAR-2 data to LCZ classification, and (iii) to compare the advantages and disadvantages of the resampling method and the grid-cell-based method in the process of LCZ mapping, and then to perform spatial statistical analysis of the best LCZ map and the LST derived from summer nighttime Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) thermal imagery.

2. Materials

2.1. Study Area

Nanchang City, which is located between 115°27′–116°35′ E and 28°09′–29°11′ N (Figure 1), was selected as our study area to fill the research gap for LCZ maps in this region. Nanchang is the capital city of Jiangxi Province in southeastern China. It is one of the central cities in the middle reaches of the Yangtze River and covers about 7402 km². Since the 1980s, Nanchang has experienced rapid economic development, industrialization, and urbanization [52]. The Gan River runs through Nanchang from south to north and divides it into two parts. The eastern bank of the Gan River is the old urban district, while the western bank of the Gan River is the emerging urban district. As of the end of 2019, the permanent population of Nanchang was 5.6 million. The area on the eastern bank of the Gan River in Nanchang has a higher population density than other areas. In addition, Nanchang is one of the hottest cities in China, with a strong urban heat island effect [53].

Nanchang is located on the southwest shore of Poyang Lake, China’s largest freshwater lake and the link between the Gan River and the Yangtze River. Nanchang lies within the Poyang Plain, which is rich in vegetation, rivers, and lakes. The city has rolling hills to the northwest and relatively flat terrain to the southeast. Nanchang has a subtropical, humid monsoon climate, with an annual precipitation of 1613.3 mm, an average annual temperature of 19.1 °C, a highest temperature of 37.5 °C, and a lowest temperature of 0 °C, based on the meteorological statistics of 2019 [54]. Nanchang has a large diversity of land use and land cover types, which mainly include urban and industrial land, rural settlements, paved land, rivers and bottomlands, ponds and reservoirs, cultivated land, forests, bush, grassland, and bare land [52]. The main types of buildings in Nanchang are residential buildings (e.g., elevator buildings, walk-up buildings, townhouses, bungalows, and villas), public buildings, industrial buildings, and agricultural buildings.

2.2. Remote Sensing Data

We chose Sentinel-2 MSI imagery and PALSAR-2 data to generate LCZ maps in the study area. To minimize classification errors due to different acquisition dates, we chose PALSAR-2 data with the acquisition date closest to that of Sentinel-2. The coverage of the PALSAR-2 scene is not the same as that of Sentinel-2. Therefore, we selected PALSAR-2 acquired in summer and late spring, which are closest to the acquisition date of Sentinel-2, to cover the whole study area. ASTER land surface temperature products (AST_08) were selected to investigate the relationship between the LCZs and the LST. Table 1 provides the details of the remote sensing images used in this study.

All remote sensing data were acquired in 2019 and were reprojected to the same coordinate system (Universal Transverse Mercator (UTM) zone 50 north map projection, World Geodetic System 84 (WGS-84) datum). For each source of remote sensing data, multiple scenes were mosaicked using a histogram-matching method.

2.2.1. Sentinel-2 MSI Imagery

Four Sentinel-2 MSI level-2A images (bottom-of-atmosphere reflectance) acquired in September 2019 were selected to generate a cloud-free image of the study area (https://scihub.copernicus.eu/dhus/#/home, (last accessed on 8 May 2021)). Sentinel-2 data are acquired in 13 spectral bands ranging from the visible and near-infrared (VNIR) to the short wave infrared (SWIR) at spatial resolutions of either 10 m, 20 m, or 60 m [55]. Band 10 (SWIR/cirrus) was excluded because it does not contain information about the land surface. To maintain consistency and facilitate calculations, we resampled bands with 20 m and 60 m resolutions to 10 m using a bilinear interpolation method based on Sentinel application platform (SNAP) 7.0 software (https://step.esa.int/main/download/snap-download/, (last accessed on 8 May 2021)).

2.2.2. PALSAR-2 Data

The L-band PALSAR-2 level 3.1 products were produced by the Japan Aerospace Exploration Agency (JAXA) (https://auig2.jaxa.jp/ips/homepalsar, (last accessed on 8 May 2021)) [56,57]. The data were acquired in stripmap fine beam dual (FBD) mode (HH and HV) during an ascending orbit with a right-looking observation direction, a pixel spacing of 6.25 m, and off-nadir angles of 28.6° (for 19 May 2019) and 32.9° (for 28 July 2019). To combine the PALSAR-2 data with the Sentinel-2 imagery at the pixel level, we transformed the PALSAR-2 data into the same coordinate system as the Sentinel-2 imagery and resampled it to a spatial resolution of 10 m using a bilinear interpolation method. The PALSAR-2 data were coregistered by using dispersed ground control points selected from Sentinel-2 imagery and applying a quadratic polynomial transformation and bilinear interpolation. The root-mean-square error of the ground control points was less than 0.5 pixels.

2.2.3. ASTER Land Surface Temperature Products

ASTER level-2 AST_08 (surface kinetic temperature) products are generated from ASTER’s five thermal infrared bands at 90 m resolution and produced by the temperature and emissivity separation (TES) algorithm [58]. The AST_08 products were downloaded from the Land Processes Distributed Active Archive Center (https://lpdaac.usgs.gov/products/ast_08v003/, (last accessed on 8 May 2021)) and processed by the science scalable scripts-based science processor for missions (S4PM) algorithm (Version 3.4) [59]. Because a very small amount of data covering the study area were missing in 2019, we used the values of their nearest neighbors according to Euclidean distance to substitute for the missing data based on the nibble tool in ArcGIS 10.8 software.

3. Methods

3.1. Local Climate Zones Scheme

LCZs are climate-related regions that span hundreds of meters to several kilometers on a horizontal scale and are functions of surface cover, structures, construction material, and human activity [4]. As depicted in Figure 2a, the standard LCZ scheme comprises two major types: built types (LCZ classes 1–10) and land cover types (LCZ classes A–G). The 17 standard classes of LCZs are determined by surface characteristics; each provides a unique thermal environment that is most apparent in areas of simple relief, over dry surfaces, and on calm nights [4].

3.2. Training and Test Datasets

To collect field-based land cover observations, we conducted field surveys in Nanchang from May to September 2019. To reduce the effects caused by the imbalance of classes [60], roughly balanced ground reference samples of 13 LCZ classes were randomly collected throughout the study area based on this field investigation and visual interpretation of high-spatial-resolution Google Earth imagery from May to September 2019. The reference samples were then randomly split into two sets of disjoint training and test pixels to ensure spatial separation of training and test sites [61] (Table 2). Figure 2b shows Google Earth images of typical samples of the LCZ classes in Nanchang. It should be pointed out that LCZ 1 (compact high-rise) was not included because there was almost no LCZ 1 in our study area. Furthermore, we merged LCZ B (scattered trees) and LCZ C (bush, scrub) into a new class LCZ B_C (scattered trees with bush and scrub) because in most cases, shrubs, short trees, and scattered trees were mixed.

3.3. Input Features

The 12 spectral bands of Sentinel-2 MSI imagery (bottom-of-atmosphere reflectance), four backscattering intensity features obtained from dual-polarized PALSAR-2 (HH and HV backscattering coefficients, and the difference and ratio between the two polarization bands), and 24 textural features were used for the LCZ classification (Table 3). To explore the effects of different combinations of input features on classification accuracy, we set up six datasets designated as D1–D6 using these 40 features (Table 3). The textural features were extracted by using ENVI 5.5 software as follows: First, we performed a minimum noise fraction (MNF) transformation [62] on four bands at 10 m (bands 2, 3, 4, and 8) in the Sentinel-2 image. Second, the gray-level co-occurrence matrix (GLCM) [63] was computed considering a processing window of 3 × 3, the grayscale quantization level of 64, and the distance of 1. For the Sentinel-2 based GLCM, we selected the first MNF component (MNF 1) as the input. For the PALSAR-2 based GLCM, we selected the two polarization bands (HH and HV) as the input, respectively. Third, based on the obtained GLCM, we averaged eight textural features (contrast, correlation, dissimilarity, entropy, homogeneity, mean, angular second moment, and variance) in four directions (0°, 45°, 90°, and 135°) to achieve rotational invariance.
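The texture step above can be illustrated with a minimal, standard-library sketch. It is not the ENVI implementation used in the study: it assumes the window has already been quantized to integer gray levels, it counts co-occurrences symmetrically, and it computes only six of the eight measures using common Haralick-style definitions. The function names are hypothetical.

```python
import math
from collections import Counter

# Offsets (row, col) for the four directions at distance 1.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(window, offset):
    """Normalized co-occurrence probabilities of gray-level pairs for one offset."""
    dr, dc = offset
    rows, cols = len(window), len(window[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = window[r][c], window[r2][c2]
                counts[(a, b)] += 1
                counts[(b, a)] += 1  # symmetric counting (an assumption)
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def texture_features(p):
    """A subset of the textural measures named in the text."""
    return {
        "contrast": sum(v * (i - j) ** 2 for (i, j), v in p.items()),
        "dissimilarity": sum(v * abs(i - j) for (i, j), v in p.items()),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for (i, j), v in p.items()),
        "asm": sum(v * v for v in p.values()),
        "entropy": -sum(v * math.log(v) for v in p.values()),
        "mean": sum(v * i for (i, j), v in p.items()),
    }


def rotation_invariant_texture(window):
    """Average each measure over the 0°, 45°, 90°, and 135° directions."""
    per_direction = [texture_features(glcm(window, off)) for off in OFFSETS.values()]
    return {name: sum(f[name] for f in per_direction) / len(per_direction)
            for name in per_direction[0]}
```

For a 3 × 3 checkerboard window such as `[[0, 1, 0], [1, 0, 1], [0, 1, 0]]`, the horizontal and vertical directions see only unequal neighbors while the diagonals see only equal ones, so averaging over the four directions removes the dependence on orientation.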

3.4. Random Forest Classification

RF [22] is a parallel ensemble method based on classification and regression trees; its individual learners can be generated simultaneously because there are no strong dependencies between them [64]. We implemented the RF classifier by using the scikit-learn library [65] and the Geospatial Data Abstraction Library (GDAL, https://gdal.org/, (last accessed on 8 May 2021)) in Python.

We used out-of-bag (OOB) samples to select the hyperparameters of the model. Before launching the RF classifier, two important hyperparameters that determine the randomness of the RF model had to be set: the number of trees (T) and the number of features (as listed in Table 3) randomly selected at each node (nr). We kept the other hyperparameters of the RF classifier at their defaults and performed a grid search. The search range of T was between 100 and 2000 in intervals of 100, whereas the search range of nr was between 1 and the total number of features in intervals of 1. Based on the OOB scores of different RF models using various combinations of hyperparameters, we selected the optimal combination of hyperparameters (Table 3).
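The grid search can be sketched as follows. This is only an outline under stated assumptions: the lower bound of 1 for nr is inferred from context, `oob_score` is a hypothetical callable supplied by the user, and `toy_score` is a made-up stand-in so that the example runs without any data. With scikit-learn, the score would typically come from fitting `RandomForestClassifier(n_estimators=T, max_features=nr, oob_score=True)` and reading its `oob_score_` attribute.

```python
from itertools import product


def grid_search(n_features, oob_score):
    """Return the (T, nr) pair with the highest value of oob_score(T, nr)."""
    tree_counts = range(100, 2001, 100)            # T: 100, 200, ..., 2000
    features_per_split = range(1, n_features + 1)  # nr: 1, 2, ..., n_features
    candidates = product(tree_counts, features_per_split)
    return max(candidates, key=lambda pair: oob_score(*pair))


# Hypothetical stand-in for an OOB score, peaked at T = 500 and nr = 6.
def toy_score(T, nr):
    return -abs(T - 500) / 1000 - abs(nr - 6) / 10


best_T, best_nr = grid_search(40, toy_score)
```

With 40 features, the grid contains 20 × 40 = 800 candidate models, which is why the low cost of evaluating OOB scores (no separate validation split) matters here.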

3.5. Grid-Cell Processing and Postprocessing

Because LCZs are defined at the local scale (10²–10⁴ m) [4,49], we used a grid-cell (100 m × 100 m) process for pixel aggregation. First, we used ArcGIS 10.8 software to create nets of grid cells with sizes of 100 m × 100 m covering the entire study area. The 100 m × 100 m grid cells were intersected with the LST data (90 m spatial resolution), and the area of each intersected portion was calculated. The LST attribute of a grid cell was then obtained as the weighted average of the LST values of the intersected portions according to their area percentages. Next, for each grid cell, the area of each LCZ class within the grid cell was calculated and stored in the attribute table. To calculate the percentage of each LCZ class within each grid cell, we divided the area of each LCZ class by the area of the grid cell. Each grid cell was then assigned the dominant LCZ class, that is, the class that accounted for the largest area within it. Finally, we applied a 3 × 3 majority filter to the LCZ classification maps to include more contextual information.
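The two aggregation rules described above can be sketched in plain Python. This is an illustration rather than the ArcGIS workflow used in the study; the raster layout (pixel edges aligned to the origin), the function names, and the example values are assumptions.

```python
def area_weighted_lst(cell, lst, pixel_size=90.0):
    """Area-weighted mean LST inside cell = (x0, y0, x1, y1), in metres.

    Assumed layout: lst[r][c] covers x in [c*pixel_size, (c+1)*pixel_size]
    and y in [r*pixel_size, (r+1)*pixel_size].
    """
    x0, y0, x1, y1 = cell
    weighted_sum = total_area = 0.0
    for r, row in enumerate(lst):
        for c, value in enumerate(row):
            # Width and height of the overlap between the cell and this pixel.
            dx = min(x1, (c + 1) * pixel_size) - max(x0, c * pixel_size)
            dy = min(y1, (r + 1) * pixel_size) - max(y0, r * pixel_size)
            if dx > 0 and dy > 0:
                weighted_sum += dx * dy * value
                total_area += dx * dy
    return weighted_sum / total_area


def dominant_class(class_areas, cell_area=100.0 * 100.0):
    """Return the LCZ class with the largest area and all area fractions."""
    fractions = {k: a / cell_area for k, a in class_areas.items()}
    return max(fractions, key=fractions.get), fractions
```

For example, a 100 m cell at the origin overlaps four 90 m pixels with areas of 8100, 900, 900, and 100 m², so `area_weighted_lst((0, 0, 100, 100), [[10, 20], [30, 40]])` weights the four values by 0.81, 0.09, 0.09, and 0.01.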

3.6. Usual Resampling Methods

To explore the differences between the grid-cell-based method and the usual resampling methods, we used the D6 dataset to generate 100 m LCZ maps based on ArcGIS 10.8 software. We performed majority resampling and nearest neighbor resampling on the classified LCZ map. For the classified LCZ map (categorical data), we did not include bilinear interpolation or cubic convolution in the comparison because they alter the pixel values so that the original categories are not maintained. In addition, we applied nearest neighbor resampling, bilinear interpolation resampling, and cubic convolution resampling to the original images before executing the classification. To ensure the consistency of the comparison, the 100 m LCZ classification results obtained by each resampling method were subjected to the same 3 × 3 majority filter as those obtained by the grid-cell-based method. Finally, we used the grid-cell-based method as the baseline to compare the differences between the other five resampling methods and the grid-cell-based method.
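The difference between nearest neighbor and majority resampling of a categorical map can be seen in a small sketch. It assumes that the map dimensions are exact multiples of the block size and that the output cells are aligned with the input pixels; the function names are hypothetical, and the ArcGIS tools used in the study may handle edges and ties differently.

```python
from collections import Counter


def nearest_neighbor_resample(class_map, block):
    """Keep the class of the pixel at (or just past) each block centre."""
    half = block // 2
    return [[class_map[r + half][c + half]
             for c in range(0, len(class_map[0]), block)]
            for r in range(0, len(class_map), block)]


def majority_resample(class_map, block):
    """Keep the most frequent class inside each block."""
    out = []
    for r in range(0, len(class_map), block):
        row = []
        for c in range(0, len(class_map[0]), block):
            votes = Counter(class_map[i][j]
                            for i in range(r, r + block)
                            for j in range(c, c + block))
            row.append(votes.most_common(1)[0][0])
        out.append(row)
    return out
```

A block of LCZ D pixels with a single LCZ E pixel at its centre becomes LCZ E under nearest neighbor resampling but stays LCZ D under majority resampling, which is one way isolated pixels can turn into salt-and-pepper noise at the coarser resolution.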

3.7. Feature Importance for the RF Model and Feature Contributions for Each Class

To understand how each feature affected the RF classification model, we used the mean decrease in Gini/Gini importance and the mean decrease in accuracy/permutation importance [22] based on the training set. The Gini importance of a feature was obtained by averaging the decrease of the Gini impurity at all nodes where this feature was used in all trees. The permutation importance was expressed as the value of the change in the accuracy of a trained model when the values of a feature in the dataset were randomly permuted. For the second of these calculations, we performed 100 repeated shuffles for all features separately and averaged the decrease of accuracy to reduce randomness.
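The permutation importance described above can be sketched without any library. The classifier here is a toy stand-in (`toy_model`, hypothetical) that looks only at the first feature, so the second feature should receive an importance of exactly zero. scikit-learn provides a comparable routine, `sklearn.inspection.permutation_importance`, which is not used in this sketch.

```python
import random


def accuracy(predict, X, y):
    """Fraction of samples whose predicted label matches the reference label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=100, seed=0):
    """Mean drop in accuracy when one feature column is randomly shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the dataset with only column j permuted.
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, shuffled, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy stand-in for a trained classifier: it only looks at feature 0.
def toy_model(row):
    return int(row[0] > 0.5)


X = [[i / 19, (i * 7) % 5] for i in range(20)]
y = [toy_model(row) for row in X]
scores = permutation_importance(toy_model, X, y)
```

Averaging over 100 shuffles, as in the text, reduces the randomness of any single permutation.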

To explore the impact of each feature on each class, we employed a feature contribution method using the tree-interpreter package. The feature contribution method is based on decision paths through each tree in a forest and can reveal the relationship between features and predictions [36,37].

For classification tasks, consider a dataset of m samples $D = \{(x_{1}, y_{1}), (x_{2}, y_{2}), \dots, (x_{m}, y_{m})\}$ consisting of n input features and one label $y_{i}$, where $x_{i,j}$ $(1 \leq i \leq m, 1 \leq j \leq n)$ is the value of the j-th feature at the i-th sample. Denote classes by $k$ $(1 \leq k \leq K)$, where K is the total number of classes. Let $t$ $(1 \leq t \leq T)$ be the t-th tree in a forest, where T is the total number of trees. For a single input $x_{i}$, there is a decision path from the root node to the leaf node in each tree. At a node in the decision path, if this node (parent node) is split into child nodes by feature j, then the contribution of feature j to class k is defined as:

$$FC_{j,k} = \begin{cases} p_{j,k}^{child} - p_{j,k}^{parent}, & \text{if the split in a parent node is performed over the feature } j; \\ 0, & \text{otherwise,} \end{cases} \qquad (1)$$

where $p_{j,k}^{child}$ is the fraction of samples that belong to class k at the child node, corresponding to the feature j, and $p_{j,k}^{parent}$ is the fraction of samples that belong to class k at the parent node, corresponding to the feature j.

The predicted probability $P_{k}$ that $x_{i}$ belongs to class k can be written as:

$$P_{k} = \frac{1}{T} \sum_{t=1}^{T} p_{k}^{(t,root)} + \sum_{j=1}^{n} \left( \frac{1}{T} \sum_{t=1}^{T} FC_{j,k}^{(t)} \right), \qquad (2)$$

where $p_{k}^{(t,root)}$ is the fraction of samples that belong to class k at the root node in the t-th tree and $FC_{j,k}^{(t)}$ is the sum of $FC_{j,k}$ over all nodes on the decision path in the t-th tree.

To obtain the feature contributions of each class, we averaged the results computed from all training samples belonging to the same class.
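Equations (1) and (2) can be traced on a hand-built tree. The sketch below is not the tree-interpreter package; the nested-dictionary tree format, the function names, and the class fractions in `toy_tree` are all hypothetical and chosen only to make the decomposition visible.

```python
def tree_contributions(node, x, n_features):
    """Walk one decision path and apply Equation (1) at every split."""
    n_classes = len(node["p"])
    bias = list(node["p"])                      # class fractions at the root
    contrib = [[0.0] * n_classes for _ in range(n_features)]
    while "feature" in node:                    # leaves carry no split
        j = node["feature"]
        child = node["left"] if x[j] <= node["threshold"] else node["right"]
        for k in range(n_classes):
            contrib[j][k] += child["p"][k] - node["p"][k]
        node = child
    return bias, contrib


def forest_contributions(trees, x, n_features):
    """Average root fractions and contributions over trees, as in Equation (2)."""
    results = [tree_contributions(t, x, n_features) for t in trees]
    T, K = len(trees), len(trees[0]["p"])
    bias = [sum(b[k] for b, _ in results) / T for k in range(K)]
    contrib = [[sum(c[j][k] for _, c in results) / T for k in range(K)]
               for j in range(n_features)]
    return bias, contrib


# Hypothetical two-class tree with hand-picked class fractions at each node.
toy_tree = {
    "p": [0.5, 0.5], "feature": 0, "threshold": 0.3,
    "left": {"p": [0.9, 0.1]},
    "right": {"p": [0.2, 0.8], "feature": 1, "threshold": 1.0,
              "left": {"p": [0.6, 0.4]}, "right": {"p": [0.0, 1.0]}},
}
bias, contrib = forest_contributions([toy_tree], x=[0.7, 2.0], n_features=2)
probability = [bias[k] + sum(c[k] for c in contrib) for k in range(2)]
```

For this input, feature 0 moves the class-2 fraction from 0.5 to 0.8 and feature 1 moves it from 0.8 to 1.0, so the root fraction plus the two contributions reproduces the leaf probability.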

3.8. Statistical Analysis for Nighttime LST within LCZs

To examine the spatial autocorrelation of nighttime LST, we used the global Moran’s I statistic and the Anselin local Moran’s I statistic based on ArcGIS 10.8 software. For grid cells, we used an inverse distance conceptualization to generate a spatial weight matrix with a default threshold value of 270 m. Subsequently, the global Moran’s I index for all grid cells was computed based on the spatial weight matrix. The Anselin local Moran’s I analysis for all grid cells was also based on the spatial weight matrix. In addition, to explore the differences in LST among LCZ classes, we carried out statistical analysis using SPSS Statistics 26 software. First, we examined the normality of the LST in each LCZ class by using histogram comparisons, Q–Q plots, and Kolmogorov–Smirnov tests. We then performed Levene’s test to examine the homogeneity of variances. Based on the results of these two tests, to estimate the statistical significance of the LST differences between LCZ classes, we finally chose nonparametric tests, including the Kruskal–Wallis one-way analysis of variance (ANOVA) test followed by all pairwise multiple comparisons and a median test followed by all pairwise multiple comparisons.
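The global Moran's I with inverse-distance weights can be sketched as below. This uses the textbook formula with unstandardized weights and a distance cutoff; ArcGIS offers further options (for example, row standardization) that are not reproduced here, and the function names are hypothetical.

```python
import math


def inverse_distance_weights(points, threshold=270.0):
    """w_ij = 1 / d_ij for distinct points within the threshold, else 0."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d = math.dist(points[i], points[j])
                if d <= threshold:
                    w[i][j] = 1.0 / d
    return w


def global_morans_i(values, w):
    """I = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2, with z = x - mean."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)
    cross = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(v * v for v in z)
```

With four cell centres spaced 100 m apart along a line and a 150 m threshold, the clustered values `[1, 1, 5, 5]` give a positive index and the alternating values `[1, 5, 1, 5]` give a negative one, which matches the interpretation of positive spatial autocorrelation used in the text.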

4. Results

4.1. Accuracy Assessment of LCZ Maps

Figure 3a shows the LCZ maps obtained with different datasets (D1–D6), respectively. The percentages of the area occupied by each LCZ class in different datasets (D1–D6) are shown in Figure 3b. The accuracies of the classification were evaluated in terms of user’s accuracy (UA), producer’s accuracy (PA), and overall accuracy (OA), which were derived from the confusion matrix based on test pixels [61]. The confusion matrices of LCZ maps obtained using different datasets (D1–D6) are shown in Figure 4. Figure 5 also shows the differences in the PAs and UAs of each LCZ class for the different LCZ maps.
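The three accuracy measures can be derived from a confusion matrix as in the following sketch. The orientation of the matrix (rows as reference classes, columns as predicted classes) is an assumption, and the function name is hypothetical.

```python
def accuracy_metrics(confusion):
    """confusion[i][j]: test pixels of reference class i predicted as class j."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    diagonal = [confusion[k][k] for k in range(n)]
    reference_totals = [sum(confusion[k]) for k in range(n)]
    predicted_totals = [sum(confusion[i][k] for i in range(n)) for k in range(n)]
    return {
        "OA": sum(diagonal) / total,                                # overall accuracy
        "PA": [d / t for d, t in zip(diagonal, reference_totals)],  # producer's accuracy
        "UA": [d / t for d, t in zip(diagonal, predicted_totals)],  # user's accuracy
    }
```

PA measures how many reference pixels of a class were recovered (omission errors), whereas UA measures how many pixels assigned to a class are correct (commission errors).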

Compared to using the D1 dataset, the OAs were improved by 2.24% using D2, 2.32% using D5, and 4.03% using D6. There was a small improvement in the OA after using textural features. For example, the D2 dataset improved 2.24% over the D1 dataset, the D4 dataset improved 7.14% over the D3 dataset, and the D6 dataset improved 1.71% over the D5 dataset. When using the D5 dataset, the OAs were improved by 51.9% over the D3 dataset and 44.76% over the D4 dataset. The LCZ map derived using only the dual-polarized PALSAR-2 data (the D3 dataset) had the lowest OA. Using the D3 or D4 dataset, either the OAs were relatively low, or the land cover was not satisfactorily categorized. The highest OA was 89.96%, obtained from the D6 dataset by using all 40 input features.

For the D6 dataset, land cover LCZ types were generally classified with higher accuracy than built LCZ types (except for LCZs E and F). The confusion was manifested mainly among the built LCZ types. For the land cover LCZ types, LCZs E (bare rock or paved) and F (bare soil or sand) tended to be confused with built LCZ types. For the D6 dataset, LCZs A (dense trees), G (water), and D (low plants) had relatively high PA and UA among the land cover types. Among the built types, LCZs 2 (compact mid-rise) and 4 (open high-rise) had relatively high PA and UA. For the D6 dataset, open buildings (LCZs 4–6) were generally more difficult to distinguish than compact buildings (LCZs 2 and 3). This difficulty reflects the fact that compact buildings are clustered in high-to-medium-spatial-resolution (10 m to 100 m) satellite imagery, whereas open buildings are scattered and occupy only a few pixels.

To measure the compliance or divergence of individual LCZ classifications, we computed the number of the same classes for a given location (individual cells of the grid) (Figure 6). The most obvious differences among the six LCZ maps were located in the northeastern part (close to Poyang Lake), the eastern part, and the urban district. A total of 86.2% of the grid cells showed good compliance for all datasets (Figure 6b).

To visualize the discrimination of LCZ classes using the datasets D1–D6, we extracted a subregion A in the urban district of Nanchang (Figure 7). This subregion is a typical urban region consisting of different types of buildings and land cover. Visual inspection showed that the classification using PALSAR-2 polarimetric data alone did not yield a satisfactory result. Using the D3 dataset, most LCZ classes were under-represented. When using D4 by adding textural features to D3, there was a slight improvement in the classification of built LCZ types. Nevertheless, the D3 and D4 datasets performed worse than the other datasets. When using D2 by adding textural features to D1, the discrimination among built LCZ types was notably improved, especially for LCZ 4 (open high-rise). Compared to the classification results obtained from datasets D5 and D6, LCZ E (bare rock or paved) was under-represented using the D1 or D2 dataset. Compared to the D6 dataset, LCZ 4 was under-represented, while LCZ 6 (open low-rise) was over-represented using the D5 dataset. The most desirable result was produced when all 40 input features (the D6 dataset) were used because the confusion among LCZ classes was markedly reduced.

4.2. Comparison of the Grid-Cell-Based Method and Resampling Methods

As shown in Figure 8, using the nearest neighbor resampling, bilinear interpolation resampling, and cubic convolution resampling produced salt-and-pepper noise. Visually, the grid-cell-based method generated a more homogeneous result. The result of the majority resampling lay between those of the grid-cell-based method and the other resampling methods. Compared with other resampling methods, the difference between the majority resampling after classification and the grid-cell-based method was relatively small. As shown in Figure 8b–e, the basic patterns of these maps were relatively similar.

4.3. Importance and Contributions of Features for LCZ Classification

As mentioned above, the best LCZ classification was obtained using the D6 dataset. Therefore, we analyzed the feature importance of the RF model trained on all features (the D6 dataset) (Figure 9). The patterns of the two importance measures differed slightly from each other. In general, spectral features showed greater importance than polarimetric features and textural features. For both measures of importance, the most beneficial feature in the LCZ classification was S2_B12. Polarimetric features were also helpful for LCZ classification, especially the backscattering intensity at the HV polarization. Among the eight textural features, GLCM_Mean was found to be the most useful. In addition, GLCM_Mean at the HV polarization of PALSAR-2 was more important than GLCM_Mean extracted from Sentinel-2.

Figure 10 shows the feature contributions for each LCZ class for the RF model trained on all features (the D6 dataset). Land cover LCZ types exhibited more variability across input features than built LCZ types. In general, the same trends appeared in the feature importance for the RF model (Figure 9) and in the feature contributions for each LCZ class (Figure 10). For instance, S2_B12 was a beneficial feature for most of the LCZ classes. However, the contributions of a feature to each LCZ class differed to varying degrees. For example, compared with built LCZ types, the HV polarization band made a higher contribution to land cover LCZ types, especially LCZs A (dense trees), G (water), and E (bare rock or paved). As shown in Figure 10, the contributions of the individual features did not differ markedly for LCZs 4 (open high-rise), 5 (open mid-rise), and 6 (open low-rise).

As shown in Figure 9 and Figure 10, the combinations of features with low importance and contributions for LCZ classification did not have high classification accuracies. For the D3 and D4 datasets, except for P2_HV_GLCM_Mean and P2_HH_GLCM_Mean, the importance and contributions of the remaining features were not high, and therefore, their classification accuracies were not high. The importance and contributions of the 12 features of Sentinel-2 were relatively high, explaining the good classification accuracy achieved by using only Sentinel-2 imagery (the D1 dataset).

4.4. Relationships between LCZs and Nighttime LST

As shown in Figure 11a–b, high nighttime LSTs were dominant mainly in urban areas and water bodies. The fact that the global Moran’s I index for all grid cells was 0.78 (p < 0.001) indicated a strong positive spatial autocorrelation for LST. The LST of both LCZ A (dense trees) and LCZ D (low plants) was clustered mostly as low-low, whereas the LST of LCZs G (water), E (bare rock or paved), and built LCZ types (LCZs 2–6, 8, and 10) was clustered mostly as high-high (Figure 11c).

As shown in Figure 11d, different LST variations within a single LCZ class were observed. In general, there were large differences in nighttime LSTs between LCZ classes, especially between land cover LCZ types. The nighttime LST was generally higher for the built LCZ types than for the land cover LCZ types. Residential buildings (LCZs 2–6) had higher nighttime LSTs than nonresidential buildings (LCZs 8 and 10), except for LCZ 3 (compact low-rise). Both the Kruskal–Wallis one-way ANOVA test and the median test showed statistically significant differences (p < 0.001). Overall, there were statistically significant LST differences for most LCZ classes (Figure 11e).
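
The Kruskal–Wallis statistic used here can be sketched in a few lines. The example below is illustrative only: the three groups of nighttime LST values are hypothetical, and the function name kruskal_wallis_h is our own. All observations are ranked jointly (tied values receive their average rank), the statistic H is computed from the rank sums of the groups, and H is divided by the standard tie-correction factor.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with average ranks and tie correction."""
    pooled = sorted(v for g in groups for v in g)
    n_total = len(pooled)

    rank_of = {}   # value -> average rank of that value in the pooled sample
    tie_term = 0   # sum of (t**3 - t) over groups of tied values
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        t = j - i
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j

    rank_sum_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n_total * (n_total + 1)) * rank_sum_term - 3 * (n_total + 1)
    correction = 1 - tie_term / (n_total ** 3 - n_total)
    return h / correction


# Hypothetical nighttime LSTs (degrees Celsius) for three LCZ classes.
lst_by_lcz = {
    "LCZ D": [23.1, 23.4, 23.6],
    "LCZ A": [24.0, 24.2, 24.5],
    "LCZ 2": [26.8, 27.1, 27.3],
}
h_stat = kruskal_wallis_h(list(lst_by_lcz.values()))
h_ties = kruskal_wallis_h([[1, 2], [2, 3]])
print(round(h_stat, 3), round(h_ties, 3))
```

Under the null hypothesis, H is approximately chi-square distributed with k − 1 degrees of freedom for k groups, which is how a p-value is usually obtained; pairwise comparisons such as those in Figure 11e require an additional post hoc procedure that is not shown here.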

5. Discussion

5.1. LCZ Classification Using Sentinel-2 Imagery and PALSAR-2 Data

The accuracies of the LCZ map obtained with the D6 dataset were acceptable for further studies, as expected, which revealed the potential of combining optical and SAR data for LCZ classification in urban areas. The LCZ classification using only PALSAR-2 was not satisfactory, especially for land cover LCZ types, as shown by the many cells assigned to LCZ G (water) in the D3 map and the stripes of LCZ E (bare rock or paved) in the D4 map. This reveals the limitations of using only SAR data for land cover classification in complex urban and peri-urban environments [10,15,66]. Comparing the LCZ classification maps obtained from the four datasets (D1, D2, D5, and D6), we found that the basic patterns of these LCZ maps were generally similar. It can be concluded that optical data are still dominant in LCZ classification compared to SAR data [10].

Compared to the study of La et al. [16], which combined Sentinel-2 with fully polarimetric PALSAR-2 data, this study introduced textural features derived from Sentinel-2 and dual-polarized PALSAR-2 but did not consider the contribution of polarimetric parameters. When fully polarimetric SAR data are available, adding various types of information obtained by polarimetric target decomposition methods to the classification will help to improve classification accuracy [16,67]. However, fully polarimetric data are not always available because of their limited swath width [21]. As an alternative, our results showed the attractiveness of dual-polarized SAR data for LCZ classification. In addition, we showed the advantages of a majority rule-based grid-cell process in generating LCZ maps with generalized urban patterns.

For the D6 dataset, there are still limitations in the separability between LCZ classes with similar spectral characteristics, such as LCZs B and C; LCZ E (bare rock or paved) and built LCZ types; and LCZs 8 (large low-rise) and 10 (heavy industry). These problems can be solved by adding more discriminatory data in the classification or by improving classification algorithms. The information on building height is beneficial for the discrimination between built LCZ types. Further research on combining height data with other datasets for LCZ classification will be required in the future. It has been shown that the inclusion of LiDAR data in the classification can assist in urban land cover classification [68,69]. Moreover, deep learning methods, such as convolutional neural networks, have recently shown promising performance in LCZ classification [19,70].

Because the inherent speckle noise in SAR data makes individual pixels unreliable, textural features from SAR data can provide useful information [15]. However, there is a need to further investigate the effect of textural features from both optical and SAR data on LCZ classification. It is important to select the optimal combination of textural features for LCZ classification. Because various combinations of different bands, window sizes, and texture measures produce many textural features, using such a large number of features may lead to the curse of dimensionality and reduce the classification accuracy when the training set is of finite size [71].
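
To make concrete how a single textural feature arises and how quickly the candidate set grows, the following sketch computes the GLCM mean of one quantized window and then counts the band, window size, and measure combinations. It is illustrative only: the 3 × 3 window, the horizontal one-pixel offset, the symmetric GLCM, and the lists of bands and window sizes are assumptions; only the number of texture measures (eight) follows this study.

```python
from collections import Counter
from itertools import product


def glcm(window, offset=(0, 1)):
    """Symmetric, normalized gray-level co-occurrence matrix as a dict."""
    d_row, d_col = offset
    n_rows, n_cols = len(window), len(window[0])
    counts = Counter()
    for r in range(n_rows):
        for c in range(n_cols):
            rr, cc = r + d_row, c + d_col
            if 0 <= rr < n_rows and 0 <= cc < n_cols:
                a, b = window[r][c], window[rr][cc]
                counts[(a, b)] += 1  # pair in the offset direction
                counts[(b, a)] += 1  # and its mirror, making the GLCM symmetric
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def glcm_mean(p):
    """GLCM mean: reference gray level weighted by co-occurrence probability."""
    return sum(i * prob for (i, _), prob in p.items())


# Hypothetical window already quantized to gray levels 0..3.
window = [
    [0, 0, 1],
    [1, 2, 2],
    [3, 3, 3],
]
mean_value = glcm_mean(glcm(window))

# Hypothetical search space: 3 bands x 3 window sizes x 8 texture measures.
bands = ["band_a", "band_b", "band_c"]
window_sizes = [3, 5, 7]
measures = [f"measure_{k}" for k in range(1, 9)]
n_candidates = len(list(product(bands, window_sizes, measures)))
print(round(mean_value, 4), n_candidates)
```

Even this small hypothetical search space yields 72 candidate features, which illustrates why feature selection matters when the training set is limited.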

5.2. Implications of the Grid-Cell-Based Method

For nearest neighbor resampling, the difference between the LCZ maps obtained by resampling before and after classification was small, which illustrates the applicability of resampling before classification, as implemented by the WUDAPT method [8]. The choice of resampling rule affected the LCZ maps more than whether the resampling was implemented before or after classification. As a postprocessing approach for categorical data, majority resampling produced an LCZ map that was not as homogeneous as that obtained by the majority rule-based grid-cell method, which may be due to the built-in default filter window in ArcGIS software.
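
The contrast between majority-rule aggregation and nearest neighbor resampling can be sketched as follows. This is an illustrative example only: the 4 × 4 label raster, the cell size of two pixels, and the function names majority_grid and nearest_grid are hypothetical, and the code is not the ArcGIS implementation used in this study.

```python
from collections import Counter


def majority_grid(labels, cell):
    """Label each cell x cell block with its most frequent pixel class."""
    n_rows, n_cols = len(labels), len(labels[0])
    grid = []
    for r0 in range(0, n_rows, cell):
        grid_row = []
        for c0 in range(0, n_cols, cell):
            block = [
                labels[r][c]
                for r in range(r0, min(r0 + cell, n_rows))
                for c in range(c0, min(c0 + cell, n_cols))
            ]
            # Ties are resolved by Counter's ordering (first class encountered).
            grid_row.append(Counter(block).most_common(1)[0][0])
        grid.append(grid_row)
    return grid


def nearest_grid(labels, cell):
    """Label each block with one pixel near the block center."""
    n_rows, n_cols = len(labels), len(labels[0])
    return [
        [
            labels[min(r0 + cell // 2, n_rows - 1)][min(c0 + cell // 2, n_cols - 1)]
            for c0 in range(0, n_cols, cell)
        ]
        for r0 in range(0, n_rows, cell)
    ]


# Hypothetical pixel-level LCZ labels.
pixels = [
    ["6", "6", "G", "G"],
    ["6", "E", "G", "G"],
    ["A", "A", "D", "A"],
    ["A", "D", "D", "D"],
]
by_majority = majority_grid(pixels, cell=2)
by_nearest = nearest_grid(pixels, cell=2)
print(by_majority)
print(by_nearest)
```

In this toy raster, the majority rule assigns each cell its dominant class, whereas the single sampled pixel returns the minority classes E and D for two of the four cells, which is the kind of noise that makes resampled maps less homogeneous.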

5.3. Assessment of Interpretability of Features

The significant importance and contribution of features can be explained by the good representation of those features in the characteristics of LCZ classes [11,12]. Not all of the features were equally beneficial to the LCZ classification for the RF models. The fact that the overall importance of the PALSAR-2-derived features was not as prominent as that of the Sentinel-2-derived features was probably due to the inconsistent acquisition dates of the PALSAR-2 data and Sentinel-2 imagery. The seasonal variation of vegetation may make polarimetric features contribute little to identifying these changing ground objects. For PALSAR-2 dual-polarization data, the HV polarization band is more conducive to land cover classification than the HH polarization band, which may be due to the unique scattering information about ground objects provided by cross-polarization [15,25]. Our study also indicated that the GLCM textural features have limited ability in land cover classification. This limited ability may reflect the spatial resolution of Sentinel-2 MSI imagery and PALSAR-2 data. This problem will probably be resolved with the improvement of spatial resolution [72].

The fact that most features performed better in land cover LCZ types than built LCZ types indicated that there were still limitations in discriminating built LCZ types for these features. Interestingly, band 1 (coastal aerosol) made a significant contribution to LCZs 8 (large low-rise) and 10 (heavy industry); and band 9 (water vapor) contributed significantly to LCZs 2 (compact mid-rise), D (low plants), and 6 (open low-rise). These results showed that both band 1 (dedicated for aerosol retrieval) and band 9 (dedicated for water vapor correction) in Sentinel-2 imagery were beneficial for LCZ classification, despite their relatively low spatial resolution (60 m). In addition, for LCZ 5 (open mid-rise), S2_GLCM_Mean made the highest contribution. This observation highlights the fact that features that are generally unimportant for the model may be important for a specific class [37].

It is worth noting that many features had very low permutation importance (Figure 9b), probably because of the correlation between features. When features are correlated, permuting one feature has little impact on the model’s performance because the model can obtain the same information from the correlated features. In the future, it will be necessary to evaluate additional features that may allow LCZ classes to be better differentiated. In addition, it is also important to analyze the correlation among the features extracted from the remote sensing data.
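
The masking effect of correlated features on permutation importance can be reproduced with a small sketch. Everything in it is hypothetical: the synthetic data, the two toy classifiers (a single threshold rule and a majority vote over three threshold rules, standing in for one tree and a small forest), and the function names. Permutation importance is computed as the mean drop in accuracy after shuffling one feature column.

```python
import random


def accuracy(predict, rows, labels):
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)


def permutation_importance(predict, rows, labels, col, rng, n_repeats=10):
    """Mean accuracy drop after shuffling feature column `col`."""
    base = accuracy(predict, rows, labels)
    drops = []
    for _ in range(n_repeats):
        shuffled = [r[col] for r in rows]
        rng.shuffle(shuffled)
        permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
        drops.append(base - accuracy(predict, permuted, labels))
    return sum(drops) / n_repeats


rng = random.Random(0)
labels = [i % 2 for i in range(200)]
# Features 0-2 are three noisy copies of the label (strongly correlated);
# feature 3 is pure noise.
rows = [[y + rng.uniform(-0.2, 0.2) for _ in range(3)] + [rng.random()] for y in labels]


def single_rule(r):
    """Toy model that relies on feature 0 only."""
    return int(r[0] > 0.5)


def voting_rules(r):
    """Toy model that takes a majority vote over the three correlated features."""
    return int(sum(v > 0.5 for v in r[:3]) >= 2)


imp_single = permutation_importance(single_rule, rows, labels, 0, rng)
imp_voting = permutation_importance(voting_rules, rows, labels, 0, rng)
print(round(imp_single, 3), imp_voting)
```

Feature 0 carries the same information in both models, yet its permutation importance is large for the model that depends on it alone and zero for the model that can fall back on the two correlated copies. A low permutation importance therefore does not by itself show that a feature is uninformative.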

5.4. LST Differentiation of LCZs

In general terms, the fact that there were statistically significant nighttime LST differences between most LCZs indicated that different LCZ classes exhibited thermal environments associated with their surface characteristics [4,49]. For example, built LCZ types (LCZs 2–6, 8, and 10) and LCZ E (bare rock or paved) were clustered as high-high on nighttime LST (Figure 11c), probably because of the thermal properties of impervious surfaces [45]. Compared to other land cover LCZ types, LCZs E and G (water) had relatively high nighttime LSTs, probably because they cool off more slowly during nighttime. The fact that LCZ A (dense trees) had lower nighttime LSTs than LCZ B_C (scattered trees with bush and scrub) indicates that aggregated vegetation cools better than dispersed vegetation [73]. The nighttime LSTs of LCZ D (low plants) were lower than those of LCZ A, probably because dense trees have greater shading coverage that influences the penetration of solar radiation [74]. For buildings of the same height located in urban areas, open buildings had lower nighttime LSTs than compact buildings. The former may benefit from the surrounding vegetation and good ventilation [75].

However, the fact that the nighttime LSTs of several LCZ classes were not statistically significantly different from those of other classes may have been related to the influence of local or regional climate [43]. In addition, the intra-LCZ variability of nighttime LST revealed the potential effects of heterogeneous surrounding environments [76]. For example, the nighttime LSTs of both LCZs 3 (compact low-rise) and B_C were generally higher in urban areas than in rural areas. Similarly, LCZ F (bare soil or sand) was warmer near water than near dense trees during nighttime. Buildings surrounded by large areas of vegetation tended to have lower nighttime LSTs than buildings surrounded by a small amount of vegetation.

Furthermore, several issues need to be further explored in studies of LSTs or surface urban heat islands using LCZ maps, including seasonal changes in LCZs and the effects of time of day (day and night), season, and thermal anisotropy on LST variations [42,43]. Because this was not the focus of this study, we used only two dates of summer LST data to analyze the relationship between LCZ and nighttime LST. Many studies have shown LST differences within LCZs using many dates of LST data [38,40,42,43]. Therefore, the applicability of including thermal remote sensing images from different sensors (e.g., Landsat and ASTER) over multiple periods in LCZ classification can be investigated in the future. However, this could potentially lead to methodological bias in LST analysis [44].

6. Conclusions

The combination of Sentinel-2 MSI imagery and dual-polarized (HH + HV) PALSAR-2 data was found to be promising and beneficial for LCZ mapping. The quantitative analysis of input features based on the RF classifier showed that in LCZ classification, band 12 (SWIR 2) is crucial for Sentinel-2 imagery, whereas the HV polarization is important for dual-polarized PALSAR-2 data. By using the feature contribution approach based on decision paths, each input feature was found to contribute differently to LCZ classes. These different contributions may not be detected by a standard feature importance analysis. Through this class-based analysis of feature contributions, it is possible to reveal the effective features in distinguishing different LCZ classes. In addition, our comparative results showed that the grid-cell-based method produced more homogeneous LCZ maps than the usual resampling methods.

Spatial analysis of LCZs and summer nighttime LST showed that high LSTs were concentrated mostly in the built LCZ types, LCZ E, and LCZ G, whereas low LSTs were mostly concentrated in LCZs A and D. Statistical analysis showed that the summer nighttime LST differences between most LCZ classes were statistically significant, but this phenomenon needs to be further investigated using more dates of thermal remote sensing images. Considering the thermal differentiation within LCZs, the effect of thermal remote sensing data in LCZ classification can also be further explored.

This study provided insights into the performance of RF classifiers in LCZ mapping and feature assessment that could contribute to future LCZ mapping. In addition, this study highlighted the potential of the LCZ map and the grid-cell-based method for urban climate research that could contribute to a better understanding of the impact of urban morphology defined by LCZs on local climatic conditions.

Author Contributions

Conceptualization, C.C. and H.B.; methodology, C.C. and H.B.; software, C.C.; formal analysis, C.C., H.B. and X.X.; investigation, X.X.; resources and data curation, Y.L.; writing—original draft preparation, C.C.; writing—review and editing, H.B.; visualization, C.C.; supervision, H.B. and Y.Y.; project administration, H.B.; funding acquisition, H.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Key Foundation of China (Grant No. 41730642), and the National Natural Science Foundation of China (Grant No. 41771372).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

This research was supported by ALOS-2 RA-6.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

References

1. Voogt, J.; Oke, T. Thermal remote sensing of urban climates. Remote Sens. Environ. 2003, 86, 370–384.
2. Mirzaei, P.A. Recent challenges in modeling of urban heat island. Sustain. Cities Soc. 2015, 19, 200–206.
3. Parsaee, M.; Joybari, M.M.; Mirzaei, P.A.; Haghighat, F. Urban heat island, urban climate maps and urban development policies and action plans. Environ. Technol. Innov. 2019, 14, 100341.
4. Stewart, I.D.; Oke, T.R. Local Climate Zones for Urban Temperature Studies. Bull. Am. Meteorol. Soc. 2012, 93, 1879–1900.
5. Zhao, C.; Jensen, J.L.R.; Weng, Q.; Currit, N.; Weaver, R. Use of Local Climate Zones to investigate surface urban heat islands in Texas. GIScience Remote Sens. 2020, 57, 1083–1101.
6. Danylo, O.; See, L.; Bechtel, B.; Schepaschenko, D.; Fritz, S. Contributing to WUDAPT: A Local Climate Zone Classification of Two Cities in Ukraine. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2016, 9, 1841–1853.
7. Ching, J.; Mills, G.; Bechtel, B.; See, L.; Feddema, J.; Wang, X.; Ren, C.; Brousse, O.; Martilli, A.; Neophytou, M.; et al. WUDAPT: An Urban Weather, Climate, and Environmental Modeling Infrastructure for the Anthropocene. Bull. Am. Meteorol. Soc. 2018, 99, 1907–1924.
8. Bechtel, B.; Alexander, P.J.; Beck, C.; Böhner, J.; Brousse, O.; Ching, J.; Demuzere, M.; Fonte, C.; Gál, T.; Hidalgo, J.; et al. Generating WUDAPT Level 0 data—Current status of production and evaluation. Urban Clim. 2019, 27, 24–45.
9. Bechtel, B.; Daneke, C. Classification of Local Climate Zones Based on Multiple Earth Observation Data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2012, 5, 1191–1202.
10. Bechtel, B.; See, L.; Mills, G.; Foley, M. Classification of Local Climate Zones Using SAR and Multispectral Data in an Arid Environment. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2016, 9, 3097–3105.
11. Xu, Y.; Ren, C.; Cai, M.; Edward, N.Y.Y.; Wu, T. Classification of Local Climate Zones Using ASTER and Landsat Data for High-Density Cities. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 10, 3397–3405.
12. Demuzere, M.; Bechtel, B.; Middel, A.; Mills, G. Mapping Europe into local climate zones. PLoS ONE 2019, 14, e0214474.
13. Reba, M.; Seto, K.C. A systematic review and assessment of algorithms to detect, characterize, and monitor urban land change. Remote Sens. Environ. 2020, 242, 111739.
14. Amarsaikhan, D.; Ganzorig, M.; Ache, P.; Blotevogel, H. The integrated use of optical and InSAR data for urban land-cover mapping. Int. J. Remote Sens. 2007, 28, 1161–1171.
15. Zhu, Z.; Woodcock, C.E.; Rogan, J.; Kellndorfer, J. Assessment of spectral, polarimetric, temporal, and spatial dimensions for urban and peri-urban land cover classification using Landsat and SAR data. Remote Sens. Environ. 2012, 117, 72–82.
16. La, Y.; Bagan, H.; Yamagata, Y. Urban land cover mapping under the Local Climate Zone scheme using Sentinel-2 and PALSAR-2 data. Urban Clim. 2020, 33, 100661.
17. Qiu, C.; Schmitt, M.; Mou, L.; Ghamisi, P.; Zhu, X.X. Feature Importance Analysis for Local Climate Zone Classification Using a Residual Convolutional Neural Network with Multi-Source Datasets. Remote Sens. 2018, 10, 1572.
18. Qiu, C.; Mou, L.; Schmitt, M.; Zhu, X.X. Local climate zone-based urban land cover classification from multi-seasonal Sentinel-2 images with a recurrent residual network. ISPRS J. Photogramm. Remote Sens. 2019, 154, 151–162.
19. Rosentreter, J.; Hagensieker, R.; Waske, B. Towards large-scale mapping of local climate zones using multitemporal Sentinel 2 data and convolutional neural networks. Remote Sens. Environ. 2020, 237, 111472.
20. Yoo, C.; Lee, Y.; Cho, D.; Im, J.; Han, D. Improving local climate zone classification using incomplete building data and Sentinel 2 images based on convolutional neural networks. Remote Sens. 2020, 12, 3552.
21. Ohki, M.; Shimada, M. Large-Area Land Use and Land Cover Classification With Quad, Compact, and Dual Polarization SAR Data by PALSAR-2. IEEE Trans. Geosci. Remote Sens. 2018, 56, 5550–5557.
22. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
23. Rodriguez-Galiano, V.; Ghimire, B.; Rogan, J.; Chica-Olmo, M.; Rigol-Sanchez, J. An assessment of the effectiveness of a random forest classifier for land-cover classification. ISPRS J. Photogramm. Remote Sens. 2012, 67, 93–104.
24. Belgiu, M.; Drăguţ, L. Random forest in remote sensing: A review of applications and future directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31.
25. Jin, H.; Mountrakis, G.; Stehman, S.V. Assessing integration of intensity, polarimetric scattering, interferometric coherence and spatial texture metrics in PALSAR-derived land cover classification. ISPRS J. Photogramm. Remote Sens. 2014, 98, 70–84.
26. Van Beijma, S.; Comber, A.; Lamb, A. Random forest classification of salt marsh vegetation habitats using quad-polarimetric airborne SAR, elevation and optical RS data. Remote Sens. Environ. 2014, 149, 118–129.
27. Mahdianpari, M.; Salehi, B.; Mohammadimanesh, F.; Motagh, M. Random forest wetland classification using ALOS-2 L-band, RADARSAT-2 C-band, and TerraSAR-X imagery. ISPRS J. Photogramm. Remote Sens. 2017, 130, 13–31.
28. Merchant, M.A.; Warren, R.K.; Edwards, R.; Kenyon, J.K. An Object-Based Assessment of Multi-Wavelength SAR, Optical Imagery and Topographical Datasets for Operational Wetland Mapping in Boreal Yukon, Canada. Can. J. Remote Sens. 2019, 45, 308–332.
29. Abdi, A.M. Land cover and land use classification performance of machine learning algorithms in a boreal landscape using Sentinel-2 data. GIScience Remote Sens. 2020, 57, 1–20.
30. Gislason, P.O.; Benediktsson, J.A.; Sveinsson, J.R. Random Forests for land cover classification. Pattern Recognit. Lett. 2006, 27, 294–300.
31. Rodriguez-Galiano, V.; Chica-Olmo, M.; Abarca-Hernandez, F.; Atkinson, P.; Jeganathan, C. Random Forest classification of Mediterranean land cover using multi-seasonal imagery and multi-seasonal texture. Remote Sens. Environ. 2012, 121, 93–107.
32. Guan, H.; Li, J.; Chapman, M.; Deng, F.; Ji, Z.; Yang, X. Integration of orthoimagery and lidar data for object-based urban thematic mapping using random forests. Int. J. Remote Sens. 2013, 34, 5166–5186.
33. Tatsumi, K.; Yamashiki, Y.; Torres, M.A.C.; Taipe, C.L.R. Crop classification of upland fields using Random forest of time-series Landsat 7 ETM+ data. Comput. Electron. Agric. 2015, 115, 171–179.
34. Xu, Z.; Chen, J.; Xia, J.; Du, P.; Zheng, H.; Gan, L. Multisource Earth Observation Data for Land-Cover Classification Using Random Forest. IEEE Geosci. Remote Sens. Lett. 2018, 15, 789–793.
35. Yokoya, N.; Ghamisi, P.; Xia, J.; Sukhanov, S.; Heremans, R.; Tankoyeu, I.; Bechtel, B.; le Saux, B.; Moser, G.; Tuia, D. Open Data for Global Multimodal Land Use Classification: Outcome of the 2017 IEEE GRSS Data Fusion Contest. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 1363–1377.
36. Palczewska, A.; Palczewski, J.; Robinson, R.M.; Neagu, D. Interpreting random forest models using a feature contribution method. In Proceedings of the 2013 IEEE 14th International Conference on Information Reuse & Integration (IRI), San Francisco, CA, USA, 14–16 August 2013; pp. 112–119.
37. Palczewska, A.; Palczewski, J.; Robinson, R.M.; Neagu, D. Interpreting Random Forest Classification Models Using a Feature Contribution Method. In Advances in Intelligent Systems and Computing; Bouabana-Tebibel, T., Rubin, S.H., Kacprzyk, J., Eds.; Springer Science and Business Media LLC: Cham, Switzerland, 2014; Volume 263, pp. 193–218.
38. Bechtel, B.; Demuzere, M.; Mills, G.; Zhan, W.; Sismanidis, P.; Small, C.; Voogt, J. SUHI analysis using Local Climate Zones—A comparison of 50 cities. Urban Clim. 2019, 28, 100451.
39. Rahman, M.; Avtar, R.; Yunus, A.P.; Dou, J.; Misra, P.; Takeuchi, W.; Sahu, N.; Kumar, P.; Johnson, B.A.; Dasgupta, R.; et al. Monitoring Effect of Spatial Growth on Land Surface Temperature in Dhaka. Remote Sens. 2020, 12, 1191.
40. Du, P.; Chen, J.; Bai, X.; Han, W. Understanding the seasonal variations of land surface temperature in Nanjing urban area based on local climate zone. Urban Clim. 2020, 33, 100657.
41. Mushore, T.D.; Odindi, J.; Dube, T.; Matongera, T.N.; Mutanga, O. Remote sensing applications in monitoring urban growth impacts on in-and-out door thermal conditions: A review. Remote Sens. Appl. Soc. Environ. 2017, 8, 83–93.
42. Geletič, J.; Lehnert, M.; Dobrovolný, P. Land Surface Temperature Differences within Local Climate Zones, Based on Two Central European Cities. Remote Sens. 2016, 8, 788.
43. Geletič, J.; Lehnert, M.; Savić, S.; Milošević, D. Inter-/intra-zonal seasonal variability of the surface urban heat island based on local climate zones in three central European cities. Build. Environ. 2019, 156, 21–32.
44. Wang, C.; Middel, A.; Myint, S.W.; Kaplan, S.; Brazel, A.J.; Lukasczyk, J. Assessing local climate zones in arid cities: The case of Phoenix, Arizona and Las Vegas, Nevada. ISPRS J. Photogramm. Remote Sens. 2018, 141, 59–71.
45. Khamchiangta, D.; Dhakal, S. Physical and non-physical factors driving urban heat island: Case of Bangkok Metropolitan Administration, Thailand. J. Environ. Manag. 2019, 248, 109285.
46. Hartz, D.; Prashad, L.; Hedquist, B.; Golden, J.; Brazel, A. Linking satellite images and hand-held infrared thermography to observed neighborhood climate conditions. Remote Sens. Environ. 2006, 104, 190–200.
47. Sekertekin, A.; Bonafoni, S. Sensitivity Analysis and Validation of Daytime and Nighttime Land Surface Temperature Retrievals from Landsat 8 Using Different Algorithms and Emissivity Models. Remote Sens. 2020, 12, 2776.
48. Peng, J.; Ma, J.; Liu, Q.; Liu, Y.; Hu, Y.; Li, Y.; Yue, Y. Spatial-temporal change of land surface temperature across 285 cities in China: An urban-rural contrast perspective. Sci. Total. Environ. 2018, 635, 487–497.
49. Stewart, I.D.; Oke, T.R.; Krayenhoff, E.S. Evaluation of the ‘local climate zone’ scheme using temperature observations and model simulations. Int. J. Clim. 2014, 34, 1062–1080.
50. Bagan, H.; Yamagata, Y. Landsat analysis of urban growth: How Tokyo became the world’s largest megacity during the last 40 years. Remote Sens. Environ. 2012, 127, 210–222.
51. Bagan, H.; Millington, A.; Takeuchi, W.; Yamagata, Y. Spatiotemporal analysis of deforestation in the Chapare region of Bolivia using LANDSAT images. Land Degrad. Dev. 2020, 31, 3024–3039.
52. Zhang, Y.; Xu, B. Spatiotemporal analysis of land use/cover changes in Nanchang area, China. Int. J. Digit. Earth 2014, 8, 312–333.
53. Zhang, X.; Estoque, R.C.; Murayama, Y. An urban heat island study in Nanchang City, China based on land surface temperature and social-ecological variables. Sustain. Cities Soc. 2017, 32, 557–568.
54. Statistics Bureau of Nanchang, Nanchang Investigation Team of National Bureau of Statistics. Nanchang Statistical Yearbook of 2020; China Statistics Press: Beijing, China, 2020.
55. Drusch, M.; del Bello, U.; Carlier, S.; Colin, O.; Fernandez, V.; Gascon, F.; Hoersch, B.; Isola, C.; Laberinti, P.; Martimort, P.; et al. Sentinel-2: ESA’s Optical High-Resolution Mission for GMES Operational Services. Remote Sens. Environ. 2012, 120, 25–36.
56. Rosenqvist, A.; Shimada, M.; Suzuki, S.; Ohgushi, F.; Tadono, T.; Watanabe, M.; Tsuzuku, K.; Kamijo, S.; Aoki, E. Operational performance of the ALOS global systematic acquisition strategy and observation plans for ALOS-2 PALSAR-2. Remote Sens. Environ. 2014, 155, 3–12.
57. Shimada, M. Imaging from Spaceborne and Airborne SARs, Calibration, and Applications; CRC Press: Boca Raton, FL, USA, 2018.
58. Gillespie, A.; Rokugawa, S.; Matsunaga, T.; Cothern, J.; Hook, S.; Kahle, A. A temperature and emissivity separation algorithm for Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) images. IEEE Trans. Geosci. Remote Sens. 1998, 36, 1113–1126.
59. NASA/METI/AIST/Japan Spacesystems; U.S./Japan ASTER Science Team. ASTER Level 2 Surface Temperature Product. 2001. NASA EOSDIS Land Processes DAAC. Available online: https://doi.org/10.5067/ASTER/AST_08.003 (accessed on 18 May 2020).
60. Mellor, A.; Boukir, S.; Haywood, A.; Jones, S. Exploring issues of training data imbalance and mislabelling on random forest performance for large area land cover classification using the ensemble margin. ISPRS J. Photogramm. Remote Sens. 2015, 105, 155–168.
61. Congalton, R.G.; Green, K. Assessing the Accuracy of Remotely Sensed Data, 3rd ed.; CRC Press; Taylor & Francis Group: Boca Raton, FL, USA, 2019.
62. Green, A.; Berman, M.; Switzer, P.; Craig, M. A transformation for ordering multispectral data in terms of image quality with implications for noise removal. IEEE Trans. Geosci. Remote Sens. 1988, 26, 65–74.
63. Haralick, R.M.; Shanmugam, K.; Dinstein, I. Textural Features for Image Classification. IEEE Trans. Syst. Man, Cybern. 1973, SMC-3, 610–621.
64. Rokach, L. Ensemble-based classifiers. Artif. Intell. Rev. 2010, 33, 1–39.
65. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn, Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. Available online: https://jmlr.csail.mit.edu/papers/v12/pedregosa11a.html (accessed on 26 May 2020).
66. Hu, J.; Ghamisi, P.; Zhu, X.X. Feature Extraction and Selection of Sentinel-1 Dual-Pol Data for Global-Scale Local Climate Zone Classification. ISPRS Int. J. Geo-Inf. 2018, 7, 379.
67. Bagan, H.; Kinoshita, T.; Yamagata, Y. Combination of AVNIR-2, PALSAR, and Polarimetric Parameters for Land Cover Classification. IEEE Trans. Geosci. Remote Sens. 2011, 50, 1318–1328.
68. Gu, Y.; Wang, Q.; Jia, X.; Benediktsson, J.A. A Novel MKL Model of Integrating LiDAR Data and MSI for Urban Area Classification. IEEE Trans. Geosci. Remote Sens. 2015, 53, 5312–5326.
69. Yan, W.Y.; Shaker, A.; El-Ashmawy, N. Urban land cover classification using airborne LiDAR data: A review. Remote Sens. Environ. 2015, 158, 295–310.
70. Yoo, C.; Han, D.; Im, J.; Bechtel, B. Comparison between convolutional neural networks and random forest for local climate zone classification in mega urban areas using Landsat images. ISPRS J. Photogramm. Remote Sens. 2019, 157, 155–170.
71. Laurin, G.V.; Liesenberg, V.; Chen, Q.; Guerriero, L.; del Frate, F.; Bartolini, A.; Coomes, D.; Wilebore, B.; Lindsell, J.; Valentini, R. Optical and SAR sensor synergies for forest and land cover mapping in a tropical site in West Africa. Int. J. Appl. Earth Obs. Geoinf. 2013, 21, 7–16.
72. Mishra, V.N.; Prasad, R.; Rai, P.K.; Vishwakarma, A.K.; Arora, A. Performance evaluation of textural features in improving land use/land cover classification accuracy of heterogeneous landscape using multi-sensor remote sensing data. Earth Sci. Inform. 2018, 12, 71–86.
73. Estoque, R.C.; Murayama, Y.; Myint, S.W. Effects of landscape composition and pattern on land surface temperature: An urban heat island study in the megacities of Southeast Asia. Sci. Total. Environ. 2017, 577, 349–359.
74. Ng, E.; Chen, L.; Wang, Y.; Yuan, C. A study on the cooling effects of greening in a high-density city: An experience from Hong Kong. Build. Environ. 2012, 47, 256–271.
75. Yang, J.; Jin, S.; Xiao, X.; Jin, C.; Xia, J.; Li, X.; Wang, S. Local climate zone ventilation and urban land surface temperatures: Towards a performance-based and wind-sensitive planning proposal in megacities. Sustain. Cities Soc. 2019, 47, 101487.
76. Shi, Y.; Lau, K.K.-L.; Ren, C.; Ng, E. Evaluating the local climate zone classification in high-density heterogeneous urban environment using mobile measurement. Urban Clim. 2018, 25, 167–186.

Figure 1. Left: Location of the study area (Nanchang City, China). Right: Sentinel-2 MSI image of the study area (R/G/B = bands 4/3/2). The square labeled “A” indicates a subregion shown in Figure 7.

Figure 2. (a) Standard local climate zone (LCZ) scheme modified from Stewart and Oke [4]. (b) Google Earth images of typical samples of the LCZs in Nanchang.

Figure 3. (a) LCZ maps obtained with each of the six datasets (D1–D6); (b) percentages of LCZ classes with each of the six datasets (D1–D6).

Figure 4. Confusion matrices and overall accuracies (OAs) of LCZ maps obtained by RF classification using different datasets (D1–D6). The confusion matrices are expressed as percentages to the total number of test pixels.

Figure 5. Producer’s accuracies (PAs) and user’s accuracies (UAs) of different LCZ maps obtained by RF classification using different datasets (D1–D6).

Figure 6. Difference (a) and its percentages (b) between six LCZ maps using different datasets (D1–D6). The difference is presented as the number of the same classes for individual cells of the grid.

Figure 7. Sentinel-2 MSI image (RGB = bands 4, 3, 2) and LCZ classification maps using six datasets (D1–D6) of a subregion A in the urban district of Nanchang.

Figure 8. Detail of the comparison between the grid-cell-based method and the resampling methods using the D6 dataset. (a) Majority resampling after classification; (b) nearest neighbor resampling after classification; (c) nearest neighbor resampling before classification; (d) bilinear interpolation resampling before classification; (e) cubic convolution resampling before classification.

Figure 9. (a) Gini importance and (b) permutation importance of 40 input features for the RF model using the training set (D6). (S2: Sentinel-2. P2: PALSAR-2).

Figure 10. Mean of feature contributions for each LCZ class for the RF model using the training set (D6).

Figure 11. (a) Spatial distribution of nighttime LSTs in Nanchang. (b) Cluster and outlier map for nighttime LST. (c) Number of grid cells for each cluster/outlier type. (d) Nighttime LSTs for each LCZ class. The violin density plot displays the probability density of the LST data, with a boxplot of the mean (hollow circle), median (center horizontal line), interquartile range (black rectangle), and upper and lower whiskers (vertical lines between upper and lower horizontal lines). (e) Pairwise multiple comparison results of the Kruskal–Wallis one-way ANOVA test and median test, respectively. Blank cells indicate pairs of LCZs with significantly different LSTs (p < 0.001).

Table 1. Summary of remote-sensing data used in this study.

Remote Sensing Data	Date	Local Time at the Start of the Observation	Location in the Study Area	Spatial Resolution (m)
Sentinel-2B MSI L2A	17 September 2019	10:55:49	Northwest	10, 20, 60
Sentinel-2B MSI L2A	17 September 2019	10:55:49	Southwest	10, 20, 60
Sentinel-2A MSI L2A	19 September 2019	10:45:51	Northeast	10, 20, 60
Sentinel-2A MSI L2A	19 September 2019	10:45:51	Southeast	10, 20, 60
PALSAR-2 L3.1	19 May 2019	00:12:54	Southwest	6.25
PALSAR-2 L3.1	19 May 2019	00:13:02	West	6.25
PALSAR-2 L3.1	19 May 2019	00:13:10	Northwest	6.25
PALSAR-2 L3.1	28 July 2019	00:12:54	Southeast	6.25
PALSAR-2 L3.1	28 July 2019	00:13:02	East	6.25
PALSAR-2 L3.1	28 July 2019	00:13:10	Northeast	6.25
ASTER L2 AST_08	29 July 2019	22:31:08	Southeast	90
ASTER L2 AST_08	29 July 2019	22:31:17	East	90
ASTER L2 AST_08	29 July 2019	22:31:26	Northeast	90
ASTER L2 AST_08	23 August 2019	22:25:01	Southwest	90
ASTER L2 AST_08	23 August 2019	22:25:10	West	90
ASTER L2 AST_08	23 August 2019	22:25:18	Northwest	90

Table 2. Description of LCZ classes and the number of training and test pixels in the classification.

Class	Description	Training Pixels	Test Pixels
LCZ 2	Compact mid-rise	4078	1211
LCZ 3	Compact low-rise	4377	1336
LCZ 4	Open high-rise	4500	1297
LCZ 5	Open mid-rise	4843	1343
LCZ 6	Open low-rise	4226	1420
LCZ 8	Large low-rise	4134	1303
LCZ 10	Heavy industry	4046	1310
LCZ A	Dense trees	4616	1448
LCZ B_C	Scattered trees with bush and scrub	4063	1214
LCZ D	Low plants	4283	1387
LCZ E	Bare rock or paved	4723	1430
LCZ F	Bare soil or sand	4654	1289
LCZ G	Water	4727	1381
Total		57,272	17,369

Table 3. Six datasets of different input features for LCZ classification and hyperparameters used for RF classifiers (T: the number of trees; nr: the number of features randomly selected at each node).

Dataset	Features	Number of Features	Source	Hyperparameters (T and nr)
D1	Sentinel-2 bands (1–8, 8a, 9, 11–12)	12	Sentinel-2	T = 2000, nr = 4
D2	Sentinel-2 bands (1–8, 8a, 9, 11–12) + MNF 1_GLCM (contrast, correlation, dissimilarity, entropy, homogeneity, mean, angular second moment (ASM), variance)	20	Sentinel-2	T = 2000, nr = 12
D3	Backscattering intensity (HH, HV, HH–HV, HH/HV)	4	PALSAR-2	T = 2000, nr = 2
D4	Backscattering intensity (HH, HV, HH–HV, HH/HV) + HH_GLCM (contrast, correlation, dissimilarity, entropy, homogeneity, mean, ASM, variance) + HV_GLCM (contrast, correlation, dissimilarity, entropy, homogeneity, mean, ASM, variance)	20	PALSAR-2	T = 2000, nr = 12
D5	Sentinel-2 bands (1–8, 8a, 9, 11–12) + backscattering intensity (HH, HV, HH–HV, HH/HV)	16	Sentinel-2 + PALSAR-2	T = 2000, nr = 6
D6	D2 + D4	40	Sentinel-2 + PALSAR-2	T = 2000, nr = 18
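
The composition of the six datasets in Table 3 can be sketched as a small configuration in Python. This is only an illustrative sketch: the feature labels (for example `S2_B8a` or `HH_GLCM_entropy`), the helper `glcm`, and the dictionary names are hypothetical and are not the authors' identifiers. Only the feature groups, the feature counts, and the T and nr values are taken from the table.

```python
# Feature groups named in Table 3. The string labels are illustrative only.
S2_BANDS = [f"S2_B{b}" for b in
            ["1", "2", "3", "4", "5", "6", "7", "8", "8a", "9", "11", "12"]]
GLCM_STATS = ["contrast", "correlation", "dissimilarity", "entropy",
              "homogeneity", "mean", "ASM", "variance"]
P2_INTENSITY = ["HH", "HV", "HH-HV", "HH/HV"]


def glcm(prefix):
    """Return the eight GLCM texture feature names for one source layer."""
    return [f"{prefix}_GLCM_{stat}" for stat in GLCM_STATS]


def build_datasets():
    """Assemble the six feature sets D1-D6 as described in Table 3."""
    d1 = list(S2_BANDS)                    # Sentinel-2 bands only
    d2 = d1 + glcm("MNF1")                 # + textures of the first MNF component
    d3 = list(P2_INTENSITY)                # PALSAR-2 backscattering intensity only
    d4 = d3 + glcm("HH") + glcm("HV")      # + textures of both polarizations
    d5 = d1 + d3                           # spectral + backscatter, no textures
    d6 = d2 + d4                           # all features
    return {"D1": d1, "D2": d2, "D3": d3, "D4": d4, "D5": d5, "D6": d6}


# Hyperparameters reported in Table 3:
# T = number of trees, nr = number of features randomly selected at each node.
RF_PARAMS = {
    "D1": {"T": 2000, "nr": 4},
    "D2": {"T": 2000, "nr": 12},
    "D3": {"T": 2000, "nr": 2},
    "D4": {"T": 2000, "nr": 12},
    "D5": {"T": 2000, "nr": 6},
    "D6": {"T": 2000, "nr": 18},
}

DATASETS = build_datasets()
FEATURE_COUNTS = {name: len(features) for name, features in DATASETS.items()}

if __name__ == "__main__":
    for name, count in FEATURE_COUNTS.items():
        params = RF_PARAMS[name]
        print(f"{name}: {count} features, T = {params['T']}, nr = {params['nr']}")
```

If a scikit-learn workflow were assumed (the table does not say which software was used), T and nr would correspond to the `n_estimators` and `max_features` arguments of `RandomForestClassifier`.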

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Chen, C.; Bagan, H.; Xie, X.; La, Y.; Yamagata, Y. Combination of Sentinel-2 and PALSAR-2 for Local Climate Zone Classification: A Case Study of Nanchang, China. Remote Sens. 2021, 13, 1902. https://doi.org/10.3390/rs13101902

