Identification of Urban Green Space Types and Estimation of Above-Ground Biomass Using Sentinel-1 and Sentinel-2 Data

Abstract: High-quality urban green space supports the healthy functioning of urban ecosystems. This study aimed to rapidly assess the distribution, and accurately estimate the above-ground biomass, of urban green space using remote sensing methods, thus providing a better understanding of the urban ecological environment in Xuzhou for more effective management. We performed urban green space classifications and compared the performance of Sentinel-2 MSI data, Sentinel-1 SAR data, and their combination for estimating above-ground biomass, using field data from Xuzhou, China. The results showed the following: (1) incorporating an object-oriented method and random forest algorithm to extract urban green space information was effective; (2) compared with stepwise regression models with single-source data, biomass estimation models based on multi-source data provided higher estimation accuracy (R² = 0.77 for coniferous forest, R² = 0.76 for shrub-grass vegetation, R² = 0.75 for broadleaf forest); and (3) from 2016 to 2021, urban green space coverage in Xuzhou decreased, while the total above-ground biomass increased, with higher average above-ground biomass in broadleaf forests (133.71 tons/ha) compared to coniferous forests (92.13 tons/ha) and shrub-grass vegetation (21.65 tons/ha). Our study provides an example of automated classification and above-ground biomass mapping for urban green space using multi-source data and facilitates urban eco-management.


Introduction
In the urgent context of global warming, cities, with their intensive energy consumption and greenhouse gas emissions [1,2], are required to take responsibility for carbon dioxide emission reduction [3][4][5][6]. Urban green spaces (UGSs) are woody and herbaceous vegetated areas open to the public and have valuable natural carbon sequestration functions, as well as playing an important role in the local carbon balance in urban areas [7][8][9][10]. One of the important indicators for quantifying the carbon sequestration capacity of UGSs is above-ground biomass (AGB) [11][12][13]. Rapid mapping of UGSs and the refined acquisition of AGB data for continuous monitoring of green space vegetation are needed, as these will improve scientific exploration of urban ecosystem material cycles and green space carbon sequestration potential. They also facilitate in-depth consideration of UGS management.
[...] construction of UGS in recent years. It is worth mentioning that its urban forest coverage rate ranks among the best in Jiangsu Province, reaching 27.80% in 2021 [42]. The study area (117°2′24.58″~117°19′57.11″ E, 34°8′34.38″~34°23′4.10″ N) is located within the outer ring expressway around Xuzhou (represented by the red line in Figure 1d), which is the living and working circle of urban residents in the central downtown. It covers a geographical area of approximately 527.63 km². The dominant forest type in the urban area is warm temperate deciduous broadleaf [43], and popular tree species are Cinnamomum camphora (L.) Presl., Ginkgo biloba L., and Koelreuteria paniculata Laxm.
From October to early December 2017, we conducted field surveys on typical green spaces in Xuzhou. To match the spatial resolution of the Sentinel-2 MSI data, rectangular green quadrats were set up with a fixed size of 10 m × 10 m.
The growth indicators of the vegetation in the quadrats were measured and recorded, including diameter at breast height (DBH), height (from the base to the crown), and crown width of each tree; base diameter, height, and crown width (both south-north and east-west directions) of each shrub; and grass coverage and plant height. Afterward, a dataset of AGB with a total of 140 quadrats was created. The hierarchical summary method was used to calculate AGB for trees, shrubs, and grass layers separately, and then they were summed to obtain the AGB in the same quadrat. Furthermore, the AGB of trees and shrubs was estimated indirectly using allometric models (see Tables A1 and A2, Appendix A), which were established based on correlations between the dry weight of each organ component and easily measurable variables (e.g., DBH, height) for trees [44,45]. The AGB for bamboo forest was also calculated to be 22.50 kg per plant [46], and 8 tons/ha was taken as a reference for calculating the measured grassland biomass in Xuzhou [47]. The biomass calculation results from field data are listed in Table A3, Appendix A.
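The layer-by-layer summation described above can be sketched as follows. This is only an illustration: the allometric form and its coefficients `a` and `b` are hypothetical placeholders (the actual models are in Tables A1 and A2, which are not reproduced here), and scaling the 8 tons/ha grass reference by fractional cover is an assumption. Only the quadrat size, the bamboo value, and the grass reference come from the text.

```python
QUADRAT_AREA_M2 = 10 * 10        # quadrat size stated in the text
BAMBOO_KG_PER_PLANT = 22.50      # per-plant bamboo AGB stated in the text [46]
GRASS_TONS_PER_HA = 8.0          # grassland reference stated in the text [47]


def allometric_agb_kg(dbh_cm, height_m, a=0.05, b=0.9):
    """Hypothetical allometric model W = a * (D^2 * H)^b.

    The coefficients a and b are placeholders, not the values of Tables A1/A2.
    """
    return a * (dbh_cm ** 2 * height_m) ** b


def quadrat_agb_tons_per_ha(trees, shrubs_kg=0.0, bamboo_plants=0, grass_cover=0.0):
    """Sum tree, shrub, bamboo and grass layers for one quadrat (tons/ha).

    trees: list of (dbh_cm, height_m) tuples
    shrubs_kg: shrub-layer dry weight already estimated for the quadrat
    grass_cover: fractional grass cover in [0, 1] (assumed scaling)
    """
    tree_kg = sum(allometric_agb_kg(d, h) for d, h in trees)
    woody_kg = tree_kg + shrubs_kg + bamboo_plants * BAMBOO_KG_PER_PLANT
    # kg per quadrat -> tons per hectare
    woody_t_ha = (woody_kg / 1000.0) * (10_000 / QUADRAT_AREA_M2)
    grass_t_ha = GRASS_TONS_PER_HA * grass_cover
    return woody_t_ha + grass_t_ha


if __name__ == "__main__":
    # Ten bamboo plants in a 100 m2 quadrat: 225 kg on 0.01 ha
    print(quadrat_agb_tons_per_ha([], bamboo_plants=10))
```

The unit conversion is the part most easily gotten wrong: a 10 m × 10 m quadrat is 0.01 ha, so one kilogram in the quadrat corresponds to 0.1 tons/ha.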

Satellite Data
The Sentinel-2 MSI data used in this study were all acquired during the summer months (from June to August), except for images used for modeling, which coincided with the autumn field survey period (Table 1). Sentinel-2 MSI data pre-processing was mainly performed using Sen2cor 02.05.05 for atmospheric correction, and the data were then resampled to 10 m in the SNAP 8.0 (Sentinel Application Platform) environment. Excluding three bands (1, 9, and 10) that had only marginal relevance to vegetation monitoring, the remaining 10 single bands were combined in ENVI 5.3, followed by masking and image mosaicking.
We also used Sentinel-1 SAR data in the dual-polarization VV + VH mode. This product had already been multi-looked and projected to ground range using an Earth ellipsoid model. To reduce any errors arising from the remote sensing data source, Sentinel-1 SAR data were acquired at a time corresponding to the acquisition time of the Sentinel-2 MSI data (Table 1). Sentinel-1 SAR data preprocessing included masking, orbit correction, thermal noise removal, radiometric calibration, terrain correction, and format conversion of the backscattering coefficients, all of which were done in SNAP 8.0. An SRTM digital elevation model was also used, with pixel spacing resampled to 10 m by cubic convolution.
In addition to the original remote sensing images, 10 remote sensing indices, 32 texture features, and three topographical features were derived from multi-source data (Table 2). Based on the coefficient of variation (CV) analysis [48] and fineness judgment of the image texture (see Figures A1 and A2, Appendix A), texture feature analysis was performed using a gray level co-occurrence matrix (GLCM) with an optimal window size of 5 × 5 pixels, a shift step of 1 in the X and Y directions, and a gray scale compression of 32 levels.
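To make the GLCM settings concrete, the following minimal sketch computes a symmetric co-occurrence matrix for a single 5 × 5 window quantized to 32 gray levels. Reading "a shift step of 1 in the X and Y directions" as the pixel offset (1, 1) is an assumption, and the three statistics shown (contrast, homogeneity, entropy) are common GLCM measures chosen for illustration; the 32 texture features of Table 2 are not listed in this section.

```python
import math


def quantize(window, levels=32):
    """Linearly compress the window's values to integer gray levels 0..levels-1."""
    flat = [v for row in window for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0 for _ in row] for row in window]
    return [[min(levels - 1, int((v - lo) / (hi - lo) * levels)) for v in row]
            for row in window]


def glcm(gray, dx=1, dy=1):
    """Symmetric, normalized co-occurrence probabilities for offset (dx, dy)."""
    rows, cols = len(gray), len(gray[0])
    counts, total = {}, 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = gray[r][c], gray[r2][c2]
                counts[(i, j)] = counts.get((i, j), 0) + 1
                counts[(j, i)] = counts.get((j, i), 0) + 1
                total += 2
    return {pair: n / total for pair, n in counts.items()}


def texture_features(p):
    """A few standard GLCM statistics from the probability dictionary p."""
    return {
        "contrast": sum(prob * (i - j) ** 2 for (i, j), prob in p.items()),
        "homogeneity": sum(prob / (1 + (i - j) ** 2) for (i, j), prob in p.items()),
        "entropy": -sum(prob * math.log(prob) for prob in p.values()),
    }


if __name__ == "__main__":
    # Hypothetical 5 x 5 window with vertical stripes (strong texture)
    striped = [[0, 100, 0, 100, 0] for _ in range(5)]
    print(texture_features(glcm(quantize(striped))))
```

In practice the window slides over every pixel of each band, so each statistic becomes a texture image at the same 10 m grid as the source band.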

Urban Green Space Classification
The core of object-oriented classification is image segmentation, which generates internally "homogeneous and cohesive" polygonal objects that directly affect the subsequent classification process. Using a single segmentation scale can easily cause image "under-segmentation" or "over-segmentation", whereas multi-scale segmentation makes it possible to segment all features reasonably [49]. The membership (affiliation) function algorithm, which is frequently applied in object-oriented classification, is based on fuzzy functions and classifies images via the formation of logical functions [50]. The three major land cover types (urban green spaces, urban impervious surfaces, and water bodies) were first distinguished using these approaches. The strategies for image segmentation were as follows: (1) weights for the red-edge bands (bands 5, 6, and 7) from Sentinel-2 MSI data were set to 1.20, because they are sensitive to vegetation growth, and the other multispectral bands were weighted as 1; (2) the corresponding optimal scale parameters for urban green spaces, urban impervious surfaces, and water bodies were set to 25, 30, and 50, respectively, due to their different spatial connectivity and heterogeneity; (3) the color parameter was set to 0.80, the shape parameter was set to 0.20, and the compactness and smoothness parameters were both 0.50.
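How these parameters interact can be illustrated with a simplified merge test in the style of the widely described multiresolution segmentation criterion, in which two adjacent objects are merged when the weighted increase in heterogeneity stays below the squared scale parameter. This is a sketch under that assumption, not the implementation used by the segmentation software; the object pixel values are invented, and the shape terms are passed in as precomputed numbers rather than derived from object geometry.

```python
from statistics import pstdev

# Band weights from the text: red-edge bands 1.20, other bands 1 (two bands shown)
BAND_WEIGHTS = {"B4": 1.0, "B5": 1.2}


def color_heterogeneity_change(obj1, obj2, band_weights):
    """Weighted increase in size-scaled standard deviation if two objects merge."""
    change = 0.0
    for band, weight in band_weights.items():
        v1, v2 = obj1[band], obj2[band]
        merged = v1 + v2
        change += weight * (len(merged) * pstdev(merged)
                            - (len(v1) * pstdev(v1) + len(v2) * pstdev(v2)))
    return change


def fusion_cost(dh_color, dh_compact=0.0, dh_smooth=0.0, shape=0.20, compactness=0.50):
    """Combine color and shape terms with the weights given in the text."""
    dh_shape = compactness * dh_compact + (1 - compactness) * dh_smooth
    return (1 - shape) * dh_color + shape * dh_shape


def should_merge(cost, scale):
    """Assumed stopping rule: merge only while the cost is below scale squared."""
    return cost < scale ** 2


if __name__ == "__main__":
    # Hypothetical pixel values for two adjacent image objects
    dark = {"B4": [10, 10, 10, 10], "B5": [20, 20, 20, 20]}
    bright = {"B4": [200, 200, 200, 200], "B5": [300, 300, 300, 300]}
    cost = fusion_cost(color_heterogeneity_change(dark, bright, BAND_WEIGHTS))
    for scale in (25, 30, 50):
        print(scale, should_merge(cost, scale))
```

Under this reading, a larger scale parameter tolerates more internal heterogeneity and therefore yields larger objects, which fits the choice of the largest scale for water bodies.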
Machine learning methods were then used to further classify UGS into three classes (broadleaf forest, coniferous forest, and shrub-grass vegetation) in the Classifier module for object-oriented classification in eCognition 9.1. Multiple image features (Table 2) were combined to establish the classification rules. Since the relationships between different vegetation patches in urban ecosystems are highly complex and nonlinear, random forest (RF) and k-nearest neighbor (KNN), both of which can handle multi-class and nonlinear classification, were the classifiers tested in this study. RF, which combines bagging ensemble learning with a random subspace method, is an ensemble algorithm based on decision trees that achieves high accuracy in fast image classification of massive data [51-53]. KNN is also an effective supervised machine learning method: a lazy learning algorithm that determines the class of each target object from the k nearest training sample objects in the feature space, according to the minimum Euclidean distance (ED) or a majority voting criterion [54]. The commonly used function for weighting distance depends on the distance between the target object T and the training object Ti in the feature space and on k, the number of closest units. The confusion matrix was used to evaluate the results of UGS classification with the indexes of producer accuracy (PA), user accuracy (UA), overall classification accuracy (OA), and overall kappa index (KIA).
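The KNN decision rule can be sketched as below. The exact weighting function used in the study is not recoverable from the text, so the inverse-distance weight here is an assumption chosen because it is a common option; the feature vectors and class labels are invented for illustration.

```python
import math
from collections import defaultdict


def knn_classify(target, training, k=3, weighted=True, eps=1e-9):
    """Classify `target` from its k nearest training objects.

    training: list of (feature_vector, label) pairs
    weighted: if True, each neighbor votes with 1 / distance (assumed weighting);
              if False, each neighbor casts one vote (plain majority voting)
    """
    neighbors = sorted(
        (math.dist(target, features), label) for features, label in training
    )[:k]
    votes = defaultdict(float)
    for distance, label in neighbors:
        votes[label] += 1.0 / (distance + eps) if weighted else 1.0
    return max(votes, key=votes.get)


if __name__ == "__main__":
    # Hypothetical two-feature training objects
    training = [
        ((0.0, 0.0), "broadleaf"),
        ((1.0, 0.0), "coniferous"),
        ((1.0, 0.1), "coniferous"),
    ]
    target = (0.1, 0.0)
    print(knn_classify(target, training, weighted=True))
    print(knn_classify(target, training, weighted=False))
```

The example is built so that the two rules disagree: the target sits next to a single broadleaf object, so distance weighting favors broadleaf while plain majority voting over three neighbors favors coniferous.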
To determine how candidate variables affect AGB of each vegetation type prior to modeling, we performed Pearson's correlation analysis of these variables with measured biomass data.
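As a minimal sketch of this screening step, Pearson's r between one candidate variable and the measured biomass can be computed as follows. The sample values are invented, and the t-statistic is included only as the usual basis for a significance test (it would be compared against a critical value of Student's t with n − 2 degrees of freedom).

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t-statistic for testing r against zero with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


if __name__ == "__main__":
    # Hypothetical values: a candidate variable and measured AGB (tons/ha)
    candidate = [0.42, 0.55, 0.61, 0.70, 0.78]
    measured_agb = [60.0, 85.0, 90.0, 120.0, 150.0]
    r = pearson_r(candidate, measured_agb)
    print(round(r, 3), round(t_statistic(r, len(candidate)), 3))
```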

Stepwise Regression Modeling
Stepwise regression creates an "optimal" multiple linear regression equation. It introduces variables into the model one at a time and, at each step, tests the significance of both the previously entered variables and the newly introduced one for the dependent variable, to determine whether introducing the new independent variable is appropriate [55]. The F-statistic was used to determine whether the original variables became insignificant after new variables were introduced (the probability thresholds for entry and removal were set at 0.05 and 0.10, respectively). To improve model significance, attention needs to be paid to the possibility of multi-collinearity among the independent variables [56]. Independent variables with high collinearity were excluded using the variance inflation factor (VIF) [57], with the exclusion condition of VIF ≥ 10. We used the stepwise regression model for estimating AGB defined as:
$$AGB_{\mathrm{simulated}} = \sum_{i=1}^{n} p_i x_i + q$$
where $AGB_{\mathrm{simulated}}$ is the simulated value of above-ground biomass, $x_i$ is modeling variable $i$, $p_i$ is the regression coefficient of variable $i$, $n$ is the number of modeling variables, and $q$ is the regression intercept.
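The collinearity screen can be illustrated with a small, dependency-free sketch that computes each predictor's VIF as 1 / (1 − R²), where R² comes from regressing that predictor on the others. The predictor values are invented, and the least-squares solver is a plain Gaussian elimination written for illustration rather than numerical robustness.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def ols_r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on the columns of X plus intercept."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def vif(X):
    """Variance inflation factor of every predictor column in X (list of rows)."""
    factors = []
    for j in range(len(X[0])):
        target = [row[j] for row in X]
        others = [[v for i, v in enumerate(row) if i != j] for row in X]
        r2 = ols_r_squared(others, target)
        factors.append(float("inf") if r2 >= 1 else 1 / (1 - r2))
    return factors


if __name__ == "__main__":
    # Hypothetical predictors: the two columns are nearly collinear
    X = list(zip([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.9, 5.1]))
    print([v >= 10 for v in vif(X)])  # variables flagged for exclusion
```

A predictor flagged with VIF ≥ 10 would be dropped before or during the stepwise procedure, so that the remaining coefficients are not inflated by redundant variables.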

Model Accuracy Assessment
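Seventy-five percent of the calculated quadrat biomass values were selected as training data and the remaining 25% were used to evaluate model performance. The coefficient of determination (R²) and the root mean square error (RMSE) were applied to assess the credibility of the biomass estimation model:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left(AGB_{\mathrm{measured},i} - AGB_{\mathrm{simulated},i}\right)^2}{\sum_{i=1}^{n} \left(AGB_{\mathrm{measured},i} - \overline{AGB}\right)^2}$$
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(AGB_{\mathrm{simulated},i} - AGB_{\mathrm{measured},i}\right)^2}$$
where $AGB_{\mathrm{simulated}}$ is the simulated value of above-ground biomass, $AGB_{\mathrm{measured}}$ is the measured value of above-ground biomass, $\overline{AGB}$ is the average of measured values, and $n$ is the number of field data quadrats.
Figure 2 is a technical flowchart of the methodology in this study.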
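The evaluation protocol can be sketched as follows, assuming the standard definitions of R² (one minus the ratio of residual to total sum of squares) and RMSE. The random seed and the example biomass values are arbitrary; the study does not state how the 75% subset was drawn.

```python
import math
import random


def train_test_split(samples, train_fraction=0.75, seed=0):
    """Randomly split samples into training and validation subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def r2_score(measured, simulated):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - s) ** 2 for m, s in zip(measured, simulated))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1 - ss_res / ss_tot


def rmse(measured, simulated):
    """Root mean square error, in the units of the inputs (tons/ha here)."""
    n = len(measured)
    return math.sqrt(sum((s - m) ** 2 for m, s in zip(measured, simulated)) / n)


if __name__ == "__main__":
    train, test = train_test_split(range(140))  # 140 quadrats, as in the text
    print(len(train), len(test))
    # Hypothetical measured and simulated AGB values (tons/ha)
    measured = [10.0, 20.0, 30.0]
    simulated = [12.0, 18.0, 33.0]
    print(round(r2_score(measured, simulated), 3), round(rmse(measured, simulated), 3))
```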

Urban Green Space Type Identification
After combining multiple features from remote sensing images, green space coverage maps in urban areas were derived using the RF and KNN algorithms, and the results are shown in Figure 3. As can be seen in Table 3, the overall Kappa coefficients for the UGS classification results of the RF (0.78) and KNN (0.72) algorithms both exceeded 0.70, indicating that the classification results of UGS were in good agreement with the actual situation. Nevertheless, in terms of overall classification accuracy, the RF result (86.59%) was about 4 percentage points higher than that of KNN (82.68%). Moreover, in the RF classification results, the producer accuracy for coniferous forest (89.62%) was higher than that for shrub-grass vegetation (89.55%) and broadleaf forest (82.07%).
Finally, object-oriented classification of green space in the study area during the 2016, 2018, and 2021 growing seasons was carried out by training the RF algorithm (Figure 4). Some details of the post-classification results, versus the ground-truth situation, are displayed in Figure 5. Greater insight into the UGS composition structure and change trend is provided by an area statistical table (Table 4). Shrub-grass vegetation (~58.14% for 2021) was the most widespread type, covering more than half of the UGS in the study area, followed by broadleaf forests (~34.77% for 2021) and finally coniferous forests (~7.09% for 2021). During the study period, the area of broadleaf forest decreased and then increased, the area of coniferous forest decreased slightly, and the area of shrub-grass vegetation first increased and then decreased, in line with the general trend of UGS change.
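For readers who want to reproduce accuracy figures of this kind, the sketch below derives OA, Kappa, PA, and UA from a confusion matrix. The matrix values are invented and are not those of Table 3; rows are assumed to hold the classified labels and columns the reference labels.

```python
def accuracy_metrics(matrix, labels):
    """Overall accuracy, Kappa, producer and user accuracy from a confusion matrix.

    matrix[i][j]: number of samples classified as labels[i] whose reference
    class is labels[j] (rows = classified, columns = reference; assumed layout).
    """
    n = sum(sum(row) for row in matrix)
    diagonal = [matrix[i][i] for i in range(len(labels))]
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    overall = sum(diagonal) / n
    # Agreement expected by chance, from the marginal totals
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return {
        "OA": overall,
        "Kappa": (overall - expected) / (1 - expected),
        "PA": {lab: diagonal[i] / col_totals[i] for i, lab in enumerate(labels)},
        "UA": {lab: diagonal[i] / row_totals[i] for i, lab in enumerate(labels)},
    }


if __name__ == "__main__":
    # Hypothetical two-class matrix for illustration only
    labels = ["broadleaf", "coniferous"]
    matrix = [[50, 10],
              [5, 35]]
    print(accuracy_metrics(matrix, labels))
```

Producer accuracy answers how much of a reference class was found, while user accuracy answers how much of a mapped class is correct, which is why the two can differ for the same class.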

Correlation Analysis
Twenty-six, twenty-three, and eighteen variables were significantly correlated with the measured AGB in the broadleaf, coniferous, and shrub-grass vegetation quadrats, respectively (Figure 6).

Stepwise Regression Models
AGB models based on Sentinel-2 MSI data, Sentinel-1 SAR data, and multi-source remote sensing data were developed for each green space vegetation type. The type-specific vegetation AGB estimation models are shown in Table 5.

Accuracy Assessment
The accuracy evaluation results from the AGB estimation models are shown in Figure 7. The R² values of all models exceeded 0.50. Among the biomass models, the poorest performance was found in the broadleaf forest biomass estimation model based on Sentinel-2 MSI data, with both the lowest R² value (0.54) and the highest RMSE (54.88 tons/ha). The coniferous forest biomass estimation model, with a combination of Sentinel-2 and Sentinel-1 data, had the highest R² value (0.77). The combination of Sentinel-2 and Sentinel-1 data used to estimate the biomass of shrub-grass vegetation had the lowest RMSE (15.05 tons/ha).
Figure 7. Accuracy assessment of AGB estimation models: S2 refers to Sentinel-2 MSI data, S1 to Sentinel-1 SAR data, and S2&S1 to multi-source data combinations.

Above-Ground Biomass Estimation
The optimal models, built by combining Sentinel-2 MSI data and Sentinel-1 SAR data, were selected from the model estimation results to estimate the AGB of UGS in the study area at different periods. The AGB map of UGS was derived by overlaying the AGB maps for the three green space types. Figure 8 shows that AGB for Xuzhou UGS generally decreased from southwest to northeast, with a distribution range of 0-160 tons/ha.
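The overlay step can be pictured as selecting, for each pixel, the prediction of the model that matches the pixel's green space class. The sketch below assumes 10 m pixels (0.01 ha each) and uses invented class labels and AGB values; assigning zero to non-green pixels is also an assumption made for the illustration.

```python
PIXEL_AREA_HA = (10 * 10) / 10_000  # one 10 m x 10 m pixel is 0.01 ha


def composite_agb(class_map, agb_by_class):
    """Build one AGB density map (tons/ha) from per-class model outputs.

    class_map: 2D grid of class labels, with None for non-green pixels
    agb_by_class: label -> 2D grid of AGB density predicted by that class's model
    """
    return [
        [agb_by_class[label][r][c] if label is not None else 0.0
         for c, label in enumerate(row)]
        for r, row in enumerate(class_map)
    ]


def total_agb_tons(agb_map):
    """Total biomass in tons: density (tons/ha) times pixel area (ha), summed."""
    return sum(value * PIXEL_AREA_HA for row in agb_map for value in row)


if __name__ == "__main__":
    # Hypothetical 2 x 2 scene
    class_map = [["broadleaf", None],
                 ["shrub_grass", "coniferous"]]
    agb_by_class = {
        "broadleaf": [[130.0, 130.0], [130.0, 130.0]],
        "coniferous": [[90.0, 90.0], [90.0, 90.0]],
        "shrub_grass": [[20.0, 20.0], [20.0, 20.0]],
    }
    agb_map = composite_agb(class_map, agb_by_class)
    print(agb_map, total_agb_tons(agb_map))
```

Keeping density (tons/ha) and total (tons) separate makes it clear how total biomass can rise while green space area falls, if the area shifts toward classes with higher density.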

Discussion
The combination of object-oriented classification and machine learning methods applied to high-resolution multispectral or hyperspectral images has been shown to distinguish urban forest from other urban landscapes [58][59][60]. To further distinguish diverse urban vegetation components, such as different species of trees and shrub-grass vegetation, we fused optical and SAR images for green space type identification and confirmed that urban green space (UGS) classification based on image objects using an RF algorithm was more effective than the typical KNN. This method had the highest classification accuracy for coniferous forests (producer accuracy of 89.62%), which may have been due to the concentrated and continuous distribution and orderly spatial arrangement of coniferous forests, with their unique stand structure characteristics.
Variable screening prior to modeling showed that, for all three types of green spaces, the spectral reflectance of Sentinel-2 MSI data was mostly negatively correlated with the above-ground biomass (AGB) of green space. In addition, vegetation indices and texture features influenced AGB estimation, as has been reported in other studies [61,62]. A strong correlation (0.54) was found between the variable aspect and the AGB of coniferous forests. This was because coniferous forests in Xuzhou are typically located in mountainous areas, where the specific geographical location influences the duration of sunlight available to trees, soil fertility, etc. Thus, it should be particularly noted that topographic features were likewise key influencing variables in urban areas with undulating landscapes.
Previous studies have shown that vegetation AGB can be effectively and robustly estimated using optical or radar remote sensing images [63][64][65][66]. Our study likewise found that the AGB of coniferous forest could be estimated using Sentinel-2 MSI data (R² = 0.69, RMSE = 22.83 tons/ha). Moreover, urban shrub-grass vegetation biomass could be better estimated using Sentinel-1 (R² = 0.69, RMSE = 15.72 tons/ha) than Sentinel-2 data (R² = 0.55, RMSE = 20.30 tons/ha) in this study, in contrast with the findings on Mediterranean shrubland biomass by Chang et al. (R² < 0.60 for the Sentinel-1 model, but R² = 0.72 for the Sentinel-2 model) [67]. In addition, compared with a quantitative study of Xuzhou broadleaf forest biomass estimation using only Sentinel-2 MSI data by Li et al. (R² = 0.73, RMSE = 45.56 tons/ha) [62], we found that the additional use of Sentinel-1 SAR data gave better accuracy (R² = 0.75, RMSE = 35.95 tons/ha). More importantly, using both data sources provided better estimates of AGB compared to modeling with Sentinel-2 MSI data or Sentinel-1 SAR data alone. The results of a study by Nuthammachot et al. also revealed higher accuracy of the model (R² = 0.79, RMSE = 22.98 tons/ha) when combining Sentinel-1 and Sentinel-2 data than when modeling with Sentinel-2 data only (R² = 0.77, RMSE = 24.19 tons/ha) for predicting forest biomass [68]. Meanwhile, Wang et al. demonstrated the potential of the simultaneous use of HJ-1B and RadarSat-2 data to estimate AGB in the desert grasslands of Ningxia, China (R² = 0.71, RMSE = 14.20 kg/hm²), which outperformed the AGB estimation based on the NDVI index of HJ-1B (R² = 0.27, RMSE = 20.58 kg/hm²) [69]. Therefore, we argue that combining the strengths of optical remote sensing data and radar data enables better quantitative estimation of the AGB of UGS, an inference that agrees with several studies [70][71][72][73].
Despite the overall decline in the total area of UGS in Xuzhou, the AGB showed an opposite, increasing trend during the study period, mainly due to the increase in the area of broadleaf forests with high biomass attributes. Moreover, the average AGB of broadleaf forests (133.71 tons/ha in 2021) surpassed that of coniferous forests (92.13 tons/ha in 2021), consistent with the findings of Xing et al. [74]. There was no clear annual variability in the average AGB of shrub-grass vegetation, with a mean value as low as 21.81 tons/ha from 2016 to 2021; nevertheless, its contribution to AGB in green areas should not be underestimated, as its coverage area was considerably larger than that of the other two types of vegetation. Therefore, for the future construction of UGS systems, adequate broadleaf forests ought to be reserved, while giving high priority to the coordinated arrangement of broadleaf forests, coniferous forests, and shrub-grass vegetation. This will improve green space maintenance and overall environmental outcomes.
At the same time, there are still some limitations. The classification results for broadleaf forests had relatively low, but acceptable, accuracy (producer accuracy of 82.07%), which was caused by the misclassification of objects belonging to broadleaf forests as coniferous forests. Consequently, it remains to be tested whether the additional use of vegetation phenological characteristics and thermal properties from remote sensing images would help reduce misclassification. This study also verified that the backscatter coefficient and texture features of Sentinel-1 alone can hardly reflect the full variability of AGB in green spaces. Therefore, it would be more interesting if richer image features were derived to further test the capabilities of Sentinel-1 SAR data for AGB estimation.

Conclusions
This study demonstrated the feasibility of applying a combination of Sentinel-2 MSI data and Sentinel-1 SAR data to map urban green space and estimate above-ground biomass. The main findings of the study are as follows:
(1) Based on high-resolution remote sensing images from the Sentinel satellite series, the use of object-oriented classification and the random forest algorithm is a reliable method to identify urban green space types.
(2) The integration of Sentinel-2 MSI data and Sentinel-1 SAR data improved the accuracy of a stepwise regression model for above-ground biomass in urban green space. Model prediction accuracy when using combined multi-source data was, from highest to lowest, coniferous forest (R² = 0.77, RMSE = 18.39 tons/ha), shrub-grass vegetation (R² = 0.76, RMSE = 15.05 tons/ha), and broadleaf forest (R² = 0.75, RMSE = 35.95 tons/ha).
(3) From 2016 to 2021, the above-ground biomass of urban green space in Xuzhou increased, most prominently due to broadleaf forests. The highest average above-ground biomass was derived from broadleaf forests and the lowest from shrub-grass vegetation.
Altogether, the combination of high-resolution Sentinel-2 MSI data and Sentinel-1 SAR data provides a practical solution for estimating above-ground biomass of urban green space in other regions. Much effort has been made in this study to consider the urban green space dynamics in Xuzhou in the last five years. Further research on natural environmental factors, such as soil physical and chemical properties, is needed to improve the accuracy of above-ground biomass distribution mapping of urban green space in Xuzhou.