Mapping the Essential Urban Land Use in Changchun by Applying Random Forest and Multi-Source Geospatial Data

Abstract: Understanding the spatial pattern of urban land use is of great significance to urban land management and resource allocation. Urban space is strongly heterogeneous, and thus much research has focused on the identification of urban land use. The emergence of multiple new types of geospatial data provides an opportunity to investigate methods for mapping essential urban land use. The popularization of street view images, represented by Baidu Maps, is beneficial to the rapid acquisition of high-precision street view data, which has attracted the attention of scholars in the field of urban research. In this study, OpenStreetMap (OSM) was used to delineate parcels, which were taken as the basic mapping units. Semantic segmentation of street view images was combined with point of interest (POI), Sentinel-2A, and Luojia-1 nighttime light data to enrich the multi-dimensional description of urban parcels. Furthermore, random forest (RF) was applied to determine the urban land use categories. The results show that street view elements are related to urban land use in terms of spatial distribution. It is reasonable and feasible to describe urban parcels according to the characteristics of street view elements. With the participation of street view features, the overall accuracy reaches 79.13%. The contribution of street view features to the optimal classification model reached 20.6%, and it is more stable than that of the POI features.


Introduction
The acceleration of urbanization and the proposal of the smart city bring new demands for the refinement of urban governance. The spatial pattern of urban land use, which affects urban activities, is important information for urban investigation, modeling, and resource allocation [1][2][3][4]. Traditional methods for mapping urban land use rely on remote sensing, and the identified land patches are relatively fragmented, which differs from the more regular spatial units of urban management. Gong et al. (2019) regarded parcels bounded by road networks as the intrinsic segmentation of urban land use [5]. Many scholars have used OpenStreetMap (OSM) to delimit parcel boundaries, and the methods performed well [6,7]. Currently, OSM is the largest project in collaborative and publicly licensed geospatial data collection, and thus it has been widely used as an alternative or supplement to authoritative

Study Area
The city of Changchun (43°14′-44°05′ N, 125°03′-126°00′ E) is a regional transportation hub with convenient transportation conditions. Changchun plays an important role in the provincial economy [34]. The downtown area, also called the central urban area of Changchun, is the scope of land allowed for urban construction in the current urban master plan. It is the area in which urban functions are concentrated. Based on remote sensing imagery from the Google Earth platform, the downtown area was selected as the study area, and the maximum extent of the urban land boundary was determined in combination with the road network (Figure 1).

OpenStreetMap
In this study, OSM road data were used to delineate the boundaries of urban parcels and to provide the spatial locations of the street view data. OSM data are made available for reuse under the Open Database License (ODbL), a share-alike license for data. The OSM data used in this study date from 2016 and were processed as follows. First, expressways, primary roads, secondary roads, and residential roads were extracted according to the attribute information of the road network. Second, a buffer was built for each road level to generate the road area used for the segmentation of urban parcels; the buffer distance was set to 21, 21, 14, and 7 m, respectively, according to the number of lanes and the lane width. Third, the road buffer data in vector format were converted into raster format, and the road centerline was obtained with the ArcGIS vectorization tool (Figure 2).
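The first two processing steps amount to a filter followed by a lookup. The sketch below is only an illustration of that logic: the class labels, the record structure, and the function name are hypothetical, and the geometric buffering itself (done in ArcGIS in the study) is not reproduced.

```python
# Buffer distance in metres for each road level, as stated in the text.
# The keys are hypothetical attribute values, not actual OSM tag names.
BUFFER_M = {
    "expressway": 21.0,
    "primary": 21.0,
    "secondary": 14.0,
    "residential": 7.0,
}


def select_and_buffer(roads):
    """Keep roads of the four levels and attach their buffer distance.

    `roads` is a list of dicts with a "class" key (assumed structure).
    """
    kept = []
    for road in roads:
        distance = BUFFER_M.get(road["class"])
        if distance is not None:
            kept.append({**road, "buffer_m": distance})
    return kept


roads = [
    {"id": 1, "class": "primary"},
    {"id": 2, "class": "footway"},  # not one of the four levels, so dropped
    {"id": 3, "class": "residential"},
]
buffered = select_and_buffer(roads)
```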

Street View Images
The Densify tool of ArcGIS was used to generate nodes along the roads at an interval of 50 m, and the node coordinates were taken as the street view sampling positions. In total, 24,520 points were obtained. Baidu and Tencent are the two major street view providers in China, and their data can be obtained through an application programming interface (API). Users can use the services directly, provided that they follow the Baidu Map API service terms (http://lbsyun.baidu.com/index.php?title=open/law) and the Tencent location service open API service protocol (https://lbs.qq.com/terms.html). Users need to apply for a developer key first. The number of daily service calls for each key is limited, and users can pay to increase it. Baidu street view covers more than 95% of China's cities and more than 3 million kilometers, and it is updated more frequently than Tencent street view. Although Google has a large number of street views, its images are not freely available in China. Therefore, this study uses Baidu street view as the data source. In order to obtain images close to a pedestrian's visual angle, the vertical angle was set to 0°, which corresponds to horizontal observation. The horizontal angle was set at 90° intervals, i.e., 0°, 90°, 180°, and 270° (Figure 3) [35]. In order to avoid the image distortion caused by an overly large field angle, the field angle was set to 90°. The street view images in the study area were taken in May 2014, April 2016, and June 2017, corresponding to spring, with little difference in vegetation greening characteristics. Considering the different update frequencies of the data, the dates of the other datasets were chosen to match the street view.
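The sampling scheme described above (a point every 50 m, four horizontal headings, a vertical angle of 0°, and a 90° field angle) can be sketched in a few lines. This is a minimal illustration that assumes road centerlines are given as vertex lists in a projected coordinate system in metres; the dictionary keys are hypothetical and are not the parameter names of the Baidu API.

```python
import math

HEADINGS = (0, 90, 180, 270)  # horizontal angles in degrees


def sample_along(polyline, spacing=50.0):
    """Return the start vertex plus one point every `spacing` metres."""
    points = [polyline[0]]
    next_at = spacing  # distance along the line of the next sample
    walked = 0.0       # total length of the segments already consumed
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        while seg > 0 and next_at <= walked + seg:
            t = (next_at - walked) / seg
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            next_at += spacing
        walked += seg
    return points


def view_requests(point):
    """Describe the four images requested at one sampling point."""
    x, y = point
    return [
        {"x": x, "y": y, "heading": h, "pitch": 0, "fov": 90}
        for h in HEADINGS
    ]


road = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0)]  # a 200 m L-shaped road
points = sample_along(road)
requests = [req for p in points for req in view_requests(p)]
```

Each sampling point therefore yields four images that together cover the full horizontal panorama.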

POI Dataset
POI data were originally generated for navigation applications and are point representations of geographical entities in space. The POI dataset records a large amount of information about the types, names, and spatial locations of places in a comprehensive and detailed way, and it is widely used in urban research [36]. The POI data used in this study were obtained through the Baidu Map API.
The land use classification system used in this study follows the essential urban land use categories (EULUC) classification system [5], which was adapted from the Chinese Standard of Land Use Classification. Considering its distinctive features, park green space was regarded as a separate category in the classification of urban functional areas. In Changchun, most urban traffic facilities are parking lots, which are supporting facilities of residential areas and do not occupy separate space. Therefore, this study did not include the traffic category. The POI data were then sorted according to the land use classification [5,6] (Table 1). A total of 25,094 points were obtained.
Nighttime light data provide an effective perspective for measuring the intensity of urban social and economic activities. The Luojia-1 satellite was launched on 2 June 2018 and provides nighttime light data with a resolution of 130 m [37,38]. The nighttime light data, acquired on 31 August 2018, were downloaded from the website http://59.175.109.173:8888/. Users can register an account on the website and apply for data download, provided that they follow the usage rules for Luojia-1 nighttime light data (http://www.hbeos.org.cn/xwzx/2/2018-07-09/363.html).

Sentinel-2A Remote Sensing Images
Sentinel-2A and Sentinel-2B are two similar satellites with high spatial and multispectral resolution. Level L1C images of Sentinel-2A can be freely downloaded from the website https://scihub.copernicus.eu/. The Copernicus Open Access Hub (previously known as Sentinels Scientific Data Hub) provides complete, free, and open access to Sentinel-2 user products. L1C products contain top-of-atmosphere (TOA) reflectance to which radiometric and geometric corrections have already been applied. Therefore, the Sen2cor module of the Sentinel Application Platform (SNAP) software was used for atmospheric correction to obtain bottom-of-atmosphere (BOA) reflectance values. The Sentinel-2A satellite provides 13 spectral bands at various spatial resolutions [39][40][41]. In this study, the image used was taken on 28 October 2016, and the near-infrared, red, and mid-infrared bands, with spatial resolutions of 10, 10, and 20 m, respectively, were used to calculate the texture features, the normalized difference vegetation index (NDVI), and the normalized difference built-up index (NDBI) [42].

Survey Data for Urban Land Use
In this study, the 2016 current construction land survey data of the Changchun City Planning Department were used to label the urban land use and to verify the classification results. The classification system of these data is a description and summary of land attributes and characteristics that also carries strong subjective intentions and purposes [43]. This dataset reflects the socio-economic functional attributes of land, separated into eight categories: residential, administration and public service, commercial and business facilities, industrial and manufacturing, road and transportation, municipal utilities, green space and square, and logistics and warehouse. ArcGIS software was used for overlay analysis, and the area of each land use type in each parcel was obtained. The land use type with the largest proportion of area was taken as the land use category of the urban parcel.
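The labeling rule is a simple largest-share rule. The following sketch, with hypothetical areas in square metres, returns both the dominant category and its share of the parcel; the share is a convenient indicator of how mixed a parcel is.

```python
def dominant_land_use(areas_by_type):
    """Return (category, share) for the land use type with the largest area."""
    total = sum(areas_by_type.values())
    category = max(areas_by_type, key=areas_by_type.get)
    return category, areas_by_type[category] / total


# Hypothetical overlay result for one parcel (areas in square metres).
parcel = {"residential": 6200.0, "commercial": 2500.0, "green space": 1300.0}
label, share = dominant_land_use(parcel)
```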

Methodology
The research work included three main parts. First, the road area obtained from OSM preprocessing was used to determine the scope of urban parcels through overlay analysis (for more details see Section 2.2.1. and previous publications [3,5]). Second, based on the semantic segmentation of street view, the features from POI data, Luojia-1 nighttime light data, and Sentinel-2A were constructed to describe the parcels. In the third step, the urban parcels were divided into two sections, namely, a training sample and a test sample, which were used for model training and testing, respectively. The land use of the urban parcels was predicted according to the optimized model, then the producer accuracy (PA) and the overall accuracy (OA) were used to evaluate the classification accuracy ( Figure 4).
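For reference, the two accuracy measures can be computed directly from paired reference and predicted labels. The sketch below uses the usual definitions (OA as the share of correctly classified parcels, PA as the share of each reference class that was correctly classified); the labels are made-up examples.

```python
def accuracy_metrics(reference, predicted):
    """Return overall accuracy and per-class producer accuracy."""
    classes = sorted(set(reference))
    correct = {c: 0 for c in classes}
    total = {c: 0 for c in classes}
    for ref, pred in zip(reference, predicted):
        total[ref] += 1
        if ref == pred:
            correct[ref] += 1
    oa = sum(correct.values()) / len(reference)
    pa = {c: correct[c] / total[c] for c in classes}
    return oa, pa


# Hypothetical labels for six test parcels.
reference = ["res", "res", "res", "ind", "ind", "green"]
predicted = ["res", "res", "ind", "ind", "ind", "res"]
oa, pa = accuracy_metrics(reference, predicted)
```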

Semantic Recognition of Street View Images
Semantic segmentation is an important field of computer vision, and its goal is to assign an object category to each pixel in an image. Deep learning is an effective method for semantic segmentation, but training a model depends on a large amount of labeled data [44]. The Cityscapes dataset is an urban street view segmentation dataset designed for autonomous driving applications, which covers different cities, seasons, and weather conditions. All pixels of an image are divided into 19 elements, including vegetation, sky, roads, and buildings. Cityscapes provides a good foundation for street view semantic segmentation. DeepLab, developed by Google and based on convolutional neural networks (CNN), is an advanced algorithm in the field of image semantic segmentation [45,46]. In this study, DeepLab pre-trained on Cityscapes data was used for the street view segmentation (https://github.com/tensorflow/models), and the area ratio of each element in the image was obtained.

Features Constructed According to the Street View Segmentation
Green visual ratio (GVR) refers to the proportion of green plants in the visual field seen by the human eye, which is shown as the proportion of visible green elements in street view images (Equation (1)). It not only objectively reflects the quality of the public environment, but also emphasizes the three-dimensional characteristics of a space. Openness (OP) refers to the proportion of the sky seen by the human eye in the whole picture (Equation (2)) [47]. The enclosure index (Equation (3)) refers to the degree of public space enclosed by buildings, walls, and other structures [48,49]. The walkability index (Equation (4)) is the ratio of pedestrian roads to vehicle roads in a street view image [50].
where Area_tree, Area_sky, Area_sidewalk, Area_road, and Area_building refer to the areas of green trees, sky, sidewalks, roads, and buildings in a picture, respectively; Area_total is the total area of the processed image; and i indexes the pictures collected at the observation point.
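As an illustration of how such ratios can be derived from segmentation output, the sketch below computes GVR, OP, and the walkability index for one observation point. It assumes that each image is summarized as a dictionary of pixel counts per element and that the counts of the four images are summed before the ratio is taken; the enclosure index is left out because its exact form cannot be recovered from the text. The pixel counts are hypothetical.

```python
def ratio(images, numerator, denominator=None):
    """Sum pixel counts over the images at one point and form a ratio.

    If `denominator` is None, the total pixel count of all images is used.
    """
    num = sum(img.get(numerator, 0) for img in images)
    if denominator is None:
        den = sum(sum(img.values()) for img in images)
    else:
        den = sum(img.get(denominator, 0) for img in images)
    return num / den if den else 0.0


# Hypothetical pixel counts for the four headings at one sampling point.
images = [
    {"tree": 200, "sky": 300, "road": 250, "sidewalk": 50, "building": 200},
    {"tree": 100, "sky": 400, "road": 300, "sidewalk": 0, "building": 200},
    {"tree": 300, "sky": 200, "road": 200, "sidewalk": 100, "building": 200},
    {"tree": 200, "sky": 300, "road": 250, "sidewalk": 50, "building": 200},
]
gvr = ratio(images, "tree")                      # green visual ratio
openness = ratio(images, "sky")                  # openness
walkability = ratio(images, "sidewalk", "road")  # sidewalk to road ratio
```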
The diversity index (DIV) of street elements is expressed by the Simpson index, which is usually used to measure the richness of an ecosystem [51]. There are great differences in the shape characteristics and colors between streets at different points. The size of the value can reflect the visual complexity and richness of the street space to a certain extent (Equation (5)).
where Area_j is the proportion of the area of element type j in the street view images at a given point, and j runs over all element types; the larger the DIV value, the higher the richness of the street elements.
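A common form of the Simpson index that increases with richness, as described here, is one minus the sum of squared proportions. The sketch below assumes that form; it is not necessarily the exact expression of Equation (5).

```python
def simpson_diversity(areas):
    """Return 1 - sum(p_j ** 2), where p_j is the share of element type j."""
    total = sum(areas)
    return 1.0 - sum((a / total) ** 2 for a in areas)


single = simpson_diversity([1.0])      # one element type only
even = simpson_diversity([0.25] * 4)   # four equally frequent element types
```

A view dominated by a single element gives a value of 0, and the value rises as more element types share the view more evenly.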

Features Constructed According to Sentinel-2A Images
The gray level co-occurrence matrix (GLCM) is a common texture analysis method. It is a statistical description of the joint distribution of two gray levels in an image, which reflects how the gray levels of a texture are correlated. Haralick et al. extracted 14 features from the gray level co-occurrence matrix [52]. The effectiveness of these features in remote sensing image classification has been evaluated in a large number of studies [53,54]. In this study, the following three indicators were selected to extract the texture features of Sentinel-2A images. Their meanings and formulas are as follows.
Entropy (ENT) measures the randomness of the image texture. When all values in the co-occurrence matrix are equal, ENT reaches its maximum value. Conversely, if the values in the matrix are very uneven, the ENT value is small (Equation (6)).
The angular second moment (ASM) is the sum of the squares of the gray level co-occurrence matrix elements and is therefore also called energy. It reflects the uniformity of the gray level distribution and the coarseness of the texture (Equation (7)). If all values of the gray level co-occurrence matrix are equal, the ASM is small, the image texture is fine, and the energy is low. If some values are large and others are small, the ASM is large, the image texture is coarse, and the energy is high.
Correlation (COR) measures the similarity of the gray level co-occurrence matrix elements along rows or columns. As shown in Equation (8), the correlation value reflects the local gray level correlation in an image. When the element values of the matrix are even and equal, the correlation value is larger. Conversely, if the element values of the matrix are very different, the correlation value is smaller.
Based on Sentinel-2A bands 8, 4, and 11, the above texture feature values were calculated, and their mean and standard deviation were obtained for each parcel.
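To make the three indicators concrete, the sketch below builds a symmetric, normalized co-occurrence matrix for a horizontal offset of one pixel and evaluates ENT, ASM, and COR with the standard Haralick-style definitions. The offset, the use of the natural logarithm, and the tiny test image are assumptions made for illustration; the study's actual window size and quantization are not stated in the text.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized gray level co-occurrence matrix."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[a][b] += 1.0
                counts[b][a] += 1.0  # count the pair in both directions
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Return (entropy, angular second moment, correlation) of a GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    ent = -sum(v * math.log(v) for _, _, v in cells if v > 0)
    asm = sum(v * v for _, _, v in cells)
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum((i - mu_i) ** 2 * v for i, _, v in cells)
    var_j = sum((j - mu_j) ** 2 * v for _, j, v in cells)
    if var_i == 0 or var_j == 0:
        cor = float("nan")  # undefined for a constant image
    else:
        cov = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
        cor = cov / math.sqrt(var_i * var_j)
    return ent, asm, cor


checker = [[0, 1], [1, 0]]  # 2 x 2 checkerboard with two gray levels
ent, asm, cor = texture_features(glcm(checker, 2))
```

In practice these values would be computed in a moving window over each band and then averaged per parcel, as described above.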
The NDVI value, calculated by the spectral contrast of green plant leaves in red and near-infrared bands (Equation (9)), can fully reflect the condition of vegetation on the ground, and is one of the most widely used vegetation indices [42,55]. A positive NDVI value indicates vegetation coverage and increases with the increase in coverage. The NDBI, estimated using Equation (10), represents the distribution of construction land. An NDBI value greater than or equal to 0 indicates that the ground is covered by buildings or bare land [56].
where B_nir, B_red, and B_mir refer to the reflectance of the near-infrared, red, and mid-infrared bands, respectively, corresponding to bands 8, 4, and 11 of Sentinel-2A.
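Both indices are normalized differences of two bands. The sketch below uses the standard definitions, NDVI = (B_nir - B_red) / (B_nir + B_red) and NDBI = (B_mir - B_nir) / (B_mir + B_nir); the reflectance values are hypothetical.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when both bands are zero."""
    return (a - b) / (a + b) if (a + b) != 0 else 0.0


def ndvi(nir, red):
    return normalized_difference(nir, red)


def ndbi(mir, nir):
    return normalized_difference(mir, nir)


# Hypothetical reflectance of a vegetated pixel (bands 8, 4, and 11).
vegetation_ndvi = ndvi(nir=0.5, red=0.1)
vegetation_ndbi = ndbi(mir=0.3, nir=0.5)
```

For this vegetated pixel the NDVI is positive and the NDBI is negative, in line with the interpretation given in the text.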

Features Constructed According to POI
The urban land use category in this study refers to the dominant urban land use of an urban parcel. In fact, within an urban land use, there are often different types of functions. The total number of POI, the number of POI types, and the proportion of each type within an urban parcel were calculated in this study.
In addition, the spatial distribution of the POI of each type was obtained by using kernel density estimation (KDE), and then the mean value of the POI density of each type in an urban parcel was calculated. KDE is an effective spatial interpolation method used to calculate the density of a continuous surface by superposing a kernel function at each location (Equation (11)). It describes the spatial distribution characteristics of events or objects according to their spatial locations and relationships [55][56][57][58]. In analyzing the spatial distribution of POI, KDE was measured by Euclidean distance. With increasing distance from a POI point, the calculated continuous surface value gradually decreases.
where f(x, y) is the estimated kernel density at position (x, y), n is the number of observation points, h is the bandwidth parameter, K is the kernel function, and d_i is the distance from the position to observation point i.
According to the original data and the results of the analysis of street view, Luojia-1, Baidu POI, and Sentinel-2A, 70 features were constructed to describe the urban parcels (Table 2).
Sentinel-2A features in Table 2 (feature, number of features):
Mean of the near-infrared, red, and mid-infrared bands, 3
Standard deviation of the near-infrared, red, and mid-infrared bands, 3
Texture mean value and standard deviation of the near-infrared, red, and mid-infrared bands, 18
Mean value and standard deviation of NDVI and NDBI, 4
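Returning to the kernel density estimate of Equation (11), a minimal sketch is given below. It assumes the common form f(x, y) = (1 / (n h^2)) * sum of K(d_i / h) with a quartic kernel; the kernel choice, the bandwidth, and the coordinates are assumptions, since the text does not specify them.

```python
import math


def quartic_kernel(u):
    """Two-dimensional quartic kernel, zero beyond the bandwidth."""
    return (3.0 / math.pi) * (1.0 - u * u) ** 2 if u < 1.0 else 0.0


def kde(x, y, points, h):
    """Kernel density at (x, y) from POI `points` with bandwidth `h`."""
    n = len(points)
    total = sum(
        quartic_kernel(math.hypot(x - px, y - py) / h) for px, py in points
    )
    return total / (n * h * h)


def parcel_mean_density(cells, points, h):
    """Mean density over the sample locations that fall inside a parcel."""
    return sum(kde(x, y, points, h) for x, y in cells) / len(cells)


pois = [(0.0, 0.0)]  # a single hypothetical POI, coordinates in metres
center = kde(0.0, 0.0, pois, h=100.0)
halfway = kde(50.0, 0.0, pois, h=100.0)
edge = kde(100.0, 0.0, pois, h=100.0)
```

The density is highest at the POI itself, decreases with distance, and reaches zero at the bandwidth.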

Developing of the RF Model
Bagging, or bootstrap aggregating, was used to generate the random forest. It changes the sample distribution of the data used by each tree, introduces randomness, and increases the generalization ability of the model. All the training data are put into a bag, and some of the data are then randomly drawn from the bag to train each tree [59]. In addition, each tree is grown fully without pruning. As a tree grows, the split at each node is chosen from only a few randomly selected variables. That is to say, both the instances and the variables used are randomized. This double random process makes the model less prone to overfitting. The final classification is obtained by voting across the individual trees, that is, the random forest selects the class with the most votes [60]. The RF model is insensitive to multicollinearity and can effectively limit overfitting; its predictions are robust to missing and unbalanced data, and it can produce high-accuracy classifiers for a variety of observation data [61]. It is therefore applicable to the prediction of multiple urban land use categories when the samples are unbalanced.
When the training samples are fixed, two main factors affect the accuracy of RF classification: the number of features randomly considered when splitting a node (max_features) and the number of decision trees (n_estimators). The number of features is related to both the strength of each decision tree and the correlation between decision trees. A smaller max_features value generally reduces the performance of an individual tree because fewer choices are considered at each node, which tends to weaken the RF classifier. However, the smaller the correlation between decision trees, the stronger the classifier [33]. The prediction accuracy on out-of-bag samples (OOB_SCORE) can be used to estimate the performance of the model. In other words, the prediction accuracy on the samples not selected for training a tree is used to verify the performance of the model [62]. The higher the value, the more reliable the model [63,64]. In this study, scikit-learn was used to determine the optimal parameters according to the OOB_SCORE value calculated in a loop over candidate parameter values.
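A parameter loop of this kind can be sketched with scikit-learn as follows. The synthetic data, the candidate grids, and the random seed are assumptions chosen only to keep the example small; they are not the values or the data used in the study.

```python
from itertools import product

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the parcel feature table (not the study data).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=8, n_classes=3, random_state=0
)

MAX_FEATURES = (4, 8, 12)     # hypothetical candidate values
N_ESTIMATORS = (25, 50, 75)   # hypothetical candidate values

best = None
for max_features, n_estimators in product(MAX_FEATURES, N_ESTIMATORS):
    model = RandomForestClassifier(
        n_estimators=n_estimators,
        max_features=max_features,
        bootstrap=True,
        oob_score=True,  # score each sample with trees that did not see it
        random_state=0,
    )
    model.fit(X, y)
    if best is None or model.oob_score_ > best["oob_score"]:
        best = {
            "max_features": max_features,
            "n_estimators": n_estimators,
            "oob_score": model.oob_score_,
        }
```

Because the out-of-bag score is computed from the training data itself, no separate validation split is needed for this search.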

Street View Segmentation
Through street view semantic segmentation, each element in the image was identified, and the proportion of each element in the image was calculated according to the number of pixels, which was used to construct the streetscape features (Figure 5). In terms of the proportion of street view elements in the study area, the top four are sky, roads, buildings, and vegetation, accounting for 25.09%, 20.89%, 16.96%, and 16.96%, respectively (Table 3). The spatial distribution of the street view elements shows that there is a correlation between street view and land use. Compared with the known land use in Changchun, building elements show lower values in green space, while vegetation shows higher values. In addition, urban areas built in different periods show obvious spatial heterogeneity due to different urban planning concepts and urban construction technologies. Within the third ring road of Changchun, urban construction took place earlier, the road network is dense, the roads are narrow, and the buildings are low. Accordingly, the sky shows lower values and the buildings show higher values. In contrast, urban construction outside the third ring road was completed relatively late, and this area shows the opposite characteristics (Figure 6).

Parameter Optimization of the RF Model
In this study, the optimal RF model and its prediction accuracy were compared under two conditions, i.e., with street view features (S1) and without street view features (S2). In the case of S1, the maximum OOB_SCORE value is 0.741, and the corresponding max_features and n_estimators values are 20 and 75, respectively. In the case of S2, the maximum OOB_SCORE is 0.727, and the corresponding max_features and n_estimators values are 28 and 35, respectively. It can be seen from Figure 7 that, for the same number of trees, more features must participate in the construction of the model without street view features in order to obtain a similar prediction accuracy. The optimal model obtained by training was used to predict the samples. With the participation of street view features, the recognition accuracy of public land and green space was effectively improved by 50% and 40%, respectively (Tables 4 and 5). In addition, the importance of each feature in the model was calculated by using the method of average impurity reduction [65,66]. The contribution of street view features to the model was 20.6%, and that of Sentinel-2A image features was 41.24% (Table 6). The most important POI features were the "proportion of residential points" (Pro_Res), the "proportion of educational points" (Pro_Edu), and the "density of residential points" (Dens_Res). The actual functions of these types of areas are relatively homogeneous, and the degree of mixing of POI types within these urban parcels is low, so these features are highly representative of the function of the urban parcels. Among the street view features, the top two contributions were the "average value of building elements" (M_Buil) and the "standard deviation of GVR" (Std_GVR). In terms of the dispersion of the contributions, the contribution of the POI features is uneven (Figure 8).
Generally, the contribution of the POI kernel density characteristics was higher than the contribution of the POI proportion characteristics. However, the contribution of the street view features was relatively stable, and the standard deviation was small.
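The comparison of total contribution and dispersion between feature groups can be reproduced from any vector of impurity-based importances (in scikit-learn, the `feature_importances_` attribute of a fitted forest). The sketch below uses the feature names mentioned above, but the importance values and the grouping are hypothetical and serve only to show the computation.

```python
from statistics import pstdev

# Hypothetical importances for a handful of features (not the study results).
importances = {
    "Pro_Res": 0.09,
    "Pro_Edu": 0.05,
    "Dens_Res": 0.01,
    "M_Buil": 0.03,
    "Std_GVR": 0.03,
}
groups = {
    "POI": ["Pro_Res", "Pro_Edu", "Dens_Res"],
    "street view": ["M_Buil", "Std_GVR"],
}


def summarize(importances, groups):
    """Return the total and standard deviation of importance per group."""
    summary = {}
    for name, features in groups.items():
        values = [importances[f] for f in features]
        summary[name] = {"total": sum(values), "std": pstdev(values)}
    return summary


summary = summarize(importances, groups)
```

A small standard deviation within a group indicates that its features contribute evenly, which is the sense in which the street view features are described as stable.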

Results of the Urban Land Use Classification
According to the trained model, the land use of all urban parcels was predicted, and a land use distribution map of the urban parcels within the downtown area of Changchun was obtained (Figure 9). The overall accuracy of the classification was up to 79.13%. Urban parcels whose predicted land use differed from the actual dominant land use category were concentrated in the southwest, middle, and north of the study area, while the prediction was good in the southeast and south of the study area. The reason lies in the late completion of construction in the southeastern and southern areas of Changchun. Urban planning and management in these areas are relatively well developed, and thus the actual land use of the urban parcels is consistent with their planned land use. The southwest and central regions were built earlier and contain a large number of historical buildings. However, the functions of the above-ground buildings have changed, and so the land use of the urban parcels differs from the original planning (Figure 10). Therefore, it is possible to improve the prediction accuracy by obtaining urban construction data across different years.

Discussion
In this study, OSM was used to delimit the boundaries of urban parcels. The results of the urban street view semantic segmentation were applied to enrich the multi-dimensional description of urban parcels, and RF was used to identify the land use of the urban parcels. The results show that, based on their spatial distribution characteristics, the street view elements were related to urban functions, and it is reasonable and feasible to describe an urban parcel according to the characteristics of the street view elements. The contribution of the street view characteristics to the optimal model reached 20.6%. The mixing of land use within urban parcels is an important obstacle to the accurate identification of urban land use [67]. The accuracy of identification is related to the purity of the urban parcel. The results of this study show that the purity of residential land, industrial land, and green land is high, and thus the accuracies of their classifications are high. The land use types recorded in the current construction land survey data of Changchun City are pure. The parcels in these data are finer than those divided by OSM, but there are no strict rules for their boundaries. There may be no roads or spaces between different functional parcels. At present, OSM is still a widely used data source for segmentation, and the urban parcel boundaries obtained are relatively regular, which conforms to the actual coverage of urban parcels [67,68].
POI data used in this study were obtained from commercial companies. The original goal of the data was to serve navigation. The type of POI does not exactly correspond to the land use of urban parcels. For example, the POI of medical facilities include clinics and pharmacies, but these facilities do not occupy separate land. Only specialized hospitals and large hospitals have independent land. Therefore, the POI data were filtered in the data preprocessing stage, and the POI characteristics contributed a lot to the accurate identification of urban parcels with independent land, such as residential and educational parcels, but not to other functional parcels. In the future, the introduction of a toponymic database managed by the government should be considered.
Street view can supplement the social attribute information of an urban parcel from the perspective of ground observation [69]. Street view compensates for the lack of ground detail in top view images and provides useful auxiliary information that complements remote sensing images and improves performance [26]. It can be seen that the introduction of street view effectively improves the accuracy of the urban land use classification of public service-type parcels, which are typically small but have a high degree of mixing and are difficult to identify. Street view images contain rich information, which needs to be further explored [70,71]. The street view segmentation model used in this study was trained on the Cityscapes dataset. It is necessary to identify the elements in the picture first and then establish features to identify the urban land use type. It makes sense to build a dataset directly related to urban land use in the future, although this is a complex and onerous task. For example, street view data could be collected and labeled with the corresponding land use tags and then combined with deep learning technology to predict urban land use directly from pictures. Such a dataset would help to improve the efficiency of the land use identification of urban parcels and would facilitate street view applications in other cities.

Conclusions
Given the need for urban land use classification and the new data environment, this study applied street view data, aiming to provide a reference for urban multi-source data fusion, enrich urban characteristic indicators, and provide new ideas for improving the accuracy of urban land use classification. The street view features showed better performance than the POI features. The prediction accuracy for the areas built later was higher. Since the functions of buildings in the areas built earlier have changed, their street view features cannot effectively express their existing functions. If the study area were divided into different areas according to construction time and the urban land use prediction were then carried out separately, higher accuracy could be obtained.
Although this study attempted to improve the classification of urban land use, there are still problems in this process that need further study. First of all, urban parcels delimited by OSM have mixed land use. The land use of a parcel was determined by the actual construction land type with the largest area proportion, while urban parcels have three-dimensional characteristics. The actual use is related to the building area and the business characteristics, and thus the classification system of urban land use needs to be improved. With the emergence of spatiotemporal big data, it is necessary to introduce human activity big data with finer temporal and spatial resolution.