Detailed Mapping of Urban Land Use Based on Multi-Source Data: A Case Study of Lanzhou

: Detailed urban land use information is the prerequisite and foundation for implementing urban land policies and urban land development, and is of great importance for solving urban problems, assisting scientiﬁc and rational urban planning. The existing results of urban land use mapping have shortcomings in terms of accuracy or recognition scale, and it is di ﬃ cult to meet the needs of ﬁne urban management and smart city construction. This study aims to explore approaches that mapping urban land use based on multi-source data, to meet the needs of obtaining detailed land use information and, taking Lanzhou as an example, based on the previous study, we proposed a process of urban land use classiﬁcation based on multi-source data. A combination road network dataset of Gaode and OpenStreetMap (OSM) was synthetically applied to divide urban parcels, while multi-source features using Sentinel-2A images, Sentinel-1A polarization data, night light data, point of interest (POI) data and other data. Simultaneously, a set of comparative experiments were designed to evaluate the contribution and impact of di ﬀ erent features. The results showed that: (1) the combination utilization of Gaode and OSM road network could improve the classiﬁcation results e ﬀ ectively. Speciﬁcally, the overall accuracy and kappa coe ﬃ cient are 83.75% and 0.77 separately for level I and the accuracy of each type reaches more than 70% for level II; (2) the synthetic application of multi-source features is conducive to the improvement of urban land use classiﬁcation; (3) Internet data, such as point of interest (POI) information and multi-time population information, contribute the most to urban land use mapping. Compared with single-moment population information, the multi-time population distribution makes more contributions to urban land use. The framework developed herein and the results derived therefrom may assist other cities in the detailed mapping and reﬁned management of urban land use.


Introduction
Cities are important part of human civilization. Since the 20th century, the process of urbanization has been accelerating, and has become one of the most important factors affecting human society and economy. Rapid urbanization has brought about a series of urban problems, such as environmental degradation, resource exhaustion, traffic congestion, the foam of real estate Tibet provinces. As a major town for industrial and economic development in the northwest region of China, Lanzhou has experienced, and is experiencing, rapid urbanization. The types of land use in the main city of Lanzhou are diverse, including residential, industrial, commercial, and public. The classification system of EULUC was adopted as the basic system in study. The two-level scheme adapted from Chinese Standard of Land Use Classification could fit the land use characters of Lanzhou City [24]. On the other hand, mapping based on unified classification system so that easy comparisons with other cities in China could be made. Therefore, we divided the urban land into five Level I classes of residential, commercial, industrial, transportation and public. On this basis, each type in Level I was redefined combining with the land use status of Lanzhou City. Table 1 shows the detailed information of the classification system.  The Sentinel-1A data used in this study are interferometric wide swath (IW) ground distance multivision product, with an acquisition time of 23 December 2018, and the polarization modes are vertical transmit (VV) and horizontal receive (VH). For Sentinel-2A data, the Level-1C products, with an acquisition time of 24 August 2018, which have been processed by radiation calibration and geometric correction, were adopted in this study. Specifically, the four 10 m spectral bands of red, green, blue, near-infrared, and two 20 m short-wave infrared (SWIR) bands in the Sentinel-2A data were used. The Sen2Cor, a processor for  The Sentinel-1A data used in this study are interferometric wide swath (IW) ground distance multi-vision product, with an acquisition time of 23 December 2018, and the polarization modes are vertical transmit (VV) and horizontal receive (VH). For Sentinel-2A data, the Level-1C products, Remote Sens. 2020, 12, 1987 4 of 19 with an acquisition time of 24 August 2018, which have been processed by radiation calibration and geometric correction, were adopted in this study. Specifically, the four 10 m spectral bands of red, green, blue, near-infrared, and two 20 m short-wave infrared (SWIR) bands in the Sentinel-2A data were used. The Sen2Cor, a processor for Sentinel-2 Level 2A product generation and formatting tool [26], was used for atmospheric correction, which is performed using a set of Look-Up tables based on the LIBRADTRAN radiative transfer model [27]. Both Sentinel-1A and Sentinel-2A data were obtained from sentinel open access hub (https://sentinel.esa.int/web/sentinel/sentinel-data-access).
Many studies have shown that nighttime light data could be used to characterize regional population, urbanization, and economy development status [28,29]. The greater the intensity of human activity, the greater the intensity of the nighttime lights. In fact, there is a certain degree of correlation between the intensity of nighttime light and the type of urban land. For example, areas with high economic activity such as commercial and industrial land usually have high-intensity public and commercial lights, while the light intensity of residential land is relatively weak. Compared with DMSP/OLS and VIIRS-DNB data, Luojia-1 could provide high-resolution nighttime light imagery with clearer spatial details [30]. Therefore, the Luojia-1 nighttime light images used in this study, with an acquisition time of 23 August 2018, and a spatial resolution of 130 m, coming from the data and application platform of high-resolution earth observation system for Hubei province, China (http://www.hbeos.org.cn/). Geometric correction for Luojia-1 images were processed based on prominent features, such as road intersections and inflections as reference points.

Internet Data
The Easygo data was acquired from Tencent big data platform (https://heat.qq.com/index.php), which records the real-time locations of active users on Tencent, one of the largest social media platforms in China. The Easygo data could provide the real-time spatial distribution of the population in the study area [14]. We implement a web crawler to acquire the Easygo data within the study area for an entire week from 16-22 December 2019 (16-20 are weekdays, 21-22 are weekends), with a spatial resolution of 25 m and a temporal resolution of two hours.
The Gaode POI data were acquired via the application programming interface (API) provided by the official website of Gaode Maps (https://lbs.amap.com/), with a collection time of December 2019. Each POI record has a series of attributes such as the name, type, address, latitude, and longitude. All POIs were reclassified into 11 categories of residential, village, business, commercial, industrial, administrative, educational, medical, sport and cultural, greenspace and undeveloped.

Road Network Data
The road data consist of OSM road data and Gaode road data, OSM road data were acquired from the OpenStreetMap platform (https://www.openstreetmap.org), and Gaode road data were acquired from Gaode Maps (https://lbs.amap.com/) via a web crawler, respectively. OSM data has the advantages of fast update speed, high current and low cost. It has been widely applied to urban block division and parcel generation [14,22]. As a volunteered geographic dataset, OSM road data are dense and applicable in large and medium-sized cities [31]. However, the completeness of high-level roads such as highways and urban arterial roads are more than that of low-level roads in Lanzhou city. Specifically, areas with strong human activities have dense roads, areas with weak activities have sparse roads relatively (such as roads in the suburbs), the completeness of high-level roads such as highways and urban arterial roads are more than that of low-level roads. Therefore, the parcels generated only using the OSM road network have a high urban land-use mixed degree, which could not meet the needs of the refined mapping of land use in Lanzhou City. The Gaode road network data providing detailed roads could help to better divide the urban parcels.

Method
Based on road network segmentation, the RF algorithm was chosen for the classification of the parcels. The specific process ( Figure 2) is as follows. (1) The boundary of the downtown area of Lanzhou is obtained by visually interpreting high-resolution Google images. Water bodies (NDWI > 0.1) [32] were excluded. (2) OSM and Gaode road networks were used to divide the parcels as the basic mapping units. (3) All features, such as spectral, texture, time-series population density, were extracted based on remote sensing data and Internet data. (4) To obtain the preferred feature dataset, the ReliefF method was used to eliminate redundant features. (5) The RF algorithm was applied to classify urban land use types with preferred features. Based on validation samples, the indicators based on confusion matrix consist of user accuracy (UA), producer accuracy (PA), overall accuracy (OA) and kappa coefficients were used to evaluate the classification performance [33]. By constructing different feature sets based on preferred features, the classification accuracies of different feature sets and the contribution of different features to the classification results were further analyzed.
Remote Sens. 2020, 12, x FOR PEER REVIEW 5 of 20 features. Based on validation samples, the indicators based on confusion matrix consist of user accuracy (UA), producer accuracy (PA), overall accuracy (OA) and kappa coefficients were used to evaluate the classification performance [33]. By constructing different feature sets based on preferred features, the classification accuracies of different feature sets and the contribution of different features to the classification results were further analyzed.

Parcel Generation
Gaode road network data was used as a supplement to generate parcels based on the OSM data. Original OSM and Gaode road network data are shown in Figure 3a,d, respectively. After preprocessing operations including simplification, merging and geometric correction on the road network data, the segmentation processes in the study were carried out as follows. First, we conducted width sampling on roads of different levels according to the attribute of the OSM road data. The road buffers were built based on the sampling results, in which the width of railway, motorway, primary, secondary and tertiary road are set to 7.8 m, 25 m, 48 m, 28 m and 18 m, respectively ( Figure 3b). Second, the buffered roads were used to split the boundary of the main city into independent street blocks ( Figure 3c). Third, we used levelThreeRoad, levelFourRoad, and roadsBeingBuilt in the Gaode Road Network (Figure 3e) to divide the street blocks. The independent roads with the distance of less than 10 m from the adjacent roads at both ends were extended to allow them to connect adjacent roads. To remove debris parcels, we merged parcels with an area of less than 1000 square meters with adjacent parcels, and the results of the parcels are shown in Figure 3f.

Parcel Generation
Gaode road network data was used as a supplement to generate parcels based on the OSM data. Original OSM and Gaode road network data are shown in Figure 3a,d, respectively. After preprocessing operations including simplification, merging and geometric correction on the road network data, the segmentation processes in the study were carried out as follows. First, we conducted width sampling on roads of different levels according to the attribute of the OSM road data. The road buffers were built based on the sampling results, in which the width of railway, motorway, primary, secondary and tertiary road are set to 7.8 m, 25 m, 48 m, 28 m and 18 m, respectively ( Figure 3b). Second, the buffered roads were used to split the boundary of the main city into independent street blocks ( Figure 3c). Third, we used levelThreeRoad, levelFourRoad, and roadsBeingBuilt in the Gaode Road Network (Figure 3e) to divide the street blocks. The independent roads with the distance of less than 10 m from the adjacent roads at both ends were extended to allow them to connect adjacent roads. To remove debris parcels, we merged parcels with an area of less than 1000 square meters with adjacent parcels, and the results of the parcels are shown in Figure 3f.

Feature Extraction
The features of parcels used in this study include spectral, texture, backscatter, nighttime lights, POI, and temporal population density features ( Table 2). Spectral features were calculated based on the bands of Sentinel-2A and typical spectral indices, including the normalized vegetation index (NDVI), the normalized water index (NDWI), the normalized built-up index (NDBI). Texture features were calculated by the Grey Level Concurrence Matrix (GLCM), including the mean, standard deviation, homogeneity, contrast, heterogeneity, entropy. Backscatter and nighttime light features were calculated based on Sentinel-1A and Luojia-1 images. For time-series population density features, the population density at different periods within each parcel was extracted base on Easygo data. POI features consisted of the total number of all POIs, total number, and proportion of each type of POIs within each parcel. Population density values at every two hours in a week 84

Feature Extraction
The features of parcels used in this study include spectral, texture, backscatter, nighttime lights, POI, and temporal population density features ( Table 2). Spectral features were calculated based on the bands of Sentinel-2A and typical spectral indices, including the normalized vegetation index (NDVI), the normalized water index (NDWI), the normalized built-up index (NDBI). Texture features were calculated by the Grey Level Concurrence Matrix (GLCM), including the mean, standard deviation, homogeneity, contrast, heterogeneity, entropy. Backscatter and nighttime light features were calculated based on Sentinel-1A and Luojia-1 images. For time-series population density features, the population density at different periods within each parcel was extracted base on Easygo data. POI features consisted of the total number of all POIs, total number, and proportion of each type of POIs within each parcel.

Feature Reduction
To eliminate the interference of redundant features on the classification accuracy as much as possible, and to obtain the feature variables with strong intra-class aggregation and high-class separability, the ReliefF algorithm was used to reduce the dimension of the extracted features. The main idea of the ReliefF algorithm is as follows: given a training data set D, a sample x is randomly selected from it, and the k nearest neighbor samples H (x) and M (x) from the sample set of the same and heterogeneous samples are randomly selected [34]. The sum of the distance between the sample x and the k nearest neighbor samples H (x) and M (x) are calculated respectively, and the feature weights are updated according to the distance. After multiple iterations, the final weight of each feature is obtained. The formula (1) presents the weight update of the ReliefF.
where dis a () represents the distance between different samples of feature a; H(x) and M(x) represent the k nearest neighbor samples of the same kind and heterogeneity with sample x; p() represents the class probability; m is the number of iterations; k is the number of nearest neighbor samples; and i is the randomly selected samples. According to the feature weights calculated by ReliefF algorithm, the feature variables were added to the RF in order, and the feature subset with the highest accuracy was selected as the final specific features.

Random Forest Model
The RF is a new machine-learning algorithm based on the integration of multiple Classification and Regression Tree (CART) decision trees, which is proposed by Leo Breiman and Adele Cutler [35]. The RF algorithm extracts samples from the original training set randomly, using the bagging method, to construct multiple independently growing and distributed decision trees; in the growth process of each tree, a certain number of features are randomly selected from all the features for nodes dividing; finally, the classification results of each tree are counted by voting, and the category with the most votes is the final result. The main parameters that affect performance of the RF are the number of Remote Sens. 2020, 12, 1987 8 of 19 decision trees and the size of the features [36]. The grid search method based on the out-of-bag (OOB) error value was used for parameter optimization in this study. The algorithm was implemented using scikit-learn, a python-based data analysis toolkit. According to the classification results, the confusion matrix was calculated to obtain the overall accuracy and kappa coefficient.

Classification Based on Different Feature Combination
To evaluate the impact of different types of features on the urban land use classification, ten sets of experiments were designed based on different combinations of features. Table 3 represents the summary of classification experiments based on different feature combinations.

Samples for Training and Validation
According to the specified classification system, the preliminary selection and discrimination of parcels were made based on Google Maps, and then sampling points were arranged for field sampling. Based on the field survey results, the land use types and proportions were revised. In this study, a total of 11,536 parcels was generated, and 580 parcels with a purity of more than 95% were selected as samples. In these samples, 70% were used as training samples, while the rest were used for testing. Additionally, a total of 400 parcels were selected randomly for validation. The numbers of each type of parcels in the validation samples were determined by its proportion in the classification results.

Feature Selection Result
The overall accuracy (OA) of classification results with different numbers of features for the Level II category are showed in Figure 4. It can be seen that when the number of features involved in classification process is less than eight, the accuracy increased rapidly as the number of features increased; when the number of features is between 8 and 136, the rate of growth of accuracy slowed markedly; when it exceeded 136, there was no significant change in classification accuracy. Obviously, the overall accuracy was the largest (77.1%) when the number of feature variables was 136.
Therefore, a total number of 136 features was selected in this study, which consist of nine spectral features (mean values of blue, green, NIR, SWIR1, SWIR2, NDVI, NDWI and standard deviation of SWIR1 and SWIR2,), 16 texture features (entropy of blue, green, red, NIR, SWIR1 and SWIR2; autocorrelation of blue, green, red, SWIR1 and SWIR2; contrast of green, red, NIR, SWIR1 and SWIR2), three backscatter features (sum and mean of VV and mean of VH), 18 POI feature variables, six nighttime light features and 84 time-series population density features. The overall accuracy (OA) of classification results with different numbers of features for the Level II category are showed in Figure 4. It can be seen that when the number of features involved in classification process is less than eight, the accuracy increased rapidly as the number of features increases; when the number of features is between 8 and 136, the rate of growth of accuracy slowed markedly; when it exceeds 136, there is no significant change in classification accuracy. Obviously, the overall accuracy was the largest (77.1%) when the number of feature variables was 136. Therefore, a total number of 136 features was selected in this study, which consist of nine spectral features (mean values of blue, green, NIR, SWIR1, SWIR2, NDVI, NDWI and standard deviation of SWIR1 and SWIR2,), 16 texture features (entropy of blue, green, red, NIR, SWIR1 and SWIR2; autocorrelation of blue, green, red, SWIR1 and SWIR2; contrast of green, red, NIR, SWIR1 and SWIR2), three backscatter features (sum and mean of VV and mean of VH), 18 POI feature variables, six nighttime light features and 84 time-series population density features.

Classification Results Based on Multi-Source Features Using RF Model
The pattern and distribution of urban land use based on optimized multi-source features for Level II category are shown in Figure 5a. Compared with the Lanzhou City Master Plan (2011-2020) [37] via visual interpretation, it was found that the overall consistency in spatial distribution was high. The confusion matrix was used to evaluate the accuracy of the classification results for the Level I category (Table 4). Accordingly, the OA and kappa coefficient for the four level I classes were 83.75% and 0.77, respectively, and the classification accuracies were overall higher than 70%. Residential area had the highest accuracy, with user accuracy and producer accuracy of 93.17% and 84.27%, respectively, while industrial and public areas had low accuracy relatively, and the confusion with other land types was serious.

Comparison with EULUC Results
Due to the inconsistency between the classification system and the sample, the difference between the classification results obtained in this paper and the results of EULUC [24] cannot be directly compared numerically. Therefore, the classification results were compared with the mapping results of EULUC in Lanzhou from the distribution of urban land use. Overall, both our work (Figure 5a) and the EULUC (Figure 5b) were similar in spatial distribution. However, the large parcels of EULUC had resulted in large areas of continuous urban land being classified into the same type (Figure 5g). The mapping results were relatively coarse, especially in the suburbs, where the mixed degree of land use is high, had high misclassification. In contrast, the results of the study were more elaborate. For example, land types with a relatively small internal area (such as commercial office land, which often exists in the form of individual buildings) can be better distinguished (Figure 5f). Therefore, combining the OSM and the Gaode road network could divide the land more finely, effectively improving the classification effect, to better express the detailed information of land use within the city. Figure 6 illustrates the comparisons of the OA and kappa coefficients of different combinations of features for level I and level II categories. Comprehensive utilization of all features (experiment ALL) had the highest classification accuracy, with an OA of 83.75% and a kappa coefficient of 0.77, for level I categories, 76.25 and 0.71 for level II. When only using spectral and texture features, the accuracy was the lowest, with overall accuracies of 50.75% and 43.25% for level I and level II categories, respectively. Based on spectral and texture features, the addition of backscatter features, POI features, population density features and other types of features could improve the accuracy at different degree (Figure 6a,b). Experiments S2 + S1 and S2 + Luojia added backscatter and nighttime light features, respectively, both could both improve the accuracy slightly, but the urban land use classification accuracy was still less than 60%. In contrast, experiments S2 + Easygo and S2 + POI added time-series population density and POI features, which improved the classification accuracy effectively. Compared with S2, the overall accuracies of improved by 23.75% and 19.75% for level I categories, for level II, 20.25% and 18.5%, respectively. What deserves to be mentioned is that, both POI and temporal population density features further improved classification accuracy compared with adding temporal population density (or POI) features alone.

Comparison of Classification Accuracy of Different Feature Combinations
Although both POI and time-series population density features could effectively improve the classification accuracy, using either alone or both cannot meet the accuracy requirements. When only using POI or time-series population density features, the classification accuracy for level II were less than 60% (Figure 6a). Only using POI and time-series population density features for classification, the overall accuracy and kappa coefficient for level I categories were 77% and 0.67, but for level II were only 67.2 and 0.57, respectively. Compared with ALL, the OA decreased by 6.75% for level I categories and 9.5% for level II, respectively, while the kappa coefficient decreased by 0.1 for level I and 0.14 for level II, respectively. It was shown that the remote sensing features were indispensable for the improvement of accuracy in urban land use classification.
Remote Sens. 2020, 12, x FOR PEER REVIEW 11 of 20 II, respectively, while the kappa coefficient decreased by 0.1 for level I and 0.14 for level II, respectively. It was shown that the remote sensing features were indispensable for the improvement of accuracy in urban land use classification.   Table A1). Analyzing from the classification accuracy of a single land use class for level II, there were different effects of various features on urban land use types. In experiment S2, only using the spectral and texture feature classification, the UA and PA of the greenspace were the highest, at 75% and 84% (see in Table A1), respectively. Meanwhile, the addition of other features was not obvious for the improvement of its accuracy, indicating that the classification of greenspace mainly depended on the spectral and texture features. Compared with S2, the accuracy of the urban residential and educational lands in experiment S2 + S1 were improved slightly, indicating that the backscatter features could improve the classification of the urban residential land and educational land, but the degree of improvement was limited. Experiment S2 + Luojia showed that the nighttime light features could improve the UA of the land use classes with strong social and economic activities, such as village, business, commercial lands, etc. When only using remote sensing features (Figure 7a-c), administrative, educational, medical, sport and cultural lands had low accuracies extremely, which indicated that their classification mainly depended on the characteristics of time-series population density and POI features.   Table A1). Analyzing from the classification accuracy of a single land use class for level II, there were different effects of various features on urban land use types. In experiment S2, only using the spectral and texture feature classification, the UA and PA of the greenspace were the highest, at 75% and 84% (see in  Table A1), respectively. Meanwhile, the addition of other features was not obvious for the improvement of its accuracy, indicating that the classification of greenspace mainly depended on the spectral and texture features. Compared with S2, the accuracy of the urban residential and educational lands in experiment S2 + S1 were improved slightly, indicating that the backscatter features could improve the classification of the urban residential land and educational land, but the degree of improvement was limited. Experiment S2 + Luojia showed that the nighttime light features could improve the UA of the land use classes with strong social and economic activities, such as village, business, commercial lands, etc. When only using remote sensing features (Figure 7a-c), administrative, educational, medical, sport and cultural lands had low accuracies extremely, which indicated that their classification mainly depended on the characteristics of time-series population density and POI features.
Based on spectral and texture features, the addition of Internet features such as POI, time-series population density, or their combination could effectively improve the accuracy of almost all types of land (Figure 7d,e). When only using POI or time-series population density features, some land types, such as industrial, educational, and commercial land can achieve higher accuracy, but the classification accuracy of most land types was lower. In contrast, experiment ALL (Figure 7k) integrating all the features have significantly improved the accuracy of all land types, especially village, greenspace and undeveloped. Based on spectral and texture features, the addition of Internet features such as POI, time-series population density, or their combination could effectively improve the accuracy of almost all types of land (Figure 7d,e). When only using POI or time-series population density features, some land types, such as industrial, educational, and commercial land can achieve higher accuracy, but the classification accuracy of most land types was lower. In contrast, experiment ALL (Figure 7k) integrating all the features have significantly improved the accuracy of all land types, especially village, greenspace and undeveloped.
Using the remote sensing features or Internet data features alone to classify urban land use types had relatively low accuracy, but when all features were used, the accuracy of urban land use classification was the highest, indicating that the synthesis of multi-source features was conducive to the improvement of classification results. Different types of features can improve the classification accuracy to varying degrees. Remote sensing features such as spectral, texture, backscatter and night light information could improve the classification accuracy of some land categories, while POI features and temporal population density features could improve accuracy of all classes significantly. Using the remote sensing features or Internet data features alone to classify urban land use types had relatively low accuracy, but when all features were used, the accuracy of urban land use classification was the highest, indicating that the synthesis of multi-source features was conducive to the improvement of classification results. Different types of features can improve the classification accuracy to varying degrees. Remote sensing features such as spectral, texture, backscatter and night light information could improve the classification accuracy of some land categories, while POI features and temporal population density features could improve accuracy of all classes significantly.

Importance of Different Features
Mean impurity decrease derived in RF was used to calculate the importance of each feature in the optimized feature subset. Given the large number of features, the features with importance greater than 0.005 ( Figure A1) were selected for analysis.
The characteristics of time-series population density and POI played important roles in urban land use classification, and other types of features contributed less to the classification results. Among the features of POI, a different type of POIs had significant differences in feature importance. POI1per (proportion of residential of POIs), POI3per (proportion of commercial of POIs) had the most significant contribution to land use classification. At the same time, the features of types with fewer POIs, such as POI8sum (the total number of sport and culture of POIs), were relatively less important. Compared with POI features, population density features accounted for a large proportion, but the importance of population density features at a single moment was generally low. It was shown that the population distribution information at a single moment had little effect on the urban land use classification, while the multi-time population distribution contributed a lot to the urban land use classification. People's travel characteristics and agglomeration laws could reflect the social functional attributes of the parcel to a certain extent, and it also illustrated from the side that the RF model could mine the dynamic characters of data.

Discussion
This study combined the two road network data of Gaode and OSM to divide the parcels as the basic mapping units, extracted the multi-source feature information such as spectral, texture, POI, and on this basis, constructed a model for detailed mapping of regional urban land use. Our approach demonstrates the potential of utilization of OSM and Gaode road network for parcel segmentation. Compared with methods that focus on national mapping [20,24], the mapping results in this study represents more details in urban land use. For regional mapping and analysis, some studies have introduced image segmentation to address the issue of mixed land use within the parcels generated by the OSM road network, and have yielded good classification results [25]. In contrast, the parcels extracted by Gaode road in our study may be more regular and suitable for mapping habits of people in urban areas. In addition, dynamic population distribution was utilized to random forest for urban land use mapping, which improved the classification accuracy significantly compared with static distribution.
Our findings are consistent with the standpoints in previous studies that the integration of multi-source data consisting of remote sensing and Internet data could provide more comprehensive information of urban land use [38,39]. The preferred feature set takes advantage of multi-source information to maximize the useful information, and its classification accuracy is higher than other feature combinations or a single feature. However, the contribution of different features for urban land use classification diversified greatly. POI and Easygo data contribute the most to classification in our results. Land use information describes that how human use land [40], which focus on the functional characteristics in urban areas. The characteristics could be represented by both POI and Easygo data. Due to its type information, POI can often reflect the categories of parcels directly. Moreover, the spatiotemporal dynamic patterns of population density in the Easygo data could represent the functional characteristics of urban land indirectly. For example, on weekdays, people gather from the residential to the business land in the morning, and return to the residential at night [17]. On the other hand, despite the relatively small contribution of remote sensing features (i.e., spectral, texture and backscatter), they are important to the improvement of some urban land use type identification. For example, NDVI could help classify the greenspace, nighttime light is beneficial for the categories where human have strong social and economic activities like commercial and business lands.
Although our method has achieved good classification results in main city of Lanzhou, some limitations should be considered. The first limitation relates to the mixing degree of land use within the parcel. As suggested by Grippa et al. [22], it is difficult to make all the parcels to be homogeneous in terms of land use. In fact, actual land use is often a mix. Rapidly growing cities tend to expand in three-dimension like increasing building height to improving land use efficiency, accompanied by more and more phenomena of three-dimensional land use [41]. For example, some buildings are commercial on the lower floors and business on the upper floors. Moreover, the boundaries of some parcels may not be roads, but fences, or there may be no obvious boundaries. Although these parcels are few in quantity in the study area, they still may decline the accuracy of mapping results. In this study, the overall accuracy of urban land use is high, but the user accuracies of sport and cultural lands and medical lands in public are low, are 33.33% and 28.57% for level II category, respectively. These lands are often mixed with residential land, and there are many types of land in the same parcel.
It is difficult to use the road network to divide them accurately, causing some interference to the classification. The second limitation is linked to the imbalanced distribution of data, which is not only reflected in the imbalance of the type of data itself, but also in the spatial distribution. We have found that POI and Easygo data contribute the most to the classification in this study. However, a lack of these data in some lands, such as suburbs or undeveloped areas, which is the one of the reasons why the accuracy of some categories (e.g., undeveloped land) are low. Moreover, there are imbalanced POIs size in different categories. For instance, the number of commercial and residential types of POIs are generally higher than that of industrial.
Further effort should be made to improve the mapping results. As for the mix degree of land, the process of parcel segmentation could be optimized. When reducing the mix of parcels, the integrity of the parcel should be considered. A hierarchical classification that different segmentation strategies could be used in terms of the characteristics of different lands could be considered. For example, objects that are subjectively inseparable but have finer roads inside, such as universities and large communities, can be identified, and then, small objects could be segmented by finer roads and classified. In addition, for some lands with three-dimensional land use, a parcel may be labeled with multiple categories. As for imbalanced distribution of data, a possible strategy is giving weight to different POI categories to removes the effects of the imbalance. In addition, for areas lack of POI or Easygo data, higher resolution images [10] and other social media data, such as mobile phone signals [12], could be supplemented for classification.

Conclusions
Given the current issues in urban land use classification, on the basis of road network segmentation of urban areas, comprehensive utilization of spectral, texture, backscatter, nighttime light, POI and time-series population density information, the fine extraction of urban land use information was achieved in the main urban area of Lanzhou City. The conclusions are summarized as follows.
(1) The random forest classifier conducted on multi-source data achieves good classification results.
Specifically, for level I categories, the overall accuracy is as high as 83.75%, the kappa coefficient is 0.77, for level II, the accuracy of each type reaches more than 70%. Compared with the results of EULUC, the results in this study represent more details of urban land use.   Figure A1. Features with importance greater than 0.005. (a) feature importance based on mean impurity decrease; (b) proportion of the number of features of each type. Note: Mean_B3, Mean_B8, Mean_B11, Mean_B12, Mean_B2, NDVI, NDWI represent mean of green, near-infrared, SWIR 1, SWIR 2, blue, normalized vegetation index, normalized water index, respectively; Luojia1_MIN, Luojia1_MAX, Luojia1_SUM represent the maximum, minimum, and sum of night light intensity, respectively; S1AVV_SUM, S1AVV_MEAN and S1AVH_SUM represent the sum and mean VV and the sum of VH respectively; POI1per, POI3per, and POI7per represent the proportion of residential, commercial, and medical of POIs, POI2sum, POI3sum, POI5sum, POI6sum, POI7sum, POI8sum represent the total number of business, commercial, administrative, educational, medical of POIs, respectively; pop_Sat22 represents the population density at 22:00 on Saturday, and so on for other population density features. Mean_B12, Mean_B2, NDVI, NDWI represent mean of green, near-infrared, SWIR 1, SWIR 2, blue, normalized vegetation index, normalized water index, respectively; Luojia1_MIN, Luojia1_MAX, Luojia1_SUM represent the maximum, minimum, and sum of night light intensity, respectively; S1AVV_SUM, S1AVV_MEAN and S1AVH_SUM represent the sum and mean VV and the sum of VH respectively; POI1per, POI3per, and POI7per represent the proportion of residential, commercial, and medical of POIs, POI2sum, POI3sum, POI5sum, POI6sum, POI7sum, POI8sum represent the total number of business, commercial, administrative, educational, medical of POIs, respectively; pop_Sat22 represents the population density at 22:00 on Saturday, and so on for other population density features.