Comparison of Machine-Learning Methods for Urban Land-Use Mapping in Hangzhou City, China

Urban land-use information is important for urban land-resource planning and management. However, traditional survey-based methods cannot keep pace with the rapid development of urban land management, and new methods are urgently needed to overcome their shortcomings. To address this issue, this study used random forest (RF), support vector machine (SVM), and artificial neural network (ANN) models to build machine-learning methods for urban land-use classification. Taking Hangzhou as an example, all three methods successfully classified essential urban land use into 6 Level I classes and 13 Level II classes based on semantic features extracted from Sentinel-2A images and on multi-source features, namely the types of points of interest (POIs), land surface temperature, night lights, and building height. The validation accuracy of the RF model for Level I and Level II land use was 79.88% and 71.89%, respectively, better than that of the SVM (78.40% and 68.64%) and ANN models (71.30% and 63.02%). However, the variation in user accuracy among the methods depended on the land-use level. For the Level I classification, user accuracy was high with all methods except for transportation land, and the RF and SVM models generally performed better than the ANN model. For the Level II classification, the user accuracy of the models differed markedly. With the RF model, the user accuracy of educational and medical land was above 80%. With the SVM model, the user accuracy of business office and educational land was above 75%. The user accuracy of the ANN model at Level II was poor. Overall, the RF model performed best, followed by the SVM model, while the ANN model performed relatively poorly in essential urban land-use classification. The results show that machine-learning methods can quickly extract land-use types with high accuracy and offer a better choice of method for acquiring urban land-use information.


Introduction
Land-use/land-cover classification is the basis for land-use and land-cover change research. However, in rapidly urbanizing areas, traditional visual interpretation and mathematical statistics can no longer meet the requirements for efficient and accurate land-use classification and complex information acquisition [1]. With the rapid urbanization of China, how to quickly obtain urban land-use information has become a hot topic. With powerful adaptive, self-learning, and parallel information-processing capabilities, machine-learning methods have been successfully applied and developed in many fields. For example, Paoletti et al. [2] developed a new, highly efficient implementation of the support vector machine (SVM) that uses the high computational power of graphics processing units to reduce the execution time of storing and processing hyperspectral [...]
Remote Sens. 2020, 12, x FOR PEER REVIEW
[...] data, then used three typical machine-learning methods, RF, SVM, and artificial neural network (ANN), to classify urban land use in Hangzhou (2018). The aim was to compare the classification accuracy of different machine-learning methods for urban land use. In addition, the classification results can provide a decision-making reference for urban land-use planning and management.

Study Area
Hangzhou is the provincial capital of Zhejiang Province and is regarded as the economic, cultural, scientific, and educational center of the Hangzhou metropolitan area. It is also one of the central cities of the Yangtze River Delta region, located between 29°11′-30°34′N and 118°20′-120°37′E. Despite the slowdown in population growth in some Chinese cities in recent years, the migrant population of Hangzhou has continued to grow rapidly. In 2019, the city's gross domestic product (GDP) was RMB 1537.3 billion and the permanent resident population was 10.36 million, with an urbanization rate of 78.5%. Rapid urbanization has brought significant changes to the land-use pattern of Hangzhou.

Figure 1 presents a flowchart of the methodology used in this study, which consists of three major parts. First, the road network and water bodies were used to generate parcels, and the classification feature data sets extracted from the multi-source data were combined with field sampling photos and Google Street Views to form the land-use classification data sets. Then, the samples were randomly divided into training samples and validation samples. Finally, RF, SVM, and ANN were used to classify urban land use, and the classification results and accuracy were compared and analyzed.

[...] [25]. In the EULUC study [20], a basic parcel segmentation network for the impervious surface was generated from the major and minor roads in OpenStreetMap (https://www.openstreetmap.org/#map=4/36.96/104.17) and the water layer of the 10-m resolution global land-cover map based on Sentinel-2 data [26]. To further refine the urban parcels, more accurate road network and water body data extracted from the geographic conditions monitoring data of Zhejiang Province were integrated to generate the final urban parcels. A total of 11,212 parcels were obtained in Hangzhou, with an average parcel area of 21.01 hectares (Figure 2(b1,b2)). Compared with EULUC (Figure 2(a1,a2)), the parcel segmentation was more precise.

Parcel Classification and Sample Selection
Based on the land-use classification system of EULUC and the actual characteristics of regional mapping in Hangzhou, 6 Level I land-use classes (Residential, Commercial, Industrial, Transportation, Public management and service, Non-construction) and 13 Level II land-use classes (Residential, Business office, Commercial service, Industrial, Road, Transportation station, Airport, Administrative, Educational, Medical, Sport and cultural, Park and green space, Non-construction) were defined, as shown in Table 1. In particular, the non-construction land category was newly added. It mainly includes land in the city that has been approved for construction but not yet built on and cultivated land in suburban areas. To make the samples as stable as possible, only parcels with a balanced spatial distribution and a purity of 80% or more were selected as samples. Meanwhile, to avoid excessive differences in the number of samples among land-use types, a total of 1127 samples were finally obtained through field surveys and Google Street Views (photos shown in Figure 3). The number of samples for each land-use type is shown in Table 1. These samples were randomly divided into training and testing data sets at a ratio of 7:3, and confusion matrices were used to assess the accuracy of the Level I and Level II classification results obtained by the different methods.
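The random 7:3 split described above can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the authors' script: the feature matrix and labels are synthetic placeholders, and the number of features and the random seeds are arbitrary.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the parcel samples (assumption): 1127 samples as in
# the text, with a placeholder count of 10 features and random Level I labels.
rng = np.random.default_rng(0)
n_samples, n_features = 1127, 10
X = rng.normal(size=(n_samples, n_features))
y = rng.integers(0, 6, size=n_samples)  # six Level I classes

# Random 7:3 division into training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
print(X_train.shape, X_test.shape)
```

Passing `stratify=y` would keep the class proportions equal in both subsets; the text only states that the division was random, so it is left out here.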

Image Features
Sentinel-2 satellite images have been used in many applications owing to their fine temporal and spatial resolution [15,27]. Sentinel-2A images from 2018, downloaded from https://scihub.copernicus.eu/, were selected to extract the multispectral features. After atmospheric correction, the blue, green, red, and near-infrared bands, the normalized difference vegetation index (NDVI = (NIR - Red)/(NIR + Red)), and the normalized difference water index (NDWI = (Green - NIR)/(Green + NIR)), all at 10 m spatial resolution, were used to calculate the mean, standard deviation, and information entropy of each band and index in each parcel. The information entropy characterizes image texture: it measures the randomness of the image information and represents the complexity of the image.
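The two indices and the three per-parcel statistics can be illustrated with NumPy. The sketch below uses toy reflectance values for the pixels of one hypothetical parcel. The histogram-based Shannon entropy and its bin count are assumptions, since the text does not state how the information entropy was computed.

```python
import numpy as np

def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red), computed per pixel."""
    return (nir - red) / (nir + red)

def ndwi(green, nir):
    """NDWI = (Green - NIR) / (Green + NIR), computed per pixel."""
    return (green - nir) / (green + nir)

def shannon_entropy(values, bins=16):
    """Shannon entropy in bits of a histogram of the values.
    The binning scheme is an assumption made for this sketch."""
    hist, _ = np.histogram(values, bins=bins)
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())

def parcel_stats(values):
    """Mean, standard deviation, and entropy of the pixels in one parcel."""
    values = np.asarray(values, dtype=float)
    return {
        "mean": float(values.mean()),
        "std": float(values.std()),
        "entropy": shannon_entropy(values),
    }

# Toy surface reflectances for the pixels of one hypothetical parcel.
nir = np.array([0.40, 0.45, 0.50, 0.30])
red = np.array([0.10, 0.12, 0.08, 0.20])
green = np.array([0.12, 0.14, 0.10, 0.18])

ndvi_stats = parcel_stats(ndvi(nir, red))
ndwi_stats = parcel_stats(ndwi(green, nir))
print(ndvi_stats)
print(ndwi_stats)
```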

Land Surface Temperature
Land surface temperature (LST) reflects human social and economic activities to a certain extent, and previous studies have shown significant disparities in LST among urban land-use types [28]. The Google Earth Engine (GEE) platform was used to extract Landsat 8 images of Hangzhou [29,30]. Because cloud contamination degraded image quality in 2018, Landsat 8 images from May 2017, as close as possible to 2018, were selected instead for LST retrieval. The radiometric correction equation was used to convert the pixel values of the Landsat 8 thermal band to radiant temperature, which was then corrected using the specific emissivity [31,32]. Finally, the LST of Hangzhou was obtained, and the mean and standard deviation of the LST in each parcel were calculated.
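The text does not give the retrieval equations, so the following is only a sketch of one commonly used single-band formulation: digital numbers are rescaled to radiance, radiance is converted to brightness temperature, and an emissivity correction is applied. The constants are the values typically listed for Landsat 8 band 10 and should be checked against the scene metadata. The effective wavelength, the emissivity values, and the digital numbers are assumptions chosen for illustration.

```python
import numpy as np

# Typical Landsat 8 band 10 constants (assumption: confirm in the scene's
# MTL metadata file before use).
ML, AL = 3.342e-4, 0.1        # radiance rescaling gain and offset
K1, K2 = 774.8853, 1321.0789  # thermal conversion constants
WAVELENGTH_UM = 10.895        # assumed effective wavelength, micrometres
RHO_UM_K = 14388.0            # h*c/k_B, micrometre-kelvin

def brightness_temperature(dn):
    """Convert thermal-band digital numbers to brightness temperature (K)."""
    radiance = ML * np.asarray(dn, dtype=float) + AL
    return K2 / np.log(K1 / radiance + 1.0)

def emissivity_corrected_lst(bt_kelvin, emissivity):
    """Apply a single-band emissivity correction to brightness temperature."""
    factor = 1.0 + (WAVELENGTH_UM * bt_kelvin / RHO_UM_K) * np.log(emissivity)
    return bt_kelvin / factor

# Placeholder digital numbers and emissivities for three pixels.
dn = np.array([28000, 30000, 32000])
emissivity = np.array([0.99, 0.95, 0.92])

bt = brightness_temperature(dn)
lst = emissivity_corrected_lst(bt, emissivity)
print(bt - 273.15)   # brightness temperature in degrees Celsius
print(lst - 273.15)  # emissivity-corrected LST in degrees Celsius
```

Because the logarithm of an emissivity below 1 is negative, the corrected temperature is always higher than the brightness temperature, and the two coincide for an emissivity of exactly 1.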

POIs
The 2018 POI data were acquired from Gaode Maps (https://lbs.amap.com/api/webservice/guide/api/Search). Each POI record contains information such as name, location coordinates, and city function category. All POIs were reclassified into 12 types corresponding to the Level II land-use classification. The number, proportion, and total of each POI type were then calculated for each parcel.
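One way to read the per-parcel POI features is as a count per POI type, the share of each type, and the total number of POIs in the parcel. The sketch below follows that reading, which is an interpretation of the text. The parcel identifiers and records are hypothetical, and the spatial join that assigns each POI to a parcel is assumed to have been done already.

```python
from collections import Counter, defaultdict

# Hypothetical (parcel_id, reclassified POI type) pairs after a spatial join.
poi_records = [
    ("P1", "Educational"),
    ("P1", "Educational"),
    ("P1", "Medical"),
    ("P2", "Business office"),
]

def poi_features(records):
    """Per parcel: total POIs, count per type, and proportion per type."""
    counts = defaultdict(Counter)
    for parcel_id, poi_type in records:
        counts[parcel_id][poi_type] += 1

    features = {}
    for parcel_id, type_counts in counts.items():
        total = sum(type_counts.values())
        features[parcel_id] = {
            "total": total,
            "count": dict(type_counts),
            "proportion": {t: n / total for t, n in type_counts.items()},
        }
    return features

features = poi_features(poi_records)
print(features["P1"])
```

Parcels without any POI do not appear in the result and would need to be filled with zeros before modeling.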

Building Height
The building height data, consisting of the outline and height of each building in Hangzhou, were acquired via the Baidu Map API (http://api.map.baidu.com/staticimage/v2). The average building height in each parcel was then calculated to aggregate these data to the parcel level.

Night Lights
Night lights are closely related to human economic activities and are increasingly used in research on urban development and population [33-35]. The 2018 Luojia-1 nighttime lights (NTLs) for Hangzhou, with a resolution of 130 m, were downloaded from http://59.175.109.173:8888/app/login.html. The mean and standard deviation of the digital number (DN) values of the NTL pixels in each parcel were calculated. The features used in this study are summarized in Table 2.
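The parcel-level mean and standard deviation of a raster layer, such as the night-light DN values, can be computed as a zonal statistic. The sketch below assumes that the parcels have already been rasterized to an integer parcel-ID grid aligned with the image, which is a simplification introduced here and not a step described in the text.

```python
import numpy as np

def zonal_mean_std(values, zones):
    """Mean and population standard deviation of `values` per integer zone ID.
    Zones with no pixels yield NaN."""
    values = np.asarray(values, dtype=float).ravel()
    zones = np.asarray(zones).ravel()
    n = np.bincount(zones)
    total = np.bincount(zones, weights=values)
    total_sq = np.bincount(zones, weights=values ** 2)
    with np.errstate(divide="ignore", invalid="ignore"):
        mean = total / n
        variance = total_sq / n - mean ** 2
    # Clip tiny negative variances caused by floating-point rounding.
    std = np.sqrt(np.clip(variance, 0.0, None))
    return mean, std

# Toy 2 x 2 raster of DN values and a matching parcel-ID raster.
dn = np.array([[1.0, 3.0],
               [10.0, 10.0]])
parcel_id = np.array([[0, 0],
                      [1, 1]])

mean_dn, std_dn = zonal_mean_std(dn, parcel_id)
print(mean_dn, std_dn)
```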

Random Forest
The RF model is an ensemble learning algorithm proposed by Breiman in 2001 [36-38]. It increases the diversity of the classification trees and improves on the performance of a single classification or regression tree by sampling with replacement and by randomly varying the combination of predictor variables as the different trees are grown. The modeling steps are as follows. First, bootstrap sampling is used to draw training sets from the original data set. Each training set contains about 2/3 of the original data, and the remaining samples form the out-of-bag (OOB) data. Second, a tree is grown on each training set without pruning. At each node, m predictor variables are selected at random, and the best of these is chosen to split the node according to the minimum Gini index. Third, new data are predicted with the resulting trees, and the classification result is determined by voting on the outputs of the individual decision trees. Three parameters need to be set to optimize the random forest classifier: the number of trees (n_estimators), the number of predictors considered when splitting each node (max_features), and the minimum number of samples required at a leaf node (min_samples_leaf). These three parameters can be determined from the OOB error rate.
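Choosing the three parameters from the out-of-bag error can be sketched with scikit-learn's `RandomForestClassifier`. The data set below is synthetic and the candidate values are arbitrary assumptions. The paper does not report the grid it searched or the values it selected.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic six-class data standing in for the parcel features (assumption).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8,
    n_classes=6, n_clusters_per_class=1, random_state=0,
)

best_error, best_params = None, None
for n_estimators in (50, 100):
    for max_features in ("sqrt", 0.5):
        for min_samples_leaf in (1, 3):
            rf = RandomForestClassifier(
                n_estimators=n_estimators,
                max_features=max_features,
                min_samples_leaf=min_samples_leaf,
                criterion="gini",   # split on Gini impurity
                bootstrap=True,     # sample with replacement
                oob_score=True,     # score each tree on its out-of-bag samples
                random_state=0,
            )
            rf.fit(X, y)
            oob_error = 1.0 - rf.oob_score_
            if best_error is None or oob_error < best_error:
                best_error = oob_error
                best_params = {
                    "n_estimators": n_estimators,
                    "max_features": max_features,
                    "min_samples_leaf": min_samples_leaf,
                }

print(best_params, round(best_error, 3))
```

The OOB error uses only the samples each tree did not see during training, so it gives an internal estimate of generalization error without setting aside a separate validation set.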

Support Vector Machine
SVM is a machine-learning method based on statistical learning theory and the principle of structural risk minimization [39]. Compared with traditional learning methods, it offers high accuracy, fast computation, and strong generalization ability, and it is widely used in image classification and land classification mapping [19,40]. The basic idea of SVM classification is to map the input space into a high-dimensional feature space through a nonlinear transformation and then find the optimal hyperplane (OHP) in this new feature space. The optimal hyperplane not only classifies the training samples correctly but also maximizes the distance between the classification plane and the points closest to it; that is, it maximizes the margin separating the classes. The most crucial aspects of SVM classification are the choice of kernel function and the determination of the kernel parameters [39]. In this study, the radial basis function was used as the kernel function, and a grid search was used to determine the optimal penalty coefficient C and kernel parameter gamma. The penalty coefficient C was finally set to 5 and gamma to 0.01, and the "one-versus-rest" rule was adopted for multi-class classification.
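A grid search over C and gamma for an RBF-kernel SVM with a one-versus-rest scheme can be sketched as follows. The synthetic data, the candidate grid, and the three-fold cross-validation are assumptions. Wrapping `SVC` in `OneVsRestClassifier` is one way to obtain one-versus-rest behavior in scikit-learn and may differ from the authors' configuration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# Synthetic six-class data standing in for the parcel features (assumption).
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=8,
    n_classes=6, n_clusters_per_class=1, random_state=0,
)

# One binary RBF-kernel SVM per class ("one-versus-rest").
ovr_svm = OneVsRestClassifier(SVC(kernel="rbf"))

# Candidate values are illustrative; they include the reported C=5, gamma=0.01.
param_grid = {
    "estimator__C": [1, 5, 10],
    "estimator__gamma": [0.001, 0.01, 0.1],
}
search = GridSearchCV(ovr_svm, param_grid, cv=3)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

On real parcel features with very different units, standardizing the inputs before the SVM is usually advisable, because the RBF kernel is sensitive to feature scale.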

Artificial Neural Network
The study used the back-propagation (BP) network, the most widely used ANN model, which was proposed by Rumelhart et al. in 1985 [41]. It is a multi-layer feed-forward neural network trained with the error back-propagation algorithm. A typical BP neural network comprises an input layer, one or more hidden layers, and an output layer. Adjacent layers are connected by weights, and learning consists of two processes, forward propagation and backward propagation. The input information is propagated forward through the activation function [42,43] to produce the output, which is compared with the target output. If the error exceeds the predetermined value, back-propagation begins: the error signal is fed back from the output layer to the hidden and input layers, and the connection weights between the nodes (neurons) of each layer are adjusted according to the error [44]. This process is repeated until the error falls within the allowable range, at which point the network can be used for classification. The number of neurons in the input layer equals the number of input variables, the number of neurons in the hidden layer depends on the complexity of the problem, and the number of neurons in the output layer equals the number of output variables. The available activation functions are identity, logistic, tanh, and relu, and the available solvers for weight optimization are lbfgs, sgd, and adam (https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html). In this study, the logistic function was used as the activation function and the lbfgs solver was used to optimize the weights.
The above classification calculations were all implemented using the Scikit-Learn package in Python 3.7.
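The ANN configuration described above (logistic activation, lbfgs solver) can be sketched with scikit-learn's `MLPClassifier`. The single hidden layer of 50 neurons, the iteration limit, and the synthetic data are assumptions, since the network size is not reported in the text.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic six-class data standing in for the parcel features (assumption).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8,
    n_classes=6, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

ann = MLPClassifier(
    hidden_layer_sizes=(50,),  # assumed size; not given in the text
    activation="logistic",
    solver="lbfgs",
    max_iter=500,
    random_state=0,
)
ann.fit(X_train, y_train)

train_acc = ann.score(X_train, y_train)
test_acc = ann.score(X_test, y_test)
print(f"training accuracy {train_acc:.3f}, testing accuracy {test_acc:.3f}")
```

Comparing the two scores is a quick check for overfitting: a training accuracy far above the testing accuracy suggests that the network has memorized the training samples.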

Classification Results
The results of land-use classification using the RF, SVM, and ANN models differed at Level I (Figure 4). Residential, industrial, and public management and service land accounted for 18.67%, 35.66%, and 33.85% of the area with RF; 26.61%, 41.80%, and 25.62% with SVM; and 32.52%, 38.52%, and 23.64% with ANN. In contrast, both commercial and transportation land accounted for less than 4%. Although the area percentages differed, residential, industrial, and public management and service land accounted for a large proportion of Level I land use regardless of the method used.
The Level II classification results also showed significant differences (Figure 5). For business office and commercial service land, the proportions were 1.06% and 2.66% with RF, 0.57% and 4.17% with SVM, and 0.58% and 4.40% with ANN. Within public management and service land, educational, park and green space, and administrative land occupied a large proportion and could be classified relatively well. These three types accounted for 6.65%, 10.32%, and 4.98% with RF; 11.68%, 5.45%, and 0.82% with SVM; and 12.46%, 3.68%, and 0.62% with ANN.
Overall, the Level II results of the three methods were spatially consistent: commercial service and business office land were mainly distributed in the central urban areas, whereas industrial land and park and green space were mostly located in the suburbs. In specific areas, however, the classification results were quite different. Taking downtown Hangzhou as the comparison area, the classification results of RF and SVM were relatively close, while the ANN results were quite different (Figure 6).

Accuracy Assessment
Overall, the validation accuracy, training accuracy, and Kappa coefficient of the Level I land-use classification were 79.88%, 92.88%, and 0.738 with RF; 78.40%, 82.34%, and 0.720 with SVM; and 71.30%, 99.8%, and 0.624 with ANN (Figure 7; Table 3). The RF results were the best, with a high degree of consistency and reliability. SVM was second, with classification accuracy close to that of RF. ANN classified poorly compared with the other two methods.
Looking at the individual Level I classes, first, residential and industrial land were identified well by all three methods: the user accuracy and producer accuracy for these two types were above 80% with RF and SVM and above 75% with ANN. Second, the user accuracy and producer accuracy of public management and service land were above 70% with all methods. The misclassified parcels were mainly assigned to residential land, and the commission error was above 10%. Third, commercial land was classified best by RF, with user accuracy reaching 86.27%, compared with 78.43% with SVM and 60.78% with ANN. Finally, non-construction and transportation land were classified poorly. The high-resolution remote-sensing images show that transportation land such as stations has an irregular form and is mostly mixed with other land types, resulting in low sample purity and a small sample size, which may be one of the main reasons for the low classification accuracy. Non-construction land is mainly land within the city that has been approved but not built on, together with farmland in the suburbs. These parcels lack POI data, and their spectral and texture features are less prominent, resulting in low classification accuracy.
In summary, RF and SVM classified residential, commercial, industrial, public management and service, and non-construction land better, whereas ANN performed better on residential, industrial, and public management and service land.
Figure 8 and Table 4 show that the Level II classification results were similar to those of Level I. The overall performance of the RF model was again the best, with validation accuracy, training accuracy, and Kappa coefficient of 71.89%, 91.74%, and 0.664, respectively. SVM was second, with 68.64%, 81.83%, and 0.630, respectively. Although the training accuracy of ANN reached 99.8%, its validation accuracy and Kappa coefficient were both low (63.02% and 0.559), indicating obvious overfitting. This was mainly due to the increased complexity of the Level II land-use types, for which the overall classification accuracy varied greatly.
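The accuracy measures reported here (overall accuracy, user accuracy, producer accuracy, and the Kappa coefficient) all derive from the confusion matrix. The sketch below shows the standard formulas on a made-up three-class matrix. The numbers are illustrative only and are not taken from Tables 3 or 4. Rows are assumed to hold the reference classes and columns the predicted classes.

```python
import numpy as np

def accuracy_metrics(confusion):
    """Overall accuracy, user accuracy, producer accuracy, and Cohen's kappa.
    Rows are reference classes and columns are predicted classes."""
    cm = np.asarray(confusion, dtype=float)
    total = cm.sum()
    correct = np.diag(cm)
    reference_totals = cm.sum(axis=1)  # row sums
    predicted_totals = cm.sum(axis=0)  # column sums

    overall = correct.sum() / total
    producer = correct / reference_totals  # 1 - omission error
    user = correct / predicted_totals      # 1 - commission error
    expected = (reference_totals * predicted_totals).sum() / total ** 2
    kappa = (overall - expected) / (1.0 - expected)
    return overall, user, producer, kappa

# Made-up confusion matrix for three classes (illustrative values only).
cm = [[42, 5, 3],
      [8, 30, 2],
      [5, 5, 20]]

overall, user, producer, kappa = accuracy_metrics(cm)
print(round(overall, 4), np.round(user, 4), np.round(producer, 4), round(kappa, 4))
```

User accuracy answers how often a parcel mapped as a class truly belongs to it, while producer accuracy answers how often a parcel of a class is mapped correctly. This is why the two can differ markedly for small or mixed classes.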

For the specific Level II types, educational and medical land within public management and service land had higher classification accuracy. RF performed best on these two types, with user accuracy reaching 88.89% and 82.93% and producer accuracy reaching 88.89% and 77.27%, respectively, whereas SVM and ANN had lower classification accuracy. Second, for park and green space, the user accuracy of the three methods was about 50%, and the misclassified parcels were mainly assigned to non-construction land. This is because the image characteristics of cultivated land within non-construction land are relatively similar to those of park and green space, which interferes with the classification of park and green space. The classification accuracy of sport and cultural land and of administrative land was low with all three methods: except for the 60% user accuracy of sport and cultural land with SVM, the accuracies were below 50%. The high-resolution remote-sensing images show that these two types are mostly mixed with other land-use types, especially administrative land, whose area share is also small. Within commercial land, the user accuracy of business office land was higher than that of commercial service land: it was above 68% with all three methods, whereas that of commercial service land exceeded 60% only with RF. All three methods misclassified commercial service land as residential or business office land, mostly because of the mixed use of commercial service, residential, and business office land.
In summary, RF classified educational and medical land better, and SVM classified business office and educational land better, while ANN performed poorly overall in the Level II land-use classification.

Discussion
More accurate road network and water body data from government departments were used to generate the basic parcel segmentation for urban land-use mapping, and the number of testing samples was greatly increased in this paper. Compared with the EULUC study in China [20], the research framework was improved and the urban land-use mapping results were more detailed. Moreover, the accuracy testing shows the reliability of the data for evaluating urban land-use classification. Like other studies [20-24], this paper mainly adopted cross validation, using validation accuracy, Kappa coefficients, user accuracy, and producer accuracy, which are widely used in land-use/land-cover mapping, to evaluate classification accuracy. In addition, we reported the training accuracy (Tables 3 and 4) and the ROC curve and area under the curve (AUC) values (Figures 7 and 8) of the different models, which support the accuracy evaluation. Overall, the designed accuracy evaluation system can explain the accuracy of the land-use classification results. With the random forest model, the validation accuracy of the Level I and Level II classification in this paper was 79.88% and 71.89%, respectively. Compared with published studies that used random forest models for urban land-use classification, this was close to the overall Level I and Level II accuracy for Shenzhen (75.94%, 71%) [22] and lower than that for Hangzhou (82%, 78%) [20], Ningbo (87.58%, 73.53%) [21], Lanzhou (83.75%, 76.25%) [23], and Nanjing (86.1%, 80%) [24]. Meanwhile, the classification accuracy of the random forest model in this article was generally better than that of the SVM and ANN models, consistent with the comparison of machine-learning methods for land-cover classification in complicated terrain regions by Gu et al. [1]. Together with the good classification results obtained for Ningbo, Shenzhen, Lanzhou, Nanjing, and other regions, this shows that the random forest model has good robustness and applicability in land-use mapping of different cities. Finally, there were also cases where land-use classification performed poorly in individual cities or regions [20], because the selection of features [21,23,24] is an essential factor affecting the accuracy of land-use classification.
In addition, this study found that the main factor affecting the accuracy of land-use classification was the high degree of urban land-use mixing, which reduces the purity of the parcels [22] and makes it difficult to determine the correct category. To improve the classification accuracy of mixed types, we make two suggestions. The first is to add mixed land-use categories, such as mixed commercial and residential land. In recent years, improving land-use efficiency through land-use mixing has gradually become common in some first-tier cities in China [45]. The second is to further refine parcel generation. The road network and water body data used in this study were not sufficient, so other methods or data need to be combined. Tu et al. [21] used an object-based segmentation approach to generate basic urban land-use classification parcels, but attention must be paid to the scale of the basic parcels. If the parcel area is too small, the map patches may be incomplete and lose the attribute features of the land-use type itself, so that the texture features of some land-use types become inconspicuous, which affects the classification accuracy. Beyond further work on basic parcel segmentation, feature selection, and other factors, future research could also combine the advantages of multiple classification methods for urban land-use mapping and information extraction.

Conclusions
Building on a more precise parcel segmentation of urban areas, this paper compared and analyzed the accuracy of the RF, SVM, and ANN models in urban land-use classification in Hangzhou, providing practical methods for urban land-use classification as well as a basis for method selection. The main conclusions are as follows: (1) In general, RF performed best in urban land-use classification, followed by SVM, and ANN was comparatively poor. In the Level I land-use classification, the training accuracy, validation accuracy, and Kappa coefficient were 92.88%, 79.88%, and 0.738 with RF; 82.34%, 78.40%, and 0.720 with SVM; and 99.80%, 71.30%, and 0.624 with ANN. In the Level II land-use classification, they were 91.74%, 71.89%, and 0.664 with RF; 81.83%, 68.64%, and 0.630 with SVM; and 99.60%, 63.02%, and 0.559 with ANN.
(2) For Level I land use, the classification accuracy was high except for transportation land, whose user accuracy was below 30% with all methods. The user accuracies of residential and industrial land were generally above 80%, and those of commercial land and public management and service land were generally above 70%.
(3) For Level II land use, the classification accuracies of the models differed considerably among land-use types. In general, the Level II types of public management and service land were classified better by RF, with the user accuracy of educational and medical land above 80%. The Level II types of commercial land were classified better by SVM, as reflected in a user accuracy of 75% for business office land. SVM also classified educational land well, with a user accuracy of 76%. The Level II classification performance of ANN was poor.