Urban Functional Zone Classification Based on POI Data and Machine Learning

Abstract: The identification of urban spatial functional units is of great significance in urban planning, construction, management, and services. Conventional field surveys are labour-intensive and time-consuming, while the abundant data available via the internet provide a new way to identify urban spatial functions. A major issue in urban functional zone identification using point of interest (POI) data is determining the POI weights. To address this, this work proposes a recognition method based on POI data combined with machine learning. First, the relationship between POI data and urban spatial function types was mapped, and the density of each type of POI was calculated. Then, the density values of each type of POI in the study unit were used as feature vectors and combined with the KStar algorithm to identify urban spatial functions. Finally, the identification results were validated by combining multiple sources of POI data. On the validation sample, the proposed method achieved an accuracy of 86.50%. Because the POI weights are not set by hand, human bias in determining them is avoided, making urban spatial function recognition more accurate and automatable.


Introduction
China's urbanisation rate has been on the increase since the beginning of its reform and opening-up process. Rapid urbanisation has affected the original coordinated and stable urban spatial structure. A lack of foresight in planning and construction has led to problems such as resource shortages, traffic congestion, and environmental pollution. The classification of urban functional zones is a prerequisite for planning and construction [1] and is important for the rational use of urban spatial resources, urban structural optimisation, land resource allocation, and infrastructural layout [2,3]. Conventionally, the identification of urban spatial function requires an extensive field survey, which is time-consuming and laborious. Along with the development of smart cities and geoinformation technology, a large amount of Internet of Things (IoT) and social sensing data have been generated, providing a basis for the dynamic identification of urban spatial functions [4,5]. Many studies have used high-resolution remote sensing images to classify and identify urban functions from the perspective of land use [1,6,7]. Remote sensing data can identify the physical characteristics of features but lack socioeconomic information. Since the identification of urban functions focuses on socioeconomic functions, the social awareness data on the internet are advantageous in this regard [8].
In recent years, various emerging big data sources, including point of interest (POI), cell phone signals, taxi trajectories, social media, and public transport smartcards, have been used to research urban spatial function identification. Cell phone signals provide accurate and complete location information that can be used to identify the spatial and temporal distribution and mobility characteristics of a population [9,10]. Taxi trajectories can reflect the travel characteristics of residents and the conditions of urban traffic operations, which can be used to identify hotspot areas in cities [11,12]. Social media, with their rich semantic information, can be combined with mobile phone location data to study the functions and daily changing patterns of cities [13]. Metro smartcard data have also been used in the literature to explore the spatiotemporal structures of urban space [14]. However, data from cell phones [15], taxi trajectories, social media, and smartcards can only reflect certain characteristics of urban functions. The resulting zoning of urban functional areas is relatively rough: it can only distinguish inner urban areas, occupational and residential areas, and hotspot/coldspot areas, and it cannot finely identify urban spatial structures.
POI is an important spatial information resource. With the development of internet technology, various mapping service providers have collected a large amount of POI data. Compared with other data types, POI data have the advantages of easy access and wide coverage, and contain rich socioeconomic attributes, providing a new perspective for urban analysis research [16]. The applications of POI in urban studies mainly include functional zone classification [17–19], building type identification [20], urban land use monitoring [21–23], commercial retail site selection [24], and demand analysis of public service facilities [25]. To classify urban functional zones, some studies use POI data as the main data source and identify the zones by analysing the spatial distribution characteristics of POI [26,27]. This type of research focuses on how the spatial and semantic features of POIs reflect the functional characteristics of cities. Other studies combine POI data with other data such as cell phone signalling, social media check-in, night-time imagery, and Baidu Heat Map to study urban functional area classification [28–30]. Such methods need to take advantage of the respective strengths of multiple data sources and combine them to correct the recognition results in order to improve the accuracy of functional area classification. In the existing literature, the two methods most often used to identify urban functional zones from POI data are density analysis [31–33] and cluster analysis [21,34]. The density analysis method is easy to understand; it has a fixed basic unit of study, and the effect of the data coverage on the classification results is relatively marginal. Wang et al. used POI and road network data to construct an automatic analysis model of urban functional areas based on the kernel density analysis method and the first law of geography [35]. Hu et al. used Gaode POI data to propose a method for identifying and analysing urban functional areas based on the frequency density and ratio of POI functional types [17].
As far as density analysis is concerned, the identification of urban functional areas based on POI data is usually carried out by first dividing the city into small cells using a grid or neighbourhood boundaries [36] and reclassifying the POI data. Then, the density index and frequency index of each type of POI are calculated for each cell, and the type of functional area the cell belongs to is determined using a threshold [22]. The reasonableness of the weights and index thresholds set for each type of POI can affect the recognition accuracy. Most of these weights and thresholds in the existing research methods are set manually, and the level of automation and intelligence is low. Based on this perspective, this paper addresses the shortcomings of existing research. It proposes a method based on POI and machine learning to improve the automation and intelligence of urban functional zone identification, taking Nanning city as the study area. The research area and data sources are introduced in the second part of this paper, while the third part explores the methods in detail. The fourth part describes an experiment and discusses the extracted results, and our work is concluded in the final part.

Study Area
Nanning is the capital city of the Guangxi Zhuang Autonomous Region. It has a total population of 8.74 million people and is a second-tier city of China. The built-up area covered 320 km² in 2020, making it the largest city in Guangxi. Nanning has good development opportunities as a comprehensive transportation hub and the central city of the Beibu Gulf Economic Zone of southwestern China. The central city of Nanning, bounded by the Outer Ring Expressway and Ring Expressway, was selected as the study area. It comprises six districts under the jurisdiction of Nanning City, i.e., Qingxiu, Xingning, Jiangnan, Xixiangtang, Liangqing, and Yongning Districts, and has a total area of 905 km² (Figure 1).

Data
The POI data used were Baidu POI and Gaode POI, obtained through the internet prior to September 2019. The POI data included the name, latitude, longitude, address, and classification information of each point. A total of 594,039 Baidu POI data points were obtained in 21 primary classification categories and 183 secondary classification categories. A total of 289,813 Gaode POI data points were used and divided into three levels, with 22 primary classification categories and 263 secondary classification categories. The road network data were obtained from the OpenStreetMap website. The POI data used in this paper were obtained by fusing Baidu POI and Gaode POI after processing [37,38].

Overall Design
First, the POI data were pre-processed to reclassify the different functional types of POIs into corresponding urban spatial function types, forming a POI dataset with functional zone classifications. By referring to the existing literature [19,27] and the positional accuracy of the POI data, the study area was divided into 500 × 500 m cells. The kernel density coefficient of each functional type in the central city of Nanning was obtained by zonal statistics, taking the average value of all kernel density raster pixels within each cell. The POI kernel density values of the different functional types were then assigned to each cell of the study area as its functional feature values. The KStar algorithm [39] was used to identify urban spatial function types: samples were selected by manual recognition and divided into training and test samples in order to map the relationship between the feature values and the spatial function category of each unit. This formed an urban functional area type classifier that classifies and identifies the urban spatial function categories. Finally, the results of urban functional area classification were collated to form a visualisation chart and to analyse the spatial function pattern of Nanning city. The technical process is shown in Figure 2.

Data Pre-Processing
Data pre-processing was conducted to ensure the quality of the POI data, and mainly included data cleaning and collation. Data cleaning included the removal of duplicate acquisitions, POI points with low-quality or missing information, and points that pointed to unclear city functions. First, we excluded the POI objects with missing information. Then, the POI data with low public awareness and no city function information, such as data in the categories of entrances/exits, administrative landmarks, event activities, indoor facilities, and other cities, were deleted.
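The cleaning steps described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the field names (`name`, `lat`, `lon`, `category`), the category labels in `EXCLUDED_CATEGORIES`, and the rule that two records with the same name and rounded coordinates are duplicates are all hypothetical and are not taken from the study.

```python
from typing import Iterable

# Hypothetical labels for categories that carry no city-function information.
EXCLUDED_CATEGORIES = {
    "entrance/exit", "administrative landmark", "event activity",
    "indoor facility", "other city",
}
# Hypothetical field names; a record missing any of them is dropped.
REQUIRED_FIELDS = ("name", "lat", "lon", "category")


def clean_pois(pois: Iterable[dict]) -> list:
    """Drop incomplete, function-less, and duplicate POI records."""
    seen = set()
    cleaned = []
    for poi in pois:
        # Step 1: exclude records with missing or empty required fields.
        if any(poi.get(field) in (None, "") for field in REQUIRED_FIELDS):
            continue
        # Step 2: exclude categories with no city-function information.
        if poi["category"] in EXCLUDED_CATEGORIES:
            continue
        # Step 3: treat same name at the same rounded position as a duplicate.
        key = (poi["name"], round(poi["lat"], 5), round(poi["lon"], 5))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(poi)
    return cleaned
```

Rounding coordinates to five decimal places (roughly one metre) is one possible tolerance for merging repeated acquisitions of the same point; a different tolerance may suit other data sources.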
Referring to the Urban Land Classification and Planning and Construction Land Standard (GB50137-2011) and following the principle of universality and consistency in POI classification [40], the urban space functions were divided into six types: residential, commercial services, logistics and storage, road and traffic facilities, green spaces and squares, and public management and services. In this study, POI points involving commercial service exchange activities were categorised as the commercial services function type. The POI points within residential building areas were categorised as the residential function type. The POI points related to government offices and cultural, education, research, sports, medical and health, social welfare, and religious facilities were categorised as the public management and services function type. The POI points related to open spaces such as parks, squares, and attractions in certain urban areas were classified as the green spaces and squares function type. Points related to storage, transit, foreign trade, and supply were classified as the logistics and storage function type. Points related to railway and subway stations and transportation hubs and facilities were classified as the road and traffic facilities function type. The POIs corresponding to the urban space functional categories are shown in Table 1.
Table 1. POIs corresponding to the urban space functional categories.

Functional category: Corresponding POI types
Business and services: Cafes, fast food restaurants, convenient hotels, shopping centres, travel agencies, publishers, companies, banks, cinemas, training institutions, car dealers, etc.
Residence: Real estate, community housing, etc.
Public management and services: Government agencies, administrative units, higher education institutions, elementary schools, secondary schools, exhibition halls, convention centres, museums, stadiums, hospitals, etc.
Green spaces and squares: Parks, scenic spots, tourist attractions, etc.
Logistics and storage: Logistics companies, logistics courier stations, distribution centres, etc.
Road and traffic facilities: Parking lots, driving schools, railway stations, subway stations, bus stations, port terminals, etc.
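As an illustration of how the reclassification in Table 1 might be applied, the following sketch maps a POI type to one of the six functional categories with a lookup table. The dictionary holds only a few of the POI types listed in Table 1; the lower-cased key spelling and the function name `function_type` are assumptions made for this example.

```python
from typing import Optional

# Partial lookup built from Table 1; keys are illustrative lower-cased POI types.
FUNCTION_BY_POI_TYPE = {
    "cafe": "business and services",
    "bank": "business and services",
    "cinema": "business and services",
    "community housing": "residence",
    "government agency": "public management and services",
    "hospital": "public management and services",
    "park": "green spaces and squares",
    "scenic spot": "green spaces and squares",
    "logistics company": "logistics and storage",
    "distribution centre": "logistics and storage",
    "railway station": "road and traffic facilities",
    "parking lot": "road and traffic facilities",
}


def function_type(poi_type: str) -> Optional[str]:
    """Return the functional category for a POI type, or None if unmapped."""
    return FUNCTION_BY_POI_TYPE.get(poi_type.strip().lower())
```

Returning `None` for unmapped types keeps unclassifiable points visible, so that they can be reviewed or dropped during cleaning rather than silently assigned to a category.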

Kernel Density Estimation
The aim of kernel density analysis is to obtain an estimate of each point's density function that can approximate the distribution of the data [41]. It is regarded as an important statistical analysis method for the extraction of distribution features of geospatial facilities. Kernel density estimation methods are widely used in detection studies of urban hotspots [19]. In this paper, we used kernel density analysis to explore the clustering of different types of POI data in Nanning city. We determined the impact of their functional types within each grid cell based on the average kernel density values of the different types of POI data. The formula for calculating the kernel density is shown in Equations (1) and (2).
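For orientation, a commonly used planar form of the kernel density estimator that is consistent with the symbols K, h, and n used in this section can be written as follows. This is a sketch under assumptions: the 1/h² scaling and the quartic kernel follow a convention that is frequent in GIS software, and they are not confirmed as the exact expressions used by the authors. The distance d_i and the argument u are notation introduced here.

```latex
\hat{f}(x) = \frac{1}{h^{2}} \sum_{i=1}^{n} K\!\left(\frac{d_i}{h}\right),
\qquad d_i = \lVert x - x_i \rVert

K(u) =
\begin{cases}
\dfrac{3}{\pi}\left(1 - u^{2}\right)^{2}, & 0 \le u < 1 \\[6pt]
0, & u \ge 1
\end{cases}
```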
where K is the kernel function; h is the search radius (also known as the bandwidth, which refers to an extended width on the surface space where point x is located); and n is the number of element points included in the search of point x within the bandwidth. The core parameter in the kernel density is the distance decay threshold, which is mainly set according to the scale of analysis and the characteristics of the geographical phenomenon being studied. In this paper, a raster cell size of 10 m and an attenuation threshold of 1200 m were selected as the ideal experimental parameters.

Sustainability 2023, 15, 4631
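The two steps described in this section, building a kernel density raster and averaging it within each analysis cell, can be sketched with NumPy as follows. The defaults mirror the parameters stated in the text (10 m raster cells, a 1200 m search radius, 500 m analysis cells). The quartic kernel, the 1/h² scaling, the projected coordinates in metres, and all function names are assumptions for illustration and are not the authors' implementation.

```python
import numpy as np


def quartic_kernel(u):
    """Planar quartic kernel, zero outside the unit disk (assumed kernel)."""
    return np.where(u < 1.0, (3.0 / np.pi) * (1.0 - u**2) ** 2, 0.0)


def kernel_density_raster(points, extent, pixel=10.0, bandwidth=1200.0):
    """Density surface on a regular raster.

    points: (n, 2) array of projected x, y coordinates in metres.
    extent: (xmin, ymin, xmax, ymax) in the same units.
    """
    xmin, ymin, xmax, ymax = extent
    # Pixel-centre coordinates; row 0 is the lowest y value.
    xs = np.arange(xmin + pixel / 2, xmax, pixel)
    ys = np.arange(ymin + pixel / 2, ymax, pixel)
    gx, gy = np.meshgrid(xs, ys)
    density = np.zeros_like(gx)
    for px, py in points:
        dist = np.hypot(gx - px, gy - py)
        density += quartic_kernel(dist / bandwidth)
    return density / bandwidth**2


def cell_means(density, pixel=10.0, cell=500.0):
    """Average the raster pixels inside each square analysis cell."""
    k = int(round(cell / pixel))
    rows, cols = density.shape[0] // k, density.shape[1] // k
    # Discard partial cells at the upper and right edges.
    trimmed = density[: rows * k, : cols * k]
    return trimmed.reshape(rows, k, cols, k).mean(axis=(1, 3))
```

Running `kernel_density_raster` once per functional type and stacking the six `cell_means` outputs would give one six-value feature vector per 500 m cell. The loop over points is kept simple for readability; a large POI set would call for a spatial index or a convolution-based approach.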

Identification of Urban Spatial Functions Using KStar Classifier
After calculating the various types of POI densities in the grid cells, a common method is to perform weighted summation to obtain the total density, and then determine the matching results by threshold judgment [42]. This requires considerable manual intervention when setting the weights and thresholds, and the settings must be reworked for each new data scenario. Therefore, a machine learning-based classifier for urban spatial functions was designed.
KStar is an instance-based machine learning classifier that uses an information entropy-based distance function. It defines a finite set of transformations, and the distance between two instances is the cost of transforming one instance a into another instance b. The KStar algorithm performs type prediction by selecting the parameters and a distance measure [41]. For any function P*, the effective number of instances can be expressed by Formula (3):

$$ n_0 \le \frac{\left(\sum_{b} P^{*}(b \mid a)\right)^{2}}{\sum_{b} P^{*}(b \mid a)^{2}} \le N \qquad (3) $$

where N is the total number of training samples and n0 is the number of training samples with the smallest distance from sample a. The probability that sample a belongs to category C is computed by summing the probabilities from a to each training sample belonging to C, using Formula (4):

$$ P^{*}(C \mid a) = \sum_{b \in C} P^{*}(b \mid a) \qquad (4) $$

The probabilities of each category are computed and normalised to relative probabilities, giving an estimate of the category distribution at the sample point represented by a.
In this work, the input vector of KStar was defined as a six-dimensional vector whose components are the kernel density values, within each grid cell, of the POIs corresponding to the six urban space types: commercial services, residential, public management and services, green spaces and squares, logistics and storage, and road and traffic facilities. The output categories are the same six urban space types.
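To make the category-probability step concrete, the sketch below implements a simplified, KStar-style instance-based classifier for six-dimensional density vectors. It is a stand-in rather than the KStar algorithm itself: the transformation probability P*(b|a) is replaced by an exponential decay of the summed per-feature distance with a fixed `scale`, whereas KStar selects that scale through its blend parameter. The function name and the `scale` parameter are hypothetical.

```python
import numpy as np


def kstar_like_predict(train_x, train_y, query, scale=1.0):
    """Return (predicted_label, class_probabilities) for one query vector.

    train_x: (N, 6) array-like of per-cell kernel density features.
    train_y: sequence of N category labels.
    query:   length-6 feature vector of the cell to classify.
    scale:   assumed fixed decay scale (KStar would derive it itself).
    """
    train_x = np.asarray(train_x, dtype=float)
    query = np.asarray(query, dtype=float)
    # Stand-in for P*(b|a): decays with the summed absolute feature difference.
    dist = np.abs(train_x - query).sum(axis=1)
    p = np.exp(-dist / scale)
    totals = {}
    for label in sorted(set(train_y)):
        mask = np.array([y == label for y in train_y])
        # Category score: sum of probabilities over the members of the category.
        totals[label] = float(p[mask].sum())
    norm = sum(totals.values())
    probs = {label: total / norm for label, total in totals.items()}
    return max(probs, key=probs.get), probs
```

Because every training sample contributes to the score, the prediction does not depend on a fixed number of neighbours; the `scale` value controls how quickly distant samples stop mattering. In practice the features would usually be normalised first so that no single POI type dominates the distance.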

POI Data
The experiment used POI data obtained from the internet. The POI data points were mainly distributed along both sides of the Yongjiang River Basin in the western part of the centre of Nanning city, as shown in Figure 3.

Results of Kernel Density Analysis
The natural breaks method was chosen to classify the kernel density values into five display grades. The results of the kernel density analysis are shown in Figure 4. The kernel density diagram shows the distribution of hotspots for each function in the city. Figure 4 shows that the main hotspot areas were concentrated in the central and central-western parts of the study area on both sides of the Yongjiang River. The kernel density rasters for the various functions were then aggregated to the cell grid to obtain the average kernel density value of each function type in each cell.

Training the Classifier
This experiment used remote sensing imagery, in which regional units with obvious spatial functional characteristics were manually selected. For example, large shopping malls were selected as samples of the commercial services function, residential areas for the residential function, universities and centralised government offices for the public management and services function, large logistics parks for the logistics and storage function, parks and attractions (such as the scenic area of Qingxiu Mountain in Nanning) for the green spaces and squares function, and Nanning East Station and the traffic stations in the suburbs for the road and traffic facilities function. The samples were scattered throughout the study area as much as possible, and units containing only one type of data were also selected. We selected 207 functional units as samples. To ensure a balanced distribution of samples for each function type, 60% of the samples were selected as the training set and the remaining 40% were used as the test set. After debugging and sample selection adjustment, the correlation coefficient of the training model reached 0.9901, and the recognition accuracy on the test sample set was 80.67%.
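One way to obtain a 60/40 split that keeps every function type represented in both sets is to split each category separately. The sketch below shows this using only the standard library. Stratifying by category is an assumption about how the balance was achieved, and the function name, the `(features, label)` sample layout, and the `seed` parameter are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(samples, train_fraction=0.6, seed=0):
    """Split (features, label) samples so each label keeps the same ratio."""
    by_label = defaultdict(list)
    for sample in samples:
        by_label[sample[1]].append(sample)
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label]
        # Shuffle within the category so the selection is not order-dependent.
        rng.shuffle(group)
        cut = round(len(group) * train_fraction)
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test
```

Fixing the seed makes the split reproducible, which helps when comparing classifier settings on the same training and test samples.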

Results of Urban Functional Zone Identification
The results of urban spatial function identification in Nanning are shown in Figure 5. Statistics for the functional area types are shown in Table 2. The results of functional zone identification show that the urban space of Nanning is dominated by commercial services and residential functions, followed by public management and services. Qingxiu Mountain Park, Shishan Park, Wuxiang Lake Park, Jiangnan Park, etc. were identified as green spaces and squares; the major logistics parks in Nanning were identified as logistics and storage; and Nanning East Station, Langdong Passenger Terminal, etc. were identified as the road and traffic facilities function type.

Verification of the Results
To verify the accuracy of the function identification results, we randomly selected samples with various types of functions, avoiding the originally selected samples, and used them for comparison and verification. In this work, a sample of 237 cells was randomly selected for testing. The test sample contained six functional area types and was distributed across different areas. After manual verification, the number of grid cells that were correctly identified was 205, and the accuracy of automatic identification reached 86.50%. The accuracy of sample identification for each functional type is presented in Table 3.
Misclassification occurred most readily among the public management and services, residential, and commercial services functions, especially in the central part of the study area. In practice, these three types of urban land and space functions are generally accompanied by human activities that feature a high degree of mixing. The zones identified as serving mainly residential functions also contain commercial services, public management and services, road and traffic facilities, and logistics and storage functions. Commercial services areas with large shopping centres may likewise contain residential areas, as well as public management and services, road and traffic facilities, and logistics and storage function POIs. When an area with park attractions is identified as having the green spaces and squares function, it may also contain commercial services, public management and services, and road and traffic facilities function POIs.
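The overall accuracy reported in this section follows directly from the counts given in the text (205 correctly identified cells out of 237). The short sketch below reproduces that arithmetic and shows how per-type accuracies of the kind listed in Table 3 could be computed from paired labels. The function names are illustrative, and no per-type counts from the study are assumed.

```python
from collections import Counter


def accuracy(correct: int, total: int) -> float:
    """Overall accuracy as a percentage."""
    return 100.0 * correct / total


def per_class_accuracy(true_labels, predicted_labels):
    """Percentage of correctly identified cells for each true function type."""
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {label: 100.0 * hits[label] / totals[label] for label in totals}


print(f"{accuracy(205, 237):.2f}%")  # prints "86.50%"
```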
The POI-based function results were compared with Baidu map data. Figure 6 shows a Baidu map of the scenic area of Nanning City Expo Park and its parking lot, which contains points with the functions of green spaces and squares, commercial services, public management and services, and road and traffic facilities. The identification of the green spaces and squares and the road and traffic facilities functions from the POI data is consistent with the actual situation.
As shown in Figure 7, Yudong Logistics Port in Liangqing District of Nanning City is an area with obvious logistics and warehousing functions. Yuan Sheng City is not only a large building materials and furniture market but also a storage and logistics base, integrating commercial and logistics services. The automatic POI-based identification results are in line with the actual situation.
As shown in Figure 8, the Guangxi University for Nationalities (Xiangsi Lake Campus) is a public management and services function area for science and education. The campus serves primarily as a point of public management and services, but also has some commercial service points and a small number of logistics and storage and road and traffic facilities POIs. The results of POI identification are consistent with the actual situation in terms of functional area identification.
Figure 9 shows the area of Golden Lake Plaza (Jinhu Square) and Minge Lake, with part of the Convention and Exhibition Centre in the lower-right corner. Jinhu Square not only has a commercial services function but also contains a government office area and some residential areas, which give it public management and services and residential functions as well. It mostly consists of commercial buildings and office buildings. The middle-right part of this area is Minge Lake Square and the bottom-right location is Nanning International Convention and Exhibition Centre, with the functions of green spaces and squares and public management and services, respectively. The functional zone identification results are consistent with the actual situation.
As shown in Figure 10, according to Baidu map observation, most of the area is the Fengling residential area (residential function). The upper-right area contains the Fengling Campus of Nanning Second High School (public management and services function), the lower-right area is Nanning Langdong Bus Terminal (road and traffic facilities function), and the lower-left area contains the ASEAN business district (green spaces and squares and commercial services functions). The unit on the lower left involves two functions; the POI-based identification returned the dominant one, commercial services, which matches the actual situation well.
Figure 11 shows the area near Nanning Railway Station, which has the road and traffic facilities function. Nanning Railway Station is located in the old city and is small in size; this area contains a large number of POIs of various functional types. The train station is located at the junction of four cells, which affects the identification of the dominant function. The four cells have a mix of many functions and were identified as residential and commercial services functions. The road and traffic facilities function was identified only for the Nanning railway locomotive depot to the left of the railway station, which occupies a complete cell. The identification of urban functional areas may be interfered with by other data, and areas with a long development history contain a large amount of information. Some functions may be masked by other functions due to the influence of cell division.

Comparison with Traditional Methods
For comparison, this work applied a weighted quantitative identification method to the same POI data, using kernel density analysis. An essential task in this method is the weighting of the POI data: it takes into account the different contributions of different POI types to the identification of urban functional zones and assigns a weight to each POI type according to the space occupied by the POI and its public awareness. The weighting of POI data is often combined with other methods for the classification of urban functional zones. Yang et al. combined POI weighting with kernel density analysis and OSM road network segmentation [33]. Jiang et al. combined weighted quantitative identification with multi-level grids, LDA models, and kernel density analysis [31]. Although the methods used in these two papers are not identical, both are based on the weighting of POI data, which depends on the chosen classification of urban functional zone types and on the actual POI data. Because the main objective of this paper was to avoid the impact of manual POI weighting on classification accuracy, the comparison method combined POI weighting with kernel density analysis only. Following refs. [31,33], weights were assigned to the POI data in light of the data used; the POI functional category weights are listed in Table 4. The classification results were also verified on a sample of 237 cells, and the recognition accuracy was 52.32%. Table 5 compares the recognition accuracy of the weighted quantitative recognition method with that of the method proposed in this work. The recognition accuracy of the machine learning KStar algorithm proposed in this paper was significantly higher than that of the traditional weighted quantitative recognition method.
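As a rough illustration of the weighted quantitative baseline, the sketch below multiplies each functional type's POI kernel density in a cell by a manually chosen weight and takes the type with the largest weighted share as the cell's function. The weights and density values are hypothetical placeholders, not the values of Table 4, and the function name is an assumption made for illustration only.

```python
# Hypothetical weights per functional type; NOT the values from Table 4.
WEIGHTS = {
    "commercial": 1.0,
    "residential": 5.0,
    "public": 3.0,
    "logistics": 8.0,
    "green": 10.0,
    "transport": 6.0,
}


def weighted_dominant_function(densities, weights=WEIGHTS):
    """Return (function type, weighted share) with the largest weighted density.

    densities: mapping of functional type -> mean POI kernel density in a cell.
    Returns (None, 0.0) when the cell has no POI density at all.
    """
    weighted = {k: densities.get(k, 0.0) * w for k, w in weights.items()}
    total = sum(weighted.values())
    if total == 0:
        return None, 0.0
    best = max(weighted, key=weighted.get)
    return best, weighted[best] / total


# Hypothetical cell: commercial POIs are densest before weighting.
cell = {"commercial": 12.0, "residential": 3.0, "green": 0.5}
label, share = weighted_dominant_function(cell)
print(label, round(share, 3))
```

With these placeholder weights the cell is labelled residential even though its raw commercial density is highest, which shows how strongly the outcome of this baseline depends on the manually assigned weights.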

Table 5, proposed-method row: the method proposed in this paper; 237 sampled cells; 205 correctly identified; accuracy 86.50%.

Analysis of Urban Space Functions
A neighbourhood is usually an area of a city enclosed by roads and is the basic building block of the urban fabric. The neighbourhoods were overlaid with the grid cells, and the functional area type covering the largest proportion of each neighbourhood's area was taken as its dominant function. The results are shown in Figure 12. This visualisation was analysed in conjunction with the results of the Nanning City District Functional Area Identification and the Nanning City Master Plan (2011-2020).
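The overlay rule described above can be sketched as follows. This is a minimal illustration that assumes the intersection areas between a neighbourhood and the classified grid cells have already been computed (for example in a GIS); the function name and the sample areas are hypothetical.

```python
from collections import defaultdict


def dominant_function(overlaps):
    """Pick the functional type covering the largest share of a neighbourhood.

    overlaps: iterable of (function_type, overlap_area) pairs, one per grid
    cell piece that intersects the neighbourhood (areas in a common unit).
    Returns (None, 0.0) when no area overlaps.
    """
    area_by_type = defaultdict(float)
    for function_type, area in overlaps:
        area_by_type[function_type] += area
    total = sum(area_by_type.values())
    if total == 0:
        return None, 0.0
    best = max(area_by_type, key=area_by_type.get)
    return best, area_by_type[best] / total


# Hypothetical neighbourhood intersecting four classified cells (areas in m^2).
pieces = [
    ("residential", 40_000.0),
    ("commercial", 55_000.0),
    ("residential", 30_000.0),
    ("green", 10_000.0),
]
result = dominant_function(pieces)
print(result)
```

Here no single piece is larger than the commercial one, yet the two residential pieces together cover the largest share, so areas are summed per type before the comparison.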

• Commercial and services function
According to the distribution of urban spatial functions in Nanning, the commercial and services function areas are concentrated and contiguous. The commercial and services industries are mainly distributed in the peripheral ring near the outer ring highway and in concentrated lines along the city's main roads. In the city centre, the commercial services function is mainly located at Chaoyang Square, Jinhu Square, Hang Yang City, and Wanxiang City, where the commercial services industry is concentrated in one block. To the southwest lie mainly development areas with business services functions, such as the science and technology park and the industrial park in Wuxiang New Area, while the southeast mainly features the China Union Industrial Park. The business services function in the northeast mainly comprises the Nanning City Agricultural Products Trade Centre, Outlet, and some companies and factories, and the remaining peripheral areas mainly contain companies, processing plants, and other business services functions.

• Residential function
The rational planning of residential areas is particularly important among urban land and spatial function areas. It should consider urban residents' requirements for the residential environment, facilities, and comfort, as well as the city's sustainable development. The residential function areas in Nanning are mainly located in areas with convenient transportation, and their surroundings offer not only a full range of commercial and services functions, public administration, and public services, but also a certain amount of greenery and open space.

• Public management and services function
The public management and services function includes government offices, cultural sites, education and research institutions, sports facilities, medical and health facilities, social welfare facilities, and religious facilities. A rational layout of these functions can provide good services and security for urban residents. Figure 12 shows that the public management and services functions in Nanning City cover a wide area and generally conform to the overall urban plan.

• Logistics and storage function
The logistics and storage function mainly refers to local reserve, transit, foreign trade, and supply areas that facilitate the material operations of the city. Because such areas handle the storage on which the functioning of the city depends, they affect the city's inhabitants, traffic, and environment to varying degrees. Figure 12 shows that the logistics industry in Nanning City is concentrated in small areas chosen with regard to the coordination between water, land, and air transport. This supports economic development while avoiding the urban centre, and full use is made of the edge of the city to minimise impacts on urban residents, traffic, and the environment.

• Green spaces and squares function
Green spaces and squares improve the urban landscape and provide recreational space for urban residents, increasing their well-being. Figure 12 shows that green spaces and squares function areas of different sizes are distributed across the central area of Nanning City. Some areas along the Yongjiang River lack this function, but overall Nanning has a good number of such areas.

• Road and traffic facilities function
As shown in Figure 12, the main areas of the transportation function in Nanning city are distributed in Anji, Fengling North, and Yongning districts.These are all areas that are convenient for interconnection within and outside the city, and for future expansion.Thus, it can be seen that the functional layout of the major transportation facilities in Nanning city is reasonable.

Discussion
The classification of urban spatial functions aims to identify the economic and social functions undertaken by urban spatial units and to provide a basis for spatial decision-making in cities. The research results of this work can be useful not only for government applications, such as urban physical examination assessment for territorial spatial planning and public service facility layout, but also for commercial applications, such as commercial site selection [43] and customer demand prediction [44].
The classification method for urban functional zones based on POI data and machine learning proposed in this work has the following features: (1) POI data are used as the data source and are easily available through the internet. (2) The classification method is highly automated: only a small number of samples need to be selected and combined with the Kstar algorithm to build a classification model, which establishes the mapping between POI kernel density features and city functions so that the city's functional zones can be identified automatically. As a result, the manual assignment of POI-type weights required by the traditional method is avoided. (3) The method generalises well. Because POI distribution characteristics and urban spatial patterns differ between cities, a classifier is trained for each city from a small number of that city's samples, so that it adapts to the city's characteristics and can then be used to classify its functional zones.
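The idea of learning the mapping from per-type POI kernel densities to a function label from a few labelled cells can be sketched with a simple instance-based classifier. Note the assumption: a plain nearest-neighbour rule with Euclidean distance stands in for KStar here, because KStar itself relies on an entropy-based distance that is not reproduced in this sketch. The feature values and function names are hypothetical.

```python
import math


def nearest_neighbour_predict(train, query):
    """Label a cell with the class of its closest training sample.

    train: list of (feature_vector, label) pairs; each feature vector holds
    the per-type POI kernel densities of one labelled cell.
    This is a stand-in for KStar, not the KStar algorithm itself.
    """
    _, label = min(train, key=lambda item: math.dist(item[0], query))
    return label


# Hypothetical training cells: (commercial, residential, green) densities.
train = [
    ((9.0, 1.0, 0.2), "commercial"),
    ((1.5, 8.0, 0.5), "residential"),
    ((0.3, 0.6, 6.0), "green"),
]
predicted = nearest_neighbour_predict(train, (7.0, 2.0, 0.1))
print(predicted)
```

No weights are assigned by hand in this scheme: the relative importance of each POI type is implied by the labelled samples, which is the property the proposed method relies on.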
A shortcoming of this method is that it uses only one type of data, POI, to classify functional areas and does not integrate multiple data types. Data such as cell phone signalling and taxi trajectories reflect human activity characteristics and can be used to analyse urban occupancy and traffic characteristics, assisting the identification of urban functional areas. However, such data are difficult and expensive to obtain. In future studies, cell phone signalling and vehicle trajectory data will be used to study the commuting space of the city, providing richer support for spatial decision-making in the city. In addition, the types of urban functional areas will be refined, including mixed functional areas and special functional areas.

Conclusions
Identifying spatial functional zones is important in supporting urban planning decision-making, and using internet big data to identify urban spatial functions is a current research hotspot. This study proposed a recognition method that combines POI data and machine learning to address the existing problems of low accuracy and poor automation in functional area recognition. A 500 m × 500 m grid was used to divide the study area of central Nanning City. The kernel densities of the different functional area types within each grid cell were obtained by kernel density analysis of the POI data and used as the feature values of the cell. The relationship between feature value information and cell spatial function categories was mapped using the KStar algorithm, and an urban functional zone classifier was constructed to classify and identify spatial function categories in the central area of Nanning. Finally, the spatial function classification of randomly selected grid cells in central Nanning was compared with satellite images from the Baidu electronic map.
The experimental results demonstrated that the recognition accuracy of the proposed method reached 86.50%, about 34.18 percentage points higher than the 52.32% of the traditional weighted quantitative recognition method. This shows that the proposed method can effectively improve the accuracy of urban functional zone classification and that urban functional area identification based on POI data and machine learning is feasible. The data are easy to obtain, and the method avoids human bias to a certain extent, thereby providing realistic identification of functional areas.
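The reported accuracies can be checked with a few lines of arithmetic. The sample size (237) and the number of correctly identified cells for the proposed method (205) are taken from the comparison of methods; the count for the weighted baseline is not reported, so the value 124 below is an inference, being the only integer for which the ratio to 237 rounds to 52.32%.

```python
SAMPLED_CELLS = 237
CORRECT_PROPOSED = 205  # reported for the proposed method

# Not reported in the text: inferred, since only 124/237 rounds to 52.32%.
CORRECT_BASELINE_INFERRED = 124


def accuracy_percent(correct, total):
    """Recognition accuracy as a percentage of the sampled cells."""
    return 100.0 * correct / total


proposed = accuracy_percent(CORRECT_PROPOSED, SAMPLED_CELLS)
baseline = accuracy_percent(CORRECT_BASELINE_INFERRED, SAMPLED_CELLS)
gain = proposed - baseline  # difference in percentage points

print(f"{proposed:.2f} {baseline:.2f} {gain:.2f}")  # prints "86.50 52.32 34.18"
```

The gain is a difference between two percentages, so it is best read as percentage points rather than as a relative improvement.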
By using the method proposed in this work to identify the urban functional zones in Nanning, the following conclusions can be drawn: commercial and services functions are concentrated in Nanning, while residential functions are mainly distributed in areas with convenient transportation and public administration. The public service function covers a wide area of the city, and the logistics and storage functions are distributed at the city's edges. The transportation functions are distributed in areas with convenient interconnection between the city and outside the city, and green

Figure 1. The study area within Nanning city.
different functional types in the central city of Nanning were obtained by zonal statistics on the kernel density raster data, taking the average value of all pixels within each cell. The POI kernel density values of the different functional types were assigned to each cell of the study area as its functional feature values. The Kstar algorithm [39] was used to identify urban spatial function types: samples were labelled by manual interpretation and divided into training and test sets to map the relationship between the feature value information and the unit spatial function categories. This formed an urban functional area type classifier to classify and identify the urban spatial function categories. Finally, the results of the city function area classification were collated into a visualisation chart to analyse the spatial function pattern of Nanning city. The technical process is shown in Figure 2.

Figure 2. Technical process of the data analysis used in this study.

Figure 4. Kernel density of each function type: (a) business and services, (b) residence, (c) public management and services, (d) logistics and storage, (e) green spaces and squares, and (f) road and traffic facilities.

Figure 5. Identification of urban spatial functions in Nanning.

Figure 6. Identification results for Nanning City Garden Expo Park. (a) Baidu map, (b) satellite image, and (c) recognition results.

Figure 9. Golden Lake Square and Folk Song Lake Square. (a) Baidu map, (b) satellite image, and (c) recognition results.

Figure 12. Distribution of urban spatial functions in Nanning.

Table 2. Statistics on the spatial function types in Nanning.

Table 3. Identification accuracy for each function type.

Table 5. Comparison of the experimental results of the different methods.