1. Introduction
Urban functional zones (UFZs) are closely related to human life and production. Pacione [1] associates urban form and function with human economy and culture, arguing that urban development in different countries is influenced by cultural backgrounds and stages of development. Johansson [2] suggests that UFZs can reflect the concentration of people and activities. UFZs are a fundamental spatial unit for urban planning and management, and their types are determined by residents’ use of urban space and the human activities that occur inside [3,4,5]. In this paper, an urban functional zone is defined as a spatial unit of human activities within a metropolitan area that is dominated by a specific function. With the rapid development of the economy and urbanization, urban compactness has increased [6], and various functions have been integrated within the same zones. The urban spatial structure is heterogeneous, complex, and diverse [7,8], which makes UFZs difficult to identify. A timely and in-depth understanding of urban functions helps to solve urban development problems, such as urban sprawl, chaotic functional layouts, and declining urban livability [9,10,11], and to promote sustainable urban development. Some research has used remote sensing images to identify land use or urban functions [12,13,14,15], and these methods are effective in identifying simple land covers by extracting surface physical properties of the objects, such as spectral and textural features [16,17,18]. Other studies classify UFZs based on the morphological features of buildings [19,20]. However, these methods only consider the natural attributes of the urban space, neglecting the economic and human activity characteristics that are closely related to urban functions [21,22].
The advent of the big data era has brought new opportunities for in-depth perception and understanding of UFZs. Geospatial big data contains abundant semantic information about socio-economic and human activities, which can adequately reflect the patterns and preferences of human activities and is conducive to describing and distinguishing complex UFZs [23,24]. Many studies have recognized and analyzed UFZs using geospatial big data, such as points of interest (POIs) [22,25,26,27], mobile phone location data [5,28], trajectories of floating cars [29,30,31], and social media data [32,33]. Soto et al. [34] used Madrid’s cell phone records to classify land use types. Becker et al. [35] used cellular network data to analyze people flow and identify park and residential zones. Frias-Martinez et al. [36] used geolocated tweets generated by mobile social media applications to perceive urban land use. Barlacchi et al. [37] extracted the hierarchical structure features of POIs to model city zones. Among these data sources, POI data have been used widely due to their comprehensiveness and accessibility; for example, Niu and Silva [38] used POI data to infer UFZs in London, and Yao et al. [27] introduced the Word2Vec model to explore the relationship between the spatial distribution patterns of POIs and UFZs.
To obtain more information, some researchers have also integrated multi-source data. Crooks et al. [4] presented a bottom-up approach to capture a city’s form and function using open-source and volunteered datasets. Han et al. [39] identified UFZs based on bus smart card data and POIs. Feng et al. [40] proposed an SOE (scene–object–economy)-based framework to identify UFZs, which integrates scene features, object features, and economy features. Ye et al. [41] revealed urban functions by integrating social media data and street-level images. Jendryke et al. [42] combined remote sensing and social media data to classify possible land cover types. Du et al. [29] applied the latent Dirichlet allocation (LDA) model to POI data, taxi trajectory data, and bicycle stock data to identify functional regions. Thakur et al. [43] used social media and geolocation sensor data to study population dynamics and land classification. Yang et al. [24] combined morphological features extracted from buildings and socioeconomic features of POIs to classify UFZs. The above studies show that different geospatial data have their own advantages and disadvantages [44]. Integrating multi-source data can provide more detailed information for inferring UFZs [45].
Although the above studies have achieved remarkable performance, there is still room for improvement. First, previous studies have mainly focused on static urban features and have failed to capture the dynamic features of urban functions driven by human activities; only a few consider the temporal, spatial, and semantic features of human activities simultaneously. Second, the extracted features are simply concatenated or stacked [24,46] and then fed directly to the classifier, which cannot adequately exploit the complementary strengths of the features. Therefore, a rational ensemble learning framework needs to be designed for feature integration.
To alleviate the above issues, we propose a data-synthesis-driven approach to recognize UFZs. Our method integrates POI data and social media check-in data to extract dynamic human activity and socio-economic features from geospatial big data from three perspectives: temporal, spatial, and semantic. Specifically, we extract spatial semantic features from POI data using the Place2Vec model, which employs the nearest neighbor approach and a distance augmenting factor to construct the training dataset, and then use the Skip-Gram training framework to obtain high-dimensional feature vectors of the POIs. We adopt the LDA-Doc2Vec model to obtain the activity semantic features of social media check-in tweets and adopt statistical analysis to extract the activity temporal features of check-in time series. Finally, a weighted stacking ensemble (WSE) classification model is built to recognize the UFZs. The contributions of this study are as follows: (1) a new data-synthesis-driven approach is developed for comprehensive perception and understanding of UFZs by integrating dynamic and static human activity data; (2) a WSE classification model is built that combines the strengths of different classifiers and features to obtain the final UFZ classification result, improving recognition accuracy; (3) our method recognizes the dominant function and the proportions of urban functions within each analysis unit and analyzes their spatial patterns and degree of mixing. The results provide insights for urban planning and decision making.
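As a rough illustration of how nearest-neighbor context pairs for a Skip-Gram model might be assembled from POIs, the sketch below pairs each POI category with the categories of its k nearest neighbors and repeats nearer neighbors more often. The function name `knn_training_pairs`, the `max_repeat` parameter, and the rank-based repeat rule are hypothetical stand-ins; the exact form of the distance augmenting factor is not given in this section.

```python
import math


def knn_training_pairs(pois, k=2, max_repeat=3):
    """Build (center, context) category pairs from each POI's k nearest neighbours.

    pois: list of (category, x, y) tuples in a projected coordinate system.
    Nearer neighbours are repeated more often; this rank-based repeat rule is a
    hypothetical stand-in for the paper's distance augmenting factor.
    """
    pairs = []
    for i, (cat_i, xi, yi) in enumerate(pois):
        # Distances from POI i to every other POI, nearest first.
        neighbours = sorted(
            (math.hypot(xi - xj, yi - yj), cat_j)
            for j, (cat_j, xj, yj) in enumerate(pois)
            if j != i
        )[:k]
        for rank, (_dist, cat_j) in enumerate(neighbours):
            # The nearest neighbour gets max_repeat copies, then one fewer per rank.
            repeats = max(1, max_repeat - rank)
            pairs.extend([(cat_i, cat_j)] * repeats)
    return pairs


# Toy POIs on a line (illustrative data, not from the study).
pois = [
    ("restaurant", 0.0, 0.0),
    ("shop", 1.0, 0.0),
    ("school", 5.0, 0.0),
]
pairs = knn_training_pairs(pois, k=2, max_repeat=2)
```

The resulting pairs could then be passed to any Skip-Gram implementation as center and context tokens; the brute-force neighbor search is only suitable for small examples.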
3. Results
We extracted 20% of the TAZs in our study area to train the WSE model and evaluate the results. Sample selection and labeling have a great impact on the UFZ identification results. The annotation method follows Zhang et al. [28] and is further validated using the EULUC-China map [54], remote sensing imagery, and electronic maps.
3.1. Results of the Proposed Method
3.1.1. Visualization of Human Activity Features
The proposed method extracts three kinds of features: spatial semantic features, activity temporal features, and activity semantic features. In the Place2Vec model, the k of the nearest neighbor method is set to 150, the length of the feature vector to 70, the number of negative samples to 5, and the minimum word frequency to 1. For the LDA-Doc2Vec model, the number of topics is set to 50, the vector size to 50, the context window size to 2, the minimum word frequency to 2, and the number of parallel training workers to 4. We used t-SNE [55] to reduce the dimensionality of the three feature vectors and project them onto a 2D plane. The results are shown in Figure 6, where some typical locations within the Fifth Ring Road of Beijing are labeled (see Figure 1 for the abbreviations of the places).
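For readers who want to reproduce these settings, the reported hyperparameters can be collected in a small configuration fragment. The sketch below keys them by gensim-style argument names; the use of gensim is an assumption, since the text does not name its library, and `k_nearest` and the two dictionary names are hypothetical labels.

```python
# Hyperparameters reported in the text, keyed by gensim-style argument names.
# Assumption: gensim is not named in the paper; the keys are for illustration.
PLACE2VEC_PARAMS = {
    "k_nearest": 150,    # neighbours per POI for training pairs (not a gensim argument)
    "vector_size": 70,   # length of the POI feature vector
    "negative": 5,       # negative samples per positive pair
    "min_count": 1,      # minimum word frequency
    "sg": 1,             # 1 selects Skip-Gram in gensim's Word2Vec
}

LDA_DOC2VEC_PARAMS = {
    "num_topics": 50,    # number of LDA topics
    "vector_size": 50,   # Doc2Vec vector length
    "window": 2,         # context window size
    "min_count": 2,      # minimum word frequency
    "workers": 4,        # parallel training workers
}
```

Mapping the reported number of training parallels to `workers` is an interpretation of the text rather than something it states.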
Figure 6 shows that TAZs with the same functional type are clustered in the vector space, and that different features have different distribution characteristics in the 2D plane. The spatial semantic features of commercial areas, residential areas, industrial areas, and administration and public service areas show an aggregated distribution (Figure 6a), indicating that spatial units with the same functional type have similar spatial distribution characteristics. The activity temporal features (Figure 6b) are relatively dispersed in the vector space. However, the distances between typical locations of the same functional type are quite close, indicating a mixing of functions within the TAZ. This finding helps to identify the proportion of functional types in UFZs. The activity semantic features of commercial areas, green space and square areas, working areas, and administration and public service areas are aggregated into different clusters (Figure 6c). This suggests that activity semantic features are better able to recognize green space and square areas than spatial semantic features. We speculate that people prefer to post on social media platforms when sightseeing and traveling, which makes this feature more prominent. Overall, the three features reveal different aspects of the relationship between human activities and urban functions, and all of them are conducive to the identification of UFZs.
3.1.2. Spatial Distribution of Urban Functional Zones
Figure 7 shows the proportions of the five urban functional types and the dominant functional type in each TAZ.
As shown in Figure 7, different types of functions within the Fifth Ring Road of Beijing have different spatial distribution characteristics. Commercial areas include business circles, shopping centers, convenience stores, and comprehensive markets that cater to the daily needs of residents. These areas are mainly located in the eastern part of the study area, as well as near the Third and Fourth ring roads and the major urban thoroughfares (Figure 7a).
Administration and public service areas reflect the spatial distribution and configuration of public service facilities such as education, healthcare, and government offices. The results indicate that these functions are mainly concentrated in the central and northwestern parts of the study area (Figure 7b). A comparison with electronic maps shows that the northwestern part of the study area is home to several universities and research institutions, including Peking University, Tsinghua University, and the Chinese Academy of Sciences, while the central part mainly houses national administrative institutions. According to the “Beijing Urban Master Plan (2016–2035)”, the center of the study area is the core zone of the capital, serving as the national center for politics, culture, and international exchange. This aligns with our findings.
Residential areas comprise the largest proportion of the study area (see Figure 7c) and display a distribution pattern that radiates outward from the center. Within the Second Ring Road, residential areas are relatively sparse and are often surrounded by facilities such as supermarkets and hospitals.
Green space and square areas, including recreational zones such as scenic spots and parks, are mainly situated on the outskirts of the study area.
Industrial areas, which include facilities such as companies, industrial parks, and factories, are primarily located on the boundaries of the study area, beyond the Fourth Ring Road and far from the city center. This distribution pattern is closely related to Beijing’s urban planning: it helps reduce the impact of industrial pollution on the environment and of noise on human life, while also lowering the cost of production.
In summary, only a few TAZs within the study area serve a single function; most exhibit a mixed-use pattern. Functional zones dominated by residential functions are the most numerous, while those dominated by industrial functions are the fewest. Dongcheng, Xicheng, and Haidian districts mainly focus on administration and public service functions, Chaoyang District mainly focuses on commercial functions, and industrial functions are mainly distributed in Fengtai District.
3.2. Validation and Comparison
3.2.1. Different Data Combinations
We first compared the classification accuracy of different data combinations. Based on the data sources, we defined three combinations: (I) POI data (spatial semantic feature, SSF); (II) check-in data (activity temporal feature and activity semantic feature, ATF+ASF); (III) POI + check-in data (SSF+ATF+ASF). We used 70% of the samples for training and the remaining 30% for testing. The number of decision trees in the RF model was set to 100.
Table 2 presents the confusion matrices for the different data combinations. POI data performed best at classifying residential and commercial areas, while check-in data were most effective for recognizing green space and square areas. This indicates that POI data and social media check-in data reveal dynamic human activity characteristics from different perspectives. When POI data and social media check-in data were combined, the model’s recognition capability improved.
Table 3 presents the overall accuracy and kappa coefficients. Each model was run 100 times, and the average value was taken as the final result. The overall accuracy (OA) of our method was 81.24%, which was 4.35% higher than with POI data alone and 21.13% higher than with check-in data alone. The results indicate that combining spatial semantic features obtained from POI data with activity temporal features and activity semantic features obtained from social media check-in data can improve classification accuracy.
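To make the two reported metrics concrete, the following minimal sketch computes overall accuracy and Cohen's kappa from a confusion matrix. The function name `overall_accuracy_and_kappa` and the 2 x 2 matrix are illustrative assumptions; the matrix is not taken from Table 2.

```python
def overall_accuracy_and_kappa(cm):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = sum(sum(row) for row in cm)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(cm[i][i] for i in range(len(cm))) / n
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(col) for col in zip(*cm)]
    # Agreement expected by chance from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Illustrative two-class confusion matrix (made-up counts).
cm = [
    [40, 10],
    [5, 45],
]
oa, kappa = overall_accuracy_and_kappa(cm)
```

Kappa discounts the agreement expected by chance, which is why it is reported alongside OA when class sizes are unbalanced.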
3.2.2. Different Classification Models
For comparison, we selected three commonly used classifiers (RF, SVM, and XGBoost) and compared them with the stacking ensemble (SE) and WSE models to evaluate their performance in the UFZ recognition task. We used 70% of the samples for training and the remaining 30% for testing. The number of decision trees was set to 100 for the RF model and for the base classifier of the WSE model. SVM used a linear kernel. Five-fold cross-validation was used to reduce overfitting. To ensure the reliability of the results, each model was run 100 times. The other parameters were set to their default values.
The overall accuracy (OA) and kappa coefficients are shown in Table 4. The comparison shows that the WSE classifier with multi-source data achieved the best performance for UFZ classification. Among the traditional single classifiers, RF yielded the best classification results. The ensemble classifiers outperformed the single classifiers: the classification accuracy of SE is slightly lower than that of WSE, but higher than that of any single classifier. Additionally, the kappa value of the method presented in this paper was the highest, indicating that the model has good consistency.
Figure 8 displays the confusion matrices of three different models, which provide a more detailed view of the classification results. The deeper the color of the diagonal cells and the lighter the color of the off-diagonal cells, the more accurate the predictions. As shown in Figure 8, the number of misclassified TAZs decreases when using the WSE.
Overall, our approach introduces a new framework for identifying UFZs. The complementary strengths of the data and classifier can provide valuable insights into revealing the relationship between dynamic human activities and urban functions.
To evaluate the accuracy of the proportion results, we manually tagged the functional proportions of some TAZs as references, similar to Zhang et al. [28], and compared them with the results from our method. As shown in Figure 9, *p represents the model prediction results for the zone, and *t represents the manually interpreted results.
The results indicate that the method presented in this paper not only identifies the dominant urban function but also reflects the proportions of urban functional types. Moreover, it performs well in handling various combinations of functional types.
3.3. Spatial Patterns of the Urban Functional Zones
The location quotient (LQ) is used to evaluate specialization in each geographic area. In this study, the LQ is calculated to reflect the development status of each function type and to analyze the spatial structure of UFZs. The LQ is calculated by Equation (12):
$$LQ_{ij} = \frac{a_{ij} / A_{j}}{A_{i} / A} \qquad (12)$$
where $LQ_{ij}$ denotes the LQ of function $i$ in region $j$, $a_{ij}$ is the area of function $i$ in region $j$, $A_{j}$ is the total area of region $j$, $A_{i}$ is the total area of function $i$ in the entire study area, and $A$ is the total area of the study area. The LQ shows which functions dominate within a specific area and reflects the degree to which functions are combined. A higher LQ value indicates greater concentration of the function and stronger development advantages. In general, when $LQ_{ij} < 1$, the dominance of function $i$ is below the average level in region $j$. When $LQ_{ij} > 1$, function $i$ is above the average level in region $j$. When $LQ_{ij}$ is substantially greater than 1, function $i$ has a strong advantage and is dominant in region $j$.
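The location quotient described above is the ratio of a function's share of a region's area to that function's share of the whole study area. A minimal sketch follows, assuming each region's total area equals the sum of its functional areas; the function name `location_quotients` and the toy areas are illustrative and are not values from Table 5.

```python
def location_quotients(areas):
    """Compute the location quotient of every function in every region.

    areas[region][function] is the area of that function in that region.
    Assumption: a region's total area is the sum of its functional areas.
    """
    functions = sorted({f for per_region in areas.values() for f in per_region})
    total_area = sum(sum(per_region.values()) for per_region in areas.values())
    # Total area of each function over the entire study area.
    function_totals = {
        f: sum(per_region.get(f, 0.0) for per_region in areas.values())
        for f in functions
    }
    lq = {}
    for region, per_region in areas.items():
        region_area = sum(per_region.values())
        lq[region] = {
            # (share of the region) / (share of the whole study area)
            f: (per_region.get(f, 0.0) / region_area)
            / (function_totals[f] / total_area)
            for f in functions
        }
    return lq


# Toy areas for two regions (made-up numbers).
areas = {
    "inner": {"commercial": 30.0, "residential": 70.0},
    "outer": {"commercial": 10.0, "residential": 90.0},
}
lq = location_quotients(areas)
```

In this toy case the commercial function is over-represented in the inner region and under-represented in the outer one, which is the kind of contrast the ring-road comparison relies on.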
We calculated the LQ of each ring road in Beijing according to Equation (12), as shown in Table 5. The five function types show different degrees of dominance across the study area. Inside the Second Ring Road, administration and public service areas, as well as green spaces and squares, are the dominant functions, which is consistent with the results in Figure 7. We attribute this to the numerous historical and cultural attractions and central administrative institutions inside the Second Ring Road. It is worth noting that there are no industrial functions within the Second Ring Road. Compared with the Second Ring Road, the advantage of commercial and residential functions gradually increases in the Third Ring Road area. While administration and public service remain the primary function, the advantage of green space and squares weakens. Within the Fourth Ring Road, commercial functions are the most dominant, while the advantage of other functions diminishes. Within the Fifth Ring Road, all functions except industrial show relatively low LQ values. According to the results, three functions show a strong advantage between the Second and Third ring roads, indicating a high degree of functional completeness in this area.
Figure 10 shows the spatial distributions of LQs for each ring road. The commercial function has a significant advantage between the Second and Fourth ring roads, with a particularly strong presence between the Third and Fourth ring roads. Administration and public service areas are prominent within the Third Ring Road. The LQ of residential areas is relatively uniform, with the weakest advantage inside the Second Ring Road. Green spaces and squares show a strong advantage inside the Second Ring Road and between the Fourth and Fifth ring roads. Industrial areas show low LQ values within the Fourth Ring Road, with their advantage gradually increasing from the inner to the outer rings.
3.4. Degree of Urban Functional Mix
We use Shannon entropy as the mixing index to quantitatively assess the degree of mixing in TAZs, thus validating the accuracy of the proportion of urban functional types. A higher index value indicates a higher degree of mixing and a wider variety of functions within a TAZ, while a lower value suggests a more homogeneous urban functional zone. The Shannon entropy index effectively reflects the complexity and diversity of a TAZ. It is calculated using Equation (13):
$$H = -\sum_{i=1}^{n} p_{i} \log p_{i} \qquad (13)$$
where $n$ denotes the number of function types in this study and $p_{i}$ indicates the probability that the TAZ is labeled as the $i$-th function type. The Shannon index of each TAZ is shown in Figure 11.
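The mixing index can be computed directly from a TAZ's functional proportions. The sketch below assumes the natural logarithm, since the base is not stated in this section, and treats zero proportions as contributing nothing; the function name `mixing_index` and the example proportions are illustrative.

```python
import math


def mixing_index(proportions):
    """Shannon entropy of a TAZ's functional proportions.

    Assumption: natural logarithm; the text does not state the base.
    Zero proportions are skipped because p * log(p) tends to 0 as p tends to 0.
    """
    return -sum(p * math.log(p) for p in proportions if p > 0)


# A TAZ with a single function versus one split evenly over five functions.
single_use = mixing_index([1.0, 0.0, 0.0, 0.0, 0.0])
evenly_mixed = mixing_index([0.2] * 5)
```

Under this assumption a single-function TAZ scores 0 and an even split over five functions scores ln 5, about 1.61, which bounds the index from above.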
The average mixing index within the Fifth Ring Road of Beijing is 0.96, with approximately 52.19% of UFZs having a mixing index higher than this value. The results indicate that more than half of the functional zones within the study area have a high degree of functional diversity. Areas with a low mixing index are mainly concentrated in the administration and public service areas, as well as some green spaces and squares. TAZs with a relatively high mixing index are mainly commercial areas and residential areas. Some administration and public service areas located near residential areas also have a high mixing index.
To reveal the spatial distribution patterns of the degree of urban functional mixing, we calculated the functional mixing index for each ring road. The results are shown in Table 6.
The statistical results show no significant difference in the degree of mixing among the regions divided by ring roads. We attribute this to the city’s well-developed infrastructure, with a well-organized distribution of economic, cultural, educational, and other facilities.
The average mixing index is highest inside the Second Ring Road. Apart from the highly intensive land use in the city center, this can be attributed to the unique historical and cultural background of Beijing’s old city areas. Within the Second Ring Road, there are many hutongs, which are a distinctive feature of Beijing’s cultural heritage. A hutong is not only a major place for residents to live, but also a popular tourist attraction. Therefore, multiple functions coexist in the same TAZ, resulting in a higher mixing index. Additionally, as the distance between the ring road and the city center grows, the variance of the mixing index gradually increases. This suggests that the farther from the city center, the greater the differences in functional structures within the ring roads. Some studies have pointed out that combining different types of infrastructure in appropriate proportions within a TAZ can provide residents with more diverse urban services; this not only accommodates large populations but also demonstrates significant urban vitality.
4. Discussion
The analyses above suggest that the data-synthesis-driven approach proposed in this paper is effective for gaining a comprehensive understanding of UFZs. Compared with the research of Zhang et al. [28], Srivastava et al. [14], and Barlacchi et al. [37], we incorporate dynamic human activity features, offering a richer perspective for the identification of UFZs. POI data alone may overlook the dynamic features of human activities, leading to inaccurate classification results, whereas check-in data can effectively compensate for this limitation. The WSE classification model addresses the limitations of previous studies [7,29,46], leading to more accurate recognition results. This improvement is mainly driven by two factors. First, unlike previous studies that concatenated the obtained feature vectors and input them directly into classifiers, the WSE model integrates the contributions of different features through an ensemble learning approach. Second, the model’s weight factors assign more weight to base classifiers with higher accuracy, ensuring that well-performing classifiers contribute more to the final result. Our analysis of the identification results deepens our understanding of the urban functional pattern. Research by Monteiro et al. suggests that urban development may deviate from the original plan; our approach therefore provides a foundation for perceiving the actual functional structure. Furthermore, our analysis has confirmed the existence of imbalances in urban development.
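The weighting idea described above can be illustrated with a small sketch in which each base classifier's class-probability vector is weighted by its normalized validation accuracy. This is accuracy-weighted soft voting, a simplification rather than the paper's full stacking procedure, whose meta-learner is not described in this section; the function names, classifier names, accuracies, and probabilities are all made up for illustration.

```python
def accuracy_weights(accuracies):
    """Normalise validation accuracies so that the weights sum to 1."""
    total = sum(accuracies.values())
    return {name: acc / total for name, acc in accuracies.items()}


def weighted_proportions(probabilities, weights):
    """Weighted average of each base classifier's class-probability vector."""
    n_classes = len(next(iter(probabilities.values())))
    return [
        sum(weights[name] * probs[c] for name, probs in probabilities.items())
        for c in range(n_classes)
    ]


# Hypothetical base classifiers, one per feature type, for a single TAZ.
accuracies = {"on_ssf": 0.8, "on_atf": 0.5, "on_asf": 0.7}
probabilities = {
    "on_ssf": [0.7, 0.2, 0.1],
    "on_atf": [0.2, 0.5, 0.3],
    "on_asf": [0.6, 0.3, 0.1],
}
weights = accuracy_weights(accuracies)
combined = weighted_proportions(probabilities, weights)
# Index of the class with the largest combined probability.
dominant = max(range(len(combined)), key=combined.__getitem__)
```

Because the weights sum to 1, the combined vector remains a probability distribution, so it can be read both as functional proportions and, through its largest entry, as the dominant function.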
In the recognition results, some TAZs were still misclassified. We summarized and analyzed the reasons for misclassification, and Figure 12 shows four common types of errors that occurred in most cases.
The misclassification of TAZ_1 is attributed to the imbalance in the proportion and spatial distribution of different types of POIs. The center of TAZ_1 is a residential area, surrounded by schools and some commercial facilities. It is evident that residential uses occupy the largest area and serve as the dominant function of this TAZ. However, the presence of a significant proportion of commercial-related POIs (e.g., stores and restaurants) skews the identification results towards commercial areas.
The misclassification of TAZ_2 is due to the discrepancy between its actual function and planned function. This area hosts the “Bird’s Nest” and the “Water Cube”, which were venues for the Beijing Olympic Games and fall under the public service functional type. Nowadays, this site is frequently visited by tourists as a popular attraction. From a human activity perspective, it serves a recreational function and is recognized as a green space and square area in this study. This indicates that the method proposed in this paper offers insights into recognizing urban functions and effectively reflects how humans actually utilize urban areas.
The reason for the misclassification of TAZ_3 is similar to that of TAZ_1 and is attributed to an imbalance in check-in frequencies. TAZ_3 encompasses both residential areas and historical and cultural attractions. As found in Section 3.1.1, people tend to share content on social media platforms while traveling. Therefore, this area was incorrectly classified as a green space and square area.
We believe that the classification error of TAZ_4 is due to the ambiguity between functional types, as this area contains multiple facilities, such as industrial parks and wastewater treatment plants. However, our method identified the dominant function as commercial areas. Analysis of the data revealed that in the northeastern part of this area, several companies are primarily engaged in commercial and office activities. We speculate that our method may still have some limitations in distinguishing between industrial functions and commercial functions, which has led to TAZ_4 being recognized as a commercial area.
These examples reflect problems that occurred in the majority of cases, and understanding these potential causes is crucial for improving the model’s performance. It should be noted that the reasons for misclassification are not limited to this. The accuracy of classification may also be influenced by factors such as data quality, feature selection, or the inherent similarities between certain UFZ types.
Another limitation is the selection and annotation of the samples. We invited three experts to annotate the functions of the regions, and the highest-scoring types were used as the final annotations. However, the results of manual annotation may be influenced by the subjective bias of the annotators. Finally, this study categorized UFZs into five types, whereas residents use urban space in more specific ways. Future work could therefore divide urban zones at a finer granularity and use additional reliable data sources to achieve a more detailed identification of UFZs.
5. Conclusions
In this study, we proposed a data-synthesis-driven approach that integrates human activity data to infer UFZs. We divided spatial analysis units based on road networks to achieve a fine-grained identification of UFZs. Our method extracts spatial semantic features, activity temporal features, and activity semantic features. A weighted stacking ensemble model was then built to identify the dominant function and the proportions of urban functional types of each TAZ. We chose the area within the Fifth Ring Road of Beijing for our case study. The results indicate that the proposed method extracts both dynamic and static human activity information from geospatial big data, contributing to a better understanding of urban spatial structures. The Shannon entropy index and location quotient revealed how people utilize different administrative districts and ring roads in Beijing. Our results can provide valuable insights for urban planning and management, helping to formulate better urban development strategies.
In future work, improvements can be made in the following aspects: Firstly, in areas with limited human activity, POI data and social media check-in data may not capture enough characteristic information. Therefore, further research can be conducted by considering additional sources, such as mobile phone trajectory data, to extract more accurate and comprehensive human activity patterns. Secondly, the scope of the study can be expanded by selecting study areas of different city sizes and different socio-economic levels for comparison to reveal differences in the function of spatial structures in different cities.