A New Urban Vitality Analysis and Evaluation Framework Based on Human Activity Modeling Using Multi-Source Big Data

Abstract: A quantitative study of urban vitality brings new insights for evaluating the external construction environment and internal development power of cities. However, knowledge of the relations between people's diverse urban life and urban vitality remains limited, although urban activities are often used as a proxy for urban vitality. This paper aims to mine the content of urban social life in depth and to reveal the driving mechanism of urban vitality by inspecting human activities. We propose a general framework for exploring the spatial pattern and driving mechanism of urban vitality using multi-source big data. It builds a mapping relationship between various urban activities and the economic and social aspects of urban vitality. In addition, physical environment (static) and human-land interaction (dynamic) indicators are designed to analyze the driving mechanism of urban vitality using the Geographically Weighted Regression model. The results of our experimental study show that the spatial pattern and driving factors of urban vitality are heterogeneous over space in both the economic and social aspects. This work provides multiple perspectives for understanding the connotation of urban vitality and urges us to develop rational strategies to make the city more vital, coordinated, and sustainable.


Introduction
Creating a vibrant city is an essential goal of urban development [1]. How to effectively evaluate the abstract concepts and connotations of urban vitality has always been a concern of urbanists, sociologists, and urban planners. With the development of geospatial information science and the emergence of new data sources, spatial big data mining and GIS (Geographic Information System) analysis techniques have been successfully applied to quantitative evaluation of urban vitality [2,3]. According to the previous studies [4,5], urban morphology quantified by some spatial metrics can be used to infer urban functions. Good cities tend to be a balance of a reasonably ordered and legible city form [6]. Hence, the quantitative evaluation methods assess the differences in urban construction, spatial design, and urban development within a city and among different cities [6,7]. The relationship between landscape characteristics and urban vibrancy has also been discussed [8,9]. They all have provided reference opinions for urban planning practice. Based on theory study and field investigation, the dynamic interaction between human and urban space is considered the source updates [26]. However, dealing with the defects, such as sparse sampling, low positioning accuracy, and high impurities, in the construction of the human activity process has also become a significant research challenge. Related studies have found that human activity types are strongly correlated with land use by modeling group activities [27,28]. Generally, family activities mainly occur in residential areas, while work primarily happens around the office areas. Jiang et al. [29] found that the type of human activity is a probability function under certain space-time conditions. That is to say, the type of activity can be inferred by the time, the activity chain within a day, and the spatial location with specific land use characteristics and population density characteristics. 
Based on this assumption, many studies have established probabilistic models, such as supervised and unsupervised probabilistic graphical models, to infer the types of people's activities under different spatial characteristics. The supervised model acquires contextual knowledge of human activities from personal activity sample data with activity labels, such as family travel questionnaires [30] or GPS tracking data [31]. It provides the basis for semantic annotation of the activity purpose (type) at mobile phone locations. The unsupervised model clusters activity data by constructing relationship rules and constraints related to human activities and space-time. It relies on a large number of repeated activity patterns to distinguish user activity types. Large-scale mobile phone data provide a high-quality data source for this unsupervised activity identification. These studies have laid a sound technical foundation for exploring the different spatial manifestations of urban vitality.
In summary, the development of big spatial data, human activity tracking data, and data mining technology provides a new paradigm for current urban scientific research. However, existing analytical frameworks have neither analyzed the composition of human activities nor explored the influence of human-land interaction on urban vitality. This paper integrates multi-source heterogeneous big data to construct a general evaluation framework of urban vitality performance and its internal driving mechanism based on human activity modeling. It aims to give a comprehensive and systematic understanding of the connotation of urban vitality. To the best of our knowledge, this framework makes the following two innovations.

• It is a new framework for urban vitality analysis, running from connotation deconstruction and explicit expression to driving mechanism exploration. Based on the modeling of human activities, this framework decomposes the connotation of urban vitality and its spatial performance into economic and social aspects and realizes the mapping from human activity types to urban vitality. It is more complete and more accurate than traditional urban vitality analysis based on population density and part of the population.

• The proposed framework makes full use of multi-source big data, and it provides a multi-dimensional driving indicator analysis method for urban vitality. Considering the shaping effect of cyberspace and social space on urban vitality, it can explore the inner driving mechanism of urban vitality by combining the urban physical environmental value with the spatial value defined by human-land interaction.

Our framework enriches the theoretical connotation, quantitative technology, and evaluation dimensions of urban vitality, and reveals how urban vitality is created from a multi-dimensional perspective. It offers some guidance on how to improve urban vitality and promote the coordinated and sustainable development of cities. As economic globalization and urbanization continue to accelerate around the world, the global center of gravity of human dynamics is constantly shifting. According to the globalization research of recent decades [32,33], the global centers of gravity are moving to East Asia (especially China), as expressed by GDP, CO2 emissions, population, and urban population. Therefore, research on urbanization and its countermeasures needs more examples from China. This paper takes the central metropolitan area of Nanjing, China, as the case area to verify the feasibility of this framework. Based on the exploration of the driving mechanism, we summarize strategies for promoting urban vitality in both the economic and social aspects.
The remainder of this paper is organized as follows. The following section describes the conceptual framework of the urban vitality quantitative evaluation, data requirements and methodology, the case ISPRS Int. J. Geo-Inf. 2020, 9, 617 4 of 25 study area, and experimental configuration used in the evaluation framework. The subsequent sections introduce the experimental results of the case study area, followed by the discussion and conclusions.

Conceptual Framework
The spatial gathering of human activities is a direct manifestation of urban vitality, and this aggregation differs significantly in spatial distribution and in the composition of activity categories [34,35]. According to the nature of human activity, this study divides urban vitality into two aspects: social vitality and economic vitality. Economic vitality is characterized by the spatial behavioral activities related to human production and consumption, representing the productivity and creativity of various urban spaces. It is an essential guarantee for improving citizens' quality of life [36] and making a vital city. Social vitality reflects the characteristics of people's social behavior, including living at home, entertainment, education, and cultural activities. It demonstrates the enthusiasm of citizens to participate in urban social life, the livability of urban space, and the soft power of urban development. We regard social vitality as a necessary condition for the good and sustainable development of cities. Following these considerations, this paper proposes a general analysis and evaluation framework (Figure 1) for urban vitality based on activity type identification and multi-source big data fusion.
As shown at the bottom of Figure 1, three types of big data (mobile phone location data, geospatial big data, and internet sharing data) are used as the data sources for human activity modeling and driving factor evaluation. Three core steps, numbered with yellow labels, constitute the entire process of urban vitality measurement and evaluation. Part 1 (yellow label 1) shows the urban vitality measurement (estimation) method. It builds the individual's movement process using mobile phone data and constructs a probabilistic inference model to identify human activity types and map them onto two aspects, namely economic vitality and social vitality. Then we can analyze their specific spatial characteristics. The human movement modeling and activity type identification method is introduced in Section 3.2.1.1, and the vitality mapping method is described in Section 3.2.1.2.
Part 2 shows the design of the urban vitality driving factors in this framework. We design ten indicators to estimate urban spatial attraction separately in two dimensions, the physical and the social environment. The physical dimension includes five indicators of urban physical environment evaluation. POI density (PD), land use richness (LR), and land use mix (LM) are selected to represent the urban functional characteristics, which are considered to be closely related to urban vitality [1,37]. Building density (BD) and road accessibility (RA) are used to assess the spatial form characteristics. Related theories suggest that a sufficient concentration of buildings and small block sizes attract more people to participate in social interaction [12,37].
The social dimension, which reflects human-land interaction features, is a new perspective proposed by our framework. Especially in the Information Age, when mobile Internet and LBS technology are prevalent, the ternary spaces (physical, social, and information space) become increasingly interrelated, so people's activity choices are often guided by network information [38,39]. We formulate evaluation indicators to analyze the effect of people's interaction with social space and cyberspace on urban vitality. This framework quantifies three aspects: the human ability to obtain and create wealth, using the indicators of salary level (SL) and job richness (JR); urban social life spending capacity, using the indicators of house price level (HPL) and average price level (APL); and satisfaction with urban life services, using the indicator of business popularity (BP). The quantitative calculation method of the above indicators is detailed in Section 3.2.2.
Spatial regression analysis measures the degree of influence of changes in one or more variables (explanatory variables) on changes in another variable (dependent variable) by constructing regression equations between attributes of spatial objects. We use spatial regression analysis technology to analyze the correlation between the urban vitality spatial distribution and characteristics of driving factors, as shown in part 3. To make the independent variable and the dependent variable of spatial regression analysis comparable, we construct a regular uniform grid as the spatial unit for the above quantitative assessment. The GWR (Geographically Weighted Regression) modeling method will be described in Section 3.2.3.

Data
According to our framework, multi-source data fusion is the technical foundation of this paper. Human activity data, geospatial big data, and internet contributing big data are fused by urban spatial units. They are used to mine the characteristics of human activities, physical construction, and human-land interaction in urban space. The composition, usage, source, and contents of these data are shown in Table 1. The data of human space-time activity are a basis for user activity identification. So, we need to convert the original mobile phone signaling records into individual daily activity chain data. For cellular positioning data, the speed threshold method has proved to be an effective stay recognition method that can reduce the spatiotemporal error [29,40]. Notably, the recursive look-ahead filter achieved the best performance compared to the recursive naïve filter and Kalman filter [41]. Aiming at the characteristics of spatial ambiguity and temporal sparseness of cell phone positioning data, we use the recursive look-ahead filter method to identify stays and build individual daily activity stay chains. Here are the steps:

1. Data re-organization. Based on the IMSI (unique user id), the original signaling inventory data are refactored into a time-ordered sequence of discrete space-time points of each user's daily activity.

2. Stay chain identification. Multiple consecutive records at the same location are merged in time order into a stay point containing arrival and departure times, and the stay points form a stay chain, denoted as $L$. To provide a sufficient inference basis for activity type identification, we remove from the stay chain any single stay shorter than a specified time threshold $\Delta t$.

3. Abnormal filtering. The moving speed $v(i)$ is calculated from the distance between $L(i)$ and $L(i-1)$ and the difference in arrival time. If $v(i)$ exceeds the normal linear movement speed of human activity $v_{normal}$, the distance between $L(i)$ and $L(i+1)$ is compared with the distance between $L(i)$ and $L(i-1)$. The filter removes from the activity chain the activity record that is further away from $L(i+1)$.
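The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Stay` record, the function names, the use of projected coordinates in metres and timestamps in seconds, and the reading of the look-ahead rule (of the two records involved in an abnormal jump, drop the one farther from the next stay) are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Stay:
    x: float        # projected easting in metres (assumption)
    y: float        # projected northing in metres (assumption)
    arrive: float   # arrival time in seconds
    depart: float   # departure time in seconds


def build_stay_chain(records, min_stay_s):
    """Merge consecutive same-location records and drop short stays.

    records: time-ordered (t, x, y) tuples for one user and one day.
    """
    stays = []
    for t, x, y in records:
        if stays and (stays[-1].x, stays[-1].y) == (x, y):
            stays[-1].depart = t            # still at the same location
        else:
            stays.append(Stay(x, y, t, t))  # a new location starts a new stay
    return [s for s in stays if s.depart - s.arrive >= min_stay_s]


def _dist(a, b):
    return hypot(a.x - b.x, a.y - b.y)


def lookahead_filter(chain, v_max):
    """Remove stays implied by implausibly fast jumps (one possible reading)."""
    out = list(chain)
    i = 1
    while i < len(out):
        prev, cur = out[i - 1], out[i]
        dt = cur.arrive - prev.arrive
        speed = _dist(prev, cur) / dt if dt > 0 else float("inf")
        if speed > v_max and i + 1 < len(out):
            nxt = out[i + 1]
            # Look ahead: keep whichever record is closer to the next stay.
            if _dist(cur, nxt) > _dist(prev, nxt):
                del out[i]
            else:
                del out[i - 1]
                i = max(i - 1, 1)
        else:
            i += 1
    return out
```

For example, `lookahead_filter(build_stay_chain(records, 15 * 60), 5.43 / 3.6)` would apply the thresholds reported later in the experimental settings, converted to seconds and metres per second.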
Existing studies have shown that the type of human activity is usually closely related to the land use type of the destination [29]. However, it is impractical to traverse all combinations of land-use features and user activity types, and features such as activity time can also influence activity type. Liao, Fox, and Kautz [42] proposed a general supervised activity recognition framework that extends Relational Markov Networks (RMNs) [43]. Their framework comprehensively considers four aspects for learning and inferring activity types, namely time, space, activity transition relationships (such as a "home" activity following a "working" activity), and global constraints such as the number of homes or workplaces. On this basis, Widhalm et al. [44] extended the relational schema and proposed an expectation-maximization (EM)-based learning method, which successfully achieved unsupervised activity type inference based on mobile phone activity trajectories. This study implements the undirected graph model, which identifies the user activity type by estimating the joint posterior probability $P_r(l, a \mid P, t, \delta, i)$ of activity type and land use. Here $\{P, t, \delta, i\}$ represent the land-use share, activity start time, dwell time, and the number of stay chain locations, respectively, which constitute the observation variables of the relational clique template; $l$ and $a$ represent land use and activity types, respectively, and are the label variables of the relational clique template.
Here, we instantiate four clique templates: an activity type distribution clique, a land use type and activity type clique, a time and activity type clique, and an indicator that the activity is performed at only one unique location (such as "Home"). Each clique's potential function represents the frequency with which the random variables meet its conditions, and all cliques constitute an expanded Markov network $G$. Based on the above definition, the joint probability distribution of $(l, a)$ can be written in terms of the clique potentials $\varphi_1, \dots, \varphi_4$, the potential functions of the four clique templates. $\varphi_1$ is the probability distribution of each activity type. $\varphi_2$ relates the land use ratio $p_l$ of the mobile phone station coverage zone to activity type $a$, that is, the probability of an activity type given the land use type. $\varphi_3$ indicates the probability relating arrival time, length of stay, and type of activity. $\varphi_4$ is a global constraint on a specific activity (e.g., a working position is highly likely to be visited once in a day, while home tends to be visited two times).
Based on the above potential functions, the relational Markov model is trained by the EM algorithm. When the distribution $\Phi$ converges, the resulting probability distribution is used as an input to mark the most likely activity type for each user's stay. Combining the practical experience of related research [42,44] with the characteristics of the land use type data in this study area, this research uses this activity inference method to label the following seven activity types: "Shopping", "Catering", "Working", "Education and Culture", "Home", "Leisure or travel" and "Other".
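For orientation, a Markov network factorizes into a normalized product of its clique potentials. The expression below is a hedged sketch in the notation of this section, not the authors' exact formula: the argument list given to each potential and the normalizing constant $Z$ are assumptions made for illustration.

```latex
P_r(l, a \mid P, t, \delta, i) \;=\; \frac{1}{Z}\,
  \varphi_1(a)\;\varphi_2(l, a, P)\;\varphi_3(a, t, \delta)\;\varphi_4(a, i)
```

Under this reading, $Z$ sums the product over all admissible $(l, a)$ pairs so that the probabilities add up to 1.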

Spatial Interpolation Method for Estimating Urban Vitality Distribution
In this paper, based on the inferred activity types of mobile phone users, the spatially explicit characteristics of urban vitality are extracted in the following two steps. We first estimate the spatial distribution of urban human activities. Currently, the two common types of method used in population distribution estimation are spatial interpolation methods and statistical model methods [45]. Among them, areal interpolation is used to solve the problem of statistical unit inconsistency. This method converts the spatial unit of the census into the actual analysis spatial unit, i.e., the conversion from the source area to the target area [46]. The activity recognition result of each stay point obtained earlier is considered to be the frequency statistics of various activities within a particular buffer zone around the user's stay position. However, the statistical unit of human activity distribution is a regular grid unit. We design an area weight interpolation method to convert the user activity frequency of the stay points into the activity intensity distribution of the corresponding grid statistical units, as shown in Figure 2. The activity intensity is calculated as the ratio of the frequency of various activities to the area of the spatial grid unit.
Figure 2. The implementation process of the area weight interpolation method for activity distribution estimation, using sample data.
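The area weight interpolation described above can be illustrated with a short Python sketch. It assumes circular buffers around stay points, a square grid anchored at the origin, and a numerical approximation of the buffer-cell overlap by sub-cell sampling; the function names and the sampling resolution `k` are hypothetical and are not taken from the paper.

```python
from math import floor


def overlap_area(cx, cy, r, col, row, cell, k=20):
    """Approximate the area of circle (cx, cy, r) that falls inside grid
    cell (col, row) by testing k*k evenly spaced sample points."""
    step = cell / k
    hits = 0
    for a in range(k):
        for b in range(k):
            px = col * cell + (a + 0.5) * step
            py = row * cell + (b + 0.5) * step
            if (px - cx) ** 2 + (py - cy) ** 2 <= r * r:
                hits += 1
    return hits * step * step


def interpolate(stay_points, r, cell):
    """Distribute activity frequencies over grid cells by overlap area.

    stay_points: list of (x, y, frequency); returns {(col, row): intensity},
    where intensity is allocated frequency divided by the cell area.
    """
    freq = {}
    for x, y, f in stay_points:
        cols = range(floor((x - r) / cell), floor((x + r) / cell) + 1)
        rows = range(floor((y - r) / cell), floor((y + r) / cell) + 1)
        areas = {(c, rw): overlap_area(x, y, r, c, rw, cell)
                 for c in cols for rw in rows}
        total = sum(areas.values())
        if total == 0:
            continue  # buffer too small to hit any sample point
        for key, area in areas.items():
            if area > 0:
                # Each cell receives a share proportional to its overlap.
                freq[key] = freq.get(key, 0.0) + f * area / total
    return {key: v / (cell * cell) for key, v in freq.items()}
```

Because the shares are normalized by the total overlap, the frequency of each stay point is conserved across the cells it touches. A production version would more likely use exact polygon intersection from a GIS library.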
Next, the spatial distribution of urban vitality is determined by defining the mapping relationship between activity type and vitality. We describe this mapping based on the definitions of economic vitality and social vitality in our urban vitality analysis framework and on the nature of each human activity type. The results of the mapping relationship are shown in Table 2. Economic vitality is expressed through human economic activities. People's daily economic activities consist of buying activities as consumers and productive activities as producers, both of which directly benefit urban economic growth and development [47]. Specifically, the shopping and catering activities identified by the activity modeling are grouped as consumption activities, and the working activity corresponds to the production activity; together they constitute the economic vitality of the city. Building a livable urban environment and open public spaces, promoting people's social interaction, and integrating them into urban life is an essential aspect of urban vitality [1,10]. Participation in recreational activities, cultural and educational activities, and family life reflects urban social life, so they are mapped to social vitality. Additionally, the "Other" activity is also integrated into social vitality. It indicates other types of activities, such as people walking on the street.
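The mapping from activity types to the two vitality aspects can be sketched as a lookup table followed by an aggregation per grid cell. The dictionary below follows the mapping stated in the text; aggregating by simple summation of activity intensities, and the names `VITALITY_OF` and `vitality_by_cell`, are assumptions made for this illustration.

```python
# Activity type -> vitality aspect, following the mapping described in the text.
VITALITY_OF = {
    "Shopping": "economic",
    "Catering": "economic",
    "Working": "economic",
    "Education and Culture": "social",
    "Home": "social",
    "Leisure or travel": "social",
    "Other": "social",
}


def vitality_by_cell(activity_intensity):
    """Aggregate per-activity intensities into economic and social vitality.

    activity_intensity: {cell_id: {activity_type: intensity}}
    returns: {cell_id: {"economic": value, "social": value}}
    """
    result = {}
    for cell, by_type in activity_intensity.items():
        totals = {"economic": 0.0, "social": 0.0}
        for activity, value in by_type.items():
            # Summation is an assumed aggregation rule, not the paper's stated one.
            totals[VITALITY_OF[activity]] += value
        result[cell] = totals
    return result
```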

Quantification Calculation of Vitality Impact Factors
After collecting and processing GIS spatial big data and the internet contributing big data, this paper constructs and quantifies ten potential driving indicators of urban vitality. These indicators evaluate the value of urban land in two dimensions: the physical built environment and human-land interaction characteristics. The meaning and calculation of the indicators are shown in Table 3.
In the first three formulas, $p_i$ is the proportion of the $i$-th POI type in the total POI records, $n$ is the total number of POI types, $Num_i$ is the number of POIs of each type, and $A$ is the area of each patch (spatial analysis unit). The LM indicator reflects the orderliness of land use types and their quantities: the higher the value, the more ordered the land use is, whereas a lower value means disorder and randomness. $h_i$ and $a_i$ in Equation (4)

For the human-land interaction indicators (the last five indicators in Table 3), the Ordinary Kriging (OK) interpolation method is used to estimate the spatial distribution of each indicator, with sample points collected from related websites as the input data. $I(v)$ is the general equation for spatial estimation shown in Equation (7), where $K$ is the OK interpolation method, $radius$ is the search radius, $pts$ is the specified number of input sampling points, and $A_i$ is the area of the $i$-th analysis unit. BP and APL reflect the consumption popularity and consumption level in various urban regions, and thus the quality of life services. The SL indicator reflects each region's economic creativity; each company's average salary is estimated by removing the remuneration of the highest-paid and lowest-paid positions and averaging the remaining remuneration data (see $job_e$ in Equation (9)). The JR indicator reflects the demand for and attractiveness to talent in each region, and it is represented by the number of jobs offered by each enterprise (see Equation (10)). In Equations (9) and (10), $j_i$ is the average salary for the $i$-th position of enterprise $e$, $num_i$ is the number of recruits for this position, $n$ is the number of positions of enterprise $e$, and $j$ represents the set of average salaries of enterprise $e$.
The HPL indicator shows the fundamental value of urban space; the level of housing prices not only correlates with the location but also reflects the consumption capacity of a region.
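The per-enterprise inputs to the SL and JR indicators, as described in the text, can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, the mean is unweighted by the number of recruits, and the fallback for enterprises with one or two positions (no trimming) is a choice made here because the text does not cover that case.

```python
def enterprise_salary(position_salaries):
    """Average salary of one enterprise after dropping the highest-paid and
    lowest-paid positions (a trimmed mean over position averages)."""
    if not position_salaries:
        raise ValueError("at least one position salary is required")
    if len(position_salaries) <= 2:
        # Assumed fallback: too few positions to trim, use the plain mean.
        return sum(position_salaries) / len(position_salaries)
    trimmed = sorted(position_salaries)[1:-1]
    return sum(trimmed) / len(trimmed)


def job_richness(recruits_per_position):
    """Total number of jobs offered by one enterprise."""
    return sum(recruits_per_position)
```

These per-enterprise values would then serve as the sample points for the Ordinary Kriging step.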
Table 3 (excerpt). Land use mix (LM): physical built environment (urban design dimension). Job richness (JR): the number of jobs provided in each company is selected to reflect job richness. House price level (HPL): source, real estate information website; each residential community's average house price is used as input.

Spatial Regression Analysis of Urban Vitality
Regression modeling is a statistical data analysis technique. It provides an effective means for understanding the relationship between dependent variables and explanatory variables. Multiple regression analysis techniques have been successfully applied in the research of physical phenomenon simulation [48], population growth and spatial location factors modeling [49][50][51], economic fluctuation trend fitting and prediction [52], and tourism research [53,54]. The spatial regression model has been successfully applied to the study of the correlation between urban vitality and land-use characteristics [11,19]. However, the ordinary least squares (OLS) model is suitable for global feature analysis, and it does not consider spatial non-stationarity. If the data are spatially dependent and this dependence is ignored, the model will often produce misleading results [55]. Since the spatial distribution characteristics of different vitality driving factors always show significant differences, the impact of various factors on urban vitality will also differ across regions. To reveal how the urban built environment and the interaction between humans and urban space drive urban vitality, we adopt a local linear regression model, the GWR model, which takes spatial autocorrelation into account. The mathematical form of the GWR model is:

$$y_i = \beta_0(u_i, v_i) + \sum_{k=1}^{p} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i$$

where $y_i$ is the observation of the dependent variable at point $i$, and $(u_i, v_i)$ is the coordinate of observation point $i$. $\beta_0(u_i, v_i)$ is the intercept at point $i$. $p$ is the number of independent variables, $x_{ik}$ represents the value of the $k$-th independent variable at position $i$, and $\beta_k(u_i, v_i)$ is the $k$-th regression coefficient at point $i$, indicating the degree to which the independent variable affects the dependent variable. $\varepsilon_i$ represents a random error, which is normally distributed with constant variance.
Here, the driving indicators constructed in Section 3.2.2 are taken as the explanatory variables $x_i$, and the urban economic vitality and social vitality intensities estimated in Section 3.2.1.2 are used as the dependent variable $y_i$. Furthermore, this paper adopts the idea of exploratory regression analysis: we construct GWR models of urban economic vitality and social vitality using different combinations of explanatory variables. After a model is successfully constructed, diagnostic indicators such as the residual sum of squares, AICc, R2, and adjusted R2 are output to reflect its performance. The lower the value of AICc and the closer R2 is to 1, the better the fit. Then, we select the optimal analysis model according to these diagnostics. Finally, the optimal model answers questions about which factors contribute to urban economic vitality and social vitality and how they work.
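To make the local estimation concrete, the following pure-Python sketch fits a GWR by solving a weighted least squares problem at every observation point. It is a minimal illustration, not the authors' implementation: the fixed Gaussian kernel, the `bandwidth` parameter, and the function names are assumptions, and bandwidth selection and diagnostics such as AICc are omitted.

```python
from math import exp


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def gwr(coords, X, y, bandwidth):
    """Return one coefficient vector [b0, b1, ..., bp] per observation point.

    coords: list of (u, v); X: list of rows of explanatory values; y: list.
    """
    n, p = len(y), len(X[0])
    D = [[1.0] + list(row) for row in X]  # design matrix with intercept
    betas = []
    for ui, vi in coords:
        # Gaussian distance-decay weights centred on the regression point.
        w = [exp(-((ui - uj) ** 2 + (vi - vj) ** 2) / (2 * bandwidth ** 2))
             for uj, vj in coords]
        XtWX = [[sum(w[j] * D[j][a] * D[j][b] for j in range(n))
                 for b in range(p + 1)] for a in range(p + 1)]
        XtWy = [sum(w[j] * D[j][a] * y[j] for j in range(n))
                for a in range(p + 1)]
        betas.append(solve(XtWX, XtWy))
    return betas
```

Mapping the returned coefficients cell by cell is what exposes the spatial heterogeneity of each driving factor; in practice an established GWR package would be preferable to this hand-rolled solver.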

Case Study Area and Experimental Data
Nanjing is a megacity in the Yangtze River Delta City Cluster, which is the most economically vital resource allocation center of China. The size of Nanjing is second only to Shanghai in this City Cluster. We select the main urban area of Nanjing (according to the Nanjing 2011-2020 master plan) as our case study area. This area is centered on the Xinjiekou CBD (Central Business District). It is a highly urbanized area where population and urban functions gather. Therefore, the embodiment of urban vitality and the formation mechanism of urbanization can be fully reflected here. The geographical location, administrative divisions, and business district distribution of the study area are shown in Figure 3. According to the urban vitality evaluation framework proposed in Section 2, geospatial big data, mobile phone data, and internet contributing big data are obtained as research data. Their specific descriptions are detailed in Table 4.

Note: For user activity type inference needs, the original 20 categories of POI are merged into 13 categories. They are catering services, shopping services, life services, sports and recreation services, medical services, accommodation services, tourist attractions, residences, governmental organizations, science/culture and education services, transportation services, finance and insurance services, and enterprises.

Experimental Parameter Settings
To facilitate the spatial characteristic expression and correlation analysis of urban vitality, we divide the study area into 200 m × 200 m grids as spatial analysis units, taking into account the spatial error of mobile phone location data. $v_{normal}$ was set to 5.43 km/h according to the average walking speed of a pedestrian, and the time threshold of a single stay was set to $\Delta t$ = 15 min. Since individual stops may be spread throughout the city, the activity type labeling of the user's stay chain is expanded to the entire city of Nanjing. In addition, considering the non-uniform distribution of base stations in space, this paper creates 250 m and 500 m buffer zones for base stations inside and outside urban areas, respectively. Then, it calculates the proportion of POI types in each buffer zone as the probability coefficient $p_l$. The spatial interpolation of the human-land interaction indicators uses the Gaussian semivariogram, with 10 search points and a maximum search radius of 1 km.

Activity Types Inference Probability Model Training
Before using the activity recognition model to infer the type of activity at each user stay, we need to initialize the prior probabilities of the model. φ_1 is initialized as a uniform distribution. Prior knowledge about the correlation between land use and activity type is used to initialize φ_2. According to the strength of the correlation, it is divided into three weight levels of 1-10-1000; the higher the level, the stronger the correlation (e.g., the activity type "Home" is most likely to correspond to the land use type "Residence"). The weights are then normalized so that the probabilities of all activity types sum to 1. φ_3 consists of two parts: the mapping of the start time w_t and of the length of stay w_s to the various activities. Each is divided into three levels of 1-10-100, and a higher value indicates a greater possibility of engaging in such an activity at that start time and stay length. We construct a w_t × w_s matrix to initialize p(a t, δ). φ_4 is used to set global limits on specific activities. If an individual's activity has been returned, the probability that the activity type is "Home" is 1, and the probability that the activity type is other than "Home" is 0.5.
Using the above initialization parameters and the daily activity chains of 20,000 users as the training sample, we construct the activity inference model. The model converges after 36 iterations. The activity types of 13,248,925 stays were finally identified, with an average of 3.63 daily stops per user. As shown in Figure 4a, "Work" accounts for the largest share of people's activities, followed by "Home", while "Education" and "Catering" account for the smallest shares. There are two reasons for this: ordinary workday data are used as the data source, and people eat out less on weekdays; and mobile phone users are mainly adults, so the proportion of educational and cultural activities is small. In addition, Figure 4b shows the joint probability distribution between land use and activity types, which reflects the land use distribution of the study area. The life services and enterprise land types account for a relatively high proportion, and they are mainly mapped to "Work" and "Home" activities.
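As a minimal sketch of the φ_2 initialization described above, the weight levels assigned to one land use type can be normalized into a prior distribution over activity types. The specific level assigned to each activity below is a hypothetical example; only the level values (1, 10, 1000) and the normalization step come from the text.

```python
def normalize_levels(levels):
    """Convert correlation weight levels into probabilities that sum to 1."""
    total = float(sum(levels.values()))
    return {activity: weight / total for activity, weight in levels.items()}

# Hypothetical weight levels for the land use type "Residence":
# 1000 = strong correlation, 10 = moderate, 1 = weak.
residence_levels = {"Home": 1000, "Work": 10, "Catering": 10, "Education": 1}

phi2_residence = normalize_levels(residence_levels)
for activity, prob in phi2_residence.items():
    print(f"{activity}: {prob:.4f}")
```

Because the strong level is two orders of magnitude above the moderate one, nearly all of the prior mass goes to the most plausible activity, while the other activities keep a small non-zero probability that the training iterations can still adjust.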

Spatially Explicit Features of Urban Vitality
Based on the activity recognition results in Section 4.3, area-weighted interpolation is used to estimate the economic vitality intensity and social vitality intensity of each analytical unit according to the mapping relationship between activity type and urban vitality. As shown in Figure 5I,II, most of the areas with high economic vitality are concentrated in the central area, and their distribution presents a ring-type structure from inside to outside. The economic vitality of the Xinjiekou CBD and its adjacent areas, such as Huaihai Road and Huaqiao Road, is the highest, followed by business districts such as Guangzhou Road and Hunan Road. The economic vitality of the outer business districts, shown in blue, is relatively low. Although the Aoti business district covers the financial district of the Hexi CBD, its overall vitality level is lowered by its broad coverage area. As seen from Figure 5III,IV, the spatial distribution of social vitality is more dispersed, and the range of high-vitality regions is relatively large. For example, the Sanpailou and Hanzhongmen business districts, which are far from the Xinjiekou CBD, reach the highest level of social vitality, while they are at the second level in terms of economic vitality. Meanwhile, districts with higher levels of social vitality are more concentrated in the northern region than in the southern region. The orange dotted area in Figure 5 is a low-vitality area blocked by mountains and lakes. In contrast, the pink dotted area is a vitality cold zone due to the sparseness of urban functions (see POI density, Figure 6a-4).
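The area-weighted interpolation step can be sketched as follows. This is an illustrative simplification: source zones (for example, base station service areas) and grid cells are both assumed to be axis-aligned rectangles given as (xmin, ymin, xmax, ymax) in metres, and activity is assumed to be uniformly distributed within each source zone. The zone shapes, counts, and function names are hypothetical.

```python
def overlap_area(a, b):
    """Overlap area of two axis-aligned boxes (xmin, ymin, xmax, ymax)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0

def area_weighted_counts(sources, cells):
    """Distribute source-zone activity counts to cells by overlap share."""
    estimates = []
    for cell in cells:
        total = 0.0
        for box, count in sources:
            source_area = (box[2] - box[0]) * (box[3] - box[1])
            total += count * overlap_area(box, cell) / source_area
        estimates.append(total)
    return estimates

# Two hypothetical source zones with activity counts (persons).
sources = [((0, 0, 400, 400), 80), ((300, 0, 500, 200), 40)]
# Two 200 m x 200 m grid cells.
cells = [(0, 0, 200, 200), (200, 0, 400, 200)]

counts = area_weighted_counts(sources, cells)
cell_area_km2 = 0.2 * 0.2
intensity = [c / cell_area_km2 for c in counts]  # persons/km^2
print(counts, intensity)
```

The uniform-distribution assumption inside each source zone is what makes this estimate approximate; auxiliary data such as building footprints would be needed to refine it.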
ISPRS Int. J. Geo-Inf. 2020, 9, x FOR PEER REVIEW 16 of 26

Figure 5. Distribution of economic and social vitality in the study area (I,III) and distribution of economic and social vitality in the business districts (II,IV). The vitality intensity is graded using the natural breaks method [56], and the color from dark blue to red indicates the change in the vitality level from weak to strong; the unit is person/km².

The above results indicate that the central area dominates urban economic development and contributes much of the city's productive and creative capacity. High-level economic vitality districts are more intense hotspots than high-level social vitality districts. Furthermore, the distribution of social vitality mainly reflects the spatial pattern of people's living, leisure, and recreation.

Quantitative Results of Vitality Impact Indicators
Quantitative results of physical built environmental indicators and human-land interaction indicators are shown in Figure 6. For comparison, the evaluation results are divided into five levels by the natural breaks classification method.
According to Figure 6a-1 to a-5, RA and BD are mainly concentrated in the middle value range, and traffic accessibility is relatively good across large parts of the central and western regions of the study area (a-1, a-2). At the same time, the spatial distribution of medium and high building density is relatively uniform (a-2). The spatial differentiation of PD is more prominent: the central area is a concentration of high PD values (a-4), whereas the peripheral area has lower values, which reflects how urban functions radiate from the center to the surroundings. For the two remaining indicators (a-3, a-4), medium and high values, especially of LM, cover most of the study area, indicating that urban functional diversity is relatively complete in most areas.
In Figure 6b-1 to b-5, areas of high business popularity (BP) are mainly distributed in the urban center (b-3), while areas with a high urban consumption level (ACL) are close to the scenic surrounding tourist areas and the Hexi financial center (b-1). According to the distribution of the SL and JR indicators (b-2, b-4), urban employment opportunities and salary levels are often related to industry. Wages in the technology and financial industries are usually relatively high and are concentrated in the central and southern regions of Figure 6b-2. Employment opportunities are related to the size of enterprises and the agglomeration of industries. Large enterprise groups and industrial parks have a high demand for talent and attract a large working population, and their spatial distribution in the study area is more scattered (b-4). In contrast, the spatial continuity of urban housing prices is strong; the high-value areas for housing prices are mainly the central urban area and the Hexi area, which is closely related to the high-quality educational resources of the central urban area and the leading economic development of the Hexi area (b-5).
To avoid multicollinearity among the explanatory variables, this paper screens out redundant explanatory variables by calculating the variance inflation factor (VIF). Generally, a VIF below 10 indicates that there is no serious multicollinearity. Verification shows that PT and LM have a VIF greater than 10 in the linear regression diagnosis with both economic vitality and social vitality, so we remove PT to keep all VIF values below 10. In addition, GWR modeling requires the variables to be spatially autocorrelated, and we use Moran's I index [57] to test the spatial autocorrelation of the dependent variables before building the GWR model. The Moran's I values of economic vitality and social vitality are 0.85 and 0.80, respectively, which indicates strong positive spatial autocorrelation; that is, similar values of urban vitality intensity (both high and low) aggregate in space. This paper takes the economic vitality intensity and social vitality intensity as the dependent variables and uses multiple combinations of physical built environment indicators and human-land interaction indicators as the explanatory variables to construct the GWR models. The model construction results are detailed in Table 5. According to the diagnostic indicators, the models marked with an asterisk in the table (such as PEM3* and PSM3*) perform better than unmarked models (such as PEM4 and PSM4). Thus, adding more explanatory variables does not necessarily improve the model; rather, appropriate variable combinations complement each other and allow the regression model to fit the variation in the dependent variable to the greatest extent. Finally, this study constructs the driving analysis models PEM (PEM3*) and HEM (HEM3*) for economic vitality, and the driving analysis models PSM (PSM3*) and HSM (HSM3*) for social vitality, from two dimensions: physical analysis indicators and human-land interaction indicators.
The modeling results are shown in Tables 6 and 7, respectively. The spatial distribution of each explanatory variable coefficient of the economic vitality driving models (PEM, HEM) and the social vitality driving models (PSM, HSM) is shown in Figure 7. Combining the GWR modeling results and Figure 7, we can draw the following conclusions.
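The two diagnostics used above, VIF and global Moran's I, can be sketched with NumPy as follows. This is an illustrative implementation on synthetic data, not the paper's code: the spatial weights matrix W (here simple adjacency along a chain of four cells), the random test variables, and the function names are all assumptions.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of X (n_samples x n_features)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    result = []
    for j in range(k):
        y = X[:, j]
        # Regress column j on an intercept plus all other columns.
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        ss_res = float(np.sum((y - A @ beta) ** 2))
        ss_tot = float(np.sum((y - y.mean()) ** 2))
        # VIF = 1 / (1 - R^2) = SS_tot / SS_res
        result.append(float("inf") if ss_res == 0.0 else ss_tot / ss_res)
    return result

def morans_i(values, W):
    """Global Moran's I for values under a spatial weights matrix W."""
    x = np.asarray(values, dtype=float)
    W = np.asarray(W, dtype=float)
    z = x - x.mean()
    return (len(x) / W.sum()) * (z @ W @ z) / (z @ z)

# Synthetic explanatory variables: c is almost a copy of a, b is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + 0.05 * rng.normal(size=200)
vifs = vif(np.column_stack([a, b, c]))
print("VIF:", vifs)

# Four cells in a row; neighbours share an edge (binary adjacency weights).
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
print("Moran's I (smooth trend):", morans_i([1, 2, 3, 4], W))
print("Moran's I (alternating):", morans_i([1, -1, 1, -1], W))
```

In the synthetic example, the near-duplicate pair (a, c) exceeds the VIF threshold of 10 while b does not, which mirrors the reasoning for dropping one of two redundant indicators. A smooth spatial trend yields a positive Moran's I, and an alternating pattern yields a negative one.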
First, the most significant physical environment variables influencing urban economic and social vitality are RA, PD, and LM. The LM indicator has the most noticeable impact (the standard deviation of the LM coefficient is the largest, at 2.4757 and 2.5013, respectively). In contrast, BD has no significant driving effect on urban vitality in this study area (as shown in Table 5, PEM4 and PSM4 explain less of the variation in urban vitality after the BD indicator is added). The medians of the three physical environment explanatory variable coefficients of the PEM and PSM models are all positive. This demonstrates that superior urban transportation accessibility (urban form), adequate service resource facilities, and mixed land use (urban function) are necessary conditions for stimulating urban economic and social vitality. They are an essential guarantee for forming good space quality and perfecting space function. In addition, the outstanding driving performance of the LM variable on urban vitality shows that a complete and balanced service resource allocation is an essential manifestation of the maturity of regional urban functions. This coincides with previous research conclusions [19,58] on the physical driving factors of urban vitality. The difference between this article and Huang et al. [20] is that their research did not consider spatial non-stationarity, and they listed both BD and PD as the two most important factors driving urban vitality. However, the GWR modeling results in this paper show that BD has local multicollinearity with other driving indicators. Its contribution to urban vitality may instead be explained by other indicators such as PD.
Second, in the human-land interaction dimension, the three indicators JR, BP, and ACL have the strongest explanatory power for both economic vitality and social vitality. In contrast, the driving effect of HPL on urban vitality is not significant. This shows that resident employment and online reviews have a considerable impact on modern urban life. Furthermore, comparing the standard deviations of the coefficients of all indicators shows that JR has the largest variation range and BP the smallest. This shows that not only is economic vitality directly affected by abundant employment opportunities, but people also expect their place of residence to be as close as possible to areas with abundant job opportunities, thereby reducing commuting distance. Meanwhile, low consumption levels (the median ACL coefficient for economic vitality is less than 0) and sufficient employment opportunities are favorable conditions for activating urban economic vitality. Social vitality is more affected by online reviews (the ACL coefficient for social vitality is higher than that for economic vitality), and popular life service places strongly attract people's social lives.
Third, the impact of the various indicators on urban vitality shows apparent spatial heterogeneity. As shown in the first and third rows of Figure 7, RA and LM have the most noticeable impact on economic vitality in the central part of Nanjing's main urban area, while the effects of PD are more global. Unlike economic vitality, social vitality is weakened by the traffic advantages (RA) in the urban center, while the impact of PD and LM is strengthened. On the other hand, as shown in the second and last rows of Figure 7, JR has an overall spatial driving effect on urban economic and social vitality. In contrast, ACL is mostly negatively correlated with urban vitality, and the BP coefficient of the corresponding regions is mostly positive. Overall, good traffic conditions and reasonable rents not only attract a large number of companies and businesses to settle in, creating more job opportunities, but also increase the convenience and livability of urban life. Meanwhile, people mostly prefer low-consumption and high-reputation services.
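The spatially varying coefficients discussed above come from fitting a separate weighted regression at each location. The following is a minimal sketch of that idea, assuming a Gaussian distance kernel with a fixed bandwidth; the kernel, the bandwidth, and the synthetic data are illustrative assumptions, since the paper's GWR settings are not given in this section.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Local regression coefficients (intercept first) at every location."""
    coords = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    betas = np.empty((len(y), X.shape[1]))
    for i, centre in enumerate(coords):
        dist = np.linalg.norm(coords - centre, axis=1)
        w = np.exp(-0.5 * (dist / bandwidth) ** 2)  # Gaussian kernel weights
        XtW = X.T * w                               # weight each observation
        betas[i] = np.linalg.solve(XtW @ X, XtW @ y)
    return betas

# Synthetic example: 40 cells along a line; the effect of x on y is 1 in the
# western half and 5 in the eastern half (a spatially non-stationary process).
n = 40
coords = np.column_stack([np.arange(n), np.zeros(n)])
x = 1.0 + 0.5 * (np.arange(n) % 4)
true_slope = np.where(np.arange(n) < n // 2, 1.0, 5.0)
y = true_slope * x

betas = gwr_coefficients(coords, x.reshape(-1, 1), y, bandwidth=2.0)
print("local slope, westernmost cell:", betas[0, 1])
print("local slope, easternmost cell:", betas[-1, 1])
```

A single global regression would report one averaged slope for this data, whereas the local fits recover different slopes at the two ends of the line. Mapping such local coefficients is what produces the kind of coefficient surfaces shown in Figure 7.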

How to Increase Urban Economic Vitality and Social Vitality?
The regions with the highest levels of economic and social vitality intensity are extracted as high vitality areas, respectively. The statistical distribution of each driving indicator in the high vitality areas is shown in Figure 8. Based on analyzing the statistical characteristics of the driving indicators in these areas, the following conclusions are summarized about how to effectively improve urban economic and social vitality.
(1) Economic vitality improvement strategies. From the perspective of physical space, areas of high economic vitality are more spatially concentrated than areas of high social vitality. High economic vitality has a strong positive correlation with road density and land use mix: the higher the land use mix, the stronger the vitality, while road density should be kept within an appropriate range (94-380 nodes/km² in our study area). Economic vitality and POI density are linearly related in both spatial distribution and value: the closer to the central area, the higher the POI density and the stronger the economic vitality. From the perspective of human-land interaction, urban economic vitality has a relatively significant positive correlation with employment opportunities (JR) and business popularity (BP). The positive association between BP and ACL is more influential in the central urban area than in the peripheral areas. This shows that abundant employment opportunities and low consumption levels are directly related to urban economic vitality, and that the central urban area needs more job opportunities and high-quality business services (to attract more consumer groups). In the surrounding areas, where service levels are lower, lower consumption levels are needed as a compensating condition.
(2) Social vitality promotion strategies. Areas with high social vitality are relatively scattered, and their spatial heterogeneity is relatively small. That is, improving urban social vitality requires more balanced space quality and social services. For the physical environment, better spatial accessibility and appropriate service resource allocation (Figure 8b-1 to b-3) are the primary conditions for improving social vitality. As for human-land interaction characteristics, adequate job opportunities and affordable urban consumption are the features of the social environment found in high social vitality areas. People prefer to carry out social activities in areas with the lowest possible commuting cost and reasonable consumption levels. Business popularity is mainly concentrated in the central area; that is, people's leisure consumption primarily occurs in the central urban area (as revealed by the Xinjiekou, Hunan Road, and Changjiang Road business districts in our study area).

Conclusions
The advent of the smartphone era has changed people's lifestyles and produced massive spatiotemporal data for urban scientific research and applications. Mobile phone data have become an important data source, reflecting daily human activity patterns and city dynamics. They have been applied to many research fields, such as urban transportation, population, and smart management. This paper proposes an activity-based vitality evaluation framework that integrates multi-source big data to analyze the connotation of urban vitality and its driving mechanisms. It is the first study to deeply evaluate urban vitality from the two aspects of the economy and society, as they constitute the main content of urban life. As people's urban life is increasingly affected by cyberspace, this article explores the driving mechanism by combining static features of the urban physical built environment with dynamic human-land interaction indicators (online reviews). Hence, this method enables us to understand the meaning and driving mode of urban vitality. Through the empirical research, we found spatial heterogeneity of urban vitality in both the economic and social aspects in the main urban area of Nanjing, and we summarized some targeted measures to enhance the vitality of the city. The method and research conclusions can be extended to the vitality analysis of other regions or cities and can be used to formulate urban vitality shaping strategies adapted to their own needs.
This article also has some shortcomings, which will be addressed in subsequent research. The activity intensity in this paper is estimated by the area-based interpolation method, which does not use the type distribution of urban buildings or the existing spatial distribution of the population as supplementary information, resulting in a certain error in the estimated spatial distribution of activities. Since the division between economic vitality and social vitality is relatively clear, the impact on the overall pattern of the vitality intensity distribution should not be large. Therefore, follow-up research will consider the random distribution of positioning errors and an unbiased estimation method for human activity distribution supported by multiple auxiliary data sources to further improve the accuracy and reliability of the analysis results.