Multidimensional Spatial Vitality Automated Monitoring Method for Public Open Spaces Based on Computer Vision Technology: Case Study of Nanjing’s Daxing Palace Square

Abstract: Assessing the vitality of public open spaces is critical in urban planning and provides insights for optimizing residents’ lives. However, prior research has fragmented study scopes and lacks fine-grained behavioral data segmentation capabilities and diverse vitality dimension assessments. We utilized computer vision technology to collect fine-grained behavioral data and proposed an automated spatial vitality monitoring framework based on discrete trajectory feature points. The framework supported the transformation of trajectory data into four multidimensional vitality indicators: crowd heat, staying behavior ratio, movement speed, and spatial participation. Subsequently, we designed manual validation mechanisms to demonstrate the monitoring framework’s efficacy and utilized the results to explore the changes in vitality, and the influencing factors, in a small public space. Discrete trajectory feature points effectively addressed the literature’s fragmented study scope and limited sample size issues. Spatial boundaries had a significantly positive impact on spatial vitality, confirming the “boundary effect” theory. The peak spatial vitality periods were from 08:30 to 09:30 and from 17:30 to 18:30. A higher enclosure degree and better rest facilities positively impacted spatial vitality, while a lower enclosure degree did not consistently suppress spatial vitality in all situations. Overall, spatial features and spatial vitality have a complex nonlinear relationship.


Introduction
Public open spaces are essential elements of urban life and are significant for public health, safety, and quality of life [1-4]. In recent years, China has embarked on extensive public open space renovation projects to address the urban challenges arising from rapid urbanization [5,6]. Humanistic and precise planning and transformation have become mainstream approaches, supplanting "the grand narrative" concept in urban spatial planning and construction [7]. This requires researchers to monitor urban accessibility and the physical environment at the macroscale while re-evaluating public open spaces at the microscale from a human-centric perspective [8-13]. Based on Jacobs' humanistic urban studies [14], numerous researchers have emphasized the importance of crowd behavior to assess spatial vitality in public open spaces. Specifically, they have argued that crowd behavior can mirror spatial vibrancy and utilization, offering a foundation for urban designers to enhance public open-space environments and elevate residents' well-being [15-17]. However, the acquisition of detailed crowd behavior data and the establishment of an equivalent relationship between crowd behavior information and spatial vitality continue to pose challenges in public open space research [18,19]. For example, emerging technologies that rely on big data, global positioning system (GPS) location data [20-22], mobile signaling [2,18,23], social media data [24,25], and multisource big data have effectively assisted researchers in improving the efficiency of urban crowd behavior data acquisition [15,26-28]. However, in contrast to macroscale urban spatial vitality research, which prioritizes data breadth, microscale studies have exhibited heightened sensitivity toward the granularity of behavioral information, geographical positioning accuracy, and temporal coherence [29]. Moreover, owing to data granularity and collection limitations, large-scale behavioral datasets are insufficient for application at the microscale [30]. Therefore, researchers must identify a suitable method for acquiring small-scale spatial behavioral data that offers a convenient and efficient data acquisition approach while meeting the precision requirements of empirical research.
To obtain fine-grained behavioral data, mainstream behavioral data collection modes include questionnaire surveys and behavioral annotation methods [31-34], which are low-throughput approaches. Meanwhile, traditional manual statistics are inefficient, lack generalizability, and exhibit strong subjectivity when annotating the locations of behavioral data, making it difficult to record dynamic changes in activity over time. Wi-Fi probe technology has emerged as a method for acquiring small-scale spatial behavioral data, with significant advantages in terms of coverage and efficiency [35]. However, data interference caused by environmental obstructions and non-mobile-phone crowds cannot be avoided. In comparison, video data can faithfully represent human activities, provide rich and precise behavioral trajectory information, and include body movements and facial expressions, thus offering multidimensional raw data for behavioral research. Therefore, building on Whyte's early work [36], some researchers have combined video data with emerging computer vision (CV) to develop video computer recognition technology that can quantitatively capture crowd behavior information. The effectiveness of video computer recognition technology has been verified via social behavior monitoring [37], public open space pedestrian flow distribution [38], and walking behavior assessments [39]. Moreover, with the simultaneous support of fine-grained data, the relationship between crowd behavior and the physical environment of public open spaces has been analyzed at the microscale. For example, Loo et al. [40] constructed a regression model between social activities and interactive landscape features to analyze the relationship between urban furniture and social activities. Wei et al. [41] sampled surveillance videos from 16 urban streets to establish the relationship between behavioral preferences and urban physical boundaries. Furthermore, some studies have linked crowd behavior characteristics to spatial vitality, explored new factors representing spatial vitality [42], and constructed spatial vitality proxy models for automated spatial vitality assessment [29].
While these studies have indicated that using video computer recognition technology is a reliable approach for acquiring behavioral data since it combines the data granularity advantages of traditional manual statistics with the data collection efficiency and objectivity of emerging technologies, there are unresolved issues in public open space vitality research based on crowd behavior. First, most studies have been confined to evaluations within the boundaries of independent video frames [29,40]; as such, they have failed to link data samples from multiple viewpoints and have prevented a holistic analysis of the spatial vitality distribution in a public open space. Second, some studies have used complete behavioral trajectories as statistical units [29], or relied solely on grid-based methods to count pedestrian movements [38], hindering spatiotemporal behavioral feature segmentation. Finally, most studies have considered a single population quantity metric to represent spatial vitality [38,41]; as such, they have failed to provide a multidimensional exploration of crowd behavior characteristics.
Accordingly, we propose a method for measuring crowd behavior characteristics using discrete weighted trajectory feature points as statistical units. Consequently, human behavioral characteristics can be expressed within a single trajectory feature point. Furthermore, by using weighted calculations to associate neighboring trajectory feature points and retain continuous trajectory features, we provide an effective solution for integrating data from multiple viewpoints [17], thereby addressing the challenges of spatiotemporal behavioral feature segmentation. Finally, we analyze the distribution of spatial vitality among different spatial features of a public open space. Our main research questions are as follows: How can we build an automated monitoring model using detailed population behavior data to characterize spatial vitality? Are there spatial distribution differences in spatial vitality during different periods? Do spatial vitality characteristics vary heterogeneously for different spatial features?

Methodology

Framework Overview
We first collected video datasets capable of replicating crowd behavior within the study site, which we then partitioned into seven internally homogeneous independent spatial units. Subsequently, we used a computerized automated monitoring framework to quantify the multidimensional spatial vitality of the research site. Specifically, we deployed the YOLOv7-DeepSort model on the Python platform to extract pedestrian trajectories from videos. We then developed a method for quantifying spatial vitality from four dimensions: crowd heat, staying behavior ratio, movement speed, and spatial participation. For detailed methodologies, see Sections 2.3 and 2.4. Then, we designed a set of manual assessment and verification methods, with each metric corresponding to the automated monitoring results. Finally, we divided the study site into hexagonal grids. Each hexagonal grid area corresponded to the staying behavior recognition threshold. We selected hexagonal grid shapes because of their strong statistical properties for trajectory data [40]. We linked the trajectory feature point data to a grid to ascertain the average feature index of each spatial unit, facilitating the construction of spatial distribution maps. This approach enabled our investigation of the distribution of and variation in spatial vitality factors for different spatial features in the spatiotemporal dimensions. Figure 1 shows the research process.
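The grid-linking step can be illustrated with a minimal Python sketch. It assumes pointy-top hexagons indexed by axial coordinates and a hypothetical cell size `size_m`; the function names and the tuple layout are illustrative and are not taken from the study's implementation.

```python
import math
from collections import defaultdict


def point_to_hex(x, y, size_m):
    """Map a planar point (metres) to the axial (q, r) index of a pointy-top hexagon."""
    # Fractional axial coordinates of the point.
    q = (math.sqrt(3) / 3 * x - y / 3) / size_m
    r = (2 / 3 * y) / size_m
    # Cube rounding: round all three cube coordinates, then repair the one
    # with the largest rounding error so that q + r + s stays zero.
    s = -q - r
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return rq, rr


def mean_feature_per_cell(points, size_m):
    """Average a per-point feature value over each hexagonal cell.

    `points` is an iterable of (x, y, value) tuples (assumed layout).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y, value in points:
        cell = point_to_hex(x, y, size_m)
        sums[cell] += value
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}
```

Each trajectory feature point contributes its indicator value (for example, its movement speed) to exactly one cell, so the per-cell mean can be drawn directly as a spatial distribution map.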

Study Site
In the preliminary design phase of our study, we established three criteria for selecting the research site. First, due to the need for repeated experiments in method exploration, the research site should be of appropriate size and free of excessive obstructions. This ensures that utilizing multiple camera positions to cover the research area will not incur excessively high research costs. Second, the site should have a high level of activity richness to validate the stability of the spatial vitality monitoring method in analyzing different trajectory characteristics. Third, the site should have multiple spaces with distinct spatial features, aiding in testing the stability of the spatial vitality monitoring method under various spatial characteristics and exploring the differences in spatial vitality across different spatial features.
Ultimately, we chose Daxing Palace Square in Nanjing, southeast China (118°22′ E-119°14′ E, 31°14′ N-32°37′ N), as our research site. Daxing Palace Square covers an area of 13,738 m² and is free of excessive obstructions. It is located in the densely populated Xuanwu District, adjacent to the city's significant attractions, such as Nanjing Presidential Palace, and a municipal library (Figure 2). This location has high pedestrian flow and a rich variety of activities. Additionally, the square contains diverse small space types, including fully covered and semi-covered tree-lined squares and open hard-surfaced squares of varying sizes. Therefore, Daxing Palace Square met all three of our selection criteria.

Quantification of Physical Site Environment
During our spatial vitality assessments and analyses of the study site, averaging the overall vitality level of the site may have led to the mutual interference of pedestrian data for different spatial features, reduced data granularity, and resulted in biased analyses. Therefore, our study site required further subdivision, based on its spatial features, to enhance analysis precision.
Spatial scale, openness, and seating arrangements are fundamental factors that affect spatial vitality [17]. However, more complex factors, such as aesthetics and scenic qualities [43], are challenging to quantitatively analyze in nonlinear open spaces. Therefore, we selected the area size, enclosure degree, and number of rest facilities as the spatial features (Table 1). Combined with the site boundary conditions, we divided the hard-surfaced activity space into seven independent spatial units, labeled as s1-s7 (Figure 3), to ensure the homogenization of internal spatial features while differentiating features between adjacent spaces. We computed the area features using a geographic information system (GIS) platform. We obtained the number of facilities from field surveys. The enclosure degree referred to the extent to which a space was visually surrounded by structures, plants, and other environmental elements [43]. We uniformly positioned 12 locations around the site and captured images from 4 angles at a height of 170 cm. We used semantic segmentation to extract the pixel proportions of buildings, walls, and vegetation. We calculated the enclosure degree of a space (Figure 4) as follows: where En represents the enclosure degree of a space, and A_b, A_w, and A_g represent the average pixel proportions of buildings, walls, and vegetation in the photos, respectively.
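As a minimal sketch of how the enclosure degree might be computed from the segmentation output, the Python snippet below averages the three pixel proportions over the photos of one space and then adds them. Treating En as the plain sum of the three averages is an assumption made here from the symbol definitions; the dictionary keys and the function name are likewise hypothetical.

```python
def enclosure_degree(photos):
    """Estimate the enclosure degree En of one space.

    `photos` is a list of dicts holding the pixel proportions (0 to 1) of
    buildings, walls, and vegetation in each photo (assumed layout).
    """
    n = len(photos)
    if n == 0:
        raise ValueError("at least one photo is required")
    a_b = sum(p["building"] for p in photos) / n    # average building proportion
    a_w = sum(p["wall"] for p in photos) / n        # average wall proportion
    a_g = sum(p["vegetation"] for p in photos) / n  # average vegetation proportion
    # Assumption: En is the sum of the three average proportions.
    return a_b + a_w + a_g
```

With 12 positions and 4 angles, each space would contribute the photos taken at its own positions, and a fully open space would score close to 0.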

Obtaining Pedestrian Trajectory Data
Different times, weather conditions, and social and natural factors cause crowd behavior in public open spaces to exhibit heterogeneous characteristics [44,45]. Therefore, we tracked and recorded crowd behavior throughout the day. Through our preliminary research, we identified 08:30-20:30 as the site's peak activity hours. After 18:30, however, the sunset noticeably changed the lighting environment, which was not conducive to stable crowd data collection. Regarding weather conditions, collecting the data on cloudy days helped reduce the impact of hard-to-measure weather changes on crowd behavior data. Ultimately, we collected the research data at Daxing Palace Square from 08:30 to 18:30 on 18 July 2023. Recordings were taken every hour, and each recording lasted for 5 min.
We established six filming locations at Daxing Palace Square (Figure 5) to cover the primary areas of crowd activity. We used smartphones mounted on 1.7 m tripods for synchronized filming, with a standard video resolution of 1080p and a frame rate of 30 fps. We used the YOLOv7-DeepSort model to recognize crowd trajectories frame by frame in the video data. Compared with the traditional YOLOv5 model, this model exhibits superior efficiency and accuracy, aiding researchers in obtaining high-confidence crowd trajectory data in large quantities. To smooth the trajectory data, we conducted multiple comparative experiments and decided on a 1:30 ratio for secondary trajectory sampling (Figure 6). Finally, we orthographically projected the crowd trajectories based on predefined landmarks in the frames to obtain the crowd trajectory data used in the subsequent calculations. To ensure that the separately collected behavioral data did not overlap during the integration, we defined the shooting ranges for each collection point and removed the behavior point data that exceeded these ranges during post-processing. We performed this step after computing the behavioral data characteristics to ensure that each trajectory feature point was calculated in the dataset and to avoid errors caused by data cleaning. Simultaneously, we resolved the trajectory discontinuity issues at the edges of the imagery frames.
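The secondary sampling and range-cleaning steps can be sketched in Python as follows. This is an illustration under stated assumptions: the 1:30 ratio is read as keeping one point out of every 30 frames (one point per second at 30 fps), the shooting range is simplified to an axis-aligned rectangle, and the function names and tuple layout are hypothetical.

```python
def resample_track(track, ratio=30):
    """Keep one trajectory point out of every `ratio` frames.

    `track` is a list of (frame, x, y) tuples for one tracked ID.
    """
    return track[::ratio]


def clip_to_range(points, x_min, y_min, x_max, y_max):
    """Drop feature points outside a rectangular shooting range.

    The rectangle is a simplification; real shooting ranges may be polygons.
    """
    return [p for p in points if x_min <= p[1] <= x_max and y_min <= p[2] <= y_max]


# Ten seconds of a pedestrian walking along x at 1.5 m/s, filmed at 30 fps.
track = [(frame, frame * 0.05, 1.0) for frame in range(300)]
sampled = resample_track(track)                     # one point per second
# Per-point features would be computed here, before any points are removed.
kept = clip_to_range(sampled, 0.0, 0.0, 10.0, 5.0)  # then clip to the range
```

Clipping after the feature computation mirrors the order described above: points near the range boundary still have their neighbors available when their features are calculated.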
We obtained consent from the park management unit for the data collection. During filming, we used eye-catching filming equipment and signs to inform those present that their behaviors were being recorded. All image data were silenced and were only used during the research process.

Spatial Vitality Factors

Crowd Heat
According to Gehl, spatial vitality can be measured using the number of people in a space and their staying times [17]. Previous studies have defined staying time as the average length of the complete trajectories of different identifications (IDs) within the same period [29]. However, in complex environments with dense crowds, inevitable occlusions can lead to errors in corresponding trajectory IDs [39], significantly reducing the accuracy of staying time measurements. Therefore, we proposed a crowd heat factor, with discrete trajectory feature point data as the statistical unit, to calculate the total number of people present per second in a space (1) and their staying times in that space as follows: where H_k represents the crowd heat of space k, h_i represents the number of people in sample frame i, and M_k represents the area size of space k (m²).
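As a minimal sketch of the crowd heat factor, the Python function below sums the per-sample head counts h_i and divides by the area M_k. This form is an assumption drawn from the symbol definitions rather than a confirmed statement of Equation (1), and the function name is hypothetical.

```python
def crowd_heat(people_per_sample, area_m2):
    """Crowd heat H_k of one space under an assumed form.

    `people_per_sample` holds h_i, the number of people counted in each
    sampled frame; `area_m2` is M_k, the area of the space in square metres.
    Assumption: H_k = sum(h_i) / M_k.
    """
    if area_m2 <= 0:
        raise ValueError("area must be positive")
    return sum(people_per_sample) / area_m2
```

Because every sampled second in which a person is present adds one count, a person who lingers contributes more than one who passes through, which is how this reading captures both headcount and staying time without relying on stable trajectory IDs.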

Staying Behavior Ratio
We defined staying behavior as a person remaining within a defined area for a certain period, excluding activities such as running and walking. Previous research has shown that the frequency of staying behavior is a significant indicator of a space's attractiveness [46]. Public open spaces with high staying behavior frequency tend to generate crowd aggregation effects and social activities, thereby enhancing their spatial vitality. Moreover, distinguishing between non-staying and staying behaviors can assist researchers in better differentiating the impacts of different crowd trajectory types on a space's spatial vitality. Consequently, we utilized the following method for identifying staying behavior trajectory feature points and formulated identification criteria based on site observations: if a trajectory was within a 3 m radius for ≥6 s (Figure 7), we considered it a staying behavior trajectory.
ISPRS Int. J. Geo-Inf. 2024, 13, x FOR PEER REVIEW
Finally, we calculated the proportion of staying behavior at a space in a period using the following formula:
SP_k = sc_k / pc_k (2)
where SP_k represents the proportion of the staying behavior in space k, sc_k represents the total number of temporal staying behavior trajectory feature points in space k, and pc_k represents the total number of temporal trajectory feature points in space k.
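The staying criterion and the resulting ratio can be sketched in Python as follows. The 3 m radius and 6 s duration come from the text; anchoring the radius at the first point of each window is an assumed reading of the criterion, and the function names are hypothetical.

```python
import math


def flag_staying(track, radius_m=3.0, min_seconds=6.0):
    """Flag the staying feature points of one trajectory.

    `track` is a time-ordered list of (t_seconds, x_m, y_m) tuples for one ID.
    A run of points counts as staying if it remains within `radius_m` of the
    first point of the run for at least `min_seconds` (assumed interpretation).
    """
    n = len(track)
    staying = [False] * n
    for i in range(n):
        t0, x0, y0 = track[i]
        j = i
        # Extend the window while the next point stays inside the radius.
        while j + 1 < n and math.hypot(track[j + 1][1] - x0, track[j + 1][2] - y0) <= radius_m:
            j += 1
        if track[j][0] - t0 >= min_seconds:
            for k in range(i, j + 1):
                staying[k] = True
    return staying


def staying_ratio(flags):
    """Share of staying feature points among all feature points (sc_k / pc_k)."""
    return sum(flags) / len(flags) if flags else 0.0


# Eight seconds of lingering followed by four seconds of brisk walking.
track = [(t, 0.1 * t, 0.0) for t in range(8)] + [(t, 0.7 + 5.0 * (t - 7), 0.0) for t in range(8, 12)]
flags = flag_staying(track)
```

Working on feature points rather than whole trajectories means that a single trajectory can contribute both staying and non-staying points to the space it crosses.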

Movement Speed
Movement speed is a crucial dimension of measuring human activity in a space. When integrated with the results of staying behavior recognition, the movement speed of staying behaviors indicates the intensity of human activities, while the movement speed data for non-staying behaviors represents the transit speed of individuals. The former directly reflects spatial vitality, whereas the latter indicates a space's attractiveness. High-quality vibrant spaces encourage people to slow down and linger [45]. Therefore, we considered movement speed as a factor that represented spatial vitality. To obtain quantitative values for this factor, we calculated the Euclidean distance between adjacent trajectory feature points with the same ID to represent the movement distance per second (3) and the average movement speed for each space within a specific period (4) as follows: where S_{p_i} represents the movement speed of trajectory feature point p_i, t represents the sampling interval between two trajectory feature points, and len(tr) represents the total number of samples for a trajectory. AS_k represents the average movement speed for space k within a specific period. The movement speed of the last trajectory feature point is the same as that of the second to last point.
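A minimal Python sketch of the per-point and per-space speeds is given below. It assumes that the speed of point p_i is the Euclidean distance to the next point with the same ID divided by the sampling interval t, and that AS_k is the plain mean over all feature points in the space; the function names and the zero speed assigned to single-point tracks are illustrative choices.

```python
import math


def point_speeds(track, t=1.0):
    """Speed of every feature point in one trajectory.

    `track` is a time-ordered list of (x_m, y_m) points sampled every `t` seconds.
    The last point reuses the speed of the second-to-last point, as in the text.
    """
    if len(track) < 2:
        return [0.0] * len(track)  # assumption: a lone point has no measurable speed
    speeds = [math.dist(track[i], track[i + 1]) / t for i in range(len(track) - 1)]
    speeds.append(speeds[-1])
    return speeds


def average_speed(tracks, t=1.0):
    """Mean speed over all feature points of all trajectories in one space."""
    all_speeds = [s for track in tracks for s in point_speeds(track, t)]
    return sum(all_speeds) / len(all_speeds) if all_speeds else 0.0
```

Computing the speeds separately for the staying and non-staying points of a space then gives the activity intensity and the transit speed, respectively.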

Spatial Participation
We defined spatial participation as the extent to which human behavior was influenced by environmental or social forces within a space. Regarding crowd behavior, using space as a conduit for transportation results in a simple trajectory structure. In contrast, behaviors such as sitting, sightseeing, dancing, and others exhibit complex trajectory structures, indicating that these behaviors are subject to environmental or social forces to a greater extent [47]. This suggests people's deep participation in space, thereby enhancing spatial vitality. Therefore, we utilized quantifiable trajectory structures to characterize people's spatial participation and aid our examination of spatial vitality.
Regarding the quantification of the trajectory structures, structural differences exist for different trajectories, and structural differences also exist among the local structures within the same trajectory. However, most of the extant methods for determining trajectory structures can only assess the structure of complete trajectories; as such, they neglect the specificity of local structures within the same trajectory [29]. To enhance the representational capability of structural difference indicators for local trajectory segments, we used a structural difference calculation method based on distance and rank weighting and assigned structural difference values to trajectory feature points p_i as follows: First, we introduced the trajectory length difference (5) and trajectory angle difference (6), whose effectiveness has been demonstrated in prior research.
where D_{p_i} represents the trajectory length difference, A_{p_i} represents the trajectory angle difference, and the length and angle differences of the initial and final trajectory feature points are assigned the values of their adjacent trajectory feature points. Subsequently, we calculated the weighted values (7) for the trajectory feature points based on the rank and distance differences, which were weighted using Gaussian functions: where W_{p_ij} represents the weight of trajectory feature point p_j relative to p_i, d(p_i, p_j) denotes the rank difference between trajectory feature points, dist(p_i, p_j) signifies the Euclidean distance between trajectory feature points, and c_r and c_l are the parameters of the Gaussian weighting functions for the rank and Euclidean distance, respectively. We set c_r to 3 and c_l to 150, which was half of the threshold value for determining staying behavior. We considered the trajectory feature points that were further apart in rank and Euclidean distance to have relatively low weights, indicating low interrelatedness. Subsequently, we calculated the spatial participation (SC_{p_i}) (8) based on Formulas (5)-(7). We then obtained the temporal average of the spatial participation in space k (AC_k) (9) as follows:
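The rank and distance weighting can be illustrated with a short Python sketch. The product of two Gaussian kernels used here is an assumed form chosen to match the description (weights fall as either the rank difference or the Euclidean distance grows); it is not a confirmed statement of Formula (7). The defaults follow the stated c_r = 3 and c_l = 150, with the distance presumably in centimetres since 150 is described as half of the 3 m staying threshold.

```python
import math


def gaussian_weight(rank_diff, dist, c_r=3.0, c_l=150.0):
    """Weight of feature point p_j relative to p_i under an assumed kernel.

    `rank_diff` is the difference in sequence position within the trajectory;
    `dist` is the Euclidean distance, in the same unit as `c_l`.
    """
    rank_term = math.exp(-(rank_diff ** 2) / (2.0 * c_r ** 2))
    dist_term = math.exp(-(dist ** 2) / (2.0 * c_l ** 2))
    return rank_term * dist_term


def neighbor_weights(coords, i, c_r=3.0, c_l=150.0):
    """Weights of every point in a trajectory relative to point i."""
    return [
        gaussian_weight(abs(i - j), math.dist(coords[i], c), c_r, c_l)
        for j, c in enumerate(coords)
    ]
```

Under this kernel a point three ranks away, or 150 units away, keeps roughly 61% of the weight of the point itself, so each feature point is shaped mainly by its immediate local segment rather than by the whole trajectory.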

Manual Assessment of Spatial Vitality
Manual assessment is the mainstream method used to evaluate the effectiveness of machine monitoring and has been widely applied in various urban research fields [43], providing a humanistic research perspective. Currently, many indicators are used to evaluate spatial vitality [48]; however, different indicator types are prone to interference, affecting the evaluation results. Therefore, it was necessary to formulate a set of unified standards to evaluate the spatial vitality, provide training to experts, and standardize the theoretical basis of the evaluation results. Since we defined our aforementioned indicators to accommodate machine detection, they differed from those used in human assessments. Therefore, we adjusted the indicator definitions to align them with manual scoring conventions. Moreover, the indicators to be manually assessed were consistent with those used in machine monitoring in terms of their evaluation dimensions, allowing for the performance assessment of machine-monitored spatial vitality in the spatiotemporal dimensions. Finally, we utilized a set of spatial vitality evaluation standards composed of four sub-indicators: crowd heat, staying behavior ratio, movement speed, and spatial participation.

Assessment Criteria 2.6.1. Crowd Heat
We defined crowd heat as the total number of people present in a space per second during the video capture. Large transient crowds and small stationary crowds can provide equivalent crowd heat [17]. Therefore, we graded crowd heat using five levels (Table 2). Subsequently, 13 experts with backgrounds in urban and rural planning and landscape architecture volunteered to conduct manual assessments of the spatial vitality. The experts used their professional knowledge to determine the two crowd types' contributions to crowd heat and provide comprehensive scores. The experts focused on the independence of crowd heat as a spatial vitality factor and avoided interference from other crowd behavior characteristics.
We defined stationary behavior as a person remaining in a space for more than 6 s, including standing, resting, and activities, such as games and fitness. We graded the occurrence of stationary behavior in a space using five levels (Table 3). Because human observation cannot replicate machine precision, we required the experts to watch video clips repeatedly until they reached a consensus in the pre-scoring.
We defined movement speed as the crowd's movement speed. Unlike the machine monitoring indicators, we used walking speed to refer to the distance that a person covered in a space within a certain period, rather than the displacement distance per second. This was because excessively detailed observation requirements would reduce the accuracy of human judgments. We graded movement speed using five levels based on the average movement speed of the crowd in a space (Table 4).
Machine monitoring quantifies the structure of a trajectory, representing the interaction status of crowd behavior with both a space and other crowds. While a manual assessment cannot accurately assess the state of trajectory structure, the determination of fuzzy details, such as social behavior, upper body movements, and facial expressions, is an advantage of this method. We graded spatial participation using five levels based on the crowd behaviors in a space (Table 5).
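The two countable definitions above (people present per second, and staying for more than 6 s) can be illustrated with a minimal sketch. It is not the authors' implementation: the `(person_id, second)` input format, the function names, and the sample observations are assumptions made for illustration, and dwell time is simplified to the number of distinct seconds in which a person was observed.

```python
from collections import defaultdict

STAY_THRESHOLD_S = 6  # from the text: staying means remaining for more than 6 s


def crowd_heat_per_second(points):
    """Count distinct people observed in each second.

    `points` is an iterable of (person_id, second) pairs for one space
    (a hypothetical input format).
    """
    people_by_second = defaultdict(set)
    for person_id, second in points:
        people_by_second[second].add(person_id)
    return {s: len(ids) for s, ids in sorted(people_by_second.items())}


def staying_people(points, threshold_s=STAY_THRESHOLD_S):
    """Return ids of people observed for more than `threshold_s` seconds.

    Assumption of this sketch: dwell time is the number of distinct seconds
    in which the person was observed, so tracking gaps are not bridged.
    """
    seconds_by_person = defaultdict(set)
    for person_id, second in points:
        seconds_by_person[person_id].add(second)
    return {p for p, secs in seconds_by_person.items() if len(secs) > threshold_s}


# Made-up observations: person "a" is seen for 8 s, person "b" for 3 s.
points = [("a", t) for t in range(8)] + [("b", t) for t in range(2, 5)]
heat = crowd_heat_per_second(points)
stayers = staying_people(points)
```

In this toy input, only person "a" exceeds the 6 s threshold, while both people contribute to crowd heat in the seconds in which they overlap. This is the sense in which a transient crowd and a stationary crowd can yield the same crowd heat.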

Validation Process
We designed a standardized process for the manual assessment and subsequent validation to ensure the credibility of the results. First, considering the manual assessment's high workload, we selected five typical periods as the evaluation samples (08:30, 10:30, 12:30, 15:30, and 18:30) to represent the variations in spatial vitality throughout the day. We then established the following workflow for the manual assessment.

Preparation Phase
We briefed the experts on the experimental procedures and provided scoring instructions, sample videos, and a platform on which to seek clarification. As a single video could include images from multiple spatial locations, we informed the experts in advance about the specific ranges of each space within each video to prevent interference from other spaces in the scoring.

Pre-Scoring Phase
We selected the video data from 09:30 as the training sample. The experts completed two full viewings of the video from spaces S1 to S7 and then scored each evaluation indicator. Subsequently, we used the widely accepted intraclass correlation coefficient (ICC) to verify the scoring effectiveness [49]. This process was repeated until the scores passed the reliability test. As the scoring results were derived from a randomly sampled infinite dataset using a single method, the ICC used a two-way random model for the single rater type. We also used the absolute agreement type to consider systematic errors among the raters and monitor the potential for different individual raters to overestimate or underestimate the spatial vitality factor assessment.
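To make the ICC variant concrete, the following sketch computes the single-rater, two-way ICC in both its absolute agreement and consistency forms from the usual ANOVA mean squares. It uses the standard textbook formulas and is not taken from the authors' analysis; the function name and the toy score matrix are assumptions.

```python
import numpy as np


def icc_two_way_single(scores):
    """Return (absolute-agreement ICC, consistency ICC) for single ratings.

    `scores` has one row per rated target and one column per rater.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)  # one mean per target
    col_means = x.mean(axis=0)  # one mean per rater

    ss_rows = k * np.sum((row_means - grand) ** 2)
    ss_cols = n * np.sum((col_means - grand) ** 2)
    ss_total = np.sum((x - grand) ** 2)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                # between-target mean square
    msc = ss_cols / (k - 1)                # between-rater mean square
    mse = ss_error / ((n - 1) * (k - 1))   # residual mean square

    # Absolute agreement penalizes systematic rater offsets (the MSC term).
    icc_agreement = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    # Consistency ignores constant offsets between raters.
    icc_consistency = (msr - mse) / (msr + (k - 1) * mse)
    return icc_agreement, icc_consistency


# Toy matrix: the second rater always scores exactly one point higher.
toy_scores = [[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]]
agreement, consistency = icc_two_way_single(toy_scores)
```

In the toy matrix the two raters rank every target identically, so the consistency ICC is 1, whereas the constant one-point offset lowers the absolute agreement ICC to 5/6. A gap between the two forms is therefore a sign of systematic over- or underestimation.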

Scoring Phase
To prevent scoring errors due to the order of video viewing, the experts initially scored each indicator and space for the five sample periods. They then repeated the viewing process, during which they could adjust their scores. Once completed, the scoring results were subjected to ICC reliability testing. The ICC model used was consistent with that used in the pre-scoring phase.

Validation Phase
We employed ICC to assess the consistency between the human and machine monitoring scores, which were standardized using min-max normalization. Since the scoring for this phase was based on a random sample from an infinitely large sample set, the ICC used a two-way random model with a single rater type. To investigate the presence of systematic errors between the human and machine monitoring scores, we compared the results using two ICC models: absolute agreement and consistency (Table 6).
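Min-max normalization rescales each score series to the range [0, 1] so that five-level human scores and continuous machine values can be compared on a common scale. The sketch below shows the usual formula; the function name, the handling of a constant series, and the sample values are assumptions, not details reported by the authors.

```python
def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption of this sketch: a constant series carries no contrast,
        # so every value is mapped to 0.0 instead of dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up five-level human scores and made-up machine values for four samples.
human_scaled = min_max_normalize([1, 3, 5, 2])
machine_scaled = min_max_normalize([10.0, 20.0, 30.0, 15.0])
```

Both made-up series map to the same normalized values here because they differ only by a linear rescaling, which is exactly what min-max normalization removes.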

Validation Results
Table 6 shows that for the intergroup consistency model using human evaluation, the ICC values for all indicators exceed 0.6, showing substantial agreement or higher. This indicates the high reliability of the spatial vitality scores obtained through human evaluation.
Regarding the intergroup consistency between machine monitoring and human evaluation using the absolute agreement model, the values for crowd heat, staying behavior ratio, and spatial participation range from 0.4 to 0.6, indicating moderate agreement. The ICC values for movement speed range from 0.6 to 0.8, signifying substantial agreement. The consistency ICC model values range from 0.6 to 0.8 for all indicators, also indicating substantial agreement. The ICC values for crowd heat, staying behavior ratio, and spatial participation under the consistency model are significantly higher, while those for movement speed are similar in both models. This may be because counting, comparing ratios, and distinguishing different behaviors require cognitive effort and memory, making it difficult for people to make effective judgments in adjacent situations and leading to conservative estimations with significant systematic biases. Movement speed can be scored in a way that aligns with human perceptions of everyday speeds, such as "no displacement", "slow walking", "normal walking", "fast walking", and "jogging", resulting in more objective scoring with less systematic bias. This further validates the advantages of our machine monitoring approach in terms of objectivity and precision in the spatial vitality assessment of the study site.

Crowd Heat
Figure 8 (left) shows that crowd heat peaks at 08:30 and 17:30 and reaches a trough at 13:30, showing a pattern that is low at midday and high in the morning and evening. The temporal trends of crowd heat in various spaces are generally consistent with the overall study site, with peaks at the 08:30-10:30 and 17:30-18:30 periods and troughs at the 12:30-14:30 period.
Figure 8 (right) shows that crowd heat in s4 and s6 maintains relatively high values throughout the period, demonstrating the significant advantage of spaces with large enclosures in attracting crowd activity. S3 and s5, with lower enclosures, demonstrate the highest crowd heat values at 08:30 and 18:30, respectively. This may be because collective sports, children's play, and social activities were performed at 08:30 and 18:30, and these activities utilized the hard surfaces provided by the spaces with low enclosures. S2, also characterized by a low enclosure, experiences low crowd heat values throughout the period owing to the disruption in spatial continuity caused by the array of tree pits. The spatial distribution map (Figure 9) confirms the aforementioned observations, with high-frequency crowd distributions occurring in high-enclosure spaces or at landscape boundaries throughout the period. From 08:30 to 09:30 and from 17:30 to 18:30, the crowd distributions occur synchronously in low-enclosure spaces with good spatial continuity.
ISPRS Int. J. Geo-Inf. 2024, 13, x FOR PEER REVIEW 16 of 25

Staying Behavior Ratio
Figure 10 (left) shows that the staying behavior ratio peaks at 09:30, decreases from 09:30 to 11:30, stabilizes before 16:30, and increases from 16:30 to 18:30. Overall, the staying behavior ratio exhibits a U-shaped distribution pattern, with higher values in the morning and evening and lower values at midday.
Regarding the spatial distribution, Figure 10 (right) shows that the proportion of staying behavior in s6 exceeds 0.8 throughout the period, whereas other spaces exhibit larger fluctuations. Specifically, the amplitude in s3 reaches 0.82. This may be attributed to the high-enclosure nature of and excellent rest facilities in s6, which enhance its attractiveness for staying behavior. However, this does not imply that low-enclosure spaces have a negative impact on staying time. Influenced by aggregative crowds, such as square dancing, the staying behavior ratio value in s3 at 08:30 is second only to that of s6. Additionally, there is a clustering phenomenon of staying behavior at the landscape boundaries for various spaces, which is consistent with the "boundary effect" theory (Figure 11) [17].

Movement Speed
Figure 12 (left) shows that the movement speed peaks at 15:30 and reaches a trough at 09:30, exhibiting an overall camel-shaped distribution, with higher values at midday and lower values in the morning and evening. This pattern is negatively correlated with the other feature factors.
Figure 12 (right) shows similar fluctuation trends for the movement speed in various spaces. S3, s5, and s7 maintain relatively higher movement speeds throughout the period, whereas s4 and s6 exhibit lower movement speeds, suggesting a preference for high-speed behavior in low-enclosure spaces. Figure 13 shows that high-speed movement areas exhibit extensive clustering, particularly at 17:30, when continuous clusters appear in s2, s3, and s5. Low-speed movement areas mostly display isolated point-like distributions at the landscape boundaries. Additionally, low-speed movement areas in s3 exhibit clustered distributions, whereas those in s6 mostly show scattered distribution patterns and the phenomenon of intermittent high-speed behaviors. This suggests that low-density, low-speed behaviors tend to occur in high-enclosure spaces with better rest facilities, whereas high-density, low-speed behaviors, represented by square dancing, tend to occur in low-enclosure spaces.

Spatial Participation
Figure 14 (left) shows that the spatial participation fluctuates periodically, reaching a trough at 13:30 and experiencing significant growth from 14:30 to 18:30. Owing to the absence of crowd behavior data in s1, there is no spatial participation at 16:30, resulting in an overall decrease in spatial participation. Overall, the temporal distribution of spatial participation exhibits a U-shaped pattern, with higher values in the morning and evening and lower values at midday.

Figure 14 (right) shows that the spatial participation value in s5 is higher than that in other spaces from 08:30 to 09:30 due to the influence of morning exercise behavior. In other periods, s6 consistently leads in its spatial participation value. Figure 15 shows that at 17:30, s3 and s6 form large-scale clusters of low- and high-level spatial participation, respectively, reflecting differences in crowd behavior in spaces with distinct characteristics. In s3, the crowd behavior is passage-oriented, whereas in s6, crowds tend to engage in extensive interactions with the space. This aligns with the evaluations of other spatial vitality factors. Therefore, despite the relatively high levels of crowd heat in s3 and s6 at 17:30 (Figure 14 (left)), the spatial vitality values exhibit significant differences. Furthermore, from 08:30 to 09:30 and from 11:30 to 17:30, there is a clustering phenomenon of high-level spatial participation at the eastern boundary of s5, while isolated high-participation aggregation areas appear at the tree pool boundary of s1 from 10:30 to 17:30 (Figure 15). In other periods, the distribution of high- and low-participation crowd behavior is mixed, but high-participation areas are generally shown at the boundaries of high-enclosure spaces and landscape boundaries (Figure 15), possibly indicating that in-depth interactions between people and spaces are more likely to occur at spatial boundaries, which correlates with the tendency to exhibit staying behavior at boundaries.

Discussion
We utilized a CV-supported method for spatial vitality detection and conducted a spatiotemporal statistical analysis of a small site's vitality. Compared with manual counting [34] or the results obtained from continuous trajectories [29], we quantitatively determined the spatial coordinates and temporal sequences of crowd behavior per second, thus enhancing the efficiency and reliability of the spatial vitality assessment indicators. Moreover, by using fine-grained behavioral data, we conducted a multidimensional behavioral factor assessment of spatial vitality at an appropriate scale. This addresses the criticism that single-factor spatial vitality assessments are subjective [38] and enhances the feasibility of examining the coupling between spatial vitality and spatial features in small-scale space research.
Regarding the behavioral data, we transformed continuous behavioral trajectories into discrete multidimensional behavior trajectory feature points to efficiently expand the dataset volume, reduce the impact of erroneous data, and enhance the granularity of the behavioral data. Simultaneously, our method of overlapping different viewpoints using the weighted calculations of the behavioral features in the original dataset, and our subsequent use of spatial segmentation, effectively resolved the trajectory discontinuity issues at the edges of the imagery frames [41]. The manual assessment results demonstrated that the behavior feature calculation based on discrete trajectory feature points could characterize the various spatial vitality factors. However, we noticed that the tracking loss problems persisted owing to the occlusion of the camera view, behaviors being too far from the camera equipment, or behaviors extending beyond the cameras' coverage boundaries. This resulted in the loss of some data on crowd behavior. Our feasibility analysis demonstrated that the use of discrete trajectory feature point data to expand the dataset ensures that minor data losses do not affect the measurement of spatial vitality factors. Nevertheless, if video datasets with high camera view occlusion rates are used, or if the camera positions in the research site cannot cover the entire area, the accuracy of the spatial vitality measurement results cannot be guaranteed. Therefore, as with other studies that use video data, the quality of the video data is important.
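As a rough illustration of how a continuous trajectory can be turned into per-second feature points, the sketch below resamples one track at whole seconds by linear interpolation and attaches the displacement since the previous second as a speed. It is a simplified stand-in, not the authors' pipeline: the `(t, x, y)` input format, the function name, the interpolation step, and the choice of zero speed for the first point are all assumptions.

```python
import math


def to_feature_points(track):
    """Resample one trajectory at whole seconds and attach a speed to each point.

    `track` is a list of (t, x, y) samples sorted by time t, with t in seconds
    and x, y in metres (a hypothetical input format).
    Returns a list of dicts with keys t, x, y, speed (metres per second).
    """
    if len(track) < 2:
        return []
    points = []
    start = math.ceil(track[0][0])
    end = math.floor(track[-1][0])
    i = 0
    prev = None
    for t in range(start, end + 1):
        # Advance to the segment [track[i], track[i + 1]] that contains t.
        while i + 2 < len(track) and track[i + 1][0] < t:
            i += 1
        (t0, x0, y0), (t1, x1, y1) = track[i], track[i + 1]
        w = 0.0 if t1 == t0 else (t - t0) / (t1 - t0)
        x, y = x0 + w * (x1 - x0), y0 + w * (y1 - y0)
        # Assumption: the first point has no predecessor, so its speed is 0.
        speed = 0.0 if prev is None else math.hypot(x - prev[0], y - prev[1])
        points.append({"t": t, "x": x, "y": y, "speed": speed})
        prev = (x, y)
    return points


# Made-up track: 2 m east in 2 s, then 4 m north in 2 s.
feature_points = to_feature_points([(0.0, 0.0, 0.0), (2.0, 2.0, 0.0), (4.0, 2.0, 4.0)])
speeds = [p["speed"] for p in feature_points]
```

Because each feature point stands on its own, a track that is interrupted by occlusion still contributes the points recorded before and after the gap. This is one way to read the claim that minor data losses do not affect the measurement of the vitality factors.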
Regarding the correlation between spatial vitality and spatial features, our experimental results had strong explanatory power for the "boundary effect" theory [17], indicating a significantly positive impact of landscape boundaries on spatial vitality. This was deduced through our statistical analysis of the spatial vitality factor calculations and spatial vitality distribution maps supported by discrete trajectory feature point data. We found that the landscape boundary regions significantly and positively impacted the spatial vitality factors, which is consistent with Loo et al.'s results [40]. Moreover, our results indicated that the spatial vitality factors were not solely influenced by the landscape boundaries; spatial characteristics, such as the continuity of hard-surfaced spaces, rest facilities, and enclosure levels, also had complex nonlinear effects. Therefore, future research should use a nonlinear multivariate quantification model to explore the correlation between spatial features and spatial vitality.
Our study has a few limitations. Regarding the research site, first, the data used in this study were collected from a specific public open space in Nanjing, China. While this does not affect the credibility of the spatial vitality monitoring results, the findings on the temporal variation patterns of spatial vitality and its correlation with spatial features may not be generalizable to other cities and regions with different geographical and cultural backgrounds [51]. Second, since the purpose of this study was to explore a feasible method for the automated multidimensional monitoring of public open space vitality at a small scale, we chose a city square that is rich in spatial types and activity categories for method testing. In our future work, we will aim to explore the applicability of this method in different types of public open spaces. Moreover, in different types of public open spaces, more trajectory features may need to be considered. For instance, in a commercial street space, the trajectory features generated by the interaction between crowds and surrounding shops might be an important variable in spatial vitality monitoring, while in mountainous parks, changes in trajectory on the z-axis will become one of the trajectory features that need to be considered.
In terms of the selection of spatial features, further research into the nonlinear correlation between spatial features and spatial vitality, as mentioned in the previous discussion, will require the fine-grained quantification of spatial features and the inclusion of more potentially relevant spatial features. For example, Wei et al. [41] qualitatively discussed the "tele-coupling" phenomenon, which suggests that spatial features outside the research site may influence crowd behavior within the site, yet there is no method available for effective quantitative discussion. Moreover, if the spatial vitality monitoring method proposed in this paper were to be extended to studies of other types of public open spaces, more spatial features suited to the research subject would need to be considered. For instance, in urban street spaces, interface shapes and facade elements have been reported as important spatial features [16]. In urban parks with environments that are more complex than that of the research site of this study, natural spatial features, such as bodies of water and lawns, as well as artificial spatial features, like sculptures, fountains, and fitness equipment, need to be taken into consideration [30]. Our future work includes exploring methods to create a quantitative representation of spatial features to advance the quantitative study of the relationship between spatial features and spatial vitality in small-scale public open spaces.
Furthermore, we recognize that the sampling nature of our research could introduce errors. Our sampling periods covered the major activity periods of the day and met the accuracy requirements for conducting temporal change research. However, weather and lighting conditions, which are significant influencing factors of spatial vitality, change rapidly. Limited by current technological capabilities, we struck a balance between sampling time and efficiency and selected cloudy weather to mitigate the impact of lighting conditions. Nevertheless, sampling errors may remain. When more efficient video collection and behavior tracking methods become available, continuous behavior monitoring throughout the day can replace our sampling approach and enhance the accuracy of spatial vitality research in the future.

Conclusions
Determining and analyzing the vitality of public open spaces is crucial for urban planners and designers to understand how such spaces are used and to enhance their functionality, and CV technology enables the efficient quantitative data collection that this requires. Based on our behavioral data characteristics and the spatial vitality assessment, we transformed the behavioral data into four multidimensional spatial vitality factors, namely crowd heat, staying behavior ratio, movement speed, and spatial participation, and analyzed their spatiotemporal distribution in Daxing Palace Square. Our study contributions are outlined below.
First, our use of discrete trajectory feature points reduced the data collection errors and helped us calculate the spatial vitality factors, which effectively addressed the data sensitivity issues arising from trajectory fragmentation and inadequate sample sizes between different data collection ranges. Therefore, our constructed spatial vitality monitoring model and vitality distribution maps avoided interference from duplicate data or blank areas. Second, by leveraging the advantages of discrete trajectory feature points in the subdivision of the behavioral feature space, we constructed spatial vitality maps that showed that all four spatial vitality factors exhibited positive characteristics at the landscape boundaries, which is consistent with Gehl's "boundary effect" theory [17], indicating that landscape boundaries have a significantly positive impact on spatial vitality. Third, we simultaneously quantified the changing trends in spatial vitality factors at different periods. From 08:30 to 09:30 and from 17:30 to 18:30, the crowd heat and staying behavior ratio values in the square were higher, and the crowd movement speed value was lower, indicating people's greater tendency to engage with the square space at these times. Spatially, this phenomenon was more prominent in the high-enclosure areas of s4 and s6. Finally, we examined the complex coupling relationship between spatial features and spatial vitality.
The analysis revealed that better rest facilities consistently and positively impacted spatial vitality. High-enclosure spaces were more likely to attract more spatial participation behaviors, increasing the interactions between people and spaces. Conversely, low-enclosure spatial features attracted high- and low-speed aggregative behaviors, negatively impacting the staying behavior ratio and spatial participation behaviors. Overall, high-enclosure areas positively impacted spatial vitality, while low-enclosure areas exhibited higher spatial vitality during periods of concentrated aggregative behavior.
The multidimensional spatial vitality automated monitoring method proposed in this article is a standardized and scalable workflow. In urban research, it addresses the challenge of quantitatively analyzing spatial vitality at the human scale, assisting urban researchers in examining the spatiotemporal distribution and influencing factors of vitality in public open spaces across multiple dimensions and scales and in summarizing the resulting patterns. In the context of urban design and renovation, the method developed in this paper can assist designers in making quantitative assessments of the temporal and spatial differences in current spatial vitality. Furthermore, the spatial vitality recorded by discrete trajectory feature points can be analyzed at a conveniently chosen granularity and overlaid with other urban data for targeted urban space renovation. Designers can also quantitatively compare changes in spatial vitality before and after renovations, which provides scientifically backed material for design retrospectives.

Figure 2. Location map of the research site.

Figure 3. Spatial division of the study site.

Figure 9. Spatial distribution maps of crowd heat.

Figure 10. Temporal accumulation of staying behavior ratio (left); temporal trends of staying behavior ratio (right).

Figure 11. Spatial distribution maps of staying behavior ratio.

Figure 12. Temporal accumulation of movement speed (left); temporal trends of movement speed (right).

Figure 13. Spatial distribution maps of movement speed.

Figure 15. Spatial distribution maps of spatial participation.

Table 1. Summary of spatial features.

Table 2. Crowd heat rating rules.

Table 3. Staying behavior ratio rating rules.

Table 4. Movement speed rating rules.

Table 5. Spatial participation rating rules.
Most crowd behaviors have weak interactions with space, with few or no social behaviors.
Most crowd behaviors have moderate interactions with space, with relatively few social behaviors.
Most crowd behaviors have moderate interactions with space, with a moderate number of social behaviors.
Most crowd behaviors have strong interactions with space, with relatively many social behaviors.
Most crowd behaviors have strong interactions with space, with many social behaviors.