Multi-Scale Street Vitality Analytics: A Comprehensive Review of Technologies, Data, and Applications

Yongming Huang; Mingze Chen; Xiamengwei Zhang; Ryosuke Shimoda; Ruochen Yang

doi:10.3390/buildings15213987

¹ College of Forestry and Landscape Architecture, Xinjiang Agricultural University, Urumqi 830052, China

² Landscape Planning Laboratory, Graduate School of Horticulture, Chiba University, B-Building, Matsudo Campus 648 Matsudo, Chiba 271-8510, Japan

³ Urban Nature Design Research Lab, University of British Columbia, 2424 Main Mall, Vancouver, BC V6T 1Z4, Canada

⁴ College of Fine Arts, Capital Normal University, Beijing 100048, China

Buildings 2025, 15(21), 3987; https://doi.org/10.3390/buildings15213987

This article belongs to the Topic 3D Computer Vision and Smart Building and City, 3rd Edition

Abstract

Street vitality is an important indicator of urban attractiveness and sustainable development, and it has become a central topic in contemporary urban planning and research. Using the PRISMA methodology, this review systematically examines four major technologies including machine learning (ML), space syntax, GPS, and sensors, together with six categories of data that are commonly used in street vitality studies. The analysis traces the methodological development of these approaches and identifies application trends across both macro and micro spatial scales. ML has become the leading technology in this field, showing strong performance in dynamic modeling, pattern recognition, and the integration of multiple data sources. GPS provides high temporal accuracy for tracking mobility and identifying spatiotemporal dynamics. UAVs and sensor networks make it possible to observe environmental and behavioral responses in real time. When combined, these technologies support four main research themes: the built environment and vitality, pedestrian mobility and urban dynamics, spatial and visual characterization, and social interaction. Other complementary data sources, including social media, online maps, surveys, and government statistics, expand analytical coverage and improve contextual interpretation across different spatial and cultural settings. The review emphasizes the need to connect advanced technologies and diverse data sources with broader concerns of governance, ethics, and civic participation, while maintaining a focus on methodological and data-based synthesis. By clarifying the technological pathways and data foundations of street vitality research, this study provides a structured reference for researchers, urban designers, and policymakers who aim to develop evidence-based and socially responsive frameworks for urban space evaluation and planning.

Keywords: big data; computer vision; street space; space syntax; machine learning

1. Introduction

Streets are fundamental components of the urban environment. They provide primary settings for daily human activity and help shape the public image of a city [1]. In this context, street vitality is one dimension of broader urban vitality and arises from the behaviors of users, who act as its principal agents [2,3]. Vitality can be evaluated through observable patterns in pedestrian activity, including type, frequency, and diversity [4,5]. Building on this premise, street open spaces function as physical and social arenas that host varied activities and facilitate interaction [6]. As a result, enhancing street vitality supports quality of life, well-being, and social cohesion, and it advances sustainable urban development [7]. However, transportation-oriented planning often marginalizes the social role of streets, and this tendency can erode vitality over time [8,9].

Against this background, previous research on vitality assessment relied mainly on manual approaches such as questionnaires, structured field observation, and manual counts [6,10,11]. These approaches yield detailed insights into spatial use [8,12,13,14,15,16], yet they are time consuming, labor intensive, and often inefficient [17]. Samples are frequently small and measurement is vulnerable to subjectivity, which can reduce accuracy [10]. As a further consequence, traditional approaches have limited capacity to capture pedestrian preferences and behavior at scale [5,7]. Hence, many studies linking vitality with physical indicators of the built environment remain qualitative and descriptive [4,12], with limited quantitative evidence to support robust assessment [5].

Recent advances in big data and computing change this landscape by enabling the collection of large volumes of dynamic data with precise spatial information [10]. These capabilities provide a strong basis for quantifying human activity, characterizing urban form, and identifying latent relationships in the data [18,19]. By making the complexity and diversity of urban vitality observable at scale, they offer cost effective and accurate ways to study human behavior, movement, preferences, attributes, and evaluations of the physical environment [4,20,21,22,23].

Building on these developments, a growing body of work applies computer vision (CV) to street-level imagery (SVI) in order to segment and quantify built environment features and examine their associations with street vitality [4,10,14,15,21,22,23,24,25,26]. CV has also been used to infer pedestrian trajectories and attributes, which supports analyses of urban walkability [17,22]. In addition, GPS tracking, the Internet of Things, and deep learning (DL) have been used to optimize street layouts and to analyze the spatiotemporal distribution of pedestrian density, thereby informing urban planning and management [27,28,29].

These studies point to a shift toward integrating multiple technologies and measuring pedestrian activity and environmental elements at larger scales. In particular, street-level imagery and machine learning (ML) allow analysis across wider geographic areas than manual methods [17]. Street-level imagery provides a human perspective that is well suited to behavior-focused studies [20]. Video data is especially effective for capturing dynamic activity and supports accurate identification of pedestrian attributes, activity types, and temporal variation [4,22]. At the same time, data collection devices are sensitive to weather, lighting, and occlusion, which can lower resolution and reduce accuracy [4,20,22,28,30]. In addition, GPS, Internet of Things, and social media data enable broader-scale analyses with greater applicability and generalizability [31].

Although research on street vitality has advanced, most studies still emphasize macro-scale contexts such as cities and neighborhoods. By comparison, technologies tailored for micro-scale street-level analysis remain underexplored. There has also been limited evaluation of the feasibility, performance, and integration of new tools intended to capture the dynamics of street vitality. This gap underscores the need for a systematic assessment of how advanced technologies can address challenges in behavioral dynamics, environmental interactions, and real-time spatial analysis.

The purpose of this paper is to examine the technologies used in street vitality research and to summarize their advantages, limitations, and recent trends. As a methodological and data-oriented systematic review, it focuses on key technologies, data sources, and application themes in micro-scale street vitality studies. It also offers an integrated assessment of data capacity, applicability, and technical constraints. We recognize that governance, public participation, rights, and oversight are central to practical implementation. Although a full treatment of these social and institutional dimensions lies beyond the scope of this paper, we briefly address them in a later section to provide initial guidance for future research and practice.

In addition, this study identifies areas for improvement and outlines future research directions. It draws on insights from urban and community planning, historic district research, and pedestrian behavior analysis to show how these perspectives can strengthen street vitality research. Building on this foundation, we propose the following research questions:

(1): What are the key technologies currently used in street vitality research? How do these technologies evolve, and what is their applicability?
(2): What are the typical application themes and spatial scopes of these technologies?
(3): What are the key limitations of the technologies applied in street vitality research? What is the potential for future technologies and methods?

The remainder of this paper is structured as follows. Section 2 describes the systematic review methodology. Section 3 presents the quantitative results and offers a comprehensive categorization of the current technological approaches in street vitality analysis. Section 4 discusses the findings, highlighting the main strengths and weaknesses while outlining future research pathways. The final section provides a summary of the key contributions and concluding remarks.

2. Materials and Methods

2.1. Research Framework and Overview

Drawing upon prior definitions and dimensions of street vitality [2,6], as well as the established links between pedestrian behavior and urban spatial characteristics [5], this study defines street vitality at the micro scale within urban outdoor environments. Specifically, street vitality is conceptualized as the dynamic expression of diverse pedestrian activities that emerge through interactions between physical elements of the built environment (e.g., street design, greenery, seating, and other street-level infrastructure) and patterns of social interaction. This definition highlights the fine-grained, spatially heterogeneous, and context-dependent nature of street vitality in localized urban settings, distinguishing it from broader conceptualizations of urban vitality assessed at the neighborhood or city scale. Adopting this micro-scale focus clarifies the scope for subsequent literature searches and analyses related to street vitality.

To identify studies that employ advanced technologies in the analysis of street vitality, this research follows the PRISMA protocol for systematic reviews (Table S1), in line with best practices established in recent urban analytics literature [20,21,32,33]. The systematic review process is illustrated in Figure 1. Given its authoritative reputation and reliability, the Web of Science (WoS) database was selected to ensure a focused and credible literature base. This database is particularly appropriate considering the interdisciplinary nature of street vitality research, which spans urban planning, computer vision, and data analytics.

Figure 1. Systematic review process (adapted from PRISMA protocol).

In view of the rapid advancements in artificial intelligence and machine learning, and to ensure both data relevance and currency, the review focuses on English-language publications from January 2015 to January 2025. This time frame facilitates the identification of emerging trends and state-of-the-art technological applications in studies of micro-scale urban outdoor vitality.

2.2. Search Criteria

The search was conducted in the Web of Science Core Collection using the Advanced Search function with Boolean operators. The field scope was set to All Fields so that each query examined the entire record content. Multi-word terms were entered as exact phrases to avoid unintended expansion and to ensure consistent interpretation across all searches.

This approach was chosen because the Advanced Search function applies explicit Boolean logic and fixed field scoping. It retrieves only records that match the specified terms exactly and does not include synonyms, related expressions, or inferred topics. As a result, the search outcomes are stable and can be reproduced by any reader who applies the same settings. In contrast, Smart Search operates as a semantic mode that reformulates queries automatically and ranks results by relevance. It often introduces synonyms, near matches, and concept-level terms. Consequently, Smart Search tends to return a larger and more variable number of records that may change over time or across sessions due to its algorithmic expansion. For a PRISMA-based systematic review that requires transparency and replicability, this study therefore employed Advanced Search rather than Smart Search to maintain methodological consistency.

After defining the search strategy, keywords were organized into three conceptual categories: technology, topic, and context, as shown in Table 1. These categories ensured a balanced coverage of technological approaches, thematic focuses, and urban settings. One term from each category was combined to form 540 independent queries. Each query contained one technology term, one topic term, and one context term, connected by the Boolean operator AND (such as “Technology” AND “Topic” AND “Context”) (Table S2). All searches were executed individually under identical parameters. The publication period was set from 2015 to 2025, the language was restricted to English, and the document types were limited to journal articles and review papers included in the Core Collection.
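The query construction described above amounts to a Cartesian product over the three keyword categories. The following minimal sketch illustrates this under stated assumptions: the term lists are hypothetical placeholders (the actual keywords are those in Table 1), and the function name `build_queries` is introduced here for illustration only.

```python
from itertools import product

# Hypothetical placeholder terms; the actual keywords are listed in Table 1.
technology_terms = ["machine learning", "space syntax", "GPS"]
topic_terms = ["vitality", "walkability"]
context_terms = ["street", "urban"]


def build_queries(technologies, topics, contexts):
    """Return one exact-phrase Boolean query per (technology, topic, context) triple."""
    return [
        f'"{tech}" AND "{topic}" AND "{ctx}"'
        for tech, topic, ctx in product(technologies, topics, contexts)
    ]


queries = build_queries(technology_terms, topic_terms, context_terms)
print(len(queries))  # 3 * 2 * 2 = 12
print(queries[0])    # "machine learning" AND "vitality" AND "street"
```

The number of queries always equals the product of the three list lengths, which is how a fixed keyword table yields a fixed, reproducible query count.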

Table 1. Keywords used in the literature search.

Before implementing the full set of searches, a pilot test was conducted to verify keyword performance. Combinations that included the terms age, gender, and time yielded few and only weakly relevant records. Based on this preliminary evidence, these terms were excluded in the formal search to improve precision while maintaining all other parameters unchanged.

Through the complete set of 540 queries, the database returned a total of 1546 records. Duplicate entries were then removed based on article titles and digital object identifiers, resulting in 1003 unique papers for subsequent screening and analysis. This refined dataset provided the foundation for the next stage of the systematic review process.
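Deduplication on titles and digital object identifiers can be sketched as follows. This is an illustrative approach rather than the exact procedure used in the review: the normalization rules (case folding, punctuation stripping) and the record fields `title` and `doi` are assumptions, and the sample records are invented.

```python
def normalize_title(title):
    """Lowercase a title and collapse every non-alphanumeric run to a single space."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in title)
    return " ".join(cleaned.split())


def deduplicate(records):
    """Keep the first record seen for each DOI or normalized title."""
    seen_dois, seen_titles, unique = set(), set(), []
    for rec in records:
        doi = (rec.get("doi") or "").strip().lower()
        title = normalize_title(rec.get("title") or "")
        if (doi and doi in seen_dois) or (title and title in seen_titles):
            continue  # duplicate by DOI or by title
        if doi:
            seen_dois.add(doi)
        if title:
            seen_titles.add(title)
        unique.append(rec)
    return unique


# Invented records for illustration only.
records = [
    {"title": "Street Vitality and Urban Form", "doi": "10.1000/example.1"},
    {"title": "Street vitality and urban form.", "doi": ""},     # same title, DOI missing
    {"title": "A Different Title", "doi": "10.1000/EXAMPLE.1"},  # same DOI, different case
    {"title": "Pedestrian Counting from Images", "doi": "10.1000/example.2"},
]
unique_records = deduplicate(records)
print(len(unique_records))  # 2
```

Matching on either key is deliberately conservative: a record is dropped if its DOI or its normalized title has already been seen, which handles exports in which one of the two fields is missing.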

2.3. Selection Criteria, Screening, and Extraction of Information

The initial batch of 1003 papers underwent a systematic screening process:

(1): Non-English papers were excluded, resulting in 1001 papers.
(2): Title Review: Papers unrelated to urban studies, such as those focused on biomedical research, were excluded based on keywords (urban, community, street, vitality, technology, and method). This reduced the dataset to 514 papers.
(3): Abstract Review: Papers were further filtered based on contextual relevance (e.g., walkability, vibrancy, urban, street), narrowing the selection to 73 papers.
(4): Full-text review: The remaining 73 papers underwent a detailed examination, and 62 papers were ultimately included in the study.
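The screening funnel can be checked with simple arithmetic. The sketch below uses only the counts reported in the steps above; the stage labels are shorthand introduced here.

```python
# Counts reported for each screening stage.
stages = [
    ("unique records", 1003),
    ("English only", 1001),
    ("title review", 514),
    ("abstract review", 73),
    ("full-text review", 62),
]

# Number of papers excluded at each transition between consecutive stages.
excluded = [prev_n - n for (_, prev_n), (_, n) in zip(stages, stages[1:])]

for (name, n), dropped in zip(stages[1:], excluded):
    print(f"{name}: kept {n}, excluded {dropped}")

overall_retention = stages[-1][1] / stages[0][1]
print(f"overall retention: {overall_retention:.1%}")
```

Title and abstract review together account for 928 of the 941 exclusions, so roughly 6% of the deduplicated records reach the final corpus.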

After completing the full-text review, we conducted a theory-informed narrative synthesis that relied on a structured extraction form rather than qualitative coding. The lead reviewer read each study in full and entered objective information into a standardized table. The table recorded bibliographic details, research questions, study sites and subjects, data sources, data types, algorithms and software, analytical procedures, principal findings, and the advantages and limitations reported by the authors. The other researchers independently checked an initial subset of entries to calibrate the extraction form and to finalize variable names and definitions. Once calibration was complete, the full dataset was processed, and the remaining entries were verified against the original texts. A senior researcher coordinated the overall consistency check and resolved any remaining disagreements through discussion. A simple log of ambiguous cases and final decisions was maintained to ensure that the synthesis remained transparent and traceable.

Thematic grouping was derived directly from the completed extraction table. Four analytical themes were identified with reference to the research questions and to previous reviews. Each paper was assigned one primary theme that represented the dominant data modality in the extraction table, together with the main analytical pipeline and topical focus of the study. When a paper clearly incorporated multiple modalities or analytical pipelines, these were retained as secondary tags. Quantitative summaries report only the primary theme to prevent double counting, while both tables and text indicate secondary tags where interdisciplinary overlaps occur.

To complement the thematic framework, two additional classifications were introduced. One reflects the type of technology and data modality, and the other reflects the analytical approach. Presenting results in this way allows readers to understand how study designs relate to both input data and analytical methods while keeping the overall reporting concise. For transparency, a compact data dictionary that defines each extraction field and specifies the rules for theme assignment can be provided upon reasonable request. These supplementary materials enable an independent reader to reconstruct the synthesis process without access to individual reading notes.

2.4. Literature Statistics and Visualization

The 62 reviewed papers were added to our favorites in the WoS database and exported as a dataset. Subsequently, Python 3.9 was used to clean the input data and perform a series of analyses, including descriptive statistics, keyword occurrences, data sources, and temporal trends. The processed data was then visualized through graphs and charts to support further research.

3. Results

3.1. Trends in Urban Vitality Research: Keywords, Data Sources, and Technology

The study systematically reviewed literature published between 2015 and 2025. The analysis covered publications from 11 countries and 17 data sources and identified a total of 149 keywords grouped into 19 categories. The study areas represented twelve countries, including China, Japan, the United States, France, South Korea, Germany, Chile, Algeria, India, Israel, Singapore, and the United Kingdom.

A total of 32 journals published related studies. Among them, the ISPRS International Journal of Geo-Information accounted for the largest share (14.5%), representing roughly one-seventh of all publications. Other journals with strong disciplinary relevance, such as Sustainability (12.9%), Cities (6.5%), and Land (4.8%), also made substantial contributions. In addition, several studies appeared in journals that specialize in spatial analysis, urban systems, and environmental research, including Applied Geography (4.8%), Computers, Environment and Urban Systems (4.8%), and the International Journal of Environmental Research and Public Health (3.2%) (Figure 2a). This concentration indicates that the field remains methodologically grounded in spatial and data science, while urban planning journals are increasingly integrating computational approaches.

Figure 2. (a) Top 10 Journals by number of publications. (b) Publication trend over years. (c) Publication of ML-based papers trend over years. (d) Percentage of ML publication in total over years.

Figure 2b illustrates a clear rise in publications applying advanced technologies to street vitality research after 2018, with the number reaching its highest level in 2022, which accounted for 22.6% of all studies. This increase highlights the growing importance of street vitality research as an emerging area at the intersection of urban planning and technological innovation.

In the earlier years from 2015 to 2019, the use of machine learning techniques was still limited, accounting for only 3.3% to 6.7% of annual publications. Since 2020, however, studies employing machine learning have grown rapidly, reaching another peak in 2024 (26.7%). This trend confirms the expanding influence of machine learning as a core analytical tool in the study of street vitality (Figure 2c). It also reflects a broader methodological transition from descriptive observation toward data-driven modeling.

The changing proportion of studies using machine learning further reinforces this finding. The share of such publications increased sharply after 2019 and reached a prominent level in 2024 (Figure 2d). Although the percentage remained relatively low in certain years such as 2017 and 2018, which suggests a delay in technology adoption, the overall pattern demonstrates a consistent and accelerating integration of machine learning into this research domain.

A detailed examination of high-frequency keywords and their co-occurrence networks is shown in Figure 3a. Clusters of the same color represent groups of frequently co-occurring terms. The size of each node corresponds to the frequency of a keyword across all articles, while the thickness of the lines between nodes indicates the frequency of co-occurrence. The analysis identified seven clusters, including three primary ones—“Design and Walking,” “Environment and Body-Mass Index,” and “Land-Use and Behavior”—and four secondary ones—“Space Syntax and Centrality,” “City and Urban Vitality,” “Vibrancy and Urban,” and “Cities and Built Environment.” These clusters reveal that behavioral, environmental, and morphological perspectives co-exist within the field but remain methodologically fragmented.
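The two quantities encoded in such a network, node size (keyword frequency) and edge thickness (co-occurrence frequency), can be computed with a few lines of code. The sketch below is illustrative only: the keyword lists are invented, and the clustering step that assigns colors is not reproduced because the source does not specify the algorithm used.

```python
from collections import Counter
from itertools import combinations

# Invented keyword lists, one per paper.
papers = [
    ["walking", "design", "land-use"],
    ["walking", "design"],
    ["land-use", "travel", "walking"],
]

keyword_freq = Counter()  # node size: number of papers containing the keyword
pair_freq = Counter()     # edge weight: number of papers containing both keywords

for keywords in papers:
    unique = sorted(set(keywords))  # count each keyword once per paper
    keyword_freq.update(unique)
    pair_freq.update(combinations(unique, 2))  # alphabetically ordered pairs

print(keyword_freq["walking"])           # 3
print(pair_freq[("design", "walking")])  # 2
```

Sorting the keywords before pairing ensures that each unordered pair is stored under a single key, so ("design", "walking") and ("walking", "design") are not counted separately.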

Figure 3. (a) The keywords and their co-occurrence network, and (b) Most frequent keywords in all papers.

Figure 3b presents the most frequently occurring keywords, such as “Travel, Land-Use” (n = 10), “Design” (n = 9), and “Walking, Access, Physical Activity” (n = 8), along with their associated terms that show strong co-occurrence links. These results indicate a persistent emphasis on physical and environmental variables rather than social or perceptual aspects, suggesting an imbalance in current research priorities.

Taken together, these temporal, spatial, and thematic patterns demonstrate that research on street vitality is experiencing a clear transition toward technology-oriented and data-driven paradigms. At the same time, structural gaps remain between environmental, behavioral, and social perspectives, pointing to the need for more integrated and balanced approaches in future studies.

3.2. Classification Based on Technology and Type of Data Input

Figure 4 visually illustrates the categorical relationships among “Category,” “Data Input,” and “Data Source.” This diagram highlights the flow of data from broader categories to specific sources, providing a clear numerical representation of the data classification used in urban vitality studies. The diagram effectively demonstrates how data inputs connect with their sources, enabling a systematic understanding of the data distribution. The supporting dataset and extracted information are detailed in Appendix A for further reference. Additionally, a detailed analysis and summary were conducted for each technical type.

Figure 4. Keyword frequency statistics.

3.2.1. Supervised Learning, Unsupervised Learning, and Deep Learning

A total of 33 studies involving ML were identified in this review. Combined with the trends discussed in Section 3.1, these findings indicate that ML is gradually becoming the dominant methodological approach in street vitality research. All identified studies employed supervised learning (SL), while only two incorporated unsupervised learning (UL). This pattern is consistent with the observations of Grekousis (2019) and Ullah et al. (2020), who noted that the application of ML in urban studies continues to rely primarily on SL [34,35]. Among the identified models, SVM, LightGBM, LDA, and random forest demonstrated strong applicability. These algorithms represent classical ML approaches, in contrast to studies that incorporate DL components such as CNN-based feature extraction [23,30,36]. Overall, the corpus is dominated by supervised, classical ML, whereas UL applications are comparatively rare in the current evidence base. This distribution likely reflects data availability (particularly labeled samples), computational/resource constraints, and interpretability needs, rather than a linear progression from “conventional” to “advanced” methods. In this paper, we treat DL as a subfield of ML, while SL and UL denote learning paradigms that can be instantiated with either classical algorithms or deep models.

The majority of studies relied on image data as the primary input, typically sourced from mapping service platforms such as Google Maps, Baidu Maps, Tencent Maps, and AutoNavi. A smaller portion of research incorporated video or numerical data. For example, Wong et al. (2021) used the Oxford Town Centre and PETS video datasets [22], although the use of video data overall remains limited. The application of CV has been particularly prominent within ML-based approaches, most often using static two-dimensional images as input data (Table 2). In terms of regional distribution, studies conducted in China tend to draw from domestic platforms such as Baidu Maps and Tencent Maps, while international research relies mainly on OpenStreetMap (OSM) and Google Maps. This difference highlights the disparities in data coverage and accuracy among platforms [20]. Such regional divergence illustrates how analytical scope is closely tied to data accessibility and reveals persistent inequalities in research infrastructure across regions.

Table 2. Classification of specific technologies that are the subject of street vitality research.

Despite growing diversity in data sources, several limitations continue to hinder progress. Image data can be affected by visual obstructions, inadequate lighting, or unfavorable weather conditions, while nighttime vitality remains a largely underexplored topic. Video data, although rich in dynamic information, has not yet been widely adopted because of high acquisition and processing costs. Future studies should pay greater attention to the potential of video data and to the temporal variations of street vitality under different climatic and environmental conditions. Expanding these dimensions would enable a more comprehensive understanding of urban vitality and its dynamic spatial patterns.

3.2.2. Space Syntax

Space syntax is an essential tool for studying street connectivity and pedestrian dynamics and plays a critical role in street vitality research. Through mathematical and visualization methods, it quantifies and intuitively illustrates the relationship between complex urban spatial structures and human behaviors [55].

In this review, seven studies that applied space syntax were identified. These studies mainly focused on the relationship between spatial configuration, accessibility, and pedestrian volume. The data types used include road networks, POI, built environment data, and pedestrian volume and density. These datasets were obtained from official maps, GIS, online mapping platforms, and field observations [11,56,57,58,59,60].
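To make the configurational measures concrete, the following sketch computes two standard space syntax quantities, mean depth (MD) and relative asymmetry, RA = 2(MD − 1)/(k − 2), on a small street graph. It is a minimal illustration, not the procedure of any reviewed study: the five-node network is hypothetical, the graph is assumed to be connected, and dedicated space syntax software applies further normalization that is omitted here.

```python
from collections import deque


def mean_depth(graph, start):
    """Average number of topological steps from `start` to every other node (BFS)."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in depth:
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)
    # Assumes a connected graph, so every node has been reached.
    return sum(depth.values()) / (len(graph) - 1)


def relative_asymmetry(graph, start):
    """RA = 2 * (MD - 1) / (k - 2); lower values indicate a more integrated node."""
    k = len(graph)
    return 2 * (mean_depth(graph, start) - 1) / (k - 2)


# Hypothetical network: streets A-B-C-D form a chain and E branches off B.
graph = {
    "A": ["B"],
    "B": ["A", "C", "E"],
    "C": ["B", "D"],
    "D": ["C"],
    "E": ["B"],
}

for street in graph:
    connectivity = len(graph[street])  # number of directly connected streets
    print(street, connectivity, round(relative_asymmetry(graph, street), 3))
```

In this toy network, B has the highest connectivity and the lowest RA, which matches the intuition that a well-connected central street is the most accessible, whereas the dead-end street D is the most segregated.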

Previous research has shown that integrating multiple data sources, such as economic indicators and POI data, can substantially enhance both the depth and scope of space syntax analysis. For example, Sheng et al. (2021) used POI data from Baidu Maps to examine the relationship between street connectivity and pedestrian distribution [60]. Lee et al. (2020) combined GIS data with built environment variables to explore spatial dynamic characteristics [57]. Similarly, X. Li et al. (2021) analyzed the spatial distribution of street vitality using Baidu heatmaps [61]. Yıldırım and Çelik (2023) conducted a human-centered analysis using SVIs [11], and Nag et al. (2022) examined pedestrian volume by integrating satellite imagery with field observations [58].

Despite these advancements, space syntax remains constrained by its dependence on static spatial data, particularly in micro-scale applications at the street level, and its use in dynamic environments still holds substantial potential for further development. Integration with big data technologies is widely regarded as an important direction for future research, because incorporating multi-source data can improve both the comprehensiveness and accuracy of findings.

3.2.3. Multi-Variate Big Data and Computer Analytics

The integration of multi-source data and computational analysis techniques has added considerable depth and breadth to street vitality research. By combining data from GPS, GIS, and UAVs with computational analysis models and visualization tools, these approaches have revealed the spatial and temporal dynamics of street vitality and have advanced the quantification and systematization of research in this field.

For example, S. Chen et al. (2022) conducted a quantitative analysis of urban vitality using GIS data, including POI density, taxi flow, building distribution, and road network data, combined with a Shannon entropy model [62]. They further employed a regression model to examine the impact of land-use diversity on vitality. Sugimoto et al. (2019) performed an in-depth analysis of tourists' movement patterns and behavioral characteristics by combining GPS tracking technology with questionnaire surveys [63]. Similarly, Parra-Ovalle et al. (2023) used UAVs in conjunction with GIS to conduct spatial analysis of pedestrian distribution [27]. Furthermore, Gan et al. (2021) explored the impact of neighborhood scale on urban vitality by integrating check-in, POI, and commercial distribution data and applying the kernel density estimation method [64].
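As an illustration of how an entropy-based diversity index works, the sketch below computes the Shannon entropy H = −Σ pᵢ ln pᵢ of the POI categories observed along a street, together with a version normalized to the range [0, 1]. The exact formulation used in the cited study is not given here, so the normalization by the number of observed categories, the function names, and the sample POI lists are all assumptions.

```python
import math
from collections import Counter


def shannon_entropy(categories):
    """Shannon entropy (in nats) of the category shares in `categories`."""
    counts = Counter(categories)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def normalized_entropy(categories):
    """Entropy divided by its maximum, ln(k), for k observed categories."""
    k = len(set(categories))
    if k <= 1:
        return 0.0  # a single category carries no diversity
    return shannon_entropy(categories) / math.log(k)


# Invented POI category lists for two street segments.
mixed_street = ["retail", "retail", "food", "office"]
uniform_street = ["retail", "food", "office", "park"]

print(round(shannon_entropy(mixed_street), 4))       # 1.0397
print(round(normalized_entropy(uniform_street), 4))  # 1.0
```

A segment whose POIs are spread evenly across categories reaches the maximum value, while a segment dominated by one category scores lower; the resulting index can then serve as an explanatory variable in a regression on a vitality measure.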

From the perspective of data diversity, these studies employed a wide range of data types, including images, POI, road networks, commercial density, social media check-in data, user comments, mobile communication data, building information, housing prices, videos, trajectories, demographic indicators, and economic data. This diversity reflects the rich variety of datasets used in the field. The data sources are equally diverse and mainly include Baidu Maps, Amap, Tencent Maps, OSM, Google Maps, government statistics, and surveillance systems. For instance, Baidu Maps is frequently used in studies conducted in China, while Google Maps and OSM are more common in international research [9,62,64]. In addition, researchers have incorporated blogs, merchant reviews, GPS devices, and UAVs into their analytical frameworks [27,62,63,64,65]. The combination of these sources and technologies demonstrates high adaptability and flexibility in the study of spatial dynamics and human behavior.

Despite the significant research potential of multi-source data approaches, their application faces several challenges. GPS devices are effective for capturing trajectory data, but their reliance on volunteer participation often results in limited sample sizes and behavioral biases caused by participants’ awareness of the research purpose [63]. Social media check-in data, although readily available, tends to underrepresent children and older adults, leading to limited sample representativeness. Merchant review data may also be subject to human interference, such as incentive-driven manipulation of ratings, which compromises its objectivity [65].

The use of UAVs is constrained by factors such as battery capacity, weather conditions, and aviation regulations, which restrict their broader deployment across administrative regions and climatic environments. Moreover, regional differences in data sources introduce further challenges. Studies in China often rely on local platforms such as Baidu Maps and Tencent Maps. Although these platforms provide high spatial coverage within China, their usefulness in international research is limited due to issues of data availability and accuracy. Conversely, Google Maps and OSM are widely applicable in international contexts but face policy-related restrictions in China. This regional disparity underscores the need for a more comprehensive assessment of the universality and limitations of data platforms in global research.

The multi-source data and computational analysis approach not only allows researchers to overcome the limitations of single data sources, providing a solid foundation for uncovering the complexity and diversity of street vitality, but also expands the breadth and depth of urban studies. Furthermore, the integration of multi-dimensional data enhances the comprehensiveness and accuracy of research findings. In the future, researchers should carefully select data sources based on specific research needs and regional characteristics while optimizing data quality and collection methods. In particular, when integrating local and global data platforms, further exploration of strategies to enhance data compatibility will be essential.

3.3. Classification Based on Themes and Use Cases

We identified 33 papers employing ML. Figure 5 illustrates their four main application themes (Built Environment and Vitality, Pedestrian Mobility and Urban Dynamics, Urban Spatial and Visual Characterization, and Socialization and Urban Spatial Relations) and eight specific use cases (Pedestrian Counting and Activity Recognition, Street and Space Characterization, Pedestrian Attribute Recognition and Behavioral Patterns, Urban Vitality and Visual Perception, Urban Morphology and Visual Information Processing, Multi-Source Data and Urban Spatial Relationships, Social Data and Urban Vitality, and Urban Functional and Spatial Assessment). Additionally, Table 2 provides a systematic classification of the specific algorithms used in these studies.

Figure 5. The use scenarios in streets for vitality research.

These classifications not only reflect the diverse applications of street vitality research from both macro and micro perspectives but also highlight the critical role of ML in data processing and analysis. They facilitate a systematic exploration of existing research areas, data types, algorithm applications, as well as the respective advantages and limitations of these methods.

The core of the Built Environment and Vitality framework lies in quantifying environmental characteristics to uncover how spatial structures influence street vitality. Pedestrian counting, a key measure of vitality, is widely utilized to assess activity levels at specific locations. For instance, L. Chen et al. (2020) and Yin et al. (2015) estimated pedestrian flow using SVIs [10,17], while Y. Li et al. (2022) further identified pedestrian activity types by incorporating video data [4]. Additionally, Street and Space Characterization focuses on the impact of street structures on pedestrian behavior and perception [66]. For example, Gong et al. (2019) and Yin and Wang (2016) classified street spaces using SVIs and proposed a series of spatial indicators reflecting urban functions [23,25]. Zhao et al. (2023) explored the spatiotemporal heterogeneity and stability of the relationship between environmental factors and vitality [38]. These studies have established a theoretical framework for understanding the impact of the built environment on street vitality, providing a valuable reference for future research.

The impact of the built environment is ultimately reflected in Pedestrian Mobility and Urban Dynamics, making it essential to capture pedestrian behavior patterns to understand street vitality. In the Pedestrian Attribute Recognition and Behavioral Patterns subcategory, Jiang et al. (2021) proposed a theoretical framework for analyzing individual and group behaviors [39], while Wong et al. (2021) used DL to identify pedestrian paths and attributes, revealing the dynamic relationship between pedestrian behavior and street functionality [22].

On the other hand, Urban Vitality and Visual Perception examines the impact of perceptual variables on vitality by analyzing the relationship between environmental features and human perception. For instance, Qi et al. (2020) combined a CNN model with Place Pulse 2.0 to analyze the association between visual perception and street vitality [40]. However, these studies also highlight that public perception data may be influenced by factors such as age and gender. Future research needs to further control for subjective variables to enhance the objectivity of the findings.

Within the Urban Spatial and Visual Characterization theme, urban spatial features reveal the structured expression of vitality. Urban Morphology and Visual Information Processing quantifies urban morphological features using CV. For example, W. Chen et al. (2021) analyzed road network morphology using the ResNet-34 model [36], while Z. Liu et al. (2022) proposed a method to quantify street visual perceptual information [12], providing a more comprehensive representation of visual characteristics in complex urban scenes.

Meanwhile, Multi-Source Data and Urban Spatial Relationships integrates various data sources to deepen the understanding of urban vitality and spatial structures. For example, M. Li et al. (2021) combined microclimate data with pedestrian flow data to study the revitalization of historic districts [44]. Similarly, X. Li et al. (2022) analyzed the impact of the environment on vitality by integrating street vitality data with remote sensing imagery [26]. Y. Yang et al. (2021) explored the relationship between neighborhood vitality and the environment using multi-source data [45]. These studies demonstrate that the integration of multi-source data provides a powerful tool for the quantitative analysis of urban spatial characteristics.

The built environment not only influences the physical manifestation of vitality but also shapes the functional attributes of spaces through social interactions. In the Social Data and Urban Vitality use case, Tang et al. (2022) combined DL with social media data to analyze the relationship between built environment quality and spatial popularity [51]. Additionally, Urban Functional and Spatial Assessment explores how urban spaces provide functions and services to residents. For example, Hu et al. (2020) used POI data and the LDA model to develop a method for extracting urban functions to interpret socio-economic information [30]. These studies highlight that urban spaces are not only carriers of vitality but also integrated hubs of social interactions and economic activities.

Current research on street vitality, while relying on micro-level data such as street-view images and pedestrian counts, primarily focuses on understanding regional vitality dynamics at the neighborhood or urban scale. The linkage between micro-level data and macro-level objectives is a defining characteristic of this field, highlighting both its challenges and potential. Studies have demonstrated that micro-level analyses of pedestrian counts and behavioral patterns can offer precise insights into vitality patterns at neighborhood and city scales. For instance, the quantification of street and spatial characteristics aids in explaining the heterogeneity of urban functional zones and the distribution of vitality. However, when extrapolating findings from micro to macro scales, the limitations of data granularity may hinder the generalizability of conclusions.

3.4. Machine Learning Algorithms: Application Type and Data Type

This section addresses various types of ML based on their application areas. It is divided into three parts, each analyzing and summarizing the characteristics, advantages, limitations, and potential improvements of the respective ML techniques. For detailed classifications and information about the algorithms, refer to Appendix B.

3.4.1. Image-Based Technologies

Image-based ML is primarily used for object recognition and semantic segmentation. By leveraging SVIs and DL, these technologies have significantly enhanced the efficiency and accuracy of street vitality research [4,20]. Their applications span various fields, including pedestrian counting, environmental feature extraction, and visual perception analysis.

For example, L. Chen et al. (2020) and Yin et al. (2015) used SVIs to estimate pedestrian numbers and further analyzed the influence of the built environment on pedestrian counts [10,17]. Additionally, L. Chen et al. (2022) and C. Yang et al. (2023) integrated image data to extract environmental elements and examined their relationship with street vitality [24,37]. Research based on video data has further expanded the analytical scope. Y. Li et al. (2022) employed camera data to identify activity types [4], while Salazar-Miranda et al. (2023) analyzed behavioral patterns using real-time images and GPS data from smart devices [13].

These studies demonstrate that CV not only effectively detects and counts pedestrians but also identifies activity types and extracts environmental features. However, existing models, such as YOLO, ResNet-34, and other CNNs, depend heavily on high-quality labeled data, resulting in significant labeling costs [50]. Moreover, DL models like DeepLab V3+ and PSPNet exhibit high computational complexity and rely extensively on advanced hardware. Additionally, factors such as occlusion, lighting variations, and reflections pose challenges to the stability of analysis results.
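To make the counting step concrete, the following minimal sketch shows how a per-frame pedestrian count might be derived from raw detector output by confidence filtering and greedy non-maximum suppression. It is an illustration under stated assumptions rather than the procedure of any reviewed study: the box format, the thresholds, and the function names `iou` and `count_pedestrians` are hypothetical, and real detectors such as YOLO usually apply comparable post-processing internally.

```python
from typing import List, Tuple

# Hypothetical detector output: (x_min, y_min, x_max, y_max, confidence)
Detection = Tuple[float, float, float, float, float]


def iou(a: Detection, b: Detection) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_pedestrians(dets: List[Detection],
                      min_conf: float = 0.5,
                      iou_thr: float = 0.5) -> int:
    """Count boxes left after confidence filtering and greedy NMS."""
    # Visit the most confident boxes first.
    candidates = sorted((d for d in dets if d[4] >= min_conf),
                        key=lambda d: d[4], reverse=True)
    kept: List[Detection] = []
    for d in candidates:
        # Keep a box only if it does not overlap strongly with a kept box.
        if all(iou(d, k) < iou_thr for k in kept):
            kept.append(d)
    return len(kept)


# Illustrative frame (assumed values, not real data).
frame = [
    (10, 10, 50, 110, 0.92),    # pedestrian A
    (12, 12, 52, 112, 0.80),    # duplicate box on pedestrian A
    (200, 20, 240, 120, 0.75),  # pedestrian B
    (400, 30, 430, 100, 0.30),  # low-confidence detection, discarded
]
print(count_pedestrians(frame))  # 2
```

The sketch also hints at why occlusion is problematic: two people who overlap heavily in the image can be merged by the same suppression rule that removes duplicate boxes, so the choice of `iou_thr` trades duplicate counts against missed pedestrians.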

Future research should prioritize optimizing algorithm structures by leveraging model pruning and hardware acceleration (e.g., GPU and TPU) to improve computational efficiency. Furthermore, data quality and diversity can be enhanced through data augmentation and automated annotation tools. Lightweight models and knowledge distillation techniques can help reduce resource demands, while privacy protection and edge computing provide robust solutions for ensuring data security and enabling real-time analysis [67,68,69,70].

3.4.2. Video Detection-Based Technologies

Video monitoring technology demonstrates significant potential in street vitality analysis. With CV, pedestrian counting, trajectory tracking, and attribute recognition can be integrated into a highly automated analysis process [4,22]. This approach provides detailed and comprehensive data support for analyzing pedestrian behavior patterns, while also laying a robust technical foundation for exploring the dynamic characteristics of street vitality.

As described by X. Wang, Zheng, et al. (2022), current computer vision algorithms for identifying pedestrian attributes can be categorized into several types, including those based on global images, local features, attention mechanisms, sequence prediction, loss-function-driven methods, curriculum learning, and graph models [21]. FairMOT, a local feature-based algorithm, demonstrates significant advantages in real-time pedestrian detection and attribute recognition. By integrating a multi-task learning framework, it can simultaneously perform pedestrian detection, trajectory tracking, and attribute recognition, significantly reducing the need for manual intervention and improving the efficiency of monitoring and analysis. However, FairMOT’s high computational requirements pose a major bottleneck for its application, particularly in complex scenarios or when processing large-scale video data. It often relies on GPU acceleration to meet real-time processing demands.

Although these technologies have broad applicability in fields such as intelligent transportation, urban planning, and security monitoring, their implementation inevitably faces challenges. The quality of video surveillance data is often influenced by device performance and environmental factors, such as air pollution, low light, and adverse weather conditions like rain or snow, which can compromise data accuracy (Figure 6). Additionally, challenges such as occlusion, reflection effects, and the high-density distribution of objects in videos further complicate the analysis process [20].

Figure 6. Image source: authors’ own video shot with GoPro9. (a) Poor nighttime picture stability; (b) Obstacles; (c) Reflection; (d) Non-sunny; (e) Overlapping; (f) Low light.

To enhance the practical effectiveness of video detection technology, researchers must focus on improvements in model optimization, hardware selection, and data processing. For instance, model optimization can reduce computational complexity and hardware requirements through pruning and lightweight design. Additionally, data augmentation techniques can generate more diverse training datasets, while automated data annotation tools can improve data quality. On the hardware side, edge computing technology can be leveraged to offload certain computational tasks to edge devices, reducing the central processing burden. Combined with GPU or TPU acceleration, this approach can significantly enhance processing efficiency [71]. Furthermore, integrating multi-source data, such as combining video data with GPS trajectory data, can enhance the analytical capabilities of models, enabling a more comprehensive understanding of the dynamic characteristics of street vitality.
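As one possible reading of the video and GPS integration mentioned above, the sketch below attaches a position to each per-frame pedestrian count by matching it to the GPS fix that is nearest in time. The record layouts, the 2 s tolerance, and the names `nearest_fix` and `geotag_counts` are assumptions made for illustration; the sketch presumes a moving camera whose clock is synchronized with the GPS logger.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple

GpsFix = Tuple[float, float, float]  # (timestamp_s, lat, lon), sorted by time
FrameCount = Tuple[float, int]       # (timestamp_s, pedestrians in the frame)


def nearest_fix(fixes: List[GpsFix], t: float,
                max_gap_s: float) -> Optional[GpsFix]:
    """Return the fix closest in time to t, or None if none is within max_gap_s."""
    times = [f[0] for f in fixes]
    i = bisect_left(times, t)
    # Only the fixes just before and just after t can be the nearest.
    neighbours = [fixes[j] for j in (i - 1, i) if 0 <= j < len(fixes)]
    if not neighbours:
        return None
    best = min(neighbours, key=lambda f: abs(f[0] - t))
    return best if abs(best[0] - t) <= max_gap_s else None


def geotag_counts(frames: List[FrameCount], fixes: List[GpsFix],
                  max_gap_s: float = 2.0) -> List[Tuple[float, float, int]]:
    """Attach (lat, lon) to every frame count that has a close-enough fix."""
    tagged = []
    for t, count in frames:
        fix = nearest_fix(fixes, t, max_gap_s)
        if fix is not None:  # frames without a reliable position are dropped
            tagged.append((fix[1], fix[2], count))
    return tagged


# Illustrative values only.
fixes = [(0.0, 35.000, 139.000), (5.0, 35.001, 139.001), (10.0, 35.002, 139.002)]
frames = [(0.4, 3), (4.1, 7), (7.5, 2), (30.0, 9)]
print(geotag_counts(frames, fixes))
# [(35.0, 139.0, 3), (35.001, 139.001, 7)]
```

Dropping frames that lack a nearby fix, instead of interpolating, is a conservative design choice: it reduces sample size but avoids assigning counts to positions the camera may never have occupied.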

When optimizing video detection technology, privacy protection must also be prioritized. By employing data anonymization techniques, personal privacy can be safeguarded while enhancing the credibility of data sharing. Moreover, researchers should address the demands of dynamic environments and long-term temporal data by developing more robust models capable of handling complex conditions, such as occlusion and lighting variations.

3.4.3. Data Analysis-Based Technologies

Algorithms used in data analysis play a critical role in street vitality research, being widely applied in multi-variate analysis and predictive modeling. They serve as essential tools for exploring the complex relationships between environmental variables and vitality patterns.

Regression analysis models excel in modeling variable relationships and forecasting trends, gaining popularity for their computational efficiency and predictive accuracy [4,12,24,51]. Through variable selection and parameter estimation, these models quantify the direction and strength of associations, and they are commonly used to analyze how environmental factors influence street vitality. However, traditional regression methods face limitations in handling non-linear relationships and complex spatial structural features, making it challenging to fully capture latent patterns in high-dimensional data [48,52].

To address these challenges, gradient boosting frameworks such as LightGBM are favored because their tree ensembles can model non-linear relationships while remaining computationally efficient. These frameworks utilize efficient splitting strategies and memory management, significantly reducing computational costs. With their strong performance in large-scale data analysis and high-dimensional feature evaluation, they are widely applied to explore the complex relationships between street characteristics and crowd behavior.
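The limitation of linear regression that motivates non-linear learners can be illustrated with a small worked example. The sketch below fits an ordinary least-squares line to a hypothetical inverted-U relationship between building density and vitality; the data and the function name `fit_line` are invented for illustration and do not come from any reviewed study.

```python
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float, float]:
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return a, b, 1.0 - ss_res / ss_tot


# Hypothetical data: vitality peaks at intermediate density (inverted U).
density = [0.0, 1.0, 2.0, 3.0, 4.0]
vitality = [0.0, 3.0, 4.0, 3.0, 0.0]  # equals 4 - (density - 2)**2

# A straight line sees no relationship at all.
_, slope, r2_linear = fit_line(density, vitality)

# Regressing on a hand-crafted non-linear feature recovers it exactly.
squared_dev = [(d - 2.0) ** 2 for d in density]
_, _, r2_transformed = fit_line(squared_dev, vitality)

print(slope, r2_linear, r2_transformed)  # 0.0 0.0 1.0
```

In this toy case the analyst has to know the shape of the relationship in advance to construct the transformed feature. Tree-based ensembles are attractive precisely because they can approximate such shapes from the data, at the cost of less direct parameter interpretation.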

Dimensionality reduction algorithms play a vital role in data preprocessing and feature extraction [4,12,14]. PCA reduces feature dimensions, enhancing model computational efficiency and visualization capabilities, serving as a convenient tool for exploring primary patterns within data. In contrast, LDA specializes in text analysis and topic extraction, revealing spatial semantic features and laying the foundation for urban function classification and semantic analysis. However, these algorithms struggle to capture non-linear features and complex data relationships, highlighting the need for advanced non-linear dimensionality reduction methods to improve analytical precision.

Clustering algorithms can identify the spatial clustering characteristics of street vitality distribution, providing valuable data support for urban planning [72]. Decision tree and random forest algorithms are widely utilized for their intuitiveness and interpretability, making them particularly effective for handling non-linear and classification tasks [44,48,49]. However, these algorithms are prone to overfitting, and their model stability heavily relies on parameter tuning and data scale.
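As a minimal sketch of how clustering can group streets by vitality characteristics, the code below runs a plain k-means (Lloyd's algorithm) on two segment-level features. The feature choice, the values, the explicit initial centroids, and the function name `kmeans` are assumptions made for illustration; in practice features would normally be standardized first, and a library implementation with multiple random restarts would be preferable.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def kmeans(points: List[Point], centroids: List[Point],
           iters: int = 20) -> List[int]:
    """Plain Lloyd's algorithm; returns the cluster index of each point."""
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2
                              + (p[1] - centroids[k][1]) ** 2)
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    (sum(m[0] for m in members) / len(members),
                     sum(m[1] for m in members) / len(members)))
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels


# Hypothetical street segments: (pedestrians per hour, POIs per 100 m).
segments = [(10, 1), (12, 2), (9, 1), (80, 9), (85, 10), (78, 8)]
labels = kmeans(segments, centroids=[segments[0], segments[-1]])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The result separates quiet from busy segments, but the clusters carry no labels of their own: deciding that cluster 1 means "high vitality" is an interpretive step left to the analyst.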

These algorithms still hold significant potential for performance improvement. For instance, optimizing parameter tuning and feature engineering can further enhance the generalization capability and stability of models. Integrating non-linear dimensionality reduction methods can better capture complex spatial data structures [73]. Moreover, improving topic extraction models by incorporating additional prior knowledge and constraints can enhance both interpretability and analytical depth.

Further integration of multi-source data and hybrid modeling frameworks represents a key development trend. For example, combining environmental features extracted through CV with regression analysis and clustering models facilitates comprehensive analysis and strengthens cross-modal data analysis capabilities. Additionally, exploring automated hyperparameter tuning and ensemble learning techniques can further enhance the robustness and predictive performance of models.

4. Discussion

4.1. Technology Applications and Regional Distribution

This systematic review evaluates and compares the technologies used in street vitality research and examines their evolution, with particular attention to practical applications across different regions and research scales. As shown in Figure 7, four main technology types and six data categories were identified, together with four research themes and eight subcategories of technological applications that span both macro and micro spatial levels. The main findings are summarized below:

Figure 7. Four data interpretation dimensions and six data-source categories, with research themes and eight subcategories of technological applications across macro and micro spatial scales. Note: This figure summarizes data–method associations in street vitality studies. It does not represent ML/DL or learning paradigms. DL is treated as a subfield of ML, while SL and UL denote learning paradigms used across both classical and deep models.

(1): ML has become the mainstream approach in street vitality research:

The main technologies employed in this field include ML, space syntax, GPS, and monitoring devices such as UAVs, cameras, and sensors. Among these, ML has emerged as the core analytical approach. SL is commonly used for classification and prediction tasks, including the extraction of street environmental features, the estimation of pedestrian flow, and the analysis of human behavioral patterns. UL has also demonstrated potential for processing unlabeled data, for instance, by applying clustering algorithms to identify patterns of human activity [72]. The combination of DL and CV has further advanced automated feature extraction and performs particularly well when analyzing video data collected by UAVs. Although these technologies have reached a relatively mature stage in macro- and meso-scale research, challenges remain in achieving real-time dynamic analysis and in acquiring high-resolution data at the micro scale.

(2): Existing studies mainly focus on four research themes: environmental characteristics, perception, social interaction, and behavioral dynamics:

At the macro scale, research often investigates how network layouts and functional areas shape the spatial distribution of vitality, including the influence of street connectivity on activity concentration. At the community scale, studies explore how green spaces, plazas, and nodal activity areas contribute to local vitality. However, our review indicates that micro-scale studies are still underdeveloped, particularly in topics such as street interface design, material use, boundary spaces, and street facilities. The difficulty of collecting detailed micro-scale data and the high cost of specialized equipment are key obstacles. These limitations weaken our understanding of how individual behavioral dynamics interact with the built environment. Future studies should therefore strengthen comprehensive analyses at the micro scale in order to gain a deeper and more integrated understanding of the mechanisms driving street vitality.

(3): Case studies remain concentrated in specific regions and data platforms:

Research on street vitality is mainly concentrated in economically developed regions such as China, the United States, and Europe, where central and highly urbanized areas are most frequently examined. In these contexts, researchers benefit from mature data platforms such as Google, Baidu, and Tencent, which provide broad coverage and high-quality data on street environments and behavioral dynamics. These platforms have supported studies at both urban and community scales and have proven valuable in analyses of street networks, functional zones, and street interface features.

In contrast, data coverage in remote and low-income areas remains limited, leading to uneven data sources and lower representativeness. Although real-time monitoring technologies such as UAVs, cameras, and sensors offer great potential, their application is often restricted by equipment costs, policy regulations, and ethical concerns. Micro-scale studies are especially affected by these constraints. Future research should expand the regional scope of case studies and increase attention to data-scarce contexts. With the continued growth of urban monitoring facilities, collaboration among governments, researchers, and communities will be essential. The integration of real-time dynamic data can help reveal finer patterns of street vitality and improve the inclusiveness and generalizability of research results. This approach will also provide stronger empirical support for urban planning and management in diverse settings.

Finally, street vitality should not be defined solely by technical indicators. Building on the review of methods and data, this paper emphasizes that street vitality is essentially a public value embedded in everyday civic life. It represents dimensions such as inclusiveness, accessibility, perceived safety, and diversity of use. Technology should support rather than replace this social meaning. Each technological application should begin with a clear statement of purpose and should follow principles of data minimization and anonymization. Public participation and feedback mechanisms should also be established to ensure fairness and transparency. Through these safeguards, the measurement of vitality can serve as a constructive supplement to public discussion and collective reflection on urban life, instead of becoming a substitute for them.

4.2. Technologies in Street Vitality Research

4.2.1. Data and Collection

Data form the foundation of analysis in studies of street vitality. The quality, nature, and dimensions of data, together with their respective strengths and limitations, largely determine the depth and scope of research outcomes. A detailed summary is provided in Appendix C. With the continued advancement of technology and the integration of multimodal sources, researchers have been able to examine the mechanisms driving street vitality from multiple perspectives. Figure 8a illustrates how different types of data vary in origin, structure, and analytical performance.

Figure 8. Performance evaluation: (a) Data. (b) Technical performance evaluation.

Among these, human activity data provide the most direct expression of vitality, revealing how frequently and in what ways people use streets [4]. Pedestrian counts, visit frequency, check-in records, and activity types together describe patterns of use across space and time. Image- and video-based observations collected through cameras or sensors form the basis for estimating pedestrian flow and density. By analyzing behavioral patterns extracted from such data, researchers can better understand how spatial form and human activities interact [27].

However, these approaches depend heavily on labeled datasets, and data collection is often constrained by environmental and technical conditions, which limits scalability. In dense urban areas, obstacles, lighting, and device performance can all introduce bias [20,22,44].

Mobile signaling data, by contrast, perform well in large-scale mobility analyses and provide broader spatial coverage. Yet they offer limited detail on individual behaviors and dwell time [74]. Check-in data from social media complement these sources by reflecting activity intensity at specific sites, especially in commercial areas. Even so, they often underrepresent non-commercial or open spaces and may contain sample bias due to unequal access to mobile devices among older adults or children [51,75].

In addition to activity data, environmental data provide the physical and material context for studying street vitality across scales. At the macro level, POI and economic indicators help reveal regional functions and economic dynamics, serving as both proxies for vitality and explanatory variables. For instance, POI data show the spatial distribution of commercial, transport, and educational facilities, which aids in assessing how functional zones contribute to vitality [60]. Yet these data tend to be static and have limited coverage in peripheral areas, which restricts their ability to capture temporal change [20]. Economic indicators, such as rental levels or transaction volumes, supplement vitality assessments but are often constrained by low temporal resolution [51].

At the micro scale, SVIs become an important data source. CV techniques allow the extraction of features such as road width, building height, and vegetation coverage, providing detailed representations of street environments [22]. However, the quality of image data is affected by weather, lighting, and obstruction, and the lack of nighttime imagery restricts temporal continuity.

Perception data further enrich vitality research by linking environmental features with human experience. Subjective perception data, obtained from surveys and online reviews, reflect people’s preferences, attitudes, and motivations regarding street environments. Sentiment analysis and topic modeling help interpret how people feel about and interact with urban spaces [51]. Nevertheless, such data are often noisy or biased and require rigorous text mining and validation. In contrast, objective perception data collected from biometric sensors such as eye tracking, EEG, or ECG record physiological responses to the environment, revealing how design elements influence human perception [76,77,78,79]. Although scientifically precise, these data are costly to obtain and difficult to scale due to complex experimental setups.

Complementing these perspectives, urban spatial data provide a structural foundation for analyzing street vitality. GIS and online maps contain essential information on street networks, building morphology, and greenery, which are key to assessing connectivity and accessibility [55,60]. Yet most of these data are static and fail to reflect dynamic behavioral patterns. Integrating them with real-time data such as GPS or video feeds is therefore critical for improving spatiotemporal analysis.

While this growing diversity of data enriches street vitality research, several challenges remain. Many datasets still suffer from limited spatial or temporal resolution. For instance, Baidu heatmaps (200 m × 200 m) and Mapbox movement data (100 m × 100 m) can support macro-level pattern analysis but lack the precision required for micro-level studies. Surveys and sensor-based data, though informative, demand substantial labor and logistical costs [80,81]. Image data are easily affected by environmental conditions, and social-media-based check-in data are biased toward commercial areas, limiting generalizability.
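The effect of grid resolution described above can be shown with a short sketch that bins the same point observations into cells of two different sizes. The coordinates are hypothetical, the 20 m cell size is chosen only for contrast with the 200 m cell mentioned in the text, and `grid_counts` is an illustrative helper rather than part of any platform's API.

```python
from collections import Counter
from typing import Dict, List, Tuple


def grid_counts(points_m: List[Tuple[float, float]],
                cell_m: float) -> Dict[Tuple[int, int], int]:
    """Count points per square cell; coordinates are metres in a projected CRS."""
    return dict(Counter((int(x // cell_m), int(y // cell_m))
                        for x, y in points_m))


# Hypothetical pedestrian positions: a small crowd at a plaza corner
# and two people about 150 m further along the street.
points = [(5, 5), (8, 6), (6, 9), (7, 7), (150, 10), (155, 12)]

coarse = grid_counts(points, 200.0)  # resolution comparable to a 200 m heatmap
fine = grid_counts(points, 20.0)     # resolution closer to a street interface

print(coarse)  # {(0, 0): 6}
print(fine)    # {(0, 0): 4, (7, 0): 2}
```

At 200 m the two activity nodes collapse into a single cell value, which is adequate for comparing districts but hides exactly the micro-scale variation that street-level studies aim to explain.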

Moreover, regional disparities in data availability complicate cross-regional comparisons. In China, studies often use Baidu Maps and Tencent Maps, while international research relies more on Google Maps and OpenStreetMap. Differences in coverage and positional accuracy across these platforms increase uncertainty and make integration difficult [20]. In this regard, future work should focus on improving data resolution and coverage while developing more robust methods for integrating multiple data sources, which will allow vitality analysis to become more comprehensive and comparable across contexts.

Looking beyond technical aspects, data governance and participation are also essential. Open data should be viewed as a framework co-designed with the public. Communities or neighborhood committees, working with data stewards, should determine what to measure, why to measure it, for how long it should be measured, and under what safeguards. The research process should also include open channels for feedback and appeal from affected groups. When performing cross-regional comparisons, researchers should clearly disclose differences in platform coverage and uncertainty sources to improve the interpretability and transferability of findings.

To ensure responsible use and maintain social trust, Appendix C provides a “risk–safeguard” comparison organized by data category. It outlines measures for preventing re-identification and trajectory exposure, defining purpose limitations, ensuring data minimization and strong anonymization, enabling on-device processing that releases only aggregated indicators, enforcing retention limits and access audits, and establishing mechanisms for appeal and withdrawal. This framework complements the discussion in this section on data characteristics and applicability boundaries and serves as a baseline standard for future research and implementation.
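One of the safeguards listed above, releasing only aggregated indicators, can be sketched as follows. The example counts distinct devices per street segment and hour and suppresses any cell with fewer than k devices. The record schema, the threshold k = 5, and the function name `release_aggregates` are assumptions for illustration; small-count suppression is only one disclosure-control measure and does not by itself guarantee strong anonymization.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical raw record: (device_id, street_segment_id, hour_of_day)
Record = Tuple[str, str, int]


def release_aggregates(records: Iterable[Record],
                       k: int = 5) -> Dict[Tuple[str, int], int]:
    """Distinct-device counts per (segment, hour); cells below k are dropped."""
    devices: Dict[Tuple[str, int], Set[str]] = defaultdict(set)
    for device_id, segment, hour in records:
        devices[(segment, hour)].add(device_id)
    # Only counts leave this function; device identifiers are never returned.
    return {cell: len(ids) for cell, ids in devices.items() if len(ids) >= k}


# Illustrative records: a busy morning segment and a sparse late-night one.
raw = [(f"d{i}", "A", 8) for i in range(5)]
raw += [("d0", "A", 8),                      # repeat visit, counted once
        ("d7", "B", 23), ("d8", "B", 23)]    # only two devices: suppressed

print(release_aggregates(raw))  # {('A', 8): 5}
```

Suppression protects people in sparsely used places and times, but it also removes precisely the low-activity cells that matter for studying under-served streets, so the threshold is a governance decision as much as a technical one.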

Building on these considerations, the methodological framework can be further strengthened by linking data openness with indicator co-design. While this review primarily synthesizes technical and data-driven indicators of vitality, future research should involve participatory design processes with residents and local stakeholders. Such collaboration can help identify and refine experiential indicators, including perceived comfort, safety, inclusion of vulnerable groups, and cultural or caregiving activities. Incorporating these experiential dimensions would make vitality assessment frameworks not only more interpretable but also more socially meaningful.

4.2.2. Data Processing and Analysis Technologies

Progress in data processing and analysis methods has greatly improved efficiency and accuracy. The development from traditional tools to advanced technologies and equipment has expanded the potential of data classification, pattern recognition, and dynamic analysis. Nevertheless, when these methods are applied to complex scenarios or specific research needs, certain limitations remain, as illustrated in Figure 8b.

Space syntax marked an important shift in street vitality studies from qualitative observation to quantitative analysis. It reveals how the layout of streets relates to vitality distribution through the examination of spatial topology and network connectivity [55]. As a framework that focuses on connectivity and accessibility, space syntax has been widely used in macro- and meso-level research. For example, studies by Fareh and Alkama (2022) and X. Li et al. (2022) identified clear correlations between street network connectivity and pedestrian movement, offering valuable insights for improving urban vitality [26,56]. Even so, its static structure-based design makes it difficult to capture dynamic behavior or handle real-time information. When data such as GPS trajectories or video recordings are introduced, these temporal limitations become even more apparent.

In addition, space syntax pays most attention to spatial structure while often overlooking cultural and behavioral differences among users. In cities with diverse populations, groups may perceive and use streets in different ways, which are not easily represented by simplified network models. The framework performs well in terms of clarity and interpretability, yet its limited ability to respond to change and its narrow adaptability across different contexts reduce its usefulness in fine-grained temporal studies.
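The topological measures that space syntax relies on can be illustrated with a minimal sketch. Assuming a small, connected, undirected street graph (the five-street layout and the function names are hypothetical), the code computes connectivity as node degree, mean depth by breadth-first search, and relative asymmetry using the standard formula RA = 2(MD − 1)/(k − 2), where k is the number of nodes. It is a simplified stand-in for dedicated space syntax software, not a reproduction of any reviewed analysis.

```python
from collections import deque
from typing import Dict, List


def mean_depth(graph: Dict[str, List[str]], root: str) -> float:
    """Mean step depth from root to every other node (graph must be connected)."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in depth:
                depth[nb] = depth[node] + 1
                queue.append(nb)
    return sum(depth.values()) / (len(depth) - 1)


def relative_asymmetry(graph: Dict[str, List[str]], root: str) -> float:
    """RA = 2 (MD - 1) / (k - 2); lower values mean a more integrated street."""
    k = len(graph)
    return 2.0 * (mean_depth(graph, root) - 1.0) / (k - 2)


# Hypothetical layout: main street M meets A, B and C; D branches off C.
streets = {
    "M": ["A", "B", "C"],
    "A": ["M"],
    "B": ["M"],
    "C": ["M", "D"],
    "D": ["C"],
}

connectivity = {s: len(nbs) for s, nbs in streets.items()}
print(connectivity["M"], connectivity["D"])  # 3 1
print(mean_depth(streets, "M"))              # 1.25
print(mean_depth(streets, "D"))              # 2.25
print(relative_asymmetry(streets, "M") < relative_asymmetry(streets, "D"))  # True
```

Every quantity here depends only on the fixed graph, which makes the limitation discussed above visible: the values for M and D are the same at noon and at midnight, so temporal variation in vitality has to come from other data.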

GPS and monitoring devices provide stronger capacity for dynamic response and real-time data capture. These technologies make it possible to collect environmental and behavioral information continuously, offering valuable support for the dynamic analysis of street vitality. For instance, sensors can monitor changes in pedestrian density and environmental conditions, while UAVs record street activities from multiple angles. Through such applications, researchers can overcome some of the static limitations of space syntax and gain a better understanding of human behavior in complex environments.

However, these technologies bring their own challenges. High costs and heavy resource requirements make them difficult to apply in regions with limited funding. They also lack flexibility when used in different scenarios and cannot easily address qualitative factors such as social or cultural influences. In addition, large-scale deployment often encounters problems in efficiency and maintenance, which calls for further technical improvement.

ML has become one of the main tools in street vitality research and is now widely used to handle large volumes of data (Figure 3). SL helps extract environmental features from street view images and videos and can estimate crowd flow through classification and prediction tasks [50,82]. The combination of CV and DL has improved automatic extraction and generative simulation of complex environmental features, showing particular strength in dealing with dynamic data [82]. Even so, supervised methods rely heavily on large labeled datasets. Producing these datasets requires considerable time and cost, and biases in the data may reduce the general applicability of the results.
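As a hedged illustration of how environmental features are typically derived once a supervised segmentation model has labeled a street view image, the sketch below computes the share of pixels per class from a label mask. The mask, the class indices, and the class names are hypothetical; in practice the mask would come from a trained model, such as the segmentation networks listed in Appendix B.

```python
from collections import Counter


def class_shares(mask, class_names):
    """Return the fraction of pixels assigned to each named class.

    `mask` is a 2D list of integer class labels, one per pixel.
    """
    counts = Counter(label for row in mask for label in row)
    total = sum(counts.values())
    return {name: counts.get(idx, 0) / total for idx, name in class_names.items()}


# Hypothetical label scheme and a tiny 4 x 4 mask standing in for model output.
CLASS_NAMES = {0: "sky", 1: "greenery", 2: "building", 3: "road"}
MASK = [
    [0, 0, 0, 0],
    [1, 1, 2, 2],
    [1, 2, 2, 2],
    [3, 3, 3, 3],
]

shares = class_shares(MASK, CLASS_NAMES)
print(shares)
```

Shares of this kind (for example, the greenery or sky proportion per image) are the sort of indicator that can then be related to pedestrian counts or other vitality measures.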

UL performs better in cases where labeled data are limited. It can group data and detect hidden patterns in street environments [83]. For example, clustering algorithms can reveal links between pedestrian behavior and environmental quality. Yet the results are often difficult to interpret, and integration with multiple sources of heterogeneous data is still incomplete [34].
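To make the clustering idea concrete, the following is a minimal sketch that uses k-means, chosen here only as a familiar example of a clustering algorithm. The two features per street segment (a normalized pedestrian volume and a normalized environmental quality score), the data values, and the initial centroids are all hypothetical.

```python
import math


def kmeans(points, initial_centroids, iterations=20):
    """Plain k-means: assign points to the nearest centroid, then update centroids."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        new_centroids = []
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(vals) / len(members) for vals in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical segments: (normalized pedestrian volume, normalized quality score).
SEGMENTS = [
    (0.90, 0.80), (0.80, 0.70), (0.85, 0.75),
    (0.20, 0.10), (0.10, 0.20), (0.15, 0.15),
]

labels, centroids = kmeans(SEGMENTS, initial_centroids=[SEGMENTS[0], SEGMENTS[3]])
print(labels)
print(centroids)
```

The algorithm returns only group memberships and centroids. Deciding what each group means (for example, "lively, high-quality streets") is left to the analyst, which is one source of the interpretability problem noted above.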

Although ML offers strong potential for analyzing dynamic data and scaling across systems, its demand for computing power and its dependence on labeled data make widespread use difficult. Limited transparency also remains a concern. Many models are not open, and access to training data or code is often restricted, which limits collaboration across disciplines and regions [20].

To improve reproducibility and public trust, researchers should provide model cards or data documentation, clearly state data use and retention periods, and make basic information about training data, preprocessing, evaluation, and errors available. Mechanisms for feedback and appeal should also be established so that affected groups can participate in review and correction. Because different technologies involve different risks when combining multiple data sources, Appendix A summarizes the typical risks and minimum safeguard requirements for each technology category. These include openness in model and data disclosure, aggregation and de-identification of outputs, clear limits on data use and retention, and independent auditing with public documentation. This framework follows the principles of reproducibility and transparency discussed in this section and serves as a minimum standard for responsible technology selection and implementation.
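One of the safeguards listed in Appendix A, coarse spatiotemporal aggregation with a minimum sample threshold, can be sketched as follows. The record layout `(user_id, lat, lon, hour)`, the cell size, and the threshold of three distinct users are illustrative assumptions rather than recommended values. The sketch covers only aggregation and small-cell suppression, and does not implement differential privacy or any of the other listed safeguards.

```python
import math
from collections import defaultdict


def aggregate_with_threshold(records, cell_size=0.01, min_users=3):
    """Count distinct users per (grid cell, hour) and suppress small cells.

    Only the aggregated counts are returned; user identifiers and exact
    coordinates are not part of the output.
    """
    users_per_bin = defaultdict(set)
    for user_id, lat, lon, hour in records:
        cell = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        users_per_bin[(cell, hour)].add(user_id)
    return {
        key: len(users)
        for key, users in users_per_bin.items()
        if len(users) >= min_users
    }


# Hypothetical trajectory points: (user_id, latitude, longitude, hour of day).
RECORDS = [
    ("u1", 35.7812, 139.9011, 8),
    ("u1", 35.7816, 139.9014, 8),  # repeated visit, counted once
    ("u2", 35.7815, 139.9018, 8),
    ("u3", 35.7819, 139.9013, 8),
    ("u4", 35.7955, 139.9155, 8),  # lone user, suppressed in the output
]

released = aggregate_with_threshold(RECORDS)
print(released)
```

The cell visited by a single user is dropped before release, so the output cannot be used to single out that person's location at that hour. Suitable cell sizes and thresholds depend on the study context and the applicable regulations.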

The differences in technical performance show that no single method can fully satisfy the needs of street vitality research. Space syntax provides interpretability and strength in large-scale analysis but lacks responsiveness to change. GPS and sensor-based systems compensate for this weakness yet struggle with cost and contextual adaptability. ML techniques excel in processing multiple data sources and automating analysis but still require improvement in interpretability, openness, and resource efficiency.

4.3. Themes and Fields of Technology Application in Street Vitality

4.3.1. Macro-Scale: Dynamics and Technological Implementation of Vitality

In macro-scale street vitality research, multi-source data (e.g., high-coverage street view imagery, mobile signaling, POI, urban transportation networks, and GPS trajectories) provides extensive opportunities for selecting technical methodologies [72,84]. DL and data analysis tools excel not only in large-scale automated identification and prediction but also in the multi-dimensional, in-depth exploration of environmental factors and pedestrian distribution (Appendix B). Space syntax has also been widely applied in macro-scale analyses, enabling researchers to uncover the structural relationship between urban morphology and vitality distribution by quantifying street network connectivity and accessibility [55,58].

When selecting technologies, research objectives and data characteristics are particularly critical. For scenarios requiring rapid processing of massive volumes of video or image data to identify spatiotemporal dynamics of crowds, DL is most commonly deployed on cloud or distributed systems [13,71]. For studies primarily focused on network structures and functional area diagnostics, the combination of space syntax and clustering models offers advantages in topological visualization and interpretability [58,85,86]. Additionally, for researchers focusing on vitality differences across regions or conducting international comparisons, it is important to consider the accuracy and timeliness of different mapping platforms, including Baidu Maps, Amap, Google Maps, and OSM [20,45,75]. These platforms differ in image clarity, update frequency, and data authorization, which may lead to insufficient comparability when transferring models across regions or conducting technological comparisons.

To address this issue, some studies first use GPS or mobile signaling to identify vitality hotspots, followed by UAV aerial photography or street view imagery for secondary validation [27,84]. The multi-source data integration enables large-scale preliminary scanning and targeted focus but also imposes higher demands on computational power and network transmission.
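The first stage of this workflow, flagging candidate hotspots from aggregated GPS or mobile signaling counts, can be sketched as below. The grid counts and the z-score threshold are hypothetical, and a simple z-score rule is used only for illustration; the cited studies may rely on other hotspot statistics.

```python
from statistics import mean, pstdev


def find_hotspots(cell_counts, z_threshold=1.5):
    """Return grid cells whose count lies at least `z_threshold` standard
    deviations above the mean count of all cells."""
    values = list(cell_counts.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all cells are identical, so nothing stands out
    return sorted(
        cell for cell, n in cell_counts.items() if (n - mu) / sigma >= z_threshold
    )


# Hypothetical activity counts on a 3 x 3 grid, keyed by (row, column).
CELL_COUNTS = {
    (0, 0): 12, (0, 1): 15, (0, 2): 11,
    (1, 0): 14, (1, 1): 95, (1, 2): 13,
    (2, 0): 10, (2, 1): 16, (2, 2): 12,
}

hotspots = find_hotspots(CELL_COUNTS)
print(hotspots)
```

Only the flagged cells would then be passed to the second stage for validation with UAV footage or street view imagery, which limits the volume of imagery that must be collected and processed.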

Privacy and ethical challenges in macro-scale research primarily involve the large-scale sharing of data from mobile communications and social media platforms. Achieving higher spatiotemporal resolution necessitates collaboration with telecommunications operators or government agencies, requiring extensive efforts in data anonymization and compliance reviews [87]. Additionally, in DL or distributed analysis, researchers must implement dedicated modules for data encryption, access control, and ensuring the interpretability of visualized results. Only by safeguarding privacy and data security can these technologies be more broadly applied to city-scale vitality assessments, traffic organization, and functional zoning practices. Overall, the use of technologies at the macro scale tends to focus on rapidly identifying vitality hotspots, analyzing urban spatial structural features, and forecasting crowd aggregation and dispersion trends, thereby providing comprehensive and efficient support for macro-level planning and decision making.

4.3.2. Micro-Scale: Dynamics and the Built Environment of Vitality

At the micro scale, research focuses on individual streets, facilities, spatial morphology, and building interfaces, examining how micro-environmental elements influence human behavior and perception through high-precision data collection and real-time monitoring [4,10]. Visual models and embedded sensors (e.g., StreetAware sensor, CityGrid sensors) enable rapid segmentation and identification of street elements, making them suitable for deployment on mobile platforms such as bicycles and buses [13,44]. Compared with tools like Baidu heatmaps, Dewey’s neighborhood pattern, and Mapbox’s movement data, these technologies offer greater spatial detail and interpretability.

Space syntax can also be applied at the micro level through visibility analysis and path accessibility measures, providing structured visualizations and interpretive frameworks for localized street cross-sections or corner areas [60]. When more detailed patterns of crowd aggregation or pedestrian attributes are required, UAVs equipped with low-altitude aerial photography can be employed for 3D reconstruction and integrated with GPS trajectory tracking. These combined methods facilitate the analysis of how building façades, entrance layouts, and street furniture configurations influence pedestrian dwell time and social interactions.

Compared with macro-scale research, micro-scale studies raise more significant concerns regarding privacy and ethics. Close-range camera recordings often capture detailed personal information, posing substantial risks if measures such as facial blurring or data encryption are not implemented. Therefore, it is essential to carefully design data collection processes and select appropriate model scales and hardware configurations to balance flexibility with regulatory compliance. To address dynamic or small-sample scenarios, researchers have begun exploring unsupervised learning and transfer learning methods, which allow models to adaptively learn new street elements and behavioral patterns from limited or incremental data [13]. This approach reduces dependence on large-scale manual labeling, making it suitable for small pilot projects or experimental street renewal initiatives. However, these methods are still in their early stages of application, and their effectiveness remains insufficiently validated.

For micro-scale, close-range data collection and recognition tasks, research protocols should explicitly specify the minimum required resolution and blurring strategies, data collection purposes and retention periods, third-party auditing and access authorization procedures, as well as community-level participation and feedback mechanisms, in order to reduce re-identification risks and enhance the social acceptability of the research.

At the same time, the selection of technologies at this scale should aim to accurately capture mechanisms of human–environment interaction, assisting in the identification of areas for improvement or potential blind spots in street furniture, physical environments, and spatial configurations. Moreover, these technologies enable the rapid detection of localized safety risks and abnormal crowd activities, thereby supporting evidence-based management and design decisions.

When macro- and micro-scale technologies are integrated, researchers can first identify high-vitality areas or regions requiring intervention at the city scale using data from space syntax or GPS. Detailed analyses of target streets can then be conducted using near-field imagery, UAV footage, or StreetAware sensors. This multi-scale approach effectively balances the identification of overall trends with the refinement of street-level design, leveraging multi-source data integration and cross-scale technological collaboration to address diverse needs ranging from municipal planning to precise street-level interventions. By emphasizing both algorithmic interpretability and data governance, such studies can provide more robust and transparent decision-making support for theoretical research and practical applications in street vitality planning.

4.4. Challenges, Significance, and Future Research

Despite the progress achieved in street vitality research at both macro and micro levels, significant limitations persist in terms of data quality and the applicability of technologies.

Firstly, data coverage and reliability remain major challenges. Remote and low-income areas often lack access to high-resolution street view imagery, mobile signaling, or POI data, which restricts the generalizability of research findings across diverse regions and cultural contexts [20,51]. In high-density urban areas, micro-scale data collection demands substantial investments in equipment and labor and is frequently hindered by factors such as weather conditions, obstructions, and privacy concerns [22]. Thus, ensuring the reliability and comprehensiveness of data in resource-constrained environments continues to be a critical challenge that street vitality research must urgently address.

Secondly, the adaptability of existing methods remains insufficient. While SL can efficiently process large-scale imagery and location data, it places high demands on labeled datasets and computational resources. Conversely, UL faces notable challenges in interpretability and the complexity of integrating multi-source data. Space syntax offers intuitive analyses of urban topological accessibility; however, it falls short of accurately capturing the nuanced mechanisms of crowd activity without being supplemented by dynamic behavioral data or sociological surveys [56,58]. Additionally, UAVs and embedded sensors, while providing high-precision observations at the micro scale, are constrained by practical factors such as costs, computational power requirements, and regulatory approvals [44]. This disparity between macro- and micro-scale technologies presents a significant obstacle to constructing a cohesive multi-scale framework for street vitality research.

Thirdly, the lack of interdisciplinary and multi-agency collaboration mechanisms limits the sharing and practical application of research findings. With the increasing prevalence of urban monitoring facilities, such as traffic violation and safety surveillance cameras, the potential for real-time monitoring of human activities and environmental conditions has expanded. However, concerns regarding data security and privacy protection pose significant challenges, making negotiations with government agencies a critical hurdle. Currently, the integrated application of technologies such as ML, space syntax, and VR is primarily confined to small-scale pilot projects, requiring high levels of platform standardization and data openness. Without a unified collaborative platform established between urban planning departments and technology development teams to support data anonymization, labeling, and dynamic updates, the transferability of research outcomes and their relevance to policy making will remain significantly constrained. Nevertheless, these challenges also open up numerous opportunities for future research exploration:

(1): At the data level, promoting open data sharing through public platforms and cross-departmental collaboration can provide accessible foundational data for regions and researchers with limited budgets. Simultaneously, multi-source crowdsourcing and lightweight applications have the potential to collect key street vitality indicators in areas where device coverage is insufficient.
(2): Technology Integration and Algorithm Innovation: Integrating multiple methods to leverage their respective strengths could more comprehensively uncover the spatiotemporal dynamics of human activities.
(3): Micro-Scale Technological Applications: Expanding the use of sensing devices and fostering collaboration between urban regulatory equipment data and academic research can enhance studies at the micro scale. Integrating biometric devices, VR, or AR technologies with urban studies and environmental psychology could provide more refined and immersive evidence for exploring behavioral mechanisms and perceptual factors.
(4): While maintaining the methodological and data-oriented focus of this review, future research should ensure clear purpose definition, data minimization, anonymization, limited data retention, and independent auditing. In addition, mechanisms for public participation and appeals should be established to guarantee transparency and fairness. Building on this foundation, participatory approaches should be further extended to the design of indicator systems, enabling residents and local stakeholders to jointly define and evaluate the key dimensions of street vitality. Incorporating experiential factors such as comfort, safety, inclusiveness, and cultural use would further enhance the explanatory power and social relevance of vitality assessments.

5. Conclusions

This review systematically examines four core technologies and six data categories used in street vitality research, together with four application themes and eight subcategories spanning both macro and micro scales. It highlights key advances and persistent constraints related to data quality, technological adaptability, and application scenarios. While rapid innovation has opened substantial opportunities, uneven data coverage, limited methodological adaptability, and weak collaboration mechanisms continue to hinder broad and robust deployment.

Firstly, limitations in data coverage undermine generalizability. The absence of high-resolution data in remote and low-income areas restricts cross-regional and context-diverse studies. In high-density urban settings, micro-scale data collection faces challenges such as equipment costs and privacy protection. These constraints underscore the need for practical guidelines that can travel across contexts rather than case-specific prescriptions.

Secondly, existing techniques require stronger adaptability and integration. Machine learning, space syntax, and UAVs perform effectively in specific scenarios but exhibit shortcomings in dynamic behavior modeling, multi-source data fusion, and interdisciplinary collaboration. Particularly at the micro scale, developing cost-effective approaches for acquiring high-quality dynamic data remains critical. Equally important is methodological transparency (clear documentation of data provenance, preprocessing, and evaluation criteria) to ensure reproducibility and external scrutiny.

Thirdly, insufficient cross-sector collaboration and the lack of mature data-sharing architectures limit the translation of analytical findings into practice. Unified open platforms and reliable anonymization pipelines require further refinement to support policy and planning at scale. To promote compliant and ethical deployment, this review proposes a concise risk–safeguard mapping by technology and data category (including purpose limitation, data minimization and anonymization, on-device processing, retention caps, access logging, and redress mechanisms) as a baseline checklist for researchers and practitioners.

Based on the above challenges, the following four recommendations are proposed to guide future research directions:

(1): Promote open public data platforms to facilitate multi-agency collaboration and data anonymization mechanisms, thereby enhancing data accessibility in resource-constrained regions.
(2): Integrate multiple methodological approaches to leverage their respective strengths for a more comprehensive exploration of the dynamic mechanisms underlying street vitality. Strengthened interdisciplinary collaboration among data scientists, urban planners, behavioral researchers, and policy actors is essential to translate analytical models into shared tools for inclusive design and governance.
(3): Develop low-cost technological solutions adaptable to diverse scenarios, using multi-source data integration to support multi-scale vitality research and to provide efficient, evidence-based support for planning and policy decisions.
(4): Institutionalize a minimum set of safeguards, including purpose limitation, data minimization and anonymization, capped retention, on-device processing where feasible, audited access, and transparent reporting, so that measurement augments rather than replaces public deliberation about streets as civic spaces.

Beyond these directions, future studies should situate street vitality analytics within broader interdisciplinary ecosystems, linking computational innovation with social and ecological understanding. The reviewed technologies have potential to reshape how evidence informs design evaluation, participatory planning, and adaptive management through platforms such as digital twins and real-time urban dashboards. Such integration can bridge scientific rigor with public imagination, allowing vitality research to become both a methodological and civic enterprise.

In conclusion, this review clarifies the current state of technologies, their limitations, and promising paths forward for street vitality research. As smart-city systems mature, combining methodological integration with enforceable safeguards and cross-disciplinary collaboration will provide a stronger theoretical and practical foundation for optimizing urban public spaces and fostering meaningful social interaction.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/buildings15213987/s1, Table S1: PRISMA checklist; Table S2: “Method/technology” AND “Topic” AND “Context.” References [88,89] are cited in the supplementary materials.

Author Contributions

Conceptualization, Y.H. and M.C.; formal analysis, Y.H. and M.C.; methodology, Y.H. and M.C.; writing—original draft preparation, Y.H. and X.Z.; writing—review and editing, R.S., M.C. and R.Y.; data curation, Y.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article/Supplementary Materials. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Correction Statement

This article has been republished with a minor correction to the existing affiliation information. This change does not affect the scientific content of the article.

Abbreviations

The following abbreviations are used in this manuscript:

DL	Deep Learning
ML	Machine Learning
POI	Point of Interest
SL	Supervised Learning
UL	Unsupervised Learning
CV	Computer Vision
SVIs	Street View Images
WoS	Web of Science

Appendix A

Table A1. Classification of technology, data input, and data sources.

Category	Risks	Safeguard	Data Input	Data Source	Citation
SL (DL in the study)	Supervised or deep learning models using image, video, or sensor inputs may produce reversible features or embeddings, which could enable individual re-identification or continuous surveillance.	On-device processing and automatic blurring are applied, with only aggregated indicators or irreversible embeddings being output. The use and retention periods are restricted, while model cards and data documentation are provided, and all access and audit logs are recorded.	Images	OSM and Google Maps and Video Camera	[4]
				Tencent Maps and Baidu Maps	[17]
				Baidu Maps	[24,25,38,41,45,51]
				Baidu Maps and Cityscapes Dataset	[12]
				Camera-Based Device	[13]
				\	[40]
				Google Maps	[10,23,37]
				Google Maps and Baidu Maps	[39]
				Citygrid Sensors	[44]
				OSM	[36]
				Baidu Maps	[26]
				Baidu Maps and AutoNavi	[30]
				StreetAware Sensor	[46]
			Video	Municipality and PETS	[22]
			POI	AutoNavi	[30]
UL	Unsupervised clustering or topic mining may label population groups and, when linked with external data sources, lead to implicit profiling and discriminatory decisions.	Full-process de-identification and hierarchical aggregation are applied. The processing workflow and uncertainties are disclosed. Data are restricted to research use only, with linkage to identifiable information prohibited. Robustness and bias detections are conducted.	GPS Data	GPS Logger	[63]
UL			GPS Data	GPS Locators	[7]
GPS and GIS	Linking trajectory data with spatial features may reconstruct individual travel paths and activity ranges, which can be exploited for differential governance or tracking.	Apply coarse-grained spatiotemporal aggregation with minimum sample thresholds, implement de-identification and differential privacy, enforce purpose and retention limitations, and enable access logging and independent auditing.	Land-use Data and Neighborhood Data and POI and Activity Data	Municipality and Lianjia and Baidu Maps and GPS Tracking Devices	[90]
			Video	UAVs	[27]
			Pedestrian Volume and Land-use Data and Built-environment Variables	Municipality and GIS	[57]
			Pedestrian Volume	Observation	[56]
Space Syntax	When combined with external datasets, static network accessibility indicators may be misused for regional labeling or selective governance.	Disclose data coverage, accuracy, and indicator definitions; release only research-level aggregated results; verify with independent datasets; and restrict usage scenarios to prevent individual- or store-level applications.	Street Networks and Satellite Images and Pedestrian Volume	OSM and Municipality and Google Earth	[58]
			Street Networks and Non-residential buildings Data and GIS and Pedestrian Volume	GISrael and Manual Counters	[59]
			Street Networks and POI and Pedestrian/Vehicle Movement	Baidu Maps and Detailed Gate Count	[60]
			Maps and Pedestrian Volume and Pedestrian Behavior and Video and Image	Municipality and Pedestrian Counting and Questionnaire	[11]
			Street Networks and POI and Baidu Heatmap	OSM and Amap and Baidu Maps	[61]
			Images	AMOS Webcams	[28]
			Images and POI	Tencent Maps and OSM and Satellite Map	[14]
Multiple Big Data and Computer Analytics	Multi-source linkage (e.g., social, economic, trajectory, and POI data) may enable re-identification and detailed profiling, which can be misused for differential pricing or targeted actions.	Apply de-identification and hierarchical aggregation with minimum sample thresholds and outlier truncation; restrict purpose and retention period; maintain access auditing and grievance mechanisms; and conduct risk assessment prior to any cross-source linkage.	Street Network and Business Density and POI Density and Check-in Density and Comment Density	OSM and Baidu Maps and Blog and Dianping	[64]
			Mobile Phone Data	Municipality	[87]
			Mobile Phone Data and Building Data and Road Network and POI and Business Data and House Price and Recruitment Info	Mobile Phone Operator and Amap and Dianping and Fangtianxia and 51job	[75]
			Social Activity Intensity and Economic Activity Intensity and Pedestrian Density and Building Density and POI and Road Junction and Building Data	Weibo and Dianping and Tencent Maps and Tianditu and Baidu Maps and SinoGrids and Lianjia	[65]
			Check-in Data and Resident Population Data and House Price and POI	Blog and Municipality and Lianjia and Amap	[91]
			Check-in Data	Blog	[5]
			Video	Video Camera	[92]
			Road Network and Mobile Communication Dataset and Urban Spatial Data and POI	OSM and Mobile Phone Operator and Baidu Maps and AutoNavi	[84]
			Taxi Trajectories Records and Urban Physical Environment Features and POI	Taxis and Baidu Maps	[62]
			Check-in Data and POI and Population and GDP and Road Network	Blog and Baidu Maps and Municipality and OSM	[93]

Appendix B

Table A2. Classification of Machine Learning Techniques Based on Application Types.

Application Categories	Algorithm	Description and Application
Image-based Technologies	LDCF Algorithm	L. Chen et al. (2020) used it to evaluate pedestrian volumes with SVIs [17]. It can count pedestrians from images without an image segmentation step. In the pedestrian recognition field, orthogonal segmentation is more efficient and less computationally expensive during training and detection but may have advantages in dealing with high-dimensional data with highly correlated features [94].
	Image Recognition Algorithm ACF	Proposed by Dollar et al. (2014) [95]. Yin et al. (2015) used it to count pedestrians. It is an SL method that usually requires both positive and negative samples (e.g., pedestrians and non-pedestrians) for training and uses these samples to learn the difference between a target (e.g., pedestrian) and the background [10]. Cameras mounted on bicycles traveling on the street were used for acquisition and training in this study, similar to capturing GSVs. This allows the trained model to be used directly for detection without manually labeling pedestrians or training the model again.
	YOLOv4	YOLOv4 uses a CNN [96]. CV was used to measure street activity in real time [13]. Firstly, the image is used as input data, and a predicted category and a confidence score are output for each human activity or traffic element detected. Secondly, the trained model weights are transferred to a YOLOv4-tiny structure and a camera-based device. Finally, the device is installed on a bus to capture the urban vitality along the bus line. There are two advantages: (1) The equipment is fully automated and requires no intervention by the driver. (2) The device processes images locally without saving them, thus enabling the creation of rich indicators of street use while respecting personal privacy.
	PSPNet	It is an algorithm for scene parsing and semantic segmentation using SVI as input data through pyramid pooling module and pyramid scene parsing network. It is used to study streetscape features that can be directly perceived by pedestrians (micro-scale) [24,25,30,37,39,41,45].
	DLM-SVC Model	The model comprises a pedestrian-volume-based model and an activity-based model, which infer street vitality from two different aspects [4]. Video data are captured by both high- and low-set cameras and converted into image data for pedestrian counting and activity categorization, which serve as input data for assessing street vitality.
	Artificial Neural Network (ANN)	An ANN is an estimation model trained on a large amount of input data [23]. Neural networks composed of interconnected neurons adapt to input data and are able to learn. They are used for CV and speech recognition [97]. Yin and Wang (2016) used an ANN to analyze the texture and color of images [23].
	Support Vector Machine (SVM)	It is an SL method for classification, proposed by Cortes and Vapnik (1995), and has been used for image classification [98]. Yin and Wang (2016) used image features extracted by ANN (e.g., neighboring regions and their sizes and locations, the probability that a region is sky, and the probability that a shared boundary is a straight line) as input data to SVM, classifying and labeling the images as sky or non-sky [23]. It was used to objectively measure visual enclosure and walkability.
	Deeplab V3+	Proposed by L.-C. Chen et al. (2018), it is a variant of CNN for semantic segmentation. The main difference from PSPNet is that Deeplab V3+ combines atrous (dilated) convolution with a new encoder–decoder structure [99]. Improved object boundaries make the model more adaptable to targets of different sizes [99]. Z. Liu et al. (2022) deconstructed visual scenes by extracting formal features (form, line, texture, and color) of landscapes to assess their character (e.g., coherence, diversity, vividness, and harmony) [12]. Zhao et al. (2023) also extracted built-environment elements from SVI to explore their relationship with street vitality. This method of extracting abstract parameters from landscapes is therefore suitable for studying the visual perception of landscapes [38].
	Convolutional Neural Network (CNN)	It can be used to recognize patterns and features in images and to perform classification or regression on images [97]. In Qi et al. (2020), CNNs are utilized to mimic human perception of urban scenes and to recognize visual features of urban street vitality directly from street scenes [40].
	ResNet-34	ResNet-34 is a residual neural network (ResNet), an architecture first proposed by He et al. (2015) [100]. In road network classification by W. Chen et al. (2021), road network types were manually extracted using a colored road hierarchy diagram (CRHD), and ResNet-34 was trained on the resulting image data [36]. In contrast to common CNNs, ResNets reformulate the layers as residual functions learned with reference to the layer inputs. ResNets offer better performance per parameter and faster inference than earlier architectures such as VGG, combining speed and accuracy with the ability to filter important features.
	Segnet Neural Network	Proposed by Badrinarayanan et al. (2017), it is a deep neural network architecture for semantic segmentation, used here to recognize the sky and greenery in images [101]. M. Li et al. (2021) used it to classify the pixels of SVIs as sky, greenery, or the rest of the street [44]. It offered faster inference than other semantic segmentation methods of the time [101].
	Fully Convolutional Neural Network (FCN)	It is a DL method proposed by Long et al. (2015) for semantic segmentation [102]. X. Li et al. (2022) used an FCN trained on the ADE20K dataset to extract and count six types of street elements (pedestrians, bicyclists, motor vehicles, transit, private cars, and trucks) [26].
	High-resolution Network (HRNet)	Proposed by J. Wang et al. (2020) and adapted from the Faster R-CNN network, it is used for object detection and human pose estimation, enabling state-of-the-art bottom-up segmentation with high-resolution feature pyramids [103]. Piadyk et al. (2023) detected six categories: people, cars, bicycles, trucks, motorcycles, and buses [46]. For pose estimation, the model detects each “person” independently and focuses on that person’s bounding box. Despite obvious lens vignetting and brightness variations in the images, the method produces consistent estimates as a person moves towards or away from the camera.
	Inception V4	Proposed by Szegedy et al. (2017) as an image classification architecture [104]. Tang et al. (2022) trained and compared four state-of-the-art CNN models (DenseNet-121, SENet-154, ResNeSt-50, and Inception V4) and chose Inception V4 to assess people’s willingness to stay in relation to their environment [51].
	Multi-layer Perceptron (MLP)	MLP is a type of artificial neural network for regression and classification tasks. By adding hidden layers and non-linear activation functions, the MLP can capture non-linear relationships in the data, which makes it more powerful than simple linear models (e.g., linear regression or a single perceptron) [105]. Hu et al. (2020) used an MLP to classify urban functions, taking the distribution probabilities of POI-based urban function topics and street-view-based metrics as inputs [30]. The output layer gives the detection result for each selected road segment, after which roads are categorized and labeled with specific urban function types.
Video Detection-based Technologies	FairMOT	Proposed by Y. Zhang et al. (2021), it is based on the anchor-free object detection architecture CenterNet [106]. It has been used for automated video processing, and a framework that combines pedestrian tracking with attribute recognition was proposed on this basis [22]. The framework incorporates high-level pedestrian attributes (gender, age group, and type of personal effects) into re-identification (ReID) to help analyze pedestrian mobility patterns. The method partly overcomes problems that CV has often encountered in previous studies, such as (1) variability of human appearance and (2) occlusion, and therefore yields more reliable pedestrian volume data.
Based on Data Analysis	LightGBM (Tree-based Regression Model)	It is a popular ML method in industry, first proposed by Ke et al. (2017) [107]. Its advantages include high speed, high accuracy, and the ability to filter important features. It evolved from the gradient boosted decision tree (GBDT), and W. Chen et al. (2021) used it to establish the relationship between morphology indices and vitality indicators [36].
Based on Data Analysis	Principal Component Analysis (PCA)	Proposed by Pearson (1901), it is a statistical method that finds the line or plane minimizing the sum of squared distances from all data points to that line or plane [108]. It predates ML and was not originally designed for it, but it is now widely used for data analysis and ML.
Based on Data Analysis	Latent Dirichlet Allocation (LDA)	LDA is a UL method based on Bayesian probability, primarily used to discover latent topics in a collection of documents [109]. Hu et al. (2020) applied it to POI data to extract socio-economic information for semantic urban function extraction and, combining it with semantic segmentation, proposed an urban-function-driven street quality assessment method [30].
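
The PCA entry above describes the method as finding the line or plane that minimizes the sum of squared distances to the data points. A minimal sketch of that property in Python with NumPy follows. The function names `fit_pca` and `squared_distance_to_subspace` and the synthetic two-column data are hypothetical illustrations and are not taken from any of the reviewed studies.

```python
import numpy as np


def fit_pca(points, n_components=1):
    """Return the data mean and the top principal directions (as rows)."""
    points = np.asarray(points, dtype=float)
    mean = points.mean(axis=0)
    # SVD of the centered data: rows of vt are orthonormal directions,
    # ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(points - mean, full_matrices=False)
    return mean, vt[:n_components]


def squared_distance_to_subspace(points, mean, directions):
    """Sum of squared perpendicular distances to the line/plane through `mean`.

    `directions` must contain orthonormal row vectors.
    """
    centered = np.asarray(points, dtype=float) - mean
    projected = centered @ directions.T @ directions
    return float(np.sum((centered - projected) ** 2))


# Hypothetical data: two strongly correlated indicators (assumption).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
data = np.column_stack([x, 0.5 * x + 0.1 * rng.normal(size=200)])

mean, pcs = fit_pca(data, n_components=1)
best = squared_distance_to_subspace(data, mean, pcs)
# Compare with an arbitrary alternative line (the first coordinate axis).
other = squared_distance_to_subspace(data, mean, np.array([[1.0, 0.0]]))
print(best <= other)
```

The comparison at the end only illustrates the least-squares property for one alternative direction; in practice a library implementation such as a standard PCA routine would normally be used instead.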

Appendix C

Table A3. Data Types and Attributes in Street Vitality Research.

Use Case	Subcategory	Source/Tools	Advantages	Limitations	Risks	Safeguard
Human Activity Data	Pedestrian Count	Sensors, Image, Video	Accurate for real-time monitoring; supports density and flow analysis	Affected by occlusion, lighting conditions	Camera footage may capture identifiable individuals and be misused for tracking or forensic purposes beyond the scope of research.	Process data solely on-device with automatic blurring of identifiable information; retain only aggregated counts such as the number of people; define explicit purposes and retention periods; and ensure on-site notification and auditing.
	Visitation	Mobile Phone Signaling	Good for large-scale mobility analysis	Limited behavioral detail; dependent on device coverage	Repeated positioning data may reconstruct individual mobility trajectories and be misused for selective management.	Release only de-identified data aggregated by large spatial and temporal units; establish contractual agreements on purpose and retention limits with data providers; and maintain access logging.
	Check-in Data	Social Media Platforms	High precision in location-specific data	Low coverage in non-commercial areas and specific demographics	Temporal and locational information may enable account re-identification and expose individual movements, leading to targeted harassment.	Provide only de-identified aggregated statistics without disclosing precise coordinates or timestamps; define purpose and retention limits; and offer data deletion and grievance mechanisms.
	Activity Types	Images, Video	Captures diverse behavioral patterns; supports detailed activity recognition	(1) High labeling cost; (2) equipment and environmental limitations	Activity labels may be used to infer sensitive habits or identities, posing potential risks to personal safety.	Perform recognition on the device without storing raw footage, and output only categorical counts.
Environmental Data	POI Data	Government Statistics, Online Maps	Provides functional and locational diversity	(1) Limited dynamic characteristics; (2) sparse in remote areas; (3) only specific locations can be fetched	The distribution of facilities may be used to label areas, which could in turn restrict services or impose behavioral guidance.	Disclose data sources, definitions, and update frequency along with the dataset; report platform differences; and restrict usage to research purposes only.
	Economic Data	Economic Reports, Government Statistics	Offers quantitative measures of economic vitality	Low temporal resolution	Regional rental or transaction information may be used for price discrimination or linked to specific stores or properties to exert pressure.	Release only de-identified data aggregated over large spatial and temporal units with sufficient sample sizes; prohibit store- or property-level outputs; and maintain access logging and auditing.
	Street-view-derived Features	CV	Automated analysis reduces manual work	(1) Data quality depends on image resolution, weather, and occlusion; (2) limited by data coverage; (3) uneven distribution over time, in particular a lack of nighttime data	Visual features may be used to infer sensitive attributes or trigger selective monitoring of individuals with atypical appearances.	Share only de-identified feature indicators; provide model documentation including data sources, limitations, and temporal coverage; and restrict use to research purposes.
Perception Data	Score, Topic	Surveys, Text Mining	Reflects user opinions and preferences	Prone to noise, bias, and fake reviews	Public opinion and textual data may be subject to organized manipulation or used to identify and target specific individuals or businesses.	Establish anti-abuse and data-cleaning procedures, conduct robustness verification, remove identifiable information, and provide channels for appeal and correction.
Perception Data	Physiological Data	Biometric Devices	Provides precise measurement of spatial impact on human physiology	High cost and difficult for large-scale application	Biometric signals are highly sensitive and may enable long-term identification or be collected without sufficient consent.	Require ethical approval and explicit consent; perform on-device de-identification; set strict data retention limits; allow withdrawal at any time; and prohibit linkage with external data sources.
Spatial Data	Spatial Features	Government Statistics, Online Maps	Enables connectivity and accessibility analysis	Limited to static features; requires high-quality data	Static map features may be used for differential regional control or combined with external data sources for large-scale tracking.	Disclose data coverage and accuracy, standardize indicator definitions, verify with dynamic data prior to any application, and restrict usage scenarios.
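
Several safeguards in Table A3 call for releasing only counts aggregated over large spatial and temporal units with sufficient sample sizes. The sketch below shows one possible way to do this in Python, assuming positions are already in a projected coordinate system in metres. The function name `aggregate_counts`, the 500 m cell size, the 60 min time slot, and the minimum count of 10 are illustrative assumptions, not values recommended by the reviewed studies.

```python
from collections import Counter


def aggregate_counts(records, cell_size_m=500, slot_minutes=60, min_count=10):
    """Aggregate point records into grid-cell/time-slot counts.

    records: iterable of (x_m, y_m, minute_of_day) tuples (hypothetical schema).
    Cells with fewer than `min_count` records are suppressed, so that
    sparsely populated cells are not released.
    """
    counts = Counter()
    for x, y, minute in records:
        cell = (int(x // cell_size_m), int(y // cell_size_m))
        slot = int(minute // slot_minutes)
        counts[(cell, slot)] += 1
    return {key: n for key, n in counts.items() if n >= min_count}


# Hypothetical example: 12 records share one cell and hour, 3 fall elsewhere.
records = [(100.0 + i, 200.0, 8 * 60 + i) for i in range(12)]
records += [(900.0, 900.0, 9 * 60 + i) for i in range(3)]

released = aggregate_counts(records)
print(released)  # {((0, 0), 8): 12}
```

Only the aggregated dictionary would be shared; the raw coordinates and timestamps stay with the data holder. Coarser cells or a higher threshold reduce re-identification risk at the cost of spatial and temporal detail.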

References

1. Zeng, J.; Qian, Y.; Ren, Z.; Xu, D.; Wei, X. Road Landscape Morphology of Valley City Blocks under the Concept of “Open Block”—Taking Lanzhou City as an Example. Sustainability 2019, 11, 6258.
2. Jacobs, J. The Death and Life of Great American Cities; Yilin Press: Nanjing, China, 2020; ISBN 978-7-5447-4058-6.
3. Zhang, D.; Ling, G.H.T.; Misnan, S.H.B.; Fang, M. A Systematic Review of Factors Influencing the Vitality of Public Open Spaces: A Novel Perspective Using Social–Ecological Model (SEM). Sustainability 2023, 15, 5235.
4. Li, Y.; Yabuki, N.; Fukuda, T. Exploring the Association between Street Built Environment and Street Vitality Using Deep Learning Methods. Sustain. Cities Soc. 2022, 79, 103656.
5. Lu, S.; Huang, Y.; Shi, C.; Yang, X. Exploring the Associations Between Urban Form and Neighborhood Vibrancy: A Case Study of Chengdu, China. ISPRS Int. J. Geo-Inf. 2019, 8, 165.
6. Mehta, V. Lively Streets: Determining Environmental Characteristics to Support Social Behavior. J. Plan. Educ. Res. 2007, 27, 165–187.
7. Zhang, L.; Zhang, R.; Yin, B. The Impact of the Built-up Environment of Streets on Pedestrian Activities in the Historical Area. Alex. Eng. J. 2021, 60, 285–300.
8. Wang, H.; Tang, J.; Xu, P.; Chen, R.; Yao, H. Research on the Influence Mechanism of Street Vitality in Mountainous Cities Based on a Bayesian Network: A Case Study of the Main Urban Area of Chongqing. Land 2022, 11, 728.
9. Yang, Y.; Qian, Y.; Zeng, J.; Wei, X.; Yang, M. Walkability Measurement of 15-Minute Community Life Circle in Shanghai. Land 2023, 12, 153.
10. Yin, L.; Cheng, Q.; Wang, Z.; Shao, Z. ‘Big Data’ for Pedestrian Volume: Exploring the Use of Google Street View Images for Pedestrian Counts. Appl. Geogr. 2015, 63, 337–345.
11. Yıldırım, Ö.C.; Çelik, E. Understanding Pedestrian Behavior and Spatial Relations: A Pedestrianized Area in Besiktas, Istanbul. Front. Archit. Res. 2023, 12, 67–84.
12. Liu, Z.; Ma, X.; Hu, L.; Lu, S.; Ye, X.; You, S.; Tan, Z.; Li, X. Information in Streetscapes—Research on Visual Perception Information Quantity of Street Space Based on Information Entropy and Machine Learning. ISPRS Int. J. Geo-Inf. 2022, 11, 628.
13. Salazar-Miranda, A.; Zhang, F.; Sun, M.; Leoni, P.; Duarte, F.; Ratti, C. Smart Curbs: Measuring Street Activities in Real-Time Using Computer Vision. Landsc. Urban Plan. 2023, 234, 104715.
14. Yang, J.; Li, X.; Du, J.; Cheng, C. Exploring the Relationship between Urban Street Spatial Patterns and Street Vitality: A Case Study of Guiyang, China. Int. J. Environ. Res. Public Health 2023, 20, 1646.
15. Whyte, W.H. The Social Life of Small Urban Spaces; 7 Print; Project for Public Spaces: New York, NY, USA, 2010; ISBN 978-0-9706324-1-8.
16. Gehl, J.; Svarre, B. Counting, Mapping, Tracking and Other Tools. In How To Study Public Life; Island Press/Center for Resource Economics: Washington, DC, USA, 2013; pp. 21–36. ISBN 978-1-59726-445-7.
17. Chen, L.; Lu, Y.; Sheng, Q.; Ye, Y.; Wang, R.; Liu, Y. Estimating Pedestrian Volume Using Street View Images: A Large-Scale Validation Test. Comput. Environ. Urban Syst. 2020, 81, 101481.
18. Ardic, S.I.; Kirdar, G.; Lima, A.B. An Exploratory Urban Analysis via Big Data Approach: Eindhoven Case. Cogn. City 2020, 2, 309–318.
19. Zhang, A.; Li, W.; Wu, J.; Lin, J.; Chu, J.; Xia, C. How Can the Urban Landscape Affect Urban Vitality at the Street Block Level? A Case Study of 15 Metropolises in China. Environ. Plan. B Urban Anal. City Sci. 2021, 48, 1245–1262.
20. Biljecki, F.; Ito, K. Street View Imagery in Urban Analytics and GIS: A Review. Landsc. Urban Plan. 2021, 215, 104217.
21. Wang, X.; Zheng, S.; Yang, R.; Zheng, A.; Chen, Z.; Tang, J.; Luo, B. Pedestrian Attribute Recognition: A Survey. Pattern Recognit. 2022, 121, 108220.
22. Wong, P.K.-Y.; Luo, H.; Wang, M.; Leung, P.H.; Cheng, J.C.P. Recognition of Pedestrian Trajectories and Attributes with Computer Vision and Deep Learning Techniques. Adv. Eng. Inform. 2021, 49, 101356.
23. Yin, L.; Wang, Z. Measuring Visual Enclosure for Street Walkability: Using Machine Learning Algorithms and Google Street View Imagery. Appl. Geogr. 2016, 76, 147–153.
24. Chen, L.; Lu, Y.; Ye, Y.; Xiao, Y.; Yang, L. Examining the Association between the Built Environment and Pedestrian Volume Using Street View Images. Cities 2022, 127, 103734.
25. Gong, Z.; Ma, Q.; Kan, C.; Qi, Q. Classifying Street Spaces with Street View Images for a Spatial Indicator of Urban Functions. Sustainability 2019, 11, 6424.
26. Li, X.; Li, Y.; Jia, T.; Zhou, L.; Hijazi, I.H. The Six Dimensions of Built Environment on Urban Vitality: Fusion Evidence from Multi-Source Data. Cities 2022, 121, 103482.
27. Parra-Ovalle, D.; Miralles-Guasch, C.; Marquet, O. Pedestrian Street Behavior Mapping Using Unmanned Aerial Vehicles. A Case Study in Santiago de Chile. PLoS ONE 2023, 18, e0282024.
28. Petrasova, A.; Hipp, J.A.; Mitasova, H. Visualization of Pedestrian Density Dynamics Using Data Extracted from Public Webcams. ISPRS Int. J. Geo-Inf. 2019, 8, 559.
29. Yu, S.; Wang, H.; Wang, Y. Optimization Design of Street Public Space Layout on Account of Internet of Things and Deep Learning. Comput. Intell. Neurosci. 2022, 2022, 7274525.
30. Hu, F.; Liu, W.; Lu, J.; Song, C.; Meng, Y.; Wang, J.; Xing, H. Urban Function as a New Perspective for Adaptive Street Quality Assessment. Sustainability 2020, 12, 1296.
31. Zhang, Z.; Zhao, L.; Zhang, M. Exploring Non-Linear Urban Vibrancy Dynamics in Emerging New Towns: A Case Study of the Wuhan Metropolitan Area. Sustain. Cities Soc. 2024, 112, 105580.
32. Luo, T.; Chen, M. Advancements in Supervised Machine Learning for Outdoor Thermal Comfort: A Comprehensive Systematic Review of Scales, Applications, and Data Types. Energy Build. 2025, 329, 115255.
33. Sun, P.; Chen, M.; Chen, J. The “Blue” Habitat of Urban & Suburban Areas and Approaches for Its Biodiversity Research: A Scoping Review. J. Environ. Manag. 2025, 373, 123567.
34. Grekousis, G. Artificial Neural Networks and Deep Learning in Urban Geography: A Systematic Review and Meta-Analysis. Comput. Environ. Urban Syst. 2019, 74, 244–256.
35. Ullah, Z.; Al-Turjman, F.; Mostarda, L.; Gagliardi, R. Applications of Artificial Intelligence and Machine Learning in Smart Cities. Comput. Commun. 2020, 154, 313–323.
36. Chen, W.; Wu, A.N.; Biljecki, F. Classification of Urban Morphology with Deep Learning: Application on Urban Vitality. Comput. Environ. Urban Syst. 2021, 90, 101706.
37. Yang, C.; Lo, S.M.; Ma, R.; Fang, H. The Effect of the Perceptible Built Environment on Pedestrians’ Walking Behaviors in Commercial Districts: Evidence from Hong Kong. Environ. Plan. B Urban Anal. City Sci. 2023, 51, 239980832311776.
38. Zhao, K.; Guo, J.; Ma, Z.; Wu, W. Exploring the Spatiotemporal Heterogeneity and Stationarity in the Relationship between Street Vitality and Built Environment. SAGE Open 2023, 13, 215824402311522.
39. Jiang, Y.; Chen, L.; Grekousis, G.; Xiao, Y.; Ye, Y.; Lu, Y. Spatial Disparity of Individual and Collective Walking Behaviors: A New Theoretical Framework. Transp. Res. Part D Transp. Environ. 2021, 101, 103096.
40. Qi, Y.; Chodron Drolma, S.; Zhang, X.; Liang, J.; Jiang, H.; Xu, J.; Ni, T. An Investigation of the Visual Features of Urban Street Vitality Using a Convolutional Neural Network. Geo-Spat. Inf. Sci. 2020, 23, 341–351.
41. Wu, C.; Ye, Y.; Gao, F.; Ye, X. Using Street View Images to Examine the Association between Human Perceptions of Locale and Urban Vitality in Shenzhen, China. Sustain. Cities Soc. 2023, 88, 104291.
42. He, S.; Zhang, Z.; Yu, S.; Xia, C.; Tung, C.-L. Investigating the Effects of Urban Morphology on Vitality of Community Life Circles Using Machine Learning and Geospatial Approaches. Appl. Geogr. 2024, 167, 103287.
43. Xu, J.; Wang, J.; Zuo, X.; Han, X. Spatial Quality Optimization Analysis of Streets in Historical Urban Areas Based on Street View Perception and Multisource Data. J. Urban Plann. Dev. 2024, 150, 05024036.
44. Li, M.; Liu, J.; Lin, Y.; Xiao, L.; Zhou, J. Revitalizing Historic Districts: Identifying Built Environment Predictors for Street Vibrancy Based on Urban Sensor Data. Cities 2021, 117, 103305.
45. Yang, Y.; Ma, Y.; Jiao, H. Exploring the Correlation between Block Vitality and Block Environment Based on Multisource Big Data: Taking Wuhan City as an Example. Land 2021, 10, 984.
46. Piadyk, Y.; Rulff, J.; Brewer, E.; Hosseini, M.; Ozbay, K.; Sankaradas, M.; Chakradhar, S.; Silva, C. StreetAware: A High-Resolution Synchronized Multimodal Urban Scene Dataset. Sensors 2023, 23, 3710.
47. Liu, Y.; Guo, X. A Dynamic Prediction Framework for Urban Public Space Vitality: From Hypothesis to Algorithm and Verification. Sustainability 2024, 16, 2846.
48. Yao, L.; Gao, C.; Xu, Y.; Zhang, X.; Wang, X.; Hu, Y. Prediction of Commercial Street Location Based on Point of Interest (POI) Big Data and Machine Learning. ISPRS Int. J. Geo-Inf. 2024, 13, 371.
49. Cheng, J.; Hu, L.; Zhang, J.; Lei, D. Understanding the Synergistic Effects of Walking Accessibility and the Built Environment on Street Vitality in High-Speed Railway Station Areas. Sustainability 2024, 16, 5524.
50. Doan, Q.C.; Ma, J.; Chen, S.; Zhang, X. Nonlinear and Threshold Effects of the Built Environment, Road Vehicles and Air Pollution on Urban Vitality. Landsc. Urban Plan. 2025, 253, 105204.
51. Tang, Y.; Zhang, J.; Liu, R.; Li, Y. Exploring the Impact of Built Environment Attributes on Social Followings Using Social Media Data and Deep Learning. ISPRS Int. J. Geo-Inf. 2022, 11, 325.
52. Sun, Y.; Wan, B.; Sheng, Q. Relationship Between Spatial Form, Functional Distribution, and Vitality of Railway Station Areas Under Station-City Synergetic Development: A Case Study of Four Special-Grade Stations in Beijing. Sustainability 2024, 16, 10102.
53. Taecharungroj, V.; Ntounis, N. What Amenities Drive Footfall in UK Town Centres? A Machine Learning Approach Using OpenStreetMap Data. Environ. Plan. B Urban Anal. City Sci. 2024, 52, 23998083241290343.
54. Zhu, Y.; Su, F.; Han, X.; Fu, Q.; Liu, J. Uncovering the Drivers of Gender Inequality in Perceptions of Safety: An Interdisciplinary Approach Combining Street View Imagery, Socio-Economic Data and Spatial Statistical Modelling. Int. J. Appl. Earth Obs. Geoinf. 2024, 134, 104230.
55. Hillier, B. The Art of Place and the Science of Space. World Archit. 2015, 11, 24–34.
56. Fareh, F.; Alkama, D. The Effect of Spatial Configuration on the Movement Distribution Behavior: The Case Study of Constantine Old Town (Algeria). Eng. Technol. Appl. Sci. Res. 2022, 12, 9136–9141.
57. Lee, S.; Yoo, C.; Seo, K.W. Determinant Factors of Pedestrian Volume in Different Land-Use Zones: Combining Space Syntax Metrics with GIS-Based Built-Environment Measures. Sustainability 2020, 12, 8647.
58. Nag, D.; Sen, J.; Goswami, A.K. Measuring Connectivity of Pedestrian Street Networks in the Built Environment for Walking: A Space-Syntax Approach. Transp. Dev. Econ. 2022, 8, 34.
59. Omer, I.; Kaplan, N. Using Space Syntax and Agent-Based Approaches for Modeling Pedestrian Volume at the Urban Scale. Comput. Environ. Urban Syst. 2017, 64, 57–67.
60. Sheng, Q.; Jiao, J.; Pang, T. Understanding the Impact of Street Patterns on Pedestrian Distribution: A Case Study in Tianjin, China. Urban Rail Transit 2021, 7, 209–225.
61. Li, X.; Qian, Y.; Zeng, J.; Wei, X.; Guang, X. The Influence of Strip-City Street Network Structure on Spatial Vitality: Case Studies in Lanzhou, China. Land 2021, 10, 1107.
62. Chen, S.; Lang, W.; Li, X. Evaluating Urban Vitality Based on Geospatial Big Data in Xiamen Island, China. SAGE Open 2022, 12, 215824402211345.
63. Sugimoto, K.; Ota, K.; Suzuki, S. Visitor Mobility and Spatial Structure in a Local Urban Tourism Destination: GPS Tracking and Network Analysis. Sustainability 2019, 11, 919.
64. Gan, X.; Huang, L.; Wang, H.; Mou, Y.; Wang, D.; Hu, A. Optimal Block Size for Improving Urban Vitality: An Exploratory Analysis with Multiple Vitality Indicators. J. Urban Plann. Dev. 2021, 147, 04021027.
65. Huang, B.; Zhou, Y.; Li, Z.; Song, Y.; Cai, J.; Tu, W. Evaluating and Characterizing Urban Vibrancy Using Spatial Big Data: Shanghai as a Case Study. Environ. Plan. B Urban Anal. City Sci. 2020, 47, 1543–1559.
66. Yu, B.; Sun, J.; Wang, Z.; Jin, S. Influencing Factors of Street Vitality in Historic Districts Based on Multisource Data: Evidence from China. ISPRS Int. J. Geo-Inf. 2024, 13, 277.
67. Duan, J.; Liao, J.; Liu, J.; Gao, X.; Shang, A.; Huang, Z. Evaluating the Spatial Quality of Urban Living Streets: A Case Study of Hengyang City in Central South China. Sustainability 2023, 15, 10623.
68. Hinton, G.; Vinyals, O.; Dean, J. Distilling the Knowledge in a Neural Network. arXiv 2015, arXiv:1503.02531.
69. Shi, W.; Cao, J.; Zhang, Q.; Li, Y.; Xu, L. Edge Computing: Vision and Challenges. IEEE Internet Things J. 2016, 3, 637–646.
70. Sweeney, L. k-Anonymity: A Model for Protecting Privacy. Int. J. Unc. Fuzz. Knowl. Based Syst. 2002, 10, 557–570.
71. Google Cloud. Cloud Tensor Processing Unit (TPU). Available online: https://cloud.google.com/tpu/docs/tpus?hl=ja (accessed on 30 June 2024).
72. Chen, M.; Cai, Y.; Guo, S.; Sun, R.; Song, Y.; Shen, X. Evaluating Implied Urban Nature Vitality in San Francisco: An Interdisciplinary Approach Combining Census Data, Street View Images, and Social Media Analysis. Urban For. Urban Green. 2024, 95, 128289.
73. van der Maaten, L.; Hinton, G. Visualizing Data Using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605. Available online: http://jmlr.org/papers/v9/vandermaaten08a.html (accessed on 7 October 2024).
74. Steiger, E.; Resch, B.; Zipf, A. Exploration of Spatiotemporal and Semantic Clusters of Twitter Data Using Unsupervised Neural Networks. Int. J. Geogr. Inf. Sci. 2016, 30, 1694–1716.
75. Liu, S.; Zhang, L.; Long, Y.; Long, Y.; Xu, M. A New Urban Vitality Analysis and Evaluation Framework Based on Human Activity Modeling Using Multi-Source Big Data. ISPRS Int. J. Geo-Inf. 2020, 9, 617.
76. Erkan, İ. The Neuro-Cognitive Approach to Urban Planning: Wayfinding Behavior Analysis and Its Effect on Urban Planning. J. Urban Technol. 2024, 31, 45–71.
77. Mavros, P.; Austwick, M.Z.; Smith, A.H. Geo-EEG: Towards the Use of EEG in the Study of Urban Behaviour. Appl. Spat. Anal. 2016, 9, 191–212.
78. Wang, Z.; Shen, M.; Huang, Y. Exploring the Impact of Facade Color Elements on Visual Comfort in Old Residential Buildings in Shanghai: Insights from Eye-Tracking Technology. Buildings 2024, 14, 1758.
79. Zontone, P.; Affanni, A.; Piras, A.; Rinaldo, R. Exploring Physiological Signal Responses to Traffic-Related Stress in Simulated Driving. Sensors 2022, 22, 939.
80. Baidu API. Available online: https://lbs.baidu.com/faq/api?title=iossdk/guide/map-render/heatingPower (accessed on 12 January 2025).
81. Mapbox Movement. Available online: https://www.mapbox.com/movement-data (accessed on 12 January 2025).
82. Wang, J.; Biljecki, F. Unsupervised Machine Learning in Urban Studies: A Systematic Review of Applications. Cities 2022, 129, 103925.
83. El Bouchefry, K.; De Souza, R.S. Learning in Big Data: Introduction to Machine Learning. In Knowledge Discovery in Big Data from Astronomy and Earth Observation; Elsevier: Amsterdam, The Netherlands, 2020; pp. 225–249. ISBN 978-0-12-819154-5.
84. Guo, X.; Chen, H.; Yang, X. An Evaluation of Street Dynamic Vitality and Its Influential Factors Based on Multi-Source Big Data. ISPRS Int. J. Geo-Inf. 2021, 10, 143.
85. Gholizadeh, N.; Saadatfar, H.; Hanafi, N. K-DBSCAN: An Improved DBSCAN Algorithm for Big Data. J. Supercomput. 2021, 77, 6214–6235.
86. Soonthornphisaj, N.; Sira-Aksorn, T.; Suksankawanich, P. Social Media Comment Management Using SMOTE and Random Forest Algorithms. In Proceedings of the 2018 19th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD), Busan, Republic of Korea, 27–29 June 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 129–134.
87. Jia, C.; Du, Y.; Wang, S.; Bai, T.; Fei, T. Measuring the Vibrancy of Urban Neighborhoods Using Mobile Phone Data with an Improved PageRank Algorithm. Trans. GIS 2019, 23, 241–258.
88. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71.
89. Available online: http://www.prisma-statement.org/ (accessed on 14 June 2023).
90. Wu, J.; Ta, N.; Song, Y.; Lin, J.; Chai, Y. Urban Form Breeds Neighborhood Vibrancy: A Case Study Using a GPS-Based Activity Survey in Suburban Beijing. Cities 2018, 74, 100–108.
91. Lu, S.; Shi, C.; Yang, X. Impacts of Built Environment on Urban Vitality: Regression Analyses of Beijing and Chengdu, China. Int. J. Environ. Res. Public Health 2019, 16, 4592.
92. Zacharias, J. Pedestrian Dynamics on Narrow Pavements in High-Density Hong Kong. J. Urban Manag. 2021, 10, 409–418.
93. Wang, X.; Zhang, Y.; Yu, D.; Qi, J.; Li, S. Investigating the Spatiotemporal Pattern of Urban Vibrancy and Its Determinants: Spatial Big Data Analyses in Beijing, China. Land Use Policy 2022, 119, 106162.
94. Nam, W.; Dollar, P.; Han, J.H. Local Decorrelation for Improved Pedestrian Detection. In Proceedings of the Advances in Neural Information Processing Systems 27 (NIPS 2014), Montreal, QC, Canada, 8–13 December 2014.
95. Dollar, P.; Appel, R.; Belongie, S.; Perona, P. Fast Feature Pyramids for Object Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2014, 36, 1532–1545.
96. Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934.
97. Lecun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-Based Learning Applied to Document Recognition. Proc. IEEE 1998, 86, 2278–2324.
98. Cortes, C.; Vapnik, V. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297.
99. Chen, L.-C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation. In Proceedings of the Computer Vision—ECCV 2018, Munich, Germany, 8–14 September 2018; Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y., Eds.; Lecture Notes in Computer Science. Springer International Publishing: Cham, Switzerland, 2018; Volume 11211, pp. 833–851, ISBN 978-3-030-01233-5.
100. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015.
101. Badrinarayanan, V.; Kendall, A.; Cipolla, R. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495.
102. Long, J.; Shelhamer, E.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015.
103. Wang, J.; Sun, K.; Cheng, T.; Jiang, B.; Deng, C.; Zhao, Y.; Liu, D.; Mu, Y.; Tan, M.; Wang, X.; et al. Deep High-Resolution Representation Learning for Visual Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 3349–3364.
104. Szegedy, C.; Ioffe, S.; Vanhoucke, V.; Alemi, A. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; Volume 31.
105. Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning Representations by Back-Propagating Errors. Nature 1986, 323, 533–536.
106. Zhang, Y.; Wang, C.; Wang, X.; Zeng, W.; Liu, W. FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking. Int. J. Comput. Vis. 2021, 129, 3069–3087.
107. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.-Y. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017.
108. Pearson, K. LIII. On Lines and Planes of Closest Fit to Systems of Points in Space. Lond. Edinb. Dublin Philos. Mag. J. Sci. 1901, 2, 559–572.
109. Blei, D.M.; Ng, A.Y.; Jordan, M.I. Latent Dirichlet Allocation. J. Mach. Learn. Res. 2003, 3, 993–1022.

Figure 1. Systematic review process (adapted from PRISMA protocol).

Figure 2. (a) Top 10 journals by number of publications. (b) Publication trend over the years. (c) Trend in ML-based publications over the years. (d) Share of ML-based publications in the total over the years.

Figure 3. (a) The keywords and their co-occurrence network, and (b) Most frequent keywords in all papers.

Figure 4. Keyword frequency statistics.

Figure 5. The use scenarios in streets for vitality research.

Figure 6. Image source: authors’ own video shot with GoPro9. (a) Poor nighttime picture stability; (b) Obstacles; (c) Reflection; (d) Non-sunny; (e) Overlapping; (f) Low light.

Figure 7. Four data interpretation dimensions and six data-source categories, with research themes and eight subcategories of technological applications across macro and micro spatial scales. Note: This figure summarizes data–method associations in street vitality studies. It does not represent ML/DL or learning paradigms. DL is treated as a subfield of ML, while SL and UL denote learning paradigms used across both classical and deep models.

Figure 8. Performance evaluation: (a) Data. (b) Technical performance evaluation.

Table 1. Keywords used in the literature search.

Technology	Topic	Context
DL	Pedestrian activity	Urban
ML
Big data
GPS
Movement
AI	Pedestrian volume; Walking activity	Community; Street
Monitoring	Physical activity level
SVIs	Age
Space syntax	Gender
GIS	Pedestrian attribute
Semantic segmentation	Time
Video	Environmental characteristics
Camera	Vitality
POI
Field observations
Assessment
Trajectory
Tracking

DL: Deep learning. ML: Machine learning. POI: Point of interest.

Table 2. Classification of specific technologies used in street vitality research.

Category	Use Case	Methods	Citation
Built Environment and Vitality	Pedestrian Counting and Activity Recognition	LDCF and Deeplab V3+	[17]
		Image Recognition Algorithm ACF	[10]
		YOLOv4	[13]
		PSPNet	[24,37]
		DLM-SVC and MOT	[4]
	Street and Space Characterization	PSPNet	[25]
		ANN and SVM	[23]
		Deeplab V3+	[38]
Pedestrian Mobility and Urban Dynamics	Pedestrian Attribute Recognition and Behavioral Patterns	PSPNet and Baidu AI	[39]
	Pedestrian Attribute Recognition and Behavioral Patterns	FairMOT	[22]
	Urban Vitality and Visual Perception	CNN	[40]
	Urban Vitality and Visual Perception	PSPNet	[41]
Urban Spatial and Visual Characterization	Urban Morphology and Visual Information Processing	ResNet-34 and LightGBM	[36,42]
		Deeplab V3+	[12]
		SnowNLP	[43]
	Multi-source Data and Urban Spatial Relationships	SegNet Neural Network	[44]
		FCN	[26]
		PSPNet	[43,45]
		HRNet	[46]
		Decision Tree	[47,48]
		XGBoost	[49]
		YOLOv5	[50]
Socialization and Urban Spatial Relations	Social Data and Urban Vitality	Inception V4 and Deeplab V3	[51]
	Social Data and Urban Vitality	PSPNet and MLP	[30]
	Urban Functional and Spatial Assessment	Random Forest	[52,53]
	Urban Functional and Spatial Assessment	FCN and Gradient Boosting Decision Tree	[31,54]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

