MDPI - Publisher of Open Access Journals

26 pages, 4983 KB

Open Access Article

Closed-Set vs. Open-Vocabulary Object Detectors for Urban Architectural Typology Classification: A Comparative Study on Athenian Heritage Buildings

by Konstantinos Filippatos, Konstantina Siountri and Christos-Nikolaos Anagnostopoulos

Heritage 2026, 9(5), 206; https://doi.org/10.3390/heritage9050206 - 21 May 2026

Abstract

Architectural typology classification plays an important role in large-scale documentation and analysis of urban cultural heritage. Recent advances in computer vision enable automated approaches for detecting and categorizing buildings from street-level imagery, yet the suitability of different detection paradigms for architectural typology analysis remains insufficiently explored. In particular, no systematic comparative study has evaluated closed-set CNN-based detectors against open-vocabulary vision–language grounding models for urban architectural typology classification. This study presents a comparative evaluation of closed-set convolutional object detectors and open-vocabulary vision–language grounding models for the classification of Athenian architectural typologies. A dataset of 3349 street-view images containing 11,111 annotated building instances was compiled and organized into five typological categories: Neoclassical, Neoclassical-Eclectic, Interwar-Eclectic, Interwar, and Postwar. The experiments compare several YOLO-based detection configurations with Grounding DINO under zero-shot inference, parameter-efficient adaptation (e.g., Low-Rank Adaptation, LoRA), and full fine-tuning. Results show that supervised YOLO-based models achieve robust detection and classification performance with high localization accuracy and consistent typology discrimination in dense urban scenes. In contrast, open-vocabulary grounding models demonstrate limited reliability in zero-shot settings and require substantial adaptation to approach comparable performance levels. Analysis of confusion patterns further reveals that most classification errors originate from intrinsic architectural similarities between transitional styles rather than from model instability. The findings highlight the advantages of supervised object detection frameworks for scalable urban heritage documentation and provide insights into the current limitations of vision–language models for fine-grained architectural typology classification.

(This article belongs to the Section Architectural Heritage)
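
The confusion-pattern analysis mentioned in the abstract above can be illustrated with a minimal sketch. The five class names come from the abstract; the sample labels and the helper names `confusion_matrix` and `top_confusions` are hypothetical and are not taken from the paper.

```python
from collections import Counter

# Class names are from the abstract; everything else here is illustrative.
CLASSES = [
    "Neoclassical",
    "Neoclassical-Eclectic",
    "Interwar-Eclectic",
    "Interwar",
    "Postwar",
]


def confusion_matrix(true_labels, predicted_labels, classes=CLASSES):
    """Return a nested dict so that matrix[true][predicted] is a count."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have equal length")
    counts = Counter(zip(true_labels, predicted_labels))
    return {t: {p: counts.get((t, p), 0) for p in classes} for t in classes}


def top_confusions(matrix, k=3):
    """List the k most frequent off-diagonal (true, predicted, count) triples."""
    errors = [
        (t, p, n)
        for t, row in matrix.items()
        for p, n in row.items()
        if t != p and n > 0
    ]
    return sorted(errors, key=lambda e: e[2], reverse=True)[:k]


# Invented labels for matched detections, for demonstration only.
y_true = ["Neoclassical", "Neoclassical-Eclectic", "Interwar",
          "Interwar-Eclectic", "Postwar", "Neoclassical-Eclectic"]
y_pred = ["Neoclassical", "Neoclassical", "Interwar",
          "Interwar", "Postwar", "Neoclassical"]

matrix = confusion_matrix(y_true, y_pred)
print(top_confusions(matrix))
```

In this toy data the errors fall between adjacent transitional styles, which is the kind of pattern the abstract attributes to architectural similarity rather than model instability.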

30 pages, 3358 KB

Open Access Article

Streetscape Elements and Perceived Street Vitality for Sustainable Urban Renewal: A Geographically Weighted Machine Learning Analysis in Tianjin, China

by Yuqiao Zhang, Kewei Zhong, Jun Wu, Kunzhuo Wang, Yuning Liu, Qian Ji, Yang Yu and Luan Hou

Sustainability 2026, 18(10), 5165; https://doi.org/10.3390/su18105165 - 20 May 2026

Viewed by 165

Abstract

Perceived street vitality directly reflects residents’ assessments of the attractiveness of the street environment; it is not only an important focus of urban vitality research but also closely related to human-centred sustainable urban development. However, limited data availability and the complexity of urban environments have constrained fine-grained spatial analysis at the city scale. To address this issue, this study quantified perceived street vitality by collecting street-view imagery, extracting streetscape features, and integrating these data with questionnaire survey results. After comparing multiple models, a geographically weighted machine learning model was employed to identify key visual predictors, model-estimated marginal associations, interaction patterns, and spatial heterogeneity related to perceived street vitality. The results show that areas with high perceived street vitality are mainly located along street segments with abundant greenery and open spaces, whereas low-value areas are concentrated in densely built and enclosed environments. Among the various streetscape elements, buildings, vegetation, and sky are most strongly associated with perceived street vitality. A model incorporating these elements accounted for 67.2% of the variance in perceived street vitality. Notably, the strength of these associations varied significantly across different areas. This study provides empirical support for sustainable urban renewal, the optimisation of street-space layouts in high-density urban areas, and the improvement of street environmental quality.

37 pages, 31418 KB

Open Access Article

Data-Driven Urban Color Governance for Digital City Planning: A Machine Learning-Assisted Framework Using Street View Images in Jiading District, Shanghai

by Jie Xu, Zhongnan Ye, Di Wang, Shasha Huang, Yang Liu and Yu Xiang

Buildings 2026, 16(10), 2009; https://doi.org/10.3390/buildings16102009 - 20 May 2026

Viewed by 150

Abstract

Urban color plays a fundamental role in shaping the visual character and cultural identity of cities. Yet in many contexts, current practices remain fragmented, with color analysis often disconnected from planning implementation and governance. To address this issue, this study proposes a framework and method for urban color evaluation and planning that integrates street view imagery (SVI), machine learning algorithms, and a parameter-based decision-support system. Using 430,000 street view images of Jiading District, Shanghai, we developed a computational model to systematically map building color characteristics in terms of hue, saturation, and brightness at both building and neighborhood scales. A multi-dimensional criteria framework encompassing the macro-environment, building characteristics, and micro-context was developed to guide automatic color scheme generation and evaluation for both existing and new buildings. The analysis extracts dominant color features and reveals spatial clustering patterns across Jiading District. The platform evaluates color schemes for new developments and generates color schemes for existing buildings, thereby linking urban color analysis with planning recommendations. This study presents a digital decision-support tool for urban color governance that integrates SVI, semantic segmentation, and rule-based reasoning. It shows how large-scale visual data can be organized and translated into structured references for planning practice, offering a more systematic and measurable support tool for urban color assessment.

(This article belongs to the Special Issue New Challenges in Digital City Planning)
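
To make the hue, saturation, and brightness description above concrete, the following is a minimal sketch using Python's standard `colorsys` module. The helper names, the 30-degree hue bins, and the 0.1 saturation cut-off are assumptions for illustration; the paper's actual color model is not described in the abstract.

```python
import colorsys
from collections import Counter


def rgb_to_hsb(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation 0-1, brightness 0-1)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s, v


def dominant_hue_bin(pixels, bin_width=30, min_saturation=0.1):
    """Return the start angle of the most common hue bin, or None.

    Near-grey pixels are skipped because their hue is unstable; the
    saturation cut-off is an arbitrary illustrative choice.
    """
    bins = Counter()
    for r, g, b in pixels:
        hue, sat, _brightness = rgb_to_hsb(r, g, b)
        if sat < min_saturation:
            continue
        bins[int(hue // bin_width) * bin_width % 360] += 1
    return bins.most_common(1)[0][0] if bins else None


# Invented facade pixels: two warm tones, one blue, one grey.
facade_pixels = [(200, 120, 80), (210, 130, 90), (90, 120, 200), (128, 128, 128)]
print(dominant_hue_bin(facade_pixels))
```

In practice the pixel list would come from the building regions of a segmented street view image rather than a hand-written list.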

24 pages, 35215 KB

Open Access Article

Revealing the Nonlinear Associations and Spatial Heterogeneity of Urban Environmental Indicators in Emotional Perception: A Machine Learning Perspective from Shanghai

by Ziyu Hu, Weizhen Xu, Zekun Lu, Tongyu Sun and Yuxiang Liu

Buildings 2026, 16(10), 1999; https://doi.org/10.3390/buildings16101999 - 19 May 2026

Viewed by 184

Abstract

Streets are major public spaces in high-density cities, and their visual environments are closely related to emotional experience and wellbeing. However, existing studies often examine macro-scale urban form and pedestrian-level streetscape perception separately, while paying limited attention to nonlinear relationships and spatial heterogeneity. This limits the evidence available for fine-grained urban renewal in high-density contexts. Focusing on the area within Shanghai’s Outer Ring, this study develops a large-scale street-view dataset of 512,764 Baidu Street View images. Six perceptual dimensions (safety, lively, beautiful, wealthy, boring, and depressing) are estimated using a perception model trained on Place Pulse 2.0 and integrated into a composite Psychological and Emotional Index (PEI). XGBoost–SHAP is used to examine nonlinear relationships and threshold effects between perceptions and environmental indicators, while multiscale geographically weighted regression (MGWR) is employed to capture spatial nonstationarity and scale-dependent effects. The results show significant spatial heterogeneity and positive spatial autocorrelation across the six perceptual dimensions and the PEI. Compared with traditional morphological indicators, visual features showed stronger explanatory power and clearer threshold effects. Population density acts as a globally stable negative factor, whereas visual entropy and mixture show strong local sensitivity. These findings provide a data-driven basis for identifying context-specific priorities in urban renewal and spatial governance in high-density cities.

(This article belongs to the Special Issue Urban Wellbeing: The Impact of Spatial Parameters—2nd Edition)

35 pages, 11720 KB

Open Access Article

Effects of Street-Level Visual Perception on Different Types of Leisure Activity Intensity in Waterfront Spaces: A Case Study of the Core Section of the Pearl River, Guangzhou

by Yudan Pan, Yang Chen and Jin Cao

Land 2026, 15(5), 849; https://doi.org/10.3390/land15050849 (registering DOI) - 15 May 2026

Viewed by 160

Abstract

As urban waterfront public spaces have increasingly become important settings for residents’ daily leisure activities, there remains a lack of empirical evidence based on objective image data regarding how street-level visual environments influence different types of leisure activities. Existing studies have largely relied on macro-scale built environment indicators and paid limited attention to micro-scale visual perception from the pedestrian perspective. To address this gap, this study focuses on the core waterfront section of the Pearl River in Guangzhou. Behavioral observations were conducted across nine spatial units during different time periods on weekdays and weekends, yielding 54 samples of passive, active, and social activity intensity. Meanwhile, 109 street-view sampling points were established, generating 436 pedestrian-view images. Using Mask2Former with an ADE20K pre-trained model, five visual environmental indicators were extracted: the Green View Index (GVI), Sky View Index (SVI), built environment proportion, road proportion, and visual diversity (Entropy). Spearman correlation and multiple linear regression were applied to examine their effects on activity intensity. The results show that leisure activities are generally more intense in the evening and on weekends, with social activities exhibiting the strongest temporal variation. Active activities remain relatively stable, passive activities show temporal dependence, and social activities display localized high-intensity clustering. Regression results reveal differentiated environmental responses: visual diversity has a stable positive effect on passive activities, active activities show weak associations with visual variables, and social activities are the most sensitive, with GVI, SVI, and built proportion showing significant negative effects, while visual diversity shows a significant positive effect. The social activity model also demonstrates the highest explanatory power (Adj. R² = 0.488). Overall, this study develops a street-view semantic segmentation-based method for quantifying waterfront visual environments, demonstrates the critical role of visual environmental composition in shaping activity patterns, and provides empirical support for the fine-grained and activity-oriented optimization of waterfront public spaces.

(This article belongs to the Special Issue Computational Design and Planning for Socio-Environmental Sustainability of Landscapes and Communities: 2nd Edition)
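
The segmentation-derived indicators named in the abstract above (GVI, SVI, and visual diversity as entropy) can be sketched as pixel shares of a label map. This is a minimal illustration that assumes the indices are class pixel proportions and that visual diversity is the Shannon entropy of those proportions; the class groupings and function names are hypothetical, and the paper's exact definitions may differ.

```python
import math
from collections import Counter


def class_proportions(label_map):
    """Share of pixels per class in a 2D list of class labels."""
    counts = Counter(label for row in label_map for label in row)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("label map is empty")
    return {label: n / total for label, n in counts.items()}


def visual_entropy(proportions):
    """Shannon entropy (natural log) of the class proportions."""
    return -sum(p * math.log(p) for p in proportions.values() if p > 0)


def view_indices(label_map, green=("tree", "grass", "plant"), sky=("sky",)):
    """Green View Index, Sky View Index, and entropy for one image.

    The class groupings are assumptions; ADE20K has many more classes.
    """
    props = class_proportions(label_map)
    return {
        "GVI": sum(props.get(c, 0.0) for c in green),
        "SVI": sum(props.get(c, 0.0) for c in sky),
        "entropy": visual_entropy(props),
    }


# Toy 2 x 4 label map standing in for a segmented pedestrian-view image.
toy_map = [
    ["sky", "sky", "tree", "building"],
    ["road", "road", "tree", "building"],
]
print(view_indices(toy_map))
```

A real pipeline would take the label map from the segmentation model's per-pixel output and average the indices over the images at each sampling point.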

29 pages, 6442 KB

Open Access Article

Semantic Mapping of Urban Mobile Mapping LiDAR Using Panoramic OCR and Geometric Back-Projection

by Luma K. Jasim, Athraa Hashim Mohammed, Hussein Alwan Mahdi and Bashar Alsadik

Geomatics 2026, 6(3), 49; https://doi.org/10.3390/geomatics6030049 - 12 May 2026

Viewed by 174

Abstract

This paper presents a deterministic system that combines textual semantic data from panoramic images with LiDAR point clouds in a mobile mapping setup. Urban scenes often include textual elements, such as signs and business names, that provide key details typically missing from LiDAR-based urban digital twins. The presented method uses deep learning-based OCR to extract text from street panoramas and then categorizes it into urban types using a rule-based classifier. Text regions are geometrically projected into the LiDAR environment by converting image coordinates into viewing rays that intersect LiDAR surfaces, such as facades. Data from multiple panoramas are merged with confidence-weighted spatial clustering to produce consistent semantic markers for urban features. Extracted business names enable text-based searches of the LiDAR point cloud, allowing facility location by category, keyword, or brand. Tests on datasets from European and U.S. cities support plausible facade-level localization and demonstrate the framework’s ability to enhance LiDAR point clouds with searchable semantic information. The main contribution is not a new standalone OCR or LiDAR-processing algorithm, but a deterministic multimodal integration framework that combines deep-learning OCR, geometric back-projection, and cross-view spatial fusion to convert street-level textual cues into reliable, queryable 3D semantic markers within mobile-mapping LiDAR data.

(This article belongs to the Topic Democratizing 3D Mapping via Non-Conventional and Low-Cost LiDAR and Imaging Sensors)
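
The back-projection step described above, in which image coordinates become viewing rays that intersect facade surfaces, can be sketched as follows. The sketch assumes an equirectangular panorama, a camera frame with x right, y forward, and z up, and a facade approximated by a single plane; none of these conventions are stated in the abstract, and the function names are hypothetical.

```python
import math


def pixel_to_ray(u, v, width, height):
    """Unit viewing direction for pixel (u, v) of an equirectangular panorama.

    Assumed convention: the image centre looks along +y, azimuth grows to
    the right, and elevation grows upward.
    """
    azimuth = (u / width) * 2.0 * math.pi - math.pi
    elevation = math.pi / 2.0 - (v / height) * math.pi
    return (
        math.cos(elevation) * math.sin(azimuth),
        math.cos(elevation) * math.cos(azimuth),
        math.sin(elevation),
    )


def intersect_plane(origin, direction, plane_point, plane_normal, eps=1e-9):
    """Point where the ray hits the plane, or None if it misses.

    Returns None when the ray is parallel to the plane or the plane lies
    behind the camera.
    """
    denom = sum(d * n for d, n in zip(direction, plane_normal))
    if abs(denom) < eps:
        return None
    offset = sum((p - o) * n for p, o, n in zip(plane_point, origin, plane_normal))
    t = offset / denom
    if t <= 0:
        return None
    return tuple(o + t * d for o, d in zip(origin, direction))


# Camera at the origin; a facade plane 10 m ahead, facing the camera.
ray = pixel_to_ray(1000, 500, 2000, 1000)  # centre pixel of a 2000 x 1000 panorama
hit = intersect_plane((0.0, 0.0, 0.0), ray, (0.0, 10.0, 0.0), (0.0, -1.0, 0.0))
print(hit)
```

With real data the plane would be replaced by a fitted facade surface or a nearest-point search in the point cloud, and the ray would be rotated into world coordinates using the panorama's recorded pose.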

25 pages, 22830 KB

Open Access Article

Planning Shaded Corridors to Mitigate Heat: Assessment of Solar Radiation Exposure of Cyclists and Its Relationship with Built Environment in Shanghai

by Jiao Chen, Yu Zou and Xingchuan Shu

Land 2026, 15(5), 739; https://doi.org/10.3390/land15050739 - 27 Apr 2026

Viewed by 362

Abstract

In the context of escalating global warming and urban heat island effects, recurrent extreme heat events will increase the exposure risk of cyclists, with detrimental effects on both health and the sustainability of active mobility. Nevertheless, this risk has not been given sufficient attention. To quantify the levels of solar radiation exposure experienced by cyclists in high-temperature conditions and the impact of the built environment on these levels, this study takes central Shanghai as a case study. The integration of Mobike trajectories, street view imagery, and solar radiation datasets enabled the quantification of trip-level cumulative radiation exposure and per-minute exposure levels. Subsequently, the XGBoost–SHAP interpretability framework was employed to decipher the mechanisms of the built environment. The following key findings have been identified: (1) Spatiotemporally, the radiation exposure level of cyclists exhibited an inverted U-shaped pattern, peaking at midday (10:00–15:00), with per-minute values of 862–943 W/m². This intensity significantly exceeded that observed during the morning (407 W/m²) and evening (253 W/m²). (2) Geometric factors dominated the radiation exposure level. The shading index demonstrated a critical influence (57% contribution), with exposure reduction intensifying beyond 0.41 yet exhibiting diminishing marginal effects after 0.6. The sky view factor and building height elevated exposure risk by amplifying direct solar radiation. (3) Socioeconomic factors had divergent effects on the radiation exposure level of cyclists: commercial/business densities reduced exposure through continuous building shade, whereas transportation facility density increased exposure due to low-shaded layouts. Consequently, this study proposes “shaded corridors” as a core mitigation strategy, establishing a tripartite intervention framework (spatial-facility-governance) for radiation exposure reduction. The present study provides scientific foundations for the targeted enhancement of heat resilience in active mobility.

(This article belongs to the Special Issue Comprehensive Transportation and Territorial Space Coordinated Planning)

27 pages, 18721 KB

Open Access Article

Explainable Vision Analytics for Adaptive Campus Design: Diagnosing Multi-Dimensional Perceptual Differences

by Yan Lin, Wangchenxiao Liu and Xi Sun

Buildings 2026, 16(8), 1623; https://doi.org/10.3390/buildings16081623 - 20 Apr 2026

Viewed by 311

Abstract

Campus streetscapes are a key part of universities’ everyday public realm, yet the same scene may be perceived positively in one dimension while negatively in another. To diagnose such multi-dimensional perceptual differences and translate them into actionable design evidence, this study develops an interpretable vision analytics framework for adaptive campus design. Using 72,733 Baidu Street View images collected from 41 campuses in mainland China, the study integrates ResNet-50-based perception prediction, spatial element extraction, XGBoost–SHAP-based mechanism interpretation, Kruskal–Wallis H testing, and GIS-based scene mapping. Supported by supplementary in situ validation, six types of multi-dimensional perceptual differences were identified. Sky, buildings, vegetation, hardscape, and terrain were found to be the five most important spatial elements overall, among which sky, buildings, and vegetation repeatedly emerged as the dominant core elements distinguishing different perceptual types. These elements do not act independently or linearly, but jointly shape different types of multi-dimensional perceptual differences through nonlinear threshold effects and interactions. These perceptual difference types were further found to cluster in recognizable campus scenes, including main roads, plazas, lawns, forest belts, and lakeside spaces. Based on these findings, scene-specific piecemeal optimization strategies were derived to support the coordinated enhancement of perceived safety, liveliness, and beauty. Overall, the study shows that campus perception is shaped by holistic spatial configurations rather than the simple accumulation of isolated elements, and provides a quantitative basis for iterative, feedback-oriented adaptive campus design.

(This article belongs to the Section Architectural Design, Urban Science, and Real Estate)

28 pages, 6613 KB

Open Access Article

Same Streets, Different Contexts: Personality-Based Differences in Cycling Willingness Revealed from Objective and Subjective Perspectives

by Chenfeng Xu, Yihan Li, Zibo Zhu, Zhengyang Zou, Xing Geng and Yike Hu

ISPRS Int. J. Geo-Inf. 2026, 15(4), 179; https://doi.org/10.3390/ijgi15040179 - 16 Apr 2026

Viewed by 726

Abstract

Against the backdrop of rising psychological stress and declining physical fitness in cities, how streetscape characteristics and Myers–Briggs Type Indicator (MBTI) personality traits jointly influence cycling willingness across different contexts remains underexplored. Using Shenzhen, China, as a case study, we integrated objective bicycle-sharing travel records from 2021 and subjective pairwise ratings of 1000 street-view images from 960 participants. Cycling willingness was extrapolated through the TrueSkill algorithm and a ResNet50-based model, while street view elements were extracted via DeepLabV3+ and summarized into five indicators. Multivariate regression and multifactor ANOVA were used to test main and moderating effects across six cycling contexts. Results show that (1) objective cycling indicators and subjective willingness exhibit a pattern of lower values in the center and higher values in the periphery; (2) the Spatial Green Index, Sky Openness Index, Path Freedom Index, and Facility Accessibility Index are the main influencing factors, while the Interface Enclosure Index has the weakest and most context-dependent effect; and (3) Intuition/Feeling traits are more salient in leisure and exploration, Judging/Thinking in fitness and transport, and Extraversion/Feeling in social and companion contexts. These findings provide evidence for optimizing urban street cycling spaces in a multi-context and personality-informed manner.

(This article belongs to the Special Issue Innovative Mobility Services for Smart Cities)
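
The abstract above converts pairwise image ratings into scores with the TrueSkill algorithm. The following is a minimal sketch of the standard two-player, no-draw TrueSkill update, written without the dynamics (tau) term for brevity. The default values (mu = 25, sigma = 25/3, beta = 25/6) are the conventional TrueSkill defaults, not values reported by the paper, and the image identifiers are invented.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def trueskill_update(winner, loser, beta=25.0 / 6.0):
    """Update two (mu, sigma) ratings after `winner` is preferred to `loser`.

    Simplified: no draws and no dynamics term. For extreme mismatches the
    normal CDF can underflow; that case is not handled here.
    """
    mu_w, sigma_w = winner
    mu_l, sigma_l = loser
    c = math.sqrt(2.0 * beta**2 + sigma_w**2 + sigma_l**2)
    t = (mu_w - mu_l) / c
    v = _STD_NORMAL.pdf(t) / _STD_NORMAL.cdf(t)  # mean correction factor
    w = v * (v + t)                              # variance correction factor
    new_winner = (
        mu_w + (sigma_w**2 / c) * v,
        sigma_w * math.sqrt(1.0 - (sigma_w**2 / c**2) * w),
    )
    new_loser = (
        mu_l - (sigma_l**2 / c) * v,
        sigma_l * math.sqrt(1.0 - (sigma_l**2 / c**2) * w),
    )
    return new_winner, new_loser


# Hypothetical image IDs; each comparison is (preferred image, other image).
ratings = {img: (25.0, 25.0 / 3.0) for img in ("img_a", "img_b", "img_c")}
comparisons = [("img_a", "img_b"), ("img_a", "img_c"), ("img_b", "img_c")]
for preferred, other in comparisons:
    ratings[preferred], ratings[other] = trueskill_update(ratings[preferred], ratings[other])

for img, (mu, sigma) in sorted(ratings.items(), key=lambda kv: -kv[1][0]):
    print(img, round(mu, 2), round(sigma, 2))
```

The resulting mean scores for the rated images could then serve as regression targets for an image model, which is one plausible reading of how the ratings were extrapolated to unrated street views.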

38 pages, 9459 KB

Open Access Article

A Multi-Level Street-View Recognition Framework for Quantifying Spatial Interface Characteristics in Historic Commercial Districts

by Yiyuan Yuan, Zhen Yu and Junming Chen

Buildings 2026, 16(8), 1474; https://doi.org/10.3390/buildings16081474 - 8 Apr 2026

Viewed by 470

Abstract

In the context of urban renewal, the spatial interface of historic commercial districts functions as both a carrier of historical character and a key setting for commercial activity, public life, and local cultural expression. To address the limitations of conventional studies that rely heavily on field observation and qualitative description, this study takes Xiaohe Zhijie in Hangzhou as a case and develops a multi-level street-view recognition framework for the quantitative analysis of spatial interface characteristics. Based on street-view image collection and standardized preprocessing, a sample database was established at the sampling-point scale. Semantic segmentation, automated commercial object detection, and manual interpretation were combined to identify interface elements, including buildings, sky, greenery, pavement, vehicles, pedestrians, and commercial objects, while commercial content was assessed in terms of locality and homogenization. The results show that Xiaohe Zhijie exhibits a building-dominated and relatively enclosed interface pattern, with greenery and pavement forming the basic environmental ground, weak vehicle interference, and localized enhancement of vitality through commercial objects and pedestrian activities. Significant differences were found among street segments in openness, commercial coverage, and local expression. Three interface types were identified: commercial–cultural composite, local life-oriented, and waterfront landscape–cultural composite. The main challenge lies not in commercialization itself, but in stronger visual locality than content locality and increasing homogenization, resulting in a pattern of “localized form but homogenized content.”

(This article belongs to the Special Issue Digital Twins for Information Management in Digitalization, Sustainability, and Resilience: Bridging Heritage and the Modern Built Environment)

19 pages, 8010 KB

Open Access Article

Multi-Model Fusion for Street Visual Quality Evaluation

by Qianhan Wang and Yuechen Li

ISPRS Int. J. Geo-Inf. 2026, 15(4), 158; https://doi.org/10.3390/ijgi15040158 - 6 Apr 2026

Viewed by 550

Abstract

With accelerating global urbanization and increasingly diverse demands for public spaces, promoting urban low-carbon transitions and enhancing residents’ quality of life have become central missions of modern urban development. As primary arteries of the city, streets play an indispensable role in reducing carbon emissions, promoting healthy living, and improving residents’ well-being through their green landscapes, slow-moving transportation systems, and public facilities. In this study, the Yubei District of Chongqing was selected as the research area, and an automated framework for evaluating street visual quality was proposed, based on multi-source street view data and ensemble learning. The PSP-Net semantic segmentation model was employed to extract eight key visual indicators from street view images: green view index, Visual Entropy (Entropy), sky view factor (SVF), drivable space, sidewalk, safety facilities, buildings, and enclosure. Based on these features, a Stacking-based ensemble learning model was constructed, integrating multiple base models such as Random Forest, XGBoost, and LightGBM, with Linear Regression as the meta-learner, to predict street visual quality. The results demonstrate that the ensemble model significantly outperforms any single model, achieving a correlation coefficient (r) of 0.77 and effectively capturing the complex perceptual features of street environments. This study provides a reliable, intelligent, and quantitative method for large-scale evaluation of urban street visual quality, while supplying data support and decision-making references for street renewal and spatial optimization.

(This article belongs to the Special Issue Advances in AI-Driven Geospatial Analysis and Data Generation (2nd Edition))

15 pages, 2004 KB

Open Access Article

Commercial Gentrification in a Tourist Town in Mallorca

by Joan Rossello-Geli

Urban Sci. 2026, 10(4), 194; https://doi.org/10.3390/urbansci10040194 - 2 Apr 2026

Viewed by 673

Abstract

Sóller, a highly touristic town in Mallorca, has been affected by gentrification problems related to the tourism industry. Recently, another gentrification process has appeared, affecting the retail fabric and leading to the disappearance of traditional locally owned shops and their substitution with tourist-focused stores. Using data from different sources, such as the City Hall documentary data, the Commerce Association archives and Google Street View images, this research highlights the gentrification process affecting two of the main commercial areas of the town. The results confirm that a commercial gentrification process, already identified in large cities such as Barcelona or Venice, can also affect medium-sized towns, creating a retail mutation that impacts local residents and their shopping capabilities.

(This article belongs to the Section Urban Economy and Industry)

23 pages, 10267 KB

Open Access Article

Identification of Leucaena leucocephala in Urban Landscapes Using Street-Level Images and Deep Learning

by Danielle Elis Garcia Furuya, Gleison Marrafon, Eduardo Lopes de Lemos, Michelle Tais Garcia Furuya, Robson Diego Silva Gonçalves, Wesley Nunes Gonçalves, José Marcato Junior, Édson Luis Bolfe, Veraldo Liesenberg, Lucas Prado Osco and Ana Paula Marques Ramos

Urban Sci. 2026, 10(4), 192; https://doi.org/10.3390/urbansci10040192 - 2 Apr 2026

Viewed by 480

Abstract

Mapping urban tree species supports green infrastructure planning. A key issue is the monitoring of exotic species that may become invasive. Street-level imagery provides a complementary perspective to aerial images for identifying species that are difficult to distinguish from above. In this context, our study aimed to evaluate deep learning-based object detection and image segmentation approaches to identify a potentially invasive tree species known as Leucaena leucocephala in an urban environment in Brazil, using 422 street-level images acquired from Google Street View (SV) and mobile phones (MPs). Object detection models (YOLOv8 and DETR) and a foundation segmentation model (SAM, zero-shot) were applied to assess how deep learning paradigms perform under heterogeneous urban imaging conditions. YOLOv8 achieved detection performance with mAP50 above 0.83 and recall up to 0.76. DETR showed domain sensitivity, with mAP50 of 0.45 in SV images and 0.84 in MP imagery. For segmentation, SAM zero-shot achieved 0.92 accuracy and 0.93 F1-score in SV images, decreasing to 0.63 accuracy and 0.66 F1-score in MP images. Overall, this study demonstrates that combining detection and segmentation approaches provides complementary information for urban vegetation monitoring, supporting decision-making related to invasive species management and sustainable urban landscape planning.

(This article belongs to the Special Issue Geotechnology in Urban Landscape Studies)

23 pages, 8969 KB

Open Access Article

Evaluation of Spatial Integration Degree Between Hankou Historical and Cultural Blocks and Surrounding Areas in Wuhan Based on Street View Images

by Hong Xu, Xiaoyu Jiang, Jun Shao, Ziming Li, Wei Pang and Lixiang Zhou

Buildings 2026, 16(6), 1158; https://doi.org/10.3390/buildings16061158 - 15 Mar 2026

Viewed by 410

Abstract

With China’s urban growthism past its peak, urban development has shifted from incremental expansion to inventory quality improvement. Historical and cultural blocks are a core area for urban quality enhancement, so their renovation makes exploring their integration with their surroundings highly significant. Existing studies of historical districts mainly focus on single dimensions such as protection policies, spatial structure analysis, and quality evaluation, and lack a systematic, quantitative evaluation of the spatial integration degree between historical and cultural blocks and their surrounding areas. To address this gap, this study employs deep learning and machine learning techniques to process street view images from 2721 data points in 2024, investigating the integration of Wuhan Hankou’s historical and cultural districts with their surrounding areas. The spatial integration degree between a historical and cultural district and its surroundings refers to the level of coordinated development in terms of history and culture, spatial ecology, and transportation infrastructure. Specifically, the DeepLab v3+ model processes the blocks’ street view images to generate indicator data (Green Visual Index, Sky Visibility Index (SVI), Road Area Index (RAI), Spatial Enclosure Index (SEI), Color Richness (Wheel) (CRW), Color Richness (Entropy) (CRE), Spatial Accessibility Index, Vehicle Disturbance Index, and Traffic Signs), which are used to quantify the historical culture, spatial ecology, and transportation facilities of historical and cultural blocks and their surrounding areas. The Coupling Coordination Degree model evaluates spatial integration, while the Geodetector model quantitatively analyzes interactions between spatial integration and its driving factors.
The results show that the spatial interaction and dependence between the Hankou Historical and Cultural District and its surrounding areas are relatively high, but spatial coordination is insufficient; the integration remains at a primary stage with structural contradictions. SVI, SEI, and RAI have a significant impact on integration, while Spatial Accessibility Index, Green Visual Index, and CRW have a moderate influence, and CRE, Vehicle Disturbance Index, and Traffic Signs have a relatively weak impact. Among them, SVI exhibits the strongest interactive effect with other indicators and plays a leverage role in improving integration.
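The abstract does not spell out how each indicator is derived from the segmentation output. A common approach for view indices of this kind is to take the share of image pixels assigned to a class, and the sketch below illustrates only that idea. The class ids, function names, and dictionary keys are hypothetical, and the definitions used in the study may differ.

```python
from collections import Counter

# Hypothetical class ids for a semantic segmentation label mask.
VEGETATION, SKY, ROAD = 1, 2, 3


def pixel_fractions(mask):
    """Share of pixels per class label in a 2D mask (list of rows)."""
    counts = Counter(label for row in mask for label in row)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def view_indices(mask):
    """Pixel-share indices for one street view image (assumed definitions)."""
    frac = pixel_fractions(mask)
    return {
        "green_visual_index": frac.get(VEGETATION, 0.0),
        "sky_visibility_index": frac.get(SKY, 0.0),
        "road_area_index": frac.get(ROAD, 0.0),
    }


# Toy 2 x 4 mask: 3 vegetation, 2 sky, 2 road, 1 other pixel.
toy_mask = [
    [1, 1, 2, 3],
    [0, 1, 2, 3],
]
print(view_indices(toy_mask))
```

For the toy mask, vegetation covers 3 of 8 pixels (0.375), and sky and road cover 2 of 8 each (0.25).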

(This article belongs to the Section Architectural Design, Urban Science, and Real Estate)

28 pages, 5635 KB

Open Access Article

Interpretable Multimodal Framework for Human-Centered Street Assessment: Integrating Visual-Language Models for Perceptual Urban Diagnostics

by Kaiqing Yuan, Haotian Lan, Yao Gao and Kun Wang

Land 2026, 15(3), 449; https://doi.org/10.3390/land15030449 - 12 Mar 2026

Viewed by 576

Abstract

While objective street metrics derived from imagery or GIS have become standard in urban analytics, they remain insufficient to capture subjective perceptions essential to inclusive urban design. This study introduces a novel Multimodal Street Evaluation Framework (MSEF) that fuses a vision transformer (VisualGLM-6B) with a large language model (GPT-4), enabling interpretable dual-output assessment of streetscapes. Leveraging over 15,000 annotated street-view images from Harbin, China, we fine-tune the framework using Low-Rank Adaptation (LoRA) and P-Tuning v2 for parameter-efficient adaptation. The model achieves an F1 score of 0.863 on objective features and 89.3% agreement with aggregated resident perceptions, validated across stratified socioeconomic geographies. Beyond classification accuracy, MSEF captures context-dependent contradictions: for instance, informal commerce boosts perceived vibrancy while simultaneously reducing pedestrian comfort. It also identifies nonlinear and semantically contingent patterns, such as the divergent perceptual effects of architectural transparency across residential and commercial zones, revealing the limits of universal spatial heuristics. By generating natural-language rationales grounded in attention mechanisms, the framework bridges sensory data with socio-affective inference, enabling transparent diagnostics aligned with Sustainable Development Goal 11 (SDG 11). This work offers both methodological innovation in urban perception modeling and practical utility for planning systems seeking to reconcile infrastructural precision with lived experience.
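For readers unfamiliar with LoRA, the general idea is to keep a pretrained weight matrix W frozen and train only a low-rank update, so that a layer computes W x + (alpha / r) B A x, with A of shape r x d_in and B of shape d_out x r. The sketch below shows that computation and the resulting parameter count in plain Python. All names and sizes are illustrative assumptions; this is not the training setup described in the abstract.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(w, a, b, x, alpha):
    """Compute W x + (alpha / r) * B (A x) for a frozen W and trainable A, B."""
    r = len(a)  # rank of the update = number of rows in A
    base = matvec(w, x)               # frozen pretrained path
    update = matvec(b, matvec(a, x))  # low-rank trainable path
    scale = alpha / r
    return [p + scale * q for p, q in zip(base, update)]


def lora_trainable_params(d_out, d_in, r):
    """Number of trainable values in A (r x d_in) plus B (d_out x r)."""
    return r * (d_in + d_out)


# Toy layer: d_in = 3, d_out = 2, rank r = 1.
w = [[1, 0, 0], [0, 1, 0]]
a = [[1, 1, 1]]
b_zero = [[0], [0]]    # B is typically initialized to zero
b_trained = [[1], [2]]
x = [1, 2, 3]

print(lora_forward(w, a, b_zero, x, alpha=2))     # identical to the frozen layer
print(lora_forward(w, a, b_trained, x, alpha=2))  # adapted output
print(lora_trainable_params(4096, 4096, 8), "vs", 4096 * 4096)
```

With B at zero the adapted layer reproduces the frozen layer exactly, which is why training can start from the pretrained behavior. For a 4096 x 4096 matrix at rank 8, the update has 65,536 trainable values against 16,777,216 in the full matrix.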

(This article belongs to the Special Issue Big Data-Driven Urban Spatial Perception)

Search Results (386)
