Search Results (250)

Search Parameters:
Keywords = urban complex scenes

21 pages, 83627 KB  
Article
Research on Urban Perception of Zhengzhou City Based on Interpretable Machine Learning
by Mengjing Zhang, Chen Pan, Xiaohua Huang, Lujia Zhang and Mengshun Lee
Buildings 2026, 16(2), 314; https://doi.org/10.3390/buildings16020314 - 11 Jan 2026
Abstract
Urban perception research has long focused on global metropolises while overlooking many cities with complex functions and spatial structures, leaving existing theories insufficiently general across diverse urban contexts. This study constructed an analytical framework that integrates street scene images with interpretable machine learning. Taking Zhengzhou City as the research object, it extracted street visual elements with deep learning and systematically analyzed the formation mechanism of multi-dimensional urban perception by combining a LightGBM model with the SHAP method. The main findings are as follows: (1) Urban perception in Zhengzhou shows a significant east–west difference with Zhongzhou Avenue as the boundary. Positive perceptions such as safety and vitality are concentrated in the central business district and historical districts, while negative perceptions are more common in urban fringe areas with chaotic built environments and single functions. (2) The visibility of greenery, the openness of the sky, and the continuity of the building interface are identified as key visual elements affecting perception, and both the direction and intensity of their effects differ significantly across perception dimensions. (3) Visual elements influence perception through complex mechanisms; for instance, the positive effect of greenery visibility on beauty perception levels off after reaching a certain threshold. These results provide a quantitative basis and strategic reference for improving urban space quality and designing human-centered streets.

26 pages, 5848 KB  
Article
HR-Mamba: Building Footprint Segmentation with Geometry-Driven Boundary Regularization
by Buyu Su, Defei Yin, Piyuan Yi, Wenhuan Wu, Junjian Liu, Fan Yang, Haowei Mu and Jingyi Xiong
Sensors 2026, 26(2), 352; https://doi.org/10.3390/s26020352 - 6 Jan 2026
Abstract
Building extraction underpins land-use assessment, urban planning, and disaster mitigation, yet dense urban scenes still cause missed small objects, target adhesion, and ragged contours. We present High-Resolution-Mamba (HR-Mamba), a high-resolution semantic segmentation network that augments a High-Resolution Network (HRNet) parallel backbone with edge-aware and sequence-state modeling. A Canny-enhanced, median-filtered stem stabilizes boundaries under noise; Involution-based residual blocks capture position-specific local geometry; and a Mamba-based State Space Model (Mamba-SSM) global branch captures cross-scale long-range dependencies with linear complexity. Training uses a composite loss of binary cross entropy (BCE), Dice loss, and Boundary loss, with weights selected by joint grid search. We further design a feature-driven adaptive post-processing pipeline that includes geometric feature analysis, multi-strategy simplification, multi-directional regularization, and topological consistency verification to produce regular, smooth, engineering-ready building outlines. On dense urban imagery, HR-Mamba improves the F1-score from 80.95% to 83.93%, an absolute gain of 2.98 percentage points over HRNet. We conclude that HR-Mamba jointly enhances detail fidelity and global consistency and offers a generalizable route for high-resolution building extraction in remote sensing.

25 pages, 3879 KB  
Article
Robust Occluded Object Detection in Multimodal Autonomous Driving: A Fusion-Aware Learning Framework
by Zhengqing Li and Baljit Singh
Electronics 2026, 15(1), 245; https://doi.org/10.3390/electronics15010245 - 5 Jan 2026
Abstract
Reliable occluded object detection remains a persistent core challenge for autonomous driving perception systems, particularly in complex urban scenarios where targets are frequently partially or fully obscured by static obstacles or dynamic agents. Conventional single-modality detectors often fail to capture adequate discriminative cues for robust recognition, while existing multimodal fusion strategies typically lack explicit occlusion modeling and effective feature completion mechanisms, ultimately degrading performance in safety-critical operating conditions. To address these limitations, we propose a novel Fusion-Aware Occlusion Detection (FAOD) framework that integrates explicit visibility reasoning with implicit cross-modal feature reconstruction. Specifically, FAOD leverages synchronized red–green–blue (RGB), light detection and ranging (LiDAR), and optional radar/infrared inputs, employs a visibility-aware attention mechanism to infer target occlusion states, and embeds a cross-modality completion module to reconstruct missing object features via complementary non-occluded modal information; it further incorporates an occlusion-aware data augmentation and annotation strategy to enhance model generalization across diverse occlusion patterns. Extensive evaluations on four benchmark datasets demonstrate that FAOD achieves state-of-the-art performance, including a +8.75% occlusion-level mean average precision (OL-mAP) improvement over existing methods on heavily occluded objects (occlusion level O = 2) in the nuScenes dataset, while maintaining real-time efficiency. These findings confirm FAOD’s potential to advance reliable multimodal perception for next-generation autonomous driving systems in safety-critical environments.

21 pages, 21514 KB  
Article
Robust Geometry–Hue Point Cloud Registration via Hybrid Adaptive Residual Optimization
by Yangmin Xie, Jinghan Zhang, Rijian Xu and Hang Shi
ISPRS Int. J. Geo-Inf. 2026, 15(1), 22; https://doi.org/10.3390/ijgi15010022 - 4 Jan 2026
Abstract
Accurate point cloud registration is a fundamental prerequisite for reality-based 3D reconstruction and large-scale spatial modeling. Despite significant international progress, reliable registration in architectural and urban scenes remains challenging due to geometric intricacies arising from repetitive and strongly symmetric structures and photometric variability caused by illumination inconsistencies. Conventional ICP-based and color-augmented methods often suffer from local convergence and color drift, limiting their robustness in large-scale real-world applications. To address these challenges, we propose Hybrid Adaptive Residual Optimization (HARO), a unified framework that tightly integrates geometric cues with hue-robust color features. Specifically, RGB data are transformed into a decoupled HSV representation with histogram-matched hue correction applied in overlapping regions, enabling illumination-invariant color modeling. Furthermore, a novel adaptive residual kernel dynamically balances geometric and chromatic constraints, ensuring stable convergence even in structurally complex or partially overlapping scenes. Extensive experiments conducted on diverse real-world datasets, including Subway, Railway, Urban, and Office environments, demonstrate that HARO consistently achieves sub-degree rotational accuracy (0.11°) and negligible translation errors relative to the scene scale. These results indicate that HARO provides an effective and generalizable solution for large-scale point cloud registration, successfully bridging geometric complexity and photometric variability in reality-based reconstruction tasks.

26 pages, 2483 KB  
Article
Intelligent UAV Navigation in Smart Cities Using Phase-Field Deep Neural Networks: A Comprehensive Simulation Study
by Lamees Aljaburi and Rahib H. Abiyev
Vehicles 2026, 8(1), 6; https://doi.org/10.3390/vehicles8010006 - 2 Jan 2026
Abstract
This paper proposes the integration of the phase-field method (PFM) with deep neural networks (DNNs) for UAV navigation in smart city environments. Using the proposed approach, simulations of an intelligent navigation and obstacle avoidance framework for drones in complex urban environments are presented. Within the unified PFM-DNN model, phase-field modeling provides a continuous spatial representation, allowing for highly accurate characterization of boundaries between free space and obstacles. In parallel, the deep neural network component offers semantic perception and intelligent classification of environmental features. The proposed model was tested using the 3DCity dataset, which comprises 50,000 urban scenes under diverse environmental conditions, including fog, low light, and motion blur. The results demonstrated that the proposed system achieves high performance in classification and segmentation tasks, outperforming modern models such as DeepLabV3+, Mask R-CNN, and HRNet, while exhibiting high robustness to sensor noise and partial obstructions. The framework was evaluated entirely within a simulated environment; no real-world flight tests were performed. Owing to its ability to adapt and respond in dynamic environments, the framework offers a promising solution for intelligent drone navigation in future cities.
(This article belongs to the Special Issue Air Vehicle Operations: Opportunities, Challenges and Future Trends)

20 pages, 28888 KB  
Article
GIMMNet: Geometry-Aware Interactive Multi-Modal Network for Semantic Segmentation of High-Resolution Remote Sensing Imagery
by Qian Weng, Xiansheng Huang, Yifeng Lin, Yu Zhang, Zhaocheng Li, Cairen Jian and Jiawen Lin
Remote Sens. 2026, 18(1), 124; https://doi.org/10.3390/rs18010124 - 29 Dec 2025
Abstract
Remote sensing semantic segmentation holds significant application value in urban planning, environmental monitoring, and related fields. In recent years, multimodal approaches that fuse optical imagery with normalized Digital Surface Models (nDSM) have attracted widespread attention due to their superior performance. However, existing methods typically treat nDSM merely as an additional input channel, failing to effectively exploit its inherent 3D geometric priors, which limits segmentation accuracy in complex urban scenes. To address this issue, we propose a Geometry-aware Interactive Multi-Modal Network (GIMMNet), which explicitly models the geometric structure embedded in nDSM to guide the spatial distribution of semantic categories. Specifically, we first design a Geometric Position Prior Module (GPPM) to construct 3D coordinates for each pixel based on nDSM and extract intrinsic geometric priors. Next, a Geometry-Guided Disentangled Fusion Module (GDFM) dynamically adjusts fusion weights according to the differential responses of each modality to the geometric priors, enabling adaptive multimodal feature integration. Finally, during decoding, a Geometry-Attentive Context Module (GACM) explicitly captures the dependencies between land-cover categories and geometric structures, enhancing the model’s spatial awareness and semantic recovery capability. Experimental results on two public remote sensing datasets—Vaihingen and Potsdam—show that the proposed GIMMNet outperforms existing mainstream methods in segmentation performance, demonstrating that enhancing the model’s geometric perception capability effectively improves semantic segmentation accuracy. Notably, our method achieves an mIoU of 85.2% on the Potsdam dataset, surpassing the second-best multimodal approach, PACSCNet, by 2.3%.

17 pages, 11372 KB  
Article
Integrating CNN-Mamba and Frequency-Domain Information for Urban Scene Classification from High-Resolution Remote Sensing Images
by Shirong Zou, Gang Yang, Yixuan Wang, Kunyu Wang and Shouhang Du
Appl. Sci. 2026, 16(1), 251; https://doi.org/10.3390/app16010251 - 26 Dec 2025
Abstract
Urban scene classification in high-resolution remote sensing images is critical for applications such as power facility site selection and grid security monitoring. However, the complexity and variability of ground objects present significant challenges to accurate classification. While convolutional neural networks (CNNs) excel at extracting local features, they often struggle to model long-range dependencies. Transformers can capture global context but incur high computational costs. To address these limitations, this paper proposes a Global–Local Information Fusion Network (GLIFNet), which integrates VMamba for efficient global modeling with a CNN for local detail extraction, enabling more effective fusion of fine-grained and semantic information. Furthermore, a Haar Wavelet Transform Attention Mechanism (HWTAM) is designed to explicitly exploit frequency-domain characteristics, facilitating refined fusion of multi-scale features. Experiments compared GLIFNet against nine widely used or state-of-the-art methods. The results show that GLIFNet achieves mean F1 scores (mF1) of 90.08% and 87.44% on the ISPRS Potsdam and ISPRS Vaihingen datasets, respectively, improvements of 1.26% and 1.91% over the best comparison method. The overall accuracy (OA) reaches 90.43% and 92.87%, with respective gains of 2.28% and 1.58%. On the LandCover.ai dataset, GLIFNet achieved an mF1 score of 88.39% and an accuracy of 92.23%, relative improvements of 0.3% and 0.28% over the best comparison method. In summary, GLIFNet demonstrates advanced performance in urban scene classification from high-resolution remote sensing images and can provide accurate base data for power infrastructure construction.
(This article belongs to the Special Issue Advances in Big Data Analysis in Smart Cities)

24 pages, 8240 KB  
Article
Multi-Constraint and Shortest Path Optimization Method for Individual Urban Street Tree Segmentation from Point Clouds
by Shengbo Yu, Dajun Li, Xiaowei Xie, Zhenyang Hui, Xiaolong Cheng, Faming Huang, Hua Liu and Liping Tu
Forests 2026, 17(1), 27; https://doi.org/10.3390/f17010027 - 25 Dec 2025
Abstract
Street trees are vital components of urban ecosystems, contributing to air purification, microclimate regulation, and visual landscape enhancement. Thus, accurate segmentation of individual trees from point clouds is an essential task for effective urban green space management. However, existing methods often struggle with noise, crown overlap, and the complexity of street environments. To address these challenges, this paper introduces a multi-constraint and shortest-path optimization method for individual urban street tree segmentation from point clouds. Object primitives are first generated under multiple constraints using graph-based segmentation. Subsequently, trunk points are identified and associated with their corresponding crowns through structural cues. To improve robustness under dense and cluttered conditions, shortest-path optimization and stem-axis distance analysis are applied to refine the individual tree extraction results. To evaluate the performance of the proposed method, the WHU-STree benchmark dataset is utilized for testing. Experimental results demonstrate that the proposed method achieves an average F1-score of 0.768 and coverage of 0.803, outperforming the superpoint graph structure single-tree classification (SSSC) and Nyström spectral clustering (NSC) methods by 17.4% and 43.0%, respectively. Visual comparison of individual tree segmentation results also indicates that the proposed framework offers a reliable solution for street tree detection in complex urban scenes and holds practical value for advancing smart city ecological management.
(This article belongs to the Special Issue LiDAR Remote Sensing for Forestry)

33 pages, 628 KB  
Review
A Review of Pedestrian Trajectory Prediction Methods Based on Deep Learning Technology
by Xiang Gu, Chao Li, Long Gao and Xuefen Niu
Sensors 2025, 25(23), 7360; https://doi.org/10.3390/s25237360 - 3 Dec 2025
Cited by 1
Abstract
Pedestrian trajectory prediction is a critical component of autonomous driving and intelligent urban systems, with deep learning now dominating the field by overcoming the limitations of traditional models in handling multi-modal behaviors and complex social interactions. This survey provides a systematic review and critical analysis of deep learning-based approaches, offering a structured examination of four key model families: RNNs, GANs, GCNs, and Transformers. Unlike previous reviews, we introduce a comparative analytical framework that evaluates each method’s strengths and limitations across standardized criteria. The review also presents a comprehensive taxonomy of datasets and evaluation metrics, highlighting both established practices and emerging trends. Finally, we derive future research directions directly from our critical assessment, focusing on semantic scene understanding, model transferability, and the precision–efficiency trade-off. Our work provides both a historical perspective on methodological evolution and a forward-looking analysis to guide future research development.
(This article belongs to the Section Physical Sensors)

17 pages, 10859 KB  
Article
TSFNet: A Two-Stage Fusion Network for Visual–Inertial Odometry
by Shuai Wang, Yuntao Liang, Jiongxun Lin, Yuxi Gan, Mengping Zhong, Xia Yin and Bao Peng
Mathematics 2025, 13(23), 3842; https://doi.org/10.3390/math13233842 - 30 Nov 2025
Abstract
In autonomous operations of unmanned aerial vehicles (UAVs), accurate pose estimation is a core prerequisite for achieving autonomous navigation, obstacle avoidance, and task execution. To address the challenge of localization in GNSS-denied environments, Visual–Inertial Odometry (VIO) has emerged as a mainstream solution due to its outstanding performance. However, existing deep learning-based VIO methods exhibit limitations in their multi-modal fusion mechanisms. These methods typically employ simple concatenation or attention mechanisms for feature fusion, and enhancements in accuracy are often accompanied by significant computational overhead. This makes it difficult for models to effectively handle complex, dynamic scenes while remaining lightweight. To this end, this paper proposes TSFNet, an efficient two-stage sequential fusion network. In the first stage, the network employs a lightweight visual backbone and a bidirectional recurrent network in parallel to extract spatial and motion features, respectively. A gated fusion unit achieves adaptive intra-frame feature fusion, dynamically balancing the contributions of the two modalities. In the second stage, the fused features are organized into sequences and fed into a dedicated temporal network to explicitly model inter-frame motion dynamics. This decoupled fusion architecture significantly enhances the model’s representational capacity. Experimental results demonstrate that TSFNet achieves superior performance on both the EuRoC and Zurich Urban MAV datasets. Notably, on the Zurich Urban MAV dataset, it reduces the localization Root Mean Square Error (RMSE) by 62% compared to the baseline model, while simultaneously reducing the number of parameters and computational load by 76.65% and 24.30%, respectively. This research confirms that the decoupled two-stage fusion strategy is an effective approach for realizing high-precision, lightweight VIO systems.

23 pages, 4770 KB  
Article
Multidimensional Street View Representation and Association Analysis for Exploring Human Subjective Perception Differences in East Asian and European Cities
by Shaojun Liu, Shaonan Zhu, Weitao Li, Yongbang Li and Yuting Dai
Land 2025, 14(12), 2343; https://doi.org/10.3390/land14122343 - 28 Nov 2025
Abstract
Urban landscapes exhibit significant regional differences shaped by geography, history, and culture, yet how these variations influence human perception remains underexplored. This study investigates the impact of street scene characteristics on human perceptions in East Asian and European cities by analyzing the large-scale MIT Place Pulse 2.0 dataset. We employ DeepLab v3+ and Mask R-CNN to extract multidimensional physical and visual features and utilize logistic regression to model their association with six subjective perceptions. The findings reveal significant cultural differences: streets in East Asian cities are characterized by higher compactness and brightness, whereas European city streets exhibit greater levels of greening and openness. While perceptions of aesthetics and liveliness show cross-cultural consistency, the mechanisms influencing safety and wealth perceptions diverge significantly; for instance, East Asian cities associate safety with road openness, while European cities favor greater enclosure. The study provides practical insights for creating urban environments that resonate with local cultural identities, enhancing well-being and supporting sustainable urban development.

24 pages, 35078 KB  
Article
AUP-DETR: A Foundational UAV Object Detection Framework for Enabling the Low-Altitude Economy
by Jiajing Xu, Xiaozhang Liu, Xiulai Li and Yuanyan Hu
Drones 2025, 9(12), 822; https://doi.org/10.3390/drones9120822 - 27 Nov 2025
Abstract
The ascent of the low-altitude economy underscores the critical need for autonomous perception in Unmanned Aerial Vehicles (UAVs), particularly within complex environments such as urban ports. However, existing object detection models often perform poorly when dealing with land–sea mixed scenes, extreme scale variations, and dense object distributions from a UAV’s aerial perspective. To address this challenge, we propose AUP-DETR, a novel end-to-end object detection framework for UAVs. This framework, built upon an efficient DETR architecture, features the innovative Fusion with Streamlined Hybrid Core (Fusion-SHC) module. This module effectively fuses low-level spatial details with high-level semantics to strengthen the representation of small aerial objects. Additionally, a Synergistic Spatial Context Fusion (SSCF) module adaptively integrates multi-scale features to generate rich and unified representations for the detection head. Moreover, the proposed Spatial Agent Transformer (SAT) efficiently models global context and long-range dependencies to distinguish heterogeneous objects in complex scenes. To advance related research, we have constructed the Urban Coastal Aerial Detection (UCA-Det) dataset, which is specifically designed for urban port environments. Extensive experiments on our UCA-Det dataset show that AUP-DETR outperforms the YOLO series and other advanced DETR-based models. Our model achieves an mAP50 of 69.68%, representing a 4.41% improvement over the baseline. Furthermore, experiments on the public VisDrone dataset validate its excellent generalization capability and efficiency. This research delivers a robust solution and establishes a new dataset for precise UAV perception in low-altitude economy scenarios.

21 pages, 2248 KB  
Article
V-PTP-IC: End-to-End Joint Modeling of Dynamic Scenes and Social Interactions for Pedestrian Trajectory Prediction from Vehicle-Mounted Cameras
by Siqi Bai, Yuwei Fang and Hongbing Li
Sensors 2025, 25(23), 7151; https://doi.org/10.3390/s25237151 - 23 Nov 2025
Abstract
Pedestrian trajectory prediction from a vehicle-mounted perspective is essential for autonomous driving in complex urban environments yet remains challenging due to ego-motion jitter, frequent occlusions, and scene variability. Existing approaches, largely developed for static surveillance views, struggle to cope with continuously shifting viewpoints. To address these issues, we propose V-PTP-IC, an end-to-end framework that stabilizes motion, models inter-agent interactions, and fuses multi-modal cues for trajectory prediction. The system integrates Simple Online and Realtime Tracking (SORT)-based tracklet augmentation, Scale-Invariant Feature Transform (SIFT)-assisted ego-motion compensation, graph-based interaction reasoning, and multi-head attention fusion, followed by Long Short-Term Memory (LSTM) decoding. Experiments on the JAAD and PIE datasets demonstrate that V-PTP-IC substantially outperforms existing baselines, reducing ADE by 27.23% and 25.73% and FDE by 33.88% and 32.85%, respectively. This advances dynamic scene understanding for safer autonomous systems.
(This article belongs to the Section Vehicular Sensing)

25 pages, 9792 KB  
Article
A Field Study on Sustainable Development-Oriented Comprehensive Thermal–Acoustic–Vibrational Comfort in Zhengzhou’s TOD Underground Spaces, China
by Ruixin Li, Tingshuo Lei, Yujia Huo, Hanxue Li, Yabin Guo, Yong Li and Zhimin Guo
Sustainability 2025, 17(23), 10484; https://doi.org/10.3390/su172310484 - 22 Nov 2025
Abstract
In the process of global urbanization, land resource shortages and traffic congestion have become prominent, and China’s urban rail transit has developed rapidly in recent years. The public transport-oriented Transit-Oriented Development (TOD) model, with “transportation + business + residence” at its core, is now central to the sustainable development of highly urbanized areas. TOD underground spaces face extreme operational pressure and environmental comfort challenges during periods of intense activity, such as the Spring Festival, weekends, and other important holidays in China, because of their strong enclosure, large population flows, high functional density, and the superposition of heavy passenger flow, commercial operation, and rail transit activity. Given the dense crowd flows and complex, coupled physical fields involved, traditional single-physical-field research methods cannot adequately evaluate human comfort in such environments. Guided by the concept of sustainable development of underground space, this study takes a TOD underground space in Zhengzhou City, central China, as its research object and explores how the multi-physical-field environment changes under the superposition of doubled population density and in-space shop operation. A comprehensive comfort evaluation model suited to this scene is established using the Analytic Hierarchy Process–entropy weight method, providing a theoretical basis for the design of TOD underground spaces and for reducing operating energy consumption.
(This article belongs to the Section Green Building)

18 pages, 39629 KB  
Article
DSC-LLM: Driving Scene Context Representation-Based Trajectory Prediction Framework with Risk Factor Reasoning Using LLMs
by Sunghun Kim, Joobin Jin, Seokjun Hong, Dongho Ka, Hakjae Kim and Byeongjoon Noh
Sensors 2025, 25(23), 7112; https://doi.org/10.3390/s25237112 - 21 Nov 2025
Abstract
Autonomous driving in dense urban environments requires accurate trajectory forecasting supported by interpretable contextual evidence. This study presents a multimodal framework that performs driving scene context (DSC)-aware trajectory prediction while providing risk-aware explanations to reveal the contextual cues behind predicted motion. The framework integrates temporal object states—trajectories, velocities, yaw angles, and motion status—with semantic information from forward-facing camera imagery, and is composed of four modules: object behavioral feature extraction, scene context extraction, DSC-augmented trajectory prediction, and risk-aware reasoning using a multimodal large language model (MLLM). Experiments on the Rank2Tell dataset demonstrate the feasibility and applicability of the proposed approach, achieving an ADE of 10.972, an FDE of 13.701, and an RMSE of 8.782. Additional qualitative evaluation shows that DeepSeek-R1-Distill-Qwen-7B generates the most coherent and contextually aligned explanations among the tested models. These findings indicate that combining DSC-aware prediction with interpretable reasoning provides a practical and transparent solution for autonomous driving in complex urban environments.
(This article belongs to the Special Issue AI and Smart Sensors for Intelligent Transportation Systems)
