Search Results (894)

Search Parameters:
Keywords = pedestrian detection

19 pages, 3607 KB  
Article
A Scalable Geospatial Transformation Workflow for Structuring Mid-Trip Stops and Hotspot Connectivity from Large-Scale Bike-Sharing GPS Trajectories
by Il-Jung Seo
ISPRS Int. J. Geo-Inf. 2026, 15(5), 186; https://doi.org/10.3390/ijgi15050186 - 28 Apr 2026
Viewed by 195
Abstract
High-resolution GPS trajectories pose a geospatial processing challenge: transforming temporally ordered observations into structured spatial representations that retain intra-trip state transitions at metropolitan scale. This study develops and validates a scalable geospatial transformation workflow for detecting and structuring recurrent mid-trip stops from large-scale trajectory data. Using approximately 97 million GPS observations from Seoul’s public bike-sharing system, stopping episodes are identified through speed-based segmentation and density-based spatial clustering (DBSCAN). Recurrent stopping hotspots are attributed with spatial context via a land-use overlay and proximity analysis to pedestrian crossings. Sequential transitions between recurrent hotspots are represented as directed and weighted hotspot-to-hotspot networks, whose structural properties are evaluated using connectivity, clustering, path length, and modularity metrics under degree-preserving randomization. The workflow emphasizes explicit parameterization and modular processing, aligning with reproducible GIS-based spatial analytical frameworks. By converting fine-grained trajectory observations into validated mesoscopic connectivity representations, the framework provides a transferable geospatial processing pipeline for extracting structured connectivity information from high-resolution trajectory datasets.
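The stop-detection step described in the abstract (speed-based segmentation followed by DBSCAN) can be sketched as below. This is an illustrative reimplementation, not the authors' code; the speed threshold, `eps`, and `min_pts` values are hypothetical placeholders.

```python
import math

def stop_points(track, speed_thresh=0.5):
    # track: list of (x, y, speed m/s); keep only low-speed fixes (candidate stops)
    return [(x, y) for x, y, v in track if v < speed_thresh]

def dbscan(points, eps, min_pts):
    # Minimal DBSCAN: returns one label per point (-1 = noise, >=0 = cluster id)
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    cid = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # noise (may later become a border point)
            continue
        labels[i] = cid
        seed = list(nbrs)
        while seed:
            j = seed.pop()
            if labels[j] == -1:
                labels[j] = cid  # border point: absorbed, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cid
            jn = neighbors(j)
            if len(jn) >= min_pts:  # core point: expand the cluster
                seed.extend(jn)
        cid += 1
    return labels
```

Recurrent hotspots would then be clusters (labels >= 0) that recur across many trips; single isolated stops fall out as noise.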

35 pages, 13122 KB  
Article
A Three-Dimensional LiDAR Observability Framework for Pedestrian Representation: Sensor Placement and Multi-View Fusion on a Compact Autonomous Vehicle
by Juan Diego Valladolid, Juan P. Ortiz, Franklin Castillo, José Vuelvas and Chuan Yu
Sensors 2026, 26(9), 2670; https://doi.org/10.3390/s26092670 - 25 Apr 2026
Viewed by 770
Abstract
Reliable pedestrian perception in autonomous driving depends not only on detecting the target, but also on how completely and consistently its three-dimensional geometry is captured from different sensor viewpoints. This study presents a LiDAR-based observability framework for evaluating pedestrian representation on the ANTA compact autonomous vehicle platform using a roof-mounted Top LiDAR (TL), a Front-Right LiDAR (FRL), and their fused configuration. The pedestrian was analyzed in a canonical local frame using geometric extent ratios, projected surface occupancy, voxel-based volumetric occupancy, and statistical descriptors of the local point distribution, integrated into a global observability score, S3D. A Distance-Robustness Index (DRI), an overlap-based complementarity analysis, and a lightweight temporal centroid-sensitivity check over 20 consecutive frames were used to characterize performance across distance. Using ROS 2 bag data processed offline in MATLAB R2025b, the fused configuration achieved the highest mean global score (0.563), compared with 0.504 for FRL and 0.432 for TL, and the highest robustness (DRI=0.5628, CV=10.7%). The results show that 1 m maximizes local density, 2–3 m maximize projected and volumetric completeness, and 7 m provides the best balanced observability. Within the evaluated platform and under the controlled benchmark conditions, complementary multi-LiDAR fusion provided the strongest overall geometry-aware pedestrian representation.
(This article belongs to the Special Issue Sensor Fusion for the Safety of Automated Driving Systems)
28 pages, 33079 KB  
Article
Pedestrian Localization Using Smartphone LiDAR in Indoor Environments
by Kwangjae Sung and Jaehun Kim
Electronics 2026, 15(9), 1810; https://doi.org/10.3390/electronics15091810 - 24 Apr 2026
Viewed by 162
Abstract
Many place recognition approaches, which identify previously visited places or locations by matching current sensory data, such as 2D RGB images and 3D point clouds, have been proposed to achieve accurate and robust localization and loop closure detection in global positioning system (GPS)-denied environments. Since visual place recognition (VPR) methods that rely on images captured by camera sensors are highly sensitive to variations in appearance, including changes in lighting, surface color, and shadows, they can yield poor place recognition accuracy. In contrast, light detection and ranging (LiDAR)-based place recognition (LPR) approaches based on 3D point cloud data, which capture the shape and geometric structure of the environment, are robust to changes in place appearance and can therefore provide more reliable place recognition results than VPR methods. This work presents an indoor LPR method called PointNetVLAD-based indoor pedestrian localization (PIPL). PIPL is a deep network model that uses PointNetVLAD to learn to extract global descriptors from 3D LiDAR point cloud data. PIPL can recognize places previously visited by a pedestrian using point clouds captured by a low-cost LiDAR sensor on a smartphone in small-scale indoor environments, whereas PointNetVLAD performs place recognition for vehicles using high-cost LiDAR, GPS, and inertial measurement unit (IMU) sensors in large-scale outdoor areas. For place recognition on 3D point cloud reference maps generated from LiDAR scans, PointNetVLAD exploits the Universal Transverse Mercator (UTM) coordinate system based on GPS and IMU measurements, whereas PIPL uses a virtual coordinate system designed in this study due to the unavailability of GPS indoors. In experiments conducted in campus buildings, PIPL shows significant advantages over NetVLAD, a convolutional neural network (CNN)-based VPR method. Particularly in indoor environments with repetitive scenes, where geometric structures are preserved and image-based appearance features are sparse or unclear, PIPL achieved 39% higher top-1 accuracy and 10% higher top-3 accuracy than NetVLAD. Furthermore, PIPL achieved place recognition accuracy comparable to NetVLAD even with a small number of points in a 3D point cloud and outperformed NetVLAD even with a smaller model training dataset. The experimental results also indicate that PIPL requires over 76% less place retrieval time than NetVLAD while maintaining robust place classification performance.
(This article belongs to the Special Issue Advanced Indoor Localization Technologies: From Theory to Application)
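Descriptor-based place retrieval of the kind PIPL and NetVLAD perform reduces, at query time, to nearest-neighbor search over global descriptors followed by a top-k accuracy check. A minimal sketch of that evaluation loop (the toy descriptors, place names, and function names are my own, not from the paper):

```python
def top_k_places(query, database, k):
    # database: list of (descriptor, place_id); rank places by squared L2
    # distance between the query descriptor and each stored descriptor
    ranked = sorted(database,
                    key=lambda e: sum((q - d) ** 2 for q, d in zip(query, e[0])))
    return [pid for _, pid in ranked[:k]]

def top_k_accuracy(queries, database, k):
    # fraction of queries whose true place appears among the k nearest matches
    hits = sum(true in top_k_places(desc, database, k) for desc, true in queries)
    return hits / len(queries)
```

Real systems use high-dimensional learned descriptors and approximate nearest-neighbor indexes, but the retrieval metric is the same.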

24 pages, 1594 KB  
Article
RMP-YOLO: Robust Multi-Scale Pedestrian Detection for Dense Scenarios
by Chenyang Gui, Zhangyu Fan, Taibin Duan and Junhao Wen
Sensors 2026, 26(9), 2621; https://doi.org/10.3390/s26092621 - 23 Apr 2026
Viewed by 600
Abstract
With the rapid advancement of autonomous driving in modern society, dense pedestrian detection technology has encountered performance bottlenecks. To address this, we propose a robust and lightweight pedestrian detection algorithm, RMP-YOLO, designed to efficiently detect small, occluded, and low-light objects. Firstly, RFAConv is utilized as the core component of the backbone network, combining standard convolution with attention mechanisms and using group convolution to extract features from the spatial receptive field. Secondly, MobileViTv3 is introduced into the backbone to combine CNNs with Transformers. The model is further enhanced by adjusting feature fusion, introducing residual connections, and optimizing local representation with deep convolutional layers. Finally, the PIoUv2 loss function is employed for bounding-box regression, significantly reducing detection errors for small-scale pedestrians in crowded environments. Experimental results demonstrate that RMP-YOLO improves mAP@0.5 by 1.3% on a custom dataset and 0.91% on the WiderPerson dataset. Crucially, it maintains high efficiency with only 3.71 million parameters and 6.29 GFLOPs, meeting the deployment requirements for low computational power and high precision.
(This article belongs to the Section Sensing and Imaging)
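PIoUv2, like other IoU-family regression losses, builds on the plain intersection-over-union between a predicted and a ground-truth box. The core quantity can be sketched as follows (a generic illustration of IoU, not the PIoUv2 loss itself):

```python
def iou(a, b):
    # Axis-aligned boxes given as (x1, y1, x2, y2) with x1 < x2, y1 < y2
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    def area(r):
        return (r[2] - r[0]) * (r[3] - r[1])

    union = area(a) + area(b) - inter
    return inter / union if union else 0.0
```

A loss of the form `1 - iou(pred, gt)` is the simplest member of this family; PIoU-style variants add penalty terms to sharpen gradients for small, crowded targets.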
18 pages, 1437 KB  
Project Report
From Tradition to Technology: A Framework for Smart Pilgrim Management on the Camino de Santiago
by Adriana Mar, Fernando Monteiro, Pedro Pereira, Jose Carlos García, João F. A. Martins and Daniel Basulto
Multimodal Technol. Interact. 2026, 10(5), 44; https://doi.org/10.3390/mti10050044 - 23 Apr 2026
Viewed by 243
Abstract
The Camino de Santiago, a UNESCO-listed pilgrimage route, has experienced sustained growth in visitor numbers, challenging municipalities to preserve cultural integrity while ensuring service quality. This study reviews people-counting technologies and proposes a smart pilgrim management framework grounded in flux measurement systems to support data-driven and sustainable decision-making. Drawing on the smart tourism literature, the conceptual framework integrates infrared counters, mobile tracking solutions, and GPS/Wi-Fi data to generate real-time insights into pilgrim flows. A pilot simulation illustrates how these data can inform operational and strategic planning. The framework enables local authorities to monitor pedestrian movements, anticipate service demands (sanitation, accommodation, and safety), and detect overcrowding in sensitive heritage areas. By incorporating technological solutions into traditionally low-tech pilgrimage settings, municipalities can transition from reactive to proactive management approaches. The paper contributes a scalable and ethically grounded framework tailored to heritage pilgrimage routes, advancing smart tourism applications in culturally significant contexts.

27 pages, 8631 KB  
Article
From Light Pulses to Selective Enhancement: Performance Analysis of Event-Based Object Detection Under Pulsed Automotive Headlight Illumination
by Leonard Haensel and Torsten Bertram
Sensors 2026, 26(9), 2595; https://doi.org/10.3390/s26092595 - 22 Apr 2026
Viewed by 517
Abstract
Pulse-width-modulated (PWM) automotive headlights enhance nighttime event-based camera detection, yet systematic parameter optimization for vulnerable road user detection remains unexplored. This study evaluates PWM frequency, duty cycle, light distribution, ego-vehicle speed, and ambient lighting under European New Car Assessment Programme-inspired crossing scenarios for cyclist and pedestrian detection. Results establish performance ranging from substantial improvements to severe degradation relative to continuous illumination. Cyclist detection achieves robust performance with high-frequency modulation across light distributions, while low-frequency operation with low beam produces severe degradation through background noise accumulation. Pedestrian detection requires high beam with street lighting enabled; low beam universally fails regardless of modulation parameters. Limited parameter combinations achieve simultaneous improvements for both targets. Detection performs optimally on retroreflective surfaces, while low-reflectivity clothing limits capability, requiring target-specific optimization.
(This article belongs to the Special Issue Event-Driven Vision Sensor Architectures and Application Scenarios)

21 pages, 5042 KB  
Article
Real-Time Traffic Data Analysis on Resource-Constrained Edge Devices
by Dušan Bogićević, Dragan Stojanović, Milan Gnjatović, Ivan Tot and Boriša Jovanović
Electronics 2026, 15(8), 1703; https://doi.org/10.3390/electronics15081703 - 17 Apr 2026
Viewed by 363
Abstract
This paper evaluates the feasibility of real-time traffic data analysis on resource-constrained edge devices using a hybrid processing approach. The proposed architecture integrates an LF Edge eKuiper complex event processing engine, deployed within Docker containers, with a native YOLO deep learning model for pedestrian detection. The model processes video frames at 480 × 240 resolution on CPU-only Raspberry Pi devices, achieving up to 30 FPS. The research specifically investigates the performance limits of Raspberry Pi 3 and Raspberry Pi 4 platforms when simultaneously processing high-throughput simulated traffic data from the SUMO simulator (Belgrade scenario, with vehicle distributions and densities adjusted for small, medium, and large traffic volumes) and live video streams. Experimental results indicate that while both platforms can process up to 2600 messages per second in settings without image processing, the introduction of a camera sensor reveals a significant hardware bottleneck. The Raspberry Pi 4 maintains robust real-time performance with an average complex event detection latency of less than 500 ms. In contrast, the Raspberry Pi 3 exhibits severe performance degradation, with image processing delays exceeding 8 s, rendering it unsuitable for real-time safety alerts. The findings demonstrate that with appropriate hardware selection, edge-based complex event processing can successfully detect critical safety events, such as sudden vehicle acceleration near pedestrians, without relying on cloud infrastructure.

36 pages, 2125 KB  
Article
Hybrid Neural Network-Based PDR with Multi-Layer Heading Correction Across Smartphone Carrying Modes
by Junhua Ye, Anzhe Ye, Ahmed Mansour, Shusu Qiu, Zhenzhen Li and Xuanyu Qu
Sensors 2026, 26(8), 2421; https://doi.org/10.3390/s26082421 - 15 Apr 2026
Viewed by 221
Abstract
Traditional pedestrian dead reckoning (PDR) algorithms usually assume that the smartphone carrying mode is fixed and the device remains horizontal, ignoring the significant impact of dynamic changes in the carrying mode on heading estimation, the core element of PDR. In practical scenarios, pedestrians often change how they carry a smart terminal (e.g., to take a call), and each carrying mode calls for a different heading estimation method; a mode switch in particular causes an abrupt heading change that, if not corrected promptly, significantly increases localization error. Existing carrying-mode recognition methods that rely on traditional machine learning or fixed thresholds have poor robustness, lack universality, are especially weak at diagnosing abrupt changes, and cannot effectively reduce heading error. To address these practical problems, this paper proposes a PDR framework designed to overcome these limitations. First, four common carrying modes are classified based on practical applications, and a CNN-LSTM hybrid model is designed that classifies them in near real time with a recognition accuracy of 99.68%. Second, based on the mode recognition results, a multi-layer heading correction strategy is introduced: (1) a versatile quaternion-based filter (VQF) algorithm is applied to accurately estimate the initial heading; (2) an algorithm is designed to accurately detect the mode switching point, together with an adaptive offset correction algorithm that dynamically compensates the heading during mode switching to reduce the impact of abrupt changes; and (3) exploiting the fact that lateral displacement tends toward zero when a pedestrian walks along a straight segment, a heading optimization method with lateral displacement constraints is designed to further suppress heading drift caused by slight swaying of the smart terminal. Two validation experiments were carried out in different environments, an indoor corridor and a tree-sheltered path. The results show that with the proposed multi-layer heading optimization strategy, the average heading error is below 1.5°, the cumulative positioning error is below 1% of the walking distance, and the root mean square error at checkpoints is below 2 m, significantly reducing positioning error and demonstrating the effectiveness of the framework in complex environments.
(This article belongs to the Section Navigation and Positioning)
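The mode-switch compensation idea in step (2) above can be caricatured as follows: treat any abrupt jump in the heading series as a sensor-frame offset introduced by the new carrying mode, and subtract the accumulated offset from subsequent samples. This is my own simplified sketch, not the paper's algorithm; the jump threshold is a hypothetical value and a real implementation must also distinguish genuine turns from mode switches.

```python
def compensate_mode_switch(headings, jump_thresh=0.8):
    # headings: per-step heading estimates in radians.
    # Any step-to-step change larger than jump_thresh is attributed to a
    # carrying-mode switch (a sensor-frame offset), not to actual turning,
    # and is removed from all subsequent headings.
    out = [headings[0]]
    offset = 0.0
    for prev, cur in zip(headings, headings[1:]):
        d = cur - prev
        if abs(d) > jump_thresh:
            offset += d  # accumulate the offset introduced by the switch
        out.append(cur - offset)
    return out
```

With this correction the heading series stays continuous across a switch, which is exactly the property the localization filter needs.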

38 pages, 9459 KB  
Article
A Multi-Level Street-View Recognition Framework for Quantifying Spatial Interface Characteristics in Historic Commercial Districts
by Yiyuan Yuan, Zhen Yu and Junming Chen
Buildings 2026, 16(8), 1474; https://doi.org/10.3390/buildings16081474 - 8 Apr 2026
Viewed by 412
Abstract
In the context of urban renewal, the spatial interface of historic commercial districts functions as both a carrier of historical character and a key setting for commercial activity, public life, and local cultural expression. To address the limitations of conventional studies that rely heavily on field observation and qualitative description, this study takes Xiaohe Zhijie in Hangzhou as a case and develops a multi-level street-view recognition framework for the quantitative analysis of spatial interface characteristics. Based on street-view image collection and standardized preprocessing, a sample database was established at the sampling-point scale. Semantic segmentation, automated commercial object detection, and manual interpretation were combined to identify interface elements, including buildings, sky, greenery, pavement, vehicles, pedestrians, and commercial objects, while commercial content was assessed in terms of locality and homogenization. The results show that Xiaohe Zhijie exhibits a building-dominated and relatively enclosed interface pattern, with greenery and pavement forming the basic environmental ground, weak vehicle interference, and localized enhancement of vitality through commercial objects and pedestrian activities. Significant differences were found among street segments in openness, commercial coverage, and local expression. Three interface types were identified: commercial–cultural composite, local life-oriented, and waterfront landscape–cultural composite. The main challenge lies not in commercialization itself, but in stronger visual locality than content locality and increasing homogenization, resulting in a pattern of “localized form but homogenized content.”

24 pages, 2660 KB  
Article
SpaA: A Spatial-Aware Network for 3D Object Detection from LiDAR Point Clouds
by Jianfeng Song, Chu Zhang, Cheng Zhang, Li Song, Ruobin Wang and Kun Xie
Remote Sens. 2026, 18(8), 1104; https://doi.org/10.3390/rs18081104 - 8 Apr 2026
Viewed by 363
Abstract
Grid-based 3D object detection methods effectively leverage mature point cloud processing techniques and convolutional neural networks for feature extraction and object localization. However, unlike in 2D object detection, point cloud data are unevenly and sparsely distributed in space, which requires detection networks to possess a degree of spatial structural perception. Learning spatial information such as point cloud density and distribution patterns can significantly benefit 3D detection networks. This paper proposes a Spatial-aware Network for 3D object detection (SpaA). Based on the 3D sparse convolution network, we designed a Variable Sparse Convolution network (VS-Conv) capable of perceiving the importance of locations. To address the issue of set abstraction operations completely ignoring spatial structure during local feature aggregation, we proposed a Spatial-aware Density-based Local Aggregation (SDLA) method. Experiments demonstrate that enhancing the spatial-awareness capability of detection networks is crucial for complex 3D object detection. Detection results on the KITTI dataset validate the effectiveness of our method. On the test set, SpaA achieved 3D AP values of 82.20%, 44.04%, and 70.34% for the Car, Pedestrian, and Cyclist categories, respectively, and a competitive 3D mAP of 67.23%, outperforming several published methods.

17 pages, 33215 KB  
Data Descriptor
ANAID: Autonomous Naturalistic Obstacle-Avoidance Interaction Dataset
by Manuel Garcia-Fernandez, Maria Juarez Molera, Adrian Canadas Gallardo, Nourdine Aliane and Javier Fernandez Andres
Data 2026, 11(4), 77; https://doi.org/10.3390/data11040077 - 8 Apr 2026
Viewed by 399
Abstract
This paper presents ANAID (Autonomous Naturalistic obstacle-Avoidance Interaction Dataset), a new multimodal dataset designed to support research on autonomous driving, particularly with regard to obstacle avoidance and naturalistic driver–vehicle interaction. Data were collected using a Hyundai Tucson Hybrid equipped with a Comma-3X autonomous-driving development kit, combining high-resolution front-facing video with detailed CAN-bus telemetry. The dataset comprises four data collection campaigns, each corresponding to a single continuous driving session, yielding a total of 208 videos and 240,014 synchronized frames. In addition to the video data, the dataset provides vehicle state measurements (speed, acceleration, steering, pedal positions, turn signals, etc.) and an additional annotation layer identifying evasive maneuvers derived from steering-related signals. Data were recorded on an urban circuit at Universidad Europea de Madrid, capturing diverse real-world scenarios such as roundabouts, intersections, pedestrian areas, and segments requiring obstacle avoidance. A multi-stage processing pipeline aligns telemetry and visual data, extracts frames at 20 FPS, and detects evasive maneuvers using threshold-based time-series analysis. ANAID provides a fully aligned and non-destructive representation of naturalistic driving behavior, enabling research on control prediction, driver modeling, anomaly detection, and human–autonomy interaction in realistic traffic conditions.
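Threshold-based evasive-maneuver detection on steering signals, as described, might look like the sketch below: flag sustained runs where the steering rate exceeds a limit. The frame rate, thresholds, and minimum run length are illustrative assumptions, not values from the dataset paper.

```python
def detect_evasive(steering, dt=0.05, rate_thresh=60.0, min_len=3):
    # steering: steering-wheel angle per frame (degrees), sampled every dt s.
    # Flag intervals where |steering rate| exceeds rate_thresh (deg/s) for at
    # least min_len consecutive frames; returns (start, end) frame indices.
    rates = [abs(b - a) / dt for a, b in zip(steering, steering[1:])]
    events, start = [], None
    for i, r in enumerate(rates):
        if r > rate_thresh and start is None:
            start = i  # run of fast steering begins
        elif r <= rate_thresh and start is not None:
            if i - start >= min_len:
                events.append((start, i))
            start = None
    if start is not None and len(rates) - start >= min_len:
        events.append((start, len(rates)))  # run extends to end of signal
    return events
```

In practice such a detector would be combined with speed and lateral-acceleration signals to reject ordinary turns.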

19 pages, 1627 KB  
Article
SST-YOLO: An Improved Autonomous Driving Object Detection Algorithm Based on YOLOv8
by Qinsheng Du, Ningbo Zhang, Wenqing Bi, Ruidi Zhu, Yuhan Liu, Chao Shen, Shiyan Zhang and Jian Zhao
Appl. Sci. 2026, 16(7), 3456; https://doi.org/10.3390/app16073456 - 2 Apr 2026
Viewed by 408
Abstract
As autonomous driving technology progresses, efficient and accurate object detectors are able to detect pedestrians, vehicles, road signs, and obstacles in real time, thereby enhancing driving safety as a component of autonomous driving. However, the performance of existing object detectors remains insufficient for modern autonomous driving systems. To address this issue, we develop an object detection network for autonomous driving scenarios, SST-YOLO, which is based on YOLOv8. First, we propose a Sobel Convolution & Convolution (SCC) module to enhance the backbone, which incorporates a SobelConv branch to explicitly model gradient-based edge information and improve structural feature representation. In addition, we replace the original path aggregation feature pyramid network (PAFPN) with a Small Object Augmentation Pyramid Network (SOAPN), which integrates SPDConv and CSP-OmniKernel modules to strengthen multi-scale feature fusion and enhance small object representation. Finally, a Task-Adaptive Decomposition & Alignment Head (TADAHead) is designed, which employs task decomposition, dynamic deformable convolution, and classification-aware modulation to decouple tasks and achieve adaptive spatial alignment, thereby improving detection accuracy and robustness in complex scenarios. Experiments on the public autonomous driving dataset KITTI show that our proposed method outperforms the baseline YOLOv8 model: mAP@0.5:0.95 improves from 65.1% to 69.2%, indicating that the proposed SST-YOLO network is effective for object detection in autonomous driving.
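The SobelConv branch mentioned above is built on the classic Sobel operators. Applying them outside a network looks like this generic sketch of the underlying edge filter (this illustrates the operators themselves, not the SCC module):

```python
def conv2d(img, kernel):
    # 'Valid' 2-D cross-correlation (as used in CNNs) on nested lists
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(img) - kh + 1, len(img[0]) - kw + 1
    return [[sum(kernel[i][j] * img[y + i][x + j]
                 for i in range(kh) for j in range(kw))
             for x in range(w)] for y in range(h)]

# Sobel operators: horizontal-gradient and vertical-gradient kernels
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
```

A SobelConv-style branch would bake these fixed gradient kernels into a convolution layer so that edge responses are available alongside learned features.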

20 pages, 5717 KB  
Article
An Improved YOLOv10 and DeepSORT Algorithm for Pedestrian Detection and Tracking in Crowd Navigation
by Shihang Hu and Changyong Li
Algorithms 2026, 19(4), 274; https://doi.org/10.3390/a19040274 - 1 Apr 2026
Viewed by 330
Abstract
In indoor crowd navigation, quickly and accurately acquiring the kinematic data of pedestrians within a robot’s field of view is a crucial factor determining success. Existing indoor pedestrian tracking methods have limitations in accuracy and real-time performance. To address these issues, a lightweight pedestrian tracking method based on an improved YOLOv10s and DeepSORT is proposed. In the detection stage, a CPNGhostNetV2 module incorporating Ghost Convolution and attention mechanisms is first designed to replace the original C2f module in YOLOv10s. This achieves a lightweight design while effectively preserving global feature information. Secondly, the GSConv module is introduced to further reduce computational load and model parameters. Finally, the Focal Loss function is introduced to enhance the detection capability of the YOLOv10s model in dense scenes. In the tracking stage, a novel trajectory management mechanism is proposed to reduce the ID-switching problem under occlusion conditions. The experimental results show that the improved YOLOv10s reduces computational complexity by 33.9% and parameters by 17.4% compared to the original model. It also improves mAP@50 by 0.6%. The improved DeepSORT algorithm achieves a 7.0% increase in MOTA, a 1.4% increase in MOTP, and a 24.8% reduction in ID-switch counts compared to the original YOLOv10-DeepSORT. It outperforms traditional algorithms in terms of accuracy, real-time performance, and computational efficiency, demonstrating promising application prospects.
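MOTA, the headline tracking metric reported above, has a standard definition worth keeping in mind when reading the 7.0% improvement: it folds misses, false positives, and identity switches into one score.

```python
def mota(fn, fp, idsw, gt):
    # Multiple Object Tracking Accuracy: 1 minus the fraction of errors
    # (false negatives + false positives + identity switches) over the
    # total number of ground-truth objects, summed across all frames.
    return 1.0 - (fn + fp + idsw) / gt
```

Because identity switches enter the numerator directly, the paper's 24.8% reduction in ID switches contributes to the MOTA gain as well.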

26 pages, 4196 KB  
Article
Real-Time Detection of Near-Miss Events and Risk Assessment in Urban Traffic Using Multi-Object Tracking and Bird’s Eye View Mapping
by Lu Yang and Tao Hong
Future Transp. 2026, 6(2), 80; https://doi.org/10.3390/futuretransp6020080 - 1 Apr 2026
Viewed by 375
Abstract
Near-miss events, defined as hazardous traffic interactions without actual collisions, provide valuable indicators for proactive traffic safety assessment. However, existing studies mainly focus on collision detection or object-level perception, while near-miss interactions and their severity remain insufficiently explored. This study proposes a video-based framework for real-time near-miss detection and risk evaluation in complex urban intersections. The framework integrates an enhanced YOLOv11 detector with a small-object detection head, BoT-SORT multi-object tracking, and bird’s-eye-view (BEV) transformation to accurately extract trajectories and motion features of heterogeneous road users. A Near-Miss Risk Index (RI) is developed by jointly considering spatial proximity, time-to-collision, and motion intensity to quantify near-miss severity levels. Experimental results on real-world CCTV data demonstrate that the proposed method effectively identifies high-risk interactions among vehicles, motorcycles, and pedestrians, providing interpretable severity assessment and supporting proactive traffic safety analysis for intelligent transportation systems.
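Of the three ingredients of the Risk Index, time-to-collision is the most standard. A minimal line-of-sight formulation on BEV coordinates (my own sketch under common definitions, not necessarily the paper's exact formula):

```python
import math

def time_to_collision(p1, v1, p2, v2):
    # p1, p2: BEV positions (x, y) in metres; v1, v2: velocities (m/s).
    # Closing-speed TTC: time until the current gap along the line of sight
    # closes at the present closing rate; None if the pair is not converging.
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]
    dist = math.hypot(rx, ry)
    # Relative-velocity component along the line of sight (positive = closing)
    closing = -((v2[0] - v1[0]) * rx + (v2[1] - v1[1]) * ry) / dist
    return dist / closing if closing > 0 else None
```

A risk index would then weight this TTC against spatial proximity and a motion-intensity term, with small TTC and small distance mapping to high severity.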

29 pages, 2066 KB  
Article
Intelligence Collision Detection Using a Combination of Tuning Base Methods and Convolutional Long Short Term Memory Models
by Mohammed Hilfi and Lubna Alazzawi
Smart Cities 2026, 9(4), 61; https://doi.org/10.3390/smartcities9040061 - 31 Mar 2026
Viewed by 509
Abstract
Effective traffic control using Artificial Intelligence (AI) is essential to ensure safe passage for all road users. AI-based collision detection systems offer advanced mechanisms to prevent accidents and improve highway safety. This research investigates two distinct collision scenarios: vehicle–pedestrian and vehicle–motorcyclist interactions. The proposed method combines bidirectional Long Short-Term Memory (LSTM), Convolutional Neural Network with LSTM (CNN–LSTM), and transformer models, tuned using random or grid search. For the pedestrian–vehicle scenario, the CNN–LSTM model achieved 99.76% accuracy, 99.77% precision, and 99.76% recall, highlighting its strong classification performance. In the vehicle–motorcyclist scenario, the bidirectional LSTM reached 99.73% accuracy with precision and recall of 99.15%, demonstrating its effectiveness in detecting imminent crashes. The CNN–LSTM optimized via random search focuses on reducing the false-positive rate while increasing the true-positive rate, and achieves superior results compared to previous research. These results suggest that the system could be effectively implemented as an early collision warning solution on edge devices.
