Search Results (310)

Search Parameters:
Keywords = scene flow

11 pages, 1933 KB  
Article
Study on the Mechanism of Urban Road Car-Following Safety Under Adverse Weather Conditions
by Zhipeng Gu, Xing Wang and Yufei Han
Vehicles 2026, 8(3), 56; https://doi.org/10.3390/vehicles8030056 (registering DOI) - 13 Mar 2026
Abstract
Car following is a common and important behavior in vehicle traffic flow, and fluctuations in car-following behavior caused by changes in the weather environment have become one of the main causes of traffic accidents. To address this problem, an urban road driving scene was built on a driving simulation platform, and a driving simulator was used to carry out car-following tests. Operating behavior parameters of the test drivers, such as steering wheel angle, headway, throttle opening, standard deviation of vehicle speed, acceleration, and number of collisions, were collected and analyzed. The results showed significant differences (p < 0.05) in indicators such as steering wheel angle, headway, acceleration, and standard deviation of speed under adverse weather conditions. Adverse weather obstructed the drivers' line of sight, which they compensated for with more frequent fine corrections of the steering wheel angle, degrading the vehicle's lateral stability. Safety analysis showed that the minimum headway occurred in foggy weather and the maximum in snowy weather. In addition, drivers reduced the standard deviation of vehicle speed and acceleration fluctuations to maintain driving safety in adverse weather. Driving experience had a significant impact on the number of collisions, with novice drivers showing a higher collision probability.

23 pages, 13360 KB  
Article
Lumina-4DGS: Illumination-Robust Four-Dimensional Gaussian Splatting for Dynamic Scene Reconstruction
by Xiaoqiang Wang, Qing Wang, Yang Sun and Shengyi Liu
Sensors 2026, 26(5), 1650; https://doi.org/10.3390/s26051650 - 5 Mar 2026
Viewed by 178
Abstract
High-fidelity 4D reconstruction of dynamic scenes is pivotal for immersive simulation yet remains challenging due to the photometric inconsistencies inherent in multi-view sensor arrays. Standard 3D Gaussian Splatting (3DGS) strictly adheres to the brightness constancy assumption, failing to distinguish between intrinsic scene radiance and transient brightness shifts caused by independent auto-exposure (AE), auto-white-balance (AWB), and non-linear ISP processing. This misalignment often forces the optimization process to compensate for spectral discrepancies through incorrect geometric deformation, resulting in severe temporal flickering and spatial floating artifacts. To address these limitations, we present Lumina-4DGS, a robust framework that harmonizes spatiotemporal geometry modeling with a hierarchical exposure compensation strategy. Our approach explicitly decouples photometric variations into two levels: a Global Exposure Affine Module that neutralizes sensor-specific AE/AWB fluctuations and a Multi-Scale Bilateral Grid that residually corrects spatially varying non-linearities, such as vignetting, using luminance-based guidance. Crucially, to prevent these powerful appearance modules from masking geometric flaws, we introduce a novel SSIM-Gated Optimization mechanism. This strategy dynamically gates the gradient flow to the exposure modules based on structural similarity. By ensuring that photometric enhancement is only activated when the underlying geometry is structurally reliable, we effectively prioritize geometric accuracy over photometric overfitting. Extensive experiments validate the quantitative superiority of Lumina-4DGS. On the Waymo Open Dataset, our method achieves a state-of-the-art Full Image PSNR of 31.12 dB while minimizing geometric errors to a Depth RMSE of 1.89 m and Chamfer Distance of 0.215 m. Furthermore, on our highly challenging self-collected surround-view dataset featuring severe unconstrained illumination shifts, Lumina-4DGS yields a significant 2.13 dB PSNR improvement over recent driving-scene baselines. These results confirm that our framework achieves photorealistic, exposure-invariant novel view synthesis while maintaining superior geometric consistency across heterogeneous camera inputs.
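The Global Exposure Affine Module described above can be pictured as a per-camera, per-channel affine map fitted between rendered radiance and the observed image, so brightness shifts are absorbed photometrically rather than "explained" by wrong geometric deformation. The sketch below is a minimal illustration of that idea under the assumption of a simple closed-form least-squares fit; the function name and sample values are hypothetical, not taken from the paper.

```python
# Minimal sketch (assumed form) of a global affine exposure model: per channel,
# fit observed ~= gain * rendered + bias by closed-form least squares.
def fit_affine(rendered, observed):
    """Least-squares fit of observed = gain * rendered + bias (one channel)."""
    n = len(rendered)
    mean_r = sum(rendered) / n
    mean_o = sum(observed) / n
    cov = sum((r - mean_r) * (o - mean_o) for r, o in zip(rendered, observed))
    var = sum((r - mean_r) ** 2 for r in rendered)
    gain = cov / var
    bias = mean_o - gain * mean_r
    return gain, bias

# A camera whose auto-exposure brightened the channel by 1.5x plus offset 0.1:
gain, bias = fit_affine([0.2, 0.4, 0.6, 0.8], [0.4, 0.7, 1.0, 1.3])
```

Applying the fitted map to rendered values (or its inverse to observed pixels) before the photometric loss keeps exposure differences out of the geometry gradients.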
(This article belongs to the Section Optical Sensors)

23 pages, 5855 KB  
Article
Pedestrian Flow Model Based on Cellular Automata Under Visual Trajectory and Multi-Scenario Evacuation Simulation Research
by Yueyue Chen, Jinbao Yao, Chenze Gao and Haoyuan Guo
Sensors 2026, 26(5), 1405; https://doi.org/10.3390/s26051405 - 24 Feb 2026
Viewed by 206
Abstract
Precise modeling and simulation of pedestrian flow are crucial for public space safety design and emergency management. This study proposes an interdisciplinary method integrating computer vision and cellular automata (CA). First, unidirectional pedestrian flow video data with different densities were collected from an overpass scene via controlled experiments. High-precision pedestrian trajectory extraction and tracking were achieved using the YOLO 11 model and DeepSORT algorithm, with image distortion corrected by perspective transformation. For the first time, the probability distribution of pedestrian turning angles derived from trajectory analysis was converted into data-driven transition probabilities for the Moore neighborhood in the CA model. An improved evacuation model was then constructed, comprehensively considering real-data-based transition probabilities, speed–density distribution, panic coefficient, individual life value, and hazard source dynamics. Multi-scenario simulations show that moderate panic may shorten evacuation time, while excessive panic causes behavioral disorders; group movement is constrained by the slowest individual, and increased hazard source speed reduces the proportion of safe pedestrians. This study provides new insights and methodological support for refined pedestrian evacuation simulation and safety management.
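The data-driven transition probabilities described above can be illustrated with a small sketch: observed turning angles are assigned to the eight Moore-neighborhood directions and normalized into a probability distribution. The binning scheme, function name, and sample angles below are illustrative assumptions, not the study's actual procedure or data.

```python
from collections import Counter

# Eight Moore-neighborhood headings relative to the current direction (degrees).
MOORE_HEADINGS = [0, 45, 90, 135, 180, -135, -90, -45]

def transition_probabilities(turning_angles):
    """Assign each observed turning angle to the nearest Moore direction
    (with wrap-around distance) and normalize the counts."""
    counts = Counter()
    for a in turning_angles:
        nearest = min(MOORE_HEADINGS,
                      key=lambda h: min(abs(a - h), 360 - abs(a - h)))
        counts[nearest] += 1
    total = sum(counts.values())
    return {h: counts[h] / total for h in MOORE_HEADINGS}

# Illustrative sample: mostly straight-ahead motion with occasional turns.
probs = transition_probabilities([2, -5, 40, 1, -1, 90, 3, -44])
```

In a CA update step, a pedestrian's next cell would then be drawn from this distribution, rotated into the pedestrian's current heading.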
(This article belongs to the Special Issue Intelligent Traffic Safety and Security)

40 pages, 12177 KB  
Article
Dynamic Multi-Relation Learning with Multi-Scale Hypergraph Transformer for Multi-Modal Traffic Forecasting
by Juan Chen and Meiqing Shan
Future Transp. 2026, 6(1), 51; https://doi.org/10.3390/futuretransp6010051 - 22 Feb 2026
Viewed by 241
Abstract
Accurate multi-modal traffic demand forecasting is key to optimizing intelligent transportation systems (ITSs). To overcome the shortcomings of existing methods in capturing dynamic high-order correlations between heterogeneous spatial units and in decoupling intra- and inter-mode dependencies at multiple time scales, this paper proposes a Dynamic Multi-Relation Learning with Multi-Scale Hypergraph Transformer method (MST-HyperTrans). The model integrates three novel modules. Firstly, the Multi-Scale Temporal Hypergraph Convolutional Network (MSTHCN) achieves collaborative decoupling and captures periodic and cross-modal temporal interactions of transportation demand at multiple granularities, such as time, day, and week, by constructing a multi-scale temporal hypergraph. Secondly, the Dynamic Multi-Relationship Spatial Hypergraph Network (DMRSHN) integrates geographic proximity, passenger flow similarity, and transportation connectivity to construct structural hyperedges, and combines KNN and K-means algorithms to generate dynamic hyperedges, thereby accurately modeling the dynamically evolving high-order spatial correlations between heterogeneous nodes. Finally, the Conditional Meta Attention Gated Fusion Network (CMAGFN), a lightweight meta network, introduces a gating mechanism based on multi-head cross-attention. It dynamically generates node features from the real-time traffic context and adaptively calibrates the fusion weights of multi-source information, yielding scene-aware prediction decisions. Experiments on three real-world datasets (NYC-Taxi, -Bike, and -Subway) demonstrate that MST-HyperTrans achieves an average reduction of 7.6% in RMSE and 9.2% in MAE across all modes compared to the strongest baseline, while maintaining interpretability of spatiotemporal interactions. The study thus offers both good model interpretability and a reliable solution for multi-modal traffic collaborative management.

30 pages, 964 KB  
Review
The Mystery of the Hidden Trace: Emerging Genetic Approaches to Improve Body Fluid Identification
by Dana Macfarlane, Gabriela Roca, Christian Stadler and Sara C. Zapico
Genes 2026, 17(2), 146; https://doi.org/10.3390/genes17020146 - 28 Jan 2026
Viewed by 503
Abstract
Body fluid identification at crime scenes is the first step in the forensic biology workflow, leading to the identification of the perpetrator and/or, in some cases, the victim. Current methods regularly used in forensic criminal evidence analysis exploit well-studied properties of each fluid as the foundation of the protocol. Among these approaches, alternative light sources, chemical reactions, lateral flow immunochromatographic tests, and microscopic detection stand out for identifying the main body fluids encountered at crime scenes: blood, semen, and saliva. However, these often have limited specificity and sensitivity, struggle with fluid mixtures and environmental degradation, and can destroy the sample in the process. Other fluids, like vaginal fluid and fecal matter, lack standardized protocols and require innovative approaches for accurate analysis without compromising the sample. Emerging technologies based on molecular methods have been the focus of body fluid research, with emphasis on topics such as mRNA, microRNA, epigenetics, and microbial analysis. New molecular techniques could also provide information beyond the fluid's origin, such as the identification of donors from SNP analysis when regular STR analysis is not possible. Validation studies and the integration of such research have the potential to expand and enhance the laboratory practices of forensic science. This article provides an overview of the current methods applied in the crime lab for body fluid identification before exploring active research in this field, pointing out the potential of these techniques to overcome present issues and expand the variety of body fluids identified in forensic casework.
(This article belongs to the Section Genetic Diagnosis)

23 pages, 1657 KB  
Article
A Spatial Optimization Evaluation Framework for Immersive Heritage Museum Exhibition Layouts: A Delphi–Group AHP–IPA Approach
by Yuxin Bu, Mohd Jaki Bin Mamat, Muhammad Firzan Bin Abdul Aziz and Yuxuan Shi
Buildings 2026, 16(3), 528; https://doi.org/10.3390/buildings16030528 - 28 Jan 2026
Viewed by 332
Abstract
As heritage museums shift toward more experience-oriented development, fragmented layouts and discontinuous visitor flows can reduce both spatial efficiency and the coherence of on-site experience. This study proposes an immersive experience-centred evaluation framework for exhibition layout in heritage museums, intended to translate experience goals into practical and diagnosable criteria for spatial optimization. An indicator system was refined through two rounds of Delphi consultation with an interdisciplinary expert panel, resulting in a hierarchical framework comprising five dimensions and multiple indicators. To support intervention prioritization in design and operations, weights were derived using the Group Analytic Hierarchy Process (GAHP), with Aggregation of Individual Judgments (AIJs) and consistency checks applied to control group judgement quality. A CV–entropy procedure was further used to support prioritization at the third-indicator level. Importance–Performance Analysis (IPA) was then employed to convert "importance–fit" assessments into an actionable sequence of optimization priorities. The results indicate that narrative and scene design carries the greatest weight (0.2877), followed by circulation and spatial organization (0.2281), sensory experience and atmosphere (0.1981), authenticity and sense of place (0.1644), and interactivity and participation (0.1217), suggesting that a "narrative–circulation–atmosphere" chain forms the core support for immersive layout design. A feasibility application using the Yinxu Museum demonstrates the framework's value for benchmarking and diagnosis, helping decision-makers enhance the visitor experience and target spatial investment priorities more precisely while respecting conservation constraints.
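The GAHP weighting step can be illustrated with a minimal sketch of the row geometric-mean method for deriving a priority vector from a pairwise comparison matrix; in GAHP with AIJ, individual expert matrices would first be aggregated element-wise by geometric mean before this step. The 3x3 judgment matrix below is illustrative, not the study's expert data.

```python
import math

# Minimal sketch (assumed form) of AHP priority weights via the row
# geometric-mean method. matrix[i][j] encodes how much more important
# criterion i is than criterion j; the matrix is reciprocal by construction.
def ahp_weights(matrix):
    """Priority vector: normalized geometric means of the rows."""
    geo = [math.prod(row) ** (1.0 / len(row)) for row in matrix]
    total = sum(geo)
    return [g / total for g in geo]

# Illustrative judgments: A is 3x as important as B and 5x as important as C.
judgments = [
    [1,     3,     5],
    [1 / 3, 1,     2],
    [1 / 5, 1 / 2, 1],
]
weights = ahp_weights(judgments)
```

The resulting weights sum to one and preserve the judged ordering of the criteria, which is what the reported dimension weights (0.2877 down to 0.1217) express.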

15 pages, 2212 KB  
Article
Enhancing User Experience in Virtual Reality Through Optical Flow Simplification with the Help of Physiological Measurements: Pilot Study
by Abdualrhman Abdalhadi, Nitin Koundal, Mahdiyeh Sadat Moosavi, Ruding Lou, Mohd Zuki bin Yusoff, Frédéric Merienne and Naufal M. Saad
Sensors 2026, 26(2), 610; https://doi.org/10.3390/s26020610 - 16 Jan 2026
Viewed by 436
Abstract
Virtual reality (VR) has advanced significantly and is now widely used across a range of applications. However, consumers' capacity to fully enjoy VR experiences continues to be limited by a chronic problem known as cybersickness (CS). This study explores the feasibility of mitigating CS through geometric scene simplification combined with electroencephalography (EEG)-based monitoring. According to sensory conflict theory, CS is caused by the discrepancy between visually induced self-motion (VIMS) through immersive displays and the real motion the vestibular system detects. While prior mitigation strategies have largely relied on hardware modifications or visual field restrictions, this paper introduces a novel framework that integrates geometric scene simplification with EEG-based neurophysiological monitoring to reduce VIMS during VR immersion, allowing users' brainwave activity and cognitive states to be tracked throughout the immersive experience. The empirical evidence from our investigation shows a correlation between CS manifestation and neural activation in the parietal and temporal lobes. In an experiment with 15 subjects, differences were statistically significant (p = 0.001) with a large effect size (η² = 0.28), and preliminary trends suggest lower neural activation during simplified scenes. Notably, the decrease in neural activation corresponding to reduced optic flow (OF) suggests that VR environment simplification may help attenuate CS symptoms, providing preliminary support for the proposed strategy.
(This article belongs to the Section Biomedical Sensors)

19 pages, 38545 KB  
Article
Improving Dynamic Visual SLAM in Robotic Environments via Angle-Based Optical Flow Analysis
by Sedat Dikici and Fikret Arı
Electronics 2026, 15(1), 223; https://doi.org/10.3390/electronics15010223 - 3 Jan 2026
Viewed by 498
Abstract
Dynamic objects present a major challenge for visual simultaneous localization and mapping (Visual SLAM), as feature measurements originating from moving regions can corrupt camera pose estimation and lead to inaccurate maps. In this paper, we propose a lightweight, semantic-free front-end enhancement for ORB-SLAM that detects and suppresses dynamic features using optical flow geometry. The key idea is to estimate a global motion direction point (MDP) from optical flow vectors and to classify feature points based on their angular consistency with the camera-induced motion field. Unlike magnitude-based flow filtering, the proposed strategy exploits the geometric consistency of optical flow with respect to a motion direction point, providing robustness not only to depth variation and camera speed changes but also to different camera motion patterns, including pure translation and pure rotation. The method is integrated into the ORB-SLAM front-end without modifying the back-end optimization or cost function. Experiments on public dynamic-scene datasets demonstrate that the proposed approach reduces absolute trajectory error by up to approximately 45% compared to baseline ORB-SLAM, while maintaining real-time performance on a CPU-only platform. These results indicate that reliable dynamic feature suppression can be achieved without semantic priors or deep learning models.
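The angular-consistency test described above can be sketched as follows: under pure camera translation, the flow of a static point should lie along the ray from the motion direction point (MDP, the focus of expansion) through that point, so a large angle between the observed flow and that ray flags the feature as dynamic. The threshold, coordinates, and function name below are illustrative assumptions, not the paper's implementation.

```python
import math

# Minimal sketch (assumed form) of angle-based dynamic-feature filtering:
# compare each feature's optical-flow direction with the ray MDP -> feature.
def is_dynamic(point, flow, mdp, max_deg=20.0):
    ray = (point[0] - mdp[0], point[1] - mdp[1])  # expected flow direction
    dot = ray[0] * flow[0] + ray[1] * flow[1]
    norm = math.hypot(*ray) * math.hypot(*flow)
    if norm == 0:
        return False  # degenerate case: keep the point
    cos_a = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_a)) > max_deg

mdp = (320.0, 240.0)  # illustrative MDP, as if estimated from all flow vectors
static = is_dynamic((420.0, 240.0), (5.0, 0.2), mdp)   # flow along the ray
moving = is_dynamic((420.0, 240.0), (0.0, -6.0), mdp)  # flow off the ray
```

Because only the flow direction is tested, the check is insensitive to flow magnitude, which varies with depth and camera speed.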
(This article belongs to the Section Computer Science & Engineering)

21 pages, 3681 KB  
Article
E-Sem3DGS: Monocular Human and Scene Reconstruction via Event-Aided Semantic 3DGS
by Xiaoting Yin, Hao Shi, Kailun Yang, Jiajun Zhai, Shangwei Guo and Kaiwei Wang
Sensors 2026, 26(1), 188; https://doi.org/10.3390/s26010188 - 27 Dec 2025
Viewed by 799
Abstract
Reconstructing animatable humans, together with their surrounding static environments, from monocular, motion-blurred videos is still challenging for current neural rendering methods. Existing monocular human reconstruction approaches achieve impressive quality and efficiency, but they are designed for clean intensity inputs and mainly focus on the foreground human, leading to degraded performance under motion blur and incomplete scene modeling. Event cameras provide high temporal resolution and robustness to motion blur, making them a natural complement to standard video sensors. We present E-Sem3DGS, a semantically augmented 3D Gaussian Splatting framework that leverages hybrid event-intensity streams to jointly reconstruct explicit 3D volumetric representations of human avatars and static scenes. E-Sem3DGS maintains a single set of 3D Gaussians in Euclidean space, each endowed with a learnable semantic attribute that softly separates dynamic human and static scene content. We initialize human Gaussians from Skinned Multi-Person Linear (SMPL) model priors with semantic values set to 1 and scene Gaussians by sampling a surrounding cube with semantic values set to 0, then jointly optimize geometry, appearance, and semantics. To mitigate motion blur, we derive optical flow from events and use it to supervise image-based optical flow between rendered frames, enforcing temporal coherence in high-motion regions and sharpening both humans and backgrounds. On the motion-blurred ZJU-MoCap-Blur dataset, E-Sem3DGS improves the average full-frame PSNR from 21.75 to 32.56 (+49.7%) over previous methods. On MMHPSD-Blur, our method improves PSNR from 25.23 to 28.63 (+13.48%).
(This article belongs to the Special Issue Sensors for Object Detection, Pose Estimation, and 3D Reconstruction)

18 pages, 2680 KB  
Article
Temporally Aware Objective Quality Metric for Immersive Video
by Jakub Stankowski, Bartosz Sojka, Tomasz Grajek and Adrian Dziembowski
Appl. Sci. 2026, 16(1), 274; https://doi.org/10.3390/app16010274 - 26 Dec 2025
Viewed by 351
Abstract
State-of-the-art objective quality metrics designed for immersive content typically prioritize spatial distortions; therefore, they can omit temporal artifacts introduced by view synthesis and dynamic scene rendering. Consequently, metrics such as the commonly used peak signal-to-noise ratio for immersive video (IV-PSNR) are "temporally blind", creating a conceptual gap where temporally stable distortions cannot be distinguished from disruptive temporal flickering. To address this limitation, we propose a temporal extension of the IV-PSNR metric that incorporates motion information into the quality assessment process. The method augments the traditional Y, U, and V color components with a fourth channel representing motion vectors (M), enabling the proposed four-component IV-PSNRYUVM metric to account for dynamic distortions introduced by view rendering. To evaluate the effectiveness of the proposed approach, multiple configurations of motion integration were tested, including metrics based solely on motion consistency, metrics combining motion with texture, and several dense optical flow algorithms with different parameter settings. Extensive experiments performed on immersive video sequences demonstrate that the proposed four-component IV-PSNRYUVM achieves the highest correlation with subjectively perceived video quality. These results confirm that combining texture with motion information provides a benefit, making the proposal a valuable addition for real-world immersive video systems.
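The four-component idea can be sketched as per-channel PSNR over Y, U, V, and a motion channel M (e.g. optical-flow magnitudes), combined by a weighted average. The 6:1:1:1 weights and function names below are illustrative assumptions; the actual IV-PSNR computation also includes immersive-video-specific corresponding-pixel handling not reproduced here.

```python
import math

# Minimal sketch (assumed form) of a four-component quality score: per-channel
# PSNR over Y, U, V and a motion channel M, combined by a weighted average.
def psnr(ref, test, peak=255.0):
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak * peak / mse)

def psnr_yuvm(ref, test, weights=(6, 1, 1, 1)):
    """ref/test: dicts mapping 'Y', 'U', 'V', 'M' to flat sample lists."""
    per_channel = [psnr(ref[c], test[c]) for c in "YUVM"]
    return sum(w * p for w, p in zip(weights, per_channel)) / sum(weights)
```

A purely spatial YUV metric cannot see flicker that leaves each frame's texture statistics intact; the M channel penalizes exactly that kind of temporal distortion.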

27 pages, 5112 KB  
Article
A Lightweight and Low-Cost Underwater Localization System Based on Visual–Inertial–Depth Fusion for Net-Cage Cleaning Robots
by Chuanyu Geng, Junhua Chen and Hao Li
J. Mar. Sci. Eng. 2026, 14(1), 48; https://doi.org/10.3390/jmse14010048 - 26 Dec 2025
Viewed by 489
Abstract
Net-cage aquaculture faces challenges from biofouling, which reduces water exchange and threatens structural integrity. Automated cleaning robots provide an alternative to human divers but require effective, low-cost localization. Conventional acoustic–inertial systems are expensive and complex, while vision-only or IMU-based methods suffer from drift in turbid, low-texture waters. This paper presents a lightweight Visual–Inertial–Depth (VID) fusion framework for underwater net-cage cleaning robots. Built on the VINS-Fusion system, the method estimates scene scale using optical flow and stereo matching and incorporates IMU pre-integration for high-frequency motion prediction. A pressure-based depth factor constrains Z-axis drift, and reflective-anchor initialization ensures global alignment. The system runs in real time on a Jetson Orin NX with ROS. Experiments in air, tank, pool, and ocean settings demonstrate its robustness. In controlled environments, the mean anchor coordinate error (ACE) was 0.05–0.16 m, and loop-closure drift (LCD) was ≤0.5 m per 5 m. In ocean trials, turbulence and biofouling led to drift (LCD 1.32 m over 16.05 m, 8.3%), but IMU and depth cues helped maintain vertical stability. The system delivers real-time, cost-effective localization in structured underwater cages and offers insights for improvements in dynamic marine conditions.
(This article belongs to the Section Ocean Engineering)

27 pages, 8689 KB  
Article
Comparative Evaluation of YOLO Models for Human Position Recognition with UAVs During a Flood
by Nataliya Bilous, Vladyslav Malko, Iryna Ahekian, Igor Korobiichuk and Volodymyr Ivanichev
Appl. Syst. Innov. 2026, 9(1), 6; https://doi.org/10.3390/asi9010006 - 25 Dec 2025
Viewed by 806
Abstract
Reliable recognition of people on water from UAV imagery remains a challenging task due to strong glare, wave-induced distortions, partial submersion, and the small visual scale of targets. This study proposes a hybrid method for human detection and position recognition in aquatic environments by integrating the YOLO12 object detector with optical-flow-based motion analysis, Kalman tracking, and BlazePose skeletal estimation. A combined training dataset was formed using four complementary sources, enabling the detector to generalize across heterogeneous maritime and flood-like scenes. YOLO12 demonstrated superior performance compared to earlier You Only Look Once (YOLO) generations, achieving the highest accuracy (mAP@0.5 = 0.95) and the lowest error rates on the test set. The hybrid configuration further improved recognition robustness by reducing false positives and partial detections in conditions of intense reflections and dynamic water motion. Real-time experiments on a Raspberry Pi 5 platform confirmed that the full system operates at 21 FPS, supporting onboard deployment for UAV-based search-and-rescue missions. The presented method improves localization reliability, enhances interpretation of human posture and motion, and facilitates prioritization of rescue actions. These findings highlight the practical applicability of YOLO12-based hybrid pipelines for real-time survivor detection in flood response and maritime safety workflows.
(This article belongs to the Special Issue Advancements in Deep Learning and Its Applications)

22 pages, 3966 KB  
Article
TAS-SLAM: A Visual SLAM System for Complex Dynamic Environments Integrating Instance-Level Motion Classification and Temporally Adaptive Super-Pixel Segmentation
by Yiming Li, Liuwei Lu, Guangming Guo, Luying Na, Xianpu Liang, Peng Su, Qi An and Pengjiang Wang
ISPRS Int. J. Geo-Inf. 2026, 15(1), 7; https://doi.org/10.3390/ijgi15010007 - 21 Dec 2025
Viewed by 540
Abstract
To address the decreased localization accuracy and robustness of existing visual SLAM systems caused by imprecise identification of dynamic regions in complex dynamic scenes, which leads either to dynamic interference or to the loss of valid static feature points, this paper proposes a dynamic visual SLAM method integrating instance-level motion classification, temporally adaptive super-pixel segmentation, and optical flow propagation. The system first employs an instance-level motion classifier combining residual flow estimation and a YOLOv8-seg instance segmentation model to distinguish moving objects. Then, a temporally adaptive SLIC super-pixel segmentation algorithm (TA-SLIC) is applied to achieve fine-grained dynamic region partitioning. Subsequently, a dynamic region missed-detection correction mechanism based on optical flow propagation (OFP) refines the missed-detection mask, enabling accurate identification and capture of motion regions containing non-rigid local object movements, undefined moving objects, and low-dynamic objects. Finally, dynamic feature points are removed, and valid static features are used for pose estimation. The localization accuracy of the system is validated on two widely adopted datasets, TUM and BONN. Experimental results demonstrate that the proposed method effectively suppresses interference from dynamic objects (particularly non-rigid local motions) and significantly enhances both localization accuracy and system robustness in dynamic environments.
(This article belongs to the Special Issue Indoor Mobile Mapping and Location-Based Knowledge Services)

22 pages, 14012 KB  
Article
Video Frame Interpolation for Extreme Motion Scenes Based on Dual Alignment and Region-Adaptive Interaction
by Xin Ning, Jiantao Qu, Junyi Duan, Kun Yang and Youdong Ding
Symmetry 2025, 17(12), 2097; https://doi.org/10.3390/sym17122097 - 6 Dec 2025
Viewed by 804
Abstract
Video frame interpolation in ultra-high-definition extreme motion scenes remains highly challenging due to large displacements, nonlinear motion, and occlusions that disrupt spatio-temporal symmetry. To address this issue, this study proposes a frame interpolation method for extreme motion scenes based on dual alignment and region-adaptive interaction from the perspectives of cross-frame localization and adaptive reconstruction. Specifically, we design a two-stage motion information alignment strategy that obtains two types of motion information via optical flow estimation and offset estimation, and it progressively guides reference pixels for accurate long-range cross-frame localization, mitigating structural misalignment caused by limited receptive fields while simultaneously alleviating spatiotemporal asymmetry caused by inconsistent inter-frame motion speed and direction. Based on this, we introduce a region-adaptive interaction module that automatically adapts motion representations for different regions through cross-frame interaction and leverages distinct attention pathways to accurately capture both the global context and local high-frequency motion details. This achieves a dynamic feature fusion tailored to regional characteristics, significantly enhancing the model's ability to perceive the overall structure and texture details in extreme motion scenarios. In addition, the introduction of a motion compensation module explicitly captures pixel motion relationships by constructing a global correlation matrix that compensates for the positioning errors of the dual alignment module in extreme motion or occlusion areas. The experimental results demonstrate that the proposed method achieves excellent overall performance in ultra-high-definition extreme motion scenes, with a PSNR improvement of 0.05 dB over state-of-the-art methods. In multi-frame interpolation tasks, it achieves an average PSNR gain of 0.31 dB, demonstrating strong cross-scene interpolation capability.
(This article belongs to the Special Issue Symmetry in Artificial Intelligence and Applications)

17 pages, 1697 KB  
Article
Football-YOLO: A Lightweight and Symmetry-Aware Football Detection Model with an Enlarged Receptive Field
by Jingjing Zhou, Hongyang Liu, Gang Zhao and Ying Gao
Symmetry 2025, 17(12), 2046; https://doi.org/10.3390/sym17122046 - 1 Dec 2025
Viewed by 876
Abstract
In modern elite football, accurate ball localization is increasingly vital for smooth match flow and reliable officiating. Yet mainstream detectors still struggle with small objects like footballs in cluttered scenes due to limited receptive fields, weak feature representations, and non-trivial computational cost. To address these issues and introduce structural symmetry, we propose a lightweight framework that balances model complexity and representational completeness. Concretely, we design a Dynamic clustering C3k2 module (DcC3k2) to enlarge the effective receptive field and preserve local–global symmetry and a SegNeXt-based noise-attentive C3k2 module (SNAC3k2) to perform multi-scale suppression of background interference. For efficient feature extraction, we adopt GhostNetV2, a lightweight convolutional backbone, thereby maintaining computational symmetry and speed. Experiments on a Football dataset show that our approach improves mAP by 3.4% over strong baselines while reducing computation by 2.2%. These results validate symmetry-aware lightweight design as a promising direction for high-precision small-object detection in football analytics.
