Search Results (3,843)

Search Parameters:
Keywords = autonomous driving

24 pages, 1778 KB  
Article
A Trajectory Data-Driven Personalized Autonomous Driving Decision System for Driving Simulators
by Wenpeng Sun, Yu Zhang and Nengchao Lyu
Vehicles 2026, 8(4), 94; https://doi.org/10.3390/vehicles8040094 (registering DOI) - 19 Apr 2026
Abstract
To meet the high-fidelity testing environment requirements for autonomous driving system development, driving simulators are gradually evolving from tools that “only provide scenes and interaction interfaces” into integrated verification platforms for autonomous driving capabilities. These simulators, in particular, need to feature testable and scalable decision-making modules. However, the autonomous driving functions in existing driving simulators mostly rely on rule-based or simplified model approaches, which are inadequate for depicting the complex interactions in real-world traffic and fail to meet the personalized decision-making needs under various driving styles. To address these challenges, this paper designs and implements a trajectory data-driven personalized autonomous driving decision system, using drone aerial imagery as the core data source to provide realistic background traffic flow and human-like decision-making capabilities. The proposed system can be interpreted as an integrated decision–planning–control framework deployed within a high-fidelity driving simulation platform. It consists of a driving style classification module based on drone trajectory data, a personalized decision module integrating inverse reinforcement learning and dynamic game theory, and a planning and control module. First, a natural driving database is built using 4997 real vehicle trajectories, and prior features of different driving styles are extracted through trajectory feature engineering and an improved K-means++ method. Based on this, a personalized decision-making framework that combines dynamic game theory and maximum entropy inverse reinforcement learning is proposed, aiming to learn the preference weights of different driving styles in terms of safety, comfort, and efficiency. Furthermore, the Dueling Network Architecture (DuDQN) is used to generate human-like lane-changing strategies. 
Subsequently, a real-time closed-loop execution of personalized decisions in the simulation platform is achieved through fifth-order polynomial trajectory planning, lateral Linear Quadratic Regulator (LQR) control, and longitudinal cascade Proportional–Integral–Derivative (PID) control. Experimental results show that the personalized decision model trained with drone data can realistically reproduce vehicle decision-making behaviors in natural traffic flows within the simulation environment and generate autonomous driving strategies that are highly consistent with different driving styles. This significantly enhances the humanization and personalization capabilities of the autonomous driving module in the driving simulator. Full article
(This article belongs to the Special Issue Data-Driven Smart Transportation Planning)
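The fifth-order polynomial planning step mentioned in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the 3.5 m lane offset, and the 4 s duration are assumptions chosen for illustration. It uses the standard closed-form quintic that matches position, velocity, and acceleration at both ends of a maneuver.

```python
def quintic_coefficients(x0, v0, a0, xT, vT, aT, T):
    """Return c0..c5 of x(t) = sum(c_i * t**i) that match position,
    velocity and acceleration at t = 0 and at t = T."""
    if T <= 0:
        raise ValueError("duration T must be positive")
    dx = xT - x0
    c0, c1, c2 = x0, v0, a0 / 2.0
    c3 = (20 * dx - (8 * vT + 12 * v0) * T - (3 * a0 - aT) * T**2) / (2 * T**3)
    c4 = (-30 * dx + (14 * vT + 16 * v0) * T + (3 * a0 - 2 * aT) * T**2) / (2 * T**4)
    c5 = (12 * dx - 6 * (vT + v0) * T - (a0 - aT) * T**2) / (2 * T**5)
    return [c0, c1, c2, c3, c4, c5]


def evaluate(coeffs, t):
    """Evaluate the polynomial at time t using Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * t + c
    return result


# Hypothetical lane change: 3.5 m lateral offset in 4 s, starting and
# ending with zero lateral velocity and acceleration.
lane_change = quintic_coefficients(0.0, 0.0, 0.0, 3.5, 0.0, 0.0, 4.0)
lateral_offsets = [evaluate(lane_change, 0.5 * k) for k in range(9)]
```

A rest-to-rest quintic is symmetric about its midpoint, so the vehicle is halfway across at half the maneuver time. A tracking controller such as the LQR named in the abstract would then follow the sampled offsets.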
22 pages, 876 KB  
Article
Large Autonomous Driving Overtaking Decision and Control System Based on Hierarchical Reinforcement Learning
by Chen-Ning Wang and Xiuhui Tang
Electronics 2026, 15(8), 1711; https://doi.org/10.3390/electronics15081711 - 17 Apr 2026
Abstract
To address the bottlenecks of low sample efficiency and poor control accuracy in traditional single-layer reinforcement learning during autonomous driving overtaking, this paper proposes an overtaking decision and control system based on hierarchical reinforcement learning to decouple complex tasks in spatial and temporal dimensions. A heterogeneous two-layer architecture is constructed, where the upper layer adopts the Proximal Policy Optimization algorithm to generate macroscopic discrete decisions, while the lower layer employs Twin Delayed Deep Deterministic Policy Gradient combined with Long Short-Term Memory to achieve smooth continuous control of steering and acceleration by perceiving temporal features of dynamic obstacles. A composite reward mechanism, integrating hard safety constraints and soft efficiency incentives, is designed to balance safety, efficiency, and comfort. Experimental results in complex scenarios with multiple interfering vehicles and random lane-changing behaviors demonstrate that the proposed system improves the training convergence speed by approximately 30% within 500,000 steps compared to single-layer algorithms. In tests across varying traffic densities, the system achieves a 98.3% success rate in medium-density scenarios with a collision rate of only 0.6%. In high-density challenges, the success rate remains above 95%, with the collision rate reduced by about 80% compared to baseline models. Furthermore, the lateral control deviation is strictly limited to within 0.2 m, and the longitudinal safety distance remains stable above 5 m. This system provides a robust, high-efficiency paradigm for autonomous overtaking. Full article
31 pages, 4364 KB  
Article
Performance Degradation of Object Detection Neural Networks Under Natural Visual Contamination in Autonomous Driving
by Dániel Csikor and János Hollósi
Computers 2026, 15(4), 254; https://doi.org/10.3390/computers15040254 - 17 Apr 2026
Abstract
The operation of driver assistance systems and autonomous vehicles requires a sensor system and a control algorithm. Sensors provide information to detect people, vehicles and objects in the vehicle’s environment; however, their performance can be degraded by adverse environmental conditions and contamination. This literature review identified factors that reduce sensor visibility, such as weather conditions and external contamination. In this study, the detection efficiency of state-of-the-art neural network-based object detectors was examined in a simulation environment using a synthetic dataset. A custom dataset comprising six urban and suburban traffic scenarios was created, including clean images and ten contaminated variants per scene with increasing mud coverage. The results show that contamination leads to a measurable reduction in detection performance across all models. Smaller variants are more sensitive to degradation, while medium-complexity models provide a favorable balance between robustness and computational cost. Increasing model size yields limited additional robustness, and performance differences between architectures highlight the importance of model design. Furthermore, the spatial distribution of contamination, particularly near the image center, has a significant impact on performance in addition to its overall extent. Full article
19 pages, 580 KB  
Article
Emergent Pedestrian Safety in a World-Model Driving Agent Under Adversarial Interaction Without Explicit Safety Rewards
by Stefan Zlatinov, Gorjan Nadzinski, Vesna Ojleska Latkoska, Dushko Stavrov and Mile Stankovski
Appl. Sci. 2026, 16(8), 3915; https://doi.org/10.3390/app16083915 - 17 Apr 2026
Abstract
Pedestrian interaction remains a central safety challenge for autonomous driving, particularly under non-compliant or adversarial pedestrian behavior. Existing research and evaluations predominantly test against rule-following pedestrians, leaving a gap in understanding how learning-based agents handle worst-case interactions. We introduce the Jaywalkers Library, a novel configurable benchmark in CARLA with three adversarial pedestrian archetypes (Intruder, Indecisive Crosser, and Protester). We evaluate a DreamerV3 agent trained with sparse rewards, where the only pedestrian-specific signal is a terminal collision penalty. Evaluation employs a frozen-policy protocol with explicit train–test separation. Safety behavior is decomposed into endpoint outcomes, evasion dynamics, and efficiency costs. Under nominal conditions, the agent achieves high route completion and generalizes to an unseen town, whereas under adversarial exposure, an archetype-sensitive evasion strategy emerges. The agent swerves at speed against dynamic pedestrians but decelerates against the slow-moving Protester. Collision rates reveal a counterintuitive difficulty ordering in which the Protester is the hardest, followed by the Intruder, with the Indecisive Crosser as the most survivable. These findings show that a sparse terminal penalty suffices for emergent pedestrian avoidance in a world-model agent, but that effectiveness is bounded by the world model’s ability to predict pedestrian persistence. Full article
(This article belongs to the Special Issue Advances in Virtual Reality and Vision for Driving Safety)
34 pages, 10503 KB  
Article
Multi-Objective Trajectory Optimization for Autonomous Vehicles Based on an Improved Driving Risk Field
by Jianping Gao, Wenju Liu, Pan Liu, Peiyi Bai and Chengwei Xie
Modelling 2026, 7(2), 75; https://doi.org/10.3390/modelling7020075 - 17 Apr 2026
Abstract
Trajectory planning in dynamic multi-vehicle interaction environments faces three critical challenges, including the difficulty of quantifying spatial risk distributions, the complexity of characterizing behavioral uncertainty arising from the multimodal maneuvers of surrounding vehicles, and the challenge of simultaneously optimizing multiple competing objectives such as safety, efficiency, comfort, and energy consumption. To address these challenges, this paper proposes an Improved Driving Risk Field-based Multi-objective Trajectory Optimization (IDRF-MTO) method. First, a joint spatiotemporal social attention mechanism achieves unified modeling of spatial interactions, temporal dependencies, and spatiotemporal coupling, combined with a lateral–longitudinal intent strategy for multimodal trajectory prediction. Second, an improved dynamic risk field model is constructed comprising three components: a vehicle risk field that incorporates spatial orientation and motion direction factors for anisotropic risk representation, along with a collision tendency factor that converts objective risk into effective risk; a predicted trajectory risk field that achieves anticipatory quantification of future risk from surrounding vehicles through confidence-weighted fusion; and a driving environment risk field that encapsulates road geometry, static obstacles, and environmental conditions. Finally, a multi-objective cost function embedding risk field gradients is formulated, and multi-objective coordinated optimization is realized through a three-dimensional spatiotemporal situation graph with adaptive safety sampling. Simulation results demonstrate that the proposed method enhances safety while simultaneously improving comfort and efficiency and reducing energy consumption, exhibiting excellent planning performance in complex dynamic environments. Full article
(This article belongs to the Special Issue Advanced Modelling Techniques in Transportation Engineering)
22 pages, 10244 KB  
Article
TransBridge: A Transparent Communication Middleware with Unified RoCE and TCP Semantics
by Cong Zhou, Yulei Yuan and Peng Xun
Sensors 2026, 26(8), 2482; https://doi.org/10.3390/s26082482 - 17 Apr 2026
Abstract
In low-latency edge-intelligence scenarios such as autonomous driving and industrial edge analytics, the processing of large-scale sensor data imposes extremely stringent requirements on communication latency. However, the high overhead of the traditional TCP protocol makes it difficult to satisfy such demands, while the semantic gap between the high-performance RoCE protocol and the standard Socket API prevents existing applications from directly exploiting its advantages. To address this problem, this paper proposes TransBridge, a lightweight user-space communication middleware that transparently bridges TCP and RoCE. Its design is realized through three key innovations: a transparent user-space compatibility architecture that enables unmodified Socket-based applications to benefit from RoCE performance; a microsecond-level low-latency transmission engine that bypasses kernel and protocol stack overhead; and a lightweight lock-free resource management mechanism based on a decentralized peer-to-peer architecture and deferred buffer updates. Experiments on a real RoCE network show that TransBridge significantly outperforms mainstream schemes: it achieves an average round-trip latency of 5.926 μs for 16 B messages and a throughput of 20.254 Gbps for 16 KB messages; in the Fast DDS application-level evaluation, it achieves a throughput of 188 Mbps and an average round-trip latency of about 150 μs. The results indicate that TransBridge can provide transparent and effective RoCE acceleration for existing Socket-based applications in resource-constrained edge environments. Full article
19 pages, 534 KB  
Article
Minimalism and Satisfaction with Collaborative Consumption and Life: The Moderating Role of Corporate Service Sincerity
by Kyung-Tae Lee, Hiroyasu Furukawa and Ken Kumagai
Sustainability 2026, 18(8), 3938; https://doi.org/10.3390/su18083938 - 16 Apr 2026
Abstract
While previous studies have established the positive effects of minimalism on well-being, the issue of how minimalism shapes satisfaction within specific consumption contexts remains underexplored. This study investigates the relationships among minimalism, satisfaction with collaborative consumption (CC), and life satisfaction, examining the moderating role of corporate service sincerity. Drawing on goal satisfaction theory, we conceptualize minimalism as an intrinsic goal orientation that drives psychological fulfillment through value-congruent consumption. Survey data from 430 Japanese consumers with recent CC experience were analyzed using the SPSS PROCESS macro. Results indicate that minimalism positively predicts both satisfaction with CC and life satisfaction, and that these effects are amplified when the CC service is perceived as sincere. However, contrary to theoretical expectations, satisfaction with CC was negatively associated with life satisfaction, suggesting that domain-specific satisfaction in access-based consumption may not spill over to global well-being under certain conditions. We propose that this paradox reflects a boundary condition of goal satisfaction theory: when CC participation is constraint-driven rather than autonomously chosen, satisfaction may coexist with unfulfilled ownership aspirations. These findings advance the minimalist consumption literature by specifying mechanisms linking lifestyle values to consumption outcomes and offer practical guidance for sharing economy platforms seeking to engage value-driven consumers through authentic brand communication. Full article
32 pages, 1594 KB  
Article
Multi-Equipment Coordinated Scheduling Considering Dynamic Changes in Truck Handover Points Under Hybrid Traffic in Automated Container Terminals
by Suosuo Huang, Fang Yu, Qiang Zhang and Yongsheng Yang
Eng 2026, 7(4), 181; https://doi.org/10.3390/eng7040181 - 15 Apr 2026
Abstract
With the rapid maturation of autonomous driving technology, the hybrid traffic of Internal Container Trucks (ICTs) and External Container Trucks (ECTs) has become a major trend in Automated Container Terminals (ACTs), imposing higher demands on the interaction efficiency between trucks and Yard Cranes (YCs). This paper proposes a comprehensive optimization strategy for the coordinated scheduling of ICTs, ECTs and YCs under hybrid traffic. First, a task combination strategy for ICTs is designed to improve ICT utilization by pairing delivery and retrieval tasks across yard blocks. Second, a Chebyshev-motion-based coordination strategy for YC gantry and trolley movements is developed to reduce travel time and optimize handover points. Third, a mixed-integer programming model is formulated to minimize total energy consumption. An Improved Hybrid Genetic Algorithm (IHGA) is then developed, incorporating chaotic initialization, simulated annealing-based mutation, and dual local search to enhance convergence and solution quality. Simulation results confirm that the proposed model and strategy effectively reduce the total energy consumption of task execution, and the designed algorithm outperforms comparative algorithms in both optimization capability and convergence speed. Overall, the research provides theoretical support for future automated terminal development and practical guidance for achieving efficient and sustainable port operations. Full article
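The "Chebyshev-motion" idea in this abstract can be read as follows: when the yard crane's gantry and trolley move at the same time, travel time is set by the slower axis rather than by the sum of both moves. The abstract gives no formula, so the sketch below is an assumption-laden illustration with constant axis speeds and hypothetical function and parameter names.

```python
def chebyshev_travel_time(gantry_dist_m, trolley_dist_m,
                          gantry_speed_mps, trolley_speed_mps):
    """Travel time when both axes move simultaneously: the slower axis
    determines the total (a speed-weighted Chebyshev distance)."""
    if gantry_speed_mps <= 0 or trolley_speed_mps <= 0:
        raise ValueError("speeds must be positive")
    return max(abs(gantry_dist_m) / gantry_speed_mps,
               abs(trolley_dist_m) / trolley_speed_mps)


def sequential_travel_time(gantry_dist_m, trolley_dist_m,
                           gantry_speed_mps, trolley_speed_mps):
    """Baseline for comparison: gantry moves first, then the trolley."""
    if gantry_speed_mps <= 0 or trolley_speed_mps <= 0:
        raise ValueError("speeds must be positive")
    return (abs(gantry_dist_m) / gantry_speed_mps
            + abs(trolley_dist_m) / trolley_speed_mps)


# Hypothetical move: 30 m along the block, 10 m across it.
simultaneous = chebyshev_travel_time(30.0, 10.0, 1.5, 1.0)
one_after_another = sequential_travel_time(30.0, 10.0, 1.5, 1.0)
```

Under these assumptions, a handover point can be shifted along the faster axis without lengthening the move, as long as the slower axis remains the limiting one.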
28 pages, 26837 KB  
Article
KA-IHO: A Kinematic-Aware Improved Hippo Optimization Algorithm for Collision-Free Mobile Robot Path Planning in Complex Grid Environments
by Chunhong Yuan, Yule Cai, Haohua Que, Yuting Pei, Xiang Zhang, Jiayue Xie, Qian Zhang, Lei Mu and Fei Qiao
Sensors 2026, 26(8), 2416; https://doi.org/10.3390/s26082416 - 15 Apr 2026
Abstract
Autonomous path planning in obstacle-dense environments remains challenging for swarm intelligence methods due to infeasible initialization, insufficient exploration–exploitation balance, and poor trajectory smoothness for real-robot execution. To address these issues, this paper proposes a Kinematic-Aware Improved Hippo Optimization algorithm (KA-IHO) for mobile robot path planning. The proposed method integrates four components: an elite safety pool initialization strategy to improve feasible solution generation in dense maps, a hierarchical elite-scout update mechanism to better balance global exploration and local exploitation, anti-stagnation mechanisms including a Population Stagnation Restart strategy and a 10-Direction Radial Micro-Search to guarantee high feasibility rates across all map complexities, and a late-stage Laplacian Line-of-Sight Ironing Operator to reduce path redundancy and improve trajectory smoothness. Comparative experiments are conducted on five reproducible grid maps with different complexity levels (40×40 and 80×80), where KA-IHO is evaluated against six representative algorithms, including HO, SBOA, PSO, GWO, ARO, and INFO, over 20 independent runs. The results show that KA-IHO consistently achieves collision-free planning and obtains lower mean fitness values with smaller standard deviations than the compared methods, indicating improved robustness and solution quality. In addition, hardware closed-loop experiments on a differential-drive mobile robot demonstrate that the planned paths can be executed reliably in real environments, with trajectory tracking errors controlled within ±4 cm. Full article
24 pages, 2803 KB  
Article
Dynamic Trajectory Tracking and Autonomous Berthing Control of a Container Ship Based on Four-Quadrant Hydrodynamics
by Chen-Wei Chen, Jiahao Yin, Jialin Lu, Chin-Yin Chen, Ningmin Yan and Zhuo Feng
J. Mar. Sci. Eng. 2026, 14(8), 724; https://doi.org/10.3390/jmse14080724 - 14 Apr 2026
Abstract
To address the strongly nonlinear hydrodynamic coupling and complex maneuvering challenges encountered by large ships during berthing operations in restricted waters, this paper proposes a high-precision autonomous berthing control system incorporating four-quadrant propeller hydrodynamics. Based on an improved Mathematical Maneuvering Group (MMG) framework, a three-degree-of-freedom (3-DOF) dynamic model is established to accurately capture the transient thrust and torque mappings of the propeller over all four quadrants. A dynamic line-of-sight (LOS) guidance system with a nonlinearly decaying acceptance radius is tightly coupled with PD/PI controllers to coordinate and regulate the rudder angle and propeller rotational speed. The numerical solver was rigorously validated against turning-test data for the S-175 container ship, with the errors of the key parameters all controlled within 15%. Subsequently, under the environmental conditions of Yangshan Port, full-condition path-planning and berthing simulations were conducted for the novel B-573 container ship under steady-current disturbances with multiple intensity levels (0 to 1.5 m/s) and multiple flow directions. Quantitative evaluation shows that, under the highly challenging current condition of 1.0 m/s, the dynamic corrective mechanism effectively drives the global mean absolute error (MAE) to converge to 85.50 m, representing a 62% statistical reduction relative to the transient peak value. In addition, a parameter sensitivity analysis based on the cumulative cross-track error confirms that, when subject to variations in the underlying hydrodynamic parameters, the proposed system can suppress fluctuations in trajectory error to a very low level, thereby demonstrating a certain degree of control robustness. 
During the terminal berthing stage, the vessel smoothly completed an extreme deceleration from an initial speed of 6.4 m/s to a full stop within 588 s, while constraining the maximum astern rotational speed to −2 rps and seamlessly passing through all four propeller quadrants. The results confirm that the proposed autopilot framework possesses a certain degree of engineering feasibility in complex maritime environments. Full article
(This article belongs to the Special Issue Advanced Modeling and Intelligent Control of Marine Vehicles)
19 pages, 30364 KB  
Article
CLIP-Mono3D: End-to-End Open-Vocabulary Monocular 3D Object Detection via Semantic–Geometric Similarity
by Zichong Gu, Shiyi Mu, Hanqi Lyu and Shugong Xu
Sensors 2026, 26(8), 2380; https://doi.org/10.3390/s26082380 - 13 Apr 2026
Abstract
Open-vocabulary 3D object detection (OV-3DOD) is crucial for real-world perception, yet existing monocular methods are often limited by predefined categories or heavy reliance on external 2D detectors. In this paper, we propose CLIP-Mono3D, an end-to-end one-stage transformer framework that directly integrates vision–language semantics into monocular 3D detection. By leveraging CLIP-derived semantic priors and grounding object queries in semantically salient regions, our model achieves robust zero-shot generalization to novel categories without requiring auxiliary 2D detectors. Furthermore, we introduce OV-KITTI, a large-scale benchmark extending KITTI with 40 new categories and over 7000 annotated 3D bounding boxes. Extensive experiments on OV-KITTI, KITTI, and Argoverse demonstrate that CLIP-Mono3D achieves competitive performance in open-vocabulary scenarios. Full article
(This article belongs to the Section Sensing and Imaging)
24 pages, 12711 KB  
Article
Evidentially Driven Uncertainty Decomposition for Weakly Supervised Point Cloud Semantic Segmentation
by Qingyan Wang, Yixin Wang, Junping Zhang, Yujing Wang and Shouqiang Kang
ISPRS Int. J. Geo-Inf. 2026, 15(4), 167; https://doi.org/10.3390/ijgi15040167 - 12 Apr 2026
Abstract
Point cloud semantic segmentation is a core component in indoor scene understanding and autonomous driving. Under weak point-level supervision, only a small subset of points is annotated, making effective use of unlabeled points critical yet non-trivial. Many existing approaches rely on prediction confidence to filter pseudo labels or enforce consistency, which can bias training toward easy points and amplify early mistakes. Consequently, confidently wrong predictions may be reinforced, while uncertain points around class boundaries or in geometrically complex regions are less utilized, limiting further gains. An evidential uncertainty decomposition framework is introduced for weakly supervised point cloud semantic segmentation. Network outputs are interpreted as evidential distributions, and uncertainty is decomposed to separate lack-of-knowledge uncertainty from boundary-related ambiguity, providing a more informative reliability signal for unlabeled points. Based on this signal, different constraints are applied to different subsets: reliable points are trained with pseudo labels together with prototype-based regularization to encourage intra-class compactness; boundary-ambiguous points are guided by evidential consistency to improve boundary learning; and points with high epistemic uncertainty are excluded from pseudo-label-based supervision to mitigate error reinforcement. In addition, an uncertainty calibration term on sparsely labeled points helps stabilize training. Experiments on S3DIS, ScanNet-V2, and SemanticKITTI yield 67.7%, 59.7%, and 53.3% mIoU, respectively, with only 0.1% labeled points, comparing favorably with prior weakly supervised point cloud segmentation methods. Full article
(This article belongs to the Special Issue Indoor Mobile Mapping and Location-Based Knowledge Services)
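The abstract describes interpreting network outputs as evidential distributions and separating lack-of-knowledge uncertainty from boundary-related ambiguity, but it gives no formulas. As a rough sketch only, the code below uses two quantities common in the evidential deep learning literature: vacuity for missing evidence and dissonance for conflicting evidence. Whether the paper uses exactly these measures is an assumption, and all names here are hypothetical.

```python
def beliefs_and_vacuity(evidence):
    """Map non-negative per-class evidence to belief masses and vacuity.

    Dirichlet parameters are alpha_k = e_k + 1, with strength
    S = sum(alpha). Belief b_k = e_k / S and vacuity u = K / S,
    so sum(b) + u == 1.
    """
    if not evidence or any(e < 0 for e in evidence):
        raise ValueError("evidence must be a non-empty list of values >= 0")
    num_classes = len(evidence)
    strength = sum(evidence) + num_classes
    beliefs = [e / strength for e in evidence]
    return beliefs, num_classes / strength


def dissonance(beliefs):
    """Conflict among belief masses: high when several classes hold
    similar, non-zero belief (as may happen near a class boundary)."""
    total = 0.0
    for k, b_k in enumerate(beliefs):
        others = [b_j for j, b_j in enumerate(beliefs) if j != k]
        denom = sum(others)
        if b_k == 0 or denom == 0:
            continue
        # Relative balance is 1 for equal masses and tends to 0 for unequal ones.
        balance = sum(b_j * (1 - abs(b_j - b_k) / (b_j + b_k))
                      for b_j in others if b_j > 0)
        total += b_k * balance / denom
    return total


# Three hypothetical points in a 3-class problem.
unknown_b, unknown_u = beliefs_and_vacuity([0.0, 0.0, 0.0])     # no evidence
boundary_b, boundary_u = beliefs_and_vacuity([10.0, 10.0, 0.0])  # two classes compete
confident_b, confident_u = beliefs_and_vacuity([20.0, 0.0, 0.0])  # one class dominates
```

In a scheme like the one the abstract outlines, points with high vacuity would be left out of pseudo-label supervision, while points with low vacuity but high dissonance would be routed to a boundary-oriented consistency term. Any thresholds for that routing would have to come from the paper itself.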
13 pages, 1462 KB  
Article
Interpretable Vision Transformers in Monocular Depth Estimation via SVDA
by Vasileios Arampatzakis, George Pavlidis, Nikolaos Mitianoudis and Nikos Papamarkos
Mathematics 2026, 14(8), 1272; https://doi.org/10.3390/math14081272 - 11 Apr 2026
Abstract
Monocular depth estimation is a central problem in computer vision with applications in robotics, augmented reality, and autonomous driving, yet the self-attention mechanisms used by modern Transformer architectures remain opaque. In this work, we integrate SVD-Inspired Attention (SVDA) into the Dense Prediction Transformer (DPT), introducing a spectrally structured attention formulation for dense prediction that decouples directional alignment from spectral modulation through a learnable diagonal matrix embedded in normalized query–key interactions. Experiments on KITTI and NYU-v2 show that SVDA preserves competitive predictive performance while enabling intrinsic interpretability: on KITTI, AbsRel improves from 0.058 to 0.056 and δ1 from 0.976 to 0.979, while on NYU-v2, AbsRel improves from 0.133 to 0.124 and δ1 from 0.865 to 0.872. This is achieved with only 0.01% additional parameters, at the cost of a measurable runtime overhead associated with the added normalization and spectral modulation. More importantly, SVDA enables six spectral indicators that quantify entropy, rank, sparsity, alignment, selectivity, and robustness, revealing consistent cross-dataset and depth-wise patterns in how attention organizes during training. These properties make the model easier to inspect and better suited to applications where transparency and reliability are important, such as robotics and autonomous navigation. Full article
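For readers unfamiliar with the two metrics quoted in this abstract, the sketch below computes them using their conventional definitions in the monocular depth literature: AbsRel is the mean of |d − d*| / d*, and δ1 is the fraction of pixels with max(d/d*, d*/d) < 1.25. The function names and sample values are illustrative assumptions, and benchmark-specific details such as depth caps or crops are omitted.

```python
def _valid_pairs(pred, gt):
    """Keep pixels with positive ground-truth and predicted depth."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0 and p > 0]
    if not pairs:
        raise ValueError("no valid depth pairs")
    return pairs


def abs_rel(pred, gt):
    """Mean absolute relative error; lower is better."""
    pairs = _valid_pairs(pred, gt)
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


def delta1(pred, gt, threshold=1.25):
    """Fraction of pixels whose depth ratio is within the threshold;
    higher is better."""
    pairs = _valid_pairs(pred, gt)
    hits = sum(1 for p, g in pairs if max(p / g, g / p) < threshold)
    return hits / len(pairs)


# Two hypothetical pixels: one exact, one predicted at half the true depth.
predicted = [2.0, 4.0]
ground_truth = [2.0, 8.0]
error = abs_rel(predicted, ground_truth)
accuracy = delta1(predicted, ground_truth)
```

Read this way, the reported KITTI change from 0.058 to 0.056 AbsRel means the average relative depth error fell from about 5.8% to 5.6%.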
23 pages, 2839 KB  
Article
A Reference-Free Lens-Flare-Aware Detector for Autonomous Driving
by Shanxing Ma, Tim Willems, Wenwen Ma, Marwan Yusuf, David Van Hamme, Jan Aelterman and Wilfried Philips
Sensors 2026, 26(8), 2359; https://doi.org/10.3390/s26082359 - 11 Apr 2026
Viewed by 176
Abstract
As autonomous driving technology advances, the deployment of autonomous vehicles in urban environments is rapidly increasing. Lens flare, an often overlooked optical artifact in object detection research, can lead to increased false positives or missed detections, particularly in the challenging conditions inherent to autonomous driving. Current mitigation methods are often ill-suited for real-time implementation. This work proposes a solution to alleviate the adverse effects of lens flare by utilizing a lightweight lens flare perception network, eliminating the need for additional hardware or complex image pre-processing. Specifically, we propose a reference-free model utilizing a ResNet18 backbone integrated with a lightweight Multi-Layer Perceptron (MLP) to extract and leverage lens flare information. This model is developed via a teacher–student framework, in which it is distilled from an end-to-end reference-based model optimized using the Learned Perceptual Image Patch Similarity (LPIPS) metric. Our experiments demonstrate that incorporating lens flare information significantly enhances the performance of the baseline object detection network, outperforming previous mitigation methods by a substantial margin. The proposed method can be seamlessly integrated into existing object detectors and requires only an efficient training process, facilitating its deployment in practical autonomous driving tasks.
(This article belongs to the Section Vehicular Sensing)

19 pages, 4757 KB  
Article
SCSANet: Split Convolution Selective Attention Network of Drivable Area Detection for Mobile Robots
by Maozhang Ye, Xiaoli Li, Jidong Dai, Hongyi Li, Zhouyi Xu and Chentao Zhang
Eng 2026, 7(4), 176; https://doi.org/10.3390/eng7040176 - 11 Apr 2026
Viewed by 160
Abstract
Detecting drivable areas is a fundamental task in autonomous driving systems. Although semantic segmentation networks have demonstrated strong performance in segmenting drivable regions, two key challenges persist. First, acquiring sufficient contextual information in complex road scenarios remains difficult, often leading to segmentation errors. Second, the coarseness of extracted features may degrade accuracy even when texture information is available in RGB images. To address these issues, we propose an enhanced DeepLabv3+ algorithm called Split Convolution Selective Attention Network (SCSANet), which incorporates the Adaptive Kernel (AK) and Split Convolution Attention (SCA) modules. AK adaptively adjusts the receptive field to accommodate varying road scenarios, while SCA improves boundary clarity by enhancing channel interaction. In addition, we employ surface normals to provide complementary geometric information, thereby strengthening the ability of the network to recognize drivable areas. To compensate for the lack of publicly available datasets for closed or semi-closed scenarios, we introduce XMUROAD, a new dataset of binocular disparity images. Experiments on the XMUROAD dataset demonstrate that the proposed architectural improvements yield an mIoU gain of 1.63% under the same RGB input, and the full pipeline with surface normal input achieves improvements of 1.55% to 2.59% in mF1 and 2.94% to 4.83% in mIoU over state-of-the-art methods. Experiments on the KITTI dataset further verify the generalization capability of SCSANet, with improvements of 1.58% in mF1 and 2.88% in mIoU over state-of-the-art methods. The proposed method provides a practical approach for accurate drivable area detection in closed and semi-closed mobile-robot scenarios.
(This article belongs to the Special Issue Artificial Intelligence for Engineering Applications, 2nd Edition)