Search Results (3,860)

Search Parameters:
Keywords = autonomous driving

22 pages, 742 KB  
Article
Bounded Graph Conditioning for LiDAR 3D Object Detection Under Sensor Degradation
by Xiuping Li, Xiyan Sun, Jingjing Li, Yuanfa Ji and Wentao Fu
Sensors 2026, 26(9), 2667; https://doi.org/10.3390/s26092667 (registering DOI) - 25 Apr 2026
Abstract
Light Detection and Ranging (LiDAR) three-dimensional (3D) object detection degrades under point sparsity, outliers, coordinate noise, and calibration drift, yet detector evaluation remains largely limited to clean benchmarks. This study focuses on sensing robustness rather than detector redesign. We introduce Bounded Graph Conditioning (BGC), a deterministic pre-voxelization front-end that applies k-nearest-neighbor (kNN) neighborhood averaging with bounded residual correction upstream of an unchanged detector backbone. BGC is evaluated together with a reproducible sensor-degradation stress protocol and a risk-constrained operating-boundary analysis. Experiments on KITTI with PointPillars, SECOND, and Voxel R-CNN show that BGC most clearly improves retained detection quality and feasible operating coverage under strong noise and strong outlier stress; gains under other degradation types are smaller and backbone-dependent. In the primary score-level box-disjoint calibration/test evaluation on SECOND, maximum feasible coverage at a target risk bound of 0.2 improves from 0.0754 to 0.1374 under strong noise (σ=0.10 m) and from 0.1323 to 0.1591 under strong outliers (p=0.10); a cross-backbone check on Voxel R-CNN confirms the same direction (0.1860 → 0.2864). Comparison with traditional filtering (SOR and ROR) reveals complementary strengths across fault types. A range-adaptive BGC variant that adjusts parameters per distance bin further improves performance under mixed unknown faults, spherical-coordinate noise, and on a dataset-matched nuScenes validation (adaptive BGC mAP/NDS: 0.2687/0.4493 vs. baseline 0.2471/0.3846 under strong noise). Severe translation drift collapses all configurations to full rejection, exposing an explicit sensing boundary beyond the reach of local conditioning.
These results support BGC as a practical sensor-side robustness enhancement under the studied degradation protocol, with conditional rather than universal applicability across backbones and fault types. Full article
(This article belongs to the Section Radar Sensors)
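The BGC front-end described above (kNN neighborhood averaging with a bounded residual correction) can be sketched roughly as follows; the brute-force neighbor search, the clipping rule, and the `delta_max` bound are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def bgc_condition(points: np.ndarray, k: int = 8, delta_max: float = 0.05) -> np.ndarray:
    """Sketch of kNN neighborhood averaging with a bounded residual correction.

    Each point is pulled toward the mean of its k nearest neighbors, but the
    correction vector is clipped to length `delta_max` (the bound), so clean
    geometry is perturbed at most slightly while outliers are damped.
    """
    # Pairwise squared distances (brute force; fine for a sketch, not full scans).
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)                  # exclude self from neighbors
    nn = np.argsort(d2, axis=1)[:, :k]            # indices of k nearest neighbors
    means = points[nn].mean(axis=1)               # neighborhood means
    resid = means - points                        # proposed correction vectors
    norms = np.linalg.norm(resid, axis=1, keepdims=True)
    scale = np.minimum(1.0, delta_max / np.maximum(norms, 1e-12))
    return points + resid * scale                 # bounded correction
```

Because the correction is clipped, well-supported points barely move while an isolated outlier is shifted by at most `delta_max`.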
35 pages, 13122 KB  
Article
A Three-Dimensional LiDAR Observability Framework for Pedestrian Representation: Sensor Placement and Multi-View Fusion on a Compact Autonomous Vehicle
by Juan Diego Valladolid, Juan P. Ortiz, Franklin Castillo, José Vuelvas and Chuan Yu
Sensors 2026, 26(9), 2670; https://doi.org/10.3390/s26092670 (registering DOI) - 25 Apr 2026
Abstract
Reliable pedestrian perception in autonomous driving depends not only on detecting the target, but also on how completely and consistently its three-dimensional geometry is captured from different sensor viewpoints. This study presents a LiDAR-based observability framework for evaluating pedestrian representation on the ANTA compact autonomous vehicle platform using a roof-mounted Top LiDAR (TL), a Front-Right LiDAR (FRL), and their fused configuration. The pedestrian was analyzed in a canonical local frame using geometric extent ratios, projected surface occupancy, voxel-based volumetric occupancy, and statistical descriptors of the local point distribution, integrated into a global observability score, S3D. A Distance-Robustness Index (DRI), an overlap-based complementarity analysis, and a lightweight temporal centroid-sensitivity check over 20 consecutive frames were used to characterize performance across distance. Using ROS 2 bag data processed offline in MATLAB R2025b, the fused configuration achieved the highest mean global score (0.563), compared with 0.504 for FRL and 0.432 for TL, and the highest robustness (DRI=0.5628, CV=10.7%). The results show that 1 m maximizes local density, 2–3 m maximize projected and volumetric completeness, and 7 m provides the best balanced observability. Within the evaluated platform and under the controlled benchmark conditions, complementary multi-LiDAR fusion provided the strongest overall geometry-aware pedestrian representation. Full article
(This article belongs to the Special Issue Sensor Fusion for the Safety of Automated Driving Systems)
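Among the descriptors listed, voxel-based volumetric occupancy is straightforward to sketch: the fraction of voxels inside the cloud's bounding box that contain at least one point. The voxel size and the bounding-box normalization below are illustrative assumptions, not the paper's exact definition.

```python
import numpy as np

def voxel_occupancy(points: np.ndarray, voxel: float = 0.1) -> float:
    """Fraction of occupied voxels within the point cloud's bounding box.

    A denser, more completely observed target fills a larger fraction of
    its bounding volume, giving a higher occupancy score.
    """
    lo, hi = points.min(0), points.max(0)
    dims = np.maximum(np.ceil((hi - lo) / voxel).astype(int), 1)
    # Map each point to a voxel index, clamping the upper boundary.
    idx = np.minimum(((points - lo) / voxel).astype(int), dims - 1)
    occupied = len({tuple(i) for i in idx})
    return occupied / dims.prod()
```

With the eight corners of a unit cube and 0.5 m voxels, every one of the 2×2×2 voxels is occupied, so the score is 1.0.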
22 pages, 2316 KB  
Article
MVDFusion: Multimodal Vehicle Detection in Foggy Weather Using LiDAR and Radar Fusion
by Jiake Tian, Yan Gao, Xin Xia, Guoliang Ju, Peijun Ye, Sijie Tang, Hong Wang and Xucong Wang
Sensors 2026, 26(9), 2663; https://doi.org/10.3390/s26092663 (registering DOI) - 25 Apr 2026
Abstract
Millimeter-wave (mmWave) radar is widely used for vehicle detection in adverse weather conditions due to its robustness against environmental interference. However, the sparsity of mmWave radar data and the lack of height information significantly limit its broader applicability. To address these challenges, we propose MVDFusion, a multi-modal vehicle detection framework that integrates LiDAR and radar data for robust perception in foggy environments. The proposed framework is designed to fully exploit LiDAR information to compensate for the limitations of sparse radar data. Specifically, two key modules are developed: a radar height query module to enhance height estimation, and a radar–LiDAR query fusion module to improve feature representation. This design enables deep feature-level integration of mmWave radar and LiDAR data. Extensive experiments on the Oxford Radar RobotCar dataset demonstrate that MVDFusion achieves superior performance and robustness under foggy conditions. In particular, it outperforms existing state-of-the-art methods at intersection-over-union thresholds of 0.5, 0.65, and 0.8, achieving detection accuracies of 95.8%, 94.2%, and 81.5%. Full article
(This article belongs to the Section Sensing and Imaging)
51 pages, 7385 KB  
Article
Spiking Neural Networks with Continual Learning for Steering Angle Regression: A Sustainable AI Perspective
by Fernando S. Martínez, Sergio Costa and Raúl Parada
Sensors 2026, 26(9), 2656; https://doi.org/10.3390/s26092656 - 24 Apr 2026
Abstract
This work explores the application of Spiking Neural Networks (SNNs) and Continual Learning (CL) methodologies to the problem of steering angle regression, using autonomous driving simulation as the experimental context, with a focus on energy efficiency and alignment with sustainable computing objectives. The primary goal was to design and implement CL techniques in SNNs to assess the model’s ability to maintain accuracy in explored environments while reducing CO2 emissions through the optimized use of a subset of the data. This study emerges in response to the increasing energy demand of deep learning models, which poses a challenge to sustainability. SNNs, inspired by the efficiency of biological neural systems, offer significant advantages in terms of computational and energy consumption, making them a promising alternative. CL techniques, such as Elastic Weight Consolidation and replay memory, are integrated to mitigate catastrophic forgetting in sequential learning tasks. The methodology includes adapting the PilotNet architecture for SNNs, preprocessing datasets generated in the Udacity driving simulator, and evaluating models in incremental learning scenarios. The experiments compare the performance of SNNs with CL against baseline models without CL, using mean squared error (MSE), computational efficiency, and equivalent CO2 emissions as evaluation metrics. The results demonstrate that replay memory enables the retention of prior knowledge with a limited increase in energy consumption. This work concludes that SNNs with CL are a viable alternative for sustainable AI applications. Future research directions include a focus primarily on hardware-specific implementations and real-world testing. Full article
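Of the CL techniques named, replay memory is the simplest to illustrate. A reservoir-sampling buffer, one common realization (not necessarily the one used in the paper), keeps a bounded, uniformly random subset of all samples seen, so earlier driving environments stay represented during later training:

```python
import random

class ReplayMemory:
    """Reservoir-sampling replay buffer for continual learning.

    Keeps at most `capacity` samples, each sample ever added having equal
    probability of being retained, which mitigates catastrophic forgetting
    when training tasks arrive sequentially.
    """
    def __init__(self, capacity: int, seed: int = 0):
        self.capacity = capacity
        self.buffer = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, sample):
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(sample)
        else:
            j = self.rng.randrange(self.seen)   # classic reservoir step
            if j < self.capacity:
                self.buffer[j] = sample

    def sample(self, n: int):
        """Draw a replay minibatch (without replacement)."""
        return self.rng.sample(self.buffer, min(n, len(self.buffer)))
```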
21 pages, 2137 KB  
Article
Adaptive Multi-Level 3D Multi-Object Tracking with Transformer-Based Association and Scene-Aware Thresholds for Autonomous Driving
by Yongze Zhang, Feipeng Da and Haocheng Zhou
Machines 2026, 14(5), 472; https://doi.org/10.3390/machines14050472 - 23 Apr 2026
Abstract
3D multi-object tracking (MOT) for autonomous driving remains challenging due to frequent identity switches in crowded scenes, trajectory fragmentation during occlusions, and the difficulty of adapting association strategies to varying scene complexities. While existing methods rely on fixed geometric or appearance-based associations, they struggle to handle ambiguous cases and detection failures. We present an adaptive multi-level 3D MOT framework that achieves robust tracking through three key innovations: (1) multi-granularity temporal modeling that captures both fine-grained short-term motion and coarse long-term trends via dual-scale spatio-temporal attention, enabling accurate motion prediction across different object dynamics; (2) Transformer-based Appearance Association that employs cross-attention to model global inter-object relationships, resolving ambiguous associations in crowded scenarios where geometric cues alone fail; and (3) scene-adaptive learned thresholds that automatically adjust association strictness based on object density, motion complexity, and occlusion levels, avoiding the one-size-fits-all limitations of fixed thresholds. Our hierarchical four-level tracking strategy progressively handles cases from easy geometric matching (Level 1) to complex interval-frame recovery (Level 4), with SOT-based virtual detection generation bridging detector failures. Extensive experiments on the nuScenes benchmark demonstrate state-of-the-art performance. Full article
(This article belongs to the Section Vehicle Engineering)
24 pages, 1594 KB  
Article
RMP-YOLO: Robust Multi-Scale Pedestrian Detection for Dense Scenarios
by Chenyang Gui, Zhangyu Fan, Taibin Duan and Junhao Wen
Sensors 2026, 26(9), 2621; https://doi.org/10.3390/s26092621 - 23 Apr 2026
Abstract
With the rapid advancement of autonomous driving in modern society, dense pedestrian detection technology has encountered performance bottlenecks. To address this, we propose a robust and lightweight pedestrian detection algorithm, RMP-YOLO, designed to efficiently detect small, occluded, and low-light objects. Firstly, RFAConv is utilized as the core component of the backbone network, combining standard convolution with attention mechanisms and using group convolution to extract features from the spatial receptive field. Secondly, MobileViTv3 is introduced into the backbone to combine CNNs with Transformers. The model is further enhanced by adjusting feature fusion, introducing residual connections, and optimizing local representation with deep convolutional layers. Finally, the PIoUv2 loss function is employed for bounding-box regression, significantly reducing detection errors for small-scale pedestrians in crowded environments. Experimental results demonstrate that RMP-YOLO improves mAP@0.5 by 1.3% on a custom dataset and 0.91% on the WiderPerson dataset. Crucially, it maintains high efficiency with only 3.71 million parameters and 6.29 GFLOPs, meeting the deployment requirements for low computational power and high precision. Full article
(This article belongs to the Section Sensing and Imaging)
32 pages, 3533 KB  
Article
Multi-Objective Trajectory Optimization Method for Connected Autonomous Vehicles Based on Risk Potential Field
by Kedong Wang, Dayi Qu, Ziyi Yang, Yuxiang Yang and Shanning Cui
Mathematics 2026, 14(9), 1415; https://doi.org/10.3390/math14091415 - 23 Apr 2026
Abstract
The planning of trajectories for Connected Autonomous Vehicles (CAVs) represents a pivotal aspect of autonomous driving technologies, enabling secure navigation within traffic environments. Traditional models for trajectory control primarily focus on the efficiency and safety of individual vehicles but often overlook the dynamics involved in vehicle-to-vehicle and vehicle-to-infrastructure interactions. This study introduces a novel concept, the “driving risk field,” which imposes constraints on vehicular movement within designated road spaces to enhance safety. A vehicle dynamics model is developed, employing a non-linear fifth-degree polynomial to approximate the trajectory curves, with optimization performed using the Sequential Quadratic Programming (SQP) method. The efficacy of the optimized model is validated through simulations on the Prescan/Simulink platform. Full article
(This article belongs to the Special Issue Advanced Methods in Intelligent Transportation Systems, 2nd Edition)
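A fifth-degree (quintic) polynomial trajectory segment is fully determined by position, velocity, and acceleration at both endpoints, which reduces to a 6×6 linear solve; the SQP optimization layered on top of such segments is omitted here. A minimal sketch:

```python
import numpy as np

def quintic_coeffs(x0, v0, a0, xT, vT, aT, T):
    """Coefficients c of x(t) = sum_i c[i] * t**i (i = 0..5) matching
    position, velocity, and acceleration at t = 0 and t = T."""
    A = np.array([
        [1, 0, 0,      0,       0,        0],        # x(0)
        [0, 1, 0,      0,       0,        0],        # x'(0)
        [0, 0, 2,      0,       0,        0],        # x''(0)
        [1, T, T**2,   T**3,    T**4,     T**5],     # x(T)
        [0, 1, 2*T,    3*T**2,  4*T**3,   5*T**4],   # x'(T)
        [0, 0, 2,      6*T,     12*T**2,  20*T**3],  # x''(T)
    ], dtype=float)
    b = np.array([x0, v0, a0, xT, vT, aT], dtype=float)
    return np.linalg.solve(A, b)
```

For example, a rest-to-rest 10 m lane segment over 5 s starts and ends with zero velocity and acceleration by construction.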
21 pages, 908 KB  
Article
Hierarchical Semantic Transmission and Lyapunov-Optimized Online Scheduling for the Internet of Vehicles
by Le Jiang, Yani Guo, Wenzhao Zhang, Penghao Wang and Shujun Han
Sensors 2026, 26(9), 2606; https://doi.org/10.3390/s26092606 - 23 Apr 2026
Abstract
The inherent redundancy in vehicle sensor data, coupled with constrained onboard resources and stringent latency requirements, renders traditional bit-oriented transmission paradigms inefficient for autonomous-driving perception tasks. Semantic communication offers a promising direction by shifting the focus from bit-level fidelity to task-level information delivery. In this paper, we propose a unified framework that integrates hierarchical transmission and online scheduling for Internet of Vehicles (IoV)-oriented collaborative perception. The proposed hierarchy separates information into two complementary layers: a coarse metadata layer (object bounding boxes) for latency-critical awareness, and fine-grained visual semantics (multi-scale region-of-interest (ROI) patches) for perception-intensive tasks. We formulate an online scheduling problem that jointly exploits Age of Information (AoI) and Channel State Information (CSI) to dynamically decide what to transmit and at what fidelity under per-frame budget constraints. To address cross-scheme fairness, we report resource utilization under a fixed kbps/fps physical budget and evaluate robustness using a combination of a lightweight task-proxy metric and COCO-style Average Recall (AR100) under ROI-only evaluation. The hierarchical transmission architecture, combined with AoI awareness, reduces global semantic staleness by approximately 78%. The Lyapunov-based online scheduler enables intelligent, signal-to-noise ratio (SNR)-adaptive switching between coarse and fine semantic levels, ensuring robust perception under varying channel quality. Under strict physical-budget constraints and unreliable channel conditions, joint source-channel coding (JSCC) exhibits significantly stronger task robustness than conventional schemes: at 0 dB SNR, the task-proxy detection rate improves by nearly 47 percentage points over the uncoded baseline. Full article
(This article belongs to the Section Sensor Networks)
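The Lyapunov-based scheduling idea can be caricatured as a drift-plus-penalty step: each frame, pick the semantic level that minimizes a weighted sum of the staleness (AoI) penalty and the virtual-queue-scaled bandwidth cost, then update the budget backlog. The costs, gains, and SNR discount below are toy stand-ins, not the paper's formulation.

```python
COST = {"coarse": 0.3, "fine": 1.5}   # hypothetical bandwidth costs per frame
GAIN = {"coarse": 0.4, "fine": 1.0}   # hypothetical semantic usefulness

def schedule_frame(aoi, snr_db, q, budget=1.0, v=10.0):
    """One drift-plus-penalty decision.

    Minimizes v * (staleness left after transmission) + q * (cost), where q
    is a virtual queue enforcing the long-run budget. At low SNR the fine
    level decodes poorly, so its effective gain is discounted.
    Returns (chosen level, updated virtual-queue backlog).
    """
    best, best_score = None, float("inf")
    for level in ("coarse", "fine"):
        eff = GAIN[level] * (1.0 if snr_db >= 5 else 0.5)  # crude SNR discount
        penalty = aoi * (1.0 - eff)          # staleness remaining afterwards
        score = v * penalty + q * COST[level]
        if score < best_score:
            best, best_score = level, score
    q_next = max(q + COST[best] - budget, 0.0)
    return best, q_next
```

The behavior matches the intuition: high AoI with an empty backlog favors the fine level, while a large backlog pushes the scheduler back to coarse metadata.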
26 pages, 6322 KB  
Article
Real-Time, Reconfigurable CAN Intrusion Detection for EV Powertrain Networks via Specification-Driven Timing and Integrity Constraints
by Engin Subaşı and Muharrem Mercimek
Electronics 2026, 15(9), 1788; https://doi.org/10.3390/electronics15091788 - 22 Apr 2026
Abstract
The Controller Area Network (CAN) remains the backbone of in-vehicle communication, but its lack of built-in security exposes safety-critical systems to cyberattacks. This paper presents a real-time, reconfigurable, specification-driven intrusion detection system (IDS) implemented on a custom test bench that emulates an EV powertrain. The CAN traffic captured from the four-ECU setup formed the dataset used in this study. The IDS enforces a compact, reconfigurable ruleset covering timing bounds, jitter envelopes, identifier whitelists, frame format, data length code (DLC) compliance, bus-load thresholds, application-level CRC, and alive-counter verification. The IDS achieves detection times below 2 ms with false positive rates under 1% for injection, denial of service (DoS), and fuzzy attacks, even at CAN bus loads up to 70%, while microcontroller resource usage remains within the constraints of automotive-grade devices, supporting deployment in embedded environments. The main contributions of this study are as follows: (i) a validated and reproducible EV powertrain test bench with millisecond-level timing, (ii) a deployable and easily reconfigurable ruleset with deterministic runtime, and (iii) a latency-oriented evaluation framework that is portable across automotive microcontroller platforms. The EV powertrain dataset v1.0 was released in a public GitHub repository to facilitate reproducible research and enable future benchmarking studies. Full article
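A specification-driven ruleset of the kind described (identifier whitelist, DLC compliance, timing bounds with jitter envelopes) can be sketched as a per-frame checker; the rule table, message IDs, and verdict strings below are hypothetical, not the paper's ruleset.

```python
RULES = {   # illustrative specification: expected period, jitter bound, DLC
    0x101: {"period_ms": 10.0, "jitter_ms": 2.0, "dlc": 8},
    0x1A5: {"period_ms": 100.0, "jitter_ms": 10.0, "dlc": 4},
}

def check_frame(can_id, dlc, t_ms, last_seen):
    """Check one frame against the specification.

    Flags unknown identifiers, wrong data length codes, and frames that
    arrive well before the expected period minus the jitter envelope
    (a signature of injection). Returns (verdict, updated last_seen).
    """
    rule = RULES.get(can_id)
    if rule is None:
        return "alert:unknown-id", last_seen
    if dlc != rule["dlc"]:
        return "alert:dlc", last_seen
    prev = last_seen.get(can_id)
    last_seen = {**last_seen, can_id: t_ms}
    if prev is not None and (t_ms - prev) < rule["period_ms"] - rule["jitter_ms"]:
        return "alert:timing", last_seen
    return "ok", last_seen
```

Deterministic table lookups like this keep the worst-case runtime bounded, which is what makes the millisecond-level detection times plausible on automotive-grade microcontrollers.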
27 pages, 8631 KB  
Article
From Light Pulses to Selective Enhancement: Performance Analysis of Event-Based Object Detection Under Pulsed Automotive Headlight Illumination
by Leonard Haensel and Torsten Bertram
Sensors 2026, 26(9), 2595; https://doi.org/10.3390/s26092595 - 22 Apr 2026
Abstract
Pulse-width-modulated (PWM) automotive headlights enhance nighttime event-based camera detection, yet systematic parameter optimization for vulnerable road user detection remains unexplored. This study evaluates PWM frequency, duty cycle, light distribution, ego-vehicle speed, and ambient lighting under European New Car Assessment Programme-inspired crossing scenarios for cyclist and pedestrian detection. Results establish performance ranging from substantial improvements to severe degradation relative to continuous illumination. Cyclist detection achieves robust performance with high-frequency modulation across light distributions, while low-frequency operation with low beam produces severe degradation through background noise accumulation. Pedestrian detection requires high beam with street lighting enabled; low beam universally fails regardless of modulation parameters. Limited parameter combinations achieve simultaneous improvements for both targets. Detection performs optimally on retroreflective surfaces, while low-reflectivity clothing limits capability, requiring target-specific optimization. Full article
(This article belongs to the Special Issue Event-Driven Vision Sensor Architectures and Application Scenarios)
27 pages, 13498 KB  
Article
A Hierarchical Hybrid Trajectory Planning Method Based on a TTA-Driven Dynamic Risk Filtering Mechanism
by Tao Huang, Lin Hu, Jing Huang and Huakun Deng
Electronics 2026, 15(9), 1782; https://doi.org/10.3390/electronics15091782 - 22 Apr 2026
Abstract
To reduce the conservatism of local trajectory planning in dynamic road scenarios caused by redundant projection of predicted trajectories, this paper proposes a hierarchical hybrid trajectory-planning framework with a time-to-arrival (TTA)-driven dynamic risk-filtering mechanism. In the Frenet coordinate system, road boundaries, ego states, and static and dynamic obstacles are represented uniformly to construct an S–L fused risk field and an S–T spatiotemporal interaction graph, enabling the filtering of temporally irrelevant conflict regions based on TTA relationships. At the path-planning layer, risk-guided adaptive sampling is integrated with dynamic programming and quadratic programming to improve search efficiency and trajectory quality. At the speed-planning layer, spatiotemporal coordination is achieved through non-uniform discretization, safe-corridor extraction, and speed-profile optimization. Simulation results show that the proposed method generates safe, smooth, continuous, and executable local trajectories in scenarios involving static-obstacle avoidance, adjacent-vehicle cut-ins, non-motorized road-user crossings, and mixed multi-obstacle interactions, while reducing unnecessary deceleration and detours. Ablation results further indicate that adaptive sampling reduces the number of DP search nodes by approximately 50% and the average planning time by about 30%, while maintaining a nearly unchanged minimum safety distance. These findings demonstrate that the proposed framework effectively suppresses redundant conflict regions and improves planning efficiency, solution feasibility, and motion continuity without compromising safety. Full article
(This article belongs to the Section Electrical and Autonomous Vehicles)
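The TTA-driven filtering step can be illustrated as an interval-overlap test at each shared station along the path: a predicted conflict region is kept only if the obstacle's occupancy window can coincide with the ego vehicle's arrival window there. The window representation and the safety margin are assumptions for illustration.

```python
def tta_relevant(ego_window, obstacle_window, margin=0.5):
    """True if the ego arrival interval overlaps the obstacle occupancy
    interval (padded by a safety margin) at a shared station.
    Windows are (t_enter, t_exit) in seconds."""
    e0, e1 = ego_window
    o0, o1 = obstacle_window
    return e0 <= o1 + margin and o0 - margin <= e1

def filter_conflicts(ego_window, conflicts, margin=0.5):
    """Drop temporally irrelevant conflict regions before planning."""
    return [c for c in conflicts
            if tta_relevant(ego_window, c["window"], margin)]
```

Temporally disjoint conflicts, such as an obstacle that clears the station seconds before the ego vehicle arrives, are filtered out instead of inflating the risk field, which is exactly the conservatism reduction the abstract describes.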
22 pages, 7499 KB  
Article
Coupling Effects of Land Use Carbon Emissions and Ecological Security in Border Cities of Jilin Province, China
by Zhuxin Liu, Yang Han, Jiani Zhang, Xinning Huang and Ruohan Lu
Land 2026, 15(5), 692; https://doi.org/10.3390/land15050692 - 22 Apr 2026
Abstract
Rapid urbanization has led to a significant increase in land use carbon emission (LCE), putting great pressure on ecological security. The coupling relationship between LCE and the ecological security index (ESI) is the key to sustainable development. Based on land use/cover change (LUCC) and Open-Data Inventory for Anthropogenic Carbon dioxide (ODIAC) data, the LCE of the Jilin Border Cities (JLBCs) from 2013 to 2023 was estimated. Twenty-seven indicators were selected from both natural and socioeconomic aspects to evaluate the ESI using the Driving forces–Pressure–State–Impact–Response–Management (DPSIRM) model. The spatial interaction between LCE and ESI was analyzed using the coupling degree model and spatial autocorrelation. The results show that from 2013 to 2023, the main LCE areas in the JLBCs were concentrated in central urban districts, while the total LCE remained negative but exhibited a clear upward trend. The ESIs in Tonghua City and Baishan City have continued to improve, but those in Yanbian Autonomous Prefecture have gradually deteriorated, with ecological security warnings intensifying progressively toward the east. The spatial variation in the LCE–ESI coupling degree is significant, predominantly exhibiting low coupling with differences across scales. Within the study area, coupling degree shows a strong positive correlation, revealing distinct spatial clustering patterns dominated by low clusters and cold spots. Future efforts should focus on promoting low-carbon development models, strengthening protection and restoration, while implementing targeted measures to enhance the overall ecology of JLBCs. Full article
(This article belongs to the Section Land Use, Impact Assessment and Sustainability)
7 pages, 1321 KB  
Proceeding Paper
Sandstorm Image Reconstruction by Adaptive Prior, Selective Enhancement, and Sky Detection
by Hsiao-Chu Huang, Tzu-Jung Tseng and Jian-Jiun Ding
Eng. Proc. 2026, 134(1), 63; https://doi.org/10.3390/engproc2026134063 - 21 Apr 2026
Abstract
In sandstorm environments, a large number of suspended particles in the air absorb and scatter light, causing strong color bias, low contrast, and blurred details in images. These degradations reduce the reliability of computer vision applications in surveillance systems, intelligent transportation systems, unmanned aerial vehicle monitoring, and outdoor autonomous driving systems. A complete sandstorm image enhancement method was developed in this study by combining sky detection, color correction, contrast enhancement, and adaptive dark channel prior (ADCP) dehazing. The Lab color space was used to correct the color bias. The L channel was enhanced using normalized gamma correction and contrast-limited adaptive histogram equalization to improve brightness and contrast. The sky region was then detected to avoid over-processing, preserving its natural appearance. Finally, ADCP was applied to non-sky regions for further dehazing. Experiments show that the proposed method provides better subjective and objective performance compared to other algorithms. Full article
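The paper's color correction operates in Lab space; as a rough stand-in (an assumption, not the authors' method), a gray-world correction in RGB shows the same idea of neutralizing a sand-colored cast by equalizing per-channel means:

```python
import numpy as np

def gray_world_correct(img: np.ndarray) -> np.ndarray:
    """Toy color-cast correction in the spirit of the paper's Lab-space step.

    Rescales each channel so its mean matches the global mean, pulling a
    yellow sand cast back toward neutral. `img` is float RGB in [0, 1].
    """
    means = img.reshape(-1, 3).mean(axis=0)       # per-channel means
    target = means.mean()                         # neutral gray target
    gains = target / np.maximum(means, 1e-6)      # per-channel gains
    return np.clip(img * gains, 0.0, 1.0)
```

After correction, the three channel means coincide, so a uniformly sand-tinted frame becomes gray.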
28 pages, 7089 KB  
Article
Multi-Scale Context-Aware Network Implementation for Efficient Image Semantic Segmentation
by Yi Yang and Chong Guo
Appl. Sci. 2026, 16(8), 4033; https://doi.org/10.3390/app16084033 - 21 Apr 2026
Abstract
Image semantic segmentation is essential in autonomous driving, medical imaging, and remote sensing. While convolutional neural networks (CNNs) excel at local feature extraction and spatial structure modeling, their limited receptive fields restrict the capture of long-range dependencies and global semantic consistency. Transformers provide strong global modeling through self-attention but often lack local inductive bias and show weaker generalization on small datasets. To address these limitations, this paper proposes a Multi-Scale Context-aware Network (MSC-Net) for image semantic segmentation. Under an encoder–decoder framework, MSC-Net combines a convolutional backbone with a Multi-Scale Self-Attention module to integrate the complementary strengths of CNNs and attention mechanisms. The backbone extracts local texture and structural information and can adopt architectures such as MobileNet, Xception, DRN, and ResNet, while the attention module captures long-range dependencies and multi-scale contextual information. This design improves cross-layer feature collaboration, multi-scale feature fusion, and boundary quality while maintaining computational efficiency. Experimental results show that MSC-Net achieves 38.8% mIoU and 98.4% ACC under comparable computational settings. Compared with SegFormer and DeepLabV3+, the model improves mIoU by approximately +3.0 and +3.3 percentage points, respectively, while reducing FLOPs and parameter size. Full article
13 pages, 632 KB  
Article
AdaSeViLA: Adaptive Dynamic Temporal Window and Object-Aware Frame Selection for Video Question Answering
by Zehua Ji, Chao Zhang, Jian Huang, Siyang Li, Xudong Li and Zehao Li
Appl. Sci. 2026, 16(8), 4017; https://doi.org/10.3390/app16084017 - 21 Apr 2026
Abstract
Video question answering remains a challenging task that requires a sophisticated understanding of both visual content and temporal dynamics across video sequences. Current approaches typically rely on fixed temporal processing strategies and uniform frame-selection mechanisms, which fail to adapt to the diverse requirements of different question types and may overlook critical visual information. We propose AdaSeViLA, an adaptive framework that enhances video understanding through two key innovations: Adaptive Temporal Window Selection (ATWS) that dynamically adjusts the number of processed frames (3–12 frames) based on question-type classification, and Object-importance-Aware Frame Selection (OAFS) that combines global relevance with local visual saliency for enhanced frame identification. Our approach intelligently allocates computational resources based on question complexity while maintaining high accuracy through improved frame-selection mechanisms. Extensive experiments on three challenging VideoQA benchmarks demonstrate that AdaSeViLA achieves superior performance: 87.4% accuracy on MM-AU (+2.7% over SeViLA), 73.6% on NExT-QA (+0.4% improvement), and 61.6% on STAR (+0.6% gain), while providing up to 4× computational speedup for short-term tasks. These results validate the effectiveness of adaptive temporal processing and object-aware selection in advancing video question answering capabilities. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
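The ATWS idea (3–12 frames chosen by question type) can be caricatured as a mapping from question category to frame budget. The paper classifies question types with a learned model; the keyword routing below is purely illustrative.

```python
def select_temporal_window(question: str) -> int:
    """Adaptive Temporal Window Selection, sketched as keyword routing.

    Causal questions get the full 12-frame window, temporal-ordering
    questions a medium one, and simple descriptive questions the minimum
    of 3 frames, saving computation where little context is needed.
    """
    q = question.lower()
    if any(w in q for w in ("why", "cause", "because", "how did")):
        return 12            # causal reasoning: long temporal context
    if any(w in q for w in ("before", "after", "then", "order")):
        return 8             # temporal ordering: medium window
    return 3                 # descriptive: few frames suffice
```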