
Search Results (107)

Search Parameters:
Keywords = vehicle state encoding

16 pages, 2127 KB  
Article
VIPS: Learning-View-Invariant Feature for Person Search
by Hexu Wang, Wenlong Luo, Wei Wu, Fei Xie, Jindong Liu, Jing Li and Shizhou Zhang
Sensors 2025, 25(17), 5362; https://doi.org/10.3390/s25175362 - 29 Aug 2025
Viewed by 333
Abstract
Unmanned aerial vehicles (UAVs) have become indispensable tools for surveillance, enabled by their ability to capture multi-perspective imagery in dynamic environments. Among critical UAV-based tasks, cross-platform person search—detecting and identifying individuals across distributed camera networks—presents unique challenges. Severe viewpoint variations, occlusions, and cluttered backgrounds in UAV-captured data degrade the performance of conventional discriminative models, which struggle to maintain robustness under such geometric and semantic disparities. To address this, we propose view-invariant person search (VIPS), a novel two-stage framework combining Faster R-CNN with a view-invariant re-Identification (VIReID) module. Unlike conventional discriminative models, VIPS leverages the semantic flexibility of large vision–language models (VLMs) and adopts a two-stage training strategy to decouple and align text-based ID descriptors and visual features, enabling robust cross-view matching through shared semantic embeddings. To mitigate noise from occlusions and cluttered UAV-captured backgrounds, we introduce a learnable mask generator for feature purification. Furthermore, drawing from vision–language models, we design view prompts to explicitly encode perspective shifts into feature representations, enhancing adaptability to UAV-induced viewpoint changes. Extensive experiments on benchmark datasets demonstrate state-of-the-art performance, with ablation studies validating the efficacy of each component. Beyond technical advancements, this work highlights the potential of VLM-derived semantic alignment for UAV applications, offering insights for future research in real-time UAV-based surveillance systems. Full article
(This article belongs to the Section Remote Sensors)
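Two of the abstract's components can be sketched compactly: the learnable mask generator that purifies features, and the view prompts that inject the capturing perspective. The sketch below is a minimal illustration with untrained random parameters, not the VIPS architecture itself; all names and dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8  # feature dimension (illustrative)

# Hypothetical "learnable" parameters: one prompt vector per platform view
# (e.g. 0 = ground camera, 1 = UAV) and a projection for the mask generator.
view_prompts = rng.normal(size=(2, D))
mask_weights = rng.normal(size=(D, D))

def purify_and_prompt(feat, view_id):
    """Gate out noisy feature channels, then add a view prompt that
    encodes which perspective captured the image."""
    mask = 1.0 / (1.0 + np.exp(-feat @ mask_weights))  # sigmoid gate in [0, 1]
    purified = feat * mask                             # element-wise purification
    return purified + view_prompts[view_id]            # inject view information

feat = rng.normal(size=D)
ground = purify_and_prompt(feat, view_id=0)
aerial = purify_and_prompt(feat, view_id=1)
```

The same visual feature thus yields view-conditioned embeddings, which is the property the cross-view matching stage relies on.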

34 pages, 6708 KB  
Article
Unmanned Aerial Vehicle Tactical Maneuver Trajectory Prediction Based on Hierarchical Strategy in Air-to-Air Confrontation Scenarios
by Yuequn Luo, Zhenglei Wei, Dali Ding, Fumin Wang, Hang An, Mulai Tan and Junjun Ma
Aerospace 2025, 12(8), 731; https://doi.org/10.3390/aerospace12080731 - 18 Aug 2025
Viewed by 596
Abstract
The prediction of the tactical maneuver trajectory of target aircraft is an important component of unmanned aerial vehicle (UAV) autonomous air-to-air confrontation. To address the low accuracy and poor real-time performance of existing maneuver trajectory prediction methods, this paper establishes a hierarchical tactical maneuver trajectory prediction model that predicts maneuver trajectories based on predicted target tactical maneuver intentions. First, maneuver trajectory features and situation features are extracted from air-to-air confrontation simulation data to establish classification rules for maneuver units. Second, a tactical maneuver unit prediction model is established using a deep echo-state network based on an auto-encoder with an attention mechanism (DeepESN-AE-AM) to predict 21 basic maneuver units. Then, for these 21 basic maneuver units, a maneuver trajectory prediction model is established using a gated recurrent unit based on triangle search optimization with an attention mechanism (TSO-GRU-AM). Finally, by integrating the two prediction models, a hierarchical strategy is adopted to establish a tactical maneuver trajectory prediction model. A section of the confrontation trajectory is selected from the air-to-air confrontation simulation data for prediction, and the results show that the trajectory prediction error of the combination of DeepESN-AE-AM and TSO-GRU-AM is small and meets the accuracy requirements. The simulation results of three air-to-air confrontation scenarios show that the proposed trajectory prediction method helps UAVs accurately judge the confrontational situation and select high-quality maneuver strategies. Full article
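The hierarchical strategy can be illustrated with stand-in models: a unit-level classifier first decides which basic maneuver unit the target is flying, then a trajectory model conditioned on that unit extrapolates forward. The template-matching classifier and linear extrapolator below are deliberate simplifications, not the paper's DeepESN-AE-AM and TSO-GRU-AM networks.

```python
# Hypothetical maneuver-unit templates: heading change (rad) per time step.
MANEUVER_TEMPLATES = {
    "level_flight": 0.0,
    "left_turn": -0.1,
    "right_turn": 0.1,
}

def classify_unit(headings):
    """Stage 1: pick the template whose heading rate best matches history."""
    rate = (headings[-1] - headings[0]) / (len(headings) - 1)
    return min(MANEUVER_TEMPLATES, key=lambda u: abs(MANEUVER_TEMPLATES[u] - rate))

def predict_headings(headings, steps):
    """Stage 2: roll the classified unit's heading rate forward."""
    unit = classify_unit(headings)
    rate = MANEUVER_TEMPLATES[unit]
    return unit, [headings[-1] + rate * (k + 1) for k in range(steps)]

unit, future = predict_headings([0.0, 0.1, 0.2, 0.3], steps=3)
```

The point of the hierarchy is that stage 2 only has to model motion *within* a unit, which is a much easier regression problem than predicting raw trajectories directly.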

36 pages, 8958 KB  
Article
Dynamic Resource Target Assignment Problem for Laser Systems’ Defense Against Malicious UAV Swarms Based on MADDPG-IA
by Wei Liu, Lin Zhang, Wenfeng Wang, Haobai Fang, Jingyi Zhang and Bo Zhang
Aerospace 2025, 12(8), 729; https://doi.org/10.3390/aerospace12080729 - 17 Aug 2025
Viewed by 562
Abstract
The widespread adoption of Unmanned Aerial Vehicles (UAVs) in civilian domains, such as airport security and critical infrastructure protection, has introduced significant safety risks that necessitate effective countermeasures. High-Energy Laser Systems (HELSs) offer a promising defensive solution; however, when confronting large-scale malicious UAV swarms, the Dynamic Resource Target Assignment (DRTA) problem becomes critical. To address the challenges of complex combinatorial optimization problems, a method combining precise physical models with multi-agent reinforcement learning (MARL) is proposed. Firstly, an environment-dependent HELS damage model was developed. This model integrates atmospheric transmission effects and thermal effects to precisely quantify the required irradiation time to achieve the desired damage effect on a target. This forms the foundation of the HELS–UAV–DRTA model, which employs a two-stage dynamic assignment structure designed to maximize the target priority and defense benefit. An innovative MADDPG-IA (I: intrinsic reward, and A: attention mechanism) algorithm is proposed to meet the MARL challenges in the HELS–UAV–DRTA problem: an attention mechanism compresses variable-length target states into fixed-size encodings, while a Random Network Distillation (RND)-based intrinsic reward module delivers dense rewards that alleviate the extreme reward sparsity. Large-scale scenario simulations (100 independent runs per scenario) involving 50 UAVs and 5 HELS across diverse environments demonstrate the method’s superiority, achieving mean damage rates of 99.65% ± 0.32% vs. 72.64% ± 3.21% (rural), 79.37% ± 2.15% vs. 51.29% ± 4.87% (desert), and 91.25% ± 1.78% vs. 67.38% ± 3.95% (coastal). The method autonomously evolved effective strategies such as delaying decision-making to await the optimal timing and cross-region coordination. 
The ablation and comparison experiments further confirm MADDPG-IA’s superior convergence, stability, and exploration capabilities. This work bridges the gap between complex mathematical and physical mechanisms and real-time collaborative decision optimization. It provides an innovative theoretical and methodological basis for public-security applications. Full article
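The attention mechanism's role here — compressing a variable number of target states into one fixed-size encoding — can be sketched with plain softmax attention pooling. The query and target vectors below are illustrative values, not trained parameters.

```python
import math

def attention_pool(query, targets):
    """Compress len(targets) state vectors into one vector of len(query) width."""
    scores = [sum(q * t for q, t in zip(query, tgt)) for tgt in targets]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    z = sum(weights)
    weights = [w / z for w in weights]          # softmax over targets
    dim = len(query)
    return [sum(w * tgt[d] for w, tgt in zip(weights, targets))
            for d in range(dim)]

query = [1.0, 0.0, 0.5]
few = attention_pool(query, [[1, 2, 3], [4, 5, 6]])
many = attention_pool(query, [[1, 2, 3], [4, 5, 6], [7, 8, 9], [1, 1, 1]])
```

Whether the swarm contains 2 or 50 UAVs, the pooled encoding has the same width, so the policy network's input size is independent of swarm size.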

22 pages, 6556 KB  
Article
Multi-Task Trajectory Prediction Using a Vehicle-Lane Disentangled Conditional Variational Autoencoder
by Haoyang Chen, Na Li, Hangguan Shan, Eryun Liu and Zhiyu Xiang
Sensors 2025, 25(14), 4505; https://doi.org/10.3390/s25144505 - 20 Jul 2025
Cited by 1 | Viewed by 670
Abstract
Trajectory prediction under multimodal information is critical for autonomous driving, necessitating the integration of dynamic vehicle states and static high-definition (HD) maps to model complex agent–scene interactions effectively. However, existing methods often employ static scene encodings and unstructured latent spaces, limiting their ability to capture evolving spatial contexts and produce diverse yet contextually coherent predictions. To tackle these challenges, we propose MS-SLV, a novel generative framework that introduces (1) a time-aware scene encoder that aligns HD map features with vehicle motion to capture evolving scene semantics and (2) a structured latent model that explicitly disentangles agent-specific intent and scene-level constraints. Additionally, we introduce an auxiliary lane prediction task to provide targeted supervision for scene understanding and improve latent variable learning. Our approach jointly predicts future trajectories and lane sequences, enabling more interpretable and scene-consistent forecasts. Extensive evaluations on the nuScenes dataset demonstrate the effectiveness of MS-SLV, achieving a 12.37% reduction in average displacement error and a 7.67% reduction in final displacement error over state-of-the-art methods. Moreover, MS-SLV significantly improves multi-modal prediction, reducing the top-5 Miss Rate (MR5) and top-10 Miss Rate (MR10) by 26% and 33%, respectively, and lowering the Off-Road Rate (ORR) by 3%, as compared with the strongest baseline in our evaluation. Full article
(This article belongs to the Special Issue AI-Driven Sensor Technologies for Next-Generation Electric Vehicles)
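The structured latent model's key idea — separate latent blocks for agent-specific intent and scene-level constraints — can be sketched as two independent reparameterized Gaussian draws that are concatenated before decoding. Dimensions and the absence of a decoder are illustrative simplifications, not the MS-SLV design.

```python
import numpy as np

rng = np.random.default_rng(0)
Z_AGENT, Z_SCENE = 4, 4   # illustrative latent block sizes

def sample_latent(mu_agent, logvar_agent, mu_scene, logvar_scene):
    """Reparameterized draw from the two disentangled latent blocks."""
    eps_a = rng.normal(size=Z_AGENT)
    eps_s = rng.normal(size=Z_SCENE)
    z_agent = mu_agent + np.exp(0.5 * logvar_agent) * eps_a
    z_scene = mu_scene + np.exp(0.5 * logvar_scene) * eps_s
    return np.concatenate([z_agent, z_scene])

z = sample_latent(np.zeros(Z_AGENT), np.zeros(Z_AGENT),
                  np.ones(Z_SCENE), np.zeros(Z_SCENE))
```

Because the two blocks are sampled independently, intent can be resampled while scene constraints stay fixed (and vice versa), which is what makes the multimodal predictions both diverse and scene-consistent.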

28 pages, 19790 KB  
Article
HSF-DETR: A Special Vehicle Detection Algorithm Based on Hypergraph Spatial Features and Bipolar Attention
by Kaipeng Wang, Guanglin He and Xinmin Li
Sensors 2025, 25(14), 4381; https://doi.org/10.3390/s25144381 - 13 Jul 2025
Viewed by 615
Abstract
Special vehicle detection in intelligent surveillance, emergency rescue, and reconnaissance faces significant challenges in accuracy and robustness under complex environments, necessitating advanced detection algorithms for critical applications. This paper proposes HSF-DETR (Hypergraph Spatial Feature DETR), integrating four innovative modules: a Cascaded Spatial Feature Network (CSFNet) backbone with Cross-Efficient Convolutional Gating (CECG) for enhanced long-range detection through hybrid state-space modeling; a Hypergraph-Enhanced Spatial Feature Modulation (HyperSFM) network utilizing hypergraph structures for high-order feature correlations and adaptive multi-scale fusion; a Dual-Domain Feature Encoder (DDFE) combining Bipolar Efficient Attention (BEA) and Frequency-Enhanced Feed-Forward Network (FEFFN) for precise feature weight allocation; and a Spatial-Channel Fusion Upsampling Block (SCFUB) improving feature fidelity through depth-wise separable convolution and channel shift mixing. Experiments conducted on a self-built special vehicle dataset containing 2388 images demonstrate that HSF-DETR achieves mAP50 and mAP50-95 of 96.6% and 70.6%, respectively, representing improvements of 3.1% and 4.6% over baseline RT-DETR while maintaining computational efficiency at 59.7 GFLOPs and 18.07 M parameters. Cross-domain validation on VisDrone2019 and BDD100K datasets confirms the method’s generalization capability and robustness across diverse scenarios, establishing HSF-DETR as an effective solution for special vehicle detection in complex environments. Full article
(This article belongs to the Section Sensing and Imaging)

26 pages, 793 KB  
Article
Holistic Approach for Automated Reverse Engineering of Unified Diagnostics Service Data
by Nico Rosenberger, Nikolai Hoffmann, Alexander Mitscherlich and Markus Lienkamp
World Electr. Veh. J. 2025, 16(7), 384; https://doi.org/10.3390/wevj16070384 - 8 Jul 2025
Viewed by 562
Abstract
Reverse engineering of internal vehicle communication is a crucial discipline in vehicle benchmarking. The process is time-consuming and involves substantial manual effort. Car manufacturers use unique signal addresses and encodings for their internal data. Accessing this data requires either expensive tools suitable for the respective vehicles or experienced engineers who have developed individual approaches to identify specific signals. Access to the internal data enables reading the vehicle’s status and thus reduces the need for additional test equipment. This keeps vehicles closer to their production status and avoids manipulating the vehicle under study, which prevents affecting future test results. The main focus of this approach is to reduce the cost of such analysis and design a more efficient benchmarking process. In this work, we present a methodology that identifies signals without physically manipulating the vehicle. Our equipment is connected to the vehicle via the On-Board Diagnostics (OBD)-II port and uses the Unified Diagnostics Service (UDS) protocol to communicate with the vehicle. We access, capture, and analyze the vehicle’s signals for future analysis. This holistic approach not only decodes the signals but also grants access to the vehicle’s data, allowing researchers to apply state-of-the-art methodologies to analyze their vehicles under study while greatly reducing the necessary experience, time, and cost. Full article
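The UDS exchange this kind of methodology builds on is standardized: a ReadDataByIdentifier request (service 0x22) names a 16-bit data identifier (DID), and a positive response echoes the DID behind service ID 0x62 (request SID + 0x40). The sketch below covers only payload construction and parsing; transport framing (ISO-TP over CAN through the OBD-II port) is omitted, and the response bytes are made up for illustration.

```python
def build_rdbi_request(did):
    """Build a ReadDataByIdentifier payload for a 16-bit DID."""
    return bytes([0x22, (did >> 8) & 0xFF, did & 0xFF])

def parse_rdbi_response(payload, expected_did):
    """Return the data bytes if the response matches the requested DID."""
    if len(payload) < 3 or payload[0] != 0x62:
        raise ValueError("not a positive ReadDataByIdentifier response")
    did = (payload[1] << 8) | payload[2]
    if did != expected_did:
        raise ValueError("response DID does not match request")
    return payload[3:]

req = build_rdbi_request(0xF190)            # 0xF190: VIN (standardized DID)
data = parse_rdbi_response(b"\x62\xF1\x90WVW", 0xF190)
```

Standardized DIDs such as 0xF190 (VIN) work across manufacturers; the manufacturer-specific DIDs and encodings are exactly what the reverse-engineering step has to discover.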

20 pages, 110802 KB  
Article
Toward High-Resolution UAV Imagery Open-Vocabulary Semantic Segmentation
by Zimo Chen, Yuxiang Xie and Yingmei Wei
Drones 2025, 9(7), 470; https://doi.org/10.3390/drones9070470 - 1 Jul 2025
Viewed by 718
Abstract
Unmanned Aerial Vehicle (UAV) image semantic segmentation faces challenges in recognizing novel categories due to closed-set training paradigms and the high cost of annotation. While open-vocabulary semantic segmentation (OVSS) leverages vision-language models like CLIP to enable flexible class recognition, existing methods are limited to low-resolution images, hindering their applicability to high-resolution UAV data. Current adaptations—downsampling, cropping, or modifying CLIP—compromise either detail preservation, global context, or computational efficiency. To address these limitations, we propose HR-Seg, the first high-resolution OVSS framework for UAV imagery, which effectively integrates global context from downsampled images with local details from cropped sub-images through a novel cost-volume architecture. We introduce a detail-enhanced encoder with multi-scale embedding and a detail-aware decoder for progressive mask refinement, specifically designed to handle objects of varying sizes in aerial imagery. We evaluated existing OVSS methods alongside HR-Seg, training on the VDD dataset and testing across three benchmarks: VDD, UDD, and UAVid. HR-Seg achieved superior performance with mIoU scores of 89.38, 73.67, and 55.23, respectively, outperforming all compared state-of-the-art OVSS approaches. These results demonstrate HR-Seg’s exceptional capability in processing high-resolution UAV imagery. Full article
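The global/local split that high-resolution pipelines of this kind rely on — a downsampled full image for context plus overlapping crops for detail — starts from computing the crop boxes. The helper below only does that geometric step; tile size and stride are illustrative, not HR-Seg's settings.

```python
def crop_boxes(width, height, tile, stride):
    """Return (left, top, right, bottom) boxes fully covering the image."""
    boxes = []
    xs = list(range(0, max(width - tile, 0) + 1, stride))
    ys = list(range(0, max(height - tile, 0) + 1, stride))
    if xs[-1] + tile < width:
        xs.append(width - tile)   # final column flush with the right edge
    if ys[-1] + tile < height:
        ys.append(height - tile)  # final row flush with the bottom edge
    for top in ys:
        for left in xs:
            boxes.append((left, top, left + tile, top + tile))
    return boxes

boxes = crop_boxes(width=1024, height=768, tile=512, stride=384)
```

Overlap between neighboring tiles (stride < tile) lets per-tile predictions be blended at the seams when the local results are fused back with the global pass.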

23 pages, 51170 KB  
Article
Automatic Detection of Landslide Surface Cracks from UAV Images Using Improved U-Network
by Hao Xu, Li Wang, Bao Shu, Qin Zhang and Xinrui Li
Remote Sens. 2025, 17(13), 2150; https://doi.org/10.3390/rs17132150 - 23 Jun 2025
Viewed by 748
Abstract
Surface cracks are key indicators of landslide deformation, crucial for early landslide identification and deformation pattern analysis. However, due to the complex terrain and landslide extent, manual surveys or traditional digital image processing often face challenges with efficiency, precision, and interference susceptibility in detecting these cracks. Therefore, this study proposes a comprehensive automated pipeline to enhance the efficiency and accuracy of landslide surface crack detection. First, high-resolution images of landslide areas are collected using unmanned aerial vehicles (UAVs) to generate a digital orthophoto map (DOM). Subsequently, building upon the U-Net architecture, an improved encoder–decoder semantic segmentation network (IEDSSNet) was proposed to segment surface cracks from the images with complex backgrounds. The model enhances the extraction of crack features by integrating residual blocks and attention mechanisms within the encoder. Additionally, it incorporates multi-scale skip connections and channel-wise cross attention modules in the decoder to improve feature reconstruction capabilities. Finally, post-processing techniques such as morphological operations and dimension measurements were applied to crack masks to generate crack inventories. The proposed method was validated using data from the Heifangtai loess landslide in Gansu Province. Results demonstrate its superiority over current state-of-the-art semantic segmentation networks and open-source crack detection networks, achieving F1 scores and IOU of 82.11% and 69.65%, respectively—representing improvements of 3.31% and 4.63% over the baseline U-Net model. Furthermore, it maintained optimal performance with demonstrated generalization capability under varying illumination conditions. In this area, a total of 1658 surface cracks were detected and cataloged, achieving an accuracy of 85.22%. 
The method proposed in this study demonstrates strong performance in detecting surface cracks in landslide areas, providing essential data for landslide monitoring, early warning systems, and mitigation strategies. Full article

19 pages, 5602 KB  
Article
PnPDA+: A Meta Feature-Guided Domain Adapter for Collaborative Perception
by Liang Xin, Guangtao Zhou, Zhaoyang Yu, Danni Wang, Tianyou Luo, Xiaoyuan Fu and Jinglin Li
World Electr. Veh. J. 2025, 16(7), 343; https://doi.org/10.3390/wevj16070343 - 21 Jun 2025
Viewed by 401
Abstract
Although cooperative perception enhances situational awareness by enabling vehicles to share intermediate features, real-world deployment faces challenges due to heterogeneity in sensor modalities, architectures, and encoder parameters across agents. These domain gaps often result in semantic inconsistencies among the shared features, thereby degrading the quality of feature fusion. Existing approaches either necessitate the retraining of private models or fail to adapt to newly introduced agents. To address these limitations, we propose PnPDA+, a unified and modular domain adaptation framework designed for heterogeneous multi-vehicle cooperative perception. PnPDA+ consists of two key components: a Meta Feature Extraction Network (MFEN) and a Plug-and-Play Domain Adapter (PnPDA). MFEN extracts domain-aware and frame-aware meta features from received heterogeneous features, encoding domain-specific knowledge and spatial-temporal cues to serve as high-level semantic priors. Guided by these meta features, the PnPDA module performs adaptive semantic conversion to enhance cross-agent feature alignment without modifying existing perception models. This design ensures the scalable integration of emerging vehicles with minimal fine-tuning, significantly improving both semantic consistency and generalization. Experiments on OPV2V show that PnPDA+ outperforms state-of-the-art methods by 4.08% in perception accuracy while preserving model integrity and scalability. Full article

17 pages, 643 KB  
Article
A Deep Reinforcement-Learning-Based Route Optimization Model for Multi-Compartment Cold Chain Distribution
by Jingming Hu and Chong Wang
Mathematics 2025, 13(13), 2039; https://doi.org/10.3390/math13132039 - 20 Jun 2025
Viewed by 1136
Abstract
Cold chain logistics is crucial in ensuring food quality and safety in modern supply chains. The required temperature control systems increase operational costs and environmental impacts compared to conventional logistics. To reduce these costs while maintaining service quality in real-world distribution scenarios, efficient route planning is essential, particularly when products with different temperature requirements need to be delivered together using multi-compartment refrigerated vehicles. This substantially increases the complexity of the routing process. We propose a novel deep reinforcement learning approach that incorporates a vehicle state encoder for capturing fleet characteristics and a dynamic vehicle state update mechanism for enabling real-time vehicle state updates during route planning. Extensive experiments on a real-world road network show that our proposed method significantly outperforms four representative methods. Compared to a recent ant colony optimization algorithm, it achieves up to a 6.32% reduction in costs while being up to 1637 times faster in computation. Full article
(This article belongs to the Special Issue Application of Neural Networks and Deep Learning)
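The two named components — a vehicle state encoder for fleet characteristics and a dynamic state update during route construction — can be sketched for a multi-compartment vehicle as follows. Field names, the demand format, and the flat-list encoding are hypothetical stand-ins, not the paper's learned encoder.

```python
def encode_vehicle(vehicle, capacities):
    """Normalize remaining per-compartment loads and append position/time."""
    loads = [vehicle["load"][c] / capacities[c] for c in sorted(capacities)]
    return loads + [vehicle["x"], vehicle["y"], vehicle["time"]]

def serve_customer(vehicle, customer, travel_time):
    """Dynamic update after a routing decision: move and deduct demand."""
    for comp, qty in customer["demand"].items():
        vehicle["load"][comp] -= qty
    vehicle["x"], vehicle["y"] = customer["x"], customer["y"]
    vehicle["time"] += travel_time

caps = {"chilled": 100.0, "frozen": 50.0}
veh = {"load": {"chilled": 100.0, "frozen": 50.0},
       "x": 0.0, "y": 0.0, "time": 0.0}
serve_customer(veh, {"demand": {"frozen": 10.0}, "x": 3.0, "y": 4.0},
               travel_time=5.0)
state = encode_vehicle(veh, caps)
```

Re-encoding after every decision is what lets the policy condition each next-customer choice on the fleet's current compartment loads rather than on a static snapshot.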

20 pages, 39846 KB  
Article
MTCDNet: Multimodal Feature Fusion-Based Tree Crown Detection Network Using UAV-Acquired Optical Imagery and LiDAR Data
by Heng Zhang, Can Yang and Xijian Fan
Remote Sens. 2025, 17(12), 1996; https://doi.org/10.3390/rs17121996 - 9 Jun 2025
Cited by 1 | Viewed by 543
Abstract
Accurate detection of individual tree crowns is a critical prerequisite for precisely extracting forest structural parameters, which is vital for forestry resources monitoring. While unmanned aerial vehicle (UAV)-acquired RGB imagery, combined with deep learning-based networks, has demonstrated considerable potential, existing methods often rely exclusively on RGB data, rendering them susceptible to shadows caused by varying illumination and suboptimal performance in dense forest stands. In this paper, we propose integrating LiDAR-derived Canopy Height Model (CHM) with RGB imagery as complementary cues, shifting the paradigm of tree crown detection from unimodal to multimodal. To fully leverage the complementary properties of RGB and CHM, we present a novel Multimodal learning-based Tree Crown Detection Network (MTCDNet). Specifically, a transformer-based multimodal feature fusion strategy is proposed to adaptively learn correlations among multilevel features from diverse modalities, which enhances the model’s ability to represent tree crown structures by leveraging complementary information. In addition, a learnable positional encoding scheme is introduced to facilitate the fused features in capturing the complex, densely distributed tree crown structures by explicitly incorporating spatial information. A hybrid loss function is further designed to enhance the model’s capability in handling occluded crowns and crowns of varying sizes. Experiments conducted on two challenging datasets with diverse stand structures demonstrate that MTCDNet significantly outperforms existing state-of-the-art single-modality methods, achieving AP50 scores of 93.12% and 94.58%, respectively. Ablation studies further confirm the superior performance of the proposed fusion network compared to simple fusion strategies. This research indicates that effectively integrating RGB and CHM data offers a robust solution for enhancing individual tree crown detection. Full article
(This article belongs to the Special Issue Digital Modeling for Sustainable Forest Management)
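A learnable positional encoding of the kind the abstract describes is, at its simplest, one trainable vector per spatial location added to the flattened fused feature map before the transformer. The sketch below uses a randomly initialized table to stand in for the trained one; all shapes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, D = 4, 4, 16                     # feature map height/width, channel dim

pos_table = rng.normal(scale=0.02, size=(H * W, D))  # trainable in practice

def add_positional_encoding(feats):
    """feats: (H*W, D) flattened fused RGB+CHM features."""
    assert feats.shape == pos_table.shape
    return feats + pos_table

tokens = rng.normal(size=(H * W, D))
encoded = add_positional_encoding(tokens)
```

Without this addition, a transformer's attention is permutation-invariant over locations; the per-position offsets are what let the fused features express where each crown-like response sits in the image.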

19 pages, 14298 KB  
Article
BETAV: A Unified BEV-Transformer and Bézier Optimization Framework for Jointly Optimized End-to-End Autonomous Driving
by Rui Zhao, Ziguo Chen, Yuze Fan, Fei Gao and Yuzhuo Men
Sensors 2025, 25(11), 3336; https://doi.org/10.3390/s25113336 - 26 May 2025
Cited by 1 | Viewed by 1022
Abstract
End-to-end autonomous driving demands precise perception, robust motion planning, and efficient trajectory generation to navigate complex and dynamic environments. This paper proposes BETAV, a novel framework that addresses the persistent challenges of low 3D perception accuracy and suboptimal trajectory smoothness in autonomous driving systems through unified BEV-Transformer encoding and Bézier-optimized planning. By leveraging Vision Transformers (ViTs), our approach encodes multi-view camera data into a Bird’s Eye View (BEV) representation using a transformer architecture, capturing both spatial and temporal features to enhance scene understanding comprehensively. For motion planning, a Bézier curve-based planning decoder is proposed, offering a compact, continuous, and parameterized trajectory representation that inherently ensures motion smoothness, kinematic feasibility, and computational efficiency. Additionally, this paper introduces a set of constraints tailored to address vehicle kinematics, obstacle avoidance, and directional alignment, further enhancing trajectory accuracy and safety. Experimental evaluations on the nuScenes benchmark dataset and simulations demonstrate that our framework achieves state-of-the-art performance in trajectory prediction and planning tasks, exhibiting superior robustness and generalization across diverse and challenging Bench2Drive driving scenarios. Full article
(This article belongs to the Section Vehicular Sensing)
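The appeal of a Bézier parameterization is that a handful of control points defines an entire smooth, continuous trajectory. The sketch below evaluates a curve with de Casteljau's algorithm; the 2D control points are illustrative waypoints, not output of the BETAV decoder.

```python
def bezier_point(control, t):
    """Evaluate a Bézier curve at parameter t in [0, 1] (de Casteljau)."""
    pts = [tuple(p) for p in control]
    while len(pts) > 1:   # repeatedly interpolate adjacent points
        pts = [((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:])]
    return pts[0]

# A cubic curve needs 4 control points; the curve starts at the first
# and ends at the last, with the middle points shaping it smoothly.
control = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
trajectory = [bezier_point(control, k / 10) for k in range(11)]
```

Because the decoder regresses a few control points instead of dozens of raw waypoints, smoothness is built into the representation rather than enforced by post-processing.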

19 pages, 6004 KB  
Article
Remote Sensing Image Change Detection Based on Dynamic Adaptive Context Attention
by Yong Xie, Yixuan Wang, Xin Wang, Yin Tan and Qin Qin
Symmetry 2025, 17(5), 793; https://doi.org/10.3390/sym17050793 - 20 May 2025
Viewed by 685
Abstract
Although some progress has been made in deep learning-based remote sensing image change detection, the complexity of scenes and the diversity of changes in remote sensing images lead to challenges related to background interference. For instance, remote sensing images typically contain numerous background regions, while the actual change regions constitute only a small proportion of the overall image. To address these challenges in remote sensing image change detection, this paper proposes a Dynamic Adaptive Context Attention Network (DACA-Net) based on an exchanging dual encoder–decoder (EDED) architecture. The core innovation of DACA-Net is the development of a novel Dynamic Adaptive Context Attention Module (DACAM), which learns attention weights and automatically adjusts the appropriate scale according to the features present in remote sensing images. By fusing multi-scale contextual features, DACAM effectively captures information regarding changes within these images. In addition, DACA-Net adopts an EDED architectural design, where the conventional convolutional modules in the EDED framework are replaced by DACAM modules. Unlike the original EDED architecture, DACAM modules are embedded after each encoder unit, enabling dynamic recalibration of T1/T2 features and cross-temporal information interaction. This design facilitates the capture of fine-grained change features at multiple scales. This architecture not only facilitates the extraction of discriminative features but also promotes a form of structural symmetry in the processing pipeline, contributing to more balanced and consistent feature representations. To validate the applicability of our proposed method in real-world scenarios, we constructed an Unmanned Aerial Vehicle (UAV) remote sensing dataset named the Guangxi Beihai Coast Nature Reserves (GBCNR). 
Extensive experiments conducted on three public datasets and our GBCNR dataset demonstrate that the proposed DACA-Net achieves strong performance across various evaluation metrics. For example, it attains an F1 score (F1) of 72.04% and a precision (P) of 66.59% on the GBCNR dataset, representing improvements of 3.94% and 4.72% over state-of-the-art methods such as semantic guidance and spatial localization network (SGSLN) and bi-temporal image Transformer (BIT), respectively. These results verify that the proposed network significantly enhances the ability to detect critical change regions and improves generalization performance. Full article
(This article belongs to the Section Computer)

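The DACAM described above fuses multi-scale contextual features using learned attention weights over scales. The snippet below is a minimal NumPy sketch of that fusion idea only, under stated assumptions: the scale-attention logits are passed in rather than learned, the block-mean pooling scheme is our simplification (it assumes the spatial size is divisible by each scale), and the function names (`multi_scale_context`, `dacam_fuse`) are illustrative, not the authors' implementation.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D array of logits
    e = np.exp(x - np.max(x))
    return e / e.sum()

def multi_scale_context(feat, scales=(1, 2, 4)):
    """Average-pool feat (H, W, C) at several block sizes, then upsample back.

    Assumes H and W are divisible by every scale (our simplification).
    """
    H, W, C = feat.shape
    contexts = []
    for s in scales:
        pooled = feat.reshape(H // s, s, W // s, s, C).mean(axis=(1, 3))
        up = np.repeat(np.repeat(pooled, s, axis=0), s, axis=1)
        contexts.append(up)
    return contexts

def dacam_fuse(feat, scale_logits):
    """Fuse multi-scale contexts with (here: given, not learned) attention weights."""
    contexts = multi_scale_context(feat)
    w = softmax(np.asarray(scale_logits, dtype=float))
    return sum(wi * c for wi, c in zip(w, contexts))

# toy demo: fused output keeps the input's spatial and channel shape
feat = np.random.rand(8, 8, 3)
fused = dacam_fuse(feat, scale_logits=[0.5, 1.0, -0.2])
assert fused.shape == feat.shape
```

In the real module the logits would be predicted from the bi-temporal features themselves, so the network can emphasize fine scales around small change regions and coarse scales elsewhere.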
28 pages, 12170 KB  
Article
Research on Multi-Objective Green Vehicle Routing Problem with Time Windows Based on the Improved Non-Dominated Sorting Genetic Algorithm III
by Xixing Li, Chao Gao, Jipeng Wang, Hongtao Tang, Tian Ma and Fenglian Yuan
Symmetry 2025, 17(5), 734; https://doi.org/10.3390/sym17050734 - 9 May 2025
Abstract
To advance energy conservation and emissions reduction in urban logistics systems, this study focuses on the green vehicle routing problem with time windows (GVRPTW), which remains underexplored in balancing environmental and service-quality objectives. We propose a comprehensive multi-objective optimization framework that addresses this gap by simultaneously minimizing total distribution costs and carbon emissions while maximizing customer satisfaction, quantified from the vehicle's arrival time at each customer location. The rationale for this tri-objective formulation lies in its ability to reflect real-world trade-offs between economic efficiency, environmental performance, and service level, which previous studies have often considered in isolation. To tackle this complex problem, we develop an improved Non-Dominated Sorting Genetic Algorithm III (NSGA-III) that incorporates three key enhancements: (1) an integer-encoded initialization method to enhance solution feasibility, (2) a refined selection strategy utilizing crowding distance to maintain population diversity, and (3) an embedded 2-opt local search operator to prevent premature convergence and avoid local optima. Comprehensive validation experiments on Solomon's benchmark instances and a real-world case demonstrate that the proposed algorithm consistently outperforms several state-of-the-art multi-objective optimization methods across key performance metrics. These results highlight the effectiveness and practical relevance of our approach in advancing energy-efficient, low-emission, and customer-centric urban logistics systems. Full article
(This article belongs to the Special Issue Meta-Heuristics for Manufacturing Systems Optimization, 3rd Edition)

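Of the three NSGA-III enhancements listed in the abstract, the embedded 2-opt local search operator is the easiest to illustrate in isolation. Below is a self-contained sketch of the standard textbook 2-opt pass on a single closed route (depot fixed at both ends): it repeatedly reverses an interior segment whenever doing so shortens the tour. This is a generic operator, not the authors' embedded implementation, and the toy distance matrix is ours.

```python
import math

def route_length(route, dist):
    # total length of a route given a symmetric distance matrix
    return sum(dist[a][b] for a, b in zip(route, route[1:]))

def two_opt(route, dist):
    """Improve a closed route by segment reversals until no reversal helps."""
    best = list(route)
    improved = True
    while improved:
        improved = False
        # keep the depot endpoints fixed; try reversing every interior segment
        for i in range(1, len(best) - 2):
            for j in range(i + 1, len(best) - 1):
                cand = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if route_length(cand, dist) < route_length(best, dist) - 1e-12:
                    best, improved = cand, True
    return best

# toy demo: four customers on a unit square, starting route crosses itself
pts = [(0, 0), (0, 1), (1, 1), (1, 0)]
dist = [[math.dist(p, q) for q in pts] for p in pts]
tour = two_opt([0, 2, 1, 3, 0], dist)
print(tour, route_length(tour, dist))  # → [0, 1, 2, 3, 0] 4.0
```

Inside an NSGA-III loop, such a pass would be applied to offspring routes after crossover/mutation, trading extra evaluations for faster escape from local optima.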
36 pages, 10731 KB  
Article
Enhancing Airport Traffic Flow: Intelligent System Based on VLC, Rerouting Techniques, and Adaptive Reward Learning
by Manuela Vieira, Manuel Augusto Vieira, Gonçalo Galvão, Paula Louro, Alessandro Fantoni, Pedro Vieira and Mário Véstias
Sensors 2025, 25(9), 2842; https://doi.org/10.3390/s25092842 - 30 Apr 2025
Abstract
Airports are complex environments where efficient localization and intelligent traffic management are essential for ensuring smooth navigation and operational efficiency for both pedestrians and Autonomous Guided Vehicles (AGVs). This study presents an Artificial Intelligence (AI)-driven airport traffic management system that integrates Visible Light Communication (VLC), rerouting techniques, and adaptive reward mechanisms to optimize traffic flow, reduce congestion, and enhance safety. VLC-enabled luminaires serve as transmission points for location-specific guidance, forming a hybrid mesh network based on tetrachromatic LEDs with On-Off Keying (OOK) modulation and SiC optical receivers. AI agents, driven by Deep Reinforcement Learning (DRL), continuously analyze traffic conditions, apply adaptive rewards to improve decision-making, and dynamically reroute agents to balance traffic loads and avoid bottlenecks. Traffic states are encoded and processed through Q-learning algorithms, enabling intelligent phase activation and responsive control strategies. Simulation results confirm that the proposed system enables more balanced green-time allocation, with reductions of up to 43% in vehicle-prioritized phases (e.g., Phase 1 at C1) to accommodate pedestrian flows. These adjustments lead to improved route planning, reduced halting times, and enhanced coordination between AGVs and pedestrian traffic across multiple intersections. Additionally, traffic-flow responsiveness is preserved, with critical clearance phases maintaining stability or showing slight increases despite pedestrian prioritization. The system also enables accurate indoor localization without relying on a Global Positioning System (GPS), supporting seamless movement and operational optimization. By combining VLC, adaptive AI models, and rerouting strategies, the proposed approach contributes to safer, more efficient, and human-centered airport mobility. Full article

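The abstract above says traffic states are encoded and processed through Q-learning for intelligent phase activation. As a minimal sketch of that idea, the tabular update below chooses and reinforces a signal phase per encoded intersection state. The state and phase names, rewards, and hyperparameters are all illustrative assumptions; the paper's DRL agents are far richer than this toy table.

```python
import random

def choose_phase(q, state, phases, eps=0.1, rng=random):
    """Epsilon-greedy phase selection from the Q-table."""
    if rng.random() < eps:
        return rng.choice(phases)
    return max(phases, key=lambda p: q.get((state, p), 0.0))

def q_update(q, state, phase, reward, next_state, phases, alpha=0.5, gamma=0.9):
    """One tabular Q-learning update for a signal-phase decision."""
    best_next = max(q.get((next_state, p), 0.0) for p in phases)
    old = q.get((state, phase), 0.0)
    q[(state, phase)] = old + alpha * (reward + gamma * best_next - old)

# toy demo: in a pedestrian-congested state, activating the pedestrian
# phase is repeatedly rewarded, so its Q-value converges toward 1.0
q = {}
for _ in range(60):
    q_update(q, "congested", "ped_phase", reward=1.0,
             next_state="clear", phases=["ped_phase", "veh_phase"])
```

An adaptive-reward variant would reshape `reward` online (e.g., penalizing halting time more heavily as queues grow), which is the mechanism the paper credits for rebalancing green time between AGV and pedestrian phases.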