Search Results (269)

Search Parameters:
Keywords = large-scale vehicle dataset

24 pages, 1964 KiB  
Article
Data-Driven Symmetry and Asymmetry Investigation of Vehicle Emissions Using Machine Learning: A Case Study in Spain
by Fei Wu, Jinfu Zhu, Hufang Yang, Xiang He and Qiao Peng
Symmetry 2025, 17(8), 1223; https://doi.org/10.3390/sym17081223 - 2 Aug 2025
Abstract
Understanding vehicle emissions is essential for developing effective carbon reduction strategies in the transport sector. Conventional emission models often assume homogeneity and linearity, overlooking real-world asymmetries that arise from variations in vehicle design and powertrain configurations. This study explores how machine learning and explainable AI techniques can effectively capture both symmetric and asymmetric emission patterns across different vehicle types, thereby contributing to more sustainable transport planning. Addressing a key gap in the existing literature, the study poses the following question: how do structural and behavioral factors contribute to asymmetric emission responses in internal combustion engine vehicles compared to new energy vehicles? Utilizing a large-scale Spanish vehicle registration dataset, the analysis classifies vehicles by powertrain type and applies five supervised learning algorithms to predict CO2 emissions. SHapley Additive exPlanations (SHAP) values are employed to identify nonlinear and threshold-based relationships between emissions and vehicle characteristics such as fuel consumption, weight, and height. Among the models tested, the Random Forest algorithm achieves the highest predictive accuracy. The findings reveal critical asymmetries in emission behavior, particularly among hybrid vehicles, which challenge the assumption of uniform policy applicability. This study provides both methodological innovation and practical insights for symmetry-aware emission modeling, offering support for more targeted eco-design and policy decisions that align with long-term sustainability goals. Full article
(This article belongs to the Section Engineering and Materials)
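The attribution step this abstract describes can be sketched with scikit-learn. Permutation importance stands in here for the paper's SHAP values as a lighter-weight proxy, and the feature names, units, and data-generating rule are invented for illustration only:

```python
# Sketch: fit a Random Forest on synthetic vehicle attributes and rank
# feature influence on CO2. Permutation importance is a stand-in for
# SHAP; all data and names below are illustrative, not the paper's.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([
    rng.uniform(3, 12, n),      # fuel consumption (L/100 km)
    rng.uniform(900, 2500, n),  # kerb weight (kg)
    rng.uniform(1.3, 2.0, n),   # height (m)
])
# Synthetic rule: CO2 tracks fuel consumption, with a small weight term
y = 23.2 * X[:, 0] + 0.01 * X[:, 1] + rng.normal(0, 5, n)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
imp = permutation_importance(model, X, y, n_repeats=5, random_state=0)
names = ["fuel_consumption", "weight", "height"]
ranked = sorted(zip(names, imp.importances_mean), key=lambda t: -t[1])
print(ranked[0][0])  # fuel consumption dominates under this synthetic rule
```

SHAP itself (via `shap.TreeExplainer`) would additionally expose the per-sample, threshold-shaped effects the abstract highlights; the ranking logic is the same.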

22 pages, 6482 KiB  
Article
Surface Damage Detection in Hydraulic Structures from UAV Images Using Lightweight Neural Networks
by Feng Han and Chongshi Gu
Remote Sens. 2025, 17(15), 2668; https://doi.org/10.3390/rs17152668 - 1 Aug 2025
Abstract
Timely and accurate identification of surface damage in hydraulic structures is essential for maintaining structural integrity and ensuring operational safety. Traditional manual inspections are time-consuming, labor-intensive, and prone to subjectivity, especially for large-scale or inaccessible infrastructure. Leveraging advancements in aerial imaging, unmanned aerial vehicles (UAVs) enable efficient acquisition of high-resolution visual data across expansive hydraulic environments. However, existing deep learning (DL) models often lack architectural adaptations for the visual complexities of UAV imagery, including low-texture contrast, noise interference, and irregular crack patterns. To address these challenges, this study proposes a lightweight, robust, and high-precision segmentation framework, called LFPA-EAM-Fast-SCNN, specifically designed for pixel-level damage detection in UAV-captured images of hydraulic concrete surfaces. The developed DL-based model integrates an enhanced Fast-SCNN backbone for efficient feature extraction, a Lightweight Feature Pyramid Attention (LFPA) module for multi-scale context enhancement, and an Edge Attention Module (EAM) for refined boundary localization. The experimental results on a custom UAV-based dataset show that the proposed damage detection method achieves superior performance, with a precision of 0.949, a recall of 0.892, an F1 score of 0.906, and an IoU of 87.92%, outperforming U-Net, Attention U-Net, SegNet, DeepLab v3+, I-ST-UNet, and SegFormer. Additionally, it reaches a real-time inference speed of 56.31 FPS, significantly surpassing other models. The experimental results demonstrate the proposed framework’s strong generalization capability and robustness under varying noise levels and damage scenarios, underscoring its suitability for scalable, automated surface damage assessment in UAV-based remote sensing of civil infrastructure. Full article
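The precision, recall, F1, and IoU figures quoted above all derive from the same pixel-level confusion counts. A minimal reference implementation on toy binary masks (the mask values are illustrative, not the paper's data):

```python
# Pixel-level segmentation metrics from TP/FP/FN confusion counts.
import numpy as np

def seg_metrics(pred, gt):
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.sum(pred & gt)     # predicted damage, actually damage
    fp = np.sum(pred & ~gt)    # predicted damage, actually background
    fn = np.sum(~pred & gt)    # missed damage pixels
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    iou = tp / (tp + fp + fn)  # intersection over union
    return precision, recall, f1, iou

pred = np.array([[1, 1, 0], [0, 1, 0]])
gt   = np.array([[1, 0, 0], [0, 1, 1]])
p, r, f1, iou = seg_metrics(pred, gt)
print(round(iou, 2))  # 0.5
```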

22 pages, 16421 KiB  
Article
Deep Neural Network with Anomaly Detection for Single-Cycle Battery Lifetime Prediction
by Junghwan Lee, Longda Wang, Hoseok Jung, Bukyu Lim, Dael Kim, Jiaxin Liu and Jong Lim
Batteries 2025, 11(8), 288; https://doi.org/10.3390/batteries11080288 - 30 Jul 2025
Viewed by 311
Abstract
Large-scale battery datasets often contain anomalous data due to sensor noise, communication errors, and operational inconsistencies, which degrade the accuracy of data-driven prognostics. However, many existing studies overlook the impact of such anomalies or apply filtering heuristically without rigorous benchmarking, which can potentially introduce biases into training and evaluation pipelines. This study presents a deep learning framework that integrates autoencoder-based anomaly detection with a residual neural network (ResNet) to achieve state-of-the-art prediction of remaining useful life at the cycle level using only a single-cycle input. The framework systematically filters out anomalous samples using multiple variants of convolutional and sequence-to-sequence autoencoders, thereby enhancing data integrity before optimizing and training the ResNet-based models. Benchmarking against existing deep learning approaches demonstrates a significant performance improvement, with the best model achieving a mean absolute percentage error of 2.85% and a root mean square error of 40.87 cycles, surpassing prior studies. These results indicate that autoencoder-based anomaly filtering significantly enhances prediction accuracy, reinforcing the importance of systematic anomaly detection in battery prognostics. The proposed method provides a scalable and interpretable solution for intelligent battery management in electric vehicles and energy storage systems. Full article
(This article belongs to the Special Issue Machine Learning for Advanced Battery Systems)
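The anomaly-filtering step reduces to scoring each sample by its autoencoder reconstruction error and dropping high-error samples before training the predictor. In this sketch a linear PCA "autoencoder" stands in for the paper's convolutional and sequence-to-sequence variants, and the data, rank, and threshold are illustrative assumptions:

```python
# Filter anomalous samples by reconstruction error before training.
# A rank-2 PCA encoder/decoder stands in for a learned autoencoder.
import numpy as np

rng = np.random.default_rng(1)
clean = rng.normal(0, 1, (200, 10))      # well-behaved cycle features
outliers = rng.normal(0, 8, (10, 10))    # sensor-glitch-like samples
X = np.vstack([clean, outliers])

mu = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
W = Vt[:2]                               # linear encoder: 10 -> 2 dims
recon = (X - mu) @ W.T @ W + mu          # decode back to 10 dims
err = np.linalg.norm(X - recon, axis=1)  # per-sample reconstruction error

keep = err < np.percentile(err, 95)      # drop the worst 5%
print(keep[:200].mean() > keep[200:].mean())  # glitches dropped preferentially
```

The retained samples would then feed the ResNet-style lifetime predictor; the key point is that the filter is fit and applied before any supervised training.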

22 pages, 5363 KiB  
Article
Accurate Extraction of Rural Residential Buildings in Alpine Mountainous Areas by Combining Shadow Processing with FF-SwinT
by Guize Luan, Jinxuan Luo, Zuyu Gao and Fei Zhao
Remote Sens. 2025, 17(14), 2463; https://doi.org/10.3390/rs17142463 - 16 Jul 2025
Viewed by 269
Abstract
Precise extraction of rural settlements in alpine regions is critical for geographic data production, rural development, and spatial optimization. However, existing deep learning models are hindered by insufficient datasets and suboptimal algorithm structures, resulting in blurred boundaries and inadequate extraction accuracy. Therefore, this study uses high-resolution unmanned aerial vehicle (UAV) remote sensing images to construct a specialized dataset for the extraction of rural settlements in alpine mountainous areas, while introducing an innovative shadow mitigation technique that integrates multiple spectral characteristics. This methodology effectively addresses the challenges posed by intense shadows in settlements and environmental occlusions common in mountainous terrain analysis. Based on comparative experiments with existing deep learning models, the Swin Transformer was selected as the baseline model. Building upon this, the Feature Fusion Swin Transformer (FF-SwinT) model was constructed by optimizing the data processing, loss function, and multi-view feature fusion. Finally, we rigorously evaluated it through ablation studies, generalization tests, and large-scale image application experiments. The results show that FF-SwinT improves on many indicators compared with the original Swin Transformer, and its recognition results have clear edges and strong integrity. These results suggest that FF-SwinT establishes a novel framework for rural settlement extraction in alpine mountain regions, which is of great significance for regional spatial optimization and development policy formulation. Full article

32 pages, 6589 KiB  
Article
Machine Learning (AutoML)-Driven Wheat Yield Prediction for European Varieties: Enhanced Accuracy Using Multispectral UAV Data
by Krstan Kešelj, Zoran Stamenković, Marko Kostić, Vladimir Aćin, Dragana Tekić, Tihomir Novaković, Mladen Ivanišević, Aleksandar Ivezić and Nenad Magazin
Agriculture 2025, 15(14), 1534; https://doi.org/10.3390/agriculture15141534 - 16 Jul 2025
Viewed by 496
Abstract
Accurate and timely wheat yield prediction is valuable globally for enhancing agricultural planning, optimizing resource use, and supporting trade strategies. This study addresses the need for precision in yield estimation by applying machine-learning (ML) regression models to high-resolution Unmanned Aerial Vehicle (UAV) multispectral (MS) and Red-Green-Blue (RGB) imagery. The research analyzes five European wheat cultivars across 400 experimental plots created by combining 20 nitrogen, phosphorus, and potassium (NPK) fertilizer treatments. Yield variations from 1.41 to 6.42 t/ha strengthen model robustness with diverse data. The ML approach is automated using PyCaret, which optimized and evaluated 25 regression models based on 65 vegetation indices and yield data, resulting in 66 feature variables across 400 observations. The dataset, split into training (70%) and testing sets (30%), was used to predict yields at three growth stages: 9 May, 20 May, and 6 June 2022. Key models achieved high accuracy, with the Support Vector Regression (SVR) model reaching R2 = 0.95 on 9 May and R2 = 0.91 on 6 June, and the Multi-Layer Perceptron (MLP) Regressor attaining R2 = 0.94 on 20 May. The findings underscore the effectiveness of precisely measured MS indices and a rigorous experimental approach in achieving high-accuracy yield predictions. This study demonstrates how a precise experimental setup, large-scale field data, and AutoML can harness the potential of UAVs and machine learning to enhance wheat yield predictions. The main limitations of this study lie in its focus on experimental fields under specific conditions; future research could explore adaptability to diverse environments and wheat varieties for broader applicability. Full article
(This article belongs to the Special Issue Applications of Remote Sensing in Agricultural Soil and Crop Mapping)
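What PyCaret's `compare_models()` automates can be sketched directly in scikit-learn: fit several regressors on a 70/30 split and keep the one with the best test R². The feature matrix and yield values below are synthetic stand-ins for the 66 vegetation-index features and 400 plot observations:

```python
# Minimal AutoML-style model comparison on synthetic "yield" data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

rng = np.random.default_rng(2)
X = rng.normal(size=(400, 66))                          # 66 index features
y = X[:, :5].sum(axis=1) + 0.1 * rng.normal(size=400)   # synthetic yield

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
models = {"svr": SVR(), "ridge": Ridge(),
          "rf": RandomForestRegressor(random_state=0)}
scores = {name: r2_score(y_te, m.fit(X_tr, y_tr).predict(X_te))
          for name, m in models.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 2))
```

On real multispectral indices the ranking differs (the paper reports SVR and MLP winning at different growth stages); the comparison loop itself is the point.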

28 pages, 7404 KiB  
Article
SR-YOLO: Spatial-to-Depth Enhanced Multi-Scale Attention Network for Small Target Detection in UAV Aerial Imagery
by Shasha Zhao, He Chen, Di Zhang, Yiyao Tao, Xiangnan Feng and Dengyin Zhang
Remote Sens. 2025, 17(14), 2441; https://doi.org/10.3390/rs17142441 - 14 Jul 2025
Viewed by 364
Abstract
The detection of aerial imagery captured by Unmanned Aerial Vehicles (UAVs) is widely employed across various domains, including engineering construction, traffic regulation, and precision agriculture. However, aerial images are typically characterized by numerous small targets, significant occlusion issues, and densely clustered targets, rendering traditional detection algorithms largely ineffective for such imagery. This work proposes a small target detection algorithm, SR-YOLO, specifically tailored to address these challenges in UAV-captured aerial images. First, the Space-to-Depth layer and Receptive Field Attention Convolution are combined, and the SR-Conv module is designed to replace the Conv module within the original backbone network. This hybrid module extracts more fine-grained information about small target features by converting spatial information into depth information and adapting the network's attention to targets of different scales. Second, a small target detection layer and a bidirectional feature pyramid network mechanism are introduced to enhance the neck network, thereby strengthening the feature extraction and fusion capabilities for small targets. Finally, the model's detection performance for small targets is improved by using the Normalized Wasserstein Distance loss function to optimize the Complete Intersection over Union loss function. Empirical results demonstrate that the SR-YOLO algorithm significantly enhances the precision of small target detection in UAV aerial images. Ablation and comparative experiments are conducted on the VisDrone2019 and RSOD datasets. Compared to the baseline algorithm YOLOv8s, SR-YOLO improves mAP@0.5 by 6.3% and 3.5% and mAP@0.5:0.95 by 3.8% and 2.3% on VisDrone2019 and RSOD, respectively. It also achieves superior detection results compared to other mainstream target detection methods. Full article
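The Space-to-Depth operation at the heart of SR-Conv rearranges each 2x2 spatial block into the channel dimension, halving resolution without discarding any pixels, which is why it preserves fine small-target detail better than strided convolution. A minimal NumPy version (NCHW layout and block size 2 are assumptions for illustration):

```python
# Space-to-Depth: move 2x2 spatial blocks into channels, losslessly.
import numpy as np

def space_to_depth(x, block=2):
    n, c, h, w = x.shape
    x = x.reshape(n, c, h // block, block, w // block, block)
    x = x.transpose(0, 1, 3, 5, 2, 4)   # bring block offsets next to channels
    return x.reshape(n, c * block * block, h // block, w // block)

x = np.arange(16, dtype=np.float32).reshape(1, 1, 4, 4)
y = space_to_depth(x)
print(y.shape)  # (1, 4, 2, 2): same 16 values, rearranged
assert np.array_equal(np.sort(y.ravel()), np.sort(x.ravel()))
```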

24 pages, 8079 KiB  
Article
Enhancing the Scale Adaptation of Global Trackers for Infrared UAV Tracking
by Zicheng Feng, Wenlong Zhang, Erting Pan, Donghui Liu and Qifeng Yu
Drones 2025, 9(7), 469; https://doi.org/10.3390/drones9070469 - 1 Jul 2025
Viewed by 344
Abstract
Tracking unmanned aerial vehicles (UAVs) in infrared video is an essential technology for the anti-UAV task. Given frequent UAV target disappearances caused by occlusion or moving out of view, global trackers, which have the unique ability to recapture targets, are widely used in infrared UAV tracking. However, global trackers perform poorly when dealing with large target scale variation because they cannot maintain approximate consistency between target sizes in the template and the search region. To enhance the scale adaptation of global trackers, we propose a plug-and-play scale adaptation enhancement module (SAEM). This can generate a scale adaptation enhancement kernel according to the target size in the previous frame, and then perform implicit scale adaptation enhancement on the extracted target template features. To optimize training, we introduce an auxiliary branch to supervise the learning of SAEM and add Gaussian noise to the input size to improve its robustness. In addition, we propose a one-stage anchor-free global tracker (OSGT), which has a more concise structure than other global trackers to meet the real-time requirement. Extensive experiments on three Anti-UAV Challenge datasets and the Anti-UAV410 dataset demonstrate the superior performance of our method and verify that our proposed SAEM can effectively enhance the scale adaptation of existing global trackers. Full article
(This article belongs to the Special Issue UAV Detection, Classification, and Tracking)

17 pages, 8706 KiB  
Article
Rice Canopy Disease and Pest Identification Based on Improved YOLOv5 and UAV Images
by Gaoyuan Zhao, Yubin Lan, Yali Zhang and Jizhong Deng
Sensors 2025, 25(13), 4072; https://doi.org/10.3390/s25134072 - 30 Jun 2025
Viewed by 353
Abstract
Traditional monitoring methods rely on manual field surveys, which are subjective, inefficient, and unable to meet the demand for large-scale, rapid monitoring. By using unmanned aerial vehicles (UAVs) to capture high-resolution images of rice canopy diseases and pests, combined with deep learning (DL) techniques, accurate and timely identification of diseases and pests can be achieved. We propose a method for identifying rice canopy diseases and pests using an improved YOLOv5 model (YOLOv5_DWMix). By incorporating depthwise separable convolutions, the MixConv module, attention mechanisms, and optimized loss functions into the YOLOv5 backbone, the model's speed, feature extraction capability, and robustness are significantly enhanced. Additionally, to tackle the challenges posed by complex field environments and small datasets, image augmentation is employed to train the YOLOv5_DWMix model for the recognition of four common rice canopy diseases and pests. Results show that the improved YOLOv5 model achieves 95.6% average precision in detecting these diseases and pests, a 4.8% improvement over the original YOLOv5 model. The YOLOv5_DWMix model is effective and advanced in identifying rice diseases and pests, offering a solid foundation for large-scale, regional monitoring. Full article
(This article belongs to the Section Smart Agriculture)
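Average precision, the headline metric here, is the area under a class's precision-recall curve built from score-ranked detections. A minimal sketch (detection scores and labels are illustrative; the precision-envelope interpolation used by VOC/COCO evaluators is omitted for brevity, so this is the raw-curve variant):

```python
# Average precision from scored detections: rank by confidence,
# accumulate TP/FP, integrate precision over recall.
import numpy as np

def average_precision(scores, is_tp, n_gt):
    order = np.argsort(-np.asarray(scores))          # highest score first
    tp = np.asarray(is_tp, dtype=float)[order]
    fp = 1.0 - tp
    tp_cum, fp_cum = np.cumsum(tp), np.cumsum(fp)
    recall = tp_cum / n_gt
    precision = tp_cum / (tp_cum + fp_cum)
    ap, prev_r = 0.0, 0.0
    for p, r in zip(precision, recall):              # rectangle integration
        ap += p * (r - prev_r)
        prev_r = r
    return ap

# 4 detections against 3 ground-truth objects (illustrative)
ap = average_precision(scores=[0.9, 0.8, 0.7, 0.6], is_tp=[1, 0, 1, 1], n_gt=3)
print(round(ap, 3))  # 0.806
```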

24 pages, 1151 KiB  
Article
EKNet: Graph Structure Feature Extraction and Registration for Collaborative 3D Reconstruction in Architectural Scenes
by Changyu Qian, Hanqiang Deng, Xiangrong Ni, Dong Wang, Bangqi Wei, Hao Chen and Jian Huang
Appl. Sci. 2025, 15(13), 7133; https://doi.org/10.3390/app15137133 - 25 Jun 2025
Viewed by 277
Abstract
Collaborative geometric reconstruction of building structures can significantly reduce communication consumption for data sharing, protect privacy, and provide support for large-scale robot application management. In recent years, geometric reconstruction of building structures has been partially studied, but there is a lack of alignment and fusion studies for geometric structure models reconstructed by multiple UAVs (Unmanned Aerial Vehicles). The vertices and edges of geometric structure models are sparse, and existing methods face challenges such as low feature extraction efficiency and substantial data requirements when processing sparse graph structures after geometrization. To address these challenges, this paper proposes an efficient deep graph matching registration framework that effectively integrates interpretable feature extraction with network training. Specifically, we first extract multidimensional local properties of nodes by combining geometric features with complex network features. Next, we construct a lightweight graph neural network, named EKNet, to enhance feature representation capabilities, enabling improved performance in low-overlap registration scenarios. Finally, through feature matching and discrimination modules, we effectively eliminate incorrect pairings and enhance accuracy. Experiments demonstrate that the proposed method achieves a 27.28% improvement in registration speed compared to traditional graph convolutional networks (GCNs) and an 80.66% increase in registration accuracy over the second-best method. The method exhibits strong robustness in registration for scenes with high noise and low overlap rates. Additionally, we construct a standardized geometric point cloud registration dataset. Full article

28 pages, 11793 KiB  
Article
Unsupervised Multimodal UAV Image Registration via Style Transfer and Cascade Network
by Xiaoye Bi, Rongkai Qie, Chengyang Tao, Zhaoxiang Zhang and Yuelei Xu
Remote Sens. 2025, 17(13), 2160; https://doi.org/10.3390/rs17132160 - 24 Jun 2025
Cited by 1 | Viewed by 385
Abstract
Cross-modal image registration for unmanned aerial vehicle (UAV) platforms presents significant challenges due to large-scale deformations, distinct imaging mechanisms, and pronounced modality discrepancies. This paper proposes a novel multi-scale cascaded registration network based on style transfer that achieves superior performance: up to 67% reduction in mean squared error (from 0.0106 to 0.0068), 9.27% enhancement in normalized cross-correlation, 26% improvement in local normalized cross-correlation, and 8% increase in mutual information compared to state-of-the-art methods. The architecture integrates a cross-modal style transfer network (CSTNet) that transforms visible images into pseudo-infrared representations to unify modality characteristics, and a multi-scale cascaded registration network (MCRNet) that performs progressive spatial alignment across multiple resolution scales using diffeomorphic deformation modeling to ensure smooth and invertible transformations. A self-supervised learning paradigm based on image reconstruction eliminates reliance on manually annotated data while maintaining registration accuracy through synthetic deformation generation. Extensive experiments on the LLVIP dataset demonstrate the method’s robustness under challenging conditions involving large-scale transformations, with ablation studies confirming that style transfer contributes 28% MSE improvement and diffeomorphic registration prevents 10.6% performance degradation. The proposed approach provides a robust solution for cross-modal image registration in dynamic UAV environments, offering significant implications for downstream applications such as target detection, tracking, and surveillance. Full article
(This article belongs to the Special Issue Advances in Deep Learning Approaches: UAV Data Analysis)
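Normalized cross-correlation (NCC), one of the registration metrics reported above, is the Pearson correlation of the two images' pixel intensities: it reaches 1.0 for a perfect match up to any linear brightness/contrast change, which is what makes it useful across imaging modalities. A minimal implementation on toy images:

```python
# NCC between two images: correlation of standardized intensities.
import numpy as np

def ncc(a, b):
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return float(np.mean(a * b))

img = np.random.default_rng(3).random((32, 32))
shifted = 2.0 * img + 5.0        # pure brightness/contrast change
noisy = img + np.random.default_rng(4).normal(0, 1.0, img.shape)

print(round(ncc(img, shifted), 3))  # 1.0: invariant to linear intensity change
print(ncc(img, noisy) < 0.9)        # degraded by additive noise
```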

17 pages, 1049 KiB  
Article
Learning Part-Based Features for Vehicle Re-Identification with Global Context
by Rajsekhar Kumar Nath and Debjani Mitra
Appl. Sci. 2025, 15(13), 7041; https://doi.org/10.3390/app15137041 - 23 Jun 2025
Viewed by 386
Abstract
Re-identification in automated surveillance systems is a challenging deep learning problem. Learning part-based features augmented with one or more global features is an efficient approach for enhancing the performance of re-identification networks. However, the latter may increase the number of trainable parameters, leading to unacceptable complexity. We propose a novel part-based model that unifies a global component by taking the distances of the parts from the global feature vector and using them as loss weights during the training of the individual parts, without increasing complexity. We conduct extensive experiments on two large-scale standard vehicle re-identification datasets to test, validate, and perform a comparative performance analysis of the proposed approach, which we named the global–local similarity-induced part-based network (GLSIPNet). The results show that our method outperforms the baseline by 2.5% (mAP) in the case of the VeRi dataset and by 2.4%, 3.3%, and 2.8% (mAP) for small, medium, and large variants of the VehicleId dataset, respectively. It also performs on par with state-of-the-art methods in the literature used for comparison. Full article
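The core idea, weighting each part's loss by that part's distance to the global feature vector so that no extra trainable parameters are introduced, can be sketched in a few lines. The softmax-over-distances rule below is an illustrative choice, not necessarily the paper's exact weighting:

```python
# Derive per-part loss weights from part-to-global feature distances.
import numpy as np

def part_loss_weights(part_feats, global_feat):
    # Euclidean distance of each part embedding to the global embedding
    d = np.linalg.norm(part_feats - global_feat, axis=1)
    return np.exp(d) / np.exp(d).sum()   # softmax: farther parts weigh more

parts = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])  # 3 part embeddings
global_feat = np.array([1.0, 0.0])
w = part_loss_weights(parts, global_feat)
print(w.argmax())  # the part farthest from the global feature
```

During training, each part branch's loss would simply be multiplied by its weight before summation, leaving the network architecture unchanged.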

26 pages, 2362 KiB  
Article
ELNet: An Efficient and Lightweight Network for Small Object Detection in UAV Imagery
by Hui Li, Jianbo Ma and Jianlin Zhang
Remote Sens. 2025, 17(12), 2096; https://doi.org/10.3390/rs17122096 - 18 Jun 2025
Viewed by 594
Abstract
Real-time object detection is critical for unmanned aerial vehicles (UAVs) performing various tasks. However, efficiently deploying detection models on UAV platforms with limited storage and computational resources remains a significant challenge. To address this issue, we propose ELNet, an efficient and lightweight object detection model based on YOLOv12n. First, based on an analysis of UAV image characteristics, we strategically remove two A2C2f modules from YOLOv12n and adjust the size and number of detection heads. Second, we propose a novel lightweight detection head, EPGHead, to alleviate the computational burden introduced by adding the large-scale detection head. In addition, since YOLOv12n employs standard convolution for downsampling, which is inefficient for extracting UAV image features, we design a novel downsampling module, EDown, to further reduce model size and enable more efficient feature extraction. Finally, to improve detection in UAV imagery with dense, small, and scale-varying objects, we propose DIMB-C3k2, an enhanced module built upon C3k2, which boosts feature extraction under complex conditions. Compared with YOLOv12n, ELNet achieves an 88.5% reduction in parameter count and a 52.3% decrease in FLOPs, while increasing mAP50 by 1.2% on the VisDrone dataset and 0.8% on the HIT-UAV dataset, reaching 94.7% mAP50 on HIT-UAV. Furthermore, the model achieves a frame rate of 682 FPS, highlighting its superior computational efficiency without sacrificing detection accuracy. Full article

24 pages, 6003 KiB  
Article
ADSAP: An Adaptive Speed-Aware Trajectory Prediction Framework with Adversarial Knowledge Transfer
by Cheng Da, Yongsheng Qian, Junwei Zeng, Xuting Wei and Futao Zhang
Electronics 2025, 14(12), 2448; https://doi.org/10.3390/electronics14122448 - 16 Jun 2025
Viewed by 369
Abstract
Accurate trajectory prediction of surrounding vehicles is a fundamental challenge in autonomous driving, requiring sophisticated modeling of complex vehicle interactions, traffic dynamics, and contextual dependencies. This paper introduces Adaptive Speed-Aware Prediction (ADSAP), a novel trajectory prediction framework that advances the state of the art through innovative mechanisms for adaptive attention modulation and knowledge transfer. At its core, ADSAP employs an adaptive deformable speed-aware pooling mechanism that dynamically adjusts the model's attention distribution and receptive field based on instantaneous vehicle states and interaction patterns. This adaptive architecture enables fine-grained modeling of diverse traffic scenarios, from sparse highway conditions to dense urban environments. The framework incorporates a sophisticated speed-aware multi-scale feature aggregation module that systematically combines spatial and temporal information across multiple scales, facilitating comprehensive scene understanding and robust trajectory prediction. To bridge the gap between model complexity and computational efficiency, we propose an adversarial knowledge distillation approach that effectively transfers learned representations and decision-making strategies from a high-capacity teacher model to a lightweight student model. This novel distillation mechanism preserves prediction accuracy while significantly reducing computational overhead, making the framework suitable for real-world deployment. Extensive empirical evaluation on the large-scale NGSIM and highD naturalistic driving datasets demonstrates ADSAP's superior performance. The ADSAP framework achieves an 18.7% reduction in average displacement error and a 22.4% improvement in final displacement error compared to state-of-the-art methods while maintaining consistent performance across varying traffic densities (0.05–0.85 vehicles/meter) and speed ranges (0–35 m/s). Moreover, ADSAP exhibits robust generalization capabilities across different driving scenarios and weather conditions, with the lightweight student model achieving 95% of the teacher model's accuracy while offering a 3.2× reduction in inference time. Comprehensive experimental results supported by detailed ablation studies and statistical analyses validate ADSAP's effectiveness in addressing the trajectory prediction challenge. Our framework provides a novel perspective on integrating adaptive attention mechanisms with efficient knowledge transfer, contributing to the development of more reliable and intelligent autonomous driving systems. Significant improvements in prediction accuracy, computational efficiency, and generalization capability demonstrate ADSAP's potential ability to advance autonomous driving technology. Full article
(This article belongs to the Special Issue Advances in AI Engineering: Exploring Machine Learning Applications)
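The teacher-to-student transfer described in the abstract above can be illustrated with a minimal distillation objective: the student is trained both to fit the data and to imitate the teacher's predictions. This is a generic sketch of plain (non-adversarial) distillation, not the paper's adversarial formulation, and the function name and weighting are hypothetical:

```python
import numpy as np

def distill_loss(student_pred, teacher_pred, truth, alpha=0.5):
    """Blend ground-truth supervision with imitation of the teacher.

    All arrays have shape (T, 2): predicted (x, y) per future step.
    alpha weights the teacher-imitation term against the data term.
    """
    data_term = np.mean((student_pred - truth) ** 2)         # fit the data
    imitation = np.mean((student_pred - teacher_pred) ** 2)  # mimic teacher
    return (1 - alpha) * data_term + alpha * imitation

loss = distill_loss(np.zeros((2, 2)), np.ones((2, 2)), np.ones((2, 2)))
```

An adversarial variant replaces the fixed imitation term with a learned discriminator that the student tries to fool, but the overall teacher/student structure is the same.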
31 pages, 2868 KiB  
Article
Optimized Scheduling for Multi-Drop Vehicle–Drone Collaboration with Delivery Constraints Using Large Language Models and Genetic Algorithms with Symmetry Principles
by Mingyang Geng and Anping Chen
Symmetry 2025, 17(6), 934; https://doi.org/10.3390/sym17060934 - 12 Jun 2025
Abstract
With the rapid development of e-commerce and globalization, logistics distribution systems have become integral to modern economies, directly impacting transportation efficiency, resource utilization, and supply chain flexibility. However, solving the Vehicle and Multi-Drone Cooperative Delivery Problem with Delivery Restrictions is challenging due to complex constraints, including limited payloads, short endurance, regional restrictions, and multi-objective optimization. Traditional optimization methods, particularly genetic algorithms, struggle to address these complexities, often relying on static rules or single-objective optimization that fails to balance exploration and exploitation, resulting in local optima and slow convergence. The concept of symmetry plays a crucial role in optimizing the scheduling process, as many logistics problems inherently possess symmetrical properties. By exploiting these symmetries, we can reduce the problem’s complexity and improve solution efficiency. This study proposes a novel and scalable scheduling approach to address the Vehicle and Multi-Drone Cooperative Delivery Problem with Delivery Restrictions, tackling its high complexity, demanding constraints, and real-world applicability requirements. Specifically, we propose a logistics scheduling method called Loegised, which integrates large language models with genetic algorithms while incorporating symmetry principles to enhance the optimization process. Loegised includes three innovative modules: a cognitive initialization module to accelerate convergence by generating high-quality initial solutions, a dynamic operator parameter adjustment module to optimize crossover and mutation rates in real-time for better global search, and a local optimum escape mechanism to prevent stagnation and improve solution diversity. 
The experimental results on benchmark datasets show that Loegised achieves an average delivery time of 14.80, significantly outperforming six state-of-the-art baseline methods, with improvements confirmed by Wilcoxon signed-rank tests (p<0.001). In large-scale scenarios, Loegised reduces delivery time by over 20% compared to conventional methods, demonstrating strong scalability and practical applicability. These findings validate the effectiveness and real-world potential of symmetry-enhanced, language model-guided optimization for advanced logistics scheduling. Full article
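The dynamic operator parameter adjustment module described above can be sketched as a genetic algorithm whose mutation rate adapts to population diversity: when the population collapses toward identical individuals, mutation is increased to escape local optima. The following is an illustrative toy on a single-vehicle delivery order (a simple route-length objective), not the Loegised implementation; all names and parameter values are hypothetical:

```python
import random

def route_length(order, dist):
    """Total length of visiting customers in the given order from depot 0."""
    tour = [0] + list(order) + [0]
    return sum(dist[a][b] for a, b in zip(tour, tour[1:]))

def adaptive_ga(dist, pop_size=30, generations=200, seed=0):
    """Toy GA whose mutation rate adapts to population diversity."""
    rng = random.Random(seed)
    n = len(dist) - 1  # customers are numbered 1..n
    pop = [rng.sample(range(1, n + 1), n) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda o: route_length(o, dist))
        # Fraction of distinct individuals; low diversity → raise mutation
        diversity = len({tuple(o) for o in pop}) / pop_size
        mut_rate = 0.05 + 0.4 * (1 - diversity)
        elite = pop[: pop_size // 2]          # keep the better half
        children = []
        while len(children) < pop_size - len(elite):
            a, b = rng.sample(elite, 2)
            head = a[: rng.randrange(1, n)]   # order crossover
            child = head + [g for g in b if g not in head]
            if rng.random() < mut_rate:       # swap mutation
                i, j = rng.sample(range(n), 2)
                child[i], child[j] = child[j], child[i]
            children.append(child)
        pop = elite + children
    return min(pop, key=lambda o: route_length(o, dist))

# Customers on a line at positions 1, 2, 3; depot at 0
dist = [[abs(i - j) for j in range(4)] for i in range(4)]
best = adaptive_ga(dist)  # an order over customers 1..3 with round-trip length 6
```

In the paper's framing, a large language model additionally guides initialization and parameter choices, whereas this sketch adjusts the rate from a hand-coded diversity measure.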
20 pages, 6209 KiB  
Article
PSNet: Patch-Based Self-Attention Network for 3D Point Cloud Semantic Segmentation
by Hong Yi, Yaru Liu and Ming Wang
Remote Sens. 2025, 17(12), 2012; https://doi.org/10.3390/rs17122012 - 11 Jun 2025
Abstract
LiDAR-captured 3D point clouds are widely used in self-driving cars and smart cities. Point-based semantic segmentation methods allow for more efficient use of the rich geometric information contained in 3D point clouds, so they have gradually replaced other approaches as the mainstream deep learning method in 3D point cloud semantic segmentation. However, existing methods suffer from limited receptive fields and feature misalignment due to hierarchical downsampling. To address these challenges, we propose PSNet, a novel patch-based self-attention network that significantly expands the receptive field while ensuring feature alignment through a patch-aggregation paradigm. PSNet combines patch-based self-attention feature extraction with common point feature aggregation (CPFA) to implicitly model large-scale spatial relationships. The framework first divides the point cloud into overlapping patches to extract local features via multi-head self-attention, then aggregates features of common points across patches to capture long-range context. Extensive experiments on Toronto-3D and Complex Scene Point Cloud (CSPC) datasets validate PSNet’s state-of-the-art performance, achieving overall accuracies (OAs) of 98.4% and 97.2%, respectively, with significant improvements in challenging categories (e.g., +32.1% IoU for fences). Experimental results on the S3DIS dataset show that PSNet attains competitive mIoU accuracy (71.2%) while maintaining lower inference latency (7.03 s). The PSNet architecture achieves a larger receptive field coverage, which represents a significant advantage over existing methods. This work not only reveals the mechanism of patch-based self-attention for receptive field enhancement but also provides insights into attention-based 3D geometric learning and semantic segmentation architectures. Furthermore, it provides substantial references for applications in autonomous vehicle navigation and smart city infrastructure management. Full article
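The aggregation step described above — extract features per overlapping patch, then merge the features of points shared by several patches — can be sketched as follows. This is a generic illustration of the common-point averaging idea only (the per-patch features would come from multi-head self-attention in the paper); the function name and shapes are hypothetical:

```python
import numpy as np

def aggregate_common_points(points, patches, patch_features):
    """Average per-patch features for points shared across patches.

    points:         (N, 3) point cloud
    patches:        list of index arrays, one per (overlapping) patch,
                    each containing unique point indices
    patch_features: list of (len(patch), D) feature arrays, aligned
                    with `patches`
    Returns an (N, D) array where each point's feature is the mean of
    its features from every patch containing it.
    """
    n, d = len(points), patch_features[0].shape[1]
    summed = np.zeros((n, d))
    counts = np.zeros(n)
    for idx, feats in zip(patches, patch_features):
        summed[idx] += feats  # accumulate this patch's features per point
        counts[idx] += 1      # how many patches contain each point
    return summed / counts[:, None]

# Two overlapping patches sharing point 1; 1-D features for clarity
points = np.zeros((3, 3))
patches = [np.array([0, 1]), np.array([1, 2])]
feats = [np.array([[1.0], [2.0]]), np.array([[4.0], [6.0]])]
merged = aggregate_common_points(points, patches, feats)
# point 1 appears in both patches, so its feature is (2 + 4) / 2 = 3
```

Because shared points receive context from every patch that contains them, the effective receptive field grows well beyond any single patch, which is the mechanism behind the larger receptive field coverage claimed above.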