Search Results (166)

Search Parameters:
Keywords = UAV aerial photography

20 pages, 1971 KiB  
Article
FFG-YOLO: Improved YOLOv8 for Target Detection of Lightweight Unmanned Aerial Vehicles
by Tongxu Wang, Sizhe Yang, Ming Wan and Yanqiu Liu
Appl. Syst. Innov. 2025, 8(4), 109; https://doi.org/10.3390/asi8040109 - 4 Aug 2025
Abstract
Target detection is essential in intelligent transportation and in the autonomous control of unmanned aerial vehicles (UAVs), with single-stage detection algorithms widely used due to their speed. However, these algorithms face limitations in detecting small targets, especially in UAV aerial photography, where small targets are often occluded, multi-scale semantic information is easily lost, and there is a trade-off between real-time processing and computational resources. Existing algorithms struggle to effectively extract multi-dimensional features and deep semantic information from images and to balance detection accuracy with model complexity. To address these limitations, we developed FFG-YOLO, a lightweight small-target detection method for UAVs based on YOLOv8. FFG-YOLO incorporates three modules: a feature enhancement block (FEB), a feature concat block (FCB), and a global context awareness block (GCAB). These modules strengthen feature extraction from small targets, resolve semantic bias in multi-scale feature fusion, and help differentiate small targets from complex backgrounds. We also improved the positioning accuracy of small targets using the Wasserstein distance loss function. Experiments showed that FFG-YOLO outperformed other algorithms, including YOLOv8n, in small-target detection while remaining lightweight, meeting the stringent real-time performance and deployment requirements of UAVs.
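
Wasserstein-distance losses for small targets are commonly implemented as the Normalized Wasserstein Distance (NWD), which models each axis-aligned box as a 2D Gaussian. The abstract does not give the exact formulation, so the following is only a minimal PyTorch sketch of that general technique; the normalization constant c is an assumed placeholder.

```python
import torch

def nwd_loss(pred, target, c=12.8):
    """Normalized-Wasserstein-distance-style loss between matched boxes.

    Boxes are (cx, cy, w, h). Each box is modeled as the 2D Gaussian
    N([cx, cy], diag(w^2/4, h^2/4)); the squared 2-Wasserstein distance
    between two such Gaussians has the closed form computed below.
    `c` is a dataset-dependent normalization constant (assumed value).
    """
    w2 = ((pred[..., 0] - target[..., 0]) ** 2
          + (pred[..., 1] - target[..., 1]) ** 2
          + ((pred[..., 2] - target[..., 2]) ** 2
             + (pred[..., 3] - target[..., 3]) ** 2) / 4.0)
    nwd = torch.exp(-torch.sqrt(w2) / c)  # similarity in (0, 1]
    return 1.0 - nwd                      # 0 when the boxes coincide

# Toy usage: a one-pixel center shift on a 4x4 box is penalized gently.
print(nwd_loss(torch.tensor([[10., 10., 4., 4.]]),
               torch.tensor([[11., 10., 4., 5.]])))
```

Unlike IoU, this similarity stays smooth even when tiny boxes barely overlap, which matches the positioning-accuracy motivation in the abstract.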

24 pages, 12286 KiB  
Article
A UAV-Based Multi-Scenario RGB-Thermal Dataset and Fusion Model for Enhanced Forest Fire Detection
by Yalin Zhang, Xue Rui and Weiguo Song
Remote Sens. 2025, 17(15), 2593; https://doi.org/10.3390/rs17152593 - 25 Jul 2025
Abstract
UAVs are essential for forest fire detection given vast forest areas and the inaccessibility of high-risk zones, enabling rapid long-range inspection and detailed close-range surveillance. However, aerial photography faces challenges such as multi-scale target recognition and complex-scenario adaptation (e.g., deformation, occlusion, lighting variations). RGB-Thermal fusion methods effectively integrate visible-light texture and thermal infrared temperature features, but current approaches are constrained by limited datasets and insufficient exploitation of cross-modal complementary information, ignoring cross-level feature interaction. To address data scarcity in wildfire scenarios, we constructed a time-synchronized, multi-scene, multi-angle aerial RGB-Thermal dataset (RGBT-3M) with "Smoke–Fire–Person" annotations and modal alignment via the M-RIFT method. We further propose a CP-YOLOv11-MF fusion detection model based on the advanced YOLOv11 framework, which progressively learns the complementary heterogeneous features of each modality. Experimental validation demonstrates the superiority of our method, with a precision of 92.5%, a recall of 93.5%, a mAP50 of 96.3%, and a mAP50-95 of 62.9%. The model's RGB-Thermal fusion capability enhances early fire detection, offering a benchmark dataset and a methodological advance for intelligent forest conservation, with implications for AI-driven ecological protection.
(This article belongs to the Special Issue Advances in Spectral Imagery and Methods for Fire and Smoke Detection)

23 pages, 13739 KiB  
Article
Traffic Accident Rescue Action Recognition Method Based on Real-Time UAV Video
by Bo Yang, Jianan Lu, Tao Liu, Bixing Zhang, Chen Geng, Yan Tian and Siyu Zhang
Drones 2025, 9(8), 519; https://doi.org/10.3390/drones9080519 - 24 Jul 2025
Abstract
Low-altitude drones, which are unimpeded by traffic congestion or urban terrain, have become a critical asset in emergency rescue missions. To address the current lack of emergency rescue data, UAV aerial videos were collected to create an experimental dataset for action classification and localization annotation. A total of 5082 keyframes were labeled with 1–5 targets each, and 14,412 instances of data were prepared (including flight altitude and camera angles) for action classification and position annotation. To mitigate the challenges posed by high-resolution drone footage with excessive redundant information, we propose the SlowFast-Traffic (SF-T) framework, a spatio-temporal sequence-based algorithm for recognizing traffic accident rescue actions. For more efficient extraction of target–background correlation features, we introduce the Actor-Centric Relation Network (ACRN) module, which employs temporal max pooling to enhance the time-dimensional features of static backgrounds, significantly reducing redundancy-induced interference. Additionally, smaller ROI feature map outputs are adopted to boost computational speed. To tackle class imbalance in incident samples, we integrate a Class-Balanced Focal Loss (CB-Focal Loss) function, effectively resolving rare-action recognition in specific rescue scenarios. We replace the original Faster R-CNN with YOLOX-s to improve the target detection rate. On our proposed dataset, the SF-T model achieves a mean average precision (mAP) of 83.9%, which is 8.5% higher than that of the standard SlowFast architecture while maintaining a processing speed of 34.9 tasks/s. Both accuracy-related metrics and computational efficiency are substantially improved. The proposed method demonstrates strong robustness and real-time analysis capabilities for modern traffic rescue action recognition.
(This article belongs to the Special Issue Cooperative Perception for Modern Transportation)
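
The Class-Balanced Focal Loss integrated by SF-T follows a standard recipe: focal loss reweighted by the inverse "effective number" of samples per class (Cui et al., 2019). The sketch below is a minimal PyTorch rendering of that recipe under assumed hyperparameters, not the authors' exact code.

```python
import torch
import torch.nn.functional as F

def cb_focal_loss(logits, labels, samples_per_class, beta=0.9999, gamma=2.0):
    """Class-Balanced Focal Loss sketch.

    Each class is weighted by (1 - beta) / (1 - beta**n_c), which dampens
    the dominance of frequent classes such as common rescue actions and
    boosts the gradient contribution of rare ones.
    """
    n = torch.as_tensor(samples_per_class, dtype=torch.float32)
    weights = (1.0 - beta) / (1.0 - torch.pow(beta, n))
    weights = weights / weights.sum() * len(n)   # normalize to mean 1
    ce = F.cross_entropy(logits, labels, reduction="none")
    p_t = torch.exp(-ce)                          # probability of true class
    focal = (1.0 - p_t) ** gamma * ce             # down-weight easy samples
    return (weights[labels] * focal).mean()

# Toy usage: three action classes with strongly imbalanced counts.
logits = torch.randn(8, 3)
labels = torch.randint(0, 3, (8,))
print(cb_focal_loss(logits, labels, samples_per_class=[5000, 300, 40]))
```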

26 pages, 23518 KiB  
Article
Avalanche Hazard Dynamics and Causal Analysis Along China’s G219 Corridor: A Case Study of the Wenquan–Khorgas Section
by Xuekai Wang, Jie Liu, Qiang Guo, Bin Wang, Zhiwei Yang, Qiulian Cheng and Haiwei Xie
Atmosphere 2025, 16(7), 817; https://doi.org/10.3390/atmos16070817 - 4 Jul 2025
Abstract
Investigating avalanche hazards is a fundamental preliminary task in avalanche research and is critically important for establishing avalanche warning systems and designing mitigation measures. Primary research data originated from field investigations and UAV aerial surveys, with avalanche counts and timing identified through image interpretation. Snowpack properties were primarily acquired via in situ field testing within the study area. Methodologically, statistical modeling and RAMMS::AVALANCHE simulations revealed the spatiotemporal and dynamic characteristics of avalanches; the Certainty Factor (CF) model and sensitivity analysis were then applied to determine the dominant controlling factors and quantify the zonal influence intensity of each parameter. Utilizing field reconnaissance and drone aerial photography, this study identified 86 avalanche points in the study area. We used field tests and weather data to run the RAMMS::AVALANCHE model, then categorized and summarized regional avalanche characteristics using both field surveys and simulation results. Furthermore, the Certainty Factor Model (CFM) and the parameter Sensitivity Index (Sa) were applied to assess the influence of elevation, slope gradient, aspect, and maximum snow depth on the severity of avalanche disasters. The results indicate the following: (1) Avalanches exhibit pronounced spatiotemporal concentration: temporally, they cluster between February and March and during 13:00–18:00 daily; spatially, they concentrate within the 2100–3000 m elevation zone. Chute-confined avalanches dominate the region, comprising 73.26% of total events; most chute-confined avalanches feature multiple release areas, so the number of release areas exceeds the number of avalanche points. In terms of scale, medium-to-large avalanches dominate, accounting for 86.5% of all events. (2) RAMMS::AVALANCHE simulations yielded the following maximum values for the region: flow height = 15.43 m, flow velocity = 47.6 m/s, flow pressure = 679.79 kPa, and deposition height = 10.3 m. Compared to chute-confined avalanches, unconfined slope avalanches exhibit higher flow velocities and pressures, posing greater hazard potential. (3) The Certainty Factor Model and Sensitivity Index identify maximum snow depth, elevation, and slope gradient as the key drivers of avalanches in the study area, with relative impact ranked as maximum snow depth > elevation > slope gradient > aspect and sensitivity index values of 1.536, 1.476, 1.362, and 0.996, respectively. These findings provide a scientific basis for further research on avalanche hazards, the development of avalanche warning systems, and the design of avalanche mitigation projects in the study area.
(This article belongs to the Special Issue Climate Change in the Cryosphere and Its Impacts)
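
The Certainty Factor model named above has a standard closed form in hazard-susceptibility studies: it compares the event probability within one factor class (e.g., a single elevation band) against the prior probability over the whole study area. The sketch below illustrates that general formula with hypothetical raster-cell counts; the paper's actual class definitions and counts are not reproduced here.

```python
def certainty_factor(event_cells_in_class, cells_in_class,
                     event_cells_total, cells_total):
    """Certainty Factor (CF) for one factor class.

    ppa: conditional probability of an avalanche cell within the class;
    pps: prior probability over the whole study area. CF lies in [-1, 1];
    positive values mean the class favors avalanche occurrence.
    """
    ppa = event_cells_in_class / cells_in_class
    pps = event_cells_total / cells_total
    if ppa >= pps:
        return (ppa - pps) / (ppa * (1.0 - pps))
    return (ppa - pps) / (pps * (1.0 - ppa))

# Hypothetical example: 30 avalanche cells in a 5,000-cell elevation band
# versus 86 avalanche cells over a 40,000-cell study area.
print(certainty_factor(30, 5000, 86, 40000))  # ~0.64: favorable class
```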

29 pages, 18908 KiB  
Article
Toward Efficient UAV-Based Small Object Detection: A Lightweight Network with Enhanced Feature Fusion
by Xingyu Di, Kangning Cui and Rui-Feng Wang
Remote Sens. 2025, 17(13), 2235; https://doi.org/10.3390/rs17132235 - 29 Jun 2025
Abstract
UAV-based small target detection is crucial in environmental monitoring, circuit inspection, and related applications. However, UAV images often present challenges such as significant scale variation, dense small targets, high inter-class similarity, and intra-class diversity, which can lead to missed detections and thus reduced performance. To solve these problems, this study proposes UAV-YOLO, a lightweight, high-precision model based on YOLOv8s. Firstly, a double separation convolution (DSC) module is designed to replace the Bottleneck structure in the C2f module with a fusion of depthwise separable convolution and pointwise convolution, reducing model parameters and computational complexity while enhancing feature expression. Secondly, a new SPPL module is proposed, combining spatial pyramid pooling fast (SPPF) with long-distance dependency modeling (LSKA) to improve the model's robustness to multi-scale targets through cross-level feature association. Then, DyHead replaces the original detection head, enhancing the discrimination of small targets against complex backgrounds through adaptive weight allocation and cross-scale feature fusion. Finally, the WIPIoU loss function is proposed, which integrates the advantages of Wise-IoU, MPDIoU, and Inner-IoU and incorporates the bounding box geometric center, aspect ratio, and overlap into a unified measure to improve small-target localization accuracy and accelerate convergence. Experimental results on the VisDrone2019 dataset showed that, compared to YOLOv8s, UAV-YOLO improved mAP@0.5 by 8.9% and recall by 6.8%, while parameters and computations were reduced by 23.4% and 40.7%, respectively. Additional evaluations on the DIOR, RSOD, and NWPU VHR-10 datasets demonstrate the generalization capability of the model.
(This article belongs to the Special Issue Geospatial Intelligence in Remote Sensing)
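
The depthwise separable plus pointwise pattern behind the DSC module is a standard building block. The PyTorch sketch below shows a generic version of it (not the authors' exact module) and why it cuts parameters relative to a dense 3x3 convolution.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise separable convolution: a per-channel (depthwise) 3x3 conv
    followed by a 1x1 pointwise conv. For C input and output channels it
    costs roughly 9C + C^2 multiply-accumulates per pixel versus 9C^2 for
    a standard 3x3 conv, which is the kind of saving the DSC module in
    the abstract relies on. This is a generic block, not the paper's."""
    def __init__(self, c_in, c_out, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(c_in, c_in, 3, stride, 1,
                                   groups=c_in, bias=False)
        self.pointwise = nn.Conv2d(c_in, c_out, 1, bias=False)
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

# Toy usage on a YOLO-sized feature map.
x = torch.randn(1, 64, 80, 80)
print(DepthwiseSeparableConv(64, 64)(x).shape)  # torch.Size([1, 64, 80, 80])
```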

28 pages, 1707 KiB  
Review
Video Stabilization: A Comprehensive Survey from Classical Mechanics to Deep Learning Paradigms
by Qian Xu, Qian Huang, Chuanxu Jiang, Xin Li and Yiming Wang
Modelling 2025, 6(2), 49; https://doi.org/10.3390/modelling6020049 - 17 Jun 2025
Abstract
Video stabilization is a critical technology for enhancing video quality by eliminating or reducing image instability caused by camera shake, thereby improving the viewing experience. It has been deeply integrated into diverse applications, including handheld recording, UAV aerial photography, and vehicle-mounted surveillance. Propelled by advances in deep learning, data-driven stabilization methods have emerged as prominent solutions, demonstrating superior efficacy in handling jitter while achieving enhanced processing efficiency. This review systematically examines the field of video stabilization. First, it delineates the paradigm shift from classical to deep learning-based approaches. It then elucidates conventional digital stabilization frameworks and their deep learning counterparts, and establishes standardized assessment metrics and benchmark datasets for comparative analysis. Finally, it addresses critical challenges such as robustness limitations in complex motion scenarios and latency constraints in real-time processing. By integrating interdisciplinary perspectives, this work provides scholars with academically rigorous and practically relevant insights to advance video stabilization research.
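
The conventional digital pipeline the review covers (estimate inter-frame motion, smooth the camera trajectory, then re-warp each frame) can be sketched with OpenCV. The code below is a minimal illustration of that classical framework; the feature settings, smoothing radius, and similarity-transform motion model are assumptions, not any specific method from the survey.

```python
import cv2
import numpy as np

def stabilize(frames, radius=15):
    """Classical stabilization sketch: track corners between consecutive
    frames, accumulate (dx, dy, dtheta) into a trajectory, smooth it with
    a moving average, and warp each frame by the smoothing correction."""
    motions = []
    prev = cv2.cvtColor(frames[0], cv2.COLOR_BGR2GRAY)
    for frame in frames[1:]:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        pts = cv2.goodFeaturesToTrack(prev, 200, 0.01, 30)
        nxt, ok, _ = cv2.calcOpticalFlowPyrLK(prev, gray, pts, None)
        m, _ = cv2.estimateAffinePartial2D(pts[ok == 1], nxt[ok == 1])
        motions.append((m[0, 2], m[1, 2], np.arctan2(m[1, 0], m[0, 0])))
        prev = gray
    traj = np.cumsum(motions, axis=0)                 # raw camera path
    kernel = np.ones(2 * radius + 1) / (2 * radius + 1)
    smooth = np.column_stack([np.convolve(traj[:, i], kernel, "same")
                              for i in range(3)])     # smoothed path
    h, w = frames[0].shape[:2]
    out = [frames[0]]
    for frame, (dx, dy, da) in zip(frames[1:],
                                   np.asarray(motions) + smooth - traj):
        warp = np.array([[np.cos(da), -np.sin(da), dx],
                         [np.sin(da),  np.cos(da), dy]])
        out.append(cv2.warpAffine(frame, warp, (w, h)))
    return out
```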

18 pages, 4854 KiB  
Article
Comparing UAV-Based Hyperspectral and Satellite-Based Multispectral Data for Soil Moisture Estimation Using Machine Learning
by Hadi Shokati, Mahmoud Mashal, Aliakbar Noroozi, Saham Mirzaei, Zahra Mohammadi-Doqozloo, Kamal Nabiollahi, Ruhollah Taghizadeh-Mehrjardi, Pegah Khosravani, Rabindra Adhikari, Ling Hu and Thomas Scholten
Water 2025, 17(11), 1715; https://doi.org/10.3390/w17111715 - 5 Jun 2025
Abstract
Accurate estimation of soil moisture content (SMC) is crucial for effective water management, enabling improved monitoring of water stress and a deeper understanding of hydrological processes. While satellite remote sensing provides broad coverage, its spatial resolution often limits its ability to capture small-scale variations in SMC, especially in landscapes with diverse land-cover types. Unmanned aerial vehicles (UAVs) equipped with hyperspectral sensors offer a promising solution to overcome this limitation. This study compares the effectiveness of Sentinel-2 and Landsat-8/9 multispectral data and UAV hyperspectral data (with spectral bands spanning 339.6 nm to 1028.8 nm) in estimating SMC in a research farm consisting of bare soil, cropland, and grassland. A DJI Matrice 100 UAV equipped with a hyperspectral spectrometer collected data in 14 field campaigns synchronized with satellite overpasses. Five machine-learning algorithms, namely extreme learning machines (ELMs), Gaussian process regression (GPR), partial least squares regression (PLSR), support vector regression (SVR), and artificial neural networks (ANNs), were used to estimate SMC, focusing on the influence of land cover on estimation accuracy. The findings indicated that GPR outperformed the other models when using Landsat-8/9 and UAV hyperspectral data, demonstrating a strong correlation with the observed SMC (R2 = 0.64 and 0.89, respectively). For Sentinel-2 data, ELM showed the highest correlation, with an R2 value of 0.46. A comparative analysis showed that the UAV hyperspectral data outperformed both satellite sources owing to better spatial and spectral resolution, and the Landsat-8/9 data outperformed the Sentinel-2 data in SMC estimation accuracy. Across land-cover types, all remote-sensing data sources showed the highest accuracy for bare soil compared to cropland and grassland. This research highlights the potential of integrating UAV-based spectroscopy and machine-learning techniques as complementary tools to satellite platforms for precise SMC monitoring. The findings contribute to the further development of remote-sensing methods and improve the understanding of SMC dynamics in heterogeneous landscapes, with significant implications for precision agriculture. By enhancing SMC estimation accuracy at high spatial resolution, this approach can optimize irrigation practices, improve cropping strategies, and contribute to sustainable agricultural practices, ultimately enabling better decision-making for farmers and land managers. However, its broader applicability depends on factors such as scalability and performance under different conditions.
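
Gaussian process regression of SMC from band reflectances can be reproduced with scikit-learn. The sketch below uses synthetic spectra as stand-ins for the UAV measurements and an assumed RBF-plus-noise kernel, since the paper's exact preprocessing and kernel choices are not given here.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# X: per-sample reflectance spectra (n_samples x n_bands); y: measured SMC.
# Synthetic stand-ins here; in the study these would come from the UAV
# spectrometer and in situ soil moisture measurements.
rng = np.random.default_rng(0)
X = rng.random((120, 50))
y = 0.4 * X[:, 10] - 0.2 * X[:, 30] + 0.05 * rng.standard_normal(120)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gpr.fit(X_tr, y_tr)
pred, std = gpr.predict(X_te, return_std=True)  # std gives pixel uncertainty
print(f"R2 = {r2_score(y_te, pred):.2f}")
```

One reason GPR suits this task is the predictive standard deviation, which flags spectra unlike anything in the training campaigns.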

29 pages, 6039 KiB  
Article
Tree Species Detection and Enhancing Semantic Segmentation Using Machine Learning Models with Integrated Multispectral Channels from PlanetScope and Digital Aerial Photogrammetry in Young Boreal Forest
by Arun Gyawali, Mika Aalto and Tapio Ranta
Remote Sens. 2025, 17(11), 1811; https://doi.org/10.3390/rs17111811 - 22 May 2025
Abstract
The precise identification and classification of tree species in young forests during their early development stages are vital for forest management and silvicultural efforts that support their growth and renewal. However, achieving accurate geolocation and species classification through field-based surveys is often a labor-intensive and complicated task. Remote sensing technologies combined with machine learning techniques present an encouraging solution, offering a more efficient alternative to conventional field-based methods. This study aimed to detect and classify young forest tree species using remote sensing imagery and machine learning techniques. The study involved two objectives: first, tree species detection using the latest version of You Only Look Once (YOLOv12), and second, semantic segmentation (classification) using random forest, Categorical Boosting (CatBoost), and a Convolutional Neural Network (CNN). To the best of our knowledge, this is the first exploration of YOLOv12 for tree species identification, and the first study to integrate digital aerial photogrammetry with Planet imagery for semantic segmentation in young forests. The study used two remote sensing datasets: RGB imagery from unmanned aerial vehicle (UAV) ortho photography and RGB-NIR from PlanetScope. For YOLOv12-based tree species detection, only the RGB ortho photography was used, while semantic segmentation was performed with three sets of data: (1) ortho RGB (3 bands), (2) ortho RGB + canopy height model (CHM) + Planet RGB-NIR (8 bands), and (3) ortho RGB + CHM + Planet RGB-NIR + 12 vegetation indices (20 bands). With three models applied to these datasets, nine machine learning models were trained and tested using 57 images (1024 × 1024 pixels) and their corresponding mask tiles. The YOLOv12 model achieved 79% overall accuracy, with Scots pine performing best (precision: 97%, recall: 92%, mAP50: 97%, mAP75: 80%) and Norway spruce showing slightly lower accuracy (precision: 94%, recall: 82%, mAP50: 90%, mAP75: 71%). For semantic segmentation, the CatBoost model with 20 bands outperformed the other models, achieving 85% accuracy, 80% Kappa, and 81% MCC, with CHM, EVI, NIRPlanet, GreenPlanet, NDGI, GNDVI, and NDVI being the most influential variables. These results indicate that a simple boosting model like CatBoost can outperform more complex CNNs for semantic segmentation in young forests.
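
Pixel-wise semantic segmentation with a boosting model, as in the CatBoost setup above, reduces to tabular classification on stacked band values. The sketch below shows that framing with synthetic data; the band count matches the 20-band setup, but the class list and hyperparameters are illustrative assumptions.

```python
import numpy as np
from catboost import CatBoostClassifier
from sklearn.metrics import accuracy_score, cohen_kappa_score, matthews_corrcoef

# Each row is one pixel's stacked features (e.g., ortho RGB + CHM +
# PlanetScope bands + vegetation indices = 20 columns); labels are species.
rng = np.random.default_rng(0)
X = rng.random((5000, 20))
y = rng.integers(0, 3, 5000)   # hypothetical pine / spruce / deciduous codes

model = CatBoostClassifier(iterations=300, depth=6, verbose=0)
model.fit(X, y)
pred = model.predict(X).ravel()
print(accuracy_score(y, pred), cohen_kappa_score(y, pred),
      matthews_corrcoef(y, pred))   # the abstract's three metrics
```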

22 pages, 9648 KiB  
Article
Three-Dimensional Real-Scene-Enhanced GNSS/Intelligent Vision Surface Deformation Monitoring System
by Yuanrong He, Weijie Yang, Qun Su, Qiuhua He, Hongxin Li, Shuhang Lin and Shaochang Zhu
Appl. Sci. 2025, 15(9), 4983; https://doi.org/10.3390/app15094983 - 30 Apr 2025
Abstract
With the acceleration of urbanization, surface deformation monitoring has become crucial. Existing monitoring systems face several challenges, such as data singularity, the poor nighttime monitoring quality of video surveillance, and fragmented visual data. To address these issues, this paper presents a 3D real-scene (3DRS)-enhanced GNSS/intelligent vision surface deformation monitoring system. The system integrates GNSS monitoring terminals and multi-source meteorological sensors to accurately capture minute displacements at monitoring points and multi-source Internet of Things (IoT) data, which are then automatically stored in MySQL databases. To enhance the functionality of the system, the visual sensor data are fused with 3D models through streaming media technology, enabling 3D real-scene augmented reality to support dynamic deformation monitoring and visual analysis. WebSocket-based remote lighting control is implemented to enhance the quality of video data at night. The spatiotemporal fusion of UAV aerial data with 3D models is achieved through Blender image-based rendering, while edge detection is employed to extract crack parameters from intelligent inspection vehicle data. The 3DRS model is constructed through UAV oblique photography, 3D laser scanning, and the combined use of SVSGeoModeler and SketchUp. A visualization platform for surface deformation monitoring is built on the 3DRS foundation, adopting an "edge collection–cloud fusion–terminal interaction" approach. This platform dynamically superimposes GNSS and multi-source IoT monitoring data onto the 3D spatial base, enabling spatiotemporal correlation analysis of millimeter-level displacements and early risk warning.
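
The abstract notes that edge detection extracts crack parameters from inspection-vehicle imagery but gives no algorithmic detail, so the sketch below is only a plausible OpenCV illustration; the Canny thresholds, minimum contour length, and the length/width estimates are all assumptions.

```python
import cv2

def crack_parameters(image_path, mm_per_px=1.0):
    """Edge-based crack extraction sketch: Canny edges, contour grouping,
    then simple length and mean-width estimates per crack candidate."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    blurred = cv2.GaussianBlur(img, (5, 5), 0)
    edges = cv2.Canny(blurred, 50, 150)
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    params = []
    for c in contours:
        if cv2.arcLength(c, False) < 50:    # skip short spurious edges
            continue
        x, y, w, h = cv2.boundingRect(c)
        length = max(w, h) * mm_per_px      # crude crack length
        area = cv2.contourArea(c) * mm_per_px ** 2
        mean_width = area / length if length else 0.0
        params.append({"length_mm": length, "mean_width_mm": mean_width})
    return params
```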

27 pages, 11601 KiB  
Article
Monitoring and Evaluation of Ecological Restoration Effectiveness: A Case Study of the Liaohe River Estuary Wetland
by Yongli Hou, Nanxiang Hu, Chao Teng, Lulin Zheng, Jiabing Zhang and Yifei Gong
Sustainability 2025, 17(7), 2973; https://doi.org/10.3390/su17072973 - 27 Mar 2025
Abstract
The Liaohe River Estuary Wetland, located in Panjin City, plays a critical role in reducing pollution loads, maintaining biodiversity, and ensuring ecological security in China's coastal regions, contributing significantly to the implementation of the land–sea coordination strategy. As key components of ecological restoration projects, monitoring and evaluating restoration effectiveness provide a reliable basis for decision-making and ecosystem management. This study established an innovative three-dimensional integrated monitoring and evaluation system combining satellite imagery, UAV aerial photography, and field sampling surveys, addressing the technical gaps in multi-scale, multi-dimensional dynamic ecological monitoring. Through systematic monitoring and assessment of key indicators, including the water environment, soil environment, biodiversity, water conservation capacity, and carbon sequestration capacity, we comprehensively evaluated the enhancement effects of ecological restoration projects on regional ecosystem structure, quality, and service functions. The satellite–airborne–ground integrated monitoring revealed significantly improved water quality and soil properties, enhanced soil–water conservation capabilities, and increased biodiversity indices and carbon sequestration potential. These results validate the scientific soundness of the ecological protection measures and the comprehensive benefits of the restoration outcomes. The primary contributions of this research lie in developing a novel monitoring framework that provides critical data support for decision-making, project acceptance, effectiveness evaluation, and adaptive management in ecological restoration, and in establishing transferable methodologies applicable not only to the Liaohe River Estuary wetlands but also to similar ecosystems globally, demonstrating broad applicability in ecological governance.
(This article belongs to the Topic Water Management in the Age of Climate Change)

23 pages, 31391 KiB  
Article
A Method for Airborne Small-Target Detection with a Multimodal Fusion Framework Integrating Photometric Perception and Cross-Attention Mechanisms
by Shufang Xu, Heng Li, Tianci Liu and Hongmin Gao
Remote Sens. 2025, 17(7), 1118; https://doi.org/10.3390/rs17071118 - 21 Mar 2025
Abstract
In recent years, the rapid advancement and pervasive deployment of unmanned aerial vehicle (UAV) technology have catalyzed transformative applications across the military, civilian, and scientific domains. While aerial imaging has emerged as a pivotal tool in modern remote sensing systems, persistent challenges remain in achieving robust small-target detection under complex all-weather conditions. This paper presents an innovative multimodal fusion framework incorporating photometric perception and cross-attention mechanisms to address the critical limitations of current single-modality detection systems, particularly their susceptibility to reduced accuracy and elevated false-negative rates in adverse environmental conditions. Our architecture introduces three novel components: (1) a bidirectional hierarchical feature extraction network that enables the synergistic processing of heterogeneous sensor data; (2) a cross-modality attention mechanism that dynamically establishes inter-modal feature correlations through learnable attention weights; and (3) an adaptive photometric weighting fusion module that implements spectral characteristic-aware feature recalibration. The proposed system achieves multimodal complementarity through two-phase integration: first establishing cross-modal feature correspondences through attention-guided feature alignment, then performing weighted fusion based on photometric reliability assessment. Comprehensive experiments demonstrate that our framework improves mAP by at least 3.6% over the other models on the challenging LLVIP dataset, with particular gains in detection reliability on the KAIST dataset. This research advances the state of the art in aerial target detection by providing a principled approach to multimodal sensor fusion, with significant implications for surveillance, disaster response, and precision agriculture applications.
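
Cross-modality attention of the kind described in component (2) is typically built from standard attention primitives. The PyTorch sketch below is a generic block in which visible-light tokens query aligned infrared tokens; the dimensions, head count, and residual-plus-norm arrangement are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    """Minimal cross-modality attention sketch: one modality's feature
    tokens query the other's, so each location can borrow complementary
    evidence (e.g., thermal response for a dim visible-light target)."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, rgb_tokens, ir_tokens):
        fused, _ = self.attn(query=rgb_tokens, key=ir_tokens,
                             value=ir_tokens)       # learnable correlations
        return self.norm(rgb_tokens + fused)        # residual fusion

# Usage: flattened feature maps as (batch, H*W, channels) token sequences.
rgb = torch.randn(2, 400, 256)   # 20x20 visible-light feature map
ir = torch.randn(2, 400, 256)    # spatially aligned infrared feature map
print(CrossModalAttention()(rgb, ir).shape)  # torch.Size([2, 400, 256])
```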

31 pages, 21485 KiB  
Article
UAV-SfM Photogrammetry for Canopy Characterization Toward Unmanned Aerial Spraying Systems Precision Pesticide Application in an Orchard
by Qi Bing, Ruirui Zhang, Linhuan Zhang, Longlong Li and Liping Chen
Drones 2025, 9(2), 151; https://doi.org/10.3390/drones9020151 - 18 Feb 2025
Abstract
The development of unmanned aerial spraying systems (UASSs) has significantly transformed pest and disease control methods for crop plants. Precisely adjusting pesticide application rates based on target conditions is an effective way to improve pesticide use efficiency. In orchard spraying, the structural characteristics of the canopy are crucial for guiding the application system to adjust spraying parameters. This study selected mango trees as the research sample and evaluated the differences between UAV aerial photography processed with a Structure-from-Motion (SfM) algorithm and airborne LiDAR in extracting canopy parameters. The maximum canopy height, canopy projection area, and canopy volume were extracted from the canopy height model of SfM (CHMSfM) and the canopy height model of LiDAR (CHMLiDAR) using grids with the same width as the planting rows (5.0 m) and 14 different heights (0.2 m, 0.3 m, 0.4 m, 0.5 m, 0.6 m, 0.8 m, 1.0 m, 2.0 m, 3.0 m, 4.0 m, 5.0 m, 6.0 m, 8.0 m, and 10.0 m). Linear regression equations were used to fit the canopy parameters obtained from the different sensors. The correlation was evaluated using R2 and rRMSE, and a t-test (α = 0.05) was employed to assess the significance of the differences. The results show that as the grid height increases, the R2 values for the maximum canopy height, projection area, and canopy volume extracted from CHMSfM and CHMLiDAR increase, while the rRMSE values decrease. When the grid height is 10.0 m, the R2 for the maximum canopy height extracted from the two models is 92.85%, with an rRMSE of 0.0563; for the canopy projection area, the R2 is 97.83%, with an rRMSE of 0.01; and for the canopy volume, the R2 is 98.35%, with an rRMSE of 0.0337. When the grid height exceeds 1.0 m, the t-test p-values for the three parameters are all greater than 0.05, supporting the hypothesis that there is no significant difference between the canopy parameters obtained by the two sensors. Additionally, using the coordinate x0 of the intersection of the linear regression line with y=x as a reference, CHMSfM tends to overestimate lower canopy maximum heights and projection areas and underestimate higher ones compared to CHMLiDAR, which partly reflects the smoother surface of CHMSfM. This study demonstrates the effectiveness of extracting canopy parameters from UAV oblique photography combined with the SfM algorithm to guide UASS systems in variable-rate spraying.
(This article belongs to the Special Issue Recent Advances in Crop Protection Using UAV and UGV)
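
Extracting the three canopy parameters from a canopy height model raster is straightforward grid arithmetic. The NumPy sketch below illustrates one plausible per-grid-cell implementation; the 0.2 m canopy threshold and the pixel size are assumptions, not values from the paper.

```python
import numpy as np

def canopy_parameters(chm, px_area, height_threshold=0.2):
    """Per-grid canopy parameters from a canopy height model (CHM) tile,
    following the abstract's three metrics: maximum height, projected
    area (pixels above a height threshold), and volume (summed heights
    times pixel area)."""
    canopy = chm > height_threshold
    return {
        "max_height_m": float(chm.max()),
        "projection_area_m2": float(canopy.sum() * px_area),
        "volume_m3": float(chm[canopy].sum() * px_area),
    }

# Usage: a synthetic 5 m x 5 m grid cell at 0.05 m ground resolution.
rng = np.random.default_rng(0)
tile = np.clip(rng.normal(2.0, 1.0, (100, 100)), 0, None)
print(canopy_parameters(tile, px_area=0.05 * 0.05))
```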

22 pages, 5596 KiB  
Article
URAdv: A Novel Framework for Generating Ultra-Robust Adversarial Patches Against UAV Object Detection
by Hailong Xi, Le Ru, Jiwei Tian, Bo Lu, Shiguang Hu, Wenfei Wang and Xiaohui Luan
Mathematics 2025, 13(4), 591; https://doi.org/10.3390/math13040591 - 11 Feb 2025
Abstract
In recent years, deep learning has been extensively deployed on unmanned aerial vehicles (UAVs), particularly for object detection. As the cornerstone of UAV-based object detection, deep neural networks are susceptible to adversarial attacks, with adversarial patches being a relatively straightforward method to implement. However, current research on adversarial patches, especially those targeting UAV object detection, is limited. This scarcity is notable given the complex and dynamically changing environment inherent in UAV image acquisition, which necessitates the development of more robust adversarial patches to achieve effective attacks. To address the challenge of adversarial attacks in UAV high-altitude reconnaissance, this paper presents a robust adversarial patch generation framework. Firstly, the dataset is reconstructed by considering various environmental factors that UAVs may encounter during image collection, and the influences of reflections and shadows during photography are integrated into patch training. Additionally, a nested optimization method is employed to enhance the continuity of attacks across different altitudes. Experimental results demonstrate that the adversarial patches generated by the proposed method exhibit greater robustness in complex environments and have better transferability among similar models.
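
Training a patch under randomized lighting effects, akin to the reflections and shadows the paper integrates, follows the expectation-over-transformation pattern. The PyTorch sketch below is a heavily simplified illustration under assumptions: `detector` is a hypothetical model returning a per-image detection confidence, and the fixed paste location and photometric jitter are placeholders, not the paper's framework.

```python
import torch

def train_patch(detector, images, patch_size=64, steps=200, lr=0.03):
    """Expectation-over-transformation-style patch optimization sketch.

    Each step pastes the patch into the batch under a random photometric
    gain (a stand-in for reflection/shadow effects) and minimizes the
    detector's confidence so that detections are suppressed."""
    patch = torch.rand(3, patch_size, patch_size, requires_grad=True)
    opt = torch.optim.Adam([patch], lr=lr)
    for _ in range(steps):
        x = images.clone()
        gain = 0.6 + 0.8 * torch.rand(1)            # random lighting gain
        x[:, :, :patch_size, :patch_size] = (patch * gain).clamp(0, 1)
        loss = detector(x).mean()                   # confidence to minimize
        opt.zero_grad()
        loss.backward()
        opt.step()
        patch.data.clamp_(0, 1)                     # keep a printable patch
    return patch.detach()
```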

17 pages, 3431 KiB  
Article
Interchangeability of Cross-Platform Orthophotographic and LiDAR Data in DeepLabV3+-Based Land Cover Classification Method
by Shijun Pan, Keisuke Yoshida, Satoshi Nishiyama, Takashi Kojima and Yutaro Hashimoto
Land 2025, 14(2), 217; https://doi.org/10.3390/land14020217 - 21 Jan 2025
Abstract
Riverine environmental information includes important data to collect, yet data collection still depends on personnel conducting field surveys, and these on-site tasks face significant limitations (e.g., areas that are hard or dangerous to enter). In recent years, air-vehicle-based Light Detection and Ranging (LiDAR) technologies have been applied as an efficient data-collection approach in global environmental research, e.g., land cover classification (LCC) and environmental monitoring. This study focused on seven LCC types (bamboo, tree, grass, bare ground, water, road, and clutter) that can be parameterized for flood simulation. A validated airborne LiDAR bathymetry system (ALB) and a UAV-borne green LiDAR system (GLS) were applied for cross-platform analysis of LCC. LiDAR data were visualized using high-contrast color scales to improve the accuracy of land cover classification through image fusion techniques; where high-resolution aerial imagery is available, it must be downscaled to match the resolution of the low-resolution point clouds. Cross-platform data interchangeability was assessed with an interchangeability measure, defined as the absolute difference in overall accuracy (OA) or macro-F1 between platforms. Notably, relying solely on aerial photographs is inadequate for precise labeling, particularly under limited sunlight conditions that can lead to misclassification; in such cases, LiDAR plays a crucial role in facilitating target recognition. All approaches (low-resolution digital imagery, LiDAR-derived imagery, and image fusion) achieve over 0.65 OA and around 0.6 macro-F1. The vegetation classes (bamboo, tree, grass) and the road class perform comparatively better than the clutter and bare ground classes. Under the stated conditions, differences between data acquired in different years (ALB from 2017 and GLS from 2020) are the main reason: because the clutter class comprises all items not covered by the other classes, its RGB-based features cannot easily be transferred across the 3-year gap, and the bare ground class also shows color changes between ALB and GLS caused by on-site reconstruction, which decreases interchangeability. For individual classes, without considering seasons and platforms, image fusion classifies bamboo and trees with higher F1 scores than low-resolution digital imagery or LiDAR-derived imagery, demonstrating cross-platform interchangeability for tall vegetation types. In recent years, high-resolution photography (UAV), high-precision LiDAR measurement (ALB, GLS), and satellite imagery have all been used, but LiDAR equipment is expensive and measurement opportunities are limited. It would therefore be desirable if ALB and GLS data could be classified continuously by artificial intelligence, and this study investigated such data interchangeability. A unique and crucial aspect of this study is its exploration of the interchangeability of land cover classification models across different LiDAR platforms.
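
The interchangeability measure reduces to simple metric differences. The sketch below shows one plausible reading using scikit-learn with toy per-pixel labels; the pairing of which model is evaluated on which platform is an assumption for illustration.

```python
from sklearn.metrics import accuracy_score, f1_score

def interchangeability(y_true, pred_same_platform, pred_cross_platform):
    """Interchangeability as described in the abstract: the absolute
    difference in overall accuracy (OA) or macro-F1 when a model trained
    on one LiDAR platform is evaluated on the other."""
    oa_gap = abs(accuracy_score(y_true, pred_same_platform)
                 - accuracy_score(y_true, pred_cross_platform))
    f1_gap = abs(f1_score(y_true, pred_same_platform, average="macro")
                 - f1_score(y_true, pred_cross_platform, average="macro"))
    return oa_gap, f1_gap

# Toy per-pixel labels for the seven classes (coded 0-6).
y = [0, 1, 2, 3, 4, 5, 6, 0, 1, 2]
same = [0, 1, 2, 3, 4, 5, 6, 0, 1, 1]    # e.g., ALB-trained, ALB-tested
cross = [0, 1, 2, 3, 4, 5, 0, 0, 2, 1]   # e.g., ALB-trained, GLS-tested
print(interchangeability(y, same, cross))
```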

21 pages, 10149 KiB  
Article
Minimizing Seam Lines in UAV Multispectral Image Mosaics Utilizing Irradiance, Vignette, and BRDF
by Hoyong Ahn, Chansol Kim, Seungchan Lim, Cheonggil Jin, Jinsu Kim and Chuluong Choi
Remote Sens. 2025, 17(1), 151; https://doi.org/10.3390/rs17010151 - 4 Jan 2025
Abstract
Unmanned aerial vehicle (UAV) imaging provides the ability to obtain high-resolution images at a lower cost than satellite imagery and aerial photography. However, multiple UAV images need to be mosaicked to obtain images of large areas, and the resulting UAV multispectral image mosaics typically contain seam lines. To address this problem, we applied irradiance, vignette, and bidirectional reflectance distribution function (BRDF) filters and performed field work using a DJI Mavic 3 Multispectral (M3M) camera to collect data. We installed a calibrated reference tarp (CRT) in the center of the collection area and conducted three types of flights (BRDF, vignette, and validation) to measure the irradiance, radiance, and reflectance, which are essential for irradiance correction, using a custom reflectance box (ROX). A vignette filter was generated from the vignette parameter, and the anisotropy factor (ANIF) was calculated by measuring the radiance at the nadir, following which the BRDF model parameters were calculated. The calibration approaches were divided into the following categories: a vignette-only process, which solely applied vignette and irradiance corrections, and the full process, which included irradiance, vignette, and BRDF. The accuracy was verified through a validation flight. The radiance uncertainty at the seam line ranged from 3.00 to 5.26% in the 80% lap mode when using nine images around the CRT, and from 4.06 to 6.93% in the 50% lap mode when using all images with the CRT. The term 'lap' in 'lap mode' refers to both overlap and sidelap. The images that were subjected to the vignette-only process had a radiance difference of 4.48–6.98%, while that of the full process images was 1.44–2.40%, indicating that the seam lines were difficult to find with the naked eye and that the process was successful.
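
Of the three corrections named above, vignetting is the simplest to illustrate: each pixel is divided by a radial falloff model. The NumPy sketch below uses an assumed even-order polynomial form with made-up coefficients; the actual M3M calibration parameters and the irradiance/BRDF steps are not reproduced here.

```python
import numpy as np

def vignette_correct(image, k1, k2, k3):
    """Vignette correction sketch: divide each pixel by a radial falloff
    model V(r) = 1 + k1*r^2 + k2*r^4 + k3*r^6 centered on the optical
    axis, with r normalized to the half-diagonal. The polynomial form
    and coefficients are assumptions for illustration."""
    h, w = image.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    yy, xx = np.mgrid[0:h, 0:w]
    r2 = ((xx - cx) ** 2 + (yy - cy) ** 2) / (cx ** 2 + cy ** 2)
    v = 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    return image / v   # falloff (v < 1 at corners) is brightened back up

# Usage: a flat field with corner falloff should become roughly uniform.
flat = np.full((960, 1280), 1000.0)
corrected = vignette_correct(flat, k1=-0.3, k2=0.05, k3=-0.01)
print(corrected.min(), corrected.max())
```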
