Search Results (139)

Search Parameters:
Keywords = lightweight drone detection

25 pages, 9517 KiB  
Article
YOLOv8n-SSDW: A Lightweight and Accurate Model for Barnyard Grass Detection in Fields
by Yan Sun, Hanrui Guo, Xiaoan Chen, Mengqi Li, Bing Fang and Yingli Cao
Agriculture 2025, 15(14), 1510; https://doi.org/10.3390/agriculture15141510 - 13 Jul 2025
Viewed by 124
Abstract
Barnyard grass is a major noxious weed in paddy fields. Accurate and efficient identification of barnyard grass is crucial for precision field management. However, existing deep learning models generally suffer from high parameter counts and computational complexity, limiting their practical application in field scenarios. Moreover, the morphological similarity, overlapping, and occlusion between barnyard grass and rice pose challenges for reliable detection in complex environments. To address these issues, this study constructed a barnyard grass detection dataset using high-resolution images captured by a drone equipped with a high-definition camera in rice experimental fields in Haicheng City, Liaoning Province. A lightweight field barnyard grass detection model, YOLOv8n-SSDW, was proposed to enhance detection precision and speed. Based on the baseline YOLOv8n model, a novel Separable Residual Coord Conv (SRCConv) was designed to replace the original convolution module, significantly reducing parameters while maintaining detection accuracy. The Spatio-Channel Enhanced Attention Module (SEAM) was introduced and optimized to improve sensitivity to barnyard grass edge features. Additionally, the lightweight and efficient Dysample upsampling module was incorporated to enhance feature map resolution. A new WIoU loss function was developed to improve bounding box classification and regression accuracy. Comprehensive performance analysis demonstrated that YOLOv8n-SSDW outperformed state-of-the-art models. Ablation studies confirmed the effectiveness of each improvement module. The final fused model achieved lightweight performance while improving detection accuracy, with a 2.2% increase in mAP_50, 3.8% higher precision, 0.6% higher recall, 10.6% fewer parameters, 9.8% lower FLOPs, and an 11.1% reduction in model size compared to the baseline. Field tests using drones combined with ground-based computers further validated the model's robustness in real-world complex paddy environments. The results indicate that YOLOv8n-SSDW exhibits excellent accuracy and efficiency. This study provides valuable insights for barnyard grass detection in rice fields.
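SRCConv is the authors' own design and its exact structure is given in the paper; as a rough, hedged illustration of the general pattern the name suggests (a depthwise-separable convolution that also sees coordinate channels, in the spirit of CoordConv), a minimal PyTorch sketch follows. The class name, activation, and layer wiring are assumptions, not the published module.

```python
import torch
import torch.nn as nn

class SeparableCoordConv(nn.Module):
    """Hypothetical sketch: a depthwise-separable convolution that also
    sees normalized (x, y) coordinate channels, CoordConv-style."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        ch = in_ch + 2  # two extra coordinate channels
        self.depthwise = nn.Conv2d(ch, ch, k, padding=k // 2, groups=ch)
        self.pointwise = nn.Conv2d(ch, out_ch, 1)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.SiLU()

    def forward(self, x):
        b, _, h, w = x.shape
        ys = torch.linspace(-1, 1, h, device=x.device).view(1, 1, h, 1).expand(b, 1, h, w)
        xs = torch.linspace(-1, 1, w, device=x.device).view(1, 1, 1, w).expand(b, 1, h, w)
        x = torch.cat([x, xs, ys], dim=1)
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

# Why this saves parameters: a standard 3x3 conv from 64 to 128 channels
# carries 64 * 128 * 9 = 73,728 weights; the separable pair above carries
# roughly 66 * 9 + 66 * 128 = 9,042, the kind of reduction that motivates
# replacing standard convolutions in lightweight detectors.
print(SeparableCoordConv(64, 128)(torch.randn(1, 64, 32, 32)).shape)
```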

25 pages, 11253 KiB  
Article
YOLO-UIR: A Lightweight and Accurate Infrared Object Detection Network Using UAV Platforms
by Chao Wang, Rongdi Wang, Ziwei Wu, Zetao Bian and Tao Huang
Drones 2025, 9(7), 479; https://doi.org/10.3390/drones9070479 - 7 Jul 2025
Viewed by 359
Abstract
Within the field of remote sensing, Unmanned Aerial Vehicle (UAV) infrared object detection plays a pivotal role, especially in complex environments. However, existing methods face challenges such as insufficient accuracy or low computational efficiency, particularly in the detection of small objects. This paper proposes a lightweight and accurate UAV infrared object detection model, YOLO-UIR, for small object detection from a UAV perspective. The model is based on the YOLO architecture and mainly includes the Efficient C2f module, lightweight spatial perception (LSP) module, and bidirectional feature interaction fusion (BFIF) module. The Efficient C2f module significantly enhances feature extraction capabilities by combining local and global features through an Adaptive Dual-Stream Attention Mechanism. Compared with the existing C2f module, the introduction of Partial Convolution reduces the model's parameter count while maintaining high detection accuracy. The BFIF module further enhances feature fusion effects through cross-level semantic interaction, thereby improving the model's ability to fuse contextual features. Moreover, the LSP module efficiently combines features from different distances using Large Receptive Field Convolution Layers, significantly enhancing the model's long-range information capture capability. Additionally, the use of Reparameterized Convolution and Depthwise Separable Convolution ensures the model's lightweight nature, making it highly suitable for real-time applications. On the DroneVehicle and HIT-UAV datasets, YOLO-UIR achieves superior detection performance compared to existing methods, with an mAP of 71.1% and 90.7%, respectively. The model also demonstrates significant advantages in terms of computational efficiency and parameter count. Ablation experiments verify the effectiveness of each optimization module.
(This article belongs to the Special Issue Intelligent Image Processing and Sensing for Drones, 2nd Edition)
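Assuming the Partial Convolution here is the FasterNet-style PConv, which convolves only a fraction of the channels and passes the rest through unchanged, a minimal sketch of that building block looks as follows; the split ratio and its wiring inside the Efficient C2f module are the paper's own details.

```python
import torch
import torch.nn as nn

class PartialConv(nn.Module):
    """FasterNet-style partial convolution: apply a standard conv to the
    first 1/ratio of the channels and pass the remainder through untouched."""
    def __init__(self, dim, ratio=4, k=3):
        super().__init__()
        self.conv_ch = dim // ratio
        self.conv = nn.Conv2d(self.conv_ch, self.conv_ch, k, padding=k // 2)

    def forward(self, x):
        x1, x2 = torch.split(x, [self.conv_ch, x.size(1) - self.conv_ch], dim=1)
        return torch.cat([self.conv(x1), x2], dim=1)

x = torch.randn(1, 64, 80, 80)
print(PartialConv(64)(x).shape)  # torch.Size([1, 64, 80, 80])
```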

29 pages, 18908 KiB  
Article
Toward Efficient UAV-Based Small Object Detection: A Lightweight Network with Enhanced Feature Fusion
by Xingyu Di, Kangning Cui and Rui-Feng Wang
Remote Sens. 2025, 17(13), 2235; https://doi.org/10.3390/rs17132235 - 29 Jun 2025
Viewed by 502
Abstract
UAV-based small target detection is crucial in environmental monitoring, circuit inspection, and related applications. However, UAV images often exhibit significant scale variation, dense small targets, high inter-class similarity, and intra-class diversity, which can lead to missed detections and reduced performance. To solve these problems, this study proposes UAV-YOLO, a lightweight, high-precision model based on YOLOv8s. Firstly, a double separation convolution (DSC) module is designed to replace the Bottleneck structure in the C2f module, fusing depthwise separable convolution with pointwise convolution to reduce model parameters and computational complexity while enhancing feature expression. Secondly, a new SPPL module is proposed, which combines spatial pyramid pooling-fast (SPPF) with large separable kernel attention (LSKA) for long-range dependency modeling, improving the model's robustness to multi-scale targets through cross-level feature association. Then, DyHead replaces the original detection head, enhancing the discrimination of small targets against complex backgrounds through adaptive weight allocation and cross-scale feature fusion. Finally, the WIPIoU loss function is proposed, which integrates the advantages of Wise-IoU, MPDIoU, and Inner-IoU, incorporating the geometric center, aspect ratio, and overlap of the bounding box into a unified measure to improve small-target localization accuracy and accelerate convergence. Experimental results on the VisDrone2019 dataset show that, compared to YOLOv8s, UAV-YOLO achieves an 8.9% improvement in mAP@0.5 and a 6.8% improvement in recall, while reducing parameters and computations by 23.4% and 40.7%, respectively. Additional evaluations on the DIOR, RSOD, and NWPU VHR-10 datasets demonstrate the generalization capability of the model.
(This article belongs to the Special Issue Geospatial Intelligence in Remote Sensing)

21 pages, 5194 KiB  
Article
LMEC-YOLOv8: An Enhanced Object Detection Algorithm for UAV Imagery
by Xuchuan Tai and Xinjun Zhang
Electronics 2025, 14(13), 2535; https://doi.org/10.3390/electronics14132535 - 23 Jun 2025
Viewed by 418
Abstract
Despite the rapid development of UAV (Unmanned Aerial Vehicle) technology, its application to object detection in complex scenarios faces challenges from small target sizes and environmental interference. This paper proposes LMEC-YOLOv8, an improved algorithm based on YOLOv8n that aims to enhance detection accuracy and real-time performance for small targets in UAV imagery. We propose three key enhancements: (1) a lightweight multi-scale module (LMS-PC2F) to replace C2f; (2) a multi-scale attention mechanism (MSCBAM) for optimized feature extraction; and (3) an adaptive pyramid module (ESPPM) and a bidirectional feature network (CBiFPN) to boost fusion capability. Experimental results on the VisDrone2019 dataset demonstrate that LMEC-YOLOv8 achieves a 10.1% improvement in mAP50, a 20% reduction in parameter count, and a frame rate of 42 FPS compared to the baseline YOLOv8n. Compared with other state-of-the-art algorithms, the proposed model achieves an optimal balance between accuracy and speed, validating its robustness and practicality in complex environments.
(This article belongs to the Special Issue Deep Learning for Computer Vision, 2nd Edition)
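MSCBAM is the authors' multi-scale variant and is specified in the paper; for orientation, here is only the standard CBAM-style channel-then-spatial attention that such modules typically extend. Everything beyond plain CBAM is an assumption.

```python
import torch
import torch.nn as nn

class CBAMLite(nn.Module):
    """Plain CBAM-style attention: a channel gate built from pooled
    statistics, followed by a spatial gate from channel-wise mean/max maps."""
    def __init__(self, ch, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(ch, ch // reduction), nn.ReLU(),
            nn.Linear(ch // reduction, ch))
        self.spatial = nn.Conv2d(2, 1, 7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        gate = torch.sigmoid(self.mlp(x.mean(dim=(2, 3))) + self.mlp(x.amax(dim=(2, 3))))
        x = x * gate.view(b, c, 1, 1)
        s = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))

print(CBAMLite(64)(torch.randn(1, 64, 40, 40)).shape)  # torch.Size([1, 64, 40, 40])
```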

26 pages, 2362 KiB  
Article
ELNet: An Efficient and Lightweight Network for Small Object Detection in UAV Imagery
by Hui Li, Jianbo Ma and Jianlin Zhang
Remote Sens. 2025, 17(12), 2096; https://doi.org/10.3390/rs17122096 - 18 Jun 2025
Viewed by 487
Abstract
Real-time object detection is critical for unmanned aerial vehicles (UAVs) performing various tasks. However, efficiently deploying detection models on UAV platforms with limited storage and computational resources remains a significant challenge. To address this issue, we propose ELNet, an efficient and lightweight object detection model based on YOLOv12n. First, based on an analysis of UAV image characteristics, we strategically remove two A2C2f modules from YOLOv12n and adjust the size and number of detection heads. Second, we propose a novel lightweight detection head, EPGHead, to alleviate the computational burden introduced by adding the large-scale detection head. In addition, since YOLOv12n employs standard convolution for downsampling, which is inefficient for extracting UAV image features, we design a novel downsampling module, EDown, to further reduce model size and enable more efficient feature extraction. Finally, to improve detection in UAV imagery with dense, small, and scale-varying objects, we propose DIMB-C3k2, an enhanced module built upon C3k2, which boosts feature extraction under complex conditions. Compared with YOLOv12n, ELNet achieves an 88.5% reduction in parameter count and a 52.3% decrease in FLOPs, while increasing mAP50 by 1.2% on the VisDrone dataset and 0.8% on the HIT-UAV dataset, reaching 94.7% mAP50 on HIT-UAV. Furthermore, the model achieves a frame rate of 682 FPS, highlighting its superior computational efficiency without sacrificing detection accuracy.

20 pages, 955 KiB  
Article
Natural Language Interfaces for Structured Query Generation in IoD Platforms
by Anıl Sezgin
Drones 2025, 9(6), 444; https://doi.org/10.3390/drones9060444 - 18 Jun 2025
Viewed by 461
Abstract
The increasing complexity of Internet of Drones (IoD) platforms demands more accessible ways for users to interact with unmanned aerial vehicle (UAV) data systems. Traditional methods requiring technical API knowledge create barriers for non-specialist users in dynamic operational environments. To address this challenge, we propose a retrieval-augmented generation (RAG) architecture that enables natural language querying over UAV telemetry, mission, and detection data. Our approach builds a semantic retrieval index from structured application programming interface (API) documentation and uses lightweight large language models to map user queries into executable API calls validated against platform schemas. This design minimizes fine-tuning needs, adapts to evolving APIs, and ensures schema conformity for operational safety. Evaluations conducted on a curated IoD dataset show 91.3% endpoint accuracy, 87.6% parameter match rate, and 95.2% schema conformity, confirming the system's robustness and scalability. The results demonstrate that combining retrieval-augmented semantic grounding with structured validation bridges the gap between human intent and complex UAV data access, improving usability while maintaining a practical level of operational reliability.
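The pipeline described (a retrieval index over API documentation, a lightweight LLM to draft the call, and schema validation before execution) can be caricatured in a few lines of Python. Every endpoint name, schema field, and the string-similarity retrieval below are invented for illustration, and the LLM drafting step is stubbed out.

```python
from difflib import SequenceMatcher

# Hypothetical fragment of an IoD API schema (invented for illustration).
API_SCHEMA = {
    "/telemetry/latest": {"params": {"drone_id": str}},
    "/missions/search":  {"params": {"status": str, "since": str}},
}

def retrieve_endpoint(query: str) -> str:
    """Toy retrieval: pick the documented endpoint most similar to the
    query. The paper uses a semantic index; string similarity stands in."""
    return max(API_SCHEMA, key=lambda ep: SequenceMatcher(None, query, ep).ratio())

def validate_call(endpoint: str, params: dict) -> dict:
    """Schema-conformity check: reject unknown or mistyped parameters
    before anything is executed (the paper's operational-safety step)."""
    spec = API_SCHEMA[endpoint]["params"]
    for name, value in params.items():
        if name not in spec or not isinstance(value, spec[name]):
            raise ValueError(f"parameter {name!r} violates schema for {endpoint}")
    return {"endpoint": endpoint, "params": params}

# An LLM would normally draft the params from the user's sentence; stubbed here.
print(validate_call(retrieve_endpoint("latest telemetry"), {"drone_id": "D-17"}))
```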

25 pages, 4233 KiB  
Article
A Lightweight Multi-Scale Context Detail Network for Efficient Target Detection in Resource-Constrained Environments
by Kaipeng Wang, Guanglin He and Xinmin Li
Sensors 2025, 25(12), 3800; https://doi.org/10.3390/s25123800 - 18 Jun 2025
Viewed by 455
Abstract
Target detection in resource-constrained environments faces multiple challenges such as the use of camouflage, diverse target sizes, and harsh environmental conditions. Moreover, the need for solutions suitable for edge computing environments, which have limited computational resources, adds complexity to the task. To meet these challenges, we propose MSCDNet (Multi-Scale Context Detail Network), an innovative and lightweight architecture designed specifically for efficient target detection in such environments. MSCDNet integrates three key components: the Multi-Scale Fusion Module, which improves the representation of features at various target scales; the Context Merge Module, which enables adaptive feature integration across scales to handle a wide range of target conditions; and the Detail Enhance Module, which emphasizes preserving crucial edge and texture details for detecting camouflaged targets. Extensive evaluations highlight the effectiveness of MSCDNet, which achieves 40.1% mAP50-95, 86.1% precision, and 68.1% recall while maintaining a low computational load with only 2.22 M parameters and 6.0 G FLOPs. When compared to other models, MSCDNet outperforms YOLO-family variants by 1.9% in mAP50-95 and uses 14% fewer parameters. Additional generalization tests on VisDrone2019 and BDD100K further validate its robustness, with improvements of 1.1% in mAP50 on VisDrone and 1.2% in mAP50-95 on BDD100K over baseline models. These results affirm that MSCDNet is well suited for tactical deployment in scenarios with limited computational resources, where reliable target detection is paramount.
(This article belongs to the Section Sensor Networks)

23 pages, 3908 KiB  
Article
MSUD-YOLO: A Novel Multiscale Small Object Detection Model for UAV Aerial Images
by Xiaofeng Zhao, Hui Zhang, Wenwen Zhang, Junyi Ma, Chenxiao Li, Yao Ding and Zhili Zhang
Drones 2025, 9(6), 429; https://doi.org/10.3390/drones9060429 - 13 Jun 2025
Cited by 1 | Viewed by 704
Abstract
Because objects in UAV aerial images often exhibit multiple scales, small sizes, and complex backgrounds, the detection performance of current models is unsatisfactory. To address these issues, this paper designs MSUD-YOLO, a multiscale small object detection model for UAV aerial images based on YOLOv10s. First, the model uses an attention scale sequence fusion module to achieve more efficient multiscale feature fusion, and a tiny prediction head is incorporated to focus the model on low-level features, improving its ability to detect small objects. Second, a novel feature extraction module named CFormerCGLU is designed, which improves feature extraction capability in a lighter way. In addition, the model uses lightweight convolution instead of standard convolution to reduce computation. Finally, the WIoU v3 loss function is used to make the model focus more on low-quality examples, thereby improving its object localization ability. Experimental results on the VisDrone2019 dataset show that MSUD-YOLO improves mAP50 by 8.5% compared with YOLOv10s while reducing parameters by 6.3%, verifying the model's effectiveness for object detection in UAV aerial images in complex environments. Furthermore, compared with several recent UAV object detection algorithms, MSUD-YOLO offers higher detection accuracy at lower computational cost; for example, mAP50 reaches 43.4% with only 6.766 M parameters.
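WIoU v3's dynamic non-monotonic focusing mechanism is defined in the Wise-IoU paper; the sketch below shows only the skeleton of such a loss (plain IoU scaled by a quality-dependent weight), with the weight left as a placeholder rather than the published formula.

```python
import torch

def iou_xyxy(a, b, eps=1e-7):
    """IoU for (N, 4) boxes in (x1, y1, x2, y2) format."""
    lt = torch.max(a[:, :2], b[:, :2])
    rb = torch.min(a[:, 2:], b[:, 2:])
    inter = (rb - lt).clamp(min=0).prod(dim=1)
    area_a = (a[:, 2:] - a[:, :2]).prod(dim=1)
    area_b = (b[:, 2:] - b[:, :2]).prod(dim=1)
    return inter / (area_a + area_b - inter + eps)

def focused_iou_loss(pred, target, beta=1.0):
    """Skeleton of a WIoU-style loss: re-weight samples by a quality
    signal so low-quality examples get more attention. The real WIoU v3
    weight (outlier degree with a non-monotonic schedule) is in the
    Wise-IoU paper; the exponential term here is only a stand-in."""
    loss = 1.0 - iou_xyxy(pred, target)
    weight = (beta * loss.detach()).exp()  # placeholder focusing term
    return (weight * loss).mean()

pred = torch.tensor([[10., 10., 50., 50.]])
tgt = torch.tensor([[12., 12., 48., 52.]])
print(focused_iou_loss(pred, tgt))
```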

32 pages, 8925 KiB  
Article
HSF-DETR: Hyper Scale Fusion Detection Transformer for Multi-Perspective UAV Object Detection
by Yi Mao, Haowei Zhang, Rui Li, Feng Zhu, Rui Sun and Pingping Ji
Remote Sens. 2025, 17(12), 1997; https://doi.org/10.3390/rs17121997 - 9 Jun 2025
Viewed by 571
Abstract
Unmanned aerial vehicle (UAV) imagery detection faces challenges in preserving small object features during multi-level downsampling, handling angle- and altitude-dependent variations in aerial scenes, achieving accurate localization in dense environments, and performing real-time detection. To address these limitations, we propose HSF-DETR, a lightweight transformer-based detector specifically designed for UAV imagery. First, we design a hybrid progressive fusion network (HPFNet) as the backbone, which adaptively modulates receptive fields to capture multi-scale information while preserving fine-grained details critical for small object detection. Second, building upon features extracted by HPFNet, we develop MultiScaleNet, which enhances feature representation through dual-layer optimization and cross-domain feature learning, significantly improving the model's capability to handle complex aerial scenarios with diverse object orientations. Finally, to address spatial-semantic alignment challenges, we devise a position-aware align context and spatial tuning (PACST) module that ensures effective feature calibration through precise alignment and adaptive fusion across scales. This hierarchical architecture is complemented by our novel AdaptDist-IoU loss with dynamic weight allocation, which enhances localization accuracy, particularly in dense environments. Extensive experiments using standard detection metrics (mAP50 and mAP50:95) on the VisDrone2019 test dataset demonstrate that HSF-DETR achieves superior performance with 0.428 mAP50 (+5.4%) and 0.253 mAP50:95 (+4%) when compared with RT-DETR, while maintaining real-time inference (69.3 FPS) on an NVIDIA RTX 4090D GPU with only 15.24M parameters and 63.6 GFLOPs. Further validation across multiple public remote sensing datasets confirms the robust generalization capability of HSF-DETR in diverse aerial scenarios, offering a practical solution for resource-constrained UAV applications where both detection quality and processing speed are crucial.
(This article belongs to the Special Issue Deep Learning-Based Small-Target Detection in Remote Sensing)

20 pages, 13952 KiB  
Article
MSO-DETR: A Lightweight Detection Transformer Model for Small Object Detection in Maritime Search and Rescue
by Jing Li, Yun Hua and Mei Xue
Electronics 2025, 14(12), 2327; https://doi.org/10.3390/electronics14122327 - 6 Jun 2025
Viewed by 532
Abstract
In maritime search and rescue small object detection, existing high-accuracy detection models face deployment challenges on UAV platforms due to limited computational capabilities, while existing lightweight models often fail to meet performance requirements, reducing the overall effectiveness of rescue operations. To overcome the difficulty of balancing lightweight design and detection accuracy, we propose Maritime Small Object Detection Transformer (MSO-DETR), a lightweight detection transformer model for small object detection in maritime search and rescue, based on an improved Real-Time Detection Transformer (RT-DETR) architecture. MSO-DETR employs StarNet as its backbone to reduce the computational cost with a slight drop in detection accuracy. In addition, the Dynamic-range Histogram Self-Attention (DHSA) mechanism is integrated with the Attention-based Intra-scale Feature Interaction (AIFI) module to construct DHAIFI, which enhances the model's ability to perceive object features under challenging conditions such as sea surface reflections and wave interference. During the feature fusion phase, we propose the Scale-Tuned Enhanced Feature Fusion (STEFF) module, which integrates the improved Attentional Scale Sequence Fusion (ASF) structure with the newly designed Multi-Dilated Convolution Cross-Stage Partial (MDC_CSP) and Parallel Aggregation Downsampling (PAD) to enhance multi-scale aggregation and small object recognition while maintaining computational efficiency. Experimental results demonstrate that, in contrast to the baseline, MSO-DETR achieves significant model lightweighting, reducing parameters by 67.3% and GFLOPs by 46.5%, while maintaining detection accuracy on the SeaDronesSee dataset, with only a 0.1% decrease in mAP50 and a 0.5% improvement in mAP50:95. It also delivers comparable performance to the baseline on the AFO dataset.

17 pages, 1647 KiB  
Proceeding Paper
Enhanced Drone Detection Model for Edge Devices Using Knowledge Distillation and Bayesian Optimization
by Maryam Lawan Salisu, Farouk Lawan Gambo, Aminu Musa and Aminu Aliyu Abdullahi
Eng. Proc. 2025, 87(1), 71; https://doi.org/10.3390/engproc2025087071 - 4 Jun 2025
Viewed by 469
Abstract
The emergence of Unmanned Aerial Vehicles (UAVs), commonly known as drones, has presented numerous transformative opportunities across sectors such as agriculture, commerce, and security surveillance. However, the proliferation of these technologies raises significant security and privacy concerns, as they can be exploited for unauthorized surveillance or even targeted attacks. Various research efforts have proposed drone detection models for security purposes, yet deploying these models on edge devices proves challenging due to resource constraints, which limit the feasibility of complex deep learning models. This makes lightweight models that can be deployed efficiently on edge devices essential, particularly for detecting drones in various disguises to prevent potential intrusions. This study introduces a lightweight deep learning-based drone detection model (LDDm-CNN) by fusing knowledge distillation with Bayesian optimization. Knowledge distillation (KD) transfers knowledge from a complex model (teacher) to a simpler one (student), preserving performance while reducing computational complexity and thereby yielding a lightweight model. However, selecting optimal hyper-parameters for knowledge distillation is challenging because of the large search space and complexity requirements. We therefore integrate Bayesian optimization with knowledge distillation to present an enhanced CNN-KD model, employing the optimization algorithm to determine the most suitable hyper-parameters and enhance the efficiency and effectiveness of the drone detection model. Validation on a dedicated drone detection dataset illustrates the model's efficacy, achieving a remarkable accuracy of 96% while significantly reducing computational and memory requirements. With just 102,000 parameters, the proposed model is five times smaller than the teacher model, underscoring its potential for practical deployment in real-world scenarios.
(This article belongs to the Proceedings of The 5th International Electronic Conference on Applied Sciences)
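The distillation objective itself is the standard Hinton-style soft-target loss; a minimal sketch follows, where the temperature T and mixing weight alpha are exactly the kind of hyper-parameters a Bayesian optimizer would search. The values shown and the two-class (drone / no drone) setup are placeholders, not the paper's configuration.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Hinton-style knowledge distillation: KL divergence between
    temperature-softened teacher/student distributions, mixed with
    hard-label cross-entropy. T and alpha are tunable hyper-parameters."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean") * T * T  # rescale gradients for the temperature
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

s = torch.randn(8, 2, requires_grad=True)  # student logits (drone / no drone)
t = torch.randn(8, 2)                      # frozen teacher logits
y = torch.randint(0, 2, (8,))
print(kd_loss(s, t, y))
```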

39 pages, 3695 KiB  
Article
Fast Identification and Detection Algorithm for Maneuverable Unmanned Aircraft Based on Multimodal Data Fusion
by Tian Luan, Shixiong Zhou, Yicheng Zhang and Weijun Pan
Mathematics 2025, 13(11), 1825; https://doi.org/10.3390/math13111825 - 30 May 2025
Viewed by 668
Abstract
To address the critical challenges of insufficient monitoring capabilities and vulnerable defense systems against drones in regional airports, this study proposes a multi-source data fusion framework for rapid UAV detection. Building upon the YOLO v11 architecture, we develop an enhanced model incorporating four key innovations: (1) A dual-path RGB-IR fusion architecture that exploits complementary multi-modal data; (2) C3k2-DATB dynamic attention modules for enhanced feature extraction and semantic perception; (3) A bilevel routing attention mechanism with agent queries (BRSA) for precise target localization; (4) A semantic-detail injection (SDI) module coupled with windmill-shaped convolutional detection heads (PCHead) and Wasserstein Distance loss to expand receptive fields and accelerate convergence. Experimental results demonstrate superior performance with 99.3% mAP@50 (17.4% improvement over baseline YOLOv11), while maintaining lightweight characteristics (2.54M parameters, 7.8 GFLOPs). For practical deployment, we further enhance tracking robustness through an improved BoT-SORT algorithm within an interactive multiple model framework, achieving 91.3% MOTA and 93.0% IDF1 under low-light conditions. This integrated solution provides cost-effective, high-precision drone surveillance for resource-constrained airports.
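The dual-path RGB-IR architecture is the paper's own; as a generic, hedged illustration of the pattern (modality-specific stems whose feature maps are fused before a shared head), here is a minimal sketch. The layer sizes and the concatenation-based fusion are assumptions.

```python
import torch
import torch.nn as nn

class DualPathFusion(nn.Module):
    """Generic two-stream fusion: separate stems for RGB (3-channel) and
    IR (1-channel) inputs, concatenated and mixed by a 1x1 convolution."""
    def __init__(self, out_ch=64):
        super().__init__()
        self.rgb = nn.Sequential(nn.Conv2d(3, 32, 3, 2, 1), nn.SiLU())
        self.ir = nn.Sequential(nn.Conv2d(1, 32, 3, 2, 1), nn.SiLU())
        self.fuse = nn.Conv2d(64, out_ch, 1)

    def forward(self, rgb, ir):
        return self.fuse(torch.cat([self.rgb(rgb), self.ir(ir)], dim=1))

m = DualPathFusion()
print(m(torch.randn(1, 3, 256, 256), torch.randn(1, 1, 256, 256)).shape)
# torch.Size([1, 64, 128, 128])
```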

22 pages, 11179 KiB  
Article
Study on Lightweight Bridge Crack Detection Algorithm Based on YOLO11
by Xuwei Dong, Jiashuo Yuan and Jinpeng Dai
Sensors 2025, 25(11), 3276; https://doi.org/10.3390/s25113276 - 23 May 2025
Viewed by 820
Abstract
Bridge crack detection is key to ensuring the safety and extending the lifespan of bridges. Traditional detection methods often suffer from low efficiency and insufficient accuracy, and advances in computer vision have gradually made deep learning-based bridge crack detection a research hotspot. In this study, a lightweight bridge crack detection algorithm, YOLO11-Bridge Detection (YOLO11-BD), is proposed based on the optimization of the YOLO11 model. This algorithm uses an efficient multiscale conv all (EMSCA) module to enhance channel and spatial attention, strengthening its ability to extract crack features and improving detection accuracy without increasing the model size. Furthermore, a lightweight detection head (LDH) is introduced to process feature information from different channels using efficient grouped convolutions; it reduces the model's parameters and computations whilst preserving accuracy, thereby achieving a lightweight model. Experimental results show that, compared with the original YOLO11, the YOLO11-BD algorithm improves mAP50 and mAP50-95 on the bridge crack dataset by 3.1% and 4.8%, respectively, whilst significantly reducing GFLOPs by 19.05%. Its frame rate remains above 500 FPS, demonstrating excellent real-time detection capability and high computational efficiency. The proposed algorithm provides an efficient and flexible solution for monitoring bridge cracks with remote sensing devices such as drones, and it has significant practical application value. Its lightweight design ensures strong cross-platform adaptability and provides reliable technical support for intelligent bridge management and maintenance.
(This article belongs to the Section Physical Sensors)
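The LDH head's exact layout is the paper's, but the parameter saving from grouped convolution itself is easy to verify with PyTorch; the channel counts below are arbitrary, not the head's dimensions.

```python
import torch.nn as nn

def n_params(m):
    return sum(p.numel() for p in m.parameters())

# Arbitrary example channels, not the paper's head dimensions.
dense = nn.Conv2d(256, 256, 3, padding=1)               # standard conv
grouped = nn.Conv2d(256, 256, 3, padding=1, groups=8)   # grouped conv

print(n_params(dense), n_params(grouped))  # 590080 vs 73984 (8x fewer weights)
```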

22 pages, 8270 KiB  
Article
DFE-YOLO: A Multi-Scale-Enhanced Detection Network for Dense Object Detection in Traffic Monitoring
by Qingyi Li, Yi Li and Yanfeng Lu
Electronics 2025, 14(11), 2108; https://doi.org/10.3390/electronics14112108 - 22 May 2025
Viewed by 715
Abstract
The accuracy of object detection is crucial for the safety and efficiency of traffic management in monitoring systems. Existing detectors, however, struggle in complex urban scenarios with high-density occlusions among targets and extreme scale variations caused by differences in vehicle size and distance to the camera. To remedy these issues, we introduce DFE-YOLO, an enhanced multi-scale detection framework built upon YOLOv8. Its first contribution is our 'four adaptive spatial feature fusion' module, which fuses features from layers at different scales using learnable weights normalized by softmax, thereby allowing effective feature aggregation across scales. The second contribution is DySample, a lightweight, content-aware, point-based upsampling method that improves multi-scale feature representation and reduces imbalance across object scales. Experiments on the VisDrone-2019 and BDD100K benchmarks show significantly superior performance against state-of-the-art detectors; specifically, DFE-YOLO achieves a +4% and +5.1% boost over YOLOv10 in AP and APsmall, respectively. This study offers a practical solution for intelligent transportation systems.
(This article belongs to the Special Issue Object Detection in Autonomous Driving)
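The abstract is explicit that fusion uses per-location learnable weights normalized by softmax across the contributing scales, which can be sketched directly; the number of levels and the assumption that inputs are already resized to a common shape are illustrative choices.

```python
import torch
import torch.nn as nn

class AdaptiveSpatialFusion(nn.Module):
    """Fuse L same-shaped feature maps with per-pixel weights that are
    softmax-normalized across levels (ASFF-style, as the abstract describes)."""
    def __init__(self, ch, levels=4):
        super().__init__()
        self.weight = nn.Conv2d(ch * levels, levels, 1)  # one logit map per level

    def forward(self, feats):  # feats: list of (B, ch, H, W), already resized
        logits = self.weight(torch.cat(feats, dim=1))
        w = torch.softmax(logits, dim=1)  # normalize across levels at each pixel
        return sum(w[:, i:i + 1] * f for i, f in enumerate(feats))

feats = [torch.randn(1, 64, 40, 40) for _ in range(4)]
print(AdaptiveSpatialFusion(64)(feats).shape)  # torch.Size([1, 64, 40, 40])
```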

13 pages, 5874 KiB  
Article
An Investigation on Prediction of Infrastructure Asset Defect with CNN and ViT Algorithms
by Nam Lethanh, Tu Anh Trinh and Mir Tahmid Hossain
Infrastructures 2025, 10(5), 125; https://doi.org/10.3390/infrastructures10050125 - 20 May 2025
Viewed by 541
Abstract
Convolutional Neural Networks (CNNs) have been demonstrated to be one of the most powerful methods for image recognition, applied in many fields including civil and structural health monitoring in infrastructure asset management. Current State-of-the-Art CNN models are now accessible as open source on several Artificial Intelligence (AI) platforms, with TensorFlow being widely used. Besides CNN models, Vision Transformers (ViTs) have recently emerged as a competitive alternative; several demonstrations have indicated that ViT models, in many instances, outperform current CNNs by almost four times in terms of computational efficiency and accuracy. This paper presents an investigation into defect detection for civil and structural components using CNN and ViT models available on TensorFlow. An empirical study was conducted using a database of cracks, with crack severity categorized into binary states: "with crack" and "without crack". The results confirm that the accuracies of both CNN and ViT models exceed 95% after 100 epochs of training, with no significant difference observed between them for binary classification. Notably, the cost of this AI-based approach, with images taken by lightweight and low-cost drones, is considerably lower than that of high-speed inspection cars, while still delivering an expected level of predictive accuracy.
(This article belongs to the Section Infrastructures Inspection and Maintenance)
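The abstract names TensorFlow and a binary with/without-crack label, so the problem setup can be sketched in Keras; the tiny architecture and hyper-parameters below are placeholders, not the State-of-the-Art CNN or ViT models the paper benchmarks.

```python
import tensorflow as tf

# Minimal binary crack classifier; the paper benchmarks far larger
# open-source CNN and ViT models, this shows only the problem setup.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # P("with crack")
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
# model.fit(train_ds, validation_data=val_ds, epochs=100)  # 100 epochs as in the paper
```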
