Search Results (11)

Search Parameters:
Keywords = Focaler-Shape-IoU

25 pages, 8563 KiB  
Article
GYS-RT-DETR: A Lightweight Citrus Disease Detection Model Based on Integrated Adaptive Pruning and Dynamic Knowledge Distillation
by Linlin Yang, Zhonghao Huang, Yi Huangfu, Rui Liu, Xuerui Wang, Zhiwei Pan and Jie Shi
Agronomy 2025, 15(7), 1515; https://doi.org/10.3390/agronomy15071515 - 22 Jun 2025
Viewed by 495
Abstract
Given the serious economic burden that citrus diseases impose on fruit farmers and related industries, rapid and accurate disease detection is particularly crucial. In response to the challenges posed by resource-limited platforms and complex backgrounds, this paper proposes GYS-RT-DETR, a lightweight method for the identification and localization of citrus diseases based on the RT-DETR-r18 model. First, the paper introduces the following structural innovations: (1) A Gather-and-Distribute Mechanism is introduced in the Neck, which effectively enhances the model's ability to detect medium and large targets through global feature fusion and high-level information injection. (2) Scale Sequence Feature Fusion (SSFF) is used to optimize the Neck structure and improve detection of small targets in complex environments. (3) The Focaler-Shape-IoU loss function is used to address unbalanced training samples and inaccurate localization. Second, the model adopts two optimization strategies: (1) the Group_taylor local pruning algorithm reduces the model's memory footprint and parameter count; (2) a feature-logic knowledge distillation framework is proposed to mitigate the information loss caused by structural differences between the teacher and student networks, maintaining good detection performance while keeping the model lightweight. The experimental results show that the GYS-RT-DETR model achieves a precision of 79.1%, a recall of 77.9%, an F1 score of 78.0%, a model size of 23.0 MB, and an mAP of 77.8%. Compared to the original model, precision, recall, F1 score, mAP, and FPS improve by 3.5%, 5.3%, 5.0%, 5.3%, and 10.3 f/s, respectively, and memory usage decreases by 25.5 MB. The proposed GYS-RT-DETR model can effectively detect various citrus diseases against complex backgrounds, addressing the time-consuming nature of manual inspection and improving detection accuracy, thereby providing an effective basis for the automated detection of citrus diseases. Full article
(This article belongs to the Section Precision and Digital Agriculture)
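
The abstract names the Focaler-Shape-IoU loss but does not give its formula. As a rough illustration of the Focaler-IoU idea it builds on, the sketch below linearly remaps IoU values onto an interval [d, u], shifting training emphasis toward easier or harder samples; the thresholds, function names, and the option of substituting a shape-aware IoU are assumptions, not details from the paper.

```python
# Minimal sketch of the Focaler-IoU interval remapping (PyTorch).
# The thresholds d and u are hyperparameters; the values here are illustrative.
import torch

def focaler_iou(iou: torch.Tensor, d: float = 0.0, u: float = 0.95) -> torch.Tensor:
    """Linearly remap IoU onto [d, u]: 0 below d, 1 above u, linear in between."""
    return ((iou - d) / (u - d)).clamp(min=0.0, max=1.0)

def focaler_iou_loss(iou: torch.Tensor, d: float = 0.0, u: float = 0.95) -> torch.Tensor:
    """1 - remapped IoU; a shape-aware IoU could be substituted for plain IoU."""
    return 1.0 - focaler_iou(iou, d, u)

# Example: IoUs of three predicted boxes against their targets.
iou = torch.tensor([0.30, 0.60, 0.97])
print(focaler_iou_loss(iou))  # tensor([0.6842, 0.3684, 0.0000])
```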

21 pages, 8405 KiB  
Article
YOLOv11-BSS: Damaged Region Recognition Based on Spatial and Channel Synergistic Attention and Bi-Deformable Convolution in Sanding Scenarios
by Yinjiang Li, Zhifeng Zhou and Ying Pan
Electronics 2025, 14(7), 1469; https://doi.org/10.3390/electronics14071469 - 5 Apr 2025
Cited by 1 | Viewed by 838
Abstract
Because the paint in damaged regions of a car body closely resembles the color and texture of the surrounding undamaged paint, detectors are prone to missed and false detections. To address this, an algorithm for detecting damaged body regions based on an improved YOLOv11 is proposed. Firstly, bi-deformable convolution is proposed to optimize the offset direction of the convolution kernel shape, which effectively improves the feature representation power of the backbone network; secondly, the C2PSA-SCSA module is designed to couple spatial and channel attention, which enhances the perceptual power of the backbone network and makes the model attend better to damaged-region features. Then, a slim-neck feature fusion network is built from the GSConv and DWConv modules, which effectively fuses local and global features to improve the saturation of semantic features; finally, the Focaler-CIoU bounding-box loss function is designed, which uses the segmented linear mapping principle of Focaler-IoU to adjust the loss function's attention to different samples and improve the model's convergence when learning features at various scales. The experimental results show that the enhanced YOLOv11-BSS network improves the precision rate by 7.9%, the recall rate by 1.4%, and the mAP@50 by 3.7% over the baseline network, effectively reducing missed and false detections of damaged areas of the car body. Full article
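
For readers unfamiliar with the building blocks named above: GSConv-style slim-neck designs lean on depthwise convolution. The sketch below is a generic depthwise-separable convolution block, not the paper's GSConv module; all names and defaults are illustrative.

```python
# Minimal depthwise-separable convolution block (PyTorch), the building block
# behind DWConv-style layers used in slim-neck designs. Not the paper's exact module.
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, c_in: int, c_out: int, k: int = 3, s: int = 1):
        super().__init__()
        self.dw = nn.Conv2d(c_in, c_in, k, s, padding=k // 2, groups=c_in, bias=False)  # depthwise
        self.pw = nn.Conv2d(c_in, c_out, 1, 1, bias=False)                              # pointwise
        self.bn = nn.BatchNorm2d(c_out)
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.bn(self.pw(self.dw(x))))

x = torch.randn(1, 64, 80, 80)
print(DepthwiseSeparableConv(64, 128)(x).shape)  # torch.Size([1, 128, 80, 80])
```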

23 pages, 3368 KiB  
Article
SDKU-Net: A Novel Architecture with Dynamic Kernels and Optimizer Switching for Enhanced Shadow Detection in Remote Sensing
by Gilberto Alvarado-Robles, Isac Andres Espinosa-Vizcaino, Carlos Gustavo Manriquez-Padilla and Juan Jose Saucedo-Dorantes
Computers 2025, 14(3), 80; https://doi.org/10.3390/computers14030080 - 23 Feb 2025
Cited by 1 | Viewed by 2287
Abstract
Shadows in remote sensing images often introduce challenges in accurate segmentation due to their variability in shape, size, and texture. To address these issues, this study proposes the Supervised Dynamic Kernel U-Net (SDKU-Net), a novel architecture designed to enhance shadow detection in complex remote sensing scenarios. SDKU-Net integrates dynamic kernel adjustment, a combined loss function incorporating Focal and Tversky Loss, and optimizer switching to effectively tackle class imbalance and improve segmentation quality. Using the AISD dataset, the proposed method achieved state-of-the-art performance with an Intersection over Union (IoU) of 0.8552, an F1-Score of 0.9219, an Overall Accuracy (OA) of 96.50%, and a Balanced Error Rate (BER) of 5.08%. Comparative analyses demonstrate SDKU-Net’s superior performance against established methods such as U-Net, U-Net++, MSASDNet, and CADDN. Additionally, the model’s efficient training process, requiring only 75 epochs, highlights its potential for resource-constrained applications. These results underscore the robustness and adaptability of SDKU-Net, paving the way for advancements in shadow detection and segmentation across diverse fields. Full article
(This article belongs to the Special Issue Machine Learning Applications in Pattern Recognition)
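
The abstract states that Focal and Tversky losses are combined but does not give weights or hyperparameters. A minimal sketch of such a combination for binary segmentation, with common default values standing in for the paper's settings:

```python
# Hedged sketch of a combined Focal + Tversky loss for binary segmentation (PyTorch).
# alpha/beta/gamma and the 0.5/0.5 mix are common defaults, not values from the paper.
import torch

def focal_loss(logits, targets, gamma=2.0, eps=1e-7):
    p = torch.sigmoid(logits)
    pt = torch.where(targets > 0.5, p, 1 - p)            # probability of the true class
    return (-(1 - pt) ** gamma * torch.log(pt + eps)).mean()

def tversky_loss(logits, targets, alpha=0.7, beta=0.3, eps=1e-7):
    p = torch.sigmoid(logits).flatten()
    t = targets.flatten()
    tp = (p * t).sum()
    fp = (p * (1 - t)).sum()
    fn = ((1 - p) * t).sum()
    return 1 - (tp + eps) / (tp + alpha * fn + beta * fp + eps)   # alpha > beta penalizes misses more

def combined_loss(logits, targets, w_focal=0.5, w_tversky=0.5):
    return w_focal * focal_loss(logits, targets) + w_tversky * tversky_loss(logits, targets)

logits = torch.randn(2, 1, 64, 64)                        # raw network outputs
targets = torch.randint(0, 2, (2, 1, 64, 64)).float()
print(combined_loss(logits, targets))
```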

32 pages, 13599 KiB  
Article
Generalization Enhancement Strategies to Enable Cross-Year Cropland Mapping with Convolutional Neural Networks Trained Using Historical Samples
by Sam Khallaghi, Rahebeh Abedi, Hanan Abou Ali, Hamed Alemohammad, Mary Dziedzorm Asipunu, Ismail Alatise, Nguyen Ha, Boka Luo, Cat Mai, Lei Song, Amos Olertey Wussah, Sitian Xiong, Yao-Ting Yao, Qi Zhang and Lyndon D. Estes
Remote Sens. 2025, 17(3), 474; https://doi.org/10.3390/rs17030474 - 30 Jan 2025
Cited by 1 | Viewed by 1188
Abstract
Mapping agricultural fields using high-resolution satellite imagery and deep learning (DL) models has advanced significantly, even in regions with small, irregularly shaped fields. However, effective DL models often require large, expensive labeled datasets, which are typically limited to specific years or regions. This restricts the ability to create annual maps needed for agricultural monitoring, as changes in farming practices and environmental conditions cause domain shifts between years and locations. To address this, we focused on improving model generalization without relying on yearly labels through a holistic approach that integrates several techniques, including an area-based loss function, Tversky-focal loss (TFL), data augmentation, and the use of regularization techniques like dropout. Photometric augmentations helped encode invariance to brightness changes but also increased the incidence of false positives. The best results were achieved by combining photometric augmentation, TFL, and Monte Carlo dropout, although dropout alone led to more false negatives. Input normalization also played a key role, with the best results obtained when normalization statistics were calculated locally (per chip) across all bands. Our U-Net-based workflow successfully generated multi-year crop maps over large areas, outperforming the base model without photometric augmentation or MC-dropout by 17 IoU points. Full article
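
The best-performing configuration normalizes each chip using statistics computed locally across all bands. A minimal sketch of that per-chip normalization, assuming channels-first chips and a single pooled mean/std (the exact convention in the paper is not stated):

```python
# Per-chip normalization: statistics computed from the chip itself, pooled across all bands.
# A minimal sketch of the idea described in the abstract; the axis convention is assumed.
import numpy as np

def normalize_chip(chip: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """chip: (bands, height, width). Z-score using the chip's own global mean/std."""
    mean = chip.mean()          # one statistic over all bands and pixels
    std = chip.std()
    return (chip - mean) / (std + eps)

chip = np.random.rand(4, 256, 256).astype(np.float32)            # e.g., a 4-band image chip
print(normalize_chip(chip).mean(), normalize_chip(chip).std())    # ~0, ~1
```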

24 pages, 7683 KiB  
Article
Hybrid-DETR: A Differentiated Module-Based Model for Object Detection in Remote Sensing Images
by Mingji Yang, Rongyu Xu, Chunyu Yang, Haibin Wu and Aili Wang
Electronics 2024, 13(24), 5014; https://doi.org/10.3390/electronics13245014 - 20 Dec 2024
Cited by 2 | Viewed by 3042
Abstract
Currently, embedded unmanned aerial vehicle (UAV) systems face significant challenges in balancing detection accuracy and computational efficiency when processing remote sensing images with complex backgrounds, small objects, and occlusions. This paper proposes the Hybrid-DETR model based on a real-time end-to-end Detection Transformer (RT-DETR), featuring a novel HybridNet backbone network that implements a differentiated hybrid structure through lightweight RepConv Cross-stage Partial Efficient Layer Aggregation Network (RCSPELAN) modules and the Heat-Transfer Cross-stage Fusion (HTCF) modules, effectively balancing feature extraction efficiency and global perception capabilities. Additionally, we introduce a Small-Object Detection Module (SODM) and an EIFI module to enhance the detection capability of small objects in complex scenarios, while employing the Focaler-Shape-IoU loss function to optimize bounding box regression. Experimental results on the VisDrone2019 dataset demonstrate that Hybrid-DETR achieves mAP50 and mAP50:95 scores of 52.2% and 33.3%, respectively, representing improvements of 5.2% and 4.3% compared to RT-DETR-R18, while reducing model parameters by 29.33%. The effectiveness and robustness of our improved method are further validated on multiple challenging datasets, including AI-TOD and HIT-UAV. Full article
(This article belongs to the Special Issue New Insights in 2D and 3D Object Detection and Semantic Segmentation)
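
The Focaler-Shape-IoU loss used here, like the other IoU variants in these results, extends plain intersection-over-union between axis-aligned boxes. For reference, a minimal IoU computation (illustrative only, not the paper's implementation):

```python
# Plain IoU between axis-aligned boxes in (x1, y1, x2, y2) format (PyTorch).
# This is the base quantity that Shape-IoU / Focaler-Shape-IoU style losses modify.
import torch

def box_iou(a: torch.Tensor, b: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    """a, b: (N, 4) matched box pairs; returns (N,) IoU values."""
    lt = torch.max(a[:, :2], b[:, :2])                 # top-left of the intersection
    rb = torch.min(a[:, 2:], b[:, 2:])                 # bottom-right of the intersection
    wh = (rb - lt).clamp(min=0)
    inter = wh[:, 0] * wh[:, 1]
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    return inter / (area_a + area_b - inter + eps)

pred = torch.tensor([[0., 0., 10., 10.]])
gt   = torch.tensor([[5., 5., 15., 15.]])
print(box_iou(pred, gt))  # tensor([0.1429])  (25 / 175)
```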

28 pages, 8539 KiB  
Article
Enhancing YOLOv5 Performance for Small-Scale Corrosion Detection in Coastal Environments Using IoU-Based Loss Functions
by Qifeng Yu, Yudong Han, Yi Han, Xinjia Gao and Lingyu Zheng
J. Mar. Sci. Eng. 2024, 12(12), 2295; https://doi.org/10.3390/jmse12122295 - 13 Dec 2024
Cited by 3 | Viewed by 1779
Abstract
The high salinity, humidity, and oxygen-rich environments of coastal marine areas pose serious corrosion risks to metal structures, particularly in equipment such as ships, offshore platforms, and port facilities. With the development of artificial intelligence technologies, image recognition-based intelligent detection methods have provided effective support for corrosion monitoring in marine engineering structures. This study aims to explore the performance improvements of different modified YOLOv5 models in small-object corrosion detection tasks, focusing on five IoU-based improved loss functions and their optimization effects on the YOLOv5 model. First, the study utilizes corrosion testing data from the Zhoushan seawater station of the China National Materials Corrosion and Protection Science Data Center to construct a corrosion image dataset containing 1266 labeled images. Then, based on the improved IoU loss functions, five YOLOv5 models were constructed: YOLOv5-NWD, YOLOv5-Shape-IoU, YOLOv5-WIoU, YOLOv5-Focal-EIoU, and YOLOv5-SIoU. These models, along with the traditional YOLOv5 model, were trained using the dataset, and their performance was evaluated using metrics such as precision, recall, F1 score, and FPS. The results showed that YOLOv5-NWD performed the best across all metrics, with a 7.2% increase in precision and a 2.2% increase in F1 score. The YOLOv5-Shape-IoU model followed, with improvements of 4.5% in precision and 2.6% in F1 score. In contrast, the performance improvements of YOLOv5-Focal-EIoU, YOLOv5-SIoU, and YOLOv5-WIoU were more limited. Further analysis revealed that different IoU ratios significantly affected the performance of the YOLOv5-NWD model. Experiments showed that the 4:6 ratio yielded the highest precision, while the 6:4 ratio performed the best in terms of recall, F1 score, and confusion matrix results. In addition, this study conducted an assessment using four datasets of different sizes: 300, 600, 900, and 1266 images. The results indicate that increasing the size of the training dataset enables the model to find a better balance between precision and recall, that is, a higher F1 score, while also effectively improving the model’s processing speed. Therefore, the choice of an appropriate IoU ratio should be based on specific application needs to optimize model performance. This study provides theoretical support for small-object corrosion detection tasks, advances the development of loss function design, and enhances the detection accuracy and reliability of YOLOv5 in practical applications. Full article
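
The abstract describes blending the NWD term with IoU at fixed ratios such as 4:6 and 6:4. The sketch below follows the standard NWD definition (boxes modeled as 2D Gaussians) and a simple weighted blend; the constant C and the exact combination used in the paper are assumptions.

```python
# Hedged sketch: blending an NWD term with IoU at a fixed ratio (e.g., 4:6 NWD:IoU).
# C is a dataset-dependent constant; its value and the blend form are assumptions.
import math

def nwd(box_a, box_b, C=12.8):
    """box = (cx, cy, w, h). Normalized Wasserstein distance between boxes as 2D Gaussians."""
    (cxa, cya, wa, ha), (cxb, cyb, wb, hb) = box_a, box_b
    w2 = (cxa - cxb) ** 2 + (cya - cyb) ** 2 + ((wa - wb) / 2) ** 2 + ((ha - hb) / 2) ** 2
    return math.exp(-math.sqrt(w2) / C)

def mixed_loss(iou, box_a, box_b, nwd_ratio=0.4):
    """Loss = ratio * (1 - NWD) + (1 - ratio) * (1 - IoU)."""
    return nwd_ratio * (1 - nwd(box_a, box_b)) + (1 - nwd_ratio) * (1 - iou)

pred, gt = (10, 10, 6, 6), (12, 11, 5, 7)
print(mixed_loss(iou=0.45, box_a=pred, box_b=gt))
```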

22 pages, 6958 KiB  
Article
Distinguishing Difficulty Imbalances in Strawberry Ripeness Instances in a Complex Farmland Environment
by Yang Gan, Xuefeng Ren, Huan Liu, Yongming Chen and Ping Lin
Appl. Sci. 2024, 14(22), 10690; https://doi.org/10.3390/app142210690 - 19 Nov 2024
Cited by 2 | Viewed by 1016
Abstract
Existing strawberry ripeness detection algorithms suffer from low precision and high miss rates in real, complex scenes. We therefore propose a novel model based on a hybrid attention mechanism. Firstly, a partial convolution-based compact inverted block is developed, which significantly enhances the feature extraction capability of the model. Secondly, an efficient partial hybrid attention mechanism is established, which captures long-range dependencies and enables accurate localization of strawberry fruit. Meanwhile, a multi-scale progressive feature pyramid network is constructed, and the fine-grained features of strawberry targets of different sizes are accurately extracted. Finally, a Focaler-shape-IoU loss function is proposed to effectively address the difficulty imbalance between strawberry samples and the influence of bounding-box shape and size on regression. The experimental results show that the model's precision and mAP0.5 reach 92.1% and 92.7%, respectively, which are 2.0% and 1.7% higher than the baseline model. Additionally, our model achieves better detection performance than most models while using fewer parameters and lower FLOPs. In summary, the model can accurately identify strawberry fruit maturity in complex farmland environments and provide technical guidance for automated strawberry-picking robots. Full article
(This article belongs to the Section Food Science and Technology)
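
The compact inverted block is built on partial convolution, in which a regular convolution is applied to only a fraction of the channels while the rest pass through unchanged. A minimal sketch of that idea (the split ratio and layer arrangement are assumptions, not the paper's block):

```python
# Minimal partial convolution: a regular conv is applied to only a fraction of the
# channels and the rest pass through unchanged. The 1/4 split ratio is an assumption.
import torch
import torch.nn as nn

class PartialConv(nn.Module):
    def __init__(self, channels: int, ratio: float = 0.25, k: int = 3):
        super().__init__()
        self.c_conv = max(1, int(channels * ratio))        # channels that get convolved
        self.conv = nn.Conv2d(self.c_conv, self.c_conv, k, 1, padding=k // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x1, x2 = torch.split(x, [self.c_conv, x.shape[1] - self.c_conv], dim=1)
        return torch.cat([self.conv(x1), x2], dim=1)        # untouched channels are concatenated back

x = torch.randn(1, 64, 40, 40)
print(PartialConv(64)(x).shape)  # torch.Size([1, 64, 40, 40])
```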

18 pages, 7039 KiB  
Article
Two-Stage Detection Algorithm for Plum Leaf Disease and Severity Assessment Based on Deep Learning
by Caihua Yao, Ziqi Yang, Peifeng Li, Yuxia Liang, Yamin Fan, Jinwen Luo, Chengmei Jiang and Jiong Mu
Agronomy 2024, 14(7), 1589; https://doi.org/10.3390/agronomy14071589 - 21 Jul 2024
Cited by 10 | Viewed by 1871
Abstract
Crop diseases significantly impact crop yields, and promoting specialized control of crop diseases is crucial for ensuring agricultural production stability. Disease identification primarily relies on human visual inspection, which is inefficient, inaccurate, and subjective. This study focused on plum red spot (Polystigma rubrum), proposing a two-stage detection algorithm based on deep learning and assessing disease severity through the lesion coverage rate. The specific contributions are as follows: We utilized the object detection model YOLOv8 to extract individual leaves, eliminating the influence of complex backgrounds. We used an improved U-Net network to segment leaves and lesions. We combined Dice Loss with Focal Loss to address the poor training performance caused by the pixel-ratio imbalance between leaves and disease spots. To handle the inconsistent sizes and shapes of leaves and lesions, we utilized ODConv and MSCA so that the model could focus on features at different scales. After verification, the accuracy of leaf recognition is 95.3%, and the mIoU, mPA, mPrecision, and mRecall of the leaf disease segmentation model are 90.93%, 95.21%, 95.17%, and 95.21%, respectively. This research provides an effective solution for the detection and severity assessment of plum leaf red spot disease under complex backgrounds. Full article
(This article belongs to the Special Issue The Applications of Deep Learning in Smart Agriculture)
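
Severity is graded by the lesion coverage rate, i.e., the fraction of leaf pixels occupied by lesions. A minimal calculation from the two segmentation masks (the example masks are illustrative; any grading thresholds would be the paper's, not shown here):

```python
# Lesion coverage rate from segmentation masks: lesion pixels / leaf pixels.
import numpy as np

def lesion_coverage(leaf_mask: np.ndarray, lesion_mask: np.ndarray) -> float:
    leaf_px = np.count_nonzero(leaf_mask)
    lesion_px = np.count_nonzero(np.logical_and(lesion_mask, leaf_mask))
    return lesion_px / leaf_px if leaf_px else 0.0

leaf = np.zeros((100, 100), dtype=bool); leaf[10:90, 10:90] = True      # 6400 leaf pixels
lesion = np.zeros_like(leaf); lesion[30:50, 30:50] = True               # 400 lesion pixels
print(f"coverage = {lesion_coverage(leaf, lesion):.1%}")                # coverage = 6.2%
```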

13 pages, 2629 KiB  
Article
Fire in Focus: Advancing Wildfire Image Segmentation by Focusing on Fire Edges
by Guodong Wang, Fang Wang, Hongping Zhou and Haifeng Lin
Forests 2024, 15(1), 217; https://doi.org/10.3390/f15010217 - 22 Jan 2024
Cited by 11 | Viewed by 2707
Abstract
With the intensification of global climate change and the frequent occurrence of forest fires, the development of efficient and precise forest fire monitoring and image segmentation technologies has become increasingly important. In dealing with challenges such as the irregular shapes, sizes, and blurred boundaries of flames and smoke, traditional convolutional neural networks (CNNs) face limitations in forest fire image segmentation, including flame edge recognition, class imbalance issues, and adapting to complex scenarios. This study aims to enhance the accuracy and efficiency of flame recognition in forest fire images by introducing a backbone network based on the Swin Transformer and combined with an adaptive multi-scale attention mechanism and focal loss function. By utilizing a rich and diverse pre-training dataset, our model can more effectively capture and understand key features of forest fire images. Through experimentation, our model achieved an intersection over union (IoU) of 86.73% and a precision of 91.23%. This indicates that the performance of our proposed wildfire segmentation model has been effectively enhanced. A series of ablation experiments validate the importance of these technological improvements in enhancing model performance. The results show that our approach achieves significant performance improvements in forest fire image segmentation tasks compared to traditional models. The Swin Transformer provides more refined feature extraction capabilities, the adaptive multi-scale attention mechanism helps the model focus better on key areas, and the focal loss function effectively addresses the issue of class imbalance. These innovations make the model more precise and robust in handling forest fire image segmentation tasks, providing strong technical support for future forest fire monitoring and prevention. Full article
(This article belongs to the Special Issue Artificial Intelligence and Machine Learning Applications in Forestry)
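
The reported IoU and precision are computed from predicted and ground-truth fire masks. For reference, a minimal evaluation sketch (the 0.5 threshold and random example data are assumptions):

```python
# IoU and precision for a binary segmentation mask, as reported in the abstract.
import numpy as np

def iou_and_precision(pred: np.ndarray, gt: np.ndarray, thr: float = 0.5):
    p = pred > thr
    g = gt > 0.5
    tp = np.logical_and(p, g).sum()
    fp = np.logical_and(p, ~g).sum()
    fn = np.logical_and(~p, g).sum()
    iou = tp / (tp + fp + fn + 1e-9)
    precision = tp / (tp + fp + 1e-9)
    return iou, precision

pred = np.random.rand(256, 256)                      # e.g., sigmoid outputs of the model
gt = (np.random.rand(256, 256) > 0.7).astype(float)  # placeholder ground-truth mask
print(iou_and_precision(pred, gt))
```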

19 pages, 4909 KiB  
Article
Monitoring Impervious Surface Area Dynamics in Urban Areas Using Sentinel-2 Data and Improved Deeplabv3+ Model: A Case Study of Jinan City, China
by Jiantao Liu, Yan Zhang, Chunting Liu and Xiaoqian Liu
Remote Sens. 2023, 15(8), 1976; https://doi.org/10.3390/rs15081976 - 8 Apr 2023
Cited by 9 | Viewed by 2633
Abstract
Timely and rapid mapping of impervious surface area (ISA) and monitoring of its spatial-temporal change pattern can deepen our understanding of the urban process. However, the complex spectral variability and spatial heterogeneity of ISA caused by increased spatial resolution pose a great challenge to accurate ISA dynamics monitoring. This research selected Jinan City as a case study to boost ISA mapping performance by integrating the CBAM dual-attention module, the SE module, and a focal loss function into the Deeplabv3+ model using Sentinel-2 data, and subsequently examined ISA spatial-temporal evolution using the generated annual time-series ISA data from 2017 to 2021. The experimental results demonstrated that (a) the improved Deeplabv3+ model achieved satisfactory accuracy in ISA mapping, with Precision, Recall, IoU and F1 values reaching 82.24%, 92.38%, 77.01% and 0.87, respectively. (b) In a comparison with traditional classification methods and other state-of-the-art deep learning semantic segmentation models, the proposed method performed well, both qualitatively and quantitatively. (c) The time-series analysis of ISA distribution revealed that ISA expansion in Jinan City had significant directionality from northeast to southwest from 2017 to 2021, with the number of patches as well as the degree of connectivity and aggregation increasing while the degree of fragmentation and the complexity of shape decreased. Overall, the proposed method shows great potential for generating reliable time-series ISA data and can better serve fine-scale urban research. Full article
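
Of the modules added to Deeplabv3+ here, the SE (squeeze-and-excitation) block is the simplest to illustrate: global average pooling followed by a small bottleneck that rescales each channel. The reduction ratio below is a common default, not necessarily the paper's setting.

```python
# Minimal squeeze-and-excitation (SE) block (PyTorch): channels are rescaled by weights
# learned from globally pooled features. Reduction ratio 16 is a common default.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                      # squeeze: global average pool
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),                                 # excitation: per-channel weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.fc(x)                             # channel-wise rescaling

x = torch.randn(2, 256, 32, 32)
print(SEBlock(256)(x).shape)  # torch.Size([2, 256, 32, 32])
```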

16 pages, 5908 KiB  
Article
Evaluation of Deep Learning Segmentation Models for Detection of Pine Wilt Disease in Unmanned Aerial Vehicle Images
by Lang Xia, Ruirui Zhang, Liping Chen, Longlong Li, Tongchuan Yi, Yao Wen, Chenchen Ding and Chunchun Xie
Remote Sens. 2021, 13(18), 3594; https://doi.org/10.3390/rs13183594 - 9 Sep 2021
Cited by 61 | Viewed by 4210
Abstract
Pine wilt disease (PWD) is a serious threat to pine forests. Combining unmanned aerial vehicle (UAV) images and deep learning (DL) techniques to identify infected pines is the most efficient method to determine the potential spread of PWD over a large area. In particular, image segmentation using DL obtains the detailed shape and size of infected pines to assess the disease's degree of damage. However, the performance of such segmentation models has not been thoroughly studied. We used a fixed-wing UAV to collect images from a pine forest in Laoshan, Qingdao, China, and conducted a ground survey to collect samples of infected pines and construct prior knowledge to interpret the images. Then, training and test sets were annotated on selected images, and we obtained 2352 samples of infected pines annotated over different backgrounds. Finally, high-performance DL models (e.g., fully convolutional networks for semantic segmentation, DeepLabv3+, and PSPNet) were trained and evaluated. The results demonstrated that focal loss provided higher accuracy and finer boundaries than Dice loss, with the average intersection over union (IoU) for all models increasing from 0.656 to 0.701. Of the evaluated models, DeepLabv3+ achieved the highest IoU and F1 score, at 0.720 and 0.832, respectively; its atrous spatial pyramid pooling module encoded multiscale context information and its encoder–decoder architecture recovered location/spatial information, making it the best architecture for segmenting trees infected by PWD. Furthermore, segmentation accuracy did not improve as the depth of the backbone network increased, and neither ResNet34 nor ResNet50 was the appropriate backbone for most segmentation models. Full article
(This article belongs to the Special Issue Thematic Information Extraction and Application in Forests)
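
The atrous spatial pyramid pooling module credited with encoding multiscale context is, in outline, a set of parallel dilated convolutions plus image-level pooling whose outputs are concatenated and fused. A simplified sketch (dilation rates and channel counts are assumptions, not the evaluated configuration):

```python
# Simplified atrous spatial pyramid pooling (ASPP): parallel dilated convolutions plus
# image-level pooling, concatenated and fused by a 1x1 convolution.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPP(nn.Module):
    def __init__(self, c_in: int, c_out: int = 256, rates=(6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(c_in, c_out, 1, bias=False)]
            + [nn.Conv2d(c_in, c_out, 3, padding=r, dilation=r, bias=False) for r in rates]
        )
        self.image_pool = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Conv2d(c_in, c_out, 1, bias=False))
        self.project = nn.Conv2d(c_out * (len(rates) + 2), c_out, 1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[2:]
        feats = [b(x) for b in self.branches]
        pooled = F.interpolate(self.image_pool(x), size=(h, w), mode="bilinear", align_corners=False)
        return self.project(torch.cat(feats + [pooled], dim=1))

x = torch.randn(1, 512, 32, 32)
print(ASPP(512)(x).shape)  # torch.Size([1, 256, 32, 32])
```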