Search Results (41)

Search Parameters:
Keywords = inverted residual blocks

20 pages, 14596 KiB  
Article
Accurate Sugarcane Detection and Row Fitting Using SugarRow-YOLO and Clustering-Based Spline Methods for Autonomous Agricultural Operations
by Guiqing Deng, Fangyue Zhou, Huan Dong, Zhihao Xu and Yanzhou Li
Appl. Sci. 2025, 15(14), 7789; https://doi.org/10.3390/app15147789 - 11 Jul 2025
Viewed by 171
Abstract
Sugarcane is mostly planted in rows, and accurate identification of crop rows is important for the autonomous navigation of agricultural machines. Especially during the elongation period of sugarcane, accurate row identification helps with weed control and the removal of ineffective tillers in the field. However, sugarcane leaves and stalks intertwine and overlap at this stage, forming complex occlusion structures that pose a greater challenge to target detection. To address this challenge, this paper proposes an improved target detection method, SugarRow-YOLO, based on the YOLOv11n model. The method aims to achieve accurate sugarcane identification and provide basic support for subsequent sugarcane row detection. The model introduces WTConv convolutional modules to expand the receptive field and improve computational efficiency, adopts the iRMB inverted residual block attention mechanism to enhance the modeling of crop spatial structure, and uses the UIOU loss function to effectively mitigate misdetections and omissions in regions of dense, overlapping targets. The experimental results show that SugarRow-YOLO performs well in the sugarcane target detection task, with a precision of 83%, a recall of 87.8%, and mAP50 and mAP50-95 of 90.2% and 69.2%, respectively. In addition, to address the large variability in sugarcane row spacing and plant spacing, this paper introduces the DBSCAN clustering algorithm combined with a smoothing spline curve to fit the crop rows and realize their accurate extraction. This approach achieved an accuracy of 96.6% in the row-fitting task; together with the high precision of sugarcane target detection, it demonstrates excellent accuracy in sugarcane row fitting, offering robust technical support for the automation and intelligent advancement of agricultural operations.
(This article belongs to the Section Agricultural Science and Technology)
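
Below is a minimal sketch of the clustering-plus-spline row-fitting idea described in this abstract: detected stalk centers are grouped into rows with DBSCAN and each row is fitted with a smoothing spline. The parameter values (`eps`, `min_samples`, the smoothing factor) and the assumption that rows run along the image y-axis are illustrative, not the paper's settings.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from scipy.interpolate import UnivariateSpline

def fit_crop_rows(centers, eps=40.0, min_samples=3, smooth=1e4):
    """centers: (N, 2) array of detected sugarcane box centers (x, y) in pixels."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(centers)
    rows = {}
    for lbl in set(labels) - {-1}:           # -1 marks DBSCAN noise points
        pts = centers[labels == lbl]
        pts = pts[np.argsort(pts[:, 1])]     # sort along the assumed row direction (y)
        # Smoothing spline x = f(y); the spline needs strictly increasing input
        y, idx = np.unique(pts[:, 1], return_index=True)
        if len(y) >= 4:                      # enough points for a cubic spline
            rows[lbl] = UnivariateSpline(y, pts[idx, 0], s=smooth)
    return rows  # dict: row label -> callable spline x = f(y)
```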

23 pages, 8232 KiB  
Article
Intelligent Identification of Tea Plant Seedlings Under High-Temperature Conditions via YOLOv11-MEIP Model Based on Chlorophyll Fluorescence Imaging
by Chun Wang, Zejun Wang, Lijiao Chen, Weihao Liu, Xinghua Wang, Zhiyong Cao, Jinyan Zhao, Man Zou, Hongxu Li, Wenxia Yuan and Baijuan Wang
Plants 2025, 14(13), 1965; https://doi.org/10.3390/plants14131965 - 27 Jun 2025
Viewed by 369
Abstract
To achieve efficient, non-destructive, and intelligent identification of tea plant seedlings under high-temperature stress, this study proposes an improved YOLOv11 model based on chlorophyll fluorescence imaging technology. Taking tea plant seedlings subjected to varying degrees of high temperature as the research objects, raw fluorescence images were acquired with a chlorophyll fluorescence image acquisition device. Spearman correlation analysis identified the maximum photochemical efficiency (Fv/Fm) as the key fluorescence parameter, and fluorescence images of this parameter were used to construct the dataset. The YOLOv11 model was improved in the following ways. First, to reduce the number of network parameters and maintain a low computational cost, the lightweight MobileNetV4 network was introduced into the YOLOv11 model as a new backbone. Second, to achieve efficient feature upsampling, enhance the efficiency and accuracy of feature extraction, and reduce computational redundancy and memory access volume, the EUCB (Efficient Up Convolution Block), iRMB (Inverted Residual Mobile Block), and PConv (Partial Convolution) modules were introduced into the YOLOv11 model. The results show that the improved YOLOv11-MEIP model has the best performance, with precision, recall, and mAP50 reaching 99.25%, 99.19%, and 99.46%, respectively. Compared with the YOLOv11 model, YOLOv11-MEIP achieved increases of 4.05%, 7.86%, and 3.42% in precision, recall, and mAP50, respectively, while reducing the number of parameters by 29.45%. This study provides a new intelligent method for classifying high-temperature stress levels of tea seedlings, as well as for state detection and identification, and offers theoretical support and a technical reference for monitoring and protecting tea plants and other crops in tea gardens under high temperatures.
(This article belongs to the Special Issue Practical Applications of Chlorophyll Fluorescence Measurements)
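
For reference, Fv/Fm is a standard chlorophyll fluorescence quantity, Fv/Fm = (Fm − F0)/Fm, computed from the minimum (F0) and maximum (Fm) fluorescence of a dark-adapted leaf. The sketch below computes a per-pixel Fv/Fm image from two raw frames; the array names and clipping are illustrative, not details of the paper's acquisition device.

```python
import numpy as np

def fvfm_image(f0: np.ndarray, fm: np.ndarray) -> np.ndarray:
    """Per-pixel maximum photochemical efficiency from F0 and Fm frames."""
    fm = fm.astype(np.float64)
    fv = fm - f0.astype(np.float64)           # variable fluorescence Fv
    with np.errstate(divide="ignore", invalid="ignore"):
        out = np.where(fm > 0, fv / fm, 0.0)  # avoid division by zero off-leaf
    return np.clip(out, 0.0, 1.0)             # healthy leaves peak near ~0.83
```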

20 pages, 4244 KiB  
Article
Edge-Optimized Lightweight YOLO for Real-Time SAR Object Detection
by Caiguang Zhang, Ruofeng Yu, Shuwen Wang, Fatong Zhang, Shaojia Ge, Shuangshuang Li and Xuezhou Zhao
Remote Sens. 2025, 17(13), 2168; https://doi.org/10.3390/rs17132168 - 24 Jun 2025
Viewed by 540
Abstract
Synthetic Aperture Radar (SAR) image object detection holds significant application value in both military and civilian domains. However, existing deep learning-based methods suffer from excessive model parameters and high computational costs, making them impractical for real-time deployment on edge computing platforms. To address these challenges, this paper proposes a lightweight SAR object detection method optimized for edge devices. First, we design an efficient backbone network based on inverted residual blocks and the information bottleneck principle, achieving an optimal balance between feature extraction capability and computational resource consumption. Then, a Fast Feature Pyramid Network is constructed to enable efficient multi-scale feature fusion. Finally, we propose a decoupled network-in-network head, which significantly reduces computational overhead while maintaining detection accuracy. Experimental results demonstrate that the proposed method achieves detection performance comparable to state-of-the-art YOLO variants while drastically reducing computational complexity (4.4 GFLOPs) and parameter count (1.9 M). On edge platforms (Jetson TX2 and Huawei Atlas DK 310), the model achieves real-time inference speeds of 34.2 FPS and 30.7 FPS, respectively, proving its suitability for resource-constrained, real-time SAR object detection scenarios.
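
The inverted residual block referenced here is the MobileNetV2-style pattern: a 1×1 expansion, a depthwise 3×3 convolution, and a 1×1 linear projection, with a skip connection when shapes match. A generic PyTorch sketch, not the paper's exact backbone:

```python
import torch.nn as nn

class InvertedResidual(nn.Module):
    def __init__(self, c_in, c_out, stride=1, expand=4):
        super().__init__()
        hidden = c_in * expand
        self.use_skip = stride == 1 and c_in == c_out
        self.block = nn.Sequential(
            nn.Conv2d(c_in, hidden, 1, bias=False),           # 1x1 expansion
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, hidden, 3, stride, 1,
                      groups=hidden, bias=False),             # depthwise 3x3
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, c_out, 1, bias=False),          # linear projection
            nn.BatchNorm2d(c_out),
        )

    def forward(self, x):
        return x + self.block(x) if self.use_skip else self.block(x)
```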

23 pages, 11085 KiB  
Article
Failure Mechanism and Movement Process Inversion of Rainfall-Induced Landslide in Yuexi County
by Yonghong Xiao, Lu Wei and Xianghong Liu
Sustainability 2025, 17(12), 5639; https://doi.org/10.3390/su17125639 - 19 Jun 2025
Viewed by 293
Abstract
Shallow landslides are among the main geological hazards occurring during heavy rainfall in Yuexi County every year, posing potential risks to the personal and property safety of local residents. A rainfall-induced shallow landslide, the Baishizu No. 15 landslide in Yuexi County, was taken as a case study. Based on field geological investigation, combined with laboratory physical and mechanical experiments as well as numerical simulation, the failure mechanism induced by rainfall infiltration was studied, and the movement process after failure was inverted. The results show that the pore-water pressure within 2 m of the landslide body increases significantly and that the factor of safety (Fs) corresponds closely with rainfall, decreasing to 0.978 after the heavy rainstorm of July 5–6, 2020. The maximum shear strain and displacement are concentrated at the foot and front edge of the landslide, indicating a "traction type" failure mode for the Baishizu No. 15 landslide. In addition, the maximum displacement during landslide instability is about 0.5 m. The residual strength of soils collected from the soil–rock interface shows significant rate-strengthening, which ensures that the Baishizu No. 15 landslide will not exhibit high-speed, long-runout movement. The rate-dependent friction coefficient of the sliding surface was incorporated to simulate the movement process of the Baishizu No. 15 landslide using PFC2D. The simulation results show that the movement velocity exhibited obvious oscillatory characteristics. After the movement stopped, the landslide formed a slip cliff at the rear edge and deposited as far as the platform at the front of the slope foot, but it did not block the road ahead. The final deposition state is basically consistent with the on-site investigation. The results of this paper can provide valuable references for disaster prevention, mitigation, and risk assessment of shallow landslides on residual-soil slopes in the Dabie mountainous region.
(This article belongs to the Section Hazards and Sustainability)
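
For context, the factor of safety mentioned above is commonly evaluated for shallow rainfall-induced slides with the infinite-slope limit-equilibrium expression below; this is a standard textbook form given for orientation, not necessarily the formulation used in the study.

```latex
% Infinite-slope factor of safety with pore-water pressure u on the slip surface
F_s = \frac{c' + \left(\gamma z \cos^2\beta - u\right)\tan\varphi'}{\gamma z \sin\beta \cos\beta}
```

Here $c'$ and $\varphi'$ are the effective cohesion and friction angle, $\gamma$ the soil unit weight, $z$ the slip-surface depth, $\beta$ the slope angle, and $u$ the pore-water pressure; rainfall infiltration raises $u$, which is how $F_s$ can drop below 1 (here, to 0.978).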

24 pages, 2119 KiB  
Article
Multimodal Medical Image Fusion Using a Progressive Parallel Strategy Based on Deep Learning
by Peng Peng and Yaohua Luo
Electronics 2025, 14(11), 2266; https://doi.org/10.3390/electronics14112266 - 31 May 2025
Viewed by 682
Abstract
Multimodal medical image fusion plays a critical role in enhancing diagnostic accuracy by integrating complementary information from different imaging modalities. However, existing methods often suffer from unbalanced feature fusion, structural blurring, loss of fine details, and limited global semantic modeling, particularly in low signal-to-noise modalities such as PET. To address these challenges, we propose PPMF-Net, a novel progressive and parallel deep learning framework for PET–MRI image fusion. The network employs a hierarchical multi-path architecture to capture local details, global semantics, and high-frequency information in a coordinated manner. Specifically, it integrates three key modules: (1) a Dynamic Edge-Enhanced Module (DEEM) utilizing inverted residual blocks and channel attention to sharpen edge and texture features, (2) a Nonlinear Interactive Feature Extraction module (NIFE) that combines convolutional operations with element-wise multiplication to enable cross-modal feature coupling, and (3) a Transformer-Enhanced Global Modeling module (TEGM) with hybrid local–global attention to improve long-range dependency and structural consistency. A multi-objective unsupervised loss function is designed to jointly optimize structural fidelity, functional complementarity, and detail clarity. Experimental results on the Harvard MIF dataset demonstrate that PPMF-Net outperforms state-of-the-art methods across multiple metrics (SF: 38.27, SD: 96.55, SCD: 1.62, MS-SSIM: 1.14) and shows strong generalization and robustness in tasks such as SPECT–MRI and CT–MRI fusion, indicating promising potential for clinical applications.
(This article belongs to the Special Issue AI-Driven Medical Image/Video Processing)
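
The NIFE idea above (convolution combined with element-wise multiplication for cross-modal coupling) can be sketched as a gated fusion of PET and MRI feature maps. This is a generic illustration under that reading of the abstract, not the paper's module:

```python
import torch
import torch.nn as nn

class MultiplicativeFusion(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.proj_a = nn.Conv2d(channels, channels, 3, padding=1)
        self.proj_b = nn.Conv2d(channels, channels, 3, padding=1)
        self.mix = nn.Conv2d(channels, channels, 1)

    def forward(self, feat_pet, feat_mri):
        # element-wise product couples the two modalities nonlinearly
        coupled = torch.tanh(self.proj_a(feat_pet)) * self.proj_b(feat_mri)
        return self.mix(coupled) + feat_mri   # residual keeps structural detail
```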

21 pages, 43908 KiB  
Article
WHA-Net: A Low-Complexity Hybrid Model for Accurate Pseudopapilledema Classification in Fundus Images
by Junpeng Pei, Yousong Wang, Mingliang Ge, Jun Li, Yixing Li, Wei Wang and Xiaohong Zhou
Bioengineering 2025, 12(5), 550; https://doi.org/10.3390/bioengineering12050550 - 21 May 2025
Viewed by 498
Abstract
The fundus manifestations of pseudopapilledema closely resemble those of optic disc edema, making their differentiation particularly challenging in certain clinical situations. However, rapid and accurate diagnosis is crucial for alleviating patient anxiety and guiding treatment strategies. This study proposes WHA-Net, an efficient low-complexity hybrid model that integrates three core modules to achieve precise auxiliary diagnosis of pseudopapilledema. First, the wavelet convolution (WTC) block is introduced to enhance the model's ability to characterize vessel and optic disc edge details in fundus images through the 2D wavelet transform and depthwise convolution. Next, the hybrid attention inverted residual (HAIR) block is incorporated to extract critical features such as vascular morphology, hemorrhages, and exudates. Finally, the Agent-MViT module effectively captures the continuity of optic disc contours and retinal vessels while reducing the computational complexity of traditional Transformers. The model was trained and evaluated on a dataset of 1793 rigorously curated fundus images, comprising 895 normal optic discs, 485 optic disc edema (ODE) cases, and 413 pseudopapilledema (PPE) cases. On the test set, the model achieved outstanding performance, with 97.79% accuracy, 95.55% precision, 95.69% recall, and 98.53% specificity. Comparative experiments confirm the superiority of WHA-Net in classification tasks, while ablation studies validate the effectiveness and rationality of each module's combined design. This research provides a clinically valuable solution for the automated differential diagnosis of pseudopapilledema, combining computational efficiency with diagnostic reliability.
(This article belongs to the Section Biomedical Engineering and Biomaterials)
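
A sketch of the wavelet-convolution idea: a fixed 2×2 Haar decomposition implemented as a strided depthwise convolution, followed by a learnable depthwise convolution on the four subbands. This is illustrative only; the paper's WTC block may differ in wavelet choice and structure.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HaarWaveletConv(nn.Module):
    def __init__(self, channels):
        super().__init__()
        ll = torch.tensor([[0.5, 0.5], [0.5, 0.5]])
        lh = torch.tensor([[0.5, 0.5], [-0.5, -0.5]])
        hl = torch.tensor([[0.5, -0.5], [0.5, -0.5]])
        hh = torch.tensor([[0.5, -0.5], [-0.5, 0.5]])
        filt = torch.stack([ll, lh, hl, hh])[:, None]      # (4, 1, 2, 2)
        self.register_buffer("filt", filt.repeat(channels, 1, 1, 1))
        self.channels = channels
        # learnable depthwise conv over all subband channels
        self.dw = nn.Conv2d(4 * channels, 4 * channels, 3,
                            padding=1, groups=4 * channels)

    def forward(self, x):
        # depthwise Haar analysis: each channel -> 4 half-resolution subbands
        sub = F.conv2d(x, self.filt, stride=2, groups=self.channels)
        return self.dw(sub)                                 # (B, 4C, H/2, W/2)
```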

20 pages, 3955 KiB  
Article
Lightweight Pepper Disease Detection Based on Improved YOLOv8n
by Yuzhu Wu, Junjie Huang, Siji Wang, Yujian Bao, Yizhe Wang, Jia Song and Wenwu Liu
AgriEngineering 2025, 7(5), 153; https://doi.org/10.3390/agriengineering7050153 - 12 May 2025
Viewed by 657
Abstract
China is the world's largest producer of chili peppers, which hold particularly important economic and social value in fields such as medicine, food, and industry. However, during production, chili peppers are affected by pests and diseases influenced by temperature and environment, resulting in significant yield reduction. In this study, a lightweight pepper disease identification method, DD-YOLO, based on the YOLOv8n model, is proposed. First, the deformable convolution module DCNv2 (Deformable ConvNets v2) and the inverted residual mobile block (iRMB) are introduced into the C2f module to improve the accuracy of the sampling range and reduce computation. Second, the DySample (Dynamic Sample) upsampling operator is integrated into the head network to reduce data volume and computational complexity. Finally, we use Large Separable Kernel Attention (LSKA) to improve the SPPF (Spatial Pyramid Pooling Fast) module and enhance multi-scale feature fusion. The experimental results show that the accuracy, recall, and average precision of the DD-YOLO model are 91.6%, 88.9%, and 94.4%, respectively, improvements of 6.2, 2.3, and 2.8 percentage points over the base network YOLOv8n. The model weight is reduced by 22.6%, and the number of floating-point operations is reduced by 11.1%. This method provides a technical basis for the intensive cultivation and management of chili peppers and accomplishes the task of identifying chili pepper pests and diseases efficiently and cost-effectively.
(This article belongs to the Topic Digital Agriculture, Smart Farming and Crop Monitoring)
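
A sketch of Large Separable Kernel Attention as used in work of this kind: the large depthwise kernels of LKA are factorized into cascaded 1D horizontal and vertical depthwise convolutions, and the result multiplicatively gates the input. Kernel sizes and dilation below are illustrative; the exact LSKA configuration in DD-YOLO may differ.

```python
import torch.nn as nn

class LSKA(nn.Module):
    def __init__(self, dim, k=7, d=3):
        super().__init__()
        self.h0 = nn.Conv2d(dim, dim, (1, 5), padding=(0, 2), groups=dim)
        self.v0 = nn.Conv2d(dim, dim, (5, 1), padding=(2, 0), groups=dim)
        # dilated 1D pair approximates a large receptive field cheaply
        self.h1 = nn.Conv2d(dim, dim, (1, k), padding=(0, d * (k - 1) // 2),
                            dilation=(1, d), groups=dim)
        self.v1 = nn.Conv2d(dim, dim, (k, 1), padding=(d * (k - 1) // 2, 0),
                            dilation=(d, 1), groups=dim)
        self.pw = nn.Conv2d(dim, dim, 1)

    def forward(self, x):
        attn = self.pw(self.v1(self.h1(self.v0(self.h0(x)))))
        return x * attn            # multiplicative attention gating
```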

20 pages, 7085 KiB  
Article
A Lightweight Citrus Ripeness Detection Algorithm Based on Visual Saliency Priors and Improved RT-DETR
by Yutong Huang, Xianyao Wang, Xinyao Liu, Liping Cai, Xuefei Feng and Xiaoyan Chen
Agronomy 2025, 15(5), 1173; https://doi.org/10.3390/agronomy15051173 - 12 May 2025
Cited by 1 | Viewed by 667
Abstract
As one of the world's most economically valuable fruit crops, citrus has quality and productivity closely tied to fruit ripeness. However, accurately and efficiently detecting citrus ripeness in complex orchard environments for selective robotic harvesting remains a challenge. To address this, we constructed a citrus ripeness detection dataset under complex orchard conditions and propose a lightweight algorithm based on visual saliency priors and the RT-DETR model, named LightSal-RTDETR. To reduce computational overhead, we designed the E-CSPPC module, which efficiently combines cross-stage partial networks with gated and partial convolutions; together with cascaded group attention (CGA) and the inverted residual mobile block (iRMB), this minimizes model complexity and computational demand while strengthening the model's capacity for feature representation. Additionally, the Inner-SIoU loss function was employed for bounding box regression, and a weight initialization method based on visual saliency maps was proposed. Experiments on our dataset show that LightSal-RTDETR achieves a mAP@50 of 81%, improving on the original model by 1.9% while reducing parameters by 28.1% and computational cost by 26.5%. LightSal-RTDETR thus effectively addresses citrus ripeness detection in highly complex orchard scenes, offering an efficient solution for smart agriculture applications.
(This article belongs to the Special Issue Advanced Machine Learning in Agriculture—2nd Edition)
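
One common way to obtain a visual saliency prior of the kind mentioned above is OpenCV's spectral-residual static saliency (in the opencv-contrib package). How the paper converts such a map into weight initialization is not reproduced here; this sketch only produces the prior map, and the file names are placeholders.

```python
import cv2
import numpy as np

img = cv2.imread("orchard.jpg")                     # illustrative file name
sal = cv2.saliency.StaticSaliencySpectralResidual_create()
ok, saliency_map = sal.computeSaliency(img)         # float map in [0, 1]
assert ok
prior = (saliency_map * 255).astype(np.uint8)       # e.g., for visualization
cv2.imwrite("saliency_prior.png", prior)
```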

26 pages, 6667 KiB  
Article
Rice Disease Detection: TLI-YOLO Innovative Approach for Enhanced Detection and Mobile Compatibility
by Zhuqi Li, Wangyu Wu, Bingcai Wei, Hao Li, Jingbo Zhan, Songtao Deng and Jian Wang
Sensors 2025, 25(8), 2494; https://doi.org/10.3390/s25082494 - 15 Apr 2025
Viewed by 823
Abstract
Rice is a key global food reserve, and rice disease detection technology plays an important role in promoting food production, protecting ecological balance, and supporting sustainable agricultural development. However, existing rice disease identification techniques face many challenges, such as low training efficiency, insufficient model accuracy, incompatibility with mobile devices, and the need for large training datasets. This study aims to develop a rice disease detection model that is highly accurate, resource efficient, and suitable for mobile deployment. We propose the Transfer Layer iRMB-YOLOv8 (TLI-YOLO) model, which modifies components of the YOLOv8 network structure based on transfer learning. The innovation of this method is reflected in four key components. First, transfer learning imports pretrained model weights into TLI-YOLO, which significantly reduces dataset requirements and accelerates model convergence. Second, a new small-object detection layer is integrated into the feature fusion layer, enhancing detection by combining shallow and deep feature maps to learn small-object features more effectively. Third, this study is the first to introduce the iRMB attention mechanism, which effectively integrates inverted residual blocks and Transformers and employs depthwise separable convolution to maintain the spatial integrity of features, improving the efficiency of computational resources on mobile platforms. Finally, this study adopts the WIoUv3 loss function, adding a dynamic non-monotonic focusing mechanism to the standard IoU calculation to more accurately evaluate and penalize the difference between predicted and actual bounding boxes, improving the robustness and generalization of the model. The final tests show that TLI-YOLO achieved 93.1% precision, 88% recall, 95% mAP, and a 90.48% F1 score on the custom dataset, with only 12.60 GFLOPs of computation. Compared with YOLOv8n, precision improved by 7.8%, recall by 7.2%, and mAP@0.5 by 7.6%. In addition, the model demonstrated real-time detection on an Android device at 30 FPS, which meets the needs of on-site diagnosis. This approach provides important support for rice disease monitoring.
(This article belongs to the Section Smart Agriculture)
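
The transfer-learning step described above (importing pretrained weights and then fine-tuning) maps directly onto the Ultralytics training API. A minimal usage sketch; the dataset config and hyperparameters are placeholders, not the paper's settings:

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")             # COCO-pretrained weights as starting point
model.train(data="rice_disease.yaml",  # hypothetical dataset config
            epochs=100, imgsz=640)
metrics = model.val()                  # precision / recall / mAP on the val split
```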

19 pages, 3839 KiB  
Article
YOLO-YSTs: An Improved YOLOv10n-Based Method for Real-Time Field Pest Detection
by Yiqi Huang, Zhenhao Liu, Hehua Zhao, Chao Tang, Bo Liu, Zaiyuan Li, Fanghao Wan, Wanqiang Qian and Xi Qiao
Agronomy 2025, 15(3), 575; https://doi.org/10.3390/agronomy15030575 - 26 Feb 2025
Cited by 9 | Viewed by 2440
Abstract
Yellow sticky traps are a green pest control method that exploits pests' attraction to the color yellow; they not only control pest populations but also enable monitoring, offering a more economical and environmentally friendly alternative to pesticides. However, the small size and dense distribution of pests on yellow sticky traps lead to lower detection accuracy with lightweight models, while large models suffer from long training times and deployment difficulties, posing challenges for field pest detection on edge computing platforms. To address these issues, this paper proposes a lightweight detection method, YOLO-YSTs, based on an improved YOLOv10n model. The method aims to balance pest detection accuracy and model size and has been validated on edge computing platforms. The model incorporates SPD-Conv convolutional modules, the iRMB inverted residual block attention mechanism, and the Inner-SIoU loss function to improve the YOLOv10n network architecture, addressing missed and false detections of small and overlapping targets while balancing model speed and accuracy. Experimental results show that YOLO-YSTs achieved precision, recall, mAP50, and mAP50-95 values of 83.2%, 83.2%, 86.8%, and 41.3%, respectively, on the yellow sticky trap dataset. The detection speed reached 139 FPS, with only 8.8 GFLOPs. Compared with the YOLOv10n model, mAP50 improved by 1.7%; compared with other mainstream object detection models, YOLO-YSTs also achieved the best overall performance. The improvements effectively enhanced the accuracy of pest detection on yellow sticky traps, and the model performed well when deployed on edge mobile platforms, making it significant for field pest monitoring and integrated pest management.
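
For reference, SPD-Conv replaces strided or pooled downsampling with a lossless space-to-depth rearrangement followed by a non-strided convolution, which helps preserve small-object detail. A generic PyTorch sketch (the activation choice is illustrative):

```python
import torch.nn as nn

class SPDConv(nn.Module):
    def __init__(self, c_in, c_out):
        super().__init__()
        self.spd = nn.PixelUnshuffle(2)     # (B, C, H, W) -> (B, 4C, H/2, W/2)
        self.conv = nn.Sequential(
            nn.Conv2d(4 * c_in, c_out, 3, 1, 1, bias=False),  # stride 1: no detail loss
            nn.BatchNorm2d(c_out), nn.SiLU(inplace=True),
        )

    def forward(self, x):
        return self.conv(self.spd(x))
```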

20 pages, 634 KiB  
Article
SATRN: Spiking Audio Tagging Robust Network
by Shouwei Gao, Xingyang Deng, Xiangyu Fan, Pengliang Yu, Hao Zhou and Zihao Zhu
Electronics 2025, 14(4), 761; https://doi.org/10.3390/electronics14040761 - 15 Feb 2025
Viewed by 581
Abstract
Audio tagging, as a fundamental task in acoustic signal processing, has seen significant advances and broad applications in recent years. Spiking Neural Networks (SNNs), inspired by biological neural systems, exploit event-driven computing paradigms and temporal information processing, enabling superior energy efficiency. Despite the increasing adoption of SNNs, the potential of event-driven encoding mechanisms for audio tagging remains largely unexplored. This work presents a pioneering investigation into event-driven encoding strategies for SNN-based audio tagging. We propose SATRN (Spiking Audio Tagging Robust Network), a novel architecture that integrates temporal–spatial attention mechanisms with membrane potential residual connections. The network employs a dual-stream structure combining global feature fusion and local feature extraction through inverted bottleneck blocks, specifically designed for efficient audio processing. Furthermore, we introduce an event-based encoding approach that enhances the resilience of SNNs to disturbances while maintaining performance. Our experimental results on the UrbanSound8K and FSD50K datasets demonstrate that SATRN achieves performance comparable to traditional Convolutional Neural Networks (CNNs) while requiring significantly less computation time and showing superior robustness to noise perturbations, making it particularly suitable for edge computing scenarios and real-time audio processing applications.
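
The event-driven computation underlying SNNs can be illustrated with a basic leaky integrate-and-fire (LIF) neuron: the membrane potential decays, integrates input current, and emits a binary spike when it crosses a threshold. A generic sketch with illustrative constants, not SATRN's actual neuron model:

```python
import torch

def lif_forward(inputs, beta=0.9, threshold=1.0):
    """inputs: (T, N) input current over T timesteps; returns (T, N) spikes."""
    v = torch.zeros(inputs.shape[1])
    spikes = []
    for x_t in inputs:                  # event-driven: work happens per timestep
        v = beta * v + x_t              # leaky integration of input current
        s = (v >= threshold).float()    # all-or-nothing spike
        v = v - s * threshold           # soft reset by subtraction
        spikes.append(s)
    return torch.stack(spikes)
```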

21 pages, 15422 KiB  
Article
A Lightweight Model for Weed Detection Based on the Improved YOLOv8s Network in Maize Fields
by Jinyong Huang, Xu Xia, Zhihua Diao, Xingyi Li, Suna Zhao, Jingcheng Zhang, Baohua Zhang and Guoqiang Li
Agronomy 2024, 14(12), 3062; https://doi.org/10.3390/agronomy14123062 - 22 Dec 2024
Cited by 1 | Viewed by 1321
Abstract
To address the computational intensity and deployment difficulties of weed detection models, a lightweight target detection model for weeds in maize fields based on YOLOv8s is proposed in this study. First, a lightweight network, designated Dualconv High Performance GPU Net (D-PP-HGNet), was constructed on the foundation of the High Performance GPU Net (PP-HGNet) framework. Dualconv was introduced to reduce computation and achieve a lightweight design, and an Adaptive Feature Aggregation Module (AFAM) and Global Max Pooling were incorporated to augment the extraction of salient features in complex scenarios. The new network was then used to reconstruct the YOLOv8s backbone. Second, a four-stage inverted residual mobile block (iRMB) was employed to construct a lightweight iDEMA module, which replaced the original C2f feature extraction module in the neck to improve model performance and accuracy. Finally, Dualconv was employed instead of conventional convolution for downsampling, further diminishing the network load. The new model was fully verified on the established field weed dataset. The test results show that the modified model exhibited a notable improvement in detection performance compared with YOLOv8s: accuracy improved from 91.2% to 95.8%, recall from 87.9% to 93.2%, and mAP@0.5 from 90.8% to 94.5%. Furthermore, GFLOPs and model size were reduced to 12.7 G and 9.1 MB, decreases of 57.4% and 59.2% relative to the original model. Compared with prevalent target detection models such as Faster R-CNN, YOLOv5s, and YOLOv8l, the new model showed superior performance in both accuracy and lightweight design. The proposed model effectively reduces the hardware cost required to achieve accurate weed identification in maize fields with limited resources.
(This article belongs to the Collection AI, Sensors and Robotics for Smart Agriculture)
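
Dualconv-style layers are commonly implemented as a 3×3 group convolution and a 1×1 pointwise convolution applied to the same input in parallel and summed, cutting computation relative to a full 3×3 convolution. A sketch of that common form (the group count is illustrative, and channel counts must be divisible by it):

```python
import torch.nn as nn

class DualConv(nn.Module):
    def __init__(self, c_in, c_out, stride=1, groups=4):
        super().__init__()
        self.gc = nn.Conv2d(c_in, c_out, 3, stride, 1,
                            groups=groups, bias=False)              # cheap 3x3 group conv
        self.pw = nn.Conv2d(c_in, c_out, 1, stride, 0, bias=False)  # 1x1 pointwise path

    def forward(self, x):
        return self.gc(x) + self.pw(x)   # fuse spatial and channel information
```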

19 pages, 4992 KiB  
Article
BHI-YOLO: A Lightweight Instance Segmentation Model for Strawberry Diseases
by Haipeng Hu, Mingxia Chen, Luobin Huang and Chi Guo
Appl. Sci. 2024, 14(21), 9819; https://doi.org/10.3390/app14219819 - 27 Oct 2024
Cited by 3 | Viewed by 2025
Abstract
In complex environments, strawberry disease segmentation models face challenges such as segmentation difficulty, excessive parameters, and high computational loads, making it hard for these models to run effectively on devices with limited computational resources. To meet the need for efficient operation on low-power devices while ensuring effective disease segmentation in complex scenarios, this paper proposes BHI-YOLO, a lightweight instance segmentation model based on YOLOv8n-seg. First, the Universal Inverted Bottleneck (UIB) module is integrated into the backbone network and merged with the C2f module to create the C2f_UIB module, reducing the parameter count while expanding the receptive field. Second, the HS-FPN is introduced to further reduce the parameter count and enhance the model's ability to fuse features across different levels. Finally, the Inverted Residual Mobile Block (iRMB) is integrated with EMA to design the iRMA module, enabling the model to efficiently combine global information to enhance local information. The experimental results demonstrate that the enhanced model achieved a mask mAP@50 of 93% for strawberry disease instance segmentation, a 2.3% increase over YOLOv8, while reducing parameters by 47%, GFLOPs by 20%, and model size by 44.1%, a markedly effective lightweight result. This study combines a lightweight architecture with enhanced feature fusion, making the model more suitable for deployment on mobile devices, and provides a reference for strawberry disease segmentation applications in agricultural environments.
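
The Universal Inverted Bottleneck from MobileNetV4, referenced above, generalizes the inverted residual by making two depthwise convolutions optional: one before the 1×1 expansion and one between expansion and projection. A simplified sketch (kernel sizes and activations are illustrative):

```python
import torch.nn as nn

def dw(c, k):  # depthwise convolution helper
    return nn.Sequential(nn.Conv2d(c, c, k, 1, k // 2, groups=c, bias=False),
                         nn.BatchNorm2d(c))

class UIB(nn.Module):
    def __init__(self, c, expand=4, start_dw=True, mid_dw=True):
        super().__init__()
        h = c * expand
        layers = []
        if start_dw:
            layers += [dw(c, 3)]                         # optional leading depthwise
        layers += [nn.Conv2d(c, h, 1, bias=False), nn.BatchNorm2d(h), nn.ReLU()]
        if mid_dw:
            layers += [dw(h, 3), nn.ReLU()]              # optional middle depthwise
        layers += [nn.Conv2d(h, c, 1, bias=False), nn.BatchNorm2d(c)]
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return x + self.body(x)                          # residual connection
```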

23 pages, 5508 KiB  
Article
YOLO-DroneMS: Multi-Scale Object Detection Network for Unmanned Aerial Vehicle (UAV) Images
by Xueqiang Zhao and Yangbo Chen
Drones 2024, 8(11), 609; https://doi.org/10.3390/drones8110609 - 24 Oct 2024
Cited by 7 | Viewed by 3507
Abstract
In recent years, research on Unmanned Aerial Vehicles (UAVs) has developed rapidly. Compared to traditional remote-sensing images, UAV images exhibit complex backgrounds, high resolution, and large differences in object scale, making UAV object detection an essential yet challenging task. This paper proposes a multi-scale object detection network for UAV images, YOLO-DroneMS (You Only Look Once for Drone Multi-Scale Object). Targeting the pivotal connection between the backbone and neck, the Large Separable Kernel Attention (LSKA) mechanism is adopted with the SPPF (Spatial Pyramid Pooling Fast) module, applying weighted processing to multi-scale feature maps so the network focuses on informative features. Attentional Scale Sequence Fusion DySample (ASF-DySample) is introduced to perform attentional scale sequence fusion and dynamic upsampling while conserving resources. The faster cross-stage partial network bottleneck with two convolutions (C2f) in the backbone is then optimized using the Inverted Residual Mobile Block and Dilated Reparam Block (iRMB-DRB), balancing dynamic global modeling with static local information fusion; this optimization effectively enlarges the model's receptive field and enhances its capability for downstream tasks. Finally, replacing the original CIoU with WIoUv3 lets the model prioritize anchor boxes of superior quality, dynamically adjusting weights to improve detection of small objects. Experiments on the VisDrone2019 dataset demonstrate that, at an Intersection over Union (IoU) threshold of 0.5, YOLO-DroneMS achieves a 3.6% increase in mAP@50 over the YOLOv8n model, while detection speed improves from 78.7 to 83.3 frames per second (FPS). The enhanced model supports diverse target scales and achieves high recognition rates, making it well suited to drone-based object detection, particularly in scenarios involving multiple object clusters.
(This article belongs to the Special Issue Intelligent Image Processing and Sensing for Drones, 2nd Edition)

25 pages, 38912 KiB  
Article
Thin Cloud Removal Generative Adversarial Network Based on Sparse Transformer in Remote Sensing Images
by Jinqi Han, Ying Zhou, Xindan Gao and Yinghui Zhao
Remote Sens. 2024, 16(19), 3658; https://doi.org/10.3390/rs16193658 - 30 Sep 2024
Cited by 3 | Viewed by 2335
Abstract
Thin clouds in Remote Sensing (RS) imagery can negatively impact subsequent applications. Current Deep Learning (DL) approaches often prioritize information recovery in cloud-covered areas but may not adequately preserve information in cloud-free regions, leading to color distortion, detail loss, and visual artifacts. This study proposes a Sparse Transformer-based Generative Adversarial Network (SpT-GAN) to solve these problems. First, a global enhancement feature extraction module is added to the generator's top layer to enhance the model's ability to preserve ground feature information in cloud-free areas. Then, the processed feature map is reconstructed using a sparse transformer-based encoder and decoder with an adaptive threshold filtering mechanism to ensure sparsity, which enables the model to preserve robust long-range modeling capabilities while disregarding irrelevant details. In addition, inverted residual Fourier transformation blocks are added at each level of the structure to filter redundant information and enhance the quality of the generated cloud-free images. Finally, a composite loss function is created to minimize error in the generated images, resulting in improved resolution and color fidelity. SpT-GAN achieves outstanding results in removing clouds both quantitatively and visually, with Structural Similarity Index (SSIM) values of 98.06% and 92.19% and Peak Signal-to-Noise Ratio (PSNR) values of 36.19 dB and 30.53 dB on the RICE1 and T-Cloud datasets, respectively. On the T-Cloud dataset in particular, with its more complex cloud components, the superior ability of SpT-GAN to restore ground details is evident.
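
The PSNR and SSIM values reported above are standard full-reference image quality metrics. A sketch of computing both for a predicted cloud-free image against its ground truth with scikit-image; the file names are placeholders:

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

pred = np.load("pred_cloudfree.npy")    # (H, W, 3) uint8 prediction
truth = np.load("ground_truth.npy")     # (H, W, 3) uint8 reference

psnr = peak_signal_noise_ratio(truth, pred, data_range=255)
ssim = structural_similarity(truth, pred, channel_axis=-1, data_range=255)
print(f"PSNR: {psnr:.2f} dB, SSIM: {ssim:.4f}")
```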
