An Instance Segmentation Method for Agricultural Plastic Residual Film on Cotton Fields Based on RSE-YOLO-Seg
Abstract
1. Introduction
2. Materials and Methods
2.1. Establishment of Agricultural Plastic Residual Film Dataset
2.2. Test Environment Configuration and Network Parameter Settings
2.3. Model Evaluation Metrics
2.4. Residual Film Segmentation Method
2.4.1. RFCAConv
2.4.2. C3K2_PKI Module
2.4.3. SegNext Attention
2.4.4. Segment_Efficient Head and NWD-MPD Loss Function
- (1) NWD-IoU
- (2) MPD-IoU
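These two terms are the components fused in the NM loss of Section 2.4.4. For reference, a reconstruction of the commonly published definitions is given below; this follows the literature rather than the paper's exact formulation, with C a dataset-dependent normalizing constant and w, h the input image width and height in MPD-IoU.

```latex
% NWD: boxes are modelled as 2-D Gaussians N(mu, Sigma) with
% mu = (cx, cy) and Sigma = diag(w^2/4, h^2/4).
\[
W_2^2(\mathcal{N}_a,\mathcal{N}_b)=
\left\|\left[cx_a,\,cy_a,\,\tfrac{w_a}{2},\,\tfrac{h_a}{2}\right]^{\mathrm{T}}
-\left[cx_b,\,cy_b,\,\tfrac{w_b}{2},\,\tfrac{h_b}{2}\right]^{\mathrm{T}}\right\|_2^2,
\qquad
\mathrm{NWD}=\exp\!\left(-\frac{\sqrt{W_2^2(\mathcal{N}_a,\mathcal{N}_b)}}{C}\right)
\]
% MPD-IoU: IoU penalised by the normalised squared distances d_1^2, d_2^2
% between the matching top-left and bottom-right corners of the predicted
% and ground-truth boxes (w, h are the input image dimensions).
\[
\mathrm{MPDIoU}=\mathrm{IoU}-\frac{d_1^2}{w^2+h^2}-\frac{d_2^2}{w^2+h^2},
\qquad
\mathcal{L}_{\mathrm{MPDIoU}}=1-\mathrm{MPDIoU}
\]
```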
2.5. Theoretical Area of Residual Film Obtained via the Tracking Network
- (1) Residual film detection: The improved YOLO11-seg network detects residual films in the video, producing detection boxes and mask information for each film.
- (2) Target prediction: A Kalman filter predicts the position and state of each residual film in the next video frame.
- (3) Target matching: An improved Hungarian algorithm performs optimal matching between residual films in consecutive video frames, yielding each film's trajectory. Trajectories of unmatched residual films are temporarily retained and continue to participate in prediction and matching in subsequent frames. If a target remains unmatched for 30 consecutive frames, it is deemed to have disappeared and its trajectory is deleted. If matching succeeds, the results are output, parameters are updated, and detection restarts on the next frame. In dynamic residual film detection, complex conditions such as diverse film morphologies, occlusions, and multiple moving targets can cause unpredictable jumps in a detected object's ID, degrading the results. To address this, a historical-frame data recording module is added to the original tracker to store the historical trajectory information of residual film targets. The module indexes targets by ID, storing historical position information and frame indices, and shares and updates this information across frames through global variables. As described in the cascade matching and IoU matching steps in Figure 10, during detection and tracking in each frame the system checks whether the target's ID exists in the historical-frame module; if it does, the system verifies the ID's validity and updates its history, otherwise a new ID is assigned [32]. A minimal sketch of this tracking loop is given below.
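The sketch below illustrates the predict-match-update loop with the 30-frame deletion rule and the ID-indexed historical-frame module. It is a minimal illustration, not the paper's implementation: a constant-velocity predictor stands in for the full Kalman filter, appearance features and cascade matching are omitted, and all names (Track, HISTORY, step) are ours.

```python
# Minimal DeepSORT-style tracking loop sketch (illustrative, not the paper's code).
import numpy as np
from scipy.optimize import linear_sum_assignment

MAX_AGE = 30        # frames a track may stay unmatched before its trajectory is deleted
HISTORY = {}        # historical-frame module: track ID -> list of (frame_idx, box)

def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) form."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

class Track:
    _next_id = 0
    def __init__(self, box):
        self.id = Track._next_id; Track._next_id += 1
        self.box = np.asarray(box, dtype=float)
        self.velocity = np.zeros(4)      # constant-velocity state (Kalman stand-in)
        self.misses = 0

    def predict(self):
        """Predict the box in the next frame (simplified Kalman prediction)."""
        self.box = self.box + self.velocity
        return self.box

    def update(self, box, frame_idx):
        box = np.asarray(box, dtype=float)
        self.velocity = box - self.box   # crude velocity estimate
        self.box, self.misses = box, 0
        # historical-frame module: store position and frame index under the track ID
        HISTORY.setdefault(self.id, []).append((frame_idx, box.copy()))

def step(tracks, detections, frame_idx, iou_gate=0.3):
    """One frame: predict, Hungarian matching, then update / age / spawn tracks."""
    preds = [t.predict() for t in tracks]
    if tracks and detections:
        cost = np.array([[1.0 - iou(p, d) for d in detections] for p in preds])
        rows, cols = linear_sum_assignment(cost)     # Hungarian algorithm
    else:
        rows, cols = np.array([], int), np.array([], int)
    matched_t, matched_d = set(), set()
    for r, c in zip(rows, cols):
        if 1.0 - cost[r, c] >= iou_gate:             # reject weak matches
            tracks[r].update(detections[c], frame_idx)
            matched_t.add(r); matched_d.add(c)
    for i, t in enumerate(tracks):                   # age unmatched tracks
        if i not in matched_t:
            t.misses += 1
    tracks = [t for t in tracks if t.misses <= MAX_AGE]
    for j, d in enumerate(detections):               # spawn tracks for new detections
        if j not in matched_d:
            tracks.append(Track(d))
    return tracks
```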
2.6. RSE-YOLO-Seg Model Accelerated Deployment
3. Experimental Results and Analysis
3.1. Comparison of the Effects of Adding Attention Mechanisms to the Backbone
3.2. Effectiveness of Improved Receptive Field Modules in the Network
3.3. Comparison Test of Loss Functions
3.4. Ablation Experiments
- (1) Firstly, Module A is a multi-scale convolutional module integrated with an attention mechanism and designed for residual films; used alone, it markedly improved residual film detection performance: bounding box and mask precision (P(b) and P(m)) increased by 3.0 and 3.2 percentage points, respectively. Module B is a convolutional module incorporating receptive field attention; introduced alone, it improved the precision and recall of residual film detection fairly evenly, with the mean average precision (mAP) of bounding boxes and masks each increasing by 0.5 percentage points. Module C is an improved efficient detection head that reduces the model's computational load while preserving its ability to extract deeper, richer image features. Module D is a loss function designed to address missed detections of fragmented residual films, detection box distortion, and misses caused by overlapping films; used alone, it improved mask precision (P(m)). Introducing Modules A and B slightly increases model parameters and inference time, whereas Module C employs parallel-branch feature processing, which effectively reduces both the parameter count and computational latency.
- (2) Secondly, we analyze the effects of combining modules. Modules A and B used together enhanced performance significantly compared with either alone: relative to the original model, the mean average precision of bounding boxes (mAP(B)) and masks (mAP(M)) improved by 1.9 and 1.6 percentage points, respectively, leveraging the strengths of both modules. The combination of Modules A and D outperformed the baseline but showed decreases in P(b), P(m), R(m), and mAP compared with using A or D alone, indicating mutual suppression between the two modules. We hypothesize that this suppression stems from subtle misalignments in their optimization objectives: Module A enhances multi-scale feature representation, particularly amplifying responses of strip-shaped structures, whereas Module D optimizes geometric consistency (via MPD-IoU) and small-object similarity (via NWD-IoU). The feature priorities emphasized by the attention mechanism may not fully align with the geometric constraints enforced by the loss function, potentially producing conflicting gradient directions during training and thus suboptimal convergence. Although combining Modules B and D improved precision (P(b), P(m)) relative to their individual use, this came at the cost of lower R(b) and R(m). Neither Module A nor Module B realized its full potential when combined with D. This finding reveals the complexity of inter-module interactions and underscores the need to co-design attention mechanisms and loss functions carefully in future research. Additionally, combining any two of Modules A, B, and D increased both parameters and inference time.
- (3) Therefore, we incorporated the lightweight, efficient detection head (Module C) into each combination. Adding C to Modules A and B reduced the parameter count by 10% but lowered average mask precision. In contrast, when C was added to the A + D and B + D combinations, the parameter count decreased while detection performance (P(b), R(b), P(m), R(m), and mAP) improved significantly, indicating that Module C enhances the performance of Module D. Consequently, we introduced Module D into the A + B + C combination, yielding a marked improvement in detection performance over the previous configurations and effectively offsetting the degradation caused by the reduced parameter count. Modules A, B, C, and D complement one another, achieving an overall performance gain. Furthermore, all improved modules yielded statistically significant gains in mAP (p < 0.05), confirming the effectiveness and robustness of the proposed modifications in enhancing detection performance.
3.5. Comparative Experiments on Different Residual Film Detection Models
3.6. Comparison of Residual Film Detection Visualization Results Across Different Models
3.7. Edge Device Deployment Experiment
3.8. Experimental Determination of Residual Film Theoretical Area Based on Tracking Network
4. Discussion
5. Conclusions
- (1) Firstly, to handle the multi-scale characteristics of residual films, we introduced the PKI module (with a variable receptive field) into the C3K2 module of the backbone network to capture multi-scale texture features of residual films. Combined with SegNext_Attention, which extracts multi-scale features in parallel using convolutional kernels of different sizes, the model attends more strongly to strip-shaped residual films of varying scales. Experimental results showed that the C3K2-CB module (designed specifically for residual films and integrated with SegNext attention) outperformed common efficient attention mechanisms such as SE, CA, CBAM, ECA, MPCA, and AFGC, improving residual film recognition precision from 83% to 86%, strengthening the model's ability to identify residual films, and reducing false detections.
- (2) Secondly, to address missed detections of fragmented residual films, we replaced standard convolutions in the model with receptive field attention convolutions (RFCAConv) to emphasize the spatial features of the receptive field. This approach processes receptive fields of varying regions and sizes differentially while effectively sharing the parameters of large-scale convolutional kernels, enhancing the model's ability to capture and exploit image information. The effectiveness of the receptive field convolutions was validated via receptive field heatmaps: after replacement, the model focused effectively on residual film features, even attending to small films that were previously overlooked. Experiments also determined the optimal replacement positions for the receptive field convolutions. Additionally, we designed a lightweight Efficient-Head and a new NM (NWD-MPD) loss function. The efficient detection head adopts a decoupled structure with parallel branches for feature processing; each branch stacks two efficient modules to strengthen the model's capacity to represent complex functions, while efficient convolution reduces model parameters by 10% and improves inference speed. Combined with the NM loss function, residual film mask recall and average segmentation mask precision improved by 1.3 and 1.5 percentage points, respectively, enhancing the model's ability to identify fragmented films and accurately segment residual films in complex scenarios. Furthermore, all improved modules showed statistically significant mAP gains (p < 0.05), confirming the effectiveness and robustness of the proposed modifications in enhancing detection performance.
- (3) Thirdly, we compared RSE-YOLO-seg with widely used detection algorithms, including real-time instance segmentation models (YOLOv5-seg, YOLOv8-seg, YOLOv10-seg, YOLO11-seg, YOLOv12-seg) and three existing residual film detection models. RSE-YOLO-seg outperformed these models in bounding box average precision by 5.1, 4.5, 4.2, 3.0, 7.3, 5.5, 4.2, and 2.0 percentage points, respectively, and in mask average precision by 5.1, 3.7, 3.1, 2.7, 7.3, 5.4, 3.5, and 2.9 percentage points, respectively. Its parameter count was also 4–18% lower than that of lightweight models in the same series. When deployed on edge devices (Jetson Nano B01 and Jetson Orin Nano), the model achieved inference speeds of 17 FPS and 38 FPS, respectively, meeting real-time detection requirements.
- (4) Finally, in field residual film detection experiments, we tested 20 groups of plots across different regions. The proposed RSE-YOLO-seg, combined with DeepSORT, identifies and segments residual films in videos, tracks each film individually, and takes the area converted from its maximum pixel count as the film's theoretical contour area (a conversion sketch is given below). Compared with the traditional method of randomly capturing static residual film images, the mean error between predicted and actual areas decreased from 232.30 cm² to 142.00 cm², and the RMSE decreased from 251.53 cm² to 130.25 cm². This effectively mitigates the random error introduced by imaging films in different orientations, thereby improving the accuracy of residual film area estimation.
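As a companion to the description above, the following is an illustrative sketch of the pixel-to-area conversion, assuming a fixed ground sampling distance; cm_per_pixel is a hypothetical calibration value and the function names are ours, not the paper's code.

```python
# Convert a tracked film's maximum mask pixel count to a theoretical contour area.
import numpy as np

def mask_area_cm2(mask: np.ndarray, cm_per_pixel: float) -> float:
    """Area of a binary segmentation mask, converted from pixels to cm^2."""
    return float(mask.sum()) * cm_per_pixel ** 2

def theoretical_area(track_masks: list[np.ndarray], cm_per_pixel: float) -> float:
    """Use the frame with the largest mask pixel count over a track's lifetime."""
    return max(mask_area_cm2(m, cm_per_pixel) for m in track_masks)
```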
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
References
1. Shao, L.; Gong, J.; Fan, W.; Zhang, Z.; Zhang, M. Cost Comparison between Digital Management and Traditional Management of Cotton Fields—Evidence from Cotton Fields in Xinjiang, China. Agriculture 2022, 12, 1105.
2. Zhang, X.; Shi, Y.; Yan, J.; Yang, S.; Hou, Z.; Li, H. Residual Film–Cotton Stubble–Nail Tooth Interaction Study Based on SPH-FEM Coupling in Residual Film Recycling. Agriculture 2025, 15, 1198.
3. Lakhiar, I.A.; Yan, H.; Zhang, J.; Wang, G.; Deng, S.; Bao, R.; Zhang, C.; Syed, T.; Wang, B.; Zhou, R.; et al. Plastic Pollution in Agriculture as a Threat to Food Security, the Ecosystem, and the Environment: An Overview. Agronomy 2024, 14, 548.
4. Hu, C.; Wang, X.; Chen, X.; Tang, X.; Zhao, Y.; Yan, C. Current situation and control strategies of residual film pollution in Xinjiang. Trans. Chin. Soc. Agric. Eng. 2019, 35, 223–234.
5. Wang, G.; Sun, Q.; Wei, M.; Xie, M.; Shen, T.; Liu, D. Plastic Film Residue Reshaped Protist Communities and Induced Soil Nutrient Deficiency Under Field Conditions. Agronomy 2025, 15, 419.
6. Zheng, W.; Wang, R.; Cao, Y.; Jin, N.; Feng, H.; He, J. Remote Sensing Recognition of Plastic-film-mulched Farmlands on Loess Plateau Based on Google Earth Engine. Trans. Chin. Soc. Agric. Mach. 2022, 53, 224–234.
7. Wu, X.; Liang, C.; Zhang, D.; Yu, L.; Zhang, F. Identification Method of Plastic Film Residue Based on UAV Remote Sensing Images. Trans. Chin. Soc. Agric. Mach. 2020, 51, 189–195.
8. Zhai, Z.; Chen, X.; Qiu, F.; Meng, Q.; Wang, H.; Zhang, R. Detecting surface residual film coverage rate in pre-sowing cotton fields using pixel block and machine learning. Trans. Chin. Soc. Agric. Eng. 2022, 38, 140–147.
9. Zhang, X.; Huang, S.; Jin, W.; Yan, J.; Shi, Z.; Zhou, X.; Zhang, C. Identification Method of Agricultural Film Residue Based on Improved Faster R-CNN. J. Hunan Univ. Nat. Sci. 2021, 48, 161–168.
10. Huang, D.; Zhang, Y. Combining YOLOv7-SPD and DeeplabV3+ for Detection of Residual Film Remaining on Farmland. IEEE Access 2024, 12, 1051–1063.
11. Niu, Y.; Li, Y.; Chen, Y.; Jiang, P. Image Segmentation Method of Residual Film on Cotton Field Surface based on Improved SegFormer Model. Jisuanji Yu Xiandaihua (Computer and Modernization) 2023, 7, 93–98.
12. Ma, J.; Zhao, Y.; Fan, W.; Liu, J. An Improved YOLOv8 Model for Lotus Seedpod Instance Segmentation in the Lotus Pond Environment. Agronomy 2024, 14, 1325.
13. Lin, Z.; Xie, L.; Bian, Y.; Jian, Z.; Zhou, L.; Shi, M. YOLO-SDI-based detection of residual film in agricultural fields. Comput. Eng. 2025, 56, 1–12.
14. Qiu, Z.; Huang, X.; Deng, Z.; Xu, X.; Qiu, Z. PS-YOLO-seg: A Lightweight Instance Segmentation Method for Lithium Mineral Microscopic Images Based on Improved YOLOv12-seg. J. Imaging 2025, 11, 230.
15. Ji, W.; Pan, Y.; Xu, B.; Wang, J. A Real-Time Apple Targets Detection Method for Picking Robot Based on ShufflenetV2-YOLOX. Agriculture 2022, 12, 856.
16. Wu, W.; He, Z.; Li, J.; Chen, T.; Luo, Q.; Luo, Y.; Wu, W.; Zhang, Z. Instance Segmentation of Tea Garden Roads Based on an Improved YOLOv8n-seg Model. Agriculture 2024, 14, 1163.
17. Shi, H.; Liu, C.; Wu, M.; Zhang, H.; Song, H.; Sun, H.; Li, Y.; Hu, J. Real-time detection of Chinese cabbage seedlings in the field based on YOLO11-CGB. Front. Plant Sci. 2025, 16, 1558378.
18. Wu, Z.; Zhen, H.; Zhang, X.; Bai, X.; Li, X. SEMA-YOLO: Lightweight Small Object Detection in Remote Sensing Image via Shallow-Layer Enhancement and Multi-Scale Adaptation. Remote Sens. 2025, 17, 1917.
19. Wei, H.; Zhao, L.; Li, R.; Zhang, M. RFAConv-CBM-ViT: Enhanced vision transformer for metal surface defect detection. J. Supercomput. 2025, 81, 155.
20. Liang, M.; Zhang, Y.; Zhou, J.; Shi, F.; Wang, Z.; Lin, Y.; Zhang, L.; Liu, Y. Research on detection of wheat tillers in natural environment based on YOLOv8-MRF. Smart Agric. Technol. 2025, 10, 100720.
21. Zhang, T.; Zhou, J.; Liu, W.; Yue, R.; Yao, M.; Shi, J.; Hu, J. Seedling-YOLO: High-Efficiency Target Detection Algorithm for Field Broccoli Seedling Transplanting Quality Based on YOLOv7-Tiny. Agronomy 2024, 14, 931.
22. Zou, J.; Song, T.; Cao, S.; Zhou, B.; Jiang, Q. Dress Code Monitoring Method in Industrial Scene Based on Improved YOLOv8n and DeepSORT. Sensors 2024, 24, 6063.
23. Qi, Z.; Wang, J. PMDNet: An Improved Object Detection Model for Wheat Field Weed. Agronomy 2025, 15, 55.
24. Yang, Z.; Xu, K.; Zhao, L.; Hu, N.; Wu, J. PWDE-YOLOv8n: An Enhanced Approach for Surface Corrosion Detection in Aircraft Cabin Sections. IEEE Trans. Instrum. Meas. 2025, 74, 2504722.
25. Song, J.; Ma, B.; Xu, Y.; Yu, G.; Xiong, Y. Organ segmentation and phenotypic information extraction of cotton point clouds based on the CotSegNet network and machine learning. Comput. Electron. Agric. 2025, 236, 110466.
26. Wang, Z.; Qin, J.; Huang, C.; Zhang, Y. CGMISeg: Context-Guided Multi-Scale Interactive for Efficient Semantic Segmentation. Comput. Mater. Contin. 2025, 9, 5811–5829.
27. Yi, X.; Chen, H.; Wu, P.; Wang, G.; Mo, L.; Wu, B.; Yi, Y.; Fu, X.; Qian, P. Light-FC-YOLO: A Lightweight Method for Flower Counting Based on Enhanced Feature Fusion with a New Efficient Detection Head. Agronomy 2024, 14, 1285.
28. He, Y.; Wan, L. YOLOv7-PD: Incorporating DE-ELAN and NWD-CIoU for Advanced Pedestrian Detection Method. Inf. Technol. Control 2024, 53, 390–407.
29. Xiong, C.; Zayed, T.; Jiang, X.; Alfalah, G.; Abelkader, E. A Novel Model for Instance Segmentation and Quantification of Bridge Surface Cracks—The YOLOv8-AFPN-MPD-IoU. Sensors 2024, 24, 4288.
30. Liu, Y.; Han, X.; Zhang, H.; Liu, S.; Ma, W.; Yan, Y.; Sun, L.; Jing, L.; Wang, Y.; Wang, J. YOLOv8-MSP-PD: A Lightweight YOLOv8-Based Detection Method for Jinxiu Malus Fruit in Field Conditions. Agronomy 2025, 15, 1581.
31. Chen, S.; Liu, J.; Xu, X.; Guo, J.; Hu, S.; Zhou, Z.; Lan, Y. Detection and tracking of agricultural spray droplets using GSConv-enhanced YOLOv5s and DeepSORT. Comput. Electron. Agric. 2025, 235, 110353.
32. Zhou, L.; Yang, Z.; Fu, L.; Duan, J. Yield Estimation in Banana Orchards Based on DeepSORT and RGB-Depth Images. Agronomy 2025, 15, 1119.
33. Zhang, X.; Li, B. Tennis ball detection based on YOLOv5 with TensorRT. Sci. Rep. 2025, 15, 21011.
34. Liao, J.; He, X.; Liang, Y.; Wang, H.; Zeng, H.; Luo, X.; Li, X.; Zhang, L.; Xing, H.; Zang, Y. A Lightweight Cotton Verticillium Wilt Hazard Level Real-Time Assessment System Based on an Improved YOLOv10n Model. Agriculture 2024, 14, 1617.
35. Zhou, X.; Chen, W.; Wei, X. Improved Field Obstacle Detection Algorithm Based on YOLOv8. Agriculture 2024, 14, 2263.
36. Zhu, C.; Hao, S.; Liu, C.; Wang, Y.; Jia, X.; Xu, J.; Guo, S.; Huo, J.; Wang, W. An Efficient Computer Vision-Based Dual-Face Target Precision Variable Spraying Robotic System for Foliar Fertilisers. Agronomy 2024, 14, 2770.
37. Duan, Y.; Han, W.; Guo, P.; Wei, X. YOLOv8-GDCI: Research on the Phytophthora Blight Detection Method of Different Parts of Chili Based on Improved YOLOv8 Model. Agronomy 2024, 14, 2734.
38. Meng, Q.; Zhai, Z.; Zhang, L.; Lu, J.; Wang, H.; Zhang, R. Recognition Method of Cotton Field Surface Residual Film Based on Improved YOLO 11. Trans. Chin. Soc. Agric. Mach. 2025, 56, 17–25+48.
39. Lou, L.; Lu, H.; Song, R. Segmentation of Plant Leaves and Features Extraction Based on Multi-view and Time-series Image. Trans. Chin. Soc. Agric. Mach. 2022, 53, 253–260.
40. Zhang, M.; Zhang, J.; Peng, Y.; Wang, Y. FreqDyn-YOLO: A High-Performance Multi-Scale Feature Fusion Algorithm for Detecting Plastic Film Residues in Farmland. Sensors 2025, 25, 4888.
Different Morphologies of Residual Film | Original Images | Enhanced Images | Training Set | Validation Set | Test Set
---|---|---|---|---|---
Exposed on the surface | 503 | 1509 | 1051 | 304 | 154 |
Suspended on Cotton Stalks | 498 | 1494 | 1048 | 299 | 147 |
In complex inter-row areas | 511 | 1533 | 1078 | 305 | 150 |
Multi-row mixed scenes captured by UAV | 1023 | 1023 | 715 | 204 | 104 |
Total | 2535 | 5559 | 3892 | 1112 | 555 |
Hyperparameter | Value |
---|---|
Image size | 640 × 640 |
Epoch | 250 |
Batch size | 32 |
Learning rate | 0.01 |
Momentum | 0.937 |
Weight decay | 0.0005 |
Optimizer | SGD |
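For reproducibility, the following is a minimal sketch of a training invocation matching the hyperparameter table above, assuming the Ultralytics training API; the model and dataset YAML file names are illustrative placeholders, not the paper's files.

```python
# Training sketch with the hyperparameters listed above (Ultralytics API assumed).
from ultralytics import YOLO

model = YOLO("rse-yolo-seg.yaml")        # hypothetical custom model config
model.train(
    data="residual_film.yaml",           # hypothetical dataset config
    imgsz=640,                           # image size 640 x 640
    epochs=250,
    batch=32,
    lr0=0.01,                            # initial learning rate
    momentum=0.937,
    weight_decay=0.0005,
    optimizer="SGD",
)
```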
Index | Parameters (Deploy Phase) |
---|---|
Operating system | Ubuntu 22.04 LTS |
Accelerated environment | CUDA 12.6+ cuDNN 9.3.0 |
Library | Pytorch 1.12 |
SDK | JetPack 6.2 |
TensorRT version | TensorRT 10.3.0 |
Model | P(b)/% | P(m)/% | R(b)/% | R(m)/% | mAP@50 (B)/% | mAP@50 (M)/% | Parameters | FLOPs/G | Latency (CPU)/ms
---|---|---|---|---|---|---|---|---|---
YOLO11n | 83.0 | 82.9 | 77.2 | 76.3 | 85.7 | 84.5 | 2,834,763 | 10.2 | 51.0
-SE | 82.3 | 81.9 | 76.9 | 75.6 | 85.4 | 83.8 | 2,875,723 | 10.3 | 55.4 |
-CA | 82.6 | 83.2 | 77.4 | 76.4 | 86.0 | 84.5 | 2,841,443 | 10.2 | 53.7 |
-CBAM | 83.7 | 83.4 | 76.7 | 75.6 | 86.0 | 84.4 | 2,933,421 | 10.3 | 56.2 |
-ECA | 83.2 | 83.7 | 77.4 | 75.7 | 86.4 | 84.9 | 2,867,541 | 10.3 | 53.6 |
-MPCA | 82.2 | 83.0 | 76.9 | 75.1 | 85.8 | 84.3 | 3,195,979 | 10.3 | 52.5 |
-AFGC | 83.3 | 82.9 | 76.6 | 75.7 | 85.7 | 83.9 | 2,900,561 | 10.2 | 56.3 |
-SegNext | 86.0 | 86.1 | 77.4 | 76.4 | 86.8 | 85.4 | 2,915,107 | 10.3 | 52.7 |
Layers | P(b)/% | P(m)/% | R(b)/% | R(m)/% | mAP@50 (B)/% | mAP@50 (M)/% | Parameters | FLOPs/G
---|---|---|---|---|---|---|---|---
0 | 85.3 | 85.0 | 77.9 | 77.2 | 87.4 | 86.2 | 2,601,434 | 8.9 |
1 | 83.7 | 84.2 | 78.6 | 75.8 | 86.6 | 84.6 | 2,603,059 | 9.0 |
3 | 82.8 | 82.9 | 77.9 | 76.7 | 86.0 | 84.5 | 2,609,059 | 9.0 |
5 | 85.2 | 86.2 | 79.1 | 77.5 | 87.7 | 86.3 | 2,617,059 | 8.9 |
7 | 86.0 | 86.1 | 78.6 | 77.7 | 87.8 | 86.3 | 2,617,059 | 8.9 |
18 | 85.4 | 85.9 | 79.1 | 77.2 | 87.6 | 85.8 | 2,667,578 | 9.3 |
21 | 86.3 | 85.9 | 76.9 | 75.9 | 86.8 | 85.1 | 2,617,059 | 8.9 |
5, 7 | 86.5 | 86.8 | 78.3 | 77.2 | 87.3 | 86.2 | 2,633,083 | 8.9 |
5, 18 | 85.7 | 85.7 | 77.8 | 75.8 | 86.9 | 84.8 | 2,625,083 | 8.9 |
5, 7, 18 | 86.8 | 87.0 | 78.1 | 77.2 | 87.7 | 86.5 | 2,641,107 | 9.0 |
Ours | 86.2 | 86.3 | 80.1 | 79.0 | 88.7 | 87.2 | 2,662,650 | 9.4
Experiment | IoU | P(b)/% | P(m)/% | R(b)/% | R(m)/% | mAP@50 (B)/% | mAP@50 (M)/%
---|---|---|---|---|---|---|---
Model performance under different NM fusion weight ratios | iou_ratio = 0.0 | 85.0 | 85.2 | 78.0 | 76.2 | 87.2 | 85.3
 | iou_ratio = 0.2 | 85.7 | 85.5 | 78.4 | 76.9 | 88.1 | 86.0
 | iou_ratio = 0.5 | 86.2 | 86.3 | 80.1 | 79.0 | 88.7 | 87.2
 | iou_ratio = 0.8 | 86.1 | 85.8 | 78.7 | 77.5 | 88.3 | 86.2
 | iou_ratio = 1.0 | 85.8 | 86.0 | 79.7 | 78.2 | 88.4 | 86.4
Comparison of different loss functions | CIoU | 84.7 | 84.9 | 78.8 | 75.9 | 87.3 | 85.7
 | SIoU | 84.4 | 85.7 | 79.3 | 76.6 | 87.4 | 85.5
 | EIoU | 86.3 | 85.5 | 77.8 | 76.6 | 87.8 | 85.8
 | PIoU | 84.4 | 85.4 | 78.2 | 76.0 | 86.7 | 85.0
 | Shape | 83.5 | 83.4 | 77.8 | 76.3 | 86.0 | 84.3
 | MPD | 85.0 | 85.2 | 78.0 | 76.2 | 87.2 | 85.3
 | NWD | 85.8 | 86.0 | 79.7 | 78.2 | 88.4 | 86.4
 | NM | 86.2 | 86.3 | 80.1 | 79.0 | 88.7 | 87.2
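The iou_ratio sweep above implies a simple convex fusion of the two terms: iou_ratio = 0.0 reproduces the MPD row, iou_ratio = 1.0 the NWD row, and 0.5 performs best. The sketch below shows one such fusion; this is our reading of the table rather than the paper's code, and nwd_term/mpd_term are illustrative stand-ins for the per-box similarity terms defined in Section 2.4.4.

```python
# Sketch of the NM (NWD-MPD) loss fusion implied by the iou_ratio sweep.
import torch

def nm_loss(nwd_term: torch.Tensor, mpd_term: torch.Tensor,
            iou_ratio: float = 0.5) -> torch.Tensor:
    """Weighted fusion of NWD and MPD-IoU similarities (both assumed in [0, 1]).

    iou_ratio = 0.0 -> pure MPD term; iou_ratio = 1.0 -> pure NWD term.
    """
    fused = iou_ratio * nwd_term + (1.0 - iou_ratio) * mpd_term
    return (1.0 - fused).mean()   # loss = 1 - fused similarity
```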
Model | SegNext | RFCAConv | Efficient | NM | P(b)/% | P(m)/% | R(b)/% | R(m)/% | mAP@50 (B)/% | mAP@50 (M)/% | Parameters/M | Latency (CPU)/ms | p-Value
---|---|---|---|---|---|---|---|---|---|---|---|---|---
Base | × | × | × | × | 83.0 ± 0.3 | 82.9 ± 0.4 | 77.2 ± 0.2 | 76.3 ± 0.5 | 85.7 ± 0.4 | 84.5 ± 0.3 | 2.83 | 51.0 | - |
A | √ | × | × | × | 86.0 ± 0.6 | 86.1 ± 0.7 | 77.4 ± 0.4 | 76.4 ± 0.5 | 86.8 ± 0.3 | 85.4 ± 0.6 | 2.92 | 52.7 | 0.035 |
B | × | √ | × | × | 83.9 ± 0.2 | 84.2 ± 0.3 | 77.3 ± 0.5 | 76.0 ± 0.2 | 86.2 ± 0.6 | 85.0 ± 0.4 | 2.90 | 53.1 | 0.013 |
C | × | × | √ | × | 83.2 ± 0.5 | 83.5 ± 0.3 | 77.6 ± 0.4 | 76.5 ± 0.6 | 86.0 ± 0.2 | 85.2 ± 0.5 | 2.56 | 44.3 | 0.019 |
D | × | × | × | √ | 83.3 ± 0.6 | 83.5 ± 0.4 | 77.3 ± 0.3 | 76.2 ± 0.5 | 86.3 ± 0.4 | 85.1 ± 0.7 | 2.83 | 50.2 | 0.027 |
A + B | √ | √ | × | × | 86.7 ± 0.1 | 86.5 ± 0.3 | 78.1 ± 0.4 | 76.8 ± 0.1 | 87.6 ± 0.3 | 86.1 ± 0.2 | 2.93 | 53.5 | 0.008 |
A + D | √ | × | × | √ | 85.6 ± 0.2 | 85.8 ± 0.1 | 77.7 ± 0.4 | 76.2 ± 0.5 | 86.7 ± 0.3 | 85.2 ± 0.4 | 2.92 | 52.8 | 0.018
B + D | × | √ | × | √ | 84.9 ± 0.6 | 84.9 ± 0.4 | 76.9 ± 0.3 | 75.8 ± 0.5 | 86.3 ± 0.4 | 84.7 ± 0.6 | 2.90 | 53.4 | 0.025
A + B + C | √ | √ | √ | × | 84.7 ± 0.3 | 84.9 ± 0.4 | 78.8 ± 0.2 | 75.9 ± 0.5 | 87.3 ± 0.3 | 85.7 ± 0.5 | 2.66 | 47.4 | 0.016 |
A + C + D | √ | × | √ | √ | 84.1 ± 0.5 | 85.4 ± 0.4 | 78.6 ± 0.4 | 76.7 ± 0.6 | 87.5 ± 0.4 | 85.9 ± 0.3 | 2.59 | 45.6 | 0.022 |
B + C + D | × | √ | √ | √ | 86.0 ± 0.4 | 86.3 ± 0.7 | 79.2 ± 0.5 | 77.9 ± 0.3 | 88.2 ± 0.6 | 86.8 ± 0.5 | 2.57 | 45.3 | 0.024 |
A + B + C + D | √ | √ | √ | √ | 86.2 ± 0.4 | 86.3 ± 0.2 | 80.1 ± 0.1 | 79.0 ± 0.3 | 88.7 ± 0.3 | 87.2 ± 0.2 | 2.66 | 46.3 | 0.011 |
Model | P(b)/% | P(m)/% | R(b)/% | R(m)/% | mAP@50 (B)/% | mAP@50 (M)/% | Parameters/M | FLOPs/G | Weight/MB | Latency (CPU)/ms
---|---|---|---|---|---|---|---|---|---|---
YOLOv5n-seg | 79.4 | 79.2 | 75.6 | 75.1 | 83.6 | 82.1 | 2.76 | 11.0 | 5.8 | 50.2 |
YOLOv8n-seg | 80.1 | 82.6 | 77.1 | 75.6 | 84.2 | 83.5 | 3.26 | 12.0 | 6.8 | 53.9 |
YOLOv10n-seg | 81.9 | 82.5 | 77.0 | 76.1 | 84.5 | 84.1 | 2.84 | 11.7 | 6.0 | 51.7 |
YOLO11n-seg | 83.0 | 82.9 | 77.2 | 76.3 | 85.7 | 84.5 | 2.84 | 10.2 | 6.0 | 51.0 |
YOLO12n-seg | 78.5 | 78.9 | 72.9 | 71.9 | 81.4 | 79.9 | 2.76 | 9.7 | 5.7 | 52.6 |
YOLOv5s-seg | 81.1 | 82.4 | 77.6 | 75.0 | 84.6 | 82.8 | 9.77 | 37.8 | 18.9 | 73.5 |
YOLOv8s-seg | 80.6 | 81.4 | 78.1 | 76.0 | 85.1 | 83.0 | 11.80 | 42.7 | 23.9 | 75.3 |
YOLOv10s-seg | 82.5 | 82.5 | 78.1 | 76.7 | 85.5 | 83.6 | 10.06 | 41.2 | 20.5 | 75.1 |
YOLO11s-seg | 81.3 | 83.0 | 78.8 | 76.8 | 85.7 | 84.9 | 10.07 | 35.3 | 20.5 | 74.8 |
YOLO12s-seg | 78.5 | 78.7 | 75.8 | 74.4 | 83.2 | 81.4 | 9.73 | 33.3 | 20.0 | 74.4 |
YOLO-SPD [10] | 79.2 | 78.3 | 76.4 | 75.5 | 83.2 | 81.8 | 33.52 | 96.6 | 74.8 | 74.5 |
YOLO-SDI [13] | 85.1 | 84.7 | 77.2 | 77.2 | 84.5 | 83.7 | 6.52 | 20.0 | 19.0 | 76.2 |
DCA-YOLO [29] | 81.9 | 80.1 | 80.9 | 78.5 | 86.7 | 84.3 | 2.20 | 8.5 | 4.9 | 47.3 |
Ours | 86.2 | 86.3 | 80.1 | 79.0 | 88.7 | 87.2 | 2.66 | 9.4 | 5.4 | 46.3
Model | Type | mAP50 (B)/% | mAP50 (M)/% | Latency (Jetson B01) /ms | Latency (Jetson Orin) /ms |
---|---|---|---|---|---|
YOLO11n + C3k2-PKI | .pt-FP32 | 86.3 | 85.2 | 92.5 (+6.3) | 44.6 (+5.3) |
YOLO11n + SegNext | .pt-FP32 | 86.8 | 85.4 | 94.1 (+7.9) | 45.7 (+6.4) |
YOLO11n + RFCAConv | .pt-FP32 | 86.2 | 85.0 | 95.7 (+9.5) | 47.5 (+8.2) |
YOLO11n + Efficient | .pt-FP32 | 86.0 | 85.2 | 79.1 (−7.1) | 34.4 (−4.9) |
YOLO11n + NM | .pt-FP32 | 86.3 | 85.1 | 86.4 (+0.2) | 40.9 (+1.6) |
YOLO11n-seg | .pt-FP32 | 85.7 | 84.5 | 86.2 | 39.3
YOLO11n-seg | .engine-FP32 | 85.6 | 84.3 | 63.5 (−22.7) | 28.1 (−11.2)
YOLO11n-seg | .engine-FP16 | 85.5 | 84.3 | 57.7 (−28.5) | 25.4 (−13.9)
YOLO11s-seg | .pt-FP32 | 85.7 | 84.9 | 105.4 (+19.2) | 57.1 (+17.8)
YOLO11s-seg | .engine-FP32 | 85.7 | 84.9 | 81.2 (−5.0) | 33.5 (−5.8)
YOLO11s-seg | .engine-FP16 | 85.6 | 84.7 | 76.5 (−9.7) | 30.3 (−9.0)
RSE-YOLO-seg | .pt-FP32 | 88.7 | 87.2 | 88.1 (+1.9) | 41.2 (+1.9)
RSE-YOLO-seg | .engine-FP32 | 88.5 | 87.1 | 65.6 (−20.6) | 28.7 (−10.6)
RSE-YOLO-seg | .engine-FP16 | 88.5 | 86.9 | 59.3 (−26.9) | 26.3 (−13.0)
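The .engine rows in the table could be produced with a TensorRT export such as the sketch below, assuming the Ultralytics exporter running in the JetPack 6.2 / TensorRT 10.3.0 environment listed earlier; the weight file name is an illustrative placeholder.

```python
# TensorRT export sketch for the FP32/FP16 engine variants (Ultralytics API assumed).
from ultralytics import YOLO

model = YOLO("rse-yolo-seg.pt")              # hypothetical trained weights
model.export(format="engine", half=False)    # .engine-FP32
model.export(format="engine", half=True)     # .engine-FP16

engine = YOLO("rse-yolo-seg.engine")         # run inference through TensorRT
results = engine("field_video.mp4")          # illustrative input video
```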