3.4.2. Ablation Experiment
To systematically evaluate the individual and synergistic effects of each improved module in the fish-disease detection task, eight groups of ablation experiments were designed based on the YOLOv11n baseline network. All experiments were conducted under identical training environments and parameter settings. The results are presented in Table 7.
The experimental outcomes demonstrate that the introduction of the PC_Shuffleblock module significantly enhances the detection performance of the YOLOv11n model, with mAP0.5 increasing from 93.4% to 95.1%, Precision improving from 92.5% to 95.1%, and Recall rising from 90.4% to 92.8%. This improvement indicates that the PC_Shuffleblock module, leveraging the flexible receptive field design of its pinwheel-shaped convolution (PConv), effectively strengthens the model’s ability to capture edge details and complex textures at the feature extraction stage, thereby improving detection accuracy and robustness.
Moreover, the integration of the SPPF_TSFA module also yields performance gains. mAP0.5 rises modestly, from 93.4% to 94.3%, while Precision and Recall rise to 93.8% and 90.7%, respectively. This suggests that the synergistic effect of multi-scale feature-fusion and spatial-attention mechanisms in the SPPF_TSFA module enhances the model’s contextual modeling capability and its ability to detect small targets.
Furthermore, the replacement of the loss function with SDIoU loss optimizes the model’s gradient responsiveness to samples of varying quality. The scale-dynamic regulation mechanism alleviates training perturbations caused by the instability of small-object IoU, resulting in an mAP0.5 increase to 94.7%, with Precision and Recall improving to 94.2% and 91.7%, respectively. Consequently, the model exhibits enhanced generalization performance under complex backgrounds and varying target scales.
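The instability of small-object IoU mentioned above can be illustrated with a minimal sketch. It computes plain IoU only; it is not the SDIoU loss, whose formula is not given in this section. The box sizes and the 2-pixel shift are hypothetical values chosen for illustration.

```python
def iou(box_a, box_b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def shifted_iou(side, shift):
    """IoU between a square box and the same box shifted horizontally."""
    ground_truth = (0.0, 0.0, float(side), float(side))
    prediction = (float(shift), 0.0, float(side + shift), float(side))
    return iou(ground_truth, prediction)


# Hypothetical sizes: the same 2-pixel localization error on a small and a large box.
small = shifted_iou(side=10, shift=2)
large = shifted_iou(side=100, shift=2)
print(f"10x10 box: IoU = {small:.3f}; 100x100 box: IoU = {large:.3f}")
```

The same absolute offset costs the small box far more IoU than the large one, which is the kind of sensitivity a scale-aware loss is meant to dampen.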
These results comprehensively validate the complementarity among the modules in terms of structural design and optimization objectives, further demonstrating that the multi-module collaborative integration strategy significantly enhances the overall performance of the model in real-world detection tasks.
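As a sketch of how the eight ablation groups could be enumerated, the snippet below assumes they correspond to every on/off combination of the three modules (2^3 = 8). The text does not spell out this mapping, so the grouping and its ordering are assumptions rather than the layout of Table 7.

```python
from itertools import product

# Module names as used in the text; treating each as an on/off switch is an assumption.
MODULES = ("PC_Shuffleblock", "SPPF_TSFA", "SDIoU")


def ablation_grid(modules=MODULES):
    """Return every on/off combination of the modules, baseline first."""
    return [dict(zip(modules, flags))
            for flags in product((False, True), repeat=len(modules))]


configs = ablation_grid()
for index, config in enumerate(configs, start=1):
    enabled = [name for name, on in config.items() if on] or ["baseline YOLOv11n"]
    print(f"group {index}: {' + '.join(enabled)}")
```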
3.4.3. Contrast Experiment
To validate the performance advantages of the proposed YOLO-TPS model in fish-disease detection tasks, a systematic comparative analysis was conducted against multiple mainstream object-detection models, including Faster R-CNN, SSD, YOLOv5, YOLOv8n, YOLOv9n, YOLOv10n, and YOLOv11n. The experimental results, summarized in Table 8, present the performance metrics of each model on the test dataset, reported as the mean and standard deviation over three independent training runs. The YOLO-TPS model comprises 2.513 million parameters, far fewer than Faster R-CNN (44.297 million) and SSD (30.245 million). Its floating-point operations (FLOPs) amount to 6.9 G, markedly lower than those of Faster R-CNN (207 G) and SSD (37.8 G). The model size is 5.4 MB, substantially smaller than that of Faster R-CNN (315 MB) and SSD (104 MB). Compared with YOLOv11n (2.583 million parameters, 6.3 G FLOPs, 5.2 MB), YOLO-TPS has a slightly lower parameter count but marginally higher FLOPs and model size; it nonetheless exhibits competitive advantages in key performance metrics, reflecting a balanced trade-off between lightweight design and detection efficiency.
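The cost comparison above can be checked with a short sketch that expresses each YOLO-TPS cost figure as a ratio to the corresponding figure of another model. The numbers are the ones quoted in the text; the dictionary layout and function name are illustrative choices.

```python
# Values as quoted in the text: (parameters in millions, FLOPs in G, model size in MB).
COST = {
    "Faster R-CNN": (44.297, 207.0, 315.0),
    "SSD": (30.245, 37.8, 104.0),
    "YOLOv11n": (2.583, 6.3, 5.2),
    "YOLO-TPS": (2.513, 6.9, 5.4),
}


def relative_cost(model, reference="YOLO-TPS", table=COST):
    """Ratio of each cost metric of `reference` to that of `model` (<1 means cheaper)."""
    return tuple(ref / other for ref, other in zip(table[reference], table[model]))


for name in ("Faster R-CNN", "SSD", "YOLOv11n"):
    params, flops, size = relative_cost(name)
    print(f"YOLO-TPS vs {name}: params x{params:.3f}, FLOPs x{flops:.3f}, size x{size:.3f}")
```

With the quoted figures, the parameter ratio relative to YOLOv11n comes out slightly below 1, while the FLOPs and model-size ratios are slightly above 1.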
Regarding performance evaluation, the YOLO-TPS model achieves a mean Average Precision at IoU 0.5 (mAP0.5) of 97.2%, outperforming Faster R-CNN (61.2%), SSD (56.2%), YOLOv5 (90.2%), YOLOv8n (91.4%), YOLOv9n (92.3%), YOLOv10n (92.2%), and YOLOv11n (93.4%) by 36.0, 41.0, 7.0, 5.8, 4.9, 5.0, and 3.8 percentage points, respectively. In terms of Precision, YOLO-TPS attains 97.9%, exceeding Faster R-CNN (70.5%), SSD (53.1%), YOLOv5 (87.4%), YOLOv8n (91.6%), YOLOv9n (88.8%), YOLOv10n (91.5%), and YOLOv11n (92.5%) by 27.4, 44.8, 10.5, 6.3, 9.1, 6.4, and 5.4 percentage points, respectively. The Recall of YOLO-TPS reaches 95.1%, surpassing Faster R-CNN (65.9%), SSD (44.2%), YOLOv5 (86.9%), YOLOv8n (90.3%), YOLOv9n (89.0%), YOLOv10n (89.1%), and YOLOv11n (90.4%) by 29.2, 50.9, 8.2, 4.8, 6.1, 6.0, and 4.7 percentage points, respectively. Although YOLOv11n demonstrates comparable performance on certain metrics, the higher mAP0.5 of YOLO-TPS (3.8 percentage points), combined with its overall Precision and Recall performance, underscores its strong capability in fish-disease detection.
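The reported gains are absolute differences between percentages. The following sketch recomputes the mAP0.5 gains from the values quoted above and, for contrast, shows the relative gain over YOLOv11n; the function names are illustrative.

```python
# mAP0.5 values (%) as quoted in the text.
BASELINE_MAP50 = {
    "Faster R-CNN": 61.2, "SSD": 56.2, "YOLOv5": 90.2, "YOLOv8n": 91.4,
    "YOLOv9n": 92.3, "YOLOv10n": 92.2, "YOLOv11n": 93.4,
}
YOLO_TPS_MAP50 = 97.2


def point_gains(ours, baselines):
    """Absolute gain in percentage points over each baseline."""
    return {name: round(ours - value, 1) for name, value in baselines.items()}


def relative_gain(ours, baseline):
    """Relative gain in percent, i.e. the absolute gain divided by the baseline."""
    return 100.0 * (ours - baseline) / baseline


gains = point_gains(YOLO_TPS_MAP50, BASELINE_MAP50)
for name, gain in gains.items():
    print(f"{name}: +{gain} percentage points")
print(f"relative gain over YOLOv11n: {relative_gain(YOLO_TPS_MAP50, 93.4):.2f}%")
```

The absolute gain over YOLOv11n is 3.8 points, whereas the relative gain is slightly above 4%, so the two quantities should not be conflated.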
The standard-deviation results indicate that YOLO-TPS was more stable across multiple runs, with a standard deviation of 0.3% in mAP0.5, lower than that of YOLOv11n (0.4%), reflecting the robustness of the proposed model. To assess the statistical significance of the observed performance improvement, a paired t-test was conducted, comparing the mAP0.5 values of YOLO-TPS and YOLOv11n, based on test results from three independent runs. The results revealed that YOLO-TPS achieved a significantly higher mAP0.5 than YOLOv11n (p = 0.002 < 0.01), indicating that the performance gain is unlikely to be attributable to random variation and is consistent with the architectural enhancements incorporated in the model design.
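For readers unfamiliar with the test, a paired t-test on three runs can be sketched as below. The per-run scores are not reported in this section, so the inputs are hypothetical placeholders that are deliberately unrelated to the reported results. The p-value helper uses the closed-form t distribution for 2 degrees of freedom, which is the case that arises with three paired runs.

```python
import math
import statistics


def paired_t(a, b):
    """Return (t statistic, degrees of freedom) for a paired t-test on a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


def two_sided_p_df2(t):
    """Exact two-sided p-value for Student's t with exactly 2 degrees of freedom."""
    return 1.0 - abs(t) / math.sqrt(t * t + 2.0)


# Hypothetical placeholder scores, NOT the paper's per-run results.
run_a = [3.0, 5.0, 7.0]
run_b = [2.0, 3.0, 4.0]
t, df = paired_t(run_a, run_b)
p = two_sided_p_df2(t)
print(f"t = {t:.3f}, df = {df}, two-sided p = {p:.4f}")
```

With only two degrees of freedom the test has little power, so a small p-value requires differences that are both large and very consistent across runs.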
In summary, the YOLO-TPS model performs well in terms of parameter count (2.513 million), FLOPs (6.9 G), model size (5.4 MB), and performance metrics (mAP0.5 97.2%, Precision 97.9%, Recall 95.1%), demonstrating clear advantages over both traditional detectors and contemporary YOLO-based models. These results substantiate the efficiency and robustness of YOLO-TPS in fish-disease detection tasks, providing theoretical insights and practical references for the advancement of intelligent diagnostic technologies in aquaculture.
We compared the predictions of the three models with the best overall performance in the comparison experiments, with the confidence threshold set to 0.5. Figure 12 shows the detection results of YOLOv8n, YOLOv11n, and YOLO-TPS for different fish diseases.
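A minimal sketch of what a confidence setting of 0.5 means in practice: predictions scoring below the threshold are discarded before visualization. The `Detection` structure, the sample predictions, and the choice of an inclusive comparison (`>=`) are assumptions for illustration, not details taken from the experiments.

```python
from typing import NamedTuple, Tuple


class Detection(NamedTuple):
    label: str
    confidence: float
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def filter_by_confidence(detections, threshold=0.5):
    """Keep detections whose confidence is at or above the threshold."""
    return [d for d in detections if d.confidence >= threshold]


# Hypothetical raw predictions for a single image.
raw = [
    Detection("EUS", 0.91, (12.0, 30.0, 140.0, 96.0)),
    Detection("healthy", 0.42, (150.0, 28.0, 260.0, 90.0)),
    Detection("EUS", 0.50, (20.0, 110.0, 130.0, 170.0)),
]
kept = filter_by_confidence(raw)
for det in kept:
    print(f"{det.label}: {det.confidence:.2f}")
```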
The experiment selected images representing four typical scenarios: overlapping multiple fish, a single clearly visible fish, interference from complex backgrounds, and multiple fish arranged side by side. The detection results are presented in Figure 12. The first row of the figure displays the original images: the first column depicts a densely stacked scene of multiple fish on the water surface; the second and third columns show images of individual fish; the fourth column presents a scenario with a single fish held by hand, where the background includes hand- and ground-reflection interference; the fifth column illustrates multiple fish aligned side by side. The subsequent three rows show detection results from YOLOv8n, YOLOv11n, and YOLO-TPS, respectively. Detection bounding boxes are color-coded and annotated with confidence scores and pathological-category labels.
In the overlapping multiple-fish scenario, YOLOv8n and YOLOv11n identified some fish but exhibited notable bounding-box deviations, with several targets either mislocalized or missed, indicating incomplete detection. In contrast, YOLO-TPS demonstrated superior robustness in dense target detection by accurately localizing bounding boxes that fully encompassed all fish, and correctly labeling pathological categories.
In the single-fish detection scenario, both YOLOv8n and YOLOv11n successfully localized the targets; however, their bounding boxes were less tight, had lower confidence scores, and were more susceptible to background interference. Conversely, YOLO-TPS produced bounding boxes that closely conformed to the fish contours, achieved higher confidence scores, and effectively suppressed background noise while accurately annotating pathological features.
In the multiple-fish side-by-side scenario, YOLO-TPS further showcased its advantages, generating bounding boxes that precisely adhered to each fish’s contour, with confidence scores exceeding those of YOLOv8n and YOLOv11n. These improvements can be attributed to YOLO-TPS’s integration of the PC_Shuffleblock module, which enhances spatial modeling capability; the SPPF_TSFA module, which extracts multi-scale features for small lesions; and the dynamically optimized SDIoU loss function. Together, these components contribute to the model’s outstanding performance in detection accuracy, bounding-box precision, and prediction reliability, thereby providing an efficient and robust technical solution for intelligent diagnosis in complex fish-disease detection scenarios.
To further evaluate YOLO-TPS’s detection performance across different fish-disease types, this study conducted a statistical analysis of Precision, Recall, and mean Average Precision at IoU 0.5 (mAP0.5) for six fish-disease categories plus a healthy class, with results summarized in Table 9. Although certain categories, such as EUS, exhibited relatively lower Recall and mAP values, highlighting ongoing challenges in detecting complex lesions, YOLO-TPS demonstrated excellent average detection accuracy across other categories. Overall, the model achieved a Precision of 97.9%, Recall of 95.1%, and mAP0.5 of 97.2%, reflecting strong generalization ability and robustness in multi-class fish-disease detection tasks.