by
  • Mahdieh Gol Hassani*,
  • Mozafar Saadat and
  • Peiran Lei

Reviewer 1: Anonymous
Reviewer 2: Inzamam Mashood Nasir
Reviewer 3: Anonymous

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This paper proposes a dual-layer sperm tracking framework based on the Extended Kalman Filter (EKF), named EKF-BoT-SORT, which aims to address identity switches (ID switches) and overcounting of sperm in dense, low-contrast microscopy videos. The method integrates BoT-SORT with an EKF to enhance identity consistency and trajectory continuity by modeling the nonlinear motion of sperm, including position, velocity, and heading. Experiments on the VISEM dataset demonstrate that the proposed method achieves significant improvements across multiple multi-object tracking (MOT) metrics, particularly in IDF1, track duration, and the number of ID switches, while maintaining real-time processing capabilities. Overall, the manuscript requires some revisions before it can be considered for publication. Please address the following issues.

1. The paper emphasizes "more biologically meaningful trajectories" but does not present any comparative results for motility parameters (such as VCL, VSL, LIN, etc.) derived from the trajectories. Would it be possible to further validate the clinical relevance of the proposed method using these downstream analytical metrics?

2. The process noise covariance Q and measurement noise covariance R in the EKF were set empirically. Was a parameter sensitivity analysis conducted? How robust are these parameters under different video qualities or varying sperm densities?

3. The authors mention that the detection performance of YOLO directly affects the continuity of the tracking ID. If YOLO produces duplicate detections (i.e., the same sperm is detected as two targets, resulting in two bounding boxes), how does the proposed tracking algorithm address this issue?

4. BoT-SORT employs a two-stage association strategy that matches high- and low-confidence detections from YOLO. How were the high-confidence threshold τ and the low-confidence threshold η determined via cross-validation on the VISEM dataset? For sperm samples with varying densities (e.g., 10 sperm per frame vs. 50 sperm per frame), would these two thresholds need to be adjusted to reduce ID switches?
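
To make the downstream metrics in comment 1 concrete, the following is a minimal sketch of how VCL, VSL, and LIN are commonly computed from a single trajectory. It is not taken from the manuscript: the function name `motility_parameters`, the `fps` and `um_per_px` arguments, and the input format (one (x, y) position per consecutive frame) are assumptions made for illustration.

```python
import math


def motility_parameters(track, fps, um_per_px=1.0):
    """Compute VCL, VSL and LIN for one trajectory.

    track: list of (x, y) positions in pixels, one per consecutive frame
           (hypothetical input format).
    fps: frame rate of the video (assumed known).
    um_per_px: pixel size in micrometres (assumed calibration value).
    """
    if len(track) < 2:
        raise ValueError("a trajectory needs at least two points")

    duration = (len(track) - 1) / fps  # elapsed time in seconds

    # Curvilinear path: sum of frame-to-frame displacements.
    path = sum(math.dist(p, q) for p, q in zip(track, track[1:])) * um_per_px
    # Straight-line path: distance from the first to the last position.
    net = math.dist(track[0], track[-1]) * um_per_px

    vcl = path / duration               # curvilinear velocity
    vsl = net / duration                # straight-line velocity
    lin = vsl / vcl if vcl > 0 else 0.0  # linearity, VSL / VCL
    return {"VCL": vcl, "VSL": vsl, "LIN": lin}
```

Because VCL sums every frame-to-frame step, a trajectory broken by an ID switch yields shorter segments and different values than the unbroken one, which is why these parameters could serve as a downstream check on tracking quality.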

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The manuscript presents a well-motivated and technically solid contribution by enhancing BoT-SORT with an Extended Kalman Filter (EKF) for more reliable sperm tracking under challenging microscopy conditions. The proposed approach addresses key issues such as identity fragmentation, overcounting, and trajectory instability, and the experimental evaluation demonstrates promising improvements in IDF1, identity switches, and trajectory duration compared to established baselines. The integration of EKF into the tracking pipeline is innovative, and the results on the VISEM dataset highlight the potential of the method for biomedical applications. Although it is a good study, I have the following concerns.

The evaluation is restricted to the VISEM dataset; additional tests under denser, noisier, or cross-dataset conditions would strengthen the evidence of robustness.

More mathematical detail and sensitivity analysis of the EKF identity reassignment threshold values are needed to confirm stability.

Comparisons should be extended to include recent transformer-based and graph-attention tracking approaches for completeness.

The ablation study should explicitly quantify the contribution of heading-angle modeling compared to simpler velocity-only EKF state vectors.

While runtime analysis shows efficiency, scalability and performance on lower-resource clinical devices should be discussed.

Examples of failure cases where EKF does not successfully reassign IDs would provide a more balanced picture of the method’s limitations.

The broader applicability of the framework beyond sperm tracking (e.g., cell, embryo, or microorganism tracking) should be elaborated.

The related work section should be expanded to cite Integrating Explanations into CNNs by Adopting Spiking Attention Block for Skin Cancer Detection (DOI: 10.3390/a17120557).

The discussion could be enriched by explaining how methods from these cited works could be adapted or contrasted with EKF-BoT-SORT for enhanced interpretability and robustness.
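
For the ablation point above on heading-angle modeling, the sketch below shows what an EKF prediction step with an explicit heading could look like. It is only an illustration under stated assumptions: the state layout (x, y, speed v, heading θ), the constant-speed and constant-heading motion model, and the helper names are hypothetical and may differ from the state vector used in the manuscript.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose a matrix given as nested lists."""
    return [list(row) for row in zip(*a)]


def ekf_predict(state, P, Q, dt):
    """One EKF prediction step for the assumed state [x, y, v, theta].

    The motion model is nonlinear in theta, so the covariance is
    propagated with the Jacobian F evaluated at the current state.
    """
    x, y, v, theta = state
    c, s = math.cos(theta), math.sin(theta)

    # Nonlinear motion model: move along the current heading.
    pred = [x + v * c * dt, y + v * s * dt, v, theta]

    # Jacobian of the motion model with respect to [x, y, v, theta].
    F = [
        [1.0, 0.0, c * dt, -v * s * dt],
        [0.0, 1.0, s * dt,  v * c * dt],
        [0.0, 0.0, 1.0,     0.0],
        [0.0, 0.0, 0.0,     1.0],
    ]

    # Covariance propagation: P_pred = F P F^T + Q.
    FPFt = matmul(matmul(F, P), transpose(F))
    P_pred = [[FPFt[i][j] + Q[i][j] for j in range(4)] for i in range(4)]
    return pred, P_pred
```

A velocity-only baseline with state (x, y, vx, vy) has a constant transition matrix and reduces to a linear Kalman filter, so comparing the two under identical Q and R would isolate the contribution of the heading term.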

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

1. You use the Kalman filter to track the trajectory of sperm movement, which is free of noise. You do not influence the movement of sperm in any way, you observe them. Isn't that right?

2. How do you perform nonlinear motion modeling adapted to dynamics?

3. What is the difference between the predicted and observed states of sperm movement?

4. What sensors did you use to record the movement? What equipment was used to conduct the research?

5. The algorithm you developed is the best in all respects. What are the shortcomings of your algorithm?

6. Under what lighting conditions did you conduct the observations? Did the recognition change when the lighting conditions changed?
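
As background to questions 1 and 3, the standard EKF relations between the predicted state and the observed measurement are summarized below. The symbols are the textbook ones and are assumptions here, not notation taken from the manuscript: f is the motion model, h the measurement model, z_k the detection at frame k, and R the measurement noise covariance.

```latex
\begin{aligned}
\hat{\mathbf{x}}_{k|k-1} &= f(\hat{\mathbf{x}}_{k-1|k-1}) && \text{(predicted state)}\\
\mathbf{y}_k &= \mathbf{z}_k - h(\hat{\mathbf{x}}_{k|k-1}) && \text{(innovation: observed minus predicted)}\\
\mathbf{S}_k &= \mathbf{H}_k \mathbf{P}_{k|k-1} \mathbf{H}_k^{\top} + \mathbf{R} && \text{(innovation covariance)}\\
\mathbf{K}_k &= \mathbf{P}_{k|k-1} \mathbf{H}_k^{\top} \mathbf{S}_k^{-1} && \text{(Kalman gain)}\\
\hat{\mathbf{x}}_{k|k} &= \hat{\mathbf{x}}_{k|k-1} + \mathbf{K}_k \mathbf{y}_k && \text{(updated state)}
\end{aligned}
```

Even though the sperm are only observed and not actuated, the detections z_k carry localization error, which is what R is meant to represent.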

Author Response

Please see the attachment.

Author Response File: Author Response.pdf