Article
Peer-Review Record

Counting Abalone with High Precision Using YOLOv3 and DeepSORT

Processes 2023, 11(8), 2351; https://doi.org/10.3390/pr11082351
by Duncan Kibet and Jong-Ho Shin *
Reviewer 1: Anonymous
Submission received: 23 May 2023 / Revised: 2 August 2023 / Accepted: 2 August 2023 / Published: 4 August 2023

Round 1

Reviewer 1 Report

Dear Authors,

The manuscript “Counting Abalone with High-Precision using Yolov3 and Deep SORT” is interesting; however, the manuscript could be improved considerably.

The following concerns need to be addressed.

 

Major concern:

 

1.     The manuscript claims that the effectiveness and detection performance of the proposed model/algorithm are better than those of other previously proposed methods. However, this claim is not validated.

 

It is recommended that the authors demonstrate the superiority of the proposed model by comparing the counting rate, accuracy (sensitivity and specificity), and other metrics against pre-existing similar models in tabulated format.

 

2.     In Table 2, the authors mentioned the accuracy rate of the model compared to manual counting. There is a large variability in the results. Could you please explain why the “slow-motion effect” causes such inaccuracy in some scenarios?

 

Best.

 

Moderate editing of the English language is required. Please use writing software.

 

Author Response

Reviewer 1

The manuscript “Counting Abalone with High-Precision using Yolov3 and Deep SORT” is interesting; however, the manuscript could be improved considerably.

 

The following concerns need to be addressed.

 

Thank you for your valuable time reviewing the proposed paper. We appreciate the effort you dedicated to providing feedback on our manuscript and are grateful for the insightful comments and valuable suggestions for improving our paper. We have incorporated most of your suggestions; those changes are highlighted in red within the manuscript. Please see below our point-by-point responses to your comments and concerns.

 

(1) The manuscript claims that the effectiveness and detection performance of the proposed model/algorithm are better than those of other previously proposed methods. However, this claim is not validated.

 

- As you commented, we validate that the proposed algorithm has potential for counting abalone under occlusion, or in cases where abalones become entangled and would otherwise be counted as one. In addition, considering previously proposed work, for example the article by Park et al. (see lines 126–134), we have shown that this study has great potential according to the results obtained. We have also summarized the differences from previous works in Table 1 in order to show the effectiveness of our model.

 

(2) It is recommended that the authors demonstrate the superiority of the proposed model by comparing the counting rate, accuracy (sensitivity and specificity), and other metrics against pre-existing similar models in tabulated format.

- As you commented, Table 3 has been added to compare our model with other models and show the differences.

 

(3) In Table 2, the authors mentioned the accuracy rate of the model compared to manual counting. There is a large variability in the results. Could you please explain why the “slow-motion effect” causes such inaccuracy in some scenarios?

 

- As you commented, slow-motion footage often exhibits motion blur due to the slower shutter speed used during capture. The blurred frames make it challenging for the object detection model to precisely identify and localize abalone in each frame. The blurriness can obscure important features of the abalone, resulting in missed detections or inaccurate bounding-box predictions. This can be seen in the fourth row of the count results in Table 3, where the actual abalone count is 25 but the model counts 16, an accuracy of 64%.
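The reported accuracy is simply the ratio of the model count to the manual count; a quick check of the figures quoted above:

```python
# Counting accuracy as reported in Table 3 (fourth row):
actual_count = 25   # manual count
model_count = 16    # model count under the slow-motion video
accuracy = 100 * model_count / actual_count
print(f"{accuracy:.0f}%")  # → 64%
```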

 

 

Author Response File: Author Response.pdf

Reviewer 2 Report

The manuscript addresses the problem of counting marine objects, specifically abalones, in industrial environments. The difficulty of counting tangled abalones and the need for automation to reduce costs and time are emphasized.

The manuscript presents a consistent approach in addressing the importance of modern and low-cost solutions for the fishing industry, aiming for automation and revenue generation. However, there are some aspects that could be further developed:

1-Justification for the choice of YOLOv3-Tensorflow and Deep SORT, as the arguments described at the beginning may not be sufficient when seeking greater accuracy.

2 - Reference to other previously proposed methods and their limitations. A table would be more illustrative...

3 - I also missed more details about the proposed methodology, such as the specific steps of YOLOv3 and Deep SORT, as well as discussing the challenges and limitations encountered during implementation...

4 - I suggest also checking the formatting of the text, specifically the references section.

The work highlights many efforts in implementation and conducted experiments. However, I believe that providing more details about the methodology, procedures, a better analysis, and discussion of the results would make the authors' contribution clearer.

Author Response

Reviewer 2

 

The manuscript addresses the problem of counting marine objects, specifically abalones, in industrial environments. The difficulty of counting tangled abalones and the need for automation to reduce costs and time are emphasized.

 

The manuscript presents a consistent approach in addressing the importance of modern and low-cost solutions for the fishing industry, aiming for automation and revenue generation. However, there are some aspects that could be further developed:

Thank you for your valuable time reviewing the proposed paper. We appreciate the effort you dedicated to providing feedback on our manuscript and are grateful for the insightful comments and valuable suggestions for improving our paper. We have incorporated most of your suggestions; those changes are highlighted in red within the manuscript. Please see below our point-by-point responses to your comments and concerns.

 

(1) Justification for the choice of YOLOv3-Tensorflow and Deep SORT, as the arguments described at the beginning may not be sufficient when seeking greater accuracy.

 

- As you commented, the justification for choosing YOLOv3-TensorFlow and Deep SORT is as follows:

YOLOv3 has shown significant improvement in detecting small objects compared with its predecessors, a capability that is vital for identifying individual abalones, which are relatively small creatures. YOLOv3 not only identifies objects within an image but also provides the bounding-box coordinates for each detected object. This information is crucial for tracking objects over time, a task that Deep SORT handles efficiently.

Deep SORT for abalone counting. Handling occlusions: Deep SORT (Simple Online and Realtime Tracking with a deep association metric) can handle temporary occlusions, making it highly effective in dynamic environments where abalones may overlap or hide behind each other or behind objects in their environment.

Tracking ID maintenance: Deep SORT maintains a consistent track ID even when an object has temporarily left the scene or been occluded, which makes the counting of individual abalones more accurate. Integration with YOLOv3: Deep SORT has been shown to work well with YOLO detectors; the bounding-box information from YOLOv3 can be fed directly into Deep SORT for object tracking.

Use of motion information: Deep SORT incorporates motion information, which improves tracking performance in cases where abalones are moving.
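The ID-maintenance and occlusion-handling points above can be illustrated with a minimal toy sketch. This is our own simplification for illustration, not the actual Deep SORT implementation: it matches detections by IoU only with an age-based grace period, whereas real Deep SORT additionally uses a Kalman motion model and a deep appearance metric.

```python
# Toy IoU-based tracker: a track ID survives a brief occlusion via "max_age".

def iou(a, b):
    # Boxes as (x1, y1, x2, y2); returns intersection-over-union in [0, 1].
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def track(frames, iou_thr=0.3, max_age=2):
    """frames: list of per-frame detection lists. Returns total IDs assigned."""
    tracks, next_id = {}, 0  # id -> (last box, frames since last match)
    for dets in frames:
        matched = set()
        for det in dets:
            # Greedily match to the live track with the highest IoU.
            best = max(tracks, key=lambda t: iou(tracks[t][0], det), default=None)
            if best is not None and iou(tracks[best][0], det) >= iou_thr:
                tracks[best] = (det, 0)
                matched.add(best)
            else:
                tracks[next_id] = (det, 0)  # spawn a new track
                matched.add(next_id)
                next_id += 1
        # Age unmatched tracks; drop those occluded longer than max_age frames.
        for t in list(tracks):
            if t not in matched:
                box, age = tracks[t]
                if age + 1 > max_age:
                    del tracks[t]
                else:
                    tracks[t] = (box, age + 1)
    return next_id

# An abalone detected, occluded for one frame, then re-detected nearby keeps
# its ID, so only one ID is ever assigned:
frames = [[(0, 0, 10, 10)], [], [(1, 1, 11, 11)]]
print(track(frames))  # → 1
```

Without the grace period the occluded frame would kill the track and the re-detection would spawn a new ID, inflating the count, which is exactly the failure mode Deep SORT's ID maintenance avoids.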

 

(2) Reference to other previously proposed methods and their limitations. A table would be more illustrative...

 

- As you commented, we added Table 1 to compare our method with previous works.

 

 

(3) I also missed more details about the proposed methodology, such as the specific steps of YOLOv3 and Deep SORT, as well as discussing the challenges and limitations encountered during implementation.

 

- As you commented, below are the main steps of the proposed method; they are described in Chapter 3, lines 206–222:

 

1. Dataset preparation: Gather a dataset of annotated images containing abalone instances from the video data, annotating each abalone in every frame with bounding-box coordinates.

 

2. YOLOv3 setup: Set up the YOLOv3 object detection model using TensorFlow.

 

3. Training: Train the YOLOv3 model on the annotated abalone dataset. Training involves feeding the annotated images and their corresponding bounding-box annotations into the model and optimizing the model's parameters to accurately detect abalone objects.

 

4. Testing: After training, evaluate the performance of the YOLOv3 model on a separate test set. This step assesses the accuracy and generalization capability of the trained model.

 

5. DeepSORT integration: DeepSORT is a tracking algorithm used in conjunction with YOLOv3 to track abalone objects across multiple frames. Integrate DeepSORT with the YOLOv3 model to associate bounding-box detections across frames and track abalone instances over time.

 

6. Tracking: Apply the combined YOLOv3-DeepSORT model to a video or sequence of frames containing abalone. The model detects and tracks abalone instances, assigning a unique ID to each instance so that they can be counted and their movements monitored.

 

7. Counting: Count the number of unique abalone instances based on their tracked IDs. Each ID represents a distinct abalone, so the number of unique IDs gives the abalone count for the video or frame sequence.
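Step 7 above reduces to counting distinct track IDs. In this sketch, `tracked_ids_per_frame` is a hypothetical stand-in for the per-frame lists of DeepSORT track IDs produced in steps 5 and 6:

```python
# Count abalone as the number of distinct track IDs ever observed.

def count_abalone(tracked_ids_per_frame):
    seen = set()
    for ids in tracked_ids_per_frame:
        seen.update(ids)
    return len(seen)

# Three frames; abalone 2 is occluded in the second frame but reappears with
# the same ID in the third, so it is counted once:
print(count_abalone([[1, 2], [1], [1, 2, 3]]))  # → 3
```

Because the count is over IDs rather than per-frame detections, the result is insensitive to how many frames each abalone is visible in, provided the tracker keeps IDs consistent.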

 

 

- Regarding the challenges and limitations encountered during implementation: due to the blurriness of images within the video, much of the data had to be discarded (described in Chapter 3.1, lines 231–233), which limited the amount of usable data. Furthermore, one of the videos had a slow-motion effect that led to low accuracy; please refer to lines 490–497.

 

 

 

(4) I suggest also checking the formatting of the text, specifically the references section.

 

- As you commented, the references have been reformatted to the MDPI style.

 

 

 

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Dear Authors,

All my comments are addressed.

Minor editing of English language required.

Author Response

Thank you for your valuable time reviewing the proposed paper. We appreciate the effort you dedicated to providing feedback on our manuscript and are grateful for the insightful comments and valuable suggestions for improving our paper. As you commented, we have revised some sentences to improve the English, and the paper has been revised through an English language editing service.

Reviewer 2 Report

The authors made several inclusions and corrections in the manuscript, which undoubtedly contributed to making the contribution clearer, which makes me very satisfied. However, I suggest considering some bibliographies in the final version of the manuscript.

S. R and M. M, "Comparing YOLOV3, YOLOV5 & YOLOV7 Architectures for Underwater Marine Creatures Detection," 2023 International Conference on Computational Intelligence and Knowledge Economy (ICCIKE), Dubai, United Arab Emirates, 2023, pp. 25-30, doi: 10.1109/ICCIKE58312.2023.10131703.

J. R. B. Garay, S. T. Kofuji and T. Tiba, "Overview of a system AMR based in computational Vision and Wireless sensor network," 2009 IEEE Latin-American Conference on Communications, Medellin, Colombia, 2009, pp. 1-5, doi: 10.1109/LATINCOM.2009.5305009.

B. Balakrishnan, R. Chelliah, M. Venkatesan and C. Sah, "Comparative Study On Various Architectures Of Yolo Models Used In Object Recognition," 2022 International Conference on Computing, Communication, and Intelligent Systems (ICCCIS), Greater Noida, India, 2022, pp. 685-690, doi: 10.1109/ICCCIS56430.2022.10037635.

J. R. Beingolea, A. B. Filho, J. Borja-Murillo, J. Rendulich and M. Zegarra, "Security System for University Spaces: Implementation Experience," 2021 2nd Sustainable Cities Latin America Conference (SCLA), Medellin, Colombia, 2021, pp. 1-4, doi: 10.1109/SCLA53004.2021.9540149.

 

I also suggest observing some aspects of formatting, although this does not affect the contributions presented in the manuscript in any way, it is important to verify. Ex. alignment of figures, tables, and their titles.

In general, I believe that the authors satisfactorily addressed the observations initially made about the first version of the manuscript.

Author Response

Thank you for your valuable time reviewing the proposed paper. We appreciate the effort you dedicated to providing feedback on our manuscript and are grateful for the insightful comments and valuable suggestions for improving our paper. We have incorporated most of your suggestions; those changes are highlighted in red within the manuscript. Please see below our point-by-point responses to your comments and concerns.

1. The authors made several inclusions and corrections in the manuscript, which undoubtedly made the contribution clearer; I am very satisfied. However, I suggest considering some additional bibliography in the final version of the manuscript.

-> The suggested valuable papers have been included in Section 2.1 (in red) in order to aid the reader's understanding.

2. I also suggest checking some aspects of formatting; although this does not affect the contributions presented in the manuscript in any way, it is important to verify, e.g., the alignment of figures, tables, and their titles.

-> Some formatting errors have been corrected; any remaining issues will be addressed with the MDPI editor.
