Comparison of Different Methods of Animal Detection and Recognition on Thermal Camera Images
Round 1
Reviewer 1 Report
In the article "Comparison of Different Methods of Animal Detection and Recognition on Thermal Camera Images", the authors developed approaches for the automatic recognition of animal silhouettes. The input data were night-time images from an IR camera. In the course of the work, the authors successfully addressed the tasks of detecting objects in images, separating them, and classifying them. I believe that this work can be published in the journal after minor improvements.
1) Additional files should be attached. The training database should be made publicly accessible, for example on GitHub.
2) It is worth discussing whether recognition with the HOG/SVM algorithm could be improved by, for example, changing the brightness threshold values.
3) The rate-limiting stages of image processing, excluding training, should be identified precisely: how much time does each algorithm spend on segmenting and recognizing objects?
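As an illustration of the timing breakdown requested in point 3, the following is a minimal sketch of per-stage wall-clock measurement. It is not the authors' pipeline: the `segment` and `recognize` functions, the threshold of 128, and the synthetic 8x8 images are hypothetical placeholders that would be replaced by the real HOG/SVM, YOLO, or Faster R-CNN stages.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per stage name.
stage_totals = defaultdict(float)

@contextmanager
def timed(stage):
    """Add the elapsed time of the enclosed block to stage_totals[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        stage_totals[stage] += time.perf_counter() - start

def segment(image):
    # Hypothetical placeholder: mark pixels brighter than an assumed threshold.
    return [[1 if px > 128 else 0 for px in row] for row in image]

def recognize(mask):
    # Hypothetical placeholder: classify by the fraction of warm pixels.
    warm = sum(sum(row) for row in mask)
    total = sum(len(row) for row in mask)
    return "animal" if total and warm / total > 0.1 else "background"

def process(images):
    """Run both stages on every image, timing each stage separately."""
    labels = []
    for image in images:
        with timed("segmentation"):
            mask = segment(image)
        with timed("recognition"):
            labels.append(recognize(mask))
    return labels

# Two synthetic images: one uniformly warm, one uniformly cold.
images = [[[200] * 8 for _ in range(8)], [[10] * 8 for _ in range(8)]]
labels = process(images)

for stage, seconds in stage_totals.items():
    print(f"{stage}: {seconds / len(images) * 1e3:.3f} ms per image")
```

Reporting the mean time per image for each stage, measured on the same hardware for every method, would show which stage limits throughput without involving training time.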
Author Response
Dear Sir/Madam,
(1) Please be informed that this dataset will be commercialized in the future, which is why it is not publicly available.
(2) The updated version contains a passage explaining why changing the brightness will not change the results of the HOG/SVM method (lines 328-336).
(3) There was not enough time during the revision to perform an additional study for the suggested time assessment.
Best regards,
Łukasz Popek
Author Response File: Author Response.pdf
Reviewer 2 Report
The authors present animal detection based on YOLO. The study is interesting. In general, the main conclusions presented in the paper are supported by the figures and supporting text. However, to meet the journal's quality standards, the following comments need to be addressed.
• Abstract: It should be improved and extended. The authors say a lot about the problem formulation, but the novelty of the proposed model is missing. Please also describe the general applicability of the model, and state specifically the main quantitative results to attract a general audience.
• The introduction can be improved. The authors should focus on elaborating the novelty of the current study. Emphasis should be given to the improvement of the model (in a quantitative sense) compared to existing state-of-the-art models.
• More details about network architecture and complexity of the model should be provided.
• What about a comparison of the results with current state-of-the-art models? Did the authors perform an ablation study to compare with different models?
• What are the baseline models and benchmark results? The authors could compare their results with those of existing models evaluated on datasets.
• The conclusion section needs to be strengthened.
• Please provide a fair account of the weaknesses and limitations of the model, and how it can be improved.
• Typographical errors: There are several minor grammatical errors and incorrect sentence structures. Please run the manuscript through a spell checker.
• The following references can be added as relevant object detection references: Neural Comput & Applic (2022) https://doi.org/10.1007/s00521-021-06651-x; Ecological Informatics (2022) https://doi.org/10.1016/j.ecoinf.2022.101919; Symmetry (2022) https://doi.org/10.3390/sym14101976. They should be briefly discussed in the related work section.
Author Response
Dear Sir/Madam,
(1-2) The abstract and introduction were improved. Please check whether the changes are sufficient.
(3) The network architectures are base models with default parameter values.
(4) The updated version includes a comparison between our results and benchmark results.
(5) The weakness of the model is described in lines 367-372 of the updated version of the manuscript.
(6) We hope there are no more lexical errors.
Best regards,
Łukasz Popek
Author Response File: Author Response.pdf
Reviewer 3 Report
- The English of the article needs great improvement; see, for example, lines 11-12 and lines 16-17.
- Typo on line 29: "technic".
- There is no such thing as "as autonomous as possible"; a system is either autonomous or not (line 35).
- The writing style needs to be scientific. What does "approach is too inaccurate" on line 32 mean? It is better to write "this approach suffers from low accuracy, high error rate, etc."
- The introduction needs expansion to include motivation and a critical literature review. It would be worthwhile to acknowledge the methods used by citing relevant research that uses YOLO and Faster R-CNN; see "Detection of K-complexes in EEG signals using deep transfer learning and YOLOv3", Cluster Comput (2022), https://doi.org/10.1007/s10586-022-03802-0 and "Detection of K-complexes in EEG waveform images using faster R-CNN and deep transfer learning", BMC Med Inform Decis Mak 22, 297 (2022), https://doi.org/10.1186/s12911-022-02042-x
- More details about the research methods are needed, e.g., the feature extraction backbone, the number of anchors, and the location of the feature extraction heads.
- What is the experimental setup? I.e., what machine and software did you use? This is necessary to assess the reported times in Table 1.
- Line 100: "exam-ples".
- What is the number of subjects per category?
- What is the data split and training/validation strategy?
- The paper lacks many technical details.
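To make the questions about subjects per category and the data split concrete, the following is a minimal sketch of a seeded, stratified train/validation/test split that also reports per-class counts. It is an illustration only: the 70/15/15 fractions, the seed, and the class names are assumptions and are not taken from the manuscript.

```python
import random
from collections import Counter, defaultdict

def stratified_split(labels, fractions=(0.7, 0.15, 0.15), seed=0):
    """Split sample indices per class so each subset keeps the class balance.

    The fractions and the seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_train = int(round(fractions[0] * len(indices)))
        n_val = int(round(fractions[1] * len(indices)))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test

# Hypothetical class names and counts, used only to exercise the function.
labels = ["deer"] * 20 + ["boar"] * 20 + ["fox"] * 20
train, val, test = stratified_split(labels)

for name, subset in (("train", train), ("val", val), ("test", test)):
    counts = Counter(labels[i] for i in subset)
    print(name, len(subset), dict(counts))
```

Stating the seed, the fractions, and the resulting per-class counts in the manuscript would let readers reproduce the split exactly.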
Author Response
Dear Sir/Madam,
(1) The manuscript was revised with respect to lexical errors. We hope that it is now correct.
(2) The introduction was extended with a literature review and the current state of the art. Please see the attached file.
(3) Technical data about the hardware were added (line 225), and the technical details were also improved. Please be informed that the models used were the default versions of Faster R-CNN and YOLO.
(4) The new version of the manuscript contains more detail about the images and the training strategy (lines 138-145).
Best regards,
Łukasz Popek
Author Response File: Author Response.pdf
Round 2
Reviewer 2 Report
The revised manuscript is now suitable for publication.
Author Response
Dear Sir/Madam,
We made additional minor changes at the request of the third reviewer.
Best regards,
Łukasz Popek
Author Response File: Author Response.docx
Reviewer 3 Report
I am not sure why the authors refrained from adding the technical details requested, mainly the following points:
- More details about the research methods are needed, e.g., the feature extraction backbone, the number of anchors, and the location of the feature extraction heads.
- What is the experimental setup? I.e., what machine and software did you use? This is necessary to assess the reported times in Table 1.
- Line 100: "exam-ples".
- What is the number of subjects per category?
- What is the data split and training/validation strategy?
If the authors used the default settings, then those need to be stated for reproducibility of the results.
- The caption for Table 1 is wrong.
Author Response
Dear Sir/Madam,
(1) The details of the architectures currently used are described in lines 175-176. It is further emphasized in line 183 that no parameters were changed from the default versions available in the repositories.
(2) Technical details about the machine and software are provided in line 161 for HOG/SVM (because training and testing were performed on a CPU). The available details for the Google Colab GPU are summarized in line 189. If these details do not meet technical expectations, please inform us what other information should be added.
(3) Already corrected.
(4) The exact number of subjects in the whole dataset is given in line 126.
(5) The information about the dataset split is given in line 128. We kindly ask for a recommendation on what further information should be added to make it satisfactory.
Kind regards,
Łukasz Popek
Author Response File: Author Response.docx