Peer-Review Record

Fish Monitoring from Low-Contrast Underwater Images

Electronics 2023, 12(15), 3338; https://doi.org/10.3390/electronics12153338

by Nikos Petrellis¹,*, Georgios Keramidas², Christos P. Antonopoulos¹ and Nikolaos Voros¹

Reviewer 1: Ke Gu

Reviewer 2: Youfa Liu

Reviewer 3: Anonymous

Reviewer 4: Anonymous

Submission received: 10 June 2023 / Revised: 25 July 2023 / Accepted: 2 August 2023 / Published: 4 August 2023

(This article belongs to the Special Issue Advances in the Use of Artificial Intelligence (AI)/Machine Learning (ML) and IoT in the Primary Sector)

Round 1

Reviewer 1 Report

In this paper, the authors propose a method for estimating the morphological features of fish from low-contrast underwater images and videos. Shape alignment based on the Ensemble of Regression Trees (ERT) machine learning method is applied to estimate the size of the fish and the position of body parts of special interest, and a high-speed fish orientation detection method based on OpenCV services supports fish tracking. Hardware acceleration techniques suited to the fish detection and shape alignment phases are used for real-time video processing, which is a strength of the work. However, the following issues need attention:

1. What is the ultimate purpose of fish detection and identification proposed by the authors? I am very curious about the practical application content and specific application scenarios and fields of your work.

2. It is suggested that the authors present the specific content of the final dataset, such as the number of fish of each kind, the number of feature categories, the number of pictures per category, the overall number of pictures, etc., which will help readers gain a deeper understanding of your method.

3. Hardware platform acceleration, fish frame selection and the ERT model are all existing methods of previous researchers, and the authors seem only to apply them to their own dataset, without any innovation in the methods themselves. If there is innovation in your approach, it is recommended to emphasize this part.

4. The error of your method in the results of Table 6 is larger than that of the method in reference 17. I hope you can add your explanation and opinion on this result and add this content to the future work outlook of your method.

5. As you said, the image contrast in your dataset is low, but this is not described anywhere in the paper. It is suggested that the authors add such a description, for example, what measures your method takes to perform better on low-contrast images, and how low the contrast of the dataset is. In addition, the authors do not mention at the beginning how well other methods localize fish in low-contrast underwater images, so the motivation for proposing a method for low contrast is missing.

6. It is recommended that the authors cite some work on image quality enhancement of contrast, which can be cited in related work to make the authors' description of the field more comprehensive: “The analysis of image contrast: From quality assessment to automatic enhancement”, “Automatic contrast enhancement technology with saliency preservation”, “No-reference quality metric of contrast-distorted images based on information maximization”.

Author Response

Thank you very much for the time you spent reading our manuscript and for your valuable comments. We believe we have adequately addressed those comments. Please find below our responses and references to the sections of the manuscript where each comment is addressed. We are at your disposal for any other improvement or clarification required.

Comment 1

What is the ultimate purpose of fish detection and identification proposed by the authors? I am very curious about the practical application content and specific application scenarios and fields of your work.

Author Response 1.

Thank you very much for your comment. The proposed framework consists of the following modules: a) fish detection, b) orientation, c) tracking, d) morphological feature measurement based on shape alignment. This framework can be exploited both in aquaculture and in open waters.

Applications for aquacultures:

- everyday noninvasive morphological feature measurement (size estimation, fish health assessment, feeding needs, optimal harvest time)

- fish tracking to assess fish behavior (health and feeding needs)

Applications for open waters: fish population measurement (from fish detection), fish behavior under stress (fish tracking), fish dimensions measurement (from morphological feature measurement), fish variation recognition and health assessment (from shape)

Please see:

the revised Abstract,
the modified-extended 1st paragraph of Introduction where the applications are listed
the new 3rd paragraph (lines 68-86) of Introduction where the research gap in the referenced approaches is discussed and the motivation of our work is set
the extended 7th paragraph (lines 119-131) of Introduction where our previous work is compared
the paragraph before the last one in Section 5 (lines 894-899)

Comment 2

It is suggested that the authors present the specific content of the final dataset, such as the number of fish of each kind, the number of feature categories, the number of pictures per category, the overall number of pictures, etc., which will help readers gain a deeper understanding of your method.

Author Response 2.

Thank you very much for your suggestions. More information about the dataset used is given in the revised manuscript. The developed dataset (UVIMEF) has been made public on Kaggle and is referenced in [35].

Please see

the extended section 3.1 where the details of the dataset are described and especially:
the new last 2 sentences of the 1st paragraph of section 3.1 where the public UVIMEF dataset is referenced
the new Table 1 where the major fish categories used to train/test our models are listed with the number of photographs in each category
the new paragraph 3 of section 3.1 where the species appearing in Table 1 and balancing issues are discussed
the extended 1st paragraph before the end of Section 3.1.

Comment 3

Hardware platform acceleration, fish frame selection and the ERT model are all existing methods of previous researchers, and the authors seem only to apply them to their own dataset, without any innovation in the methods themselves. If there is innovation in your approach, it is recommended to emphasize this part.

Author Response 3

Thank you for your remark. Although many of our framework modules are based on existing approaches, numerous modifications, adaptations and improvements have been applied that can be considered as innovations and are listed below:

ERT and its DEST package implementation were developed for face shape alignment, which is quite a different problem from fish shape alignment. Apart from the different number of landmarks needed to describe each shape, the fish shape is not symmetrical, and therefore orientation classification of the input image is necessary.
We have not used the DEST package as is; we have heavily modified it in order to port it to 3 different platforms (Ubuntu environment, Xilinx Vitis, MS Visual Studio 2019). More importantly, we modified its source code, replacing time-consuming Eigen calls with optimized fast C code that can also be implemented in hardware. With the software and hardware acceleration techniques developed by the authors, the frame processing latency was less than 0.5 µs on an Intel i5 platform and less than 16 ms on an embedded hardware target board.
Multiple ERT models have been trained in the revised manuscript to trade off between speed and accuracy.
The employed fish detection method was adapted by developing Python scripts to run locally and extract information (bounding box coordinates) needed by the other modules (orientation classification, fish tracking, shape alignment).
A novel fast draft orientation classification method is proposed
A number of fish tracking techniques are proposed like the fish bounding box interpolation for higher speed. Image buffering is proposed to validate the interpolation between actual fish localizations. The history of fish positions can be exploited to achieve more accurate fish orientation results while in the opposite direction fish orientation results can also be used to validate the tracking results.
A new landmark editor (LAE) has been developed
A new public dataset (UVIMEF) with low contrast/resolution images has been developed to test our approach under worst case conditions
The developed shape alignment method was trained with Mediterranean fish species in the category of seabream but can also be extended to any species.
The overall system architecture (fish detection+orientation+tracking+shape alignment) is also an innovative combination that achieves both high speed and accuracy.
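
The fish bounding box interpolation mentioned in the list above can be illustrated with a minimal sketch. It assumes, for illustration only, that the boxes of skipped frames are linearly interpolated between two actual detections; the function name `interpolate_boxes`, the `(x, y, width, height)` box layout and the linear scheme are assumptions and are not taken from the manuscript.

```python
from typing import List, Tuple

# Hypothetical box layout: (x, y, width, height) in pixels.
Box = Tuple[float, float, float, float]


def interpolate_boxes(box_a: Box, box_b: Box, skipped: int) -> List[Box]:
    """Estimate boxes for `skipped` frames lying between two detections.

    box_a and box_b are boxes returned by the (slow) detector for two
    frames; the frames in between are not passed to the detector and
    their boxes are linearly interpolated instead (an assumed scheme).
    """
    if skipped < 0:
        raise ValueError("skipped must be non-negative")
    boxes: List[Box] = []
    for k in range(1, skipped + 1):
        t = k / (skipped + 1)  # fractional position between the detections
        boxes.append(tuple(a + t * (b - a) for a, b in zip(box_a, box_b)))
    return boxes


# Three skipped frames between two detections of the same fish.
estimated = interpolate_boxes((0, 0, 10, 10), (40, 20, 10, 10), skipped=3)
```

Buffering the skipped frames until the second detection arrives, as the response describes, is what makes both end points available before the intermediate boxes are computed.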

Please see:

The extended Abstract
The extended 3rd paragraph of Introduction for the motivation of our work (lines 68-86)
The extended 5th paragraph of Introduction and especially from sentence 3 to the end of this paragraph where the adaptation of the fish detection method, the novel orientation classification and the fish tracking methods are introduced (lines 98-108)
Last 2 sentences of paragraph 6 in Introduction where the porting of DEST application to different target environments is mentioned (lines 114-118). For porting to different targets please also see the 3rd paragraph of Section 3.7 (lines 632-640)
Concerning our previous work [10], although we can exploit the absolute dimension measurement method, every other tool has been replaced and much higher speed is achieved with better accuracy as described in the extended paragraph 7 of Introduction (lines 119-131)
The modified paragraph 9 of Introduction that has been rewritten to stress the novelties and contributions of our approach (lines 150-160)
Modified Section 3.2, where the customizations of the fish detection and the principle of operation of fish tracking and orientation are introduced.
More details about what we changed in fish detection can be found in the 2nd paragraph (lines 332-340) and details of the proposed novel Fish Position Interpolation (FPI) from the 4th paragraph (line 353) to the end of section 3.3. Please also see Fig. 3 and 4
The whole section 3.4 proposes a novel fast draft orientation classification technique
Section 3.5 also proposes a tracking procedure. The extensions that could also be taken into consideration for improving tracking results are also discussed in the last paragraph of this section
The 2nd paragraph of 3.7 describes the new LAE landmark editor that has been developed in the context of this work (lines 615-631, Fig. 9). In the same paragraph and Table 3, the parameters that vary in the different ERT models that we trained and compare are discussed.
In paragraphs 4 and 5 of Section 3.7 the text that also existed in the previous paper version describes the software and hardware acceleration techniques applied for ERT shape alignment. However, we added the new last 2 paragraphs in this section that give specific details about the novel hardware acceleration that we propose and can also be useful for other applications where the hardware kernel needs large data transfers. Table 3 and Fig. 11 are also new.
The last 3 paragraphs of section 5 (Discussion) describe the achievements of the proposed approach compared to referenced ones, the contribution but also the limitations of our work.
Finally, please also see the revised conclusion section and especially lines 918-925 where the speed and accuracy achievements are summarized

Comment 4

The error of your method in the results of Table 6 is larger than that of the method in reference 17. I hope you can add your explanation and opinion on this result and add this content to the future work outlook of your method.

Author Response 4

Thank you for your remark. In [27] (old reference [17]), the dataset contains fish that are clearly visible. Our approach is applied under much worse conditions in terms of image contrast, lighting conditions, murky waters, etc. Still, we get a comparable error even under these much worse conditions. Please also take into consideration that our fish height estimation is performed with lower error than [27]. Moreover, the speed of [27] is not reported, so it may be considerably slower than our approach. Please see:

Extended Table 9 with comparison with referenced approaches
The new Fig. 13 that gives more information about the accuracy for various morphological feature estimations with models M1-M4 and paragraph 14 of Section 5 (lines 747-749) that refers to Fig. 13
Concerning the contrast of our dataset (UVIMEF) please see the new first 5 paragraphs of Section 5 (lines 788-821) and new eq. (19-21) as well as the new Table 10
The 3rd sentence (lines 824-825) of the paragraph below Table 10 that updates the errors in fish length/height measurements
The new paragraphs 9 and 10 in Section 5 (lines 850-862) where more discussion about accuracy and speed takes place
Finally, please see the extended paragraph 12 (lines 880-893) of Section 5 where the achieved results are discussed in comparison with the referenced approaches.

Comment 5

As you said, the image contrast in your dataset is low, but this is not described anywhere in the paper. It is suggested that the authors add such a description, for example, what measures your method takes to perform better on low-contrast images, and how low the contrast of the dataset is. In addition, the authors do not mention at the beginning how well other methods localize fish in low-contrast underwater images, so the motivation for proposing a method for low contrast is missing.

Author Response 5

Thank you very much for your suggestion. As already described in our response to the previous comment, our approach is applied under much worse conditions in terms of image contrast, lighting conditions, murky waters, etc. Still we get a comparable error even under much worse conditions.

Please see concerning the contrast of our dataset (UVIMEF):

The new paragraph 8 of Introduction (lines 132-149) where the effect of contrast is discussed as well as how it can be improved to reduce the error in the shape alignment process
the new first 5 paragraphs of Section 5 (lines 788-821)
new eq. (19-21)
the new Table 10
the new Fig. 14 (sample images from ImageCLEF, Fish4Knowledge) and compare it to Fig. 7 (sample fish images from UVIMEF)
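
The contrast metrics of eq. (19-21) are not reproduced in this record. As a rough illustration of how the contrast of a dataset image can be quantified, the sketch below computes two widely used measures, Michelson contrast and RMS contrast, on a flat list of grayscale intensities. Whether the manuscript uses these particular measures is an assumption, and the function names and the sample pixel values are hypothetical.

```python
from math import sqrt
from typing import Sequence


def michelson_contrast(pixels: Sequence[float]) -> float:
    """(Imax - Imin) / (Imax + Imin) over all pixel intensities."""
    lo, hi = min(pixels), max(pixels)
    if hi + lo == 0:
        return 0.0  # an all-black image has no measurable contrast
    return (hi - lo) / (hi + lo)


def rms_contrast(pixels: Sequence[float], max_value: float = 255.0) -> float:
    """Standard deviation of the intensities normalized to [0, 1]."""
    norm = [p / max_value for p in pixels]
    mean = sum(norm) / len(norm)
    return sqrt(sum((v - mean) ** 2 for v in norm) / len(norm))


# Hypothetical 8-bit intensities: a murky frame and a high-contrast frame.
murky = [100, 110, 120, 110]
clear = [0, 255, 0, 255]
murky_scores = (michelson_contrast(murky), rms_contrast(murky))
clear_scores = (michelson_contrast(clear), rms_contrast(clear))
```

Both measures are lower for the murky frame than for the high-contrast one, which is the kind of gap a comparison between datasets would be expected to expose.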

Comment 6

It is recommended that the authors cite some work on image quality enhancement of contrast, which can be cited in related work to make the authors' description of the field more comprehensive: “The analysis of image contrast: From quality assessment to automatic enhancement”, “Automatic contrast enhancement technology with saliency preservation”, “No-reference quality metric of contrast-distorted images based on information maximization”.

Author Response 6

Thank you for your suggestion. We have added two of the suggested references and one of our previous works; these concern image normalization for different applications. Please see our response to the previous comment as well as:

the new references [11]-[13]
The new paragraph 8 of Introduction (lines 132-149) where the effect of contrast is discussed as well as how it can be improved to reduce the error in the shape alignment process

Reviewer 2 Report

1. In the Abstract, summarize previous work to demonstrate the innovation of your work.

2. Is your work the first innovation in the field of fish image capture?

3. Is the loss function of your model biologically meaningful or interpretable?

4. The code link for your model is not found in the article, please provide it.

There are some typos in this manuscript. Please correct them carefully.

Author Response

Comment 1

In the Abstract, summarize previous work to demonstrate the innovation of your work.

Author Response 1.

Thank you for your suggestion. The Abstract has been revised. The first 3 sentences summarize the applications where the proposed framework can be employed. The rest of the Abstract highlights the unique features that characterize the developed system.

Please also see the new 3rd paragraph of the Introduction where the motivation of our work is described in detail.

Comment 2

Is your work the first innovation in the field of fish image capture?

Author Response 2.

ERT and its DEST package implementation were developed for face shape alignment, which is quite a different problem from fish shape alignment. Apart from the different number of landmarks needed to describe each shape, the fish shape is not symmetrical, and therefore orientation classification of the input image is necessary.
We have not used the DEST package as is; we have heavily modified it in order to port it to 3 different platforms (Ubuntu environment, Xilinx Vitis, MS Visual Studio 2019). More importantly, we modified its source code, replacing time-consuming Eigen calls with optimized fast C code that can also be implemented in hardware. With the software and hardware acceleration techniques developed by the authors, the frame processing latency was less than 0.5 µs on an Intel i5 platform and less than 16 ms on an embedded hardware target board.
Multiple ERT models have been trained in the revised manuscript to trade off between speed and accuracy.
The employed fish detection method was adapted by developing Python scripts to run locally and extract information (bounding box coordinates) needed by the other modules (orientation classification, fish tracking, shape alignment).
A novel fast draft orientation classification method is proposed
A number of fish tracking techniques are proposed like the fish bounding box interpolation for higher speed. Image buffering is proposed to validate the interpolation between actual fish localizations. The history of fish positions can be exploited to achieve more accurate fish orientation results while in the opposite direction fish orientation results can also be used to validate the tracking results.
A new landmark editor (LAE) has been developed
A new public dataset (UVIMEF) with low contrast/resolution images has been developed to test our approach under worst case conditions
The developed shape alignment method was trained with Mediterranean fish species in the category of seabream but can also be extended to any species.
The overall system architecture (fish detection+orientation+tracking+shape alignment) is also an innovative combination that achieves both high speed and accuracy.

Please see:

The extended Abstract
The extended 3rd paragraph of Introduction for the motivation of our work (lines 68-86)
The extended 5th paragraph of Introduction and especially from sentence 3 to the end of this paragraph where the adaptation of the fish detection method, the novel orientation classification and the fish tracking methods are introduced (lines 98-108)
Last 2 sentences of paragraph 6 in Introduction where the porting of DEST application to different target environments is mentioned (lines 114-118). For porting to different targets please also see the 3rd paragraph of Section 3.7 (lines 632-640)
Concerning our previous work [10], although we can exploit the absolute dimension measurement method, every other tool has been replaced and much higher speed is achieved with better accuracy as described in the extended paragraph 7 of Introduction (lines 119-131)
The modified paragraph 9 of Introduction that has been rewritten to stress the novelties and contributions of our approach (lines 150-160)
Modified Section 3.2, where the customizations of the fish detection and the principle of operation of fish tracking and orientation are introduced.
More details about what we changed in fish detection can be found in the 2nd paragraph (lines 332-340) and details of the proposed novel Fish Position Interpolation (FPI) from the 4th paragraph (line 353) to the end of section 3.3. Please also see Fig. 3 and 4
The whole section 3.4 proposes a novel fast draft orientation classification technique
Section 3.5 also proposes a tracking procedure. The extensions that could also be taken into consideration for improving tracking results are also discussed in the last paragraph of this section
The 2nd paragraph of 3.7 describes the new LAE landmark editor that has been developed in the context of this work (lines 615-631, Fig. 9). In the same paragraph and Table 3, the parameters that vary in the different ERT models that we trained and compare are discussed.
In paragraphs 4 and 5 of Section 3.7 the text that also existed in the previous paper version describes the software and hardware acceleration techniques applied for ERT shape alignment. However, we added the new last 2 paragraphs in this section that give specific details about the novel hardware acceleration that we propose and can also be useful for other applications where the hardware kernel needs large data transfers. Table 3 and Fig. 11 are also new.
The last 3 paragraphs of section 5 (Discussion) describe the achievements of the proposed approach compared to referenced ones, the contribution but also the limitations of our work.
Finally, please also see the revised conclusion section and especially lines 918-925 where the speed and accuracy achievements are summarized

Comment 3

Is the loss function of your model biologically meaningful or interpretable?

Author Response 3

Thank you for your comment. The meaning of the error in the estimation of fish dimensions is explained in Section 5 and especially in the modified 3 paragraphs after the new Table 10 (lines 822-849). In these paragraphs the relative error is translated to an absolute distance range measured in cm.

This error can be reduced if images with higher contrast are used. Please see the new paragraph 8 of Introduction (lines 132-149) where the effect of contrast is discussed as well as how it can be improved to reduce the error in the shape alignment process.
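
The translation of a relative error into an absolute distance range, as described in the response above, amounts to a simple multiplication. A minimal sketch follows; the fish length of 30 cm and the 5% relative error are hypothetical values chosen only to show the arithmetic and are not results from the manuscript, and `error_range_cm` is an illustrative name.

```python
from typing import Tuple


def error_range_cm(true_length_cm: float, relative_error: float) -> Tuple[float, float]:
    """Return the (low, high) range of estimates implied by a relative error.

    relative_error is a fraction, e.g. 0.05 for 5%.
    """
    if true_length_cm <= 0 or relative_error < 0:
        raise ValueError("length must be positive and error non-negative")
    delta = true_length_cm * relative_error  # absolute error in cm
    return (true_length_cm - delta, true_length_cm + delta)


# Hypothetical example: a 30 cm fish measured with a 5% relative error
# would be reported somewhere between roughly 28.5 cm and 31.5 cm.
low_cm, high_cm = error_range_cm(30.0, 0.05)
```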

Comment 4

The code link for your model is not found in the article, please provide it.

Author Response 4

The source code is not publicly available, at least not yet; however, it can be provided to anybody interested under a specific agreement. Nevertheless, the original ERT code can be found in the DEST repository [8]. We also made public a first version of the UVIMEF dataset we developed ([35]). It will soon be enriched with additional videos/photographs, which are continuously captured and must be evaluated before a new version is uploaded. Demo videos are also available (their links were given during submission as supplementary material) showing shape alignment (facial and fish) with the tools we have developed.

Reviewer 3 Report

The reviewer has the following critical comments regarding the article.

1. What is the problem statement and research gap? Explicitly mention in abstract and introduction.

2. An introduction must be coherent with abstract. Currently, related work is mixed in introduction which is not a good practice. I would rather recommend to add related work in a separate section and make introduction coherent with abstract.

3. Justify novelty of the proposed method and contributions.

4. Detailed information about experimental setup is required.

5. Experimental results section is too weak and detailed experimental analysis is required.

6. I can't see any comparison with state of the art models. It is necessary to compare with existing state of the art models.

7. Ablation study must be added.

8. A conclusion must be representative of the work described in the main body. Please make it coherent with work.

9. Figures must be improved and can be drawn with some efficient tools instead of Microsoft Excel.

English language is fine.

Author Response

Comment 1

What is the problem statement and research gap? Explicitly mention in abstract and introduction.

Author Response 1.

Thank you for your suggestion. We provide a combination of tools that can be exploited in several applications as described in the 1st paragraph of the Introduction. The referenced approaches for morphological feature estimation are either fast or accurate. The system architecture we propose attempts to provide a solution that is both fast and accurate.

The Abstract has been revised. The first 3 sentences summarize the applications where the proposed framework can be employed. The rest of the Abstract highlights the unique features that characterize the developed system.

Please also see the new 3rd paragraph (lines 68-86) of the Introduction where the motivation of our work is described in detail.

The modified 7th paragraph (lines 119-131) of Introduction describes the differences from our previous work and the modified-extended 9th paragraph (lines 150-160) summarizes the contribution of our approach.

Comment 2

An introduction must be coherent with abstract. Currently, related work is mixed in introduction which is not a good practice. I would rather recommend to add related work in a separate section and make introduction coherent with abstract.

Author Response 2

Thank you for your remark. The Introduction has been split: the Introduction section now describes the general problem and research gap and introduces our approach, while the description of the references has been moved to a separate section (Related Work).

Comment 3

Justify novelty of the proposed method and contributions.

Author Response 3.

ERT and its DEST package implementation were developed for face shape alignment, which is quite a different problem from fish shape alignment. Apart from the different number of landmarks needed to describe each shape, the fish shape is not symmetrical, and therefore orientation classification of the input image is necessary.
We have not used the DEST package as is; we have heavily modified it in order to port it to 3 different platforms (Ubuntu environment, Xilinx Vitis, MS Visual Studio 2019). More importantly, we modified its source code, replacing time-consuming Eigen calls with optimized fast C code that can also be implemented in hardware. With the software and hardware acceleration techniques developed by the authors, the frame processing latency was less than 0.5 µs on an Intel i5 platform and less than 16 ms on an embedded hardware target board.
Multiple ERT models have been trained in the revised manuscript to trade off between speed and accuracy.
The employed fish detection method was adapted by developing Python scripts to run locally and extract information (bounding box coordinates) needed by the other modules (orientation classification, fish tracking, shape alignment).
A novel fast draft orientation classification method is proposed
A number of fish tracking techniques are proposed like the fish bounding box interpolation for higher speed. Image buffering is proposed to validate the interpolation between actual fish localizations. The history of fish positions can be exploited to achieve more accurate fish orientation results while in the opposite direction fish orientation results can also be used to validate the tracking results.
A new landmark editor (LAE) has been developed
A new public dataset (UVIMEF) with low contrast/resolution images has been developed to test our approach under worst case conditions
The developed shape alignment method was trained with Mediterranean fish species in the category of seabream but can also be extended to any species.
The overall system architecture (fish detection+orientation+tracking+shape alignment) is also an innovative combination that achieves both high speed and accuracy.

Please see:

The extended Abstract
The extended 3rd paragraph of Introduction for the motivation of our work (lines 68-86)
The extended 5th paragraph of Introduction and especially from sentence 3 to the end of this paragraph where the adaptation of the fish detection method, the novel orientation classification and the fish tracking methods are introduced (lines 98-108)
Last 2 sentences of paragraph 6 in Introduction where the porting of DEST application to different target environments is mentioned (lines 114-118). For porting to different targets please also see the 3rd paragraph of Section 3.7 (lines 632-640)
Concerning our previous work [10], although we can exploit the absolute dimension measurement method, every other tool has been replaced and much higher speed is achieved with better accuracy as described in the extended paragraph 7 of Introduction (lines 119-131)
The modified paragraph 9 of Introduction that has been rewritten to stress the novelties and contributions of our approach (lines 150-160)
Modified Section 3.2, where the customizations of the fish detection and the principle of operation of fish tracking and orientation are introduced.
More details about what we changed in fish detection can be found in the 2nd paragraph (lines 332-340) and details of the proposed novel Fish Position Interpolation (FPI) from the 4th paragraph (line 353) to the end of section 3.3. Please also see Fig. 3 and 4
The whole section 3.4 proposes a novel fast draft orientation classification technique
Section 3.5 also proposes a tracking procedure. The extensions that could also be taken into consideration for improving tracking results are also discussed in the last paragraph of this section
The 2nd paragraph of 3.7 describes the new LAE landmark editor that has been developed in the context of this work (lines 615-631, Fig. 9). In the same paragraph and Table 3, the parameters that vary in the different ERT models that we trained and compare are discussed.
In paragraphs 4 and 5 of Section 3.7 the text that also existed in the previous paper version describes the software and hardware acceleration techniques applied for ERT shape alignment. However, we added the new last 2 paragraphs in this section that give specific details about the novel hardware acceleration that we propose and can also be useful for other applications where the hardware kernel needs large data transfers. Table 3 and Fig. 11 are also new.
The last 3 paragraphs of section 5 (Discussion) describe the achievements of the proposed approach compared to referenced ones, the contribution but also the limitations of our work.

Finally, please also see the revised conclusion section and especially lines 918-925 where the speed and accuracy achievements are summarized

Comment 4

Detailed information about experimental setup is required.

Author Response 4

Thank you for your comment. The experimental setup is described in the extended Section 3.1. Additional information included in this revised version of our manuscript:

The employed dataset has been made public on Kaggle and is referenced in [35]. Please see the last 2 sentences of the 1st paragraph of Section 3.1 (lines 235-238)
See the extended 3rd paragraph (lines 257-268) of Section 3.1.
The new Table 1 where the photographs per fish species that have been used for training/testing are listed.

Comment 5

Experimental results section is too weak and detailed experimental analysis is required.

Author Response 5

Thank you for your comment. The experimental Section 4 has been expanded with a comparison between 4 ERT models that are presented in the new Table 4, and their accuracy is compared in the new Fig. 13.

Table 9 has also been extended to include our previous work and frame processing latencies to compare the speed of various approaches

Please also see the extended discussion:

The contrast metrics and measurements (first 5 paragraphs of Section 5 and the new Table 10 and Fig. 14)
The modified and extended paragraphs 6-8 of Section 5 that translate the relative error into meaningful distances (lines 822-849)
The speed and accuracy comparison of the 4 new ERT models in the estimation of fish length, height and the position of the eyes and gills in paragraphs 9-10 of Section 5 (lines 850-862)

Comment 6

I can't see any comparison with state of the art models. It is necessary to compare with existing state of the art models.

Author Response 6

Thank you for your suggestion. Table 9 has been extended with more approaches and a comparison of their speed and accuracy. Our previous work has also been included to highlight the improvements in speed and accuracy. Multiple ERT models have also been trained, and their speed and accuracy are compared.

Comment 7

Ablation study must be added.

Author Response 7

Thank you for your remark. We believe that the 4 new ERT models that have been trained and compared in terms of accuracy/speed provide a kind of ablation study. Please see

Table 4 and the preceding paragraph (lines 704-709),
the new Fig. 13 and
the extended paragraph 12 (lines 880-893) in Section 5

Comment 8

A conclusion must be representative of the work described in the main body. Please make it coherent with work.

Author Response 8

Thank you for your suggestion. The Conclusions section has been modified to better represent the work described in the paper.

Comment 9

Figures must be improved and can be drawn with some efficient tools instead of Microsoft Excel.

Author Response 9

Thank you very much for your suggestion. Figure 12 (old Fig. 11) was re-plotted in Octave and replaced. We have also added the new Fig. 13, which was also plotted in Octave.

Reviewer 4 Report

The work focuses on the detection of Mediterranean fish using various image processing techniques, machine learning, and hardware acceleration. The authors acknowledge that the work is adapted from their previous work [31] published in Applied Sciences. Also, the authors clarify that the proposed method replaces two components of the previous method in [31] with two new ones (contour detection using shape alignment and ERT), and they maintain the absolute fish dimension estimation with stereo vision used in [31].

It is appreciated that the authors include references to their previous works and describe the changes introduced. However, to know with certainty the degree of novelty of the proposal and if it really provides a significant improvement to the state of the art, it is necessary to compare the proposed method with the original method.

Finally, I recognize that the work shows potential to receive interest from the specialized community, however, without resolving the previous defect it is impossible to move forward.

Minor editing of English language required

Author Response

Comment 1

Author Response 1

Thank you very much for your comments. The previous work [10] was based on completely different techniques (pattern matching, BMA, SCIA) that in most cases showed a much higher error and a latency orders of magnitude higher. The target of the present work was to develop a shape alignment method based on ERT that can be supported with hardware acceleration and can achieve high accuracy. In [10], different fish species and clearer photos were used. In this work, a new dataset (UVIMEF) with low-contrast images has been developed [35], so that the results are obtained under worst-case conditions. The results show that very good accuracy is achieved at a high frame processing speed. Moreover, advanced techniques for fish detection, orientation and tracking are described in the present work that were not presented in our previous work. For the differences from our previous work, please see:

The new 7th paragraph, before the end of the Introduction (lines 119-131): although we can still exploit the absolute dimension measurement method, every other tool has been replaced, and much higher speed is achieved with better accuracy.
The extended Table 9, where the results from previous work are included
The extended paragraph 12 of Section 5 (lines 880-893)

Concerning the innovations and the contribution of our work, although many of our framework modules are based on existing approaches, numerous modifications, adaptations and improvements have been applied that can be considered as innovations and are listed below:

ERT and its DEST package implementation were developed for face shape alignment, which is quite a different problem from fish shape alignment. Apart from the different number of landmarks needed to describe each shape, fish shape is not symmetrical, and therefore orientation classification of the input image is necessary.
We have not used the DEST package as is; we have heavily modified it in order to port it to 3 different platforms (Ubuntu environment, Xilinx Vitis, MS Visual Studio 2019). More importantly, we modified its source code, replacing time-consuming Eigen calls with optimized, fast C code that can also be implemented in hardware. With the software and hardware acceleration techniques developed by the authors, the frame processing latency was less than 0.5 μs on an Intel i5 platform and less than 16 ms on an embedded hardware target board.
Multiple ERT models have been trained in the revised manuscript to trade off speed against accuracy.
The employed fish detection method was adapted by developing Python scripts that run locally and extract the information (bounding box coordinates) needed by the other modules (orientation classification, fish tracking, shape alignment).
A novel, fast draft orientation classification method is proposed.
A number of fish tracking techniques are proposed, such as fish bounding box interpolation for higher speed. Image buffering is proposed to validate the interpolation between actual fish localizations. The history of fish positions can be exploited to achieve more accurate fish orientation results while, in the opposite direction, fish orientation results can also be used to validate the tracking results.
A new landmark editor (LAE) has been developed
A new public dataset (UVIMEF) with low contrast/resolution images has been developed to test our approach under worst case conditions
The developed shape alignment method was trained with Mediterranean fish species in the category of seabream but can also be extended to any species.
The overall system architecture (fish detection+orientation+tracking+shape alignment) is also an innovative combination that achieves both high speed and accuracy.
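
The bounding box interpolation mentioned in the list above can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes that boxes are stored as (x_min, y_min, x_max, y_max) tuples and that the fish position changes linearly between two actual detections. The function name `interpolate_boxes` and its parameters are hypothetical.

```python
from typing import List, Tuple

# Assumed box layout (hypothetical): (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def interpolate_boxes(box_start: Box, box_end: Box, skipped: int) -> List[Box]:
    """Estimate boxes for `skipped` frames lying between two real detections.

    Linear interpolation is an assumption made for illustration only.
    """
    if skipped < 0:
        raise ValueError("skipped must be non-negative")
    boxes: List[Box] = []
    for k in range(1, skipped + 1):
        # Fraction of the way from the first detection to the second.
        t = k / (skipped + 1)
        boxes.append(tuple(a + t * (b - a) for a, b in zip(box_start, box_end)))
    return boxes


if __name__ == "__main__":
    # Three frames skipped between two detections.
    for box in interpolate_boxes((0, 0, 10, 10), (4, 8, 14, 18), 3):
        print(box)
```

In such a scheme the detector runs only on the first and last frame of each group, and the intermediate boxes cost a few arithmetic operations each. Buffered frames could then be used to check the interpolated boxes against the image content.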

Please see:

The extended Abstract
The extended 3rd paragraph of the Introduction for the motivation of our work (lines 68-86)
The extended 5th paragraph of the Introduction, especially from sentence 3 to the end of the paragraph, where the adaptation of the fish detection method, the novel orientation classification and the fish tracking methods are introduced (lines 98-108)
The last 2 sentences of paragraph 6 of the Introduction, where the porting of the DEST application to different target environments is mentioned (lines 114-118). For porting to different targets, please also see the 3rd paragraph of Section 3.7 (lines 632-640)
The modified paragraph 9 of the Introduction, which has been rewritten to stress the novelties and contributions of our approach (lines 150-160)
The modified Section 3.2, where the customizations of the fish detection and the principle of operation of fish tracking and orientation are introduced.
More details about what we changed in fish detection can be found in the 2nd paragraph (lines 332-340), and details of the proposed novel Fish Position Interpolation (FPI) from the 4th paragraph (line 353) to the end of Section 3.3. Please also see Figs. 3 and 4.
The whole section 3.4 proposes a novel fast draft orientation classification technique
Section 3.5 also proposes a tracking procedure. The extensions that could also be taken into consideration for improving tracking results are also discussed in the last paragraph of this section
The 2nd paragraph of Section 3.7 describes the new LAE landmark editor that has been developed in the context of this work (lines 615-631, Fig. 9). The same paragraph and Table 3 discuss the different ERT models that we trained and compared.
Paragraphs 4 and 5 of Section 3.7, which also existed in the previous version of the paper, describe the software and hardware acceleration techniques applied to ERT shape alignment. However, we added two new final paragraphs to this section that give specific details about the novel hardware acceleration we propose, which can also be useful for other applications where the hardware kernel needs large data transfers. Table 3 and Fig. 11 are also new.
The last 3 paragraphs of Section 5 (Discussion) describe the achievements of the proposed approach compared to the referenced ones, as well as the contribution and the limitations of our work.
Finally, please also see the revised Conclusions section, especially lines 918-925, where the speed and accuracy achievements are summarized.

Comment 2

Finally, I recognize that the work shows potential to receive interest from the specialized community, however, without resolving the previous defect it is impossible to move forward.

Author Response 2

Thank you for your remark. Indeed, there are plenty of applications that could exploit the proposed framework. The proposed framework consists of a) fish detection, b) orientation, c) tracking and d) morphological feature measurement based on shape alignment. This framework can be exploited in both aquaculture and open waters.

Applications for aquaculture:

- everyday noninvasive morphological feature measurement (size estimation, fish health assessment, feeding needs, optimal harvest time)

- fish tracking to assess fish behavior (health and feeding needs)

Applications for open waters:

- fish population measurement (from fish detection)

- fish behavior under stress (from fish tracking)

- fish dimensions (from morphological feature measurement)

- fish variations and health (from shape)

Please see

the revised Abstract,
the modified and extended 1st paragraph of the Introduction, where the applications are listed
the new 3rd paragraph (lines 68-86) of the Introduction, where the gap in the referenced approaches is discussed and the motivation of our work is set
the extended 7th paragraph (lines 119-131) of the Introduction, where our previous work is compared
the paragraph before the last one in Section 5 (lines 894-899)

Round 2

Reviewer 3 Report

Thank you for addressing my previous comments however I have some concerns

1. Remove reference 11, it is irrelevant to the article's content.

2. You have used so many acronyms, and those acronyms need to be defined. It's better to add a list of acronyms.

English language is fine.

Author Response

We once again thank the reviewer for the time devoted to reviewing the revised version of our paper.

Comment 1:

Remove reference 11, it is irrelevant to the article's content.

Authors' Response 1:

Thank you very much for your suggestion. Reference 11 has been removed and the rest of the references have been renumbered.

Comment 2:

You have used so many acronyms, and those acronyms need to be defined. It's better to add a list of acronyms.

Authors' Response 2:

Thank you very much for your remark. We added Appendix A and Table A1 where all the abbreviations used are defined.

Reviewer 4 Report

The authors have positively addressed my suggestions.

Minor editing.

Author Response

We once again thank the reviewer for the time devoted to reviewing the revised version of our paper.

Comment 1:

The authors have positively addressed my suggestions.

Authors' Response 1:

Thank you very much. In the second revision of our paper you can find some further minor modifications requested by one reviewer (removal of ref. [11] and addition of a list of abbreviations in Appendix A).
