Article
Peer-Review Record

MViDO: A High Performance Monocular Vision-Based System for Docking A Hovering AUV

Appl. Sci. 2020, 10(9), 2991; https://doi.org/10.3390/app10092991
by André Bianchi Figueiredo * and Aníbal Coimbra Matos
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 13 February 2020 / Revised: 27 March 2020 / Accepted: 30 March 2020 / Published: 25 April 2020

Round 1

Reviewer 1 Report

This article presents a method for docking an AUV using monocular vision. It describes the whole process, from the pose estimator to the guidance law used, and presents some experimental results. This is the second version of the article, and the authors have highlighted the changes. I did not receive a response letter with the article.

The article does not present a major scientific contribution; it is more about the use of existing methods to solve the docking problem. Nevertheless, the work is interesting, especially for the targeted journal, which focuses on applications (“Applied Science”).

The authors have taken into account many of the remarks that had been made: the introduction is better (although in my opinion it contains too many numerical details for an introduction), the state of the art and the comparison with other methods have been reworked, and a conclusion has been added.

My main points are:

-  The authors have made some effort to use standard notation (matrices and vectors in bold, scalars in italics, etc.), but there are still many places where this has not been done. For example: in eq (21), a quaternion is a vector, so q should be in bold; in eq (22), the same remark applies to Bl; in eq (7), pc, pr and tr are points and vectors, so they should also be in bold.

-  There are still too many undefined terms, for example: in eqs 3, 4 and 5, uc, ua, a, f, ...; in eq 7, pc and pr are not defined, etc.

 

Minor remarks :

  • Please define what a “USBL system” is.
  • Eq (7): “this will allow us to transform any point from one referential frame into the other”. If it is just a rigid transform between two frames, there is no “alpha” scale factor. If it is a perspective projection, this should be mentioned in the text.
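
For reference, the distinction drawn in the last remark can be written out as follows. This is a generic textbook sketch, not the manuscript's Eq. (7); the symbols R, t, f, u, v and the coordinates (x, y, z) are assumptions introduced here for illustration only.

```latex
% Rigid transform between two frames: no scale factor, distances are preserved
\[
\mathbf{p}' = \mathbf{R}\,\mathbf{p} + \mathbf{t},
\qquad \mathbf{R}^{\top}\mathbf{R} = \mathbf{I},\quad \det\mathbf{R} = 1
\]
% Ideal pinhole (perspective) projection of p' = (x, y, z)^T with z > 0:
% the image coordinates carry a depth-dependent scale f/z
\[
\begin{pmatrix} u \\ v \end{pmatrix}
  = \frac{f}{z}\begin{pmatrix} x \\ y \end{pmatrix}
\]
```

Under these assumptions, a scale factor multiplying the transformed point only appears once a projection onto the image plane is involved.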

 

In conclusion, the authors reworked the article to correct major flaws. Nevertheless, they should review the entire article to address my two main points before publication.

Author Response

We would like to kindly thank you for your careful reading of our work. Your comments and suggestions contributed to a notable improvement of the document. 

All your suggestions were taken into account and changes were made accordingly.

The changes to the text are highlighted (yellow) in the document.
We respond, one by one, to each of your comments as follows:

 

My main points are:
- The authors have made some effort to use standard notation (matrices and vectors in bold, scalars in italics, etc.), but there are still many places where this has not been done. For example: in eq (21), a quaternion is a vector, so q should be in bold; in eq (22), the same remark applies to Bl; in eq (7), pc, pr and tr are points and vectors, so they should also be in bold.
- There are still too many undefined terms, for example: in eqs 3, 4 and 5, uc, ua, a, f, ...; in eq 7, pc and pr are not defined, etc.

 

The document was revised and the notations were changed (standard notations).

The terms not defined in the document have been defined.

 

Minor remarks :
Please define what is a “USBL Systems”
Eq (7) “this will allow us to transform any point from one referential frame into the other”. If it's just a rigid transform between two frames, there is no “alpha” scale factor. If it's a perspective projection, it should be mentioned in the text.

 

The abbreviations in the text have been defined.

 

Eq (7): Yes, it is a perspective projection; the world points are transformed and projected onto a plane. The text accompanying this equation has been modified in the document.

 

Thank you and Kind regards,

André Figueiredo

Reviewer 2 Report

The paper is of low quality, from the language to the technical content. This very lengthy paper has little content worth presenting. Many redundant figures, formulas and sentences make the paper boring and unattractive.

Author Response

We would like to kindly thank you for your comments and suggestions.

We respond, one by one, to each of your comments as follows:

 

review 1:

The abstract wording to be more abstractive

Possible is not good enough

 

answer:

The phrase where the word "possible" appears has been reworded to:

“(…) for the linear degrees of freedom of the AUV, the AUV is conducted to the dock while …

 

review 2:

Is there any other sensor available on the AUV?

 

answer:

Yes, there are other navigation sensors available on the AUV: altimeter, sonar LBL (long baseline), pressure sensor and IMU. 

To maximize the potential of the vision sensor and to minimize the computational cost associated with sensory fusion, it was decided not to resort to fusion.

 

review 3:

Too many repeated sentences

 

answer:

The repeated phrase has been removed.

 

review 4:

What is the difference between the title 1.1.1 and 1.1.2?

 

answer:

The title of subsection 1.1.2 refers to the use of artificial vision for purposes other than docking.

 

review 5:

This subsection is unnecessary as it is not relevant to this work

 

answer:

During the first round of reviews, one of the reviewers suggested referring to other approaches to autonomous docking using other sensors. So this subsection was added to the original manuscript.

 

review 6:

Title is same as 1.1.1 and 1.1.2 

 

answer:

The title of the subsection 1.1.4 refers specifically to the use of artificial vision (and image processing) in estimating the relative location of the vehicle. 

The section focuses only on methods of estimating relative location using vision, not necessarily for docking processes. In this section we intend to situate our approach to estimating the AUV relative pose with respect to the state of the art.

 

Thank you and Kind regards,

André Figueiredo

Reviewer 3 Report

The article is devoted to a timely topic: the study of the MViDO system (Monocular Vision-based Docking Operation aid) for the autonomous docking process of Autonomous Underwater Vehicles (AUVs). The MViDO system consists of three modules: a pose estimator, a tracker and a guidance sub-module. The article contains a large amount of experimental material, and the results obtained are beautifully illustrated. A physical test of the system was carried out, and a model was built for the test.

Thus, in my opinion, the article is a complete study conducted at a high scientific level. I have some minor comments on the article:

  1. The coefficient α(alpha) used in formula (7) requires a more detailed description: how its value is determined.
  2. In (22), the double index is used in the notation Crd Bl. It seems appropriate to replace the index with a more compact one.
  3. The notation (32) can be simplified taking into account (33).
  4. In Fig. 9 and 11, the horizontal axis is not indicated.
  5. There is no uniformity in the design of the graph.

In general, the article is written at a high scientific level and can be published in the journal.

 

Comments for author File: Comments.pdf

Author Response

Dear Reviewer,

 

Once again, we would like to kindly thank you for your careful reading of our work. 

All your suggestions were taken into account and changes were made accordingly.

The changes to the text are highlighted (yellow) in the document.
We respond, one by one, to each of your comments as follows:

 

1. The coefficient α used in formula (7) requires a more detailed description: how its value is determined.

 

If we consider no lens imperfections, then α can be approximated by f/z_r. However, it is important to mention that, given the properties of a quaternion, this scalar α will not be relevant in the orientation estimation algorithm, as can be seen in the development of the equations.

The more detailed description has been added to the text (lines 306-307).
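
One way to read the remark about quaternion properties is sketched below. It is offered as an assumption, since the manuscript's own derivation is not reproduced in this record: a rotation applied through a non-zero quaternion q is unchanged when q is multiplied by a non-zero real scalar, so a scalar such as α cancels. The symbols q and v are illustrative, with v a pure quaternion holding the vector being rotated.

```latex
\[
(\alpha\,\mathbf{q})\,\mathbf{v}\,(\alpha\,\mathbf{q})^{-1}
  = \alpha\,\alpha^{-1}\,\mathbf{q}\,\mathbf{v}\,\mathbf{q}^{-1}
  = \mathbf{q}\,\mathbf{v}\,\mathbf{q}^{-1},
\qquad \alpha \in \mathbb{R}\setminus\{0\},\quad \mathbf{q} \neq \mathbf{0}
\]
```

The cancellation relies only on real scalars commuting with quaternions; whether this is exactly the mechanism used in the manuscript cannot be confirmed from the response alone.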

 

2. In (22), the double index is used in the notation Crd Bl. It seems appropriate to replace the index with a more compact one.

 

Changed as suggested. See lines 365-366.

 

3. The notation (32) can be simplified taking into account (33). 

 

Changed as suggested. See notation (32)

 

4. In Fig. 9 and 11, the horizontal axis is not indicated.
5. There is no uniformity in the design of graph.
 

 

Figures 9 and 11 have been changed

 

 

Thank you very much and Kind regards,

André Figueiredo

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

After carefully reviewing the paper, I believe it is technically correct and applicable to docking and other related applications.

It is very useful in the AUV area. I suggest publication.

Reviewer 2 Report

Find the comments in the attached document

Comments for author File: Comments.docx

Reviewer 3 Report

I recommend merging some figures, such as Figs. 22 & 23, shortening Section 4, and devoting the additional space to a more detailed discussion of Figs. 40 and 41.

Reviewer 4 Report

This article presents a method for docking an AUV using monocular vision. It describes the whole process, from the pose estimator to the guidance law used, and presents some experimental results.

The article does not present a major scientific contribution; it is more about the use of existing methods to solve the docking problem. Nevertheless, the work is interesting, especially for the targeted journal, which focuses on applications (“Applied Science”). However, the article suffers from too many major flaws to be published. My remarks and questions are listed below.

Related works are presented, but the authors do not make a link with their own work. What are the advantages and disadvantages of the existing methods? Why were they not appropriate in the studied context? What else do the authors bring to the table in relation to these methods?

For example, it is mentioned that some methods use active markers. Why was this principle not retained? I wonder about the choice of using colored balls underwater. At what depth should the system operate? Under water, colors disappear very quickly; for example, red disappears from 5 meters. Is there lighting on the ROV?

On the chosen method, generally when we have markers whose dimensions are known, we use a PnP algorithm (here P3P) to obtain the pose directly [1]. Why didn't you choose this solution?

The colors are there to avoid ambiguities, but this could have been done more easily by having different distances on the two axes of the target (here the distances between the markers are almost the same, i.e. 0.30 m). I suggest that the authors look at the work presented in [2], which deals with a similar problem.
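
The suggestion about unequal marker spacing can be illustrated with a small sketch. It assumes a planar three-marker target and uses only inter-marker distances, ignoring perspective distortion and any handedness cue. The layouts, the function names and the tolerance are hypothetical and are not taken from the manuscript.

```python
from itertools import combinations
from math import dist


def distance_signatures(markers):
    """For each marker, return the sorted distances to every other marker."""
    return [
        sorted(dist(p, q) for j, q in enumerate(markers) if j != i)
        for i, p in enumerate(markers)
    ]


def is_unambiguous(markers, tol=1e-3):
    """True if no two markers share the same distance signature (within tol)."""
    signatures = distance_signatures(markers)
    for a, b in combinations(signatures, 2):
        if all(abs(x - y) <= tol for x, y in zip(a, b)):
            return False
    return True


# Hypothetical planar layouts, coordinates in metres.
# Equal spacing on both axes: the two outer markers look alike.
symmetric = [(0.0, 0.0), (0.30, 0.0), (0.0, 0.30)]
# Different spacing on the two axes: every marker has its own signature.
asymmetric = [(0.0, 0.0), (0.30, 0.0), (0.0, 0.45)]

print(is_unambiguous(symmetric))
print(is_unambiguous(asymmetric))
```

With equal spacing the two outer markers cannot be told apart by geometry alone, which is the situation where an extra cue such as color is needed.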

Contributions should be mentioned in the introduction, possibly after the related works in order to position oneself in relation to them, but they appear too late (Section 3.3, line 178).

The attitude estimation algorithm is not described well enough. We do not understand what the authors have tried to do. What is plane c and plane r? This part lacks explanation. The text and the caption in figure 6 do not allow us to understand what the authors meant.

Mathematical notation standards are not respected (matrices and vectors in bold, scalars in italics, etc.).

Too many terms are not defined. For example, what are R11, R12, etc? I suppose they are the coefficients of the matrix R but this is not written. Another example is equation (21). The 3 Greek letters are not defined, nor is sx.

Sometimes there is an attempt at a definition, but it is far too vague. For example, L222: "where q0, q1, q2 and q3 are the quaternion elements" -> the quaternion of which transformation? All this needs to be explained. These are just examples, but it is like that throughout the article, which makes it difficult to understand. Understanding is made even more difficult by the fact that the same variable names are used for different things: k1...k4 in equation (2) have nothing to do with k1 and k2 in equation (18). In this case, other names must be used.

The legends of the figures do not make it possible to understand them (e.g. figure 9). Even with the text of the article it is sometimes difficult to understand. Normally a caption should be sufficient to understand the figure.

There is no conclusion to the article.

 

Minor remarks :

The equations must be integrated into the text, e.g. eq 17.

L202 (see figure4) -> add a space between “figure” and the number

Eq 1 : give a name to the vector

L 207 : Where -> remove the capital letter since it is the continuation of the sentence.

Since eq (2) is not standard, it would have been interesting to give the literal expressions of coefficients k1 to k4 in order to avoid having to find the expressions in the paper [1], especially since the conventions of the axes are not the same.

What is P(Blicolor) in eq (18)? It's undefined. Is it the same function as Pcolor defined in l262?

Equation numbers in the text should appear in parentheses.

Caption of figure 7: “tp estimate” -> “to estimate”.

L360 : radius instead of radium ?

Eq(22) : "where xw, yw, zw are coordinates in the world reference frame W." -> That's not true. This equation is only valid if 3D points are expressed in the camera frame.
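
The point about reference frames can be made concrete with a minimal sketch, assuming an ideal pinhole camera whose optical axis is the z axis. The rotation, translation, focal length and point coordinates below are illustrative values, not quantities from the manuscript, and the function names are hypothetical.

```python
def world_to_camera(p_w, R_cw, t_cw):
    """Apply the rigid transform p_c = R_cw * p_w + t_cw (R_cw is a 3x3 nested list)."""
    return tuple(
        sum(R_cw[i][j] * p_w[j] for j in range(3)) + t_cw[i]
        for i in range(3)
    )


def project_pinhole(p_c, f):
    """Ideal pinhole projection of a point expressed in the camera frame."""
    x, y, z = p_c
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    return (f * x / z, f * y / z)


# Illustrative setup: camera axes aligned with the world axes,
# world origin located 2 m in front of the camera along the optical axis.
R_cw = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_cw = (0.0, 0.0, 2.0)
f = 1.0  # normalized focal length (assumption)

p_w = (0.3, 0.0, 1.0)                     # point in world coordinates
p_c = world_to_camera(p_w, R_cw, t_cw)    # same point in camera coordinates

correct = project_pinhole(p_c, f)   # projection of camera-frame coordinates
naive = project_pinhole(p_w, f)     # world coordinates used directly by mistake

print(correct, naive)
```

The two results differ unless the world and camera frames coincide, which is why the projection equation is only valid for points expressed in the camera frame.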

References

[1] Li, S., Xu, C., & Xie, M. (2012). A robust O(n) solution to the perspective-n-point problem. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(7), 1444-1450.

[2] Boutteau, R., Rossi, R., Qin, L., et al. (2019). A Vision-Based System for Robot Localization in Large Industrial Environments. J Intell Robot Syst.
