Article
Peer-Review Record

Robust Filter-Based Visual Navigation Solution with Miscalibrated Bi-Monocular or Stereo Cameras

Remote Sens. 2022, 14(6), 1470; https://doi.org/10.3390/rs14061470
by Damien Vivet *, Jordi Vilà-Valls, Gaël Pages and Eric Chaumette
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 11 February 2022 / Revised: 14 March 2022 / Accepted: 16 March 2022 / Published: 18 March 2022
(This article belongs to the Section Remote Sensing Image Processing)

Round 1

Reviewer 1 Report

This paper uses a linearly constrained extended Kalman filter (LCEKF) approach to reduce the impact of miscalibration on a navigation system. The calibration parameters can vary due to several factors, such as mechanical stress caused by vibrations. This results in miscalibration, which would normally require online estimation of these parameters.

The proposed method does not need to estimate the calibration error on the fly, because it mitigates the impact of such errors on the navigation system.

Comments

It is necessary to add a thorough comparison with other approaches. 

The validation of the results is limited and has to be extended.

Further comments

  • Line 108: Main contribution is not clear.
  • Figure 2: Description is not clear. Please either add a color image or use some patterns within image.
  • Line 170: It is not clear why a new landmark cannot be directly included in the map with its covariance.
  • Section 3.3 (line 3): It is not clear why direct application of the method in [27] fails.
  • Line 172: Please add some explanation of this observation, i.e., why it diverges.

Author Response

Please find our answers in the attached file.

Author Response File: Author Response.pdf

Reviewer 2 Report

The authors present an application of their previously published linearly constrained extended Kalman filter to stereo-based visual SLAM. The key benefit of the approach is that it is able to handle the effect of a changing relative orientation of the two stereo cameras over time without explicitly estimating the parameters of the relative orientation.

While the first part of the manuscript is well presented, the description of applying it to stereo-based visual navigation has some weaknesses, inconsistencies, and errors, which make the reading and understanding a bit difficult (see below for details).

My major concern, however, is the significance and informative value of the experiments:

Section 4.3.:

  • The EKF performs better when the calibration is exactly known. Could you comment on when EKFfull and LCEKF should be preferred over the EKF? The noise magnitude in the following experiments is rather high. What noise level / calibration error marks the break-even point?
  • The experiments show no advantage of LCEKF over EKFfull even in the presence of very high noise (0.1 m on a 2 m baseline). Only when the noise is further increased to 0.5 m do the results suggest an advantage of the LCEKF. I wonder in which application a 2 m baseline oscillates with a magnitude of 0.5 m at a high frequency. How relevant is this experiment in practice? Furthermore, how significant are the results? If you repeated this experiment for different scenarios (different trajectory, different 3D points, different intrinsics), would the results change? If the advantage of LCEKF is still visible, how would you explain this advantage?
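To put the baseline noise magnitudes quoted above in perspective, the following minimal sketch estimates the relative depth error caused by a baseline offset. It assumes an ideal rectified pinhole stereo pair (Z = f·B/d) and a static offset rather than the oscillating noise used in the manuscript; it is not the model used in the paper. The focal length `FOCAL_PX`, the test depth, and all function names are hypothetical and chosen for illustration only.

```python
# Hypothetical focal length in pixels (an assumption, not taken from the paper).
FOCAL_PX = 800.0


def disparity(depth_m: float, baseline_m: float, focal_px: float = FOCAL_PX) -> float:
    """Disparity of a point at depth_m for an ideal rectified stereo pair."""
    return focal_px * baseline_m / depth_m


def triangulated_depth(disparity_px: float, baseline_m: float,
                       focal_px: float = FOCAL_PX) -> float:
    """Depth recovered from a disparity, given the baseline the filter believes in."""
    return focal_px * baseline_m / disparity_px


def relative_depth_error(nominal_baseline_m: float, baseline_offset_m: float,
                         depth_m: float = 10.0) -> float:
    """Relative depth error when the true baseline is nominal + offset
    but triangulation still uses the nominal value."""
    true_baseline_m = nominal_baseline_m + baseline_offset_m
    observed = disparity(depth_m, true_baseline_m)            # what the cameras see
    estimated = triangulated_depth(observed, nominal_baseline_m)  # what is reconstructed
    return (estimated - depth_m) / depth_m


if __name__ == "__main__":
    for offset in (0.1, 0.5):
        err = relative_depth_error(2.0, offset)
        print(f"offset {offset} m on a 2 m baseline: relative depth error {err:.3f}")
```

Under these assumptions the relative error reduces to -δ/(B + δ), independent of depth and focal length: roughly 5% for a 0.1 m offset and 20% for a 0.5 m offset on a 2 m baseline.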

Section 4.4:

  • I did not understand your conclusion “As stated before, the LCEKF is for each case more robust than the EKFfull proving that the IMU is not required to correct the miscalibration bias.”. Could you please explain that in more detail?

Section 4.5

  • In this section, the experiments show that LCEKF outperforms EKFfull if the covariances of the parameters are not well estimated. To my knowledge, the estimation of the covariances is well-known and typically works well in practice. Could you please explain in more detail the scenarios in which the covariance estimation fails?
  • Isn’t the relative orientation calculated implicitly in the LCEKF and couldn’t it be extracted from the estimated data after each time step? How would the relative orientation differ from that calculated by EKFfull?

Further points (unfortunately, the line numbering in the manuscript is incomplete):

  • In the manuscript miscalibration is restricted to the relative orientation (“extrinsic calibration parameters”). This should be made clearer in the document. Otherwise one could expect that it also includes the intrinsics.
  • L31: “linked to camera-lens” -> “linked to the camera-lens”
  • L108: “such error” -> “such errors”
  • L118: “dynamic system” -> “dynamic systems”
  • Formula 2: Explanation/Definition of “E” is missing
  • Formula 2: Explanation/Definition of “K_k” is missing
  • L146, 148: “10” -> “(10)”
  • L150: Explanation/Definition of “P_k|k” is missing
  • Section 3.2: The definition of pose is unclear. In my understanding, pose comprises position and orientation. In your definition, pose seems to be equivalent to position. If this is the case, I would suggest to explicitly define that or, preferably, to change “pose” to “position”.
  • Bold vs. normal style: What type of variables are bold? I assume that vectors and matrices should be bold and scalars should be non-bold, but I am not quite sure because this means that there are many inconsistencies in the manuscript, like P_3D_i, rotation matrices R, K, P2DL, …
  • Figure 1: Labels are shifted
  • L154: “bi-monocular” -> “binocular” ?
  • “pixelic reprojection” -> “reprojection”
  • Image points have inconsistent variable names: p_1 vs. P2D_L vs. p_L
  • Text between Eqs. (15) and (16): t_w->c_L -> t_w->c_R (also wrong in Eq. (16))
  • Text between Eqs. (19) and (20): “both camera” -> “both cameras”
  • jacobian -> Jacobian
  • dependant -> dependent
  • (23): “widehat”
  • (24): Definition of ^ is missing
  • L166: Definition of “A” is missing
  • Figure 2: The 3D points are not included in x. This is in contradiction to the definition of x on page 5.
  • L205: or our -> of our
  • Inconsistent Fig. vs. Figure
  • Caption of Figure 3 is not a correct sentence
  • L208-210: At this point it is unclear what the authors mean by noise, i.e., to which variables noise is added
  • L217: both model -> both models
  • Last line on page 9: observation -> observations
  • First paragraph on page 10: “field of view” is unclear
  • L232: “the inputs of the system are noised” -> “noise is added to the inputs of the system”
  • L234: Definition of “dt” is missing
  • L235: Definition of “P_z” is missing
  • L235-236: bellow -> below
  • L236: Does “delta” really represent ground truth calibration values, or ground truth calibration error values?
  • L241: Here the terms “localization” and ”mapping errors” are used. Does localization include orientation? At other places in the manuscript, the terms “translation” and “rotation” are used. Sometimes instead of ”mapping” also “landmark pose” is used. There are many other inconsistencies. The manuscript should be carefully revised in this respect.
  • Figure 4: There is no legend (what is red, yellow, blue?)
  • Caption of Figure 4: t_z and alpha_x are in contradiction with the definition on page 10 and Eq. 19 (t_x and alpha_z?)
  • L273: What does “BA” mean?
  • What does “Bis” mean in the figures?
  • L293: Order of the setups is different from the order in Table 1
  • Figure 10 et al: transaltion -> translation
  • Figure 16: what does “thetas” mean?
  • Caption of Figure 17 and 18: what is K?

Author Response

Please find our answers in the attached file.

Author Response File: Author Response.pdf

Reviewer 3 Report

This paper studies the miscalibration or recalibration of mobile stereo/bi-monocular camera setups. It proposes a robust linearly constrained state estimation method to mitigate the model mismatch. 

Overall, it is a solid study with extensive experimental performance evaluation. Here are some suggestions for the authors' considerations.
1. It would be good to include a paragraph to clarify the difference between this paper and the authors' early work [27].
2. It would be good to clarify the 'amount of errors', which varies in the experimental evaluation, as the paper claims that "the method keeps the system (and objects of the map) localized in real-time even with huge miscalibration errors and parameters’ variations."
3. There are many typos in the manuscript, such as 'widehat' in equation (23) and '[email protected]' in the author contact. Please check the whole manuscript carefully.

Author Response

Please find our answers in the attached file.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

The paper has been significantly improved and should be published.

Author Response

Thank you for the review.
Best regards.

Reviewer 2 Report

I would like to thank the authors for considering my suggestions from the first review round. I think it would be helpful for the reader to include the content of the authors' reply to my points 1. and 2. of the first review round into the manuscript.

Author Response

Thank you for the review.
As suggested, we added our answers to the paper.
Best regards.
