Article
Peer-Review Record

An Uncalibrated Image-Based Visual Servo Strategy for Robust Navigation in Autonomous Intravitreal Injection

Electronics 2022, 11(24), 4184; https://doi.org/10.3390/electronics11244184
by Xiangdong He 1, Hua Luo 1,*, Yuliang Feng 2, Xiaodong Wu 1 and Yan Diao 1
Reviewer 1:
Reviewer 2:
Reviewer 3: Anonymous
Submission received: 10 November 2022 / Revised: 9 December 2022 / Accepted: 10 December 2022 / Published: 14 December 2022
(This article belongs to the Special Issue Human Computer Interaction in Intelligent System)

Round 1

Reviewer 1 Report

The paper is technically very good. The paper idea and goals are very interesting. Some issues should be considered to improve the quality of the presented work:

1- More clarification should be added to the introduction section with clearly mentioning the main contributions of the presented work

2- The Siamese neural network is used in the proposed work without enough justification for why it is used. Also, some details should be added about this type of network

3- It is mentioned that the inverse kinematics problem is out of the scope of this work. More justification is required as this problem is a core problem when it comes to positioning the robot's end effector at a prescribed position

4- The authors should justify the types of used uncertainties and how they affect the system performance

5- The sliding mode controller is used without enough details on how to overcome the associated chattering problem

Author Response

Dear Editor and Reviewers,

Thank you so much for your efficient work in processing and reviewing our manuscript entitled “An uncalibrated image-based visual servo strategy for robust navigation in autonomous intravitreal injection” (Manuscript ID: electronics-2056457). Based on the comments, we made modifications to our original manuscript. We sincerely hope that the revised manuscript will meet the standard of Electronics. Our point-by-point responses to Reviewer #1 are presented as follows:

1. More clarification should be added to the introduction section with clearly mentioning the main contributions of the presented work.

 

Response: Based on the reviewer’s helpful suggestion, the main contribution of this work has been added and described more clearly in lines 63-65.

 

Revision to the manuscript: Lines 63-65, “The main contribution of this work is the achievement of automatic and robust navigation of intravitreal injections without calibration and depth information, only using limited information from the eye images. This proposed method can increase the level of automation of intravitreal injection and also helps to enhance the quality of the procedure as well as save valuable surgeon resources.” was added.

 

2. The Siamese neural network is used in the proposed work without enough justification for why it is used. Also, some details should be added about this type of network.

 

Response: More justification for the use of the Siamese neural network has been added to the revised manuscript. Also, more details were added about this type of network.

 

Revision to the manuscript: Line 181, “The SNN is one of the best choices for comparing two input vectors and outputting the similarity, where the two identical neural networks work in parallel and compare their outputs at the end [58]. The amplitude of the relative pose is positively correlated with the similarity between the current image features and the target image features, which is well suited for SNN.” was added. Lines 181-182, “The network proposed by Tokuda, F. et al. is very inspiring.” was changed to “The network proposed by Tokuda, F. et al. is very inspiring and also applies an SNN architecture,”.
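For readers unfamiliar with the architecture, the twin-branch idea added here (two identical networks with shared weights whose outputs are compared for similarity) can be sketched minimally as follows. The layer size, tanh nonlinearity, and cosine-similarity head are illustrative assumptions for this sketch, not the network from the manuscript:

```python
import numpy as np

rng = np.random.default_rng(0)

# One shared weight matrix: the two "twin" branches of a Siamese network
# are the SAME function applied to both inputs. The 8 -> 16 layer size is
# an arbitrary choice for this sketch, not taken from the manuscript.
W = rng.standard_normal((16, 8))

def embed(x):
    """A single branch of the twin network (weights shared by construction)."""
    return np.tanh(W @ x)

def similarity(x_current, x_target):
    """Cosine similarity between the two branch outputs."""
    a, b = embed(x_current), embed(x_target)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
```

Because both inputs pass through the same `embed`, identical current and target image features yield maximal similarity, which is the property the response relies on when relating similarity to the amplitude of the relative pose.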

 

3. It is mentioned that the inverse kinematics problem is out of the scope of this work. More justification is required as this problem is a core problem when it comes to positioning the robot's end effector at a prescribed position.

 

Response: The calculation of the inverse kinematics is provided by the official libfranka library, which is guaranteed to be accurate.

 

Revision to the manuscript: Lines 162-165, “The inverse kinematic algorithm [53] which inverse solves the displacement of each joint of the robot from the robot end displacement, is usually provided in common robot simulation and control software and is out of the scope of this paper.” was changed to “The inverse kinematic algorithm [53] which inverse solves the displacement of each joint of the robot from the robot end displacement, is provided by the official libfranka library.”

 

4. The authors should justify the types of used uncertainties and how they affect the system performance.

 

Response: The types of used uncertainties and how they affect the system performance has been discussed in the revised manuscript based on the reviewer’s helpful suggestion.

 

Revision to the manuscript: Line 244, “Although uncalibrated IBVS does not require considering camera and tool calibration, model fitting errors, system identification errors, robot repetitive motion errors, and image recognition errors can make the system model inaccurate and vary with the pose of the robot end-effector. An SMC is then designed to overcome these problems, which is robust to model uncertainties and disturbances in the environment.” was added.

 

5. The sliding mode controller is used without enough details on how to overcome the associated chattering problem.

 

Response: More details on how to overcome the chattering problem associated with the sliding mode controller have been added to the revised manuscript.

 

Revision to the manuscript: Line 244, “To overcome the chattering problem associated with the SMC, we use quasi-SMC [65] instead of the classical SMC, i.e., we use the saturation function instead of the sign function.” was added. Line 250, “To reduce chattering, the saturation function is used instead of the sign function:” was changed to “Then, the sign function is replaced with the saturation function:”.
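To illustrate the quasi-SMC idea described in this revision, the discontinuous sign term and its saturated replacement can be sketched as follows; the boundary-layer width `phi` is an assumed value for illustration, not taken from the manuscript:

```python
import numpy as np

def sign_switch(s):
    """Classical SMC switching term: discontinuous at s = 0, which excites
    high-frequency chattering when the state crosses the sliding surface."""
    return np.sign(s)

def sat_switch(s, phi=0.05):
    """Quasi-SMC switching term: linear inside the boundary layer |s| <= phi
    and saturated to +/-1 outside it, so the control stays continuous near
    the surface. The width phi is an illustrative value."""
    return np.clip(s / phi, -1.0, 1.0)
```

Inside the boundary layer the saturated term behaves like a proportional gain `1/phi`, trading a small steady-state band for the elimination of chattering.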

Reviewer 2 Report

Dear authors, the comments/suggestions are in the attachment, but in case you cannot open it, here they are:

This work presents a development of autonomous intravitreal injection navigation based on Image-based visual servo control and applying some machine learning approaches to achieve more accuracy and robustness, such as Neural Network. Indeed, the work is very well explained, and the ideas flow according to the text evolution. Despite the excellent quality, there are some suggestions to improve future readers' understanding:

 

All the abbreviations need to be checked; here are the lines where I found issues:

line 11: Image-based visual servo (IBVS) > Image-Based Visual Servo (IBVS)

line 52: Image-based visual servo (IBVS) > Image-Based Visual Servo (IBVS)

line 74: generic multilayer perceptron (MLP) > generic Multilayer Perceptron (MLP)

line 140: Since this acronym is created in line 52, you do not need to declare it here. So, in line 140: image-based visual servo (IBVS) > IBVS

line 141: position-based visual servo (PBVS) > Position-Based Visual Servo (PBVS)

line 181: siamese neural network > SNN (since it is created before)

line 240: linear time-invariant (LTI) > Linear Time-Invariant (LTI)

line 386: Since the acronym is already declared, there is no necessity to do it again. So, position-based visual servo (PBVS) > PBVS .

 

Some of the acronyms lack a full declaration, for example:

line 42: what are OCT and MRI?

line 43: what is SLAM?

line 54: what is MIS?

line 71: what is RBF?

line 128: what is FOV?

line 169: what is DOF?

line 183: what is CNN?

line 193: what is FC layer?

line 271: what is ScLERP?

line 306: what is UVC?

line 363: what is SNR?

 

The work has some typos that can be found at:

All the occurrences of "et al." need to be checked; for example, line 45 has "et al" (without the dot).

line 89 has a lack of space between 20.5 and 26.4

line 128: cameras' FOV as possible > cameras' FOV as possible

 

Remove the bold formatting from:

line 233: "Figure 4".

lines 284-298: "Figure. 6", "Figure. 7" and "Table. 1"

line 304-327: "Figure. 9", "Figure. 9(b)" and "Figure. 9(a)".

line 332-line 341: "Figure. 10", "Figure. 11" and "Figure. 12".

line 351: "Table. 2".

line 364: "Table. 3"

line 376: "Table. 4"

 

At the beginning of Section 2.1, a term is used that is unusual in robotics. The "end of the arm" is the last part of a robotic arm (the last link of a robotic arm). So, my suggestion is:

line 82: ...fixed at the end of the robotic arm captures images > ...fixed as robot end-effector.

Please, consider changing the whole terms in this paragraph.

 

Figure 2 does not demonstrate the limitations between lines 166 to 175. I suggest presenting a figure indicating the robot's pose in this situation (or as close as possible).

 

What does the equation presented in Figure 2 mean? Where did that come from?

 

I suggest replacing the term "trajectory" (line 290) by path planning since this work's step is a simulation. Consequently, it is a possible solution (if the authors accept this suggestion, replace the term inside Figure 7 and the caption).

 

Figure 6: needs an explanation/discussion of the behavior of each line. For example, why do several lines sometimes look like overlearning, such as the line with the error of feature 11?

 

Figure 7: How do you generate the reference? Which algorithm did you use? With a spline? Or with Euclidean distance?

 

Table 1: How did you measure the overshoot?

 

Table 1: The baseline method was published before? If yes, please cite the publication at the end of the caption.

 

Figure 11: Does this figure show the same path planning? Why does the figure illustrate a different path from Figure 7?

 

 

Figure 11: Figure 11 lacks information about the reference path. What is the reference in terms of algorithm?

 

Comments for author File: Comments.pdf

Author Response

Dear Editor and Reviewers,

Thank you so much for your efficient work in processing and reviewing our manuscript entitled “An uncalibrated image-based visual servo strategy for robust navigation in autonomous intravitreal injection” (Manuscript ID: electronics-2056457). Based on the comments, we made modifications to our original manuscript. We sincerely hope that the Revised Manuscript will meet the standard of Electronics. Our point-by-point responses to the reviewers’ comments/suggestions are presented as follows:

 

To reviewer #2:

1. All the abbreviations need to be checked.

 

Response: All the abbreviations have been checked based on the reviewer’s helpful suggestions.

 

Revision to the manuscript: Line 11, “Image-based visual servo (IBVS)” was changed to “Image-based Visual Servo (IBVS)”. Line 52, “Image-based visual servo (IBVS)” was changed to “Image-based Visual Servo (IBVS)”. Line 74, “generic multilayer perceptron (MLP)” was changed to “generic Multilayer Perceptron (MLP)”. Line 140, “image-based visual servo (IBVS)” was changed to “IBVS”. Line 141, “position-based visual servo (PBVS)” was changed to “Position-Based Visual Servo (PBVS)”. Line 181, “siamese neural network” was changed to “SNN”. Line 240, “linear time-invariant (LTI)” was changed to “Linear Time-Invariant (LTI)”. Line 386, “position-based visual servo (PBVS)” was changed to “PBVS”

 

2. Some of the acronyms lack a full declaration.

 

Response: We have added full descriptions for all acronyms based on the reviewer’s helpful suggestions.

 

Revision to the manuscript: Line 40, “OCT” was changed to “Optical Coherence Tomography (OCT)”. Line 42, “MRI” was changed to “Magnetic Resonance Imaging (MRI)”. Line 43, “SLAM” was changed to “Simultaneous Localization and Mapping (SLAM)”. Line 54, “MIS” was changed to “Minimally Invasive Surgery (MIS)”. Line 71, “RBF” was changed to “Radial Basis Function (RBF)”. Line 128, “FOV” was changed to “Field of View (FOV)”. Line 169, “DOF” was changed to “degrees of freedom (DOF)”. Line 183, “CNN” was changed to “Convolutional Neural Networks (CNN)”. Line 193, “FC layer” was changed to “Fully Connected (FC) layer”. Line 271, “ScLERP” was changed to “Screw Linear Interpolation (ScLERP)”. Line 306, “UVC” was changed to “USB Video Class (UVC)”. Line 388-389, “SNR” was changed to “Signal-to-Noise Ratio (SNR)”. In addition, similar issues in other locations are also modified. Line 69, “LoA” was changed to “Level of Autonomy (LoA)”. Line 198, “GELU” was changed to “Gaussian Error Linear Unit (GELU)”. Line 200, “ReLU” was changed to “Rectified Linear Unit (ReLU)”. Line 200, “ELU” was changed to “Exponential Linear Unit (ELU)”.

 

3. The work has some typos.

 

Response: We have checked and corrected all typos based on the reviewer’s helpful suggestions.

 

Revision to the manuscript: Line 45, line 61, line 209, “et al” was changed to “et al.” Line 89, “20.5 -26.4” was changed to “20.5-26.4”. Line 128, “cameras‘ FOV as possible” was changed to “cameras' Field of View (FOV) as possible”.

 

4. Remove the bold formatting.

 

Response: All the bold formatting of the figure and table references in the main text has been removed based on the reviewer’s helpful suggestions.

 

Revision to the manuscript: The bold formatting in lines 233, 286, 291, 294, 307, 314, 315, 322, 339, 351, 364, and 376 was removed.

 

5. "end of the arm" -> "robot end-effector"

 

Response: All instances of "end of the arm" in this paragraph were changed to "robot end-effector" based on the reviewer’s helpful suggestions.

 

Revision to the manuscript: Lines 82-84, change “end of the arm” to “robot end-effector”.

 

6. Figure 2 does not demonstrate the limitations between lines 166 to 175. I suggest presenting a figure indicating the robot's pose in this situation (or as close as possible).

 

Response: The limitations of the injection needle have been demonstrated in Figure 2(a) based on the reviewer’s helpful suggestion. The robot's final pose is indicated in Figure 2(b).

 

Revision to the manuscript: We have updated Figure 2 to add the final case of the injection needle limitation and robot pose.

 

7. What does the equation presented in Figure 2 mean? Where did that come from?

 

Response: The equation presented in Figure 2 is the iris ellipse equation. We use least squares to fit the iris edge to obtain this equation. We have clarified in detail what it means and how it is calculated in line 168.

 

Revision to the manuscript: Line 168, “We use least squares to fit the iris edge to obtain the general equation of the ellipse: x² + c₁xy + c₂y² + c₃x + c₄y + c₅ = 0, which has five independent parameters c₁, c₂, c₃, c₄, c₅ [54].” was added.
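Because the coefficient of the x² term in this ellipse equation is fixed at 1, moving that term to the right-hand side turns the fit into an ordinary linear least-squares problem. A minimal NumPy sketch of such a fit (an illustration of the approach, not the authors' implementation) is:

```python
import numpy as np

def fit_iris_ellipse(x, y):
    """Least-squares fit of x^2 + c1*x*y + c2*y^2 + c3*x + c4*y + c5 = 0
    to edge points (x, y). Moving the fixed x^2 term to the right-hand
    side gives the linear system A @ c = -x^2."""
    A = np.column_stack([x * y, y ** 2, x, y, np.ones_like(x)])
    c, *_ = np.linalg.lstsq(A, -x ** 2, rcond=None)
    return c  # the five independent parameters (c1, c2, c3, c4, c5)
```

As a sanity check, points on the unit circle x² + y² = 1 should recover c = (0, 1, 0, 0, −1).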

 

8. I suggest replacing the term "trajectory" (line 290) by path planning since this work's step is a simulation. Consequently, it is a possible solution (if the authors accept this suggestion, replace the term inside Figure 7 and the caption).

 

Response: We think it is inappropriate to replace trajectory with path planning because it is the actual trajectory formed by the simulation calculation and not planned by our program.

 

Revision to the manuscript: To remove the ambiguity, lines 290-291, “The trajectory of the pinpoint is shown in Figure. 7.” was changed to “The reference path, which is designed in Section 2.1, and the real path of the pinpoint are shown in Figure. 7.” To reduce ambiguity, “trajectory” has been replaced with “path” in the text.

 

9. Figure 6: needs an explanation/discussion about the behavior of each line. For example, why sometimes do several lines look like an overlearning? Such as the line with the error from the 11 feature?

 

Response: Error 11 appears to rise for a while, but this is not caused by overlearning. The reason is that the needle tip needs to be gradually brought closer to the injection site from the inferior temporal direction, avoiding the bulging cornea. As can be seen from Figure 6, ref error 11 also rises for a while. Other similar phenomena have the same cause.

 

Revision to the manuscript: We have added the line of ref error 11 to Figure 6, and added “Error 11 appears to rise for a while, because the needle tip needs to be gradually brought closer to the injection site from the inferior temporal direction, avoiding the bulging cornea. Other similar phenomena are also caused by the same reason.” to the caption of Figure 6.

 

10. Figure 7: How do you generate the reference? Which algorithm did you use? With a spline? Or with Euclidean distance?

 

Response: As described in Section 2.1, the reference path is designed by us manually. It is interpolated using the ScLERP algorithm in the pytransform3d library. As the eyeball rotates, the corresponding reference path also rotates.

 

Revision to the manuscript: Line 290, “Since the reference path is fixed relative to the eye, the reference path rotates accordingly after the eye rotates, which causes the jump in the reference trajectory in Figure 7. We use the pytransform3d [68] library to calculate the reference trajectory after rotation.” was added.
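To illustrate how a reference path fixed relative to the eye rotates with it, a minimal sketch follows; the rigid rotation about an assumed eyeball center is a simplified stand-in for the dual-quaternion ScLERP handling done with pytransform3d in the paper:

```python
import numpy as np

def rotate_reference_path(path, R, center):
    """Rigidly rotate reference-path waypoints with the eye: each waypoint
    is rotated by R about the (assumed) eyeball center, so the path stays
    fixed relative to the rotated eye."""
    path = np.asarray(path, dtype=float)
    center = np.asarray(center, dtype=float)
    return (R @ (path - center).T).T + center
```

After an eye rotation, re-interpolating between the current pose and the rotated waypoints produces the jump in the reference trajectory described in the revision.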

 

11. Table 1: How did you measure the overshoot?

 

Response: We use image feature 1 to calculate the overshoot. We have added the calculation method of the overshoot in line 280.

 

Revision to the manuscript: Line 280, “The general overshoot formula is (x_max − x_final)/x_final; because feature 1 decreases in this process, our formula is e_min/(f_end − f_start) = (f_min − f_end)/(f_end − f_start), where e_min is the minimum value of feature error 1, f_min is the minimum value of feature 1, f_start is feature 1 at the reference start pose, and f_end is feature 1 at the reference end pose.” was added.
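The quoted overshoot formula for a decreasing feature can be computed directly from a recorded feature trace; the sample trace in the test is invented for illustration, not measured data from the paper:

```python
import numpy as np

def overshoot(feature_trace, f_start, f_end):
    """Overshoot of a decreasing feature, per the formula in the revision:
    (f_min - f_end) / (f_end - f_start), where f_min is the minimum of the
    recorded trace and f_start, f_end are the reference start/end values."""
    f_min = float(np.min(feature_trace))
    return (f_min - f_end) / (f_end - f_start)
```

For example, a feature commanded from 200 down to 100 that briefly dips to 95 has an overshoot of (95 − 100)/(100 − 200) = 0.05, i.e. 5%.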

 

12. Table 1: The baseline method was published before? If yes, please cite the publication at the end of the caption.

 

Response: The proportional controller is a typical control model in practice, and we use it as the baseline. We have added a description of it.

 

Revision to the manuscript: Line 293, “The proportional controller is commonly used in practice, and we use the proportional controller with gain λ = 0.12 as the baseline [69, 70].” was added.
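The proportional baseline amounts to the standard visual-servo law v = −λe with λ = 0.12 as quoted in the revision; the error vector below is an invented example:

```python
import numpy as np

def proportional_velocity(feature_error, lam=0.12):
    """Baseline proportional visual-servo law v = -lambda * e, where e is
    the image-feature error and lambda = 0.12 is the gain quoted in the
    revision."""
    return -lam * np.asarray(feature_error, dtype=float)
```

This drives each feature error exponentially toward zero, which is why it is a common baseline against which the SMC is compared.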

 

13. Figure 11: Does this figure show the same path planning? Why does the figure illustrate a different path from Figure 7?

 

Response: Figure 11 shows a partial zoomed-in view of the final approach to the target position to better represent the details; it does not show the full trajectory. Moreover, the simulation and physical model experiments use the same reference path design method, but their reference paths are not the same.

 

Revision to the manuscript: Line 334, "In Figure 10, simulation and physical model experiments use the same reference path design method, and their reference paths are not exactly the same due to differences in the positions of eyeball model, robot and camera. Figure 11 shows a partial zoomed-in view of the final approach to the target position to better represent the details." was added.

 

14. Figure 11: Figure 11 lacks information about the reference path. What is the reference in terms of algorithm?

 

Response: The dashed path in Figure 11 is the reference path. The reference in the algorithm is the image feature captured by the robot end-effector along the reference path. More information about the design and interpolation algorithm of the reference path is added.

 

Revision to the manuscript: The caption of Figure 11 was changed to: “The reference trajectory was designed manually using the method in Section 2.3 and interpolated using the ScLERP algorithm.” Line 334, “The reference trajectory is specified manually during the design phase and later generated and executed by the machine in patients with the same iris diameter, using the method in Section 2.3.” was added.

Reviewer 3 Report

The paper proposes a reliable navigation system for autonomous intravitreal injection based on the combination of a Siamese neural network and RBF. The article may be published.

Author Response

Dear Editor and Reviewers,

Thank you so much for your efficient work in processing and reviewing our manuscript entitled “An uncalibrated image-based visual servo strategy for robust navigation in autonomous intravitreal injection” (Manuscript ID: electronics-2056457). Based on the comments, we made modifications to our original manuscript. We sincerely hope that the revised manuscript will meet the standard of Electronics. Reviewer #3 had no specific suggestions or comments, so no point-by-point responses were required; the revised manuscript is attached.
