Peer-Review Record

Construction of Robotics and Application of the Optical-Flow Algorithm in Determining Robot Motions

Appl. Sci. 2024, 14(20), 9342; https://doi.org/10.3390/app14209342
by Anh Van Nguyen *, Van Tien Hoang and The Hung Tran
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 18 September 2024 / Revised: 8 October 2024 / Accepted: 11 October 2024 / Published: 14 October 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This paper presents a highly innovative approach to image processing for precise real-time object detection, positioning, and movement analysis, offering significant contributions to the development of advanced combat robots with minimal error rates in positioning, direction, and speed calculations. After revising according to the suggestions provided, it can be accepted for publication.

Citations are suggested to be numbered in increasing order.

Grammar errors throughout the text are suggested to be fixed.

The introduction can be enriched by discussing a larger number of relevant papers. For example, the authors mention the use of other techniques like laser and ultrasonic. Providing examples of these techniques in scientific papers can help the authors better understand the subject. Doing this is also expected to increase the paper's visibility.

Are there any studies that perform sub-pixel level analysis? The authors can increase the number of studies that focus on vision techniques in the introduction section.

Are there applications of vision techniques in robots that allow cloud computing?

The authors can also discuss stability issues by providing relevant citations.

In the 'In this study' paragraph, the authors are advised to omit giving results. Instead of presenting results, the impact of the highly accurate results can be mentioned to capture the reader's attention.

Equations and figures should be given in a standard format. Fonts must also be consistent.

Equation 1 is composed of two similar equations, so they can be labeled as 1a and 1b. The fonts and positions of all equations must be standardized, and there must be consistency among them.

Can the title of the second section be simplified to "Methodology"?

The first paragraph of the second section is more suited to the introduction section. Reasoning should only be discussed in the introduction. The methods section should focus solely on the methods without mentioning reasoning or providing results.

The description of Equation 1 does not clearly explain what X and Y refer to. Does A refer to position or object? Are X and Y coordinates? The notation of point A should be defined at the initial mention.

The formulations only cover the calculation of velocity and related information. However, the method for detecting the object by the camera at the given positions is not described. How the coordinates are determined in step 3 after the object’s detection in step 2 is not clearly explained.

Does the paper provide a novel approach to determining identification parameters? The method for this must be explained.

Each section and subsection should begin with a short introductory paragraph instead of listing titles in succession.

Parameters of the robot can be presented in a table.

The conclusion section can omit detailed results. Instead, the impact or variation of results can be described verbally.

Is the format of listing the conclusions suitable for the journal's format?

The fonts used in the figures must be consistent.

Methodology and results must be clearly distinguished.

Author Response

Answer to Reviewer 1

Referee #1 (Comments to Both Author and Editor):

Thank you very much for your detailed comments on the paper. We understand your concerns about the format and contribution of the paper. We checked and revised the whole manuscript according to your comments to make it fit the format and scope of the journal. The revised passages are highlighted in blue. Our answers to each of your comments are as follows:


Comments and Suggestions for Authors

This paper presents a highly innovative approach to image processing for precise real-time object detection, positioning, and movement analysis, offering significant contributions to the development of advanced combat robots with minimal error rates in positioning, direction, and speed calculations. After revising according to the suggestions provided, it can be accepted for publication.

Thank you very much for your comments and your help.

1. Citations are suggested to be numbered in increasing order.

- We revised the reference list so that the citations are numbered in increasing order.

2. Grammar errors throughout the text are suggested to be fixed.

- We checked and revised the whole manuscript again. We also asked a native English speaker to help us revise the paper.

3. The introduction can be enriched by discussing a larger number of relevant papers. For example, the authors mention the use of other techniques like laser and ultrasonic. Providing examples of these techniques in scientific papers can help the authors better understand the subject. Doing this is also expected to increase the paper's visibility.

- We added further studies to the introduction on pages 1 and 2.

4. Are there any studies that perform sub-pixel level analysis? The authors can increase the number of studies that focus on vision techniques in the introduction section.

- Yes, there are. However, sub-pixel analysis is mainly used in narrow spaces to increase resolution. We added further references for the sub-pixel technique to the paper.

- The sub-pixel technique can be used to increase the image resolution or the measurement accuracy. However, it also increases the computational complexity, which affects real-time analysis. In this study we considered the balance between accuracy and computational cost, so we used YOLOv8 to detect and identify the object.

- We added this information to the paper (L61-64, P2).

5. Are there applications of vision techniques in robots that allow cloud computing?

- Yes, there are. We added references on vision techniques using cloud computing to the paper. This approach increases computational capacity and makes the robot's decisions more accurate. However, it is outside the scope of the current paper, so we did not use it in this study. We added this information to L46-48, P2.

6. The authors can also discuss stability issues by providing relevant citations.

- Thank you very much for your comments. We added the limitations of previous studies and the task of the current study to L78-80, P2.

7. In the 'In this study' paragraph, the authors are advised to omit giving results. Instead of presenting results, the impact of the highly accurate results can be mentioned to capture the reader's attention.

- We revised the whole paper and added comparisons to make it clear and complete.

8. Equations and figures should be given in a standard format. Fonts must also be consistent.

- Thank you very much for your comments. We revised the whole manuscript so that the equations, figures, and text follow the format of the journal.

9. Equation 1 is composed of two similar equations, so they can be labeled as 1a and 1b. The fonts and positions of all equations must be standardized, and there must be consistency among them.

- Thank you very much for your comments. We relabeled Equation 1 as Equations 1a and 1b (L121-122, P3). We also checked and standardized the fonts of all equations in the paper.

10. Can the title of the second section be simplified to "Methodology"?

- Thank you very much for your comments. We retitled the second section "Methodology" (P2).

11. The first paragraph of the second section is more suited to the introduction section. Reasoning should only be discussed in the introduction. The methods section should focus solely on the methods without mentioning reasoning or providing results.

- Thank you very much for your comments. The first paragraph presents the method of using the camera for the measurement. We agree with the reviewer that it could be moved to the introduction. However, please allow us to keep it in the Methodology section to complete the description and to provide a short introduction to the methodology.

12. The description of Equation 1 does not clearly explain what X and Y refer to. Does A refer to position or object? Are X and Y coordinates? The notation of point A should be defined at the initial mention.

- Thank you very much for your comments. We added a description of the x and y coordinates and of point A to the paper. The coordinate system xOy is fixed to the camera on robot 1. Point A refers to the position of the object in the xOy system, so the position of point A is written as (X_A, Y_A) (L109-111, P3).
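- For illustration, the position-to-velocity step can be written in a generic finite-difference form (a hedged sketch in our notation; the manuscript's Equations (1a) and (1b) may differ):

```latex
V_x \approx \frac{X_A(t+\Delta t)-X_A(t)}{\Delta t}, \qquad
V_y \approx \frac{Y_A(t+\Delta t)-Y_A(t)}{\Delta t},
```

where (X_A, Y_A) is the position of point A in the camera-fixed xOy system and Δt is the sampling interval; the heading angle follows as atan2(ΔY_A, ΔX_A).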

13. The formulations only cover the calculation of velocity and related information. However, the method for detecting the object by the camera at the given positions is not described. How the coordinates are determined in step 3 after the object's detection in step 2 is not clearly explained.

- In this study, we used YOLOv8 to detect the object. The details of the algorithm, which were presented in previous studies, are not repeated in the current study. Instead, we present our application of YOLOv8 to detect the object and to capture its information for the next step. The object data from step 2, including the position of the object in the camera image over time, are then used in step 3. We added this information to the paper (L188-190, P5).
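- A minimal sketch of this step-2-to-step-3 handoff is given below. It assumes the ultralytics YOLOv8 Python API and a generic video source; the file name and variable names are hypothetical, not the authors' code.

```python
# Sketch: extract the object's image position per frame with YOLOv8 (step 2),
# producing the (t, x, y) track consumed by the motion calculation (step 3).
# Assumes the ultralytics package; "robot_camera.mp4" is a placeholder source.
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                  # pretrained YOLOv8 detector
cap = cv2.VideoCapture("robot_camera.mp4")  # placeholder for the robot camera

dt = 1.0 / 30.0                             # 30 fps, as used in the study
t = 0.0
track = []                                  # (t, x, y) in image coordinates

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    result = model(frame, verbose=False)[0]
    if len(result.boxes) > 0:
        # Highest-confidence detection; its box center is the object position.
        box = result.boxes.xyxy[result.boxes.conf.argmax()]
        x1, y1, x2, y2 = box.tolist()
        track.append((t, (x1 + x2) / 2.0, (y1 + y2) / 2.0))
    t += dt

cap.release()
```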

14. Does the paper provide a novel approach to determining identification parameters? The method for this must be explained.

- In this study, we provide a new method for determining the velocity and angles of an object in space with a single camera in real time. The robots were also designed by us for this purpose. We added detailed information to the introduction and conclusion of the paper.

15. Each section and subsection should begin with a short introductory paragraph instead of listing titles in succession.

- Thank you very much for your comments. We added a short introductory paragraph to each main section and subsection.

16. Parameters of the robot can be presented in a table.

- We added a table showing the robot parameters (P7-8).

17. The conclusion section can omit detailed results. Instead, the impact or variation of results can be described verbally.

- Thank you very much for your comments. The main results were kept in the conclusion, as is often done in previous studies, to emphasize the significance of the study. We also revised the conclusion to make it clearer.

18. Is the format of listing the conclusions suitable for the journal's format?

19. The fonts used in the figures must be consistent.

- Regarding questions 18 and 19, we will revise the figure fonts and the format at the final stage.

20. Methodology and results must be clearly distinguished.

- Thank you very much for your comment. We revised the methodology and results sections to distinguish them clearly.

This is the end of our responses. With your help, the quality of the current study has improved remarkably. Once again, thank you very much for your recommendations.

 

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The paper describes an interesting approach for low-resource-intensive position, velocity, and direction tracking. However, revisions are necessary that could significantly improve the quality of the manuscript. In particular, a comparison to the state of the art would be helpful to assess the novelty of this research. It is also not clear if this methodology only performs for objects that it was previously trained for. Furthermore, data should be more extensively interpreted instead of simply described. Below are a number of comments that should be addressed before publication:

- Introduction: comparison to state-of-the-art missing.
- Equations 1-4 only account for 2D; how does a 3D movement on uneven ground affect the accuracy? Is this accounted for by the object center?
- Image acquisition: how do frame rate and motion blur affect the accuracy? At what speed does this concept reach its limits?
- Object detection: It is unclear if this method only allows tracking of prior known objects that the model was trained for. Alternatively, how does detecting a standardized object affect the accuracy (e.g., tall human vs. short human)?
- More explanation on how kdt is determined would be helpful. Does this mean the actual height of the object needs to be known a priori? Does that in turn mean that the algorithm only works for prior (exactly) known objects?
- How does the tracking time affect velocity estimation accuracy (1 s vs. shorter/longer)? How does this sensitivity change with speed?
- Robot 2 sensors: The sensors have inaccuracies themselves. Is there a true reference such as motion capture or external cameras overhead?
- Comma and dots swapped around in equations 6-8 (and at other instances throughout the manuscript, e.g. Fig 10). If this is for US publication, the dot is used for decimals while the comma is for 1000s.
- Not fully clear how accurately robot 2 can determine its directionality based on the sensors.

- It is not fully clear to the reader if this method is only valid for trained data on prior known objects.
- Earlier it is mentioned that the camera operates at up to 60 fps. Why are 30 fps chosen?
- Equation #8: there should be a note here as to why a(dx) is not zero at dx=0.
- The magnitudes in 8/9b (and following) are not easily readable. I'd suggest switching to a contour map instead of a 3D bar graph.
- Fig. 10: heights don't seem to be consistent with realistic values (1.1 m height of a car?). The average human height should be between 160-175 cm.
- Again: It is not clear if this algorithm only performs for objects that it was actually trained for.
- Is there a reason for using a tank as a reference? I'd suggest using a non-military application.
- Is there an explanation for the different performance in tracking tank, person, car? Percentage errors would also be helpful.
- A step size of 2 degrees is pretty large; is there a better method for more accurate tracking of the desired angular position, such as motion capture or an above-mounted camera?
- Fig 18 shows two velocity plots instead of velocity and angle.

- Experimental results description for experiments 1-4 should be significantly reduced (overwhelming amount of data description) and data should be interpreted more instead (i.e., why certain changes are observed).
- Plots of zero degree and +/-45 degree angle are not on the same y-scale. Either a consistent y-scale or plotting everything in the same plot would allow for better comparison and fewer plots.
- Fig 21-23: Angle measurements seem to be off (how is Theta_rb2 near zero and not around 45 deg? The difference does not check out).
- Plots should be text-centered, not page-centered.
- Comparison to state of the art (accuracy, computation, etc.) missing; it is not clear where the novelty of this research lies.
- References should be properly formatted.

Comments on the Quality of English Language

Overall, the quality of the English language is good, but it can be slightly improved, especially in the introduction and at various other points throughout the manuscript (e.g., setup vs. set up).

Author Response

Reviewer's Recommendation:

Reviewer 2:

The paper describes an interesting approach for low-resource-intensive position, velocity, and direction tracking. However, revisions are necessary that could significantly improve the quality of the manuscript. In particular, a comparison to the state of the art would be helpful to assess the novelty of this research. It is also not clear if this methodology only performs for objects that it was previously trained for. Furthermore, data should be more extensively interpreted instead of simply described. Below are a number of comments that should be addressed before publication:

Thank you very much for your detailed comments on the paper. We understand your concerns about the format and contribution of the paper. We checked and revised the whole manuscript according to your comments to make it fit the format and scope of the journal. The revised passages are highlighted in blue. Our answers to each of your comments are as follows:

1. Introduction: comparison to state-of-the-art missing.

- We added more recent studies to the introduction on pages 1 and 2.

2. Equations 1-4 only account for 2D; how does a 3D movement on uneven ground affect the accuracy? Is this accounted for by the object center?

- The current study focuses on objects moving in urban environments at distances smaller than 100 m. Consequently, we considered only the algorithm for two-dimensional (2D) flat space; the change in the object's position in the z direction was not considered. Clearly, in 3D space the uncertainty in the z direction can increase. However, the algorithm itself would not change, and the concept of the current study can be applied to 3D space.

- Movement on uneven ground has also been considered in our work. However, since this paper is already sufficiently long and the conclusions would not change, it will be reported in a further study.

- We added this information to L141-145, P4.

3. Image acquisition: how do frame rate and motion blur affect the accuracy? At what speed does this concept reach its limits?

- Thank you very much for your comments. The frame rate may have some effect on the results. However, since this study focuses on low-speed movement of the robot and the data-transfer system must remain lightweight, a sampling interval of 1 s was selected. Note that a higher frame rate may improve accuracy for high-speed movement; when the velocity of the object is low, however, its position may remain the same across frames and the computed velocity would be zero.

- In terms of motion blur, the camera speed would need to be increased. Since this study uses YOLOv8 to detect the object, we rely on the results of that program.

- We also conducted a test to determine the uncertainty of the object position using the current camera. The setup is described in the figure below. Here, the object moves at a high speed of 80 km/h at a distance of 100 m. The viewing angle of the camera was 90°, the observed positions ran from A to B with AB = 220 m, and the frame rate of the camera was set to 60 fps.

Figure 1. Evaluation of object in space.

From this configuration, we can calculate that the distance moved by the object in each frame is around Δs = 0.3 m. As a result, the angle error was Δα = 0.17° and the distance error was Δd = 0.001 m. As can be seen, the error is small and can be neglected. At low speed, similar results were obtained.

- We added this information to L301-309, P10.
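- As an arithmetic check on the figures above (our own calculation from the stated numbers): at v = 80 km/h ≈ 22.2 m/s and 60 fps, the per-frame displacement is Δs = v/f ≈ 0.37 m, i.e. around the 0.3 m stated; taking Δs = 0.3 m at the closest distance d = 100 m reproduces the quoted angular error:

```latex
\Delta\alpha \approx \arctan\frac{\Delta s}{d} = \arctan\frac{0.3\ \text{m}}{100\ \text{m}} \approx 0.17^{\circ}.
```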


4. Object detection: It is unclear if this method only allows tracking of prior known objects that the model was trained for. Alternatively, how does detecting a standardized object affect the accuracy (e.g., tall human vs. short human)?

- The current technique can be applied to objects that were not trained: it allows any object to be captured through a trained model together with a scale coefficient kdt.

- We agree with the reviewer that the position and velocity can be affected when the geometry of the object changes, for example tall versus short people. At present, however, this study focuses on objects of fixed (standard) height. Since the study concentrates on developing the algorithm and the robot design, we did not vary the height of the object. Additionally, only different model heights were considered in the tests, and the results can be applied to other fixed heights. An algorithm for determining the position, velocity, and angle of an object whose height changes is under development and will be presented in our next study.

- We added this information to the paper (L415-425, P14).


5. More explanation on how kdt is determined would be helpful. Does this mean the actual height of the object needs to be known a priori? Does that in turn mean that the algorithm only works for prior (exactly) known objects?

- Thank you very much for your comment. In the current study, the actual height of the model should be known a priori, as the reviewer says. The scale factor kdt is then the ratio of the real model to the standard (trained) model. The real dimensions of the model can differ, but the working principle of the algorithm does not change.

- We added this information to L415-425, P14.
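- As an illustration of how such a scale factor can enter a range estimate, the sketch below uses a simple pinhole-camera model with kdt taken as the height ratio. This is our own minimal example under stated assumptions; the manuscript's exact formulation may differ.

```python
# Sketch: rescaling a pinhole-model range estimate with the height ratio k_dt.
# Assumption: the detector is calibrated on a "standard" (trained) object of
# known height; k_dt converts to the actual object's height.

def estimate_range(h_pixels: float, focal_px: float,
                   h_standard_m: float, k_dt: float = 1.0) -> float:
    """Range to the object from its bounding-box height in pixels.

    h_pixels     -- bounding-box height from the detector (pixels)
    focal_px     -- camera focal length (pixels)
    h_standard_m -- real height of the trained standard object (m)
    k_dt         -- ratio of the actual object's height to the standard's
    """
    h_real_m = k_dt * h_standard_m          # actual object height (m)
    return focal_px * h_real_m / h_pixels   # pinhole model: Z = f * H / h

# Example: a 1.7 m person imaged 85 px tall by a camera with a 600 px focal
# length, where the trained standard object is 1.0 m tall (k_dt = 1.7).
print(estimate_range(85.0, 600.0, 1.0, k_dt=1.7))  # -> 12.0 (metres)
```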


6. How does the tracking time affect velocity estimation accuracy (1 s vs. shorter/longer)? How does this sensitivity change with speed?

- Thank you very much for your comments. As shown in the answer to question 3, the error due to the frame rate of the camera was small. Since this study focused on real-time analysis, the camera speed cannot reach a high value; this does not affect the accuracy of the measurement. We added this information to the paper.


7. Robot 2 sensors: The sensors have inaccuracies themselves. Is there a true reference such as motion capture or external cameras overhead?

- Thank you very much for your comments. In this study, the error of robot 2 was reduced using a geometric method and the robot's other algorithms. However, removing all errors is impossible because of the working principle of the robot, as the reviewer notes. As can be seen from Figure 27, for a complicated trajectory the average trajectory error from the measurement system of robot 2 is less than 1 cm. Consequently, the accuracy of the measurement is guaranteed.

- Note that using an external system to detect the position of the camera is possible. However, it requires an additional system for transferring information and makes the measurement considerably more complicated. Consequently, in this study we used the on-board system of robot 2.

- We added this information to L652-657, P22.


8. Comma and dots swapped around in equations 6-8 (and at other instances throughout the manuscript, e.g. Fig 10). If this is for US publication, the dot is used for decimals while the comma is for 1000s.

- We revised it. We also checked and revised the whole paper again.


9. Not fully clear how accurately robot 2 can determine its directionality based on the sensors.

- Robot 2 was designed and programmed so that it can move along a line at a fixed velocity. The line-tracking sensor is on the lower surface of the robot. Consequently, the movement direction over time can be determined.

- We added this information to L254-256, P8.

10. It is not fully clear to the reader if this method is only valid for trained data on prior known objects.

- In this study, the object does not need to be trained in the existing space. For a known space, we need to train only the standard (reference) object. After that, we need to know only the height ratio (scaling factor) kdt between the real object and the standard object. The position and velocity of the object can then be determined in any other space.

- Part of this question was already answered under question 4. We added this information to the paper (L415-425, P14).


11. Earlier it is mentioned that the camera operates at up to 60 fps. Why are 30 fps chosen?

- The maximum frame rate of the camera is 60 fps. However, in this study the frame rate was fixed at 30 fps in order to leave the computer system more time to determine the object's position and speed. With a more powerful computer system, the frame rate could be increased; since this study focuses on an inexpensive computer-vision technique, however, the frame rate was reduced.

- We added this information to L339-344, P11.


12. Equation #8: there should be a note here as to why a(dx) is not zero at dx=0.

- Thank you very much for your comment. The relation between α and Δx is given by Equation (8) of the manuscript. There, a(Δx) = 0.027 at Δx = 0, whereas in the ideal case a(Δx) would be zero. Due to the experimental setup, this value cannot be exactly zero; in the real case it is only close to zero, as shown in our study. The offset arises because the Ox axis and the observation direction of the camera are not perfectly coaxial. In our attempts to reduce it, a(Δx) = 0.027 was the minimum value that could be obtained. The negative sign indicates the direction of the axis misalignment.

- We added this information to L354-355, P12.
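- Schematically, the relation described above is linear in Δx with a small non-zero intercept (our hedged notation; the actual coefficients are those of Equation (8) in the manuscript):

```latex
\alpha(\Delta x) = b\,\Delta x + a_0, \qquad |a_0| = |\alpha(0)| = 0.027,
```

where the ideal case would give a_0 = 0 and the sign of a_0 indicates the direction of the axis misalignment.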

13. The magnitudes in 8/9b (and following) are not easily readable. I'd suggest switching to a contour map instead of a 3D bar graph.

- Thank you very much for your comment. We tried converting the figures to contour maps, but the results were not clear enough to illustrate the data. Please allow us to keep those figures; we also revised them for clearer illustration.

14. Fig. 10: heights don't seem to be consistent with realistic values (1.1 m height of a car?). The average human height should be between 160-175 cm.

- Thank you very much for your comment. We agree with the reviewer's comment about realistic heights for cars and humans. Since this study focused on the algorithm and testing methods, the dimensions of the object may differ from real values. For a real object, we only need to change the coefficient kdt, which does not affect the accuracy of the results.

- We added this information to L423-425, P14.


15. Again: It is not clear if this algorithm only performs for objects that it was actually trained for.

- This study trained only one standard model; for other models, no training is required. This approach has not been used previously. The position, velocity, and angle are determined through the standard object, which helps to reduce the computation time and the complexity of the program.

- Since this study used only the training space, some points may have been unclear. In general, the training data can be generated from the trajectory of robot 2 in the standard environment; no training in the real environment is required.

- We added this information to the paper (L423-425, P14).

16. Is there a reason for using a tank as a reference? I'd suggest using a non-military application.

- Thank you very much for your comments. We changed the tank to another object of similar dimensions, which avoids changing the coefficient kdt and keeps the application non-military.


17. Is there an explanation for the different performance in tracking tank, person, car? Percentage errors would also be helpful.

- The main difference between the trained models is their length. Percentage errors were also added to each experiment in Section 4.


18. A step size of 2 degrees is pretty large; is there a better method for more accurate tracking of the desired angular position, such as motion capture or an above-mounted camera?

- Thank you very much for your comments. In this study, a line sensor was used on robot 2 to detect the trajectory. Other techniques could also be developed to increase the accuracy. However, by applying an error-reduction algorithm, the measurement error from the robot in the current study was shown to be small, as seen in Fig. 18. Consequently, a step size of 2° gives good results in the current study. Increasing the accuracy of the robot is an important task for our further study; before that, however, a powerful and light computer system should be developed.

- We added this information to L652-657, P22.


19. Fig 18 shows two velocity plots instead of velocity and angle.

- Thank you very much for your comments. We revised the right panel of Figure 18 to show the angle (P18).


20. Experimental results description for experiments 1-4 should be significantly reduced (overwhelming amount of data description) and data should be interpreted more instead (i.e., why certain changes are observed).

- Thank you very much for your comments. We added more discussion to Section 4. However, since each measurement case required a large amount of time, we kept the data in Section 4.


21. Plots of zero degree and +/-45 degree angle are not on the same y-scale. Either a consistent y-scale or plotting everything in the same plot would allow for better comparison and fewer plots.

- Thank you very much for your comments. The difference in the data comes from the different coordinate systems used in this study: the measurement data are from camera 1, while the other data are from the observation of robot 2. This also explains the error discussed in question 12. We added this information to L555-558, P19.


22. Fig 21-23: Angle measurements seem to be off (how is Theta_rb2 near zero and not around 45 deg? The difference does not check out).

- Thank you very much for your comments. The angle of the robot was measured in the robot's own coordinate system, so the values are around zero. The answer to this question is the same as for question 21.


23. Plots should be text-centered, not page-centered.

- Thank you very much for your comments. We revised all plots in this study to make them text-centered.


24. Comparison to state of the art (accuracy, computation, etc.) missing; it is not clear where the novelty of this research lies.

- Thank you very much for your comments. We added more recent studies to the paper to clarify its novelty.

25. References should be properly formatted.

- Thank you very much for your comments. We revised the reference list.

This is the end of our responses. With your help, the quality of the current study has improved remarkably. Once again, thank you very much for your recommendations.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

All prior comments have been thoroughly reviewed, and the manuscript has been enhanced accordingly.

 