Robot Manipulation Skills Transfer for Sim-to-Real in Unstructured Environments
Round 1
Reviewer 1 Report
The literature review does not cover the most common applications of force control in machining operations. Force control systems are most commonly used to compensate for tool wear, and they are also often used in assembly. The literature review should be completed.
The authors vaguely described how the solutions were transferred to the KUKA IIWA robot.
For understanding, it is worth giving the components of the vectors used in the article, xd, Fd, xc, Fc, etc.
Author Response
Please see the attachment
Author Response File: Author Response.pdf
Reviewer 2 Report
I recommend a division into subsections regarding the introduction, as follows:
section 1.1 motivations
to shed more light on which issues are intended to be improved and why they are important in applications. An added value would be to justify the issue within the context of industrial robotics/automation.
section 1.2 literature overview
when mentioning position control for robots, I recommend including some specific work such as the following references, not necessarily limited to these only:
Dini, Pierpaolo, and Sergio Saponara. "Model-Based Design of an Improved Electric Drive Controller for High-Precision Applications Based on Feedback Linearization Technique." Electronics 10.23 (2021): 2954.
Bernardeschi, Cinzia, et al. "Co-simulation and Verification of a Non-linear Control System for Cogging Torque Reduction in Brushless Motors." International Conference on Software Engineering and Formal Methods. Springer, Cham, 2019.
section 1.3 contributions
here we list the actual contributions of this paper to the state of the art by mentioning the organization of the manuscript in subsequent sections.
I recommend enriching the detail in Figure 3 regarding the Actor-Critic structure of the Agent used. It helps the reader visualize the components and where the information is processed by minimizing the reward function.
The model of the robot used in the tests should also be shown for completeness, in particular the inertia matrices, the Coriolis and gravity/potential vectors, and the Jacobians.
Explain whether Jr means the geometric or the analytical Jacobian.
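For reference, the distinction asked about can be written as below. This is the standard textbook relation, stated under assumed notation (p, \phi, \omega, T and J_A are illustrative symbols, not taken from the manuscript); which of the two Jr denotes is for the authors to state.

```latex
% Assumed notation (not from the manuscript): p is the end-effector position,
% \phi a minimal orientation representation (e.g. Euler angles),
% \omega the end-effector angular velocity.
\begin{aligned}
v = \begin{bmatrix} \dot{p} \\ \omega \end{bmatrix} &= J(q)\,\dot{q}
  && \text{(geometric Jacobian)} \\
\dot{x} = \begin{bmatrix} \dot{p} \\ \dot{\phi} \end{bmatrix} &= J_A(q)\,\dot{q}
  && \text{(analytical Jacobian)} \\
\omega = T(\phi)\,\dot{\phi} \;\Rightarrow\; J(q) &=
  \begin{bmatrix} I & 0 \\ 0 & T(\phi) \end{bmatrix} J_A(q)
\end{aligned}
```

The two coincide for the translational part and differ only in how the orientation rate is expressed, which matters when forces and moments are mapped to joint torques.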
I strongly recommend the inclusion of some implementation details.
To integrate the described RL-based algorithm, did you make use of automatic code generation, and thus import the actor-critic architecture data structures directly into the native KRL environment, or did you make use of supporting operating systems such as ROS/Gazebo to instantiate a simulation of the robot itself?
The simulation and learning part of the RL system seems to be implemented in Matlab/Simulink. I would ask you to state on which machine the simulations are carried out (processor, clock frequency), as well as the software version and toolboxes used.
In Figure 15, the stiffness ellipsoids are inserted. These require the calculation of singular values (hence eigenvalues), which I think is very cumbersome to do online. Are they computed a posteriori, or are they information that the RL system uses?
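To make the computation in question concrete, here is a minimal sketch of how the axes of a planar stiffness ellipse could be obtained a posteriori. It assumes a symmetric positive-definite 2x2 Cartesian stiffness matrix; the function name and the numerical values are hypothetical and are not taken from the manuscript.

```python
import numpy as np

def stiffness_ellipse(K):
    """Semi-axis lengths and directions of a planar stiffness ellipse.

    Assumes K is a symmetric positive-definite 2x2 stiffness matrix, so the
    image of the unit displacement circle under f = K x is an ellipse whose
    semi-axes equal the eigenvalues of K (which are also its singular values).
    """
    K = np.asarray(K, dtype=float)
    K_sym = 0.5 * (K + K.T)              # remove numerical asymmetry
    eigvals, eigvecs = np.linalg.eigh(K_sym)  # eigenvalues in ascending order
    return eigvals, eigvecs              # columns of eigvecs are the axis directions

# Hypothetical stiffness matrix in N/m, for illustration only.
K_example = np.array([[500.0, 100.0],
                      [100.0, 300.0]])
axes, directions = stiffness_ellipse(K_example)
print("semi-axes:", axes)
print("directions (columns):", directions)
```

For a 2x2 or 3x3 matrix this decomposition is inexpensive, so whether it is too costly online depends mainly on the sampling time and on the target platform.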
I hope the comments are helpful to the authors in improving the manuscript.
Good luck!
Author Response
Please see the attachment
Author Response File: Author Response.pdf
Reviewer 3 Report
The manuscript presents investigations into robot force control by the eNAC method, applying results learned on a simple robot structure (in simulation) to complex robotic devices.
This topic is of interest for publication in Electronics, in the Systems & Control Engineering section.
Consider revision of a few minor remarks:
- In line 106, you refer to Fig. 2 and mention that one of the parts is the "robot motion controller", but in the figure you have "Computed Torque Controller". Why?
- In line 165, please explain what the acronym KL means.
- Figure 11, please add the location of the point O
- In lines 312 to 315, please explain the results from Fig. 14 in more detail. Comparing the curves for the 250th/300th episodes with the curve for the 500th episode, I have some doubts. Why is the 500th iteration (episode) better? It shows higher oscillations (between 2 s and 2.8 s), and at the end of the curve, at approximately 3.9 s, there is a step change to a value of 9 N to 9.5 N. By comparison, the 250th and 300th episodes show smaller oscillations and, even though the maximum contact force is higher, the force drops to 8 N at the end of the curve.
- Figure 15: please consider adding some annotations to the stiffness ellipses to increase the clarity of the figure.
- Figure 12: why do you abbreviate the trajectory as "traj" in Fig. 12, but call it a "track" in Figs. 15 and 17?
- Only one consideration/question: have you considered using soft or RCC (remote center compliance) end effectors? It could be expected that the end effector's own compliance would compensate for your "little deformation of the doors (line 329)", so that the door could be treated as a relatively rigid object. A combination of both approaches, your method and a soft gripper, could achieve better results without the theoretical deformation of the object/environment.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 2 Report
Compared with the previous version of the manuscript, there is some improvement in the description of the work.
One final doubt comes to me, reading in particular Answer 6.
So the results presented are the result of simulations and not direct experimentation on the robot?
Would Figures 9 and 17 therefore only be a visual aid for the reader?
If so I would ask the authors to show an explicit analysis of computational complexity.
As a reference example take works such as the following (not necessarily these specifically):
https://ieeexplore.ieee.org/document/9805740
https://www.mdpi.com/1996-1073/12/11/2224
https://www.mdpi.com/1996-1073/13/10/2512
https://www.mdpi.com/1996-1073/13/8/2077
I would ask them to replicate the use of the Simulink Profiler, so as to understand which operations are the most onerous, both for the RL algorithm and for blocks related to geometric properties (such as precisely ellipsoid calculations); and also to replicate the analysis of the computational time required.
The latter will give an idea of the required computational time, i.e., whether during an entire simulation there are "instants" where the real-time throughput constraint is exceeded (comparing the computational time of each numerical interval with the sampling time set in simulation).
Obviously, these two analyses are platform-dependent....
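The timing check described above can be sketched as follows. This is only an illustration of the comparison between per-step computation time and sampling time, written in Python rather than with the Simulink Profiler; the function names, the dummy step and the numerical values are hypothetical.

```python
import time

def time_steps(step_fn, n_steps):
    """Run step_fn(k) for k = 0..n_steps-1 and return each step's wall-clock duration in seconds."""
    durations = []
    for k in range(n_steps):
        t0 = time.perf_counter()
        step_fn(k)
        durations.append(time.perf_counter() - t0)
    return durations

def find_overruns(durations, sample_time_s):
    """Return the indices of the steps whose computation time exceeds the sampling time."""
    return [k for k, d in enumerate(durations) if d > sample_time_s]

# Hypothetical recorded durations (seconds) against a 1 ms sampling time.
recorded = [0.0005, 0.0020, 0.0009, 0.0011]
overruns = find_overruns(recorded, sample_time_s=0.001)
print("steps exceeding the sampling time:", overruns)

# Timing a placeholder control step; a real study would call the controller update here.
measured = time_steps(lambda k: sum(i * i for i in range(1000)), n_steps=5)
print("worst-case step time [s]:", max(measured))
```

Reporting the worst-case step time together with the number of overruns, for a stated processor and software version, would be enough to judge whether the real-time constraint holds on that platform.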
I guess it is too onerous to ask you to verify, by separately implementing the RL in Simulink and the dynamic model of the robot in ROS/Gazebo and connecting the two through a communication tool, that the control loop in fact gets the same results.
I leave this as a cue for a closer analysis of the results.
However, I consider the complexity analysis necessary.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 3
Reviewer 2 Report
I thank the authors for clarifying the points on which I had doubts.
I think the current version is more than adequate for publication in MDPI.
I do not want to request a minor revision for this, so I leave it to the discretion of the authors whether to integrate this last comment as well (which is more my curiosity). Have you used the following SW tool?
https://it.mathworks.com/matlabcentral/fileexchange/65446-kuka-sunrise-toolbox
If so, could you point this out in the manuscript, to clarify the fact that intrinsically an automatic code generator is used in the target programming language, but that the authors do not handle the integration on the processor since there is a third party tool that is open SW (which is more than welcome I would say in a scientific research context).
Thanks for the interesting work and good luck with your research!!!
Author Response
Please see the attachment.
Author Response File: Author Response.docx