Reinforcement Learning for Robot Assisted Live Ultrasound Examination
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This research presents a robotic ultrasound scanning system that integrates a reinforcement learning framework (LSTM-enhanced Deep Q-Network) with image segmentation (U-Net–based liver/aorta detection) to autonomously localize liver standard planes during ultrasound examinations. The system demonstrated proof-of-concept feasibility in phantom experiments, achieving consistent acquisition of target planes with an average PSNR of 21.53 dB across three real-world trials.
Here are my comments:
Three real-world trials are insufficient. A statistically meaningful number of phantom tests, followed by pilot in-vivo trials, is necessary.
Reliance on PSNR ignores clinical diagnostic relevance. Future work should include radiologist scoring, task-specific accuracy such as vessel diameter measurements, or comparison with manual probe control.
The weighting of distance vs. segmentation rewards is arbitrary. A sensitivity or ablation study is needed to justify these design choices.
There is no benchmarking against alternative navigation methods, for example supervised learning–based navigation or classical image-based heuristics.
The transition from simulated training to physical deployment is under-discussed. Domain adaptation strategies should be elaborated.
Several important cross-disciplinary works are missing, such as "Predicting flow status of a flexible rectifier using cognitive computing" and "Bio-inspired circular soft actuators for simulating defecation process of human rectum".
Issues such as patient motion, probe pressure variability, rib occlusion, and tissue heterogeneity are not considered.
The manuscript requires substantial language polishing for clarity and flow. The reward table and equations are difficult to parse and would benefit from clearer formatting.
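As context for the PSNR figure discussed in the comments above, the following is a minimal sketch of how peak signal-to-noise ratio is conventionally computed between a reference image and an acquired image. It is not the authors' implementation: the function name `psnr`, the nested-list image representation, and the 8-bit peak value of 255 are assumptions chosen for illustration.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Return the PSNR in dB between two equally sized grayscale images.

    Images are given as lists of rows of pixel intensities (an assumed,
    dependency-free representation). max_value is the peak intensity,
    assumed here to be 255 for 8-bit images.
    """
    ref_pixels = [p for row in reference for p in row]
    test_pixels = [p for row in test for p in row]
    if not ref_pixels or len(ref_pixels) != len(test_pixels):
        raise ValueError("images must be non-empty and the same size")

    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref_pixels, test_pixels)) / len(ref_pixels)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded

    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because PSNR depends only on pixel-wise error against a reference, it does not by itself indicate whether the anatomical structures needed for diagnosis are visible, which is the concern raised in the comment on clinical relevance.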
Author Response
Please refer to the attached rebuttal file.
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
This paper proposes a robotic ultrasound scanning system (RUSS) based on reinforcement learning, utilizing a DQN+LSTM framework to achieve automatic localization of standard liver planes. The topic has certain clinical application value and aligns with the development trend of intelligent medical imaging. The authors have carried out relatively systematic work in system construction, data acquisition, simulation training, and real-world validation, and the feasibility has been verified through PSNR evaluation. However, the work still lacks sufficient technical innovation, experimental scale, and depth of analysis.
To meet the standards for journal publication, further strengthening of experimental design and technical justification is needed.
- The core methodology combines DQN and LSTM for RL-based probe control. This idea has been explored in prior studies, and the manuscript shows limited novelty. The authors should clearly highlight the new contributions, such as innovations in reward function design, environment construction, or system integration.
- The evaluation relies solely on PSNR, which is insufficient. It is recommended to incorporate additional metrics such as Structural Similarity Index (SSIM), Dice coefficient, or Intersection over Union (IoU) to provide a more comprehensive assessment of image quality.
- Although the reward function formulas are described, there is little discussion on their stability and convergence. It is suggested to include training process curves (e.g., reward convergence trends, loss reduction curves) and analyze the impact of different reward designs on performance.
- The dataset includes only 310 images (from phantom and reconstructed data), which is relatively small for training a deep model. The authors should clarify data augmentation strategies or expand the dataset to improve generalization.
- Some recent studies, especially on reinforcement learning and machine learning frameworks, should be discussed, such as "Self-triggered approximate optimal neuro-control for nonlinear systems through adaptive dynamic programming", "Adaptive critic design for safety-optimal FTC of unknown nonlinear systems with asymmetric constrained-input", "Unsupervised image stitching based on generative adversarial networks and feature frequency awareness algorithm", and "Noise suppression zeroing neural network for online solving the time-varying inverse kinematics problem of four-wheel mobile manipulators with external disturbances", which could improve the quality of the manuscript.
- The manuscript mainly focuses on technical implementation but lacks sufficient discussion of clinical scenarios and practical significance. It is recommended to add a dedicated section explaining how this system could be integrated into clinical workflows (e.g., for initial screening, diagnostic assistance, or reducing physician workload) to enhance its relevance to the medical imaging community.
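To make the suggested overlap metrics concrete, the following is a minimal sketch of the Dice coefficient and Intersection over Union (IoU) for a pair of binary segmentation masks. It is an illustration only, not part of the manuscript: the function name `dice_and_iou`, the nested-list mask representation, and the convention of returning 1.0 when both masks are empty are assumptions.

```python
def dice_and_iou(pred_mask, true_mask):
    """Return (dice, iou) for two equally sized binary masks.

    Masks are lists of rows; any truthy entry counts as foreground.
    This representation is an assumption made to keep the sketch
    free of third-party dependencies.
    """
    pred = [bool(p) for row in pred_mask for p in row]
    true = [bool(t) for row in true_mask for t in row]
    if len(pred) != len(true):
        raise ValueError("masks must be the same size")

    intersection = sum(p and t for p, t in zip(pred, true))
    pred_area = sum(pred)
    true_area = sum(true)
    union = pred_area + true_area - intersection

    if union == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0, 1.0

    dice = 2.0 * intersection / (pred_area + true_area)
    iou = intersection / union
    return dice, iou
```

For example, a predicted mask `[[1, 1], [0, 0]]` against a ground-truth mask `[[1, 0], [0, 0]]` shares one foreground pixel out of a union of two, giving a Dice of 2/3 and an IoU of 0.5. Unlike PSNR, both metrics score agreement on segmented structures (such as the liver or aorta) rather than pixel intensities.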
Author Response
Please refer to the attached rebuttal file.
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
This paper proposes a Robotic Ultrasound Scanning System (RUSS) that leverages reinforcement learning (DQN with LSTM) to automate the acquisition of standard liver ultrasound planes. The approach addresses the critical issue of operator dependency in ultrasound imaging and demonstrates feasibility through real-world trials, showing measurable performance via PSNR. The study’s novelty lies in integrating deep reinforcement learning with robotic control for clinical imaging, highlighting a promising direction toward automation in medical diagnostics.
However, there are some comments that authors need to consider in order to improve the quality of the manuscript:
- Summarize/highlight the main contributions of the paper at the end of the introduction.
- It would be good if the authors separated the related work into its own section after the introduction. Provide full background theory on the RL methods and segmentation methods, including which other segmentation models have been used for medical image segmentation.
- Could models such as SAM (Segment Anything Model) be used in this study? Please discuss.
- The validation is limited to only three real-world trials, which is insufficient to claim robust generalizability.
- Provide more details for Figure 5 (LSTM-DQN network), i.e., the sub-blocks and modules in this figure.
- It would be good to see more details in the figure captions.
- PSNR alone is not a comprehensive clinical evaluation metric; additional image quality or diagnostic accuracy measures are needed.
- The description of experimental setup, patient diversity, and comparison with baselines (e.g., manual expert scanning or alternative algorithms) is unclear.
- Lack of discussion on safety, practical deployment challenges, and computational efficiency of the system in clinical workflows.
- What are the limitations of the model? Please discuss them and show some failure cases.
- Please use newer references from recent years, such as 2023-2025.
Author Response
Please refer to the attached rebuttal file.
Author Response File:
Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
I have reviewed the revised manuscript and I am pleased to report that the authors have made substantial improvements in response to my previous comments. The manuscript now addresses the key concerns I raised, and the revisions enhance both the clarity and depth of the work. I recommend that the manuscript be accepted for publication in its current form.
Reviewer 2 Report
Comments and Suggestions for Authors
The comments from the last round have been well addressed, and I have no further concerns.
Reviewer 3 Report
Comments and Suggestions for Authors
The authors addressed my previous comments clearly, and the manuscript is in good shape.
