Preliminary Study on the Accuracy Comparison Between 3D-Printed Bone Models and Naked-Eye Stereoscopy-Based Virtual Reality Models for Presurgical Molding in Orbital Floor Fracture Repair
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Only two examiners and 11 cases are included, and the “random order” combined with four trial sequences could induce learning effects.
Methodological details are missing: the authors photograph templates for 2D contour metrics but then report 3D shape similarity (HD/RMSE) without explaining how the aluminum plates were digitised into 3D geometry (scanner type, acquisition workflow, calibration, surface registration). That makes the pipeline irreproducible.
The measurement protocols are also ambiguous. Depth is taken with calipers, but the reference plane, landmarking, and repeat positioning are not standardised, and the viewing distance on the NED is set to 30–50 cm. Could this uncontrolled variance explain the weak depth reliability? Please elaborate.
Statistical reporting has several issues: normality testing with small n is underpowered, yet the authors proceed with multiple paired t-tests without adjustment. The ICC models/types (two-way random vs mixed, absolute agreement vs consistency) should also be specified.
Some claims extend beyond the data: the authors conclude that VR is “cost and time efficient” without measuring time, costs, user fatigue, or learning curves, and frame it as comparable to 3D printing.
The rationale for certain citations is unclear: why is a knee arthroplasty segmentation paper cited in a craniofacial context?
Transparency elements are partly inconsistent: informed consent is stated for all patients while the IRB approval date is after the case window...
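As a point of reference for the comment on unadjusted multiple paired t-tests, the following is a minimal sketch of one common correction, the Holm step-down adjustment. It is an illustration only, not the authors' analysis: the function name `holm_adjust`, the metric labels, and the p-values are all hypothetical and were invented for the example.

```python
# Hypothetical sketch: Holm step-down adjustment for a family of paired tests.
# The p-values below are invented for illustration; they are NOT study results.

def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Invented example: one paired comparison per dimensional metric.
raw = {"long_axis": 0.012, "short_axis": 0.030, "area": 0.045, "depth": 0.200}
adjusted = dict(zip(raw, holm_adjust(list(raw.values()))))

for name, p in adjusted.items():
    print(f"{name}: raw p = {raw[name]:.3f}, Holm-adjusted p = {p:.3f}")
```

Holm's procedure controls the family-wise error rate and is never less powerful than a plain Bonferroni correction, which is why it is often suggested when only a handful of tests are involved.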
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
This study aimed to evaluate whether naked-eye stereoscopic display (NED)–based virtual reality implant pre-shaping achieves reproducibility and dimensional accuracy comparable to conventional 3D-printed models for unilateral orbital floor reconstruction.
This study addresses an important gap; the methodology is carefully designed and the findings are adequately presented. I would like to make the following comments.
In the introduction, I would suggest reducing detail on general VR applications and focusing instead on the specific limitations of HMD-based systems that NEDs uniquely address.
In the methodology section, it would be useful to explain why only two surgeons were selected and whether their limited AR/VR experience could influence the learning curve or bias trial outcomes. Moreover, specify exclusion criteria. Please explain whether fracture size or complexity was controlled. I would also suggest providing printer resolution, material properties, and whether printed models were post-processed. As regards the randomization method, it would be useful to describe how the random order of cases was generated. Describe also whether surgeons were blinded to previous trials and whether templates were numbered or coded to prevent recall. It would be useful to indicate how the image scale was standardized when extracting contours and whether inter-photo variability was tested. Please explain whether implants were aligned using a standardized registration protocol before HD/RMSE analysis.
In the results section, depth repeatedly shows poor intra- and inter-rater ICCs. It would be important to explain why depth is difficult to reproduce and to discuss whether this impacts clinical usefulness. Please also briefly explain the ΔLOA values.
In the discussion, explain whether 2–3 mm HD and ~1 mm RMSE are clinically acceptable for orbital floor reconstruction, referencing known tolerances.
As regards the conclusion, I would suggest not overgeneralizing clinical impact.
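To make the HD and RMSE questions above concrete, here is a minimal sketch of how the two shape-similarity metrics could be computed between two point sets that have already been registered. This is an assumption-laden illustration, not the authors' pipeline: the manuscript's digitisation and registration steps are not described in this record, the RMSE is taken here over nearest-neighbour distances (one of several possible definitions), and the function names and toy coordinates are hypothetical.

```python
# Hypothetical sketch: Hausdorff distance (HD) and RMSE between two
# pre-registered 3D point sets. Coordinates are toy values in mm, not study data.
import math


def nearest_distances(source, target):
    """For each point in source, distance to its closest point in target."""
    return [min(math.dist(p, q) for q in target) for p in source]


def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance: the worst-case nearest-neighbour gap."""
    return max(max(nearest_distances(a, b)), max(nearest_distances(b, a)))


def rmse_nearest(a, b):
    """Root-mean-square of nearest-neighbour distances from a to b (assumed definition)."""
    d = nearest_distances(a, b)
    return math.sqrt(sum(x * x for x in d) / len(d))


# Toy example: two points per "surface", already in a common coordinate frame.
implant_vr = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
implant_print = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]

hd = hausdorff_distance(implant_vr, implant_print)
rmse = rmse_nearest(implant_vr, implant_print)
print(f"HD = {hd:.3f} mm, RMSE = {rmse:.3f} mm")
```

The sketch shows why registration matters for the comparison: HD reports the single largest deviation and is therefore sensitive to any residual misalignment, whereas RMSE averages over the whole surface. The brute-force search is quadratic in the number of points and is only suitable for small illustrative inputs.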
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
The manuscript presents a preliminary comparative study evaluating implant pre-shaping accuracy between 3D-printed anatomical models and models generated using a naked-eye stereoscopic VR display (NED). Measurements include major/minor axes, area, depth, and 3D shape similarity (HD, RMSE), with reproducibility and inter-examiner agreement analyzed using ICCs, Bland–Altman, and correlation metrics. The topic is clinically relevant, especially given the increasing role of VR/AR systems in craniofacial surgery. The paper is generally well structured and clearly written, but several issues must be addressed before publication.
1 – The Introduction is too descriptive and relies heavily on generic background about orbital fractures and VR systems. I recommend clarifying earlier what the actual novelty of this work is, specifically the direct implant pre-shaping using a stereoscopic VR display without HMDs.
2 – Previous studies comparing VR/AR and 3D printing are mentioned, but the research gap is not sharply defined. Authors should state more clearly what methodological or clinical insight this study adds beyond existing literature.
3 – The sample size is small (11 patients and two surgeons). This is acceptable for a preliminary study, but the manuscript should explicitly acknowledge that the results cannot support non-inferiority claims and that generalizability is limited.
4 – The implant preshaping protocol needs further clarification. Please indicate whether surgeons were allowed to freely rotate/scale the VR models during shaping, and provide more details about the familiarization or training with the naked-eye stereoscopic device before the experiment.
5 – Depth measurement shows the weakest reproducibility. Authors should explain more clearly how depth was measured on VR-based implants, since caliper positioning may introduce additional variability. This methodological limitation should also be highlighted.
6 – ICC interpretation should be moderated. For example, the VR depth ICC (ICC = 0.54 with CI including negative values) represents poor reliability rather than moderate agreement. This should be stated explicitly.
7 – Some Bland–Altman plots present wide limits of agreement (e.g., LOA = 7.23 mm for long axis). Considering clinical tolerances in orbital reconstruction (often ~1–2 mm), I recommend a more critical interpretation of these differences.
8 – Shape similarity results (HD ≈ 3 mm and RMSE ≈ 1.28 mm) are reasonable, but authors should relate these values to clinical thresholds to better contextualize their practical significance.
9 – Several figures (pages 7–10) are visually dense and the captions are not self-contained. I recommend clarifying the captions and, if possible, simplifying or reorganizing the visual presentation.
10 – A compact summary table comparing VR vs. 3D printing across all metrics (axes, area, depth, HD, RMSE, ICC) would improve readability and help readers understand the global performance differences.
11 – The Discussion section repeats some results instead of focusing on clinical implications. I suggest emphasizing whether the observed differences (e.g., ~1 mm RMSE or ~3 mm HD) meaningfully affect postoperative orbital volume or implant positioning.
12 – Practical considerations such as surgeon fatigue, stereopsis variability, and workflow integration with VR systems should be discussed in more detail, since these aspects influence real-world adoption.
13 – The preliminary nature of the study should be stressed more explicitly. Based on the current data, the manuscript should avoid suggesting that VR can replace 3D printing at this stage.
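For readers weighing the limits-of-agreement comment in point 7, the following minimal sketch shows how Bland–Altman bias and 95% limits of agreement are conventionally computed from paired measurements. It is illustrative only: the function name `bland_altman`, the interpretation of the reported LOA value as the width between the upper and lower limits, and the paired values are assumptions, and the numbers are invented rather than taken from the manuscript.

```python
# Hypothetical sketch: Bland-Altman bias and 95% limits of agreement.
# The paired measurements are invented for illustration (mm); NOT study data.
import statistics


def bland_altman(method_a, method_b, z=1.96):
    """Return bias, lower/upper limits of agreement, and their width."""
    if len(method_a) != len(method_b):
        raise ValueError("Paired measurements must have equal length")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)   # mean difference between methods
    sd = statistics.stdev(diffs)    # sample SD of the paired differences
    lower, upper = bias - z * sd, bias + z * sd
    # "width" is upper minus lower; treating it as the reported LOA span is an assumption.
    return {"bias": bias, "lower": lower, "upper": upper, "width": upper - lower}


# Invented long-axis measurements from two shaping conditions.
vr_long_axis = [10.0, 12.0, 14.0, 16.0]
printed_long_axis = [9.0, 12.0, 15.0, 15.0]

result = bland_altman(vr_long_axis, printed_long_axis)
print({k: round(v, 2) for k, v in result.items()})
```

Because the limits scale with the standard deviation of the differences, a wide interval relative to a clinical tolerance of about 1–2 mm indicates that individual templates can disagree substantially even when the mean bias is close to zero.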
Comments on the Quality of English Language
1 – There are some long or repetitive sentences throughout the manuscript. I recommend a careful English proofreading to improve clarity and readability.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Round 2
Reviewer 3 Report
Comments and Suggestions for Authors
The authors have satisfactorily addressed most of the reviewer’s comments and they have properly justified why some of them have not been taken into account. I have no further remarks about the current version of the manuscript.
Comments on the Quality of English Language
1 – There are some long or repetitive sentences throughout the manuscript. I recommend a careful English proofreading to improve clarity and readability.
