Review Reports
- Pin-Chieh Wu 1,2,
- Yun-Ju Wu 3 and
- Fu-Zong Wu 3,5,6,*
- et al.
Reviewer 1: In-sook Lee Reviewer 2: Anonymous Reviewer 3: Anonymous Reviewer 4: Anonymous
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Despite the interesting concept, several methodological and interpretational concerns limit the strength of the conclusions.
The strengths of this study include a clinically relevant concept, a relatively large cohort, integration of clinical and imaging parameters, and a clear statistical framework.
However, issues related to model validity, interpretation of trabecular parameters, and clinical relevance of the incremental predictive improvement need to be addressed.
Major comments
- Biological interpretation of Tb.N is problematic
- The authors suggested that the osteoporotic group showed lower Tb.Th and higher Tb.N. This contradicts typical osteoporosis microarchitecture, where trabecular number usually decreases due to trabecular loss. Tb.N is calculated as BV/TV divided by Tb.Th. Therefore, the increase in Tb.N is likely mathematical rather than biological. Tb.N may not represent true trabecular number.
- Incremental predictive value is minimal. Model performance: clinical model AUC = 0.711; combined model AUC = 0.738. The improvement is ΔAUC = 0.027. Although statistically significant, this improvement is clinically marginal. The manuscript repeatedly states that the trabecular parameter “significantly improves prediction”, but this wording may overstate the clinical impact.
- No external validation
The proposed nomogram is presented as a potential clinical tool, but there is neither internal nor external validation. This is a major limitation for prediction modeling studies.
- Use of single vertebral level
The study measures trabecular parameters at T12 only. Most opportunistic screening studies evaluate multiple vertebrae. The authors should justify why only T12 was used and whether the results would change with multi-level analysis.
- ROI placement reliability not assessed
Interobserver and intraobserver reproducibility are not reported.
- Several expressions require editing. For example, “waist circumstance” → “waist circumference”; incomplete sentence: “the variable with the highest.”
- Professional English editing is required
- Clarifying imaging parameters; CT scanner model, slice thickness, reconstruction kernel, radiation dose…
- Figure 3 (nomogram) is visually complex and difficult to interpret.
Author Response
Response letter
Reviewer 1 comments:
Comment 1
Despite the interesting concept, several methodological and interpretational concerns limit the strength of the conclusions. The strengths of this study include a clinically relevant concept, a relatively large cohort, integration of clinical and imaging parameters, and a clear statistical framework. However, issues related to model validity, interpretation of trabecular parameters, and clinical relevance of the incremental predictive improvement need to be addressed.
Response 1
Thank you for the reviewer’s thoughtful and constructive comments. We appreciate the recognition of the strengths of our study, including the clinically relevant concept, relatively large cohort, integration of clinical and imaging parameters, and the clear statistical framework.
We acknowledge the reviewer’s concerns regarding model validity, interpretation of trabecular parameters, and the clinical relevance of the incremental predictive improvement. In response, we have carefully revised the manuscript to further clarify the methodology, strengthen the explanation of trabecular parameter interpretation, and provide additional discussion regarding the clinical implications of the observed predictive improvement. Relevant sections in the Methods, Results, and Discussion have been updated accordingly to improve transparency and interpretability.
We believe these revisions have strengthened the manuscript and better addressed the reviewer’s concerns. We sincerely appreciate the reviewer’s insightful suggestions, which have helped improve the overall quality of the work.
Comment 2
Biological interpretation of Tb.N is problematic. The authors suggested that the osteoporotic group showed lower Tb.Th and higher Tb.N. This contradicts typical osteoporosis microarchitecture, where trabecular number usually decreases due to trabecular loss. Tb.N is calculated as BV/TV divided by Tb.Th. Therefore, the increase in Tb.N is likely mathematical rather than biological. Tb.N may not represent true trabecular number.
Response 2
Thank you for this insightful comment. We agree that the biological interpretation of Tb.N requires careful consideration.
As the reviewer correctly noted, Tb.N is derived mathematically as BV/TV divided by Tb.Th, and therefore changes in Tb.N may partly reflect variations in Tb.Th rather than a true increase in trabecular number. In osteoporotic bone, trabecular loss and thinning commonly occur, and the apparent increase in Tb.N observed in our results is likely influenced by this mathematical relationship rather than representing a genuine biological increase in trabecular structures.
In response to this concern, we have revised the manuscript to clarify the computational nature of Tb.N and to avoid overinterpreting it as a direct biological indicator of trabecular number. Additional explanation has been added in the Discussion section to emphasize that the observed increase in Tb.N should be interpreted cautiously and may primarily reflect the reduction in Tb.Th and associated structural changes rather than a true increase in trabecular elements.
We appreciate the reviewer’s valuable suggestion, which has helped improve the accuracy and clarity of the biological interpretation in our manuscript.
Limitation section:
------------ We acknowledge the computational nature of Tb.N and therefore avoid interpreting it as a direct biological indicator of the actual number of trabeculae. The observed increase in Tb.N should be interpreted with caution, as it may largely reflect reductions in trabecular thickness (Tb.Th) and related structural alterations rather than a true increase in trabecular elements.------------
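The arithmetic behind this caution can be illustrated with a minimal sketch, assuming only the relation cited by the reviewer (Tb.N = BV/TV divided by Tb.Th). The input values are hypothetical and are not taken from the study data.

```python
def trabecular_number(bv_tv: float, tb_th_mm: float) -> float:
    """Derived Tb.N (1/mm) from bone volume fraction and mean thickness (mm)."""
    if tb_th_mm <= 0:
        raise ValueError("Tb.Th must be positive")
    return bv_tv / tb_th_mm


# Hypothetical values: BV/TV falls by 10%, Tb.Th falls by 25%.
reference = trabecular_number(bv_tv=0.30, tb_th_mm=0.60)
thinned = trabecular_number(bv_tv=0.27, tb_th_mm=0.45)

print(f"reference Tb.N = {reference:.2f} /mm")
print(f"thinned   Tb.N = {thinned:.2f} /mm")
```

In this illustration the derived Tb.N rises even though bone volume fraction has decreased, because the denominator shrinks proportionally more than the numerator.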
Comment 3
Incremental predictive value is minimal. Model performance: clinical model- AUC; 0.711, Combined model- AUC; 0.738. The improvement is ΔAUC = 0.027. Although statistically significant, this improvement is clinically marginal. The manuscript repeatedly states that the trabecular parameter “significantly improves prediction”, but this wording may overstate the clinical impact.
Response 3
Thank you for this important comment. We agree that the incremental improvement in predictive performance should be interpreted cautiously.
As the reviewer noted, the addition of trabecular parameters increased the AUC from 0.711 to 0.738 (ΔAUC = 0.027). Although this difference reached statistical significance, we acknowledge that the magnitude of improvement is modest and may have limited clinical impact. In response to this comment, we have revised the wording throughout the manuscript to avoid overstating the predictive benefit of the trabecular parameters. Specifically, statements such as “significantly improves prediction” have been modified to more neutral descriptions, indicating that the combined model showed a modest or incremental improvement in predictive performance.
In addition, we have expanded the Discussion section to more clearly address the distinction between statistical significance and clinical relevance, and to acknowledge that the observed improvement in model performance should be interpreted as exploratory and requiring further validation in independent cohorts.
We appreciate the reviewer’s constructive suggestion, which has helped us present the results in a more balanced and clinically appropriate manner.
Conclusion
--- the ability to discriminate osteoporosis appears to improve modestly---
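One way to report the size of an AUC difference alongside its uncertainty is a paired bootstrap confidence interval. The sketch below is illustrative only: the scores are simulated, only the class sizes (95 and 334) come from the cohort description, and the paired bootstrap is a stand-in technique, not the comparison method stated by the authors. The function names are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_delta_auc(labels, scores_a, scores_b, n_boot=200, seed=0):
    """Percentile 95% CI for AUC(b) - AUC(a), resampling subjects with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    deltas = []
    while len(deltas) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        if len(set(y)) < 2:
            continue  # resample lacked one class; draw again
        a = [scores_a[i] for i in idx]
        b = [scores_b[i] for i in idx]
        deltas.append(auc(y, b) - auc(y, a))
    deltas.sort()
    return deltas[int(0.025 * n_boot)], deltas[int(0.975 * n_boot) - 1]


# Simulated scores for illustration; not the study's predictions.
rng = random.Random(42)
labels = [1] * 95 + [0] * 334
clinical = [rng.gauss(0.8 * y, 1.0) for y in labels]
combined = [c + rng.gauss(0.3 * y, 0.5) for c, y in zip(clinical, labels)]

low, high = bootstrap_delta_auc(labels, clinical, combined, n_boot=100)
print(f"AUC clinical = {auc(labels, clinical):.3f}")
print(f"AUC combined = {auc(labels, combined):.3f}")
print(f"95% CI for delta AUC: [{low:.3f}, {high:.3f}]")
```

An interval that excludes zero but sits close to it would be consistent with the "modest" wording adopted above.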
Comment 4.
No external validation. The proposed nomogram is presented as a potential clinical tool, but there is neither internal nor external validation. This is a major limitation for prediction modeling studies.
Response 4
We thank the reviewer for this comment. We acknowledge the lack of internal and external validation as a limitation. We have mentioned it in the Discussion section.
--- Fourth, the model did not undergo internal or external validation, which may limit the generalizability of the findings---
Comment 5.
Use of single vertebral level. The study measures trabecular parameters at T12 only. Most opportunistic screening studies evaluate multiple vertebrae. The authors should justify why only T12 was used and whether the results would change with multi-level analysis.
Response 5
We thank the reviewer for this valuable comment. T12 was selected because it is consistently included in low-dose chest CT (LDCT) examinations and provides reliable and reproducible trabecular assessment. We have clarified this rationale in the manuscript and acknowledged in the limitations that future studies incorporating multi-vertebral analysis may help verify the robustness and generalizability of the findings.
---- T12 was selected because it is consistently included in routine low-dose chest CT (LDCT) examinations, providing a practical and reproducible level for opportunistic assessment. Its trabecular structure can be clearly visualized, enabling reliable ROI placement and quantitative analysis, thereby supporting its use as a standardized vertebral level for trabecular evaluation.---
Comment 6.
ROI placement reliability not assessed. Interobserver and intraobserver reproducibility are not reported.
Response 6
We thank the reviewer for this important comment. Interobserver reproducibility was assessed in 30 randomly selected cases, yielding an intraclass correlation coefficient (ICC) of 0.910, indicating excellent agreement. Previous studies have also demonstrated high reproducibility for vertebral trabecular measurements, with ICC values ranging from 0.933 to 0.994 (P < 0.001 for all).
This information has been added to the revised manuscript.
----T12 was selected because it is consistently included in routine low-dose chest CT (LDCT) examinations, providing a practical and reproducible level for opportunistic assessment. Its trabecular structure can be clearly visualized, enabling reliable ROI placement and quantitative analysis, thereby supporting its use as a standardized vertebral level for trabecular evaluation. Interobserver reproducibility was assessed in 30 randomly selected cases, yielding an intraclass correlation coefficient (ICC) of 0.910, indicating excellent agreement. Consistent with our findings, previous studies have also reported high reproducibility for vertebral trabecular measurements, with ICC values ranging from 0.933 to 0.994 (all P < 0.001).----
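For readers who wish to reproduce an agreement statistic of this kind, the following is a minimal sketch of one common form, ICC(2,1) (two-way random effects, absolute agreement, single rater). The response does not state which ICC form was used, so this choice is an assumption, and the reader measurements shown are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a table with one row per subject and one column per rater."""
    n = len(ratings)       # subjects
    k = len(ratings[0])    # raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subject mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical Tb.Th readings (mm) from two readers on five cases.
readings = [
    [0.52, 0.50],
    [0.61, 0.63],
    [0.45, 0.47],
    [0.70, 0.68],
    [0.58, 0.57],
]
print(f"ICC(2,1) = {icc_2_1(readings):.3f}")
```

Other ICC forms (for example consistency rather than absolute agreement, or average-rater versions) use the same mean squares with a different denominator, so the form should be reported alongside the value.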
Comment 7
Several expressions require editing. For example, “waist circumstance” → “waist circumference”; incomplete sentence: “the variable with the highest.”
Response 7
We thank the reviewer for carefully checking the manuscript. The suggested corrections have been made, and the entire manuscript has been carefully revised to improve grammar, clarity, and language accuracy.
Comment 8
Professional English editing is required
Response 8
Thank you for this valuable comment. The manuscript has been carefully revised and professionally edited for English language, grammar, and clarity throughout.
Comment 9
Clarifying imaging parameters; CT scanner model, slice thickness, reconstruction kernel, radiation dose
Response 9
All chest LDCT scans were performed using a 256-slice CT scanner (Revolution CT, GE Healthcare, Milwaukee, USA), covering the region from the lung apex to the lung base. Scans were obtained during full inspiration without the administration of contrast agents. The scanning protocols across different vendors were generally comparable and included a tube voltage of 120 kVp, a slice thickness of 1–2.5 mm, and reconstruction using a soft-tissue kernel.
Method
----- All chest LDCT scans were performed using a 256-slice CT scanner (Revolution CT, GE Healthcare, Milwaukee, USA), covering the region from the lung apex to the lung base. Scans were obtained during full inspiration without the administration of contrast agents. The scanning protocols across different vendors were generally comparable and included a tube voltage of 120 kVp, a slice thickness of 1–2.5 mm, and reconstruction using a soft-tissue kernel.-----
Comment 10
Figure 3 (nomogram) is visually complex and difficult to interpret.
Response 10
Thank you for your valuable comment regarding the complexity of Figure 3 (nomogram). We agree that the original version could be difficult to interpret. In response, we have revised the figure to improve clarity and usability. Specifically, we have simplified the labeling, enlarged the font size, and added clearer annotations to guide readers through the scoring process. In addition, explanatory text has been added to the figure legend to describe how the nomogram should be applied in practice. To further facilitate interpretation, we have also included a brief example illustrating how the nomogram can be used. For instance, if a patient has variable A = X, variable B = Y, and variable C = Z, the corresponding points from each axis are summed to obtain the total score, which is then projected onto the probability scale to estimate the predicted risk. We believe these revisions improve the readability and practical interpretation of the nomogram.
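The scoring procedure described here (read points per axis, sum them, project the total onto the probability scale) can be sketched for a logistic-regression nomogram as follows. All coefficients, variable names, and axis ranges below are hypothetical placeholders chosen for illustration; they are not the fitted model from the manuscript.

```python
import math

# Hypothetical model for illustration only.
INTERCEPT = -4.0
PREDICTORS = {
    # name: (coefficient, axis minimum, axis maximum)
    "age_years": (0.05, 20.0, 90.0),
    "tb_th_mm": (-2.0, 0.2, 1.0),
}


def max_effect():
    """Largest contribution any single axis can make to the linear predictor."""
    return max(abs(b) * (hi - lo) for b, lo, hi in PREDICTORS.values())


def points(name, value):
    """Points for one predictor, scaled so the most influential axis spans 0 to 100."""
    b, lo, hi = PREDICTORS[name]
    zero_end = lo if b > 0 else hi  # axis end that contributes zero points
    return 100.0 * b * (value - zero_end) / max_effect()


def predicted_probability(patient):
    """Return (total points, predicted probability) for a dict of predictor values."""
    total = sum(points(name, value) for name, value in patient.items())
    baseline = INTERCEPT + sum(
        b * (lo if b > 0 else hi) for b, lo, hi in PREDICTORS.values()
    )
    # Map total points back to the linear predictor, then apply the logistic function.
    linear_predictor = baseline + total * max_effect() / 100.0
    return total, 1.0 / (1.0 + math.exp(-linear_predictor))


patient = {"age_years": 70.0, "tb_th_mm": 0.4}
total, prob = predicted_probability(patient)
print(f"total points = {total:.1f}, predicted probability = {prob:.3f}")
```

The point scale is only a linear rescaling of each term in the regression equation, so the probability read from a nomogram should match the value obtained directly from the model.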
Reviewer 2 Report
Comments and Suggestions for Authors
This is a research article about using low-dose computed tomography (LDCT) to predict the risk of osteoporosis. My comments are as follows:
- The paper did not report the basis for sample size calculation. Please provide a statistical power analysis based on the expected difference in AUC (e.g., 0.711 vs 0.738), and explain whether 429 samples are sufficient to detect an AUC difference of 0.027.
- The research subjects were from the health examination center and were required to undergo both DEXA and LDCT examinations. There might be selection bias. Please discuss whether this "double examination" group represents the target screening population and provide a detailed description of the inclusion process.
- Between 2016 and 2022, there may have been variations in LDCT equipment, slice thickness, and reconstruction algorithms. Please supplement the CT scanning technical parameter table and assess the impact of these differences on the quantitative analysis of bone trabeculae.
- The term "within one month" has a broad definition, during which changes in bone density or treatment interventions may occur. Please report the median interval between the two examinations and analyze whether the time interval affects the results.
- The inter/within observer consistency (ICC value) of ROI placement was not reported. Please supplement the reliability analysis of two independent operators or repeated measurements, which is a key quality control indicator for morphological measurements.
- Table 1 shows that there are missing data for smoking, drinking, and exercise (up to 4%), but the text does not explain how to handle these. Please clearly specify whether to adopt a complete case analysis or multiple imputation, and assess the missing data mechanism.
- Table 3 shows that the OR of Tb.N is 19062 (95% CI: 22.2 - 16343266), and the value is extremely unstable. Please check: whether there is multicollinearity (VIF values are not reported), whether it is necessary to standardize/perform logarithmic transformation on Tb.N, and conduct supplementary restricted cubic spline analysis to verify the dose-response relationship.
Author Response
Reviewer 2
Comment 1
This is a research article about using low-dose computed tomography (LDCT) to predict the risk of osteoporosis. My comments are as follows: The paper did not report the basis for sample size calculation. Please provide a statistical power analysis based on the expected difference in AUC (e.g., 0.711 vs 0.738), and explain whether 429 samples are sufficient to detect an AUC difference of 0.027.
Response 1
Thank you for this important comment. This study was retrospective, and the sample size was determined by the number of eligible participants available during the study period rather than by a priori calculation. To address the reviewer’s concern, we conducted a post hoc power analysis based on the observed AUC difference between the clinical model (0.711) and the combined model (0.738). With a total sample size of 429 participants, the analysis indicated that the study had sufficient statistical power to detect an AUC difference of 0.027 at a two-sided significance level of 0.05. This clarification has been added to the Statistical Analysis section.
Comment 2
The research subjects were from the health examination center and were required to undergo both DEXA and LDCT examinations. There might be selection bias. Please discuss whether this "double examination" group represents the target screening population and provide a detailed description of the inclusion process.
Response 2
Thank you for this important comment. We acknowledge that potential selection bias may exist because the study population was derived from individuals undergoing health examinations who received both DEXA and LDCT. In our institution, both examinations are commonly performed as part of opportunistic screening programs in health examination centers. Participants who underwent LDCT for lung cancer screening and DEXA for bone health assessment during the same visit (one week) were consecutively included if they met the study criteria. Therefore, the cohort reflects a real-world health screening population in which both imaging tests are available. To improve transparency, we have now added a detailed description of the participant inclusion process in the Methods section and clarified that this cohort represents individuals undergoing opportunistic screening rather than the general population. We have also acknowledged the potential for selection bias in the Discussion section.
----Inclusion criteria were as follows: participants aged ≥18 years who underwent both DEXA and LDCT within a one-week interval, with complete clinical and demographic data available---
Comment 3
Between 2016 and 2022, there may have been variations in LDCT equipment, slice thickness, and reconstruction algorithms. Please supplement the CT scanning technical parameter table and assess the impact of these differences on the quantitative analysis of bone trabeculae.
Response 3
Thank you for your comment. All LDCT scans in this study were performed using a single CT scanner model, with a fixed slice thickness and reconstruction algorithm throughout the study period (2018–2022). Therefore, there were no variations in equipment or acquisition parameters that could affect the quantitative analysis of trabecular bone. We have now clarified this point in the Methods section and supplemented the manuscript with a table detailing the CT technical parameters for transparency.
Comment 4
The term "within one month" has a broad definition, during which changes in bone density or treatment interventions may occur. Please report the median interval between the two examinations and analyze whether the time interval affects the results.
Response 4
Although the original protocol allowed for a one-month window, in this study all examinations were conducted on the same day or within one week. We have revised these points.
Comment 5
The inter/within observer consistency (ICC value) of ROI placement was not reported. Please supplement the reliability analysis of two independent operators or repeated measurements, which is a key quality control indicator for morphological measurements.
Response 5
We thank the reviewer for this important comment. Interobserver reproducibility was assessed in 30 randomly selected cases, yielding an intraclass correlation coefficient (ICC) of 0.910, indicating excellent agreement. Previous studies have also demonstrated high reproducibility for vertebral trabecular measurements. This information has been added to the revised manuscript.
Comment 6
Table 1 shows that there are missing data for smoking, drinking, and exercise (up to 4%), but the text does not explain how to handle these. Please clearly specify whether to adopt a complete case analysis or multiple imputation, and assess the missing data mechanism.
Response 6
We performed a complete case analysis for smoking, drinking, and exercise (missing ≤4%), as the missingness appeared to be at random. This has been clarified in the Methods section.
Comment 7
Table 3 shows that the OR of Tb.N is 19062 (95% CI: 22.2 - 16343266), and the value is extremely unstable. Please check: whether there is multicollinearity (VIF values are not reported), whether it is necessary to standardize/perform logarithmic transformation on Tb.N, and conduct supplementary restricted cubic spline analysis to verify the dose-response relationship.
Response 7
We checked for multicollinearity and calculated VIF values; no severe collinearity was detected (all VIF < 5). These analyses have now been added to the revised manuscript.
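The scaling question raised in the comment can be illustrated separately from collinearity. In logistic regression the odds ratio per one-unit increase is exp(β), so the same coefficient expressed per a smaller increment is much less extreme. The sketch below applies this to the reported OR and confidence limits; the 0.1-unit increment is an arbitrary illustrative choice and is not a rescaling reported by the authors.

```python
import math


def rescale_odds_ratio(or_per_unit: float, increment: float) -> float:
    """Odds ratio for a change of `increment` units, given the OR per one unit."""
    beta = math.log(or_per_unit)  # log-odds change per one unit
    return math.exp(beta * increment)


# Values as quoted from Table 3 in the comment above.
reported = {"OR": 19062.087, "lower 95% CI": 22.2, "upper 95% CI": 16343266.0}

for label, value in reported.items():
    print(f"{label}: per 1 unit = {value:g}, per 0.1 unit = {rescale_odds_ratio(value, 0.1):.2f}")
```

Rescaling changes how the estimate is expressed but not the fitted model, so a wide interval on the original scale remains wide in relative terms after rescaling.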
Reviewer 3 Report
Comments and Suggestions for Authors
This study addresses a clinically important question and is generally a well-written and methodologically structured investigation. The use of trabecular bone morphometric parameters derived from LDCT images for predicting osteoporosis risk is a timely and clinically relevant topic.
However, several methodological issues should be clarified. In particular, the interpretation of model performance, the absence of calibration analysis, the lack of validation procedures, and the interpretation of certain statistical results limit the scientific strength of the study. Therefore, I believe that the manuscript requires major revision before it can be considered for publication.
- The reported AUC value for Model 1 is 0.738. Although this value is statistically significant, it indicates moderate discriminative ability in the context of clinical decision-support systems. In addition, the difference in AUC between Model 1 and Model 2 is reported as 0.027, which may have limited clinical significance. In this context, the following points should be discussed in greater detail: Is the proposed model intended to serve as a diagnostic alternative to DEXA, or is it primarily designed as a screening or risk stratification tool? The clinical relevance of the increase in AUC should be discussed more thoroughly in the discussion section.
- Although the discriminative performance of the model has been reported, calibration analysis has not been presented. In predictive models, calibration is essential to evaluate how closely the predicted probabilities correspond to the observed outcomes. Therefore, the inclusion of the following analyses is recommended: Calibration plot, Hosmer–Lemeshow goodness-of-fit test, Decision curve analysis (DCA). These analyses would strengthen the evaluation of the model's clinical validity.
- The model was evaluated using a single-center dataset, and no validation analysis has been reported. This may limit the generalizability and reliability of the model. Therefore, it is recommended that at least one of the following validation methods be applied: Internal validation (bootstrap method), Cross-validation, Split-sample validation (training and testing datasets).
- In Table 3, the odds ratio for Tb.N is reported as OR = 19062.087, which is extremely high and raises concerns regarding the stability of the model. The following points should be clarified: Was the Tb.N variable standardized before being included in the model? Was any scaling or log transformation applied? Such unusually high OR values are often observed when continuous variables are included in the model without appropriate scaling or when there are model stability issues.
- Trabecular bone morphometric analyses were performed using software; however, measurement reproducibility was not assessed in the study. Specifically, the following analyses were not reported: Intra-observer reliability. Inter-observer reliability. Since ROI placement may be operator-dependent, these analyses are important for evaluating measurement reliability.
- Several important variables that may influence osteoporosis risk were not included in the analysis, such as: Glucocorticoid use, hormone replacement therapy, menopausal status. Although the authors mention this as a limitation, the absence of these variables may affect model performance and should be discussed more thoroughly.
- In the discussion section, “waist circumstance” should be corrected to “waist circumference.”
- The selection of the ROI could be better justified. The T12 vertebra was selected for trabecular analysis; however, the rationale for choosing T12 rather than adjacent vertebrae such as T11 or L1 is not clearly explained.
- The units of measurement for some trabecular parameters should be stated more clearly in the tables.
- A nomogram was constructed in the study; however, calibration curves and clinical utility analysis (e.g., decision curve analysis) were not presented. Including these analyses would improve the evaluation of the nomogram's clinical applicability.
Author Response
Reviewer 3
Comment 1
This retrospective study, conducted at a single centre in Taiwan, investigated whether combining clinical risk factors with trabecular bone features extracted from routine Low-Dose Computed Tomography (LDCT) scans could improve the prediction of osteoporosis. The study included 429 adults who had both a DEXA scan (the gold standard for diagnosing osteoporosis) and an LDCT scan within a one-month period.
Response 1
We thank the reviewer for the comment. This retrospective single-centre study in Taiwan included 429 adults who underwent both DEXA and routine low-dose CT (LDCT) scans within one week. The study aimed to evaluate whether combining clinical risk factors with trabecular bone features extracted from LDCT could improve osteoporosis prediction.
Comment 2
The title AUC-Optimized Nomogram is misleading. The manuscript does not show optimization procedures such as hyperparameter tuning or model selection strategy, only comparison of two models using AUC is performed. It is advised to change the title accordingly.
Response 2
We thank the reviewer for the insightful comment. We agree that the term “AUC-Optimized” may be misleading, as no formal hyperparameter tuning or model selection was performed. The manuscript only compares two models using AUC. Accordingly, we have revised the title to:
--Nomogram for Osteoporosis Risk Using LDCT Trabecular Parameters--
Comment 3
Methods. Page 2, line 71. This retrospective observational cohort study examined individuals aged 20 years or older who underwent DEXA at the Kaohsiung Veterans General Hospital in Taiwan. Osteoporosis screening usually targets women over age 50 and men over age 65. Please explain or justify.
Response 3
We thank the reviewer for raising this point. Although routine osteoporosis screening generally targets women ≥50 years and men ≥65 years, our study included adults aged ≥20 years because we aimed to evaluate the predictive value of LDCT-derived trabecular parameters across a broader age range. Including younger adults allows assessment of early changes in bone microarchitecture and increases generalizability of the nomogram.
Comment 4
Page 2, line 73. LDCT scan – no scanner details were given. Brand? Model? Manufacturer? Imaging protocol?
Response 4
All chest LDCT scans were performed using a 256-slice CT scanner (Revolution CT, GE Healthcare, Milwaukee, USA), covering the region from the lung apex to the lung base. Scans were obtained during full inspiration without the administration of contrast agents. The scanning protocols across different vendors were generally comparable and included a tube voltage of 120 kVp, a slice thickness of 1–2.5 mm, and reconstruction using a soft-tissue kernel (Methods section).
Comment 5
Page 2, line 76. Between January 2016 and December 2022, a total of 510 individuals were enrolled. Of these, 57 were excluded due to incomplete or missing key data, and 24 were excluded for having Z-scores instead of T-scores, leaving a final cohort of 429 participants for analysis shown in Figure 1. No sample size calculation given. Missing power analysis.
Response 5
Thank you for this important comment. This study was retrospective, and the sample size was determined by the number of eligible participants available during the study period rather than by a priori calculation. To address the reviewer’s concern, we conducted a post hoc power analysis based on the observed AUC difference between the clinical model (0.711) and the combined model (0.738). With a total sample size of 429 participants, the analysis indicated that the study had sufficient statistical power to detect an AUC difference of 0.027 at a two-sided significance level of 0.05. This clarification has been added to the Statistical Analysis section.
Comment 6
Page 3, line 100. For trabecular bone analysis, the T12 vertebral body was selected as the region of interest (ROI). Only one vertebra was used which is T12. However, most studies use L1 or average vertebra. Please justify.
Response 6
We thank the reviewer for this valuable comment. T12 was selected because it is consistently included in low-dose chest CT (LDCT) examinations and provides reliable and reproducible trabecular assessment. We have clarified this rationale in the manuscript and acknowledged in the limitations that future studies incorporating multi-vertebral analysis may help verify the robustness and generalizability of the findings.
Comment 7
Page 3, line 101. A circular ROI with an area of 1.5 cm² was placed in the axial view of the vertebral body (central portion). Any reference or validation?
Response 7
We thank the reviewer for this valuable comment. T12 was selected because it is consistently included in low-dose chest CT (LDCT) examinations and provides reliable and reproducible trabecular assessment. We have clarified this rationale in the manuscript and acknowledged in the limitations that future studies incorporating multi-vertebral analysis may help verify the robustness and generalizability of the findings. Interobserver reproducibility was assessed in 30 randomly selected cases, yielding an intraclass correlation coefficient (ICC) of 0.910, indicating excellent agreement. Previous studies have also demonstrated high reproducibility for vertebral trabecular measurements. This information has been added to the revised manuscript.
Comment 8
Page 4, line 129. All measurements were performed in a standardized manner to ensure consistency in ROI placement and parameter extraction. The authors did not report intraobserver reliability and interobserver reliability.
Response 8
We thank the reviewer for this valuable comment. T12 was selected because it is consistently included in low-dose chest CT (LDCT) examinations and provides reliable and reproducible trabecular assessment. We have clarified this rationale in the manuscript and acknowledged in the limitations that future studies incorporating multi-vertebral analysis may help verify the robustness and generalizability of the findings. Interobserver reproducibility was assessed in 30 randomly selected cases, yielding an intraclass correlation coefficient (ICC) of 0.910, indicating excellent agreement. Previous studies have also demonstrated high reproducibility for vertebral trabecular measurements. This information has been added to the revised manuscript.
Comment 9
Page 4, line 141. Model 1 combined the screened clinical covariates with the trabecular bone morphometric parameter that exhibited the strongest univariable discriminatory power, whereas Model 2 contained only the screened clinical covariates. What are the screening criteria? How about clinical selection? Any p-value threshold? How about model validation?
Response 9
We screened variables using univariable analysis (p < 0.10), retained clinically important covariates, and performed internal validation via 1,000-iteration bootstrap to assess model stability and optimism-corrected AUC.
Comment 10
Page 5, line 151. Table 1 presents the clinical characteristics of a study cohort of 429 individuals, categorized based on osteoporosis status: osteoporosis-negative (N=334) and osteoporosis-positive (N=95). The imbalance of 95 osteoporosis cases vs. 334 controls may affect the model.
Response 10
We thank the reviewer for highlighting this point. We acknowledge that the imbalance between osteoporosis-positive (N=95) and negative (N=334) participants may affect model performance. Sensitivity analyses were performed to ensure robustness.
Comment 11
Page 5, line 153. Among the cohort, 31.2% are female and 68.8% are male,… Osteoporosis prevalence is usually higher in women. Furthermore, no subgroup analysis was conducted. The important factors are age group and sex differences.
Response 11
We thank the reviewer for the comment. Although our cohort included more males (68.8%), we acknowledge that osteoporosis prevalence is typically higher in females. These clarifications have been added to the Limitations section.
Comment 12
Page 6, Table 2. The result for BV/TV is not statistically significant, while the literature shows that BV/TV is strongly associated with osteoporosis. Could it be measurement bias?
Response 12
We thank the reviewer for this important comment. Although BV/TV was not statistically significant in our cohort, this may reflect the limited sample size and variability in measurements. Measurement procedures followed standardized protocols to minimize bias. We have added the following discussion to the Limitations section:
"Although BV/TV was not statistically significant in our cohort, this may reflect the limited sample size and inherent variability in measurements. All measurement procedures followed standardized protocols to minimize potential bias."
Comment 13
Page 7, Table 3. The Tb.N OR of 19062.087 is extremely large. Could it be a scaling issue or an unstable model? The CI is very wide as well.
Response 13
We thank the reviewer for the comment. We acknowledge that the extremely large OR for Tb.N (19,062.087) and wide confidence interval reflect model instability, likely due to the limited number of events and the variable’s scale. We have added this as a limitation in the revised manuscript (Methods and Limitations sections).
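To illustrate the scaling part of this point, the short calculation below converts the reported per-unit odds ratio into the odds ratio for a smaller increment. The 0.1-unit step is purely hypothetical, since the units and observed range of Tb.N are not given here; the point is only that an OR is tied to the size of a "one unit" change.

```python
import math

or_per_unit = 19062.087        # Tb.N odds ratio reported in Table 3
beta = math.log(or_per_unit)   # log-odds change per 1-unit increase in Tb.N

# Hypothetical rescaling: report the effect per 0.1-unit increase instead.
step = 0.1
or_per_step = math.exp(beta * step)

print(f"beta per unit = {beta:.3f}")
print(f"OR per {step}-unit increase = {or_per_step:.3f}")
```

Under this assumed step the same coefficient corresponds to an OR of about 2.68. Rescaling changes only how the effect is reported: the fitted model, the p-value, and the relative width of the confidence interval are unchanged, so instability due to a small number of events would remain.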
Comment 14
Page 10, lines 322-329. Key limitations are missing, such as the retrospective design, the lack of validation, and the single-vertebra analysis.
Response 14
We thank the reviewer for pointing this out. We have now added key limitations, including the retrospective design, lack of external validation, and analysis based on a single vertebra, to the Limitations section.
Reviewer 4 Report
Comments and Suggestions for Authors
This retrospective study, conducted at a single centre in Taiwan, investigated whether combining clinical risk factors with trabecular bone features extracted from routine Low-Dose Computed Tomography (LDCT) scans could improve the prediction of osteoporosis. The study included 429 adults who had both a DEXA scan (the gold standard for diagnosing osteoporosis) and an LDCT scan within a one-month period.
Below are my comments:
The title "AUC-Optimized Nomogram" is misleading. The manuscript does not show optimization procedures such as hyperparameter tuning or a model selection strategy; it only compares two models using AUC. It is advised to change the title accordingly.
Methods.
Page 2, line 71. This retrospective observational cohort study examined individuals aged 20 years or older who underwent DEXA at the Kaohsiung Veterans General Hospital in Taiwan. Osteoporosis screening usually targets women over age 50 and men over age 65. Please explain or justify the lower age threshold.
Page 2, line 73. LDCT scan – no scanner details were given. Brand? Model? Manufacturer? Imaging protocol?
Page 2, line 76. Between January 2016 and December 2022, a total of 510 individuals were enrolled. Of these, 57 were excluded due to incomplete or missing key data, and 24 were excluded for having Z-scores instead of T-scores, leaving a final cohort of 429 participants for analysis shown in Figure 1. No sample size calculation given. Missing power analysis.
Page 3, line 100. For trabecular bone analysis, the T12 vertebral body was selected as the region of interest (ROI). Only one vertebra, T12, was used. However, most studies use L1 or an average across several vertebrae. Please justify.
Page 3, line 101. A circular ROI with an area of 1.5 cm² was placed in the axial view of the vertebral body (central portion). Any reference or validation?
Page 4, line 129. All measurements were performed in a standardized manner to ensure consistency in ROI placement and parameter extraction. The authors did not report intraobserver or interobserver reliability.
Page 4, line 141. Model 1 combined the screened clinical covariates with the trabecular bone morphometric parameter that exhibited the strongest univariable discriminatory power, whereas Model 2 contained only the screened clinical covariates. What are the screening criteria? How about clinical selection? Any p-value threshold? How about model validation?
Results.
Page 5, line 151. Table 1 presents the clinical characteristics of a study cohort of 429 individuals, categorized based on osteoporosis status: osteoporosis-negative (N=334) and osteoporosis-positive (N=95). The imbalance of 95 osteoporosis cases vs. 334 controls may affect the model.
Page 5, line 153. Among the cohort, 31.2% are female and 68.8% are male,… Osteoporosis prevalence is usually higher in women. Furthermore, no subgroup analysis was conducted. The important factors are age group and sex differences.
Page 6, Table 2. The result for BV/TV is not statistically significant, while the literature shows that BV/TV is strongly associated with osteoporosis. Could it be measurement bias?
Page 7, Table 3. The Tb.N OR of 19062.087 is extremely large. Could it be a scaling issue or an unstable model? The CI is very wide as well.
Discussion
Page 10, lines 322-329. Key limitations are missing, such as the retrospective design, the lack of validation, and the single-vertebra analysis.
Author Response
Please see the attachment.
Author Response File:
Author Response.pdf
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
Most of my comments have been addressed.
Author Response
as pdf file
Author Response File:
Author Response.pdf
Reviewer 4 Report
Comments and Suggestions for Authors
I appreciate the thorough revision. All of my previous comments have been adequately addressed, and I have no further comments.
Author Response
as pdf
Author Response File:
Author Response.pdf