MicroRNA Biogenesis Pathway Gene Variants Are Associated with Prostate Cancer Susceptibility
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The manuscript is timely and relevant due to the increased interest in understanding miRNA in the context of cancer diagnosis, prognosis, and treatment. The work is a valuable addition to the existing population-based studies, but would benefit from some additional work being performed.
- Since the study focuses on miRNA machinery genes (e.g., DICER1), it would be helpful to integrate at least some miRNA expression data.
- For example, are certain variants associated with specific miRNA expression signatures?
- This would aid in the functional validation for this study.
- The analysis of miRNA could potentially offer biomarkers as well.
In terms of the population used, beyond cancer vs. no cancer, were the samples stratified for increased risk, such as "control patients" with mutations in known DDR genes? Using this may offer more depth to the insights in this manuscript.
One other point that perhaps is beyond the scope of this paper is that the biological significance of the pathway enrichment analysis is slightly limited by the small number of genes included in the analysis.
The use of some in vivo experiments would be helpful to elevate this work. Specifically, to address whether certain variants are associated with increased risk of prostate cancer.
Comments on the Quality of English Language
Some statements are too informal for a research article, such as "which remains a 'white spot' on the genetic map for this disease."
Other phrases would benefit from English editorial review. For example, "We have taken an attempt" is better stated as "we have made an attempt".
Author Response
Review 1
The authors would like to express their deepest gratitude for your thorough and insightful review of our manuscript. All your comments have been carefully considered and, to the extent possible within the scope of the current study, addressed. Below please find our detailed point-by-point responses:
- Since the study focuses on miRNA machinery genes (e.g., DICER1), it would be helpful to integrate at least some miRNA expression data. – We thank the reviewer for this valuable suggestion. We fully agree that integration of miRNA expression data would provide important functional context for the observed genetic associations and could help elucidate the biological consequences of variation in miRNA biogenesis pathway genes. However, miRNA expression data were not available for the study participants included in the present investigation. The primary objective of this study was to evaluate the association of genetic variants in miRNA biogenesis pathway genes with prostate cancer susceptibility and to assess their cumulative effects using a polygenic risk score approach. Nevertheless, we acknowledge the high value of such an approach and will certainly take your recommendation into account when planning future investigations. A corresponding comment has been added to the Discussion section of the manuscript.
- For example, are certain variants associated with specific miRNA expression signatures? This would aid in the functional validation for this study. – We greatly appreciate this perceptive comment. We agree that investigating the relationship between genetic variants and miRNA expression signatures would provide valuable mechanistic insight and could help identify potential functional links between genotype and phenotype. However, such analyses require matched genetic and transcriptomic data, which were not available in the current study cohort. We have acknowledged this limitation in the revised Discussion and highlighted expression quantitative trait locus (eQTL) and transcriptomic analyses as important future directions, and we will gladly keep your suggestion in mind for future research.
- The analysis of miRNA could potentially offer biomarkers as well. – We thank the reviewer for this valuable comment. We agree that miRNA expression profiles represent promising biomarkers for prostate cancer diagnosis and prognosis. Nevertheless, the present study focused on inherited genetic variation in miRNA biogenesis pathway genes rather than circulating or tissue miRNA expression. Future studies integrating genetic, transcriptomic, and clinical data may provide a more comprehensive assessment of the biomarker potential of miRNA-related pathways in prostate cancer.
- In terms of the population used, beyond cancer vs. no cancer, were the samples stratified for increased risk, such as "control patients" with mutations in known DDR genes? Using this may offer more depth to the insights in this manuscript. – We thank the reviewer for this detailed and constructive suggestion. Information regarding pathogenic variants in DNA damage repair (DDR) genes, including BRCA1, BRCA2, ATM, and related genes, was not available for the study participants. Consequently, stratified analyses based on DDR mutation status could not be performed. We acknowledge that integration of established high-risk susceptibility variants with pathway-based polygenic risk models represents an important direction for future studies and could have further strengthened the scientific significance of our findings. We have now noted this point in the Discussion section and will certainly take the reviewer’s recommendation into account when planning our future research.
- One other point that perhaps is beyond the scope of this paper is that the biological significance of the pathway enrichment analysis is slightly limited by the small number of genes included in the analysis. – We thank the reviewer for this important observation. We agree that the pathway enrichment analysis should be interpreted cautiously because it was based on a relatively small number of genes selected a priori for their involvement in miRNA biogenesis and processing pathways. The enrichment analysis was intended as an exploratory approach to provide biological context for the identified genetic associations rather than as definitive evidence of pathway involvement. The Discussion has been revised accordingly to emphasize the exploratory nature of these findings.
- The use of some in vivo experiments would be helpful to elevate this work. Specifically, to address whether certain variants are associated with increased risk of prostate cancer. – We thank you for this profound and thoughtful suggestion. We agree that functional validation using in vitro or in vivo experimental models would substantially strengthen the biological interpretation of the identified associations. However, the present study was designed as a population-based genetic association investigation and did not include experimental validation. Functional characterization of the identified variants, including assessment of their effects on gene expression, miRNA processing, and prostate cancer-related phenotypes, represents an important avenue for future research.
Comments on the Quality of English Language
Some statements are too informal for a research article, such as "which remains a 'white spot' on the genetic map for this disease." Other phrases would benefit from English editorial review. For example, "We have taken an attempt" is better stated as "we have made an attempt". – We thank the reviewer for this observation. The manuscript has been carefully revised by an English language specialist to improve its clarity, consistency, and scientific style. Informal expressions have been removed or rephrased, and several sentences have been edited for better readability and adherence to standard scientific English.
We sincerely thank you for your deep analysis of the article and for your support in improving it. We hope that our work has become more comprehensible, profound and clear.
Reviewer 2 Report
Comments and Suggestions for Authors
This article conducted a study on the miRNA pathway PGS and the risk of prostate cancer. The research results have reference value for understanding the genetic predisposition of prostate cancer in the population described in this article. However, there are still some areas that need to be revised:
(1) Add a discussion on the limitation that the weight source of PGS is from the same dataset. It is recommended to clearly state the weight source in the method. Add the cross-validation results to show whether the AUC is stable.
(2) Explain the reason for the low AUC of PSA in this cohort (0.439).
(3) It is suggested to supplement stratified analysis by disease stage and Gleason score to explore the performance of PGS in different clinical subgroups.
(4) For the pathway enrichment analysis section, please add the specific rules for mapping SNPs to the gene list and the complete list of genes.
Author Response
Review 2
The authors would like to express their sincere gratitude for your thoughtful and constructive comments on our manuscript. We highly appreciate your recognition of the reference value of our study, and we have carefully addressed each of your suggestions below. Please find our point‑by‑point responses:
(1) Add a discussion on the limitation that the weight source of PGS is from the same dataset. It is recommended to clearly state the weight source in the method. Add the cross-validation results to show whether the AUC is stable. – We thank the reviewer for this important methodological suggestion. The Methods section has been revised to explicitly describe the source of SNP weights used for PRS construction. Specifically, SNP effect estimates were obtained from logistic regression analyses performed in the training subset under the age- and ethnicity-adjusted model and were subsequently applied to the independent validation subset. To address the potential concern regarding overfitting, we performed repeated stratified training-validation analyses. The dataset was randomly divided into training (70%) and validation (30%) subsets while preserving disease status and ethnicity distributions, and this procedure was repeated across ten independent random splits. The Results section now includes a summary table reporting median validation AUCs and interquartile ranges across all repetitions (Table 3), while detailed results for individual iterations have been added to the Supplementary Material. In addition, the Discussion has been expanded to acknowledge the limitations associated with deriving SNP weights from the study cohort and to emphasize the need for validation in independent populations.
(2) Explain the reason for the low AUC of PSA in this cohort (0.439). – We thank the reviewer for drawing attention to this issue. Following re-examination of the dataset, we determined that PSA measurements were available for all prostate cancer cases but only a subset of control individuals. Consequently, the original PSA analysis was not directly comparable to the analyses performed in the full study cohort. The PSA-related analyses have therefore been repeated using only individuals with available PSA measurements. In the revised manuscript, PSA alone yielded an AUC of 57.7% (95% CI 53.2%–62.2%), while the combined PSA + PRS model achieved an AUC of 68.2% (95% CI 64.0%–72.3%). The Results, Discussion, and Methods sections have been updated accordingly.
(3) It is suggested to supplement stratified analysis by disease stage and Gleason score to explore the performance of PGS in different clinical subgroups. – We sincerely thank the reviewer for this valuable suggestion. In response, we performed additional exploratory analyses examining the relationship between PRS and clinicopathological characteristics, including TNM stage, Gleason score, and Grade Group. No statistically significant associations were identified in these analyses. Given the absence of significant findings and the exploratory nature of these comparisons, we elected not to include them in the revised manuscript in order to maintain focus on the primary study objectives, namely genetic susceptibility and polygenic risk prediction.
(4) For the pathway enrichment analysis section, please add the specific rules for mapping SNPs to the gene list and the complete list of genes. – We thank the reviewer for this constructive suggestion. The enrichment analysis was based on the candidate genes represented by the SNPs included in the study. As the investigated variants were selected a priori from genes involved in miRNA biogenesis and related pathways, no separate SNP-to-gene mapping step was performed. We have clarified this point in the revised manuscript.
We sincerely thank you for your thorough and insightful analysis of our article and for your invaluable support in improving our work. We very much hope that, thanks to your comments, our manuscript has become more comprehensible, rigorous, and clear.
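For readers following the exchange, the repeated training-validation procedure described in response (1) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' code: the genotype data are simulated, the splits are stratified by disease status only (the response also preserves ethnicity), and SNP weights are simple allelic log odds ratios rather than the age- and ethnicity-adjusted logistic regression estimates described in the response. All function names are hypothetical.

```python
import math
import random
from statistics import median

def allelic_log_or(genos, labels, j):
    """Log odds ratio for SNP j from an allelic 2x2 table (0.5 added to each cell)."""
    case_alt = sum(g[j] for g, y in zip(genos, labels) if y == 1) + 0.5
    ctrl_alt = sum(g[j] for g, y in zip(genos, labels) if y == 0) + 0.5
    case_ref = sum(2 - g[j] for g, y in zip(genos, labels) if y == 1) + 0.5
    ctrl_ref = sum(2 - g[j] for g, y in zip(genos, labels) if y == 0) + 0.5
    return math.log((case_alt * ctrl_ref) / (case_ref * ctrl_alt))

def auc(scores, labels):
    """AUC as the probability that a random case outscores a random control (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    ctrls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in ctrls)
    return wins / (len(cases) * len(ctrls))

def stratified_split(labels, train_frac, rng):
    """Split indices into training/validation, preserving the case/control ratio."""
    train, valid = [], []
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train += idx[:cut]
        valid += idx[cut:]
    return train, valid

def repeated_validation_auc(genos, labels, n_splits=10, train_frac=0.7, seed=0):
    """Validation AUCs over repeated splits; weights come from the training part only."""
    rng = random.Random(seed)
    n_snps = len(genos[0])
    aucs = []
    for _ in range(n_splits):
        train, valid = stratified_split(labels, train_frac, rng)
        tr_g = [genos[i] for i in train]
        tr_y = [labels[i] for i in train]
        weights = [allelic_log_or(tr_g, tr_y, j) for j in range(n_snps)]
        # PRS = weighted sum of allele dosages, scored on held-out individuals
        scores = [sum(w * d for w, d in zip(weights, genos[i])) for i in valid]
        aucs.append(auc(scores, [labels[i] for i in valid]))
    return aucs

def simulate(n=400, n_snps=21, seed=1):
    """Simulated dosages (0/1/2); only SNP 0 is given a true effect (assumption)."""
    rng = random.Random(seed)
    genos, labels = [], []
    for i in range(n):
        y = i % 2
        row = []
        for j in range(n_snps):
            freq = 0.3 + (0.1 if (j == 0 and y == 1) else 0.0)
            row.append((rng.random() < freq) + (rng.random() < freq))
        genos.append(row)
        labels.append(y)
    return genos, labels

genos, labels = simulate()
aucs = repeated_validation_auc(genos, labels)
print(f"median validation AUC over {len(aucs)} splits = {median(aucs):.3f}")
```

The point of the design is that no validation individual contributes to the weights used to score them, which is what removes the in-sample optimism discussed by the reviewers.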
Reviewer 3 Report
Comments and Suggestions for Authors
- The weights (log ORs) used to construct the PGS were derived from the same dataset used to evaluate the PGS performance. The authors acknowledge this in the Methods. It is a fundamental methodological failure. AUC, OR, and all performance metrics for the PGS are therefore heavily optimistic (inflated) due to in-sample overfitting. Without an independent training set and a held-out validation set, the PGS is essentially tautological.
- The authors are aware of this but treat it as a limitation rather than what it actually is: a reason why the core result is unreliable.
- The authors may wish to consider adding recent work on advanced imaging modalities for prostate cancer diagnosis and staging [10.36922/arnm.4590].
- The authors tested 21 SNPs and found 2 significant at p < 0.05. No correction for multiple comparisons is mentioned or applied. With 21 tests at α = 0.05, approximately 1 false positive is expected by chance alone.
- The manuscript states that 21 SNPs were selected based on HapMap, dbSNP, miRBase, and Ensembl. This is not a rationale, it is a list of databases. There is no explanation of the biological or statistical criteria used to select these specific 21 variants from millions of available polymorphisms.
- What prior functional evidence, effect size thresholds, or literature-based criteria guided the selection?
- Given the central role of miRNA in this study, the authors should consider referencing recent computational approaches for inferring small molecule-miRNA associations [10.1016/j.omtn.2023.102103].
- The authors report that PSA alone yielded an AUC of 0.439 in their cohort for distinguishing cases from controls. This is not a known limitation of PSA. PSA consistently yields AUCs of 0.70–0.85 in published case-control studies. An AUC below 0.50 implies either: (a) a major sampling/design issue, (b) PSA values were measured at or after diagnosis in cases rather than as a screening biomarker, or (c) controls were selected with elevated PSA. The authors attribute this to PSA being poor in aggressive, poorly differentiated tumors, but this explanation is biologically backwards (aggressive tumors typically have higher PSA).
- The study population includes three distinct ethnic groups, each with substantially different genetic backgrounds. Population stratification is a well-known confounder in genetic association studies. The authors state that the ethnic distribution was "comparable" between cases and controls, and that principal component analysis (PCA) or other stratification corrections were applied, but there is no PCA described in the Methods, no genomic inflation factor (λ) reported, and no stratified analyses by ethnicity are presented.
- The pathway analysis was performed with a gene list derived from only 21 SNPs in approximately 15 genes. The ShinyGO analysis with a minimum pathway size of 2 genes is an extremely low threshold. Finding 3 enriched pathways in a 15-gene list is not surprising and is likely to reflect chance findings.
- Furthermore, the biological interpretation offered in the Discussion is stated as if it provides mechanistic insight into prostate carcinogenesis, but these pathways are broadly relevant to virtually all cancers.
- The Discussion discusses BRCA variants, 260+ prostate cancer GWAS loci, and African-ancestry PGS studies, none of which are directly compared to the current findings in a meaningful way. These paragraphs read as filler to increase the apparent context of the work.
- The OR for DROSHA rs4867329 is listed as 1.17 with a 95% CI of 0.66-1.12. The upper bound of the CI (1.12) is below the point estimate (1.17). This is mathematically impossible.
- Similarly, DGCR8 rs417309 shows OR = 1.30 with 95% CI 0.51–1.16, again with an upper bound below the OR.
- These errors in three separate rows suggest the confidence intervals in Table 2 were not properly proofread.
- The manuscript could benefit from citing work on RNF216, a ubiquitin E3 ligase shown to regulate meiosis and PKA stability in the testes, as it provides broader context for the regulatory complexity of male reproductive biology in which prostate carcinogenesis occurs [10.1096/fj.202002294RR].
- The Conclusion states that PGS models were "constructed and validated." This language is inappropriate. The PGS was not externally validated.
- The claim that this provides "a foundational framework for refining polygenic risk assessment for PrC in diverse populations" is not supported by a study limited to one region with one sample and no external validation.
- The authors note a dip in prostate cancer incidence during 2021–2022 attributed to the COVID-19 pandemic. This claim would be strengthened by citing studies that have directly investigated the impact of COVID-19 infection and its variants on clinical outcomes and reproductive health [10.1111/aji.70012].
- The manuscript uses both "PGS" (polygenic score) and "PRS" (polygenic risk score, in the formula section) interchangeably without explanation.
- Given that the study investigates androgen-related cancer biology, the authors should consider citing the recent landmark identification and structural characterization of a membrane androgen receptor and its agonist design [10.1016/j.cell.2025.01.006].
- The unweighted PGS (AUC = 0.602, sensitivity = 0.84, specificity = 0.31) is described in the abstract as having "lower" performance but is never adequately explained in terms of its operational threshold choice.
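The interval check raised in the comments on Table 2 can be illustrated with a short sketch. A confidence interval built symmetrically around the log odds ratio always contains the point estimate after exponentiation, so a reported upper bound below the OR indicates a transcription or formatting error. The 2x2 counts below are hypothetical, and the Wald interval from a 2x2 table is used here only for illustration; the manuscript's estimates came from regression models.

```python
import math

def wald_or_ci(a, b, c, d, z=1.96):
    """OR and Wald CI from a 2x2 table.

    a, b = carrier / non-carrier cases; c, d = carrier / non-carrier controls.
    """
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

def ci_brackets_estimate(estimate, lower, upper):
    """True when the reported interval contains the reported point estimate."""
    return lower <= estimate <= upper

# Hypothetical counts: the computed interval necessarily brackets the OR.
or_, lo, hi = wald_or_ci(60, 140, 50, 150)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")

# Values as quoted in the report fail the same check.
print(ci_brackets_estimate(1.17, 0.66, 1.12))  # False
print(ci_brackets_estimate(1.30, 0.51, 1.16))  # False
```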
Author Response
Review 3
We would like to express our sincere gratitude to the reviewer for the insightful and constructive assessment of our work. The manuscript has been revised in accordance with your thoughtful comments and suggestions. Below, we respond to each of your points individually.
1. The weights (log ORs) used to construct the PGS were derived from the same dataset used to evaluate the PGS performance. The authors acknowledge this in the Methods. It is a fundamental methodological failure. AUC, OR, and all performance metrics for the PGS are therefore heavily optimistic (inflated) due to in-sample overfitting. Without an independent training set and a held-out validation set, the PGS is essentially tautological. & 2. The authors are aware of this but treat it as a limitation rather than what it actually is: a reason why the core result is unreliable. – We are most grateful to the reviewer for highlighting this important concern. We fully and respectfully agree that deriving SNP weights and evaluating performance within the same dataset may lead to optimistic estimates. In response, we have substantially revised the manuscript. We implemented repeated stratified training‑validation splits (70% training, 30% validation) preserving disease status and ethnicity, with effect estimates derived only from the training set and applied to the validation set. This was repeated across ten independent splits. We also performed a sensitivity analysis using external GWAS effect estimates. These analyses yielded weaker predictive performance but confirmed that the observed associations were not driven solely by in‑sample optimization. All relevant sections (Results, Methods, Discussion, Conclusion) have been revised, and we no longer claim external validation. We thank the reviewer again for this most constructive feedback.
3. The authors may wish to consider adding recent work on advanced imaging modalities for prostate cancer diagnosis and staging [10.36922/arnm.4590]. – We thank the reviewer for this suggestion. We reviewed the recommended article and agree that advances in imaging modalities have substantially improved prostate cancer diagnosis and staging. The reference has been incorporated into the Discussion to provide additional clinical context.
4. The authors tested 21 SNPs and found 2 significant at p < 0.05. No correction for multiple comparisons is mentioned or applied. With 21 tests at α = 0.05, approximately 1 false positive is expected by chance alone. – We agree with the reviewer. The revised manuscript now includes correction for multiple testing using the Benjamini-Hochberg false discovery rate (FDR) procedure. Both nominal P values and PFDR values are reported. After correction, only rs595055 in AGO1 remained statistically significant (PFDR < 0.05). The Results, Methods, and Tables have been updated accordingly.
5. The manuscript states that 21 SNPs were selected based on HapMap, dbSNP, miRBase, and Ensembl. This is not a rationale, it is a list of databases. There is no explanation of the biological or statistical criteria used to select these specific 21 variants from millions of available polymorphisms. & 6. What prior functional evidence, effect size thresholds, or literature-based criteria guided the selection? – We sincerely thank the reviewer for noting that the original description was insufficient. The SNP selection section has been extensively revised to clarify the biological and literature‑based rationale underlying variant selection. Briefly, candidate variants were selected from genes encoding key components of the miRNA biogenesis pathway, including DROSHA, DGCR8, XPO5, RAN, DICER1, AGO1, AGO2, PIWIL1, and related genes. Variants were prioritized based on genomic location, predicted functional relevance, minor allele frequency, previous evidence of association with prostate cancer or other malignancies, and genotyping feasibility. Full details have now been incorporated into the revised Methods section. We are grateful to the reviewer for this valuable suggestion.
7. Given the central role of miRNA in this study, the authors should consider referencing recent computational approaches for inferring small molecule-miRNA associations [10.1016/j.omtn.2023.102103]. – We thank the reviewer for highlighting this article. The suggested reference has been incorporated into the revised manuscript to provide additional context regarding the expanding role of miRNA-related computational approaches and their potential translational applications.
8. The authors report that PSA alone yielded an AUC of 0.439 in their cohort for distinguishing cases from controls. This is not a known limitation of PSA. PSA consistently yields AUCs of 0.70–0.85 in published case-control studies. An AUC below 0.50 implies either: (a) a major sampling/design issue, (b) PSA values were measured at or after diagnosis in cases rather than as a screening biomarker, or (c) controls were selected with elevated PSA. The authors attribute this to PSA being poor in aggressive, poorly differentiated tumors, but this explanation is biologically backwards (aggressive tumors typically have higher PSA). – We sincerely thank the reviewer for identifying this issue. Upon re‑examination, we noted that PSA measurements were available for all prostate cancer cases but only for a subset of controls. Hence, the original PSA analysis was not directly comparable to the PRS analysis performed on the full dataset. We have therefore repeated the PSA‑related analyses using only individuals with available PSA measurements. In the revised manuscript, PSA alone produced an AUC of 57.7% (95% CI 53.2%–62.2%), while the combined PSA + PRS model achieved an AUC of 68.2% (95% CI 64.0%–72.3%). The interpretation of PSA performance has been revised accordingly. We thank the reviewer again for this helpful comment.
9. The study population includes three distinct ethnic groups, each with substantially different genetic backgrounds. Population stratification is a well-known confounder in genetic association studies. The authors state that the ethnic distribution was "comparable" between cases and controls, and that principal component analysis (PCA) or other stratification corrections were applied, but there is no PCA described in the Methods, no genomic inflation factor (λ) reported, and no stratified analyses by ethnicity are presented. – We are most grateful to the reviewer for raising this important point. Our study used a candidate‑gene panel of 21 SNPs located in highly conserved genomic regions, which are characterised by low inter‑population differentiation. Consequently, standard PCA performed on such markers would not capture existing population structure (Yunusbayev et al., 2015). For the same reason, the genomic inflation factor λ was not calculated, as it would be artificially close to 1 even in the presence of real stratification. To minimise confounding to the extent possible given these constraints, cases and controls were recruited from the same geographic region and demonstrated comparable ethnic composition. Furthermore, ethnicity was included as a covariate in all adjusted association and PRS analyses, and age‑plus‑ethnicity‑adjusted models were designated as the primary analytical framework. We acknowledge that residual stratification cannot be excluded and have added this point to the Discussion as a limitation. We thank the reviewer again for this insightful observation.
10. The pathway analysis was performed with a gene list derived from only 21 SNPs in approximately 15 genes. The ShinyGO analysis with a minimum pathway size of 2 genes is an extremely low threshold. Finding 3 enriched pathways in a 15-gene list is not surprising and is likely to reflect chance findings. & 11. Furthermore, the biological interpretation offered in the Discussion is stated as if it provides mechanistic insight into prostate carcinogenesis, but these pathways are broadly relevant to virtually all cancers. – We agree that pathway enrichment analyses based on a relatively small gene set should be interpreted cautiously. The purpose of the enrichment analysis was exploratory rather than confirmatory. We have therefore revised the Discussion to avoid strong mechanistic claims and now interpret the identified pathways as biologically plausible hypotheses requiring validation in larger studies rather than definitive evidence of causal mechanisms in prostate carcinogenesis.
12. The Discussion discusses BRCA variants, 260+ prostate cancer GWAS loci, and African-ancestry PGS studies, none of which are directly compared to the current findings in a meaningful way. These paragraphs read as filler to increase the apparent context of the work. – We thank the reviewer for this observation. The Discussion has been substantially revised to focus more directly on interpretation of the present findings. The sections discussing GWAS-derived PRSs, BRCA-associated risk, and previous prostate cancer PRS studies have been condensed and restructured to provide clearer context for the current pathway-based PRS rather than a general overview of the field.
13. The OR for DROSHA rs4867329 is listed as 1.17 with a 95% CI of 0.66-1.12. The upper bound of the CI (1.12) is below the point estimate (1.17). This is mathematically impossible. & 14. Similarly, DGCR8 rs417309 shows OR = 1.30 with 95% CI 0.51–1.16, again with an upper bound below the OR. & 15. These errors in three separate rows suggest the confidence intervals in Table 2 were not properly proofread. – We thank the reviewer for identifying these errors. The confidence intervals reported in the original table contained formatting/transcription mistakes. All association results were rechecked against the original regression outputs and the table has been corrected in the revised manuscript.
16. The manuscript could benefit from citing work on RNF216, a ubiquitin E3 ligase shown to regulate meiosis and PKA stability in the testes, as it provides broader context for the regulatory complexity of male reproductive biology in which prostate carcinogenesis occurs [10.1096/fj.202002294RR]. – We thank the reviewer for this suggestion. The cited work has been added to the Discussion as an example of the complex regulatory mechanisms involved in male reproductive biology and their potential relevance to prostate disease.
17. The Conclusion states that PGS models were "constructed and validated." This language is inappropriate. The PGS was not externally validated. & 18. The claim that this provides "a foundational framework for refining polygenic risk assessment for PrC in diverse populations" is not supported by a study limited to one region with one sample and no external validation. – We agree and have revised the manuscript accordingly. References to model "validation" have been replaced by more precise descriptions of internal validation using repeated training-validation splits. The Conclusion has also been rewritten to avoid overstating the generalizability of the findings and now emphasizes that the study provides evidence supporting further evaluation of pathway-based polygenic approaches in independent populations.
19. The authors note a dip in prostate cancer incidence during 2021–2022 attributed to the COVID-19 pandemic. This claim would be strengthened by citing studies that have directly investigated the impact of COVID-19 infection and its variants on clinical outcomes and reproductive health [10.1111/aji.70012]. – We thank the reviewer for this suggestion. The recommended reference has been added to support the statement regarding the potential impact of the COVID-19 pandemic on healthcare utilization and cancer detection during the 2021–2022 period.
20. The manuscript uses both "PGS" (polygenic score) and "PRS" (polygenic risk score, in the formula section) interchangeably without explanation. – We thank the reviewer for noting this inconsistency. Terminology has been standardized throughout the manuscript, and the term "polygenic risk score (PRS)" is now used consistently.
21. Given that the study investigates androgen-related cancer biology, the authors should consider citing the recent landmark identification and structural characterization of a membrane androgen receptor and its agonist design [10.1016/j.cell.2025.01.006]. – We thank the reviewer for drawing our attention to this recent study. The reference has been incorporated into the Discussion to acknowledge recent advances in our understanding of androgen signaling and prostate cancer biology.
22. The unweighted PGS (AUC = 0.602, sensitivity = 0.84, specificity = 0.31) is described in the abstract as having "lower" performance but is never adequately explained in terms of its operational threshold choice. – We appreciate this comment. The primary objective of the ROC analyses was comparison of overall discriminatory performance between models rather than optimization of a specific classification threshold. Consequently, interpretation focused on AUC values, which are threshold-independent measures of discrimination. We have clarified this point in the revised manuscript.
We sincerely thank you for your in-depth analysis of our work and for your support in improving our article. We have tried to take your comments into account and very much hope that our work has become more comprehensible, profound and clear.
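The multiple-testing point in comment 4 and its response can be sketched as follows. The arithmetic behind "approximately 1 false positive" is 21 × 0.05 = 1.05, and the Benjamini-Hochberg procedure named in the response converts nominal P values into FDR-adjusted values. The P values below are hypothetical placeholders chosen for illustration, not the study's results, and the function name is an assumption.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg FDR-adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity of adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Expected number of false positives for 21 independent tests at alpha = 0.05.
expected_false_positives = 21 * 0.05

# Hypothetical P values: two nominally significant SNPs plus 19 null-like ones.
p_nominal = [0.0005, 0.03] + [0.10 + 0.85 * k / 18 for k in range(19)]
p_fdr = benjamini_hochberg(p_nominal)

n_nominal = sum(p < 0.05 for p in p_nominal)
n_fdr = sum(q < 0.05 for q in p_fdr)
print(f"nominally significant: {n_nominal}, significant after FDR: {n_fdr}")
```

In this illustrative setting, two variants pass the nominal threshold but only one survives correction, which mirrors the pattern described in the response without reproducing its actual values.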
Round 2
Reviewer 3 Report
Comments and Suggestions for Authors
All the comments have been well revised by the authors.

