Article
Peer-Review Record

Lymph Node Involvement Prediction Using Machine Learning: Analysis of Prostatic Nodule, Prostatic Gland, and Periprostatic Adipose Tissue (PPAT)

Appl. Sci. 2025, 15(10), 5426; https://doi.org/10.3390/app15105426
by Eliodoro Faiella 1,2, Giulia D’amone 1,2, Raffaele Ragone 1,2,*, Matteo Pileri 1,2, Elva Vergantino 1,2, Bruno Beomonte Zobel 1,2, Rosario Francesco Grasso 1,2 and Domiziana Santucci 1,2,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 18 March 2025 / Revised: 25 April 2025 / Accepted: 6 May 2025 / Published: 13 May 2025
(This article belongs to the Special Issue Advances in Diagnostic Radiology)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

1. In the abstract:

There is a lack of explicit description of the number of patients in the analysis.

It should be clarified that the Random Forest model was based on radiomic and semantic data.

2. Introduction:

Introduction is too broad and at times repetitive. The description of the general application of radiomics should be shortened, focusing on the novelty of the study.

Suggestion: highlight the research hypothesis and formulate it clearly.

3. Materials and methods:

It is unclear at what stage ChatGPT 4.0 was used (was it for training the model or just for coding?). This should be clearly specified.

The feature selection method should be specified. What criteria were used?

Segmentation was performed by a resident. Indicate whether inter-observer evaluation or any segmentation validations were performed.

Information is missing about the data segmentation strategy (e.g., training/testing, cross-validation?).

Need details on the number of input features used in the model.

4. Results:

 There is a lack of information about the number of classes in the test set (unbalanced classes?).

The performance measures presented are not uniformly described: some are included in the text, others only in the table. It is worth standardizing.

AUC=1 for the combined masks is questionable. Has this result been verified externally? It may indicate overfitting.

5. Discussion:

The discussion is extensive and includes good references to the literature. Please read and cite the article: DOI:10.5114/pjr.2023.131215

The discussion should be expanded to include limitations: lack of external validation, single-person segmentation, lack of comparison with other ML models.

There is a lack of suggestions for practical application of the clinical model (e.g., can it support the decision to perform ePLND?).

6. Conclusions:

Conclusions should be more balanced. Stating 100% sensitivity and specificity suggests an ideal model, which is unlikely with this number of cases and lack of external validation.

Also:

Many grammatical and stylistic errors (e.g., “combing” instead of “combining”, ‘involment’ instead of “involvement”). Recommended language correction.

Comments on the Quality of English Language

The article needs a thorough linguistic revision. I recommend using a professional editing service or a native speaker.

Author Response

 

Response to Reviewer 1

 

1. In the abstract:

There is a lack of explicit description of the number of patients in the analysis.

 

Dear Reviewer, we would like to clarify that the number of patients included in the analysis is explicitly reported in the abstract, specifically in the “Methods” section, where it states: “A retrospective review of 85 patients...” We hope this addresses the concern, but we remain open to rephrasing or further emphasizing this information if the Reviewer deems it necessary for greater clarity.

 

It should be clarified that the Random Forest model was based on radiomic and semantic data.

 

Dear Reviewer, thank you for your suggestion; we have added this information (lines 41-42).

 

2. Introduction:

Introduction is too broad and at times repetitive. The description of the general application of radiomics should be shortened, focusing on the novelty of the study.

Suggestion: highlight the research hypothesis and formulate it clearly.

 

Dear Reviewer, thank you for your suggestion. In accordance with the suggestion, we have revised the Introduction to make it more concise and focused. The general overview of radiomics has been significantly reduced to avoid repetition and unnecessary breadth, allowing for a clearer presentation of the specific context of our study. Furthermore, we have restructured the final part of the Introduction to better highlight the novelty of our approach—namely, the integration of radiomic features extracted not only from the tumor and prostate gland, but also from periprostatic adipose tissue (PPAT). Finally, we have explicitly stated the research hypothesis, emphasizing the potential of radiomics to noninvasively predict lymph node invasion in prostate cancer patients. We believe these modifications enhance the clarity and scientific relevance of the Introduction.

 

 

3. Materials and methods:

It is unclear at what stage ChatGPT 4.0 was used (was it for training the model or just for coding?). This should be clearly specified.

 

Dear Reviewer, as requested, we have clarified the use of ChatGPT 4.0 in the manuscript. Specifically, we have indicated that the model was used during the training phase (lines 329–331).

 

The feature selection method should be specified. What criteria were used?

 

Dear Reviewer, thank you for your suggestion. In the revised manuscript, we have now specified that a wrapper method was employed for feature selection. This approach allowed us to iteratively evaluate subsets of features using the performance of a predictive model, ensuring the selection of the most informative and non-redundant features while minimizing the risk of overfitting. The use of this method is now clearly stated in the “Methods” section (lines 513–515).
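For illustration, a minimal sketch of how such a wrapper selection could be configured is shown below; it assumes scikit-learn's SequentialFeatureSelector wrapped around a Random Forest, and the synthetic data, number of retained features, and scoring metric are placeholders rather than the exact settings used in our pipeline.

```python
# Minimal sketch of wrapper-style feature selection with scikit-learn.
# The estimator, scoring metric, and feature cutoff are illustrative
# assumptions, not the exact configuration described in the manuscript.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SequentialFeatureSelector

# Stand-in for the radiomic + semantic feature matrix (85 cases, 131 features).
X, y = make_classification(n_samples=85, n_features=131, n_informative=10,
                           weights=[0.8, 0.2], random_state=42)

rf = RandomForestClassifier(n_estimators=200, random_state=42)

# Forward wrapper selection: features are added one at a time, keeping the
# subset that maximizes the cross-validated ROC-AUC of the Random Forest.
selector = SequentialFeatureSelector(rf, n_features_to_select=10,
                                     direction="forward", scoring="roc_auc",
                                     cv=5)
selector.fit(X, y)
X_selected = selector.transform(X)
print(X_selected.shape)  # (85, 10)
```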

 

Segmentation was performed by a resident. Indicate whether inter-observer evaluation or any segmentation validations were performed.

 

Dear Reviewer, thank you for your suggestion. In response to the comment, we have added clarification in the manuscript regarding the segmentation process. Specifically, we now specify that although the initial segmentation was performed by a radiology resident, all segmentations were subsequently reviewed and validated by two board-certified senior radiologists (lines 246-248).

 

 

Information is missing about the data segmentation strategy (e.g., training/testing, cross-validation?).

 

We thank the Reviewer for this pertinent observation. In response, we have clarified the data segmentation strategy used in our study. Specifically, the dataset was randomly divided into training and testing sets with an 80:20 ratio. To enhance the robustness of the model and reduce variability, a 5-fold cross-validation was performed on the training set. This approach allowed us to validate model performance and optimize hyperparameters in a controlled and repeatable manner. The corresponding details have now been included in the “Methods” section (lines 331–334).
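A minimal sketch of this strategy is given below; it assumes scikit-learn and uses synthetic stand-in data, so the sample size, class weights, and model settings are illustrative only and not the study's actual values.

```python
# Sketch of the described data strategy: an 80:20 train/test split with
# 5-fold cross-validation on the training portion only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split, cross_val_score

X, y = make_classification(n_samples=85, n_features=131, weights=[0.8, 0.2],
                           random_state=0)

# 80:20 split, stratified to preserve the class ratio in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0)

# 5-fold cross-validation on the training set; the held-out test set is
# untouched until the final evaluation.
cv_auc = cross_val_score(rf, X_train, y_train, cv=5, scoring="roc_auc")
print("CV ROC-AUC: %.2f +/- %.2f" % (cv_auc.mean(), cv_auc.std()))

rf.fit(X_train, y_train)
print("Test accuracy:", rf.score(X_test, y_test))
```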

 

 

Need details on the number of input features used in the model.

 

We thank the Reviewer for the helpful suggestion. As requested, we have now specified the number of input features used in the model. A total of 131 radiomic features were extracted from each volume of interest and included in the initial analysis. This information has been added to the manuscript (line 292).
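For illustration only, the sketch below shows a hypothetical extraction with PyRadiomics; the extraction software and settings actually used may differ, and the file names are placeholders.

```python
# Hypothetical radiomic feature-extraction sketch using PyRadiomics.
# The image and mask paths are placeholders for one volume of interest
# (nodule, whole gland, or PPAT) on a given sequence.
from radiomics import featureextractor

extractor = featureextractor.RadiomicsFeatureExtractor()
extractor.enableAllFeatures()  # first-order, shape, and texture classes

features = extractor.execute("t2w_image.nii.gz", "ppat_mask.nii.gz")

# Keep only the numeric radiomic features, dropping diagnostic metadata.
radiomic_features = {k: v for k, v in features.items()
                     if not k.startswith("diagnostics_")}
print(len(radiomic_features), "features extracted")
```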

 

4. Results:

There is a lack of information about the number of classes in the test set (unbalanced classes?).

 

We thank the Reviewer for this observation. We would like to clarify that the class imbalance within the dataset—specifically, the lower number of patients with lymph node involvement compared to those without—is acknowledged and discussed in the manuscript. This imbalance was maintained in both the training and test sets to reflect real-world clinical distributions. Additionally, data augmentation techniques were applied during model development to mitigate the effects of class imbalance. The relevant information has been included in lines 331-338.
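As a purely illustrative sketch of one way such balancing can be implemented, the example below applies SMOTE from imbalanced-learn to the training set only; SMOTE is an assumption here and not necessarily the augmentation technique used in our study.

```python
# Illustrative class-imbalance handling: oversample the minority class
# (lymph node involvement) in the training data only, leaving the test set
# with its original, clinically realistic distribution.
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=85, n_features=131, weights=[0.8, 0.2],
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

X_train_bal, y_train_bal = SMOTE(random_state=0).fit_resample(X_train, y_train)
print("Training class counts before:", np.bincount(y_train))
print("Training class counts after: ", np.bincount(y_train_bal))
```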

 

The performance measures presented are not uniformly described: some are included in the text, others only in the table. It is worth standardizing.

 

Dear Reviewer, thank you for your suggestion; we have added this information.

 

AUC=1 for the combined masks is questionable. Has this result been verified externally? It may indicate overfitting.

We appreciate the Reviewer’s critical observation. We acknowledge that an AUC of 1.0 is an unusually high value and may raise concerns regarding potential overfitting. In our study, this result was obtained by combining radiomic features from all available masks (nodule, whole gland, and PPAT), which may have introduced feature redundancy and increased model complexity. However, to reduce the risk of overfitting, we employed an 80:20 train-test split along with 5-fold cross-validation within the training set. These steps were taken specifically to validate the model's generalizability and limit the influence of data-specific noise.

 

That said, we recognize the importance of external validation, which was not performed in this study due to the single-center, retrospective nature of the dataset. We have acknowledged this limitation in the revised Discussion section and highlighted the need for future studies using independent cohorts to confirm the robustness of the model.

 

5. Discussion:

The discussion is extensive and includes good references to the literature. Please read and cite the article: DOI:10.5114/pjr.2023.131215.

 

Dear Reviewer, thank you for your suggestion; we have now cited this article.

 

The discussion should be expanded to include limitations: lack of external validation, single-person segmentation, lack of comparison with other ML models.

 

We thank the Reviewer for the helpful suggestion. As requested, we have expanded the Discussion section (lines 525–534) to address these limitations, including the lack of external validation, the use of single-operator segmentation, and the absence of comparison with alternative machine learning models.

 

There is a lack of suggestions for practical application of the clinical model (e.g., can it support the decision to perform ePLND?).

 

Dear Reviewer, thank you for your suggestion. In the revised Discussion, we have clarified the potential clinical utility of our model, particularly in supporting preoperative decision-making regarding the indication for extended pelvic lymph node dissection (ePLND). By identifying patients at higher risk for lymph node involvement based on non-invasive imaging features, the model may help avoid unnecessary ePLND procedures and their associated morbidity (line 501-505).

 

6. Conclusions:

Conclusions should be more balanced. Stating 100% sensitivity and specificity suggests an ideal model, which is unlikely with this number of cases and lack of external validation.

 

Dear Reviewer, thank you for your suggestion; the Conclusions section has been revised to provide a more balanced interpretation of the results, acknowledging the study’s limitations and avoiding overstatement of the model’s performance.

 

Also:

Many grammatical and stylistic errors (e.g., “combing” instead of “combining”, “involment” instead of “involvement”). Recommended language correction.

Dear Reviewer, thank you for your suggestion. We have corrected the grammatical and stylistic errors.

Reviewer 2 Report

Comments and Suggestions for Authors

Dear Editor-In-Chief

Applied Sciences, MDPI,

Subject: Review of the article applsci-3564448

 

Entitled “Lymph Node Involvement Prediction Using Machine Learning: Analysis of Prostatic Nodule, Prostatic Gland, and Periprostatic Adipose Tissue (PPAT)”

 

The study aims to predict lymph node invasion in prostate cancer patients using the combination of clinical information and mp-MRI radiomics features extracted from the suspicious nodule, the prostate gland, and the periprostatic adipose tissue. The work is both relevant and important; however, it needs enhancement on various levels. Enclosed below are some general and specific comments for the authors to consider. The work would benefit from professional English language editing.

General comments

  1. How can this be an original study if the authors cite reference 10 in the aim section? What is new in the present study?
  2. T2w intensity is affected by blood flow. What measures have been taken to reduce the impact of blood flow?
  3. Spell out all acronyms at first appearance. For example: FOV (line 130) and multiparametric MRI (mp-MRI), etc.
  4. 4-5 weeks are sufficient for an aggressive tumor to invade lymph nodes. How was this controlled?
  5. Did a senior radiologist approve the manually segmented VOIs done by the 3rd-year resident? Elaborate, including all measures taken to verify all VOIs.

 

Specific comments:

  1. Line 13: 29, 30 are citations? If yes, no need to include them in a summary. If not, delete.
  2. Line 31: radiology instead of radiological.
  3. Keywords: it is recommended to spell out the acronyms.
  4. Line 56: citations in a text cannot start with 16. Rearrange the citation order throughout the text so that the first reference to appear in the text is cited as 1, the second as 2, and so on…
  5. Lines 117 and 119 include findings which should be part of the Results section.
  6. Line 126: you mean the gland image was reconstructed in the axial, coronal, and sagittal planes, correct?
  7. Line 145: a third year instead of three year.
  8. Line 154: manual segmentation instead of segmentation.
  9. Why Fig. 2? What does it show extra?
  10. Accuracy and AUC for ADC and DWI are not convincing. Shouldn’t higher accuracy reflect higher AUC? Elaborate!

 

 

 

Author Response

 

The study aims to predict lymph node invasion in prostate cancer patients using the combination of clinical information and mp-MRI radiomics features extracted from the suspicious nodule, the prostate gland, and the periprostatic adipose tissue. The work is both relevant and important; however, it needs enhancement on various levels. Enclosed below are some general and specific comments for the authors to consider. The work would benefit from professional English language editing.

 

General comments

  1. How can this be an original study if the authors cite reference 10 in the aim section? What is new in the present study?

We thank the Reviewer for this important comment. The inclusion of reference 10 in the Aim section was an oversight and has now been removed in the revised version to avoid confusion regarding the originality of the study.

The novelty of our work lies in the development of a Random Forest model that integrates radiomic features extracted from all anatomical components of the prostatic lodge—including the tumor nodule, the entire prostate gland, and the periprostatic adipose tissue (PPAT)—to predict lymph node involvement in prostate cancer patients. While previous studies have analyzed individual components, to our knowledge, this is the first study to combine all three regions in a single predictive model, highlighting the importance of the local microenvironment and demonstrating superior diagnostic performance. This comprehensive radiomic approach provides a non-invasive tool with potential clinical value in preoperative risk stratification and surgical planning.

  2. T2w intensity is affected by blood flow. What measures have been taken to reduce the impact of blood flow?

To minimize the influence of blood flow on T2-weighted signal intensity, several precautions were taken during image acquisition. First, T2-weighted sequences with flow-compensating gradients were employed to reduce artifacts caused by pulsatile motion and flow-related signal variations, particularly near vascular structures. Second, careful slice planning and positioning were applied to avoid inclusion of large blood vessels, especially in the periprostatic and pelvic regions. Additionally, all patients were instructed to remain completely still during image acquisition to minimize motion-related artifacts, including those induced by blood flow. These steps were implemented to ensure high-quality image acquisition and reliable radiomic feature extraction from T2-weighted sequences.

 

  3. Spell out all acronyms at first appearance. For example: FOV (line 130) and multiparametric MRI (mp-MRI), etc.

Dear Reviewer, thank you for the suggestion; we have added this information.

 

  4. 4-5 weeks are sufficient for an aggressive tumor to invade lymph nodes. How was this controlled?

We thank the Reviewer for this important observation. To address this concern, we have clarified in the manuscript that lymphadenectomy was performed within a controlled and clinically acceptable timeframe following imaging. Specifically, the surgical procedure was carried out at a mean interval of 102 days, with a range between 91 and 115 days from the MRI examination. This timing was consistent across patients and considered adequate to limit the potential for significant interval progression in terms of lymph node involvement. This information has been added to the revised Methods section (lines 208-210).

 

  5. Did a senior radiologist approve the manually segmented VOIs done by the 3rd-year resident? Elaborate, including all measures taken to verify all VOIs.

We thank the Reviewer for this pertinent comment. As described in the revised manuscript, all manual segmentations initially performed by the third-year radiology resident were systematically reviewed and approved by two senior board-certified radiologists with 12 and 5 years of experience in prostate MRI interpretation. The verification process involved an independent evaluation of each volume of interest (VOI), with particular attention to anatomical accuracy across all sequences (T2-weighted, DWI, and ADC). In cases of uncertainty or disagreement, consensus was reached through joint re-evaluation to ensure consistency and reliability. This multi-step review process was implemented to guarantee the quality and reproducibility of the segmentations used for radiomic feature extraction. 

 

 

Specific comments:

  1. Line 13: 29, 30 are citations? If yes, no need to include them in a summary. If not, delete.

Dear Reviewer, thank you for the suggestion. The correction has been made in the revised manuscript.

  2. Line 31: radiology instead of radiological.
     Dear Reviewer, thank you for the suggestion. The correction has been made in the revised manuscript.
  3. Keywords: it is recommended to spell out the acronyms.
     Dear Reviewer, thank you for the suggestion. The correction has been made in the revised manuscript.
  4. Line 56: citations in a text cannot start with 16. Rearrange the citation order throughout the text so that the first reference to appear in the text is cited as 1, the second as 2, and so on…
  5. Lines 117 and 119 include findings which should be part of the Results section.

Dear Reviewer, thank you for the suggestions. The corrections have been made in the revised manuscript.

  6. Line 126: you mean the gland image was reconstructed in the axial, coronal, and sagittal planes, correct?

Yes, correct.   

  7. Line 145: a third year instead of three year.
     Dear Reviewer, thank you for the suggestion. The correction has been made in the revised manuscript.
  8. Line 154: manual segmentation instead of segmentation.
     Dear Reviewer, thank you for the suggestion. The correction has been made in the revised manuscript.
  9. Why Fig. 2? What does it show extra?
     It is just another example of how we performed the segmentation.
  10. Accuracy and AUC for ADC and DWI are not convincing. Shouldn’t higher accuracy reflect higher AUC? Elaborate!

We thank the Reviewer for this important observation. While we recognize that the accuracy and AUC values for the ADC and DWI sequences appear modest, this outcome reflects the intrinsic limitations of these sequences when used in isolation. Specifically, ADC and DWI provide valuable but partial information, and may not fully capture the complexity of the tumor microenvironment relevant to lymph node involvement. This likely contributes to the reduced discriminative power and explains the lower AUC values.

Regarding the relationship between accuracy and AUC, we agree that a general trend is expected. However, they measure different aspects of model performance. Accuracy depends on a fixed decision threshold and is influenced by class distribution, whereas AUC evaluates the model’s performance across all thresholds, reflecting its overall ability to distinguish between classes. In imbalanced datasets such as ours, a model may correctly classify the majority class (leading to high accuracy), while performing poorly on the minority class (resulting in lower AUC).
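A small invented numerical example (not study data) can illustrate how this divergence arises:

```python
# Toy illustration of why accuracy and AUC can diverge on imbalanced data.
# The values below are invented for demonstration only.
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

# 8 node-negative (0) and 2 node-positive (1) cases.
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
# Predicted probabilities of involvement; one positive case is ranked below
# most of the negatives.
y_prob = np.array([0.10, 0.25, 0.40, 0.45, 0.30, 0.22, 0.15, 0.35, 0.20, 0.55])
y_pred = (y_prob >= 0.5).astype(int)

print("Accuracy:", accuracy_score(y_true, y_pred))  # 0.90, driven by the majority class
print("ROC-AUC :", roc_auc_score(y_true, y_prob))   # 0.625, poor minority-class ranking
```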

This discrepancy reinforces the importance of combining features from multiple regions (nodule, gland, and PPAT), which in our study significantly improved model performance. A clarification on this point has been added to the Discussion (lines 473-480).

 

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

The authors’ response was well received and mostly satisfying.
