Article
Peer-Review Record

Metric-Based Meta-Learning Approach for Few-Shot Classification of Brain Tumors Using Magnetic Resonance Images

Electronics 2025, 14(9), 1863; https://doi.org/10.3390/electronics14091863
by Sahar Gull and Juntae Kim *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 18 March 2025 / Revised: 25 April 2025 / Accepted: 29 April 2025 / Published: 2 May 2025

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Summary:

The paper presents an interesting AI-based approach to brain tumor detection from MRI, tackling the problem of training a neural network when limited data is available. Notably, three types of tumors are taken into consideration in this work. I believe the approach is promising; however, the obtained results are not as good as stated in the abstract. Some clarifications about the methods are needed. Moreover, a comparison with more conventional deep-learning methods is missing. Additionally, some sentences in the manuscript should be rephrased or removed to reduce repetition and improve readability.

 

Legend for the comments: P stands for page/pages, L stands for line/lines.

 

Major comments:

  1. My main concern with the paper is the claim of “outperforming conventional methods” (P. 1 L 19) when the model is never compared to conventional methods. The abstract and the results section are very enthusiastic, but the actual results do not reflect this. The authors should compare the results with what they can obtain with their previously proposed approaches [1,2] (on the same test set).
  2. The pipeline is not clear, both from the text and the presented figures (i.e. fig 2). The ViT extracts features, but then you use images as inputs for the Siamese network; where do you use the features? Moreover, why is the data split done after the feature extraction step?
  3. P 5 L 199-202: a reference is missing here. Moreover, the authors should address the limitations specific to tumor detection from MRI.
  4. P 11 L 436: it is not clear if the images in your dataset are all from different patients or some of them come from the same patient. Moreover, there are no details about the test set. It would be good to further explain that the query set was not split into training samples.
  5. P 12 L 496-502: what is the novel set? This entire paragraph is confusing. Please explain what data you are referring to and for what you use it. This section is about the training details, therefore, if you are talking about test set the information should go somewhere else. Also, I was expecting different tumor classes for the meta-testing phase. You mention “novel classes” but don’t specify which ones.
  6. P 14 L 546: more details about “another model” are needed.
  7. P 14 L 550-556: are these models trained by the authors or someone else? Details about architecture, training set, training strategies are needed.
  8. P 15 L 568-571 + Table 5: this comparison should be removed. The authors cannot state that their model is better than another model trained and tested on data from a different medical domain. This is not a fair comparison.
  9. Please clarify which test set was used to produce Tables 3 and 4.

 

 

Minor comments:

 

  1. P 1, L 40: please remove the word “incredible”
  2. P 2, L 44: in the caption I would also specify that you’re showing examples for 3 types of brain tumors.
  3. P 2, L 47: please replace “great” with “higher”
  4. P 2, L 52, L 54: are the dimensions in pixels?
  5. P 2, L 71-73: please rephrase/remove the sentence about CNN automatically identifying characteristics and learning “on its own”, this is unclear.
  6. P 3 L 84-85: this sentence is a bit unclear, please rephrase
  7. P 3 L 105-109: The sentence seems out of place. Why are we discussing Softmax now? Reference is also missing. Is the last sentence about softmax too?
  8. P 3 L 110-111: this sentence is confusing, you need to clarify which tasks you refer to
  9. P 3 L 128: the subject of the sentence is missing, probably “we preprocess” or “one needs to preprocess”
  10. P 4 L 141-143: the last contribution seems a repetition of the previous ones
  11. P 4 L 161: please rephrase “analyzing numerous images”, this is too informal and unclear
  12. P 5 L 190: replace consists with consisting
  13. P 5 L 190-198: The description of this work is very confusing. Which mismatch? What do you mean by “unique requirements of medical imaging datasets”? I believe the use of “we” is a typo. Please consider rephrasing the entire paragraph.
  14. P 5 L 229: please avoid terms such as “complicated” and “sophisticated”. Moreover, this sentence doesn’t say anything about model performance/limitations, it just repeats what was already stated.
  15. P 5 L 229: replace for with to
  16. P 6 L 239-241: this is a repetition of what was already stated multiple times
  17. P 7 L 291-293: verb is missing
  18. P 7 L 295: in the figure you mention “transformation” in data processing, but this is not described in the text. Can you please clarify/modify the figure?
  19. P 7 L 299: It would be good to specify which method was used for resizing.
  20. P 9 L 365: please replace erected
  21. P 9 L 365-367: this sentence is not clear, please rephrase.
  22. P 9 L 385-387: the definition of support set and query set is a bit unclear
  23. P 10 L 396: please correct verb
  24. P 10 L 409: I think a “for” is missing, the sentence is unclear
  25. P 11 L 438: please remove reputable
  26. P 13-14: paragraph 4.3 should be removed
  27. Fig. 8 should be removed because it presents the same results as Table 4; it does not add information.
  28. P 16 L 588: please remove sophisticated
  29. P 16 L 592: please replace resilience
  30. There are many sentences that are missing a reference, especially for the mentioned datasets. I will list the lines here:
    1. P 3 L 81
    2. P 3 L 114
    3. P 4 L 158-159
    4. P 4 L 177-178
    5. P 5 L 182
    6. P 5 L 219
    7. P 6 L 243-245
    8. P 6 L 248
    9. P 6 L 258
    10. P 6 L 265
    11. P 6 L 270-271
    12. P 6 L 279-280
    13. P 6 L 283
    14. P 12 L 471
  31. Furthermore, whenever you refer to an acronym, regardless of whether you have cited the source paper, you should always provide an explanation of what the acronym stands for. This happens many times in the manuscript.

References

[1] Gull, S., S. Akbar and H. U. Khan. "Automated detection of brain tumor through magnetic resonance images using convolutional neural network." BioMed Research International 2021 (2021): 3365043.

[2] Gull, S., S. Akbar and S. M. Naqi. "A deep learning approach for multi-stage classification of brain tumor through magnetic resonance images." International Journal of Imaging Systems Technology 33 (2023): 1745-66.

 

Comments for author File: Comments.pdf

Comments on the Quality of English Language

Some sentences require improvement, as there are errors with verbs and missing subjects, which can be easily resolved. Some sentences need to be rephrased to improve clarity. 

Author Response

  • The comments of Reviewer 1 have been highlighted in yellow color.
  • P stands for page/pages, L stands for line/lines. 

Major comments:

  1. My main concern with the paper is the claim of “outperforming conventional methods” (P. 1 L 19) when the model is never compared to conventional methods. The abstract and the results section are very enthusiastic, but the actual results do not reflect this. The authors should compare the results with what they can obtain with their previously proposed approaches [1,2] (on the same test set).

Response: We acknowledge the concern regarding the claim of outperforming conventional methods. As noted in the results section, there is currently no existing research that applies few-shot meta-learning specifically for brain tumor classification using MRI data. Therefore, a direct comparison with conventional methods on the same task and dataset was not feasible. To address this, we have revised the results section to clearly state this limitation. However, we have included a comparison of our proposed approach with our other developed models in Table 3, as well as with our previously proposed deep learning approaches using the same MRI dataset in Table 4 to provide a meaningful performance benchmark (P. 16, L 594-596).

  2. The pipeline is not clear, both from the text and the presented figures (i.e. fig 2). The ViT extracts features, but then you use images as inputs for the Siamese network; where do you use the features? Moreover, why is the data split done after the feature extraction step?

Response: We have revised both figures (Fig. 2 and Fig. 4) and the corresponding explanation in the manuscript to clarify that the data is first split into support and query sets, and then the ViT is used to extract features. The Siamese network receives these feature embeddings, not raw images, for similarity computation in the few-shot setting. (P. 7, L 295)
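
To make the described data flow concrete, the following is a minimal sketch of similarity-based classification on feature embeddings. It is not the authors' implementation: the three-dimensional vectors are hypothetical stand-ins for ViT features, and cosine similarity is an assumed similarity measure, since this record does not state which one the Siamese network uses.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def classify_query(query_embedding, support_embeddings):
    """Return (label, score) of the support embedding most similar to the query.

    support_embeddings is a list of (label, embedding) pairs.
    """
    best_label, best_score = None, -2.0  # cosine similarity is never below -1
    for label, embedding in support_embeddings:
        score = cosine_similarity(query_embedding, embedding)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score

# Hypothetical embeddings: one support example per tumor class (1-shot).
support = [
    ("glioma", [1.0, 0.1, 0.0]),
    ("meningioma", [0.0, 1.0, 0.1]),
    ("pituitary", [0.1, 0.0, 1.0]),
]

# The query is compared against embeddings only; raw images never appear here.
label, score = classify_query([0.9, 0.2, 0.1], support)
print(label, round(score, 3))
```

The point of the sketch is the ordering: the split into support and query sets happens first, and only the embeddings of each set reach the comparison step.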

  3. P 5 L 199-202: a reference is missing here. Moreover, the authors should address the limitations specific to tumor detection from MRI.

Response: We have added a relevant reference to support the statement. Additionally, we have included a brief discussion on the limitations specific to tumor detection from MRI, such as low contrast between tumor and normal tissues, variations in tumor shape and size, and the presence of noise and artifacts in MR images. (P. 5, L 187-199)

  4. P 11 L 436: it is not clear if the images in your dataset are all from different patients or some of them come from the same patient. Moreover, there are no details about the test set. It would be good to further explain that the query set was not split into training samples.

Response: The MRI dataset used includes images from multiple patients, and we ensured that the support and query sets contain non-overlapping samples. The query set was strictly used for evaluation and not included in training. We have clarified this and added details about the test set and few-shot (1, 5, 10 shots) settings in the 4.1 Dataset section. (P. 11, L 462-472)
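
A minimal sketch of how non-overlapping support and query sets could be drawn for the 1-, 5-, and 10-shot settings is given below. The function name `sample_episode`, the image identifiers, and the query size of 5 per class are all hypothetical; the actual dataset layout and sampling code are not given in this record. Note that the sketch separates individual images only. It does not enforce a patient-level split, which would additionally require patient identifiers.

```python
import random

def sample_episode(images_by_class, k_shot, n_query, seed=None):
    """Draw one few-shot episode with disjoint support and query sets.

    images_by_class maps a class label to a list of image identifiers.
    """
    rng = random.Random(seed)
    support, query = [], []
    for label, images in images_by_class.items():
        if len(images) < k_shot + n_query:
            raise ValueError(f"not enough images for class {label!r}")
        # Sampling without replacement guarantees no image lands in both sets.
        chosen = rng.sample(images, k_shot + n_query)
        support.extend((img, label) for img in chosen[:k_shot])
        query.extend((img, label) for img in chosen[k_shot:])
    return support, query

# Hypothetical image identifiers, 30 per class.
dataset = {
    name: [f"{name}_{i:03d}" for i in range(30)]
    for name in ("glioma", "meningioma", "pituitary")
}

for k in (1, 5, 10):  # the shot settings mentioned in the response
    support, query = sample_episode(dataset, k_shot=k, n_query=5, seed=0)
    overlap = {img for img, _ in support} & {img for img, _ in query}
    print(k, len(support), len(query), len(overlap))
```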

  5. P 12 L 496-502: what is the novel set? This entire paragraph is confusing. Please explain what data you are referring to and for what you use it. This section is about the training details, therefore, if you are talking about test set the information should go somewhere else. Also, I was expecting different tumor classes for the meta-testing phase. You mention “novel classes” but don’t specify which ones.

Response: We have revised the paragraph to clearly state that the "novel set" refers to unseen samples from the same tumor classes (glioma, meningioma, pituitary) used during meta-testing. No new tumor classes were introduced in the testing phase. We have moved the test-related information to the 4.1 Dataset section to maintain consistency with the structure of the manuscript. (P. 11, L 462-472)

  6. P 14 L 546: more details about “another model” are needed.

Response: We have clarified in the manuscript that “another model” refers to our other proposed models included for comparison.

  7. P 14 L 550-556: are these models trained by the authors or someone else? Details about architecture, training set, training strategies are needed.

Response: The models mentioned in Lines 550–556 were implemented and trained by us. We have added details in the revised manuscript regarding their architecture, training datasets, and training strategies. (P. 14, L 553-569)

  8. P 15 L 568-571 + Table 5: this comparison should be removed. The authors cannot state that their model is better than another model trained and tested on data from a different medical domain. This is not a fair comparison.

Response: We have removed the comparison in Table 5 and the related discussion. To the best of our knowledge, there are no existing studies on brain tumor classification using MRI within a few-shot meta-learning framework. As a result, direct comparisons with prior methods in this exact setting are not available. However, we added the results of previously developed deep learning models on the same dataset with few-shot settings in Table 4. (P. 15 & 16, L 580 & 596)

  9. Please clarify which test set was used to produce Tables 3 and 4.

Response: Tables 3 and 4 present results based on the query set, which serves as the test set in our few-shot learning framework. This test set consists of unseen MRI images from the same tumor classes used during meta-testing, ensuring a fair evaluation while remaining within the same medical imaging domain. (P. 11, L 462-472)

Minor comments:

  1. P 1, L 40: please remove the word “incredible”

Response: We have removed the word “incredible”. (P. 1, L 41)

  2. P 2, L 44: in the caption I would also specify that you’re showing examples for 3 types of brain tumors.

Response: We have updated the caption of Figure 1 to specify that the images illustrate examples of three types of brain tumors, providing clearer context for the reader. (P. 2, L 46)

  3. P 2, L 47: please replace “great” with “higher”

Response: We have replaced the word “great” with “higher”. (P. 2, L 49-50)

  4. P 2, L 52, L 54: are the dimensions in pixels?

Response: Yes, the dimensions mentioned refer to pixel values. We have revised the text to explicitly state that the dimensions are in pixels to avoid any ambiguity. (P. 2, L 55 & 56)

  5. P 2, L 71-73: please rephrase/remove the sentence about CNN automatically identifying characteristics and learning “on its own”, this is unclear.

Response: We have removed the sentence to eliminate ambiguity and improve clarity. (P. 2, L 73)

  6. P 3 L 84-85: this sentence is a bit unclear, please rephrase

Response: We have rephrased the sentence to enhance clarity. (P. 2, L 84)

  7. P 3 L 105-109: The sentence seems out of place. Why are we discussing Softmax now? Reference is also missing. Is the last sentence about softmax too?

Response: We have removed the sentence to maintain coherence. (P. 3, L 106)

  8. P 3 L 110-111: this sentence is confusing, you need to clarify which tasks you refer to

Response: We have revised the sentence to clarify that the “tasks” refer to individual tumor classification problems. (P. 3, L 107)

  9. P 3 L 128: the subject of the sentence is missing, probably “we preprocess” or “one needs to preprocess”

Response: The sentence has been revised for clarity by adding a subject. “The MRI images are first preprocessed, followed by feature extraction using a Vision Transformer.” (P. 3, L 124)

  10. P 4 L 141-143: the last contribution seems a repetition of the previous ones

Response: We have revised the last contribution to remove redundancy and ensure each listed contribution is distinct and meaningful. (P.4, L 137)

  11. P 4 L 161: please rephrase “analyzing numerous images”, this is too informal and unclear

Response: We have rephrased “analyzing numerous images” to improve clarity and formality. (P 4, L 157)

  12. P 5 L 190: replace consists with consisting.

Response: We have corrected the grammatical issue by replacing “consists” with “consisting”. (P 5, L186)

  13. P 5 L 190-198: The description of this work is very confusing. Which mismatch? What do you mean by “unique requirements of medical imaging datasets”? I believe the use of “we” is a typo. Please consider rephrasing the entire paragraph.

Response: We have rephrased it to remove ambiguous terms such as “mismatch” and “unique requirements,” and have eliminated the use of “we”. The revised version provides a clearer explanation of the dataset and the ensemble strategies. (P.5, L 187-199)

  14. P 5 L 229: please avoid terms such as “complicated” and “sophisticated”. Moreover, this sentence doesn’t say anything about model performance/limitations, it just repeats what was already stated.

Response: We have revised the sentence to remove subjective terms such as “complicated” and “sophisticated.” The revised version avoids repetition and enhances clarity. (P. 5, L 225)

  15. P 5 L 229: replace for with to

Response: We have corrected the grammatical error in Line 229 by replacing “for” with “to” for improved accuracy and readability. (P. 5, L 227)

  16. P 6 L 239-241: this is a repetition of what was already stated multiple times

Response: We have removed repetitive sentences to improve clarity. (P. 6, L 238)

  17. P 7 L 291-293: verb is missing

Response: We have corrected the sentence by adding the appropriate verb to ensure grammatical accuracy and clarity. (P. 7, L 290)

  18. P 7 L 295: in the figure you mention “transformation” in data processing, but this is not described in the text. Can you please clarify/modify the figure?

Response: The term “transformation” in the figure refers to the standard preprocessing steps applied to the MRI images before feeding them into the model. Specifically, it fixes input dimensions compatible with the vision transformer architecture and converts the image data into tensors. We have updated the methodology section to clearly describe these transformation processes to maintain consistency with the figure. (P. 7, L 287-292)

  19. P 7 L 299: It would be good to specify which method was used for resizing.

Response: We have updated the manuscript to specify that image resizing was performed using bilinear interpolation to match the input size requirements of the Vision Transformer model. (P. 7, L 298)
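
For readers unfamiliar with the method named here, the following is a minimal pure-Python sketch of bilinear interpolation on a small grayscale image. It assumes an "align corners" coordinate mapping; imaging libraries may use a different convention, and the resizing routine and target size actually used in the manuscript are not stated in this record.

```python
def resize_bilinear(image, out_h, out_w):
    """Resize a 2D grayscale image (list of rows) with bilinear interpolation."""
    in_h, in_w = len(image), len(image[0])
    resized = []
    for i in range(out_h):
        # Map the output row to a fractional input row (corners stay aligned).
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = image[y0][x0] * (1 - wx) + image[y0][x1] * wx
            bottom = image[y1][x0] * (1 - wx) + image[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        resized.append(row)
    return resized

# Toy 2x2 "image" upsampled to 3x3; each new pixel is a weighted average
# of its nearest original neighbours.
small = [[0.0, 10.0], [20.0, 30.0]]
for row in resize_bilinear(small, 3, 3):
    print(row)
```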

  20. P 9 L 365: please replace erected

Response: We have replaced the word “erected” with “designed” to improve clarity and appropriateness of language. (P. 9, L 367-369)

  21. P 9 L 365-367: this sentence is not clear, please rephrase.

Response: We have rephrased the sentence to improve clarity and ensure that the relationship between the Siamese network and the baseline model is clearly explained. (P. 9, L 367-369)

  22. P 9 L 385-387: the definition of support set and query set is a bit unclear

Response: We have revised the definition of the support and query sets to provide a clearer and more accurate explanation of their roles in the few-shot learning process. (P. 9, L 387-389)

  23. P 10 L 396: please correct verb

Response: We have corrected the verb usage to ensure proper grammatical structure and clarity. (P. 10, L 398-400)

  24. P 10 L 409: I think a “for” is missing, the sentence is unclear

Response: We have revised the sentence by adding the missing preposition “for” to improve grammatical clarity and readability. (P 10, L 410-413)

  25. P 11 L 438: please remove reputable

Response: We have removed the word “reputable” and revised the sentence for improved clarity. (P 11, L 438)

  26. P 13-14: paragraph 4.3 should be removed

Response: The content of paragraph 4.3 was inadvertently duplicated. We have removed the repeated paragraph to ensure clarity and eliminate redundancy. (P 13)

  27. Fig. 8 should be removed because it presents the same results as Table 4; it does not add information.

Response: We agree that Figure 8 duplicates the information presented in Table 4 without providing additional insights. Accordingly, we have removed Figure 8 from the manuscript.

  28. P 16 L 588: please remove sophisticated

Response: We have removed the word “sophisticated” and revised the sentence. (P 16, L 601)

  29. P 16 L 592: please replace resilience

Response: We have replaced the word “resilience” with “robustness” in Line 592 to improve clarity. (P 16, L 605)

  30. There are many sentences that are missing a reference, especially for the mentioned datasets. I will list the lines here:

  • P 3 L 81
  • P 3 L 114
  • P 4 L 158-159
  • P 4 L 177-178
  • P 5 L 182
  • P 5 L 219
  • P 6 L 243-245
  • P 6 L 248
  • P 6 L 258
  • P 6 L 265
  • P 6 L 270-271
  • P 6 L 279-280
  • P 6 L 283
  • P 12 L 471

Response: We have now included the references and highlighted them in the paper to ensure proper attribution and enhance clarity.

  • P 3, L 78
  • P 3, L 82
  • P 3, L 108
  • P 3, L 116
  • P 4, L 153
  • P 4, L 158
  • P 4, L 168
  • P 4, L 173-175
  • P 5, L 193
  • P 5, L 212
  • P 6, L 238
  • P 6, L 240-242
  • P 6, L 253
  • P 6, L 259
  • P 15, L 581

  31. Furthermore, whenever you refer to an acronym, regardless of whether you have cited the source paper, you should always provide an explanation of what the acronym stands for. This happens many times in the manuscript.

Response: We have carefully reviewed the manuscript and ensured that all acronyms are now defined upon their first mention, regardless of whether the source paper is cited.

References

[1] Gull, S., S. Akbar and H. U. Khan. "Automated detection of brain tumor through magnetic resonance images using convolutional neural network." BioMed Research International 2021 (2021): 3365043.

[2] Gull, S., S. Akbar and S. M. Naqi. "A deep learning approach for multi-stage classification of brain tumor through magnetic resonance images." International Journal of Imaging Systems Technology 33 (2023): 1745-66. 

 

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The authors propose a novel meta-learning approach for predicting brain tumors from MRI images, utilizing a vision transformer as the feature extractor and a metric-based Siamese neural network for few-shot learning. The methodology addresses the challenge of limited data availability and demonstrates superior model prediction accuracy compared to established methods. Overall, the manuscript is well-structured and clearly written.

Major Concerns:

1. Section 4.4 outlines several performance metrics, including accuracy and specificity, along with definitions for TP, TN, etc. However, there is a lack of discussion regarding the appropriateness of these metrics for evaluating model performance in this specific application. While accuracy is the sole metric employed for comparison with other methods in Tables 4 and 5, a justification for its selection is necessary. Additionally, since accuracy is the sole metric for comparison, the assertions made in Line 19 and Line 573 may not be completely valid.

2. Table 5 presents a comparison between the proposed method and a previous study focused on chest X-ray classification. To establish a fair comparison, it is essential to provide basic information about the previous work, including the specific classification problem and data availability.

3. While the results are adequately presented in Section 4.5, the manuscript would benefit from a more comprehensive discussion that explains why the proposed approach excels and how it surpasses state-of-the-art methods. Additionally, an analysis of potential synergies between the components of the proposed framework—specifically the vision transformer and Siamese networks—would enrich the reader's understanding and foster the generation of new ideas.

Minor Comments:

1. The abbreviation RViT in Line 221 needs clarification; it appears to stand for "rotation invariant vision transformer" and should be defined earlier in the manuscript.
2. In Equations (2) and (3), the notation LN requires specification to avoid any ambiguity.
3. There is duplication in the content presented in Section 4.2 at Line 469 and a second Section 4.3 at Line 516, which should be addressed.

Author Response

  • The comments from Reviewer 2 have been highlighted in sky-blue color.
  • P stands for page/pages, L stands for line/lines.

    Major Concerns:

    Comment 1. Section 4.4 outlines several performance metrics, including accuracy and specificity, along with definitions for TP, TN, etc. However, there is a lack of discussion regarding the appropriateness of these metrics for evaluating model performance in this specific application. While accuracy is the sole metric employed for comparison with other methods in Tables 4 and 5, a justification for its selection is necessary. Additionally, since accuracy is the sole metric for comparison, the assertions made in Line 19 and Line 573 may not be completely valid.

    Response: In response to your suggestion, we have revised the manuscript and added results for additional evaluation metrics (precision, sensitivity, specificity, and F1-score) in Table 3 to provide a more comprehensive assessment of model performance. Furthermore, the statements in Line 20 and Line 580 have been revised to ensure that the results are appropriately supported by the range of evaluation metrics presented. (P. 14 & 15, L 526 & 595)
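
The metrics named in this response follow directly from the TP, TN, FP, and FN counts defined in Section 4.4. The sketch below uses hypothetical counts for a single tumor class treated one-vs-rest; the numbers are illustrative only and are not results from the manuscript. They are chosen to show why accuracy alone can hide a weakness: accuracy is high while sensitivity is not.

```python
def binary_metrics(tp, tn, fp, fn):
    """Compute standard metrics from confusion-matrix counts for one class."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # also called recall
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }

# Hypothetical, imbalanced counts: 10 positive images, 90 negative images.
metrics = binary_metrics(tp=5, tn=90, fp=0, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts half of the positive cases are missed, yet accuracy stays at 0.95, which is the kind of gap the additional metrics are meant to expose.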

    Comment 2. Table 5 presents a comparison between the proposed method and a previous study focused on chest X-ray classification. To establish a fair comparison, it is essential to provide basic information about the previous work, including the specific classification problem and data availability.

    Response: We have updated the manuscript to include a brief description of the previous study including the classification task and data domain (chest X-ray images) to provide context for the comparison. However, it is important to note that there are currently no existing studies on brain tumor classification using MRI within a few-shot meta-learning framework. As a result, a direct comparison with prior methods in this exact setting is not available. The included comparison is intended to provide a general performance perspective within the broader domain of few-shot medical image classification. (P. 15 & 16, L 570 & 596)

    Comment 3. While the results are adequately presented in Section 4.5, the manuscript would benefit from a more comprehensive discussion that explains why the proposed approach excels and how it surpasses state-of-the-art methods. Additionally, an analysis of potential synergies between the components of the proposed framework, specifically the vision transformer and Siamese networks, would enrich the reader's understanding and foster the generation of new ideas.

    Response: We have enriched the analysis in Section 4.5 to provide a deeper explanation of why the proposed approach performs well, emphasizing the complementary strengths of the Vision Transformer and Siamese network. Specifically, we highlight that the ViT effectively captures global contextual features from limited data, while the Siamese architecture enhances discriminative comparison in few-shot settings. This synergy contributes to improved generalization on unseen classes. We hope this work will encourage further exploration and development in this research direction. (P 14, L 543 & 570)

    Minor Comments:

    Comment 1. The abbreviation RViT in Line 221 needs clarification; it appears to stand for "rotation invariant vision transformer" and should be defined earlier in the manuscript.

    Response: We have clarified the abbreviation RViT as "Rotation-Invariant Vision Transformer" and defined it at its first occurrence in the manuscript to improve readability and avoid confusion. (P 5, L 219)

    Comment 2. In Equations (2) and (3), the notation LN requires specification to avoid any ambiguity.

    Response: We have updated the manuscript to clarify that LN refers to Layer Normalization, a standard operation used in Transformer architectures to stabilize and accelerate training by normalizing inputs across the feature dimension. This clarification has been added near Equations (2) and (3) to eliminate any ambiguity. (P. 8, L 338)
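
As a brief illustration of the operation described, the sketch below applies Layer Normalization to a single feature vector in its standard form: subtract the mean, divide by the standard deviation over the feature dimension, then apply a learnable scale and shift. The parameter names `gamma`, `beta`, and `eps` are conventional choices assumed here; the exact notation of Equations (2) and (3) is not reproduced in this record.

```python
import math

def layer_norm(x, gamma=None, beta=None, eps=1e-5):
    """Normalize one feature vector across its feature dimension."""
    n = len(x)
    if gamma is None:
        gamma = [1.0] * n  # identity scale
    if beta is None:
        beta = [0.0] * n   # zero shift
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n  # biased variance, as in LN
    inv_std = 1.0 / math.sqrt(var + eps)       # eps avoids division by zero
    return [g * (v - mean) * inv_std + b for v, g, b in zip(x, gamma, beta)]

# A hypothetical 4-dimensional token embedding.
normalized = layer_norm([1.0, 2.0, 3.0, 4.0])
print([round(v, 4) for v in normalized])
```

After normalization the vector has approximately zero mean and unit variance, regardless of the scale of the input.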

     

    Comment 3. There is duplication in the content presented in Section 4.2 at Line 469 and a second Section 4.3 at Line 516, which should be addressed.

    Response: We have corrected the duplication in Section 4.2 and resolved the erroneous repetition of Section 4.3. The section headings and content flow have been revised accordingly to ensure clarity and consistency throughout the manuscript. (P. 13)

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Dear Authors,

Thank you for addressing all the comments and adding more details to the manuscript. This revised version is much clearer and more comprehensive. I truly appreciate your efforts in improving the manuscript.

I still have some minor comments for you:

  1. In my major comment number 3, I asked you to add a reference and some comments about challenges in detecting tumors from MRI. In the reply you mention “tissues, variations in tumor shape and size, and the presence of noise and artifacts in MR images”. However, I don’t see this in the manuscript. The reference is also missing.
  2. P 12 L 508: here you still mention the “novel classes”, even if you clarified that you’re not introducing novel tumor classes. Please correct this sentence too.
  3. Thank you for adding more information about the other models you investigated in your work. However, I think the result section would be clearer if you first mention that you also trained other models and then discuss and compare the results. Please rearrange.
  4. P 2 L 73-74: please check this sentence, the verb seems incorrect
  5. P 14 L 530: please replace indistinct with misleading
  6. P 4 L 137-139: this seems to be a rephrasing of the previous version, you’re still saying that the model works well when limited data is available, which is already mentioned in the previous point. Instead, as a third contribution, you could highlight the comparative analysis you performed with the other models.
  7. P 7 L 306-309: I would not consider this as “transformation” but more as a data loading or formatting step. Moreover, I would remove the last sentence about consistency: converting images into tensors does not ensure consistency, it ensures that the data is in the correct format for training the model.
  8. P 15 L 570-579: if you want to keep this paragraph, I suggest moving it to the end of the Results section, as it represents the least meaningful comparison. I would also remove the sentence in lines 577-579. As said in my previous review, this is not a fair comparison. Moreover, the sentence in lines 574-575 is incomplete.
  9. P 16 L 598: I suppose the first element in the first row of the table was something like “model” and not the name of the model. Please correct.

Comments for author File: Comments.docx

Comments on the Quality of English Language

Some sentences need to be rephrased for clarity, please see the provided comments. 

Author Response

  • The comments from Reviewer 1 have been highlighted in yellow.
  • Our responses are written below each comment in black text, starting with “Response:”.
  • All changes made in the revised manuscript are highlighted.

Reviewer 1:

  1. In my major comment number 3, I asked you to add a reference and some comments about challenges in detecting tumors from MRI. In the reply you mention “tissues, variations in tumor shape and size, and the presence of noise and artifacts in MR images”. However, I don’t see this in the manuscript. The reference is also missing.

Response: We sincerely apologize for the oversight. In the revised manuscript, we have now explicitly included the challenges mentioned in the paragraph discussing the limitations of the model in the context of tumor classification from MRI. Specifically, we have described issues such as low contrast between normal and abnormal tissues, variations in tumor shape, size, and location, as well as noise and imaging artifacts. Additionally, we have cited relevant studies that highlight these challenges. (P. 5 L. 197-202)

  2. P 12 L 508: here you still mention the “novel classes”, even if you clarified that you’re not introducing novel tumor classes. Please correct this sentence too.

Response: Thank you for your comment. We agree that the use of the term “novel classes” was misleading given that we are not introducing new tumor categories. To clarify, we have revised the sentence to better reflect the intent of our study, which is to evaluate the generalization capability of the model to unseen data. (P. 12 L. 508-509)

  3. Thank you for adding more information about the other models you investigated in your work. However, I think the result section would be clearer if you first mention that you also trained other models and then discuss and compare the results. Please rearrange.

Response: Thank you for your valuable suggestion. The Results section has been revised to first introduce the additional models trained in this study, followed by a clear discussion and comparison of their performances. This restructuring aims to improve the logical flow and clarity of the section, as recommended. (P. 14-15 L 536-590)

  4. P 2 L 73-74: please check this sentence, the verb seems incorrect

Response: Thank you for pointing this out. We have corrected the grammatical issues in the sentence by adjusting the verb structure. (P. 2 L. 73-74)

  5. P 14 L 530: please replace indistinct with misleading

Response: Thank you for your suggestion. We agree that “misleading” is a more appropriate word choice in this context. The sentence has been updated accordingly. (P. 14 L. 530-531)

  6. P 4 L 137-139: this seems to be a rephrasing of the previous version, you’re still saying that the model works well when limited data is available, which is already mentioned in the previous point. Instead, as a third contribution, you could highlight the comparative analysis you performed with the other models.

Response: Thank you for your insightful suggestion. We agree that the third contribution overlapped with earlier points. To address this, we have revised the third contribution to emphasize the comprehensive comparative analysis conducted with other models. (P. 4 L 136-138)

  7. P 7 L 306-309: I would not consider this a “transformation” but rather a data loading or formatting step. Moreover, I would remove the last sentence about consistency: converting images into tensors does not ensure consistency; it ensures that the data is in the correct format for training the model.

Response: Thank you for your valuable feedback. We have revised the text to more accurately reflect that the process involves data formatting rather than transformation. Additionally, we have removed the sentence suggesting that tensor conversion ensures consistency, as suggested. (P. 7 L. 307-310)

  8. P 15 L 570-579: if you want to keep this paragraph, I suggest moving it to the end of the Results section, as it represents the least meaningful comparison. I would also remove the sentence in lines 577-579. As said in my previous review, this is not a fair comparison. Moreover, the sentence in lines 574-575 is incomplete.

Response: The paragraph has been revised and relocated to the end of the Results section to reflect its lower comparative significance. The sentence in lines 577–579 has been removed as recommended, in acknowledgment of the unfair nature of the comparison. Additionally, the incomplete sentence has been revised for clarity and completeness. (P. 15 L 586-590).

  9. P 16 L 598: I suppose the first element in the first row of the table was something like “model” and not the name of the model. Please correct.

Response: Thank you for pointing this out. The first element in the first row of the table has been corrected to “Existing Method” to accurately represent the column header. (P. 16 L. 595)

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

Thank you for the revision and responses.

Author Response

We sincerely thank the reviewer for the kind acknowledgement and for taking the time to review our revised manuscript. As there were no additional comments in this round, we understand that the current version meets the expectations, and we truly appreciate your valuable feedback throughout the review process.
