Open Access Article

Peer-Review Record

Enhanced Segmentation of Glioma Subregions via Modality-Aware Encoding and Channel-Wise Attention in Multimodal MRI

Appl. Sci. 2025, 15(14), 8061; https://doi.org/10.3390/app15148061

by Annachiara Cariola †, Elena Sibilano †, Antonio Brunetti *, Domenico Buongiorno, Andrea Guerriero and Vitoantonio Bevilacqua

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Reviewer 3: Anonymous

Reviewer 4: Chen Zhao

Submission received: 23 June 2025 / Revised: 15 July 2025 / Accepted: 18 July 2025 / Published: 20 July 2025

(This article belongs to the Special Issue The Role of Artificial Intelligence Technologies in Health)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This manuscript introduces a novel Deep Learning architecture leveraging modality-specific encoding and attention-based refinement for the segmentation of glioma subregions, including peritumoral edema (ED), necrotic core (NCR), and enhancing tissue (ET). The model is validated on the BraTS 2023 dataset and outperforms the lightweight SegFormer3D baseline across all metrics and tumor subregions.

The abstract is short yet comprehensive. The problem is introduced: effective feature extraction of key tumor subregions in adult gliomas from Magnetic Resonance Imaging (MRI). The novelty is specified: modality-specific encoding and channel-wise attention. Quantitative results (Dice scores) are reported and statistical significance is mentioned. A brief comparison with a baseline (SegFormer3D) is provided.

In the Introduction section, glioma classification (WHO 2021), MRI modalities, and the clinical importance of ED, NCR, and ET are accurately presented. The authors thoroughly describe recent advances in feature extraction from brain MRI using deep learning techniques, as well as the latest literature on novel feature extraction frameworks based on CNNs and Vision Transformers. The Introduction ends with a very good description of the authors' framework and methodology for the fine-grained segmentation of glioma subregions (ET, NCR, and ED) from preoperative MRI scans, which combines CNNs with attention-based refinement of the kind found in Vision Transformers.

The section Related Work is thorough and well-structured. It gives a chronological and thematic overview of brain tumor segmentation models, mainly from the BraTS challenges.

In the Materials and Methods section, the datasets and the preprocessing methodology are explained very comprehensively. Further, a well-structured explanation of the multi-encoder architecture, with its SE blocks, residual design, and decoder mechanics, is given. The authors make good use of the BraTS 2023 dataset and justify their focus on individual tumor subregions (ET, ED, NCR) rather than composite regions. The preprocessing pipeline, which uses Otsu thresholding, intensity normalization, and MONAI-based augmentations, is robust and reproducible. The training setup is modern and sound, employing 5-fold cross-validation, AdamW optimization, cosine learning rate scheduling, and early stopping, all executed on high-performance hardware. Evaluation metrics are standard and appropriate, though their rationale could be slightly expanded.

The Results section is compelling and well presented, and it strongly supports the efficacy of the proposed method for fine-grained glioma segmentation.

The Discussion section effectively summarizes the study's core contributions. It is balanced, insightful, and aligns well with the study's objectives. In it, the authors compare the proposed architecture with the well-established SegFormer3D framework and justify their Dice scores with relevant research from the literature.

In the Conclusions, the authors summarize their findings, discuss possible areas of application, and outline their future plans for this work.

However, in Related Work, I think that CNNs and ViTs are mixed throughout, with no clear break between "classic CNN-based", "hybrid", and "transformer" methods. Perhaps a clearer division of these works is needed.

In Materials and Methods, Figure 2 effectively illustrates the overall structure of the proposed multi-encoder model, highlighting the use of modality-specific encoding and a shared decoding path. However, its explanatory value could be greatly enhanced by (1) clearer labeling of operations and data dimensions, (2) better depiction of the encoder-to-decoder fusion mechanism, and (3) a visual key or legend. Adding channel size annotations and modality-specific colors would also improve readability and support reproducibility. Also, regarding the description of the deep learning framework: the batch size is not specified, the weight decay value (used by AdamW) is not mentioned, and a dropout rate is given only for the SE blocks, with no discussion of a global dropout rate.

I also think that the Evaluation Metrics section lacks references. The chosen metrics should be justified with additional references, so that their choice is scientifically objective.

Validation on an external dataset (the BraTS test set, TCIA, or private medical datasets from a hospital) is needed to confirm generalizability across centers and patient populations. Perhaps the authors should discuss this further in the Discussion section, as future work.

The manuscript is written in good English.

Figures and tables are well organized and labeled.

References are modern and relevant to the presented work.

In general, the paper is innovative and scientifically sound, but it needs further improvement.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The manuscript under consideration is devoted to glioma segmentation using multimodal MRI and a multi-encoder architecture. The topic of the manuscript is relevant. All types of diffuse gliomas are highly dangerous. Accurate diagnosis of brain tumors is of great importance for treatment planning.

I really liked the manuscript, but I will recommend it for publication only after major revision.

A few introductory words:

Subsection “2. Related Work” is quite informative. The authors described almost all previous studies in this area.

Finally, at the end of this subsection, the authors formulated the purpose of their study (lines 175-177):

“… the aim of this work is to design and validate a novel DL architecture for accurate segmentation of ET, NCR, and ED regions, while maintaining a moderate computational complexity.”

A few lines above, the authors also formulated the goal of the work (lines 70-73):

“… the aim of this study is to perform accurate and fine-grained segmentation of key subregions in adult gliomas, i.e., ET, NCR and ED, by employing a novel multi-encoder architecture which leverages a channel-wise attention mechanism while maintaining low computational complexity.”

Main remarks:

I ask the authors to supplement the following subsections with explanations of why the proposed multi-encoder architecture is effective. What are its advantages?

I ask the authors to highlight all other novelties that distinguish their research from previously published work.

The authors should explain how they achieved a moderate computational complexity. What is the trade-off between accuracy and computational cost?

Roughly speaking, all previous works more or less coped with the task of segmentation. Why is this work needed? Please underline this in the text.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

Please refer to the attached file for detailed information.

Comments for author File: Comments.pdf

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

The manuscript presents a promising approach to glioma segmentation with a focus on fine-grained details, but it requires a more robust statistical analysis to support its claims of improvement over existing methods. The following issues deserve attention.

Figure 1 is not particularly clear. It is recommended to optimize the layout to highlight the overall structure of the method proposed in this paper. Specific details can be described in additional subdiagrams.
Original description (p. 18): "This has led to the development of automatic segmentation approaches, mainly based on Deep Learning (DL)..." Suggestion: The authors should clarify what specific aspects of DL methods were lacking in previous studies and how their approach fills those gaps. A more thorough literature review that directly relates their work to prior studies could enhance the manuscript's depth.
Original description (p. 11): "The statistically significant improvements obtained on all regions highlight the effectiveness of integrating complementary modality-specific information and applying channel-wise feature recalibration in the proposed model." Suggestion: While the authors claim statistically significant improvements, the methodology section lacks detailed statistical analyses or robustness checks to substantiate these claims. Including more specific statistical tests and their results would strengthen the validity of their findings.
It is suggested that the evaluation metric combine both precision (P) and recall (R), i.e., the F-measure.
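
As an illustrative note on the last point: for a single binary mask, the equally weighted F-measure (F1) is algebraically identical to the Dice coefficient, since both reduce to 2TP / (2TP + FP + FN). The short sketch below checks this on a toy example. The masks are invented for illustration, the function names are hypothetical, and nothing here is taken from the authors' code or evaluation pipeline.

```python
# Toy check that F1 and Dice coincide for binary masks.
# All names and data below are illustrative assumptions, not the authors' code.

def confusion_counts(pred, truth):
    """Return (TP, FP, FN) for two flat binary masks of equal length."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    return tp, fp, fn

def precision_recall_f1(pred, truth):
    """Precision, recall, and their harmonic mean (F1)."""
    tp, fp, fn = confusion_counts(pred, truth)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

def dice(pred, truth):
    """Dice coefficient: 2*TP / (2*TP + FP + FN).

    Returning 0.0 when both masks are empty is an assumed convention;
    some toolkits return 1.0 or NaN in that case.
    """
    tp, fp, fn = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Flattened toy masks (1 = voxel labelled as the subregion of interest).
pred = [1, 1, 1, 0, 0, 1, 0, 0]
truth = [1, 1, 0, 0, 1, 1, 0, 0]

precision, recall, f1 = precision_recall_f1(pred, truth)
dice_score = dice(pred, truth)
print(precision, recall, f1, dice_score)
```

The identity holds per mask; once scores are averaged across cases or classes, a reported mean Dice and a pooled F-measure can differ, so precision and recall may still add information when reported separately.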

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

The authors took into account all my comments. The manuscript has been sufficiently improved to be published in Applied Sciences.

Reviewer 3 Report

Comments and Suggestions for Authors

I have nothing to add.

Reviewer 4 Report

Comments and Suggestions for Authors

The authors have addressed the previous suggestions. The manuscript is suggested for acceptance in its present form.
