Peer-Review Record

Deep Multi-Instance Conv-Transformer Frameworks for Landmark-Based Brain MRI Classification

Electronics 2024, 13(5), 980; https://doi.org/10.3390/electronics13050980

by Guannan Li, Zexuan Ji and Quansen Sun *

Reviewer 1: Xiao Zhang

Reviewer 2: Francisco Calisto

Reviewer 3: Anonymous

Submission received: 12 January 2024 / Revised: 25 February 2024 / Accepted: 27 February 2024 / Published: 4 March 2024

(This article belongs to the Special Issue Research Advances in Image Processing and Computer Vision)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The paper describes a transformer-based architecture for brain MRI classification. The LD-MILCT network architecture is a model designed for image classification of 3D MRI scans. It combines local and global feature representations by using patch-level representations and a Visual Transformer layer. The model includes a multi-instance learning head (MIL head) to enhance the representation of brain structure at the image level. The performance of the model is evaluated using accuracy, sensitivity, specificity, and F-score metrics. The experiments are conducted on Autism Spectrum Disorder (ASD) and Alzheimer's Disease (AD) datasets obtained from open-access repositories.
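For readers unfamiliar with the four evaluation metrics named above, the following is a minimal sketch of how they are commonly computed for a binary task from the confusion-matrix counts. It assumes the standard definitions (with the F-score taken as the F1 score); the manuscript's own Equation (6) may differ in detail, and the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity and F1 for 0/1 labels.

    Assumption: label 1 is the patient (positive) class, 0 the control class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on positives
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f_score = 2 * precision * sensitivity / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": f_score,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the manuscript.
    print(binary_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1]))
```

Reporting sensitivity and specificity alongside accuracy matters when the patient and control groups are unbalanced, since accuracy alone can hide poor performance on the smaller group.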

Lines 180-184: The use of "standard learnable absolute positional encodings of the landmark position" needs clarification. It is not clear whether the positional encoding is based on the coordinates of each patch in the original MRI image or on simple integer indices.
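To make the two readings of this comment concrete, here is a minimal sketch of both options. It is an illustration under stated assumptions, not the authors' implementation: the function names, the embedding dimension, and the initialisation scale are all hypothetical. In the first reading, each landmark index owns one row of a table that would be trained with the network; in the second, the encoding is derived from the patch's actual voxel coordinates.

```python
import random


def index_positional_table(num_landmarks, dim, seed=0):
    """Reading 1: one vector per integer landmark index.

    In a real model these rows would be learnable parameters; here they are
    only randomly initialised (assumed small Gaussian initialisation).
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.02) for _ in range(dim)]
            for _ in range(num_landmarks)]


def coordinate_positional_input(coords, volume_shape):
    """Reading 2: encode each landmark by its voxel coordinates.

    Coordinates are scaled to [0, 1] per axis; a real model would typically
    pass them through a further learnable projection.
    """
    return [tuple(c / (s - 1) for c, s in zip(xyz, volume_shape))
            for xyz in coords]


if __name__ == "__main__":
    table = index_positional_table(num_landmarks=3, dim=4)
    print(len(table), len(table[0]))
    print(coordinate_positional_input([(0, 0, 0), (255, 255, 255)],
                                      (256, 256, 256)))
```

The practical difference is that the index-based table carries no spatial information by itself, whereas the coordinate-based variant does, which is why the manuscript should state which one is used.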

Figure 1: The figure implies that landmarks are extracted from the same positions in the testing data as in the training data. It is thus unclear whether, for the testing data, the landmarks are extracted using the proposed “Data-Driven landmark identification” method or whether the coordinates obtained from the training data are reused.

Some minor comments and suggestions:

Line 1: For brain diseases (i.e. autism spectrum disorder, ASD) → e.g. autism spectrum disorder, ASD

Lines 121-122: “To adjust for intensity inhomogeneity, each image is resampled to a resolution of 256 × 256 × 256.” To my understanding, resampling adjusts for resolution inhomogeneity, not “intensity inhomogeneity”.

Line 122: The authors should provide more details about the “N3” method; readers may not know what the N3 method is or what it is used for.

Line 223: Aggregation → aggregation

Equation (6): “F − score” should be “F-score”.

Another minor question: will the code be released on GitHub?
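The point made above about lines 121-122 can be illustrated with a small sketch. It assumes nearest-neighbour interpolation on a nested-list volume purely for simplicity; the manuscript does not state which interpolation was used, and the function name `resample_nearest` is hypothetical. Resampling changes the voxel grid, while the intensity values are copied unchanged, so any intensity inhomogeneity in the input would still be present afterwards and would need a separate correction step.

```python
def resample_nearest(volume, new_shape):
    """Nearest-neighbour resampling of a 3D nested-list volume to new_shape."""
    old_shape = (len(volume), len(volume[0]), len(volume[0][0]))

    def src(i, new_n, old_n):
        # Map an output index to the nearest input index along one axis.
        return min(old_n - 1, int((i + 0.5) * old_n / new_n))

    nx, ny, nz = new_shape
    return [[[volume[src(x, nx, old_shape[0])]
                    [src(y, ny, old_shape[1])]
                    [src(z, nz, old_shape[2])]
              for z in range(nz)]
             for y in range(ny)]
            for x in range(nx)]


if __name__ == "__main__":
    # Toy 2 x 2 x 2 volume for illustration only.
    small = [[[0, 1], [2, 3]], [[4, 5], [6, 7]]]
    big = resample_nearest(small, (4, 4, 4))
    values = {v for plane in big for row in plane for v in row}
    # The grid is now 4 x 4 x 4, but the set of intensities is unchanged.
    print(len(big), len(big[0]), len(big[0][0]), sorted(values))
```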

Comments on the Quality of English Language

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The manuscript offers a novel approach with its LD-MILCT framework, targeting computational challenges in medical imaging for conditions like Alzheimer's and autism spectrum disorders. This innovative application of artificial intelligence in medical imaging could significantly advance the field. However, to fully harness its potential, certain improvements are recommended.

An expanded literature review incorporating a broader range of recent AI and medical imaging studies would provide comprehensive context and better situate the research within the current scientific landscape (10.1038/s41585-023-00796-1, 10.1109/ISBI53787.2023.10230686). This is crucial, especially in highlighting the novelty of the LD-MILCT framework and its alignment with recent advancements in AI applications for medical imaging. The methodology section, presenting an innovative approach, requires a more detailed description for clarity and reproducibility (10.1038/s41569-023-00900-3, 10.26044/ecr2023/C-16014). A clearer explanation of the landmark detection process and its integration with the Conv-Transformer model is essential, enabling other researchers to build upon this approach. Moreover, the results would benefit from a more robust comparative analysis with existing methods (j.compbiomed.2023.107861, 10.1109/ISBI53787.2023.10230448). A detailed comparison, both quantitatively and qualitatively, is crucial to demonstrate the method’s standing relative to existing approaches and to validate the results effectively.

A stronger alignment between the results and conclusions is also needed to enhance the narrative flow and scientific rigor. This alignment is key in demonstrating the research findings' validity and reliability. Additionally, the manuscript would improve with thorough proofreading and consistent formatting, addressing language inconsistencies and formatting issues to enhance readability and professionalism. Discussing the potential application of the framework in broader contexts, beyond the specific diseases studied, would also enrich the manuscript. Exploring its adaptability to other medical imaging types or diseases would illustrate the versatility of the research and its broader implications in medical imaging and diagnosis.

In conclusion, this manuscript provides a promising approach with the potential to impact computer-aided diagnosis using sMRI significantly. Addressing these highlighted areas will elevate the overall quality and contribution of the paper, ensuring that it offers meaningful advancements in diagnosing and understanding complex brain diseases.

Comments on the Quality of English Language

The manuscript's quality in the English language requires attention for improved clarity and coherence. While the overall narrative is understandable, grammatical inconsistencies and awkward phrasings disrupt the flow of reading. Addressing these issues through thorough proofreading and language editing will significantly enhance the readability and professional presentation of the manuscript. Such revisions are essential to ensure that the scientific merit of the work is effectively communicated to its intended audience.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

Manuscript Number: electronics-2832091

Title: Deep Multi-Instance Conv-Transformer Frameworks For Landmark-based Brain MRI Classification

The authors propose a landmark-based multi-instance Conv-Transformer framework for brain disease diagnosis, applied to autism spectrum disorder (ASD), whose biological characteristics are unclear, and Alzheimer’s disease (AD).

This manuscript is adequately written and the study is well-performed. However, some minor drawbacks have to be addressed.

2. Methods

Line 119: “The preparatory stage is crucial for the ensuing stages of analysis. Skull stripping and histogram matching for T1 MR images were done using in-house technologies. To adjust for intensity inhomogeneity, each image is resampled to a resolution of 256 × 256 × 256.”

If the images were resampled to 512 × 512 × 512 instead of 256 × 256 × 256, could this influence the results of the N3 method?

Line 135: “We initially chose a T1-weighted image of superior quality to serve as a template. During the preprocessing stage, all training MRIs are aligned with a template by rotation and translation. This ensures that all training pictures are in the same spatial position.”

Real T1-weighted images of patients would not be of superior quality. Can your landmark-based multi-instance Conv-Transformer framework still perform well in this case?

Discussion:

Line 368: “Furthermore, this study does not include additional information such as gender and age, which are characteristics that could potentially impact brain anatomy.”

From my point of view, gender and age could indeed impact brain anatomy. This information could probably be obtained from the Autism Brain Imaging Data Exchange.

Minor revision

Comments on the Quality of English Language

Minor editing of English language required

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
