Peer-Review Record

FungiLT: A Deep Learning Approach for Species-Level Taxonomic Classification of Fungal ITS Sequences

Computers 2025, 14(3), 85; https://doi.org/10.3390/computers14030085

by Kai Liu^1,2, Hongyuan Zhao^1,2, Dongliang Ren^2, Dongna Ma^2, Shuangping Liu^1,2,3,* and Jian Mao^2,3,*

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Submission received: 18 November 2024 / Revised: 6 January 2025 / Accepted: 9 January 2025 / Published: 28 February 2025

(This article belongs to the Special Issue Emerging Trends in Machine Learning and Artificial Intelligence)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The manuscript presents a deep learning model that integrates LSTM and Transformer architectures to identify fungal species from ITS sequence information. While the authors have conducted a comparative study with existing classification models, including CNN_FunBar, the manuscript lacks a clear exploration of the biological implications underlying the identified species. Additionally, the proposed model, FungiLT, does not demonstrate significant advantages over existing approaches, particularly CNN_FunBar. In its current form, the manuscript has substantial weaknesses that need to be addressed through significant revisions. These shortcomings must be resolved before the manuscript can be reconsidered for publication.

1. The manuscript focuses on a comparative study of prediction models, including the newly developed FungiLT method and existing models such as CNN_FunBar. While such comparisons are valuable, the authors should provide detailed explanations for the observed variations in prediction accuracy. Specifically, what factors contribute to the improvement or reduction in performance across different models? Addressing these factors would add depth to the study and enhance its scientific value. For instance, could specific sequence segments serve as indicators for particular species? If so, can the authors identify any biological implications from their analysis? Exploring such insights would significantly strengthen the manuscript by linking the technical findings to meaningful biological interpretations.

2. Regarding the performance comparison, FungiLT does not demonstrate significant improvement over CNN_FunBar, as evidenced in Tables 1 and 2. In particular, the F1 Score column in Table 2 suggests that CNN_FunBar outperforms FungiLT in several cases. The authors should consider highlighting CNN_FunBar in this context, as it seems to set a higher benchmark for prediction performance. A detailed discussion of why FungiLT underperforms in certain metrics and scenarios would also be beneficial. Are there specific characteristics of the sequence data or architectural limitations of FungiLT that could explain these results? Additionally, the manuscript would benefit from more emphasis on the biological implications of the findings. While the technical development and comparison are important, the lack of discussion about the biological significance of the results leaves the study incomplete. Addressing this aspect could provide a broader context and make the manuscript more impactful. Purely comparing model performance is insufficient for publication in this journal.

3. Related to point 2, some of the conclusions regarding the FungiLT model are overly strong and not well-supported by the data, which is problematic. The ablation experiments in Table 3 do not demonstrate a significant performance decline (page 7 line 243, page 8 line 254) after removing any individual components, including LSTM, Transformer, Dropout, etc. This also suggests that the combination of LSTM and Transformer—touted as a key novelty of this work—may not be necessary for achieving the reported results. The authors should temper their claims and provide a more balanced discussion of these findings, acknowledging the limited impact of these architectural elements as indicated by their ablation study.

4. Although Figure 1A gives a preliminary comparison, the choice of k = 5 for later analysis requires further justification, especially as the exploration covered only a narrow range (k = 3 to 6). Given that ITS region sequences range from 200 to 800 bp, the k-value significantly influences model performance. Smaller k-values may capture short motifs but miss contextual information, while larger k-values can represent longer patterns but risk introducing noise or increasing computational demands. The repetitive nature of genomic sequences, such as G-tracts and A-tracts, suggests that short-range interactions like base-base stacking need to be carefully evaluated. The authors should clarify whether this choice was based on prior studies, optimization experiments, or a trade-off between accuracy and efficiency. Exploring a broader range of k-values is recommended to ensure that important sequence patterns and biological features are adequately represented, enhancing the robustness of fungal identification.

5. Figure 1B repeats the data in Table 1, and Figure 2B repeats Table 2. This redundancy adds no new insights and should be consolidated to avoid repetition.
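As an illustration of the trade-off raised in point 4, the following is a minimal sketch of overlapping k-mer tokenization. It is not the authors' preprocessing code: the function names, the stride of 1, and the toy sequence are assumptions made only for this example. It shows that the vocabulary grows as 4^k while the number of tokens per sequence shrinks only slightly as k increases.

```python
def kmer_tokens(seq: str, k: int, stride: int = 1) -> list[str]:
    """Split a DNA sequence into k-mers (overlapping when stride < k)."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def vocab_size(k: int, alphabet: str = "ACGT") -> int:
    """Number of distinct k-mers over the given alphabet (4**k for DNA)."""
    return len(alphabet) ** k


# Hypothetical 10-bp fragment, used only to make the output easy to inspect.
toy = "ACGTACGGTC"

if __name__ == "__main__":
    for k in range(3, 7):
        tokens = kmer_tokens(toy, k)
        print(f"k={k}  vocabulary={vocab_size(k)}  tokens={len(tokens)}")
```

Under these assumptions, moving from k = 3 to k = 6 raises the vocabulary from 64 to 4096 possible tokens, which is one way to quantify the cost of widening the range explored in Figure 1A.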

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

1. While the article provides a detailed discussion of the databases included, a more comprehensive explanation of the criteria for selecting and integrating the six datasets should be provided.

2. The metrics used for model evaluation are well explained, but additional examples are needed.

3. The article discusses the challenge of species data imbalance but does not elaborate on future approaches or techniques for mitigating this issue effectively.

4. The article compares FungiLT with BLAST, QIIME2, and CNN_FunBar. Including a table with performance metrics across datasets for each method would make the comparison more comprehensive.

5. The flowchart in Figure 5 is useful but could benefit from additional annotations explaining each step for readers who might not be familiar with ITS sequence processing.
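To illustrate the kind of worked example requested in point 2, and its link to the imbalance issue in point 3, here is a minimal sketch that computes per-class precision, recall, and F1 and then a macro-averaged F1. The species labels and predictions are invented toy data, and the averaging scheme is an assumption: this record does not state whether the manuscript reports macro or weighted averages.

```python
from collections import Counter


def per_class_prf(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        scores[c] = (prec, rec, f1)
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare species count as much as common ones."""
    scores = per_class_prf(y_true, y_pred)
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


# Toy data (hypothetical): species "A" is twice as frequent as species "B".
y_true = ["A", "A", "A", "A", "B", "B"]
y_pred = ["A", "A", "A", "B", "B", "A"]

if __name__ == "__main__":
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    print("accuracy:", round(accuracy, 3))
    print("per class:", per_class_prf(y_true, y_pred))
    print("macro F1:", macro_f1(y_true, y_pred))
```

On this toy data the accuracy (4 of 6 correct) is higher than the macro F1, because the rarer class is classified less reliably. An example of this kind would show readers why accuracy alone can be misleading on imbalanced species data.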

Comments on the Quality of English Language

none

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The authors have addressed my questions thoroughly, and I am satisfied with the current manuscript. I recommend acceptance pending minor adjustments to the figure sizes.

Author Response

Comments 1: The authors have addressed my questions thoroughly, and I am satisfied with the current manuscript. I recommend acceptance pending minor adjustments to the figure sizes.

Response 1: We sincerely thank you for your recognition of our manuscript and your valuable suggestions regarding the improvement of the figures. Based on your feedback, we have carefully adjusted all the figures in the manuscript to ensure a better presentation of the research findings. The specific improvements are as follows:

(1) We have standardized the font styles and line thicknesses across all figures, enhancing the visual clarity of the images to facilitate readers' understanding of the data.

(2) We have made slight adjustments to the positioning of legends, labels, and layouts to make the data presentation more intuitive, enabling readers to grasp key information more quickly.

(3) We have appropriately adjusted the figure sizes to ensure they occupy a reasonable amount of space in the manuscript while clearly presenting all key details.

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have made sufficient changes.

Author Response

Comments 1: The authors have made sufficient changes.

Response 1: We sincerely thank you for your positive feedback and recognition of our revisions.
