Article
Peer-Review Record

Simplified Knowledge Distillation for Deep Neural Networks Bridging the Performance Gap with a Novel Teacher–Student Architecture

Electronics 2024, 13(22), 4530; https://doi.org/10.3390/electronics13224530
by Sabina Umirzakova 1, Mirjamol Abdullaev 2, Sevara Mardieva 1, Nodira Latipova 3 and Shakhnoza Muksimova 1,*
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3: Anonymous
Reviewer 4: Anonymous
Submission received: 14 October 2024 / Revised: 14 November 2024 / Accepted: 15 November 2024 / Published: 18 November 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The paper deals with an important task. It has scientific novelty and a logical structure, and it is technically sound. The experimental section must be improved. The proposed approach is logical, but the results must be clarified.

Suggestions:

1.     It is advisable to improve the motivation and practical significance of the results.

2.     To demonstrate the results, the authors used the well-known CIFAR-10 dataset. To generalize the results, it is necessary to show the advantages of the proposed approach on one or more other datasets.

3.     "Table 1. The results of the proposed teacher–student model in different T and α values with 100 epochs. The best and worst two results emphasized blue and red colors, respectively." There is no color highlighting in the table. It would be appropriate to highlight the key results in the other tables as well (highlighting is present only in Table 4).

4.     In Table 4, the Train loss and Validation loss for the proposed model are greater than for ResNet152 / ResNet50, although the Train accuracy and Validation accuracy are the best in the comparison. What explains this?

5.     The interpretation of the results is not entirely clear. In Table 4 and Table 5, the results for identical parameters do not match; according to Table 5, the performance is significantly worse.

Author Response

We sincerely thank the reviewers for their thorough and constructive feedback. Your comments have greatly contributed to improving the clarity, depth, and overall quality of our manuscript. We appreciate your insights and suggestions, which have enabled us to strengthen our work and provide a more comprehensive presentation of our research. Thank you for your time and effort in reviewing our paper. We have uploaded a file in which the respected reviewers can find our responses. Thank you.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

1. Summary:

 

The paper introduces a novel knowledge distillation (KD) method aimed at simplifying the distillation process. The primary objective is to reduce the performance gap between large teacher models and smaller student models.

 

2. Strengths:

 

+ The paper proposes a unique approach by designing a new teacher architecture specifically for KD, rather than relying on traditional models like ResNet or VGG. This innovation has the potential to improve the efficiency of the KD process. The proposed method addresses key challenges in neural network compression and optimization by reducing training complexity while maintaining model performance.

 

+ The authors conducted comprehensive experiments to evaluate the impact of various hyperparameters, such as temperature ($T$) and smoothing factor ($\alpha$), on the model’s performance. The results, especially in terms of accuracy and loss across different configurations, clearly demonstrate the effectiveness of the method.
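
For context on these hyperparameters: in the conventional Hinton-style distillation objective, the temperature T softens both the teacher and student logits, and α weights the soft-target term against the hard-label cross-entropy. The sketch below is a generic illustration under that assumption, not the authors' exact loss; the function name kd_loss and its default values are placeholders.

```python
# Illustrative sketch of a conventional knowledge-distillation loss (Hinton et al.),
# assumed here only to clarify the roles of T and alpha; the paper's formulation
# (and the exact role of its smoothing factor) may differ.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soft-target term: KL divergence between temperature-scaled distributions,
    # scaled by T^2 to keep gradient magnitudes comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-label term: ordinary cross-entropy against the ground-truth classes.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

Sweeping a grid of (T, α) values, as the experiments described above do, trades off how much the student follows the teacher's softened distribution versus the ground-truth labels.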

 

3. Weaknesses:

 

- The experiments are limited to CIFAR-10, which is a relatively simple and small dataset. Although CIFAR-10 is a standard benchmark, conducting additional experiments on more complex datasets, such as CIFAR-100 or ImageNet, would provide stronger evidence of the method’s generalizability and scalability.

 

- The paper mainly focuses on the design of the teacher network, but it does not investigate whether the choice of student network architecture affects the overall distillation performance. For example, it is unclear how using a smaller or more complex student network would influence the knowledge transfer process.

Author Response

We sincerely thank the reviewers for their thorough and constructive feedback. Your comments have greatly contributed to improving the clarity, depth, and overall quality of our manuscript. We appreciate your insights and suggestions, which have enabled us to strengthen our work and provide a more comprehensive presentation of our research. Thank you for your time and effort in reviewing our paper. We have uploaded a file in which the respected reviewers can find our responses. Thank you.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

Please see the attached file.

Comments for author File: Comments.pdf

Comments on the Quality of English Language

 The English could be improved to more clearly express the research.

Author Response

We sincerely thank the reviewers for their thorough and constructive feedback. Your comments have greatly contributed to improving the clarity, depth, and overall quality of our manuscript. We appreciate your insights and suggestions, which have enabled us to strengthen our work and provide a more comprehensive presentation of our research. Thank you for your time and effort in reviewing our paper. We have uploaded a file in which the respected reviewers can find our responses. Thank you.

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

In this paper, the authors propose a novel teacher–student deep architecture. In the abstract, you should briefly introduce your teacher–student model. Your proposed model is presented only within a general teacher–student framework. Although you claim the model is lightweight, the current draft does not show this, as you did not include trainable parameter/layer information for the teacher and student architectures. Can you compare the proposed knowledge transfer using knowledge distillation with transfer learning on a domain-specific dataset? See https://link.springer.com/article/10.1007/s11227-022-04830-8 and https://www.sciencedirect.com/science/article/abs/pii/S0263224119301939

The authors should reduce the plagiarism score.

Comments on the Quality of English Language

There are no major issues with the language.

Author Response

We sincerely thank the reviewers for their thorough and constructive feedback. Your comments have greatly contributed to improving the clarity, depth, and overall quality of our manuscript. We appreciate your insights and suggestions, which have enabled us to strengthen our work and provide a more comprehensive presentation of our research. Thank you for your time and effort in reviewing our paper. We have uploaded a file in which the respected reviewers can find our responses. Thank you.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Thank you! I have received responses to all of my recommendations.

Author Response

We would like to extend our sincere gratitude to the reviewers for their insightful and constructive feedback. Your suggestions have been invaluable in refining our manuscript, enhancing both its clarity and depth. We appreciate the time and effort you invested in reviewing our work, and we believe your guidance has significantly strengthened our research. Thank you once again for your valuable contributions to the development of this paper.

Reviewer 2 Report

Comments and Suggestions for Authors

The revised version has addressed my concerns well. I look forward to seeing more work from your group.

Author Response

We would like to extend our sincere gratitude to the reviewers for their insightful and constructive feedback. Your suggestions have been invaluable in refining our manuscript, enhancing both its clarity and depth. We appreciate the time and effort you invested in reviewing our work, and we believe your guidance has significantly strengthened our research. Thank you once again for your valuable contributions to the development of this paper.

Reviewer 3 Report

Comments and Suggestions for Authors

The authors have complied with the reviewer's recommendations. I have no more questions.

Comments on the Quality of English Language

 The English could be improved to more clearly express the research.

Author Response

We would like to extend our sincere gratitude to the reviewers for their insightful and constructive feedback. Your suggestions have been invaluable in refining our manuscript, enhancing both its clarity and depth. We appreciate the time and effort you invested in reviewing our work, and we believe your guidance has significantly strengthened our research. Thank you once again for your valuable contributions to the development of this paper.

Reviewer 4 Report

Comments and Suggestions for Authors

1. The Conclusion section should be shortened, and all figures should be made clear.

2. In Algorithm 1, remove the word "flowchart" after "the proposed method".

3. On page 89, acoustic spectral imaging is not a transfer learning model.

4. You did not respond to the request to compare the proposed knowledge transfer using knowledge distillation with transfer learning on a domain-specific dataset.

Comments on the Quality of English Language

The English could be improved to more clearly express the research.

Author Response

We would like to extend our sincere gratitude to the reviewers for their insightful and constructive feedback. Your suggestions have been invaluable in refining our manuscript, enhancing both its clarity and depth. We appreciate the time and effort you invested in reviewing our work, and we believe your guidance has significantly strengthened our research. Thank you once again for your valuable contributions to the development of this paper.

Author Response File: Author Response.pdf
