Review
Peer-Review Record

Overview of Voice Conversion Methods Based on Deep Learning

Appl. Sci. 2023, 13(5), 3100; https://doi.org/10.3390/app13053100
by Tomasz Walczyna and Zbigniew Piotrowski *
Submission received: 30 January 2023 / Revised: 23 February 2023 / Accepted: 24 February 2023 / Published: 28 February 2023
(This article belongs to the Special Issue Audio and Acoustic Signal Processing)

Round 1

Reviewer 1 Report

There is no novelty in this paper - no simulation results - no conclusions - no graphs

 

It's a brief overview of literature. I wonder whether it is publishable?

Author Response

Dear Reviewer,

Here is a point-by-point response to your comments and concerns.

  • Comment 1: There is no novelty in this paper - no simulation results - no conclusions - no graphs. It's a brief overview of literature. I wonder whether it is publishable?

Response: Thank you for taking the time to review our paper. We appreciate your feedback and apologise that you found our work lacking novelty, simulation results, and conclusions. We understand your concerns and want to clarify that our paper aims to provide a comprehensive overview of the literature in this field, highlighting key themes and recent advancements. While we did not include simulation results or graphs, we have added a new chapter on challenges and rewritten the conclusion of the review, which we hope will interest readers. We understand that this may differ from what you were expecting, but we hope you can see the value in a comprehensive literature review. Once again, thank you for your feedback. We look forward to hearing from you in due time regarding our submission and to responding to any further questions and comments you may have.

Sincerely,

MSc. Tomasz Walczyna

Reviewer 2 Report

The work is essential, but several points must be taken into account:

1. The abstract section is well structured in this paper, although the fluidity is not smooth enough.

2. The English presentation should be further polished. The text has advanced writing issues; grammatical and sentence-construction errors are found in the article. Check your article at https://www.grammarly.com/

3. The introductory paragraph, or opening paragraph, is the paper's first paragraph; it introduces the main idea of the research, captures the interest of possible readers, and tells why this topic is important. In my opinion, the present introduction needs to be improved. The authors should highlight their contributions.

4. In the Conclusion, the authors say, "The presented models have shown great potential in achieving high levels of conversion efficiency and naturalness in converted speech". Reporting the performance of the current state of technology and providing a summary of the available resources for voice conversion research are required to validate the models, measure technological progress, and benchmark a system against the state of the art. The authors typically need to report the results in objective and subjective numerical measurements.

5. The promise and limitations of voice conversion techniques based on deep learning must be discussed. The authors also need to report on the recent Voice Conversion Challenges (VCC).

6. The conclusion is intended to help the reader understand why the research should matter to them after they have finished reading the paper. A conclusion is not merely a summary of the points or a re-statement of the research problem but a synthesis of key points. For most essays, one well-developed paragraph is sufficient for a conclusion.

 

Author Response

Dear Reviewer,

Here is a point-by-point response to your comments and concerns.

  • Comment 1: The abstract section is well structured in this paper, although the fluidity is not smooth enough. 

 

Response: Thank you for your feedback on our paper, and we are grateful that you found the structure of the abstract to be well-organized. We also appreciate your constructive criticism regarding the fluidity of the abstract, and we are pleased to inform you that we have carefully revised the abstract to make it more fluid and easier to read. We hope our updated version of the paper better meets your expectations, and we are grateful for the opportunity to make these improvements. 

 

  • Comment 2: The English presentation should be further polished. The text has advanced writing issues; grammatical and sentence-construction errors are found in the article. Check your article at https://www.grammarly.com/

Response: We appreciate your feedback on the quality of the English language. We understand that there were some grammatical and sentence construction errors in the previous version of the paper, and we apologize for any confusion that these may have caused. We assure you that we have taken your comments seriously and revised the paper using a grammar-checking tool. We hope our updated version of the paper meets the standards you expect, and we appreciate the opportunity to make these improvements.

 

  • Comment 3: The introductory paragraph, or opening paragraph, is the paper's first paragraph; it introduces the main idea of the research, captures the interest of possible readers, and tells why this topic is important. In my opinion, the present introduction needs to be improved. The authors should highlight their contributions.

Response: We appreciate your feedback on the quality of the introduction. We understand that the opening paragraph is critical to capturing possible readers' interest and conveying the topic's importance. We apologize if the previous version of our introduction did not meet your expectations, and we want to assure you that we have carefully revised and expanded the introduction in lines 25-33 to highlight our research contributions. We hope our updated version of the introduction better conveys the significance of our work and captures readers' interest. We appreciate the opportunity to make these improvements, and we hope you will find our updated paper valuable to the field. 

 

  • Comment 4: In the Conclusion, the authors say, "The presented models have shown great potential in achieving high levels of conversion efficiency and naturalness in converted speech". Reporting the performance of the current state of technology and providing a summary of the available resources for voice conversion research are required to validate the models, measure technological progress, and benchmark a system against the state of the art. The authors typically need to report the results in objective and subjective numerical measurements.

Response: Thank you for your valuable feedback on the conclusion section. We appreciate your insights on the importance of reporting objective and subjective numerical measurements to validate the performance of the presented models and measure technological progress. We understand the significance of providing numerical measurements as a benchmark against the state-of-the-art, and we have taken your comments seriously. We are pleased to inform you that we have modified Table 2 to include MOS to measure the performance of the models presented in different papers. We believe that this modification better validates the performance of these models and provides a more comprehensive view of the state-of-the-art in the field of voice conversion. We appreciate the opportunity to make these improvements, and we hope you will find our updated paper valuable to the area.

 

  • Comment 5: The promise and limitations of voice conversion techniques based on deep learning must be discussed. The authors also need to report on the recent Voice Conversion Challenges (VCC).

Response: We appreciate your insights on the importance of discussing the promise and limitations of voice conversion techniques based on deep learning and the need to report on the recent Voice Conversion Challenges (VCC). We have taken your comments seriously and are pleased to inform you that we have added a new chapter to the paper that discusses the promise and limitations of voice conversion techniques based on deep learning. We believe this new chapter will provide readers with a more comprehensive understanding of the state-of-the-art in voice conversion research, and we hope it will address your concerns about the paper. We appreciate the opportunity to make these improvements, and we thank you again for your feedback. We look forward to your continued engagement with our research.

 

  • Comment 6: The Conclusion is intended to help the reader understand why the research should matter to them after they have finished reading the paper. A conclusion is not merely a summary of the points or a re-statement of the research problem but a synthesis of key points. For most essays, one well-developed paragraph is sufficient for a conclusion.

Response: We appreciate your guidance on the importance of a conclusion in helping readers understand why the research should matter to them. In response to your comment, we have revised the Conclusion and added more points on future directions, which we believe will better synthesize the critical points of our paper. We hope our new Conclusion will meet your expectations and provide a comprehensive overview of the potential impact of our research.


Once again, thank you for your valuable feedback, and we hope you will find our updated paper to be a valuable contribution to the field. We look forward to hearing from you regarding our submission and responding to any further questions and comments you may have.

Sincerely,

MSc. Tomasz Walczyna

 

Reviewer 3 Report

The authors present an overview of deep-learning-based voice conversion methods. Overall, the paper is well written and thorough.

I have minor comments about the paper:

- The paper investigates the different steps of voice conversion methods, including speaker identity extraction, linguistic content extraction, encoder, decoder, and vocoder. It would be useful to see tables that summarize the papers for each step.

- Although nice summaries of each method are given, the authors should present the advantages and disadvantages of the methods.

- If possible, a performance comparison of the methods would be very useful.

 

Author Response

Dear Reviewer,

Here is a point-by-point response to your comments and concerns.

  • Comment 1: The paper investigates the different steps of voice conversion methods, including speaker identity extraction, linguistic content extraction, encoder, decoder, and vocoder. It would be useful to see tables that summarize the papers for each step.

Response: Thank you for your valuable feedback. We greatly appreciate your suggestion on improving the readability and organization of our paper. We have considered your comment and have revised the paper accordingly. As suggested, we added Table 1, which summarizes the papers for each step of the voice conversion methods. We hope this will make it easier for readers to follow and understand the different steps of the voice conversion process. Thank you again for your helpful feedback.

  • Comment 2: Although nice summaries of each method are given, the authors should present the advantages and disadvantages of the methods.

Response: Thank you for your comment on our paper. We appreciate your suggestion and have addressed it in the revised version of our paper. We have included a challenges chapter that discusses the advantages and disadvantages of the methods presented in the paper. We believe this section will provide readers with a better understanding of the limitations and potential of the methods.

  • Comment 3: If possible, a performance comparison of the methods would be very useful.

Response: Thank you for your insightful comment on our paper. We greatly appreciate your suggestion to include a performance comparison of the methods described in the paper. As per your suggestion, we have modified Table 2 in the paper, showing a detailed performance comparison of each method in terms of Mean Opinion Score (MOS). We hope that this addition provides a comprehensive understanding of the performance of each method.

 

Once again, thank you for your helpful feedback. We hope that our revisions have addressed your concerns and that you will find the revised paper a more complete and informative resource. We look forward to hearing from you regarding our submission and responding to any further questions and comments you may have.

Sincerely,

MSc. Tomasz Walczyna

Round 2

Reviewer 1 Report

The previous review comments have been addressed.

Author Response

Dear Reviewer,

Thank you for taking the time to review our paper and for providing valuable feedback. We are pleased to hear that the previous review comments have been addressed to your satisfaction.

We appreciate your input and we hope that our paper now meets the standards of the journal. Thank you again for your time and consideration.

Best regards,
MSc. Tomasz Walczyna
