Peer-Review Record

M2ASR-KIRGHIZ: A Free Kirghiz Speech Database and Accompanied Baselines

Information 2023, 14(1), 55; https://doi.org/10.3390/info14010055

by Ikram Mamtimin^1,†, Wenqiang Du^2,† and Askar Hamdulla^1,*

Reviewer 1: Lucas Ondel

Reviewer 2: Anonymous

Reviewer 3: Gintautas Tamulevičius

Reviewer 4: Luis Javier Rodriguez Fuentes

Submission received: 1 November 2022 / Revised: 18 December 2022 / Accepted: 6 January 2023 / Published: 16 January 2023

Round 1

Reviewer 1 Report

This work presents the collection of a Kirghiz speech database aimed at building ASR systems.

Besides some English grammatical mistakes, the paper is well organized and the work is clearly presented. The first part reviews the particularities of the Kirghiz language, while the second part details the collection process. I have only a minor comment on the manuscript: in the paragraph "Speaker selection" it is mentioned that the speakers were "selected to reflect diversity of gender, age, geography and education". In the next sentence it is reported that the speakers are all university students, with 63% males and 37% females, and ages ranging from 19 to 25 years old. It seems that the set of speakers is not representative of the Kirghiz population. Please correct this sentence accordingly.

My main concern is that, despite what is claimed in the paper, the data is not currently available (it is "coming soon" according to the link provided in the paper). Therefore, for the time being, it is difficult to judge the outcome of this work. The same goes for the Kaldi and WeNet recipes: I couldn't find a relevant project at the link provided.

Consequently, I think this paper cannot be accepted for publication as long as the data and recipes are not provided.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

This paper provides a free database of Kirghiz speech and related linguistic resources. This is the largest open Kirghiz speech database (transcribed, 128 hours from 163 speakers). The background knowledge of Kirghiz is presented in detail in this paper. The baselines are provided.

Author Response

Point 1: This paper provides a free database of Kirghiz speech and related linguistic resources. This is the largest open Kirghiz speech database (transcribed, 128 hours from 163 speakers). The background knowledge of Kirghiz is presented in detail in this paper. The baselines are provided.

Response 1: We appreciate the positive comments from the reviewer.

Reviewer 3 Report

All comments are given in the attached review document.

Comments for author File: Comments.pdf

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 4 Report

The paper describes a new speech database for training ASR systems for the Kirghiz language, a minority language spoken in a region of China. It also provides ASR results (in terms of Letter Error Rate) for five baseline systems implemented using state-of-the-art (SOA) technology.

I find the production of new resources to foster research and technological development for minority languages very interesting. In this regard, the free availability of these new datasets and of the baseline systems developed for validation is key for future research.

Minor issues

The sequences of vowels and consonants shown in Table 6 are inverted. I mean that V should be replaced by C and C replaced by V.

In Section 4.1, the authors say that speakers were selected in order to have a diversity of gender, age, geography, and education. But taking into account the information provided in the paper, there is little diversity regarding age and education, because all speakers are students in the age range of 19 to 25 years old. So diversity is restricted to gender and perhaps geography.

Finally, while the English writing is reasonably good, it requires some proofreading. Attached to this review, I provide my own notes and suggestions as a PDF file.

Comments for author File: Comments.pdf

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

The remarks from my previous review have been adequately addressed:
1. The paragraph on speaker selection has been corrected.
2. A link to the database is provided (the data is accessible on request), and the authors describe the recipes they used for Kaldi and WeNet (removing the need to provide a link to their own repository).
