Article
Peer-Review Record

MLLP-VRAIN Spanish ASR Systems for the Albayzín-RTVE 2020 Speech-to-Text Challenge: Extension

Appl. Sci. 2022, 12(2), 804; https://doi.org/10.3390/app12020804
by Pau Baquero-Arnal *, Javier Jorge, Adrià Giménez, Javier Iranzo-Sánchez, Alejandro Pérez, Gonçal Vicent Garcés Díaz-Munío, Joan Albert Silvestre-Cerdà, Jorge Civera, Albert Sanchis and Alfons Juan
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 29 November 2021 / Revised: 3 January 2022 / Accepted: 8 January 2022 / Published: 13 January 2022

Round 1

Reviewer 1 Report

This article describes the ASR systems built by the MLLP-VRAIN group for the Albayzín-RTVE 2020 Speech-to-Text Challenge. The techniques used seem to be correct and the performance is fairly good. The experimental analysis is also sound.

However, some points should be justified or revised:

  1. The introduction section states the motivation of the article. Nevertheless, a few paragraphs of "related work" are commonly included, either in that section or in a separate one, to tell readers what other researchers have tried.
  2. Sections 3.1 to 3.3 describe the methods used to build the acoustic and language models. In my opinion, the methods are described in too little detail, although the authors give a brief introduction and reference papers. The important techniques should be described in more detail to make the article easier to read, e.g., the HCS strategy for LM combination, WMA for online decoding, etc.

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 2 Report

The paper describes an ASR system for the Albayzín-RTVE 2020 challenge.

L42 and further - transcription is not a task for professional linguists, but for a skilled typist.

Chapter 2 - the sampling rate of the data should be given.

L124 "...a final model was selected with a 256-unit embedding layer and two hidden LSTM layers of 2048 units." -> "a model with a 256-unit embedding layer and two hidden LSTM layers of 2048 units was selected."

L242 - in a paper like this, a proper analysis of the system results should be carried out. The authors should therefore evaluate what part of the WER difference between the online and offline systems is caused by wrong WAD and what part by full sentence normalization. Conclusions and possible improvements should also be drawn from such an evaluation. Otherwise, the paper is just a description of the system and does not offer much new insight into the differences between online and offline speech processing.

Figures 2 and 5 should have the same legend (tf - TML...). Some points should also be labelled with the window size; otherwise, the figures do not have much informational value.

Fig. 1 is completely contained in Fig. 3, and similarly for Figs. 2 and 4. I believe only two figures need to be present, with only slight modifications to the text.

L311 and further - the authors discuss the gap between the closed- and open-condition systems and the reduction of that gap by having an "open LM". It would be nice to report how many errors were recovered by the open LM and, ideally, to also analyse the remaining ones.

Citations: "others" -> "et al."

Are there really always so many authors that the full list cannot be given in the citations?

General comment: There are too many different abbreviations in the article, which makes it harder to read. For example, LMHR and LMHP appear only twice in the article and could be replaced by their full forms.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
