Article

The Multi-Domain International Search on Speech 2020 ALBAYZIN Evaluation: Overview, Systems, Results, Discussion and Post-Evaluation Analyses

1
Department of Information Technology, Escuela Politécnica Superior, Universidad San Pablo CEU, Campus Montepríncipe, Urbanización Montepríncipe, 28925 Madrid, Spain
2
AUDIAS, Electronic and Communication Technology Department, Escuela Politécnica Superior, Universidad Autónoma de Madrid, Av. Francisco Tomás y Valiente, 11, 28049 Madrid, Spain
3
Voice Group, Advanced Technologies Application Center, CENATAV, Rpto. Siboney, Playa, La Habana 74390, Cuba
*
Author to whom correspondence should be addressed.
Academic Editor: Valentín Cardeñoso-Payo
Appl. Sci. 2021, 11(18), 8519; https://doi.org/10.3390/app11188519
Received: 30 July 2021 / Revised: 7 September 2021 / Accepted: 10 September 2021 / Published: 14 September 2021
The large amount of information stored in audio and video repositories makes search on speech (SoS) a challenging area that continues to attract considerable interest. Within SoS, spoken term detection (STD) aims to retrieve speech data given a text-based representation of a search query (which can include one or more words). In contrast, query-by-example spoken term detection (QbE STD) aims to retrieve speech data given an acoustic representation of a search query. This is the first paper to present an internationally open multi-domain evaluation for SoS in Spanish that includes both STD and QbE STD tasks. The evaluation was carefully designed so that several post-evaluation analyses of the main results could be carried out. Both tasks aim to retrieve the speech files that contain the queries and to provide, for each detection, its start and end times and a score that reflects how likely the query is to occur within that time interval of that speech file. Three speech databases in Spanish, covering different domains, were employed in the evaluation: the MAVIR database, which comprises a set of talks from workshops; the RTVE database, which includes broadcast news programs; and the SPARL20 database, which contains Spanish parliament sessions. We present the evaluation itself, the three databases, the evaluation metric, the systems submitted to the evaluation, the evaluation results and some detailed post-evaluation analyses based on specific query properties (in-vocabulary/out-of-vocabulary queries, single-word/multi-word queries and native/foreign queries). The most novel features of the submitted systems are a data augmentation technique for the STD task and an end-to-end system for the QbE STD task. The results suggest that there is clear room for improvement in the SoS task and that performance is highly sensitive to changes in the data domain.
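To make the expected system output concrete, the following minimal Python sketch models one detection as a record holding the query, the speech file, the start and end times and the score. It is only an illustration: the field names, the use of seconds, the score range, the file names and the `keep_confident` helper are assumptions made here and are not taken from the evaluation plan or from any submitted system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One hypothesised occurrence of a query (all field names are hypothetical)."""
    query: str        # query term (STD) or query identifier (QbE STD)
    speech_file: str  # speech file in which the query is hypothesised to occur
    start: float      # start time, assumed here to be in seconds
    end: float        # end time, assumed here to be in seconds
    score: float      # confidence score; higher is assumed to mean more likely


def keep_confident(detections, threshold):
    """Return the detections whose score reaches the threshold, best first."""
    kept = [d for d in detections if d.score >= threshold]
    return sorted(kept, key=lambda d: d.score, reverse=True)


# Invented example hypotheses, not real evaluation data.
hypotheses = [
    Detection("madrid", "talk_01.wav", 12.40, 12.95, 0.91),
    Detection("madrid", "talk_02.wav", 3.10, 3.70, 0.35),
    Detection("parlamento", "session_07.wav", 48.00, 48.80, 0.78),
]

confident = keep_confident(hypotheses, threshold=0.5)
for d in confident:
    print(f"{d.query}\t{d.speech_file}\t{d.start:.2f}\t{d.end:.2f}\t{d.score:.2f}")
```

A fixed decision threshold is used here purely for illustration; how scores are turned into accept or reject decisions is up to each system.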
Keywords: search on speech; spoken term detection; query-by-example spoken term detection; international evaluation; Spanish language
MDPI and ACS Style

Tejedor, J.; Toledano, D.T.; Ramirez, J.M.; Montalvo, A.R.; Alvarez-Trejos, J.I. The Multi-Domain International Search on Speech 2020 ALBAYZIN Evaluation: Overview, Systems, Results, Discussion and Post-Evaluation Analyses. Appl. Sci. 2021, 11, 8519. https://doi.org/10.3390/app11188519

