Article

The Domain Mismatch Problem in the Broadcast Speaker Attribution Task

by Ignacio Viñals *,†, Alfonso Ortega *,†, Antonio Miguel *,† and Eduardo Lleida *,†
ViVoLab, Aragón Institute for Engineering Research (I3A), University of Zaragoza, 50018 Zaragoza, Spain
* Authors to whom correspondence should be addressed.
† These authors contributed equally to this work.
Academic Editor: José A. González-López
Appl. Sci. 2021, 11(18), 8521; https://doi.org/10.3390/app11188521
Received: 4 August 2021 / Revised: 3 September 2021 / Accepted: 9 September 2021 / Published: 14 September 2021
The demand for high-quality metadata for available multimedia content requires new techniques able to correctly identify more and more information, including speaker information. The task known as speaker attribution aims to identify all or part of the speakers in the audio under analysis. In this work, we study the speaker attribution problem in the broadcast domain. Through our experiments, we illustrate the positive impact of diarization on the final performance. Additionally, we show the influence of the variability present in broadcast data, depicting the broadcast domain as a collection of subdomains with particular characteristics. Taking these two factors into account, we also propose alternative approaches that are robust against domain mismatch. These approaches include a semisupervised alternative as well as a new, fully unsupervised hybrid solution fusing diarization and speaker assignment. Thanks to these two approaches, performance improves by around 50% in relative terms. The analysis has been carried out using the corpus of the Albayzín 2020 challenge, a diarization and speaker attribution evaluation working with broadcast data. These data, provided by Radio Televisión Española (RTVE), the Spanish public radio and TV corporation, include multiple shows and genres to analyze the impact of new speech technologies in real-world scenarios.
Keywords: speaker attribution; diarization; multi-domain; domain mismatch
MDPI and ACS Style

Viñals, I.; Ortega, A.; Miguel, A.; Lleida, E. The Domain Mismatch Problem in the Broadcast Speaker Attribution Task. Appl. Sci. 2021, 11, 8521. https://doi.org/10.3390/app11188521
