Advances in Phonetic Sciences: Role of Speech Corpora and Automatic Processing

A special issue of Languages (ISSN 2226-471X).

Deadline for manuscript submissions: closed (15 April 2022)

Special Issue Editors

Guest Editor
LISN/CNRS, UMR 9015, Université Paris-Saclay, 91405 Orsay, France
Interests: large scale corpora; phonetic variation; language change

Guest Editor
1. CRISCO/EA4255, Université de Caen, 14000 Caen, France
2. LISN/CNRS, UMR 9015, Université Paris-Saclay, 91405 Orsay, France
3. Laboratoire de Phonétique et Phonologie, UMR 7018, CNRS-Sorbonne Nouvelle, 75005 Paris, France
Interests: large corpora phonetics; variation in continuous speech; second language acquisition

Guest Editor
LISN/CNRS, UMR 9015, Université Paris-Saclay, 91405 Orsay, France
Interests: phonetic variation in large scale corpora; fine phonetic details; phonology of Romance languages

Special Issue Information

Dear Colleagues,

We are pleased to announce a call for papers for a Special Issue on “Advances in Phonetic Sciences: The role of Speech Corpora and Automatic Processing”.

This Special Issue aims to bring together recent research on advances in speech corpora and to better comprehend the current status and challenges in the construction and analysis of spoken corpora.

During the past few decades, we have witnessed increasing collaboration between the linguistics and speech technology communities (Bradlow et al., 2011; Ernestus and Warner, 2011; Coleman et al., 2011). On the linguistic side, this collaboration has resulted mainly in a growing integration of methods, tools and corpora from speech technologies into the analytical practices of linguistic domains such as phonetics and laboratory phonology. In particular, the automatic or semi-automatic analysis of large collections of spoken data is impacting the phonetic sciences. These analyses have allowed us to test classical theoretical issues from a different perspective, and they have greatly facilitated the work of linguists (Liberman, 2019). On the speech technology side, in-depth explorations of speech reduction phenomena have helped to improve pronunciation dictionaries for speech recognition systems (Adda-Decker and Lamel, 2018; Vasilescu et al., 2020).

In this Special Issue, we would like to address the different demands of, and interactions between, linguistics (with a particular focus on phonetics and laboratory phonology research) and computer science. We also aim to provide a state-of-the-art overview of corpus construction and publication, the technological processing of corpora, the ecological use, by other scholars, of corpora gathered for a specific purpose, data sharing in general, and the benefits of advances in speech corpus construction and analysis for real-world applications. Special emphasis will be placed on the relevance of multidisciplinarity in spoken data creation, analysis and sharing, and on collaborations among different research disciplines. We welcome submissions on advances in speech corpora covering technological and/or linguistic aspects.

We request that, prior to submitting a manuscript, interested authors initially submit a proposed title and an abstract of 400–600 words summarizing their intended contribution. Please send it to the Guest Editors ([email protected], [email protected], [email protected]) or to the Languages Editorial Office ([email protected]). Abstracts will be reviewed by the Guest Editors for the purposes of ensuring proper fit within the scope of the Special Issue. Full manuscripts will undergo double-blind peer review.

The tentative completion schedule is as follows:

* Abstract submission deadline: 15 November 2021

* Notification of abstract acceptance: 15 December 2021

* Full manuscript deadline: 15 April 2022

List of references:

Adda-Decker, M., & Lamel, L. (2018). Discovering speech reductions across speaking styles and languages. In Rethinking Reduction (pp. 101–128). De Gruyter Mouton.

Bradlow, A. R., Guion-Anderson, S., & Polka, L. (2011). Cross-language Speech Perception and Variations in Linguistics Experience. C. T. Best (Ed.). Elsevier.

Coleman, J., Liberman, M., Kochanski, G., Burnard, L., & Yuan, J. (2011). Mining a year of speech. VLSP 2011: New tools and methods for very-large-scale phonetics research, 16–19.

Ernestus, M., & Warner, N. (2011). An introduction to reduced pronunciation variants. Journal of Phonetics, 39(SI), 253–260.

Liberman, M. Y. (2019). Corpus phonetics. Annual Review of Linguistics, 5, 91–107.

Vasilescu, I., Wu, Y., Jatteau, A., Adda-Decker, M., & Lamel, L. (2020). Alternances de voisement et processus de lénition et de fortition: une étude automatisée de grands corpus en cinq langues romanes. Traitement Automatique des Langues (TAL), 3, 11–36.

Prof. Dr. Ioana Vasilescu
Dr. Yaru Wu
Dr. Mathilde Hutin
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a double-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Languages is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • phonetics
  • laboratory phonology
  • data collection
  • large scale corpora
  • speech production variation
  • speech perception
  • speech technology

Published Papers (1 paper)

Research

14 pages, 3372 KiB  
Article
Operation LiLi: Using Crowd-Sourced Data and Automatic Alignment to Investigate the Phonetics and Phonology of Less-Resourced Languages
by Mathilde Hutin and Marc Allassonnière-Tang
Languages 2022, 7(3), 234; https://doi.org/10.3390/languages7030234 - 8 Sep 2022
Abstract
Less-resourced languages are usually left out of phonetic studies based on large corpora. We contribute to the recent efforts to fill this gap by assessing how to use open-access, crowd-sourced audio data from Lingua Libre for phonetic research. Lingua Libre is a participative linguistic library developed by Wikimedia France in 2015. It contains more than 670k recordings in approximately 150 languages across nearly 740 speakers. As a proof of concept, we consider the Inventory Size Hypothesis, which predicts that, in a given system, variation in the realization of each vowel will be inversely related to the number of vowel categories. We investigate data from 10 languages with various numbers of vowel categories, i.e., German, Afrikaans, French, Catalan, Italian, Romanian, Polish, Russian, Spanish, and Basque. Audio files are extracted from Lingua Libre to be aligned and segmented using the Munich Automatic Segmentation System. Information on the formants of the vowel segments is then extracted to measure how vowels expand in the acoustic space and whether this is correlated with the number of vowel categories in the language. The results provide valuable insight into the question of vowel dispersion and demonstrate the wealth of information that crowd-sourced data has to offer.
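
As a rough illustration of the kind of test the abstract describes, the following Python sketch computes a per-language measure of within-category vowel variation from (F1, F2) formant pairs and correlates it with the number of vowel categories. It is not the authors' method: the dispersion measure (mean Euclidean distance of tokens from their category centroid), the function names and the toy values are all assumptions made for illustration, and no real measurements from the article are used.

```python
from math import dist, sqrt
from statistics import mean


def category_dispersion(tokens):
    """Mean Euclidean distance of (F1, F2) tokens from their centroid (assumed measure)."""
    centroid = (mean(t[0] for t in tokens), mean(t[1] for t in tokens))
    return mean(dist(t, centroid) for t in tokens)


def language_dispersion(vowels):
    """Average within-category dispersion over all vowel categories of one language."""
    return mean(category_dispersion(tokens) for tokens in vowels.values())


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical toy formant values in Hz, invented for this sketch only.
toy = {
    "lang_2v": {"i": [(300, 2200), (360, 2320)], "a": [(700, 1300), (780, 1180)]},
    "lang_3v": {"i": [(300, 2200), (340, 2280)], "a": [(700, 1300), (750, 1220)],
                "u": [(310, 800), (350, 880)]},
    "lang_4v": {"i": [(300, 2200), (320, 2240)], "e": [(450, 2000), (470, 2040)],
                "a": [(700, 1300), (720, 1340)], "u": [(310, 800), (330, 840)]},
}

inventory_sizes = [len(vowels) for vowels in toy.values()]
dispersions = [language_dispersion(vowels) for vowels in toy.values()]

# A negative coefficient would be in line with the Inventory Size Hypothesis.
print(pearson(inventory_sizes, dispersions))
```

With real data, the formant pairs would come from the aligned and segmented vowel tokens, and a proper analysis would likely need speaker normalization and a statistical model rather than a single correlation coefficient.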