Pauses in Speech

A special issue of Languages (ISSN 2226-471X).

Deadline for manuscript submissions: closed (20 November 2022) | Viewed by 43407

Special Issue Editors


Guest Editor
Department of Language Science and Technology, Saarland University, 66123 Saarbrücken, Germany
Interests: speech pauses; speech respiration; prosodic breaks; pause-internal phonetic particles

Guest Editor
Department of Language Science and Technology, Saarland University, 66123 Saarbrücken, Germany
Interests: speech production; speech perception; prosody; speech synthesis

Special Issue Information

Dear Colleagues,

Producing continuous speech without pauses is impossible. However, speech pauses are rarely the object of scientific research in linguistics and speech science and technology. Pauses are taken for granted and usually ignored in the annotation and analysis of spoken language. Even though speech pauses as a temporal variable are prosodic in nature, prosody research and, specifically, studies on speech timing tend to ignore pauses. That speech pauses are an under-investigated research area is reflected by, for instance, a lack of coverage in the recent Oxford Handbook of Language Prosody (Gussenhoven and Chen, 2020), suggesting a relative lack of sensitivity in the phonetic and prosodic communities to speech material beyond single utterances or sentences.

Speech pauses are often regarded as silence, i.e., as the absence of phonetic gestures, although many pauses are in fact not silent in an acoustic-phonetic sense: they often contain phonetic particles such as breath noises, tongue clicks and lip smacks, and these particles can be informative with respect to speech planning and preparation. Complementary to the inadequate term 'silent pauses', the term 'filled pauses' is often used to refer to a hesitation syllable, which consists of either a vowel or a vowel followed by a nasal consonant, but not to the entire pause event, which includes silent phases before or after a hesitation syllable or other phonetic particle, or both. Apart from the lack of consensus on such descriptive terms, the underlying relation between pauses and the planning and execution of speech production, as well as the role of pauses in speech perception, is evidently still under-researched.

Speech pauses are sometimes treated as synonymous with prosodic boundaries found in fluent and well-formed speech. These boundaries usually reflect syntactic but also rhythmical structures (Gee and Grosjean, 1983). Speech pauses can also be used beyond 'spoken interpunction', for instance for emphasis, where they have a highlighting function, typically directing the listener's attention to upcoming linguistic material (e.g., Fuchs et al., 2013), and they also play a role in turn-taking (e.g., Lundholm Fors, 2015). In addition, pauses are core markers of non-scripted speech styles. The analysis and modeling of speech tempo and fluency, which is essential for many fields of spoken language research and applications, such as non-native speech, pathological forms of speech, forensic analyses, and speech synthesis and recognition, must crucially consider speech pauses.

This special issue attempts to fill the gaps identified above and bring together contributions from several areas of spoken language research. Possible research questions include, but are not limited to: What is a pause? What is the role of breathing for speech pausing? How do pauses affect speech fluency? What are the phonetic characteristics of hesitations and filler particles? What is the contribution of pauses to perceived tempo, speaking rate, and fluency? To what extent are pausing patterns idiosyncratic or language/culture dependent? What are the signatures of pauses in dialogues, multimodal contexts and in different speech styles, including affective speech?

We request that, prior to submitting a manuscript, interested authors initially submit a proposed title and an abstract of 400-600 words summarizing their intended contribution. Please send it to the guest editors Jürgen Trouvain and Bernd Möbius or to the journal's editorial office. Abstracts will be reviewed by the guest editors to ensure a proper fit within the scope of the special issue. Full manuscripts will undergo double-blind peer review.

Tentative completion schedule:

Abstract submission deadline: 30 June 2022

Notification of abstract acceptance: 31 July 2022

Full manuscript deadline: 31 October 2022

References:

Fuchs, S., Petrone, C., Krivokapić, J. & Hoole, P. (2013). Acoustic and respiratory evidence for utterance planning in German. Journal of Phonetics 41, pp. 29–47.

Gee, J.P. & Grosjean, F. (1983). Performance structures: A psycholinguistic and linguistic appraisal. Cognitive Psychology 15(4), pp. 411–458.

Gussenhoven, C. & Chen, A. (eds) (2020). The Oxford Handbook of Language Prosody. Oxford: Oxford University Press.

Lundholm Fors, K. (2015). Production and Perception of Pauses in Speech. PhD thesis, Gothenburg University.

Dr. Jürgen Trouvain
Prof. Dr. Bernd Möbius
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a double-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Languages is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • speech pauses
  • prosodic boundaries
  • speech respiration
  • filler particles
  • pause perception
  • fluency
  • timing patterns
  • speaking rate
  • nonverbal vocalisations

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (11 papers)


Research

15 pages, 770 KiB  
Article
Pauses and Parsing: Testing the Role of Prosodic Chunking in Sentence Processing
by Caoimhe Harrington Stack and Duane G. Watson
Languages 2023, 8(3), 157; https://doi.org/10.3390/languages8030157 - 28 Jun 2023
Viewed by 1791
Abstract
It is broadly accepted that the prosody of a sentence can influence sentence processing by providing the listener information about the syntax of the sentence. It is less clear what the mechanism is that underlies the transmission of this information. In this paper, we test whether the influence of the prosodic structure on parsing is a result of perceptual breaks such as pauses or whether it is the result of more abstract prosodic elements, such as intonational phrases. In three experiments, we test whether different types of perceptual breaks, e.g., intonational boundaries (Experiment 1), an artificial buzzing sound (Experiment 2), and an isolated pause (Experiment 3), influence syntactic attachment in ambiguous sentences. We find that although full intonational boundaries influence syntactic disambiguation, the artificial buzz and isolated pause do not. These data rule out theories that argue that perceptual breaks indirectly influence grammatical attachment through memory mechanisms, and instead, show that listeners use prosodic breaks themselves as cues to parsing.
(This article belongs to the Special Issue Pauses in Speech)

30 pages, 8424 KiB  
Article
Disfluencies Revisited—Are They Speaker-Specific?
by Angelika Braun, Nathalie Elsässer and Lea Willems
Languages 2023, 8(3), 155; https://doi.org/10.3390/languages8030155 - 26 Jun 2023
Cited by 2 | Viewed by 6583
Abstract
The forensic application of phonetics relies on individuality in speech. In the forensic domain, individual patterns of verbal and paraverbal behavior are of interest which are readily available, measurable, consistent, and robust to disguise and to telephone transmission. This contribution is written from the perspective of the forensic phonetic practitioner and seeks to establish a more comprehensive concept of disfluency than previous studies have. A taxonomy of possible variables forming part of what can be termed disfluency behavior is outlined. It includes the “classical” fillers, but extends well beyond these, covering, among others, additional types of fillers as well as prolongations, but also the way in which fillers are combined with pauses. In the empirical section, the materials collected for an earlier study are re-examined and subjected to two different statistical procedures in an attempt to approach the issue of individuality. Recordings consist of several minutes of spontaneous speech by eight speakers on three different occasions. Beyond the established set of hesitation markers, additional aspects of disfluency behavior which fulfill the criteria outlined above are included in the analysis. The proportion of various types of disfluency markers is determined. Both statistical approaches suggest that these speakers can be distinguished at a level far above chance using the disfluency data. At the same time, the results show that it is difficult to pin down a single measure which characterizes the disfluency behavior of an individual speaker. The forensic implications of these findings are discussed.

19 pages, 988 KiB  
Article
Speech Rate and Turn-Transition Pause Duration in Dutch and English Spontaneous Question-Answer Sequences
by Damar Hoogland, Laurence White and Sarah Knight
Languages 2023, 8(2), 115; https://doi.org/10.3390/languages8020115 - 22 Apr 2023
Cited by 1 | Viewed by 2387
Abstract
The duration of inter-speaker pauses is a pragmatically salient aspect of conversation that is affected by linguistic and non-linguistic context. Theories of conversational turn-taking imply that, due to listener entrainment to the flow of syllables, a higher speech rate will be associated with shorter turn-transition times (TTT). Previous studies have found conflicting evidence, however, some of which may be due to methodological differences. In order to test the relationship between speech rate and TTT, and how this may be modulated by other dialogue factors, we used question-answer sequences from spontaneous conversational corpora in Dutch and English. As utterance-final lengthening is a local cue to turn endings, we also examined the impact of utterance-final syllable rhyme duration on TTT. Using mixed-effect linear regression models, we observed evidence for a positive relationship between speech rate and TTT: thus, a higher speech rate is associated with longer TTT, contrary to most theoretical predictions. Moreover, for answers following a pause (“gaps”) there was a marginal interaction between speech rate and final rhyme duration, such that relatively long final rhymes are associated with shorter TTT when foregoing speech rate is high. We also found evidence that polar (yes/no) questions are responded to with shorter TTT than open questions, and that direct answers have shorter TTT than responses that do not directly answer the questions. Moreover, the effect of speech rate on TTT was modulated by question type. We found no predictors of the (negative) TTT for answers that overlap with the foregoing questions. Overall, these observations suggest that TTT is governed by multiple dialogue factors, potentially including the salience of utterance-final timing cues. Contrary to some theoretical accounts, there is no strong evidence that higher speech rates are consistently associated with shorter TTT.

31 pages, 9159 KiB  
Article
Distributional and Acoustic Characteristics of Filler Particles in German with Consideration of Forensic-Phonetic Aspects
by Beeke Muhlack, Jürgen Trouvain and Michael Jessen
Languages 2023, 8(2), 100; https://doi.org/10.3390/languages8020100 - 31 Mar 2023
Viewed by 2470
Abstract
In this study, we investigate the use of the filler particles (FPs) uh, um, hm, as well as glottal FPs and tongue clicks of 100 male native German speakers in a corpus of spontaneous speech. For this purpose, the frequency distribution, FP duration, duration of pauses surrounding FPs, voice quality of FPs, and their vowel quality are investigated in two conditions, namely, normal speech and Lombard speech. Speaker-specific patterns are investigated on the basis of twelve sample speakers. Our results show that tongue clicks and glottal FPs are as common as typically described FPs, and should be a part of disfluency research. Moreover, the frequency of uh, um, and hm decreases in the Lombard condition while the opposite is found for tongue clicks. Furthermore, along with the usual F1 increase, a considerable reduction in vowel space is found in the Lombard condition for the vowels in uh and um. A high degree of within- and between-speaker variation is found on the individual speaker level.

14 pages, 1891 KiB  
Article
Occurrences and Durations of Filled Pauses in Relation to Words and Silent Pauses in Spontaneous Speech
by Mária Gósy
Languages 2023, 8(1), 79; https://doi.org/10.3390/languages8010079 - 9 Mar 2023
Cited by 2 | Viewed by 3899
Abstract
Filled pauses (i.e., gaps in speech production filled with non-lexical vocalizations) have been studied for more than sixty years in different languages. These studies utilize many different approaches to explore the origins, specific patterns, forms, incidents, positions, and functions of filled pauses. The present research examines the presence of filled pauses by considering the adjacent words and silent pauses that define their immediate positions as well as the influence of the immediate position on filled pause duration. The durations of 2450 filled pauses produced in 30 narratives were analyzed in terms of their incidence, immediate positions, neighboring silent pauses, and surrounding word types. The data obtained showed that filled pauses that were attached to a word on one side were the most frequent. Filled pauses occurring within a word and between two silent pauses were the longest of all. Hence, the durations of filled pauses were significantly influenced by the silent pauses occurring in their vicinity. The durations and occurrence of filled pauses did not differ when content or function words preceded the filled pause or followed it. These findings suggest that the incidence and duration of filled pauses as influenced by the neighboring words and silent pauses may be indicative of their information content, which is related to the processes of transforming ideas into grammatical structures.

18 pages, 4647 KiB  
Article
The Dance of Pauses in Poetry Declamation
by Plinio A. Barbosa
Languages 2023, 8(1), 76; https://doi.org/10.3390/languages8010076 - 8 Mar 2023
Cited by 2 | Viewed by 2390
Abstract
In poetry declamation, the appropriate use of prosody to cause pleasure is essential. Among the prosodic parameters, pause is one of the most effective to engage the listeners and provide them with a pleasant experience. The declamation of three poems in two varieties of Portuguese by ten Brazilian Portuguese (BP) speakers and ten European Portuguese (EP) speakers, balanced for gender, was used as a corpus for evaluating the degree of pleasantness by listeners from the same language variety. The distributions of pause duration and inter-pause interval (IPI) both varied greatly across the subjects, being the main source of variability and strongly right-tailed. The evaluation of the degree of pleasantness revealed that pause duration predicts degree of pleasantness in EP, whereas IPI predicts degree of pleasantness in BP. Reciters perform a kind of complex “dance”, where sonority between pauses is favored in BP and pause duration in EP.

24 pages, 1774 KiB  
Article
Cognitive Load Increases Spoken and Gestural Hesitation Frequency
by Simon Betz, Nataliya Bryhadyr, Olcay Türk and Petra Wagner
Languages 2023, 8(1), 71; https://doi.org/10.3390/languages8010071 - 2 Mar 2023
Cited by 7 | Viewed by 3131
Abstract
This study investigates the interplay of spoken and gestural hesitations under varying amounts of cognitive load. We argue that not only fillers and silences, as the most common hesitations, are directly related to speech pausing behavior, but that hesitation lengthening is as well. We designed a resource-management card game as a method to elicit ecologically valid pausing behavior while being able to finely control cognitive load via card complexity. The method very successfully elicits large amounts of hesitations. Hesitation frequency increases as a function of cognitive load. This is true for both spoken and gestural hesitations. We conclude that the method presented here is a versatile tool for future research and we present foundational research on the speech-gesture link related to hesitations induced by controllable cognitive load.

27 pages, 3419 KiB  
Article
Defining Filler Particles: A Phonetic Account of the Terminology, Form, and Grammatical Classification of “Filled Pauses”
by Malte Belz
Languages 2023, 8(1), 57; https://doi.org/10.3390/languages8010057 - 16 Feb 2023
Cited by 6 | Viewed by 3069
Abstract
The terms hesitation, planner, filler, and filled pause do not always refer to the same phonetic entities. This terminological conundrum is approached by investigating the observational, explanatory, and descriptive inadequacies of the terms in use. Concomitantly, the term filler particle is motivated and a definition is proposed that identifies its phonetic exponents and describes them within the linguistic category of particles. The definition of filler particles proposed here is grounded both theoretically and empirically and then applied to a corpus of spontaneous dialogues with 32 speakers of German, showing that in addition to the prototypical phonetic forms, there is a substantial amount of non-prototypical forms, i.e., 9.5%, comprising both glottal (e.g., [Ɂ]) and vocal forms (e.g., [ɛɸ], [j~ɛvə]). The grammatical classification and the results regarding the phonetic forms are discussed with respect to their theoretical relevance in filler particle research and corpus studies. The phonetic approach taken here further suggests a continuum of phonetic forms of filler particles, ranging from singleton segments to multi-syllabic entities.

44 pages, 582 KiB  
Article
Hesitations in Primary Progressive Aphasia
by Lorraine Baqué and María Jesús Machuca
Languages 2023, 8(1), 45; https://doi.org/10.3390/languages8010045 - 1 Feb 2023
Cited by 3 | Viewed by 2237
Abstract
Hesitations are often used by speakers in spontaneous speech not only to organise and prepare their speech but also to address any obstacles that may arise during delivery. Given the relationship between hesitation phenomena and motor and/or cognitive–linguistic control deficits, characterising the form of hesitation could be potentially useful in diagnosing specific speech and language disorders, such as primary progressive aphasia (PPA). This work aims to analyse the features of hesitations in patients with PPA compared to healthy speakers, with hesitations understood here as those related to speech planning, that is, silent or empty pauses, filled pauses, and lengthened syllables. Forty-three adults took part in this experiment, of whom thirty-two suffered from some form of PPA: thirteen from logopenic PPA (lvPPA), ten from nonfluent PPA (nfvPPA), and nine from semantic PPA (svPPA). The remaining 11 were healthy speakers who served as a control group. An analysis of audio data recorded when participants produced spontaneous speech for a picture description task showed that the frequency of silent pauses, especially those classified as long (>1000 ms) was particularly useful to distinguish PPA participants from healthy controls and also to differentiate among PPA types. This was also true, albeit to a lesser extent, of the frequency of filled pauses and lengthened syllables.

24 pages, 1364 KiB
Article
Pause Length and Differences in Cognitive State Attribution in Native and Non-Native Speakers
by Theresa Matzinger, Michael Pleyer and Przemysław Żywiczyński
Languages 2023, 8(1), 26; https://doi.org/10.3390/languages8010026 - 13 Jan 2023
Viewed by 7904
Abstract
Speech pauses between turns of conversations are crucial for assessing conversation partners’ cognitive states, such as their knowledge, confidence and willingness to grant requests; in general, speakers making longer pauses are regarded as less apt and willing. However, it is unclear if the interpretation of pause length is mediated by the accent of interactants, in particular native versus non-native accents. We hypothesized that native listeners are more tolerant towards long pauses made by non-native speakers than those made by native speakers. This is because, in non-native speakers, long pauses might be the result of prolonged cognitive processing when planning an answer in a non-native language rather than of a lack of knowledge, confidence or willingness. Our experiment, in which 100 native Polish-speaking raters rated native and non-native speakers of Polish on their knowledge, confidence and willingness, showed that this hypothesis was confirmed for perceived willingness only; non-native speakers were regarded as equally willing to grant requests, irrespective of their inter-turn pause durations, whereas native speakers making long pauses were regarded as less willing than those making short pauses. For knowledge and confidence, we did not find a mediating effect of accent; both native and non-native speakers were rated as less knowledgeable and confident when making long pauses. One possible reason for the difference between our findings on perceived willingness to grant requests versus perceived knowledge and confidence is that requests might be more socially engaging and more directly relevant for interpersonal cooperative interactions than knowledge that reflects on partners’ competence but not cooperativeness.
Overall, our study shows that (non-)native accents can influence which cognitive states are signaled by different pause durations, which may have important implications for intercultural communication settings where topics are negotiated between native and non-native speakers.

18 pages, 5308 KiB  
Article
Occurrence and Duration of Pauses in Relation to Speech Tempo and Structural Organization in Two Speech Genres
by Pavel Šturm and Jan Volín
Languages 2023, 8(1), 23; https://doi.org/10.3390/languages8010023 - 11 Jan 2023
Cited by 4 | Viewed by 2948
Abstract
Pauses act as important acoustic cues to prosodic phrase boundaries. However, the distribution and phonetic characteristics of pauses have not yet been fully described either cross-linguistically or in different genres and speech styles within languages. The current study examines the pausal performance of 24 Czech speakers in two genres of read speech: news reading and poetry reciting. The pause rate and pause duration are related to genre differences, overt and covert text organization, and speech tempo. We found a significant effect of several levels of text organization, including a strong effect of punctuation. This was reflected in both measures of pausal performance. A grammatically informed analysis of a subset of pauses within the smallest units revealed a significant contribution for pause rate only. An effect of tempo was found in poetry reciting at a macro level (speaker averages) but not when pauses were observed individually. Genre differences did not manifest consistently and analogically for the two measures. The findings provide evidence that pausing is used systematically by speakers in read speech to convey not only prosodic phrasing but also text structure, among other things.