Open Access Article
Multimodal Technologies Interact. 2018, 2(1), 9; doi:10.3390/mti2010009

Interactive Hesitation Synthesis: Modelling and Evaluation

Cluster of Excellence Cognitive Interaction Technology (CITEC), Bielefeld University, 33615 Bielefeld, Germany
Phonetics and Phonology Workgroup, Faculty of Linguistics and Literary Studies, Bielefeld University, 33615 Bielefeld, Germany
Dialogue Systems Group, Faculty of Linguistics and Literary Studies, Bielefeld University, 33615 Bielefeld, Germany
Applied Informatics Group, Faculty of Technology, Bielefeld University, 33615 Bielefeld, Germany
Author to whom correspondence should be addressed.
Received: 8 December 2017 / Revised: 15 February 2018 / Accepted: 26 February 2018 / Published: 2 March 2018
(This article belongs to the Special Issue Situated Speech Synthesis: Beyond Text-to-Waveform Mapping)


Conversational spoken dialogue systems that interact with the user rather than merely reading the text can be equipped with hesitations to manage dialogue flow and user attention. Based on a series of empirical studies, we elaborated a hesitation synthesis strategy for dialogue systems, which inserts hesitations of a scalable extent wherever needed in the ongoing utterance. Previously, evaluations of hesitation systems have shown that synthesis quality is affected negatively by hesitations, but that they result in improvements of interaction quality. We argue that due to its conversational nature, hesitation synthesis needs interactive evaluation rather than traditional mean opinion score (MOS)-based questionnaires. To validate this claim, we dually evaluate our system’s speech synthesis component, on the one hand, linked to the dialogue system evaluation, and on the other hand, in a traditional MOS way. We are thus able to analyze and discuss differences that arise due to the evaluation methodology. Our results suggest that MOS scales are not sufficient to assess speech synthesis quality, leading to implications for future research that are discussed in this paper. Furthermore, our results indicate that synthetic hesitations are able to increase task performance and that an elaborated hesitation strategy is necessary to avoid likability issues.
Keywords: speech synthesis; evaluation; hesitation; virtual agents; interaction

This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


MDPI and ACS Style

Betz, S.; Carlmeyer, B.; Wagner, P.; Wrede, B. Interactive Hesitation Synthesis: Modelling and Evaluation. Multimodal Technologies Interact. 2018, 2, 9.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Multimodal Technologies Interact. EISSN 2414-4088. Published by MDPI AG, Basel, Switzerland.