Open Access Article

Prediction of Who Will Be Next Speaker and When Using Mouth-Opening Pattern in Multi-Party Conversation

1 NTT Media Intelligence Laboratories, NTT Corporation, 1-1, Hikarinooka, Yokosuka-shi, Kanagawa 239-0847, Japan
2 NTT Communication Science Laboratories, NTT Corporation, 3-1, Morinosato Wakamiya, Atsugi-shi, Kanagawa 243-0198, Japan
* Author to whom correspondence should be addressed.
This paper extends the content of our earlier conference paper.
Multimodal Technologies Interact. 2019, 3(4), 70; https://doi.org/10.3390/mti3040070
Received: 11 September 2019 / Accepted: 15 October 2019 / Published: 26 October 2019
(This article belongs to the Special Issue Multimodal Conversational Interaction and Interfaces)
We investigated the mouth-opening transition pattern (MOTP), which represents the change in the degree of mouth opening toward the end of an utterance, and used it to predict the next speaker and the utterance interval in multi-party conversation, where the utterance interval is the time between the end of the current speaker’s utterance and the start of the next speaker’s utterance. We first collected verbal and nonverbal data from four-person conversations, including speech and manually annotated degrees of mouth opening (closed, narrow-open, wide-open) for each participant. A key finding of the MOTP analysis is that the current speaker often keeps her mouth narrow-open during turn-keeping, whereas during turn-changing she often starts to close it after opening it narrowly or continues to open it widely. The next speaker often starts to open her mouth narrowly after closing it during turn-changing. Moreover, when the current speaker starts to close her mouth after opening it narrowly in turn-keeping, the utterance interval tends to be short. In contrast, when the current speaker and the listeners open their mouths narrowly after opening them narrowly and then widely, the utterance interval tends to be long. On the basis of these results, we implemented models that predict the next speaker and the utterance interval using MOTPs. As a multimodal-feature fusion, we also implemented models that combine MOTPs with eye-gaze behavior, which our previous study found to be one of the most useful sources of information for predicting the next speaker and the utterance interval. The evaluation results suggest that the MOTPs of the current speaker and listeners are effective for predicting the next speaker and the utterance interval in multi-party conversation, and that the multimodal-feature fusion model using both MOTPs and eye-gaze behavior is more useful for these predictions than a model using only one or the other.
Keywords: mouth movement; multi-party conversation; next-speaker prediction; utterance interval prediction; turn-changing; eye-gaze behavior
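
To make the two quantities defined in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that mouth opening is available as one label per time frame near the end of an utterance, and that an MOTP can be approximated by merging consecutive identical labels so that only the changes of degree remain. The label strings, the function names, and the example timestamps are all hypothetical; the abstract does not specify the encoding or the analysis window.

```python
from itertools import groupby

# Hypothetical label strings for the three annotated degrees of mouth opening.
CLOSED, NARROW, WIDE = "closed", "narrow-open", "wide-open"


def mouth_opening_transition_pattern(frames):
    """Collapse per-frame mouth-opening labels into a transition pattern.

    Consecutive repeats are merged, so only the changes of degree remain.
    This is an assumed simplification of the MOTP, not the paper's definition.
    """
    return tuple(label for label, _ in groupby(frames))


def utterance_interval(current_end, next_start):
    """Time from the end of the current utterance to the start of the next."""
    return next_start - current_end


# Illustrative (made-up) annotation: narrow, then wide, then narrow, then closed.
frames = [NARROW, NARROW, WIDE, WIDE, NARROW, CLOSED, CLOSED]
pattern = mouth_opening_transition_pattern(frames)

# Illustrative (made-up) timestamps in seconds.
interval = utterance_interval(current_end=12.40, next_start=12.95)
```

Under this sketch, a speaker who keeps her mouth narrow-open for the whole window yields a one-element pattern, while the longer patterns described above yield multi-element tuples that could serve as categorical features for a classifier (next speaker) or a regressor (utterance interval).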
MDPI and ACS Style

Ishii, R.; Otsuka, K.; Kumano, S.; Higashinaka, R.; Tomita, J. Prediction of Who Will Be Next Speaker and When Using Mouth-Opening Pattern in Multi-Party Conversation. Multimodal Technologies Interact. 2019, 3, 70.
