Article

Generation of Head Movements of a Robot Using Multimodal Features of Peer Participants in Group Discussion Conversation

1 Faculty of Informatics, The University of Fukuchiyama, Fukuchiyama, Kyoto 620-0886, Japan
2 Center for Advanced Intelligence Project, RIKEN, Kyoto, Kyoto 606-8501, Japan
3 Graduate School of Informatics, Kyoto University, Kyoto, Kyoto 606-8501, Japan
4 College of Information Science and Engineering, Ritsumeikan University, Kusatsu, Shiga 525-8577, Japan
* Author to whom correspondence should be addressed.
Multimodal Technol. Interact. 2020, 4(2), 15; https://doi.org/10.3390/mti4020015
Received: 26 May 2019 / Revised: 23 September 2019 / Accepted: 15 October 2019 / Published: 29 April 2020
(This article belongs to the Special Issue Multimodal Conversational Interaction and Interfaces)
In recent years, companies have come to expect strong communication skills from their employees, and an increasing number of them use group discussions in their recruitment process to evaluate applicants’ communication skills. However, opportunities to practice group discussion are limited by the lack of practice partners. To address this issue, the long-term goal of this study is to build an autonomous robot that can participate in group discussions so that users can practice with it repeatedly. Such a robot has to perform humanlike behaviors with which users can interact. This study focuses on the generation of two of these behaviors involving the robot’s head. The first is directing its attention to one of the following targets: the other participants or the materials placed on the table. The second is determining the timing of the robot’s nods. These generation models are considered in three situations: when the robot is speaking, when the robot is listening, and when no participant, including the robot, is speaking. The research question is whether these behaviors can be generated end-to-end solely from the features of the peer participants. This work is based on a data corpus containing 2.5 h of discussion sessions from 10 four-person groups. Multimodal features extracted from the corpus, including the attention of the other participants, voice prosody, head movements, and speech turns, were used to train support vector machine models for the generation of the two behaviors. The attentional-focus models achieved F-measures between 0.4 and 0.6, and the nodding model achieved an accuracy of approximately 0.65; both experiments were conducted with leave-one-subject-out cross-validation. To measure the perceived naturalness of the generated behaviors, a subjective evaluation experiment was conducted in which the proposed data-driven models were compared with two baselines: (1) a simple statistical model based on behavior frequency and (2) the raw experimental data. The evaluation was based on the observation of video clips in which one of the participants was replaced by a robot performing head movements under the three conditions described above. The experimental results showed no significant difference from the original human behaviors in the data corpus, demonstrating the effectiveness of the proposed models.
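The evaluation setup described in the abstract can be illustrated with a short sketch. This is not the authors' code: the feature and label arrays below are synthetic placeholders standing in for the multimodal features (attention, prosody, head movements, speech turns) and behavior labels, but the model (an SVM) and the leave-one-subject-out protocol match what the abstract describes.

```python
# Sketch of SVM training with leave-one-subject-out cross-validation,
# as described in the abstract. Data here is randomly generated.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))            # placeholder multimodal feature vectors
y = rng.integers(0, 2, size=200)         # placeholder binary label, e.g. nod vs. no nod
subjects = np.repeat(np.arange(10), 20)  # subject ID for each sample (10 subjects)

# Each fold holds out every sample from one subject, so the model is
# always evaluated on a person it has never seen during training.
logo = LeaveOneGroupOut()
clf = SVC(kernel="rbf")
scores = cross_val_score(clf, X, y, cv=logo, groups=subjects, scoring="f1")
print(len(scores), scores.mean())        # one F-measure per held-out subject
```

With 10 subjects, `LeaveOneGroupOut` yields 10 folds, giving one F-measure per held-out person; the paper's reported range of 0.4 to 0.6 would correspond to the spread of such per-subject scores.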
Keywords: robot; gaze; visual focus of attention; nod; multiparty interaction; machine learning; support vector machine
MDPI and ACS Style

Huang, H.-H.; Kimura, S.; Kuwabara, K.; Nishida, T. Generation of Head Movements of a Robot Using Multimodal Features of Peer Participants in Group Discussion Conversation. Multimodal Technol. Interact. 2020, 4, 15. https://doi.org/10.3390/mti4020015

