Subjective Evaluation of Generative AI-Driven Dialogues in Paired Dyadic and Topic-Sharing Triadic Interaction Structures
Abstract
Featured Application
1. Introduction
2. Methods
2.1. Dialogue System
2.2. Settings of Dyadic and Triadic Interactions
- In the dyadic interaction (between the system and the subject) setting, the system responds to what the subject says. If the subject goes off-topic, the system follows the subject. For example, if the subject asks, “What do you like?”, the system responds, “One of my absolute favorite hobbies is traveling. Exploring new places and experiencing different cultures brings me so much joy!”
- In the dyadic interaction (between the system and the topic) setting, the system reads out prepared sentences one at a time and does not respond to what the subject says. For example, if the subject asks, “What do you like?”, the system responds, “Both oranges and apples have their own wonderful qualities!”
- In the triadic interaction (between the system, the subject, and the topic) setting, the system responds topically to what the subject says. If the subject goes off-topic, the system continues to talk about the topic and sometimes points out that the subject has gone off-topic. For example, if the subject asks, “What do you like?”, the system responds, “I enjoy both oranges and apples, but my choice depends on my mood or the season. They’re both nutritious and delicious fruits, don’t you think?”
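The three settings differ only in how the next system utterance is produced, which can be sketched as a small dispatch function. This is a minimal illustration, not the authors' implementation: `generate` is a hypothetical callable standing in for the generative AI model, `prepared` is a hypothetical iterator over the prepared sentences, and the way the topic and the subject input are appended to the prompt is an assumption. Only the instruction sentences are taken from the processing prompts listed in the dialogue-flow table.

```python
from enum import Enum
from typing import Callable, Iterator


class Setting(Enum):
    DYADIC_WITH_SUBJECT = "dyadic (with subject)"
    DYADIC_WITH_TOPIC = "dyadic (with topic)"
    TRIADIC = "triadic"


def build_prompt(setting: Setting, subject_input: str, topic: str) -> str:
    """Build the response prompt for the two settings that call the model."""
    if setting is Setting.DYADIC_WITH_SUBJECT:
        # The system follows the subject, so the topic is left out of the prompt.
        return (
            "Respond to a subject input in a polite tone with approximately "
            f"25 words.\nSubject input: {subject_input}"  # layout is assumed
        )
    if setting is Setting.TRIADIC:
        # The system stays on the topic while still reacting to the subject.
        return (
            "Respond topically to a subject input in a polite tone of "
            f"approximately 25 words on a topic.\nTopic: {topic}\n"
            f"Subject input: {subject_input}"  # layout is assumed
        )
    raise ValueError("the dyadic (with topic) setting does not use a prompt")


def next_utterance(
    setting: Setting,
    subject_input: str,
    topic: str,
    generate: Callable[[str], str],  # hypothetical model call
    prepared: Iterator[str],         # hypothetical prepared sentences
) -> str:
    """Return the next system utterance for the given setting."""
    if setting is Setting.DYADIC_WITH_TOPIC:
        # The subject input is ignored; the next prepared sentence is read out.
        return next(prepared)
    return generate(build_prompt(setting, subject_input, topic))
```

In this sketch the only difference between the dyadic (with the subject) and triadic settings is whether the topic enters the prompt, which mirrors the described behavior when the subject goes off-topic.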
2.3. Topic
2.4. General Algorithms
2.5. Questionnaire
2.6. Subjects
2.7. Experimental Procedure
- The subject gives the system any voice input, such as “hello”.
- The dialogue system announces a topic and starts the dialogue.
- After five minutes, the system announces the end of the dialogue and asks the subject to fill out the post-dialogue questionnaire.
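The procedure above amounts to a time-limited turn-taking loop. The following is a minimal sketch under stated assumptions: `listen`, `respond`, and `announce` are hypothetical callables for speech input, response generation, and speech output, and the time limit is checked only between turns, so a turn already in progress is completed. The source does not say how the five-minute limit is enforced.

```python
import time
from typing import Callable


def run_session(
    listen: Callable[[], str],        # hypothetical: returns recognized speech
    respond: Callable[[str], str],    # hypothetical: produces the system reply
    announce: Callable[[str], None],  # hypothetical: reads a text out loud
    topic_phrase: str,
    end_phrase: str,
    duration_s: float = 300.0,        # five minutes
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Run one dialogue session and return the number of completed turns."""
    listen()                # step 1: any voice input starts the session
    announce(topic_phrase)  # step 2: the system announces the topic
    start = clock()
    turns = 0
    # Assumption: elapsed time is checked only at the start of each turn.
    while clock() - start < duration_s:
        subject_input = listen()
        announce(respond(subject_input))
        turns += 1
    announce(end_phrase)    # step 3: end of dialogue, questionnaire request
    return turns
```

Passing the clock as a parameter keeps the loop testable without waiting for real time to elapse.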
3. Results
3.1. The Dialogue Evaluation
3.2. The Dialogue Evaluation from High-Rating Groups in Dyadic and Triadic Settings
3.3. The Dialogue System Evaluation
4. Discussion
- In the dyadic (with the topic) setting, the system's structure does not include the subject, as shown in Figure 4, and so matches no structure to which the subject belongs. Structure construction between the system and the subject therefore fails. In the experiment, the system ignored the subjects and talked one-sidedly. Accordingly, the subjects evaluated the dialogues negatively and commented that dialogue was impossible.
- When the subject is oriented toward dyadic interaction, the subject intends to construct the structure shown in Figure 9. This structure matches the system's structure in the dyadic (with the subject) setting, as shown in Figure 4, so construction succeeds and the two can interact well. The system's structure in the triadic setting, however, extends beyond the structure the subject intends. In this case, structure construction fails because the system acts in ways the subject does not want. In the experiment, the subjects in the high-rating group for the dyadic (with the subject) setting were presumably oriented toward dyadic interaction. They evaluated the dialogue in the dyadic (with the subject) setting positively and the dialogue in the triadic setting negatively. Their comments on the triadic setting expressed discomfort with the system's inflexibility in persisting with the topic and in pressing them to say what the system wanted. They evaluated the system in the triadic setting as an untrustworthy dialogue partner that was awkward and inhuman.
- When the subject is oriented toward triadic interaction, the subject intends to construct the same triadic structure, as shown in Figure 4. This structure includes the system's structure in the dyadic (with the subject) setting and matches its structure in the triadic setting. When the system's structure is included in the subject's, construction succeeds even though something is lacking for the subject, and the two can interact well to some extent. In the experiment, the subjects in the high-rating group for the triadic setting were presumably oriented toward triadic interaction. They did not evaluate the dialogue in the dyadic (with the subject) setting negatively, and they evaluated the dialogue in the triadic setting positively. They evaluated the system in the triadic setting as a trustworthy dialogue partner that listened well.
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Dialogue Progression | Dyadic Interaction (with Subject) | Dyadic Interaction (with Topic) | Triadic Interaction
---|---|---|---
Start | (All settings) A neutral facial expression image is displayed, and a standard phrase is announced, e.g., “Hello. The first topic is which is a better fruit, orange or apple?” | |
Subject turn | (All settings) Some voice input, e.g., “I like an orange. I think, an orange is better.” | |
Processing | Prompt for a response: Respond to a subject input in a polite tone with approximately 25 words. Prompt for emotional evaluation: Rate happiness and sadness on a scale of 100 about “a subject input and a generated response”. The difference between the happiness and sadness ratings is converted to five levels and used to select a face image. | All response texts and face image selections are prepared in advance. The generated sentences about the topics are cut into appropriate lengths, and face images are selected in the same way as in the other settings, except that no subject input is used. | Prompt for a response: Respond topically to a subject input in a polite tone of approximately 25 words on a topic. Prompt for emotional evaluation: Rate happiness and sadness on a scale of 100 about “a subject input and a generated response”. The difference between the happiness and sadness ratings is converted to five levels and used to select a face image.
System turn | (All settings) A selected face image is displayed, and a response text is read out. | |
Repeat | (All settings) Subject turn, Processing, and System turn are repeated for 5 min. | |
End | (All settings) A standard phrase is announced: “Thank you. That’s all for this topic. Please fill out the questionnaire. Please talk to me after you’ve finished.” | |
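The Processing row states that the difference between the happiness and sadness ratings is converted to five levels to select a face image, but it does not give the thresholds. The sketch below assumes ratings in the range 0 to 100 and five equal-width bins over the difference range of -100 to 100, with level 1 as the saddest face and level 3 as neutral. The function name and the binning are illustrative assumptions, not the authors' mapping.

```python
def face_level(happiness: float, sadness: float, levels: int = 5) -> int:
    """Map (happiness - sadness) to a face level in 1..levels (1 = saddest)."""
    if not (0 <= happiness <= 100 and 0 <= sadness <= 100):
        raise ValueError("ratings must lie in [0, 100]")
    diff = happiness - sadness          # lies in [-100, 100]
    width = 200 / levels                # equal-width bins (assumption)
    index = int((diff + 100) // width)  # 0..levels; equals levels only at +100
    return min(index, levels - 1) + 1   # clamp the top edge into the last bin


# Equal ratings fall in the middle bin, i.e., the neutral face.
print(face_level(50, 50))
```

With these assumed bins, a clearly happy exchange such as `face_level(100, 0)` selects the highest level and `face_level(0, 100)` the lowest.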
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Abe, K.; Quan, C.; Cao, S.; Luo, Z. Subjective Evaluation of Generative AI-Driven Dialogues in Paired Dyadic and Topic-Sharing Triadic Interaction Structures. Appl. Sci. 2025, 15, 5092. https://doi.org/10.3390/app15095092