Search Results (33)

Search Parameters:
Keywords = embodied conversational agents

21 pages, 1662 KB  
Article
Controllable Speech-Driven Gesture Generation with Selective Activation of Weakly Supervised Controls
by Karlo Crnek and Matej Rojc
Appl. Sci. 2025, 15(17), 9467; https://doi.org/10.3390/app15179467 - 28 Aug 2025
Viewed by 631
Abstract
Generating realistic and contextually appropriate gestures is crucial for creating engaging embodied conversational agents. Although speech is the primary input for gesture generation, adding controls like gesture velocity, hand height, and emotion is essential for generating more natural, human-like gestures. However, current approaches to controllable gesture generation often utilize a limited number of control parameters and lack the ability to activate/deactivate them selectively. Therefore, in this work, we propose the Cont-Gest model, a Transformer-based gesture generation model that enables selective control activation through masked training and a control fusion strategy. Furthermore, to better support the development of such models, we propose a novel evaluation-driven development (EDD) workflow, which combines several iterative tasks: automatic control signal extraction, control specification, visual (subjective) feedback, and objective evaluation. This workflow enables continuous monitoring of model performance and facilitates iterative refinement through feedback-driven development cycles. For objective evaluation, we use the validated Kinetic–Hellinger distance, an objective metric that correlates strongly with the human perception of gesture quality. We evaluated multiple model configurations and control dynamics strategies within the proposed workflow. Experimental results show that Feature-wise Linear Modulation (FiLM) conditioning, combined with single-mask training and voice activity scaling, achieves the best balance between gesture quality and adherence to control inputs. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
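Neither the Cont-Gest architecture nor the Kinetic–Hellinger implementation is given in this abstract, but the two named building blocks are standard. The sketch below is illustrative only: the tensor shapes, control names, per-sample control mask, and near-identity initialisation are all assumptions, not details from the paper. It shows FiLM conditioning (a per-channel scale and shift predicted from a control vector) and the Hellinger distance between two histograms, which is the core of a Kinetic–Hellinger-style metric over motion-speed distributions.

```python
import numpy as np

rng = np.random.default_rng(0)

def film(x, cond, W_g, b_g, W_b, b_b):
    """Feature-wise Linear Modulation: scale and shift each feature
    channel with parameters predicted linearly from the conditioning
    vector (gamma = cond @ W_g + b_g, beta = cond @ W_b + b_b)."""
    gamma = cond @ W_g + b_g
    beta = cond @ W_b + b_b
    return gamma * x + beta

def hellinger(p, q):
    """Hellinger distance between two histograms (normalised to sum 1):
    0 for identical distributions, 1 for fully disjoint ones."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p, q = p / p.sum(), q / q.sum()
    return float(np.sqrt(0.5 * ((np.sqrt(p) - np.sqrt(q)) ** 2).sum()))

# --- demo with hypothetical shapes ---
batch, channels, n_controls = 2, 8, 3      # e.g. velocity, height, emotion
x = rng.standard_normal((batch, channels))       # speech-derived features
controls = rng.standard_normal((batch, n_controls))
mask = np.array([[1.0, 0.0, 1.0],                # selectively deactivate
                 [1.0, 1.0, 0.0]])               # controls per sample
cond = controls * mask

W_g = rng.standard_normal((n_controls, channels)) * 0.1
b_g = np.ones(channels)    # near-identity init: gamma ~ 1, beta ~ 0
W_b = rng.standard_normal((n_controls, channels)) * 0.1
b_b = np.zeros(channels)

y = film(x, cond, W_g, b_g, W_b, b_b)
print(y.shape)  # (2, 8)

# with every control masked out, the identity-initialised FiLM is a no-op
assert np.allclose(film(x, np.zeros_like(cond), W_g, b_g, W_b, b_b), x)
```

The mask makes deactivation explicit: a zeroed control contributes nothing to gamma or beta, so the corresponding modulation falls back toward the identity.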

34 pages, 1952 KB  
Article
Using Large Language Models to Embed Relational Cues in the Dialogue of Collaborating Digital Twins
by Sana Salman and Deborah Richards
Systems 2025, 13(5), 353; https://doi.org/10.3390/systems13050353 - 6 May 2025
Viewed by 1320
Abstract
Embodied Conversational Agents (ECAs) serve as digital twins (DTs), visually and behaviorally mirroring human counterparts in various roles, including healthcare coaching. While existing research primarily focuses on single-coach ECAs, our work explores the benefits of multi-coach virtual health sessions, where users engage with specialized diet, physical, and cognitive coaches simultaneously. ECAs require verbal relational cues—such as empowerment, affirmation, and empathy—to foster user engagement and adherence. Our study integrates Generative AI to automate the embedding of these cues into coaching dialogues, ensuring the advice remains unchanged while enhancing delivery. We employ ChatGPT to generate empathetic and collaborative dialogues, comparing their effectiveness against manually crafted alternatives. Using three participant cohorts, we analyze user perception of the helpfulness of AI-generated versus human-generated relational cues. Additionally, we investigate whether AI-generated dialogues preserve the original advice’s semantics and whether human or automated validation better evaluates their lexical meaning. Our findings contribute to the automation of digital health coaching. Comparing ChatGPT- and human-generated dialogues for helpfulness, users rated human dialogues as more helpful, particularly for working alliance and affirmation cues, whereas AI-generated dialogues were equally effective for empowerment. By refining relational cues in AI-generated dialogues, this research paves the way for automated virtual health coaching solutions. Full article

20 pages, 4055 KB  
Article
An Efficient Gaze Control System for Kiosk-Based Embodied Conversational Agents in Multi-Party Conversations
by Sunghun Jung, Junyeong Kum and Myungho Lee
Electronics 2025, 14(8), 1592; https://doi.org/10.3390/electronics14081592 - 15 Apr 2025
Viewed by 1116
Abstract
The adoption of kiosks in public spaces is steadily increasing, with a trend toward providing more natural user experiences through embodied conversational agents (ECAs). To achieve human-like interactions, ECAs should be able to appropriately gaze at the speaker. However, kiosks in public spaces often face challenges, such as ambient noise and overlapping speech from multiple people, making it difficult to accurately identify the speaker and direct the ECA’s gaze accordingly. In this paper, we propose a lightweight gaze control system that is designed to operate effectively within the resource constraints of kiosks and the noisy conditions common in public spaces. We first developed a speaker detection model that identifies the active speaker in challenging noise conditions using only a single camera and microphone. The proposed model achieved a 91.6% mean Average Precision (mAP) in active speaker detection and a 0.6% improvement over the state-of-the-art lightweight model (Light ASD) (as evaluated on the noise-augmented AVA-Speaker Detection dataset), while maintaining real-time performance. Building on this, we developed a gaze control system for ECAs that detects the dominant speaker in a group and directs the ECA’s gaze toward them using an algorithm inspired by real human turn-taking behavior. To evaluate the system’s performance, we conducted a user study with 30 participants, comparing the system to a baseline condition (i.e., a fixed forward gaze) and a human-controlled gaze. The results showed statistically significant improvements in social/co-presence and gaze naturalness compared to the baseline, with no significant difference between the system and human-controlled gazes. This suggests that our system achieves a level of social presence and gaze naturalness comparable to a human-controlled gaze. The participants’ feedback, which indicated no clear distinction between human- and model-controlled conditions, further supports the effectiveness of our approach. Full article
(This article belongs to the Special Issue AI Synergy: Vision, Language, and Modality)
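The 91.6% figure above is a mean Average Precision (mAP), a standard detection metric: rank all detections by confidence, sweep the ranked list, and integrate precision over recall, then average over classes. The sketch below is not the authors' evaluation code; it is a minimal, uninterpolated (rectangle-rule) version with illustrative scores and labels.

```python
def average_precision(scores, labels):
    """Single-class average precision: sort detections by confidence,
    sweep the ranked list, and accumulate precision at each recall
    step (rectangle rule, no interpolation)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    total_pos = sum(labels)
    tp = fp = 0
    ap = prev_recall = 0.0
    for i in order:
        if labels[i]:
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)
        recall = tp / total_pos
        ap += precision * (recall - prev_recall)  # area under PR curve
        prev_recall = recall
    return ap

def mean_average_precision(per_class):
    """mAP: mean of per-class APs; per_class is a list of
    (scores, labels) pairs, one pair per class."""
    return sum(average_precision(s, l) for s, l in per_class) / len(per_class)

# a perfect ranking (all positives above all negatives) scores AP = 1.0
print(average_precision([0.9, 0.8, 0.1], [1, 1, 0]))  # 1.0
```

Benchmark implementations usually interpolate precision (e.g. taking the maximum precision to the right of each recall point), so published numbers can differ slightly from this raw sum.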

18 pages, 2982 KB  
Article
The Development of an Emotional Embodied Conversational Agent and the Evaluation of the Effect of Response Delay on User Impression
by Simon Christophe Jolibois, Akinori Ito and Takashi Nose
Appl. Sci. 2025, 15(8), 4256; https://doi.org/10.3390/app15084256 - 11 Apr 2025
Viewed by 3933
Abstract
Embodied conversational agents (ECAs) are autonomous interaction interfaces designed to communicate with humans. This study investigates the impact of response delays and emotional facial expressions of ECAs on user perception and engagement. The motivation for this study stems from the growing integration of ECAs in various sectors, where their ability to mimic human-like interactions significantly enhances user experience. To this end, we developed an ECA with multimodal emotion recognition, using both voice and facial features, and with emotional facial expressions for the agent avatar. The system generates answers in real time based on media content. The development was supported by a case study of artwork images, with the agent playing the role of a museum curator and the user asking the agent for information on the artwork. We evaluated the developed system in two aspects. First, we investigated how the delay in an agent’s responses influences user satisfaction and perception. Secondly, we explored the role of emotion in an ECA’s face in shaping the user’s perception of responsiveness. The results showed that a longer response delay negatively impacted the user’s perception of responsiveness when the ECA did not express emotion, while the emotional expression improved the responsiveness perception. Full article
(This article belongs to the Special Issue Human–Computer Interaction and Virtual Environments)

40 pages, 6363 KB  
Article
Learning and Evolution: Factors Influencing an Effective Combination
by Paolo Pagliuca
AI 2024, 5(4), 2393-2432; https://doi.org/10.3390/ai5040118 - 15 Nov 2024
Cited by 2 | Viewed by 1311
Abstract
(1) Background: The mutual relationship between evolution and learning is a controversial argument among the artificial intelligence and neuro-evolution communities. After more than three decades, there is still no common agreement on the matter. (2) Methods: In this paper, the author investigates whether combining learning and evolution permits finding better solutions than those discovered by evolution alone. In further detail, the author presents a series of empirical studies that highlight some specific conditions determining the success of such a combination. Results are obtained in five qualitatively different domains: (i) the 5-bit parity task, (ii) the double-pole balancing problem, (iii) the Rastrigin, Rosenbrock and Sphere optimization functions, (iv) a robot foraging task and (v) a social foraging problem. Moreover, the first three tasks represent benchmark problems in the field of evolutionary computation. (3) Results and discussion: The outcomes indicate that the effect of learning on evolution depends on the nature of the problem. Specifically, when the problem implies limited or absent agent–environment conditions, learning is beneficial for evolution, especially with the introduction of noise during the learning and selection processes. Conversely, when agents are embodied and actively interact with the environment, learning does not provide advantages, and the addition of noise is detrimental. Finally, the absence of stochasticity in the experienced conditions is paramount for the effectiveness of the combination. Furthermore, the length of the learning process must be fine-tuned based on the considered task. Full article

33 pages, 4031 KB  
Article
Support of Migrant Reception, Integration, and Social Inclusion by Intelligent Technologies
by Leo Wanner, Daniel Bowen, Marta Burgos, Ester Carrasco, Jan Černocký, Toni Codina, Jevgenijs Danilins, Steffi Davey, Joan de Lara, Eleni Dimopoulou, Ekaterina Egorova, Christine Gebhard, Jens Grivolla, Elena Jaramillo-Rojas, Matthias Klusch, Athanasios Mavropoulos, Maria Moudatsou, Artemisia Nikolaidou, Dimos Ntioudis, Irene Rodríguez, Mirela Rosgova, Yash Shekhawat, Alexander Shvets, Oleksandr Sobko, Grigoris Tzionis and Stefanos Vrochidis
Information 2024, 15(11), 686; https://doi.org/10.3390/info15110686 - 1 Nov 2024
Viewed by 1785
Abstract
Apart from being an economic struggle, migration is first of all a societal challenge; most migrants come from different cultural and social contexts, do not speak the language of the host country, and are not familiar with its societal, administrative, and labour market infrastructure. This leaves them in need of dedicated personal assistance during their reception and integration. However, due to the continuously high number of people in need of attendance, public administrations and non-governmental organizations are often overstrained by this task. The objective of the Welcome Platform is to address the most pressing needs of migrants. The Platform incorporates advanced Embodied Conversational Agent and Virtual Reality technologies to support migrants in the context of reception, integration, and social inclusion in the host country. It has been successfully evaluated in trials with migrants in three European countries in view of potentially deviating needs at the municipal, regional, and national levels, respectively: the City of Hamm in Germany, Catalonia in Spain, and Greece. The results show that intelligent technologies can be a valuable supplementary tool for reducing the workload of personnel involved in migrant reception, integration, and inclusion. Full article
(This article belongs to the Special Issue Advances in Human-Centered Artificial Intelligence)

20 pages, 1186 KB  
Article
Reengineering eADVICE for Long Waitlists: A Tale of Two Systems and Conditions
by Deborah Richards, Patrina H. Y. Caldwell, Amal Abdulrahman, Amy von Huben, Karen Waters and Karen M. Scott
Electronics 2024, 13(14), 2785; https://doi.org/10.3390/electronics13142785 - 16 Jul 2024
Cited by 1 | Viewed by 1347
Abstract
Long outpatient waiting times pose a significant global challenge in healthcare, impacting children and families with implications for health outcomes. This paper presents the eHealth system called eADVICE (electronic Advice and Diagnosis Via the Internet following Computerised Evaluation) that is designed to address waiting list challenges for paediatricians. Initially designed for children’s incontinence, the system’s success in terms of health goals and user experience led to its adaptation for paediatric sleep problems. This paper focuses on user experiences and the development of a working alliance with the virtual doctor, alongside health outcomes based on a randomised controlled trial (N = 239) for incontinence. When reengineering eADVICE to sleep disorders, the promising results regarding the reciprocal relationship between user experience and building a working alliance encouraged a focus on the further development of the embodied conversational agent (ECA) component. This involved tailoring the ECA discussion to patient cognition (i.e., beliefs and goals) to further improve engagement and outcomes. The proposed eADVICE framework facilitates adaptation across paediatric conditions, offering a scalable model to enhance access and self-efficacy during care delays. Full article
(This article belongs to the Special Issue Human-Computer Interactions in E-health)

20 pages, 860 KB  
Article
Exploring the Effectiveness of Evaluation Practices for Computer-Generated Nonverbal Behaviour
by Pieter Wolfert, Gustav Eje Henter and Tony Belpaeme
Appl. Sci. 2024, 14(4), 1460; https://doi.org/10.3390/app14041460 - 10 Feb 2024
Cited by 3 | Viewed by 1556
Abstract
This paper compares three methods for evaluating computer-generated motion behaviour for animated characters: two commonly used direct rating methods and a newly designed questionnaire. The questionnaire is specifically designed to measure the human-likeness, appropriateness, and intelligibility of the generated motion. Furthermore, this study investigates the suitability of these evaluation tools for assessing subtle forms of human behaviour, such as the subdued motion cues shown when listening to someone. This paper reports six user studies, namely studies that directly rate the appropriateness and human-likeness of a computer character’s motion, along with studies that instead rely on a questionnaire to measure the quality of the motion. As test data, we used the motion generated by two generative models and recorded human gestures, which served as a gold standard. Our findings indicate that when evaluating gesturing motion, the direct rating of human-likeness and appropriateness is to be preferred over a questionnaire. However, when assessing the subtle motion of a computer character, even the direct rating method yields less conclusive results. Despite demonstrating high internal consistency, our questionnaire proves to be less sensitive than directly rating the quality of the motion. The results provide insights into the evaluation of human motion behaviour and highlight the complexities involved in capturing subtle nuances in nonverbal communication. These findings have implications for the development and improvement of motion generation models and can guide researchers in selecting appropriate evaluation methodologies for specific aspects of human behaviour. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
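The abstract above notes that the questionnaire showed high internal consistency despite low sensitivity. A common measure of internal consistency for such questionnaires is Cronbach's alpha; the sketch below is a generic implementation with hypothetical Likert-scale ratings, not the paper's actual data or analysis code.

```python
import numpy as np

def cronbach_alpha(ratings):
    """Cronbach's alpha for a (n_respondents, n_items) score matrix:
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total score)),
    using sample variances (ddof=1)."""
    r = np.asarray(ratings, dtype=float)
    k = r.shape[1]
    item_vars = r.var(axis=0, ddof=1)          # variance of each item
    total_var = r.sum(axis=1).var(ddof=1)      # variance of the sum score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# hypothetical 5-point Likert responses: four raters, three items
ratings = [[4, 5, 4],
           [3, 3, 3],
           [5, 5, 4],
           [2, 2, 3]]
print(round(cronbach_alpha(ratings), 3))  # 0.916
```

Values above roughly 0.7–0.8 are conventionally read as acceptable internal consistency; as the study illustrates, high alpha alone does not guarantee the instrument is sensitive enough to discriminate between conditions.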

17 pages, 9159 KB  
Article
The Effect of Eye Contact in Multi-Party Conversations with Virtual Humans and Mitigating the Mona Lisa Effect
by Junyeong Kum, Sunghun Jung and Myungho Lee
Electronics 2024, 13(2), 430; https://doi.org/10.3390/electronics13020430 - 19 Jan 2024
Cited by 2 | Viewed by 2514
Abstract
The demand for kiosk systems with embodied conversational agents has increased with the development of artificial intelligence. There have been attempts to utilize non-verbal cues, particularly virtual human (VH) eye contact, to enable human-like interaction. Eye contact with VHs can affect satisfaction with the system and the perception of VHs. However, when rendered in 2D kiosks, the gaze direction of a VH can be incorrectly perceived, due to a lack of stereo cues. A user study was conducted to examine the effects of the gaze behavior of VHs in multi-party conversations in a 2D display setting. The results showed that looking at actual speakers affects the perceived interpersonal skills, social presence, attention, co-presence, and competence in conversations with VHs. In a second study, the gaze perception was further examined with consideration of the Mona Lisa effect, which can lead users to believe that a VH rendered on a 2D display is gazing at them, regardless of the actual direction, within a narrow range. We also proposed the camera rotation angle fine tuning (CRAFT) method to enhance the users’ perceptual accuracy regarding the direction of the VH’s gaze. The results showed that the perceptual accuracy for the VH gaze decreased in a narrow range and that CRAFT could increase the perceptual accuracy. Full article
(This article belongs to the Section Computer Science & Engineering)

21 pages, 8706 KB  
Article
Translating Virtual Prey-Predator Interaction to Real-World Robotic Environments: Enabling Multimodal Sensing and Evolutionary Dynamics
by Xuelong Sun, Cheng Hu, Tian Liu, Shigang Yue, Jigen Peng and Qinbing Fu
Biomimetics 2023, 8(8), 580; https://doi.org/10.3390/biomimetics8080580 - 1 Dec 2023
Viewed by 2476
Abstract
Prey-predator interactions play a pivotal role in elucidating the evolution and adaptation of various organisms’ traits. Numerous approaches have been employed to study the dynamics of prey-predator interaction systems, with agent-based methodologies gaining popularity. However, existing agent-based models are limited in their ability to handle multi-modal interactions, which are believed to be crucial for understanding living organisms. Conversely, prevailing prey-predator interaction studies often rely on mathematical models and computer simulations, neglecting real-world constraints and noise. These elusive attributes, challenging to model, can lead to emergent behaviors and embodied intelligence. To bridge these gaps, our study designs and implements a prey-predator interaction scenario that incorporates visual and olfactory sensory cues not only in computer simulations but also in a real multi-robot system. The observed emergent spatial-temporal dynamics demonstrate the successful transition of prey-predator interaction research from virtual simulations to the tangible world. This highlights the potential of multi-robot approaches for studying prey-predator interactions and lays the groundwork for future investigations involving multi-modal sensory processing while considering real-world constraints. Full article
(This article belongs to the Special Issue Biology for Robotics and Robotics for Biology)

22 pages, 1359 KB  
Article
Identifying Which Relational Cues Users Find Helpful to Allow Tailoring of e-Coach Dialogues
by Sana Salman, Deborah Richards and Mark Dras
Multimodal Technol. Interact. 2023, 7(10), 93; https://doi.org/10.3390/mti7100093 - 2 Oct 2023
Cited by 6 | Viewed by 2926
Abstract
Relational cues are extracts from actual verbal dialogues that help build the therapist–patient working alliance and a stronger bond through the depiction of empathy, respect and openness. ECAs (Embodied conversational agents) are human-like virtual agents that exhibit verbal and non-verbal behaviours. In the digital health space, ECAs act as health coaches or experts. ECA dialogues have previously been designed to include relational cues to motivate patients to change their current behaviours and encourage adherence to a treatment plan. However, there is little understanding of who finds specific relational cues delivered by an ECA helpful or not. Drawing the literature together, we have categorised relational cues into empowering, working alliance, affirmative and social dialogue. In this study, we have embedded the dialogue of Alex, an ECA, to encourage healthy behaviours with all the relational cues (empathic Alex) or with none of the relational cues (neutral Alex). A total of 206 participants were randomly assigned to interact with either empathic or neutral Alex and were also asked to rate the helpfulness of selected relational cues. We explore whether the perceived helpfulness of the relational cues is a good predictor of users’ intention to change the recommended health behaviours and/or development of a working alliance. Our models also investigate the impact of individual factors, including the gender, age, culture and personality traits of the users. The aim is to establish whether groups of individuals who share these individual factors found a particular cue or group of cues helpful. This will inform future versions of Alex, allowing Alex to tailor its dialogue to specific groups, and help in building ECAs with multiple personalities and roles. Full article

34 pages, 6061 KB  
Article
The Impression of Phones and Prosody Choice in the Gibberish Speech of the Virtual Embodied Conversational Agent Kotaro
by Antonio Galiza Cerdeira Gonzalez, Wing-Sum Lo and Ikuo Mizuuchi
Appl. Sci. 2023, 13(18), 10143; https://doi.org/10.3390/app131810143 - 8 Sep 2023
Cited by 2 | Viewed by 2473
Abstract
The number of smart devices is expected to exceed 100 billion by 2050, and many will feature conversational user interfaces. Thus, methods for generating appropriate prosody for the responses of embodied conversational agents will be very important. This paper presents the results of the “Talk to Kotaro” experiment, which was conducted to better understand how people from different cultural backgrounds react when listening to prosody and phone choices for the IPA symbol-based gibberish speech of the virtual embodied conversational agent Kotaro. It also presents an analysis of the responses to a post-experiment Likert scale questionnaire and the emotions estimated from the participants’ facial expressions, which allowed us to obtain a phone embedding matrix and to conclude that there is no common cross-cultural baseline impression regarding different prosody parameters and that similar-sounding phones are not close in the embedding space. Finally, it provides the obtained data in a fully anonymous data set. Full article
(This article belongs to the Special Issue Recent Advances in Human-Robot Interactions)

34 pages, 2639 KB  
Article
The Co-Design of an Embodied Conversational Agent to Help Stroke Survivors Manage Their Recovery
by Deborah Richards, Paulo Sergio Miranda Maciel and Heidi Janssen
Robotics 2023, 12(5), 120; https://doi.org/10.3390/robotics12050120 - 22 Aug 2023
Cited by 5 | Viewed by 3637
Abstract
Whilst the use of digital interventions to assist patients with self-management involving embodied conversational agents (ECA) is emerging, the use of such agents to support stroke rehabilitation and recovery is rare. This iTakeCharge project takes inspiration from the evidence-based narrative style self-management intervention for stroke recovery, the ‘Take Charge’ intervention, which has been shown to contribute to significant improvements in disability and quality of life after stroke. We worked with the developers and deliverers of the ‘Take Charge’ intervention tool, clinical stroke researchers and stroke survivors, to adapt the ‘Take Charge’ intervention tool to be delivered by an ECA (i.e., the Taking Charge Intelligent Agent (TaCIA)). TaCIA was co-designed using a three-phased approach: Stage 1: Phase I with the developers and Phase II with people who delivered the original Take Charge intervention to stroke survivors (i.e., facilitators); and Stage 2: Phase III with stroke survivors. This paper reports the results from each of these phases including an evaluation of the resulting ECA. Stage 1: Phase I, where TaCIA V.1 was evaluated by the Take Charge developers, did not build a good working alliance, provide adequate options, or deliver the intended Take Charge outcomes. In particular, the use of answer options and the coaching aspects of TaCIA V.1 were felt to conflict with the intention that Take Charge facilitators would not influence the responses of the patient. In response, in Stage 1: Phase II, TaCIA V.2 incorporated an experiment to determine the value of providing answer options versus free text responses. Take Charge facilitators agreed that allowing an open response concurrently with providing answer options was optimal and determined that working alliance and usability were satisfactory. Finally, in Stage 2: Phase III, TaCIA V.3 was evaluated with eight stroke survivors and was generally well accepted and considered useful. Increased user control, clarification of TaCIA’s role, and other changes to improve accessibility were suggested. The article concludes with limitations and recommendations for future changes based on stroke survivor feedback. Full article
(This article belongs to the Special Issue Chatbots and Talking Robots)

18 pages, 1306 KB  
Article
A Digital Coach to Promote Emotion Regulation Skills
by Katherine Hopman, Deborah Richards and Melissa M. Norberg
Multimodal Technol. Interact. 2023, 7(6), 57; https://doi.org/10.3390/mti7060057 - 29 May 2023
Cited by 8 | Viewed by 4369
Abstract
There is growing awareness that effective emotion regulation is critical for health, adjustment and wellbeing. Emerging evidence suggests that interventions that promote flexible emotion regulation may have the potential to reduce the incidence and prevalence of mental health problems in specific at-risk populations. The challenge is how best to engage with at risk populations, who may not be actively seeking assistance, to deliver this early intervention approach. One possible solution is via digital technology and development, which has rapidly accelerated in this space. Such rapid growth has, however, occurred at the expense of developing a deep understanding of key elements of successful program design and specific mechanisms that influence health behavior change. This paper presents a detailed description of the design, development and evaluation of an emotion regulation intervention conversational agent (ERICA) who acts as a digital coach. ERICA uses interactive conversation to encourage self-reflection and to support and empower users to learn a range of cognitive emotion regulation strategies including Refocusing, Reappraisal, Planning and Putting into Perspective. A pilot evaluation of ERICA was conducted with 138 university students and confirmed that ERICA provided a feasible and highly usable method for delivering an emotion regulation intervention. The results also indicated that ERICA was able to develop a therapeutic relationship with participants and increase their intent to use a range of cognitive emotion regulation strategies. These findings suggest that ERICA holds potential to be an effective approach for delivering an early intervention to support mental health and wellbeing. ERICA’s dialogue, embedded with interactivity, therapeutic alliance and empathy cues, provide the basis for the development of other psychoeducation interventions. Full article
22 pages, 1188 KB  
Article
Framework for Guiding the Development of High-Quality Conversational Agents in Healthcare
by Kerstin Denecke
Healthcare 2023, 11(8), 1061; https://doi.org/10.3390/healthcare11081061 - 7 Apr 2023
Cited by 12 | Viewed by 4399
Abstract
Evaluating conversational agents (CAs) that are intended for use in healthcare settings and ensuring their quality is essential to avoid patient harm and to ensure the efficacy of the CA-delivered intervention. However, a guideline for a standardized quality assessment of health CAs is still missing. The objective of this work is to describe a framework that provides guidance for the development and evaluation of health CAs. In previous work, consensus on categories for evaluating health CAs was established. In this work, we identify concrete metrics, heuristics, and checklists for these evaluation categories to form a framework. We focus on a specific type of health CA, namely rule-based systems that operate on written input and output and have a simple personality without any kind of embodiment. First, we identified relevant metrics, heuristics, and checklists to be linked to the evaluation categories through a literature search. Second, five experts judged the metrics regarding their relevance to the evaluation and development of health CAs. The final framework considers nine aspects from a general perspective, five aspects from a response understanding perspective, one aspect from a response generation perspective, and three aspects from an aesthetics perspective. Existing tools and heuristics specifically designed for evaluating CAs were linked to these evaluation aspects (e.g., the Bot Usability Scale, design heuristics for CAs); tools related to mHealth evaluation were adapted when necessary (e.g., aspects from the ISO technical specification for mHealth apps). The resulting framework comprises aspects to be considered not only as part of a system evaluation, but already during development. In particular, aspects related to accessibility or security have to be addressed in the design phase (e.g., which input and output options are provided to ensure accessibility?) and have to be verified after the implementation phase. As a next step, transfer of the framework to other types of health CAs has to be studied, and the framework has to be validated by applying it during health CA design and development. Full article
(This article belongs to the Special Issue Evaluation of the Usability of Healthcare Systems)
