Article

Perspectives on Socially Intelligent Conversational Agents

by Luisa Brinkschulte 1, Stephan Schlögl 2,*, Alexander Monz 2, Pascal Schöttle 2 and Matthias Janetschek 2
1 IBM Germany, 80807 Munich, Germany
2 Department Management, Communication & IT, MCI-The Entrepreneurial School, 6020 Innsbruck, Austria
* Author to whom correspondence should be addressed.
Multimodal Technol. Interact. 2022, 6(8), 62; https://doi.org/10.3390/mti6080062
Submission received: 3 June 2022 / Revised: 7 July 2022 / Accepted: 19 July 2022 / Published: 25 July 2022
(This article belongs to the Special Issue Multimodal Conversational Interaction and Interfaces, Volume II)

Abstract: The propagation of digital assistants is consistently progressing. Manifested by an uptake of ever more human-like conversational abilities, respective technologies are moving increasingly away from their role as voice-operated task enablers and becoming rather companion-like artifacts whose interaction style is rooted in anthropomorphic behavior. One of the required characteristics in this shift from a utilitarian tool to an emotional character is the adoption of social intelligence. Although past research has recognized this need, more multi-disciplinary investigations should be devoted to the exploration of relevant traits and their potential embedding in future agent technology. Aiming to lay a foundation for further developments, we report on the results of a Delphi study highlighting the respective opinions of 21 multi-disciplinary domain experts. Results exhibit 14 distinctive characteristics of social intelligence, grouped into different levels of consensus, maturity, and abstraction, which may be considered a relevant basis, assisting the definition and consequent development of socially intelligent conversational agents.

1. Introduction

The vision to naturally converse with machines as if they were humans has fascinated people for many years and is perpetually supported by respective scenarios featured in a wide range of science fiction movies (e.g., 2001: A Space Odyssey, Her, Ex Machina, etc.). Considerable progress in relevant research fields (i.e., most notably in Artificial Intelligence and its sub-field Natural Language Processing) has been closing this gap between a vision of future human–machine dialog and its reality [1]. That is, today’s voice-controlled digital assistants such as Alexa, Siri, Cortana or Google Assistant are not only capable of understanding speech commands, but increasingly also optimized to recognize conversational contexts and respond accordingly. However, despite these advancements, interactions often lack fluency, amounting to little more than ‘sentence ping pong’ while missing essential conversational characteristics.
One common way of increasing the naturalness of human–machine dialogs was found in the mimicry of human conversational traits [2]—an approach which is often showcased by contestants of the annual Loebner Prize competition (Online: https://en.wikipedia.org/wiki/Loebner_Prize (accessed on 27 April 2022)), featuring Alan Turing’s famous test scenario [3], or by tech giants such as Google prominently demonstrating their advancements in spoken human–computer interaction (Online: https://ai.googleblog.com/2018/05/duplex-ai-system-for-natural-conversation.html (accessed on 27 April 2022)). However, while the Turing Test has long been considered the ultimate challenge for AI researchers and enthusiasts [4], recent discussions have made it one of the most disputed topics in AI, philosophy, and cognitive science. There seems to be growing disagreement over whether the ‘spoofing’ of human behavior contributes to or rather handicaps the progress of AI research. Even Steve Worswick, Loebner Prize winner of 2016, 2017 and 2018, has been calling for a change in focus in conversational user interface (CUI) design and evaluation so as to re-direct respective research efforts (Online: https://discover.bot/bot-talk/the-turing-test-time-for-change/ (accessed on 27 April 2022)).
There is evidence showing that if a CUI is perceived to be social in its behavior it is more easily accepted [5]. This is further stressed by studies highlighting the importance of social intelligence in human–machine interaction [6,7,8].
Although much of the current and previous work in this field focuses on endowing agents with some sort of mimicked social intelligence, there are notable exceptions investigating more deeply what users expect and wish from social interaction with conversational agents (CAs).
On the one hand, researchers, such as Du et al. [9], have employed and further extended existing technology acceptance models to better understand people’s preferences regarding functional elements of social agents (e.g., their perceived usefulness and perceived ease of use). To this end, seniors were a particularly often-investigated target group as here CAs are considered a means to tackle potential issues of loneliness and companionship (e.g., [10,11,12]). However, hedonic and social characteristics of CAs have also been researched. Shamekhi et al. [13], for example, found that human interlocutors prefer a CA that matches their communication style (similar to interactions with humans), although Clark et al. [14] highlight that people do not necessarily need a bond or common ground in human–agent communication. Furthermore, the (human-like) use of conversational fillers (e.g., “um”, “uh”) is disliked in CAs [15]. Tone of voice, however, has been shown to have a significant influence on perceived interaction quality. Voice pitch, in particular, directly relates to perceived trust [16].
Trust has generally been one of the most researched aspects in human–agent interaction (e.g., [17,18,19,20]), and it was found to be particularly influenced by the social characteristics of a CA [21]. This also fits the social reasoning framework that has been proposed by Lee et al. [22] to guide the normative behavior of intelligent virtual agents.
Finally, building upon the results of 80 intelligent agent user studies, the work by Fitrianie et al. [23] aims to provide a generic set of 19 measuring constructs (covering agent, human, and interaction perspectives) to evaluate the interaction with artificial social agents.
Although this shows that past work has highlighted the need for machines to embed human-like traits (cf. also [24]), the more general issue of understanding the extent to which human social intelligence may serve as the ultimate guideline for creating this type of ‘artificial social intelligence’ is still being discussed and provides grounds for dissenting opinions (often coming from experts outside typical engineering disciplines, e.g., [25]). This underlines the complexity of the topic and the need for further investigation and discussion. Aiming to extend the body of knowledge in this disputed domain, the goal of the work presented in this paper is to capture insights and arguments from a diverse set of experts as to how intelligence characteristics should be exhibited by future CAs. Consequently, the respective work was guided by the following two key questions:
  • Which characteristics of social intelligence should future conversational agents be able to master?
  • To what extent is there a consensus on the relevance of these characteristics among experts in the field?
Our report starts with an exploration of previous work on the progress of CAs and respective developments in embedding (human) social intelligence in Section 2. Subsequently, Section 3 outlines our efforts in collecting and condensing expert opinions by applying a Delphi study methodology. Next, Section 4 reports on the study’s results, and Section 5 reflects on some of the more ambivalent insights. Finally, Section 6 concludes, outlines the study’s limitations, and provides pointers for potential future research directions.

2. Related Work

“By 2020, the average person will have more conversations with bots [conversational agents] than with their spouse.” (Online: https://www.gartner.com/smarterwithgartner/gartner-predicts-a-virtual-world-of-exponential-change/ (accessed on 28 April 2022)).
Although this prediction from October 2016 did not hold, the existence and use of bots, i.e., conversational agents imitating human characteristics, have taken on a significant role in human–technology interactions. The following will demonstrate this progress by discussing the path towards today’s CAs, initially triggered by science fiction (cf. Section 2.1), and elaborating on the role social intelligence may play in fulfilling these expectations (Section 2.2).

2.1. The Path towards Today’s Conversational Agents

Since Computer in the 1966 series Star Trek (Online: https://www.imdb.com/title/tt0060028/ (accessed on 28 April 2022)), Stanley Kubrick’s HAL from the 1968 movie 2001: A Space Odyssey (Online: https://www.imdb.com/title/tt0062622/ (accessed on 28 April 2022)) and George Lucas’ R2-D2 and C-3PO in his first Star Wars movie from 1977 (Online: https://www.imdb.com/title/tt0076759/ (accessed on 28 April 2022)), the vision of intelligent conversational agents has become a popular topic in the science fiction film industry, further emphasized by numerous successors such as Terminator (1984) (Online: https://www.imdb.com/title/tt0088247/ (accessed on 28 April 2022)), Data in the 1987 series Star Trek: The Next Generation (Online: https://www.imdb.com/title/tt0092455/ (accessed on 28 April 2022)), Wall-E (2008) (Online: https://www.imdb.com/title/tt0910970/ (accessed on 28 April 2022)), or Her (2013) (Online: https://www.imdb.com/title/tt1798709/ (accessed on 28 April 2022)). Today, the option to talk to and with a technical artifact (i.e., a computer or robot) is no longer just science fiction, but has become reality, underlined by the widespread use of virtual agent services such as Apple’s Siri, Google’s Assistant, Microsoft’s Cortana and Amazon’s Alexa.
Conversational Agents, also referred to as bots, personal assistants, digital personal assistants, mobile assistants, voice assistants, conversational user interfaces or virtual personal assistants, have become mainstream [1]. Gartner defines them as “conversational, computer-generated character[s] that simulate a conversation to deliver voice- or text-based information to a user” (Online: https://www.gartner.com/it-glossary/virtual-assistant-va/ (accessed on 28 April 2022)), whereas McTear et al. [1] emphasize their more general purpose of user assistance ([1], p. 11). While some of these assistants perform rather mundane tasks, such as obtaining information, providing directions, or setting an alarm, others offer very specialized functionalities, such as personalized fitness monitoring or contextualized recipe instructions. Equipped with such powerful capabilities, CAs have been moving out of the technological environment and increasingly into social contexts [7,26,27]. Consequently, one may argue that today CAs can be defined as conversational (voice- or text-based), computer-generated entities, which exist in either virtual or embodied form and aim to deliver assistance to (a) user(s) in a given socio-technical context.
Developments in CA technology, from early visions to current implementations, may be attributed to five key ingredients:
(1)
Significant advances in language technologies, such as improved accuracy in speech recognition, increased anthropomorphism in text-to-speech synthesis, and greater flexibility in dialogue management, which overall have improved agents’ communication capabilities;
(2)
The emergence of the Semantic Web, whose machine-readable content structure helps CA technology answer more complex types of questions [1];
(3)
Smartphones and other mobile devices, which not only have long surpassed the power of earlier personal computers and now allow for the ubiquitous availability of sophisticated computing services, but also give access to contextual information such as users’ location, calendar, and contact details, thus fostering personalization;
(4)
Widespread connectivity through faster wireless networks, almost ubiquitous Wi-Fi availability and the introduction of cloud computing, which enables resource-intensive tasks such as speech recognition to be performed on remote servers;
(5)
The increased effort that major technology companies such as Microsoft, Google, Amazon, or Apple have put into the development of CA technology and application domains, tackling ever more complex tasks such as education, sales, or different types of therapy.
It is particularly the latter that shows that future CAs need to move beyond being voice-controlled information providers and become conversational companions showing almost human-like social behavior.

2.2. Social Intelligence and Conversational Agents

Extensive research has been conducted on linking social intelligence and CAs. Especially the area of social dialog [28], where agents are designed to interact with humans in a natural and socially intelligent manner [29], has been gaining increased attention. Relevant previous research projects in this field include Humaine (Online: https://cordis.europa.eu/project/id/507422 (accessed on 30 June 2022)), which focused on emotional human–machine interaction and provided an extensive corpus of data on the forms emotion can take on during conversations [30], as well as Semaine (Online: https://cordis.europa.eu/project/id/211486 (accessed on 30 June 2022)), which explored the impact of nonverbal expressions such as head gestures [31] and laughter [32]. Technical artifacts resulting from these projects, such as GRETA [33] or the Agents United platform [34], help researchers and developers set up their own multi-agent applications. Furthermore, focusing more on the healthcare domain, the SimSensei system showed how CAs may be used as a tool to measure psychological distress in semi-structured interviews [35]. In addition, the ODVIC and EMPATHIC (Online: https://cordis.europa.eu/project/rcn/212371/en (accessed on 1 July 2022)) projects targeted the health and well-being domain. The former focused on dialogue-based coaching towards behavioral change [36], whereas the latter aimed to build an empathic virtual coach to improve the independent healthy-life-years of the elderly [37].
These research efforts show that CAs interacting with humans in the social sphere need to act emotionally and be socially intelligent in order to be effective [38]. Furthermore, building upon Nass et al. [24], there is evidence that humans can be socially influenced by these types of artificial entities just as they would be by humans. Researchers in agent technology, however, vary in their interpretations of how social intelligence and CAs relate to each other. On the one hand, there is the argument that CA social intelligence should be modeled after human social intelligence. Thus, the created models should be based on theories about human–human interaction. On the other hand, Dautenhahn states that if we want CAs as social interaction partners for humans, they have to be only “a bit like us” ([39], p. 23).
Following Gardner’s theory of multiple intelligences (Online: https://www.simplypsychology.org/multiple-intelligences.html (accessed on 20 December 2021)), Albrecht [40] depicts social intelligence as one of six dimensions of intelligence (cf. Table 1) and defines it as the ability to get along and cooperate with other people (i.e., ‘dealing with people’). He furthermore defines five sub-dimensions of social intelligence expressed by the S.P.A.C.E. acronym, i.e., Situational Awareness, Presence, Authenticity, Clarity, and Empathy.
Table 1. Six dimensions of intelligence, according to Albrecht [40].
Abstract Intelligence: Symbolic reasoning
Practical Intelligence: Getting things done
Emotional Intelligence: Self-awareness and self-management
Aesthetic Intelligence: Sense of form, design, music, art, and literature
Kinesthetic Intelligence: Whole-body skills, dancing, or flying a jet fighter
Social Intelligence: Dealing with people

2.2.1. Situational Awareness

The ability to assess a given situation is indispensable to effectively make decisions [41]. To this end, Situational Awareness (SA) describes the process of ‘reading’ situations and interpreting people’s behavior so as to identify their possible intentions, their emotional states, as well as their proclivity to interact [40]. Albrecht (ibid.) divides SA into three different contexts. First, the proxemics context, which depicts the dynamics of a physical space within which people interact. Second, the behavioral context, which relates to patterns of action, emotion, motivation, and intention that show up in interactions between people. Third, the semantic context, which refers to the language patterns used in disclosure. It examines the nature of relationships, governing social codes, differences in status and social class and the degree of understanding created by language habits. Furthermore, SA may be divided into actions that are influenced by individual awareness, interpersonal awareness (e.g., group collaboration, such as tutoring and meetings), and social-cultural awareness (e.g., race, age, gender, education level, etc.).
Since SA counts as crucial for human decision making, the concept is also considered when developing human-like CAs. That is, while in pervasive/ubiquitous computing contexts adaptation allows technology to better integrate into a given environment, agents rather use this situational awareness to convey a certain level of intelligence [42]. They sense, interpret, and combine information in order to paint a coherent picture of a setting. The processed data are thereby categorized as spatial (i.e., orientation, movement, structure), informational (i.e., building information structure, decision making, information processing, perception, recognition), and functional (i.e., building functional structure, task decomposition, planning, specialization) [43].
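To make this categorization more tangible, the following minimal sketch is our own illustration (not part of the framework in [43]; all type and field names are hypothetical) of how individually sensed observations might be tagged along the spatial, informational, and functional dimensions and then fused into a coherent picture of a setting.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class SAChannel(Enum):
    """Data categories for situational awareness, following [43]."""
    SPATIAL = "spatial"              # orientation, movement, structure
    INFORMATIONAL = "informational"  # perception, recognition, decision making
    FUNCTIONAL = "functional"        # task decomposition, planning, specialization


@dataclass
class SAObservation:
    """A single sensed or derived piece of situational information."""
    channel: SAChannel
    source: str   # e.g., "camera", "microphone", "calendar"
    payload: Any  # raw or interpreted content


def fuse(observations: list[SAObservation]) -> dict[SAChannel, list[Any]]:
    """Group observations by channel to form a coherent picture of the setting."""
    picture: dict[SAChannel, list[Any]] = {channel: [] for channel in SAChannel}
    for obs in observations:
        picture[obs.channel].append(obs.payload)
    return picture


picture = fuse([
    SAObservation(SAChannel.SPATIAL, "camera", "user approaches from the left"),
    SAObservation(SAChannel.INFORMATIONAL, "microphone", "question about tomorrow's schedule"),
    SAObservation(SAChannel.FUNCTIONAL, "planner", "next step: confirm calendar entry"),
])
```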

2.2.2. Presence

Presence incorporates appearance, verbal and nonverbal patterns, as well as other interpersonal signals that help form an impression of a person [40]. Appearance hereby refers to how a person’s look influences their perceived credibility and attractiveness [44]. It addresses physical form, communication style, age, gender, dress, and socio-economic status. Although, with regard to CAs, appearance is often secondary, Baylor [44] argues that voice alone is not sufficient. This viewpoint is supported by Hone et al. [45], who found that embodied CAs are significantly more effective than non-embodied agents. Furthermore, Kidd and Breazeal [46] claim that physically embodied robots are considered more enjoyable to interact with, more engaging, more informative, and more credible than animated characters. This is also supported by Lee et al. [47], whose findings show that physical embodiment presents an added value for human–agent interaction and that it is an effective means to increase an agent’s social presence. However, researchers are often in dispute about whether an agent’s appearance (physical or animated) should look human- or machine-like. Złotowski et al. [48], for example, found that an extremely human-like agent is perceived to be less empathetic and trustworthy than one which is more machine-like (see also [49]). Furthermore, the relation between a person’s gender and how they perceive an agent’s gender affects trustworthiness. That is, on the one hand, Siegel et al. [50] showed that an opposite-gender agent is perceived as more trustworthy than a same-gender agent. On the other hand, research by Baylor and Kim [51] and Guadagno et al. [52] indicates that humans are more persuaded by agents of the same gender—an equality principle that seems to also apply to an agent’s ethnicity and/or race [51,53].
As for verbal communication, perceptions of appearance relate to the linguistic part of the interpersonal communication. This includes both written and spoken communication, where word count, turn length and time to respond are critical parts of verbal communication [54]. In contrast, nonverbal communication describes all communication activities that transcend the written and spoken word [55], including gestures, postures, facial expression, gaze direction, and other emotional expressions.
Finally, interpersonal signals also influence people’s perception of an agent. Examples include the split-persona effect, where Baylor and Ebbers [56] found that splitting roles and functionalities into distinct agent personas positively impacts an agent’s perceived value, and the agent’s level of knowledge, which positively affects self-efficacy beliefs in cases where it is in line with human knowledge levels [57].

2.2.3. Authenticity

Authenticity encompasses, among other things, a person’s honesty and sincerity [40]. It deals with establishing cooperation, preventing manipulation and being true to oneself and others. Additionally, having respect, staying true to one’s values and ‘playing fair’ makes a person and potentially also an agent authentic. In this context, Albrecht [40] refers to a ‘social radar’, which absorbs behavioral signals and lets people judge a person’s (or agent’s) honesty, openness, and trustworthiness. Authenticity is considered relationship-oriented rather than task-oriented, and honesty counts as its major and overall quality. That is, people who are honest with themselves and others are perceived as authentic, regardless of their actions. Even if their actions are not accepted, they will still be praised for being authentic. An agent, on the other hand, is perceived as authentic when it shows transparency and predictability with respect to its decision making [58], as well as experience and coherence [59].

2.2.4. Clarity

Clarity refers to the ability to explain oneself, illuminate ideas, pass data clearly and accurately on to others and thus support cooperation [40]. According to Albrecht (ibid.), it also leads to higher levels of perceived empathy and open-mindedness, and fosters the free exchange of ideas. Adapting one’s language to a given situation is the foundation of clarity, as is the use of neutral speech patterns and metaphors; that is, by substituting a familiar experience for an abstract concept, one eases topic understanding for others. As for agents, clarity is often connected to personality. To this end, Persson et al. [60] depict personality as an enduring agent dimension, which comprises various traits as well as social role schemas, including expectations about occupancy, social stereotypes and archetypes. However, the question remains whether an agent should be equipped with a designed personality or rather develop traits over time [27]. Here, Persson et al. [60] claim that an agent’s ability to observe and consequently learn may be (more) beneficial for the human–agent relationship.

2.2.5. Empathy

Finally, Empathy may be defined as the ability to understand and respond appropriately to the affective states of others [61]. Communicating in an empathetic manner results in a state of connectedness with another person, which creates the basis for cooperation and interaction. In this sense, empathy includes the understanding of what other people think, how they feel in concrete situations, and how one may compassionately engage with them. Generally, there are two approaches to building empathy. On the one hand, there is the moment-to-moment experience of connecting with another person, and on the other hand, there is the maintenance process where one keeps a relationship over time [40]. Additionally, attentiveness, appreciation, and affirmation help establish a strong empathetic connection to a person or a group [40].
Research has also shown that empathetic behavior can reduce stress [62], from which one may argue that the conversational style an agent uses may impact people’s anxiety levels. This seems particularly relevant for automated call centers, where the agent may need to respond to potentially sensitive questions (e.g., health-related topics) [63].
Based on the vision described in Section 2.1, where CAs overcome their purpose as information providers and grow into a companion role, and the consequent need for CAs to express certain human traits outlined in Section 2.2, our goal was to investigate to what extent experts agree on the necessary integration of social intelligence into CA technology. That is, we were interested in what socio-technical challenges they see and what design directions they agree upon when it comes to building future CAs.

3. Method, Sampling and Study Procedure

We used a three-stage Delphi study approach to collect insights from domain experts. As demonstrated by previous work aiming to explore similar socio-technical problem spaces (e.g., [64,65,66,67,68]), Delphi can be considered a particularly suitable research method for this type of policy-focused investigation, since its systematic procedure yields understanding and circumvents group dispute [69]. The first stage of the study sought to identify those characteristics of social intelligence, which a future CA should be able to master. The two following stages evaluated experts’ agreement among the identified characteristics concerning their relevance and respective importance. To this end, relevance was defined as the degree to which something is related or useful to the topic under discussion [70]. Hence, for the purpose of this study, relevance shows whether a distinct characteristic was perceived to be particularly relevant for a socially intelligent CA or rather considered to be a more general characteristic of an AI system. Importance, on the other hand, describes whether said characteristic was considered a necessity (i.e., compulsory) to building CAs that express a certain level of social intelligence.
The study’s expert selection followed the five-step procedure proposed by Okoli and Pawlowski [71], reaching out to representatives from academia as well as industry. Experts from academia were determined based on their Google Scholar rankings and their latest contributions to relevant sectors (i.e., linguistics, artificial intelligence, software engineering, psychology, philosophy, and social sciences). Contenders were either required to have already obtained a PhD in their respective field or had to be in the final stages of doing so. Industry experts were chosen based on their affiliation with organizations that design and develop CAs with social skills or consultancies providing advice on this topic.
From the initially identified and contacted 121 experts, we were able to recruit n = 21 for our investigation (3 female; age range at the time of the study: 25–65; average number of citations per expert at the time of the study: 1588). All experts had indicated their previous knowledge about, as well as their experiences and interactions with conversational agents. Twelve of them were geographically situated in EMEA (Europe, Middle-East and Africa), seven in AMER (North, Central and South America), and two in APAC (Asia Pacific). All of them completed all three study stages, which, according to Ziglio and Adler [69], should have led to rather clear viewpoints. As for the experts’ background, Table 2 provides information on their placement and expertise/field of work at the time of the study.
During the first stage of the Delphi study, experts were given a description of the study purpose and its design and then asked to (1) provide some demographic information and to (2) answer 5 open-ended questions regarding the social intelligence of CAs. Questions were designed along Albrecht’s social intelligence dimensions presented in Section 2.2. Experts were given two weeks to think about and work on the questions before returning their answers. Subsequently, the provided input was summarized, structured and interpreted through a qualitative content analysis [72,73,74], leading to a total of 14 different CA characteristics (cf. Table 3).
During stage two, the experts were then asked to what degree they agree with the relevance and importance of each of these characteristics. In both cases, agreement was measured on a 7-point Likert-scale [75] ranging from 1 = not relevant to 7 = extremely relevant and 1 = not important to 7 = extremely important, respectively. Additionally, experts were asked to rank the 10 characteristics they would find most important for the upcoming five years, and they were given the possibility to revise, reassess, and further elaborate on these characteristics if they thought such was necessary [76,77]. This led to 92 additional comments, which again underwent structured content analysis.
Finally, during stage three of the Delphi study, experts were given the group mean result for each of these characteristics and were then asked to confirm, comment, clarify, or potentially revise their earlier responses. This time, a total of 33 additional clarifications and comments were collected, providing help in the explanation of different viewpoints.
All quantitative data derived from stages two and three were furthermore analyzed using measures of central tendency and dispersion, specifically mean, median, standard deviation and interquartile range [76,78,79]. The entire design as well as its respective data collection and analysis procedures were approved by the university’s research ethics group with respect to ethical considerations regarding research with human participants.
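For illustration, the short sketch below computes these descriptive measures for one characteristic’s 7-point Likert ratings. The ratings shown are hypothetical (not the study’s data), and the sketch assumes the sample standard deviation; the paper does not state which variant was used.

```python
import statistics


def describe(ratings: list[int]) -> dict[str, float]:
    """Mean, median, sample standard deviation and interquartile range
    for one characteristic's 7-point Likert ratings."""
    quartiles = statistics.quantiles(ratings, n=4)  # returns [Q1, Q2, Q3]
    return {
        "mean": statistics.mean(ratings),
        "median": statistics.median(ratings),
        "sd": statistics.stdev(ratings),
        "iqr": quartiles[2] - quartiles[0],
    }


# Hypothetical relevance ratings from 21 experts for a single characteristic.
example_ratings = [7, 6, 6, 7, 5, 6, 7, 6, 6, 5, 7, 6, 7, 6, 6, 7, 5, 6, 7, 6, 6]
print(describe(example_ratings))
```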

4. Results

The following sections report on the results of the above-outlined study procedure. We start with an overview of the identified characteristics and the level of agreement they have reached among the experts regarding their relevance and importance for building socially intelligent CAs. Following this overview, the subsequent sections will elaborate on these characteristics in more detail.

4.1. Identified Characteristics for Socially Intelligent Conversational Agents

As outlined above, experts rated all of the identified characteristics regarding their relevance as well as their importance. Ratings were provided on 7-point Likert scales ranging from 1 = not relevant|important to 7 = extremely relevant|important. Based on the resulting standard deviation, we used clustering to categorize them as having either yielded consensus (i.e., SD < 1), dissent (i.e., SD > 1.4) or indecision (i.e., 1 ≤ SD ≤ 1.4) among experts.
Looking at the data (cf. Table 3), five of the characteristics, i.e., Context-related Acting, Reflective Language, Enculturation, Customizability and Engagement, achieved a rather clear consensus among experts both regarding relevance as well as importance (SD < 1). As for the characteristics Consistency, Depth, Continuous Interaction, Respectful Honesty and Justifiability, the result is indecisive (1 ≤ SD ≤ 1.4). That is, by our definition, neither consensus nor dissent regarding relevance and importance was reached. Finally, four characteristics, i.e., Establish/Maintain Relationships, Respectful Acting, Otherness and Individual Personality, indicate dissent among experts (SD > 1.4).
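Expressed programmatically, the grouping reduces to the threshold rule above. The following sketch applies it to a few of the relevance standard deviations reported in Section 4 (function and variable names are our own).

```python
def agreement_level(sd: float) -> str:
    """Classify expert agreement by the standard deviation of the ratings:
    consensus (SD < 1), indecision (1 <= SD <= 1.4), dissent (SD > 1.4)."""
    if sd < 1.0:
        return "consensus"
    if sd > 1.4:
        return "dissent"
    return "indecision"


# A few of the relevance SDs reported in Section 4 (cf. Table 3).
relevance_sd = {
    "Context-related Acting": 0.74,
    "Enculturation": 0.81,
    "Consistency": 1.06,
    "Respectful Honesty": 1.18,
    "Establish/Maintain Relationships": 1.44,
    "Otherness": 1.56,
}
for characteristic, sd in relevance_sd.items():
    print(f"{characteristic}: {agreement_level(sd)}")
```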

4.2. Context-Related Acting

Context-related acting has been evaluated as the most relevant characteristic. More than a quarter of all statements collected in the first round referred to this characteristic. According to Expert 10 (E10), “an evaluation of the context is necessary in order to understand how the user is behaving” and consequently also to determine the agent’s next actions. Thus, socially intelligent agents should be designed towards “context-sensitive and context-appropriate presence” (E02). Furthermore, they should observe their surrounding environment and act accordingly by familiarizing themselves with the “social and behavioral rules that make an interactional exchange successful” (E05). Such may be connected to the six interdependent types of context highlighted by McGaan (Online: https://department.monm.edu/cata/saved_files/Handouts/CONTEXTS.FSC.html (accessed on 30 November 2021)):
(1) The physical context, which includes the properties of the surroundings, the communication, as well as other elements of the physical world that may influence communication. Examples include furniture arrangement, room size, colors, temperature, or time of day. Socially intelligent agents should be able to “produce contextually-accurate representations of their [...] surroundings” (E02). This requires the ability to “integrate communication streams from multiple sensors for audio, video, motion, location, proximity and other ambient inputs” (E02). The different sensory systems should be managed appropriately, and a distinction should be made between conversational, ambient, background, and secondary signals.
In addition, the data include statements about (2) the inner context, which refers to all feelings, thoughts, sensations, and emotions that may influence how events are interpreted. “If [an agent] has to spend much time with humans, then a large amount of resources will have to be spent on processing the human emotions” (E18), potentially based on people’s facial expressions, body language, tone, or physiological sensors measuring indicators such as heart rate. Socially intelligent agents should further “make assumptions about emotional states from the analysis of speech patterns and the use of language as seen in sentiment analysis” (E03). Such is important so as to identify and understand “complex human feelings, behavior patterns and life circumstances” (E15) and react accordingly.
Furthermore, experts mentioned aspects of (3) the symbolic context, which includes all interactions occurring before (and eventually after) a given communication event and which influences the agent or the user in their understanding of said event. To this end, a socially intelligent agent should be “focused on retaining the meaning of context in a conversation” (E04). If, for example, the user asks about the capital of France, and the agent replies Paris, then Paris should become the context of the conversation so that a user is able to continue the dialogue around the French capital without the need to refocus the agent. In other words, a socially intelligent agent needs to be equipped with both a “long term and short term memory” (E04).
Furthermore, (4) the relational context is found in our data, highlighting that agents need to “understand the interrelationships between people in the spaces in which they and the agent operate” (E04); e.g., student–teacher, father–son, friend–friend, or expert–layman. This is because use cases are mostly person-specific. An agent needs to know who is asking what, if it is to respond in a correct way. What is more, “it is evident that the same words may be used as a joke, or as a genuine question seeking an answer, or as an aggressive challenge” (E05). Knowing what is an adequate continuation of the interaction depends on identifying the interlocutors’ intentions, feelings, and beliefs. Some people, particularly children, may have trouble expressing themselves in words and sometimes say the opposite or completely different things to what they intend to say. Similar to humans, socially intelligent agents should thus be able to “detect this kind of confusion” (E18).
Additionally, the experts addressed (5) the situational context, which relates to the activities interlocutors are involved in during the conversation (e.g., having a lecture, being on a date, playing a game, etc.). Consequently, agents should vary the use of social interaction “dependent on the current situation in which the user [...] is using the agent” (E11). To this end, the agent’s appearance should also be adapted to the “role the agent must satisfy and the context it is called to function in” (E05). As a too well-defined and consistent appearance may become boring, the agent should also be able to change and adapt over time.
Finally, approximately one-third of the expert comments may be linked to (6) the cultural context as being relevant for an agent to master. Here it is required for a CA to be “sensitive to social norms and expectations” (E01). Such may not only depend upon the social and cultural context but also on the “status the agent is able to claim” (ibid.). The cultural context includes patterns and rules of communication that are given by and learned from a culture. Cultures (e.g., American, Japanese, British, etc.) and sub-cultures (e.g., Hispanic, Southern, rural-Midwest, urban gang, etc.) differ substantially, and each demands tailored responses. On top of that, cultures depend not only on geographic location. Companies, associations and other social groups also develop their own cultures. Socially intelligent agents need to understand the culture they are in and act appropriately, e.g., through culturally aware gestures.
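Returning to the symbolic context (3) and the Paris example above, a minimal sketch of what such context retention could look like is given below. This is our own illustration; the class and its methods are hypothetical and merely show how recently mentioned entities could be kept as a short-term focus while the full exchange is logged as long-term memory, so that a follow-up such as “How many people live there?” can be resolved against “Paris”.

```python
from collections import deque
from typing import Optional


class DialogueMemory:
    """Toy short-term/long-term memory for retaining conversational context."""

    def __init__(self, focus_size: int = 5):
        self.short_term = deque(maxlen=focus_size)   # recently mentioned entities
        self.long_term: list[tuple[str, str]] = []   # full (utterance, reply) log

    def record(self, utterance: str, reply: str, entities: list[str]) -> None:
        self.long_term.append((utterance, reply))
        self.short_term.extend(entities)

    def current_focus(self) -> Optional[str]:
        """The most recently mentioned entity, used to resolve references such as 'there'."""
        return self.short_term[-1] if self.short_term else None


memory = DialogueMemory()
memory.record("What is the capital of France?", "Paris.", entities=["France", "Paris"])
print(memory.current_focus())  # -> "Paris"; a follow-up like "How many people live there?" now has a referent
```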
Looking at the quantitative analysis, Context-related Acting received the highest ratings regarding both relevance (Mean = 6.38; SD = 0.74) and importance (Mean = 6.33; SD = 0.73). Experts described it as the “first step towards social intelligence [in agents]” (E16), although a lack of it may be somewhat “mitigated by the fact that users are aware [...] that they are not interacting with real people” (E07).

4.3. Reflective Language

Although experts did not deem clear language necessary (at least not the way it was described by Albrecht [40]), the majority of them stressed the importance of reflective language so that agents are able to “be [better] understood by the user” (E09). In other words, socially intelligent agents “should be as understandable, as one would wish for human interlocutors to be” (E15). In order to do so, CAs need to adapt to the interlocutor’s language via a “mechanism in which speakers adjust to each others vocabulary and intonation” (E02). Features such as “optimal sentence length, and the use of vocabulary which is appropriate to the situation and the user’s knowledge should be taken into account” (E09). Agents should further vary their sentence length based on situational awareness, as the sentence length that is perceived to be optimal is “highly dependent on the user and the current topic” (E11). Characteristics of reflective language also include the application of “jargons, human dialects and cultural-regional characteristics” (E06). Agents need to be able to learn and adapt to jargon and slang typical in given organizations, societies or groups. Furthermore, nonverbal communication such as “gestures are an important communication channel in interpersonal communication and should thus be exploited in human–agent-interaction” (E10). In this context, experts believe reflective language to be a bilateral relationship. Eventually, agents and users “will meet somewhere in the middle, with the accents and tones approximating that of the user, and the vocabulary and syntax being sufficiently formal for easy parsing” (E18).
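As a rough illustration of the adjustment mechanism described by E02 and E09, the following sketch (a heuristic of our own, not a method proposed by the experts) selects the candidate reply whose sentence length is closest to the user’s, a crude stand-in for lexical and syntactic alignment.

```python
import re


def average_sentence_length(text: str) -> float:
    """Average number of words per sentence in a piece of text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return sum(len(s.split()) for s in sentences) / max(len(sentences), 1)


def choose_reply(candidates: list[str], user_text: str) -> str:
    """Pick the candidate whose average sentence length best matches the user's."""
    target = average_sentence_length(user_text)
    return min(candidates, key=lambda c: abs(average_sentence_length(c) - target))


user_utterance = "Hey, quick one. What's the weather like tomorrow?"
candidate_replies = [
    "Tomorrow will be sunny.",
    "According to the latest forecast, tomorrow is expected to be predominantly sunny, "
    "with temperatures peaking in the early afternoon and a negligible chance of rain.",
]
print(choose_reply(candidate_replies, user_utterance))  # -> the terser reply, matching the user's style
```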

4.4. Enculturation

More than half of the experts commented on characteristics that may be subsumed under the term enculturation. For an agent to be able to act intelligently, it has to become aware of its environment. (Self)Awareness of a social entity has to be in tune with those entities it interacts with. Thus, “the cognitive correlate of context needs to be co-learnt with the social group it interacts with so that its behavior is context-dependent in a way that is compatible with theirs” (E01). This implies that agents have to spend time enculturating, which means that they have to acquire the “characteristics and norms of a culture a group or a person” (E01). This allows the agent to “recognize the same contexts and learn the associated behaviors, responses and knowledge that are relevant to these situations” (ibid.). However, an agent must not mimic the user. Rather, “it needs to analyze the reaction of the user and adjust accordingly” (E13). Beyond that, “just as agents would have to learn to accept us for who we are, we will also need to learn to accept them for who [and what] they are.” (E20). The acceptance of agents will depend to a great degree on how “acclimated humans are to the presence of agents, as well as how widespread those entities are in society at that point in time” (E08). To this end, E01 states that “it is the process by which they [the agents] are made” and that it will “not be possible to program them offline correctly and then just place them into use”. Rather, for socially intelligent agents to fluidly and easily interact with humans, they will need a “considerable period of enculturation” (E01). As this implies a profound amount of learning time, experts see a great challenge in enculturation. For example, “nobody accepts a security system that only starts to work after the second home invasion” (E16).
From a quantitative point of view, enculturation was rated the second most relevant (Mean = 6.19; SD = 0.81) and third most important (Mean = 5.90; SD = 0.89) characteristic. As E13 puts it, “Adjusting ones characteristics in accordance to a given situation [cultural context] is the key to becoming a part of a social network”.

4.5. Customizability

Approximately one-third of the experts stressed that customizability is an important agent characteristic. As people have “many different social preferences” it would be helpful for socially intelligent agents to “adapt to them” (E12). Call center automation already demonstrates that customers are different in the way they communicate. Some of them “just want to have their problems solved, others want to hear empathetic phrases” (E06). Thus, a socially intelligent agent needs to “behave in accordance with what [level of empathy] the user wants” (E06). A modest approach “gives users the control over the interaction and the amount of emotions they want to share” (E16). We may thus distinguish between two approaches to customization. Agents could either be customized by the user or “customized prior to their delivery to a user” (E02).
However, so as to be in accordance with social norms, an agent must not be customizable at all levels, as this would “not seem natural anymore” (E06). Rather, agents should maintain their “personality but then adapt to other individuals and contexts like humans do” (E06). Technically speaking, they should be able to “look through the history of a person so as to be able to personalize to particular needs” (E10).

4.6. Engagement

Engagement is a characteristic that four of the experts believe a socially intelligent agent should be able to master. They fear that “with non-expressive and non-varied resultant conversational interfaces people will quickly become bored” (E04). As E20 notes, “sharing of experiences is engaging for humans as they are empathic creatures”. Thus, “future social intelligent agents should [...] provide sustained engagement” (E04). This involves the “choice of different words and phrases and the organization of the message concerning its discourse structure” (E03). Furthermore, “turn length is important in a conversational encounter as the conversational partner will want to contribute and so the agent should not take over the floor for long periods of time” (E03). Agents may even include variance such as different humorous remarks and build in “conversational delights such as sarcasm and self-deprecation” (E04). A further consideration is that a good interactive system “leaves room for interpretation and not over explains how to interact with it” (E16). Still, an irreproducible cause and effect relation is needed in order to give users a point where they can start to explore and interpret. In order to keep it traceable and simultaneously prevent a rather boring action–reaction system, “a learning curve of increasingly complex behavior could be used” (E16).

4.7. Consistency

Furthermore, consistency was stated by four of the experts as being important for socially intelligent agents. To this end, a “coherent narrative to the agent’s actions” (E01) is considered to be a part of the social process. Acting in a consistent manner helps users feel secure, as they can see “actions lining up with intentions” (E18). Generally, an agent needs to act consistently toward its goals and missions and showcase an obvious agenda so as to be perceived as authentic. Especially “being coherent with their previous behaviors or being able to explain changes and motivate them” (E05) should be taken into account. However, it may be understood as a “more after-the-fact ability to give a coherent account of actions rather than a real day-day consistency” (E01). Furthermore, an agent should be able to express who it is by explaining what it does and how it works. This would make it easier for users to use the agent as a “building block helping them get what they want” (E19).
Overall, ratings point to both relevance (Mean = 5.67; SD = 1.06) as well as importance (Mean = 5.62; SD = 1.12) for agents to be consistent, although on both points experts seemed slightly indecisive (i.e., SD > 1).

4.8. Depth

Depth was also brought up by some experts as being a relevant agent characteristic. That is, for an agent, it is important that its behavior is “not just a shallow set of programmed reactions but comes from something that motivates these, such as goals and/or a culture” (E01). Experts state that having an obvious agenda would help users “feel safe and secure as they can see the actions lining up with the agents intentions” (E18). Additionally, it was perceived as important that agents have the ability to adjust their goals in an exploratory fashion as “goals and ambitions in humans (also) change continuously” (E05).

4.9. Continuous Interaction

Continuous interaction was another potential agent characteristic highlighted by the experts. One fundamental difference between humans and intelligent agents is that humans have “continuous interaction with other individuals and groups” (E06). Humans thus continuously collect information and consequently build up knowledge to be used in different contexts. Most CAs, however, rely on the conversational context given to them by the preceding six to eight conversational turns. Even if they could remember all previous interactions, they would be turned off in between and, therefore, would not hold all information required for more continuous interaction. From this, experts concluded that “agents may need to be available and willing to interact more often” (E05), and also, “the correct attribution of memories” (E09) may play an important role, so that agents will eventually be able to “maintain meaningful [continuous] interactions” (E04).
However, although continuous interaction seems important, this does not necessarily mean that future CAs need to be always on and available. While interacting with the user could happen during “work or social hours” (E18), there must also be time for “play, rest, and sleep/dream” (E18), where agents may ‘upgrade’ themselves and humans are ‘left alone’. (Note: the issue of privacy was purposefully omitted from our investigations, yet we do want to stress that any agent technology, which would offer more or less ubiquitous availability, would need to make privacy preservation one of its highest priorities).

4.10. Respectful Honesty

More than half of the experts emphasized respectful honesty as an essential CA characteristic. An agent should not “say or do things just to obtain its goals” (E01) but rather be “true to or at least consistent with reality” (E01). In other words, CAs must not lie, but they should be able to “deny answers in order to avoid revealing certain facts like for example information that is socially sensitive” (E09). This type of honesty also implies that agents should be able to “identify situations in which they cannot perform well and thus warn users with enough time in advance to take over control” (E09). They should be able to “convey information in an objective and reliable way” (E03). For basic conversations, “explanations of agent behavior [and/or reasoning] may disturb the communication flow, but for important decisions, it may be important to know how the agent came to its conclusions” (E11). Lastly, respectful honesty also means that if an agent “can’t understand something, because a message is ambiguous, it has to ask for clarification so as to be able to classify questions, statements and situations” (E15).
Respectful honesty was rated the third most relevant (Mean = 5.76; SD = 1.18) and second most important (Mean = 5.90; SD = 1.18) characteristic. However, experts also pointed to the difficulty of implementing this characteristic, as it is often unclear how a situation should be best handled.

4.11. Justifiability

Six of the experts mentioned justifiability as a necessary characteristic. Accordingly, CAs “do not need predictability nor high levels of transparency” (E01). What is needed, however, is justifiability, i.e., “the ability to explain an entity’s [agent’s] actions in socially acceptable terms” (ibid.). This also implies that agents need to “follow a set of ethical guidelines that should be known to human users” (E09). Furthermore, a user should be able to see “which data it [the CA] works with and what it does with the data” (E06). Moreover, the agent’s purpose should be clear, and it should be explained “why it was developed and what its aims are” (E06). Similar to humans, CAs need to be able to justify their actions as “this is a requisite for being part of a society” (E01).

4.12. Establish/Maintain Relationships

A total of 30 statements relate to the ability of CAs to establish/maintain relationships with user(s). The ability to make the user(s) “feel good […] is an essential social skill, and CAs of the future will need this ability” (E01). They should be able to make others feel good and evoke positive feelings or at least “avoid evoking negative feelings” (E16). Without this, “communication and collaboration will likely be short-lived” (E16). Building up a relationship with the user means for an agent to be “pleased to interact” (E05). To this effect, it must be “endowed with features that allows it to ‘like’ its interlocutors” (E05). Generally, building and maintaining relationships “requires basic social skills and a knowledge of the expectations and norms that hold in a conversational situation” (E01). Agents need to apply this by “learning behavior patterns of the user […] to develop rapport with the user” (E10). They further need to “show interest […], maintain a memory of what was learned from previous interactions, and be able to use this information in future interactions” (E03). The characteristic is highly context-dependent and particularly important “for long term interactions” (E01).
The Likert-scaled data points to both relevance (Mean = 5.48; SD = 1.44) and importance (Mean = 5.14; SD = 1.42) of establishing and maintaining relationships. However, it also shows significant disagreement among experts concerning this agent characteristic.

4.13. Respectful Acting

Respectful acting depicts another essential characteristic CAs should be equipped with. To this end, one expert pointed out that respect and consideration are an important component of social intelligence. Consequently, agents need to treat humans and other intelligent entities with respect, refraining from any type of discrimination. While predictability may hinder engagement and depth (cf. Section 4.6 and Section 4.8) (“persons amazing me are always unpredictable from a standard point of view”(E05)), universal respect towards an interlocutor is important. In other words, a CA should be “authentically respectful while being unpredictable” (E05). Respect is thereby described as a bilateral relationship, in that we have moral obligations towards other persons, animals, or any other kind of sentient being, and thus should also show respect towards CAs.
Experts consider respect as a pre-condition to all other characteristics, although implementing features that entirely prevent discrimination “may be problematic, as an agent would not be able to differentiate between its primary user and other people” (E18). Furthermore, it may be difficult to relate respect to changing parameters such as political correctness, as “our understanding of them changes as time goes by” (E19). Similarly, agents shall be able to understand human diversity and prevent any form of “prejudice or absolute judgments” (E18). Therefore, they “should [also] not behave as if they are sub-human or servants to humanity” (E20) as this may foster existing societal stereotypes.

4.14. Otherness

Simulating human traits may not suffice to make agents appear socially intelligent. Rather, they would need to show “honest and authentic behavior which does not simply mimic human or animal characteristics, but has its own unique appearance and interaction style” (E16). This indicates that future CAs may eventually depict “radically different intelligent entities; somewhat unrelated to human beings” (E19). To this end, the concept of otherness was brought up by some of our experts. Rather than being programmed to behave similarly to humans, future agents should be equipped with “enough learning tools to derive their own consciousness and reason upon how to communicate effectively with both humans and other agents” (E20). They could be able to “obtain enough reasoning to deduce key properties of human inter-personal relationship skills and expand upon that as their function sees fit” (E20).
The development of such a unique entity “will probably take some time, as new things are often modeled after something people already know” (E16), yet once people become accustomed to the otherness of CAs, it may eventually allow for the agents to “evolve into unique and human unrelated objects” (E16). However, “being an abstract object should not mean that these agents would not have any human/animal characteristics” (E16). They still need to be able to “translate their assembly language code, or whichever internal language system they use, into a language that we humans speak” (E20), as only then “communication could flow freely” (E20).
The quantitative data shows that the expert panel perceived this concept of otherness as relevant (Mean = 5.05; SD = 1.56) as well as important (Mean = 4.52; SD = 1.78). However, as illustrated by the very high standard deviation, experts advocate contrasting viewpoints. While some of them rated this characteristic as very relevant and very important, others, for example, noted that humanoid forms and characteristics of humaneness “are very relevant and very important […] and seem to be more engaging than other shapes” (E05).

4.15. Individual Personality

Finally, being equipped with or being capable of developing an individual personality was emphasized by seven experts. They argued that agents should be able to develop their own individual personality based on their experiences with their users. Such may be achieved by “learning the behavior patterns of users and mimicking behavioral trends so as to develop rapport” (E10). Establishing its own personality is an intensely social process, where a “set of different signals and responses in terms of behavior need to be developed” (E01). Agents would need to be equipped with the “ability to convey these signals so that others have clues about how to interact with them” (ibid.). Through this, “people may have an easier time relating to these agents as they would exhibit individual histories” (E12).
The expert ratings regarding individual personality point to a relevant (Mean = 4.95; SD = 1.69) yet only moderately important (Mean = 3.86; SD = 1.80) agent characteristic. However, the idea seemed to polarize opinions, as illustrated by the rather large standard deviation. That is, experts expressed very diverse viewpoints. Opponents of the characteristic, e.g., stated that “humans impute personality to many things that do not have them, thus it is not so important to build this in” (E01). On the other hand, “we expect everyone to be unique today […] once we get used to identical robotic personalities, who knows what will become the norm” (E19).

5. Meta Reflections

Our study identified three levels of agreement regarding the 14 identified characteristics socially intelligent agents should be able to master. For five of those characteristics, experts expressed strong consensus; for the remaining ones, they were either indecisive (five characteristics) or in disagreement (four characteristics). These three levels of consensus may be linked to the degree of maturity and abstraction the different characteristics hold. Characteristics with shared consensus are concrete and already tangible in today’s technology, whereas characteristics on which no agreement existed were described only vaguely and seem to serve as a long-term vision rather than as current design improvements for agent technology. For example, reflective language, on which experts strongly agreed, addresses concrete design recommendations such as optimal sentence length and the use of distinct vocabulary. Characteristics marked by disagreement, such as otherness, imply visionary recommendations, e.g., for agents to derive their own consciousness and reason about how to communicate effectively.
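For readers who wish to reproduce this kind of Delphi analysis, the sketch below shows how the descriptive statistics reported in Table 3 (mean, median, SD, IQR per round and their change) and an SD-based grouping into agreement levels could be computed. The data frame layout, the toy ratings, and the SD cut-offs are illustrative assumptions, not the coding scheme or thresholds used in the present study.

```python
import pandas as pd

# Minimal sketch: per-expert 7-point ratings per characteristic and Delphi round.
# Column names, toy values, and thresholds below are illustrative assumptions.
ratings = pd.DataFrame({
    "characteristic": ["Otherness"] * 4 + ["Reflective Language"] * 4,
    "round":          [2, 2, 3, 3, 2, 2, 3, 3],
    "expert":         ["E01", "E02", "E01", "E02", "E01", "E02", "E01", "E02"],
    "relevance":      [7, 3, 7, 3, 6, 5, 6, 5],
})

def iqr(x):
    # Inter-quartile range of a rating series
    return x.quantile(0.75) - x.quantile(0.25)

# Descriptive statistics per characteristic and round (cf. Table 3)
stats = (ratings
         .groupby(["characteristic", "round"])["relevance"]
         .agg(mean="mean", median="median", sd="std", iqr=iqr)
         .round(2))

# Absolute change between Round 2 and Round 3
delta = stats.xs(3, level="round") - stats.xs(2, level="round")

# Illustrative grouping of post-Round-3 standard deviations into agreement levels
def agreement_level(sd, consensus=1.0, indecision=1.4):
    if sd <= consensus:
        return "consensus"
    return "indecision" if sd <= indecision else "disagreement"

levels = stats.xs(3, level="round")["sd"].apply(agreement_level)
print(stats, delta, levels, sep="\n\n")
```

With a full ratings table (21 experts, 14 characteristics, relevance and importance), the same few lines would yield the round-wise statistics, their deltas, and a first-pass partition into consensus, indecision, and disagreement bands comparable to the grouping discussed above.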
Many of the more obvious features that may be linked to the identified characteristics have already been taken up by previous work (e.g., customizability of agents [80], support for reflective dialogue [81], the need to offer variable agent personalities [82]). The demand for context-related acting and respective enculturation, for example, supports recent work by Rato et al. [83], who found that context-aware and context-adaptable agents are perceived as more social. To this end, Griol and Callejas [84] have already provided a framework to develop respective CAs for mobile applications, Mavropoulos et al. [85] have demonstrated how video data can help adapt conversational agents to people’s behavior, and Bradley et al. [86] have shown how the analysis of voice commands may be used to augment a context model and consequently trigger appropriate agent actions. The call for reflective language is also supported by previous work, as it has been shown that a virtual agent’s mimicry enhances its perceived intelligence [87] and increases rapport [88]. Likewise, customizability has been found to improve an agent’s perception and compliance with its recommendations [80]. Concerning the engagement of CAs, previous work by Jusoh [89] suggests that active recommendation and negotiation increase perceived CA intelligence, whereas Gaffney et al. [90] underline a CA’s storytelling abilities as being supportive. Pro-activity in conversation management, as outlined by Wu et al. [91], may furthermore help convey a certain level of depth. Finally, consistency has previously been addressed as an important CA characteristic: Allbeck and Badler [92], for example, recommended consistent agent behavior as a means to prevent mixed messages and miscommunication, and Bentahar et al. [93] proposed a respective framework for CAs to take part in consistent conversations.
The challenge of maintaining continuous interaction with a CA has already been showcased by Campos et al. [94] as well as Xu et al. [95]. At the same time, it has been highlighted that these types of long-lasting interactions are required to establish/maintain relationships [96] with social agents. To this end, trustworthy CA behavior, supported by characteristics such as respectful honesty and respectful acting, has also been the subject of various previous studies (e.g., [97,98,99]), building upon which Guo et al. [100] have recently proposed respective CA design principles.
Finally, the 14 characteristics identified by our experts also align with many of the ASA measurement constructs and dimensions proposed by Fitrianie et al. [23]. That is, while some of them are explicitly found in this construct list (e.g., individual personality and engagement), others may be connected to or subsumed under single items. For example, consistency and respectful acting match an agent’s coherence, whereas context-related acting, reflective language, and customization support an agent’s believability. Similarly, continuous interaction and enculturation add to an agent’s sociability, while justifiability and depth underlie an agent’s intentionality. Furthermore, one may argue that establish/maintain relationships helps build and keep a potential user–agent alliance and that respectful honesty should foster a user’s trust in a CA.
One rather controversial finding of the present study, however, regards the question of whether the characteristics of future agents should be modeled after those of humans. The ability to simulate human traits has long been considered the hallmark of AI, famously showcased year after year in various Turing Test competitions. However, modern agent and (ro)bot systems seem to have outgrown this ambition. In her paper “Robots Should Be Slaves”, Bryson [101] even argues that robots should not be described as persons, as this would further dehumanize real people. This also aligns with recent arguments put forward by Pradhan and Lazar [102], who note that modeling a CA after a distinct persona may reinforce existing societal stereotypes. It resonates with the comments of some of our experts calling for some sort of otherness in agents, and is further strengthened by the argument that agents may only be perceived as authentic if they refrain from mimicking human behavior and instead show their own unique appearance and interaction styles. To this end, experts recommended that future agents be equipped with enough learning features so as to allow for a social coexistence of technology and humans.
Connected to this notion of otherness is the controversial topic of a distinct agent personality. While some argue against designing artificial personality traits, it is likely that people would subconsciously attribute certain characteristics and intentions to artificial entities anyway. For example, research has shown that people make judgments about agents’ personalities based on perceived voices or faces [103], with infant-like agents, for instance, being perceived as more sociable than agents with other types of faces [104]. Consequently, even if agents are not equipped with artificial personality characteristics, human interlocutors may assign predominant characteristics to them. Thus, by explicitly designing agent personalities, one could potentially counteract these arbitrarily assigned societal stereotypes.
Reducing the mimicry of human characteristics and appearances might further alleviate the feelings of eeriness users experience when interacting with agents, which stem from the so-called uncanny valley effect [49]. Moreover, future agents might need their own distinct moral rules, to which humans would then owe ethical obligations in the same way as they do to other humans [101]. This bilateral relationship is also found in the recommendation to build some sort of respectful acting into future agent technology. Here, one would further need to address questions concerning the hierarchical level agents should be classified on. In his paper, Coeckelbergh [105] takes the view that the rationale for respecting an agent (or, in his case, a robot) is not that it has moral agency, but that it belongs to a human and has value for that person. Consequently, humans have certain indirect obligations towards agents as property, similar to animals, for which humans have already accepted that some non-human moral beings should be treated with respect.
Finally, on a different note, both legal and AI experts have been trying to answer questions concerning the justifiability and responsibility of decisions and actions taken by artificial entities. To this end, some of our experts recommend that agents should have the ability to explain their actions, their purpose, as well as their aims. However, it seems safe to say that discussions concerning the accountability of agent behavior will continue as these AI systems become more ‘intelligent’.

6. Conclusions, Limitations, and Future Work

In conclusion, the results of this Delphi study with n = 21 experts from industry and academia produced 14 characteristics of social intelligence to be implemented by future conversational agents. While some of the identified characteristics are clearly in line with previous work, we believe the study was able to augment the existing body of knowledge, particularly with respect to discussions on socio-ethical aspects of AI (e.g., to which extent AI characteristics should be modeled after human characteristics). These discussions also raise questions regarding the moral agency and ethical obligations of, as well as towards, future agent technology.
The presented results furthermore show that the perception of social intelligence in conversational agents likely depends on a variety of interconnected characteristics, which should not be considered in isolation. In addition, some of the more visionary characteristics (e.g., CAs which develop their very own personality) may have undesired, even negative, impacts on humans and their relationship to technology. Consequently, future work needs to continue exploring AI applications and their role in society: our socio-ethical perception of the technology has to be shaped depending on whether we want future CAs to be slave-like entities, butlers, or our best friends.

6.1. Limitations

One limitation of the presented work regards the selection of participating experts. There was a clear focus on people who had previously published their thoughts and research results in international academic outlets, which excluded reports not presented in English. Furthermore, although we aimed for an international field of experts, the majority of those who contributed to the Delphi study were researchers based in Europe. Among them, the great majority were male, despite significant efforts put into recruiting female experts (i.e., reaching out to well-respected female researchers and using snowballing). This is unfortunate, and we thus want to underline that this clear gender/region bias may have had a significant impact on our study results.
Another limitation may be found in a certain researcher bias inherent to this type of qualitative research. We tried to mitigate potential misinterpretation by staying close to the words of the experts and used the cyclical nature of the Delphi methodology to deal with ambiguities and alterations and to elaborate on emerging ideas and themes.
Finally, a certain limitation may be found in our focus on Albrecht’s S.P.A.C.E. framework, which served as guidance for the presented research. Different scholarly theories on social intelligence may provide contrasting perspectives and should thus be used in future work to broaden our understanding of the scope social intelligence has in AI research.

6.2. Potential Future Research Directions

The findings of the presented study provide a starting point for a wide range of future investigations. Outcomes show that there is a certain disagreement on whether some characteristics of social intelligence should be considered in agent technology and, if so, in which way they should be included. The identified characteristics and their combination require further exploration so as to develop a more profound theoretical framework of socially intelligent conversational agents.
Respective characteristics should further be evaluated from a user’s, developer’s, and enterprise’s point of view. Moreover, the interplay of implemented characteristics requires extensive investigation. To this end, future work should especially address how characteristics could generate synergies and which of them are mutually incompatible.
Next, picking up on the most relevant characteristic identified by our study, i.e., context-related acting, future work should explore the influence of social intelligence in specific use cases.
Finally, with respect to socio-ethics, questions about moral agency and ethical obligations of AI should be addressed, and it should be examined whether the implementations of specific characteristics may generate crossover effects on how people treat each other.

6.3. Final Thoughts

With regard to future conversational agents, which should work for and collaborate with humans, there seems to be no doubt that the social characteristics of these entities will play a significant role in their acceptance. Today, conversational agents are already used in clinical and therapeutic contexts [106,107], in interactive education environments [108,109], as well as in e-commerce [110,111]. They are likely to propagate into further domains as well, where they will continue to engage, entertain, and potentially even enlighten us, to the point where we become used to them acting as our servants, butlers, or even ‘best friends’, perfectly adapted to our needs and preferences. Consequently, they need to fit into our world, which makes us responsible for them.
We operate and manufacture them. We determine their behavior and goals, either directly by specifying their ‘intelligence’ or indirectly by specifying how they acquire knowledge. This obligates us not to conversational agents per se but to society. Our responsibility towards future generations requires visionary thinking about how we want to use this type of AI technology and what we shall expect from this usage.
Ultimately, humans are social creatures and, as such, may require respective traits in AI technology. However, we may need to select these characteristics carefully, so that by emphasizing anthropomorphic behavior we (1) do not impede AI technology in reaching its full potential and (2) find an optimal balance between human–technology and human–human interactions.

Author Contributions

The article was a collaborative effort by all five co-authors. L.B. acted as the lead researcher in the presented study, which was supervised by S.S. Furthermore, S.S. wrote the original draft of the article and was significantly supported by A.M. Finally, P.S. and M.J. helped with the writing, reviewing, and editing of the final document. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and approved by the Ethics Committee of MCI—The Entrepreneurial School.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. McTear, M.F.; Callejas, Z.; Griol, D. The Conversational Interface; Springer: Cham, Switzerland, 2016; Volume 6. [Google Scholar]
  2. Ferrara, E.; Varol, O.; Davis, C.; Menczer, F.; Flammini, A. The rise of social bots. Commun. ACM 2016, 59, 96–104. [Google Scholar] [CrossRef]
  3. Turing, A.M. Computing machinery and intelligence. Mind 1950, 59, 433–460. [Google Scholar]
  4. Saygin, A.P.; Cicekli, I.; Akman, V. Turing test: 50 years later. Minds Mach. 2000, 10, 463–518. [Google Scholar] [CrossRef]
  5. De Ruyter, B.; Saini, P.; Markopoulos, P.; Van Breemen, A. Assessing the effects of building social intelligence in a robotic interface for the home. Interact. Comput. 2005, 17, 522–541. [Google Scholar] [CrossRef]
  6. Duffy, B.R. Anthropomorphism and the social robot. Robot. Auton. Syst. 2003, 42, 177–190. [Google Scholar] [CrossRef]
  7. Breazeal, C. Toward sociable robots. Robot. Auton. Syst. 2003, 42, 167–175. [Google Scholar] [CrossRef]
  8. Forlizzi, J. Robotic products to assist the aging population. Interactions 2005, 12, 16–18. [Google Scholar] [CrossRef]
  9. Du, X.; Zhao, X.; Wu, C.H.; Feng, K. Functionality, Emotion, and Acceptance of Artificial Intelligence Virtual Assistants: The Moderating Effect of Social Norms. J. Glob. Inf. Manag. (JGIM) 2021, 30, 1–21. [Google Scholar] [CrossRef]
  10. Justo, R.; Ben Letaifa, L.; Palmero, C.; Gonzalez-Fraile, E.; Torp Johansen, A.; Vázquez, A.; Cordasco, G.; Schlögl, S.; Fernández-Ruanova, B.; Silva, M.; et al. Analysis of the interaction between elderly people and a simulated virtual coach. J. Ambient. Intell. Humaniz. Comput. 2020, 11, 6125–6140. [Google Scholar] [CrossRef]
  11. Esposito, A.; Amorese, T.; Cuciniello, M.; Riviello, M.T.; Esposito, A.M.; Troncone, A.; Torres, M.I.; Schlögl, S.; Cordasco, G. Elder user’s attitude toward assistive virtual agents: The role of voice and gender. J. Ambient. Intell. Humaniz. Comput. 2021, 12, 4429–4436. [Google Scholar] [CrossRef]
  12. Gessl, A.S.; Schlögl, S.; Mevenkamp, N. On the perceptions and acceptance of artificially intelligent robotics and the psychology of the future elderly. Behav. Inf. Technol. 2019, 38, 1068–1087. [Google Scholar] [CrossRef]
  13. Shamekhi, A.; Czerwinski, M.; Mark, G.; Novotny, M.; Bennett, G.A. An exploratory study toward the preferred conversational style for compatible virtual agents. In Proceedings of the International Conference on Intelligent Virtual Agents, IVA 2016, Los Angeles, CA, USA, 20–23 September 2016; pp. 40–50. [Google Scholar]
  14. Clark, L.; Pantidi, N.; Cooney, O.; Doyle, P.; Garaialde, D.; Edwards, J.; Spillane, B.; Gilmartin, E.; Murad, C.; Munteanu, C.; et al. What makes a good conversation? Challenges in designing truly conversational agents. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, Glasgow, UK, 4–9 May 2019; pp. 1–12. [Google Scholar]
  15. Jeong, Y.; Lee, J.; Kang, Y. Exploring effects of conversational fillers on user perception of conversational agents. In Proceedings of the Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, Glasgow, UK, 4–9 May 2019; pp. 1–6. [Google Scholar]
  16. Elkins, A.C.; Derrick, D.C. The sound of trust: Voice as a measurement of trust during interactions with embodied conversational agents. Group Decis. Negot. 2013, 22, 897–913. [Google Scholar] [CrossRef]
  17. Jaques, N.; McDuff, D.; Kim, Y.L.; Picard, R. Understanding and predicting bonding in conversations using thin slices of facial expressions and body language. In Proceedings of the International Conference on Intelligent Virtual Agents, IVA 2016, Los Angeles, CA, USA, 20–23 September 2016; pp. 64–74. [Google Scholar]
  18. Lee, S.K.; Kavya, P.; Lasser, S.C. Social interactions and relationships with an intelligent virtual agent. Int. J. Hum.-Comput. Stud. 2021, 150, 102608. [Google Scholar] [CrossRef]
  19. Kumar, B.; Singh, A.V.; Agarwal, P. AI based Computational Trust Model for Intelligent Virtual Assistant. J. Inf. Syst. Telecommun. JIST 2021, 4, 263. [Google Scholar] [CrossRef]
  20. Glikson, E.; Woolley, A.W. Human trust in artificial intelligence: Review of empirical research. Acad. Manag. Ann. 2020, 14, 627–660. [Google Scholar] [CrossRef]
  21. Pitardi, V.; Marriott, H.R. Alexa, she’s not human but… Unveiling the drivers of consumers’ trust in voice-based artificial intelligence. Psychol. Mark. 2021, 38, 626–642. [Google Scholar] [CrossRef]
  22. Lee, J.H.; Lee, S.W.; Padget, J. Using social reasoning framework to guide normative behaviour of intelligent virtual agents. In Proceedings of the 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Miyazaki, Japan, 7–10 October 2018; pp. 2466–2471. [Google Scholar]
  23. Fitrianie, S.; Bruijnes, M.; Richards, D.; Bönsch, A.; Brinkman, W.P. The 19 unifying questionnaire constructs of artificial social agents: An iva community analysis. In Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents, Virtual Event, UK, 20–22 October 2020; pp. 1–8. [Google Scholar]
  24. Nass, C.; Steuer, J.; Tauber, E.R. Computers are social actors. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Boston, MA, USA, 24–28 April 1994; pp. 72–78. [Google Scholar]
  25. Heffernan, T. Fiction meets science: Ex Machina, artificial intelligence, and the robotics industry. In Cyborg Futures; Palgrave Macmillan Cham: London, UK, 2019; pp. 127–140. [Google Scholar]
  26. Mathur, M.B.; Reichling, D.B. Navigating a social world with robot partners: A quantitative cartography of the Uncanny Valley. Cognition 2016, 146, 22–32. [Google Scholar] [CrossRef]
  27. Fong, T.; Nourbakhsh, I.; Dautenhahn, K. A survey of socially interactive robots. Robot. Auton. Syst. 2003, 42, 143–166. [Google Scholar] [CrossRef]
  28. Bickmore, T.; Cassell, J. Social dialogue with embodied conversational agents. In Advances in Natural Multimodal Dialogue Systems; Springer: Dordrecht, The Netherlands, 2005; pp. 23–54. [Google Scholar]
  29. Fincannon, T.; Barnes, L.E.; Murphy, R.R.; Riddle, D.L. Evidence of the need for social intelligence in rescue robots. In Proceedings of the 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No. 04CH37566), Sendai, Japan, 28 September–2 October 2004; Volume 2, pp. 1089–1095. [Google Scholar]
  30. Douglas-Cowie, E.; Cox, C.; Martin, J.C.; Devillers, L.; Cowie, R.; Sneddon, I.; McRorie, M.; Pelachaud, C.; Peters, C.; Lowry, O.; et al. The HUMAINE database. In Emotion-Oriented Systems; Springer: Berlin/Heidelberg, Germany, 2011; pp. 243–284. [Google Scholar]
  31. Gunes, H.; Pantic, M. Dimensional emotion prediction from spontaneous head gestures for interaction with sensitive artificial listeners. In Proceedings of the International Conference on Intelligent Virtual Agents, IVA 2010, Philadelphia, PA, USA, 27–30 August 2010; pp. 371–377. [Google Scholar]
  32. Petridis, S.; Pantic, M. Audiovisual laughter detection based on temporal features. In Proceedings of the 10th International Conference on Multimodal Interfaces, Chania, Greece, 20–22 October 2008; pp. 37–44. [Google Scholar]
  33. Niewiadomski, R.; Bevacqua, E.; Mancini, M.; Pelachaud, C. Greta: An interactive expressive eca system. In Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems, Budapest, Hungary, 10–15 May 2009; Volume 2, pp. 1399–1400. [Google Scholar]
  34. Beinema, T.; Davison, D.; Reidsma, D.; Banos, O.; Bruijnes, M.; Donval, B.; Valero, Á.F.; Heylen, D.; Hofs, D.; Huizing, G.; et al. Agents United: An open platform for multi-agent conversational systems. In Proceedings of the 21st ACM International Conference on Intelligent Virtual Agents, Kyoto, Japan, 14–17 September 2021. [Google Scholar]
  35. DeVault, D.; Georgila, K.; Artstein, R.; Morbini, F.; Traum, D.; Scherer, S.; Rizzo, A.A.; Morency, L.P. Verbal indicators of psychological distress in interactive dialogue with a virtual human. In Proceedings of the SIGDIAL 2013 Conference, Metz, France, 22–24 August 2013; pp. 193–202. [Google Scholar]
  36. Lisetti, C.; Amini, R.; Yasavur, U.; Rishe, N. I can help you change! An empathic virtual agent delivers behavior change health interventions. ACM Trans. Manag. Inf. Syst. (TMIS) 2013, 4, 1–28. [Google Scholar] [CrossRef]
  37. Torres, M.I.; Olaso, J.M.; Montenegro, C.; Santana, R.; Vázquez, A.; Justo, R.; Lozano, J.A.; Schlögl, S.; Chollet, G.; Dugan, N.; et al. The empathic project: Mid-term achievements. In Proceedings of the 12th ACM International Conference on Pervasive Technologies Related to Assistive Environments, Rhodes, Greece, 5–7 June 2019; pp. 629–638. [Google Scholar]
  38. Schulte, J.; Rosenberg, C.; Thrun, S. Spontaneous, short-term interaction with mobile robots. In Proceedings of the 1999 IEEE International Conference on Robotics and Automation (Cat. No. 99CH36288C), Detroit, MI, USA, 10–15 May 1999; Volume 1, pp. 658–663. [Google Scholar]
  39. Dautenhahn, K. Ants don’t have friends—Thoughts on socially intelligent agents. Soc. Intell. Agents 1997, 97, 22–27. [Google Scholar]
  40. Albrecht, K. Social Intelligence: The New Science of Success; Jossey-Bass: New York, NY, USA, 2006. [Google Scholar]
  41. Wickens, C.D.; Hollands, J.G.; Banbury, S.; Parasuraman, R. Engineering Psychology and Human Performance; Psychology Press: New York, NY, USA, 2015. [Google Scholar] [CrossRef]
  42. Hoogendoorn, M.; van Lambalgen, R.M.; Treur, J. Modeling situation awareness in human-like agents using mental models. In Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, Barcelona, Spain, 16–22 July 2011. [Google Scholar]
  43. Kornienko, S.; Kornienko, O.; Levi, P. Collective AI: Context awareness via communication. In Proceedings of the 19th International Joint Conference on Artificial Intelligence, IJCAI’05, Edinburgh, UK, 30 July–5 August 2005; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 2005; pp. 1464–1470. [Google Scholar]
  44. Baylor, A.L. Promoting motivation with virtual agents and avatars: Role of visual presence and appearance. Philos. Trans. R. Soc. Lond. Ser. Biol. Sci. 2009, 364, 3559–3565. [Google Scholar] [CrossRef]
  45. Hone, K.; Akhtar, F.; Saffu, M. Affective agents to reduce user frustration: The role of agent embodiment. In Proceedings of the Human-Computer Interaction (HCI2003), Bath, UK, 22–27 June 2003. [Google Scholar]
  46. Kidd, C.; Breazeal, C. Effect of a robot on user perceptions. In Proceedings of the 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE Cat. No.04CH37566), Sendai, Japan, 28 September–2 October 2004; Volume 4, pp. 3559–3564. [Google Scholar]
  47. Lee, K.M.; Jung, Y.; Kim, J.; Kim, S.R. Are physically embodied social agents better than disembodied social agents?: The effects of physical embodiment, tactile interaction, and people’s loneliness in human–robot interaction. Int. J. Hum.-Comput. Stud. 2006, 64, 962–973. [Google Scholar] [CrossRef]
  48. Złotowski, J.; Sumioka, H.; Nishio, S.; Glas, D.F.; Bartneck, C.; Ishiguro, H. Appearance of a Robot Affects the Impact of Its Behaviour on Perceived Trustworthiness and Empathy. Paladyn. J. Behav. Robot. 2016, 7, 55–66. [Google Scholar] [CrossRef]
  49. Mori, M.; MacDorman, K.F.; Kageki, N. The uncanny valley [from the field]. IEEE Robot. Autom. Mag. 2012, 19, 98–100. [Google Scholar] [CrossRef]
  50. Siegel, M.; Breazeal, C.; Norton, M.I. Persuasive robotics: The influence of robot gender on human behavior. In Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, St. Louis, MO, USA, 10–15 October 2009; pp. 2563–2568. [Google Scholar]
  51. Baylor, A.L.; Kim, Y. Pedagogical agent design: The impact of agent realism, gender, ethnicity, and instructional role. In Proceedings of the International Conference on Intelligent Tutoring Systems, Maceió, Brazil, 30 August–3 September 2004; Springer: Berlin/Heidelberg, Germany, 2004; pp. 592–603. [Google Scholar]
  52. Guadagno, R.E.; Blascovich, J.; Bailenson, J.N.; McCall, C. Virtual humans and persuasion: The effects of agency and behavioral realism. Media Psychol. 2007, 10, 1–22. [Google Scholar]
  53. Gulz, A.; Haake, M.; Tärning, B. Visual Gender and Its Motivational and Cognitive Effects: A User Study. Lund Univ. Cogn. Stud. 2007, 137, 1–22. [Google Scholar]
  54. Niederhoffer, K.G.; Pennebaker, J.W. Linguistic style matching in social interaction. J. Lang. Soc. Psychol. 2002, 21, 337–360. [Google Scholar] [CrossRef]
  55. Fabri, M.; Moore, D.; Hobbs, D. Expressive agents: Non-verbal communication in collaborative virtual environments. In Proceedings of the Autonomous Agents and Multi-Agent Systems (Embodied Conversational Agents), Bologna, Italy, 15–19 July 2002. [Google Scholar]
  56. Baylor, A.; Ebbers, S. The pedagogical agent split-persona effect: When two agents are better than one. In Proceedings of the EdMedia + Innovate Learning 2003, Honolulu, HI, USA, 2003; Lassner, D., McNaught, C., Eds.; Association for the Advancement of Computing in Education (AACE): Honolulu, HI, USA, 2003; pp. 459–462. [Google Scholar]
  57. Kim, Y.; Baylor, A.L.; PALS Group. Pedagogical Agents as Learning Companions: The Role of Agent Competency and Type of Interaction. Educ. Technol. Res. Dev. 2006, 54, 223–243. [Google Scholar] [CrossRef]
  58. Kahn, P.H.; Ishiguro, H.; Friedman, B.; Kanda, T. What is a Human?: Toward psychological benchmarks in the field of human-robot interaction. In Proceedings of the ROMAN 2006—The 15th IEEE International Symposium on Robot and Human Interactive Communication, Hatfield, UK, 6–8 September 2006; pp. 364–371. [Google Scholar] [CrossRef]
  59. Neururer, M.; Schlögl, S.; Brinkschulte, L.; Groth, A. Perceptions on authenticity in chat bots. Multimodal Technol. Interact. 2018, 2, 60. [Google Scholar] [CrossRef]
  60. Persson, P.; Laaksolahti, J.; Lönnqvist, P. Understanding socially intelligent agents—A multilayered phenomenon. IEEE Trans. Syst. Man Cybern. Part Syst. Hum. 2001, 31, 349–360. [Google Scholar] [CrossRef]
  61. Leite, I.; Pereira, A.; Castellano, G.; Mascarenhas, S.; Martinho, C.; Paiva, A. Modelling empathy in social robotic companions. In Proceedings of the Advances in User Modeling, Girona, Spain, 11–15 July 2011; Ardissono, L., Kuflik, T., Eds.; Springer: Berlin/Heidelberg, Germany, 2011; pp. 135–147. [Google Scholar]
  62. Ono, M.; Fujita, M.; Yamada, S. Physiological and Psychological Responses to Expressions of Emotion and Empathy in Post-Stress Communication. J. Physiol. Anthropol. 2009, 28, 29–35. [Google Scholar] [CrossRef] [PubMed]
  63. Miner, A.S.; Milstein, A.; Schueller, S.; Hegde, R.; Mangurian, C.; Linos, E. Smartphone-Based Conversational Agents and Responses to Questions About Mental Health, Interpersonal Violence, and Physical Health. JAMA Intern. Med. 2016, 176, 619–625. [Google Scholar] [CrossRef] [PubMed]
  64. Motta, I.; Quaresma, M. Exploring the opinions of experts in conversational design: A Study on users’ mental models of voice assistants. In Proceedings of the International Conference on Human-Computer Interaction, Virtual Event, 26 June–1 July 2022; Springer: Cham, Switzerland, 2022; pp. 494–514. [Google Scholar]
  65. Fröhlich, M.; Hulm, P.; Alt, F. Under pressure. A user-centered threat model for cryptocurrency owners. In Proceedings of the 2021 4th International Conference on Blockchain Technology and Applications, Xi’an, China, 17–19 December 2021; pp. 39–50. [Google Scholar]
  66. Nelson, K.L.; Powell, B.J.; Langellier, B.; Lê-Scherban, F.; Shattuck, P.; Hoagwood, K.; Purtle, J. State Policies that Impact the Design of Children’s Mental Health Services: A Modified Delphi Study. Adm. Policy Ment. Health Ment. Health Serv. Res. 2022, 23, 1–14. [Google Scholar] [CrossRef] [PubMed]
  67. Tiberius, V.; Gojowy, R.; Dabić, M. Forecasting the future of robo advisory: A three-stage Delphi study on economic, technological, and societal implications. Technol. Forecast. Soc. Chang. 2022, 182, 121824. [Google Scholar] [CrossRef]
  68. Bu, X.; Ng, P.H.; Tong, Y.; Chen, P.Q.; Fan, R.; Tang, Q.; Cheng, Q.; Li, S.; Cheng, A.S.; Liu, X.; et al. A Mobile-Based Virtual Reality Speech Rehabilitation App for Patients with Aphasia after Stroke: Development and Pilot Usability Study. JMIR Serious Games 2022, 10, e30196. [Google Scholar] [CrossRef]
  69. Ziglio, E.; Adler, M. Gazing into the Oracle: The Delphi Method and Its Application to Social Policy and Public Health; Kingsley: London, UK, 1996. [Google Scholar]
  70. Hjørland, B. The foundation of the concept of relevance. J. Am. Soc. Inf. Sci. Technol. 2010, 61, 217–237. [Google Scholar] [CrossRef]
  71. Okoli, C.; Pawlowski, S. The Delphi method as a research tool: An example, design considerations and applications. Inf. Manag. 2004, 42, 15–29. [Google Scholar] [CrossRef]
  72. Gläser, J.; Laudel, G. Experteninterviews und Qualitative Inhaltsanalyse als Instrumente Rekonstruierender Untersuchungen; Lehrbuch, VS, Verl. für Sozialwiss: Wiesbaden, Germany, 2012. [Google Scholar]
  73. Mayring, P. Qualitative Content Analysis: Theoretical Foundation, Basic Procedures and Software Solution. AUT. 2014. Available online: https://www.semanticscholar.org/paper/Qualitative-content-analysis%3A-theoretical-basic-and-Mayring/18882a33873fc61b0f026f8ee31440a934eaa4a9 (accessed on 2 June 2022).
  74. Krüger, D.; Riemeier, T. Die qualitative Inhaltsanalyse—Eine Methode zur Auswertung von Interviews. In Methoden in der Naturwissenschaftsdidaktischen Forschung; Springer: Berlin/Heidelberg, Germany, 2014; pp. 133–145. [Google Scholar]
  75. Beech, B. Go the extra mile—Use the Delphi Technique. J. Nurs. Manag. 1999, 7, 281–288. [Google Scholar] [CrossRef]
  76. Hsu, C.C.; Sandford, B. The Delphi Technique: Making Sense of Consensus. Pract. Assess. Res. Eval. 2007, 12, 1–8. [Google Scholar]
  77. Skulmoski, G.J.; Hartman, F.T.; Krahn, J. The Delphi method for graduate research. J. Inf. Technol. Educ. Res. 2007, 6, 1–21. [Google Scholar] [CrossRef]
  78. Hasson, F.; Keeney, S.; Mckenna, H. Research guidelines for the Delphi Survey Technique. J. Adv. Nurs. 2000, 32, 1008–1015. [Google Scholar] [CrossRef]
  79. Holey, E.A.; Feeley, J.L.; Dixon, J.; Whittaker, V.J. An exploration of the use of simple statistics to measure consensus and stability in Delphi studies. BMC Med. Res. Methodol. 2007, 7, 52. [Google Scholar] [CrossRef]
  80. Paul, S.C.; Bartmann, N.; Clark, J.L. Customizability in conversational agents and their impact on health engagement. Hum. Behav. Emerg. Technol. 2021, 3, 1141–1152. [Google Scholar] [CrossRef]
  81. Pon-Barry, H.; Clark, B.; Schultz, K.; Bratt, E.O.; Peters, S.; Haley, D. Contextualizing reflective dialogue in a spoken conversational tutor. J. Educ. Technol. Soc. 2005, 8, 42–51. [Google Scholar]
  82. Sonlu, S.; Güdükbay, U.; Durupinar, F. A conversational agent framework with multi-modal personality expression. ACM Trans. Graph. (TOG) 2021, 40, 1–16. [Google Scholar] [CrossRef]
  83. Rato, D.; Couto, M.; Prada, R. Fitting the room: Social motivations for context-aware agents. In Proceedings of the 9th International Conference on Human-Agent Interaction, Virtual Event, Japan, 9–11 November 2021; pp. 39–46. [Google Scholar]
  84. Griol, D.; Callejas, Z. Mobile conversational agents for context-aware care applications. Cogn. Comput. 2016, 8, 336–356. [Google Scholar] [CrossRef]
  85. Mavropoulos, T.; Meditskos, G.; Symeonidis, S.; Kamateri, E.; Rousi, M.; Tzimikas, D.; Papageorgiou, L.; Eleftheriadis, C.; Adamopoulos, G.; Vrochidis, S.; et al. A context-aware conversational agent in the rehabilitation domain. Future Internet 2019, 11, 231. [Google Scholar] [CrossRef]
  86. Bradley, N.; Fritz, T.; Holmes, R. Context-aware conversational developer assistants. In Proceedings of the 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE), Gothenburg, Sweden, 27 May–3 June 2018; pp. 993–1003. [Google Scholar]
  87. Kaptein, M.; Markopoulos, P.; de Ruyter, B.; Aarts, E. Two acts of social intelligence: The effects of mimicry and social praise on the evaluation of an artificial agent. AI Soc. 2011, 26, 261–273. [Google Scholar] [CrossRef]
  88. Hale, J.; Hamilton, A.F.D.C. Testing the relationship between mimicry, trust and rapport in virtual reality conversations. Sci. Rep. 2016, 6, 35295. [Google Scholar] [CrossRef]
  89. Jusoh, S. Intelligent conversational agent for online sales. In Proceedings of the 2018 10th International Conference on Electronics, Computers and Artificial Intelligence (ECAI), Iasi, Romania, 28–30 June 2018; pp. 1–4. [Google Scholar]
  90. Gaffney, H.; Mansell, W.; Tai, S. Conversational agents in the treatment of mental health problems: Mixed-method systematic review. JMIR Ment. Health 2019, 6, e14166. [Google Scholar] [CrossRef]
  91. Wu, W.; Guo, Z.; Zhou, X.; Wu, H.; Zhang, X.; Lian, R.; Wang, H. Proactive human-machine conversation with explicit conversation goals. arXiv 2019, arXiv:1906.05572. [Google Scholar]
  92. Allbeck, J.M.; Badler, N.I. Towards behavioral consistency in animated agents. In Deformable Avatars; Springer: Boston, MA, USA, 2001; pp. 191–205. [Google Scholar]
  93. Bentahar, J.; Moulin, B.; Chaib-draa, B. Towards a formal framework for conversational agents. In Proceedings of the Agent Communication Languages and Conversation Policies AAMAS 2003 Workshop, Melbourne, Australia, 14 July 2003. [Google Scholar]
  94. Campos, J.; Kennedy, J.; Lehman, J.F. Challenges in exploiting conversational memory in human-agent interaction. In Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, Stockholm, Sweden, 10–15 July 2018; pp. 1649–1657. [Google Scholar]
  95. Xu, X.; Gou, Z.; Wu, W.; Niu, Z.Y.; Wu, H.; Wang, H.; Wang, S. Long Time No See! Open-Domain Conversation with Long-Term Persona Memory. arXiv 2022, arXiv:2203.05797. [Google Scholar]
  96. Bickmore, T.W.; Picard, R.W. Establishing and maintaining long-term human-computer relationships. ACM Trans. Comput.-Hum. Interact. (TOCHI) 2005, 12, 293–327. [Google Scholar] [CrossRef]
  97. Elkins, A.C.; Derrick, D.C.; Burgoon, J.K.; Nunamaker, J.F., Jr. Predicting users’ perceived trust in Embodied Conversational Agents using vocal dynamics. In Proceedings of the 2012 45th Hawaii International Conference on System Sciences, Maui, HI, USA, 4–7 January 2012; pp. 579–588. [Google Scholar]
  98. Seeger, A.M.; Pfeiffer, J.; Heinzl, A. When do we need a human? Anthropomorphic design and trustworthiness of conversational agents. In Proceedings of the SIGHCI 2017, AIS Electronic Library, Seoul, Korea, 10 December 2017. [Google Scholar]
  99. Müller, L.; Mattke, J.; Maier, C.; Weitzel, T.; Graser, H. Chatbot acceptance: A latent profile analysis on individuals’ trust in conversational agents. In Proceedings of the 2019 on Computers and People Research Conference, Nashville, TN, USA, 20–22 June 2019; pp. 35–42. [Google Scholar]
  100. Guo, Y.; Wang, J.; Wu, R.; Li, Z.; Sun, L. Designing for trust: A set of design principles to increase trust in chatbot. CCF Trans. Pervasive Comput. Interact. 2022, 1–8. [Google Scholar] [CrossRef]
  101. Bryson, J.J. Robots should be slaves. Close Engag. Artif. Companions Key Soc. Psychol. Ethical Des. Issues 2010, 8, 63–74. [Google Scholar]
  102. Pradhan, A.; Lazar, A. Hey Google, do you have a personality? Designing personality and personas for conversational agents. In Proceedings of the CUI 2021—3rd Conference on Conversational User Interfaces, Bilbao, Spain, 27–29 July 2021; pp. 1–4. [Google Scholar]
  103. Fussell, S.R.; Kiesler, S.; Setlock, L.D.; Yew, V. How people anthropomorphize robots. In Proceedings of the 3rd ACM/IEEE International Conference on Human Robot Interaction, HRI ’08, Amsterdam, The Netherlands, 12–15 March 2008; Association for Computing Machinery: New York, NY, USA, 2008; pp. 145–152. [Google Scholar] [CrossRef]
  104. Powers, A.; Kiesler, S. The advisor robot: Tracing people’s mental model from a robot’s physical attributes. In Proceedings of the 1st ACM SIGCHI/SIGART Conference on Human-Robot Interaction, HRI ’06, Salt Lake City, UT, USA, 2–3 March 2006; Association for Computing Machinery: New York, NY, USA, 2006; pp. 218–225. [Google Scholar] [CrossRef]
  105. Coeckelbergh, M. Moral appearances: Emotions, robots, and human morality. Ethics Inf. Technol. 2010, 12, 235–241. [Google Scholar] [CrossRef]
  106. Monnier, D. Woebot: A continuation of and an end to psychotherapy? Psychotherapies 2020, 40, 71–78. [Google Scholar]
  107. Inkster, B.; Sarda, S.; Subramanian, V. An empathy-driven, conversational artificial intelligence agent (Wysa) for digital mental well-being: Real-world data evaluation mixed-methods study. JMIR mHealth uHealth 2018, 6, e12106. [Google Scholar] [CrossRef]
  108. Schlimbach, R.; Rinn, H.; Markgraf, D.; Robra-Bissantz, S. A literature review on pedagogical conversational agent adaptation. In Proceedings of the Pacific Asia Conference on Information System, PACIS 2022, Virtual Conference, Sydney, Australia, 5–9 July 2022. [Google Scholar]
  109. Khalil, M.; Rambech, M. Eduino: A telegram learning-based platform and chatbot in higher education. In Proceedings of the International Conference on Human-Computer Interaction, Online, 26 June–1 July 2022; Learning and Collaboration Technologies. Novel Technological Environments; Zaphiris, P., Ioannou, A., Eds.; Springer International Publishing: Cham, Switzerland, 2022; pp. 188–204. [Google Scholar]
  110. Soares, A.M.; Camacho, C.; Elmashhara, M.G. Understanding the impact of chatbots on purchase intention. In Proceedings of the World Conference on Information Systems and Technologies, Budva, Montenegro, 12–14 April 2022; pp. 462–472. [Google Scholar]
  111. Alnefaie, A.; Singh, S.; Kocaballi, A.B.; Prasad, M. Factors influencing artificial intelligence conversational agents usage in the E-commerce field: A systematic. In Proceedings of the ACIS 2021, Sydney, Australia, 8–10 December 2021. [Google Scholar]
Table 2. Experts who participated in the Delphi study.

| No. | Sex | Age | Location | Placement | Expertise and/or Field of Work |
|-----|-----|-------|----------|-----------|--------------------------------|
| E01 | M | 25–34 | APAC | Academia | Methods and Philosophy of Agent-based Social Simulation |
| E02 | M | 55–64 | EMEA | Academia | Cybersecurity, mHealth, and Computer-mediated Communications |
| E03 | M | 35–44 | AMER | Academia | Artificial Intelligence, Natural Language Processing, Human–Computer Interaction |
| E04 | M | 45–54 | EMEA | Academia | Artificial Intelligence, Assistive Technologies, Data Science |
| E05 | M | 65+ | EMEA | Academia | Cybernetics, Psycholinguistics, Neurosciences and Cognitive Psychology |
| E06 | F | 45–54 | APAC | Academia | Linguistics, Cognition and Computation |
| E07 | M | 35–44 | AMER | Academia | Artificial Intelligence in Education, Serious Games, Intelligent Synthetic Agents |
| E08 | M | 35–44 | EMEA | Academia | Ethics of Artificial Intelligence, Human Enhancement Ethics, Animal Ethics |
| E09 | M | 25–34 | EMEA | Academia | Multimedia User Interfaces, Semantic Computing, and Search Engines |
| E10 | M | 25–34 | APAC | Academia | Human–Robot Interaction, Social Robotics, Embodied Conversational Agents |
| E11 | M | 25–34 | AMER | Academia | Cognitive Science, Machine Learning, Computational Linguistics |
| E12 | M | 45–54 | EMEA | Industry | Chief Scientist: Robotics company |
| E13 | F | 35–44 | EMEA | Industry | Scientist: Multinational consumer technology company |
| E14 | M | 25–34 | EMEA | Industry | Computational Linguist: Multinational internet technology company |
| E15 | M | 45–54 | EMEA | Industry | Co-founder: Voice platform company |
| E16 | F | 35–44 | AMER | Industry | Engineer: Industrial design agency developing robots |
| E17 | F | 45–54 | AMER | Industry | Product Designer: Multinational social media and networking company |
| E18 | M | 35–44 | AMER | Industry | Head of Development: Artificial intelligence marketplace |
| E19 | M | 25–34 | EMEA | Industry | Consultant: Professional services and auditing company |
| E20 | M | 25–34 | EMEA | Industry | Product Manager: AI platform company |
| E21 | M | 35–44 | AMER | Industry | Engineer: Multinational IT company |
Table 3. Level of expert agreement regarding the relevance and importance of characteristics identified in Round 1. Data are sorted in ascending order by the standard deviation in expert agreement after Round 3. We show descriptive statistics (i.e., mean, median, SD, and IQR) for Round 2 (R2) and Round 3 (R3) and their respective absolute change (Δ).

| Characteristic | Rating | Mean R2 | Median R2 | SD R2 | IQR R2 | Mean R3 | Median R3 | SD R3 | IQR R3 | Mean Δ | Median Δ | SD Δ | IQR Δ |
|----------------|--------|---------|-----------|-------|--------|---------|-----------|-------|--------|--------|----------|------|-------|
| Context-related Acting | Relevance | 6.19 | 7.00 | 1.03 | 1.00 | 6.38 | 7.00 | 0.74 | 1.00 | 0.19 | 0.00 | −0.29 | 0.00 |
| Context-related Acting | Importance | 6.10 | 6.00 | 1.04 | 1.00 | 6.33 | 6.00 | 0.73 | 1.00 | 0.24 | 0.00 | −0.31 | 0.00 |
| Reflective Language | Relevance | 5.76 | 6.00 | 0.77 | 1.00 | 5.71 | 6.00 | 0.72 | 1.00 | −0.05 | 0.00 | −0.05 | 0.00 |
| Reflective Language | Importance | 5.48 | 5.00 | 0.87 | 1.00 | 5.43 | 5.00 | 0.81 | 1.00 | −0.05 | 0.00 | −0.06 | 0.00 |
| Enculturation | Relevance | 5.95 | 6.00 | 1.16 | 1.00 | 6.19 | 6.00 | 0.81 | 1.00 | 0.24 | 0.00 | −0.35 | 0.00 |
| Enculturation | Importance | 5.76 | 6.00 | 1.09 | 2.00 | 5.90 | 6.00 | 0.89 | 2.00 | 0.14 | 0.00 | −0.20 | 0.00 |
| Customizability | Relevance | 5.67 | 6.00 | 1.02 | 1.00 | 5.71 | 6.00 | 0.96 | 1.00 | 0.05 | 0.00 | −0.06 | 0.00 |
| Customizability | Importance | 5.38 | 5.00 | 1.20 | 2.00 | 5.29 | 5.00 | 0.96 | 1.00 | −0.10 | 0.00 | −0.25 | −1.00 |
| Engagement | Relevance | 5.33 | 5.00 | 1.20 | 1.00 | 5.33 | 5.00 | 0.97 | 1.00 | 0.00 | 0.00 | −0.23 | 0.00 |
| Engagement | Importance | 5.19 | 5.00 | 1.33 | 2.00 | 5.05 | 5.00 | 0.97 | 0.00 | −0.14 | 0.00 | −0.35 | −2.00 |
| Consistency | Relevance | 5.52 | 6.00 | 1.12 | 1.00 | 5.67 | 6.00 | 1.06 | 1.00 | 0.14 | 0.00 | −0.06 | 0.00 |
| Consistency | Importance | 5.43 | 6.00 | 1.40 | 3.00 | 5.62 | 6.00 | 1.12 | 2.00 | 0.19 | 0.00 | −0.28 | −1.00 |
| Depth | Relevance | 5.62 | 5.00 | 1.24 | 2.00 | 5.29 | 5.00 | 1.10 | 1.00 | −0.33 | 0.00 | −0.14 | −1.00 |
| Depth | Importance | 5.10 | 5.00 | 1.61 | 3.00 | 5.00 | 5.00 | 1.18 | 1.00 | −0.10 | 0.00 | −0.43 | −2.00 |
| Continuous Interaction | Relevance | 5.24 | 5.00 | 1.22 | 1.00 | 5.19 | 5.00 | 1.17 | 1.00 | −0.05 | 0.00 | −0.05 | 0.00 |
| Continuous Interaction | Importance | 5.10 | 6.00 | 1.45 | 2.00 | 5.33 | 6.00 | 1.24 | 2.00 | 0.24 | 0.00 | −0.21 | 0.00 |
| Respectful Honesty | Relevance | 5.81 | 6.00 | 1.08 | 2.00 | 5.76 | 6.00 | 1.18 | 1.00 | −0.05 | 0.00 | 0.10 | −1.00 |
| Respectful Honesty | Importance | 5.90 | 6.00 | 1.04 | 1.00 | 5.90 | 6.00 | 1.18 | 1.00 | 0.00 | 0.00 | 0.13 | 0.00 |
| Justifiability | Relevance | 5.52 | 6.00 | 1.33 | 1.00 | 5.57 | 6.00 | 1.21 | 1.00 | 0.05 | 0.00 | −0.12 | 0.00 |
| Justifiability | Importance | 5.48 | 6.00 | 1.25 | 1.00 | 5.38 | 6.00 | 1.32 | 1.00 | −0.10 | 0.00 | 0.07 | 0.00 |
| Establish/Maintain Relationships | Relevance | 5.43 | 6.00 | 1.54 | 1.00 | 5.48 | 6.00 | 1.44 | 1.00 | 0.05 | 0.00 | −0.10 | 0.00 |
| Establish/Maintain Relationships | Importance | 5.19 | 5.00 | 1.54 | 1.00 | 5.14 | 5.00 | 1.42 | 1.00 | −0.05 | 0.00 | −0.11 | 0.00 |
| Respectful Acting | Relevance | 5.57 | 6.00 | 1.43 | 2.00 | 5.57 | 6.00 | 1.43 | 2.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Respectful Acting | Importance | 5.24 | 5.00 | 1.45 | 3.00 | 5.24 | 5.00 | 1.45 | 3.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Otherness | Relevance | 5.00 | 5.00 | 1.58 | 2.00 | 5.05 | 5.00 | 1.56 | 2.00 | 0.05 | 0.00 | −0.02 | 0.00 |
| Otherness | Importance | 4.52 | 5.00 | 1.81 | 3.00 | 4.52 | 5.00 | 1.78 | 3.00 | 0.00 | 0.00 | −0.03 | 0.00 |
| Individual Personality | Relevance | 4.95 | 5.00 | 1.75 | 2.00 | 4.95 | 5.00 | 1.69 | 2.00 | 0.00 | 0.00 | −0.06 | 0.00 |
| Individual Personality | Importance | 3.81 | 4.00 | 1.78 | 2.00 | 3.86 | 4.00 | 1.80 | 2.00 | 0.05 | 0.00 | 0.02 | 0.00 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
