Enhancing trust in autonomous vehicles through intelligent user interfaces that mimic human behavior

Autonomous vehicles use sensors and artificial intelligence to drive themselves. Surveys indicate that people are fascinated by the idea of autonomous driving, but are hesitant to relinquish control of the vehicle. Lack of trust seems to be the core reason for these concerns. In order to address this, an intelligent agent approach was implemented, as it has been argued that human traits increase trust in interfaces. Where other approaches mainly use anthropomorphism to shape appearances, the current approach uses anthropomorphism to shape the interaction, applying Gricean maxims (i.e., guidelines for effective conversation). The contribution of this approach was tested in a simulator that employed both a graphical and a conversational user interface, which were rated on likability, perceived intelligence, trust, and anthropomorphism. Results show that the conversational interface was trusted, liked, and anthropomorphized more, and was perceived as more intelligent, than the graphical user interface. Additionally, an interface that was portrayed as more confident in making decisions scored higher on all four constructs than one that was portrayed as having low confidence. These results together indicate that equipping autonomous vehicles with interfaces that mimic human behavior may help increase people’s trust in, and, consequently, their acceptance of them.


Introduction
Autonomous vehicles are vehicles that can drive without human control. This lack of control makes people uncertain whether the vehicles are reliable, and what they will do and why [1]. One solution to this could be to make the working of the vehicle more transparent, by transforming it into an intelligent agent that explains what it is doing. This explanation can be provided through a conversational interface that allows the driver to obtain information about what the car is doing in a natural way [2].
Such conversational interfaces already enable people to interact with wearables, robots, and other smart devices in a natural way, as if they were talking with another person [2]. When successfully applied in autonomous vehicles, those systems can assist the driver in a variety of tasks [3]. While the technology necessary for implementing such conversational interfaces is under development [2,4], the discussion on how knowledge about social interactions between humans and other technologies can be used in this domain has recently begun [5]. The objective of this paper is to investigate effects of providing explanations about behavioral decisions through a conversational interface on people's trust in and perceived agency of a self-driving car.

Autonomous Vehicles
Autonomous vehicles (AVs) use sensors, global positioning system coordinates and artificial intelligence to drive themselves without the active intervention of a human operator. The arrival of completely autonomous cars seems to be imminent. As a matter of fact, on-road testing has already begun and its progress is accelerating rapidly. Google's/Waymo's fleet of AVs, for instance, drove over one million kilometers autonomously in 2016, having only 124 reported disengagements, or an average of one every 8250 km [6]. In a few years from now, many big car manufacturers like Mercedes, Ford, and General Motors expect to have at least semi-autonomous vehicles out for purchase [7].
One of the many benefits of AVs is that they can reduce the number of road accidents [8]. In 2015, the US National Highway Traffic Safety Administration carried out a survey examining the critical reasons for automobile crashes [9]. Human factors were reported to account for more than 90% of all automobile crashes. With the introduction of self-driving cars, the effects of these human errors on road accidents will decrease. Broadly speaking, human errors that lead to crashes fall into four categories: intoxication, misjudgment, fatigue and distraction [10].
Effects of intoxication (such as driving under the influence of alcohol or drugs) can be prevented with technical means such as ignition interlocks, which seem to reduce alcohol-related crashes [11]. Misjudgments involve the inability of drivers to accurately derive sufficient information from relative velocity cues, which leads to rear-end collisions [12]. Drivers often engage in seemingly simple situations such as overtaking and crossing a high-speed road that are beyond their visual or perceptual capabilities [13]. Even when drivers of an AV have their eyes on a conflict object, they are unlikely to respond by taking action [14]. Fatigue and distractions are inherently human and play a pivotal role in the risk of crashing. Driving for longer periods of time (about 90 min) on monotonous highway environments significantly increases driver fatigue and is a potential cause of accidents [15]. Being distracted on the road is shown to increase the number of errors made by drivers [16]. All these risks can potentially be reduced immensely by keeping the vehicle in charge of all driving-related tasks.
The research on AVs is largely focused on technical aspects, safety and operation [17], but from a customer perspective a variety of concerns exist that thwart the acceptance of AVs, the roots of which appear to be the reluctance to trust them. Indeed, trust is one of the important determinants for people's intention to use AVs [18].

Trust in Automation
Mayer et al. [19] found that trustworthiness is based on the perceived integrity, benevolence, and ability of a given system. Personality traits and previous experiences with the system helped facilitate trust in it. Lee and See [20] proposed a model of the dynamic process that governs trust in automation and its effect on reliance. According to them, trust in automation develops in four stages, namely information assimilation, trust evolution, intention formation and reliance action. This conceptual model provides insightful information regarding trust in automation and describes guidelines that can help in appropriately calibrating trust, thereby avoiding misuse (where people use the technology in unintended ways, by trusting it too much) and disuse (where people do not utilize the capabilities of the technology).
Hoff and Bashir's model [21] describes three layers of trust: dispositional trust (an individual's overall tendency to trust automation), situational trust (trust that is context-dependent) and learned trust (evaluations of a system drawn from past experience or the current interaction). According to this model, whilst interacting with the system, trust is moderated by system performance, design features and the experience of the interaction itself.
In the partially automated vehicles of today, one way of establishing trust is by means of displays [22]. Existing display concepts in such cars visualize the state of the system, allowing the driver to monitor the automated components of the car. In doing so, these display concepts improve the transparency of the system and support predictability, which in turn generates trust [23]. McKnight and Chervany [24] propose that trust in AVs can be increased by the possibility to intervene and take over control from the system when desired, anthropomorphism, system transparency, and polite communication.

Taking over Control
An Accenture survey in 2011 found that people are apprehensive about giving up control of the vehicle, and that they are relieved to have the ability to take back control. An additional concern is that of legal liability, i.e., who is responsible in the event of a mishap [1,25]. Finally, many people enjoy driving, and would not like to relinquish control of the vehicle [26]. It can be concluded that the possibility to take over control when desired is an important determinant for people to trust AVs. However, when people do have the possibility to take back control, the quality of this take-over action varies for different traffic situations [27]. This is likely to be related to a loss of situation-awareness [28], especially when a secondary task is involved [29].
One way to give people the feeling that they are in control, whilst they are not actually in control, is to provide reasons and explanations. Interpretive control is a type of control that refers to people's search for meaning and understanding [30]. In order for an AV to communicate reasons and explanations to the driver, it needs to have a form of agency. This agency can be attributed based on its resemblance to a human agent.

Anthropomorphism
Anthropomorphism is generally defined as the degree to which people attribute human-like traits to non-human agents such as computers, robots, or AVs [31–36]. Providing human-like features of appearance is a very common approach in increasing trust in non-human agents. Indeed, human-like appearances act as a catalyst and speed up the ease with which trust is established [37,38]. The resemblance of the front of Google's driver-less car to that of a human face is a prime example of this approach. Likewise, Waytz et al. [39] tested the effects of anthropomorphism on trust and likability of AVs and found that people liked and trusted the vehicle more when it was anthropomorphized (given a name, gender and voice).
Of course, too much perceived human-likeness may also cause adverse effects. The system may create unrealistically high expectations. Studies in human-robot interaction showed that this could cause the system to be rated as uncanny [40], and in the automotive domain it could create over-trust, where people trust an AV even in situations where they should not [41]. It is therefore also important to communicate about the system's capabilities, or simply design for repeated interactions, since this is shown to decrease the eeriness of robots irrespective of their embodiment [42].
While most work on anthropomorphic interfaces focuses on rather superficial characteristics like gender, name or a human-like face, we believe that it takes more than this to establish human-like agency. Although it is evident from the work on the media equation theory [43,44] that superficial characteristics like a green ribbon may help to create a bond between a person and a computer and make people consider the computer as more friendly, intelligent and trustworthy, in most experiments reported in [44] it was rather the behavior of the computers that influenced people's judgments of that computer. Thus, we believe that, for an automated system to be trusted by virtue of showing human-like characteristics, behavioral aspects of human intelligence are more important than a mere human-like appearance. Intelligence here is meant in the sense of showing contextually appropriate behavior, reflecting a proper understanding of the situation and the ability to devise a suitable action.

Transparency
The information a car provides when it drives itself is the most direct information for a driver to try and understand its behavior. Dependability and predictability are the basis of trust as people's experiences with autonomous systems become more regular. Clarifying unexpected actions by providing system transparency can increase trust, which in turn increases usage [45]. In testing information provision, Khastgir et al. [46] determined that providing information to the user has a positive effect: an increase in the number of failures affects trust in autonomous systems less when information is provided than when information is not provided. Verberne et al. [47] found that Adaptive Cruise Control (ACC) systems that take over control of the vehicle whilst providing information about how hard it would accelerate or brake are evaluated as more trustworthy and acceptable than ACC systems that do not provide this information.
Similar ideas can be found in the work of Koo et al. [48]. They provided three kinds of messages to the driver of an AV: the how message (information about how the car is acting, such as "The car is braking"), the why message (situational information explaining the reason for engaging automation, such as "Obstacle ahead") and a combination of both these messages ("The car is braking due to an obstacle ahead"). Results revealed that the safest driving performance was observed when a combination of the two messages was provided, although it was the least preferred, due to its high cognitive load. It is therefore important to consider the way in which a message is provided.
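The three message types can be made concrete with a minimal Python sketch. The example wording is taken from the description above; the enumeration, the dictionary, and the function name are hypothetical and are not part of the system used by Koo et al. [48].

```python
from enum import Enum


class MessageType(Enum):
    """The three message conditions described above (labels are illustrative)."""
    HOW = "how"
    WHY = "why"
    HOW_AND_WHY = "how+why"


# Example wording quoted from the text; the data structure is an assumption.
BRAKING_MESSAGES = {
    MessageType.HOW: "The car is braking",
    MessageType.WHY: "Obstacle ahead",
    MessageType.HOW_AND_WHY: "The car is braking due to an obstacle ahead",
}


def message_for(condition: MessageType) -> str:
    """Return the message shown to the driver in the given condition."""
    return BRAKING_MESSAGES[condition]


if __name__ == "__main__":
    for condition in MessageType:
        print(f"{condition.value}: {message_for(condition)}")
```

Keeping the combined message as a separate entry, rather than concatenating the other two, reflects that it was treated as a condition of its own.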

Polite Communication
Grice [49] explains conversation as a collaborative effort, where each conversationalist complies with a general principle. This principle is called the cooperative principle: "Make your conversational contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange in which you are engaged" [49] (p. 45). Rules for effective communication that comply with this principle have been proposed by Grice, and are hence known as the Gricean maxims.
Four categories of maxims are proposed. The maxim of quantity focuses on the amount of information that should be included in a contribution to a conversation ("give as much information as is needed, and no more"). The maxim of quality states that information that is provided needs to be believed to be true ("do not give information that is false or that is not supported by evidence"). The maxim of relation states that any contribution to a discussion needs to have a certain level of relevance ("be relevant and say things that are pertinent to the discussion"). The maxim of manner is related to the following of social norms ("be clear, brief, and orderly, and avoid obscurity and ambiguity").
Since cooperativeness applies at each moment in a conversation, the constant monitoring of the understanding of the addressee may lead speakers to speed up if they notice that the addressee is familiar with the subject matter, but also to slow down when they notice that the addressee has problems digesting the information, and to give additional explanation when there are problems, e.g., by explaining difficult terms.
An important issue is how to shape a conversation between an AV and its user. The work of Nass and colleagues [50,51] indicates that adding a human voice to technology makes people treat it as more human-like. Schroeder and Epley [52] found that participants rated another person as more mindful (i.e., more thoughtful, more rational) when they heard the audio of an interview than when they read a transcript of the same interview. Qiu and Benbasat [53] demonstrated that a text-to-speech voice exerts significant effects on users' perceptions of trust in technology. This is consistent with the findings of Jensen et al. [54], who showed that more immediate forms of communication, like text-to-speech, exhibit a greater impact on the development of trust.
These observations about the effects of having a human voice provide the inspiration for designing the interface of an AV incorporating human-like behavior. By identifying situations where users may have problems understanding the behavior of the AV and communicating about this in a plausible way, the AV could potentially help users understand what is going on and consequently trust the system more.

The Current Study
In the current study, we take human communication, more specifically spoken conversation, as the pre-eminent example of a behavioral aspect of human intelligence. In order to explore people's responses to spoken messages in an AV, an experiment was performed to test the effects of conversational intelligence on agency and trust.
Based on the work mentioned thus far, we hypothesized that users of an AV would appreciate being informed of its decision-making capacity and self-reflection, and that providing this information would facilitate the establishment of trust. In order to provide an AV with these attributes, information that reflects aspects of human intelligence must be exchanged with the driver/user. One of the ways to do this is by means of a conversational interface. Since conversation itself is an inherently human trait, simulating a conversation between an AV and its user may enhance perceived human-likeness. Therefore, we expected that an autonomous vehicle that employs a conversational user interface would be anthropomorphized more than an autonomous vehicle without such an interface.
An autonomous vehicle that is portrayed as being aware of its limitations and transparent about its decisions can be perceived as intelligent, and findings in earlier studies show an emerging trend of higher trust in intelligent agents. However, an AV that communicates intelligence by informing the driver that it is not confident about driving in the current situation may cause adverse effects. To find out whether this is the case, we explored whether an AV that communicates high confidence would be trusted more than an AV that communicates low confidence.
These relations were investigated in a driving simulator experiment in which people experienced different driving scenarios. We manipulated the type of interface and the confidence level of the system, and tested effects on trust, anthropomorphism, and likability.

Design
The experiment had a 2 (Interface type: Graphical vs. Conversational) × 2 (Confidence level: High vs. Low) mixed design. Confidence level of the system was manipulated within-subjects, with all participants experiencing both conditions. Order of the conditions was counterbalanced across participants. Interface type was manipulated between-subjects, with 26 participants experiencing a Graphical User Interface (GUI) and 31 experiencing the same interface with an additional layer of spoken messages, which will be referred to as Conversational Interface (CUI).
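The structure of this mixed design can be illustrated with a small assignment sketch in Python. It assumes a simple alternating scheme for counterbalancing the order of the confidence levels; the actual assignment procedure is not described in the text, and the function and field names are hypothetical.

```python
from itertools import cycle

INTERFACES = ("GUI", "CUI")                  # between-subjects factor
ORDERS = (("High", "Low"), ("Low", "High"))  # within-subjects factor, two orders


def assign_conditions(participant_ids, interface):
    """Give each participant in one interface group both confidence levels,
    alternating which level is experienced first (assumed scheme)."""
    if interface not in INTERFACES:
        raise ValueError(f"unknown interface: {interface}")
    order_cycle = cycle(ORDERS)
    return [
        {"id": pid, "interface": interface, "order": next(order_cycle)}
        for pid in participant_ids
    ]


# Group sizes as reported: 26 participants with the GUI, 31 with the CUI.
gui_group = assign_conditions(range(1, 27), "GUI")
cui_group = assign_conditions(range(27, 58), "CUI")

if __name__ == "__main__":
    print(len(gui_group), len(cui_group))
    print(gui_group[0], gui_group[1])
```

Every participant receives both confidence levels, while each participant sees only one interface type, which is what makes the design mixed.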

Materials
The experiment took place in a fixed-base driving simulator, as shown in Figure 1a. The GUI is shown in Figure 1b, and was designed using Adobe Illustrator (Adobe Systems, San Francisco, CA, USA) and Processing 3.3 (Processing Foundation, Boston, MA, USA). It acted as a platform that facilitates information exchange between the user and the system. A simple design was used, avoiding unnecessary information and showing as few components as possible, while maintaining its essence: showing the speedometer, the fuel gauge and the system's confidence level.
Confidence level of the system (i.e., the extent to which the system was certain about the decisions it took) was indicated with an icon in the interface showing either high (90%) or low (30%) confidence (see Figure 2). A threshold was marked by a red line just above the 3rd bar (of a total of 10 bars) in the system confidence bar. The two levels of system confidence were chosen based on Helldin et al. [55], and communicated to participants by telling them how certain the system was to drive itself.
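The mapping from a confidence percentage to the ten-bar indicator can be sketched as follows. This is an illustrative text rendering only, not the Processing code used for the actual GUI; the function names and the rounding rule are assumptions.

```python
TOTAL_BARS = 10
THRESHOLD_BARS = 3  # the red line is drawn just above the 3rd bar


def bars_filled(confidence_pct: float) -> int:
    """Number of filled bars for a confidence percentage between 0 and 100."""
    if not 0 <= confidence_pct <= 100:
        raise ValueError("confidence must be between 0 and 100")
    return round(confidence_pct * TOTAL_BARS / 100)


def render_bar(confidence_pct: float) -> str:
    """Text rendering: '#' is a filled bar, '.' an empty bar, '|' the threshold."""
    filled = bars_filled(confidence_pct)
    cells = ["#" if i < filled else "." for i in range(TOTAL_BARS)]
    cells.insert(THRESHOLD_BARS, "|")
    return "".join(cells)


if __name__ == "__main__":
    print(render_bar(90))  # ###|######.
    print(render_bar(30))  # ###|.......
```

With these assumptions, the high-confidence condition fills nine of the ten bars, while the low-confidence condition fills three and so reaches the threshold line without passing it.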
A scenario was created consisting of an urban environment in which the simulator drove for five minutes, followed by a highway environment, where it drove for another five minutes. In accordance with the Gricean maxims, voice messages were played in the CUI condition in situations that demanded explanation. The content of these messages is included in the scenario description below. The complete GUI for the conditions of manual driving, autonomous driving at 30% certainty, and autonomous driving at 90% certainty is shown in figures 4.2, 4.3 and 4.4, respectively. The only difference in the GUI functionality between the autonomous and manual driving modes was the fading away or lighting up of the icon.
The first five minutes of the scenario consisted of the AV driving in an urban environment. The route was a combination of two-lane roads and cobbled paths with pedestrians. In a typical session, the AV started on a straight stretch of a two-lane road, and stopped at a traffic light. When the light turned green, the AV gave way to a bicyclist ("I'm giving way to the bicyclist."). By the time the bicyclist had passed, the light had turned yellow and the AV waited until it turned green again ("The light is about to turn red."). Then, the vehicle turned right and drove along a curved road, taking a right at the end of it. After going straight for a while, the vehicle yielded at a stop sign and then proceeded onto a cobbled road, slowing down as it did ("We're on a cobbled road with pedestrians, I'm slowing down."). At the end of the cobbled road, the AV turned left after yielding for a motorcyclist ("I'm yielding."). Then, it drove on a two-lane road and waited at a traffic light. It yielded for another motorcyclist when the light turned green, then turned left. When alternating between the confidence levels of the AV, the low confidence level was characterized by not giving the right of way to bicyclists and motorcyclists.
For the next five minutes of the session, the AV drove on a highway, mostly at a constant speed of approximately 100 km/h. While in the left lane, ready to overtake a car, it returned to the right lane to give way to a faster car ("I'm letting a faster vehicle overtake me."). In the Low confidence condition, the vehicle exited the highway about two minutes after it started, slightly going off the lane. In two cases, at this point, the vehicle crashed ("I'm terribly sorry about that, I hope you're okay!"). On entering the highway again, the AV drove onto the shoulder of the highway at the end of the on-ramp, and slowed down to a halt. The vehicle then waited until the traffic on the highway lanes cleared ("I need the traffic to reduce before I can safely enter the highway."), then entered the highway and kept driving.
The CUI was identical to the GUI, with an added layer of spoken messages played by the simulator at fixed points during the experiment. These spoken messages explained the behavior of the vehicle in instances where required. Since we wanted to focus on the evaluation of the effects of conversational intelligence, a Wizard-of-Oz approach was applied for the CUI instead of spending effort on the technical implementation of the CUI.
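Because the messages were scripted rather than generated, the spoken layer can be thought of as a lookup from scenario events to fixed utterances. The sketch below illustrates this idea in Python. The utterances are quoted from the scenario description, whereas the event labels and the function name are hypothetical and do not describe the simulator software that was actually used.

```python
from typing import Optional

# Scripted utterances keyed by scenario event. The wording is quoted from the
# scenario description; the event names are hypothetical labels.
SCRIPT = {
    "yield_to_bicyclist": "I'm giving way to the bicyclist.",
    "light_turning_red": "The light is about to turn red.",
    "cobbled_road": "We're on a cobbled road with pedestrians, I'm slowing down.",
    "yield_to_motorcyclist": "I'm yielding.",
    "faster_vehicle": "I'm letting a faster vehicle overtake me.",
    "wait_to_merge": "I need the traffic to reduce before I can safely enter the highway.",
}


def utterance_for(event: str, interface: str) -> Optional[str]:
    """Return the spoken message for an event, or None when nothing is played.

    Only the CUI condition has the spoken layer; the GUI condition is silent.
    Events without a scripted message also return None.
    """
    if interface != "CUI":
        return None
    return SCRIPT.get(event)


if __name__ == "__main__":
    print(utterance_for("yield_to_bicyclist", "CUI"))
    print(utterance_for("yield_to_bicyclist", "GUI"))
```

A fixed script of this kind keeps the content of the messages identical across participants, so that differences between conditions can be attributed to the presence of the spoken layer.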
Voice messages were created using NaturalReader 14 (NaturalSoft Ltd., Vancouver, BC, Canada), which is a standard text-to-speech software that reads out loud content typed in a text-box. Participants were asked to choose a gender for the autonomous vehicle in which they would drive. Based on their answer, a male or female voice was used in conjunction with the interface.
Before the experiment, participants completed an online introductory questionnaire that consisted of demographic information, attitudes towards autonomous driving, and propensity to trust autonomous vehicles. Attitudes towards autonomous driving were measured with eight questions (a mixture of yes/no answers, multiple choice answers and Likert-scale judgments), extracted and adapted from Kyriakidis et al. [ [61]). This model calculates so-called person ability scores: values that are calibrated to the probability of items being answered with "yes". More specifically, an item that is unlikely to be answered with "yes" is given a higher weight than an item that has a high probability of being answered with "yes". With this method, relative differences between the items are included in the calculation of a person's score, and as such it is a closer representation of attributions of agency.

Procedure
Participants signed up for the study online, after which they completed the introductory questionnaire that contained demographic questions, their preferred gender of the automated vehicle (used for determining the voice for the CUI), and questions on their attitudes towards autonomous driving and propensity to trust AVs.
Upon arrival in the lab, participants read and signed an informed consent form. They then took a seat in the simulator, and drove it for three minutes to familiarize themselves with the vehicle before being driven around by it. This was done to induce a sense of realism because directly making them sit through autonomous driving in the simulator might make it seem like watching a video recording where the participants could potentially remove themselves from all responsibility.
Participants in the GUI condition were given an explanation about the elements in the interface by the researcher, whereas those in the CUI condition received the exact same explanation as a spoken message from the vehicle. Next, participants experienced the first ride, either with high or with low system confidence. At the end of the ride, participants left the simulator to complete the questionnaires.
They then experienced the second ride (with the other confidence level) and completed the questionnaires a second time. At the end of the experiment, participants were thanked for their contribution, debriefed, and paid for their participation.

Results
In this section, we test the expectation that an AV with a CUI is trusted and anthropomorphized more than an AV without such an interface, and whether an AV that communicates high confidence is trusted more than an AV that communicates low confidence. Since an AV that is portrayed as being aware of its limitations and transparent about them can be perceived as more intelligent and likable, we also explore effects on perceived intelligence and likability.
Before testing effects of Interface type and Confidence level, we checked whether participants in the two between-subjects groups differed in their attitudes towards AVs. Average values of the two indicators were submitted to an ANOVA with the two Interface types as groups. No differences were found on either attitudes towards autonomous driving (F(1, 56) = 0.09, p = 0.77) or propensity to trust AVs (F(1, 56) = 1.59, p = 0.21). We also checked whether the order of the two Confidence levels influenced people's evaluations of the system. No effects of order on any of the variables of interest were found, all p-values > 0.12. Finally, we checked whether the gender of the AV influenced any of the variables of interest, and again no effects were found, with all p-values > 0.15.
To test effects of Interface type and Confidence level, scores on all four constructs were submitted to a 2 (Interface type: GUI vs. CUI) × 2 (Confidence level: High vs. Low) Univariate General Linear Model (strictly speaking, scale judgments are ordinal data, for which non-parametric tests are more appropriate, but such tests do not allow testing for interactions). Significant main effects of both Interface type and Confidence level were found on all four constructs (see Tables 1 and 2). None of the interactions were significant.
The CUI was trusted more than the GUI, with F(1, 55) = 8.81, p = 0.004, ηp² = 0.14. In addition, the interface with high confidence was trusted more than the interface with low confidence, with F(1, 55) = 11.56, p = 0.001, ηp² = 0.17. These effects are visualized in Figure 3.
The CUI was perceived as more intelligent than the GUI, with F(1, 55) = 6.60, p = 0.013, ηp² = 0.11. In addition, the interface with high confidence was perceived as more intelligent than the interface with low confidence, with F(1, 55) = 13.25, p = 0.001, ηp² = 0.19. These effects are visualized in Figure 4.
The CUI was perceived as more human-like than the GUI, with F(1, 55) = 41.35, p = 0.001, ηp² = 0.19. In addition, the interface with high confidence was perceived as more human-like than the interface with low confidence, with F(1, 55) = 9.18, p = 0.004, ηp² = 0.14. These effects are visualized in Figure 5.
In addition, the interface with high confidence scored higher on likability than the interface with low confidence, with F(1, 55) = 14.73, p < 0.001, ηp² = 0.21. These effects are visualized in Figure 6. Although no significant interaction effects were found, the figures above show that the effects of Interface type tend to be slightly bigger for low confidence than for high confidence.
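For readers who want to relate the reported effect sizes to the F statistics, the following Python sketch applies the standard identity between an F value, its degrees of freedom, and partial eta squared. It assumes that the effect sizes reported above are partial eta squared values; the function name is illustrative, and the two example inputs are the trust effects reported in this section.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic.

    Uses the identity eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df values positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    # Trust, main effect of Interface type: F(1, 55) = 8.81
    print(round(partial_eta_squared(8.81, 1, 55), 2))   # 0.14
    # Trust, main effect of Confidence level: F(1, 55) = 11.56
    print(round(partial_eta_squared(11.56, 1, 55), 2))  # 0.17
```

Both values reproduce the effect sizes reported for trust, which illustrates how the effect size follows from the F value and the degrees of freedom alone.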

Discussion
This paper presented a study that investigated whether a conversational interface that is portrayed as being aware of its limitations and transparent about them is trusted and anthropomorphized more than a more generic graphical user interface. In addition, the confidence level of the system was manipulated to be high or low. We expected the type of interface to affect drivers' perceptions of trust and agency in the system, and we explored whether these perceptions would differ between the system's confidence levels.
Results showed that the CUI scored higher than the GUI on all four constructs: trust, perceived intelligence, anthropomorphism and likability. Similar effects were found for the interface with high confidence compared to the one with low confidence. These results indicate that a conversational interface that explains why it behaves in a certain way is trusted more, is considered to be more intelligent, is seen as more human-like, and is liked more. These findings show that adding a layer of human intelligence to an AV by making it communicate about its behavioral decisions may be beneficial for the design of such systems in terms of trust and acceptance.
An important aspect of this layer of human intelligence is that it should include an explanation as to why certain decisions are made. A recent study showed that providing messages that explain why a certain recommendation is provided, rather than only providing the recommendation by itself, increases adherence to these recommendations [62]. It seems that people like to understand why things are happening, and, when they do, they are more likely to accept the decisions made by an AV.
The findings of the current study are similar to observations in the area of recommender systems. Tintarev and Masthoff stated that "a user may be more forgiving, and more confident in recommendations, if they understand why a bad recommendation has been made" [63] (p. 802). This may also make people more likely to accept mistakes made by AVs, as long as those mistakes do not have severe negative consequences (e.g., unnecessary waiting to overtake another car or other measures that cause a short time delay). Likewise, Sinha and Swearingen showed that users of recommender systems like and feel more confident about recommendations when they perceive them as more transparent [64]. It would be interesting to study how people respond to small mistakes that are either communicated transparently or not in the context of self-driving cars.
On the other hand, Culley and Madhavan argue that care must be taken that users do not ascribe undue trust to agents because they are depicted as capable of human reasoning and possessing human motivations [65]. Such undue trust, or over-trust, is known to have several disadvantages. Based on experiences with automation in aviation, Stanton and Marsden state that, when a driver who usually relies on auto-pilot is suddenly asked to perform a previously auto-piloted action, s/he will be prone to making mistakes [66].
One explanation for this is that higher levels of trust are associated with lower monitoring frequencies [67,68]. More specifically, people tend to be less inclined to monitor the performance of a system when they trust that system more. Moreover, Llaneras et al. found that increased trust could lead drivers to allocate less visual attention to the road ahead [69].
Moray and Inagaki claim that excessive trust in automation can compromise people's readiness to take over manual control [70]. Over-trust in AVs also leads to slower reaction times when people need to intervene in case of an emergency, so-called manual control recovery [71]. Abe et al. discovered that reaction times to warnings increased with increasing trust in the vehicle [72]. Reactions to take-over requests also lead to low-quality decisions [73], showing that people are not well prepared to take over in case the car approaches a difficult situation. A conversational interface could potentially solve these problems by keeping the driver in the loop of its decision-making process, thereby potentially avoiding the increase in reaction times.
The conversational interface was also perceived as more human-like than the graphical user interface. This should not come as a surprise, given that having a (human) voice has been strongly associated with human-likeness [50,74]. Our findings are also in line with earlier work on anthropomorphism and trust in the domains of artificial agents [75] and AVs [39]. An interesting question that remains is what would happen if the conversational interface did not comply with the Gricean maxims. More specifically, if the explanatory messages from such an interface contained too little, false, or irrelevant information, those messages might actually decrease people's trust in the AV. Since young children already seem to be sensitive to violations of Gricean maxims [76], it may be the case that compliance is an important determinant of the success of conversational interfaces.

Limitations and Future Work
In this study, a driving simulator was used to investigate effects of behavioral aspects of human intelligence. While using a driving simulator gave us a high degree of experimental control, it offered limited ecological validity. There is a substantial difference between a simulator that makes minor errors in its driving behavior and an actual self-driving car on the road that makes the same errors: the stakes in a simulator are arguably lower.
When AVs are deployed on the roads, alongside actual traffic, this limitation will become evident. Simulator fidelity (the exactness with which real situations are felt in a simulator) is known to affect user opinion [77]. Given that all participants were aware that they were driving (or being driven in) a simulator, and not an actual vehicle, there was no sense of jeopardy. Perceived safety and trust in the vehicle might have been influenced by this lack of actual danger. In future work, the addition of critical situations, such as a sudden traffic accident, could help increase the sense of danger. Another option would be to include certain stakes in a simulator study, such as monetary costs for participants when errors are made (which is a common procedure in the field of economics). These measures could increase the ecological validity of simulator studies, giving us more information about what would happen in real-world situations.
In addition, given that a Wizard-of-Oz approach was used to demonstrate intelligent behavior, it still has to be shown whether an actual system would achieve the same transparency and earn the same trust. For this to happen, it should understand which situations require explanation (obviously, according to the cooperative principle of Grice not everything should be explained), and this requires deep understanding of the situational and communicative context.
One way to deal with this challenge would be to use information from the users, such as facial expressions, just as human speakers do, but it remains to be shown how effective this is. Real-time detection of facial expressions has been possible for more than a decade [78], and recent findings show that these expressions are not always as one would expect [79]. The results of the current study indicate that it might be worthwhile to further explore such approaches in the domain of conversational interfaces.
Another limitation of the current approach is that it is not possible to know whether the effects of interface type were caused by the fact that a conversational interface was introduced, or that the messages from that interface complied with the Gricean maxims. Future studies should be designed to disentangle these two elements, by having two conversational interfaces: one that complies with the Gricean maxims and one that violates them.
Furthermore, both high- and low-confidence conditions were used, and it might be argued that the ultimate goal of AVs is to achieve 100% safety and thereby reduce the number of situations that require explanations. As argued above, users might need explanation primarily for cases that do not meet their expectations. If an autonomous vehicle makes no mistakes and always behaves as expected, there might be no need for explanations. This appears to be compatible with the trend in the current results: the difference between the graphical interface and the conversational interface, where only the latter provided additional explanations, seemed to be slightly smaller for the high-confidence condition than for the low-confidence condition, in which the system conducted more unexpected maneuvers. As long as such systems do not operate flawlessly, showing intelligent behavior by understanding when explanations are needed will enhance the transparency for the user and thereby their trust in the system.
Moreover, the behavior of the AV differed between the two confidence levels. It is therefore not possible to know whether the reported effects were caused by the icon that indicated confidence level or the behavior of the vehicle. However, we believe that in a realistic situation these two elements should be related, since an AV that is not confident about making the correct behavioral decisions is more likely to make errors.
The conversation in the study was simulated by text-to-speech commands via an online service. Therefore, the voice could be perceived as monotonous and lacking variety in its tones. Nevertheless, commands of this nature are realistic enough to generate significant social responses, as has been demonstrated by the results of other research efforts [43,80-82]. However, the design of the study did not allow for two-directional conversations between participants and the AV. We believe that allowing people to ask questions to the interface of an AV may, on the one hand, increase their acceptance of AVs (because they would be even better informed about the reasons why certain behavioral decisions are made) and, on the other hand, increase their situation awareness.

Conclusions
Self-driving cars are the next stage in a long evolutionary process of the automobile. The current study was aimed at addressing the heart of the issue: where people's acceptance of AVs currently stands, and what can be done so that they are welcomed and trusted more easily, readily and appropriately. Anthropomorphizing AVs by providing them with a voice and simulating intelligent conversation did not merely achieve higher trust ratings, but also higher ratings of likability and intelligence. With further iterative improvements on the interface and the use of real-time, intelligent personal agents to stimulate conversations, we can expect a paradigm shift towards higher trust in and easier acceptance of autonomous vehicles.

Conflicts of Interest:
The authors declare no conflict of interest.