Article

Making Autonomous Taxis Understandable: A Comparative Study of eHMI Feedback Modes and Display Positions for Pickup Guidance

1 School of Design Arts, Xiamen University of Technology, Xiamen 361024, China
2 Department of Human-Centered AI, Sangmyung University, Seoul 03016, Republic of Korea
3 Institute for Advanced Intelligence Study, Daejeon 34189, Republic of Korea
* Authors to whom correspondence should be addressed.
Electronics 2025, 14(12), 2387; https://doi.org/10.3390/electronics14122387
Submission received: 13 May 2025 / Revised: 4 June 2025 / Accepted: 6 June 2025 / Published: 11 June 2025
(This article belongs to the Section Computer Science & Engineering)

Abstract

Passengers often struggle to identify intended pickup locations when autonomous taxis (ATs) arrive, leading to confusion and delays. While prior external human–machine interface (eHMI) studies have focused on pedestrian crossings, few have systematically compared feedback modes and display positions for AT pickup guidance at varying distances. This study investigates the effectiveness of three eHMI feedback modes (Eye, Arrow, and Number) displayed at two positions (Body and Top) for communicating AT pickup locations. Through a controlled virtual reality experiment, we examined how these design variations affect user performance on key metrics, including selection time, error rate, and decision confidence, across varied parking distances. The results revealed distinct advantages for each feedback mode: Number feedback provided the fastest response times, particularly when displayed at the top position; Arrow feedback facilitated more confident decisions with lower error rates in close-range scenarios; and Eye feedback demonstrated superior performance in distant conditions by preventing severe identification errors. Body position displays consistently outperformed top-mounted ones, improving users’ understanding of the vehicle’s intended actions. These findings highlight the importance of context-aware eHMI systems that dynamically adapt to interaction distances and operational requirements. Based on our evidence, we propose practical design strategies for implementing these feedback modes in real-world AT services to optimize both system efficiency and user experience in urban mobility environments. Future work should address user learning challenges and validate these findings across diverse environmental conditions and implementation frameworks.

1. Introduction

Autonomous driving technology has matured significantly, evolving from early automated parking systems [1] to sophisticated AT services [2,3], thereby fundamentally reshaping urban mobility paradigms. However, with the absence of human drivers, traditional human–vehicle interactions face fundamental challenges. In conventional driving scenarios, driver eye contact and gestures are key elements of road communication, and the absence of these non-verbal cues in autonomous driving environments creates a significant information gap [4,5]. This problem is particularly prominent in AT pickup scenarios. When an AT needs to select the optimal stopping location in complex urban road networks, effectively conveying pickup information to waiting passengers [3,6] and ensuring that passengers can quickly and accurately identify their booked vehicle has become a key challenge affecting service safety and operational efficiency. These identification failures can lead to passenger confusion, service delays, and safety risks in urban traffic environments.
The existing research has proposed eHMI as a solution to compensate for the loss of traditional driver–passenger communication [7,8]. eHMIs convey vehicle status and intention information to external users through various visual media [9,10,11,12], helping road users predict vehicle behavior and make appropriate decisions [13,14]. However, most studies primarily focused on pedestrian crossing scenarios [13,14,15,16,17], with limited specialized discussion on the unique passenger identification and location indication needs of AT [3]. Unlike pedestrian–vehicle interactions that primarily concern yielding behaviors and crossing intentions, AT passenger identification requires precise location communication and vehicle recognition capabilities, necessitating specialized eHMI solutions tailored to pickup scenarios rather than general traffic communication. Furthermore, despite evidence suggesting that the information content, presentation form [13,18], and installation position [9,19] of eHMIs significantly impact interaction effectiveness, there remains a lack of comprehensive systematic comparative studies on the relative advantages and limitations of various feedback modes at different observation distances in AT pickup scenarios. Existing eHMI studies typically employ static display approaches with fixed visual presentations, while our research introduces dynamic eHMI systems that adaptively respond to changing pickup distances, providing distance-sensitive feedback to optimize passenger identification across varying spatial contexts.
In light of these research opportunities, our study focuses on AT pickup scenarios, comprehensively evaluating three eHMI visual feedback modes (Eye, Arrow, and Number) and two display positions (body and top) across four parking locations. The Eye, Arrow, and Number feedback modes were selected to represent three fundamental communication approaches—social attention cues [14], directional guidance [20], and explicit identification, respectively [21]—covering the complete spectrum of passenger–vehicle interaction needs. Body and top display positions were chosen to compare superior visual recognition performance [22] versus enhanced attention capture from multiple viewing angles [23]. Our research contributions are primarily demonstrated through two key innovations: (1) a systematic comparison of different eHMI feedback combinations to address the precise location identification requirements of AT pickup scenarios, evaluated through metrics including users’ subjective cognition, decision-making efficiency, and error rates; and (2) an in-depth investigation of the interaction effects between display position and feedback mode, which reveals the relative advantages of vehicle-body versus vehicle-top displays across varying parking locations. To comprehensively validate these contributions, our experimental design incorporates multiple validation measures including objective performance metrics (selection time, error rates, and behavioral data) and subjective assessments (user ratings and interviews) to systematically evaluate different eHMI combinations across varying parking distances. These findings provide practical design guidelines for AT external interaction systems. The main contributions of our study can be summarized as follows:
  • Eye feedback demonstrates superior effectiveness in long-distance scenarios, effectively mitigating critical identification errors while maintaining user engagement and attention.
  • Arrow feedback delivers unambiguous directional cues, empowering users to make confident decisions and achieving significant reductions in both error rates and selection deviation during close-range scenarios.
  • Number feedback at the top position provides the fastest response times, proving particularly advantageous in time-sensitive situations where the rapid identification of an AT’s parking location is required.
  • Feedback displayed on the vehicle body typically improves the accuracy of passengers’ perception of autonomous driving intentions, facilitating clearer comprehension of the designated parking space selection.
Based on the findings above, we validated the application value of distance-adaptive eHMI feedback in AT pickup scenarios, demonstrating how different feedback modes and display positions can be strategically employed to optimize passenger–vehicle identification at varying distances. These findings offer effective design strategies for improving the efficiency and reliability of AT pickup services in urban mobility systems.
The structure of our paper is organized as follows: Section 2 reviews related research on eHMI applications in autonomous driving environments; Section 3 and Section 4 introduce the design and results of our two experiments, respectively, focusing on evaluating the impact of different eHMI feedback modes and display positions on user identification efficiency, decision confidence, and error rates; Section 5 discusses the results and proposes specific design recommendations; and finally, Section 6 summarizes the paper and outlines directions for future research.

2. Related Works

This section reviews prior research on human–vehicle interactions in autonomous traffic. We first outline the communication gaps that arise when the human driver, together with the well-established repertoire of eye contact and gestures, disappears. We then survey eHMI as an explicit communication channel, analyzing its modes, visual design parameters, and mounting positions. Finally, we highlight design challenges specific to AT pickup scenarios, where precise localization and passenger identification are crucial for safety, efficiency, and user acceptance.

2.1. Communication Challenges in Autonomous Traffic Environments

With the rapid development of autonomous vehicle (AV) technology, innovative applications, from automated parking [1] to AT services [2,3], are gradually transforming people’s daily travel patterns. However, this technological revolution also brings unprecedented challenges, particularly in the field of human–vehicle interactions. In traditional traffic environments, pedestrians and drivers have established a mature set of non-verbal communication mechanisms. Especially when visibility or environmental conditions are limited, pedestrians typically seek eye contact or gesture-based confirmation from drivers to interpret vehicle intentions and ensure safe street-crossing [4,5]. Nevertheless, the introduction of autonomous driving technology fundamentally changes this interaction paradigm. In highly automated driving scenarios, there may be no driver in the vehicle, or the driver might be engaged in non-driving activities such as reading [13,24], resulting in the absence of this key communication link. Consequently, those non-verbal communication mechanisms that have long relied on driver feedback cannot function effectively in the new environment [14], introducing unprecedented uncertainty factors and related safety risks in human–vehicle communication.
In human–vehicle interaction processes, communication methods can be divided into two types: implicit and explicit [8,25]. Implicit communication refers to the indirect conveyance of intentions through vehicle-driving behaviors [26]. Pedestrians primarily interpret these implicit signals by observing the vehicle’s motion characteristics and kinematic parameters [27], thereby predicting vehicle intentions and planning their own behaviors [28,29]. For example, AVs intending to yield to pedestrians typically display noticeable deceleration trajectories, while vehicles with no intention to yield tend to maintain their original speed or slightly accelerate through crosswalk areas [30]. However, implicit communication is limited in how precisely it can express intentions in specific scenarios, particularly in complex traffic environments or special service situations where vehicle intentions cannot be clearly conveyed.

2.2. eHMI as a Solution for External Vehicle Communication

To address the challenges of human–vehicle interaction in autonomous driving environments, eHMI has been proposed as an effective solution. As the primary carrier of explicit communication, the core function of eHMI is to directly and clearly convey key information about AVs to the interaction subjects [7,31]. This information includes driving modes, current behaviors, yielding intentions, and environmental perception status [9], enabling pedestrians and other road users to accurately understand vehicle intentions through intuitive interface designs [13,32].
Compared to implicit communication, which relies on vehicle kinematic signals, eHMI, as an explicit interaction method, can convey more specific and precise information. This advantage is particularly evident in AT service scenarios. Kim et al.’s research indicates that AT services face numerous location-confirmation challenges, such as mismatches between user and vehicle locations, a lack of appropriate stopping locations, or users not arriving on time [6]. In response to these challenges, eHMI systems can provide critical information, such as passenger identification and exact stopping locations [3], effectively reducing misunderstandings and uncertainties during interaction and thereby improving pickup efficiency and user experience.
Multiple studies further confirm that AVs equipped with eHMI interfaces demonstrate higher efficiency and safety in various interaction scenarios [15,33]. Whether conveying intentions in pedestrian crossing scenarios or identifying passengers and confirming locations in AT pickup services, eHMI can significantly enhance user satisfaction and overall interaction experience [34]. In recent years, eHMI has become an important research direction in both the academic and industrial fields, aiming to provide key technological support for building safer and more efficient transportation environments by filling the gap in traditional driver–pedestrian communication in autonomous driving environments [14,35].

2.3. eHMI Design Variants and Performance

Researchers have proposed diverse eHMI concepts to optimize information transmission for AVs [7,9,36,37]. Based on their information presentation characteristics, these designs can be categorized into three major types: visual, auditory, and physical [7]. Among these, visual interfaces have become the most widely applied form in current research and practice due to their intuitiveness and recognizability [9], and are mainly implemented through lighting systems [10,34,35,38], vehicle displays [11,20,36], and projection devices [12,39]. Effective eHMI design needs to comprehensively consider multiple key factors, including interaction mode selection [7,40], color system application [41,42], and information expression types [43], to ensure clarity and intuitiveness in information transmission.
In comparative studies of different types of visual eHMI, multiple empirical studies have revealed the differences in their performance. Research by de Clercq et al. shows that although various eHMIs generally enhance pedestrians’ sense of safety in crossing decisions, non-textual interfaces (such as LED light strips) typically require longer learning adaptation periods compared to textual interfaces [15]. This finding resonates with research by Guo et al., whose experimental results indicate that textual information and combinations of text and symbols received the widest user acceptance and significantly accelerated users’ comprehension speed [18]. Lim et al. further discovered that in complex scenarios where pedestrians’ and drivers’ lines of sight are obstructed, eHMIs combining textual information can more effectively transmit emergency warning information than solutions using only light bands or no eHMI, thereby significantly shortening driver reaction times [44]. In their comparison between symbolic and textual information, Rettenmaier et al. confirmed that symbolic information has a longer recognition distance compared to textual information, which is more conducive for pedestrians to perceive AV intentions in a timely manner [16]. In an evaluation of different types of symbols, Dou et al. found that arrow symbols are more intuitively recognized by users compared to smiley face symbols, with the arrow symbol group receiving significantly higher comprehension scores in experiments [20].
Anthropomorphic design, as an innovative direction for eHMI, has shown unique value in recent research. Chang et al.’s study demonstrated that not only does using anthropomorphic eyes as an eHMI reduce potential traffic accidents, but the direction of their gaze can also effectively enhance pedestrians’ intuitive perception of safe and dangerous situations [13]. Building on this, Gui et al. explored the application potential of anthropomorphic eyes in indicating direction through real vehicle experiments, showing that such designs can break through the expression limitations of traditional turn signals, enabling pedestrians to accurately identify five different turning directions with lower error rates and faster reaction speeds, providing a new solution for achieving fine-grained interaction between AVs and pedestrians [14]. Their subsequent field research further revealed that when AVs use eyes as an eHMI, they can effectively demonstrate the vehicle’s perception capabilities, enhance pedestrians’ sense of safety, and help pedestrians recognize AVs and interpret their behavior from a distance [17].

2.4. eHMI Display Positioning and Effectiveness

Beyond their content and expression, the research indicates that the installation position of eHMIs significantly influences the efficiency of information transmission and user experience [9,23]. Existing studies have explored various mounting positions, including the front grille [45,46,47], headlights [48,49], windshield [11,20,22], and roof [9,23,50]. Dey et al.’s research reveals that the windshield area is currently the most widely used position, followed by the front bumper, roof, and front grille [51], with these positions selected primarily based on installation convenience and visual accessibility [52].
Through empirical evaluations, Eisma et al. confirmed that eHMIs on the front grille, windshield, and roof performed best in subjective clarity ratings [52], while Ackermann et al.’s experiments further verified that windshield positions offer better visual recognition effects compared to lower-positioned solutions [22]. Kim et al.’s research further strengthened these findings, discovering that deploying eHMIs on the windshield not only enhanced users’ perceived clarity but also achieved significantly higher overall user satisfaction ratings [19]. However, the research results regarding position effects are not entirely consistent. Through eye-tracking, Guo et al. found that eHMI position has no significant effect on decision time, but it does alter the distribution of visual attention, with the front grille and windshield areas attracting more attention, while roof positions induce longer initial gazes [23]. Furthermore, Zheng et al.’s research indicates that vehicle size, structural features, and the relationship between eHMI position and pedestrian eye level significantly affect interaction effectiveness, requiring differentiated designs for different vehicle types [9]. This suggests that eHMI position selection should consider specific scenarios and may need to be flexibly adjusted based on visibility and usage contexts in complex traffic environments.
Despite the valuable insights provided by the existing research on eHMI design, specialized studies targeting AT pickup scenarios remain insufficient (Table 1). The current eHMI systems mainly focus on general interaction scenarios, such as pedestrian crossings, and struggle to meet specific needs like precise location indication during pickup processes [3]. In complex urban environments, when facing multiple possible parking areas or areas unsuitable for stopping, insufficient information transmission may cause identification difficulties between passengers and vehicles, prolonging waiting times and even causing temporary traffic congestion at roadside pickup areas. This research will explore the combined application effects of different eHMI design modes and display locations in AT pickup scenarios, providing reliable design guidance for developing more intuitive and efficient AT interaction systems.

3. Experimental Design and Implementation

Based on the literature review, while the existing research has provided valuable insights into eHMI design principles, there remains a significant gap in the understanding of their effectiveness specifically in AT pickup scenarios. The unique challenges of passenger–vehicle identification in urban environments require tailored solutions beyond general pedestrian interaction designs. To address this research gap, we conducted a systematic experimental investigation comparing different combinations of eHMI feedback modes and display positions.
Our experimental framework employed a controlled study comparing visual feedback modes (Eye, Arrow, and Number) with different display positions (vehicle body and top). The design incorporated varying distances in parking scenarios to simulate real-world pickup conditions. Performance was measured through multiple metrics, including identification efficiency, decision confidence, and error rates. Through rigorous data analysis and user feedback assessment, we established a comprehensive evaluation of each design combination’s effectiveness. The following sections detail our methodology, measurement instruments, and analytical findings, culminating in evidence-based design recommendations for AT external communication systems.

3.1. Design of eHMI Feedback Modes and Display Positions

To achieve this goal, our research developed combinations of three eHMI visual feedback modes (Eye, Arrow, and Number) with two display positions (body and top), as shown in Figure 1a. Each eHMI feedback was combined with both display positions to evaluate the effectiveness of all combinations. These feedback combinations aim to assess the communication efficiency of ATs when conveying specific parking location information to waiting passengers, providing empirical evidence for optimizing passenger identification experiences.
Our research designed a parking scenario, as shown in Figure 1b, where the AT (dimensions: 4.9 m × 1.968 m × 1.614 m; wheelbase: 2.985 m) approaches at a constant speed of 10 m/s. The scenario consists of four parking spaces (each 5.0 m × 2.4 m), with the AT’s center point positioned 2.4 m away from the passenger (Camera). The passenger stands 1.4 m from the nearest parking location (park 1), while the farthest spot (park 4) is 10 m away from the approaching vehicle.

3.1.1. eHMI Feedback Mode Design

When communicating AT intentions, the interface design involves multidimensional performance trade-offs. Text-based interfaces demonstrate clear advantages, with shorter learning adaptation periods [15] and widespread user acceptance [18]. Particularly when the text includes numerical elements, the research confirms that users can accurately interpret the specific information conveyed [21]. Meanwhile, symbolic displays excel in two key aspects: they provide longer recognition distances, allowing for pedestrians to perceive vehicle intentions earlier [16], and they receive higher user preference ratings [53]. Among the symbol types, arrow symbols stand out, with significantly higher comprehension scores in experimental settings [20], indicating their unique advantage in intuitively communicating directional intent. Additionally, Gui et al.’s research revealed the potential of anthropomorphic eyes in direction-indication applications, enabling more nuanced interaction between AVs and pedestrians [14].
Based on the comprehensive analysis of these research findings, we selected Eye, Arrow, and Number as eHMI design elements, using cyan [23,41] as the unified presentation color, as this was validated by multiple studies as the optimal choice for communicating AV intentions [54,55,56]. The specific design parameters for the three eHMI types are as follows (Figure 1a):
  • Eye feedback [3,57]: A length of 0.75 m and height of 0.25 m, with a pupil diameter of 0.05 m and eye frame stroke width of 0.01 m. The pupil and eye frame maintain a coplanar relationship, as defined in Equation (2). This constraint ensures the pupil always moves within the frame plane while tracking the target.
  • Arrow feedback [16,20]: Length of 0.25 m; height of 0.2 m.
  • Number feedback [18,21]: Length of 0.5 m; height of 0.2 m.
Our research proposes a mathematical model for dynamic target-oriented positioning in AT eHMI systems. Using Eye feedback as the primary example, we demonstrate how a fixed eye frame (denoted by $Q_{\text{frame}}$) and a movable pupil element (denoted by $Q_{\text{pupil}}$) establish precise spatial relationships through coplanarity constraints and projection algorithms. The pupil intelligently adjusts its position within constrained boundaries to “gaze” at the designated parking spot, thus intuitively conveying the vehicle’s intended destination to passengers through anthropomorphic visual signals. This projection mechanism establishes an exact mathematical mapping between the vehicle’s expected trajectory and the visual feedback elements.
Let the target point coordinate be $P \in \mathbb{R}^3$. The cone parameters are defined as follows:
  • Base center: $C \in \mathbb{R}^3$;
  • Vertex: $T \in \mathbb{R}^3$;
  • Base radius: $R = 2\,\mathrm{m}$;
  • Base normal vector: $\mathbf{n} \in \mathbb{R}^3$ (unit vector).
The eye frame is geometrically determined by
$$Q_{\text{frame}} = C, \qquad \mathbf{n}_{\text{frame}} = \mathbf{n} \tag{1}$$
The pupil center coordinate $Q_{\text{pupil}} \in \mathbb{R}^3$ satisfies the following:
  • Coplanarity constraint:
    $$\left(Q_{\text{pupil}} - Q_{\text{frame}}\right) \cdot \mathbf{n}_{\text{frame}} = 0 \tag{2}$$
  • Dynamic projection relationship:
    $$Q_{\text{pupil}} = \begin{cases} T + \dfrac{\lVert Q_{\text{frame}} - T \rVert}{\lVert P - T \rVert}\,(P - T), & \text{if } \lVert Q_{\text{pupil}}^{\text{intersect}} - Q_{\text{frame}} \rVert \le R \\[1.5ex] Q_{\text{frame}} + R \cdot \dfrac{Q_{\text{pupil}}^{\text{intersect}} - Q_{\text{frame}}}{\lVert Q_{\text{pupil}}^{\text{intersect}} - Q_{\text{frame}} \rVert}, & \text{otherwise} \end{cases} \tag{3}$$
    where the theoretical intersection point of the gaze ray with the frame plane is
    $$Q_{\text{pupil}}^{\text{intersect}} = T + \dfrac{\left(Q_{\text{frame}} - T\right) \cdot \mathbf{n}_{\text{frame}}}{\left(P - T\right) \cdot \mathbf{n}_{\text{frame}}}\,(P - T) \tag{4}$$
In addition to the anthropomorphic Eye feedback, our system explores two alternative feedback mechanisms that follow the same projection geometric principles:
  • Arrow feedback employs dynamic angle adjustments to maintain precise alignment with the target location. By consistently pointing toward the designated parking position, it provides AT passengers with intuitive directional information about their intended stopping position.
  • Number feedback calculates and displays the real-time Euclidean distance (e.g., “20 m”) between the vehicle and its designated parking point. This numerical value updates dynamically as the vehicle approaches, providing passengers with precise information about their remaining distance to the destination.
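To make this projection mechanism concrete, the minimal Python/NumPy sketch below evaluates Equations (3) and (4) for the Eye feedback and the analogous angle and distance computations for the Arrow and Number feedback. It is an illustrative sketch only, not the Unity implementation used in the experiment; the function names and the vertical-axis convention for the arrow yaw are our own assumptions.

```python
import numpy as np

def pupil_position(P, C, T, n, R=2.0):
    """Pupil placement for the Eye feedback (a sketch of Equations (3) and (4)).

    P: target (parking-spot) point; C: eye-frame (base) center Q_frame;
    T: cone vertex; n: unit normal of the frame plane; R: frame radius (m).
    """
    P, C, T, n = (np.asarray(v, dtype=float) for v in (P, C, T, n))
    ray = P - T
    # Equation (4): intersection of the gaze ray from T toward P with the frame plane.
    q_intersect = T + (np.dot(C - T, n) / np.dot(ray, n)) * ray
    offset = q_intersect - C
    if np.linalg.norm(offset) <= R:
        # Target projects inside the frame: place the pupil along the gaze ray
        # at the frame distance (first case of Equation (3)).
        return T + (np.linalg.norm(C - T) / np.linalg.norm(ray)) * ray
    # Otherwise clamp the pupil to the frame boundary in the same direction
    # (second case of Equation (3)).
    return C + R * offset / np.linalg.norm(offset)

def arrow_yaw_deg(display_pos, target_pos):
    """Ground-plane yaw (degrees) that keeps the Arrow pointing at the parking spot."""
    d = np.asarray(target_pos, dtype=float) - np.asarray(display_pos, dtype=float)
    return float(np.degrees(np.arctan2(d[0], d[2])))  # assumes y is the vertical axis

def number_distance_m(vehicle_pos, target_pos):
    """Remaining Euclidean distance (m) shown by the Number feedback."""
    d = np.asarray(target_pos, dtype=float) - np.asarray(vehicle_pos, dtype=float)
    return float(np.linalg.norm(d))
```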

3.1.2. eHMI Display Position Design

Research confirms that eHMI display position significantly impacts information delivery efficiency and user experience [9,23]. When communicating parking locations for AT, the strategic positioning of visual cues becomes crucial for effective passenger–vehicle coordination. Studies have identified several viable installation positions with varying performance characteristics. The windshield position demonstrates superior visual recognition performance compared to lower-positioned alternatives [22], while receiving high ratings for subjective clarity [52] and overall user satisfaction [19]. Alternatively, the top position attracts longer initial fixations [23], potentially allowing for attention to be better captured from various viewing angles.
Based on these research insights, we selected the body and top positions as two distinct eHMI display positions for evaluation in our AT pickup system (Figure 1a):
  • Body Position [44]: Integrated into the vehicle’s side windows and rear windshield surfaces. The side windows measure 2.28 m in length and 0.406 m in height, while the rear windshield measures 1.3 m in length and 0.325 m in height. This multi-surface placement allows for the feedback information to remain visible from various viewing angles as the vehicle maneuvers, providing continuous communication regardless of the vehicle’s orientation relative to the user.
  • Top Position [58]: Mounted on the vehicle roof with dimensions of 0.3 m in length and 1 m in width, consistently facing the user throughout the approach sequence.
Table 2 summarizes the systematic combinations of the feedback modes and display positions evaluated in our study, providing a structured framework for comparing their relative effectiveness across different parking locations.

3.2. Experimental Settings and Design

3.2.1. Experimental Settings

In our study, we conducted research using the Pico 4 Ultra (PICO Immersive ltd, Beijing, China) virtual reality headset, equipped with dual 32 MP color cameras, an iToF depth-sensing camera, and four environmental tracking cameras for precise spatial positioning. The headset features high-resolution displays with 2160 × 2160 pixels per eye and a 90 Hz refresh rate, delivering immersive visuals through Pancake optical lenses with a 105° field of view. The experimental application was developed using Unity 2021.3.27f1c2 with the PICO Unity SDK 2.5.0.

3.2.2. Independent Variables

Our research investigates the influence of different eHMI feedback methods and display positions on conveying parking location information for AT. We designed an experiment with three independent variables, evaluating the effectiveness of three eHMI feedback modes and two display positions in communicating specific location information. The specific independent variables include the following:
  • Feedback Mode (three levels): Eye, Arrow and Number.
  • Display Position (two levels): body and top.
  • Parking Location (four levels): 1, 2, 3, 4.

3.2.3. Experimental Design

In this study, we employed a within-subjects repeated-measures design to evaluate the effectiveness of three eHMI visual feedback modes (Eye, Arrow, and Number) and two display positions (body and top) in helping passengers identify the parking location of ATs. The within-subjects design was selected to control for individual differences in spatial cognition abilities and technology familiarity, ensuring that variations in the eHMI system experience would not confound the comparison between feedback modes. The experiment consisted of a pre-experiment phase and a formal testing phase, with a 2-min rest period between phases to reduce fatigue. In each phase, participants completed 24 trials (3 feedback modes × 2 display positions × 4 parking locations).
The pre-experiment phase (24 trials; 8 per feedback mode) was implemented to standardize participants’ exposure to each eHMI feedback mode, controlling for individual differences in prior technology experience, and to familiarize participants with the task procedure and feedback combinations. Trial order was randomized across participants. After completing the pre-experiment, participants proceeded to the formal testing phase (another 24 trials; 8 per feedback mode), so that each participant completed a total of 48 experimental trials.
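As an illustration of this trial structure, the short Python sketch below generates one fully crossed, randomized block of the 3 × 2 × 4 design; the block sizes match those described above (24 trials per phase, 48 per participant), while the seeding and data representation are illustrative assumptions rather than the actual experiment code.

```python
import itertools
import random

FEEDBACK_MODES = ["Eye", "Arrow", "Number"]   # 3 levels
DISPLAY_POSITIONS = ["Body", "Top"]           # 2 levels
PARKING_LOCATIONS = [1, 2, 3, 4]              # 4 levels

def make_block(rng):
    """One fully crossed block (3 x 2 x 4 = 24 trials) in a random order."""
    trials = list(itertools.product(FEEDBACK_MODES, DISPLAY_POSITIONS, PARKING_LOCATIONS))
    rng.shuffle(trials)
    return trials

rng = random.Random(42)            # per-participant seed (illustrative)
practice_block = make_block(rng)   # pre-experiment phase
test_block = make_block(rng)       # formal testing phase
assert len(practice_block) == len(test_block) == 24  # 48 trials per participant in total
```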

3.2.4. Participants and Procedure

The study recruited eighteen university student volunteers (Table 3), evenly split between females and males (nine each), with a mean age of 23.83 (SD = 1.54). Participants had varied driving experience: six had none, nine had 1–5 years, and three had over 5 years. All participants provided informed consent prior to participation, and each experimental session lasted approximately 40 min. Throughout the experiment, participants remained standing while wearing a Pico headset to engage with the virtual environment. Two participants experienced mild discomfort, which was resolved during the rest interval, while the other participants reported no such issues. At the beginning of each trial, participants pressed any key on a selection keyboard in front of them, which featured four keys corresponding to different parking locations. An autonomous SUV then moved from left to right across the scene, with its eHMI system displaying dynamic visual cues. Participants were required to interpret these real-time feedback signals to identify the intended parking location. Once confident in their interpretation, they pressed the corresponding key to submit their response and complete the trial. Pressing any key again initiated the next trial.
Upon completing all experimental tasks, participants completed two subjective questionnaires: one assessing comprehension and confidence, and the other evaluating the eHMI feedback experience. These instruments aimed to systematically evaluate the effectiveness and user experience quality of different eHMI feedback modalities in facilitating the identification of AT pickup locations.
  • Comprehension Assessment: Evaluation of how effectively combinations of eHMI messages and display positions facilitate understanding of automated vehicles’ intended parking locations [19,23], measured using a 5-point Likert scale (0–4). This assessment provides critical insights into the cognitive clarity of different communication strategies, helping identify optimal configurations that minimize misinterpretation and enhance pedestrian–vehicle interaction safety.
  • Decision Confidence Rating: Participants rated their confidence level from 0% to 100%, consistent with measurement approaches established in previous human–machine interaction research [59].
  • eHMI Feedback Assessment Scale: A 7-point Likert scale (1–7) questionnaire evaluating users’ perceptions of different feedback modes (Eye, Arrow, and Number) across five core dimensions: automation awareness, intention communication, trust, attention, and safety perception [17]. A final question asks participants to describe their overall impression of each feedback mode in one word [17]. These collected responses aim to provide statistical data for analyzing user preferences and perceptions regarding different eHMI feedback designs.
    Automation: The feedback mode helps me identify whether the vehicle is in autonomous driving mode.
    Intention: The feedback mode helps me understand the vehicle’s driving intentions.
    Trust: The feedback mode improves my trust in the vehicle.
    Attention: The feedback mode makes me feel that the vehicle is aware of my presence.
    Safety: The feedback mode enhances my sense of safety.
    Emotional Response: Please describe your feeling about the feedback mode using one word.

4. Results

In our experiment, we analyzed performance metrics and subjective ratings across different modes using Generalized Estimating Equations (GEE) [60,61,62]. We selected GEE over mixed-effects models based on our data characteristics and analytical requirements: (1) GEE naturally accommodates the diverse distributional families in our dataset (Gamma, Binomial, Negative Binomial, Tweedie) within a unified framework; and (2) GEE provides enhanced robustness in correlation structure assumptions, which offers additional reliability for our complex three-factor repeated-measures design. This analytical approach was selected specifically to address the non-normal distributions [61,62] in our data while properly accounting for the correlated nature of measurements in our within-subjects design with multiple factors.
We employed various distribution and link function combinations based on data characteristics: Gamma distribution with Log link function was used for continuous measures (Selection Time, Head Rotation Angle, Head Movement Distance, and 7-point Likert scale responses); binomial distribution with Probit link function was used for binary outcomes (Error Rate, 0 or 1); the negative binomial distribution with Log link function was used for selection deviation and 5-point comprehension scale (0–4); and Tweedie distribution with Log link function was used for confidence ratings (0–100%). All models utilized an exchangeable correlation structure to account for within-subject correlations. Our models incorporated all main effects and interactions for the three independent variables: Feedback Mode (Feedback), Parking Location (Park), and Display Position (Display), enabling a comprehensive analysis of how these interface factors influence both performance and user perceptions.
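As a rough sketch of this analysis pipeline, the Python example below fits such a GEE with statsmodels (assuming a recent version of the library); the file name and column names are placeholders, and the original analysis may have been run with different tooling.

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Long-format trial data; the file and column names (participant, selection_time,
# error, feedback, park, display) are assumed for illustration.
df = pd.read_csv("trials_long.csv")

def fit_gee(outcome, family):
    """GEE with all main effects and interactions of the three factors,
    clustered by participant with an exchangeable working correlation."""
    model = smf.gee(
        f"{outcome} ~ C(feedback) * C(park) * C(display)",
        groups="participant",
        data=df,
        family=family,
        cov_struct=sm.cov_struct.Exchangeable(),
    )
    return model.fit()

# Continuous selection time: Gamma family with a log link.
time_fit = fit_gee("selection_time", sm.families.Gamma(link=sm.families.links.Log()))
# Binary error (0/1): Binomial family with a probit link.
error_fit = fit_gee("error", sm.families.Binomial(link=sm.families.links.Probit()))
print(time_fit.summary())
```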

4.1. Selection Time, Error Rate, Selection Deviation, and User Behavior Data

  • Selection Time
We measured participants’ selection times across various eHMI feedback conditions, from vehicle movement initiation to user response. The main effects were found for Feedback ( χ 2 ( 2 ) = 129.25 ,   p < 0.01 ), Park ( χ 2 ( 3 ) = 99.76 ,   p < 0.01 ), and Display ( χ 2 ( 1 ) = 39.62 ,   p < 0.001 ). Additionally, we observed significant interaction effects between Feedback and Park ( χ 2 ( 6 ) = 80.45 ,   p < 0.001 ) and between Feedback and Display ( χ 2 ( 2 ) = 20.88 ,   p < 0.001 ). The mean selection time across Feedback, Park, and Display is illustrated in Figure 2a. Post hoc Bonferroni pairwise comparisons showed that selection time of Number (9.04 s) was significantly faster than Eye (11.90 s) ( p < 0.01 ) and Arrow (11.22 s) ( p < 0.01 ) . The selection time of park 1 (6.89 s) was significantly faster than those for park 2 (10.02 s) ( p < 0.01 ) , park 3 (12.95 s) ( p < 0.01 ) and park 4 (14.39 s) ( p < 0.01 ) ; the time for park 2 (10.02 s) was significantly faster than that for park 3 (12.95 s) ( p < 0.01 ) and park 4 (14.39 s) ( p < 0.01 ) ; the time for park 3 (12.95 s) was significantly faster than that for park 4 (14.39 s) ( p < 0.01 ) . The selection time of display body (10.22 s) was significantly faster than the display top (11.09 s) ( p < 0.01 ) .
We further analyzed the selection time across the four park locations. For park 1, the main effects were found for Feedback ( χ 2 ( 2 ) = 29.88 ,   p < 0.001 ) and Display ( χ 2 ( 1 ) = 9.42 , p < 0.01). Post hoc Bonferroni pairwise comparisons showed that the selection time of Number (5.40 s) was significantly faster than for Arrow (7.14 s) ( p < 0.01 ) and Eye (8.48 s) ( p < 0.01 ) ; Arrow (7.14 s) was significantly faster than Eye (8.48 s) ( p < 0.01 ) . The selection time for display body (6.52 s) was significantly faster than for display top (7.24 s) ( p < 0.01 ) . For park 2, the main effects were found for Feedback ( χ 2 ( 2 ) = 32.89 ,   p < 0.001 ) and Display ( χ 2 ( 1 ) = 8.86 ,   p < 0.01 ). Additionally, we observed significant interaction effects between Feedback and Display ( χ 2 ( 2 ) = 12.84 ,   p < 0.01 ). Post hoc Bonferroni pairwise comparisons showed that selection time of Number (8.18 s) was significantly faster than for Arrow (11.23 s) ( p < 0.01 ) and Eye (10.94 s) ( p < 0.01 ) . The selection time for display body (9.50 s) was significantly faster than for display top (10.56 s) ( p < 0.01 ) . For park 3, the main effect was found for Feedback ( χ 2 ( 2 ) = 56.03 ,   p < 0.001 ). Additionally, we observed significant interaction effects between Feedback and Display ( χ 2 ( 2 ) = 37.31 ,   p < 0.001 ). Post hoc Bonferroni pairwise comparisons showed that selection time for Number (11.02 s) was significantly faster than that for Arrow (14.21 s) ( p < 0.01 ) and Eye (13.83 s) ( p < 0.01 ) . For park 4, the main effect was found for Feedback ( χ 2 ( 2 ) = 10.20 ,   p < 0.01 ) and Display ( χ 2 ( 1 ) = 7.17 ,   p < 0.01 ). Post hoc Bonferroni pairwise comparisons showed that the selection time of Number (13.70 s) was significantly faster than for Eye (15.61 s) ( p < 0.01 ) . The selection time for display body (13.84 s) was significantly faster than for display top (14.94 s) ( p < 0.01 ) .
  • Error Rate
We recorded participants’ choices and compared them with actual data, marking correct choices as (0) and incorrect ones as (1). The main effects were found for Park ( χ 2 ( 3 ) = 39.93 ,   p < 0.001 ) and Display ( χ 2 ( 1 ) = 13.69 ,   p < 0.001 ). Additionally, we observed significant interaction effects between Feedback and Park ( χ 2 ( 6 ) = 106.79 ,   p < 0.01 ). The mean error rate across Feedback, Park and Display is illustrated in Figure 3a. Post hoc Bonferroni pairwise comparisons showed that the error rate of park 4 (0.49) was significantly higher than that of park 1 (0.06) ( p < 0.01 ) and park 2 (0.29) ( p < 0.05 ) ; the error rate of park 3 (0.45) was significantly higher than that of park 1 (0.06) ( p < 0.01 ) and park 2 (0.29) ( p < 0.05 ) ; the error rate of park 2 (0.29) was significantly higher than that of park 1 (0.06) ( p < 0.01 ) . The error rate of display top (0.38) was significantly higher than that of display body (0.20) ( p < 0.01 ) .
We further analyzed the error rate across the four park locations. For park 1, the main effects were found for Feedback ( χ 2 ( 2 ) = 28.07 , p < 0.001 ) and Display ( χ 2 ( 1 ) = 7.89 , p < 0.01). Post hoc Bonferroni pairwise comparisons showed that the error rate of Eye (0.24) was significantly higher than that of Arrow (0.01) ( p < 0.01 ) and Number (0.01) ( p < 0.01 ) . The error rate of display top (0.13) was significantly higher than that of display body (0.01) ( p < 0.05 ) . For park 2, the main effect was found for Feedback ( χ 2 ( 2 ) = 17.43 , p < 0.001 ). Post hoc Bonferroni pairwise comparisons showed that the error rate of Eye (0.39) was significantly higher than that of Arrow (0.08) ( p < 0.01 ) ; the error rate of Number (0.53) was significantly higher than that of Arrow (0.08) ( p < 0.01 ) . For park 3, the main effect was found for Display ( χ 2 ( 1 ) = 14.93 , p < 0.001 ). Additionally, we observed significant interaction effects between Feedback and Display ( χ 2 ( 2 ) = 6.36 , p < 0.05 ). Post hoc Bonferroni pairwise comparisons showed that the error rate of display top (0.62) was significantly higher than that of display body (0.29) ( p < 0.01 ) . For park 4, the main effect was found for Feedback ( χ 2 ( 2 ) = 22.46 , p < 0.001 ). Additionally, we observed significant interaction effects between Feedback and Display ( χ 2 ( 2 ) = 10.71 , p < 0.01 ). Post hoc Bonferroni pairwise comparisons showed that the error rate of Arrow (0.67) was significantly higher than that of Eye (0.25) ( p < 0.01 ) ; the error rate of Number (0.57) was significantly higher than that of Eye (0.25) ( p < 0.05 ) .
  • Selection Deviation
We measured the deviation between participants’ selections and the correct answers. The main effects were found for Park ( χ 2 ( 3 ) = 23.27 ,   p < 0.001 ) and Display ( χ 2 ( 1 ) = 8.31 , p < 0.01). Additionally, we observed significant interaction effects between Feedback and Park ( χ 2 ( 6 ) = 62.87 ,   p < 0.001 ). The mean selection deviation across Feedback, Park, and Display is illustrated in Figure 4a. Post hoc Bonferroni pairwise comparisons showed that the selection deviation of park 4 (0.52) was significantly higher than that of park 1 (0.06) ( p < 0.01 ) and park 2 (0.25) ( p < 0.05 ) ; the selection deviation of park 3 (0.45) was significantly higher than that of park 1 (0.06) ( p < 0.01 ) and park 2 (0.25) ( p < 0.01 ) ; and the selection deviation of park 2 (0.25) was significantly higher than that of park 1 (0.06) ( p < 0.01 ) . The selection deviation of display top (0.31) was significantly higher than that of display body (0.19) ( p < 0.01 ) .
We further analyzed the selection deviation across the four park locations. For park 1, the main effects were found for Feedback ( χ 2 ( 2 ) = 23.36 ,   p < 0.001 ). Post hoc Bonferroni pairwise comparisons showed that the deviation of Eye (0.26) was significantly higher than that of Number (0.02) ( p < 0.05 ) . For park 2, the main effects were found for Feedback ( χ 2 ( 2 ) = 12.71 ,   p < 0.01 ). Post hoc Bonferroni pairwise comparisons showed that the deviation of Number (0.53) was significantly higher than that of Arrow (0.08) ( p < 0.01 ) ; the deviation of Eye (0.39) was significantly higher than that of Arrow (0.08) ( p < 0.01 ) . For park 3, the main effects were found for Display ( χ 2 ( 1 ) = 10.84 ,   p < 0.001 ). Post hoc Bonferroni pairwise comparisons showed that the deviation of display top (0.63) was significantly higher than that of display body (0.27) ( p < 0.01 ) . For park 4, the main effects were found for Feedback ( χ 2 ( 2 ) = 25.17 ,   p < 0.001 ). Additionally, we observed significant interaction effects between Feedback and Display ( χ 2 ( 2 ) = 15.87 ,   p < 0.001 ). Post hoc Bonferroni pairwise comparisons showed that the deviation of Arrow (0.86) was significantly higher than that of Eye (0.25) ( p < 0.01 ) , and the deviation of Number (0.59) was significantly higher than that of Eye (0.25) ( p < 0.01 ) .
  • Head Rotation Angle
We recorded the total head rotation angle during the selection time. The main effects were found for Park ( χ 2 ( 3 ) = 21.88 , p < 0.001 ) and Display ( χ 2 ( 1 ) = 10.62 , p < 0.01 ). Additionally, we observed significant interaction effects between Feedback and Park ( χ 2 ( 6 ) = 22.62 , p < 0.001 ). The mean total head rotation angle across Feedback, Park, and Display is illustrated in Figure 5a. Post hoc Bonferroni pairwise comparisons showed that the total head rotation angle of park 3 (35.69 degrees) was significantly higher than that of park 1 (22.05 degrees) ( p < 0.01 ) and park 2 (28.38 degrees) ( p < 0.05 ) ; the angle of park 4 (35.57 degrees) was significantly higher than that of park 1 (22.05 degrees) ( p < 0.01 ) and park 2 (28.38 degrees) ( p < 0.05 ) ; and the angle of park 2 (28.38 degrees) was significantly higher than that of park 1 (22.05 degrees) ( p < 0.05 ) . The head rotation angle of the display top (32.98 degrees) was significantly higher than that of the display body (27.02 degrees) ( p < 0.01 ) .
  • Head Movement Distance
We recorded the total head movement distance during the selection time. The main effects were found for Park ( χ 2 ( 3 ) = 32.09 ,   p < 0.001 ) and Display ( χ 2 ( 1 ) = 14.06 , p < 0.001). No other main or interaction effects were found. The mean total head movement distance across Feedback, Park, and Display is illustrated in Figure 5b. Post hoc Bonferroni pairwise comparisons showed that the total head movement distance of park 3 (88.83 cm) was significantly longer than that of park 1 (52.53 cm) ( p < 0.01 ) and park 2 (62.56 cm) ( p < 0.01 ) , and the distance observed for park 4 (82.58 cm) was significantly longer than that of park 1 (52.53 cm) ( p < 0.01 ) and park 2 (62.56 cm) ( p < 0.01 ) . The head movement distance of the display top (78.39 cm) was significantly longer than that of the display body (62.63 cm) ( p < 0.01 ) .
We also recorded the head movement distance along the X, Y, and Z dimensions (as shown in Figure 6). For the X dimension, the main effects were found for Park ( χ 2 ( 3 ) = 19.46 ,   p < 0.001 ) and Display ( χ 2 ( 1 ) = 8.36 ,   p < 0.01 ). Additionally, we observed significant interaction effects between Feedback and Park ( χ 2 ( 6 ) = 15.04 ,   p < 0.05 ). Post hoc Bonferroni pairwise comparisons showed that the distance of park 3 (26.06 cm) was significantly longer than that of park 1 (16.58 cm) ( p < 0.01 ) ; the distance of park 4 (25.73 cm) was significantly longer than that of park 1 (16.58 cm) ( p < 0.01 ) ; and the distance of park 2 (21.19 cm) was significantly more than that of park 1 (16.58 cm) ( p < 0.05 ) . The display top distance (24.00 cm) was significantly longer than the display body distance (20.22 cm) ( p < 0.01 ) .
For the Y dimension, the main effects were found for Park ( χ 2 ( 3 ) = 41.91 ,   p < 0.001 ) and Display ( χ 2 ( 1 ) = 15.12 ,   p < 0.001 ). Additionally, we observed significant interaction effects between Feedback and Park ( χ 2 ( 6 ) = 18.89 ,   p < 0.01 ). Post hoc Bonferroni pairwise comparisons showed that the distance of park 3 (10.59 cm) was significantly longer than that of park 1 (6.18 cm) ( p < 0.01 ) and park 2 (8.15 cm) ( p < 0.01 ) , and the distance of park 4 (9.88 cm) was significantly longer than that of park 1 (6.18 cm) ( p < 0.01 ) and park 2 (8.15 cm) ( p < 0.05 ) . The distance of the display top (9.76 cm) was significantly longer than that of the display body (7.44 cm) ( p < 0.01 ) .
For the Z dimension, the main effects were found for Feedback ( χ 2 ( 2 ) = 6.24 ,   p < 0.05 ), Park ( χ 2 ( 3 ) = 30.32 ,   p < 0.001 ), and Display ( χ 2 ( 1 ) = 11.50 ,   p < 0.001 ). Additionally, we observed significant interaction effects between Feedback and Park ( χ 2 ( 6 ) = 32.31 ,  p < 0.001). Post hoc Bonferroni pairwise comparisons showed that the Eye distance (13.13 cm) was significantly longer than the Number distance (10.34 cm) ( p < 0.05 ) . The distance of park 4 (14.97 cm) was significantly longer than that of park 1 (8.18 cm) ( p < 0.01 ) and park 2 (10.54 cm) ( p < 0.01 ) , and the distance of park 3 (13.98 cm) was significantly longer than that of park 1 (8.18 cm) ( p < 0.01 ) and park 2 (10.54 cm) ( p < 0.01 ) . The display top distance (12.95 cm) was significantly longer than that of the display body (10.37 cm) ( p < 0.01 ) .
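The head-tracking totals reported above can, in principle, be accumulated from per-frame headset samples. The exact computation is not specified in the text, so the following Python sketch shows one plausible approach under our own assumptions (per-frame quaternion orientations and metre-scale positions), not the procedure actually used in the study.

```python
import numpy as np

def total_head_rotation_deg(quaternions):
    """Sum of frame-to-frame rotation angles (degrees) over one trial.

    quaternions: (N, 4) array of unit head-orientation quaternions (x, y, z, w),
    one sample per frame; the angle between consecutive orientations is accumulated.
    """
    q = np.asarray(quaternions, dtype=float)
    dots = np.abs(np.clip(np.sum(q[:-1] * q[1:], axis=1), -1.0, 1.0))
    return float(np.degrees(2.0 * np.arccos(dots)).sum())

def total_head_movement_cm(positions):
    """Sum of frame-to-frame head displacements (cm) over one trial.

    positions: (N, 3) array of head positions in metres; per-axis sums of the
    absolute steps give the separate X, Y, and Z distances reported above.
    """
    steps = np.diff(np.asarray(positions, dtype=float), axis=0)
    total_cm = float(np.linalg.norm(steps, axis=1).sum() * 100.0)
    per_axis_cm = np.abs(steps).sum(axis=0) * 100.0
    return total_cm, per_axis_cm
```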

4.2. User Evaluation and Interview

We collected subjective evaluations of different feedback modes and display positions through questionnaires and semi-structured interviews to understand user experience and preferences. The questionnaire results revealed the impact of feedback modes on aspects such as comprehensibility, trust, and attention attraction, while the interviews further explored users’ emotional responses and practical evaluations of various feedback designs. Together, these data provide important insights into optimizing communication between AVs and pedestrians.

4.2.1. Subjective User Evaluation

After the test, participants completed a questionnaire measuring both comprehension and confidence. The GEE revealed a significant effect of Feedback ( χ 2 ( 2 ) = 6.09 ,   p < 0.05 ). The mean comprehension of the Feedback and Display is illustrated in Figure 7a. Post hoc Bonferroni pairwise comparisons showed that the comprehension of Arrow (2.92) was significantly higher than that of Number (2.07) ( p < 0.05 ) . For confidence ratings, the GEE revealed a significant effect of Feedback ( χ 2 ( 2 ) = 6.53 ,   p < 0.05 ). The mean confidence across Feedback and Display is illustrated in Figure 7b. However, post hoc Bonferroni pairwise comparisons did not reveal any significant differences between individual feedback modes.
Additionally, we assessed user perception using five questions. For Automation, Trust, and Safety, the GEE revealed no significant effects ( p > 0.05 ) . For Intention, the GEE revealed a significant effect of Feedback ( χ 2 ( 2 ) = 14.22 ,   p < 0.001 ). Post hoc Bonferroni pairwise comparisons showed that the intention results for Arrow (5.65) were significantly higher than those obtained for Number (4.00) ( p < 0.01 ) ; the intention results for Eye (5.40) were significantly higher than those obtained for Number (4.00) ( p < 0.05 ) . For Attention, the GEE revealed a significant effect of Feedback ( χ 2 ( 2 ) = 32.30 ,   p < 0.001 ). Post hoc Bonferroni pairwise comparisons showed that the attention for Eye (4.65) was significantly higher than that for Arrow (2.95) ( p < 0.01 ) and Number (2.75) ( p < 0.01 ) . The mean user perceptions revealed in the feedback are illustrated in Figure 7c.
Additionally, by collecting participants’ subjective evaluations of the three feedback modes, we gained insights into user experience (Figure 8).
  • Eye feedback performed generally positively on the emotional level, with 10 participants giving positive evaluations, significantly more than the 5 who held negative views and 3 who expressed neutral attitudes. This result confirms that anthropomorphic eye feedback can effectively establish emotional connections, although individual differences exist.
  • Arrow feedback was primarily recognized for its utility, with nine participants affirming its clear advantages in directional indication, exceeding the five participants who expressed negative emotions. Additionally, four participants reported positive emotional responses to arrow feedback, indicating that while this functional feedback may cause pressure for some users, it is still generally considered a valuable guidance element.
  • Attitudes towards Number feedback demonstrated clear differences: although six participants appreciated its precision and clarity, eight participants reported comprehension difficulties, indicating challenges in the understandability of this feedback method. Emotionally, digital feedback elicited fewer strong reactions, with only two participants expressing clearly positive emotions and two expressing negative emotions, reflecting the relatively neutral nature of digital feedback in terms of emotional evocation.

4.2.2. User Interview

According to the user interview results, over 61% of participants explicitly expressed a preference for Arrow feedback (six for Eye, nine for Arrow, and three for Number). Participants generally believed that arrows provided an intuitive and clear indication of the vehicle’s intended parking spot, with information that was clearly understandable, and that arrows were able to quickly convey key information. However, some participants believed that, when observed from a distance, arrows decreased in accuracy when distinguishing adjacent parking spaces, and the interaction experience was rather mechanical, lacking a human–machine emotional connection. As one participant said: “Arrows have strong directionality, allowing me to immediately determine the vehicle’s parking location, but compared to Eye feedback, arrows lack a sense of life, giving an impression of pure mechanics without emotional connection.” In contrast, participants who chose Eye feedback emphasized the emotional advantages brought by its anthropomorphic qualities. One participant stated: “The eyes make me feel noticed, giving me more trust and a sense of safety, as if someone is paying attention to me, not just a machine.” Another participant added: “Eyes have a friendly, warm communication feel, like characters in the movie ’Cars’, making me feel more familiar and trusting.” The minority of participants who chose Number feedback valued its precision and sense of security. One user explained: “Numbers give me a great sense of security, allowing clear judgment of parking location as the distance decreases, as intuitive as backup cameras.” However, most participants expressed difficulty understanding number information: “I don’t know what distance the numbers represent, there is no clear reference point, and constant analysis of changes is required, making it easy to miss important information.”
Regarding the feedback display position, over 88% of participants showed a clear preference for the body display position over the top display position (16 body; 2 top). Participants believed the body display position provided better visual perception and spatial positioning ability, especially when the vehicle was turning, allowing for a more accurate judgment of direction and distance. As one participant stated: “Looking down allows me to see more information, and the window matches the usual habit of looking at vehicles, making it easier to identify.” Multiple users pointed out that the body display position was closer to the ground and parking locations, reducing visual errors and enhancing intuitiveness: “The body is better than the top, as when turning, the height is closer to the parking location.” Participants also emphasized that the multi-faceted nature of the body helped determine whether it was facing them: “The vehicle body has many sides, making it clearer whether it is facing me, while with the top, I do not know which direction it is in.” Among the minority who preferred the top display position, one user stated that “the top is more attention-grabbing,” believing the top display position was more visually noticeable.
Our interview results reveal a fundamental tension in AT communication design between functional utility and emotional engagement. The widespread preference for Arrow feedback demonstrates the primary importance of clear directional information, while the positive response to anthropomorphic Eye feedback indicates the value of incorporating human-like elements to build trust. The strong preference for body position reflects users’ desire for feedback that aligns with natural viewing patterns and provides optimal spatial context during vehicle maneuvers.

5. Discussion

In this section, we distill the key insights from our findings on feedback mode and display position in AT contexts. By systematically comparing Eye, Arrow, and Number feedback presented at both body and top positions, we identified distinct advantages in speed, clarity, and error prevention across various combinations, and analyzed how contextual factors such as parking distance influence the effectiveness of feedback. Building on these insights, we conclude with theoretical implications, practical considerations, and future directions for eHMI design.

5.1. eHMI Efficiency Across Feedback Modes and Display Positions

Our results indicate that selection time is significantly influenced by both the feedback mode and the parking location (Table 4 and Figure 9a). Specifically, across all parking locations, Number feedback consistently yields the shortest selection times, improving overall selection efficiency. This finding aligns with prior studies showing that numeric information can effectively streamline decision-making and enhance efficiency [21]. Notably, except for parking location 4, Number feedback outperforms Arrow feedback at each location, whereas Arrow feedback surpasses Eye feedback in efficiency only at parking location 1.
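For clarity about how these gains are quantified, the improvement percentages in Table 4 (and the analogous tables that follow) are consistent with the relative reduction of the slower condition’s mean; the formula below is our reading of that computation, offered as a worked example rather than a quotation of the analysis scripts:

\[
\text{Improvement} = \frac{t_{\text{slower}} - t_{\text{faster}}}{t_{\text{slower}}} \times 100\%, \qquad \text{e.g.,} \quad \frac{11.90 - 9.04}{11.90} \times 100\% \approx 24.0\%,
\]

which reproduces the 24.03% overall Number vs. Eye improvement reported in Table 4; the same relation holds for the overall body vs. top comparison in Table 5 ((11.09 - 10.22)/11.09 ≈ 7.8%).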
Additionally, we observed an interaction effect between feedback type and display position (Figure 9b): Number feedback exhibits faster selection times when presented at the top rather than on the body. This aligns with Guo et al.’s findings, which indicated that top-mounted displays draw users’ initial attention for longer periods [23], potentially aiding more accurate recognition of Number feedback. Meanwhile, Eye feedback and Arrow feedback perform better when displayed on the body, possibly because the body position more closely simulates the eye-contact position between drivers and users in conventional vehicles [22], thus enhancing the naturalness and intuitiveness of the interaction.
Our results also indicate that display position (body vs. top) significantly influences users’ selection time (Table 5). As shown in Figure 10 and Table 6, Eye feedback and Arrow feedback perform better when placed on the body, while Number feedback demonstrates superior efficiency in the top position, particularly for parking locations 2 and 3. The qualitative data further support this finding, with participants describing body-positioned feedback as “more intuitive” and “naturally aligned with their attention.” This result aligns with Kim et al.’s research on the importance of eHMI positioning [19], but contrasts with Guo et al.’s findings, who observed no significant effect of position on decision time in pedestrian crossing scenarios [23]. The difference likely stems from the fundamentally different nature of the tasks: AT pickup scenarios require a more precise understanding of location information than crossing decisions. Additionally, the precise placement of eHMI elements on the vehicle body is critical to their effectiveness. While Zheng et al. found that windshield-bottom placement resulted in diminished user ratings [9], our study used windshield-center positioning, a seemingly minor spatial distinction that may nonetheless yield meaningful improvements in interaction quality and user experience.
However, a marked contrast exists between objective efficiency metrics and users’ subjective assessments (Table 7). Although Number feedback resulted in the shortest selection time, it scored lower in terms of comprehensibility and intention communication, suggesting that number information may need to be combined with symbolic elements to effectively convey intentions [21]. Arrow feedback was considered to provide clearer directional indications with higher intuitive recognition [20]. Eye feedback attracted the most visual attention, consistent with previous research on the salience of anthropomorphic eyes [13]. However, it is worth noting that, in our pickup scenario, Eye feedback produced the longest selection time, contrasting with Gui et al.’s previous report showing that eye cues can accelerate pedestrian responses [14]. Nevertheless, our research still validates their proposition that Eye feedback has practical application potential in scenarios requiring precise spatial positioning, such as AV pickups [14].
Furthermore, the advantages of Eye feedback in long-distance scenarios can be explained from an evolutionary cognitive perspective: eye feedback essentially constructs a new type of human–vehicle interaction mode [48,56] that activates humans’ natural response mechanism to gaze cues. Research shows that even when explicitly informed that gaze direction is irrelevant to the target location, people still involuntarily follow the direction of a gaze, demonstrating the automaticity of this inherent response [63]. Compared to non-social cues such as arrows, gaze can trigger special social attention [64] and can be processed automatically without increasing cognitive load [65]. These characteristics help explain the distinctive performance of Eye feedback in long-distance scenarios. Chang et al.’s empirical research further supports this point, confirming that vehicle eye signals can effectively convey intentions at relatively long distances [48]. Building on these cognitive mechanisms, Eye feedback can serve as a complementary technique alongside other, more precise methods [17], optimizing information-processing efficiency by integrating multiple feedback modes and providing a more comprehensive interaction experience.

5.2. eHMI Accuracy Across Feedback Modes and Display Positions

Our results reveal a significant interaction between feedback mode and parking location, jointly affecting user error rates and selection bias (Figure 11). The data analysis (Table 8) shows that feedback effectiveness is notably moderated by the distance to the parking location. At closer parking locations (locations 1 and 2), Arrow feedback demonstrates significant advantages, with lower error rates than Eye feedback, and at parking location 1, Number feedback also shows lower error rates than Eye feedback. At the most distant location (location 4), however, this pattern reverses (Table 9), with Eye feedback gaining a clear advantage in both error rate and selection bias over Arrow and Number feedback. This finding resonates with Gui et al.’s research [14], suggesting that eye cues provide more recognizable directional information at greater distances. While previous research confirms that symbolic information generally has a longer recognition distance than textual information [16], our results further reveal performance differences between symbol types in terms of perception distance.
A systematic analysis of the error patterns reveals the differentiated effects of feedback mode and display position on user accuracy (Table 10). In terms of total error counts, Eye and Arrow feedback performed similarly and substantially better than Number feedback (44 Eye, 45 Arrow, 61 Number). Across all three feedback modes, body position consistently generated fewer errors (59 body vs. 91 top), indicating that display position is a critical factor influencing accuracy. The error severity analysis shows that Number feedback performed worst in both low-level (D1: 42 Eye, 40 Arrow, 55 Number) and medium-level deviations (D2: 2 Eye, 3 Arrow, 5 Number), while Eye feedback demonstrated superior performance in high-level deviations (D3), with no severe errors recorded (0 Eye, 2 Arrow, 1 Number). This pattern reveals important design trade-offs: while Number feedback may enhance selection efficiency, it compromises accuracy; Eye feedback, though requiring a longer processing time, effectively prevents severe errors, which has special value in AT system design.
To further assess the impact of display position, we analyzed how body versus top placement alters the effectiveness of each feedback mode (Figure 12a,b). A particularly pronounced interaction effect was observed for parking location 3: Arrow feedback exhibited the lowest error rate when shown at the body position but the highest when displayed at the top, while Eye feedback demonstrated the opposite pattern. A comprehensive analysis (Table 11) showed that the body position generally helps users maintain lower error rates and lower selection biases compared to the top position. This display position effect is evident across different feedback modes, confirming the significant impact of feedback spatial positioning on operational accuracy and supporting design decisions to place external human–machine interfaces on the vehicle’s body.
We conducted a detailed analysis of user confidence levels (Table 12), revealing distinct distribution patterns for each feedback mode. In the low-confidence range (0–49%), Arrow feedback was used least frequently, suggesting that users generally felt more certain when relying on directional cues; in the medium-confidence range (50–89%), distributions appeared relatively balanced across all feedback and display conditions; while in the high-confidence range (90–100%), Arrow feedback occurred more frequently than Eye or Number feedback, consistent with participants’ self-reported ease of understanding arrows, likely stemming from the familiarity of arrow symbols, which are widely used traffic signs [20].
An analysis of extreme confidence values showed that complete uncertainty (0% confidence) appeared only once each in two conditions: Eye feedback at the top position and Number feedback at the body position. Complete certainty (100% confidence) was observed across multiple combinations, including Eye feedback at the top (1 instance), Arrow feedback at both the top (1 instance) and body (1 instance), and Number feedback at both the top (1 instance) and body (2 instances). Notably, although Number feedback at the body position was less efficient in terms of selection time than at the top position, it generated the most complete-certainty events. Interview data indicate that this dissociation between efficiency and certainty stemmed from specific user groups perceiving stronger spatial positioning with number cues at the body position, highlighting the importance of considering both objective efficiency and subjective certainty in AT design.

5.3. Practical Design Guidelines for Effective eHMI in AT Services

Our research highlights the unique advantages of Eye, Arrow, and Number feedback in AT pickup scenarios, providing clear pathways to enhance selection speed, confidence, and accuracy. By integrating body and top display positions, we propose flexible strategies to minimize errors and improve usability in both near and distant parking environments. Our core design recommendations are as follows:
  • Adopt Eye feedback for distant parking spots to prevent severe errors. Eye feedback effectively reduces error rates and selection deviation when identifying AT parking locations in longer-distance scenarios, preventing severe identification errors while maintaining user engagement and attention. Interaction designers should prioritize eye-based feedback mechanisms when designing eHMI interfaces for longer-range parking spot identification to optimize accuracy and user focus.
  • Use Arrow feedback to reduce low-confidence selections. Arrow feedback provides clear directional information that helps users make more confident decisions, significantly reducing error rates and selection deviation when identifying AT parking locations in closer-range scenarios. Interaction designers should incorporate arrow-based visual cues into eHMI design to enhance user confidence and minimize errors when users are identifying which parking spot the AT will occupy.
  • Prioritize Number feedback in the top position for quick decisions. When the rapid identification of an AT’s parking location is critical, Number feedback displayed at the top position delivers the shortest response times, particularly for the nearest parking spots, where its error rates also remained low. Interface developers should implement this combination for AT systems requiring the quick identification of parking locations and improved passenger efficiency.
  • Position feedback on the vehicle body to improve the accuracy of the perception of autonomous driving intentions. Displaying feedback on the AT’s body reduces errors and selection deviation while improving efficiency in identifying the parking location, as users more clearly understand which parking spot the vehicle will occupy. Designers should prioritize body-located feedback to optimize user experience and comprehension of vehicle intentions.
  • Implement adaptive feedback to match parking distances and optimize user experience. Display position and feedback mode should adapt to the distance between the user and the parking location. Arrow feedback works well for closer parking locations, while Eye feedback becomes more effective in distant conditions, where users need enhanced depth perception and directional clarity. Interface designers should implement adaptive feedback systems that intelligently switch between modes based on the detected distance to the parking location, optimizing both identification accuracy and user experience (a minimal selection sketch is given after this list).
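The adaptive strategy above can be made concrete with a small rule-based sketch. The following Python snippet is only an illustration, not part of our experimental system: the distance threshold, function name, and mode labels are hypothetical placeholders that a real deployment would calibrate from field data and extend with additional inputs (e.g., lighting or crowding).

```python
from dataclasses import dataclass

# Hypothetical boundary (in meters) between "close-range" and "distant" pickup
# spots; an actual AT service would calibrate this value from field data.
CLOSE_RANGE_THRESHOLD_M = 15.0

@dataclass
class EHMIConfig:
    feedback_mode: str      # "arrow", "eye", or "number"
    display_position: str   # "body" or "top"

def select_ehmi_config(distance_m: float, time_critical: bool = False) -> EHMIConfig:
    """Choose an eHMI configuration following the guidelines above:
    - Number feedback at the top when rapid identification is critical and the spot is near,
    - Arrow feedback on the body for close-range spots (confident, low-error decisions),
    - Eye feedback on the body for distant spots (prevents severe identification errors)."""
    if time_critical and distance_m <= CLOSE_RANGE_THRESHOLD_M:
        return EHMIConfig("number", "top")
    if distance_m <= CLOSE_RANGE_THRESHOLD_M:
        return EHMIConfig("arrow", "body")
    return EHMIConfig("eye", "body")

if __name__ == "__main__":
    # Example calls with hypothetical distances.
    print(select_ehmi_config(8.0, time_critical=True))  # number feedback, top position
    print(select_ehmi_config(8.0))                      # arrow feedback, body position
    print(select_ehmi_config(40.0))                     # eye feedback, body position
```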
These design guidelines have direct implications for real-world AT deployment. In high-traffic environments such as airports or business districts, the superior speed of Number feedback in the top positions could significantly reduce passenger waiting times. For standard urban pickup scenarios, Arrow feedback would enhance user confidence and accuracy. This prioritization of Arrow feedback aligns with the strong user preference for its intuitive guidance compared to faster alternatives. In challenging conditions such as crowded areas or areas of poor visibility, where error prevention is critical, Eye feedback should be prioritized to minimize severe identification mistakes. Service providers should implement adaptive systems that dynamically adjust between these feedback modes based on the pickup distance and environmental conditions, while considering factors such as varying lighting conditions, weather interference, and standardized design conventions across different providers to reduce the learning curves for passengers unfamiliar with eHMI systems.

6. Conclusions

This study investigated the effectiveness of different eHMI visual feedback modes and display positions in AT pickup scenarios. Our findings revealed that all three feedback modes effectively communicated precise spatial information to help passengers understand where ATs would stop, although with varying strengths. Number feedback provided the fastest selection times but led to lower comprehension ratings, while Arrow feedback facilitated more confident decisions with lower error rates in close-range scenarios. Eye feedback exhibited superior performance in distant conditions by preventing severe identification errors, despite having longer processing times. Regarding display positions, body placement consistently generated fewer errors than top placement across all feedback modes, enhancing the accuracy of users’ perception of autonomous driving intentions.
We acknowledge the limitations of our study, particularly its controlled virtual environment rather than actual AT pickup scenarios. Future research should extend to real-world urban environments where ATs operate under varied weather, lighting, traffic, and congestion conditions, and examine the impact of different vehicle speeds and urban congestion on eHMI effectiveness to validate these findings. Future work should also consider multicultural contexts and explore multimodal feedback combining visual and auditory elements for a more inclusive design. Additionally, further exploration of Eye feedback’s potential in AT services is warranted, especially examining how it might be integrated with other feedback modes to create adaptive systems that dynamically adjust based on the distance of the pickup location. Such context-sensitive, multimodal communication systems could significantly improve passengers’ identification of their assigned vehicles in complex urban settings, enhancing both the efficiency and user experience of AT services.

Author Contributions

Conceptualization, G.R., J.L. and G.W.; methodology, G.R. and Y.Z.; software, G.R. and W.L.; validation, T.H. and J.L.; formal analysis, Z.H.; investigation, Y.Z.; resources, G.W.; data curation, Y.Z.; writing—original draft preparation, G.R., Z.H. and Y.Z.; writing—review and editing, G.R., Z.H. and J.L.; visualization, W.L.; supervision, G.W.; project administration, G.R. and G.W.; funding acquisition, G.W. and J.L. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the Korea Institute of Police Technology (KIPoT; Police Lab 2.0 program) grant funded by MSIT (RS-2023-00281194); a research grant (2024-0035) funded by HAII Corporation; the Fujian Province Social Science Foundation Project (No. FJ2025MGCA042); the 2024 Fujian Provincial Lifelong Education Quality Improvement Project (No. ZS24005); the Xiamen University of Technology High-level Talent Research Project (No. YSK24016R); the Education and Teaching Research Project of Xiamen University of Technology (No. JYCG202448) and the Virtual Reality System for Printing Materials and Technologies (2023CXY0425).

Institutional Review Board Statement

The study was conducted according to the guidelines of the Declaration of Helsinki and approved by the Institutional Review Board of the School of Design Arts, Xiamen University of Technology (Approval Number: XMUT-SDA-IRB-2024-11/012).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

All data are contained within the manuscript. Raw data are available from the corresponding author upon request.

Acknowledgments

We appreciate all participants who took part in the studies.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AT	Autonomous Taxi
AV	Autonomous Vehicle
eHMI	external Human–Machine Interface
GEE	Generalized Estimating Equations

References

  1. Singh, M.K.; Haouari, R.; Papazikou, E.; Sha, H.; Quddus, M.; Chaudhry, A.; Thomas, P.; Morris, A. Examining Parking Choices of Connected and Autonomous Vehicles. Transp. Res. Rec. J. Transp. Res. Board 2023, 2677, 589–601. [Google Scholar] [CrossRef]
  2. Zeng, T.; Zhang, H.; Moura, S.J.; Shen, Z.J.M. Economic and Environmental Benefits of Automated Electric Vehicle Ride-Hailing Services in New York City. Sci. Rep. 2024, 14, 4180. [Google Scholar] [CrossRef] [PubMed]
  3. Gui, X.; Javanmardi, E.; Seo, S.H.; Chauhan, V.; Chang, C.M.; Tsukada, M.; Igarashi, T. “text + Eye” on Autonomous Taxi to Provide Geospatial Instructions to Passenger. In Proceedings of the 12th International Conference on Human-Agent Interaction, Swansea, UK, 24–27 November 2024; Association for Computing Machinery: New York, NY, USA, 2024. HAI ’24. pp. 429–431. [Google Scholar] [CrossRef]
  4. Guéguen, N.; Meineri, S.; Eyssartier, C. A Pedestrian’s Stare and Drivers’ Stopping Behavior: A Field Experiment at the Pedestrian Crossing. Saf. Sci. 2015, 75, 87–89. [Google Scholar] [CrossRef]
  5. Sucha, M.; Dostal, D.; Risser, R. Pedestrian-Driver Communication and Decision Strategies at Marked Crossings. Accid. Anal. Prev. 2017, 102, 41–50. [Google Scholar] [CrossRef]
  6. Kim, S.; Chang, J.J.E.; Park, H.H.; Song, S.U.; Cha, C.B.; Kim, J.W.; Kang, N. Autonomous Taxi Service Design and User Experience. Int. J. Hum.–Comput. Interact. 2020, 36, 429–448. [Google Scholar] [CrossRef]
  7. Mahadevan, K.; Somanath, S.; Sharlin, E. Communicating Awareness and Intent in Autonomous Vehicle-Pedestrian Interaction. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, Montreal, QC, Canada, 21–26 April 2018; pp. 1–12. [Google Scholar] [CrossRef]
  8. Dey, D.; Terken, J. Pedestrian Interaction with Vehicles: Roles of Explicit and Implicit Communication. In Proceedings of the 9th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, Oldenburg, Germany, 24–27 September 2017; Association for Computing Machinery: New York, NY, USA, 2017. AutomotiveUI ’17. pp. 109–113. [Google Scholar] [CrossRef]
  9. Zheng, N.; Li, J.; Li, N.; Zhang, M.; Cai, J.; Tei, K. Exploring Optimal eHMI Display Location for Various Vehicle Types: A VR User Study. In Extended Abstracts of the CHI Conference on Human Factors in Computing Systems; Association for Computing Machinery: New York, NY, USA, 2024; pp. 1–7. [Google Scholar] [CrossRef]
  10. Dey, D.; Senan, T.U.; Hengeveld, B.; Colley, M.; Habibovic, A.; Ju, W. Multi-Modal eHMIs: The Relative Impact of Light and Sound in AV-Pedestrian Interaction. In Proceedings of the CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 11–16 May 2024; pp. 1–16. [Google Scholar] [CrossRef]
  11. Rouchitsas, A.; Alm, H. Ghost on the Windshield: Employing a Virtual Human Character to Communicate Pedestrian Acknowledgement and Vehicle Intention. Information 2022, 13, 420. [Google Scholar] [CrossRef]
  12. Dey, D.; Van Vastenhoven, A.; Cuijpers, R.H.; Martens, M.; Pfleging, B. Towards Scalable eHMIs: Designing for AV-VRU Communication beyond One Pedestrian. In Proceedings of the 13th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, Leeds, UK, 9–14 September 2021; pp. 274–286. [Google Scholar] [CrossRef]
  13. Chang, C.M.; Toda, K.; Gui, X.; Seo, S.H.; Igarashi, T. Can Eyes on a Car Reduce Traffic Accidents? In Proceedings of the 14th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, Seoul, Republic of Korea, 17–20 September 2022; Association for Computing Machinery: New York, NY, USA, 2022. AutomotiveUI ’22. pp. 349–359. [Google Scholar] [CrossRef]
  14. Gui, X.; Toda, K.; Seo, S.H.; Chang, C.M.; Igarashi, T. “I Am Going This Way”: Gazing Eyes on Self-Driving Car Show Multiple Driving Directions. In Proceedings of the 14th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, Seoul, Republic of Korea, 17–20 September 2022; Association for Computing Machinery: New York, NY, USA, 2022. AutomotiveUI ’22. pp. 319–329. [Google Scholar] [CrossRef]
  15. de Clercq, K.; Dietrich, A.; Núñez Velasco, J.P.; de Winter, J.; Happee, R. External Human-Machine Interfaces on Automated Vehicles: Effects on Pedestrian Crossing Decisions. Hum. Factors 2019, 61, 1353–1370. [Google Scholar] [CrossRef]
  16. Rettenmaier, M.; Schulze, J.; Bengler, K. How Much Space Is Required? Effect of Distance, Content, and Color on External Human–Machine Interface Size. Information 2020, 11, 346. [Google Scholar] [CrossRef]
  17. Gui, X.; Toda, K.; Seo, S.H.; Eckert, F.M.; Chang, C.M.; Chen, X.A.; Igarashi, T. A Field Study on Pedestrians’ Thoughts toward a Car with Gazing Eyes. In Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems; Association for Computing Machinery: New York, NY, USA, 2023; pp. 1–7. [Google Scholar] [CrossRef]
  18. Guo, J.; Yuan, Q.; Yu, J.; Chen, X.; Yu, W.; Cheng, Q.; Wang, W.; Luo, W.; Jiang, X. External Human–Machine Interfaces for Autonomous Vehicles from Pedestrians’ Perspective: A Survey Study. Sensors 2022, 22, 3339. [Google Scholar] [CrossRef]
  19. Kim, Y.W.; Han, J.H.; Ji, Y.G.; Lee, S.C. Exploring the Effectiveness of External Human-Machine Interfaces on Pedestrians and Drivers. In Proceedings of the 12th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, Virtual, 21–22 September 2020; pp. 65–68. [Google Scholar] [CrossRef]
  20. Dou, J.; Chen, S.; Tang, Z.; Xu, C.; Xue, C. Evaluation of Multimodal External Human–Machine Interface for Driverless Vehicles in Virtual Reality. Symmetry 2021, 13, 687. [Google Scholar] [CrossRef]
  21. Alhawiti, A.; Kwigizile, V.; Oh, J.S.; Asher, Z.D.; Hakimi, O.; Aljohani, S.; Ayantayo, S. The Effectiveness of eHMI Displays on Pedestrian–Autonomous Vehicle Interaction in Mixed-Traffic Environments. Sensors 2024, 24, 5018. [Google Scholar] [CrossRef] [PubMed]
  22. Ackermann, C.; Beggiato, M.; Schubert, S.; Krems, J.F. An Experimental Study to Investigate Design and Assessment Criteria: What Is Important for Communication between Pedestrians and Automated Vehicles? Appl. Ergon. 2019, 75, 272–282. [Google Scholar] [CrossRef]
  23. Guo, F.; Lyu, W.; Ren, Z.; Li, M.; Liu, Z. A Video-Based, Eye-Tracking Study to Investigate the Effect of eHMI Modalities and Locations on Pedestrian–Automated Vehicle Interaction. Sustainability 2022, 14, 5633. [Google Scholar] [CrossRef]
  24. Rouchitsas, A.; Alm, H. Smiles and Angry Faces vs. Nods and Head Shakes: Facial Expressions at the Service of Autonomous Vehicles. Multimodal Technol. Interact. 2023, 7, 10. [Google Scholar] [CrossRef]
  25. Lau, M.; Jipp, M.; Oehl, M. Toward a Holistic Communication Approach to an Automated Vehicle’s Communication with Pedestrians: Combining Vehicle Kinematics with External Human-Machine Interfaces for Differently Sized Automated Vehicles. Front. Psychol. 2022, 13, 882394. [Google Scholar] [CrossRef]
  26. Fuest, T.; Michalowski, L.; Träris, L.; Bellem, H.; Bengler, K. Using the Driving Behavior of an Automated Vehicle to Communicate Intentions—A Wizard of Oz Study. In Proceedings of the 2018 21st International Conference on Intelligent Transportation Systems (ITSC), Maui, HI, USA, 4–7 November 2018; pp. 3596–3601. [Google Scholar] [CrossRef]
  27. Pillai, A.K. Virtual Reality Based Study to Analyse Pedestrian Attitude towards Autonomous Vehicles. Master’s Thesis, Aalto University, Espoo, Finland, 2017. [Google Scholar]
  28. Ezzati Amini, R.; Katrakazas, C.; Antoniou, C. Negotiation and Decision-Making for a Pedestrian Roadway Crossing: A Literature Review. Sustainability 2019, 11, 6713. [Google Scholar] [CrossRef]
  29. Lee, Y.M.; Madigan, R.; Giles, O.; Garach-Morcillo, L.; Markkula, G.; Fox, C.; Camara, F.; Rothmueller, M.; Vendelbo-Larsen, S.A.; Rasmussen, P.H.; et al. Road Users Rarely Use Explicit Communication When Interacting in Today’s Traffic: Implications for Automated Vehicles. Cogn. Technol. Work 2021, 23, 367–380. [Google Scholar] [CrossRef]
  30. Jayaraman, S.K.; Creech, C.; Tilbury, D.M.; Yang, X.J.; Pradhan, A.K.; Tsui, K.M.; Robert, L.P. Pedestrian Trust in Automated Vehicles: Role of Traffic Signal and AV Driving Behavior. Front. Robot. AI 2019, 6, 117. [Google Scholar] [CrossRef]
  31. Holländer, K.; Colley, A.; Mai, C.; Häkkilä, J.; Alt, F.; Pfleging, B. Investigating the Influence of External Car Displays on Pedestrians’ Crossing Behavior in Virtual Reality. In Proceedings of the 21st International Conference on Human-computer Interaction with Mobile Devices and Services, Taipei, Taiwan, 1–4 October 2019; pp. 1–11. [Google Scholar] [CrossRef]
  32. Epke, M.R.; Kooijman, L.; De Winter, J.C.F. I See Your Gesture: A VR-Based Study of Bidirectional Communication between Pedestrians and Automated Vehicles. J. Adv. Transp. 2021, 2021, 5573560. [Google Scholar] [CrossRef]
  33. Ackermans, S.; Dey, D.; Ruijten, P.; Cuijpers, R.H.; Pfleging, B. The Effects of Explicit Intention Communication, Conspicuous Sensors, and Pedestrian Attitude in Interactions with Automated Vehicles. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April 2020; Association for Computing Machinery: New York, NY, USA, 2020. CHI ’20. pp. 1–14. [Google Scholar] [CrossRef]
  34. Lee, Y.M.; Madigan, R.; Uzondu, C.; Garcia, J.; Romano, R.; Markkula, G.; Merat, N. Learning to Interpret Novel eHMI: The Effect of Vehicle Kinematics and eHMI Familiarity on Pedestrian’ Crossing Behavior. J. Saf. Res. 2022, 80, 270–280. [Google Scholar] [CrossRef]
  35. Tran, T.T.M.; Parker, C.; Yu, X.; Dey, D.; Martens, M.; Bazilinskyy, P.; Tomitsch, M. Evaluating Autonomous Vehicle External Communication Using a Multi-Pedestrian VR Simulator. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2024, 8, 130:1–130:26. [Google Scholar] [CrossRef]
  36. Colley, M.; Belz, J.H.; Rukzio, E. Investigating the Effects of Feedback Communication of Autonomous Vehicles. In Proceedings of the 13th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, Leeds, UK, 9–14 September 2021; pp. 263–273. [Google Scholar] [CrossRef]
  37. Colley, M.; Bajrovic, E.; Rukzio, E. Effects of Pedestrian Behavior, Time Pressure, and Repeated Exposure on Crossing Decisions in Front of Automated Vehicles Equipped with External Communication. In Proceedings of the CHI Conference on Human Factors in Computing Systems, New Orleans, LA, USA, 29 April–5 May 2022; pp. 1–11. [Google Scholar] [CrossRef]
  38. Hensch, A.C.; Neumann, I.; Beggiato, M.; Halama, J.; Krems, J.F. How Should Automated Vehicles Communicate?—Effects of a Light-Based Communication Approach in a Wizard-of-Oz Study. In Advances in Human Factors of Transportation; Stanton, N., Ed.; Springer International Publishing: Cham, Switzerland, 2020; Volume 964, pp. 79–91. [Google Scholar] [CrossRef]
  39. Nguyen, T.T.; Holländer, K.; Hoggenmueller, M.; Parker, C.; Tomitsch, M. Designing for Projection-Based Communication Between Autonomous Vehicles and Pedestrians. In Proceedings of the 11th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, Utrecht, The Netherlands, 21–25 September 2019; pp. 284–294. [Google Scholar] [CrossRef]
  40. Haimerl, M.; Colley, M.; Riener, A. Evaluation of Common External Communication Concepts of Automated Vehicles for People with Intellectual Disabilities. Proc. ACM Hum.-Comput. Interact. 2022, 6, 1–19. [Google Scholar] [CrossRef]
  41. Faas, S.M.; Baumann, M. Light-Based External Human Machine Interface: Color Evaluation for Self-Driving Vehicle and Pedestrian Interaction. Proc. Hum. Factors Ergon. Soc. Annu. Meet. 2019, 63, 1232–1236. [Google Scholar] [CrossRef]
  42. Bazilinskyy, P.; Dodou, D.; De Winter, J. External Human-Machine Interfaces: Which of 729 Colors Is Best for Signaling ‘Please (Do Not) Cross’? In Proceedings of the 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Toronto, ON, Canada, 11–14 October 2020; pp. 3721–3728. [Google Scholar] [CrossRef]
  43. Colley, M.; Walch, M.; Gugenheimer, J.; Askari, A.; Rukzio, E. Towards Inclusive External Communication of Autonomous Vehicles for Pedestrians with Vision Impairments. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April 2020; pp. 1–14. [Google Scholar] [CrossRef]
  44. Lim, D.; Kwon, Y. How to Design the eHMI of AVs for Urgent Warning to Other Drivers with Limited Visibility? Sensors 2023, 23, 3721. [Google Scholar] [CrossRef] [PubMed]
  45. Li, Y.; Dikmen, M.; Hussein, T.G.; Wang, Y.; Burns, C. To Cross or Not to Cross: Urgency-Based External Warning Displays on Autonomous Vehicles to Improve Pedestrian Crossing Safety. In Proceedings of the 10th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, Toronto, ON, Canada, 23–25 September 2018; pp. 188–197. [Google Scholar] [CrossRef]
  46. Zhanguzhinova, S.; Makó, E.; Borsos, A.; Sándor, Á.P.; Koren, C. Communication between Autonomous Vehicles and Pedestrians: An Experimental Study Using Virtual Reality. Sensors 2023, 23, 1049. [Google Scholar] [CrossRef] [PubMed]
  47. Eisele, D.; Kraus, J.; Schlemer, M.M.; Petzoldt, T. Should Automated Vehicles Communicate Their State or Intent? Effects of eHMI Activations and Non-Activations on Pedestrians’ Trust Formation and Crossing Behavior. Multimed. Tools Appl. 2024. [Google Scholar] [CrossRef]
  48. Chang, C.M.; Toda, K.; Sakamoto, D.; Igarashi, T. Eyes on a Car: An Interface Design for Communication between an Autonomous Car and a Pedestrian. In Proceedings of the 9th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, Oldenburg, Germany, 24–27 September 2017; pp. 65–73. [Google Scholar] [CrossRef]
  49. Löcken, A.; Golling, C.; Riener, A. How Should Automated Vehicles Interact with Pedestrians?: A Comparative Analysis of Interaction Concepts in Virtual Reality. In Proceedings of the International Conference on Automotive User Interfaces and Interactive Vehicular Applications, Utrecht, The Netherlands, 22–25 September 2019. [Google Scholar]
  50. Faas, S.M.; Mathis, L.A.; Baumann, M. External HMI for Self-Driving Vehicles: Which Information Shall Be Displayed? Transp. Res. Part F Traffic Psychol. Behav. 2020, 68, 171–186. [Google Scholar] [CrossRef]
  51. Dey, D.; Habibovic, A.; Löcken, A.; Wintersberger, P.; Pfleging, B.; Riener, A.; Martens, M.; Terken, J. Taming the eHMI Jungle: A Classification Taxonomy to Guide, Compare, and Assess the Design Principles of Automated Vehicles’ External Human-Machine Interfaces. Transp. Res. Interdiscip. Perspect. 2020, 7, 100174. [Google Scholar] [CrossRef]
  52. Eisma, Y.B.; Van Bergen, S.; Ter Brake, S.M.; Hensen, M.T.T.; Tempelaar, W.J.; De Winter, J.C.F. External Human–Machine Interfaces: The Effect of Display Location on Crossing Intentions and Eye Movements. Information 2019, 11, 13. [Google Scholar] [CrossRef]
  53. Schmidt-Wolf, M.; Feil-Seifer, D. Vehicle-to-Pedestrian Communication Feedback Module: A Study on Increasing Legibility, Public Acceptance and Trust. In Proceedings of the International Conference on Software Reuse, Montpellier, France, 15–17 June 2022. [Google Scholar]
  54. Tiesler-Wittig, H. Functional Application, Regulatory Requirements and Their Future Opportunities for Lighting of Automated Driving Systems; No. 2019-01-0848; SAE International: Warrendale, PA, USA, 2019. [Google Scholar] [CrossRef]
  55. Dey, D.; Habibovic, A.; Pfleging, B.; Martens, M.; Terken, J. Color and Animation Preferences for a Light Band eHMI in Interactions between Automated Vehicles and Pedestrians. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April 2020; pp. 1–13. [Google Scholar] [CrossRef]
  56. Carmona, J.; Guindel, C.; Garcia, F.; De La Escalera, A. eHMI: Review and Guidelines for Deployment on Autonomous Vehicles. Sensors 2021, 21, 2912. [Google Scholar] [CrossRef]
  57. Singh, P.; Chang, C.M.; Igarashi, T. I See You: Eye Control Mechanisms for Robotic Eyes on an Autonomous Car. In Proceedings of the 14th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, Seoul, Republic of Korea, 17–20 September 2022; pp. 15–19. [Google Scholar] [CrossRef]
  58. Bindschädel, J.; Krems, I.; Kiesel, A. Two-Step Communication for the Interaction between Automated Vehicles and Pedestrians. Transp. Res. Part F Traffic Psychol. Behav. 2022, 90, 136–150. [Google Scholar] [CrossRef]
  59. Schömbs, S.; Pareek, S.; Goncalves, J.; Johal, W. Robot-Assisted Decision-Making: Unveiling the Role of Uncertainty Visualisation and Embodiment. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 11–16 May 2024; Association for Computing Machinery: New York, NY, USA, 2024. CHI ’24. pp. 1–16. [Google Scholar] [CrossRef]
  60. Hanley, J.A. Statistical Analysis of Correlated Data Using Generalized Estimating Equations: An Orientation. Am. J. Epidemiol. 2003, 157, 364–375. [Google Scholar] [CrossRef] [PubMed]
  61. Ballinger, G.A. Using Generalized Estimating Equations for Longitudinal Data Analysis. Organ. Res. Methods 2004, 7, 127–150. [Google Scholar] [CrossRef]
  62. Chen, J.; Li, N.; Shi, Y.; Du, J. Cross-Cultural Assessment of the Effect of Spatial Information on Firefighters’ Wayfinding Performance: A Virtual Reality-Based Study. Int. J. Disaster Risk Reduct. 2023, 84, 103486. [Google Scholar] [CrossRef]
  63. Nummenmaa, L.; Calder, A.J. Neural Mechanisms of Social Attention. Trends Cogn. Sci. 2009, 13, 135–143. [Google Scholar] [CrossRef]
  64. Yin, X. Influences of Eye Gaze Cues on Memory and Its Mechanisms: The Function and Evolution of Social Attention. Front. Psychol. 2022, 13, 1036530. [Google Scholar] [CrossRef] [PubMed]
  65. Visser, T.A.W.; Roberts, A. Automaticity of Social Cues: The Influence of Limiting Cognitive Resources on Head Orientation Cueing. Sci. Rep. 2018, 8, 10288. [Google Scholar] [CrossRef]
Figure 1. eHMI feedback designs and experimental parking scenario.
Figure 2. Mean selection time results of the study. (a) Mean selection time; (b) mean selection time comparison across feedback modes by park location. For this and all following figures, statistical significance is denoted by ** (p < 0.01) and * (p < 0.05).
Figure 3. Mean error rate results of the study. (a) Mean error rate; (b) mean error rate comparison across feedback modes by park location.
Figure 4. Mean selection deviation results of the study. (a) Mean selection deviation; (b) mean selection deviation comparison across feedback modes by park location.
Figure 5. Mean head rotation angle and movement distance results of the study. (a) Mean head rotation angle; (b) mean head movement distance.
Figure 6. Mean head movement distance along different study dimensions. (a) Mean head movement distance along X; (b) mean head movement distance along Y; (c) mean head movement distance along Z.
Figure 7. Comprehension and confidence ratings across different feedback modes: (a) mean comprehension ratings across feedback modes; (b) mean confidence ratings across feedback modes; (c) user perception ratings across different feedback modes.
Figure 8. Distribution of descriptive terms used by participants to evaluate different feedback modes.
Figure 9. Mean selection time across different feedback modes, parks, and display positions: (a) mean selection time between feedback and park; (b) mean selection time between feedback and display.
Figure 10. Mean selection time between feedback modes and display positions in parks 2 and 3: (a) mean selection time between feedback and display in park 2; (b) mean selection time between feedback and display in park 3.
Figure 11. Mean error and selection deviation across different feedback types and parks: (a) mean error between feedback and parks; (b) mean selection deviation between feedback and parks.
Figure 12. Mean error and selection deviation between feedback and display: (a) mean error between feedback and display in park 3; (b) mean error between feedback and display in park 4; (c) mean selection deviation between feedback and display in park 4.
Table 1. eHMI applications in human–vehicle interaction research.
Application Scenario | Feedback Mode | Display Position | References
Pedestrian Crossing | Anthropomorphic Eyes | Front Headlights | [13,14]
 | | Front Windshield | [17]
 | Arrows/Symbols | Bumper and Radiator Grille | [16]
 | | Roof | [18]
 | | Front Windshield | [20,21]
 | | Grille, Windshield and Roof | [23]
 | Text Messages | Bumper and Radiator Grille | [16]
 | | Roof | [18]
 | | Front Windshield | [20,21]
 | | Grille, Windshield and Roof | [23,52]
 | | Above Wheels, Road Projection | [52]
 | Light Effects | Windshield, Bumper | [9]
 | | Headlights, Front Grille | [15]
AT Pickup | Eye and Text | Front Windshield | [3]
Table 2. Experimental design matrix of feedback mode and display position combinations.
Feedback Mode | Display Position | Theoretical Advantage
Eye | Body | Emotional engagement + Natural viewing
 | Top | Emotional engagement + Attention capture
Arrow | Body | Directional clarity + Natural viewing
 | Top | Directional clarity + Attention capture
Number | Body | Precision + Natural viewing
 | Top | Precision + Attention capture
Table 3. Participant demographics and prior experience.
Characteristics | Statistics
All Participants | 18
Male | 9
Female | 9
Age (SD) | 23.83 (1.54)
Driving Experience | Number of Users
0 years | 6
1–5 years | 9
>5 years | 3
Table 4. Statistically significant selection time (s) differences across feedback modes.
Location | Selection Time: Eye | Arrow | Number | Improvement: Number vs. Eye | Number vs. Arrow | Arrow vs. Eye
Overall | 11.90 | 11.22 | 9.04 | 24.03% | 19.43% | -
Park 1 | 8.48 | 7.14 | 5.40 | 36.32% | 24.37% | 15.80%
Park 2 | 10.94 | 11.23 | 8.18 | 25.23% | 27.16% | -
Park 3 | 13.83 | 14.21 | 11.02 | 20.32% | 22.45% | -
Park 4 | 15.61 | 13.90 | 13.70 | 12.24% | - | -
Note: The symbol “-” indicates that the difference between the two feedback modes was not statistically significant (p > 0.05). All percentage values represent statistically significant improvements (p < 0.05).
Table 5. Statistically significant selection time (s) improvements for body vs. top display positions.
Location | Body | Top | Improvement: Body vs. Top
Overall | 10.22 | 11.09 | 7.84%
Park 1 | 6.52 | 7.24 | 9.94%
Park 2 | 9.50 | 10.56 | 10.04%
Park 4 | 13.84 | 14.94 | 7.36%
Table 6. Selection times (s) across different feedback modes, display positions, and parking locations.
Feedback Mode | Display Position | Park 1 | Park 2 | Park 3 | Park 4
Eye | body | 7.65 | 9.90 | 13.71 | 15.26
Eye | top | 9.39 | 12.08 | 13.95 | 15.97
Arrow | body | 6.40 | 10.10 | 12.63 | 13.03
Arrow | top | 7.96 | 12.49 | 16.00 | 14.83
Number | body | 5.66 | 8.56 | 11.87 | 13.34
Number | top | 5.15 | 7.81 | 10.22 | 14.08
Table 7. Significant improvements in user perception across feedback modes.
Metric | Eye | Arrow | Number | Comparison | Improvement (%)
Comprehension | 2.67 | 2.92 | 2.07 | Arrow vs. Number | 41.06%
Intention | 5.40 | 5.65 | 4.00 | Arrow vs. Number | 41.25%
 | | | | Eye vs. Number | 35.00%
Attention | 4.65 | 2.95 | 2.75 | Eye vs. Arrow | 57.63%
 | | | | Eye vs. Number | 69.09%
Table 8. Arrow and number feedback modes show fewer errors and lower selection deviation in parks 1 and 2.
Metric | Location | Eye | Arrow | Number | Improvement: Arrow vs. Eye | Arrow vs. Number | Number vs. Eye
Error Rate | Park 1 | 0.24 | 0.01 | 0.01 | 95.83% | - | 95.83%
Error Rate | Park 2 | 0.39 | 0.08 | 0.53 | 79.5% | 84.91% | -
Selection Deviation | Park 1 | 0.26 | 0.02 | 0.02 | - | - | 92.31%
Selection Deviation | Park 2 | 0.39 | 0.08 | 0.53 | 79.49% | 84.91% | -
Note: All percentage values indicate statistically significant differences (p < 0.05). The “-” symbol indicates no statistically significant difference was found (p > 0.05). Lower values for error and selection deviation represent better performance.
Table 9. Eye feedback shows fewer errors and lower selection deviation in park 4 compared to other modes.
Metric | Location | Eye | Arrow | Number | Improvement: Eye vs. Arrow | Eye vs. Number
Error Rate | Park 4 | 0.25 | 0.67 | 0.57 | 62.69% | 56.14%
Selection Deviation | Park 4 | 0.25 | 0.86 | 0.59 | 70.93% | 57.63%
Table 10. Error count with breakdown by deviation levels across different feedback modes, display positions, and parking locations.
Feedback Mode | Display Position | Park 1 Total (D1, D2, D3) | Park 2 Total (D1, D2, D3) | Park 3 Total (D1, D2, D3) | Park 4 Total (D1, D2, D3)
Eye | body | 2 (1,1,0) | 7 (7,0,0) | 5 (5,0,0) | 5 (5,0,0)
Eye | top | 8 (8,0,0) | 7 (7,0,0) | 6 (5,1,0) | 4 (4,0,0)
Arrow | body | 0 (0,0,0) | 1 (1,0,0) | 3 (3,0,0) | 12 (9,2,1)
Arrow | top | 1 (1,0,0) | 2 (2,0,0) | 14 (14,0,0) | 12 (10,1,1)
Number | body | 0 (0,0,0) | 10 (10,0,0) | 8 (8,0,0) | 6 (6,0,0)
Number | top | 1 (1,0,0) | 9 (9,0,0) | 13 (11,2,0) | 14 (10,3,1)
Note: Values represent the total error count, followed by counts for each selection deviation level in parentheses: (D1, D2, D3). D1 = deviation level 1; D2 = deviation level 2; D3 = deviation level 3. Deviation level indicates the distance between the correct location and the user’s selection.
Table 11. Statistically significant error and selection deviation improvements for body vs. top display positions.
Metric | Location | Body | Top | Improvement: Body vs. Top
Error Rate | Overall | 0.20 | 0.38 | 47.37%
Error Rate | Park 1 | 0.01 | 0.13 | 92.31%
Error Rate | Park 3 | 0.29 | 0.62 | 53.23%
Selection Deviation | Overall | 0.19 | 0.31 | 38.71%
Selection Deviation | Park 3 | 0.27 | 0.63 | 57.14%
Table 12. Distribution of user confidence level counts across different feedback modes and display positions.
Feedback Mode | Display Position | Low (0–49%) | Medium (50–89%) | High (90–100%) | 0% Count | 100% Count
Eye | body | 3 | 13 | 2 | 0 | 0
Eye | top | 4 | 12 | 2 | 1 | 1
Arrow | body | 1 | 11 | 6 | 0 | 1
Arrow | top | 1 | 13 | 4 | 0 | 1
Number | body | 5 | 11 | 2 | 1 | 2
Number | top | 3 | 13 | 2 | 0 | 1
Note: This table presents the distribution of self-reported confidence levels across different feedback modes and display positions. The extreme-value columns (“0% Count” and “100% Count”) highlight instances where participants reported either 0% confidence (complete uncertainty) or 100% confidence (absolute certainty).
