Article

Nonverbal Interactions with Virtual Agents in a Virtual Reality Museum

Chaerim Sung and Sanghun Nam
1 Department of Culture Technology Convergence, Changwon National University, Changwon-si 51140, Republic of Korea
2 Department of Meta-Convergence Content, Changwon National University, Changwon-si 51140, Republic of Korea
* Author to whom correspondence should be addressed.
Electronics 2025, 14(22), 4534; https://doi.org/10.3390/electronics14224534
Submission received: 2 October 2025 / Revised: 9 November 2025 / Accepted: 13 November 2025 / Published: 20 November 2025
(This article belongs to the Special Issue New Trends in User-Centered System Design and Development)

Abstract

Virtual reality (VR) learning environments can provide enriched, effective educational experiences by heightening one's sense of immersion. Consequently, virtual agents (VAs) capable of complementing or substituting for human instructors are gaining research traction. However, researchers predominantly examine VAs in nonimmersive contexts, rarely investigating their roles within immersive VR settings. Users' sense of immersion and social presence in VR environments can fluctuate more significantly than in nonimmersive platforms, rendering the communicative attributes of VAs particularly consequential. This study investigates the effects of VAs' nonverbal behaviors on user experience in a VR-based learning environment. A VR environment modeled after an art museum was developed, in which a virtual curator engaged participants through two distinct modes of interaction. Participants were randomly assigned to one of two groups: a group whose agent used both verbal and nonverbal behaviors or a group whose agent used only verbal communication. Findings demonstrated that the inclusion of nonverbal behaviors enhanced the participants' sense of immersion, social presence, and engagement with the learning content. This study enriches the literature by identifying effective communication strategies for the design of VAs in VR environments and by offering implications for the development of more immersive and engaging VR experiences.

1. Introduction

In recent years, various computerized learning environments, such as simulation- and web-based platforms, have been actively adopted in educational contexts as a popular means of delivering instructional content [1]. Among these, virtual reality (VR) learning environments (VRLEs) have attracted significant attention for their ability to provide experiential learning opportunities that closely resemble real-world situations using immersive and interactive technologies [2]. VR technology overcomes physical limitations by visually and spatially reproducing scenarios that learners may not ordinarily encounter in daily life, thereby enabling experience-centered learning. The integration of visual and experiential components helps learners store and retain knowledge effectively as if they have experienced it firsthand [3,4,5]. VRLEs also enhance concentration and engagement by encouraging active participation within virtual environments, which improves overall user experience [6,7,8].
The potential use of virtual agents (VAs) to supplement or substitute traditional instructor roles has grown with the utilization of VR as a novel educational medium. VAs are computer-based systems that use natural language interfaces and digital avatars to engage in social interactions with users; they have been applied across various domains, including customer service, personal assistance, education, and healthcare [9,10]. VAs in educational settings primarily interact with learners through text- or audio-based modalities [11]. Schroeder et al. defined VAs in education as “on-screen characters that facilitate instruction”; however, with recent advances in natural language understanding, dialog management, and emotion recognition, VAs are evolving into intelligent “educational entities” [12,13,14,15]. Meta-analyses have shown that learning environments incorporating VAs generally yield superior learning outcomes compared to those without VAs. Equipped with sophisticated multimodal communication capabilities, modern embodied conversational agents (ECAs) help improve the quality of interaction by fostering rapport with learners and delivering human-like conversational experiences [16,17,18]. These attributes enhance learners’ immersion in interactions with VAs [19], ultimately leading them to perceive VAs as not only information delivery tools but also communicative partners [20].
Similar to real-world interactions, communication with VAs is realized through both verbal and nonverbal channels. Verbal elements may include text, speech, and video, allowing agents to engage in flexible, contextually appropriate exchanges with users [21]. For example, text-based interaction is particularly useful in chatbot-style applications, where users input specific commands or queries, whereas advances in speech recognition and synthesis have brought voice-based interaction closer to natural human conversation [22,23,24]. Nonverbal elements are also crucial in shaping interaction. Influential studies on animated characters have shown that users tend to respond to computer systems in ways similar to human–human interaction when these systems provide social signals [25]. Nonverbal cues enable more natural interaction with VAs similar to how people communicate in everyday contexts. Gestures can be used to explain ideas or provide information, gaze shifts can draw attention, and facial expressions can convey emotions [26,27,28]. Nonverbal behaviors strengthen emotional bonds and foster trust and intimacy between users and agents while providing contextual cues that facilitate comprehension of information [29,30,31]. However, this positive narrative is not universal. Poorly designed, excessive, or culturally inappropriate nonverbal behaviors can also be perceived as distracting, detrimental, or disingenuous, potentially breaking immersion rather than enhancing it [32]. Our study proceeds with this balanced view, aiming to understand the impact of naturalistic, motion-captured nonverbal behaviors.
Over the past several decades, VA research has steadily progressed, particularly regarding applications in public cultural spaces, such as museums. For instance, Kopp et al. introduced the conversational curator agent Max, which engages museum visitors in natural dialog to provide information on exhibitions [33]. Max successfully fosters visitor engagement through human-like communication strategies, highlighting the effectiveness and advantages of using ECAs as interactive guides. Similarly, Potdevin et al. examined the impact of an agent’s communication style on perceived intimacy, showing that the presence or absence of nonverbal behaviors significantly influences user perceptions [34]. However, their study was limited in that it relied on staged video stimuli rather than actual user interactions. Grivokostopoulou et al. compared learners who interacted with VAs with those who did not, demonstrating that VAs are crucial in virtual learning environments [35]. Their findings suggested that when positioned as learning companions, agents can enhance learners’ experiences, increase engagement, and improve knowledge construction and performance. However, they did not investigate agent communication channels or the role of nonverbal behaviors.
Although numerous studies have investigated VAs on nonimmersive platforms, such as text-only interfaces, chatbots, and animated videos, research specifically examining VAs in immersive VR environments remains relatively scarce [36,37]. Furthermore, many prior studies on both immersive and nonimmersive platforms have relied on procedural, looped, or pre-canned animations, which can lack naturalism. The impact of high-fidelity, performer-based motion capture on these interactions is less well understood. This highlights the need for studies that compare the verbal and nonverbal communicative elements of VAs within VR, focusing specifically on the quality and naturalness of the nonverbal behaviors. This gap is particularly important in contexts such as a VR art museum. This setting differs fundamentally from typical VA studies that focus on dyadic (one-on-one) social conversations: the art museum facilitates a triadic (User–Agent–Object) interaction. Here, the VA acts as a curator whose primary role is to guide the user through the space and direct their attention to external objects. In this setting, nonverbal behaviors therefore include not only social cues (facing behavior, facial expression) but also crucial spatial and object-based actions (e.g., walking to an artwork and pointing at it). To address this gap, this study focuses on a more specific question: how these high-fidelity, motion-captured spatial behaviors influence not only the User–Agent relationship but also the User–Environment and User–Content relationships. We developed an immersive museum-inspired VR environment in which participants interact with a virtual curator exhibiting both verbal and nonverbal behaviors. Through an experiment, we analyzed how users perceived the VA, focusing on phenomena such as the uncanny valley, user immersion, and social presence.

2. Developing the VA and Virtual Environment

2.1. VA Implementation

Mori defined the uncanny valley as the emotional response—either positive or negative—that arises when humans encounter objects resembling themselves [38]. In the context of virtual avatars, the uncanny valley refers to the discomfort or eeriness that users may feel when interacting with avatars that appear highly realistic but not fully human. Users' unease tends to intensify as an avatar approaches, but does not fully achieve, human likeness [39,40]. Thus, users often prefer less realistic, more cartoon-like avatars. Indeed, cartoon-style virtual avatars have become increasingly familiar to contemporary audiences, particularly in popular media, such as virtual YouTubers and metaverse platforms [41]. Considering the uncanny valley hypothesis, designing a VA that closely resembles humans but lacks complete realism can elicit user discomfort. Therefore, the VA in this study was designed as a cartoon-style avatar to avoid the aforementioned issue and instead provide users with a more comfortable and engaging experience. The avatar model, designed as a stylized female character with a 1:7.5 body proportion, was created using the 3D character modeling software VRoid Studio (version 1.30). The model specifications included 37,655 polygons, 14 materials, and 118 bones to balance the rendering load and joint-control precision in real time. The joint axes and lengths were refined in Unity to minimize joint collisions and skinning distortions.
Also, our VA design is theoretically grounded in two key frameworks that directly informed our implementation: Social Presence Theory and Embodied Cognition. First, Social Presence Theory posits that users can perceive virtual entities as genuine social actors, not just digital objects [42,43]. The degree of social presence is heavily influenced by the richness and immediacy of social cues. Therefore, we reasoned that a VA (Group A) delivering crucial social cues—such as orienting its body toward the user (facing behavior) and displaying context-appropriate facial expressions—would be perceived as a more believable and engaging social partner compared to a static agent (Group B) that lacks these fundamental cues. Second, Embodied Cognition frameworks suggest that communication is not an abstract process; rather, physical actions like gestures are integral to how agents and humans create and comprehend meaning [44]. This is especially true in a spatial context like a museum. We applied this theory by reasoning that abstract information (e.g., “This painting is…”) becomes more concrete and effective when grounded by the VA’s body. An agent that physically walks to an artwork and points at it creates a shared spatial understanding with the user, making the communication feel more natural and enhancing immersion. These theoretical groundings guided our design decisions. Prior studies support this approach, indicating that facing and orientation behaviors strengthen users’ perception of an agent as a social presence, and that timely, meaningful gestures enhance immersion. Therefore, we deliberately incorporated nonverbal elements into the VA design. Specifically, motion (locomotion), gestures, facial expressions, and facing behavior were treated as key variables in VA implementation.
The VA was designed to perform various nonverbal behaviors to replicate a museum curation scenario. The implemented motions included walking and idle states, and the gestures primarily comprised hand movements, such as pointing at exhibits, nodding, clapping, open-palm gestures, and emphasis gestures. Additional nonverbal elements included facial expressions, blinking, and facing behavior. These actions functioned as not merely mechanical motions but also mechanisms enhancing users’ social presence and immersion.
The nonverbal expressions were implemented through motion capture and integrated into Unity following the procedure presented in Figure 1. Motion capture was conducted using the DTrack3 optical tracking system (ART). To collect motion data, 16 reflective markers were attached to the performer’s head, shoulders, back, arms, legs, pelvis, hands, and feet. The performer repeatedly enacted several motions and gestures, and the recorded animations were stored in Biovision Hierarchy (BVH) format. The captured motion data underwent rigging and refinement using Autodesk MotionBuilder (Figure 2). First, the captured 3D skeleton was mapped to the VA skeleton for accurate tracking. Given the discrepancies in arm length and joint positions between the performer and the avatar, MotionBuilder’s scaling and translation functions were used for precise adjustment. The data were synchronized with the avatar model through rigging, and the animations were exported in Autodesk Filmbox (FBX) format for final import into the Unity environment.
The motion data applied to the VA avatar were further edited and refined by correcting awkward or misaligned movements to enhance naturalness. These processes were performed in the Unity environment using the UMotion Pro asset. After being rigged in MotionBuilder, the FBX data were imported into Unity, and the joint positions and rotation values were adjusted frame by frame through UMotion. For instance, arm length discrepancies between the curator model and the captured animation occasionally resulted in misaligned gestures, such as clapping motions where the hands did not meet properly. In such cases, the contact points were manually realigned. Moreover, fine actions, such as finger movements, were smoothed using UMotion’s curve editor.
The implemented gestures are shown in Figure 3. A representative example was the pointing gesture, in which the VA raised its hand to indicate an artwork at the beginning of an explanation to direct the user’s attention to the target painting. One- and two-handed pointing motions were recorded, with right- and left-arm variations, ensuring that the extended fingertip precisely indicated the target artwork. The designed nodding motion provided feedback indicating that the VA understood the user’s response or encouraged attentive listening during the VA’s explanation. For this gesture, the VA performed a sequence of three short nods lasting approximately 2 s in total, thereby conveying a natural conversational flow.
When users made selections or interacted with the system, the clapping gesture, performed at chest height, conveyed a positive impression of the interaction. The open-palm gesture, in which the VA extended one hand with the palm facing the user, signaled the start or transition of explanations, serving as an attention-directing cue implying guidance or initiation. Additional gestures were incorporated to emphasize specific parts of the explanation. Motions such as drawing a circle with both hands or extending the fingers added vividness to the conveyed information, introduced rhythm, and helped prevent monotony. These nonverbal expressions contributed to information transmission and helped create an interactive, engaging user experience.
The VA moved to a designated location by walking in a manner resembling human gait instead of teleporting instantaneously. A prerecorded motion-captured walking animation was used, and its stride length was synchronized with the VA movement speed to avoid unnatural movement. Upon arriving at the target position, the VA’s entire body rotated to face the user by continuously tracking the user’s current position. Such naturalistic walking and turning behaviors reinforced the impression that the VA was consistently aware of the user, thereby enhancing the user’s sense of presence in the virtual environment and contributing to the user’s perception of direct communication with the VA.
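For illustration, the following is a minimal Unity C# sketch of this locomotion logic, not the authors' actual implementation; the component, field, and animator-parameter names (CuratorLocomotion, IsWalking, clipWalkSpeed) are hypothetical. It scales the walk clip's playback speed to the translation speed so the stride stays in sync and turns the body toward the user on arrival.

```csharp
using System.Collections;
using UnityEngine;

// Minimal sketch (not the authors' implementation) of the curator's locomotion.
// The walk clip's playback speed is scaled to the agent's translation speed so
// stride length and displacement stay in sync, and the body turns to face the
// user's current position on arrival. All names here are hypothetical.
public class CuratorLocomotion : MonoBehaviour
{
    public Animator animator;          // animator with assumed Idle/Walk states and an "IsWalking" bool
    public Transform user;             // HMD / camera rig transform
    public float moveSpeed = 1.2f;     // translation speed in m/s
    public float clipWalkSpeed = 1.2f; // speed (m/s) at which the captured walk clip was recorded

    public void MoveTo(Vector3 target) => StartCoroutine(WalkRoutine(target));

    public IEnumerator WalkRoutine(Vector3 target)
    {
        animator.SetBool("IsWalking", true);
        // Scale playback so the animated stride matches the actual displacement (reduces foot-sliding).
        animator.speed = moveSpeed / clipWalkSpeed;

        while (Vector3.Distance(transform.position, target) > 0.05f)
        {
            Vector3 dir = (target - transform.position).normalized;
            transform.position += dir * moveSpeed * Time.deltaTime;
            transform.rotation = Quaternion.Slerp(
                transform.rotation, Quaternion.LookRotation(dir), 5f * Time.deltaTime);
            yield return null;
        }

        animator.speed = 1f;
        animator.SetBool("IsWalking", false);

        // On arrival, rotate the whole body toward the user.
        Vector3 toUser = user.position - transform.position;
        toUser.y = 0f;
        if (toUser.sqrMagnitude > 0.001f)
            transform.rotation = Quaternion.LookRotation(toUser);
    }
}
```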
Facial expressions, another form of nonverbal behavior, were implemented using the blendshape functionality embedded within the VRM avatar model created in VRoid Studio as shown in Figure 4. Blendshapes control the deformation of facial regions through adjustable parameters as shown in Table 1. Global expressions, such as those suggesting neutrality (Neutral), anger (Angry), fun (Fun), joy (Joy), sorrow (Sorrow), and surprise (Surprised), were used to control the entire face, whereas localized parameters, namely, eyebrows (BRW), eyes (EYE), and mouth (MTH), enabled detailed adjustments. This approach addresses the limitations of prior studies, where repetitive neutral or smiling animations lacked variability, by enabling context-appropriate emotional expression through weight-based control. The weights were floating-point values of 0.0–100.0, with fine-tuned settings applied according to the communicative context. For instance, the Neutral = 100 value was maintained in the default state (Figure 4a) to preserve a gentle smile. At the end of an explanation, the parameters controlling the eyes were adjusted to create folded eyes, suggesting friendliness, as seen in the “Joy” expression (Figure 4b). In situations requiring stronger positive reactions, the parameters were adjusted to show the “Fun” expression (Figure 4c), which combined raised eyebrows, slightly closed eyes, and an upturned mouth for a more vivid smile. Conversely, for delivering sad lines, the parameters produced the “Sorrow” expression (Figure 4d), characterized by lowered brows, drooping eyelids, and a downturned mouth. For inquisitive utterances, the settings yielded smaller pupils and a slightly open mouth to express subtle doubt, creating the “Question” expression (Figure 4e). In surprising contexts, the relevant parameters for the eyebrows, eyes, and mouth were adjusted to create the “Surprised” expression (Figure 4f), maximizing the intensity of astonishment.
Instead of being implemented as static changes, these facial transitions were executed dynamically using Unity’s animation curve and linear interpolation functions to play over approximately 0.3–0.5 s. Consequently, expressions did not shift abruptly; instead, they changed progressively in a manner analogous to human facial muscle movements.
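As a rough sketch of this weight-based transition (not the paper's code, which drives VRM blendshape presets; a generic Unity SkinnedMeshRenderer blendshape is assumed here instead), an expression weight can be interpolated to its target over roughly 0.3–0.5 s as follows:

```csharp
using System.Collections;
using UnityEngine;

// Sketch of the timed expression transition described above. The study drives
// VRM blendshape presets; here a generic Unity SkinnedMeshRenderer blendshape
// (identified by an index) stands in for one expression parameter.
public class FacialExpressionController : MonoBehaviour
{
    public SkinnedMeshRenderer face;

    // Fade one blendshape weight (0-100) to a target value over roughly 0.3-0.5 s.
    public IEnumerator BlendTo(int blendShapeIndex, float targetWeight, float duration = 0.4f)
    {
        float start = face.GetBlendShapeWeight(blendShapeIndex);
        for (float t = 0f; t < duration; t += Time.deltaTime)
        {
            // Linear interpolation: the expression changes progressively instead of snapping.
            face.SetBlendShapeWeight(blendShapeIndex, Mathf.Lerp(start, targetWeight, t / duration));
            yield return null;
        }
        face.SetBlendShapeWeight(blendShapeIndex, targetWeight);
    }
}
```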
The average frequency of spontaneous human blinking is approximately 12–20 times per minute, which corresponds to once every ~3–5 s on average [45,46]. The VA was set to blink randomly at intervals of approximately 3–5 s to replicate naturalistic facial movements. Because the blinking motion was executed within a separate animation logic, it continuously appeared even when the VA was idle (performing no other actions).
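A minimal sketch of such independent blink scheduling, assuming the eyelid closure is exposed as a single hypothetical blendshape index, could look like this:

```csharp
using System.Collections;
using UnityEngine;

// Sketch of the independent blink logic: blinks are scheduled at random 3-5 s
// intervals in their own coroutine, so they continue while the VA is idle.
// The eyelid blendshape index is a hypothetical placeholder.
public class BlinkController : MonoBehaviour
{
    public SkinnedMeshRenderer face;
    public int blinkBlendShapeIndex;

    private void OnEnable() => StartCoroutine(BlinkLoop());

    private IEnumerator BlinkLoop()
    {
        while (true)
        {
            yield return new WaitForSeconds(Random.Range(3f, 5f)); // wait 3-5 s
            yield return Fade(0f, 100f, 0.08f);                    // close the eyelids
            yield return Fade(100f, 0f, 0.12f);                    // reopen them
        }
    }

    private IEnumerator Fade(float from, float to, float duration)
    {
        for (float t = 0f; t < duration; t += Time.deltaTime)
        {
            face.SetBlendShapeWeight(blinkBlendShapeIndex, Mathf.Lerp(from, to, t / duration));
            yield return null;
        }
        face.SetBlendShapeWeight(blinkBlendShapeIndex, to);
    }
}
```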
The VA’s facing behavior was designed such that the body orientation continuously tracked the user’s position throughout the experience as shown in Figure 5. Through the persistent monitoring of the head-mounted display (HMD) position in the VR environment, the VA orientation was always directed toward the user, thereby creating the impression of sustained attention. This design fostered an experience akin to human interaction and played a crucial role in enhancing the user’s social presence and immersion while interacting with the VA [47].
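A compact sketch of this facing behavior is shown below, assuming the HMD is reachable through Unity's main camera and using an assumed smoothing factor:

```csharp
using UnityEngine;

// Sketch of the continuous facing behavior: each frame the VA's yaw is steered
// toward the tracked HMD position. Camera.main stands in for the HMD camera,
// and the smoothing factor is an assumed value.
public class FacingBehavior : MonoBehaviour
{
    public float turnSpeed = 3f;

    private void Update()
    {
        Transform hmd = Camera.main.transform;
        Vector3 toUser = hmd.position - transform.position;
        toUser.y = 0f; // rotate around the vertical axis only
        if (toUser.sqrMagnitude < 0.001f) return;

        Quaternion target = Quaternion.LookRotation(toUser);
        transform.rotation = Quaternion.Slerp(transform.rotation, target, turnSpeed * Time.deltaTime);
    }
}
```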
For refined motion blending and precise motion control, the body was divided into separate layers within the animator as shown in Figure 6. The upper-body layer was assigned motions such as hand gestures, arm movements, and torso bending, which were used to point at exhibits or perform other communicative gestures. The lower-body layer was assigned locomotion-related motions, including walking, standing, and directional changes. Additionally, a facial layer was designated for controlling facial motions associated with emotional expressions. Weight values were allocated to each layer, allowing the simultaneous execution of multiple motions, such as walking while swinging the arms or altering the facial expressions according to the emotional content of an explanation, resulting in smooth composite movements. Because of this layered structure, the VA could naturally reproduce nonverbal communication behaviors. This layered approach minimized animation blending artifacts, such as ‘foot-sliding’, by synchronizing root motion speed with the animation controller’s transition (0.3 s duration) between ‘Idle’ and ‘Walk’ states.
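For illustration, the layer weights and the 0.3 s crossfade can be driven from a small controller like the sketch below; the layer names ("UpperBody", "LowerBody", "Face") are assumptions rather than the authors' exact setup:

```csharp
using UnityEngine;

// Sketch of the layered animator control: upper-body gestures, lower-body
// locomotion, and facial animation live on separate layers whose weights are
// blended independently. The layer names and the 0.3 s crossfade mirror the
// description above but are otherwise assumptions.
public class LayeredAnimationController : MonoBehaviour
{
    public Animator animator;

    private void Start()
    {
        // Enable all three body layers so their motions can play simultaneously.
        animator.SetLayerWeight(animator.GetLayerIndex("UpperBody"), 1f);
        animator.SetLayerWeight(animator.GetLayerIndex("LowerBody"), 1f);
        animator.SetLayerWeight(animator.GetLayerIndex("Face"), 1f);
    }

    // Play a gesture on the upper-body layer while locomotion continues underneath.
    public void PlayGesture(string stateName)
    {
        animator.CrossFadeInFixedTime(stateName, 0.3f, animator.GetLayerIndex("UpperBody"));
    }
}
```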

2.2. VRLE

The study environment was developed using the Unity game engine (version 2022.3.13f1) with the Oculus Integration software development kit (Version 57.0.2-deprecated), and the system was operated using a Meta Quest 3 HMD and controllers. System performance was optimized to ensure a stable user experience. The VA model and environment assets were profiled, maintaining a stable 90 FPS (Min: 72 FPS) on the Meta Quest 3. Each session involved a single participant and a single VA instance to eliminate interference between multiple users and to keep the participant's attention on the VA interaction. Given the inherent flexibility of virtual learning environments, which are not restricted by physical locations and can provide diverse experiences [3], an art museum featuring a collection of famous paintings was utilized as the setting. Curatorial interactions were designed to occur in realistic locations within the virtual world to enhance participant immersion. Explanations avoided excessive expert detail and focused on key content so that the difficulty level did not disrupt engagement with the environment. For the implementation of the museum, 3D structures were created using the Galleries and Museums asset package of Mixall, and the images of the explained artworks were inserted separately.

3. Experiment

3.1. Experimental Design

The curator VA provided information regarding famous artworks and could interact through two distinct communication modes: voice- and text-based verbal interaction. In the first mode, the VA delivered explanations of the artworks using speech synthesis. The Naver CLOVA speech synthesis platform was used for voice output. The second mode allowed users to confirm the VA’s explanations visually. Text boxes were implemented using a user interface canvas system in Unity and displayed the same sentences as the speech synthesis scripts. By default, the text boxes were positioned to the right of the VA; for artworks, the text boxes appeared adjacent to the exhibited piece to maximize readability. The voice narration and text display were synchronized by automatically activating each text box at the start of the corresponding audio file and deactivating it upon completion based on the file duration.
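A minimal sketch of this audio/subtitle coupling is shown below, assuming a legacy Unity UI Text element and hypothetical field names; the subtitle is activated when the clip starts and hidden after the clip's length elapses:

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.UI;

// Sketch of the speech/subtitle synchronization: the text box is shown when the
// narration clip starts and hidden after the clip's duration. Field names are
// hypothetical and a legacy UI Text element is assumed.
public class NarrationLine : MonoBehaviour
{
    public AudioSource voice;
    public GameObject textBox; // UI canvas element holding the subtitle
    public Text subtitle;

    public IEnumerator Play(AudioClip clip, string line)
    {
        subtitle.text = line;
        textBox.SetActive(true);
        voice.PlayOneShot(clip);
        yield return new WaitForSeconds(clip.length); // keep the subtitle up for the clip's length
        textBox.SetActive(false);
    }
}
```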
The curation scenario, comprising three artworks, remained identical regardless of the nonverbal elements. The experimental sequence allowed the user to determine the order of interactions within the VR environment, and the session concluded once all three artworks had been curated. When the user pointed a ray at an artwork using the controller, the piece was highlighted in red. When the user pressed the trigger on the controller, the VA oriented toward the selected artwork, approached it, performed a pointing gesture with its hand, and began its verbal explanation, thereby completing a feedback loop as shown in Figure 7. Once an artwork had been curated, the corresponding button was deactivated via Unity's event system, preventing further selection. All interactions followed a predefined script to ensure consistency and control within the experimental environment. The participants were given the option to respond to the VA's explanations, enabling active engagement rather than passive reception of information. Options could be selected using the ray and trigger of the right-hand controller, and the VA's responses were immediate, facilitating a natural, continuous interaction experience.
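As an illustrative sketch of this selection loop (not the study's code), the ray-hover highlight, the trigger handling via the Oculus Integration input API, and the one-time deactivation could be wired roughly as follows; the Artwork component and the reuse of the CuratorLocomotion sketch above are assumptions:

```csharp
using UnityEngine;

// Sketch of the selection loop: the controller ray highlights the hovered
// artwork in red, and a trigger press sends the curator to it and disables
// further selection. The Artwork component, the reuse of the CuratorLocomotion
// sketch above, and the exact input mapping are assumptions.
public class Artwork : MonoBehaviour
{
    public bool curated;
    public Renderer frame;

    public void SetHighlight(bool on) => frame.material.color = on ? Color.red : Color.white;
}

public class ArtworkSelector : MonoBehaviour
{
    public Transform rayOrigin; // right-hand controller
    public CuratorLocomotion curator;
    private Artwork hovered;

    private void Update()
    {
        // Cast the controller ray and highlight whichever un-curated artwork it hits.
        if (Physics.Raycast(rayOrigin.position, rayOrigin.forward, out RaycastHit hit, 20f)
            && hit.collider.TryGetComponent(out Artwork art) && !art.curated)
        {
            if (hovered != art)
            {
                if (hovered != null) hovered.SetHighlight(false);
                hovered = art;
                hovered.SetHighlight(true);
            }
        }
        else
        {
            if (hovered != null) hovered.SetHighlight(false);
            hovered = null;
        }

        // Trigger press: mark the artwork as curated and send the curator toward it.
        if (hovered != null &&
            OVRInput.GetDown(OVRInput.Button.PrimaryIndexTrigger, OVRInput.Controller.RTouch))
        {
            hovered.curated = true;
            hovered.SetHighlight(false);
            curator.MoveTo(hovered.transform.position);
            hovered = null;
        }
    }
}
```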
Scenario and animation control were implemented programmatically via scripts. Transitions to specific states were controlled within the code either at the start of each dialog line or upon completion of the VA’s locomotion animations to ensure that the audio, gestures, facial expressions, and other animations were executed at appropriate times. Script-based control offers greater flexibility compared to fixed-timeline methods, enabling the real-time adjustment of diverse expressions, which is advantageous for interactive content with numerous user-interaction elements.
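A minimal sketch of such script-driven sequencing, with assumed trigger names, is shown below; each step fires its animator trigger and audio at the start of a dialog line and waits for the clip to finish before the scenario advances:

```csharp
using System.Collections;
using UnityEngine;

// Sketch of the script-driven state control: a coroutine advances the scenario,
// firing an animator trigger and the narration audio at the start of each dialog
// line and waiting for the clip to finish before moving on. Trigger names and
// the overall structure are assumptions, not the study's actual scripts.
public class ScenarioController : MonoBehaviour
{
    public Animator animator;
    public AudioSource voice;

    public IEnumerator PlayLine(AudioClip clip, string gestureTrigger)
    {
        animator.SetTrigger(gestureTrigger);          // e.g., a pointing or open-palm gesture
        voice.PlayOneShot(clip);
        yield return new WaitForSeconds(clip.length); // hold the state until the line ends
    }
}
```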
This study adopted a between-subjects design to compare two groups. The VA in Group A (Nonverbal) performed curation using speech, subtitles, and a full suite of nonverbal behaviors, including gestures, walking animations, dynamic facial expressions, and continuous facing behavior toward the user. Conversely, the VA in Group B (Verbal-only) delivered identical speech and subtitle explanations but was devoid of these nonverbal behaviors. To isolate the manipulation and control for potential confounds (e.g., timing, proximity, and audio cues), the VA in Group B traversed the same path to the artwork as in Group A. This movement occurred with the same timing and to the same final location, but it was rendered without a walking animation, thus appearing to “slide”. Furthermore, the VA in Group B remained facially static and only reoriented toward the user upon arriving at a destination, in contrast to the continuous facing behavior in Group A.
Each scene lasted approximately 10 min, with roughly 3 min allocated per artwork. Before the experiment, the participants completed a brief pre-experiment survey and received guidance on device usage. Following the experiment, the participants completed a post-experiment evaluation questionnaire to assess their experience.

3.2. User Experiment

Thirty university students (15 male, 15 female, mean age = 22.3 years) participated in the user experiment. They were informed of the non-invasive nature of the study, the 10 min duration, and their right to withdraw at any time without penalty if any issues or discomfort (such as cybersickness) arose. They were randomly assigned to either Group A (with nonverbal expressions) or Group B (without nonverbal expressions), with 15 participants per group. Each participant experienced a single scene. The participants operated the environment using the Meta Quest 3 HMD and controllers. Locomotion was controlled via the joystick on the left-hand controller, whereas viewing orientation for museum exploration was adjusted using the joystick on the right-hand controller. The participants observed each artwork while listening to the VA's explanations, which were delivered simultaneously through voice and subtitles. All data were collected anonymously after informed consent was obtained.

3.3. Experiment Method

Following the participants’ interaction with the curator VA, their user experience was assessed using a post-experiment questionnaire comprising 22 items. All items were answered using a 5-point Likert scale (1: strongly disagree, 5: strongly agree) to capture the participants’ subjective evaluations.
The questionnaire items were designed with reference to prior studies on the uncanny valley to evaluate the users' perceptions of unnaturalness, discomfort, and related emotional responses toward the VA, as shown in Table 2. These items were based on Mori's (1970) uncanny valley hypothesis [38], which posits that avatars resembling humans but lacking perfect realism can evoke discomfort in users.
The users’ subjective perception of being in the same space as the VA (self-reported copresence), their awareness of the VA recognizing and responding to them (perceived other’s copresence), and social presence were evaluated as shown in Table 3. These measures were grounded in prior research on social presence and copresence, such as the studies of Slater and Wilbur (1997) and Biocca et al. (2001), and were suitable for quantitatively assessing users’ social perception and interaction quality within immersive virtual environments [48,49].
The Igroup Presence Questionnaire (IPQ) determines the extent to which users feel spatially present within a virtual environment as shown in Table 4. It includes items assessing spatial presence (the sense of being physically located in the virtual space), involvement (degree of immersion), and experienced realism (perceived realism of the environment). Schubert et al. (2001) developed the IPQ with four subscales—spatial presence, involvement, realism, and general presence—and it is widely recognized as a valid instrument for quantitatively evaluating subjective immersion and presence in VR environments [50].

4. Result Analysis

The effect of the inclusion of nonverbal elements in VA design on user experience in the VR environment was examined through independent-samples t-tests. The 30 participants were divided into two groups: Group A (n = 15), which experienced content with nonverbal expressions, and Group B (n = 15), which experienced content without nonverbal expressions. For the analysis, composite scores for each category were calculated by computing the mean of their respective items.
The analysis first examined the participants' perception of the uncanny valley by testing the mean difference between the groups. The Uncanny Valley scale demonstrated good internal consistency (Cronbach's α = 0.875), and a composite score was calculated as the mean of its five items. As shown in Table 5, Group A (M = 4.35, SD = 0.57) reported significantly higher human likeness and likability than Group B (M = 3.52, SD = 0.68) (t(28) = 3.615, p = 0.001, Cohen's d = 1.320, 95% CI [0.516, 2.105]). Therefore, the VA with nonverbal elements was perceived as more natural and human-like, effectively mitigating the uncanny valley effect.
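For reference, the reported effect size follows the standard pooled-standard-deviation formulation; substituting the group means and standard deviations above (a consistency check, not an additional analysis) gives

\[
s_p = \sqrt{\frac{(n_A - 1)s_A^2 + (n_B - 1)s_B^2}{n_A + n_B - 2}} = \sqrt{\frac{14(0.57)^2 + 14(0.68)^2}{28}} \approx 0.63, \qquad
d = \frac{M_A - M_B}{s_p} = \frac{4.35 - 3.52}{0.63} \approx 1.32, \qquad
t = \frac{M_A - M_B}{s_p\sqrt{1/n_A + 1/n_B}} \approx 3.62,
\]

which agrees with the reported t(28) = 3.615 and Cohen's d = 1.320 up to rounding of the group statistics.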
The reliability for the copresence and social presence scales was assessed. Composite scores were calculated by computing the mean of the items for each subscale. The internal consistency was acceptable for the Self-Reported Copresence scale (α = 0.702), the Perceived Other’s Copresence scale (α = 0.735), and the Social Presence scale (α = 0.874). For self-reported copresence, the difference between Group A (M = 4.47, SD = 0.55) and Group B (M = 4.00, SD = 0.70) was not statistically significant (t(28) = 2.02, p = 0.053, Cohen’s d = 0.737, 95% CI [−0.010, 1.472]) as shown in Table 6. Therefore, the VA’s presence was perceived regardless of the absence or presence of nonverbal expressions. However, for perceived other’s copresence, Group A (M = 4.2, SD = 0.54) scored significantly higher than Group B (M = 3.07, SD = 0.8) (t(28) = 4.54, p < 0.001, Cohen’s d = 1.656, 95% CI [0.809, 2.481]), suggesting that the VA with nonverbal expressions more strongly conveyed the impression of recognizing and responding to users, thereby enhancing social interaction. Similarly, social presence was significantly higher in Group A (M = 4.1, SD = 0.66) than in Group B (M = 2.93, SD = 0.76) (t(28) = 4.61, p < 0.001, Cohen’s d = 1.684, 95% CI [0.833, 2.512]), indicating that interactions with VAs involving nonverbal communication felt more natural and effectively made the users feel socially present.
The internal consistency for the IPQ subscales was also confirmed. Composite scores were calculated by computing the mean of the items for each subscale. The Spatial Presence scale (α = 0.709), the Involvement scale (α = 0.827), and the Experienced Realism scale (α = 0.731) all demonstrated good reliability as shown in Table 7. For the IPQ subscales, the spatial presence scores were significantly higher in Group A (M = 4.77, SD = 0.24) than in Group B (M = 4.27, SD = 0.54) (t(28) = 3.28, p = 0.004, Cohen’s d = 1.199, 95% CI [0.409, 1.971]), suggesting that the VA with nonverbal expressions contributed to a stronger sense of actual presence within the virtual space. Similarly, involvement was significantly higher in Group A (M = 4.73, SD = 0.29) than in Group B (M = 3.64, SD = 0.82) (t(28) = 4.85, p < 0.001, Cohen’s d = 1.770, 95% CI [0.908, 2.611]), indicating that the VA with nonverbal elements helped enhance user focus and immersion more effectively. As for experienced realism, Group A (M = 4.73, SD = 0.53) scored significantly higher than Group B (M = 4.17, SD = 0.77) (t(28) = 2.34, p = 0.026, Cohen’s d = 0.856, 95% CI [0.100, 1.599]), implying that interactions with the VA that had nonverbal expressions enabled the users to perceive the virtual environment as more realistic and natural. The nonverbal elements, such as facial expressions, gestures, and facing, contributed positively to the immersive experience and realism perception aside from simply delivering information. The significant differences observed across all three IPQ subscales (spatial presence, involvement, and experienced realism) indicated that the nonverbal expressions of the VA played a crucial role in not only social interaction but also enhancing immersion and realism within the virtual environment. Therefore, the design of nonverbal communication elements is essential for immersive virtual learning systems and metaverse content development, providing empirical support for their positive impact on user engagement and presence.
Prior to the experiment, participants reported their prior VR experience on a 4-point scale (1 = no experience, 2 = 1–2 times, 3 = 3–5 times, 4 = 5 or more times). To further address the potential influence of prior VR experience, we conducted an exploratory analysis of the descriptive statistics. Based on the responses to this scale, participants were split into a 'No Experience' group (n = 12; those who responded '1') and an 'Experienced' group (n = 18; those who responded '2', '3', or '4'). A review of the means and standard deviations for each of the seven dependent variables across the resulting four cells (two experimental groups × two experience levels) revealed the following patterns.
The means and standard deviations (in parentheses) for the seven dependent variables across the four cells were as follows:
Measure | No Experience, Group A | No Experience, Group B | Experienced, Group A | Experienced, Group B
Uncanny Valley | 4.543 (0.538) | 3.333 (1.101) | 4.175 (0.580) | 3.567 (0.589)
Self-Reported Copresence | 4.714 (0.394) | 4.167 (1.041) | 4.250 (0.598) | 3.958 (0.655)
Perceived Other's Copresence | 4.333 (0.430) | 3.000 (1.527) | 4.083 (0.636) | 3.083 (0.621)
Social Presence | 4.428 (0.460) | 2.444 (0.962) | 3.875 (0.733) | 3.055 (0.693)
Spatial Presence | 4.750 (0.288) | 4.166 (0.577) | 4.781 (0.208) | 4.291 (0.552)
Involvement | 4.809 (0.262) | 4.000 (0.882) | 4.666 (0.308) | 3.556 (0.821)
Experienced Realism | 4.857 (0.244) | 3.333 (1.256) | 4.625 (0.694) | 4.375 (0.482)
The analysis of interaction effects indicated that prior VR experience might moderate the experimental results, as significant interactions were found for the Social Presence scale (p = 0.05) and the Experienced Realism scale (p = 0.017). Consistent with the descriptive statistics above, an interaction pattern emerged for Social Presence: within Group A (Nonverbal), the 'No Experience' group scored higher than the 'Experienced' group, whereas within Group B (Verbal-only), the 'Experienced' group scored higher than the 'No Experience' group. For Experienced Realism, there was little difference between experience levels in Group A, whereas in Group B the 'Experienced' group reported higher scores than the 'No Experience' group. However, these interaction results should be interpreted with caution because of the highly imbalanced cell sizes within Group B, where the 'No Experience' group contained only 3 participants and the 'Experienced' group contained 12. This small and uneven sample limits the reliability of this exploratory analysis.

5. Conclusions

This study investigated the effects of the verbal and nonverbal interactions of VAs on users in an immersive VR environment. The experimental results demonstrated that a VA with high-fidelity, motion-captured nonverbal expressions, such as gestures, facial expressions, and facing behavior, increased the users’ sense of social presence, immersion, and engagement. This suggests that the quality and naturalness of the motions, not just their mere presence, are critical factors. Significant differences were observed in social presence, spatial presence, and experienced realism, providing empirical evidence that multimodal communication by VAs can enhance the overall user experience in virtual environments beyond simple information delivery. This result addresses the practical question of implementation effort, suggesting that the investment in high-fidelity motion capture provides a significant and measurable added value compared to a simpler, verbal-only VA alternative. These findings provide valuable insights for designing VAs across various domains, including VR-based experiences, cultural content, and metaverse services.
However, this study has some limitations, including constraints on sample size and experimental environment. Future studies will conduct extended experiments with a wider variety of participant samples. Such diversity is also important for examining potential cultural differences and individual differences in social processing (e.g., autism spectrum considerations) when interpreting nonverbal cues, which were not covered in this study. Furthermore, future studies will utilize longitudinal or extended scenarios to improve ecological validity and investigate the effects of long-term participation. In addition, future work could expand on this study by investigating factors not yet utilized, such as realistic lip-syncing synchronized with the VA's speech, and by isolating the impact of the factors included here (e.g., comparing facing-only, gesture-only, and locomotion-only conditions).
In conclusion, this study provides foundational data for establishing effective VA communication strategies in VR environments. It offers practical contributions to the design of immersive systems and the broader field of human–computer interaction research.

Author Contributions

Conceptualization, C.S.; methodology, C.S. and S.N.; formal analysis, C.S.; resources, C.S.; conduct experiment, C.S.; data collection, C.S.; writing—original draft preparation, C.S.; writing—review and editing, C.S. and S.N.; supervision, S.N.; funding acquisition, S.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF2020R1I1A3051739) and received funding from the ‘Mid-career Faculty Research Support Grant’ at Changwon National University in 2024.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ECA: Embodied conversational agent
FIVE: Framework for immersive virtual environments
HMD: Head-mounted display
IPQ: Igroup presence questionnaire
VA: Virtual agent
VR: Virtual reality
VRLE: Virtual reality learning environment

References

  1. Weller, M. Virtual Learning Environments: Using, Choosing and Developing Your VLE; Routledge: London, UK, 2007. [Google Scholar]
  2. Shin, D.-H. The role of affordance in the experience of virtual reality learning: Technological and affective affordances in virtual reality. Telemat. Inform. 2017, 34, 1826–1836. [Google Scholar] [CrossRef]
  3. Freina, L.; Ott, M. A Literature Review on Immersive Virtual Reality in Education: State of the Art and Perspectives. In Proceedings of the International Scientific Conference Elearning and Software for Education, Bucharest, Romania, 23–24 April 2015; Volume 1. [Google Scholar]
  4. Burdea, G. Haptic Feedback for Virtual Reality, Keynote Address of Proceedings of International Workshop on Virtual Prototyping. In Proceedings of the International Workshop on Virtual Prototyping, Laval, France, 17–29 May 1999. [Google Scholar]
  5. Burdea, G.C.; Coiffet, P. Virtual Reality Technology; John Wiley & Sons: New York, NY, USA, 2003. [Google Scholar]
  6. Huang, H.-M.; Rauch, U.; Liaw, S.-S. Investigating learners’ attitudes toward virtual reality learning environments: Based on a constructivist approach. Comput. Educ. 2010, 55, 1171–1182. [Google Scholar] [CrossRef]
  7. Hanson, K.; Shelton, B.E. Design and development of virtual reality: Analysis of challenges faced by educators. J. Educ. Technol. Soc. 2008, 11, 118–131. [Google Scholar]
  8. Parong, J.; Pollard, K.A.; Files, B.T.; Oiknine, A.H.; Sinatra, A.M.; Moss, J.D.; Passaro, A.; Khooshabeh, P. The mediating role of presence differs across types of spatial learning in immersive technologies. Comput. Human. Behav. 2020, 107, 106290. [Google Scholar] [CrossRef]
  9. Martin, J.H. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition; Pearson/Prentice Hall: Hoboken, NJ, USA, 2009. [Google Scholar]
  10. Iglesias, A.; Luengo, F. Intelligent agents in virtual worlds. In Proceedings of the 2004 International Conference on Cyberworlds, Tokyo, Japan, 18–20 November 2004; pp. 62–69. [Google Scholar]
  11. Veletsianos, G. The impact and implications of virtual character expressiveness on learning and agent–learner interactions. J. Comput. Assist. Learn. 2009, 25, 345–357. [Google Scholar] [CrossRef]
  12. Schroeder, N.L.; Adesope, O.O.; Gilbert, R.B. How effective are pedagogical agents for learning? a meta-analytic review. J. Educ. Comput. Res. 2013, 49, 1–39. [Google Scholar] [CrossRef]
  13. Swartout, W.; Artstein, R.; Forbell, E.; Foutz, S.; Lane, H.C.; Lange, B.; Morie, J.; Noren, D.; Rizzo, S.; Traum, D. Virtual humans for learning. AI. Mag 2013, 34, 13–30. [Google Scholar] [CrossRef]
  14. Sabourin, J.; Mott, B.; Lester, J.C. Modeling learner affect with theoretically grounded dynamic Bayesian networks. In Proceedings of the Lecture Notes in Computer Science; Springer Berlin Heidelberg: Memphis, TN, USA; Berlin/Heidelberg, Germany, 2011; Volume 2011, pp. 286–295. [Google Scholar]
  15. Johnson, W.L.; Lester, J.C. Face-to-face interaction with pedagogical agents, twenty years later. Int. J. Artif. Intell. Educ. 2016, 26, 25–36. [Google Scholar] [CrossRef]
  16. Kang, S.-H.; Gratch, J.; Wang, N.; Watt, J.H. Does the Contingency of Agents’ Nonverbal Feedback Affect Users’ Social Anxiety? In Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems, Estoril, Portugal, 12–16 May 2008; Volume 1, pp. 120–127. [Google Scholar]
  17. Huang, L.; Morency, L.-P.; Gratch, J. Virtual Rapport 2.0. In Proceedings of the Intelligent Virtual Agents: 10th International Conference, Philadelphia, PA, USA, 20–22 September 2010; Springer: Reykjavik, Iceland; Berlin/Heidelberg, Germany, 2011; Volume 2011, pp. 68–79. [Google Scholar]
  18. Yalçin, Ö.N. Modeling empathy in embodied conversational agents. In Proceedings of the 20th ACM International Conference on Multimodal Interaction, Boulder, Colorado, 16–20 October 2018; ACM: New York, NY, USA, 2018; pp. 546–550. [Google Scholar]
  19. Diederich, S.; Brendel, A.B.; Morana, S.; Kolbe, L. On the design of and interaction with conversational agents: An organizing and assessing review of human-computer interaction research. J. Assoc. Inf. Syst. 2022, 23, 96–138. [Google Scholar] [CrossRef]
  20. Gunkel, D.J. Communication and artificial intelligence: Opportunities and challenges for the 21st century. Communication +1 2012, 1, 1–26. [Google Scholar] [CrossRef]
  21. Jolibois, S.; Ito, A.; Nose, T. Multimodal expressive embodied conversational agent design. In Proceedings of the Communications in Computer and Information Science, Copenhagen, Denmark, 23–28 July 2023; Springer Nature Switzerland: Cham, Switzerland, 2023; pp. 244–249. [Google Scholar]
  22. Mekni, M. An artificial intelligence based virtual assistant using conversational agents. J. Softw. Eng. Appl. 2021, 14, 455–473. [Google Scholar] [CrossRef]
  23. André, E.; Pelachaud, C. Interacting with embodied conversational agents. In Speech Technology; Springer: Berlin/Heidelberg, Germany, 2010; pp. 123–149. [Google Scholar] [CrossRef]
  24. Aneja, D.; Hoegen, R.; McDuff, D.; Czerwinski, M. Understanding Conversational and Expressive Style in a Multimodal Embodied Conversational Agent. In Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems, Yokohama, Japan, 8–13 May 2021; ACM: New York, NY, USA, 2021; pp. 1–10. [Google Scholar]
  25. Reeves, B.; Nass, C. The Media Equation: How People Treat Computers, Television, and New Media like Real People; Center for the Study of Language and Information: Cambridge, UK, 1996; Volume 10, pp. 19–36. [Google Scholar]
  26. McNeill, D. Hand and Mind: What Gestures Reveal About Thought; University of Chicago Press: Chicago, IL, USA, 1992. [Google Scholar]
  27. Frischen, A.; Bayliss, A.P.; Tipper, S.P. Gaze cueing of attention: Visual attention, social cognition, and individual differences. Psychol. Bull. 2007, 133, 694–724. [Google Scholar] [CrossRef] [PubMed]
  28. Ekman, P. Facial expressions. In Handbook of Cognition and Emotion; Dalgleish, T., Power, M., Eds.; Wiley: New York, NY, USA, 1999; Volume 16, p. e320. [Google Scholar]
  29. Tanenbaum, T.J.; Hartoonian, N.; Bryan, J. “How Do I Make This Thing Smile?” An Inventory of Expressive Nonverbal Communication in Commercial Social Virtual Reality Platforms. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, Honolulu, HI, USA, 25–30 April 2020; Association for Computing Machinery: New York, NY, USA, 2020; pp. 1–13. [Google Scholar] [CrossRef]
  30. Maloney, D.; Freeman, G.; Wohn, D.Y. “Talking without a voice” understanding non-verbal communication in social virtual reality. Proc. ACM Hum. Comput. Interact. 2020, 4, 1–25. [Google Scholar] [CrossRef]
  31. Burgoon, J.K.; Buller, D.B.; Woodall, W.G. Nonverbal Communication: The Unspoken Dialogue; McGraw-Hill: New York, NY, USA, 1996. [Google Scholar]
  32. Wang, I.; Ruiz, J. Examining the use of nonverbal communication in virtual agents. Int. J. Hum.-Comput. Interact. 2021, 37, 1648–1673. [Google Scholar] [CrossRef]
  33. Kopp, S.; Gesellensetter, L.; Krämer, N.C.; Wachsmuth, I. A Conversational Agent as Museum Guide—Design and Evaluation of a Real-World Application. In Proceedings of the Lecture Notes in Computer Science, Kos, Greece, 12–14 September 2005; Springer Berlin Heidelberg: Kos, Greece; Berlin/Heidelberg, Germany, 2005; Volume 2005, pp. 329–343. [Google Scholar]
  34. Potdevin, D.; Clavel, C.; Sabouret, N. Virtual intimacy in human-embodied conversational agent interactions: The influence of multimodality on its perception. J. Multimodal User Interfaces 2021, 15, 25–43. [Google Scholar] [CrossRef]
  35. Grivokostopoulou, F.; Kovas, K.; Perikos, I. The effectiveness of embodied pedagogical agents and their impact on students learning in virtual worlds. Appl. Sci. 2020, 10, 1739. [Google Scholar] [CrossRef]
  36. Galanxhi, H.; Nah, F.F.H. Deception in cyberspace: A comparison of text-only vs. avatar-supported medium. Int. J. Hum.-Comput. Stud. 2007, 65, 770–783. [Google Scholar] [CrossRef]
  37. Takano, M.; Yokotani, K.; Kato, T.; Abe, N.; Taka, F. Avatar Communication Provides More Efficient Online Social Support Than Text Communication. In Proceedings of the International AAAI Conference on Web and Social Media 2025, Copenhagen, Denmark, 23–26 June 2025; Volume 19, pp. 1862–1879. [Google Scholar]
  38. Mori, M. Bukimi No Tani [the uncanny valley]. Energy 1970, 7, 33. [Google Scholar]
  39. Tinwell, A.; Nabi, D.A.; Charlton, J.P. Perception of psychopathy and the uncanny valley in virtual characters. Comput. Hum. Behav. 2013, 29, 1617–1625. [Google Scholar] [CrossRef]
  40. Tinwell, A.; Grimshaw, M.; Nabi, D.A.; Williams, A. Facial expression of emotion and perception of the uncanny valley in virtual characters. Comput. Hum. Behav. 2011, 27, 741–749. [Google Scholar] [CrossRef]
  41. Mennecke, B.E.; Triplett, J.L.; Hassall, L.M.; Conde, Z.J. Embodied social presence theory. In Proceedings of the 2010 43rd Hawaii International Conference on System Sciences, Honolulu, HI, USA, 5–8 January 2010; pp. 1–10. [Google Scholar]
  42. Allmendinger, K. Social presence in synchronous virtual learning situations: The role of nonverbal signals displayed by avatars. Educ. Psychol. Rev. 2010, 22, 41–56. [Google Scholar] [CrossRef]
  43. Foglia, L.; Wilson, R.A. Embodied cognition. Wiley Interdiscip. Rev. Cogn. Sci. 2013, 4, 319–325. [Google Scholar] [CrossRef] [PubMed]
  44. Cassell, J.; Bickmore, T.; Campbell, L.; Vilhjalmsson, H.; Yan, H. Designing embodied conversational agents. In Embodied Conversational Agents; The MIT Press: Cambridge, MA, USA, 2000; Volume 29. [Google Scholar]
  45. Doughty, M.J. Consideration of three types of spontaneous eyeblink activity in normal humans: During reading and video display terminal use, in primary gaze, and while in conversation. Optometry Vis. Sci. 2001, 78, 712–725. [Google Scholar] [CrossRef]
  46. Stern, J.A.; Walrath, L.C.; Goldstein, R. The endogenous eyeblink. Psychophysiology 1984, 21, 22–33. [Google Scholar] [CrossRef]
  47. Oh, S.Y.; Bailenson, J.; Krämer, N.; Li, B. Let the avatar brighten your smile: Effects of enhancing facial expressions in virtual environments. PLoS ONE 2016, 11, e0161794. [Google Scholar] [CrossRef] [PubMed]
  48. Slater, M.; Wilbur, S. A framework for immersive virtual environments (FIVE): Speculations on the role of presence in virtual environments. Presence 1997, 6, 603–616. [Google Scholar] [CrossRef]
  49. Biocca, F.; Kim, J.; Choi, Y. Visual touch in virtual environments: An exploratory study of presence, multimodal interfaces, and cross-modal sensory illusions. Presence 2001, 10, 247–265. [Google Scholar] [CrossRef]
  50. Schubert, T.; Friedmann, F.; Regenbrecht, H. The experience of presence: Factor analytic insights. Presence Teleoperators Virtual Environ. 2001, 10, 266–281. [Google Scholar] [CrossRef]
Figure 1. Motion capture workflow.
Figure 2. The process of rigging and refining motion capture data using MotionBuilder.
Figure 3. Curator gestures.
Figure 4. Examples of VA facial expressions implemented using blendshapes: (a) Neutral, (b) Joy, (c) Fun, (d) Sorrow, (e) Question, (f) Surprised.
Figure 5. Facing behavior: Rotation of a VA to always face the user.
Figure 6. Animator layer structure.
Figure 7. Trigger interaction. The Korean sentences shown in the figure translate to: [You must be very interested in art! Will you be able to enjoy this experience?], [Choose your artwork using a controller].
Table 1. Facial parameter mapping.
Situation | BRW (Eyebrow) | EYE (Eyes) | MTH (Mouth) | ALL | Note
Default | - | - | - | Neutral = 100 | Natural smile
End of curation | - | Joy = 80 | - | - | Folded-eye smile
Positive | Fun = 70 | Fun = 100 | Fun = 65, Joy = 40 | - | Strong smile
Sad | Sorrow = 60 | Sorrow = 50 | Angry = 50 | - | Downturned eyes, brows, and mouth
Inquisitive | - | Surprised = 50 | Sorrow = 100 | - | Subtle doubtful expression
Surprised | Surprised = 80 | Surprised = 80 | Surprised = 50 | - | Wide eyes and open mouth
Table 2. Questionnaire items related to the uncanny valley.
Category | Item
Human Likeness | The museum curator closely imitated human behavior.
Attractiveness | The museum curator appeared attractive.
Eeriness | The museum curator evoked eeriness.
Comfort | Interacting with the museum curator felt comfortable.
Warmth | The museum curator felt warm and friendly.
Table 3. Questionnaire items for copresence and social presence.
Category | Items
Self-Reported Copresence | I felt like I was sharing the same space as the curator.
 | I was continuously aware of the curator's presence during the interaction.
Perceived Other's Copresence | I felt that the curator was aware of my presence.
 | I felt like the curator was acting like she was with me in the same real space.
 | I felt that the curator was interacting with me and responding to my actions.
Social Presence | Interacting with the curator felt natural.
 | I felt that the curator understood my emotions and responses.
 | Interacting with the curator was an enjoyable and meaningful experience.
Table 4. Items for the IPQ.
Category | Items
Spatial Presence | I felt as if I were actually inside the virtual museum.
 | I was able to intuitively understand the layout of the exhibition space and my position within it.
 | Moving around in the virtual museum felt similar to walking in a real physical space.
 | I could easily perceive my position and orientation in the virtual museum.
Involvement | I was deeply focused on the exhibits in the virtual museum.
 | The information and experiences provided by the virtual museum kept me engaged.
 | Interacting with the interactive elements (such as explanations and guides) in the virtual museum was enjoyable.
Experienced Realism | The spatial design of the virtual museum felt realistic.
 | The information provided by the virtual museum (such as explanations, texts, and audio) was similar to what I would expect from a real museum experience.
Table 5. Comparison of the uncanny valley scores by group.
Category | Group | n | Mean | SD | p
Uncanny Valley | A | 15 | 4.35 | 0.57 | 0.001
 | B | 15 | 3.52 | 0.68 |
Table 6. Comparison of copresence and social presence scores by group.
Category | Group | n | Mean | SD | p
Self-Reported Copresence | A | 15 | 4.47 | 0.55 | 0.053
 | B | 15 | 4.00 | 0.70 |
Perceived Other's Copresence | A | 15 | 4.20 | 0.54 | <0.001
 | B | 15 | 3.07 | 0.80 |
Social Presence | A | 15 | 4.10 | 0.66 | <0.001
 | B | 15 | 2.93 | 0.76 |
Table 7. Comparison of IPQ subscale scores by group.
Category | Group | n | Mean | SD | p
Spatial Presence | A | 15 | 4.77 | 0.24 | 0.004
 | B | 15 | 4.27 | 0.54 |
Involvement | A | 15 | 4.73 | 0.29 | <0.001
 | B | 15 | 3.64 | 0.82 |
Experienced Realism | A | 15 | 4.73 | 0.53 | 0.026
 | B | 15 | 4.17 | 0.77 |
