Investigating User Experience of an Immersive Virtual Reality Simulation Based on a Gesture-Based User Interface

Featured Application: The results of this work can be applied to the design, implementation, and evaluation of immersive virtual reality experiences that utilize gesture-based user interfaces.

Abstract: The affordability of equipment and availability of development tools have made immersive virtual reality (VR) popular across research fields. The gesture-based user interface has emerged as an alternative to handheld controllers for interacting with the virtual world using hand gestures. Moreover, a common goal for many VR applications is to elicit a sense of presence in users. Previous research has identified many factors that facilitate the evocation of presence in users of immersive VR applications. We investigated the user experience of Four Seasons, an immersive virtual reality simulation in which the user interacts with a natural environment and animals with their hands using a gesture-based user interface (UI). We conducted a mixed-method user experience evaluation with 21 Korean adults (14 males, 7 females) who played Four Seasons. The participants filled in a questionnaire and answered interview questions regarding presence and their experience with the gesture-based UI. The questionnaire results indicated high ratings for presence and the gesture-based UI, with some issues related to the realism of interaction and the lack of sensory feedback. By analyzing the interview responses, we identified 23 potential presence factors and proposed a classification for organizing presence factors based on the internal–external and dynamic–static dimensions. Finally, we derived a set of design principles based on the potential presence factors and demonstrated their usefulness for the heuristic evaluation of existing gesture-based immersive VR experiences. The results of this study can be used for designing and evaluating presence-evoking gesture-based VR experiences.


Introduction
Virtual reality (VR) has recently emerged as a mainstream interaction technology because of the development of affordable head-mounted displays (HMDs) capable of delivering highly immersive experiences and powerful application development platforms, such as Unreal Engine and Unity. Consequently, VR experiences have found use across several fields, including tourism [1,2], education [3][4][5], and health and rehabilitation [6,7]. Although most VR experiences have different target groups, content, interaction methods, and target hardware, they share the common goal of eliciting presence in users. Presence is defined as a "psychological state or subjective perception in which even though part or all of an individual's current experience is generated by or filtered through human-made technology, part or all of the individual's perception fails to accurately acknowledge the role of the technology in the experience" [8]. That is, the user feels that they are present in a virtual scenario and forgets that they are experiencing a simulation mediated by technology. Thus, enabling presence is a key challenge in the design of VR experiences. Previous research has identified several factors that contribute to presence, including, but not limited to, the quality of information [9,10], user control [10,11], realism [10], and certain cognitive, emotional, and behavioral factors [10,12]. High-quality equipment (e.g., HMD, controllers, motion trackers) and the correctness of the code that orchestrates the VR experience may contribute to eliciting presence. These are examples of presence factors, which we define as any element of a VR system or its usage context that has a positive effect on eliciting presence.
Gesture-based interactions have been studied since the early days of VR [13][14][15] to make the use of VR technology more comfortable and natural. In a gesture-based interface, physical gestures such as hand movements are used to interact with a computer system. A gesture-based interface is a popular instantiation of the broader concept of a natural user interface (NUI), which can be used to realize the transparency of the interface in HMD-based VR.
Factors that facilitate presence in VR have previously been studied [9][10][11][12][16][17]; however, studies on presence in the context of gesture-based UIs are lacking. Identifying factors that may affect presence in VR utilizing a gesture-based UI can be helpful when designing future immersive experiences. Therefore, we investigated the user experience of an immersive VR experience that utilizes a gesture-based UI from the perspectives of presence and gesture-based interaction. The target gesture-based VR experience, called Four Seasons, was designed to provide an engaging and relaxing experience in a natural environment simulation. We formulated the following research questions and sought answers to them:
1. How do users perceive the gesture-based UI of Four Seasons?
2. What potential presence factors can be associated with Four Seasons?
To answer the research questions, we conducted a mixed-method user experience evaluation of the Four Seasons VR experience based on a questionnaire and interview data gathered from 21 adult participants. The results of this study offer the following contributions to the immersive VR field: (i) the presentation of Four Seasons, a gesture-based immersive VR experience; (ii) a user experience evaluation of Four Seasons, revealing insights into how Korean adults experience it; (iii) the identification of potential presence factors associated with Four Seasons; (iv) the classification of the potential presence factors; (v) design principles based on the potential presence factors; and (vi) a heuristic evaluation of the Pillow immersive VR experience with a gesture-based UI. The results of this study are expected to facilitate the design and evaluation of immersive VR experiences that leverage gesture-based UIs.

Theoretical Background

Gesture-Based User Interface
The term natural user interface (NUI) was first proposed by Steve Mann in the 1970s to describe an interface that reflects the real world [18]. NUI is a general term used by designers and developers, referring to a UI that requires minimal visual instruction and is easy to learn. Unlike traditional graphical UIs and command-line interfaces, an NUI can be used naturally without traditional input/output devices such as keyboards or mice. Moreover, an NUI aims to feel natural by building on the previous experiences of the user; thus, the user can operate a well-designed NUI even without receiving training. In particular, the direct manipulation of virtual objects requires that the system provides representations of objects that behave as if they were real-world objects [19]. Additionally, McGloin and Krcmar [20] demonstrated that using natural controls in games can influence perceived realism (graphics and sound), which, in turn, can predict spatial presence and enjoyment.
Based on the work of Sae-Bae et al. [21], we classify NUIs into five types, as explained in Table 1. The touch interface is probably the most well-known NUI, with billions of smart devices using it. Voice interfaces have gained popularity through smart speakers and other voice-controlled smart home devices. While some commercial brain–computer interface (BCI) products exist, they have not become popular in the consumer market. Finally, eye interfaces can be provided through an eye tracker embedded in VR HMDs (e.g., FOVE, HTC VIVE Pro Eye, Varjo XR-4). Among the NUIs presented in Table 1, the gesture interface allows intuitive and comfortable interaction with content based on the user's movement because it enables three-dimensional interaction. In addition, compared to other NUIs, it supports more varied expressions and is intuitive and convenient to use [22]. Behavioral factors are more easily acquired, and recognizing the learning context related to the performance of the content is also easy [23]. Considering these points, we chose to use a gesture interface to explore presence in a VR experience.

Implementation Methods for Gesture-Based User Interfaces
Gesture-based UIs can be classified into touch-based and touchless methods. In a touch-based method, the user wears devices capable of detecting the user's motion and interacts directly through them. Accurate user gesture information can be obtained using relatively expensive equipment. A representative device for implementing touch-based gesture interfaces is the Data Glove [24], which was invented in the 1980s and can provide real-time tracking of hand position and orientation for gesture recognition. Currently, several commercial data gloves exist for motion capture and animation experts. However, the user is burdened with wearing the equipment, and errors may accumulate if the equipment is used for a long time. Another disadvantage is that the equipment is typically expensive, and the calibration process is complicated.
The touchless gesture interface recognizes a user's movement using a single camera, multiple cameras, or other sensors. Typically, initialization, tracking, user posture prediction, and gesture recognition are performed. In the initialization process, the camera is calibrated by determining its internal and external parameters. The tracking process continuously tracks feature points from a sequence of captured images depicting the movement of the user. The user's posture is predicted using the tracked feature points, and the movement is recognized. This method allows for more natural movement for the user compared to touch-based methods, but errors may occur in finding and tracking the feature points of the user's movement. To overcome this problem, a touchless gesture interface can utilize markers. Two types of markers are used: active markers with LEDs and infrared light, and passive markers using color. Although markers increase the recognition rate, the resulting difference from touch-based gesture interfaces is negligible. Therefore, markerless gesture tracking using only a camera benefits the most from the advantages of touchless gesture interfaces.
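The four stages described above (initialization, tracking, posture prediction, and gesture recognition) can be sketched as a simple processing loop. The following Python sketch is purely illustrative: every stage function is a toy stand-in (nearest-point matching, centroid posture, a horizontal-displacement classifier), not code from any real computer-vision SDK.

```python
# Illustrative skeleton of a markerless, touchless gesture pipeline:
# initialization -> tracking -> posture prediction -> gesture recognition.
# All stage logic is a hypothetical toy stand-in for real CV code.

def initialize_camera():
    """Stand-in for camera calibration (internal/external parameters)."""
    return {"focal_length": 1.0, "principal_point": (0.0, 0.0)}

def track_features(prev_points, frame_points):
    """Toy tracker: match each previous point to its nearest point in the new frame."""
    tracked = []
    for p in prev_points:
        nearest = min(frame_points,
                      key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
        tracked.append(nearest)
    return tracked

def predict_posture(points):
    """Toy posture estimate: the centroid of the tracked feature points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def recognize_gesture(postures, displacement_threshold=0.5):
    """Classify a horizontal swipe if the centroid moved far enough."""
    dx = postures[-1][0] - postures[0][0]
    if dx > displacement_threshold:
        return "swipe_right"
    if dx < -displacement_threshold:
        return "swipe_left"
    return "none"

def run_pipeline(frames):
    initialize_camera()                             # initialization stage
    postures = []
    prev = frames[0]
    for frame in frames:
        tracked = track_features(prev, frame)       # tracking stage
        postures.append(predict_posture(tracked))   # posture prediction stage
        prev = tracked
    return recognize_gesture(postures)              # recognition stage

# Example: feature points drifting to the right across three frames.
frames = [[(0.0, 0.0), (0.2, 0.1)],
          [(0.5, 0.0), (0.7, 0.1)],
          [(1.0, 0.0), (1.2, 0.1)]]
print(run_pipeline(frames))  # swipe_right
```

Real systems replace each stub with substantially more robust components (e.g., calibrated multi-camera setups and learned hand-pose estimators), but the stage boundaries remain the same.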
Leap Motion and Kinect are representative markerless and touchless gesture interface devices. Kinect recognizes the entire body of the user and is used not only for entertainment but also for sports, physical activity, and rehabilitation [25]. Because Kinect tracks the entire body, recognizing delicate gestures, such as hand gestures, is difficult. In contrast, Leap Motion is a high-sensitivity hand gesture recognition device that can track hand motions as small as 0.01 mm at up to 290 frames per second using an infrared camera [26]. It detects independent movements of fingers and thumbs and recognizes the three-dimensional position and direction information of the fingertips, as well as the direction information of the palm. In addition, it is relatively inexpensive, and the device size is small; therefore, it is easy to mount on wearable devices. For this reason, Leap Motion may have the potential to be used as an interface in various platforms and applications. However, the Leap Motion controller has limited operating conditions and field of view; when hands are overlapped or interlocked, the recognition rate of joint information tends to decrease. Therefore, an interface design suitable for the operating environment is required. Nevertheless, Leap Motion can act as a substitute for more expensive sensors in the industry and on the market [25]. Moreover, a recent literature review conducted by Bhiri et al. [27] showed that although Leap Motion offers the best price/performance ratio and has been applied to a wide range of fields, such as sign language, robotics, education, home assistance, VR, and medicine, it still faces challenges that affect the performance of hand gesture recognition. The identified challenges include the deformity of hands and neurological diseases, uncontrolled light conditions, deficiencies in data acquisition protocols, and unbalanced datasets for gesture recognition [27].

Presence Factors for Immersive VR Experiences
In this paper, we define a presence factor as any element of a VR system or its usage context that has a positive effect on presence. Presence factors can help users to experience presence, thereby contributing to memorable VR experiences. Examples of potential presence factors are high-quality displays in the HMD, realistic virtual characters, realistic and contextually appropriate virtual environments, accurate and natural hand gestures, high-quality positional sound effects, comfort when wearing the HMD, well-tested code free of obvious bugs, and a quiet, obstacle-free physical environment. Many studies have investigated the effects of presence factors on the presence experienced by users. For example, Louis et al. [9] explored how the technical affordances of VR equipment and network infrastructure, such as resolution, latency, frame rate, and jitter, affect presence. They proposed a method of measuring presence using these technical factors to complement the subjective instruments [28,29] that are typically used in presence research. Riches et al. [11] conducted a qualitative study to explore the factors that affect presence in a social VR environment. They concluded that presence increased when the VR contents elicited cognitive, emotional, and behavioral responses in the users, as well as when the users were able to create their own narratives about the events. Similarly, Sas and O'Hare [12] found significant correlations between presence and cognitive factors, including absorption, creative imagination, empathy, and willingness to experience presence. Based on their findings, Sas and O'Hare proposed a presence equation that can be used to predict presence by measuring cognitive factors. Finally, Schuemie et al. [10] reviewed the early literature on presence, including the studies behind some of the presence questionnaires. From these and other studies, they gathered several factors that contribute to presence in VR.

Presence Factor Explanation
High quality of information [9,10] The quality and resolution of presented information should be high to make it feel real.
High frame rate [9] VR experience should be offered at a high frame rate.
Consistency of presentation across displays [10] The system should provide a consistent multimodal presentation to users using different equipment.
Interaction with and manipulation of the environment [10] The user should be able to interact with and manipulate the virtual environment, including objects within it.
Virtual body of the user's avatar [10] The VR experience should present an avatar for the user, including a body. Showing just hands can harm presence.
Anticipatable effects of actions [10] The user should be able to predict and anticipate, to some extent, the results of their actions. For example, when a user throws a virtual rock, it should fly and fall as a real rock would, possibly damaging the target it hits.
User's control in the VR environment [10,11] The user should have control of the VR experience, for example, by controlling the avatar and events in the virtual world, as well as creating personal narratives of these events.
Richness of the displayed information [10] The VR experience should provide rich information in multiple formats.
Richness of other sensory information [10] The VR experience should be multimodal and engage multiple senses of the users to mimic real-world sensory experiences.
Absence of distractions [10] The user should not have any distractions that emerge from outside the VR experience.
Pictorial and social realism [10] VR experience should be realistic in terms of how likely the presented events are to occur in reality (social realism) and how real the virtual world looks.
Personalization based on users' characteristics [10] The system should consider differences in users' characteristics and adapt the VR experience accordingly.
Cognitive, emotional, and behavioral responses [11] VR experience should be able to elicit cognitive, emotional, and behavioral responses in the user.
Cognitive factors [12] Presence can be facilitated if the user has certain cognitive characteristics, such as absorption, creative imagination, empathy, and willingness to experience presence.
High quality network infrastructure [9] The network infrastructure should be able to provide a high-quality experience by minimizing latency and jitter.
The feeling of fun [16] The VR experience should be designed to elicit fun and enjoyment in the user, as fun has been shown to correlate positively with presence.
Immersion [30] High immersion (e.g., with an HMD and other sense-stimulating output devices) has been shown to promote presence in VR.
Avoidance of cybersickness [17] Symptoms of cybersickness can diminish or break the feeling of presence.
Immersive storytelling [31] Storytelling in VR is deeply linked to immersion, which, in turn, promotes presence.
Realistic soundscape [32] The sounds in VR should be realistic and appropriate to the virtual environment and the user's actions in it.

Materials and Methods
We conducted a mixed-method evaluation of Four Seasons, a gesture-based immersive VR experience, with Korean adults to investigate its user experience and presence. Figure 1 presents an overview of this study, including the stimulus, participants, data collection methods, and results. The following subsections present the details of the evaluation method.

Stimulus: Four Seasons Immersive VR Experience
In this study, we used the Four Seasons VR experience, which provides a relaxing multimodal gesture-based VR experience in a Korean countryside location. Four Seasons' key design pillars are beautiful graphics, gesture-based interactions, and a friendly narrator voice represented with a three-dimensional model of a female child. The VR experience reproduces the scenery of a beautiful village in the countryside of the Republic of Korea. The scene is set in a mountain valley, with a stream flowing between the foothills next to a traditional house. The player can hear natural sounds (e.g., water, wind, and insects), which are intended to help with relaxation. Different seasons are depicted in this scene: spring, summer, autumn, and winter. Each of these scenes contains elements specific to a season, such as flowers, insects, scarecrows, snowmen, and cows, which are animated along with other environmental objects. Some of these elements are interactable; for example, a butterfly can land on the user's hand, and the user can brush the reeds with fireflies with their hands. These graphical and interactive elements were designed to help users feel relaxed and engaged. The user can play Four Seasons either by freely selecting a season to play or by playing all seasons sequentially in the story mode.
Four Seasons was developed using Unity 5 and Maya 2017. The target VR platform was an HTC VIVE equipped with a Leap Motion hand-tracking sensor to capture the hand gestures of the user for gesture-based interaction. Figure 2 presents the devices and hand-tracking results for two participants, and Table 3 summarizes the software and hardware used to create Four Seasons. The UI was designed to be gesture-based, thereby avoiding the controller-based interactions often observed in VR applications. The gesture-based interaction parts consist of the elements presented in Table 4, which also outlines some of the specific elements of different seasons. The winter season is not presented in the table because it does not contain interactive elements.

Stimulus: Four Seasons Immersive VR Experience
In this study, we used the Four Seasons VR experience, which provides a relaxing multimodal gesture-based VR experience in a Korean countryside location.Four Seasons' key design pillars are beautiful graphics, gesture-based interactions, and a friendly narrator voice represented with a three-dimensional model of a female child.The VR experience reproduced the scenery of a beautiful village in the countryside of the Republic of Korea.The scene is in a mountain valley, with a stream flowing between the foothills next to a traditional house.The player can hear natural sounds (e.g., water, wind, and insects), which are intended to help with relaxation.Different seasons are depicted in this scene: spring, summer, autumn, and winter.Each of these scenes contain elements specific to a season, such as flowers, insects, scarecrows, snowmen, and cows, which are animated along with other environmental objects.Some of these elements are interactable, for example, a butterfly can land on the user's hand, and the user can brush the reeds with fireflies with their hands.These graphical and interactive elements were designed to help users to feel relaxed and engaged.The user can play Four Seasons either by freely selecting a season to play or play all seasons sequentially in the story mode.
Four Seasons was developed using Unity 5 and Maya 2017.The target VR platform was an HTC VIVE equipped with a Leap Motion hand tracker sensor to capture the hand gestures of the user for gesture-based interaction.Figure 2 presents the devices and handtracking results for two participants, and Table 3 summarizes the software and hardware used to create Four Seasons.The UI was designed to be gesture-based, thereby avoiding the controller-based interactions often observed in VR applications.The gesture-based interaction parts consist of the elements presented in Table 4, which also outlines some of the specific elements of different seasons.The winter season is not presented in the table because it does not contain interactive elements.The scene appears after the user touches the butterfly.There are two options: the story mode and the free selection mode.The user selects the mode with a hand-grabbing motion.The story mode proceeds sequentially according to the story, whereas the user can select a desired season in the season-select mode.
Touch: check collisions between the virtual hand and the target object.

Flowers
In the spring season, the user can observe the surrounding flowers blossoming.The user can perform various actions according to the young girl's (narrator) guidance.The user can experience the natural movement and the blooming of the flower petals as the buds are touched with a swipe gesture.

Touch: like in Menu
Selection.Swipe: the movement of the flowers is determined based on the hand position and velocity during the swipe gesture.

Butterflies, Fireflies, Dragonflies
The user can see butterflies in spring, fireflies on a summer night, and dragonflies in autumn.The voice of the narrator guides the interaction so that the user can perform corresponding actions.The butterflies, fireflies, and dragonflies sit on the user's hand when the palm faces down and the hand stays still.
Palm down: checking the orientation of the hand in world space.Hand movement: comparing the velocity of the hand to a threshold value.

Reeds
The user can experience reeds in the autumn season as the narrator's voice guides the user to perform natural movement and brush the reeds using the swipe gesture.When the user brushes reeds with their hand, colliding reeds sway with a sound effect, and fireflies appear from the reeds.Fireflies sit on the user's hand when the palm faces up.The scene appears after the user touches the butterfly.There are two options: the story mode and the free selection mode.The user selects the mode with a hand-grabbing motion.The story mode proceeds sequentially according to the story, whereas the user can select a desired season in the season-select mode.
Appl.Sci.2024, 14, x FOR PEER REVIEW 7 of 29  The scene appears after the user touches the butterfly.There are two options: the story mode and the free selection mode.The user selects the mode with a hand-grabbing motion.The story mode proceeds sequentially according to the story, whereas the user can select a desired season in the season-select mode.
Touch: check collisions between the virtual hand and the target object.

Flowers
In the spring season, the user can observe the surrounding flowers blossoming.The user can perform various actions according to the young girl's (narrator) guidance.The user can experience the natural movement and the blooming of the flower petals as the buds are touched with a swipe gesture.

Touch: like in Menu
Selection.Swipe: the movement of the flowers is determined based on the hand position and velocity during the swipe gesture.

Butterflies, Fireflies, Dragonflies
The user can see butterflies in spring, fireflies on a summer night, and dragonflies in autumn.The voice of the narrator guides the interaction so that the user can perform corresponding actions.The butterflies, fireflies, and dragonflies sit on the user's hand when the palm faces down and the hand stays still.
Palm down: checking the orientation of the hand in world space.Hand movement: comparing the velocity of the hand to a threshold value.

Reeds
The user can experience reeds in the autumn season as the narrator's voice guides the user to perform natural movement and brush the reeds using the swipe gesture.When the user brushes reeds with their hand, colliding reeds sway with a sound effect, and fireflies appear from the reeds.Fireflies sit on the user's hand when the palm faces up.
Touch: like in Mode Selection.Swipe: like in Flowers.Palm up: checking the orientation of the hand in world space.
Touch: check collisions between the virtual hand and the target object.

Flowers
In the spring season, the user can observe the surrounding flowers blossoming.The user can perform various actions according to the young girl's (narrator) guidance.The user can experience the natural movement and the blooming of the flower petals as the buds are touched with a swipe gesture.

Element Scene Description Display Gesture Implementation
Mode Selection (Menu) The scene appears after the user touches the butterfly.There are two options: the story mode and the free selection mode.The user selects the mode with a hand-grabbing motion.The story mode proceeds sequentially according to the story, whereas the user can select a desired season in the season-select mode.
Touch: check collisions between the virtual hand and the target object.

Flowers
In the spring season, the user can observe the surrounding flowers blossoming.The user can perform various actions according to the young girl's (narrator) guidance.The user can experience the natural movement and the blooming of the flower petals as the buds are touched with a swipe gesture.

Touch: like in Menu
Selection.Swipe: the movement of the flowers is determined based on the hand position and velocity during the swipe gesture.

Butterflies, Fireflies, Dragonflies
The user can see butterflies in spring, fireflies on a summer night, and dragonflies in autumn.The voice of the narrator guides the interaction so that the user can perform corresponding actions.The butterflies, fireflies, and dragonflies sit on the user's hand when the palm faces down and the hand stays still.
Palm down: checking the orientation of the hand in world space.Hand movement: comparing the velocity of the hand to a threshold value.

Reeds
The user can experience reeds in the autumn season as the narrator's voice guides the user to perform natural movement and brush the reeds using the swipe gesture.When the user brushes reeds with their hand, colliding reeds sway with a sound effect, and fireflies appear from the reeds.Fireflies sit on the user's hand when the palm faces up.

Butterflies, Fireflies, Dragonflies
The user can see butterflies in spring, fireflies on a summer night, and dragonflies in autumn.The voice of the narrator guides the interaction so that the user can perform corresponding actions.The butterflies, fireflies, and dragonflies sit on the user's hand when the palm faces down and the hand stays still.

Element Scene Description Display Gesture Implementation
Mode Selection (Menu) The scene appears after the user touches the butterfly.There are two options: the story mode and the free selection mode.The user selects the mode with a hand-grabbing motion.The story mode proceeds sequentially according to the story, whereas the user can select a desired season in the season-select mode.
Touch: check collisions between the virtual hand and the target object.

Flowers
In the spring season, the user can observe the surrounding flowers blossoming.The user can perform various actions according to the young girl's (narrator) guidance.The user can experience the natural movement and the blooming of the flower petals as the buds are touched with a swipe gesture.

Touch: like in Menu
Selection.Swipe: the movement of the flowers is determined based on the hand position and velocity during the swipe gesture.

Butterflies, Fireflies, Dragonflies
The user can see butterflies in spring, fireflies on a summer night, and dragonflies in autumn.The voice of the narrator guides the interaction so that the user can perform corresponding actions.The butterflies, fireflies, and dragonflies sit on the user's hand when the palm faces down and the hand stays still.
Palm down: checking the orientation of the hand in world space.Hand movement: comparing the velocity of the hand to a threshold value.

Reeds
The user can experience reeds in the autumn season as the narrator's voice guides the user to perform natural movement and brush the reeds using the swipe gesture.When the user brushes reeds with their hand, colliding reeds sway with a sound effect, and fireflies appear from the reeds.Fireflies sit on the user's hand when the palm faces up.
Touch: like in Mode Selection.Swipe: like in Flowers.Palm up: checking the orientation of the hand in world space.
Palm down: checking the orientation of the hand in world space.Hand movement: comparing the velocity of the hand to a threshold value.

Reeds
The user can experience reeds in the autumn season as the narrator's voice guides the user to perform natural movement and brush the reeds using the swipe gesture.When the user brushes reeds with their hand, colliding reeds sway with a sound effect, and fireflies appear from the reeds.Fireflies sit on the user's hand when the palm faces up.The scene appears after the user touches the butterfly.There are two options: the story mode and the free selection mode.The user selects the mode with a hand-grabbing motion.The story mode proceeds sequentially according to the story, whereas the user can select a desired season in the season-select mode.
Touch: check collisions between the virtual hand and the target object.

Flowers
In the spring season, the user can observe the surrounding flowers blossoming.The user can perform various actions according to the young girl's (narrator) guidance.The user can experience the natural movement and the blooming of the flower petals as the buds are touched with a swipe gesture.

Touch: like in Menu
Selection.Swipe: the movement of the flowers is determined based on the hand position and velocity during the swipe gesture.

Butterflies, Fireflies, Dragonflies
The user can see butterflies in spring, fireflies on a summer night, and dragonflies in autumn.The voice of the narrator guides the interaction so that the user can perform corresponding actions.The butterflies, fireflies, and dragonflies sit on the user's hand when the palm faces down and the hand stays still.
Palm down: checking the orientation of the hand in world space.Hand movement: comparing the velocity of the hand to a threshold value.

Reeds
The user can experience reeds in the autumn season as the narrator's voice guides the user to perform natural movement and brush the reeds using the swipe gesture.When the user brushes reeds with their hand, colliding reeds sway with a sound effect, and fireflies appear from the reeds.Fireflies sit on the user's hand when the palm faces up.

Ducks and Drakes
In the summer season, there is a stream flowing down the mountain towards the valley. The environment presents a scene of playing ducks and drakes. The stone is thrown by holding and releasing the grab gesture, where the time of grabbing affects the throwing force. As the stones skim the water's surface, ripples pop up with a sound effect.

Grab: using the grab gesture provided by the Leap Motion API.
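The grab-and-release throw described above can be sketched as follows: holding the grab longer charges a larger throw force. The base, gain, and cap values are assumptions for illustration, not the tuning used in Four Seasons.

```python
def throw_force(grab_time, release_time, base=2.0, gain=3.0, max_force=10.0):
    """Throw force grows linearly with how long the stone was held, up to a cap."""
    hold_duration = release_time - grab_time
    return min(base + gain * hold_duration, max_force)
```

The resulting force would then be applied to the stone's rigid body at the moment the grab gesture is released.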

Participants
Twenty-five Korean adults volunteered as participants in response to a notice distributed in a university department in the Republic of Korea. Out of the 25 participants, 4 experienced dizziness or headache when using VR and, therefore, discontinued the experiment. Table 5 provides the demographic distribution of the remaining 21 participants. After the participants completed the Four Seasons VR experience, they filled in a questionnaire comprising questions for collecting demographic information and five-point Likert items related to presence and their experience with the gesture-based UI of Four Seasons. The presence-related items were constructed based on the following presence factors: immersion [30], information/presentation quality [9,10], pictorial/social realism [10], storytelling [31], enjoyment/fun [16], emotional responses [11], the richness of information [10], soundscape [32], and cybersickness [17]. The items regarding gesture-based interaction were formulated according to the presence factors related to interaction with the environment and the richness of sensory information [10], as well as the overall usability of the gesture-based interaction. The questionnaire was prepared in English and then translated into Korean.
We used a custom set of items rather than standard questionnaires for presence and gesture-based interaction to keep the questionnaire length appropriate. Moreover, we aimed to include items that are specific to Four Seasons and do not appear in standard questionnaires. We acknowledge that this decision reduces the comparability of the results.

Semi-Structured Interview
After the participants filled in the questionnaire, we conducted individual semi-structured interviews to gain insights into their subjective thoughts and opinions related to the experience, with a particular focus on user experience with the gesture-based UI and the content modalities. The questions (Appendix A) were formulated based on the studies by Cramer et al. [33], Kim and Biocca [34], and Nielsen [35] and translated into Korean for the interviews. The interviews were then recorded, transcribed, and translated into English for analysis.

Experimental Procedure
Figure 3 illustrates the procedure of the experiment. The experiment room and devices were prepared as follows: The experimental room was a laboratory at a university with curtains covering the windows and a stable office light supply. No other people were present except for a participant and a researcher who instructed and assisted the participant. The headset used, HTC VIVE, was tested prior to the experiment to confirm that it blocks external light. A notice was posted outside the laboratory door to request silence during the experiment. The participants took part in the experiment individually according to an agreed-upon schedule. After receiving an orientation for the experiment in which a researcher explained the devices, content, and procedure, the participants gave informed consent for the collection, storage, and processing of the experimental data. They then answered demographic questions in a pre-survey. The first step (orientation) required approximately 10 min. Subsequently, the participants experienced Four Seasons by playing once through all of the seasons, which lasted approximately 15 min. Figure 2 shows two of the participants experiencing the content.
After the playing of Four Seasons was completed, the participants filled in the post-questionnaire and were then interviewed to investigate their opinions on the VR experience and the use of a gesture-based UI. The post-experiment data collection required approximately 15 min. Therefore, the total length of the experiment was approximately 40 min.

Data Analysis
The questionnaire data were analyzed with descriptive statistics (mean, standard deviation, Cronbach's alpha) and visualized using charts. The transcribed interview data were iteratively coded and categorized [36] using a spreadsheet application. In the first iteration, we analyzed each expression uttered by the participants and assigned it a descriptive code. Positive and negative responses were coded to initial potential presence factors. In the next iteration, the codes were analyzed to refine and combine them when appropriate. Finally, in the third iteration, the codes were further refined and assigned to three categories that emerged from the analysis of the data: user experience, content design, and technology.
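For reference, the Cronbach's alpha reported alongside the descriptive statistics can be computed from the item variances and the variance of the per-respondent total scores. A minimal sketch using population variances (in practice a statistics package would typically be used):

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Cronbach's alpha for internal consistency.

    items: one list of responses per questionnaire item, all of equal length
    (one response per respondent).
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent total score
    item_variance_sum = sum(pvariance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / pvariance(totals))
```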
In order to create a classification for organizing presence factors, the authors independently analyzed the potential presence factors identified in the evaluation and proposed dimensions onto which the factors could be mapped. Then, through an iterative process, the authors evaluated the dimensions through discussion and by mapping the identified potential presence factors onto them. Finally, two dimensions were selected for the final classification, as they were deemed able to accommodate the identified factors.

Ethical Considerations
This study was conducted in accordance with the Declaration of Helsinki. Informed consent was obtained from all participants included in this study. The participants, all of whom were adult volunteers, signed the informed consent form to participate in the experiment after a researcher briefed them about the experiment, how the collected data would be used, and what types of side effects could occur (e.g., symptoms of cybersickness). The Institutional Review Board of the authors' university approved the research design (201809-HS-002). All collected data were anonymized and stored on a secure storage server on campus.

Presence
The results regarding presence are presented in Figure 4 with means and standard deviations for each statement. The combined mean and standard deviation for the 12 presence statements are 4.21 and 0.76, respectively. Cronbach's alpha for the presence statements is 0.77, thus indicating acceptable internal consistency. As the results indicate, most statements related to presence garnered positive responses from the participants. The symptoms of cybersickness were largely avoided among those who finished the experiment (statement 3, µ: 4.38; σ: 0.97); however, four participants left the experiment early due to symptoms of cybersickness and did not answer the questionnaire. Four Seasons succeeded particularly well in keeping the interface clean (statement 6, µ: 4.48; σ: 0.75), providing a multisensory experience (statement 7, µ: 4.24; σ: 0.62), and making the experience enjoyable (statement 11, µ: 4.05; σ: 0.80). In contrast, the lowest responses to statements 10 (µ: 3.67; σ: 0.86) and 12 (µ: 3.52; σ: 0.98) indicate that "being there" with emotional engagement was particularly lacking for some participants. Possible reasons are revealed in the next section, where we present the results of the interview data analysis.

Potential Presence Factors
As described in Section 3.5, the process of identifying potential presence factors from the interview data was iterative. Here, we present the final codes that represent the identified potential presence factors along with their explanations and supporting quotes from the interviews. In total, we identified 23 potential presence factors that covered various aspects of the VR experience. Tables 6-8 present these factors in three thematic areas: user experience, content design, and technology. The results indicate that Four Seasons potentially has some unique presence factors that controller-based VR experiences may not have, such as natural hand movement, an increased feeling of realism, convenience, novelty, and natural interaction.


Natural hand movement
Hand movement should feel natural in the VR environment. This presence factor can be facilitated by a gesture-based interface, whereas using controller devices may inhibit it.
"It is convenient that I don't need to grab anything, and it doesn't require any complicated controls." (Male, 30s) "This is more convenient than holding anything. It is hard to rotate using hand controllers, but this way it is easy." (Male, 20s)

Realistic hand simulation
The virtual hand models and animations must be implemented realistically to achieve presence in a VR experience with a gesture-based UI. If the virtual hand model or its animation looks unrealistic owing to poor quality or technical glitches, it is likely to catch the user's attention and, therefore, divert them away from presence.
"All finger joints movements are realistic" (Female, 20s) "There is a problem in that hands tremble and hand modeling is unrealistic." (Male, 20s) "There is a sense of difference between hand modeling and real hands." (Male, 20s) "The hand modeling is unrealistic and immersion is reduced." (Female, 40s)

Gesture synchronization
In a gesture-based UI, hand gestures made by the user must be accurately synchronized to the avatar's actions in the virtual world, and the consequences of those actions must feel natural and anticipatable [10]. For example, making a throwing gesture to throw a virtual stone should result in the stone being hurled to a believable distance in relation to the gesture made. Any lag or accuracy error in the gesture synchronization can harm this presence factor.
"My hand movement is expressed minutely, and it makes me feel free." (Male, 30s) "I feel good when these reeds move according to my hand movement." (Male, 20s) "The device can't follow the movement correctly." (Male, 30s) "My hand movement in VR is slower than the actual movement." (Female, 20s)

Familiarity
Achieving presence is easier when the user is already familiar with controlling the VR experience. However, users experiencing VR for the first time may feel unfamiliar with controller devices, as they typically contain several buttons. When hand gestures are used for interaction in the VR environment, this issue is diminished. Familiarity can be further promoted by providing users with a tutorial and allowing them to practice gestures before entering the actual VR experience.
"The HMD device is uncomfortable for me but this way of control doesn't need an extra learning process as I can play with bare hands right away, so it is convenient." (Female, 30s) "I can't adapt to this easily because this way is not popular yet. If I'll adapt, it'll be fine." (Male, 20s) "The keyboard is better. Using hand gestures is not popular now so I can't adapt to using it." (Male, 20s)

Onboarding
Sufficient onboarding content (e.g., an interactive tutorial) is necessary, particularly for novice users, so that they know how to use the system. A good tutorial can help users to feel immersed and take a step toward feeling presence.
"It was a difficult experience because of lack of tutorial" (Male, 30s)

Realistic gesture interaction
Realism is a key construct measured by popular presence questionnaires [28,29]. The VR experience should, therefore, have sufficiently realistic environments, objects, and characters/creatures. In a gesture-based interface, the gesture interactions with the virtual world and its objects should feel realistic and not artificial (e.g., the reeds bend when the user brushes them with their hand).
"I feel a sense of realism when a butterfly lands on my hand." (Female, 20s) "I feel realism when insects are on my hand." (Male, 30s) "If I should grab anything, it would decrease the feeling of realism. But in this case, I don't need to do this so I can feel realistic." (Male, 20s) "Brushing the reeds using my own hands, and the insects' reactions to hand gestures are interesting." (Male, 20s)

Feeling of touch
Presence may be harmed if the user cannot feel touch when interacting with virtual objects using their hands. Equipment such as haptic gloves [37] may help to achieve the feeling of touch.
"It has less sensation compared to the controllers." (Male, 20s) "This way of controlling doesn't have any sense of pushing, so it makes me feel dull." (Female, 20s)

Feeling of freedom
The feeling of freedom is characteristic of gesture-based interfaces that let the user use their hands freely. Consequently, the interactions feel more natural, similar to those in the real world. Controllers, in turn, can be more difficult to use and uncomfortable to hold over time. This factor is closely related to the "Natural hand movement" presence factor.
"Without holding anything in hands makes me use my hands freely." (Female, 40s) "My hand movement shows the same way in the content. That feels like freedom." (Female, 20s) "The way is more convenient than holding anything. It is hard to rotate using hand controllers, but this way it is easy." (Male, 20s) "If my hands sweat then grabbing anything makes me uncomfortable but it is good that I can have free hands." (Male, 20s)

Feeling of novelty
The novelty perceived by the user in a VR experience has been shown to contribute to increased user satisfaction [38]. Consequently, this may contribute to presence, as a satisfied user is more likely to accept the virtual world as a substitute for the real world.
"The hand gesture interface is fresh." (Male, 20s) "It's amazing that it can recognize hand motion without any device" (Male, 20s)

Feeling of fun
When the user experiences fun in a VR experience, they are more likely to experience stronger presence [16].
Gamification techniques [39] can be particularly useful for increasing fun in VR experiences.
"It is fun that the insects adhere to my hand directly."(Female, 20s)

High plausibility
High plausibility occurs when the credibility of the virtual scenario matches the expectations of the user [40]. For example, we found that one participant was looking for the narrator's character but could not find it. That is, what the user expected and what they perceived in the virtual environment were mismatched.
"I feel so lonely because I can hear the character's voice but can't see her." (Female, 30s)

Table 7. Identified potential presence factors related to content design.

Immersion
VR systems are immersive through their use of HMDs that replace the real-world view with a virtual-world view [30]. A gesture-based UI using hand gestures can increase this immersion further through an increased sense of realism in hand movements. Immersion can contribute to presence through two dimensions: sensory immersion (graphics, sounds, and other perceptual cues) and motor immersion (body movement and feedback thereof) [41].

Interaction with virtual objects
Even a well-simulated VR environment can inhibit presence if the user cannot interact with any objects or the interaction does not feel natural. Therefore, a VR experience should provide various means for interacting with and manipulating the virtual environment and objects [10]. Objects' responses to these interactions should be predictable and based on the rules of physics in most cases. For example, the physics used for the rock movement in Ducks and Drakes was unrealistic, which caught the attention of a participant.
"Brushing the reeds using my own hands, and the insects' reactions to hand gestures are interesting." (Male, 20s) "Actually, I can't play ducks and drakes well, but I can do it well in this game." (Male, 20s) "I feel realism when insects are on my hand." (Female, 20s) "The rock's movement doesn't look real. It could have been better if it was animated by real-world physics." (Female, 40s)

Aesthetic design
Aesthetic design is essential for providing better user interaction and engagement [42]. Moreover, aesthetic design can invoke emotional responses in users, which is known to be a presence factor [11]. Aesthetic design encompasses not only the design of graphical assets but also animations, soundscape, and other content that constitute the overall VR experience.
"On the summer night, it is so beautiful when I shake my hands and the fireflies appear." (Female, 30s) "Beautiful autumn sky graphic." (Female, 20s) "What a beautiful scene and lights in the house at night." (Female, 20s) "Beautiful graphic of rural night view." (Male, 30s)

Detailed environment design
Related to the "Natural environment simulation" presence factor, the virtual environment should have sufficient quality and detail for users to perceive it to be realistic. However, a trade-off must often be made between the desired environment design quality and the capacity of the target VR hardware.
"On the summer night, it is so beautiful when I shake my hands and the fireflies appear." (Female, 30s) "What a beautiful scene and lights in the house at night." (Female, 20s) "Brushing the reeds using my own hands, and the insects' reactions to hand gestures are interesting." (Male, 20s)

Realistic scale and distance
Some participants had difficulty with identifying how far virtual objects were from them. This can make the virtual world feel unnatural and interaction with virtual objects cumbersome because the object's position must first be "found" by the hands before the interaction can begin. To alleviate this, the scale and distance measures should be realistic.
"I can't perceive a sense of distance." (Male, 20s) "It is hard to estimate the object's position." (Female, 20s)

Unobstructed physical environment
Some participants identified potential danger in using Four Seasons in a limited physical area where unnecessary physical objects might be placed near the user. Collisions with objects and fixed structures will certainly break the immersion, thereby diminishing presence. Even the perceived threat of collisions may keep the user on their toes, thereby affecting presence.
"If there are some obstacles nearby the user, it'll be dangerous because the user can touch those with bare hands." (Male, 20s) "It'll be dangerous when touching some obstacles with my hands." (Male, 20s)

Flawless technical implementation
Technical issues, such as software bugs and malfunctioning hardware, diminish the user experience of any interaction technology. In gesture-based VR experiences, technical issues related to VR equipment and hand tracking are particularly important to avoid, along with software issues that cause delays and distorted content.
"Hand graphics were twisted." (Male, 20s) "The immersion drops when there are some technical issues." (Male, 20s)

Comfortable form factor
The VR device and other accessories should feel unobtrusive to wear. The properties of the equipment (e.g., weight, fit, sweat inducement) can cause discomfort and inconvenience to the user, thereby inhibiting presence. Novice users may particularly feel inconvenienced by wearing multiple devices.
"The HMD device is too heavy, so it is hard to rotate my head." (Male, 20s) "All the experiment devices are too many." (Male, 30s)

Classification of Presence Factors
The number of potential presence factors is substantial. Therefore, it would be helpful for designers to have a method to logically organize them. To this end, we propose a classification method for organizing presence factors comprising four groups along the static–dynamic and internal–external dimensions as follows:

1. Dynamic–Internal: These presence factors relate to the user's affection through personal preferences and perception. They may change over time; for example, the initial feeling of fun may diminish over time if the VR experience is repetitive.
2. Static–Internal: These presence factors emerge from the user's prior experience, as well as their abilities. The presence factors in this group cannot (easily) change while experiencing VR content.
3. Dynamic–External: These presence factors relate to the physical performance of the hardware, the proficiency of designers and developers, and some aspects of the environment. All of these can be upgraded over time, hence the dynamicity.
4. Static–External: These presence factors are influenced by the operation of the system (e.g., HMD, controllers, gloves and other haptic devices, sensors) that is not dependent on the technical capabilities, as well as environmental aspects that are not easily changeable.
We mapped the identified potential presence factors into these groups, as shown in Figure 6. The classification can be updated by adding presence factors identified in future experiments, thus making it more comprehensive.

Design Principles and Heuristic Evaluation of Pillow
To demonstrate the use of the identified potential presence factors for conducting heuristic evaluations of existing immersive VR experiences, we devised a set of design principles that expand on the potential presence factors (Table A1 in Appendix B). Using the principles, we analyzed Pillow, a relaxing immersive VR and mixed reality experience where the user lies down on the bed to experience relaxing minigames. Pillow was chosen because it shares the goal of relaxation with Four Seasons, and it also has unique interaction methods, such as speech recorded by other users and using the controller on the chest to track the user's breathing.
At the time of writing this paper, Pillow has four minigames, which can be played in single-player mode or two-player mode (Figure 7). In the Stargazer minigame, the user can explore the night sky and learn about constellations in an interactive manner. The Fisherman minigame presents the user with a pond and a fishing rod. Using the rod, the user captures fishes that contain voice messages from other players who answer arbitrary questions. The Meditator minigame contains four instructed meditation contents (Energy, Focus, Chill, and Sleepy) that focus on breathing exercises, which are tracked by the controller placed on the chest. Finally, the Storyteller minigame generates and presents an interactive story based on the selected genre, character, and setting.

The detailed results of our evaluation of Pillow are presented in Table A2 in Appendix C.
In summary, Pillow was found to support most of the potential presence factors well, but there were a few aspects where the experience could be improved in this regard. We found that prolonged use of hand gestures in the Stargazer and Fisherman minigames can cause arm fatigue and even pain. The gestures are smooth and convenient to use, but interactions with some objects, such as the fishing rod, are not realistic. The graphics quality is not high, and the designs of the environments are simplified. In line with this, we found that the application does not provide the user with means to explore the details of the virtual environments. Moreover, when using Pillow with hand tracking, the user cannot receive tactile feedback to enrich interactions with the content. In conclusion, the identified issues could be remedied by increasing the graphical fidelity and environmental detail in the design, allowing the user to explore the virtual environments (e.g., in the Storyteller), providing more realistic interactions with some of the content (e.g., the fishing rod), and utilizing haptic gloves for tactile feedback.

Discussion
We explored the Four Seasons immersive VR experience with a gesture-based UI to identify young Korean adults' perceptions of the gesture-based UI and experienced presence. Based on the analysis of the collected data, we identified 23 potential presence factors, some of which can be associated with gesture-based interaction. In the following subsections, we discuss the findings from various perspectives.

On Overall User Experience
The presented user experience evaluation of Four Seasons focused on collecting and analyzing the participants' perceptions of presence and of using hand gestures to control the immersive VR experience. Although the questionnaire responses measuring user experience (Figures 3 and 4) were mostly positive, a few interesting findings motivate further discussion.
The participants who responded to the questionnaire experienced few symptoms of cybersickness (statement 3 in Figure 4). We believe this was aided by the design decision to keep the user stationary in the virtual world. We did not implement any teleportation or continuous movement methods for the user to travel in the virtual environment because rapid artificial movement and rotation are among the key causes of cybersickness in immersive VR [43]. However, Four Seasons still caused cybersickness in four participants, who discontinued the experience and did not respond to the questionnaire. Therefore, cybersickness remains an issue to be resolved in Four Seasons and VR in general [44].
Four Seasons aimed at providing an immersive and realistic experience through the simulation of a natural environment with flora and fauna, which can be manipulated by hand gestures. Although the affective effects of Four Seasons were not explored in depth in this study, we expected that the participants might have some emotional responses to the content. However, the user experience evaluation indicated that only some of the users felt a change in their emotional state after playing Four Seasons (statement 12 in Figure 4). Previous research has shown emotional engagement to be an important part of user experience in immersive VR [45].
The interview results showed that wearing the VR equipment (HTC VIVE) caused discomfort in some participants. In a previous study, VR technology was used to perform a specific function to improve a user's skill training [46]. The results of that study showed that users tend to be willing to put on complex and uncomfortable wearable gear when the VR content is meaningful to them and presented attractively. This result has also been observed in user acceptance models for information technology [47].

On Controlling VR Experience with Gestures
The game industry has adopted VR technology extensively and rapidly because players are familiar with interaction controls in virtual worlds. Console game controllers simulate real-world gestures and are similar to handheld VR controllers. However, market analysis has identified device compatibility limitations as a challenge that poses a barrier to the industry's growth [48]: different VR hardware developers have designed their controllers with varying numbers of buttons and other control methods. Although some of these methods are similar, these differences necessitate onboarding when the user begins using a new controller device. If the user has never used a VR controller, the onboarding time will increase. In contrast, basic hand gestures, as our results show, promote feelings of freedom and immersion while also making the interaction convenient for the user. This is possible when the gesture-tracking technology works well and the user interface is well designed. Some participants in our evaluation noted issues with gesture tracking (gesture synchronization). Several tracking issues of Leap Motion have been identified in other studies, including losing hand tracking [49], confusing hands and fingers [49,50], and synchronization errors such as latency and misplaced hands [49,50]. To mitigate these issues in gesture-based VR content that requires robust tracking, we recommend using HMDs with accurate built-in tracking or VR gloves, some of which can also provide haptic feedback to increase immersion [37]. The lack of sensory feedback (statement 9 in Figure 5) is a possible reason why the participants of our study found the gesture-based interface not to resemble real-world interaction (statement 1 in Figure 5).
Gesture-based VR experiences can be broadly divided into room-scale, seated, and standing VR experiences. When the prerequisites for these three VR setups are compared in terms of using hand gestures, room-scale and standing VR allow more freedom in movement and viewing angles, so the user's hands must be systematically restricted from moving out of the tracker's sight. Although seated VR has more disadvantages than room-scale and standing VR [51], it has an advantage for gesture-based interaction in that the user's posture is mostly predictable and the positions of the hands are expected to remain in front of the user, thus making it the safest way to experience a hand-gesture interface. However, the physical constraints of seated VR experiences can limit the freedom of movement in a gesture-based UI.
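As a concrete illustration of such a restriction, an application could monitor the angle between the user's gaze direction and each hand and warn the user before a hand leaves the tracking sensor's field of view. The following sketch is our own illustration, assuming a simple symmetric field of view and no particular VR SDK:

```python
import math

def angle_from_gaze(hand_pos, head_pos, gaze_dir):
    """Angle (degrees) between the gaze direction and the head-to-hand vector."""
    v = [h - p for h, p in zip(hand_pos, head_pos)]
    v_norm = math.sqrt(sum(c * c for c in v)) or 1e-9
    g_norm = math.sqrt(sum(c * c for c in gaze_dir)) or 1e-9
    dot = sum(a * b for a, b in zip(v, gaze_dir))
    cos_a = max(-1.0, min(1.0, dot / (v_norm * g_norm)))
    return math.degrees(math.acos(cos_a))

def hand_visibility(hand_pos, head_pos, gaze_dir, fov_deg=100.0, margin_deg=10.0):
    """Classify a hand as 'visible', 'warning' (near the FOV edge), or 'lost'.

    fov_deg and margin_deg are illustrative values; real sensors have
    asymmetric tracking volumes that should be measured per device.
    """
    a = angle_from_gaze(hand_pos, head_pos, gaze_dir)
    half_fov = fov_deg / 2.0
    if a <= half_fov - margin_deg:
        return "visible"
    if a <= half_fov:
        return "warning"
    return "lost"
```

When the state changes to "warning", the application could show a subtle visual cue (e.g., fading the virtual hand) so the user moves the hand back before tracking is actually lost.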

On Subjectivity and Personalization
The evaluation results show that some features of Four Seasons triggered both positive and negative experiences among the participants. For example, the interactive insects, the VR hardware used, and gesture synchronization received both positive and negative responses:
• Interactive insects:
a. Negative experience: "In the Autumn scene, if this content had been made more like the real world, I would have been able to kill the dragonflies."
b. Positive experience: "I feel realism when insects are on my hand."
• VR hardware:
a. Negative experience: "I can't adapt to this easily because this way is not popular yet."
b. Positive experience: "...this way of control doesn't need an extra learning process as I can play with bare hands right away, so it is convenient."
• Gesture synchronization:
a. Negative experience: "My hand movement is slower in VR than actual movement."
b. Positive experience: "My hand movement is expressed minutely, and it makes me feel free."
Potential reasons for these discrepancies include different preferences and experience levels of the participants, as well as irregular problems in the system (e.g., high motion-to-photon latency) that only affected some users. The subjectivity of the results is aligned with the findings of Schuemie et al. [10], who suggested in their review of presence factors that presence can be improved when a VR experience considers users' characteristics. Thus, for example, in the case of the interactive insects, Four Seasons could be upgraded to allow the user to choose what kind of interaction is possible depending on their preferences. However, personalizing a VR experience for diverse user types requires significant effort. To alleviate this, pre-constructed interaction profiles based on an existing user model could be used, such as Bartle's player types [52] in the case of VR games. Users would then be categorized into one of the types using a simple questionnaire, with an option to later change the type manually or automatically [53].
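The profile-based personalization described above could be sketched as follows. The profile names, questionnaire items, and interaction settings below are hypothetical illustrations of the idea, not features of Four Seasons or items from Bartle's original typology:

```python
# Hypothetical pre-constructed interaction profiles. Each profile toggles
# optional interactions, e.g., whether insects may land on the user's hand.
PROFILES = {
    "explorer": {"insects_land_on_hand": True, "fast_animal_reactions": False},
    "achiever": {"insects_land_on_hand": False, "fast_animal_reactions": True},
}

# Each questionnaire item (rated 1-5 for agreement) votes for one profile.
QUESTIONS = [
    ("I enjoy calmly observing the details of a virtual world.", "explorer"),
    ("I want virtual creatures to react quickly to my gestures.", "achiever"),
]

def assign_profile(answers):
    """Map a list of 1-5 ratings (aligned with QUESTIONS) to a profile.

    Returns the winning profile name and its interaction settings; the user
    could later override the assignment manually or automatically [53].
    """
    scores = {name: 0 for name in PROFILES}
    for rating, (_, profile) in zip(answers, QUESTIONS):
        scores[profile] += rating
    best = max(scores, key=scores.get)
    return best, PROFILES[best]
```

For instance, a user who strongly agrees with the first item and mildly with the second would be assigned the "explorer" profile and would keep the insect-landing interaction enabled.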

On Realism
Realism was one of the potential presence factors identified in the evaluation, as many participants praised the realism of Four Seasons in the interviews. However, the lack of tactile feedback may have reduced the realism of the gesture-based interaction (Figure 5). These results align with previous research that considers realism to be a key component of presence [10,28,29]. Although achieving high realism is difficult due to the constraints of technology, the level of realism should be such that the user can accept the virtual world as an alternate reality that they can feel. However, people may have different expectations of realism, depending on their previous VR experience and willingness to suspend their disbelief. Designers must, therefore, consider the level of quality that is possible with currently available VR hardware and make decisions on matters such as (i) the detail level of the virtual assets, (ii) the prioritization of asset quality (e.g., interactable assets and virtual characters have higher quality than props and non-interactive environmental assets), and (iii) the time and labor required to create and optimize the assets.
Research has shown that realism in VR is a multifaceted phenomenon that covers dimensions such as pictorial/visual, social, and perceived realism [10,54]. While much of the realism-related presence research focuses on visual realism, perceived realism has recently received attention from researchers, as it considers users' subjective perceptions of realism. According to Weber et al. [54], perceived realism comprises the evaluation of the subjective degree of reality that varies between users and the plausibility and credibility of the virtual environment. Using this definition, some of the factors identified in this study could be associated with perceived realism, which relates to the subjectivity discussion in the previous section.

On Classifying Presence Factors
VR experience designers and developers can utilize the proposed presence factor classification method in several ways. First, it can be used as a tool for designing new VR experiences: from early design iterations onward, the designer can strive to maximize the presence factors in different groups. For example, they can ensure that the hardware quality is adequate for enabling the desired level of realism and that the overall content and technical designs diminish the aspects known to contribute to cybersickness, such as tracking errors, motion-to-photon latency, and screen flickering [55]. Second, the proposed classification method can be used as a tool to prepare the physical environment of the user. Eliminating bright and flickering light sources, removing potential sources of noise, and moving unnecessary physical objects further away are some examples of this approach. Third, the classification can be used as a tool to guide evaluations of VR experiences: the more potential presence factors the experience has, the more likely it is to support presence among its users. The results of an evaluation can inform the development process to improve presence in future versions.
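As an illustration, such a classification could be encoded as a simple lookup table for use in evaluation checklists. The quadrant assignments below are our own illustrative assumptions, not the published classification of the factors:

```python
from collections import defaultdict

# Example mapping of presence factors to (internal/external, dynamic/static)
# quadrants. These assignments are hypothetical and for illustration only.
FACTOR_QUADRANTS = {
    "immersion": ("internal", "dynamic"),
    "gesture synchronization": ("external", "dynamic"),
    "aesthetic design": ("external", "static"),
    "comfortable form factor": ("external", "static"),
}

def evaluate(supported_factors):
    """Group the factors an experience supports by quadrant.

    Sparse quadrants in the result point evaluators to areas where the
    experience offers weak support for presence.
    """
    coverage = defaultdict(list)
    for factor in supported_factors:
        quadrant = FACTOR_QUADRANTS.get(factor)
        if quadrant:
            coverage[quadrant].append(factor)
    return dict(coverage)
```

An evaluator could then see at a glance, for example, that an experience supports several static external factors but no dynamic ones, and prioritize improvements accordingly.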
Many more presence factors may have been discussed in other VR literature that we did not cover in this study because our literature review was not comprehensive. Moreover, as we focused on examining Four Seasons with a gesture-based UI, other potential presence factors may exist in other VR experiences. For example, the perceived relevance of the VR experience could be identified as a potential presence factor if Four Seasons were evaluated with older adults who might have lived in such a countryside location in their childhood. Moreover, although the proposed classification emerged from the context of a gesture-based UI in VR, the internal-external and dynamic-static dimensions are conceptually generic. Similarly, many of the identified potential presence factors may be present in other VR experiences with other UI methods.

On Technology Advancement
Immersive VR technology is under constant change, with new hardware and software being released at a rapid pace. The technology employed in Four Seasons can be considered obsolete in 2024, as the latest HMDs, such as Varjo XR-4, Meta Quest 3, Apple Vision Pro, and HTC VIVE XR Elite, offer built-in high-precision hand tracking, high-resolution displays, and other sensors to track the user's gestures and actions. We expect that future immersive experiences will happen in mixed reality with various degrees of overlay between the real world and the virtual world. At the same time, artificial intelligence (AI) technologies, such as voice cloning, natural language processing, and generative AI models, enable natural interaction with AI-driven virtual characters.
The objective of this study was to investigate the user experience of Four Seasons and, thereby, identify potential presence factors that can facilitate the design of future immersive VR experiences. Despite the use of old technology, or perhaps due to it, we identified 23 potential presence factors that can inform the design of immersive VR experiences with recent technology. For example, tracking errors of Leap Motion helped us to identify presence factors related to hand gesture synchronization, hand animation simulation, and natural hand movement; although such errors are not likely to occur as often with the latest hardware, it is important to acknowledge their effect on the user experience. Moreover, potential presence factors related to the content (e.g., aesthetics, realism of the environment, interactions with the animals) are relevant to any immersive VR hardware configuration and especially important for experiences presented via standalone HMDs that have limited rendering capacity.

Comparison to Previous Work
To the best of our knowledge, this study is the first attempt to analyze the user experience of an immersive VR experience based on a gesture-based UI to identify potential presence factors. The results of our study complement the presence factors identified in previous studies (see Table 2). However, these studies did not specifically focus on immersive VR with a gesture-based UI.
With the recent proliferation of studies related to the metaverse, body gestures enabled by full-body tracking [56] have become an essential part of avatar-driven human-to-human interaction in immersive virtual environments [57]. In addition to synchronizing the full-body movement of the user with that of their avatar, haptic systems have been developed to offer touch-based interaction [58,59]. For example, Kim et al. [59] proposed a haptic hand-tracking system that accurately tracks the user's hands and provides sensations of vibration and heat to the fingertips. Four Seasons is a single-user immersive VR experience that does not use haptic devices, but we expect that experiencing the natural environment together with friends while receiving haptic feedback can enhance the sensations of immersion and presence [59].
The emotional aspects of immersive VR experiences have recently piqued the interest of researchers [11,60-62], some of whom have found linkages between emotions and presence [11,61,62]. Although our study did not specifically focus on the emotional aspects of immersive VR, unlike these previous studies, our results indicated emotional responses, with the participants expressing affective phrases such as "it is fun", "I feel good", "it's amazing", "it is so beautiful", and "it is wonderful". However, more research is needed to understand the connections between gesture-based UI and emotional engagement.
The gorilla-arm effect is a known adverse effect of gesture-based UIs using mid-air hand gestures [63]. It refers to the feelings of fatigue and discomfort caused by prolonged use of mid-air hand gestures. None of the participants in our study reported or showed signs of the gorilla-arm effect in our experiment. However, it is important to consider the potential effects of fatigue on the design of immersive VR experiences by choosing gestures that are less likely to cause fatigue [63], keeping the gesture-based interaction time relatively short, and allowing the user to rest their hands at will.

Limitations
We acknowledge several limitations in this study. First, the experiment was conducted with a single VR application on old hardware (HTC VIVE with Leap Motion), thus limiting the generalizability of the results to a certain type of VR experience. Moreover, Leap Motion is known to have tracking issues [49,50] that also affected some participants' user experience. Further studies are required to test whether the same potential presence factors can be identified in other gesture-based VR systems and state-of-the-art immersive VR hardware with built-in hand tracking. Second, the identified potential presence factors were based on the subjective opinions of the participants, which were sometimes conflicting. Although this relates to presence as a subjective psychological experience, it makes formulating a consensus challenging. Third, the experiment was conducted using a fairly homogeneous group of participants. Outcomes may be different for participants in other age groups and those coming from different cultural backgrounds. Fourth, this study used data collection instruments that were not based on existing well-known scales but were crafted for the purpose of this study. This limits the comparability of the results. Fifth, although most participants (76.2%) had previous VR experience, this variable was not controlled in the experiment, which may have influenced the results. Finally, the identified presence factors are potential; thus, comparative validation studies with a control group and an experiment group are required to determine whether the presence or absence of each identified factor affects presence in different immersive VR experiences.

Future Work
The findings of this study pave the way for several avenues of future research. First, a deeper literature review and more experiments are needed to complement the proposed classification method with latent presence factors, as well as to validate potential presence factors. Moreover, we suggest that some presence factors, such as immersion and aesthetic design, could be split into more concrete presence factors that can be better utilized in the design and implementation processes. A comprehensive, multi-level taxonomy of presence factors could be one way to achieve this in a future study.
Second, inspired by the mixed results on some of the presence factors in our study, we encourage further research on exploring the manners in which the personal characteristics of the user and other contextual factors affect the experienced presence. This is a challenge for designers of immersive VR experiences, as the design must be personalized to meet the needs and preferences of diverse users. Further research on this could lead to concrete recommendations, frameworks, and tools for personalizing VR experiences to better facilitate presence across diverse individuals.
Third, we hypothesize that users will be willing to forgive minor inconveniences and issues if a VR experience is otherwise of high quality and the positive effects of the experience outweigh the negative aspects. Future research is needed to test this hypothesis, identify various types of inconveniences, and investigate their effects on user experience, including presence.
Fourth, although this study did not focus on exploring the aspects of emotional engagement in VR experiences, the user study results demonstrated emotional engagement among some of the participants. To the best of our knowledge, emotional engagement in immersive VR using gesture-based interaction is a largely unexplored area. Potential investigations include understanding the difference between VR experiences with and without gesture-based interaction in terms of emotional engagement, investigating which emotional responses are associated with gesture-based interaction, and exploring how gesture-based interaction affects emotional engagement in multi-user scenarios, such as in metaverse environments.
Fifth, as our study only investigated Four Seasons as a manifestation of the gesture-based interaction paradigm, the potential presence factors and other findings should be validated in a future comparative study that would consider different types of gesture-based UIs, as well as different variations in diegetic [64] and non-diegetic interfaces, to see their effects on presence and what kinds of presence factors could be associated with each one.

Conclusions
This study aimed to find answers to two research questions in the context of the Four Seasons VR experience, which uses a gesture-based UI. The mixed-method user experience study involved 21 Korean adult participants who experienced Four Seasons. The first research question-"How do users perceive the gesture-based UI of Four Seasons?"-was answered by analyzing the questionnaire answers of the participants. The results showed a high mean rating (4.71/5) for the statements measuring presence, thus indicating high overall presence when using Four Seasons with its gesture-based UI. The specific statements related to the gesture-based UI garnered a mean rating of 4.21/5, with the most significant identified issues being related to the realism of gesture-based interaction and the lack of sensory stimulation.
The second research question-"What potential presence factors can be associated with Four Seasons?"-was answered by an iterative analysis of the participants' interview answers to identify factors that may affect presence in Four Seasons. As a result, we found 23 potential presence factors in three thematic groups: user experience, content design, and technology. Some of these factors are related to the gesture-based UI, whereas others are related to the overall immersive VR experience. The identified potential presence factors confirm and complement previously discovered factors, thereby highlighting what constitutes the sense of presence in VR, particularly when using a gesture-based UI. As the number of potential factors is substantial and likely to grow as a result of future studies, we proposed a classification system to keep them organized. It is noteworthy that the identified presence factors are potential, and, therefore, further research is required to validate them.
The design principles that we devised based on the potential presence factors can be utilized for designing immersive VR experiences that leverage a gesture-based UI. Moreover, these design principles can be of use when conducting heuristic evaluations of existing immersive VR experiences, as we have demonstrated by heuristically evaluating Pillow, a third-party VR/MR application with a gesture-based UI. Finally, the formulated design principles can be expanded by deriving more design principles for the presence factors found in previous and future studies.
The results of this study can be useful for designers and researchers seeking to enable, evaluate, improve, or research the sense of presence in immersive VR applications that utilize a gesture-based UI. The identified potential presence factors and the proposed classification system can be used in the design process to include elements that help to elicit these factors. Similarly, they can be used in the evaluation of existing VR applications to estimate the extent to which an application can promote presence.
To address the limitations of this study presented in Section 5.8, we propose future comparative studies that validate the identified potential presence factors. Another avenue of future research is to explore presence factors with different types of VR experiences (with or without a gesture-based UI) with different hardware configurations (e.g., HMDs and gesture tracking solutions) aimed at different audiences (e.g., age groups, ethnic groups, and professions). These studies could collect not only subjective data but also objective data on user experience, presence, and interaction quality, for example, through the tracking of physiological, psychological, and social responses, which could help to validate the results of this study. With diverse multimodal data, more potential presence factors are likely to emerge. Consequently, the proposed classification may need to be amended, for example, by a taxonomy that organizes the factors into multiple levels and by properties of content.

Immersion
Both the VR and mixed reality modes feel highly immersive given the polished graphics and rich soundscape, especially when running the content in tethered mode. The mixed reality interface is particularly good when played in a semi-dark room.
Natural hand movement
Due to the accurate hand tracking of Meta Quest Pro, the virtual hands move naturally with almost no motion-to-photon latency.

Realistic hand simulation
The virtual hands are white and, therefore, not perfectly realistic.However, they replicate the movements of hands and fingers well.

Aesthetic design
Pillow provides limited depictions of real-world environments with medium-quality graphics. However, its interface is clean and aesthetically pleasing.

Detailed environment design
There is no rich virtual environment. The design of the environments lacks detail and graphical fidelity.

Narration sound
The narration is performed by a clear, calm voice, which suits the application well.

Realistic soundscape
The sound design is immersive and appealing.The ambient music is calm and dream-like.

Realistic scale and distance
The scale and distances of objects felt realistic.There were no issues with reaching for interactable objects to touch them.

Flawless technical implementation
We could not identify any technical errors or flaws during the evaluation.

Gesture synchronization
The environment and objects respond to the gestures in an anticipatable manner. This is facilitated by the clear guidance in the user interface.

Feeling of touch
There are no haptic devices to provide tactile feedback when playing in the hand tracking mode.

Natural environment simulation
The virtual environments (e.g., the pond in the Fisherman minigame) do not look natural, as the design is simplified and dream-like. However, the environments respond to actions in a natural way.

Interaction with virtual objects
The user can interact with several objects, such as the sheep's nightcap (to start the experience), constellations in the Stargazer minigame, and fishes in the pond of the Fisherman minigame. The objects' responses were accurate and predictable.

Examination of the content
It is not possible to examine the virtual environment and its objects more closely, except when holding an object in your hand (e.g., the nightcap or a fish). However, the objects are of low graphical detail, thus limiting the possibilities of examination.

Unobstructed physical environment
Pillow is meant to be played in bed, so there are typically no physical obstructions in front of the user. Moreover, the user does not use locomotion while playing.

Comfortable form factor
The Meta Quest Pro HMD feels comfortable when worn in bed.

Figure 1 .
Figure 1. Overview of this study.

Figure 2 .
Figure 2. The setup for gesture-based interaction using HTC VIVE and Leap Motion. Users interact with fireflies (top) and a butterfly (bottom).

Touch:
like in Mode Selection. Swipe: like in Flowers (the movement of the flowers is determined based on the hand position and velocity during the swipe gesture). Palm up: checking the orientation of the hand in world space.

Figure 4 .
Figure 4. Means and standard deviations of the statements related to presence. The scale ranges from strongly disagree (1) to strongly agree (5).

Figure 5 .
Figure 5. Means and standard deviations of the statements related to the gesture-based UI in Four Seasons. The scale ranges from strongly disagree (1) to strongly agree (5).

Figure 6 .
Figure 6. The classification of the potential presence factors identified in this study.

Figure 7 .
Figure 7. Sample screens from the Pillow immersive VR and mixed reality experience: (A) the Stargazer, (B) the Fisherman, (C) the Meditator, and (D) the Storyteller.

Table 2 .
Presence factors proposed by previous studies.

Table 3 .
The hardware and software used to create Four Seasons.

Table 4 .
Interaction elements of Four Seasons using gesture-based interaction.

Table 5 .
The demographic distribution and previous VR experience of the participants who finished the experiment.

Table 6 .
Identified potential presence factors related to user experience.

Table 8 .
Identified potential presence factors related to technology.