Evaluating the Effect of Multi-Sensory Stimulation on Startle Response Using the Virtual Reality Locomotion Interface MS.TPAWT

The purpose of the study was to understand how various aspects of virtual reality and extended reality, specifically, environmental displays (e.g., wind, heat, smell, and moisture), audio, and graphics, can be exploited to cause a good startle, or to prevent one. The TreadPort Active Wind Tunnel (TPAWT) was modified to include several haptic environmental displays: heat, wind, olfactory, and mist, resulting in the Multi-Sensory TreadPort Active Wind Tunnel (MS.TPAWT). In total, 120 participants played a VR game that contained three startling situations. Audio and environmental effects were varied in a two-way analysis of variance (ANOVA) study. Muscle activity levels of the participants' orbicularis oculi, sternocleidomastoid, and trapezius were measured using electromyography (EMG). Participants then answered surveys on their perceived levels of startle for each situation. We show that adjusting audio and environmental levels can alter participants' physiological and psychological responses to the virtual world. Notably, audio is key for eliciting stronger responses and perceptions of the startling experiences, but environmental displays can be used to either amplify those responses or to diminish them. The results also highlight that traditional eye muscle response measurements of startles may not be valid for measuring startle responses to strong environmental displays, suggesting that alternate muscle groups should be used. The study’s implications, in practice, will allow designers to control the participants' response by adjusting


Introduction
What startles you? The startle response is an involuntary muscular reaction that protects oneself from an unexpected stimulus. Startles are common and desirable in horror movies, video games, and psychology research, but generally undesirable in the VR-based teleoperation of robots and remote systems. Although it is well established that higher audio levels cause greater startle responses [1] and that adding more haptic feedback devices for multi-stimulation in VR generally corresponds to higher realism [2,3], it remains unknown how the environment (e.g., wind, heat, smell, and moisture), combined with graphics and audio, affects a startling situation. This knowledge is important as researchers develop new VR worlds. Hence, the goal of this study was to determine how various aspects of VR and extended reality (XR) can be exploited to cause a good startle, or to prevent one.
To address this research, a multi-sensory virtual reality system was created. The goal was to create a system capable of providing natural full-body haptic stimulation mimicking natural environmental effects without interfering with the graphical display of the virtual world. We also wanted to allow users to locomote through the virtual world while experiencing environmental display, which would allow the results of this study to be extended to creating realistic training for first responders, gait therapy sessions for people with Parkinson's disease [4], strokes, or spinal cord injury [5].
VR headsets and walking in place are one option; however, we elected to modify an existing system to provide this capability. The result is the Multi-Sensory TreadPort Active Wind Tunnel (MS.TPAWT, pronounced Ms. Teapot), a modified version of the TreadPort Active Wind Tunnel (TPAWT). The TreadPort was originally a VR environment that consisted of a large treadmill locomotion interface with a CAVE display [6], which lets users walk around and explore a VR world. A wind display was added via a large wind tunnel built around the TreadPort to create steerable wind that appeared to come from the graphical displays [7]. The TPAWT was modified for this study to include several haptic environmental displays: heat, olfactory, and mist. An overview of the system is presented in Figure 1. Combined, the system created an extended reality (XR) experience that allowed a user to walk through a virtual world while physically experiencing haptic displays portraying environmental aspects of the VR world. This system also allowed us to investigate the effects of environment on warnings provided to users wearing protective robots during physical activity [8], which enabled safe controlled experimental conditions for such experiments, one of the true benefits of virtual reality systems.
We designed three startles (i.e., bird, beam, and thunder) with different levels of visual, audio, and environmental feedback aimed to elicit a variety of psychological and physiological responses. Some of the startles were surprising and others offered premonition. Some startles were small and discrete, whereas others were large. This range of startling events enabled evaluation of how startling events can be designed to elicit a strong startle or to negate the effects of a startle. The former is important for developing VR horror games, psychology studies, and gait therapy, whereas the latter is important for developing VR-based interfaces for teleoperation.
To better understand how environmental stimuli, graphics (i.e., size of startling event), and audio affected responses to the startles, the audio level and environmental display were varied as the participants experienced the different startles. Graphic levels were naturally varied because each startle was visually different. Qualitative startle responses were measured using a survey, and quantitative startle responses were determined by measuring the activation of several muscles associated with the different startles. Video footage provided an evaluation of participant reactions.
The results highlight both expected and unexpected findings. As expected, audio was usually able to elicit a strong startle response, but we found that enabling or disabling the environmental display could have significant impact on the physiological responses and psychological perception of the startles. Likewise, the findings indicate that physiological and psychological responses do not necessarily agree. In some cases, muscle responses can be used to indicate a startle, whereas in others, particularly the eye, they are in reaction to the environmental display itself; however, this depends on the magnitude of the environmental display. Additionally, we found that visually larger startles elicited stronger responses both psychologically and physiologically than smaller startles. When comparing video response to muscle activation, we found that the eye muscle is not suitable for understanding the effects of environmental displays. This is generally because the added effects of wind and moisture directed to the face cause extra activation. This suggests that future studies should use other muscle groups when evaluating environmental displays. Lastly, we found that the environment was vital towards creating a premonition effect; in our case, even more so than graphics.
As a result, this paper presents several contributions. It presents a comprehensive XR system, MS.TPAWT, designed to simulate outdoor virtual environments, which is also useful for training scenarios and therapy. An XR game is presented with several startling events that are shown to be capable of eliciting a range of startle responses. The subject study demonstrates how multi-sensory display can be leveraged to create or cancel the effects of startling situations. We show that adjusting audio and environmental levels can alter participants' physiological and psychological responses to the virtual world. We show that the environment can be used to increase or reduce startle responses depending on how the virtual world is designed and how classical audio and graphical startle techniques are affected by environmental displays. The results also suggest that eye muscle measurements may not be valid for measuring startle response when environmental displays are enabled because strong displays may cause squinting even when startle responses are suppressed. These findings should be useful for designers of virtual worlds and haptic systems aiming to create a startle or to prevent one.

Startles, Fear, and Premonition
Startles are widely used in psychology as a physiological means of measuring psychological characteristics [9][10][11][12][13]. The review by Blumenthal et al. [14] provides an excellent overview of the established techniques for eliciting a startle response, as well as methods used for measuring the response. Eye twitches are one of the oldest metrics used in psychology research. EMG measurements of eye muscle activation are typically used to measure eye muscle activity as an indicator of a response to stimuli [14], although eyeblinks can be measured by cameras or mechanical devices [15], yielding similar results. In this research, we used EMG measurements of muscle activity near the eye, back, and neck, because the different startles were expected to activate muscles in those areas based upon the participant's whole-body physical response (e.g., jumping, looking up, and blinking) due to the startles. Our results suggest, however, that the classical eye squint response measurements may not be valid for measuring response to strong environmental displays because they naturally induce squinting.
In psychology research, startle responses are elicited using loud audio, electrical stimulation, magnetic stimulation, light flashes, or mechanical stimulation [14]. One can imagine that some of these techniques may be more desirable for VR. Audio stimulation, the most common technique, consists of a short burst of white noise (~50 ms) played at high sound levels (~100 dBA) to ensure that a startle response is elicited, although research indicates that audio levels as low as 50 dBA can elicit a startle response [16]. Audio is a natural part of a VR experience, but bursts of white noise are not. Hence, researchers must decide if their goal is to elicit a clinically accepted startle response, or if the goal is to evaluate the effect of different startle audio. Clinically accepted white noise audio bursts have been used in VR research to evaluate the effects of conditioning [17], although some do use natural sounds from the environment, such as the sounds of explosions and sirens in VR training simulators [18]. Our results highlight that enabling environmental display appropriately can actually magnify or diminish the startle effect of loud audio.
To drown out background noise (e.g., machinery), noise-cancelling headphones have been used to mask the sounds of apparatus [19]. The role of artificial background noise is debated because background noises in the 65-75 dBA range have been shown to reduce startle responses (i.e., pre-pulse inhibition) while masking the sound of apparatus [20]. In contrast, louder background levels, such as 75 dBA or 85 dBA, have been shown to increase startle responses [21]. In this study, we used noise-cancelling headphones to reduce sounds from the apparatus and then used background sounds that would naturally occur in the virtual world; startle sounds were also natural, corresponding to the event that the user experienced.
Visual stimulation, such as bright lights with increasing intensity [22] or rapidly approaching threatening objects [23], is also used by psychologists to elicit startle reflexes. Psychologists sometimes use still images to evoke fear [24], but videos have been shown to be more effective for generating fear than static images [25]. Similarly, light and darkness have been shown to elicit stress in participants akin to real life as they drive cars through tunnels in VR [26]. VR simulations for emergency responders use graphical representations of a car exploding coupled with the sound of an explosion, instead of bursts of noise [18]. Startles consisting of loud white noise were used in [27] as a user travelled through different parts of a virtual world. Scary games are used with surprising events, such as falling furniture or ghouls that appear suddenly [28], which are also combined with startle audio characteristic of those phenomena. Our virtual world was most similar to the former in terms of visual startle variety, but our phenomena were typical of real life; we employed a small bird that suddenly fluttered across the display, a beam that fell from the ceiling of a barn, and a large flash of light due to lightning, all of which included audio that was appropriate to each startle.
A variety of other startle stimulation techniques have been used. Electrical stimulation is achieved by the application of an electrical potential above the threshold of detection and below the threshold of pain [29]. Magnetic stimulation can also be used to elicit eyeblink responses [30]. Air stimulation can elicit startle responses using brief puffs of low-pressure air [31,32], typically applied to the forehead or temporal regions of the head [33]. Mechanical stimulation, such as tapping or ballistic impacts, has also been used [34]. The study presented here used wind as an environmental display, but this is quite different compared with the air pulses used to generate startles in the abovementioned studies. When enabled, wind was used as a steady haptic display until the thunder startle, whereupon wind was increased as a premonition of an impending storm. Others have used haptic displays, such as vibrating interfaces, to elevate the heartrate in VR, but this does not result in a startle response [35].
The usage of startles in VR research is often focused on psychology. Studies have used startle responses in VR to study phobia reactions [17], the motivational mechanisms of cravings [36], and to improve the effectiveness of post-traumatic stress disorder treatments [37]. These types of studies measure startle responses to evaluate the efficacy of their therapies. Studies using VR to study startle responses have found that users who were startled while involved in a complex task exhibited smaller startle responses [18], and that social anxiety is associated with a greater startle response [38]. Other studies have also used startle responses in VR to study the effects of extinction learning [27], effectiveness of certain trauma treatment methods [28], and conditioning for phobias [39,40]. In contrast, startles [41] and audio warnings [42] have also been examined to alert a user to an impending impact so that a tensed reaction can be used to better protect themselves; although these were not focused on VR, the latter used a treadmill interface similar to the one used in this research in order to deliver warnings at precise instances during a running gait. One of the long-term goals of this research is to augment that work to evaluate the effectiveness of different audio and visual warnings in controlled realistic VR environments as users wear protective gear such as smart helmets [8]. Finally, it is worth noting that not all VR environments promote startles or rely on fears; work by Noronhona used a natural outdoor VR world, as we do, but they were focused on promoting calmness via peaceful graphical presentations and soothing sounds as a means of therapy [43]. Our VR world focused on pleasing natural environments, such as a walk along a river, through the forest, and in the mountains; however, we did not focus on promoting peacefulness, which could be a focus of future research given the advanced haptic displays presented here.
Premonitions created by VR were also a feature of this study. Specifically, the thunder startle was designed to feature graphics and environmental displays which cued the participant that a startle was about to occur. This is typical of extinction learning [44], where repeated exposure to a stimulus is used to reduce the effect of the stimulus. In this case, we relied upon associations of the stimuli with the thunder startle to diminish its impact. Previous VR studies have used imagery of spiders as the unconditioned stimulus [39] and colored light as the conditioned stimulus reducing the fear response. Images of fierce dogs and falling walls have also been used as unconditioned stimuli [24]. The thunder stimulus is fundamental to child development to the point that it is known to have become "extinct through awareness" rather than requiring any specific conditioning [45], which is why it was selected for this study. We have yet to find any papers that deal with environmental display or graphical display in VR as cues (e.g., the conditioned stimuli) for thunder startles. Hence, we believe that this paper highlights advancements regarding the ability of graphical and environmental displays in virtual worlds to leverage these natural premonitions engrained in humans from childhood.
It is important to note that premonition is different from exposure therapy, where a user is repeatedly exposed to a stimulus to reduce its effect, such as the treatment of acrophobia [40]. In fact, this study only presents each stimulus once. Likewise, psychological research focuses on pre-pulse inhibition for a startle, meaning that things that happen moments before the startle can diminish the magnitude of the startle. Pre-pulse tones, for example, can diminish the intensity of startle response to loud white-noise pulses [1], but as indicated by [46], tones that occur 15 to 400 milliseconds before the startle inhibit reactions, whereas longer pre-pulse periods (e.g., 2 s) become ineffective. This is in contrast to a fear-potentiated startle, such as the dark haunted house used in [28] to potentiate startles, which is typical of fear-potentiated VR research. In this study, however, we created a startling situation, a thunderstorm, and examined the effect of an environmental display (wind, mist, heat, and odor) and graphical display (darkening skies) several seconds before the startle to attenuate the startle response instead of potentiating it.

Simulator Technology
Development of computer graphics technology in the 1960s and 1970s led to the early development of VR simulators for training pilots [47]. As indicated by [48], contemporary simulators range from a computer screen and joystick to high-fidelity mockups with 6 DOF motion platforms and graphical displays to create realistic experiences [49]. Similar difficulties operating the simulations have been reported across the range of simulators; however, neurological activity is noted to be significantly increased with VR-based simulators. VR simulators provide an improved sense of presence, and many use haptic interactions with remote systems and simulations, leading to extended reality (XR), which is used in applications such as piloting remotely operated vehicles [50,51] and surgical robotics [52]. Haptic feedback is often an important part of this process so that the user can better perceive the conditions and interactions of a remote system or simulation. We have not found any simulators that evaluate the effect of environmental display on their training results, although it is common for simulators to make users respond to the effect of adverse environmental conditions, suggesting that environmental displays could add a new dimension to these training scenarios.
To enhance realism in VR, many studies have used sensory feedback systems such as olfactory, heat, and wind. Few, however, combine as many as MS.TPAWT. Examples of olfactory feedback systems include the Lotus system, which uses a directional mist system, and the VE VIREPSE, which uses a fan-based system [53,54]. Both systems differ from ours in that they focus solely on the effects of olfactory feedback.
Warmth has also been used across many types of VR systems to induce a greater feeling of presence. Examples include using several heating/cooling elements to enable a seated user to feel changes in temperature as they explore a space with their hand [55], or heat pads worn with a mobile headset [56]. Our system allows ambient heat to be felt by the user in a way realistic to sunlight without limiting locomotion of the user.
An example of a wind system is the WindCube [57], which can provide wind from several directions, but intrudes heavily on the virtual environment, and thus, the total immersion [58]. Simpler, less intrusive wind systems include the VR Scooter [59] and Sensorama [60], which both use fixed fans hidden from the user to simulate speed. Ambiotherm combines a thermal display with a wind display as an accessory for head-mounted displays [56]. These setups minimize intrusion on the VR experience, but do not allow for multidirectional wind or user movement. Our system created multidirectional wind from a device hidden from the user and allowed for user movement.
Head-mounted displays have become popular commodities for VR research and home applications due to their affordability and portability; however, CAVE display systems are still a common occurrence for research-related VR systems [4,61,62] and commercial locomotion systems [61]. Without locomotive input, they have comparable effectiveness in some scenarios to higher-end mobile displays such as the Samsung Gear VR [62]. CAVE displays combined with locomotion systems are very popular for locomotion studies and gait therapy, however [4,63]. The user can interact with their physical environment, for example, using the railings on a treadmill or via advanced haptic displays that render terrain features [4], slopes [64][65][66][67], or inertial forces [6,68], while also experiencing their virtual world unencumbered. Physical therapists can more easily interact with the users, helping to guide their therapy. Bodyweight support systems [5] allow for safety in applications such as gait therapy for spinal cord rehabilitation, or in simulations of reduced gravity. Tether-based systems also allow for perturbations to be applied during gait [43,69]. Such systems often provide sufficient space for perturbations and for motion capture systems that can be used for characterizing gait properties and controlling interactions with the user. CAVE displays also remove the added weight of a head-mounted display, which would perturb the user's kinematics, allowing the user to move more freely and naturally. Although MS.TPAWT provides all of the above features, only a subset were applied in this research. Future research would expand on the results from this paper to evaluate the effects of other haptic interactions and user experiences.
Most of the related studies described above use up to two haptic feedback devices. MS.TPAWT used four haptic feedback devices coupled with a locomotive input and visual/audio display. These effects are designed to be non-intrusive, allowing the user to freely interact with their environment. The combination of these systems enables the effective study of startling experiences which would be difficult or dangerous were they to be performed without VR.

System Description
MS.TPAWT is a VR system that couples graphical and locomotion interfaces with several haptic displays including wind, olfactory, moisture, heat, and audio, as shown in Figure 1. This section further elaborates on the displays and details the game design used in this study.

CAVE Display and Locomotion
The CAVE display presented a 180° view of the virtual world to the participant. There were four projections: left, right, front, and ground. The first three projections were angled 120° to the front, whereas the ground projection was projected on the white treadmill and its surrounding floorboards [7].
The TreadPort consisted of a treadmill-style locomotion interface and used a robotic tether to display inertial forces for realistic walking [70]. In addition to the tether, participants were attached to a fall arrest harness to prevent falling.

Environmental Display
The environmental display consisted of wind, olfactory, moisture, heat, and audio. Wind was generated with a Greenheck QEID 33 mixed flow fan [7], which was split along two channels located on the side of the system and exited from both sides of the CAVE display [7]. The airflow then followed the curvature of the screens before merging and being redirected towards the participant. In effect, the user felt wind as though it were coming straight from the screen. The angle of wind experienced by the user could be adjusted by changing the ratio of air flow between the two channels. For this study, the wind direction was fixed directly towards the participant.
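The flow-ratio idea described above can be illustrated with a minimal sketch. The function name `split_flow`, the `left_fraction` parameter, and the use of arbitrary flow units are assumptions made for illustration; the paper does not specify how the ratio maps to a wind angle, so the sketch only divides a total flow between the two channels and assumes, by symmetry, that an equal split corresponds to wind directed straight at the participant.

```python
def split_flow(total_flow: float, left_fraction: float = 0.5) -> tuple[float, float]:
    """Divide a total airflow between the left and right channels.

    `left_fraction` is a hypothetical control parameter in [0, 1].
    An equal split (0.5) is assumed to correspond to head-on wind,
    the configuration used in the study.
    """
    if total_flow < 0:
        raise ValueError("total_flow must be non-negative")
    if not 0.0 <= left_fraction <= 1.0:
        raise ValueError("left_fraction must lie in [0, 1]")
    left = total_flow * left_fraction
    right = total_flow - left
    return left, right


# Head-on wind: both channels carry the same flow (arbitrary units).
left, right = split_flow(10.0)
```

Shifting `left_fraction` away from 0.5 would bias the merged airflow toward one side; the actual angle produced would have to be calibrated on the physical system.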
Two scents were administered in this study: an ambient lavender scent and a rain scent used in situations when the participant encountered water. The VR environment consisted of forested areas and flowers; therefore, we used a lavender scent developed by P&J Trading. The oil was applied to a cloth and placed on the exit of the wind system which dispersed the scent. Air collection charcoal filters were installed at the back of MS.TPAWT, which prevented the participants from being exposed to other scents. For the rain scent, a Numatics 236-102B solenoid control valve released pressurized air which flowed over a rain-scented (P&J Trading) cotton ball and exited an SUF1 air atomizing nozzle (Spray Systems Co., Glendale Heights, IL, USA) in front of the user [71].
Moisture was created by spraying water particles into the wind just in front of the participant. The wind then carried the moisture directly in front of them. The system was identical to the olfactory system, except that it was attached to a water source.
To simulate sun exposure, heat was provided overhead by an RPH-208-A Infrared Heater (Fostoria). It was turned on when the user was in sunlight and turned off when the user was not. Lastly, we used Cowin E7 active noise-cancelling Bluetooth headphones to provide audio. These headphones can produce volumes higher than the sound levels used throughout the game, which were 95 dB SPL or lower.

VR Game
Participants experienced a series of startle events that were each designed to elicit a variety of responses. There were three startles: a bird flying in front of the participant (bird), a beam crashing down in front of the participant (beam), and a thunder strike that landed in front of them (thunder). Images for each startle are presented in Figures 2-4, respectively.
Participants experienced a series of startle events that were each designed to elicit a variety of responses. There were three startles: a bird flying in front of the participant (bird), a beam crashing down in front of the participant (beam), and a thunder strike that landed in front of them (thunder). Images for each startle are presented in Figures 2-4, respectively.
The bird startle was designed for participants to track a graphically subtle but fast object moving past them. In contrast, the beam startle was designed to be large and easily noticed. The thunder startle incorporated the environment as part of its design. Before the thunder strike, the heat was turned off, wind speed was increased, and mist and rain scents were activated. The thunder then struck in front of them, creating a loud noise.
At the start of the game, participants started on top of a hill near a brick house. They ventured downhill, following a dirt path, with occasional markers to indicate the way. After some time, they crossed through a shallow river, during which, if environmental effects were on, olfactory and moisture systems were activated. This experience took approximately 5 min and was intended for users to become immersed in the game.
After traversing through the river, participants crossed a bridge, leading to a farm. A few steps past the bridge, the first startle was activated and a blue bird flew from the bottom right of the display and out to the left. Participants continued their walk marked by the path indicators through the farmland for 2-3 min. Eventually, they entered a barn, triggering the second startle where the beam crashed in front of them, creating a loud sound. After exiting the barn and continuing along the path for an additional 2-3 min, the thunder startle was activated. The system then returned to normal and the participants walked for 1-2 min before the game was ended by the researcher.
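The environmental cues described in this section can be summarized as a small lookup from game events to actuator states. The following is only an illustrative sketch: the function name, the event labels, and the qualitative wind levels are hypothetical and are not taken from the MS.TPAWT control software. Only the on/off behavior stated in the text is encoded, and the river crossing is assumed to be sunlit.

```python
VALID_EVENTS = ("sunlit_path", "river", "thunder_cue")  # hypothetical labels


def environment_commands(event, env_enabled):
    """Return illustrative actuator states for a game event.

    The event names and the "baseline"/"increased" wind levels are
    assumptions made for this sketch, not values from the study.
    """
    if event not in VALID_EVENTS:
        raise ValueError(f"unknown event: {event!r}")

    # With the environmental display disabled, every actuator stays off.
    state = {"wind": "off", "heat": False, "mist": False, "rain_scent": False}
    if not env_enabled:
        return state

    if event == "sunlit_path":
        # Outdoors in sunlight: wind on, overhead heater on.
        state.update(wind="baseline", heat=True)
    elif event == "river":
        # River crossing: olfactory (rain scent) and moisture systems on.
        # Heat is assumed to stay on because the river is assumed sunlit.
        state.update(wind="baseline", heat=True, mist=True, rain_scent=True)
    else:  # "thunder_cue"
        # Premonition before the thunder: heat off, wind increased,
        # mist and rain scent activated.
        state.update(wind="increased", heat=False, mist=True, rain_scent=True)
    return state
```

Keeping the mapping in one function makes the difference between the environment-on and environment-off conditions explicit: the same event sequence is played in both, and only the actuator states change.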

Methods and Procedures
The experimental design is detailed in this section. We discuss the environment and startles, statistical design, measures, and participants. The study was completed with University of Utah IRB_00100544 approval.

Participants
This study enlisted 120 participants (77 male and 43 female) with a mean age of 23 (between 18 and 49 years old). USD 10 gift cards were given as compensation for their time.

Design
To understand how environmental and audio stimuli affect startle response, we varied the levels of each and analyzed participant responses with a two-way ANOVA. The environment was either on or off, and audio levels were varied between off, medium, and high. All configurations are presented in Table 1. Participants were randomly assigned to a configuration upon arrival. To understand the effects of graphics on participant response, we performed a one-way ANOVA across the startles, which had varying levels of graphics. Participants with environment were removed from this analysis to avoid any possible level of interaction.
To understand the effects of the environment on premonition, we compared the beam and thunder startles, because these both contained large visual elements but had varying levels of premonition. Two-way ANOVA was used to analyze the main effects of environment and premonition, as well as the interactions between them.
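As a minimal sketch of the 2 x 3 factorial layout described above, the six configurations can be enumerated and participants shuffled into them. The balanced allocation (equal numbers per cell) and the fixed seed are assumptions made for illustration; the text states only that assignment was random.

```python
import itertools
import random
from collections import Counter

ENVIRONMENT = ("Env Off", "Env On")
AUDIO = ("Aud Off", "Aud Med", "Aud High")


def configurations():
    """All environment x audio cells of the two-way design."""
    return list(itertools.product(ENVIRONMENT, AUDIO))


def assign(n_participants, seed=0):
    """Randomly assign participants to cells with equal cell sizes.

    Balanced allocation is an assumption of this sketch.
    """
    cells = configurations()
    if n_participants % len(cells) != 0:
        raise ValueError("n_participants must be a multiple of the cell count")
    pool = cells * (n_participants // len(cells))
    random.Random(seed).shuffle(pool)
    return pool  # pool[k] is the configuration of participant k


assignment = assign(120)
cell_counts = Counter(assignment)
```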

Measures
Two measures were used in the study: EMG and a survey. EMG was captured using a Trigno Avanti Platform. The sensors were placed on muscle groups associated with startle activation. These included the orbicularis oculi (eye), sternocleidomastoid (neck), and trapezius (back) muscles. The maximum voluntary contraction (MVC) was measured and used to normalize the EMG data. EMG data were passed through a 50 Hz high-pass filter, after which a moving RMS envelope with a window of 100 samples was calculated. EMG data affected by equipment malfunctions were removed from the analysis. Values falling outside of three standard deviations were also removed.
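The envelope, normalization, and outlier steps above can be sketched as follows. This is a hedged illustration rather than the authors' processing code: the trailing (rather than centered) window and the use of the sample standard deviation are assumptions, and the high-pass filtering step is omitted because the filter type, order, and sampling rate are not given here.

```python
import math
from statistics import mean, stdev


def moving_rms(signal, window=100):
    """Moving RMS envelope over a trailing window of `window` samples."""
    squares = [x * x for x in signal]
    envelope = []
    running = 0.0
    for i, sq in enumerate(squares):
        running += sq
        if i >= window:
            running -= squares[i - window]  # drop the sample leaving the window
        n = min(i + 1, window)              # shorter window at the start
        envelope.append(math.sqrt(max(running, 0.0) / n))
    return envelope


def normalize_to_mvc(envelope, mvc):
    """Express an envelope as a percentage of maximum voluntary contraction."""
    return [100.0 * v / mvc for v in envelope]


def drop_outliers(values, k=3.0):
    """Keep values within k sample standard deviations of the mean."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) <= k * s]
```

The running sum keeps the envelope computation linear in the number of samples, which matters for long walking trials.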
The survey assessed the user's perception of each startle. A standard 5-point visual analog scale with "strongly agree" and "strongly disagree" was used for all questions, with scores of "5" and "1" correlating to the two respective extremes. A score of "3" was neutral, but not labelled as such. The following questions were asked: Q1. When the bird flew in front of you did you feel startled? Q2. When the beam fell in front of you in the barn did you feel startled? Q3. When the lightning hit the ground did you feel startled?

Results
This section explains the roles of audio, environment, and graphics in perception and muscle activation in the different scenarios. For each scenario, muscle activation and participant perception were also compared. The results are presented in five subsections. The first three analyze the effects of environment and audio for the bird, beam, and thunder startles separately. The last two subsections analyze the effect of graphics and the effects of environment and premonition on participant responses. EMG and survey data were analyzed using two-way ANOVA. Follow-up tests were conducted using the Tukey-Kramer procedure. We used a standard significance level of p < 0.05 and considered notable effects to be in the range of 0.05 ≤ p < 0.09. EMG data were slightly skewed; therefore, a Box-Cox transform was applied before analyses [72]. Normality was verified using the Kolmogorov-Smirnov test, validating the use of ANOVA. Statistics are reported using the transformed data, whereas effect sizes are reported using the non-transformed EMG data. Because effect sizes expressed in units that are themselves percentages can be ambiguous, two metrics were used for muscle activation. The first was the %MVC increase, which was the true effect size between two groups. The other was the percentage increase between the two groups. For example, if group 1 had 10 %MVC and group 2 had 50 %MVC, we determined that there was a 40 %MVC increase and an overall 400% increase in muscle activation between the two groups.
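The two muscle-activation metrics can be written out explicitly. This is a minimal sketch that assumes the percentage increase is taken relative to the lower (baseline) group, which is consistent with, for example, the thunder survey change reported later (3.28 to 2.6, a 0.68-point decrease of roughly 21%). The function name and the input values are hypothetical.

```python
def effect_sizes(baseline, comparison):
    """Return (absolute increase, percentage increase relative to baseline).

    For EMG, both inputs are group means in %MVC, so the absolute
    increase is itself expressed in %MVC.
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    absolute = comparison - baseline
    relative = 100.0 * absolute / baseline
    return absolute, relative


# Hypothetical group means, chosen only for illustration.
abs_increase, pct_increase = effect_sizes(20.0, 30.0)
```

Reporting both numbers avoids the ambiguity of a bare "percent" when the underlying unit is already a percentage of MVC.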

Bird Startle
The bird startle was designed with a small visual focus and larger auditory focus. The startle took place in an outdoor setting with wind and heat activated when the environment was turned on. The bird was a small graphical object flittering across the screen, accompanied by the sound of wings fluttering when the startle audio was enabled (i.e., Aud Med and Aud High). During the bird startle, participants generally moved their neck to track the bird moving across the screen and closed their eyes when the bird flew close to them. The results are presented in Table 2 and Figure 5.
Audio had a significant effect on how startled participants felt, as per survey Q1. Between Aud Off and Aud High, participants responded with an average 1.88-point (115%) increase (t(119) = 7.32, p < 0.001), which was similar for Aud Med, with an average 1.58-point (93%) increase (t(119) = 5.76, p < 0.001). Both results indicate that the startle audio made perceptions of the event more startling. In contrast, the environment (on/off) had no statistically significant effect on their feelings towards the startle.
Likewise, audio had a significant effect on neck and eye activation. Between Aud Off and Aud High, there was an average 13.5% MVC (161%) increase in their neck (t(112) = 2.54, p = 0.033), and an 8.21% MVC (85%) increase in their eye (t(112) = 3.17, p = 0.006) activation. Many participants did not even notice the bird with Aud Off (i.e., background noise only). Bird startle audio was clearly important for eliciting a response.
There was an obvious pattern in the EMG data and survey data. As the confidence interval plots indicate, the general trend is that Aud Off resulted in the lowest % MVC and survey scores, regardless of the environment settings. Aud High generally resulted in significantly higher values when compared with Aud Off, whereas responses to Aud Med were generally somewhere in between. Again, this highlights the importance of audio with a small graphical object.

Beam Startle
The beam startle had an auditory and visual focus. The beam had a large graphical representation coupled with a loud crashing sound as the beam fell, which was only present with Aud Med and Aud High. Participants with environment on felt wind and heat before entering the barn where the startle occurred, but the environmental effects were identical to those provided during the bird startle. During the startle, participants typically responded with a backwards flinch, involving a shrugging motion of the shoulders (back activation), looking up at the beam falling in front of them (neck activation), and a widening of the eyes (eye activation). The results are presented in Figure 6 and Table 3.
Audio had a significant effect on the perception of startles, as indicated by Q2. The loud startle sounds, Aud High, resulted in an average 1.62-point (65%) increase compared with Aud Off (t(119) = 6.39, p < 0.001), whereas Aud Med resulted in an average 1.33-point (47%) increase compared with Aud Off (t(119) = 5.21, p < 0.001). Again, the environmental display did not have a significant impact on the startle perception.
Audio levels had a statistically significant effect on EMG activity, demonstrating significantly increased muscle activity with the addition of Aud Med. and Aud High. Neck muscle responses did exhibit environmental significance. Enabling the environmental display significantly increased neck activation by an average 23.8% MVC (129%) (t(104) = 2.66, p = 0.009).
Similar to the bird startle, there was a notable pattern in muscle activation and questionnaire responses. Generally, Aud Off resulted in the lowest % MVC and questionnaire response, whereas Aud Med and Aud High generally resulted in significantly higher levels. This highlights the importance of startle audio (Aud Med and Aud High) with the beam startle.

Thunder Startle
The thunder startle had a visual and auditory focus, but with a focus on creating premonition a few seconds before the startle. All participants experienced dark grey clouds overhead, and reduced sunlight. Those with environment enabled experienced increased wind speed and the heat was turned off; rain scent and mist were activated. The startle then consisted of flashing the screen brightly to simulate lightning, followed by a loud thunderclap. During the startle, participants typically responded by stiffening their neck, looking away from the screen, and squinting their eyes when the lightning and thunder struck. Confidence interval plots and ANOVA results are shown in Figure 7 and Table 4, respectively.
In contrast to the bird and beam startles, audio did not have a statistically significant effect on startle level (Q3), although environment did have a significant effect. Those with environment on reported an average 0.68-point (20.8%) decrease in startle perception compared with those without, reducing the questionnaire results from an average of 3.28 with environment off to 2.6 with environment on (t(119) = 2.87, p = 0.004). These results indicate that enabling the environmental display reduced how startled participants felt, regardless of audio levels.
Audio had a significant effect on neck muscle activation. Between Aud Off and Aud Med, there was an average 9.61% MVC (71%) increase (t(112) = 2.54, p = 0.033). Increased audio had an impact on the stiffening of the neck and looking away from the screen during the startle.
Environment had a notable effect on eye muscle activation. There was an average 8.80% MVC (29.9%) increase in activation when the environment was enabled (t(112) = 1.91, p = 0.058). The participants squinted their eyes more when the environment was enabled. This could have been a result of the startle itself, but the increased wind and the mist likely resulted in a natural physical response to such conditions.
There was a notable decrease in back activation when turning the environment on (t(113) = 1.80, p = 0.074). Back EMG decreased an average of 5.5% MVC (29.6%) with the environment enabled, which highlights that participants shrugged less or jumped back less when it was enabled. This correlated well with survey data indicating that the environmental display resulted in the event being less startling, which resulted in a reduced physical response to the startle.
Patterns in EMG and survey data were more varied in the thunder startle. Neck EMG, which was the only metric to show significance for audio levels, was also the only metric to show the previously mentioned pattern where audio resulted in low activation for Aud Off and significantly higher activation for startle audio (Aud Med. and Aud High). No other metrics demonstrated this trend. This could suggest that the neck response is more of a subconscious physiological startle response, which may be more reliable than the eye squint response in the presence of an environmental display for measuring startle responses.
In fact, audio made no noticeable difference in startle survey data when the environment was enabled, suggesting that the audio made no difference psychologically to the user. EMG data indicated a similar trend, where back, neck, and eye responses generally exhibited blunted responses to Aud High when the environment was on. This lack of audio significance in the neck, back, and eye responses, as well as lack of audio significance in perceptions of the startle, suggest that enabling the environmental display in fact made the startle less effective. This is believed to be due to the premonition that the environmental display created. An interesting pattern was noted when the environment was on or off. Specifically, when the environment is off, as audio is increased, we observed gradual increases in neck EMG, eye EMG, and startle survey data with each successive level. Otherwise, enabling the environment resulted in lower startle responses compared with environment off. Combined, these results indicate that increased audio yields increased startle responses (both physical and psychological), but that enabling environment provided a premonition that resulted in a lower startle response.

Graphics
One-way ANOVA was used across startles to measure the effects of varying graphic levels, as shown in Table 5 and Figure 8. The bird was designed to be a small graphical startle (Vis Small), the beam was designed to be a large graphical startle (Vis Large), and the thunder was designed to be a large graphical startle with premonition, because the sky darkened before the thunder strike (Vis Large w/Premonition). The bird and beam were both startles with objects of varying size moving towards the user, whereas thunder was a bright flash of light. Categories with the environment were removed from the analysis to avoid any possible level of interaction; hence, participant responses were truly dependent on graphics. Lastly, only eye muscle activation was analyzed for EMG, because this muscle is the most receptive to visual stimulation.
The effects of visual premonition can be compared between beam and thunder, because the beam suddenly appeared and the thunder was preceded by darkening skies. How startling the event was perceived to be decreased by 0.46 points (14%), although this difference did not reach the notable range (t(179) = 1.05, p = 0.14; Table 5). This suggests that visual premonition may decrease the effect of visually startling events.

Premonition and Environment
The results presented in Sections 5.3 and 5.4 hinted that premonition caused by environmental effects decreased participant muscle activation and perception. This section aims to confirm that finding. Levels of graphics were kept constant by comparing the beam and thunder startles because these both contained large visual elements but had varying levels of premonition. Two-way ANOVA was used to analyze the main effects of the environment and premonition, as well as the interactions between them. As noted in Section 5.3, eye muscle activation was likely caused by wind and mist blowing into the participants' faces when the environment was active; thus, we excluded it from the analyses.
Results of the two-way ANOVA indicated that there was a significant interaction between premonition and environment on participant perception of how startling the event was, as shown in Table 6 and Figure 9; i.e., participants' responses to premonition, or lack thereof, changed depending on whether the environment was activated. For the startle with premonition (thunder), participants on average felt less startled with Env On compared with Env Off, by 0.68 points (26%) (t(239) = 2.84, p = 0.024). In contrast, the environment had no significant effect on the startle without premonition (beam). The thunder startle had the same level of visual premonition across environment activation, which suggests that the added effect of the environment truly decreased their perceptions of how startling the event was. This is further supported by the results presented in Section 5.4, where it is reported that the visual element of premonition did not have a significant effect on their perception.
Somewhat similar results are reflected in neck muscle activation. For the startle without premonition, there was a notable increase in neck activation when turning the environment on: 23.8% MVC (129%) (t(113) = 1.80, p = 0.074). This is in line with the previous results exploring the effects of Env on the beam startle, although the drop from significant to notable was most likely caused by changing the main effects of the analysis.
In contrast, the environment had no significant effect for the startle with premonition (thunder). The above results suggest that Env can be used to either amplify a response or blunt it. For startles without premonition, turning on the environment can be used to amplify neck muscle activations. For startles with premonition, the environment can be used to blunt the perception of how startling the event is.
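To make the interaction test concrete, the following sketch computes the F statistics of a two-way ANOVA (premonition x environment) from first principles. It is an illustration under stated assumptions, not the analysis actually run in the study: cells are assumed to be balanced, the survey scores are invented, and p-values are omitted because they require the F distribution, which the Python standard library does not provide.

```python
from statistics import mean


def two_way_anova_f(cells):
    """F statistics for a balanced two-way design.

    cells[i][j] lists the observations for level i of factor A and
    level j of factor B; every cell must hold the same number (>= 2)
    of observations, and the within-cell variance must be non-zero.
    """
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    if any(len(row) != b for row in cells) or n < 2:
        raise ValueError("design must be rectangular with n >= 2")
    if any(len(cell) != n for row in cells for cell in row):
        raise ValueError("cells must be balanced")

    cell_means = [[mean(cell) for cell in row] for row in cells]
    row_means = [mean(row) for row in cell_means]
    col_means = [mean(cell_means[i][j] for i in range(a)) for j in range(b)]
    grand = mean(row_means)

    ss_a = n * b * sum((m - grand) ** 2 for m in row_means)
    ss_b = n * a * sum((m - grand) ** 2 for m in col_means)
    ss_ab = n * sum(
        (cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    ss_error = sum(
        (x - cell_means[i][j]) ** 2
        for i in range(a) for j in range(b) for x in cells[i][j]
    )
    ms_error = ss_error / (a * b * (n - 1))
    return {
        "A": ss_a / (a - 1) / ms_error,
        "B": ss_b / (b - 1) / ms_error,
        "AxB": ss_ab / ((a - 1) * (b - 1)) / ms_error,
    }


# Invented 5-point survey scores, for illustration only.
# Rows: no premonition (beam), premonition (thunder); columns: Env Off, Env On.
hypothetical_scores = [
    [[4, 3, 4, 3], [4, 4, 3, 3]],
    [[3, 4, 3, 4], [2, 3, 2, 3]],
]
f_stats = two_way_anova_f(hypothetical_scores)
```

In the invented data, the environment lowers scores only in the premonition row, which is the kind of pattern that the "AxB" interaction term captures.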

Discussion
The goal of the startles was to elicit a variety of psychological and physiological responses. The bird startle was designed for participants to track a graphically subtle, fast, and potentially dangerous object moving past them, but without any dependence on the environment. As the EMG responses show, participants used their neck and eyes to follow the bird. As expected, startle audio was important for making the event startling. Startle perception was increased with startle audio. Startle audio significantly increased EMG activity in the eye and neck as the user noticed the bird. As expected, environment had little effect.
The beam was intended to be a large visual startle that is easily noticed, again with no dependence on environmental display. As expected, startle audio was important to increasing how startling the beam was and for increasing the EMG response. We had expected that the large graphical appearance would reduce the change in effect size, but again, startle audio created large increases in startle effects. As EMG responses show, participants jumped backwards, shrugged, and stiffened their neck significantly more when startle audio was provided. Startle audio correlated to participants widening their eyes more as the beam fell, which contrasted with the bird and thunder startles, where participants instead closed their eyes. Surprisingly, enabling the environment significantly increased neck activation, meaning that the participants looked upwards significantly more with the environmental display, which was not expected. Although startle audio was important for eliciting a startle response, adding the environment had a significant effect on muscle activation, which notably compounded the effectiveness of startle audio. The environmental display did not impact perceptions of the startle; however, it had a clear impact on the physical response to the startle, indicating that audio and environment combined can be used to amplify responses.
Trends for the thunder startle were quite different from those for the bird and the beam, because the goal was to use the environment and graphics to create premonition and reduce startle. The thunder startle had a full-screen graphical effect, darkening followed by a bright flash, combined with loud audio, but the environmental display played an active role. Wind increased, scent and moisture were sprayed, and the heat lamp turned off before the thunder struck. We had anticipated that the use of environment would cue the participant that a startle would occur. We expected participants to be less startled with the environment and more startled with louder startle audio. This would correspond to them closing their eyes and jumping back less with the environment on and more with audio. Instead, participants reacted by squinting their eyes and looking away. As expected, turning the environment on decreased the perceptions of how startling it was. The environmental display amplified the effect of the full-screen graphical display, and warned the participant that something was coming. Surprisingly, the premonition effect was so significant with the environment enabled that the startle audio had little effect on the perceptions of the startle. This was reinforced by the reduced back responses with environment enabled, meaning that the participants jumped less. The environment did increase eye activity, but this was believed to be due to squinting in response to the mist and increased wind speeds, because it contradicted the survey results. It further suggests that the classical eye squint measurements for assessing startle responses may not be valid with strong environmental effects. The startle audio still had a significant impact on neck activity, meaning that the startle audio still caused them to look up more aggressively. 
Overall, the results highlight that environmental displays can be used to help participants better anticipate startling events, although the natural neck response to the startle audio remains.
Future work could focus on repeating this study using a head-mounted display (HMD). HMDs are very popular in the VR community, but they may interfere with environmental displays. The advantage of the CAVE display used by MS.TPAWT is that it allows a user to view the VR world and experience full-body and full-face contact with the environmental displays. Such interfaces are used in systems with locomotion interfaces that enable large workspaces, such as MS.TPAWT or CAREN [61], teleoperation [73], or theme park entertainment systems [74], whereas HMDs are more common in personal entertainment or gaming VR interfaces.

Conclusions
These results have shown that extended reality created by environmental displays can be quite effective for altering the response of participants and their perceptions of virtual worlds. Audio is key for eliciting stronger responses and perceptions of startling experiences, but environmental displays can be used to either amplify or diminish those responses. Environmental displays can be used to create premonitions of startling situations, which can be used to diminish startle responses, both physiologically and psychologically. Environmental displays can also be used to either increase or decrease physiological responses to surprising events, depending on the type and sequence of events. In conclusion, the response to a startling situation can be controlled by altering environmental displays and audio. These findings suggest that environmental displays could be investigated as a means of improving awareness during the teleoperation of remote systems.