Judgments of Object Size and Distance across Different Virtual Reality Environments: A Preliminary Study

Abstract: Emerging technologies offer the potential to expand the domain of the future workforce to extreme environments, such as outer space and alien terrains. To understand how humans navigate in such environments that lack familiar spatial cues, this study examined spatial perception in three types of environments. The environments were simulated using virtual reality. We examined participants' ability to estimate the size and distance of stimuli under conditions of minimal, moderate, or maximum visual cues, corresponding to an environment simulating outer space, an alien terrain, or a typical cityscape, respectively. The findings show underestimation of distance in both the maximum and the minimum visual cue environment but a tendency for overestimation of distance in the moderate environment. We further observed that depth estimation was substantially better in the minimum environment than in the other two environments. However, estimation of height was more accurate in the environment with maximum cues (cityscape) than the environment with minimum cues (outer space). More generally, our results suggest that familiar visual cues facilitated better estimation of size and distance than unfamiliar cues. In fact, the presence of unfamiliar, and perhaps misleading, visual cues (characterizing the alien terrain environment) was more disruptive for distance and size perception than an environment with a total absence of visual cues. The findings have implications for training workers to better adapt to extreme environments.


Introduction
Emerging technologies have transformed the world of work and increasingly are making desolate and hard-to-reach environments, such as outer space, deep oceans, and polar regions, more accessible to humans [1][2][3][4][5][6]. The visuospatial conditions of these environments may adversely affect the spatial cognitive processing of people in such environments, hindering their ability to work safely and productively [7][8][9][10]. In normal environments, familiar natural or manmade landmarks, such as roads, people, cars, buildings, streetlights, and trees, offer spatial cues to help individuals not just to create a clear mental representation of the area, but also to relate to the various spatial elements, judge their size and position, and determine the speed of a moving object [11,12]. Such cues are not available in remote and desolate places devoid of landmarks, such as the North and South Poles and deserts. Likewise, the alien terrains of the Earth's moon or Mars lack familiar spatial landmarks. To help humans better adapt to such altered conditions, reliable and cost-effective training technologies are needed. Therefore, to inform design principles for such technologies, examining how a lack of visuo-spatial cues affects spatial perception is also critical.
Virtual reality technology has been employed by multiple studies to simulate extreme physical conditions in order to investigate spatial cognitive processing in such conditions [13][14][15]. For instance, some studies [16,17] simulated a multi-module space station in VR and concluded that spatial tests displayed in VR can predict work and navigation performance in such environments. Other studies [18,19] have successfully employed VR to simulate extreme environments and found that VR has the potential to serve as a training system to help people adapt to a microgravity environment. In fact, applying VR-based simulations to test and train spatial abilities of astronauts has been recommended as a cost-effective and safer approach than conventional parabolic flights and drop towers [19,20].
Newly developed training tools to address the impact of limited spatial cues will need fundamental research to examine their efficacy. Such knowledge is critical to informing the design and development of training tools to prepare a broad population for space exploration. Although extreme environments pose a range of challenges (e.g., microgravity or extreme temperature), this study focuses on potential challenges of operating in environments that lack familiar visual cues. Specifically, we examine how the absence of familiar spatial cues influences spatial perception. Spatial perception, in the context of this study, refers to the ability to accurately estimate the distance and scale of objects in space [21]. This is particularly important under extreme conditions. Misinterpreting size and distance may not only hinder work productivity but also result in life-threatening conditions. In 1997, a Russian Progress supply spacecraft and the Russian Mir space station collided due to human error in estimating distance and size ( [22], cited by [23]). To avoid catastrophic effects, there is an urgent need to understand how limited visual cues or the absence of visual cues affects human spatial perception in extreme conditions. Previous research has established that in a real-world environment people judge size and distance based on visual cues such as landmarks. Judgments of size and distance can be adversely affected when visual cues are limited or incomplete. Our study compares the accuracy of estimating spatial dimensions across three levels of visual cue availability using virtual reality environments. We hypothesize that a higher availability of familiar visual cues will facilitate more accurate perceptions of object distance and size.

Background
Prior research has established the existence of perceptual distortions in extreme environments. Mars, the Moon, and deep space asteroids are just three examples. In a study of eight astronauts on board the International Space Station, [23] reported that the astronauts' judgment of size and distance was adversely impacted by microgravity. Astronauts underestimated distance in the depth plane (7.9-18.2%) and perceived a 3D cuboid to be 3.5% taller, 4.5% thinner, and 3.5% shallower than the actual dimensions, demonstrating distorted perception under microgravity conditions.
Oravetz et al. [24] investigated the perceptual distortion of slope, distance, and height estimation within lunar-like and lunar VR environments. Results showed that slope was significantly overestimated, that distance estimates also varied significantly, and that estimation accuracies were affected by viewing distance. People tended to underestimate distance at a far-viewing distance but overestimate it at a near-viewing distance. Height estimates also varied considerably, ranging from −568 to 688 m. Oravetz et al. also showed that the perception of the height of a hill is influenced by the viewing distance. Participants overestimated the height of a small hill at both the near- and far-viewing distances. For taller hills the difference was not statistically significant, but there was a greater tendency to underestimate the height at a near-viewing than at a far-viewing distance.
The underlying causes of perceptual distortions are not fully understood, with most studies attributing distortion to Earth-discrepant gravity conditions [25,26]. Jörges and López-Moliner [26] used the concept of a "gravity prior" to describe long-term experience in the Earth gravity environment. Based on a Bayesian framework of perception, a gravity prior implies that individuals tend to rely on their strong prior experiences on Earth. Furthermore, people find it very difficult to perceptually adapt to a non-Earth gravity environment [26][27][28]. Few studies have directly addressed unfamiliar and limited visual cues as a source of inaccuracies in spatial perception.
On Earth, visuospatial cues provided by familiar landmarks, such as roads, buildings, and trees, play an important role in creating an accurate cognitive representation of what exists [20]. Research [29,30] has suggested that these landmarks, or non-geometric spatial features, influence the perception of distance, size, direction, and scale. Naceri and Hoinville [31] also found that familiar objects may provide linear perspective and a sense of scale (familiar size), which in turn may lead to more accurate distance judgment.
Vienne et al. [32] investigated how screen distance in VR influences perceived depth in two environments, a gray untextured background environment and a cue-rich environment. They found that distance perception was substantially affected by screen distance for the far-distance stimuli. However, the effect lessened under a cue-rich environment. Ballestin et al. [33] examined a 3D blind reaching task using video see-through (VST) and optical see-through (OST) head-mounted displays and found an underestimation of distance, particularly with OST systems. Gerig et al. [34] studied the effect of the absence of visual cues, such as aerial and linear perspective, shadows, texture gradient, and occlusion, in virtual environments. After comparing the results of a computer screen group and a head-mounted display (HMD) group, they concluded that the screen group performed worse than the HMD group in terms of completion time. Additionally, both HMD groups, the one with and the one without visual cues, achieved the optimal minimum in terms of speed peaks and hand path ratio.
Wayfinding research [35] has underscored the significance of landmarks when determining spatial representation and planning navigation toward a destination. The absence of spatial cues, therefore, may cause difficulties in spatial cognitive processing. In a desert or polar region, spatial cognitive challenges arise due to the absence of landmarks, suggesting that landmarks provide a context for calculating depth, height, position, and direction [20]. In this regard, it seems plausible that the limited and incomplete visual cues in extreme environments affect the accuracy of spatial perception in the same way they do on Earth.
In summary, the literature to date suggests that inaccuracies in size and distance judgments under extreme environments are a consequence of the absence of familiar spatial cues. Here we investigate this claim using VR as a platform. Specifically, we explored perception of size (i.e., depth, height, and length) and distance under three separate VR environments differing in their degree of familiar visual cues. The use of a VR platform allowed us to examine the role of landmarks and of other environmental conditions under more controlled conditions than in previous research.

Research Goal and Objectives
The main goal of this research was to understand how a lack of visual cues affects spatial perception, particularly distance and size perception. The study had the following research objectives:

• To identify the impact of different levels of visuospatial cues on distance perception. For this, we specifically measured both self-to-object (egocentric) and object-to-object (allocentric) distances.

• To assess the influence of visuospatial cues on the accuracy of size perception. To this end, we considered all dimensions of an object (i.e., length, depth, and height).

Participants
There were 32 participants (12 females), each with normal or corrected-to-normal vision. Participants were students recruited through an announcement sent through the university's email system. They ranged in age from 18 to 39 years old (M = 24.8, SD = 6.27).
All participants provided written consent prior to the study, and the study was approved by the university's Institutional Review Board. A power analysis was conducted in G*Power 3.1.9.2 using repeated measures ANOVA to determine a sufficient sample size, with α = 0.05, power = 0.8, and effect size (d) = 0.96. The required sample sizes were around 20 for the distance estimation task and around 11 for the size estimation task.

Study Environment
The VR environments were created in the Unity 3D game engine [36]. Unity 3D offers the ability to customize environments and interaction through scripts to emulate specific performance and functionality [36].
We created three environments with a set of two stimuli, as shown in Figure 1. Each environment contained either maximal, moderate, or minimal visual and gravitational cues. The three environments were as follows:
Environment 1: Cityscape. This environment was constructed to represent a typical city. It included familiar landmarks and both visual and gravitational frames of reference. This environment was designed to offer the maximum number of cues (Figure 1A).
Environment 2: Spacescape. This environment was constructed to represent an alien planet (e.g., Mars). It included visual and gravitational frames of reference due to the presence of terrain and hypo-gravity. This environment did not contain familiar landmarks. It was designed to offer a moderate number of cues (Figure 1B).
Environment 3: Outer space. This environment was constructed to represent space. It contained neither visual or gravitational frames of reference nor familiar landmark objects, such as trees, cars, and buildings. This environment was designed to offer a minimum number of cues (Figure 1C).
All three environments were constructed with equal visual quality (resolution and sharpness) to allow for fair comparison across stimuli.
For each trial, participants were seated in a swivel chair and viewed the interior of one of the three environments presented on the HTC VIVE System HMD at a per-display resolution of 1440 × 1600 pixels and a 110° field of view (FOV). Participants were free to explore all 360 degrees of the virtual environment during the experiments.

Figure 1. (A) Environment 1, a cityscape with maximum spatial cues. Familiar landmarks include the chair, trees, bollard, road, doors, and lamp posts. Visual cues that denote a gravitational environment include the placement of objects on the ground and the horizon. The green and yellow rectangular cuboids are the stimuli for the distance and size estimation test. (B) Environment 2, a spacescape with moderate spatial cues. Visual cues that denote a gravitational environment include the horizon and the terrain on the ground. It did not contain familiar landmarks. The green and yellow rectangular cuboids are the stimuli for the distance and size estimation test. (C) Environment 3, an experimental environment with minimum spatial cues. It contained neither visual cues nor familiar landmarks. The green and yellow rectangular cuboids are the stimuli for the distance and size estimation test.

Task and Procedure
The participants' task was to estimate the size and distance of stimuli in three separate environments. At the beginning of the study, participants completed a demographic questionnaire. They were then introduced to the three VR environments. Three sets of two rectangular cuboids (set A, set B, and set C), one yellow and one green, served as targets for both the distance and size estimation tasks. In each set, the two cuboids, which differed in size, were placed at two separate distances, near and far (see Figure 2). Near targets were placed 6 to 9 m from the participant, and far targets were placed 12 to 13 m from the participant. The stimuli were colored yellow and green so that they stood out from the background and could be clearly distinguished from each other. We wanted participants to rely solely on the spatial cues of the environment when performing the spatial tasks, so the target objects themselves should not serve as a spatial cue. We therefore used rectangular cuboids as stimuli, as these have no features that could bias participants. Additionally, studies such as Gerig et al. [34] showed that visual depth cues rendered in virtual environments may have a minor effect on participants' performance while completing a task.
Prior to each experimental trial, participants completed two practice trials. The study employed a within-subject design. Each participant completed one of the cuboid sets (set A, set B, or set C) in each environmental condition (minimum, moderate, and maximum cues). The order of the environments and the cuboid sets was counterbalanced to avoid a learning effect (see Table 1). Participants first completed the distance estimation task, then the size estimation task.
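One way such a counterbalanced assignment could be generated is sketched below. This is an illustration only: the actual scheme is the one given in Table 1, and the function name `assign_orders`, the condition labels, and the cycling rule are all hypothetical choices made for this example.

```python
from itertools import permutations

# Hypothetical labels; the paper's Table 1 defines the real assignment.
ENVIRONMENTS = ("maximum", "moderate", "minimum")
CUBOID_SETS = ("A", "B", "C")


def assign_orders(n_participants):
    """Return, per participant, an ordered list of (environment, cuboid set) pairs.

    Environment orders cycle through all 6 permutations; cuboid-set orders
    advance once every 6 participants, so both factors vary across people.
    """
    env_orders = list(permutations(ENVIRONMENTS))
    set_orders = list(permutations(CUBOID_SETS))
    schedule = []
    for p in range(n_participants):
        envs = env_orders[p % len(env_orders)]
        sets = set_orders[(p // len(env_orders)) % len(set_orders)]
        schedule.append(list(zip(envs, sets)))
    return schedule


schedule = assign_orders(32)
```

Each participant sees every environment exactly once and every cuboid set exactly once, which is the property a within-subject design of this kind needs.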

Figure 2. Distance estimation task. "a" and "b" show the egocentric distance between the participant and the near and far targets, respectively, and "c" shows the allocentric distance between the two targets.

Distance Estimation Task
Participants were asked to estimate and verbally report the absolute distance between their own position and the position of each of the two targets (egocentric distance). Participants also reported the distance between the two targets (allocentric distance). Distances were reported in the participant's choice of conventional unit (e.g., feet, yards, or meters), thereby allowing distance reporting in familiar units.

Size Estimation Task
Participants were asked to determine the relative size of two rectangular cuboids (green and yellow) by first identifying the shortest side of each cuboid (e.g., depth, height, or length). Participants were then asked to define the aspect ratio of the other sides relative to the shortest side. For example, suppose that the shortest side of a given stimulus is its length (χ), that its depth equals its length, and that its height is twice its length. In this example, the aspect ratio of the cuboid is χ : χ : 2χ. In the study, the shortest side (χ) was set at 1 for convenience. A schematic of this is shown in Figure 3.
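The normalization described above can be sketched in a few lines. The function name `aspect_ratio` and the example dimensions are illustrative assumptions, not values from the study.

```python
def aspect_ratio(length, depth, height):
    """Express a cuboid's sides relative to its shortest side (shortest = 1)."""
    shortest = min(length, depth, height)
    return (length / shortest, depth / shortest, height / shortest)


# Hypothetical cuboid: length and depth 0.5 m, height 1.0 m.
ratio = aspect_ratio(0.5, 0.5, 1.0)  # (1.0, 1.0, 2.0), i.e., x : x : 2x
```

A participant's reported ratio can be compared against the ratio computed from the true dimensions, one dimension at a time.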

Results
To ensure consistency, all distance estimates were converted into SI units (meters). Relative error was defined as the difference between the estimated distance and the actual distance, divided by the actual distance. The same formula was used to calculate the relative error for size. A relative error of 0 indicated a perfect estimation. A negative value represented underestimation, whereas a positive value represented overestimation.
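The conversion and error computation can be sketched as follows. The sketch assumes that the error is normalized by the actual distance and signed so that underestimates are negative, which matches the sign convention stated above. The function names and the example values are hypothetical.

```python
# Exact conversion factors to meters for the units participants could use.
TO_METERS = {"m": 1.0, "ft": 0.3048, "yd": 0.9144}


def to_meters(value, unit):
    """Convert a reported distance to meters."""
    return value * TO_METERS[unit]


def relative_error(estimated_m, actual_m):
    """Signed relative error: negative = underestimation, positive = overestimation.

    Assumption: the error is normalized by the actual distance.
    """
    return (estimated_m - actual_m) / actual_m


# Hypothetical report: a target 9 m away is judged to be 20 feet away.
err = relative_error(to_meters(20, "ft"), 9.0)
```

Here `err` is negative, so this hypothetical response counts as an underestimate; its absolute value is the quantity entered into the later ANOVAs.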
For each analysis, relative errors more than two standard deviations from the respective mean were considered outliers and discarded from the analysis [37]. We felt it was important to screen for and remove outliers to improve the reliability of the dataset; the importance of doing so was underscored by Osborne et al. [38], who observed that less than 10% of the studies they reviewed even checked for the presence of outliers. Strategies for dealing with outliers have generated debate (e.g., see [39] vs. [40]). For the present study, we followed the widely accepted criterion of 2 or 3 standard deviations (SD) for identification of outlier data points [37,41]. We found no meaningful difference between the 2 SD and 3 SD criteria in our dataset, and so we used the more conservative cutoff of 2 SD. It has been shown that eliminating extreme values from a data set results in more accurate statistical inferences with fewer errors [37]. In our preliminary study, outlier elimination was adopted to obtain results that were as accurate as possible. Because the study used a within-subject design for all task conditions, the sample size remained acceptable even after removing the outliers (see the power analysis reported above).
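A minimal sketch of the 2 SD screening rule is given below. It assumes the sample standard deviation (n − 1 denominator) and a single pass over the data; the paper does not state either detail, and the function name and data values are hypothetical.

```python
from statistics import mean, stdev


def drop_outliers(values, n_sd=2.0):
    """Keep values within n_sd sample standard deviations of the mean.

    Assumptions: sample SD (n - 1 denominator) and one screening pass.
    """
    m = mean(values)
    sd = stdev(values)
    return [v for v in values if abs(v - m) <= n_sd * sd]


# Hypothetical relative errors for one condition; the last is extreme.
errors = [0.1, 0.2, 0.15, 0.12, 0.18, 0.11, 5.0]
kept = drop_outliers(errors)
```

With these illustrative numbers only the extreme value 5.0 lies more than 2 SD from the mean, so the other six values are retained.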

Egocentric Distance Perception
One participant failed to complete this task and was therefore excluded. In addition, we screened for outliers by condition (about 23% of data points were outliers). As a result, data from 24 participants were used for the final analyses.
The data showed a tendency for underestimation of distance in environments with maximum cues (
For further analyses, absolute relative errors were used to represent inaccurate estimation under each environmental condition, regardless of direction (i.e., overestimation or underestimation). The rationale is that, in terms of the magnitude of error, a negative relative error is just as different from zero (i.e., accurate estimation) as an equivalent positive relative error. A 3 (maximum vs. moderate vs. minimum visual and gravitational cues) × 2 (near vs. far) repeated measures ANOVA was performed to investigate the effects of the within-subject variables of environmental condition and proximity of the target, respectively, on the absolute relative error of distance estimation. The results showed no main effect of environmental condition or target proximity. The interaction of the two variables was not significant either (all p > 0.05, Greenhouse-Geisser correction; see Figure 4).
Because there was no significant effect of target proximity, we increased the power of the analyses by combining the data from the near and far targets, averaging the absolute relative error of the two proximity conditions for each participant. For participants whose data point for either the near or the far condition was an outlier, we used the data point that was not an outlier. This resulted in the omission of fewer data points (~16%), and the remaining data from 26 participants were entered into this analysis. A repeated measures ANOVA was conducted on the effect of environmental condition on the relative error of distance perception. The effect of environmental condition approached significance, F(1, 25) = 2.92, p = 0.10, ηp² = 0.11 (Greenhouse-Geisser correction).
The relative error was larger in the environment with a moderate level of visual and gravitational cues (M relative error = 10.71, SD = 30.11) than in the environment with minimum cues (M relative error = 0.62, SD = 0.21) or the environment with maximum cues (M relative error = 0.57, SD = 0.21).
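The pooling of near and far targets described for this analysis can be sketched as follows. Representing a screened-out data point as `None` is an assumption of this example, as is the function name.

```python
def pooled_abs_error(near, far):
    """Average the absolute relative errors of the near and far targets.

    A value of None marks a data point removed as an outlier (an assumption
    of this sketch). If only one value remains, it is used on its own; if
    both are missing, the participant contributes no data.
    """
    values = [abs(v) for v in (near, far) if v is not None]
    if not values:
        return None
    return sum(values) / len(values)


both = pooled_abs_error(-0.5, 0.25)      # mean of 0.5 and 0.25
only_far = pooled_abs_error(None, 0.25)  # falls back to the far target
```

Taking absolute values before averaging keeps an underestimate and an overestimate from cancelling each other out.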
To examine whether the absolute relative error of egocentric distance perception under each environmental condition was significantly greater than zero, we conducted one-sample t-tests (one-tailed) against zero for each environmental condition. The results revealed a significant deviation from zero (i.e., from accurate estimation) in the environments with maximum (t(25) = 13.60, p < 0.001, d = 2.67), moderate (t(25) = 1.81, p = 0.04, d = 0.36), and minimum (t(25) = 14.99, p < 0.001, d = 2.94) visual and gravitational cues.
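As a worked check, the t statistic and Cohen's d of a one-sample test against zero can be recovered from the reported mean, standard deviation, and sample size. The sketch below uses only the summary values given above for the moderate-cue environment (M = 10.71, SD = 30.11, n = 26); the function name is hypothetical, and the p-value is not computed.

```python
import math


def one_sample_t(mean, sd, n, mu0=0.0):
    """One-sample t statistic, Cohen's d, and degrees of freedom from summary stats."""
    standard_error = sd / math.sqrt(n)
    t = (mean - mu0) / standard_error
    d = (mean - mu0) / sd
    return t, d, n - 1


# Moderate-cue environment, egocentric distance (values as reported).
t, d, df = one_sample_t(10.71, 30.11, 26)
```

This yields t(25) ≈ 1.81 and d ≈ 0.36, in agreement with the reported values. The large standard deviation in the moderate environment explains why a much larger mean error produces a much smaller t than in the other two environments.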

Allocentric Distance Perception
One participant failed to complete this task. For each environmental condition, the outlier data points were dropped from the analyses, which led to the omission of about 16% of the data. The analyses were conducted on the data from the remaining 26 participants.
Overall, participants' responses showed an overestimation of the distance between the two targets under the environment with moderate cues ( Again, for further analyses, the absolute relative error was used. A one-way repeatedmeasures ANOVA on perception of allocentric distance perception revealed an effect of the environmental condition that approached significance; F (1, 25) = 2.82, p = 0.11 (Greenhouse-Geisser correction). In estimation of the distance between the two targets, participants made larger errors under the environment with moderate visual and gravitational cues (M relative error = 17.24, SD = 50.49) than the environment with minimum cues (M relative error = 0.66, SD = 0.32) than the environment with maximum visual and gravitational cues (M relative error = 0.51, SD = 0.27; see Figure 4).
To determine whether the absolute relative error of allocentric distance perception under each environmental condition was significantly greater than zero, we conducted one-sample t-tests (one-tailed) against zero for each environmental condition. The results revealed a significant deviation from zero (i.e., from accurate estimation) under all environments: maximum (t (25) = 9.25, p < 0.001, d = 1.[...]) [...].
Together, the results of both the egocentric and allocentric distance estimation tasks showed an underestimation of distance in the environment with both visual and gravitational cues (maximum: cityscape) and the environment with no cues at all (minimum: outer space). However, there was a tendency for an overestimation of distance in the environment that had gravitational cues but no familiar visual cues (moderate: space scape).

Size Estimation
Two participants failed to complete this task appropriately. Further, for each study condition, outliers were dropped from the analyses, which resulted in the exclusion of about 13% of the data. The remaining data from 23 participants were entered into the analyses. In the environment with maximum visual and gravitational cues, participants un- [...].
Similar to the analyses of distance estimation data, the sign of relative errors was removed for further analyses of size estimation data. A 3 (maximum vs. moderate vs. minimum visual and gravitational cues) × 2 (near vs. far) × 3 (depth vs. height vs. length) repeated measures ANOVA was performed to examine the effects of the within-subject variables of the environmental condition, proximity of the target, and dimension of the cuboid, respectively, on the absolute relative error of size estimation.
To break down the significant three-way interaction, simple two-way interaction analyses between environmental condition and dimension were conducted within each level of target proximity, using Bonferroni correction (i.e., p < 0.025). The interaction between environmental condition and dimension was significant only in size estimation of the near target (F (1.6, 35.21) = 5.035, p = 0.017, ηp² = 0.186; Greenhouse-Geisser correction) and not in that of the far target (p > 0.025).
We followed up with simple effects analyses of the environmental condition within each level of dimension for the near target, using Bonferroni correction (p < 0.017). In estimation of depth, the effect of environmental condition was significant, F (1.1, 24.17) = 7.37, p = 0.01, ηp² = 0.251 (Greenhouse-Geisser correction). Simple pairwise comparisons with Bonferroni adjustment revealed that estimation of depth was significantly more accurate in the environment with minimum cues (M relative error = 0, SD = 0) than in the environments with moderate (M relative error = 0.21, SD = 0.23; p < 0.001) and maximum (M relative error = 0.23, SD = 0.23; p = 0.001) cues. However, the accuracy of depth estimation did not differ between the environments with maximum and moderate visual and gravitational cues.
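The corrected significance thresholds and the effect sizes in the follow-up analyses can be checked with simple arithmetic. The sketch below assumes a familywise alpha of 0.05 divided by the number of follow-up tests (two proximity levels, three dimensions) and the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The function names are hypothetical, and the degrees of freedom are the rounded values reported in the text.

```python
def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def partial_eta_squared(f_value: float, df_effect: float, df_error: float) -> float:
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    weighted_f = f_value * df_effect
    return weighted_f / (weighted_f + df_error)


# Thresholds for two proximity levels and for three cuboid dimensions.
alpha_proximity = round(bonferroni_alpha(0.05, 2), 3)
alpha_dimension = round(bonferroni_alpha(0.05, 3), 3)
print(alpha_proximity, alpha_dimension)

# Effect sizes for the near-target interaction and the depth simple effect.
eta_interaction = round(partial_eta_squared(5.035, 1.6, 35.21), 3)
eta_depth = round(partial_eta_squared(7.37, 1.1, 24.17), 3)
print(eta_interaction, eta_depth)
```

With these inputs the thresholds come out as 0.025 and 0.017, and the recomputed effect sizes match the reported 0.186 and 0.251.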
In estimation of height, the effect of environmental condition was significant, [...].
In conclusion, the interaction of environmental condition and estimated dimension of the cuboid was dependent on target proximity. This interaction was significant only in size estimation of the near target. Further analyses on estimation of the near target revealed an effect of environmental condition in estimation of depth and height, but not in estimation of length. Namely, the accuracy of depth estimation was substantially better in the environment without any visual or gravitational cues than in the other two environments. However, estimation of height was more accurate in the environment with maximum cues than in the environment with minimum visual and gravitational cues. On a descriptive level, estimation of depth overall was more accurate than estimation of the other two dimensions, regardless of target proximity and environmental condition (see Figure 5).

Discussion
This study was designed to better understand how variability across environments in the presence of familiar spatial cues can influence spatial perception ability. Specifically, we investigated participants' ability to determine the absolute distance and relative size of stimuli under three environmental conditions.
Results confirmed difficulties in distance and size estimation, in particular under the moderate visual cues environment (environment 2: space scape). However, even with maximum visual cues (environment 1: cityscape), perceived distance significantly deviated from the actual distance and was consistently underestimated. This distance compression could be due to the VR platform. Existing studies have shown that distance is regularly underestimated in a VE when compared to the real world [14,[42][43][44][45][46][47][48]. Thompson et al. [35], for example, compared absolute distance judgment in the real world with that in VR environments of varying quality (e.g., low-quality graphics and wireframe graphics). They showed that distance in VR was significantly underestimated. For this study, we evaluated relative distances to mitigate this underestimation. Virtual environments may also be unable to represent real physical environments exactly. There may also be perceived dilations and compressions of space, as found by Cutting [49] in a study of lenses, that may influence the field of view aspect of computer graphics and VR.
We found that the absolute distance to the target was underestimated in the maximum (environment 1: cityscape) and minimum (environment 3: outer space) spatial cue conditions, whereas it was overestimated in the moderate cue (environment 2: space scape) condition. We attribute the more accurate perceptual judgments in environment 1 to the presence of recognizable objects. These objects serve as key frames of reference to determine the distance and size of other objects [50]. This was also expressed in several participants' comments. When judging a target's absolute distance in environment 1 (cityscape), the maximum visual cue environment, participants noted that they used the uniform tiles on the floor or standard-height street poles to measure the relative distance and size of the virtual objects.
Contrary to our expectation that environment 3 (outer space), which had the fewest visual cues, would yield the highest relative error, it was in fact environment 2 (space scape), the moderate visual cue condition, that produced the highest relative error. The reason for this is unclear, but one may speculate that unfamiliar visual cues in the presence of other familiar cues, such as gravity, could lead to misjudgments regarding distance and size. This partially replicates previous findings that show the deceptive nature of surfaces, and that the absence of familiar objects hinders distance judgment [51][52][53][54].
Participants also reported difficulties in estimating distance in environment 2 (space scape), the moderate spatial cue condition, due to the presence of unfamiliar spatial features, such as mountains, valleys, and craters. Some participants reported it was challenging to determine whether the mountain on the Mars-like terrain was a big mountain far away or a small sandpile close by, because of the deceptive nature of the altered environment's surface. Therefore, distance judgment varied considerably between participants. One participant estimated a target at an actual distance nine meters from the standing point as being one meter away, while another judged the same 9-m target to be 356 m away. Alan Shepard, a former Apollo 14 astronaut, remarked in On the Moon: The Apollo Journals, that "It's crystal clear up there-there's no closeness that you try to associate with it in Earth terms-it just looks a lot closer than it is" ( [55], as cited in [51]).
In the size estimation task, a tendency to underestimate depth was observed. This resonates with previous studies [23,24,56], which indicated consistent underestimation of depth and overestimation of height by participants in altered conditions, such as microgravity and lunar terrain. Moreover, our experiment found that estimation of depth was more accurate under the minimum visual cue condition, suggesting that the relationship between size perception and the presence of visual cues might be non-linear. The existence of unfamiliar visual cues may produce more misleading size perceptions than an environment with no visual cues, where individuals rely solely on an idiosyncratic reference frame.
Our findings underscore the need for extra support for spatial perception ability through workforce training for extreme environmental conditions (e.g., alien or arctic terrain). As [57] stated, "If an astronaut cannot accurately visualize the volume of the station, its surroundings, or a planetary surface, navigation may cause delays and frustration. There may also be consequences for space habitat design if squared volumes do not look square to people in space".

Limitations
The study reported here has certain limitations that are important to mention. As it was a preliminary study, the number of trials and participants was relatively small. Future research should increase both. The extreme environment (e.g., Mars-like terrain) simulated in this study did not consider different lighting conditions [24,58]. Relatedly, we did not consider shadows, surface color, or textural contrast [59]. Additionally, virtual environments may not be fully representative of a real environment; thus, caution must be exercised in interpreting the results. Future work should directly compare virtual and real conditions to see whether these cues moderate human spatial perception.

Conclusions
This novel study demonstrated a clear impairment of spatial perception under extreme conditions. The results suggest that new tools are necessary to train future workers, whose environment may well be extreme, to improve spatial abilities. Future work in space, for example, will include constructing habitable bases or stations, conducting experiments, and collecting samples in vast uninhabited environments. These tasks, and those perhaps yet unimagined, will involve a wide range of perceptual-motor and perceptual-cognitive activities, all requiring excellent spatial abilities [23,54,60,61]. Piloting-based tasks, such as safely flying and landing quadcopters and operating unmanned/manned rovers, require a complete understanding of the spatial characteristics necessary to successfully execute the task. Spatial perception, including judgment of relative size, height, scale, and position of spatial components, will be critical to such work [4,21]. In the same vein, spatial ability is relevant to submarine and polar exploration missions, and a similar ability will be necessary to succeed in these environments, as in space missions [20].
Our findings have potential applications in other work domains exhibiting extreme visual and gravitational conditions; deep sea and deserts are two such candidates. Similar spatial difficulties are often reported by workers in underwater environments [62]. Our study, though preliminary, provides insights into how environments with limited or no visual cues impair spatial cognition. It also illuminates the use of technologies for training workers to adapt to such extreme environments and suggests how such tools could be designed and developed. It is hoped that this study provides an impetus for similar investigations in a range of other extreme environmental conditions.