Extended Reality in Spatial Sciences: A Review of Research Challenges and Future Directions

This manuscript identifies and documents unsolved problems and research challenges in the extended reality (XR) domain (i.e., virtual (VR), augmented (AR), and mixed reality (MR)). The manuscript is structured around technology, design, and human factors perspectives. The text is visualization/display-focused; other modalities such as audio, haptics, smell, and touch, while important for XR, are beyond the scope of this paper. We further narrow our focus mainly to geospatial research, with necessary deviations to other domains where these technologies are widely researched. The main objective of the study is to provide an overview of broader research challenges and directions in XR, especially in spatial sciences. Aside from the research challenges identified based on a comprehensive literature review, we provide case studies with original results from our own studies in each section as examples to demonstrate the relevance of the challenges in current research. We believe that this paper will be of relevance to anyone who has a scientific interest in extended reality and/or uses these systems in their research.


Introduction
The terms virtual, augmented, and mixed reality (VR, AR, MR) refer to technologies and conceptual propositions of spatial interfaces studied by engineering, computer science, and human-computer interaction (HCI) researchers over several decades. Recently, the term 'extended reality' (or XR) has been adopted as an umbrella term for VR/MR/AR technologies. In the past five years or so, XR has been met with renewed excitement within the sciences and industry, as recent technological developments have led to cheaper and lighter devices and significantly more powerful software than previous generations. Previously, the use of XR remained 'in the lab', or only in specialized domains. The recent turn has led to a wider uptake in society, such as in civil defense, aviation, emergency preparedness and evacuation planning, and nearly all educational disciplines, as well as in the private sector. In Geographic Information Science (GIScience) and related domains, XR concepts and technologies present unique opportunities to create spatial experiences in the ways humans interact with their environments and acquire spatial knowledge [1][2][3], a shift that GIScience scholars had already envisioned 20 years ago [4,5]. With recent developments in software, spatial computing [6,7] has emerged as a powerful paradigm enabling citizens with smartphones to map their environment in real time, in 3D, and in high fidelity. XR is, however, not simply about 3D representations or photorealism: immersive interfaces offer powerful experiences based on visualization and interaction design, transforming our sense of space. Through well-designed, compelling, and meaningful information experiences, we can collectively extend our lived experience with geographic spaces (hence, the term extended reality). A fundamental paradigm shift lies in the implication that, with VR, we can experience a sense of place comparable or even identical to the real world. Such a development may alleviate the need for travel
through virtual teleportation and time travel, and impact scientific research in unprecedented ways. Bainbridge (2007) [8] presents the potential of online virtual worlds such as Second Life [9] and World of Warcraft [10] for virtual laboratory experiments, observational ethnography, and analysis of social networks. In addition to VR, by infusing information into our environment through spatial computing with AR and MR, we can annotate and redesign our physical world in previously unimaginable ways. Such (textual, audio, visual, or even olfactory) annotations have considerable potential to influence human spatial thinking and learning as well as everyday life. On the positive side, people can easily receive relevant information (or assistance, companionship, inspiration) in context. For example, one can generalize (in the sense of cartographic generalization) the information in the real world by highlighting, accentuating, or masking specific objects or any part of the visual field, so that relevant phenomena are presented differently than 'noise'. On the worrisome side, there are important ethical considerations [11]: if someone else manages our geographic reality, there is a threat to human autonomy through unprecedented political control, information pollution motivated by advertising, and deeper issues that may influence our cognitive system at a fundamental level. For example, we may lose our fundamental cognitive assumption of object permanence [12] if we can no longer tell virtual objects from real ones, which opens up many questions for spatial sense-making.
In sum, the profusion of 3D spatial data, the democratization of enabling technologies, and advances in computer graphics that enable realistic simulations of natural phenomena all necessitate a proper re-examination of the state of the art in XR and current research challenges. Therefore, in this manuscript, we selectively review the interdisciplinary XR discourse and identify the challenges from the perspectives of technology, design, and human factors (Sections 2-4). Before we present these core sections, we review the evolving XR terminology, disambiguate key concepts characterizing XR, and give a brief explanation of our methodology. Throughout the manuscript, we cover a broad range of opportunities and research challenges, as well as risks, linked to XR technologies and experiences, especially in connection to spatial information sciences. Each section ends with an example from GIScience to link the literature review to research practice.

Key Terminology
In recent years, the term extended reality (XR) has been used as an umbrella term to encapsulate the entire spectrum of VR, AR, and MR [13], and in some sense, is similar to the well-known reality-virtuality continuum [14] (elaborated in the next section). Some unpack XR as cross reality, though this appears to be less dominant: a Google search for extended reality provided ~323k hits, whereas cross reality provided ~63k hits at the time of this writing. XR-related terms are often used in interdisciplinary contexts without in-depth explanations or justifications, causing some confusion.
ISPRS Int. J. Geo-Inf. 2020, 9, x FOR PEER REVIEW 3 of 29

As XR expresses a spectrum of VR, AR, and MR, delineation between these terms necessarily remains fuzzy. There is little debate on the definition of VR (except the role of immersion, elaborated in the next section), but distinguishing MR and AR is not as straightforward. To characterize VR displays and tell them apart from media such as videos or visualization software, several scholars proposed distinguishing criteria. In GIScience, MacEachren [4] proposes four key factors: immersion, interactivity, information intensity, and intelligence of objects, building on earlier propositions such as Burdea's VR triangle [15]. With these criteria in mind, a 3D movie, in which the user does not interactively control the viewpoint, would not be considered a virtual environment (VE) [14]. Slocum et al.'s (2009, p. 521) [14] definition of a VE also includes these criteria implicitly: "a 3D computer-based simulation of a real or imagined environment that users are able to navigate through and interact with". Based on a similar understanding, Lin and Gong (2001) [16] drew upon early visions of the future of geography by Castells (1997) [17] and Batty (1997) [18], and formalized the term virtual geographic environments (VGEs) [19,20]. Peddie (2017) posited that a defining feature of VR is that it removes the user from actual reality by introducing barriers (e.g., as with head-mounted displays (HMDs) and partially also with Cave Automatic Virtual Environments (CAVEs) [21][22][23]). Conversely, AR and MR contain virtual objects without fully occluding reality: AR and MR displays are either optical or video see-through. With optical see-through displays, viewers can see the real world through semi-transparent mirrors, whereas with video see-through, the real world is captured through cameras and presented to the user in the HMD (Figure 1) [16,24].

How should we distinguish AR from MR?
Milgram and Kishino's (1994) seminal reality-virtuality continuum (Figure 2) loosely supports a distinction between AR and MR: MR is a superclass of visual displays consisting of both AR and AV (augmented virtuality) [24]. In this classification (Figure 2), one can see MR falling between the real and virtual environments at a point that enables blending the physical world and virtual elements.

Figure 2. Milgram and Kishino's (1994) [24] seminal continuum expresses the degree of mixture between real and virtual objects. The real environment and the VE represent the two ends of this continuum, while MR occupies the section between real and virtual, containing AR and AV. The original figure is by Milgram and Kishino [24], and this public domain illustration is modified from Freeman [25].
In Milgram and Kishino's (1994) model, AR can also supplement VEs, leading to augmented virtuality (AV): a virtual environment with traces of reality, such as the user's hands. The two inner categories (AR and AV) in the reality-virtuality continuum are vague by design, and thus are interpreted differently by different authors. Milgram and Kishino's MR (containing AR and AV) and what people mean today by MR differ. In modern discourse, MR suggests that there is real-time spatial referencing (i.e., spatial computing), thus virtual and real objects are in the same spatial reference frame and meaningful interactions between them are possible, whereas information superimposed anywhere in the world (such as a heads-up display, menu items, etc.) would be AR [7]. We will adopt this definition in this manuscript, though readers should note that there is no consensus on the definition of MR at this point [26]. Milgram and Kishino (1994) [24] predicted that categorizing an experience as 'based on real or virtual stimuli' may become less obvious over time, thus the terms AV and AR would be needed less, and MR might replace both. Today, 26 years later, AR is still the dominant term, even when people mean MR or AV: a Google search for AR revealed ~49 million hits, though MR is indeed on the rise (9.75 million hits), and not many people use the term AV (~74k hits). Note that various flavors of MR are also sometimes referred to as hybrid reality [21]. We will use MR throughout this paper when we mean environments that contain spatially registered (i.e., real-time georeferenced) virtual objects in the real world.

Disambiguation of Some Key Concepts
Besides the XR spectrum, the following concepts are important in XR discourse: immersion and presence (sense of presence, telepresence), and the (re)emerging terms digital twin, mirror world, and digital earth.
Immersion is an important term to clarify, as it seems to be used to mean slightly different things. Its origin in natural language is linked to being submerged in water [27]. Thus, metaphorical use of the word immersion suggests being fully surrounded by something, thereby perceiving the experience through all human senses 'as a whole'. Taking a technology-centered position, Slater and Wilbur (1997) [28] state that immersion is a computer display's ability to create a vivid (visual) illusion of reality. However, XR technology ultimately aspires to create a full experience involving all human senses (not only vision). Sherman and Craig (2018) [29] acknowledge that the term works both for physical and psychological spaces, and that immersion can refer to a purely mental state of being very deeply engaged with a stimulus (e.g., reading a story, watching a movie) or being physically surrounded by stimuli with the goal to achieve such mental immersion (e.g., as in immersive analytics, where the representation is not of reality but we can walk in the data [30]). Many stereoscopic displays offer stimuli that surround the viewer, and they are important in XR systems [28,31]. Partly due to the immersion provided by stereoscopic displays, current VR headsets seem to create a jaw-dropping effect (i.e., the so-called wow-factor, Figure 3).
Viewers spatially 'enveloped' in a virtual space may feel present in that space and feel the presence of others, and the terms (sense of) presence and telepresence are critical conceptual constructs. In basic terms, presence means a sense of being somewhere [32]. Presence in XR suggests that the viewer does not simply watch, but rather feels that they are in that space themselves. The central idea that one can achieve a sense of presence via technological means has many interesting implications for spatial sciences. For example, with telepresence [29], one can simulate traveling in space (e.g., visit a Himalayan peak, take a walk on the Moon, explore the oceans, visit a friend at home), or in time (e.g., go back to one's childhood home, visit the ancient Maya) [33]. Citizens as sensors [34] and the Internet of Things (IoT) continuously accelerate data generation, making these futuristic XR visions more and more plausible. Due to smart devices and IoT, every entity can theoretically emit real-time data to its digital twin [35]. A digital twin is a digital replica of a physical entity; ideally, it contains updates and stores all information about its physical counterpart, and there are two-way interactions between the twins [35] (e.g., by tightening a screw's digital twin, we can tighten the physical screw and vice versa). With the accelerated production of digital twins, the long-standing geospatial concepts of mirror worlds [36] and the digital earth [37] become more imaginable and will continue to affect GIScience.
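The two-way coupling between a physical entity and its digital twin described above can be sketched minimally as follows. All class and method names (`PhysicalValve`, `DigitalTwin`, etc.) are hypothetical illustrations of the concept, not any real IoT framework's API:

```python
# Minimal sketch of two-way digital twin coupling: the twin mirrors the
# device (physical -> digital) and can also command it (digital -> physical).

class PhysicalValve:
    """Stands in for a sensor/actuator-equipped physical device."""
    def __init__(self):
        self.position = 0.0  # 0.0 = closed, 1.0 = fully open

    def actuate(self, position: float) -> None:
        self.position = max(0.0, min(1.0, position))


class DigitalTwin:
    """Digital replica that mirrors the device and can command it back."""
    def __init__(self, device: PhysicalValve):
        self.device = device
        self.position = device.position  # replicated state

    def sync_from_physical(self) -> None:
        """Physical -> digital: ingest the latest sensor reading."""
        self.position = self.device.position

    def set_position(self, position: float) -> None:
        """Digital -> physical: changing the twin actuates the device."""
        self.device.actuate(position)
        self.position = self.device.position


valve = PhysicalValve()
twin = DigitalTwin(valve)
twin.set_position(0.75)          # manipulate the twin...
assert valve.position == 0.75    # ...and the physical device follows
```

In a real deployment, the synchronization would of course run over a network and handle latency and conflicts; the sketch only shows the defining property that state changes propagate in both directions.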
Figure 3. A random sample of images found online using the keywords "virtual reality + people" in Google's image search. Above is a subset of the first 100 images, in which roughly 80% of the images depict people using VR systems with euphoric expressions and their mouths open. The images were then composed and stylized using image processing software by the authors (i.e., the illustration and collage is the authors' own work). The collection shows almost all people using a VR headset with their mouths wide open, demonstrating how literally "jaw dropping" VR experiences can be.

Extended Reality (XR) in GIScience
While the merits of classical spatial analyses are well-established, XR contains integrated spatial computing that provides new explicit and implicit ways to connect data-driven representations directly to real-world phenomena. Thus, XR technologies provide unique experiences with significant potential to transform how and where we produce and consume geospatial data, providing not only new methods (to view, interact with, and understand that data), but also fundamentally new spaces of engagement and representation [1]. Hugues et al. (2011) [38] proposed a classification scheme that categorized AR applications (in which MR is implicitly included) for the display of GIS-derived data as either augmented maps or augmented territories; the former provides augmented views of geographic data as self-contained digital 3D models, and the latter provides augmented views of the natural environment superimposed with digital content. In both instances, the AR applications were described as deterministic, simply offering the user a collection of contrived 3D visualizations. XR in GIScience moves beyond static, predetermined content toward interfaces that provide experiential platforms for inquiry, experimentation, collaboration, and visual analysis of geospatial data. XR interfaces enable visualizing, perceiving, and interacting with spatial data in everyday spaces and performing spatial operations (e.g., using Boolean logic or through buffering) in situ, promoting spatial knowledge transfer by closing the gap between the spaces being analyzed (the real world) and the spaces where GIS analyses are traditionally conducted (GIS labs). In traditional spatial analyses, the fact that the user must interpret the relationship between the data product and the space it characterizes in their mind requires additional cognitive processing. Mobile technologies including smartphones, tablets, and HMDs such as Microsoft's HoloLens [39] and the Magic Leap [40] are
capable of connecting data and space through a process known as real-time reification [40]. As examples of MR in GIScience, Lonergan and Hedley (2014) [41] developed a set of applications simulating virtual water flows across real surfaces, and Lochhead and Hedley (2018) [42] provided situated simulations of virtual evacuees navigating through real multilevel built spaces. In both, the simulations were made possible by spatial data representations of the physical structure of those spaces. Future MR applications will enable real-time sampling and simulation, allowing the user to explore virtual water flows across any surface or to analyze virtual evacuations through any building. A major research challenge for such applications is developing technology that can sense the structure of complex spaces with high accuracy and precision (elaborated in the next section, Technology).
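The kind of surface-dependent simulation discussed above can be pictured with a toy steepest-descent ("D8-style") routine that routes a water droplet over a sensed height grid. This is an illustrative sketch of the general idea only, not Lonergan and Hedley's actual method; the function name and sample grid are our own constructions:

```python
# Toy steepest-descent water routing over a sensed height grid: a droplet
# repeatedly moves to its lowest 8-connected neighbor until it reaches a
# local minimum, where water would pool.

def route_droplet(heights, start):
    """Follow the steepest downhill neighbor until a local minimum."""
    rows, cols = len(heights), len(heights[0])
    r, c = start
    path = [(r, c)]
    while True:
        # 8-connected neighborhood (D8-style)
        neighbors = [(r + dr, c + dc)
                     for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                     if (dr, dc) != (0, 0)
                     and 0 <= r + dr < rows and 0 <= c + dc < cols]
        nr, nc = min(neighbors, key=lambda p: heights[p[0]][p[1]])
        if heights[nr][nc] >= heights[r][c]:
            return path  # local minimum reached: water pools here
        r, c = nr, nc
        path.append((r, c))

# A tilted surface with a pit at the bottom-right corner.
grid = [[9, 8, 7],
        [8, 6, 5],
        [7, 5, 1]]
print(route_droplet(grid, (0, 0)))  # [(0, 0), (1, 1), (2, 2)]
```

In an MR setting, the height grid would come from the sensed structure of the real surface, which is exactly why sensing accuracy (discussed above) bounds the quality of such simulations.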

Review Method
Due to the multifaceted and interdisciplinary nature of the subject tackled in this manuscript (i.e., an overview of the entire XR discourse and broad research challenges), we followed a traditional narrative format. A key structure in the text is the three main perspectives we used to create the core sections: technology, design, and human factors. These three dimensions cover a wide spectrum of the efforts in this domain; similar structures proposed in earlier works (e.g., MacEachren's famous cartography³ [43]) and a more recent publication on visual complexity [44] have been sources of inspiration. Where necessary, we quantified our arguments; otherwise, the text is a qualitative synthesis of the literature. We reviewed more than 150 scholarly publications and some web-based information on recent technologies. These publications were selected based on a semi-structured approach: a broad range of keyword combinations was used to identify seminal work and recent research using several search engines (e.g., Google Scholar, Scopus, IEEE (Institute of Electrical and Electronics Engineers)), representing a comprehensive set. Complete coverage is not feasible due to the sheer number of papers and the broad interdisciplinary scope. The created repository was annotated ('coded') by co-authors to assess relevance to spatial sciences and the degree of importance of the publications in the XR domain. The collected materials were then synthesized under the structure introduced above. We also include commentary (interpretation and critique) where we have a position on the covered subjects.

Technology
Although the main ideas behind XR-related developments and their inspiration have been philosophical at their core, the majority of XR research to date has been technology-driven. Example questions that are commonly asked from a technology perspective are: How can we create virtual worlds of high visual quality that match the everyday human visual experience? How can we do this intelligently (e.g., with the most efficiency and usefulness)? Can we stream such high-quality visual worlds to networked devices? How can we manage the level of detail? How can we do all of the above in a fast, real-time, on-demand fashion? How can we make hardware and software that is ergonomic? And so forth. This line of research is well-justified; functioning technology is simply necessary for creating desired experiences and testing hypotheses, and how technology functions (or does not function) plays an important role in any XR experience. Due to this strong interaction between technology and all XR research, we examine the technological research challenges and the current gaps in XR experiences due to unsolved technology issues.

Display Devices
Following the ideas in the Introduction, we will examine levels of immersion as an important dimension by which displays are classified (Figure 4), and levels of (visual) realism as an additional consideration that has relevance to GIScience [13].
Since the 1990s, visual displays have been classified as non-immersive, semi-immersive, or (fully) immersive [45]. Even though immersion depends not only on the device but also on the interaction and visualization design, such classifications are extremely common and thus will also be covered here. According to formal definitions, a non-immersive display should not be considered VR. However, in practice, this principle is often violated (e.g., 3D interactive walk-throughs projected on 2D displays or 3D videos captured from VEs have been called VR [46]). Semi-immersive devices often use stereoscopic displays, though monoscopic displays may also be considered semi-immersive (such as simulators, and large and curved displays). Fully immersive displays surround the user and cover their entire field of view with the virtual world [47]. Prime examples of fully immersive display types are the HMDs and 6-wall CAVEs [48] listed in Figure 4.

AR/MR displays can be head-mounted, smartphone- or tablet-based, holographic, or smart glasses, and are not usually classified according to the level of immersion. Example optical see-through AR/MR devices include Microsoft's HoloLens [39], Magic Leap One [40], and Google Glass [49], and video see-through examples include smartphone-based HMDs and VR-focused devices such as the HTC VIVE [50], which can be used for creating AR experiences. Holographic displays (e.g., Looking Glass [51] and Holovect [52]) are the oldest forms of AR, dating back to the 1940s [53], and are inherently different from stereoscopic displays, as they use light diffraction to generate virtual objects while stereoscopic displays rely on an illusion. When it comes to using these systems in GIScience, software development kits such as Google's ARCore tout their capacity for environmental understanding, but so far, they limit the real world to a collection of feature points and planes [54]. Registering augmented maps in space or conducting basic planar simulations alone may be enough for some tasks; however, this oversimplification of space severely impedes the ability to conduct any meaningful GIS-like spatial analyses. Emerging sensors such as the Structure Sensor by Occipital [55] and Apple's iPad Pro [56] use infrared projectors, cameras, or LIDAR scanners to actively map the environment. Each approach has a limitation (e.g., infrared works well indoors but struggles outdoors); nonetheless, a combination of these active sensors will provide strong surveying tools that will service GIScience and spatial projects such as navigation, spatial information acquisition, and volumetric 3D analyses.
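Plane detection aside, anchoring virtual content (such as an augmented map) on a detected plane ultimately reduces to a hit test: intersecting a ray from the camera with that plane. The sketch below shows this geometry in plain Python; it is a generic illustration of what such SDKs compute internally, not ARCore's actual API, and all function names are our own:

```python
# Generic ray-plane hit test of the kind AR SDKs perform when anchoring
# content to a detected plane (illustrative, not a specific SDK's API).
from math import fsum

def dot(a, b):
    return fsum(x * y for x, y in zip(a, b))

def ray_plane_hit(origin, direction, plane_point, plane_normal, eps=1e-9):
    """Return the 3D point where the ray meets the plane, or None."""
    denom = dot(direction, plane_normal)
    if abs(denom) < eps:
        return None  # ray is parallel to the plane: no anchor point
    t = dot([p - o for p, o in zip(plane_point, origin)], plane_normal) / denom
    if t < 0:
        return None  # plane is behind the camera
    return tuple(o + t * d for o, d in zip(origin, direction))

# Camera at 1.5 m height looking diagonally down at a detected floor (y = 0).
hit = ray_plane_hit(origin=(0.0, 1.5, 0.0), direction=(0.0, -1.0, 1.0),
                    plane_point=(0.0, 0.0, 0.0), plane_normal=(0.0, 1.0, 0.0))
print(hit)  # (0.0, 0.0, 1.5)
```

The limitation discussed above follows directly from this model: if the world is reduced to flat planes, this is essentially the only spatial query available, which is far from the volumetric operations GIS analyses require.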

Tracking in XR: Concepts, Devices, Methods
Tracking the position, orientation, and pose of the user's head, hands, eyes, and body, as well as the position and orientation of display/control devices, is critically important in XR interfaces. Such tracking is vital to establishing spatial correspondence in virtual (or digitally enhanced) space. When tracking works well, the display can adapt to the user's perspective and respond to interactions that are actively or passively triggered by the user, making the experience more believable. Dysfunctional tracking interrupts the experience and the sense of immersion. As Billinghurst et al. (2008) [57] posited, tracking accuracy is much more important for MR than for VR. In MR, spatial registration is critical, whereas in VR, the relative positions of the virtual objects are in the same reference frame and users do not have the same levels of mobility, thus tracking is simpler. Furthermore, in VR, imperfections in spatiotemporal tracking accuracy may go unnoticed or be corrected by the perceptual system (e.g., a small delay or a small mismatch in geometry might not be a threat to the experience).
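This asymmetry between MR and VR can be quantified with a back-of-envelope calculation: an angular registration error of θ displaces a virtual object anchored at distance d by roughly d·tan(θ). A minimal sketch (the function name is illustrative):

```python
# Back-of-envelope check of why small angular tracking errors matter in MR:
# a registration error of theta degrees displaces a virtual object anchored
# d meters away by roughly d * tan(theta).
import math

def registration_offset(distance_m: float, error_deg: float) -> float:
    """Apparent displacement (in meters) of an anchored virtual object."""
    return distance_m * math.tan(math.radians(error_deg))

# A 1-degree heading error misplaces an annotation 2 m away by ~3.5 cm,
# but an annotation on a building 100 m away by ~1.75 m.
print(round(registration_offset(2, 1), 3))    # 0.035
print(round(registration_offset(100, 1), 2))  # 1.75
```

The same 1-degree error that is barely noticeable on a nearby virtual object visibly detaches a distant annotation from its real-world referent, which is why outdoor MR places much stricter demands on tracking than room-scale VR.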
Head tracking is typically integrated in XR systems to monitor the orientation of the head and adapt the contents of a user's field of view accordingly [58]. An important part of the tracking system is the inertial measurement unit (IMU) [57]. IMUs are not typically integrated in low-cost HMDs; however, it is possible to benefit from the IMUs integrated in smartphones, which can be used with such HMDs. For example, Vuforia [59] accesses this information to enable extended tracking features that allow the camera to drift from the target while maintaining virtual content. ARCore [54] also works in this way. Unlike VR tracking, which works in fully controlled environments (e.g., indoors, in a single physical room, or with a user wearing an HMD and moving a minimal amount), the tracking components for AR/MR displays should be able to work outdoors or in large indoor spaces. In the scope of AR/MR technologies, there are broadly two types of tracking [60]: image-based and location-based (position tracking). Image-based tracking allows virtual content to be displayed after the system camera detects a predefined unique visual marker [58]. Location-based AR/MR technologies allow the user to view virtual content registered in real-world space using a wireless network or the Global Navigation Satellite System (GNSS). When needed, one can track the entire body of a person, which enables a versatile set of new HCI paradigms [61]. While whole-body interaction remains experimental, two specific paradigms emerge as important directions in interacting with XR: eye tracking and hand tracking. Eye tracking is interesting because it enables not only interaction [62], but possibly a perceptual optimization of the display [63]. Gaze-based input can be complicated by the so-called "Midas touch" and a family of related issues (e.g., unintended commands given by stares or blinks) [64]. Nonetheless, with careful interaction design and user training, people can even type using their eyes [65,66]. Hand tracking is also interesting because in a 3D user interface (3DUI), the standard mouse-and-keyboard interactions offer limited help or no help at all, and the most natural behavior for humans is to reach and touch things [67]. Tracking hands and fingers in XR systems enables (potentially) intuitive interaction and contributes to the usability of these systems in unprecedented ways. We elaborate on control devices and paradigms in the following section.
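As a sketch of what location-based registration involves, the snippet below converts a GNSS fix into local east/north offsets in meters using a first-order equirectangular approximation, which is adequate only over short distances. Production systems use proper geodetic libraries and datums; the function name and sample coordinates here are illustrative only:

```python
# First-order sketch of how location-based AR can convert a GNSS fix into
# local east/north offsets (meters) for anchoring virtual content. Uses an
# equirectangular approximation on a spherical Earth model.
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (spherical approximation)

def gnss_to_local(ref_lat, ref_lon, lat, lon):
    """Offset of (lat, lon) from the reference point, in meters (east, north)."""
    d_lat = math.radians(lat - ref_lat)
    d_lon = math.radians(lon - ref_lon)
    north = d_lat * EARTH_RADIUS_M
    east = d_lon * EARTH_RADIUS_M * math.cos(math.radians(ref_lat))
    return east, north

# A point ~111 m north of the reference (0.001 degrees of latitude).
east, north = gnss_to_local(47.0, 8.0, 47.001, 8.0)
print(round(east, 1), round(north, 1))  # 0.0 111.2
```

Note that consumer GNSS itself is only accurate to a few meters, so location-based registration is typically combined with visual or inertial tracking for the fine alignment.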

Control Devices and Paradigms for 3D User Interfaces
Arguably, when we go from a 2D to a 3D UI, the entire 'world' becomes the interface. For 3D UIs, Bowman et al. (2004) [68] suggest that in VR, user interaction should be controlled using devices with more than two degrees of freedom (DOF), and ideally, these devices should provide six DOF. A 3D mouse in the classical sense (one that rests on a planar surface) is perhaps sufficient for desktop VR applications or when the user is sitting (Figure 5, left). Hand-held controllers (Figure 5, middle and right) are moved freely in space, enabling standing and some movement by the user. Examples include the Nintendo WiiMote [69], which supports three DOF (movement in three dimensions, no rotation), or the HTC VIVE's [50] and Oculus Touch's [70] controllers, which provide six-DOF tracking. The number of necessary DOFs depends on the purpose of interaction [71]. Desktop input devices such as a keyboard, mouse, or touch interactions offer only two DOF, which might be sufficient in some cases (e.g., desktop VR or similar) [68]. Aside from the above-mentioned dedicated hardware, combined hardware- and software-based tracking paradigms are used in controlling the virtual world and the objects within it using the body, eyes, and hands [29,62,72-74]. Current research challenges regarding head/hand/eye tracking are partly technical, such as increasing spatiotemporal accuracy and precision, making the algorithms efficient for real-time response, and making the hardware lighter and more seamlessly integrated in XR setups. However, there are also important challenges in design and user experience, such as finding the right combination of interaction paradigms that fit user needs, establishing perceptually tolerable lags, and the ergonomics of the hardware (see the sections on Design and Human Factors).
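The practical difference between three and six DOF can be made concrete: once orientation is tracked, a point (or ray) defined in the controller's local frame can be mapped correctly into world space, which position-only tracking cannot do. A simplified, yaw-only sketch (real systems use quaternions for full three-axis rotation; the function name and values are illustrative):

```python
import math

# Sketch: mapping a controller-local point to world coordinates using a
# 6-DOF pose, here reduced to position + yaw (rotation about the vertical
# axis) for clarity.

def pose_transform(position, yaw_deg, local_point):
    """Map a point from controller-local coordinates to world coordinates."""
    a = math.radians(yaw_deg)
    x, y, z = local_point
    # rotate about the vertical (y) axis, then translate by the position
    wx = math.cos(a) * x + math.sin(a) * z
    wz = -math.sin(a) * x + math.cos(a) * z
    return (position[0] + wx, position[1] + y, position[2] + wz)
```

With a 3-DOF (position-only) device, the rotation step above is unavailable, so any tool or pointing ray attached to the controller cannot follow the user's wrist orientation.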

Visual Realism, Level of Detail, and Graphics Processing
Another technology concern from a visualization perspective, with close links to design and user experience, is the tension between realism and abstraction in displays. This is a persistent research challenge in cartography and GIScience [75]. Conceptually, most, if not all, VR displays aspire to attain a high level of visual realism (the "R" in XR is for "reality", after all), and current systems are getting very good at (photo)realism [76]. However, there are many (intentionally) abstract-looking and/or fictitious virtual objects as well as low-resolution virtual worlds. This is partly due to the fact that the data volume of fully photorealistic virtual worlds remains an issue to process, render, and transmit [77]. Thus, we still need sophisticated level-of-detail (LOD) and level-of-realism (LOR) approaches to render and stream data efficiently. XR displays can be personalized, for example, based on user logging to model patterns in user behavior [78], or on eye tracking input in applications such as foveated gaze-contingent displays (GCDs) [63,79,80]. We examine GCDs further in the Interaction Design subsection. In the case of AR and MR, the visual realism discussion differs from that of VR. Using see-through displays means that the user always partially sees the real world [22,23]. With AR and MR, the issues of visual realism (such as occluding parts of the real world and the use of transparency) are linked more closely to human factors and design than to technology, thus they will be elaborated in those sections.
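As a toy illustration of the foveated/GCD idea, a renderer might choose an LOD band by the angular distance between an object and the current gaze point, spending full detail only in the fovea. The band thresholds below are illustrative placeholders, not perceptually validated values:

```python
import math

# Sketch: gaze-contingent level-of-detail (LOD) selection. Objects near
# the gaze point get the finest LOD (index 0); detail coarsens with
# eccentricity. Band limits are in degrees of visual angle (illustrative).

def lod_for_eccentricity(gaze_deg, object_deg,
                         bands=((2.0, 0), (10.0, 1), (30.0, 2))):
    """gaze_deg, object_deg: (azimuth, elevation) angles in degrees.
    Returns an LOD index, 0 = finest."""
    ecc = math.hypot(object_deg[0] - gaze_deg[0],
                     object_deg[1] - gaze_deg[1])
    for limit, lod in bands:
        if ecc <= limit:
            return lod
    return bands[-1][1] + 1   # beyond all bands: coarsest LOD
```

A real GCD would also smooth transitions between bands and account for eye-tracker latency, both of which are active research questions.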
Additionally, related to data handling, the hardware for the graphics rendering unit of the display system is critical for creating successful XR applications. Inefficient rendering due to low processing power may lead to challenges in maintaining the illusion of reality, since visual artifacts and delays in rendering are not something that we experience with our natural visual processing. Refresh rates, rendering power, and screen resolution requirements must be customized to the project because, in the 'unlimited' space of an XR display, dramatically more pixels are rendered than in standard displays. At the same time, this higher resolution must be displayed at lower latencies to preserve the continuous illusion of reality. These two requirements place high demands on operational memory (RAM), the processor (CPU), and the graphics card (GPU). Computer assemblies currently use operating memory with a capacity of tens of gigabytes (GB), multi-core processors (e.g., Intel and AMD), and high-end graphics cards (e.g., the NVidia GeForce RTX 2080 or AMD Radeon VII). It is also important that all the calculations and rendering take place quickly, ideally in real time. Cummings and Bailenson (2016) [32] report that, albeit based on a small sample size, the rendering rates of VR systems have a demonstrable impact on the level of presence conveyed by VR applications.
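The demands can be illustrated with back-of-envelope arithmetic: pixel throughput is width × height × number of eyes × refresh rate. The device figures below are indicative examples only (a 1080p/60 Hz monitor vs. a dual 1440×1600 panel at 90 Hz, a common HMD class):

```python
# Back-of-envelope sketch of why XR rendering is demanding:
# pixels per second = width * height * eyes * refresh_hz.

def pixel_throughput(width, height, refresh_hz, eyes=1):
    return width * height * eyes * refresh_hz

desktop = pixel_throughput(1920, 1080, 60)       # a standard monitor
hmd = pixel_throughput(1440, 1600, 90, eyes=2)   # dual per-eye VR panels

# The HMD must push roughly 3.3x the pixels per second of the monitor,
# and do so at a latency low enough to keep the illusion of reality stable.
ratio = hmd / desktop
```

Note that this understates the true gap: HMDs typically render at a supersampled resolution above the panel's native resolution to compensate for lens distortion.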

Bottlenecks in 3D Reconstruction and the Emerging Role of Artificial Intelligence in Automation
Another important technical issue in the XR discourse is virtual content creation. Currently, building a realistic-looking and geometrically correct 3D model of a physical object involves a relatively complex set of operations [81-83]. Input data are typically one (or a mixture) of the various forms of imaging and scanning, such as photography, radar/LiDAR, ultrasound, and tomography. The subsequent steps involve intense computational power and careful manual work using a multitude of software tools. In this process, there are bottlenecks, including the segmentation of imagery, subsequent topological problems, the construction of solid surfaces and polygonal models, data optimization, and the inclusion of 'physics' for motion (animation, interactivity) [81-83]. Some of these steps can be automated, or at least performed semi-automatically. XR systems are in synergy today with artificial intelligence (AI) systems, specifically machine (and deep) learning. Recent XR frameworks [84] use AI for spatial computing (i.e., interpreting the physical scene in MR [6]), and for improving features such as predicting and increasing stability against motion sickness [85], tracking, and gesture recognition [86], among many others. As such, XR and AI research are inexorably intertwined. AI can learn from the data collected through XR interactions and technologies, while simultaneously, XR experiences can be improved through subsequent AI developments.
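One of the automatable 'data optimization' steps above can be sketched with vertex clustering, a deliberately simple mesh-decimation scheme: vertices are snapped to a coarse grid and those sharing a cell are merged into one representative vertex. Production pipelines typically use more sophisticated methods (e.g., quadric error metrics); this is only a minimal illustration:

```python
# Sketch: vertex clustering, a simple mesh-decimation step. Vertices
# falling in the same grid cell of size `cell` are merged; each cluster
# is replaced by the average of its members.

def cluster_vertices(vertices, cell=1.0):
    """vertices: list of (x, y, z) tuples.
    Returns (new_vertices, index_map), where index_map[i] gives the
    index of the cluster that original vertex i was merged into."""
    cells = {}       # grid cell -> cluster index
    index_map = []
    for x, y, z in vertices:
        key = (int(x // cell), int(y // cell), int(z // cell))
        if key not in cells:
            cells[key] = len(cells)
        index_map.append(cells[key])
    # representative vertex per cluster: the average of its members
    sums = [[0.0, 0.0, 0.0, 0] for _ in cells]
    for (x, y, z), i in zip(vertices, index_map):
        s = sums[i]
        s[0] += x; s[1] += y; s[2] += z; s[3] += 1
    new_vertices = [(s[0] / s[3], s[1] / s[3], s[2] / s[3]) for s in sums]
    return new_vertices, index_map
```

In a full pipeline, the `index_map` would then be used to rewrite the face list, discarding degenerate faces, which is where the topological problems mentioned above tend to surface.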

Research Priorities in XR Technology
While XR technologies have made unprecedented progress in recent years, they are far from ready to replace our everyday display systems. Below, we list the broader research challenges and priorities, as identified in the literature, that stand in the way of XR becoming commonplace.

•
Improved visual quality and more efficient rendering.

•
Support for human factors research: Sophisticated/targeted, configurable, easy-to-use 'metaware' for observing the system and the user, linking with technologies and methods beneficial in user testing (user logging, eye tracking, psychophysiological measurements, etc.), is necessary to inform technical developments and to improve the user experience.

•
More effort in automated content creation with machine learning and AI, and more meaningful content, are needed. Currently, XR content is predominantly created by technologists, with some interaction with 3D artists (though not always). If XR becomes the 'new smartphone', the content needs to be interesting and relevant to more people beyond specialists.

Example: An Imagined Lunar Virtual Reality
The spaces (or times) that we cannot easily visit make VR case studies especially meaningful. If a place is not reachable or very hard to reach, a VR experience can be the 'next best thing'. On the other hand, such places or times present challenges for creating realistic experiences or representations due to the lack of access to 'ground truth'. In such cases, research teams use both authentic space science data to populate virtual worlds and interactive artistic renderings of hypothesized environments. If we take the Earth's Moon, a first-hand experience is prohibitively complex, but one can use terrain models (e.g., see Figure 6), remote sensing technology, and additional sensor data to create a VR experience [87]. A lunar VR would be useful as an educational tool in schools [88] or in science exhibitions, and could serve as a laboratory in which one can conduct experiments (e.g., to explore the Moon's potential as a space station [89-93]). Using recorded observations embedded in the VR, scientists can examine and theorize about complex processes on the Moon. For example, it would be possible to explore lunar physics and chemistry, or how gravity might affect human interactions with the lunar environment. Some of these ideas have been explored by several teams for a long time; for example, Loftin (1996) and colleagues [94] built fully explorable VR simulators of the International Space Station (ISS) long before it was built, to verify ergonomics, train astronauts, and anticipate operational challenges.
A 'digital moon' vision (as in digital earth) could feature a replica of the regional lunar landscapes. However, experiences should be produced by simulating approximately one-sixth of the Earth's gravity. Virtual flight suits (e.g., using a VR body suit) could be designed to support a realistic experience for users. An in-depth exploration of the available knowledge about the chemistry, physics, and overall habitability of the Moon could be facilitated through VEs, and one could create such a virtual experience for any planetary body, as similar goals would be relevant. An interesting example related to lunar VR is the Moon Trek from NASA's Jet Propulsion Laboratory (JPL), which can be explored at https://trek.nasa.gov/moon/.
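The difference such a simulation must convey can be quantified with elementary projectile motion: under lunar gravity (about 1.62 m/s²), a tossed object stays aloft roughly six times longer than on Earth (about 9.81 m/s²), a difference a user would notice immediately in an immersive environment. A small sketch:

```python
# Sketch: the kind of physics a lunar VR must simulate. Time of flight
# for a projectile launched straight up at v0 and returning to launch
# height is t = 2 * v0 / g.

def time_of_flight(v0, g):
    """Seconds aloft for launch speed v0 (m/s) under gravity g (m/s^2)."""
    return 2 * v0 / g

earth = time_of_flight(3.0, 9.81)  # ~0.61 s on Earth
moon = time_of_flight(3.0, 1.62)   # ~3.70 s on the Moon
```

The ratio of the two times equals the inverse ratio of the gravities (about 6.1), which is exactly the kind of quantitative relationship a lunar VR 'laboratory' could let students discover experimentally.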

Design
In XR discourse, one can design different aspects of a system or user experience, such as the visual display, sounds, interactions with the system, and the story itself. In making such design decisions, introspection-based methods such as cognitive walk-throughs [95], and including users through user-centered design (UCD) cycles [96], are important. In this section, we review the design of XR systems in the subsections on visualization design and interaction design, followed by identified challenges and examples. Even though visualization and interaction design are difficult to entirely separate in XR, because visualization can be a means to interaction, we separate them here because they are usually treated as different research foci, and the body of knowledge comes from specialized communities.


Visualization Design
In general, visualization communities have 'a complicated relationship' with 3D in representation (i.e., opinions and evidence are mixed on whether 3D is good or bad [97]). When 3D is used in plots, maps, and other graphics (i.e., in information or data visualization), it is often considered a bad idea, and there is empirical evidence to support this position [98,99]. 3D can introduce visual clutter, especially if the third dimension is superfluous to the task [100,101], and can bring additional cognitive complexity to users interacting with maps or plots [98,100,101]. Furthermore, if the 3D representation is interactive or animated, and the task requires remembering what was shown and making comparisons from memory, this can increase the number of errors people make in visuospatial tasks [98,101]. Distance and size estimations can also be harder with 3D visualizations because scale is non-uniform over space due to the perspective effect [97]. On the other hand, if the goal requires more holistic information processing, such as identifying phenomena, naming objects, gist recognition, or scene interpretation, 3D can help [102]. This is arguably because 3D provides a human-recognizable quality (i.e., in most cases, 3D objects and scenes resemble what they represent more closely than 2D ones do [97]). While using 3D in information visualization (i.e., infovis) is debated, we argue that the goals of XR are fundamentally different from those of infovis (except immersive analytics, which combines infovis and XR [30]). With XR, the goal is to create experiences that should compare to real-world experiences or enhance them. Thus, in XR, 3D usually means stereoscopic 3D, which mimics natural human depth perception. Stereoscopic 3D is an entirely different experience than the quasi-3D representations that exploit (combinations of) monoscopic depth cues such as occlusion, perspective distortion, shading, etc.
[103]. There are also known issues with stereoscopic viewing (e.g., some people cannot see stereo, and long-term use can create discomfort in various ways [104]); nonetheless, there is strong empirical evidence that stereoscopic 3D displays improve performance in a number of visuospatial tasks [102]. Thus, one should be cautious in transferring the standard visualization principles to XR displays. Nonetheless, Bertin's visual variables (i.e., position, size, shape, value, color, orientation, and texture) and semiology principles [105], or the idea of marks and channels [106], remain broadly relevant to XR, even though XR requires thinking of additional variables such as the camera angle [107]; the position of the light source for color, shadow, and time perception; the levels of realism used in textures; etc. [29,58]. Gestalt principles (i.e., figure/ground, proximity, similarity, symmetry, connectedness, continuity, closure, common fate, transparency [108]) may also be important to consider in XR (e.g., in storytelling, for presenting fictional worlds/objects, or for grouping any overlain menu items for information or interaction purposes [109]). Aside from these two fundamental design theories for visualization (i.e., Bertin's visual variables and the Gestalt laws), a combination of principles from generalization in cartography and from LOD management in computer graphics is useful in making design decisions regarding XR displays. Generalization and LOD management are important for the computational efficiency, perceptual fidelity, and semantics of a display. They guide the designer in decisions on what to include and what to remove, and what to simplify, understate, or highlight. In dynamic displays, temporal decisions are also important (i.e., deciding when to include what). These decisions determine the visual experience of the viewer. LOD management enables assessing, for example, when a detail may be perceptually irrelevant, or where the thresholds are for visual quality vs. computational efficiency [79], whereas cartographic generalization teaches us that we can sometimes sacrifice precision in favor of clarity or legibility (displacement operations), or consider what may be important for a larger group of people (hospitals, schools, other public buildings) when deciding which labels or features to retain as the scale changes, or which important landmarks to keep in view. There is no perfect LOD or generalization solution that responds to all needs, but designers work with heuristics, expert knowledge and, when available, empirical evidence. Based on the considerations above, a scene is organized and rendered using a particular illumination direction, with a certain LOD, photorealism, or transparency (against occlusion, a persistent challenge with 3D). These are some of the many design decisions made when preparing a virtual object or scene, and they require careful thinking. Aside from 3D, level of realism is an interesting visualization consideration in XR. Since XR aspires to create realistic experiences, it may seem clear that the XR scene should be rendered in high visual fidelity. The majority of technology-driven research so far has assumed that the only reason to render a scene in low fidelity is resource constraints, and that barring those, one would aspire to the highest possible visual fidelity. This is a debatable position: if used as a 'blanket assumption', it negates the principles of cartographic generalization (i.e., there are many benefits to abstraction). There are also good reasons to think about the levels of detail and realism in XR from the perspective of human cognition (see the Human Factors section).
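The generalization logic described above (retain important features as the 'scale' coarsens) can be caricatured in a few lines. Here, viewing distance stands in for scale, and the importance scores and threshold rule are purely illustrative, not from any cited generalization model:

```python
# Sketch: scale-dependent feature selection, a caricature of cartographic
# generalization. A feature survives if its importance clears a threshold
# that rises with viewing distance (farther away -> stricter selection).

def select_features(features, view_distance_m):
    """features: list of (name, importance in 0..1).
    Returns the names of features retained at this viewing distance."""
    threshold = min(0.9, view_distance_m / 1000.0)  # illustrative rule
    return [name for name, imp in features if imp >= threshold]

# Points of interest with hand-assigned importance scores (hypothetical).
pois = [("hospital", 0.9), ("school", 0.7), ("kiosk", 0.2)]
```

Close up, everything is shown; from afar, only the hospital survives, mirroring the idea that public buildings and landmarks are kept in view while minor features are dropped.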

Interaction Design
In traditional devices, keyboard-, mouse-, or touch-based interactions have matured and function reasonably well, meeting the demands of contemporary GIScience software. However, these interaction modalities, and classic metaphors such as the WIMP (Windows, Icons, Menus, Pointer) paradigm, do not work well for XR. In XR, ideally, one is in a virtual world (VR), or the virtual objects are mixed with the real world (AR/MR). This means that users want to walk, reach, grab, and move objects using their hands as they would in the real world. How, then, should one design interaction in XR? Which interaction modalities would be better for XR? Voice interaction is technically promising because of the recent developments in natural language processing (NLP) and machine learning [110]. However, more research is needed on lesser-studied languages and on the lack of privacy. Recognition of hand gestures and isolation of precise finger movements also seem promising; even though not exactly intuitive, they are learnable (like one would learn sign language). Additionally, the so-called gorilla arm effect (i.e., arms, neck, and back getting tired because the arms are extended out) creates ergonomics concerns [111] (also see the Human Factors section). For hands, earlier attempts involved wearing a glove [112], which is not ideal, but remains interesting for haptic feedback along with body suits [113,114]. With developments that make them more precise and less cumbersome to use, they may become a part of future wearables. Gaze-based interaction is also an 'almost viable' possibility in XR, based on eye tracking [115] or head tracking (i.e., head gaze) [116]. Eye tracking devices supported by near-infrared sensors work well, and webcam-based eye tracking is improving [117,118]. However, as previously discussed, gaze interaction remains a challenge at the intersection of technology, design, and human factors because it has several drawbacks as an input modality. Other efforts on recognizing facial expressions and gestures such as nodding as input exist, although they are mostly experimental [119-121]. Hand-held controllers (Figure 5) are the most common current input devices, but text input is difficult with them. Walking or other movement (locomotion) in XR also remains immature; current experimental modalities involve treadmills and full-body suits. Of course, one does not have to use a single interaction modality, but can combine several as a part of the user experience design. Depending on the goal of the project, designing a user's experience with an XR system requires designing a story (e.g., using storyboards or similar) and considering what type of interaction may be necessary and useful in the course of the story (note that parts of this paragraph were inspired by a Twitter thread by Antti Oulasvirta on current interaction problems in XR, with commentary and links to relevant literature: https://twitter.com/oulasvirta/status/1103298711382380545).
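As a concrete (and heavily simplified) hand-tracking example, a pinch gesture can be detected from the distance between two tracked fingertip positions. The 2 cm threshold is an assumed value for illustration, not a validated one, and real systems add hysteresis and temporal smoothing:

```python
import math

# Sketch: minimal pinch detection from hand-tracking landmarks. Assumes
# the tracker delivers 3D positions (in meters) for the thumb tip and
# index fingertip, as typical hand-tracking APIs do.

def is_pinching(thumb_tip, index_tip, threshold_m=0.02):
    """True if the thumb and index fingertips are closer than the
    (illustrative) 2 cm pinch threshold."""
    return math.dist(thumb_tip, index_tip) < threshold_m
```

Even this toy example shows why hand input is design work, not just tracking: the threshold, debouncing, and what a pinch *means* in context all have to be chosen deliberately.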
Another ongoing research challenge at the intersection of interaction and visualization is how to design an XR experience for more than one person (collaborative XR). Enabling multiple simultaneous users to communicate and collaborate is useful in many applications (e.g., see Section 4.3). Collaborative scenarios require design reconsiderations both for visualization (e.g., who should see what and when; can one share a view, point at objects, or show the other person(s) something) and for interaction (e.g., can two people have eye contact, can one user hand something to another user, can two people carry a virtual table together). These are not trivial challenges, and proposed solutions are largely experimental [122,123]. Ideally, in collaborative XR, people should (1) experience the presence of others (e.g., using avatars of the full body, or parts of the body such as hands) [124,125]; (2) be able to detect the gaze direction of others [126] and, eventually, experience 'eye contact' [127]; (3) have on-demand access to what the others see ('shared field of view') [128,129]; (4) be able to share spatial context [123], especially in the case of remote collaboration (i.e., does it 'rain or shine' in one person's location, are they on the move, is it dark or light, are they looking at a water body?); (5) be able to use virtual gestures (handshake, wave, nod, other nonverbal communication) [129,130]; (6) be able to add proper annotations to scenes and objects and see others' annotations; and, last but not least, (7) be able to 'read' the emotional reactions of their collaboration partner [131]. To respond to these needs, a common paradigm currently used in the human-computer interaction (HCI) community for collaborative XR is so-called awareness cues [132], i.e., various visual elements added to the scene to signal what the other parties are doing (e.g., a representation of their hands) or what they are looking at (e.g., a cursor that shows their gaze) [122].
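A minimal sketch of what an awareness-cue update might carry, so peers can render an avatar and a gaze cursor for each remote user. The message fields and function name are hypothetical, not taken from any specific collaborative XR framework:

```python
import json

# Sketch: an "awareness cue" update for collaborative XR. Each client
# periodically broadcasts where its user is and what they are looking at;
# peers use this to render avatars and gaze cursors. Field names are
# hypothetical placeholders.

def make_awareness_update(user_id, head_pos, gaze_target, hand_pos=None):
    msg = {
        "type": "awareness",
        "user": user_id,
        "head": list(head_pos),        # world-space head position (m)
        "gaze_target": gaze_target,    # id of the object under the gaze
        "hands": list(hand_pos) if hand_pos else None,
    }
    return json.dumps(msg)
```

In practice such updates would be sent at a fixed rate over a low-latency channel, and the receiving client would interpolate between them to animate the remote avatar smoothly.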

Research Priorities in Extended Reality (XR) Design
Designing XR content has many dimensions to consider. First, when we break down XR into VR/AR/MR, we have different problems to solve. Then, the goal of the project matters: when designing for entertainment vs. for a GIScience project, there will be different constraints on the liberties designers can take. Below, we provide a brief bullet list of broader open research directions:

•
More theoretical work that is not only technical, but focuses on "why" questions besides "how" questions, is needed for the design of XR content. Currently, a distilled set of visualization and interaction principles for XR, informed by empirical evidence, meta reviews, and philosophical and social theories, including ethics, is lacking.

•
More mature and better-tested HCI concepts and new alternatives are needed (e.g., hand- or gaze-based interaction should be carefully researched). Since hands and eyes have more than one function, they introduce complications in interaction such as the Midas touch; thus, one must design when to enable or disable hand or eye tracking in an XR system.

•
Collaborative XR interaction concepts, such as shared cues, gaze-based interaction, eye contact, handshakes, and other non-verbal interactions, should be further explored.

•
We need to develop both functional and engaging visual content, and spend more effort in bridging different disciplines (e.g., creative arts, psychology, and technology).

Examples
Example 1: Designing XR for 'In Situ' Use Cases
Here, we present two in situ MR examples. The first application allows the users to manipulate the urban design by 'removing' existing buildings and adding virtual ones [133] (Figure 7).
ISPRS Int. J. Geo-Inf. 2020, 9, x FOR PEER REVIEW

To mask the existing building, one must project new, altered perspectives. Dense mobile mapping enables identifying what is behind the removed object, allowing the application to occlude the building with what exists behind it. A key challenge here is to ensure that users comprehend the differences between reality and virtuality without being confused by the representation or interaction artifacts (i.e., new design paradigms are needed when a 'visualization' contains physical and virtual objects). For example, how realistic should the virtual portion be? Is it OK if one cannot tell the difference between virtual and real in an MR experience? Should we consider marking the virtual objects to ensure that the viewer knows that they are looking at a 'fake'? How should we handle occlusion, which in some cases is desirable and in some cases is an obstacle?

Example 2: Walking through Time
In the second MR example, when a user walks, the application superimposes old photographs in their original locations. As the system tracks and interprets the real-world features using computer vision and photogrammetry, the user is presented with a choice between today and a century ago (Figure 8). An accurate 3D city model with detailed street-level data is necessary to spatially register the historical data in their original location. This process recreates a historical street using old photographs as textures. After these offline processing steps, the MR experience can begin. Accurate geolocalization is critical to project the historical images exactly on the real buildings and roads. A number of design questions also emerge in this case: how should we handle uncertainties about the historical material (i.e., if real photographs and 'artistic' interpretations are mixed, for example, to fill in the gaps, how should the viewer be informed)? How should we avoid the risks introduced by occlusion; would transparency work, or would it come at the cost of immersion? Are there interaction and visualization design solutions that allow us to give reality another 'skin' in an immersive yet safe manner? These two experiments also raise issues pertaining to the accuracy of the (geo)localization. Centimetric or millimetric precision, and highly detailed 3D models, might not always be necessary, but this MR example clearly redefines how accurate and precise the locations, and how detailed the 3D models, need to be in order to effectively achieve the illusion of virtual content blended with reality.
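The registration requirement can be made tangible with simple trigonometry: a heading (orientation) error of θ shifts an overlay on a facade at distance d by roughly d·tan(θ). The values below are illustrative, not measurements from the described system:

```python
import math

# Back-of-envelope sketch: how far does a virtual overlay drift on a
# building facade for a given heading error?  offset = d * tan(theta).

def overlay_offset_m(distance_m, heading_error_deg):
    """Lateral drift (m) of an overlay at `distance_m` for a given
    heading error in degrees."""
    return distance_m * math.tan(math.radians(heading_error_deg))

# At 20 m from a facade, a 1-degree heading error shifts the overlay
# about 35 cm; a 5-degree error shifts it about 1.75 m -- more than
# enough to break the illusion of a photograph 'pinned' to a building.
```

This is why consumer GNSS alone is insufficient for such MR experiences, and why the visual registration against a detailed 3D city model described above is needed.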

Human Factors
In this paper, we use the term human factors in its broadest sense, covering all human-centric aspects in the XR discourse, ranging from usability to ethics. It is important to note that, in the HCI literature, human factors is often used as a synonym for ergonomics. The International Ergonomics Association (IEA) defines ergonomics as "…the scientific discipline concerned with the understanding of interactions among humans and other elements of a system, and the profession that applies theory, principles, data, and other methods to design in order to optimize human well-being and overall system performance" [134]. In the XR discourse, human factors research is commonly conducted at the intersection of the HCI and cognitive science domains (e.g., [13,29,46,58,98,135–137]). HCI researchers try to improve technology and design by understanding humans better [46,97,138], whereas cognitive science researchers utilize technology to understand humans better [139,140]. At the intersection of these two domains, several HCI questions connected to the perceptual and cognitive processing of visuospatial information arise, including:

• Which human factors must be considered as we transition from hand-held devices to hands-free, computer-enabled glasses in applications of immersive XR, especially in spatial sciences?
• How are humans impacted by XR in the near and long term? As the lines between real and virtual get blurry, and our visuospatial references are no longer reliable, would there be existential consequences of using XR everywhere; for example, the questions we touched on in the Introduction regarding developmental-age children and loss of object permanence? What are the direct ethical issues regarding political power, commercial intent, and other possible areas where these technologies can be used for the exploitation of vulnerable populations? Some of these questions are inherently geopolitical, and thus also in the scope of GIScience research.
• Do adverse human reactions (e.g., nausea, motion sickness) outweigh the novelty and excitement of XR (specifically, VR), and are there good solutions to these problems? Does one gradually build tolerance to extended immersion? Does it take time for our brains to recover after extended XR sessions? If yes, is this age-dependent?
• Why do some 'jump at the opportunity' while others decline to even try out the headset? How do individual and group differences based on attitude, abilities, age, and experience affect the future of information access? Can we create inclusive XR displays for spatial sciences?
• How do we best measure and respond to issues stemming from the lack of or deficiency in perceptual abilities (e.g., binocular vision, color vision), and control for cognitive load?
Many of the questions above are not trivial and do not have straightforward answers. Below, we provide a literature review that touches upon some of these questions, and revisit some that were brought up in earlier sections, this time from a human-centric perspective.

State of the Art and Trends in Human Factors for XR
Consider that the full XR experience includes taking users' emotions and comfort into account, which are possibly the biggest barriers against the widespread adoption of XR devices [141]. Important human factors for XR can be distilled into the following six [142] categories, amongst the 20 originally identified by the Human Factors and Ergonomics Society (HFES) for wearable devices [143]: aesthetics (appeal, desirability), comfort (prolonged use, temperature, texture, shape, weight, tightness), contextual awareness (perceived comfort depends on the context), customization (fit all sizes and shapes, provide options for style, e.g., color, general appearance), ease of use (clunky hardware), and overload (e.g., cognitive abilities). At the least, these six human factors should be considered when evaluating wearable XR devices. The IEA highlights three areas of particular interest to XR systems: physical, cognitive, and organizational ergonomics [134]. Organizational ergonomics deals with the optimization of processes such as management, teams, policy, etc., which may be relevant to XR from the perspective of a decision maker's openness to innovation and adopting new systems, but these aspects of ergonomics are beyond the scope of this paper. Physical ergonomics concerns human anatomy, physiology, and biomechanical characteristics. For example, a heavy headset may seem undesirable, but comfort depends on multiple factors (e.g., a headset weighing under 500 grams, such as the Oculus Rift, can cause motion sickness due to the extra degrees of motion, whereas a heavier one such as the SONY PlayStation VR might feel comfortable if it is well designed for human physical features and rests against the crown of one's head [144]). Cognitive and perceptual factors that are important in XR constitute a large list. For example, a portion of the population cannot see stereoscopically (~20%) or has color deficiencies (~8% for men) [104], might have imperfect motor skills, hearing, or other issues due to aging or disabilities [137], or is still at a developmental age where strain from the accommodation-convergence conflict may have different implications. These psychophysical concerns and other sensory issues cannot be ignored, given that XR devices should be accessible to all. A cross-cutting concept that is relevant to all of the above is cognitive load [145] (i.e., people have more trouble processing or remembering things if there is too much information [46,138,146] or too much interaction [147]). For example, in visual realism studies, it has been demonstrated that a selective representation of photo textures (a 'reduced realism') informed by navigation theories supports tasks such as route learning better than fully photorealistic alternatives [46,138,147]. Such studies suggest that there are perceptual and cognitive reasons to design a virtual scene with lower fidelity. Similarly, if we take perceptual complexity as a topic [44,108,148,149], gaze-contingent displays (GCDs) may help reduce cognitive load by reducing the overall entropy [63,79,80]. A perceptual argument at the intersection of visual complexity and cognitive load is that the human visual system does not process the information in the visual field in a uniform fashion; thus there are abundant "perceptually irrelevant" details that one can remove or manage based on gaze or mouse input [80]. Both using realism selectively and GCDs discard some information from the visual display. Even though they help improve human performance in some tasks, one may or may not have the liberty to discard information (e.g., in cases such as crime scene documentation and restoration, full fidelity to the original scene/object may be very important). Furthermore, cognitive load can also be viewed through the lens of individual and group differences (i.e., what is difficult for one person can be easy for another). Factors such as expertise [149], spatial abilities [150,151], age [152], and technology exposure such as video game experience all influence a person's performance with, and attitude toward, technology. Customization for targeted groups and personalization for the individual may thus be important. For example, key ergonomics concerns about stereoscopic displays stem from individual and group differences. Some of the discomfort with stereoscopic displays, such as nausea, motion sickness, or fatigue [153], might be alleviated by simulating depth of field realistically, possibly with personalized gaze-contingent solutions [104]. Another interesting conversation regarding the effect of XR on humans concerns questions linked to immersion and presence. Mostly presented as a positive aspect of XR, immersion and presence can also make the experience confusing (i.e., not being able to tell real from virtual). Questions related to the impact of XR on human cognition and ethics currently remain under-explored. However, there have been long-standing efforts, for example, to quantify and examine levels of immersion [28,32,154], which can be useful in framing the larger questions and conducting studies. Additionally, since current XR headsets have better technology and design than earlier generations, some of the negative human responses may have decreased [141,155]. Nonetheless, the perceived value and knowledge of the technology also matter [141]. Examining the use of XR in education [60,156,157], Bernardes et al. (2018) [158] developed the 3D Immersion and Geovisualization (3DIG) system by utilizing game engines for XR experiential learning. Below, we present an original user survey linked to the 3DIG.
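As a concrete (and deliberately simplified) illustration of the gaze-contingent idea discussed above, the sketch below picks a texture detail level from the angular distance between the gaze point and an object on screen. The thresholds and the pixels-per-degree constant are hypothetical placeholders, not perceptually calibrated values from the cited studies.

```python
import math

def detail_level(gaze_px, obj_px, px_per_deg=40.0):
    """Choose a rendering detail level from gaze eccentricity,
    measured in screen space and converted to visual degrees.
    Thresholds are illustrative, not perceptually calibrated."""
    dx = obj_px[0] - gaze_px[0]
    dy = obj_px[1] - gaze_px[1]
    ecc = math.hypot(dx, dy) / px_per_deg   # eccentricity in degrees
    if ecc < 2.0:       # ~foveal region: full photo-texture
        return "full"
    elif ecc < 10.0:    # parafoveal: reduced realism
        return "reduced"
    return "abstract"   # periphery: generic texture / low detail

detail_level((960, 540), (970, 540))   # near the gaze point -> "full"
detail_level((960, 540), (1460, 540))  # far periphery -> "abstract"
```

A production GCD would of course use the tracker's actual gaze-to-screen mapping and smooth transitions between levels, but the selection logic follows this pattern: discard "perceptually irrelevant" detail as eccentricity grows.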

An XR Knowledge Survey
In a study with 425 participants (41.7% male; age 18–20), we measured undergraduate students' knowledge of and previous exposure to XR technologies, and evaluated their experience as well as their perception of the value and potential uses of these technologies. 90.5% of the students were enrolled in an "Introduction to Physical Geography" class, and the remaining 9.5% were a mixture of geography and environmental science majors at different levels in their studies (i.e., not a homogeneous group, and thus not likely to skew the outcome in a particular direction). The majority reported little to no prior knowledge of AR/MR, whereas they were mildly familiar with VR (Figure 9). The greatest self-reported knowledge was for VR, which was below the midpoint (average score 2.6), followed by AR (1.7) and MR (1). These results were somewhat surprising, because we assumed that young men and women at the college level would have had opportunities to be exposed to XR technologies. Those who had experience with the technology primarily reported exposure through gaming. While self-reported measures can have biases, these results are in line with the interviews conducted by Speicher et al. (2019) [26], indicating a need for increased exposure to XR technologies in education and incorporation into existing GIScience educational curricula.
A research priority in education-related XR is generalizable assessments of whether people learn better with XR than with traditional (or alternative) methods, and if so, why this is the case. Current empirical studies have shown improved performance in many kinds of training: students scored better in quizzes after learning with XR, and trainees achieved good outcomes using XR in a variety of domains including geospatial learning, navigation, cognitive training, surgical training, and learning how to operate machines (flight or driving simulators) [102,138,156,159–165]. More controlled laboratory experiments that test various factors and approaches in combination, and meta-analyses that connect the dots, would guide the field in the right direction in both applied and fundamental sciences. A more in-depth treatment of human factors in XR is covered extensively in recent books, for example, by Jung and Dieck (2018) [166], Jerald (2016) [137], Aukstakalnis (2012) [167], and Peddie (2018) [21].

Research Priorities in Human Factors for XR
In addition to those highlighted above, there are many other human factors challenges that XR researchers need to address in the coming decade(s). Below, we provide a curated list of what we consider current research priorities:
• Theoretical frameworks, for example, based on meta-analyses, identifying key dimensions across relevant disciplines and systematic interdisciplinary knowledge syntheses.
• Customization and personalization that adapt visualization and interaction design to individual and group differences, and to the cognitive load in a given context. Eye strain, fatigue, and other discomfort can also be addressed with personalized solutions.
• Solutions or workarounds that enable people who may otherwise be marginalized in technology to participate, by finding ways to create accessible technology and content.
• Establishing a culture of proper user testing and reporting for the reproducibility and generalizability of findings in individual studies.
• Examining the potential of XR as 'virtual laboratories' for conducting scientific experiments on visuospatial subjects, for example, understanding navigation, visuospatial information processing, people's responses to particular visualization or interaction designs, exploratory studies of what may be under the oceans or above the skies, etc., and how XR as a tool may affect the scientific knowledge created in these experiments.
• Developing rules of thumb for practitioners regarding human perceptual and cognitive limits in relation to XR, including studies comparing display devices with different levels of physical immersion to investigate when immersion is beneficial and when it is not. Separate and standardized measures of physical vs. mental immersion are relevant in the current discourse.
• Rigorous studies examining collaborative XR. How should we design interaction and presence for more than one user? For example, in telepresence, can everyone share visuospatial references, such as via an XR version of 'screen sharing' to broadcast their view in real time to another person who can even interact with this view? Can handshakes and eye contact be simulated in a convincing manner? Can other non-visual features of the experience be simulated (e.g., temperature, tactile experiences, smells)?
• Importantly, sociopolitical and ethical issues in relation to tracking and logging people's movements, along with what they see or touch [11], should be carefully considered. Such tracking and logging would enable private information to be modeled and predicted, leaving vulnerable segments of the population defenseless. Ethical initiatives examining technology, design, and policy solutions concerning inclusion, privacy, and security are needed.

Examples: Urban Planning
In these examples, we demonstrate the use of XR in city planning. City planning has traditionally experienced low levels of adoption of emerging technology [168,169]. Spreadsheet software and geographical information systems (GIS) have had high levels of uptake, but more sophisticated planning support systems [170] did not find a strong audience due to a lack of usability [169]. Here, we briefly reflect on selected XR platforms and their utility to support city planning.
Envisioning future cities using VR: 3D virtual worlds (e.g., Alpha World of Active Worlds, CyberTown, Second Life, and Terf) have gained limited traction in city planning practices. Terf has been used for education [171], but technical issues handling building information modeling (BIM) objects at a city scale limit their utility in practice. 3D virtual globes (e.g., Google Earth, NASA Worldview and WorldWind, and Cesium) support a growing number of digital planning applications. For example, the Rapid Analytics Interactive Scenario Explorer (RAISE) toolkit (Figure 10) utilizes Cesium, allowing city planners to drag and drop new metro infrastructure into a 3D VE to better understand its impact on property prices and explore other 'what-if' scenarios [171,172].

Envisioning future cities using AR/MR: With the rapid developments in XR headsets has come a vast array of available data, and interfaces have become more sophisticated as both game engines (e.g., Unity and Unreal Engine) and application programming interfaces (APIs) stimulate the development of applied tools and services. Consequently, we are seeing a marked increase in practical products within the remit of the planning and architecture professions (and beyond). Examples of these products include HoloCity [173] (Figure 11). Figure 11 demonstrates large arrays of transportation sensor data overlaid on a virtual model of Sydney, Australia, generated through OpenStreetMap (OSM), a form of volunteered geographic information (VGI) [173], and MapBox AR, an open-source toolkit combining AR software with MapBox's global location data. The Tabletop AR function of MapBox can generate interactive cityscapes that attach themselves to the nearest visible flat surface. In situ XR visualization of objects and urban 'situated analytics' are more challenging [174,175]. With the opportunities presented by XR platforms come the challenges brought about by their introduction. These include:

• Data accessibility. With the rise of the open data movement, many cities' digital data assets are becoming more accessible to researchers and developers for creating XR city products. For example, Helsinki 3D provides access to over 18,000 individual buildings [176].
• Development and testing. Usability of XR systems is paramount to their adoption. Applying human-centered design and testing standards (e.g., ISO 9241 [171,177]) could assist researchers in understanding how to create effective systems in specific city design and planning contexts.
• Disrupting standard practices. Widespread adoption of XR in the city planning community is contingent upon disrupting standard practices, which traditionally rely on 2D maps and plans. A recent study evaluating the use of HMD-based VR by city planners found that planners preferred the standard 2D maps, describing the immersive VR experience as challenging, with some participants noting motion sickness as a barrier to adoption [171].
• Moving beyond 'Digital Bling'. While XR offers ways of visualizing data in new and interesting ways, a major challenge in this context is moving beyond the novelty of 'Digital Bling' and also offering new and meaningful insights into the applied use of these platforms. This should, in turn, aim to generate more informed city planning than was possible with traditional media.
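The tabletop behavior mentioned above, attaching a cityscape to the nearest visible flat surface, can be sketched as a simple selection over detected planes. The data structure and values below are illustrative stand-ins for what an AR framework's plane-detection API would report; they are not actual MapBox AR calls.

```python
import math

def pick_anchor_plane(planes, camera_pos, min_extent_m=0.5):
    """Choose the nearest detected horizontal plane large enough to host
    a tabletop city model. `planes` is a list of dicts with 'center'
    (x, y, z) and 'extent' (width, depth) in metres -- a hypothetical
    stand-in for a plane-detection result."""
    best, best_dist = None, float("inf")
    for p in planes:
        w, d = p["extent"]
        if w < min_extent_m or d < min_extent_m:
            continue  # surface too small for the model footprint
        dist = math.dist(camera_pos, p["center"])
        if dist < best_dist:
            best, best_dist = p, dist
    return best

planes = [
    {"center": (0.0, -0.4, 1.2), "extent": (0.8, 0.6)},  # coffee table
    {"center": (2.5, 0.0, 3.0), "extent": (2.0, 1.0)},   # desk, farther away
    {"center": (0.2, -0.4, 0.9), "extent": (0.2, 0.2)},  # too small
]
anchor = pick_anchor_plane(planes, camera_pos=(0.0, 0.0, 0.0))
# anchor is the coffee-table plane: the nearest qualifying surface
```

Real frameworks continuously refine plane estimates, so an application would also need to re-anchor gracefully as the detected surfaces grow or merge.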

Conclusions
A complete account of all XR knowledge is beyond the scope of a single article. For example, more case studies on emergency preparedness and evacuation planning using XR [178,179] could further enrich the paper. Nonetheless, organizing the information under technology, design, and human factors dimensions demonstrates the interconnectedness of the interdisciplinary questions, and allows a more holistic picture to be built, rather than focusing on one of them alone; for example, how do our tools and methods (technology and design) affect our cognitive and human factors studies? For whom is the technology intended, how well will it work, and under what conditions and for what purpose will it be used? Can we optimize the design for the intended context and audience? Today, high-end smartphones can scan the physical world around them, process it in real time to create point clouds and geometric models, and spatially reference virtual objects with the physical world for MR. Much of this is enabled purely by developments in hardware, but not all devices can cope with the computational demands of MR, nor is it possible to record or stream such high-resolution data in any given setup. Such in situ visualizations and real-time sensing and simulation give rise to an XR GIScience, including the ability to collect data, visualize and query it, and explore the results in situ, all while using a single MR interface. Future MR could help close the gap that exists between the field space and the lab space, allowing for a situated GIScience that supports the cognitive connection between data and space. XR represents opportunities to interact with and experience data; perceive and interpret data multidimensionally; and perform volumetric and topological 3D analyses and simulations seamlessly across immersive, situated, and mixed reality environments. Some speculate that wearable XR is the next technological revolution (post smartphones). This may well be true, though it requires conscious effort connecting multiple disciplines. An interesting aspect of XR (especially MR) is that one can change people's reality. Assuming MR becomes commonplace, this may have important implications for the human experience and evolution. Creating good, useful, and thoughtful XR experiences can only be achieved with interdisciplinary collaboration. Consolidating and digesting expertise from different disciplines can be complex, and there is a risk of not reaching enough depth and making naïve assumptions. However, it is important to remember that entirely tech-driven research can also be risky, and a critical reflection on the impacts of these technologies on individuals and societies is important. In questioning what XR might mean for our understanding of the world, with all of its underlying ethical and political implications, all scientists working on XR should also reflect on policies and, where possible, collaborate with policy makers to influence the responsible use of these powerful technologies.

Figure 1 .
Figure 1. Left to right: (a) Augmented Reality (AR) and (b) Mixed Reality (MR) experiences extend reality by adding virtual elements, whereas (c) a Virtual Environment (VE) can be a fantasy world, and (d) a Virtual Reality (VR) immerses the user in a "real" setting such as a travel experience.

Figure 2 .
Figure 2. Milgram and Kishino's (1994) [24] seminal continuum expresses the degree of mixture between real and virtual objects. The real environment and the VE represent the two ends of this continuum, while MR occupies the section between real and virtual, containing AR and AV. The original figure is by Milgram and Kishino [24], and this public domain illustration is modified from Freeman [25].

Figure 3 .
Figure 3. A random sample of images found online using the keywords "virtual reality + people" in Google's image search. Above is a subset of the first 100 images, in which roughly 80% of the images depict people using VR systems with euphoric expressions and their mouths open. The images were then composed and stylized using image processing software by the authors (i.e., the illustration and collage are the authors' own work). The collection shows almost all people using a VR headset with their mouths wide open, demonstrating how literally "jaw-dropping" VR experiences can be.

Figure 4 .
Figure 4. A taxonomy of the current display systems, organized as a spectrum of non-immersive to immersive display types (and those in between). The categories are based on the literature reviewed in this section, and brand examples are provided as concrete references.

Figure 6 .
Figure 6. The immediate vicinity of the Apollo 17 landing site, visualized using a digital elevation model generated from Narrow Angle Camera (NAC) images.

Figure 7 .
Figure 7. (a) Original view of the user in situ. (b) The user selects a part of or an entire building to remove. (c) After removal, the real building is no longer visible. (d) The user adds a part of a 3D building model.

Figure 8 .
Figure 8. Capture of what the user sees while walking in the street through mixed reality glasses (HoloLens): a portal to jump 100 years back in that same street, by Alexandre Devaux. See a video here: https://twitter.com/AlexandreDevaux/status/1070333933575917569.

Figure 10 .
Figure 10. Rapid Analytics Interactive Scenario Explorer (RAISE) toolkit built on the Cesium platform, enabling city planners to explore future city scenarios.

Figure 11 .
Figure 11. (a) HoloCity: an example of a splash screen of an AR model of Sydney with a floating user interface. (b) MapBox AR: an example of the TableTop AR function showing a generated component of the same city of Sydney.

• Higher screen density (pixels per unit area), screen resolution (total number of pixels on a display), and frame rate (more frames per second (fps));
• More efficient rendering techniques informed by human factors research and AI (e.g., adaptive rendering that does not compromise perceptual quality [79]). Future XR devices with eye tracking may enable foveated rendering for perceptually adaptive level-of-detail (LOD) management of visual quality and realism [63];
• Approaches to handle a specific new opportunity/challenge, i.e., generalizing reality in MR (referring to cartographic generalization) via masking/filtering. Many specific research questions emerge from this idea in technological, HCI-related (e.g., use of transparency to prevent occlusion, or warning the user), and social (if we change people's experience of reality, what kinds of psychological, political, and ethical responsibilities should one be aware of?) domains.
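The foveated-rendering idea above (coarser detail with increasing angular distance from the gaze point) can be sketched as a simple eccentricity-to-LOD mapping. This is an illustrative sketch only; the band thresholds below are rough assumptions, not values from the cited studies:

```python
import math

def lod_for_fragment(gaze_dir, frag_dir, lod_levels=4):
    """Map angular distance from the gaze ('eccentricity') to a level of detail.

    gaze_dir, frag_dir: unit view-space direction vectors.
    Returns 0 (full detail, foveal region) up to lod_levels - 1 (coarsest, periphery).
    """
    cos_e = sum(a * b for a, b in zip(gaze_dir, frag_dir))
    ecc_deg = math.degrees(math.acos(max(-1.0, min(1.0, cos_e))))
    # Illustrative eccentricity bands: fovea within ~5 deg, parafovea within
    # ~15 deg, then increasingly coarse peripheral rendering.
    bands = [5.0, 15.0, 30.0]
    for level, limit in enumerate(bands):
        if ecc_deg <= limit:
            return min(level, lod_levels - 1)
    return lod_levels - 1
```

In a real renderer, the returned level would select mesh/texture resolution or shading rate per region, with eye-tracking latency and smooth transitions between bands as additional concerns.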
• Improved interaction and maximized immersion:
• More accurate tracking of control devices and inertial measurement units (IMUs) for headsets;
• More intuitive control devices informed by human factors research investigating the usefulness and usability of each for specific task(s) and audience;
• AI-supported or other novel solutions for controlling and navigating in XR without control devices (e.g., hand tracking, gesture recognition, or gaze tracking);
• More collaboration tools. Current systems contain tools for a single user; more tools need to be designed and implemented to enable collaborative use of XR;
• Creating hardware and software, informed by human factors research, that supports mental immersion and presence.
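As a toy illustration of the controller-free input mentioned above, a pinch (thumb and index fingertips nearly touching) is a common selection gesture in hand-tracked XR interfaces and can be detected from fingertip positions alone. The function name and distance threshold below are illustrative assumptions, not any tracking SDK's API:

```python
import math

def is_pinch(thumb_tip, index_tip, threshold_m=0.02):
    """Detect a pinch gesture from two 3D fingertip positions (in meters).

    Returns True when the thumb and index fingertips are closer than
    threshold_m (default 2 cm, an illustrative value).
    """
    return math.dist(thumb_tip, index_tip) < threshold_m
```

Real gesture recognizers additionally debounce over time and account for tracking noise and hand occlusion, which this single-frame check ignores.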