1. Introduction
Spatial-location memory is a type of declarative memory for spatial information [
1,
2]. This memory allows people to make precise associations between objects and their spatial locations. Spatial-location memory in large environments allows people to remember the locations of small objects that typically change their location in the environment (e.g., personal items) and is essential for success in typical daily activities at home and at work [
3]. Short-term spatial memory is defined as the ability of an individual to remember the location of items in the environment for short periods of time [
4]. Like most animals, humans use it for tasks such as orienting themselves in space, remembering a path, or remembering where they have left their belongings.
Conventional assessments of short-term spatial memory typically involve the presentation of objects on paper or screens [
5,
6] while participants are seated. However, previous works have highlighted the importance of physical movement in the acquisition of spatial skills [
7]. Virtual Reality (VR), Augmented Reality (AR), and Mixed Reality (MR) are technologies that can be exploited to develop tools for studying spatial memory.
Extended Reality (XR) is an umbrella term for AR, MR, and VR [
8]. Milgram and Kishino’s continuum [
9] is a widely recognized framework that classifies AR, VR, and MR technologies along the ‘virtuality continuum’, which ranges from completely real environments to fully virtual ones. In AR, the real environment is ‘augmented’ by integrating virtual objects [
9]. A more exhaustive definition was provided by Azuma [
10]. He describes AR as systems that possess three key characteristics: they combine real and virtual; they are interactive in real time; and they are registered in 3-D. In VR, the user is fully immersed and can interact with an entirely artificial world [
9]. In MR, real-world and virtual objects coexist, lying anywhere between the extremes of the ‘virtuality continuum’ [
9]. It is important to highlight that the ‘virtuality continuum’ was explicitly concerned with only visual displays and was established 30 years ago. Attempts to expand upon Milgram and Kishino’s continuum have focused on exploring the boundaries between physical and virtual spaces in MR environments [
11], proposing a classification framework for multisensory experiences [
12], introducing new concepts such as mediated reality [
13], proposing a conceptual framework for MR [
14], or revisiting Milgram and Kishino’s continuum [
15].
There is no clear, universally accepted definition of what MR is or how it differs from other concepts. Sometimes, MR and AR are used interchangeably [
14]. In addition to the above definition of MR, Skarbez et al. [
15] defined MR as an environment that seamlessly combines elements from the real and virtual worlds into a unified perceptual experience. In such an environment, users simultaneously perceive real and virtual content, often across multiple senses. According to Parveau and Adda [
16], MR is a paradigm that integrates technologies that are capable of mapping the user’s environment to present 3D virtual content registered in space and time. Virtual elements can be spatially aligned with the physical world, the user, or other virtual or real objects. Moreover, the MR experience should prioritize the user, providing natural and responsive interactions. One commonly recognized definition describes MR as a blend of real and virtual environments that creates an immersive experience, enabling interaction among physical and digital objects [
8,
17]. In the case of our application with HoloLens 2 (Microsoft, Redmond, WA, USA), we categorized it as MR because, in our opinion, it best fits the above definitions of MR, although our application could be classified as AR, as it also meets the criteria for such a classification.
XR technologies are particularly useful when users need to navigate real-world environments physically, as they offer experiences that closely mimic real-world tasks. Traditionally, VR did not allow the user to move physically through the virtual environment, and navigation was performed with controllers (e.g., joysticks). Physical navigation became possible with the advent of standalone VR headsets such as the Meta Quest and comparable or more advanced devices. AR and MR allow the user to physically move around the real environment. In addition, the latest MR headsets offer a natural perception of the real environment, a full blending of the virtual elements, and natural interaction between virtual and real elements. In our case, we used HoloLens 2, a standalone headset that does not need to be connected to any other device, giving the user complete freedom of movement. The digital elements blend into the real environment, giving a sense of integration. Interaction is performed with hand gestures, and no controllers are required. Voice interaction is also possible.
In this work, we present a new MR application that runs on an MR headset (HoloLens 2) for the assessment of short-term spatial memory. Our MR application does not require any physical elements to be added to the scene for tracking, and participants are free to move around indoor spaces. The application task is divided into three phases: a familiarization phase, a learning phase, and an evaluation phase. In the familiarization phase, the participants become familiar with the headset and the application. In the learning phase, the participants search for virtual 3D objects distributed in a physical room. In the evaluation phase, the participants recall these objects and use the MR application to place them at their correct locations in the room. The application stores useful data for further study (e.g., the successes and errors, the Learning Time, the Evaluation Time). The main objective is to determine whether MR is useful for assessing short-term spatial memory and whether it has advantages over AR and VR. A study was conducted to validate our MR application. The study involved twenty-nine participants, and its data were compared with those of two previous studies, one that used AR on a mobile device [
18] and another that used a virtual environment modeling the room in which the AR and MR applications were validated [
19]. The main hypothesis of this work is that MR running on headsets such as HoloLens 2 or superior is an effective technology for assessing short-term spatial memory and that the experience is satisfactory for the participants.
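As an illustration only, the data stored per session (successes, errors, Learning Time, Evaluation Time) could be organized as in the following Python sketch. The class name `SessionRecord`, its field names, and the use of seconds as the time unit are hypothetical choices made for this example; they are not taken from the implementation of the MR application.

```python
from dataclasses import dataclass, field


@dataclass
class SessionRecord:
    """Hypothetical per-participant record; all names are illustrative only."""

    participant_id: str
    learning_time_s: float = 0.0    # time spent in the learning phase (assumed seconds)
    evaluation_time_s: float = 0.0  # time spent in the evaluation phase (assumed seconds)
    # Maps an object name to whether it was placed at its correct location.
    placements: dict = field(default_factory=dict)

    def record_placement(self, obj: str, correct: bool) -> None:
        """Store the outcome of placing one object in the evaluation phase."""
        self.placements[obj] = correct

    @property
    def successes(self) -> int:
        """Number of objects placed correctly."""
        return sum(self.placements.values())

    @property
    def errors(self) -> int:
        """Number of objects placed incorrectly."""
        return len(self.placements) - self.successes
```

A record of this kind would be filled during the evaluation phase by calling `record_placement` once per object, after which `successes` and `errors` can be read directly.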
1.1. Spatial Memory
Short-term memory and long-term memory have been studied extensively in animals [
20]. A classic experiment uses mazes in which the animal must memorize the path to a reward in order to find it more quickly [
21]. Similar studies have also been conducted in humans [
22]. Other methods that have been used to assess spatial memory in humans involve graphic representations or images on paper [
23,
24] or on a screen while the subject remains seated in a chair, but these methods do not involve physical displacement [
25,
26,
27].
Using visualization devices, new methods have been developed to assess spatial memory in humans by simulating environments or large rooms without the need to use real rooms. For example, Shore et al. [
28] used computer-generated virtual environments to study spatial short-term memory in humans. The subjects were able to move around a large virtual space displayed on a screen using a keyboard or a joystick.
According to previous works [
29,
30], there is a neural process in humans that updates the mental map of the environment when the subject physically moves, and this process is not engaged when the objects in the environment are the ones that move. In these studies, a subject was shown a series of objects around a round table. After the objects were hidden, either the subject moved around the table or the table rotated. Subjects recognized and remembered the positions of the objects much better when they themselves moved around the table than when the table rotated. This supports the hypothesis that people remember spatial landmarks better when they are actively engaged in physical movement. Therefore, AR and MR offer potential advantages for the development of applications that allow physical navigation through an environment. AR and MR applications that encourage user movement may have a positive impact on memory tasks compared to other methods in which the subject remains static or seated.
1.2. Technology-Assisted Assessment of Spatial Memory
Short-term spatial memory can be assessed using paper-and-pencil tests [
23,
24]. The first computerized assessment tools used the same principles, but they replaced paper with a screen [
31]. This already offers some advantages over analog tools, such as the possibility of collecting some variables in an automated way (successes, failures, and reaction times).
The incorporation of VR, and later AR, has opened up new possibilities for the assessment of spatial memory. These technologies allow the user’s interaction with objects to be natural, and navigation through the environment can be achieved with the user’s physical displacement. In addition, there is no need to model the environment in AR since the user is in direct contact with the real world. Experiences with these technologies are much more similar to everyday activities. The first applications of virtual environments to assess spatial memory used a monitor as the display device, and the subject used a keyboard, joystick, or mouse to navigate and interact with the environment while remaining seated [
25,
26,
27]. Later studies used the physical displacement of the subject to explore the virtual environment [
32,
33]. The use of AR is more recent. The first mobile AR applications [
34,
35] used images as targets for tracking and thereby to determine the position and orientation of virtual objects in the real environment. More recent works have introduced mobile AR applications based on Simultaneous Localization and Mapping (SLAM) [
36,
37], eliminating the need for physical images in the real world. Other authors have developed an MR application that incorporates holographic grids to study their impact on distance estimation and location memory [
38]. Their results suggested that the display of a grid led to more accurate distance estimates, but location memory performance was worse. Auditory stimuli have also been used to assess spatial memory [
39]. A mobile AR application containing both visual and auditory stimuli was developed to assess spatial memory [
39]. Their study aimed to compare the participants’ performance between visual and auditory stimuli and found similar success rates, but memory for spatial–visual associations was dominant since the spatial location of visual stimuli was remembered more precisely and rapidly. Tactile stimuli have also been studied for spatial memory assessment [
18]. The results showed similar success rates between visual and tactile stimuli, but again, memory for spatial locations was more precise and rapid with visual stimuli [
18].
4. Discussion
In this work, an MR application using an optical see-through headset was developed for the assessment of short-term spatial memory. The application works in any indoor environment of any size. It does not require the addition of physical elements for tracking. The user has freedom of movement. Interaction is gestural and natural. The supervisor can personalize the task with the number and position of the objects to be memorized, and this configuration is stored so that the task can be repeated at any time. To our knowledge, this is the first application using an MR headset (HoloLens 2) for this purpose and with these features. In previous studies, VR headsets have already been used to explore spatial knowledge acquisition tasks in controlled environments [
19,
48,
49]. In line with our work, which highlights the importance of VR, AR, and MR for memory-related tasks, other studies also highlight the potential of VR headsets to achieve similar objectives [
19,
48,
49]. Additionally, like other works [
19,
48,
49], we also evaluate the time spent on tasks and performance outcomes. Some works only use VR with headsets [
19,
49]. Another work compared VR with headsets and a real environment [
48]. In this last work [
48], a maze challenge was used as the experimental task, where participants were asked to retrieve objects placed within the maze. The participants’ performance in two conditions (VR vs. real) was compared in terms of navigation times and the routes chosen. Statistical analyses showed no significant differences between VR and real conditions in terms of total time or navigation performance, regardless of prior gaming experience or self-assessed navigation skills. The work of Monteiro et al. [
48] and our work coincide in using the same environment (real or modeled) for comparisons. In our study, the participants were not instructed to complete the tasks as quickly as possible, unlike other studies which emphasized minimizing the task completion time [
48]. This lack of urgency led some participants to spend extra time on certain tasks, significantly increasing their total time spent. In our work, in line with prior research [
48], we analyzed whether there were correlations between the computer experience of the subjects and other study variables as well as between video game experience and other study variables. As previous studies argue [
48], we agree that it is valuable to explore spatial memory using the latest VR/MR technologies, given their rapid and significant evolution, and to ensure that prior research is accessible to provide a solid foundation for future investigations. The evolution of headsets can be observed in previously published works. For example, the Oculus Rift CV1 [
48], which required a computer connection to operate, the Oculus Quest [
19], one of the first standalone devices, and the Meta Quest 2 [
49] have already been used. In this work, HoloLens 2 is employed. To the best of our knowledge, while not specifically for spatial memory, other headsets for MR with grayscale or color passthrough have been used in various applications. For example, the Meta Quest Pro has been employed for learning to play the piano [
50], and the Apple Vision Pro could open up new possibilities, such as for individuals with visual deficits [
51].
Our MR application was validated with 29 participants, and the data were compared with those of two other studies. One of them used a mobile AR application in the same environment. The other used a VR application running on a headset.
When comparing the MR application and the mobile AR application for the performance variables, the only difference was the Evaluation Time. The participants using the MR application took significantly more time. Our explanation for this result is that HoloLens 2 was new to all of the users, they enjoyed observing the objects, and they were under no time pressure to complete the task. It is important to note that there were no significant differences for the other three performance variables.
When comparing the MR application and the VR application for the performance variables, there were differences in the Learning Time and Evaluation Time in favor of the VR application. Our explanation for this result is similar to that for the AR application: HoloLens 2 was new to all of the users, they found observing the objects appealing, and they were under no time pressure to complete the task.
When comparing the total number of correctly placed objects between the MR application and the map task, no statistically significant difference was found. This result indicates that the objects placed with the MR application were correctly remembered and placed on the 2D map of the environment. This result is consistent with previous works [
19]. This result shows that the participants learned the spatial–visual associations and were able to transfer these associations from the 3D array of the real environment to the 2D array of the map and the mental image of the room. Thus, the first part of the main objective has been fulfilled, and the first part of the main hypothesis has been confirmed: our MR application has proven to be a useful tool for the assessment of short-term spatial memory.
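The comparison between the number of objects correctly placed with the MR application and on the map is a paired comparison, since each participant contributes one count per task. The statistical test used in the study is not restated in this section, so the following Python sketch substitutes an exact sign-flip permutation test purely as an illustration of how such a paired comparison can be carried out. The function name `paired_permutation_p` is hypothetical, and exhaustive enumeration is only practical when few participants have non-zero differences.

```python
from itertools import product


def paired_permutation_p(a, b):
    """Two-sided exact sign-flip permutation test on paired differences.

    a, b: equal-length sequences of per-participant scores (e.g., number of
    correctly placed objects in two tasks). Returns the p-value.
    """
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    # Zero differences do not change the statistic under sign flips, so
    # dropping them leaves the p-value unchanged and reduces the enumeration.
    diffs = [x - y for x, y in zip(a, b) if x != y]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    # Enumerate every assignment of signs to the non-zero differences.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed:
            extreme += 1
    return extreme / total
```

Under this sketch, a large p-value means that the observed difference between the two tasks is compatible with chance, which corresponds to the absence of a statistically significant difference described above.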
The participants rated the MR experience positively, with a median rating of 6 or higher (on a scale of 1 to 7) for seven of ten subjective variables and a median rating of 7 for the satisfaction and the non-physical effort variables. Therefore, the second part of the main hypothesis has been supported: the participants had a satisfactory experience when using our MR application. No statistically significant differences were found for the performance variables when gender was taken into account. Our results are consistent with previous studies [
19,
39].
With regard to the comparison between the MR application and the mobile AR application for the subjective variables, there were significant differences in favor of the MR application for the variables calmness, non-physical effort, and satisfaction. Our explanation for the lower perceived physical effort with the MR application is that, with the AR application, the participants had to hold and handle the mobile phone, whose weight contributed to the perceived effort.
In the same comparison, there were also significant differences in favor of the AR application for the variables concentration, usability, and sense of presence. Our explanation for the higher perceived usability of the AR application is that users are very accustomed to using a mobile phone, whereas this was the first time they had used air gestures to interact with an MR application. Our explanation for the lower perceived sense of presence in the MR application is that the virtual objects are holograms, which are objects made of points of light. This visualization differs from what is perceived in reality and in other visualization systems, and users perceived this difference. This is a technical limitation that is expected to be improved in future MR headsets. One area of improvement for optical see-through headsets would be higher-resolution displays. Increasing the pixel density to render sharper images with finer detail would reduce the perception of objects as mere points of light. Another option is to use the latest video see-through headsets for MR, such as Apple Vision Pro, which uses color passthrough.
After conducting this study, the advantages of MR using an optical see-through headset over mobile AR and VR using headsets were identified. Thus, the second part of the main objective has been fulfilled. First, the advantages of MR using an optical see-through headset over mobile AR are as follows:
A more immersive experience can be created by seamlessly blending digital content with the real world.
The freedom to interact with digital content without holding or manipulating the device allows for more natural and intuitive interactions.
Because MR headsets project digital content directly into the user’s field of view, there are fewer environmental distractions compared to viewing content on a mobile device screen.
Headsets are becoming more ergonomically designed, while mobile AR requires users to hold and manipulate a device.
Second, the advantages of MR with an optical see-through headset over VR with headsets are as follows:
The environment is real and does not need to be modeled. The user’s home, a therapist’s room, or any other chosen location can be used.
MR allows users to maintain awareness of their real environment while interacting with virtual content, increasing safety and enabling collaboration with others in physical space. In contrast, VR isolates users from the real world, which can lead to disorientation and safety issues.
With optical see-through headsets, users can interact with virtual and physical objects using natural gestures and movements in their real environment. In VR with headsets, interaction is mainly limited to virtual objects within the modeled environment.
MR with optical see-through headsets enables social interaction and collaboration among users in the same physical space, fostering communication and teamwork. VR with headsets tends to isolate users in individual virtual environments, limiting social interaction to online platforms.
MR allows virtual objects to interact with real-world objects and the real world. VR with headsets lacks this capability because the virtual content is isolated from the physical world.
MR preserves spatial cues and depth perception from the real world, allowing users to accurately perceive distances and spatial relationships between objects. In contrast, VR with headsets generates spatial cues only in the virtual environment, which can lead to discrepancies between perceived and actual distances.