Navigation in Indoor Environments: Does the Type of Visual Learning Stimulus Matter?

This work addresses the impact of a geovisualization’s level of realism on a user’s experience in indoor navigation. The key part of the work is a user study in which participants navigated along a designated evacuation route previously learnt from either a virtual tour or a traditional 2D floor plan. The efficiency and effectiveness of completing the task were measured by the number of incorrect turns during navigation and completion time. The complexity of mental spatial representations that participants developed before and after navigating the route was also evaluated. The data were obtained using several qualitative and quantitative research methods (mobile eye tracking, structured interviews, sketching of cognitive maps, creation of navigation instructions, and additional questions to evaluate spatial orientation abilities). A total of 36 subjects (17 in the “floor plan” group and 19 in the “virtual tour” group) participated in the study. The results showed that the participants from both groups were able to finish the designated navigation route, but more detailed mental spatial representations were developed by the “virtual tour” group than the “floor plan” group. The participants in the virtual tour group created richer navigation instructions both before and after evacuation, mentioned more landmarks and could recall their characteristics. Visual landmark characteristics available in the virtual tour also seemed to support correct decision-making.


Introduction
Virtual environments (VEs) represent an opportunity for both cartographic and psychological research, as both fields strive to utilize the advantages of controlled environments and stimuli. Knowledge and experience obtained in VEs can, under certain circumstances, be transferred to the real environment. Safety, financial or capacity constraints sometimes make it hard to perform certain tasks in reality, and VEs can be a convenient substitute. Until recently, one of the biggest obstacles to using VEs in both research and application was the unavailability of real-world models that could be transferred into virtual environments. However, the emergence of Building Information Management (BIM) has transformed traditional building design, which relied on two-dimensional technical drawings, by extending it into the third dimension. BIM is also active throughout the entire life cycle of a building, including the planning and modeling stages, when 3D visualization and VEs play a major role in shaping the building's final appearance and functionality. The EU Public Procurement and Repealing Directive [1] recommends using electronic tools such as BIM, thereby legally supporting convergence of the information technology and construction sectors. The option or even necessity of using VEs for buildings also opens new horizons for indoor navigation in general and evacuation planning in particular. Current practices of evacuation planning rely on combining traditional 2D printed plans and evacuation marks

STIMULI: Realism vs. Abstraction
The processing of geographic information is influenced by the learning environment. As early as 1980, Evans and Pezdek [18] defined the differences between the general spatial knowledge recovered from maps and direct outdoor experience. Later, Thorndyke and Hayes-Roth [19] postulated the differences in spatial knowledge users gained from maps and navigation. While maps are more suitable for acquiring the spatial relationships that reside in long-term memory, navigation experience provides procedural knowledge of the path between two locations.
In creating geographical visualizations, identifying the right amount of realism is a multifaceted, complex and unresolved issue in cartography [20]. MacEachren et al. [21] stated that traditional geovisualizations and virtual environments are similar in terms of the use of realism. The suitability of individual visualizations cannot be generalized and depends instead on the task type [22][23][24][25][26][27][28].
Opinions on how the level of realism influences human perception differ. Some authors state that higher levels of realism in visualizations encourage a user's imagination, reduce cognitive load and make visualizations more user-friendly. Loomis et al. [29] support the idea that higher visual quality in the virtual environment and more interaction and movement options increase the chances of memorizing the environment. Users are generally enthusiastic about realistic visualizations and consider them intuitive and easier to interpret [30]. Some scientists believe that people are better able to remember visual elements that closely depict reality [31][32][33]. Higher levels of realism in visualizations can also mean higher ecological validity if we are modeling a real environment [34,35]. Other authors believe that higher levels of realism increase cognitive load and exhaust memory capacity more quickly [36][37][38][39]. This view is based on the traditional foundations of cognitive cartography. According to Bertin [40], simplification is a necessary part of the communication process. Lowe [41] warned against higher levels of detail that could lead to misinterpretation. Tufte [42] appealed to designers to maximize the ratio of data to ink. According to Sanchez and Branaghan [43], adding detail to visualizations adversely affects map reading.
Smallman and St. John [44] devised the Naïve Realism theory, which explains contradictory preferences and performance when people use realistic visualizations. People mistakenly believe that the realistic visualization they are presented with will be effortlessly transformed into a complete and accurate mental representation. In reality, however, realistic stimuli are transformed vaguely and with many inaccuracies. If a task requires accurate judgment, a vague mental representation is not sufficient to correctly solve the task [22]. According to the Naïve Realism theory, people prefer more realistic visualizations and believe they need a high degree of realism to complete the task. Only those individuals with better spatial abilities can revise their opinion retrospectively. In fact, visualizations with only a moderate degree of realism lead to the best performance [45]. Similar results were obtained by [46][47][48][49][50].
Currently, visualizations combining 2D and 3D elements and using the benefits of both types of visualization are becoming more common [50][51][52][53]. Stachoň et al. [54] also focused on the influence of the level of realism on user navigation in virtual reality. Their findings showed that the level of realism did not affect the memorability of the environment. However, they found that higher levels of realism benefited route finding, as participants worked more effortlessly and made fewer errors.
The role of different graphical stimuli in navigation was described by several scholars for both outdoor and indoor environments, for example, [47,50,51,53,54,55,56]. Ishikawa and Yamazaki [55] compared the spatial orientation performance of people using maps or pictures with arrows indicating direction after they had left a metro station. Their findings showed that people estimated directions better if they were shown pictures independently of their mental-rotation skills, which were also evaluated. Boumenir et al. [57] conducted an experiment in which they compared a virtual tour, 2D maps and real user navigation experiences. In this study, schematic 2D representations were more effective in outdoor navigation than realistic 3D representations, providing essential geometric and topological information that was important for creating a basic axis in the participant's mental representation of the space. By contrast, Schnitzler et al. [58] compared the usability of three different navigation aids (digital maps, paper maps and signage) in indoor navigation using mobile eye tracking. They concluded that navigators concentrated most (highest number of eye fixations) during initial orientation in an unknown space. Navigators also focused more on the decision points associated with changing floors in a building. Both conclusions are valid regardless of the wayfinding assistance stimuli.
Conclusions from these studies examining the level of realism in geovisualizations are not always straightforward. In some cases, a higher level of realism seemed to provide an advantage, but in others, it was unsuitable. Defining the right level of realism required to support a specific task is therefore appropriate.

PARTICIPANTS: Spatial Abilities, Expertise and Gender
Variation in the development of spatial skills and navigation performance is closely related to the participants' predispositions. Self and Golledge [59] reported differences between genders in navigation strategies, with females tending to orient themselves according to landmarks and their visual characteristics and males orienting themselves more according to direction and the spatial relationships between landmarks. Males are often more self-efficient and confident about navigation tasks [60]. Besides gender, the studies also focused on age [61] and mental capacity [62]. Interpersonal variation in spatial behavior is closely linked to the concept of cognitive style [63]. This concept refers to the way individuals think, perceive and orient themselves in the environment using two principal dimensions [64,65]: the verbal-imagery dimension (preference for representing information in words or as mental pictures) and the holist-analytic dimension (preference for processing information as either integrated wholes or discrete parts). An Object-Spatial Imagery and Verbal Questionnaire (OSIVQ) can be administered in order to measure the cognitive style of users and their respective tendencies to solve tasks analytically or holistically. The role of cognitive styles in map perception was studied by Čížková [66] and Šašinka et al. [67]. A significant influence of expertise on the effectiveness of solving spatial tasks was reported by Herman et al. [28]. Participants' predispositions are important to consider, as they can strongly bias results when focusing on stimuli rather than group comparison.

Analyzing Navigation Performance
The current trend in user studies is to collect data using several methods simultaneously. Roth [68] claims that using several methods improves the effectiveness, efficiency, reliability, validity and significance of the experiment and results obtained. The current research agenda of the Use, User, and Usability Issues Commission of the International Cartographic Association [69] also encourages the use of mixed methods in user studies. Štěrba et al. [65] point out that the use of purely qualitative or quantitative research methods is inadequate for answering many questions. A combination of both qualitative and quantitative methods is more conducive to achieving better results and a more complete interpretation of user behavior.
Considering the eye-mind hypothesis introduced by Just and Carpenter [70], we may learn a lot about human behavior by monitoring eye movements. Eye tracking devices monitor eye activity, typically by identifying fixations and saccades. Landmarks represent an important aid in user navigation. As they are most often perceived visually, eye tracking is a suitable method for detecting whether a user saw a landmark. The length of fixations has been found to correlate with a feature's salience [71]. However, the relationship between fixation duration and selective focus on a given landmark may not be straightforward. A user can fixate longer on a landmark because it is complex and then categorize it as an unsuitable mental navigational aid [72]. People also obtain much information from peripheral vision, which cannot be monitored by eye tracking devices, or they may focus on a point without getting any information at all [73,74]. Another problem is that landmarks may not necessarily be identifiable in reality only because of their visual characteristics but also because of their semantics or previous user memories and experiences, which also provide information to the user [75]. Because eye tracking does not answer all research questions, it is often supplemented by other methods in user studies [72,76,77].
Another frequently used method in navigation studies is asking participants to reconstruct a route using navigation instructions, for example, in an interview. Creating navigation instructions after experiencing the environment can be considered a form of a retrospective think aloud protocol and can provide a user's perspective on their mental representations [78,79]. However, a user may not always be able to accurately express his or her thoughts [79]. In order to analyze these navigation instructions, we used quantitative content analysis, which is based on previously created codes. The frequency of these codes can then be counted in a statement [80]. Several authors [81,82] recommend working with landmark categories and not specifying landmarks individually. Some studies [72,82] have shown that participants pay more attention to functional landmarks (e.g., doors, stairs, lifts, etc.). Golledge et al. [59] described other methods by which route learning can be evaluated. In addition to drawing external mental spatial representations and providing a verbal description of a route based on acquired spatial knowledge, "homing" (following the route in the opposite direction), recognizing landmarks in a photo of the traveled environment and estimating directions can also be employed.

Materials and Methods
Eight research methods were used in this study: a structured questionnaire with an integrated OSIVQ spatial abilities test [64], eye tracking, creation of navigation instructions, mental maps, estimation of direction, estimation of route length, landmark identification in photographs, and additional questions. As not all of the results were in the scope of this paper's study, only the methods whose results are referred to in this paper are introduced.
Mobile SMI eye tracking glasses recording at 60 Hz were used to measure a participant's point of attention. The data describing participants' eye movements were processed and analyzed using BeGaze 3.5.101, which is distributed with the SMI glasses. Special attention was paid to eye movements at decision points. Records of each participant's route were processed manually using the Semantic Gaze Mapping method. Areas of Interest (AOIs) were sketched in advance around objects considered landmarks according to their semantic significance. Each fixation was then marked manually from each participant's eye tracking record.
Quantitative content analysis was used to analyze navigation instructions by observing the occurrence and descriptive characteristics of landmarks created in these instructions. Special code categories were created for functional landmarks (doors, stairs) and the green evacuation signs indicating the direction of the designated evacuation route. The remaining landmarks were not differentiated into categories. Several categories related to the spatial layout of the designated route were also created: direction change, spatial relationships, distances and destination. All the resulting codes were categorized according to the spatial knowledge developed by Siegel and White [6] (Figure 1). The quantitative content analysis was undertaken using ATLAS.ti (v. 7.5.7).
The accuracy of navigation instructions was evaluated by separating the entire route into smaller sections. Each section always began and ended with a decision point, in other words, a turning point where the instruction had to be correctly described. If a participant gave incorrect directions or did not state a direction at all, an error was counted. An error was also counted if a participant described extra segments (e.g., extra staircases) or failed to mention a segment. This method was inspired by Agrawala's [83] dissertation research, which emphasized the importance of navigation instructions at decision points at the expense of local and contextual information. It also corresponds to the findings of Kim and Hirtle [84], who suggested that knowledge of routes is represented as a sequence of intersection-based choice points where procedural decisions must be made.
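The error-counting rule for navigation instructions can be illustrated with a minimal Python sketch. It is not the authors' coding procedure, which was carried out as content analysis; the function name, the data layout and the example route directions are all hypothetical and serve only to show how the three error types (wrong direction, missing direction, extra or omitted segment) add up.

```python
from typing import Optional, Sequence


def count_instruction_errors(
    expected_turns: Sequence[str],
    stated_turns: Sequence[Optional[str]],
    extra_segments: int = 0,
    missing_segments: int = 0,
) -> int:
    """Count errors in one participant's navigation instructions.

    expected_turns: correct direction at each decision point of the route.
    stated_turns: direction the participant gave at each decision point,
        or None when no direction was stated.
    extra_segments / missing_segments: number of segments the participant
        added (e.g., an extra staircase) or failed to mention.
    """
    if len(expected_turns) != len(stated_turns):
        raise ValueError("one stated turn (or None) is needed per decision point")
    # A missing direction and a wrong direction both count as one error.
    turn_errors = sum(
        1
        for expected, stated in zip(expected_turns, stated_turns)
        if stated is None or stated != expected
    )
    return turn_errors + extra_segments + missing_segments


# Hypothetical route with five decision points (directions are made up).
route = ["right", "down", "right", "down", "left"]
answers = ["right", None, "left", "down", "left"]
errors = count_instruction_errors(route, answers, extra_segments=1)
print(errors)
```

In this made-up example the participant omits one direction, states one wrong direction and describes one extra segment, so three errors are counted.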
Estimation of direction and route distance was also employed as a method. Direction estimations were measured with a compass in degrees and distance estimations were obtained in meters. In the analysis, the deviations of these estimates from the real values were processed.
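As a sketch of how such deviations might be computed, the Python functions below handle the wrap-around of compass bearings at 360 degrees. The text does not state whether absolute or signed deviations were analyzed, so both choices here are assumptions, and the function names are hypothetical.

```python
def angular_deviation(estimated_deg: float, true_deg: float) -> float:
    """Smallest absolute angle between two compass bearings, in degrees (0 to 180)."""
    diff = abs(estimated_deg - true_deg) % 360.0
    # Bearings wrap around at 360 degrees, so 350 and 10 are only 20 apart.
    return min(diff, 360.0 - diff)


def distance_deviation(estimated_m: float, true_m: float) -> float:
    """Signed deviation in meters; a positive value means overestimation."""
    return estimated_m - true_m


# Hypothetical estimates; 86 m is the route length reported in the study.
print(angular_deviation(350.0, 10.0))
print(distance_deviation(100.0, 86.0))
```

Without the wrap-around step, a participant pointing at 350 degrees when the true bearing is 10 degrees would appear to be off by 340 degrees rather than 20.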
Identification of landmarks from photographs was used to test the participants' visual memories. Photos were taken of different building interior scenes with landmarks, but only some of them were located on the evacuation route. Participants were then asked to separate these 17 photos of landmarks into three categories depending on whether they had seen the same landmarks while navigating the route ("Yes", "No") or whether they could not decide with certainty having seen the landmark ("Not sure"). For each correct categorization, participants were awarded 1 point, for incorrect answers, −0.25 points, and for "Not sure", 0 points.
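The scoring rule for the photograph task can be written down as a short Python sketch. The point values come from the description above; the function name, the answer encoding and the example data are hypothetical.

```python
from typing import Sequence

POINTS_CORRECT = 1.0
POINTS_INCORRECT = -0.25
POINTS_NOT_SURE = 0.0


def score_photo_task(answers: Sequence[str], on_route: Sequence[bool]) -> float:
    """Score one participant's photo categorization.

    answers: "Yes", "No" or "Not sure" for each photo.
    on_route: True if the pictured landmark was located on the route.
    """
    if len(answers) != len(on_route):
        raise ValueError("one answer is needed per photo")
    total = 0.0
    for answer, seen in zip(answers, on_route):
        if answer == "Not sure":
            total += POINTS_NOT_SURE
        elif answer in ("Yes", "No"):
            # "Yes" is correct for landmarks on the route, "No" for the rest.
            total += POINTS_CORRECT if (answer == "Yes") == seen else POINTS_INCORRECT
        else:
            raise ValueError(f"unexpected answer: {answer!r}")
    return total


# Hypothetical answers for four photos (the study used 17).
example_score = score_photo_task(
    ["Yes", "No", "Not sure", "Yes"],
    [True, True, False, False],
)
print(example_score)
```

Under this rule a participant who categorizes all 17 photos correctly reaches 17 points, and the small penalty for wrong answers makes "Not sure" preferable to guessing only when a participant has no recollection at all.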

Experimental Design
In following the call to improve consistency and detail in reporting experimental design and to support the transparency, transferability and reproducibility of research studies [69,85], a detailed description of our experimental design is provided below. A brief overview of the experiment's structure and stages is shown in Figure 2. The research methods whose results are evaluated in this paper are highlighted with a black outline (Figure 2).

Personal Profile
A few weeks before the experiment, participants completed a web questionnaire at home. Besides basic personal information, it contained an integrated OSIVQ questionnaire [64] to measure the participant's cognitive style. Based on the data collected from the questionnaire, participants were categorized into two groups (two learning stimuli) to obtain a balanced distribution of cognitive styles, gender and experience with maps.

Intro Stage
The experiment was conducted at the Headquarters of Masaryk University. Participants were not familiar with the selected building before testing. They were brought by an alternative route to the meeting room, which was the starting point of the designated evacuation route and where the experiment began.
After being welcomed, each participant provided informed consent to participate in the research study. The study was conducted in accordance with the Declaration of Helsinki, and the research protocol was approved by the Ethics Committee of Masaryk University. The eye tracking device was then calibrated, and the experiment began with an introductory stage during which the participant was briefly acquainted with the basic structure of the experiment. The participants also received instructions for working with a given visualization according to their grouping. The virtual tour group obtained introductory information along with instructions and references to the virtual tour in the form of an offline website. The floor plan group was presented with a PowerPoint presentation.

Learning Stage
Participants from the virtual tour group learned the evacuation route from the virtual tour available at: http://ofm.ukb.muni.cz/vt/nav/rektorat/. An example from the virtual tour conditions is shown in Figure 3. Participants from the floor plan group learned the evacuation route from the second part of the PowerPoint presentation, which showed the schematic plan of the individual floors of the building with the designated evacuation path (Figure 4). The complete experimental stimuli given to the participants are available in the Supplementary Materials. During the learning stage of the experiment, participants' eye movements were monitored using the mobile eye tracking device. Participants in both visualization groups were given no time limit for the learning stage and could proceed backwards through the visualization.

Before Navigation Stage
In the first interview, participants were instructed to draw the route they had learned and to create navigation instructions. They were asked to imagine instructing a visitor who had never been inside the building. They were also asked to point towards the route's destination (i.e., the direction) and estimate the route's distance.

Navigation
After the interview, the mobile eye tracker was calibrated for a different focal distance and participants were sent out of the room to navigate along the evacuation route. Each participant navigated the 86-meter route individually while one of the research team members followed behind at a reasonable distance to ensure the participant's safety.

After Navigation Stage
When the participant arrived in the main lobby of the building, which was the designated route's destination, the mobile eye tracking device was removed. A second interview was conducted during which the participant could modify their route drawing and was again asked to create navigation instructions for the route they had just traveled. They were asked to indicate the direction towards the route's starting point, estimate the route's length and answer some additional questions. The final task was identifying landmarks from photographs.

Determination of Experimental Hypotheses
Our experimental hypotheses were proposed on the basis of the results of previously conducted studies. According to the theory of Naïve Realism [44], more realistic visual stimuli lead to the creation of vague mental representations, which, if the task requires accurate judgment, are not sufficient for identifying correct solutions to the task. In our experiment, the navigation task did not require accurate judgment. A higher level of realism in virtual tour stimuli generates increased cognitive load, whereas the virtual tour's navigational merits should contribute to the development of procedural knowledge of the evacuation route. We therefore expected both experimental groups to perform similarly on this task.

Hypothesis 1:
Users who learn the route from the schematic floor plan navigate along the designated evacuation route as efficiently and effectively as users who learn the route from the virtual tour.
However, we also wanted to investigate the influence of the selected visual stimuli on participants' mental representations of the environment. These representations were developed by participants after learning the route, and we were interested in how these mental representations changed after they had navigated the route in the real environment. Overall, we expected that users who learned the route from the virtual tour would develop a more detailed mental representation of the environment than users from the floor plan group. Based on previous research and the research methods used, we hypothesized the following:

Hypothesis 2.1:
Virtual tour users will concentrate more on landmarks and their visual characteristics while navigating the route than schematic floor plan users.

Hypothesis 2.2:
Virtual tour users will create more detailed navigation instructions and include more landmarks and visual characteristics in these instructions than schematic floor plan users [33,56,86,87].
Based on the Self and Golledge [59] findings, we hypothesized that females would perform better in the task of identifying landmarks from photographs whereas males would perform better in estimating directions and route length.

Hypothesis 3:
Males will more accurately estimate the direction and length of the route than females. Females will identify more landmarks correctly in photographs than males.

Participants
A total of 36 people participated in the experiment: 17 in the floor plan group and 19 in the virtual tour group. All participants were volunteers and could quit the experiment at any time. The ratio of women to men was reasonably balanced in both groups: seven women (41%) to ten men in the floor plan group and nine women (47%) to ten men in the virtual tour group. More than 80% of the participants were 18 to 26 years old. No participants were aged over 40 years. Participants were of Czech and Slovak nationalities, were mostly university graduates or students, and had different work backgrounds.

Results
In this section, results are reported in the context of the hypotheses. All data collected in the experiment were checked for normality using the Shapiro-Wilk normality test [88]. The differences between tested groups (between-subject design) were examined using Welch's two-sample t-test (for normally distributed data) [89] and the Mann-Whitney-Wilcoxon non-parametric test (for data that were not normally distributed) [90]. The differences between experimental stages (within-subject design) were examined using a Wilcoxon signed rank test [91]. Results with p-values < 0.05 are reported as statistically significant. We also report effect sizes using Cohen's d and r values and, following the guidelines [92], interpret values of 0.2, 0.5 and 0.8 (Cohen's d) and 0.1, 0.3 and 0.5 (r) as small, medium and large effects, respectively. All boxplots use the 1.5×IQR (interquartile range) rule, i.e., Tukey's fences [93], for whiskers and for identifying outliers. Asterisk notation is used to visualize statistical significance (ns: p-value > 0.05, *: p-value ≤ 0.05, **: p-value ≤ 0.01, ***: p-value ≤ 0.001). The statistical analysis of results was conducted in RStudio (v.1.0.153). Descriptive statistics for all measurements for both experimental groups are summarized in Table 1.
The efficiency and effectiveness of route navigation were measured by the total time spent navigating the route and the number of wrong turns taken. Data for the total navigation time were extracted from eye tracking measurements. The floor plan group showed a higher variability in time spent navigating the route, which was caused by those participants making navigation errors (Figure 5). The difference in time spent navigating was tested using a two-tailed Mann-Whitney-Wilcoxon non-parametric test. The results were not statistically significant between the groups for all participants.
Considering the navigation effectiveness, all of the participants could find the designated route's destination. While navigating the route, 21% of participants from the virtual tour group made a mistake during navigation, and one participant even failed twice. From the floor plan group, 35% of participants took a wrong turn while navigating, but none of them repeatedly. The difference between navigation effectiveness in the two groups was tested with a two-tailed Mann-Whitney-Wilcoxon non-parametric test. The results between the groups were not statistically significant (α = 0.05; W = 178.50; p-value = 0.5041; r = 0.0017).
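The conventions stated at the start of this section (effect-size thresholds and the 1.5×IQR rule for boxplots) can be illustrated with a minimal Python sketch. This is not the R code used in the study: the function names, the "negligible" label for values below the smallest threshold, and the quartile method (linear interpolation between order statistics) are assumptions made for illustration only.

```python
import statistics

# Thresholds quoted in the text: small, medium, large.
THRESHOLDS = {"d": (0.2, 0.5, 0.8), "r": (0.1, 0.3, 0.5)}


def interpret_effect(value, kind="d"):
    """Label an effect size (Cohen's d or r) as small, medium or large."""
    small, medium, large = THRESHOLDS[kind]
    magnitude = abs(value)  # the sign only encodes direction
    if magnitude >= large:
        return "large"
    if magnitude >= medium:
        return "medium"
    if magnitude >= small:
        return "small"
    return "negligible"  # assumed label; the text defines no name for this range


def tukey_fences(values, k=1.5):
    """Return the lower and upper Tukey fences (Q1 - k*IQR, Q3 + k*IQR)."""
    # Assumption: quartiles by linear interpolation ("inclusive" method).
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, k=1.5):
    """Return the values lying outside the Tukey fences."""
    lower, upper = tukey_fences(values, k)
    return [v for v in values if v < lower or v > upper]
```

Applied to the value reported above, `interpret_effect(0.0017, "r")` falls below the smallest threshold, which agrees with the non-significant group difference in navigation effectiveness.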
Route deviations for each decision point are illustrated in Figure 6. Photographs of the decision points are available as supplementary material. Only participants from the floor plan group (three of them) deviated from the designated route at the final decision point (5. DP), where there was an alternative exit from the building through a white glass door. Two participants from the virtual tour group deviated from the designated route at the very start (cyan and purple color) and proceeded straight ahead instead of turning right. Two participants (virtual tour group and floor plan group) passed by the staircase instead of using it (pink and brown color). Two participants from the floor plan group turned left instead of right after the first staircase (red color). One participant from the virtual tour group (dark blue color) turned left at the start of the designated route and wanted to ascend two floors on the first staircase, but then changed his mind and proceeded correctly.
As neither the effectiveness nor the efficiency of navigation differed significantly between the groups of participants, hypothesis H1 can be confirmed (Users who learn the route from the schematic floor plan navigate along the designated evacuation route as efficiently and effectively as users who learn the route from the virtual tour.).
H2.1: Virtual tour users will concentrate more on landmarks and their visual characteristics while navigating the route than schematic floor plan users.
The degree of attention participants paid to landmarks was monitored using eye tracking. Route intersections (six decision points) where landmarks provided the greatest advantage to support decision-making during navigation were especially monitored. To quantify the performance of both participant groups at decision points, fixation and saccade count metrics were used. The differences in mean fixation and saccade counts at decision points were examined with a two-tailed Welch's two-sample t-test. The results between the groups were statistically significant (fixation count: α = 0.05; t = −3.26; df = 20.21; p-value = 0.0039; saccade count: α = 0.05; t = −3.41; df = 20.13; p-value = 0.0028, see Figure 7). A strong effect of the learning stimuli types on both eye fixation count (Cohen's d = 1.35) and saccade count (Cohen's d = 1.41) was observed.
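For readers who want to see how the reported quantities (t, fractional df, and Cohen's d) are obtained, the following Python sketch computes Welch's t statistic, the Welch-Satterthwaite degrees of freedom, and Cohen's d. It is an illustration under stated assumptions, not the analysis script of the study: the function names are hypothetical, and the pooled-standard-deviation form of Cohen's d is assumed because the text does not specify which variant was used. The p-value is omitted because it requires the t distribution, which the Python standard library does not provide.

```python
import math
import statistics


def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite df."""
    n1, n2 = len(a), len(b)
    # Sample variances (n - 1 in the denominator), scaled by group size.
    s1 = statistics.variance(a) / n1
    s2 = statistics.variance(b) / n2
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(s1 + s2)
    df = (s1 + s2) ** 2 / (s1 ** 2 / (n1 - 1) + s2 ** 2 / (n2 - 1))
    return t, df


def cohens_d(a, b):
    """Cohen's d using the pooled standard deviation (assumed variant)."""
    n1, n2 = len(a), len(b)
    pooled_var = (
        (n1 - 1) * statistics.variance(a) + (n2 - 1) * statistics.variance(b)
    ) / (n1 + n2 - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var)


# Made-up fixation counts for two small groups, for illustration only.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
t_stat, dof = welch_t(group_a, group_b)
effect = cohens_d(group_a, group_b)
```

Because the two groups may have unequal variances, the degrees of freedom are generally not whole numbers, which is why values such as df = 20.21 appear in the results above.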
The degree of attention paid to specific landmarks at individual decision points (DP) was also investigated. The route had six decision points where participants had to decide which direction they would proceed. AOIs were created at these decision points for each object considered a landmark according to its semantic significance (see Supplementary Materials).
Using the Semantic Gaze Mapping method, eye tracking records were processed for each participant. Fixations outside AOIs were not analyzed. Figure 8 shows the sequence of fixations at AOIs for each DP and participant. At the first DP, no difference was observed between the experimental groups. The fire extinguisher was as attractive to the floor plan group as the virtual tour group. A bigger difference can be seen at the second DP, where participants from the floor plan group paid more attention to the evacuation sign than the virtual tour group. At the third DP, participants from the floor plan group focused mainly on doors and windows and less on flowers, floor level signs and radiators. Participants from the virtual tour group focused much more on floor level signs and radiators. The degree of attention paid to doors was almost the same. The fourth DP had only two AOIs: stairs and a green evacuation sign, which attracted only one participant's attention from the virtual tour group. More fixations on stairs were observed in the virtual tour group. At the fifth DP, participants from the virtual tour group paid more attention to flowers than participants from the floor plan group. Both groups showed almost the same number of fixations on radiators and windows. A high number of fixations on the white door was observed in some individuals from both groups. The final DP showed no difference between the groups.
Since the degree of attention paid to landmarks appeared to be dependent on the landmark type, hypothesis H2.1 cannot be confirmed (Virtual tour users will concentrate more on landmarks and their visual characteristics while navigating the route than schematic floor plan users.).
H2.2: Virtual tour users will create more detailed navigation instructions and include more landmarks and visual characteristics in these instructions than schematic floor plan users.
The navigation instructions created by the participants were analyzed in terms of their accuracy and informational content. The accuracy of navigation instructions was evaluated using the method described in Section 3. Using the content analysis, the frequency of code categories detected in the navigation instructions was counted and compared. The richness of the navigation instructions was measured by the sum of all the code categories in a particular statement. Figures 9 and 10 illustrate the differences in accuracy and richness of the instructions between groups and experiment stages. The differences were tested using a Mann-Whitney-Wilcoxon non-parametric test (between-group comparison) and Wilcoxon signed rank test (between-stages comparison). The results are shown in Tables 2 and 3. Statistically significant differences (α = 0.05) are highlighted in bold italics.
The occurrences of individual code categories in navigation instructions were also analyzed. Because a strong effect of learning stimuli on the richness of the created navigation instructions was observed (richness was significantly lower in both experiment stages in the floor plan group), the relative number of occurrences was calculated for a between-group comparison. For each participant, the frequency of each code category was divided by that participant's total number of coded occurrences. Figure 11 illustrates the differences in relative occurrences of individual code categories in navigation instructions between groups and experimental stages. The differences were tested using a Mann-Whitney-Wilcoxon non-parametric test (between-group comparison) and Wilcoxon signed rank test (between-stages comparison). The results are shown in Tables 2 and 3. Statistically significant differences (α = 0.05) are highlighted in bold italics.
Participants from the virtual tour group created significantly richer navigation instructions. In the navigation instructions created before navigation, they also included significantly more landmarks (except in the "stairs" code category), but significantly fewer visual characteristics than participants in the floor plan group. After navigation, the number of mentioned landmarks and their visual characteristics was more balanced. Based on these results, hypothesis H2.2 can be partially confirmed, the exception being the "stairs" category (Virtual tour users will create more detailed navigation instructions and include more landmarks and visual characteristics in these instructions than schematic floor plan users.).
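The richness measure and the per-participant normalisation used for the code categories can be sketched in a few lines of Python. The function names and the example counts are hypothetical and serve only to illustrate the arithmetic; they are not data from the study.

```python
def richness(counts):
    """Richness of one participant's instructions: sum over all code categories."""
    return sum(counts.values())


def relative_occurrences(counts):
    """Divide each category frequency by the participant's total occurrences."""
    total = richness(counts)
    if total == 0:
        # A participant with no coded statements contributes zeros.
        return {category: 0.0 for category in counts}
    return {category: n / total for category, n in counts.items()}


# Made-up code category counts for a single participant.
example = {"doors": 4, "stairs": 2, "signs": 2}
shares = relative_occurrences(example)
```

Normalising in this way removes the overall difference in richness between the groups, so that the between-group comparison reflects which categories participants emphasised rather than how much they said in total.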
H3: Males will more accurately estimate the direction and length of the route than females. Females will identify more landmarks correctly in photographs than males.
Gender differences were also studied in the data analysis. Descriptive statistics for selected metrics for both genders are summarized in Table 4 and Figure 12. The differences were tested using a Mann-Whitney-Wilcoxon non-parametric test (between-group comparison) and Wilcoxon signed rank test (between-stages comparison). The results of the differences are shown in Tables 4 and 5. Statistically significant differences (α = 0.05) are highlighted in bold italics.
Before navigating, males estimated route length and direction significantly better than females. After navigating, males performed better only in route length estimation. An interesting trend was observed in male participants: their overall direction estimation deviation was significantly higher (α = 0.05; V = 42.50; p-value = 0.0356; r = 0.2928) after navigating than before navigating. Females performed significantly better in the landmark identification task than males. Since females estimated directions as well as males after navigating, hypothesis H3 can be only partially confirmed (Males will more accurately estimate the direction and length of the route than females. Females will identify more landmarks correctly in photographs than males.).

Discussion
To summarize, participants from both groups were able to successfully navigate the designated evacuation route. The total time spent on the route was not statistically different between groups. Nevertheless, the comparison of descriptive statistics indicated possible group differences. The most notable difference may be the variation in time spent on the task (higher in the FP group), which differs between the FP and VT groups if all participants are included (Table 1). It possibly indicates the different nature of navigation information derived from the stimuli. When participants from the FP group made navigation errors, it took them more time to get back to the intended route. This could possibly be explained by the absence of general but visually attractive landmarks (flowers, radiators, etc.) on the floor plan stimuli. This finding needs to be verified in future studies.
In a closer analysis of the route travelled by participants who made navigation errors, similarities in both groups were observed (Figure 6). Participants from the floor plan group made errors at the fourth (3. DP) and the final DP (5. DP), turning in the wrong direction. Participants in this group perhaps only had a limited mental image of decision points, as opposed to the virtual tour group, who had seen detailed representations of the actual DPs in the virtual tour. At the fourth DP (3. DP), participants in the virtual tour group could decide according to the 2nd floor level sign, which they had previously seen in the learning stage. Analyzed eye tracking data confirmed a greater degree of attention given to this sign by participants in the virtual tour group. We argue that in both cases, the participants from the virtual tour group benefited from the additional visual information they acquired in the learning stage and therefore made correct decisions as opposed to participants from the floor plan group. However, only participants from the virtual tour group committed navigation errors at the start of navigation (0. DP). Two participants proceeded straight ahead instead of turning right at the second DP (1. DP). These results are in accordance with Dalton's conclusions [11] and also with the initial segment strategy [12], according to which navigators are literally "following their noses" and prefer routes that have an initial straight segment. This strategy could explain participants' decisions when they were not sure how to proceed along the route. Furthermore, acquiring the same orientation as in the learning stimuli was not possible, and initial orientation therefore required a different degree of mental translation.
Floor plan stimuli also provide information about the surrounding environment and therefore better represent the spatial context of the designated route. The "turning effect" used in the virtual tour could also have affected the initial choice, as several participants stated in the interviews that they needed time to understand it.
It is important to emphasize that the two visualizations in the experiment provided different levels of detail about the evacuation route and that the virtual tour more closely represented the real indoor navigational experience. In the main task, both groups performed similarly, yet the two groups developed different mental spatial representations. This finding is consistent with studies conducted by Evans and Pezdek [18] and Thorndyke and Hayes-Roth [19]. Considering the results presented in Section 4, it could be said that the participants who learned the route from the virtual tour, the visualization with the higher level of realism, developed a more detailed mental spatial representation. Participants learning from the virtual tour created richer navigation instructions, as expected in our hypothesis. After navigating the designated route, the richness of navigation instructions created by the virtual tour group decreased. In the floor plan group, however, the opposite effect was observed. One reason may be that in the second interviews, participants tried to mention only the relevant landmarks they had used for orientation. A similar conclusion was reached by Čížková [66].
Based on the evaluation of eye tracking data for route decision points, significantly higher numbers of fixations and saccades were found in participants from the virtual tour group. Overall, participants from the virtual tour group focused more on landmarks they knew from the virtual tour (flowers, radiators, door signs) than participants from the floor plan group, who had not previously seen those landmarks on the floor plan stimulus. Participants from both groups focused mostly on functional landmarks, which corresponds to the findings in studies by Ohm et al. [82] and Viaene et al. [72], and also to the content analysis results of the navigation instructions in which participants often mentioned the doors and stairs. They also frequently mentioned the evacuation arrows indicating the route's direction, which were therefore very helpful orientation cues. However, these are relatively small objects, and fixations on them could not be assigned to an area of interest because of the limited accuracy of the eye tracking device.
The results from the analysis of gender differences are consistent with the results of studies examining similar issues. Self and Golledge [59] showed that females orient themselves more according to landmarks and their visual characteristics, which corresponds to the higher score of points accumulated in the task of identifying landmarks from photographs. The results also demonstrated that males oriented themselves more using direction and distance, which corresponds to the smaller spatial deviations in their estimations observed in our study. Interestingly, the overall estimations of direction by females became better after navigating the route. By contrast, mean deviation in the male participants doubled. This was mainly, however, caused by the estimation of one male participant, which was significantly worse after navigation. Female participants scored lower in estimating route length, but even the mean deviation scored by male participants was as high as the actual route length. During interviews, most of the participants reported that it was more difficult to estimate distance in indoor environments and that they tried to compare this distance, for example, to an everyday distance such as walking to a bus stop or around a running track.
Boumenir et al. [57] conducted an experiment that was in many ways similar to this study; however, they found that virtual tour users performed more poorly. The lower effectiveness of the virtual tour may have been the result of a different design. Directions were represented using arrows that always pointed straight ahead on the monitor. Users therefore relied on a linear mental representation of space. The virtual tour in our study was designed so that when a participant clicked on another decision point, the scene automatically and slowly turned in the direction along which the route continued. We did not observe this effect providing any advantage or disadvantage in creating non-linear mental spatial representations (e.g., survey knowledge). However, how a virtual tour is designed and how a user can interact with it influence the user's performance in solving tasks.
Our results are also likely to differ since our experiment was conducted inside a building as opposed to a forest park outside.
Most of the studies mentioned above differed in some way from our study in terms of experimental conditions, visual stimuli, number of participants, and so on. This study is one of the first examining the use of a virtual tour for evacuation from a building. It is important to note that the results cannot be easily generalized because they relate to the interior of a particular building and designated route, a limited number of participants and the specific equipment used in the experiment. All the factors mentioned could possibly influence the documented results.

Conclusions and Future Work
Visualizations used in navigation provide the basis for developing the mental spatial representations that are crucial to effective navigation. In our study, we simulated a simple evacuation scenario in which a person tried to find the way to the evacuation assembly area after learning the route from the presented stimuli. The degree of realism of geographic visualization is known to affect user performance in navigation tasks.
Two variants of cartographic visualizations (a schematic plan and a virtual tour) were created for the study, both depicting the same evacuation route established for the purpose of the experiment. The main aim of the study was to review the role of different levels of realism in graphic stimuli (2D floor plan and virtual tour) on the accuracy and efficiency of indoor navigation. Specifically, successful completion and the level of detail of mental spatial representations developed by different user groups were examined.
The results of the experiment showed that the type of cartographic visualization did not influence whether participants completed the navigation task successfully, but the participants who learned the route from the virtual tour developed more detailed mental spatial representations of the building's interior. A strong effect of learning stimuli on the overall richness of created navigation instructions was observed. Regarding specific landmark code categories, a greater difference was observed before the navigation stage. Navigation experience seemed to balance the observed differences, but not entirely. The type of the learning stimulus had a strong effect on the navigation process, influencing eye movement activity. Participants from the virtual tour group demonstrated significantly more fixations and saccades, which could imply that different cognitive processes were involved in solving the task. This matter needs to be examined more closely, however. Since the presented study was designed with high ecological validity and considered a real-life building and experience, it provides valuable insight into how the level of realism of a cartographic visualization influences the user performing the evacuation task. The results, however, are closely tied to the relatively small number of participants and the building's specific characteristics and indoor spatial metrics.
Future work could involve analyzing individual differences, especially more detailed studies of the eye tracking data collected in the learning stage from participants who made navigation errors. For more generally conclusive results, additional experiments in different buildings and with more participants would be required. Bao et al. [94] argued that it would be better to propose categories of buildings based on their navigability rather than test individual buildings. Real-time monitoring of human behavior complemented by space syntax describing specific environments could provide general results for predicting navigation success.