Exploring the Cognitive Load of Expert and Novice Map Users Using EEG and Eye Tracking

The main objective of this research is to explore the cognitive processes of expert and novice map users during the retrieval of map-related information at varying difficulty levels (i.e., easy, moderate, hard), using eye tracking and electroencephalogram (EEG). In this context, we present a spatial memory experiment consisting of a large number of stimuli to study the effect of task difficulty on map users’ behavior through cognitive load measurements. Next to the reaction time and success rate, we used fixation- and saccade-related eye tracking metrics (i.e., average fixation duration, the number of fixations per second, saccade amplitude and saccade velocity), and the EEG power spectrum (i.e., event-related changes in alpha and theta frequency bands) to identify the cognitive load. While fixation metrics indicated no statistically significant difference between experts and novices, saccade metrics indicated otherwise. EEG power spectral density analysis, on the other hand, suggested an increase in theta (i.e., event-related synchronization) and a decrease in alpha (i.e., event-related desynchronization; except for moderate tasks) at all difficulty levels of the task for both experts and novices, which is an indicator of cognitive load. Although no significant difference emerged between the two groups, we found a significant difference in overall performance when the participants were classified as good and relatively poor learners. Triangulating the EEG results with the recorded eye tracking data and the qualitative analysis of focus maps provided a detailed insight into the differences in the individuals’ cognitive processes during this spatial memory task.


Introduction
Cognitive processes emerge from both overt (externally detectable) and covert attention (internally detectable), and attention is a fundamental cognitive function that controls all the other cognitive processes, such as perception, memory and learning. Attention can be driven unintentionally by external events (i.e., bottom-up), or deliberately by internal expectations requiring cognitive control (i.e., top-down). Top-down attention influences the selection of visual stimuli based on previous experience and current goals, while filtering out distracting objects/visuals. The working memory, whose performance depends on the cognitive demands of the task, plays a critical role in guiding these top-down attentional processes by keeping present goals in mind [1]. Map learning involves complex cognitive processes, and is different from other learning concepts, in the sense that it requires understanding and memorizing the information presented in map format, and this information is presented at once [2]. When people need to perform a spatial memory task, they tend to memorize the location, color, shape, and size of the objects (i.e., visual variables), together with their spatial relationships between each other [3]. They also adapt themselves to when, how, and in which order they select and focus on a map object of their interest. Therefore, each map user can develop their own strategy to approach the spatial information on maps. Being an example of top-down attentional tasks, map learning causes a cognitive load that varies depending on the task difficulty and the individual characteristics. Cognitive load refers to the used amount of working memory resources. It has been used to explain how humans deal with increasing cognitive demands associated with the increased task difficulty in actions where the cognitive skills are more important than the physical ones. 
Even if the task's difficulty is one of the most essential factors affecting performance, cognitive load is used to describe the mental cost of accomplishing task demands. Fluctuations of attentional state are also modulated by cognitive load in a sense that an increase in cognitive load involves increased attentional processing [4]. In this context, map design and the level of complexity of maps might have an impact on cognitive load, and even influence how difficult a particular task can be.
Performance is generally defined by reaction times and accuracy/success rates. It is worth mentioning that reaction time can also serve as an indicator of participant fatigue. More difficult tasks require more cognitive effort from a user and may result in longer response times. Cognitive load can be a complementary measure to distinguish between users who perform a task with equal reaction times and accuracy rates, but with different levels of cognitive effort, which helps to develop interfaces that require less cognitive capacity.
Cognitive load can be estimated using both fixation- and saccade-related eye tracking metrics. On the one hand, fixations are stable points of regard (PORs) during a certain time span (at least 80 to 100 ms) and indicate the users' content interpretation at that location [5]. For instance, average fixation duration is associated with attentional processes, while the number of fixations per second indicates the speed of attention. Fixation duration and the number of fixations are generally inversely correlated, and longer fixation durations indicate a higher processing load [3,6]. On the other hand, saccades are short (typically 30-80 ms) and voluntary eye movements between two fixations and can be visualized as scan paths. Saccadic eye movements are characterized by their amplitude, duration, and velocity, and the relationship between these three parameters is called the 'main sequence'. Measuring saccade velocity and amplitude helps in observing the pattern of a scan path and exploring cognitive effort. Saccade amplitude (length) and saccade velocity are highly correlated with each other and are discriminatory parameters in terms of cognitive performance [4]. Saccade velocity (°/s) is the average saccade speed in degrees per second, whereas saccade amplitude (°) is the size of the saccade in degrees. A higher average saccade velocity indicates higher stress and task complexity and lower concentration while doing the task. The higher the cognitive load, the shorter the saccades, and the higher the saccade velocity [7].
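The four metrics described above can be illustrated with a minimal sketch. It assumes that fixations and saccades have already been detected by the eye tracker software, and it approximates the velocity of a saccade as its amplitude divided by its duration. The function names and all input values are hypothetical and are not taken from the experiment.

```python
from statistics import mean


def fixation_metrics(fixation_durations_ms, trial_duration_s):
    """Return (average fixation duration in ms, number of fixations per second)."""
    avg_duration_ms = mean(fixation_durations_ms)
    fixations_per_second = len(fixation_durations_ms) / trial_duration_s
    return avg_duration_ms, fixations_per_second


def saccade_metrics(amplitudes_deg, durations_ms):
    """Return (mean amplitude in degrees, mean velocity in degrees per second).

    Assumption: the velocity of one saccade is approximated as
    amplitude / duration, which is a simplification of what an
    eye tracker reports.
    """
    velocities = [a * 1000.0 / d for a, d in zip(amplitudes_deg, durations_ms)]
    return mean(amplitudes_deg), mean(velocities)


# Hypothetical values for a single 7 s trial.
avg_fix, fix_rate = fixation_metrics([200, 300, 250, 250], trial_duration_s=7.0)
mean_amp, mean_vel = saccade_metrics([4.0, 6.0], [40, 50])
print(avg_fix, round(fix_rate, 3), mean_amp, mean_vel)
```

In such a sketch, a longer average fixation duration combined with a smaller mean amplitude would be read as a sign of higher cognitive load, in line with the interpretation given above.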
Based on the existing eye tracking literature on the differences between expert and novice map users, we know that experts have better defined eye-scanning patterns, mostly have shorter reaction times and fixations and more fixations per second, e.g., [8,9], and also fewer saccades, e.g., [10], all of which are correlated with a low cognitive load. Regardless of their expertise, users' eye movements reflect the main elements on map stimuli, and their attention is influenced by deviating colors on maps [11]. The cognitive strategies of experts and novices might differ as well, regardless of the type of visual stimuli. In the context of solving a physics problem, correct answers are associated with participants looking at thematically relevant areas, whereas wrong answers are correlated with a focus on perceptually salient areas of the visual stimulus [12]. Similarly, while solving a thematic map problem, unsuccessful participants were not able to use the thematic legend properly or to focus on the relevant map layout elements and adequate map content [13].
Electroencephalogram (EEG) is another non-invasive and direct method to measure cognitive processes in the brain. The EEG signal represents oscillations observed across a wide range of frequencies which are commonly divided into distinct frequency bands (i.e., delta band: <4 Hz, theta band: 4-8 Hz, alpha band: 8-12 Hz, beta band: 13-30 Hz, gamma band: >30Hz) [14]. Spectral analyses of the EEG (i.e., power spectral density (PSD)) can be used to compute the band-specific frequency power for given periods of time, i.e., during a task/trial. Task-related power decreases from a reference to an activation interval are commonly referred to as event-related desynchronization (ERD), while power increases are referred to as event-related synchronization (ERS). Alpha and theta power are associated with the cognitive load, in a sense that alpha decreases and theta increases as cognitive processing increases [15].
To study the cognitive procedures of individuals during a map learning task, eye tracking and electroencephalogram (EEG) technologies can be combined, e.g., [3,6,16-18]. Since eye movements and attentive cognition are linked, it is possible to detect users' cognitive states in situ via eye trackers. Once these cognitive states are understood, effective spatial visualizations that adapt themselves to their users' current cognitive capacities (e.g., cognitive load) can be developed [19]. While eye tracking is used to detect overt attention through gaze movements, EEG, which is sensitive to instantaneous changes in the brain, is more likely to detect covert attention through direct measures of the electrical activity along the scalp. Like eye tracking, EEG requires a statistical and visual analysis of cognitive processes. Furthermore, it has commonly been applied in cognitive and experimental psychology to study how the human brain responds to any kind of external stimuli, e.g., [20-22]. Therefore, the co-registration of eye movements and EEG rhythms is promising for cartographic usability research, especially when studying the behavior of different map user groups (e.g., experts, novices), as the insights arising from individual differences contribute to creating effective and user-friendly cartographic products for those user groups.
Our main research objective is to explore the cognitive processes of expert and novice map users during the retrieval of map-related information contained in map stimuli and within varying difficulty levels. Therefore, we aim to test the effect of task difficulty on map users' attentive behavior through cognitive load measurements. We are interested in exploring whether the cognitive procedures used by experts and those used by novices differ for basic spatial memory tasks. We expect that experts might apply more structured strategies that are particular to map use, and might execute the tasks faster and more efficiently due to their specific map knowledge. This paper is part of a larger PhD research study. Previously, we investigated the spatial memory (i.e., map learning) abilities of map users through two user experiments employing mixed methods of (i) sketch maps and eye tracking [3]; and (ii) eye tracking and EEG [6], emphasizing the importance of cartographic/psychological experimental design. While the previous paper [6] mostly focused on the experimental set-up of the user study presented in this paper, we now present the results of the EEG analysis in detail with the aim of triangulating them with the recorded eye tracking data. With this approach, we are able to interpret the cognitive processes occurring during this spatial memory task in a more holistic way. In this context, we introduce the behavioral data (i.e., reaction time, response accuracy), saccade-related metrics, such as saccade velocity and saccade amplitude, their relationship with the previously obtained fixation-related metrics, and their impact on understanding the cognitive strategies of expert and novice map users. Additionally, the attentional behaviors of the two groups are further explored through the qualitative analysis of focus maps. We also provide an event-related EEG analysis (i.e., PSD) of the two user groups for different difficulty levels of the spatial memory task.
Alternatively, the recruited participants were classified as good learners and relatively poor learners, based on their overall task success rates, and we present the results of the EEG analysis conducted with respect to this classification as well.

Methodology
The methodology used to process the collected eye tracking and EEG data is not straightforward, and the algorithms used to detect fixations, saccades, or EEG rhythms could influence the results. We used the same experimental design as in [6] and analyzed a subset of the same dataset. Table 1 summarizes the experimental design elements. Preprocessing of the EEG recordings, including steps such as noise filtering, bad channel removal, channel interpolation, re-referencing, and segmentation, was handled in the open-source EEGLAB MATLAB toolbox by following Makoto's preprocessing pipeline [23]. Event-related changes in the power spectral density (PSD) with respect to the alpha and theta frequency bands were calculated as explained in [15,24]. We calculated the EEG metrics not only for experts and novices (i.e., classification based on expertise), but also for good learners and relatively poor learners (i.e., classification based on success rates). Similar in principle to what was done by Thorndyke and Stasz [2], we defined good learners as those who performed better than the average, and the rest as relatively poor map learners, regardless of their expertise.
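The success-based classification described above can be sketched as follows. This is a minimal illustration, assuming that "the average" refers to the mean overall success rate of all participants and that a participant scoring exactly at the mean is assigned to the relatively poor group (the text does not specify how ties are handled). Participant identifiers and scores are hypothetical.

```python
from statistics import mean


def classify_learners(success_rates):
    """Split participants into good and relatively poor learners around the group mean.

    success_rates maps a participant ID to an overall success rate in percent.
    Assumption: a score equal to the mean counts as relatively poor.
    """
    group_mean = mean(success_rates.values())
    good = sorted(p for p, rate in success_rates.items() if rate > group_mean)
    poor = sorted(p for p, rate in success_rates.items() if rate <= group_mean)
    return group_mean, good, poor


# Hypothetical participants and overall success rates (%).
rates = {"P01": 96.0, "P02": 80.0, "P03": 94.0, "P04": 90.0}
group_mean, good, poor = classify_learners(rates)
print(group_mean, good, poor)
```

Because the split is made around the mean rather than the median, the two groups need not be equal in size.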

Apparatus
A dual PC set-up was established for EEG and eye tracking to simultaneously capture participants' psychological data (see [6] for more detail about the set-up). EEG was recorded using BIOPAC AcqKnowledge software and hardware, and an International 10-20 System ECI electrode cap (i.e., recording electrodes: Fp1, Fp2, F3, F4, C3, C4, P3, P4, O1, O2, F7, F8, T3, T4, T5, T6; the linked mastoids as reference and ground) with a sampling rate of 500 Hz. The SMI RED 250 eye tracker was synchronized with EEG to capture the gaze activities simultaneously, and to monitor possible eye movement artifacts in the EEG. To ensure a sufficiently good skin-electrode interface, the impedance was measured using BIOPAC EL-CHECK before recording for every participant. We took care to keep the electrode impedances of the whole circuit (i.e., ground, active EEG, reference electrodes) below 10 kΩ, as suggested by Herman [25] and Teplan [26]. To ensure that participants remained at a fixed distance from the screen and to avoid head movements, we used a chin rest, which was positioned at 70 cm from the screen. The horizontal and vertical eye positions of both eyes were recorded at a rate of 250 Hz.

Participants
This research was approved by the Ethics Committee of the Faculty of Business and Economics of Ghent University, where we conducted our experiments. We considered as experts those who hold at least an MSc degree in geomatics or other geo-related domains. The novices were selected from volunteers who had no professional experience with maps. They were older than 23 years, so that their age could be matched with the average age of the experts with a low standard deviation. In total, experts and novices, matched in age (N = 20, MED = 27.2, SD = 3.9) and gender (10 F, 10 M), performed the same experiment under the same conditions (see Table 1 for more detail).

Task and Stimuli
The spatial memory task, related to the retrieval of the main structuring elements of maps, varied in difficulty: hard, moderate, and easy. Hard tasks focused on (1) all elements, and (2) roads and hydrography; moderate ones focused on (3) roads and green areas, and (4) green areas and hydrography; and easy ones focused on (5) green areas, (6) hydrography, and (7) roads. Accordingly, the stimuli were presented as seven randomized blocks, each including 50 trials of the same type of map elements (i.e., 350 trials in total) (Figure 1). To classify the task difficulty, we considered the average reaction times of all participants, corrected for the number of errors committed, i.e., the inverse efficiency score. It is the oldest and most frequently used measure to integrate reaction times and error rates [27].
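As an illustration of the measure mentioned above, the inverse efficiency score is commonly computed as the mean reaction time divided by the proportion of correct responses, so that a block with more errors receives a higher (worse) score. The sketch below assumes this common definition. The block names and values are hypothetical and do not reproduce the experimental data.

```python
def inverse_efficiency_score(mean_rt_s, proportion_correct):
    """Mean reaction time (s) divided by the proportion of correct responses."""
    if not 0 < proportion_correct <= 1:
        raise ValueError("proportion_correct must be in the interval (0, 1]")
    return mean_rt_s / proportion_correct


# Hypothetical per-block values: (mean reaction time in s, proportion correct).
blocks = {"block_a": (5.0, 0.80), "block_b": (3.0, 0.96)}
scores = {name: inverse_efficiency_score(rt, pc) for name, (rt, pc) in blocks.items()}

# A higher score suggests a harder block.
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores, ranked)
```

Ranking the blocks by this score and then grouping them would yield difficulty classes of the kind used here; the grouping thresholds themselves are not specified in the text.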

Procedures
Before each trial, participants were shown a fixation cross in the middle of the screen for two seconds. This is called the baseline period (i.e., reference interval) and refers to the pre-stimulus duration without any task demands, except for concentrating on the displayed cross. During the trial, participants were asked to study a map stimulus in a free-viewing condition for seven seconds. This is called the task period (i.e., activation interval), in which the participants were required to perform the experimental task. After studying the map stimulus, four graphical response panels appeared. The panels included skeleton maps in which only the main structuring element(s) of interest were drawn, and the participants were instructed to select the panel with the correct skeleton map corresponding to the stimulus they had just studied. As soon as they entered their answer with a simple key command, the trial was terminated, they were automatically presented with the fixation cross for two seconds, and then the next map stimulus/trial was initiated. Including the preparation of the participant (i.e., reading instructions, signing informed consent, wearing the EEG cap, impedance check, and calibration of the eye tracker) and small breaks between blocks to combat fatigue, each participant required 2.5 hours on average to complete the experiment consisting of seven blocks.

Psychological Measures to Estimate Cognitive Load
Next to the average fixation duration and the number of fixations per second, which were previously published in [6], we explored saccadic eye movements as measures of cognitive processing demands, i.e., cognitive load, because they are highly distinctive, task-dependent, and can be correlated with fixation duration to interpret the overall cognitive load [19]. There is strong evidence that longer fixation durations and shorter saccades are related to higher cognitive load [28] and indicate that more attentional resources are required [29]. Consequently, it can be stated that fixation duration and saccade velocity increase, whereas saccade length decreases, as information processing demands rise.
In addition to the above-mentioned quantitative analysis of the collected eye fixations and saccades, we randomly selected 10 stimuli used in the experiments and conducted a qualitative analysis (e.g., visual inspection of eye fixations), focusing on the attentional behavior of the participants to the map elements of interest using focus/heat maps.
For EEG, we focused on specific groups of electrodes, because alpha reduction (alpha ERD) is generally observed at parietal regions, e.g., [30,31], and theta increase (theta ERS) is most pronounced over frontal electrode locations [32]. In this context, we focused on the Fp1, Fp2, F3, F4, F7, and F8 frontal channels for theta power and the P3 and P4 parietal channels for alpha power, e.g., [33-35] (Figure 2). After averaging all usable trials within the reference and activation periods, event-related power changes at an electrode were calculated by subtracting the log-transformed power during the activation intervals from the log-transformed power during the reference intervals [6]. Subsequently, we grouped the trials based on the task difficulty and averaged the theta spectral power at the frontal channels and the alpha spectral power at the parietal channels for novices and experts separately.
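The calculation described above can be sketched as follows. The sketch reads the sentence literally, i.e., the change is the log-transformed reference power minus the log-transformed activation power, so that a positive value corresponds to a power decrease (ERD) and a negative value to a power increase (ERS). The use of the natural logarithm, the function names, and the band power values are assumptions made for illustration; the opposite sign convention (activation minus reference) is also in use, and the interpretation of the sign must follow the convention chosen.

```python
import math
from statistics import mean


def event_related_change(reference_powers, activation_powers):
    """log(mean reference power) - log(mean activation power) for one electrode.

    Each list holds one band power value per usable trial.
    Assumption: natural logarithm, power averaged over trials before the log.
    """
    reference = mean(reference_powers)
    activation = mean(activation_powers)
    return math.log(reference) - math.log(activation)


def label_change(change):
    """Name the direction of the change under the reference-minus-activation convention."""
    if change > 0:
        return "ERD"  # power decreased from reference to activation
    if change < 0:
        return "ERS"  # power increased from reference to activation
    return "no change"


# Hypothetical band power values (arbitrary units), two trials each.
theta_change = event_related_change([2.0, 2.0], [4.0, 4.0])  # power doubles
alpha_change = event_related_change([4.0, 4.0], [2.0, 2.0])  # power halves
print(label_change(theta_change), label_change(alpha_change))
```

Averaging such per-electrode values over the frontal channels (for theta) and the parietal channels (for alpha) would then give one value per participant and difficulty level.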

Behavioral Measures
The overall average reaction time was 5.1 s (SD = 1.1) for experts and 3.7 s (SD = 0.5) for novices; consequently, experts spent more time than novices did on all tasks. For hard tasks, experts spent 6.9 s (SD = 1.4) whereas novices took 5.0 s (SD = 0.6); for moderate tasks, experts took 5.4 s (SD = 1.4) while novices took 2.9 s (SD = 0.6); and lastly, for easy tasks, experts responded in 3.7 s (SD = 0.9) and novices in 2.8 s (SD = 0.4) (Figure 3a). These differences between the two groups were statistically significant for the hard and moderate tasks (Mann-Whitney U test: U hard = 106.000, p = 0.022; U moderate = 114.000, p = 0.035); however, this was not the case for easy tasks (Mann-Whitney U test: U easy = 128.000, p = 0.138). The overall average success rate (i.e., correct answers in %) was quite high for both groups of participants, averaged over 350 trials per participant in total, M overall = 91.8% (SD = 4.7, range = 78.3-98.3%). For hard tasks, experts scored 90.6%, whereas novices scored 86.8%; for moderate tasks, experts scored 93.5%, while novices scored 88.8%; and lastly, for easy tasks, experts achieved a 95.5% success rate and novices a 93.3% success rate (Figure 3b). The success rate did not significantly differ across the categories of expertise for any difficulty level (M experts = 93.2%, M novices = 89.6%; Mann-Whitney U test: U easy = 165.000, p = 0.691, U moderate = 150.000, p = 0.400, U hard = 178.500, p = 1.000). The high success rates stem from the design of the experiment, because we intended to collect data with as many correct answers as possible for the EEG analysis. Different cognitive processes occur in the brain when accomplishing a task and when failing it; hence, it is appropriate to consider correct and wrong answers separately. If approximately equal in number, correct and wrong answers could be compared in terms of the EEG-related analysis.
However, we were interested in the cognitive processes occurring during the accomplishment of the task, in other words, in correct answers. In this context, the experiment was designed as a rather easy one, so that when we exclude the wrong answers, which are far fewer in number, enough trials would still remain to average for the EEG power spectrum analysis.
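For readers unfamiliar with the test reported above, the following minimal sketch computes the Mann-Whitney U statistic from two independent samples by ranking the pooled values, with tied values sharing the mean of their ranks. It only returns the two U values and does not compute a p-value; in practice a statistics package would be used for that. The reaction times are hypothetical and are not the experimental data.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U for group_a, U for group_b) from the pooled rank sums."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Hypothetical reaction times in seconds for two small groups.
experts = [6.5, 5.5, 7.0, 6.0]
novices = [5.0, 4.5, 5.5, 4.0]
u_experts, u_novices = mann_whitney_u(experts, novices)
print(u_experts, u_novices)
```

The two U values always sum to the product of the group sizes, and the smaller one is the value usually compared against the reference distribution.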
Although we observed only slight differences in performance between the groups, it might be interesting to explore success rates in terms of good and relatively poor map learners, instead of experts and novices. Out of 20 participants, 15 were good learners, with an average overall score of 94.6%; this group consisted of nine experts (4F, 5M) and six novices (3F, 3M). The remaining five were relatively poor learners, with an average score of 85.8%; this group involved one expert (F) and four novices (2M, 2F). This difference was statistically significant (Mann-Whitney U test: U = 67.500, p = 0.000) and showed that good map learners remembered more map elements than the relatively poor learners.
The overall average success rate for hard tasks was 88.7%, with the lowest score being 66.0% and the highest 98.0%. Good learners scored 92.5% on average, whereas the other group scored 77.4%. The overall average success rate for moderate tasks was 91.2%, with the lowest score being 74.0% and the highest 98.0%. The average score of good learners was 93.7%, and the relatively poor learners scored 83.4%. The overall average success rate for easy tasks was 94.4%, with the lowest score being 80.0% and the highest 99.3%. Good learners scored 96.1%, while the remaining group scored 89.2%. The difference between good learners and the relatively poor learners was 6.9% for easy tasks, 10.3% for moderate ones, and 15.1% for hard ones. In other words, the performance difference between good learners and relatively poor learners increased as the task difficulty increased (Figure 4). One interesting finding to note is that the reaction times longer than the average (i.e., between 4.5 s and 6.6 s) all belonged to good learners, namely five experts and one novice. Additionally, the three shortest overall reaction times (i.e., 3.0 s, 3.1 s, 3.5 s) all belonged to relatively poor learners (all novices) with the three lowest overall success rates (i.e., 78.3%, 80.0%, 85.1%). This shows that spending more time on tasks helped experts achieve higher accomplishment rates, whereas the fast responses of novices resulted in a lower number of correct answers.

Psychological Measures
Although novices had longer fixation durations than experts for all tasks (Figure 5a), this difference in fixation duration was not statistically significant (F easy = 0.261, p = 0.232; F moderate = 0.174, p = 0.514; Mann-Whitney U test: U hard = 1,812,391.500, p = 0.886). Experts mostly exhibited a higher number of fixations per second for all difficulties (Figure 5b). However, the difference between experts and novices was not significant for hard and moderate tasks (F moderate = 1.861, p = 0.165; F hard = 0.064, p = 0.983), whereas it was significant for easy tasks (F easy = 0.006, p = 0.019) (see [6] for more detail).
In our previous paper [6], we suggested that it would be useful to investigate saccade-related metrics to interpret the cognitive load further. Figure 5c,d shows how saccade amplitude and velocity varied for experts and novices across the easy, moderate, and hard tasks. As the task became harder, we observed that the saccade amplitude became smaller; hence, the saccades became shorter, which indicates a higher cognitive load. Regarding saccade velocity, contrasting trends are seen for experts and novices. Novices exhibited the highest velocity in the easy category, which is linked with the highest amplitude, and the lowest velocity in the hard category, which is linked with the lowest amplitude. This finding is in line with previous research, e.g., [7]; however, the expert group showed the opposite result, in the sense that they had the highest velocity in the most difficult category, where they also had the lowest amplitudes. None of the saccade-related metrics for any type of task (i.e., easy, moderate, hard) fit the normal distribution (Shapiro-Wilk test for saccade amplitude: W = 0.933, p = 0.000 < 0.05; for saccade velocity: W = 0.970, p = 0.000 < 0.05). Therefore, we applied the non-parametric Mann-Whitney U test to measure the significance of the differences between the two groups. As a result, saccade amplitude (U hard = 1,554,376.500, p hard = 0.000 < 0.05; U moderate = 1,363,219.500, p moderate = 0.000 < 0.05; U easy = 3,061,036.000, p easy = 0.000 < 0.05) and saccade velocity (U hard = 1,750,439.000, p hard = 0.000 < 0.05; U moderate = 1,536,918.500, p moderate = 0.000 < 0.05; U easy = 3,368,847.500, p easy = 0.000 < 0.05) of expert and novice map users were significantly different for all types of tasks.
Compared to experts, novices exhibited larger saccades at all difficulty levels, and the difference in saccade amplitude between experts and novices increased as the task difficulty decreased (Figure 5c). The easy tasks elicited larger saccades and the hard tasks shorter saccades, as expected. Due to the higher number of elements to pay attention to in hard tasks, the participants had to jump from one object to another in a short amount of time; therefore, they exhibited shorter saccades. Shorter saccades indicated a higher cognitive load for experts at all difficulty levels; however, the saccade velocity data suggested a slightly different picture. The novices accomplished moderate and easy tasks with a higher saccade velocity, whereas experts had a higher saccade velocity in hard tasks (Figure 5d). These findings show that experts manifested more cognitive load in hard tasks, according to their shorter saccade amplitudes and higher saccade velocity, although their fixation durations were shorter than those of the novices, albeit not significantly. Accordingly, experts did not accomplish the tasks with less cognitive load, but they scanned the map faster and in a more effective way (with higher success rates) when it came to hard tasks.
We observed several common characteristics between experts and novices when the heat/focus maps of ten randomly selected stimuli were visually evaluated. Some fundamental remarks are listed as follows (see Figure 6):
• Block 1-all map elements: the road junctions, especially in the center or close to the center of the map, are where all the participants inherently focused the most. Both experts and novices generally paid the most attention to the green areas that are large and isolated. These isolated and large green areas received more and longer fixations in comparison to the water bodies. Hydrographic features received fewer fixations compared to the others. The labels/texts on the map also received much attention from both groups. This outcome might be due to the unfamiliar language used for the labels, or the size and position of the labels; therefore, it was a distraction, yet it could be a useful input for map design.
• Block 2-roads and hydrography: independently of what the spatial memory task demanded, green areas received as many fixations as roads and hydrography did from both groups. In some cases, the participants directed their attention even to the smaller green areas.
• Block 3-roads and green areas: a similar situation as in Block 2 occurred for Block 3, and, in this case, the hydrographic elements received as many fixations as the roads and green areas did.
• Block 4-hydrography and green areas: large green areas and road junctions received the most fixations. In this case, the relatively larger hydrographic areas did not receive as many fixations as the smaller ones did.
• Block 5-green areas, Block 6-hydrography, Block 7-roads: both expert and novice participants mainly focused only on what the task demanded; therefore, in their focus maps, only the map elements of interest stood out. This shows that it was easier to maintain undivided attention when participants needed to focus on only one map element class.
Based on the PSD analysis of alpha and theta, we observed an ERD in alpha and an ERS in theta for easy and hard tasks. This finding is in line with the literature on frontal theta activity increasing, e.g., [33,34,36,37], and parietal alpha decreasing, e.g., [30,31,38], with the cognitive load in a working memory task. For moderate tasks, alpha was observed to increase, as did theta. Although the changes in alpha seem very small, the theta effects seem stronger, and confirm that experts and novices experienced this spatial memory task differently, in the sense that experts exhibited more theta in moderate and hard tasks whereas novices exhibited more in easy tasks. The increase in alpha during moderate tasks might be due to this spectral power feature possibly not being sensitive enough to discriminate on an aggregated level. A great deal of information is lost, considering that the values for alpha power activity are averaged over the whole duration of the condition [38]. However, the results indicate an interaction with the participants for easy and hard tasks (see Figure 7, Table 2).
To compare the cognitive load based on the task difficulty, we focused on theta, since only a very small alpha effect was observed. The difference in theta changes between experts and novices was 0.0001525 for easy tasks, 0.0000462 for moderate tasks, and 0.0001372 for hard tasks. Theta values did not fit the normal distribution at any of the difficulty levels (Shapiro-Wilk test: W hard = 0.875, p = 0.001; W moderate = 0.773, p = 0.000 < 0.05; W easy = 0.922, p = 0.002); accordingly, we applied the Mann-Whitney U test to assess the significance of the differences. The distribution of the theta change was the same across categories of expertise and difficulties (U hard = 77.000, p hard = 0.519 > 0.05; U moderate = 124.000, p moderate = 0.367; U easy = 262.000, p easy = 0.766), which shows that the difference between the two user groups was not statistically significant. This finding suggests that theta power may not be very sensitive to average cognitive load. It may still be developed into a valid objective measure of average cognitive load, although its true potential lies in the possibility of measuring online fluctuations in cognitive load, i.e., instantaneous cognitive load [39]. Although we expected a greater effect on theta for the hard tasks compared to the others, we observed the greatest difference for easy tasks. This could be because the hard tasks were too overwhelming, making it hard to stay motivated, and because of the participants' tendency to give up and stop investing mental effort and resources, e.g., [38]. Participants confirmed in their post-test questionnaires that they found the task tiring and hard to focus on.
Alternatively, we calculated the event-related changes in EEG power spectrum for good and relatively poor map learners (Figure 8, Table 3). Good learners exhibited slightly higher cognitive load at all levels of difficulty. Regarding the overall performances, only small and non-significant power changes occurred in alpha (Mann-Whitney U test: U alpha = 846.000, p = 0.501 > 0.05), whereas the theta power seemed higher for good learners in all tasks, and the difference that emerged between good and relatively poor learners was significant (Mann-Whitney U test: U theta = 753.000, p = 0.020 < 0.05). This shows that the good learners exhibited higher cognitive load, regardless of the task difficulty. The biggest difference (0.000377) in theta power change between good learners and relatively poor learners was observed for easy tasks. However, the differences in theta power were not statistically significant at any of the difficulty levels (Mann-Whitney U test: U hard = 53.000, p hard = 0.589 > 0.05; U moderate = 90.000, p moderate = 0.323; U easy = 125.000, p easy = 0.068).
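The group comparisons reported above rely on the Mann-Whitney U test, which compares two independent samples through their ranks and therefore does not require normally distributed data. The following is a minimal sketch of that test, not the procedure or software used in the study: it is written in plain Python, uses a tie-corrected normal approximation for the two-sided p-value (no continuity correction), and runs on made-up theta-change values (`expert_theta` and `novice_theta` are hypothetical names and numbers, not data from the experiment).

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test using a tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)

    # Tie correction for the variance of U.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    n = n1 + n2
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    if sigma == 0:
        return u, 1.0  # all values identical: no evidence of a difference
    z = (u - n1 * n2 / 2) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return u, p


# Hypothetical theta power changes, for illustration only.
expert_theta = [0.00021, 0.00035, 0.00012, 0.00040, 0.00018]
novice_theta = [0.00019, 0.00030, 0.00010, 0.00025, 0.00015]
u_stat, p_value = mann_whitney_u(expert_theta, novice_theta)
print(f"U = {u_stat:.3f}, p = {p_value:.3f}")
```

For small groups, statistical packages often report an exact p-value instead of the normal approximation, so the U statistic from this sketch should match a package such as SciPy (`scipy.stats.mannwhitneyu`) while the p-value may differ slightly.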

Discussion and Conclusion
In this paper, we investigated the spatial memory abilities of expert and non-expert map users within a simple map-learning task, using eye tracking and EEG, and triangulated the behavioral and psychological data to indicate the cognitive load caused by the task. Some highlights of the findings are listed as follows:
• Experts had longer reaction times (significantly longer for moderate and hard tasks), but higher success rates. They might be a bit more ambitious and driven to accomplish the task compared to novices, and may have taken extra time to review or verify their response before submitting it. It seems that the higher cognitive load exhibited by experts paid off with higher success rates.
• Novices were observed to have longer fixation durations, a mostly lower number of fixations per second, and higher saccade velocity (except for hard tasks), which indicates a higher cognitive load for novices. Additionally, the saccade amplitudes of novices were larger. With longer saccades (i.e., larger amplitude), the search goes all across the image, and is thus less organized. The shorter saccades of experts indicate a more targeted search between focal points that are close to each other on the map and, therefore, a less chaotic search pattern. The shorter fixations of experts also show that they needed less time to interpret what they saw.
• Although not significant, experts demonstrated higher theta power (except for easy tasks), which can be associated with a higher cognitive load.
• Qualitative analysis of the eye movements shows that both groups showed similar attentional behavior in terms of the map area covered and the map elements on which they focused.
Based on the findings, it is difficult to favor one user group in terms of their performance while retrieving the map-related information. The map-learning and recalling strategies of experts and novices, and their approach to the task, might not be similar; however, their overall performances did not differ much. In fact, novices, in some cases, outperformed experts. This outcome might seem to contradict the results within the expert-novice research paradigm e.g., [5,10]; however, it is in line with the findings in the field of geography e.g., [40] and in map learning domains e.g., [2].
On the one hand, the reason why we did not find significant differences between novices and experts might be that they pay attention to different aspects of a task. This affects both their perceptions of task complexity (i.e., task analyzability and variability) and their performance on the task. Superior performance by experts depends on the match between the experts' cognition and the demands of the task [41]. The fact that novices sometimes perform better than experts would be evidence that they use different learning strategies. As explained by Postigo and Pozo [40] 'the subject lacking domain-specific knowledge tends to construct a visual-spatial mental representation, as opposed to the semantic representation of the expert. Experts represented given information in a domain-specific manner that was concerned with the deep semantic structure of that information, whereas the novices mentally represented focused-upon superficial domain-general aspects' (pp. 77, 78).
On the other hand, the reason why expertise is not as influential as we think, especially for simple map-learning tasks, might be the effect of other individual differences. According to Hunt [42], those differences originate from 'the use of simple processing procedures, knowledge related to the task and the ability to perform the low-level mechanics of information processing'. Experts did not always outperform novices, which could indicate that domain knowledge was not that relevant to the task; instead, general education and the ability to perform basic operations, such as decoding, visualization, selective filtering, memory retrieval, and memory comparison, played an influential role in high-level procedure and strategy choices. Therefore, the variation in performance and strategy choice might arise from the differences in basic visual or spatial ability. Additionally, the competence of the expert group relies not only on their extensive knowledge, but also on the organization of this knowledge, which forms their cognitive representations and characterizes them [2,40].
We alternatively grouped the participants as good learners and relatively poor learners, based on their success rates, and calculated the change in EEG power spectral density with respect to this classification. We observed that good learners exhibited significantly higher theta ERS, considering their overall performance. Although the cognitive load of these groups did not differ based on the task difficulty within the frame of this research, classifying participants based on their spatial memory performances provided different insights on map users' cognitive processes. Similar to what was found by Havelková and Gołębiowska [13], unsuccessful participants differed in the general problem-solving approach, in that they tended to choose fast, less cautious strategies, and lacked motivation.
This study also showed that high cognitive load is not necessarily associated with low task performance; in fact, in most cases, it was an indicator of more elaborate, structured, and efficient cognitive strategies, especially as demonstrated by experts. Therefore, it is useful to triangulate data collected via different sources (i.e., quantitative and qualitative methods) to interpret the cognitive load and to understand the underlying behavior of the participants.
Within this paper, we analyzed the influence of independent variables, such as task difficulty and expertise level, on the cognitive strategies of map users. In addition to task and expertise, map design characteristics play an important role in users' cognitive load and learning performance; hence, they should be evaluated in order to contribute to enhancing the design and usability of cartographic products e.g., [18]. We used Google Maps, which is designed for everyone (i.e., regardless of the users' individual differences in spatial cognition), as stimuli in our experiments, and we found no significant difference between experts and novices in terms of the cognitive load that these maps caused. It is important to mention that, if the quality of the cartographic design fulfills the purpose of the design, it has a positive effect on users' experience.
Furthermore, the EEG metrics used in the study and the procedures for extracting them have an influence on the results. To gain more detailed insights on cognitive load and to detect small changes in the EEG power spectrum, different procedures can be applied to the collected data. For instance, the seven-second-long study period can be segmented into sub-parts and the exact time points of the peak values of alpha and theta can be identified. These peak values can further be analyzed for the time periods of interest by visually inspecting the EEG time-frequency plots. Another interesting approach is to investigate the lower (8-10 Hz) and upper (10-12 Hz) alpha bands separately, in order to reveal specific frequency effects that are not distinct when only looking at the broad alpha range, as suggested by Morton et al. [38]. A number of studies demonstrate different activity in the upper and lower alpha bands under cognitive load, in the sense that upper alpha decreases when cognitive activity increases e.g., [35,43,44]. It is also possible to measure gamma oscillations, which are directly proportional to the cognitive activity e.g., [45], and beta oscillations, which increase with cognitive load e.g., [46]. EEG data might be overwhelming, and there are various other aspects to investigate further and countless analyses to perform besides the ones mentioned above. However, the experimental design is of primary importance, in the sense that decisions on where to direct attention and what to expect from the collected data have to be well planned and tested before conducting the main experiments. When integrated with other qualitative and quantitative user testing methods, EEG indeed offers a valuable contribution to the understanding of the cognitive processes of individuals.
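The segmentation idea described above can be illustrated with a short sketch. It splits a seven-second trace into one-second sub-parts, estimates the power in the theta, lower alpha (8-10 Hz), and upper alpha (10-12 Hz) bands for each sub-part, and reports the sub-part in which each band peaks. Several things here are assumptions made only for illustration: the 64 Hz sampling rate, the one-second segment length, the 4-8 Hz theta limits, the synthetic signal, and the use of a plain discrete Fourier transform without windowing. None of these is taken from the study's actual pipeline.

```python
import cmath
import math


def band_power(samples, fs, f_lo, f_hi):
    """Mean DFT power over frequency bins with f_lo <= f < f_hi (no windowing)."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the DC offset
    powers = []
    for k in range(1, n // 2 + 1):
        freq = k * fs / n
        if f_lo <= freq < f_hi:
            coeff = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                        for t, c in enumerate(centered))
            powers.append(abs(coeff) ** 2 / n ** 2)
    return sum(powers) / len(powers) if powers else 0.0


def peak_segment(signal, fs, seg_seconds, f_lo, f_hi):
    """Return (start time in s of the segment with the highest band power, all powers)."""
    seg_len = int(fs * seg_seconds)
    powers = [band_power(signal[i:i + seg_len], fs, f_lo, f_hi)
              for i in range(0, len(signal) - seg_len + 1, seg_len)]
    best = max(range(len(powers)), key=powers.__getitem__)
    return best * seg_seconds, powers


# Synthetic 7 s trace (assumed fs = 64 Hz): a 6 Hz burst during 2-3 s
# and an 11 Hz burst during 5-6 s. Purely illustrative, not recorded EEG.
fs = 64
signal = []
for i in range(7 * fs):
    t = i / fs
    value = 0.0
    if 2 <= t < 3:
        value += math.sin(2 * math.pi * 6 * t)
    if 5 <= t < 6:
        value += math.sin(2 * math.pi * 11 * t)
    signal.append(value)

theta_start, theta_powers = peak_segment(signal, fs, 1.0, 4, 8)    # assumed theta band
lower_start, lower_powers = peak_segment(signal, fs, 1.0, 8, 10)   # lower alpha
upper_start, upper_powers = peak_segment(signal, fs, 1.0, 10, 12)  # upper alpha
print("theta peaks in the segment starting at", theta_start, "s")
print("upper alpha peaks in the segment starting at", upper_start, "s")
```

With real recordings, a windowed estimate (for example Welch's method) and shorter, overlapping segments would likely be preferable. The point of the sketch is only that splitting the alpha band and the study period can expose effects that averaging over the whole condition hides.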