Introduction
When humans attend to their surrounding environment, looking does not always equate to seeing. That is, the externalities of the visual process do not always correspond to the attended percept. Historically, visual attention has been measured predominantly using eye fixations (Groner, 1988; Groner & Groner, 1989). The implicit assumption here is that fixating an object secures visual attention and allocates mental resources. However, fixations do not necessarily imply attentional focus (Groner & Groner, 1989; Mack & Rock, 1998; Groner & Groner, 2000).
Looking without seeing can explain various phenomena of inattentional blindness, which have been reported beyond the laboratory in a number of applied domains such as surface transportation (Strayer, Drews, & Johnston, 2003), baggage screening (Hubal, Mitroff, & Cain, 2010), and surveying crowds (Simons & Chabris, 1999). As an example of these real-world scenarios, consider a driver who is stopped on the roadway, their eyes directed toward a red signal. The signal turns green, but the driver fails to react. As they wait, eyes directed toward a signal that is now green, they are certainly passively ‘looking’ at the light. However, because they fail to respond, they cannot be said to have processed the change from red to green and thus to have ‘seen’ the signal. Looking without seeing is a phenomenon which should be explained by workable theories of human information processing, most notably models of attention. However, apart from a behavioural reaction, no measure allowing for an objective distinction between looking and seeing has been suggested so far. The present work evaluates the utility of microsaccades as an indicator of visual attention and its underlying sensory and physiological processes, in order to distinguish looking from seeing by using a replicable and quantitative measure. In the present context, “paying attention” is considered a top-down regulated mechanism of allocating processing resources to parts or properties of the input at the cost of others (see the taxonomy of attentional processes in Groner & Groner, 2000). Microsaccades will be investigated as possible indicators of such a process of resource allocation.
Microsaccades represent small, involuntary eye movements, similar to miniature versions of voluntary saccades. Typically, microsaccades have an amplitude less than two degrees of visual angle (Martinez-Conde, Macknik, Troncoso, & Hubel, 2009; Rolfs, 2009). Microsaccades occur during visual fixation in the period of relative stability between the larger saccades. Even when we think that our eyes are not moving, they are. Microsaccades are not under voluntary control, and therefore they are more robust with respect to external influences (Rolfs, 2009; Martinez-Conde, Otero-Millan, & Macknik, 2013). The functions of microsaccades are not yet fully understood. Research has focused on the relation between microsaccades and the control of fixation position, reduction of perceptual fading, continuity of perception, visual acuity, scanning of small spatial regions, shifts of spatial attention and resolving perceptual ambiguities (Martinez-Conde, Macknik, & Hubel, 2004; Martinez-Conde et al., 2009). Recent results challenge the interpretation of microsaccades as strictly low-level oculomotor phenomena (Martinez-Conde et al., 2004). Accumulating empirical evidence is beginning to confirm that microsaccades serve both perceptual and oculomotor goals. A direct link between microsaccade production and visibility has been shown; increased microsaccade production during fixation results in enhanced visibility for peripheral and parafoveal visual targets (Costela, McCamy, Macknik, Otero-Millan, & Martinez-Conde, 2013). Decreased microsaccade production leads to periods of visual fading (Martinez-Conde, Macknik, Troncoso, & Dyar, 2006). Several studies have found that microsaccades, like saccades themselves, can be modulated by attention. For instance, the spatial location indicated by an attentional/visual cue can bias microsaccade directionality (Engbert & Kliegl, 2003; Martinez-Conde et al., 2013). This is most likely due to the extensive overlap between the neural systems that control attention and the system that generates saccadic eye movements.
Martinez-Conde et al. (2009) have suggested production or control of microsaccadic activity by attentional processes, toward the goal of improving vision through dynamic enhancement and suppression of low-level visual information over time. Such suppositions require further investigation, but these existing results suggest that microsaccadic activity could be a robust biosignature for internal attentional processes.
Microsaccadic activity is influenced by the attentional load of visual tasks (Benedetto, Pedrotti, & Bridgeman, 2011; Hicheur, Zozor, Campagne, & Chauvin, 2013) as well as non-visual cognitive tasks (Siegenthaler et al., 2013; Gao, Yan, & Sun, 2015; Dalmaso, Castelli, Scatturin, & Galfano, 2017). These non-visual cognitive tasks include arithmetic operations and digit retention, and are intended to involve mental processes that do not rely on vision. However, the growing body of literature on attentional load and microsaccade rate is inconsistent. Some studies indicate that tasks with higher attentional load lead to a lower microsaccade rate. For example, Pastukhov and Braun (2010) found higher attentional load associated with lower microsaccade rates and increased microsaccade directional congruency. Their paradigm employed visual recognition tasks requiring either low attentional load (reporting color) or high attentional load (reporting letter shape). Siegenthaler et al. (2013) found increasing task difficulty to correspond to lower microsaccade rate, using a paradigm which employed a mental arithmetic task lacking any visual component. Gao, Yan and Sun (2015) performed a subsequent replication, which also showed an inverse relationship between the microsaccade rate and task difficulty. Dalmaso et al. (2017) used two-digit (low load) and five-digit (high load) number memorizing tasks to investigate the association between working memory load and microsaccade rate. In line with these previous studies, they found that the microsaccade rate was significantly suppressed in the task with high working memory load. However, still other studies have found that microsaccade rate increases with increasing task demand. Benedetto et al. (2011) employed a simulated driving task using a low load task (control task) and a high load task (dual task including a visual search task). They found significantly more microsaccades under the high load condition. Hicheur et al. (2013) used a forced-choice task paradigm. Participants had to judge the orientation of a tilted stimulus that was placed in static or dynamic backgrounds. A higher microsaccade rate was found when participants were engaged in the high load task, in which execution of the discrimination task was needed, compared to the low load task, in which no response was needed.
Under the assumption that complicated interactions between the effects of perceptual and working memory load could occur, Xue, Huang, Ju, Chai, Li and Chen (2017) conducted an experiment with monkeys using a task in which primarily perceptual load was manipulated. Results indicated that microsaccade rate was lower with high load than with low load. They concluded that the perceptual costs or benefits of microsaccades might drive observers to adjust their fixation strategies to facilitate behavioral performance.
In summary, previous results have shown that a) tasks which induce mostly cognitive load are linked with a decreased microsaccade rate (Siegenthaler et al., 2013; Gao et al., 2015; Dalmaso et al., 2017) and that b) increasing difficulty in tasks with a strong but not exclusive visual component enhances microsaccade rate (Benedetto et al., 2011; Hicheur et al., 2013). This potentially implies that microsaccades reflect a top-down regulated mechanism of allocating processing resources to parts or properties of the input at the cost of other processes. In applied settings, this potentially means that microsaccades would indicate whether a person was paying attention to a visual scene or whether their attention had shifted to some other cognitive task.
The present study
To evaluate the assumption that microsaccade rate reflects the amount of visual attention, visual and non-visual attention were manipulated systematically in this study, using a dual task setting with tasks inducing 1) cognitive and 2) visual load. Visual load was defined as the level of complexity of a visual scene, based on the attributes of that scene (Milam, El-Nasr, Moura, & Bartram, 2011). Thus, an environment in which participants would find it difficult to differentiate between important visual cues and irrelevant visual elements was considered “high visual load”. The systematic combination of both tasks allows for an analysis of relations between visual attention and microsaccade rate. We hypothesize that microsaccade rate is increased in trials with high visual load and low mental load. Furthermore, we anticipate that microsaccade rate will decrease in trials with a low visual load and a high mental load.
Method
Participants
Eighteen participants (nine male, nine female) with an average age of 21 years (SD = 2.56) participated in a single experimental session. All participants were University of Central Florida (UCF) students and received class credit for their participation. All had normal or corrected-to-normal vision, as tested by a Snellen eye chart (McGraw, Winn, & Whitaker, 1995). Experiments were carried out in conformity with the Declaration of Helsinki, as well as the appropriately accredited Institutional Review Board (IRB) policies. Written informed consent was obtained from each participant prior to the commencement of testing.
Experimental Design
A 3 x 3 repeated measures design was used in this study. Visual demand (free view vs. easy view vs. hard view) and mental demand (no count vs. easy count vs. hard count) were manipulated as independent variables (see Figure 2), with ‘free view’ and ‘no count’ conditions representing control conditions. The order of the different experimental cells was randomized for each participant.
Stimuli and Tasks
Visual stimuli representing three different complexity levels were used to manipulate visual load. For the ‘easy view’ and ‘hard view’ conditions, ‘spot the difference’ puzzles were used. In the ‘easy view’ condition, stimulus material consisted of simple line drawings, whereas photographs with complex visual information were used for the ‘hard view’ condition.
The tasks for the ‘hard view’ and ‘easy view’ conditions consisted of determining differences between the two images displayed next to each other. In the ‘easy view’ condition, such differences were simple to detect, while in the ‘hard view’ condition, differences were much more difficult to detect (see Figure 1). In the control condition representing the lowest level of visual load (i.e., the ‘free view’ condition), stimuli consisted of three simple geometric forms. This condition involved no visual search task; participants were simply asked to view the images. In order to provide as natural a task as possible, no center target was provided. Ten examples of each type of stimulus were used, one in training and nine in the experiment.
In order to manipulate cognitive load, participants were asked to complete mental arithmetic tasks while performing the visual search tasks described above. In the ‘easy count’ condition, participants were instructed to count forward by increments of 2, starting from a random two-digit number.
In the ‘hard count’ condition, participants counted backward by increments of 17, starting from a random four-digit number (e.g., 3123). In the control condition (i.e., ‘no count’), participants were instructed not to count and to pay full attention to the picture.
Visual tasks and mental arithmetic tasks were always presented in combination, summing to nine experimental conditions. The ‘no count’ and ‘free view’ conditions represent control conditions in which no formal task was completed. Thus, pairings of conditions including one of these control conditions can be considered as single tasks whereas all the others represent dual tasks. Both tasks have been used in previous studies (Siegenthaler et al., 2013; Otero-Millan, Macknik, Langston, & Martinez-Conde, 2013).
Measures and Instruments
Performance was measured for both the visual task and the arithmetic task. For the visual task, the percentage of total available differences detected in each puzzle was calculated.
In the counting tasks, participants held a game controller in both hands. As participants completed each cycle of counting, they pressed a button on the controller. These button presses were recorded by a purpose-built program (MCT (Mental Count Timer), Sawyer, 2017). This made it possible to monitor whether participants continually performed the task without requiring them to vocalize, which would have interfered with eye tracking. At the end of each trial, participants reported the number at which they had arrived. Answers were scored as either correct or incorrect, based upon the number of iterations recorded by MCT combined with the increment required by the counting task (2’s or 17’s).
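The scoring rule described above can be illustrated with a minimal sketch. This is not the MCT program itself; the function names are hypothetical, and the sketch assumes that each recorded button press corresponds to exactly one completed counting step.

```python
def expected_final_number(start, presses, increment, direction):
    """Number a participant should have reached after `presses` steps.

    direction is +1 for counting forward (easy count, increment 2)
    and -1 for counting backward (hard count, increment 17).
    Assumption: one button press equals one completed counting step.
    """
    if direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")
    return start + direction * presses * increment


def score_trial(start, presses, increment, direction, reported):
    """Return True if the reported number matches the expected number."""
    return reported == expected_final_number(start, presses, increment, direction)
```

For example, a participant who starts at 3123, counts backward by 17, and presses the button five times would be expected to report 3038.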
Eye position was recorded binocularly and noninvasively with a video-based eye tracker at 1000 Hz (EyeLink 1000, SR Research, instrument noise 0.01° RMS). In a screening process (for details see Siegenthaler et al., 2013), erroneous eye position data (i.e., temporarily intermittent signal) were first identified and then discarded. In addition, portions of data containing very fast decreases and increases in pupil area were removed (> 50 units/sample; such periods are thought to represent semi-blinks in which the pupil is never fully occluded; Troncoso, Macknik, & Martinez-Conde, 2008). Blink periods, defined as portions of the raw data where pupil information was missing, were also identified and removed. An additional 200 ms before and after each blink/semi-blink interval were removed to eliminate the initial and final parts where the pupil was still partially occluded (Troncoso et al., 2008). After rectifying the eye position data, saccades were identified with a modified version of the algorithm developed by Engbert and Kliegl (Engbert & Kliegl, 2003; Engbert, 2006a, 2006b; Laubrock, Engbert, & Kliegl, 2005; Rolfs, Laubrock, & Kliegl, 2006) with λ = 6 (used for the velocity threshold detection) and a minimum saccadic duration of 6 ms. Only binocular saccades (saccades with a minimum overlap of one data sample in both eyes; Engbert, 2006a, 2006b; Laubrock et al., 2005; Rolfs et al., 2006) were considered in order to reduce the amount of potential noise. In addition, a minimum intersaccadic interval of 20 ms was applied so that potential overshoot corrections would not be categorized as new saccades (Møller, Laursen, Tygesen, & Sjølie, 2002). Saccades with magnitude < 2° in both eyes were defined as microsaccades (Beer, Heckel, & Greenlee, 2008; Betta & Turatto, 2006; Hafed, Goffart, & Krauzlis, 2009; Martinez-Conde, Macknik, Troncoso, & Dyar, 2006; Martinez-Conde et al., 2009; Troncoso et al., 2008). Finally, to calculate microsaccade properties such as magnitude and peak velocity, the values for the right and left eyes were averaged.
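To make the detection steps concrete, the following is a minimal sketch of velocity-threshold saccade detection in the spirit of Engbert and Kliegl (2003), using the parameter values reported above (λ = 6, minimum duration 6 ms, magnitude < 2°, binocular overlap of at least one sample). It is not the modified version used in this study, whose details are not given here. The function names are hypothetical, magnitude is assumed to be the straight-line distance between the first and last sample of an event, and blink removal and the 20 ms intersaccadic interval are omitted.

```python
import math
from statistics import median


def velocities(pos, dt):
    """Velocity smoothed over a 5-sample window; edge samples are left at 0."""
    v = [0.0] * len(pos)
    for n in range(2, len(pos) - 2):
        v[n] = (pos[n + 2] + pos[n + 1] - pos[n - 1] - pos[n - 2]) / (6.0 * dt)
    return v


def median_sd(v):
    """Median-based estimator of the velocity standard deviation."""
    med_sq = median([x * x for x in v])
    return math.sqrt(max(med_sq - median(v) ** 2, 0.0))


def detect_saccades(x, y, rate_hz=1000.0, lam=6.0, min_dur_ms=6.0, max_amp_deg=2.0):
    """Return (start, end, amplitude) tuples for one eye; positions in degrees."""
    dt = 1.0 / rate_hz
    vx, vy = velocities(x, dt), velocities(y, dt)
    eta_x, eta_y = lam * median_sd(vx), lam * median_sd(vy)
    if eta_x == 0.0 or eta_y == 0.0:
        raise ValueError("velocity noise estimate is zero; threshold undefined")
    # A sample is saccadic if it lies outside the elliptic velocity threshold.
    above = [(a / eta_x) ** 2 + (b / eta_y) ** 2 > 1.0 for a, b in zip(vx, vy)]
    min_samples = int(round(min_dur_ms * rate_hz / 1000.0))
    events, start = [], None
    for i, flag in enumerate(above + [False]):  # sentinel closes a trailing run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            end = i - 1
            # Assumption: magnitude = distance between first and last sample.
            amp = math.hypot(x[end] - x[start], y[end] - y[start])
            if end - start + 1 >= min_samples and amp < max_amp_deg:
                events.append((start, end, amp))
            start = None
    return events


def binocular_events(left, right):
    """Keep left-eye events that overlap a right-eye event by >= 1 sample."""
    return [l for l in left
            if any(l[0] <= r[1] and r[0] <= l[1] for r in right)]
```

Because the threshold is computed from the median of the velocity distribution, it adapts to the noise level of each recording, which is one reason this family of algorithms is widely used for fixational eye movements.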
In order to assess mental workload subjectively as part of a manipulation check, the NASA-Task Load Index (NASA-TLX; see Hart & Staveland, 1988) was administered after each trial. This subjective multidimensional assessment tool measures perceived workload with six subscales (mental demand, physical demand, temporal demand, performance, effort and frustration), each on a scale ranging from 1 (very low) to 20 (very high), with performance using verbal anchors ranging from ‘perfect’ to ‘failure’. The scale is widely used in human factors research (Colligan, Potts, Finn, & Sinkin, 2015; Hart, 2006) and has good psychometric properties (cf. Hart & Staveland, 1988).
Apparatus
The room in which the experiment was conducted was quiet, and equal illumination was used for each session. Participants were placed in a head/chin support, facing a desktop-mounted EyeLink 1000 eye tracker capable of 1000 Hz binocular tracking. Fifty-seven cm away from the support, visual stimuli were displayed on a linearized video monitor (Barco Reference Calibrator V, 75 Hz refresh rate), using SR Research Experiment Builder.
Procedure
Participants first completed a training session, which exposed them to each of the experimental manipulations individually and allowed them to ask questions. The experimental session contained 3 blocks, each containing 9 trials, one per experimental condition. For each participant, the trial sequence was randomized. Each trial was 60 seconds in duration, resulting in a total of 27 min of eye-tracking data per participant.
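The block structure can be sketched as follows. This is an illustrative reconstruction only: it assumes that the nine conditions were shuffled independently within each block, which the description above does not state explicitly, and the names `make_schedule`, `VISUAL` and `MENTAL` are hypothetical.

```python
import random
from itertools import product

VISUAL = ("free view", "easy view", "hard view")
MENTAL = ("no count", "easy count", "hard count")


def make_schedule(n_blocks=3, trial_s=60, seed=None):
    """Return (schedule, total_minutes) for one participant.

    Each block contains all nine visual x mental combinations in a
    random order (assumption: shuffling is done per block).
    """
    rng = random.Random(seed)
    schedule = []
    for block in range(1, n_blocks + 1):
        cells = list(product(VISUAL, MENTAL))
        rng.shuffle(cells)
        schedule.extend((block, visual, mental) for visual, mental in cells)
    total_minutes = len(schedule) * trial_s / 60
    return schedule, total_minutes
```

With the default arguments this yields 27 trials of 60 s each, matching the 27 min of eye-tracking data per participant reported above.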
Before each trial, an instruction screen indicated the task which was to be performed. During the ‘free view’ condition, participants were instructed to look at the picture on the screen, with no search for differences or any specific response being required from them. For the mental arithmetic task, participants were instructed to push a gamepad key with their index finger each time they completed a counting step (i.e., an increment of either 2 or 17). For the ‘no count’ task, participants were instructed not to count and to pay full attention to the picture. After each trial, participants completed the NASA-Task Load Index. After completion of each block, a five-minute break was administered.
Each visual task had an arithmetic counterpart (see Figure 2). Tasks were always presented in combination, summing to nine total conditions, each a unique combination of visual and arithmetic tasks. The ‘no count’ and ‘free view’ conditions are essentially the absence of any formal task. Pairings of conditions that include one of these ‘non-tasks’ can be considered single tasks.
Data Analysis
Microsaccade rate and performance data met the assumption of normality (Shapiro-Wilk test, all p-values > .05). The dependent variable was microsaccade rate, on which we performed a 3 x 3 (free view, easy view, hard view x no count, easy count, hard count) repeated measures MANOVA. Mauchly’s test indicated that the assumption of sphericity was violated (χ²(2) = 29.65, p < .001); therefore, degrees of freedom were corrected using Greenhouse-Geisser estimates of sphericity (ε = 0.56). Pairwise comparisons with a Bonferroni correction were calculated for post-hoc comparisons.
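As a worked illustration of the two corrections named above, the sketch below computes Greenhouse-Geisser corrected degrees of freedom and Bonferroni-adjusted p-values. It assumes that the reported ε applies to the main effect of a three-level within-subject factor (consistent with the 2 degrees of freedom of Mauchly’s test) and 18 participants; the function names are hypothetical and the sketch is not the analysis code used in the study.

```python
def greenhouse_geisser_df(levels, n_subjects, epsilon):
    """Corrected (effect, error) degrees of freedom for a within-subject factor."""
    df_effect = levels - 1
    df_error = (levels - 1) * (n_subjects - 1)
    return epsilon * df_effect, epsilon * df_error


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

Under these assumptions, `greenhouse_geisser_df(3, 18, 0.56)` gives corrected degrees of freedom of roughly 1.12 and 19.04 in place of the uncorrected 2 and 34.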
As a manipulation check of the effectiveness of the task difficulty, a 2 x 3 (easy view, hard view x no count, easy count, hard count) MANOVA was calculated for the dependent variable mean differences found. For the number of completed counting steps, a 2 x 3 (easy count, hard count x free view, easy view, hard view) MANOVA was calculated.
Discussion
Our results show that microsaccade rate reflects the amount of visual attention toward a visual task. For demanding tasks, this suggests the utility of microsaccade rate as a biomarker of whether an operator is just gazing at an object or has really focused their attention on it. In this, our hypothesis was upheld, as trials with increased visual load (‘easy view’ or ‘hard view’ task) did result in increased microsaccade rates, relative to trials with low visual load (‘free view’ task). Trials with high demand visual tasks also increased microsaccade rates more than those with low demand visual tasks. These results are in accordance with Benedetto et al. (2011) and Hicheur et al. (2013). Our hypothesis was also upheld for cognitive load, since tasks inducing cognitive load (‘easy count’ or ‘hard count’) resulted in decreased microsaccade rates. Likewise, trials with high demand cognitive tasks showed lower microsaccade rates than those with no cognitive demand. These findings are in accordance with Siegenthaler et al. (2013), Gao et al. (2015) and Dalmaso et al. (2017). However, contrary to Siegenthaler et al. (2013), we did not find a linear effect but only a general load effect: there was no significant difference between easy count and hard count. Beyond replicating past results, the present data show that microsaccade rate reflects the difficulty of visual stimuli in a rather granular way. Indeed, it may reflect how much attention is directed to a visual task, and how much of the visual information is processed. As such, microsaccades may well be useful in applied settings to indicate how much attentional capacity, if any, is directed toward a visual target.
Measuring Visual Load
The present results show that the visual demand of a task is systematically reflected in microsaccade rate (Figure 5). Any single visual task (‘free view’, ‘easy view’ or ‘hard view’ task combined with the ‘no count’ task) showed an increased microsaccade rate compared to its counterpart in a dual task setting (Figure 7). Also, all ‘hard view’ condition tasks showed an increased microsaccade rate compared to all ‘easy view’ condition tasks. The explanation of these results is that in a single visual task the operator shifts their full attention to that visual task. A ‘hard view’ condition task, inducing more visual load, requires more visual attention, which is reflected in a higher microsaccade rate. However, when the visual task is combined with a mental task (dual task setting), the microsaccade rate decreases significantly. The underlying explanation here is that the second, non-visual task requires a certain amount of attention. Consequently, the operator cannot direct their full working memory capacity toward the visual task.
Limitations
The difference in microsaccade rate between the ‘easy count’ and ‘hard count’ tasks was not significant. It seems likely that in this case there was a floor effect, since the hard count task was indeed ‘hard’ for the participants. Indeed, anecdotally, participants found our task of counting backwards by 17s so difficult that they sometimes just gave up. Another possible explanation is that pushing the button in our MCT task required resources relevant to our dependent variables of interest, and so had some systematic influence. In the ‘easy count’ condition participants pushed the button more often than in the ‘hard count’ condition. Also, it is important to remember that the aggregate difficulty of combined visual and cognitive demands may not be additive, but multiplicative. Other studies with a constant visual task showed a similar effect to this study (Siegenthaler et al., 2013).
Of course, more work is needed to understand both the import and full meaning of the present pattern of data. Very little, one must remember, is known about microsaccadic activity, especially with rich visual stimuli like those used in the present effort. Indeed, the higher microsaccade rates shown in the present study might simply be the result of some artifact of our stimulus set, for example fine detail in the pictures. The higher rate of occurrence of microsaccades in the ‘hard view’ condition could be due to task-related demands, but also to the larger number of small features in the ‘hard view’ stimuli. The effect could be partially bottom-up and not only determined by the difficulty of the change detection task.
The distribution of attentional processes
According to the present results, microsaccade rate is modulated by visual information processing (and visual attention), and a certain microsaccade level is required for minimal visual attention. As a consequence, the decrease in the microsaccade rate demonstrates a limited capacity for simultaneous attentional processes in different modalities (i.e., visual vs. non-visual). In everyday life humans deal with visual information while simultaneously dealing with other, non-visual information (e.g., mental processes, or acoustic, tactile, or olfactory information). A very common example would be driving a car while making a phone call. The decision as to what information is processed is reflected in the distribution of that attention. Working memory has a central role in this distributional process, and absolute and relative microsaccade rates could help to specify these attentional shifts (Dalmaso et al., 2017). Further, they could give insight into the neurological conceptions of working memory and the distribution of attentional processes.
Importance in Practical Settings
A measure that monitors visual attention, and the extent to which an individual is processing the associated visual information, is of critical importance. Not only will basic research benefit from this knowledge, but vast swathes of applied investigation will also profit, since inattention to visual cues frequently leads to errors and accidents. The example given in the introduction, a car driver who does not register a signal turning green, might appear rather benign. But consider a car driver not registering a green signal turning to red. Or consider an educational setting. A teacher may draw students’ attention to a certain visual location, but if a student simply ‘looked but did not see’, then the next steps in the learning sequence may be negated as the thread of learning is lost. All the while, the teacher might feel assured that they had sufficiently featured the item, assuming that fixation had equated with content processing. In such cases, inattention directly leads to failure.
Having a measure for visual attention and visual information processing might distinguish between ‘looking’ and actually ‘seeing’. Especially where safety is a function of attention (e.g., traffic safety, aviation safety, patient safety), the significance and benefits of such a measure should be clearly evident. Indeed, such a measure could provide real-time feedback as to how much attention an individual is devoting to a visual task. For example, it could provide feedback on how much a car driver is visually focused on the street and the relevant surroundings and signals, and it would give feedback whenever attention is shifting to non-driving displays or to mental processes (Hancock & Sawyer, 2015). At the moment there exists no unequivocal physiological measure for visual attention or visual information processing. Indeed, even at a time when the visual fixation of an object has been shown unequivocally not to be necessarily equated with focusing attention toward that object, there are still systems which use this logic, presumably for lack of something better. For example, General Motors’ Super Cruise, a production automated driving technology, uses measures of gaze to the roadway to enforce eyes-on-road during automated driving. How much better to enforce attention-to-driving-task, given the technological means!
Although there has been extensive and prolonged use of certain visual processing measures, the idea of including fixational eye movements (i.e., microsaccades) is relatively new. Microsaccades are typically investigated in neurological settings and are interesting measures since they are mostly not consciously controlled. One procedural problem is the infrastructure needed for detecting microsaccades. High-speed eye tracking devices are typically non-mobile and not suitable for applied settings beyond evaluation in simulators. Since there is empirical evidence that microsaccades are an adequate measure of visual information processing, the development of mobile high-speed eye tracking systems will hopefully progress. This would open a new field in many real-world settings.
Conclusion
In the same way that vagal tone has been represented as being responsive to variations in cognitive load (Hancock, Meshkati, & Robertson, 1985), we have proposed and confirmed here that inhibition in microsaccade rate accompanies increases in cognitive demand. As with the vagal connection, we also recognize that microsaccades, most probably, do not subserve a single function. However, it is evident that such measures do provide a window into cognitive state and that the clarity of that window (i.e., the signal-to-noise ratio of this specific measure) is high. This makes microsaccade rate observation an exceptionally useful and diagnostic tool in the evaluation and prediction of real-world behavior.
Our results indicate that the microsaccade rate can reflect both the level of visual attention and the level of visual information processing. A measure that monitors how and to what extent an individual is focused on a specific visual task is thus a critical step for the application of visual assessment to real-world tasks. More research is necessary to see whether the paradigm works in a variety of ever more applied field settings, and the degree to which the resultant signal can be fed back into cybernetic control systems for human-machine interface and exchange. More work is also needed on the basic vision science, where significant gaps in our understanding of microsaccades remain. The reward will be a measure which reflects how and to what extent an operator is processing visual information, a critical step both for experimental work to understand multitasking and toward the application of sophisticated visual assessment to real-world tasks.