Visual search is as common in everyday life as its history in scientific research is long. Whereas an abundance of literature relates measures of search efficiency such as error rates and reaction times to attention, far less research has investigated search efficiency via eye movement parameters in relation to attention, and many questions remain open.
In the research presented here, we analyzed scan paths in visual feature search with graded target-distractor-similarity in a typical search paradigm. We investigated visual search efficiency in terms of amplitudes of saccades, fixation durations, and number of reinspections/refixations as a function of the experimental conditions
target-distractor-similarity,
set size, and
target presence. Based on the results, we drew conclusions on how the three eye movement parameters relate to three hypothetically underlying attentional concepts: the attentional focus size, the attentional dwell time, and the depth of attentional processing. Finally, by proposing these relations, we were able to provide empirical support for the reaction time model of visual search STRAVIS (STRategies of VIsual Search;
Müller-Plath & Pollmann, 2003), which is a modified version of
Wolfe’s Guided Search (
1994), and suggest improvements.
Search efficiency in terms of search rates, eye movement parameters, and attentional processes
In the tradition of visual search, search rates have been generally interpreted to draw conclusions regarding processes of visual attention (
Treisman & Gelade, 1980). In general, flat reaction time curves with increasing set sizes argue for a more efficient search than steep reaction time curves (see
Wolfe, 1996, for a review). Since in the reaction time slope the size of the attentional focus, the dwell time, and the total number of visits per item group are inextricably confounded, the search rate is of limited use for testing models of visual attention in detail, or for investigating interrelations between the focus size, the dwell time, the total number of visits, and conditions of the task. The analysis of eye movements in overt search might provide better insight here, provided that certain linking propositions between eye movement parameters and attentional processes hold. It is the central concern of the present study to test these propositions and to relate them back to (reaction time) models of visual search.
With respect to eye movements, efficient (overt) search shows up in (a) large saccade amplitudes, going along with (b) few saccades, and (c) short fixation durations (
Jacobs, 1986). Furthermore, efficient search should mean (d) fixating each item or item position at most once.
There exists a variety of models explaining search efficiency in terms of underlying attentional processes. Whereas Treisman’s Feature Integration Theory (FIT;
Treisman & Gelade, 1980;
Treisman & Sato, 1990) draws a clear distinction between attentional processes in feature and conjunction search, which is inconsistent with a variety of findings (e.g.,
Nakayama & Silverman, 1986), saliency-map-based models like Guided Search (GS;
Cave & Wolfe, 1990;
Wolfe, 1994) are able to account for a continuum of search efficiencies in feature search with variable target-distractor-similarity. However, neither their idea of a fixed attentional dwell time nor that of a focus containing only one item could be empirically confirmed. A modified version of GS that models search efficiency by an attentional focus of variable size and varying dwell times is STRAVIS, the main ideas of which have been outlined in
Müller-Plath & Pollmann (
2003) and
Müller-Plath (
2008; p. 318). (Compared to the older version in Müller-Plath & Pollmann, 2003, the 2008 paper conceptualizes the process of attentional selection and the attentional focus in more detail; the 2003 paper concentrates on reaction time modelling on a rather coarse scale.) STRAVIS is a dynamical saliency map model with a strategic component that allows an explicit individual estimation of parameters from reaction times.
Although there might have been some conceptual confusion in the past (e.g.,
Itti & Koch, 2000), it should be clear that the “saliency” of a stimulus is per definitionem a psychological concept and not a physical one. Since the perception of physical attributes is modulated by attention (see e.g.,
Moran & Desimone, 1985;
Kastner & Ungerleider, 2000), the saliency of a stimulus results from its physical contrast to its neighbours, as well as from the amount of selective attention paid to these contrasts. The more narrowly attention is focused (in space as well as in the number of features attended to), the stronger the resulting modulation of perception. First, the model assumes that the observer can voluntarily adjust how much attention he/she deploys over how large an area, with the total capacity being limited. This area is termed “attentional focus.” Second, the attentional modulation of perception takes time. Thus, in order to achieve stronger perceptual modulation, attention has to be deployed longer. In visual feature search with homogeneous distractors, a target is detected if it is sufficiently salient. Therefore, if the target is physically dissimilar to the distractors, not much attention is necessary to achieve sufficient salience. Consequently, attention will be distributed widely (large focus; see
Figure 1a for a metaphorical illustration) and dwell only briefly. In the case of high target-distractor similarity, sufficient target saliency will be achieved only if attention is concentrated more narrowly in space (small focus; see
Figure 1b,c) and dwells there for a longer time.
All models, including STRAVIS, assume that search is self-terminating in target-present trials and exhaustive in target-absent trials, which presupposes sufficient processing depth of items inside the focus and a perfect memory for already visited item positions (but see
Horowitz & Wolfe, 1998). Consequently, no refixations or reinspections should occur, which is in contrast to already existing eye movement studies on visual search (see e.g.,
Dickinson & Zelinsky, 2005,
2007;
Gilchrist & Harvey, 2006;
Hooge & Erkelens, 1996 for reinspections, and e.g.,
Findlay, Brown & Gilchrist, 2000;
Hooge & Erkelens, 1998,
1999 for refixations).
In the present study, we manipulated search efficiency by varying target-distractor-similarity and set size. We measured error rates and reaction times in target-present and target-absent trials as well as the amplitudes of a representative saccade in every trial, the durations of a representative fixation, and the number of reinspections and refixations per trial. In order to draw conclusions about the three hypothetically underlying attentional concepts (the attentional focus size, the attentional dwell time, and the depth of attentional processing), we presumed the following linking propositions: Fixations are linked with allocating attention to the fixation point, and the execution of a saccade is preceded by a covert shift of attention (
Deubel & Schneider, 1996;
Findlay & Gilchrist, 2003;
Hoffman & Subramaniam, 1995). The saccade amplitude might then reflect how many items are checked in parallel (attentional focus size), and the fixation duration might indicate how much time is necessary to inspect one item or item group (attentional dwell time). The relation between reinspections/refixations and underlying cognitive processes is less clear. They might be linked with incomplete perceptual processing, leading to uncertainty in decision-making, or with incomplete memory (
Peterson, Kramer, Wang, Irwin, & McCarley, 2001).
Amplitudes of saccades and the focus of attention
Following the above linking proposition, the larger the amplitude and the smaller the number of saccades, the larger the attentional focus should be. Here, we define the “size of the focus of attention” in terms of the number of items: Subjects try to adapt it to as many items as possible so that if one of them is the target it will still stick out (
Müller-Plath & Pollmann, 2003). Synonyms in the literature include “visual span” (
O’Regan, Levy-Schoen, & Jacobs, 1983) and “zone of focal attention” (
Motter & Belky, 1998a). Since the number of saccades depends on the focus size as well as on the number of reinspections/refixations (see the previous section), we did not regard it as a suitable measure of the focus size and relied solely on saccade amplitudes.
Several eye movement studies found that saccade amplitudes decreased with increasing target-distractor-similarity (e.g.,
Hooge & Erkelens, 1996,
1998;
Näsänen, Ojanpää, & Kojo, 2001;
Vlaskamp, Over, & Hooge, 2005). These findings are in line with predictions from models of visual attention for the focus size (or synonymous concepts). According to our notion of visual attention (see e.g.,
Müller-Plath & Pollmann, 2003;
Müller-Plath, 2008; also
Jacobs, 1986), it should be an efficient strategy in the case of low target-distractor-similarity to process many items simultaneously (large focus of attention), because the target would be salient enough to stick out and the risk of missing a target would be low. In the case of high target-distractor-similarity, it should be an efficient strategy to process only a few items simultaneously (small focus of attention) in order to keep the risk of missing a target low, because the target would not be salient enough to stick out from a larger group. The Attentional Engagement Theory (AET;
Duncan & Humphreys, 1989), although theoretically different, predicts the same. As suggested by both lines of research (eye movement analyses and models of attention), we thus expected (i) decreasing amplitudes of saccades with increasing target-distractor-similarity.
The set size also seems to have an influence on the amplitudes of saccades (e.g.,
Motter & Belky, 1998a,
1998b;
Näsänen et al., 2001).
Motter and Belky (
1998a,
1998b) studied visual search and eye movements using rhesus monkey subjects. Their results showed an increase of the amplitude of saccades with increasing set sizes.
Näsänen et al. (
2001) replicated these results with human subjects and concluded that the visual span increased with increasing set sizes. In attention theories, the role of the set size in search efficiency is controversial. On the one hand,
Treisman and Gelade (
1980) proposed no influence of set size on feature search. On the other hand, some models that include the computation of a saliency map suggest an (implicit) effect of the set size: Through a process of averaging perceptual contrasts across all items in the display, the perceptual salience of the target should increase with the number of homogeneous distractors presented, even if the items do not physically change (
Müller-Plath & Pollmann, 2003;
Wolfe, 1994). Consequently, the attentional focus size might increase with increasing set size (see above). In the present experiment, we thus expected (ii) increasing amplitudes of saccades with an increasing set size.
Results for target-absent trials have rarely been interpreted in the eye movement literature. In visual search, the mean reaction time in target-present trials is usually shorter than in target-absent trials. A 1:2 ratio of target-present to target-absent search slopes is often interpreted as indicating self-terminating search in target-present trials and exhaustive search in target-absent trials, with a focus size of one item in both (e.g.,
Treisman & Gelade, 1980;
Wolfe, 1994). However, this has never been confirmed directly. Since in our experiment the observer did not know anything about target presence when starting inspection, saccades before finding the target should not be influenced by its presence. However, when the target is within reach, the focus should be narrowed and the saccade shortened in order to fixate it (
Zelinsky, Rao, Hayhoe, & Ballard, 1997). Although we only analyzed saccades that occurred early in the trial as representatives of the focus size (2nd saccades; for details see the Methods section), a considerable number of them might be target saccades, especially in the easy condition with low target-distractor-similarity. We thus expected (iii) shorter amplitudes in target-present than in target-absent trials.
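The 1:2 slope ratio mentioned above can be illustrated with a minimal sketch of a serial search without revisits. The sketch is not part of STRAVIS or Guided Search; it assumes that the display is covered by equally sized item groups, that a target is equally likely to be in any visiting position, and that each visit costs the same time. The function names and the set sizes of four and eight items are chosen for illustration only.

```python
from fractions import Fraction


def expected_inspections(set_size, focus_size=1, target_present=True):
    """Expected number of attentional visits in a serial search without revisits.

    Assumption (illustrative): the focus covers `focus_size` items per visit.
    """
    # Number of item groups needed to cover the display (ceiling division).
    groups = -(-set_size // focus_size)
    if not target_present:
        # Exhaustive search: every group is visited once.
        return Fraction(groups)
    # Self-terminating search: the target is found, on average, halfway through.
    return Fraction(groups + 1, 2)


def slope(focus_size, target_present, n_small=4, n_large=8):
    """Increase in expected visits per additional item between two set sizes."""
    delta = (expected_inspections(n_large, focus_size, target_present)
             - expected_inspections(n_small, focus_size, target_present))
    return delta / (n_large - n_small)


for focus in (1, 2):
    print(focus, slope(focus, True), slope(focus, False))
```

Under these assumptions the target-absent slope is twice the target-present slope for a focus of one item and for a focus of two items alike, which illustrates why the slope ratio alone cannot identify the focus size.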
Fixation duration and the dwell time
An open question is how the size of the attentional focus, i.e., the number of items to which attention is paid simultaneously, is related to the dwell time. In terms of the associated eye movement parameters, researchers have explored the relation in two opposite ways. On the one hand, in tasks with variable presentation time, an increase of fixation duration was found to be associated with an increase of the number of inspected items (
Mackworth, 1976;
Salthouse & Ellis, 1980;
Scialfa & Joffe, 1998). These results were interpreted as an increase in visual span or focus size, implying a positive correlation between attentional focus and dwell time.
On the other hand, using free viewing with unlimited presentation time, some studies showed an increase of fixation duration with increasing target-distractor-similarity (e.g.,
Hooge & Erkelens, 1996,
1998), others found decreasing saccade amplitudes with increasing target-distractor-similarity (see above). Taken together, this argues for saccade amplitude and fixation duration to be negatively correlated. Theoretically, long fixations and small amplitudes might both reflect low target salience, implying a negative correlation between the attentional focus size and the dwell time (
Müller-Plath & Pollmann, 2003). In the present experiment, which resembled the latter viewing conditions, we thus expected (i) increasing fixation durations with increasing target-distractor-similarity.
An influence of set sizes on fixation durations is rarely reported.
Motter and Belky (
1998a,
1998b) found the fixation duration to be independent of the set size. Since some attentional models that include saliency maps suggest higher perceptual target salience with increasing set size when distractors are homogeneous, the dwell time might covary with the set size, provided one assumes that the dwell time depends on target salience. Linking fixation durations with dwell times, we expected (ii) decreasing fixation durations with increasing set size in the present experiment.
As mentioned above, results for target-absent trials are rarely reported in the visual search literature (
Jacobs, 1986). Since in our experiment the observer did not know anything about target presence when starting inspection, fixations before finding the target should not be influenced by its presence. However, fixations on targets may take more time than fixations on distractors. Regarding the STRAVIS model, it seems plausible that part of the time c (see
Table 6 for a brief description of STRAVIS’ parameters), reflecting the preparation and execution of the motor response, takes place during the target fixation. The fixation we chose as representative (the 2nd fixation of each trial) will in a considerable number of trials be the target fixation in target-present trials. We thus expected (iii) longer fixation durations in target-present than in target-absent trials.
Reinspections, refixations, and the depth of attentional processing
In addition to eye movement parameters, analyzing scan paths gives information about the searching behaviour and efficiency in greater detail. Particularly reinspections and refixations can be interpreted in regard to attentional processes, e.g., incomplete perceptual processing of items, or in regard to the memory of the search path (e.g.,
Horowitz & Wolfe, 1998,
2001, and
2003;
Kristjánsson, 2000;
Peterson et al., 2001;
McCarley, Wang, Kramer, Irwin, & Peterson, 2003;
Dickinson & Zelinsky, 2005,
2007). A reinspection is defined here as a fixation of an item that has previously been visited, with at least one different item visited in between. Synonyms in the literature are refixation, revisitation, or regressive saccade. A refixation, by contrast, is defined here as the immediate revisitation of an item. Since the two types of recurrent fixations might have different functions in decision-making on target absence or target presence, in contrast to other researchers (e.g.,
Dickinson & Zelinsky, 2005;
Hooge & Erkelens, 1998;
Peterson et al., 2001), we analyzed reinspections and refixations separately.
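The two definitions above can be made concrete with a minimal sketch. It assumes that every fixation has already been assigned to an item and that a scan path is given as a sequence of item identifiers; the function name and the labels are hypothetical and do not describe the analysis software used in the study.

```python
def classify_fixations(item_sequence):
    """Label each fixation as 'first', 'refixation', or 'reinspection'.

    refixation   : the same item as the immediately preceding fixation
    reinspection : an item visited before, with at least one other item in between
    """
    labels = []
    seen = set()      # items fixated so far in this trial
    previous = None   # item of the immediately preceding fixation
    for item in item_sequence:
        if item == previous:
            labels.append("refixation")
        elif item in seen:
            labels.append("reinspection")
        else:
            labels.append("first")
        seen.add(item)
        previous = item
    return labels


# Hypothetical scan path over items 1 to 3.
print(classify_fixations([1, 2, 2, 3, 1]))
# ['first', 'first', 'refixation', 'first', 'reinspection']
```

Counting the two labels separately per trial yields the kind of dependent measures described here.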
Reinspections are often reported in the literature but interpreted in different ways (e.g.,
Hooge & Erkelens, 1996;
Dickinson & Zelinsky, 2005,
2007). A small number of reinspections can be associated with memory-driven models. However, many reinspections (
Gilchrist & Harvey, 2006;
Dickinson & Zelinsky, 2005,
2007) need not necessarily imply amnesic search.
Peterson et al. (
2001) suggested that participants might intentionally reinspect items because attention had prematurely left the item before it was adequately processed. They fitted three models of conjunction search that lead to different predictions about the distribution of reinspections based on the hazard function, which gives the conditional probability that an event will occur at time t given that it has not occurred before t. Their first model assumed amnesic search, the second model proposed inadequate processing (miss), and the third model hypothesized conscious inadequate processing (miss + realization). Both the miss model and the miss + realization model fitted the data better than the amnesic model.
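The hazard function defined above can be estimated from observed event times as in the following minimal sketch. It is not the fitting procedure of Peterson et al.; it assumes discrete event times (for example, a hypothetical count of intervening fixations before a reinspection) and no censored observations.

```python
from collections import Counter


def empirical_hazard(event_times):
    """Estimate h(t) = P(event at t | no event before t) for discrete times."""
    counts = Counter(event_times)
    at_risk = len(event_times)  # observations that have not yet had the event
    hazard = {}
    for t in sorted(counts):
        hazard[t] = counts[t] / at_risk
        at_risk -= counts[t]    # these observations leave the risk set
    return hazard


# Hypothetical event times, not data from any study.
print(empirical_hazard([1, 1, 2, 3]))
# {1: 0.5, 2: 0.5, 3: 1.0}
```

Models that differ in their memory assumptions predict differently shaped hazard functions, which is what makes this quantity useful for model comparison.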
The occurrence of refixations is reported often too, but researchers handle them differently. Some authors called them corrective saccades, attributed them to incomplete processing of objects, and excluded them from analysis (e.g.,
Dickinson & Zelinsky, 2005;
Hooge & Erkelens, 1998,
1999).
Peterson et al. (
2001) summed up the durations of subsequent fixations on one item.
Findlay, Brown and Gilchrist (
2000) analyzed all second saccades in a conjunction search task and found that small corrective second saccades (refixations) occurred more frequently when the first saccade landed on a target rather than on a distractor. But taken together, refixations have rarely been analyzed in a systematic fashion.
As mentioned above, most models of visual search, including STRAVIS, postulate that each item (item position) is reviewed at most once and assume that visual search is self-terminating in target-present trials and exhaustive in target-absent trials (e.g.,
Duncan & Humphreys, 1989;
Müller-Plath & Pollmann, 2003;
Treisman & Gelade, 1980;
Treisman & Sato, 1990;
Wolfe, 1994; but see
Chun & Wolfe, 1998, for an alternative account). Applying these models to overt search, reinspections and refixations should not occur.
In the present study, we first wanted to provide evidence for the “Peterson conclusion” that visual search has a memory and reinspections are possible in memory-driven models (
Peterson et al., 2001). Second, we wanted to analyze the function of reinspections and refixations in greater detail. Concerning the former, we chose experimental conditions in which a memory-driven model is highly plausible, i.e.: a) The display was systematically and statically organized (
Gilchrist & Harvey, 2006); b) the items were presented until the subject responded (
Horowitz & Wolfe, 2003); and c) set sizes were only four, six, or eight items (
Dickinson & Zelinsky, 2007). If reinspections supported an amnesic model, we should observe them equally often across all four target-distractor-similarity levels. If they reflected decision uncertainty, their proportion should increase with increasing similarity.
Assuming the latter, we expected (i) the highest proportion of reinspections and refixations to occur in trials with the highest target-distractor-similarity. Supposing that increasing the set size from four to eight items would increase target salience and thereby facilitate decision making, we expected (ii) a larger proportion of reinspections and refixations in smaller set size conditions. Concerning the difference between reinspections and refixations, we assumed that uncertainty about a negative decision arises mainly after having visited the entire set of items without finding a target, and that it would be most effectively reduced by going back to items with the highest probability of being a target - the peaks in a saliency map (
Müller-Plath, 2008;
Wolfe, 1994) - to ensure that there is no target that has been missed. Thus, we expected (iii) more reinspections in target-absent than in target-present trials. Assuming likewise that uncertainty about a positive decision arises mainly after having fixated the target too briefly, it would be most effectively reduced by an immediate refixation. We thus expected (iv) more refixations in target-present than in target-absent trials.