The visual search paradigm
Visual selection processes underlying object recognition in complex scenes have been extensively characterized on the behavioral level; the experimental paradigm is commonly referred to as visual search (see
Wolfe, 1998, for a review). In these experiments the everyday process of visual search was mimicked by instructing subjects to search for a target object within an array of visual objects. In some studies, the exploration of the stimulus array by eye movements was allowed. In these studies, reaction times and the pattern of eye movements were observed (e.g.
Bichot & Schall, 1999;
Findlay, 1997;
Groner & Groner, 2000). In other studies, however, restricted inspection conditions were used, i.e., the subject was instructed not to move the eyes during search (
Wolfe, 1998). This procedure utilized the subject’s ability to direct attention to objects in the periphery of the visual field while fixating a central point. This ability to shift attention was also termed covert attention (
Posner & Petersen, 1990). During visual search the target was present in some trials while only distractors appeared in others. The subject had to decide whether the target was present or not. The number of objects in the array (the set size) was varied, and reaction times were plotted as a function of set size. This function was fitted by linear regression, and its slope was termed the “search slope”. If the search function was flat, reaction times were independent of set size. If it increased, each additional object cost time. In general, the slope of the search function is understood as a measure of the difficulty of the corresponding search task, independently of the postulated selection processes underlying the slope (
Wolfe, 1998).
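The search slope described above is simply the slope of a least-squares line relating reaction time to set size. The following is a minimal sketch of that computation; the function name `search_slope` and all reaction times are hypothetical and were invented purely for illustration, not taken from any of the cited studies.

```python
from statistics import mean


def search_slope(set_sizes, reaction_times_ms):
    """Fit RT = intercept + slope * set_size by ordinary least squares.

    Returns (slope in ms per item, intercept in ms).
    """
    if len(set_sizes) != len(reaction_times_ms) or len(set_sizes) < 2:
        raise ValueError("need at least two paired observations")
    mean_x = mean(set_sizes)
    mean_y = mean(reaction_times_ms)
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    if sxx == 0:
        raise ValueError("set sizes must vary")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(set_sizes, reaction_times_ms))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical mean reaction times (ms), invented for illustration only.
set_sizes = [4, 8, 12, 16]
feature_rt = [450, 450, 450, 450]       # flat: independent of set size
conjunction_rt = [500, 600, 700, 800]   # each extra item costs time

feature_slope, feature_intercept = search_slope(set_sizes, feature_rt)
conjunction_slope, conjunction_intercept = search_slope(set_sizes, conjunction_rt)
print(f"feature search slope:     {feature_slope:.1f} ms/item")
print(f"conjunction search slope: {conjunction_slope:.1f} ms/item")
```

In this toy data set the feature search yields a flat function and the conjunction search a positive slope, which corresponds to the idealized pattern described in the text.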
Figure 2.
The visual search paradigm. Left: stimulus array of a typical feature search. Middle: stimulus array of a typical conjunction search. Right: idealized result pattern.
In the left stimulus array of
Figure 2, the vertical target bar differs from the remaining bars by its orientation (as an elementary visual feature). This type of search was termed feature search. In the middle stimulus array, the target is a vertical-green bar amongst vertical-red and horizontal-green bars. In this case, the target differs from the distractors only in the conjunction of its features. That type of search was named conjunction search. On the right side, idealized search slopes are depicted that resulted from experiments of this type in classic studies (
Treisman & Gelade, 1980). The greater slope of the black curve indicates that the conjunction search is more difficult than the feature search. This basic result pattern was replicated in numerous studies (
Wolfe, 1998;
Cave & Wolfe, 1999).
Later studies, however, revealed striking deviations from this typical pattern: First, some conjunction searches produced flat search functions (
Nakayama and Silverman, 1986). Second, feature search behavior exhibited a strong dependence on attention (
Joseph et al., 1997). These and other results indicated that conjunction and feature search rest on a common selection process and differ only quantitatively (
Duncan & Humphreys, 1989;
Wolfe, 1998). Selection efficiency was argued to depend on the similarity between target and distractors (
Duncan & Humphreys, 1989). Target and distractors are more similar in conjunction search than in feature search, as only in the former, distractors share common features with the target (see
Figure 2).
Functional models of visual search
According to classic-serial models (
Treisman & Gelade, 1980;
Koch & Ullman, 1985), the visual system initially analyzes the whole scene by extracting its elementary features (e.g. color or orientation of contours). Subsequently, these features are integrated into objects by focusing attention on a location. Specifically,
Treisman and Gelade’s (
1980) classical feature integration theory explained the apparent behavioral dichotomy between efficient feature search and inefficient conjunction search with a dichotomy of search mechanisms: Feature search is based on the detection of a unique feature in one of the topographic maps representing visual features; a target with a unique feature does not need to be integrated, and hence, can be found without the employment of attention. By contrast, focal attention is required to integrate visual features into representations of objects. Therefore, single positions of the scene have to be selected serially by focal attention in order to find a target defined by a feature conjunction. This theory was attractive because of its strong link to the multiple parallel ‘feature maps’ in the primate visual cortex (
Felleman and Van Essen, 1991;
Treisman, 1996). The ‘master map’ of locations controlling spatial selection could be naturally mapped onto spatial representations in posterior parietal cortex (PPC;
Treisman, 1996;
Colby & Goldberg, 1999). But it could not account for the efficient conjunction searches and inefficient feature searches observed in later behavioral studies.
Subsequent models postulated a common selection process for feature and conjunction searches, operating in parallel by simultaneously evaluating the behavioral relevance of all objects in the visual scene (
Duncan & Humphreys, 1989;
Desimone & Duncan, 1995; Usher & Niebuhr, 1996).
Duncan and Humphreys’ (
1989) “Attentional Engagement Theory” is based on the assumption that search performance is explained by both the (dis)similarity between targets and distractors and the (dis)similarity among the distractors. Search difficulty depends on the discriminability of target and distractors, and not on the necessity of feature conjunction. All objects in the scene compete for ‘weights’ that control access to a limited-capacity short-term memory stage controlling behavior (see also
Duncan et al., 1997). Crucially, selection in this and related models is guided by a memory representation of the searched-for target, which is now commonly assumed to be stored in the prefrontal cortex (PFC;
Desimone and Duncan, 1995; Usher & Niebuhr, 1996).
In more recent parallel-serial hybrid models, such as Wolfe’s (1994) “Guided Search” (see also
Bichot and Desimone, 2006), the selection process ultimately operates serially, but under the parallel guidance of a memory representation of the target features (‘top-down’) and a representation of the conspicuousness of local features (‘bottom-up’). These two are combined into a topographic representation of visual salience indicating the behavioral relevance of local stimuli (
Treue, 2003). Such a representation has been proposed to exist in the PPC (
Gottlieb et al., 1998) and the frontal eye fields (FEF;
Bichot et al., 1999;
Thompson and Bichot, 2005). During visual search, focal attention is directed to the most salient location first, then to the second most salient location, etc. Top-down guidance becomes more difficult the less discriminable the target is from the distractors. A large-scale neural network model motivated by these hybrid theories (
Hamker, 2004) focuses on the ‘re-entry’ of spatial attention signals in a distributed network and on how this process can be guided by memorized features. This model predicts that attention builds up successively by convergence and feedback. Feedback passes the target information from PFC to IT and to extrastriate areas such as V4. The “oculomotor circuit”, comprising FEF, LIP and superior colliculus, includes this distributed activity and yields a continuous spatial reentry signal.
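The selection scheme of the hybrid models described above (parallel computation of a salience map, followed by serial visits in order of decreasing salience) can be illustrated with a toy sketch. It assumes that bottom-up and top-down signals are combined by a weighted sum. That choice, the function names `priority_map` and `serial_search`, and all numerical values are illustrative assumptions, not the implementation of any published model.

```python
def priority_map(bottom_up, top_down, w_bu=1.0, w_td=1.0):
    """Combine bottom-up and top-down signals per location (assumed weighted sum)."""
    return {loc: w_bu * bottom_up[loc] + w_td * top_down[loc]
            for loc in bottom_up}


def serial_search(priority, target_location):
    """Visit locations in order of decreasing priority.

    Returns (number of attention shifts until the target is reached,
    visiting order). The count is None if the target is absent.
    """
    order = sorted(priority, key=priority.get, reverse=True)
    for n_shifts, loc in enumerate(order, start=1):
        if loc == target_location:
            return n_shifts, order
    return None, order


# Hypothetical activation values for four locations; the target is at "A".
bottom_up = {"A": 0.2, "B": 0.6, "C": 0.3, "D": 0.1}   # local conspicuousness
top_down = {"A": 0.8, "B": 0.1, "C": 0.2, "D": 0.3}    # match to memorized target

# With top-down guidance the target location has the highest priority.
shifts_guided, order_guided = serial_search(
    priority_map(bottom_up, top_down), "A")

# Without top-down guidance (weight 0) more conspicuous distractors are visited first.
shifts_unguided, order_unguided = serial_search(
    priority_map(bottom_up, top_down, w_td=0.0), "A")

print(shifts_guided, order_guided)
print(shifts_unguided, order_unguided)
```

In this sketch, weakening the top-down signal increases the number of attention shifts needed to reach the target, which mirrors the claim that search becomes harder when guidance by the target representation is less effective.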
Thus, these three broad classes of functional models make different predictions regarding the cortical substrates involved in the selection mechanisms of visual search. Parallel models predict that the memory representation of the target, stored in the PFC, directly controls the selection in visual search, operating on representations in the visual cortices. By contrast, if spatial locations are selected serially, as claimed by the classic serial and hybrid models, the PPC (and FEF) must be engaged in the processes that control visual search.
A frontoparietal network for covert visual search
In a first experiment (
Donner et al., 2000), we therefore tested if the FEF and these three intraparietal subregions are also involved in visual conjunction search, as predicted by classical and hybrid models involving a spatially serial selection process. In the experimental condition, subjects searched for a target defined by a conjunction of color and orientation (conjunction search, see
Figure 3). In the baseline condition (easy feature search), subjects searched for a uniquely colored target, regardless of its orientation. In order to minimize the occurrence of saccadic eye movements in the scanner, the search arrays were masked after a presentation time of 80 ms.
Reaction times increased significantly with set size in conjunction search but not in easy feature search. This indicated that the experimental condition (conjunction search) was attentionally more demanding than the control condition (easy feature search) (see
Figure 3).
Figure 3.
Left panel: Visual search conditions. Exemplary “target-present trials” are shown for the Hard Feature, Easy Feature, and Conjunction tasks. The target cluster consisted of vertical bars of either color in Hard Feature (lower right quadrant), yellow bars of either orientation in Easy Feature (upper left quadrant), and of yellow and vertical bars in Conjunction (upper right quadrant). Subjects were instructed to fixate and to indicate the absence or presence of the target. Due to the higher luminance of the yellow clusters, the ratio of target to distractor salience was lower in Hard Feature, rendering search more difficult than in Easy Feature.
Right panel: search slopes for conjunction search, hard feature search and easy feature search. Reaction times from psychophysical validation are depicted as function of set size. Values are means of correct responses. The error bars indicate the standard error.
When the two conditions (conjunction search and easy feature search) were contrasted, significant differential activations were ascribed to top-down modulation of neural activity (see
Figure 4). The activated region in the dorsal part of the precentral sulcus corresponds unambiguously in its localization to the human analogue of FEF (Corbetta, 1998;
Paus, 1996;
Courtney et al., 1998; Beauchamp et al., 2001). FEF and PPC were consistently activated. The PPC activation was more extensive in the right hemisphere. In principle, this is in line with the assumption of the predominance of the right hemisphere for the control of spatial attention (
Mesulam, 1999). However, it indicates a quantitative rather than a qualitative difference between the contributions of the two PPCs. The group activation showed three statistical peaks in the intraparietal sulcus corresponding to the three sub-regions AIPS, PIPS and IPTO. The anatomical position and the Talairach coordinates of AIPS, PIPS and IPTO agree well with the three sub-areas found by
Corbetta et al. (
1998) during spatial shifts of attention. For the group analyses, individual brains had to be transformed into a standardized brain, which reduces spatial discriminatory power. This leads to largely fused activations in PPC. Therefore, a multiple regression analysis was performed that was based not on group data but on individual data. As a result, activations in the parietal regions were spatially distinct, consistently across subjects (see
Figure 4). These results suggest an involvement of the human frontal eye field in covert visual selection of potential targets during search. The activation of the right posterior parietal cortex was roughly consistent with the results of an earlier PET study of visual search using a similar stimulus array (
Corbetta et al., 1995), in which a conjunction search task was also contrasted with an easy feature task. However,
Corbetta et al. (
1995) described no task-related modulation in the FEF. Moreover, surface reconstruction and analysis of individual subjects permitted more fine-grained mapping of multiple distinct parietal activations in our study. The activation site in the right superior parietal lobe observed by
Corbetta et al. (
1995) seemed to correspond best to the right posterior IPS region.
Figure 4.
A) Group activation map, superimposed on one subject’s rendered brain. Left, dorsolateral view; right, dorsoposterior view. Activations are produced by conjunction search (CS), relative to easy feature search (EFS). Abbreviations: PreCeS, precentral sulcus; PostCeS, postcentral sulcus; AIPS, anterior intraparietal sulcus; PIPS, posterior intraparietal sulcus; IPTO, intraparietal transverse occipital sulcus. Posterior parietal activation is larger in the right than in the left hemisphere, but contains corresponding peaks in both hemispheres. B) Sections of folded posterior parietal cortex of three exemplary subjects with their individual activation patterns. PIPS consistently extends to the convexity of the superior parietal lobule.
Feature binding is not critical for engaging the frontoparietal network in visual search
What type of attentional process does this activation pattern reflect? It had been proposed that the engagement of parietal cortex in object identification is linked to the process of feature conjunction through spatial selection (
Friedman-Hill et al., 1995;
Treisman, 1996;
Robertson, 1998). We wondered whether the necessity to conjoin the features (i.e. color and orientation) for target identification is a prerequisite for the involvement of PPC as well as FEF in attentive visual search. Besides imposing the need for feature integration, visual conjunction searches are also more difficult than many simple feature searches because the targets are commonly of lower relative saliency as compared to the distractors (
Wolfe et al., 1994;
Wolfe, 1998). We therefore investigated in a second experiment (
Donner et al., 2002), whether high search difficulty alone, without the necessity for feature integration, is sufficient for a frontoparietal engagement in visual search.
To this end, we introduced a new experimental condition: hard feature search. By means of an adequate psychophysical manipulation, this new condition was adjusted to the difficulty of conjunction search (see
Figure 3). In order to render hard feature search more difficult than easy feature search, the ratio of target to distractor salience was decreased in hard feature search by modulating the objects’ luminance. Thus, the difference between experimental and control condition reflects only difficulty and not a combination of difficulty and feature conjunction (as in the first experiment). If the frontoparietal areas are differentially activated, their activity correlates generally with difficulty, as all other factors (sensory stimulation and motor response) are identical.
Moreover, both differential responses (conjunction search minus easy feature search, and hard feature search minus easy feature search) can be assessed by a direct quantitative comparison. This comparison allows for testing whether there are areas that show equally strong activation, which is correlated only with difficulty and not with feature conjunction. For that purpose, regions in FEF, AIPS, PIPS and IPTO, activated during conjunction search, were identified on cortical surfaces and served as regions of interest (ROI) (see
Figure 5).
In each of the four ROIs, the fMRI signal was expressed relative to the common control condition (easy feature search). Furthermore, the fMRI signal was averaged over all voxels of the ROI, over all trials of each condition, over both hemispheres and over all subjects. The resulting means were compared statistically. The amplitude of the fMRI signal modulation in AIPS was higher during hard feature search than during conjunction search, whereas the modulation in FEF and IPTO was stronger during conjunction search. There was no difference between conditions in PIPS.
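The ROI analysis described above (expressing the signal relative to the control condition, then averaging over voxels and subjects) can be sketched as follows. The exact normalization used in the study is not specified in the text. Percent signal change relative to the control mean is therefore an assumption, and the function names and all signal values are hypothetical.

```python
from statistics import mean


def roi_response(condition_signal, control_signal):
    """Percent signal change of a condition relative to the control condition.

    Both arguments are lists of voxel signals from one ROI, already
    averaged over trials and pooled over hemispheres (assumed layout).
    """
    baseline = mean(control_signal)
    if baseline == 0:
        raise ValueError("control signal must have a non-zero mean")
    return 100.0 * (mean(condition_signal) - baseline) / baseline


def group_response(subjects):
    """Average the normalized ROI response over subjects."""
    return mean(roi_response(s["condition"], s["control"]) for s in subjects)


# Hypothetical voxel signals (arbitrary scanner units) for two subjects.
subjects = [
    {"condition": [101.0, 103.0], "control": [100.0, 100.0]},  # 2 % change
    {"condition": [202.0, 202.0], "control": [200.0, 200.0]},  # 1 % change
]

group_mean = group_response(subjects)
print(f"group-average ROI response: {group_mean:.2f} % signal change")
```

Normalizing within each subject before averaging keeps differences in overall signal level between subjects from dominating the group mean.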
Figure 5.
A) ROIs for the comparison of the fMRI responses in conjunction search and hard feature search. Significantly activated regions during conjunction search served as ROIs and were marked in red on the enfolded cortical surfaces. B) fMRI responses during Hard Feature and Conjunction. Normalized and averaged response amplitudes of the regions in FEF, AIPS, PIPS and IPTO of significant activation during Conjunction are shown for Conjunction and Hard Feature. Error bars represent standard error. Significant differences between Conjunction and Hard Feature are indicated by “*” for P < 0.05 and by “**” for P < 0.01. Abbreviations: FEF, frontal eye field; AIPS, anterior intraparietal sulcus; PIPS, posterior intraparietal sulcus; IPTO, intraparietal transverse occipital sulcus.
The activity of PIPS was tightly correlated with search difficulty. This finding supports the hypothesis that this area contains the human analogue of LIP of the macaque (
Corbetta, 1998;
Corbetta et al., 1998). Based on physiological evidence, there is presumably a salience map implemented in LIP (
Gottlieb et al., 1998;
Itti & Koch, 2001) on which, according to spatial serial models, spatial selection takes place (
Wolfe, 1994). The activity of LIP during search should reflect differences in difficulty. Furthermore, the diverse pattern of modulation of AIPS, PIPS and IPTO during both search conditions indicates that PPC sub-regions have different functional properties.
To sum up, conjunction search and hard feature search were not discriminable in their psychophysical characteristics during the psychophysical validation. The absence of a difference in reaction times between conjunction search and hard feature search at a set size of four objects during the two experiments is in line with the psychophysical validation. In the second experiment, similar areas were activated as in the first experiment, in particular FEF, AIPS, PIPS and IPTO. Analogous to the first experiment, the results indicate the bilateral involvement of PPC during search with a gradual asymmetry in favor of the right hemisphere. In particular, this result seems to contradict the common hypothesis that the activation of PPC during visual search specifically reflects the process of feature conjunction. However, PIPS was the only region in which the strength of activation did not differ significantly between conjunction search and hard feature search. The response of this region thus reflects the matched difficulty of the two tasks most clearly. Further control experiments excluded the possibility that the differential activations in the two experiments reflected differences in eye movements or the minimal stimulus differences between experimental and control conditions. Overall, the results show that the differential activations in the two experiments reflect neither sensory nor motor processes but rather visual selection processes.
Nonetheless, PPC engagement is a very general feature of a difficult visual task (
Wojciulik & Kanwisher, 1999;
Culham & Kanwisher, 1999). Two mechanisms have been proposed to account for this general role: (1) PPC is a source of top-down signals counteracting suppressive effects of distractors on the target, thereby biasing object competition towards the target (
Reynolds & Desimone, 1999). (2) PPC actively inhibits distractors (
Wojciulik & Kanwisher, 1999). The common characteristic of both hypotheses is the critical significance of the presence of multiple distractors for a PPC involvement in visual tasks. This leads to the following prediction:
Parietal cortex should not be engaged in visual search in the absence of distractors in the visual field.
The presence of distractors is critical for the engagement of some, but not all, frontoparietal areas
In a further experiment, we investigated (
Donner et al., 2003) whether the parietal cortex is also engaged in visual search without distractors. Two single object visual search tasks were matched in sensory stimulation and motor requirements, but were different in task difficulty. Differences between fMRI responses during both tasks were found within (predominantly left) AIPS and IPTO. Activation of PIPS and FEF was less reliable and failed to be significant in the group average (see
Figure 6).
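The significance criterion applied to the ROI responses in this experiment (a differential response counts as significant if its confidence interval does not include zero) can be sketched as follows. The sketch uses a normal approximation for the critical value, which is an assumption made to keep the example self-contained. With few subjects, a t-based critical value would be more appropriate. The per-subject values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(values, level=0.95):
    """Two-sided confidence interval of the mean (normal approximation)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    center = mean(values)
    sem = stdev(values) / sqrt(n)                  # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2.0)    # critical value
    return center - z * sem, center + z * sem


def is_significant(values, level=0.95):
    """A differential response is significant if the interval excludes zero."""
    lower, upper = confidence_interval(values, level)
    return lower > 0 or upper < 0


# Hypothetical per-subject differential responses in one ROI (% signal change).
responses = [-0.08, 0.32, 0.12, 0.52, 0.22]

sig_95 = is_significant(responses, level=0.95)
sig_99 = is_significant(responses, level=0.99)
print("95 % criterion:", sig_95)
print("99 % criterion:", sig_99)
```

The example data were chosen so that the response passes the 95% criterion but not the stricter 99% criterion, illustrating how the set of significant regions can shrink as the confidence level is raised.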
Figure 6.
ROI responses during conjunction in the single object experiment. The signal is normalized to the mean of feature. Group averages are displayed with 95% confidence intervals on the left and 99% confidence intervals on the right. The amplitudes of differential responses were accepted as significant if the confidence interval did not include zero. In the group average, significant differential responses were found in AIPS and IPTO bilaterally and in the FEF and PIPS only in the right hemisphere. According to the 99% confidence criterion, significant responses were restricted to AIPS and IPTO of both hemispheres in the group average.
Accordingly, these findings indicate that parts of PPC are engaged in attentional control even if a single peripheral object has to be identified. Neither the presence of inter-object competition (
Reynolds & Desimone, 1999) nor the necessity for distractor inhibition (
Wojciulik & Kanwisher, 1999) seems to be a prerequisite for their engagement. What kind of attention mechanism does PPC engagement in single object search reflect? At least three types of mechanism are conceivable: (1) endogenous control of spatial attention shifts towards the peripheral object (
Wolfe, 1994;
Treisman & Gelade, 1980;
Corbetta et al., 1995), (2) prolonged maintenance of the attentional focus at the peripheral location during the identification of the feature conjunction (
Chelazzi, 1999), and (3) the control of feature-based attention (
Chelazzi, 1999;
Wolfe, 1994;
Desimone & Duncan, 1995;
Grossberg et al., 1994).
Co-activation of multiple sub-regions appears to be a characteristic feature of parietal lobe function, complicating attempts to understand its functional organization (
Culham & Kanwisher, 1999). By contrast, the present data point to a functional dissociation: AIPS and IPTO were consistently engaged while PIPS was not. Interestingly, Shulman and co-workers observed a predominantly left-hemispheric activation in AIPS during the delay of a non-spatial feature matching task (
Shulman et al., 2002). This finding contrasts with the reliable predominance of right PPC in studies of spatial attention (
Mesulam, 1999;
Corbetta & Shulman, 2002), but corresponds well with the present results.
We found that two sub-regions of the parietal cortex, AIPS and IPTO, are engaged in the attentional control of visual conjunction search irrespective of the presence of multiple distractors. By contrast, the engagement of another sub-region, PIPS, seems to presuppose the presence of distractors. Similar to non-spatial attention tasks, parietal activity during single object search predominates in the left hemisphere.