Introduction
Intuitively, when we see a person shift his or her gaze in a particular direction, we seem to be inclined to also look in that direction. Studies of social attention, investigating the effects of other people’s gaze shifts, typically present participants with a static image of a person looking in a certain direction, while a peripheral target is presented. In these studies, the direction of perceived gaze is non-predictive (e.g.,
Bayliss & Tipper, 2006; Friesen, Moore, & Kingstone, 2005; Quadflieg, Mason, & Macrae, 2004;
Ristic & Kingstone, 2005; Sato, Okada, & Toichi, 2007) (for a review, see Frischen, Bayliss, & Tipper, 2007) or counter-predictive (
Driver et al., 1999; Friesen, Ristic, & Kingstone, 2004;
Tipples, 2008) of forthcoming target direction, but faster response times are found for targets congruent with the direction of perceived gaze. Such congruency effects have been observed for manual responses (
Driver et al., 1999;
Friesen & Kingstone, 1998;
Friesen et al., 2004;
Sato et al., 2007), suggestive of covert shifts of attention, and also for overt saccadic (eye movement) responses made towards peripheral targets (Itier, Villate, & Ryan, 2007;
Friesen & Kingstone, 2003;
Kuhn & Kingstone, 2009; Mansfield, Farroni, & Johnson, 2003; Ricciardelli, Bricolo, Aglioti, & Chelazzi, 2002).
While these studies provide valuable information about the influence of perceiving someone’s averted gaze, it would be interesting to know what happens in an observer who perceives the actual gaze shift rather than its result. Stimuli showing the actual gaze shift have often been avoided in studies of social attention, because of a possible confound between the effects of the social aspect of the cue and similar effects of the perceived motion direction (Farroni, Johnson, Brockbank, & Simion, 2000). At a neural level, such a confound might be expected, as single cell recording studies have demonstrated that neurons sensitive to social cues (
Perrett et al., 1985; Perrett, Hietanen, Oram, & Benson, 1992) may also respond to biological motion (
Oram & Perrett, 1994). Consequently, stimuli used in studies of social attention are often carefully designed to avoid any possible motion cues, for example by presenting the averted gaze stimulus immediately after the fixation stimulus (an empty screen containing a fixation symbol only), or by covering the eyes in the image before the presentation of the gaze cue (Bayliss, Pellegrino, & Tipper, 2005;
Hermens & Walker, 2010).
A few studies have attempted to address these issues and to extend the work on social attention to a more realistic situation using dynamic cues (
Bayliss et al., 2005;
Farroni et al., 2000;
Kuhn & Tipples, 2011;
Ricciardelli et al., 2002;
Rutherford & Krysko, 2008; Swettenham, Condie, Campbell, Milne, & Coleman, 2003). One of the strategies applied has been to compare the effects of dynamic social cues with those arising from a gaze stimulus in which the eyes remain stationary while the head moves in the opposite direction. The results of such comparisons have been mixed. In a comparison of adults with autism spectrum disorder (ASD) and a group of controls, both groups showed patterns of response times congruent with an attention shift in the direction of the perceived gaze shift rather than the direction of the motion of the head (
Rutherford & Krysko, 2008). Likewise, comparing male and female participants from a typical student population, attention shifts were found to follow the direction of observed gaze, rather than the direction of the shift of the head (
Bayliss et al., 2005). By contrast, young infants (16 to 21 weeks old) displayed gaze behavior suggestive of attention shifts in the direction of perceived motion (
Farroni et al., 2000). Interestingly, gaze shifts induced by an opposite shift of the head (pupils stationary) and a static gaze cue (eyes first covered and later revealed) showed similar effects on manual response times (
Bayliss et al., 2005), suggesting that the influence of static and dynamic gaze cues might be alike.
Studies of social attention have often used response times to study the effects of the cues in the observer. It is unclear, however, whether the effects of social cues depend on this particular measure, or whether similar results can be obtained if other measures are used. One possible other measure is the frequency of saccade direction errors following a social cue when participants have to make eye movements towards a peripheral target (
Kuhn & Benson, 2007;
Kuhn & Kingstone, 2009). Erroneous saccades in the direction of the cue rather than the target indicate that the cue elicited an automatic saccadic response in its direction (Irwin, Colcombe, Kramer, & Hahn, 2000; Ludwig, Ranson, & Gilchrist, 2008). Comparisons of saccade errors revealed no differences in their occurrence across social (gaze) and symbolic (arrow) static cues (
Kuhn & Benson, 2007;
Kuhn & Kingstone, 2009), and displayed similar patterns of results as response times, suggesting that the two measures (saccadic response times and direction error rates) tap into the same underlying process. Another possible measure is obtained by inspecting trajectories of saccadic eye movements towards a target presented in an orthogonal direction to that of the gaze cue (
Hermens & Walker, 2010;
Nummenmaa & Hietanen, 2006). Trajectories of eye movements have been found to curve away from the direction of covert attention (
Nummenmaa & Hietanen, 2006; Sheliga, Riggio, & Rizzolatti, 1994; Sheliga, Riggio, Craighero, & Rizzolatti, 1995; Van der Stigchel, Meeter, & Theeuwes, 2007) and the onset of peripheral distractors (
Doyle & Walker, 2001; McSorley, Haggard, & Walker, 2004, 2005, 2006, 2009; Van Zoest, Van der Stigchel, & Barton, 2008; Walker, McSorley, & Haggard, 2006), and might therefore provide a measure of attention and response preparation following the perception of a direction cue. Studies applying saccade trajectories to examine the influence of social cues have yielded contradicting results, with saccade trajectories deviating from the cue in one study (
Nummenmaa & Hietanen, 2006), but not in the other (
Hermens & Walker, 2010). The first goal of the present study will therefore be to compare the different possible measures of attention shifts and response preparation to determine whether they yield similar patterns of results.
Whereas several studies using dynamic social cues have shown that attention in the observer tends to follow the direction of perceived gaze rather than that of perceived motion (
Bayliss & Tipper, 2006;
Rutherford & Krysko, 2008), it is not clear whether the same shifts of attention can also be obtained with different types of cues. For static cues, this question has been examined by comparing the effects of a social (gaze) cue with the effects of arrows (
Tipples, 2002;
Kuhn & Benson, 2007), direction words (Hommel, Pratt, Colzato, & Godijn, 2001), or a face with an extended tongue (Downing, Dodds, & Bray, 2004), often demonstrating similar effects of social and non-social cues. For dynamic cues such a comparison has not yet been made, and therefore the second aim of the present study is to compare socially relevant and socially less relevant dynamic cues.
Methods
Participants
Twenty-three students from the psychology research pool (most of them female and between 18 and 25 years of age) and author FH took part in the experiment in return for payment or course credit. Participants gave informed consent for their participation in the study, which was approved by the local ethics committee.
Apparatus
Participants were seated at a distance of 57cm from a 21 inch CRT computer screen on which the stimuli were presented, with their head position restricted by a chin rest. Eye movements were recorded using an Eyelink II (SR Research Osgood, ON, Canada) video-based eye tracker. Eye tracking was performed binocularly, but as the direction of gaze is often similar in both eyes, the analysis only used the data of the right eye. Stimuli were displayed using the Experiment Builder software package (SR Research Osgood, ON, Canada), while a second PC recorded the participants’ eye movements at a sampling rate of 500Hz in pupil-only mode.
Stimuli
Videos, recorded with a Canon Powershot A430 photo camera at a frame rate of 20fps in the 640x480 pixels AVI mode, served as the stimuli. The video clips, reduced to a size of 320 by 240 pixels (12 by 9.1 degrees of visual angle), were presented in the center of the screen, with the fixation symbol (a small plus sign) vertically aligned with the position of the eyes of the ‘actor’ in the video clip (see
Figure 1). To the left and right of fixation two place-holders (circles with a diameter of 0.6 degrees) were presented at a distance of 7.2 degrees from fixation. After a delay, the target (a plus sign, 0.4 degrees in height and width) appeared inside one of the place-holders or 6 degrees directly above fixation.
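The stimulus sizes above are given in degrees of visual angle at a viewing distance of 57cm. As a rough illustration of how such values relate to physical size on the screen, the following Python sketch converts between the two. The function names are hypothetical and not taken from the study, and because the pixel pitch of the monitor is not reported, no conversion to pixels is attempted.

```python
import math

VIEWING_DISTANCE_CM = 57.0  # viewing distance reported in the Apparatus section


def visual_angle_deg(size_cm, distance_cm=VIEWING_DISTANCE_CM):
    """Visual angle (degrees) subtended by an object of the given size,
    centred on the line of sight."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


def size_cm_for_angle(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """Physical size (cm) that subtends the given visual angle."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


if __name__ == "__main__":
    # Object sizes mentioned in the text: target, place-holder, clip height and width.
    for angle in (0.4, 0.6, 9.1, 12.0):
        print(f"{angle:5.1f} deg -> {size_cm_for_angle(angle):6.2f} cm")
```

At 57cm, one centimetre on the screen subtends very nearly one degree, which is why this viewing distance is a common choice.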
The recorded video clips were edited into sections 1.5 seconds long using the Windows Movie Maker 2 software from Windows XP (SP2). The first 500ms of each video clip showed either the actor looking straight ahead (‘eyes’ or ‘tilt’ movie clips) or an empty room (‘walk’), after which the movement started (except in the ‘neutral’ condition). The video clips produced by Windows Movie Maker were converted into the appropriate format for the Experiment Builder (SR Research Osgood, ON, Canada) stimulus presentation software using the open source package MMconvert. Several video clips of the same movement were used to incorporate some of the natural variation in the speed of the movements.
Three types of video clips were created, as illustrated in
Figure 1. The ‘eyes’ video clips showed an actor making a horizontal eye movement to the left or to the right (generally lasting for 3 frames or 60ms), while in the control condition, the actor looked straight ahead throughout. In the ‘tilt’ video clips, the actor tilted his head leftward or rightward (about 6 video frames or 120ms) or kept his head straight (control condition). During these head movements, the actor’s gaze did not change. Finally, in the ‘walk’ condition, the clips showed the actor walking from the left to the right of the screen or vice versa (about 20 video frames or 400ms). In the control condition of the ‘walk’ condition, the actor did not appear and only the wall in the background was shown. The length of each video clip was trimmed such that the movement started 500ms into the clip (to avoid influences of the onset of the video clip on performance), and the total video lasted for 1500ms. For the ‘eyes’ and ‘tilt’ condition, this meant that the person’s head was in the center of the screen and no movement was made until 500ms after the onset of the video clip. For the ‘walk’ condition, the screen remained empty during the first 500ms. A fixation screen, containing the fixation cross and two place-holders, was presented for a random interval between 800ms and 1200ms before the onset of the video clip. The target always appeared 950ms after the onset of the video clip (i.e., 450ms after the onset of the movement in the video clip), to ensure the movement in the clip was completed before the target appeared.
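The timing of events within a trial can be summarized in a short Python sketch. The constants are taken from the description above; the function name, the dictionary keys and the use of a uniform distribution for the fixation interval are assumptions made for illustration only.

```python
import random

FIXATION_RANGE_MS = (800, 1200)  # random fixation interval before the clip
MOVEMENT_ONSET_MS = 500          # movement starts 500ms into the clip
TARGET_ONSET_MS = 950            # target appears 950ms into the clip
CLIP_DURATION_MS = 1500          # total length of each clip


def trial_timeline(rng):
    """Return event times (ms from trial start) for a single trial.

    Assumption: the fixation interval is drawn from a uniform distribution;
    the text only states that it was random between 800ms and 1200ms.
    """
    clip_onset = rng.uniform(*FIXATION_RANGE_MS)
    return {
        "fixation_onset": 0.0,
        "clip_onset": clip_onset,
        "movement_onset": clip_onset + MOVEMENT_ONSET_MS,
        "target_onset": clip_onset + TARGET_ONSET_MS,
        "clip_offset": clip_onset + CLIP_DURATION_MS,
    }


if __name__ == "__main__":
    for name, time_ms in trial_timeline(random.Random(1)).items():
        print(f"{name:15s} {time_ms:8.1f} ms")
```

The fixed 450ms gap between movement onset and target onset is what allows the movement to act as a temporal warning signal, independently of its direction.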
Design
Participants performed six blocks of 36 trials, consisting of two repeated blocks of each type of movement (‘eyes’, ‘tilt’ or ‘walk’). The order of the blocks was counter-balanced for each participant (following an ABC-CBA design) and the type of cue that was presented in the first block was randomized across participants. The cues were not predictive of the upcoming target direction. When a movement was presented in the video clip, it was equally often followed by a target in the same direction, in the opposite direction or in the upward direction.
Within each block, video clips showing a movement to the left or the right (for the ‘congruent’ and ‘incongruent’ conditions), and video clips without a movement (for the ‘neutral’ condition) were paired with targets appearing to the left, to the right or above fixation. Different repetitions of the same movement by the actor were used (four with a movement to the left, four with a movement to the right and four in which no movement occurred), to counteract the effects of small differences in how the movement was performed. The order of the video clips and target locations was randomized within each block.
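The block structure can be sketched in Python as follows. This is an illustration under stated assumptions rather than the original experiment script: it assumes that each of the twelve clips was crossed with each of the three target locations (which yields the reported 36 trials per block), that the order of all three cue types was shuffled before mirroring, and it uses a hypothetical ‘vertical’ label for movement clips followed by an upward target.

```python
import itertools
import random

CUE_TYPES = ("eyes", "tilt", "walk")
TARGETS = ("left", "right", "up")
# Four recorded repetitions per movement direction, plus four no-movement clips.
CLIPS = [(direction, rep)
         for direction in ("left", "right", "none")
         for rep in range(4)]


def block_order(rng):
    """Mirrored ABC-CBA block order (assumption: all three cue types shuffled)."""
    first_half = list(CUE_TYPES)
    rng.shuffle(first_half)
    return first_half + first_half[::-1]


def condition(direction, target):
    """Label a trial by the relation between movement and target direction."""
    if direction == "none":
        return "neutral"
    if target == "up":
        return "vertical"  # hypothetical label; used for trajectory deviations
    return "congruent" if direction == target else "incongruent"


def block_trials(cue_type, rng):
    """Cross every clip with every target location and shuffle the result."""
    trials = [
        {"cue": cue_type, "direction": d, "clip": rep,
         "target": t, "condition": condition(d, t)}
        for (d, rep), t in itertools.product(CLIPS, TARGETS)
    ]
    rng.shuffle(trials)
    return trials


if __name__ == "__main__":
    rng = random.Random(0)
    print(block_order(rng))
    print(len(block_trials("eyes", rng)), "trials per block")
```

Under this crossing, movement clips are followed equally often by a target in the same, the opposite and the upward direction, so the cue carries no information about target location.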
Procedure
Each block started with a 9-point calibration of the eye tracker. The first of the six experimental blocks started with 5 to 10 practice trials. Drift correction was applied after every 12th trial to realign the recorded eye positions to central fixation if required. Blocks were separated by a short break.
Each trial in the experiment consisted of the presentation of a central fixation stimulus for a random duration of between 800ms and 1200ms, followed by the video clip for 1500ms. A peripheral target appeared left, right or above fixation 950ms after the onset of the video clip (to ensure that the movement in each video clip had ended by the time the target appeared; see
Figure 1). Participants were instructed to look at the fixation symbol (a plus sign, ‘+’) and to then make an eye movement to the target (another plus sign) as quickly as possible. They were also informed that video clips would be played in the background and that the movements in these clips were unrelated to where the target was going to appear.
Data analysis
Saccadic latencies were computed by measuring the time from the onset of the target to the onset of the saccade. Saccades were detected by the Eyelink II software, applying a velocity criterion of 30 deg/sec and an acceleration criterion of 8,000 deg/sec².
For upward saccades, trajectory deviations were computed as the maximum distance of the trajectory to the straight path connecting the start and the end point of the saccade, and expressed as a percentage of the amplitude of the saccade (‘peak deviation’). Saccade trajectory deviations were then compared to baseline by subtracting the deviations for the no-movement video clips from those with a movement (left or right). Because trajectory deviations were similar for movement cues to the left and to the right (except for their sign), these numbers were pooled into a single number reflecting the size of the deviations away from the direction of the cue.
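The peak deviation measure can be illustrated with a minimal Python sketch. The function names, the sign convention (positive means rightward for an upward saccade, with y increasing upward) and the pooling of left and right cues by simple averaging are assumptions made for illustration; the text does not specify these details.

```python
import math


def peak_deviation_percent(xs, ys):
    """Signed peak deviation of a saccade trajectory as % of its amplitude.

    xs, ys: gaze samples from saccade start to saccade end, in any
    consistent unit, with y increasing upward (assumption).
    Positive values lie to the right of the straight start-to-end path.
    """
    x0, y0 = xs[0], ys[0]
    dx, dy = xs[-1] - x0, ys[-1] - y0
    amplitude = math.hypot(dx, dy)
    if amplitude == 0:
        raise ValueError("saccade has zero amplitude")
    # Signed perpendicular distance of each sample to the straight path
    # (2-D cross product divided by the path length).
    signed = [((x - x0) * dy - (y - y0) * dx) / amplitude
              for x, y in zip(xs, ys)]
    peak = max(signed, key=abs)
    return 100.0 * peak / amplitude


def deviation_away_from_cue(dev_left, dev_right, dev_neutral):
    """Baseline-corrected deviation away from the cue, pooled over directions.

    Inputs are mean signed peak deviations (positive = rightward) for
    leftward, rightward and no-movement clips. Positive output means a
    deviation away from the cued direction.
    """
    away_left = dev_left - dev_neutral       # rightward is away from a left cue
    away_right = -(dev_right - dev_neutral)  # leftward is away from a right cue
    return (away_left + away_right) / 2.0


if __name__ == "__main__":
    # An upward saccade of amplitude 10 that bulges 1 unit to the right.
    print(peak_deviation_percent([0, 1, 0], [0, 5, 10]))
```

Expressing the deviation relative to saccade amplitude makes the measure comparable across saccades of slightly different lengths.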
Saccadic latencies and saccade trajectory deviations were based on the trials in which participants moved their eyes directly from the fixation point to the peripheral target. That is, before computing these measures, trials in which participants moved their eyes in the wrong direction or made a saccade of insufficient amplitude, or in which participants blinked during the first saccade after target onset, were removed. In addition, responses with a reaction time (RT) of less than 80ms or more than 2.5 standard deviations above the mean were excluded from the computation of these measures. As a result, data of one participant (with more than 30% of the trials removed) were removed, and for the remaining 23 participants in the analysis, latencies and trajectory deviations were based, on average, on 90.2%, 91.7% and 89.3% of the trials in the ‘eyes’, ‘tilt’ and ‘walk’ conditions, respectively. In contrast, saccade direction errors were based on all trials (from the remaining 23 participants) and were defined as trials with saccadic responses after target onset that did not contain blinks, were of sufficient amplitude (>3deg), and were made in a direction other than that of the target. In addition, we examined the latency of these saccade direction errors. Because saccade direction errors were rather infrequent, we computed their median latency to reduce the influence of outlier response times without having to remove responses on the basis of their latencies.
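The latency-based exclusion criteria can be sketched in Python as shown below. The function names are hypothetical. The sketch assumes that the mean and the sample standard deviation are computed over all latencies passed in, and it leaves open whether this is done per participant, per condition or both, since the text does not say.

```python
import statistics


def filter_latencies(latencies_ms, min_rt_ms=80.0, sd_cutoff=2.5):
    """Drop responses faster than min_rt_ms and responses slower than
    mean + sd_cutoff * SD (assumption: sample SD over the supplied data)."""
    mean = statistics.mean(latencies_ms)
    sd = statistics.stdev(latencies_ms)
    upper = mean + sd_cutoff * sd
    return [rt for rt in latencies_ms if min_rt_ms <= rt <= upper]


def exclude_participant(n_removed, n_total, threshold=0.30):
    """True when more than 30% of a participant's trials were removed."""
    return n_removed / n_total > threshold


def median_error_latency(error_latencies_ms):
    """Median latency of direction errors; no latency-based trimming."""
    return statistics.median(error_latencies_ms)


if __name__ == "__main__":
    data = [200] * 20 + [50, 1500]
    print(len(filter_latencies(data)), "of", len(data), "latencies kept")
    print(median_error_latency([150, 180, 900]))
```

The median is used for the direction errors because, with only a few error trials per participant, a single very slow response would otherwise dominate the mean.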
Discussion
We compared saccadic latencies, saccade direction errors and saccade trajectory deviations for a dynamic social cue (eye-gaze shift) to the cueing effects of two movements less commonly associated with attention shifts in another person (a head tilt movement and a person walking past). For all types of cues, faster response times were found towards the peripheral target when the direction of perceived motion was congruent with its location, with the largest cueing effects for the socially relevant cue (comparing congruent and incongruent directions). Saccade direction errors and trajectory deviations revealed very similar effects for the two centrally presented cues (eye-gaze and head-tilting), whereas opposite effects were found for the walking-past cue. Differences in the response times between the congruent and the incongruent condition were larger for the gaze cue, but no differences were found in the number of saccade errors and the size of the saccade trajectory deviations between the eye-gaze and head-tilt cues. Whereas the response times suggest that eye-gaze leads to stronger cueing, the other two measures cast doubts on the special nature of social dynamic cues. Our findings for the centrally presented cues are mostly in line with observations for static cues, where, for a wide range of other directional cues, such as arrows (
Tipples, 2002), direction words (
Hommel et al., 2001), and an extended tongue (
Downing et al., 2004), non-predictive non-social cues yielded cueing effects of similar strength as social (eye gaze) cues.
The lack of a clear difference in the magnitude of the cueing effects for the two centrally presented cues is quite remarkable, given the many dimensions on which the two cues differ. Besides the difference in social relevance, the cues also differ in their duration of motion (60ms versus 120ms), the distance across which the movement occurs, the size of the moving object (shift of the pupils versus a movement of the head) and the relative movement of the stimulus with respect to the point of fixation. This suggests that these factors do not strongly influence the cueing effect. The fact that the ‘walk’ cue differed from the two centrally presented cues not only quantitatively, but also qualitatively, in its cueing effects suggests the contribution of other aspects of the cue to the cueing effect. Possible aspects that could have influenced the cueing patterns for the ‘walk’ cue relative to the other two cues are the sudden onset in the ‘walk’ cue, the shorter interval between cue offset and target onset and the difference between the neutral and cued conditions (empty screen versus a fast moving stimulus). The factor most likely to contribute to the differential cueing effects for the ‘walk’ cue, and in particular the reverse direction of the effects on saccade direction errors and saccade trajectory deviations, is the sudden onset provided by the actor appearing on one side of a previously empty display. The shorter delay from cue offset to target onset might be less of a factor, because the ‘tilt’ and the ‘eyes’ cues also differed in this respect, but showed only small differences in the magnitude of the observed cueing effect. However, it cannot be excluded on the basis of the present data that decreasing the cue offset to target onset interval even further (as in the ‘walk’ condition) has a similar effect.
The blank screen in the control condition of the ‘walk’ cue differed from the static image of a person shown in the control conditions of the other cues. However, the fixation cue, which remained present throughout the trial, should have maintained the participants’ gaze and attention in these conditions. The onset of the person in the ‘walk’ condition may have caused the differential cueing effects for this condition, and it remains for future research to establish whether this was the critical factor. This could be examined by comparing the effects of each cue when presented upside-down. This control is often used in research on biological motion (e.g.,
Troje & Westhoff, 2006) and has also been applied to studies of social attention (
Swettenham et al., 2003), where it has been suggested that inversion of the stimulus takes away its biological significance.
We also examined the different measures of the cueing effect: response times, saccade direction errors and saccade trajectory deviations. A similar pattern of results was observed for the two centrally presented movement cues, but a clear dissociation occurred with the ‘walk’ condition. The response time effects with this cue were similar to those for the other two dynamic cues, with faster saccadic response times in the congruent than in the incongruent condition. Saccade error rates and trajectory deviations, in contrast, showed the opposite pattern from that found for the central cues: more errors occurred in the direction away from the direction of perceived motion, and trajectory deviations were towards the direction of motion. The ‘walk’ cues produced the typical pattern of results on response times (
Driver et al., 1999;
Friesen et al., 2004;
Tipples, 2008;
Rutherford & Krysko, 2008), with faster responses to targets that appeared in the direction of the cue. However, for the saccade direction errors and saccade trajectory deviations, the ‘walk’ condition produced the opposite pattern. Earlier studies found incorrect saccades that were more often in the direction of the cue (
Kuhn & Benson, 2007;
Kuhn & Kingstone, 2009; see also our ‘eyes’ and ‘tilt’ conditions), while the ‘walk’ condition resulted in incorrect saccades away from the direction of motion. Saccade trajectories typically deviate away from distractors or the direction of attention (e.g.,
Doyle & Walker, 2001;
Hermens & Walker, 2010;
Nummenmaa & Hietanen, 2006, and our ‘eyes’ and ‘tilt’ conditions), but the ‘walk’ cue showed deviations towards the direction of motion.
The pattern of results for the ‘walk’ condition could have a number of potential explanations. Cueing effects on response times are commonly interpreted as reflecting shifts of covert attention (e.g.,
Posner, 1980;
Posner & Cohen, 1984). The interpretation of the effects of a visual cue on saccade direction errors and saccade trajectories is less well established. Saccade direction errors presumably reflect the involvement of oculomotor preparation (preparing to make a response in the cued direction), but they could also be an indication of strong exogenous shifts of attention, as in the oculomotor capture effect (e.g., Theeuwes, Kramer, Hahn, & Irwin, 1998). In these studies, sometimes both the preparation and execution of the saccade are found (when an eye movement is made to the distractor stimulus), but sometimes there are indications of just the preparation (when an eye movement starts in the direction of the distractor, but turns around to go to the target). Saccade trajectories have been shown to be influenced by endogenous shifts of attention (
Sheliga et al., 1994,
1995;
Van der Stigchel et al., 2007), but are also modulated by the onset of distractor stimuli (
Doyle & Walker, 2001;
McSorley et al., 2006), suggestive of an influence of exogenously shifted attention. Oculomotor areas, such as the superior colliculus and frontal eye field, have been shown to be involved in saccade trajectory deviation effects (e.g.,
McPeek & Keller, 2002; McPeek, Han, & Keller, 2003), therefore indicating the planning of an oculomotor response that must be inhibited (
Aizawa & Wurtz, 1998). Previous experiments have shown a strong link between covert attention and oculomotor preparation, which led to the formulation of a ‘premotor’ theory of attention (Rizzolatti, Riggio, Dascola, & Umiltà, 1987). In this theory, attention shifts are assumed to be achieved by activating, but not executing, a motor program for an oculomotor (or manual) response. In its original formulation, both goal-driven (endogenous) and reflexive (exogenous) attention were assumed to be linked to response preparation, but recent findings have suggested that the link between attention and response preparation might be restricted to exogenous attention alone (
Smith & Schenk, 2012). Taking these considerations into account, we suggest the following interpretation of the apparent dissociation of the effects of the ‘walk’ cue on response times and saccade direction errors and saccade trajectory deviations. Response times indicate the strength and direction of endogenous attention shifts, similar to the deployment of attention in biological motion, where upright dynamic, but not static or inverted human or animal motion results in faster response times to stimuli in the direction of motion (Shi, Weng, He, & Jiang, 2010). The other two measures (saccade errors, saccade trajectory deviations) measure either both oculomotor response preparation and shifts of exogenous attention, or exogenous shifts of attention alone. In terms of premotor theory (
Rizzolatti et al., 1987) or the adapted version of premotor theory (
Smith & Schenk, 2012), a combination of an exogenous attention shift and oculomotor preparation is the more likely of the two possibilities, as the two often coincide. One possible reason why the ‘walk’ cue produced the observed dissociation in the direction of the cueing effects on the three measures is the duration of the cue. Endogenous cueing effects are often assumed to arise slowly, and could therefore have emerged later, while exogenous cueing effects tend to be rapid, but short-lived, and could therefore have occurred earlier during the presentation of the ‘walk’ stimulus.
For the two central cues (‘eyes’, ‘tilt’), the longest response times were found for the neutral (no-movement) cues, in which no motion occurred. The faster response times in the other two conditions (congruent and incongruent cues) could reflect the effects of the motion onset acting as a warning signal to the onset of the target, which was presented after a fixed delay after motion onset. Interestingly, this pattern of results does not seem to hold for the ‘walk’ condition, which could imply that only movements at the center of fixation are effective as a warning signal. Moreover, response times for the ‘walk’ cue were generally longer than for the other cues, and therefore, whatever caused these longer latencies (possibly some confusion about where to look) might have reduced the relative size of the warning effect for the neutral cue in this condition.
The differences between the average trajectory deviations for the three conditions in our study do not seem to relate to differences in vertical saccade latency. Earlier studies have demonstrated that trajectories tend to deviate towards distractors for short latencies, while for longer latencies, saccades tend to deviate away (
McSorley et al., 2006,
2009; Mulckhuyse, Van der Stigchel, & Theeuwes, 2009;
Van der Stigchel & Theeuwes, 2007;
Van Zoest et al., 2008). For differences in average latencies to explain the differences in saccade trajectory deviations for our experimental conditions, conditions with faster average latencies on the vertical saccades (which were used to measure saccade trajectory deviations) should show deviations towards and conditions with slower latencies should show average deviations away. This is not what was found. Fastest average vertical saccade response times were found for the ‘tilt’ condition, intermediate average saccade latencies for the ‘eyes’ and longest average latencies for the ‘walk’ condition. Deviations towards, however, were found for the ‘walk’ condition, which had the longest latencies, and deviations away were found for the two conditions with the faster average latencies. This suggests that some other cause than differences in latencies caused average deviations in the three conditions to differ.
Response latencies for the vertical saccades were generally quite short in comparison to the other conditions. One possible reason is that in this condition, the target did not appear inside a place-holder, and might therefore have acted more as a new onset than in the other two conditions (with a target on the horizontal axis), where the place-holder might have made target onset less salient. The reason for using horizontal place-holders was to create ‘objects’ to which the cue could be directed. One possible disadvantage, however, could be that the time-courses of the effects on response times and saccade direction errors on the one hand, and on saccade trajectory deviations on the other, might differ slightly. Future studies that apply place-holders at each of the target locations, or no place-holders at all, should reveal whether such differences in time-courses do indeed occur.
Conclusion
In conclusion, centrally presented dynamic cues result in very similar effects in the observer, independent of whether the cue is socially relevant. Different response measures (response times, error rates and trajectory deviations), however, can sometimes be dissociated, as shown by a third cue (a person walking past), suggesting that different eye movement measures can reveal different aspects of the influence of a cue on the observer. Our results show that it can be beneficial to study more ecologically valid, but often less controlled stimuli. Whereas complex experimental designs are often employed to reveal a dissociation between covert attention and response preparation (for an overview, see
Smith & Schenk, 2012), our ‘walk’ stimulus shows that dissociations can occur naturally when more ecologically valid stimuli are used.