1. Introduction
The Ravenous Bugblatter Beast of Traal in Douglas Adams’s The Hitchhiker’s Guide to the Galaxy [1] is a beast so stupid that it thinks that if a person cannot see it, then it cannot see that person. Therefore, we can cover our eyes with a towel to avoid being attacked. Here, Douglas Adams is playing with the idea that people are aware that they might be biased to favour their own view, and he surprises the reader by offering a reverse scenario. However, it is possible that people are a little bit like the Ravenous Bugblatter Beast of Traal in that they cannot help being affected by what other people see. This is the key tenet of a social perspective-taking view of attention.
Attention can be oriented by different cues, such as arrows or eyes. These cues can produce an automatic (or reflexive) rather than a voluntary orientation of attention [4]. While voluntary shifts depend upon the observer’s expectations and intentions, reflexive shifts of attention (RAS) are associated with sensory stimulation and generated by unforeseen changes in the visual field, particularly by the abrupt onset of stimuli, which elicit reorienting and saccadic eye movements.
In particular, taking into consideration the notion of "Theory of Mind" [5], it has been suggested that attention can be reflexively shifted toward where another person is looking (Figure 1), which causes errors or slower responses when reporting what we see, if this is different from what the other person sees.
To investigate the Reflexive Attentional Shift (RAS) phenomenon, Samson et al. [7] devised a semi-experimental paradigm. The general setup consists of a 3D virtual room presented on a computer screen with the back, left, and right walls visible. A human avatar placed in the center of the room is used as a cue to direct attention toward either the left or right wall of the room. During the experiment, discs appear on either the left, right, or both walls. The participants’ task is to indicate (in each trial) how many discs they or the human avatar can see. As the participants see both the left and right walls, they can see all of the discs. However, since the human avatar faces left or right, it can only see the discs placed on one side. Therefore, there are consistent and inconsistent trials. In consistent trials, the number of discs visible to the participant and to the avatar is the same. In inconsistent trials, the participant can see some discs that the human avatar cannot (Figure 2). Participants respond as quickly as possible, with responses >2000 ms counted as errors. Reaction Times (RTs) and errors are the dependent variables.
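The trial logic described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the `Trial` fields and function names are hypothetical, and the only values taken from the text are the left/right avatar orientation, the self/other judgement, and the 2000 ms cut-off.

```python
from dataclasses import dataclass

TIMEOUT_MS = 2000  # responses slower than this count as errors (from the text)


@dataclass
class Trial:
    # All field names are hypothetical, chosen for illustration only.
    discs_left: int    # discs shown on the left wall
    discs_right: int   # discs shown on the right wall
    avatar_faces: str  # "left" or "right"
    perspective: str   # "self" (participant's view) or "other" (avatar's view)
    response: int      # number of discs reported by the participant
    rt_ms: float       # reaction time in milliseconds


def self_count(trial: Trial) -> int:
    """The participant sees both side walls, hence all discs."""
    return trial.discs_left + trial.discs_right


def avatar_count(trial: Trial) -> int:
    """The avatar only sees the wall it is facing."""
    return trial.discs_left if trial.avatar_faces == "left" else trial.discs_right


def is_consistent(trial: Trial) -> bool:
    """Consistent trials: participant and avatar see the same number of discs."""
    return self_count(trial) == avatar_count(trial)


def is_error(trial: Trial) -> bool:
    """Wrong count for the judged perspective, or a response past the cut-off."""
    correct = self_count(trial) if trial.perspective == "self" else avatar_count(trial)
    return trial.rt_ms > TIMEOUT_MS or trial.response != correct


# Hypothetical example: one disc on each wall, avatar facing left.
example = Trial(1, 1, "left", "self", response=2, rt_ms=650.0)
print(is_consistent(example), is_error(example))
```

Under these assumptions, egocentric and alter-centric interference would both be estimated by comparing mean RTs and error rates between inconsistent and consistent trials, separately for "other" and "self" judgements.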
The authors confirmed the existence of egocentric bias: reporting what the avatar can see is affected by what the participant can see. However, they also found interference (i.e., longer reaction times and more errors) in inconsistent trials even when participants had to report how many discs they themselves could see. This interference is defined as “alter-centric intrusion.” Various egocentric biases are known and have been extensively studied, such as in Piaget’s three mountain task [8]. An alter-centric bias is, by contrast, a less well-documented phenomenon. Both can, in theory, co-exist.
To explain the alter-centric intrusion effect, Samson et al. [7] advanced the “Perspective-Taking” theory, asserting that people spontaneously incorporate the viewpoint of others. Since then, several studies have reported supporting evidence for this theory.
Nuku and Bekkering, Teufel et al., and Furlanetto et al. [13] found that alter-centric intrusion was present when participants were asked to judge their own perspective while the human avatar was believed to be able to see. On the other hand, intrusion was not present when the human avatar was believed to be unable to see (e.g., its line of sight obscured).
These findings, supporting the Perspective-Taking theory, run counter to the “Perceptual” theory, which argues that perceptual features (i.e., the direction of another’s face/nose/posture) are sufficient to explain attentional orientation [16]. These studies found that cues spontaneously orient the attention of observers even when the human avatar cannot see the stimuli because a physical barrier blocks its view, as in Cole et al. [14], or when the cue employed in the dot perspective task does not have a mental state (e.g., an arrow or a camera), as in Wilson et al. [16]. According to this theory, the effects shown in the dot perspective task are due to domain-general processes rather than perspective taking.
In an effort to bridge this gap and clarify the mechanism behind social perception, Michael and D'Ausilio [17] suggested that the dot perspective task itself may engage both Theory of Mind and domain-general processes. Social and non-social cues would, therefore, engage the same attentional process eliciting RAS, despite being represented by two different functional systems.
Further research has focused on whether perspective taking may not be a spontaneous phenomenon. If this is the case, then RAS would be modulated only by the perceptual characteristics of the cue, while perspective taking would be due to top-down processes. These authors noted that the human avatar employed by Samson et al. [7] was unable to generate an attentional shift when the target discs were presented within 300 ms of the presentation of the cue. The authors, therefore, concluded that an attentional shift may be induced by taking a perspective, but this cannot be defined as “reflexive” since it requires some time to occur.
Inspired by the previously mentioned literature, in which similar results may be interpreted in agreement with either the perceptual characteristics of the cue (perceptual position) or its social characteristics (social position), the journal Vision recently hosted a special issue titled “Reflexive Shifts in Visual Attention.” This special issue (www.mdpi.com/journal/vision/special_issues/RAS) provided a place where some of those studies, supporting old and new theories behind RAS, are collated. This review is, therefore, intended as an overview of contemporary research on the RAS phenomenon, which summarizes each of the contributions to this special issue on RAS and briefly outlines directions for future research.
2. Visual Attention and Reflexive Attentional Shift
We are constantly surrounded by a world containing more information and objects than our cognitive system can process. Attention allows us to select certain stimuli and ignore others. The complexities of attention are shown by neuroimaging data illustrating how attention is carried out by a network of anatomical areas and is, therefore, neither the property of a specific brain region nor a function of the brain as a whole [2]. In particular, it has been shown that three networks exist, each related to a different aspect of attention: alerting, orienting, and executive control [22]. As pointed out by Carrasco [22], attention seems to be influenced and facilitated by previous knowledge and assumptions about the surrounding world. This places attentional processes halfway between perception and cognition. Our attention can, therefore, be influenced by different factors, which can be grouped into two main categories: bottom-up (or exogenous) factors, in which attention is usually deployed reflexively due to the characteristics of the scene and the salience of the stimuli, and top-down (or endogenous) factors, in which attention is often deployed voluntarily in accordance with specific tasks or goals, which strongly influence where participants allocate their attention.
In her contribution to this special issue, Zhaoping [23] provided further insights into the mechanisms behind visual attention orientation. Previous results showed that a target stimulus is localised more quickly if it is presented to only one eye [24]. The author investigated whether the ocularity contrast of a visual input, a feature that is often hardly visible, captures attention exogenously, where the ocularity of a visual input refers to the difference in visual input between the two eyes. Results from the study showed that, regardless of its task relevance, a visual location with a strong ocularity contrast attracts attention. These findings are in line with previous literature, which supports the idea that the primary visual cortex creates a bottom-up saliency map to guide attention exogenously. According to these studies, target characteristics, such as changes in luminance, motion, or color, are combined in a spatial map, which highlights the most salient aspect, as a consequence of which attention is reflexively shifted.
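The idea of a bottom-up saliency map can be illustrated with a toy example. The sketch below is not taken from the study: it assumes that each feature (e.g., luminance, motion, ocularity) yields a grid of contrast values, and it assumes that the grids are combined by taking the largest value at each location; summing the grids would be another possible combination rule. All names and numbers are hypothetical.

```python
def saliency_map(feature_maps):
    """Combine per-feature contrast grids into one map.

    Assumption: the combined saliency at a location is the maximum
    contrast across features at that location.
    """
    n_rows = len(feature_maps[0])
    n_cols = len(feature_maps[0][0])
    return [
        [max(fmap[r][c] for fmap in feature_maps) for c in range(n_cols)]
        for r in range(n_rows)
    ]


def most_salient_location(saliency):
    """Return the (row, column) with the highest saliency value."""
    locations = [
        (r, c) for r in range(len(saliency)) for c in range(len(saliency[0]))
    ]
    return max(locations, key=lambda rc: saliency[rc[0]][rc[1]])


# Hypothetical 2 x 2 contrast grids, one per feature.
luminance = [[0.1, 0.2], [0.1, 0.1]]
motion = [[0.1, 0.1], [0.1, 0.3]]
ocularity = [[0.1, 0.1], [0.9, 0.1]]  # strong ocularity contrast at (1, 0)

combined = saliency_map([luminance, motion, ocularity])
print(most_salient_location(combined))
```

In this toy case the location with the strong ocularity contrast wins the competition, regardless of which feature a task might define as relevant.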
Extending Zhaoping's study, Burnett et al. [27] examined specific characteristics of exogenous cues that are either more or less likely to draw attention. They used a dual-task paradigm to test whether luminance or an equiluminant colour change modulated motion and colour discrimination effects. Their results showed that the motion and colour tasks were affected differently by the two cues. Motion validity was more strongly affected by luminance than colour cues, whereas the colour validity showed no difference in effect between luminance and colour cues. These results have implications for our understanding of how low-level properties of cues could influence visual attention, with the authors suggesting that “cues which engage the same visual channel as the target are more effective in enhancing target processing at the cued location.” Moreover, if further work supports this view that exogenous cueing is not a unitary process, then this will need to be considered when studies apply cueing tasks.
4. Additional Factors Involved in RAS
Further contributions to this special issue, rather than focusing on the social-perceptual debate, focused on the different variables that may influence RAS and the attentional cueing paradigms, such as temporal information, changes in tonic alertness, and inter-individual differences.
4.1. The Influence of Temporal and Auditory Information
Among those contributions, Laidlaw and Kingston [37] investigated how ignoring temporal information eliminates reflexive spatial orienting. In particular, the authors investigated whether the interaction between temporal and spatial attention modulates the Reflexive Attentional Shift. Temporal attention refers to the process of allocating brain resources to the predicted onset of an incoming event [38]. To investigate this interaction, the authors explored the fore period effect [39]. This is the effect by which the cueing of a target generates an inverse relationship between subjects’ reaction times and the time between the cue and target appearance: a longer interval between the stimuli results in a shorter reaction time.
The authors systematically manipulated the spatial characteristics of the cue (arrows, to elicit reflexive attention, versus letters, to elicit volitional attention), the stimulus onset asynchrony (SOA: 100, 500, and 1000 ms), and the congruency of the cue (congruent versus incongruent).
The results showed the emergence of a fore period effect and of a spatial cueing effect with both arrows and letters, but only at longer SOAs and only in congruent conditions. On the other hand, with shorter SOAs and in incongruent conditions, the fore period effect did not occur while the spatial cuing effect occurred only with letters.
The authors, therefore, concluded that only reflexive spatial attention orienting is modulated by the implicit changes in temporal attention, while volitional spatial attention is not. Thus, the way in which spatial and temporal attention interact must be taken into serious consideration during visual attentional studies.
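As a rough illustration of how a fore period effect can be quantified, the sketch below averages reaction times per SOA and takes the difference between the shortest and the longest SOA. This is an assumed, simplified summary measure rather than the analysis used in the study; the function names and the reaction times are hypothetical, and only the SOA values come from the text.

```python
from collections import defaultdict
from statistics import mean


def mean_rt_by_soa(trials):
    """Average reaction time (ms) per SOA.

    `trials` is an iterable of (soa_ms, rt_ms) pairs.
    """
    grouped = defaultdict(list)
    for soa_ms, rt_ms in trials:
        grouped[soa_ms].append(rt_ms)
    return {soa: mean(rts) for soa, rts in grouped.items()}


def fore_period_effect(trials):
    """Mean RT at the shortest SOA minus mean RT at the longest SOA.

    A positive value means faster responses after longer cue-target
    intervals, i.e., the inverse relationship described in the text.
    """
    means = mean_rt_by_soa(trials)
    return means[min(means)] - means[max(means)]


# Hypothetical reaction times for the three SOAs mentioned in the text.
trials = [
    (100, 420), (100, 440),
    (500, 400), (500, 410),
    (1000, 370), (1000, 390),
]
print(mean_rt_by_soa(trials))
print(fore_period_effect(trials))
```

Computing this measure separately for each cue type and congruency condition would show where the effect is present and where it is absent.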
Extending Laidlaw and Kingston's research, Hayward and Ristic [40] investigated two different processes that may be present in any study involving spatial cueing: tonic alertness and voluntary temporal preparation. In this study, the authors tested whether changes in tonic alertness and voluntary temporal preparation affect attentional orienting. They confirmed that task-relevant social gaze cues and non-social arrow cues affected spatial attention, with no differences between the two cues (Figure 9).
They found that the magnitude of the generated attentional shift may be modulated by high tonic alertness, while no differences were found with voluntary temporal preparation. Although, overall, these results seem to contrast with those of Laidlaw and Kingston [37], both studies seem to converge on the idea that the cue generates an attentional shift that appears to remain robust across different cueing task settings. However, the task parameters seem to play an important role in modulating the magnitude of the attentional orienting effect elicited by the different types of cues.
On a similar line of enquiry, Klein [41] focused on the control of visual attention by auditory stimuli. In a series of cross-modal experiments using the cueing paradigm, the author presented subjects with an auditory cue indicating the position of a target, manipulating the informative value of the cue (congruent versus incongruent with the location of the target) and its stimulus onset asynchrony (SOA).
The results showed that the informative value of the auditory cue affected target localisation. This suggests that localisable auditory stimuli exogenously (rapidly and automatically) capture visual attention. In addition, it was found that subjects were faster to identify the cued target at short SOAs, while they were slower when the SOA was between 500 and 1000 ms. The author, therefore, concluded that, for SOAs within this temporal window, the exogenous shift of attention is overcome by the endogenous one.
Furthermore, in another series of experiments, the author manipulated the auditory cue by changing its pitch rather than its location (Figure 10). The cue was centrally presented, but its glide frequency was manipulated to indicate the target position. The glide frequency could be informative (a rising tone indicating the top location and vice versa) or uninformative. Subjects were faster in the informative conditions, which shows that changes in the glide of the auditory cue shift attention only when it is meaningful.
4.2. Inter-Individual and Laboratory-Real World Differences
Inter-individual differences and the differences between laboratory settings and the real world are usually overlooked when attentional orienting and/or perspective taking are investigated. In their contribution to this special issue, Bukowski and Samson [42] explained some of the individual differences in terms of the ability to handle conflict between two perspectives and the variability in the strength of the egocentric perspective. The study used a visual perspective-taking task and a large sample. Results showed that individuals varied in their difficulty in considering another person’s differing perspective. A cluster analysis suggested four underlying profiles, which can be placed within a two-dimensional space. The two axes are the ability to handle conflict and the relative attentional focus on the self rather than the other person’s perspective.
In line with Bukowski and Samson's findings, another contribution to this special issue highlights the importance of inter-individual differences. Prpic [43] investigated how perceiving musical note values causes a spatial shift of attention in musicians (Figure 11). The author contributed to the discussion on RAS by taking into consideration the Spatial-Numerical Association of Response Codes (SNARC) effect. This is the phenomenon by which perceiving numbers can affect the allocation of spatial attention, which causes a leftward target detection advantage after perceiving small numbers and a rightward advantage for large numbers. The aim of the study was to test whether the effect can be reproduced in musicians when reading musical notes instead of numbers. The visual representation of the duration of musical notes shares with numbers a symbolic representation that goes from left to right. Specifically, images depicting whole and half notes represent a relatively long duration, while eighth and sixteenth notes represent a short duration.
The author found an advantage in detecting a leftward (vs. rightward) target after perceiving small (vs. large) musical note values, which suggests that musicians process numbers and note values in a similar manner. Future studies on RAS might build on these findings to test whether the SNARC effect is affected by the presence of an “Other” on either side of the stimuli presentation.
Lastly, Blair et al. [44] presented a way of assessing individual instances of covert attentional orienting in response to gaze and arrow cues. The authors investigated whether gaze-following behavior occurs in laboratory tasks as frequently as in natural settings. In the first experiment, the presence of costs or benefits in cued trials was calculated, i.e., the proportion of RT responses falling more than 1 SD outside of the performance of neutral control trials. The authors then replicated the study in a second experiment with a different directional cue, which served as the control comparison. The results of both experiments suggest that attentional orienting in gaze-cueing tasks is infrequent and occurs in fewer than 50% of trials. However, even though benefits and costs occurred in fewer than 50% of trials, which is consistent with the literature, results indicated that more benefits relative to costs occurred in valid trials (stimulus appears at the targeted location) (Figure 12). More costs relative to benefits occurred in invalid trials (stimulus appears at a non-targeted location). Furthermore, the results showed no differences between gaze cues and arrow cues.
These results have important implications for the use of cueing tasks in the lab and for the theoretical explanations that come from their use. The analysis method employed presents a useful starting point for examining the frequency of attentional orientation in future gaze-cueing studies within and across real-world and laboratory investigations.
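The trial-level classification of costs and benefits described for Blair et al. can be sketched as follows. This is a hedged illustration, not the authors' analysis code: it assumes that a benefit is a cued-trial RT more than 1 SD faster than the mean of the neutral trials, that a cost is one more than 1 SD slower, and that the sample standard deviation is used. The function name and all reaction times are hypothetical.

```python
from statistics import mean, stdev


def cost_benefit_proportions(neutral_rts, cued_rts, n_sd=1.0):
    """Proportion of cued trials showing a benefit, a cost, or neither.

    Assumptions: the reference band is the neutral-trial mean plus or
    minus `n_sd` sample standard deviations; RTs below the band count
    as benefits and RTs above it as costs.
    """
    centre = mean(neutral_rts)
    spread = stdev(neutral_rts)
    lower = centre - n_sd * spread
    upper = centre + n_sd * spread

    n_benefit = sum(rt < lower for rt in cued_rts)
    n_cost = sum(rt > upper for rt in cued_rts)
    n_total = len(cued_rts)
    return {
        "benefit": n_benefit / n_total,
        "cost": n_cost / n_total,
        "neither": (n_total - n_benefit - n_cost) / n_total,
    }


# Hypothetical reaction times in milliseconds.
neutral = [400, 420, 440, 460, 480]
valid_cued = [390, 400, 430, 450, 500, 445, 455, 460, 410, 470]
print(cost_benefit_proportions(neutral, valid_cued))
```

Running the same function on valid and invalid trials separately, and on gaze and arrow cues separately, would give the kind of frequency comparison summarised above.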
5. Conclusions and Future Directions
On one hand, observers are good at knowing where another person is looking [45]. On the other hand, there are also limits and mistakes in reasoning about the role of a viewpoint in a scene [46]. This paper focused on how attention is affected by the presence of another individual in the scene. Researchers have shown that this other individual may act as a cue directing our attention. However, there is no agreement on how this process works.
Some research studies show that the other individual has the same role as any other directional cue that can bias attention, such as an arrow. Other research studies, however, show that observers are specifically sensitive to the social characteristics of the other individual and, therefore, are affected by the content of another person’s viewpoint. In this review, we consider the contributions that appeared in the special issue on Reflexive Attentional Shift (RAS) published in Vision. Establishing whether RAS is a perceptual or a social process is important because RAS is used as a measure of visual perspectives and mental state attribution in both developmental and clinical contexts. For example, visual perspectives may be used to evaluate children’s development with regard to Autism Spectrum Disorder (ASD) [48]. The contributions of this special issue allow the reader to reach deeper insights into the RAS phenomenon, by not only focusing on the importance of understanding the nature of the process behind it, but also providing further theories and knowledge about different variables that may influence or elicit RAS.
Taking all evidence into account, the contributions confirm that human attention is biased by the presence of a directional cue in the scene. By analyzing the different experiments, it appears that the social relevance of the cue may be necessary in some contexts but not in others.
Specifically, the papers in this special issue helped outline a number of avenues for future research to clarify and resolve this debate. For example, participants' beliefs about the other's perspective may play an important role in the interpretation of the RAS phenomenon, and future research will need to take this into consideration. Langton, Wiese et al., and Gardner et al. [21] pointed out that participants must believe that the directional cue represents an intentional agent in order to take its point of view. In this case, however, the shift of attention is not “reflexive” but is a voluntary, top-down process.
In addition, the high level of individual variation needs to be accounted for in future work. For example, Prpic showed that perceiving musical note values causes a spatial shift of attention in expert musicians but not in non-experts. Similarly, Bukowski and Samson [42] found individual differences in the ability to handle conflicting perspectives.
Furthermore, the research in this issue distinguishes among attentional orienting, level 1 perspective-taking, and level 2 perspective-taking [34]. It may be the case that social factors have differential effects on each of these processes. Therefore, it will be important going forward for researchers to be specific about which type of perspective-taking is under examination. Lastly, evidence from the current issue suggests that certain effects might depend on the cognitive demand of the experimental task [49], which indicates that social factors are involved when the task is cognitively demanding, while they may not be necessary in other cases.
Additional contributions presented in this special issue move away from the social-perceptual debate by trying to provide further insight into the nature of the cues and other variables that may influence RAS and attentional cueing paradigms. Among those, further confirmation has been provided that the cognitive demand of a task plays an important role in attentional orienting. Specifically, Albonico et al. [30] provided evidence that the deployment of focal attention depends on the interaction between the task demand and the type of directional cue.
In conclusion, the contributions to the special issue greatly improved our understanding of the RAS phenomenon and opened up new avenues of investigation. These may allow for a deeper, more sophisticated interpretation of RAS that goes beyond the perceptual versus social interpretations.