Introduction
In this article I briefly review several experiments that my colleagues and I have conducted to investigate a number of aspects of binocular coordination (for a more comprehensive review of research investigating binocular coordination see Kirkby, Webster, Blythe & Liversedge, 2008).
Towards the centre of the human retina there is a small area called the fovea that is responsible for providing very high acuity visual information to the visual system. While visual information is available from areas other than the fovea, it is less rich in detail since acuity falls off very rapidly from the centre of the fovea to the retinal periphery (see Balota & Rayner, 1991). Thus, in order that the human brain might receive high quality visual information, the eyeball must be oriented such that light from the specific point in space that a person wishes to view clearly falls precisely onto the foveal region. This requirement can be met because primates have frontally positioned eyes that can be rotated in three dimensions, the most important of which are the horizontal and vertical dimensions (I will not discuss torsional eye movements in this paper because, during upright reading and scene viewing, they are not ordinarily instrumental in bringing the eyes to fixate objects in space).
During reading and other free scanning tasks where a static scene is under scrutiny, humans move their eyes in a stereotypical manner making saccades - rapid ballistic rotations of the eyes (usually in the order of 20-40 ms), and fixations, which are brief periods when the eyes are comparatively still (usually between 180-350 ms during normal reading; see Liversedge & Findlay, 2000; Rayner, 1998). During fixations visual information is extracted and processed; saccadic eye movements are made in order that the viewer may fixate different portions of the visual environment. Thus, saccadic eye movements are the primary behavioural means by which humans sample their visual environment. A further, basic, but very important characteristic of the human visual system is that it is binocular. The visual input that is delivered to the brain for processing ordinarily arrives via two eyes, not one. As primate eyes are frontally placed, the system responsible for oculomotor control must coordinate movements of both eyes such that there is corresponding visual input from each retina (at least to some degree).
In this article I will consider the question of whether perfectly corresponding patterns of retinal stimulation are required for non-diplopic vision. This, in turn, will lead me to discuss aspects of psychological processing that are required in order that a single unified percept of our visual environment is experienced. I will also discuss several questions that the current work raises for future investigation.
A long held and pervasive assumption within the field of eye movements and reading is that each eye fixates the same letter of a word. This assumption is reflected in many undergraduate textbook depictions of the arrangement of the eyes during binocular human vision. Such diagrammatic illustrations usually specify a trigonometric arrangement, suggesting that the two eyes’ lines of sight are perfectly aligned such that where they cross is the specific point under fixation. Thus, such depictions give the strong impression that the same letter within a word would be fixated by each eye during reading. This (often implicit) assumption has been prevalent in the majority of published papers investigating normal reading. In the vast majority of these studies, only the movements of one of the two eyes have been measured. Most researchers considered it unnecessary to record the movements of both eyes since it was assumed that the data for one eye would duplicate the data for the other eye. In addition, some eye trackers only provide data from one of the two eyes, and procedures for binocular recordings are more complicated than those for monocular recordings.
It is important to note, however, that quite a number of studies from the 1980s onwards did investigate binocular coordination. Many of these focused on disconjugacy that occurs during saccades between pairs of simple visual stimuli in the same or different depth planes (see e.g., Bains, Crawford, Cadera & Vilis 1992; Collewijn, Erkelens & Steinman 1988; Erkelens & Sloot, 1995; Zee, Fitzgibbon & Optican 1992). A second area that also received considerable attention in this period is binocular coordination, or fixation stability, in dyslexic readers (e.g., Stein, Riddell, & Fowler, 1988; see also Cornelissen, Munro, Fowler, & Stein, 1993). More recently, Kapoula and her colleagues have continued this interesting line of research (Kapoula, Bucci, Ganem, Poncet, Daunys, & Bremond-Gignac 2008; Kapoula, Bucci, Jurion, Ayoun, Afkhami & Bremond-Gignac 2007; see also Kapoula, Vernet, Yang & Bucci, 2008 in the present Special Issue). For brevity’s sake, I will not discuss these studies in detail in this article (though for a full discussion see Kirkby, Webster, Blythe & Liversedge, 2008).
Importantly, however, until recently there has been very little work to investigate binocular coordination during saccadic eye movements in normal reading (i.e., non-dyslexic readers), and even less to assess the prevalence of binocular disparity during fixations rather than saccades. The prevalence of disparity during fixations is of particular importance. Since it is during fixations that the visual characteristics of the fixated word are extracted and processed, one might reasonably anticipate that the eyes would be aligned during this period. During the last decade, however, there has been a burst of research activity in this area (see Kirkby, Webster, Blythe & Liversedge, 2008). Interestingly, these studies have now demonstrated that disparity between the two points of fixation often occurs during a fixation and the assumption that each eye fixates the same letter of a word during reading is not correct on a substantial proportion of fixations.
The earliest study to investigate binocular coordination during reading was carried out by Hendriks (1996). In her studies participants were required either to read normally or to sub-vocalise linguistic stimuli that took the form of either prose passages or lists of unrelated words. Hendriks measured vergence velocity during fixation and found effects of task (greater vergence velocities during reading than during sub-vocalising) and text type (greater vergence velocities during prose reading than word reading). More importantly, however, there emerged a relationship between saccade extent and vergence velocity, with longer saccades producing increased vergence velocity. This result was important because saccade amplitude is influenced both by the task and by the nature of the linguistic stimulus being processed; as such, the task and text type effects obtained by Hendriks could be explained simply in terms of saccade amplitude.
Another study by Heller and Radach (1999) directly measured fixation disparity during reading in three experiments. In the first, they compared fixation disparity in a simple scanning task with that which occurred during reading. Similar magnitudes of disparity were obtained in both tasks. They also investigated whether disparity accumulated over fixations during reading by requiring participants to read passages of text. While there was some change from the first line to later lines, they found little evidence overall to suggest that disparity accumulated across fixations, obtaining an average fixation disparity of 1.5 characters (though the direction of disparity was unspecified). In their second experiment they examined binocular coordination during monocular and binocular viewing and observed similar behaviour under the two viewing conditions. In their final experiment they investigated whether making the text visually unfamiliar (through the use of a mIxEd CaSe manipulation) affected disparity, and found that there was a reduction for mixed case text compared to text presented normally. Heller and Radach concluded that there was a greater tolerance for disparity when text was easy to visually process than when it was more difficult.
More recently we, among others (e.g., see Kliegl, Nuthmann & Engbert, 2006), have followed up the experimental work carried out by Hendriks (1996) and Heller and Radach (1999) in a series of experiments. In our experiments we were most keen to quantify the magnitude of fixation disparity that occurred during reading, as well as to determine the direction of any disparity that we observed (i.e., how often the lines of sight were crossed, with the left eye fixating a point to the right of the right eye, or how often the lines of sight were uncrossed, with the left eye fixating a point to the left of the right eye). Given that the smallest constituent part of a word is a letter, we reasoned that if the eyes were fixating more than one character space apart, then they were disparate. We conducted a number of experiments (Blythe, Liversedge, Joseph, White, Findlay & Rayner 2006; Juhasz, Liversedge, White, & Rayner, 2006; Liversedge, White, Findlay, & Rayner, 2006) in which we independently replicated our initial findings twice as well as extending them in important ways. We first showed that the eyes were disparate on 47% of fixations, the disparities being crossed on 8% and uncrossed on 39% of fixations (the overall disparity data, and then the data broken down by disparity type for each individual subject, are shown in Figure 1A and Figure 1B respectively).
Also, the magnitude of the disparity was 1.9 characters when the eyes were disparate, and vergence movements occurred during a fixation. Disparity magnitudes were greater at the beginning of a fixation than at the end of a fixation; thus the vergence movements that we observed served to reduce fixation disparity. These findings were very largely in agreement with the data reported by Hendriks (1996) and Heller and Radach (1999). In our later studies we manipulated the difficulty of the text by employing the mixed case manipulation of visual processing difficulty used by Heller and Radach, as well as a linguistic manipulation of processing difficulty, namely, word frequency. In this experiment neither visual nor linguistic processing difficulty affected binocular disparity. Finally, we conducted an experiment to assess binocular coordination in children as well as adults. In this experiment we found that disparity occurred as frequently in children as in adults, but that crossed disparities were more prevalent in children than in adults. Furthermore, the magnitude of disparity was greater in children than in adults. We explained these differences in terms of differential muscular balances between adults and children that arise because children perform the majority of their visual work at closer viewing distances than adults do.
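The disparity criterion described above (the eyes count as disparate when they fixate more than one character space apart, with direction labelled crossed or uncrossed) can be expressed as a short sketch. This is an illustration only and not the analysis code used in the studies cited: the function names, the representation of each fixation as a pair of horizontal positions in character units, and the example values are all assumptions introduced here.

```python
from collections import Counter


def classify_disparity(left_x, right_x, threshold=1.0):
    """Label one fixation as 'aligned', 'uncrossed' or 'crossed'.

    left_x and right_x are the horizontal fixation positions of the left
    and right eye in character spaces (hypothetical units for this sketch).
    """
    separation = right_x - left_x
    # Eyes no more than one character space apart are treated as aligned.
    if abs(separation) <= threshold:
        return "aligned"
    # Uncrossed: the left eye fixates to the left of the right eye.
    # Crossed: the left eye fixates to the right of the right eye.
    return "uncrossed" if separation > 0 else "crossed"


def summarise_disparity(fixations, threshold=1.0):
    """Return the proportion of fixations falling in each category."""
    fixations = list(fixations)
    if not fixations:
        raise ValueError("at least one fixation is required")
    counts = Counter(
        classify_disparity(left_x, right_x, threshold)
        for left_x, right_x in fixations
    )
    total = len(fixations)
    return {
        label: counts[label] / total
        for label in ("aligned", "uncrossed", "crossed")
    }


# Made-up positions for illustration; these are not data from any study.
example = [(3.0, 4.5), (5.0, 3.5), (2.0, 2.4), (7.0, 7.0)]
print(summarise_disparity(example))
```

Applied to real binocular recordings, a summary of this kind would yield the proportions of aligned, uncrossed and crossed fixations that are reported in the text.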
On the basis of the discussion above, it should be clear that there is a growing body of evidence to indicate that disparity does occur on a substantial proportion of fixations during reading, and that the disparity is not of constant size, but changes in magnitude from fixation to fixation. This finding raised an issue that we considered to be extremely important. When we read normally, our overriding sense is that we perceive a single unified visual array: the text is clearly visible and we do not experience diplopia. How is this cyclopean representation of the visual environment achieved given the quite different patterns of retinal stimulation in each eye? Furthermore, given that disparity occurs to a greater or lesser degree on a fixation by fixation basis, the system that compensates for this must have quite a degree of flexibility. We postulated that there could be two psychological mechanisms by which this state could be attained: fusion of the two retinal inputs, whereby corresponding elements in each input are associated and somehow combined in order that a single representation be constructed, or, instead, suppression of one of the two inputs. In our next experiment we set out to discriminate between these two possibilities (Liversedge, Rayner, White, Findlay & McSorley, 2006).
To do this, we employed a dichoptic presentation methodology whereby we mounted a pair of shutter goggles on our eye tracking devices. These goggles alternately opened and closed such that when the shutter for the left eye was open, the shutter for the right eye was closed, and vice versa (see Figure 2). The alternation was very rapid, occurring every 8 ms. In synchrony with the shutter goggle alternations we manipulated what was presented on the screen. Each stimulus comprised a single sentence within which was embedded a target word that was a compound noun (e.g., cowboy). All of the words of the sentence other than the target word were presented in full to both eyes. However, for the target word we had three different presentation conditions, and the target word was presented under these conditions throughout the entirety of the trial. In the control condition the whole word cowboy was presented alternately to both eyes. In the congruous condition the letter string cowb was presented to the left eye, and the letter string wboy was presented to the right eye. The w and the b of each word part were overlaid such that the full word cowboy appeared normal. Finally, in the incongruous condition the letter string wboy was presented to the left eye and the letter string cowb was presented to the right eye. Thus, in all three conditions all the letters of the target word were presented to the reader and appeared in their appropriate order, but in the congruous and incongruous conditions, only part of the word was presented uniquely to each eye.
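The three presentation conditions can be made concrete with a small sketch that builds the letter string shown to each eye and checks that overlaying the two strings recovers the whole word. It is a hypothetical illustration and not the stimulus-generation code from the experiment: the function names, the use of space padding to hold letters in their screen positions, and the default two-letter overlap (taken from the cowb/wboy example) are assumptions made here.

```python
def dichoptic_target(word, condition, overlap=2):
    """Return (left_eye, right_eye) strings for one presentation condition.

    Partial strings are padded with spaces so that every letter keeps the
    position it occupies in the full word.
    """
    if not 0 <= overlap <= len(word):
        raise ValueError("overlap must lie between 0 and the word length")
    head_len = (len(word) + overlap) // 2
    head = word[:head_len].ljust(len(word))             # e.g. "cowb  "
    tail = word[head_len - overlap:].rjust(len(word))   # e.g. "  wboy"
    if condition == "control":
        return word, word
    if condition == "congruous":
        return head, tail   # left part of the word goes to the left eye
    if condition == "incongruous":
        return tail, head   # left part of the word goes to the right eye
    raise ValueError(f"unknown condition: {condition!r}")


def overlay(left, right):
    """Superimpose the two eyes' strings position by position."""
    combined = []
    for l_char, r_char in zip(left, right):
        if l_char != " " and r_char != " " and l_char != r_char:
            raise ValueError("the two strings disagree at an overlapping letter")
        combined.append(l_char if l_char != " " else r_char)
    return "".join(combined)


for condition in ("control", "congruous", "incongruous"):
    left, right = dichoptic_target("cowboy", condition)
    print(condition, repr(left), repr(right), overlay(left, right))
```

In every condition the overlay is the full word, which reflects the point made in the text: all the letters are available in their appropriate order, and the conditions differ only in which eye receives which part.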
We were keen to investigate how this manipulation influenced saccadic targeting. It is well documented that saccades during reading are roughly targeted towards the middle of the upcoming word (McConkie, Kerr, Reddix & Zola, 1988). Given this, we hypothesised that if readers were suppressing one of the two retinal inputs, and saccades were targeted on the basis of one or other visual input, then saccades onto the target word in the control condition would differ in length to those observed in the congruous and incongruous conditions, with saccade size depending upon which retinal input was being suppressed. In contrast, if saccadic targeting was based on a fused representation of the two (congruous or incongruous) retinal inputs, then saccadic targeting should be uninfluenced by the dichoptic presentation method and targeting should be identical in all three conditions.
The results were clear. There were reliable effects of the dichoptic manipulation on reading times (see Figure 3A). While there was a clear and consistent difference between the landing positions of the left and the right eye in the congruous, incongruous and control conditions (reflecting the basic finding that the eyes are often disparate and uncrossed by about 1 to 2 characters), these effects were not modulated by the dichoptic presentation. Regardless of whether the target word was presented congruously, incongruously or normally, the landing positions of the left and right eye on the target word were the same across conditions (see Figure 3B). The data clearly support the fusion hypothesis.
To summarise, our studies have shown that the assumption that the points of fixation of the two eyes are perfectly aligned during reading is incorrect. Instead, disparity occurs quite often during fixations, and vergence movements reduce, but do not eradicate, this disparity. Disparate fixations are more likely to involve uncrossed lines of sight than crossed lines of sight, and disparity magnitudes do not appear to be influenced by visual or linguistic processing difficulty during reading. Furthermore, children show more crossed disparity than adults, though the frequency with which disparity occurs overall is similar in adults and children. Finally, a unified visual percept is achieved from disparate retinal inputs via a process of fusion, and saccade metrics are computed on this basis.
Our findings have raised a number of important issues that we are currently carrying out experiments to investigate. The questions that are of primary interest to us concern aspects of the process of fusion, and in particular, how we achieve a non-diplopic perceptual representation given different patterns of retinal stimulation. We have recently carried out an experiment to investigate the magnitude of disparity readers are able to tolerate and yet still perceive a non-diplopic word (Blythe, Joseph, Findlay, & Liversedge, 2008). To do this, we measured participants’ binocular eye movements and presented whole word and nonword stimuli dichoptically with different horizontal offsets. Participants were required to make a lexical decision to the stimuli (and in order to do this successfully the stimuli had to be fused). In this way we attempted to quantify Panum’s fusional area for reading. Panum’s area is the region of binocular single vision, and to our knowledge, it has never been assessed for written linguistic stimuli. We anticipate that the size of Panum’s fusional area will be related to the disparity magnitudes that we have observed in the experiments discussed above. A second, related question that we are also now investigating concerns which features of a word must be present in corresponding visual stimuli presented exclusively to each eye in order for a non-diplopic visual representation to be attained. Roughly speaking, in these experiments we are interested to know the degree of overlap necessary between dichoptically presented stimuli in order for fusion to occur. A further issue we intend to explore concerns whether the visual context in which such words appear modulates any fusion effects found for words presented in isolation.
In this selective review I have covered a number of experimental studies that my colleagues and I have conducted to investigate binocular coordination during reading. Our findings provide insight into an aspect of written language comprehension that had not received detailed investigation until quite recently, despite the significant amount of eye movement research that has been carried out to investigate reading. We believe that binocular coordination during reading and other visual tasks is an important area of research that is receiving increased interest. It is hoped that future investigations will lead to developments in our understanding of this important aspect of human vision.