1. Introduction
When viewing a scene, we can readily decompose it into ever-finer elements until reaching the limit of our visual acuity. When asked, we can report back either the entire scene or some subtle detail within it. However, this ability can fail. In 1924, Wolpert introduced the term “simultanagnosie” to describe the inability of a patient to see “the whole simultaneously while having a good grasp of the details”, which he called “a disturbance of overall view” [1].
Later studies [2,3,4] found that there were clear differences in the nature and severity of the perceptual abnormalities present. Farah and colleagues separated them into dorsal and ventral simultanagnosia after the sites of the lesions characteristic of each [5,6,7]. They noted that both groups could recognise simple shapes but struggled with interpreting complex scenes. They further observed that the fundamental perceptual deficits and lesion locations differ between these conditions: the bilateral occipito-parietal lesions seen in dorsal simultanagnosia lead to a deficit in multiple object recognition and a failure to appreciate the spatial relationship between objects in a scene, while lesions of the left temporo-occipital cortex lead to a general slowing of information processing and prosopagnosia. Farah noted that with ventral lesions, patients can see multiple objects simultaneously but process them sequentially, with reading inability being especially prominent [6]. Patients with ventral simultanagnosia who have left temporo-occipital lesions can see and recognise multiple objects, albeit slowly, unlike patients with dorsal simultanagnosia. The dorsal form has been described as leading to deficits in the processing of multiple objects or multiple features of a single object, while the ventral form leads to reading impairment and a deficit in the processing of complex scenes [8]. Thus, while the two classes of simultanagnosia share features, they differ in their specifics. Some tasks may be impossible in dorsal simultanagnosia, while in the ventral form, simple, but not complex, versions of the same task may be possible.
Riddoch and Humphreys [9] described a patient with bilateral damage to the occipito-temporal cortex who could not integrate local parts of a figure into a higher-order shape despite having normal elementary sensory functions such as visual acuity, colour vision, brightness discrimination, and normal stored knowledge of objects. They called this integrative agnosia and noted that it might be a result of simultanagnosia. Similarly, when shown a large letter composed of smaller letters (a Navon letter), patients with dorsal simultanagnosia fail to identify the larger, global letter while readily seeing the small, local ones [10,11].
Because simultanagnosia is defined by deficits in processing multiple or complex images, the question obviously arises as to whether this is related to deficits in acquiring visual information in the first place; that is, do patients’ eyes fixate normally on elements of the visual scene they wish to identify? In a study of two patients with dorsal simultanagnosia, it was found that when they were asked to identify the global form of Navon letters (large letters composed of multiple small letters) [12], gaze patterns did not differ between successful and unsuccessful trials [13]. They identified 100% of the local letters but only 12/42 of the global ones, even though their scanning often covered the extent of the global letter [14]. A patient with ventral simultanagnosia arising from posterior cortical atrophy was also found to be severely impaired in recognising the global letter in the Navon test [15]. Their gaze behaviour was not recorded.
Another stimulus where identification of the global form depends on the integration of local features is pseudoisochromatic (e.g., Ishihara) colour plates, which test for colour vision defects by requiring patients to read out digits defined by similarly coloured circles surrounded by circles differing in colour. A report of seven patients with dorsal simultanagnosia found that they were unable to identify the digits but could correctly identify the colours of the individual circles [16]. In contrast, a patient described as having “profound simultanagnosia” arising from bilateral dorsal stream pathology had normal results on the Ishihara test [17]. A patient with ventral simultanagnosia was able to correctly identify single colours but was unable to integrate the coloured circles of the Ishihara plates [4]. In addition, he was unable to trace the digits on the plates with his finger.
A more familiar object with important features at both local and global scales is the human face. Faces convey both identity and emotion. One can even have an autonomic response to a familiar face that is consciously unrecognised, as has been described in prosopagnosia [18]. We might thus expect differences in how identity and affect are processed in simultanagnosia. There is also a well-established “inverted triangle” scanning pattern generated by normal individuals [2], which may be distorted in many conditions, e.g., posterior cortical atrophy (PCA) [2,19]. In other disorders (e.g., schizophrenia), the scanning pattern may be abnormal during undirected free viewing but may normalise during a task such as emotion recognition [20]. Studies on how faces are perceived in simultanagnosia are scarce. One patient with ventral simultanagnosia could not recognise faces per se but could use features such as hair colour to identify individuals [4]. In contrast, a patient with dorsal simultanagnosia could recognise famous faces and extract gender and emotional information from them [21]. As areas specialised for face processing lie in the ventral portion of the temporal lobe [22], these differences in performance may be explained by the differences in neuropathology between the two categories of simultanagnosia. Another patient, described as having ventral simultanagnosia and dorsal prosopagnosia, could recognise famous faces but had difficulty discriminating between unfamiliar faces and identifying emotional expressions [23].
Although how fixations are distributed across a scene may play a significant role in how individuals extract information from it, very few studies of simultanagnosia, especially the ventral form, have recorded eye movements. In one case report, a patient with simultanagnosia and damage to the occipital, parietal and temporal lobes, thus encompassing both the dorsal and ventral pathways, fixated the informative regions of the “Cookie Theft” test image but could only describe individual features of the scene, not the narrative the image represented [24]. For either form of simultanagnosia, studies reporting gaze behaviour have done so for only a single task. Assessing gaze behaviour and perception across a range of tasks would allow us to better delineate the limits on a patient’s ability to integrate visual information and to determine what role, if any, abnormal information acquisition plays in these limits.
Here, we report on a patient with ventral simultanagnosia and integrative agnosia with ventral encephalomalacia of the occipital lobes due to bilateral occipital lobe infarction. She underwent clinical assessment and extensive neuropsychological evaluation, along with structural magnetic resonance imaging (MRI) evaluation. Her observation that she was unable to recognise faces and her failure to identify the Ishihara plates during clinical assessment led to those tasks’ inclusion in the testing protocol presented here. The emotional faces were added based on her description of how she made this judgement. The Navon figures were included, as they are by now a classic way to assess the integration of local information into a global unit. This formed a battery of perceptual tasks whose completion required holistic processing of local features: facial identification, facial emotion identification, and Navon figures, both in their usual form and when surrounded by black filled circles (“dotted” Navons), and pseudoisochromatic plates. They varied in the degree to which they depended on local or global processing. For each, gaze was recorded and analysed to determine if fixation patterns differed for successful and unsuccessful performance, particularly fixation of the salient elements of the stimuli. Facial recognition requires holistic processing of individual features. Emotion recognition is facilitated by the integration of several local features for some emotions, but for others, it depends largely on one feature. Both the global Navon figures and the characters in the pseudoisochromatic plates depend on the integration of simple local elements to be perceived. In addition, both the colour plates and the “dotted” Navon figures contained distractor local features, which could hinder the integration process. 
The use of this test battery allowed us to look for commonalities in performance across the tasks, and the recording of gaze allowed us to determine whether failures on the tasks were accompanied by changes in how visual information was acquired. We hypothesised that poor perceptual performance would be reflected in fixation of non-salient regions of the stimuli. Such studies are rare for dorsal simultanagnosia and virtually absent for the ventral form.
2. Materials and Methods
MH underwent a full neuro-ophthalmological examination, a neuropsychological assessment, structural magnetic resonance imaging (MRI) and an evaluation of her eye movements as she performed a series of perceptual tasks. Eye movement recordings were carried out initially with a Tobii 1750 eye tracker (Tobii Technology AB, Stockholm, Sweden), which operated at 50 Hz with an accuracy of 0.5 deg and a resolution of 0.25 deg. These recordings were made at a viewing distance of 75 cm. Later recordings were carried out using an Eyelink 1000 Plus (SR Research, Ottawa, ON, Canada) after it became available. It operated at 500 Hz, with an accuracy of 0.25 deg and a resolution of 0.01 deg. The methodology for each task, including the tracker used, is presented separately. As the eye movement analyses were of fixations rather than peak velocity or latency, we considered that, despite the grossly different sampling rates, the two systems were comparable for the purpose of this study.
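The comparability argument can be illustrated with a small calculation. The sketch below is not part of the study's analysis; it simply converts the two stated sampling rates into sample intervals and counts how many samples fall within a fixation. The 200 ms fixation duration is an assumed, illustrative value.

```python
def sample_interval_ms(rate_hz: float) -> float:
    """Time between consecutive gaze samples, in milliseconds."""
    return 1000.0 / rate_hz


def samples_per_fixation(duration_ms: float, rate_hz: float) -> int:
    """Number of whole sample intervals spanned by a fixation."""
    return int(duration_ms * rate_hz / 1000.0)


# Assumed illustrative fixation duration; not a value reported in the study.
FIXATION_MS = 200.0

# Sampling rates as stated in the text.
TRACKERS = {"Tobii 1750": 50.0, "Eyelink 1000 Plus": 500.0}

for name, rate_hz in TRACKERS.items():
    interval = sample_interval_ms(rate_hz)
    n_samples = samples_per_fixation(FIXATION_MS, rate_hz)
    print(f"{name}: {interval:.0f} ms per sample, {n_samples} samples per fixation")
```

Under these assumptions, the slower tracker still captures several samples within a typical fixation, with onset and offset timed to within one 20 ms sample interval. This coarser timing would matter more for velocity or latency measures than for counting fixations and locating them within a region.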
2.1. Famous Faces
Methods: Whilst a few cases of face recognition deficits in ventral simultanagnosia have been reported [3,4], none have included eye tracking, so it is unknown whether impaired recognition is associated with abnormal acquisition of visual information, as has been observed in prosopagnosia [25]. Using the Tobii 1750, we presented 10 black and white images of famous individuals from entertainment, politics and sport: Donald Bradman, Cary Grant, Marilyn Monroe, Marlon Brando, Humphrey Bogart, Audrey Hepburn, Elizabeth Taylor, Robert Menzies, Queen Elizabeth II, and President John F Kennedy, each shown as they would have appeared before our patient suffered her initial incident. They were presented for 5 s each. There was no fixed time limit on the participant’s verbal response. For analysis, we drew areas of interest (AOIs) around the eyes, nose, and mouth. Tobii Clearview 2.7.1 software was used to identify the number of fixations of each AOI and the duration of each fixation.
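The AOI analysis described above was carried out in Tobii Clearview. As a rough illustration of the same idea, the sketch below counts fixations and sums their durations per AOI. The rectangular AOIs, pixel coordinates, fixation values and all names (`AOI`, `Fixation`, `summarise`) are hypothetical and are not taken from the study or from Clearview.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AOI:
    """Rectangular area of interest in screen pixels (hypothetical layout)."""
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


@dataclass(frozen=True)
class Fixation:
    x: float
    y: float
    duration_ms: float


def summarise(fixations, aois):
    """Return the fixation count and total fixation duration for each AOI."""
    summary = {aoi.name: {"count": 0, "total_ms": 0.0} for aoi in aois}
    for fix in fixations:
        for aoi in aois:
            if aoi.contains(fix.x, fix.y):
                summary[aoi.name]["count"] += 1
                summary[aoi.name]["total_ms"] += fix.duration_ms
                break  # a fixation is assigned to the first AOI that contains it
    return summary


# Hypothetical AOIs and fixations, for illustration only.
aois = [
    AOI("eyes", 300, 200, 724, 300),
    AOI("nose", 450, 301, 574, 420),
    AOI("mouth", 400, 421, 624, 520),
]
fixations = [
    Fixation(500, 250, 220.0),  # eyes
    Fixation(510, 350, 180.0),  # nose
    Fixation(480, 460, 300.0),  # mouth
    Fixation(100, 100, 150.0),  # outside every AOI, so not counted
    Fixation(600, 260, 200.0),  # eyes
]

summary = summarise(fixations, aois)
for name, stats in summary.items():
    print(name, stats["count"], stats["total_ms"])
```

Fixations that fall outside every AOI are ignored here; a real analysis might track them separately to estimate how much viewing time was spent off the inner facial features.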
2.2. Emotional Faces
Methods: In addition to identity, faces convey emotion. It has been proposed, although not universally accepted, that the mechanisms underlying the perception of identity and emotion are to some degree separable (see Calder and Young (2005) [26] for a review). We therefore tested our patient on the recognition of a standardised set of faces (https://www.paulekman.com/product/pictures-of-facial-affect-pofa, accessed on 10 June 2024) that expressed the emotions anger, disgust, fear, sadness, happiness, surprise and neutrality. Each emotion was represented by 8 different images (4 male, 4 female), with images being presented for 8 s, each preceded by a fixation cross. Images were presented on a 17” screen with a resolution of 1024 × 768, 90 cm away from the participant, who rested on a chinrest. The stimuli subtended 8 × 13 deg at this viewing distance. Following the presentation of each face stimulus, a list of all the emotions was presented on the screen, and MH was asked to verbally identify the emotion portrayed in the previous image. She was given as much time as needed to respond. Eye movements were recorded by an Eyelink 1000 Plus (SR Research, Ottawa, ON, Canada). As before, areas of interest were defined for the eyes, nose and mouth, and the number of fixations and fixation duration were calculated for each area.
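For readers wishing to reproduce the stimulus geometry, the stated visual angles can be converted to physical sizes with the standard relation size = 2 d tan(θ/2). The sketch below applies it to the 8 × 13 deg faces at 90 cm. The function names are illustrative, and the resulting on-screen sizes are derived here rather than reported in the study.

```python
import math


def size_from_angle(angle_deg: float, distance_cm: float) -> float:
    """Physical extent (cm) that subtends angle_deg at distance_cm."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def angle_from_size(size_cm: float, distance_cm: float) -> float:
    """Visual angle (deg) subtended by size_cm at distance_cm."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


VIEWING_DISTANCE_CM = 90.0  # viewing distance stated for the emotional faces task

width_cm = size_from_angle(8.0, VIEWING_DISTANCE_CM)
height_cm = size_from_angle(13.0, VIEWING_DISTANCE_CM)
print(f"Face stimulus: about {width_cm:.1f} x {height_cm:.1f} cm on screen")
```

The same function applies to the other tasks, for example the colour plates, which subtended 15 deg at 75 cm.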
2.3. Colour Plates
Methods: Digital versions of ten Ishihara colour test plates were used as stimuli. Eye movements were recorded on a Tobii 1750, with a viewing distance of 75 cm. The colour plates subtended 15 deg of visual angle. The subject was given 15 s to identify the number displayed. On two plates, she was subsequently also given the opportunity to trace the digit with the tip of a pencil, having 1 min to do so.
2.4. Navon Figures, Plain and “Dotted”
Methods: To explicitly examine global versus local feature identification in our patient, stimuli consisting of larger letters composed of smaller letters were developed [12]. In all instances, the large and small letters were incongruent. Twenty characters per test set were presented, subtending, with slight variations between letters, 6 × 5 deg. The first set consisted of conventional Navon-type letters; the second used similar letters, but these were now surrounded by black circles of the same size. This mimicked the organisation of Ishihara plates, where the characters to be identified were surrounded by a field of coloured circles. We hypothesised that the increased visual complexity of the stimulus would disrupt the participant’s ability to integrate the local target letters into a global construct and that this would be reflected by changes in their fixation distribution. Each image was presented for 5 s, with the subject being asked to identify both the large and the small letters. The plain and “dotted” targets were intermixed pseudo-randomly. Stimuli were presented on a 17” screen with a resolution of 1024 × 768, 90 cm away from the participant, who rested on a chinrest, and eye movements were recorded by the Eyelink 1000 Plus. Fixations were categorised as salient if they fell on the small letters and as non-salient if they fell in the surrounding region.
4. Discussion
This paper presents, for the first time, the performance of someone with ventral simultanagnosia on a series of perceptual tasks requiring the global integration of local information, with concurrent recording of gaze. This allows us to look for relationships between the acquisition of visual information and its processing into a coherent percept. Our main hypothesis was not supported: even when task performance was greatly impaired, this was not reflected in gaze distribution. We have, however, considerably extended our understanding of the limits of holistic processing in this rare condition.
Given that feature integration is impaired but not absent in ventral simultanagnosia, we can thus look for commonalities across tasks. Ventral simultanagnosia leads to less severe deficits than the dorsal form [5,6,7]. Farah described the key difference: “In dorsal simultanagnosia, perception is piecemeal in that it is limited to a single object or visual gestalt, without awareness of the presence or absence of other stimuli. In ventral simultanagnosia, recognition is piecemeal, that is, limited to one object at a time, although, in contrast to dorsal simultanagnosia, other objects are seen” [6]. That is, in dorsal simultanagnosia, only one element in the visual environment at a time elicits a response. In the ventral form, multiple elements may be responded to (e.g., as in a counting task) but without reaching conscious awareness. Analysing gaze can provide evidence of which parts of the scene are attended to, whether consciously or otherwise. Thus, during the patient’s scanning of the Cookie Theft picture (Figure 4), she fixated on areas relevant to the scenario, both those she mentioned and those she did not recognise. While her scanpath resembled that of a patient with dorsal simultanagnosia [24], unlike him, she was able to summarise key activities in the scene.
The two face-processing tasks had different relationships to gaze behaviour. At a qualitative level, fixations on famous faces followed the familiar inverted triangle [34,35], unlike the aberrant scanning seen in an acquired prosopagnosia patient [25]. The latter authors attributed their patient’s inability to recognise faces to his failure to fixate the inner, salient elements of faces, thus attributing perceptual impairment to abnormal information acquisition. In contrast, our patient at least qualitatively took in information from those salient features but completely failed to integrate them into an identifiable percept. Consistent with this, Van Belle et al. [36] found in a masking experiment that a prosopagnosia patient processed faces feature by feature, showing no evidence of the holistic integration needed for recognition.
In contrast, the identification of facial emotion depends less upon holistic processing. Wegrzyn et al. [37], also using the Ekman FACT database, found that various emotions were expressed differently by the inner face features. The recognition of sadness and fear depended most on the eyes, while disgust and happiness depended on the mouth. They also noted that anger and disgust, as well as fear and surprise, were frequently confused, attributing this to how the areas around the eyes express these emotions. Our patient identified happiness on all trials, saying that she could do so because she always looked at the mouth and so readily saw a smile. If the responses for the frequently confused anger–disgust and fear–surprise pairs are combined, identification of these emotions would also have reached significance. This is consistent with the fixation patterns shown in Figure 8, where both the eyes and mouth attracted attention, and with the pairwise multiple comparison tests, where happiness and disgust were frequently significantly different from the other emotions.
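To make the notion of reaching significance concrete, the sketch below shows one way such performance could be compared with chance. It assumes an exact one-sided binomial test against guessing among the seven response options; this is an illustrative choice and not necessarily the test used in the study. The 8 trials per emotion and 7 options come from the Methods; the pooled-pair count is hypothetical.

```python
from math import comb


def binomial_tail(successes: int, trials: int, p_chance: float) -> float:
    """P(X >= successes) for X ~ Binomial(trials, p_chance)."""
    return sum(
        comb(trials, k) * p_chance**k * (1.0 - p_chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


N_EMOTIONS = 7          # response options listed in the Methods
TRIALS_PER_EMOTION = 8  # images per emotion listed in the Methods

# Happiness was identified on all trials, i.e. 8 of 8.
p_happiness = binomial_tail(8, TRIALS_PER_EMOTION, 1.0 / N_EMOTIONS)
print(f"Happiness, 8/8 correct: p = {p_happiness:.2e}")

# Pooling a confusable pair (e.g. anger and disgust): either label counts as
# correct, so chance rises to 2/7 over 16 trials. The count of 12 is
# HYPOTHETICAL and only illustrates the calculation.
HYPOTHETICAL_PAIR_CORRECT = 12
p_pair = binomial_tail(
    HYPOTHETICAL_PAIR_CORRECT, 2 * TRIALS_PER_EMOTION, 2.0 / N_EMOTIONS
)
print(f"Pooled pair, {HYPOTHETICAL_PAIR_CORRECT}/16 correct: p = {p_pair:.2e}")
```

Pooling a pair doubles both the number of trials and the chance rate, so a pooled result is significant only if the combined hit rate exceeds 2/7 by a sufficient margin.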
Analysis of the fixation patterns made using the pseudoisochromatic (“Ishihara”) plates is descriptive but nonetheless informative. Some patients with dorsal simultanagnosia were unable to identify the digits while successfully identifying the individual colours of the circles [16], but another similar case could identify them [17]. One case of ventral simultanagnosia evaluated on the task also failed at it and, unlike our patient, could not trace the figure with his finger [4]. Our patient performed normally on the FM-15 colour test but could not integrate her repeated fixations on the digit on the colour plate (e.g., Figure 9b) into a percept. When given sufficient time, she could trace the numerals with her finger, mapping out her finger’s movement along the digit (Figure 9c). This is consistent with Milner and Goodale’s framework of vision for perception and vision for action [38], whereby the ventral stream provides the necessary information for recognition while interacting with the dorsal stream to guide motor behaviour. This interaction may occur in a region of the posterior parietal cortex [39]. As this region was not damaged in our patient, local information was thus available to drive her motor system, guiding her finger. However, due to damage to the temporo-occipital cortex (area TO), the limited holistic integration mechanisms available could not overcome the competition from the surrounding circles to allow her to perceive the digit that her finger traced.
The final task discussed is the Navon global/local letter identification task [12]. This task was designed to assess the ability to process both global and local information. Navon found that in normal individuals, the global form takes precedence, which should make it a particular challenge for individuals with simultanagnosia. As described earlier, an individual with dorsal simultanagnosia performed normally on local letter identification but was impaired when naming global characters, even though their gaze covered the entire character [16]. Showing the potential utility of proprioception in form identification, another similar patient could identify the global letter only with her eyes closed while her finger was moved passively over it [19]. This contrasted with our patient’s performance when tracing a figure on one of the colour plates. However, her performance on conventional Navon characters was almost perfect, which is consistent with the limited preservation of global processing in ventral simultanagnosia. Her failure to identify the digits on the colour plates motivated us to modify the Navon characters by adding dotted surrounds analogous to those in the colour plates, providing competition for attention in a way similar to the effect of the coloured surrounds in the Ishihara plates. These novel stimuli had a profound effect on her performance. Her recognition of the local letters fell only from 20/20 to 19/20, but recognition of the global letters fell from 20/20 to 0/20. This might have been because the added distractors drew her gaze away from the global letter, but our eye movement recordings refuted this possibility. As Figure 11 shows, confirmed by subsequent ANOVAs, gaze was nearly unaffected by the addition of the surround (contradicting our hypothesis), but perceptually, global feature integration was abolished. Considering this and our patient’s inability to perceive any of the colour plate digits, it is clear that local elements defined by achromatic form and those defined by colour become impossible to integrate when faced with competing distractors in this individual with ventral simultanagnosia, even though fixations continue to be directed to the salient elements of the stimulus.
One caveat that must be raised is the role of our patient’s visual field deficit. Ideally, a control group with similar field loss, as might occur in optic nerve or retinal disease, would have addressed this issue. In its absence, we can point to the fact that on several tasks, her impairment was the same regardless of viewing distance, but whether field loss contributed to impaired performance cannot be ruled out. The other limitation is, of course, that this is a report on a single patient, so any generalisations about the findings can only be tentative. It is hoped that if further patients come to light, the hypotheses tested here could be further examined. Whether any of the deficits observed could be ameliorated through perceptual training might also be examined.