Preserved Contextual Cueing in Realistic Scenes in Patients with Age-Related Macular Degeneration

Foveal vision loss has been shown to reduce efficient visual search guidance due to contextual cueing by incidentally learned contexts. However, previous studies used artificial (T- among L-shape) search paradigms that prevent the memorization of a target in a semantically meaningful scene. Here, we investigated contextual cueing in real-life scenes that allow explicit memory of target locations in semantically rich scenes. In contrast to the contextual cueing deficits in artificial scenes, contextual cueing in patients with age-related macular degeneration (AMD) did not differ from age-matched normal-sighted controls. We discuss this in the context of visuospatial working-memory demands for which both eye movement control in the presence of central vision loss and memory-guided search may compete. Memory-guided search in semantically rich scenes may depend less on visuospatial working memory than search in abstract displays, potentially explaining intact contextual cueing in the former but not the latter. In a practical sense, our findings may indicate that patients with AMD are less deficient than expected after previous lab experiments. This shows the usefulness of realistic stimuli in experimental clinical research.


Introduction
When we enter an environment, our eye movements can be guided by memory of the same or similar environments that we encountered in the past. When you think of your kitchen, you can explicitly tell where the refrigerator is and in which cupboard you will find a coffee mug. Similarly, when participants were asked to find a target in a realistic scene, their response time was dramatically reduced when scenes (with the target placed at the same location) were repeated [1]. Unsurprisingly, participants were explicitly aware of the location of targets in the scenes when tested at the end of the experiment. However, search facilitation in repeated displays was also observed when the "scene" was a symbolic display, typically consisting of a T-shaped target among L-shaped distractors in a randomly generated configuration without any semantic meaning. Nevertheless, response time was reduced [...]

Table 1. Patient characteristics. Columns: Diagnoses, Acuity, Scotoma. Notes: RE, right eye; LE, left eye; AMD, age-related macular degeneration; -, no scotoma; r, relative scotoma; a, absolute scotoma; ag, tested with Amsler grid; *, tested eye. Patients S1-S6 were tested binocularly.

In addition to the patients, we tested twenty controls binocularly (four males, all right-handed, mean age of 72.45 years; range, 66-79 years). Testing took place at Otto-von-Guericke University, Magdeburg. Controls' acuity was examined using the Freiburger Vision Test 3.9.9 [17]. All controls had normal or corrected-to-normal visual acuity (decimal visual acuity > 0.84).
Participants remained naive to the purpose of the research during the experiments. None of them had been tested in a similar study before. Immediately after the experiment, individuals from both groups received a fixed reimbursement of EUR 20 for the one-hour session.

Stimuli
The stimuli were presented, and responses recorded, by a Lenovo ThinkPad L420 laptop (Linux operating system, version 8) using PsychoPy v1.82.01 software [18] under Python. The laptop was connected to a full-color 24-inch HD LCD BenQ XL2410T presentation monitor. The monitor was 521 mm (1920 pixels) wide and 293 mm (1080 pixels) high, with a refresh rate of 120 Hz. Stimuli were presented at a distance of 80 cm in a quiet and dimly lit room. The stimuli were designed with Sweet Home 3D software [19]. Twelve 3D-rendered illustrations of naturalistic scenes of 22.8° × 17.1° visual angle represented a bathroom, bedroom, cinema room, game room, garage, nursery room, kitchen, library, living room, music room, office, and studio (Figure 1).
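The reported scene size in degrees of visual angle can be related to on-screen pixels through the monitor geometry given above. The following is a minimal sketch, assuming a flat screen viewed head-on with the scene centred on the line of sight. The function names are illustrative and are not taken from the original experiment code.

```python
import math

# Monitor geometry and viewing distance as reported in the Stimuli section.
SCREEN_WIDTH_MM = 521.0
SCREEN_WIDTH_PX = 1920
SCREEN_HEIGHT_MM = 293.0
SCREEN_HEIGHT_PX = 1080
VIEWING_DISTANCE_MM = 800.0


def degrees_to_mm(angle_deg, distance_mm=VIEWING_DISTANCE_MM):
    """Physical extent of a centred stimulus that subtends angle_deg."""
    return 2.0 * distance_mm * math.tan(math.radians(angle_deg) / 2.0)


def degrees_to_pixels(angle_deg, screen_mm, screen_px,
                      distance_mm=VIEWING_DISTANCE_MM):
    """Convert a visual angle to pixels along one screen axis."""
    return degrees_to_mm(angle_deg, distance_mm) * screen_px / screen_mm


# Hypothetical usage: size of a 22.8 x 17.1 degree scene in pixels.
scene_width_px = degrees_to_pixels(22.8, SCREEN_WIDTH_MM, SCREEN_WIDTH_PX)
scene_height_px = degrees_to_pixels(17.1, SCREEN_HEIGHT_MM, SCREEN_HEIGHT_PX)
print(round(scene_width_px), round(scene_height_px))
```

Under these assumptions the scenes would occupy approximately 1189 × 887 pixels, i.e., close to a 4:3 region in the centre of the 1920 × 1080 display.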

Figure 1. Interior scenes used in the experiment. The task was to search for the yellow mug and to indicate the left/right direction of the handle by an alternative forced choice response. For each scene, all six possible targets and their locations are shown. In the experiment, however, each scene contained only one target. In repeated displays, the target occurred at the same location across repetitions, whereas in new displays, the target appeared once at each of the shown positions across repetitions. The green squares indicating target locations are for illustrative purposes and were not presented during the experiment.
A yellow mug with a handle pointing either left or right constituted the visual search target and was presented in one of six equal-sized rectangular parts of the display (upper/lower, left/center/right). Each display contained only one mug. The mug could appear both in familiar positions as well as in unfamiliar positions in the rooms, but not in physically impossible places, e.g., up in the air (Figure 1). Depending on the position, the size of the mug varied between ca. 1.1° × 1.1° and 0.6° × 0.6°. The orientation of the handle (left or right) was chosen randomly and balanced in each block.

Brain Sci. 2020, 10, 941

Procedures
A trial began with the presentation of a central black fixation cross with a line length of 2.5° for 1000 ms. After a blank screen of 500 ms, the display was presented until participants indicated the direction of the mug's handle by an alternative button press response. After another blank screen for 500 ms, the next trial began. Participants had to indicate the direction of the mug's handle by pressing the left or right mouse button. A 2000-Hz high-pitch tone provided positive auditory feedback and a 500-Hz low-pitch tone provided negative auditory feedback.
The visual search task consisted of six blocks, with each block including 12 trials, and each trial with a different room as search display. These 12 rooms were repeated in each block in a random order. In six of the 12 rooms (the repeated displays), randomly drawn for each participant, the mug's position was randomly drawn for the first block and then fixed across repetitions (blocks), whereas in the other six rooms (the new displays), the mug's position varied randomly across the six rectangular sections, with the restriction that no location was used twice in a room. In this way, the probability of target presentation was about the same across the six rectangular areas (a grid created by a horizontal line through the center of the scene and two vertical lines dividing the scene into equally wide sections), both for new and repeated locations, thus preventing a confound of target location cueing and contextual cueing. Observers were not informed about target location repetitions or scene repetitions. Figure 2 shows an example of a trial.
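The block structure described above can be sketched as a trial-list generator. This is a minimal sketch under stated assumptions, not the original experiment code: rooms and grid sectors are represented by integer indices, all names (`build_trials`, the dictionary keys) are hypothetical, and no attempt is made to equate the repeated-display positions across the six sectors beyond random drawing.

```python
import random

N_ROOMS = 12    # twelve interior scenes
N_BLOCKS = 6    # each room is shown once per block
N_SECTORS = 6   # 2 x 3 grid of possible target sectors


def build_trials(rng):
    """Return a list of blocks; each block is a list of 12 trial dicts."""
    rooms = list(range(N_ROOMS))
    repeated_rooms = set(rng.sample(rooms, N_ROOMS // 2))

    # Target sector of every room in every block.
    sectors = {}
    for room in rooms:
        if room in repeated_rooms:
            # Repeated display: one sector, fixed across all blocks.
            sectors[room] = [rng.randrange(N_SECTORS)] * N_BLOCKS
        else:
            # New display: every sector exactly once, in random order.
            sectors[room] = rng.sample(range(N_SECTORS), N_BLOCKS)

    blocks = []
    for block in range(N_BLOCKS):
        order = rooms[:]
        rng.shuffle(order)                       # random room order
        handles = ["left", "right"] * (N_ROOMS // 2)
        rng.shuffle(handles)                     # balanced within a block
        blocks.append([
            {
                "block": block + 1,
                "room": room,
                "condition": "repeated" if room in repeated_rooms else "new",
                "sector": sectors[room][block],
                "handle": handle,
            }
            for room, handle in zip(order, handles)
        ])
    return blocks


# Hypothetical usage with a fixed seed for reproducibility.
blocks = build_trials(random.Random(1))
```

Drawing the new-display sectors as a permutation enforces the restriction that no location is used twice in a room, while the repeated displays keep a single sector throughout.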

Recognition Test
Subsequent to the experiment, we ran an explicit recognition test in order to assess whether the repeated scenes were explicitly or implicitly remembered. The six previously repeated scene configurations were presented without the search target (mug). The participant's task was to indicate the target's position for each specific scene by pointing with the mouse cursor. Recognition accuracy was operationalized as the frequency of correct mouse cursor placements in the target display section of the 2 × 3 grid (see Procedures).
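Scoring a cursor placement against the 2 × 3 grid can be illustrated as follows. This is a sketch under assumptions: cursor coordinates are taken to be in pixels relative to the top-left corner of the scene, and the function names are illustrative. PsychoPy's default window coordinates are centred, so real data would first need to be shifted accordingly.

```python
def grid_sector(x, y, width, height, n_cols=3, n_rows=2):
    """Return the (row, col) grid cell containing a point.

    The origin is the top-left corner of the scene; points on the right or
    bottom edge are assigned to the last column or row.
    """
    col = min(int(x / width * n_cols), n_cols - 1)
    row = min(int(y / height * n_rows), n_rows - 1)
    return row, col


def recognition_accuracy(clicks, target_sectors, width, height):
    """Proportion of clicks that fall into the sector that held the target."""
    hits = sum(
        grid_sector(x, y, width, height) == target
        for (x, y), target in zip(clicks, target_sectors)
    )
    return hits / len(target_sectors)


# Hypothetical usage: two scenes, one correct and one incorrect placement.
accuracy = recognition_accuracy(
    clicks=[(100, 100), (1100, 800)],
    target_sectors=[(0, 0), (0, 2)],
    width=1200,
    height=900,
)
```

With six sectors, a participant placing the cursor at random would be expected to hit the correct sector in one of six scenes.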

Data Analysis
All trials with incorrect responses (3% for patients; 1% for controls) and trials in which the response time exceeded the outlier threshold of ± 2 standard deviations from the mean (patients: 2.7% of repeated, 3.3% of new trials; controls: 2.4% in both trial groups) were excluded from the response-time analyses. Mean response times (RTs) were determined separately for repeated and new configurations for each subject and block. To increase statistical power, RTs were aggregated into three epochs, with each epoch containing two subsequent blocks. In addition to RTs, because of the different overall response times between patients and controls, normalized contextual cueing effects [(RT(new) − RT(repeated))/RT(new)] were analyzed. Thus, positive values indicate a benefit for repeated configurations. When Mauchly's test indicated that the assumption of sphericity had been violated, degrees of freedom were corrected using Greenhouse-Geisser estimates of sphericity. For all statistical tests, the alpha level was set at 0.05.
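The preprocessing steps above can be sketched for a single participant. This is an illustrative sketch, not the authors' analysis script: the text does not state over which set of trials the mean and standard deviation were computed, so the example assumes trimming within each epoch × configuration cell, and it pools trials within an epoch rather than averaging block means. All names and dictionary keys are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def trim_outliers(rts, n_sd=2.0):
    """Drop response times more than n_sd sample SDs from the mean."""
    m, sd = mean(rts), stdev(rts)
    return [rt for rt in rts if abs(rt - m) <= n_sd * sd]


def epoch_of(block):
    """Map blocks 1-2, 3-4, 5-6 to epochs 1, 2, 3."""
    return (block + 1) // 2


def normalized_cueing(rt_new, rt_repeated):
    """(RT(new) - RT(repeated)) / RT(new); positive = benefit for repeated."""
    return (rt_new - rt_repeated) / rt_new


def cueing_by_epoch(trials):
    """trials: dicts with keys 'block', 'condition', 'rt', 'correct'."""
    cells = defaultdict(list)
    for t in trials:
        if t["correct"]:  # incorrect responses are excluded first
            cells[(epoch_of(t["block"]), t["condition"])].append(t["rt"])
    return {
        epoch: normalized_cueing(
            mean(trim_outliers(cells[(epoch, "new")])),
            mean(trim_outliers(cells[(epoch, "repeated")])),
        )
        for epoch in (1, 2, 3)
    }
```

Dividing by RT(new) expresses the cueing benefit as a proportion of each participant's own baseline, which is what makes the scores comparable between the slower patients and the faster controls.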

Response Times
We first investigated potential effects of monocular vs. binocular viewing on RT in the patients. A repeated measures ANOVA with group (monocular, binocular) as the between-subjects factor and configuration (repeated, new) and epoch (1-3) as within-subjects factors yielded neither a significant effect of group (F(1, 18) = 0.133, p = 0.72, ηp² = 0.007) nor significant interactions with group [...]. Because of the absence of ocularity effects on visual search, we collapsed over ocularity in all further analyses to increase power.
Averaged response times for patients and controls are shown in Figure 3 and Table 2. A repeated measures ANOVA with group (patients, controls) as the between-subjects factor and configuration [...]
The ANOVA hinted at a later onset of contextual cueing in the patient sample. We therefore calculated paired samples t-tests over the collapsed epochs 2 and 3, separately for patients and controls. Both patients (t(37) = 2.19, p = 0.035) and controls (t(39) = 4.898, p < 0.001) showed a significant response-time advantage for repeated displays.
To investigate the evidence for equal contextual cueing strength in both groups [20], we calculated a Bayes factor analysis on contextual cueing scores in epochs 2 and 3. Using JASP [21], a Bayesian t-test for independent samples compared the fit of the data under the null hypothesis of no difference and the alternative hypothesis of different cueing effects between the control group and AMD patients. The analysis yielded a BF01 = 1.851, i.e., only weak evidence for equality of contextual cueing effects in the two groups.
Because of the overall longer response times of the patients, normalized contextual cueing scores were analyzed with a repeated measures ANOVA with the within-subject factor epoch (1-3) and the between-subjects factor experimental group (patients, controls). Normalized cueing scores did not differ significantly between groups (F(1, 38) = 2.656, p = 0.111, ηp² = 0.065). Moreover, neither the main effect of epoch (Greenhouse-Geisser corrected, F(1.521, 57.817) = 0.948, p = 0.371, ηp² = 0.024) nor the group × epoch interaction (F(2, 76) = 0.748, p = 0.477, ηp² = 0.019) was significant. Thus, relative to the baseline differences, patients did not reduce their response times more than controls in the course of the experiment. Moreover, there was, again, no indication of a contextual cueing impairment in the patients. An analogous Bayesian t-test for the RT yielded a BF01 = 2.148, indicating, again, only weak evidence for equal size of contextual cueing in patients and controls [22].
If contextual cueing is preserved in AMD patients when searching in real-world scenes, contextual cueing scores should, overall, be unrelated to the degree of foveal vision impairment. To test this prediction, we correlated logMAR visual acuity as a measure of general visual performance with the normalized contextual cueing effect in repeated displays in the last epoch using Kendall's τ non-parametric rank order correlation (Figure 4). Only patients who had performed the experiment monocularly were considered for this analysis (n = 14). LogMAR visual acuity of the tested eye did not correlate significantly with the size of contextual cueing (τ = −0.069, p = 0.739), implying that contextual cueing was also reliable in more severely impaired patients.
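For readers unfamiliar with the statistic, Kendall's τ can be computed from pairwise comparisons of ranks. The sketch below implements the tie-corrected τ-b variant in plain Python. Whether the original analysis used τ-b is an assumption, the function name is illustrative, and no p-value is computed.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Tie-corrected Kendall rank correlation (tau-b) of two sequences."""
    concordant = discordant = ties_x_only = ties_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue              # tied on both variables: ignored
        elif dx == 0:
            ties_x_only += 1      # tied on x only
        elif dy == 0:
            ties_y_only += 1      # tied on y only
        elif dx * dy > 0:
            concordant += 1       # pair ordered the same way on x and y
        else:
            discordant += 1       # pair ordered in opposite ways
    untied = concordant + discordant
    denominator = sqrt((untied + ties_x_only) * (untied + ties_y_only))
    return (concordant - discordant) / denominator


# Toy values for illustration only, not data from the study.
tau = kendall_tau_b([1, 2, 3], [1, 3, 2])
```

A value near zero, as reported above, means that pairs of patients ordered by acuity were about equally often ordered the same way as the opposite way by their cueing scores. In practice, a statistics package such as SciPy (`scipy.stats.kendalltau`) would also supply the p-value.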

Accuracy
Response time is the central variable of interest in the contextual cueing paradigm [23] and accuracy is usually high in contextual cueing paradigms because participants have no time limit. This was also observed in the present study, both for patients (monocular: mean 96%; binocular: mean 97%) and controls (mean: 99%). Thus, whereas reaction times are the main variable of interest, accuracy was, nevertheless, analyzed to rule out potential speed-accuracy trade-offs.
A repeated measures ANOVA comparing accuracy with the between-subjects factor experimental group (control group, AMD monocular, AMD binocular) and the within-subjects factors configuration (novel, repeated) and epoch (1-3) [...]

Recognition Test
The experiment was followed by an explicit recognition test to address the question of whether learning was implicit or explicit. For this purpose, the displays with repeated target locations were presented again and the participants had to indicate the target sector (see Methods). The average recognition accuracy in patients was 39.16%. To test whether the participants were able to recall the position of the target, we compared the accuracy obtained in the sample with the chance level of 16.6%, since in each trial, the probability of selecting the correct position was 1/6. Patients' accuracy was significantly above the chance level (t(19) = 5.107, p < 0.001), indicating that they recalled the repeated target locations. For the controls, accuracy was 58.33%. This, too, was significantly above the chance level (t(19) = 7.611, p < 0.001). Recognition accuracy was significantly higher in controls than in patients (t(19.167) = 2.728, p ≤ 0.01).
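The comparison against chance is a one-sample t-test with the chance level of 1/6 as the reference value. A minimal sketch of the test statistic follows. The function name and the example scores are hypothetical, and the p-value lookup is omitted.

```python
from math import sqrt
from statistics import mean, stdev

CHANCE_LEVEL = 1 / 6  # one correct sector out of six


def one_sample_t(scores, mu=CHANCE_LEVEL):
    """Return (t, degrees of freedom) for H0: mean(scores) == mu."""
    n = len(scores)
    standard_error = stdev(scores) / sqrt(n)
    return (mean(scores) - mu) / standard_error, n - 1


# Toy per-participant accuracies for illustration only, not study data.
t_value, df = one_sample_t([0.2, 0.4, 0.6])
```

With 20 participants per group, the degrees of freedom are 19, matching the t(19) values reported above.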

Discussion
We investigated contextual cueing of visual search in repeated scenes in patients with AMD. In previous studies with patients with AMD [7] as well as simulated central vision loss in normal-sighted observers [8,9], impairment of contextual cueing was observed during search with foveal vision loss. In these studies, artificial displays were presented in which a T-shaped target had to be searched among L-shaped distractors. In the present study, we investigated if contextual cueing was preserved in patients with AMD when searching in real-world scenes. In line with previous work [1], we found faster response times in displays with repeated target locations, indicating contextual cueing.
Although this search facilitation was numerically somewhat smaller in the patients than in the controls, we observed no significant group × configuration interaction, which would have been indicative of group differences in the amount of contextual cueing. Nevertheless, we cannot draw a firm conclusion about the equality of the size of the contextual cueing effects in AMD patients and controls, as the Bayes factor analyses yielded evidence for only a roughly two-fold higher probability of equal than of unequal size. The important point, however, is that the patients showed significantly faster search times in repeated displays after the first third of the experiment, indicative of contextual cueing.
Visual search in general, for new and repeated target locations alike, was much slower in the patient group, as was expected, due to their impaired vision. While contextual cueing manifested itself in the faster search for targets at constant target locations, there was also an unspecific learning effect that led to faster search for both constant and variable target locations from epoch to epoch. This general learning effect was present in patients and controls alike. It may even have been somewhat stronger in the patients, as suggested by the group × epoch interaction in the age-matched patient sample (and the trend in the overall patient sample). This effect, however, could be attributed to their higher initial response times.
We confirmed the central hypothesis of this paper that the clear contextual cueing deficit that we observed in previous work with AMD patients [7] would be ameliorated for search in realistic displays. Our reasoning was that realistic displays would enable explicit memory of the target location in the form of a semantic template (e.g., the cup is on the kitchen table). In turn, this would resolve the competition between concurrent visuospatial working memory demands for top-down controlled exploration of the scene (e.g., navigating the fixation to the table), on the one hand, and for keeping a detailed visuospatial memory template active during search, on the other. In contrast, in symbolic displays, visuospatial working memory is needed both for keeping the contextual memory template active during search for comparison with the display and for navigating the display with eye movements concurrently [9]. In keeping with this reasoning, we observed that patients and controls alike could explicitly remember the repeated target locations, unlike the AMD patients tested with symbolic displays in our previous contextual cueing study [7]. Moreover, visual scene exploration was clearly impaired in the patients, as indicated by their much longer response times. Thus, visuospatial working memory demands most likely were as high as, if not higher than, those of the patients with AMD in our previous study [7]. This supports our view that patients did not use visuospatial working memory but semantic memory to guide visual search in the present experiment.
The fact that contextual cueing in naturalistic scenes went along with explicit memory in our study, in keeping with previous work with normal-sighted observers, does not imply that object or scene recognition has to occur explicitly in AMD. In fact, scene categories can be distinguished after brief presentation in the peripheral visual field [24] and scenes may implicitly prime recognition of congruent objects in AMD patients [25]. In the present displays with repeated target location, the scene category (e.g., kitchen or living room) may have been rapidly identified, leading to the memory trace linking the particular scene to the associated target location.
Although this study was inspired by our previous work on contextual cueing in AMD patients, a direct comparison is limited by the fact that different patient samples were tested. Thus, the present results should be seen as evidence that contextual cueing in realistic displays is possible in AMD patients, not as a direct comparison of contextual cueing strengths between these studies. A direct comparison of contextual cueing in symbolic and realistic displays would further be complicated by the inherent differences between naturalistic and symbolic displays. It is, perhaps, noteworthy that we found a search advantage for AMD patients in naturalistic displays that was not observed in symbolic displays, whereas one could assume that the greater visual complexity of naturalistic displays might cause problems for patients with visual deficits, as has been reported in a study of glaucoma patients [26]. A worthwhile issue for future studies would be to compare search in familiar scenes (as in the present study) with search in unfamiliar scenes. A disadvantage for the latter might support our interpretation that the semantic content of the scene helps to reduce working memory load during contextual search guidance.
Unfortunately, it was not possible to perform eye tracking in the present study. In a recently published study, however, we carried out the same experiment with the same scenes in healthy younger observers under scotoma simulation [16]. Regarding contextual cueing, improvement of exploration efficiency was very similar to that of unimpaired observers in our previous work using arbitrary search displays [7-9], in that fixation number was significantly reduced and scan paths showed a tendency towards increased efficiency in repeated compared to novel displays. Furthermore, the onset of the monotonic path, which distinguishes an early inefficient search phase from the efficient guidance of the eye to the target with each successive fixation, occurred significantly earlier in repeated than in novel displays. Interestingly, we found comparable contextual cueing effects between search with and without a simulated scotoma using the realistic scenes [16], suggesting that even observers without previous vision-loss experience may profit more from contextual cues to guide their eye movements in realistic scenes than in arbitrary search displays. This was the case although exploration was generally more complicated, as reflected by more fixations that were longer in duration and by increased saccade amplitudes. Thus, future work may want to investigate eye movements, perhaps in parallel in AMD patients and in normal-sighted participants with gaze-contingent simulated scotomata, in order to see whether patients show eye-movement patterns (e.g., fewer fixations of longer duration and higher saccade amplitudes) comparable to those induced by simulated central scotomata [9,27].
Beyond the issue of contextual search guidance, the present data show that the use of realistic stimuli may contribute to the question of how well laboratory experiments can predict behavior, and its limitations due to pathology, in real life. The contextual cueing deficits that we observed in our previous work in AMD patients [7] may be compensated for in everyday situations by the use of semantic memory templates. The size of the search advantage due to contextual cueing was on the order of several hundred ms in the present study, rendering it a factor that should be of importance for visual search in everyday situations.