Introduction
The study of the neural mechanisms of aesthetic experience has attracted the attention and enthusiasm of a steadily growing number of researchers from disciplines ranging from neuroscience and psychology to art and sociology (see, for example, Ramachandran and Hirstein 1999, Solso 2000, Zaidel 2005, Zeki 1999). As stated by Semir Zeki (2001), artists are like neuroscientists, able to exploit the potential and the capacities of the brain to arouse aesthetic experience. Pierre Bonnard (1867-1947) has been considered, in this sense, one of the greatest and most revolutionary painters of the last century. In his renowned review, John Elderfield emphasized how Bonnard’s paintings are thought to remember a moment of time and to evolve over time as the subject engages perceptually, intellectually and emotionally with them (Elderfield, 1998). Viewers in our experiments often reported a definite change in emotional response upon the late acquisition of a painting’s visual content, as explained by Elderfield (1998): Bonnard is able to add a temporal dimension to his art, and the viewer takes time over his paintings, aware that something is hidden but will eventually be identified. This is an important characteristic of Bonnard’s artistic concept: he frequently placed the most important objects on the periphery of a picture, thus delaying their perception.
As Bonnard famously put it, vision is variable and mobile. Vision is variable because visual acuity changes from the fovea to the periphery of the retina. Vision is mobile because of the perpetual alternation of saccades (rapid jumps of the eye) with eye fixations. The scanpath theory put forward by Noton and Stark (1971a, 1971b) takes into account the same dichotomy between low-resolution peripheral vision and high-resolution foveal vision. This dualism gives eye movements an essential role, since they must carry the fovea and, consequently, visual attention to each part of the image that is to be fixated and processed at high resolution. The movement of sensors, the act of examination and exploration, was for Bonnard a fundamental part of the perceptual experience, as discussed later in neuroscience by Hebb (1968) and Hochberg (1970) and, more recently, by O’Regan and Noë (2001).
We conducted an eye movement experiment using some of Bonnard’s most famous paintings. The resulting scanpaths were compared with eye movement data collected with a number of non-Bonnard control images. Two scanpath metrics, one for spatial similarity and one for sequential similarity, were used for the comparison. We found that the scanpaths on Bonnard’s paintings evolved over time during sequential viewing sessions, contrary to what we observed with the control images and to what the Scanpath theory states. We called this mechanism "extended scanpaths".
Methods
Five subjects participated in the experiment; their ages ranged between 35 and 45, and they had different technical and humanistic backgrounds. They were not artists but were all interested in art. We used our lab eye-tracker, based on a Pentium II computer and standard infrared technology. All the calibration, tracking and eye movement data parsing software was developed internally in the lab by different generations of students and postdocs. Each picture was displayed three consecutive times (Figure 1), each time for four seconds; a brief calibration routine was inserted before and after each single scanpath session.
Subjects could rest between different pictures but not during the viewing session of a specific picture. Each image subtended a visual angle of approximately twenty degrees. Raw eye movement data (Figure 1, top row) were then parsed to identify eye fixation loci, which are shown in the figure as circles connected by dashed lines (Figure 1, lower row), with the green and red circles representing the starting and ending fixations, respectively. Four different paintings by Bonnard were used for the experiment: "After the meal", 1925; "White tablecloth", 1926; "Dining room overlooking the garden", 1930; and "Table in front of the window", 1934.
Bonnard’s paintings elicit an emotional response on the part of the viewer, which is enriched over time by the progressive acquisition of the paintings’ peripheral visual content. One of the primary objectives of Bonnard’s compositional method is to delay, and at the same time facilitate, the apprehension of a painting’s visual content so as to modulate the emotional effect of this perceptual acquisition in a sort of crescendo.
A different set of images was used in a control experiment; it included paintings by Impressionists such as Degas and Renoir, and by classic artists of the Renaissance period such as Leonardo da Vinci. Eye movement data were also drawn from our laboratory’s experiment archive; we selected data from experiments conducted in the past few years with the same three-consecutive-viewing protocol and with high-resolution digital photos of natural scenes or of common human activities. Some of these images can be inspected on our website (http://scan.berkeley.edu/people/toyomi/picture/); an example of an eye movement experiment with a control image is also provided (Figure 2, Beach Scene, Degas, 1905).
Fixation loci (see Figure 1, circles in the lower row) were represented by a string of letters, with each letter corresponding to a different region of interest (see example in Figure 3). Two different metrics, Sp and Ss, were then employed to measure the similarity between two different scanpaths. Sp is the spatial similarity metric; it represents the number of fixation loci that two different scanpaths have in common in a given picture. In our case, this is the normalized number of letters present in both strings: an Sp value of one indicates a complete overlap of loci between the two scanpaths, while zero means a complete spatial dissociation.
The temporal ordering similarity is measured by the sequential metric Ss, which is based on a string editing optimization algorithm. The algorithm finds the minimum distance, or total cost, needed to convert one string into the other. An Ss value of one indicates that the fixational loci are identical not only in space but also in their sequencing; a value of zero indicates a complete dissociation between two scanpaths in terms of both loci and (consequently) temporal ordering.
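The string editing idea can be illustrated with a minimal sketch. It assumes the plain Levenshtein distance with unit costs for insertion, deletion and substitution, and it normalizes by the length of the longer string; both choices are assumptions made here for illustration. The cost scheme actually used for Ss (which also includes a shifting operation) is described in Privitera and Stark (1998), so this sketch is not expected to reproduce the published Ss values. The function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of unit-cost insertions, deletions and
    substitutions needed to turn string a into string b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string costs i deletions
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def sequential_similarity(a: str, b: str) -> float:
    """Illustrative similarity in [0, 1]: one minus the edit distance
    divided by the length of the longer string (an assumed normalization)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty scanpaths are trivially identical
    return 1.0 - levenshtein(a, b) / longest
```

Under these assumptions, two identical strings give a similarity of one, and two strings of equal length with no letter in common give zero, which matches the two extreme cases described above.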
The meaning of the two metrics Sp and Ss is exemplified in Figure 4, where two simplified scanpaths are compared. Two different scanpaths (Figure 4, left) with no locational similarity and no sequential similarity yield Sp=0 and Ss=0. When the two scanpaths visit exactly the same loci but have no sequential similarity (Figure 4, middle), we have Sp=1 and Ss=0. Finally, two identical scanpaths yield Sp=1 and Ss=1 (Figure 4, right).
In the example (Figure 3), the two strings ABCDEFEGHI and BJEDKLML yield an Sp similarity value of 0.375: only three letters, B, D, and E, are present in both strings, and 3 is then normalized by the length of the shorter string, so Sp=3/8=0.375. The sequence similarity Ss is much lower: editing the first (truncated) string to match the second string requires five insertions (J, K, L, M, with L inserted twice), cost 5; one deletion (A), cost 1; and one shift (E with D), cost 1. The total editing cost is thus 7, and Ss=1-(7/8)=0.125. The string editing algorithm is based on dynamic programming, and all the necessary details can be found in Privitera and Stark (1998).
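The Sp arithmetic of this example can be checked with a short sketch. It follows the reading suggested by the worked example, namely the number of distinct letters present in both strings divided by the length of the shorter string; the function name is hypothetical and the handling of empty strings is an assumption.

```python
def spatial_similarity(a: str, b: str) -> float:
    """Distinct letters shared by both strings, normalized by the
    length of the shorter string (reading taken from the worked example)."""
    shortest = min(len(a), len(b))
    if shortest == 0:
        return 0.0  # assumption: an empty scanpath shares no loci
    shared = set(a) & set(b)  # regions of interest visited in both scanpaths
    return len(shared) / shortest


# The two strings of the Figure 3 example share B, D and E; the shorter has 8 letters.
print(spatial_similarity("ABCDEFEGHI", "BJEDKLML"))  # 0.375
```

Note that, under this reading, a string that revisits the same region several times lowers the ratio, because repeated letters count toward the length but only once toward the shared set.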
Results
In the first example (Figure 1), "After the meal", 1925, the primary focus is a richly set table at the center of the image, with bottles and the leftovers of a meal. To its right, a woman seems occupied with clearing the table. Another woman, in the upper left-hand corner, seems to be entering the scene in a blurry patch of colors. The first two scanpaths (Figure 1, first and second panels) concentrate on the central elements of the painting, but the viewer’s attention is then enticed by the peripheral presence, as shown in the last viewing session (Figure 1, third panel).
In "Dining room overlooking the garden", 1930 (Figure 5T), light shifts and shimmers on the objects resting on the table. An oval yellowish object on the right of the tabletop is fixated during the first scanpath but then, again, eye movements indicate a shift of attention, during the last part of the viewing session, to a blurred woman in her housecoat appearing on the extreme left edge of the painting (Figure 5T). In "White tablecloth", 1926 (Figure 5R), a grayish figure again appears on the far right side of the painting, next to the table, and is attended only during the last scanpath. One more example is reported for the painting "After the meal" and a different subject (Figure 5F).
We applied the two similarity metrics Sp and Ss to the set of Bonnard experimental data and to the experimental control database. For each subject and viewed image, the three consecutive viewing sessions were compared pairwise. The three resulting indexes (1st scanpath vs. 2nd, 1st vs. 3rd, and 2nd vs. 3rd) indicate how repetitive the subject’s scanpaths were when she or he viewed the same image at different times. If a viewer is very consistent across the three viewing sessions, the three repetitive comparison indexes should be very high. Conversely, a change in the spatial distribution and temporal pattern of eye movements during the sequential presentations would result in lower values of the three indexes. The triple of indexes generated by each subject and stimulus was pooled with the triples of all other subjects for all viewed images, and we compared the final distributions for the two sets of data, Bonnard and control. We found that the two distributions were indeed significantly different. In the control experiment, the means of both repetitive indexes were high, 0.67 for Sp and 0.47 for Ss; this indicates a high level of repetitiveness both in the distribution of fixation loci and in their temporal sequencing. In the Bonnard experiment, however, the two means were lower, 0.46 for Sp and 0.15 for Ss, both significantly different from the control experiment (two-sample tailed T-test, p<0.01). This defines the phenomenon of Bonnard’s "extended scanpaths", which refers to a temporal modification of the scanpaths during repeated viewings of a Bonnard painting.
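The pairwise comparison of the three viewing sessions can be sketched as follows. This is only an illustration of the bookkeeping: the function names and the example strings are hypothetical, and the metric passed in here is a simple shared-letter ratio standing in for Sp, not the laboratory's actual software.

```python
from itertools import combinations
from typing import Callable, List, Sequence


def shared_letter_ratio(a: str, b: str) -> float:
    """Stand-in spatial metric: distinct shared letters divided by
    the length of the shorter string (an assumed reading of Sp)."""
    shortest = min(len(a), len(b))
    return len(set(a) & set(b)) / shortest if shortest else 0.0


def repetitive_indexes(
    viewings: Sequence[str],
    metric: Callable[[str, str], float],
) -> List[float]:
    """Compare every pair of viewing sessions of one subject and one image.

    For three viewings the pairs come out in the order
    1st vs. 2nd, 1st vs. 3rd, 2nd vs. 3rd.
    """
    return [metric(a, b) for a, b in combinations(viewings, 2)]


# Hypothetical region-of-interest strings for three viewings of one picture.
triple = repetitive_indexes(["ABCD", "ABCE", "ABFG"], shared_letter_ratio)
print(triple)
```

The triples obtained in this way for every subject and image would then be pooled into one distribution per data set before the two distributions are compared.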
Discussion
The notion of repetitiveness is fundamental in the Scanpath theory. Scanpaths consist of sequences of alternating saccades and fixations that appear spontaneously, without special instructions to the subjects; they were found to be repetitive when a subject views the same picture, and idiosyncratic with respect to both the viewer and the picture or scene viewed. This led Noton and Stark to the Scanpath theory (Noton and Stark, 1971a, 1971b), which holds that a top-down internal cognitive representation, distributed over modules of the cortex, generates the complex model of perception as an active process and controls the eye movements. Other laboratories have confirmed the repetitive and idiosyncratic nature of the scanpath sequence of eye movements through experiments with ambiguous figures and, more recently, through experiments comparing abstract and realistic images (Zangemeister et al., 1995) and experiments on visual imagery (Brandt and Stark, 1997).
Our control experiment clearly shows the repetitive tendency of scanpaths, which confirms the Scanpath theory. The mean Sp value of 0.67 is very high; it means that almost 70% of fixational loci are re-foveated when viewers are exposed to the same stimulus at different, consecutive times. The mean Ss value for the control experiment is 0.47; this is also very high if we consider that randomly generated scanpath sequences yielded a mean Ss value of 0.07. A high Ss value signifies not only that viewers tend to look at the same loci when the same stimulus is repeatedly displayed, but also that they maintain their own idiosyncratic sequential pattern of eye movements. The scanpaths of the Bonnard experiment preserve some level of repetitiveness, as both indexes Sp and Ss are significantly higher than those of randomly generated scanpath sequences; yet their mean values are significantly lower than in the control experiment. This means that something changes during the three consecutive viewings of a Bonnard painting, both in terms of the spatial distribution of the fixational loci and in the way the eyes are driven over them.
Of course, the implicit or explicit task setting in which the subject is immersed can strongly modify the scanpath (see, for example, the pioneering work of Yarbus (1967) or our recent study (Privitera and Stark, 2003)). In this experiment, however, no specific task instructions were given to the subjects, who were asked to view the pictures naturally. Nevertheless, our results clearly show the phenomenon of the extended scanpath, that is, a progression of the scanpath pattern over time such as those reported, for example, in Figures 1 and 5. Does the extended scanpath elicit the emotional response described by Elderfield in his essay? We did not explicitly evaluate or monitor the emotional state of our subjects during the experiment. The progression of the scanpath pattern, as evidenced by our findings, definitely enhances (or adds) a temporal dimension in the perceptual experience; this is not always so evident and perceivable when we look at a static image. In this sense, Bonnard’s primary artistic and perceptual objective seems to have been accomplished: late fixations of the extended scanpath often fall on poignant and somewhat unexpected figures, which might very likely trigger some sense of surprise and, thus, an emotional reaction. To this extent, our findings support the phenomenon of late emotional response.
Early forms of visual conspicuity are computed covertly across the entire visual field at low resolution (Itti and Koch, 2001; Privitera et al., 2005). Peripheral salient regions are important to inform the scanpath motor control and help to manage the internal representation. Bonnard emphasized this aspect by using broad patches of color (in contrast to our set of control images), which look sharper in parafoveal or peripheral vision, and he relied on, and facilitated, multiple fixations across every part of the visual field. Kersten, Groner and Groner (2005) showed in an experimental search paradigm that saccadic eye movements are, in part, controlled by stimuli with high spatial frequency: the search for an object presented in low spatial frequency needed more fixations than the search for the same object in high spatial frequency (and they also argued that the late fixation might trigger some sense of surprise). Another interesting example of this mechanism has been reported by Zangemeister et al. (1995) for abstract paintings; an important review of the role of eye movements in aesthetic perception can be found in Locher (1996).
Bonnard’s careful placement of those poignant figures in regions of the painting that are fixated late creates a complex narration of perception and completes the artistic effect. All these elements are inherent in the neurophysiology of perception rather than simply in the effort of pure representation of the external world (or "substance"); as stated by Elderfield, a painting by Bonnard might be thought of not as a representation of substance but as a representation of the perception of substance.