Does Descriptive Text Change How People Look at Art? A Novel Analysis of Eye—Movements Using Data—Driven Units of Interest

Davies, Alan; Reani, Manuele; Vigo, Markel; Harper, Simon; Gannaway, Clare; Grimes, Martin; Jay, Caroline

doi:10.16910/jemr.10.4.4

Open AccessArticle

Does Descriptive Text Change How People Look at Art? A Novel Analysis of Eye—Movements Using Data—Driven Units of Interest

by

Alan Davies

¹,

Manuele Reani

¹,

Markel Vigo

¹,

Simon Harper

¹,

Clare Gannaway

²,

Martin Grimes

² and

Caroline Jay

¹

School of Computer Science, University of Manchester, Manchester, UK

²

Manchester Art Gallery, Manchester, UK

J. Eye Mov. Res. 2017, 10(4), 1-13; https://doi.org/10.16910/jemr.10.4.4

Submission received: 23 May 2017 / Published: 22 November 2017

Download

Browse Figures

Versions Notes

Abstract

Does reading a description of an artwork affect how a person subsequently views it? In a controlled study, we show that in most cases, textual description does not influence how people subsequently view paintings, contrary to participants’ self-report that they believed it did. To examine whether the description affected transition behaviour, we devised a novel analysis method that systematically determines Units of Interest (UOIs), and calcu-lates transitions between these, to quantify the effect of an external factor (a descriptive text) on the viewing pattern of a naturalistic stimulus (a painting). UOIs are defined using a grid-based system, where the cell-size is determined by a clustering algorithm (DBSCAN). The Hellinger distance is computed for the distance between two Markov chains using a permutation test, constructed from the transition matrices (visual shifts between UOIs) of the two groups for each painting. Results show that the description does not affect the way in which people transition between UOIs for all but one of the paintings --an abstract work --suggesting that description may play more of a role in determining transition behaviour when a lack of semantic cues means it is unclear how the painting should be interpreted. The contribution is twofold: to the domain of art/curation, we pro-vide evidence that descriptive texts do not effect how people view paintings, with the possible exception of some abstract paintings; to the domain of eye-movement research, we provide a method with the potential to answer questions across multiple research areas, where the goal is to determine whether a particular factor or condition consistently affects viewing behaviour of naturalistic stimuli.

Keywords:

art; paintings; eye tracking; eye movement; painting narration; art perception; areas of interest; regions of interest; Markov chain

Introduction

Science and art are often considered to be parallel dis-ciplines with little interaction between the two (Quiroga & Pedreira, 2011); here we provide a scientific perspec-tive on the perception of art, which emerged from a col-laborative project between the University of Manchester and Manchester Art Gallery. Manchester Gallery were specifically interested in understanding the behaviour of their web visitors for curatorial purposes. We explored whether reading a description of an artwork affects the way a person subsequently views it in a controlled study, leading to a richer understanding of how people view art, and a generalizable method that can be used by research-ers in eye-movement research. The method presented can help to answer similar questions about differences in viewing behaviour between groups, when using stimuli where Areas of Interest (AOI) segmentation is challeng-ing.

Art is a unique and subjective perceptual experience (Quiroga & Pedreira, 2011). Although arguably the best context for some forms of art, museums can be difficult for some people to visit, including older people, those who have disabilities, and those who are unable to travel to them. It is also known that the time people spend view-ing artworks decreases as people move through an exhibi-tion, a phenomenon termed “museum fatigue” (Brieber, Nadal, Leder, & Rosenberg, 2014). As it is now possible to view many paintings online, more people can poten-tially access artworks than previously, and can access those works faster. It is known that the context in which art is viewed has an effect on how people evaluate it (Gartus & Leder, 2014). With more and more art con-sumed online, new questions are emerging as to how best to digitally present it. Eye-tracking can play a valuable role in understanding how people perceive art, and has the potential to provide information that can be used to support its curation. Quantitatively analysing gaze data over artwork can be challenging, due to the fact that im-ages are often naturalistic (representing the various col-ours and forms as they appear in nature), and do not gen-erally contain explicit semantic regions that can be la-belled as Areas/Regions of Interest (AOIs/ROIs). In this paper we introduce a method of stimulus segmentation for subsequent data analysis that reduces researcher bias and aids in the segmentation of stimuli with difficult to identify or subjective semantic details. The method is used to quantitatively examine whether presenting a de-scriptive text to people before they view a painting sub-sequently affects their viewing pattern. The text consists of a short description written by a curator or other expert, providing information about the painting; in the current study, we used texts taken from the Art UK website (http://www.artuk.org), which accompany each painting displayed online. The following example is taken from one of the paintings used in the study, entitled, ‘Self Portrait’, by Louise Jopling (Figure 1):

“A frontal bust portrait of the artist as a young wom-an with her hair tied up, wearing a pale coat with white collar and matching hat, set at an angle. At her neck she wears a decorative pink neck scarf. Her skin and features are smoothly and evenly painted, in comparison to her more textured clothes. She is set against a dark plain background.”

This study is the first to consider the impact of a descrip-tive text on subsequent gaze patterns over a painting. As visual scanning is the genesis of aesthetic experience (Massaro et al., 2012), we apply a quantitative method to determine if the presence or absence of a description has any impact on the visual behaviour of participants by using eye-tracking, an established measure of visual at-tention (Borji, Sihite, & Itti, 2013) that has previously been identified as a meaningful method for quantifying how people view artworks (Quiroga & Pedreira, 2011).

The texts examined in the current study are primarily used for describing the stimulus such that it can be searched for within an online collection. We demonstrate that reading such descriptions does not generally appear to affect people's viewing behaviour in terms of the na-ture of fixation frequency or duration, and that whilst transition behaviour between UOIs is generally similar across groups, it appears to vary more when a work is abstract.

Background

Areas of interest (AOIs) are used to identify semantic regions in a stimulus that are of importance to an experi-ment (Holmqvist et al., 2011). It can be challenging to apply these to artwork, due to the non-uniform and natu-ralistic nature of the stimuli, which make it harder to determine how and where to draw the boundaries for AOIs. Our work addresses this issue by segmenting the image into regions using a data-driven clustering algo-rithm, before going on to compare differences in gaze transitions between these areas across two groups. We begin with a review of other work that has used eye-tracking to explore how people view art, and highlight the effect of the environment on how a work is perceived. The second section looks at different methods of segre-gating images to produce areas of interest, to provide context for our approach.

Art and Eye-Tracking

Several studies have used eye-tracking to investigate how people view and interact with art. Bubic, Susac and Palmovic (2014) used eye-tracking to explore how people view images that can represent either a human face, or alternately when inverted (displayed upside down), a still-life image, where no distinct facial components are identifiable. The results showed that in the upright posi-tion people fixated more on the image elements that rep-resent faces, focusing more on the eyes (upper AOI) than in the inverted position.

Gartus, Klemer, and Leder (2015) considered the ef-fect that context has on how people view art, using eye tracking to determine whether perception changed ac-cording to whether works were viewed in a museum or street context. They demonstrated that viewing durations were substantially longer in the museum context than they were in the street context. The study also provided evidence that the context had no impact on ratings of graffiti art, but that modern art received higher ratings for beauty and interest in the museum context.

The effect of context on the “experience” of viewing art has also been considered by Brieber et al. (2014), who determined that viewing art in the context of a museum, as opposed to a laboratory setting, led people to view paintings for longer, and that there was a stronger rela-tionship between viewing time and appreciation in the laboratory context. In both situations the information labels were viewed longer for works with a higher appre-ciation rating.

In a study that explored centre-stage effect (CSE) --the phenomenon that items/options placed in the central position are more popular than those located to the sides --participants were shown three paintings in a row, and asked to select their preferred painting. Findings show that allocation of a substantially larger proportion of gaze to the paintings in the left and centre positions was not associated with preference when the paintings were iden-tical. The fixation duration did, however, predict prefer-ence when the paintings were different. The centre-stage effect was seen in the centre image only when the paint-ings were identical and had a positive valence. They conclude that valence has a greater impact on CSE than gaze allocation.

The authors suggest that the ‘centre stage heuristic’ -- the assumption that the best items are in the centre -- can explain their results. The final fixation was found to be predictive of people choosing the central item, if it exhib-ited positive valence (Kreplin, Thoma, & Rodway, 2014). As acknowledged by the authors, only a subset (22 of 50) participants had their eye-movements measured with an eye-tracker, which may have had an impact on the rela-tionship strength between gaze allocation and preferred painting (Kreplin et al., 2014).

Massaro et al. (2012) found that visual exploration patterns appeared to be affected more by knowledge-driven top-down processes when people are viewing faces, than when they are viewing natural scenes, where the gaze path appears to be driven more by low-level features.

Visual behaviour can also be manipulated by modu-lating the luminance of a painting, to guide people's gaze to a portion of the painting not being directly focused upon (McNamara, Booth, & Sridharan, 2012).

Aesthetic experience is known to be comprised of competing top-down and bottom-up processes (Massaro et al., 2012). A large variability between participants (n=10) viewing figurative paintings was identified by Quiroga and Pedreira (2011), in a study examining how digital manipulation of artworks affects fixation patterns. This variability, attributed to the participants’ individual knowledge and appreciation of the work, made analysis of the subject difficult (Quiroga & Pedreira, 2011) .

Image complexity has also been considered in relation to gaze-behaviour. Complexity relating to pattern detail that changes with scale can be examined with the fractal dimension. Regions of paintings with a higher fractal dimension were fixated on for a longer period than other regions with a lower fractal dimension (Nagai, Oyana-Higa, & Miao, 2007).

Many studies do not use AOIs when analysing gaze over artwork, choosing instead to employ qualitative methods (e.g. observation of gaze plots or heat maps) or statistical methods, for understanding basic distribution of gaze (Brieber et al., 2014; Gartus & Leder, 2014; Kreplin et al., 2014).

A study by Brinkmann et al., (2013) looked at differ-ences in the attention profiles of participants when look-ing at both abstract and representational artwork. The study revealed more diffuse attention for abstract art. They also found that eye-movement patterns varied more for the abstract paintings than the representational paint-ings, pointing to individual image characteristics playing a greater part in structuring attention when compared to socio-demographic factors. The study used a bottom-up approach to define AOIs, which were defined by circles with an area of 90 pixels and a minimum of 5 fixations in the circle per minute.

One study (Kapoula, Daunys, Herbez, & Yang, 2009) did examine the effect that a painting title had on non-realistic cubist paintings. In this study there were 3 exper-imental conditions: 1) no title; 2) participants had to de-cide on a title; 3) participants told the actual title. They discovered that the duration of fixations increased in the group told the painting title relative to the group tasked with coming up with a title for the painting. They also found that the most fixated area for all the paintings was the centre of the painting. There was an increase in sac-cadic amplitude for the group that were told the title of the paintings in the case of one painting, which was at-tributed to additional cognitive processing being required to link the title to the image. The study concluded that the title information did have an impact on the eye-movements and fixation distribution over time (Kapoula et al., 2009). For this study the AOIs were defined for each painting by arbitrarily splitting the image into a grid containing 12 cells, the rationale for which is not dis-cussed in detail in the paper.

Here we consider the impact of painting description, written by experts, on the gaze behaviour of people view-ing artwork.

Defining Areas of Interest

The generation of areas or regions of interest requires researchers to make decisions about how to segment a stimulus. As AOIs usually correspond to semantic items in a scene, they can be very useful for determining which of these items participants focus their interest upon (Holmqvist et al., 2011). AOIs defined by the researcher in a top-down manner can be very useful for answering particular research questions, but they are also subject to bias, as a decision about how to segment the scene will affect the subsequent data analysis process, and may not be optimal. Although it is not possible to eradicate all of the top-down factors that can affect the way people view scenes, such as the semantic dependency of objects or the context of the scene (Borji et al., 2013), it is possible to reduce the potential bias introduced by researcher-imposed segmentation of the scene for analysis purposes. Gridded AOIs are crude in comparison, but allow for content-independent analysis to take place (Goldberg & Kotval, 1999). One of the principal issues with using gridded AOIs is determining the cell size, as this can significantly affect the results (i.e. capture more or fewer fixations in the defined geospatial area).

Bottom-up AOIs have been generated with clustering techniques, using circles and a minimum number of fixa-tions to define the AOI (Brinkmann, Commare, Leder, & Rosenburg, 2013)(Brinkmann et al., 2013; Klein et al., 2014). This is also the case for eye-tracking analysis software, such as Eyetrace (Kübler et al., 2015) that al-lows both user defined top-down AOIs and bottom-up data-driven AOIs using fixation clustering. This is achieved by setting neighbourhood thresholds or using mean-shift clustering. As circles do not tessellate there are gaps between them that exclude fixation data. Where the circles do overlap and do not leave spaces, deciding which cluster fixations belong to which AOIs can be problematic. This can also be compounded by differently sized AOIs that make carrying out comparative analysis challenging. Indeed Klein, (2014) state that they did not analyse the AOI data across paintings due to differences in their gross geometric structure.

Other methods used to segment stimuli include Voro-noi diagrams, fuzzy AOIs and convex hulls. Voronoi diagrams (segmentation of an area into different regions, derived from the distance between predefined points in subsets of the area) divide scenes into cells. The distribu-tion of the cells correlates to the fixation density distribu-tions. This method is predominantly spatial and is analo-gous to fixation clustering (Over, Hooge, & Erkelens, 2006).

Fuzzy AOIs do not have hard borders, and thus rather than taking a ‘hit or miss’ approach, they use a probabil-istic method to determine which AOI (or none) a fixation belongs to (Holmqvist et al., 2011). Fuzzy AOIs can also be of use when data quality is lower, as thresholds can be varied to account for poorer precision (Holmqvist et al., 2011). Convex hulls, which are sets of points in Euclide-an space or on a plane can be used to describe the mini-mum area covered by a cluster of fixations. The convex hull essentially represents the line that encapsulates a set of these points such that the enclosing polygon is convex as opposed to concave (i.e. has no indents). Holmqvist et al. (2011) point out that generating AOIs this way is unsuitable for transitional analysis due to the amount of manual post editing that would be required and the poten-tial for inflated values (when compared to smaller AOIs), caused by the collection of stray data points. The irregu-larly shaped and sized AOIs resulting from Voronoi dia-grams and convex hulls makes quantitative comparison between them complex, and difficult from a statistical perspective.

Orquin, Ashby, and Clarke (2016) describe several recommendations for using AOIs with behavioural eye-tracking studies. These recommendations include using maximum margins around AOIs when there are large distances between objects of importance on the stimulus. This allows fixations related to the object to be included and reduces overlap. By contrast, if the distance between objects is small, a smaller AOI margin should be used. Orquin, Ashby, and Clarke (2016) go on to state that researchers should either choose these AOI margins be-forehand based on the possible overlap, or alternatively do this post-hoc based on data conforming with quality criteria. Finally, they suggest that details of the AOI mar-gins are reported alongside the analysis.

Like (Brinkmann et al., 2013) and (Klein et al., 2014), we use a bottom-up clustering approach, but rather than using this to generate the AOIs directly, we instead apply clustering to determine the cell size for a grid that we then apply to each painting. Details of our approach to AOI definition are provided in the ‘Analysis’ section below.

Methods

A between-subjects experiment was conducted. The main factor was “stimulus presentation”. This had two levels: 1) “no-textual description”, henceforth referred to as the “no-description” condition, where 8 paintings were shown sequentially without any description, and 2) “de-scription” condition, where the same 8 paintings were presented sequentially, preceded by a descriptive narra-tive written by experts presented before each painting. The order of the presentation of the paintings was fixed, and the same for both conditions. Participants were ran-domly allocated to one of the two conditions. Neither of the groups were told the titles of the paintings or given any information concerning the artists. No specific task(s) were given to the participants, allowing them to view the paintings naturally.

Procedure

The experiment was run at two open day events at the University of Manchester, in a quiet controlled environ-ment, in facilities dedicated to the purpose. Participants were given an information sheet to read, and asked to sit in front of a desktop computer with a Tobii X2-60 (https://www.tobiipro.com/siteassets/tobii-pro/user-manuals/tobii-pro-x2-60-eye-tracker-user-manual.pdf/?v=1.0.3) eye-tracker, attached to a monitor with a resolution of 1366 x 768 pixels.

Forty-four participants (with normal or corrected-to-normal vision) who attended the open days volunteered to take part in the experiment. Two participants were ex-cluded due to poor data quality leaving 42 participants, 24 males, 18 females. Before starting the experiment participants signed an informed consent form describing the nature of the study, in accordance with the Universi-ty's ethical procedures.

Once the participant's gaze had been calibrated, they began the experiment. In the ‘no-description’ condition, the paintings were displayed on the screen in sequence for 10 seconds each, and the participant sat and viewed them. In the ‘description’ condition, a written description of the painting appeared on the screen; when the partici-pant had read this, he or she pressed the space bar to view the subsequent painting (also for 10 seconds per paint-ing). Participants in the description condition were also asked a multiple choice on-screen question after viewing all the paintings: “Do you think the text (information about the paintings) changed the way you looked at the paintings?” (yes/no). Participants' gaze was recorded throughout the experiment.

Stimuli

Digital versions of eight paintings were selected by Manchester Art Gallery staff as being representative of their collections, and artwork they would be interested in understanding people's visual perception of. The descrip-tive text for each painting was obtained from the “Art UK” website (http://www.artuk.org/discover/artworks/search/venue:\\ manchester-art-gallery-7282-54853). The paintings con-sisted of 3 landscapes, 2 portraits and 3 abstract pieces. The descriptions of the paintings were written by art experts (Table 1).

Analysis

All the analysis reported here was carried out using the R project for statistical computing, version 3.3.2. (Core R Team, 2014). Note that where effect size (partial eta squared) is reported, it was calculated on untrimmed data. The full code and data is available from (Davies, 2017). The Density-based spatial clustering of applications with noise (DBSCAN) algorithm (Ester, Kriegel, Sander, & Xu, 1996) was used to cluster visual fixations for each of the paintings, using the DBSCAN package (Hahsler, 2016) available for R. The algorithm was selected as it is widely used for cluster discovery, without a requirement to state the number of clusters in advance. This was important, as we did not know a priori where the fixations would be clustered, or how many clusters there would be. Density is determined by counting the points in a specific radius (termed Eps). Where the number of these points exceeds the threshold defined by a value called MinPts, it is considered a “core point” by the algorithm. Noise points are those that are neither core points, nor contain a core point within the Eps radius (Ester et al., 1996). Formally the Eps-neighbourhood (Eps) for a point is defined as N_Eps P = {q ∈ D|dist p, q ≤ Eps}. Points can also be directly density-reachable, defined as p ∈ N_Eps qand N_Eps q ≤ MinPts (Ester et al., 1996). The optimal Eps value was selected for each painting by computing the k-nearest neighbour distances and plotting them in ascending order to visualise the “knee of the curve” (point of curve with significant change) that corresponds to the optimal Eps value. The MinPts value, referring to the minimum number of points that are required to form “core points”, was set to 4. This value was used as the original authors state that k-distance graphs did not alter significantly with values > 4, but did, however, require greater computational effort (Ester et al., 1996).

A grid of squares was then applied to each painting, with the cell dimensions (height and width) set as double the average of the optimal DBSCAN Eps value (radius). Each of the cells represented a Unit of Interest (UOI), for which fixation data was calculated.

To ensure the analysis considered only fixations on the painting itself, we added an offset to the grid using a bespoke algorithm to calculate the position of the paint-ing inside the black container area (Figure 3). Additional-ly as the division of individual cell dimensions into the image space may leave a remainder, we accounted for this by locating the grid in the centre of the image so that any additional space between the area of the grid and that of the painting will be around the edges of the painting. This was done instead of locating the grid in the top left of the image on the assumption that the salient features for a given painting are located centrally rather than pe-ripherally.

As there is an inherent error in gaze accuracy for each eye-tracker, we consider the appropriateness of the size of the cells used in the grid to determine if the cell size is small enough to be impacted significantly by gaze accu-racy error. The units spanned by the visual field (x) can be calculated by first determining the visual angle (!) given the object size (s) and the object distance (d) con-verted from radians into degrees (Equation 1).

Equation 1. Units spanned by the visual field

The smallest of the cell sizes used in this study (1.55°) is greater than the 1 to 1.5° that is suggested as the minimum practical AOI size (Holmqvist et al., 2011). A transition matrix was then constructed for each condition (description and no-description), representing the number of transitions from and to each cell in the grid. This is then converted into a Markov chain representing the probability of transitioning from a given UOI to the same UOI, or to a different one. This technique has been used to compare differences between clinicians making correct and incorrect interpretations of medical scans with the Jensen-Shannon distance (Davies, 2016), and modified by Reani, (2017) to use the Hellinger distance, which is more appropriate for comparing transition behaviour, as it permits values of 0 in the transition matrix. The Hellinger distance (Equation 2) is used to determine the difference between the Markov chains representing each condition and can be used as a proxy for dissimilarity.

Equation 2. Hellinger distance for discrete probability distributions

This value can then be compared against that obtained for two groups of equivalent size, but containing participants chosen at random. By performing this operation 10,000 times using a permutation test (Knijnenburg, Wessels, Reinders, & Shmulevich, 2009), we obtain a distribution of the difference between groups chosen at random, which can then be compared against the difference between the description and no-description groups. This allows a threshold to be set, where “p-values” to the right of the critical value allow the rejection of the null hypothesis. This allows us to see if there is something “special” about the two conditions that is unlikely to be explained by random chance, given that permutation tests are known to be robust against Type I error (Wilcox, 2010).

The individual participants' scanpaths were also com-pared against one another for both conditions (description and no-description). The order in which the participants transition their gaze around the UOIs was represented as a “string” of text. This essentially represents each partici-pant's scanpath around the stimulus in terms of the se-quences of UOIs visited. Although this does not account for the temporal dimension of the scanpath sequence, it does allow comparison between the areas visited. The Levenshtein distance applies a cost for each operation (insertion, deletion and substitution) used to transform one string of text into another (Levenshtein, 1966). Visu-alising the resulting distance in a matrix allows us to rapidly visually compare all participants against each other and detect any outliers, or participants that appear to use similar visualisation strategies. The darker the shade of red the more similar the scanpaths are; the dark-er the shade of blue the less similar they are. Figure 2 shows an example of the Levenshtein distance for the 2 groups (description and no-description) for the “Rhyl Sands” painting, where the participants' scanpaths can be compared with one another in that condition and between conditions. This allows for rapid initial analysis of the spatial and sequential aspects of the participants’ eye-movements as they viewed the paintings. In this repre-sentative example we can see that most of the cells are red in both groups, implying that the sequence of transi-tions is fairly similar for most of the participants.

The visualisations were generated for each stimulus for the two groups (description and no-description). The visualisations allow for rapid high level comparison of the two conditions per painting. As mentioned previously this also makes it easier to identify outliers among the participants for further examination, or possible exclusion from subsequent data analysis. A summary of the Le-venshtein distance results for each painting can be seen in Table 2.

Results

The results indicate that the majority of fixations made for both groups tend to occur in the 100-300 ms duration range, suggesting that relatively short fixations are predominant in both conditions. There were, on average, 893 (SD = 72) fixations in the no-description group and 815 (SD = 97) fixations in the description group. The mean fixation duration for the no-description group was 239 ms (SD = 191) and 227 ms for the description group (SD = 128). Figure 5 shows the fixations for both conditions along with the fixation durations. A two-way mixed ANOVA with trimmed means (γ = 0.2) which was used due to data violating parametric assumptions (Field, Miles, & Field, 2012; Wilcox, 2012), showed that there was no significant difference between fixation counts for the description and no-description groups. There was a significant but small main effect of painting (Q = 4.15, p = .006, η² = .04), which post-hoc pairwise t-tests (Bonferroni correction) showed was significant for “Flask-walk, Hampstead” and “Sir Gregory Page-Turner” paintings (p = .013).

Figure 6 summarises the average fixation durations and number of fixations for each paining for both groups.

The results of the permutation test, with the exception of the painting “Release” did not show any significant dif-ferences in transitions between the groups, resulting in p values greater than .10. The painting “Release” did indi-cate a difference between groups (Hd = 0.859, p = .08), see Figure 7. Although this is not significant for α = .05.

This form of analysis is known to be very robust, and there is thus a very low risk of a type I error.

We examined the recording quality between the two groups to see if this could have impacted on any differ-ences between the groups. Recording quality pertains to a percentage value that is derived from the number of gaze samples identified by the eye-tracking software that is divided by the number of attempts. Problems arising from a failure to detect the eyes and participants looking away from the screen can all contribute to reducing the record-ing quality value. To this end a t-test was carried out comparing the recording quality percentage values for both groups. The difference was not significant t(37) = 0.61, p > .05 suggesting that any differences detected were due to behavioural factors rather than as a result of uneven recording quality between the groups.

Discussion

The majority of the paintings (n=7) did not demonstrate any difference in terms of fixation and transition behaviour. The painting that was associated with a difference between groups was “Release”. As the permutation test is robust against type I error (Wilcox, 2010), this result suggests that there was a real difference between the groups for this painting.

It is notable that this painting lacks distinctive fea-tures differentiating one area of the painting from any other. The text, which provided an explanation of what the features of the painting represent (images of chromo-somes viewed through an electron microscope) could explain why transitional behaviour differs between the groups in this case. Here, the description appeared to provide information that could not be gleaned from the painting itself, and it thus caused people to examine the features of the painting differently. With no salient fea-tures and a fairly uniform pattern, there may not other-wise be cues to drive viewing behaviour.

68% of the participants who read the description thought that it did make a difference to how they subse-quently viewed the painting. A difference in gaze behav-iour was not observed between groups in this experiment, and it would thus be interesting to further explore the potential nature of this difference, if it exists.

Recently there has been a move towards providing descriptive information in a form that departs from for-mal language, using instead descriptions that focus on placing the work in context and describing the artist's intentions (Gail, 2010). The work reported here focuses on detecting whether reading a descriptive text leads to a difference in visual behaviour; future work could system-atically address how different forms and formats of cura-torial narrative affect gaze patterns as well as different types of image.

The method presented here goes some way toward removing biases that can be introduced by researchers when manually defining areas/regions of interest. Alt-hough in the current study it was applied to detecting differences in visual behaviour with paintings, the meth-od could also be applied to other domains. Applying a grid to the stimulus allows for comparison between these uniform spatial Units of Interests (UOIs), defined using a data-driven approach. This could be applied to any type of stimulus that lacks obvious or predefined semantic areas or regions. AOIs are typically added to segment a stimulus in response to a hypothesis. Changing the AOI changes the hypothesis, and adding AOIs after recording data thus equates to formulating a post-hoc hypothesis (Holmqvist et al., 2011). Data-driven UOIs allow the segmentation of the stimuli space into equally sized re-gions based on the clustering of the fixation data. This allows for the easy comparison of such UOIs to deter-mine areas of the stimulus that the participants focus most attention on, and how they move between these areas. This data-driven method allows an unbiased, ex-ploratory approach to interpreting gaze data across a stimulus. To mitigate the arbitrary selection of grid size when using a gridded AOI system, Holmqvist et al. (2011) recommends several different cell sizes be used. As changing the AOI size has an effect on the results, there may be a temptation for researches to do this until they find a statistically significant result, a practice that can be problematic from a scientific perspective (Orquin et al., 2016). The method presented in this work address-es this issue by using a combination of clusters within the gaze data and pragmatic considerations to determine the size of the grid.

Limitations

A primary limitation of this study from the perspective of the domain was the fact that participants were not able to view the descriptive text and painting simultaneously or switch between them, as they would on a website or in a gallery; separating them was necessary to ensure the paintings could be presented in the same way to participants in both conditions. Paintings were viewed for only 10 seconds, and it may thus be the case that, regardless of the description, in this short time the eyes were instinctively drawn to the salient components of the painting, such as buildings, body shapes, facial details etc (bottom-up features). The texts used here were functional, and used for describing an artwork to aid its identification; whilst we can hypothesize other forms of curatorial narrative will not affect viewing behaviour, we cannot be sure. Viewing a painting online is likely to be quite different to viewing in a gallery, where scale and context will affect the experience, so it is not clear whether these results would extend to this scenario.

Conclusions and Future Work

The use of a grid based system with cell size deter-mined by data-driven clustering allows for the creation of units of interest (UOIs), which can serve as a basis for subsequent analysis. The UOIs simultaneously aid in removing the bias introduced by researchers deciding on the size and location of these areas, and allow direct comparison between the units, as they are of equal di-mensions. The study demonstrated that viewing a de-scriptive text has no significant impact on subsequent gaze patterns over a painting, with one exception -- an image that had a relatively uniform pattern with few distinctive features. The effect may also vary according to the type and quality of the textual information provid-ed in these descriptions, and it would be interesting to test this in a future study. Here we examined descriptions -- it may be that different forms of curatorial narrative have a greater affect on viewing patterns. The techniques de-scribed in this study may have a much wider application, as they could also be of use in identifying the effects of descriptive data on viewing behaviour in other domains, such as understanding how patient history shown before a subsequent medical scan affects the way this image is viewed.

Ethics and Conflict of Interest

The authors declare that the contents of the article are in agreement with the ethics described in http://biblio.unibe.ch/portale/elibrary/BOP/jemr/ethics.html and that there is no conflict of interest regarding the publication of this paper.

Acknowledgments

We would like to thank the Engineering and Physical Sciences Research Council (EPSRC) for funding this work through grants EP/K502947/1, EP/L504877/1 and EP/K503782/1-079. We would also like to thank Liz Mitchell and Alex Wood for their help selecting the paintings.

References

Borji, A., D. N. Sihite, and L. Itti. 2013. What stands out in a scene? A study of human explicit saliency judgment. Vision Research 91: 62–77. [Google Scholar] [CrossRef] [PubMed]
Brieber, D., M. Nadal, H. Leder, and R. Rosenberg. 2014. Art in time and space: Context modulates the relation between art experience and viewing time. PLoS ONE 9, 6: 1–8. [Google Scholar] [CrossRef]
Brinkmann, H., L. Commare, H. Leder, and R. Rosenburg. 2013. Abstract Art as a Universal Language? Leonardo 46, 5: 488–489. [Google Scholar] [CrossRef]
Bubic, A., A. Susac, and M. Palmovic. 2014. Keeping our eyes on the eyes: The case of Arcimboldo. Perception 43, 5: 465–468. [Google Scholar] [CrossRef] [PubMed]
Core R Team. 2014. R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing. Available online: http://www.r-project.org/.
Davies, A. 2016. ECG Eye-tracking Experiment 1. The University of Manchester: Manchester. [Google Scholar] [CrossRef]
Davies, A. 2017. Manchester Art Gallery Eye-tracking Analysis. The University of Manchester: Manchester. [Google Scholar] [CrossRef]
Ester, M., H.-P. Kriegel, J. Sander, and X. Xu. 1996. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In KDD-96 Proceedings. pp. 226–231. [Google Scholar]
Field, A., J. Miles, and Z. Field. 2012. Discovering statistics using R. Los Angeles: Sage. [Google Scholar]
Gail, G. 2010. Your labels make me feel stupid. Available online: http://www.webcitation.org/6m77vjVPX.
Gartus, A., N. Klemer, and H. Leder. 2015. The effects of visual context and individual differences on perception and evaluation of modern art and graffiti art. Acta Psychologica 156: 64–76. [Google Scholar] [CrossRef] [PubMed]
Gartus, A., and H. Leder. 2014. The white cube of the museum versus the gray cube of the street: the role of context in aesthetic evaluations. Psychology of Aesthetics, Creativity, and the Arts 8, 3: 311–320. [Google Scholar] [CrossRef]
Goldberg, J. H., and X. P. Kotval. 1999. Computer interface evaluation using eye movements: Methods and constructs. International Journal of Industrial Ergonomics 24, 6: 631–645. [Google Scholar] [CrossRef]
Hahsler, M. 2016. dbscan: Density Based Clustering of Applications with Noise (DBSCAN) and Related Algorithms. Available online: http://cran.r-project.org/package=dbscan.
Holmqvist, K., M. Nystrom, R. Anderson, R. Dewhurst, H. Jarodzka, and J. Van de Weijer. 2011. Eye tracking: A comprehensive guide to methods and measures. Oxford University Press. [Google Scholar]
Kapoula, Z., G. Daunys, O. Herbez, and Q. Yang. 2009. Effect of title on eye-movement exploration of cubist paintings by Fernand Léger. Perception 38, 4: 479–491. [Google Scholar] [CrossRef] [PubMed]
Klein, C., J. Betz, M. Hirschbuehl, C. Fuchs, B. Schmiedtov, M. Engelbrecht, and R. Rosenberg. 2014. Describing art-An interdisciplinary approach to the effects of speaking on gaze movements during the beholding of paintings. PLoS ONE 9, 12: 1–18. [Google Scholar] [CrossRef]
Knijnenburg, T. a., L. F. a Wessels, M. J. T. Reinders, and I. Shmulevich. 2009. Fewer permutations, more accurate P-values. Bioinformatics 25, 12: 161–168. [Google Scholar] [CrossRef] [PubMed]
Kreplin, U., V. Thoma, and P. Rodway. 2014. Looking behaviour and preference for artworks: The role of emotional valence and location. Acta Psychologica 152: 100–108. [Google Scholar] [CrossRef] [PubMed]
Kübler, T. C., K. Sippel, W. Fuhl, G. Schievelbein, J. Aufreiter, R. Rosenberg, and E. Kasneci. 2015. Analysis of eye movements with eyetrace. Communications in Computer and Information Science. [Google Scholar] [CrossRef]
Levenshtein, V. I. 1966. Binary codes capable of correcting deletions, insertions, and reversals. Soviet Physics Doklady. [Google Scholar]
Massaro, D., F. Savazzi, C. Dio, D. Freedberg, V. Gallese, G. Gilli, and A. Marchetti. 2012. When art moves the eyes: A behavioral and eye-tracking study. PLoS ONE 7, 5. [Google Scholar] [CrossRef] [PubMed]
McNamara, a, T. Booth, and S. Sridharan. 2012. Directing gaze in narrative art. Acm Sap 2012, vol. 1, pp. 63–70. [Google Scholar] [CrossRef]
Nagai, M., M. Oyana-Higa, and T. Miao. 2007. Relationship between image gaze location and fractal dimension. Conference Proceedings-IEEE International Conference on Systems, Man and Cybernetics; pp. 4014–4018. [Google Scholar] [CrossRef]
Orquin, J. L., N. J. S. Ashby, and A. D. F. Clarke. 2016. Areas of Interest as a Signal Detection Problem in Behavioral Eye-Tracking Research. Journal of Behavioral Decision Making 29, 2–3: 103–115. [Google Scholar] [CrossRef]
Over, E. a B., I. T. C. Hooge, and C. J. Erkelens. 2006. A quantitative measure for the uniformity of fixation density: The Voronoi method. Behavior Research Methods 38, 2: 251–261. [Google Scholar] [CrossRef] [PubMed]
Quiroga, R. Q., and C. Pedreira. 2011. How do we see art: an eye-tracker study. Frontiers in Human Neuroscience 5, 98: 1–9. [Google Scholar] [CrossRef]
Reani, M. 2017. Scanpath analysis of 2-grams using Hellinger distance. The University of Manchester. [Google Scholar] [CrossRef]
Wilcox, R. R. 2010. Fundamentals of Modern Statistical Methods. Fundamentals of Modern Statistical Methods: Substantially Improving Power and Accuracy, 2nd ed. Springer Verlag. [Google Scholar] [CrossRef]
Wilcox, R. R. 2012. Introduction to Robust Estimation and Hypothesis Testing. Introduction to Robust Estimation and Hypothesis Testing, 3rd ed. Elsevier. [Google Scholar] [CrossRef]

Figure 1. ‘Self-Portrait’, by Louise Jopling (1843-1933).

Figure 2. Levenshtein distance “Rhyl Sands” painting (one partici-pant removed due to poor data quality).

Figure 3. Horizontal and vertical offsets applied to locate the grid in the area occupied by the paintings.

Figure 4. Sample UOI's generated for the “Self-portrait” painting. The 8x8 grid generates 64 UOIs for this painting.

Figure 5. Fixations for both groups with their associated durations (all paintings).

Figure 6. Distribution of distance results of permutation test for the painting “Release”. The vertical bar shows the distance for the descrip-tion and no-description conditions.

Figure 7. Distribution of distance results of permutation test for the painting “Release”. The vertical bar shows the distance for the descrip-tion and no-description conditions.

Table 1. Paintings used as stimuli in the experiment.

Table 2. Summary of Levenshtein distance per painting.

Share and Cite

MDPI and ACS Style

Davies, A.; Reani, M.; Vigo, M.; Harper, S.; Gannaway, C.; Grimes, M.; Jay, C. Does Descriptive Text Change How People Look at Art? A Novel Analysis of Eye—Movements Using Data—Driven Units of Interest. J. Eye Mov. Res. 2017, 10, 1-13. https://doi.org/10.16910/jemr.10.4.4

AMA Style

Davies A, Reani M, Vigo M, Harper S, Gannaway C, Grimes M, Jay C. Does Descriptive Text Change How People Look at Art? A Novel Analysis of Eye—Movements Using Data—Driven Units of Interest. Journal of Eye Movement Research. 2017; 10(4):1-13. https://doi.org/10.16910/jemr.10.4.4

Chicago/Turabian Style

Davies, Alan, Manuele Reani, Markel Vigo, Simon Harper, Clare Gannaway, Martin Grimes, and Caroline Jay. 2017. "Does Descriptive Text Change How People Look at Art? A Novel Analysis of Eye—Movements Using Data—Driven Units of Interest" Journal of Eye Movement Research 10, no. 4: 1-13. https://doi.org/10.16910/jemr.10.4.4

APA Style

Davies, A., Reani, M., Vigo, M., Harper, S., Gannaway, C., Grimes, M., & Jay, C. (2017). Does Descriptive Text Change How People Look at Art? A Novel Analysis of Eye—Movements Using Data—Driven Units of Interest. Journal of Eye Movement Research, 10(4), 1-13. https://doi.org/10.16910/jemr.10.4.4

Article Menu

Does Descriptive Text Change How People Look at Art? A Novel Analysis of Eye—Movements Using Data—Driven Units of Interest

Abstract

Introduction

Background

Art and Eye-Tracking

Defining Areas of Interest

Methods

Procedure

Stimuli

Analysis

Results

Discussion

Limitations

Conclusions and Future Work

Ethics and Conflict of Interest

Acknowledgments

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI