Introduction
Music and Text Reading
The study
of music and text reading,
particularly with eye-tracking
techniques, has developed gradually in recent
years and continues to gain interest.
Despite this, establishing comparisons between
music and language processing is still not simple. In fact, interpretation of
syntax, semantics and phonology cannot be subjectively transferred from one
domain to another (Thompson-Schill, 2013). Additionally, the nature of
hierarchical structures in music and language differs (Asano & Boeckx,
2015) and are not always
in accordance with one
another (Arbib, 2013).
Moreover, it is not yet possible to
establish whether score reading and text reading are functionally independent, particularly in learning
(Madell & Hebert, 2009) or in
high-level cognitive processes. It is for this
reason that the study of eye movements can provide new evidence for better understanding the cognitive mechanisms underlying complex
processes in music and language (Rayner, Chace, Slattery & Ashby, 2006).
Both music and verbal texts are determined
by structural and visual
features that are hierarchically organized. In the particular case of
western music, tonal distances determine the hierarchical organization of
musical structures (Schön et al., 2007). From this perspective the sentence
(small-scale structure) represents, as stated by Sloboda (1974, 1977), a relevant unit of comparison between music and text reading. This author pointed
out that structural markers (“The structural markers define a unit in terms of sequential
rules to obey his constituents [...] The physical markers enable the
definition of a unit before the analysis
of its components” (Sloboda,
1977, p. 117)) are related to musical processing (e.g. spacing of notes,
articulations) and that musicians use both structural and physical markers
when reading music. This suggests that parallels exist
between music and language processing at high levels of abstraction. Additionally,
markers can help the reader extract and connect ideas from the text to larger
units that represent a main theme (Graesser, McNamara, & Louwerse, 2003).
To date most music reading
studies, with some exceptions,
have been based on short musical passages or excerpts, or on the analysis of
low-level processes (Servant & Baccino, 1999). As a consequence, there is little
understanding of how musical information is processed with respect to its more
general context. In fact, large-scale music processing (narrative or discourse
in language) involves the construction of meanings at different levels:
sentence-by-sentence, cumulative and structural (Seyfert et al., 2013).
Furthermore, large-scale structure processing depends on surface information
interpretation as well as global factors that are reliant on the meaning of
symbols (Kinsler & Carpenter, 1995).
Eye Movement Measurements in Music and Text Reading
Different eye movement measurements have
been used to study reading, including the number and duration of fixations, type of fixations
(regressive or progressive) and the amplitude of saccades. Other measurements
are more specific, although not exclusive, to the study of eye movements in text reading. Examples are first-pass and second-pass
measures, which refer respectively to the total duration of fixations in the first and second inspections of the text
(Rayner, 1998; Johansson, Holmqvist, Mossberg, & Lindgren, 2012). Similar
measurements have been used to study the
oculomotor behaviour associated with music reading without the corresponding
motor execution. There is some evidence that fixation durations in language are
shorter than in music reading. Bigand et al. (2010) reported an average
fixation duration of 375 ms for music and two ranges
for language:
225-250 ms for silent reading and 175-325 ms for oral reading. Rayner
& Pollatsek (1997) reported an average fixation duration of 350-400 ms for
music and 200-250 ms for silent reading of English text. However, other studies
have reported longer fixation durations of 377-474 ms (Goolsby, 1994a) and 730
ms (Penttinen, Huovinen & Ylitalo, 2015). Duration of fixations is an
indicator of difficulties in information processing (Baccino, 2004; Baccino &
Colombi, 2001; Bigand et al., 2010;
Goldberg & Kotval, 1999;
Goldberg & Schryver, 1995; Holmqvist,
Holšánová, Barthelson, & Lundqvist, 2003; Rayner, 1998).
Previous research (Sloboda, 1977; Goolsby,
1994b; Kinsler & Carpenter, 1995) has indirectly addressed the comparison
of eye movements in reading music and text without studying this theme
exclusively in single experiments, for example
during silent reading
or sight-reading. In the
present study we directly compare eye movement patterns of music students
performing a silent reading task of music and texts. Physical and structural
markers will serve as boundaries defining
small-scale structures as comparison units between the two
domains.
Silent Reading and Sight-Reading
Silent reading is based on the absence of performance in the case of music, or of reading aloud in the
case of texts. Silent reading studies are more commonly associated with reading
texts (see Rayner, 1998), and we can find a wide range of tasks to evaluate eye
movements while reading. For example: yes/no comprehension questions
after reading sentences including some target words (Ashby & Clifton, 2005); free
reading of text passages and pseudoword lists (Hutzler & Wimmer, 2004). In
contrast, silent music reading studies are not so frequent and the main tasks
described are related to information extraction such as: search through the musical phrases
for errors (Gilman &
Underwood, 2003); recognizing certain musical patterns (Burman & Booth, 2009); indicating
whether a new musical stimulus is the same or different from the preceding one
(Waters, Underwood & Findlay, 1997; Waters & Underwood, 1998); silent reading before
performance/humming (Polanka, 1995); or even tasks assessing personal
strategies for musical stimuli inspection, for example free silent reading
(Penttinen, Huovinen, & Ylitalo, 2013) or careful reading of a score in
order to answer questions about the musical theme presented (Servant &
Baccino, 1999).
The average duration
of fixations in silent music reading studies
using stimuli of four or more bars ranged from 242-2,004 ms (Servant & Baccino,
1999; Gilman & Underwood, 2003; Drai-Zerbib, Baccino & Bigand, 2012;
Penttinen et al., 2013; Drai-Zerbib & Baccino, 2014). In studies that
included fewer than four bars of musical stimuli, fixation duration ranged from
193-309 ms (Waters, Underwood & Findlay, 1997; Waters & Underwood,
1998). Some of the conclusions from silent reading studies are that eye movement patterns depend on the visual features of the stimuli and that musicians use “chunking strategies” similar to those
used for reading
texts (Waters et al.,
1997; Waters & Underwood, 1998). Silent reading also implies high cognitive
load demands (Gilman & Underwood, 2003) and is an important task for
studying high level cognitive processes. It can also inform us about
integrative reading mechanisms (Servant & Baccino, 1999; Drai-Zerbib &
Baccino, 2005).
Gilman and Underwood
(2003) is one of the few studies addressing music reading through
silent reading and music performance. Music stimuli consisted of 3 bars of Bach
chorale excerpts with the tenor part removed. The authors demonstrate that
saccades and fixation durations are longer in a silent reading task (283-388
ms) than in a sight-reading task (284-328 ms). They suggest that these
differences could be explained by the fact that in silent reading musicians do
not need to read normally from left
to right. Furthermore the nature of the silent reading task (error detection)
might encourage note-by-note reading.
Sight-reading is more commonly associated
with the performance of unrehearsed written
music. Few sight-reading studies have analysed
differences in eye movement patterns between different music styles. Weaver
(1943) shows that anticipation capacities are constrained by music texture. There is also evidence
that music structure has an
effect on anticipation behaviour, revealing musicians’ capacity to adapt their eye-hand
span to phrase boundaries (Sloboda, 1974, 1977; Wurtz, Mueri, &
Wiesendanger, 2009). Lehmann and Ericsson (1996) show that sight-reading
achievement can be predicted based primarily on specialized expertise and to a
lesser extent on the number of years of practice. From sight-reading studies we
also know that musicians develop the ability to identify and process groups
of notes or “chunks” (Kinsler & Carpenter, 1995; Polanka, 1995; Waters et al., 1997;
Wolf, 1976); this could be considered
as an element of comparison between the musical and verbal domains
as well as the use of horizontal peripheral vision (Goolsby, 1994b).
Furthermore, it has been shown that “chunking” abilities correlate
with sight-reading performance (Waters,
Townsend, & Underwood, 1998).
Eye Movements as an Indicator of Semantic Integration During Reading
Early theories of eye movement control
(Hochberg, 1976) showed that spatial decision depends on the knowledge of syntax and semantics. Progressive fixations occur when the reader is moving forward to obtain information.
In contrast, regressive eye fixations occur when the reader returns to
previously read text to recover information or improve
comprehension, such as occurs when reading complex sentences (Bigand et al.,
2010). Re-fixations take place in areas that have already been read, ensuring
understanding of syntactic structure from certain indices found in the phrase
(Drai-Zerbib & Baccino, 2005). The number of regressive fixations has been
described as an indicator of difficulties in information integration
(Lévy-Schoen, 1988; Hyönä, 1995; Servant & Baccino, 1999). Furthermore,
regressive fixations are influenced by musical complexity (Goolsby, 1994a).
A quite recent review of the literature
indicates that there are three main hypotheses for regressions in reading
which determine three different, non-exclusive levels of processing: “going
from low-level visuomotor processes, to higher-level word identification and sentence
comprehension processes” (Vitu, 2005, p. 12). Furthermore, analysis of
regressive fixations suggests that there are local and global controls inherent
to the cognitive processes of reading that determine coherency, an essential
property in comprehension (Servant & Baccino, 1999).
In a more recent study, Penttinen et al.
(2013) examine visual processing and verbal descriptions during silent
reading of a music score by different expertise groups (novices and amateur musicians). The authors identified
three different silent-reading styles: accurate processors, accurate
analysers and accurate integrators. Integrator readers use shorter fixation
times than other groups. The work of Servant & Baccino (1999) uses modified
and original musical excerpts from
Beethoven's Bagatelles as stimuli. The authors show that modified versions of
Beethoven's Bagatelles require further global and local information
integration. This was demonstrated by a decrease in the number of regressive
fixations on the target areas (modified from the original). For the authors,
these differences are explained
by the processing of the musical
facts to capture characteristic features of the composer's work, informing the
reader about the mode and place of the intervention of integrative mechanisms
(Servant & Baccino, 1999). These mechanisms serve an important function, to
release working-memory (see next section) from
superficial representations. This allows the reader to obtain an idea, albeit
a limited one, of every word (Just
& Carpenter, 1980, 1987).
Working Memory
The different processes involved in the
integration of information, in order to bind and enrich the new information
with that already stored, are held in the working memory system (Servant &
Baccino, 1999). Baddeley (1990) defines working
memory as a complex system
that temporarily stores and processes information in a series of cognitive tasks. This system consists
of two subsystems, the phonological loop and the visuospatial sketchpad, which
respond to a central executive responsible for the selection and implementation
of control processes. These slave systems are respectively responsible for the
maintenance and the processing of verbal and spatial information. The fourth
component, the episodic buffer, is attentionally controlled by the executive,
combining auditory and visual information (multidimensional code) and serves as an interface between
working and long-term memory (Baddeley, 2003).
To determine the inter-individual
differences in reading texts, Daneman & Carpenter (1980) proposed the
reading span task. This task aims to
measure the capacity of working memory for storing and processing information. The authors observe
correlations between reading comprehension outcomes and the
results of the reading span task, explained by the implementation of common
processes that are linked to the nature of the information triggered (Desmette,
Hupet, Schelstraete, & Van der Linden, 1995). These
findings have been corroborated by several studies
but have also been challenged by others (see Ehrlich, Brébion, & Tardieu,
1994; see also Kane et al., 2004, for an overview).
Comprehension in Music and Language
Verbal comprehension is a process in which the reader interacts with the text
through the use of thought and language (Iser, 1997). It therefore implies the
use of a series of cognitive and metacognitive skills (Van Dijk & Kintsch,
1983) mediated by mastering verbal procedures (Byrne, 2005). Text reading
involves the simultaneous implementation of a series of activities where the
goal is the construction of a representation (Fayol, David, Dubois, &
Rémond, 2000).
Music reading is characterized by a
vertical component, or the simultaneous reading of various musical notes, and
the integration of agogic nuances, such as changes in tempo. Sequential units
are commonly constituted by series of chords represented by a simultaneously or consecutively played
group of notes (Sloboda, 1984; Hébert & Cuddy, 2006).
The literature on music meaning is mainly divided
into two interpretations: that music has a self-significance (Arom,
2000); and that musical comprehension can be defined in consideration of a
broader view of meaning (Patel, 2008; Koelsch et al., 2004). In the first
approach musical meaning can be extracted, for example, from musical form,
while in the second approach the relationship between the different structural
elements of music (intra-musical references) determines musical meaning. The resolution, stability and continuity of those structural
elements allow the large-scale meaning of music to emerge (Koelsch, 2012).
The Present Study
The goal of the present study was to
compare eye movement patterns in music students
during a silent
reading task of verbal texts and scores.
To our knowledge this is the first
study that directly compares these two types of
reading. The comparison is achieved by considering physical and structural
limits of stimuli (small-scale processing) through different musical styles and
types of texts (large-scale processing). We chose a silent reading task to
compare the two domains as it does not engage motor movement. Unlike previous
research, the use of entire, short musical pieces allows us to study information
integration mechanisms through the construction of a global representation of
the stimuli.
Firstly, this study aimed to look for
differences in eye movements in reading music and verbal
texts. We expected longer fixation durations in reading scores, contrasted with
an increased number of fixations in reading texts. For regressive fixations, we
expected a higher number in the scores as a consequence of two-dimensional
processing.
Secondly, we aimed to demonstrate that eye
movements are sensitive to physical and structural markers which determine
different eye movement patterns. These patterns might vary between music styles (tonal and contemporary) and verbal texts
(informative and literary) depending on: (1) the occurrence of integrative
controls (regressive fixations) in both local and global levels, and (2)
the reading stage in which they occur (first-pass and re-reading).
Methods
Participants and Stimuli
Ten Bachelor of Music students studying
functional piano, between the third and fifth year in the Faculty of Arts of
the University of Chile, participated in the study. The participants were all
students of the same piano teacher. Their age varied between 21 and 26 years (M
= 23.4, SD = 1.5), with piano playing
experience between 3 and 11 years (M = 6.05, SD =
2.89) and music reading experience between 2 and 8 years (M
= 4.8, SD = 2.29).
It is important to note that the range of music experience includes both formal and informal experiences. However, informal experiences did not allow the students to advance
to more difficult courses in the Bachelor programme (at least in reading).
Musical background was evaluated by self-reports considering music reading
experience and hours of piano training. This research was conducted
prior to the approval of the review committee of the Faculty of Arts at the
University of Chile. Participants were verbally informed about the purpose and
procedures of the research, as well as the alternatives. All 10 pianists voluntarily agreed to participate in the experiment
without any outside influence, and confidentiality was assured. The authors
complied with ethical practices for the purpose of this research.
Six contrasting styles for both scores
(tonal and contemporary) and texts (informative and literary) were selected. Musical
stimuli were chosen
based on two criteria:
historical (time periods)
and hierarchical organization (tonal vs atonal scores). The
stimuli were extracted from two sources: the Notenmappe, a pedagogical book containing
collections of musical pieces composed in classical-romantic style, and a book
published by the Faculty of Arts of
the University of Chile containing short pieces by contemporary pianists
(Botto, 1965). The contemporary scores were composed by Leni Alexander, a
Polish-Jewish musician who came to settle in Chile as a refugee in 1939. These
musical pieces were created within a
pedagogical context, adopting the concept of dissonance but maintaining similar
construction to traditional counterpoint. Due to the era in which they were
composed (1950s) and the composer’s rejection of tonality, these pieces can be
considered as contemporary but neither experimental nor avant-garde. Scores
were selected with the following criteria: (1) an equivalent number of notes,
bars and semi-phrases; (2) not more than two lines, double stave each (see
Figure 1 and
Table A1). All musical stimuli were full pieces with the exception of one
Notenmappe score (Gavotte).
Verbal texts were selected according to the
age and education of participants. Lexile measures ranging between 1240L and
1470L were used. These levels represent a reading level suitable for students
finishing high school or in their
first year of university (Wright
& Stenner, 1998). The average sentence
and word lengths
were 34.15 and 4.7 characters,
respectively. Literary and expositive texts were chosen as they generally have
contrasting rhetorical structures, paragraph organization and levels of
communication (implicit
vs explicit) (Graesser et al., 2003). The number of small-scale structure units
(musical semi-phrases and sentences) is equivalent between the
musical and verbal stimuli (see
Table A1
and
Table A2).
Eye-Tracking Equipment
Eye movements were monitored by a Tobii T120 eye tracker,
with a sampling frequency of 60 Hz. The stimuli were presented on a 17-inch screen with a resolution of 1024 x 768 pixels. The
distance between the participant’s head and the screen was 600 mm.
Working Memory and Spatial Memory
Reading span test (RST). The working
memory task presented to the participants was a French version (Desmette et
al., 1995; Morlaix & Suchaut, 2015) of the traditional Daneman and
Carpenter (1980) RST. This task, based on the second version, included
true/false statements. The RST was translated from French to Spanish,
the native language of the participants, and administered on a computer.
The entire examination consists of a training block (three sequences of two
extracts each) followed by 30 test phrases. The participants were required to read brief phrases and simultaneously memorize a number. Once each phrase was
read, the participants had to decide if it was coherent or not before going on
to the next phrase. The test phrases are divided into three blocks of 9, 12 and
20 phrases, comprising 12 sequences in total. After
three sequences, one phrase is added so that
sequences 1 to 3 (training level) have two phrases, sequences 4 to 6 (level 2) have three phrases, sequences
7 to 9 (level 3) include four phrases and sequences 10 to 12 (level 4) have five phrases. For the
scoring of the test we took into account the total number of digits recalled in
order to adopt a continuous measure as recommended by Friedman and Miyake
(2005).
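The sequence structure and the continuous scoring described above can be sketched in Python as follows. This is only an illustration: the function names, the example digits, and the rule that a digit counts as recalled whenever it appears among the digits presented in the same sequence (regardless of order) are assumptions, not details taken from the original procedure.

```python
from collections import Counter


def phrases_in_sequence(sequence_number: int) -> int:
    """Number of phrases in a sequence (1-12): two phrases for
    sequences 1-3, then one more phrase every three sequences."""
    if not 1 <= sequence_number <= 12:
        raise ValueError("sequence_number must be between 1 and 12")
    return 2 + (sequence_number - 1) // 3


def total_digits_recalled(presented, recalled):
    """Continuous RST score: total number of digits recalled.

    `presented` and `recalled` are lists of sequences, each sequence
    being a list of digits. Assumption: order within a sequence is
    ignored, and a repeated digit is credited at most as often as it
    was presented.
    """
    total = 0
    for shown, answered in zip(presented, recalled):
        overlap = Counter(shown) & Counter(answered)
        total += sum(overlap.values())
    return total


# Hypothetical data for two sequences (not taken from the study).
presented = [[4, 7], [2, 9, 5]]
recalled = [[7, 4], [2, 5, 8]]
score = total_digits_recalled(presented, recalled)
```

A stricter variant would credit a digit only when it is recalled in its original serial position; the text does not say which rule was applied.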
Corsi block-tapping test (CBT). All
the material and the instructions were presented on a computer screen and each trial began with
a 5-second count-down. The sequence of “blocks” was then displayed in random
order. Once the sequence was finished, the participant repeated the order by
clicking on the blocks with the mouse. The first sequences (training) had two
items, and one item was added every
three sequences until the limit of the participant’s capacities was reached. To
move on to the next level, one of the three presentations had to be successfully completed. The dependent variable
was the total number
of items accurately recalled in the correct serial position. Partial-credit
unit scoring was used to grade the
tests (Conway et al., 2005); specifically, a decimal score was assigned for
tapping the blocks in the correct order and position at each level.
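As a rough illustration of the two CBT measures mentioned above, the following Python sketch computes the total number of items recalled in the correct serial position and a partial-credit unit score, understood here as the mean, over trials, of the proportion of items tapped in the correct position. The function names and the example trials are hypothetical, and the exact scoring script used in the study is not described in the text.

```python
def items_in_correct_position(target, response):
    """Count blocks tapped in the same serial position as presented."""
    return sum(1 for shown, tapped in zip(target, response) if shown == tapped)


def total_items_correct(trials):
    """Total number of items recalled in the correct serial position."""
    return sum(items_in_correct_position(t, r) for t, r in trials)


def partial_credit_unit_score(trials):
    """Mean proportion of correctly positioned items per trial."""
    if not trials:
        raise ValueError("at least one trial is required")
    proportions = [
        items_in_correct_position(target, response) / len(target)
        for target, response in trials
    ]
    return sum(proportions) / len(proportions)


# Hypothetical trials: (presented block order, order clicked by participant).
trials = [
    ([3, 1], [3, 1]),              # 2 of 2 correct
    ([2, 5, 4], [2, 4, 5]),        # 1 of 3 correct
    ([1, 6, 2, 8], [1, 6, 2, 8]),  # 4 of 4 correct
]
total = total_items_correct(trials)
pcu = partial_credit_unit_score(trials)
```

Scoring by proportion gives every trial the same weight regardless of its length, which is why the score is a decimal rather than an integer count.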
Procedure
Participants performed the RST and then, in
another room equipped with an eye-tracker, performed the reading comprehension
test (music and texts). The presentation of each set of stimuli was preceded by
a calibration phase in which participants had to follow a red circle connecting
nine points arranged on the screen. The calibration phase aims at correcting
the variations that can influence the geometrical parameters necessary to
assess gaze directions (Hammoud, 2008, in Holmqvist et al., 2011). The two
types of stimuli were counterbalanced.
Most previous research does not use time limits
when stimuli are longer than 4 bars. However, there are some exceptions (Waters & Underwood, 1998; Penttinen
et al., 2013). In our study,
participants were not under time limits for inspecting the stimuli and were
instructed to keep their eyes on the screen. After reading the texts or scores,
participants had to answer two questions on a sheet of paper. A new calibration
phase preceded the presentation of each new stimulus. This procedure was
repeated until 6 scores and 6 texts had been read. After completing both parts
of the reading test, participants performed the CBT (see
Figure 2). It should
be noted that the CBT was applied at the end of the
session because it has a shorter duration than RST, ensuring better attentional
response at the end of the experiment.
Following the procedure of Servant and
Baccino (1999), comprehension questions were designed to keep the attention of
the musicians while permitting study of their oculomotor behaviour. In order to
discourage reading patterns based on the surface structure of the text, one
question was related to a local feature
(i.e. peculiarities of the stimulus) and the other was related to a global feature
(e.g. styles of music and text). For
each question, participants had to
choose between three possible answers. Examples of local questions are: How
does the first musical phrase finish – Tonic, Subdominant or Dominant? Who is
father Le Paige – a Missionary, a Traveller or an Architect? Examples of global
questions are: What is the character of the piece – calm and restful, incisive
or leggero? How would you describe
the passage – Descriptive,
Fantastic or Historical?
Data Analysis
The algorithm used for data analysis comes
from Tobii Studio software (Tobii standard filter). For texts and music, the
same fixation filter with radius of 50 pixels was adopted in order to
facilitate comparison between reading modalities allowing a balanced measure
for both types of stimuli.
The dependent variables considered for
statistical analysis were the number and duration of fixations, and the number
of regressive fixations. The following independent variables were considered:
(1) integrative controls (intra-sentence/phrase or inter-sentence/phrase); (2)
reading stage (first pass or re-reading). We conducted Student’s t-tests to
analyse simple effects for reading times, number of fixations and durations of
fixations (6 scores + 6 texts). Regressive
fixations data from the selected
stimuli were analysed in a repeated measures design.
Harmonic structure refers to the pitch
dimension of music and the vertical organization of notes (chords); melodic
structure defines the limits of phrases, for example, based
on physical (i.e. space) or structural (i.e. harmonic) markers as noted by Sloboda (1977). Thus, in the present
study sentences in text reading are equivalent to musical phrases, as the
limits are defined by physical markers. In this work the unit of analysis in
text reading is the sentence
delimited by punctuation. In music, the unit
is the musical semi-phrase. Therefore
we will refer to phrase in
the case of music, sentence in the case of text, and sentence/phrase when
referring to both domains.
Criteria for structural analysis of
stimuli. In order to analyse more deeply
the role of regressive fixations in the reading comprehension process and compare
the different reading
strategies used by participants, one stimulus of each style of text and score was selected.
The stimuli selected
meet the following criteria: the highest percentage of correct answers
in the comprehension questions (i.e. scores 70% and texts 80%) and an average
reading time that falls within the 95% confidence
interval of the overall reading time distribution.
Intra-phrase regressive fixations occur
within the same phrase/sentence, that
is to say, in the case of texts between punctuation marks, and in the case of
scores within semi-phrases. However, in the case of scores, semi-phrases in
some pieces sometimes overlapped the next beat, meaning that not all
semi-phrases finished exactly at the bar line (i.e. elision). In these cases,
the boundary established for analysis was the final
beat of the musical semi-phrase.
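The distinction between intra- and inter-sentence/phrase regressive fixations can be sketched in Python as follows. The data model is an assumption made for illustration: each fixation is reduced to a position along the reading order (for example a character or beat index) and to the index of the sentence or semi-phrase it falls in. How the two-dimensional layout of a score was mapped onto such an order in the actual analysis is not specified in the text.

```python
from typing import List, NamedTuple


class Fixation(NamedTuple):
    position: int  # index along the reading order (hypothetical coding)
    unit: int      # index of the sentence or semi-phrase fixated


def classify_transitions(fixations: List[Fixation]) -> List[str]:
    """Label each move between consecutive fixations.

    A move backwards in reading order is a regression: 'intra' if it
    stays within the same sentence/phrase, 'inter' if it crosses a
    boundary. Any other move is labelled 'progressive'.
    """
    labels = []
    for previous, current in zip(fixations, fixations[1:]):
        if current.position >= previous.position:
            labels.append("progressive")
        elif current.unit == previous.unit:
            labels.append("intra")
        else:
            labels.append("inter")
    return labels


# Hypothetical scan path over two units (not data from the study).
scan_path = [
    Fixation(0, 0), Fixation(5, 0), Fixation(2, 0),
    Fixation(8, 1), Fixation(3, 0), Fixation(9, 1),
]
labels = classify_transitions(scan_path)
n_intra = labels.count("intra")
n_inter = labels.count("inter")
```

For elided semi-phrases, the `unit` index would follow the boundary rule stated above, that is, the final beat of the semi-phrase.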
Reading stages (first pass and
re-reading measurements). Earlier stages of reading can be studied with first pass measurements (Clifton, Staub
& Rayner, 2007), while the later stages of reading can be assessed by analysing
second pass measures intended to integrate and store information (Servant &
Baccino, 1999) during
reading. In silent reading it is possible to study both reading stages
more deeply. For texts, the first pass corresponds to the initial set of eye fixations
within the same sentence before
moving on to the next sentence. In the case of music, the first pass comprises
a full bar considering that there must be at least one fixation (in beats) per
unit in each stave (i.e.
treble and bass clef). Re-reading measurements were
defined as all fixations on a region subsequent to its first pass. When a region was not re-read, it was not considered. The criteria for
inter- and intra-sentence/phrase limits in texts (punctuation marks)
and scores were the same as those described in the
preceding section.
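For texts, the split between first pass and re-reading can be illustrated with the short Python sketch below. It assumes that each fixation has already been assigned to a sentence index and that the first pass of a sentence ends the first time the gaze leaves it in any direction; the function name and example sequence are hypothetical. The additional criterion for scores (at least one fixation per unit in each stave within a bar) is not modelled here.

```python
def label_reading_stage(units):
    """Label each fixation as 'first_pass' or 're_reading'.

    `units` lists the sentence index of every fixation in temporal
    order. Fixations on a sentence count as first pass until the gaze
    leaves that sentence; later fixations on it count as re-reading.
    """
    exited = set()
    labels = []
    previous = None
    for unit in units:
        if previous is not None and unit != previous:
            exited.add(previous)
        labels.append("re_reading" if unit in exited else "first_pass")
        previous = unit
    return labels


# Hypothetical sequence of fixated sentences (not data from the study).
units = [0, 0, 1, 1, 0, 1, 2]
stages = label_reading_stage(units)
```

Regions that never receive a `re_reading` label simply contribute nothing to the re-reading measures, in line with the rule that regions not re-read were not considered.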
In order to account for cognitive processes
operating at different stages in reading, we analysed the first pass and the
re-reading information for the selected stimuli. Furthermore the intra and
inter-sentence/phrase level led us to compare local and global controls during
the integration of musical and verbal information. Data were analysed using
MATLAB, which allowed us to study in greater detail the intra- and
inter-sentence/phrase eye movement patterns. For this reason we do not consider
defining areas of interest (AOI) in stimuli to be relevant.
Results
Two analyses were performed. The first
analysis examined overall data (six texts and six scores). The second analysis was performed
using data from four selected stimuli, one for each type of
text (informative and literary) and musical style (tonal and contemporary).
Analysis 1, Overall Data
To analyse the simple effects
between reading texts or
scores (reading modalities and styles), mean comparisons
(Student’s t-test) were performed including the reading times and the number
and duration of fixations.
Reading Times
The reading times of the musical pieces (M = 68.09 s, SD = 16.76 [56.10, 80.08]) and the texts (M = 64.55 s, SD = 17.04 [52.36, 76.75]) did not differ significantly (t(9) = 0.641, p = .54, d = 0.209, 95 % CI [-16.02, 8.95], r = .10), nor were
significant differences in reading time found between musical styles (t(9) = 0.323, p = .75, d = 0.018, 95 % CI [-42.85, 57.12],
r = .009) or text styles (t(9) = 1.47, p = .18, d = 0.251, 95 % CI [-11.76, 2.51], r = .12). These results suggest that
the time spent in silent reading of texts and scores was relatively equivalent,
which supports the closeness between the two tasks.
Eye Fixations
Number of fixations. Participants performed on average half the number of fixations when reading music (M = 80.5, SD = 31.39, 95 % CI [58.05, 102.95]) as compared with reading text (M = 170.68, SD = 39.79 [142.22, 199.15]). Thus, differences between reading modalities were significant (t(9) = 8.38, p < .001, d = 2.516, 95 % CI [65.84, 114.53], r = .78). Although the number of fixations in tonal
music was greater than in contemporary music, the difference did not quite
achieve significance. Moreover, no significant differences were
found between informative and
literary texts, representing a possible explanation for the absence
of difference in reading times (see
Table 1). A significant correlation between the overall
reading time and the overall
number of eye fixations (r(8) = .79, p =
.007) was observed, which is in line with the literature (Kinsler &
Carpenter, 1995).
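The effect size and confidence interval reported for the number of fixations can be approximately recovered from the summary statistics alone, as the Python sketch below shows. It rests on assumptions that the text does not state explicitly: that the comparison is a paired t-test with 9 degrees of freedom, that d is the mean difference divided by the root mean square of the two standard deviations, and that the interval is the mean difference plus or minus the two-tailed 95 % critical t value times the standard error.

```python
import math

# Summary values reported for the number of fixations.
mean_music, sd_music = 80.5, 31.39
mean_text, sd_text = 170.68, 39.79
t_value = 8.38

# Two-tailed 95 % critical value of Student's t for 9 degrees of freedom.
T_CRIT_DF9 = 2.262157

mean_diff = mean_text - mean_music

# Assumed formula for d: mean difference over the RMS of the two SDs.
pooled_sd = math.sqrt((sd_music ** 2 + sd_text ** 2) / 2)
cohens_d = mean_diff / pooled_sd

# The standard error of the difference follows from t = mean_diff / SE.
standard_error = mean_diff / t_value
ci_low = mean_diff - T_CRIT_DF9 * standard_error
ci_high = mean_diff + T_CRIT_DF9 * standard_error

print(f"d = {cohens_d:.3f}, 95 % CI [{ci_low:.2f}, {ci_high:.2f}]")
```

Under these assumptions the recomputed values agree with the reported d and interval to within rounding of the published t value.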
Duration of fixations. We observed
significant differences (t(9) = 14.38, p < .001, d =
3.306, 95 % CI [226.11, 301.04], r = .87)
between texts and scores, with longer fixation durations
for scores (M = 586.52 ms, SD = 87.95 [554.56, 619.08])
than for texts (M = 319.16 ms, SD
= 53.44 [301.4, 338.12]). Regarding the music and text styles, no
significant difference was observed either between tonal and contemporary
music, or between informative and literary styles. On average, however, fixation
duration was longer for tonal music and for literary texts.
Analysis 2, Selected Stimuli
A repeated measures ANOVA for the number of
regressive fixations with processing level (i.e. inter-sentence/phrase or
intra-sentence/phrase) and reading stage (i.e. first pass
or re-reading) as factors was conducted.
Fixation patterns. Broadly speaking,
the number of regressive fixations is higher for music (M = 57.3, SD =
18.11 [44.34, 70.26]) than for texts (M = 52, SD = 20.49 [37.34, 66.66]), although the
difference was not significant (F(1,36) = 0.97, p = .33, ηp2
= .03). In contrast, significant differences were observed between inter- and
intra-sentence/phrase processing levels (F(1,36) = 20.30, p <
.001, ηp2 = .36), with more regressive fixations at the intra-sentence/phrase level
(M = 37.15, SD = 11.56 [22.88, 45.42]) than at the
inter-sentence/phrase level (M = 17.30, SD = 6.14 [12.91, 21.69]). An interaction between
the type of reading (texts/scores) and the processing level (F(1,36) = 30.58, p = .001, ηp2 =
.46) indicates that in music reading, inter-phrase regressive fixations are
more frequent and there was less variation between inter and intra-phrase
levels. By contrast, text reading is characterized by higher intra-phrase
processing levels.
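The effect sizes reported with these F ratios can be checked with the standard relation between partial eta squared and F, ηp2 = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies it to values quoted in the text; the function name is hypothetical and the code is only a consistency check, not part of the original analysis.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F ratios quoted in the text, all with (1, 36) degrees of freedom
reported = {
    "reading type (music vs. text)": 0.97,
    "processing level (inter vs. intra)": 20.30,
    "reading type x processing level": 30.58,
}

for label, f_value in reported.items():
    eta = partial_eta_squared(f_value, 1, 36)
    print(f"{label}: F(1,36) = {f_value:.2f}, partial eta squared = {eta:.2f}")
```

For F(1,36) = 20.30, for instance, the relation gives approximately .36, which agrees with the value reported for the processing-level effect.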
Planned contrasts show significant
differences between the styles of music and texts, with more intra-sentence/phrase
regressive fixations in contemporary music in comparison with tonal music
(F(1,36) = 8.25, p = .006, ηp2 = .19)
and in informative text in comparison with literary text
(F(1,36) = 44.36, p < .001, ηp2 = .55). Regarding
inter-sentence/phrase regressive fixations, we found differences between music
styles (F(1,36) = 4.60, p = .002, ηp2 = .11) but not between text styles (see
Figure 3).
First Pass and Re-Reading
During the first pass (sentences in texts
and semiphrases in scores), the number of regressive fixations
was significantly lower (F(1,36) = 23.32, p < .001, ηp2 = .39) in the scores
(M = 14.3, SD = 3.8 [11.58, 17.02]) than in the texts (M = 36.1,
SD = 12.95 [26.83, 45.39]).
Globally, we observed a decrease in
intra-sentence regressive fixations in texts and an increase in inter-phrase
regressive fixations in scores between the two reading stages analysed. Planned
contrasts showed that the
average number of regressive fixations during re-reading increased
significantly at intra- and inter-phrase levels in contemporary music (F(1,36)
= 17.46, p < .001, ηp2 = .33; F(1,36) = 16.46, p <
.001, ηp2 = .31,
respectively), but only at inter-phrase level in tonal music (F(1,36) =
6.19, p = .017, ηp2 = .15), in contrast with the
intra-phrase level, where differences between reading stages were not significant (F(1,36) = 2.34, p =
.13, ηp2 = .06). In text reading, regressive fixations decreased at
intra-sentence level in both informative and literary styles (F(1,36) = 22.55, p < .001, ηp2 = .39; F(1,36) = 133.24, p < .001, ηp2 = .79, respectively) and increased at
inter-sentence level, but only in literary style
(F(1,36) = 4.17, p = .048, ηp2 = .10). In informative text this
increase was not statistically significant (F(1,36) = 0.44, p =
.51, ηp2 = .01).
In summary, it should be noted that during
text reading, after the first pass, intra-sentence controls decrease
significantly. This suggests
that the integration of information
in text reading is performed sequentially at intra-sentence level, with a
tendency to increase inter-sentence/phrase controls during the re-reading stage.
On the other hand, music reading is characterized by deeper and repetitive
intra-phrase controls. Thus, the first reading of a semi-phrase would not be
enough to obtain the necessary information
for global integration. However, it should be noted that piano notation is
characterized by organization into vertical units (chords) and two staves. This implies
a zigzag eye movement trajectory, described by Weaver (1934), which could well explain the occurrence
of intra-phrase controls.
Individual Strategies
From analysis
of individual strategies, and comparing
this information with the results
previously presented, we conclude that there are no overlapping reading patterns in the two domains. We can therefore say
that reading patterns are directly related to the specific music style or type
of text. However, two main reading strategies can be assumed, one
sequential and the other selective. Neither strategy can be associated with a particular style of music or type of text, as previously
stated. More specific individual profiles are difficult to classify. For example, some musicians did not re-read most of
the stimuli but did re-read one in particular. Others employed few integration
controls and used a selective but non-integrative strategy in contrast with
those who utilized a selective and integrative strategy.
Memory Tests
We found a significant correlation between
the CBT and the number of digits recalled in the RST (r(8) = .76, p <
.01). This suggests a certain independence between visual-perceptual and
analytical-comprehensive information processing.
With the exception of the correlations
between memory tasks and fixation duration in text reading (i.e. RST reaction
times with fixation
duration, r = .66, p = .038), the RST was mostly associated with
variables accounted for by contemporary music. These variables included, in the
storage component of the RST, the average number of fixations in all musical pieces (r = -.86, p <
.001) and fixation duration in contemporary music (r = -.56,
p = .09). In the processing component of the RST, these variables concern
the reading time in contemporary music (r = -.67, p =
.032) and the number of hours of piano practice (r = .65, p =
.044).
In parallel, a link
between spatial memory and regressive inter-sentence/phrase fixations
in literary texts (r = .66, p =
.032) and contemporary music (r =
.70, p = .025) suggests
that some resources may be shared between the two domains.
Discussion
The present study examined oculomotor behaviour in a silent reading task that differed
from other tasks such as
reading music for performance or simply reading aloud (Castelhano & Rayner,
2008). We studied how structural units defined by physical and structural
markers can determine changes in eye movement patterns between different styles
of music (tonal and contemporary) and text types (informative and literary).
These comparisons are based on the reasoning that: (1) there are no differences in the number
and duration of eye fixations
nor in reading times between styles of music and texts; (2)
small-scale structures of the stimuli (sentences and musical semiphrases) are
equivalent in number; and (3) the amount of information in each stimulus
seems to be compensated by the number/duration ratio of eye
fixations, especially between reading types (i.e. scores and texts).
Task Constraints
Globally,
we found some differences between
reading modalities: eye fixations were more numerous in text reading,
and fewer but longer in music reading. These results are corroborated in the
literature (Rayner, 1998; Baccino, 2004; Ahken, Comeau, Hébert, & Balasubramaniam,
2012). In particular, the reading of tonal music and literary texts was
associated with longer eye fixations, while reading of informative texts and
contemporary music was associated with a greater number of regressive
fixations. As expected, we found a higher number of regressive fixations in
the scores than in the texts. We propose that there is a certain independence
between these two variables (duration of fixations and number of regressive fixations) to account for the relation between musical facts and eye-movement
patterns. However, neither the
previously discussed results nor knowledge from previous studies lead us to the
conclusion that there is equivalence between
a particular musical stimulus (tonal or contemporary) and a verbal text
(literary or informative).
Reading Stages and Reading Effectiveness
We found that the efficiency of text
reading as compared to music reading is not
easy to prove if one considers only the earlier stages of the
reading process. In fact, the number of inter-phrase regressive fixations is
higher for music while intra-phrase regressive fixations are more associated with text reading (with a
significant decrease during re-reading in both text types).
We observed a tendency to increase the number
of regressive fixations during the
re-reading stage at an inter-phrase level; however, this increase was not significant in text reading. An inverse
relationship was observed in intra-phrase fixations, illustrated first by a
tendency to increase the number of regressive fixations during the re-reading
stage in the case of contemporary music and secondly by a decrease in
intra-sentence/phrase fixations in both styles of text. This suggests,
firstly, that the effectiveness of the local and global
integrative reading mechanisms
depends mainly on the style of music and text;
and secondly that reading strategies based on the informational
structure of the stimuli (Landragin, 2004), linked with inter-phrase regressive
fixations, can be used to verify the adequacy of certain global elements in the
local context (i.e. at intra-phrase level).
Concerning total reading time, the
association between informative and literary texts suggests that the speed of
information integration responds to the mobilization of common mechanisms in
text reading. This is plausible because the amount of information in the different stimuli appears to be comparable
since both had a similar number of words. Although there is no
a priori evidence
to confirm that literary texts contain less information than informative texts
(see
Table A2), the reading of informative texts does not require additional regressive fixations in the re-reading process. Instead, the
ideas would be extracted sequentially through inter-sentence integration and
with the mobilisation of spatial memory coding (see Baccino & Pynte, 1994;
see also Kennedy, Brooks, Flynn, & Prophet, 2003). Additionally, correlations
between the overall reading time and the overall number of eye fixations are in
line with the literature (Kinsler & Carpenter, 1995).
Reading Patterns and Reading Strategies
Music reading patterns vary according to
the music style. This suggests that the integration of information in the two styles occurs
at different levels.
In the case of contemporary music, a tendency for
inter-phrase and intra-phrase controls was observed. Cave et al. (2008) suggest that if the number of saccades performed
between complex visual patterns is equal to the number of saccades within
each pattern, this would go against a holistic
strategy for stimulus comparison (p. 154). It seems then that reading
contemporary music induces the development of less sequential inspection
strategies. Unlike in tonal music, these patterns may be determined by the
melodic shape of musical motifs (or musical texture from a historical counterpoint perspective) rather than the
sequential development of musical phrases.
In summary, the reading of contemporary
music is characterized by a less sequential path for musical information
integration, while the reading of tonal music is characterized by a more
sequential path for musical information integration determined by harmonic tonal relationships.
These differences are confirmed by analysis of participants’
individual reading strategies. As in previous studies (Penttinen et al., 2013),
we found integrative and analytical strategies for reading music and texts.
However, individual strategies were modified depending on the style of music
and type of text, suggesting the mobilisation of domain-specific skills. In
this regard we provide evidence of a link between
visuospatial capacities and the
reading of contemporary music. Further research is required to address the link between
cognitive skills and the
reading strategies of different musical styles.
Cognitive Skills
In literary texts, there was a greater number of intra-sentence regressive
fixations but almost no re-reading. This suggests that there are
significant differences with respect to the integrative mechanism between texts
and scores. Due to the absence of significant
correlations between the memory tests and text reading-related
variables, it is not possible to determine whether these differences depend on
storage or verbal processing capabilities, or even short-term spatial memory.
Our results suggest that spatial memory is
associated with eye movement patterns according to the style. This follows from the correlations observed
between the average reading time
for contemporary music and
the results in the CBT.
Furthermore, the increased number of inter-phrase regressive fixations
in reading contemporary music could reflect
information processing based on visual and perceptual aspects of the stimulus,
related to the capacities measured by the CBT. In fact, the mobilization of spatial memory capacities seems to facilitate the processing of information in contemporary music reading, thus facilitating global comparisons between
musical motifs (correlations between the CBT and inter-phrase regressive
fixations). This suggests that analytical and visual-perceptual features are
relatively independent in silent music reading. Moreover, in the case of
literary texts, inter-sentence processing is associated with short-term spatial
memory, suggesting a spatial coding of verbal information (Baccino & Pynte,
1994).
From these results
and the observation of individuals’ inspection
strategies we can suggest
that musicians adopt
different strategies according to their cognitive skills, possibly offset by a
more analytical reading. This could be considered if we assume
that the CBT involves executive functions (Klauer & Zhao,
2004; Miyake, Friedman, Rettinger, Shah, & Hegarty, 2001; Vandierendonck,
Kemps, Fastame, & Szmalec, 2004; Thompson et al., 2006).
The fact that the correlations between the RST and the set of all the variables studied were
found mainly in contemporary music
reading suggests that tonal music is not strongly dependent on verbal or
spatial memory capabilities. On the other hand, this could indicate the implementation of different strategies during the reading of
contemporary music. For example, information from the analysis of musical
harmony could temporarily be coded verbally. The coding assigned to each unit
would facilitate comparison of different musical motifs (e.g. descending
motifs). This would mean, for example, that with more efficient processing of
verbal information we should find shorter eye fixations in the reading of contemporary music.
The limitations of this study mainly concern
the adoption of certain methodological commitments that allowed
us to compare the two domains, for example the absence
of a time limit for inspection of the stimuli, the definition of the comparison units and
the choice of eye fixation filters.
The results of the present study could help
to clarify the relationship between music and verbal language and
to define units of comparison between the two domains. They could also contribute to the
definition of reading comprehension strategies and their relationship to
cognitive abilities and skills developed by musicians.