Designers of new musical instruments are often concerned with accessibility, both for users with no previous musical experience and for those already trained on another instrument, so that they can easily adapt existing techniques or learn new ones. Several claims regarding the optimal pitch layout of new musical instruments or interfaces have been made, but as yet there is little empirical investigation of the factors that may enhance or disturb learning and performance on these devices. Our previous conference paper detailed the impact of adjacency and shear on pitch accuracy for the transfer task [1]; in this paper, we take a more comprehensive approach by also considering timing accuracy, and the training and retention tasks.
1.1. Isomorphic Layout Properties
Since the nineteenth century, numerous music theorists and instrument builders have conjectured that isomorphic pitch layouts provide important advantages over the conventional pitch layouts of traditional musical instruments [2]. Indeed, a number of new musical interfaces have used isomorphic layouts (e.g., Array Mbira [6], Thummer [7], AXiS-49 [8], Musix Pro [9], LinnStrument [10], Lightpad Block [11], Terpstra [12]).
An isomorphic layout is one where the spatial arrangement of any set of pitches (a chord, a scale, a melody, or a complete piece) is invariant with respect to musical transposition. This contrasts with conventional pitch layouts on traditional musical instruments; for example, on the piano keyboard, playing a given chord or melody in a different transposition (e.g., in a different key) typically requires changing fingering to negotiate the differing combinations of vertically offset black and white keys.
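The transpositional invariance described above can be illustrated with a minimal sketch. It assumes a hypothetical button lattice in which one step along the first axis raises the pitch by 2 semitones and one step along the second raises it by 7 (a Wicki-like assignment chosen only for illustration); the function names and coordinates are likewise assumptions and are not taken from the study.

```python
# Hypothetical isomorphic layout: pitch is a linear function of lattice coordinates.
STEP_I = 2  # semitones per step along the first lattice axis (assumed)
STEP_J = 7  # semitones per step along the second lattice axis (assumed)


def pitch(i, j):
    """Pitch, in semitones, relative to the button at lattice position (0, 0)."""
    return STEP_I * i + STEP_J * j


# A fingering "shape": button offsets relative to the chord's root button.
MAJOR_TRIAD_SHAPE = [(0, 0), (2, 0), (0, 1)]


def intervals(shape, root):
    """Intervals (semitones above the root) produced by a shape placed at `root`."""
    ri, rj = root
    return [pitch(ri + di, rj + dj) - pitch(ri, rj) for di, dj in shape]


# The same shape yields the same intervals wherever it is placed.
print(intervals(MAJOR_TRIAD_SHAPE, (0, 0)))   # [0, 4, 7]
print(intervals(MAJOR_TRIAD_SHAPE, (3, -2)))  # [0, 4, 7]
```

Because the pitch function is linear, the pitch of the root cancels out of every interval, which is why a shape keeps its musical meaning under transposition.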
Isomorphic layouts also have elegant properties for microtonal scales, which contain pitches and intervals “between the cracks” of the piano keyboard [13]. Although strict twelve-tone equal temperament (12-TET) is almost ubiquitous in contemporary Western music, different tunings are found in historical Western and in non-Western traditions. Isomorphic layouts may, therefore, facilitate the performance of music both within and beyond conventional contemporary Western traditions.
One elegant property relevant to non-standard tunings is that, unlike the piano keyboard, isomorphic layouts do not have an immutable periodicity in their spatial structure. On the piano keyboard, only scale systems that repeat every twelve pitches can be intuitively mapped to its keys. Conversely, isomorphic layouts provide consistent spatial representations of scales regardless of their periodicity. This matters because there are many useful tuning systems that do not repeat every 12 chromatic pitches, such as meantone tunings in 19-TET or 31-TET, which are suitable for conventional Western music but provide better approximations to just intonation than the standard 12-TET; Bohlen-Pierce scales, which repeat every 13 equal divisions of the 3/1 tritave (instead of the standard 2/1 octave); the Javanese Pelog system, which is often approximated by a 7-pitch scale in 9-TET; and numerous other scale systems [14].
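As a rough numerical illustration of the tuning claims above, the sketch below measures how closely an equal division of a period approximates a just interval. It is only a sketch under stated assumptions: it considers a single interval (the just major third, 5/4), it picks the nearest step rather than following a meantone mapping, and the function names are hypothetical. Other intervals, such as the perfect fifth, behave differently.

```python
import math


def step_size_cents(divisions, period=2.0):
    """Size, in cents, of one step when `period` is divided into equal steps."""
    return 1200.0 * math.log2(period) / divisions


def nearest_step_error(ratio, divisions, period=2.0):
    """Signed error (cents) of the equal-division step closest to a just ratio."""
    step = step_size_cents(divisions, period)
    target = 1200.0 * math.log2(ratio)
    return round(target / step) * step - target


# Error of the closest approximation to the just major third (5/4).
for divisions in (12, 19, 31):
    print(divisions, round(nearest_step_error(5 / 4, divisions), 2))

# Step size of 13 equal divisions of the 3/1 tritave (Bohlen-Pierce).
print(round(step_size_cents(13, period=3.0), 1))
```

For this one interval, the absolute error shrinks from 12-TET to 19-TET to 31-TET.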
In this paper, we do not compare isomorphic and non-isomorphic layouts. Instead, we focus on how different isomorphic layouts impact on learning. This is because there are an infinite number of unique isomorphic layouts (and a large number that are practicable for conventionally tuned diatonic-chromatic music): they all share the property of transpositional invariance (by definition) but they differ in a number of other ways that may plausibly impact their usability. For example, successive scale pitches, such as C, D, and E, are spatially adjacent in some isomorphic layouts while in others they are not; additionally, in some isomorphic layouts, pitches are perfectly correlated to a horizontal or vertical axis while in others they are not [15]. With respect to the instrumentalist, the “horizontal” axis runs from left to right, the “vertical” axis from bottom to top or from near to far. In some layouts, octaves may be vertically or horizontally aligned; in others, they are slanted. Properties such as a vertical pitch axis or a vertical octave axis, or adjacent major seconds, and so forth, may be conjectured as desirable (or undesirable): either way, they are typically non-independent because changing one (e.g., pitch axis orientation) may change another (e.g., octave axis orientation). Choosing an optimal layout thus becomes a non-trivial task that requires knowledge of the relative importance of the different properties. However, due to their non-independence, it is challenging to investigate the relative importance of these features experimentally.
To address this, the experiment presented in this paper explores how two independent spatial transformations of isomorphic layouts—shear and adjacency—impact on learning in a set of melody retention and transfer tasks. The shear is used to manipulate the angles of the pitch axis and major second axis, while keeping the octave axis constant; the adjacency manipulation determines whether or not major seconds are spatially adjacent. These two transformations enable us to test our hypotheses that adjacent major seconds and a vertical pitch axis facilitate the learning and playing of melodies.
The four layouts that result from these transformations are illustrated in Figure 1a–d. Each figure shows how pitches are positioned, and the orientation of three axes that we hypothesize will impact on the layout’s usability. Each label indicates whether the layout has adjacent major seconds (A) or not, and whether it is sheared (S) or not. The three axes are the pitch axis, the octave axis, and the major second axis, as now defined (the implications of these three axes, and why they may be important, are detailed in Section 1.1.2).
The pitch axis is any axis onto which the orthogonal (perpendicular) projections of all button centres are proportional to their pitch; for any given isomorphic layout, all such axes are parallel [16] (see the caption for Figure 1 for a practical demonstration of how this works).
The octave axis is here defined as any axis that passes through the closest button centres that are an octave apart.
The major second axis (M2 axis, for short) is here defined as any axis that passes through the closest button centres that are a major second apart.
When considering tunings different to 12-TET (e.g., meantone or Pythagorean), alternative—but more complex—definitions for the octave and M2 axes become useful.
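The pitch-axis definition above can be made concrete with a short sketch. It assumes a hypothetical layout in which buttons sit at integer combinations of two lattice vectors `u` and `v`, with each step along `u` raising the pitch by `a` semitones and each step along `v` by `b` semitones (and `a` and `b` not both zero). The pitch axis direction `w` is then the solution of a 2×2 linear system. The vectors and names below are illustrative assumptions, not the geometry used in the experiment.

```python
import math


def pitch_axis(u, v, a, b):
    """Unit vector w such that w.u : w.v equals a : b (solved by Cramer's rule)."""
    det = u[0] * v[1] - u[1] * v[0]
    if det == 0:
        raise ValueError("lattice vectors must not be parallel")
    wx = (a * v[1] - b * u[1]) / det
    wy = (b * u[0] - a * v[0]) / det
    norm = math.hypot(wx, wy)
    return (wx / norm, wy / norm)


def projection(i, j, u, v, w):
    """Orthogonal projection onto w of the button at lattice position (i, j)."""
    x = i * u[0] + j * v[0]
    y = i * u[1] + j * v[1]
    return x * w[0] + y * w[1]


# Assumed lattice: major seconds (2 semitones) run horizontally, fifths (7) run up-right.
U, V = (1.0, 0.0), (0.5, 1.0)
W = pitch_axis(U, V, 2, 7)

# Every button's projection onto W is the same multiple of its pitch.
for i, j in [(1, 0), (0, 1), (-1, 2), (3, 1)]:
    print((i, j), projection(i, j, U, V, W) / (2 * i + 7 * j))
```

In this assumed lattice the octave (one step back along `u`, two steps along `v`) lies directly above the starting button, while `W` leans away from vertical, so the pitch axis is slanted even though the octave axis is vertical.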
1.1.1. Adjacent (A) or Nonadjacent Seconds
Scale steps (i.e., major and minor seconds) are, across cultures, the commonest intervals in melodies [17
]. It makes sense for such musically privileged intervals also to be spatially privileged. An obvious way of spatially privileging intervals is to make their pitches adjacent: this makes transitioning between them physically easy, and makes them visually salient. However, when considering bass or harmony parts, scale steps may play a less important role. This suggests that differing layouts might be optimal for differing musical uses.
The focus of this experiment is on melody so, for any given layout, we tested one version where all major seconds are adjacent and an adapted version where they are nonadjacent (minor seconds were nonadjacent in both versions). Both types of layouts have been used in new musical interfaces; for example, the Thummer (which used the Wicki layout; Figure 1b) had adjacent major seconds, while the AXiS-49 (which uses a Tonnetz-like layout [18]) has nonadjacent seconds but adjacent thirds and fifths.
1.1.2. Sheared (S) or Unsheared
We conjecture that having any of the above-mentioned axes (pitch, octave, and M2) perfectly horizontal or perfectly vertical makes the layout more comprehensible: if the pitch axis is vertical or horizontal (rather than slanted), it allows for the pitch of buttons to be more easily estimated by sight, thereby enhancing processing fluency. Similar advantages hold for the octave and M2 axes: scales typically repeat at the octave, while the major second is the commonest scale-step in both the diatonic and pentatonic scales that form the backbone of most Western music.
However, changing the angle of one of these axes requires changing the angle of one or both of the others, so their independent effects can be hard to disambiguate. A way to gain partial independence of axis angles is to shear the layout parallel with one of the axes: the angle of the parallel-to-shear axis will not change while the angles of the other two will. A shear is a spatial transformation in which points are shifted parallel to an axis by a distance proportional to their distance from that axis. (For example, shearing a rectangle parallel to an axis running straight down its middle produces a parallelogram; the sides that are parallel to the shear axis remain parallel to it, while the other two sides rotate.) As shown by comparing Figure 1a with Figure 1c, or by comparing Figure 1b with Figure 1d, we used a shear parallel with the octave axis to create two versions of the nonadjacent layout and two versions of the adjacent layout: each unsheared version has a perfectly horizontal M2 axis but a slanted (non-vertical) pitch axis; each sheared version has a slanted (non-horizontal) M2 axis but a vertical pitch axis. In both cases the octave axis was vertical.
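The effect of shearing parallel to a vertical octave axis can be sketched numerically. The lattice vectors, the shear factor of 1/3, and all names below are assumptions chosen so that the arithmetic is exact; they are not the dimensions of the layouts used in the experiment. The sketch shifts each button vertically in proportion to its horizontal position, then checks that button height becomes proportional to pitch (a vertical pitch axis) while the octave stays directly above its starting button.

```python
from fractions import Fraction

# Assumed unsheared lattice: a major-second step (+2 semitones) is horizontal,
# a fifth step (+7 semitones) goes up and to the right.
U = (Fraction(1), Fraction(0))
V = (Fraction(1, 2), Fraction(1))
SHEAR = Fraction(1, 3)  # assumed factor that makes the pitch axis vertical here


def shear_vertical(point, s):
    """Shear parallel to the vertical axis: shift y in proportion to x."""
    x, y = point
    return (x, y + s * x)


def button_position(i, j, u, v):
    """Position of the button i steps along u and j steps along v."""
    return (i * u[0] + j * v[0], i * u[1] + j * v[1])


def pitch(i, j):
    """Pitch in semitones relative to the button at (0, 0)."""
    return 2 * i + 7 * j


U_SHEARED = shear_vertical(U, SHEAR)
V_SHEARED = shear_vertical(V, SHEAR)

# After the shear, height divided by pitch is the same for every button.
for i, j in [(1, 0), (0, 1), (2, 1), (-1, 2)]:
    height = button_position(i, j, U_SHEARED, V_SHEARED)[1]
    print((i, j), height / pitch(i, j))

# The octave (i = -1, j = 2) is vertical both before and after the shear.
print(button_position(-1, 2, U, V), button_position(-1, 2, U_SHEARED, V_SHEARED))
```

The major-second step, horizontal before the shear, acquires a vertical component afterwards, which matches the trade-off described above: the shear straightens the pitch axis at the cost of slanting the M2 axis.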
In this investigation, therefore, we remove any possible impact of the octave axis orientation; we cannot, however, quantitatively disambiguate between the effects of the pitch axis and the M2 axis.
Unsheared layouts are common in new musical interfaces because these typically use buttons arranged in a perfectly square or hexagonal array; we are not aware of a hardware interface that makes use of shear to make the pitch axis vertical or horizontal (although this is a design feature of the software MIDI sequencer Hex [15]).
1.2. Motor Skill Learning in Music Performance
Learning a new musical instrument requires a number of gross and fine motor skills in order to physically play a note. This is often carried out in tandem with sensory processing of feedback from the body and of auditory features (e.g., melody, rhythm, timbre) in order to learn how to play specific sequences [19]. For the purposes of our experiment, by using musically-trained participants and sequences familiar to those musicians such as scales and arpeggios, we reduce this to a motor learning problem. How best can musicians learn to play on a new pitch layout?
In learning a motor skill there are three general stages [20]:
A cognitive stage, encompassing the processing of information and detecting patterns. Here, various motor solutions are tried out, and the performer finds which solutions are most effective.
A fixation stage, when the general motor solution has been selected, and a period commences where the patterns of movement are perfected. This stage can last months, or even years.
An autonomous stage, where the movement patterns do not require as much conscious attention on the part of the performer.
Essentially, learning the motor-pitch associations of a new instrument requires the performer to perceive and remember pitch patterns. Once these pitch patterns are learned, the performer becomes more focused on eliminating various sources of motor error. Because achieving motor autonomy is a lengthy process, one that can seldom be captured by short-term experiments, our current study focuses on only the first two stages of motor learning.
Learning a pattern of actions and their associated responses can be affected by pre-existing action-response representations: essentially, the anticipated effects of an action have an influence on the performance of that action; for example, reaction time is faster when participants are instructed to press a button forcefully and this elicits a loud tone, rather than when the effect is not compatible with the action (e.g., a soft tone) [21]. Therefore, it may also hold that pre-existing expectations of the pitch effects of a sequence of actions may have an influence on the performance of that sequence.
Research into the Spatial-Musical Association of Response Codes (or SMARC effect) demonstrates not only a vertical alignment (increasing pitch height is mapped vertically from low to high), but also a horizontal alignment (increasing pitch height is mapped horizontally from left to right) in musically trained participants [22]. This horizontal effect is far more subtle in non-musicians [23] and in some cases non-existent [24], suggesting that musical training enhances this particular spatial dimension. It is posited that this may be a learned-association effect [26]. These pitch representations have been shown to influence motor planning and action. Keller and colleagues found that, for a sequence of three consecutive keypresses, timing was more accurate when the produced tones were compatible with the pre-existing associations that increasing vertical movement results in an increase in pitch height [27]. This appears to be evident across different levels of expertise (non-musicians and trained musicians), although, as expected, training enhances the strength of this existing representation [29]. We investigate only 2-dimensional pitch layouts, so do not consider the implications of Shepard’s helical model of pitch perception [30], which requires a cylindrical (hence 3-dimensional) form [31].
The tendency in the pitch-motor representation literature has been to reverse or scramble pitches from the traditional down-to-up or left-to-right assignment. Although many new pitch layouts may not violate this basic learned pitch-motor association, adjustments to the learned general motor pattern may still be required depending on the spacing of intervals, and the precise orientation of the pitch axis. Stewart and colleagues [26] demonstrated an effect on reaction time in a task using “normal” versus “stretched” representations of pitch along a horizontal axis (sequences which did or did not correspond to a learned pattern of movement that could be played with the fingers of a single hand). This suggests that, despite their similarity to other layouts (both “normal” and “stretched” satisfied the left-right horizontal sequence), the patterns of notes may have fundamentally changed for the performer, and so require a certain amount of motor learning in this new (but clearly related) task.
It seems plausible then that certain aspects of a new layout, within the realm of satisfying the vertical and horizontal SMARC effects, will facilitate such learning, while others may hinder it. These aspects may relate to (a) previously learned pitch-motor mappings; (b) ergonomic issues, such as the physical ease of making the motions required to play the target pitches; and (c) processing fluency, such as how easy it is to see or sense, by proprioception, musical features that are relevant to the task. As detailed in Section 1.1.1 and Section 1.1.2, in this experiment, we focus on the last of these and, in particular, on two musical attributes that are important for melodies and two spatial attributes that have a plausible impact on processing fluency. The musical attributes are major seconds (important because of their prevalence in melodies and musical scales) and pitch height. The spatial attributes are verticality (we hypothesize that perfectly vertical, or horizontal, lines are easier to imagine than are slanted lines) and adjacency (we hypothesize that it is generally easier to find a spatially adjacent pitch than one that is separated). The experimental manipulation, therefore, involves participants learning and playing pitch layouts with vertical versus slanted pitch axes, and adjacent versus nonadjacent major seconds.
To test how well the participants have learned the new layouts and perfected their motor pattern, we are particularly interested in the transfer of learning from one task to another. For instance, a piano player will practice scales not only to achieve good performance of scales, but also to fluently play scale-like passages in other musical pieces. In our study, we designed a training and testing paradigm for the different pitch layouts such that the transfer task involved a previously unpracticed, but familiar (in pitch) melody.
1.3. Study Design
For this experiment, we were interested in examining how features of a pitch layout affected performance accuracy in the learning of a new motor pattern, how this skill was retained at test immediately after training, and how accurately it transferred to a new, untrained task. Musically experienced participants played three out of the four layouts under consideration (see Figure 1): all 24 participants played both adjacent layouts (unsheared and sheared), with 12 participants each playing one of the two nonadjacent layouts.
The independent variables were:
Adjacency, where 0 is the code for a layout with nonadjacent major seconds, and 1 is the code for a layout with adjacent major seconds.
Shear, where 0 is the code for an unsheared layout, and 1 is the code for a sheared layout.
LayoutNo , where 0 is the code for the first layout played by a participant, 1 is the code for the second layout they played, and 2 is the code for the third and final layout they played.
PerfNo , where 0 is the code for their first performance of a given layout, 1 is the code for their second performance of a given layout, 2 is the code for their third performance of a given layout, 3 is the code for their fourth performance of a given layout. Note that participants gave three performances for the training, two performances for the immediate retention tasks, and four performances for the transfer task.
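Each participant played the layouts in one of four different sequences, and each such sequence was played by 6 participants:
Unsheared adjacent, then unsheared nonadjacent, then sheared adjacent.
Unsheared adjacent, then sheared nonadjacent, then sheared adjacent.
Sheared adjacent, then unsheared nonadjacent, then unsheared adjacent.
Sheared adjacent, then sheared nonadjacent, then unsheared adjacent.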
This means that the nonadjacent seconds layouts were always presented second, and that participants who started with the unsheared adjacent layout finished with the sheared adjacent layout, and vice versa.
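The ordering constraints just described can be checked with a small sketch. It encodes each layout as a hypothetical `(Adjacency, Shear)` pair using the 0/1 codes defined above, builds every sequence in which an adjacent layout comes first, a nonadjacent layout second, and the remaining adjacent layout third, and then counts how many participants meet each layout. The data structures and function names are assumptions for illustration only.

```python
from collections import Counter
from itertools import product

# Layouts as (Adjacency, Shear) codes: 1 = adjacent / sheared, 0 = otherwise.
ADJACENT = [(1, 0), (1, 1)]
NONADJACENT = [(0, 0), (0, 1)]
PARTICIPANTS_PER_SEQUENCE = 6


def build_sequences():
    """Adjacent layout first, nonadjacent second, the other adjacent layout third."""
    sequences = []
    for first, second in product(ADJACENT, NONADJACENT):
        third = next(layout for layout in ADJACENT if layout != first)
        sequences.append((first, second, third))
    return sequences


def participants_per_layout(sequences, per_sequence):
    """Number of participants who play each layout across all sequences."""
    counts = Counter()
    for sequence in sequences:
        for layout in sequence:
            counts[layout] += per_sequence
    return counts


SEQUENCES = build_sequences()
COUNTS = participants_per_layout(SEQUENCES, PARTICIPANTS_PER_SEQUENCE)

print(len(SEQUENCES))                       # number of distinct sequences
for layout, count in sorted(COUNTS.items()):
    print(layout, count)
```

Under these constraints there are exactly four sequences, each adjacent layout is played by all 24 participants, and each nonadjacent layout by 12.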
In each such layout, participants received an equivalent training and testing program: first for the C major scale, then for arpeggios of all triads in C major. The scale task was used to support the learning of the spatial patterns of seconds in the diatonic scale; the arpeggios to support the learning of the spatial patterns of larger intervals such as thirds and fourths in the diatonic scale. Immediate retention (performance without any audiovisual training) was tested after each task. The transfer task required participants to perform a well-known melody (Frère Jacques) for which they had received no prior training. This melody contains numerous major and minor seconds but also larger intervals. Participants were given 20 s to practice before their performances were recorded. These procedures are further detailed in Section 2.2, Section 2.3 and Section 2.4.
Participants’ preferences were elicited in a semi-structured interview, a detailed analysis of which is available in [1]. The current paper will fully describe the results of the performances of training and testing materials (both retention and transfer tasks), assessed for their inaccuracy in terms of number of incorrect notes as well as the timing of the performed notes in comparison to either the audiovisual sequence (training) or the metronome beat (retention and transfer).