The Effect of Isomorphic Pitch Layouts on the Transfer of Musical Learning †

Featured Application: The results obtained in this paper are applicable to the design of new musical instruments intended to facilitate the learning and playing of music.
Abstract: The physical arrangement of pitches in most traditional musical instruments, including the piano and guitar, is non-isomorphic, which means that a given spatial relationship between two keys, buttons, or fretted strings can produce differing musical pitch intervals. Recently, a number of new musical interfaces have been developed with isomorphic pitch layouts where these relationships are consistent. Since the nineteenth century, it has been widely considered that isomorphic pitch layouts facilitate the learnability and playability of instruments, particularly when a piece is transposed into a different key; however, prior to this paper, this had not been experimentally tested. To address this, we investigated four different pitch layouts to examine whether isomorphism facilitates retention and transfer of musical learning within and across keys. Both non-musicians and musicians were tested on two training tasks, two immediate retention tasks, and a transfer task. Each participant played every task on two distinct layouts: one an isomorphic layout (Wicki or Bosanquet), the other a minimally adjusted non-isomorphic version. For musicians, isomorphism was found to facilitate transfer of learning to a novel task; for non-musicians, the results were mixed. This study provides insight into features that are important to music instrument design.


Introduction
Music instrument learning is a domain commonly associated with numerous social, emotional and cognitive benefits. These include increased self-esteem [1,2], increased sense of social identity [3,4], and the provision of a medium for self-expression [5]. Cognitive benefits include improved fluid intelligence [6], increased IQ [7], slowed cognitive decline [8] and increased executive functioning in older adults [9].
In an age where computers are so prevalent and technological advances are constant, it seems desirable for musical instruments also to advance so as to increase accessibility to the many benefits of music learning. Indeed, a number of new music interfaces, such as the Thummer [10], AXiS-49 [11], LinnStrument [12], and Lightpad Block [13] have been developed over the last decade. So far, however, none of these have achieved widespread or "mainstream" acceptance. This may be due to a lack of clear and agreed upon guidelines as to what the important features of successful musical instruments (traditional or digital) are [14]. Paine and colleagues [10] suggested that desirable features to control on musical instruments include pitch, dynamics, articulation, attack/release, and vibrato. "Touchability", that is, the ability to make a physical connection with an interface, is another desirable trait of musical interfaces, relating primarily to the user's ability to control the quality of an instrument's sound.
However, before aspects of sound quality can be accurately manipulated, deciphering and memorising the placement of pitches is a crucial first step in learning how to play any given instrument (traditional or digital). MacRitchie and Milne [15] argued that learnability and playability are important features of instrument design in terms of their influences on how fast a new motor control program is devised by the learner. Learnability, in this instance, refers to how easy it is to understand and acquire the skills required to play an instrument, that is, to understand and remember the patterns of musical pitches contained within a certain layout. Playability refers to how easy it is to execute the physical motions involved in playing an instrument.
These two aspects are related to the first two of three stages of motor learning described by Schmidt and Lee [16]. In the cognitive phase, patterns are appraised and potential motor techniques are tested to determine which solution is optimal. In the fixation stage, the optimal motor solution identified in the previous phase is practised and eventually mastered. When the autonomous stage is reached, less conscious attention is required for the performer to execute the motor skill. This study focuses on the cognitive phase of motor skill development, a phase that is, according to the above characterisation, foundational.
Two features of pitch layouts that have been empirically tested so far for their effects on learnability and playability are major second adjacency (whether pitches a major second apart are next to each other or not) and shear (whether the pitch axis is vertical or slanted) [15]. The results showed that the accuracy of musicians' performances on a series of tasks was higher on layouts with adjacent major seconds than on layouts with non-adjacent major seconds. The results also hinted at an overall positive effect of a vertical pitch axis on accuracy (in an isomorphic layout, the pitch axis is the direction in which pitches ascend [17,18]; see Figures 5 and 7 for examples). However, this latter effect was inconsistent across the different tasks within the study. Furthermore, only isomorphic pitch layouts were assessed.
An isomorphic pitch layout is one in which each given spatial relationship between two note-playing "triggers" (e.g., keys, buttons, or fretted strings) always produces the same musical interval. A non-isomorphic layout is one where a given spatial relationship between two pitch triggers can produce differing musical pitch intervals. For example, the ubiquitous piano keyboard is non-isomorphic in that playing two adjacent white keys can produce either a whole tone or a semitone, depending on the actual pair of keys pressed (other examples are provided in [15]). The hypothesised advantage of isomorphic layouts is that once a given musical sequence (such as a chord, scale, melody, or entire piece) is learned in one key, it should be straightforward to play the same sequence in any other key; all that is required is to replicate the same spatial pattern (hence using the same fingering), starting from a different location on the instrument [19].
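The definition above can be made concrete with a minimal sketch, which is illustrative only and not part of the software used in this study. It assumes that a layout is represented as a mapping from hypothetical (x, y) button coordinates to pitches in semitones; the function name `is_isomorphic` and both example layouts are assumptions introduced here.

```python
from itertools import combinations


def is_isomorphic(layout):
    """Return True if every spatial step always yields the same interval.

    `layout` maps (x, y) button coordinates to pitches in semitones.
    """
    interval_for_step = {}
    for (pos_a, pitch_a), (pos_b, pitch_b) in combinations(layout.items(), 2):
        step = (pos_b[0] - pos_a[0], pos_b[1] - pos_a[1])
        interval = pitch_b - pitch_a
        # The first time a step is seen, remember its interval; afterwards,
        # any different interval for the same step breaks isomorphism.
        if interval_for_step.setdefault(step, interval) != interval:
            return False
    return True


# Illustrative linear grid (assumed numbers, not the exact Wicki geometry):
# +2 semitones per step right, +5 semitones per step up.
linear_grid = {(x, y): 2 * x + 5 * y for x in range(6) for y in range(4)}

# One octave of piano white keys laid out in a single row.
white_keys = {(i, 0): p for i, p in enumerate([0, 2, 4, 5, 7, 9, 11, 12])}

print(is_isomorphic(linear_grid))  # True
print(is_isomorphic(white_keys))   # False
```

On the white-key row, a one-key step to the right gives two semitones in some places and one semitone in others, which is exactly the inconsistency that the definition rules out.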

Current Study
The above advantages of isomorphic layouts have been proposed since the nineteenth century [20][21][22]; however, despite this long history and the recent development of multiple music interfaces utilising isomorphic pitch layouts (e.g., Array Mbira [23], Thummer [14], AXiS-49 [11], Musix Pro [24], LinnStrument [12], Lightpad Block [13], Terpstra [25]), no study has directly compared isomorphic and non-isomorphic layouts. To address this long-standing, but untested suggestion, in this study, we investigated whether the learnability and playability of a musical instrument increase with the use of isomorphic pitch layouts both within and, crucially, across keys. In doing so, we aimed to extend the research done by [15] to clarify cognitive and sensory-motor aspects of playing instruments in order to inform the design of musical instruments that facilitate learning.
Participants were tested on two variants of two well-established pitch layouts: the Wicki layout [20] and the Bosanquet layout [21,26]. In each case, one version of each type of layout was isomorphic (i.e., in its standard form), while another was slightly adjusted to make it non-isomorphic. The resulting four layouts are detailed in Section 2.2 and Figures 5-8. These layouts are also all broadly congruent with the SMARC (spatial-musical association of response codes) effect. The SMARC effect is a phenomenon described by Rusconi et al. [27] whereby pitch is mapped on a spatial representation such that high pitches correspond to upward right-hand positions and low pitches correspond to downward left-hand positions. This effect is similar to the SNARC (spatial-numerical association of response codes) effect described by Dehaene, Bossini and Giraux [28], which maps numbers in the same way. The pitches on the Wicki-like and Bosanquet-like layouts ascend in an upwards direction and a rightwards direction, respectively. While congruency with this effect was not examined in this study, it does have implications for the learnability of pitch layouts. The layouts tested in this study were therefore chosen to be congruent with the SMARC effect in order to eliminate it as a confounding variable. All layouts also had adjacent major seconds. This is a factor that, as previously mentioned, has already been established to impact learnability and playability [15], so, like the SMARC effect, it is helpful that the chosen layouts eliminate this as a potential confound.
A comparison of the two broad layout types (that is, Wicki vs. Bosanquet) was not of primary interest in this study; instead, the use of two different layout types facilitated an exploration of whether or not any general benefits of isomorphism occurred. The Wicki layout has been used previously in the development of the Thummer musical interface and in Wicki-Hayden concertinas. The Bosanquet layout has been examined with reference to its isomorphic properties since 1877 [21] (here, Helmholtz specifically suggested that the arrangement of keys on Bosanquet's keyboard reduces the number of chord forms a player must learn), and has been implemented in various slightly different forms in a variety of instruments, such as the Jankó keyboard.
As in MacRitchie and Milne [15], this study utilised a series of tasks on a multi-touch tablet (see Figure 1 in Section 2.2) involving exercises typically used in the teaching of musical instruments, namely scales and arpeggios. These exercises are useful in music learning because they relate to how musical pieces are structured, and they introduce learners to the different intervals between pitches. For example, scales are made up of major and minor seconds, which are the most common intervals found in Western melodies. Similarly, arpeggios are made up of major and minor thirds, and perfect fourths, which occur frequently in harmonies and bass lines, respectively. The scale and arpeggio tasks served to teach participants where pitches lie in relation to each other on the novel musical interface. These tasks were taught using an audiovisual demonstration (training tasks), and tested immediately after training (immediate retention).
Another task involved participants playing a well-known but unlearned melody (transfer task). This task assessed transfer of learning by examining how participants applied their learning of intervals in the scale and arpeggio tasks to a novel sequence. It is important to assess this aspect of learning considering the amount of time and effort required to learn a new motor skill: it is pertinent to know how well the newly learned skill can be applied outside the context in which it was originally acquired. Musically, the ability to apply information learned in exercises such as scales and arpeggios is important for learning pieces written by other composers and for composing music oneself. This is also important when transposing music into another key. Transposition is often required when harmonising different instruments, such as the clarinet and the flute. This is because the clarinet is (most commonly) a B♭ instrument, so when a C fingering is applied on the clarinet, a B♭ is sounded. To account for such skills, this task also required participants to play the melody both in the key in which they were trained for the scale and arpeggio tasks and in an additional, untrained key.
This study included both musicians and non-musicians. This was useful to allow testing of the initial learnability of the layouts of interest. That is, this allowed the assessment of learning for those who were not transferring or modifying their existing motor skills, but were developing them in their first form. This also facilitated exploration of differences across levels of expertise, that is, whether or not particular layouts were easier for beginners and whether this changed with increased experience. The testing of non-musicians was also conducted to determine the generalisability of the findings to a wider population. Finally, this study also made use of semi-structured interviews to allow a greater depth of assessment of factors affecting learning.
Given previous suggestions that isomorphic pitch layouts are advantageous to music learning due to their property of transpositional invariance, it was expected that the accuracy of participants' performances would be higher for tasks completed on isomorphic layouts than those completed on non-isomorphic layouts (H1). It was also expected that the accuracy of participants' performances in the untrained key would be lower than that of those completed in the trained key (H2). However, as spatial relationships between pitches are maintained on isomorphic layouts when modulating to a new key, it was expected that the accuracy of participants' performances in an untrained key on the non-isomorphic layout would be lower than those in an untrained key on the isomorphic layout (H3).
To acknowledge some of the other factors relevant to learning tasks, such as practice effects (as observed in the previous study [15]), secondary predictions were made. Due to the differences in previous development of relevant motor skills, it was expected that accuracy of musicians' performances would be higher than that of non-musicians' performances (H4). It was also expected that accuracy of participants' performances would increase with each performance given and each new layout learned (H5). Finally, it was expected that accuracy would exhibit a ceiling effect where further improvement was harder to achieve (H6).

Participants
All participants gave their informed consent for participation in the study, which was approved by the Human Research Ethics Committee of Western Sydney University (H12504).
Sixty-eight participants were recruited (46 males, 22 females). Participants ranged in age from 18 to 69 (M = 24.88, SD = 9.38). Twenty-six of these were musicians and forty-two were non-musicians. Musicians had five or more years of formal training on a musical instrument or equivalent (M = 6.12, SD = 1.32 years). Piano was the primary instrument played by musicians (54%). Other instruments played by musicians included the drums, guitar, violin, and flute. Non-musicians had less than five years of formal training on a musical instrument (M = 2.07, SD = 1.5 years) and played the drums, guitar, bongo, recorder, or had singing lessons. Thirty-nine participants (57%) stated they were Australian with the remainder of participants primarily ranging in origin from Asia, Europe, or the Middle East. Non-musicians were recruited via Western Sydney University's online recruitment system, SONA, and were given course credit for their participation. Musicians were recruited via email through personal contacts, fliers, and social media and were paid $40 for their participation.
Data from one participant was discarded as they made multiple statements in the experimental session that indicated that they did not understand the instructions being provided.

Materials
The experimental tasks were completed on a Microsoft Surface Tablet. The tasks were presented in the program Hex 2.1.1 (a version specially adapted for multitouch, pictured in Figure 1), and performances were recorded and saved as MIDI files using a Max patch (Cycling '74, Max 7), which was also used to program the layouts and control the tasks that participants were presented with. Examples of the scales, arpeggios, and melody sequences are provided in Figures 2-4. Examples of the layouts on which participants played are provided in Figures 5-8; these are screenshots of the software, here overlaid with note names that were not visible to the participants.
Figure 5. The isomorphic Wicki layout. The red and green lines show the respective paths traced out by two octaves of the B and E major scales. The blue arrow shows the pitch axis: if the layout is rotated to make the pitch axis vertical, the pitch of every button is proportional to its height.
Figure 6. The non-isomorphic Wicki-like layout. The red and green lines show the respective paths traced by two octaves of the B and E major scales. Note how, compared with the isomorphic version, after every two rows the pitches are cumulatively shifted one button to the right. This means that the same spatial relationship will produce inconsistent interval sizes. For example, the button "northeast" of G4 is a perfect fourth (5 semitones) higher (C5); the button northeast of C5 is a perfect fifth (7 semitones) higher (G5). These types of inconsistencies mean there is no consistent pitch axis.
Figure 7. The isomorphic Bosanquet layout. The red and green lines show the respective paths traced by two octaves of the B and F major scales. The blue arrow shows the pitch axis: it is perfectly horizontal, so the pitch of every button is proportional to its left-right position.
Figure 8. The non-isomorphic Bosanquet-like layout. The red and green lines show the respective paths traced by two octaves of the B and F major scales. Note how, compared with the isomorphic version, after every 12 columns the pitches are cumulatively shifted northeast. This means that the same spatial relationship will produce inconsistent interval sizes. For example, the button two steps to the right of F4 is a major third (4 semitones) higher (A4); the button two steps to the right of G4 is a minor third (3 semitones) higher (B♭4). Such inconsistencies mean there is no consistent pitch axis.

Procedure
Each participant completed three phases of experimental tasks involving (1) scales, (2) arpeggios, and (3) the melody Frère Jacques on two different layouts. These three phases were completed on both an isomorphic and a non-isomorphic version of either a Wicki-like or a Bosanquet-like layout. Those who were assessed on the Wicki-like layouts completed all of the training and retention tasks in either B major or E major and the transfer task in both keys; hence, one key was untrained in the transfer task. Those who were assessed on the Bosanquet-like layouts completed the training and retention tasks in either B major or F major and the transfer task in both keys; hence, one key was untrained in the transfer task. The choice of layout type and training key was randomised and balanced, and the order in which the layouts were presented was counterbalanced across participants.
These tasks proceeded in a manner similar to that of the study by MacRitchie and Milne [15]. The scale and arpeggio phases comprised a training task and an immediate retention task. The training tasks involved participants watching one audiovisual demonstration of either a scale (phase 1) or an arpeggio (phase 2), played at a tempo of 90 bpm on the tablet. Participants then played the scale/arpeggio on the tablet at the same time as the audiovisual demonstration a further three times. Each audiovisual demonstration was preceded by eight metronome beats. The immediate retention tasks involved participants giving two performances of the previously trained sequence without any audiovisual support. A metronome beat at 90 bpm was provided during these tasks as a timing reference.
Phase 3 involved a transfer task to test the generalisation of learning. Participants listened to an audio recording of a well-known but unlearned melody (Frère Jacques) in either the same key in which they had completed the previous tasks or in an untrained key. The first note of the melody was then indicated to participants by the experimenter. Following this, participants were given either 20 s (musicians) or 60 s (non-musicians) to explore the layout in order to determine the correct notes, timing, and rhythm of the melody. After the onset of a 90 bpm metronome beat indicating the end of the exploration time, participants were required to give two performances of the melody. In the same manner, participants then completed the transfer task again in the alternate key. The order in which keys were presented in this phase was counterbalanced across participants. Similarly, the keys in which participants completed the first two phases were also counterbalanced.
As in [29], following the experimental tasks, 66 out of 68 participants completed a semi-structured interview with the experimenter. These were audio-recorded and transcribed. The interview consisted of three primary questions, with follow-up questions asked when necessary to further clarify meaning. The primary questions were:

• Did you have a preference for either layout?
• What did you prefer about that layout?
• Do you have any other comments about the layouts or the tasks?

Data Analysis
Data analysis and modelling were performed using the brms package (version 2.6.0) [30,31] in RStudio (version 1.1.456) and R (version 3.5.1). The scripts and data are available at https://osf.io/6vgph/. Preliminary data screening and processing were performed using MATLAB (version R2016b) by examining how well the pitches of the notes performed by participants matched the template performance of the task. To establish which notes of the performed sequence in any of the tasks were "correctly" pitched, we used the Note Time Playing Path software [32] (version 0.1), which uses a windowing process to identify where extra, skipped, or substituted (wrongly pitched) notes occur. For performances where there were large numbers of pitch errors (more than five), this matching process was visually confirmed. Of the original 1876 performance files (67 participants × 2 layouts × 14 performances) over the three experimental phases, 1752 were retained for data analysis (93.4%). Sixteen performance files failed to be recorded as a result of computer error. A further 16 files were empty because participants had failed to produce any performance for the task. Ninety-two files were deleted as the performed data could not be reconciled with the relevant templates of ideal performances. This occurred when files contained too many errors (e.g., too many extra, skipped, or substituted notes as compared to the template performance, such that the Note Time Playing Path software could not continue to track any correct notes) or tracked too few notes (e.g., when the participant had played only one or two notes in total in a performance).

Accuracy: Training Tasks
MATLAB was also used to calculate the dependent variable, performance accuracy, henceforth denoted Accuracy. Our measure of Accuracy accounted for both whether the correct pitch was played, and how accurate its timing was. Some changes were made to the calculations detailed in MacRitchie and Milne [15] in order to ensure that Accuracy was strictly bounded in the open unit interval, that is, 0 < Accuracy < 1.
In the training tasks, the participant played along with the audiovisual highlighted sequence. Here, a "correct" performance constituted playing the correct pitches at the correct times in the sequence, that is, matching the sequence being produced aurally and visually onscreen. For each target note in the audiovisual demo, a window of 666 ms (the interonset interval at 90 bpm), centred around the target note, was created. The first performed note within this window to match the target's MIDI note number was used. In this way, inserted notes that did not match the expected MIDI note were not penalised if the "correct" note was played at some point in the time window. If no matching MIDI note number was identified within this window, a zero score was allocated for this target note in the sequence. The first matching MIDI note number was then assigned a score reflecting its timing accuracy: a value of 1 if played at the same time as the target, linearly reducing to 0 as the timing error increased to the boundary of the window (±333 ms). The resulting scores for each performance were summed to a total performance score, denoted Score, and then normalised into a final Accuracy value, as detailed in (1).
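For the training tasks, the calculation was
$$\mathrm{Score} = \sum_{n=1}^{N} s_n, \qquad s_n = \begin{cases} 1 - \dfrac{|t_p - t_n|}{333} & \text{if a matching note was performed within } t_n \pm 333, \\ 0 & \text{otherwise,} \end{cases} \qquad \mathrm{Accuracy} = \frac{999}{1000} \cdot \frac{\mathrm{Score}}{N} + \frac{1}{2000}, \tag{1}$$
where
• N is the total number of notes in the target sequence.
• t_n is the time in milliseconds of the nth target note.
• t_p is the time in milliseconds of the first performed note within the t_n ± 333 time window that matches the pitch of the target note.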
Note that the ratio of 999/1000 is arbitrary, its purpose merely being to ensure that exact zeros and exact ones cannot occur. This ratio very slightly reduces the range of possible values [33] so that exact zeros were increased by 0.0005, exact ones were reduced by 0.0005, and all numbers in-between were changed proportionately.
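The training-task scoring described above can be sketched as follows. This is a minimal illustration rather than the MATLAB code used in the study: the function name, the (onset, MIDI note) tuple representation, and the example data are hypothetical, and the final rescaling (multiplying by 999/1000 and adding 0.0005) is an assumption inferred from the stated behaviour at exact zeros and exact ones.

```python
def training_accuracy(targets, performed, half_window_ms=333.0):
    """Score a play-along performance against its target sequence.

    `targets` and `performed` are lists of (onset_ms, midi_note) tuples.
    """
    performed = sorted(performed)  # earliest onsets first
    score = 0.0
    for t_n, target_note in targets:
        # First performed note in the window whose pitch matches the target;
        # non-matching notes inside the window are simply ignored.
        t_p = next(
            (onset for onset, note in performed
             if note == target_note and abs(onset - t_n) <= half_window_ms),
            None,
        )
        if t_p is not None:
            # 1 for perfect timing, falling linearly to 0 at the window edge.
            score += 1.0 - abs(t_p - t_n) / half_window_ms
    # Assumed rescaling from [0, 1] to [0.0005, 0.9995].
    return 0.999 * score / len(targets) + 0.0005


# Hypothetical data: one note on time, one wrong note followed by a late
# correct note, and one target note that is never played.
targets = [(0, 60), (667, 62), (1333, 64)]
performed = [(0, 60), (700, 61), (767, 62)]
print(round(training_accuracy(targets, performed), 4))
```

In this example the wrong note (61) carries no penalty of its own because the correct note (62) still arrives inside the window, whereas the missing third note contributes a zero.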

Accuracy: Retention and Transfer Tasks
In the immediate retention and transfer tasks, the participant played the test sequence along with a 90 bpm metronome. Here, a "correct" performance constituted playing the correct notes in the expected sequence as before but with one caveat related to penalising errors. Because these tasks were accompanied only aurally by a metronome beat, the performer might add an extra note or leave a gap so that all subsequent notes were then played one metronome beat late. In situations such as this, it is reasonable to penalise the first late note, but not the subsequent notes, which are then correct subject to a delay. The same argument holds if a performer were to play two or three extra notes before returning to the correct sequence of pitches. Only the extra notes should be penalised. Similarly, a performer might skip a note, or multiple notes, hence all subsequent notes would be one or several metronome beats early. As before, it is reasonable only to penalise the first pitch error. The final penalty was then calculated by the total number of such "first" errors, designated by the letter E in (2). The calculation used to summarise the overall accuracy was (2), where
• N is the total number of notes in the target sequence.
• C is the total number of notes in the corrected performance.
• t_c is the time in milliseconds of the cth note in the corrected performance.
• t_m is the time of the metronome beat closest to the performed note t_c.
• E is the number of "first errors" calculated.

Semi-Structured Interviews
Recordings of the semi-structured interviews were transcribed into Microsoft Word (2011). Using the Review function in Microsoft Word, statements were then coded under two pre-determined themes, Layout preference and Reason for layout preference, which were effectively defined by the first two interview questions. Statements relevant to the third interview question were also coded. Following this initial coding process, the codes were assessed for commonalities in order to identify further themes and sub-themes. Interview transcripts were independently coded by a third party for the purpose of minimising bias. Codes identified by the two independent coders were then compared, with disagreements discussed and resolved by both parties. Finally, the numbers of statements identified for each theme and sub-theme were calculated in order to assess their relative prominence.
Because Accuracy is a real number in the open unit interval, it was modelled with a beta distribution, which is a standard choice for such dependent variables. A logit link was used to transform the Accuracy score from the unit interval to the full real line, thereby making it suitable for linear prediction. This means that the exponential of any effect gives the multiplicative factor by which the ratio Accuracy/Inaccuracy of a performance changes, where Inaccuracy is defined as 1 − Accuracy, which also implies that Accuracy = (Accuracy/Inaccuracy) / (1 + Accuracy/Inaccuracy) (in Tables A1-A6, we show the linear effect in the "Estimate" column and its exponential in the "Eff.Fact" column). For example, for an effect of 0.5, a unit increase in the corresponding predictor results in a multiplicative increase in Accuracy/Inaccuracy by e^0.5 = 1.65, that is, an increase in Accuracy/Inaccuracy of 65%. More concretely, imagine an expected Accuracy at a given set of conditions of 0.6; this implies that Accuracy/Inaccuracy = 0.6/0.4 = 1.5. If a predictor (whose effect is 0.5) is increased by one unit, keeping all else the same, the model tells us that the resulting Accuracy/Inaccuracy ratio would now have an expected value of 1.65 × 1.5 = 2.48; hence, the expected Accuracy would become 2.48/(1 + 2.48) = 0.71.
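The worked example above can be reproduced with a short sketch. The function name `shift_accuracy` and its parameters are hypothetical and serve only to illustrate how an effect on the logit scale translates back into Accuracy.

```python
import math


def shift_accuracy(accuracy, effect, delta=1.0):
    """Expected Accuracy after a predictor with a given logit-scale effect
    is increased by `delta` units, all else being equal."""
    odds = accuracy / (1.0 - accuracy)            # Accuracy/Inaccuracy
    new_odds = odds * math.exp(effect * delta)    # multiplicative change
    return new_odds / (1.0 + new_odds)            # back to the unit interval


print(round(shift_accuracy(0.6, 0.5), 2))  # 0.71
```

Because the change is multiplicative on Accuracy/Inaccuracy, the same effect produces a smaller absolute change in Accuracy when the starting value is already close to 0 or 1.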

Bayesian Regression
An important advantage of Bayesian regression is that, given the observed data and a prior distribution (see Section 2.5.2 for a discussion of priors), it calculates the whole posterior probability distribution of each predictor's effect rather than only a point-estimate of each predictor's most probable effect. This allows for credibility intervals to be calculated; unlike the confidence intervals in classical regression, credibility intervals have a straightforward and intuitive meaning: the 95% credibility interval of an effect is the interval that we can be 95% certain contains the effect's true value. It also allows evidence ratios to be calculated; these are the odds (probability ratios) in favour of directional hypotheses (such as a given effect being greater than zero). For example, if the integral of the posterior distribution over the interval (0, ∞) is p, the evidence ratio in favour of the effect being greater than 0 is p/(1 − p); so, if the lower boundary of a (one-sided) 95% credibility interval is precisely zero, this implies that there is a 5% probability the effect is less than zero and a 95% probability it is greater than zero; hence, the evidence ratio is 0.95/0.05 = 19.
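As an illustration of the evidence ratio defined above, the following sketch estimates it from a set of posterior draws by counting the proportion that exceed zero. The function name and the toy draws are assumptions made for this example; they are not the brms routines used in the analysis.

```python
def evidence_ratio(posterior_draws, threshold=0.0):
    """Odds that the effect exceeds `threshold`, estimated from draws."""
    above = sum(1 for draw in posterior_draws if draw > threshold)
    below = len(posterior_draws) - above
    # p / (1 - p) simplifies to above / below; guard against division by zero.
    return float("inf") if below == 0 else above / below


# Toy posterior: 95% of the draws are positive, 5% are negative.
draws = [0.3] * 95 + [-0.1] * 5
print(evidence_ratio(draws))  # 19.0
```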
To qualify the weight of evidence for or against any given hypothesis (e.g., that an effect is greater than 0), we followed the guidelines proposed by Jeffreys (1961 as cited by [35,36]) in which evidence ratios of 1-3 represent no evidence for the tested hypothesis; evidence ratios of 3-10 are "moderate" evidence for the hypothesis; evidence ratios of 10-30 are "strong" evidence; and evidence ratios above 30 are "very strong" evidence.
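These guidelines amount to a simple lookup, sketched here. The function name is hypothetical, and the treatment of the exact boundary values (3, 10, and 30) and of ratios below 1 is an assumption, since the guidelines as quoted do not specify them.

```python
def jeffreys_label(evidence_ratio):
    """Map an evidence ratio to the qualitative labels quoted above."""
    if evidence_ratio > 30:
        return "very strong"
    if evidence_ratio >= 10:   # assumption: boundary values go to the higher band
        return "strong"
    if evidence_ratio >= 3:
        return "moderate"
    if evidence_ratio >= 1:
        return "no evidence"
    # Assumption: ratios below 1 favour the opposite directional hypothesis.
    return "favours the opposite hypothesis"


for ratio in (2.0, 8.5, 19.0):
    print(ratio, jeffreys_label(ratio))
```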
The convergence of each such model was assessed with Rhat values (in all cases, these values were acceptable, as can be seen in the tables). Posterior predictive checks showed that each model predicted data with a distribution similar to that of the observed data; furthermore, every model had a high Bayesian R-squared value [37] (0.89, 0.89, 0.95, 0.94, and 0.94 for the five respective performance tasks).

Priors
The priors used in all of our models are what are termed weakly informative priors [38]. These reflect a minimal amount of knowledge that we possess about the effects' sizes, as now detailed. All of our predictors were close to having a scale of approximately 1 standard deviation (ranging from 0.5 SDs for the binary variables to just over 1 for PerfNo), while the dependent variable had a standard deviation of approximately 0.25. This immediately implies that we would be unlikely to see effect sizes with very high magnitudes. For example, although it would not be surprising to see an effect size of, say, 0.1 or −1.2, it would be surprising to see an effect size of 10 or −10, while an effect size of 1,000,000 or −1,000,000 would be so unlikely that it would immediately suggest a coding error. A prior can and should reflect this prior knowledge.
For all population level effects, except for the intercept, we used a prior with a Student's t-distribution with 3 degrees of freedom, a mean of 0, and a scale of 3. Crucially, the zero mean indicates that our prior beliefs weakly favour the null hypothesis of zero effect size and, in comparison to using a flat uninformative prior, regularises the estimations, thereby reducing overfitting. The prior on the intercept was the brms default, which is a Student's t-distribution with 3 degrees of freedom, a mean of 0, and a scale of 10; the latter reflecting that, in standardised models, intercepts can take on larger sizes than the effects.
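To give a sense of what "weakly informative" means here, the following sketch evaluates the density of a Student's t-distribution with 3 degrees of freedom, a mean of 0, and a scale of 3 at a few effect sizes, relative to its peak at zero. It uses the standard location-scale t density; the function name and the chosen effect sizes are illustrative assumptions and are not taken from the analysis scripts.

```python
import math


def student_t_pdf(x, df=3.0, loc=0.0, scale=3.0):
    """Density of a location-scale Student's t-distribution."""
    z = (x - loc) / scale
    norm = math.gamma((df + 1) / 2) / (
        math.gamma(df / 2) * math.sqrt(df * math.pi) * scale
    )
    return norm * (1 + z * z / df) ** (-(df + 1) / 2)


peak = student_t_pdf(0.0)
for effect in (0.1, 1.2, 10.0, 1_000_000.0):
    # Density relative to the peak: near 1 means "barely penalised".
    print(effect, student_t_pdf(effect) / peak)
```

Small effects are left almost untouched by such a prior, whereas implausibly large effects receive very little prior density without being ruled out entirely.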
The group level effects (which are standard deviations) were given the default brms prior, which is a half-Student's t-distribution with 3 degrees of freedom, a mean of 0, and a scale of 10 (hence reflecting weak prior support for zero standard deviations); the correlation matrix had an lkj(1) prior which corresponds to a uniform distribution for correlations and standard deviations (respectively, off-diagonals and diagonals in the multivariate normal distribution's covariance matrix).
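To make the notion of a weakly informative prior concrete, the following sketch computes how much prior probability a Student's t-distribution with 3 degrees of freedom, a mean of 0, and a scale of 3 places on effect magnitudes beyond the example values mentioned above (0.1, 1.2, 10, and 1,000,000). It uses the closed-form tail probability that exists for 3 degrees of freedom; the function names are illustrative, and the code is not part of the original analysis.

```python
import math


def t3_upper_tail(x: float, scale: float) -> float:
    """P(T > x) for x > 0, where T is Student's t with 3 df, mean 0, given scale.

    Closed form for 3 df, written with atan(1/u) to stay accurate far in the tail.
    """
    u = x / (scale * math.sqrt(3.0))
    return (math.atan(1.0 / u) - u / (1.0 + u * u)) / math.pi


def prior_mass_beyond(magnitude: float, scale: float) -> float:
    """Prior probability that |effect| exceeds `magnitude` (two-tailed)."""
    return 2.0 * t3_upper_tail(magnitude, scale)


# Scale of 3, as used for the population level effects.
for magnitude in (0.1, 1.2, 10.0, 1_000_000.0):
    print(magnitude, prior_mass_beyond(magnitude, scale=3.0))
```

The printed masses decrease as the magnitude grows: nearly all of the prior mass lies beyond 0.1, a few percent lies beyond 10, and essentially none lies beyond 1,000,000, which matches the qualitative reasoning in the text.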

Results
This section is divided into two parts. First, there is an outline of the confirmatory analysis. Secondly, there is a more substantive description of an exploratory analysis that includes a number of important interactions that were not anticipated in the pre-registration but which provide greater insight into the underlying processes involved in learning and the transfer of learning.

Confirmatory Analysis
In the principal task (the transfer task), directional hypothesis tests showed strong evidence that performances in the trained key were better on isomorphic layouts than on non-isomorphic layouts. The evidence ratio was 39 and performances, as quantified by the ratio Accuracy/Inaccuracy (see Section 2.5), increased by 14% on isomorphic versions of the layouts. However, evidence for participants performing better on isomorphic layouts in the untrained key was only moderate: the evidence ratio was 8.5 with an increase in performance of 12%. Tables A6 and A7 provide full details.
For the remaining training and immediate retention tasks, strong evidence of an effect of isomorphism was found only in the arpeggio training task where, counter to expectation, the non-isomorphic layout improved performance (an evidence ratio of 3999.0 with a performance increase of 60%). As can be seen in the next section, this effect was driven principally by the Bosanquet-like layouts only.
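The percentage changes quoted in this section refer to the ratio Accuracy/Inaccuracy rather than to Accuracy itself. The sketch below shows how such a percentage change would translate back to the Accuracy scale, under the assumption (not confirmed in this section) that Inaccuracy = 1 − Accuracy, so that the ratio behaves like an odds. The baseline accuracy of 0.50 is hypothetical.

```python
def accuracy_to_ratio(accuracy: float) -> float:
    """Accuracy/Inaccuracy, ASSUMING Inaccuracy = 1 - Accuracy."""
    return accuracy / (1.0 - accuracy)


def ratio_to_accuracy(ratio: float) -> float:
    """Invert accuracy_to_ratio under the same assumption."""
    return ratio / (1.0 + ratio)


def apply_percent_increase(accuracy: float, percent: float) -> float:
    """Accuracy after the Accuracy/Inaccuracy ratio grows by `percent` percent."""
    grown = accuracy_to_ratio(accuracy) * (1.0 + percent / 100.0)
    return ratio_to_accuracy(grown)


baseline = 0.50  # hypothetical baseline accuracy
for pct in (14, 12, 60):
    print(pct, round(apply_percent_increase(baseline, pct), 3))
```

Because the ratio is unbounded while Accuracy is capped at 1, the same percentage increase corresponds to a smaller change in Accuracy when the baseline is already close to the ceiling.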
As expected, there was generally strong evidence in favour of learning effects: Accuracy increased with the second layout and over successive performances of each layout, but these increases reduced over time due to approaching a performance ceiling. Musicians also performed substantially better than non-musicians.
It is quite evident from the descriptive graphs provided in Section 3.2 that the impact of isomorphism was strongly moderated by both the type of layout (IsBosanquet) and by musicianship (IsMusician). For this reason, we henceforth focus on an exploratory analysis that includes these interactions in the model. These models were checked with approximate leave-one-out cross-validation and their fits to unseen data were not significantly different to those of the simpler confirmatory models (hence, they were not overfitting the data).

Exploratory Analysis
For each of the five tasks performed by our participants (scale training, scale retention, arpeggio training, arpeggio retention, and melody transfer), we follow the same format and present the results in both graphical form (Figures 9-23) and tabular form (Tables 1-5). Firstly, we provide descriptive graphs of the mean accuracies for each of the four layouts across the successive performances and successive layouts for that task. Separate graphs are provided for non-musicians and musicians, and the means are displayed with 95% confidence intervals calculated with 10,000 bootstrap samples. Secondly, we provide a plot of the model's regression coefficients with their 95% credibility intervals (tables of their numerical values are included in Appendix A). Thirdly, we graph the marginal effects on performance accuracy associated with IsNonIsomorphic, IsBosanquet, and, for the final task, IsUntrained. Again, separate plots are provided for musicians and non-musicians. Fourthly, we present a table of directional hypothesis tests for relevant effects as follows: the impact of IsNonIsomorphic conditioned on the different values for IsMusician, IsBosanquet, and, for the final task, IsUntrained; the impact of IsBosanquet conditioned on the different values for IsMusician, IsNonIsomorphic, and, for the final task, IsUntrained; the impact of IsMusician conditioned on the different values for IsNonIsomorphic, IsBosanquet, and, for the final task, IsUntrained; the impacts of LayoutNo, PerfNo, and their interaction, conditioned on IsMusician's two values. In the table, the effect under examination is shown first, while the values it is conditioned on are shown in the following parentheses. Finally, a verbal description of all effects of interest with evidence ratios of at least 10 is provided.

Scale Training Task
Table 1 and Figure 11 show:
• Strong evidence that, on the Wicki-like layouts, non-musicians and musicians performed better on the isomorphic versions; conversely, on the Bosanquet layouts, non-musicians and musicians performed better on the non-isomorphic versions.
• Strong evidence that performances were better on Bosanquet-like layouts than on Wicki-like layouts in all cases except for musicians playing the isomorphic layout (mean effect across all conditions was 0.88).

• Very strong evidence that musicians performed substantially better than non-musicians under all conditions (mean effect across all conditions was 1.84).
• Very strong evidence that accuracy improved across successive performances of the task and in the second-played layout, although the improvement was smaller for non-musicians when both were high.

[Figure: scale training task, the model's population level effects and 95% credibility intervals (see Table A1 for numerical values).]

Scale Immediate Retention Task
Table 2 and Figure 14 show:
• No strong evidence for any impact of isomorphism on accuracy.
• Strong evidence that non-musicians performed better on the non-isomorphic Bosanquet-like layout than on the non-isomorphic Wicki-like layout.

• Very strong evidence that musicians performed substantially better than non-musicians under all conditions (mean effect across all conditions was 1.08).
• Very strong evidence that accuracy improved in the second-played layout. However, only musicians improved with successive performances, and the improvements for musicians reduced for their second-played layout.

[Figure: scale retention task, the model's population level effects and 95% credibility intervals (see Table A2 for numerical values).]

Arpeggio Training Task
Table 3 and Figure 17 show:
• Strong evidence that, for the Bosanquet layouts, non-musicians and musicians performed better on the non-isomorphic versions.

• Very strong evidence that musicians performed better on the non-isomorphic Bosanquet-like layout than on the non-isomorphic Wicki-like layout.
• Very strong evidence that musicians performed substantially better than non-musicians under all conditions (mean effect across all conditions was 2.41).
• Very strong evidence that accuracy improved across successive performances of the task and in the second-played layout.

[Figure: arpeggio training task, the model's population level effects and 95% credibility intervals (see Table A3 for numerical values).]

Arpeggio Immediate Retention Task
Table 4 and Figure 20 show:
• Strong evidence that, for the Wicki-like layouts, musicians performed better on the isomorphic version; for the Bosanquet-like layouts, non-musicians and musicians both performed better on the non-isomorphic versions.

• Very strong evidence that musicians performed better on the non-isomorphic Bosanquet-like layout than on the non-isomorphic Wicki layout.
• Strong evidence that musicians performed substantially better than non-musicians under all conditions (mean effect across all conditions was 1.58).
• Very strong evidence that accuracy improved for the second-played layout. For musicians only, performance also improved with successive performances of the task but less so on the second layout.

Figure 19. Arpeggio retention task: the model's population level effects and 95% credibility intervals (see Table A4 for numerical values).

Melody Transfer Task
Table 5 and Figure 23 show:
• Strong evidence that non-musicians' performances in the trained key were better on the isomorphic Wicki layout compared with the non-isomorphic version; strong evidence that musicians' performances in the trained key were better using the isomorphic Bosanquet layout than using the non-isomorphic version.

• Strong evidence that non-musicians' performances in the untrained key were worse on the isomorphic Bosanquet than the non-isomorphic Bosanquet layout; very strong evidence that musicians' performances in the untrained key were better on the isomorphic versions of both types of layout (mean effect of 0.79).
• Strong evidence that musicians' performances in the trained key were better on the non-isomorphic Bosanquet than the non-isomorphic Wicki layout.
• Very strong evidence that musicians performed better than non-musicians across all conditions (mean effect was 2.05).
• Very strong evidence that performance improved with the second layout and with successive performances, although for non-musicians this reduced on the second layout.

[Figure: melody transfer task, the model's population level effects and 95% credibility intervals (see Table A5 for numerical values).]

Table 6 shows how many participants had preferences for particular pitch layouts. The largest share of participants (44%) indicated a preference for isomorphic pitch layouts. This was particularly evident amongst those who played on the Wicki-like isomorphic layout, where 35% of musicians and 29% of non-musicians preferred this layout. In contrast, 20% of participants indicated a preference for a non-isomorphic layout. However, it should be noted that there was a greater preference for the non-isomorphic version of the Bosanquet-like layout compared with the isomorphic version. In particular, approximately 20% of non-musicians preferred this layout compared to 7% of musicians.

Reason for Layout Preference
Positive participant statements (N = 59 statements) regarding why particular layouts were preferred were categorised into the following sub-themes: Layout structure, Intuitiveness, Similarity to piano, Identifying reference note, and Practice effects. Table 7 shows examples of statements from each sub-theme, as well as the number of statements per sub-theme. Table 8 shows the layout preferences for the numbers of participants who made statements fitting into each sub-theme.

Sub-Theme | Example Statement | Number of Statements (Mus, Non-Mus)
Layout structure | "I found it's more structured as to the rows 4 and then 3 and 4 before you change into the next note." (ID1). | 8
Intuitiveness | "I felt like I could feel the notes better in this layout... Like the way my brain put together like playing the notes and the sounds-it felt easier to do in this layout basically." (ID14). | 5
Similarity to piano | "It was kind of like the piano, so it was easier to remember I guess because I could do it with my fingers in the same way as for scales." (ID22). | 0
Identifying reference note | "Well for the first one [Wicki-like non-isomorphic],...I couldn't find the reference note. For the second one [Wicki-like isomorphic] it was easier to know where the C was I guess or whatever the tonic was." (ID7). | 3
Practice effects | "I think because I did it the first time, so the second time was like kinda the same thing but I had more practice. I guess I was more familiar with it by then." (ID44). |

Table 8. Number of statements for each sub-theme of Reason for layout preference according to preferred layout.
Sub-Theme | Isomorphic: Wicki-Like | Isomorphic: Bosanquet-Like | Non-Isomorphic: Wicki-Like | Non-Isomorphic: Bosanquet-Like | No Preference

The sub-theme Layout structure contained statements made by separate participants suggesting how identifiable aspects of the pitch layouts influenced their layout preferences. Such aspects included the positions of notes, the way notes were grouped, and the pattern sequences followed. Layout structure accounted for the largest proportion of statements (approximately 37%) contained under Reason for layout preference. Additionally, approximately 57% of statements were made by participants who preferred isomorphic layouts, whereas 22% of statements were made by those who preferred non-isomorphic layouts. However, of those who preferred non-isomorphic layouts, all statements were made by those who played on Bosanquet-like layouts. Furthermore, for four of these five participants (two musicians, two non-musicians), statements were related to the greater number of pitches per row on the non-isomorphic layout. The statements for this sub-theme were largely made by musicians, who made more statements for this sub-theme than any other sub-theme (58% of musicians made statements under this sub-theme). Statements of this type also accounted for the second greatest proportion of non-musicians' statements (20%).
The sub-theme Practice effects accounted for the second largest proportion (approximately 32%) of statements included under Reason for layout preference. It contained statements that mentioned that developing familiarity with the device over the experimental session had influenced layout preference. For the 17 participants in this sub-theme who had specified a preference for a particular layout, the preferred layout was the one presented second to the participant. Furthermore, 50% of statements were made by those who preferred isomorphic pitch layouts, whereas 40% of statements were made by those who preferred non-isomorphic pitch layouts. In addition, Practice effects accounted for the largest proportion of statements under Reason for layout preference made by non-musicians (37%) and the second largest proportion of statements made by musicians (19%).
The remaining sub-themes, Intuitiveness, Similarity to piano, and Identifying reference note, accounted for approximately 15%, 3%, and 8% of the statements included under Reason for layout preference, respectively. Intuitiveness included statements indicating that playing came more easily, or was generally understood more easily, on particular layouts. Similarity to piano contained statements citing common features between particular layouts and a piano. Both statements in this sub-theme were made by musicians, although the primary instrument they played was not always the piano. Identifying reference note contained statements indicating the relative ease of locating the position of the starting note for any exercise/melody on a particular layout. Similar to Layout structure and Practice effects, the majority of statements under these sub-themes were made by participants who preferred isomorphic pitch layouts.

Challenging Aspects
Participant statements (N = 30 statements) regarding the features of the experiment that made it difficult were categorised into three sub-themes: Layout structure, Pace of audiovisual (AV) demonstration, and Familiarity of interface. Table 9 shows the number of statements assigned to each sub-theme as well as example statements from each.

Table 9. Example statements from the sub-themes under Challenging aspects.
Sub-Theme | Example Statement | Number of Statements
Layout structure | "But just trying to figure out where all the notes were at, it was more of a challenge of making something work. Like where the intervals were." (ID24). | 20
Familiarity of interface | "But umm for this sort of task and especially when it's an interface that I'm not familiar with... because it's so alien to me, it's not as simple to pick up." (ID15). | 8
Pace of AV demonstration | "I think sometimes the keys, with the highlighting, it flashed a little fast, like you're trying to catch up with it." (ID2). | 2
The sub-theme Layout structure consisted of negative statements which indicated difficulty tracking the positions of pitches and the patterns in which they were laid out throughout the experiment. As such, it can be seen as the counterpart to Layout structure under the previous theme. As in the previous theme, this sub-theme also accounted for the largest proportion of statements (approximately 67%) coded under the theme Challenging aspects. In addition, more musicians (50%) and non-musicians (17%) made statements for this sub-theme than any other sub-theme.
The sub-themes Familiarity of interface and Pace of AV demonstration accounted for the remaining 27% and 6% of statements included under Challenging aspects, respectively. Familiarity of interface included statements that indicated a lack of experience with the type of interface used. Statements of this type were primarily made by musicians. Pace of AV demonstration included statements (made by musicians only) that described the difficulty some participants experienced in keeping up with the AV demonstration utilised in the training tasks.

Discussion
This study was conducted to examine whether the use of isomorphic pitch layouts on a novel musical instrument would facilitate learning better than non-isomorphic pitch layouts. Accuracy was measured for participants' performances on a series of musical training, immediate retention, and transfer tasks so as to infer learning. The transfer task was the primary task in which the effects of interest were expected to be most evident. This was because it required the application of previously learned pitch combinations to an untrained task, which would occur both in the key in which they were trained and in an additional untrained key. Thus, it did not merely require the repetition of a previously played sequence. Effects seen in the transfer task would suggest elements that facilitate sustained learning, rather than any temporary learning effects that may be seen in the training and immediate retention tasks. The strong evidence in the transfer task for the effect of isomorphism in the trained key and moderate evidence for the untrained key seen in the confirmatory analysis only partly explained the differences seen across performances. The exploratory analysis more fully delineated the moderating effects of the type of layout (Wicki or Bosanquet), and the musical experience of the participant (musician or non-musician).

Melody Transfer Task
For the transfer task, the most important hypothesis test (H3, in Section 1) was whether participants performed better in an untrained key when playing on an isomorphic pitch layout than when playing a non-isomorphic layout. For musicians, this hypothesis was strongly confirmed: their performances, as quantified by the ratio Accuracy/Inaccuracy (see Section 2.5), were 57% better on the isomorphic Wicki version than on the non-isomorphic version and 40% better on the isomorphic Bosanquet than the non-isomorphic version. However, for non-musicians playing the Bosanquet-like layouts, the results were counter to this hypothesis: performances were better on the non-isomorphic version (by 38%).
There are at least two plausible mechanisms (detailed in Section 1) to explain the advantages of isomorphic layouts for musicians: (a) any given spatial relationship always produces the same sized musical interval, so they are more consistent; (b) they have a consistent pitch axis, which allows players to assess the pitch of any given button purely by its spatial position. Neither is true for non-isomorphic layouts (by definition). Arguably, these advantages are only of relevance to players who already have some degree of musical sophistication, for example, the capacity to aurally distinguish differently sized intervals rather than being aware only of a general melodic contour.
The above argument implies that, for non-musicians, we would see a reduced positive effect or no distinct effect at all. For non-musicians, although we saw no distinct effect with the Wicki-like layouts, we unexpectedly saw a distinct negative effect with the Bosanquet-like layouts. A possible reason for this, as can be seen in Figure 8, is that, on the non-isomorphic Bosanquet-like layout, all the scale pitches within a given octave are arranged on two rows rather than three. This means that, just by sticking to a single row and following the contour of the melody, a moderate number of correct pitches would result. In other words, the non-isomorphic Bosanquet layout allows for a simple playing strategy to achieve moderate accuracy (there is no difference in the numbers of rows per octave in the isomorphic and non-isomorphic versions of the Wicki-like layouts). It is also worth noting that the Accuracy values for non-musicians in this task were low (hovering around 0.25, see Figure 21). It may be that the simple strategy just described is useful for improving performance from low levels, whereas isomorphism is useful for improving performance from higher levels. This also raises another question, which is whether a one-dimensional isomorphic layout (a single row of buttons each ascending by a semitone) would be advantageous for learning. However, that is beyond the scope of this article.
The next most important hypothesis test investigated whether participants perform better in the trained key when playing on an isomorphic version of a given layout. There was distinct evidence in favour of this hypothesis: non-musicians' performances on the Wicki-like layouts, as quantified by the ratio Accuracy/Inaccuracy, were 17% better on the isomorphic version; musicians' performances on the Bosanquet-like layouts were 28% better on the isomorphic version. Under the other two conditions (non-musicians playing Bosanquet-like layouts and musicians playing Wicki-like layouts), there was no strong evidence pro or contra isomorphism.
Although no formal hypothesis tests were performed, Figure 23 shows that performances in the untrained key were not, on average, worse than those in the trained key.
We made no hypothesis for the impact of Bosanquet-like versus Wicki-like layouts and, indeed, there was no evidence provided for such an effect in any of the eight tested conditions shown in Table 5.
As expected, learning effects, as quantified by PerfNo, LayoutNo, and their interaction, were distinct and in the expected directions: generally, Accuracy improved with successive performances and layouts, with a subtle ceiling effect for non-musicians. Also, as expected, musicians performed considerably better in this task than non-musicians.

Training and Immediate Retention Tasks
The training and immediate retention tasks differed fundamentally from the transfer task in that they measured pattern learning. They also served the crucial purpose of training participants for doing the final task. Any effects seen here demonstrate temporary influences on the rate of learning a new layout. Although informative about how learning takes place during training, it is only when considered with the effects seen in the transfer task that an idea of the influences that promote sustained learning over time can be gained.
In three of the tasks (scale training, arpeggio training, and arpeggio immediate retention), performances on the non-isomorphic version of the Bosanquet-like layouts were distinctly better than those on the isomorphic version. This is counter to our expectations. As suggested in the previous subsection, it may be that the non-isomorphic Bosanquet-like layout benefits from using fewer rows per octave than the isomorphic version.
Performances on the isomorphic version of the Wicki-like layout were better in the scale training task and, only for musicians, in the arpeggio retention task. This is suggestive of a positive effect of isomorphism for the Wicki-like layouts.
We made no hypothesis for the impact of Bosanquet-like versus Wicki-like layouts. However, Tables 1-4 show that, for the non-isomorphic layouts, performance was often better on the Bosanquet-like than the Wicki-like layout. For the standard isomorphic versions, the Bosanquet layout was better than the Wicki layout only for non-musicians in the scale training task; across all other tasks and conditions, there was no distinct difference between them. These results once again point to the non-isomorphic version of the Bosanquet layout benefiting from having fewer rows per octave.
There is very strong evidence for expected learning effects: accuracy improved from the first to second layout and usually over successive performances within a given layout. In the scale training task, there was strong evidence that both non-musicians and musicians approached a ceiling, while musicians also approached a ceiling in the scale retention and arpeggio training tasks. This probably reflects the greater difficulty of these latter tasks combined with musicians already starting from a higher base level of skill. Indeed, as expected, musicians were distinctly better at all training and retention tasks than non-musicians.

Summary of All Tasks
For the training and retention tasks, the reversal of the effect of isomorphism across the two layout types suggests that additional factors may need to be considered. The models used here had very good fits to the data collected, as shown by the R-squared values in Tables A1-A6, but the strong moderation of isomorphism by layout type (IsBosanquet) suggests that features specific to that non-isomorphic layout (such as the extent to which pitches lie on a single row) are also important.
For the transfer task, the moderation of the effect of isomorphism by IsBosanquet was less evident (across all eight tested conditions, isomorphism had a distinctly positive impact in four and a distinctly negative impact in one), perhaps because we would expect to see isomorphism playing a stronger positive role in the transfer task than in the training and retention tasks.
The results are suggestive of a positive impact of isomorphism on the ability to transfer learning from scales and arpeggios to previously unplayed melodies and to new keys. However, the contradictory results for the non-musicians playing the Bosanquet-like layout in the transfer task and the results of the training and retention tasks suggest a need to test more isomorphic/non-isomorphic pairs to obtain greater confidence in the general effect of isomorphism.

Semi-Structured Interviews
Interview responses provided further insight into the effects described above. There was a greater overall preference for isomorphic pitch layouts. However, this effect was largely driven by those who played on Wicki-like layouts and more so by musicians than non-musicians. In contrast, there was a greater preference for the non-isomorphic layout amongst those who played on Bosanquet-like layouts, which was largely driven by non-musicians. Findings from responses under Reason for layout preference shed further light on this. The sub-theme Layout structure was the most prominent (it contained the greatest number of statements) and comprised statements primarily made by musicians rather than non-musicians. Statements under this sub-theme also primarily corresponded to those with a preference for the Wicki-like isomorphic layout. However, all statements corresponding to non-isomorphic layouts related to the Bosanquet-like arrangement and, with the exception of one statement, referred to how many pitches were contained on each row. The difference in the patterns of layout preferences across layout types hints at the presence of additional factors that influence learning. The commonalities between statements made by those with a preference for the non-isomorphic Bosanquet-like layout provide support for the previous suggestion that one such factor is likely the number of pitches per row.
The second most prominent sub-theme under Reason for layout preference was Practice effects, which largely comprised statements made by non-musicians. Additionally, non-musicians made more statements for this sub-theme than any other sub-theme, indicating that, for non-musicians, Practice effects were the most influential factor when learning on a novel musical interface. This is in contrast to musicians, who made the highest number of statements for the sub-theme Layout structure. The differing proportions of musicians and non-musicians across layout preferences suggest differences between how base learning and advanced learning occur.
As for Reason for layout preference, the sub-theme Layout structure was also the most prominent of the theme Challenging aspects. The consistency of this sub-theme's level of prominence within its respective themes indicates that structure (that is, note positions, note groupings, and sequence patterns) is the most important factor contributing to an individual's experience with a pitch layout.

Limitations and Future Directions
There were several limitations to this study. Firstly, the generalisability of these results is somewhat limited by the demographic factors of this study, such as the age and nationality of the participants. Secondly, given the nature of the way in which performances were scored, it was possible for performances in which, say, only two notes were played but played correctly, to achieve higher accuracy scores than performances where an attempt was made to play the entire sequence in question but many errors were made. Therefore, highly inaccurate scores were not necessarily a reliable representation of what a participant had learned. Thirdly, as already mentioned, it is hard to make general statements about high-level features such as isomorphism without having tested a greater variety of layouts. However, given the time-consuming nature of the experiments, there was a practical limitation as to the number of layouts that could be feasibly tested. Fourth, as in [15], only melody performances were tested; bearing in mind that melodies often comprise numerous major seconds, different results might be obtained when participants play bass lines (which often comprise numerous fourths and fifths) and harmonies (which often comprise numerous thirds and sixths as well as fourths and fifths). Lastly, while the instruments played by musicians were identified, they were not controlled for. Piano was the primary instrument played by a large proportion of musicians in this study, and it is possible that their results may be biased with respect to the broader population of musicians. As such, future research could aim to include equal numbers of participants playing a range of instruments so as to eliminate this potential bias.
Future research could also investigate whether learnability is better facilitated by pitch layouts congruent with the SMARC effect, something that was not tested here because all our layouts were congruent. Furthermore, there are still open questions regarding the individual importance of various interval directions in pitch layouts, such as the directions of the pitch axis, the octave axis, and the major second axis, the impacts of which have, to date, been only partially differentiated [15].

Conclusions
Overall, this study provides insights into the cognitive and sensory-motor aspects of playing pitches on musical instruments. This is useful for informing the design of new musical instruments so as to enhance their learnability and playability. We showed strong evidence that, for musically sophisticated participants, isomorphic pitch layouts facilitate the transfer of previously learned scales and arpeggios to an unlearned melody in the same key and in a new key. Results for non-musicians were, however, contradictory. Generally, performances on the non-isomorphic Bosanquet-like layout, particularly in the training and retention tasks, were unexpectedly good. This suggests that other aspects that were not explicitly modelled here (such as the number of rows per octave) also play roles. Further investigation is required to more clearly identify the general effects of isomorphism and to explore other features that contribute to the learnability and playability of pitch layouts in new musical instruments.

Acknowledgments: The authors would like to thank Anthony Prechtl for specially recoding Hex to make it respond to multitouch input.

Conflicts of Interest:
The authors declare no conflict of interest.