Statistical Properties in Jazz Improvisation Underline Individuality of Musical Representation

Abstract: Statistical learning is an innate function of the brain and is considered essential for producing and comprehending structured information such as music. Within the framework of statistical learning, the brain is able to calculate the transitional probabilities of sequences such as speech and music, and to predict a future state using the learned statistics. This paper computationally examines whether and how statistical learning and knowledge partially contribute to musical representation in jazz improvisation. The results represent the time-course variations in a musician's statistical knowledge. Furthermore, the findings show that improvisational musical representation might be susceptible to higher- but not lower-order statistical knowledge (i.e., knowledge of higher-order transitional probability). The evidence also demonstrates the individuality of improvisation for each improviser, which in part depends on statistical knowledge. Thus, this study suggests that statistical properties in jazz improvisation underline individuality of musical representation.

were significantly higher than those of (−1) (p = 0.014), (1) (p = 0.003), (3) (p = 0.001), (−4) (p = 0.042), and (4), (5), ( and (7) (all: p of (2) were significantly higher than those of (1) (p = 0.006), (4) (p ( 5) (p and (7) (p The TPs of (−3) were significantly higher than those of (1) (p = 0.044), (3) (p (4) (p < (−5) (p < and (7) (p < 0.001). The TPs of (3) were significantly higher than those of (−5) (p = 0.013) and (7) (p < 0.001). The TPs of (0) were significantly higher than those of (7) (p = 0.013). The TPs of (−4) were significantly higher than those of (7) (p = 0.005). The TPs of (4) were significantly higher than those of (7) (p = 0.011). The TPs of (5) were significantly higher than those of (−5) (p 0.019) and (7) (p <


Statistical Learning: A Mathematical Model of a Learning System in the Brain
According to the hypothesis of statistical learning (SL), the brain can learn the transitional probabilities (TPs) of sequential information such as speech and music without intention, and can predict a future state using the learned statistical information [1][2][3]. Statistical learning, which occurs even in sleeping neonates and infants [4,5], is considered an implicit and innate system in humans. Evidence has suggested that, even if humans are unaware of what they learn, neurophysiological responses disclose statistical learning effects [6][7][8][9][10][11][12][13]. This learned knowledge can be in part expressed via an abstract medium such as a musical melody [14][15][16]. Thus, statistical knowledge may be reflected in the melodies produced by an improviser: a high-probability sequence in music may be one that an improviser is more likely to choose, while creative artists are capable of producing low-probability (i.e., novel, unexpected) improvisations. Indeed, statistical learning has been applied to develop an automatic composition system that gives computers learning and creative abilities similar to those of the human brain [17]. The statistical-learning model is often expressed as a Markov process in which the probability of the forthcoming state is statistically defined only by the latest state, as in the information theory framework of Claude Shannon [18], including the use of n-grams to estimate the statistical structure of environmental information, and the development of entropy as a quantitative measure of the predictability of information [19]. Using multi-order Markov stochastic models, some researchers have also investigated the relationships between human statistical knowledge and statistical structure in music [20][21][22][23]. It has been suggested that the musical creativity of humans in part depends on the statistical learning that occurs in the brain [24][25][26][27].

Corpus Study in Music: Classical and Jazz Music
Music has numerous domain-specific structures (e.g., tonal pitch spaces, hierarchical tension, attraction contours) [28,29]. This suggests that statistical learning alone may be inadequate as a model of human music composition. Indeed, improvisers acquire extensive music-specific knowledge, such as knowledge of harmony and musical grammar, and intentionally follow these theories when composing music. The characteristics inherent in music can be revealed by referring to music-specific structures, such as harmony and tonality, in musical scores. This suggests that the musical knowledge underlying composition strategies can be extracted based on music's specificity. On the other hand, it has been suggested that musicians access their statistical knowledge when composing music [20][21][22][23]. It is of note, however, that we cannot exclude the possibility that pitch choices during improvisation depend strongly on aesthetic preferences involved in domain-specific and long-term musical knowledge such as tonality, as well as on learned transitional probabilities, although it is difficult to reveal these effects empirically because long-term exposure to music is hard to measure. Nonetheless, in jazz music, at least compared to other types of musical composition in which a composer deliberates and refines a composition scheme for a long time based on musical theory, musicians are forced to play their own music immediately, partially using long-term training associated with procedural learning [30][31][32][33]. Thus, the performance of musical improvisation is intimately bound to statistical knowledge because of the necessity of intuitive decision-making [34][35][36] and auditory-motor planning based on procedural knowledge. To understand the relationships between improvisational musical representation and statistical knowledge, it is important to examine the stochastic structure embedded in musical improvisation.
According to previous studies, the brain codes the statistics of auditory sequences as relative information, such as the distribution of relative-pitch frequencies, and this information can be used in the comprehension of other sequential structures [7,8]. Thus, this study investigates the distribution of relative-pitch frequencies (i.e., pitch intervals) in music. This procedure also has the advantage of eliminating the effects of differences in key on transitional patterns. The present study examines how the TPs of sequences associated with relative, but not absolute, pitch are distributed in improvisation.

Computational Modeling of Musical Improvisation
According to SL theory, the brain automatically computes TP distributions in sequential phenomena. The TP distributions sampled from sequential information such as music are often expressed by nth-order Markov models. The nth-order Markov model is based on the conditional probability of an element e_{n+1} given the preceding n elements:

$$P(e_{n+1} \mid e_n) = \frac{P(e_{n+1} \cap e_n)}{P(e_n)} \qquad (1)$$

From the viewpoint of psychology, the formula can be interpreted as positing that the brain predicts a subsequent event e_{n+1} based on the preceding events in a sequence. In other words, learners expect the event with the highest TP based on the latest n states, whereas they are likely to be surprised by an event with a lower TP. The TPs of all transition patterns in a piece of music can be expressed by a TP matrix (Figure 1).
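As a minimal sketch of how such TPs might be estimated from a symbol sequence, the following Python function counts each n-element context and its continuations and takes their ratio, which is the relative-frequency estimate of the conditional probability defined above. The function name `transitional_probabilities` and the toy sequence are illustrative assumptions, not part of the original analysis.

```python
from collections import Counter


def transitional_probabilities(sequence, order=1):
    """Estimate P(next | preceding `order` elements) by relative frequency.

    Returns a dict mapping (context_tuple, next_element) to a probability.
    """
    if order < 1:
        raise ValueError("order must be >= 1")
    context_counts = Counter()  # how often each context is followed by anything
    ngram_counts = Counter()    # how often each (context, next) pair occurs
    for i in range(len(sequence) - order):
        context = tuple(sequence[i:i + order])
        nxt = sequence[i + order]
        context_counts[context] += 1
        ngram_counts[(context, nxt)] += 1
    return {
        key: count / context_counts[key[0]]
        for key, count in ngram_counts.items()
    }


# Toy sequence (hypothetical, for illustration only).
toy = ["C", "D", "C", "D", "E", "C", "D"]
first_order = transitional_probabilities(toy, order=1)
second_order = transitional_probabilities(toy, order=2)
```

In the toy sequence, "C" is always followed by "D", so that transition has a TP of 1, whereas "D" is followed equally often by "C" and "E", so each of those transitions has a TP of 0.5. The resulting dictionary plays the role of a sparse TP matrix.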

Study Purpose
The purpose of the present study is to examine the statistical structure embedded in musical improvisation based on various-order hierarchical models of TPs, and to reveal the statistical differences among improvisers. The TPs of melody in pieces of improvisational music played by Bill Evans, McCoy Tyner, and Herbert Jeffrey Hancock (hereafter, Herbie Hancock) were calculated based on nth-order Markov models (Figure 2). In Study 1, using the TPs, information contents (I) were calculated in the framework of information theory. In previous studies, information content has been interpreted as a measure of surprise or predictability. In other words, lower information content means higher predictability and less surprise, whereas higher information content means lower predictability and greater surprise [37][38][39][40][41]. From the psychological perspective, a tone with lower information content based on a TP may be one that an improviser is more likely to predict and choose as the next tone, compared to tones with higher information content. Thus, information content can be used in computational studies of music to discuss psychological phenomena involved in prediction and statistical learning. Information contents based on TPs were compared among improvisers and sequences. It was hypothesized that statistical similarity may represent common styles shared by all improvisers, whereas statistical differences among improvisers may represent the specific characteristics of each improviser. In Study 2, chronological variations of the TPs were investigated first. It was hypothesized that there may be two types of chronological variation: TPs that gradually decrease, and TPs that gradually increase, consistent with chronological order. If so, these findings would suggest that statistical knowledge in music gradually shifts over a composer's lifetime. Second, differences in the statistical structure between musicians and between pieces of music were examined.
It was hypothesized that there were statistical characteristics shared between different musicians and between pieces of music. If so, these findings would suggest that statistics embedded in musical improvisation may represent individualities of musical expression.

Figure 2. Representative phrases of transition patterns with the highest probabilities, weight-averaged across the five different hierarchical models of TPs for each musician.


Methods
We used the improvisational sequences of the highest pitches from 21 pieces of music played by three improvisers (i.e., 7 improvisations per performer, Table 1). The present study only analyzed the highest pitch because the definition of melody in each title is still controversial in musicological study, as different melodies could concurrently appear in different titles of music, and melody is often played at the highest pitches. The TPs of sequences at the highest pitches were calculated based on five different hierarchical models. The probability of a forthcoming tone was statistically defined by the last one to five successive tones, respectively (i.e., first- to fifth-order Markov chains). For each type of pitch transition, all pitches were numbered so that the first pitch was 0 in each transition, and an increase or decrease of one semitone was 1 or −1 relative to the first pitch, respectively. This reveals how the pitches, but not the notes, transitioned from the first pitch. For example, the transition of C5, D5, E5, F5, and B4 was numbered 0, 2, 4, 5, and −1. This procedure was employed to eliminate the effects of the change of key on transitional patterns. Each transition pattern was chronologically ordered based on the time courses in which music was played by each musician. Using the transitional patterns that appear in all pieces of music, the time-course variations of the TPs were analyzed by multiple regression analyses using the stepwise method. The criteria of the variance inflation factor (VIF) and condition index (CI) were set at VIF < 2 and CI < 20 to confirm that there was no multicollinearity. The VIF is a measure of the effect of other predictor variables on a regression coefficient. The CI represents the collinearity of combinations of variables in the data set (specifically, the relative size of the eigenvalues of the matrix). Using TP matrices in music, each statistical characteristic was compared by Kendall's tau coefficient analysis.
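The numbering scheme described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' actual procedure: the helper names (`pitch_to_semitone`, `relative_pattern`, `all_patterns`) are hypothetical, and pitch names are assumed to follow scientific pitch notation with optional `#` or `b` accidentals.

```python
# Semitone offsets of the natural notes within one octave.
NOTE_TO_SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def pitch_to_semitone(name):
    """Convert a name such as 'C5', 'F#4' or 'Bb3' to an absolute semitone number."""
    letter, rest = name[0], name[1:]
    accidental = 0
    while rest and rest[0] in "#b":
        accidental += 1 if rest[0] == "#" else -1
        rest = rest[1:]
    octave = int(rest)
    # MIDI-style numbering (C4 = 60); only differences matter here.
    return 12 * (octave + 1) + NOTE_TO_SEMITONE[letter] + accidental


def relative_pattern(pitches):
    """Number pitches in semitones relative to the first pitch (first pitch = 0)."""
    first = pitch_to_semitone(pitches[0])
    return tuple(pitch_to_semitone(p) - first for p in pitches)


def all_patterns(pitches, n_tones):
    """Slide a window of n_tones over the sequence and encode each window."""
    return [
        relative_pattern(pitches[i:i + n_tones])
        for i in range(len(pitches) - n_tones + 1)
    ]


example = ["C5", "D5", "E5", "F5", "B4"]
encoded = relative_pattern(example)   # (0, 2, 4, 5, -1), as in the text
windows = all_patterns(example, 3)    # three-tone patterns
```

Because every window is re-anchored at 0, the same phrase played in a different key yields the same pattern, which is the property the procedure relies on.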
The eigenvalue criterion was set above 1. Furthermore, the TPs were weight-averaged in all pieces of music for each musician and for all musicians. Using the weight-averaged TP matrices for each musician, statistical characteristics were compared among musicians by correlation analysis. The representative phrases of transition patterns with the highest probabilities for each musician were decoded as music scores. Statistical significance levels were set at p = 0.05 for all analyses.

4,7,11)). Using these transitional patterns, a multiple linear regression was carried out to predict the chronological order in which music was played by each musician, based on the TPs of the 16 transitional patterns. A significant regression equation was found (F(2,4) = 12.71, p = 0.018), with an adjusted R² of 0.80 (Table 2a). The predicted chronological order is equal to 4.12 − 144.06 × (TP of (0,2,3,5)) + 627.99 × (TP of (0,−3,−4,−6)). The TPs of (0,2,3,5) and (0,−3,−4,−6) gradually decreased and increased consistently with the chronological order, respectively ((0,2,3,5) p = 0.028, (0,−3,−4,−6) p = 0.043). These TPs were significant predictors of chronological order.

Four transitional patterns with five tones that appear in all pieces of music were detected ((0,−1,−2,−3,−4), (0,1,3,4,6), (0,1,5,8,12), and (0,−3,−2,2,5)). Using these transitional patterns, a multiple linear regression was carried out to predict the chronological order in which music was played by each musician, based on the TPs of the four transitional patterns. A significant regression equation was found (F(1,5) = 12.71, p = 0.022), with an adjusted R² of 0.62 (Table 2a). The predicted chronological order is equal to 7.02 − 317.34 × (TP of (0,1,3,4,6)). The TPs of (0,1,3,4,6) gradually decreased consistently with the chronological order (p = 0.022). These TPs were significant predictors of the chronological order.

One transitional pattern with six tones that appears in all pieces of music was detected ((0,−3,−2,2,5,9)). Using this transitional pattern, a multiple linear regression was carried out to predict the chronological order in which music was played by each musician, based on the TPs of the transitional pattern. No significant regression equation was detected, however.
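To make the reported four-tone regression equation concrete, the sketch below simply evaluates it for given TP values. The intercept and coefficients are those reported above; the function name `predicted_order` and the example TP values are hypothetical and are not taken from Table 2a.

```python
def predicted_order(tp_0_2_3_5, tp_0_m3_m4_m6):
    """Evaluate the reported regression equation for chronological order.

    tp_0_2_3_5    -- TP of the pattern (0,2,3,5)
    tp_0_m3_m4_m6 -- TP of the pattern (0,-3,-4,-6)
    """
    return 4.12 - 144.06 * tp_0_2_3_5 + 627.99 * tp_0_m3_m4_m6


# Hypothetical TP values, chosen only to illustrate the arithmetic.
early = predicted_order(0.020, 0.005)
late = predicted_order(0.010, 0.005)
```

With the second TP held fixed, a lower TP of (0,2,3,5) gives a later predicted position, which matches the reported negative coefficient for that pattern.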

H. J. Hancock
Sixteen transitional patterns with two tones that appear in all pieces of music were detected ((0,0), (0,1), (0,−1), (0,2), (0,−2), (0,3), (0,−3), (0,4), (0,−4), (0,5), (0,−5), (0,6), (0,7), (0,−7), (0,9), and (0,−9)). Using these transitional patterns, a multiple linear regression was carried out to predict the chronological order in which music was played by each musician, based on the TPs of the 16 transitional patterns.

)). Using these transitional patterns, a multiple linear regression was carried out to predict the chronological order in which music was played by each musician, based on the TPs of the 19 transitional patterns. No significant regression equation was detected, however.

One transitional pattern with four tones that appears in all pieces of music was detected ((0,1,0,−2)). Using this transitional pattern, a multiple linear regression was carried out to predict the chronological order in which music was played by each musician, based on the TPs of the transitional pattern. No significant regression equation was detected, however. No transitional patterns with five or more tones that appear in all pieces of music were detected.

,5,7)). Using these transitional patterns, a multiple linear regression was carried out to predict the chronological order in which music was played by each musician, based on the TPs of the 32 transitional patterns. A significant regression equation was found (F(2,4) = 43.63, p = 0.002), with an adjusted R² of 0.93 (Table 2b). The predicted chronological order is equal to −0.12 + 191.08 × (TP of (0,−2,−5)) − 485.35 × (TP of (0,5,7)). The TPs of (0,−2,−5) and (0,5,7) gradually increased and decreased consistently with the chronological order, respectively ((0,−2,−5) p = 0.001, (0,5,7) p = 0.019). These TPs were significant predictors of the chronological order.
Seven transitional patterns with four tones that appear in all pieces of music were detected ((0,2,4,2), (0,−2,−5,−9), (0,3,5,3), (0,−3,−5,−7), (0,−3,−7,−5), (0,4,2,0), and (0,5,3,0)). Using these transitional patterns, a multiple linear regression was carried out to predict the chronological order in which music was played by each musician, based on the TPs of the seven transitional patterns. No significant regression equation was detected, however. One transitional pattern with five tones that appears in all pieces of music was detected ((0,5,3,0,−4)). Using this transitional pattern, a multiple linear regression was carried out to predict the chronological order in which music was played by each musician, based on the TPs of the transitional pattern. No significant regression equation was detected, however. No transitional patterns with six tones that appear in all pieces of music were detected.

Correlations could be detected between musicians in the same title of music: between W. J. Evans and M. Tyner in "Autumn Leaves" (r = 0.620), and between W. J. Evans and H. J. Hancock in "Someday My Prince Will Come" (r = 0.719). "Autumn Leaves" improvised by W. J. Evans was related only to "Autumn Leaves" improvised by M. Tyner, whereas "Autumn Leaves" improvised by M. Tyner was related to six pieces of music improvised by W. J. Evans. All results of the correlation analysis of the weight-averaged TP matrices in each musician are shown in Figure 4. In the first- and second-order TPs, musical improvisations are strongly and moderately related to each other, respectively. In the third-order TP, musical improvisations by H. J. Hancock are weakly (0.2 ≤ |r| < 0.4) related to those by both W. J. Evans (r = 0.356, p < 0.001) and M. Tyner (r = 0.303, p < 0.001). In the fourth- and fifth-order TPs, musical improvisations by W. J. Evans are weakly (r = 0.222, p < 0.001) and moderately (r = 0.
In the principal component analysis, the decision was made to specify a four-factor solution (eigenvalue > 1; Table 3 and Figure 5). The four factors accounted for 91.84% of the total variance. All of the pieces of music loaded higher than 39 on Factor 1, suggesting that this explains the general component of the jazz musical improvisation of the three musicians. All of the pieces of music improvised by W. J. Evans loaded lower than 0, and those by M. Tyner loaded higher than 0 on Factor 2, suggesting that this explains a component of improvisation in W. J. Evans or M. Tyner. Four pieces of music improvised by H. J. Hancock loaded lower than 0, and the other three pieces of music loaded higher than 0. Factors 3 and 4 did not detect a component for a musician and a piece of music. "Who Can I Turn To?" improvised by W. J. Evans (0.52) and "Maiden Voyage" improvised by H. J. Hancock (−0.64) loaded heavily on Factor 3. The representative phrases of transition patterns with the highest probabilities, that were weight-averaged in each musician, were decoded as music scores (Figure 2).

Time-Course Variation of Statistics in Musical Representation
The present study hypothesized that a tone with a higher TP in a musical sequence may be one that a composer is more likely to choose, and that the TP matrix calculated from musical improvisation can represent in part the characteristics of a composer's statistical knowledge. This study investigated how characteristics of musical improvisation are reflected in TP distributions. Consistent with chronological order in music by W. J. Evans, the third-order TPs of (0,2,3,5) and fourth-order TPs of (0,1,3,4,6) gradually decreased. The third-order TPs of (0,−3,−4,−6) gradually increased. The transitions in which TPs gradually decreased (i.e., (0,2,3,5) and (0,1,3,4,6)) may be popular musical phrases compared to the transition in which TPs gradually increased (i.e., (0,−3,−4,−6)). For instance, the transition .

This suggests that there were statistical characteristics of this music shared between different musicians. Furthermore, in W. J. Evans and M. Tyner, "Autumn Leaves" by W. J. Evans was related only to "Autumn Leaves" by M. Tyner, whereas "Autumn Leaves" by M. Tyner was related to six pieces of music by W. J. Evans. These results may be logical because of the chronological order. This also supports the hypothesis that H. J. Hancock may refer to musical improvisation by W. J. Evans. The first- to third-order transition patterns with the highest probabilities in H. J. Hancock were also the same as those in W. J. Evans (Figure 2), suggesting the similarity of statistical characteristics between the two musicians.

Improvisation
It is widely accepted that statistical knowledge causes a sense of intuition, spontaneous behaviour and skill acquisition based on procedural learning, and is further closely tied to musical production, including creativity, composition, and playing. In particular, jazz musicians intuitively engage in musical improvisation based on their knowledge and experience, which do not necessarily follow explicit music-specific rules. They are forced to express intuitive creativity and immediately play their own music based on long-term training associated with procedural and implicit learning. Thus, it is considered that the performance of musical improvisation is intimately bound to statistical knowledge because of the necessity of intuitive decision-making [34][35][36] and auditory-motor planning based on procedural knowledge. That is to say, the statistics embedded in musical improvisation may represent the musician's individuality of expression that has been developed via learning and knowledge. In the present study, the transitional patterns that appear in all pieces of music were used to verify the statistical characteristics of familiar sequences in musical improvisation. There are several billion types of transitional patterns in music. Thus, it is difficult to define statistical significance when all of the transitional patterns are targeted. Future study is needed to investigate transitional patterns that do not appear in some pieces of musical improvisation.
It is of note that this study did not directly reveal implicit statistical knowledge of music because only the statistics of musical scores were analyzed. Thus, there may be other possible explanations for the results. For instance, it might be part of a deliberate plan to improvise music moving from familiar to increasingly unfamiliar, based on the statistical structure of music. The possibility cannot be excluded that these findings do not necessarily reflect a composer's implicit statistical knowledge.
Future studies should investigate, in parallel, how implicit statistical learning in music is reflected in neurological response and how learned knowledge is expressed in musical improvisation [27].

Methods
The pieces of music were the same as in Study 1. We used the improvisational sequences of the highest pitches from 21 pieces of music played by three improvisers (i.e., 7 improvisations per performer, Table 1). The pitches were extracted using the transcriptions by Yasutoshi Inamori (Chuoart Publishing Co., Ltd.), Tomoyuki Hayashi (Doremi Music Publishing Co., Ltd.), The McCoy Tyner Collection: Piano Transcriptions: Jazz Giants (Hal Leonard Corporation), and Herbie Hancock Excellent Performance Collection (Doremi Music Publishing Co., Ltd.). Furthermore, the pitches were also confirmed by hearing. The present study only analyzed the highest pitch because the definition of melody in each musical title is still controversial in musicological study, since different melodies could concurrently appear in different titles of music, and melody is often played at the highest pitches. The highest pitches were chosen based on the following definitions: pitches connected with ties were counted as one, and grace notes were excluded. Using all the sequences at the highest pitches, TPs were calculated for each piece of music as a statistic based on Markov chains. The nth-order Markov chain is the conditional probability of an event e_{n+1}, given the preceding n events, based on Bayes' theorem (see Formula (1)).
Then, for each type of pitch transition, all pitches were numbered so that the first pitch was 0 in each transition, and an increase or decrease of one semitone was 1 or −1 relative to the first pitch, respectively. This revealed interval patterns, but not pitch patterns. This procedure was employed to eliminate the effects of change of key on transitional patterns. The interpretation of the change of key depends on the musician, and it is difficult to define in an objective manner. Thus, the results in the present study may represent a variation of the statistics associated with relative pitch rather than absolute pitch. Only the transitional patterns that appear in all music were used in the present study (Table 4). From the second-order Markov chains of interval transitions, transitional patterns that appear in all music could not be detected. Information content (I) was calculated using TPs in the framework of information theory:

$$I(P(e_{n+1} \mid e_n)) = \log_2 \frac{1}{P(e_{n+1} \mid e_n)} \ \text{(bit)} \qquad (2)$$

Table 4. The transitional patterns of pitch intervals that appear in all music sequences based on 0th- and first-order Markov models.

A chi-square test for uniformity (goodness-of-fit test) was conducted to analyze TP distributions, followed by repeated-measures analyses of variance (ANOVAs) with a between-factor improviser (Bill Evans vs Herbie Hancock vs McCoy Tyner) and a within-factor sequence for each hierarchy of the Markov model (i.e., 0th- and first-order Markov chains). When significant effects were detected, Bonferroni-corrected post-hoc tests were conducted for further analysis. Statistical significance levels were set at p < 0.05 for all analyses.
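The information content in Formula (2) is a direct transformation of a TP, which the following minimal Python sketch illustrates. The function name `information_content` and the example probabilities are assumptions made for illustration; they are not values from the study.

```python
import math


def information_content(tp):
    """Information content in bits of an event with transitional probability tp."""
    if not 0 < tp <= 1:
        raise ValueError("tp must be in the interval (0, 1]")
    return math.log2(1 / tp)


# Hypothetical TPs: a frequent transition and a rare one.
frequent = information_content(0.5)    # 1 bit: predictable, little surprise
rare = information_content(0.03125)    # 5 bits: unpredictable, more surprise
```

A transition that always occurs (TP = 1) carries 0 bits, and each halving of the TP adds one bit, so lower TPs correspond to higher information content and greater surprise.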


Relationships among Sequences
The present study examined the statistical structure embedded in musical improvisation based on various-order hierarchical models of TPs to reveal the statistical differences among improvisers.
The TPs of the improvisational sequences containing the highest pitches in the pieces of music played by Bill Evans, McCoy Tyner, and Herbie Hancock were calculated based on different hierarchical Markov stochastic models. The TPs were compared among improvisers and sequences. It was hypothesized that there were general statistical characteristics shared among jazz improvisers, and specific statistical characteristics unique to each improviser. If so, statistical similarity may represent common styles all improvisers shared, whereas statistical differences among improvisers may represent the specific characteristics of each improviser. Thus, universal and independent statistical knowledge among improvisers can be disclosed by computational analysis of music based on the Markov model of pitch intervals in the framework of statistical learning.
In the 0th-order Markov model, the smaller the pitch interval, the higher the TP (Figure 6). From a psychological perspective, statistical learning of TPs contributes to chunk perception. For example, humans can recognize patterns and phrases embedded in tone sequences by calculating the TPs of the sequences, because TPs within phrases are higher than those between phrases. Thus, the present findings may suggest that tones separated by smaller pitch intervals tend to be perceived as a phrase. Furthermore, the findings in the first-order hierarchical model may also support the hypothesis that statistical learning contributes to musical improvisation. In this model, sequences that consisted of intervals of major and minor thirds (i.e., (−4,−7)), which produce a minor chord, showed the highest TPs. These sequences are also frequently used in music. Other studies, however, claim that the perception of music is associated with the gestalt principle [28]. Generally, a sequence in which the interval between adjacent tones is not large is likely to be perceived as a phrase in the framework of the gestalt principle. Thus, the present findings may reflect gestalt perception as well as statistical learning. Future studies using a larger number of pieces of improvisational music are needed to investigate how these two mechanisms interact in music perception and production.
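The idea that phrase boundaries fall where TPs drop can be sketched as follows. This is a toy illustration, not part of the present analysis: the tone labels, the three hypothetical "tone words", the word order, and the threshold of 0.9 are all assumptions chosen so that within-word TPs equal 1.0 and between-word TPs are lower.

```python
from collections import Counter


def first_order_tps(tokens):
    """Estimate P(next | current) from adjacent pairs in the stream."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    first_counts = Counter(tokens[:-1])
    return {pair: c / first_counts[pair[0]] for pair, c in pair_counts.items()}


def segment_at_low_tp(tokens, threshold):
    """Start a new chunk wherever the TP between adjacent tokens
    falls below `threshold`."""
    tps = first_order_tps(tokens)
    chunks, current = [], [tokens[0]]
    for prev, nxt in zip(tokens, tokens[1:]):
        if tps[(prev, nxt)] < threshold:
            chunks.append(current)
            current = []
        current.append(nxt)
    chunks.append(current)
    return chunks


# Hypothetical three-tone "words" concatenated in a varied order.
words = {"X": ["A", "B", "C"], "Y": ["D", "E", "F"], "Z": ["G", "H", "I"]}
word_order = "XYZXZYXYZYXZ"
stream = [tone for w in word_order for tone in words[w]]

chunks = segment_at_low_tp(stream, threshold=0.9)
print(chunks)
```

In this toy stream, each tone inside a word is always followed by the same tone, whereas the last tone of a word is followed by different tones on different occasions, so cutting at low TPs recovers the original words.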

Relationships among Improvisers
The TPs of the sequences (−1) and (1) were lower in McCoy Tyner than in Bill Evans and Herbie Hancock (Figure 7). In contrast, the TPs of the sequences (0), (5), and (−2,0) showed the opposite trend: they were higher in McCoy Tyner than in Bill Evans and Herbie Hancock. Differences were also detected between Bill Evans and Herbie Hancock (i.e., for (1), (3,7), and (−3,−5)). These results may support the hypothesis that musical improvisation partially depends on statistical knowledge, and the individuality of an improvisation style may be reflected in the TP distribution of the music. Interestingly, the results also revealed a general trend in TPs across the three improvisers: the largest difference in TPs was between Bill Evans and McCoy Tyner, with the TPs of Herbie Hancock falling between those of the other two. It would be interesting to determine whether this trend reflects how the improvisations of one improviser influence those of another in the context of statistical knowledge.

Statistical Learning in Psychological and Computational Studies
To understand the brain's higher-order statistical learning systems in a form closer to that used for music, sequential paradigms based on higher-order Markov models have been used in neurophysiological [6,8] as well as computational studies [19]. Furthermore, the nth-order Markov model has been applied to develop artificial intelligence that gives computers learning and decision-making abilities similar to those of the human brain, thus generating systems for automatic music composition [17,42,43]. Such models can verify hierarchies of statistical learning based on TPs of various orders. Music includes higher-order statistics such as hierarchical syntactical structures and grammar. Thus, information-theoretical approaches, including information content based on nth-order Markov models, may be useful in understanding statistical learning (SL) as it functions in response to real-world learning phenomena in the interdisciplinary realms of brain and computational sciences.
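As a reading aid, the information content of Equation (2) can be written for a longer context. The context length k is notation introduced here for illustration and does not appear in the original equations; with k = 1 the expression reduces to Equation (2).

```latex
\begin{align}
P_k &= P(e_{n+1} \mid e_{n-k+1}, \dots, e_n) \\
I_k &= \log_2 \frac{1}{P_k} \ \text{(bit)}
\end{align}
% Worked values: P_k = 1/2 gives 1 bit, P_k = 1/4 gives 2 bits,
% and P_k = 1/8 gives 3 bits, so rarer transitions carry more information.
```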
Recent neural studies have suggested that, during statistical learning, the brain codes the statistics of auditory sequences as relative information, such as the relative distribution of pitch frequencies, and that this information can be used in the comprehension of other sequential structures [7]. This suggests that the brain does not have to code and accumulate all received information and thus saves some memory capacity [44]. Thus, from the perspective of information theory [18], the brain's SL is systematically efficient. Based on these neural findings, the present computational study investigated TP distributions of relative pitches (i.e., pitch intervals) in musical improvisation. The results support the hypothesis that specific as well as general characteristics among improvisers can be detected in TP distributions of pitch intervals. These results may suggest that statistical similarities and differences represent common and specific styles among improvisers, and that universal and independent statistical knowledge among improvisers can be disclosed by computational analysis of music based on the Markov model of pitch intervals in the framework of statistical learning. Note, however, that the present study did not directly investigate the musicians' statistical knowledge of music, as only the statistics of musical scores were analyzed. There may therefore be other explanations for the findings. For instance, a musician might have deliberately planned to improvise in a way that is statistically different from other specific music. Moreover, it is not clear whether actual probabilities are learned regardless of domain-specific features of music such as tonalities and harmonic syntax. Indeed, these three jazz masters may have very distinct personal styles, and are generally regarded as among the more influential jazz pianists. Transition probabilities, however, shape only a small part of their respective styles.
The neural correlates of such computational models have been investigated in a body of previous studies (see reviews [45,46]). For example, the n-gram model, which is frequently verified by neural approaches, is considered to correspond to chunking and word-segmentation processes in statistical learning [2]. The online perception and production of music, however, is not the mere chunking of sequential patterns, like word segmentation, but is a dynamic form of prediction to maintain an aesthetic melody with various temporal and spectral features [47], hierarchical building, and harmonics [48], which interact with each other [28]. Musical prediction and representation are not restricted to a single stream of events but, rather, they interact with each parallel stream [49][50][51]. Future studies are therefore needed to investigate other aspects of music, such as harmony and the hierarchical building of not only single but also parallel streams forming harmony.

Conclusions
The present study investigated how statistical knowledge is reflected in musical improvisation by analyzing statistical structures in music and differences between musicians and between pieces of music. First, the results suggest that time-course variations of statistical structures in music may represent time-course variations of a musician's statistical knowledge, by which a forthcoming tone is statistically defined. Second, the results suggest that there are specific statistical characteristics shared between distinct pieces of music of the same title played by different musicians and between pieces of music of distinct titles played by the same musician. Third, the musical improvisation of a musician may influence the higher-order statistical knowledge (i.e., knowledge of higher-order transitional probability) of other musicians without explicit intention and awareness. Fourth, the results showed statistical similarities and differences that reflect the individuality of each improviser. The present study points to methodologies that can be employed to evaluate how musical improvisations are influenced by other musical improvisations via statistical learning, using interdisciplinary approaches including psychology, musicology, and informatics.