Identification and Description of Outliers in the Densmore Collection of Native American Music

Abstract: This paper presents a method for outlier detection in structured music corpora. Given a music collection organised into groups of songs, the method discovers contrast patterns which are significantly infrequent in a group. Discovered patterns identify and describe outlier songs exhibiting unusual properties in the context of their group. Applied to the collection of Native American music collated by Frances Densmore (1867–1957) during fieldwork among several North American tribes, and employing Densmore's music content descriptors, the proposed method successfully discovers a concise set of patterns and outliers, many of which correspond closely to observations about tribal repertoires and songs presented by Densmore. The paper thus explores synergies between pre-digital quantitative music studies and research in data mining: it revisits Densmore's collection and analysis of Native American music using computational pattern discovery and outlier detection.


Introduction
Computational ethnomusicology refers to the development and exploitation of technological tools and computational methods for accessing, analysing and documenting world musics [1]. Early, pre-digital approaches were largely championed by anthropologists and musicologists, including pioneering applications to Native American language and music such as possibly the earliest adoption of the phonograph [2], phonophotographic analysis of recorded songs [3] and quantitative studies of song corpora [4]. During recent decades, the field has often been dominated by technological and computational perspectives: applications have been seen as lacking a clear musicological motivation, and results have been described as "solutions in search of a problem" ([1], p. 12). This paper explores synergies between pre-digital quantitative music studies and research in data mining: it revisits Frances Densmore's collection and analysis of Native American music using computational pattern discovery and outlier detection.
Frances Densmore (1867–1957), regarded as one of the leading experts on Native American music in her time ([5], p. 45), collected music to support its preservation and further study by scholars as well as contemporary composers [6]. In her work for the Bureau of American Ethnology [7][8][9][10][11][12][13][14][15][16][17][18], she "selected typical tribes in the principal areas of the United States and gave special attention to recording their old songs and preserving the information concerning their use" ([19], p. 176). In addition to transcriptions of songs and contextual information, Densmore published systematic analyses of many songs and suggested common characteristics of tribal repertoires and interesting features of individual songs (see example in Figure 1). Densmore's quantitative comparison of songs by different tribes (Figure 1, bottom) can productively be seen as an instance of contrast pattern mining [20]. Within the repertoire of a tribe, songs may be distinguished by features which are surprisingly rare, unusual or peculiar (Figure 1, top right), that is, they deviate from common characteristics of the tribe's songs and thus represent outliers [21]. Figure 1 gives an example: in the Papago group, songs with a harmonic structure (with chord-based intervals between contiguous accented tones) are under-represented in comparison with other tribal repertoires and hence unusual. In this paper we present a method for detecting and describing outliers in grouped data based on infrequent contrast patterns and illustrate this method in an application to Densmore's collection and analyses of Native American music.
The current paper outlines a method which integrates identification and symbolic description of outliers. More specifically we introduce an outlier detection method which exploits significantly infrequent contrast patterns to both identify and describe songs in a music collection which show unusual properties in the context of related songs (Section 3). The method is applied to the Densmore collection of Native American music (Section 2), and discovered outliers are presented with reference to Densmore's writings (Section 4). The proposed method complements previous work in outlier detection and music pattern discovery, and findings from its application to Densmore's collection illustrate its potential for studies in computational ethnomusicology (Section 5).

Densmore's Collection and Analysis of Native American Music
To study Native American music, Densmore carried out field observation and recorded songs, which she subsequently transcribed and analysed ([6], p. 191). Many of the resulting materials were published in bulletins of the Bureau of American Ethnology (BAE), Smithsonian Institution, with which she became associated as a collaborator in 1907 [5]. As evident from Table 1, the publications were collated over almost half a century and cover an extensive repertoire of Native American music, spanning the continent from the West coast to the Great Lakes and the Southeast. Generally songs were collected on reservations. As an exception, the publication on Pueblos music represents songs from the Isleta, Cochiti and Zuñi Pueblos in New Mexico but is based on recordings undertaken in Washington and Wisconsin: records of 16 Acoma songs made available by Dr. M. W. Stirling, director of the BAE, and recordings made by Densmore during the Stand Rock Indian Ceremonial at Wisconsin Dells, in which members of Pueblos tribes took part [18].
Metadata. Densmore's publications provide a range of metadata on individual songs, such as a unique catalogue number for each song, a serial number within the respective bulletin, information related to collecting a song (e.g., the name of the singer) and the use of the song (e.g., song type or context of use). This study primarily considers the tribes among which songs were collected, following the structure of the bulletins to partition the corpus into groups: focusing on songs covered by Densmore's systematic analyses, the corpus in the current study comprises 1700 songs, organised into eleven groups (Table 1).
Transcriptions. The transcriptions presented in Densmore's publications (see example in Figure 1, top right) are generally based on phonograph recordings, where possible taking into account multiple renditions of a song. Densmore transcribed the recordings into standard musical notation, occasionally using special signs to indicate slight deviations in pitch or duration. While standard notation does not systematically reflect microtonal or timbral aspects of performances, it provides an accessible format ([6], p. 192) and serves as a basis to analyse the melodic trend, principal rhythm and general character of songs which are maintained across renditions ([7], p. 3). In our work we use Densmore's transcriptions during data preparation to disambiguate uncertain feature assignments in her analyses and to replace selected missing values.
Music content analysis of songs. In her systematic analyses of the collected songs, Densmore applied various analysis criteria relating to tone material, melody and rhythmic-metric aspects. Criteria were chosen which can be analysed for large numbers of songs ([15], p. 37). For these attributes Densmore determined an attribute value for each song. The bulletins record these analyses by serial number, listing against each attribute value the songs matching the respective value (Figure 1, top left).
In the current study we consider 13 of Densmore's attributes which are documented across the majority of bulletins (Table 2). While Densmore's analyses are remarkably consistent, given the extended period of time from the 1910s to the late 1950s, the selection and terminology of attributes and their values vary slightly between bulletins. To allow analysis of feature distributions across the groups, attribute names and values were harmonised. Missing attributes were added based on the song transcriptions where feasible; the remaining cases are explicitly encoded as missing the attribute value.
Tabular comparative analyses. The music content analysis of songs serves as a preliminary to Densmore's tabular analyses ([4], p. 549); an example of her quantitative comparison between songs of different tribes is shown in Figure 1 (bottom left). Generally, Densmore compared the repertoire under consideration against all previously analysed songs. These analyses reveal features which are over- or under-represented in a tribe compared to other tribes and can contextualise outlying examples and their description. Please note that Densmore's comparisons depend on the chronology of publications, analysing a tribe against songs published in earlier bulletins (e.g., Northern Ute songs against the combined Chippewa and Sioux songs) rather than against all other songs in the complete corpus of 1700 songs to which outlier detection is applied in the current study.
Textual commentary. The transcriptions of songs as well as the comparative analyses are accompanied by narrative descriptions of songs or groups (Figure 1, right), which serve as a reference for the interpretation of our data mining results. It should be noted that these descriptions are selective rather than exhaustive, i.e., they do not necessarily cover all analysis attributes nor do they always assess observed features as either typical or unusual. Thus, they are more suited to support qualitative than quantitative evaluation of discovered outliers.

Outlier Identification and Description with Infrequent Contrast Patterns
The method presented in this paper discovers local outliers in group-labelled data, using infrequent contrast patterns: given a set of songs organised into groups, it finds songs distinguished by one or more properties which are unexpectedly rare in the relevant group. The method is summarised in Figure 2, which outlines the partitioning of the dataset in preparation for contrast mining and outlier detection.

Data Representation and Organisation
Each song in the database is encoded by multiple attributes. Attributes are categorical (such as musical material, e.g., major pentatonic scale); numeric attributes are discretised into value ranges (such as compass of a song, e.g., between a fifth and an octave). If for a song an attribute is undetermined its value is explicitly marked as missing. For convenience, an attribute-value pair (e.g., material: major_pentatonic) will be called a feature; a feature set is a set of features (e.g., {material: major_pentatonic, compass: five_to_eight_tones}). A song satisfies a feature if it is described by the attribute-value pair; it satisfies a feature set if it satisfies all features in the set. The number of songs satisfying a feature set is called the support count of the feature set.
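The definitions above can be illustrated with a minimal Python sketch. The type aliases, the function names `satisfies` and `support_count`, the use of `None` for a missing value and the value name `nine_to_twelve_tones` are assumptions made for illustration; they are not taken from the authors' implementation.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

Feature = Tuple[str, str]        # (attribute, value), e.g. ("material", "major_pentatonic")
Song = Dict[str, Optional[str]]  # attribute -> value; None marks a missing value (assumption)


def satisfies(song: Song, feature_set: FrozenSet[Feature]) -> bool:
    """A song satisfies a feature set if it satisfies every feature in the set."""
    return all(song.get(attribute) == value for attribute, value in feature_set)


def support_count(songs: List[Song], feature_set: FrozenSet[Feature]) -> int:
    """Number of songs satisfying the feature set."""
    return sum(satisfies(song, feature_set) for song in songs)


# Toy data, not taken from the Densmore corpus.
songs: List[Song] = [
    {"material": "major_pentatonic", "compass": "five_to_eight_tones"},
    {"material": "major_pentatonic", "compass": "nine_to_twelve_tones"},
    {"material": None, "compass": "five_to_eight_tones"},
]
pattern = frozenset({("material", "major_pentatonic"),
                     ("compass", "five_to_eight_tones")})
print(support_count(songs, pattern))
```

In this sketch a song with a missing value simply fails to satisfy the feature; the adjustment of counts for missing values is a separate step of the method.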

Figure 2. Overview of the method (one-vs-all partitioning). The input is a corpus of songs that is partitioned into groups G_1, ..., G_k, and the output is the union of the outlier songs found in each group.
In addition, each song is associated with a group label. Labels can refer, for example, to different ethnic or language groups, geographical regions, diachronic periods or song types. The group labels are used to partition the dataset for contrast mining and outlier detection, as illustrated in Figure 2: each group in the dataset is taken in turn as the target group G and compared against a background ¬G formed by the union of all other groups. Thus, in a dataset organised into k groups, the mining performs k one-vs-all comparisons [46] to discover infrequent contrast patterns and outlying songs for each of the groups. Interestingly, Densmore's group-level analyses generally are also carried out as one-vs-all comparisons (see example in Figure 1, bottom).
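As a rough illustration of the one-vs-all partitioning, the following Python sketch yields one (target, background) split per group label. The function name `one_vs_all`, the input format of (label, song) pairs and the toy attribute values are assumptions for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple

Song = Dict[str, str]


def one_vs_all(labelled_songs: List[Tuple[str, Song]]
               ) -> Iterator[Tuple[str, List[Song], List[Song]]]:
    """Yield (label, target group G, background not-G) for every group label."""
    groups: Dict[str, List[Song]] = defaultdict(list)
    for label, song in labelled_songs:
        groups[label].append(song)
    for label, target in groups.items():
        # The background is the union of all other groups.
        background = [song
                      for other, members in groups.items() if other != label
                      for song in members]
        yield label, target, background


# Toy corpus with three groups (values are illustrative, not Densmore's data).
corpus = [
    ("Papago", {"structure": "harmonic"}),
    ("Papago", {"structure": "melodic"}),
    ("Nootka", {"structure": "melodic"}),
    ("Choctaw", {"structure": "harmonic"}),
]
for label, target, background in one_vs_all(corpus):
    print(label, len(target), len(background))
```

A corpus with k distinct labels produces exactly k splits, each covering the whole corpus.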

Discovery of Significantly Infrequent Contrast Patterns
A pattern is a feature set associated with a group. A pattern is infrequent in a group if the proportion of songs in the group which support the feature set is low. However, if the feature set is infrequent across the complete dataset, its low frequency in the group is trivial, not specific to the group. Thus, we are particularly interested in feature sets which are proportionally less frequent in the target group than in the background; feature sets satisfying this condition will be called infrequent contrast patterns. Figure 3 (left) illustrates the concept schematically. The two larger boxes represent the target group G of songs labelled with the respective group label (light grey box) and the remaining corpus ¬G of songs in other groups (dark grey box). The inner white box indicates the songs satisfying a feature-set pattern X. For the purposes of outlier detection and description, the pattern should cover a proportionally smaller part in the target group than in the background, that is (X, G) relative to G should be smaller than (X, ¬G) relative to ¬G. Figure 3 (right) indicates the 2 × 2 contingency table underlying the evaluation of pattern candidates: the table maps the occurrence of a pattern X in a target group G and the background ¬G. Here N denotes the total number of songs in the corpus, n(G) the number of songs in group G, n(X) the number of songs in the complete corpus which satisfy pattern X and n(X, G) the number of songs in group G which satisfy pattern X. The marginal counts in the table are adjusted to take into account songs with missing values in one or more attributes represented in the candidate pattern: for these songs it cannot be determined whether or not they satisfy the feature-set pattern, i.e., whether they satisfy X or ¬X, and thus they should not be considered in counts n(¬X, G) and n(¬X, ¬G), and consequently n(G), n(¬G) and N.
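The adjusted counts of the contingency table and the contrast condition can be sketched in Python as follows. This is a minimal sketch under stated assumptions: missing values are encoded as `None` or as an absent attribute, the function names and dictionary keys are hypothetical, and songs with a missing value in any attribute of the pattern are dropped from all counts, as described above.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

Feature = Tuple[str, str]
Song = Dict[str, Optional[str]]


def _counts(songs: List[Song], pattern: FrozenSet[Feature]) -> Tuple[int, int]:
    """Return (songs satisfying the pattern, songs with all pattern attributes known)."""
    known = [s for s in songs
             if all(s.get(attribute) is not None for attribute, _ in pattern)]
    hits = sum(all(s[attribute] == value for attribute, value in pattern)
               for s in known)
    return hits, len(known)


def contingency(target: List[Song], background: List[Song],
                pattern: FrozenSet[Feature]) -> Dict[str, int]:
    """Adjusted counts n(X, G), n(G), n(X) and N for pattern X and target group G."""
    n_xg, n_g = _counts(target, pattern)
    n_xb, n_b = _counts(background, pattern)
    return {"n_XG": n_xg, "n_G": n_g, "n_X": n_xg + n_xb, "N": n_g + n_b}


def is_infrequent_contrast(table: Dict[str, int]) -> bool:
    """True if X is proportionally rarer in G than in the background."""
    n_b = table["N"] - table["n_G"]
    n_xb = table["n_X"] - table["n_XG"]
    if table["n_G"] == 0 or n_b == 0:
        return False
    return table["n_XG"] / table["n_G"] < n_xb / n_b


# Toy data: one song per part has an undetermined structure.
target = [{"structure": "harmonic"}, {"structure": "melodic"},
          {"structure": None}, {"structure": "melodic"}]
background = [{"structure": "harmonic"}, {"structure": "harmonic"},
              {"structure": "melodic"}, {}]
table = contingency(target, background, frozenset({("structure", "harmonic")}))
print(table, is_infrequent_contrast(table))
```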
A contrast pattern is significantly infrequent if its frequency in the group is lower than expected given the pattern's frequency in the corpus and the size of the group relative to the complete corpus. Statistical significance is evaluated by applying Fisher's exact test (left tail): the p-value computed by the test gives the probability of observing at most n(X, G) songs in G satisfying X, given the (adjusted) marginal counts n(X), n(G) and N. A pattern is considered a significantly infrequent contrast pattern if the p-value is below a specified threshold α. As testing multiple hypotheses, or candidate patterns, increases the probability of finding one or more false positives (family-wise error rate), the threshold α is adjusted using a Bonferroni correction.
Figure 4 provides a concrete example of p-value computation for a real pattern in Densmore's analysis. This is the pattern {structure: harmonic} for the group Papago, which also appears in Figure 1 (group-level analysis). Here the marginal counts are n(Papago) = 167 and n(structure: harmonic) = 184, and the corpus size is N = 987. Figure 4 shows how the p-value varies under different hypothetical support counts for the pattern in the group. The actual support count in the corpus (9, solid vertical bar) gives a low p-value: the pattern is therefore significantly infrequent (α = 0.01), which suggests that songs with the pattern {structure: harmonic} are outliers among Papago songs. Support counts of 20 or more (dashed vertical bar) are not significant (α = 0.01): with such a support count, the songs satisfying the pattern would not be considered outliers in the group.
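The left-tailed Fisher test corresponds to the lower tail of a hypergeometric distribution, which can be sketched in plain Python with exact integer arithmetic. The function names and the number of tested candidates `m` in the Bonferroni helper are assumptions for illustration; the counts in the usage example are those quoted above for {structure: harmonic} in the Papago group.

```python
from math import comb


def fisher_left_tail(n_xg: int, n_g: int, n_x: int, n_total: int) -> float:
    """P(at most n_xg of the n_g group songs satisfy X), given n(X) = n_x and N = n_total."""
    lowest = max(0, n_g + n_x - n_total)  # smallest support count that is possible
    numerator = sum(comb(n_x, i) * comb(n_total - n_x, n_g - i)
                    for i in range(lowest, n_xg + 1))
    return numerator / comb(n_total, n_g)


def bonferroni_threshold(alpha: float, m: int) -> float:
    """Per-test threshold when m candidate patterns are tested (m is an assumed input)."""
    return alpha / m


# Counts quoted in the text: n(Papago) = 167, n(structure: harmonic) = 184, N = 987.
p_observed = fisher_left_tail(9, 167, 184, 987)
p_hypothetical = fisher_left_tail(20, 167, 184, 987)
print(p_observed, p_hypothetical)
```

Under these marginals the expected support count is 167 × 184 / 987 ≈ 31, so an observed count of 9 lies far in the lower tail.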

Pattern Pruning and Outlier Detection
The output of the infrequent contrast pattern mining is used to identify outlying songs: the discovered patterns are unexpectedly rare in the relevant group, thus songs satisfying these patterns may be outliers with respect to other members of the group.
The set of all significantly infrequent patterns includes redundant patterns. To reduce the pattern set considered in the outlier detection, two pruning strategies are applied. First, the pattern set is restricted to minimal significantly infrequent contrast patterns, i.e., with the least number of features, in order to reduce redundancy between discovered patterns and to enhance comprehensibility of outlier descriptions by focusing on concise patterns [47]: a significantly infrequent feature-set pattern is minimal if none of the feature subsets is significantly infrequent. Second, support-based pruning is employed: a maximum support count threshold θ is set such that only patterns with n(X, G) ≤ θ are considered. The value of θ depends on the analysis interest: a low support count threshold restricts the mining output to individual outliers, i.e., songs characterised by patterns which are satisfied by only a few exceptions in a group. A higher threshold, on the other hand, also captures sets of outliers, i.e., several outlying songs sharing the same patterns.
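The two pruning strategies might be sketched as follows. The function name `prune` and the input format (a mapping from each significantly infrequent feature set of one group to its support count n(X, G)) are assumptions for illustration; the feature names in the toy example are placeholders.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Tuple

Feature = Tuple[str, str]
Pattern = FrozenSet[Feature]


def prune(significant: Dict[Pattern, int], theta: int) -> Dict[Pattern, int]:
    """Keep minimal significantly infrequent patterns with support count <= theta."""
    kept: Dict[Pattern, int] = {}
    for pattern, support in significant.items():
        # A pattern is minimal if no proper, non-empty subset is itself significant.
        has_significant_subset = any(
            frozenset(subset) in significant
            for size in range(1, len(pattern))
            for subset in combinations(pattern, size)
        )
        if not has_significant_subset and support <= theta:
            kept[pattern] = support
    return kept


# Toy input: placeholder features a..f with hypothetical support counts.
a, b, c, d, e, f = (("attr_" + ch, "v") for ch in "abcdef")
significant = {
    frozenset({a}): 5,       # minimal, support within threshold
    frozenset({a, b}): 2,    # not minimal: {a} is already significant
    frozenset({c, d}): 20,   # minimal, but support exceeds theta
    frozenset({e, f}): 3,    # minimal feature pair
}
print(prune(significant, theta=15))
```

Minimality is checked against the full set of significant patterns, so a significant subset blocks its supersets even if the subset itself is later removed by the support threshold.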
To detect outliers given the pruned pattern set, the proposed method iterates over the set of contrast patterns and the songs in the respective group which satisfy the patterns, assigning to each song all satisfied patterns. A song is considered an outlier in its group if it satisfies one or more statistically significant infrequent contrast patterns. The patterns used to identify outlying songs at the same time provide a direct description of their unusual properties.
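The assignment of patterns to songs can be sketched as a simple nested loop. The function name `detect_outliers`, the song identifiers and the attribute values other than `structure: harmonic` are hypothetical and serve only as an illustration.

```python
from typing import Dict, FrozenSet, Iterable, List, Optional, Tuple

Feature = Tuple[str, str]
Pattern = FrozenSet[Feature]
Song = Dict[str, Optional[str]]


def detect_outliers(group_songs: Dict[str, Song],
                    patterns: Iterable[Pattern]) -> Dict[str, List[Pattern]]:
    """Map each outlying song ID to all pruned patterns the song satisfies."""
    outliers: Dict[str, List[Pattern]] = {}
    for pattern in patterns:
        for song_id, song in group_songs.items():
            if all(song.get(attribute) == value for attribute, value in pattern):
                outliers.setdefault(song_id, []).append(pattern)
    return outliers


# Toy group with hypothetical song IDs and two pruned patterns.
group = {
    "s1": {"structure": "harmonic", "firstReKey": "octave"},
    "s2": {"structure": "melodic", "firstReKey": "keynote"},
    "s3": {"structure": "melodic", "firstReKey": "octave"},
}
patterns = [frozenset({("structure", "harmonic")}),
            frozenset({("firstReKey", "octave")})]
for song_id, description in detect_outliers(group, patterns).items():
    print(song_id, [sorted(p) for p in description])
```

Songs that satisfy none of the patterns do not appear in the result; the list attached to each outlier doubles as its description.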

Outliers in the Densmore Collection of Native American Music
The proposed outlier detection method was applied to the Densmore corpus of 1700 Native American songs which are organised into eleven groups, as described in Section 2. The results presented here are obtained with a significance level of α = 0.01 and maximum support count threshold θ = 15. Table 3 lists the significant minimal contrast patterns discovered in the analysis corpus. For reference, Table 3 gives both the proportion of songs satisfying the respective pattern in the group and their proportion in the background: in each case the proportion in the group is significantly lower than the frequency in the background of all remaining groups. The focus on minimal patterns favours short patterns, providing concise descriptions; the listed patterns consist of single features or feature pairs. Inspection of the outliers identified and described by these infrequent patterns supports several observations, which are summarised in the following sections.
Table 3. Significantly infrequent minimal contrast patterns in the Densmore collection (significance level α = 0.01, with Bonferroni correction; maximum support count threshold θ = 15).
Table 4 lists several outliers characterised by single-feature patterns whose descriptions can be related to similar observations in Densmore's writings. For example, the Nootka song ID1337 "is a lively, interesting melody, with a compass of 10 tones, and begins on the highest tone of the compass [the third above the octave], which is unusual" ([16], p. 93, our emphasis). Densmore's description of the Papago song ID1027 explicitly refers to the context of the song's group: "The song is harmonic in structure, which is unusual in the present series" ([13], p. 112, our emphasis). If a categorical attribute is based on an inherently ordinal scale, the outlying feature may represent a value towards the margins of the scale: the Choctaw song ID2357 "has a compass of 11 tones which is the largest in the Choctaw songs" ([17], p. 140, our emphasis). Similarly, the compass of ten tones in the Yuman song ID1241 is "larger than any other song in the analysed group" ([15], p. 196, our emphasis). For some discovered outliers the outlier description is reflected in Densmore's comments on the song without Densmore explicitly identifying a feature as rare. In several cases, however, the unusual occurrence of properties can be inferred from Densmore's comments on other songs in the group. For example, while for the Nootka song ID1337 Densmore emphasises the unusual beginning on the tenth, for song ID1345 the same property is not explicitly described as being unusual though it can be assumed by analogy to ID1337. In still other cases Densmore's comments do not include the features highlighted in the outlier description. For example, Densmore's narrative analysis of the Pueblos song ID1907 omits reference to the constant metre: "This is a pleasing melody with a simple rhythmic unit and a compass of nine tones. It progresses chiefly by whole tones, which comprise 19 of the 33 intervals. Next in frequency is a minor third" ([18], p. 27). Nevertheless, the outlier status of these songs is indirectly confirmed by Densmore's group-level description: "A change of measure lengths occurs in 97 percent of these Pueblo songs [...] and in only 80 percent of the combined group" ([18], p. 112); less than 3% of the analysed Pueblos songs (2 out of 82 songs) contain no change of metre, a smaller proportion than in any other group. Similarly, the infrequent beginning on the octave in Nootka and Quileute songs, such as songs ID1303 and ID1474, is highlighted at the group level, here explicitly emphasising the under-representation of the feature: "With one exception, the Nootka and Quileute songs show the smallest percentage beginning on the octave above the keynote, only 6 percent having this beginning in contrast to 17 percent in the total number of songs previously analyzed" ([16], p. 42). Please note that these two songs satisfy not only one but two infrequent patterns (a mixed structure as well as the beginning on the octave), which suggest that they are outliers in their group.

Outlier Songs with Unusual Feature Combinations
While Densmore's tabular analyses consider each attribute on its own (Figure 1, left), computational analysis supports systematic consideration of feature combinations. Table 5 lists six of the 16 outliers suggested by infrequent feature sets. In these cases it is not the individual feature which is rare but the co-occurrence of the features participating in the patterns; hence this is different from songs ID1303 and ID1474 in Table 4 which satisfy more than one infrequent single-feature pattern. For example, 68 (41%) of the Papago songs in the corpus end on the keynote, but only two of these, songs ID930 and ID954, start on the octave above the keynote. Among Menominee songs in the collection, 40% have a compass of between five and eight tones and 31% start with an ascending interval, but only 10% satisfy both of these features. In fact, in her description of song ID1609 Densmore points out that the compass of eight tones is shared among several Menominee songs. In the three examples in Table 5 in which Densmore highlights a property which is unusual (songs ID930 and ID1607) or interesting to note (song ID1632), her narrative description is more specific than the attributes underlying her quantitative analysis (firstReKey and firstDir): in song ID930, the initial tone is the octave above the keynote additionally characterised by the duration of a half note; for songs ID1607 and ID1632, the textual description further specifies the upward-directed motion at the beginning of the song by its interval. The results demonstrate that the computational outlier detection method can discover significantly infrequent feature sets, but generally mining for minimal infrequent contrast patterns favours short, even single-feature, patterns; in the current study, this facilitates the comparison of computationally discovered outliers with Densmore's own analyses.
Table 5. Outliers in the Densmore collection with significantly infrequent feature combinations.

Table 6 illustrates outliers pointing to challenges in the transcription or analysis of a song, or to peculiarities in the performance when the song was collected. For example, for the Chippewa song ID70 Densmore comments that "[the] tonality of this song is obscure. It is transcribed exactly as sung, the different renditions being identical, yet the key is not definitely established, neither are modulations indicated with sufficient clearness to be safely assumed" ([7], p. 80), hence the ending of the song cannot be related to a clearly defined keynote (lastReKey: irregular). In the Chippewa song ID125, the ambivalence between two keys arises from the presentation of the song, when the singer shifted pitch by a semitone between sections of the song. The resulting change in key from C minor to C sharp minor highlighted in Densmore's narrative analysis is clearly reflected by the change of key signature in her transcription of the song (Figure 5). The Choctaw songs in Densmore's collection generally begin on an accented count of the measure ([17], p. 182); the song ID2271 is one of only eight (out of 65) Choctaw songs starting with an unaccented tone. However, two renditions of this song were recorded and transcribed, and only the first rendition indeed begins on an unaccented part of the measure, while the second starts on the first, accented, beat of the measure, thus following the more common trend of the Choctaw songs in the collection.

Context
Reference to metadata or contextual information can support further exploration of discovered outliers. The outliers listed in Table 7 provide some examples. Infrequent properties of outliers may be related to a different origin or type of song, a minority subset within the song's class: Densmore's publication on Yuman and Yaqui songs also presents three songs of the Mohave, including the Mohave song ID1289 with a compass of nine to twelve tones: "only four [Mohave songs] were recorded, as the Mohave songs were not a subject of special investigation. Three were transcribed [...]. These Mohave songs have a somewhat larger compass and are more lively in general character than the songs of the other Yuman tribes under consideration" ([15], p. 183, our emphasis). As another example, the Teton Sioux song ID679 "is said to be the only lullaby used among the Sioux. [...] This and song No. 204 [ID682] are the only songs in the present series having a compass of but four tones" ([9], p. 493, our emphasis). Song ID682 is a begging song with "a wailing melody, well calculated to wear out the patience of the listeners" ([9], p. 482). Alternatively, inspection of outlying songs sharing the same unusual features may reveal links between individual songs. Commenting on the Chippewa song ID280, Densmore remarks: "This melody contains three peculiarities which rarely occur in Chippewa songs. First, it begins and ends on the same tone [the keynote]. This feature is found in only 11 songs (3 per cent) of the entire series of 340" ([8], pp. 139-140, our emphasis). Among these 11 songs, songs ID102 and ID447 are in fact two versions of the same lullaby recorded at different reservations (Figure 6): "The only two songs which the Lac du Flambeau Chippewa were found to have in common with the White Earth Chippewa are the lullaby and the song accompanying the folk tale of We'nabo'jo and the ducks" ([8], p. 241). The ending on the keynote is common among Chippewa songs, thus the outlier description based on minimal infrequent patterns highlights the beginning on the keynote.

Discussion and Conclusions
The method introduced in this paper integrates outlier detection and description in grouped data based on infrequent contrast patterns. The outlier identification is unsupervised: the method is used for exploratory analysis of a music collection in which neither outliers nor inliers are known a priori.
Existing work in outlier detection has exploited both frequent and infrequent patterns: an example is less likely an outlier if it satisfies frequent patterns [48], or more likely an outlier if it satisfies infrequent patterns [49] or violates frequent patterns [50]. Despite the suitability of patterns for descriptive analysis, however, among the mentioned methods only FP-Outlier [48] explicitly includes outlier description: while the identification of outliers is based on frequent patterns satisfied by an example (with a larger number of satisfied frequent patterns making the example less likely to be an outlier), the description is based on frequent patterns not satisfied by the example (contradicting patterns), i.e., identification and description are based on different sets of patterns. In contrast, our method elegantly uses the same significantly infrequent patterns to both identify and describe outliers. Furthermore, our analysis interest is complementary to existing work on pattern-based outlier detection in grouped data. On the one hand, as unsupervised outlier identification the method differs from supervised or semi-supervised approaches which contrast a target group (known outliers or unlabelled examples) against a background assumed to contain no outliers (examples labelled 'normal') [51][52][53]. On the other hand, among unsupervised outlier detection methods it focuses on attribute outliers, i.e., examples with unusual attribute values or attribute-value combinations in the context of their group, rather than class outliers or class noise, i.e., examples with a deviating and potentially erroneous group label [54].
Applications of outlier detection to music generally assume numeric input data and employ instance- or model-based outlier detection methods [40][41][42][43]. In contrast to the pattern-based outlier analysis proposed here, the existing applications lack an explicit description of outlying examples.
Within computational ethnomusicology, our method builds on research in contrast pattern mining [55]. In particular, it complements work on antipatterns, i.e., patterns over-represented in the background or anticorpus [35]: while the earlier study illustrated that exceptions to antipatterns might reveal erroneous or ambiguous class labelling (class outliers), here songs described by infrequent patterns reveal properties which are unusual within their class (attribute outliers). In this context, Densmore's collection and analyses provide an invaluable opportunity to qualitatively assess outlier identification for exploratory analysis of music collections. The fact that many discovered outliers can be directly or indirectly matched to related observations by Densmore indicates that the proposed method is effective in identifying songs with unusual properties. Beyond Densmore's manual analysis, computational outlier detection and description facilitates transparent discovery of infrequent feature combinations in addition to infrequent single features. A further extension to Densmore's analysis concerns the evaluation of pattern candidates, which employs a statistical test to discover significantly infrequent patterns.
Descriptions of outlying songs and their groups depend on the chosen feature vocabulary. The study presented here has adopted Densmore's own content descriptors, thus allowing results of the computational analysis to be compared with Densmore's observations. Digital encoding of the Densmore collection [56] offers opportunities to complement Densmore's features by computational feature extraction [57] and sequential pattern mining [35], both to systematically analyse aspects occasionally mentioned in Densmore's narrative analyses but not captured in her features (e.g., linking melodic and duration features) and to add further music content descriptors (e.g., aspects of melodic contour or melodic motifs [20,58]). Computational features applied to symbolically encoded music data focus on structural features and generally do not reflect aspects of performance or context. Interestingly, as illustrated in this paper, further inspection of outliers discovered based on music content analysis may lead to investigation of performance or context. More generally, beyond supporting exploratory navigation of large music collections, pattern-based outlier detection and description can provide a starting point to combine corpus-level analysis with subsequent in-depth study of selected songs.