Genotyping by Sequencing Reveals Genetic Relatedness of Southwestern U.S. Blue Maize Landraces

Maize has played a key role in the sustenance and cultural traditions of the inhabitants of the southwestern USA for many centuries. Blue maize is an important component of the diverse landraces still cultivated in the region but the degree to which they are related is unknown. This research was designed to ascertain the genotypic, morphological, and phenotypic diversity of six representative southwestern blue maize landraces. Their genotypic diversity was examined using tunable genotyping-by-sequencing (tGBS™). A total of 81,038 high quality SNPs were identified and obtained through tGBS. A total of 45 morphological and biochemical traits were evaluated at two locations in New Mexico. The varieties Los Lunas High and Flor del Rio were genetically less related with other southwestern landraces whereas diffusion between Navajo Blue, Hopi Blue, Yoeme Blue, and Taos Blue demonstrated that these landraces were genetically related. Phenotypic variability was highest for kernel traits and least for plant traits. Plant, ear, and kernel traits were fairly consistent within and across locations. Principal component analysis and tGBS showed that Corn Belt variety ‘Ohio Blue’ was distinctly different from southwestern landraces. Genotypic analysis displayed that southwestern landraces are genetically closely related, but selection has resulted in differing phenotypes. This study has provided additional insight into the genetic relatedness of southwestern blue maize landraces.


Introduction
The development of modern cultivars and farming systems narrows the germplasm base and heightens crop genetic vulnerability [1,2]. Less genetic diversity will restrict our capacity to maintain and enhance crop production and our ability to respond to climate change. Plant genetic resources are vital assets for improving human conditions and crop diversity must be preserved to ensure global food security. Landraces are the primary contributors to the diversity of our genetic resources. They are essential in traditional farming systems, conventional or modern breeding, and genetic engineering programs. Seed banks play a vital role in the preservation of genetic diversity, and so too does conserving landraces in situ.
Historically, the richness of in situ crop genetic diversity has been protected within "cultural landscapes". In Mexico, the birthplace of maize [3], Hernandez [4] envisioned the landscape as a three-way relationship between environment, culture, and maize. Sadly, roughly 80% of the genetic diversity of maize has already been lost. Fortunately, substantial reservoirs of crop diversity remain in certain regions where landraces are still used in traditional farming systems. These unique areas can be termed hotspots [5] or "primary regions of diversity" [6]. Identification, characterization, and preservation of the remaining crop genetic resources in hotspots is urgently needed.
The southwestern region of the USA (the Southwest) is such a relatively unknown hotspot for traditional cultivation of diverse maize landraces. Archeological evidence confirms that native communities of the Southwest have practiced maize farming for more than 4100 years [5,7,8]; making it the oldest continuously managed agricultural area in the USA. Hunter-gatherers during the Basketmaker Era gradually adopted races introduced from the Mexican Highlands. These races became adapted to semi-arid agricultural systems [9] across large elevational gradients, and new races evolved over millennia with periodic influxes of different varieties and races from Mexico [7,[10][11][12] and through trade with Southern Plains Indian tribes [13]. An era of rapid development of maize cultivation in the Southwest occurred starting approximately 2000 years ago, a time that also coincided with an influx of Chapalote and Reventador races that came to the Southwest through Pacific Coastal routes in present Sinaloa, Mexico [7]. Pre-colonial southwestern landraces assumed paramount importance in the native farm communities that cultivated, conserved, and maintained them for generations [14].
Increasingly dynamic movement, exchange, and interaction of diverse crop germplasm likely occurred with the arrival of new farming cultures in the Southwest during the Spanish Colonial and U.S. Territorial eras. The increasing movement of peoples and seed made possible by improvements in transportation likely facilitated a higher frequency of genetic exchange between races resulting in the occurrence of racial admixtures. Anderson and Cutler [15] noted the presence of what they referred to as "recent admixtures" and some obvious "intermediates" between Pima-Papago and Pueblo races. A north-south pathway of gene influx along the eastern piedmont of the Sierra Madre Occidental has also been proposed by Hernández and Flores [13] who described the similarities between the newly identified northern race of Maiz Azul (Blue Maize) in northern Mexico and Puebloan maize from New Mexico, and also between the races Blandito de Sonora and floury Papago (Pima-Papago) maize [13,16].
Sturtevant [17] noted that 18 of 18 samples of maize cultivated by Native American Indians in Arizona and New Mexico displayed soft (floury) kernels. That author also commented on the diverse colors of southwestern maize varieties, notably samples from the Zuni and Tesuque Pueblos of New Mexico. Anderson and Cutler [15] stated that Pueblo maize is usually colored and Pima-Papago maize is either white or a bright light yellow. Blue maize can still be found throughout the Pueblos of New Mexico, and it is also of special importance to the Hopi of northern Arizona [18]. Blue maize is also grown on Navajo farms in Arizona, New Mexico, and Utah, but as Nabhan [19] pointed out, Navajo blue floury maize can look remarkably different from Hopi blue floury maize grown just a few miles away.
Today, blue maize is highly valued by southwestern Hispanic and Native American communities, especially in northern New Mexico. It also appears with lower frequency in other parts of the Southwest, e.g., in the homelands of the Yoeme (southern Arizona and northern Sonora, Mexico). It is not represented among the predominant landrace (Pima-Papago) of the Pima and Tohono O-odham tribes of Arizona. Of the two major southwestern races, Pima-Papago is considered to be relatively uniform phenotypically, whereas Pueblo maize may display traits that have resulted from recent influxes of races such as Southeastern and Southern Dent or even Corn Belt Dent [20,21]. Doebley et al. [22] examined the taxonomic and anthropological implications of diverse landraces of southwestern maize and concluded that Pima-Papago and Puebloan maize differed in isozyme constitution, but showed some overlap, suggesting gene exchange between races that associated with geographically isolated cultural systems. Papago maize displayed little isozyme variation within landraces, but much among landraces. Pueblo maize showed considerable variation within landraces, but less among the landraces.
Maize landraces are open-pollinated varieties from which farmers save seed for subsequent planting. They are not static populations since they continue to evolve in response to farmer-directed and natural selection for adaptation to local physical, social, and cultural environments within a particular geographical region [23][24][25]. Traditional farmers have retained these landraces for their particular storage, cooking, nutritional, and processing qualities, as well as for historical and cultural reasons. Those reasons may include a desire for traditional foods, dietary diversity, fulfilling market niches or use in religious ceremonies [26,27].
Use of specific landrace varieties is typically associated with perceptual distinctiveness (PD) traits, which serve as indicators for identification and maintenance of landrace integrity [28]. These traits can assist in maintaining the genetic purity between diverse landraces suited for planting at particular locations or for various end-uses. In northwestern Mexico, the PD trait of kernel color has been popularly used by farmers as an ecological, dietary, and medicinal indicator [26]. The southwestern blue maize varieties likely reflect the same or very similar PD trait selection. It is assumed that Southwestern blue maize does not constitute a race such as Maiz Azul (Blue Maize) in the isolated high-altitude regions of Chihuahua, Mexico. Rather, traditional farmers in different geographic regions may have independently recognized blue kernel color as a PD trait reflecting their preference for its intrinsic value or as an indicator of ecological or cultural value. In this case, nomenclatural aggregation of southwestern landraces of blue maize could be useful primarily for similar end-use product differentiation.
The PD traits, which allow recognition of individual landraces by farmers, can also be used by taxonomists to create and manage racial diversity [28,29]. The need for natural classification, and difficulties associated with grouping maize into natural races and sub-races, was discussed by Anderson and Cutler [15]. Classical studies contributed fundamental principles for racial classification based on morphological traits [30,31] and natural classification [32][33][34]. Because morphological traits are influenced by environmental factors, and because many interacting genes often contribute to trait expression, morphological diversity is not an ideal measure of genetic diversity. Variability for ear morphology traits can make classification of maize accessions across regions difficult [35] but their relationship to PD traits used by farmers makes them relevant. We wished to determine whether natural classification groups could be achieved using the different trait data, or if genetic diversity would be displayed by southwestern blue maize varieties expressing the same PD trait (anthocyanin based blue/purple kernel color). We examined representative landraces of southwestern blue maize using molecular, morphological, and biochemical descriptors from replicated test sites in New Mexico.

Results
Tunable genotyping by sequencing (tGBS) of blue southwestern maize landraces was performed to identify their genetic relatedness. We also examined morphological and biochemical traits previously used to classify maize races and sub-races to allow comparison with tGBS findings. The genotypic analysis (tGBS) showed degree of relatedness among the southwestern landraces, but all (except Navajo Blue) were unrelated to the Corn Belt variety 'Ohio Blue'. Morphological and kernel compositional traits ascertained phenotypic and biochemical variation. The highest variability between southwestern landraces was observed for kernel traits and the lowest for plant traits. Several sets of single nucleotide polymorphisms (SNPs) were generated during tGBS analysis. The first set of 1,437,967 polymorphic sites included all sites that differed from the reference in at least one sample. This set was generated after all reads that aligned to the reference genome. Then we examined, sample-by-sample and identified a set of 217,178 "ALL SNPS", which aligned to the polymorphic sites. Subsequently, ALL SNPs were further filtered and a subset of high-quality SNPs was identified that had less than 50% missing data (low missing data or LMD50) across the 105 individuals. The resulting LMD50 (each of which was genotyped in at least 50% of the samples) SNP set contained 81,038 SNPs.
Distributions of various characteristics for the LMD50 SNPs dataset, including quantity of missing data, minor allele frequency, heterozygosity and genotype number, are summarized in Figure 1. In the tGBS analysis, a total of 81,038 high quality SNPs were identified, each of which exhibited less than 50% missing data among the 105 samples used to create the phylogenetic tree. The summary of SNP genotypes including the number of SNPs that are homozygous for the REF allele (reference allele), homozygous for the ALT allele (alternate allele), and heterozygous and missing data, can be seen in the top panel of Figure 2. The bottom portion of the panel shows the proportions of the SNPs per sample that are homozygous for the REF allele, homozygous for the ALT allele, or heterozygous among the non-missing data. The average missing data rate per LMD50 SNP site across samples is provided in the left panel in Figure 3. Sequencing data support 68.3% of all possible SNP calls. The right panel of Figure 3 presents the minimum, maximum, average and median number of reads per SNP per sample. Each SNP call was supported by 44tGBS sequence reads per sample, thus ensuring the accuracy of these non-imputed SNP calls.

Population Structure Analysis
Based on 81,038 SNPs from 105 individuals of seven accessions, the population structure within southwestern blue maize landraces was investigated. We ran the Admixture software with K ranging from 1 to 10 in the assumptions of 1 to 10 sub-populations defined in the studied genotypes. By the cross-validation error (CV error) among K = 1 to 10, the minimum CV error was detected at K = 2 ( Figure S1) and consequently, the population was divided in two groups including six and one genotypes, respectively ( Figure 4). Group 1 comprised of Navajo Blue, Los Lunas High Blue, Flor del Rio, Yoeme Blue, Hopi Blue, and Taos Blue landraces, which are of Southwest origin. Group 2 consists of Midwestern Ohio Blue (Figure 4). The population structure of K = 1 to 10 is shown in Figure S2, which shows the presence of sub-population in each population structure analysis such as one sub-population in K = 1, two sub-populations in K = 2 and so forth until K = 10. The percent contribution of each sub-population in respective population structure analysis is shown in Table S1. The structure analysis has validated the distinct separation of Ohio Blue from southwestern blue maize landraces as seen in the phylogenetic analysis and principal component analyses.

Phylogenetic Analysis
The phylogenetic tree of all 105 sequenced genotypes is shown in Figure 5 which displays the evolutionary relationships between the studied landraces. The phylogenic tree is categorized with landraces using different colors. All samples from a given accession (one color) are grouped together or at most two landraces are mingled within the tree. The phylogenetic tree further validates the distinct difference of monophyletic group Ohio Blue from all southwestern landraces as it represents evidence of independent development of Ohio Blue without genetic exchange with other landraces in the phylogeny. Flor del Rio and Los Lunas High were also grouped separately from the rest of the landraces. There was evidence of genetic exchange between individual plants of Flor del Rio and Los Lunas High populations. The paraphyletic group of Navajo Blue, Hopi Blue, Taos Blue, and Yoeme Blue were mixed, amongst all four accessions and discerns evidence of genetic exchange between these landraces. Hopi Blue was more variably distributed than the rest of the landraces. The genetic exchange of this accession showed that Hopi Blue and Yoeme Blue were genetically related with each other. Taos Blue and Navajo Blue are also closely related to each other. . Phylogenetic tree of sequenced genotypes. Phylogenetic tree was built using 81,038 high quality SNPs, which exhibited less than 50% missing data among the 105 samples.

Quantitative Trait Variation
In phenotypic evaluations, pre-harvest plant and post-harvest trait diversity were discerned through detailed examination of variation related to agro-morphological features of plant, ear, and kernel descriptors (Table 1). Trait variation was examined at different phenological growth stages ranging from vegetative, reproductive to post-harvest stage. Landraces evaluated across locations displayed significant differences for post-harvest ear and kernel traits while location effect showed significant differences for pre-harvest plant traits only (Table 1). However, the interaction between accession and location (A*L) showed non-significant differences for all traits except for kernel rows per ear and grain yield (Table 1). Across locations, the highest range of variation was observed for number of tillers (35.4%), ear height (22.2%), number of secondary branches (20.9%), number of ears per plant (20.1%), and grain yield (19.8%) ( Table 1) while least variation was for circumference of ear middle (4.0%), circumference of cob bottom (3.4%), and cob diameter (4.0%). The detailed morphological characteristics of plant, ear and kernel traits for Los Lunas are shown in Tables S2, S4, and S6, respectively. Most traits evaluated at Los Lunas showed broad variability for plant (Table S2), ear (Table S4), and kernel traits (Table S6). The morphological traits associated with plant, ear, and kernel traits evaluated at Alcalde are shown in Tables S3, S5, and S7, respectively. Interestingly, most plant (Table S3), ear (Table S5), and kernel (Table S7) trait values obtained from Alcalde were relatively higher as compared to those from Los Lunas. Table 1. Descriptive statistics and analysis of variance (ANOVA) for all pigmented maize landraces evaluated by pre-harvest plant and post-harvest ear, and kernel traits evaluated across Alcalde and Los Lunas. Level of significance expressed is * p < 0.05, ** p < 0.01; *** p < 0.001.

Qualitative Trait Variation
The relative proportion of color classes for different morphological traits and presence of kernel dent phenotype between different blue maize landraces is shown in Table 2. A Chi-square analysis of these qualitative traits is shown in Table 3. The majority of tassels, silks and glumes, across all landraces were green, except those of Flor del Rio. Some purple, white, and a combination of purple and green colors were also observed ( Table 2). Most landraces displayed white and green leaf midribs and shoots, respectively. The Flor del Rio accession was distinctly more variable than other landraces with both green and purple tassels and red, purple, brown and/or white cobs. Chi-square analysis of these qualitative traits showed that tassel, silk, glume, leaf midrib, shoot and cob color classes were significantly different among landraces (Table 3) evaluated at each location as well as across both locations.

Kernel Compositional Trait Variation
The descriptive statistics for kernel biochemical traits across locations are shown in Table 4. Detailed descriptive statistics for Los Lunas and Alcalde are shown in Table S8 and S9, respectively. In all evaluated landraces, Taos Blue displayed the highest oil and fatty acid contents whereas Flor del Rio showed the lowest. Highest protein content was reported from Flor del Rio and lowest from Los Lunas High and Taos Blue (Table 4). Starch content was invariably similar across all landraces except Yoeme Blue. Anthocyanin content was highly varied, and Santa Ana Blue and Hopi Blue displayed highest anthocyanin values whereas Flor del Rio displayed the lowest. Santa Ana Blue and Taos Blue consistently showed higher values for most of the biochemical traits, whereas Flor del Rio displayed the lowest values, except for protein content. Location-wise, kernel biochemical traits evaluated at Los Lunas (Table S8) were higher than those from Alcalde (Table S9), except starch.

Principal Component Analysis (PCA)
In order to assess the morphological and biochemical diversity between southwestern blue maize landraces we examined the traits in groupings that could reflect associated PD traits. In this manner, pre-and post-harvest morphological traits were also examined at the specific plant organ level. PCA analyses were then performed on pre-harvest plant traits, post-harvest ear and kernel traits, and kernel biochemical traits. The variation contributed by these traits can be seen in Figure 6. The variability generated for different agro-morphological and biochemical traits were contributed by a total of 23 principal components (PCs). The first 11 PCs with > 1 eigenvalue were identified by factor analysis and around 92.39% of the total phenotypic variance was contributed by these PC (Table 5) and first two components explained 22.69% (PC1) and 16.50% (PC2) variance, respectively. Landrace by trait (L*T) biplot between PC1 and PC2 displayed number of tillers, circumference of ear mid and bottom, ear diameter, circumference of cob top and mid, kernel length, kernel width, and kernel weight as the primary contributing traits to PC1 variance ( Figure 6 and Table 6); whereas kernel compositional traits (total fatty acids, protein, and oil), number of kernels per ear, kernel weight per ear, ear length, ear weight, circumference of ear top, ratio between circumference of ear top and bottom, and ratio between circumference of ear bottom and ear top contributed to the PC2 variance. Biplot between PC1 and PC2 showed that the traits associated with PC1 distinctly separated Ohio Blue and Flor del Rio and those associated with PC2, separated Santa Ana Blue, and Yoeme Blue from the rest of the southwestern landraces ( Figure 6). Hopi Blue, LL High, and Yoeme Blue were observed to be the most phenotypically variable landraces.  Table 5. Landraces from Navajo Blue, LL High, Santa Ana Blue, Flor del Rio, Yoeme Blue, Ohio Blue, Hopi Blue, and Taos Blue are shown in "chartreuse2", "slateblue4", "black", "red1", "darkorchid2", "darkorange", "yellow" and "blue3", respectively. Traits contributing to PC1 and PC2 are also assigned different colors with a gradient ranging from 1,3, and 5 with "gray15", "forestgreen", and "firebrick", respectively. Larger consensus points for each accession discerns the midpoint of a given accession.  Variability contributed by different plant traits can be seen in Figure 7 where 63.5% of variability was contributed by PC1 and PC2. The variability contributed to PC1 was predominantly due to plant height, ear height, ears per plant, leaves above primary ear, number of nodes and internodes whereas variability contributed to PC2 was due to number of tillers, and secondary branches (Figure 7). Based on variability of plant traits, Ohio Blue appeared distinctly different from the rest of the southwestern landraces (Figure 7). The variability for ear traits can be seen in Figure 8 where 56.7% of variability was contributed by PC1 and PC2. Circumference of ear (mid and bottom), circumference of cob (top and bottom), and ear diameter mainly contributed the variability to PC1 whereas ratios between circumference of ear top and bottom and circumference of ear bottom and top and cob weight contributed to the variability of PC2 (Figure 8). The majority of southwestern landraces were similar in terms of variation of ear traits except Santa Ana Blue; however, Ohio Blue differed from the southwestern landrace cohort except for partial overlap with Navajo Blue (Figure 8).
Diversity associated with kernel traits was mainly related to post-harvest kernel morphological and biochemical traits. PCA biplot for kernel traits is shown in Figure 9 where 66.1% of variance was contributed by PC1 and PC2. Ratio between kernel width and length, kernel width, 100 kernel weight, and single kernel weight contributed to PC1 whereas kernel rows per ear, number of kernels per ear and ratio between kernel length and width contributed to PC2 variability. Group-wise, Navajo Blue separated from other landraces; however, Ohio Blue was overlapped with Flor del Rio and Taos Blue (Figure 9). Biochemical diversity was analyzed using kernel biochemical traits and a total of 73.5% of variability for biochemical traits was contributed by PC1 and PC2 ( Figure 10). Variability for PC1 was mainly contributed by starch whereas PC2 variability was contributed by protein. Biochemical diversity estimated using PCA displayed no distinct grouping as all landraces overlapped. Landrace by trait (L*T) biplot for pre-harvest plant traits. Landraces from Navajo Blue, LL High, Santa Ana Blue, Flor del Rio, Yoeme Blue, Ohio Blue, Hopi Blue, and Taos Blue are shown in "chartreuse2", "slateblue4", "black", "red1", "darkorchid2", "darkorange", "yellow" and "blue3", respectively. Traits contributing to PC1 and PC2 are also assigned different colors with a gradient ranging from 5, 10, and 15 with "gray15", "forestgreen" and "firebrick", respectively. Larger consensus points for each accession discerns the midpoint of a given accession.  Table 5. Landraces from Navajo Blue, LL High, Santa Ana Blue, Flor del Rio, Yoeme Blue, Ohio Blue, Hopi Blue, and Taos Blue are shown in "chartreuse2", "slateblue4", "black", "red1", "darkorchid2", "darkorange", "yellow" and "blue3", respectively. Traits contributing to PC1 and PC2 are also assigned different colors with a gradient ranging from 2.5, 5, and 7.5 with "gray15", "forestgreen" and "firebrick" respectively. Larger consensus points for each accession discerns the midpoint of a given accession.  Table 5. Landraces from Navajo Blue, LL High, Santa Ana Blue, Flor del Rio, Yoeme Blue, Ohio Blue, Hopi Blue, and Taos Blue are shown in "chartreuse2", "slateblue4", "black", "red1", "darkorchid2", "darkorange", "yellow" and "blue3", respectively. Traits contributing to PC1 and PC2 are also assigned different colors with a gradient ranging from 4, 8, and 12 with "gray15", "forestgreen" and "firebrick" respectively. Larger consensus points for each accession discerns the midpoint of a given accession. Blue, Flor del Rio, Yoeme Blue, Ohio Blue, Hopi Blue, and Taos Blue are shown in "chartreuse2", "slateblue4", "black", "red1", "darkorchid2", "darkorange", "yellow" and "blue3", respectively. Traits contributing to PC1 and PC2 are also assigned different colors with a gradient ranging from 5, 15, and 25 with "gray15", "forestgreen" and "firebrick" respectively. Larger consensus points for each accession discerns the midpoint of a given accession.

Discussion
Genotypic, morphological, and biochemical traits were used to determine the genetic diversity and relatedness of southwestern U.S. blue maize landraces. The landraces were representative of different geographic regions of the Southwest, and a Corn Belt Dent variety was included for comparison. The relatedness of Hopi Blue, Yoeme Blue, Taos Blue, Los Lunas High and Navajo Blue suggests that there has been a common origin, with some gene flow between distinct regional landraces. Noteworthy is the east-west diffusion between Taos Blue and Navajo Blue and between Yoeme Blue and Hopi Blue across northern and southern Arizona/northern Sonora, Mexico. Overall findings suggest that the southwestern landraces are genetically closely related, but selection has resulted in differing phenotypes. A key finding of our study was the dissimilarity of natural classifications achieved by phenotypic and genotypic analysis. Conclusions regarding groupings will vary depending on the type of analysis, number of traits evaluated in a given analysis, and uncontrolled variation accrued from sample size, individual trait attributes (i.e., their heritability), and environmental factors.
Genotypic analysis was based on 81,043 high quality SNPs whereas phenotypic analysis was conducted using 40 diverse morphological traits. The disproportionate number of traits could have been a major factor for these differences. Evaluation of morphological traits included plant, ear and kernel traits. Kernel traits were more variable than the plant and ear traits, therefore our findings showed similarity for some traits whereas dissimilarity among others. The genotypic analysis was done based on the genetic sequences and none of the variation was accrued from environmental factors, which were likely a major source of variation in phenotypic analysis. Our findings, taken together provide a fuller picture of both the genotypic and phenotypic relatedness between the landraces.
The findings from our PCA analysis for morphological traits showed a variation of 57.7%, 14.1%, and 11.7% due to PC1, PC2, and PC3, respectively. These values were lower than those observed by Sánchez et al. [36] which suggests that racial classification across multiple environments and years is more robust than those based on a single year evaluation. The biochemical diversity variation of PC1, PC2, and PC3 was reported at 59.0%, 39.8%, and 1.1%, respectively. Racial classification using biochemical traits has not been reported in the recent past. The Principal Coordinate Analysis (PCA) of Doebley et al. [25] study showed no distinct clusters among different southwestern landraces. Significant overlap was reported among them. Our study also showed overlap between different landraces-with the exception of Navajo Blue. Our results are also consistent with the presence of interracial admixtures among Pueblo maize varieties from New Mexico [11].
Hernandez and Flores [13] studied similar morphological traits from Mexican Maiz Azul (Mexican Blue) race with the exception of shank length, which we did not examine. The plant height reported in our study ranged from 1.6 to 2.0 m in comparison to 1.9 to 2.2 m plant height for Maiz Azul. The number of nodes and internodes reported in our study ranged from 13 to 14, respectively in comparison to reported eight to nine nodes in Maiz Azul. Seventeen secondary tassel branches were reported in our study whereas 2 to 3 secondary tassel branches were observed in Maiz Azul. Tassels of Maiz Azul appear to be considerably smaller than those of southwestern blue maize landraces. Ear traits measured in southwestern blue maize were closely aligned with Hernandez and Flores [13] findings of Maiz Azul morphological traits. Average ear and cob diameter reported from southwestern blue maize landraces were 3.9 and 2.7 cm, respectively and Hernandez and Flores [13] also reported similar observations for Maiz Azul. Blue kernel color reported in southwestern blue maize was similar to Maiz Azul kernel pigmentation. The differences observed showed that the variation in plant, ear and kernel characteristics might be mainly associated with geographical and sociocultural differences involved in the traditional cultivation and farmer selection. The Maiz Azul race is found in Western Mexico and is cultivated by Mestizos tribes in the mountainous region of western Chihuahua whereas blue maize landraces found in the Southwest are grown by different American Indian tribes from New Mexico and Arizona.
The qualitative traits of tiller, silk, glume, midrib, shoot and cob color of blue maize have not been studied previously for racial classification with the exception of the Soleri and Smith [18] study. Those authors studied the glume color and have reported red and purple glumes from Hopi Blue whereas we have reported green and purple/green glumes, which suggest that we were examining different landraces, both called Hopi Blue. Beside aesthetic importance of qualitative traits, kernel color in pigmented maize has played a pivotal role in selection for nutrition and socio-cultural importance in the Southwest USA for centuries [26,27]. The importance of kernel colors in selection for human nutrition has presumably "de novo" evolved in the Southwest [37] and may have created different races based on the kernel color.
Sánchez and Goodman [38] classified Mexican races using cluster analysis and identified three different racial groups. A sub-group from the Sierra de Chihuahua group containing several races of maize from the highlands of central and northern Mexico was also revealed. The sub-groups included the Cristalino de Chihuahua, Gordo, Azul (Blue), Apachito, and Serrano de Jalisco. Those races are restricted to the highlands of northwestern Mexico in valleys from 2000 to 2600 m above sea level. Maiz Azul is characterized by short-statured plants with few tassel branches and long slender ears that are tapered at the base. Kernels are rounded and tend to be of floury texture. In contrasts, samples of blue maize from the Southwest display considerable variation for plant and ear characteristics, and they are cultivated in diverse environments at a broad range of altitudes [20]. The racial classification of southwestern maize by Adams et al. [24] was based on 27 distinct groups of 123 pigmented maize landraces and a total of four groups were formed using PCA based on the ear length and shank size. Our results were based on a broader scale including the evaluation of pre-harvest phenotypic characters, post-harvest morphological traits, and kernel biochemical traits. Based on the PCA, Corn Belt Dent Ohio Blue was readily distinguished from the southwestern landraces.
Diversity of germplasm collections can be studied at phenotypic or morphological, geographical, molecular, and functional levels [39]. Genotypic diversity analysis has allowed us to understand the genetic similarities and differences between southwestern landraces and a distant Corn Belt population, Ohio Blue.

Germplasm
We examined six blue maize landraces representative of the blue maize found in Ari-  [40]. Other related types of clarage include 'Improved Clarage', 'Eichelberger Clarage', and 'Rotten Clarage' [40,41]. Ned's Blue was still offered for sale by Ned Place of Wapakoneta, OH in Auglaize County (western Ohio) until the early 21st century. It was apparently selected from local corn and was grown in the Ohio, Indiana, and Michigan region. Its origin is unknown, but it is conceivable that it was also selected from Rotten Clarage (a mixture of blue, yellow, and mixed pigmentation kernels) that was popular in southwestern Ohio. Geographical and botanical features of each accession can be seen in Table 7 and cultivation region of each landrace is shown in Figure S3. Representative ears and cobs (shelled ears) with kernels of each accession are shown in Figure 11.   [42].

Trimming of Sequencing Reads
Prior to alignment, the nucleotides of each raw read were scanned for low quality bases. Bases with PHRED quality value <15 (out of 40) [43,44], i.e., error rates of ≤3%, were removed by our trimming pipeline. Each read was examined in two phases. In the first phase reads were scanned starting at each end and nucleotides with quality values lower than the threshold were removed. The remaining nucleotides were then scanned using overlapping windows of 10 bp and sequences beyond the last window with average quality value less than the specified threshold were truncated. The trimming parameters were referred to the trimming software, Lucy [45,46].

Alignment of Reads to Public Maize B73 Reference Genome
Trimmed reads were aligned to the Maize reference genome using GSNAP [47] and confidently mapped reads were filtered if it mapped uniquely (≤ 2 mismatches every 36 bp and less than 5 bases for every 75 bp as tails) and used for subsequent analyses.

Discovery of Polymorphic Sites
The coordinates of confident and single (unique) alignments to the consensus reference sequence that passed our filtering criteria were used for SNP discovery. Polymorphisms at each potential SNP site were carefully examined and putative homozygous and heterozygous SNPs were identified in each sample separately using the following criteria: • Homozygous SNP calling The most common allele was supported by at least 80% of all the aligned reads covering that position. At least 5 unique reads supported the most common allele. Polymorphisms in the first and last 3 bp of each read were ignored. Each polymorphic base had at least a PHRED base quality value of 20 (≤ 1% error rate).

• Heterozygous SNP calling
Each of the two most common alleles was supported by at least 30% of all aligned reads covering that position. At least 5 unique reads supported each of the two most common alleles.
The sum of reads of the two most common alleles accounted for at least 80% of all aligned reads covering that nucleotide position. Polymorphisms in the first and last 3 bp of each quality-trimmed read were ignored.
Each polymorphic base had at least a PHRED base quality value of 20 (≤1% error rate).

Population Structure Analysis
Population genetic structure was analyzed using Admixture software version 1.3.0 [48]. Admixture has become mainstream software for genetic population structure analysis by virtue of its high-speed computation. Prerequisite conditions of Hardy-Weinberg equilibrium and minor allele frequency (MAF = 0.05) were met in the analyzed data set. The independent SNPs were selected by the -show-tags all and -block in plink software [49,50] before subjecting to structure analysis. The SNPs without any highcorrelated SNPs (r2 = 0.8) in the data set and the first SNPs in each haplotype block were kept as the independent SNP data set. The binary files of SNPs and the assumed number of sub-population (K = 1 to 10) applied to the Admixture. Cross-validation error (CV error) was extracted from the results file. CV error rate of different K values was used to identify the best K value based on the smallest CV error. The obtained results were further visualized in R to obtain the final distribution of CV error using package ggplot2 version 3.3.3 [51] as shown in Figure S1. The results showed that K = 2 corresponds to the smallest CV error hence identified as the best K value. According to the results file calculated by the software Admixture at K = 2, the genetic structure bar plot was created using biplot R package.

Phylogenetic Tree Construction
Pairwise distances were estimated between genotyped individuals using an unbiased model of substitution frequencies. Distance estimates were then used to construct a phylogenetic tree using the Neighbor-Joining like algorithm described by Criscuolo and Gascuel [52] and implemented in the njs module of the R APE package [53]. Unlike conventional neighbor joining methods, the njs algorithm is tolerant of missing data, enabling its use with GBS data. Relative branch lengths are proportional to the amount of divergence observed among individuals.

Measurement of Phenotypic Traits
A total of 40 morphological traits were examined (Table S10). Pre-harvest traits were measured when the plants were standing in the field at two New Mexico locations (Los Lunas and Alcalde) in 2014. Ear and kernel traits were measured post-harvest in the laboratory. Ear length was measured using a measuring board and ear diameter, cob diameter, kernel length, and kernel diameter was measured using calipers (General ® ULTRATECH, Secaucus, NJ, USA). Weights of ear, cob, kernel, and 100-kernel samples were measured using a NewClassic MS Balance (Mettler Toledo, Columbus, OH, USA).

Measurement of Biochemical Traits
Five kernel biochemical traits were analyzed from representative kernel samples produced at Los Lunas and Alcalde in 2014. The Experiment Station Chemical Laboratory of the University of Missouri analyzed the kernel constituents. The traits of total fatty acids, oil, protein, starch, and anthocyanin were examined. Oil and protein were measured by AOAC method 920.39 (A) and 990.03. Starch content was analyzed using base method: American Association of Cereal Chemists, approved method 76-13 and total fatty acids were analyzed by AOAC official methods 996.06 and Ca 5b-71. Total anthocyanin content was analyzed according to Li et al. [54] and Nankar et al. [55] method.

Statistical Analysis
Analyses of variance for both pre-and post-harvest traits were performed for each location separately as well as interaction between accession and location (A*L) was also evaluated using pooled data across both experiment locations. Analysis of variance of quantitative traits was performed using "PROC GLM" and Chi-Square analysis of qualitative traits was performed using "PROC FREQ". Landraces were considered as fixed effects and locations were considered as random effects. Means were compared by least significant differences test using "LSD" mean separation. Statistical analysis was performed with SAS V9.3 [56]. Principal component analysis of phenotypic diversity was performed on a total of 34 morphological and five kernel biochemical traits using R program version 4.0.3. Eigenvalue, eigenvector, percent variance of different principal components, and accession by trait biplot were estimated by ggplot2 version 3.3.3 [51], missMDA version 1.18 [57], FactoMineR version 2.4 [58], and Factoextra version 1.0.7 [59] R packages.

Conclusions
In this research, we have employed data on morphological, biochemical, and molecular variation to characterize the genetic diversity of southwestern blue maize landraces. The use of a molecular technique tGBS was more effective than morphological or biochemical traits for determining distinct varieties among southwestern blue corn landraces. The coalesced analysis of genetic structure, phylogeny, and principal component analysis proved to be effective in elucidating genetic structure of southwestern blue maize landraces from the midwestern Ohio Blue variety. However, the majority of southwestern landraces appeared to display interracial diffusion and belong to the same cohort, with the exceptions of Los Lunas High and Flor del Rio. Among the southwestern blue maize landraces, Navajo Blue displayed noticeable variation whereas Santa Ana, Los Lunas High, Flor del Rio, Yoeme Blue, Hopi Blue and Taos Blue showed little variation. Weight of cob, ear, kernel, 100 kernels and kernels per ear contributed to the variability in Navajo Blue. The groupings were more robust when performed using post-harvest traits. Our findings confirm that southwestern blue maize landraces are genetically related, and reflect the attributes of admixtures, but are phenotypically uniform. Diversity of trait values suggested that selection for a strong PD (blue kernel color) did not result in uniformity for other traits, but that overlap in phenotypic traits was consistent with earlier evidence of genetic exchange between southwestern landraces of maize.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10 .3390/ijms22073436/s1, Figure S1: Cross-validation error (CV error) for population structure K = 1 to K = 10, Figure S2: Population structure bar plots for K = 1 to K = 10, Figure S3: Growing region map of each blue maize landrace presented in this work, Table S1: Subsample contribution to each population structure for K = 1 to K = 10, Table S2: Descriptive statistics of plant traits evaluated at Los Lunas, New Mexico, Table S3: Descriptive statistics of plant traits evaluated at Alcalde, New Mexico, Table S4: Descriptive statistics of ear traits evaluated at Los Lunas, New Mexico, Table S5: Descriptive statistics of ear traits evaluated at Alcalde, New Mexico, Table S6: Descriptive statistics of kernel traits evaluated at Los Lunas, New Mexico, Table S7: Descriptive statistics of kernel traits evaluated at Alcalde, New Mexico, Table S8: Descriptive statistics of kernel biochemical traits evaluated at Los Lunas, New Mexico, Table S9: Descriptive statistics of kernel biochemical traits evaluated at Alcalde, New Mexico, Table S10: List of traits evaluated at pre and post-harvest stages.  Data Availability Statement: All the data is contained within the article and in the supplementary material.