Comparison of Morphological and Genetic Characteristics of Avocados Grown in Tanzania

Tanzania has been growing avocado for decades. A wide variability of the avocado germplasm has been found, and the crop contributes substantially to the earnings of farmers, traders, and the government, but its genetic diversity has scarcely been investigated. With the purpose of comparing the morphological and genetic characteristics of this germplasm and uncovering the correlation between them and geographical location, 226 adult seedling avocado trees were sampled in southwestern Tanzania. Their morphological characters were recorded, and their genetic diversity was evaluated based on 10 microsatellite loci. Discriminant analysis of principal components showed that the germplasm studied consisted of four genetic clusters that had an overall average gene diversity of 0.59 and 15.9% molecular variation among them. Most of the phenotypes were common to at least two clusters. The genetic clusters were also recovered by multivariate analysis and hierarchical clustering of the molecular data but not of the morphological data. Using the Mantel test, a weak but significant correlation was found between the genetic, morphological, and geographical distances, which indicates that the genetic variation present in the material is only weakly reflected by the observed phenotypic variation and that both measures of variation varied slightly with the geographical sampling locations.


Introduction
Avocado (Persea americana Mill.) is an important fruit plant cultivated in tropical and subtropical climates. The fruit consumption is increasing worldwide, although much of the global production is in South and Mesoamerica [1,2]. P. americana is a polymorphic species with three botanical groups or horticultural races (the West Indian, Guatemalan, and Mexican) that are ecologically distinguishable [3]. Individuals in each botanical group also have some common genetic characteristics that distinguish them from members of other groups [4,5].
Evaluating the genetic variation existing in a given germplasm is essential to understand its potential application in crop breeding. The knowledge is also important in estimating the loss of genetic diversity, in providing proofs of the evolutionary forces shaping the genotypic variations, and also in selecting genotypes to be prioritized in conservation strategies [6]. Different markers have been used in avocado germplasm characterization, management, and conservation. Morphological markers were used to characterize avocado germplasm in California [7], Florida [8], Ghana [9,10], Mexico [11], Indonesia [12], and Tanzania [13], among others. However, besides being labor intensive, morphological traits are associated with some shortcomings, such as low variability

Study Sites and Sampling
The study sites were eight avocado-rich districts in the Mbeya, Songwe, and Njombe regions of southwestern Tanzania (Figure 1). Two hundred twenty-six adult avocado trees of seed origin in 53 villages across the study sites were phenotyped from March through August 2017. Young leaf material of these trees was sampled, dried and preserved in silica gel, and later used for DNA extraction. The latitude and longitude of the collecting sites were determined with a Garmin Epix GPS mapping and multisport watch. The number of trees studied per district varied from 7 to 43 (Table 1).

Phenotyping
Phenotypic characters of the 226 avocado trees were examined following the International Plant Genetic Resources Institute's avocado crop descriptors [34]. Thirteen of the most important descriptors for avocado characterization were investigated. These descriptors included plant, fruit, and seed characteristics. The plant descriptors were the surface of the trunk, the pubescence and color of the young twig, the shape of the leaf and the pubescence on its underside, the number of primary leaf veins, and the leaf vein divergence at the middle of the leaf. The fruit descriptors were the shape of the mature fruit and of the pedicel, the peel thickness, and the flesh texture. For the seed, the descriptors assessed were the mature seed shape and the cotyledon surface. Color and peel thickness were determined with the aid of the RHS color chart [35] and a ruler, respectively. Some phenotyping activities are presented in Figure 2.

DNA Extraction, Microsatellite Loci Amplification, and Genotyping
DNA was extracted from the dry avocado leaf tissue of the 226 trees using a Thermo Scientific genomic DNA purification kit following the protocol included in the kit. DNA integrity was analyzed by 1.2% agarose gel electrophoresis, whereas DNA quality and quantity were checked with the NanoDrop spectrophotometer. Ten microsatellite loci of the sampled trees were investigated, of which nine were genomic and one was an EST (expressed sequence tag)-based microsatellite (Table 2). The ten microsatellite markers used were selected, based on their clear polymorphism pattern, from 16 highly polymorphic markers identified among 39 markers initially screened. Each locus was amplified in a 25 µL volume containing 25 ng genomic DNA, 0.3 µM each of fluorescent-labeled forward primer and unlabeled reverse primer, 0.3 mM dNTPs, 1× PCR buffer, 1.5 mM MgCl2, and 1 U/µL Taq DNA polymerase. We used the S1000™ thermal cycler (BIO-RAD, Hercules, CA, USA) to run the PCR reactions under a program that involved initial denaturation at 94 °C for 60 s, followed by 35 cycles of denaturation at 94 °C for 60 s, primer annealing at the primer-specific temperature for 30 s, and primer extension at 72 °C for 60 s. The 35 cycles were followed by a final extension at 72 °C for 60 s. Capillary electrophoresis of the amplified products was carried out on the Applied Biosystems 3500 Genetic Analyzer (ThermoFisher Scientific, Waltham, MA, USA) using the GeneScan 500 LIZ size standard. The resulting electropherograms were imported into GeneMarker® software V2.7 (SoftGenetics, State College, PA, USA) for visualization and allele calling. The allele dataset at the 10 microsatellite loci was then organized in an Excel spreadsheet for further analyses.
We employed a discriminant analysis of principal components (DAPC) to infer the genetic clusters (subpopulations) and explore the population structure of the sampled trees using the allele dataset. The allele dataset in the GenAlEx format was first converted into a genind object using the R function df2genind [38], and then the DAPC was carried out on the genind object following the method described by Jombart and Collins [39]. The method first ran the find.clusters function over a range of candidate numbers of genetic clusters (K) and then used the Bayesian Information Criterion (BIC) to choose the optimal K based on the elbow approach. Thereafter, the obtained clusters were further described by the DAPC. Since the genetic clusters derived from the analysis of population structure might indicate the racial origin of avocado, all analyses in this work treated the genetic clusters as populations. This also made it easier to observe and compare the clustering of trees in the microsatellite-based and morphology-based multivariate and hierarchical cluster analyses.
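The elbow approach mentioned above can be illustrated with a minimal sketch. It is not the adegenet implementation: it only assumes that a BIC value is already available for each candidate K, and it picks the K at which the BIC curve bends most sharply (largest second difference). The function name `elbow_k` and the BIC values are hypothetical and invented for illustration.

```python
def elbow_k(bic_by_k):
    """Return the K at which the BIC curve bends most sharply.

    bic_by_k maps each candidate K to its BIC (lower is better).
    The bend at K is the drop in BIC going into K minus the drop
    going out of K; a large value marks the "elbow".
    """
    ks = sorted(bic_by_k)
    best_k, best_bend = None, float("-inf")
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        drop_in = bic_by_k[prev_k] - bic_by_k[k]
        drop_out = bic_by_k[k] - bic_by_k[next_k]
        bend = drop_in - drop_out
        if bend > best_bend:
            best_k, best_bend = k, bend
    return best_k


# Hypothetical BIC values, invented for illustration only.
example_bic = {1: 412.0, 2: 398.0, 3: 389.0, 4: 381.0, 5: 381.4, 6: 382.3, 7: 383.5}
chosen_k = elbow_k(example_bic)
print(chosen_k)
```

In this invented series the BIC falls steeply up to K = 4 and then flattens, so the heuristic selects K = 4. With real data the curve should still be inspected visually, since a second-difference rule can be misled by noisy BIC values.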

Genetic Diversity among the Identified Clusters, Analysis of Molecular Variance (AMOVA) and Population Divergence
The total number of alleles scored and the total number of different alleles observed were computed in HP-RARE [40]. Allelic richness (RA) and private allelic richness (RPA) were computed based on rarefaction in HP-RARE. The number of different alleles per locus, the number of effective alleles, the number of private, rare, and common alleles per locus, Shannon's information index, and the average expected heterozygosity were estimated using GenAlEx 6.5 [41]. Average observed heterozygosity among the clusters was computed with Arlequin 3.5.2.2 [42]. The average gene diversity across the 10 loci for each cluster was computed with Arlequin. The global analysis of molecular variance (AMOVA) was performed on the clusters in Arlequin. Population divergence was assessed by comparing pairwise population FST and Nei's genetic distance in Arlequin and GenAlEx, respectively.
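For readers unfamiliar with these indices, the following minimal sketch shows how several of them follow from allele frequencies at a single locus. It uses the standard textbook definitions (effective alleles 1/Σp², expected heterozygosity 1 − Σp², the small-sample correction 2N/(2N − 1), and Shannon's index −Σp ln p) and is not the code of GenAlEx, Arlequin, or HP-RARE. The function name and the genotypes (allele sizes in bp) are hypothetical.

```python
import math
from collections import Counter


def locus_diversity(genotypes):
    """Diversity indices for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid tree.
    """
    n = len(genotypes)
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = 2 * n  # two allele copies per diploid individual
    freqs = [c / total for c in counts.values()]
    sum_sq = sum(p * p for p in freqs)
    he = 1.0 - sum_sq
    return {
        "Na": len(counts),                                  # different alleles
        "Ne": 1.0 / sum_sq,                                 # effective alleles
        "I": -sum(p * math.log(p) for p in freqs),          # Shannon's index
        "Ho": sum(a != b for a, b in genotypes) / n,        # observed heterozygosity
        "He": he,                                           # expected heterozygosity
        "uHe": (2 * n / (2 * n - 1)) * he,                  # unbiased expected heterozygosity
    }


# Hypothetical genotypes at one locus, invented for illustration only.
example = [(184, 184), (184, 190), (190, 196), (184, 196)]
stats = locus_diversity(example)
print(stats)
```

Per-cluster values such as those summarized in Table 4 would be obtained by averaging such single-locus estimates over the 10 loci; rarefaction-based allelic richness is a separate calculation and is not covered by this sketch.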

Genetic Relatedness among the Identified Clusters
Using GenAlEx, Nei's genetic distance was computed from the microsatellite data and then used for the principal coordinate analysis (PCoA) in the same software to study the relatedness of the trees with respect to their genetic clusters. The matrix used for PCoA consisted of 226 rows × 226 columns, one per sampled tree. The neighbor-joining dendrogram was computed in MEGA X [43] using Nei's genetic distance matrix, and thereafter, the output in Newick format was viewed and customized using the online tool iTOL v5 following Letunic and Bork [44].
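As an illustration of the distance measure, the sketch below computes Nei's standard genetic distance, D = −ln(Jxy / √(Jx·Jy)), between two populations from their allele frequencies. It is a simplified population-level example under the textbook definition, not the GenAlEx routine, and the function name and allele frequencies are hypothetical.

```python
import math


def nei_distance(pop_x, pop_y):
    """Nei's standard genetic distance between two populations.

    pop_x, pop_y: lists with one dict per locus, mapping allele -> frequency.
    Both lists must describe the same loci in the same order.
    """
    n_loci = len(pop_x)
    jx = jy = jxy = 0.0
    for fx, fy in zip(pop_x, pop_y):
        jx += sum(p * p for p in fx.values())
        jy += sum(p * p for p in fy.values())
        jxy += sum(p * fy.get(allele, 0.0) for allele, p in fx.items())
    jx, jy, jxy = jx / n_loci, jy / n_loci, jxy / n_loci
    if jxy == 0.0:
        return math.inf  # no shared alleles at any locus
    return -math.log(jxy / math.sqrt(jx * jy))


# Hypothetical single-locus allele frequencies, invented for illustration only.
pop_a = [{184: 0.6, 190: 0.4}]
pop_b = [{184: 0.2, 190: 0.8}]
d_ab = nei_distance(pop_a, pop_b)
d_aa = nei_distance(pop_a, pop_a)
print(round(d_ab, 4), round(d_aa, 4))
```

Two populations with identical allele frequencies have a distance of zero, and the distance grows as fewer alleles are shared.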

Phenotypic Characterization
Morphological characters of all the trees were organized in an Excel spreadsheet. Character variants that occurred only among some individuals of a particular cluster were identified. Principal components analysis of mixed data (PCAmix) [45] was performed on all morphological data to study morphological relatedness among the trees with respect to their genetic clusters (subpopulations). The analysis was carried out in XLSTAT version 2019.4.2 [46]. Thereafter, the dissimilarity matrix was computed in the same software from all morphological data. The matrix was used to produce a dendrogram revealing morphological relatedness among the trees with regard to their genetic clusters. The dendrogram in Newick format was produced in R using the ward.D2 method [47,48] and was then viewed and customized using iTOL v5.

Correlation between Genetic, Morphological, and Geographical Distances
The geographic distance matrix was computed from the latitude and longitude of the collecting sites in GenAlEx. Correlation between genetic, morphological, and geographical distance matrices was computed with the Mantel test at 999 permutations in the same software.
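The two steps described here can be sketched as follows: great-circle (haversine) distances from latitude and longitude, and a permutation-based Mantel test on two distance matrices. This is a minimal illustration, not the GenAlEx implementation; it assumes a one-tailed test on the Pearson correlation of the upper-triangle entries, and the function names and coordinates are hypothetical.

```python
import math
import random


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """One-tailed Mantel test; returns (observed r, permutation p-value)."""
    n = len(dist_a)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    a = [dist_a[i][j] for i, j in pairs]
    r_obs = pearson(a, [dist_b[i][j] for i, j in pairs])
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)  # relabel rows and columns of dist_b together
        b_perm = [dist_b[order[i]][order[j]] for i, j in pairs]
        if pearson(a, b_perm) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical site coordinates (latitude, longitude), invented for illustration.
sites = [(-9.25, 33.64), (-9.11, 33.52), (-8.90, 33.45), (-9.33, 34.77), (-8.93, 32.95)]
geo = [[haversine_km(*a, *b) for b in sites] for a in sites]
r_self, p_self = mantel(geo, geo, permutations=999, seed=1)
print(round(r_self, 3), p_self)
```

Testing a matrix against itself is only a sanity check (r = 1). With 999 permutations the smallest attainable p-value under this convention is 1/1000 = 0.001.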

Identification of Genetic Subpopulations (Clusters) and Description of Population Structure
DAPC was employed to study the population structure of the sampled trees in detail. The find.clusters function detected four clusters associated with the lowest BIC value (Figure 3a). These four clusters were considered useful in describing our data; therefore, DAPC was performed on the four clusters to describe them in detail. The first forty PCs of the PCA, accounting for 81.2% of the total variance, and three discriminant functions were retained. These values were confirmed by a cross-validation analysis (Figure 3b). The DAPC plot (Figure 4) showed four clusters, with linear discriminant 1 separating clusters 1 and 3 (to the left) from clusters 2 and 4 (to the right). Linear discriminant 2 only separated cluster 1 from cluster 3. Of the four clusters, cluster 1 was the largest with 90 individuals, followed by cluster 4 with 53 individuals (Table 3). Clusters 2 and 3 had a similar number of individuals, 42 and 41, respectively. In cluster 1, Rungwe had the highest number of individuals (32), followed by Busokelo (18). Neither Mbeya city nor the Mbozi district contributed samples to this cluster. In cluster 2, Njombe rural and Mbeya rural contributed a similar number of samples (12 and 11, respectively), whereas only two samples came from the Wanging'ombe district. No samples from Rungwe, Busokelo, or Njombe urban were found in cluster 2. Cluster 3 had samples from two districts only, Mbeya city and Mbeya rural, which contributed 23 and 18 samples, respectively. In cluster 4, most samples came from the Mbozi district (17), followed by Mbeya city (12), whereas only two samples came from Rungwe. The Njombe rural and Wanging'ombe districts made a similar contribution, with 9 and 7 samples, respectively. Neither Busokelo nor Njombe urban contributed samples to this cluster.
Further investigation of the individuals in each cluster revealed that nineteen of all samples were the most 'admixed' individuals, i.e., individuals with no more than a 90% probability of membership in any single cluster (Figure 5). The allele composition of the four clusters for all the studied trees is presented in Figure S1.
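The admixture criterion can be expressed as a simple filter on the cluster membership probabilities. The sketch below assumes that each tree has one posterior probability per cluster (as a DAPC would provide) and flags trees whose largest probability does not exceed 0.90. The function name, tree identifiers, and probabilities are hypothetical and invented for illustration.

```python
def admixed_trees(posteriors, threshold=0.90):
    """Return the IDs of trees whose highest membership probability is <= threshold.

    posteriors: dict mapping tree ID -> list of membership probabilities,
    one per cluster, summing to 1.
    """
    return [tree for tree, probs in posteriors.items() if max(probs) <= threshold]


# Hypothetical membership probabilities for four clusters, invented for illustration.
example_posteriors = {
    "T1": [0.99, 0.00, 0.01, 0.00],
    "T2": [0.05, 0.55, 0.00, 0.40],
    "T3": [0.00, 0.03, 0.97, 0.00],
}
flagged = admixed_trees(example_posteriors)
print(flagged)
```

Here only the hypothetical tree T2 is flagged, because its highest membership probability (0.55) is below the 0.90 threshold.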

Genetic Diversity among the Four Genetic Clusters
The total number of alleles scored among the four clusters ranged from 727 (Cluster 3) to 1414 (Cluster 1), while the total number of different alleles observed ranged from 66 (Cluster 3) to 118 (Cluster 4; Table 4).
The analysis of allele frequencies of the different clusters (populations) revealed that the mean number of different alleles per locus was lowest in cluster 3 (6.60) and highest in cluster 4 (11.80). The lowest and the highest private allele richness were recorded in cluster 3 (1.00) and cluster 1 (2.09), respectively. The effective number of alleles was lowest in cluster 3 (3.62) and highest in cluster 4 (5.68). Gene diversity, unbiased expected heterozygosity, and observed heterozygosity were lowest in cluster 1 (0.55, 0.70, and 0.60, respectively), pointing to a lower diversity among individuals of this group compared to the other groups. Gene diversity was highest in cluster 3 (0.63), while allelic richness, unbiased expected heterozygosity, and the Shannon information index were highest in cluster 4 (9.48, 0.79, and 1.93, respectively), pointing to a higher diversity in these two avocado groups. The number of alleles unique to a specific cluster, i.e., private alleles, per locus was lowest in cluster 2 (0.5) and highest in cluster 1 (2.3; Table 4). The least frequent private allele was an allele of 82 bp at the locus AVAG22, which had a frequency of 0.6% in cluster 1 (Table S1). The most frequent private allele was a 184 bp allele at the locus LMAV14, with a frequency of 61.5% in cluster 1. The number of alleles with a frequency of less than 5% in a population, i.e., rare alleles, per locus ranged from 1.8 (cluster 3) to 5.8 (cluster 4). The number of common alleles, with a frequency above or equal to 5% among the populations, per locus varied from 4.5 (cluster 1) to 6.0 (cluster 4). The most frequent common alleles were a 92 bp allele at the locus AVAG05, which had a frequency of 76.3% in cluster 1, followed by a 199 bp allele at the locus LMAV24, which had a frequency of 65.9% in cluster 2.

Genetic Relationship among the Studied Avocado Samples
PCoA was used to study the genetic relationship among the investigated avocado trees. The grouping pattern of the trees in the PCoA (Figure 6) was broadly similar to the DAPC findings, i.e., a grouping of samples into four clusters. The first two principal axes explained 19.64% of the total variation. Some individuals of cluster 2 and cluster 4 were projected onto nearly the same positions, while all individuals of cluster 1 and cluster 3 were resolved into distinct positions.
Nei's genetic distance matrix of the 226 avocado samples was used to study the genetic relationship among the four clusters identified by the DAPC. The dendrogram derived through the neighbor-joining cluster analysis method resolved clusters 1 and 3 into distinct groups except in a few cases (corresponding to groups 1 and 3, respectively, in Figure 7). Group 2 contained samples from clusters 2 (in orange) and 4 (in blue), which suggests that members of these two clusters were more closely related genetically to each other than to members of the other clusters.

Analysis of Molecular Variance and Population Differentiation
Analysis of molecular variance showed a higher molecular variance among the four avocado clusters (15.91%) than among individuals within clusters (9.91%), with the within-individual variance being the highest, 74.18% (Table 5). The genetic differentiation among the four clusters identified by DAPC was investigated further by computing population pairwise FST (Table 6), and the analysis revealed significant differentiation among all pairs of clusters. The highest genetic differentiation was observed between clusters 1 and 2 (FST = 0.174), whereas clusters 2 and 4 displayed the lowest differentiation (FST = 0.062). Likewise, the analysis of Nei's genetic distances between clusters revealed the largest genetic distance between clusters 1 and 2 (1.163) and the lowest distance between clusters 2 and 4 (0.310). Cluster 1 had the largest mean FST (0.077) and genetic distance (0.735) from the other three clusters. The lowest mean FST (0.055) and genetic distance (0.0486) from the other three clusters were recorded in clusters 4 and 3, respectively.
Table 6. Pairwise differentiation of clusters (FST) (above diagonal), Nei's genetic distance between clusters (below diagonal), and mean FST and genetic distance of each cluster from the other three clusters. All pairwise FST values were significant at p < 0.001.
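The link between the AMOVA percentages and the overall FST can be made explicit with a short calculation. The sketch below uses the standard three-level relationships (FST as the among-population share of the total variance, FIS as the among-individual share of the within-population variance, and FIT as their combination). Only the three input percentages come from this study; the function name is hypothetical, and the FIS and FIT values it prints are derived here purely for illustration and are not results reported by the authors.

```python
def fixation_indices(among_clusters, among_individuals, within_individuals):
    """Fixation indices from three-level AMOVA variance components.

    Inputs may be raw variance components or percentages of total variance;
    only their ratios matter.
    """
    total = among_clusters + among_individuals + within_individuals
    fst = among_clusters / total
    fis = among_individuals / (among_individuals + within_individuals)
    fit = (among_clusters + among_individuals) / total
    return fst, fis, fit


# Percentages of variation from the AMOVA described above.
fst, fis, fit = fixation_indices(15.91, 9.91, 74.18)
print(round(fst, 3), round(fis, 3), round(fit, 3))
```

The among-cluster share of 15.91% corresponds to an overall FST of about 0.159, and the three indices satisfy the usual identity (1 − FIT) = (1 − FIS)(1 − FST).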

Morphological Characteristics among Individuals of the Genetic Clusters
Analysis of morphological characteristics among individuals of the four clusters revealed that the majority of the phenotypes appeared in at least two clusters (Table 7).

Morphological Relationships among Individuals of the Four Clusters
Morphology-based principal components analysis of mixed data (PCAmix) of the investigated trees showed intermingling of the individuals from the four genetic clusters, with the first two axes together accounting for 10.13% of the total variation (Figure 8). Similar results were noted in the morphology-based dendrogram, in which the avocado trees were clustered into three groups, with each group containing individuals from all four clusters (Figure 9).

Correlation between Genetic, Morphological and Geographic Distances
The Mantel test indicated a low positive but statistically supported correlation between the genetic and geographical distances (r = 0.15, p = 0.001; Figure S2), between the morphological and geographical distances (r = 0.08, p = 0.001; Figure S3) and between the genetic and morphological distances (r = 0.11, p = 0.001; Figure S4) when the analysis was performed on individual samples.

Discussion
The present study has demonstrated the effectiveness of the genetic markers (microsatellite markers) over traditional morphological markers in characterizing avocado, exploring the diversity and the relationships among the individuals. Likewise, the study has shown the utility of DAPC in establishing the population structure of avocado crops and providing in-depth information on the individuals of the identified genetic clusters, which is an important step for practical plant breeding and conservation.
High diversity was observed among the individuals of the four genetic clusters at the ten microsatellite loci. The mean number of different alleles per locus among the four clusters ranged from 6.60 (cluster 3) to 11.80 (cluster 4), with an average of 9.40 across the four clusters and loci (Table 4). Gross-German and Viruel [37] found a range of 3.7 (West Indian group) to 7.10 (hybrid group) with an average of 5.58 for the four populations they investigated, which consisted of a total of 41 avocado samples. Boza et al. [4] reported a range of 7.93 (Mexican group) to 9.78 (Guatemalan group) among the three horticultural groups, with a much higher overall mean of 9.09 than that of Gross-German and Viruel [37]. Similarly, Schnell et al. [23] obtained a range of 6.00 (Mexican × West Indian group) to 13.35 (Mexican group) with an overall average of 10.26 for six populations of avocado comprising 221 samples. Cañas-Gutiérrez et al. [49] reported a lower overall mean, 4.46, for 18 geographical populations. In the present work, allelic richness was lowest in cluster 3 (6.00) and highest in cluster 4 (9.48), with an overall mean value of 7.69. This suggests that clusters 3 and 4 were the least and the most genetically diverse clusters, respectively. The most genetically diverse groups should be protected in conservation programs, and they may provide the best plant materials for breeding programs, whereas the least genetically diverse groups deserve special conservation management [50]. Guzmán et al. [24] recorded a comparatively lower allelic richness, 5.95 (Mexican group) to 6.22 (West Indian group) with an overall average of 6.10, for the three avocado racial groups. While the private allele richness in the current study ranged from 1.00 (cluster 3) to 2.09 (cluster 1) with an overall average of 1.45, Guzmán et al. [24] recorded a range of 0.63 (Mexican group) to 0.89 (Guatemalan group) with an overall mean of 0.74 for the three avocado populations.
The average observed and expected heterozygosity for the four clusters was found to be 0.65 and 0.74, respectively. Lower values were reported by Boza et al. [4], Ho: 0.53 and He: 0.64, for the three horticultural races included in their study. Gross-German and Viruel [37] estimated a similar Ho (0.66) but a slightly lower He (0.71) for their four populations, whereas Schnell et al. [23] reported higher values, Ho: 0.71 and He: 0.77 (six populations), indicating a comparatively higher diversity. While the overall average gene diversity in the present work was 0.59, Boza et al. [4] obtained a higher value (0.63) for the three avocado races they investigated.
The number of private alleles per locus ranged from 0.50 (cluster 2) to 2.30 (cluster 1) with a grand mean of 1.23 across all populations and loci (Table 4). Private alleles are a measure of population differentiation; thus, the highest number of private alleles per locus, detected in cluster 1, indicates that this cluster is the most genetically differentiated, as was also revealed by its largest mean FST. Boza et al. [4] reported a number of private alleles per locus ranging from 0.65 (Mexican group) to 0.71 (West Indian group) among the three avocado races, and from 0.02 to 0.07 among their six hybrid groups, with a grand mean value of 0.23 for the nine populations, which is lower than the value obtained in our study. In the present study, the lowest and highest numbers of rare alleles per locus were 1.80 (cluster 3) and 5.80 (cluster 4), whereas Boza et al. [4] obtained a range of 3.31 (Mexican group) to 6.24 (West Indian group) among the three botanical groups, and 0.00 to 3.44 among their six hybrid groups. Rare alleles are significant in plant breeding as they may be associated with adaptations to biotic and abiotic stresses [51]. In our study, the number of common alleles per locus varied from 4.40 (cluster 1 and cluster 3) to 6.00 (cluster 4), whereas Boza et al. [4] obtained a range of 3.22 (West Indian group) to 4.67 (Guatemalan group) among the three botanical groups, and 3.80 to 4.64 among their six hybrid groups.
The PCoA (Figure 6) and dendrogram (Figure 7) obtained from microsatellite marker-based analyses resolved the studied trees into groups that were broadly similar to the four genetic clusters established by the DAPC analysis. Gross-German and Viruel [37] observed that the model-based (STRUCTURE) genetic clustering, PCoA, and cluster analysis results were in line with the distribution of avocado into botanical races, i.e., the Mexican, West Indian, and interracial Guatemalan × Mexican. Similarly, Alcaraz and Hormaza [15] observed that the UPGMA-based dendrogram grouped 75 avocado accessions into three major groups that mainly corresponded to the botanical races. The four genetic clusters (groups) generated in the present study might represent the three avocado races and a hybrid group. This was also indicated by Juma et al. [13], as Tanzanian avocado germplasm analyzed using different morphological traits was shown to contain material from all three races. The traits included trunk surface and peel thickness. A smooth trunk surface was reported as an attribute of the Mexican and Guatemalan races, and a rough or very rough trunk surface is attributed to the West Indian race [52]. A thin ripe peel (≤1 mm thick) is ascribed to the West Indian and Mexican races, and a thick ripe peel (2-3 mm thick) to the Guatemalan race [53]. Other traits were the doughy and buttery flesh textures ascribed to the Guatemalan and Mexican races and the watery flesh texture attributed to the West Indian group [53]. However, in the present study, the examination of these characteristics showed that they appeared among individuals of all four clusters. More genetic studies need to be carried out on the Tanzanian avocado germplasm together with representative samples of the three avocado races to confirm the germplasm's racial origin.
The AMOVA indicated that the overall genetic differentiation among the four avocado genetic clusters, FST, was 0.159 (p < 0.0001). This implies that the trees investigated harbor a substantial amount of diversity and that the four genetic clusters were significantly distinct. The level of population differentiation (FST) observed in this study was higher than the values reported by Juma et al. [25] for the same plant material when AMOVA was carried out on district-based populations (FST = 0.061, p < 0.0001) and altitudinal groups (FST = 0.025, p < 0.0001). Gross-German and Viruel [37] and Boza et al. [4] found an overall population differentiation of 0.25 and 0.193, respectively, which are comparatively higher than the value obtained in our study. In both studies, populations were based on the racial origin of avocado. In contrast, Cañas-Gutiérrez et al. [49] noted an overall population differentiation of 0.054 among municipality-based populations, which is about 66% less than the value observed in the present study. Considering the AMOVA-based findings from the mentioned studies, it can be concluded that the overall population differentiation among avocado groups is higher if the grouping is based on racial origin than if it is based on geographical origin.
Pairwise comparison of population differentiation (FST) and divergence (Nei's genetic distance) revealed significant differentiation among all the clusters, with the lowest differentiation and genetic distance between clusters 2 and 4 (FST = 0.062; Nei's genetic distance = 0.310; Table 6). The comparatively low Nei's genetic distance between clusters 2 and 4 explains why the two clusters were less well resolved from one another in the DAPC and in the microsatellite-based PCoA and dendrogram.
The morphology-based PCAmix and dendrogram did not group the analyzed trees into their genetic clusters. Both analyses showed intermingling of the individual trees from the four clusters. This finding suggests that the SSR loci investigated were not linked to the genes governing the investigated morphological traits. Alternatively, if such linkage exists, the environment may have significantly influenced the phenotypes.
A weak positive correlation was revealed between the geographical distance of the sampling locations and the genetic distance (r = 0.15, p = 0.001) and between the geographical distance and the morphological dissimilarity matrix (r = 0.08, p = 0.001). Prohens et al. [54] observed a lack of correlation between geographical distance and AFLP-based genetic distance (r = 0.11, p < 0.10) in their study of 28 Spanish eggplant accessions (Solanum melongena L.). Contrary to our study, they observed a comparatively higher correlation between the geographical and morphological distances (r = 0.25, p < 0.01). Sreekumar et al. [55] reported a highly significant correlation between geographical distance and AFLP-based genetic distance (r = 0.73, p = 0.009), whereas no correlation could be found between geographical distance and morphological trait-based distance (r = 0.44, p = 0.07) in their study of 60 breadfruit samples (Artocarpus altilis) in India. The weak correlation between geographical and genetic or morphological distances observed in the present study could be due to persistent movement and sharing of seeds between farmers of different areas [13,25,33]. In the present study, a weak positive correlation was also noticed between the genetic and morphological distances (r = 0.11, p = 0.001). This suggests that there was no strong association between the studied morphological traits and the 10 SSR loci investigated. It also suggests that morphological trait variation cannot fully display the pattern of genetic diversity in avocado. Working with 62 Ethiopian maize accessions, Beyene et al. [28] noticed a moderate, significant positive correlation between AFLP-based genetic and morphological distances (r = 0.39, p = 0.001), and also between SSR-based genetic and morphological distances (r = 0.43, p = 0.001). In a similar study on Vietnamese and Cambodian sesame accessions, Pham et al. [56] reported a highly significant positive correlation (r = 0.88, p = 0.001) between agro-morphological and RAPD marker-based distances between the accessions. Contrary to that, Roldan-Ruiz et al. [57] observed an absence of correlation between AFLP-based genetic and morphological distances (r = −0.06, p < 0.375) and a weak correlation between the sequence tag sites (STS)-based genetic and morphological distances (r = 0.18, p < 0.12) in 16 ryegrass varieties. Similarly, Sreekumar et al. [55] reported an absence of correlation between the AFLP-based genetic distance and the morphological distance (r = 0.01, p = 0.5) of breadfruit in India. Smith and Smith [14] asserted that phenotypic variation sometimes does not follow genetic variation due to the influence of the environment on the phenotypic expression of the genotypes and potential multiple gene action on the traits.

Conclusions
The findings from this study showed that the population structure of the analyzed avocado trees comprised four genetic clusters that might reflect the racial origin of the germplasm, i.e., the Mexican, Guatemalan, and West Indian races and a hybrid group. Although the four clusters were genetically distinguishable, their morphological characters overlapped, even for characters that were expected to be found only in a particular avocado horticultural race (a cluster). The weak positive correlation observed between geographical and genetic or morphological distances indicates that the genetic and morphological characteristics of the studied trees varied slightly with the geographical locations. Similarly, the weak positive correlation observed between the genetic and morphological distances indicates a low level of agreement between the diversity patterns derived from the two distances.
Supplementary Materials: The following are available online at https://www.mdpi.com/2073-4425/12/1/63/s1, Figure S1: The distribution pattern of the alleles of different clusters for all avocado trees, Figure S2: Mantel test showing the correlation between genetic and geographical distances, Figure S3: Mantel test showing a correlation between dissimilarity matrix and geographical distance, Figure S4: Mantel test showing a correlation between dissimilarity matrix and genetic distance, Table S1: Allele frequencies by locus for the four genetic clusters.