Genes
  • Article
  • Open Access

4 January 2021

Comparison of Morphological and Genetic Characteristics of Avocados Grown in Tanzania

1 Department of Plant Breeding, Swedish University of Agricultural Sciences, P.O. Box 101, 23053 Alnarp, Sweden
2 Department of Botany, University of Dar es Salaam, P.O. Box 35060, Dar es Salaam, Tanzania
* Author to whom correspondence should be addressed.
This article belongs to the Section Plant Genetics and Genomics

Abstract

Tanzania has been growing avocado for decades. A wide variability of the avocado germplasm has been found, and the crop contributes substantially to the earnings of farmers, traders, and the government, but its genetic diversity has scarcely been investigated. With the purpose of comparing morphological and genetic characteristics of this germplasm and uncovering the correlation between them and the geographical location, 226 adult seedling avocado trees were sampled in southwestern Tanzania. Their morphological characters were recorded, and their genetic diversity was evaluated based on 10 microsatellite loci. Discriminant analysis of principal components showed that the germplasm studied consisted of four genetic clusters that had an overall average gene diversity of 0.59 and 15.9% molecular variation among them. Most of the phenotypes were common to at least two clusters. The genetic clusters were also recovered by multivariate analysis and hierarchical clustering of the molecular data but not of the morphological data. Using the Mantel test, weak but significant correlations were found among the genetic, morphological, and geographical distances, which indicates that the genetic variation present in the material is weakly reflected by the observed phenotypic variation and that both measures of variation varied slightly with the geographical sampling locations.

1. Introduction

Avocado (Persea americana Mill.) is an important fruit plant cultivated in tropical and subtropical climates. The fruit consumption is increasing worldwide, although much of the global production is in South and Mesoamerica [1,2]. P. americana is a polymorphic species with three botanical groups or horticultural races (the West Indian, Guatemalan, and Mexican) that are ecologically distinguishable [3]. Individuals in each botanical group also have some common genetic characteristics that distinguish them from members of other groups [4,5].
Evaluating the genetic variation existing in a given germplasm is essential to understand its potential application in crop breeding. This knowledge is also important in estimating the loss of genetic diversity, in providing evidence of the evolutionary forces shaping the genotypic variation, and in selecting genotypes to be prioritized in conservation strategies [6]. Different markers have been used in avocado germplasm characterization, management, and conservation. Morphological markers were used to characterize avocado germplasm in California [7], Florida [8], Ghana [9,10], Mexico [11], Indonesia [12], and Tanzania [13], among others. However, besides being labor intensive, morphological traits are associated with some shortcomings, such as low variability (polymorphism) and heritability, late expression, influence by environmental factors, and subjectivity [14,15]. Nowadays, avocado germplasm characterization has been improved by the use of genetic markers, which can even discriminate closely related individuals. Some genetic markers that have been applied are isozymes [16], minisatellites [17], variable number tandem repeats (VNTRs) [18], randomly amplified polymorphic DNA (RAPD) [19], and restriction fragment length polymorphisms (RFLP) [20,21]. Others are inter-simple sequence repeats (ISSR) [22], simple sequence repeats (SSR) [15,23,24,25], and single nucleotide polymorphisms (SNPs) [5,26,27]. Choosing which marker type to employ in a diversity study depends on the study objectives and the available financial resources, expertise, and facilities [28].
Population genetics has been used in describing the genetic composition of avocado populations and the mechanisms affecting that composition [15,23,24,25]. Bayesian cluster analysis, as implemented in STRUCTURE, and discriminant analysis of principal components (DAPC) have been widely utilized in studying the population structure of crops, including avocado [5,24,25,26,29]. Bayesian cluster analysis generates genetic clusters (genetic populations) with individuals in each cluster having distinctive allele frequencies at the investigated loci [30,31,32]. In avocado research, these genetic clusters have sometimes been shown to conform to the horticultural origin of the crop [4,5].
Tanzania rises from sea level to more than 2900 m above sea level. The country has varying topographies, soils, and climates, which support the growth of different cultivars of avocados [13,25]. Although avocado is grown in several regions of Tanzania for the export and domestic markets [33], only two studies have been conducted to characterize this germplasm, one based on morphological traits [13] and one on SSR markers [25]. The present study aimed to compare morphological and genetic characteristics of this germplasm and uncover correlations existing among the morphological and genetic characteristics and the geographical sampling locations. Such insights can provide important information for plant breeders planning future breeding programs. In addition, the insights can increase awareness about avocado genetic resources that could be exploited for management and utilization in Tanzania.

2. Materials and Methods

2.1. Study Sites and Sampling

The study sites were eight avocado-rich districts in the Mbeya, Songwe, and Njombe regions located in southwestern Tanzania (Figure 1). Two hundred twenty-six seed-originated adult avocado trees in 53 villages across the study sites were phenotyped from March through August 2017. Young leaf material of these trees was sampled, then dried and preserved using silica gel and later used for DNA extraction. The latitude and longitude of the collecting sites were determined with a Garmin Epix GPS mapping and multisport watch. The number of trees studied per district varied from 7 to 43 (Table 1).
Figure 1. Study site map: (a) Top left is Tanzania’s location in Africa (small scale map) and the three avocado rich regions’ locations (large scale map); (b) Top center is the three regions showing the districts included in this research; (c) Bottom are the village/street locations in the districts.
Table 1. Sampling information.

2.2. Phenotyping

Phenotypic characters of the 226 avocado trees were examined following the International Plant Genetic Resources Institute’s avocado crop descriptors [34]. Thirteen of the most important descriptors for avocado characterization were investigated. These descriptors included plant, fruit, and seed characteristics. The plant descriptors were the surface of the trunk, pubescence, and color of the young twig, the shape of and pubescence on the underside of the leaf, the number of primary leaf veins, and the leaf vein divergence at the middle of the leaf. The fruit descriptors included the shape of the mature fruit and pedicel, the peel thickness, and the flesh texture. For the seed, the descriptors assessed were the mature seed shape and its cotyledon surface. Color and peel thickness determination was achieved with the aid of the RHS color chart [35] and a ruler, respectively. Some phenotyping activities are presented in Figure 2.
Figure 2. Measuring some morphological traits and collecting leaf samples.

2.3. DNA Extraction, Microsatellite Loci Amplification, and Genotyping

DNA was extracted from the dry avocado leaf tissue of the 226 trees using a Thermo Scientific genomic DNA purification kit following the protocol included in the kit. DNA integrity was assessed by 1.2% agarose gel electrophoresis, whereas DNA quality and quantity were checked with a NanoDrop spectrophotometer. Ten microsatellite loci of the sampled trees were investigated, of which nine were genomic, and one was an EST (expressed sequence tag)-based microsatellite (Table 2). The ten microsatellite markers were selected, based on their clear polymorphism pattern, from 16 highly polymorphic markers identified among 39 markers initially screened. Each locus was amplified in a 25 μL reaction volume containing 25 ng genomic DNA, 0.3 μM each of the fluorescently labeled forward primer and the unlabeled reverse primer, 0.3 mM dNTPs, 1× PCR buffer, 1.5 mM MgCl2, and 1 U/μL Taq DNA polymerase. We used the S1000™ thermal cycler (BIO-RAD, Hercules, CA, USA) to run the PCR reactions under a program that involved initial denaturation at 94 °C for 60 s, followed by 35 cycles of denaturation at 94 °C for 60 s, primer annealing at the primer-specific temperature for 30 s, and primer extension at 72 °C for 60 s. The 35 cycles were followed by a final extension at 72 °C for 60 s. Capillary electrophoresis of the amplified products was carried out on the Applied Biosystems 3500 Genetic Analyzer (ThermoFisher Scientific, Waltham, MA, USA) using the GeneScan 500 LIZ size standard. The resulting electropherograms were imported into GeneMarker® software V2.7 (SoftGenetics, State College, PA, USA) for visualization and allele calling. The allele dataset at the 10 microsatellite loci was then organized in an Excel spreadsheet for further analyses.
Table 2. The repeat motif of the simple sequence repeat (SSR) loci used in this study.

2.4. Data Analysis

2.4.1. Population Structure Analysis

We employed a discriminant analysis of principal components (DAPC) to infer the genetic clusters (subpopulations) and explore the population structure of the sampled trees using the allele dataset. The allele dataset in the GenAlEx format was first converted into a genind object using the R function df2genind [38], and then the DAPC was carried out on the genind object following the method described by Jombart and Collins [39]. The method involved running the find.clusters function and using the Bayesian Information Criterion (BIC), based on the elbow approach, to choose the optimal number of genetic clusters (K). Thereafter, the obtained clusters were further described by the DAPC. Since the genetic clusters derived from analysis of population structure might indicate the racial origin of avocado, all analyses in this work treated the genetic clusters as populations. This also facilitated observation and comparison of the clustering of trees in the microsatellite- and morphology-based multivariate analyses and hierarchical cluster analyses.

2.4.2. Genetic Diversity among the Identified Clusters, Analysis of Molecular Variance (AMOVA) and Population Divergence

The total number of alleles scored and the total number of different alleles observed were computed in HP-RARE [40]. Allelic richness (RA) and private allelic richness (RPA) were computed based on rarefaction in HP-RARE. The number of different alleles per locus, the number of effective alleles, the number of private, rare, and common alleles per locus, Shannon’s information index, and the average expected heterozygosity were estimated using GenAlEx 6.5 [41]. Average observed heterozygosity among the clusters was computed with Arlequin 3.5.2.2 [42]. The average gene diversity across the 10 loci for each cluster was also computed with Arlequin. The global analysis of molecular variance (AMOVA) was performed on the clusters in Arlequin. Population divergence was assessed by comparing pairwise population FST and Nei’s genetic distance in Arlequin and GenAlEx, respectively.
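As an illustration of the diversity parameters listed above, the following minimal Python sketch computes them for a single locus from a flat list of allele calls. It uses the standard textbook formulas and is not the internal code of HP-RARE, GenAlEx, or Arlequin; all function names are hypothetical, and the example data at the end of the block are invented.

```python
import math
from collections import Counter


def allele_frequencies(alleles):
    """Return {allele: frequency} for a flat list of allele calls at one locus."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


def effective_alleles(freqs):
    """Effective number of alleles: Ne = 1 / sum(p_i^2)."""
    return 1.0 / sum(p * p for p in freqs.values())


def expected_heterozygosity(freqs, n_individuals=None):
    """He = 1 - sum(p_i^2); if N diploid individuals is given, apply the
    small-sample correction 2N / (2N - 1) to obtain the unbiased estimate."""
    he = 1.0 - sum(p * p for p in freqs.values())
    if n_individuals is not None:
        he *= (2 * n_individuals) / (2 * n_individuals - 1)
    return he


def shannon_index(freqs):
    """Shannon's information index: I = -sum(p_i * ln p_i)."""
    return -sum(p * math.log(p) for p in freqs.values() if p > 0)


def rarefied_richness(alleles, g):
    """Allelic richness by rarefaction: expected number of distinct alleles
    in a random subsample of g gene copies drawn without replacement."""
    counts = Counter(alleles)
    n = sum(counts.values())
    return sum(1 - math.comb(n - c, g) / math.comb(n, g) for c in counts.values())


# Invented example: 4 diploid trees (8 gene copies) at one locus, sizes in bp.
toy_alleles = [92, 92, 92, 92, 94, 94, 96, 96]
toy_freqs = allele_frequencies(toy_alleles)
```

Rarefying to a common subsample size `g` is what makes richness comparable between clusters of unequal size, such as the 90 and 41 trees in clusters 1 and 3.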

2.4.3. Genetic Relatedness among the Identified Clusters

Using GenAlEx, Nei’s genetic distance was computed from the microsatellite data and then used for the principal coordinate analysis (PCoA) in the same software to study the relatedness of the trees with respect to their genetic clusters. The matrix used for PCoA consisted of 226 rows × 226 columns. The neighbor-joining dendrogram was computed in MEGA X [43] using Nei’s genetic distance matrix, and thereafter, the output in Newick format was viewed and customized using the online tool iTOL v5 following Letunic and Bork [44].
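For readers unfamiliar with Nei’s genetic distance, the sketch below shows the standard population-level form, D = −ln(Jxy / √(Jx·Jy)), computed from per-locus allele frequencies. This is an assumption-laden illustration only: the pairwise, individual-level calculation that GenAlEx performs for the PCoA may differ in detail, and the function name and input layout are hypothetical.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance between two populations.

    pop_x, pop_y: lists with one dict per locus, {allele: frequency},
    with loci in the same order in both lists (hypothetical layout).
    """
    jx = jy = jxy = 0.0
    for fx, fy in zip(pop_x, pop_y):
        jx += sum(p * p for p in fx.values())          # identity within X
        jy += sum(p * p for p in fy.values())          # identity within Y
        jxy += sum(p * fy.get(a, 0.0) for a, p in fx.items())  # identity between X and Y
    if jxy == 0.0:
        return math.inf  # no shared alleles at any locus
    identity = jxy / math.sqrt(jx * jy)
    return -math.log(identity)
```

Two populations with identical allele frequencies give a distance of 0, and the distance grows without bound as the populations share fewer alleles.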

2.4.4. Phenotypic Characterization

Morphological characters of all the trees were organized in an Excel spreadsheet. Character variants that only occurred among some individuals of a particular cluster were identified. Principal components analysis of mixed data (PCAmix) [45] was performed on all morphological data to study morphological relatedness among the trees with respect to their genetic clusters (subpopulations). The analysis was carried out in XLSTAT version 2019.4.2 [46]. Thereafter, the dissimilarity matrix was computed in the same software from all morphological data. The matrix was used in producing a dendrogram to reveal morphological relatedness among the trees with regard to their genetic clusters. The dendrogram in the Newick format was produced in the R software using the ward.D2 method [47,48]. The Newick format dendrogram was then viewed and customized using iTOL v5.

2.4.5. Correlation between Genetic, Morphological, and Geographical Distances

The geographic distance matrix was computed from the latitude and longitude of the collecting sites in GenAlEx. Correlation between genetic, morphological, and geographical distance matrices was computed with the Mantel test at 999 permutations in the same software.
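The two steps described above can be sketched in plain Python as follows. The great-circle (haversine) formula is one common way to turn latitude and longitude into distances, and the permutation scheme is the textbook one-tailed Mantel test; whether GenAlEx uses exactly these choices is an assumption. Function names and the Earth radius default are illustrative.

```python
import math
import random


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def _pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def _upper(matrix, order):
    """Upper-triangle entries of a square matrix after reordering rows and columns."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """One-tailed Mantel test between two symmetric distance matrices.

    Rows and columns of dist_b are permuted together; the p-value is the
    share of permutations (plus the observed one) with r >= the observed r.
    """
    rng = random.Random(seed)
    identity = list(range(len(dist_a)))
    a_vals = _upper(dist_a, identity)
    r_obs = _pearson(a_vals, _upper(dist_b, identity))
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)
        if _pearson(a_vals, _upper(dist_b, perm)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)
```

With 999 permutations the smallest attainable p-value under this scheme is 1/1000 = 0.001, which is why Mantel results at this setting are often reported as p = 0.001.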

3. Results

3.1. Genetic Characterization

3.1.1. Identification of Genetic Subpopulations (Clusters) and Description of Population Structure

DAPC was employed to study the population structure of the sampled trees in detail. The ‘find.clusters’ function detected four clusters associated with the lowest BIC value (Figure 3a). These four clusters were considered useful in describing our data. Therefore, DAPC was performed on the four clusters to describe them in detail. The first forty PCs of the PCA, accounting for 81.2% of the total variance, and three discriminant functions were retained. These values were confirmed by a cross-validation analysis (Figure 3b).
Figure 3. Determination of the optimum number of clusters (a) and number of principal components (PCs) and discriminant functions to be retained in the discriminant analysis of principal components (DAPC) analysis (b).
The DAPC plot (Figure 4) showed four clusters, with linear discriminant 1 separating clusters 1 and 3 (to the left) from clusters 2 and 4 (to the right). Linear discriminant 2 only separated cluster 1 from cluster 3. Of the four clusters, cluster 1 was the largest with 90 individuals, followed by cluster 4 with 53 individuals (Table 3). Clusters 2 and 3 had a similar number of individuals, 42 and 41, respectively. In cluster 1, Rungwe had the highest number of individuals (32), followed by Busokelo (18). Neither Mbeya city nor the Mbozi district contributed samples to this cluster. In cluster 2, Njombe rural and Mbeya rural contributed a similar number of samples (12 and 11, respectively), whereas only two samples came from the Wanging’ombe district. No samples from Rungwe, Busokelo, or Njombe urban were found in cluster 2. Cluster 3 had samples from two districts only, Mbeya city and Mbeya rural, which contributed 23 and 18 samples, respectively. Cluster 4 had the most samples from the Mbozi district (17), followed by Mbeya city (12), whereas only two samples came from Rungwe. The Njombe rural and Wanging’ombe districts made similar contributions, with 9 and 7 samples, respectively. Neither Busokelo nor Njombe urban contributed samples to this cluster. Further investigation of the individuals in each cluster identified nineteen highly ‘admixed’ individuals among all samples, i.e., individuals having a maximum of 90% probability of being a member of a single cluster (Figure 5). The allele composition of the four clusters for all the studied trees is presented in Figure S1.
Figure 4. Discriminant analysis of principal components (DAPC) for 226 avocado samples. The axes represent the first two Linear Discriminants (LD). Each circle represents a cluster, and each symbol represents an individual. Numbers represent the different subpopulations identified by DAPC.
Table 3. Information on the number of samples in each cluster.
Figure 5. The distribution pattern of the alleles of different clusters for the most admixed individuals revealed by DAPC.
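The admixture criterion used above (no more than 90% membership probability for any single cluster) can be expressed as a short filter over the posterior membership probabilities produced by a DAPC. The sketch below assumes those probabilities have been exported as one list per sample; the function name, the input layout, and the sample identifiers in the usage note are hypothetical.

```python
def flag_admixed(memberships, threshold=0.90):
    """Return samples whose highest cluster-membership probability is at most `threshold`.

    memberships: {sample_id: [probability for cluster 1, cluster 2, ...]}
    Result: {sample_id: (best_cluster_number, best_probability)}
    """
    flagged = {}
    for sample, probs in memberships.items():
        best = max(probs)
        if best <= threshold:
            # Clusters are numbered from 1 to match the text.
            flagged[sample] = (probs.index(best) + 1, best)
    return flagged
```

For example, a tree with probabilities `[0.55, 0.40, 0.03, 0.02]` would be flagged and assigned to cluster 1 with probability 0.55, whereas one with `[0.98, 0.01, 0.01, 0.00]` would not be flagged.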

3.1.2. Genetic Diversity among the Four Genetic Clusters

The total number of alleles scored among the four clusters ranged from 727 (Cluster 3) to 1414 (Cluster 1), while the total number of different alleles observed ranged from 66 (Cluster 3) to 118 (Cluster 4; Table 4).
Table 4. Estimates of different genetic diversity parameters within the four genetic clusters.
The analysis of allele frequency of the different clusters (populations) revealed that the mean number of different alleles per locus was lowest in cluster 3 (6.60) and highest in cluster 4 (11.80). The lowest and the highest private allele richness was recorded in cluster 3 (1.00) and cluster 1 (2.09), respectively. The effective number of alleles was lowest in cluster 3 (3.62) and highest in cluster 4 (5.68). Gene diversity, the unbiased expected and observed heterozygosity were lowest in cluster 1, i.e., 0.55, 0.70, and 0.60, respectively, pointing to a lower diversity among individuals of this group compared to other groups. The gene diversity was highest in cluster 3, i.e., 0.63, while allelic richness, unbiased expected heterozygosity, and the Shannon information index were highest for cluster 4, i.e., 9.48, 0.79, and 1.93, respectively, pointing to a higher diversity in these two avocado groups.
The number of alleles unique to a specific cluster, i.e., private alleles, per locus was lowest and highest in cluster 2 (0.5) and cluster 1 (2.3; Table 4). The least frequent private allele was an allele of 82 bp at the locus AVAG22, which had a frequency of 0.6% in cluster 1 (Table S1). The most frequent private allele was a 184 bp allele at the locus LMAV14, having a frequency of 61.5% in cluster 1. The number of alleles with a frequency of less than 5% in a population, i.e., rare alleles, per locus ranged from 1.8 (cluster 3) to 5.8 (cluster 4). The number of common alleles, with a frequency above or equal to 5% among the populations, per locus varied from 4.5 (cluster 1) to 6.0 (cluster 4). The most frequent common alleles were a 92 bp allele at the locus AVAG05, which had a frequency of 76.3% in cluster 1, followed by a 199 bp allele at the locus LMAV24, which had a frequency of 65.9% in cluster 2.
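The allele categories used in this paragraph can be made concrete with a small sketch that classifies the alleles at one locus as private (present in one cluster only), rare (within-cluster frequency below 5%), or common (frequency of 5% or more). This follows the definitions given in the text; GenAlEx’s own bookkeeping of common alleles may be defined differently, so the code is an illustration rather than a reimplementation, and the function name is hypothetical.

```python
def classify_alleles(freqs_by_cluster, rare_cutoff=0.05):
    """Classify alleles at a single locus for each cluster.

    freqs_by_cluster: {cluster: {allele: frequency}}
    Returns {cluster: {"private": [...], "rare": [...], "common": [...]}}.
    An allele can be both private and rare (or private and common).
    """
    summary = {}
    for cluster, freqs in freqs_by_cluster.items():
        present = {a for a, p in freqs.items() if p > 0}
        # Alleles observed in any other cluster.
        elsewhere = set()
        for other, other_freqs in freqs_by_cluster.items():
            if other != cluster:
                elsewhere |= {a for a, p in other_freqs.items() if p > 0}
        summary[cluster] = {
            "private": sorted(present - elsewhere),
            "rare": sorted(a for a in present if freqs[a] < rare_cutoff),
            "common": sorted(a for a in present if freqs[a] >= rare_cutoff),
        }
    return summary
```

Averaging the lengths of these lists over the ten loci would give per-locus counts of the kind reported in Table 4.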

3.1.3. Genetic Relationship among the Studied Avocado Samples

PCoA was used to study the genetic relationship among the investigated avocado trees. The grouping pattern of the trees in the PCoA (Figure 6) was more or less similar to the DAPC findings, i.e., a grouping of samples into four clusters. The first two principal axes explained 19.64% of the total variation. Some individuals of cluster 2 and cluster 4 were projected on almost similar positions, while all individuals of cluster 1 and cluster 3 were resolved into distinct positions.
Figure 6. Principal coordinate analysis (PCoA) demonstrating the genetic relationships among individuals of the four clusters identified through DAPC.
The Nei’s genetic distance matrix of the 226 avocado samples was used to study the genetic relationship among the four clusters identified by the DAPC. The dendrogram derived through the neighbor-joining cluster analysis method resolved clusters 1 and 3 into distinct groups except in a few cases (corresponding to group 1 and 3, respectively, in Figure 7). Group 2 contained samples from clusters 2 (in orange) and 4 (in blue), which suggests that members of the two clusters had higher genetic relatedness compared to the other clusters.
Figure 7. A simple sequence repeat (SSR) based dendrogram of the genetic relationship between the 226 avocado samples showing three major groups; group 1 and 3 correspond to clusters 1 and 3, respectively, and group 2 was a mosaic of individuals of two closely related clusters, i.e., cluster 2 (in orange) and cluster 4 (in blue) with three individuals from cluster 1 (in red). Samples marked with the same color belong to the same cluster. Highly admixed samples are indicated with arrows.

3.1.4. Analysis of Molecular Variance and Population Differentiation

Analysis of molecular variance showed a higher molecular variance among the four avocado clusters (15.91%) than among individuals within clusters (9.91%), with the within individuals variance being the highest, 74.18% (Table 5).
Table 5. Analysis of molecular variance (AMOVA) was conducted by grouping trees into their respective genetic clusters.
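The link between the AMOVA percentages and the fixation index can be checked with a few lines of arithmetic. The sketch below applies the usual three-level definitions (among populations, among individuals within populations, within individuals). Only the FST of about 0.159 is reported in this article; the FIS and FIT values that the function also returns are derived here for illustration and are not figures stated by the authors. The function name is hypothetical.

```python
def fixation_indices(pct_among, pct_among_ind_within, pct_within_ind):
    """Derive F-statistics from three-level AMOVA variance percentages.

    FST = among-population share of the total variance
    FIS = among-individual share of the within-population variance
    FIT = share of the total variance not found within individuals
    """
    total = pct_among + pct_among_ind_within + pct_within_ind
    f_st = pct_among / total
    f_is = pct_among_ind_within / (pct_among_ind_within + pct_within_ind)
    f_it = (pct_among + pct_among_ind_within) / total
    return f_st, f_is, f_it


# Percentages reported in the text for the four genetic clusters.
f_st, f_is, f_it = fixation_indices(15.91, 9.91, 74.18)
```

The three indices satisfy (1 − FIT) = (1 − FIS)(1 − FST), which provides a quick consistency check on any reported AMOVA table.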
The genetic differentiation among the four clusters identified by DAPC was investigated further by computing population pairwise FST (Table 6), and the analysis revealed significant differentiation among all pairs of clusters. The highest genetic differentiation was observed between clusters 1 and 2 (FST = 0.174), whereas clusters 2 and 4 displayed the lowest differentiation (FST = 0.062). Likewise, the analysis of Nei’s genetic distances between clusters revealed the largest genetic distance between clusters 1 and 2 (1.163) and the smallest distance between clusters 2 and 4 (0.310). Cluster 1 had the largest mean FST (0.077) and genetic distance (0.735) from the other three clusters. The lowest mean FST (0.055) and genetic distance (0.486) from the other three clusters were recorded in clusters 4 and 3, respectively.
Table 6. Pairwise differentiation of clusters (FST) (above diagonal) and Nei’s genetic distance between clusters (below diagonal) and mean FST and genetic distance of each cluster from the other three clusters. All pairwise FST values were significant at p < 0.001.

3.2. Morphological Characterization

3.2.1. Morphological Characteristics among Individuals of the Genetic Clusters

Analysis of morphological characteristics among individuals of the four clusters revealed that the majority of the phenotypes appeared in at least two clusters (Table 7).
Table 7. Frequency distribution of different phenotypic characteristics across the four clusters (cluster-specific phenotypes in bold).

3.2.2. Morphological Relationships among Individuals of the Four Clusters

Morphology-based principal components analysis of mixed data (PCAmix) of the investigated trees showed intermingling of the individuals from the four genetic clusters, with the first two axes together explaining 10.13% of the total variation (Figure 8). Similar results were noted in the morphology-based dendrogram, in which the avocado trees were clustered into three groups, with each group containing individuals from all four clusters (Figure 9).
Figure 8. Principal components analysis of mixed data (PCAmix) demonstrating the morphological relationships among individuals of the four clusters.
Figure 9. Morphological trait-based dendrogram of the 226 avocado trees, which grouped the trees into three major groups, with each group being composed of individuals from all the four genetic clusters. Note: samples marked with the same color belong to the same genetic cluster.

3.3. Correlation between Genetic, Morphological and Geographic Distances

The Mantel test indicated a low positive but statistically supported correlation between the genetic and geographical distances (r = 0.15, p = 0.001; Figure S2), between the morphological and geographical distances (r = 0.08, p = 0.001; Figure S3) and between the genetic and morphological distances (r = 0.11, p = 0.001; Figure S4) when the analysis was performed on individual samples.

4. Discussion

The present study has demonstrated the effectiveness of the genetic markers (microsatellite markers) over traditional morphological markers in characterizing avocado, exploring the diversity and the relationships among the individuals. Likewise, the study has shown the utility of DAPC in establishing the population structure of avocado crops and providing in-depth information on the individuals of the identified genetic clusters, which is an important step for practical plant breeding and conservation.
High diversity was noticed among the individuals of the four genetic clusters at the ten microsatellite loci. The mean number of different alleles per locus among the four clusters ranged from 6.60 (cluster 3) to 11.80 (cluster 4), with an average of 9.40 across the four clusters and loci (Table 4). Gross-German and Viruel [37] found a range of 3.7 (West Indian group) to 7.10 (hybrid group) with an average of 5.58 for the four populations they investigated, which consisted of a total of 41 avocado samples. Boza et al. [4] reported a range of 7.93 (Mexican group) to 9.78 (Guatemalan group), among the three horticultural groups, with a much higher overall mean of 9.09. Similarly, Schnell et al. [23] got a range of 6.00 (Mexican × West Indian group) to 13.35 (Mexican group) with an overall average of 10.26 for six populations of avocado comprising 221 samples. Cañas-Gutiérrez et al. [49] reported a lower overall mean, 4.46 for 18 geographical populations. In the present work, allele richness was lowest in cluster 3 (6.00) and highest in cluster 4 (9.48) with an overall mean value of 7.69. This suggests that clusters 3 and 4 were the least and the most genetically diverse clusters, respectively. The most genetically diverse groups would be offered protection in conservation programs, and they may provide the best plant materials for breeding programs, whereas the least genetically diverse groups would deserve special conservation management [50]. Guzmán et al. [24] recorded a comparatively lower allelic richness, 5.95 (Mexican group) to 6.22 (West Indian group) with an overall average of 6.10, for the three avocado racial groups. While the current study’s private allele richness ranged from 1.00 (cluster 3) to 2.09 (cluster 1) with an overall average of 1.45, Guzmán et al. [24] recorded a range of 0.63 (Mexican group) to 0.89 (Guatemalan group) with an overall mean of 0.74 for the three avocado populations. 
The average observed and expected heterozygosity for the four clusters was found to be 0.65 and 0.74, respectively. Lower values were reported by Boza et al. [4], Ho: 0.53 and He: 0.64, for the three horticultural races included in their study. Higher values were estimated by Gross-German and Viruel [37], Ho: 0.66 and He: 0.71 (four populations), and Schnell et al. [23], Ho: 0.71 and He: 0.77 (six populations), indicating a comparatively higher diversity. While the overall average gene diversity in the present work was 0.59, Boza et al. [4] obtained a higher value (0.63) for the three avocado races they investigated.
The number of private alleles per locus ranged from 0.50 (cluster 2) to 2.30 (cluster 1) with a grand mean of 1.23 across all populations and loci (Table 4). Private alleles are a measure of population differentiation; thus, the highest number of private alleles per locus, detected in cluster 1, indicates that this cluster is the most genetically differentiated, as was also revealed by its largest mean FST. Boza et al. [4] reported the number of private alleles per locus ranging from 0.65 (Mexican group) to 0.71 (West Indian group) among the three avocado races, and 0.02 to 0.07 among their six hybrid groups with a grand mean value of 0.23 for the nine populations, which is lower than the value obtained in our study. In the present study, the lowest and highest numbers of rare alleles per locus were 1.80 (cluster 3) and 5.80 (cluster 4), whereas Boza et al. [4] obtained a range of 3.31 (Mexican group) to 6.24 (West Indian group) among the three botanical groups, and 0.00 to 3.44 among their six hybrid groups. Rare alleles are significant in plant breeding as they may be associated with adaptations to biotic and abiotic stresses [51]. In our study, the number of common alleles per locus varied from 4.40 (cluster 1 and cluster 3) to 6.00 (cluster 4), whereas Boza et al. [4] obtained a range of 3.22 (West Indian group) to 4.67 (Guatemalan group) among the three botanical groups, and 3.80 to 4.64 among their six hybrid groups.
The PCoA (Figure 6) and dendrogram (Figure 7) obtained from microsatellite marker-based analyses resolved the studied trees into groups that were more or less similar to the four genetic clusters established by the DAPC analysis. Gross-German and Viruel [37] observed that the model-based (STRUCTURE) genetic clustering, PCoA, and cluster analysis results were in line with the distribution of avocado into botanical races, i.e., the Mexican, West Indian, and interracial Guatemalan × Mexican. Similarly, Alcaraz and Hormaza [15] observed that the UPGMA based dendrogram grouped 75 avocado accessions into three major groups that mainly corresponded to the botanical races. The four genetic clusters (groups) generated in the present study might represent the three avocado races and a hybrid group. This was also indicated by Juma et al. [13], as Tanzanian avocado germplasm analyzed using different morphological traits was shown to contain material from all three races. Traits included were trunk surface and peel thickness. Smooth trunk surface was reported as an attribute of the Mexican and Guatemalan races, and the rough and very rough trunk surface is attributed to the West Indian race [52]. Thin ripe peel (≤1 mm thick) is ascribed to the West Indian and Mexican races, and a thick ripe peel (2–3 mm thick) was ascribed to the Guatemalan race [53]. Other traits were the doughy and buttery flesh textures ascribed to the Guatemalan and Mexican races and the watery flesh texture attributed to the West Indian group [53]. However, in the present study, the examination of these characteristics showed that they appeared among individuals of all four clusters. More genetic studies need to be carried out on the Tanzanian avocado germplasm together with representative samples of the three avocado races to confirm the germplasm’s racial origin.
The AMOVA indicated that the overall genetic differentiation among the four avocado genetic clusters, FST, was 0.159 (p < 0.0001). This implies a substantial amount of diversity harbored by the trees investigated and that the four genetic clusters were significantly distinct. The level of population differentiation (FST) observed in this study was higher than the values reported by Juma et al. [25] for the same plant material when AMOVA was carried out on district-based populations (FST = 0.061, p < 0.0001) and altitudinal groups (FST = 0.025, p < 0.0001). Gross-German and Viruel [37] and Boza et al. [4] found an overall population differentiation of 0.25 and 0.193, respectively, which are comparatively higher than the value obtained in our study. In both studies, populations were based on the racial origin of avocado. Contrary to that, Cañas-Gutiérrez et al. [49] noted an overall population differentiation of 0.054 among municipality-based populations, which is about 66% less than the value observed in the present study. Considering the AMOVA-based findings from the mentioned studies, it can be concluded that the overall population differentiation among avocado groups is higher if the grouping is based on racial origin than if it is based on geographical origin.
Pairwise comparison of population differentiation (FST) and divergence (Nei’s genetic distance) revealed significant differentiation among all the clusters, with the lowest differentiation and genetic distance between clusters 2 and 4 (FST = 0.062; Nei’s genetic distance = 0.310; Table 6). The comparatively low Nei’s genetic distance between clusters 2 and 4 explains why the two clusters were less resolved from one another in the DAPC and in the microsatellite-based PCoA and dendrogram.
The morphology-based PCAmix and dendrogram did not group the analyzed trees according to their genetic clusters; both analyses showed individual trees from the four clusters intermingled. This finding suggests that the SSR loci investigated were not linked to the genes governing the investigated morphological traits. Alternatively, if such linkage exists, the environment may have strongly influenced the phenotypes.
A weak positive correlation was revealed between the geographical distance of the sampling locations and the genetic distance (r = 0.15, p = 0.001) and between the geographical distance and the morphological dissimilarity matrix (r = 0.08, p = 0.001). Prohens et al. [54] observed a lack of significant correlation between geographical distance and AFLP-based genetic distance (r = 0.11, p < 0.10) in their study of 28 Spanish eggplant accessions (Solanum melongena L.). Contrary to our study, they observed a comparatively higher correlation between the geographical and morphological distances (r = 0.25, p < 0.01). Sreekumar et al. [55] reported a highly significant correlation between geographical distance and AFLP-based genetic distance (r = 0.73, p = 0.009), whereas no significant correlation could be found between geographical distance and morphological trait-based distance (r = 0.44, p = 0.07) in their study of 60 breadfruit samples (Artocarpus altilis) in India. The weak correlation between geographical and genetic or morphological distances observed in the present study could be due to persistent movements and sharing of seeds between farmers of different areas [13,25,33].
In the present study, a weak positive correlation was also noticed between the genetic and morphological distances (r = 0.11, p = 0.001). This suggests that there was no strong association between the studied morphological traits and the 10 SSR loci investigated. It also suggests that the morphological trait variation cannot fully display the pattern of genetic diversity in avocado. Working with 62 Ethiopian maize accessions, Beyene et al. [28] noticed a moderate positive significant correlation between AFLP-based genetic and morphological distances (r = 0.39, p = 0.001), and also between SSR-based genetic and morphological distances (r = 0.43, p = 0.001). In a similar study on Vietnamese and Cambodian sesame accessions, Pham et al. [56] reported a highly significant positive correlation (r = 0.88, p = 0.001) between agro-morphological and RAPD marker-based distances between the accessions. Contrary to that, Roldan-Ruiz et al. [57] observed an absence of correlation between AFLP-based genetic and morphological distances (r = −0.06, p < 0.375) and a weak correlation between the sequence tag sites (STS)-based genetic and morphological distances (r = 0.18, p < 0.12) in 16 ryegrass varieties. Similarly, Sreekumar et al. [55] reported an absence of correlation between the AFLP-based genetic distance and the morphological distance (r = 0.01, p = 0.5) of breadfruit in India. Smith and Smith [14] asserted that phenotypic variation sometimes does not follow genetic variation due to the influence of the environment on the phenotypic expression of the genotypes and potential multiple gene action on the traits.
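The correlations discussed above come from Mantel tests, which compare two distance matrices computed on the same samples. The following is a minimal sketch of such a test, assuming a Pearson correlation on the upper triangles and a one-sided permutation p-value. It is not the software used in this study, and the function names, the toy coordinates, and the distance values are hypothetical.

```python
import random
from math import sqrt


def upper_triangle(matrix, order=None):
    """Return the entries above the diagonal, optionally after relabelling samples."""
    n = len(matrix)
    idx = list(range(n)) if order is None else order
    return [matrix[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel_test(dist_a, dist_b, permutations=999, seed=0):
    """Mantel r and a one-sided permutation p-value for two distance matrices."""
    rng = random.Random(seed)
    flat_a = upper_triangle(dist_a)
    r_obs = pearson(flat_a, upper_triangle(dist_b))
    n = len(dist_b)
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)  # relabel the samples of the second matrix
        if pearson(flat_a, upper_triangle(dist_b, order)) >= r_obs:
            hits += 1
    # The observed arrangement counts as one of the permutations.
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical toy data: five sampling sites on a line (positions in km)
positions = [0.0, 1.0, 3.0, 7.0, 12.0]
geo = [[abs(a - b) for b in positions] for a in positions]
# Hypothetical "genetic" distances that are only loosely related to geography
noise = [[0, 3, 1, 4, 2], [3, 0, 5, 2, 6], [1, 5, 0, 3, 1], [4, 2, 3, 0, 5], [2, 6, 1, 5, 0]]
gen = [[0.1 * geo[i][j] + noise[i][j] for j in range(5)] for i in range(5)]

r, p = mantel_test(geo, gen)
print(f"Mantel r = {r:.2f}, p = {p:.3f}")
```

Because rows and columns are permuted together, the test respects the non-independence of pairwise distances. With many samples, as with the 226 trees here, even a small r can be statistically significant, which is why the strength of the correlation and its p-value are reported separately.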

5. Conclusions

The findings from this study showed that the population structure of the analyzed avocado trees comprised four genetic clusters that might represent the three horticultural races of avocado (Mexican, Guatemalan, and West Indian) and a hybrid group. Although the four clusters were genetically distinguishable, their morphological characters overlapped, even for characters that are supposed to be found only in a particular horticultural race. The weak positive correlation observed between geographical and genetic or morphological distances indicates that the genetic and morphological characteristics of the studied trees varied slightly with the geographical locations. Similarly, the weak positive correlation observed between the genetic and morphological distances indicates a low level of agreement between the diversity patterns derived from the two distances.

Supplementary Materials

The following are available online at https://www.mdpi.com/2073-4425/12/1/63/s1, Figure S1: The distribution pattern of the alleles of different clusters for all avocado trees, Figure S2: Mantel test showing the correlation between genetic and geographical distances, Figure S3: Mantel test showing a correlation between dissimilarity matrix and geographical distance, Figure S4: Mantel test showing a correlation between dissimilarity matrix and genetic distance, Table S1: Allele frequencies by locus for the four genetic clusters.

Author Contributions

Conceptualization, I.J., A.N., M.F. and R.O.; Methodology, I.J., M.G., H.P.H., A.N. and R.O.; Data collection, I.J.; Data analysis, I.J. under guidance of M.G., G.V.S. and R.O.; Resources, A.N. and R.O.; Writing—Original draft preparation, I.J.; Writing—Review and editing, I.J., M.G., H.P.H., A.N., M.F., A.S.C. and R.O.; Project administration, R.O.; Funding acquisition, R.O. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Swedish International Development Cooperation Agency (Sida), grant number SIDA-Tz-UDSM-2015. The APC was funded by the same agency.

Institutional Review Board Statement

Not applicable.

Acknowledgments

The authors thank Jan-Erik Englund and Adam Flöhr (Swedish University of Agricultural Sciences, SLU) for assisting with data analyses. We thank the SLU Bioinformatics Infrastructure (SLUBI) for the bioinformatics support.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Yasir, M.; Das, S.; Kharya, M.D. The phytochemical and pharmacological profile of Persea americana Mill. Pharm. Rev. 2010, 4, 77–84.
  2. Rodriguez, P. Avocado Market Trends Hitting 2020. Available online: https://www.inspirafarms.com/avocado-market-trends-hitting-2020/ (accessed on 20 September 2020).
  3. Bergh, B.O.; Ellstrand, N. Taxonomy of the avocado. Calif. Avocado Soc. Yearb. 1986, 70, 135–145.
  4. Boza, E.J.; Tondo, C.L.; Ledesma, N.; Campbell, R.J.; Bost, J.; Schnell, R.J.; Gutiérrez, O.A. Genetic differentiation, races and interracial admixture in avocado (Persea americana Mill.), and Persea spp. evaluated using SSR markers. Genet. Resour. Crop Evol. 2018, 65, 1195–1215.
  5. Talavera, A.; Soorni, A.; Bombarely, A.; Matas, A.J.; Hormaza, J.I. Genome-Wide SNP discovery and genomic characterization in avocado (Persea americana Mill.). Sci. Rep. 2019, 9, 1–3.
  6. Thormann, C.E.; Ferreira, M.E.; Camargo, L.E.A.; Tivang, J.G.; Osborn, T.C. Comparison of RFLP and RAPD markers to estimating genetic relationships within and among cruciferous species. Theor. Appl. Genet. 1994, 88, 973–980.
  7. UCAVO. (Undated). Avocado Varieties, Variety List. Available online: http://www.ucavo.ucr.edu/AvocadoVarieties/VarietyFrame.html#Anchor-47857 (accessed on 20 September 2020).
  8. Florida Department of Agriculture and Consumer Services. (Undated). Florida Avocado Varieties. Available online: https://www.ams.usda.gov/sites/default/files/media/FloridaAvocadoVarieties.pdf (accessed on 20 July 2020).
  9. Nkansah, G.O.; Ofosu-Budu, K.G.; Ayarna, A.W. Genetic diversity among local and introduced avocado germplasm based on morpho-agronomic traits. Int. J. Plant Breed Genet. 2013, 7, 76–91.
  10. Abraham, J.D.; Abraham, J.; Takrama, J.F. Morphological characteristics of avocado (Persea americana Mill.) in Ghana. Afr. J. Plant Sci. 2018, 12, 88–97.
  11. Gutiérrez-Díez, A.; Sánchez-González, E.I.; Torres-Castillo, J.A.; Cerda-Hurtado, I.M.; Ojeda-Zacarías, M.D.C. Genetic diversity of Mexican avocado in Nuevo Leon, Mexico. In Molecular Approaches to Genetic Diversity; Mahmut, Ç., Halil, K., Gül, C.Ö., Birgul, O., Eds.; IntechOpen: London, UK, 2015; pp. 141–159.
  12. Ismadi, R.S.H.; Hafifah, I.F. Exploration and morphological characterization of vegetative part of avocado at Bebesan subdistrict central Aceh district, Indonesia. In Proceedings of MICoMS 2017 (Emerald Reach Proceedings Series, Volume 1); Emerald Publishing Limited: Bingley, UK, 2018; pp. 69–73.
  13. Juma, I.; Nyomora, A.; Hovmalm, H.P.; Fatih, M.; Geleta, M.; Carlsson, A.S.; Ortiz, R. Characterization of Tanzanian avocado using morphological traits. Diversity 2020, 12, 64.
  14. Smith, J.S.C.; Smith, O.S. Fingerprinting crop varieties. Adv. Agron. 1992, 47, 85–140.
  15. Alcaraz, M.L.; Hormaza, J.I. Molecular characterization and genetic diversity in an avocado collection of cultivars and local Spanish genotypes using SSRs. Hereditas 2007, 144, 244–253.
  16. Torres, A.M.; Bergh, B.O. Isozymes as indicators of outcrossing among ‘Pinkerton’ seedlings. Calif. Avocado Soc. Yearb. 1978, 62, 103–110.
  17. Lavi, U.; Hillel, J.; Vainstein, A.; Lahav, E.; Sharon, D. Application of DNA fingerprints for identification and genetic analysis of avocado. J. Am. Soc. Hortic. Sci. 1991, 116, 1078–1081.
  18. Mhameed, S.; Sharon, D.; Kaufman, D.; Lahav, E.; Hillel, J.; Degani, C.; Lavi, U. Genetic relationships within avocado (Persea americana Mill) cultivars and between Persea species. Theor. Appl. Genet. 1997, 94, 279–286.
  19. Fiedler, J.; Bufler, G.; Bangerth, F. Genetic relationships of avocado (Persea americana Mill.) using RAPD markers. Euphytica 1998, 101, 249–255.
  20. Furnier, G.R.; Cummings, M.P.; Clegg, M.T. Evolution of the avocados as revealed by DNA restriction fragment variation. J. Hered. 1990, 81, 183–188.
  21. Davis, J.; Henderson, D.; Kobayashi, M.; Clegg, M.T. Genealogical relationships among cultivated avocado as revealed through RFLP analyses. J. Hered. 1998, 89, 319–323.
  22. Cuiris-Pérez, H.; Guillén-Andrade, H.; Pedraza-Santos, M.E.; López-Medina, J.; Vidales-Fernández, I. Genetic variability within Mexican race avocado (Persea americana Mill.) germplasm collections determined by ISSRs. Rev. Chapingo Ser. Hortic. 2009, 15, 169–175.
  23. Schnell, R.J.; Brown, J.S.; Olano, C.T.; Power, E.J.; Krol, C.A. Evaluation of avocado germplasm using microsatellite markers. J. Am. Soc. Hortic. Sci. 2003, 128, 881–889.
  24. Guzmán, L.F.; Machida-Hirano, R.; Borrayo, E.; Cortés-Cruz, M.; Espíndola-Barquera, M.C.; Heredia-García, E. Genetic structure and selection of a core collection for long term conservation of avocado in Mexico. Front. Plant Sci. 2017, 8, 243.
  25. Juma, I.; Geleta, M.; Nyomora, A.; Saripella, G.V.; Hovmalm, H.P.; Carlsson, A.S.; Fatih, M.; Ortiz, R. Genetic diversity of avocado from the southern highlands of Tanzania as revealed by microsatellite markers. Hereditas 2020, 157, 1–12.
  26. Ge, Y.; Zhang, T.; Wu, B.; Tan, L.; Ma, F.N.; Zou, M.H.; Chen, H.H.; Pei, J.L.; Liu, Y.Z.; Chen, Z.H.; et al. Genome-wide assessment of avocado germplasm determined from specific length amplified fragment sequencing and transcriptomes: Population structure, genetic diversity, identification, and application of race-specific markers. Genes 2019, 10, 215.
  27. Rubinstein, M.; Eshed, R.; Rozen, A.; Zviran, T.; Kuhn, D.N.; Irihimovitch, V.; Sherman, A.; Ophir, R. Genetic diversity of avocado (Persea americana Mill.) germplasm using pooled sequencing. BMC Genom. 2019, 20, 379.
  28. Beyene, Y.; Botha, A.M.; Myburg, A.A. A comparative study of molecular and morphological methods of describing genetic relationships in traditional Ethiopian highland maize. Afr. J. Biotechnol. 2005, 4, 586–595.
  29. Loots, S.; Nybom, H.; Schwager, M.; Sehic, J.; Ritz, C.M. Genetic variation among and within Lithops species in Namibia. Plant Syst. Evol. 2019, 305, 985–999.
  30. Pritchard, J.K.; Stephens, M.; Donnelly, P. Inference of population structure using multilocus genotype data. Genetics 2000, 155, 945–959.
  31. Falush, D.; Stephens, M.; Pritchard, J.K. Inference of population structure using multilocus genotype data: Linked loci and correlated allele frequencies. Genetics 2003, 164, 1567–1587.
  32. Hubisz, M.J.; Falush, D.; Stephens, M.; Pritchard, J.K. Inferring weak population structure with the assistance of sample group information. Mol. Ecol. Resour. 2009, 9, 1322–1332.
  33. Juma, I.; Fors, H.; Hovmalm, H.P.; Nyomora, A.; Fatih, M.; Geleta, M.; Carlsson, A.S.; Ortiz, R. Avocado production and local trade in the southern highlands of Tanzania: A case of an emerging trade commodity from horticulture. Agronomy 2019, 9, 749.
  34. IPGRI. Descriptors for Avocado (Persea spp.); International Plant Genetic Resources Institute: Rome, Italy, 1995; p. 52.
  35. Royal Horticultural Society. RHS Large Colour Chart, 6th ed.; Royal Horticultural Society: London, UK, 2015.
  36. Sharon, D.; Cregan, P.; Mhameed, S.; Kusharska, M.; Hillel, J.; Lahav, E.; Lavi, U. An integrated genetic linkage map of avocado. Theor. Appl. Genet. 1997, 95, 911–921.
  37. Gross-German, E.; Viruel, M.A. Molecular characterization of avocado germplasm with a new set of SSR and EST-SSR markers: Genetic diversity, population structure, and identification of race-specific markers in a group of cultivated genotypes. Tree Genet. Genomes 2013, 9, 539–555.
  38. Jombart, T.; Kamvar, Z.N. df2genind: Convert a Data.Frame of Allele Data to a Genind Object. 2020. Available online: https://rdrr.io/cran/adegenet/man/df2genind.html (accessed on 10 July 2020).
  39. Jombart, T.; Collins, C. A Tutorial for Discriminant Analysis of Principal Components (DAPC) Using Adegenet 2.0.0. 2015. Available online: http://adegenet.r-forge.r-project.org/files/tutorial-dapc.pdf (accessed on 15 July 2020).
  40. Kalinowski, S.T. HP-RARE 1.0: A computer program for performing rarefaction on measures of allelic richness. Mol. Ecol. Notes 2005, 5, 187–189.
  41. Peakall, R.; Smouse, P.E. GenAlEx 6.5: Genetic analysis in Excel. Population genetic software for teaching and research—An update. Bioinformatics 2012, 28, 2537–2539.
  42. Excoffier, L.; Lischer, H. Arlequin suite ver 3.5: A new series of programs to perform population genetics analyses under Linux and Windows. Mol. Ecol. Resour. 2010, 10, 564–567.
  43. Kumar, S.; Stecher, G.; Li, M.; Knyaz, C.; Tamura, K. MEGA X: Molecular evolutionary genetics analysis across computing platforms. Mol. Biol. Evol. 2018, 35, 1547–1549.
  44. Letunic, I.; Bork, P. Interactive tree of life (iTOL) v4: Recent updates and new developments. Nucleic Acids Res. 2019, 47, W256–W259.
  45. Ramsay, J.O.; Silverman, B.W. Principal components analysis of mixed data. In Functional Data Analysis; Springer Series in Statistics; Springer: New York, NY, USA, 1997.
  46. Addinsoft. The XLSTAT-Base Solution, Essential Data Analysis Tools for Excel; Addinsoft: Boston, MA, USA, 2018.
  47. Kaufman, L.; Rousseeuw, P.J. Finding Groups in Data: An Introduction to Cluster Analysis; Wiley: New York, NY, USA, 1990.
  48. Legendre, P.; Legendre, L. Numerical Ecology, 3rd ed.; Elsevier: Amsterdam, The Netherlands, 2012.
  49. Cañas-Gutiérrez, G.P.; Arango-Isaza, R.E.; Saldamando-Benjumea, C.I. Microsatellites revealed genetic diversity and population structure in Colombian avocado (Persea americana Mill.) germplasm collection and its natural populations. J. Plant Breed. Crop Sci. 2019, 11, 106–119.
  50. Petit, R.; El Mousadik, A.; Pons, O. Identifying populations for conservation on the basis of genetic markers. Conserv. Biol. 1998, 12, 844–855.
  51. Reyes-Valdés, M.H.; Burgueño, J.; Singh, S.; Martínez, O.; Sansaloni, C.P. An informational view of accession rarity and allele specificity in germplasm banks for management and conservation. PLoS ONE 2018, 13, e0193346.
  52. Bergh, B. The origin, nature, and genetic improvement of the avocado. The biennial conference of the Australian Avocado Growers’ Federation, Gold Coast, Australia, 28th September–October, 1992. Calif. Avocado Soc. Yearb. 1992, 76, 61–75. Available online: https://pdfs.semanticscholar.org/4626/2f11cf7965151fe6e306c57d6643ccfc9e5d.pdf (accessed on 15 July 2020).
  53. Popenoe, W. Manual of Tropical and Subtropical Fruits; Hafner Press: New York, NY, USA, 1974.
  54. Prohens, J.; Blanca, J.M.; Nuez, F. Morphological and molecular variation in a collection of eggplants from a secondary center of diversity: Implications for conservation and breeding. J. Am. Soc. Hortic. Sci. 2005, 130, 54–63.
  55. Sreekumar, V.B.; Binoy, A.M.; George, S.T. Genetic and morphological variation in breadfruit (Artocarpus altilis (Park.) Fosberg) in the Western Ghats of India using AFLP markers. Genet. Resour. Crop Evol. 2007, 54, 1659–1665.
  56. Pham, T.D.; Geleta, M.; Bui, T.M.; Bui, T.C.; Merker, A.; Carlsson, A.S. Comparative analysis of genetic diversity of sesame (Sesamum indicum L.) from Vietnam and Cambodia using agro-morphological and molecular markers. Hereditas 2011, 148, 28–35.
  57. Roldan-Ruiz, L.; van Eeuwijk, F.A.; Gilliland, T.J.; Dubreuil, P.; Dillmann, C.; Lallemand, J.; de Loose, M.; Baril, C.P. A comparative study of molecular and morphological methods of describing relationships between perennial ryegrass (Lolium perenne L.) varieties. Theor. Appl. Genet. 2001, 103, 1138–1150.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
