Preliminary Evidence for Domestication Effects on the Genetic Diversity of Guazuma crinita in the Peruvian Amazon

Guazuma crinita, a fast-growing timber tree species, was chosen for domestication in the Peruvian Amazon because it can be harvested at an early age and it contributes to the livelihood of local farmers. It is in an early stage of domestication, and the impact of the domestication process on its genetic resources is not yet known. Amplified fragment length polymorphism (AFLP) fingerprints were used to estimate the genetic diversity of G. crinita populations in different stages of domestication. Our objectives were (i) to estimate the level of genetic diversity in G. crinita using AFLP markers, (ii) to describe how the genetic diversity is distributed within and among populations and provenances, and (iii) to assess the genetic diversity in naturally regenerated, cultivated and semi-domesticated populations. We generated fingerprints for 58 leaf samples representing eight provenances and the three population types. We used seven selective primer combinations. A total of 171 fragments were amplified with 99.4% polymorphism at the species level. Nei’s genetic diversity and Shannon information index were slightly higher in the naturally regenerated population than in the cultivated and semi-domesticated populations (He = 0.10, 0.09 and 0.09; I = 0.19, 0.15 and 0.16, respectively). The analysis of molecular variance showed higher genetic diversity within rather than among provenances (84% and 4%, respectively). Cluster analysis (unweighted pair group method with arithmetic mean) and principal coordinate analysis did not show correspondence between genetic and geographic distance. There was significant genetic differentiation among population types (Fst = 0.12 at p < 0.001). The sample size was small, so the results are considered preliminary, pending further research with larger sample sizes. Nevertheless, these results suggest that domestication has a slight but significant effect on the diversity levels of G. crinita, and this should be considered when planning a domestication program.


Introduction
Tropical forests provide many valuable products, including rubber, fruits and nuts, medicinal herbs, lumber, firewood, and charcoal [1]. Natural forest populations typically possess considerable genetic variation [2]. However, deforestation due to slash-and-burn agriculture [3], over-harvesting and other unsustainable forestry practices are reducing tree genetic diversity in many areas in the

Sampling
In this study, we analyzed the genetic diversity of G. crinita populations in three stages of tree domestication. A natural population included wild, naturally regenerated trees from one provenance that farmers retained in their fields. In the cultivated population, we sampled trees from one provenance that farmers planted in a home garden using seedlings produced in a home nursery. The semi-domesticated population included trees from six provenances in a clonal garden. Genotypes in the clonal garden were selected over a period of years from progeny trees originating from an extensive collection of 200 mother trees. The natural, cultivated and semi-domesticated populations are in the second, fourth and sixth stages, respectively, of the seven stages of domestication proposed by Vodouhe and Dansi [18].
A total of 84 individuals from the three different population types (natural, cultivated and semi-domesticated) were sampled from eight G. crinita provenances in the Peruvian Amazon (Figure 1). Thirty individuals from the village of Nuevo Piura were randomly sampled from a population of natural regeneration located in Campo Verde district, Ucayali region (150 m a.s.l.). In the city of Tingo Maria, Huanuco region, 30 cultivated individuals were sampled in a home garden (564 m a.s.l.). In addition, 24 vegetatively propagated trees were sampled from a clonal multiplication garden at the Peruvian Amazon Research Institute (IIAP), located 12.4 km from Pucallpa, Ucayali region (154 m a.s.l.). They represent selected genotypes from six provenances in two watersheds in the Peruvian Amazon. Young leaf tissues were collected from individual plants and then dried in silica gel for DNA extraction.
The sample size in this study was small, so we consider the results preliminary. Other studies of genetic diversity in tropical tree species have also used small sample sizes [17,19-22] and reported genetic diversity patterns consistent with studies based on large sample sizes.

DNA Extraction
DNA from the 84 leaf samples was extracted using the CTAB (cetyltrimethylammonium bromide) method [23] with a slight modification (adding a trace of polyvinylpyrrolidone (PVP) and 5 µL of RNase). DNA quality was assessed by 0.8% agarose gel electrophoresis and with a Nanodrop spectrophotometer (Thermo Scientific, Delaware, USA). Genomic DNA of sufficient quality for amplification was obtained from only 58 of the 84 samples (Table 1). It was diluted to 50 ng/µL and stored at −20 °C.

AFLP Amplification
Molecular AFLP markers were used because no previous genome information is required and a large number of polymorphic loci can be analysed simultaneously [24]. With the use of AFLP, we expected to successfully assess the genetic relationships between G. crinita populations in the Peruvian Amazon.
Techniques for the AFLP analysis of G. crinita were adapted from those described by Vos et al. [25]. Commercial AFLP kits (Stratec Molecular, Berlin, Germany) were used for the restriction, ligation and pre-amplification steps.
An AFLP Core Plant Reagent Kit I (Stratec Molecular, Berlin, Germany) was used for restriction and ligation. The restriction reaction volume was 5 µL and included the following: 1 µL of 5× Reaction Buffer
The selective amplification reactions were performed with slight modifications following the protocol described in Mikulášková et al. [26], in a total volume of 9.8 µL comprising 2.3 µL of preamplified DNA, 5.1 µL ddH2O, 1 µL 10× polymerase buffer (100 mM Tris-HCl (pH 8.3), 500 mM KCl, 11 mM MgCl2 and 0.1% gelatin) (Sigma-Aldrich, Saint Louis, USA), 0.2 mM dNTP (Thermo Scientific, USA), 0.5 pmol fluorescent dye-labelled EcoRI primer (Applied Biosystems, Foster City, California, USA), 0.5 pmol MseI primer (Generi Biotech, Hradec Králové, Czech Republic) and 0.2 U RedTaq DNA polymerase (Sigma-Aldrich, Saint Louis, USA). Selective PCR amplifications were carried out using the following cycle profile: 92 °C for 2 min, 65 °C for 30 s and 72 °C for 2 min. A touchdown protocol was applied in the following eight cycles: 94 °C for 1 s, 64 °C (decreasing by 1 °C each cycle) for 30 s, and 72 °C for 60 s. This was followed by 23 cycles of 94 °C for 1 s, 56 °C for 30 s and 72 °C for 2 min. Final elongation was carried out at 60 °C for 30 min.
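The annealing temperatures implied by the touchdown protocol can be listed explicitly. The following is a minimal sketch, assuming that the first of the eight touchdown cycles anneals at 64 °C and that each later touchdown cycle is 1 °C lower; the function name and its parameters are illustrative and are not part of the published protocol.

```python
def touchdown_schedule(start_anneal=64.0, step=1.0, touchdown_cycles=8,
                       final_anneal=56.0, final_cycles=23):
    """Return (cycle_number, annealing_temperature_in_C) for every cycle.

    Assumption: the first touchdown cycle anneals at `start_anneal` and the
    temperature drops by `step` in each subsequent touchdown cycle.
    """
    schedule = []
    for i in range(touchdown_cycles):
        schedule.append((i + 1, start_anneal - i * step))
    for j in range(final_cycles):
        schedule.append((touchdown_cycles + j + 1, final_anneal))
    return schedule


schedule = touchdown_schedule()
for cycle, temp in schedule[:9]:
    print(f"cycle {cycle:2d}: anneal at {temp:.0f} C")
```

Under this reading the eighth touchdown cycle anneals at 57 °C, one step above the 56 °C used in the 23 cycles that follow.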
Eleven primer combinations were tested, but only seven were selected for the final analysis because they produced distinct polymorphic bands. A T100™ Thermal Cycler (Bio-Rad Laboratories, California, USA) was used for all PCR amplifications. The final products after selective amplification were visualized on 1.8% agarose gels buffered in 1× TBE. Following a successful amplification, the AFLP products were prepared for analysis on a 3500 Genetic Analyzer automated sequencer (Applied Biosystems, Foster City, California, USA). Ten percent of the samples were analyzed twice for error rate estimation.

Data Analysis
AFLP fragments were analyzed using GeneMarker v 2.0.2 (SoftGenetics, USA). Polymorphic and strong peaks were scored as present or absent and then converted into a binary matrix. The data were used to calculate the percentage of polymorphic fragments, gene diversity (He) and Shannon's information index (I) using POPGENE v1.32 [27].
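To illustrate how such indices are derived from a binary presence/absence matrix, the sketch below computes the percentage of polymorphic fragments, Nei's gene diversity and Shannon's index for a toy data set. It assumes Hardy-Weinberg equilibrium, estimates the null-allele frequency as the square root of the band-absence frequency, and uses natural logarithms. These are common conventions for dominant markers, but the exact POPGENE settings are not stated in the text, so this should not be read as a reproduction of the POPGENE calculation.

```python
import math


def diversity_stats(matrix):
    """matrix: list of rows (one per individual) of 0/1 band scores.

    Returns (percent_polymorphic, mean_He, mean_Shannon_I) over all loci.
    """
    n = len(matrix)
    n_loci = len(matrix[0])
    polymorphic = 0
    he_sum = 0.0
    shannon_sum = 0.0
    for locus in range(n_loci):
        present = sum(row[locus] for row in matrix)
        if 0 < present < n:
            polymorphic += 1
        # Assumed estimator: q = sqrt(frequency of band absence) under HWE.
        q = math.sqrt((n - present) / n)
        p = 1.0 - q
        he_sum += 1.0 - p * p - q * q            # Nei's gene diversity
        shannon_sum += -sum(x * math.log(x) for x in (p, q) if x > 0)
    return 100.0 * polymorphic / n_loci, he_sum / n_loci, shannon_sum / n_loci


# Toy matrix (hypothetical data): 4 individuals x 3 fragments.
toy = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
]
ppf, he, shannon = diversity_stats(toy)
print(f"PPF = {ppf:.1f}%, He = {he:.3f}, I = {shannon:.3f}")
```

A fragment that is present (or absent) in every individual contributes zero to both He and I, which is why populations with few polymorphic fragments show low mean values.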
Analysis of molecular variance (AMOVA) was carried out to evaluate genetic diversity within and among samples, as well as to estimate genetic differentiation indexes, using GenAlEx v6 [28]. Principal coordinate analysis (PCoA) was carried out to assess genetic relationships among samples, also using GenAlEx v6.
Patterns of genetic relationships among samples were also investigated using cluster analysis. A dendrogram was constructed based on Jaccard's dissimilarity index with UPGMA using DARwin5 [29]. The software STRUCTURE v2.3.2.1 [30] was used to identify the number of similar population clusters (K) and the proportion of membership of each population in each of the K clusters. The analysis of the number of clusters was performed using the recessive allele model with burn-in and run lengths of 100,000 and 1,000,000 iterations, respectively. The number of clusters was determined following the guidelines of Pritchard and Wen [31] and Evanno et al. [32] using the online software Structure Harvester [33], and subsequently visualized using DISTRUCT 1.1 [34]. The AFLP percentage of reproducibility was calculated following Bonin et al. [35].
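The clustering step can be sketched in a few lines. The example below computes Jaccard's dissimilarity between binary AFLP profiles and joins the profiles by UPGMA (size-weighted average linkage). It is an illustrative implementation on hypothetical toy profiles, not the DARwin5 procedure, and the function names are assumptions made for this sketch.

```python
def jaccard_dissimilarity(a, b):
    """1 - (bands shared by both profiles / bands present in either)."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return 0.0 if either == 0 else 1.0 - shared / either


def upgma(profiles):
    """profiles: dict of sample name -> 0/1 list.

    Returns the merges in order as (cluster_label, joining_distance).
    """
    clusters = {name: [name] for name in profiles}  # cluster id -> members
    names = list(profiles)
    dist = {frozenset((a, b)): jaccard_dissimilarity(profiles[a], profiles[b])
            for i, a in enumerate(names) for b in names[i + 1:]}
    merges = []
    while len(clusters) > 1:
        pair = min(dist, key=dist.get)       # closest pair of clusters
        a, b = sorted(pair)
        d_ab = dist.pop(pair)
        members_a, members_b = clusters.pop(a), clusters.pop(b)
        new_id = "(" + a + "," + b + ")"
        for c in clusters:
            d_ac = dist.pop(frozenset((a, c)))
            d_bc = dist.pop(frozenset((b, c)))
            # UPGMA: average weighted by the sizes of the merged clusters.
            dist[frozenset((new_id, c))] = (
                len(members_a) * d_ac + len(members_b) * d_bc
            ) / (len(members_a) + len(members_b))
        clusters[new_id] = members_a + members_b
        merges.append((new_id, d_ab))
    return merges


# Hypothetical toy profiles, five fragments each.
toy_profiles = {
    "s1": [1, 1, 0, 1, 0],
    "s2": [1, 1, 0, 0, 0],
    "s3": [0, 0, 1, 1, 1],
}
merges = upgma(toy_profiles)
for label, distance in merges:
    print(label, round(distance, 3))
```

Jaccard's index ignores shared absences, which suits dominant markers such as AFLP, where a shared absent band is weaker evidence of similarity than a shared present band.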

AFLP Fingerprint
The seven primer combinations selected for the analysis revealed 10 to 35 fragments in the 58 G. crinita samples, with a mean of 24 fragments. Of the 171 total fragments, 99.4% were polymorphic. The fragments were in a size range of 52 to 336 bp (Table 2). The first primer combination (EcoRI-ACG/MseI-CTT) was the most successful, with a polymorphic rate of 20.5%. The least successful was EcoRI-ACG/MseI-CAG (5.8%). Ten percent of the samples were independently replicated with the same primer combinations, resulting in 85% fragment reproducibility among replicated samples.
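A reproducibility percentage of this kind can be obtained by comparing the band scores of each replicated sample with those of its original run. The sketch below assumes the mismatch-based error rate commonly attributed to Bonin et al., that is, the number of differing band scores divided by the number of scores compared. The profiles are hypothetical toy data chosen only to show the arithmetic; they are not the study's data.

```python
def reproducibility(pairs):
    """pairs: list of (original_profile, replicate_profile), each a 0/1 list.

    Returns the percentage of band scores that agree between replicates.
    """
    compared = 0
    mismatches = 0
    for original, replicate in pairs:
        for x, y in zip(original, replicate):
            compared += 1
            if x != y:
                mismatches += 1
    return 100.0 * (compared - mismatches) / compared


# Hypothetical replicate pairs: 3 mismatches in 20 comparisons.
toy_pairs = [
    ([1, 0, 1, 1, 0, 0, 1, 0, 1, 1], [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]),
    ([0, 1, 1, 0, 1, 0, 0, 1, 1, 0], [0, 1, 0, 0, 1, 0, 1, 1, 1, 0]),
]
print(f"reproducibility = {reproducibility(toy_pairs):.1f}%")
```

In this toy case the error rate is 3/20 = 15%, so the reproducibility is 85%.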

Genetic Diversity and Population Structure
By a single measure of intra-population diversity, the percentage of polymorphic fragments, samples from the Nuevo Piura provenance (natural population) exhibited the highest diversity (72.5%), followed by the samples from Tingo Maria (cultivated population, 42.2%). There was less diversity in the provenances in the semi-domesticated population (20.7% on average) (Table 3). However, when the semi-domesticated provenances were considered as one population, the percentage of polymorphic fragments was 54.4%, Nei's genetic diversity was 0.09 and the Shannon index was 0.16. Based on 170 polymorphic fragments from the 58 G. crinita samples, Nei's genetic diversity values ranged from 0.06 to 0.10 and the Shannon information index (I) ranged from 0.09 to 0.19 (Table 3). Comparing the three population types, all measures (polymorphic fragments (PF), percentage of polymorphic fragments (PPF), He, and I) were slightly higher in the population of natural regeneration.
The coefficient of genetic differentiation (Gst) among the three population types was 0.10. This indicates that 10% of the genetic diversity was distributed among the population types. Nei's genetic distance comparison between population types indicated that the smallest distance (0.011), and therefore the highest identity, was between natural and cultivated populations, and the largest distance (0.022), and therefore the lowest identity, was between cultivated and semi-domesticated populations.
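For reference, the interpretation of Gst as a share of total diversity follows from Nei's definition. The symbols H_T (total gene diversity) and H_S (mean gene diversity within population types) are standard notation introduced here for illustration; their values are not reported in this section.

```latex
G_{ST} = \frac{H_T - H_S}{H_T},
\qquad
G_{ST} = 0.10 \;\Rightarrow\; H_S = (1 - G_{ST})\,H_T = 0.90\,H_T
```

In other words, about 90% of the total gene diversity is found within population types and about 10% among them.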
Pairwise genetic distance between provenances ranged from 0.011 to 0.063 (Table 4). Nuevo Piura and Tingo Maria were the most similar, with the minimum distance value of 0.011, while the highest value of genetic distance (0.063) was between Nuevo Piura, Puerto Inca, and Tahuayo, San Alejandro.
Analysis of molecular variance (AMOVA) showed that 12% of the variation was among population types, 4% was among provenances and 84% was within provenances (Table 5). The level of differentiation among provenances was higher (Fst = 0.16) than among population types (Fst = 0.12) at p < 0.001. Patterns of genetic relationship were visualized using principal coordinate analysis (PCoA) (Figure S1) and a dendrogram based on Jaccard's dissimilarity, which grouped the 58 samples into two main clusters with seven sub-clusters (Figure S2). The number of clusters (K value) assessed by STRUCTURE analysis suggested that two was the optimal K because it had the largest delta K value. Under this K = 2 model, provenances from the semi-domesticated population (NR, TS, PI, MA, SA and CU) had some individuals with mixed membership in cluster 1 (black bar) and cluster 2 (white bar, Figure 2). The analysis also provided membership assignment, with the higher membership ranging from 56.6% (CU provenance in cluster 1) to 89.3% (SA provenance in cluster 1). In the natural and cultivated populations (NP and TM, respectively), the membership in cluster 2 was 73.2% and 80.4%, respectively.
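The AMOVA percentages and the two differentiation values reported above are arithmetically linked. The sketch below shows one plausible reading, assuming a three-level hierarchical AMOVA and the rounded percentages given in the text. The names phi_rt, phi_pr and phi_pt follow the convention used in GenAlEx output; mapping them onto the reported Fst values is an interpretation, not something the text states.

```python
def phi_statistics(among_types, among_provenances, within_provenances):
    """Derive hierarchical differentiation indexes from AMOVA variance shares."""
    total = among_types + among_provenances + within_provenances
    # Among population types, relative to the total variation.
    phi_rt = among_types / total
    # Among provenances within population types.
    phi_pr = among_provenances / (among_provenances + within_provenances)
    # Among provenances, relative to the total variation.
    phi_pt = (among_types + among_provenances) / total
    return phi_rt, phi_pr, phi_pt


# Rounded percentages from the AMOVA reported in the text.
phi_rt, phi_pr, phi_pt = phi_statistics(12, 4, 84)
print(f"among types = {phi_rt:.3f}, "
      f"among provenances within types = {phi_pr:.3f}, "
      f"among provenances overall = {phi_pt:.3f}")
```

With these inputs the among-types index is 0.12 and the overall among-provenance index is 0.16, which match the two reported values. The larger value arises because differences among provenances also include the differences among population types.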

Discussion
Studies of genetic variation in growth and wood traits of Guazuma crinita have been published [14,36,37], but genetic variation in morphological traits represents a small part of the total genetic variation in a species [38]. This research assessed genetic diversity in G. crinita based on amplified fragment length polymorphism (AFLP) markers, and we found 99.4% polymorphism. In another study involving eleven provenances of G. crinita in the Peruvian Amazon, there was 93.8% polymorphism based on Inter Simple Sequence Repeat (ISSR) markers [17]. Although the methods were different, both studies indicate high levels of genetic diversity in G. crinita. The high levels of diversity are probably related to the fact that G. crinita is a pioneer species and has long-distance seed dispersal, which results in extensive gene flow [39,40].
We analyzed the genetic diversity of G. crinita from three different population types (natural, cultivated and semi-domesticated). Comparing the genetic diversity parameters, such as PPF, He, and I, the naturally regenerated population had slightly greater genetic diversity than the cultivated and semi-domesticated populations. This suggests that artificial selection in the domestication process has reduced the levels of G. crinita genetic diversity. Other studies also confirmed that wild populations usually maintain higher levels of genetic diversity compared with cultivated populations [5,6,41,42].
Higher diversity in natural populations is expected because they are not affected by artificial selection. Maintaining high genetic diversity in natural populations is important because it reduces the risk of local extinction under natural conditions [1,43]. The conservation of cultivated populations is also important to conserve genetic diversity, particularly for those cultivated populations with superior individuals. Genetic diversity parameters were slightly higher in the semi-domesticated population compared with the cultivated population. This probably is due to the larger genetic base of the semi-domesticated population. The semi-domesticated population included six provenances with individuals selected from offspring of 200 mother trees (details of the initial collection were reported by Rochon et al. [14]), while the cultivated population represented only one provenance and a few mother trees. The number of mother trees used to establish a population is a key factor that affects inbreeding and genetic diversity: a low number will cause inbreeding among progeny, while a large number will increase genetic diversity and reduce differentiation among plantations [44].
In this study, three parameters were used to assess genetic differentiation among the three population types, and they gave similar results (AMOVA = 12%, Gst coefficient = 0.10 and Fst = 0.12). This indicates that about 12% of the variation was due to the domestication stage. In contrast, genetic differentiation among naturally regenerated and managed stands of Picea abies (L.) Karst in Europe is much lower (Fst = 0.012), suggesting that tree breeding activities have not greatly altered gene frequencies compared with natural populations of this species [45]. Geographical and climatic factors can also affect genetic differentiation, and their effects should be assessed in future studies of G. crinita [46,47].
The relatively low level of genetic differentiation (4%) among the eight G. crinita provenances can be explained by the high gene flow value (Nm = 12.9) reported by Tuisima et al. [17]. The high gene flow probably reflects the long-distance dispersal of its small seed by wind and water [37]. Lower genetic differentiation is also expected for cross-pollinated species [48]. The high level of genetic diversity within provenances in this study was consistent with reports based on phenotypic traits [14,37].
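The link between a high Nm and low differentiation can be made explicit with two commonly used approximations. Which estimator was used in [17] is not stated here, so both expressions are shown as assumptions, evaluated at the reported Nm = 12.9.

```latex
% Wright's island model
F_{ST} \approx \frac{1}{4Nm + 1}
\;\Rightarrow\;
F_{ST} \approx \frac{1}{4(12.9) + 1} = \frac{1}{52.6} \approx 0.019

% Estimator of the form Nm = 0.5\,(1 - G_{ST})/G_{ST}
G_{ST} = \frac{0.5}{Nm + 0.5} = \frac{0.5}{13.4} \approx 0.037
```

Under either expression, an Nm of this size corresponds to differentiation of only a few percent, the same order as the 4% among-provenance variation found here.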
Other studies have also reported relatively low genetic differentiation among tree populations in the Amazon Basin. Russell et al. [49] reported 9% variation among populations of Calycophyllum spruceanum Benth from several watersheds in the Peruvian Amazon Basin, using seven AFLP primer combinations. This species also produces small seeds that are dispersed over long distances by both wind and water. Nassar et al. [50] found low levels of diversity among populations of three native species from the Amazon Basin based on allozymes (8%, 6% and 7%, respectively, among populations of Samanea saman (Jack.) Merr. (Fabaceae), Guazuma ulmifolia (Malvaceae) and Hura crepitans L. (Euphorbiaceae)).
Many tree species have adaptations that allow long-distance seed dispersal [48,51]. One may expect that trees sampled over a relatively small geographical range (as in our study) would show low levels of variation among populations [15]. However, in some studies in the tropics, trees were sampled over extensive geographical ranges and still showed low differentiation among populations (e.g., Swietenia macrophylla King [52]; Vitellaria paradoxa Gaertn [53]; Inga edulis Mart [22]).
According to STRUCTURE analysis, individuals within provenances were assigned mixed membership in the two clusters, so provenances were not distinctly separated. This is consistent with the AMOVA, which showed much greater genetic diversity within than among provenances. As a result, it was not possible to correctly identify groups [54]. Nevertheless, we notice similarity among provenances from the semi-domesticated population (cluster 1), and similarity between provenances from the naturally regenerated and cultivated populations (cluster 2).

Conclusions
AFLP markers were successful and effective for the assessment of the genetic diversity and structure of G. crinita populations in different stages of the domestication process. A high level of genetic diversity was observed at the species level, and this probably reflects extensive gene flow due to long-distance seed dispersal.
Genetic diversity appears to be slightly greater in the natural population compared to the cultivated and semi-domesticated populations, while significant genetic differentiation was detected among the three population types. These results are preliminary, given the small sample size, and suggest the presence of a slight, but significant genetic bottleneck in the cultivated and semi-domesticated populations. The semi-domesticated population appears to have a slightly higher genetic diversity than the cultivated population.
There appears to be significant differentiation among the natural, cultivated and semi-domesticated populations, a result presented with caution given the small sample size employed. Future studies should include larger sample sizes in different domestication stages to confirm the results reported in this paper.
The in situ and circa situ conservation and sustainable management of naturally regenerated populations are recommended to maintain G. crinita genetic resources in order to cope with potential inbreeding depression and environmental changes.
To increase genetic variation in planted populations, we recommend further sampling, the collection of G. crinita seeds over an extensive geographic range (including various natural stands), and the establishment of seedlings and clonal seed orchards.