Genetic Variability and Structure of Fragaria nilgerrensis Schlecht. Germplasm in Sichuan Province

Fragaria nilgerrensis Schlecht. (wild strawberry) is widely distributed in Southwest China and is characterized by stress tolerance and fruits with a notable peach aroma. So far, knowledge of the variability and genetic structure of this species is limited. Using AFLP markers, we investigated the genetic variability of 37 plants of F. nilgerrensis sampled in six main mountain areas of Sichuan Province and analyzed their genetic structure. Genetic similarity according to Nei and Li was used for cluster analysis based on the UPGMA method and Agglomerative Hierarchical Clustering. Stratification of plants into distinct genetic groups was determined using Bayesian structure analysis. Six primer combinations produced a total of 1302 fragments, of which 818 (62.8%) were polymorphic. Bayesian analysis grouped the 37 plants of F. nilgerrensis into five distinct genetic groups. Most of the plants from the same mountain area clustered into the same genetic group, indicating that each area has a unique genetic profile. The genetic parameters analyzed here indicate high genetic variability of F. nilgerrensis in Sichuan Province. Our results provide reference data for surveying and protecting the germplasm resources of F. nilgerrensis that could be used in strawberry breeding programs.


Introduction
China has more wild strawberry genetic resources than any other country in the world. Out of about 20 recognized Fragaria species [1], 14 species are distributed in China [2,3]. Therefore, China has been considered as a crucial geographical distribution center of wild strawberry resources [4]. Fragaria nilgerrensis Schlecht. is only one of the 14 wild Fragaria species present in China [2]. It is a perennial herbaceous plant of the Rosaceae family. It is a vigorous species, endemic to Southwest Asia, from the mountains of the Philippines, across Central and Southwestern China, to the hilly region of southern India. F. nilgerrensis is also distributed in the mountain regions of Southeast Asia. In China, F. nilgerrensis is distributed mainly in the mountain regions of Sichuan Province and the surrounding areas in Southwest China [5][6][7], but also in Yunnan Province [4].
F. nilgerrensis is a diploid wild strawberry (2n = 2x = 14) [8] with a karyotype composed of six pairs of metacentric and one pair of submetacentric chromosomes [9]. The discovery of the narrow genetic base of the cultivated strawberry [10,11] has sparked renewed interest in the use of wild Fragaria species in breeding programs. Lower-ploidy Fragaria species (2x, 4x, and 6x) with desirable characteristics, such as unique flavor, aroma, vigor, disease and pest resistance, and adaptability to a wide range of habitats, could potentially be used in breeding programs [12]. Species hybrids of F. nilgerrensis may produce plants with higher net carbon exchange rates (NCER) and/or increased dry matter accumulation at high temperatures [13].
Some specimens of F. nilgerrensis from Southwestern China have shown resistance to aphids and leaf diseases, have a strong peach aroma [14], and are resistant to waterlogging [2,15]. These desirable characteristics of F. nilgerrensis could be introduced into cultivated varieties by interspecific hybridization and backcrossing. For this reason, F. nilgerrensis may be excellent material for interspecific crosses aimed at broadening the genetic diversity of cultivated strawberries [16]. To date, a few interspecific hybridizations have been performed between F. nilgerrensis and F. × ananassa Duch. [17,18], and also between F. nilgerrensis and F. nubicola, F. pentaphylla, and F. viridis [19]. Until now, genomic analyses of F. nilgerrensis have focused on the genetic basis of anthocyanin accumulation [4], sequencing of the full genome and phylogenetic analyses [11], sequencing of the chloroplast genome in 10 wild strawberry species, including F. nilgerrensis [20], and comparative genomic analysis of Fragaria species, including F. nilgerrensis [21], but there has been no study on genetic divergence within F. nilgerrensis. Phylogenetic analyses of different wild strawberry species placed F. nilgerrensis apart from the other diploid Fragaria species [22], although it was always closer to the diploid than to the polyploid species [23].
Knowing the genetic variability and genetic structure of F. nilgerrensis germplasm is the basis for conservation biology and possible genetic improvement of strawberry cultivars. We therefore hypothesized that F. nilgerrensis harbors high genetic variability and that the plants can be stratified into distinct genetic groups, probably because limited natural spread among different mountain areas results in low to moderate gene flow among these populations.
The goals of the present study were to (1) estimate the genetic variability among genotypes of F. nilgerrensis and (2) determine the genetic stratification among genotypes to clarify the genetic structure of F. nilgerrensis genotypes sampled from six main mountain areas in Southwest China using AFLP markers. The data obtained may help us understand the genetic variability of F. nilgerrensis and provide a basis for its potential use in strawberry breeding programs.

Plant Material
The Sichuan Provincial Department of Science and Technology is the relevant regulatory body that permitted and encouraged our researchers to carry out the investigation, collection, evaluation, innovation, and utilization of Fragaria nilgerrensis Schlecht. resources in the national park and other protected areas.
Thirty-seven accessions of F. nilgerrensis were sampled in the six main mountain areas (MAs) in Sichuan Province in Southwest China. The collecting sites ranged from 26° N to 32° N latitude and from 101° E to 108° E longitude and were located at altitudes between 1034 and 3051 m (Table 1). The 37 sampled plants were propagated vegetatively, planted, and maintained under the same cultivation conditions at the Horticultural Research Institute of the Sichuan Academy of Agricultural Sciences. Young fresh leaves from each plant were placed immediately into separate Ziploc plastic bags filled with dry silica gel for drying. The bags were then stored in a −80 °C freezer. Details of the collection sites and sampled plants are listed in Table 1.

DNA Isolation
Genomic DNA was isolated from 1 g of young fresh leaves using the modified cetyltrimethylammonium bromide (CTAB) method [41,42]. Dried leaves of 37 accessions were ground with liquid nitrogen in a mortar to a fine powder. The powder was transferred to a 10 mL tube, and 4 mL of 2% CTAB extraction buffer (100 mmol L⁻¹ Tris-HCl pH 8.0, 20 mmol L⁻¹ EDTA (ethylenediaminetetraacetic acid), 1.4 mol L⁻¹ NaCl, 2% CTAB, 1% PVP-40, 1% Vc, 10% Na₂S₂O₅, and 2% β-mercaptoethanol) was added to the tube, mixed well, and incubated at 65 °C for 30 min in a water bath. The suspension was mixed several times during incubation by inverting the tube. After cooling to room temperature, 1 mL of 5 M potassium acetate was added and mixed well, and then the tube was placed in an ice bath for 20 min. A total of 4 mL of chloroform-isoamyl alcohol (24:1 [v/v]) was added, and the sample was mixed by inverting the tube. The mixture was centrifuged at 12,000× g rpm for 15 min at room temperature. The supernatant was transferred using cut pipette tips to a new 10 mL tube, then cold isopropanol (−20 °C) was added in a volume equivalent to 2/3 the volume of the transferred supernatant. The tubes were inverted slowly several times to precipitate the DNA. The precipitated DNA was collected with a glass capillary, transferred to a 1.5 mL tube, and washed twice in 500 µL ice-cold 75% (v/v) ethanol by inverting the tube. The final washing was performed in absolute ethanol, which was poured out, and the DNA was air-dried at room temperature. The DNA sample was dissolved in 0.5 mL T10E0.1 buffer in a 1.5 mL tube. RNase-A was added to a final concentration of 10 µg µL⁻¹, and the mixture was incubated at 37 °C for 60 min. Next, 0.5 mL chloroform-isoamyl alcohol (24:1 [v/v]) was added, and the sample was mixed well and centrifuged at 12,000× g rpm for 15 min at −4 °C.
The new supernatant was transferred into a new tube, and 1/10 volume (40 µL) of 3 M of sodium acetate and 2 volumes (0.8 mL) of ice-cold absolute ethanol (−20 • C) were added and mixed well. The precipitated DNA was transferred into a new 1.5 mL tube, washed two times with absolute ethanol, air-dried at room temperature, and dissolved in 0.2 mL of double-distilled water. The quality of the isolated DNA was checked in 0.8% agarose gel in 0.5x TBE buffer.

Molecular Analysis
To identify primer combinations that gave the optimal number of polymorphic bands, six plants of F. nilgerrensis were chosen randomly and analyzed using a total of 15 randomly selected primer combinations (primer screening) to detect the number of polymorphic bands. On the basis of the numbers of scorable polymorphic bands, six highly polymorphic primer combinations were chosen and used to generate AFLP fragments for all 37 plants (Table 2). The AFLP analysis was carried out in the biotechnology laboratory at the Faculty of Agriculture, University of Zagreb, Croatia, according to the method described by Vos et al. [43], with the modifications described below. Briefly, 0.2 µg samples of the genomic DNA of F. nilgerrensis were double digested using EcoRI and MseI restriction enzymes (New England BioLabs®, Ipswich, MA, USA). The EcoRI and MseI adapters and primers for preselective and selective amplifications were synthesized by Applied Biosystems (Foster City, CA, USA). Ligase (Fermentas, Vilnius, Lithuania) was used to ligate the EcoRI and MseI adapters to the ends of the DNA fragments. The adapter-ligated DNA was diluted with 75 µL of T10E0.1 buffer and subjected to preselective amplification using the synthesized EcoRI and MseI primers that consisted of a core sequence and an enzyme-specific sequence [43], except that each primer had one additional selective nucleotide (EcoRI+A, MseI+C). The preselective amplification reaction was performed in a total volume of 20 µL containing 1x PCR buffer, 3 mM MgCl₂, 0.2 mM dNTP (Fermentas), 0.25 µM of each primer, 0.5 U Taq polymerase (Sigma, St. Louis, MO, USA), and 5 µL double-digested and adapter-ligated DNA. PCR amplification was performed in a Veriti® 96-well thermal cycler (Applied Biosystems) using one step of 2 min at 72 °C, 20 cycles of 20 s at 94 °C, 30 s at 56 °C, and 2 min at 72 °C, with a final elongation step of 30 min at 60 °C.
The preselective amplification products were diluted 25 times in T10E0.1 buffer. Selective amplification was carried out using EcoRI and MseI primers with three additional selective nucleotides (Table 2). Each of the forward primers (EcoRI primers) was labeled with one of the following fluorescent dyes: 6-FAM, VIC, NED, or PET. The selective amplification reaction was performed in a total volume of 20 µL containing 1x PCR buffer, 3 mM MgCl₂, 0. The AFLP fragments within the range of 50 to 500 bp were scored by GeneMapper v.4.0 software (Applied Biosystems). The GeneMapper output data, based on the size and the height of AFLP fragments, also included four replicates of DNA samples added as duplicates and three samples as negative controls. Additional AFLP fragment selection was achieved with the ScanAFLP 1.3 software [44], excluding fragments with peaks lower than 50 rfu, fragments with heights lower than 10% of the mean height of the maximum height frequency class, fragments with a coefficient of variation higher than one, and fragments which differed in more than one fragment among replicates. The binary matrix obtained after this systematic elimination of dubious fragments was used for subsequent statistical analysis.
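As a rough illustration of the replicate check described above, the following Python sketch converts peak heights into presence/absence calls using the 50 rfu cutoff and then computes the share of mismatched calls between a sample and its replicate. It is not the ScanAFLP implementation; the function names, the toy peak heights, and the simple mismatch ratio are assumptions made for this example.

```python
def score_peaks(heights, min_rfu=50):
    """Turn peak heights (rfu) into 0/1 calls; peaks below min_rfu count as absent."""
    return [1 if h >= min_rfu else 0 for h in heights]


def mismatch_rate(calls_a, calls_b):
    """Fraction of fragment positions at which two replicate profiles disagree."""
    if len(calls_a) != len(calls_b):
        raise ValueError("replicate profiles must have the same length")
    diffs = sum(1 for a, b in zip(calls_a, calls_b) if a != b)
    return diffs / len(calls_a)


# Hypothetical peak heights for one sample and its replicate (five fragments).
sample = score_peaks([120, 30, 75, 0, 510])
replicate = score_peaks([110, 55, 80, 0, 480])
rate = mismatch_rate(sample, replicate)
print(sample, replicate, rate)
```

In this toy case the second fragment is called absent in the sample and present in the replicate, so one of five calls disagrees.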

Data Analysis
The AFLP binary matrix was used to calculate the polymorphic information content (PIC) per primer combination, to calculate the genetic similarity between individuals (S_NL), for UPGMA clustering, and for the Bayesian STRUCTURE analysis [45].
The polymorphism information content (PIC) for each AFLP fragment was calculated using the following formula [46]:
$$\mathrm{PIC}_i = 2 f_i \,(1 - f_i)$$
where PIC_i is the polymorphic information content of marker i, f_i is the frequency of the AFLP fragments that are present, and 1 − f_i is the frequency of AFLP fragments that are absent. The PIC value for dominant markers reaches its maximum of 0.50 at f_i = 0.50 [47].
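The PIC calculation can be sketched in a few lines of Python, assuming the commonly used dominant-marker form PIC_i = 2 f_i (1 − f_i), which is consistent with the stated maximum of 0.50 at f_i = 0.50. The function name and the toy presence/absence matrix are hypothetical and are not the study's data.

```python
def pic_per_marker(matrix):
    """Return PIC_i = 2 * f_i * (1 - f_i) for each column of a 0/1 matrix.

    Rows are plants, columns are AFLP fragments; f_i is the fraction of
    plants in which fragment i is present.
    """
    n_plants = len(matrix)
    n_markers = len(matrix[0])
    pics = []
    for j in range(n_markers):
        f = sum(row[j] for row in matrix) / n_plants
        pics.append(2 * f * (1 - f))
    return pics


# Toy matrix: 4 plants x 3 fragments (made-up values).
toy = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
]
pics = pic_per_marker(toy)
mean_pic = sum(pics) / len(pics)
print(pics, mean_pic)
```

A monomorphic fragment (first column) contributes a PIC of zero, while a fragment present in half of the plants (second column) reaches the maximum of 0.50.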
Genetic similarity between individuals (S_NL) was calculated according to Nei and Li [48,49]. The similarity matrix was constructed from the similarity coefficients between all pairs of the 37 accessions of F. nilgerrensis. Cluster analysis based on the UPGMA method using Agglomerative Hierarchical Clustering (AHC) was carried out using XLSTAT Version 2021.1.1 software [50]. Bootstrap analysis based on 1000 resamplings of the data set was computed using the software NTSYSpc ver. 2.21L [51].
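To make the clustering step concrete, here is a minimal pure-Python sketch. It assumes the usual form of the Nei-Li coefficient for binary profiles, S = 2 n_xy / (n_x + n_y), where n_xy is the number of shared fragments and n_x and n_y are the fragment counts of the two plants, and it runs UPGMA (average linkage) on the distances 1 − S. This is not the XLSTAT or NTSYSpc procedure; the plant labels P1 to P4 and their profiles are invented for illustration.

```python
def nei_li(x, y):
    """Nei-Li (Dice) similarity between two 0/1 fragment profiles."""
    shared = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    total = sum(x) + sum(y)
    return 2 * shared / total if total else 0.0


def upgma(profiles):
    """Average-linkage clustering on 1 - S; returns merges as (left, right, height)."""
    names = list(profiles)
    dist = {
        frozenset((a, b)): 1 - nei_li(profiles[a], profiles[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

    def avg(c1, c2):
        # UPGMA distance: mean over all pairs of members from the two clusters.
        return sum(dist[frozenset((a, b))] for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        d, i, j = min(
            (avg(c1, c2), i, j)
            for i, c1 in enumerate(clusters)
            for j, c2 in enumerate(clusters)
            if i < j
        )
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical profiles: P1/P2 and P3/P4 share most of their fragments.
plants = {
    "P1": [1, 1, 0, 1, 0],
    "P2": [1, 1, 0, 0, 0],
    "P3": [0, 0, 1, 1, 1],
    "P4": [0, 0, 1, 0, 1],
}
merges = upgma(plants)
for left, right, height in merges:
    print(left, right, round(height, 3))
```

With these toy profiles the two similar pairs merge first, and the two resulting clusters join last at a much larger distance, which is the pattern a dendrogram would display.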
The level of genetic stratification (K) (the modal value of ∆K) among the F. nilgerrensis plants was assessed using Bayesian clustering analysis incorporated in the STRUCTURE ver. 2.3.4 software [45]. Bayesian cluster analyses included a burn-in period (initial stage of the sampling process) of 10,000 replicates, followed by 10,000 Markov chain Monte Carlo (MCMC) replicates for each run. Twenty repeat runs were carried out to quantify the amount of variation of the likelihood for each K (from K = 1 to K = 6), using an ADMIXTURE model and correlated allele frequencies and allowing for recessive alleles [52]. The posterior probability of the data lnP(K) for a given K can be used as an indication of the most likely number of distinctive groups or subpopulations [53]. Therefore, the height of the modal value of the ∆K distribution was calculated to detect the number of distinctive groups K using Structure Harvester v 0.6.92 [54]. The K that best described the data was chosen by examining the lnP(K) [55] and by calculating ∆K, as described by Evanno et al. [53]. The value of K with the highest mean log likelihood [lnP(K)] and ∆K statistic was selected.
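The ∆K criterion can be illustrated with a short Python sketch. It assumes one common reading of the Evanno statistic, in which ∆K is the absolute second-order difference of the mean lnP(K) across runs, |L(K+1) − 2L(K) + L(K−1)|, divided by the standard deviation of lnP(K) across runs. The function name and the lnP(K) values are invented for the example and are not output from STRUCTURE or Structure Harvester.

```python
from statistics import mean, stdev


def delta_k(lnp_runs):
    """Compute Delta K for every K that has both K-1 and K+1 available.

    lnp_runs maps each tested K to the list of lnP(K) values from repeat runs.
    """
    means = {k: mean(v) for k, v in lnp_runs.items()}
    sds = {k: stdev(v) for k, v in lnp_runs.items()}
    result = {}
    for k in sorted(lnp_runs):
        if k - 1 in means and k + 1 in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_diff / sds[k]
    return result


# Made-up lnP(K) values for three repeat runs at each K.
runs = {
    1: [-1000, -1002, -998],
    2: [-900, -901, -899],
    3: [-850, -852, -848],
    4: [-700, -701, -699],
    5: [-690, -692, -688],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

Note that ∆K is undefined for the smallest and largest K tested, because each lacks a neighbor on one side; with K = 1 to 6 tested, it can therefore only be evaluated for K = 2 to 5.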

Results
After applying ScanAFLP 1.3 software [44], the initial dataset of polymorphic AFLP fragments based on six primer combinations dropped from 1302 to 818, with the estimated error rate between 0.73% and 1.69% per primer combination, with a mean of 1.07%, which is within the usual range for AFLP data [56]. The number of polymorphic fragments per primer combination varied from 105 (E+AAG/M+CTA) to 182 (E+AGA/M+CAT), with an average of 136.3 (Table 3). The average PIC value ranged from 0.26 (E+AGA/M+CAT) to 0.31 (E+ACA/M+CAG), with an average of 0.29 per primer combination (Table 3).
Table 3. AFLP primer combinations, total number of scorable fragments, number and percentage of polymorphic fragments, and polymorphic information content (PIC) per primer combination in 37 F. nilgerrensis plants using AFLP markers.
AFLP Primer Combination; Total Number of Fragments; Number and Percentage (%) of Polymorphic Fragments; PIC
Similarity coefficients (S_NL) between the 37 F. nilgerrensis plants ranged from 0.17 (PI 1 vs. PI 96) to 0.95 (PI 95 vs. PI 96). UPGMA clustering divided the plants into six groups (Figure 1a). Most of the plants from the same mountain area clustered into the same group, but there were some exceptions: some of the plants from mountain area MA 4 clustered together with the plants originating from mountain area MA 5, while plants originating from MA 6 fell into two separate clusters (Figure 1a).
Bayesian clustering analysis was used to test the population structures; one to six clusters (K) were tested. Average log probability lnP(K) values increased to the fifth testing cluster, after which the rate of change in the log probability value decreased dramatically ( Figure 2).
The relationship between K and ∆K indicated that the 37 plants of F. nilgerrensis clustered into five distinctive genetic groups (Figure 1b).
of the genes from the neighboring genetic groups. The remaining 11 plants from mountain area MA 4 cluster in a separate genetic group K2. The plants from mountain areas MA 6, MA 2, and MA 3 cluster separately in the remaining three genetic groups K3, K4, and K5, respectively.
An exchange of genes in the neighboring area between genetic groups K1 and K3 and between K3 and K5 also seems to have occurred, but to a lesser extent.
The mean membership coefficients of F. nilgerrensis sampled in six mountain areas in Sichuan Province were calculated for each mountain area on the basis of the membership coefficients for K = 5 for the 37 plants (pie charts in Figure 3). According to the percentages shown in the pie charts, as revealed by STRUCTURE, most of the genetic basis is exchanged between MA 1 and MA 5 on one side (green) and between MA 4 and MA 6 on the other side, and also between MA 6 (red) and MA 1. The genetic basis of the plants from MA 2 (blue) is 100% unique (without exchange), while MA 3 (yellow) and MA 5 (green) are almost unique, with only a slight exchange with other groups of 1% on average. The main reason for this is probably the limited possibility of spreading generative and vegetative propagation material among distant mountain areas.

Discussion
Here we have used AFLP molecular markers to investigate the genetic variability in F. nilgerrensis. The study has revealed the wide range of genetic variability in the germplasm of F. nilgerrensis collected in the six main mountain areas of Sichuan province in Southwestern China.
The AFLP technique has been widely recognized as the most efficient marker system when compared to RFLP, RAPD, and SSR [25]. In a study on its application in F. × ananassa [35], AFLP markers generated from any one of the four pairs of primer combinations were sufficient to distinguish among 19 cultivars, some of which were closely related, identifying in total 228 scorable markers and demonstrating high effectiveness in the fingerprinting of strawberry cultivars. In a similar study, Tyrka et al. [36] obtained 129 scorable markers using 10 AFLP primer combinations.
In the present study, we obtained 818 scorable markers using six AFLP primer combinations. The number of scorable markers is affected by the method used for marker visualization. Degani et al. [35] used radioactively labeled primers, and Tyrka et al. [36] used agarose gel for marker separation. In our study, the large number of scorable markers was obtained using a highly sensitive fluorescence-based detection system with four-capillary electrophoresis. Our result is consistent with the finding [57] that a significantly larger number of markers is obtained with a fluorescence-based detection system than with radiolabeling.
Zhang et al. [37] reported that the percentage of AFLP polymorphisms in 107 strawberry (F. × ananassa Duch.) cultivars was 70.9%. Degani et al. [35] found that 84.6% of the AFLP fragments in 19 strawberry cultivars were polymorphic, using four primer combinations. Tyrka et al. [36] detected 89.92% of the AFLP fragments to be polymorphic in 19 strawberry clones, using 10 primer combinations. In our study, the six primer combinations revealed a lower level of polymorphism (62.8%) in F. nilgerrensis than was detected in the cultivars and clones of F. × ananassa reported in the previous studies. Regarding other strawberry species and other marker systems, Fragaria chiloensis ssp. chiloensis f. chiloensis revealed 48% polymorphic ISSR fragments, while F. chiloensis ssp. chiloensis f. patagonica showed higher variability, with 90% polymorphic markers [58]. F. nubicola also exhibits higher variability of SSR markers, reaching as high as 88.9% [59].
The cluster analysis based on the similarity coefficients showed that all 37 plants of F. nilgerrensis collected in six mountain areas in Sichuan province separated into five differentiated genetic groups. The dendrogram (Figure 1a) indicated that the geographically determined groups of plants located in the southwest, south, east, northeast and mid-south of Sichuan province were relatively independent genetic groups.
Three of the six mountain areas (MA 2, MA 3, and MA 5) represent unique genetic groups of the F. nilgerrensis germplasm in Sichuan (pie charts in Figure 1), meaning that each of these three mountain areas consists mainly of unique and independent genetic profiles, with admixtures of up to 1%. Another two mountain areas (MA 4 and MA 6) are characterized by a higher contribution of admixtures from several genetic groups, but these areas may still be considered areas with unique genetic profiles. Only one area (MA 1) may be considered an admixture of two genetic profiles.
Following these results, we may conclude that individuals within each mountain area adapted to specific environmental conditions during long-term natural selection. In a recent study, Lu et al. [60] analyzed the genetic structure of 16 populations and 169 individuals of F. nilgerrensis using 16 EST-ISSR markers and found that all the populations separated into two genetic groups, with some individuals admixed. The principal aim of this study was to clarify whether genetic structure exists in the wild strawberry F. nilgerrensis in the mountain areas of Sichuan Province. Bayesian STRUCTURE analysis revealed that five distinct genetic groups exist in Sichuan Province, indicating higher genetic variability in comparison to the above-mentioned study.

Conclusions
Our findings clearly show that all 37 plants of F. nilgerrensis collected in Sichuan Province belong to five genetic groups. The identification of five distinct genetic groups may help to narrow the classification and evaluation of the whole F. nilgerrensis genetic pool within the germplasm collections and provide a clearer insight into the genetic structure of the sampled material, which will help breeders in setting up hybridization schemes.