Genetic Divergence between Two Sympatric Ecotypes of Phalaenopsis pulcherrima on Hainan Island

: Ecotypes are the result of ecological differentiation at the early stages of speciation. Adaptation to soil conditions offers arguably the best examples of local adaptation in plants. Two sympatric ecotypes, with either a red or green abaxial leaf surface, were found without clear geographical isolation in Phalaenopsis pulcherrima , a Southeast Asia endemic and endangered orchid. The soil of the red leaf ecotype has a higher water content and nutrient content than the green ecotype. What is the genetic structure of the two ecotypes? Is there complete or partial reproductive isolation between the two ecotypes? In this work, leaf reﬂection of the two ecotypes in P. pulcherrima were compared, to illustrate their difference in leaf color. The genetic differentiation between two ecotypes was examined, using ISSR and SRAP markers to determine the genetic structure of the populations. Our results showed that the green ecotype had reﬂectance spectrum peaks at 530 nm and 620 nm, while in the red ecotype, the peak at 530 nm was absent. A total of 165 ISSR and SRAP loci showed a high level of genetic diversity within the green ecotype, and analyses of the population structure revealed two genetic clusters that corresponded to the red and green ecotypes. The percentage of variation between the two ecotypes (24.55%) was greater than the percentage of variation among the populations (16.54%)—indicating partial reproductive isolation, high genetic differentiation, and that ecological differentiation has been more important than geographical barriers among populations within ecotypes. Most pairwise F ST values between the populations within either ecotype on Hainan Island were less than 0.15; however, the F ST between both the Thai and Malaysian populations and the Hainan Island population was greater than 0.25, due to South China sea isolation. Ecotypic differentiation is an important part of speciation; therefore, we must take into account the axes along which lineages sort, when formulating protection strategies.


Introduction
Ecotypes refer to groups of individuals within a species, with distinct phenotypic and genetic variations [1,2]. Typically, ecotypes are geographically well separated, but sometimes ecotypes co-exist at the local scale [3,4]. It is widely recognized that ecotypes are the result of ecological differentiation at the early stages of speciation [5,6]. This early stage is often characterized by the formation of partial reproductive isolation [7][8][9][10][11].
Adaptation to soil conditions (edaphic adaptation) offers a classic example of local adaptation in plants [12,13]. Soil attributes, including water content, nutrient status, and toxicity, provide especially rich conditions for divergent natural selection, resulting in divergent morphology, phenology, and, more generally, life history [2,14,15]. Among them, divergence in flowering time is a common cause of reproductive isolation. Inland and coastal ecotypes of the yellow monkeyflower Mimulus guttatus represent an excellent model

Population Sampling
Through field surveys, we found a total of 7 populations on Hainan Island that were distributed on 7 mountains (3 in Wangxia, 1 in Yajia, 1 in Jianfeng, and 2 in Ledong) ( Figure 1 and Table 1). We sampled these 7 populations, along with 1 population in Thailand and 1 population in Malaysia as outgroups, according to population size and the number of red and green individuals in the population. R PEER REVIEW 3 of 16

Population Sampling
Through field surveys, we found a total of 7 populations on Hainan Island that were distributed on 7 mountains (3 in Wangxia, 1 in Yajia, 1 in Jianfeng, and 2 in Ledong) (Figure 1 and Table 1). We sampled these 7 populations, along with 1 population in Thailand and 1 population in Malaysia as outgroups, according to population size and the number of red and green individuals in the population.

Leaf Color Reflectance
To assess the light reflection patterns of leaves at different wavelengths, we used an S2000 miniature fiber optic spectrometer with a PX-2 pulsed xenon lamp (Ocean Optics, Dunedin, FL, USA) indoors. All measurements were carried out over the range 350-850 nm using 0.38 nm increments.

Molecular Markers
2.3.1. DNA Extraction DNA was extracted using a slightly modified method of Doyle and Doyle [40]; we

Leaf Color Reflectance
To assess the light reflection patterns of leaves at different wavelengths, we used an S2000 miniature fiber optic spectrometer with a PX-2 pulsed xenon lamp (Ocean Optics, Dunedin, FL, USA) indoors. All measurements were carried out over the range 350-850 nm using 0.38 nm increments.

DNA Extraction
DNA was extracted using a slightly modified method of Doyle and Doyle [40]; we raised the percentage of PVP in the extraction buffer to 3% for more effective removal of excess polyphenols, and added an additional washing step using 70% cold ethanol before DNA pellets were dried and dissolved in Tris-EDTA buffer. Quantity and quality of extracted DNA was determined in 1% agarose gels buffered with 1 × TAE.

ISSR-PCR Amplification
We used the amplification assay developed by Zietkiewicz et al. [41] to generate intersimple sequence repeat (ISSR) data. We screened 12 primers (Table 2) from 100 universal primers developed by Biotechnology Laboratory, University of British Columbia, Canada. The reaction mixture for ISSR amplification assay had a total volume of 20 µL, which contained 25 ng genomic DNA, 0.33 mM dNTPs, 1.6 mM MgCl 2 , 1.5U Taq polymerase, and 0.6 µM primer. The amplification was carried out with an initial 6 min denaturation at 94 • C; 32 cycles of amplification (30 s at 94 • C, annealing 45 s, 90 s at 72 • C); a final extension step for 7 min at 72 • C. The PCR products of each sample, along with a 2 kb DNA ladder, were fractionated on 1.6% agarose gel for 1.5 h at 80 V. The fractionated amplified genomic DNA bands were visualized on UV transilluminator and photographed with a Kodak ® digital camera (Kodak, Rochester, NY, USA).

SRAP-PCR Amplification
We screened and used 12 pairs of sequence-related amplified polymorphism (SRAP) primers (Table 2) from 340 combinations of 17 forward and 20 reverse primers. The protocol for SRAP analysis was based on Li and Quiros [42], with modifications. PCR was performed in a total volume of 25 µL containing approximately 40 ng genomic DNA, 0.15 mM dNTPs, 3 mM MgCl 2 , 0.2 mM of each primer, 1U Taq DNA polymerase using an EDC-810 gene amplification instrument with the following reaction: 6 min of initial denaturation at 94 • C, recycles of five steps of 1 min of denaturation at 94 • C, 1 min of annealing at 35 • C, 1 min of elongation at 72 • C, 33 cycles of 94 • C for 1 min, 50 • C for 1 min, and 72 • C for 1 min, followed by a final extension of 7 min at 72 • C. PCR products were mixed with 10 µL formamide loading buffer (95% formamide and 20 mM EDTA (pH 8.0), Xylene cyanol and Bromophenol blue) and analyzed on 8% non-denaturing polyacrylamide gels in 1 × TBE buffer, with a 200-bp DNA ladder as a size marker. PCR products were visualized by silver staining [43].

Data Analysis
We used the average measured reflectivity from all individuals of each ecotype to represent the light reflection patterns of the two ecotypes.
Reproducible and consistent ISSR and SRAP fragments were scored as present (1) or absent (0) for all samples, and a binary qualitative data matrix was constructed. Basic estimates of genetic diversity were calculated using POPGENE version 1.32 [44]. Observed number of alleles (na), effective number of alleles (ne), Nei's gene diversity index (h), Shannon index (I), and principal co-ordinates analysis (PCoA) plots were obtained in GenAlEx 6.41 [45].
Ancestry and clustering analyses were performed in STRUCTURE 2.3.4 [46][47][48] using an "admixture" model with correlated allele frequencies, 1,000,000 iterations after a burnin of 200,000 iterations, and cluster (K) values ranging from 1 to 10 clusters, with three replicate runs per K. The optimal value of K was determined using ∆K according to the Structure Harvester software [49]. STRUCTURE results obtained using the optimal K value were plotted on the CLUMPAK server [50].
Hierarchical locus-by-locus analyses of molecular variation (AMOVA) were used to evaluate the relative level of genetic variations among groups (F CT ), populations within groups (F SC ), and individuals within populations (F ST ); pairwise F ST among populations and the significance of each value were tested using Arlequin ver. 3.5 [51,52]. Gene flow (Nm) between populations was estimated from F-statistics with the formula [53].
Population-and individual-based cluster analysis was performed using NTSYS-pc software by un-weighted pair group method with arithmetic averages (UPGMA) method.

Leaf Color Reflectance
The two ecotypes showed differences in leaf color. The red ecotype had dark-red leaves on the whole plant, while the green ecotype had light-yellow-green leaves (Figure 2A,B). The green ecotype showed peaks in the reflectance spectrum at 530 nm and 620 nm, while the peak of the red ecotype at 530 nm was absent ( Figure 2C).

Genetic Diversity
We obtained a total of 165 ISSR and SRAP markers (Table 3); 164 loci were polymorphic in the green ecotype, 158 loci in the red ecotype. Nei's gene diversity (h) and Shannon's information index (I) were higher in the green ecotype, showing that the green ecotype had a higher genetic diversity (Table 3).

Inferred Genetic Groups
STRUCTURE analysis showed an obvious peak at K = 2 ( Figure 3A), indicating that individuals could best be described by two genetic groups that mostly corresponded to the expected ecotypes, although nine green ecotype individuals (five in WX1, two in WX2, and two in WX3) were classified into the red genetic group ( Figure 3B). The assignment

Genetic Diversity
We obtained a total of 165 ISSR and SRAP markers (Table 3); 164 loci were polymorphic in the green ecotype, 158 loci in the red ecotype. Nei's gene diversity (h) and Shannon's information index (I) were higher in the green ecotype, showing that the green ecotype had a higher genetic diversity (Table 3).

Inferred Genetic Groups
STRUCTURE analysis showed an obvious peak at K = 2 ( Figure 3A), indicating that individuals could best be described by two genetic groups that mostly corresponded to the expected ecotypes, although nine green ecotype individuals (five in WX1, two in WX2, and two in WX3) were classified into the red genetic group ( Figure 3B). The assignment probabilities (Q values) of the red ecotype individuals were higher than the green ecotype individuals. The red ecotype individuals had almost 100% red ecotype alleles, while the green ecotype individuals had slightly more varied ancestry, with individuals in the TL and ML populations having about 70% green ecotype alleles ( Figure 3B).
R PEER REVIEW 7 of 16 probabilities (Q values) of the red ecotype individuals were higher than the green ecotype individuals. The red ecotype individuals had almost 100% red ecotype alleles, while the green ecotype individuals had slightly more varied ancestry, with individuals in the TL and ML populations having about 70% green ecotype alleles ( Figure 3B).

Genetic Similarity among Individuals
The first two axes of the PCoA explained 31.29% of the total observed variation, with the first two axes explaining 24.49% of and 6.80% of the total variation. The PCoA results showed that the two ecotypes formed mostly distinct clusters, although nine green individuals were clustered more closely to the red ecotype individuals (Figure 4).

Genetic Similarity among Individuals
The first two axes of the PCoA explained 31.29% of the total observed variation, with the first two axes explaining 24.49% of and 6.80% of the total variation. The PCoA results showed that the two ecotypes formed mostly distinct clusters, although nine green individuals were clustered more closely to the red ecotype individuals (Figure 4).

Proportion of Genetic Variance
The AMOVA analyses revealed that most of the genetic variation (58.92%, F ST = 0.411, p < 0.001) was distributed among individuals within populations. The percentage of variation between the ecotypes (24.55%, F CT = 0.245, p < 0.001) was larger than that among populations within the ecotypes (16.54%, F SC = 0.219, p < 0.001), indicating that the effect of ecotypes was greater than that of geographical distance (Table 4).

Proportion of Genetic Variance
The AMOVA analyses revealed that most of the genetic variation (58.92%, FST = 0.411, p < 0.001) was distributed among individuals within populations. The percentage of variation between the ecotypes (24.55%, FCT = 0.245, p < 0.001) was larger than that among populations within the ecotypes (16.54%, FSC = 0.219, p < 0.001), indicating that the effect of ecotypes was greater than that of geographical distance (Table 4). Table 4. The proportion of genetic variance partitioned between ecotypes, among populations within ecotypes, and among individuals within populations as determined by analysis of molecular variance (AMOVA).

Source of Variation
Degree of Freedom

Genetic Differentiation and Gene Flow among Populations
Most pairwise FST values we measured were significant at the 5% nominal level (Table 5).
In the green ecotype, the FST values among the WX populations (WX1, WX2, WX3) were less than 0.05 (Nm

Genetic Differentiation and Gene Flow among Populations
Most pairwise F ST values we measured were significant at the 5% nominal level (Table 5).
It is generally assumed that F ST < 0.05 corresponds to no genetic differentiation, 0.05 < F ST < 0.15 means moderate genetic differentiation, 0.15 < F ST < 0.25 implies high genetic differentiation, and F ST > 0.25 implies very high genetic differentiation [54].
In the green ecotype, the F ST values among the WX populations (WX1, WX2, WX3) were less than 0.05 (Nm > 4.75). The Most the F ST values between the green and red ecotype populations were greater than 0.25 (Nm < 0.75), indicating very high genetic differentiation. These results show limited gene flow between ecotypes and a gradual decrease in gene flow between populations with the increase in geographic distance. Table 5. Pairwise FST (below diagonal) and gene flow (Nm) (above diagonal) values among 16 populations. Green, red, and black fonts indicate values between populations within green-only, red-only, green and red populations.

Genetic Relationships among Populations and Individuals
UPGMA clustering of the populations revealed distinct green and red ecotype populations, and that the genetic distance between the populations within ecotypes increased with the increase in geographic distance ( Figure 5). These results were consistent with the results of pairwise F ST . UPGMA clustering of individuals also resulted in two genetic clusters corresponding to the green and red ecotypes, but nine green ecotype individuals (G-WX1-2, G-WX1-3, G-WX1-4, G-WX1-6, G-WX1-9, G-WX2-3, G-WX2-4, G-WX3-4, and G-WX3-5) were clustered with the red ecotype ( Figure 6). The results of the UPGMA tree of individuals were generally congruent with the genetic clustering of STRUCTURE and PCoA. 01 * 0.457 * 0.629 * 0.696 * 0.674 * 0.655 * 0.599 * 0.414 * 0.474 * 0.460 * 0.435 * 0.477 * 0.486 * 0.464 * * indicates significance of the FST values at the 5% level.

Genetic Relationships among Populations and Individuals
UPGMA clustering of the populations revealed distinct green and red ecotype populations, and that the genetic distance between the populations within ecotypes increased with the increase in geographic distance ( Figure 5). These results were consistent with the results of pairwise FST. UPGMA clustering of individuals also resulted in two genetic clusters corresponding to the green and red ecotypes, but nine green ecotype individuals (G-WX1-2, G-WX1-3, G-WX1-4, G-WX1-6, G-WX1-9, G-WX2-3, G-WX2-4, G-WX3-4, and G-WX3-5) were clustered with the red ecotype ( Figure 6). The results of the UPGMA tree of individuals were generally congruent with the genetic clustering of STRUCTURE and PCoA.

Genetic Divergence between Two Ecotypes
All of our clustering analyses show that the individuals of P. pulcherrima can be divided into two genetic clusters that correspond to the red and the green ecotype, with the exception of nine green individuals that cluster with the red ecotype. These results are consistent with the mating restrictions between the two ecotypes. Sympatric ecotypes with entire or partially reproductive isolation often show this type of genetic pattern [9]. For example, Bayesian STRUCTURE analysis of two cryptic ecotypes of a sexually deceptive orchid, Drakaea elastica, revealed two distinct, but imperfect, genetic clusters that were broadly congruent with the ecotype distributions [55].
Most sympatric ecotypes need mating restrictions to prevent the fusion of ecotypes [9]. Therefore, the percentage of variation between ecotypes isolated by environment (IBE) is generally greater than that among populations or among regions isolated by distance (IBD) [56]. In this research, the percentage of variation between the two ecotypes of P. pulcherrima (24.55%) was greater than the percentage of variation among populations (16.54%), and also greater than that among the regions of many orchids, such as Changnienia amoena (14.99%) [57], Cypripedium calceolus (12.26%) [58], and Phragmipedium longifolium (19%) [59]. However, the isolation time between most ecotypes is not as long as that between species. Therefore, the percentage of variation between ecotypes is generally less than that between species. For example, the percentage of variation among other orchid groups, such as 31.39% among Eulophia parviflora, E. streptopetala, E. welwitschii, E. speciosa, and E. clavicornis [60]; and 26.9% among Cymbidium goeringii, C. faberi, C. ensifolium, C. kanran, C. sinense, and C. goeringii var. longibracteatum [61], was greater than the percentage of variation between the two ecotypes of P. pulcherrima.
Recently differentiated ecotypes show little genetic differentiation in neutral molecular markers, which is common in heavy metal-tolerant species, because heavy metals are toxic for most plants [14,62], and promote rapid population differentiation between contaminated and noncontaminated sites [15]. For example, in populations of Thlaspi caerulescens [32], Armeria maritima [29], Silene paradoxa [30], and Arabidopsis halleri [31], there is little or no genetic differentiation between metallicolous and nonmetallicolous ecotypes. In contrast, the genetic differentiation of ecotypes that are driven by soil water content and nutrients is slower and less intense, and greater genetic differentiation between ecotypes mostly occurred as a result of long-time isolation. For example, the yellow monkeyflowers Mimulus guttatus [27] and Solidago virgaurea [17] exhibit greater genetic differentiation between ecotypes than among populations within ecotypes. The greater percentage of variation between the two ecotypes of P. pulcherrima shows that they have gone through a long period of isolation, which is similar to ecotypes that are driven by soil water content and nutrients.
In plants, barriers to gene flow often present as divergence in flowering time [28,[63][64][65][66] and pollinator preferences for floral traits [67][68][69]. To verify whether there is divergence in flowering time and pollinator preferences for floral traits between the two ecotypes of P. pulcherrima, we need to evaluate the extent of temporal overlap of the flowering times in common gardens and natural populations. The inability of seeds to germinate, and the inability of immigrants to survive in alternative habitats, are common reasons that ecotype distributions have obvious boundaries. To test these two mechanisms, we need investigate seed germination, growth rate, survival rate, and seed production, based on plant chamber, open common garden, and reciprocal transplant experiments. In natural populations, we did not observe dead seedlings or dead plants growing outside of their expected soil conditions, which leads us to believe that soil factors are more likely to affect the germination of seeds. The germination of orchid seeds is mostly affected by mycorrhizal fungi. Ke at al. [35] found that the soil of red ecotype contained significantly more endophytic fungi, dominated by Fusarium, Rhizoctonia, and Pyrenochaeta, which occurs in most other orchids, such as Paphiopedilum armeniacum and Cymbidium sinense [70]. The endophytic fungi associated with the green ecotype individuals of P. pulcherrima were mainly Cylindrosporium and Ozonium, which are relatively uncommon in orchids [35]. Therefore, in addition to soil water and nutrients, mycorrhizal fungi may be another factor affecting the seed germination of the two ecotypes, which warrants further research.
The relationship between ecotype habitats and phenotype is a fundamental theme in the study of ecotype differentiation and formation [5]. New ecotypes often have growth or genetic advantages in new habitats [2]. Yin et al. [71] conducted a study on the intrachromosomal karyotype asymmetry index percentage of P. pulcherrima that showed that red leaves were ancestral and green leaves were derived. Green-leafed plants have a potential growth advantage over red-leafed plants, because they have more chloroplasts for photosynthesis [72]. More chloroplasts for the photosynthesis of the green-leafed individuals of P. pulcherrima may have a competitive advantage in lower water and nutrient soils. The adaption to uncommon endophytic fungi in green ecotype soil may give another advantage to the growth of the green ecotype [70].

Genetic Divergence within Ecotypes
Meta-analyses have shown that the mean genetic differentiation in Orchidaceae (F ST = 0.146, based on allozyme loci, moderate genetic differentiation) is the third lowest reported for well-studied plant families [54,73]. Most F ST values, regardless of ecotype, among populations on Hainan Island, were less than 0.15. This may be due to small geographical distances on Hainan Island, coupled with moderate pollen dispersal distances, resulting in little geographical isolation. However, sea isolation is an important factor causing genetic differentiation [74]. The South China sea between Hainan Island and Thailand and Malaysia restricts the gene flow of P. pulcherrima. The F ST between the Thai and Malaysian populations, and the population on Hainan Island was greater than 0.25. The UPGMA tree was consistent with these results. Therefore, in addition to ecotype, geographic distance is another factor that affects the genetic differentiation of P. pulcherrima.

Implications for Conservation
Habitat heterogeneity is an important driver of speciation and the maintenance of biodiversity [75,76]. However, habitat heterogeneity, and its associated fragmentation, may also threaten rare and endangered species, especially species with highly specialized microhabitat preferences or needs, and obvious metapopulation or subpopulation structures [77,78]. Soil heterogeneity is a key factor of divergence for the two ecotypes and maintaining the genetic diversity of P. pulcherrima. When we formulate protection strategies, we must take into account the resources of different habitats. Based on F ST and gene flow among populations within ecotypes, the populations in Hainan Island have low genetic differentiation and frequent gene flow. Therefore, when formulating a protection strategy, it is necessary to consider protecting P. pulcherrima on a large scale. The genetic diversity of the green ecotype was higher than the red ecotype, so the green ecotype should be a conservation priority and the red ecotype has the urgency of conservation.
We propose some of the following guidelines for conservations of P. pulcherrima: (i) both in situ and ex situ conservation should take into account both the ecotypes; (ii) preservation of rocky outcrops with randomly distributed stones and shrubs should be encouraged, in order to maintain habitat diversity and connectivity; (iii) large forest gaps or mountain slopes, where the processes that create these micro habitats should be a conservation priority; (iv) in order to maintain genetic diversity, conservation strategies should be developed on a larger scale; (v) restoration actions that create a habitat containing a mixture of rocky or stony substrate and shrubs should be established, particularly at the roadside and landslide slopes that have experienced recent disturbances; and (vi) the green ecotype should be a conservation priority.
On the whole, the results showed that the green ecotype had a higher genetic diversity than red ecotype; the individuals of P. pulcherrima could be divided into two genetic clusters, which would basically correspond to the red ecotype and the green ecotype, indicating partial reproductive isolation and high genetic differentiation. The percentage of variation between the two ecotypes was 24.55%, which was greater than the percentage of variation among the populations (16.54%), which indicated a long history of reproductive isolation. Within the green or red ecotype, most F ST among the populations on Hainan Island were less than 0.15. However, the F ST between the TL, ML populations, and the population on Hainan Island was greater than 0.25, due to South China sea isolation. When we formulate protection strategies, we must take into account the resources of different habitats.