Genome-Wide Mining and Characterization of SSR Markers for Gene Mapping and Gene Diversity in Gossypium barbadense L. and Gossypium darwinii G. Watt Accessions

The present study aimed to characterize the simple sequence repeat markers in cotton using the cotton expressed sequence tags. A total of 111 EST-SSR polymorphic molecular markers with trinucleotide motifs were used to evaluate the 79 accessions of Gossypium L., (G. darwinii, 59 and G. barbadense, 20) collected from the Galapagos Islands. The allele number ranged from one to seven, with an average value of 2.85 alleles per locus, while polymorphism information content values varied from 0.008 to 0.995, with an average of 0.520. The discrimination power ranks high for the majority of the SSRs, with an average value of 0.98. Among 111 pairs of EST-SSRs and gSSRs, a total of 49 markers, comprising nine DPLs, one each of MonCGR, MUCS0064, and NAU1028, and 37 SWUs (D-genome), were found to be the best matched hits, similar to the 155 genes identified by BLASTx in the reference genome of G. barbadense, G. arboreum L., and G. raimondii Ulbr. Related genes GOBAR_DD21902, GOBAR_DD15579, GOBAR_DD27526, and GOBAR_AA04676 revealed highly significant expression 10, 15, 18, 21, and 28 days post-anthesis of fiber development. The identified EST-SSR and gSSR markers can be effectively used for mapping functional genes of segregating cotton populations, QTL identification, and marker-assisted selection in cotton breeding programs.


Introduction
Cotton (Gossypium L.) belongs to an allotetraploidy complex of angiosperms taxonomically related to the Gossypieae tribe; it is a part of the family Malvaceae. It is a major contributor of raw fibers, accounting for 95% of the world's cotton [1]. Modern cotton cultivars are the allotetraploids G. hirsutum L. and G. barbadense, with chromosome numbers (n = 2x = 26) [2]. These modern cultivars evolved from A-genome diploids (n = x = 13) with an African or Asian origin by subsequent hybridization with the indigenous D-genome diploids of the New World through transoceanic dispersal [2]. G. darwinii is also an allotetraploid cotton, having 26 chromosome pairs (2n = 4x = 52; 13 A_ and 13 D_) [3].
Genetic diversity among cultivated plants is of high value in worldwide biodiversity due to their wide range of contributions to the world economy and principal position in the production of world crops. In order to sustain the inherited enhancements of major cultivated species, it is necessary to collect, evaluate, characterize, and conserve, either on-farm or ex situ, their biodiversity. Thus, major cotton production and the future of cotton breeding programs are highly reliant on meticulous knowledge and advanced genetic diversity assessment in cotton gene pools [4].
The development of new cotton cultivars through conventional sources is a time-consuming process; it takes as much as 7-9 years to advance from the F1 to the F9 generation, a major bottleneck in cotton breeding programs. Various plant breeding techniques have been employed to reduce the time taken, but Marker-Assisted Selection (MAS) is highly efficient [5,6]. Conventional breeding approaches and techniques are inadequate to explain complex traits. The choice of molecular markers linked to significant agronomic traits can be an alternative to variety selection in the initial phase of breeding programs; this is called Marker-Assisted Selection and has to do with selecting the superior parents in a cross. This process can therefore significantly reduce the time and costs required for developing novel accessions. Simple sequence repeats (SSRs), which consist of tandemly repeated motifs of 1-6 base pairs flanked by well-conserved sequences, are most suitable for designing exact primers [7].
The SSRs are designed from expressed sequence tags (ESTs), which are derived from expressed genes and therefore promote gene mapping with identified functional pathways [8,9]. These are located in the transcribed portion of the genome [10] and allow a direct association between genes and vital agronomic characteristics to be determined. A large number of plant species have been studied by developing ESTs, including cotton [11], willow [12], wheat [13], barley [14], sugarcane [15], and the rubber tree [16]. The CottonFGD (Cotton Functional Genomics database) covers genomic sequences, gene structural and functional annotations, genetic marker data, transcriptome data, and genome resequencing data of populations among the four sequenced Gossypium species [17]. Most of the EST-SSRs and gSSR markers suggest that the genes are linked to important metabolic processes such as carbohydrate metabolism, photosynthesis, sugar transport, and amino acid metabolism, along with biotic and abiotic stress response mechanisms [18].
Fewer molecular analyses have focused on the genetic diversity of G. barbadense and G. darwinii than on that of G. hirsutum. One of the earliest studies used isozymes to estimate the diversity and region of origin of G. barbadense and G. darwinii accessions [19]. The results depicted geographic clustering and suggested a geographical relationship across the Galapagos Islands.
An estimation of the cotton genetic diversity, as shown in the Wild Cotton Germplasm Nursery of China, is necessary to launch plans for collecting, conserving, and utilizing these germplasm resources in future breeding programs. Considerable effort has gone into describing the diversity of modern cultivated cottons G. barbadense and G. hirsutum, but not wild allotetraploid cotton species, even though it has been recognized that these accessions are useful for grasping the diversity of the germplasm collection. The objectives of the present study were to map the genetic structure of G. barbadense and G. darwinii within the Wild Cotton Germplasm Nursery of China, characterize the EST-SSRs and gSSRs, mine the useful genes related to important agronomic traits, determine whether these accessions can be elucidated by SSR markers and are useful in categorizing the diversity perceived within these accessions, and assess their worth in implementation for measuring collection purity.

Plant Material and DNA Extraction
We scrutinized the genetic diversity among 79 tetraploid accessions of Gossypium darwinii and Gossypium barbadense (Table 1). These accessions comprised 20 G. barbadense and 59 G. darwinii. Of the G. barbadense accessions, four were standard/control accessions and 16 were collected from the Galapagos Islands; of the G. darwinii accessions, five were standard/control accessions and 54 were collected from the Galapagos Islands. In this investigation, standard/control G. darwinii and G. barbadense accessions are those that have been subjected to strong selection pressure by cotton breeders, while wild types are only those that are truly wild accessions, collected from the Isabela, San Cristobal, and Santa Cruz islands of the Galapagos. These accessions were selected for the diversity study because both Gossypium species (G. barbadense and G. darwinii) are genetically very closely related. Genomic DNA was extracted from freshly collected leaves using the method described by Zhang and Stewart [20] with some modifications.

PCR Primer Selection and Amplification
In total, 111 pairs of polymorphic EST-SSR and gSSR primers were analyzed for characterization of G. barbadense and G. darwinii accessions and functional gene mapping. To identify these markers, we used the Cotton Marker Database (CMD; Cottongen, http://www.cottongen.org), which is the largest database of cotton EST-derived and genomic SSR markers; it contains 12,250 and 24,000 SSR markers, respectively [21]. Specific primers and suitable PCR amplification conditions were employed for each SSR, and their polymorphism information content (PIC) and discrimination power (DP) were calculated. To confirm the validity of the EST-SSRs and gSSRs, the genetic diversity in 79 cotton accessions (of two species) was assessed and compared to the results of earlier reports on different SSRs and pedigrees [22,23]. These SSR markers included 47 DPL, six MonCGR, one MUCS, two NAU, and 55 SWU (D-genome) derived from the Cotton Marker Database (CMD; http://www.cottonmarker.org). PCR amplification was done on a TP 600 thermal cycler (TAKARA Bio Inc., Kusatsu, Japan) and silver staining was performed following the method described by Zhang et al. [24].

Identification of Genes with GO Functional Classification and Expression Analysis
For each of the 47 DPL, six MonCGR, one MUCS, and two NAU EST-SSR and gSSR markers, the complete sequence between the marker's physical positions was used for BLAST, while for each of the 55 SWU (D-genome) EST-SSRs and gSSRs the full sequence between the two nearest markers on the same chromosome was used to mine the genes [25]. A BLASTx search (E-value ≤ 1 × 10⁻⁵, identity ≥ 70%, and matched length ≥ 200 bp) was applied to find homologs by sequence similarity in cotton (Gossypium spp.) [26,27]. The cotton functional genomics database (https://cottonfgd.org/search) was used to analyze the mined genes for their genetic characteristics, protein features, GO functional classification, and RNA expression, with G. raimondii, G. arboreum, and G. barbadense as the reference genomes [28]. The resulting RNA expression data were used to construct a heatmap in R statistical software version 3.4.4 (CSIRO Mathematical and Information Sciences, Cleveland, Australia) [29].
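The three BLASTx thresholds quoted above can be applied as a simple post-filter on tabular BLAST output. The following is a minimal sketch, assuming the hits were saved in the standard 12-column tabular format (`-outfmt 6`); the function names, the toy rows, and the marker and gene identifiers are hypothetical, and the length threshold is applied to the alignment-length column as reported, without any unit conversion.

```python
import csv
import io

# Standard 12-column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def filter_hits(handle, max_evalue=1e-5, min_identity=70.0, min_length=200):
    """Yield hits that pass the three thresholds quoted in the text."""
    for row in csv.DictReader(handle, fieldnames=FIELDS, delimiter="\t"):
        if (float(row["evalue"]) <= max_evalue
                and float(row["pident"]) >= min_identity
                and int(row["length"]) >= min_length):
            yield row


def best_hit_per_query(hits):
    """Keep the highest-bitscore hit for each query (marker) sequence."""
    best = {}
    for hit in hits:
        current = best.get(hit["qseqid"])
        if current is None or float(hit["bitscore"]) > float(current["bitscore"]):
            best[hit["qseqid"]] = hit
    return best


# Hypothetical toy rows; identifiers and values are made up for illustration.
EXAMPLE = "\n".join([
    "marker_A\tgene_1\t85.0\t250\t37\t0\t1\t750\t1\t250\t1e-40\t300",
    "marker_A\tgene_2\t95.0\t300\t15\t0\t1\t900\t1\t300\t1e-60\t400",
    "marker_B\tgene_3\t65.0\t400\t140\t0\t1\t1200\t1\t400\t1e-30\t200",  # identity too low
    "marker_C\tgene_4\t90.0\t150\t15\t0\t1\t450\t1\t150\t1e-20\t150",    # alignment too short
    "marker_D\tgene_5\t80.0\t220\t44\t0\t1\t660\t1\t220\t1e-3\t50",      # E-value too high
]) + "\n"

passed = list(filter_hits(io.StringIO(EXAMPLE)))
best = best_hit_per_query(passed)
```

Selecting the best hit by bit score is one common convention; the text does not state which criterion defined the "best matched hits", so that choice is an assumption of the sketch.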

Polymorphism Analysis
The polymorphism information content (PIC) and discrimination power (DP) were calculated using Excel 2010 (Microsoft Office, http://www.microsoft.com/) on the basis of band presence/absence data for individual accessions. The PIC was calculated with the following formula:
$$\mathrm{PIC} = 1 - \sum_{i=1}^{n} P_i^{2}$$
where $P_i$ stands for the frequency of the i-th allele pattern at the locus and the summation covers n patterns [30].
A PIC value of 1 indicates that the marker can differentiate each line, and 0 indicates a monomorphic marker. DP values for the k-th primer were calculated based on the following formula:
$$D_k = 1 - \sum_{j} p_j \, \frac{N p_j - 1}{N - 1}$$
where N is the number of individuals and $p_j$ is the frequency of the j-th pattern [31].
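As a worked illustration of the two statistics, the sketch below assumes the commonly used forms PIC = 1 − Σ p_i² and DP = 1 − Σ p_j (N p_j − 1)/(N − 1), which agree with the symbol definitions given here. The function names and the toy banding patterns are hypothetical; each accession is represented by a string of band presence/absence calls for one marker.

```python
from collections import Counter


def pic(patterns):
    """PIC = 1 - sum(p_i^2) over the distinct patterns observed at one marker."""
    n = len(patterns)
    if n == 0:
        raise ValueError("no patterns supplied")
    counts = Counter(patterns)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def discrimination_power(patterns):
    """DP = 1 - sum(p_j * (N*p_j - 1) / (N - 1)) for N scored individuals."""
    n = len(patterns)
    if n < 2:
        raise ValueError("need at least two individuals")
    counts = Counter(patterns)
    # N * p_j is simply the count c of individuals carrying pattern j.
    return 1.0 - sum((c / n) * (c - 1) / (n - 1) for c in counts.values())


# Hypothetical banding patterns of four accessions at one marker.
toy = ["101", "101", "110", "011"]
toy_pic = pic(toy)                    # 1 - (0.5^2 + 0.25^2 + 0.25^2) = 0.625
toy_dp = discrimination_power(toy)    # 1 - 0.5 * (1/3) = 5/6
```

A monomorphic marker, where every accession shows the same pattern, gives 0 for both statistics under these formulas.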

Genetic Diversity Analysis
The EST-SSR- and gSSR-based pairwise genetic distances between accessions were determined as the D_A distance according to Nei et al. [32]. The gene diversity at each locus was calculated using Powermarker software package version 3.25 (http://www.powermarker.net). The phylogenetic tree was created on the basis of the distance matrix. Cluster analysis was done by the neighbor-joining (NJ) method, as illustrated by Liu et al. [33]. The dendrogram resulting from these calculations was plotted using MEGA6 software (Research Center for Genomics and Bioinformatics, Tokyo Metropolitan University, Hachioji, Tokyo, Japan and Center for Evolutionary Medicine and Informatics, Biodesign Institute, Arizona State University, Tempe, AZ, USA).
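For readers unfamiliar with the D_A distance, the sketch below computes it from per-locus allele frequencies as D_A = 1 − (1/r) Σ_j Σ_i √(x_ij y_ij), where r is the number of loci. It is an illustration only and not the Powermarker implementation; the function name and the toy frequencies are hypothetical.

```python
import math


def nei_da(freqs_x, freqs_y):
    """Nei's D_A distance between two samples.

    freqs_x, freqs_y: one dict per locus mapping allele -> frequency,
    with loci listed in the same order in both arguments.
    """
    if not freqs_x or len(freqs_x) != len(freqs_y):
        raise ValueError("both samples need the same non-zero number of loci")
    total = 0.0
    for x, y in zip(freqs_x, freqs_y):
        # Only alleles present in both samples contribute a non-zero term.
        total += sum(math.sqrt(x[a] * y[a]) for a in set(x) & set(y))
    return 1.0 - total / len(freqs_x)


# Hypothetical two-locus example: identical at locus 1, no shared allele at locus 2.
sample_x = [{"A": 0.5, "B": 0.5}, {"A": 1.0}]
sample_y = [{"A": 0.5, "B": 0.5}, {"B": 1.0}]
da_xy = nei_da(sample_x, sample_y)
da_xx = nei_da(sample_x, sample_x)
```

The distance is 0 for identical frequency profiles and 1 when no allele is shared at any locus; a matrix of such pairwise values is what a neighbor-joining routine takes as input.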

Genome-Wide Mining and Characterization of Functional SSR Markers
The cotton genome functional database (http://cottonfgd.org/) was used to characterize the EST-SSR and gSSR markers against the G. arboreum, G. raimondii, and G. barbadense whole-genome sequences. Among 111 pairs of EST-SSRs and gSSRs, a total of 49 markers, including nine DPLs, one each of MonCGR, MUCS0064, and NAU1028, and 37 SWU (D-genome), were found in 155 G. barbadense genes based on BLASTx. Similarly, 42 markers had best similarity hits with 116 genes when G. raimondii was used as the reference genome, while 46 markers were found to be best matched with 135 genes when G. arboreum was used as the reference genome. Genes were distributed among all chromosomes except 10, 11, and 13 because not all marker sequences had a best similarity match in G. barbadense. The highest number of genes (51) was found on chromosome 7, and they are associated with two DPL and 12 SWU markers, followed by chromosome 6 with 21 genes, while the minimum was two genes on chromosome 3. The rest of the chromosomes contained genes ranging in number from seven to 15. The physicochemical analysis of the genes mined in this study indicated that the encoded proteins have both hydrophobic and hydrophilic properties, as the Grand Average of Hydropathy (GRAVY) values were found to range from −1 to 0.765, with a molecular weight ranging from 10.11 to 288.95 and an isoelectric point (PI) value in the range of −26.5 to 54 (Table S1). The intron-exon structure was examined to analyze the gene structure, and the results imply that among the 155 genes identified, the highest intron disruption was found in the genes GOBAR_DD07974 and GOBAR_DD37421, with 31 introns each. Further analysis of gene features was done to gather descriptive information about the mined genes.
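The GRAVY score mentioned above is conventionally the mean Kyte-Doolittle hydropathy value over all residues of a protein, with positive values indicating a hydrophobic and negative values a hydrophilic protein. The following minimal sketch shows that calculation; the function name and the example peptides are hypothetical, and the text does not state how the database itself computes the score.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Return the mean hydropathy of a protein sequence (one-letter codes)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# Hypothetical peptides: one hydrophobic, one hydrophilic.
hydrophobic = gravy("AAAA")   # mean of four alanines, about 1.8
hydrophilic = gravy("RRKK")   # arginines and lysines, negative
```

Rejecting non-standard residue codes rather than skipping them is a design choice of this sketch, made so that ambiguous sequences are not silently averaged.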
Various kinds of genes were observed to be related to fiber development and stress response. The genes showing specific responses to stress encoded zinc finger protein, E3 ubiquitin protein ligase, NAC domain protein, and plant UBX domain-containing protein. Genes involved in fiber development were also detected, i.e., MLO-like protein, protein CER1, protein ECERIFERUM, protein PAT1 homolog, Ninja family protein, indole-3-acetic acid amido synthetase, cyclin U4-1, omega-3 fatty acid desaturase, mediator of RNA polymerase II transcription, and L10-interacting MYB domain-containing protein (Table S2).
The molecular markers used here are helpful in the direct mapping of genes [34,35]. In this study, cotton ESTs derived from CMD were explored to find a group of markers related to various biological/molecular functions in cotton. These EST-SSRs and gSSRs improve the marker-trait association as they are linked to the transcribed domain of the genome. These functional molecular markers are valuable for obtaining data about gene variability that can affect significant breeding traits.

GO Functional Annotations and Expression Analysis
Gene ontology (GO) annotations of the G. raimondii- and G. barbadense-based mined genes were carried out by using the cotton functional genomics database (http://cottonfgd.org/analyze/) to describe their biological processes (BP), molecular functions (MF), and cellular components (CC) [28]. Five cellular components were identified: ribosome, intracellular, integral component of the membrane, membrane, and vesicle coat. Similarly, 21 molecular functions and 19 biological processes were observed, while a maximum of 155 genes was involved in regulating the biological processes (Figure 1). Ultimately, we analyzed RNA sequence expression data to verify the functional annotations obtained through gene ontology. RNA sequences were downloaded from the cotton functional genome database (https://cottonfgd.org/). The genes revealed differential expression: 49 genes were found to be upregulated at the fiber developmental stages of 10, 15, 18, 21, and 28 days post-anthesis (DPA) (Table S3). Overall, 108 genes out of 155 were upregulated in various tissues; a heatmap that addresses only the fiber developmental stages was constructed on the basis of their respective expression levels (log10) (Figure 2; Table S4). In the heatmap constructed for the 108 genes identified on the basis of the G. barbadense genome sequence, four genes, GOBAR_DD21902, GOBAR_DD15579, GOBAR_DD27526, and GOBAR_AA04676, showed highly significant expression at 10, 15, 18, 21, and 28 DPA, while GOBAR_DD21362, GOBAR_DD19297, and GOBAR_AA19856 showed highly significant expression at all points except 28 DPA. Certain genes not directly related to fiber development, presented in Table S1, were also found to be significantly involved in biotic and abiotic stress responses, such as the NAC family, COL3, COL4, UBX2, CRKs, MED33A, MED33B, LIMYB, and the ABC transporter family.

Analysis of Genetic Diversity
A total of 111 EST-SSR and gSSR markers were polymorphic in both G. barbadense and G. darwinii. The number of bands observed for each SSR marker was used to evaluate the level of gene diversity, polymorphism information content (PIC), and discrimination power (DP) in the germplasm collection used in this study. An average of 2.85 alleles per marker was detected across all accessions, with a maximum of seven alleles observed at the DPL0501 locus for both G. barbadense and G. darwinii accessions. Among all markers, a maximum PIC value (0.995) was observed for marker DPL0064 and a minimum of 0.008 for DPL0608, with an average of 0.520 for all cotton accessions. The discrimination power was found to range from 0.947 to 1.000, with an average of 0.981. Both PIC and DP values were found to be very high, indicating a high level of genetic diversity in the newly collected germplasm accessions. Among the 49 markers characterized for functional gene mapping, the maximum PIC value (0.752) was observed for DPL0489 and the minimum (0.000) for DPL0638. The maximum value for DP (0.991) was found for marker DPL0309, while the minimum DP value (0.959) was found for the markers SWU18262-SWU18289. The gene diversity value of the 49 markers ranged from 0.026 to 0.478, with an average of 0.317 (Tables S2 and S5).

Multivariate Analysis of Genetic Diversity
The genetic diversity was evaluated by performing principal coordinate analysis (PCoA) on the 79 accessions using EST-SSR and gSSR markers. The results indicated a diverse pattern of separation between G. barbadense and G. darwinii accessions (Figure 4).
The analysis of the G. barbadense and G. darwinii accessions revealed that the first principal coordinate (PC1) accounted for 17.81% of the variation and separated the accessions into clusters reflecting each individual's origin of collection, with some overlap (Figure 4). The second principal coordinate, PC2, accounted for 30.77% of the variation among the different accessions of Gossypium. The cumulative variation was 48.58% along the two principal coordinates, showing high genetic diversity among the newly collected cotton accessions. We observed broad clusters (much broader in G. darwinii) in both species of Gossypium, which reflects the high diversity among these accessions. An analysis of molecular variance (AMOVA) with 999 permutations was done to validate the significance of the differences between G. darwinii and G. barbadense populations. The AMOVA results attributed 38% of the variation to differences among the five population groups of accessions and 62% to genetic variation within the populations of both species (Table 2). The analysis indicated a significant difference within population groups and between individuals (p = 0.001).
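The percentages reported for each principal coordinate come from the eigenvalues of the double-centred squared distance matrix. The sketch below illustrates classical PCoA in plain Python under stated assumptions: the input is a symmetric distance matrix that behaves like Euclidean distances (no large negative eigenvalues), the eigenpairs are found by simple power iteration with deflation, and all function names and the toy points are hypothetical. It is not the software used in the study.

```python
import math


def _matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def _top_eigenpair(m, iterations=500):
    """Power iteration for the dominant eigenpair of a symmetric matrix."""
    n = len(m)
    v = [float(i + 1) for i in range(n)]  # fixed, non-constant start vector
    for _ in range(iterations):
        w = _matvec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # nothing left to extract
            return 0.0, [0.0] * n
        v = [x / norm for x in w]
    return sum(a * b for a, b in zip(v, _matvec(m, v))), v


def pcoa(dist, n_axes=2):
    """Return (coordinates, fraction of variation per axis) for a distance matrix."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(r) / n for r in sq]
    grand = sum(row_mean) / n
    # Gower double-centring: B = -1/2 * J D^2 J (symmetry of dist is assumed).
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand)
          for j in range(n)] for i in range(n)]
    trace = sum(b[i][i] for i in range(n))  # equals the sum of all eigenvalues
    coords = [[] for _ in range(n)]
    explained = []
    for _ in range(n_axes):
        lam, vec = _top_eigenpair(b)
        lam = max(lam, 0.0)
        for i in range(n):
            coords[i].append(vec[i] * math.sqrt(lam))
        explained.append(lam / trace if trace > 0 else 0.0)
        # Deflate so the next iteration finds the next-largest eigenvalue.
        b = [[b[i][j] - lam * vec[i] * vec[j] for j in range(n)] for i in range(n)]
    return coords, explained


# Hypothetical toy data: four points on a 2 x 1 rectangle.
points = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0), (2.0, 1.0)]
toy_dist = [[math.dist(p, q) for q in points] for p in points]
toy_coords, toy_explained = pcoa(toy_dist)
```

For this toy rectangle the longer side dominates the first axis; with real marker data a dedicated linear-algebra library would normally replace the power iteration.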


Discussion
Cotton fiber production in the form of lint and its quality have been deteriorating because of biotic and abiotic stress, which is exacerbated by the narrow genetic base of elite tetraploid cotton [36]. To improve the genetic diversity, the key agronomic traits of wild accessions can be efficiently introgressed into the commercial varieties [26,37]. Therefore, based on the polymorphic EST-SSR and gSSR markers evaluated in this study, a total of 155 genes were mined with G. barbadense as the reference genome by using the full sequences of these polymorphic DNA markers. These mined genes are significant because of their involvement in fiber development as well as biotic and abiotic stress tolerance. With regard to stress factors, different genes linked to drought were found, including NAC100 (NAC domain-containing protein 100), and others including COL3, COL4 (zinc finger protein), At3g19950 (E3 ubiquitin protein ligase), and UBX2 (plant UBX domain-containing protein) [38]. Interestingly, At3g19950 was found to be associated with GOBAR_AA37282 and GOBAR_AA29781. The plant-specific NAC family was found to be involved in regulating several important biological phenomena in wheat. NAC TFs are considered to play a significant role in processes like senescence, nutrient remobilization [36,39], and reactions to biotic (stripe rust) [40,41] and abiotic stress (drought and salt) [42].
Ten members of the genes CYCU4-1 and CYCP3-1, related to serine/threonine-protein kinase proteins, play a significant role in plant cell growth, development, and oncogenesis [43]. Plants' survival despite increasing environmental degradation is linked to vital proteins such as E3 ubiquitin-protein ligases; for example, RING1-like (RING: really interesting new gene) was found to be involved in fiber development [44], while ORTHRUS 2 and ubiquitin thioesterase OTU1, belonging to the At3g19950, ORTH2, and yod1 genes, also have the same functions. These proteins regulate cell trafficking, DNA repair, early flowering, and the prevention of cell degradation through hydrolysis and signaling, and together have profound significance in cell biology [45].
Among the four identified genes, GOBAR_DD27526 has a single domain (PF02338) and is highly conserved, which can be useful in an evolutionary study of cotton. The other three genes are uncharacterized, while two of them, GOBAR_DD15579 and GOBAR_AA04676, were found to be homologous to At4g09830.1 in Arabidopsis thaliana (L.) Heynh. This homologue belongs to the nuclear receptor family 2 group C proteins, which are expressed during various plant growth and developmental phases (http://cottonfgd.org/analyze).
The results of RNA sequence expression indicated that 60 genes are upregulated at different fiber developmental stages, as shown in the heatmap, e.g., AFP2, AFP3, and AFP-B1 (Ninja family protein), which are linked to fiber development in plants through antagonistic interactions by regulating plant growth under stress conditions [47]. The FDR2 member of the ABC transporter family (pleiotropic drug resistance protein 2) is understood to play a role in signaling and the plant defense mechanism [48]. The ABC superfamily includes membrane-bound transporters as well as soluble proteins that have roles in different processes, many of which have significant potential in the agriculture, biotechnology, and medical fields. ABCG38 and ABCG39 (ABC transporter G family members 38 and 39) couple ATP hydrolysis to the transport of a variety of substances including lipids, heavy metal ions, inorganic acids, glutathione conjugates, sugars, amino acids, peptides, secondary metabolites, and xenomolecules across several membranes [49]. PUX2 (Plant UBX domain-containing protein 2) controls plant growth by monitoring the oligomeric structure and function, as studied in Arabidopsis [50]. The cysteine-rich protein kinases (CRKs), also known as DUF26 RLKs, are one subfamily of RLKs. CRKs have a typical RLK domain structure, i.e., they contain an extracellular domain responsible for signal perception, a trans-membrane domain, and a conserved intracellular protein kinase domain responsible for signal transduction [51]. CRKs are involved in early flowering and leaf senescence under environmental stress. CRK3 reduces the cell density, ultimately increasing the cell expansion [52]. Similar results were found in our study, indicating the crucial role of CRK3 in cell expansion (e.g., fiber development) and its best match with the full sequence of MUCS0064 in G. barbadense.
The identification of these genes from the polymorphic EST-SSRs and gSSRs offers plant breeders the opportunity to use the recognized genes to improve the highly evolved tetraploid cotton. As shown in the cotton QTL database (http://www2.cottonqtldb.org:8081/), the cotton D genome has been linked with numerous stable QTLs of morphological, physiological, and biochemical traits associated with fiber qualities. Furthermore, it has been determined to play a more important role during biotic and abiotic stress conditions than the A genome [53].
The high diversity within the G. darwinii and G. barbadense accessions involved in this study is generally in accordance with the investigations of Wendel and Percy, with few differences [19]. Moreover, the grouping of accessions into distinct clusters, obtained by employing the neighbor-joining (NJ) method with the genetic distance described by Nei et al. in 1983, was linked to island of origin. The accessions were distinguished into five distinct clusters based on Nei's genetic distance pattern. The same results have been indicated in various studies of genetic diversity among Gossypium accessions [54]. Alleles were identified in a range from one to seven; there were on average 2.85 alleles per locus. The polymorphism information content (PIC) showed values from 0.008 to 0.995, with an average of 0.520. The PIC value ranged from 0.037 to 0.825, with an average of 0.489, for G. darwinii; this was higher than for G. barbadense accessions (0.091-0.774, average: 0.459), even after downsampling the G. darwinii samples to equal the number of G. barbadense samples. This is in contrast to the earlier report of Wendel and Percy, in which they suggested that G. barbadense has high genetic diversity compared to G. darwinii. The discrimination power was found to be high in the majority of the EST-SSRs and gSSRs, with an average of 0.98. The discrimination power is an extension of the polymorphism information content (PIC), which describes the efficiency of a given marker in discriminating between genotypes, i.e., the probability that two randomly selected individuals have different arrays [30]. Thus, the high PIC values coupled with high DP values showed that these markers have the potential to disclose allelic variation and that each of these markers had a strong capacity to discriminate between two accessions [55]. The principal coordinate analysis (PCoA) supported the neighbor-joining (NJ) clustering of accessions. The cumulative variation was found to be 48.58% along the two principal coordinates, showing high genetic diversity among the newly collected cotton accessions. The AMOVA results estimated 38% variation among populations of accessions and 62% genetic variation within the populations of both species. A similar previous study indicated 14% variation among populations and 52% variation between individuals of G. arboreum [56].

Conclusions
The EST-SSR and gSSR markers used in this study have great potential for mapping functional genes in segregating cotton populations. Moreover, the genes mined in G. barbadense will set a precedent for solving the dilemma of cotton breeding for fiber quality improvement. The high polymorphism information content (PIC), combined with the high discrimination power (DP), can assist with QTL identification and marker-assisted selection, because linking the markers with functional regions of the genome provides a comprehensive tool that will open up new avenues for future cotton breeding programs.

Supplementary Materials:
The following are available online at http://www.mdpi.com/2073-4395/8/9/181/s1, Figure S1: Dendrogram based on neighbor-joining (NJ) method, Table S1: The characteristics of mined genes from SSR marker regions, Table S2: GO functional annotations of EST-SSRs and gSSRs, Table S3: Summary of 49 genes with differential expression during fiber development, Table S4: RNA seq expression data of 108 genes with differential expression of both up and down regulation and Table S5: Markers with physical positions and PIC value.
Author Contributions: K.W., A.D., and F.L. planned the experiments. A.D. and Z.Z. conducted the experiments and analyzed the results. A.D. and Z.Z. performed the majority of the experiments and contributed equally. A.D. applied all computational analyses. F.L., X.C., X.W., K.W.O., M.S., Y.X., Y.H., M.K.R.K., and M.S.I. participated in mapping experiments. A.D. created the draft of the manuscript and K.W. reviewed and studied the manuscript. Ultimately, all authors read and agreed to submit the final manuscript.

Figure 1. Gene ontology functional classification for G. barbadense with differential expression of 155 genes. GO analysis of 155 protein sequences illustrated for their association with biological processes (BP), molecular functions (MF), and cellular components (CC) along with the Q-value. Results are indicated in pie charts and presented in detail in Table S2.

Figure 2. RNA sequence data analysis of 108 genes with differential expression of both up- and downregulation for each RNA sequence. The heatmap was constructed by using log10 of the expression values. Color coding shows expression as clarified in the key, and genes represented along the y-axis are divided into three distinct groups with relative expression (2^−ΔΔCt). The DPAs are shown at the top.

Figure 3. Dendrogram based on NJ analysis. The different colored dots indicate the wild-type accessions, while red and light blue squares represent the improved accessions.

Table 1. Summary of germplasm collection from the Galapagos Islands.

Table 2. Summary of analysis of molecular variance (AMOVA).