1. Introduction
Approximately 20% of the irrigated agricultural land worldwide is affected by soil salinization, which has become one of the most urgent agricultural issues [1]. The deterioration of the natural environment, global warming, and poor irrigation methods have exacerbated soil salinization, and the arable land area will likely decrease by 50% by the middle of the 21st century [2]. Thus, developing and utilizing saline soil has become an urgent problem for agricultural production and environmental ecology [3].
Kenaf (Hibiscus cannabinus L.) is an economically important fiber and ornamental crop [4]. Because of its high salt tolerance and biological yield, it may be an ideal plant for saline soil [5,6,7,8]. The biological yield of kenaf is approximately 3 to 4 times greater than that of forest trees, and its carbon dioxide assimilation capacity is approximately 4 to 5 times greater. The quality of kenaf pulp is similar to that of the pulp from broadleaf forest trees. Accordingly, kenaf pulp is considered a new raw material for paper production that can replace wood pulp, especially in developed countries [9]. Kenaf is also an important raw material for the traditional textile industry. In addition to being used to produce hemp rope, sacks, geotextiles, carpet cloth, wall coverings, canvas, and curtain cloth [10,11], kenaf fiber has recently been widely used to develop and produce automobile linings, paper film, light plates, sewage purification materials, soil conditioners, plastic fillers, activated carbon, and environmentally friendly adsorption materials because it is a natural fiber with desirable characteristics (e.g., antibacterial, breathable, moisturizing, quick-drying, and degradable) [9]. Kenaf has been described as a “potential dominant crop in the 21st century” and a “futuristic crop” [9].
DNA molecular marker technology has several important applications, including analyses of genetic diversity, genetic structures, species evolution, and genetic mechanisms as well as DNA fingerprinting, assessments of seed purity, and molecular marker-assisted selection-based breeding of agriculturally important germplasm resources [12,13,14]. The development of kenaf DNA molecular marker technology was initiated relatively recently. Hence, multiple methods are still needed to generate markers useful for the selection of ideal kenaf accessions. The main molecular markers currently used for kenaf research include amplified fragment length polymorphisms (AFLPs), random amplified polymorphic DNA (RAPD), resistance gene analogs (RGAs), simple sequence repeats (SSRs), inter-simple sequence repeats (ISSRs), insertions/deletions (InDels), and chloroplast markers [5,10,15,16,17,18,19,20,21,22]. These molecular markers have primarily been applied to examine the genetic diversity and population structure of kenaf germplasm materials and for DNA fingerprinting, but the number of available markers remains insufficient.
Compared with other molecular markers, SSR markers are more commonly used to study kenaf genetic diversity. The recent decrease in sequencing costs has promoted the development of kenaf SSR molecular markers, which have been used to investigate genetic diversity and genetic differentiation [23]. Previous research confirmed that SSR markers may be used as part of a quick, simple, and inexpensive method to assess genetic diversity [5]. Moreover, the number of markers in kenaf has considerably increased [24,25,26,27]. Additionally, expressed sequence tag (EST)–SSR markers, which are highly reproducible, co-dominant, polymorphic, and conserved, can be used to analyze genetic diversity [28,29]. However, because transcriptome sequencing captures only the genes expressed in the sampled material and because gene expression is temporally and spatially specific, the expressed sequences in different transcriptomes may vary. Therefore, sequencing transcriptomes and developing EST-SSR molecular markers are effective strategies for analyzing different physiological activities in the same crop.
In this study, EST-SSR markers were developed on the basis of the transcriptome of salt-stressed kenaf, thereby increasing the number of molecular markers available for kenaf. In addition, the interactions among a group of proteins encoded by the identified differentially expressed genes (DEGs) were analyzed. Moreover, a subset of primers was selected and verified using kenaf germplasm materials. The polymorphism of the novel EST-SSR molecular markers and the utility of these markers for analyzing the genetic diversity and population structure of germplasm resources were assessed.
2. Materials and Methods
2.1. Plant Materials
Kenaf cultivar H368, provided by Professor Defang Li (Institute of Bast Fiber Crops, Chinese Academy of Agricultural Sciences), was grown under the following conditions: day/night cycle of 16 h/8 h at 28 °C/25 °C, respectively; light intensity of 700 μmol m⁻² s⁻¹; and relative humidity close to 60%. The plants were grown in pots (15 cm height; 18 cm diameter) containing the same weight of a soil mixture comprising red soil, humus, and vermiculite at a ratio of 2:1:1 (v/v/v). Additionally, 250 mL 1/4 Hoagland nutrient solution was added to each pot every other day. When the plant height reached 55 cm, the kenaf seedlings entered a rapid development stage. During this period, the plant height increased by 2.1–5.0 cm per day and the kenaf seedlings were extremely sensitive to salinity stress. For the salt treatment, the 1/4 Hoagland nutrient solution was supplemented with 1 mol/L NaCl, as previously described [7]. For each of the two treatments (control and salt), 3 replicates were prepared, with 10 pots (3 seedlings each) in each replicate, for a total of 180 seedlings. Kenaf samples were collected 72 h after initiating the salt treatment and then frozen in liquid nitrogen before being stored at −80 °C.
A total of 30 kenaf germplasm materials obtained from different regions worldwide were used to screen for and verify EST-SSR markers (Table 1).
2.2. RNA Extraction, Library Preparation, and Sequencing
Total RNA was extracted from the frozen kenaf stems using TRIzol reagent (Invitrogen, Waltham, CA, USA). The quality of the RNA was evaluated by gel electrophoresis using a 2100 Bioanalyzer (Agilent, Santa Clara, CA, USA). Duplicate cDNA libraries were constructed for the control (CO1 and CO2) and NaCl-treated (NA1 and NA2) kenaf samples. Briefly, poly-A mRNA was separated from the total RNA using magnetic beads and then fragmented. Double-stranded cDNA was synthesized using a SuperScript Double-Stranded cDNA Synthesis kit (Invitrogen) and a random hexamer (N6) primer (Illumina, San Diego, CA, USA). The constructed libraries were sequenced using the Illumina HiSeq 2000 platform. The raw sequencing data for the transcriptomes were submitted to the NCBI database (SRR9613936 to SRR9613939).
2.3. Transcriptome-Based SSR and SNP Variation Analysis
Unigene sequences obtained from the transcriptome sequencing analysis were screened for SSRs using MISA software Version 1.0, and the type and frequency distribution of the SSRs were determined. SSR loci were identified according to the number of times the repeat unit occurred: more than 10 repeats for mononucleotide units, more than 6 repeats for dinucleotide units, and more than 5 repeats for trinucleotide, tetranucleotide, pentanucleotide, and hexanucleotide units [30]. After identifying the SSRs, SSR primers were designed in batches from the unigenes using Primer3 for the PCR amplification of 100–400 bp fragments.
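The repeat-count criteria above can be illustrated with a minimal Python sketch. It is not the MISA implementation; it only scans for perfect repeats with a regular expression. The function names `find_ssrs` and `is_primitive` are hypothetical, and the thresholds are read here as minimum repeat counts (10, 6, and 5), which is an assumption about how "more than" should be interpreted.

```python
import re

# Minimum repeat counts per repeat-unit length (assumed reading of the text:
# 10 for mono-, 6 for di-, 5 for tri- to hexanucleotide units).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n_min in sorted(min_repeats.items()):
        # A k-base unit followed by at least n_min - 1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n_min - 1))
        pos = 0
        while True:
            m = pattern.search(seq, pos)
            if m is None:
                break
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
                pos = m.end()
            else:
                # e.g. 'AA' matched as a dinucleotide: skip ahead by one base
                pos = m.start() + 1
    return sorted(hits)


if __name__ == "__main__":
    demo = "GGC" + "A" * 12 + "CC" + "AG" * 7 + "C" + "CTT" * 5 + "G"
    for start, motif, count in find_ssrs(demo):
        print(start, motif, count)
```

Compound and imperfect SSRs, which dedicated tools can also report, are deliberately left out of this sketch.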
Using SAMtools and Picard Tools, the alignment results were sorted by chromosome coordinates and duplicate reads were discarded. The variant detection software GATK Version 3.4 was then used to call SNPs and InDels. The raw calls were filtered to obtain information regarding high-quality SNP mutations.
2.4. GO and KEGG Enrichment Analyses
TBtools software Version 1.120 [4] was used to perform GO and KEGG enrichment analyses of two gene sets: the unigenes containing mutation sites (SSRs and SNPs) and the differentially expressed genes (DEGs) encoding proteins in the protein–protein interaction (PPI) network. The results were used to identify the commonalities between the two gene sets and the main biological processes, molecular functions, cellular components, and metabolic pathways involved.
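Over-representation analyses of this kind are commonly based on a one-sided hypergeometric test. The sketch below shows that calculation for a single GO term or KEGG pathway. Whether TBtools applies exactly this test, and which multiple-testing correction it uses, is not stated here and is an assumption; the function name and the example counts are hypothetical.

```python
from math import comb


def enrichment_p_value(background, annotated, selected, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    background: number of genes in the background set
    annotated:  background genes annotated with the term or pathway
    selected:   genes in the list being tested (e.g. unigenes with SSRs/SNPs)
    overlap:    selected genes annotated with the term or pathway
    """
    upper = min(annotated, selected)
    favorable = sum(
        comb(annotated, i) * comb(background - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return favorable / comb(background, selected)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(enrichment_p_value(background=10000, annotated=200, selected=500, overlap=25))
```

In practice, the p-values for all tested terms would then be adjusted for multiple testing before terms are reported as enriched.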
2.5. Analysis of the Interaction Network for the Proteins Encoded by Differentially Expressed Genes
The interactions and functions of the proteins encoded by DEGs were analyzed using STRING (https://string-db.org/; accessed on 4 December 2017), and the PPI network was visualized and edited using Cytoscape software Version 3.6.1.
2.6. Genomic DNA Extraction
Genomic DNA was extracted from 200 mg tender kenaf leaves using a DNAsecure Plant DNA Kit (Tiangen). A 2 μL aliquot of the extracted DNA was analyzed using a NanoDrop ND1000 spectrophotometer to determine the concentration and purity (A260/A280 ratio). Additionally, DNA integrity was assessed by 1% agarose gel electrophoresis (4 μL volume). The DNA samples that produced a clear and non-tailed main band were used for the subsequent SSR genotyping.
2.7. SSR Genotyping
Differences in SSRs in expressed genes are likely to be associated with altered functions of the encoded proteins. Thus, specific EST-SSR sites in the DEGs encoding the proteins in the PPI network were selected for the synthesis of SSR primers. The 5′-end of the primers was labeled with the FAM fluorescent group. The DNA polymerase was Phi29 DNA Polymerase (TransGen). The PCR amplification was completed using a 20 μL reaction volume consisting of 2 μL DNA template, 2 μL buffer, 0.3 μL TransTaq, 1.6 μL dNTP, 12.1 μL ddH2O, and 1 μL each of the forward and reverse primers (2 μmol/μL). The PCR amplification program was as follows: pre-denaturation at 94 °C for 4 min; 35 cycles of denaturation at 94 °C for 30 s, annealing at 56 °C for 90 s, and extension at 72 °C for 1 min; a final extension at 72 °C for 5 min; and storage at 4 °C [31]. After the PCR amplification, a 1 μL aliquot of the PCR product was analyzed using an ABI3730xl capillary electrophoresis instrument. GeneMapper 4.0 software was used to obtain genotyping information.
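As a quick arithmetic check of the reaction setup, the short Python sketch below sums the listed component volumes and adds up the programmed hold times. It assumes that the primer entry means 1 μL of each primer, which is the reading under which the components total 20 μL. The variable and function names are hypothetical, and the time estimate ignores temperature ramping and the final 4 °C hold.

```python
# Per-reaction volumes in microliters as listed in the text; the primer entry
# is read as 1 uL of each primer (assumption).
REACTION_UL = {
    "DNA template": 2.0,
    "buffer": 2.0,
    "TransTaq": 0.3,
    "dNTP": 1.6,
    "ddH2O": 12.1,
    "forward primer": 1.0,
    "reverse primer": 1.0,
}

# Thermal profile in seconds.
PRE_DENATURATION_S = 4 * 60
CYCLE_STEPS_S = {"denaturation": 30, "annealing": 90, "extension": 60}
N_CYCLES = 35
FINAL_EXTENSION_S = 5 * 60


def total_volume_ul(mix):
    """Sum of all component volumes, rounded to avoid floating-point noise."""
    return round(sum(mix.values()), 3)


def hold_time_minutes():
    """Total programmed hold time, excluding ramping and the 4 degree C hold."""
    seconds = PRE_DENATURATION_S + N_CYCLES * sum(CYCLE_STEPS_S.values()) + FINAL_EXTENSION_S
    return seconds / 60


if __name__ == "__main__":
    print(total_volume_ul(REACTION_UL))  # 20.0
    print(hold_time_minutes())           # 114.0
```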
2.8. Phylogenetic Analysis
The Nei and Takezaki (1983) genetic distance based on the allele frequency was calculated using PowerMarker 3.25 software [32]. A phylogenetic tree was constructed according to the neighbor-joining method using MEGA 11.0 software.
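To make the allele frequency-based distance concrete, the sketch below computes the D_A distance, D_A = 1 − (1/r) Σ_j Σ_i √(x_ij y_ij), where x_ij and y_ij are the frequencies of allele i at locus j in the two samples and r is the number of loci. This assumes that the 1983 distance referred to above is D_A. The code is an illustration rather than PowerMarker's implementation, and the function names and allele sizes are hypothetical.

```python
from collections import Counter
from math import sqrt


def allele_freqs(genotype):
    """Allele frequencies within one sample at one locus, e.g. (150, 154) -> 0.5 each."""
    counts = Counter(genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def da_distance(sample_x, sample_y):
    """D_A distance between two samples given as {locus: (allele, allele)} dicts."""
    loci = sample_x.keys() & sample_y.keys()
    if not loci:
        raise ValueError("the two samples share no genotyped loci")
    shared = 0.0
    for locus in loci:
        fx = allele_freqs(sample_x[locus])
        fy = allele_freqs(sample_y[locus])
        # Only alleles present in both samples contribute a non-zero term.
        shared += sum(sqrt(fx[a] * fy[a]) for a in fx.keys() & fy.keys())
    return 1 - shared / len(loci)


if __name__ == "__main__":
    # Hypothetical fragment sizes (bp) at two loci.
    acc_1 = {"locus1": (150, 150), "locus2": (200, 204)}
    acc_2 = {"locus1": (150, 154), "locus2": (210, 210)}
    print(round(da_distance(acc_1, acc_2), 4))
```

A matrix of such pairwise distances is the input that a neighbor-joining tree is built from.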
2.9. Genetic Diversity Analysis
The SSR genotyping data were converted into different formats depending on the requirements of various programs. The genotyping data were imported into POPGENE 1.32 and the diploid co-dominant data format was selected to analyze the allele number (Na), effective allele number (Ne), observed heterozygosity (Ho), expected heterozygosity (He), and Shannon’s diversity index (I) of individual SSR loci in the whole sample [33]. F-statistics were calculated using GenAlEx 6.5 software [34]. The polymorphism information content (PIC) of each locus was calculated using PowerMarker 3.25. The Jaccard genetic similarity coefficient between each pair of samples was computed using NTSYSpc 2.10 and the genetic similarity coefficient matrix was generated [35].
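The per-locus statistics named above follow standard formulas, which the Python sketch below applies to diploid co-dominant genotypes. It is an illustration under stated assumptions, not a reimplementation of POPGENE, PowerMarker, or NTSYSpc. In particular, He is computed as 1 − Σp² without a sample-size correction, the PIC classes use the commonly cited cut-offs of 0.25 and 0.5 (not stated in this section), and all function names and genotype values are hypothetical.

```python
from collections import Counter
from math import log


def locus_stats(genotypes):
    """Diversity statistics for one locus from a list of (allele, allele) tuples."""
    alleles = [a for genotype in genotypes for a in genotype]
    freqs = [n / len(alleles) for n in Counter(alleles).values()]
    homozygosity = sum(p * p for p in freqs)
    pic = 1 - homozygosity - sum(
        2 * p * p * q * q for i, p in enumerate(freqs) for q in freqs[i + 1:]
    )
    return {
        "Na": len(freqs),                                    # observed alleles
        "Ne": 1 / homozygosity,                              # effective alleles
        "Ho": sum(a != b for a, b in genotypes) / len(genotypes),
        "He": 1 - homozygosity,                              # uncorrected
        "I": -sum(p * log(p) for p in freqs),                # Shannon's index
        "PIC": pic,
    }


def pic_class(pic):
    """Assumed cut-offs: > 0.5 high, 0.25 to 0.5 moderate, < 0.25 low."""
    if pic > 0.5:
        return "high"
    return "moderate" if pic >= 0.25 else "low"


def jaccard_similarity(sample_x, sample_y):
    """Shared alleles divided by all alleles observed in either sample."""
    x = {(locus, a) for locus, g in sample_x.items() for a in g}
    y = {(locus, a) for locus, g in sample_y.items() for a in g}
    return len(x & y) / len(x | y)


if __name__ == "__main__":
    locus = [(150, 150), (150, 154), (154, 154), (150, 154)]
    stats = locus_stats(locus)
    print(stats, pic_class(stats["PIC"]))
```

Programs that report a bias-corrected He will return slightly larger values than this sketch for small samples.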
4. Discussion
In this study, EST-SSR markers were developed using the transcriptome sequencing data for salt-treated kenaf samples. The number of assembled sequences obtained from the sequencing reads used as the marker source was compared with the corresponding data generated in previous kenaf transcriptome sequencing studies [7,21,26,37]. More common sequences, including SSR and InDel sequence variations, were detected in this study than in the earlier study by Jeong et al. [26], indicating that the EST-SSRs identified in this study may cover more expressed genes. Compared with other transcriptome analyses of kenaf, the GO and KEGG enrichment analyses in this study more thoroughly determined the functional characteristics of the DEGs, and the expression patterns varied among the different transcriptomes [6]. Therefore, developing EST-SSR markers using different transcriptomes for the same species is important to supplement the available molecular markers for kenaf.
According to the results of the GO and KEGG enrichment analyses, the genes containing SSRs and SNPs were primarily involved in transcription, metabolism, and signal transduction. Accordingly, the EST-SSR markers developed in this study were mainly located in these genes. Most unigenes were found to have single nucleotide mutations, specifically transitions and transversions. The genes containing SNPs included DEGs related to the response of kenaf to salt stress. Under saline conditions, kenaf genes encoding the proteins affecting metabolic activities, including amino acid metabolism and carbohydrate metabolism, are highly enriched [38]. In the current study, the utility of EST-SSR markers was verified using some of the DEGs encoding proteins in the PPI network because of the potential functional correlation among these genes. In addition, EST-SSR markers may influence gene functions, leading to changes in the expression of other genes in the same network, which would likely lead to phenotypic changes in crops [7].
The enriched KEGG metabolic pathways among the DEGs encoding proteins in the PPI network included the pentose phosphate pathway (00030), glyoxylate and dicarboxylate metabolism (00630), and ribosome (03011), which were similar to the enriched metabolic pathways among the DEGs containing sequence variations (SSRs and SNPs) that may be useful molecular markers. The population structure and distribution were determined according to the multi-locus genotypes [39]. Of the 55,219 pairs of SSR primers that were developed, 20 EST-SSR primer pairs were identified according to the EST sequences of the DEGs encoding proteins in the PPI network. A total of 30 kenaf germplasm materials were used for the marker verification. One highly polymorphic locus, two moderately polymorphic loci, and seven loci with a relatively low polymorphism rate were found. The genotypes of 9 EST-SSR markers divided the 30 kenaf germplasm materials into two types. The results of the two-dimensional PCoA and the phylogenetic analysis were consistent. Thus, these nine EST-SSR markers may be used to analyze the genetic diversity and genetic structure of kenaf germplasm resources in future investigations, laying the foundation for further research.
In summary, we developed a batch of EST-SSR markers using transcriptome data for kenaf plants grown under saline conditions, and the SSR primers for the DEGs encoding proteins in the PPI network were selected for screening, verification, and application. Finally, nine primer pairs for new polymorphic EST-SSR markers suitable for genotyping were obtained and used to analyze the genetic diversity and population structure of kenaf germplasm resources, with implications for molecular marker-assisted selection and the characterization of the kenaf genome.
5. Conclusions
In this study, a set of EST-SSR markers related to the kenaf response to salinity stress was developed on the basis of the transcriptome data for kenaf plants exposed to salt stress. These markers were mainly mononucleotide repeats. The DEGs encoding proteins in the PPI network and the associated metabolic pathways were revealed. Moreover, 20 pairs of EST-SSR primers were used to genotype 30 kenaf varieties (lines), among which 9 primer pairs were confirmed as suitable markers according to the level of polymorphism (i.e., high, moderate, and low). The SSR molecular markers for kenaf developed in this study may be useful tools for the molecular marker-assisted breeding of salt-tolerant kenaf cultivars. Using the polymorphic EST-SSR markers and the kenaf EST-SSR marker library developed in this study, future work can test the relationships between the EST-SSR markers in these DEGs from the salt-stress transcriptomes and the salt-stress phenotypes in natural populations as well as genetic linkage populations. Such work would guide the pyramiding of favorable salt-tolerance alleles so that these molecular markers can be applied to the creation and screening of salt-tolerant kenaf germplasm. Furthermore, these markers may facilitate the future development and discovery of important marker types related to kenaf salt stress.