A Useful Technical Application of the Identification of Nucleotide Sequence Polymorphisms and Gene Resources for Cinnamomum osmophloeum Kaneh. (Lauraceae)

The plant genus Cinnamomum contains economically important evergreen aromatic trees and shrubs belonging to the laurel family, Lauraceae. Our study tree species Cinnamomum osmophloeum Kaneh. (CO) has high economic value in Taiwan. The present study attempts to identify the gene resources of Cinnamomum osmophloeum Kaneh. by analyzing the nucleotide sequences of the partial noncoding internal transcribed spacer 2 (pITS2) of the ribosomal DNA and the trnL-trnF chloroplast genome. Seventy-three geographical strains of Cinnamomum osmophloeum, preserved in the Lien Hua-Chin Research Center of the Forestry Research Institute and the Hua-Lin Forestry Center of Chinese Culture University, were collected and analyzed by PCR amplification and DNA sequencing to study the genetic diversity and nucleotide sequence polymorphisms of the tested specimens. Our results allowed us to accurately identify the lineage of Cinnamomum osmophloeum and to conclude that the strains belonging to the Lien Hua-Chin Research Center had much higher genetic diversity than those preserved in the Hua-Lin Forestry Center. Multiple sequence alignments demonstrated that the variability of the nucleotide sequence polymorphisms for the pITS2 region was higher than those of the trnL intron and trnL-trnF intergenic spacer (IGS) regions among the 73 tested specimens of Cinnamomum osmophloeum. Cluster analyses, using the neighbor-joining and maximum parsimony methods, for the 73 tested geographical strains of Cinnamomum osmophloeum and species of Cinnamomum registered in the GenBank and EMBL databases were performed to demonstrate the genus and species distribution of the samples. Here, we describe the use of pITS2 polymorphisms as a genetic classifier and report the establishment of a DNA sequence database for CO gene resource identification. 
The sequence database described in this study can be used to identify CO specimens at the inter- or intraspecies level using pITS2 DNA sequences, which illustrates its value in gene resource identification. Our results can further be used to correctly identify authentic Cinnamomum osmophloeum Kaneh.


Introduction
Forests 2019, 10, 306; doi:10.3390/f10040306; www.mdpi.com/journal/forests
Cinnamomum belongs to the Lauraceae and is composed of approximately 350 species of evergreen trees and shrubs [1]. Cinnamon is the dried bark from species of Cinnamomum, such as C. cassia (L.) J.Presl and C. verum J.Presl. The former is from the southern part of the Chinese mainland, specifically Guangdong, Guangxi, Hainan, and Yunnan; the latter originates largely from Ceylon and India [2]. The essential oils extracted from the barks of these plants include cinnamaldehyde, coumarin, cinnamyl alcohol and eugenol; these compounds are used not only as medicinal ingredients but also as ingredients in stomachic agents, carminatives, astringents, food, drinks, cosmetics, spices, and preservatives [3]. For the past thirty years, Taiwan's indigenous Cinnamomum osmophloeum Kaneh. (CO) has been used widely as a substitute for C. cassia because of the similarities in the chemical compositions of the essential oils from these plants. Studies of the essential oils extracted from CO leaves have demonstrated their excellent insecticidal [4-6], antibacterial [7,8], antifungal [9,10], and anti-inflammatory [11-16] activities, as well as their potential use as a medicinal material for decreasing high uric acid and high blood sugar [17,18]. Cinnamomum osmophloeum leaves can be collected and essential oils extracted without damaging the trees, and the annual harvest is highly profitable.
Cinnamomum osmophloeum is a species endemic to Taiwan. There are many geographic strains specific to different growing areas. Ta-Wei Hu and his colleagues set up a garden in the Hua-Lin Forestry Center of Chinese Culture University in 1985 to maintain superior strains of CO. In 1992, a preserved site was built, and many geographical strains of CO were collected and maintained systematically at the Lien Hua-Chin Research Center of the Forest Research Institute [19]. The Hua-Lin Forestry Center (HL) and Lien Hua-Chin Research Center (LHC) have collected large numbers of geographic strains of CO, and these collections are an important material resource for studying indigenous CO. While geographical strains of CO are similar in appearance, the variation in the essential oil content of their leaves is substantial. Based on the main constituents of the leaf essential oils and cluster analyses of their relative contents, CO has been classified into nine chemotypes, as follows: cassia (80% cinnamaldehyde and 10% coumarin), cinnamaldehyde, coumarin, linalool, eugenol, camphor cinnamaldehyde/cinnamyl acetate, cinnamyl acetate, linalool, camphor, 4-terpinenol, linalool-terpinenol, and mixed [20,21].
Traditionally, species identification is based on the morphological or histological characterization of the tree or shrub. However, identification based on morphological characteristics alone is difficult due to the morphological similarities between plants. Recently, DNA sequence comparisons of internal transcribed spacers (ITSs) have become widely used as an improved method of species identification [22,23]. The ITSs of ribosomal DNA include two segments, ITS1 and ITS2, which are separated by the 5.8S rDNA sequence. The lengths of the ITS1 and ITS2 sequences in angiosperms range from 187 to 298 bp and from 187 to 252 bp, respectively. The length of these regions facilitates the authentication process: they are usually shorter than 300 bp for most of the species examined to date and can therefore be directly amplified and sequenced. For example, the sequence variations in ITS2 were sufficient to allow researchers to differentiate among medicinal Dendrobium species [24]. Recently, fifty-five processed medicinal herbs belonging to forty-eight families were successfully authenticated using ITS2 with specific primers [25]. ITS sequences can be used as species identification codes through publicly available sequence databases, such as GenBank and EMBL. With molecular source identification technology, we can not only study associations between variations in DNA sequences and evolutionary relationships among species, but also perform the molecular authentication of traditional Chinese medicines at the interspecies, or even intraspecies, level [26].
In addition to the ITS regions of ribosomal DNA, universal primers have been proposed for the amplification of the trnL (UAA) intron and the trnL-trnF (GAA) intergenic spacer (IGS) from the noncoding regions of chloroplast DNA [27]. Because noncoding regions have faster rates of evolution [28], these loci have been successfully used as markers for both phylogenetic and gene diversity studies of plant species at the genus and species levels [29,30]. Indeed, this feature of noncoding regions also provides an opportunity for species identification. Cinnamon species have been identified genetically by analyzing the nucleotide sequences of trnL-trnF chloroplast DNA from four species: Cinnamomum cassia (L.) J.Presl, C. verum J.Presl, C. burmannii (Nees & T.Nees) Blume, and C. sieboldii (Makino) Hatus [31]. Tsai et al. [32] have reported the use of the trnL intron and the trnL-trnF intergenic spacer (IGS) in the chloroplast genome and have established a DNA sequence database for the forensic identification of popular plant species in Taiwan.
We have previously described the DNA barcoding of CO based on the pITS2 region of ribosomal genes [33,34]. Seven representative geographical strains, belonging to six different chemotypes of CO, were used to study genetic diversity. Our results indicated that the pITS2 nucleotide sequences of the seven geographical strains are not correlated with essential oil composition. The pITS2 sequences were sufficient for the barcoding of CO, although additional genes would need to be analyzed to identify samples of the various chemotypes. Surprisingly, nucleotide sequence polymorphisms in the pITS2 regions from geographical strains of CO were observed when we tested additional specimens from the LHC of the Forestry Research Institute in our later experiments. In the present study, the gene resources, genetic diversity, and nucleotide sequence polymorphisms of 73 geographical strains of CO were determined using the ITS2 locus of ribosomal DNA and the trnL-trnF locus of the chloroplast genome.

Source of Samples and Their Treatment
The fourteen geographical strains (coded as "CO"-followed by a letter) of C. osmophloeum Kaneh. that were used in the present study were collected by Ta-Wei Hu and his colleagues from various locations in Taiwan and were planted in the Lien Hua-Chin Research Center located in central Taiwan in 1992 [19]. Another 59 geographical strains of CO (used in this experiment and coded with numbers) from the Hua-Lin Forestry Center of Chinese Culture University located in the northern part of Taiwan were also collected by Ta-Wei Hu and his colleagues in 1985. All of the leaves were authenticated by Fu-Yuan Lu in the Department of Forestry and Natural Resources at the National Chiayi University. The leaves were dehydrated, divided into air-permeable bags and stored dry in sealed containers. Voucher specimens were deposited in the Department of Forestry and Natural Resources at the National Chiayi University. We obtained partial fragments of the ITS2, trnL-trnF IGS and trnL intron; any distinct PCR product that was visualized by agarose gel electrophoresis on at least three occasions for each of the geographical strains was directly sequenced. All of the nucleotide sequence data reported in this paper have been deposited in the DDBJ/EMBL/GenBank nucleotide sequence databases under the accession numbers shown in Table 1. (Accession numbers will be provided upon the acceptance of the manuscript.) See Table 1 for the species names, codes and GenBank accession numbers of the specimens used in this study.
1 Sample numbers 1-14 with letter codes were collected from LHC.
2 Sample numbers 15-49 with number codes were collected from HL.
3 The pITS2 nucleotide sequence of CO-003 is homologous to that of CO-005.
4 The pITS2 nucleotide sequence of CO-010 is homologous to that of CO-026, 037.
5 The pITS2 nucleotide sequence of CO-016 is homologous to that of CO-128.
6 The pITS2 nucleotide sequence of CO-024 is homologous to that of CO-031 and 034.
7 The pITS2 nucleotide sequence of CO-025 is homologous to that of CO-032.
8 The pITS2 nucleotide sequence of CO-028 is homologous to that of CO-
13 The trnL-trnF IGS sequence of CO-031 is homologous to that of CO-050, 070 and 125.
14 The trnL-trnF IGS sequence of CO-067 is homologous to that of CO-071.
15 The present study.

DNA Extraction, Polymerase Chain Reaction (PCR), and DNA Sequencing
Genomic DNA was isolated according to a cetyl trimethyl ammonium bromide (CTAB) approach with minor modifications [35]. Briefly, 100 mg of the dried leaf material was ground into a fine powder in liquid nitrogen using a mortar and pestle. After the addition of 1 mL of prewarmed extraction buffer (100 mM Tris-HCl [pH 8.0], 20 mM EDTA, 1 M NaCl, 1% CTAB, 1% PVP-40), the mixture was incubated in a water bath at 65 °C for 20 minutes (min) with gentle shaking. The sample solution was mixed with an equal volume of chloroform:isoamyl alcohol (24:1) and centrifuged at 11,000× g for 20 min at 4 °C. The supernatant was transferred to a new Eppendorf tube containing 2 mL of precipitation buffer (50 mM Tris-HCl [pH 8.0], 10 mM EDTA, 40 mM NaCl, 1% CTAB), incubated at room temperature for 1 hour (h), and centrifuged at 11,000× g for 15 min at 4 °C. The supernatant was carefully decanted, and the pellet was gently suspended in 350 µL of 1.2 M NaCl with 10 mg/mL RNase A. After incubation at 37 °C for 30 min, an extraction with 350 µL of chloroform:isoamyl alcohol (24:1) was performed, and the aqueous phase was transferred to a new tube. Then, 3 mM sodium acetate (one-tenth of the recovered volume) and 95% ethanol (equal to twice the recovered volume) were added to precipitate the DNA. After centrifugation at 12,000× g for 20 min, the DNA pellet was washed with 1 mL of 70% ethanol, dried and dissolved in 50 to 100 µL of TE buffer.
The pITS2 fragments of the 73 geographical strains of CO were amplified using the BEL-1/BEL-3 primer set designed by Chiou et al. [25]. Universal primers were used to amplify the noncoding regions of chloroplast DNA: the "e" and "f" primers for the trnL-trnF IGS and the "c" and "d" primers for the trnL intron [27]. A schematic diagram of the rDNA ITS, trnL-trnF IGS and trnL intron regions is shown in Figure 1, and the primers and their nucleotide sequences are shown in Table 2. PCR amplification of the pITS2, trnL-trnF IGS and trnL intron fragments was carried out as follows. Each PCR was performed in a total volume of 25 µL containing 2 µL of template DNA (40-80 ng), 2.5 µL of 10× PCR reaction buffer, 1 µL of 25 mM MgCl2, 2 µL of 2.5 mM dNTPs, 0.5 µL of 10 µM forward primer, 0.5 µL of 10 µM reverse primer, 0.15 µL (5 units) of Taq DNA polymerase (Geneaid Biotech Ltd.; Taipei, Taiwan), 2 µL of betaine and 13.35 µL of sterile distilled water. For the amplification of ITS2, the template DNA was denatured at 95 °C for 5 min and then subjected to 40 cycles of 95 °C for 30 seconds (s), 55 °C for 30 s, and 72 °C for 45 s, followed by a final extension at 72 °C for 10 min. For the amplification of the trnL-trnF IGS and trnL intron, the template DNA was denatured at 95 °C for 5 min and then subjected to 35 cycles of 95 °C for 30 s, 50 °C for 30 s, and 72 °C for 45 s, followed by a final extension at 72 °C for 10 min. The PCR products were examined by 1.5% agarose gel electrophoresis and visualized on an ABI3730XL capillary-based DNA sequencer (Applied Biosystems). Samples were purified for sequencing using an ABI PRISM® 377 DNA sequencer (Applied Biosystems Industries; Foster City, CA, USA). The obtained sequences were compiled with BioEdit software (version 7.0) [36] and verified by comparison to the in-house and GenBank databases. The sequences reported in this paper were deposited in GenBank (accession numbers will be provided upon the acceptance of the manuscript; see Table 1).
Table 2 notes: 1 The sequences in this table are for the primers shown in Figure 1. 2 D represents A, G or T; K represents G or T; H represents A, C or T; and Y represents T or C.
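The degenerate-base codes listed in the Table 2 notes (D, K, H and Y) can be made concrete with a short sketch that expands a degenerate primer into every sequence it encodes. This is only an illustration: the primer string used below is a hypothetical placeholder, not one of the BEL or trnL primers from Table 2, and the function names are our own.

```python
from itertools import product

# IUPAC codes named in the Table 2 notes, plus the four unambiguous bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "D": "AGT",  # A, G or T
    "K": "GT",   # G or T
    "H": "ACT",  # A, C or T
    "Y": "CT",   # T or C
}


def expand_degenerate(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


def degeneracy(primer):
    """Number of distinct sequences encoded by the primer."""
    n = 1
    for base in primer.upper():
        n *= len(IUPAC[base])
    return n


if __name__ == "__main__":
    example = "GGDKAYH"  # hypothetical fragment, NOT a primer from Table 2
    print(degeneracy(example), "variants")
    print(expand_degenerate("AK"))
```

A high degeneracy means that each individual primer variant is present at a lower effective concentration in the reaction, which is one reason such a count can be useful when comparing primer designs.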

Local DNA Database Establishment and Sequence Analysis
The sequences of the partial ITS2, the chloroplast trnL intron and the trnL-trnF IGS from all of the collected samples were imported into BioEdit [36] for comparison purposes. MEGA 7.0 [37] was used to construct the phylogenetic trees of the CO and Cinnamomum spp. without an outgroup species, based on the neighbor-joining (NJ) and maximum-parsimony (MP) methods. The phylogeny test options used to construct the NJ and MP phylogenetic trees were the following: Bootstrap (1000 replicates), seed = 22607; Gaps/Missing Data: Complete Deletion; Substitution Model: Nucleotide (Kimura 2-parameter); Substitutions to Include: d: Transitions + Transversions; Pattern among Lineages: Same (Homogeneous); and Rates among Sites: Uniform rates.
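To make the Kimura 2-parameter and complete-deletion options more tangible, the following minimal sketch computes the standard K2P distance, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. It is not MEGA's implementation; the function names and the toy sequences in the usage lines are assumptions made for illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def complete_deletion(alignment):
    """Drop every column in which any sequence has a gap or ambiguous base."""
    seqs = [s.upper() for s in alignment]
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("all sequences must have the same aligned length")
    keep = [i for i in range(len(seqs[0]))
            if all(s[i] in "ACGT" for s in seqs)]
    return ["".join(s[i] for i in keep) for s in seqs]


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two gap-free aligned sequences."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and of equal length")
    transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    p = transitions / len(seq1)
    q = transversions / len(seq1)
    # math.log / math.sqrt raise ValueError if the pair is too divergent.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    toy = complete_deletion(["GAAAA-AAAAA", "AAAAAAAAAAA", "AAAAANAAAAA"])
    print(round(k2p_distance(toy[0], toy[1]), 5))
```

A distance matrix built this way is the input that a neighbor-joining algorithm would then cluster; bootstrap support is obtained by resampling alignment columns with replacement and repeating the whole procedure.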

Sequence Analysis
The C. osmophloeum specimens were chosen to cover a wide range of geographical regions in Taiwan. For this study, 14 geographical strains of CO were collected from the Lien Hua-Chin Research Center of the Forestry Research Institute (LHC), and 59 strains were collected from the Hua-Lin Forestry Center of Chinese Culture University (HL) (Table 1). Figure 1 illustrates the three regions of interest (pITS2, trnL intron, and trnL-trnF IGS). The primer set (BEL-1 and BEL-3) used was originally designed to amplify the ITS2 region of medicinal plants and worked well for the Cinnamomum spp. in our previous research [25].
The pITS2, trnL intron, and trnL-trnF IGS sequences analyzed in this study, including those determined here and those available from GenBank, were aligned using BioEdit software. For the 73 geographical strains of CO studied, we obtained 49, 10, and 7 nonidentical nucleotide sequences for the pITS2, trnL intron, and trnL-trnF IGS regions, respectively, and we obtained 16, 19, and 18 representative nucleotide sequences, respectively, for Cinnamomum spp. from GenBank (Table 1). The lengths of the PCR products for the pITS2 and trnL-trnF IGS regions were highly diverse among the Cinnamomum spp.; this diversity was greater than that for the trnL intron. The sizes ranged from 146 bp to 171 bp for the pITS2 and from 328 bp to 356 bp for the trnL-trnF IGS, whereas the amplicons were either 464 bp or 465 bp in length for the trnL intron among all of the Cinnamomum spp. For CO, the average lengths of the PCR products of the pITS2, trnL intron, and trnL-trnF IGS sequences were 168.3 bp, 464.1 bp, and 357.8 bp, respectively. The variation in the sizes of the PCR products was greatest for the pITS2 region of the CO specimens collected from the LHC, but these same geographical strains shared conserved sequences for the other two regions tested, with PCR products of 464 bp and 356 bp for the trnL intron and trnL-trnF IGS, respectively. Likewise, the variation in the sequence identity matrix was greatest for pITS2 among the 14 geographical strains of CO collected from the LHC, whereas the same 14 strains had identical nucleotide sequences in the trnL-trnF IGS and trnL intron regions. The sequence analysis data for the PCR products of the pITS2, trnL intron and trnL-trnF IGS in the present study are shown in Table 3.
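A sequence identity matrix of the kind summarized in Table 3 can be sketched as follows. This is a simplified illustration, not BioEdit's routine: identity is assumed here to be the fraction of matching positions among aligned columns in which at least one of the two sequences has a base, and the strain names and sequences are hypothetical toy data.

```python
def pairwise_identity(a, b, gap="-"):
    """Fraction of identical positions between two aligned sequences.

    Assumption: columns where both sequences have a gap are ignored;
    a gap opposite a base counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return matches / compared


def identity_matrix(aligned):
    """Full identity matrix as a dict keyed by (name1, name2)."""
    names = list(aligned)
    return {(m, n): pairwise_identity(aligned[m], aligned[n])
            for m in names for n in names}


def identity_range(aligned):
    """Minimum and maximum identity over all distinct pairs."""
    names = list(aligned)
    values = [pairwise_identity(aligned[m], aligned[n])
              for i, m in enumerate(names) for n in names[i + 1:]]
    return min(values), max(values)


if __name__ == "__main__":
    # Hypothetical toy alignment, not data from this study.
    toy = {"s1": "ACGT-A", "s2": "ACGTTA", "s3": "ACCT-A"}
    low, high = identity_range(toy)
    print(f"identity range: {low:.3f} to {high:.3f}")
```

Reporting the minimum and maximum over all pairs gives an identity range for one marker region and one set of strains, which is the form in which such ranges are usually compared between collections.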

Sequence Analysis of the DNA Database in This Study
To evaluate the probability that the pITS2, trnL intron, and trnL-trnF IGS regions can be used to identify any sample at the species level, the nucleotide sequences of 18 Cinnamomum spp. collected from 42 different locations were acquired from GenBank (data as of January 2019; see Table 1). One representative sequence from each collection site was selected for each species if more than one homologous sequence was deposited in GenBank. These sequences were used to construct our local DNA database. The sequences for the pITS2, trnL intron, and trnL-trnF IGS regions were aligned separately using the MEGA 7 program, and the results are shown in Figure 5. Our results indicated that the NJ tree constructed from the pITS2 region data had much better resolution than the trees constructed from the trnL intron and trnL-trnF IGS sequence data (Figure 5A-C). The sequence variation in the pITS2 region was sufficient not only for clustering specimens of the same species, but also for discriminating the geographical strains intraspecifically. The NJ trees constructed from the trnL intron and trnL-trnF IGS sequence data would not be predicted to perform well for species clustering because of the lower sequence variation in these two regions. It should be noted that the strains of CO collected from the LHC and HL were clustered and placed in two separate branches in the NJ tree using the pITS2 sequences. One pITS2 sequence for CO (GQ255635) has already been registered in the GenBank and EMBL databases. Unfortunately, this sequence was not successfully clustered with the other CO specimens.
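The rule of keeping one representative sequence per species and collection site can be sketched as a simple de-duplication step. The record fields, species and site labels, and accession strings below are hypothetical placeholders, and keeping the first record encountered is an assumption, since the text does not state how the representative was chosen.

```python
def pick_representatives(records):
    """Keep the first record seen for each (species, site) pair.

    Assumption: "first seen" stands in for whatever criterion was
    actually used to choose the representative sequence.
    """
    chosen = {}
    for rec in records:
        key = (rec["species"], rec["site"])
        chosen.setdefault(key, rec)  # later duplicates are ignored
    return list(chosen.values())


if __name__ == "__main__":
    # Hypothetical records; accessions are placeholders, not real entries.
    records = [
        {"species": "C. cassia", "site": "site-1", "accession": "ACC0001"},
        {"species": "C. cassia", "site": "site-1", "accession": "ACC0002"},
        {"species": "C. cassia", "site": "site-2", "accession": "ACC0003"},
        {"species": "C. verum", "site": "site-1", "accession": "ACC0004"},
    ]
    for rec in pick_representatives(records):
        print(rec["species"], rec["site"], rec["accession"])
```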

Phylogenetic Analysis and Gene Resource Identification
The pITS2 sequences of ten Cinnamomum spp. collected from 16 different locations, which were available in GenBank, and the data from the 38 geographical strains of CO were used to further construct the phylogenetic tree using MP methods for gene resource identification purposes (see Table 1). Based on our results, all of the specimens were properly clustered into seven separate groups at the species level, without an outgroup species (Figure 6). Nineteen strains of CO, accounting for 79.2% of the representative sequences from the HL specimens, were clustered into Group 1 and were phylogenetically related to the one registered sequence from CO (GQ255635). Thirteen strains, accounting for 92.9% of the representative sequences from the LHC specimens, were clustered into Group 6 and were phylogenetically distinct from the strains collected from the HL in Group 1. CO-B2 was the only strain collected from the LHC that was related to the strains collected from the HL in Group 1. CO-030, C. insulari-montanum (AF272263), C. loureiroi-CL (VN-Thanh Hoa) (GQ255632) and five specimens of C. cassia (CC1-CC5) were closely related and clustered into Group 4. The phylogenetic trees constructed from the pITS2 data using the MP method were more accurate than those constructed using the NJ method for inter- or intraspecific gene resource identification.

Discussion
Cinnamon species in Taiwan have been used heavily for hundreds of years. However, there is a serious problem in the markets with fake or misidentified cinnamon materials, which are very difficult to identify from their morphology. There are four indigenous cinnamon species in Taiwan: Cinnamomum osmophloeum, Cinnamomum insulari-montanum Hayata, Cinnamomum pedunculatum Nees, and Cinnamomum macrostemon Hayata [2]. Because of the economic importance of these plants, Dr. Ta-Wei Hu and his colleagues collected geographical strains of CO from various natural habitats, which were planted in the LHC in 1992. The essential oil extracted from CO trees has received more attention lately because recent studies have revealed that this plant has antifungal, insecticidal, antibacterial, and mosquito larvicidal properties [4-10]. In addition, recent studies have detailed the potential pharmacological uses of compounds from these plants; they are believed to have antitumor, chemoprotective, anti-inflammatory, and antioxidative properties, as well as the ability to reduce blood uric acid levels [11-16]. For these reasons, CO is a valuable plant indigenous to Taiwan.
Based on phylogenetic analyses and the GC-MS analysis of essential oils extracted from the leaves, researchers have previously reported at least six chemotypes for the geographical strains of CO in the LHC [20,21,33]. Unfortunately, we were previously unsuccessful in authenticating the chemotype classification for specimens of geographical strains of CO [33]. In the present study, we found that the nucleotide sequence polymorphisms in the ITS2 regions were sufficient to distinguish the geographical strains of CO collected from the LHC (Figure 2). Of the 73 geographical strains tested, there were 31, 8, and 4 unique sequences for the pITS2, trnL intron, and trnL-trnF IGS regions, respectively. CO-004 had two unique sequences in two different marker regions. Therefore, the 43 (total) unique sequences for these three regions were used for the intraspecific molecular identification of 42 out of 73 geographical strains of CO (Figures 2-4). However, more studies will be needed to establish a method for the DNA authentication of chemotypes for the strains of CO.
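The counts above can be read as follows: 31 + 8 + 4 = 43 unique sequences, and because one strain (CO-004) carries two of them, they identify 43 - 1 = 42 strains. The sketch below illustrates the underlying bookkeeping for a single marker region, assuming that a "unique" sequence is one carried by exactly one strain, as opposed to a merely nonidentical (distinct) sequence. The strain labels and sequences are hypothetical toy data.

```python
from collections import Counter


def summarize_sequences(seq_by_strain):
    """Return (number of distinct sequences, strains with a unique sequence).

    Assumption: a sequence is "unique" when exactly one strain carries it.
    """
    counts = Counter(seq_by_strain.values())
    distinct = len(counts)
    unique_strains = sorted(
        strain for strain, seq in seq_by_strain.items() if counts[seq] == 1
    )
    return distinct, unique_strains


if __name__ == "__main__":
    # Hypothetical toy data for one marker region.
    toy = {
        "S1": "ACGTAC",
        "S2": "ACGTAC",  # shares its sequence with S1
        "S3": "ACGTTC",
        "S4": "ACCTAC",
    }
    distinct, unique_strains = summarize_sequences(toy)
    print(distinct, "distinct sequences;", unique_strains, "identifiable strains")
```

Only strains with a unique sequence can be identified individually from that marker; strains sharing a sequence can be assigned to a group but not told apart.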
The sequence identity (ID) in the ITS2 regions was between 0.310 and 0.994 among the geographical strains of CO. Surprisingly, this intraspecific variation was much greater than that observed among Cinnamomum spp., for which the ID ranged from 0.72 to 1.00. The majority of the variation arose from the strains of CO collected from the LHC. In contrast, the ID among the strains of CO collected from the HL only ranged between 0.872 and 0.994, a reasonably small variation compared to that observed among Cinnamomum spp. (Table 3). All of the nucleotide sequences in the trnL intron and trnL-trnF IGS regions were 100% identical for the tested specimens of CO from the LHC. Additionally, all of the strains preserved in the LHC have been authenticated and properly labeled. Therefore, our results indicated the presence of great genetic diversity among the geographical strains of CO that are preserved in the LHC.
For the three DNA regions used in the present study, the nucleotide sequence variations in the trnL-trnF IGS regions were the greatest among the Cinnamomum spp. (Table 3). All of the C. cassia specimens collected from the five locations (CC1-CC5, see Table 1) were determined to have identical trnL-trnF IGS nucleotide sequences. However, the two species C. loureiroi and C. cassia could not be separated successfully using the trnL-trnF IGS sequences. The IDs found in the trnL intron were less variable (between 0.993 and 1) among the Cinnamomum spp. Many different species have identical trnL intron sequences, such as C. cassia, C. insulari-montanum, C. burmannii and C. wilsonii Sarg. Based on our results, the trnL intron is not a good marker for identifying species; instead, we recommend ITS2 because it provides better resolution for species identification among Cinnamomum spp.
The leaves of all Cinnamomum species have three-branched veins and are similar in appearance. It would be difficult to distinguish C. osmophloeum, C. insulari-montanum, C. macrostemon, C. subavenium Miq., C. burmannii, and other subspecies from each other in the field solely based on their leaf morphology; other characteristics, such as the bud scales and the shapes of the perianth and fruits, must be used to facilitate identification. The use of molecular markers can overcome the inherent difficulties of species identification and eliminate misclassification. These molecular markers can be employed for a broad range of purposes, including species identification, phylogenetic studies, the study of systems evolution, and geological phylogenetics [38]. Among ribosomal RNA (rRNA) genes, ITS sequences evolve relatively quickly, and the resulting variation aids in solving interspecies classification issues within genera. In this study, phylogenetic trees were constructed using the NJ method based on three marker regions to compare Cinnamomum spp. (Figure 5). The genetic distance separating the species was the shortest for the phylogenetic tree constructed based on the trnL intron because of the lack of nucleotide variation in this region (Figure 5B). The sixteen registered ITS2 sequences for Cinnamomum spp. in the GenBank database were properly clustered and successfully separated at the species level (Figure 5A). However, the geographical strains collected from the LHC were clustered into two different groups (LHC-1 and LHC-2) and were distinct from the strains collected from the HL. The greater genetic diversity of the strains from the LHC could have resulted from its endeavor to create a comprehensive collection reflecting the diversity of CO in various regions of Taiwan. In contrast, the strains from the HL were selected for their superiority.
The phylogenetic tree based on the ITS2 sequence data and constructed using the NJ and parsimony methods provided better clustering efficacy for all of the specimens of Cinnamomum spp. (Figure 6). Nineteen of the 24 (79.2%) representative sequences for strains collected from the HL were clustered together with a bootstrap value of 63 and positioned next to a previously registered sequence of CO-D4 in Group 1. CO-D6 and four other strains were clustered together with a bootstrap value of 100, and CO-SP1 and seven other strains were clustered together with a bootstrap value of 89. These 13 strains, which accounted for 92.9% of the representative sequences for strains collected from the LHC, were clustered together in Group 7. In Group 4, we found that C. cassia specimens collected from five different locations in mainland China were clustered together with a bootstrap value of 66, and these specimens were related to C. loureiroi Nees from Vietnam. We also determined that CO-030 is a close relative of C. insulari-montanum. These results indicate that a phylogenetic tree based on the ITS2 sequence variability among Cinnamomum spp. can be used for molecular identification purposes and for the selection of valuable strains among the various geographical strains of CO. ITS2 was proposed as a novel DNA barcode for identifying medicinal plant species [39,40]. DNA barcoding is a rapidly growing area of research. Our results show the value of these markers for barcoding and gene resource identification among Cinnamomum spp., especially for Cinnamomum osmophloeum.

Conclusions
In conclusion, our study identifies ITS2 as the most suitable of the tested regions for DNA barcoding of this species. The Lien Hua-Chin Research Center of the Forestry Research Institute and the Hua-Lin Forestry Center of Chinese Culture University are important locations for the preservation of the diversity of geographical strains of Cinnamomum osmophloeum. The ITS2 rDNA local sequence database established in the present study could be used in gene resource identification and the selection of desirable Cinnamomum osmophloeum strains. Our results can also be used to identify true Cinnamomum osmophloeum correctly and rapidly at the front line and to avoid counterfeit Cinnamomum osmophloeum in the markets.
°C for 5 min and then subjected to 40 cycles of 95 °C for 30 seconds (s), 55 °C for 30 s, and 72 °C for 45 s. The final cycle included an extension at 72 °C for 10 min. For the amplification of the trnL-trnF IGS and trnL intron, the template DNA was denatured at 95 °C for 5 min and then subjected to 35 cycles of 95 °C for 30 s, 50 °C for 30 s, and 72 °C for 45 s. The final cycle included an extension at 72 °C for 10 min.
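As a rough check on the two cycling programs described above, the following sketch encodes each program as a small data structure and sums the programmed hold times. It assumes that the final 10 min extension is an additional hold after the last cycle and it ignores ramp times between temperatures. The initial temperature of the first program is not given in the text above and is therefore left unset. The class and variable names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Step:
    temp_c: Optional[int]  # None where the text above does not give the value
    seconds: int


@dataclass(frozen=True)
class Program:
    initial: Step
    cycle: Tuple[Step, ...]
    n_cycles: int
    final_extension: Step  # assumed to be an extra hold after cycling

    def hold_seconds(self):
        """Total programmed hold time in seconds (ramp times not included)."""
        per_cycle = sum(step.seconds for step in self.cycle)
        return self.initial.seconds + self.n_cycles * per_cycle + self.final_extension.seconds


# First program: 40 cycles, 55 degrees C annealing; initial temperature not stated above.
first_program = Program(
    initial=Step(None, 5 * 60),
    cycle=(Step(95, 30), Step(55, 30), Step(72, 45)),
    n_cycles=40,
    final_extension=Step(72, 10 * 60),
)

# Second program (trnL intron and IGS): 35 cycles, 50 degrees C annealing.
trnl_program = Program(
    initial=Step(95, 5 * 60),
    cycle=(Step(95, 30), Step(50, 30), Step(72, 45)),
    n_cycles=35,
    final_extension=Step(72, 10 * 60),
)

for label, program in (("first", first_program), ("second", trnl_program)):
    print(label, program.hold_seconds() / 60, "min of programmed holds")
```

Under these assumptions the two programs differ only in annealing temperature and cycle number, so the difference in run time comes entirely from the five extra cycles of the first program.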

Figure 1. The positions and directions of the PCR amplification and sequencing primers used in the present study. (a) A diagram of the rDNA pITS2 regions and the designed primers; (b) a diagram of the trnL-trnF chloroplast DNA regions and the designed primers. The head and tail of each arrow indicate the 3′ and 5′ end of each primer, respectively. The boxed areas represent coding regions.

Figure 2. Nucleotide sequence polymorphism of C. osmophloeum Kaneh. Sequence alignments of the trnL intronic region among the different C. osmophloeum Kaneh. varieties used in this study. Asterisks (*) indicate identical base pairs, squares indicate special base pairs, and arrows indicate differing base pairs.

Figure 3. Nucleotide sequence polymorphism of C. osmophloeum Kaneh. The Cinnamomum osmophloeum Kaneh. varieties are shown according to the sequence alignments of the chloroplast trnL-trnF IGS region used in this study.

Figure 4. Nucleotide sequence polymorphism of C. osmophloeum Kaneh. The Cinnamomum osmophloeum Kaneh. varieties are shown according to the sequence alignments of the nuclear noncoding region pITS2 used in this study.

Figure 5. Neighbor-joining trees constructed from the sequences of three molecular markers among Cinnamomum spp. (A) The partial ITS2 sequences, (B) the trnL intron sequences, and (C) the trnL-trnF IGS sequences. The bootstrap values at the nodes of the trees were obtained from 1000 replicates and are shown as percentages. The scale bar indicates the estimated evolutionary distance for the best candidate phylogenetic tree.

Forests 2019, 22
Figure 6. A maximum parsimony tree of the partial ITS2 sequences from the Cinnamomum species used in this study. The bootstrap values at the nodes of the trees were obtained from 1000 replicates and are shown as percentages.

Table 2. The sequences of the primers used in this study to amplify the three noncoding regions.