DNA Barcoding of Endangered and Rarely Occurring Plants in Faifa Mountains (Jazan, Saudi Arabia)

Conservation of plant genetic resources, especially threatened species, is an important topic in biodiversity. It is a field that requires prior knowledge of the target species, in addition to correct identification and taxonomic description. In botany, the identification of plant species traditionally relies on key morphological descriptions and anatomical features. However, in complex species and tree plants, molecular identification can facilitate identification and increase species delimitation accuracy. In the Faifa mountains of Jazan province in Saudi Arabia, 12 rarely occurring plants were recorded and identified using two DNA barcoding regions (i.e., rbcL and ITS). All the samples were successfully amplified, sequenced, and analyzed using the standard DNA barcode protocol, and this resulted in the clear and accurate identification of 11 out of the 12 sampled species. A total of five species were in agreement in terms of both morpho- and molecular-based identification. Four and two species were identified based solely on ITS and rbcL phylogenetics, respectively. The geographic distribution records of the identified species showed that some species were distributed at a distance far from their usual region, while others were reported in proximate regions and localities. Some species were found to be medicinally important and required additional conservation plans.


Introduction
The conservation of threatened species is an essential part of reaching the target set by the Convention on Biological Diversity to improve global biodiversity status [1]. The first crucial step in conserving and managing threatened species is the correct identification and delimitation of the target species [2]. Identification of plant species traditionally relies on morphological characteristics, especially reproductive parts such as flowers and fruits. In trees, these features can be time-consuming to access and may be present only during certain parts of the year. Accurate identification in species-rich or taxonomically complex groups also typically requires expert knowledge, which is not always available, especially in tropical areas. Therefore, the identification and species delimitation of endangered tropical tree species are often difficult [3,4]. For threatened species, whose trade is regulated by the Convention on International Trade of Endangered Species (CITES) and the IUCN Red List of Threatened Species, correct identification is crucial for the enforcement of regulations and future conservation of the species [5]. There is an urgent need to develop efficient and rapid identification tools for illegally harvested species. There is also an urgent need to develop a cost-effective system for easy and reliable identification in fields such as drug discovery, the verification of herbals and other plant-based products, ecological studies, and the national and international regulation of trade involving biological materials (e.g., species regulated by the international conventions) [6].
Closely similar vegetative characteristics can prevent the correct identification of plant species, particularly those under threat. DNA barcoding is a potential method to meet these challenges, in which identification is based on a short universal DNA sequence that exhibits sufficient discriminative genetic variation at the species level [7]. Each barcode is unique to each species and provides clear-cut identification, even between closely related species. Barcoding techniques for species identification of plants using DNA sequences have become popular, and the availability of DNA sequence data for plant species has dramatically expanded. Barcoding is useful for identifying the source species of natural medicines and has also been used to detect foreign matter contamination in natural medicines, especially those in small pieces or powder forms. DNA barcodes are standardized and agreed-upon short sequence(s) of DNA from either the organelle or nuclear genome, which can discriminate one species from others.
The Plant Working Group of the Consortium for the Barcode of Life [8] evaluated the performance of seven plastid barcode loci (psbK-psbI, atpF-atpH, and trnH-psbA spacers, and matK, rbcL, rpoB, and rpoC1 genes) and suggested rbcL and matK as standard DNA barcode regions, in addition to recommending ITS and trnH-psbA as supplementary barcode loci. As opposed to current taxonomic methods, which require the whole plant, preferably in the flowering stage, for authentic identification, DNA barcodes, once standardized, can identify the species even if only a small amount of tissue is available. Therefore, DNA barcoding can become a powerful tool in the hands of enforcement agencies responsible for curbing illegal trade practices or biopiracy [9–11]. Among the applications of DNA barcoding for plant conservation is the identification of illegally traded endangered species from small samples or vegetative specimens [12]. Beyond the identification of species, DNA barcodes could be effectively used in biodiversity tagging, identifying cryptic and polymorphic species, and identifying constituents (e.g., of herbal formulations and foodstuffs) and detecting their adulteration with look-alike substitutes [9,13,14].
The flora of Saudi Arabia is among the richest in biodiversity in the Arabian Peninsula and comprises very important genetic resources of crops and medicinal plants. In addition to its large number of endemic species, the flora is an admixture of elements from Asia, Africa, and the Mediterranean region, but there is a clear lack of species identified by molecular methods [15]. In the present study, we address the applicability of DNA barcoding to identify twelve rarely occurring plant species in the Faifa mountains in Saudi Arabia using two barcode loci (i.e., the rbcL gene and ITS region). DNA barcoding could contribute to generating sufficient knowledge on the taxonomy and distribution of those species, and help in conservation planning efforts for them.

Study Area
The study was conducted in the Faifa Mountains (also known as the Fayfa or Fifa mountains), located in the southwestern region of Jazan province in Saudi Arabia (17°15′ N, 43°06′ E; Figure 1). These undulating mountains range in elevation from 400 m to about 2000 m. The Faifa Mountains are characterized by relatively rich and diverse flora. Approximately 63% of the Jazan flora is found in the Faifa Mountains [16].

Sample Collection
Twelve threatened plant species belonging to nine families were collected from their high-altitude natural habitats during the summer of 2021. A leaf sample was collected from each species (approximately 25 g), and all samples were labeled with a site code and immediately dried with silica gel at room temperature for DNA extraction. Species identification and assignment were independently confirmed before the molecular studies, and were based on an assessment of morphological descriptors (www.Tropicos.org, accessed on 5 June 2022).

DNA Extraction, PCR Amplification, and Sequencing
The total genomic DNA of each sample was isolated from ~200 mg of dried leaves using a WizPrep™ gDNA Mini Kit (Cell/Tissue; Republic of Korea), according to the manufacturer's instructions, with a final elution volume of 50 µL. The isolated DNA was tested for quality by 1% gel electrophoresis and visualized under UV light using the Ingenius3 Gel documentation system (Syngene, UK). Extracted DNA was stored at −20 °C until required for PCR.
The optimized PCR profile for both rbcL and ITS comprised an initial denaturation at 95 °C for 5 min, followed by 35 cycles of denaturation at 94 °C for 1 min, annealing at 50 °C for 30 s, and extension at 72 °C for 90 s, and a final extension at 72 °C for 10 min. The amplified PCR products were visualized on 1.5% agarose gels stained with ethidium bromide. Amplicon sizes were confirmed by comparison with the 1 kb DNA ladder (Genedirex®, Taiwan), and successful amplifications were purified by spin column using an EasyPure PCR Purification Kit (TransGen Biotech, Beijing, China), following the manufacturer's instructions. Purified PCR products were submitted for commercial sequencing in both directions through the Sanger method (Macrogen Inc., Seoul, Republic of Korea).
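As a rough check on the thermal profile described above, the nominal hold time can be added up directly. The following is only an illustrative sketch: the step labels and the `nominal_runtime_seconds` helper are assumptions introduced here, and ramp times between temperatures are ignored, so a real run would take somewhat longer.

```python
# Nominal duration of the PCR profile described in the text (ramp times ignored).
# Step labels and this helper are illustrative, not part of the published protocol.

INITIAL_DENATURATION = ("initial denaturation", 95, 5 * 60)  # (label, deg C, seconds)
CYCLE_STEPS = [
    ("denaturation", 94, 60),
    ("annealing", 50, 30),
    ("extension", 72, 90),
]
FINAL_EXTENSION = ("final extension", 72, 10 * 60)
N_CYCLES = 35


def nominal_runtime_seconds(initial, cycle_steps, final, n_cycles):
    """Sum the hold times: initial step + n_cycles * (cycle steps) + final step."""
    per_cycle = sum(seconds for _, _, seconds in cycle_steps)
    return initial[2] + n_cycles * per_cycle + final[2]


total_s = nominal_runtime_seconds(
    INITIAL_DENATURATION, CYCLE_STEPS, FINAL_EXTENSION, N_CYCLES
)
print(f"{total_s} s = {total_s / 60:.0f} min")  # 7200 s = 120 min
```

Under these assumptions, each cycle holds for 180 s, so the 35 cycles account for 105 of the 120 nominal minutes.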

Sequence Alignment and Data Analysis
After sequencing, the obtained chromatograms were further analyzed using Geneious R10 [19]. To check the quality of each sequence, the peaks corresponding to each nucleotide were examined, and a consensus sequence was produced after trimming the poor-quality sequence ends and aligning the forward and reverse sequences. The consensus sequences were identified using the BLAST search tool in the NCBI database, applying default parameters.
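The end-trimming and read-orientation steps mentioned above can be sketched in a few lines. This is not the Geneious procedure, whose internals are not described here; it is a simple threshold trim on a toy read, and the Q20 cutoff, the function names, and the example read are all assumptions made for illustration.

```python
# Illustrative end-trimming and reverse-complementing of a Sanger read.
# The Q20 threshold, function names, and toy data are assumptions.

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement so a reverse read can be compared to a forward read."""
    return seq.translate(COMPLEMENT)[::-1]


def trim_low_quality_ends(seq, quals, min_q=20):
    """Drop bases from both ends until a base with quality >= min_q is reached."""
    if len(seq) != len(quals):
        raise ValueError("sequence and quality list must have the same length")
    start = 0
    while start < len(seq) and quals[start] < min_q:
        start += 1
    end = len(seq)
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return seq[start:end], quals[start:end]


# Toy forward read with poor-quality bases at both ends.
forward = "NNACGTTGCA"
forward_quals = [5, 8, 30, 35, 40, 40, 38, 36, 12, 9]

trimmed, trimmed_quals = trim_low_quality_ends(forward, forward_quals)
print(trimmed)                      # ACGTTG
print(reverse_complement(trimmed))  # CAACGT
```

A full consensus additionally requires aligning the trimmed forward read against the reverse-complemented reverse read and resolving disagreements, which is left out of this sketch.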
The sequences of the rbcL gene and the ITS region were separately aligned with the BLAST query results using the MAFFT aligner [20], as implemented in Geneious R10. The phylogenies for each gene region were generated using the maximum likelihood (ML) method. The ML tree was computed using FastTree V2 [21], as implemented in Geneious R10. The phylogeny based on the morphological inspection was retrieved using TimeTree of Life (http://timetree.org/, accessed on 5 June 2022).

Morphological Inspection
All the inspected plant samples were identified as flowering plants belonging to the class Magnoliopsida (Angiosperms). The 12 plant samples were evenly divided between two major clades, the Asterids (six species) and the Rosids (six species). In the case of the Asterids, two major groups were identified. An undefined group contained the order Ericales, where the family Ebenaceae was represented by Diospyros mespiliformis and Euclea racemosa. The other major group was defined as the Lamiids, where three orders were represented. The order Gentianales was represented by two species of the family Rubiaceae, namely Pavetta gardeniifolia and Psydrax schimperiana. The other orders of the same clade were each represented by a single species: Nuxia oppositifolia (family Stilbaceae, order Lamiales) and Cordia monoica (family Cordiaceae, order Boraginales; Figure 2).
In the case of the Rosids, two clades were identified: the Malvids and the Fabids. The Malvids contained the order Sapindales, represented by Vepris nobilis (family Rutaceae). The Fabids were represented by three orders: order Malpighiales by a single species, Flueggea virosa (family Phyllanthaceae); order Rosales by two species, Ficus ingens (family Moraceae) and Trema orientalis (family Cannabaceae); and order Fabales by two species of the family Fabaceae, namely Abrus precatorius and Dichrostachys cinerea (Figure 2).
For the rbcL gene, the maximum likelihood tree is shown in Figure 3. The phylogenetic status was highly similar to the published taxonomical information, where two major clades corresponding to the Asterids and the Rosids were distinguished. Species (02) was correctly clustered with other members of the family Ebenaceae and was closely related to species (01) of the same family (order Ericales). Equally, the correct taxonomical assignment was observed for species (06), (07), (09), (10), and (11), where the studied species were close to or clustered with the matched species at high bootstrap support (>0.60). However, incongruences were observed for species (01), (04), (05), and (12), and the cladistic resolution was insufficient to determine the correct species identification, even though the BLAST results confirmed one of them as the top match. Moreover, species (03) was identified differently from what was expected and was positioned as a member of the Amaranthaceae family (order Caryophyllales, Rosids; Figure 3).
Based on the aligned sequences, the maximum likelihood tree was constructed and visualized as a rooted cladogram (Figure 4). The phylogenetic status was not similar to the published taxonomical information. In detail, neither of the two major clades corresponding to the Asterids and the Rosids was distinguished. All species of the same family were correctly clustered together, but the families were not grouped according to known taxonomical information.
Species (01) and (02) were correctly clustered with other members of the family Ebenaceae, being closely related to each other and representing the order Ericales. Equally, the correct taxonomical assignment at the family level was observed for species (04), (05), and (06), representing the rest of the Asterids but clustered separately from the order Ericales. Species (07) and (12) were correctly assigned to their family members, but at a monophyletic position to all other families. A similar pattern was found for species (09) and (10) of the family Fabaceae. The case was different for species (08) and (11), which were clustered within their respective family clades, Phyllanthaceae (order Malpighiales) and Cannabaceae (order Rosales), but were closely related to each other, contrary to what would be expected. Species (12) was clustered with members of the family Moraceae of order Rosales, found at the head of the monophyletic clade. The studied species were all close to or clustered with the matched species at high bootstrap support (>0.86). However, the cladistic resolution was insufficient to determine the correct species identification for species (06) and (12), even though the BLAST results confirmed one of them as the top match. Moreover, species (03) was identified differently from what would be expected and was positioned as a member of the Amaranthaceae family (order Caryophyllales, Rosids; Figure 4).

Morpho-Molecular Comparative Analysis
A comparison between the morphological identification and the DNA barcoding technique showed agreements as well as discrepancies. Based on the BLAST results, the molecular identification using both loci agreed with the morphological inspection for species (04), (08), (09), and (10), identified as Cordia monoica, Flueggea virosa, Abrus precatorius, and Dichrostachys cinerea, respectively. Species (02) was identified by both rbcL and ITS as Euclea divinorum, which differed from the morphological inspection that identified the species as E. racemosa.
Total disagreement between the morphological inspection, ITS, and rbcL was found at the species level for species (05), (06), and (12) (Table 3).The rbcL phylogenetic analysis showed enough genetic variation to delimit species by paraphyletic clustering for species (01) and (06), in contrast to the ITS monophyletic clustering for those two species.The morphological inspection and rbcL were matched for species (01), identified as Diospyros mespiliformis, but not with ITS, which was identified as Diospyros lotus.
In the case of species (06), the rbcL phylogeny-based identification determined it as Pavetta abyssinica, in disagreement with both the morphological inspection and the ITS phylogeny-based identification. Agreement between the morphological inspection and ITS was found for species (07) and (11); by the paraphyletic clustering of both species compared to the rbcL phylogenetics, they were confirmed as Vepris nobilis and Trema orientalis, respectively. Similarly, the ITS cluster showed a paraphyletic structure, where species (05) clustered with Psydrax umbellata. The ITS phylogenetic analysis discriminated species (03), (05), (07), and (11) better than rbcL.
Disagreement between the morphological inspection and both DNA barcoding markers at the genus level was found for species (03). In rbcL, three potential species from the same genus, Achyranthes (i.e., A. longifolia, A. bidentata, and A. aspera), showed monophyletic clustering, as well as the same BLAST percent identity (PI% = 99.5%). In contrast, the ITS cluster was paraphyletic, where species (03) was clustered with A. aspera at a high bootstrap value (0.94). Neither of the two DNA barcoding regions was able to delimit species (12), which was identified as Ficus sp., with no certain match.
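The comparison logic used in this section can be made explicit with a small sketch. The category labels and the `classify_agreement` function are assumptions introduced for illustration and are not the authors' scheme; the three example records use only identifications stated in the text above.

```python
# Illustrative classification of agreement between the three identifications.
# Labels and function name are assumptions, not the authors' scheme.

def classify_agreement(morphology, rbcl, its):
    """Return a short label describing which identifications coincide."""
    if morphology == rbcl == its:
        return "morphology, rbcL and ITS agree"
    if rbcl == its:
        return "rbcL and ITS agree; morphology differs"
    if morphology == rbcl:
        return "morphology and rbcL agree; ITS differs"
    if morphology == its:
        return "morphology and ITS agree; rbcL differs"
    return "no two identifications agree"


# (morphology, rbcL, ITS) for three samples, as reported in the text.
samples = {
    "04": ("Cordia monoica", "Cordia monoica", "Cordia monoica"),
    "02": ("Euclea racemosa", "Euclea divinorum", "Euclea divinorum"),
    "01": ("Diospyros mespiliformis", "Diospyros mespiliformis", "Diospyros lotus"),
}

for sample_id, (morph, rbcl, its) in samples.items():
    print(sample_id, "->", classify_agreement(morph, rbcl, its))
```

Exact string matching is the simplest possible rule; comparing at genus level (the first word of each binomial) would be a natural extension for cases such as species (03).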

Species Listing and Rarity Assessment
The DNA barcoding method was used to identify this group of species and to gather data for understanding their geographical distribution, endemism, and rarity in Saudi Arabia, as well as worldwide. According to the IUCN Red List of Threatened Species in 2021, all the identified species were labeled as "Least Concern" worldwide. However, regarding their distribution in the Faifa mountains, Jazan Province, Saudi Arabia, none of the identified species were recorded, except for C. monoica, D. cinerea, F. virosa, and the uncertain F. ingens. A. precatorius and A. aspera were not reported from the Arabian Peninsula, nor from proximate continents, along with P. umbellata, which was reported only from India. T. orientalis was reported as being widely distributed in sub-Saharan Africa and southern Asia. The two species, P. abyssinica and V. nobilis, were exclusively reported from the middle eastern part of Africa, along with E. divinorum, which extends to Yemen, a neighboring country to the sampling location (Figure 5).

Discussion
By comparing the ITS and rbcL phylogenetic analyses, we found that the ITS tree was better at identifying the samples at the family or lower levels. However, the higher taxonomical ranks (i.e., order or phyla) were perfectly defined based on rbcL. Therefore, the rbcL region might not be suitable enough for DNA barcoding for these families. Although it was more efficient at differentiating between genera and some species, the ITS region could not be used as a single DNA barcode due to the variation within species [13,22–24]. Combining both regions, guided by morphological observations and expert inspection, would help to identify unknown or wild species. In the current analysis, we were able to identify eleven out of the twelve species, four of which were identified as morphologically inspected. Species (12) was morphologically inspected as Ficus ingens, but the barcodes did not match the DNA sequences of this species in the NCBI database. Database and sequence search strategies are key factors that may affect how barcode markers perform in species identification [25].
Regardless of the urgent need to conserve all the rarely occurring plant species we have identified, the medicinal and ethnobotanical role of some of those species should be highlighted. For example, Cordia monoica is a small tree of the Boraginaceae family that grows up to 6 m. The chloroform and ethyl acetate extracts of the roots showed significant anti-ulcer activity when compared with standard Lansoprazole (30 mg/kg); no toxicity signs or symptoms were observed [26]. Another important example revealed by the current study was Diospyros mespiliformis, commonly called Jackal berry or African ebony. It is a tall, evergreen tree (15-50 m high) with a dense, rounded, and buttressed stem. It prefers areas with a continuous water supply, which enhances natural regeneration. Ethnobotanical applications of different parts of the plant have been reported. It is traditionally used in Africa for the treatment of malaria [27]. The roots and bark are utilized in enhancing delivery and as a remedy for pneumonia, malaria, leprosy, syphilis, and diarrhea. Phytochemical investigations on D. mespiliformis have revealed several secondary metabolites, including alkaloids, tannins, saponins, glycosides, anthraquinones, flavonoids, and volatile oils [28]. However, there is limited published data about its safety and systemic evaluation at high doses using the oral route; thus, investigation of its safety profile is needed [29]. Trema orientalis is a fast-growing species that can be harvested for valuable pulpwood in 3-4 years. T. orientalis is among the fastest-growing trees in the tropical and temperate regions, and produces wood that can be widely used by the paper industry [30]. Extracts of different plant tissues also have antioxidant and antibacterial activities [31]. The aerial parts, flowers, bark, and seeds of T. orientalis exhibit various pharmacological activities, including laxative, hypoglycemic, antipyretic, analgesic, anti-microbial, anticonvulsant, and anti-plasmodial activities [32]. DNA barcoding proved to be an efficient tool for species identification, but in the case of medicinal species, metabolic profiling would be essential for determining and authenticating the medicinal importance of a species [33].
The previous records of the detected species were surprising for A. precatorius, A. aspera, and the Indian P. umbellata; these species were scarce in the Faifa mountains and very distant from their common regions and usual climate, which suggests that human factors might have caused their appearance in the sampling locations. A similar explanation might hold for T. orientalis, which has never been recorded in the whole Arabian Peninsula, even in less arid regions (e.g., Yemen). This was unlike P. abyssinica, V. nobilis, and E. divinorum, whose natural region and common climate conditions are proximate to the sampling location in the Jazan area. Natural gene flow and migration patterns are probable explanations for the occurrence of these species in the Faifa mountains (e.g., members of the family Ebenaceae [34]). The species C. monoica, D. cinerea, F. virosa, and the uncertain F. ingens were reported once in Saudi Arabia (e.g., near Abha city, 200 km from Jazan city [35]), but never from the Faifa mountains, a case that reflects the importance of such studies to record, survey, and enrich the database with taxa lists and DNA sequences for the flora of Saudi Arabia. The effects of climate change on the sampling area may contribute to the vegetation diversity detected in the Faifa mountains. Climate change affects species distributions through changes in plant growth and reproduction; it can act directly (e.g., drought, wind) as well as indirectly (e.g., temperature and disease outbreaks) [36].
Based on our findings, we recommend an in situ conservation plan for the studied plant species.This would play a valuable role in maintaining genetic resources, and allowing for the continued adaptation and evolution of migrated plant genotypes.Simultaneously, ex situ techniques would support the conservation and survival of threatened species and the associated genetic diversity.

Figure 1. Geographical map of the Faifa mountains located in Jazan province, Saudi Arabia.

Figure 2. The counts (A) and the literature-based phylogeny (B) of the twelve collected plant samples from the Faifa mountains based on observation and visual inspection.

Figure 3. Maximum-likelihood phylogenetic tree based on the rbcL gene. The taxonomical ranks are defined, while the incongruences are highlighted with gray rectangles. The Asterids clade is highlighted in blue, while the Rosids clade is highlighted in black. Species (03), with unexpected identification, is marked in red.

Figure 4. Maximum-likelihood phylogenetic tree based on the nuclear ITS region. The taxonomical ranks are defined, while the incongruences are highlighted with gray rectangles. Species (03), with unexpected identification, is marked in red.

Figure 5. Worldwide distribution of the identified species from the Faifa mountains, Jazan, Saudi Arabia.

Table 1. BLAST results of the rbcL sequences of 12 rare species of the Faifa mountains.

Table 2. BLAST results of the ITS sequences of 12 rare species of the Faifa mountains.

Table 3. Comparative identification summary based on the morphological inspection, rbcL, and ITS DNA barcodes of twelve rare plants of the Faifa mountains.
* Species identified based on both DNA barcoding regions are written in bold, while the species identified based on one region are underlined.