DNA Barcoding of Penaeidae (Decapoda; Crustacea): Non-Distance-Based Species Delimitation of the Most Economically Important Shrimp Family

Abstract: The Penaeidae family includes some of the most economically and ecologically important marine shrimp, comprising hundreds of species. Despite this importance and diversity, the taxonomic classification of penaeid shrimp has constantly been revised, and issues related to species identification are common. In this study, we implemented DNA barcoding analyses in addition to single-gene species delimitation analyses in order to identify molecular operational taxonomic units (MOTUs) and to generate robust molecular information for penaeid shrimp based on the cytochrome oxidase subunit I (COI) mitochondrial gene. Our final data set includes COI sequences from 112 taxa distributed in 23 genera of penaeids. We employed the general mixed Yule coalescent (GMYC) model, the Poisson tree processes (PTP) model, and the Bayesian PTP model (bPTP) for MOTU delimitation. Intraspecific and interspecific genetic distances were also calculated. Our findings revealed a high level of hidden diversity, showing 143 MOTUs, with 27 nominal species not agreeing with the genetic delimitation obtained here. These data represent potential new species or highly structured populations, showing the importance of including a non-distance-based species delimitation approach in biodiversity studies. The results of this study shed light on Penaeidae biodiversity, addressing important issues about taxonomy and mislabeling in databases and contributing to a better understanding of the group, which can help management policies for shrimp fishery activity in addition to conservation programs.


Introduction
Occurring in all oceans, especially in tropical and subtropical regions, the Penaeidae family includes some of the most important marine shrimp, comprising, up until 2020, 32 genera with 224 species [1], some of which are considered among the most economically important crustaceans in the world [2-4]. World shrimp production, combining catches and shrimp farming, represents the most important fish product traded internationally in terms of value. While catches of shrimp reached new records in recent years, the world aquaculture production of crustaceans in 2018, for instance, amounted to 9.4 million tons, representing 11.4% of the world total aquaculture production of aquatic animals [3]. In several tropical developing countries, shrimp fishing represents the most valuable export product and an important employment-generating activity [5].
Due to their great commercial value, many species of the penaeid group have been economically overexploited, especially in tropical regions [3,5-11]. Such overexploitation can lead to a marked decline in their natural stocks and disrupt the marine environment where it occurs by affecting important ecological functions and ecosystem services, causing changes in competition and predation, loss of spawning biomass, removal of juveniles, reduction in water and habitat quality, modification of species composition and interaction, potential local extinctions, and decreasing biodiversity [12,13].
The implementation of management actions appropriate to regional realities can lead to the recovery of fishing stocks, as shown in a study that used fisheries' performance indicators to evaluate the effect on Colombian shrimp stocks of management reforms (2013-2017 period) aligned with the FAO Code of Conduct for Responsible Fisheries 2013. The results revealed that implementing the regulatory reform improved ecological performance by increasing stock size and decreasing bycatch, while also showing positive social outcomes [14]. In India, a study conducted to evaluate the trends in penaeid shrimp landings over a period of approximately ten years (2001-2010) suggested restricting fishing effort to ensure the sustainability of this resource [9].
Overexploitation and depletion of key penaeid species can negatively impact higher trophic levels with a possible erosion of other fishery resources. Regional studies, such as the analysis of the population dynamics of the commercial species Parapenaeopsis coromandelica from the coastal waters of Teluk Penyu in Indonesia, show that the rate of exploitation per year (E) for females and males is above the sustainable level (E = 0.5) [15]. The presence of the Atlantic seabob Xiphopenaeus kroyeri in high densities in coastal waters of Suriname, for example, plays an important role in trophic ecology, since this species is the main high-density epibenthic organism found up to 30 m deep, acting as a vector for energy from intertidal primary production to secondary subtidal production and serving as prey for fish species [16]. In addition, recent genetic studies show that the number of penaeid species is still underestimated, evidencing hidden cryptic diversity and taxonomic inconsistencies [1]. Delimitation analysis of penaeid shrimps from Southeast Asia indicated 94 putative species within 71 recognized species, including Kishinouyepenaeopsis cornuta, K. incisa, Mierspenaeopsis sculptilis, M. hardwickii, Parapenaeopsis coromandelica, and the giant tiger prawn Penaeus monodon [1]. These results show that even known species of great commercial importance for fishing or shrimp farming, such as P. monodon, can conceal several cryptic species, hindering the implementation of effective measures for the management and conservation of natural stocks and/or of genetic-based selective breeding programs [17].
Despite the ecological and economic importance of penaeid shrimp, the taxonomic classification of this group has constantly been revised, mainly because of disagreements regarding the morphological characters and the criteria adopted for them [4,18]. Morphological distinctions among some penaeid species are often quite subtle, especially among closely related species that are distinguishable only by slight differences in genitalia [2]. Consequently, penaeid species delimitation frequently requires a high level of knowledge and training on specific diagnostic characteristics. For this reason, approaches based on molecular analyses, such as DNA barcoding, have been proposed as an efficient alternative tool to aid species identification of several crustaceans [19,20], including crabs [21,22], lobsters [23], and shrimp and prawn [24-26]. Within penaeids, DNA barcoding has been applied to identify species from a specific site [1,26], discriminate cryptic species [18,27-29], identify juveniles and larvae [30], and characterize a given genus, as reported for Metapenaeopsis [25].
The implementation of DNA barcoding can effectively contribute to a prompt identification of penaeid shrimp, decreasing or even avoiding misidentification and mislabeling of products [31,32]. This approach can also disclose hidden diversity [33], revealing lineages or pointing out new species [18] that would eventually be managed inadequately if they remained unknown. In this study, we employed the DNA barcoding approach combined with three non-distance-based single-gene species delimitation analyses in order to identify consensus MOTUs (molecular operational taxonomic units) and to test the utility of this combined approach to reveal hidden diversity. Our hypothesis is that there is a degree of hidden diversity in the family Penaeidae that could be detected using species delimitation methods. Our study analyzed COI sequences for a large number of penaeid genera and species from around the world, highlighting relevant information for this important economic and ecological resource.

Sampling
Sampling was performed following all legal requirements determined by the governmental laws of each country (Brazil, Mozambique, and Peru) for the ethical fishery of marine shrimp stocks. In total, we sampled 114 specimens from 18 nominal species, i.e., taxonomically valid species, comprising nine genera, that were collected in the Southeast Pacific (5), Southwest Atlantic (7), and Southeast Indian (6) oceans (Table S1). Fragments of muscle tissue (about 1-2 cm³) were fixed in 96% ethanol and kept at 4 °C until DNA extraction. An initial species identification was performed by examining morphological traits following taxonomic references for penaeid shrimp recognition at the species level [2,34-39]. Voucher information for the sampled species is provided in the Supplementary Material (Table S1).
Additionally, 596 DNA barcode sequences were obtained for 108 nominal species from 23 genera through data downloaded from the Barcode of Life Data System (BOLD; available at http://www.boldsystems.org/, accessed in March 2021). BOLD data were filtered by removing records with dubious species identification, i.e., cases in which a single specimen clustered with several individuals of a different species in a preliminary phylogenetic clustering performed before the subsequent analyses.

DNA Extraction, Amplification, and Sequencing
Total DNA was extracted using a standard phenol-chloroform method based on the protocol proposed by Sambrook et al. [40]. Fragments of the cytochrome oxidase subunit I (COI) mitochondrial gene were amplified by polymerase chain reaction (PCR). For the species from the Southwest Atlantic and Southeast Indian oceans, we used a primer pair specifically designed for the penaeid group using the Primer 3 software [41]: the forward primer COIPenF2 (3′-AGATTTACAGTCTATCGCCTA-5′) and the reverse primer COIPenR (3′-ATACCAAATACRGCTCCYATTGA-5′). PCR was carried out in a final volume of 30 µL, using 200 µM dNTPs, 1X PCR buffer, 0.3 µM of each primer, 2.5 mM MgCl2, and 1 U of Taq polymerase, in a Veriti™ Thermal Cycler (Applied Biosystems) programmed with the following parameters: 35 cycles at 94 °C for 50 s, 51 °C for 80 s, and 72 °C for 60 s. For the species from the Southeast Pacific Ocean, we used the primer pair LCO1490 and HCO2198 [42]. PCR was carried out in a final volume of 18 µL, using 125 µM dNTPs, 1X PCR buffer, 0.25 µM of each primer, 2 mM MgCl2, and 1 U of Taq polymerase, in a Veriti™ Thermal Cycler (Applied Biosystems) with the following parameters: 35 cycles at 95 °C for 50 s, 49 °C for 50 s, and 72 °C for 70 s. PCR products were purified using the PEG (polyethylene glycol 20%) protocol [43], and COI amplicons were then Sanger-sequenced on both strands using an ABI3730XL automatic sequencer (Applied Biosystems, Foster City, CA, USA). The obtained sequences were visualized and manually edited using the software Bioedit [44]. Stop codons and indels were checked, and low-quality regions were deleted. Sequence data were deposited in two public databases: GenBank (https://www.ncbi.nlm.nih.gov/genbank/) from the National Center for Biotechnology Information (NCBI) and BOLD (http://www.boldsystems.org/). GenBank accession and BOLD record numbers for the sequences analyzed here are shown in Table S1.
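The COIPenR primer contains the IUPAC ambiguity codes R (A or G) and Y (C or T), so it stands for a small pool of concrete oligonucleotides. The following Python sketch, which is an illustration and not part of the original protocol, expands a degenerate primer into the sequences it encodes; the function name `expand_primer` is hypothetical, and the primer string is copied as printed in the text.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases each code stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer: str) -> list[str]:
    """Return every non-degenerate sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# COIPenR as printed in the text (one R and one Y position).
coi_pen_r = "ATACCAAATACRGCTCCYATTGA"
variants = expand_primer(coi_pen_r)
for seq in variants:
    print(seq)
```

With one two-fold position for R and one for Y, the pool contains 2 × 2 = 4 sequences; a primer without ambiguity codes, such as COIPenF2, expands to itself.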

DNA Barcoding and MOTU Delimitation Analyses
For the MOTU investigation, we employed three species delimitation methods: the general mixed Yule coalescent (GMYC) model with a single threshold [45], the Poisson tree processes (PTP) model, and the Bayesian implementation of the PTP model (bPTP) [46]. As input for these methods, an ultrametric tree was first generated using BEAST 2.6 [47], with a lognormal relaxed clock, a birth-death model, and a GTR+I+G substitution model chosen by jModelTest 2 [48], with 200 million MCMC generations, sampled every 30,000 iterations, and a burn-in of 10%. Convergence and adequate effective sample sizes (greater than 200) were evaluated in Tracer v. 1.7 [49]. The different delimitation outputs were compared using the pipeline SPdel (https://github.com/jolobito/SPdel, accessed in July 2021), which generates a consensus delimitation (Consensus MOTUs) and provides image visualizations. Additionally, intraspecific and interspecific genetic distances under the K2P substitution model were calculated for nominal species and consensus MOTUs, also using SPdel.
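For readers unfamiliar with the K2P (Kimura 2-parameter) model, the distance between two aligned sequences is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites differing by a transition and by a transversion, respectively. The sketch below is a minimal Python illustration of that formula and is not the SPdel implementation; the function name and the toy sequences are assumptions made for the example, and sites with gaps or ambiguous bases are simply skipped.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    Very divergent pairs can make the logarithm undefined (ValueError).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy 10-site alignment with a single C<->T transition (hypothetical data).
d = k2p_distance("ACGTACGTAC", "ACGTACGTAT")
print(f"K2P distance: {d:.4f}")
```

In the toy pair, the uncorrected proportion of differing sites is 0.10, and the K2P value is slightly larger because the model corrects for multiple substitutions at the same site.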

Results
The alignment of COI sequences resulted in fragments of 609 base pairs, with 256 parsimony-informative sites and no gaps. The single-gene species delimitation analyses recovered 144 MOTUs (p < 0.0001) for the GMYC, 143 MOTUs for the PTP, and 142 MOTUs for the bPTP analyses (Figures 1-5). SPdel summarized these results in 143 consensus MOTUs, of which only 85 match valid nominal species (Figures 1-5). The mean intra-group distance, the maximum intra-group distance, the nearest neighbor (NN), and the minimum distance to the NN for both consensus MOTUs and nominal species are shown in the Supplementary Material (Table S2). The overall mean intraspecific distance was 1.3%, the maximum intraspecific distance was 19.7% (Trachysalambria curvirostris), and the minimum interspecific distance was 0% (Farfantepenaeus duorarum, Farfantepenaeus notialis, Melicertus latisulcatus, Melicertus plebejus, Metapenaeopsis palmensis, Metapenaeopsis provocatoria, Metapenaeopsis toloensis, and Metapenaeopsis velutina). No barcode gap was found, and for 16 nominal species the intraspecific distance was higher than the interspecific one (Table S2).
For consensus MOTUs, the overall mean intra-MOTU distance was 0.27%, the maximum intra-MOTU distance was 2.18% for Trachysalambria curvirostris from China and Egypt (Table S3), and the minimum inter-MOTU distance was 1% for both MOTUs of Farfantepenaeus isabelae (Table S3). Only two consensus MOTUs had an intra-MOTU distance slightly higher than the inter-MOTU distance, and no barcode gaps were observed using any delimitation method (Table S3).
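The comparison behind a barcode gap can be illustrated as follows: for each group (nominal species or MOTU), the maximum intra-group distance is compared with the minimum distance to any specimen outside the group, and a group lacks a gap when the former is not smaller than the latter. The Python sketch below works under these assumptions; the specimen identifiers, group names, and distances are invented for illustration and are not taken from Tables S2 or S3, and the function is not the SPdel routine.

```python
from itertools import combinations


def barcode_gap_summary(labels, dist):
    """Return {group: (max intra-group distance, min distance to another group)}.

    labels: dict mapping specimen id -> group name (nominal species or MOTU)
    dist:   dict mapping frozenset({id1, id2}) -> pairwise distance
    """
    summary = {}
    for group in set(labels.values()):
        max_intra, min_inter = 0.0, float("inf")
        for a, b in combinations(labels, 2):
            d = dist[frozenset((a, b))]
            in_a, in_b = labels[a] == group, labels[b] == group
            if in_a and in_b:
                max_intra = max(max_intra, d)
            elif in_a or in_b:
                min_inter = min(min_inter, d)
        summary[group] = (max_intra, min_inter)
    return summary


# Hypothetical specimens and pairwise distances (as proportions, not percent).
labels = {"s1": "sp_A", "s2": "sp_A", "s3": "sp_B", "s4": "sp_B"}
dist = {
    frozenset(("s1", "s2")): 0.050,
    frozenset(("s1", "s3")): 0.020,
    frozenset(("s1", "s4")): 0.030,
    frozenset(("s2", "s3")): 0.060,
    frozenset(("s2", "s4")): 0.070,
    frozenset(("s3", "s4")): 0.004,
}

summary = barcode_gap_summary(labels, dist)
no_gap = sorted(g for g, (intra, inter) in summary.items() if intra >= inter)
print(summary)
print("groups without a barcode gap:", no_gap)
```

In this toy data set, sp_A has a larger distance between its own specimens than to its nearest neighbor, which is the pattern reported above for 16 nominal species.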

Discussion
Our integrated approach, combining DNA barcoding with non-distance-based single-gene species delimitation methods, was efficient in raising issues and pointing out inconsistencies for the penaeids analyzed herein. These methods are advantageous because they are independent of a distance criterion (cut-off threshold value) and do not require prior delimitation. Such a strategy is known to be a powerful tool for MOTU delimitation, improving knowledge about species diversity in different taxa [1,28,50-52]. Here, we present a meaningful COI dataset for different species from the most commercially important shrimp family, highlighting novelties on penaeid biodiversity, including 114 new records for little-studied areas, and demonstrating the effectiveness of species delimitation (Consensus MOTUs) in accelerating the study of biodiversity.
Despite the economic importance of the Penaeidae group, our findings revealed a high level of hidden diversity, showing 143 MOTUs distributed in 112 nominal species, with 27 nominal species not agreeing with the genetic delimitation obtained here. These data represent potential new species or highly structured populations that have probably not been managed or protected adequately by existing fishery policies and legislation. The degree of hidden diversity found herein (143 MOTUs in 112 species, 27.6%) is similar to that found for penaeids from South-East Asian waters (94 MOTUs in 71 species, 32%) [1]; however, the methodology used herein is based on three different coalescent species delimitation methods and the COI barcoding region, while the latter study used ABGD (a species delimitation method based on distance) and bPTP in two different COI regions. Such a high level of hidden diversity likely reflects the already-mentioned difficulties in identifying and discriminating penaeid shrimp species using only morphological characters [2,18,34]. Moreover, for some penaeid genera, species identification is commonly based on the genitalia morphology of adults, requiring a high level of expertise to correctly identify the species [2,18,34]. This often leads to misidentification, compromising the reliability of data available in public databases (e.g., GenBank or BOLD), as observed for Metapenaeopsis mogiensis. Our results showed this species to be polyphyletic, with three unrelated groups, including one more closely related to Megokris sedili than to Metapenaeopsis (Figure 4). These findings indicate, at the least, that a revision of the vouchers reported in the BOLD database is needed to determine the correct identification of these specimens.
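The text does not state how the hidden-diversity percentages are derived. One reading that reproduces values close to those quoted is the excess of MOTUs over nominal species, relative to the number of nominal species. The short Python sketch below works through the arithmetic under that assumption; the function name is hypothetical.

```python
def excess_motu_percent(n_motus: int, n_nominal: int) -> float:
    """Percentage by which the MOTU count exceeds the nominal species count."""
    return 100.0 * (n_motus - n_nominal) / n_nominal


this_study = excess_motu_percent(143, 112)     # counts reported in this study
southeast_asia = excess_motu_percent(94, 71)   # counts cited from [1]
print(f"{this_study:.2f}% vs {southeast_asia:.2f}%")
```

Under this reading, the two data sets give approximately 27.7% and 32.4%, in line with the figures compared in the paragraph above.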
The potential presence of cryptic species also challenges the correct discrimination of species with similar morphology. For Melicertus plebejus and M. latisulcatus, for example, we observed three MOTUs (Figure 1), two of them including only specimens from one nominal species and a third one including specimens of both species. Indeed, these penaeids share a similar morphology and coloration, and they have been considered sister species [53]. We observed this third MOTU to be more closely related to the MOTU including only M. plebejus (Figure 1), so it likely represents a potential cryptic species hidden by a similar morphology.
A similar case involves Farfantepenaeus duorarum and F. notialis (Figure 2), which were distributed here in two MOTUs: one including only F. notialis specimens and a different clade clustering both F. notialis and F. duorarum specimens. The minimum inter-MOTU genetic distance between these groups was 1.69%. In fact, F. notialis and F. duorarum are morphologically very similar, with F. notialis initially described as a subspecies of F. duorarum [4]. In previous molecular studies, the validity of these species was questioned due to the low genetic distance and the lack of reciprocal monophyly [54]. Our results, using species delimitation methods, support the existence of two MOTUs, suggesting the existence of different species but not supporting the current taxonomic identification of the F. notialis and F. duorarum specimens sampled. A study including samples from the entire geographical distribution of both species and a conclusive taxonomic diagnosis is therefore still necessary to correctly delimit these taxa.
The currently valid nominal species subdivided here into two or more MOTUs, correlated with geography and with high intra-MOTU genetic distance, may include potential cryptic species, indicating the need for further taxonomic studies [55]. This is the case for Funchalia villosa, Litopenaeus stylirostris, Metapenaeopsis toloensis, Mierspenaeopsis hardwickii, Parapenaeus investigatoris, and Rimapenaeus constrictus (Table 1). Overall, an integrative taxonomic approach, also including broader sampling, is imperative to understand the meaning of the findings raised here for these species.  [61], and Xiphopenaeus riveti [34]. Additionally, our species delimitation analyses supported molecular groups distinct from those previously reported in molecular studies for four species (Table 1): Farfantepenaeus isabelae [54], Metapenaeopsis palmensis [1], Penaeus monodon [57], and Trachysalambria curvirostris [62]. Some of these species, such as T. curvirostris, are considered morphological species complexes [63].
For Metapenaeopsis provocatoria, M. velutina, and M. quinquedentata, the analyses grouped the three species within a single MOTU, with a maximum intra-MOTU distance of 0.49%. As discussed by Cheng et al. [25], the three nominal species are morphologically distinguishable, but some morphological traits might vary depending on environmental conditions. An integrative study with more representative sampling and a larger number of molecular markers is therefore still necessary to address the taxonomic status of these species. Additionally, the possibility of misidentification of these samples should be explored.
In sum, our findings indicate that the family Penaeidae still holds a large unknown diversity, revealed here by combining the DNA barcoding approach with robust species delimitation methods, which shows the importance of including this approach in biodiversity studies. The data from this study shed light on penaeid biodiversity, addressing important issues about taxonomy and mislabeling in databases and contributing to a better understanding of the group, which can help management policies for shrimp fishery activity in addition to conservation programs.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/d13100460/s1, Table S1: Sampling information, BOLD process ID and GenBank accession for specimens included in the analysis. Table S2: Genetic K2P distances of Penaeidae species. The mean and the maximum of intra-group distances, the nearest neighbor (NN), and the minimum distance to the NN. Table S3: Genetic K2P distances of Penaeidae MOTUs. The mean and the maximum of intra-group distances, the nearest neighbor (NN), and the minimum distance to the NN.
Data Availability Statement: All sequence files are available from the BOLD database (dataset DS-PENAEID) and GenBank (Table S1).