A Rapid Method for the Identification of Fresh and Processed Pagellus erythrinus Species against Frauds

The commercialization of porgies or seabreams of the family Sparidae has greatly increased in the last decade, and some valuable species have become subject to seafood substitution. DNA regions currently used for fish species identification in fresh and processed products belong to the mitochondrial (mt) genes cytochrome b (Cytb), cytochrome c oxidase I (COI), 16S and 12S. However, these markers amplify fragments with low divergence within and between some species, failing to provide informative barcodes. We adopted comparative mitogenomics, through the analysis of complete mtDNA sequences, as a suitable approach for finding new barcoding markers. The intent is to develop a specific and rapid assay for the identification of the common pandora Pagellus erythrinus, a sparid species frequently subject to fraudulent replacement. The genetic diversity analysis (Hamming distance, p-genetic distance, gene-by-gene sequence variability) of 16 sparid mtDNA genomes highlighted the discriminating potential of a 291 bp NAD2 gene fragment. A pair of species-specific primers was successfully designed and tested by end-point and real-time PCR, achieving amplification only in P. erythrinus among several fish species. The use of the NAD2 barcoding marker provides a rapid presence/absence method for the identification of P. erythrinus.


Introduction
Food frauds are considered a major safety, quality, and economic problem worldwide, with rising awareness and concern among consumers [1][2][3][4][5]. In Europe, mislabeling is responsible for 41.8% of food fraud violations, and fish products have a central role in this scenario [6]. The International Food Safety Authorities Network (INFOSAN) recently pointed out that there is a common desire from all countries for more technical support regarding food fraud prevention and management [1]. Fish and fish products are the products most subject to fraud in the European Union, with intentional mislabeling being the main type of violation [6]. The common pandora (Pagellus erythrinus Linnaeus, 1758) is one of the most commercially caught Sparidae species in the Mediterranean Sea and the Atlantic Ocean. The commercialization of P. erythrinus deriving from fishing activity expanded from 9.34 to 13.90 tonnes (+48.72%) between 2006 and 2016, while aquaculture production progressively dropped from 197.00 tonnes in 2006 to 0.04 in 2016 [7]. Most of the wild common pandora catches come from Mediterranean fishery grounds (12.25 tonnes), mainly in Italy, Libya, Spain, and Tunisia, with a small share from Atlantic countries (1.65 tonnes) such as Morocco [7]. Fish farm production of P. erythrinus was 127.27 tonnes in 2014 in Greece, and 0.23 tonnes in 2016 in Cyprus [8]. Owing to its increasing appreciation among European consumers, P. erythrinus is regrettably subject to replacement with less valuable species. Fraudulent substitution of the common pandora occurs not only when prepared and processed specimens have lost their external characters but also with whole fish, due to the morphological resemblance to other sparid species (e.g., Pagellus acarne, Pagellus bellottii, Pagrus pagrus, Lutjanidae spp.) [9][10][11][12][13].
The cytochrome b (Cytb), cytochrome c oxidase I (COI), 16S, and 12S genes of the mitochondrial (mt) DNA are traditionally used for fish species identification [14][15][16]. However, these universal barcodes are not always effective for unambiguous species authentication or for forensic approaches such as fraud detection [14,17,18]. Their limited resolution at deep nodes emerges especially when fish species have a high degree of genetic homology. Traditional markers are inadequate, especially when they amplify fragments in which species differ only by point mutations [19]. We have recently shown that the analysis of the complete mtDNA (mitogenomics) provides a useful approach for identifying fast and adequate barcoding markers. In particular, our previous study showed that the NAD5 gene, compared to traditional markers, possesses a higher discrimination capacity for species of the family Sparidae [20]. NAD5 codes for subunit 5 of NADH dehydrogenase, part of the mitochondrial membrane, involved in electron transport in the respiratory chain [21]. This gene consists of short variable sequence regions associated with conserved areas, which made it possible to design universal primers for sparids that amplify a 265 bp fragment in all species.
Here, a more detailed study of the genetic distance between the complete mtDNA of species belonging to the family Sparidae was carried out using a gene-by-gene approach. The updated mtDNA comparison identified NAD2 as a new barcoding gene that can discriminate species within the same genus, owing to its extremely high level of nucleotide sequence divergence. Our applied research then shifted toward the development of a presence/absence assay for discriminating P. erythrinus. In this study, we propose a NAD2 fragment as a specific marker for the common pandora. The NAD2 gene was previously investigated as a marker for population studies of plants, parasites and animals [22][23][24][25]. However, as far as we know, this gene has not been used for fish species identification against frauds, and our results show that its use warrants further investigation. We designed species-specific primers and tested their performance in both classical and Real-Time PCR. The present research simplifies common pandora authentication, since a sequencing-free classical PCR or an RT-PCR without electrophoresis completes the analysis and significantly reduces the time and costs needed for correct species identification.

Fish Samples
Ten common pandora specimens from different FAO areas (27, 34, and 37) were sampled and used to test the species-specificity of the PCR primers (Table 1, Figure 1). The geographical coordinates of the fishing spots were provided by fish market operators. Under the traceability protocol established by EC Reg. 1224/2009, each fishing company with boats over 10 m must keep an electronic logbook recording the point where the nets are lowered (trawl fishing). In FAO area 37, spanning most of the species distribution (www.aquamaps.org), eight specimens were harvested from as many Geographical Sub-Areas (GSA) (Figure 1). The evaluation of primer specificity was extended to 26 other fish species, as reported in Table 2.

All additional species were provided by the Pozzuoli (Naples, Italy) and Salerno (Italy) fish markets. Specimens were labeled and stored on board at −20 °C, transported in isothermal conditions to the fish markets, and then to the research laboratory. Classification at the species level was carried out on the basis of the anatomical and morphological characteristics of the fish at the Department of Veterinary Medicine and Animal Production, University of Naples Federico II, Naples (Italy).

Total Genomic DNA Extraction
Total DNA extraction for each sample was performed in duplicate from muscle tissue using the DNeasy Blood & Tissue Kit (Qiagen, Hilden, Germany) according to the procedure proposed by the manufacturer [27]. DNA concentration was determined with a Nanodrop 2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). The DNA concentration was 40 ng/µL, and the purity, expressed as the A260/A280 ratio, was 1.8-2.0. DNA quantity and quality were also evaluated by electrophoretic analysis on a 1% agarose gel.

Comparative Analysis of mtDNA Complete Sequences
Sixteen complete mitogenome sequences of sparid species were analyzed. Ten of the sixteen were available in GenBank, and six were sequenced and deposited in NCBI by this research group (Table 4). The mtDNA sequences were analyzed and compared using several bioinformatics tools, with the aim of finding species-specific gene fragments for P. erythrinus identification. Unipro UGENE software [28] was used to perform the alignment. To evaluate the genetic divergence among genes and species, the Hamming distance algorithm was used [29]. The p-genetic distance within species, the nucleotide sequence variability, pairwise and multiple alignments, and gene divergence were determined as previously described [20,30,31].
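As a minimal sketch of the two distance measures named above, the following Python snippet computes the Hamming distance (number of differing positions) and the p-distance (proportion of differing positions) between two aligned sequences. The function names, the choice to skip columns containing a gap or an N in the p-distance, and the toy sequences are assumptions made for illustration; this is not the UGENE implementation and the sequences are not data from the study.

```python
def hamming_distance(seq_a: str, seq_b: str) -> int:
    """Count the positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a.upper(), seq_b.upper()))


def p_distance(seq_a: str, seq_b: str, skip: str = "-N") -> float:
    """Proportion of differing sites, ignoring columns with a gap or N.

    Skipping such columns is an assumption of this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a not in skip and b not in skip
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


# Toy aligned fragments (hypothetical, not real NAD2 sequences).
ref = "ATGCCGTTAACG"
other = "ATGTCGTTGAC-"

print(hamming_distance(ref, other))      # raw count, the gap counts as a difference
print(round(p_distance(ref, other), 3))  # proportion over gap-free columns only
```

Multiplying the p-distance by 100 gives a percentage dissimilarity of the kind reported for gene-by-gene comparisons.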

NAD2 Fragment Amplification and Sequence Analysis
A pair of P. erythrinus-specific NAD2 primers amplifying a 291 bp fragment was designed by eye after multiple alignment of the complete mtDNA sequences of the 16 sparid species using BioEdit Sequence Alignment Editor [46]. In particular, primer design was carried out so that the nucleotide base at the 3′ end was species-specific. Melting temperature (Tm), secondary structure, self-annealing, and inter-primer binding were verified using Multiple Primer Analyzer (Thermo Fisher Scientific, Waltham, MA, USA). The efficiency of the NAD2 primers for sparid species identification was further verified in silico [28].
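The 3′-end criterion described above can be illustrated with a short Python sketch. It checks that a forward primer matches the target species at its 3′ terminal base and mismatches every non-target species at that position. The function names, the `window` parameter, and all sequences are hypothetical and are not the primers or alignments used in the study. Both primer and binding site are assumed to be written 5′ to 3′ on the same strand; a reverse primer would first need its binding site reverse-complemented.

```python
def three_prime_mismatches(primer: str, site: str, window: int = 1) -> int:
    """Count mismatches within the last `window` bases (the 3' end) of a primer
    compared with an equally long binding site on the same strand."""
    if len(primer) != len(site):
        raise ValueError("primer and binding site must have the same length")
    p, s = primer.upper(), site.upper()
    return sum(a != b for a, b in zip(p[-window:], s[-window:]))


def is_species_specific(primer: str, target_site: str,
                        non_target_sites: dict, window: int = 1) -> bool:
    """True if the 3' end matches the target and mismatches all non-targets."""
    if three_prime_mismatches(primer, target_site, window) != 0:
        return False
    return all(
        three_prime_mismatches(primer, site, window) > 0
        for site in non_target_sites.values()
    )


# Hypothetical 8-nt example (real primers are longer).
primer = "ACGTTGCA"
target = "ACGTTGCA"
others = {"species_B": "ACGTTGCG", "species_C": "ACATTGCT"}

print(is_species_specific(primer, target, others))

# A non-target sharing the 3' terminal base breaks specificity under this rule.
others_with_match = dict(others, species_D="ACGATGCA")
print(is_species_specific(primer, target, others_with_match))
```

A check of this kind only screens the terminal base or bases; it does not replace the Tm, secondary-structure, and in silico verification steps described in the text.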
PCR amplifications were performed as reported in Ceruso et al., 2019 [20], with the annealing temperature at 57 °C (291 bp) for 1 min. PCR products were purified using the QIAquick PCR Purification Kit (Qiagen). PCR reactions were carried out on mtDNA extracted from 10 fresh P. erythrinus specimens originating from different GSA/FAO areas (Table 1, Figure 1) and from fresh specimens belonging to several other species of commercial interest (Tables 2 and 3).
To confirm the correct amplification of the NAD2 gene fragment, amplicon sequences were determined with the Sanger method using the Automated Capillary Electrophoresis Sequencer 3730 DNA Analyzer (Applied Biosystems, Foster City, CA, USA) at the Molecular Biology Service of the Stazione Zoologica Anton Dohrn. NAD2 sequences were evaluated with the BioEdit Sequence Alignment Editor. The concordance between morphological and molecular analyses was assessed by running a BLAST analysis of the obtained sequences against GenBank for species identification [47].

Real-Time PCR
To assess the presence or absence of an amplicon with a simple yes/no answer, similar to an end-point PCR followed by gel electrophoresis, we tested primer sensitivity and specificity in a Real-Time PCR (RT-PCR) protocol. For some labs, it could be useful to assemble a reaction, load it into a single instrument, and obtain the needed information by visualizing species-specific amplification without the additional electrophoresis step. The DNA samples were used at 1:10 dilution as a template for RT-PCR experiments performed in a Viia7 real-time PCR system (Applied Biosystems), with the primers described before [48]. The PCR volume of each sample was 10 µL, with 5 µL of 2× RT-PCR SYBR Green Master Mix (Thermo Fisher), 0.05 or 0.1 pmol/µL of each primer, and 2 µL of 1:10 diluted DNA template. The expected fragment of 291 bp in length had a theoretical melting temperature of about 87 °C, calculated by Endmemo (http://www.endmemo.com/bio/tm.php). A slight difference in RT-PCR experiments is expected, due to the specific instrument used and the reaction conditions. In the set-up experiment, a specific amplicon with a melting temperature of about 80 °C was detected (data not shown), so we decided to test all fish specimens. Experiments were performed in triplicate. The diagram was prepared with Excel.

Pagellus erythrinus mtDNA Comparative Data
The Hamming distance comparison showed that the genetic dissimilarity between the mtDNA of P. erythrinus and that of other sparid species varies from a minimum of 10% (Dentex dentex) to a maximum of 19% (Acanthopagrus latus and Rhabdosargus sarba) (Figure 2). In Figure 2, the species are ordered clockwise from the closest to the farthest from P. erythrinus. Statistical analysis was reported in Ceruso et al., 2019 [20].

NAD2 Amplification and Analysis
Following the results of the sparid mitogenome comparison, primers for amplifying species-specific nucleotide sequences were designed on the NAD2 gene. The high degree of nucleotide sequence variability of this gene among sparid species made it possible to design primers amplifying a 291 bp fragment (from 303 to 593 nt) so that the nucleotide base at the 3′ end was species-specific for P. erythrinus (Table 5, Figure 6). As reported in the Materials and Methods Section 2.4, the specificity of the designed primers was first tested in silico against all available Sparidae mitogenomes (Table 4).
Endpoint-PCR amplification was obtained in all 10 genomic DNA samples of P. erythrinus. No amplification occurred in the DNAs of other fish species (Figure 7). The ten P. erythrinus amplicons were sequenced to confirm the correct amplification of the 291 bp NAD2 fragment. Amplicon sequence comparison with databases showed accurate species identification, with similarity scores of NAD2 sequences ranging between 98% and 100%.
The value of intraspecific genetic variability ranged between 0% and 0.8%, with three different nucleotides found in two out of ten specimens (specimens Pe9 and Pe10 in Table 1).

NAD2 Amplification by RT-PCR
All the P. erythrinus samples gave an amplification with a Ct (Cycle Threshold) between 23 and 31 (a low Ct indicates the presence of the specific fragment used in the experiment) and a Tm between 78 °C and 79 °C (as expected). All the other species gave an amplification with Ct between 36 and 40 (where 40 means undetectable signal, like the negative control) and Tm that varied between 60 °C (the negative control) and 72 °C (Figure 8), indicating the presence of a very small amount of a non-specific product.
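The presence/absence reading of these RT-PCR results can be sketched in Python as a simple rule combining Ct and Tm. The cutoff values below (Ct below 35, Tm between 77 °C and 80 °C) are assumptions chosen only because they fall between the ranges reported above for P. erythrinus and for the other species; they are not thresholds stated in the study, and the well values are hypothetical examples within the reported ranges.

```python
from dataclasses import dataclass

CT_CUTOFF = 35.0          # assumed cutoff, between the reported 23-31 and 36-40
TM_WINDOW = (77.0, 80.0)  # assumed window around the reported 78-79 degrees C


@dataclass
class Well:
    sample: str
    ct: float  # cycle threshold
    tm: float  # melting temperature of the product, in degrees C


def call_presence(well: Well,
                  ct_cutoff: float = CT_CUTOFF,
                  tm_window: tuple = TM_WINDOW) -> bool:
    """Return True when both Ct and Tm are consistent with the specific amplicon."""
    low, high = tm_window
    return well.ct < ct_cutoff and low <= well.tm <= high


# Hypothetical wells, not measurements from the study.
wells = [
    Well("Pe_example", 25.0, 78.5),
    Well("other_species_example", 38.0, 70.0),
    Well("negative_control", 40.0, 60.0),
]

calls = {w.sample: call_presence(w) for w in wells}
print(calls)
```

Requiring both conditions means that a late, non-specific product with a low Tm is not read as a positive, which mirrors how the two groups of samples separate in the results.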

Discussion
Current studies on DNA-based fish species identification routinely use mitochondrial markers (e.g., COI, Cytb, 12S and 16S), assuming that they can discriminate all fish families, genera or species. Nevertheless, the poor reliability of the most widely used mitochondrial genes for barcoding and for constructing phylogenetic trees has been outlined earlier [49][50][51][52]. The high degree of genetic homology among fish species of the same family may limit the resolution needed for correct identification, since species are differentiated by point mutations. Currently, mitochondrial genomes can be rapidly obtained from genome or transcriptome datasets [27], but comparative mitogenomics has rarely been used for the discovery of new specific genes useful as markers for species and strain recognition [19,53,54].
The first innovation of this paper is that a vast amount of data was explored, considering the complete mtDNA of sixteen Sparidae species, to verify the presence of genes and gene fragments with more genetic divergence than standard markers. The investigation of the complete mitochondrial genome of the most commercially important sparid species thus allowed us to identify a new barcoding marker. The new approach showed that the "top five genes" with the highest values of genetic dissimilarity among sparids were 1-ATP6, 2-NAD2, 3-NAD4, 4-NAD6, and 5-NAD5. It is important to note that two of these five genes (ATP6-684 bp and NAD6-522 bp) are much shorter than the others (NAD2-1047 bp, NAD4-1381 bp and NAD5-1839 bp). In the three longer genes, the degree of genetic dissimilarity appears to have greater statistical significance. Another consideration is that this ranking does not include the genes currently used for sparid species identification (COI, Cytb, 12S and 16S).

Another important consideration and innovation of our study is that the comparison made it possible to find a new gene (NAD2), never considered before for sparid barcoding purposes, with an incisive, rapid, and unequivocal discriminating ability. The barcoding value of all the NAD group genes was previously reported in fish families on the basis of bioinformatic analysis, but the NAD5 gene was the one selected as a potential marker [20,55].
One of the main features that differentiates the NAD2 gene from NAD5 is the distribution of genetic diversity. NAD5 has highly divergent areas alternating with areas that are very similar between species. This made it possible to design primers that amplified across all species, so that species could be distinguished through sequencing only. In NAD2, instead, the high degree of genetic divergence is spread throughout the gene, which allows the design of primers whose 3′ end differs from the corresponding position in the other species and is therefore specific for P. erythrinus. This NAD2 feature may also make it possible to design species-specific primers for other sparid species, bringing great benefits to the research field of species identification against frauds.
Finally, this research expands knowledge of the genetic dissimilarity among sparids. Previous research on the genetic distance among species in the family Sparidae focused on single genes (e.g., Cytb), not on the complete mtDNA [56].
The new approach based on the complete mtDNA genome has revealed the potential information content on genetic variation and lineage divergence of Sparidae species. The comparative mitogenomic analysis reveals the close relationship of P. erythrinus with other sparid species, with genetic divergence ranging from 10 to 19%. Gene-by-gene Hamming distance analysis identifies ATP6, NAD2, NAD4, NAD5, and NAD6 as the least conserved mitochondrial genes among porgies and seabreams.
Our results made it possible to find a species-specific barcode marker for P. erythrinus, to be used in the prevention of frauds based on species substitution. This research represents a new way to approach studies on fish species identification and should be applied to other fish species to prevent and manage frauds.

Conclusions
An in-depth analysis of the complete mitogenomes of sparid fishes provides a novel species-specific barcoding marker for the identification of the common pandora P. erythrinus, a seafood industry product commonly subject to frauds by replacement. This study reports a consistent and fast presence/absence visual assay to prevent frauds by substitution of P. erythrinus, through a simple end-point PCR that does not need sequencing. The RT-PCR confirmed primer specificity and sensitivity and does not require electrophoresis, but it requires more laboratory skills.
The Official Controls Regulation (EU) 2017/625 gives increasing importance to consumer protection and safety against fraudulent practices along the agri-food chain. The replacement and mislabeling of fish species with others of lower commercial value is a growing problem in the production and distribution chain of fishery products. Effective species identification in fresh, frozen, and treated products may contribute to the "molecular traceability" of seafood, in agreement with Regulation (EU) 1379/2013 (European Commission, Brussels, Belgium, 2013). Competent national authorities could make food control and authentication activities more effective through the full use of DNA test analysis, in order to discourage fraudulent species replacement.