Exploring Deep-Sea Biodiversity in the Porcupine Bank (NE Atlantic) through Fish Integrative Taxonomy

Abstract: This study combined morphological and molecular approaches to the species assignment of several rare or poorly known deep-water fishes caught between 549 and 1371 m depth during a Spanish bottom trawl survey in the Porcupine Bank, west of Ireland. The following fish species were identified: Nessorhamphus ingolfianus (Schmidt, 1912); Borostomias antarcticus (Lönnberg, 1905); Scopelosaurus lepidus (Krefft and Maul, 1955); Bathypterois dubius Vaillant, 1888; Evermannella balbo (Risso, 1820); Antimora rostrata (Günther, 1878); Melanonus zugmayeri Norman, 1930; Lyconus brachycolus Holt and Byrne, 1906; Paraliparis hystrix Merrett, 1983; Neocyttus helgae (Holt and Byrne, 1908); Platyberyx opalescens Zugmayer, 1911; Howella atlantica Post and Quéro, 1991; Lycodes terraenovae Collett, 1896; and Pseudoscopelus altipinnis Parr, 1933. The presence of L. brachycolus, P. opalescens and P. altipinnis is reported for the first time in the Bank. The DNA barcoding results were largely consistent with morphological identification in 10 species, but four did not fit the current taxonomy, indicating cases of potential cryptic speciation, misidentification, synonymy or recent diversification. Among them, the results strongly suggest that Paraliparis garmani Burke, 1912 and P. hystrix are conspecific, making P. hystrix a junior synonym of P. garmani.


Introduction
The deep ocean makes up 95% of the volume of the seas and is the largest and least explored biome on the planet [1]. Organisms in this environment have evolved to adapt to the high pressures, almost perpetual darkness and low food availability that characterize much of the deep sea. The term deep-sea fishes is often used to refer to fish that live in darkness below the sunlit surface waters, at approximately 200 m water depth, i.e., below the epipelagic or photic zone of the sea or the continental shelf break. However, there is no rigid definition, and there are no agreed criteria for establishing the upper boundary of deep-sea habitat. For example, Glover et al. [2] defined a deep-sea species as one that lives at more than 500 m depth, since at this distance from the surface the seasonal variation of physical parameters and the influence of sunlight are minimal.
As deep-sea fishes have been poorly sampled globally, overall knowledge of fish distributions and of the drivers of community composition and diversity remains incomplete [3]. The capture of rare or poorly known deep-sea fishes is of special interest as it adds to knowledge of their basic taxonomy, biology and distribution [4]. Such records allow a better understanding of species distribution ranges and morphological variations, as well as of the mechanisms involved in connectivity between areas [5,6].
'Integrative taxonomy' has emerged as an important tool for species delimitation and thus for a better understanding of the composition of deep-sea ichthyofauna. It is defined as the science that aims to delimit the units of life's diversity from multiple and complementary perspectives [7]. DNA barcoding has been successfully integrated with traditional morphological analysis, and has enhanced it, in systematic studies of fishes [8,9]. Such integrative studies can highlight identification mistakes and incongruities between molecular and morphological results, helping to reveal cryptic species, identify immature specimens and clarify problems of synonymy [10]. Gaps and inconsistencies in reference DNA databases can make accurate identification of fishes to species level difficult, suggesting the need to reinforce DNA barcoding reference datasets [11]. Without reference sequences from voucher specimens identified by qualified taxonomists, there is no reliable library for comparing newly generated query sequences [12].
Conditions in deep-sea ecosystems are more uniform and constant than those in shallower waters, and this is the main reason given for the wider observed global distribution of deep-sea fishes. Under similar selective pressures, deep-sea fishes have convergently evolved similar morphologies, often developing analogous structures as adaptations to similar environments [13]. This phenomenon makes correct identification of these fishes difficult, which is an obstacle to a complete understanding of their true biodiversity. Molecular taxonomy has revealed cases of intraspecific divergence compatible with different species, exposing cases of possible cryptic species or leading to the resurrection of synonyms [14].
The Porcupine Bank is located in the north-eastern Atlantic, 200 km off the west coast of Ireland, forming a seamount-like structure, with its related anticyclonic structures. The fish fauna of this bank and the adjacent areas is well reported in the ichthyological literature [15,16], but the occurrence of unreported fishes is not unusual [8,17], showing that knowledge of this environment is far from complete.
The aim of this manuscript is, firstly, to document the presence of poorly known deep-water fishes in the Porcupine Bank and, secondly, to confirm the taxonomy of these species by means of an integrative taxonomy approach, combining morphological examination with the molecular DNA barcoding method. These objectives are in line with the recommendations to inform the development of the Decadal Ocean Actions focused on the deep sea, and help to address the question of what the diversity of life in this environment is [18]. In particular, objective 1, "Capacity development", states that all actions should commit to sharing specimens, including whole animals, tissue, barcoding and environmental DNA samples, and to investing in the deposition of specimens in established and regionally relevant institutions that have recognized charters to support the permanent storage and care of archived specimens; it also recommends open access publication of research and data whenever possible [19].

Study Site and Sampling
A Spanish bottom trawl research survey has been carried out annually since 2001 in the Porcupine Bank (ICES Divisions 7c and 7k), on board the R/V Vizconde de Eza, to study the distribution, relative abundance and biological parameters of commercial fish. The survey covers an area that extends from longitude 12° W to 15° W and from latitude 51° N to 54° N, following the standard methodology for the IBTS North Eastern Atlantic Surveys. In September-October 2020, during the 2020 Spanish Bottom Trawl Survey on the Porcupine Bank (SP-PORC-Q3), a total of 91 bottom trawls of 20 min duration were made between 190 and 1400 m depth using a Baca-GAV 39/52 with a cod-end mesh size of 20 mm (Figure 1). Twenty-five specimens caught between 549 and 1371 m depth were selected for further study. After removing tissue samples for molecular analysis, the specimens were stored at −28 °C and deposited in the fish collection of the Museo Luis Iglesias de Ciencias Naturais in Santiago de Compostela (MHNUSC).

Morphological Analysis
First, a preliminary onboard identification was carried out on fresh specimens. After the scientific survey, specimens were defrosted and definitively identified to species level following ichthyological guides and keys [20,21]. The main morphometric and meristic characters were recorded according to the literature as follows: Total length (TL); Standard length (SL); Head length (HL); Upper jaw length (JL); Pre-orbital length (PO); Eye diameter (ED); Post-orbital length (POL); Inter-orbital length (IO); Predorsal length (PD); Pre-first dorsal length (PD1); Pre-second dorsal length (PD2); Prepectoral length (PP); Preanal length (PA); Dorsal fin base length (DB); First dorsal fin base length (DB1); Second dorsal fin base length (DB2); Anal fin base length (AB); Pre-pelvic length (PV); Pectoral fin length (PL); Pelvic fin length (VL); Maximum body depth (BH); Caudal peduncle depth (CP); Number of rays in dorsal fin (D); Number of rays in first dorsal fin (D1); Number of rays in second dorsal fin (D2); Number of rays in pectoral fin (P); Number of rays in anal fin (A); Number of rays in caudal fin (C); Branchiostegal rays (BR); Gill-rakers (GR); Scales in the lateral line (SLL). With the exception of TL and SL, measurements are distances perpendicular to the length of the fish, measured with a digital calliper to the nearest mm. All measurements are expressed as a percentage of standard length (%SL). Descriptive data are reported individually for species represented by one or two specimens, and as ranges when there are three or more.
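The conversion of raw measurements to %SL, and the reporting rule for one or two specimens versus three or more, can be sketched as follows. This is only an illustrative sketch: the function names and the specimen values are hypothetical and are not data from the study.

```python
def percent_of_sl(measurement_mm: float, standard_length_mm: float) -> float:
    """Express a body measurement as a percentage of standard length (%SL)."""
    if standard_length_mm <= 0:
        raise ValueError("standard length must be positive")
    return 100.0 * measurement_mm / standard_length_mm


def summarize(values):
    """Return individual values for 1-2 specimens, a (min, max) range for 3 or more."""
    if len(values) >= 3:
        return (min(values), max(values))
    return list(values)


# Hypothetical (head length, standard length) pairs in mm; not data from the study.
specimens = [(30.0, 120.0), (33.0, 150.0), (52.0, 200.0)]
hl_percent = [round(percent_of_sl(hl, sl), 1) for hl, sl in specimens]
print(summarize(hl_percent))
```

Expressing each character relative to SL makes specimens of different sizes comparable, which is why ranges across specimens are meaningful.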

DNA Extraction, PCR Amplification and Sequencing
Molecular procedures were carried out in the Biochemistry laboratories of the University of Vigo (Spain). DNA was extracted and purified from muscle tissue of each specimen using the E.Z.N.A. Tissue DNA Kit from Omega Bio-Tek, following the manufacturer's instructions. About 30 mg of muscle tissue was used and total DNA was recovered in 200 µL of elution buffer. The primer sets used for sequence amplification were the COI-1 or COI-3 cocktails for the COI-5P barcoding marker [22] and, in the case of Lycodes terraenovae, 12SA-12SB for 12S rRNA and GLU-CB2 for Cyt b [23]. All information regarding these specimens, as well as their DNA barcodes, images, places of capture and other complementary data, is available in the project "Fishes of the Porcupine Bank" (code PORCU) in the Barcode of Life Data System (BOLD, http://www.boldsystems.org/ (accessed on 6 September 2021)).
Each PCR reaction was carried out in 20 µL final volume including Phire Green Hot Start II PCR Master Mix (Thermo Scientific, Göteborg, Sweden) for COI-5P amplifications, or Green-Taq DNA polymerase Master Mix (Canvax), the corresponding primers, nuclease-free water and 2 µL of template DNA. The time-temperature profiles were for COI:  [24]. The sequencing reactions were carried out in both directions to subsequently generate a consensus sequence, using the same primers as in the amplifications, except for the amplicons obtained using the COI-3 cocktail, where the primers M13F (-21) and M13R (-27) [25] were used.

Molecular Analysis and Assignment of Specimens
For each of the 14 taxa, the COI-5P sequences of the specimens assigned to species level after morphological analysis were used as queries in the nucleotide BLAST (BLASTn) tool (https://blast.ncbi.nlm.nih.gov/Blast.cgi (accessed on 6 September 2021)). In cases where the percentage of identity was low (less than 98%) or the resulting species name did not match, the sequences were manually aligned with others belonging to the same genus that were publicly available in the BOLD database. Sequences were collapsed into unique haplotypes using FaBox [26]. Molecular taxonomy cladograms were produced by the Neighbor-Joining (NJ) method [27] using MEGA X [28]. Observed differences between sequences were expressed proportionally as p-distances [29]. Although the trees were used for purely taxonomic purposes, a bootstrap resampling process [30] with 2000 iterations was included in their construction. A total of 652 nucleotide positions were considered in the final alignments, with a gap/missing data treatment of pairwise deletion.
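The p-distance with pairwise deletion and the collapsing of sequences into unique haplotypes can be illustrated with a minimal sketch. It does not reproduce the internals of MEGA X or FaBox: the function names are hypothetical, haplotypes are assumed here to be simply identical aligned strings, and the toy sequences are invented.

```python
VALID_BASES = set("ACGT")


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    for this pair only (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in VALID_BASES and b in VALID_BASES:
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


def collapse_haplotypes(sequences: dict) -> dict:
    """Group sequence identifiers by identical aligned sequence (assumed rule)."""
    haplotypes = {}
    for name, seq in sequences.items():
        haplotypes.setdefault(seq.upper(), []).append(name)
    return haplotypes


# Invented toy alignment of 10 sites; real barcodes here span 652 positions.
toy = {
    "spec1": "ACGTACGTAC",
    "spec2": "ACGTACGTAC",
    "spec3": "ACGTTCGTAC",
    "spec4": "ACGT-CGTAN",
}
print(p_distance(toy["spec1"], toy["spec3"]))
print(len(collapse_haplotypes(toy)))
```

With pairwise deletion, a gap in one sequence removes that site only from comparisons involving that sequence, so the number of compared sites can differ between pairs.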
Species recognition using barcodes relies on different species having different unique sequences, or different assemblages of closely related sequences, so that intraspecific variation or genetic distance is generally much less than interspecific variation, enabling species identification [31].
rounded; scales present in the predorsal area and the abdominal part; 9 palatine teeth; 9 vomer teeth; 26 premaxillary teeth; spinules on gill-rakers of first gill arch; coloration uniformly brownish; peritoneum dark. Distribution: Both sides of the Atlantic Ocean; in the eastern Atlantic from Rockall to South Africa [36]. Remarks: The stomach content reveals the presence of 7 whole discs and a large number of arm-ossicles from brittle-stars belonging to the family Ophiactidae. There are also sponge-spicules from the hexactinellid Pheronema carpenteri (Thomson, 1869), which might be considered accidental prey, because these brittle-stars seem to live closely associated with this deep-sea sponge, well known as an important habitat-builder species.

NJ Trees and Genetic Distances
COI barcodes were obtained from all 25 specimens (Table S1). The four sequences of P. hystrix and the single sequences of L. terraenovae and P. opalescens are new contributions to the BOLD project. Table 1 shows the results of the molecular identification of the specimens using COI-5P sequences, compared with those obtained by morphology. In 10 of the 14 taxa involved, the two identifications coincide. Where traditional taxonomy identifies H. atlantica, L. terraenovae and P. opalescens, BLASTn analysis returns Howella brodiei, Lycodes adolfi and Paracaristius maderensis, respectively. In the case of the four specimens of P. hystrix, the percentage of identity with BLASTn is too low to be reliable, being in the 94-95% range, and the search returns Paraliparis sp. The Neighbor-Joining analysis improves on the BLAST search for the molecular identification of the COI-5P sequences of P. hystrix (Figure S1A). In this case, the resulting haplotypes cluster robustly with sequences from P. garmani and others identified only at the genus level, with a maximum distance among them of 0.0107 and a distance of 0.0428 to their nearest neighbor.
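The screening rule behind these results (a BLASTn identity below 98%, or a returned name that differs from the morphological one, triggers a manual alignment) can be written as a short sketch. The function name is hypothetical, and the identity values for Howella and Antimora are invented for illustration; only the 94-95% range for P. hystrix comes from the text.

```python
def needs_manual_alignment(morph_name: str, blast_name: str,
                           identity_percent: float,
                           threshold: float = 98.0) -> bool:
    """Flag a specimen when BLASTn identity is low or the names disagree."""
    name_mismatch = morph_name.strip().lower() != blast_name.strip().lower()
    return identity_percent < threshold or name_mismatch


# (morphological name, best BLASTn name, percent identity)
hits = [
    ("Paraliparis hystrix", "Paraliparis sp.", 94.5),   # within the reported 94-95% range
    ("Howella atlantica", "Howella brodiei", 99.0),     # identity value is hypothetical
    ("Antimora rostrata", "Antimora rostrata", 99.5),   # identity value is hypothetical
]
flagged = [morph for morph, blast, ident in hits
           if needs_manual_alignment(morph, blast, ident)]
print(flagged)
```

A name mismatch is flagged even at high identity, which is how a case such as H. atlantica versus H. brodiei is caught despite a close sequence match.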
Sequence MW908015, assigned to P. opalescens (PORCU021-21), was aligned with the closest available sequences found in the BOLD database for the family Caristiidae (Figure S1B). The sequence groups with two sequences of Paracaristius maderensis (Maul, 1949) from the North Atlantic in a statistically well-supported clade, with a mean distance between sequences of 0.001 and a distance of 0.0401 to its nearest neighbor, FMVIC386, assigned to Caristius macropus (Bellotti, 1903). Overall, the NJ tree shows several statistically well-supported clusters that include sequences from different species and genera mixed together.
The cladogram in Figure S1C shows that the sequence MW908001 from L. terraenovae (PORCU023-21) clusters in a statistically well-supported clade with the available COI barcodes of the species Lycodes adolfi Nielsen and Fosså, 1993 and Lycodes pallidus Collett, 1879, with a maximum genetic distance of 0.0061. Sequences of other mitochondrial markers, 12S and Cyt b, obtained from the same specimen cluster with existing sequences from L. terraenovae and L. adolfi (Figure S2).
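The cluster statistics quoted in this section (maximum and mean distance within a cluster, and distance to the nearest neighbor outside it) can be derived from a pairwise p-distance matrix. The sketch below uses a hypothetical helper and invented toy distances, not values from the study.

```python
from itertools import combinations


def cluster_summary(dist: dict, members: list, outsiders: list):
    """Return (max within-cluster, mean within-cluster, nearest-neighbor) distances.

    `dist` maps unordered pairs of labels, stored as tuples, to p-distances.
    The cluster needs at least two members and one outsider.
    """
    def lookup(x, y):
        return dist[(x, y)] if (x, y) in dist else dist[(y, x)]

    within = [lookup(a, b) for a, b in combinations(members, 2)]
    to_outside = [lookup(m, o) for m in members for o in outsiders]
    return max(within), sum(within) / len(within), min(to_outside)


# Invented toy distances: three haplotypes (h1-h3) and one outside sequence (x).
toy_dist = {
    ("h1", "h2"): 0.002,
    ("h1", "h3"): 0.010,
    ("h2", "h3"): 0.008,
    ("h1", "x"): 0.045,
    ("h2", "x"): 0.043,
    ("h3", "x"): 0.050,
}
max_within, mean_within, nearest = cluster_summary(
    toy_dist, ["h1", "h2", "h3"], ["x"])
print(max_within, round(mean_within, 4), nearest)
```

A nearest-neighbor distance several times larger than the maximum within-cluster distance is the pattern that supports treating the cluster as a single species-level unit.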

Discussion
Although current ichthyological research is largely oriented towards ecological and applied aspects, many basic scientific questions remain unsolved for many fish species. The taxonomic knowledge of marine fish is much greater for coastal and/or commercial species than for non-commercial or deep-sea species [38]. Many deep-water fishes are thought to have wide geographical distributions, but for some species, this conclusion rests primarily on morphological traits [39]. DNA barcoding is an important tool for fish species identification, and also provides a standardized measure of sequence divergence between distant areas. Integrative morphological and molecular analysis will also allow geographical divergences to be revealed [8]. This methodology has been successfully used to detect cases of misidentification, synonymy and cryptic species [8,9,38] and is being progressively implemented in fish biodiversity studies.
For the specimens captured on the Porcupine Bank, the morphology (biometric and meristic data) is, in general terms, in agreement with previous reports in the ichthyological literature [20,21,36,37].
Traditional taxonomic identification and DNA barcoding of the 25 specimens agreed, allowing assignment to valid species, in 10 of the 14 taxa.
Of the four divergent cases, BLASTn analysis of H. atlantica returned H. brodiei as the best result, but that identification could have been based on Howella brodiei atlantica Post and Quéro, 1991, now considered a synonym of H. atlantica [20].
The three haplotypes of Paraliparis hystrix clustered with two sequences of Paraliparis garmani Burke, 1912 and two of Paraliparis sp., with interspecific distances at intraspecific levels, which could indicate potential synonymy. Paraliparis garmani was described off New England, western Atlantic, by Burke [40] and was later included in the revision of the family [41]; P. hystrix was described decades later by Merrett [42] from the west and south of Ireland, in the eastern Atlantic. Interestingly, Merrett [42] examined the holotype of P. garmani as comparative material, which leads us to conclude that he considered it the closest species. Despite the high level of overlap in morphological characters between the two species (Table 2), there is no comparative analysis in Merrett's manuscript and this similarity has not been highlighted. No additional descriptions or identification keys to confirm the validity of the two species have been published. DNA barcoding not only serves to detect overlooked species of liparids, but also to rapidly generate important insights into the taxonomy and diversification of this group [43]. The results of this research strongly suggest, based on morphological and molecular evidence, that P. hystrix is a junior synonym of P. garmani. Geographic distribution also supports this hypothesis (Figure 3). Chernova et al. [33] report the distribution of each species as eastern Atlantic for P. hystrix and western Atlantic for P. garmani. However, there are records of both species across both areas [44][45][46], indicating a single widespread species in the North Atlantic.
The sequence of Platyberyx opalescens clustered with two MAECO sequences of Paracaristius maderensis from the Mid-Atlantic Ridge. However, P. opalescens can be distinguished from P. maderensis by the suborbital series not being expanded; the narrow space between orbit and mouth; the relatively long upper jaw, extending to the posterior margin of the orbit; and the presence of palatine and vomerine teeth [34]. Additionally, the morphological description is in general agreement with that of Stevenson and Kenaley [47]. Diagnostic characters were present in the specimen examined, confirming the correct identification. Therefore, misidentification of P. maderensis or a recent speciation event seem to be the most likely explanations, but more sequences of specimens of both species would be needed to elucidate this point. In any case, the molecular cladogram does not show sequence groupings consistent with the current taxonomy of the family Caristiidae.
Only a few species of Lycodes have been sequenced [48], making identification through DNA barcoding alone difficult. The sequence alignment of the Porcupine Bight specimen of L. terraenovae with those available in BOLD shows its closeness to L. adolfi, as previously reported [23]. However, the cladograms resulting from mitochondrial 12S and Cyt b markers show a close relationship with another specimen of L. terraenovae, in addition to L. adolfi.
Although there are obvious differences in the morphological characters between these two species, this relationship was already suggested in the first description of L. adolfi [49]. Molecular similarity could be due to misidentification, synonymy or recent diversification events. Allometric growth variability, sexual dimorphism and geographical variation have been found in some morphological characters of L. terraenovae [50,51], and Nielsen and Fosså [49] also found ontogenetic changes in L. adolfi. This intraspecific morphological variability could lead to individuals of the same species being considered separate species, and could be an important source of misidentifications and synonymies in this genus. An integrative and comparative study would be necessary to establish the true relationship between these two species.
Biodiversity research requires vouchered and curated specimens for biomass measurements and for morphological and genetic analysis (DNA barcoding) to confirm identification and establish robust taxonomy and phylogeny [18]. Molecular tools have been of increasing importance in the discovery of cryptic deep-sea species and of taxonomic synonymies resulting from the phenotypic plasticity of a wide range of taxa. However, molecular analysis is a routine procedure, requiring comparatively less effort and training than classical morphological taxonomy. Consequently, molecular data in the repositories have grown considerably in recent decades, whereas morphological analyses of deep-sea fishes remain scarce. Although modern molecular techniques allow plausible new taxonomic statuses to be proposed, the paucity of specialised taxonomic expertise and funding (the taxonomic impediment [52]) means that these matters will remain unresolved for a long time.
Gaps and inconsistencies in reference DNA databases can make it difficult to accurately identify fish to the species level, suggesting the need to strengthen the DNA barcoding reference datasets [11]. There are considerable geographical variations in the number of deep-sea fish barcodes deposited in repositories. Higher activity has been detected in some countries, such as Canada and the United States in the Northwest Atlantic, and Australia and New Zealand in the Southwest Pacific. The Northeast Atlantic is comparatively less sampled, owing to a lower tissue sampling effort in the European Atlantic countries, although this effort is still higher than in less developed and less well-resourced regions such as the African Atlantic coast.
This integrated approach is proving to be a powerful tool in the exploration of the diversity of the deep sea, helping to fill the gaps in the understanding of the true diversity of life in this environment.

Supplementary Materials:
The following are available online at https://www.mdpi.com/article/10.3390/jmse9101075/s1, Figure S1: Neighbor-Joining trees based on p-distances, including COI-5P sequences of the 14 deep-sea species captured at Porcupine Bank. The percentages of replicate trees in which the associated taxa clustered together in the bootstrap test (2000 replicates) are shown next to the branches when values are higher than 70%. The trees are drawn to scale, with branch lengths in the same units as those of the genetic distances used to infer the cladograms. Differences among sequences were computed as p-distances and are in units of the number of base differences per site. All ambiguous positions were removed for each sequence pair (pairwise deletion option). There was a total of 652 positions in each final dataset. The analyses were conducted in MEGA X. Annotations are also included for each tree regarding the location where the specimens were collected. The species names shown in the cladograms are those that were associated with the sequences downloaded from BOLD. (A) Paraliparis hystrix; (B) Platyberyx opalescens; (C) Lycodes terraenovae. Figure S2: Neighbor-Joining trees of Lycodes terraenovae mitochondrial Cyt b (A) and 12S rDNA (B) nucleotide sequences based on p-distances. The percentages of replicate trees in which the associated taxa clustered together in the bootstrap test (2000 replicates) are shown next to the branches when values are higher than 70%. The trees are drawn to scale, with branch lengths in the same units as those of the genetic distances used to infer the cladograms. Differences among sequences were computed as p-distances and are in units of the number of base differences per site. All ambiguous positions were removed for each sequence pair (pairwise deletion option). The analyses were conducted in MEGA X.