Application of DNA Barcoding for Monitoring Madagascar Fish Biodiversity in Coastal Areas

Abstract: Madagascar is a marine biodiversity hotspot. A recent checklist recorded 1689 marine or transitional water fish species, 2.5% being endemic. To date, studies in this country have mostly focused on adult fishes using morphology-based identification. The early life stages of fishes remain largely understudied. The present work aimed to improve knowledge of fish biodiversity in Madagascar by focusing on post-larval reef fishes and settled juveniles in seagrass meadows of southwest Madagascar, using either species identification keys or DNA barcoding. In total, 119,500 individuals were collected, and 1096 individuals were successfully barcoded. We identified 387 species: 85 through their morphology (with 58 unsuccessfully sequenced) and 302 by using CO1 barcoding, corresponding to 302 barcode index numbers (BINs). This study added 27 new BINs to the BOLD database and 120 new BINs for Madagascar, but only 159 BINs were assigned a precise species name. By referring to the updated checklist of Madagascar fishes, 10 new species were detected for Madagascar. This number will probably increase once all the barcoded specimens are assigned to precise species names. These preliminary findings stress our poor knowledge of marine fish biodiversity in Madagascar and demonstrate the relevance of DNA barcoding in improving this knowledge.


Introduction
Madagascar is recognized for its terrestrial and marine biodiversity [1], including fishes. The country has about 1800 fish species, of which about 1700 are marine, including anadromous and catadromous species [2]. Among marine fish species, up to 43 are endemic to Madagascar according to Fricke et al. [2]. Knowledge of Madagascar's marine fish biodiversity relies mainly on morphology-based identification, which is often of limited use for some families or for early life stages. However, surveying the early life stages of fishes could probably improve this knowledge, as demonstrated by [3] in the waters of Reunion Island.
Specimen identification at the species level constitutes a major issue when working on post-larvae or juvenile fish. Indeed, no specialized identification guide exists for reef fish juveniles, and the available identification guides for fish larvae rarely go beyond the genus level [4]. Several more precise identification guides exist, but they have been developed for specific areas and generally concern only a few species [5,6]. Obtaining the precise

Study Sites and Sampling
This study was carried out in coral reef habitats and seagrass beds of the southwestern coast of Madagascar (Figure 1). The post-larval fish survey was conducted in (i) a flat coral reef surrounding Nosy Ve Island, and (ii) the northern part of the Toliara Great Barrier Reef. These sites were selected because previous scientific information on habitat morphology and the associated resources was available [10,19]. A juvenile fish survey was conducted in the seagrass beds off Ankilibe, 15 km to the south of Toliara. This area was chosen because of the presence of small-scale fishermen engaged in juvenile fishing using mosquito seine nets [20].
Post-larvae sampling took place over three six-month periods during the warm season: November to April of 2014-2015, 2016-2017, and 2017-2018. The warm season was chosen as the peaks of abundance and species richness always occur during this period [10]. For each warm season, three stations per site were sampled by using light traps called "SLEEP" (acronym of "Système Lumineux Electronique d'Echantillonnage des Post-larves") developed by OCEA Consult' on Reunion Island [3]. Each monthly sampling consisted of three consecutive nights centered on the new moon period, as the fish larval supply to the reef environment mainly occurs around this lunar phase [21]. Although light traps are selective and their efficiency is influenced by water-current conditions or by water mass turbidity [22,23], they were used because they allow one to catch fish larvae before they settle onto benthic habitats, at a stage called "post-larvae" [24,25]. Six light traps per night were deployed, with three traps per site (i.e., one trap per station). Light traps were set around sunset and retrieved around the crepuscule. The individuals caught by each trap were placed into separate containers filled with seawater to keep them alive.
For seagrass fishes, the catches of two small-scale fishermen using mosquito seine nets were sampled during the warm seasons of 2016-2017 and 2017-2018 (i.e., November to April). This period corresponds to the highest landings in southwestern Madagascar [26]. Although several small-scale fishing gears are commonly used in southwest Madagascar, fishermen using mosquito seine nets were chosen as they capture many juveniles [27].
The catches were collected for three days at spring tide, a period more practical for deploying mosquito seine nets, which need a depth of less than 1.3 m, and thus more favorable for fishing. Sampled catches were put into a cooler with ice and transported to the IH.SM laboratory for the identification process.

Identification Process
In the laboratory, living individuals from light traps and dead fishes from mosquito nets were sorted by morphospecies, i.e., individuals presenting a similar morphology. Each morphospecies was identified at the lowest possible taxonomic level by using published keys [28][29][30], the identification of some specimens remaining at family or order level. One specimen per morphospecies and per sample was randomly selected and photographed with a Nikon D90 camera equipped with a Sigma 105 mm macro lens. Post-larvae were euthanized with an overdose of clove oil. The camera was connected directly to a computer running the Adobe Lightroom 5.7® software (Adobe Systems Inc., San Jose, California, USA), which was used for managing the photos and all the information related to each specimen. A tissue fragment from each photographed specimen was preserved in 90% ethanol and stored at −20 °C until total DNA extraction.
Several tissue fragments per morphospecies were brought for barcoding to the genotyping and sequencing facility, Institut des Sciences de l'Evolution, CEMEB, University of Montpellier, France. Tissue fragments were rinsed with distilled water to remove the alcohol and then dried in individual 2 mL tubes. An automated purification of genomic DNA from dried tissues was performed by using a Macherey-Nagel NucleoMag® 96 Tissue kit [31]. About 650 bp of the mitochondrial cytochrome oxidase 1 (CO1) gene were amplified by using the primer cocktail FishF1 (5′-TCAACCAACCACAAAGACATTGGCAC-3′) and FishF2 (5′-TCGACTAATCATAAAGATATCGGCAC-3′) [BOLD Primer code: C_FishF1] in combination with FishR1 (5′-TAGACTTCTGGGTGGCCAAAGAATCA-3′) [BOLD Primer code: VR1] [32]. A complete description of the sequencing process can be found in [3]. The CO1 sequences were manually edited by using Chromas 2.6.4 (DNA sequencing software, available online: http://technelysium.com.au/wp/chromas/, accessed on 15 May 2017).
CO1 sequences with their corresponding specimen images and sampling details (e.g., location, time, taxonomy) were uploaded into the Barcode of Life Data System (BOLD) database [dx.doi.org/10.5883/DS-PHDJAO]. In BOLD, each sequence was assigned a barcode index number (BIN): sequence clusters were first delineated through single linkage analysis with a 2.2% sequence divergence threshold to obtain the operational taxonomic units (OTUs) [33]. The resulting OTUs were refined by using Markov clustering, and the optimal partition was selected through the silhouette criterion [33]. For assigning a species name to each BIN, we adopted an approach consisting of several steps. First, if the BIN corresponded to only one species in BOLD, and this species was observed in this BIN only, the specimen was identified as "Genus + species". Second, if the BIN corresponded to different species from the same genus, or if the species corresponding to this BIN was also assigned to different BINs in BOLD, the specimen was identified as "Genus + BIN". Third, if the BIN corresponded to species from different genera belonging to the same family, the specimen was identified as "Family + BIN". Fourth, when the BIN was new for BOLD (i.e., corresponding to specimens that had never been barcoded before), identification was based on the combination of the lowest taxonomic level obtained from morphological characters and the BIN (i.e., Genus + BIN and/or Family + BIN). Note that identifications such as "Genus + BIN" or "Family + BIN" do correspond to identifications at the species level, as each BIN corresponds to an OTU, and thus to a putative species [33].
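The four-step naming rule described above can be sketched as a small Python function. This is only an illustrative sketch under stated assumptions, not the authors' workflow: the function name `label_bin` and its inputs (the set of species names BOLD lists for a BIN, a mapping from species name to the BINs carrying that name, the family, and the morphology-based taxon) are hypothetical. BINs that exist in BOLD without any species name are treated here like new BINs, which is also an assumption.

```python
def label_bin(bin_id, bold_species, species_to_bins, family, morpho_taxon=None):
    """Return an identification label for one BIN (hypothetical helper).

    bold_species    : set of "Genus species" names BOLD associates with the BIN
                      (empty if the BIN is new for BOLD or carries no name)
    species_to_bins : dict mapping a species name to the set of BINs it occurs in
    family          : family shared by the species of the BIN
    morpho_taxon    : lowest taxon (genus or family) obtained from morphology
    """
    tag = f"[{bin_id}]"
    # Step 4: no name available in BOLD -> morphology-based taxon + BIN
    if not bold_species:
        return f"{morpho_taxon} {tag}"
    genera = {name.split()[0] for name in bold_species}
    # Step 3: species from several genera of one family -> Family + BIN
    if len(genera) > 1:
        return f"{family} {tag}"
    # Step 1: a single species that occurs in this BIN only -> Genus + species
    if len(bold_species) == 1:
        (name,) = bold_species
        if species_to_bins.get(name, {bin_id}) == {bin_id}:
            return name
    # Step 2: congeneric species, or a species spread over several BINs
    genus = next(iter(genera))
    return f"{genus} {tag}"


# Cases taken from the text; the dictionary layout itself is an assumption.
species_to_bins = {
    "Plectroglyphidodon lacrymatus": {"BOLD:AAI8860", "BOLD:AAB6989"},
}
print(label_bin("BOLD:AAI8860", {"Plectroglyphidodon lacrymatus"},
                species_to_bins, "Pomacentridae"))
print(label_bin("BOLD:AAD5600", {"Apogon erythrinus", "Ostorhinchus aureus"},
                species_to_bins, "Apogonidae"))
```

The first call yields a "Genus + BIN" label because the species name is shared by two BINs; the second yields a "Family + BIN" label because the BIN mixes two genera.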

Data Analyses
Data analyses were implemented in R version 3.5.1 (R Core Team, Vienna, Austria) [34]. To estimate the expected fish species richness and assess the benefit of additional sampling, species accumulation curves were plotted for mosquito seine net and light-trap catches by using the "vegan" package (version 2.5.4, [35]). Bootstrap estimators have been shown to be better estimators of total species richness than jackknife and Chao's estimators [36]. The richboot function of the "wiqid" package (version 0.2.2, [37]) was used to obtain the bootstrapped estimate of species richness [38].
The number of species corresponding to the initial names obtained from morphological-based identifications was compared to the resulting OTUs from BOLD to highlight the strength of the molecular-based identification approach. For assessing the effort of fish DNA barcoding in Madagascar, the existing BINs in BOLD associated with fish species names from this country were extracted from the Public Data Portal on 4 May 2022 by using the keywords "Actinopterygii Madagascar". The species name that occurred most often was assigned to each of the extracted BINs, and the resulting names were compared to the list of fish species reported by [2]. To obtain the contribution of the present study to increasing the DNA barcoding effort for Madagascar fishes, these extracted BINs were compared with the BINs obtained in this study.
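To make the two richness tools mentioned above more concrete, the following Python sketch computes a sample-based accumulation curve (averaged over random sample orders) and the classical incidence-based bootstrap estimator, S_boot = S_obs + Σ_j (1 − p_j)^n, where p_j is the fraction of the n samples containing species j. It is a minimal illustration on toy data, not the R code used in the study; whether it reproduces the exact output of the "vegan" and "wiqid" functions has not been verified, and the function names are assumptions.

```python
import random


def accumulation_curve(samples, n_perm=100, seed=0):
    """Mean number of species seen after 1..n samples, over random orderings."""
    rng = random.Random(seed)
    n = len(samples)
    totals = [0.0] * n
    for _ in range(n_perm):
        order = samples[:]
        rng.shuffle(order)
        seen = set()
        for i, sample in enumerate(order):
            seen |= sample          # add the species of this sample
            totals[i] += len(seen)  # cumulative richness after i + 1 samples
    return [t / n_perm for t in totals]


def bootstrap_richness(samples):
    """S_obs + sum over species of (1 - p_j)^n (incidence-based bootstrap)."""
    n = len(samples)
    species = set().union(*samples)
    correction = sum(
        (1 - sum(sp in s for s in samples) / n) ** n for sp in species
    )
    return len(species) + correction


# Toy incidence data: each set is the species list of one sample.
toy = [{"A", "B"}, {"A", "C"}, {"A"}, {"B", "D"}]
curve = accumulation_curve(toy)
estimate = bootstrap_richness(toy)
print(curve[-1], round(estimate, 3))
```

A curve that is still rising at the last sample, together with a bootstrap estimate above the observed richness, indicates that further sampling would be expected to add species.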
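The BIN comparison described above amounts to a majority vote on names followed by a set difference. A minimal Python sketch is given below; the BIN identifiers, species names, and function names are placeholders invented for illustration and do not come from the BOLD extraction.

```python
from collections import Counter


def most_frequent_name(names):
    """Species name occurring most often among the records of one BIN."""
    return Counter(names).most_common(1)[0][0]


def new_bins(study_bins, reference_bins):
    """BINs obtained in the study that are absent from the reference set."""
    return set(study_bins) - set(reference_bins)


# Placeholder records extracted for a country (hypothetical data).
reference_records = {
    "BIN-1": ["Genus alpha", "Genus alpha", "Genus beta"],
    "BIN-2": ["Genus gamma"],
}
reference_names = {b: most_frequent_name(n) for b, n in reference_records.items()}

study_bins = {"BIN-2", "BIN-3", "BIN-4"}
added = new_bins(study_bins, reference_records)
print(reference_names["BIN-1"], sorted(added))
```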

Species Richness: Morphological vs. Molecular-Based Identification
In total, 364 samples were collected: 286 from light traps and 78 from mosquito seine nets. A total of 119,500 individuals were collected, 50,342 from light traps and 69,158 from mosquito seine nets. Using morphological characteristics only, we identified 656 morphospecies among post-larvae and juvenile fish. A total of 1096 individuals from 571 morphospecies were successfully barcoded (with 1 to 17 barcoded specimens for each morphospecies; see Supplementary Material Table S1). Finally, 387 species from 66 families were obtained by combining molecular-based identification and the morphological approach (see Supplementary Material Table S2). Ten families were the most speciose: Apogonidae with 37 species, Pomacentridae (26), Syngnathidae (25), Lethrinidae (22), Labridae (22), Mullidae (20), Chaetodontidae (18), Gobiidae (16), Acanthuridae (15), and Carangidae (14). Among the 387 species, 238 were observed as post-larvae and 232 as seagrass fishes, 83 species being observed with both sampling gears. Based on the species accumulation curves, the maximum values of species richness were reached neither with light traps (Figure 2a) nor with mosquito seine nets (Figure 2b). The bootstrap estimator of total species richness indicated that light traps and mosquito seine nets should catch up to approximately 275 and 264 species, respectively.
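The counts reported above can be cross-checked with simple inclusion-exclusion arithmetic. The short Python sketch below uses only figures stated in the text; the variable names are illustrative.

```python
# Sampling effort and catches per gear, as reported in the text.
samples = {"light traps": 286, "mosquito seine nets": 78}
individuals = {"light traps": 50_342, "mosquito seine nets": 69_158}

total_samples = sum(samples.values())
total_individuals = sum(individuals.values())

# Species seen as post-larvae, as seagrass fishes, and with both gears.
post_larvae, seagrass, shared = 238, 232, 83
total_species = post_larvae + seagrass - shared  # inclusion-exclusion
only_post_larvae = post_larvae - shared
only_seagrass = seagrass - shared

print(total_samples, total_individuals, total_species)
```

The three totals agree with the 364 samples, 119,500 individuals, and 387 species given in the text.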

Discordances in Species Assignment
Among the 141 BINs presenting ambiguities in species name assignment, 70, including the 27 new ones, were simply not assigned to a species name in the BOLD database. The remaining 71 BINs presented ambiguities in species assignment. These ambiguities came either from BINs associated with several species names in BOLD, or from BINs assigned to one species name with which two or more BINs were associated. The first type of ambiguity concerned three BINs: BOLD:AAA9764 associated with Myripristis hexagona and M. murdjan, BOLD:AAD1777 associated with Pempheris adusta and P. nesogallica, and BOLD:AAD5600 associated with Apogon erythrinus and Ostorhinchus aureus. The second type of ambiguity corresponded to 68 BINs.
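The categories of ambiguity distinguished above can be expressed as a short classification routine. The Python sketch below is an illustration only: the input layout (a dictionary from BIN to the set of species names listed in BOLD) and the category labels are assumptions, and the two placeholder BINs (`BIN-unnamed`, `BIN-clean`) are invented; the other entries use BIN and species pairs mentioned in the text.

```python
from collections import defaultdict


def classify_bins(bin_to_species):
    """Assign each BIN to one naming category (hypothetical labels)."""
    species_to_bins = defaultdict(set)
    for bin_id, names in bin_to_species.items():
        for name in names:
            species_to_bins[name].add(bin_id)

    result = {}
    for bin_id, names in bin_to_species.items():
        if not names:
            result[bin_id] = "no species name"
        elif len(names) > 1:
            result[bin_id] = "several names for one BIN"      # first type
        elif len(species_to_bins[next(iter(names))]) > 1:
            result[bin_id] = "one name shared by several BINs"  # second type
        else:
            result[bin_id] = "unambiguous"
    return result


bins = {
    "BOLD:AAA9764": {"Myripristis hexagona", "Myripristis murdjan"},
    "BOLD:AAF6311": {"Zebrasoma desjardinii"},
    "BOLD:ACV8450": {"Zebrasoma desjardinii"},
    "BIN-unnamed": set(),             # placeholder: BIN without a name
    "BIN-clean": {"Genus alpha"},     # placeholder: unambiguous BIN
}
categories = classify_bins(bins)
for bin_id, category in categories.items():
    print(bin_id, "->", category)
```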

Geographical Distribution of Ambiguous Species
The species' geographical distribution is important for understanding whether the ambiguities in species name assignment could be linked to the different regions where they occur. The geographical distribution of the 71 BINs presenting ambiguities in species name assignment could be categorized into two groups. The first group comprised 44 BINs for which the associated species name was also associated with one or more other BINs observed in different regions (Table 1). For the second group of 27 BINs, no clear geographical distribution was observed (Table 2).
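The distinction between the two groups can be sketched as a test of whether the BINs sharing one species name occupy non-overlapping regions. The Python example below is a hedged illustration: the function name and data layout are assumptions, while the two species and their regions are those mentioned elsewhere in the text.

```python
def regions_are_separate(bin_regions):
    """True when no region is shared between the BINs of one species name."""
    seen = set()
    for regions in bin_regions.values():
        if seen & regions:   # a region already used by another BIN
            return False
        seen |= regions
    return True


# BINs and regions as described in the text for two species names.
plectroglyphidodon_lacrymatus = {
    "BOLD:AAI8860": {"Western Indian Ocean"},
    "BOLD:AAB6989": {"Western Pacific Ocean"},
}
zebrasoma_desjardinii = {
    "BOLD:AAF6311": {"Western Indian Ocean"},
    "BOLD:ACV8450": {"Western Indian Ocean"},
}

first_group = regions_are_separate(plectroglyphidodon_lacrymatus)
second_group = not regions_are_separate(zebrasoma_desjardinii)
print(first_group, second_group)
```

Under this reading, a species name whose BINs are geographically separate falls in the first group, and one whose BINs overlap in the same region falls in the second.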

DNA Barcoding Effort for Madagascar Fishes
Among the 1689 marine or transitional water fish species recorded in the Madagascar EEZ [2], only 419 (i.e., about 24.8%) had already been barcoded and assigned to a BIN in the BOLD database in May 2022 (Table S3). Based on the comparison of these 419 BINs with the BINs obtained in this study, the present work added 120 new BINs for Madagascar (see Supplementary Material Table S2), including the 27 new BINs that had never been recorded before in the BOLD database. These 120 new BINs belonged to 41 families, half of them corresponding to species of Pomacentridae (16 species), Apogonidae (14), Chaetodontidae (8), Labridae (8), Mullidae (8), and Lethrinidae (6).
This study allowed us to identify 10 new records for Madagascar, i.e., species that had never been observed in the coastal waters of this country. These new species for Madagascar were among the 159 BINs that were unambiguously associated with a species name. These species were Chaetodon ulietensis Cuvier 1831, Diagramma labiosum Macleay
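The proportions quoted above follow directly from the reported counts, as the short Python check below shows. Only figures stated in the text are used; the variable names are illustrative.

```python
checklist_species = 1689   # marine or transitional water species [2]
barcoded_before = 419      # species with a BIN in BOLD in May 2022

coverage_pct = 100 * barcoded_before / checklist_species

# New BINs for Madagascar in the six best-represented families.
new_bins_by_family = {
    "Pomacentridae": 16, "Apogonidae": 14, "Chaetodontidae": 8,
    "Labridae": 8, "Mullidae": 8, "Lethrinidae": 6,
}
top_family_total = sum(new_bins_by_family.values())
share_of_new_bins = top_family_total / 120

print(round(coverage_pct, 1), top_family_total, share_of_new_bins)
```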

Discussion
The present work highlighted the usefulness of DNA barcoding as a tool for inventorying biodiversity based on early life stages of fish. We observed an overestimation of species richness when using only morphological characters: 571 morphospecies vs. 387 species when using DNA barcoding combined with a morphological approach. Conversely, in other studies, using only a morphological approach has led to an underestimation of the number of species, as observed by [13] on the southwest tropical Atlantic coast. These authors identified 76 post-larval reef fish morphospecies from 465 samples (i.e., the catches of one light trap for one night) over five years by using morphological and meristic characteristics. Even though these authors did not use DNA barcoding, this richness appeared to be low compared to the 733 reef fish species known in this part of the Atlantic coast [39]. For instance, in [13] only one morphospecies assigned to "Genus spp. or Family spp." was observed for each of the families Scorpaenidae, Gobiidae, and Synodontidae because meristic characters largely overlap between species. This highlights the inconsistency of traditional identification techniques due to the inefficiency of morphological characteristics for separating some species, as already reported by [16].
The present work, using DNA-based identification, observed up to 265 putative species (i.e., 265 BINs) caught as post-larvae from only 286 samples. Previous work using similar light traps at the same sites obtained 128 post-larval reef fish morphospecies from 145 samples, only 79 being identified at the species level [11]. Based on the graph presented in Figure 2a, 145 samples would correspond to approximately 180 species, i.e., ~40% more species than observed without using DNA barcoding. These findings confirm the usefulness of DNA barcoding for identifying the early life stages of fish species as reported by [17], and support its higher efficiency compared to traditional techniques.
DNA barcoding can also contribute to improving knowledge of fish biodiversity at local or global scales. As revealed by [40], DNA barcoding is useful for enhancing fish biodiversity knowledge, as they detected up to 90 new species for South Africa during their ten-year-long post-larval survey. Similarly, in the waters of Reunion Island, Collet et al. [3] demonstrated the importance of post-larval surveys in improving fish biodiversity knowledge using DNA barcoding. These authors found ten new species that had never been recorded in the Reunion Exclusive Economic Zone over six months of sampling during a single warm season (about 108 samples). These findings appear to be in line with our results, as 10 species that had never been recorded in Madagascar were observed during three warm seasons (364 samples). By referring to the Catalog of Fishes, five of these new species for Madagascar (Dipterygonotus balteatus, Foa fo, Neamia octospina, Paramonacanthus frenatus, and Pempheris ibo) have already been observed in the Western Indian Ocean region. On the other hand, five of the new species (Chaetodon ulietensis, Diagramma labiosum, Equulites klunzingeri, Ostorhinchus gularis, and Scarus fuscopurpureus) appeared to be new records for this region. This result may be related to their misidentification during previous works using morphological identification only. It could also correspond to an extension of their geographical distribution linked to larval dispersal during tropical storms, which has already been demonstrated for Epinephelus marginatus, whose larvae were suggested to be transported by a cyclone from South Africa, which holds the nearest known population, to Reunion Island [41]. Moreover, the present work also contributed to the enhancement of fish biodiversity knowledge at a global scale, as 27 BINs new for the BOLD database were obtained. These new BINs may correspond either to fish species that have never been described, owing to the rarity of molecular-based studies in the region, or to species that have never been barcoded at the adult stage [42], as most barcoding studies still select species on the basis of their commercial value. The lack of CO1 sequences in BOLD may also be linked to the way specimens are usually selected for DNA analyses. Most of the time, specimens are first grouped by similar morphologies (i.e., morphospecies) and color patterns. As the number of barcoded specimens per morphospecies is often limited, some species that look similar based on their morphology may be ignored. In the present work, several specimens were barcoded within each morphospecies. This was particularly the case for lethrinid post-larvae and juveniles, as cryptic species are often encountered in this family [43] and morphological keys are unable to distinguish the species [44]. In the present study, BOLD assigned the CO1 sequences of 138 lethrinid specimens to 14 BINs, four of them being new for BOLD. This finding not only confirms that DNA barcoding can deal with species that are hard to differentiate [15] but also demonstrates that the way individuals are selected for DNA analyses is important for increasing fish biodiversity knowledge.
Although DNA barcoding appeared to be a promising tool for identifying the early life stages of fish, some barcoded individuals were not assigned to a species name. Wibowo et al. [45] were also unable to assign species names to a large proportion (68%) of the larval fish morphospecies they had collected in Indonesian tropical swamps. According to these authors, the main problem was the lack of reference CO1 sequences. In the present work, 141 BINs were not clearly assigned to a species name. Of these 141 BINs, 43 (excluding the 27 new BINs for BOLD) were not associated with a precise species name. For the remaining 71 BINs, the species name corresponding to each of these BINs was also assigned to other BINs in BOLD. Ideally, there should be only one BIN for each species and vice versa [33]. Interestingly, for 44 of these 71 BINs (i.e., 62%), the species name associated with each BIN was also associated with another BIN corresponding to specimens caught in a different geographical location (Table 1). For example, among the two BINs corresponding to Plectroglyphidodon lacrymatus, BIN BOLD:AAI8860, observed during the present study, is solely observed in the Western Indian Ocean, whereas BIN BOLD:AAB6989 is observed in the Western Pacific Ocean. These findings confirm the ability of DNA barcoding to trace species origin [15]. It is, however, difficult to determine whether these BINs correspond to sister species or to a single species presenting a high phylogeographic structure (i.e., an intraspecific divergence without a speciation process), as observed in the groupers [46]. In order not to increase noise in BOLD, the specimens caught during the current study were thus identified as Plectroglyphidodon [BOLD:AAI8860]. For the remaining 27 of these 71 BINs, no clear spatial distribution of the BINs corresponding to each species name was observed (Table 2). For example, Zebrasoma desjardinii is associated either with the BIN BOLD:AAF6311 or with BOLD:ACV8450, both of them being observed in the Western Indian Ocean. This kind of ambiguous case may be associated with specimen voucher misidentifications, which often occur in laboratories [47]. Indeed, Zebrasoma desjardinii is difficult to distinguish morphologically from Zebrasoma veliferum [48]. Such misidentifications cause serious problems in reference libraries such as BOLD and have already been stressed by [18] for Indo-Pacific fish larvae. This highlights the important need for revising the information present in reference libraries.
An increase in the DNA barcoding effort, not only for adult fishes but also for larval and juvenile fishes, remains necessary for a better knowledge of fish biodiversity. For example, 1919 fish species are known from South Africa's exclusive economic zone, but only 1006 (52.4%) have been barcoded [40]. Nevertheless, these authors demonstrated that the DNA barcoding effort added 90 new species caught as larvae and 139 as adults. For Madagascar, among the 1689 marine or transitional water fish species reported by [2], only 419 (~25%) had been barcoded before the present work. The current study sequenced up to 120 fish species of Madagascar that had never been barcoded, added 27 new BINs to BOLD, and recorded at least 10 new species for Madagascar. Thus, the number of marine or transitional water fish species in Madagascar rose to 1699, 539 of them (~32%) being barcoded. In Madagascar, the DNA barcoding effort is still at an early stage of development compared to other countries in the Western Indian Ocean such as South Africa. Increasing this DNA barcoding effort by conducting surveys at broader geographic and temporal scales, using all life stages from post-larvae to adults, will thus improve the knowledge of Malagasy fish biodiversity. This will allow updating the checklist of Malagasy fishes reported by [2].

Conclusions
In conclusion, this study clearly stressed the usefulness of DNA barcoding as a promising tool for enhancing the knowledge of Malagasy marine fish biodiversity. This work, which added a large number of new sequences to BOLD, represents an important step toward establishing a national sequence reference library for the marine fish of Madagascar. The large number of fish species that remain to be barcoded demonstrates that Madagascar is still in the early stages of development in terms of molecular-based identification of marine fish. Increasing research effort on molecular-based identification through broader-scale studies will contribute significantly to the knowledge of fish biodiversity. Although the present work was carried out in a restricted area only, 10 new records for Madagascar were detected. This number will probably increase when all the barcoded specimens that presented ambiguities are assigned to a precise species name. However, even if some specimens were not assigned to a species name, BINs can be used for biodiversity or ecological surveys [49]. Therefore, accurately identifying species based on their corresponding BIN will permit linking the early life stages to adults and investigating the response of these different life stages to environmental conditions.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/d14050377/s1. Table S1: Dataset of individuals successfully barcoded during the present work. Table S2: List of species captured by light traps and mosquito seine nets during austral warm seasons. The species are organized according to Nelson (2006). Total abundances are reported for each sampling gear as well as the number of successfully barcoded specimens. Table S3: List of barcoded fish species from Madagascar in BOLD on 4 May 2022.

Institutional Review Board Statement: Ethical review and approval were waived for this study as we worked on dead animals retrieved from small-scale fisher catches for juvenile monitoring. For the living fish post-larvae collected from light traps, only one individual per taxon was collected and the remaining ones were returned to the sea.

Figure 1. Locations of the sampling sites for post-larvae (red circles) and juvenile fish (dotted circle). ANA = Anakao reef; GRT = Great Barrier Reef of Toliara.

Figure 2. Species accumulation curves for post-larval (a) and juvenile fishes (b). One sample corresponds to the catch of each light trap per night (a) or each mosquito seine net per day (b). Black bars: confidence interval; red line: accumulation curve.

Figure 3. Identification levels. The numbers in brackets correspond to the number of species for each category.

Table 1. Ambiguously identified species with clearly separated origins of the specimens of each BIN. IO: Indian Ocean; PO: Pacific Ocean; AO: Atlantic Ocean.

a Species complex.

Table 2. Ambiguously identified species without separated origins of the specimens of each BIN. IO: Indian Ocean; PO: Pacific Ocean; AO: Atlantic Ocean.
a Species complex.