Indo-Pacific Phylogeography of the Lemon Sponge Leucetta chagosensis

: The sponge Leucetta chagosensis Dendy (1913) has a wide distribution throughout the Indo-Pacific (IP) region, with previous studies focussing primarily on the western Pacific Ocean. To increase our knowledge of the spatial variation of genetic diversity throughout the IP, we constructed a phylogeny for L. chagosensis for the IP to assess the evolutionary patterns for this species. We generated 188 sequences of L. chagosensis and constructed maximum likelihood and Bayesian inference trees, using concatenated mitochondrial cytochrome oxidase subunit 3 gene ( cox3 ) and nuclear ribosomal RNA gene ( 28S ) markers for the first time. The spatial variation of genetic diversity of L. chagosensis was assessed using a phylogeographic approach. Leucetta chagosensis is composed of five cryptic lineages confined to different biogeographic regions with the specimens found in the Indian Ocean differing significantly from those found in the rest of the IP region. Genetic divergence was particularly high for the cox3 marker, with a low nucleotide diversity but high haplotype diversity for most lineages. This study highlights the need for a sustained effort in studying sponge diversity, boosted by the ongoing discovery of hidden biodiversity among this ecologically important taxon.


Introduction
Phylogeography has provided numerous insights into the distribution of genetic variation in terrestrial and aquatic ecosystems. In particular, it has contributed to our understanding of the evolutionary history and spatio-temporal genetic divergence of species, with broad implications ranging from systematics and taxonomy [1,2] to conservation [3,4]. In addition, phylogeographical studies have the potential to support effective conservation strategies [5], especially for threatened marine environments like the Indian Ocean [6,7].
The Indian Ocean (IO), is recognised as one of the world's marine biodiversity hotspots, with a high number of endemic species [8,9,10]. The IO has been neglected from a bio-and phylogeographical standpoint, particularly compared to the rest of the Indo-Pacific (IP) region [11].
This lack of information is particularly concerning because baseline biodiversity information is essential for any conservation plan that aims to mitigate climate change impacts [12,13,14,15] and over-exploitation [16,17,18]. Recent studies have emphasised a biogeographic break between reef species of the Indian and Pacific Oceans. For example, distinct genetic lineages have been observed between reef fish species [19,20] and coral reef sea stars [21,22,23] of the Indian and Pacific Oceans, respectively. Such genetic discontinuities are important from a conservation perspective, as populations of marine species in these regions likely experience unique evolutionary dynamics, which require separate management and conservation prioritisation.
Sponges are important functional components of reef ecosystems in the IP region [24,25], but few studies have focussed on their diversity and phylogeography so far -a gap we aim to close with this study. We chose the calcareous "lemon" sponge Leucetta chagosensis (Order: Clathrinida, Family: Leucettidae) as the target species of this study for several reasons.
First of all, the lemon sponge has a wide distribution, ranging from the Red Sea and the Coral triangle region to the central Pacific, thus covering the whole Indo-Pacific region. Leucetta chagosensis was first described by Dendy (1913) [26] from the IO Chagos Archipelago. The sponge was later reported in different areas in the IP, as well as other places in the IO ( Figure 1). Recently, L. chagosensis has been observed in Rodrigues Island [25] and Reunion Island [27], although it has not yet been detected in Mauritius (Pasnin, pers. Comm.) ( Figure 1).
Second, compared to other calcareous sponges, L. chagosensis is abundant and can be easily identified because of its bright lemon-yellow colour. It is usually found in shaded areas [28] at depths below 15 m [29], and the species can locally be relatively abundant compared to other sponges. Like all calcareous sponges [30], it is viviparous, and therefore assumed to have limited dispersal capabilities [28].
Third, the species has been subject to previous phylogeographic studies, which have focussed (although not exclusively) on the Pacific region [28,29]. In these studies, deeply diverged clades were discovered: two large and sub structured overlapping clades in the north west (NW) and south west (SW) Pacific, with a geographical overlap in the Great Barrier Reef and the Coral Sea. The other divergent clades were observed from individuals sampled from the Maldives and the Red Sea, forming a distinct clade, and also another clade grouping individuals found in the Philippines [29]. The overlapping distribution of genetically divergent clades [28,29] suggests that reproductive barriers exist between them, supporting the hypothesis of ongoing speciation processes within L. chagosensis. Samples from the Southern IO were not included. Still, they may be crucial for understanding the phylogeography of the species -for example, like in the Acanthaster planci-species complex [22,31], in which deep genetic breaks exists between Northern and Southern IO. Additionally, both previous studies on L. chagosensis phylogeography [28,29] were based entirely on nuclear loci, because mitochondrial markers were not available at that time, due to the markedly different organization of the mitochondrial genomes of calcareous sponges compared to other metazoans [32]. Including mitochondrial markers has been proven to improve the resolution of relationships between geographic regions, also among sponge species [33,34,35]. Most sponges have low mitochondrial variation [36] compared to other marine species. However, the first unambiguously identified mitochondrial genes from calcareous sponges exhibited an unusually high genetic variation, exceeding the one characterizing other sponges [37,32]. A pilot study by Voigt et al. [37] with L. chagosensis demonstrated the high-resolution potential of the mitochondrial gene of cytochrome oxidase subunit 3 (cox3) by analysing samples of the target species from its NW and SW Pacific clades. Only one sample from the IO (Red Sea) was included in that study.
Here, for the first time, we comprehensively analyse the phylogeographic structure of L. chagosensis over its distribution range with both nuclear and mitochondrial markers. In this way, we were able to widen our sample size and improve the resolution power of our analyses. The aims of this study were to (1) assess if the L. chagosensis found in the central IO is isolated from the Pacific Ocean (PO) and (2) unravel the evolutionary relationships of L. chagosensis. We hypothesise that, given the viviparous life history trait of L. chagosensis, limits to long-distance dispersal do probably exist. As a consequence, the alleged wide geographic distribution of the species throughout the IP could be an overestimation. In particular, we hypothesise that L. chagosensis found in the central IO has diverged from those found in the PO and that the nominal L. chagosensis is composed of different cryptic lineages that are evolving separately in their specific biogeographic regions.

Sequence Generation
For this study, we generated 162 cox3 and 56 28S new sequences of L. chagosensis collected from Indian Ocean, as well the western and central Pacific Oceans, except for the Taiwan region (Table 1). In addition, we included 35 and 146 available mtDNA partial cytochrome oxidase subunit III gene cox3 and 28S sequences, respectively [37,28,29]. The compiled dataset of 209 L. chagosensis samples covers its entire range, encompassing sites in the Indian and Pacific Oceans (Table 1), making this the most comprehensive dataset available to date. Detailed information of the specimens and their sequences is provided in Table S1. The DNA of newly collected specimens was extracted using the NucleoSpin Tissue kit (Machery-Nagel) following the manufacturer's instructions. DNA extracts from other specimens were available from previous studies [28,29]. PCR amplification followed Voigt et al. [37] for the mtDNA partial cytochrome oxidase subunit III gene (cox3) and Wörheide et al. [28] for nuclear 28S markers. All PCR products were run on a 1% agarose gel (BioWhittaker Molecular Applications) and gel-purified using Biospin Gel Extraction Kits, following the manufacturer's instructions. All products were sequenced on an ABI 3730 Capillary sequencer with Big Dye terminator chemistry (Applied Biosystems) at the Central Analytical Facility at Stellenbosch University or at the Biozentrum of the Ludwig-Maximilians-Universität Munich. Forward and reverse sequences were aligned in CodonCode Aligner (https://www.codoncode.com/) or Geneious v11.1.5 (http://www.geneious.com; [38]). Generated sequences were submitted to GenBank (http://www.ncbi.nlm.nih.gov) and the European Nucleotide Archive (http://www.ebi.ac.uk/ena) and are available under the accession numbers provided in Table S1.

Phylogenetic Reconstructions
Of the 209 specimens, both markers, cox3 and 28S, were available for 188 individuals (Table S1). These 188 sequences of each marker were aligned using Muscle 3.8.425 [39] implemented in Geneious, with default parameters and manually trimmed to same lengths. The resulting sequence lengths for the cox3 and 28S alignments were 412 bp and 306 bp, respectively. Every sequence was checked for poriferan origin (specifically L. chagosensis) against the NCBI database before downstream analyses, in order to validate the morphological taxonomy. The cox3 sequences were checked for their reading frame by translating them with the reported calcinean-specific mitochondrial genetic code [32]. The two gene fragments were concatenated, but divided into two partitions, protein-coding (cox3) and non-protein-coding (28S), before finding for the best partition scheme and nucleotide evolutionary model using ModelFinder in IQ-Tree [40,41] under the Bayesian information criterion (BIC). The best partition scheme and model used for the concatenated sequences are found in Table S2. Attempts to include the closely related species Pericharax orientalis as an outgroup were omitted because no robust positioning of the root was possible. For 28S, the genetic variation was too low to find a reasonably supported root position, while for the mitochondrial cox3 marker, the previously documented, extraordinarily high variation in mitochondrial protein-coding genes in Calcinea [32] produced a very long branch to the outgroup. Even at the amino-acid level, the p-distance between P. orientalis and L. chagosensis was about 40%. An appropriate outgroup is required for L. chagosensis, as for now it is difficult to find a suitable one due to the high divergence of the cox3 marker with L. chagosensis [37]. Moreover, there is no molecular clock available for the class Calcarea. For these reasons, all trees were midpoint rooted.
Maximum likelihood (ML) and Bayesian inference (BI) midpoint rooted trees were constructed respectively using IQ-Tree [42], with 2000 ultrafast-bootstrap replicates and MrBayes [43,44], with a chain length (three heated and one cold chain) of 10,000,000 generations and discarding the first 25% of trees as burn-in. Single marker trees were reconstructed for each partition first, and then subsequently with the concatenated dataset to assess any difference in the topology obtained. The ultrafast bootstrap values for the ML tree were obtained with single branch tests (SH-like approximate likelihood ratio test) implemented in IQ-Tree. The parameters for the Bayesian trees were checked using the trace output, and convergence was accepted when values for an average standard deviation for split frequencies were well below 0.01 and effective sample size values were above 200. All trees were visualised and formatted using Figtree v1.4.4 (tree.bio.ed.ac.uk/software/figtree/). Sequences with identical haplotypes forming a monophylum were collapsed for each biogeographic region sampled.

Population Structure and Genetic Diversity
To assess the relationships among haplotypes, TCS networks of all cox3 and 28S sequences were constructed separately in Popart 1.7 [45]. DNAsp v6 [46] was used to calculate the number of haplotypes per sampling region. Also, haplotype and nucleotide diversities were calculated for the different biogeographic groupings shown in Figures 1 and 2. An AMOVA was conducted in Arlequin 3.5 [47], with five groups following the different clades observed in the haplotype network and phylogenetic tree, in order to assess the level of genetic differentiation within and between the groupings obtained.

Phylogenetic Reconstruction
The separate ML (Figures S1 and S2) and BI-based phylogenetic reconstructions for the cox3 and 28S markers were different, with the phylogeny estimated from cox3 having higher support based on bootstrap and posterior probability values. One main difference was that in the 28S-based phylogeny, the samples from the Philippines were not reciprocally monophyletic, whereas they were in the cox3based phylogeny, where they formed a separate clade ( Figure S1). The ML and BI tree from the concatenated sequences were congruent ( Figure S3 and Figure 1), and here the samples from the Philippines formed a reciprocally monophyletic group. The concatenated tree provided evidence for five clades, with most lineages having high support values ( Figure 1). Clade 1 (green in Figure 1) contained most of the samples from the western Pacific, grouping specimens from Australia, Papua New Guinea, Vanuatu, Fiji, American Samoa, and Polynesia. Clade 2 (purple in Figure 1) included specimens from the IO only, with samples collected from the Red Sea, Maldives, and Rodrigues. Clade 3 (orange in Figure 1) grouped some of the samples from the Australian coast (orange in Figure  1), found around the Great Barrier Reef region, and also samples from Papua New Guinea, Palau, Guam, and Taiwan (orange in Figure 1). Clade 4 (blue in Figure 1) was comprised of samples collected in Indonesia and Japan, but grouped no samples from areas in between (for instance, Taiwan and the Philippines). Clade 5 (turquoise in Figure 1) was comprised of only samples from the Philippines.

Population Structure and Genetic Diversity
The haplotype network based on cox3 sequences (Figure 2) generated more information in terms of haplotype diversity than the one obtained from 28S sequences ( Figure S4). The relationship between the 45 haplotypes is described in the cox3 haplotype network (Figure 2), with the main groupings corresponding to the five clades obtained from the phylogenetic tree (Figure 1). The number of haplotypes obtained from cox3 and 28S for each study region, with their respective clades obtained from the phylogenetic tree, is provided in Table S3. Clades 1 and 3 grouped haplotypes from different geographic locations, namely Australia and Papua New Guinea. Moreover, haplotypes from Australia showed strong differentiation separated by 20 mutational steps in Clade 3. The haplotypes from Rodrigues, the Red Sea and the Maldives were grouped in Clade 3, but were all separated according to their geographic location. The most distinct haplogroup contained the haplotypes from the Philippines (Clade 5 in Figures 1 and 2), separated by 35 mutational steps. In general, few haplotypes are shared across the sampled biogeographic regions (Table S3; Figure 1). However, Clade 1 ( Figure 1) contained individuals from Polynesia, Fiji, and American Samoa, sharing the same cox3 haplotypes (Table S3). Haplotypes from Clade 2, grouping regions from the IO, were distinct from the other clades, with no haplotypes shared even between sampling localities (Rodrigues, Maldives, and the Red Sea). The Maldives shared one 28S haplotype with Polynesia (Table S3). Clade 3 shared one 28S haplotype for specimens obtained from Guam, Taiwan and Palau, whereas only Guam and Taiwan shared a cox3 haplotype. Clade 4 (Japan and Indonesia) shared one cox3 haplotype. Specimens from Clade 5, which was the most distinct one obtained from the phylogenetic tree ( Figure 1) and the cox3 haplotype network (Figure 2), shared one haplotype with Australia and Indonesia with regards to the 28S marker (Table S3). The AMOVA supports the hypothesis of high levels of genetic structure with Fst = 0.88 (P < 0.001) for the cox3 marker compared to the 28S marker with Fst = 0.46 (P < 0.001) for 28S (Table 2). Moreover, a high percentage variation (87.54) was observed for the cox3 marker among the different clades observed with a low percentage variation (12.46) within the clades. For the 28S marker a lower percentage variation (46.23) was observed among populations compared to within populations (53.77). Further analyses based on pairwise comparisons (Fst) showed significant differences ranging from 0.689 to 0.974 with the cox3 marker between the five clades obtained in Figures 1 and 2 (Table  S4). Nucleotide diversity for both markers was low and ranged from 0.000 to 0.046. Haplotype diversity ranged from 0.000 to 0.914 (Table 3). The global haplotype diversity was high for both cox3 (0.914) and 28S (0.881) markers.

Discussion
Our findings confirm previous investigations that showed L. chagosensis as several genetically distinct but morphologically cryptic lineages [28,29]. To further disentangle the phylogenetic and phylogeographic relationships of L. chagosensis, we utilised for the first time two independent markers that previously were demonstrated to differ in their ability to resolve evolutionary histories of sponges [29,37], namely a nuclear and a mitochondrial gene. Combining a larger sample size in relation to previous studies and analysing a concatenated dataset merging both nuclear and mitochondrial sequences, here we provide a phylogenetic insight into what we could call the L. chagosensis species complex. Assuming that the pelagic larval dispersal (PLD) takes place for only few days [48,49], the mixing of larvae across large geographical areas seems unlikely, thus contributing to the isolation of lineages, even in a relatively open system like the ocean. Our phylogenetic reconstruction (Figure 1) is consistent with the hypothesis that L. chagosensis is composed of morphologically cryptic lineages, with their distinct evolutionary histories presumably shaped by a combination of short PLD, geographic distance, and physical barriers, such as the rise of shelfs during the glacial period. In summary, Clades 1, 2, & 5 were distinct to their respective geographic locations; namely the western PO, IO and the Philippines. Clades 3 and 4 showed some complexity in their phylogenetic relationships, as they grouped irrespective to the proximity of their sampling locations (it should be noted that the specimens of these two clades are predominately from the Coral Triangle).

Deep Divergence and Biogeographical Overlap in Cryptic Lineages
An important finding of our study is the newly discovered sister group relationship between the Indian Ocean specimens and the samples obtained from the central PO (Figure 1). This is not concordant with the various studies undertaken on marine species including reef invertebrates found in the IP region, where a distinct divergence among the IO and PO populations has been reported (reviewed in Crandall et al. [50]).
In order to unravel the complex array of L. chagosensis lineages, the five different clades (Figures  1 and 2) obtained across their biogeographic regions are discussed below.

Western and Central PO (Clade 1)
Clade 1 grouped samples from Australia, Papua New Guinea, Fiji, Polynesia, American Samoa, and Vanuatu, thus regrouping all samples from the Indo-Australian plate with high support values from ML and BI analyses (Figure 1). The relatively short distance between these regions may explain the groupings of those two regions in a single clade. In addition, the East Australian current and Equatorial Counter current potentially sustain the transfer of larvae and buds (asexual fragments), respectively, from the eastern part of Papua New Guinea to northern and eastern part of Australia, and from Papua New Guinea and East Australia to Fiji, Polynesia, American Samoa, and Vanuatu [28,29].

Indian Ocean (Clade 2)
In the phylogeny derived from the concatenated sequences, all samples from the IO formed a distinct clade, consistent with our hypothesis that no lineages are shared between the Indian and Pacific Oceans [29]. With the Australian and Eurasian plates colliding in the mid-Miocene, the water flow between the Pacific and the Indian Oceans was reduced [51]. The water level dropped drastically during the Pleistocene period, due to the glacial formation during that age, and the Sunda shelf was exposed [52]. This sea-level change created a barrier between the Indian and the Pacific Oceans, restricting gene flow that subsequently led to allopatric speciation in many cases [53]. For example, Gaither et al. [54] showed that the IP barrier at the glacial maxima and the ensuing lowered sea levels contributed towards shaping the evolutionary patterns of the reef fish Cephalopholis argus, showing two main haplogroups, one characterizing the IO and one characterizing the PO. The coral reef sea star Linckia laevigata also has limited gene flow between the IO and the PO, due to the sea fluctuations in the Pleistocene [23].
The L. chagosensis found in the IO is grouped together in Clade 2 (Figure 1), and a clear distinction of the different sampling locations was observed from the haplotype network ( Figure 2, Table S3). In contrast to other studies (e.g., Vogler et al. [22]), a deep genetic divergence between the Southern IO, the Northern IO, and the Red Sea could not be observed. Nonetheless, no cox3 or 28S haplotypes are shared between these regions ( Figure 2, Table S3). Therefore, gene flow between these regions in the western IO appears to be restricted for L. chagosensis. The Red Sea is separated from the remaining western IO by physical barriers, most notably by the shallow and narrow Strait of Bab al Mandab [55], and numerous phylogeographic studies have provided evidence for the deep genetic divergence of Red Sea marine species (e.g., DiBattista et al. [10,56]; Fernandez-Silva et al. [57]). Consistent with this hypothesis, we observed that L. chagosensis individuals sampled in Rodrigues are more closely related to those from the Red Sea than the geographically closer Maldives. The seasonal mixing of surface waters of the Red Sea and the western IO [58], which favours the transfer of larvae or asexual fragments between the Red Sea and Rodrigues, might also contribute to the observed phylogeographic pattern.

Western PO (Clades 3, 4, and 5)
Clade 3 is comprised of samples mainly from the northeast of Australia but also specimens from Papua New Guinea, Palau, Guam, and Taiwan ( Figure 1). The North Equatorial Current likely promotes gene flow between these geographically relative proximate sites (Figure 1). The same scenario has been observed for the macroalga Sargassum aquifolium, where oceanic currents and land bridges have been reported to shape the S. aquifolium populations in the Southeast Asia region. A genetic break between populations was observed between Southeast Asia and the western Pacific Islands and Guam [59]. Furthermore, Clade 3 has diverged significantly with high support values from ML and BI analyses from Clade 1, which also contains samples from Australia and Papua New Guinea, clearly showing a distinction of two different lineages of L. chagosensis in the same biogeographic region, in concordance with previous results [29].
Clade 4 contains other samples from Japan and Indonesia. Considering the role of the North Equatorial current in mixing the surface waters from Indonesia and Japan (Figure 1), a phylogeographic relatedness would be expected between the samples collected in the western PO, as it is highlighted in Clade 3. However, even though Clades 3 and 4 are phylogenetically more closely related to each other than to other clades, there is a clear separation between them ( Figure 1). With the numerous island formations in that region and the sea level minima and maxima during glacial cycles, the complexity of speciation occurring on the different islands in that region is likely intensified [60]. Such phylogeographic complexity has also been reported for other species found in the western PO -for example, the mangrove species Excoecaria agallocha [61], the abalone Haliostis diversicolor [1], and the Neon damselfish Pomacentrus coelestis [62].
Clade 5 contains only samples from the Philippines, defining this geographic region as biogeographically distinct. Wörheide et al. [29] suggested a long isolation of L. chagosensis populations in the Philippines as a possible reason for its genetic divergence. The phylogeographic complexity of the Philippines archipelago has also been described in the case of the marine red alga Portiera [63], for which DNA taxonomy suggested that it is composed of 21 species within the Philippines, although it was previously assumed to be one widely distributed IP species based on morphology.
Our study now reinforces the exceptional biogeographical role of the Philippine Archipelago by confirming the deep separation of L. chagosensis from the rest of the IP [29], based on the analysis of a mitochondrial locus.

Genetic Structure and Diversity of L. Chagosensis Across Its Range
With the inclusion of a mitochondrial marker and the incorporation of additional samples from the IO, our study provides a unique opportunity to assess the genetic structure and diversity of the allegedly wide-ranging species L. chagosensis. The haplotype network based on the cox3 marker ( Figure 2) retrieves five lineages, showing high levels of genetic structure, evidenced by the relatively large number of mutational steps between most lineages. High Fst values obtained from the cox3 marker (Table 2) also support the observation of highly structured populations. Most of the clades showed a high haplotype diversity and a low nucleotide diversity, which may indicate geographic population expansions within each clade [64]. Clade 5 (Philippines) has no values for the nucleotide and haplotype diversities, because the 28S sequences were identical in this clade. The lack of genetic diversity may either be caused by a severe bottleneck effect of the Philippine population or be artefactual, due to the low number of samples obtained from that region. Clade 4 (Australia, Papua New Guinea, Guam, Palau, Taiwan, and Indonesia) was the only clade showing high nucleotide diversity with regards to cox3. This region comprises the Great Barrier Reef and also the part of the Coral Triangle, the latter of which is famous as a marine biodiversity hotspot for many marine taxa [65,66]. It also has been proposed to be a centre of overlap between the IO and PO [67], a centre of origin [68], and the centre of survival [69,70].

Conclusions
In this study, we unravelled the phylogeographic relationships of L. chagosensis lineages across the known biogeographic range, using a combination of nuclear and mitochondrial DNA markers. The morphological taxonomy of L. chagosensis reveals only one species from the Indian to the Pacific Oceans. The results obtained from the molecular analyses suggest several genetic lineages isolated to their respective geographic location, with a relationship observed between the IO and the eastern PO. The isolation of the Philippines clade reiterates an urgent call for more L. chagosensis samples in that region, in order to understand its biogeographical importance. This demonstrates the benefits of applying the combined use of nuclear and mitochondrial markers for resolving sponge phylogeography. By including specimens from additional sites of the Indian Ocean, we provided further evidence that the L. chagosensis found in the central Indian Ocean are isolated from the Pacific Ocean, thus reinforcing the call for further phylogeographic studies in this biodiversity hotspot region, including for additional marine invertebrates. Such studies will specifically help to better understand the evolutionary patterns of connectivity of the Indian Ocean fauna to other parts of the Indo-Pacific region.

Supplementary Materials:
The following are available online at www.mdpi.com/1424-2818/12/12/466/s1. Table S1: Information on all available L. chagosensis specimens; Table S2: Best partition scheme and model of evolution (maximum likelihood, Bayesian inference; Table S3: Number of haplotypes from cox3 and 28S for each study region with their respective clades obtained from the phylogenetic tree; Table S4: AMOVA results with population pairwise FSTs values for Clades 1 to 5 observed in Figures 1 and 2); Figure S1: Midpoint rooted ML phylogenetic tree from cox3; Figure S2: Midpoint rooted ML phylogenetic tree from 28S; Figure S3: Midpoint rooted BI tree with concatenated sequences of cox3 and 28S; Figure S4: Haplotype network from 28S marker.