Contrasting host-parasite population structure and role of host specificity: morphology and mitogenomics of a parasitic flatworm on pelagic deepwater cichlid fishes from Lake Tanganyika

Little phylogeographic structure is presumed for highly mobile species in pelagic zones. Lake Tanganyika is a unique ecosystem with a speciose and largely endemic fauna famous for its remarkable evolutionary history. In bathybatine cichlid fishes, the pattern of lake-wide population differentiation differs among species. We tested the magnifying glass hypothesis for their parasitic flatworm Cichlidogyrus casuarinus . Lake-wide population structure of C. casuarinus ex Hemibates stenosoma was assessed based on a portion of the mtCOI gene combined with morphological characterisation. Additionally, intraspecific mitogenomic variation among 80 individuals within one spatially constrained parasite metapopulation sample was assessed using shotgun NGS. While no clear geographic genetic structure was detected in parasites, both geographic and host-related phenotypic variation was apparent. The incongruence with the genetic north-south gradient observed in the host may be explained by the broad host range of this flatworm as some of its other host species previously showed no lake-wide restriction of gene flow. Our results are consistent with host driven morphological variation without genetic differentiation of the parasite, and highlight the importance of integratively approaching parasites` potential as “tags” for their hosts. We present the first parasite mitogenome from Lake Tanganyika and propose a methodological framework for studying intraspecific mitogenomic variation of dactylogyrid monogeneans.


Introduction
Species richness is generally low in the pelagic zones of large water bodies compared to their littoral zones. This is not only true for marine ecosystems [1], but also in large lakes [2,3]. The oftenuniform appearance of highly mobile pelagic species like e.g. fish reflects the lack of physical barriers to gene flow and of resource-based diversification [4]. Nevertheless, in many cases it remains notoriously difficult to follow gene-flow across the open water and consequently, to draw a connection between panmixia and the specialisation on open-water habitats.
Similar to the rather uniform fish host species composition in pelagic habitats, low parasite species diversity have been observed in the open water and in deep sea ecosystems worldwide [5][6][7][8][9][10][11]. Due to their shorter generation time and high mutation rate, parasite lineages are often more species-rich than their hosts, and accelerated microevolution is also visible in their population structure [12]. Therefore, distribution patterns of parasites have been suggested to mirror and further magnify population structure, migration patterns and historical distribution of their hosts [13][14][15][16]. However, evolutionary mechanisms in most parasite taxa remain poorly studied, especially at the population level. Due to diversity of life strategies and host taxa being involved, flatworms earned the label "masters of parasitism" [17]. Monogenean parasites, a group of neodermatan flatworms, are monoxenous (life-cycle depending on a single host individual) and often display high levels of hostspecificity, i.e. a single host species is involved [18]. They are, therefore, considered a prime candidate model for using parasites as tags for their hosts' history and ecology.
Unlike the open sea, pelagic zones of lakes are geographically confined and easier to monitor as a whole. Therefore, they could serve as more approachable systems for studying evolutionary processes and host-parasite relationships among open water taxa. Lake Tanganyika is well suited to study parasite distribution patterns because it is highly isolated from surrounding water bodies and is home to a speciose endemic cichlid species assemblage, a famous textbook model of evolution in natural conditions [19][20][21][22], infected by an equally stunning diversity of monogeneans [23,24]. The lake comprises a large but still monitorable pelagic zone with layer stratification (epi-, meso-and bathypelagic) inhabited by schooling freshwater species of sardines (Clupeiformes, Clupeidae), their latid predators (Perciformes, Latidae) as well as bentho-and eupelagic endemic cichlid lineages (Cichliformes, Cichlidae) belonging to the tribes Bathybatini, Boulengerochromini and Perrisodini [25,26]. Decreased levels of parasite species richness are often connected with low host-specificity, usually described as the number of different host species a certain parasite species infects [27]. So far, various levels of host-specificity in monogeneans infecting cichlid fishes have been recorded in the lake [28,29]. Mechanisms causing this variation in parasite host-specificity remain mostly unknown in natural parasite-host systems. In order to understand microevolutionary processes driving the hostspecificity of parasites, it is paramount to understand thoroughly what the drivers of generalism and large host range are. The combination of high host dispersal capacities and low host population densities was proposed to cause reduced parasite host-specificity in deep waters [30][31][32], while the former have been suggested to affect the morphology of monogenean populations in Lake Tanganyika [33,34]. This is likely also the case for Cichlidogyrus casuarinus Pariselle, Muterezi Bukinga & Vanhove, 2015, which is classified as intermediate generalist parasite (infecting host species from more than one genus, following [35] infecting bentho-and eupelagic bathybatine cichlids in the lake. Discovered about a decade ago [36], its size (larger compared to its congeners so far discovered in the lake) makes it a convenient subject for thorough microscopical investigation [37], and because of its relatively broad host range it was the first African monogenean to be analysed at the population genetic level [28].
The broad use of Next-generation Sequencing (NGS) technologies enables to cost-effectively study population structure and distribution patterns of aquatic migratory species [38,39]. However, the use of genomic data in monogeneans has so far been hindered by their small size and low yields from DNA extraction. Therefore the use of whole genome data has been restricted to few, mostly model parasite taxa of medical importance such as the agents of malaria [40] and schistosomiasis [41]. Population genomics on monogeneans, sourced from the wild without experimental procedures, is to our knowledge an uncovered field.
Recently, a comparative phylogeographic study on bathybatines showed that benthopelagic species do display geographic population structure, whereas eupelagic species do not [42]. No hostrelated (meta)population structure was found within C. casuarinus in the northern sub-basin of Lake Tanganyika [28], we question its magnifying potential in a lake-wide scale. In this study, we investigate the geographic population structure of this parasite infecting bentho-and eupelagic hosts and present a rare comparison of morphological characterisation and a classic mitochondrial marker used to study intraspecific diversification in monogeneans, with NGS data of the same parasite population. We hypothesize that its phylogeographic structure is shaped by dispersal capacity of the most mobile hosts. Alternatively, isolation by distance would suggest philopatry of C. casuarinus.

Sampling
Monogenean specimens were isolated from the gills of Hemibates stenosoma (Boulenger, 1901) (n=8), and a single individual of Bathybates graueri (Steindachner, 1911) purchased from local fishermen in Mpulungu, Zambia, September 2018 and preserved in 99% EtOH prior to examination. Monogenean individuals were cut into three parts with attachment organ (haptor) and male copulatory organ (MCO) fixed on slides using Hoyer's medium and the rest of the tissue kept apart for molecular characterisation. All the collected monogenean specimens were identified as C. casuarinus based on the morphological details of haptoral structures (mainly length of dorsal bar and first pair of marginal hooks), the fact that the male copulatory tube has a spirally thickened wall, and that the male copulatory organ (MCO) has a 26-59-µm-long heel. Parasite voucher material was deposited in the collection of Hasselt University under accession numbers xx-xx. New data was used in conjunction with previously published (geo-)morphometric and genetic data of C. casuarinus from the northern sub-basin [28] to elucidate lake-wide geographical patterns in this monogenean species.

Morphometrics
Detailed characterisation of parasite sclerotised structures performed on 43 individuals ex H. stenosoma and eight ex B. graueri. In total, 22 parameters from the haptor and 3 from the MCO were measured following the terminology of [43]. Given that [27] observed morphological intraspecific variation between specimens from different host species, the existence of geography driven differentiation of C. casuarinus was tested solely on specimens collected from H. stenosoma. The acquired morphological data were combined with the previously published measurements on C. casuarinus ex H. stenosoma from the northern sub-basin of Lake Tanganyika (see Supplementary  Table 1 and [28]). The total dataset consisted of parasite specimens from the northern basin (off Bujumbura, Uvira and near the Malagarasi river delta) and the southern basin (off Mpulungu) (Fig. 1). Intraspecific morphological variation was explored using principal component analysis (PCA) performed on scaled measurements from the haptor in the R package ade4 (Thioulouse et al., 2018). Additionally, host size available for the specimens collected in Mpulungu was visualised in the resulting biplots to check the relationship between the morphological variation pattern and the host size. Missing data were replaced by the average value and specimens with more than 50% missing measurements were excluded from the analysis. To test the significance of intraspecific differences in MCO structures, pairwise t-tests were performed in the R package stats [44]. The assumption of normality and homogeneous variance within sample groups was verified by Levene's test in R package stats [44].

Geomorphometrics
Phenotypic variation of C. casuarinus related to geographic origin was also studied by shape analysis in addition to the linear measurements. The shape of the dorsal and ventral anchor for each parasite individual was digitised using eight fixed landmarks and 92 equally distributed semilandmarks (see Supplementary Fig. 1) in tps Dig v2.30 [45]. To minimize bending energy with respect to a mean reference form, fixed landmarks were superimposed using Generalized Full Procrustes Analyses under the Least Squares criterion [46,47]. PCA using fixed landmarks only was performed in MorphoJ v2.0 [48]. ANOVA supplied with a permutation test of 10,000 iterations was used to statistically validate differences between populations and dependency on the centroid size. A Relative Warp Analysis (RWA) [49] was performed on the overall shape of both anchors (using fixed landmarks and semilandmarks) with the Procrustes coordinates using tps Relw v1.49 [45]. In order to give all landmarks equal weight, the scaling option was set to α = 0. Results of all multivariate statistics were visualised using R packages ggplot2 [50] and tidyverse [51].

Data acquisition
Whole genomic DNA extraction of the individual parasites (n=26) was performed using an optimised protocol for low input DNA samples. Initial digestion was performed in 195µl of TNES buffer (400 mM NaCl, 20 mM EDTA, 50 mM Tris pH 8, 0.5% SDS) and 5µl of proteinase K (20 mg/mL) incubated at 55 °C for ~30-60 minutes. DNA was precipitated in a mixture of 65 µl 5 M NaCl and 290 µl 96% EtOH yeast tRNA as a carrier, stored in -20 C for 1 hour and purified with three runs of 1 mL chilled 70% EtOH. Extracted DNA was eluted in 50 µl of 0.1x TE buffer with 0.02% Tween-20. To assess the intraspecific genetic diversity of C. casuarinus across Lake Tanganyika, part of the mitochondrial cox1 gene was amplified using ASmit1 (5'-TTT TTT GGG CAT CCT GAG GTT TAT-3') [52] combined with Schisto3 (5'-TAAT GCAT MGG AAA AAA ACA-3') [53], and with ASmit2 (5'-TAA AGA AAG AAC ATA ATG AAA ATG-3') in a nested PCR [52]. The first PCR was performed with ASmit1 and Schisto3 primers in 24 µl of PCR mix (one unit of Q5 High Fidelity Polymerase (New England Biolabs, Ipswich, MA, USA), 5X buffer containing 2 mM MgCl2, 0.2 mM dNTPs, 0.5 mM of each primer, 0.1 mM of bovine serum albumin (BSA)) and 1 μl of isolated DNA (concentration was not measured) for a total reaction volume of 25 µl. The reaction carried out under the following conditions: initial denaturation at 95 C for 5 min, then 40 cycles of 1 min at 94 C, 1 min at 55 C and 1 min at 72 C, and final elongation for 7 min at 72 C. The nested PCR with ASmit1 and ASmit2 primers followed the same protocol as the first one with 1:100 dilution of template DNA. To genetically verify parasite species identification for the new host-parasite combination reported in this study, individuals of C. casuarinus collected from B. graueri were further subjected to PCR of the 28S rRNA gene (28S), a nuclear marker traditionally used to delineate monogenean species. Partial 28S was amplified using the C1 primer (5´-ACC CGC TGA ATT TAA GCA T-3´) and D2 primer (5´-TGG TCC GTG TTT CAA GAC-3´) [54]. Each reaction mix contained one unit of Q5 High Fidelity Polymerase (New England Biolabs, Ipswich, MA, USA), 5X buffer containing 2 mM MgCl2, 0.2 mM dNTPs, 0.5 mM of each primer and 2 μl of isolated DNA (concentration was not measured) in a total reaction volume of 30 μl under the following conditions: 2 min at 94 C, 39 cycles of 20 seconds at 94 C, 30 seconds at 63 C and 1 min and 30 s at 72 C, and finally 10 min at 72 C. The final PCR products were enzymatically purified using 4 µl of ExoSAP-IT reagent (ThermoFisher Scientific, Waltham, USA) and 10 µl of PCR product under the following conditions: 15 min at 37 C and 15 min at 80 C. The newly obtained haplotype sequences were deposited in NCBI GenBank under the accession numbers xx-xx (28S rRNA) and xx-xx (COI mtDNA).

Population genetic analyses
The obtained cox1 sequences (n=26) were combined with the previously published sequence data of C. casuarinus from the northern sub-basin (n=42). The number of haplotypes and polymorphic sites, haplotype diversity, nucleotide diversity and Tajima's D (Tajima, 1989) were calculated in Arlequin v3.5 [55]. Phylogenetic relationships among haplotypes were inferred by constructing a Median Joining haplotype network in PopART v1.7 [56] using a cox1 mtDNA gene portion of 392 bp. Population differentiation between parasite populations originating from the northern and southern sub-basin was estimated using the FST index as implemented in Arlequin v3.5.

Mitogenome assembly and annotation
In total, 80 individuals of C. casuarinus ex H. stenosoma were pooled for mitogenome assembly and to compare their level of variation in the cox1 region with that based on Sanger sequencing of individual amplicons. Genomic DNA of the pooled samples was extracted using the Quick-DNA TM Miniprep Plus Kit (Zymo Research) following the manufacturer's instructions with minor modifications, initial incubation overnight, and elution in 2x50 ul after 10 minutes incubation at RT each. Library preparation (Illumina TruSeq Nano, 550 bp insert size) and sequencing on NovaSeq6000 (2x 150bp) platform were outsourced (Macrogen Europe). Raw sequences were trimmed using Trimmomatic v0.39 [57] using a sliding windows option, cutting 5 bases from the start of each read and applying a minimal read length of 100bp [57]. The mitogenome of C. casuarinus was assembled using part of the cox1 sequence (KX007864.1) as seed in NOVOPlasty v3.7.2 [58] with a k-mer length from 21 -39, read length of 130 and insert size of 390. Partly assembled mitogenome sequences from k-mers 35 and 37 were combined in overlapping regions in Geneious Prime v11.0.6. The mitogenome was annotated using the MITOS web server (code echinoderm mitochondrial) [59] combined with visualisation of open reading frames and alignment with available mitogenomes of closely related monopisthocotylean monogeneans in Geneious Prime v11.0.6. In addition to MITOS, the tRNAscan-SE [60] and RNAfold [61] web servers were used to verify the tRNA-coding regions. When results between applications conflicted, the solution proposing a 7 bp acceptor stem was chosen. As noncoding mitochondrial regions were assembled, the presence of repeat sequences was checked with Tandem Repeats Finder [62]. The raw reads and annotated sequence were deposited in NCBI GenBank under accession numbers xx and xx.

Mitogenome diversity
Trimmed reads were mapped back to the assembled mitogenome, both majority and unambiguous consensus sequence, respectively, using bwa mem with the mean insert size of 450 bp (min. 300 bp, max. 1000 bp) [63]. Bwa mem has been identified as a suitable alignment method due to low false-positive rates and has been demonstrated to be the most effective open-source method for mapping PoolSeq data [64]. PCR duplicates were removed using SAMBLASTER v0.1.24 [65]. Mapped reads were filtered for low quality (-q 20) and paired reads only with SAMtools v0.1.11 [66]. Resulting bam files were converted to mpileup files using SAMtools v0.1.11 [66]. The number of polymorphic sites, nucleotide diversity (π) and Tajima's D in a pooled sample were calculated in PoPoolation v.1.2.2 [67] using a sliding window approach across the mitogenome excluding the repeat region (window size of 300 bp and a step size of 10 bp, minimum coverage 4, minimum count 2) and across the cox1 fragment (window size of 392 bp and a step size of 2 bp, minimum coverage 4, minimum count 2, pool size 80 and minimum coverage fraction 0.6). To assess the interspecific nucleotide diversity between the mitochondrial protein coding genes known from species in this parasite genus, a sliding window analysis was performed on aligned sequences of two other species of Cichlidogyrus (C. halli (Price & Kirk, 1967) MG970255.1 [68] and C. sclerosus Paperna & Thurston, 1969 JQ038226.1 (unpublished) and the majority consensus sequence of C. casuarinus in DnaSP v5 [69] (with a window size of 300 bp and a step size of 10 bp). These are the only two members of the genus for which a complete mitochondrial genome sequence was already available. Conveniently, C. halli, C. sclerosus and C. casuarinus belong to different clades within Cichlidogyrus [70], ensuring a certain phylogenetic coverage of the genus.

Results
In total, 156 individuals of C. casuarinus were collected from H. stenosoma (prevalence 88.9%, maximum infection intensity 86, minimum infection intensity 1, mean 19.5, and abundance 17.3). Nine individuals of C. casuarinus were collected from B. graueri, representing the first record on this host.

Morphological variation
Based on the observed mutual position of parasitic individuals in the PCA scatterplot, phenotypic variation related to geographic origin was visible along the first and second PC axes (Fig. 1a). The pattern was driven mainly by 'total length' and 'length to notch' of both anchors and the 'maximum straight width' of the dorsal bar (Fig. 1a). Moreover, clustering of specimens collected from similarsized fish hosts was visible along the second PC axis. Conversely, no significant differences in the morphology of the parasite's male copulatory organ related to sub-basin were detected (copulatory tube length -F=0.000(1,65), p=0.989, heel length -F=0.132(1,68), p=0.718, Fig. 1b&c). Overall, a more pronounced differentiation between geographically defined populations was apparent in the shape of the dorsal anchor compared to the results for the ventral one (Fig. 2). Differentiation of geographically defined populations of C. casuarinus in the shape of the dorsal anchor was visible mainly along the second PC axis. The results of RWA (including sliding landmarks) followed the pattern obtained via PCA but did not provide higher resolution (Fig. 3). Nevertheless, the shape of both anchors is significantly different between the sub-basins (dorsal anchor -F= 4,38 (12,552), p<.0001, ventral anchor -F=2,39(12,588), p=0.0051) with significant correlation between the shape and centroid size of dorsal (F=52,52(1,46), p<.0001) and ventral anchor (F=28,93(1,49), p<.0001), respectively.

Genetic diversity and differentiation
Overall, the total dataset (lake-wide sample) of the cox1 gene portion (n=68) contained 55 different haplotypes with 65 polymorphic sites. Haplotype and nucleotide diversity were estimated at 0.9890 and 0.021099, respectively. Tajima's D was negative (D = -1.31540, p = 0.07300). The nonhierarchical topology of the haplotype network indicated the absence of geographically driven population structure (Fig. 4). However, significant differentiation between populations from H. stenosoma from the northern sub-basin and Mpulungu (the southern sub-basin) (FST = 0.05002, p = 0.04524 ± 0.0020) was observed. New sequences for cox1 mtDNA were obtained from 24 individuals of C. casuarinus ex H. stenosoma from the southern sub-basin (Mpulungu), comprising 21 different haplotypes and containing 33 polymorphic sites. Haplotype and nucleotide diversity in the southern sub-basin were estimated to 0.987 and 0.02017, respectively. Tajima's D was negative, but not significantly different from zero (D = -0.39985, p = 0.39800). Genetic distance among haplotypes ranged from 0.3% to 3.8%.

Mitogenome and nuclear ribosomal operon
Genomic DNA sequencing on the Illumina NovaSeq6000 platform yielded 15,980,972 indexed paired-end reads. A complete mitochondrial genome of 15,575 bp was assembled and annotated (Fig.  5). The total number of properly mapped reads across the assembled mitochondrial genome was 100,448 and 76,009 after filtering steps. The coverage along the various mitochondrial regions is detailed in Table 1. The total length of the ribosomal operon was 7,584 bp (see Supplementary Table  2). The mitochondrial genome of C. casuarinus comprises 12 (all except atp8) intron free protein coding genes, 22 tRNA genes and two genes coding for the large and small subunits of the mitochondrial rRNA following the gene order of other species of Cichlidogyrus (see Table 1) . The absence of atp8 was detected as in other neodermatan flatworms (Caña-Bozada et al., 2021). We report an abbreviated stop codon TA for nad2 as previously observed in C. halli and C. mbirizei [68]. An alternative start codon ATT was found for nad1. Three non-coding regions were assembled in the mitochondrial genome of C. casuarinus. One of the non-coding regions is located before the genes coding for rRNA and is AT rich (1,096bp, 31,8% GC). Further, another AT rich region was assembled after the genes coding for rRNA (354bp, 21,1% GC). A repeat region of 1,307bp long including 11 repetitions of a 90bp motif is located between the genes coding for nad5 and trnG.  The sliding window analysis showed various levels of intraspecific nucleotide diversity between the protein coding genes of C. casuarinus with the highest values reported for atp6, nad2 and parts of nad5 (see Fig. 6a). All protein coding genes showed negative values of Tajima's D with the lowest values in cytb and nad6 (Fig. 6b). The nucleotide diversity for the cox1 fragment in the PoolSeq data was 0.01460, Tajima's D parameter -1.67146. In contrast to the intraspecific level, at the interspecific level the gene coding for cox1 showed the lowest level of nucleotide diversity in comparison to other protein coding genes. The highest values were reported for the nad2 and nad5 genes (Fig. 7).

Morphological and mitochondrial diversity and host use in Cichlidogyrus casuarinus
In this study, phenotypic variation related to geographic origin of a monogenean parasite infecting bentho-and eupelagic fish hosts in Lake Tanganyika, C. casuarinus, was contrasted with the lack of a clear phylogeographic structure in the genetic data. In general, as a crucial part of the attachment organ and the physical interface between parasite and host, sclerotised haptoral structures of monogeneans are presumably under strong evolutionary constraints [75,76]. Given the lack of host preference reported in the northern sub-basin [28] and a lack of clear geographic structure on a lake-wide scale found in the present study, we propose that geographical morphological variation displayed by C. casuarinus is driven by external environmental conditions imprinted during ontogenetic development. Specifically, the variation present in the dorsal anchor and bar of C. casuarinus correlates with host species identity [28] and environmental conditions related to contrasting geographic origin. In addition, morphological variation in the ventral anchor is related to geography in C. casuarinus ex H. stenosoma. In general, the overall shape of the ventral anchor was found to be more informative for the host species identity [28] and the dorsal anchor for the external environment. The lack of a clear genetic phylogeographic structure in C. casuarinus is in accordance with the fact that the MCO structures are of similar size throughout the lake, suggesting there is no reproductive isolation at the geographical scale. A positive correlation between host dispersal capacity and monogenean intraspecific morphological variability was suggested already for other cichlidmonogenean combinations in the lake by another recent study [34]. Similar to the present study, morphological variation related to different host and external environmental conditions was reported in Neobenedenia girellae (Hargis, 1955), a cosmopolitan fish parasite [77] and Gyrodactylus katharineri Malmberg, 1964 infecting various cyprinid hosts in Europe [78]. The maintenance of the generalist lifestyle of C. casuarinus might be explained by its adaptation to lower host availability via stabilizing selection on genotypes promoting the morphological variation. However, more data is needed to reveal processes behind recorded patterns. A higher level of phenotypic plasticity of rather generalist monogenean species, including C. casuarinus, is visible in the wider range of morphological characters compared to more specialised congeners such as . This pattern has also been reported in other monogenean genera such as Kapentagyrus infecting clupeid species in Lake Tanganyika [79] or Lamellodiscus spp. from sparid fishes in the Mediterranean Sea [80]. The fact that morphologically similar sister species [70] [34,81]. Moreover, the previous suggestion that host species size is related to morphological variation of C. casuarinus [28] has been confirmed here. Such a phenomenon was also reported for Kapentagyrus spp. infecting clupeid species in Lake Tanganyika [33] and Kuhnia scrombri (Kuhn, 1829) parasitizing Scomber australasicus Cuvier, 1832 and S. japonicus Houttuyn, 1782 in the Indo-Pacific Ocean [82].
In a deepwater pelagic environment with a lack of apparent physical barriers and a limited number of ecological niches, fish speciation is assumed to be mainly driven by resource partitioning [83,84] and spawning behaviour [19,[85][86][87]. Therefore, benthopelagic foraging is believed to limit dispersal propensity of B. graueri and H. stenosoma, in contrast to the unrestricted migration of eupelagic species such as Bathybates fasciatus Boulenger, 1901 and Bathybates leo Poll, 1956. The lack of clear phylogeographic structure in C. casuarinus contrasts with the reported north-south gradient seen in the host species, H. stenosoma [42]. We propose this is a result of the parasite's intermediate generalist lifestyle [35] infecting a range of host species from different genera. Other host species of C. casuarinus, such as B. fasciatus and B. leo, show no restriction of gene flow in the study of [42], indicating migration on a lake-wide scale, and may hence transport C. casuarinus across Lake Tanganyika. Therefore, our results show a limitation of the parasite's magnifying potential by the least structured and the most mobile host species, as these are even in low densities sufficient to maintain a widely distributed and lake-wide nearly panmictic parasite population. This suggests physically (at least seasonally) overlapping occurrence of bathybatine cichlids as this is a condition necessary for maintaining the broad host range of C. casuarinus. Even though there is large-scale niche differentiation with respect to prey preferences, main habitat and preferred water depth among the various bathybatine species vary [88], they are likely to come into contact on a regular basis. Whether they also occur in mixed schoals like semi-pelagic cichlids of the more shore associated genera Cyprichromis and Paracyprichromis [26,89] is not known for sure, but mixed catches at the local fish markets indicate that they do occur, at least occasionally, together. Alternatively, as the monogeneanhost relationships in Lake Tanganyika remain poorly understood [70], undescribed host interactions of C. casuarinus could bridge physical niche partitioning of bathybatine cichlids. High levels of haplotype and nucleotide diversity in the studied portion of cox1 region throughout concur with previously reported results for C. casuarinus in the northern part of the lake. Further, negative values of Tajima's D, though not significant, are consistent with previously suggested population expansion in this monogenean species congruent to the demographic history of some of the bathybatine hosts [42].
The known host range of C. casuarinus was updated with B. graueri. The fact that this hostparasite interaction was not found in the northern part of Lake Tanganyika may be explained by geographically and/or seasonally dependent infection of C. casuarinus on B. graueri. However, given the overall low level of geographic structuring and the lack of host preference, it is also possible that because of the small sample size this interaction was overlooked previously. Consequently, only two species of Bathybatini remain for which the presence of monogenean parasites is unconfirmed, Bathybates ferox Boulenger, 1898 (investigated in [28], but only n=7) and Hemibates koningsi Schedel & Schliewen, 2017, the latter of which was only described recently [90].

Ribosomal operon and its utility for future studies/research
Portions of ribosomal operon coding for all the nuclear ribosomal genes (18S, 5.8S and 28S rRNA) and internal transcribed spacer regions (ETS, ITS1, ITS2) are widely used for phylogenetic reconstruction of parasitic [53,91] as well as free living flatworms [92]. However, the low number of species with an assembled ribosomal operon available has restricted its full use in phylogenetic reconstructions of monogenean taxa so far. Within parasitic flatworms, the combination of rRNA genes and ITS regions is commonly applied to address species level boundaries [91,93]. Length differences of the ribosomal operon within species of Cichlidogyrus (7,496 bp in C. halli, 7,005 bp in C. mbirizei and 7,584 bp in C. casuarinus) are mainly present in the ITS regions, as reported in the first genomic study on African monogeneans [68]. Knowledge on the variation present within this multicopied DNA locus could be further applied to the emerging field of environmental (eDNA) metabarcoding and metagenomics and enable routine identification of parasite communities including monogeneans.

Mitochondrial genome Lake Tanganyika and the rest of the monogenean world
In the present study, the first monogenean and first parasite mitochondrial genome from Lake Tanganyika is presented. A high level of genomic diversity in mitochondria including numerous rearrangements has been previously reported in monogeneans [68], other parasitic or endosymbiotic [94] and free living flatworms [95]. Comparisons at the family level of Dactylogyridae revealed tRNA gene transposition of trnT and between trnL2 and trnR (reviewed in [96]). Unlike in other genera of parasitic flatworms, such as Schistosoma spp. [97] or Syndesmis spp. [94], the lack of rearrangements of the order of PCGs or tRNA genes compared to other species of Cichlidogyrus suggests that gene order is conserved in this monogenean lineage across different clades [70].Similar to its congeners for which the full mitogenome is available, three non-coding regions were assembled in C. casuarinus. Variability in the position, length and GC content in NCRs within and between lineages was previously reported in endosymbiotic/parasitic [68,94] and also free living flatworms [98]. However, the presence of a NCR between the nad5 and cox3 coding genes has been found in all representatives of Dactylogyridae and in other monogenean families such as Diplectanidae [99] and Tetraonchidae [100] for which the mitochondrial genomes are available. Similarly, the position of an AT-rich NCR between rrnS and cox2 seems to be fixed in Cichlidogyrus, as already suggested by Vanhove et al., (2018). However, as more than 130 species of Cichlidogyrus have been already described [24], future mitogenomic studies are needed to verify the generality of these patterns. In the mitogenome of C. casuarinus, the position of an AT rich NCR between rrnL and cox1 is currently unique within monogeneans. Mitochondria play a central role in energy generation and in several other mechanisms involved in cellular homeostasis [101]. The function of NCRs in the mitogenomes of flatworms remains for the most part unknown but a function in mtDNA replication and transcription, including the initiation site for replication has been suggested [102,103].
Most of the assembled PCGs in the mitogenome of C. casuarinus employed canonical start and stop codons, but similarly to the situation in other species of Cichlidogyrus, cox3 and nad2 regions end in abbreviated stop codons T and TA, respectively. However, in the case of nad2, an overlap of 1 bp with trnV would allow the presence of the canonical stop codon TAG as was reported in the annotation of C. sclerosus (JQ038226, unpublished). Truncated stop codons have been reported across different lineages of parasitic [68,96,104], endosymbiotic [94] and free living flatworm taxa [105] , and also in early diverging acoelomorphs [106]. An alternative start codon ATT was previously assembled for several flatworm taxa [94,98,107,108] but here it is reported for the first time in dactylogyrid monogeneans, as the start codon of the nad1 gene in C. casuarinus.

Intraspecific variation at the mitogenome level
In concordance with previous studies on monogenean mitochondrial diversity [68,109], the cox1 region appeared as the least variable PCG at the interspecific level. Moreover, it showed the lowest non-synonymous to synonymous substitution ratio compared to the other PCGs in previous studies [68,100] and its product is considered a highly conserved protein [110]. Moreover, the reported negative values of Tajima's D across the mitogenome of C. casuarinus suggest that all PCGs are under purifying selection and/or the species experienced recent population expansion [111]. The sliding window approach applied on pooled NGS data of C. casuarinus (Fig. 6) revealed a similar level of nucleotide diversity in cox1 as in the other PCGs. Globally, purifying selection which acts against deleterious mutations is reported for mitogenomes across the animal kingdom in line with the major role of mitochondria in the respiratory chain which requires coding sequence functionality [112]. However, purifying selection acting on mitochondrial genes does not prevent local positive selection at the intraspecific level driven by host and/or environmental differences with the highest number of polymorphic sites occurring in cox1 and cytB [113]. In comparison to other pelagic monogenean lineages in Lake Tanganyika, such as Kapentagyrus spp. infecting clupeids and Dolicirroplectanum lacustre (Thurston and Paperna, 1969) parasitising on lates perches, C. casuarinus showed a higher nucleotide diversity in cox1 [33,114]. Adaptive evolution driven by life-history innovations acting on mitochondrial genes has been already reported for monogeneans [115], other parasitic flatworms and other invertebrate and vertebrate taxa [116], including cichlid fishes in Lake Tanganyika [117]. The high level of intraspecific variation in the cox1 region might be explained by the generalist lifestyle of C. casuarinus possibly as an adaptation to the broad ecological niche of its host assemblage.
High coverage in regions coding for rRNA (Table 1) might be explained by the uneven postmortem fragmentation of mitochondrial regions resulting in uneven representation in genome libraries towards better preserved regions [118] or by certain motifs being prone to high rates of error and low coverage [119]. Alternatively, the presence of nuclear insertions of mitochondrial origin (NUMTs) as detected in flatworms (Roberts et al., 2020) and nuclear genomes of various organisms [120,121] cannot be excluded.

Methodological implications for future studies
Notably, allele frequencies in the shared polymorphic sites identified using the individual-based approach and PoolSeq dataset, respectively, were highly comparable. We report a higher number of polymorphic sites (51 versus 33) and lower nucleotide diversity in the pooled NGS data compared to the individually retrieved haplotypes of the cox1 gene portion retrieved from the same metapopulation of C. casuarinus. These results correspond with the larger number of individuals pooled compared to individually sequenced (80 versus 24) and the relatively high haplotype diversity in the studied cox1 gene portion of C. casuarinus. As such, the reported minor differences in allele frequencies between individual-based and NGS datasets might be a consequence of the different parasite individuals data were generated from. False positive SNPs can be possibly identified using the known frequency of the rare alleles present in the population of targeted species as a threshold [122]. In our study, 10 polymorphic sites unique to the NGS dataset showed a lower frequency compared to the rarest allele captured using individual-based sequencing (see Fig. 8) and could be therefore considered as false positive. The unique sites showed the relatively low frequency of the alternative allele. Additionally, nine SNPs in the PoolSeq data had a lower frequency compared to the theoretical threshold of a singleton in a population of 80 individuals (allele frequency of 0.0125). The absence of certain polymorphic sites (13 captured in individual-based approach from Mpulungu only) could be caused by the loss of rare haplotypes due to the necessary filtering steps as part of the NGS data pipeline [123,124]. The reported difference might be further related to the lower coverage per individual (ranging from 7.6 to 13.6X -see Table 1) compared to the 20X proposed to adequately reflect genetic variability based on experimental studies [125,126]. Additionally, DNA quantification followed by optimisation of DNA input per specimen prior to pooling might reduce the bias towards certain specimens and sites [124,126]. Overall, PoolSeq proved suitable to assemble the mitogenome of non-model, tiny organisms preserved under field conditions, and to evaluate the level of intraspecific nucleotide diversity across the mitogenome.
Given its relatively high abundance, and the now considerable baseline knowledge on its morphological and genetic variation, combined with a widespread occurrence in the closed pelagic ecosystem of Lake Tanganyika, we propose C. casuarinus as a model to study 1) mechanisms driving host-range difference in comparison with host-specific species of Cichlidogyrus that also occur in Lake Tanganyika, and 2) the role of phenotypic plasticity in (the lack of) diversification and speciation. Author contributions NK performed the morphological and molecular characterisation, analysed (geo)morphometric, genetic and genomic data and drafted the manuscript. CH and JV contributed to PoolSeq data generation and analyses. MPMV discussed the results, helped draft the manuscript and supervised the study. SK and HZ identified the fish hosts, contributed to sampling and provided scientific background on Lake Tanganyika and its ichthyofauna. SK, HZ, MPMV, TA and CH revised the manuscript. NK, MPMV and CH designed the study. MG provided scientific background in the field of parasite ecology, and TA in the field of flatworm taxonomy and evolution. All authors read and approved the final manuscript.

Ethical statement
Fieldwork was carried out with the approval of the competent local authorities under the permission of the Fisheries Department of Zambia and under a study permit issued by the government of Zambia (SP 008732).

Conflict of interest
The authors declare that they have no competing interests.