Aquatic Hemiptera in Southwest Cameroon: Biodiversity of Potential Reservoirs of Mycobacterium ulcerans and Multiple Wolbachia Sequence Types Revealed by Metagenomics

Buruli ulcer (BU), caused by Mycobacterium ulcerans, is a neglected tropical disease associated with freshwater habitats. A variety of limnic organisms harbor this pathogen, including aquatic bugs (Hemiptera: Heteroptera), which have been hypothesized to be epidemiologically important reservoirs. Aquatic Hemiptera exhibit high levels of diversity in the tropics, but species identification remains challenging. In this study, we collected aquatic bugs from emerging foci of BU in the Southwest Region of Cameroon, which were identified using morphological and molecular methods. The bugs were screened for mycobacterial DNA and a selection of 20 mycobacteria-positive specimens from the families Gerridae and Veliidae were subjected to next-generation sequencing. Only one individual revealed putative M. ulcerans DNA, but all specimens contained sequences from the widespread alphaproteobacterial symbiont, Wolbachia. Phylogenetic analysis placed the Wolbachia sequences into supergroups A, B, and F. Circularized mitogenomes were obtained for seven gerrids and two veliids, the first from these families for the African continent. This study suggests that aquatic Hemiptera may have a minor role (if any) in the spread of BU in Southwest Cameroon. Our metagenomic analysis provides new insights into the incursion of Wolbachia into aquatic environments and generated valuable resources to aid molecular taxonomic studies of aquatic Hemiptera.


Introduction
Mycobacterium ulcerans is a slow-growing environmental pathogen that infects the skin and subcutaneous tissues, causing Buruli ulcer (BU; also known as Bairnsdale/Daintree ulcer in Australia) [1].BU is an emerging, treatable but neglected skin infection that manifests as slowly developing, unspecific indolent nodules, papules, or induration by oedema, which can progress to necrotizing skin ulcers [2].BU treatment outcomes can be good if the condition is diagnosed early, while late diagnosis can lead to extreme patient suffering and severe complications necessitating surgery or even amputation of limb(s) [3,4].The incubation period of the disease varies from weeks to several months, with a median of four to five months [5,6].Emergence of BU has been linked to the acquisition of a megaplasmid, pMUM, by Mycobacterium marinum, a far less virulent environmental pathogen associated with aquatic habitats [7].The megaplasmid encodes for polyketide synthases and accessory enzymes for the biosynthesis of a macrolide toxin, mycolactone [8], which is responsible for tissue damage and local immune suppression in BU [9].Phylogenomic evidence indicates that all mycolactone-producing mycobacteria (MPM) form a monophyletic group comprising three clonal lineages, which should be considered ecovars of M. ulcerans rather than distinct species (the so-called "Mycobacterium liflandii", "Mycobacterium pseudoshottsii", and "Mycobacterium shinshuense") [7].
BU has been reported in over 35 countries [10], almost half of which are in Africa [2].In Cameroon, the first cases of BU were diagnosed in 1969 in 47 patients residing in a confined area near the villages of Ayos and Akonlinga [11,12].This was two decades after BU was first described in Australia [13] and some years after the first case report in Africa [14].Although the number of BU cases in Cameroon has declined following active surveillance in line with recent global trends [15], it is still a major public health problem in this country.New foci of infection continue to emerge, while Bankim and Akonolinga remain as disease hot spots.From 2001 to 2014, the number of BU endemic health districts rose from two to 64 [15][16][17][18].
Mycobacterium ulcerans DNA has been detected in aquatic bugs (Hemiptera: Heteroptera) [19,20], other limnic organisms (e.g., aquatic plants, fish, tadpoles, and mosquitoes [21][22][23]), and environmental samples (freshwater and soil [24,25]) in several ecological studies, but its route(s) of transmission from these sources has not yet been fully elucidated [26][27][28].It has been hypothesized that M. ulcerans can be transmitted by human-biting aquatic bugs, notably those in the families Naucoridae, Belostomatidae, Notonectidae, and Nepidae [29][30][31].Although it has been clearly established that M. ulcerans specifically colonizes the salivary glands of biting bugs and they can transmit the pathogen to laboratory rodents [19,31], their role as disease vectors in nature remains controversial.Thus, while it is possible that bites from aquatic bugs could transmit M. ulcerans to humans [32], there is no evidence that this is epidemiologically relevant on a wide scale [33].Nevertheless, as most families of aquatic bugs contain species (or morphs) with flight ability [34], the colonization of aquatic bugs by M. ulcerans may be important in the spread of the pathogen into new areas.
Previous studies in central Cameroon have implicated aquatic bugs as reservoirs of M. ulcerans, with colonization of the insects being detected in a BU-endemic area but not a proximate nonendemic area [19].The current study investigated aquatic bugs from the previously undersampled Southwest Region of Cameroon for signatures of M. ulcerans in order to map environmental sources and potential reservoirs of the pathogen, as evaluating the presence of M. ulcerans in the environment is important in assessing the risks for human infection [26,28,34].Precise species identification and delimitation of aquatic bugs in tropical regions, where species diversity is very high, can be challenging.Several authors have proposed the use of DNA barcoding targeting a standard region of the mitochondrial gene, cytochrome c oxidase subunit I (coi), in the taxonomy of true bugs [35,36].Therefore, our study extended the previous use of DNA barcoding for identification of Cameroonian aquatic bugs [37] in a more focused geographical area, where BU appears to be emerging.
Recent reports of a high prevalence of the α-proteobacterial symbiont, Wolbachia, in aquatic insects [38] may also be of epidemiological relevance.Although Wolbachia is the most widespread animal symbiont on Earth, being found in > 50% of terrestrial arthropod species [39], the extent of its penetration into aquatic environments remains unclear.Wolbachia is known to affect the susceptibility of arthropod hosts to colonization by other microorganisms (either positively or negatively, depending on specific combinations of pathogen, vector, and Wolbachia strain [40][41][42]) and can profoundly impede vector competence if artificially introduced into naïve hosts [43].Therefore, M. ulcerans and Wolbachia may interact within the microbiome of aquatic bugs, either in a facilitative or competitive manner as observed with Wolbachia in other systems [44][45][46], leading to potential impacts on dissemination of the pathogen in the host and perhaps the wider environment.
Here, we present the results of an intensive environmental and taxonomic survey of aquatic bugs in the Southwest Region of Cameroon.Through a combination of traditional morphology-based taxonomy and DNA barcoding coupled with metagenomics, we provide new insights into the diversity of aquatic Hemiptera in Cameroon and acquired the first complete mitochondrial genomes from bugs for the country.While our original goal to characterize environmental strains of M. ulcerans from aquatic bugs in this region was not achievable due to a virtual absence of the pathogen in our specimens, we demonstrate that serendipitously, they harbored surprisingly diverse Wolbachia sequence types.

Study Sites
A cross-sectional descriptive study, in which study sites were selected using a multistage sampling approach, was carried out in the Southwest Region of Cameroon.This region was chosen because it is BU-endemic, but very little information on the disease (especially environmental sources of M. ulcerans) exists for this part of the country.Localities endemic for BU have been identified in the Southwest Region, leading to the establishment of a BU detection and treatment center; however, research and control efforts have so far been focused only in the three main BU-endemic foci (located in other administrative regions): Akonlinga (Centre), Ayos (Centre), and Bankim (Adamawa) [16,47].
Six health districts (HDs) from a total of 18 in the Southwest Region were selected for the study based on the cumulative number of BU cases reported within a 14-year period (2001-2014) [15].Two of the HDs (Mbonge and Ekondo-Titi) are referred to as mesoendemic foci in the present study because they reported cumulative BU cases of 75 and 37, respectively; values much lower than those reported by the three major endemic foci in Cameroon [Akonolinga (1081), Bankim (557), and Ayos (485)] within the same 14-year period [15].The other four HDs (Limbe, Buea, Muyuka, and Kumba) are referred to as hypoendemic foci because each reported <10 cumulative BU cases within the stated period.
An initial scoping exercise was undertaken in the towns and accessible villages in each selected HD to i) introduce the study to the local population and ii) map the sampling points (also termed as water bodies or sampling sites), which was done with the assistance of the population.Most sites were shallow and slow flowing, but became flooded rapidly during heavy rain during the sampling period, increasing flow rates temporarily (Figure S1A-D)

Sample Collection and Transportation
Aquatic bugs were collected from each of the 25 water bodies (Figure 1) from 13-25th July 2017, with sampling taking place between the hours of 09:00 and 15:00.As the survey required the collection of representative samples of aquatic bugs in the brief time available between heavy rainfall, and the focus was on molecular screening rather than ecological research, the area or volume of water sampled at each location was not standardized.A team of four workers spent approximately 40 minutes sampling from the water surface to the substrate (a maximum depth of ~1.3 m) at each site using a professional threepiece collapsible hand net (250 mm wide frame, 1 mm in mesh size, and 300 mm deep) (EFE and GB Nets, Lostwithiel, UK).The samples were sorted in a white plastic tray (EFE and GB Nets) in order to remove extraneous vegetable matter and nonhemipteran animals.Putative hemipterans were transferred into a 50 mL polypropylene tube containing absolute acetone, kept in a cooler, and transported to the Laboratory for Emerging Infectious Diseases, University of Buea, for short-term storage.Subsequently, the acetone was decanted and all samples were shipped to the University of Liverpool, UK, under an import license issued by the UK Animal and Plant Health Agency.The acetone was replaced and the insects stored at 4 °C.

Aquatic Bugs Identification and Morphotyping
Each bug was numbered uniquely for taxa identification.As discussed in prior works, the identification of aquatic Hemiptera in West Africa remains problematic due to high species diversity and incomplete faunistic records [37,48].Therefore, in case of specific relationships between bug populations and M. ulcerans, we proceeded to classify the material initially into morphotypes to allow retrospective matching between samples expended for molecular assays and those retained as voucher specimens.Using a stereomicroscope, taxonomic keys, written descriptions, specimen comparisons, image verification, and an extensive literature search [37,, the aquatic bugs were placed into families, genera, and, wherever possible, species.Confirmation of the identity of selected specimens was done by morphological comparisons with material in the collections of the Natural History Museum, London.
The bugs were rinsed with distilled water to remove acetone and each specimen was cut into several pieces with a separate sterile disposable scalpel (Swann-Morton, Sheffield, England) before transfer into a 1.5 mL snap cap tube.Ammonium hydroxide (150 μL, 1.5 M) (Sigma-Aldrich) was added to the tube and boiled in a heating block at 100 °C for 15 min [70].The tubes were centrifuged at 5,000× g for 1 min and reheated at 100 °C for 15 min with the lids open to evaporate ammonia and obtain a final volume of 70-100 μL.Insoluble material was removed after centrifugation at 10,000× g for 10 min and discarded.The concentration of DNA was determined by Quant-iT PicoGreen dsDNA quantification assay using a microplate fluorimeter (Infinite F200, Tecan) and Magellan Data Analysis Software (Tecan).The DNA samples were stored at 4 °C.

Coi Amplification (DNA Barcoding)
For DNA barcoding, the mitochondrial coi region was amplified.Unless stated otherwise, each conventional PCR run in this study had a final volume of 20 μL, comprising 10 μL BioMix Red master mix (2×), 1 μL of each primer from a 10 μM working stock (final concentration, 0.5 μM), 1 μL DNA template, and 7 μL nuclease-free water.All PCR amplifications were conducted in a Biometra TRIO thermal cycler (Analytik, Jena, Germany) and electrophoretic separations of PCR products were on a 1% agarose gel stained with SYBR Safe DNA gel stain (Invitrogen).The PCR products were visualized and photographed with a Safe Imager transilluminator (Invitrogen).A negative control, in which DNA template was replaced with nuclease-free water, was included in each PCR run.
The primers used for the amplification of the coi fragment (710 bp) were LCO1490: 5'-GGTCAACAAATCATAAAGATATTGG-3' (sense) and HC02198: 5'-TAAACTTCAGGGTGACCAAAAAATCA-3' (antisense) [71].PCR amplifications comprised initial denaturation at 94 °C for 5 min, followed by 35 cycles of denaturation at 94 °C for 1 min, annealing at 50 °C for 1.5 min, and extension at 72 °C for 1 min.The final extension was at 72 °C for 5 min and the reactions were then held at 4 °C.Selected amplicons were directly sequenced in both directions by Macrogen (Seoul, South Korea) using the same primers used for PCR amplification.

Amplification of a Mycobacterial rpoB Gene Fragment by Quantitative PCR
Quantitative real-time PCR (qPCR) was used for rapid screening to detect and quantify mycobacterial DNA in a subset of aquatic bug specimens (see Results).The qPCR was designed to target a fragment (88 bp) of the RNA polymerase β-subunit (rpoB) gene that is 100% conserved in most nontuberculous mycobacteria, including MPM.The following primers and probes were synthesized by Eurofins Genomics (Ebersberg, Germany): forward primer (Myco-F: 5'-GATCTCCGACGGTGACAAGC-3'), reverse primer (Myco-R: 5'-CAGGAACGGCATGTCCTCG-3') and Myco-probe (5'-6FAM-ACGGCAACAAGGGCGTCATCGGCAAGATCCT-BHQ-1-3').A 20 μL reaction mix was prepared containing 1 μL DNA template, 10 μL SYBR Green master mix (2 ×) (Bioline), and final concentrations of 0.5 μM for each primer and 0.25 μM for the probe in PCR-grade water.Amplification proceeded on a CFB-3220 DNA Engine Opticon 2 System (Bio-Rad) at 50 °C for 3 min (holding) and 95 °C for 5 min (initial denaturation), followed by 40 cycles of 20 sec at 95 °C and 40 sec at 60 °C.To quantify the mycobacterial load of each sample, each run included a standard curve with DNA concentrations corresponding to 1,000,000 to 0.1 copies per reaction of a synthetic mycobacterial rpoB oligonucleotide standard (Eurofins Genomics), diluted in 100 ng/μL yeast tRNA (Invitrogen, Carlsbad, CA, USA) to prevent aggregation.Each DNA standard dilution and sample was assayed in duplicate.Opticon Monitor software (v.3.1) was used for linear regression analysis based on tenfold dilutions of the standard.Each PCR run included a negative control (PCR-grade water) and a positive control (genomic DNA from Mycobacterium ulcerans NCTC 10417, Public Health England Culture Collections, Salisbury, UK).
The frequency of rpoB-positive bugs was compared between meso-and hypoendemic sites using a χ2 analysis at www.socscistatistics.com.

Confirmatory Conventional PCRs for Bacteria
For specimens positive with the rpoB qPCR, a 400 bp fragment of the IS2404 insertion sequence from MPM was amplified in a first-round PCR, while the nested round amplified a 150-200 bp fragment.Both rounds of PCR were performed with primers as previously described [72].The PCR reactions were prepared as described above for coi, except for the nested round, in which 1 μL PCR product from the first-round PCR was used as a DNA template.The PCR amplifications were carried out under the following conditions: initial denaturation at 95 °C for 15 min followed by 25 cycles of denaturation at 94 °C for 30 sec, annealing at 64 °C for 1 min, and extension at 72 °C for 1 min.The final extension was at 72 °C for 10 min.The cycling parameters for the nested round were identical to those of the first round, except the number of cycles was raised to 30.

Metagenomic Sequencing
Twenty gerrid and veliid specimens with the highest copy numbers of mycobacterial rpoB were subjected to next-generation sequencing on an Illumina HiSeq 4,000 platform (Illumina, Inc., San Diego, CA, USA).Illumina fragment libraries were prepared from genomic DNA using the Nextera XT kit as previously described [74], and sequenced as paired-ends (2 × 150 bp) over two lanes, with a minimum of 280 million clusters per lane.Base-calling, demultiplexing of indexed reads, and trimming of adapter sequences was conducted as reported previously [74].Raw data were submitted to the Sequence Read Archive at the National Center for Biotechnology Information (NCBI) under BioProject PRJNA542604 and accession numbers SRX5884614-SRX5884633.

Taxonomic Assignment
Taxonomic read assignments were performed using k-mer-based classifiers Kraken v. 2.0.7 [75] and CLARK-S v. 1.2.5 [76] with default settings.The output from Kraken2 or CLARK-S was visualized using the Krona v. 2.7 [77] tool, creating hierarchical interactive pie charts.MetaPhlAn2 v. 2.6.0 [78] was also used to identify bacterial species and estimate their relative abundance across all 20 samples according to the database 'mpa_v20_m200'.Velvet v. 1.2.10 [79] wrapped by VelvetOptimiser v. 2.2.6 was used to assemble the 20 bug genomes to contig level.According to these assemblies, GC-coverage plots (proportion of GC bases and Velvet node coverage) were generated using BlobTools v. 1.0.1 [80].Taxonomic assignment of these contigs was achieved using megablastn to search against the nr database.

Mitogenome Assembly and Annotation
The mitogenome of each sample was assembled de novo using NOVOPlasty v. 2.7.2 [81], with the D. melanogaster mitogenome as a seed sequence (NC_024511.2).Each circularized mitogenome assembly obtained in the first run was used iteratively as a seed sequence to run NOVOPlasty again.In all the assembly processes, the k-mer length was set to 39 and other operation parameters in the configuration file were left on the default setting or set to fit the features of the reads.Protein-coding, rRNA, and tRNA genes were annotated with Mitos2 [82] by searching the Refseq 81 Metazoa database using the invertebrate mitochondrial genetic code.The protein-coding genes were manually added or edited, if necessary, after comparing with other hemipteran mitogenomes via tblastn.The tRNAs were identified with the tRNAscan-SE search server v. 2.0 [83] by setting sequence source to 'other mitochondrial' and genetic code for tRNA isotype prediction to 'invertebrate mito'.Mitogenome sequences were compared using BLAST Ring Image Generator v. 0.95 [84] to Perittopus sp.(JQ910988.1)[85], Aquarius paludum (NC_012841.1)[86], and Gigantometra gigas (NC_041084.1)[87] reference sequences.Gene rearrangement analysis of the assembled mitogenomes were visualized using the Mauve genome aligner v. 25 February 2015 [88].MrBayes v. 3.2.6 [89] was run with the generation parameter set to '1 in 100 samples' and the substitution model set to 'GTR + G + I' (the first 25% of samples were discarded as burn-in) to build a mitogenome tree based on protein superalignments, as previously described [90].The assembled mitogenomes were submitted to NCBI under accession numbers MN027271-MN027279.

Barcode Analysis of coi Sequences and Phylogenetics
All Sanger-sequenced coi reads were trimmed of ~50 poor-quality bases at the 5' and 3' end before assembly using the cap3 [91] program.The assemblies were then submitted to the Barcode of Life Data (BOLD) System [92] v. 4 under the project name "CBUG" and analyzed using the "Cluster Sequences" feature, which calculates pairwise distance following alignment with an amino acid-based hidden Markov model.Gaps were handled by pairwise deletion and the sequences were binned into operational taxonomic units (OTUs) by nearest-neighbor distance.The coi sequences were compared with reference gerrid, veliid, and mesoveliid sequences from NCBI and from a previous Cameroonian barcoding study of aquatic bugs [37], in which the data were available in their supplementary information but not BOLD Systems or NCBI.Alignments were performed using Mafft v. 7.402 [93] with the '--auto' option and automatically trimmed with Gblocks v. 0.91b [94].A phylogenetic tree for coi sequences was reconstructed using IQ-TREE v. 1.6.7 [95] with the parameters '-m MFP -alrt 1000 -bnni -bb 1000' to determine the best-fit model for ultrafast bootstrap, and both Shimodaira-Hasegawa-like (SH)-like approximate likelihood ratio tests and ultrafast bootstrapping were conducted to access branch supports within one single run.

Analysis of Wolbachia Genes and Phylogenetics
Wolbachia reads assigned by Kraken2 were further assembled and scaffolded using SPAdes-3.7.1 [96] with the '--careful' function.Wolbachia protein-coding genes in these assemblies were predicted using Prokka v1.13 [97] with default settings.One-to-one orthologs of Wolbachia protein-coding genes in five reference Wolbachia strains (GCF_000008025.1 GCF_000073005.1, GCF_000306885.1, GCF_000008385.1, GCF_001931755.2, and GCF_000829315.1), each from a different taxonomic "supergroup" (higher-level clade), were identified by using OrthoFinder 2.2.3 [98].To investigate the Wolbachia supergroups in our samples, concatenated phylogenetic trees were constructed for multiple orthologs (depending on the recovery of Wolbachia protein-coding genes in each assembly) at the amino acid level using IQ-TREE as described above.A phylogenetic analysis for the Wolbachia wsp gene (trimmed PCR amplicons) was also performed using MrBayes with the same parameters as for the whole mitogenome tree.Velvet-assembled contigs classified as Wolbachia bacteriophage (phage WO) were aligned onto a complete phage genome (KX522565.1)from wVitA, a symbiont of the parasitoid wasp Nasonia vitripennis [99], to identify the presence of phage in sequenced bug samples.

Distribution of Aquatic Bugs in the Health Districts
Of the 1,113 aquatic insects collected in the Southwest Region, 1,102 were identified as true bugs (Hemiptera) belonging to eight families and 29 putative genera.Specimens in each genus were further separated into morphotypes, generating a total of 110 (of which 34 were represented by only one specimen each).Subsequently, 331 individual specimens were retained for species confirmation and voucher specimens, while the remaining 771 were subjected to molecular analysis.Of the 25 sampling points, 17 (68%) were in hypoendemic zones, while the remaing eight (32%) were in mesoendemic areas.
Similarly, of the 1,102 aquatic bugs captured, 714 (64.8%) were from the hypoendemic sites and 388 (35.2%) from the mesoendemic sites.The distribution of the aquatic bugs by HD is shown in Figure 2.

Composition of Aquatic Bug Taxa
Most bugs collected belonged to just two families, with 498 (45.2%) from the Gerridae (water striders) and 426 (38.7%) from the Veliidae (riffle bugs).The Notonectidae, Naucoridae, Belostomatidae, Mesoveliidae, Hydrometridae, and Nepidae were represented by 54 (4.9%), 52 (4.7%), 36 (3.3%), 18 (1.6%),11 (1.0%), and 7 (0.6%) specimens, respectively (Figure 3).Overall, the population of aquatic bugs originating from families known to bite humans was very low (149/1,102, 15.5%) and distributed in the HDs according to Table S1.Visual inspection of the morphotypes indicated that Gerridae were dominated by Limnogonus spp.and the Veliidae by Rhagovelia spp.(Table 1).The marked morphological diversity of these two genera, highlighting the complexity of the taxonomic assignment, is illustrated in Figure 4. Overall, the specimens were allocated to 42 putative species, some containing multiple morphotypes (Table 1).The family Gerridae was the most diverse, with 18 putative species, followed by Naucoridae (8 putative species), Notonectidae (6 putative species), Veliidae (4 putative species), Belostomatidae (3 putative species), and one putative species each for the families Mesoveliidae, Hydrometridae, and Nepidae.The Gerridae, Veliidae and Hydrometridae were present in all HDs (Table S2); indeed, six sampling points (three in Buea HD and one each in Muyuka, Kumba, and Mbonge) had only gerrids and veliids.,In the first round of M. ulcerans screening by rpoB qPCR, the human-biting bugs (Table S1) were not analyzed for the detection of M. ulcerans due to the high intraspecific morphological polymorphism observed and their relative scarcity.For example, the 54 notonectids, placed in six putative species, comprised 16 morphotypes with the most abundant morphotype having only five specimens.Similarly, bugs in the families Mesoveliidae (18, 1.6%) and Hydrometridae (11, 1.0%) were also not analyzed for M. ulcerans carriage because of their small numbers (Table 1), although the former were not always easily differentiated from veliids (see Discussion).

Prevalence of Mycobacterial DNA in the Aquatic Bugs
The mycobacterial rpoB gene was detected and quantified in 8.9% of the 771 specimens, comprising 13.2% (53/405) from Gerridae and 4.4% (16/366) from Veliidae.The distribution of the positive specimens among the putative species is detailed in Table 2.The positive specimens comprised 7.9% (21/266) from mesoendemic sites and 9.5% (48/505) from hypoendemic sites (p = 0.457).None of the gerrid and veliid specimens positive for the mycobacterial rpoB gene were positive for the M. ulcerans-specific IS2404 sequence, although this region was successfully amplified from the positive control.To ensure potentially positive bug specimens had not been missed, DNA was also extracted from the 149 biting bugs (Table S1) and then assayed by the IS2404-nested PCR.Mycobacterium ulcerans DNA was not detected in any of these samples.

Bacterial Sequences in Gerrid and Veliid Metagenomic Datasets
As both epineustonic and nektonic bugs have been reported to have high infection rates with M. ulcerans in other BU-endemic regions of Cameroon (see Discussion), we applied next-generation genomic sequencing to the 20 bug specimens with the highest rpoB copy numbers to confirm that they were infected only with non-MPM species.Approximately 11-27 million reads were obtained per specimen (Table 3) and taxon-annotated GC-coverage plots ("blob analysis") revealed that several of the samples contained apparent bacterial sequences at a relatively high coverage rate (Figure S2).In most samples, the CLARK-S tool identified ~10-fold fewer bacterial k-mers than did Kraken2 (data not shown), so was not used for downstream analyses of bacterial sequences.K-mers identified as originating from Mycobacterium spp.by Kraken2 exhibited low abundance (<0.5% of all bacterial read pairs) in all specimens except for a Metrocoris sp.(Gerridae) from Muyuka, in which mycobacterial sequences comprised almost 5% of all bacterial read pairs (Table 3).However, 98% of these mycobacterial k-mers were classified as deriving from Mycobacterium sp.WY10, a non-MPM soil-associated species [100], with no sequences classified as M. ulcerans in this specimen.Only one specimen, a Limnogonus hypoleucus from Limbe, contained M. ulcerans sequences at a proportion of >0.1% of all bacterial sequences (84 read pairs), accounting for ~50% of mycobacterial sequences (Table 3).
Strikingly, read pairs classified as Wolbachia were more abundant than mycobacterial sequences in 16/20 specimens, and in many cases, the differential in read pair count between the two bacterial genera was 2-3 logs (Table 3).However, Wolbachia read pairs accounted for a very wide range of bacterial sequences (0.05%-63.08%) between specimens.Using a conventional PCR targeting wsp, we were able to amplify this Wolbachia-specific gene from all 12 specimens with >6,000 Wolbachia read pairs, but from none of the samples with a lower Wolbachia DNA content (Table 3).
In order to classify the Wolbachia strains in the bug specimens, we applied three approaches.First, Sanger sequencing of the wsp amplicons and a Bayesian phylogenetic analysis suggested that most Wolbachia infections belonged to supergroup B, represented in the tree by strain wPip (Figure 5).Two specimens apparently deviated from this pattern, with a supergroup A sequence related to wMel being amplified from Rhagovelia sp. 2, while a more divergent wsp sequence was recovered from Microvelia sp. 1, which was nested between representative sequences from supergroups F and D (Figure 5).Second, as wsp is known to recombine [101], especially between supergroups A and B that coinfect numerous arthropod hosts, we also built trees from protein-coding Wolbachia orthologues recovered from the metagenomic datasets.These supported the supergroup assignments suggested by the wsp analysis in all cases, with the Wolbachia sequences from Microvelia sp. 1 indicating a placement in supergroup F (Figure S3).Letters in parentheses after Wolbachia strain names refer to supergroup designations; suffixes after bug names in square brackets refer to location codes (see Table 3).
Third, the metagenomic phylogenetic analysis tool MetaPhlAn2 identified matches with supergroup B Wolbachia genomes in almost all samples with amplifiable wsp, although a strong signal assigned to a supergroup A genome (wMelPop) was apparent in the Rhagovelia sp. 2 specimen (Figure 6).For the Microvelia sp. 1 sample, Wolbachia reads were detected but not classified as being associated with a specific sequenced strain.The classification of the Wolbachia reads by MetaPhlAn2 suggested that more than one strain might be present in several samples, especially the Rhagovelia spp (Figure 6).Finally, it was noteworthy that sequences matching the bacteriophage of Wolbachia (phage WO), which is known to carry the genes responsible for Wolbachia-mediated reproductive manipulations in arthropods [102], were identified on numerous contigs from the Rhagovelia sp. 2 specimen (Figure S4).Of the 20 bug specimens subjected to next-generation sequencing, it was possible to assemble circularized mitogenomes in nine cases, of which only two were from veliids.Annotation of these genomes showed a gene order in both bug families that was typical of insects (Figure S5).The veliid mitogenomes appeared to be approximately 100 bp shorter than those from gerrids (Figure S5, Figure 7).However, due to a well-recognized limitation of Illumina sequencing technology [103], it was apparent that we achieved only low coverage of the control region (AT-rich region; see breaks in Figure 7 at ~15 kbp) compared with three available references from gerrids and veliids, especially for Metrocoris and Trepobates.Although putative differences in this region between species must be confirmed using alternative sequencing technologies, it appears to be potentially useful for discriminating between Limnogonus spp.(Figure 7A,B).To evaluate the concordance between morphological identifications and DNA barcodes, 50 gerrid, veliid, and mesoveliid samples that produced bright coi amplicons by PCR were sequenced, and 43 samples generated unambiguous alignments of forward and reverse reads.These were supplemented by an additional nine coi sequences obtained from the Illumina sequencing effort, producing a total of 52 sequences across 20 morphotypes.Twenty-one Sanger-sequenced samples were flagged as containing stop codons by BOLD Systems but were retained for downstream analyses.Sequence clustering by BOLD Systems produced 19 putative OTUs.A phylogenetic analysis of the coi sequences from gerrids, veliids, and mesoveliids, incorporating sequences from the study of Ebong et al. and top BLASTn hits from NCBI, revealed incomplete resolution of bug families [e.g., note the position of "V5" veliid sequences from Ebong et al. and references for Stridulivelia spp., Microvelia americana, and Perittopus horvathi (all Veliidae) on the tree; Figure 8].However, clustering of gerrid genera from our study (Trepobates, Eurymetropis, Metrocoris, Metrobates, and Limnogonus) and other species (L.hypoleucus, L. cereiventris, and to a large extent, L. intermedius) was clearly evident.The inclusion of barcodes containing stop codons frequently, but not always, generated spurious OTUs containing a single specimen in the BOLD cluster analysis as highlighted previously (Figure 8) [104].This had a comparatively greater impact on the analysis of the smaller veliid dataset, although a more likely explanation for the aberrant placement of our Mesovelia specimens is specimen misidentification, since support for their positioning within a clade of Rhagovelia spp. was strong (Figure 8).In summary, the 20 morphotypes were collapsed into 12 high-confidence OTUs, once those that contained only sequences with stop codons (seven OTUs) were disqualified.

Assembly of Complete Mitogenomes from Gerrids and Veliids and DNA Barcoding
Figure 8. Phylogenetic tree of coi sequences from gerrids (green), veliids (blue), and mesoveliids (dark red).Sequences from the current study are marked with an asterisk (*) and show location codes in square brackets, references from NCBI show accession numbers, and sequences from the prior study of Ebong et al. [37] are represented by short alphanumeric codes.The "Cluster Sequences" feature of BOLD Systems [92] was used to generate the operational taxonomic units (OTUs) and nearestneighbor distances (in parentheses).Dubious OTUs (labelled in red text) are associated with stop codons in coi sequences, indicated by a caret symbol (^) after the putative species name.Numbers on the nodes of the tree are in the format "SH-like approximate likelihood ratio test support (%)/ultrafast bootstrap support (%)".
With the exception of the "V5" sequences from the Ebong et al. (2016) analysis [37] and our "Mesovelia" specimens, clustering at the family level in both studies was consistent (Figure 8).A phylogenetic analysis at the whole mitogenome level also clearly separated gerrid and veliid specimens (including references from NCBI) into family-specific groups, while Limnogonus spp.were resolved as a monophyletic clade with unambiguous segregation by species (Figure 9).The "V4" sequences from Ebong et al. [37] appear to belong to our Rhagovelia spp.OTU 5, and the "Ge6" sequences identified as L. hypoleucus in Ebong et al. [37] were positioned proximal to our own L. hypoleucus (OTU 10) (Figure 8).However, we did not identify any Angilia spp.("V1") in our study (Figure 8).Additional comparisons between the two works were not possible, as Ebong et al. [37] did not provide putative identifications for several sequence clusters and their data are not held in BOLD Systems.Nevertheless, it is striking that despite the much greater geographic coverage of the Ebong et al. study [37], extending across most of Cameroon's administrative regions, they reported the initial identification of five gerrid genera, of which only three (all observed by us) were corroborated by DNA barcoding in their analysis.In our study located within one region, eight gerrid genera were recorded, of which five were selected for DNA barcoding and received strong phylogenetic support.Four of these genera (Trepobates, Eurymetropis, Metrocoris, and Metrobates) were not observed in the Ebong et al. study [37].

Discussion
The findings of our study contrast markedly with others from Cameroon in unearthing little evidence for the presence of M. ulcerans in aquatic bugs.In order to determine if this is a genuine insight into distinctive ecological characteristics of the pathogen in Southwest Cameroon, the possibility of artefacts or biases arising from our methods used in the field and/or laboratory need to be considered.We sought to screen the bug specimens rapidly with a general mycobacterial rpoB qPCR assay that allowed us to assess bacterial load, not only qualitative presence, for downstream next-generation sequencing.This was originally intended to provide data on the environmental strain(s) of M. ulcerans in the Southwest Region.However, the fact that the rpoB-positive specimens were negative with the nested IS2404 assay, which targets a highly repeated insertion sequence [7], strongly suggested that the bugs contained non-MPM DNA.This lack of M. ulcerans was confirmed in most cases following whole genome sequencing of 20 bug specimens with relatively high rpoB copy numbers.Thus, it seems unlikely that M. ulcerans infections in our collection were missed.
In common with other surveys of M. ulcerans infection in aquatic invertebrates [19,33,105], we did not attempt to remove external contamination from the insects' surface.There were two reasons for this: (a) in systematic studies, "surface sterilization" of insects has been shown to have no significant impact on the microbiome profiles determined by next-generation sequencing [106]; and (b) in predatory species, the impact of ingested organisms within gut contents is likely to have a much larger influence on microbiome data than surface contamination.The latter would have required painstaking dissection of hundreds of fresh specimens to mitigate even in part.Thus, although ~80 read pairs were classified as originating from M. ulcerans in one gerrid from a hypoendemic site (Limbe), this extremely low signal (not confirmed by the IS2404 assay) could originate from degraded M. ulcerans DNA from gut contents or nucleic acids bound to the insect cuticle.Nevertheless, it is suggestive of the presence of M. ulcerans at that site.
Since the laboratory procedures between our study and those reporting substantial infection rates of M. ulcerans in aquatic bugs were similar, then potential differences in the field sampling strategy should be considered.The most obvious is that our study was much smaller than those of Marion et al. [19] and Garchitorena et al. [105] in terms of the total number of bugs collected, and we assayed insects individually rather than in small pools.The previous studies were also undertaken in highly endemic areas for BU, with >10-fold more recorded human cases than even our "mesoendemic" sites.The influence of seasonality is also potentially important, although in areas of high endemicity, the positivity rate for M. ulcerans has been reported to be greatest in July to August [19,105], which was a similar period to our field survey in the Southwest.
A further important difference between our study and prior screens of aquatic bugs for M. ulcerans was evident in the dominant hemipteran families found at our sampling sites.Following the publication of the hypothesis that biting bugs are vectors of M. ulcerans [29], subsequent studies have tended to focus on biting taxa at the expense of nonbiting bugs, whereas we paid greater attention to gerrids and veliids, as they were much more abundant in our survey.A recent ecological analysis linked infection of aquatic Hemiptera with M. ulcerans to the nektonic lifestyle of predatory species that feed on animals their own size or even larger, occupy aquatic vegetation, and live towards the bottom of the water column [34].However, it is possible that these conclusions are a result of confirmation bias resulting from an emphasis in the literature on biting species in the epidemiology of BU.In accordance with this interpretation, when gerrids, and even phytophagous taxa such as the Corixidae, have been screened previously for M. ulcerans in Cameroon (albeit with smaller sample sizes than biting bugs), they have been found to be positive at rates equivalent or higher than those of other bug taxa [19].Moreover, the first isolation and characterization of the pathogen from a nonclinical source was from a gerrid in Benin [20], indicating that this aquatic bug family is at least as likely to have a symbiotic relationship with M. ulcerans as any other.Data from veliids in the BU literature are too scant to draw conclusions [33], but as they have a similar ecology to gerrids, there is no a priori reason to believe that they cannot be colonized by M. ulcerans.These considerations all emphasize that the lack of evidence for M. ulcerans in aquatic Hemiptera in Southwest Cameroon is not the result of methodological flaws; indeed, several studies on a much larger scale have fundamentally challenged the hypothesis that these insects have any substantive role in BU epidemiology [26,33].
The sequencing of the bug specimens in our study provided an opportunity to investigate other components of the microbiome in these relatively little-studied insects.In this small sample, Wolbachia was found not only to be the dominant symbiont but also to be represented by multiple sequence types.While Wolbachia is ubiquitous in arthropods from terrestrial environments, very few data on its distribution in aquatic species existed until recently, apart from analyses of holometabolous, medically important vectors with aquatic larval stages (principally mosquitoes [107], but also blackflies [108,109], and some limited surveys in aquatic crustaceans [110,111]).However, a 2017 meta-analysis of Wolbachia prevalence in aquatic insects (n = 228 species) estimated that >50% of species are infected, with Hemiptera exhibiting a particularly high prevalence (69%) [38].Importantly, assessing Wolbachia infection by molecular methods alone (including metagenomics) leads to potential pitfalls in interpretation, primarily due to high rates of lateral gene transfer from Wolbachia into the host genome, which can lead to Wolbachia relics or "genomic fossils" in species that may have lost the infection [112][113][114].Host genomes containing sequences from more than one supergroup have even been reported [115].Therefore, our putative Wolbachia infections in Cameroonian gerrids and veliids require confirmation by microscopy for whole bacteria or deeper sequencing to recover closed bacterial genomes.
Whether all of the Wolbachia infections in the aquatic bugs are extant or not, in our very small sample of fully sequenced specimens, evidence for three Wolbachia supergroups was apparent.Supergroups A and B are the most prevalent clades in arthropods worldwide, forming the basis of the so-called Wolbachia "pandemic" [116,117].Coinfection by both clades in a single host has been described repeatedly [118,119], with some evidence for this in our own specimens.The presence of supergroup F-like sequences in Microvelia sp. 1 was a more unusual finding, as this clade is less widespread than the A and B supergroups, although it is also the only one known from both arthropod and nematode hosts [120,121].As the Wolbachia signal in this specimen was relatively low (~6,000 read pairs, or 0.04% of the total), the possibility that the clade F sequences originate from gut contents or an arthropod or nematode parasite or parasitoid cannot be discounted at this stage [122,123].A priority for future research on the association of aquatic bugs with M. ulcerans will be to determine if Wolbachia influences this relationship; e.g., whether its presence might explain the limited evidence of M. ulcerans infection in our specimens compared with other locations in Cameroon.Notably, while most research on pathogen protection by Wolbachia in arthropods has focused on artificial infections for vector-borne disease control [43], Wolbachia has also been reported to influence susceptibility to pathogens in some natural systems [41,42].
A final contribution of our study was to extend the integrative taxonomy of aquatic Hemiptera in Cameroon following a previous published work [37].The Ebong et al. study [37] included a much greater number of specimens used for DNA barcoding (188) than our own from across the whole country.However, our intensive field survey in the Southwest produced a similar number of barcodes to Ebong et al. [37] for the Gerridae and Veliidae and they were selected from a larger sample size for these taxa.We obtained barcodes from four gerrid genera not observed in the Ebong et al. [37] study (Trepobates, Eurymetropis, Metrocoris, and Metrobates) and complete mitogenomes for Trepobates, Metrocoris, and Limnogonus; the first to be published worldwide.The barcoding effort showed some limitations, particularly with respect to the amplification of putative nuclear mitochondrial transfers (which accumulate mutations) and/or the impact of high rates of heteroplasmy [104].However, with stringent interpretation of barcode segregation, biases caused by these artifacts can be minimized, and the effects of mixing Sanger and next-generation sequencing technologies should also be considered carefully [87].
We noted that separation of aquatic bug families by coi sequences was not perfect in all cases, especially when considering reference sequences deposited in NCBI.Misidentifications are an obvious explanation for this discrepancy, and probably also explain the aberrant placement of Mesovelia spp. in our study.Unfortunately, although previous studies using a larger number of characters (e.g., whole mitogenomes) have been published for Hemiptera, representation of aquatic taxa has been patchy, with no inclusion of mesoveliid specimens for instance [85,86].While our whole mitogenome analysis for the small number of gerrid and veliid assemblies available separated the two families unambiguously, future efforts should focus on obtaining complete mitogenomes from mesoveliids too.
At the genus and species level, the coi barcoding proved its usefulness for gerrids in particular, but sampling of a more diverse range of genera will be required before its utility with veliid specimens can be firmly established.A very large barcoding effort for Hemiptera found that the minimum interspecific distance was >3% for 77% of congeneric species pairs, although a case of identical barcodes between different genera of mirid bugs was uncovered [36].An important consideration in species infected with symbionts that can cause cytoplasmic incompatibility, such as Wolbachia, is introgression of mitochondrial haplotypes following symbiont-driven selective sweeps [124].In some cases, this introgression can penetrate across hybrid zones during breakdown of reproductive barriers [125][126][127].In this respect, it is noteworthy that a phylogenetic analysis of Limnogonus spp.using two mitochondrial and one nuclear marker found the two subgenera (Limnogonus sensu stricto and the exclusively Afrotropical Limnogonoides) to be paraphyletic, as well as weak bootstrap support for clades and species within Limnogonoides [48].Although L. guttatus was not included in that analysis, it is thought to be closely related to L. hypoleucus and L. poissoni [48], and the extent to which these species might be able to hybridize in nature, potentially sharing Wolbachia strains, should be explored in future.

Conclusions
In this survey of M. ulcerans infection of aquatic Hemiptera in Southwest Cameroon, an emerging focus of BU, almost no evidence for a role for these insects as an environmental reservoir of the pathogen was uncovered.Therefore, we recommend that future epidemiological studies in the region sample more broadly from a variety of animal, plant, and inanimate sources.As aquatic Hemiptera from Cameroon are likely to remain of interest for both basic biodiversity studies and comparative ecological analyses on the distribution of M. ulcerans in the environment across West Africa, the role of Wolbachia in these species, and its potential impacts on hemipteran population structure and microbiomes, should be evaluated further.

Figure 1 .
Figure 1.Sample collection sites in the Southwest Region of Cameroon.

Figure 2 .
Figure 2. The distribution of the aquatic bugs collected in the health districts.

Figure 3 .
Figure 3. Aquatic bug families identified in this study and their distribution between six locations (health districts).

Figure 4 .
Figure 4. Representative specimens of the most common genera of aquatic bugs identified in this study: Limnogonus (Gerridae) and Rhagovelia (Veliidae).Note the marked morphlogical diversity within each genus.

Figure 7 .
Figure 7. BLAST Ring Image Generator [84] plots for circularized mitogenomes of Gerridae (A, B) and Veliidae (C).Details of reference genomes from National Center for Biotechnology Information (NCBI) are provided in the center of each map.

Table 1 .
Composition of aquatic bugs based on morphological characteristics and confirmation at the Natural History Museum, London.

Table 2 .
Detection and quantification of rpoB gene fragments in aquatic bugs from the study sites.

Table 3 .
Distribution of sequences classified as originating from Mycobacterium spp.and Wolbachia in 20 aquatic bug genomes as determined by Kraken2.