Virus Diversity, Abundance, and Evolution in Three Different Bat Colonies in Switzerland

Bats are increasingly recognized as reservoirs for many different viruses that threaten public health, such as Hendravirus, Ebolavirus, Nipahvirus, and SARS- and MERS-coronavirus. To assess spillover risk, viromes of bats from different parts of the world have been investigated in the past. As opposed to most of these prior studies, which determined the bat virome at a single time point, the current work was performed to monitor changes over time. Specifically, fecal samples of three endemic Swiss bat colonies consisting of three different bat species were collected over three years and analyzed using next-generation sequencing. Furthermore, single nucleotide variants of selected DNA and RNA viruses were analyzed to investigate virus genome evolution. In total, sequences of 22 different virus families were found, of which 13 are known to infect vertebrates. Most interestingly, in a Vespertilio murinus colony, sequences from a MERS-related beta-coronavirus were consistently detected over three consecutive years, which allowed us to investigate viral genome evolution in a natural reservoir host.


Introduction
It has been estimated that up to 75% of emerging and reemerging infectious diseases originate from wildlife species [1,2]. The increase of zoonotic spillover is likely due to human population growth, high exposure of humans to wildlife and products of animal origin, intensive livestock farming, climate change, habitat destruction, wildlife markets, and bushmeat consumption [3][4][5][6][7]. The inter-and intraspecies transmission of infectious diseases is based strongly on interactions between hosts, pathogens, and the environment [5].
Bats are considered to play a major role in disease emergence and as reservoir hosts of viruses that pose a risk to public health [8][9][10][11]. Most of the viruses found in bat feces originate from insects, plants, and fungi, representing the dietary habits of the animals [12]. However, many viruses of vertebrates are also found in bats, including emerging viruses such as Hendravirus, Ebolavirus, Nipahvirus, and SARS-and MERS-coronavirus (CoV) [12][13][14][15][16][17][18]. The origin of SARS-CoV-2 has not yet been clarified. However, in several different animal species, i.e., Himalayan palm civets, Asian palm civets, and bats of the species Rhinolophus and Aselliscus, viruses closely related to SARS-CoV-2 have been detected [19]. Although a beta-coronavirus named RaTG13 isolated from a bat of the species Rhinolophus affinis in Yunnan province in China seems to be the virus most closely related to SARS-CoV-2, with 96.2% nucleotide identity [20], this virus is not assumed to be the direct progenitor of SARS-CoV-2 [20,21]. SARS-CoV-2 is suspected to have a history of recombination and to share a common ancestor with three other bat viruses called PRC31, RpYN06, and RmYN02, which are more similar than RaTG13 in most of the genome [22,23].

Comparison of Viruses and Genes
Reads were cleaned from adapters and quality-trimmed with fastp (v0.20.1, [50]). Reads matching ribosomal sequences were removed with sortmerna (v4.3.4, [51]) using the eukaryotic 18S and 28S reference sequences from SILVA [52]. Reads matching bat genomes from Myotis myotis (NCBI ID GCF_014108235.1) or Rhinolophus ferrumequinum (NCBI ID GCF_004115265.1) with minimap2 (v2.24-r1122, [53]) were removed as well. Surviving reads were re-paired with the repair.sh script from BBTools (https: //sourceforge.net/projects/bbmap, accessed on 10 July 2022). Only paired reads were kept and aligned to a collection of viral reference sequences (downloaded from NCBI on February 2022; see Table S1) using bowtie2 with the option -very-sensitive-local (v2.3.5.1, [54]). The reference sequences were specific for each colony and were based on the results from the virome analysis above. Single nucleotide variants (SNVs) were called with freebayes (v1.3.2-40-gcce27fc, [55]) with the options -p 1 -min-coverage 10 -skip-coverage 100,000 -min-alternate-count 2 -use-best-n-alleles 2 -min-alternate-fraction 0.1 -pooledcontinuous -min-mapping-quality 1. SNV calls were filtered for a minimal quality of 20 with the script vcffilter distributed with freebayes. Number of reads and percent genome coverage were extracted with samtools (v1.10, [56]). The SNVs were filtered to include only the two best-supported variants per site. Importantly, these were not necessarily the reference SNVs from the reference assembly. Hence, we did not classify any of them as reference SNVs, but only compared the change in SNV frequencies over time.

Test for Shifts in SNV Frequency
To test whether SNVs changed in frequency from one year to the later years, we compared the allelic depths with Fisher's exact test. We did this for 2019 vs. 2020, 2020 vs. 2021, and 2019 vs. 2021. The p-values from all comparisons were then adjusted to reflect Viruses 2022, 14, 1911 4 of 22 false discovery rates (FDRs), and sites with an FDR < 0.01 were marked as significant. We classified whether the SNVs had an effect on the coding sequence by comparing the protein sequences containing the two best-supported SNVs with a custom script. We also identified potentially novel SNVs by checking whether variants were already present in the earlier year (s).

Phylogenetic Analysis
The NSP12 genes (encoding RNA-dependent RNA polymerase (RdRp)) of the alphaand beta-CoVs identified in this study were aligned using MUSCLE in MEGA X [57]. Phylogenetic trees were constructed in Mega X using the Maximum Likelihood algorithm based on the Tamura-Nei model with 1000 bootstrap values and a cut-off of 70%. For both viruses, full NSP12 reference genes were downloaded from the NCBI sequence database based on BLAST analysis.

PCR
The presence of a Middle East respiratory syndrome-related coronavirus (MERSrelated CoV) detected by NGS each year in the AG colony was confirmed by RT-PCR targeting the entire gene encoding the spike protein. First, RNA was extracted using the QIAamp Viral RNA Mini Kit (Qiagen, Hombrechtikon, Switzerland) according to the manufacturer's manual with a sample volume of 140 µL and the following modifications: RNA carrier was omitted, and for the elution step, 100 µL RNase free water was used instead of 60 µL AVE buffer. Then, the cDNA was synthesized using the SuperScript® IV First-Strand Synthesis System kit (Thermo Fisher, Waltham, MA, USA) as described by the manufacturer using 11 µL of RNA and random hexamers (Thermo Fisher, Waltham, MA, USA). To amplify the genomic region of the spike protein, five overlapping pairs of specific primers were designed using the Primer-BLAST tool (https://www.ncbi.nlm.nih.gov/tools/primer-blast/, accessed on 10 July 2022) based on the sequences obtained from NGS. Additional primer pairs were designed to fill gaps based on the Sanger sequences (Table S2). PCR was performed using a Phusion Hot Start II High Fidelity DNA Polymerase kit (Thermo Scientific, Waltham, MA, USA) in a three-step protocol (as described in the manual provided by the manufacturer) with 45 cycles and 1.5 µL of cDNA in a final reaction volume of 25 µL. The PCR products were visualized on 1.5% agarose gel with GelRed® (GelRed® Nucleic Acid Gel Stain, Biotium, Fremont, CA, USA) staining. PCR products with the expected size were purified by a QIAquick PCR Purification Kit (Qiagen, Hombrechtikon, Switzerland) following the instructions of the manufacturer with an elution volume of 30 µL. Finally, the DNA concentration was measured on the NanoDropTM OneC (Thermo Fisher Scientific, Waltham, MA, USA), and sent to Microsynth (Balgach, Switzerland) for Sanger sequencing.

Data Availability
The nucleotide sequences of the coronaviruses detected in this study were registered at GenBank under accession numbers ON325307-ON325310. All raw sequencing data were uploaded to the Sequence Read Archive (SRA) under accession number PRJNA835632.

Results
The results present the unbiased virome of three bat colonies and the evolution of selected virus genomes over three years.

Virome Analysis
In total, 1.69 × 10 8 raw sequencing reads were generated with an average of 2.17 × 10 7 reads per sample (range from 1.27 × 10 7 to 3.27 × 10 7 reads per sample) and analyzed using a reference-based assembly. The raw sequencing reads also included sequences of bacteriophages and plant viruses, which were excluded from further analysis. From the total number of reads, 2.76 × 10 6 (1.63%) were classified as viral, of which 1.57 × 10 6 (57%) were determined as viruses from vertebrates and 1.19 × 10 6 (43%) as viruses from invertebrates. Figure 1 shows a general overview of the different virus families from vertebrates and invertebrates found in the different colonies and years. Figure 2 and Supplementary Figures S1-S3 represent heatmaps of specific viruses of each colony aligned to a collection of viral reference sequences. The reads classified as viral were assembled to 22 different virus families, i.e., 9 from vertebrates, 9 from invertebrates, and 4 families with members infecting vertebrates or invertebrates. a reference-based assembly. The raw sequencing reads also included sequences of bac riophages and plant viruses, which were excluded from further analysis. From the to number of reads, 2.76 × 10 6 (1.63%) were classified as viral, of which 1.57 × 10 6 (57%) w determined as viruses from vertebrates and 1.19 × 10 6 (43%) as viruses from invertebra Figure 1 shows a general overview of the different virus families from vertebrates a invertebrates found in the different colonies and years. Figure 2 and Supplementary F ures S1, S2 and S3 represent heatmaps of specific viruses of each colony aligned to a c lection of viral reference sequences. The reads classified as viral were assembled to different virus families, i.e., 9 from vertebrates, 9 from invertebrates, and 4 families w members infecting vertebrates or invertebrates. The virus sequences identified at least at one sampling time point in the AG colo belonging to 15 different virus families ( Figure 1). Of these, seven families are known infect vertebrates, including Adenoviridae, Cirvoviridae, Coronaviridae, Nairoviridae, Perib yaviridae, Picornaviridae, and Polyomaviridae, and five families are known to infect inver brates, including Alphatetraviridae, Dicistroviridae, Iflaviridae, Nodaviridae, and Polycipiv dae. Three of the virus families found include members that can infect vertebrates or vertebrates, including Genomoviridae, Parvoviridae, and unclassified viruses. Five fami of DNA viruses were identified, i.e., Adenoviridae, Circoviridae, Genomoviridae, Parvovirid and Polyomaviridae, along with nine families of RNA viruses, i.e., Coronaviridae, Nairov dae, Peribunyaviridae, Picornaviridae, Alphatetraviridae, Dicistroviridae, Iflaviridae, Nodaviri high read abundance over the years was a CoV belonging to the genus alpha-coronavirus (aligned to MN065811) ( Figure 2C). Although the reads per million (RPM) was consistently high, the genome coverage in the year 2020 was poor.
As in the AG colony, bat-associated circovirus (aligned to MT815980) was the most abundant virus from vertebrates and showed the same RPM pattern, with the highest number in 2020 ( Figure 2C).  As in the AG colony, bat-associated circovirus (aligned to MT815980) was the most abundant virus from vertebrates and showed the same RPM pattern, with the highest number in 2020 ( Figure 2C).

Variability of Selected Genomes and ORFs over Time
We analyzed SNVs and amino acid substitutions in single-sequence reads of selected DNA and RNA viruses of vertebrates relative to reference genomes (Tables 1 and 2). In the AG colony, sequences aligned to two different circoviruses (bat circovirus isolate Acheng30, NC_035799, and bat-associated circovirus isolate BatACV/BtVm/Switzerland/2019, MT815982) and one polyomavirus (pipistrellus pipistrellus polyomavirus 1 PyV8-SHZC27, LC426677) were analyzed (Table 1). In the gene for the replication protein (rep) of the batassociated circovirus, only three non-synonymous nucleotide substitution were detected out of a total of 13 substitutions. In the sequences encoding the late T-antigen of the polyomavirus, only a single synonymous nucleotide substitution was observed.
In the GR colony, changes in the sequences aligned to adeno-associated virus 2 (AAV2, AF043303) and starling circovirus (DQ172906) were analyzed (Table 1). In four genes of AAV2, i.e., rep68, 78, 40, and 52, two non-synonymous nucleotide substitutions per gene were identified. In the circovirus, only one synonymous nucleotide substitution in the capsid protein (cap) was found.

Variability of Selected RNA Virus Genomes
In the AG colony, sequences aligned to a MERS-related CoV (MG021452) were found at all three time points. According to phylogenetic analyses based on the RNA-dependent RNA polymerase (RdRp) gene, this virus sequence clustered with and showed the highest similarity (approx. 87%) to a MERS-related CoV genome from China (MG021452) ( Figure 3).
As there was some variation of the genome coverage over the three years-77% in 2019, 86% in 2020, and 91% in 2021-only sequences that were identified in at least two years were analyzed for SNVs. Of the 186 substitutions found in the ORF1ab gene, 14 were non-synonymous and 14 were novel ( Table 2). A substitution was classified as novel when it was not previously present in the dataset. In the gene encoding the spike (S) protein, 12 substitutions were identified, of which 3 were non-synonymous and 3 were novel. In ORF4a and b, one non-synonymous substitution was detected in each gene, and in both genes, one substitution was novel. In ORF5, one out of four substitutions was non-synonymous and one substitution was novel. In genes encoding the envelope (E) and membrane (M) proteins, all substitutions were synonymous, while for the gene encoding the nucleocapsid (N) protein, 2 out of 10 substitutions detected were non-synonymous and 2 were novel. Finally, for ORF8b, five substitutions were identified, of which four were non-synonymous and three were novel (Table 2). stitutions were equally distributed over the regions of NTD and RBD, whereas the distribution of non-synonymous substitutions in the region of the SD-1 and SD-2 showed a gap.
The MERS-related CoV contig of 2019 had previously been uploaded to GenBank (MT818221). The contig of 2020, which was covered completely (30'047 nt), and of 2021, which was partial (25'618 nt), are uploaded now, as well, and can be found under accession numbers ON325307 and ON325308, respectively.  The substitutions were distributed equally over the genome, except for the locus encoding the spike protein and the ORF3, which showed a lower number of substitutions ( Figure 4). However, this is likely an artifact because the genomic region was only covered in the NGS data from the 2020 sample. To overcome this issue, we RT-PCR amplified and Sanger sequenced the genomic region encoding the spike protein from samples collected at all three time points. The analysis revealed that of the 429 substitutions, 333 were synonymous and 96 were non-synonymous. The majority of substitutions were found in the N-terminal domain (NTD) and the receptor-binding domain (RBD), while there were fewer in the S1 and S2 subdomains (SD-1 and SD-2) ( Figure 5). The non-synonymous substitutions were equally distributed over the regions of NTD and RBD, whereas the distribution of non-synonymous substitutions in the region of the SD-1 and SD-2 showed a gap.   ; receptor-binding domain of the S1 subunit (RBD); subdomains of S1 and S2 subunits (SD-1 and SD-2)) were predicted by ClustalW alignment to the homologous regions of available MERS-related CoVs. Each triangle indicates an SNV; green, synonymous; red, non-synonymous.
In the GR colony, one synonymous nucleotide substitution was identified in each of two bat rotavirus segments encoding NSP3 (aligned to MN433625) and VP4 (aligned to MN433620). In the two mammalian orthoreovirus segments sequenced, one substitution was synonymous (L3, aligned to KU194660), and the other was non-synonymous and also novel (S1, aligned to JQ979272) ( Table 2). However, it is important to mention that the coverage for both viruses in the year 2021 was low.
In the LU colony, sequences aligned to alpha-CoVs (MN535733 and MN535734) were found at all three time points and were chosen for further analysis. According to phylogenetic analysis based on the NS12 gene, these virus genomes clustered together in the phylogenetic tree and showed similarity (approx. 83%) to an alpha-CoV genome from China (OM030318), which is a member of the genus alpha-coronavirus, subgenus unclassified alpha-coronavirus ( Figure 6). In the ORF1ab, 219 substitutions were detected, of which 35 were non-synonymous and 20 were novel (Table 2). In the gene encoding the spike protein, 99 substitutions were identified, of which 24 were non-synonymous and 23 were novel. In ORF3, 12 nucleotide substitutions were detected, of which 7 were nonsynonymous and 2 were novel. In the gene encoding the envelope protein, only one substitution was detected, which was synonymous and not novel. Finally, the two substitutions in the gene encoding the nucleocapsid protein were both non-synonymous and novel (Table 2). Contigs from the de novo analysis were uploaded to GenBank under accession numbers ON325309 (2019; 27'932 nt, full genome) and ON325310 (2021; 27'259 nt, almost complete).   ; receptor-binding domain of the S1 subunit (RBD); subdomains of S1 and S2 subunits (SD-1 and SD-2)) were predicted by ClustalW alignment to the homologous regions of available MERS-related CoVs. Each triangle indicates an SNV; green, synonymous; red, non-synonymous.
In the GR colony, one synonymous nucleotide substitution was identified in each of two bat rotavirus segments encoding NSP3 (aligned to MN433625) and VP4 (aligned to MN433620). In the two mammalian orthoreovirus segments sequenced, one substitution was synonymous (L3, aligned to KU194660), and the other was non-synonymous and also novel (S1, aligned to JQ979272) ( Table 2). However, it is important to mention that the coverage for both viruses in the year 2021 was low.
In the LU colony, sequences aligned to alpha-CoVs (MN535733 and MN535734) were found at all three time points and were chosen for further analysis. According to phylogenetic analysis based on the NS12 gene, these virus genomes clustered together in the phylogenetic tree and showed similarity (approx. 83%) to an alpha-CoV genome from China (OM030318), which is a member of the genus alpha-coronavirus, subgenus unclassified alpha-coronavirus ( Figure 6). In the ORF1ab, 219 substitutions were detected, of which 35 were non-synonymous and 20 were novel (Table 2). In the gene encoding the spike protein, 99 substitutions were identified, of which 24 were non-synonymous and 23 were novel. In ORF3, 12 nucleotide substitutions were detected, of which 7 were nonsynonymous and 2 were novel. In the gene encoding the envelope protein, only one substitution was detected, which was synonymous and not novel. Finally, the two substitutions in the gene encoding the nucleocapsid protein were both non-synonymous and novel (Table 2). Contigs from the de novo analysis were uploaded to GenBank under ac- Figure 5. Overview of SNVs distributed over the spike protein locus (arrow) of MERS-related-CoV from the AG colony based on RT-PCR and Sanger sequencing. The domains (N-terminal domain of the S1 subunit (NTD); receptor-binding domain of the S1 subunit (RBD); subdomains of S1 and S2 subunits (SD-1 and SD-2)) were predicted by ClustalW alignment to the homologous regions of available MERS-related CoVs. Each triangle indicates an SNV; green, synonymous; red, non-synonymous.
The MERS-related CoV contig of 2019 had previously been uploaded to GenBank (MT818221). The contig of 2020, which was covered completely (30'047 nt), and of 2021, which was partial (25'618 nt), are uploaded now, as well, and can be found under accession numbers ON325307 and ON325308, respectively.
In the GR colony, one synonymous nucleotide substitution was identified in each of two bat rotavirus segments encoding NSP3 (aligned to MN433625) and VP4 (aligned to MN433620). In the two mammalian orthoreovirus segments sequenced, one substitution was synonymous (L3, aligned to KU194660), and the other was non-synonymous and also novel (S1, aligned to JQ979272) ( Table 2). However, it is important to mention that the coverage for both viruses in the year 2021 was low.
In the LU colony, sequences aligned to alpha-CoVs (MN535733 and MN535734) were found at all three time points and were chosen for further analysis. According to phylogenetic analysis based on the NS12 gene, these virus genomes clustered together in the phylogenetic tree and showed similarity (approx. 83%) to an alpha-CoV genome from China (OM030318), which is a member of the genus alpha-coronavirus, subgenus unclassified alpha-coronavirus ( Figure 6). In the ORF1ab, 219 substitutions were detected, of which 35 were non-synonymous and 20 were novel (Table 2). In the gene encoding the spike protein, 99 substitutions were identified, of which 24 were non-synonymous and 23 were novel. In ORF3, 12 nucleotide substitutions were detected, of which 7 were non-synonymous and 2 were novel. In the gene encoding the envelope protein, only one substitution was detected, which was synonymous and not novel. Finally, the two substitutions in the gene encoding the nucleocapsid protein were both non-synonymous and novel ( In comparison to the MERS-related CoV from the AG colony, the substitutions we more evenly distributed over the genome in alpha-CoV from the LU colony and we more frequently non-synonymous (Figures 4 and 7). However, in alpha-CoV, the nuc ocapsid protein was less covered in the sequence from 2020. Of the 230 substitutions d tected in the MERS-related CoV, 26 (11.3%) were non-synonymous, while of the 333 su stitutions in the alpha-CoV genome, 68 (20.4%) were non-synonymous. The number novel substitutions detected was comparable in alpha-CoV (47, 14.1%) and MERS-relat CoV (25, 10.9%) ( Table 2, Figures 4 and 7).  In comparison to the MERS-related CoV from the AG colony, the substitutions were more evenly distributed over the genome in alpha-CoV from the LU colony and were more frequently non-synonymous (Figures 4 and 7). However, in alpha-CoV, the nucleocapsid protein was less covered in the sequence from 2020. Of the 230 substitutions detected in the MERS-related CoV, 26 (11.3%) were non-synonymous, while of the 333 substitutions in the alpha-CoV genome, 68 (20.4%) were non-synonymous. The number of novel substitutions detected was comparable in alpha-CoV (47, 14.1%) and MERS-related CoV (25, 10.9%) ( Table 2, Figures 4 and 7).

Discussion
NGS and metagenomic analysis of ground stool samples of three different bat colonies of the species Rhinolophus hipposideros, Vespertilio murinus, and Myotis myotis sampled over three years identified genome sequences of 22 different virus families infecting vertebrates or invertebrates. A comparable number of virus families infecting vertebrates has been found in similar studies from the United States, South Africa, and French Guiana [35,36,58]. Viruses of vertebrates detected in our study, i.e., Coronaviridae, Adenoviridae, Circoviridae, and Parvoviridae, have been repeatedly detected in studies from Croatia, the United States, and China [12,34,36,40,59]. However, as opposed to these other reports, genomes of viruses that belong to the Rhabdoviridae, Retroviridae, and Poxviridae were not detected in our study. Circoviridae and Genomoviridae were the only viruses of vertebrates detected in all colonies and at all time points. Virus genomes from these two families were also the most abundant in a recent study from Argentina [37]. Nonetheless, direct comparison of viral diversity between various studies is challenging, since many factors, i.e., sample size, type, preparation, in silico analysis, bat species, location, and health of the bats, may affect virus composition.
Insectivorous bats hunt for prey in areas close to their roosting place. The presence and abundance of particular types of insects changes across seasons and from year to year. Indeed, no single virus species has been detected at two time points, although the diversity of viruses from invertebrates detected in each colony in our study was vast and several members of Iflaviridae and Parvoviridae were present in all years and all colonies (Figure 1). Viruses belonging to the family Iflaviridae can infect members of the orders Lepidoptera, Hymenoptera, Hemiptera, and Diptera [60][61][62][63][64][65][66]. Members of Parvoviridae, subfamily Densovirinae, infect invertebrates of six orders, including Blattodea, Diptera, Hymenoptera, Hemiptera, Lepidoptera, and Orthoptera [67], which are the main source of food for the bat species analyzed here [27]. In the GR colony, a high abundance of members of the subfamily Densovirinae was detected, similar to a previous study from China that also analyzed fecal samples [68].
The composition of virus genomes was similar in the three colonies, although each colony consisted of a different bat species. The ratio of virus families infecting vertebrates and invertebrates was 10 to 8 for AG and LU, and 9 to 9 for GR (Figure 1.). Of note, some virus families consist of members that infect vertebrates (e.g., subfamily Parvovirinae within Parvoviridae) and other members that infect invertebrates (e.g., Densovirinae within Parvoviridae). Some virus families were detected in all colonies at least once. Among vi-

Discussion
NGS and metagenomic analysis of ground stool samples of three different bat colonies of the species Rhinolophus hipposideros, Vespertilio murinus, and Myotis myotis sampled over three years identified genome sequences of 22 different virus families infecting vertebrates or invertebrates. A comparable number of virus families infecting vertebrates has been found in similar studies from the United States, South Africa, and French Guiana [35,36,58]. Viruses of vertebrates detected in our study, i.e., Coronaviridae, Adenoviridae, Circoviridae, and Parvoviridae, have been repeatedly detected in studies from Croatia, the United States, and China [12,34,36,40,59]. However, as opposed to these other reports, genomes of viruses that belong to the Rhabdoviridae, Retroviridae, and Poxviridae were not detected in our study. Circoviridae and Genomoviridae were the only viruses of vertebrates detected in all colonies and at all time points. Virus genomes from these two families were also the most abundant in a recent study from Argentina [37]. Nonetheless, direct comparison of viral diversity between various studies is challenging, since many factors, i.e., sample size, type, preparation, in silico analysis, bat species, location, and health of the bats, may affect virus composition.
Insectivorous bats hunt for prey in areas close to their roosting place. The presence and abundance of particular types of insects changes across seasons and from year to year. Indeed, no single virus species has been detected at two time points, although the diversity of viruses from invertebrates detected in each colony in our study was vast and several members of Iflaviridae and Parvoviridae were present in all years and all colonies (Figure 1). Viruses belonging to the family Iflaviridae can infect members of the orders Lepidoptera, Hymenoptera, Hemiptera, and Diptera [60][61][62][63][64][65][66]. Members of Parvoviridae, subfamily Densovirinae, infect invertebrates of six orders, including Blattodea, Diptera, Hymenoptera, Hemiptera, Lepidoptera, and Orthoptera [67], which are the main source of food for the bat species analyzed here [27]. In the GR colony, a high abundance of members of the subfamily Densovirinae was detected, similar to a previous study from China that also analyzed fecal samples [68].
The composition of virus genomes was similar in the three colonies, although each colony consisted of a different bat species. The ratio of virus families infecting vertebrates and invertebrates was 10 to 8 for AG and LU, and 9 to 9 for GR (Figure 1). Of note, some virus families consist of members that infect vertebrates (e.g., subfamily Parvovirinae within Parvoviridae) and other members that infect invertebrates (e.g., Densovirinae within Parvoviridae). Some virus families were detected in all colonies at least once. Among viruses infecting vertebrates, these included Circoviridae, Genomoviridae, Parvoviridae, Adenoviridae, Picornaviridae, and unclassified viruses. Among viruses infecting invertebrates, these included the Iflaviridae, Dicistroviridae, Parvoviridae, Polycipiviridae, and Genomoviridae. Viruses belonging to Circoviridae, Adenoviridae, and Parvoviridae are frequently detected in bats all over the world regardless of the species [8,9,36,58,69,70]. In our study, only Genomoviridae, Circoviridae, Iflaviridae, and Parvoviridae were detected in all colonies and at all time points. Adenoviridae, Parvoviridae, Dicistroviridae, and Genomoviridae were present at three time points in one colony (LU) and at two time points in the AG (likely because of the low read number obtained from that colony in 2020) and GR colonies (only two sampling time points available). Some virus families, e.g., Polyomaviridae and Nairoviridae, were detected just once or in two non-consecutive years. Nairoviridae, especially the genus Orthonairovirus, are known to use arthropods as hosts, and to be transmitted from there to mammals, birds, and bats [71]. Members of this family were found in the Vespertilio murinus bats of the AG colony in 2020 as well as in several other bat species in different countries, such as Eptesicus nilsonii in Germany and Pipistrellus pipistrellus and Myotis mystacinus in France [31,72]. Similarly, members of the family Polyomaviridae are well known to infect bats of multiple species and countries [33,[73][74][75] and were also detected at all time points in the AG and LU colonies in our study. In the GR colony, polyomaviruses were not detected, although these viruses have been detected previously in bats of the genus Rhinolophus [76]. The differences found between the colonies may be due to the differences concerning the bat species, the environment, the hunting territory, and/or food availability.
Migratory behavior may increase contact with animals and pathogens and consequently increase the number of viruses. However, in our analysis of sedentary (Rhinolophus hipposideros and Myotis myotis) and migrating species (Vespertilio murinus), no gross differences in relative numbers of viral sequences were observed. It has previously been shown that the virome of migrating species is not remarkably abundant, and that migration can indeed lower infection risk because animals leave contaminated habitats, and infected and sick animals may remain behind or may not survive migration [39,77].
Coronaviridae members were consistently detected in the colonies of AG and LU over the three years. Both alpha-and beta-CoVs were found in previous studies performed on bat samples from many locations in Europe [78][79][80][81][82][83][84]. MERS-related-CoV was detected at all three time points, with the highest number of reads in 2020; the number of reads slightly decreased in 2021. So far, no human-pathogenic beta-CoVs have been isolated from bats in Europe. However, the viruses previously identified share 76-83% similarity with SARS-CoV, MERS-CoV, and SARS-CoV-2 [43,83,85,86]. The MERS-related-CoV sequence consistently found in all three years in the AG colony in our study showed the highest similarity (approximately 87%) to a MERS-related-CoV genome from China (MG021452). The AG colony is not accessible to the public, and contact with intermediate hosts in this region, and therefore a spillover event, is rather unlikely. It has been demonstrated that bats are the original reservoir host of MERS-CoV, and that dromedaries or camels can serve as intermediate hosts [18,[87][88][89]. MERS-CoV uses the receptor human-dipeptidylpeptidase-4 (hDPP4) for cell entry; however, most of the circulating MERS-related-CoV strains found in bats cannot bind hDPP4 properly and therefore likely use another receptor for cell entry [90,91]. Most importantly, MERS-related CoVs are not known to infect humans [85,90]. Nevertheless, Lau et al. showed that Tylonicteris-bat-CoV-HKU4 (Ty-BatCoV-HKU4), a Merbecovirus isolated from bats in southern China, can use hDPP4 as well as dromedary-camel-DPP4 (dc-DPP4) and Tylonicteris pachypus-DPP4 (Tp-DPP4) for cell entry and can infect hDPP4-transgenic mice [92,93]. As MERS-CoV can use only hDPP4 and dc-DPP4 but not Tp-DPP4 for cell entry, this indicates a risk that Ty-BatCoV-HKU4 may overcome the species barrier and may have the potential for direct bat-to-human transmission [93].
RNA viruses are predominant in the emergence of new diseases due to high mutation rates, which lead to the appearance of new mutants with increased fitness, highly diverse populations, and finally, easier adaptation to new hosts and outbreak events [94,95]. During the SARS-CoV pandemic, the virus may have been transmitted from bats to civets (an amplification host) and finally to humans [16,96]. Therefore, tracing the evolution of RNA viruses, especially CoVs, is of utmost importance. Many CoV sequences have been found since the SARS pandemic in 2003, allowing investigation of virus development over time in different locations and hosts. Furthermore, multiple studies following the emergence of new variants in the human population and with focus on the mutations of viruses in vitro have been conducted [93,[97][98][99][100][101][102][103][104]. However, in vitro studies are performed under sterile conditions without the influence of multiple viral and bacterial co-infections as is commonly reported in bat samples and that can be insignificant, detrimental, or even beneficial for some viruses. Furthermore, the exact number of replication cycles is known in laboratory settings, while in the natural environment it can be only estimated. This might be an advantage regarding the accuracy of results and the ability to compare between different studies because the circumstances can be neglected in the interpretation of the results. However, in vitro discoveries cannot always be applied to the natural situation. For example, Lau et al. showed evidence that MERS-CoV isolated from bats was able to directly infect cells via the hDPP4 receptor [93]. However, no such direct transmissions have been reported in the natural environment. Various factors in the natural environment can influence host and infection dynamics, i.e., stress, lack of food and habitat, high animal density, which enables the introduction of new variants in a population, or multiple coinfections within individuals, which can lead to recombination events. Furthermore, intrinsic factors such as sex, age, breeding, and general health status may play a role in virus infection and surveillance [2,[24][25][26], factors which cannot be mimicked in in vitro conditions.
The most important aspect of our work is that we can investigate virus genome evolution, including that of alpha-and beta-CoVs, in the natural reservoir species and environment. The substitution rate of both alpha-and beta-CoVs detected in this study was within the range of previously described values from 2 × 10 −2 /nt/year to 1 × 10 −5 /nt/year [105][106][107]. The general substitution rate of MERS-related-CoV was slightly lower (3.8 × 10 −3 /nt/year) than that of alpha-CoV (5.9 × 10 −3 /nt/year). This difference may be due to the poorer coverage of MERS-related CoV in 2019. Moreover, the number of novel and non-synonymous substitutions in alpha-CoV was also higher than in MERS-related CoV. The absolute numbers of substitutions in both MERS-related-and alpha-CoV were highest in ORF1ab, followed by the genes encoding the spike protein and the nucleocapsid protein, and correlated well with the substitutions found in SARS-CoV-2 genomes of human origin [108]. The absolute numbers of substitutions correlated directly with the length of the genes. The relative numbers of substitutions in the alpha-CoV genome were the highest in the spike protein with 1.2 × 10 −2 /nt/year, followed by ORF1ab and the genes encoding the envelope and nucleocapsid proteins with 5.4 × 10 −3 /nt/year, 2.17 × 10 −3 /nt/year, and 7.9 × 10 −3 /nt/year, respectively. The spike protein is known as a mutational hotspot [109][110][111][112]. Due to poorer coverage of the spike protein sequences in MERS-related CoV, not many substitutions were detected by NGS. However, Sanger sequencing of RT-PCR products from the genome region encoding the spike protein amplified from samples collected in the three consecutive years showed 429 substitutions, of which 96 were non-synonymous.
These data cannot be directly compared to the NGS data, which provide high sequencing depth at each nucleotide, while the Sanger sequencing data represent "frozen" sequences assembled from five overlapping RT-PCR products that are likely derived from multiple virus genomes. Nevertheless, Sanger sequencing shows that the spike protein region of MERS-related CoV is also highly variable and contains regions with higher frequencies of substitutions. Most interestingly, comparable regions with mutational hotspots were also found in a previous study of the genomes of SARS-CoV-2 from humans [113], although the overall number of nt substitutions was much lower in that report compared to our data.
Many studies have been performed to better understand the mutation pattern of SARS-CoV-2 and the dynamics of CoV evolution in the human population worldwide [107,114,115]. Qualitatively, the mutation pattern of the virus genomes from bats analyzed in this study was surprisingly comparable to that of other CoVs from humans [105,106,108,109,111,112] despite the differences in physiology, lifestyle, and environmental factors of the two different species. Bats are known to harbor viruses without showing any clinical signs. Their immune system is believed to have adapted to their unique biological and physiological characteristics, such as flight, longevity, and formation of large colonies. Moreover, their immune system is able to adapt and cope with viral infections over a long time of co-evolution, as they phylogenetically belong to the oldest species of mammals. Indeed, bats can control viral replication more effectively than other mammals [8,[116][117][118].
Bats live in large, closed colonies where introducing new viruses to the population is difficult. However, once a new virus has been introduced, it may infect many individuals, which can lead to an increased number of replications and variants. Nevertheless, strains from different locations of bat SARS-like-CoV have similarities of approx. 90%, including strains identified in Russia and Bulgaria [80,83]. Humans, on the other hand, have much more open social structures compared to bats. Travel and contact with many random individuals may lead to effective spreading of virus and rising numbers of variants. While in human studies, mostly a single CoV variant is analyzed since samples are collected from individuals, in bat colonies, we can anticipate that several circulating variants are sequenced from the collection of fecal mixture from many bats.
In the DNA viruses found in our study, the number of substitutions was, as expected, relatively small, and the majority were synonymous, with one exception. Of the 12 nucleotide substitutions found in the rep genes of AAV2, 10 were non-synonymous. It is well known that parvoviruses such as parvovirus B19 in humans mutate at a much higher frequency than other DNA viruses [119][120][121][122]. This appears to also hold true for parvoviruses in bats.
Circovirus genomes were detected in all three colonies. The number of substitutions in the cap ORF of GR and AG was small and all were synonymous. However, of the substitutions found in the rep sequences analyzed, some were non-synonymous. The number of substitutions per location and year was higher in the circoviruses in our study, with a rate of 6.02 × 10 −4 /nt/year to 7.9 × 10 −3 /nt/year, compared to previously reported values in porcine circovirus 3 (PCV3) of 2.35 × 10 −5 /nt/year to 7.32 × 10 −4 /nt/year [123][124][125], but comparable to that of psittacine beak and feather disease virus (PBFDV) (3.41 × 10 −3 /nt/year, [126]). It has been suggested that circoviruses may tolerate higher substitution rates than other DNA viruses due to their small genomes and low probability of accumulating deletions [127,128]. The vast diversity among PBFDV genomes supports the theory that viral evolutionary rates are not just dependent on the activity of high-fidelity polymerase, but also on genome architecture. Specifically, the loop-like DNA structures found in PBFDV may provide important recombination sites [105,126].
The collection of ground stool samples is an easy, noninvasive tool to surveil endemic bat populations. The gathered metagenomics data expand our knowledge of the variability of virus composition over time and the evolution of specific virus genomes in a natural reservoir species. This approach, nevertheless, has some limitations. First, the outcome and validity of the results can vary depending on the coverage and sequencing depth. For example, only few reads from RVs were obtained in 2021, which did not allow us to draw any conclusions on virus evolution. Similarly, the coverage and sequencing depth specifically from the genome region encoding the spike protein of MERS-related CoV from the AG colony was highly variable between the three consecutive years, a problem that was only partially overcome by targeted Sanger sequencing. Second, the colonies were re-sampled after a long interval; thus, new strains may have been introduced into the population.
In conclusion, this study gives insight into the complex diversity, abundance, and evolution of the virome and of the genomes of particular viruses in three selected bat colonies and provides a basis for further studies investigating how viruses can change over time in the natural reservoir host and environment.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/v14091911/s1, Figure S1: Heatmap of specific genomes of viruses from invertebrates detected in the AG colony; Figure S2: Heatmap of specific genomes of viruses from invertebrates detected in the GR colony; Figure S3: Heatmap of specific genomes of viruses from invertebrates detected in the LU colony; Table S1: Database of specific viruses used for the alignment; Table S2: The primer pairs used for sequencing of the spike protein locus of the MERS-related CoVs from the Aargau colony including the expected product length.  Institutional Review Board Statement: Ethical review and approval were not necessary for this study. Sampling feces from the ground of the bat colonies does not fall under the Swiss Animal Welfare Act or the Animal Protection Ordinance since not a single animal had to be manipulated or disturbed.

Data Availability Statement:
The nucleotide sequences of coronaviruses detected in this study have been registered at GenBank under accession numbers ON325307-ON325310. All the raw sequencing data generated in this study were uploaded to the Sequence Read Archive (SRA) under accession number PRJNA835632. Supplementary Materials showing detailed results of this study are available online.