Detection and Complete Genome Analysis of Circoviruses and Cycloviruses in the Small Indian Mongoose (Urva auropunctata): Identification of Novel Species

Fecal samples from 76 of 83 apparently healthy small Indian mongooses (Urva auropunctata) were PCR positive with circovirus/cyclovirus pan-rep (replicase gene) primers. Of these, 30 samples yielded high quality partial rep sequences (~400 bp), of which 26 sequences shared maximum homology with cycloviruses from an arthropod, bats, humans or a sheep. Three sequences exhibited maximum identities with a bat circovirus, whilst a single sequence could not be assigned to either genus. Using inverse nested PCRs, the complete genomes of mongoose associated circoviruses (Mon-1, -29 and -66) and cycloviruses (Mon-20, -24, -32, -58, -60 and -62) were determined. Mon-1, -20, -24, -29, -32 and -66 shared <80% maximum genome-wide pairwise nucleotide sequence identities with circoviruses/cycloviruses from other animals/sources, and were assigned to novel circovirus or cyclovirus species. Mon-58, -60 and -62 shared maximum pairwise identities of 79.90–80.20% with human and bat cycloviruses, which were borderline to the cut-off identity value for assigning novel cycloviral species. Despite high genetic diversity, the mongoose associated circoviruses/cycloviruses retained the various features that are conserved among members of the family Circoviridae, such as presence of the putative origin of replication (ori) in the 5′-intergenic region, conserved motifs in the putative replication-associated protein and an arginine rich region in the amino terminus of the putative capsid protein. Since only fecal samples were tested, and mongooses are polyphagous predators, we could not determine whether the mongoose associated circoviruses/cycloviruses were of dietary origin, or actually infected the host. To our knowledge, this is the first report on detection and complete genome analysis of circoviruses/cycloviruses in the small Indian mongoose, warranting further studies in other species of mongooses.


Introduction
Viruses belonging to the family Circoviridae (genera Circovirus and Cyclovirus) contain a covalently closed, circular, single-stranded DNA genome (~1.7-2.1 kb in size) [1,2]. The circovirus and cyclovirus genomes have an ambisense organization, consisting of at least two inversely arranged open reading frames (ORFs) that encode the replication-associated protein (Rep) and the capsid protein (Cp) [1,2]. In circoviruses, the ORF coding for the Rep is organized on the virion-sense (positive-sense) strand, whilst the cycloviral Rep is encoded by the complementary (anti-sense) strand of a double-stranded DNA replicative form. The Rep is the most conserved protein in circoviruses/cycloviruses and contains sequence motifs that are characteristic of proteins participating in rolling-circle replication (RCR) [1,2]. On the other hand, the Cp has been found to be significantly more divergent and is characterized by the presence of an arginine/basic amino acid (aa) rich region in the amino terminus that might be involved in DNA binding activity [1-5].
Viruses 2021, 13, 1700
... cycloviruses by nested PCR assays using pan-rep primers (CV-F1, CV-R1, CV-F2 and CV-R2, targeting a short stretch (~400 bp) of the Rep-encoding ORF), as described previously [3]. Additional primers were designed from the partial Rep-encoding ORF sequences and used in inverse nested PCRs to amplify the complete genomes of the mongoose associated circoviruses and cycloviruses (Supplementary material S1). PCRs were performed using the Platinum™ Taq DNA Polymerase (Invitrogen™, Thermo Fisher Scientific Corporation, Waltham, MA, USA) according to the manufacturer's instructions. Sterile water was used as a negative control in all PCR reactions.

Nucleotide Sequencing
The PCR amplicons were purified using the Wizard® SV Gel and PCR Clean-Up kit (Promega, Madison, WI, USA) following the instructions provided by the manufacturer. Nucleotide sequences were obtained using the ABI Prism Big Dye Terminator Cycle Sequencing Ready Reaction Kit (Applied Biosystems, Foster City, CA, USA) on an ABI 3730XL Genetic Analyzer (Applied Biosystems, Foster City, CA, USA).

Sequence Analysis
Homology searches for related nt and deduced aa sequences were performed using the standard BLASTN and BLASTP programs (Basic Local Alignment Search Tool, www.ncbi.nlm.nih.gov/blast, accessed on 22 June 2021), respectively. Putative ORFs encoding the viral Rep and Cp were identified using the ORF finder (https://www.ncbi.nlm.nih.gov/orffinder/, accessed on 20 June 2021), whilst those with a putative intron between the Rep coding sequences (CDS) were determined by BLASTN analysis with the CDS feature. Pairwise sequence (%) identities for the complete viral genomes, and the putative Rep and Cp, were determined using the MUSCLE algorithm in the SDTv1.2 program, as described previously [2,48]. On the other hand, pairwise identities between the partial Rep-encoding ORF sequences were calculated using the MUSCLE alignment program (https://www.ebi.ac.uk/Tools/msa/muscle/, accessed on 23 June 2021) and the 'align two or more sequences' option of the BLASTN program (https://blast.ncbi.nlm.nih.gov/, accessed on 23 June 2021). The maps of the circular viral genomes were constructed with the 'Draw Custom Plasmid Map' program (https://www.rf-cloning.org/savvy.php, accessed on 20 June 2021). The putative stem-loop structure was identified in the viral genome using the mFold program [49].
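To make the notion of a pairwise (%) identity concrete, the following is a minimal sketch, assuming that identity is scored only over aligned columns in which neither sequence carries a gap (a convention commonly attributed to SDT; this is not the SDT or BLASTN implementation). The function name and the toy sequences are hypothetical and are not taken from this study.

```python
def pairwise_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns where either sequence has a gap ('-') are ignored,
    so the denominator is the number of ungapped aligned columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 9 ungapped columns, 8 of them identical.
toy_a = "ATGC-ATGCA"
toy_b = "ATGCTATGAA"
print(round(pairwise_identity(toy_a, toy_b), 2))
```

Different tools treat gaps and alignment ends differently, so values computed this way may not match those reported by a given program.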
Multiple alignments of nt and deduced aa sequences were carried out using the MUSCLE algorithm embedded in the MEGA7 software [50]. Phylogenetic analysis was performed by the maximum likelihood (ML) method using the MEGA7 software [50], with the GTR+G model of substitution and 1000 bootstrap replicates, as described previously [2]. The complete genomes of the mongoose associated circoviruses and cycloviruses were investigated for recombination events using the RDP4 program with default parameters [51]. A circovirus/cyclovirus sequence was considered a recombinant if it was supported by two or more detection methods (3Seq, BOOTSCAN, CHIMERA, GENECONV, MAXCHI, RDP and SISCAN) at a highest acceptable p-value of 0.01 (p < 0.01) with Bonferroni's correction [17,51].
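The recombinant-calling rule stated above (support from at least two of the seven detection methods at p < 0.01) can be sketched as a simple filter. This is an illustrative sketch only: it assumes the p-values have already been Bonferroni-corrected and exported by the user, and the function name, dictionary layout and example values are hypothetical rather than part of RDP4.

```python
# Detection methods named in the text.
RDP4_METHODS = {"3Seq", "BOOTSCAN", "CHIMERA", "GENECONV", "MAXCHI", "RDP", "SISCAN"}


def is_supported_recombinant(p_values, alpha=0.01, min_methods=2):
    """Return True if at least `min_methods` methods report p < alpha.

    p_values: dict mapping method name -> corrected p-value, or None when
    the method did not detect the event (assumed input format).
    """
    supporting = [
        method
        for method, p in p_values.items()
        if method in RDP4_METHODS and p is not None and p < alpha
    ]
    return len(supporting) >= min_methods


# Hypothetical p-values for one candidate event (not results from this study).
event = {"RDP": 0.003, "GENECONV": 0.2, "MAXCHI": 0.0004, "3Seq": None}
print(is_supported_recombinant(event))
```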

GenBank Accession Numbers
The GenBank accession numbers for the mongoose associated CRESS DNA viral sequences determined in this study are MZ382570-MZ382599.

Detection of Circoviruses and Cycloviruses in the Small Indian Mongoose
The small island of St. Kitts (~69 square miles, human population of ~35,000) is inhabited by a large population of the small Indian mongoose (~45,000) that dwells in wild and urban habitats (Figure 1A,B) [44,52] (https://www.sknbs.org/about-bureau/about-stkitts-and-nevis/, accessed on 14 July 2021). In the present study, single fecal samples from 76 (91.56%) of the 83 small Indian mongooses yielded the expected ~400 bp amplicon with circovirus/cyclovirus pan-rep primers in screening PCR assays. Of the 76 positive samples, 39 showed strong PCR amplification and were sequenced for the partial rep gene. By BLASTN analysis, all the mongoose-associated partial CRESS DNA viral sequences shared maximum homology with published circovirus or cyclovirus rep gene sequences, except for Mon-66, which shared a maximum pairwise nt sequence identity of 63.37% with that of an unclassified CRESS DNA virus (GenBank accession number KY487932) from a wastewater sample (Table 1). However, based on analysis of the complete genome sequence, Mon-66 was assigned to the genus Circovirus (Table 2). Nine of the partial rep sequences lacked high quality and were excluded from further analysis.
Figure 1. ... [44], and was used here with permission from the corresponding author of the publication [44].
Table 1. Pairwise identities of partial circular Rep-encoding single-stranded (CRESS) DNA viral sequences detected in the small Indian mongoose (Urva auropunctata) with those reported from other animal species/environmental samples. Based on BLASTN analysis and pairwise nucleotide (nt) sequence identities, the mongoose associated partial CRESS DNA viral sequences were classified into at least 8 putative groups (designated as I-VIII). Mongoose associated viral sequences sharing >95% nt sequence identities between themselves were assigned to the same group and are highlighted with the same color.
Table 2. Maximum/significant pairwise nucleotide (nt) sequence identities of the complete genomes of mongoose associated circoviruses and cycloviruses between themselves and with those from other animal species. Column headers: GenBank accession number; maximum/significant pairwise nt sequence (%) identities between mongoose associated circoviruses; maximum/significant pairwise nt sequence (%) identities with circovirus (strain name/detected in animal species/country/year/GenBank accession number) from other animal species.
Based on BLASTN analysis and pairwise nt sequence identities of partial rep, the mongoose associated partial CRESS DNA viral sequences were classified into at least 8 putative groups (designated as I-VIII) (Table 1). Mongoose associated partial rep sequences sharing >95% pairwise identities between themselves were assigned to the same group (Table 1). Group-I sequences shared maximum pairwise identities of 77.65-79.20% with circovirus isolate C072 that was detected in the intestinal sample from a vesper bat (Myotis fimbriatus) in China (Table 1). Groups II-VII exhibited maximum homology with cyclovirus rep sequences (Table 1). Group-II and -VI sequences shared maximum homology (pairwise identities of 97.62-98.31% and 77.47-82.21%, respectively) with cycloviruses from bats (Table 1). Group-III and -V consisted of a single rep sequence each, and shared maximum pairwise identities of 94.25% and 93.11% with a cockroach associated cyclovirus and a sheep associated cyclovirus, respectively (Table 1). Group-IV was the largest, consisting of 15 partial rep sequences that shared maximum pairwise identities of 95.37-97.48% with human cyclovirus TN2 (detected in the fecal sample of a healthy child who came in contact with a non-polio-infected acute flaccid paralysis patient [3]). Group-VII sequences shared maximum homology (pairwise identities of 91.84-92.07%) with human cyclovirus VS5700009 that was detected in the serum of a patient with unexplained paraplegia [36].
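The grouping criterion used for the partial rep sequences (sequences sharing >95% pairwise identity fall into the same group) can be illustrated with a short sketch. It assumes single-linkage grouping, that is, a sequence joins a group if it exceeds the threshold with any member; whether the groups in Table 1 were derived this way is an assumption. The function name, sequence labels and identity values below are hypothetical.

```python
def group_by_identity(names, identity, threshold=95.0):
    """Single-linkage grouping of sequences by pairwise percent identity.

    names: list of sequence labels.
    identity: dict mapping (name_a, name_b) -> percent identity.
    Pairs with identity strictly greater than `threshold` are merged.
    """
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the group representative (with path halving).
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for (a, b), pct in identity.items():
        if pct > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical identities (not values from Table 1).
labels = ["seqA", "seqB", "seqC", "seqD"]
pct_identity = {
    ("seqA", "seqB"): 97.5,
    ("seqB", "seqC"): 96.1,
    ("seqA", "seqC"): 94.8,
    ("seqC", "seqD"): 70.2,
}
print(group_by_identity(labels, pct_identity))
```

Under single linkage, seqA and seqC end up in one group through seqB even though their direct identity is below the threshold; a stricter complete-linkage rule would split them.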
Even though we reported high rates of detection of CRESS DNA viral sequences that exhibited maximum homology with circoviruses, or cycloviruses, all the fecal samples were obtained from apparently healthy mongooses, indicating a lack of association between these viruses and clinical conditions. Circoviruses appear to primarily infect vertebrates, whilst those detected in hematophagous arthropod vectors might actually be viruses of mammals, or birds that these insects feed upon [2,5,21,23]. On the other hand, cycloviruses have been reported in a wide array of both invertebrates and mammals [2,5,21,23]. Based on these observations, it has been proposed that cycloviruses are more diverse and widespread than circoviruses [2,5,21,23]. In the present study, ~90% (35/39) of the partial rep sequences shared maximum pairwise identities with cycloviruses, which constituted 6 of the 8 putative groups of diverse CRESS DNA viral sequences (Table 1). Since only fecal samples were analyzed, and the small Indian mongoose has been known to feed on small mammals, reptiles, birds, bird and reptile eggs, crustaceans, insects and human waste [53], we could not determine whether the mongoose associated CRESS DNA viruses replicated in the host or were of dietary origin.
Based on the detection of closely related circoviruses/cycloviruses in different animal species, especially in tissues, some studies have proposed interspecies transmission events within the family Circoviridae [2,3,5,54]. In this study, 15 (group-IV) and 2 (group-II) of the mongoose associated partial rep sequences were closely related (>95% pairwise nt identities) to cycloviruses from humans and bats, respectively (Table 1). St. Kitts island has a sizeable bat population [55], and the small Indian mongoose has often been seen in close proximity to humans [44], offering an ideal environment for interspecies transmission events. However, we could not obtain the full-length genome sequences for group-II viruses, whilst the complete genome of human cyclovirus TN2 (closely related to group-IV sequences) was not available in the GenBank database. Furthermore, caution should be exercised whilst commenting on cross-species transmission of circoviruses/cycloviruses from fecal samples, as they may have a dietary origin [2,3,5,54], especially in mongooses that have a wide range of feeding habits [53]. Interestingly, one of the mongoose associated partial rep sequences (group-III) shared maximum homology with a cockroach associated cyclovirus, corroborating previous observations that cyclovirus sequences detected in vertebrate samples might actually be those from viruses of arthropods [21,23] (Table 1).

Analysis of the Complete Genomes of Mongoose Associated Circoviruses and Cycloviruses
Since the ICTV classification scheme for the family Circoviridae is based on genome-wide pairwise identities [1,2], attempts were made to determine the complete genome sequences of mongoose associated CRESS DNA viruses representing the putative groups I-VIII (Table 1). Using an inverse nested PCR assay, we obtained the full-length genomes of mongoose associated circoviruses Mon-1, -29 and -66 and cycloviruses Mon-20, -24, -32, -58, -60 and -62 (Tables 1 and 2), whilst the complete genomes of group-II, -III and -V viruses could not be amplified. The genomic organization of the mongoose associated circoviruses and cycloviruses is shown in Figure 2. The complete genome sequences of the mongoose associated circoviruses and cycloviruses (collectively referred to as the 'Mon sequences') retained the various features that are conserved in members of the genera Circovirus and Cyclovirus, respectively, within the family Circoviridae (Figures 2-4; Supplementary material S2) [1,2].
Figure 2. ... The inversely arranged major open reading frames encoding the putative replication associated (Rep) and capsid (Cp) proteins are shown with blue and red arrows, respectively. The putative origin of replication (ori) characterized by a nonanucleotide motif at the apex of a stem-loop structure is marked in the 5′-intergenic region. The sizes of the Rep and Cp are shown in parentheses. nt: nucleotide; aa: amino acid.
All the Mon sequences contained the putative ori in the 5′-IR, marked by the presence of the conserved nonanucleotide motif ((C/T)AGTATTAC) at the apex of a potential stem-loop structure (Figures 2 and 3). Following ICTV guidelines [1,2], the first nt of the nonanucleotide motif was considered as 'position one' of the Mon sequences. Mon-1, -29 and -66 contained the putative ori in the Rep-encoding strand, whilst the putative ori was located in the Cp coding strand of Mon-20, -24, -32, -58, -60 and -62 (Figure 2). The 3′-IR was absent, or smaller, in Mon-20, -24, -32, -58, -60 and -62 than in Mon-1, -29 and -66 (Figure 2). Based on these observations, genome-wide pairwise identities, and phylogenetic analysis, Mon-1, -29 and -66 were classified as circoviruses, whilst Mon-20, -24, -32, -58, -60 and -62 were assigned to the genus Cyclovirus (Figures 2-6; Tables 2-4; Supplementary materials S2-S5).
Corroborating previous observations [1,2], the putative Rep of the mongoose associated circoviruses and cycloviruses retained the conserved RCR (motifs I through III) and superfamily 3 helicase (Walker A and B, and motif C) motifs (Figure 4). On the other hand, the putative Cp, although much more divergent than the Rep (Tables 3 and 4), contained the conserved arginine rich region at the amino terminus, with the exception of Mon-66 (Supplementary material S2). Since recombinants have been reported in both circoviruses and cycloviruses [4,17,29,56-58], the Mon sequences were evaluated for potential recombination using the RDP4 program. However, we did not obtain reliable evidence for recombination events in the Mon sequences, except for Mon-32, which was identified by several models in the RDP4 program as minor parent to goat associated cyclovirus 1 (isolate PKgoat11, GenBank accession number HQ738636), whilst the major parent was unknown (Supplementary material S6).
The complete genomes of mongoose associated circoviruses Mon-1 and -29 were 1879 bp in size (Figure 2), which was comparable to those observed in most circoviruses [2]. Mon-1 and -29 were closely related to each other (Figure 5; Tables 2-4; Supplementary material S3), and shared maximum pairwise identities of 67.40% and 67.20%, respectively, with that of bat circovirus isolate BtPspp.-CV (GenBank accession number KJ641716) from China [59], followed by identities of 66.90% and 66.60%, respectively, with bat associated circovirus 10 (isolate HK02976, GenBank accession number LC456717) from Japan (Table 2) [60]. Phylogenetically, the complete genome sequences of Mon-1 and -29 formed a distinct cluster within a clade that mostly consisted of circoviruses from bats, including isolates BtPspp.-CV and HK02976 (Figure 5). These observations were corroborated by analysis of the putative proteins (pairwise identities of Rep and Cp, and phylogenetic analysis of Rep) of Mon-1 and -29 (Tables 3 and 4; Supplementary material S5).
The complete genome of Mon-66 was 2432 nt long (Figure 2), which was larger than those of most circoviruses [1,2]. Even though the partial rep sequence of Mon-66 exhibited maximum homology with that of an unclassified CRESS DNA virus (Table 1), the complete genome of Mon-66 shared a maximum pairwise identity of 60.70% with that of porcine circovirus 2 isolate MZ-5 (GenBank accession number LC004750) from India (Table 2). Since the genome-wide pairwise identity cut-off value for assigning a sequence to a genus within the family Circoviridae is 55% [1,2], Mon-66 was classified as a circovirus. On the other hand, the Rep and Cp of Mon-66 shared maximum deduced aa identities of 51.20% and 44.40% with those of unclassified CRESS DNA viruses from a wild bird (MW182878) and a wastewater sample (KY487977), respectively (Tables 3 and 4). Phylogenetically, Mon-66 formed an isolated branch within the clade of circoviruses (Figure 5; Supplementary material S5).
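The 'position one' convention described for the Mon sequences (numbering the circular genome from the first nt of the nonanucleotide motif (C/T)AGTATTAC) can be illustrated with a minimal sketch. It assumes the genome is supplied as a linearized string of the strand that carries the motif; checking the reverse complement, and confirming that the motif sits at the apex of a stem-loop, are left out. The function name and the toy sequence are hypothetical.

```python
import re

# Conserved nonanucleotide motif given in the text: (C/T)AGTATTAC
NONANUCLEOTIDE = re.compile(r"[CT]AGTATTAC")


def rotate_to_position_one(genome: str) -> str:
    """Rotate a circular genome so it starts at the nonanucleotide motif.

    The first 8 nt are appended before searching so that a motif spanning
    the ends of the linearized sequence is still found.
    """
    seq = genome.upper()
    match = NONANUCLEOTIDE.search(seq + seq[:8])
    if match is None:
        raise ValueError("nonanucleotide motif not found on this strand")
    start = match.start()
    return seq[start:] + seq[:start]


# Hypothetical toy sequence (not a real viral genome).
toy = "GGCCTTAACTAGTATTACCCGGAA"
print(rotate_to_position_one(toy))
```

If a sequence contained more than one match, this sketch would use the first; a real workflow would need to choose the copy located within the predicted stem-loop.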
The complete genomes of the mongoose associated cycloviruses were 1771 nt, 1786 nt or 1831 nt long (Figure 2), which was within the size-range for cyclovirus genomes [1,2]. Mon-20 and -24 were fully identical in their complete genome sequences, and shared maximum homology (pairwise identities of 72.10%) with that of bat cyclovirus isolate CyV-LysokaP4 (GenBank accession number MG693174) from Cameroon (Table 2; Figure 6) [61]. Pairwise identities of the Rep and Cp, and phylogenetic analysis of Rep of Mon-20 and -24 also revealed similar findings (Tables 3 and 4; Supplementary material S5).
Mon-32 exhibited maximum pairwise identity of 77.30% with the complete genome of goat cyclovirus isolate PKgoat11, followed by 75.30% with that of human cyclovirus isolate PK5006 (GQ404844) (Table 2). Cyclovirus isolates PKgoat11 and PK5006 were detected in the same study from Pakistan [3]. The Rep and Cp of Mon-32 exhibited maximum pairwise deduced aa identity of 87.10% and 51.60% with that of isolates PKgoat11 and PK5006, respectively (Tables 3 and 4). Phylogenetically, Mon-32 clustered with isolate PKgoat11 within a clade that also consisted of PK5006 (Figure 6; Supplementary material S5).
The ICTV has recommended a species demarcation threshold of 80% genome-wide pairwise nt sequence identity for members of the family Circoviridae [1,2]. Based on the ICTV classification system, mongoose associated circoviruses Mon-1/Mon-29 and Mon-66 qualify as novel species within the genus Circovirus, whilst mongoose associated cycloviruses Mon-20/Mon-24 and Mon-32 represent new species in the genus Cyclovirus (Table 2; Supplementary Materials S3 and S4). On the other hand, the maximum pairwise identities (79.90-80.20%) of mongoose associated cycloviruses Mon-58, -60 and -62 with cycloviruses from other animals/sources were borderline to the cut-off identity value for assigning novel cycloviral species (Table 2; Supplementary material S4).
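The demarcation logic applied above can be summarized in a small sketch that uses the two cut-offs cited in the text (80% genome-wide pairwise identity for species and 55% for genus assignment). The `margin` used to flag borderline values is a hypothetical illustration parameter, not an ICTV criterion, and the function name and returned labels are likewise assumptions.

```python
def demarcation_call(max_identity, species_cutoff=80.0, genus_cutoff=55.0, margin=0.25):
    """Classify a genome by its maximum genome-wide pairwise identity (%).

    species_cutoff and genus_cutoff follow the values cited in the text;
    margin is an illustrative window around the species cut-off.
    """
    if max_identity < genus_cutoff:
        return "below genus cut-off"
    if abs(max_identity - species_cutoff) <= margin:
        return "borderline"
    if max_identity < species_cutoff:
        return "novel species"
    return "same species"


# Maximum genome-wide identities reported in the text.
for label, value in [("Mon-1", 67.40), ("Mon-66", 60.70), ("Mon-20/-24", 72.10),
                     ("Mon-32", 77.30), ("Mon-58/-60/-62 (low)", 79.90),
                     ("Mon-58/-60/-62 (high)", 80.20)]:
    print(label, demarcation_call(value))
```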

Conclusions
Taken together, our findings suggest that CRESS DNA viruses are widely circulating in the small Indian mongoose population on the island of St. Kitts. However, in the absence of sampling from tissues, and considering that mongooses are polyphagous predators, we could not determine whether the circoviruses/cycloviruses in fecal samples of apparently healthy mongooses were of dietary origin, or actually infected the host. Based on the classification scheme proposed by the ICTV Circoviridae study group [1,2], we identified 2 novel species in each of the genera Circovirus and Cyclovirus, further expanding the genetic diversity of these viruses. However, despite high genetic diversity, the mongoose associated circoviruses/cycloviruses retained the various features that are conserved among members of the family Circoviridae. Studies aimed at detection of viral DNA in tissues, screening for virus-specific antibodies, in-vitro replication of virus in mongoose cells and virus inoculation in gnotobiotic animals are required to gain a proper understanding of circovirus/cyclovirus infection in mongooses. To our knowledge, this is the first report on detection and complete genome analysis of circoviruses and cycloviruses in the small Indian mongoose, warranting further studies in other species of mongooses.