Whole Genome Classification and Phylogenetic Analyses of Rotavirus B strains from the United States

Rotaviruses (RVs) are a major etiological agent of acute viral gastroenteritis in humans and young animals, with rotavirus B (RVB) often detected in suckling and weaned pigs. Group A rotavirus classification is currently based on the two outer capsid proteins, VP7 and VP4, and the middle layer protein, VP6. Using RVB strains generated in this study and reference sequences from GenBank, pairwise identity frequency graphs and phylogenetic trees were constructed for the eleven gene segments of RVB to estimate the nucleotide identity cutoff values for different genotypes and determine the genotype diversity per gene segment. Phylogenetic analysis of VP7, VP4, VP6, VP1–VP3, and NSP1–NSP5 identified 26G, 5P, 13I, 5R, 5C, 5M, 8A, 10N, 6T, 4E, and 7H genotypes, respectively. The analysis supports the previously proposed cutoff values for the VP7, VP6, NSP1, and NSP3 gene segments (80%, 81%, 76%, and 78%, respectively) and suggests new cutoff values for the VP4, VP1, VP2, VP3, NSP2, NSP4, and NSP5 gene segments (80%, 78%, 79%, 77%, 83%, 76%, and 79%, respectively). Reassortment events were detected between the porcine RVB strains from our study. This research describes the genome constellations for the complete genome of Group B rotaviruses in different host species.


Introduction
Rotaviruses (RVs) are a major etiological agent causing acute viral gastroenteritis in humans and young animals, including young calves, weaning and post-weaning pigs [1,2]. RVs are members of the Reoviridae family, with a genome consisting of eleven double-stranded RNA gene segments that encode six structural proteins (VP1-VP4, VP6, and VP7) and five or six nonstructural proteins (NSP1-NSP5/NSP6) [3]. The triple-layered capsid is comprised of the outer layer of VP7 and VP4, the inner layer of VP6, and the core VP2. RVs are classified into eight species (A-H) based on antigenic relatedness or sequencing of the inner capsid protein VP6. Two tentative species, I and J, have recently been identified in fecal specimens from sheltered dogs in Hungary and guano samples from bats from Serbia, respectively [4][5][6]. The most common species infecting animals, including humans, are rotavirus A, B and C (RVA, RVB, and RVC, respectively), with RVA being the most prevalent, whereas Groups D-H only infect animals [3,7]. Rotavirus B (RVB) was first identified as the cause of severe gastroenteritis among adults in China from late 1982 to early 1983. RVB continued to be responsible for diarrheal disease in humans in India, Bangladesh, Nepal, and Myanmar [8][9][10][11][12][13][14][15]. In addition to humans, RVB strains have been detected in different host species such as rats [16], cattle [17][18][19][20][21][22], goats [23,24], sheep [25], and swine [26][27][28].
RVB has yet to be isolated in cell culture, which has hampered obtaining serological information on this species.

Materials and Methods
RNA was extracted from clinical samples using TRIzol reagent (Ambion, Carlsbad, CA, USA) and Direct-zol filter columns (Zymo Research, Irvine, CA, USA). Full-length cDNA was produced from dsRNA by the single primer amplification technique (SPAT) [36,37]. Briefly, DNA primers were ligated onto the 3′ ends of the double-stranded RNA genome segments, and RT-PCR was carried out using primers complementary to the ligated sequences. The cDNA was prepared for next-generation sequencing (NGS) with the Nextera XT library preparation kit (Illumina, San Diego, CA, USA). Sequencing was performed on the MiSeq (Illumina) NGS platform using the 2 × 150 bp run option. Raw de-multiplexed sequencing reads were trimmed and de novo assembled using the CLC Genomics Workbench (Qiagen Bioinformatics/CLC Bio, Redwood City, CA, USA). Complete CDS were obtained for the NSP genes, whereas some of the VP genes for specific strains could not be obtained (VP4 and VP2 from strain RVB/Pig-wt/USA/KS2/2012; VP1 and VP3 from strain RVB/Goat-wt/USA/CA22/2014) due to a low viral read count. The porcine and caprine RVB nucleotide sequences used in the present study have been submitted to GenBank (NCBI) under the following accession numbers: NSP1 ( The RVB sequences from this study and the RVB sequences available from GenBank were aligned using the MUSCLE alignment in Geneious 10.1.3 software [38]. Strains with less than 80% of the open reading frame were excluded from analysis. To determine the genotype classification of the eleven dsRNA segments, phylogenetic trees and pairwise nucleotide (nt) identity frequency graphs were created, and the cutoff values were defined as the percentages separating inter-genotype from intra-genotype nucleotide identities [29]. A Kruskal-Wallis rank sum test was run for each gene segment to determine whether nucleotide identities differed among host species.
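The pairwise identity and cutoff step described above can be illustrated with a minimal Python sketch. It is not the Geneious workflow used in the study: the function names, the toy alignment, the toy genotype labels, the choice to skip gapped columns, and the midpoint rule for placing a cutoff are all assumptions made for illustration.

```python
from itertools import combinations

GAP = "-"

def pairwise_identity(seq_a, seq_b):
    """Percent identity over aligned columns where neither sequence has a gap.

    Skipping gapped columns is an assumption; other tools may count them.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == GAP or b == GAP:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns shared by the two sequences")
    return 100.0 * matches / compared

def identity_table(alignment):
    """Map every (name_a, name_b) pair to its percent identity."""
    return {
        (a, b): pairwise_identity(alignment[a], alignment[b])
        for a, b in combinations(sorted(alignment), 2)
    }

def split_by_genotype(identities, genotype_of):
    """Separate identities into within-genotype and between-genotype lists."""
    intra, inter = [], []
    for (a, b), pid in identities.items():
        if genotype_of[a] == genotype_of[b]:
            intra.append(pid)
        else:
            inter.append(pid)
    return intra, inter

def candidate_cutoff(intra, inter):
    """Midpoint of the gap between the two distributions (hypothetical rule).

    Returns None when the distributions overlap or either list is empty.
    """
    if not intra or not inter or max(inter) >= min(intra):
        return None
    return (max(inter) + min(intra)) / 2

# Toy data, invented for illustration only (not RVB sequences).
TOY_ALIGNMENT = {
    "s1": "ACGTACGTAC",
    "s2": "ACGTACGTAT",
    "s3": "TTGTACCTGA",
    "s4": "TTGTACCTGG",
}
TOY_GENOTYPES = {"s1": "X", "s2": "X", "s3": "Y", "s4": "Y"}
```

In practice the two lists returned by `split_by_genotype` would be plotted as a frequency histogram per gene segment, and the cutoff read off where the inter-genotype and intra-genotype distributions separate.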
The phylogenetic trees were constructed by maximum likelihood using the general time reversible (GTR) substitution model [39] with 500 bootstrap replicates in Geneious software.
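For readers unfamiliar with bootstrap support, the resampling step behind the 500 replicates can be sketched as follows. This is only an illustration of column resampling with replacement; it does not perform maximum likelihood inference, and the function names and default seed are assumptions rather than anything taken from Geneious.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement to build one pseudo-alignment."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    length = lengths.pop()
    # The same sampled column indices are applied to every sequence,
    # so each resampled column is an intact column of the original alignment.
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[c] for c in columns)
        for name, seq in alignment.items()
    }

def bootstrap_replicates(alignment, n_replicates=500, seed=0):
    """Return n_replicates pseudo-alignments; the seed makes runs repeatable."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]
```

Each pseudo-alignment would then be passed to a tree-building program, and the support for a clade is the fraction of replicate trees in which that clade appears.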

Results
Porcine fecal samples (n = 21) from farms in Illinois and Kansas and a single goat fecal sample from California were submitted to the Veterinary Diagnostic Laboratory at Kansas State University between 2012 and 2014 for sequencing. Pairwise identity frequency graphs (Supplementary Materials, Figure S1) and phylogenetic trees were constructed for the eleven gene segments of RVB strains generated in the present study and RVB sequences available from GenBank to assess nucleotide identity between the RVB host species and determine the nucleotide percent identity cutoff values and number of genotypes per gene segment. Porcine median nucleotide identities were significantly lower than bovine and human for all gene segments except VP1, VP2, NSP1, NSP2, and NSP3 (Table 1). Human strains had the highest median nucleotide identities for all but the VP3, NSP1, NSP3, NSP4, and NSP5 segments. In Table 1, values with superscripts "a", "b", and "c" are statistically different from one another (p-value < 0.05) within the same gene segment based on a Kruskal-Wallis rank sum test.
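The host comparison above rests on the Kruskal-Wallis rank sum test. A minimal pure-Python sketch of the H statistic with tie correction is given below; the function names are hypothetical, and the closed-form p-value applies only when exactly three groups are compared (two degrees of freedom), which is an assumption about how the host groups were set up.

```python
from collections import Counter
from math import exp

def kruskal_wallis_h(groups):
    """Return (H, degrees_of_freedom) for a list of samples, tie-corrected."""
    if len(groups) < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)

    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * rank_term - 3 * (n + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    ties = Counter(pooled)
    correction = 1 - sum(t ** 3 - t for t in ties.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction, len(groups) - 1

def p_value_three_groups(h):
    """Chi-squared upper tail for 2 degrees of freedom: exp(-H / 2)."""
    return exp(-h / 2)
```

For an arbitrary number of groups, a statistics library that provides the chi-squared distribution would replace `p_value_three_groups`.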
While nt cutoff values identified in this study were consistent with already established cutoffs for the VP7, VP6, NSP1, and NSP3 gene segments (80%, 81%, 76%, and 78%, respectively), pairwise identity frequency graphs indicated new nt cutoff values for the NSP2, NSP4, NSP5, and VP3 gene segments (83%, 76%, 79%, and 77%, respectively; Table 2). In addition, we propose nt cutoff values of 80%, 78%, and 79% for VP4, VP1, and VP2, respectively. The NSP5 gene segment includes one genotype in addition to the six genotypes reported by [33], with the caprine strains and the Japanese bovine strains classified within the same genotype (Figure 1). Phylogenetic analysis showed no interspecies mixing of genotypes among porcine, human, or murine strains, while goat and bovine strains shared clades for all segments except VP1.
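As an illustration of how the cutoff values above might be applied to a new sequence, the following sketch looks up the cutoff for a gene segment and checks a query's best identity against each known genotype. The cutoff numbers are those stated in the text; the function name, the genotype labels in the usage example, and the choice to treat identity equal to the cutoff as a match are assumptions.

```python
# Percent nucleotide identity cutoffs per RVB gene segment, as listed in the text.
RVB_CUTOFFS = {
    "VP7": 80, "VP4": 80, "VP6": 81,
    "VP1": 78, "VP2": 79, "VP3": 77,
    "NSP1": 76, "NSP2": 83, "NSP3": 78, "NSP4": 76, "NSP5": 79,
}

def assign_genotype(segment, best_identity_by_genotype):
    """Return the best-matching genotype label, or None for a candidate new genotype.

    best_identity_by_genotype maps a genotype label to the highest percent
    identity between the query and any reference strain of that genotype.
    Using >= rather than > at the cutoff is an assumption.
    """
    cutoff = RVB_CUTOFFS[segment]
    if not best_identity_by_genotype:
        return None
    label, identity = max(best_identity_by_genotype.items(), key=lambda kv: kv[1])
    return label if identity >= cutoff else None
```

For example, `assign_genotype("VP7", {"G1": 72.5, "G2": 84.1})` returns `"G2"` because 84.1 meets the 80% VP7 cutoff, whereas a query whose best NSP2 identity is 82.9% falls below the 83% cutoff and returns `None`. The labels `G1` and `G2` here are placeholders, and a final assignment would also depend on the phylogenetic clustering.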

Discussion
Until now, genotype classifications for the RVB gene segments VP1, VP2, and VP4 were lacking. This study identified percent identity nucleotide cutoff values for VP1, VP2, and VP4 while updating VP3, NSP2, NSP4, and NSP5 cutoff values using additional porcine RVB strains from the US. Compared to RVA and RVC, the RVB nucleotide percent identity cutoff values are lower for all gene segments except for VP7, which shares the same nucleotide cutoff value with RVA [29,40]. The lower cutoff values suggest higher sequence diversity of RVB compared to other rotavirus species, which has been discussed in previous studies as well [22,27]. Although this has been refuted by more recent studies showing that VP6 percent identities range between 65% and 100% for both RVA and RVB [4], RVB does appear to be more diverse when considering the number of genotypes present in certain hosts. In swine, only three and eight VP6 genotypes have been identified for RVA and RVC, respectively, compared to ten VP6 genotypes for RVB [31,40,41]. There are 17 VP7 genotypes of RVB in swine compared to 12 and 15 VP7 genotypes in RVA and RVC, respectively [42].
Our dataset indicated a higher diversity of RVB genotypes in swine hosts compared to other hosts, which has been observed in swine RVC as well. Percent identities of swine RVC VP7 are notably lower than human and bovine strains, and greater numbers of genotypes of nearly all gene segments exist for swine RVC than in other hosts [40,43,44]. This highlights the important contribution of swine to the genetic diversity of RVB and RVC and, as others have suggested, may indicate swine are the main hosts for these viruses [44]. While the range of VP6 percent identities found in this study agrees with previous work, we found that only sequences of porcine origin had percent identities lower than 70%, and it is possible that rotaviruses evolve more heterogeneously in swine than in other hosts.
Reassortment among rotaviruses is a common phenomenon due to their segmented genomes [45,46]. A previous study investigated the VP6 and VP7 segments among many of the reference porcine strains used in this study and found frequent reassortment [31]. Even within genotypes, substantial genetic diversity can be present, and reassortment among sub-clades within human-specific RVB genotypes has been reported [47]. In the samples sequenced for this study, we found evidence of frequent VP4 segment reassortment, which is likely due to coinfection with multiple RVB strains within swine. Reassortment of the outer capsid VP7 and VP4 proteins, in particular, would be expected to confer an evolutionary advantage, since they are the targets of neutralization and reassortment may help strains escape immune recognition.
Phylogenetic analysis showed host-specific RVB genotypes for murine, human, and porcine species, and genotype constellations for these species did not show cross-species reassortment events. This is in contrast to RVA, where multiple interspecies events have been reported, especially between humans and domestic animals such as swine, cattle, and horses [29,48,49]. Human-porcine and bovine-porcine reassortment of the VP3 and VP6 genes has been reported in RVC [40]. The exception to the RVB host specificity found in this study was the phylogenetic clustering of bovine and caprine RVB strains [24]. Whole genome sequencing of goat RVA strains has revealed a close phylogenetic relationship with bovine strains, pointing to historical reassortment events between the two host species [50][51][52]. Interspecies transmission of RVB could have occurred to produce this genetic similarity, although the bovine and goat strains were geographically separated, and any interspecies reassortment probably happened many years ago. Although we did not observe geographical separation of genotypes, additional sequencing and epidemiological studies could elucidate the prevalence of genotypes in other countries.
In summary, a provisional genome-based classification for RVB strains from human, bovine, caprine, porcine, and murine species was established, providing information relevant to understanding the evolution and epidemiology of RVB. Future research should include the sequencing and analysis of more RVB strains to ensure the consistency of the nucleotide cutoff values and to reveal the true diversity of RVB.