Identification of Putative Novel Rotavirus H VP7, VP4, VP6 and NSP4 Genotypes in Pigs

Rotavirus H (RVH) has been detected in humans, pigs and bats. Recently, RVH infections were reported in different porcine farms worldwide, suggesting epidemiological relevance. However, to date, the genome information of RVH strains has been limited due to the scarcity of deposited sequences. This study aimed to characterize the VP7, VP4, VP6 and NSP4 genes of RVHs from 27 symptomatic pigs, in Italy, between 2017 and 2021. RVH genes were amplified via RT-PCR using specific primers, and the amplicons were sequenced. By coupling the data generated in this study with the sequences available in the databases, we elaborated a classification scheme useful to genotype the VP7, VP4, VP6 and NSP4 genes. The nucleotide identity and phylogenetic analyses unveiled an impressive genetic heterogeneity and allowed the classification of the Italian RVH strains into 12G (VP7), 6P (VP4), 8I (VP6) and 8E (NSP4) genotypes, of which 6I, 5E and the totality of the G and P genotypes were of novel identification. Our data highlight the high genetic variability of the RVH strains circulating in pigs and underline the importance of a robust classification system to track the epidemiology of RVHs.


Introduction
Rotaviruses (RVs) are among the major etiological agents of enteritis in humans and animals worldwide. Based on the genetic diversity of the intermediate capsid protein (VP6), RVs are currently classified into nine groups (A-L) (https://talk.ictvonline.org/taxonomy/, accessed on 1 September 2023). The genome comprises 11 segments of double-stranded RNA encoding six structural (VP1-VP4, VP6 and VP7) and five or six nonstructural proteins (NSP1-NSP5/NSP6) [1]. The outer capsid proteins VP4 and VP7 mediate the interaction with host cell receptors and induce protective immunity by eliciting neutralizing antibodies [2]. VP6 is involved in the transcription activity of the double-layered virion [2]. The NSP4 protein is an enterotoxin that causes intracellular calcium imbalance and induces secretory diarrhea [2]. Because of their role in host range restriction and pathogenicity, the gene segments encoding these proteins are often molecularly characterized [3,4].
RV infections are highly prevalent in pig herds and are frequently associated with acute diarrhea in young pigs, representing an important concern for the swine industry. The RV groups most commonly associated with enteric disease in pigs are RVA, RVB, RVC and RVH [5-7].
RVH was first identified in China in sporadic cases of gastroenteritis in human patients in 1987 and 1988 [8]. In 1994 in Beijing and in 1997 in Shijiazhuang, RVH caused large gastroenteritis outbreaks [9]. RVH was initially named new adult diarrhea RV (ADRV-N) [10,11]. In pigs, RVH was first described in fecal samples of animals affected by diarrhea between 1991 and 1995 in Japan [12]. Between 2008 and 2022, porcine RVH was also detected in the United States [13], Brazil [14], South Africa [15], Vietnam [16] and China [17], as well as in some European countries, such as Spain [18], Italy [7] and Russia [19]. In the period 2016-2022, the detection rates of RVH in symptomatic pigs ranged between 9 and 14% in Spain [18], Italy [7], Brazil [20] and China [17]. These values indicate that RVH is relatively widespread in swine populations, although the impact of RVH on porcine herds in terms of costs and animal health has not been assessed.
Previous studies reported that RVH infects mostly adult pigs in combination with other RV groups [7,20,21]. In addition to co-infections, infections by multiple strains of the same group (e.g., RVA) were also described [22,23]. While co-infections have been reported to increase the severity of diarrhea [24,25], the clinical impact of co-infections involving RVH is not known.
Genome sequencing is crucial to understand the evolutionary and epidemiological relationships among different RV strains. The adoption of a robust classification scheme is essential to facilitate data sharing and to promptly identify the origin of genome segments, thereby unveiling events of interspecies transmission, among animals and from animals to humans, possibly coupled with reassortment of RV genome segments [26]. To date, a uniform classification system for all 11 gene segments has been established by the Rotavirus Classification Working Group (RCWG) only for RVA strains, and not for the other RV groups [27]. Despite the relatively low number of complete genomic sequences of RVH available, a system for assigning genotypes, similar to the classification scheme adopted for RVA, has recently been suggested [21]. The RVH classification scheme has proposed 10G, 6P, 6I, 3R, 4C, 7M, 6A, 2N, 4T, 6E and 3H genotypes for the VP7, VP4, VP6, VP1, VP2, VP3, NSP1, NSP2, NSP3, NSP4 and NSP5 genes, respectively. This genotype system is based on the alignment of complete ORF sequences for each gene of the RVHs available in GenBank and on the adoption of cut-off values and additional criteria for partial ORF sequences [27]. Analysis of RVH sequences indicates that human strains are genetically highly homogeneous, whilst porcine RVH strains are genetically diverse, even in restricted geographical areas [21], suggesting that human RVH originated from a recent bottleneck event from an unidentified animal host. However, the limitation of this RVH classification scheme is that it relies on a small database. For this reason, our study aimed to collect additional RVH sequences in order to refine the classification system. These data serve as a foundation for developing accurate molecular diagnostics and conducting comprehensive epidemiological investigations.
To achieve this purpose, we determined the nucleotide sequences of the VP7, VP4, VP6 and NSP4 genes from 27 RVH-positive stool specimens collected from Italian porcine herds in the period 2017-2021. The RVH classification scheme was optimized using the larger sequence data set, re-calculating the nucleotide cut-off values previously proposed [21] for the classification of VP7, VP4, VP6 and NSP4 genotypes [27]. Through this process, we identified a great multiplicity of novel RVH genotypes circulating in pigs.

Samples
Between January 2017 and December 2021, 27 fecal samples from pigs with enteric disease, obtained as part of routine analyses conducted by the Istituto Zooprofilattico Sperimentale della Lombardia ed Emilia Romagna, were identified as RVH-positive using a previously established RT-qPCR protocol [7]. These specimens were collected from 23 different porcine farms across Italy: 17 from Lombardia, 2 from Veneto, 2 from Emilia Romagna, 1 from Piemonte and 1 from Umbria. Three farms were sampled twice in the same year (farms 2, 12 and 13), while one farm was sampled twice in two different years (farm 4). All the farms were industrial, with 19 managed for farrow to weaning and 8 for weaning (Table 1). All the pigs were fed with commercial feed and none were subjected to an anti-RVA vaccination plan. Age data and the status of infection by RV groups, tested by RT-qPCR [7], are reported in Table 1.

RNA Extraction and Amplicon Generation
Double-stranded RNA was extracted from 200 µL of 10% fecal suspension in minimum essential medium using QIAzol Lysis Reagent (QIAGEN, Hilden, Germany) according to the manufacturer's instructions. The extracted RNA underwent a denaturation step at 95 °C for 5 min, and RT-PCR was performed using the SuperScript IV One-Step kit (Invitrogen, Waltham, MA, USA) according to the manufacturer's protocol using custom primers. The primers employed (Table 2) were designed based on the full-length sequences of porcine RVH VP7, VP4, VP6 and NSP4 available in the NCBI GenBank database (Supplementary Table S1). The PCR products were purified with the NucleoSpin® gel kit (Macherey-Nagel, Düren, Germany). The concentration and the quality of the obtained DNA were assessed using the Nanoquant Infinite M200 spectrophotometer (Tecan, Männedorf, Switzerland).

Sequencing
Purified DNA products from each sample were pooled and quantified using a QuantiFluor® ONE dsDNA kit and a Quantus Fluorometer (Promega, Fitchburg, MA, USA). The libraries were generated with an Illumina DNA Prep (M) tagmentation kit and sequenced on an Illumina MiniSeq platform with 2 × 150 bp paired-end reads (Illumina, San Diego, CA, USA). Adapters were automatically removed from FASTQ files. Raw reads were filtered to remove low-quality bases (Phred score < 30) and trimmed to remove residual sequencing adapters using Trimmomatic (v 0.39).
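As an illustration of the quality filter described above, the following minimal Python sketch trims bases with a Phred score below 30 from both ends of a read. It is not the Trimmomatic algorithm, nor the exact settings used in the study; the function names and the Phred+33 quality encoding are assumptions made for this example.

```python
def phred_scores(quality: str, offset: int = 33) -> list[int]:
    """Decode a FASTQ quality string (Phred+33 assumed) into integer scores."""
    return [ord(ch) - offset for ch in quality]


def trim_low_quality_ends(seq: str, quality: str, min_q: int = 30) -> tuple[str, str]:
    """Remove leading and trailing bases whose Phred score is below min_q.

    Hypothetical helper for illustration only; internal low-quality bases
    are left untouched.
    """
    if len(seq) != len(quality):
        raise ValueError("sequence and quality strings must have equal length")
    scores = phred_scores(quality)
    start, end = 0, len(scores)
    # Walk inwards from the 5' end while quality is too low.
    while start < end and scores[start] < min_q:
        start += 1
    # Walk inwards from the 3' end while quality is too low.
    while end > start and scores[end - 1] < min_q:
        end -= 1
    return seq[start:end], quality[start:end]


if __name__ == "__main__":
    # '#' encodes Q2, '5' encodes Q20, '?' encodes Q30, 'I' encodes Q40.
    print(trim_low_quality_ends("ACGTAC", "#II?I5"))
```

A read whose bases all fall below the threshold is reduced to an empty string and would be discarded in a real pipeline.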
The reads were de novo assembled into contigs by the CLC Genomics Workbench (v.23.0.5, QIAGEN, Hilden, Germany) and SPAdes (v.3.15.5) [28] assemblers. Contig sequences that were identified by both assemblers and that showed a minimum average coverage of 30 were employed for the phylogenetic analyses. The gene sequences that remained incomplete with the Illumina approach were re-sequenced using a BigDye Terminator v1.1 Cycle Sequencing kit on an automated ABI Prism 3500xL Genetic Analyzer (Thermo Fisher Scientific, Waltham, MA, USA) using the primers reported in Table 2. The identity of each segment was confirmed via BLAST analysis using the National Center for Biotechnology Information GenBank Tool (NCBI, https://www.ncbi.nlm.nih.gov/, accessed on 3 July 2023). The sequences generated in our study were deposited into GenBank under accession numbers from OR817801 to OR817920.
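The contig selection rule above (recovered by both assemblers, average coverage of at least 30) can be sketched as follows. How concordance between assemblers was judged is not stated in the text, so this example assumes that two contigs match when their sequences are identical up to strand orientation; the `Contig` class and function names are hypothetical.

```python
from dataclasses import dataclass

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


@dataclass(frozen=True)
class Contig:
    name: str
    sequence: str
    mean_coverage: float


def canonical(seq: str) -> str:
    """Return a strand-independent key: the smaller of seq and its reverse complement."""
    seq = seq.upper()
    revcomp = seq.translate(_COMPLEMENT)[::-1]
    return min(seq, revcomp)


def select_concordant_contigs(
    clc: list[Contig], spades: list[Contig], min_coverage: float = 30.0
) -> list[Contig]:
    """Keep contigs found by both assemblers with sufficient coverage in each.

    Assumption: 'identified by both assemblers' is modelled as exact
    sequence identity, ignoring strand.
    """
    spades_by_key = {canonical(c.sequence): c for c in spades}
    selected = []
    for contig in clc:
        match = spades_by_key.get(canonical(contig.sequence))
        if (
            match is not None
            and contig.mean_coverage >= min_coverage
            and match.mean_coverage >= min_coverage
        ):
            selected.append(contig)
    return selected


if __name__ == "__main__":
    clc = [Contig("clc_1", "AACG", 45.0), Contig("clc_2", "GGGA", 100.0)]
    spades = [Contig("sp_1", "CGTT", 50.0)]  # reverse complement of AACG
    print([c.name for c in select_concordant_contigs(clc, spades)])
```

In practice, near-identical contigs would more likely be compared by alignment; exact matching is used here only to keep the sketch short.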

Genotyping and Phylogenetic Analyses
The nucleotide sequences were aligned with known genotypes of porcine and human RVH strains available in the GenBank database (Supplementary Table S1) using the ClustalW software (v.2.1) implemented in BioEdit, version 7.2.5 [29]. Genetic distances were calculated at the nucleotide level using Kimura's two-parameter correction in MEGA v.11 [30]. The nucleotide cut-off values for genotype assignment were established based on nearly full-length ORFs (>94% of the ORF) of each gene from reference and Italian sequences (Supplementary Table S1). Pairwise identity frequency graphs, obtained from the genetic distances, were constructed by plotting the percentage of identity on the x-axis and the frequency of each calculated pairwise identity on the y-axis. The most appropriate cut-off was defined as the value that separates the intra-genotype identities from the inter-genotype identities, so as to avoid overlaps between different genotypes. For sequences with a partial ORF (<94%), genotype assignment was performed using values 2% higher than those calculated for complete ORFs, according to the RCWG criteria for RVA [27]. However, only sequences encompassing at least 80% of the ORF length were genotyped. Minimum and maximum values of nucleotide identity between genotypes and within genotypes (Supplementary Table S2), and between the reference and the Italian strains (Supplementary Table S3), were calculated. The phylogenetic trees were constructed on the partial ORF of the VP7 (81%), VP4 (94%), VP6 (92%) and NSP4 (88%) genes using the maximum-likelihood method with 500 bootstrap replicates. The best-fit nucleotide substitution model for each gene dataset was selected based on the lowest Bayesian Information Criterion (BIC) score obtained from model testing in MEGA v.11 [30]. The selected best-fit models were the Tamura 3-parameter (T92) model with discrete Gamma distribution (G) and invariant sites (I) for VP6 and VP7, the Hasegawa-Kishino-Yano (HKY) model with G+I for NSP4, and the General Time Reversible (GTR) model with G+I for VP4 sequences.
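To make the distance and cut-off steps above concrete, the sketch below computes the Kimura two-parameter distance, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences, and then looks for a gap between intra- and inter-genotype identities. It is an independent illustration, not the MEGA implementation; the function names are hypothetical, and ignoring sites with gaps or ambiguous bases is an assumption of the example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned nucleotide sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: skip columns with gaps or ambiguous bases.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def separating_gap(intra: list[float], inter: list[float]):
    """Return (highest inter-genotype, lowest intra-genotype) identity if the
    two sets do not overlap, otherwise None. A cut-off can be placed anywhere
    inside the returned interval."""
    highest_inter = max(inter)
    lowest_intra = min(intra)
    if lowest_intra <= highest_inter:
        return None
    return highest_inter, lowest_intra


if __name__ == "__main__":
    print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
    print(separating_gap([88.0, 95.5], [70.2, 85.9]))
```

When `separating_gap` returns `None`, the two distributions overlap and no single threshold separates them cleanly, which is the situation the pairwise identity frequency graphs are meant to expose.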

VP4
Three out of twenty-seven fecal samples (ITA/48625-1/2017, ITA/76963/2018 and ITA/165825/2021) exhibited mixed VP4 sequences. The genotyping threshold for the VP4 gene was adjusted from the previously proposed 86% [21] to 87%. Upon comparing 94% of the ORF sequences, it was observed that all 26 Italian strains shared a percentage of identity lower than 87% with the reference strains, suggesting that they could belong to novel genotypes (P7-P12). Similarly, the phylogenetic tree showed that the Italian strains are not closely related to reference VP4 sequences (Figure 1). Strain ITA/51105-6/2020 shared a nucleotide identity above the 87% cut-off with only one sequence of the P10 genotype (ITA/14193-5/2020), but phylogenetically, it belonged to a different clade. For these reasons, the genotype assignment remained unclear. Interestingly, the alignment of a region of 1425 nucleotides in length (from position 945 to 2370) of ITA/51105-6/2020 and ITA/14193-5/2020 showed a high degree of similarity (97%). Analysis conducted via the RDP software (v.4.101) with sequences of porcine RVH strains revealed significant recombination events within ITA/51105-6/2020 and ITA/14193-5/2020 (Table 4). Although the two recombinants shared the larger fraction of the sequence, the origin of the minor fraction of the VP4 gene of ITA/51105-6/2020 remained unknown, while that of ITA/14193-5/2020 was probably derived from four P10 strains (ITA/154488/2018, ITA/48625-1(I)/2017, ITA/101803/2018 and ITA/305471/2017). The similarity plot analysis confirmed the RDP results, placing the breakpoint at position 945 bp of the gene sequence (Supplementary Figure S1). Samples collected within the same year and from the same farm (farms 2, 12 and 13) were infected by strains of the same genotype (Table 3). In contrast, samples collected in the same location but during two different years showed the presence of different genotypes.
However, the Italian strains that cluster with Spanish SP-VC29, SP-VC36 and SP-VC19 constitute two separate clades distinct from those of other I3 strains (from Brazil and South Africa). These data suggest that these strains could belong to novel genotypes.
Samples collected from the same farm (farms 2, 12 and 13) during the same year (2020), as well as those collected in different years (farm 4, 2018 and 2019), were found to be infected by the same I genotype (Table 3).
Strain ITA/84987/2021 shared percentages of identity above the cut-off with only a few E4 clade 1 sequences, so its classification is not well defined.
Distinct E genotypes were detected in samples collected from the same farms during the same year or in different years (farms 2, 4 and 13). Only in farm 12, which was sampled twice during the same year, did the sequenced strains belong to the same genotype (E4 clade 1) (Table 3).

Discussion
To date, a substantial amount of information regarding the evolution of RV group A is available due to the high number of genomic sequences deposited in GenBank and the presence of a consolidated genotyping system (RCWG) [27]. However, data on the genomic classification of other RV groups are poor and often inconsistent due to the absence of a standardized genotyping method. In particular, RVH, discovered fairly recently in both diarrheic and healthy pigs, needs a detailed analysis in terms of genomic variability, host specificity and geographic distribution. Therefore, this study sequenced and genotyped the genomic segments VP7, VP4, VP6 and NSP4, which are involved in virus infectivity and pathogenesis. These data were used for the validation and improvement of the recently proposed classification system [21]. The analyses were performed on the nearly full-length sequences of the VP7, VP4, VP6 and NSP4 genes, which were amplified by PCR with specific primers and sequenced by next-generation sequencing or, in some cases, by the Sanger method. We characterized and genotyped Italian porcine RVH strains, updating the VP7, VP4, VP6 and NSP4 cut-off values previously proposed [21]. For the VP7 gene, the previously defined cut-off value was retained [21]. For the other genes, the values were slightly increased. In the case of partial ORFs (<94% of the ORF), we applied the criteria established by the RCWG for RVA, which recommend a minimum ORF length (>50%) and a threshold 2% above the appropriate cut-off for complete ORFs. For a more reliable classification, we included only sequences containing more than 80% of the ORF.
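The assignment criteria summarized above can be expressed as a short decision rule. The sketch below is illustrative only: the function name and return labels are hypothetical, treating an identity equal to the cut-off as sufficient is an assumption, and it relies on nucleotide identity alone, whereas the study also considered phylogenetic clustering. The 87% value used in the usage example is the VP4 cut-off reported in the text.

```python
def assign_genotype(
    best_identity_by_genotype: dict[str, float],
    orf_fraction: float,
    full_orf_cutoff: float,
    partial_margin: float = 2.0,
    full_orf_min: float = 0.94,
    min_orf: float = 0.80,
) -> str:
    """Assign a genotype from the highest percent identity to each reference genotype.

    orf_fraction is the fraction of the ORF covered by the query (0-1).
    """
    # Sequences covering too little of the ORF are not genotyped at all.
    if orf_fraction < min_orf:
        return "not genotyped"
    # Partial ORFs must exceed a stricter threshold (cut-off + 2%).
    threshold = full_orf_cutoff
    if orf_fraction < full_orf_min:
        threshold += partial_margin
    if not best_identity_by_genotype:
        return "putative novel"
    genotype, identity = max(best_identity_by_genotype.items(), key=lambda kv: kv[1])
    return genotype if identity >= threshold else "putative novel"


if __name__ == "__main__":
    vp4_cutoff = 87.0  # VP4 cut-off (%) reported in the text
    print(assign_genotype({"P1": 80.1, "P5": 84.3}, 0.94, vp4_cutoff))
    print(assign_genotype({"P10": 88.0}, 0.85, vp4_cutoff))
```

Borderline cases, such as a strain exceeding the cut-off with only one member of a genotype, are not resolved by this rule and still require inspection of the phylogenetic tree.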
Based on these criteria, we identified 12G, 6P, 8I and 8E genotypes in porcine herds in Northern Italy during the period 2017-2021. Interestingly, many of the described genotypes were classified as potential new types and were not related to any reference strain. Only in the case of the VP6 and NSP4 genes were some strains genetically related to the reference strains of established genotypes.
Based on nucleotide identity and phylogenetic analyses, most of the VP6 sequences were closely related to the Spanish strains VC-19, VC-29 and VC-36, which were previously classified as the I3 genotype [18]. However, they were phylogenetically distant from other I3 strains from Brazil and South Africa. These data suggest that the Spanish and Brazilian/South African strains may be classified into two distinct variants of the I3 genotype.
Similarly, NSP4 analyses showed that some sequences belonged to distinct clades of the E3 genotype, while other strains seemed to belong to different clades of the E4 genotype. However, with an increased number of sequences and updated nucleotide cut-off values for these genes, both genotype assignment and phylogenetic relationships are now better defined. Therefore, strains that were previously grouped in the same genotype cluster may be re-classified as belonging to a novel genotype/variant group. For this reason, sequencing and analyzing more RVH strains are necessary to provide robustness to the genotype classification system.
For NSP4 ITA/84987/2021 and VP4 ITA/51105-6/2020 sequences, genotype assignment was not achieved since nucleotide identity was above the threshold only with a few sequences of the same genotype.
In particular, among strains of porcine RVH, we identified two recombinants of VP4 (ITA/51105-6/2020 and ITA/14193-5/2020). These two strains were collected from the same farm in the same year and might have originated from P10 strains or an unknown genotype. Our analyses did not reveal recombination with strains of other RV groups, an event that is less frequent but was recently described in NSP3 between RVH and RVC strains [19,21]. Aside from recombination, other events, such as reassortment and mutations, may have contributed to the high variability of strains. These events are common in rotaviruses because their genome is segmented and RNA-based [2].
To compare the variability of the strains from this study against reference strains collected worldwide (Japan, USA, South Africa, China, Spain), we calculated the range of nucleotide identity (Supplementary Table S3).Such analyses showed that the Italian strains, collected within a restricted geographical area over a limited timeframe (4 years), shared a sequence identity comparable to that observed among reference strains collected over a more extended period (Supplementary Table S3).
The wide variability observed among the Italian strains could be related to the high density of intensive swine farms, typical of the north of Italy. Moreover, the detection of multiple gene alleles within the same sample, as observed in some cases, would suggest the simultaneous circulation of different RVH strains within herds, which could favor reassortment and recombination events, further increasing the genetic variability of circulating strains. Finally, it is important to note that the NGS sequencing method employed herein was applied to PCR amplicons of gene segments using degenerate primers specific for the different RVH genes. With this approach, it cannot be excluded that specific gene alleles may be preferentially amplified in PCR. In addition, in some cases, a gene sequence was obtained by a combination of sequencing techniques to confirm the data. It should be noted that when Sanger sequencing was applied, it was possible to determine only the dominant sequence, whilst mixed infections resulted in poor-quality electropherograms.
This study highlighted the great variability among RVH strains, but it does not explore the correlation between specific genotypes and clinical symptoms, both because the

Figure 1. Phylogenetic trees constructed from a partial open reading frame of VP7, VP6, VP4 and NSP4 genes of RVH strains. Phylogenetic trees were constructed via the maximum likelihood method. Statistical support was provided via the bootstrapping of 500 pseudo-replicates. Bootstrap values above 70 are given at each branch node. Black circles represent the RVH strains analyzed in this study. Genotypes are specified on the right. The outgroup branch length for VP6 and VP4 was shortened (dotted line) due to its excessive length.


Table 1. Data on porcine farm management and RV infection status.

Table 2. Primers used for the amplification of the VP7, VP4, VP6 and NSP4 segments.

Table 3. Genotypes for each gene of the RVH strains obtained in the study.