Molecular Evolutionary Analyses of the RNA-Dependent RNA Polymerase (RdRp) Region and VP1 Gene in Human Norovirus Genotypes GII.P6-GII.6 and GII.P7-GII.6

To understand the evolution of GII.P6-GII.6 and GII.P7-GII.6 strains, the prevalent human norovirus genotypes, we analysed both the RdRp region and VP1 gene in globally collected strains using authentic bioinformatics technologies. A common ancestor of the P6- and P7-type RdRp region emerged approximately 50 years ago and a common ancestor of the P6- and P7-type VP1 gene emerged approximately 110 years ago. Subsequently, the RdRp region and VP1 gene evolved. Moreover, the evolutionary rates were significantly faster for the P6-type RdRp region and VP1 gene than for the P7-type RdRp region and VP1 genes. Large genetic divergence was observed in the P7-type RdRp region and VP1 gene compared with the P6-type RdRp region and VP1 gene. The phylodynamics of the RdRp region and VP1 gene fluctuated after the year 2000. Positive selection sites in VP1 proteins were located in the antigenicity-related protruding 2 domain, and these sites overlapped with conformational epitopes. These results suggest that the GII.6 VP1 gene and VP1 proteins evolved uniquely due to recombination between the P6- and P7-type RdRp regions in the HuNoV GII.P6-GII.6 and GII.P7-GII.6 virus strains.


Introduction
Human norovirus (HuNoV) is a major causative agent of acute gastroenteritis in humans of all ages [1,2]. Previous epidemiological data suggest that HuNoV may be associated with 30-60% of patients with gastroenteritis [3][4][5]. Moreover, this agent has caused large outbreaks of food poisoning worldwide [6,7]. However, effective vaccines and antiviral agents are not available at present [7]. Therefore, this agent may be a public health concern [8].
Recently, authentic bioinformatic technologies have been used in population genetics, including the study of the evolution of various viruses [20]. Indeed, these methods may allow us to estimate the phylogeny, genome population, and antigenicity using threedimensional antigen structures. Information that reflects viral evolution may contribute to a better response to these questions. We studied the molecular evolution of chimeric HuNoV, such as GII.P17-GII.17, GII.P2-GII.2, and GII.P16-GII.2, which have caused major outbreaks in many countries [21][22][23]. However, such studies have not been performed on other GII genotypes to better understand GII HuNoV. In norovirus infections, GII.4 is the predominant genotype worldwide. However, recombinant GII.6 strains have been circulating since the 1970s, with outbreaks reported in Japan in 2008-2009 and in the United States and Italy in 2014-2015, with an overall prevalence second only to GII.4 [18,24,25]. Although this genotype has been reported to play an important epidemiological role in norovirus outbreaks, the molecular epidemiological mechanism underlying these outbreaks has not been studied in detail. Moreover, GII.6 almost always displays P6-or P7-type RdRp genotypes [26]. Therefore, in this study, we performed a comprehensive molecular analysis of globally collected HuNoV GII.P6-GII.6 and GII.P7-GII.6 strains.

Strains Used in This Study
To analyse the molecular evolution of HuNoV GII.6, the complete genome sequences of HuNoV were downloaded from GenBank (last accessed on 28 December 2022). In total, 11,810 strains were collected. They were classified into genotypes using the Norovirus Typing Tool (Ver.2.0), and GII.6 strains were selected [10]. HuNoV GII.6 data collected from each local government public health institution were added to the dataset because the number of GII.6 strains available for analysis, especially GII.P6-GII.6 strains, was small. Strains with an uncertain sequence or an unclear year of collection or area were excluded. Finally, 141 strains belonging to GII.6 remained and were used to analyse the molecular evolution of VP1. Similarly, 141 strains belonging to HuNoV GII.6 were obtained and used to analyse the molecular evolution of RdRp region. Details of the strains used in this study are presented in Supplementary Table S1.

Time-Scaled Phylogenetic Analyses
To evaluate the molecular evolution of the present strains, phylogenetic trees of the HuNoV RdRp region (1530 bp, excluding the stop codon) and the VP1 gene (1641-1650 bp, excluding the stop codon) were constructed using the Bayesian Markov chain Monte Carlo (MCMC) method in the BEAST package (v.2.6.7), as previously described [27,28]. First, the jModelTest2 program was used to determine the suitable substitution models [29]. Second, the path-sampling/stepping-stone sampling marginal likelihood estimation method was used to evaluate the best of the four clock models (strict clock, relaxed clock exponential, relaxed clock log normal, and random local clock) and the two prior tree models (coalescent constant population and coalescent exponential population). Although these were performed independently for VP1 gene and RdRp region analyses, SYM-Γ-I, relaxed clock exponential, and coalescent exponential population were selected for the molecular evolutionary analysis of VP1 gene. However, SYM-Γ, relaxed clock exponential, and coalescent exponential population were adopted for the molecular evolutionary analysis of RdRp. The lengths of the Bayesian MCMC chains and samples are listed in Table S2. Effective sample sizes (ESS) were calculated using Tracer and the convergence of all parameters was confirmed if the ESS was greater than 200. After a 10% burn-in, phylogenetic trees were generated using TreeAnnotator (v.2.6.7) and rendered using FigTree (v.1.4.0). Representative strains of each cluster were selected based on the most recent age of detection within the cluster. In addition, to compare the amino acid sequences of these representative strains, the GII.6 strain (AB039777) prototype was determined based on a previous report [10]. Molecular evolutionary rates were estimated using suitable models selected for each dataset, as described above. Statistical analyses were performed using the Kruskal-Wallis t-test for EZR [30].

Phylogenetic Distance Analyses
To calculate the phylogenetic distances among the strains, we used MEGA7 software [31]. The best substitution models were estimated using the jModelTest2 program. The phylogenetic distances between the present GII.6 strains were calculated from the pairwise maximum likelihood (ML) tree of the ML tree using the Patristic program [32].

Phylodynamic Analyses
To assess the phylodynamics of the GII.6 strains, the effective population sizes of the RdRp region and VP1 gene were calculated using Bayesian skyline plot (BSP) analysis implemented in the BEAST package [27]. Similar to the Bayesian MCMC method, the best substitution and clock models were selected. A Bayesian skyline plot and the 95% highest probability density (HPD) were visualized using Tracer [33].

Selective Pressure Analyses
The non-synonymous (dN) and synonymous (dS) substitution rates at each amino acid site were calculated to identify the selective pressure sites for the RdRp region and VP1 gene using the Datamonkey server (https://www.datamonkey.org/ accessed on 8 October 2022) [34]. Five algorithms-single likelihood ancestor counting (SLAC), fixedeffects likelihood (FEL), internal fixed-effects likelihood (IFEL), the mixed-effects model of evolution (MEME) method, and the fast, unconstrained Bayesian approximation (FUBAR) method-were used to identify positively selected sites, and all of them except FUBAR were used to detect negatively selected sites. The significance level was set at p < 0.05 for SLAC, FEL, IFEL, and MEME. Evidence of selective pressure for FUBAR was supported by a posterior probability > 0.9. In the positive selection analysis, sites common to more than four methods were regarded as positive selection sites, whereas in the negative selection analysis, sites common to more than three methods were considered negative selection sites.

Construction of the 3D Structure of RdRp and VP1 Proteins
To compare the VP1 and RdRp protein structures among genotypes, three-dimensional (3D) structural models of VP1 and RdRp proteins were constructed for each genotype using homology modeling. First, 3D structural models of VP1 in representative strains of each genotype (AB039777, LC122916, MH791993, MK956199, and JX989075) were generated using Protein Data Bank (PDB) ID: 6OTF as a template. Then, five models for each VP1 genotype were generated using Modellar software (version 9.23) [35]. These models were evaluated by Ramachandran plot analysis using WinCoot implemented in the CCP4 package [36] and the best-scoring models were chosen. Finally, the energy of the selected models for each strain was minimized using GROMOS96 implemented in the Swiss PDB viewer (ver4.1.0) [37]. Using a similar procedure, the models of RdRp protein in each representative strain (AB039777, LC122916, LC760173, MK956199, and JX989075) (Table S1) were constructed using the crystal structure of RdRp (PDB ID:1SH0) as a template.

Conformational B-Cell Epitope Prediction
PDB files of the crystal structures of the GII.2 VP1 protein (PDB ID:6OTF) and FASTA files of their amino acid sequences were downloaded from PDB (https://pdbj.org/?lang=ja accessed on 30 August 2022) to use as templates in the homology modeling method. To assess the conformational B-cell epitopes of the constructed VP1 protein models, four methods, DiscoTope 2.0 [38], ElliPro [39], SEMA [40], and SEPPA [41], were used, with cutoff values of −3.7, 0.5, 0.76, and 0.064, respectively. Regions with amino acid sequences predicted by three or more of these methods and those contiguous with three or more residues were regarded as conformational epitopes. Furthermore, conformational epitopes were mapped onto the VP1 protein models constructed above. Time-scale phylogenetic trees were constructed based on the full-length nucleotides of the RdRp region and VP1 gene using the Bayesian MCMC method. First, as shown in Figure 1A, a common ancestor of the P6-and P7-type RdRp regions diverged around December 1966 (mean; 95% HPDs, January 1941-March 1984). Subsequently, the P6-or P7-type RdRp regions further diverged and formed clusters 1 and 3, respectively. The main divergence times are shown in Figure 1A. The results suggested that a common ancestor of the P6-and P7-type RdRp region diverged approximately 50 years ago and evolved.  Next, as shown in Figure 1B, a common ancestor of the GII.6 VP1 gene diverged around March 1904 (mean; 95% HPDs, September 1823-April 1962). Thereafter, the genes diverged to form four clusters. The main divergence times are shown in Figure 1B. Finally, the GII.6 strains with the P6-type formed only one cluster, while the GII.6 strains with the P7-type formed three independent clusters. Furthermore, this phylogenetic tree estimated that a common ancestor of the P6-and P7-type VP1 genes diverged around November 1982 (mean; 95% HPDs, September 1970-February 1992). Thus, this time may be estimated as a recombination event between the GII.P6-GII.6 and GII.P7-GII.6 genomes in the present strains.

Evolutionary
Rates of the RdRp Region and VP1 Gene in HuNoV GII.P6-GII.6 and GII.P7-GII. 6 We also calculated the evolutionary rates using the Bayesian MCMC method. As shown in Table 1, the evolutionary rate was higher for GII.6 VP1 than the RdRp region, including P6-and P7-types (141 strains). The evolutionary rate was higher for the P6-type RdRp region than the P7-type RdRp. The evolutionary rate was higher for the P6-type GII.6 VP1 than the P7-type GII.6 VP1. These results suggest that the RdRp region and VP1 gene in the present strains evolved independently, and the evolutionary rates were significantly distinct.

Phylogenetic Distances among the Present Strains
To assess the genetic divergence of the RdRp region and VP1 gene in the present strains, we calculated their phylogenetic distances. The phylogenetic distances of the P6and P7-type RdRp regions and the GII.6 VP1 gene were 0.112 ± 0.098 (mean ± 1 standard deviation [SD]) and 0.317 ± 0.259 (mean ± 1 SD). As shown in Figure 2A, the VP1 gene showed statistically greater genetic divergence than the RdRp region (unpaired t-test, p < 0.001). Moreover, the genetic divergence was greater for the P7-type RdRp region than the P6-type RdRp region (unpaired t-test, p < 0.001). The detailed statistical data are shown in Table 2.  To assess the phylodynamics of the present GII.P6-GII.6 and GII.P7-GII.6 strains, we calculated time-scaled genome population sizes using the BSP method ( Figure 3A   To assess the phylodynamics of the present GII.P6-GII.6 and GII.P7-GII.6 strains, we calculated time-scaled genome population sizes using the BSP method ( Figure 3A

Positive Selection Sites in the RdRp and VP1 Proteins
We analysed the positive selection sites in the RdRp and VP1 proteins to estimate selective pressure against the host. No positive selection site was detected in the RdRp protein. In contrast, a few positively selected sites were identified in VP1. Of these, only Lys386His was predicted in the P6-type VP1 protein, whereas Pro354Thr, Pro354Ser,

Positive Selection Sites in the RdRp and VP1 Proteins
We analysed the positive selection sites in the RdRp and VP1 proteins to estimate selective pressure against the host. No positive selection site was detected in the RdRp protein.
In contrast, a few positively selected sites were identified in VP1. Of these, only Lys386His was predicted in the P6-type VP1 protein, whereas Pro354Thr, Pro354Ser, Pro354Gln, Asn390Thr, and Asn390Asp were predicted in the P7-type VP1 protein. These sites were located in the protruding 2 (P2) domain of the protein (Table S4). These results suggest that the GII.P7-GII.6 strains may receive stronger selection pressure from the host than the GII.P6-GII.6 strains. Amino acid substitutions were also observed in sites other than the P2 domain, but no positive selection sites were identified.

Negative Selection Sites in RdRp and VP1 Proteins
In general, negative selection sites may prevent the deterioration of protein function. Therefore, we calculated the number of negative-selection sites in these strains. Many negative selection sites were estimated for the P7-type RdRp protein (205 sites) and P7-type VP1 protein (274 sites). However, a small number of negative selection sites were estimated in the P6-type RdRp protein (3 sites) and P6-type VP1 protein (8 sites) ( Table 3). A few irregularly positioned negative selection sites in the P6-type RdRp and VP1 proteins were identified. Details of these negative selection sites are shown in Supplementary Table S3A,B. In both the RdRp and VP1 proteins, amino acid variations in the negative selection sites were often located in non-overlapping positions.

3D Mapping Relationships between Amino Acid Substitutions and Active Sites of the RdRp Dimer Proteins
To better assess the relationships between amino acid substitutions and the active sites of RdRp proteins, we constructed 3D RdRp dimers and mapped them. Within RdRp, there were few amino acid substitutions in both the P6 and P7 types, but none were observed in the RdRp active sites (aa182, aa242, aa243, aa300, aa309, aa343, and aa344) ( Figure 4A-E).
Pro354Gln, Asn390Thr, and Asn390Asp were predicted in the P7-type VP1 protein. These sites were located in the protruding 2 (P2) domain of the protein (Table S4). These results suggest that the GII.P7-GII.6 strains may receive stronger selection pressure from the host than the GII.P6-GII.6 strains. Amino acid substitutions were also observed in sites other than the P2 domain, but no positive selection sites were identified.

Negative Selection Sites in RdRp and VP1 Proteins
In general, negative selection sites may prevent the deterioration of protein function. Therefore, we calculated the number of negative-selection sites in these strains. Many negative selection sites were estimated for the P7-type RdRp protein (205 sites) and P7-type VP1 protein (274 sites). However, a small number of negative selection sites were estimated in the P6-type RdRp protein (3 sites) and P6-type VP1 protein (8 sites) ( Table 3). A few irregularly positioned negative selection sites in the P6-type RdRp and VP1 proteins were identified. Details of these negative selection sites are shown in Supplementary Table S3A,B. In both the RdRp and VP1 proteins, amino acid variations in the negative selection sites were often located in non-overlapping positions.

3D Mapping Relationships between Amino Acid Substitutions and Active Sites of the RdRp Dimer Proteins
To better assess the relationships between amino acid substitutions and the active sites of RdRp proteins, we constructed 3D RdRp dimers and mapped them. Within RdRp, there were few amino acid substitutions in both the P6 and P7 types, but none were observed in the RdRp active sites (aa182, aa242, aa243, aa300, aa309, aa343, and aa344) (

3D Mapping of the Positive Selection Sites and Conformational Epitopes in the VP1 Trimer Proteins
Furthermore, to better evaluate the locations of positive selection sites and conformational epitopes on the VP1 protein, we constructed and mapped 3D VP1 trimer proteins. First, as shown in Figure 5A-E and Table 3, and Section 3.5, positive selection sites for both P6-and P7-type VP1 proteins were located in the P2 domain. Of these, the positive selection sites in the P7-type VP1 proteins (Pro354Thr, Pro354Ser, Pro354Gln, Asn390Thr, and Asn390Asp) overlapped with some conformational epitopes (Tables 3 and S4), whereas a positive selection site (Lys386His) in P6-type VP1 proteins did not. These results suggest that the positive selection sites in P7-type GII.6 VP1 proteins escaped amino acid mutations. In addition, in both representative strains, amino acid mutations overlapped with the conformational epitope at many sites.

3D Mapping of the Positive Selection Sites and Conformational Epitopes in the VP1 Trimer Proteins
Furthermore, to better evaluate the locations of positive selection sites and conformational epitopes on the VP1 protein, we constructed and mapped 3D VP1 trimer proteins. First, as shown in Figure 5A-E and Table 3, and Section 3.5, positive selection sites for both P6-and P7-type VP1 proteins were located in the P2 domain. Of these, the positive selection sites in the P7-type VP1 proteins (Pro354Thr, Pro354Ser, Pro354Gln, Asn390Thr, and Asn390Asp) overlapped with some conformational epitopes (Tables 3 and S4), whereas a positive selection site (Lys386His) in P6-type VP1 proteins did not. These results suggest that the positive selection sites in P7-type GII.6 VP1 proteins escaped amino acid mutations. In addition, in both representative strains, amino acid mutations overlapped with the conformational epitope at many sites.   Supplementary Table S4.

Discussion
To better understand the evolution of HuNoV GII.6 strains with different RdRp types (P6 and P7), we analyzed both the RdRp region and VP1 gene using various authentic bioinformatics technologies. First, a time-scaled phylogenetic tree showed that a common ancestor of the P6-and P7-type RdRp region emerged approximately 50 years ago and uniquely evolved and formed clusters. A common ancestor of P6-and P7-type GII.6 VP1 gene emerged approximately 110 years ago and formed clusters. The dominant type for  Supplementary Table S4. Table 3. Number of amino acid residues of predicted positive and negative selection sites in HuNoVGII.6.

Discussion
To better understand the evolution of HuNoV GII.6 strains with different RdRp types (P6 and P7), we analyzed both the RdRp region and VP1 gene using various authentic bioinformatics technologies. First, a time-scaled phylogenetic tree showed that a common ancestor of the P6-and P7-type RdRp region emerged approximately 50 years ago and uniquely evolved and formed clusters. A common ancestor of P6-and P7-type GII.6 VP1 gene emerged approximately 110 years ago and formed clusters. The dominant type for both the RdRp region and VP1 gene was P7-type ( Figure 1A,B). Secondly, the evolutionary rates of both the P6-type RdRp region and VP1 gene were faster than those of the P7-type RdRp region and VP1 gene (Table 1). Next, the phylogenetic distances of the P7-type RdRp region and VP1 gene were wider than those of the P6-type RdRp region and VP1 gene. Furthermore, phylodynamic data showed that the RdRp region and VP1 gene population sizes fluctuated after 2000 ( Figure 3A-F). Some positive selection sites in the VP1 proteins were estimated, and these were located in the antigenicity-related P2 domain. Among these, the positive selection sites in the P7-type VP1 protein overlapped with the conformational epitopes ( Figure 5A-D and Table S4). These data imply that the GII.6 VP1 gene and VP1 protein uniquely evolved because of recombination between the P6-and P7-type RdRp regions in the HuNoV GII.P6-GII.6 and GII.P7-GII.6 genomes.
A previous report regarding the evolutionary analyses of the RdRp region of various HuNoV genotypes showed that the P6-and P7-type RdRp region diverged from a common ancestor of other RdRp genotypes, including P18, P15, and P20 [42]. This report also estimated that the divergence year of the P6-and P7-types of the RdRp region was in the 1960s [42]. This may be compatible with the present data (December 1966). Moreover, the topologies of the previous time-scaled evolutionary tree and our tree were similar [42]. Although this and other reports did not show the evolutionary rates of each RdRp genotype, the evolutionary rates of various RdRp genotypes were estimated as 2.52 × 10 −3 s/s/y to 3.12 × 10 −3 s/s/y. The present data are also compatible with the data from a previous report [42]. These results suggested that the P6-and P7-type RdRp regions are genetically related.
Next, the HuNoV RdRp region/RdRp protein may have affected the evolution of the VP1 gene/VP1 protein [22,23]. As shown in Figure 1, the phylogeny of the VP1 gene in GII.P6-GII.6 and GII.P7-GII.6 was clearly divided and evolved uniquely. In contrast, in the present study, the topology of the evolutionary tree of the RdRp region was similar to that of the VP1 gene. Notably, the phylogeny of ORF1 and ORF2 was topologically similar in other genogroups and genotypes [43]. Previous reports have also suggested that recombination between the HuNoV genome ORF1, incorporating the RdRp region and ORF2, incorporating VP1 gene, affects VP1 gene/VP1 protein evolution and HuNoV antigenicity [8, 22,23]. For example, during the 2016/2017 season, recombination between different lineages of the P16-type RdRp region in the GII.P16-GII.2 strains occurred, and the recombinant caused large outbreaks of acute gastroenteritis in various countries [44][45][46]. Moreover, the GII.4 genotype caused a gastroenteritis pandemic between 2006 and 2012 [43,47]. Outbreaks may also be associated with recombination between ORF1 and ORF2 in GII.4 strains [48]. Based on previous and the present results, the prevalence of GII.P7-GII.2 strains was due to the recombination of P6-and P7-type RdRp regions. Moreover, the acquisition of new polymerases in recombinant strains may alter the evolution rate of the VP1 gene [49]. Among Kawasaki 2014 type-detected cases, a norovirus GII.17 variant that was predominant in Hong Kong from 2014 to 2015 was significantly more common than GII.4 in elderly cases. GII.17 VP1 protein evolution is estimated to be faster (by an order of magnitude) than VP1 gene evolution in GII.4 [50]. This suggests that recombination may alter the susceptibility and evolutionary rate of the VP1 protein in GII.17. Although we did not analyse VP1 functional changes pre-and post-recombination between GII.P6 and GII.P7 in this study, a similar effect may have occurred in the GII.P7-GII.6 and GII.P6-GII.6 strains. Furthermore, the evolutionary rates of the VP1 gene combined with P6-and P7-type RdRp regions were estimated as 5.063 × 10 −3 s/s/y and 3.022 × 10 −3 s/s/y, respectively. Previous data estimated the mean rates of various GII.2 genotype strains (GII.1 to GII.22) as 3.21 × 10 −3 to 4.30 × 10 −3 s/s/y [51]. Thus, these values and the present data may be similar [52]. Taken together, these findings provide information on the evolutionary history of these viral strains and suggest that recombination events may have played a pivotal role in their evolution.
We estimated the genetic divergence of the P6-and P7-type RdRp regions and P6-and P7-type VP1 genes in the present strains. First, a larger divergence of P7-type RdRp regions and P7-type VP1 genes was estimated compared to that of the P6-type VP1 gene. In the present study, the number of P6-type strains was relatively small (15 strains), although statistical analyses were performed.
We also analyzed the phylodynamics of the RdRp region and VP1 gene. The results showed that the genome population size of GII.P6-GII.6 increased around 2000-2003, while the genome population size of GII.P7-GII.6 increased after 2010 and peaked around 2014. Previous epidemiological studies conducted in Shanghai, China and East Java, Indonesia showed that the detection of the P7-type in HuNoV cases peaked in 2014-2015, although the sample size was small [53,54]. Moreover, another molecular epidemiological study suggested that GII.6 had a biphasic prevalence between 2000 and 2005 and 2007 and 2010. Thus, the present phylodynamic data may reflect the prevalence of GII.P6-GII.6 and GII.P7-GII.6, although we could not determine the factors underlying these changes.
To evaluate the functional and evolutionary characteristics of the P6-and P7-type RdRp proteins, we constructed 3D dimeric RdRp proteins and mapped them with amino acid substitutions (Figure 4). Several amino acid substitutions were also identified. Previous reports have suggested that some amino acid substitutions are associated with replication efficacy [8,48]. For example, the efficacy of HuNoV genome replication is increased by amino acid substitutions (291Thr or 291Val) in various RdRp proteins [52]. However, no substitutions in the active sites were found in the P6-and P7-type RdRp proteins. Furthermore, both GII.P6-GII.6 and GII.P7-GII.6 RdRp proteins had a relatively small number of amino acid substitutions and no identifiable positive selection site. These results are compatible with previous reports investigating other genogroups (GI) and genotypes [42,55]. This is partly because RdRp, which is not a target of neutralising antibodies, undergoes less selective pressure from host immune systems than VP1.
We also constructed P6-and P7-type 3D trimeric VP1 proteins ( Figure 5). Previous reports have shown that the P2 domain may act not only as a host cell-binding site, but also as a major part of the HuNoV antigen [56,57]. Therefore, amino acid substitutions in this domain may be associated with infectivity and antigenicity [56,57]. Moreover, positively selected sites may function as escape mutations in the host [58]. In the present study, some conformational epitopes were identified in both RdRp-type VP1 proteins. Some of these were located in the P2 domain. Positively selected sites were also identified. Moreover, the amino acid positions of the conformational epitopes and positive selections between the P6and P7-type VP1 proteins were distinct. These results suggested that the P6-and P7-type VP1 proteins have distinct antigenicity, and both may undergo distinct selective pressure from host defence systems (i.e., host immunity). To our knowledge, the present study is the first to analyse the differences in antigenicity between GII.P7-GII.6 and GII.P6-GII.6, although we did not examine this in vitro.
Next, we evaluated negative selection sites for the RdRp and VP1 proteins (Tables 3 and S3). Many negative selection sites in P7-type RdRp (205 sites) and VP1 proteins (274 sites) were estimated, while P6-type RdRp and VP1 proteins were small. In general, negative selection sites play a role in preventing the deterioration of viral protein function [46]. Thus, the present negative selection data may indicate the maintenance of RdRp and VP1 protein function. Furthermore, a small number of negative selection sites in P6-type RdRp and VP1 proteins were estimated. This may be because of the relatively small number of strains used in this study (15 strains).
The present study has some limitations. First, it lacks in vitro and in vivo approaches. Bioinformatics techniques are crucial in molecular evolution analyses, but they do not always reflect actual evolutionary trajectories. Our antigenicity analyses should be validated using in vitro and in vivo approaches in the future. Second, the study included a relatively small number of GII.P6-GII.6 strains. Despite collecting additional HuNoV GII.6 data from each local government public health institution, the number of P6-and P7-type RdRp strains that we collected and analysed was 15 and 126, respectively. Therefore, our results regarding the P6-type may be biased. Considering this limitation, further evolutionary analyses should be conducted after additional P6-type strains are detected.

Conclusions
In this study, to better understand the evolution of the HuNoV GII.P6-GII.6 and GII.P7-GII.6 strains, we performed a detailed analysis of both the RdRp region and VP1 gene in these viruses using various bioinformatics methods. A common ancestor of the P6-and P7-type RdRp region emerged approximately 50 years ago and formed clusters. A common ancestor of the P6-and P7-type VP1 gene emerged approximately 110 years ago. Moreover, both RdRp region and VP1 gene have evolved uniquely. The evolutionary rates of the P6-type RdRp region and P6-type VP1 gene were faster than the evolutionary rates of the P7-type RdRp region and VP1 genes. More genetic divergence was observed in the P7-type RdRp region and VP1 gene than in the P6-type RdRp region and VP1 gene. The phylodynamics of the RdRp region and VP1 fluctuated after 2000. Some positive selection sites in the VP1 proteins were located in the antigenicity-related P2 domain, and these sites in the P7-type VP1 protein overlapped with the conformational epitopes. Taken together, the GII.6 VP1 and VP1 proteins evolved uniquely due to recombination between the P6and P7-type RdRp regions in the HuNoV GII.P6-GII.6 and GII.P7-GII.6 strains.
Supplementary Materials: The following supporting information can be downloaded from: https: //www.mdpi.com/article/10.3390/v15071497/s1, Table S1. List of GII.P6-GII.6 and GII.P7-GII.6 strains used in this study. Table S2. Parameters used in Bayesian Markov chain Monte Carlo (MCMC) analyses. Table S3a. Details of P6-and P7-type RdRp region negative selection sites in comparing with each genotype. Table S3b. Details of GII.P6-GII.6 and GII.P7-GII.6 VP1 gene negative selection sites in comparison with each genotype. Table S4. Detailed amino acid sequences of the VP1 gene for representative and prototype strains of each cluster used in this study.  The funders played no role in the study design, data collection, analysis, decision to publish, or manuscript preparation.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement:
The datasets generated and/or analyzed in the present study are available from the corresponding author upon reasonable request.