Diversity of Noroviruses throughout Outbreaks in Germany 2018

Human norovirus accounts for the majority of viral gastroenteritis cases worldwide. It is a fast-evolving virus that generates diversity via mutation and recombination, so new variants and new recombinant strains continually emerge in the norovirus population. We characterized norovirus-positive stool samples from one intensively studied district, Märkisch-Oderland in the state of Brandenburg, and compared them with samples from other states of Germany in order to understand the molecular epidemiological dynamics of norovirus outbreaks in Germany in 2018. PCR systems, Sanger sequencing, and phylogenetic analyses were used for genotyping. Noroviruses from 250 outbreaks in Germany were genotyped, including 39 outbreaks from the district Märkisch-Oderland. Viral diversity in Märkisch-Oderland was similar, but not identical, to that in Germany as a whole. The predominant genogroup in Germany was GII, with GII.P16-GII.4 Sydney as the predominant genotype, whereas GII.P31-GII.4 Sydney was the most frequent genotype in Märkisch-Oderland. Genogroup I viruses were detected less frequently, both regionally and nationally. Within the sequences of GII.4 recombinants, two distinct clusters containing outbreaks from Märkisch-Oderland were identified. Further analysis of sequence data and detailed epidemiological data are needed in order to understand the link between outbreaks in such clusters. Molecular surveillance should be based on samples collected nationally in order to trace virus distribution and recombination events in the virus population comprehensively.


Introduction
Noroviruses were estimated to be the cause of 18% of acute gastroenteritis (AGE) cases worldwide [1], affecting people of all ages. In Germany and other industrialized countries, norovirus outbreaks are frequently associated with healthcare or childcare settings and occur predominantly in the winter months from November to April.
Human noroviruses are non-enveloped viruses with a positive-sense, single-stranded RNA genome of approximately 7.5 kb. The genome is organized in three open reading frames (ORFs). ORF1 encodes the nonstructural proteins, whereas ORF2 and ORF3 encode the major and minor viral capsid proteins, respectively.
Human noroviruses are divided according to their genetic diversity into five genogroups: GI, GII, GIV, GVIII, and GIX. Recently, 18 GI and 49 GII polymerase types were proposed together with nine GI VP1 types and 29 GII VP1 variants [2]. Recombination is a common evolutionary strategy for noroviruses; it occurs most often at the junction of ORF1 and ORF2 and can affect viral fitness and lead to predominance in the viral population. Point mutations within the genome, particularly within the region encoding the capsid proteins (VP1 and VP2), permit the virus to escape the host immune system [3]. The capsid consists of the VP1 protein, which contains the shell (S) domain and the protruding (P) domain. The P domain of VP1 is further subdivided into the P1 and P2 domains. The P2 region is the most variable and exposed region of the VP1 protein and contains antigenic epitopes [4][5][6][7][8].
For better understanding of the molecular epidemiological dynamics of norovirus outbreaks in Germany, in particular outbreaks of GII.P16 variants, we chose one district in the federal state of Brandenburg in order to trace a large number of outbreaks in 2018. We were interested in the distribution of norovirus genotypes, the detection of new and rare emerging variants, and the epidemiology of the outbreaks.

Ethics Statement
Surveillance data and data of molecular diagnostics were collected on the basis of the German Infection Protection Act. Thus, a review by an ethics committee was not required.

Sample Collection
In this study, human stool samples were collected from norovirus outbreaks in the district of Märkisch-Oderland in the federal state of Brandenburg, Germany, in 2018. Stool samples were sent by local health authorities and diagnostic laboratories for genotyping purposes. In total, 66 samples from the district were sent to the consultant laboratory for norovirus at the Robert Koch Institute (RKI) for analysis.

PCR and Sequence Analysis
All of the stool samples were diluted in PBS (1:10) and spiked with an internal extraction and PCR control (MS2 phage). RNA was extracted from 140 µL of stool suspension on the QIAcube device with the QIAamp Viral RNA Mini Kit (Qiagen, Hilden, Germany) and eluted in a total volume of 60 µL.
For the first screening, RT-qPCR was conducted as described previously [13], with improved primer and probe concentrations yielding a modified triplex RT-qPCR (Table 1). For detection, 10 µL mastermix containing the SuperScript™ III Platinum One-Step qRT-PCR System (Invitrogen, Karlsruhe, Germany) and 2 µL extracted viral RNA were used. The internal extraction control was adjusted to a crossing point (Cp) value of 28 ± 2. All of the norovirus-specific primers were used at a final concentration of 416 nM, except for the NV192 + NV192a mix, which was adjusted to a final concentration of 208 nM. The genogroup I specific probe NV-TM9-MGB was used at a final concentration of 75 nM, and the genogroup II specific probe NV-TM15-MGB at 150 nM. Primers and probe for the internal control MS2 were published previously [14]; the primers were used at a final concentration of 208 nM and the MS2-specific probe at 83 nM. The RT-qPCR was performed on the LC480 system (Roche, Mannheim, Germany) with the following cycling conditions: 15 min at 50 °C, 2 min at 95 °C, followed by 45 cycles of 15 s at 95 °C and 32 s at 60 °C. Plasmid DNA (2 × 10^2 to 2 × 10^6 copies/reaction) was used for semi-quantitative detection, together with RNA of positive controls for GI and GII and RNA of MS2 phage as a positive control for the internal control (Roche, Mannheim, Germany) [14].
In the second step, the viruses were genotyped using two RT-PCR systems. To obtain sequence information on the RNA-dependent RNA polymerase (RdRp, ORF1), a previously described RT-PCR [15,16] was adapted, amplifying 331 bp in a nested PCR generic for genogroups GI and GII. The PCR products were analyzed by Sanger sequencing, and phylogenetic analyses were performed with Geneious 11.1.5 and MEGA7 [17]. The alignment was calculated with the MAFFT algorithm in Geneious.
In MEGA, the best-fit substitution model was selected as the one with the lowest Bayesian information criterion (BIC) score, and a maximum-likelihood (ML) tree was inferred. The reliability of the branching pattern was tested by bootstrapping (1000 replicates). Distances of evolutionary divergence were calculated with MEGA7. The P2 GII.4 tree (Figure 5) was calculated with IQ-TREE, version 1.5.3, and further processed with iTOL (https://itol.embl.de/). GenBank accession numbers for sequences from Märkisch-Oderland were as follows: MT584847-MT584850 for GI sequences, and MT734074-MT734105 and MT745916-MT745950 for GII sequences.
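The model selection step can be illustrated with a minimal sketch in Python, assuming the standard definition BIC = -2 ln L + k ln n, where ln L is the maximized log-likelihood, k the number of free parameters, and n the number of alignment sites. The function names and the candidate values below are hypothetical and are not taken from this study; MEGA's own parameter counting (which includes branch lengths) may differ in detail.

```python
import math


def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion: -2 ln L + k ln n (lower is better)."""
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    return -2.0 * log_likelihood + n_params * math.log(n_sites)


def best_model(candidates, n_sites):
    """Return (name of lowest-BIC model, dict of all BIC scores).

    candidates maps a model name to (log_likelihood, n_params).
    """
    scores = {
        name: bic(lnl, k, n_sites) for name, (lnl, k) in candidates.items()
    }
    return min(scores, key=scores.get), scores


# Invented example values for illustration only (not results of the study).
example_candidates = {
    "JC": (-2000.0, 10),
    "GTR+G": (-1990.0, 19),
}
chosen, all_scores = best_model(example_candidates, n_sites=600)
print(chosen, all_scores)
```

In this toy case the simpler model is chosen because the gain in likelihood of the richer model does not outweigh the penalty for its additional parameters.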

Surveillance Data
Symptomatic norovirus infections with laboratory confirmation have been notifiable in Germany since 2001. The detection of viral RNA by RT-PCR or detection of norovirus antigen is reported to the local public health department by the identifying laboratory. The health department completes and verifies case information according to the national surveillance case definition. Case data are anonymized and electronically transmitted to the state health department and, from there to the RKI, the national public health institute in Germany. Cases of disease without laboratory confirmation, but with an epidemiological link (occurring e.g., in outbreaks in care homes), are notifiable to the local health departments, but this case information is not passed on to the state or federal level. Thus, only laboratory confirmed cases were included in this analysis.

Epidemiology
In 2018, 77,583 laboratory-confirmed cases of norovirus gastroenteritis were reported to the RKI, corresponding to an incidence of 93 cases per 100,000 population. In the state of Brandenburg and the district of Märkisch-Oderland, the incidence was higher than the national average, with 147 and 139 cases per 100,000 population, respectively (Figure 1 shows the location of the district Märkisch-Oderland within Germany). The age-specific incidence of laboratory-confirmed norovirus disease in the district of Märkisch-Oderland was highest in children under two years and was also above the mean in the elderly population (Figure 3).
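The incidence figures above follow from a simple calculation, sketched here in Python. The population of roughly 83 million is an assumption used for illustration (it is not stated in the text), and the function names are hypothetical.

```python
def incidence_per_100k(cases, population):
    """Reported cases per 100,000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * 100_000


def rate_ratio(incidence_a, incidence_b):
    """Ratio of two incidences, e.g. regional versus national."""
    return incidence_a / incidence_b


# Assumed national population of about 83 million (not from the source text).
national = incidence_per_100k(77_583, 83_000_000)
print(round(national))

# Regional incidences relative to the national figure of 93 per 100,000.
print(rate_ratio(147, 93), rate_ratio(139, 93))
```

Expressed as ratios, the reported incidences in Brandenburg and Märkisch-Oderland were about one and a half times the national value.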

Correlation of Norovirus GII.4 Surveillance Data and Phylogenetic Analyses in Märkisch-Oderland (Brandenburg) and Germany in 2018
In order to obtain detailed information about the correlation between sequence data and epidemiological information of samples collected in Märkisch-Oderland, phylogenetic analysis was performed and clusters of sequences were identified (Supplementary Materials: Figure S2). In order to determine whether clusters were limited to sequences from Märkisch-Oderland or were linked to other German GII.4 sequences, 83 randomly selected GII.4 sequences from different federal states of Germany were included in the phylogenetic analysis (Figure 4). Overall, the number of nucleotide differences per sequence, averaged over all sequence pairs, was 41.86 over >600 bases per sequence. The clustering of sequences in the tree was heterogeneous (Figure 4). No clear clustering of sequences from the same federal state and no time-dependent clustering of sequences was observed. Small clusters of two sequences each were found from Berlin, Bavaria, Baden-Württemberg, and Saxony-Anhalt. All of the analyzed GII.4 capsid sequences were linked with three polymerase types (P31, P16, and P4 2009), and the capsid sequences clustered according to the corresponding polymerase type. On the right side of the tree, the outbreak location and the sampling date in Märkisch-Oderland are given (blue label).
Regarding sequences from Märkisch-Oderland, there were two main clusters in the phylogenetic tree. One cluster contained only sequences from Märkisch-Oderland, with 4.30 base differences per sequence averaged over all sequence pairs. This cluster comprised five outbreaks in the period from 16.10.2018 to 28.11.2018, which occurred in schools, a nursery school, an after-school care club, and one affected household. In total, 68 infected persons were reported in this cluster (four to 23 patients per outbreak). The polymerase type linked to these capsid sequences was GII.P31.
In the second cluster, sequences from the district Märkisch-Oderland were mixed with sequences from different federal states (Saxony-Anhalt, Baden-Württemberg, Lower Saxony, and North Rhine-Westphalia). Outbreaks in this cluster began in January (2.1.2018) and ended in September (27.9.2018); outbreaks in Märkisch-Oderland were identified from 15.1.2018 to 27.9.2018. Regarding the polymerase type linked to the capsid sequences in this cluster, two sequences from Baden-Württemberg (GER 18-G0386, GER 18-G0246) and one sequence from Saxony-Anhalt (GER 18-G0457) represented genotype GII.P31, whereas the polymerase type of the sample from Lower Saxony (GER 18-G0945) was not determined; all of the sequences from Märkisch-Oderland belonged to GII.P31. The base differences per sequence, averaged over all sequence pairs within this cluster, were 4.38. The epidemiological data from the district Märkisch-Oderland showed that a total of 149 patients were affected in this cluster, with six to 38 patients reported per outbreak. Two further GII.4 Sydney sequences from Märkisch-Oderland, representing one outbreak with 57 patients and another outbreak with seven patients, clustered with several GII.P16-GII.4 Sydney sequences from other federal states of Germany. Both of these sequences from Märkisch-Oderland were linked with the polymerase type GII.P16.
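The statistic reported for the whole tree and for each cluster (base differences per sequence, averaged over all sequence pairs) can be sketched in Python as follows. This is a minimal illustration that assumes pre-aligned sequences of equal length and skips positions with gaps or ambiguous bases pair by pair; the function names and the toy sequences are hypothetical, and MEGA7 offers further gap-handling options that are not reproduced here.

```python
from itertools import combinations


def base_differences(seq_a, seq_b, skip_chars="-N?"):
    """Count differing positions between two aligned sequences.

    Positions where either sequence has a gap or ambiguous base are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in skip_chars and b not in skip_chars
    )


def mean_pairwise_differences(sequences):
    """Average number of base differences over all sequence pairs."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("at least two sequences are required")
    return sum(base_differences(a, b) for a, b in pairs) / len(pairs)


# Toy alignment for illustration only (not sequences from the study).
toy_alignment = ["ACGT", "ACGA", "TCGA"]
print(mean_pairwise_differences(toy_alignment))
```

Applied to a cluster, a low value such as the reported 4.30 or 4.38 indicates closely related sequences, compared with 41.86 across all analyzed German GII.4 sequences.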

Amino Acid Changes in GII.P16 Variants
Previous studies showed that the currently circulating GII.P16 variants exhibit amino acid changes in the polymerase. Five mutations were found in different studies: D173E, S293T, V332I, K357Q, and T360A [8,11,18]. In this study, fourteen RdRp sequences (234 bp) from the ORF1 region were analyzed; the amino acid changes S293T and V332I were found in all sequences (Figure 5).
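How substitutions in the notation used above (reference residue, position, observed residue) can be read off an alignment is sketched below in Python. The sketch assumes that the reference and query protein fragments are already aligned and that the position of the first aligned residue in the full-length protein is known; the function name and the four-residue toy fragments are hypothetical and are not the sequences analyzed in this study.

```python
def call_substitutions(reference, query, first_position=1, skip_chars="-X"):
    """List amino acid substitutions between two aligned protein fragments.

    first_position is the protein coordinate of the first aligned residue.
    Gaps and unknown residues are skipped.
    """
    if len(reference) != len(query):
        raise ValueError("fragments must be aligned to the same length")
    substitutions = []
    for offset, (ref_aa, qry_aa) in enumerate(zip(reference, query)):
        if ref_aa in skip_chars or qry_aa in skip_chars:
            continue
        if ref_aa != qry_aa:
            substitutions.append(f"{ref_aa}{first_position + offset}{qry_aa}")
    return substitutions


# Toy fragments for illustration: the second residue sits at position 293.
print(call_substitutions("MSVK", "MTVK", first_position=292))
```

Because the sequenced RdRp fragment covers only part of the polymerase, such a comparison can only detect changes at positions that lie inside the fragment.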

Discussion
It is well known that different norovirus genotypes circulate at the same time in the same region. We traced 39 outbreaks in the district Märkisch-Oderland to study the predominant genotypes and the circulation of norovirus variants, in order to understand whether typing of outbreaks in one region of Germany may be representative of the whole country.
In 2018, the diversity of genogroup II genotypes in Germany was larger than the diversity of genogroup I genotypes. The recombinant strain GII.P16-GII.4 Sydney was the most common genotype in Germany, followed by GII.P31-GII.4 Sydney. The predominant genotype in the district Märkisch-Oderland, Brandenburg, was GII.P31-GII.4 Sydney, followed by GII.P16-GII.2. The diversity of GI genotypes found in Märkisch-Oderland was smaller than in Germany. Consequently, we conclude that screening for noroviruses in a small district like Märkisch-Oderland is not suitable for obtaining representative data for Germany, although the epidemiological data (incidence, age of patients) are similar between Germany and Märkisch-Oderland. National surveillance activities are better suited to tracing new genotypes in the virus population and identifying recombination events quickly.
In [19,20]. Ruis et al., 2017 [18] proposed that this lineage had circulated in the UK and the USA since October 2014 and detected it in stool samples that were collected between June 2015 and April 2016.
Because GII.P16-GII.4 Sydney was the predominant genotype in Germany, it was unexpected that GII.P31-GII.4 Sydney was the predominant genotype in the district Märkisch-Oderland, followed by GII.P16-GII.2. In the US, GII.P31-GII.4 Sydney predominated in 2012 but was replaced by GII.P16-GII.4 Sydney in 2015 [11]. The predominance of the genotype GII.P31-GII.4 Sydney was also reported from other countries, such as China [21,22] and Brazil [23]. It was assumed that GII.P16-GII.4 viruses have better viral fitness than GII.P16-GII.2 viruses [7]. For a better understanding of norovirus diversity and the predominance of GII genotypes, genotypes have been divided, according to the VP1 region, into two patterns of evolution: evolving strains and static viruses [24]. GII.4 viruses represented an evolving pattern with a short time span of 5.3 years between distinct variants within each genotype [24]. GII.2 viruses were described as static viruses. Following this hypothesis, the combination of the newly emerged GII.P16 variants with the fast-evolving GII.4 capsid might increase the fitness of genotype GII.P16-GII.4 Sydney. This was supported by the predominance of this recombinant virus in Germany in 2018, but not in Märkisch-Oderland.
Amino acid substitutions were analyzed in different studies in order to understand the successful emergence of GII.P16 recombinants. The novel GII.P16 recombinants in the US had no unique amino acid substitution in the VP1 region in comparison to the formerly circulating strains [11]. Nevertheless, several amino acid changes were found in the non-structural proteins, namely in the NS1/2 proteins, the NS4 protein, and the RdRp [11], indicating that the emergence derives from a complex interaction of mutations in different genome regions. Different amino acid changes in the RdRp of GII.P16-GII.4 Sydney viruses were observed when compared to former GII.P16 strains: D173E, S293T, V332I, K357Q, and T360A [7,11,25]. In the present study, two of these mutations, S293T and V332I, were detected in all of the investigated GII.P16 strains. Tohma et al., 2017 [25] suggested that these mutations in the RdRp might influence the kinetics or the fidelity of the enzyme. It has been suggested that the mutation rate in combination with a high replication rate may represent the key factor in epidemiological fitness [26].
The P2 region, which encodes part of the protruding domain of the capsid protein, is the most variable region of the viral genome. This region contains the binding sites for human histo-blood group antigen (HBGA) carbohydrates and neutralizing antibodies. It is known that cross-protection among the different norovirus genotypes is limited, and recurrent infections in children were seen in a study from Japan [27]. It has been shown for GII.4 viruses that amino acid changes in epitopes of the VP1 region have resulted in global epidemics for decades. Antigenic drift variants of GII.4 persist for years until they are replaced by a new variant in chronological succession [8], similar to influenza A viruses (reviewed by G.I. Parra [28]). Analysis of specific antigenic sites within the capsid protein that interact with human antibodies has given important insights into how genotype GII.4 strains escape from herd immunity and drive pandemic outbreaks. Different antigenic sites (A-G) in the P domain were identified for GII.4 variants; three of them (A, C, and G) were proposed to be drivers of selection and of the emergence of new GII.4 variants [8].
The phylogenetic analysis of the P2 region of genotype GII.4 sequences from the district Märkisch-Oderland and randomly chosen sequences from other German states revealed that the sequences from Märkisch-Oderland clustered mostly in two groups. We detected only about four nucleotide differences among the sequences forming these clusters. Across all analyzed German sequences, no clear clustering of numerous sequences according to the region in which the samples were collected was seen. Regarding the number of nucleotide differences in all German sequences, the difference per sequence averaged over all sequences was 30.86 (0 to 60 base differences). The question is why sequences from Märkisch-Oderland clustered so closely together. Did we see different norovirus outbreaks, or was there a link between these outbreaks? It is known that norovirus outbreaks can be linked via asymptomatic patients. A meta-analysis of the global distribution of asymptomatic norovirus infections revealed that the prevalence rate in Europe was 18% (95% CI: 10-30%) when asymptomatic individuals were exposed under outbreak circumstances [29]. It has been discussed that GII.4 variants might originate from immunocompetent humans by inter-host transmission and the accumulation of mutations during the intra-variant periods [10]. Unfortunately, we did not have any further epidemiological information to track infection chains and find hints of asymptomatic patients.

Conclusions
Norovirus diversity in Germany is very dynamic and is dominated by a few genotypes of genogroup II; the co-circulation of additional genotypes extends this high diversity. The virus diversity of a small district like Märkisch-Oderland reflects the diversity in Germany to some degree, but cannot cover it completely. To assess trends in genotype distribution and facilitate the timely detection of recombination events in a country, molecular surveillance of norovirus strains should be based on samples from many parts of the country.