Relative Risk Prediction of Norovirus Incidence under Climate Change in Korea

As incidences of food poisoning, especially norovirus-induced diarrhea, are associated with climate change, there is a need for an approach that can predict the risks of such illnesses with high accuracy. In this paper, we predict the winter norovirus incidence rate in Korea relative to that of other diarrhea-causing viruses using a model that adds B-spline bases to logistic regression to estimate the long-term pattern of illness. We also develop a risk index based on the estimated probability of occurrence. Our probabilistic analysis shows that the risk of norovirus-related food poisoning in winter will remain stable or increase in Korea under various Representative Concentration Pathway (RCP) scenarios. Our approach can be used to obtain an overview of the changes occurring in regional and seasonal norovirus patterns, which can support appropriate policy decisions.


Introduction
As the reproduction of microorganisms is substantially affected by weather-related factors such as temperature and humidity, climate change is extremely likely to change the seasonal patterns of food poisoning incidents caused by microorganisms. Therefore, as climate conditions continue to change, adjustments to food safety policies are expected to become necessary, and scientific methodologies that can accurately predict long-term fluctuations in food poisoning patterns must be developed to help administrators proactively devise reasonable policies in response to the effects of climate change. Most advanced countries already have access to vast amounts of weather and food poisoning-related clinical data; the key challenge is deriving useful information from these data. According to a 2016 study, more than 40 million disability-adjusted life years were lost in children under the age of five due to cases of diarrhea and related deaths [1]. Although the incidence of diarrhea may be reduced by economic development, some reports have found that climate change damages urban infrastructure and reduces overall water availability [2]. The spread of diarrhea is complex and depends on many factors. Infectious diarrhea can be caused by a variety of pathogens, and it is affected by both host susceptibility and environmental components. Therefore, there is an urgent need for research that quantifies the impact of diarrheal illness as well as for improved predictive estimates [3][4][5][6]. This paper considers norovirus specifically for two main reasons. First, variations in the incidence pattern of norovirus are clinically significant. As diagnostic technology continues to advance, norovirus has been increasingly identified as a major cause of food poisoning [7]. The World Health Organization (WHO) reports that norovirus is one of the top five causes of death from food poisoning [8].
Life 2021, 11, 1332
More than 200,000 people die from norovirus every year, including 70,000 children in developing countries [9]. In the United States, norovirus causes 23 million cases of acute gastroenteritis (AGE) every year [10]. According to the Centers for Disease Control and Prevention (CDC), norovirus causes annual economic losses of more than $2 billion in the United States alone. In other words, norovirus continues to have a negative impact on both human life and the economy worldwide. Second, norovirus has a distinct seasonal pattern, with the frequency of cases rising in the winter [11]. Because of this, it is also called "the winter vomiting disease" [12,13]. New norovirus variants often emerge on cruise ships in the winter and early spring. In the United States, there were particularly high numbers of deaths from norovirus in the winter months of both 2002-2003 and 2006-2007. These seasonal patterns in disease prevalence make it easier to analyze the relationship between weather-related factors and food poisoning using probabilistic methods. The main sources of norovirus infection are contaminated food and human contact; however, many papers examining the relationship between norovirus incidence and climate support the premise of this paper. Previous studies have shown that norovirus incidence is negatively correlated with mean temperature [14] and that it is a wintertime phenomenon, at least in the temperate northern hemisphere [15]. Another study reviewing norovirus incidence over the previous decade showed differences in case numbers between influenza and norovirus infections, with norovirus still showing strong incidence in certain high population density areas [16]. A recent study examined temperature and relaxation rates among patients with diarrhea under climate change scenarios in Japan. That study estimated the future probability of the disease based on a simple comparison by region [17].
The above study explains the importance of temperature-related future changes in diarrheal patients. To simplify the model and isolate temperature-related effects, the current study focused on average daily temperature among the various weather-related factors. In this paper, we also attempted to predict the variations in norovirus incidence that will occur as the temperature continues to rise in winter due to climate change.
We introduce an index that we developed to quantify the incidence rate of norovirus-induced diarrhea according to the Representative Concentration Pathway (RCP) scenarios that describe climate change, and we used this index to calculate the future risk of disease due to climate change. In this analysis, a generalized additive linear model (GALM) using B-splines was used to compensate for the fact that, in simple logistic regression, the probability can only increase or only decrease with the explanatory variable. In addition, the risk assessment was made more objective by using the relative risk index (RRI) to represent risk.

RCP Scenarios
In order to establish an effective response and adaptation plan for the risk factors that will appear due to climate change, the Intergovernmental Panel on Climate Change (IPCC) introduced a set of climate change scenarios. The RCP scenarios consider both recent observations and greenhouse gas reduction technologies. The IPCC has recommended that researchers use the RCP scenarios as basic data for studies that involve climate change impact evaluations; in particular, four scenarios have been introduced: RCP2.6, RCP4.5, RCP6.0, and RCP8.5. The numbers in the scenario names indicate the increasing amounts of net radiative forcing (in W/m²) reached by the year 2100 [18].
The Korea Meteorological Administration (KMA) has contributed to the RCP scenarios (http://www.climate.go.kr (accessed on 5 August 2021)). The KMA developed RCP scenarios using five regional climate models (HadGEM3-RA, RegCM4, SNURCM, GRIMs, WRF) [19][20][21][22][23]. As climate models contain uncertainty from several sources, e.g., imprecise initial conditions, statistical refinement techniques and bias correction methods are applied to the model output before the data are provided to end users [24]. From these five models, the KMA created two model ensembles. The first ensemble is MME4s, based on data from HadGEM3-RA, RegCM4, SNURCM, and WRF. The second is MME5s, based on data from HadGEM3-RA, RegCM4, SNURCM, GRIMs, and WRF. Climate projections until the year 2100 are evaluated under RCP2.6 and 6.0 for MME4s, and under RCP4.5 and 8.5 for MME5s. The latitude and longitude of the RCP scenario

Data Descriptions
From 2005 to 2018, the total number of patients in Korea reported to have diarrhea was 283,651. Of these cases, 24,642 were caused by norovirus, representing a rate of 8.69% and making norovirus a very significant cause of diarrhea. These clinical data come from the Korea Centers for Disease Control and Prevention Portal (http://www.kdca.go.kr/ (accessed on 3 September 2020)). Table 1 summarizes the known norovirus outbreaks by gender and age and the rate of norovirus diarrhea cases compared to diarrhea cases from all causes. The incidence rates of norovirus do not significantly differ by gender, with the rates in males and females being 8.82% and 8.53%, respectively. In contrast, the effect of age is clearly significant. In particular, the incidence rates of norovirus in the age groups of 0-5, 6-15, 16-59, and over 59 are 13%, 8.74%, 4.79%, and 3.37%, respectively. This shows that very young people are significantly more vulnerable to norovirus than those in other age groups. When dividing Korea into northern, central, and southern regions, there were slight differences in the rates of norovirus outbreaks between regions, with rates of 10.14%, 6.05%, and 9.26%, respectively. In the southern part of the country along the coast, diet and temperature are expected to affect the risk of norovirus outbreaks. The northern part of the country also has a somewhat high risk as it includes Seoul, which is particularly densely populated, leading to higher disease risk. The results of the logistic regression analysis used to calculate the odds ratio by gender, age, and region for norovirus occurrence are shown in Table 2. The odds ratio for a female (vs. male) is 0.101, which is not significant. Age had a significant effect: compared to those aged 0-5, the odds ratios were 0.646, 0.350, and 0.245 for individuals aged 6-15, 16-59, and 60 and older, respectively.
This indicates that children aged 0-5 are substantially more vulnerable to norovirus infection than those in other age groups. In addition, the odds ratios for the central and southern regions of the country relative to the northern region were significantly different, at 0.777 and 1.062, respectively. Figure 1 shows the regions described in Table 2 on a map for better understanding. We used R 4.2.1 for all data analysis [25][26][27][28][29][30].
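As a rough plausibility check on the age effect, the following sketch computes crude (unadjusted) odds ratios directly from the age-group rates quoted in this paper (13%, 8.74%, 4.79%, and 3.37% for ages 0-5, 6-15, 16-59, and 60 and older). The function names are illustrative assumptions, and the crude values are not expected to match Table 2 exactly, since the reported odds ratios come from a regression that also includes gender and region.

```python
def odds(p: float) -> float:
    """Convert a proportion into odds p / (1 - p)."""
    return p / (1.0 - p)


def crude_odds_ratio(p_group: float, p_reference: float) -> float:
    """Unadjusted odds ratio of a group relative to a reference group."""
    return odds(p_group) / odds(p_reference)


# Share of diarrhea patients with norovirus, by age group (from the text).
rates = {"0-5": 0.1300, "6-15": 0.0874, "16-59": 0.0479, "60+": 0.0337}

# Reference group is ages 0-5, as in the regression described above.
crude = {
    group: crude_odds_ratio(p, rates["0-5"])
    for group, p in rates.items()
    if group != "0-5"
}

for group, value in crude.items():
    print(f"{group} vs 0-5: crude OR = {value:.3f}")
```

The crude ratios come out at roughly 0.64, 0.34, and 0.23, which is close to the adjusted values of 0.646, 0.350, and 0.245 reported for Table 2.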

Methods
First, the relationship between weather-related factors and the occurrence of food poisoning caused by norovirus was analyzed stochastically. Based on the results of that analysis, we calculated how the seasonal pattern of norovirus incidence is expected to change in the future due to climate change. In Korea, incidences of diarrhea are reported daily by region and cause. We used average daily temperature data from the Meteorological Data Open Portal (https://data.kma.go.kr/ (accessed on 5 August 2021)). The number of diarrhea patients with norovirus may vary as winter temperatures rise rapidly in accordance with the climate projections under the considered RCP scenarios.
In addition, a generalized additive model (GAM) was employed to express the additive model of the functions including these bases. The logistic model using GAM helps us to predict the probability of the occurrence of norovirus according to the average temperature. In the GAMs that were used, the linear component $\sum_{j} \beta_j x_{ij}$ of the model was replaced with $\sum_{j} \beta_j f_j(x_{ij})$ [31][32][33][34]. $\pi(y_i \mid x_{i1}, x_{i2}, \cdots, x_{ip})$ is used to denote the probability that norovirus cases appear given the explanatory variables $x_{i1}, x_{i2}, \cdots, x_{ip}$. The GALM assumes that

$$\log \frac{\pi(y_i \mid x_{i1}, \cdots, x_{ip})}{1 - \pi(y_i \mid x_{i1}, \cdots, x_{ip})} = \beta_0 + \sum_{j=1}^{p} \beta_j f_j(x_{ij}),$$

where the functions $f_1, f_2, \cdots, f_p$ are smooth bases that can be estimated in various forms, such as B-splines, cubic splines, and natural cubic splines. We used recursive fourth-order B-spline bases supported on the mean temperature, as follows [34][35][36][37][38]. Let $U$ be a set of $m + 1$ non-decreasing numbers, $u_0 \le u_1 \le \cdots \le u_m$. The B-spline basis functions are then defined recursively by

$$N_{i,0}(u) = \begin{cases} 1, & u_i \le u < u_{i+1}, \\ 0, & \text{otherwise}, \end{cases} \qquad N_{i,p}(u) = \frac{u - u_i}{u_{i+p} - u_i} N_{i,p-1}(u) + \frac{u_{i+p+1} - u}{u_{i+p+1} - u_{i+1}} N_{i+1,p-1}(u).$$

We create a logistic regression model using the previously generated $N_{i,p}$ as $f_p(x_{ip})$ and the presence or absence of norovirus as the dependent variable. A backfitting algorithm is typically used for estimation when the splines are complex. However, in this paper, the splines are calculated first and then used to estimate the logistic regression, because they can be calculated with relative ease. We fit the binary logistic regression model using iteratively reweighted least squares (IRLS), which is equivalent to maximizing the log-likelihood of a Bernoulli distributed process,

$$\ell(\beta) = \sum_{i} \left[ y_i \log \pi_i + (1 - y_i) \log (1 - \pi_i) \right],$$

using Newton's method, which gives the iterative algorithm

$$\beta^{(t+1)} = \left( X^{\top} W^{(t)} X \right)^{-1} X^{\top} W^{(t)} z^{(t)}, \qquad z^{(t)} = X \beta^{(t)} + \left( W^{(t)} \right)^{-1} \left( y - \pi^{(t)} \right),$$

where $X$ is the matrix of basis values and $W^{(t)} = \mathrm{diag}\left( \pi_i^{(t)} (1 - \pi_i^{(t)}) \right)$.

The average temperature that maximizes the probability in the GALM is defined as the maximum rate temperature, and the temperature intervals over which the probability is at least 0.8 and 0.9 times the maximum rate are defined as the risk interval and the high-risk interval, respectively.
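The fitting procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the code used in the paper: the data are simulated, "fourth-order" is read here as degree 3, the knot positions (every 5 °C between −20 °C and 30 °C, clamped at both ends) are arbitrary, and the names `bspline_basis` and `fit_logistic_irls` are hypothetical. No separate intercept is fitted, because clamped B-spline bases already sum to one.

```python
import numpy as np


def bspline_basis(x, knots, degree):
    """Design matrix of B-spline basis functions via the Cox-de Boor recursion."""
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    m = len(knots) - 1
    # Degree 0: indicators of the half-open knot spans [u_i, u_{i+1}).
    basis = np.zeros((x.size, m))
    for i in range(m):
        basis[:, i] = (knots[i] <= x) & (x < knots[i + 1])
    # Assign x equal to the last knot to the last non-empty span.
    last_span = np.nonzero(np.diff(knots) > 0)[0][-1]
    basis[x == knots[-1], last_span] = 1.0
    for p in range(1, degree + 1):
        nxt = np.zeros((x.size, m - p))
        for i in range(m - p):
            left = knots[i + p] - knots[i]
            right = knots[i + p + 1] - knots[i + 1]
            if left > 0:  # skip zero-width terms from repeated knots
                nxt[:, i] += (x - knots[i]) / left * basis[:, i]
            if right > 0:
                nxt[:, i] += (knots[i + p + 1] - x) / right * basis[:, i + 1]
        basis = nxt
    return basis


def fit_logistic_irls(X, y, n_iter=50, tol=1e-8, ridge=1e-8):
    """Logistic regression by iteratively reweighted least squares (Newton steps)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        pi = 1.0 / (1.0 + np.exp(-eta))
        w = np.clip(pi * (1.0 - pi), 1e-10, None)   # diagonal of W
        z = eta + (y - pi) / w                      # working response
        xtw = X.T * w                               # X^T W
        # Tiny ridge term only guards against a singular system.
        new_beta = np.linalg.solve(xtw @ X + ridge * np.eye(X.shape[1]), xtw @ z)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta


# Simulated data (assumption): norovirus probability peaks near -2 degrees C.
rng = np.random.default_rng(0)
temp = rng.uniform(-20.0, 30.0, size=40_000)
true_p = 0.03 + 0.20 * np.exp(-(((temp + 2.0) / 8.0) ** 2))
y = (rng.uniform(size=temp.size) < true_p).astype(float)

knots = np.concatenate(([-20.0] * 3, np.arange(-20.0, 31.0, 5.0), [30.0] * 3))
X = bspline_basis(temp, knots, degree=3)
beta = fit_logistic_irls(X, y)

grid = np.linspace(-20.0, 30.0, 501)
p_hat = 1.0 / (1.0 + np.exp(-(bspline_basis(grid, knots, degree=3) @ beta)))
t_peak = grid[np.argmax(p_hat)]
print(f"maximum rate {p_hat.max():.3f} at {t_peak:.1f} degrees C")
```

Unlike a logistic model that is linear in temperature, the fitted curve can rise and then fall, so a maximum rate temperature is well defined.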
We employed the RRI to understand the effect of temperature-dependent probabilities in the RCP scenarios. The RRI measures how the probability of occurrence is driven by temperature in the GALM.
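Relative risk index (RRI) = $\sum_{i \in A} \mathrm{Deviation}_i \cdot \chi(\mathrm{Deviation}_i > 0)$, where $A$ is defined as the focus set, $\mathrm{Deviation}_i$ is defined as $\pi(y_i \mid \cdot)$ minus the critical value, and $\chi$ represents the indicator function.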
Larger RRIs indicate an increased risk of infection by norovirus. That is, the probability in a particular group is greater than the mean probability of being infected by norovirus among diarrhea patients. For example, the incidence of norovirus among all diarrhea patients aged 0-5 is 13.00%, so the critical value for that group is 0.13. We can compute the RRI for age, region, and X-year RCP scenarios.
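A small numerical sketch may help make the index concrete. It assumes that the RRI sums the positive deviations of the predicted probabilities from the critical value over the focus set; this reading is inferred from the definitions of the deviation and the indicator function and should be checked against the published formula. The function name and the four example probabilities are hypothetical.

```python
import numpy as np


def relative_risk_index(probabilities, critical_value):
    """Sum of positive deviations from the critical value (assumed form of the RRI)."""
    p = np.asarray(probabilities, dtype=float)
    deviation = p - critical_value
    # The indicator keeps only cases whose probability exceeds the critical value.
    return float(np.sum(deviation[deviation > 0]))


# Hypothetical predicted probabilities for four grid-days in the 0-5 age group.
predicted = [0.10, 0.15, 0.20, 0.13]
rri = relative_risk_index(predicted, critical_value=0.13)
print(f"RRI = {rri:.2f}")
```

Only the second and third values exceed 0.13, contributing 0.02 and 0.07, so the index for this toy focus set is about 0.09.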
We calculated the RRI in each RCP scenario for the winter months in Korea (December to February). We organized the 451,351 grid points in the RCP scenarios into 96,172 grid points based on their location on the Korean peninsula to calculate the RRIs, and then we used the GALM to predict the probability of the incidence of norovirus on a daily basis. Figure 2 shows the grid of 96,172 points that we used. In a logistic regression model, the outcome $y_i$ is 0 or 1, with 1 indicating an event occurring (such as becoming infected with a disease) and 0 indicating no event occurring. In a typical logistic model, the probability of a response either increases or decreases with the value of the explanatory variable. Therefore, a quadratic form, which increases and then decreases around an optimum value, is not applicable there. To compensate for this, a number of orthogonal bases were used in this paper to make such a form feasible.
Figure 3 shows the GALM analysis using B-splines. The GALM is more useful for predicting the probability of occurrence than the generalized linear model. The findings indicate that the rate of occurrence in patients in different age groups differs significantly depending on temperature. From Figure 3, it is evident that people aged 15 and under are more than twice as likely to contract norovirus as those over the age of 15. In particular, the two age groups containing patients aged 15 and under show the highest probability of contracting norovirus when the temperature is near 0 °C, while the other two age groups, 16-59 years and 60 years and older, have the highest probability of being infected when the temperature is between −10 °C and 0 °C. In other words, the temperature interval with the highest probability of norovirus was found to be lower for those over 15 years of age than for those aged 15 and under.
Table 3 lists the maximum incidence rates, maximum rate temperatures, risk intervals, and high-risk intervals of the four age groups. Table 3 shows the temperatures at which patients in each group are vulnerable to norovirus; this information could prove very helpful in preventing norovirus. As noted in the previous section, the proportion of all diarrhea patients aged 0-5 who contract norovirus is 13%, which is significantly higher than the corresponding proportions in all other age groups. The predicted norovirus infection rate among patients aged 0-5 years reaches a maximum of 23.8% at −2.0 °C, which is significantly greater than the overall observed rate of 13.00%. A risk interval above 19.0% corresponds to an average daily temperature between −10.4 °C and 6.4 °C. In addition, a high-risk interval over 21.4% corresponds to an average daily temperature between −7.8 °C and 3.8 °C. The predicted norovirus infection rate among all diarrhea patients aged 6-15 years reaches a maximum of 21.2% at −0.5 °C; this is significantly greater than the overall observed rate of 8.74% for the same group. A risk interval above 17.0% corresponds to an average daily temperature between −5.7 °C and 4.6 °C. A high-risk interval exceeding 19.1% corresponds to an average daily temperature between −4.1 °C and 3.0 °C. The predicted norovirus infection rate among all diarrhea patients aged 16-59 reaches a maximum of 11.9% at −5.8 °C, which is more than double the overall observed rate of 4.79% for the same group. A risk interval exceeding 9.5% corresponds to an average daily temperature between −14.3 °C and 2.7 °C. The high-risk interval exceeding 10.7% corresponds to an average daily temperature between −11.7 °C and 0.1 °C. The predicted norovirus infection rate among all diarrhea patients aged 60 years or older reaches a maximum of 7.7% at −4.6 °C, which is significantly greater than the overall observed rate of 3.37% for the same group. The risk interval exceeding 6.2% corresponds to an average daily temperature between −12.9 °C and 3.6 °C. A high-risk interval, which exceeds 7.0%, corresponds to an average daily temperature between −10.3 °C and 1.1 °C.
In particular, the incidence rates in patients aged 0-5 and 6-15 are much greater than the average rates. In this paper, we therefore gave priority, for each age group, to scenarios in which norovirus occurs at a rate greater than the group average (13% for 0-5 years old and 8.74% for 6-15 years old). For each age group, the RRI was calculated using the incidence of norovirus among diarrhea patients in that group as the critical value.
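The thresholds in Table 3 follow directly from the definitions in the Methods: for ages 0-5, 0.8 × 23.8% ≈ 19.0% and 0.9 × 23.8% ≈ 21.4%. The sketch below shows one way such intervals could be read off a fitted probability curve. The curve used here is a made-up bell shape with the same maximum (23.8% at −2.0 °C), not the fitted GALM, so the resulting temperatures are illustrative only; `risk_intervals` is a hypothetical helper that assumes a single-peaked curve.

```python
import numpy as np


def risk_intervals(temps, probs, fractions=(0.8, 0.9)):
    """Maximum rate, its temperature, and the temperature range where the
    probability is at least each fraction of the maximum (unimodal curve assumed)."""
    temps = np.asarray(temps, dtype=float)
    probs = np.asarray(probs, dtype=float)
    k = int(np.argmax(probs))
    result = {"max_rate": float(probs[k]), "max_rate_temp": float(temps[k])}
    for fraction in fractions:
        above = temps[probs >= fraction * probs[k]]
        result[fraction] = (float(above.min()), float(above.max()))
    return result


# Hypothetical single-peaked curve on a 0.1 degree grid (not the paper's fit).
temps = np.linspace(-20.0, 20.0, 401)
probs = 0.238 * np.exp(-(((temps + 2.0) / 12.5) ** 2))

intervals = risk_intervals(temps, probs)
print("maximum rate temperature:", round(intervals["max_rate_temp"], 1))
print("risk interval (>= 0.8 x max):", intervals[0.8])
print("high-risk interval (>= 0.9 x max):", intervals[0.9])
```

Because the fitted curves in Figure 3 need not be symmetric, the real intervals are generally not centered on the maximum rate temperature, as the Table 3 values show.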
Tables 4-7 and Figure 4 show the resulting values and the 2.5th and 97.5th percentiles of these RRIs by age relative to the critical values in the years 2030, 2050, 2070, and 2100 according to RCP scenarios 2.6, 4.5, 6.0, and 8.5. These results were obtained using percentile bootstrap methods over 100 trials. In Figure 4, the RRI shows a similar pattern within the same age groups and RCP scenarios. In the RCP 2.6 scenario, the RRI for the 0-5 years, 16-59 years, and over 60 years age groups increases in 2050 and 2070, but in 2100 it is similar to or only slightly higher than in 2030. In RCP 4.5, the RRI tends to decrease, but it rises briefly in 2070. In RCP 6.0, the RRI increases until 2070 and then begins to decrease, and by 2100 it is decreasing rapidly. In RCP 8.5, the RRI is maintained until 2050 and then decreases significantly until 2100. In Figures 5-8, we present the probability of being infected with norovirus according to the temperature calculated by the GALM for each grid point on the map of the Korean Peninsula. Overall, it can be seen that the more rapid the climate change, the lower the probability of infection. In addition, in terms of the regional analysis, the probability of infection decreases starting from the southern and eastern coasts. In Figure 6, it can be seen that in all RCPs, the probability of infection in the northeast area is low in 2030 and in 2050. For two age groups, 16-59 years and over 60 years, it is lower in the coastal areas than in other areas.
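A percentile bootstrap of the kind mentioned above can be sketched as follows. The paper does not state which units were resampled, so this example assumes that the predicted grid-day probabilities are resampled with replacement and that the RRI is the sum of positive deviations from the critical value; the simulated probabilities and all function names are hypothetical.

```python
import numpy as np


def rri(probabilities, critical_value=0.13):
    """Assumed RRI: sum of positive deviations from the critical value."""
    deviation = np.asarray(probabilities, dtype=float) - critical_value
    return float(np.sum(deviation[deviation > 0]))


def percentile_bootstrap(values, statistic, n_boot=100, lower=2.5, upper=97.5, seed=0):
    """Point estimate with a percentile bootstrap interval over n_boot resamples."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    replicates = np.empty(n_boot)
    for b in range(n_boot):
        # Resample the observations with replacement, keeping the sample size.
        sample = rng.choice(values, size=values.size, replace=True)
        replicates[b] = statistic(sample)
    low, high = np.percentile(replicates, [lower, upper])
    return statistic(values), float(low), float(high)


# Hypothetical predicted winter probabilities for one age group and scenario.
rng = np.random.default_rng(1)
predicted = rng.uniform(0.05, 0.24, size=2_000)

estimate, low, high = percentile_bootstrap(predicted, rri, n_boot=100)
print(f"RRI = {estimate:.2f}, 95% interval = ({low:.2f}, {high:.2f})")
```

With only 100 resamples, as in the paper, the interval endpoints are themselves somewhat noisy; a larger number of trials would stabilize them at the cost of computation time.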

Discussion and Conclusions
In the previous section, we calculated the probability of norovirus occurrence for different age groups by applying the GALM with average daily temperature as an explanatory variable, and we calculated the risk according to the RCP scenarios based on the RRI presented in this paper. These results are consistent with the general observation that food poisoning caused by norovirus occurs most often in winter. Considering that the average winter temperature in Korea is 1.5 °C in December, −1.0 °C in January, and 1.1 °C in February, Korea's winter season is favorable for norovirus to occur. The results show that the risk of norovirus occurrence increases in 2050 and 2070 but decreases by the year 2100 under the RCP 2.6 scenario for the 0-5 years, 16-59 years, and over 60 years age groups. For all age groups, the risk of norovirus occurrence decreases over time under the RCP 4.5 scenario, despite a brief increase in 2070. Under the RCP 6.0 scenario, for all age groups, the risk increases until 2070 and then decreases to 2100. Under the RCP 8.5 scenario, for all age groups, the risk remains stable until 2050 and then decreases to 2100. In the prediction probability plots, we can see that the probability of occurrence decreases starting from the coast, and that the probability of occurrence in the northeast area is low in 2030 and 2050 in all RCP scenarios. In general, the incidence of norovirus is predicted to decrease when the temperature rises due to climate change. However, the findings of this study indicate that, even if the temperature rises, the temperature in Korea will remain within the risk interval for norovirus occurrence, so the risk of norovirus will either remain stable or increase. Despite the predicted increase in temperature due to climate change, it is necessary to continue to prepare for future norovirus outbreaks in Korea, particularly among those aged 15 and under.
Based on these simulation results, if the winter temperature rises, the possibility of food poisoning due to domestic norovirus will increase, and effective disease control measures will be needed. Of course, our ability to predict changes in the pattern of norovirus outbreaks from the average daily temperature alone may be somewhat limited. A number of factors should be considered to improve the accuracy of these predictions, in particular humidity and precipitation, as these are closely related to microbial reproduction. In fact, there have been reports that lower water temperatures increase the viability of noroviruses in their natural environments [39,40], and, statistically, lower water temperatures in the Ontario River have been associated with many cases of food poisoning caused by noroviruses [41]. Norovirus can be transmitted not only through the fecal-oral route but also through aerosols caused by vomiting, so an increase in humidity is likely to increase pathogen viability [40]. In a similar vein, there have been reports that high humidity increases the viability of rotavirus [42]. In order to reasonably predict changes in the seasonal patterns of norovirus outbreaks, it is necessary to consider not only climate-related factors but also human behavior and socioeconomic changes that may occur with climate change. The cooler temperatures in winter cause people to remain inside longer, thus increasing the incidence of food poisoning from norovirus, meaning that changes in human behavior can affect the incidence of norovirus. This paper calculated the risk index according to the temperature changes suggested in various RCP scenarios, but this approach has limitations, as the occurrence of norovirus is affected by other latent factors.
For example, outbreaks of food poisoning caused by norovirus are often related to the consumption of fresh fruits and vegetables, so a significant increase in the consumption of fresh fruits and vegetables in winter could lead to a marked increase in norovirus food poisoning regardless of the climate conditions. In addition, 42.5% of cases of food poisoning caused by norovirus in Korea from 2000 to 2007 are known to have been caused by careless food handling by cooks, so improved hygiene practices among individuals working in large centers that handle fresh agricultural products could reduce incidences of norovirus. Nevertheless, based on the results of this study, we predict that the risk of norovirus in winter will either remain stable or increase in Korea despite climate change leading to warmer winters. This approach provides an overview of the likely changes in regional and seasonal norovirus patterns, which is expected to support appropriate policy decisions.