The Incidence of West Nile Disease in Russia in Relation to Climatic and Environmental Factors

Since 1999, human cases of West Nile fever/neuroinvasive disease (WND) have been reported annually in Russia. The highest incidence has been recorded in three provinces of southern European Russia (Volgograd, Astrakhan and Rostov Provinces), yet in 2010–2012 the distribution of human cases expanded northwards considerably. From year to year, the number of WND cases varied widely, with major WND outbreaks in 1999, 2007, 2010, and 2012. The present study was aimed at identifying the most important climatic and environmental factors potentially affecting WND incidence in the three above-mentioned provinces and at building simple prognostic models from those factors by the decision trees method. The effects of 96 variables, including mean monthly temperature, relative humidity, precipitation, Normalized Difference Vegetation Index, etc., were taken into account. The findings of this analysis show that an increase of human WND incidence, compared to the previous year, was mostly driven by higher temperatures in May and/or in June, as well as (to a lesser extent) by high August-September temperatures. Declining incidence was associated with cold winters (December and/or January, depending on the region and type of model). WND incidence also tended to decrease during the year following a major WND outbreak. Combining this information, the future trend of WND may be predicted, to some extent, from the climatic conditions observed before the summer peak of WND incidence.

The incidence of WND is seasonal in the temperate zones of Eurasia, the Mediterranean Basin, and North America, peaking from July through October [17,20]. In temperate climates the WND incidence is related to the weather, although it is uncertain how temperature and rainfall influence epidemic transmission [24]. In 1999, there was a large outbreak of WND in Southern Russia (>500 cases in the Volgograd Province). In 2000-2004, the WND incidence decreased steadily to zero, but a new outbreak occurred in 2007. The analysis of historical climate data for Volgograd from 1900 to the present has shown that the years 1999 and 2007 were the hottest ones due to a very mild "winter" (December-March) and a hot "summer" (June-September) [2,3]. Similar conclusions were reached by analyzing weather conditions during the WNV outbreak in Israel in 2000 [25]. It is clear, however, that a complex interplay of viral, avian, mosquito, human, and climatic factors contributes to epidemic/epizootic transmission of WNV [24,26]. For example, avian herd immunity developed during a year of high WNV activity might result in a decrease in transmission during the following season.
Either the number of human WND cases or the prevalence and WNV infection rate of mosquitoes has been used as a measure of the WND risk under particular conditions [25,27-33]. An increase in temperature within the 18-30 °C range shortens the gonotrophic period (GP) of Culex pipiens, increasing the frequency of mosquito-host contact and therefore of infection and transmission. The replication of WNV in infected mosquitoes also accelerates as the temperature rises [34-36]. For example, the so-called extrinsic incubation period (EIP) of the WNV NY99 strain in Cx. tarsalis mosquitoes decreased from 30 days at 18 °C to 10 days at 26 °C [36]. This presumably intensifies the spread of WNV infection [2,3]. The transmission of WNV by Culex mosquitoes may accelerate sharply with an increase in temperature [34-36]. If the values of GP and EIP are experimentally estimated for a certain mosquito vector and WNV strain, "biological"/"process-driven" models may be constructed to analyze the temperature effects on the spatiotemporal dynamics of WNV transmission [37]. Unfortunately, such data were not available for "Russian" WNV strains and Russian populations of Culex mosquitoes, so we had to use purely "statistical" approaches for modeling, based solely on human WND incidence.
Territorial factors such as urbanization and population density also showed an effect on WND incidence, as observed in the northeastern US and in Iowa [32,38]. Discriminant analysis of the characteristics of regions in the Canadian province of Saskatchewan in 2003 and 2007 revealed a relationship between high WND incidence and the following risk factors: low precipitation in June-July and high temperatures in July-August [28]. According to the data obtained in Illinois, the value of cumulative temperatures (above 22 °C) was the best differentiator of years with high WND incidence and a high WNV infection rate of Culex mosquitoes. A dry spring followed by a humid summer also contributed to the spread of infection, though to a lesser extent. Areas with high incidence and a high mosquito infection rate were characterized by lower summer precipitation [29]. During the WND outbreak in Europe in 2010, human morbidity correlated with higher temperature and, to a lesser extent, with relative humidity, while the association with precipitation was not consistent. Notably, northern ("colder") countries displayed strong correlations between the number of WND cases and temperature with a lag of up to four weeks, in contrast to southern ("warmer") countries, where the response was immediate [5].
Such relations can be used to develop models that forecast the risk of WND outbreaks from data gathered daily over a large territory through remote satellite monitoring [39,40]. Data on the timing of spring green-up (measured with NDVI), temperature variability in early spring and summer (measured with land surface temperature), and moisture availability from late spring through early summer (measured with actual evapotranspiration) can be useful predictors of the risk of human WND infection, while the abundance of mosquitoes may be predicted from values of daily surface water inundation fraction, surface air temperature, soil moisture, and microwave vegetation opacity.
The current study was designed to identify climatic and environmental factors that have considerably influenced the WND incidence in endemic regions of southern Russia in recent years, and to build, with these factors taken into account, models that would make it possible to forecast the epidemiological situation of WND in the current year.

Materials and Methods
Russia's territory is administratively divided into 83 so-called "constituent entities of the Russian Federation" (provinces, territories or republics). The data on WND incidence in the Russian Federation constituent entities are based on official statistics of the Ministry of Health (Rospotrebnadzor) and research papers published over the period of 1999-2013 [5,6,12,41-45].
The main dataset comprised the WND cases that occurred from 2001 to 2012 in Astrakhan, Volgograd and Rostov Provinces (Figure 1). These data, which include about 58% of the 2,283 WND cases reported in Russia from 1999 to 2013, may be considered informative and reliable since they were gathered using standardized procedures according to official Methodological Guidelines [45-48].
All patients with high fever or neuroinvasive disease hospitalized in these three provinces from June to October in 2000-2013 underwent laboratory investigation, including serological testing. By definition, notification of a WND case was mandatory if all other possible diagnoses were excluded and a high titer of WNV antibodies was found in a serum sample by IgM-capture ELISA. Up to 30% of WND cases were additionally confirmed by IgG seroconversion and/or specific PCR, but these tests were not mandatory. All patients who died of encephalitis in this period were subjected to PCR-based investigation, and the presence of WNV RNA in a brain autopsy sample was considered laboratory confirmation of WNV infection. Thus, the case definition included both neuroinvasive and non-neuroinvasive WND cases, which were not notified separately [45-47]. These procedures could lead to some underestimation of WND incidence, because patients with mild symptoms were not hospitalized and investigated, or to its overestimation, because of false-positive ELISA results or the presence of IgM antibodies developed in response to a previous WNV exposure. Therefore, we did not use the absolute values of WND incidence for the decision trees. The trees were constructed using the changes of incidence in the province in comparison with the incidence in the previous year (see below).
Incidence in a particular year was compared to the factors' values throughout a "WND epidemic year", running from November of the previous year (relative to the incidence data) through October of the current year. Two types of models were made: "explanatory" and "prognostic". Explanatory Models (ExpMod) took into account variations of climatic and environmental parameters over the entire "epidemic year", including the period of the highest WND incidence, from July to October (this period is called the "epidemic season" below). Prognostic Models (ProMod) only used parameter values from November to June; that is, they were, by definition, intended to "predict" incidence changes before the beginning of an epidemic season.
In order to build the models, the events subjected to analysis and forecasting were classified into three types of outcome: (1) decrease of incidence in the province in comparison with the previous year (by at least 15% of the long-term average for the province in question); (2) stabilization of incidence (the value of the current year differs from the previous year's value by less than 15% of the long-term average); (3) increase of incidence in the province in comparison with the previous year (by at least 15% of the long-term average). The long-term average was defined as the arithmetic mean of WND incidence in 2001-2012. The threshold of 15% was arbitrary.
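The three-outcome rule can be illustrated with a minimal Python sketch. It assumes that the 15% margin is taken as a fraction of the long-term average, that the long-term average is the mean of the years supplied, and that the years are consecutive; the function name and the incidence values are hypothetical and serve only as an illustration.

```python
from statistics import mean

def classify_changes(incidence_by_year, threshold=0.15):
    """Label each year's change in incidence relative to the previous year.

    Assumption: the margin is `threshold` times the long-term average,
    computed here as the mean over all years supplied.
    """
    years = sorted(incidence_by_year)
    long_term_avg = mean(incidence_by_year[y] for y in years)
    margin = threshold * long_term_avg
    labels = {}
    # Compare each year with the one before it (years assumed consecutive).
    for prev, cur in zip(years, years[1:]):
        diff = incidence_by_year[cur] - incidence_by_year[prev]
        if diff >= margin:
            labels[cur] = "increase"
        elif diff <= -margin:
            labels[cur] = "decrease"
        else:
            labels[cur] = "stability"
    return labels

# Hypothetical incidence values (cases per 100,000), not data from this study.
example = {2001: 1.0, 2002: 1.1, 2003: 0.2, 2004: 2.5}
labels = classify_changes(example)
print(labels)
```

Because the margin is tied to the long-term average rather than to the previous year's value, a small absolute change in a low-incidence year is labelled as stability even when the relative change is large.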
The incidence in the three provinces was analyzed for correlation with 96 variables averaged for the same entity: mean monthly temperature (T); mean monthly relative humidity (RH); mean monthly atmospheric pressure (AtmP); mean monthly amount of precipitation per day (AP); mean monthly value of NDVI (NDVI); mean monthly value of NDVI for forests (NDVI-forest); mean monthly value of NDVI for meadow and steppe vegetation (NDVI-meadow); mean monthly value of NDVI for arable lands (NDVI-arable). Thus, eight environmental parameters were considered for each of the 12 months of the "epidemic year" (8 × 12 = 96 variables). Primary meteorological data were freely available from the National Center for Atmospheric Research (NCAR, Boulder, CO, USA) [49]. NDVI was calculated from MODIS data using the standard product MOD 09 [50,51]. Primary data were re-analyzed, cleared of noise, and linked to Russian administrative regions and cities using techniques of the VEGA geoportal created and maintained by the Space Research Institute of the Russian Academy of Sciences [51,52]. The land cover maps were also based on MODIS data and produced using an original method developed at the Space Research Institute [51,53]. For the purpose of this study, the values of climatic and environmental parameters were retrieved directly from the database underlying the VEGA geoportal.
Statistical analysis was performed with the IBM SPSS Statistics 19 software [54]. The Decision Tree procedure tested the hypothesis of an association between climatic and environmental parameters and increasing, decreasing or stable incidence of WND. The CRT method (Classification and Regression Trees) was used; verification also involved the CHAID method (Chi-squared Automatic Interaction Detection). The final tree was cross-validated by dividing the sample into 10 subsamples. Tree models were then generated, excluding the data from each subsample in turn. The cross-validated misclassification risk estimate for the final tree was calculated as the average of the risks for all of the trees [55].
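The cross-validation step can be sketched in plain Python as follows. This is not the SPSS procedure: a one-split "stump" on a single variable, chosen by training error, stands in for the CRT tree (which uses an impurity criterion and may have more levels), and the subsamples are formed by a simple index rule. All names and the temperature values are hypothetical.

```python
from collections import Counter

def majority(labels):
    """Most frequent label in a list."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(xs, ys):
    """Find the threshold on one variable that gives the fewest training errors."""
    best = (len(ys) + 1, None, majority(ys), majority(ys))
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue  # a split must leave observations on both sides
        l_lab, r_lab = majority(left), majority(right)
        errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
        if errors < best[0]:
            best = (errors, t, l_lab, r_lab)
    return best[1:]  # (threshold, label at or below it, label above it)

def predict(stump, x):
    t, l_lab, r_lab = stump
    return l_lab if t is None or x <= t else r_lab

def cross_validated_risk(xs, ys, k=10):
    """Average misclassification risk over k models, each built without one subsample."""
    n = len(xs)
    risks = []
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]
        if not test_idx:
            continue
        train_idx = [i for i in range(n) if i % k != fold]
        stump = fit_stump([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        wrong = sum(predict(stump, xs[i]) != ys[i] for i in test_idx)
        risks.append(wrong / len(test_idx))
    return sum(risks) / len(risks)

# Hypothetical May temperatures (°C) and outcomes for 12 years, for illustration only.
may_t = [14.0, 14.5, 15.0, 15.5, 16.0, 16.2, 16.8, 17.0, 17.5, 18.0, 18.5, 19.0]
outcome = ["decrease"] * 6 + ["increase"] * 6
risk = cross_validated_risk(may_t, outcome, k=10)
print(round(risk, 3))
```

With only 12 observations, most subsamples contain a single year, so each fold's risk is either 0 or 1 and the averaged estimate is coarse; this is one reason to keep the trees small.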
The task of the Decision Tree procedure was to select the few most significant parameters, and the thresholds for their categorization, whose combination made it possible to classify and "predict" one of three events: decrease, stability, or increase of WND incidence in comparison with the previous year. The term "correct classification" below means that the Decision Tree procedure forms a pure terminal node containing only one "correct" type of outcome (e.g., the node with six years of increase in Figure 6a). If a terminal node contains several types of outcome (e.g., the node with one year of stability and four years of decrease in Figure 6a), the prevalent outcome is considered to be correctly classified and the less frequent outcomes are formally considered as "errors".
Figure 1 presents a map of the south of Russia, where most of the WND cases were reported; Figure 2 shows the seasonal changes of temperature and precipitation in Astrakhan, Volgograd and Rostov Provinces, which have a continental climate. In general, winter is colder in Volgograd, while spring and summer months are warmer in Astrakhan Province, which is also more arid. In Southern Russia there are two large rivers, the Volga and the Don, with their tributaries, artificial lakes and channels, marshland and ponds. As a result, many migratory birds from Africa, mainly waterfowl, nest in this region in spring and summer. The capitals Astrakhan City and Volgograd City are on the banks of the Volga river, and Rostov City is on the banks of the Don river. The land use derived from remote sensing data [50,53] is color-coded in Figure 1 and shown in Table 1. The urbanization level is moderate. Yellow "agricultural land" in Volgograd and Rostov Provinces is mainly cropland. Pink "grassland, steppe and semi-desert" areas in Astrakhan and Volgograd Provinces are used for grazing or gardening, or are not used. Large forests ("green colors", Figure 1) are nearly absent, although there is a lot of vegetation near rivers and ponds.
[41][42][43][44][45]. The overall case-fatality rate was about 2%. The ratio of WND incidence in the rural population to that in the urban population was 0.40 and 0.35 (median values) for Volgograd and Rostov Provinces, respectively. This ratio was significantly higher (1.1) in Astrakhan Province. Table 2 presents the WND incidence in Astrakhan, Volgograd and Rostov Provinces over the period from 1997 to 2013. The data on WND cases in other Russian provinces in 2010-2013 can be found on the ECDC website [6].

Results and Discussion
According to Table 2 [...]. Correlations between mean temperatures and the incidence are presented in Table 3 and in Figure 4. In some cases the average temperature for several months, e.g., December-January, correlated better with WND incidence than any single monthly temperature (the average December-January T is defined as ("December T" + "January T")/2). It is noteworthy that other factors, namely precipitation, relative humidity, atmospheric pressure or NDVI values, did not significantly correlate with the WND incidence in any of the provinces.
* The rules for classification of WND incidence changes as "decrease", "stability" and "increase" are given in the Methods; ? The data obtained from official publications were considered not reliable and were not used below.
For Volgograd Province, the most important relation was the correlation of morbidity with May to July temperatures: WND outbreaks occurred in the years when the mean temperature in May-July exceeded 21 °C (four points in the upper right corner of Figure 4a corresponding to years 2007 and 2010-2012). As a linear approximation (Figure 4a) suggests, on average, a 1 °C rise of temperature during this period would increase the incidence by a factor of 4.6.
In Rostov Province, the most important factor was the temperature in May and (to a lesser extent) in June, but not in July. Incidence increases were also associated with warmer than usual temperatures in December of the previous year. In the southernmost Astrakhan Province, summer temperatures are generally rather high (Figure 3a) and do not restrict the spread of WND. In this province, the temperature of December-January may act as a limiting factor: when it is below −5 °C, WND incidence tends to be lower than usual (five points in the lower left corner of Figure 4c corresponding to years 2003, 2006, and 2008-2010). In all three provinces, a weak positive correlation was observed with temperatures in August-September, when most of the WND cases were recorded (Figure 2a, Figure 3). Owing to the relatively short observation period, this correlation did not reach the threshold of statistical significance (p = 0.05).
In Figure 5, cumulative spring-summer temperatures in Volgograd and Rostov Provinces are plotted for years with low WND incidence (2003 and 2008) and for years with WND outbreaks (2007 and 2010). In years with low incidence, the period when the mean temperatures steadily exceed 21 °C comes a few weeks later, and the final cumulative temperatures are lower. The effect of temperatures above the threshold value of 14 °C is less obvious.
For planning and implementation of preventive measures and control of epidemics, it is necessary, above all, to know and predict the pattern of incidence variation from information obtained in the previous year(s). ExpMod and, particularly, ProMod were designed for this purpose. Only the main dataset, that is, the data relating to years 2001-2012, was used to build these models. In addition to mean monthly parameters, we also included the average temperatures of those periods of the year that showed significant correlations with WND incidence (Table 3).
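The cumulative temperatures plotted in Figure 5 can be illustrated with a short Python sketch. The exact accumulation rule is not spelled out in the text, so the sketch assumes a running sum of daily mean temperature in excess of the threshold; the run length that defines "steadily" exceeding the threshold, the function names, and the temperature series are all hypothetical.

```python
def cumulative_excess(daily_mean_temps, threshold=21.0):
    """Running sum of the daily excess of mean temperature over the threshold.

    Assumption: days at or below the threshold contribute nothing.
    """
    total = 0.0
    running = []
    for t in daily_mean_temps:
        total += max(0.0, t - threshold)
        running.append(total)
    return running

def onset_of_steady_warmth(daily_mean_temps, threshold=21.0, run_length=3):
    """Index of the first day of the first run of `run_length` consecutive
    days above the threshold, or None if no such run occurs."""
    streak = 0
    for day, t in enumerate(daily_mean_temps):
        streak = streak + 1 if t > threshold else 0
        if streak == run_length:
            return day - run_length + 1
    return None

# Hypothetical daily mean temperatures (°C), for illustration only.
warm_year = [19, 22, 23, 24, 25]
cool_year = [18, 19, 22, 20, 23]

print(cumulative_excess(warm_year)[-1], onset_of_steady_warmth(warm_year))
print(cumulative_excess(cool_year)[-1], onset_of_steady_warmth(cool_year))
```

In this toy series the warm year accumulates more excess and reaches a steady run above the threshold early, while the cool year never does; the same pair of summaries, onset date and final cumulative value, is what distinguishes the outbreak years from the low-incidence years in Figure 5.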
Taking into account that the WNV activity in the previous year might modulate the WND morbidity in the current year, we also added "WND incidence in the previous year" as an independent "predicting" variable. This variable is designated as WN_IN_PY below. With regard to ExpMod and ProMod, only the models that provide the best explanation (ExpMod) or prediction (ProMod) for each province with the minimum number of parameters are presented. One of these models was constructed without the variable WN_IN_PY and the second model included it. We also built a model common to all three provinces, though their geographical location, fauna, type of landscape and other characteristics are not identical, which could have modified the effects of the climatic and environmental factors being assessed. The logical structure, thresholds and performance of the seven selected models are presented in Table 4. Decision trees corresponding to three of the seven models are shown in Figure 6.
* This model was also the best ProMod, as it has been developed using only parameters obtained before the beginning of the "epidemic season" (July-October current year); ** Less frequent outcomes shown in italics are formally considered as "errors of prediction"; *** WN_IN_PY means "WND incidence in the previous year", no. of cases per 100,000 population.
The best ExpMod 2A* for Astrakhan Province uses three parameters: first of all, the variable WN_IN_PY, and then the mean monthly temperatures (T) in December and January (Table 4). Notably, this model was also the best ProMod, as it was developed using parameters obtained (December of the previous year and January of the current year) before the beginning of the epidemic season (July-October of the current year). ExpMod 2A* correctly classified all 12 outcomes in 2001-2012 (Figure 7). High WND incidence in the previous year correlates with a decrease of WND incidence in the current year. A relatively warm December and January contribute to increased WND incidence in the following summer. If the variable WN_IN_PY is deliberately excluded from the analysis, the best ExpMod 1A*, which is also a ProMod as indicated by the asterisk, uses December-January T and then May T (Table 4 and Figure 6a).
The best ExpMod 1V for Volgograd Province uses May-June T and then September T, correctly classifying all 12 outcomes in 2001-2012 (Table 4, Figure 7). An increase is expected in the case of a warm May-June; a decrease is expected if both May-June and September are cold. If the variable WN_IN_PY is included in the analysis, the decision tree procedure selects two variables, WN_IN_PY and June T, and constructs ProMod 1V, which provides the best "predictions" for 2001-2012 (Table 4, Figures 6b and 7).
For Rostov Province, the best ExpMod 1R* was the one that fitted two parameters, the mean monthly May T and January T. Values of T in May above 16.5 °C were unambiguously related to an increase in the WND incidence in July-October (epidemic season) of the same year. If both May and January were cold, a decrease was predicted (Table 4, Figure 7). ExpMod 2R* also took WN_IN_PY into account. Again, a decrease of WND incidence followed years with relatively high WND incidence. ExpMod 1R* and ExpMod 2R* could not correctly classify two years of stability, 2001 and 2009.
The analysis shows that it is possible to build acceptable models that are common to all three provinces. ExpMod 1AVR provides the best possible classification using temperatures in May, January, and August-September, as well as the variable WN_IN_PY (Table 4, Figures 6c and 7). ExpMod 1AVR [...]. It has to be pointed out that including a larger number of parameters in our models would have avoided even these minor classification errors, but we deliberately limited the number of parameters to the 2-4 most important ones, to prevent "overfitting" due to the small number of observations. For this purpose, the maximum number of levels for the CRT Growing Method was set to 3, but many branches of the decision trees were even shorter, having just one or two nodes (Figure 6).
In general, all seven selected models provide correct classifications for the years 2003, 2005-2008, and 2010-2012. Some, but not all, of the models fail to explain the situation in Volgograd in 2002, in Astrakhan in 2009, and in Rostov in 2001, 2002. Notably, these were years of low WND incidence, since only 14, 3, 5, 0, 7, and 1 human WND cases were registered, respectively. They account for only 2% of the WND cases (30 of 1,330) identified in these provinces during the study period 2001-2012.
With certain caution, the suggested models may be used for analysis of WND incidence in other years as well. When applied to earlier observations, the province-specific ExpMod and ProMod explain and "forecast" an increase in WND incidence in Astrakhan Province in 1999, in Volgograd Province in 1998 and 1999, and in Rostov Province in 2000 (Figure 7). For 2013, we observed conflicting results: the province-specific models ExpMod 1A* and ExpMod 1R*, without the variable WN_IN_PY, "expected" an increase of human WND incidence in Astrakhan and Rostov because of a relatively mild winter and a warm May. Conversely, the province-specific models ExpMod 2A*, ProMod 1V, and ExpMod 2R*, which use the variable WN_IN_PY, correctly predicted the decrease of human WND incidence in all three provinces, since human morbidity and, presumably, WNV transmission had been high in 2012. Notably, major WND outbreaks in Romania, Israel, Greece and Russia have been invariably followed by a relative decline of incidence in the following year [3,4,7,12-15,20,41-45].
An additional pitfall for modeling was that the WNV strains were not the same in the study period in the three provinces. It can be reasonably assumed that in recent history (i.e., since 1990) there were two independent introductions of WNV strains in Volgograd, possibly by migratory birds from Africa. The first WNV lineage 1a clone (prototype strain VLG-4 AF317203) was introduced around 1995 [9,56]. From 1999 to 2003, all WNV isolates from Volgograd were very similar [3,12,19]. This implies the persistence/overwintering of the WNV clone in Volgograd, while multiple invasions of the same strain appear to be unlikely. Unfortunately, the mechanisms of WNV overwintering in Volgograd have not been properly studied and remain undefined. In 2004, there was no human case of WND in Volgograd, and we were unable to find any WNV RNA in birds or mosquitoes [12]. The climatic conditions in 2003 and 2004 were unfavorable for WNV infection; as a consequence WNV of lineage 1a disappeared and was not found in Volgograd anymore. The second WNV clone, belonging to lineage 2 (prototype strain Reb_VLG_07 FJ425721), was introduced in Russia and Eastern Europe after the year 2000 [12,15,16]. Between 2007 and 2012, all WNV isolates from Volgograd and Rostov were nearly identical to the strain Reb_VLG_07. This implies again the persistence of the WNV clone in Southern Russia. The data on WNV genotypes in Rostov and Astrakhan Provinces are scarce. Apparently, several WNV clones may persist at the same time in more southern Astrakhan Province [12,56,57]. Climatic conditions might facilitate or impede WNV introduction into the northern regions and, vice versa, the features of prevalent WNV strains, for example their virulence for vectors and hosts, might affect WNV transmission independently of climatic factors. Despite that, the same, or similar, models proved to be applicable for the whole study period in all three provinces.
It is noteworthy that neither relative humidity nor atmospheric pressure values were needed by the models to explain variations in WND incidence. The increase of incidence is mostly related to higher than normal mean temperatures in May and/or June (the period of WNV amplification in the bird-mosquito-bird epizootic cycle) and, to a lesser extent, to high temperatures in August-September (the peak of human morbidity). A decline of incidence may be associated with a cold winter (December and/or January and/or February, depending on the region and the type of model).
Neither the temperature and precipitation registered in the other months nor the NDVI values of the various types of vegetation appeared to produce an independent, statistically significant effect on the WND incidence, as their potential positive or negative effects were weaker than the effect produced by the significant factors.
Our "statistical" findings cannot be automatically translated into the language of process-driven models. Biological mechanisms affected by warm temperature include the shortening of the GP of the mosquito and of the EIP of the virus, both of which increase the efficiency of WNV transmission [34-37]. The variations of air temperature in the range from 14 °C to 30 °C are critical, and this is exactly the range of temperatures observed in Southern European Russia in May-September (Figure 2a). Unfortunately, Russian data on WNV vectors and hosts are scarce. The WNV strains present in Astrakhan, Volgograd and Rostov have not been found outside Russia and Romania, and the populations of the main Russian WNV vectors, Cx. pipiens pipiens, Cx. pipiens molestus, and Cx. modestus, have some genetic and ecological differences from European and American populations [3,21]. Thus, the exact relation of the GP and EIP to temperature is not known for these vectors and WNV strains, although our field studies in Volgograd showed that Culex mosquito abundance in an epidemic season was higher in the years with a mild winter and a hot summer [3].
To our knowledge, the effect of winter temperatures below zero has not been previously noted. In contrast to most WNV endemic regions, winters in the Russian continental climate are very cold, with temperatures that can reach −30 °C (Figure 2a), which probably affects the survival of mosquitoes overwintering outdoors (both Culex imago and Aedes eggs) [3,23]. The development of autogenous Cx. pipiens overwintering in non-heated basements may also be affected by ambient temperature.
In general, the effects of precipitation on WNV transmission and human WND morbidity remain controversial [5,28,29,32,33]. Apparently, in contrast to more arid regions [58], the amount of rainfall in the epidemic season is not essential in Southern Russia. The influence of relative humidity and precipitation in the epidemic season is minimal, probably because in most cases WND in Russia is observed near major rivers (the Volga, the Don), artificial water reservoirs, lakes, or marshlands. In these areas mosquitoes can find suitable places for breeding and survival irrespective of the amount of precipitation.
Our analysis suggests that the weather effects are most critical in a temperate climate at the northern border of the WND area, for example in Volgograd, where the amplitude of fluctuation of human WND incidence is huge (Table 2). However, the cyclic changes in the WND incidence may be partly due to natural cyclic changes of avian host immunity [27].

Conclusions
The results of this study confirm and refine our preliminary hypotheses [2,3] and, with Russia's geographical, ecological, climatic and epidemiological peculiarities taken into account, are close to the results of Paz et al. [5] reporting that "For human morbidity, significant positive correlations were observed between a number of WND cases and temperature, with a geographic latitude gradient: northern ("colder") countries displayed strong correlations with a lag of up to four weeks, in contrast to southern ("warmer") countries, where the response was immediate". In Russia, as in Romania, and unlike what is observed in the southern countries of Greece, the Balkans or Italy, the first human cases of WND are recorded from mid-July and the incidence peaks in late August-early September. However, the incidence is more affected by higher temperatures in May to June. In Russia's continental climate, winter weather conditions are also important, since low temperatures in winter are related to lower WND incidence in the subsequent year. The WND incidence also tended to decrease after the years of major WND outbreaks, probably because of the effects of previously acquired avian herd immunity. On the whole, the WND epidemiological pattern clearly depends on climatic and weather factors, which makes it possible to develop and, as observation data accumulate, refine models predicting the changes in the risk of this emerging infection.