Transmission Patterns of Seasonal Influenza in China between 2010 and 2018

Background Understanding the transmission source, pattern, and mechanism of infectious diseases is essential for targeted prevention and control. Though it has been studied for many years, the detailed transmission patterns and drivers for the seasonal influenza epidemics in China remain elusive. Methods In this study, utilizing a suite of epidemiological and genetic approaches, we analyzed the updated province-level weekly influenza surveillance, sequence, climate, and demographic data between 1 April 2010 and 31 March 2018 from continental China, to characterize detailed transmission patterns and explore the potential initiating region and drivers of the seasonal influenza epidemics in China. Results An annual cycle for influenza A(H1N1)pdm09 and B and a semi-annual cycle for influenza A(H3N2) were confirmed. Overall, the seasonal influenza A(H3N2) virus caused more infection in China and dominated the summer season in the south. The summer season epidemics in southern China were likely initiated in the “Lingnan” region, which includes the three most southern provinces of Hainan, Guangxi, and Guangdong. Additionally, the regions in the south play more important seeding roles in maintaining the circulation of seasonal influenza in China. Though intense human mobility plays a role in the province-level transmission of influenza epidemics on a temporal scale, climate factors drive the spread of influenza epidemics on both the spatial and temporal scales. Conclusion The surveillance of seasonal influenza in the south, especially the “Lingnan” region in the summer, should be strengthened. More broadly, both the socioeconomic and climate factors contribute to the transmission of seasonal influenza in China. The patterns and mechanisms revealed in this study shed light on the precise forecasting, prevention, and control of seasonal influenza in China and worldwide.


Introduction
As a major cause of human morbidity and mortality every year, seasonal influenza epidemics lead to about 88,100 excess respiratory deaths annually in China [1]. Seasonal influenza epidemics are mostly initiated from a source region and then spread worldwide rather than descending from a previous epidemic in the same region [2]. On a global scale, influenza A viruses circulating in Southeast Asia have been proposed as the source of seasonal epidemics in other regions [2,3]. In China, new antigenic variants usually first emerge from the south [4]. China is a geographically, economically, and climatologically diverse country, and prior studies have suggested that northern China experienced influenza epidemics only in the winter-spring months, while in southern China, influenza is prevalent throughout the year with a clear peak in both the summer and winter [5,6]. However, a detailed transmission pattern and its potential mechanism are still missing, which are critical for designing targeted surveillance and control strategies.
Prior studies have suggested that influenza spatial spread results from the combined effect of climate factors and population mobility patterns [6][7][8]. Experimental studies suggest that influenza viruses favor low humidity and low temperature [9]. A human population-level epidemiological study also indicated that absolute humidity drove the seasonal variations of influenza transmission [10] and the 2009 pandemic influenza spatial transmission [11] in the continental United States. The roles of human mobility in the spatial spread of seasonal influenza have been explored by several studies. On a global scale, the worldwide air transportation network serves as the predominant channel for the dissemination of the pandemic and seasonal influenza viruses [2,12,13]. On the regional scale, short-distance work commutes are a major driver of the spread of seasonal outbreaks in the US, though longer-range air traffic has been implicated in playing an important role as well [14][15][16][17]. Different from the transportation pattern in the United States, where personal vehicular movement and long-range airline travel play the predominant roles in the within-region and inter-regional transport, respectively [16], in China, passenger transport, particularly the high-speed railway, has developed rapidly in the past decade and could play an important role in the inter-province human mobility.
In this study, based on the new comprehensive surveillance data of seasonal influenza in China between 2010 and 2018, the more detailed transmission patterns of seasonal influenza in continental China were explored, in addition to the potential mechanisms based on a combination of epidemiological, genetic, socioeconomic, and meteorological data.

Influenza Surveillance Dataset
The national sentinel hospital-based influenza-like illness (ILI) surveillance system in China was established in 2000, with 99 sentinel surveillance hospitals by 2008 [18]. After the 2009 influenza pandemic, the system was expanded to 554 sentinel hospitals with all 31 provinces implementing year-round surveillance. The weekly reports of influenza cases were downloaded from the Chinese Centre for Disease Control and Prevention (http://www.chinaivdc.cn/cnic/, accessed on 24 October 2019). The dataset is based on the number of specimens tested in 554 sentinel hospitals located in 31 Chinese administrative regions; here, the major administrative regions included provinces, autonomous regions, and municipalities, and for simplicity, termed "province" hereafter. Tibet is excluded from our analysis because of its relatively poor surveillance system. The period from 1 April 2010 to 31 March 2018 was studied in this work, as the quality of the surveillance data improved dramatically after the 2009 influenza A(H1N1) pandemic [19]. The dataset provided the weekly number of visits in each hospital, the number of influenza-like illness (ILI) cases, the number of specimens tested, the number of laboratory-confirmed influenza cases for seasonal influenza A(H1N1)pdm09, A(H3N2), B/Yamagata, and B/Victoria (Table 1). According to the Chinese National Influenza Surveillance Guideline, an ILI case was defined as a patient with a fever of ≥38 • C, accompanied by cough or sore throat [20].

Sequence Data
The hemagglutinin (HA) nucleotide sequences of the seasonal A(H3N2) influenza viruses from diverse regions of Mainland China between 1 April 2010 and 31 March 2018 were downloaded from the Global Initiative on Sharing Avian Influenza Data (GISAID, https://www.gisaid.org, accessed on 24 October 2019). The sequences were then aligned with MUSCLE v3.7 using its default settings. Only those sequences with the full date (year, month, and day) and location (province) information were used. Furthermore, the outliers and redundant data were manually removed. Finally, 1403 HA sequences of H3N2 were analyzed in this study. The sequence datasets of the A(H1N1)pdm09, B/Yamagata, and B/Victoria influenza viruses were obtained and processed in the same way, with sequence numbers 828, 647, and 431, respectively.

Climate and Demographic Data
To assess the role of the putative drivers of spatial influenza transmission, the provincelevel demographic and climate were collected during the study period. For the demographic data, the railway data as a proxy of inter-provincial human mobility were used [21][22][23] (see Supplementary Materials Part S1 for more details). For the climate data, temperature and relative humidity were used. The daily climate data for each participating weather station during the study period were obtained from the website of the China Meteorological Administration (http://data.cma.cn, accessed on 24 October 2019). The province-level daily meteorological indicators were calculated as the averages of the data in all the weather stations in that province, and the weekly indicators were calculated as averages of the daily values in that week.

Time Series Analysis
From the data streams, same as previous studies, a proxy for the weekly strain-specific incidence rate (henceforth simply termed "incidence rate") of each subtype was used, a measurement more precisely representing influenza infections [24][25][26]. The weekly incidence rate for each subtype was defined as the ILI rate among the visiting patients in sentinel hospitals multiplied by the viral detection rate for each subtype individually. When the number of specimens is small, the incidence rate for a specific week could be biased as an outlier. To remove the outliers, a threshold of 30 was used for the number of specimens, i.e., when the weekly number of specimens was lower than 30, the value of the outlier was replaced by the mean value of the same week in other years.
Since China is located in the Northern Hemisphere, an epidemiological annual cycle was defined as the period from 1 April (calendar week 14, epidemiological week 1 in our notation) to 31 March of the next year [6]. Several studies have reported that seasonal influenza epidemics experience a semi-annual cycle pattern in southern China and an annual cycle in northern China [5,6], so the annual cycle was also divided into the summer season and winter season, which are from calendar week 14 to 39 and from calendar week 40 to week 13 of the next year, respectively [6]. For simplicity, an epidemic was referred to as a summer wave or a winter wave if the peak time occurred in the summer season or winter season defined above. The traditional wavelet analysis was also applied to explore the seasonality of seasonal influenza in China (see Supplementary Materials Part S1 for more details). A cumulative incidence rate was defined as the sum of the weekly incidence rate for a specific time period [24], and the prevalence of a specific subtype at a given period was defined as the proportion of the cumulative incidence rate of this subtype during this period.
The onset of seasonal influenza for a specific region was defined as the first of three consecutive weeks with increasing smoothed weekly incidence rates and exceeding a prescribed baseline [11,26]. Since fluctuations in the surveillance data may play a particular role in estimating the onset timing of an epidemic, a 3-week moving average was used to smooth the surveillance data (see Supplementary Materials Part S1 for more details). The baseline was set as the 40% quantile of the non-zero weekly incidence rate of an influenza strain or a combined strain [26]. The ending of an epidemic was defined as the first of three consecutive weeks with a smoothed weekly incidence rate below the baseline.

Evolutionary Analysis
To characterize the transmission pattern between southern and northern China in genomic ways, the most parsimonious evolutionary path analysis was used [27], which defines each virus's most likely ancestor to have the highest sequence similarity among all older viruses. All the most parsimonious evolutionary paths were traced, and the transition events between northern and southern China both within the same season or across the different seasons were recorded. In order to avoid sampling bias, bootstrap was applied 100 times via random sampling to 3 sequences per season from each province each time. The Hamming distance was used as an indicator of the genetic distance in this study.

Clustering Analysis
Hierarchical clustering with Ward's minimum variance method [28] was used to identify regional clusters, relying on the squared Euclidian pairwise difference (epidemiological distance) between the Spearman correlations of incidence rate time series [29] among the provinces as the distance metrics [30].

Effective Distance
The identification of the initiating area of an epidemic from an epidemiological approach relied on the concept that an epidemic's arrival time in one region is linearly related to the effective distance from the source [12,31]. An effective distance between region i and region j d ij is defined as where T ij = T ij /T j , T ij is the number of those individuals who traveled from region i to region j per time unit, and T j = ∑ N i=1 T ij . Additionally, in this study, the number of people traveling by railway was used to estimate inter-province transport in China (see Supplementary Materials Part S1 for more details).

Statistical and Regression Analysis
To reveal the determinants for the spatial spread of influenza, the relative contributions of different potential drivers were quantified using a fixed-effect linear regression model, with an interaction term between temperature and relative humidity, which were believed to be correlated. A set of independent variables were firstly normalized via min-max scaling. From both the spatial scale and temporal scale, the dependent variable was the onset time, and it could be compared with the onset times of the region of origin or the first epidemiological year as references. The independent variables were the daily human mobility from the potential regional origin, the differences in the mean temperature and humidity in the summer months, and the latitude and longitude. The epidemiological year 2010 was excluded for analysis because the epidemics in the summer months this year were much later than those in other years, partly due to the influence of the 2009 influenza A(H1N1) pandemic. Their goodness-of-fit was measured with the Akaike information criterion (AIC), and a backward-reduced AIC-variable selection procedure tested whether the best fitting bivariate relationship could be further improved by removing the independent variables. The statistical difference was tested using t-tests, and the two-sided p values are reported. The correlation coefficient was determined based on the Pearson correlation method. The data were analyzed using R software, version 3.6.1.  (Table 1).  Figure S1).

Epidemiology of Seasonal Influenza in China
waves of influenza B/Yamagata in southern and northern China, respectively, all in the winter season. There were five and three winter epidemic waves of influenza B/Victoria in southern China and northern China, respectively. Our detailed wavelet analysis confirmed that the influenza A(H1N1)pdm09 and B viruses followed a single annual cycle with peaks during the winter season for both northern and southern China, while the influenza A(H3N2) virus experienced a semi-annual cycle in southern China, with additional peaks in the summer season (Supplementary Figure S1).  The heatmaps for the mean weekly incidence rate of influenza A(H1N1) pdm09, A(H3N2), B/Yamagata, and B/Victoria at the province level showed the same consistent pattern: influenza A(H1N1)pdm09 and B experienced epidemics in winter, and for the influenza A(H3N2) virus, there was an additional summer peak in the south (Figure 2). During the study period, the disease burden caused by the influenza A(H3N2) virus was the largest (the proportions of A(H1N1)pdm09, A(H3N2), B/Yamagata, and B/Victoria were 27.1%, 43.6%, 17.4%, and 11.9%, respectively). In the summer epidemics, for the 15 provinces in southern China, on average, the influenza A(H3N2) virus contributed to 69.1% of the epidemics (ranging from 43.2% to 87.3%) (Supplementary Figure S2A). For the winter epidemics, on average, influenza A(H1N1)pdm09, A(H3N2), and B each contributed to one-third, respectively (Supplementary Figure S2B). At the same time, for the winter epidemics, the prevalence of influenza B decreased toward the north (R 2 = 0.53, p = 4.9 × 10 −6 ), while the prevalence of influenza A(H3N2) increased with latitude (R 2 = 0.74, p = 1.3 × 10 −9 ). provinces in southern China, on average, the influenza A(H3N2) virus contributed to 69.1% of the epidemics (ranging from 43.2% to 87.3%) (Supplementary Figure S2A). For the winter epidemics, on average, influenza A(H1N1)pdm09, A(H3N2), and B each contributed to one-third, respectively (Supplementary Figure S2B). At the same time, for the winter epidemics, the prevalence of influenza B decreased toward the north (R 2 = 0.53, p= 4.9 × 10 −6 ), while the prevalence of influenza A(H3N2) increased with latitude (R 2 = 0.74, p=1.3 × 10 −9 ).

Spatial Transmission of Seasonal Influenza in China
Based on the epidemiological surveillance data, there was a latitude gradient for the onset timing of influenza A(H3N2) virus epidemics in southern China in the summer months, with regions in the lower latitude starting the epidemic earlier (R 2 = 0.50, p = 0.0032, Table 2). However, there was no longitude gradient (R 2 = 0.05, p = 0.43, Table 2). For the winter-spring waves, there was a weak longitude gradient in the influenza A(H3N2) onset timing (R 2 = 0.13, p = 0.047) and a latitude gradient in the influenza B/Yamagata onset timing (R 2 = 0.51, p = 8.7 × 10 −6 , Table 2). Based on the evolutionary relationship among viruses, for influenza A(H1N1)pdm09, B/Yamagata, and B/Victoria, which mainly experienced epidemics in the winter months, the regions in the south played more important seeding roles based on the fact that more transition events were observed from the south to the north (Supplementary Tables S1-S3). For influenza A(H3N2) in the winter from both the north and south, the viruses mostly originated from the previous summer season in southern China, rather than from the previous winter season (Supplementary Table S4). Similarly, the regions in the south play more important roles in maintaining the circulation of the influenza A(H3N2) virus during both the summer and winter seasons (Supplementary Table S5).

Initiating Area and Drivers for Transmission of Influenza A(H3N2) in the Summer
The three most southern provinces (Guangdong, Guangxi, and Hainan, defined as the "Lingnan" region) shared similar influenza epidemiological patterns and were clustered together ( Figure 3A,B). Based on the spatial gradient observed above for the influenza A(H3N2) onset time during the summer season in the south, the "Lingnan" region was the potential initiating area of influenza A(H3N2) epidemics. We tested our hypothesis based on two analyses: One is based on the concept of effective distance, and the other one is based on the viruses' evolutionary relationship (see Section 2 for more details). There was a significant positive relationship between the influenza A(H3N2) onset time and the effective distance from the "Lingnan" region ( Figure 3C). Additionally, the genetic distance from the "Lingnan" region was linearly increased with the latitude (Figure 3D), both of which supported our hypothesis.
According to the regression analysis between the onset time and the three candidate drivers, the climate factors became the statistically significant driver for the spread of seasonal influenza A(H3N2) virus on the spatial scale (Table 3, p < 0.05). The positive correlation means that the larger the difference in the mean temperature and relative humidity in the originating "Lingnan" region, the later the arrival of the epidemics. Additionally, from a temporal scale, human mobility became a significant driver in addition to the relative humidity (Table 3, p < 0.05). The negative correlation regarding human mobility indicated that the greater the degree of human mobility from a region of origin, the earlier the arrival of the influenza epidemics. Moreover, only relative humidity was the statistically significant driver in both the spatial and temporal scales. According to the regression analysis between the onset time and the three candidate drivers, the climate factors became the statistically significant driver for the spread of seasonal influenza A(H3N2) virus on the spatial scale (Table 3, p < 0.05). The positive correlation means that the larger the difference in the mean temperature and relative humidity in the originating "Lingnan" region, the later the arrival of the epidemics. Additionally, from a temporal scale, human mobility became a significant driver in addition to the relative humidity (Table 3, p < 0.05). The negative correlation regarding human mobility indicated that the greater the degree of human mobility from a region of origin, the earlier the arrival of the influenza epidemics. Moreover, only relative humidity was the statistically significant driver in both the spatial and temporal scales.  Table 3. Regression analysis between onset time and temperature, relative humidity, and transportation of seasonal influenza A(H3N2) epidemics on spatial and temporal scales. Interaction term −0.5 (1.6) 7.5 × 10 −1 −14.5 (7.9) 7.5 × 10 −2

Discussion
The motivation for this study was to reveal the detailed transmission patterns of seasonal influenza to assist the ongoing efforts for real-time forecasting and targeted control strategies in China. Based on the availability of rich surveillance data between 2010 and 2018, a comprehensive analysis of the seasonal influenza epidemics in China was carried out, and the pattern and mechanisms were identified. There was a single annual peak for influenza A(H1N1)pdm09 and B in the winter season and a semi-annual periodicity for A(H3N2) in southern China with an additional peak in the summer. The "Lingnan" region in the southern continental area of China, which includes Hainan, Guangxi, and Guangdong Provinces, was the potential initiating region for seasonal influenza A(H3N2). Every year, seasonal influenza A(H3N2) epidemics started around April in the "Lingnan" region and spread to other provinces in southern China within 2-3 months, driven by both the climate and human mobility. When a suitable climate arrived, the influenza A(H3N2) viruses circulated in the summer season from southern China were rapidly transmitted to the whole country, leading to the winter epidemics. Overall, the regions in the south played more important roles in maintaining the circulation of all the seasonal influenza strains in China, which is well-aligned with the source-sink models proposed previously [3,32,33]. Thus, influenza surveillance in the "Lingnan" region should be strengthened. Additionally, the surveillance data from the "Lingnan" region could be used to inform the arrival time and intensity of influenza epidemics in other regions in southern China a few weeks ahead, combined with the human mobility and climate data. The influenza surveillance information in the "Lingnan" region could also be used to predict the dominant influenza subtype in other provinces for the same epidemiological year, which could provide a basis for better recommendations for influenza vaccine strains and the use of antiviral drugs. These are helpful for the precise prediction and prevention of seasonal influenza in China.
Prior work has suggested that influenza viruses favor low humidity and low temperature, as indicated in an experimental study [9] as well as a human population-level epidemiological study [34]. Based on the data from our study, there was another suitable climate parameter in the summer for the effective transmission of H3N2, with high temperatures (20 • C-30 • C) and high levels of relative humidity (75-90%) (Supplementary Figure  S3A). Furthermore, there was a clear spatial gradient regarding the spread of the seasonal influenza A(H3N2) virus initiated by the "Lingnan" region, which was mostly determined by the climate factors, i.e., the regions meeting favorable climate conditions were affected first (Supplementary Figure S3B). The seasonal influenza epidemics in the United States also likely originated in the South, including cities such as Miami and Houston [15]. Brazil also experiences a transmission gradient from the tropical to temporal regions [7]. These patterns are consistent with the previous knowledge of the importance of tropical regions for initiating seasonal influenza epidemics [2,3].
The role of human mobility in the spatial spread of seasonal influenza is crucial. In this study, we showed that the increasing volume of human travel has the potential to accelerate influenza spread on a temporal scale in China (Table 3). Additionally, northern China does not have favorable climate conditions (Supplementary Figure S3A) for the seasonal influenza A(H3N2) virus in the summer, and the summer peaks in northern China were always followed by the intense summer peaks of influenza A(H3N2) in southern China (Figure 1), implying that the summer peaks in northern China might be caused by the population from southern China.
In this study, we only analyzed the influenza surveillance data between 2010 and 2018, and the emergency of the COVID-19 pandemic in 2020 largely disturbed the influenza transmission due to the non-pharmaceutical interventions (NPIs) used to control the COVID-19 pandemic [35,36]. However, we believe that the results from this study could also be applicable after the COVID-19 pandemic when the non-pharmaceutical interventions are lifted. Unfortunately, we could not perform a recent analysis from the start of the pandemic due to the lack of data, as influenza epidemics almost disappeared in China during this period.
Our study is prone to several limitations. The mobility data we used were based on inter-province railway transportation data. We argue that although road travel plays an important role in population movement, it is mainly short-distanced and is not increasing.
Although human mobility due to air travel has increased, it is still several magnitudes less than rail travel. Future work needs to collect more detailed population movement data and comprehensively peruse all those data. Additionally, socioeconomic events including school opening and population size and density need to be considered in future analyses to reveal more detailed transmission patterns and additional potential drivers [11,15,17]. Additionally, the potential interaction between the different strains of seasonal influenza was reported [24] and observed by the data (i.e., a reverse spatial gradient between influenza B and A(H3N2) in the winter), and these confounding factors should be considered in the future. Finally, evolutionary evidence is supplementary to epidemiological support, and more genetic sequences are needed for robust and detailed analyses.
Supplementary Materials: The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/v14092063/s1, Part S1: Methods; Table S1: Transmission analysis within the same winter season for influenza A(H1N1)pdm09 based on sequences; Table S2: Transmission analysis within the same winter season for influenza B/Yamagata based on sequences; Table S3: Transmission analysis within the same winter season for influenza B/Victoria based on sequences; Table S4: Transmission analysis between seasons for influenza A(H3N2) based on sequences; Table S5: Transmission analysis within the same season for influenza A(H3N2) based on sequences; Figure