The Impact of Urbanization and Human Mobility on Seasonal Influenza in Northern China

The intensity of influenza epidemics varies significantly from year to year among regions with similar climatic conditions and populations. However, the underlying mechanisms of the temporal and spatial variations remain unclear. We investigated the impact of urbanization and public transportation size on influenza activity. We used 6-year weekly provincial-level surveillance data of influenza-like disease incidence (ILI) and viral activity in northern China. We derived the transmission potential of influenza for each epidemic season using the susceptible–exposed–infectious–removed–susceptible (SEIRS) model and estimated the transmissibility in the peak period via the instantaneous reproduction number (Rt). Public transport was found to explain approximately 28% of the variance in the seasonal transmission potential. Urbanization and public transportation size explained approximately 10% and 21% of the variance in maximum Rt in the peak period, respectively. For the mean Rt during the peak period, urbanization and public transportation accounted for 9% and 16% of the variance in Rt, respectively. Our results indicated that the differences in the intensity of influenza epidemics among the northern provinces of China were partially driven by urbanization and public transport size. These findings are beneficial for predicting influenza intensity and developing preparedness strategies for the early stages of epidemics.


Introduction
In temperate regions, the peak influenza season occurs in the winter months [1], and the scale of seasonal influenza epidemics can vary greatly between provinces and years [2,3]. However, little is known about the drivers of this variation. A better understanding of the factors that govern epidemic intensity is necessary for the public health system to accurately and promptly prepare for seasonal influenza epidemics.
Climatic factors are important drivers of influenza epidemics in temperate regions. Experimental studies have shown that a reduction in relative humidity improves the viability and transmission of influenza virus aerosols [4,5]. Epidemiological evidence also indicates that a reduction in relative humidity is associated with a higher risk of influenza A in the population [6]. Urbanization and human mobility are also believed to be drivers of influenza epidemics [7][8][9]. A simulation-based investigation in Australia highlighted that the increased peak prevalence and faster spreading rate of influenza pandemics could partially be attributed to an increase in population fractions living in cities [7]. A study of weekly incidence data from the United States found that the size of the urban population was positively associated with the incidence of city-level influenza and further showed that the intensity of influenza epidemics was shaped by urbanization and humidity [10]. Empirical evidence revealed that airline volume was a significant predictor of the spread of influenza between regions [8], and high mobility within countries (internal commuting) could accelerate epidemics [9]. However, these studies focused mainly on the impact of human mobility on interregional influenza epidemics. Evidence regarding the influence of human mobility on intracity or intraprovincial epidemics is limited.
Recent studies on influenza epidemics have revealed unexplained differences between provinces with similar urbanization and climate conditions in China [2,3,11], suggesting that there are other unidentified factors driving the differences in influenza epidemics between provinces. China is a vast country that comprises provinces with different climatic and economic backgrounds. These factors have led to varying levels of heterogeneity regarding population structure and mobility. Therefore, we assumed that the unexplained interprovince differences in influenza epidemic intensity may be caused by the heterogeneity of population mobility in provinces with similar climates and urbanization levels. Higher human mobility inside a province increases close contact between people, and thus the transmission of the influenza virus among people may be enhanced.
In the present study, we explored the above hypothesis by using 6 years (2012 to 2017) of data on weekly influenza-like disease and virus activity in 14 provinces in northern China.

Data
Ambient temperature and dew point temperature for each province were obtained from the China Meteorological Administration and used to calculate relative humidity with the R package 'humidity' (R software, version 4.2.1). Seasonal relative humidity was closely approximated by the function $r(t) = u \cos\left(2\pi(t - 5)/52\right) + m$, with amplitude u and mean level m over a 52-week period. Census data, including population size, urbanization, and public transportation data, were retrieved from the China National Bureau of Statistics [12]. Weekly influenza-like illness (ILI) incidence rate data and viral detection positive rate data for each province were obtained from the Chinese National Influenza Surveillance Network. Following previous studies [13,14], a proxy measure of the weekly incidence rate (referred to as the 'incidence rate') was obtained by multiplying the ILI percentage among patients visiting sentinel hospitals by the proportion of influenza-positive specimens. This proxy is considered a precise representation of the activity of influenza infection [15,16].
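The approximating function for relative humidity can be sketched in a few lines of Python. This is only an illustration, not the authors' R workflow: the function names are hypothetical, the phase (5 weeks) and period (52 weeks) are held fixed as in the formula, and the amplitude and mean used in the check are made-up values rather than fitted estimates.

```python
import math

def seasonal_humidity(t, u, m, phase=5, period=52):
    """Approximate relative humidity in week t: u * cos(2*pi*(t - phase)/period) + m."""
    return u * math.cos(2 * math.pi * (t - phase) / period) + m

def fit_amplitude_and_mean(weeks, rh, phase=5, period=52):
    """Least-squares estimates of u and m with the phase and period held fixed."""
    x = [math.cos(2 * math.pi * (t - phase) / period) for t in weeks]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(rh) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, rh))
    u = sxy / sxx
    m = y_bar - u * x_bar
    return u, m

# Synthetic series with hypothetical parameters (not values from the study).
weeks = list(range(1, 53))
synthetic_rh = [seasonal_humidity(t, u=-15.0, m=55.0) for t in weeks]
u_hat, m_hat = fit_amplitude_and_mean(weeks, synthetic_rh)
```

Because the model is linear in u and m once the cosine term is computed, ordinary least squares is sufficient and no iterative optimizer is needed.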

SEIRS Model
Following previous studies [3,10], we constructed a susceptible-exposed-infectious-removed-susceptible (SEIRS) compartmental model to work with province-level weekly incidence rate data (the ILI rate × the proportion of influenza-positive specimens). Susceptible (S) refers to individuals at risk of infection with influenza, representing approximately 90% of the total population. Exposed (E) refers to people in the latent period. Infectious (I) refers to people who are infected and infectious. Removed (R) refers to people who have recovered or died. The SEIRS model consists of a system of ordinary differential equations in which δ is the rate of reinfection, equal to 1/52; ε is the rate of infection after exposure, equal to 7; and γ is the rate of recovery from infection, equal to 7/2. The values of δ, ε, and γ were taken from the study by Dalziel et al. [10]. The generation time was assumed to be 3 days.
After a certain period [17,18], immunity wanes and recovered individuals return to the susceptible compartment. New infections are generated when susceptible individuals come into contact with infectious individuals, at a rate of βSI/N, where N is the population size. In a stable population, the incidence of new infections is governed by the transmission function $\beta(t) = g + \sigma e^{-\omega r(t)}$, where σ is the maximum gain in the transmission potential at zero relative humidity and ω is the rate of loss of viral viability caused by relative humidity. The transmission function is thus the sum of two parts: a seasonally invariant base transmission potential g, which represents transmission between individuals when the impact of climate is zero, and an additional humidity-dependent term, $\sigma e^{-\omega r(t)}$, which grows as relative humidity falls in Chinese provinces in winter and thus raises the risk of transmission between individuals.
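To make the compartmental structure concrete, the following is a minimal simulation sketch. It assumes the standard SEIRS flows without births or deaths (S to E at rate βSI/N, E to I at rate ε, I to R at rate γ, R to S at rate δ) and reads the transmission function as g + σ·exp(−ω·r(t)); both are assumptions on our part. The values of g, σ, ω, the humidity curve, the population size, and the initial number of infectious individuals are hypothetical, chosen only so that the example runs; the rates are treated as per-week rates and integrated with a simple Euler scheme.

```python
import math

DELTA = 1 / 52   # rate of return to the susceptible class (value from the text)
EPSILON = 7.0    # rate of progression from exposed to infectious (value from the text)
GAMMA = 7 / 2    # rate of recovery (value from the text)

def beta(r, g, sigma, omega):
    """Transmission function, read here as g + sigma * exp(-omega * r)."""
    return g + sigma * math.exp(-omega * r)

def simulate_seirs(n_weeks, humidity, g, sigma, omega,
                   N=1_000_000, i0=10, s0_frac=0.9, dt=0.01):
    """Euler integration of an SEIRS system; returns weekly new infections and final state."""
    S = s0_frac * N
    E = 0.0
    I = float(i0)
    R = N - S - E - I
    steps_per_week = round(1 / dt)
    weekly_new = []
    for week in range(n_weeks):
        b = beta(humidity(week), g, sigma, omega)  # beta held constant within a week
        new_infections = 0.0
        for _ in range(steps_per_week):
            infection = b * S * I / N
            dS = DELTA * R - infection
            dE = infection - EPSILON * E
            dI = EPSILON * E - GAMMA * I
            dR = GAMMA * I - DELTA * R
            S += dS * dt
            E += dE * dt
            I += dI * dt
            R += dR * dt
            new_infections += infection * dt
        weekly_new.append(new_infections)
    return weekly_new, (S, E, I, R)

def hypothetical_humidity(week):
    """Made-up seasonal relative humidity (%), for illustration only."""
    return 55.0 - 15.0 * math.cos(2 * math.pi * (week - 5) / 52)

weekly_new, final_state = simulate_seirs(
    52, hypothetical_humidity, g=3.0, sigma=4.0, omega=0.05)
```

The four derivatives sum to zero, so the total population is conserved up to floating-point error, which is a quick sanity check on any implementation of this kind.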

GLM Model
A generalized linear model (GLM) formulation of the SEIRS model was further constructed to explore the patterns of influenza dynamics by fitting the incidence data. The GLM avoids the difficulties associated with the nonlinearity of the SEIRS model. In the corresponding GLM, $Y_{nj} = \log[I_{nj}]$, where $I_{nj}$ is the number infected in week n of season j. We obtained $I_{nj}$ by multiplying the incidence rate (the ILI rate × the proportion of influenza-positive specimens) by the province population size. The parameter vector a is given by $a = [\log(g), \log(g + \sigma_1), \ldots, \log(g + \sigma_6)]$, which is estimated from the SEIRS model. The design vector $W_{nj}$, with seven elements, indicates whether the data point associated with (n, j) is in the off-peak regime or in one of the six influenza seasons. b and c are parameter vectors with two elements, $b = [\omega_1, \omega_2]$ and $c = [\rho_1, \rho_2]$, where b is obtained by fitting the relationship between relative humidity and viral viability and c is obtained from the observed incidence data. $X_{nj}$ is a design vector with two elements that indicate whether the point associated with (n, j) is in the off-peak or peak influenza season. $P_{nj}$ is the cumulative incidence, $P_{nj} = \frac{1}{N}\sum_{k=0}^{n} I_{kj}$. $O_{nj}$ is an offset term, $O_{nj} = \log(\langle S_{0j} \rangle) - \log(N) + \alpha Y_{nj}$, where $\langle S_{0j} \rangle = 0.9N$ is the expected population-level initial susceptibility each season, taken from the study by Wang and colleagues [19]. The influenza peak period was defined as extending from 5 weeks before to 5 weeks after the peak incidence rate observed in each season [20]. A detailed explanation of the GLM was provided in a previous study [10].
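The quantities entering the GLM can be derived from surveillance data as in the sketch below. It is an illustration under stated assumptions: the function names and the toy numbers are hypothetical, ILI and positivity are expressed as fractions, every weekly count is assumed to be positive so that the logarithm is defined, and the peak window is simply clipped at the ends of the season.

```python
import math

def infected_counts(ili_fraction, positive_fraction, population):
    """I_nj: incidence-rate proxy (ILI x positivity) scaled by population size."""
    return [ili * pos * population
            for ili, pos in zip(ili_fraction, positive_fraction)]

def peak_window(counts, half_width=5):
    """First and last week index of the peak period around the maximum count."""
    peak = max(range(len(counts)), key=counts.__getitem__)
    return max(0, peak - half_width), min(len(counts) - 1, peak + half_width)

def glm_covariates(counts, population, half_width=5):
    """Per-week Y = log(I), cumulative incidence P, and a peak-period flag."""
    start, end = peak_window(counts, half_width)
    rows = []
    cumulative = 0.0
    for n, i_n in enumerate(counts):
        cumulative += i_n
        rows.append({
            "week": n,
            "Y": math.log(i_n),            # requires i_n > 0
            "P": cumulative / population,  # (1/N) * sum of I_kj for k <= n
            "in_peak": start <= n <= end,
        })
    return rows

# Toy season of four weeks with made-up values (half_width shortened to 1).
counts = infected_counts([0.01, 0.02, 0.04, 0.03], [0.1, 0.2, 0.5, 0.3], 1000)
rows = glm_covariates(counts, population=1000, half_width=1)
```

The peak flag corresponds to the two-element design vector for off-peak versus peak weeks; building the seven-element season indicator would follow the same pattern with one flag per season.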

Estimation of Transmissibility
The weekly instantaneous reproduction number $R_t$ was estimated within the Bayesian framework for the branching process model proposed by Cori et al. [21], which extends the method of Fraser [22]. Fraser proposed that the renewal equation for the $R_t$ of an epidemic could be written as $R_t = I_t / \sum_{s=0}^{m} w_s I_{t-s}$, where $I_t$ is the number of reported cases (here, the incidence rate times a constant) between time t and time t + 1, and $w_s$ is the generation time distribution, such that $\sum_{s=0}^{m} w_s = 1$. The expected incidence at time t is Poisson distributed with mean $R_t \sum_{s=0}^{m} w_s I_{t-s}$. The transmissibility is assumed to be constant over the time period [t − τ, t] and is measured by $R_{[t-\tau,t]}$, with $\Lambda_t = \sum_{s=0}^{m} w_s I_{t-s}$ denoting the total infectiousness at time t. The generation time distribution is a gamma distribution with a mean of 3 days (SD = 1.5 d) and is assumed to be constant throughout an epidemic. A Bayesian framework with a gamma-distributed prior with parameters (a, b) was adopted for $R_{[t-\tau,t]}$, and the resulting posterior distribution of $R_{[t-\tau,t]}$ is also a gamma distribution [21].
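The estimation step can be illustrated with a short sketch of the gamma posterior described by Cori et al. [21]: with a Gamma(a, b) prior (shape a, scale b) and Poisson-distributed incidence, the posterior for the window [t − τ, t] has shape a + Σ I_s and scale 1/(1/b + Σ Λ_s). The function names, the prior values a = 1 and b = 5, and the toy incidence series and generation-time weights are assumptions for illustration and are not taken from the study.

```python
def total_infectiousness(incidence, w, t):
    """Lambda_t = sum over s of w[s] * I[t - s], truncated at the start of the series."""
    return sum(w[s] * incidence[t - s] for s in range(len(w)) if t - s >= 0)

def rt_posterior(incidence, w, t, tau, a=1.0, b=5.0):
    """Shape and scale of the gamma posterior of R over the window [t - tau, t]."""
    if t - tau < 0:
        raise ValueError("window starts before the first observation")
    window = range(t - tau, t + 1)
    shape = a + sum(incidence[s] for s in window)
    rate = 1.0 / b + sum(total_infectiousness(incidence, w, s) for s in window)
    return shape, 1.0 / rate

def rt_posterior_mean(incidence, w, t, tau, a=1.0, b=5.0):
    """Posterior mean of R over the window (shape times scale)."""
    shape, scale = rt_posterior(incidence, w, t, tau, a, b)
    return shape * scale

# Toy series that doubles every step; all generation-time mass is placed at lag 1.
incidence = [10, 20, 40, 80]
w = [0.0, 1.0]
r_hat = rt_posterior_mean(incidence, w, t=3, tau=1)
```

With these toy inputs each case produces about two cases one step later, so the posterior mean lands close to 2; the prior pulls it only slightly because the case counts are large relative to a.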

Regression Analysis with Transmissibility β and R t of Each Influenza Season
A simple linear regression model was used to explore the relationship between each driving factor and β. R-squared values ($R^2$) were used to quantify the impact of individual drivers. To make the results more intuitive, we used the same method to quantify the relationship between each driving factor and the estimated $R_t$. Public transportation size is strongly affected by population size and urbanization: in provinces with larger and more urbanized populations, public transport is more accessible and more people may use it. To represent human mobility more accurately, we therefore calculated a combined mobility index, h, from population size (PS), urbanization (U), and public transportation size (PT): $h = \log\left(\frac{PS \times U}{PT}\right)$. A higher value of h indicates more frequent population mobility. Differences in the Akaike information criterion (ΔAIC) were used to compare the relative quality of the GLMs, where higher values indicate models with poorer relative support.
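The regression step amounts to ordinary least squares with a single predictor. The sketch below computes the slope, intercept, and $R^2$, together with the combined index under the reading h = log(PS × U / PT); that reading of the formula, the function names, and the toy numbers are assumptions made for illustration.

```python
import math

def combined_index(population_size, urban_fraction, public_transport):
    """Combined mobility index, read here as log(PS * U / PT)."""
    return math.log(population_size * urban_fraction / public_transport)

def simple_linear_regression(x, y):
    """Least-squares slope, intercept, and R-squared for one predictor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = sxy ** 2 / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r_squared

# Toy predictor and outcome values (not data from the study).
slope, intercept, r2 = simple_linear_regression([1, 2, 3, 4], [1, 3, 2, 4])
```

An $R^2$ of 0.64 in this toy example would be read as the predictor explaining 64% of the variance in the outcome, which is how the percentages of explained variance are interpreted throughout.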

Results
Annual data on population size, urbanization, and public transportation size are presented in Table 1. As shown in Figure 1, influenza incidence varies with urbanization, climate, and transportation size. The maximum and mean incidence of influenza at peak times tended to be higher in provinces with larger magnitudes of urbanization and larger transportation sizes (Figure 1A-D).
Figure 1. Bubble charts demonstrating the incidence rate in provinces with different levels of urbanization, transport, and relative humidity (A-F). Provinces with higher max and mean incidence tended to have a higher magnitude of urbanization (A,B), lower relative humidity (C,D), and larger transportation size (E,F).
The SEIRS model was used to fit the influenza incidence rate data to explore the reasons for the temporal and spatial differences in the intensity of the influenza epidemics. As described in the Methods section, the transmission potential of each influenza season in each province could be obtained using the SEIRS model. Influenza epidemics vary in intensity by year and province, indicating a difference in transmission potential. The SEIRS model is a common method for fitting influenza time series data. However, the model is nonlinear; thus, minor changes in input parameters can cause significant changes in the prediction results. Therefore, a generalized linear formulation of the SEIRS model was constructed to work with province-level influenza incidence data. The results are shown in Figure 2. Ten fitted parameters (Supplementary Table S1) were obtained using province-level time series models. The results are shown for three provinces randomly selected from the total of 14: Beijing, Heilongjiang, and Ningxia. The observed and predicted influenza incidence were strongly correlated (Spearman's r = 0.83; Figure 3).
The early transmission potential obtained from the SEIRS model is only a mathematical value, and its practical significance is limited. We therefore further explored the factors that influenced the transmission potential. As shown in Figure 4, urbanization and public transportation size explained 1.28% and 27.62% of the variation in the annual transmission potential, respectively. Close contact between individuals is a prerequisite for influenza transmission, and the frequency of close contact can directly affect the transmission potential of the influenza virus between persons. However, a larger public transport system does not necessarily mean that contact between persons is more frequent: public transportation may simply be more extensive in areas with large populations to meet the commuting needs of residents.
The size of public transportation per unit population in urban areas can reduce the impact of population size to better represent the transmission potential of contact between people. Urbanization, population size, and public transportation size were used to calculate the combined index h, which indicated population mobility. The results showed a positive correlation between the combined index h and the annual transmission potential, $R^2$ = 0.1349 (p < 0.05).
Figure 4. Urbanization, transportation size, and combined index estimated from census data predicted transmission potential, maximum $R_t$, and mean $R_t$ during peak period of influenza season (A-L). Gray points show transmission potential, maximum $R_t$, and mean $R_t$ during peak period of influenza season estimated from the influenza incidence rate. Red lines refer to the prediction for transmission potential during peak period of influenza season (A-C). Purple lines refer to the prediction for maximum $R_t$ during peak period of influenza season (D-F). Green lines refer to the prediction for mean $R_t$ during peak period of the influenza season (G-L).

In addition, the associations of urbanization, public transportation size, and the combined index h with the maximum $R_t$ and mean $R_t$ during the peak period of each influenza season were examined. Urbanization and public transportation size were significant drivers of the maximum $R_t$ and mean $R_t$ during the peak period of each influenza season.

Discussion
Weekly surveillance data on outpatient ILI and virus activity from 14 provinces in northern China revealed that the annual transmission potential was positively associated with the size of public transportation. The maximum $R_t$ and mean $R_t$ of each influenza season during the peak period estimated from province-level incidence data were positively correlated with urbanization and the size of public transportation. The results presented here suggest that, at least in northern China, the intensity of the influenza epidemic may be governed by urbanization and intra-city human mobility.
Climate conditions, urbanization, and human mobility play a significant role in the spread of seasonal influenza. Relative humidity is an important environmental factor that affects the survival of influenza viruses in aerosols, and it is also a crucial driving factor for influenza seasonality [23]. In the current study, we controlled for the influence of climate on transmission by fitting approximate functions. Therefore, the annual transmission potential refers to the comprehensive influence of other factors, excluding climate conditions.
The instantaneous reproduction number ($R_t$) is typically used to characterize real-time transmissibility. A higher $R_t$ value indicates a higher transmission potential. The pathogen spreads when $R_t$ > 1 and is under control when $R_t$ < 1. We calculated the maximum $R_t$ and the mean $R_t$ of each influenza season for 14 provinces to quantify the transmissibility at peak times.
Further analysis showed that the size of public transport was positively correlated with the yearly transmission potential. This was consistent with the results of a previous simulation study [24]. Globally, people traveling by air cause the transmission of pandemic and seasonal influenza viruses, especially the A/H3N2 viruses [25][26][27][28][29][30]. At the regional scale, the spatial transmission of influenza is dominated by patterns of human contact, including school closure times and commute patterns [2,31,32]. In the current study, the maximum peak $R_t$ and the mean peak $R_t$ of the influenza season were positively associated with the size of public transport, which explained approximately 21% of the variance in the maximum peak $R_t$ and approximately 16% of the variance in the mean peak $R_t$. These results provide new evidence for understanding the impact of human mobility on influenza epidemics.
Previous studies have examined the impact of urbanization on the intensity or epidemic patterns of influenza [2,7,10]. However, the definitions of urbanization vary between studies. For example, in the study by Dalziel et al., urban population size is regarded as an indicator of urbanization [10]. In the studies by Lei and Zachreson, urbanization refers to the proportion of the total population living in urban areas [2,7]. In our study, different urbanization indicators were used to evaluate the relationship between urbanization and influenza transmission. Our results showed that the proportion of the total population living in urban areas was positively correlated with the maximum peak $R_t$ and the mean peak $R_t$ of the influenza season. However, we did not find a consistent positive relationship between urban population density or urban population size and influenza transmission (Supplementary Figures S2 and S3). Our findings suggest that the proportion of the total population living in urban areas may be a better indicator than urban population size or urban population density for studying the relationship between urbanization and influenza transmission in northern China.
Two reasons may be responsible for this result. First, regarding infectious diseases, current explosive trends in urbanization mean that more people are concentrated in urban regions. Coupled with the spread of suburbs, this can lead to large hubs in the commuter interaction network, which can cause a faster spread of infectious diseases between work and home [33,34]. Second, public transportation (buses and subways) is a common means of traveling in many cities around the world; thus, if an infected person interacts closely with other users of public transportation on a bus or subway, combined with insufficient ventilation and overcrowded conditions, it can increase the risk of influenza for other uninfected people and lead to the spread of influenza among colleagues and family members [35].
A higher transmission potential and $R_t$ indicate that the number of infected cases will increase in a short period of time, requiring increased surge capacity in the public health system, including primary care facilities and clinical laboratories [36]. The significance of our study is that, when the influenza season arrives, the intensity of the epidemic can be anticipated from urbanization and human mobility, allowing preparation for its medical and social impact in advance. Additionally, information on transmissibility at peak times is beneficial for mitigating influenza spread through vaccination and nonpharmaceutical interventions (NPIs) in the early stages of epidemics [37,38].
A potential limitation of our study was that school holidays were not included in our model. Previous studies have emphasized the importance of children in the spread of influenza, and the impact of school holidays and school closures on transmissibility [16,39,40]. Additionally, we did not have information on the impact of antigenic drift and host immunity on epidemics. However, a city-level analysis of the subtypes and antigenic characteristics of the influenza virus in Australia demonstrated that antigenic novelty has limited effects on epidemic size, suggesting that factors other than host immunity drive influenza epidemics at the local scale in temperate areas [41].

Conclusions
In conclusion, urbanization and human mobility were positively associated with the intensity of influenza. Increased commuting by public transport (including buses and subways) can accelerate the spread of influenza. Monitoring flows for public transport may be conducive to early detection and response to influenza epidemics.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/v14112563/s1. Figure S1: The map indicating the provinces studied in northern China. The color indicates the climatic domain: cold-temperate (black); mid-temperate (blue); warm-temperate (green); Table S1: Summary statistics on the means of fitted model parameters across provinces; Figure S2: The association for urban population density with transmission potential (A), maximum $R_t$ (B) and mean $R_t$ (C) in peak time of influenza season; Figure S3: The association for urban population size with transmission potential (A), maximum $R_t$ (B) and mean $R_t$ (C) in peak time of influenza season.

Data Availability Statement: Due to the potentially sensitive information included, the original dataset is not public and is available from the corresponding author upon reasonable request.