Short-Term Effects of Meteorological Factors and Air Pollutants on Hand, Foot and Mouth Disease among Children in Shenzhen, China, 2009–2017

Background: A few studies have explored the association between meteorological factors and hand, foot, and mouth disease (HFMD) with inconsistent results. Besides, studies about the effects of air pollutants on HFMD are very limited. Methods: Daily HFMD cases among children aged 0–14 years in Shenzhen were collected from 2009 to 2017. A distributed lag nonlinear model (DLNM) model was fitted to simultaneously assess the nonlinear and lagged effects of meteorological factors and air pollutants on HFMD incidence, and to further examine the differences of the effect across different subgroups stratified by gender, age and childcare patterns. Results: The cumulative relative risk (cRR) (median as reference) of HFMD rose with the increase of daily temperature and leveled off at about 30 °C (cRR: 1.40, 95%CI: 1.29, 1.51). There was a facilitating effect on HFMD when relative humidity was 46.0% to 88.8% (cRR at 95th percentile: 1.18, 95%CI: 1.11, 1.27). Short daily sunshine duration (5th vs. 50th) promoted HFMD (cRR: 1.07, 95%CI: 1.02, 1.11). The positive correlation between rainfall and HFMD reversed when it exceeded 78.3 mm (cRR: 1.41, 95% CI: 1.22, 1.63). Ozone suppressed HFMD when it exceeded 104 µg /m3 (cRR at 99th percentile: 0.85, 95%CI: 0.76, 0.94). NO2 promoted HFMD among infants and the cRR peaked at lag 9 day (cRR: 1.47, 95%CI: 1.02, 2.13) (99th vs. 50th). Besides, children aged below one year, males and scattered children were more vulnerable to high temperature, high relative humidity, and short sunshine duration. Conclusions: Temperature, relative humidity, sunshine duration, rainfall, ozone and NO2 were significantly associated with HFMD, and such effects varied with gender age and childcare patterns. These findings highlight the need for more prevention effort to the vulnerable populations and may be helpful for developing an early environment-based warning system for HFMD.

. Results of sensitivity analyses by changing the df for controlling long-term trends and seasonality from 7 to 9 and changing the maximum lag days to 21 days, showing the estimated overall cumulative effects over maxlag days (except for NO2 over lag0

Evaluating indicators
Quasi Akaike Information Criterion (QAIC): QAIC is modified version of AIC to deal with the over-dispersed Poisson model, which can be used to assess the model fit of the quasi-Poisson regression model. It considers both the statistical fitness of the model and the number of parameters fitted. Besides, Peng et al. discussed the performance of model selection criteria such as AIC, BIC and PACF in time series studies of air pollutants and death, and proved AIC to be a better choice [1] .
The sum of partial autocorrelation coefficients (Sum of PACF): The sum of PACF is the sum of the absolute value of the partial autocorrelation function (PACF) of the residuals (we set 14 days as the maximum lag in this analysis). The overall sum of PACF was used to compare the autocorrelation of residuals between different model choices. A smaller number means the residuals are less autocorrelated.

Section 1Choices of the degree of freedom for exposure and lag in cross-basis function
In the cb functions at the basic model, we set a 14d maximum lag and use natural cubic splines function both for exposure and lag. We varied the df of exposure from 2 to 6 and df of lag from 2 to 4 in order to select the best df combination with the smallest QAIC. Here are the results.

Section2 Controlling for long-term trends and seasonality
The time-series distribution curves of daily HFMD cases, meteorological factors and air pollutants show long-term trends and seasonality. The seasonal and long-term patterns in both the exposure and outcome data can dominate crude associations, making the short-term associations of interest hard to detect. By explicitly controlling for long-term patterns, the effects of exposure variable(s) of interest and the short-term variation around these long-term patterns can be explored. Previous literatures suggested that we could use the following methods to control for the long-term trends and seasonality [2] :

Option1: Spline function of time
It can model long-term patterns smoothly and capture seasonal patterns in a way that is allowed to vary from one year to the next. However, a controversial issue is determining how much smoothness, measured with degrees of freedom (df) of spline function [1] . Too few will fail to capture the main long-term patterns closely, whereas too many will result in a very 'wobbly' function which may compete with the variable of interest to explain the short-term variation of interest, widening confidence intervals of relative risk estimates.
In our studies, we varied the df of time splines from 2 to 13 df per year to explore its impact. We evaluated the results of different dfs based on QAIC and the sum of PACF. The results were displayed in table S1 and Figure S1. From the curves we found that when the df equaled 8 per year, both the QAIC and sum of PACF declined dramatically and remained at a relatively stable level. The results agreed with the previous literature [3] . Therefore, we set the df of time splines as 8 per year in our final model.

option2: Time stratified case-crossover design
We compared another method, the time-stratified case crossover design, to control for the unmeasured time-varying confounding. Compared with the method of spline function of time, this method was not suitable based on the results of the higher QAIC and sum of PACF.  Figure S1. QAIC and sum of PACF for different df of time splines.

Section3 Controlling for autocorrelations caused by disease transmission
When the absolute magnitudes of the PACF plot for the first two lag days were both less than 0.1, the basic model was regarded as adequate; if this criterion was not met, autoregression terms for lag up to 7 days were introduced to improve the model. After our exploration, we found that model with autoregression terms for lag up to 2 days was adequate (showed in table 2 and Figure S2).

Section 4 Results of univariate models
In the univariate analysis, the overall cumulative effects of temperature, relative humidity, rainfall and sunshine duration were significant ( Figure S3). The effect of temperature showed an obviously inverted V-shape and the cumulative RR value reached maximum 1.33 (95%CI: 1.24，1.42) at 28.7°C. However, the cumulative effect of air pressure and wind speed appeared to be nonsignificant on HFMD incidence. Figure S3. The estimated overall cumulative association between meteorological variables and HFMD occurrence over 14 days with their distributions, using a natural cubic spline DLNM in unimeteorological variable models. The red lines are the cumulative relative risks (medians as references), and the gray regions are 95% CIs. Shenzhen 2009-2017.   We developed multivariate models to control the influence of other meteorological factors. The first thing was to choose the appropriate function for the meteorological covariate. Considering the non-linear and lagged effect of meteorological factors reported in previous literature, here are some choices.

Section 5 Results of multivariate models
In consideration of the non-linearity: 1. Natural cubic spline (ns) of the current value, which was commonly used in previous studies. But this method didn't consider about the lag effect of these meteorological variables, which may be non-ignorable according to our previous results. So we considered the following other methods; In consideration of the lag effect: 2. The single lag value calculated by the function Lag in R package tsModel. But this method can only consider the single lag day effect at one time; 3. Moving Average, equaling mean of values of lag0 to maximum lag day, calculated by the function runMean in R package tsModel; 4. Exponential Moving Average, equaling exponentially-weighted mean of values of lag0 to maximum lag day, calculated by the function EMA in R package TTR; Considering both the non-linearity and lag effect: 5. Combining the above functions, natural cubic spline of Moving Average/ Exponential Moving Average; 6. Generating a cross-basis matrix (cb) for the two dimensions of exposure and lags, calculated by the function crossbasis in package dlnm. The cross-basis matrix can distinguish effects of different lag days through user-specified functions and model the relationship of lag and response more elaborately.
We compared the above functions and evaluated the results based on QAIC, taking relative humidity and sunshine duration as examples. Temperature was included in each model for its main influence on HFMD supported from previous knowledge and literature. Both for relative humidity and sunshine duration, QAIC was the smallest when the cross-basis matrix of the variable was applied (Table S4). Therefore, we chose the cross-basis function for meteorological covariate finally. Section 6 Sensitivity analysis Figure S4. Results of sensitivity analyses by changing the df for controlling long-term trends and seasonality from 7 to 9 and changing the maximum lag days to 21 days, showing the estimated overall cumulative effects over maxlag days (except for NO2 over lag0).