4.1. Analysis Framework
The model set out above is applied to spatial interpolation of asthma and pollution, and to estimation of the impact of pollution on prevalence in Outer NE London. Bayesian estimation is carried out using the WinBUGS package. Observed asthma data (ICD10 codes J45-J46) consist of prevalence counts yi for GP practice areas in 2010–2011. These data provide source area observations from which interpolated prevalence estimates are made for neighbourhoods (target areas). Additional collateral data are asthma hospitalisations Mk for neighbourhoods, and air quality indices xk (log transformed), also for neighbourhoods. Note that higher x values denote worse air quality.
Figure 1 and Figure 2 contain maps of hospitalisation rates (crude rates per 1,000 population) and the logged air quality indices.
Figure 1.
Asthma hospitalisations, rates per 1,000, LSOAs, outer NE London.
Figure 2.
Index of low air quality, LSOAs, outer NE London.
There are N = 189 GP areas, with an average population of 5,380, and K = 562 neighbourhoods (LSOAs). Neighbourhood and GP area locations are represented by population weighted centroids. There are J = 84 points at 2 km spacing in the discrete grid covering the region.
Of importance for practical application of the methodology of Section 3 is the stability or otherwise of inferences regarding the distribution of, and spatial patterning in, the interpolated asthma prevalence rates in neighbourhoods. Four alternative models are applied, differing in kernel form (bivariate normal vs. bivariate exponential) and in the random process (normal vs. Student t). The same assumption regarding the kernel-process combination is applied both to the prevalence process z1 and the risk factor process z2. There is a wide range of possible model options: different kernels, different forms of random grid effect (wj), and differing regression forms for ρ on x. The selected four models are intended as a representative subset of the possibilities.
All four models assume standard kernels and a linear regression effect of ecological covariates x on prevalence risk ρ. Thus Model 1 combines a standard normal kernel (α = 1) with a normal process for wj, with mean 0 and unknown variance. Model 2 combines a standard exponential kernel (η = 1) with the same form of normal process for wj. As mentioned by Clark et al. [13], exponential kernels allow for more leptokurtic (more peaked and fat tailed) densities than normal kernels, and so exponential kernels may represent a more flexible assumption for latent prevalence rates ρ and air quality rates x. Model 3 combines a standard normal kernel with a Student t process for wj with mean 0 and five degrees of freedom, and Model 4 combines a standard exponential kernel with a Student t process with five degrees of freedom. The Student t with low degrees of freedom allows for heavier tails than the normal.
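The four kernel-process combinations can be illustrated with a minimal sketch of a discrete process convolution. This is not the WinBUGS code used for the analysis. The kernel forms (exp(−α d²/2) for the normal kernel and exp(−η d) for the exponential kernel), the 12 × 7 grid layout, the randomly placed centroids and all function names are assumptions made for illustration only.

```python
import numpy as np

def kernel_weights(sites, grid, kind="normal", rate=1.0):
    """Kernel weights between sites (n x 2) and grid points (J x 2).

    Assumed kernel forms (illustrative, not taken from Section 3):
      normal:      exp(-rate * d^2 / 2)
      exponential: exp(-rate * d)
    """
    d = np.linalg.norm(sites[:, None, :] - grid[None, :, :], axis=2)
    if kind == "normal":
        return np.exp(-rate * d**2 / 2.0)
    if kind == "exponential":
        return np.exp(-rate * d)
    raise ValueError(f"unknown kernel: {kind}")

def draw_grid_effects(rng, n_grid, process="normal", scale=1.0, df=5):
    """Draw grid effects w_j from a normal or Student t (df = 5) process."""
    if process == "normal":
        return rng.normal(0.0, scale, size=n_grid)
    if process == "t":
        return scale * rng.standard_t(df, size=n_grid)
    raise ValueError(f"unknown process: {process}")

# Kernel and process used by each of the four models
MODELS = {
    1: ("normal", "normal"),
    2: ("exponential", "normal"),
    3: ("normal", "t"),
    4: ("exponential", "t"),
}

rng = np.random.default_rng(0)

# Hypothetical 12 x 7 grid at 2 km spacing, giving J = 84 points
gx, gy = np.meshgrid(np.arange(12) * 2.0, np.arange(7) * 2.0)
grid = np.column_stack([gx.ravel(), gy.ravel()])

# Hypothetical centroids for K = 562 neighbourhoods
sites = rng.uniform([0.0, 0.0], [22.0, 12.0], size=(562, 2))

# One simulated latent surface z(s) = sum_j kernel(s - u_j) * w_j per model
surfaces = {}
for model, (kind, process) in MODELS.items():
    w = draw_grid_effects(rng, len(grid), process)
    surfaces[model] = kernel_weights(sites, grid, kind) @ w
```

Under these assumed forms the exponential kernel decays more slowly at larger distances than the normal kernel, so each grid effect influences a wider set of neighbourhoods.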
Bayesian inference requires that prior densities on parameters be specified. Since there is no extensive prior evidence, relatively diffuse options are chosen. Gamma priors with scale 1 and index 0.001 are assumed for unknown precisions, and normal priors with mean 0 and variance 1,000 are assumed for regression parameters {β, γ, δ}. Inferences are based on the second halves of two chain runs of 25,000 iterations, with convergence assessed using Brooks-Gelman criteria [17].
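As a rough illustration of this kind of convergence check, the sketch below computes the basic potential scale reduction factor for one scalar parameter from the second halves of two chains. It is a simplified version of the diagnostic: the Brooks-Gelman criteria [17] include further refinements not shown here, and the simulated chains and the function name `psrf` are assumptions for illustration.

```python
import numpy as np

def psrf(chains):
    """Basic potential scale reduction factor for one scalar parameter.

    chains: array of shape (n_chains, n_iter). The first half of each
    chain is discarded, mirroring inference from second halves only.
    """
    chains = np.asarray(chains, dtype=float)
    kept = chains[:, chains.shape[1] // 2:]
    n = kept.shape[1]
    between = n * kept.mean(axis=1).var(ddof=1)   # between-chain variance B
    within = kept.var(axis=1, ddof=1).mean()      # within-chain variance W
    pooled = (n - 1) / n * within + between / n   # pooled variance estimate
    return np.sqrt(pooled / within)

rng = np.random.default_rng(1)

# Two simulated chains of 25,000 iterations sampling the same distribution
mixed = rng.normal(0.0, 1.0, size=(2, 25_000))

# The same chains with the second shifted, mimicking failure to converge
stuck = mixed + np.array([[0.0], [3.0]])

r_mixed = psrf(mixed)
r_stuck = psrf(stuck)
```

Values close to 1 are consistent with the chains having reached a common distribution, while values well above 1 indicate that they have not.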
While stability of inferences is an important aspect in assessing the interpolated prevalence data, fit to the three observed datasets is, of course, also central. Table 1 uses the DIC criterion to assess how well the method estimates the observed small-area counts or rates, whether obtained on one area frame (GP service areas) or the other (residential neighbourhoods). The DIC values are based on the Poisson deviance for yi (prevalence counts observed over GP service areas) and for Mk (hospitalisations observed over neighbourhoods), and on the deviance for the air quality data xk (observed over neighbourhoods), for which the mean is µk = η1 + z2(tk).
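For reference, the standard (saturated) Poisson deviance for counts y with fitted means μ is 2 Σ [y log(y/μ) − (y − μ)], and the DIC adds a complexity penalty pD to the posterior mean deviance. The sketch below shows that calculation under stated assumptions: the counts and posterior draws are simulated, the deviance is evaluated at the posterior mean of the Poisson means, and the function names are hypothetical. The deviance definition and plug-in values used by WinBUGS may differ in detail.

```python
import numpy as np

def poisson_deviance(y, mu):
    """Saturated Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu))."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    # The term y * log(y / mu) is taken as 0 when y == 0
    ratio = np.where(y > 0, y / mu, 1.0)
    return 2.0 * np.sum(y * np.log(ratio) - (y - mu))

def dic(y, mu_draws):
    """DIC from posterior draws of Poisson means, shape (n_draws, n_areas).

    Returns (DIC, pD), with pD = mean deviance minus the deviance at the
    posterior mean of mu (an assumed choice of plug-in estimate).
    """
    mu_draws = np.asarray(mu_draws, dtype=float)
    mean_deviance = np.mean([poisson_deviance(y, mu) for mu in mu_draws])
    deviance_at_mean = poisson_deviance(y, mu_draws.mean(axis=0))
    p_d = mean_deviance - deviance_at_mean
    return mean_deviance + p_d, p_d

# Simulated example: four areas, 500 posterior draws of the Poisson means
rng = np.random.default_rng(2)
y = np.array([3, 0, 5, 2])
mu_draws = rng.gamma(shape=3.0, scale=1.0, size=(500, 4))
dic_value, p_d = dic(y, mu_draws)
```

Lower DIC values indicate a better trade-off between fit and complexity, which is how the columns of Table 1 are compared.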
4.2. Results
Defining a composite measure of fit is complicated by the different scales of the three sets of observations (and hence deviances), but it can be seen from Table 1 that no model provides a best fit across all observations. The best fit for the y-data (generated by modelled prevalence rates ρi in GP areas) and M-data is provided by a normal kernel κ combined with a normal process w (Model 1), while the best fit for the x-data is provided by an exponential kernel combined with Student t errors (Model 4). Such results illustrate that default choices such as normal kernels may not necessarily provide the best performance.
Table 1.
Deviance information criteria, by observed data.
| Model | y | M | x |
|---|---|---|---|
| 1 | 376.5 | 1,096.9 | 3.234 |
| 2 | 378.1 | 1,107.7 | 3.223 |
| 3 | 377.5 | 1,104.1 | 3.220 |
| 4 | 379.2 | 1,098.4 | 3.207 |
Table 2 summarises estimates of the main structural parameters across sub-models. These are β1 and β2 in the GP area and neighbourhood prevalence equations, the intercept γ1 and the “reflexive effect” parameter γ2 in the model for neighbourhood hospitalisations, and the heteroscedasticity parameters δ1 and δ2 in the log-variance equations.
Table 2.
Posterior summary (means and quantiles), structural parameters.
| Parameter | Interpretation | Model | Mean | 2.5% | 5% | 95% | 97.5% |
|---|---|---|---|---|---|---|---|
| β1 | Prevalence Model Intercept | 1 | −0.25 | −0.35 | −0.33 | −0.17 | −0.16 |
| | | 2 | −0.22 | −0.32 | −0.31 | −0.11 | −0.08 |
| | | 3 | −0.23 | −0.30 | −0.29 | −0.15 | −0.14 |
| | | 4 | −0.23 | −0.33 | −0.32 | −0.14 | −0.12 |
| β2 | Prevalence Model, Pollution Effect | 1 | 0.31 | −0.02 | 0.03 | 0.58 | 0.61 |
| | | 2 | 0.22 | −0.26 | −0.14 | 0.49 | 0.52 |
| | | 3 | 0.25 | −0.09 | −0.03 | 0.48 | 0.51 |
| | | 4 | 0.25 | −0.08 | −0.03 | 0.58 | 0.65 |
| γ1 | Hospitalisation Model Intercept | 1 | −1.82 | −2.06 | −2.04 | −1.62 | −1.60 |
| | | 2 | −1.78 | −2.05 | −2.01 | −1.56 | −1.53 |
| | | 3 | −1.87 | −2.06 | −2.03 | −1.70 | −1.68 |
| | | 4 | −1.88 | −2.06 | −2.03 | −1.75 | −1.73 |
| γ2 | Hospitalisation Model, Prevalence Effect | 1 | 0.38 | 0.33 | 0.33 | 0.43 | 0.43 |
| | | 2 | 0.36 | 0.30 | 0.31 | 0.42 | 0.43 |
| | | 3 | 0.38 | 0.34 | 0.35 | 0.42 | 0.42 |
| | | 4 | 0.38 | 0.35 | 0.36 | 0.42 | 0.43 |
| δ1 | Heteroscedasticity Model, Intercept | 1 | −3.37 | −5.32 | −5.04 | −1.73 | −1.48 |
| | | 2 | −2.63 | −4.37 | −4.19 | −2.67 | −0.99 |
| | | 3 | −2.55 | −4.42 | −4.16 | −1.32 | −1.01 |
| | | 4 | −2.27 | −4.37 | −4.12 | −0.58 | −0.29 |
| δ2 | Heteroscedasticity Model, Slope | 1 | 0.03 | −0.34 | −0.29 | 0.35 | 0.39 |
| | | 2 | −0.09 | −0.43 | −0.40 | −0.09 | 0.21 |
| | | 3 | −0.12 | −0.41 | −0.36 | 0.17 | 0.24 |
| | | 4 | −0.16 | −0.52 | −0.48 | 0.19 | 0.23 |
It can be seen that all models agree on the positivity of γ2, namely that asthma hospitalisation rates in neighbourhoods increase in line with neighbourhood prevalence of asthma. All models also provide closely similar results on the level of prevalence (the parameter β1).
However, results are more equivocal concerning the effect of pollution (poor air quality) on asthma prevalence. Posterior means for this coefficient are all positive, ranging from 0.22 to 0.31, but a significant effect, namely an entirely positive 90% credible interval, is only apparent under Model 1. There is also no clear evidence that variances of unstructured residuals ei and ek are related to population size.
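The significance criterion used here (a credible interval lying entirely on one side of zero) can be checked directly against the β2 quantiles reported in Table 2. The short sketch below does this for the 90% and 95% intervals; the variable and function names are illustrative.

```python
# Posterior quantiles for beta2 copied from Table 2, keyed by model
beta2_90 = {1: (0.03, 0.58), 2: (-0.14, 0.49), 3: (-0.03, 0.48), 4: (-0.03, 0.58)}   # (5%, 95%)
beta2_95 = {1: (-0.02, 0.61), 2: (-0.26, 0.52), 3: (-0.09, 0.51), 4: (-0.08, 0.65)}  # (2.5%, 97.5%)

def excludes_zero(interval):
    """True if the interval lies entirely above or entirely below zero."""
    lower, upper = interval
    return lower > 0 or upper < 0

significant_90 = [m for m, ci in beta2_90.items() if excludes_zero(ci)]
significant_95 = [m for m, ci in beta2_95.items() if excludes_zero(ci)]

print(significant_90)  # [1]
print(significant_95)  # []
```

Only Model 1 has a 90% interval excluding zero, and under the wider 95% interval no model does, which is why the pollution effect is described as equivocal.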
To assess stability in inferences about the spatial pattern of asthma risk, the four models are compared in terms of co-location of neighbourhoods identified as having high localized risk, namely those whose local exceedance probabilities, based on the indicators Jk = I(ρk > 1), exceed 0.8 (Table 3), and in terms of co-location of cluster centres, defined by the probabilities Ck. The probability that a neighbourhood is classed as a cluster centre has a lower threshold of 0.25, as this classification requires coincidence of one or more Jl = I(ρl > 1) = 1 (for l ∈ Ak) together with Jk = 1.
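The two classifications can be sketched from posterior draws of the relative risks as follows. This is an illustrative reading of the definitions above, not the authors' code: the toy draws, the neighbour lists standing in for Ak, and the function name are assumptions.

```python
import numpy as np

def exceedance_and_cluster_probs(rho_draws, neighbours, threshold=1.0):
    """Local exceedance and cluster-centre probabilities.

    rho_draws:  array (n_draws, K) of posterior draws of rho_k.
    neighbours: list of K index lists, the assumed neighbour sets A_k.
    """
    J = np.asarray(rho_draws) > threshold      # indicator J_k in each draw
    exceed = J.mean(axis=0)                    # Pr(rho_k > threshold)
    cluster = np.zeros(J.shape[1])
    for k, nb in enumerate(neighbours):
        if len(nb) == 0:
            continue                           # no neighbours: cannot be a centre
        any_neighbour_high = J[:, nb].any(axis=1)
        # Cluster centre: J_k = 1 together with J_l = 1 for at least one l in A_k
        cluster[k] = (J[:, k] & any_neighbour_high).mean()
    return exceed, cluster

# Toy example: three areas on a line, four posterior draws
rho_draws = np.array([
    [1.2, 1.1, 0.8],
    [1.3, 0.9, 0.7],
    [1.1, 1.2, 0.9],
    [0.9, 1.3, 1.1],
])
neighbours = [[1], [0, 2], [1]]

exceed, cluster = exceedance_and_cluster_probs(rho_draws, neighbours)
print(exceed)   # [0.75 0.75 0.25]
print(cluster)  # [0.5  0.75 0.25]
```

Because the cluster-centre event is a joint event, its probability can never exceed the local exceedance probability for the same area, which is consistent with applying a lower threshold (0.25 rather than 0.8).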
It can be seen from Table 3 that a normal kernel combined with a normal discrete process (Model 1) leads to smaller totals of neighbourhoods classed as local high asthma risk or as cluster centres than the other models. However, cross-tabulation of co-located high risk areas or cluster centres (in the lower two subtables of Table 3) shows that all areas classed as locally high risk or as cluster centres under Model 1 are also classed as such by the other models.
Table 3.
Classification of asthma risk in neighbourhoods.
Number of neighbourhoods (from K = 562)
| Model | Total Neighbourhoods with Local Exceedance Probabilities > 0.8 | Total Neighbourhoods with Cluster Centre Probabilities > 0.25 |
|---|---|---|
| 1 | 66 | 68 |
| 2 | 75 | 90 |
| 3 | 74 | 87 |
| 4 | 80 | 89 |
Colocation of local exceedance and cluster centre classifications
Local exceedance
| Model | Classified | Model 1: No | Model 1: Yes |
|---|---|---|---|
| 2 | No | 487 | 0 |
| 2 | Yes | 9 | 66 |
| 3 | No | 488 | 0 |
| 3 | Yes | 8 | 66 |
| 4 | No | 482 | 0 |
| 4 | Yes | 14 | 66 |
Cluster centres
| Model | Classified | Model 1: No | Model 1: Yes |
|---|---|---|---|
| 2 | No | 472 | 0 |
| 2 | Yes | 22 | 68 |
| 3 | No | 475 | 0 |
| 3 | Yes | 19 | 68 |
| 4 | No | 473 | 0 |
| 4 | Yes | 21 | 68 |
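The nesting of the Model 1 classifications within those of the other models can be verified arithmetically from the counts in Table 3. The sketch below checks that each cross-tabulation sums to K = 562, reproduces the marginal totals, and has no area flagged under Model 1 alone. The data structure and function name are illustrative.

```python
K = 562

# Totals from the upper part of Table 3, keyed by model
totals = {
    "exceedance": {1: 66, 2: 75, 3: 74, 4: 80},
    "cluster": {1: 68, 2: 90, 3: 87, 4: 89},
}

# Cross-tabulations against Model 1, as (row, column) = (model m, Model 1):
# (no/no, no/yes, yes/no, yes/yes)
crosstabs = {
    "exceedance": {2: (487, 0, 9, 66), 3: (488, 0, 8, 66), 4: (482, 0, 14, 66)},
    "cluster": {2: (472, 0, 22, 68), 3: (475, 0, 19, 68), 4: (473, 0, 21, 68)},
}

def check_crosstabs(kind):
    """Return True if every cross-tabulation of this kind is consistent."""
    ok = True
    for model, (nn, ny, yn, yy) in crosstabs[kind].items():
        ok &= nn + ny + yn + yy == K               # all neighbourhoods counted
        ok &= yn + yy == totals[kind][model]       # total flagged by model m
        ok &= ny + yy == totals[kind][1]           # total flagged by Model 1
        ok &= ny == 0                              # Model 1 areas nested in model m
    return ok

# Areas flagged by each model but not by Model 1
extra = {kind: {m: tab[2] for m, tab in crosstabs[kind].items()} for kind in crosstabs}

print(check_crosstabs("exceedance"), check_crosstabs("cluster"))  # True True
print(extra["exceedance"])  # {2: 9, 3: 8, 4: 14}
```

The differences between models therefore arise only from additional areas flagged by Models 2 to 4, not from disagreement about the areas identified under Model 1.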
Table 4 considers distributional features of neighbourhood asthma prevalence as indicated by posterior mean asthma prevalence rates from the four models (expressed as percentage rates). There is close agreement between Models 2–4, but slightly lower mean prevalence, and lower 95th and 99th percentiles, under Model 1. Figure 3 depicts the distribution of asthma prevalence over the Outer NE London region, using Model 1 estimates. There is a close correspondence between prevalence and hospitalisation rates (Figure 1), even though the latter are simple crude rates, so providing consistent evidence on the asthma burden.
Table 4.
Neighbourhood asthma prevalence (percentage rates).
| Distributional Characteristics | Model 1 | Model 2 | Model 3 | Model 4 |
|---|---|---|---|---|
| Mean | 4.58 | 4.65 | 4.65 | 4.65 |
| Median | 4.50 | 4.56 | 4.57 | 4.58 |
| Skewness | 0.44 | 0.48 | 0.44 | 0.43 |
| 1st percentile | 2.73 | 2.76 | 2.70 | 2.74 |
| 5th percentile | 3.14 | 3.12 | 3.14 | 3.12 |
| 95th percentile | 6.36 | 6.53 | 6.50 | 6.47 |
| 99th percentile | 6.99 | 7.24 | 7.19 | 7.13 |
Figure 3.
Asthma prevalence (%), LSOAs, outer NE London.