Int. J. Environ. Res. Public Health 2010, 7(2), 333–352

A Multilevel Model for Comorbid Outcomes: Obesity and Diabetes in the US
Department of Geography and Centre for Statistics, Queen Mary University of London, Mile End Rd, London E1 4NS, UK
Received: 16 November 2009 / Accepted: 21 January 2010 / Published: 27 January 2010


Multilevel models are overwhelmingly applied to single health outcomes, but when two or more health conditions are closely related, it is important that contextual variation in their joint prevalence (e.g., variations over different geographic settings) is considered. A multinomial multilevel logit regression approach for analysing joint prevalence is proposed here that includes subject level risk factors (e.g., age, race, education) while also taking account of geographic context. Data from a US population health survey (the 2007 Behavioral Risk Factor Surveillance System or BRFSS) are used to illustrate the method, with a six category multinomial outcome defined by diabetic status and weight category (obese, overweight, normal). The influence of geographic context is partly represented by known geographic variables (e.g., county poverty), and partly by a model for latent area influences. In particular, a shared latent variable (common factor) approach is proposed to measure the impact of unobserved area influences on joint weight and diabetes status, with the latent variable being spatially structured to reflect geographic clustering in risk.
Keywords: diabetes; obesity; multilevel; multinomial; latent variable; spatial; poverty

1. Introduction

Two of the major risk factors for cardiovascular disease are obesity and diabetes, and analysis of geographic patterning in the variation and interrelation of these two major conditions is important for ensuring that resources for prevention and care match need and are effectively targeted. The close link between obesity and diabetes is well established [1,2], and increases in the prevalence of obesity and overweight are a major factor in the growth of diabetes [3,4]. In the US there is evidence of wide geographic contrasts in the prevalence of both obesity and diabetes, and of clear differences in relative risk between age and ethnic groups, and between socioeconomic groups [5,6].
It is important to establish whether geographic variations are simply the result of differences between populations in their age, social and ethnic composition (compositional effects), or whether there are distinct geographic effects that account for part of the variation. The distinct impacts of area on health, and also those of interactions between area variables and individual level risk factors, are often denoted as “contextual variation”. Thus, area-based measures of socioeconomic status may affect health outcomes even after control for individual risks [7–10], while interactions between geography and individual risk factors are exemplified by the study of Subramanian et al. [11], which considers “geographic variation in the individual relationship between race/ethnicity and mortality”. In the present paper, contextual variation is assessed by the significance (after allowing for major individual level risk factors) of both known area variables, and latent area effects, on chances of diabetes and/or excess weight.
This paper develops a multilevel multinomial regression model for diabetes and weight category as joint outcomes. The model framework allows for subject level risk factors, and contextual (area) effects including known area influences (e.g., poverty, race composition, and population density), and unmeasured area influences at two levels (county and state). The latter are modelled using a latent variable approach that results in a summary index of latent contextual effects shared across multinomial outcomes. Because of clustering in both diseases, the latent variables are assumed to be spatially structured [12].
A case study application is based on the 2007 Behavioral Risk Factor Surveillance System (BRFSS) survey, an annual random-digit-dialed telephone survey to determine the prevalence among adults (ages 18 and over) of major illnesses and health risk behaviors. The results described in this paper are based on 128,150 male respondents to the 2007 BRFSS living in the continental United States. The main object is to demonstrate unique aspects of the methodology, such as the use of a common spatial factor with a multinomial health outcome within a multilevel analysis that also allows for the impact of individual risk factors. The analysis is confined to males because the method transfers straightforwardly to other cross-sectional settings, including (say) joint obesity-diabetes prevalence for females in 2007, and no distinct methodological elements would be involved in considering females. Distinct methods would certainly be involved if time were introduced as an extra feature (e.g., how the joint obesity-diabetes multilevel relationship has evolved since 2000), but this is left for another study.
Obesity is defined as a body mass index (BMI) of 30 or over, based on self-reported height and weight, with overweight defined as BMI between 25 and 29.9. To determine diabetes status, respondents were asked “Have you ever been told by a doctor that you have diabetes?”, encompassing both types of diabetes. Women with gestational diabetes were excluded. The BRFSS does include other questions on diabetes such as age of onset, and whether or not the respondent is taking insulin; answers to these questions have been combined in some studies [13] to informally differentiate type 1 diabetes (onset before age 30 combined with current insulin use) from type 2 diabetes. However, for the majority (94%) of diabetic survey subjects who have onset after age 30, obesity is an established risk factor for diabetes.

2. Multinomial Regression Combining Individual and Geographic Risk Factors

A multinomial response is involved when there are three or more sub-categories of a single condition, or may be obtained by combining sub-categories over two or more conditions. In the present BRFSS case study, the response yi (i = 1,.., n) is based on combining sub-categories of two conditions: there are J + 1 = 6 categories defined by diabetic status and weight status, namely diabetic and obese (y = 1), diabetic and overweight (y = 2), diabetic and normal weight (y = 3), non-diabetic and obese (y = 4), non-diabetic and overweight (y = 5), and non-diabetic and normal weight (y = 6). So categories 1 to 5 all show some form of morbidity relative to the final non-morbid category who are normal weight and not diabetic.
A multinomial regression is applied with the final (non-morbid) category as reference, so that
$$\Pr(y_i = j) = \pi_{ij} = \frac{\exp(\phi_{ij})}{1 + \sum_{k=1}^{J} \exp(\phi_{ik})}, \qquad j = 1, \ldots, J, \tag{1.1}$$
$$\Pr(y_i = J+1) = \pi_{i,J+1} = \frac{1}{1 + \sum_{k=1}^{J} \exp(\phi_{ik})}, \tag{1.2}$$
where the φij are J regression terms. Let dij = 1 if subject i is in the jth category. Then for equally weighted subjects the likelihood L would take the form
$$L = \prod_{i=1}^{n} \prod_{j=1}^{J+1} \pi_{ij}^{d_{ij}}, \tag{2.1}$$
with log-likelihood
$$\log L = \sum_{i=1}^{n} \sum_{j=1}^{J+1} d_{ij} \log(\pi_{ij}). \tag{2.2}$$
However, with population survey data, such as the BRFSS [14], it is necessary also to incorporate survey weights wi for respondents i to account for differential response between demographic groups and regions. Then a weighted likelihood Lw is obtained as
$$L_w = \prod_{i=1}^{n} \prod_{j=1}^{J+1} \left[ \pi_{ij}^{d_{ij}} \right]^{w_i}, \tag{3.1}$$
with weighted log-likelihood
$$\log L_w = \sum_{i=1}^{n} \sum_{j=1}^{J+1} w_i \, d_{ij} \log(\pi_{ij}). \tag{3.2}$$
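As a rough illustration of how the category probabilities and the survey-weighted log-likelihood fit together, the following minimal Python sketch evaluates both for toy inputs. The function names and the toy values are assumptions made for illustration only; they are not part of the paper or of the BRFSS data.

```python
import math

def category_probabilities(phi):
    """Return the J+1 category probabilities for one subject.

    phi holds the J regression terms for categories 1..J; the final
    (reference) category has its regression term fixed at zero.
    """
    denom = 1.0 + sum(math.exp(p) for p in phi)
    probs = [math.exp(p) / denom for p in phi]
    probs.append(1.0 / denom)  # reference category J+1
    return probs

def weighted_log_likelihood(phis, categories, weights):
    """Sum over subjects of w_i * log(pi_{i, y_i}).

    categories are 1-based labels in 1..J+1, matching the text.
    """
    total = 0.0
    for phi, y, w in zip(phis, categories, weights):
        probs = category_probabilities(phi)
        total += w * math.log(probs[y - 1])
    return total

# Hypothetical toy data: two subjects, J = 5 regression terms each.
phis = [[0.0] * 5, [-3.0, -3.5, -4.0, -0.5, 0.2]]
categories = [6, 4]
weights = [1.0, 2.5]
print(weighted_log_likelihood(phis, categories, weights))
```

Because only the observed category of each subject has d_ij = 1, the double sum over i and j reduces to one weighted log-probability per subject, which is what the loop computes.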
Three classes of predictors are used in the multinomial regression defined by (1.1)–(1.2), and with weighted likelihood as in (3.1)–(3.2). As well as subject level risk variables R, the regression model includes known geographic influences GK, and latent geographic influences, GL, so the regression terms have the generic form φij = φij(R, GK, GL).
Predictor effects are modelled either as fixed or random effects, in a form of generalized linear mixed model, in particular one with a multinomial outcome [15]. Random effects are used to pool strength (e.g., over areas or age groups) and to incorporate anticipated correlations in the age or spatial profiles of prevalence for categories j = 1,..., J (see Appendix 1). For example, while the levels of the different combinations of diabetic and weight status are different (e.g., obesity without diabetes is much more common than diabetes with normal weight), one would expect their age profiles to be similar (i.e., correlated).

3. Subject Level Predictors

There has been extensive research on variations in obesity and diabetes over demographic and socioeconomic categories, such as age, socioeconomic status and race. A pronounced gradient in diabetes prevalence by age is reported by CDC [16], though obesity may reduce slightly among the very old. Paeratakul et al. [5] also report impacts of socioeconomic status (SES) on obesity and its comorbidities. For example, obese subjects with lower education reported higher rates of diabetes compared to those with higher education; these differentials were more marked than those between high and low income individuals; see Table 3 in Paeratakul et al. [5]. Freudenberg & Ruglis [17] argue that “although education is highly correlated with income and occupation, evidence suggests that education exerts the strongest influence on health”, and Zhang et al. [18] and Maty et al. [19] also argue the benefit of using education as a measure of SES risk. For the 2007 BRFSS data, Table 1 shows morbidity rates due to obesity/overweight and/or diabetes by education level, namely percentages of subjects at each education level located in the six diabetic-weight categories (including the reference category). Higher levels of combined morbidity, namely suffering diabetes combined with obesity or overweight, occur for the less well educated. For example, 5.6% of subjects with less than high school education (namely, 794 of 14,158 survey participants at this education level) are obese and diabetic, compared to 3.2% of college graduates (namely, 1,562 of 48,186 survey participants at this education level).
Ethnic group variability in levels of obesity and diabetes is well established. Paeratakul et al. [5] found the prevalence of overweight and obesity to be higher among black and Hispanic groups compared to whites, and the prevalence of obesity comorbidities (including diabetes) was also found to be higher in blacks than whites. For the 2007 BRFSS data, Table 2 shows morbidity rates due to obesity/overweight and/or diabetes by race, namely percentages of subjects in the six diabetic-weight categories. Compared to other races, black non-Hispanics are more likely to be located in the first two categories, and also have the highest proportion who are both obese and non-diabetic. The other race category (penultimate column in Table 2) has a relatively high proportion who are diabetic but of normal weight. As Zhang et al. [18] mention, racial disparities in diabetes are not entirely explained by racial/ethnic differences in the prevalence of common risk factors such as obesity: racial differences in diabetes risk remain after controlling for body mass and socioeconomic status. Hence cross tabulation such as in Table 2 does not control for interrelations between risk factors (e.g., between race and education), and a regression is required.
Subject level risks are here represented in the regression terms φij by:
  • overall intercepts (αj),
  • differential risks by ethnic group r, namely r = 1 for white non-Hispanic, r = 2 for black, r = 3 for Hispanic, and r = 4 for other races (mainly Asian Americans and Native Americans); these are modelled as fixed effects within each φij, with unknown parameters βjr, r = 2, 3, 4, and with βj1 = 0 as reference under an identifying corner constraint;
  • differential risks by education attainment e, namely e = 1 for less than high school; e = 2 for high school graduate; e = 3 for some college or technical school; and e = 4 for college graduate; these are also modelled as fixed effects, with unknowns γje, e = 2,.., 4, and γj1 = 0 as reference;
  • differential risks by age group a = 1,.., A (with A = 12 for ages 18–24, 25–29, 30–34, .., 70–74, and 75+), represented by unknowns ηja. These are modelled using a random effects approach that allows correlation in the age profiles over the first J prevalence categories (see Appendix 1); an identifying constraint is applied that ensures these effects sum to zero within outcomes, so that $\sum_{a=1}^{A} \eta_{ja} = 0$.
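The two identifying devices in the list above, a corner constraint for the race and education effects and a sum-to-zero constraint for the age effects, can be sketched in Python as follows. All numerical values are hypothetical placeholders, not estimates from the paper, and the helper names are assumptions; in the paper the age effects are random effects estimated within the model, whereas the sketch only shows the centring step.

```python
def corner_constrained(free_params):
    """Prepend the reference level (fixed at 0) to the free parameters."""
    return [0.0] + list(free_params)

def centre_sum_to_zero(effects):
    """Subtract the mean so that the effects sum to zero."""
    mean = sum(effects) / len(effects)
    return [e - mean for e in effects]

# Hypothetical values for one outcome category j.
beta_j = corner_constrained([0.3, 0.2, -0.1])     # race r = 1..4, r = 1 is reference
gamma_j = corner_constrained([-0.1, -0.2, -0.5])  # education e = 1..4, e = 1 is reference
eta_j = centre_sum_to_zero([0.1 * a for a in range(12)])  # A = 12 age groups

def subject_term(alpha_j, race, educ, age_group):
    """alpha_j + beta_{j,r} + gamma_{j,e} + eta_{j,a}, using 1-based codes."""
    return alpha_j + beta_j[race - 1] + gamma_j[educ - 1] + eta_j[age_group - 1]

print(subject_term(-2.0, race=2, educ=4, age_group=7))
```

With these constraints the intercept refers to a white non-Hispanic subject with less than high school education, at the average of the age effects.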

4. Area Level Predictors

Health disparities not explained by population composition (i.e., by considering subject level risk factors alone) may be linked to area effects. For example, Do et al. [20] seek to estimate the share of racial health disparities that can be explained by differences in residential context. There is a wide range of potential area level risk factors for obesity, diabetes and related conditions that have been suggested or applied in the literature. These include area poverty and income levels [21], area racial composition [22,23], climate [24], income inequality [25,26], social cohesion [27,28], type of place (e.g., level of urbanicity) [29–31], and urban sprawl [32–34].
As to geographic effects in the BRFSS, these are defined by the lowest spatial scale identified by that study, namely the county of residence. In fact this means that there are two potentially relevant spatial divisions for the BRFSS data considered here, namely states and counties. There are 3,110 counties across the mainland US, albeit varying considerably in population size. The choice of known area predictors (GK) in the current study is defined partly by availability of a complete and contemporary profile of county level indicators; for example, some studies suggest potential effects of environmental pollution on diabetes [35,36], but a comprehensive US wide index of environmental quality is not available at county level.
Known county level predictors included in the regression in this paper are the 2007 county poverty rate x1c (as a proportion between 0 and 1), county population density x2c (logarithmically transformed), and binary indicators (x3c, x4c, x5c) for counties in the top decile of counties by proportion of the population who are black, Hispanic and other nonwhite, respectively. Thus the 311 counties with proportions black exceeding the 90th percentile (over all 3,110 continental counties) are coded x3c = 1, and other counties are coded x3c = 0. All these variables have the advantage of being updateable between censuses, whereas a number of more complex indices (urban sprawl, social cohesion, etc.) rely on 2000 census variables in their construction. There are therefore five county predictors, each with outcome specific effects. These are represented by fixed effect parameters (δj1, δj2, δj3, δj4, δj5), for categories j = 1, .., J, applying to the five county level predictors {xkc, k = 1, .., 5; c = 1, .., 3110}.
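The coding of the county predictors can be illustrated with a short Python sketch. It flags the top tenth of areas by a given proportion and assembles the five predictors for one county. The function names are assumptions, and so is the handling of ties at the cut-off (broken by position here), since the paper does not say how ties were treated.

```python
import math

def top_decile_indicator(proportions):
    """Return 1 for areas ranked in the top tenth by proportion, else 0.

    Ties at the cut-off are broken by position (an assumption of this sketch).
    """
    n = len(proportions)
    n_top = n // 10  # 311 when n = 3110
    order = sorted(range(n), key=lambda i: proportions[i], reverse=True)
    flags = [0] * n
    for i in order[:n_top]:
        flags[i] = 1
    return flags

def county_predictors(poverty_rate, density, black_flag, hispanic_flag, other_flag):
    """Assemble (x1c, ..., x5c) for one county; density is log transformed."""
    return [poverty_rate, math.log(density), black_flag, hispanic_flag, other_flag]

# Hypothetical proportions for 3,110 areas.
proportions = [i / 3110 for i in range(3110)]
flags = top_decile_indicator(proportions)
print(sum(flags))
print(county_predictors(0.15, 120.0, flags[-1], 0, 0))
```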
To account for unmeasured (i.e., omitted) area effects (GL), a latent variable strategy is adopted. Given considerable evidence of spatial clustering in high levels of diabetes and obesity, this feature should be incorporated in the latent variable specification. One option is a separate random effect for each area and each outcome, but this would involve heavy parameterisation. The object of the method adopted here is a parsimonious summary of risks that tend to produce the well known clustering of both high obesity/overweight and high diabetes in certain parts of the US (e.g., in the South East and Appalachians).
Specifically, a spatially correlated county effect vc for counties c = 1,..., 3110 is adopted with loadings λj defining the impact of the shared county effect on weight-diabetes category j (see Appendix 1 for the form of the spatial dependence). A second set of spatially structured random effects us is defined according to state s of residence (s = 1,.., 49 including District of Columbia), with loadings κj defining the impact of that effect on category j. The latter model relatively broad scale and unmeasured effects for states. Identifiability is obtained by setting the first category loadings to 1, namely λ1 = κ1 = 1, so that the conditional variances of vc and us are unknowns.
Let Ci and Si denote the county and state of residence for respondents i = 1,.., n where n = 128,150, and let {ai, ri, ei} denote the age, race and education status of individual respondents. Then the regression terms φij (i = 1,.., n; j = 1,.., J) defining the multinomial logit regression are represented in full form as:
$$\phi_{ij} = \alpha_j + \beta_{j,r_i} + \gamma_{j,e_i} + \eta_{j,a_i} + \delta_{j1} x_{1,C_i} + \delta_{j2} x_{2,C_i} + \delta_{j3} x_{3,C_i} + \delta_{j4} x_{4,C_i} + \delta_{j5} x_{5,C_i} + \lambda_j v_{C_i} + \kappa_j u_{S_i}. \tag{4}$$
Thus the model provides estimates both of the impacts of individual level risk factors and of area effects. Let Sc ∈ {1,..., 49} denote the state that county c is located in. Then the composite county latent effect for joint prevalence category j is defined by the sum
$$t_{jc} = \lambda_j v_c + \kappa_j u_{S_c},$$
and will incorporate both localized county effects and distinctive state level influences. In particular, because λ1 = κ1 = 1, the total county effect for obesity and diabetes combined is
$$t_{1c} = v_c + u_{S_c}.$$
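To make the structure of the full regression term in (4) and of the composite county effect concrete, here is a minimal Python sketch for a single subject and a single category j. The function names, argument layout and all numerical values are illustrative assumptions, not the paper's code or estimates.

```python
def linear_predictor(alpha_j, beta_j, gamma_j, eta_j, delta_j, lam_j, kappa_j,
                     race, educ, age_group, x_county, v_county, u_state):
    """Regression term phi_ij for one subject and one category j.

    race, educ and age_group are 1-based codes; x_county holds the five
    county predictors for the subject's county of residence.
    """
    known_area = sum(d * x for d, x in zip(delta_j, x_county))
    latent_area = lam_j * v_county + kappa_j * u_state
    return (alpha_j + beta_j[race - 1] + gamma_j[educ - 1]
            + eta_j[age_group - 1] + known_area + latent_area)

def composite_county_effect(lam_j, kappa_j, v_county, u_state):
    """t_jc = lambda_j * v_c + kappa_j * u_{S_c}."""
    return lam_j * v_county + kappa_j * u_state

# Hypothetical inputs for category j = 1, where both loadings equal 1.
phi = linear_predictor(
    alpha_j=-2.0,
    beta_j=[0.0, 0.3, 0.2, -0.1],
    gamma_j=[0.0, -0.1, -0.2, -0.5],
    eta_j=[0.0] * 12,
    delta_j=[1.0, 0.0, 0.0, 0.0, 0.0],
    lam_j=1.0, kappa_j=1.0,
    race=2, educ=4, age_group=1,
    x_county=[0.2, 3.0, 0, 0, 0],
    v_county=0.1, u_state=0.05,
)
print(phi, composite_county_effect(1.0, 1.0, 0.1, 0.05))
```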

5. Modelling Strategy and Distinct Geographic Effects

A major question of interest in multilevel modelling of health data is the presence (or otherwise) of distinct area effects, both effects of known area indicators (such as county poverty or population density) and effects of latent unmeasured area characteristics [20,37]. The presence of geographic contrasts is apparent from Table 3, which contains age standardized prevalence rates (as percents) for the six conditions by state; the nine census divisions of the states are also listed. For example, the highest rates of obesity & diabetes combined exceed 5.5% (e.g., in Tennessee, Mississippi, and Illinois) while the lowest rates are under 3%, for example, in Colorado and Montana (see Figure 1). Such variation may be due largely to differences in population composition, or there may be substantial area effects, and the role of such area effects is assessed here by using an incremental modelling strategy. Distinct area effects (sometimes called contextual effects), due either to known area covariates or latent area effects, are those remaining after the influence on prevalence of individual level attributes has been controlled for.
Thus a baseline model estimates county level prevalence rates from a reduced version of the full model (4), including only subject level age and latent county and state effects. The resulting estimates of county prevalences of the different weight-diabetes categories are adjusted for age [38], but not for population differences in race and education composition, or for the effect of measured county level factors. Thus the baseline model (model 1) involves the regression terms
$$\phi_{ij} = \alpha_j + \eta_{j,a_i} + \lambda_j v_{C_i} + \kappa_j u_{S_i}, \qquad j = 1, \ldots, J,$$
which account for the differing age composition of survey subjects in different areas. Defining
$$w_{cj} = \exp(\alpha_j + \lambda_j v_c + \kappa_j u_{S_c}), \quad j = 1, \ldots, J; \qquad w_{c,J+1} = 1,$$
age standardised proportions in counties c = 1,.., 3110, for the J + 1 = 6 weight-diabetes categories are then estimated as
$$p_{cj} = w_{cj} \Big/ \sum_{k=1}^{J+1} w_{ck}.$$
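The age standardised proportions under the baseline model can be computed for one county as in the following Python sketch. The parameter values are hypothetical placeholders rather than estimates from the paper, and the function name is an assumption.

```python
import math

def standardised_proportions(alpha, lam, kappa, v_c, u_s):
    """Return the J+1 proportions p_cj for one county.

    alpha, lam and kappa are length-J lists (intercepts and loadings);
    v_c and u_s are the county and state latent effects.
    """
    w = [math.exp(a + l * v_c + k * u_s) for a, l, k in zip(alpha, lam, kappa)]
    w.append(1.0)  # reference category: w_{c,J+1} = 1
    total = sum(w)
    return [x / total for x in w]

# Hypothetical intercepts and loadings for J = 5 categories.
alpha = [-2.5, -2.4, -3.2, -0.3, 0.4]
lam = [1.0, 0.8, 0.5, 0.9, 0.4]
kappa = [1.0, 0.7, 0.6, 0.8, 0.3]
p = standardised_proportions(alpha, lam, kappa, v_c=0.2, u_s=0.1)
print([round(x, 3) for x in p])
```

The age effects drop out of this expression because they are constrained to sum to zero within each outcome, so the proportions refer to the average of the age effects.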
A second model (model 2) adds the effect of measured area predictors, namely county poverty, race composition and population density to the baseline model. Thus in model 2
$$\phi_{ij} = \alpha_j + \eta_{j,a_i} + \delta_{j1} x_{1,C_i} + \delta_{j2} x_{2,C_i} + \delta_{j3} x_{3,C_i} + \delta_{j4} x_{4,C_i} + \delta_{j5} x_{5,C_i} + \lambda_j v_{C_i} + \kappa_j u_{S_i}.$$
Age standardised prevalence rates by county and category under model 2 are estimated via $p_{cj} = w_{cj} / \sum_{k=1}^{J+1} w_{ck}$, where now
$$w_{cj} = \exp(\alpha_j + \delta_{j1} x_{1c} + \delta_{j2} x_{2c} + \delta_{j3} x_{3c} + \delta_{j4} x_{4c} + \delta_{j5} x_{5c} + \lambda_j v_c + \kappa_j u_{S_c}). \tag{10}$$
These models are compared to the full model (model 3) including all subject level predictors (age, race, education) and both types of area effect (known and latent), namely
$$\phi_{ij} = \alpha_j + \beta_{j,r_i} + \gamma_{j,e_i} + \eta_{j,a_i} + \delta_{j1} x_{1,C_i} + \delta_{j2} x_{2,C_i} + \delta_{j3} x_{3,C_i} + \delta_{j4} x_{4,C_i} + \delta_{j5} x_{5,C_i} + \lambda_j v_{C_i} + \kappa_j u_{S_i}.$$
Of particular interest are changes in the level of variance of the latent area effects as county predictors and individual risk variables are added to the model. Also of interest are changes in the impact (and statistical significance) of known area predictors when individual level risk variables are added. For example, are there distinct county poverty effects on prevalence after individual level race and education level are allowed for?

6. Case Study Application

Fitting of the regression models and assessment of their goodness of fit follows a Bayesian approach. A Bayesian strategy is advantageous for estimating models with several sets of random effects, including random effects which are spatially clustered, especially when the responses (as here) are not continuous variables but discrete, namely a multinomial category. Under the Bayesian approach, prior densities are specified on all parameters in the model, and final (or posterior) estimates of parameters are based on the combination of the data likelihood and the prior densities.
Estimation uses iterative Markov chain Monte Carlo (MCMC) sampling methods [39], as provided in the WinBUGS program [40]. Goodness of fit is assessed by a measure of fit that penalizes model complexity, known as the Deviance Information Criterion or DIC [41]. The DIC is obtained as the average deviance, using the definition (3.2), plus a measure of complexity. Lower values of the DIC indicate better fitting models. Posterior summaries of parameters are based on the second half of runs of 10,000 iterations, using two chains starting from dispersed starting values. Convergence was achieved in all models using Brooks-Gelman-Rubin criteria [42].
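As a sketch of how such a criterion is assembled from MCMC output, the Python fragment below discards the first half of a chain and computes the DIC as the mean deviance plus a complexity term. The text does not spell out the complexity measure; the sketch assumes the commonly used definition p_D = (mean deviance) − (deviance at the posterior mean of the parameters). Function names and toy numbers are illustrative assumptions.

```python
def second_half(chain):
    """Discard the first half of an MCMC chain as burn-in."""
    return chain[len(chain) // 2:]

def dic(deviance_samples, deviance_at_posterior_mean):
    """Return (DIC, p_D) from sampled deviances.

    Assumes p_D = mean deviance - deviance at the posterior mean.
    """
    mean_deviance = sum(deviance_samples) / len(deviance_samples)
    p_d = mean_deviance - deviance_at_posterior_mean
    return mean_deviance + p_d, p_d

# Hypothetical deviance trace of six iterations; only the last three are retained.
trace = [150.0, 130.0, 110.0, 102.0, 104.0, 106.0]
value, complexity = dic(second_half(trace), deviance_at_posterior_mean=100.0)
print(value, complexity)
```

When comparing models, the same retained iterations would be used for every model so that differences in DIC reflect fit and complexity rather than run length.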
Figure 2 maps the composite latent county effects t1c = vc + uSc from the baseline model 1 (these are posterior means from the MCMC sample). For example, c = 1 for Autauga County in Alabama, and Alabama is the first state alphabetically among the 49 states in the analysis, so Sc = S1 = 1. The effects t1c summarise varying risks for the jointly obese-diabetic condition between counties before controlling for factors such as county poverty and race composition. They show higher risks in the East South Central states (Kentucky, Tennessee, Alabama, Mississippi), and in some East North Central states (e.g., Illinois, Ohio) [43]. Model 1 also provides age profiles for the five diagnostic groups, plotted in Figure 3 as log-odds coefficients relative to the reference category. An increasing prevalence with age is confined to the categories obese & diabetic, overweight & diabetic, and normal weight & diabetic.
One useful feature of this initial analysis is that the county effects can be profiled against known county and state level characteristics. For example, Figure 4 shows the profile of the average t1c according to county poverty decile (defined by grouping counties into ten categories according to their ranked poverty rates).
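Profiling of this kind amounts to ranking counties by poverty rate, cutting the ranking into ten equal-sized groups, and averaging the latent effect within each group. A minimal Python sketch follows; the function name and the synthetic inputs are assumptions, and the paper does not state exactly how decile boundaries were drawn.

```python
def decile_means(poverty_rates, effects):
    """Average effect within each poverty decile (lowest poverty first).

    Counties are ranked by poverty rate and split into ten groups of
    (nearly) equal size; requires at least ten counties.
    """
    n = len(poverty_rates)
    order = sorted(range(n), key=lambda i: poverty_rates[i])
    means = []
    for d in range(10):
        members = order[d * n // 10:(d + 1) * n // 10]
        means.append(sum(effects[i] for i in members) / len(members))
    return means

# Synthetic example: 100 areas whose effect rises with poverty.
poverty = [i / 100 for i in range(100)]
effect = [2.0 * p - 1.0 for p in poverty]
print([round(m, 2) for m in decile_means(poverty, effect)])
```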
Table 4 compares the fit from the baseline model against the other two models, and also presents the variances of the latent spatial effects. These need to account for differential loadings {λj, κj} of the area effects by category (see Table 5), and for the distribution of subjects between categories, and are obtained marginally as var(λjvCi + κjuSi). It can be seen that adding known area predictors in Model 2 results in improved fit (a reduced DIC), and also in a (relatively slight) reduction in the variance of the latent spatial effects.
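The marginal variance of the latent area term for a category j can be sketched as the variance, taken over subjects, of λj v + κj u evaluated at each subject's county and state. The Python fragment below uses the population variance from the standard library; whether the paper used this form, or averaged the variance over MCMC iterations, is not stated, so both are assumptions of the sketch, as are the names and toy values.

```python
import statistics

def latent_effect_variance(lam_j, kappa_j, v, u, county_of_subject, state_of_subject):
    """Variance over subjects of lambda_j * v_{C_i} + kappa_j * u_{S_i}.

    v and u are lists of county and state effects (0-based indices);
    county_of_subject and state_of_subject give each subject's area codes.
    """
    terms = [lam_j * v[c] + kappa_j * u[s]
             for c, s in zip(county_of_subject, state_of_subject)]
    return statistics.pvariance(terms)

# Hypothetical effects for two counties in one state and four subjects.
v = [1.0, -1.0]
u = [0.0]
print(latent_effect_variance(2.0, 1.0, v, u, [0, 1, 0, 1], [0, 0, 0, 0]))
```

Because the variance is taken over subjects rather than over areas, populous counties contribute more, which is one way in which the distribution of subjects enters the calculation.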
Table 6 contains the δjk coefficients from model 2 (j = 1,.., J; k = 1,.., 5). It can be seen that the county poverty rate has a strong influence in raising chances of being both obese and diabetic (the first category). It is also an important positive influence on area relativities in the joint normal weight-diabetic category, and on obesity without diabetes. As Table 3 shows, the latter condition applies to around 22% of the US male population and occurs relatively evenly across the age spectrum. Table 6 also shows (via the coefficients δj2) higher rates of morbidity in lower density areas, typically non-metropolitan areas, for four of the five categories. This is consistent with findings that lower density areas, with greater sprawl and lower “walkability”, have higher rates of obesity and overweight [44]. The exception to this effect is diabetes combined with normal weight, which is higher in more densely populated areas. As to effects of county ethnic structure, high concentrations of blacks or Hispanics remain a positive influence on the three morbidity categories involving diabetes, even after controlling for county poverty.
Figure 5 maps county level variations in proportions jointly obese & diabetic from model 2, namely
$$p_{c1} = w_{c1} \Big/ \sum_{k=1}^{6} w_{ck},$$
where the $w_{cj}$ are as in (10). The role of county poverty in defining levels of the joint obese-diabetic category under model 2 results in isolated high prevalence clusters in West North Central states such as North Dakota, Montana and Nebraska. These may, for example, be low income rural areas or counties with concentrations of Native Americans [45]. Figure 6 maps variations in the prevalence of obesity without diabetes, namely
$$p_{c4} = w_{c4} \Big/ \sum_{k=1}^{6} w_{ck}.$$
The geographic pattern of this condition broadly resembles that of the rarer joint obese-diabetic condition; the state level correlation between these two sets of prevalence rates is 0.50.
As might be expected, combining individual and county level predictors in model 3 produces the lowest DIC and a reduced spatial variance, though over 80% of the baseline spatial variance remains unexplained. Table 7 summarises the effects of individual level risk factors under model 3, in terms of relativities between education and race groups for each of the five morbidity categories. These are represented by the education parameters γje (e = 2,.., 4), and race parameters βjr (r = 2,.., 4); the reference coefficient for education is γj1 = 0 for less than high school, and the reference coefficient for race is βj1 = 0 for white non-Hispanics.
A notable feature from the education parameters is the lower morbidity among college graduates. Generally, morbidity is greater for subjects with lower education attainment, except for the overweight-non diabetic category.
The race parameters in Table 7 show that black and Hispanic males have higher morbidity than white non-Hispanic males for all conditions. By contrast, other ethnicity (primarily Asian Americans and Native Americans) enhances only the rate of diabetes without obesity. This is consistent with the original survey data (see Table 2), which show that other ethnic groups have the highest proportion of all race groups in the non-morbid (normal weight, non-diabetic) category. The high rate of diabetes without excess weight among Asian Americans has been shown by other studies [43,46]. Although other studies [47] show high obesity among Native Americans, the results in Table 7 suggest this may to a considerable extent be explained by socioeconomic status (as measured by education) and by area effects.
Table 8 contains the δjk coefficients relating to county level predictors under model 3. The effects of county poverty rate remain pronounced, and are in fact enhanced for the joint obese-diabetic and obese-non-diabetic categories. The significantly higher prevalence of all conditions (except diabetic normal weight) in lower density counties is also still evident. Thus the effects of known area predictors have been largely maintained after allowing for subject level race and education, established as major individual level risk factors for the two conditions [48]. The reduction (relatively slight) in the variance of latent area effects (see Table 4) under model 3 may then be mainly attributable to control for population composition.

7. Conclusion

This paper has considered an approach to modelling prevalence variations in diseases or conditions considered jointly, taking account of both area effects and characteristics of survey subjects. The influence of area is represented partly by known variables (e.g., county poverty), and partly by spatially clustered latent area influences. The application has focussed on the joint prevalence of diabetes and weight status, so providing a geographic perspective on weight-related diabetes prevalence. However, the approach is generic and potentially extends to more than two conditions.
Geographic variability in chronic conditions whether considered singly or jointly will partly reflect variations in the socio-demographic characteristics of area populations, sometimes termed ‘compositional’ effects [49]. However, a number of studies find evidence for prevalence variations between different areas even after controlling for population composition, illustrating what are sometimes termed ‘contextual’ effects [50]. The present study adds to this evidence by showing enduring geographic contrasts in prevalence of different joint obesity-diabetes categories after taking account of individual level age, race and education status.
In the present paper contextual effects have been represented by shared latent effects over the joint obesity-diabetes categories. These are spatially structured random effects for counties and states, and the consistently positive loadings in Table 5 demonstrate that a shared univariate effect is appropriate. Elaborations to the model presented above are possible, such as ethnic group differentiation in age or education gradients, or additional subject level predictors, though those included (age, race, education) are established as the major dimensions of variation for diabetes and obesity [5,48]. One might also assume spatially varying impacts of the known area predictors, such as the county poverty rate [51].


  1. Balluz, L; Okoro, C; Mokdad, A. Association between selected unhealthy lifestyle factors, body mass index, and chronic health conditions among individuals 50 years of age or older, by race/ethnicity. Ethn. Dis 2008, 18, 450–457.
  2. Yach, D; Stuckler, D; Brownell, K. Epidemiologic and economic consequences of the global epidemics of obesity and diabetes. Nat. Med 2006, 12, 62–66.
  3. Mokdad, A; Ford, E; Bowman, B; Nelson, D; Engelgau, M; Vinicor, F; Marks, J. Diabetes trends in the US: 1990–1998. Diabetes Care 2001, 24, 1278–1283.
  4. Gregg, E; Cheng, Y; Narayan, K; Thompson, T; Williamson, D. The relative contributions of different levels of overweight and obesity to the increased prevalence of diabetes in the United States: 1976–2004. Prev Med 2007, 45, 348–352.
  5. Paeratakul, S; Lovejoy, J; Ryan, D; Bray, G. The relation of gender, race and socioeconomic status to obesity and obesity comorbidities in a sample of US adults. Int. J. Obes. Relat. Metab. Disord 2002, 26, 1205–1210.
  6. Cowie, C; Rust, K; Ford, E; Eberhardt, M; Byrd-Holt, D; Li, C; Williams, D; Gregg, E; Bainbridge, K; Saydah, S; Geiss, L. Full accounting of diabetes and pre-diabetes in the US population in 1988–1994 and 2005–2006. Diabetes Care 2009, 32, 287–294.
  7. Krieger, N; Chen, J; Waterman, P; Rehkopf, D; Subramanian, S. Race/ethnicity, gender, and monitoring socioeconomic gradients in health: a comparison of area-based socioeconomic measures. Amer. J. Public Health 2003, 93, 1655–1671.
  8. Drewnowski, A; Rehm, C; Solet, D. Disparities in obesity rates: analysis by ZIP code area. Soc. Sci. Med 2007, 65, 2458–2463.
  9. Lee, R; Cubbin, C; Winkleby, M. Contribution of neighbourhood socioeconomic status and physical activity resources to physical activity among women. J. Epid. Comm. Health 2007, 61, 882–890.
  10. Cubbin, C; Hadden, W; Winkleby, M. Neighborhood context and cardiovascular disease risk factors: the contribution of material deprivation. Ethn. Dis 2001, 11, 687–700.
  11. Subramanian, S; Chen, J; Rehkopf, D; Waterman, P; Krieger, N. Racial disparities in context: a multilevel analysis of neighborhood variations in poverty and excess mortality among black populations in Massachusetts. Am. J. Public Health 2005, 95, 260–265.
  12. Schuurman, N; Peters, P; Oliver, L. Are obesity and physical activity clustered? A spatial analysis linked to residential density. Obesity 2009.
  13. Saaddine, J; Narayan, K; Engelgau, M; Aubert, R; Klein, R; Beckles, G. Prevalence of self-rated visual impairment among adults with diabetes. Am. J. Public Health 1999, 89, 1200–1205.
  14. Jiles, R; Hughes, E; Murphy, W; Flowers, N; McCracken, M; Roberts, H; Ochner, M; Balluz, L; Mokdad, A; Elam-Evans, L; Giles, W. Surveillance for certain health behaviors among states and selected local areas–Behavioral Risk Factor Surveillance System. MMWR Surveill Summ 2005, 54, 1–116.
  15. Hedeker, D. A mixed-effects multinomial logistic regression model. Stat. Med 2003, 22, 1433–1444.
  16. Centers for Disease Control and Prevention (CDC). Prevalence of diabetes and impaired fasting glucose in adults—United States, 1999–2000. MMWR Surveill Summ 2003, 52, 833–837.
  17. Freudenberg, N; Ruglis, J. Reframing school dropout as a public health issue. Prev. Chronic Dis 2007, 7, 63.
  18. Zhang, Q; Wang, Y; Huang, E. Changes in racial/ethnic disparities in the prevalence of Type 2 diabetes by obesity level among US adults. Ethn. Health 2009, 14, 439–457.
  19. Maty, S; Everson-Rose, S; Haan, M; Raghunathan, T; Kaplan, G. Education, income, occupation, and the 34-year incidence (1965–1999) of Type 2 diabetes in the Alameda County Study. Int. J. Epid 2005, 34, 1282–1283.
  20. Do, D; Finch, B; Basurto-Davila, R; Bird, C; Escarce, J; Lurie, N. Does place explain racial health disparities? Quantifying the contribution of residential context to the Black/white health gap in the United States. Soc. Sci. Med 2008, 67, 1258–1268.
  21. Schwartz, F; Ruhil, A; Denham, S; Shubrook, J; Simpson, C; Boyd, S. High self-reported prevalence of diabetes mellitus, heart disease, and stroke in 11 counties of rural Appalachian Ohio. J. Rur. Health 2009, 25, 226–230.
  22. Mellor, J; Milyo, J. Individual health status and racial minority concentration in US states and counties. Am. J. Public Health 2004, 94, 1043–1048.
  23. Lopez, R. Neighborhood risk factors for obesity. Obesity 2007, 15, 2111–2119.
  24. Franz, K; Bailey, S. Geographical variations in heart deaths and diabetes: effect of climate and a possible relationship to magnesium. J. Amer. Coll. Nutr 2004, 23, 521S–524S.
  25. Pickett, K; Kelly, S; Brunner, E; Lobstein, T; Wilkinson, R. Wider income gaps, wider waistbands? An ecological study of obesity and income inequality. J. Epid. Comm. Health 2005, 59, 670–674.
  26. Fuller-Thomson, E; Gadalla, T. Income inequality and limitations in activities of daily living: a multilevel analysis of the 2003 American Community Survey. Public Health 2008, 122, 221–228.
  27. Holtgrave, D; Crosby, R. Is social capital a protective factor against obesity and diabetes? Findings from an exploratory study. Ann. Epidemiol 2006, 16, 406–408.
  28. Kim, D; Subramanian, S; Gortmaker, S; Kawachi, I. US state- and county-level social capital in relation to obesity and physical inactivity: a multilevel, multivariable analysis. Soc. Sci. Med 2006, 63, 1045–1059.
  29. Mainous, A; King, D; Garr, D; Pearson, W. Race, rural residence, and control of diabetes and hypertension. Ann. Fam. Med 2004, 2, 563–568.
  30. Koopman, R; Mainous, A; Geesey, M. Rural residence and Hispanic ethnicity: doubly disadvantaged for diabetes? J. Rur. Health 2005, 22, 63–68.
  31. Lovasi, G; Neckerman, K; Quinn, J; Weiss, C; Rundle, A. Effect of individual or neighborhood disadvantage on the association between neighborhood walkability and body mass index. Amer. J. Public Health 2009, 99, 279–284.
  32. Ewing, R; Schmid, T; Killingsworth, R. Relationship between urban sprawl and physical activity, obesity, and morbidity. Amer. J. Health Promot 2003, 18, 47–57.
  33. Joshu, C; Boehmer, T; Brownson, R; Ewing, R. Personal, neighbourhood and urban factors associated with obesity in the United States. J. Epid. Comm. Health 2008, 62, 202–208.
  34. Li, F; Harmer, P; Cardinal, B; Bosworth, M; Johnson-Shelton, D; Moore, J; Acock, A; Vongjaturapat, N. Built environment and 1-year change in weight and waist circumference in middle-aged and older adults: Portland Neighborhood Environment and Health Study. Amer. J. Epid 2009, 169, 401–408.
  35. Ershow, A. Environmental influences on development of type 2 diabetes and obesity: challenges in personalizing prevention and management. J. Diab. Sci. Tech 2009, 3, 727–734.
  36. Schreinemachers, D. Mortality from ischemic heart disease and diabetes mellitus (type 2) in four U.S. wheat-producing states: a hypothesis-generating study. Environ. Health Perspect 2006, 114, 186–193.
  37. Sastry, N; Hussey, J. An investigation of race and ethnic disparities in birthweight in Chicago neighborhoods. Demography 2003, 40, 701–725.
  38. Gregg, E; Cheng, Y; Cadwell, B; Imperatore, G; Williams, D; Flegal, K; Narayan, K; Williamson, D. Secular trends in cardiovascular disease risk factors according to body mass index in US adults. J. Amer. Med. Assoc 2005, 293, 1868–1874.
  39. Gelfand, A; Smith, A. Sampling based approaches to calculate marginal densities. J. Amer. Statist. Assoc 1990, 85, 398–409.
  40. Lunn, D; Thomas, A; Best, N; Spiegelhalter, D. WinBUGS – a Bayesian modelling framework: concepts, structure, and extensibility. Stat. Comput 2000, 10, 325–337.
  41. Spiegelhalter, D; Best, N; Carlin, B; van der Linde, A. Bayesian measures of model complexity and fit. J. Roy. Stat. Soc. B 2002, 64, 583–639.
  42. Brooks, S; Gelman, A. Alternative methods for monitoring convergence of iterative simulations. J. Comp. Graph. Stat 1998, 7, 434–456.
  43. Wang, Y; Beydoun, M. The obesity epidemic in the United States–gender, age, socioeconomic, racial/ethnic, and geographic characteristics: a systematic review and meta-regression analysis. Epid. Rev 2007, 29, 6–28.
  44. Papas, M; Alberg, A; Ewing, R; Helzlsouer, K; Gary, T; Klassen, A. The built environment and obesity. Epid. Rev 2007, 29, 129–143.
  45. Kirschner, A. Poverty in the rural West. Perspect. Poverty Policy Place 2005, 3, 4–6.
  46. McNeely, M; Boyko, E. Type 2 diabetes prevalence in Asian Americans: results of a national health survey. Diabetes Care 2004, 27, 66–69.
  47. Broussard, B; Johnson, A; Himes, J; Story, M; Fichtner, R; Hauck, F; Bachman-Carter, K; Hayes, J; Frohlich, K; Gray, N. Prevalence of obesity in American Indians and Alaska Natives. Amer. J. Clin. Nutr 1991, 53, 1535S–1542S.
  48. Geiss, L; Pan, L; Cadwell, B; Gregg, E; Benjamin, S; Engelgau, M. Changes in incidence of diabetes in U.S. adults, 1997–2003. Amer. J. Prev. Med 2006, 30, 371–377.
  49. Duncan, C; Jones, K; Moon, G. Context, composition and heterogeneity: using multilevel models in health research. Soc. Sci. Med 1998, 46, 97–117.
  50. Sacker, A; Wiggins, R; Bartley, M. Time and place: putting individual health into context. A multilevel analysis of the British household panel survey, 1991–2001. Health Place 2006, 12, 279–290.
  51. Gamerman, D; Moreira, A; Rue, H. Space-varying regression models specifications and simulation. Comput. Statist. Data Analysis 2003, 42, 513–533.
  52. Fahrmeir, L; Lang, S. Bayesian inference for generalized additive mixed models based on Markov random field priors. J. Roy. Stat. Soc. C 2001, 50, 201–220.
  53. Besag, J; York, J; Mollie, A. Bayesian image restoration, with two applications in spatial statistics. Ann. Inst. Statist. Math 1991, 43, 1–59.

Appendix 1 Structured Age and Area Effects

Age and area effects follow autoregressive random effect schemes. To pool strength across the age profiles of different outcomes, a low order multivariate random walk prior [52] may be adopted for the J dimensional vector ηa = (η1a,.., ηJa), a = 1,.., A. For example, first and second order random walk priors have conditional forms
$$\eta_a \sim N_J(\eta_{a-1}, \Omega_\eta^{-1}), \qquad \eta_a \sim N_J(2\eta_{a-1} - \eta_{a-2}, \Omega_\eta^{-1}),$$
where the J × J matrix $\Omega_\eta^{-1}$ represents covariation between age mortality profiles over different weight-diabetes status categories. Here a first order random walk is used, and the precision matrix Ωη is assigned a Wishart prior with identity scale matrix and J degrees of freedom, namely Ωη ~ Wish(I, J).
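As a rough illustration of the first order random walk prior just described, the following sketch draws a precision matrix from Wish(I, J) and then simulates one set of age profiles. The function name `simulate_rw1`, the zero starting vector, and the values A = 13 and J = 5 are assumptions made for the example only and are not taken from the paper.

```python
import numpy as np


def simulate_rw1(A, J, seed=0):
    """Simulate eta_1..eta_A from a first order multivariate random walk.

    Assumptions (illustrative only): eta_1 is fixed at zero, and the
    precision matrix is a single draw from a Wishart with identity scale
    and J degrees of freedom.
    """
    rng = np.random.default_rng(seed)

    # Wishart(I, J) draw: sum of J outer products of N(0, I) vectors.
    Z = rng.standard_normal((J, J))
    omega = Z.T @ Z

    # omega = L L^T, so solving L^T x = z gives x with covariance omega^{-1}.
    L = np.linalg.cholesky(omega)

    eta = np.zeros((A, J))
    for a in range(1, A):
        z = rng.standard_normal(J)
        eta[a] = eta[a - 1] + np.linalg.solve(L.T, z)
    return eta, omega


eta, omega = simulate_rw1(A=13, J=5)
print(eta.shape)
```

Each row of `eta` is one age group and each column one outcome category; the off-diagonal elements of the inverse of `omega` are what allow the age profiles of different categories to borrow strength from one another.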
The county and state effects follow conditional autoregressive (CAR) priors [53], with pooling of strength over neighbouring areas. Suppose the locality Lc of county c (the counties adjacent to it) contains dc counties. For vc conditional on all remaining effects v[c] = (v1,.., vc−1, vc+1,..), one has
$$v_c \mid v_{[c]} \sim N(V_c, \delta / d_c),$$
where δ is a conditional variance parameter, and Vc is the average of the vh (h ≠ c) in locality Lc, namely
$$V_c = \sum_{h \in L_c} v_h / d_c .$$
The state level prior pools strength over neighbouring states in the same way.
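The CAR conditional moments can be made concrete with a toy example. The sketch below assumes four hypothetical counties arranged in a line, with made-up effect values and δ = 0.5; the helper name `car_conditional` is likewise an assumption for illustration.

```python
def car_conditional(v, neighbours, c, delta):
    """Return the conditional mean V_c and variance delta / d_c for area c."""
    locality = neighbours[c]          # indices of areas adjacent to c
    d_c = len(locality)               # number of neighbours
    V_c = sum(v[h] for h in locality) / d_c
    return V_c, delta / d_c


# Hypothetical adjacency: counties 0-1-2-3 in a row.
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
v = [0.2, -0.1, 0.4, 0.0]             # made-up current values of the effects

mean_1, var_1 = car_conditional(v, neighbours, c=1, delta=0.5)
print(mean_1, var_1)
```

For county 1 the conditional mean is the average of its two neighbours' effects, and the conditional variance shrinks as the number of neighbours grows, which is how the prior pools strength over adjacent areas.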
Figure 1. Percent of adults both diabetic & obese, BRFSS 2007.
Figure 2. County latent effects.
Figure 3. Age profiles of the different diagnostic groups.
Figure 4. Total latent area effect, model 1, by county poverty decile.
Figure 5. Rate of obesity & diabetes jointly, US Counties, Model 2.
Figure 6. Rate of obesity without diabetes, US counties, model 2.
Table 1. Percentage prevalence of obesity-diabetes status, by education.
| | Less than high school | High school graduate | Some college | College graduate | All |
|---|---|---|---|---|---|
| Obese & Diabetic | 5. | | | | |
| Overweight & Diabetic | 3. | | | | |
| Normal weight & Diabetic | 2. | | | | |
| Obese non-diabetic | 21.3 | 24.9 | 25.1 | 19.0 | 22.4 |
| Overweight & non-diabetic | 36.3 | 37.3 | 39.4 | 44.0 | 40.2 |
| Normal weight & non-diabetic | 30.8 | 28.6 | 26.9 | 29.6 | 28.8 |
Table 2. Percentage prevalence of obesity-diabetes status, by race.
| | White non-Hispanic | Black non-Hispanic | Hispanic | Other | All |
|---|---|---|---|---|---|
| Obese & Diabetic | 4. | | | | |
| Overweight & Diabetic | 3. | | | | |
| Normal weight & Diabetic | 1. | | | | |
| Obese non-diabetic | 22. | | | | |
| Overweight & non-diabetic | 41.0 | 35.3 | 41.5 | 37.1 | 40.2 |
| Normal weight & non-diabetic | 28.5 | 26.3 | 25.4 | 39.5 | 28.8 |
Table 3. Percentage prevalence of obesity-diabetes status, by state.
| State | Census Division | Obese & Diabetic | Overweight & Diabetic | Normal weight & Diabetic | Obese non-diabetic | Overweight non-diabetic | Normal weight non-diabetic |
|---|---|---|---|---|---|---|---|
| Alabama | E South Central | 5.7 | 2.8 | 1.3 | 23.6 | 39.6 | 27.0 |
| Arkansas | W South Central | 4.6 | 2.8 | 0.9 | 24.7 | 41.2 | 25.8 |
| Connecticut | New England | 3. | | | | | |
| Delaware | South Atlantic | 4.2 | 3.5 | 0.9 | 26.9 | 39.5 | 25.0 |
| District of Columbia | South Atlantic | 3.3 | 2.9 | 1.3 | 16.1 | 35.0 | 41.5 |
| Florida | South Atlantic | 4.0 | 3.2 | 1.5 | 21.6 | 41.4 | 28.3 |
| Georgia | South Atlantic | 5.1 | 3.2 | 1.7 | 20.8 | 39.8 | 29.3 |
| Illinois | E North Central | 6. | | | | | |
| Indiana | E North Central | 4.7 | 2.8 | 1.5 | 20.7 | 42.2 | 28.2 |
| Iowa | W North Central | 4.2 | 2.4 | 1.3 | 25.1 | 38.4 | 28.5 |
| Kansas | W North Central | 4.4 | 2.8 | 0.9 | 24.7 | 39.9 | 27.3 |
| Kentucky | E South Central | 4. | | | | | |
| Louisiana | W South Central | 5. | | | | | |
| Maine | New England | 4.1 | 3.0 | 0.8 | 23.1 | 41.0 | 27.9 |
| Maryland | South Atlantic | 4. | | | | | |
| Massachusetts | New England | 3. | | | | | |
| Michigan | E North Central | 5.0 | 3.6 | 1.3 | 24.5 | 38.4 | 27.2 |
| Minnesota | W North Central | 3.6 | 2.1 | 0.7 | 24.4 | 40.5 | 28.7 |
Table 4. Model fit summary.
| | Average Deviance | Effective Parameters (Complexity) | DIC | Variance Spatial Effects |
|---|---|---|---|---|
| Model 1 | 260644 | 802 | 261446 | 0.329 |
| Model 2 | 260581 | 821 | 261402 | 0.290 |
| Model 3 | 256924 | 846 | 257769 | 0.279 |
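The fit figures in Table 4 can be cross-checked against the usual definition of the DIC [41] as the average deviance plus the effective number of parameters. The short sketch below does this arithmetic; the dictionary layout and the function name `dic` are assumptions for illustration, and a one-unit tolerance is allowed because the tabulated values appear to be rounded.

```python
# (average deviance, effective parameters, reported DIC) as read from Table 4
fits = {
    "Model 1": (260644, 802, 261446),
    "Model 2": (260581, 821, 261402),
    "Model 3": (256924, 846, 257769),
}


def dic(avg_deviance, effective_params):
    """DIC as average deviance plus effective parameter count."""
    return avg_deviance + effective_params


for name, (dbar, p_d, reported) in fits.items():
    computed = dic(dbar, p_d)
    print(name, computed, reported, abs(computed - reported) <= 1)

# The preferred model under this criterion is the one with the lowest DIC.
best = min(fits, key=lambda m: fits[m][2])
print(best)
```

Under this criterion a lower DIC indicates a better trade-off between fit and complexity, so the comparison turns on differences between rows rather than on the absolute values.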
Table 5. Loadings on shared latent effects, by model.
| Model | Area level | Category | Loading | | |
|---|---|---|---|---|---|
| 1 | County | Obese & Diabetic | 1 | | |
| | | Overweight & Diabetic | 0.72 | 0.58 | 0.87 |
| | | Normal weight & Diabetic | 0.49 | 0.32 | 0.67 |
| | | Obese non-diabetic | 1.42 | 1.32 | 1.54 |
| | | Overweight non-diabetic | 0.88 | 0.81 | 0.96 |
| | State | Obese & Diabetic | 1 | | |
| | | Overweight & Diabetic | 0.88 | 0.62 | 1.20 |
| | | Normal weight & Diabetic | 0.44 | 0.08 | 0.84 |
| | | Obese non-diabetic | 0.16 | 0.03 | 0.31 |
| | | Overweight non-diabetic | 0.05 | 0.00 | 0.13 |
| 2 | County | Obese & Diabetic | 1.00 | | |
| | | Overweight & Diabetic | 1.07 | 0.97 | 1.18 |
| | | Normal weight & Diabetic | 1.53 | 1.35 | 1.69 |
| | | Obese non-diabetic | 2.56 | 2.50 | 2.65 |
| | | Overweight non-diabetic | 1.75 | 1.67 | 1.85 |
| | State | Obese & Diabetic | 1 | | |
| | | Overweight & Diabetic | 1.08 | 0.96 | 1.29 |
| | | Normal weight & Diabetic | 0.14 | −0.02 | 0.30 |
| | | Obese non-diabetic | 0.86 | 0.80 | 0.93 |
| | | Overweight non-diabetic | 0.42 | 0.37 | 0.48 |
| 3 | County | Obese & Diabetic | 1 | | |
| | | Overweight & Diabetic | 1.25 | 1.02 | 1.52 |
| | | Normal weight & Diabetic | 1.27 | 0.95 | 1.55 |
| | | Obese non-diabetic | 2.44 | 2.16 | 2.83 |
| | | Overweight non-diabetic | 1.65 | 1.44 | 1.91 |
| | State | Obese & Diabetic | 1 | | |
| | | Overweight & Diabetic | 1.09 | 0.68 | 1.74 |
| | | Normal weight & Diabetic | 0.45 | 0.12 | 0.64 |
| | | Obese non-diabetic | 0.48 | 0.32 | 0.69 |
| | | Overweight non-diabetic | 0.30 | 0.14 | 0.48 |
Table 6. Effects of county predictors by diabetes-weight category, model 2.
| Category | Predictor | Posterior Mean | Standard Devn | MC Error | 2.5% | 97.5% |
|---|---|---|---|---|---|---|
| Obese & Diabetic | Poverty Rate | 1. | | | | |
| | Popn Density | − | | | −0.12 | −0.10 |
| | Top decile Black | 0.31 | 0. | | | |
| | Top decile Hispanic | 0. | | | | |
| | Top decile Other Ethnicity | − | | | −0.23 | −0.02 |
| Overweight & Diabetic | Poverty Rate | − | | | −0.27 | 0.09 |
| | Popn Density | − | | | −0.06 | −0.03 |
| | Top decile Black | 0.02 | 0.09 | 0.02 | −0.23 | 0.16 |
| | Top decile Hispanic | 0.30 | 0. | | | |
| | Top decile Other Ethnicity | − | | | −0.24 | −0.06 |
| Normal weight & Diabetic | Poverty Rate | 0.46 | 0. | | | |
| | Popn Density | 0. | | | | |
| | Top decile Black | 0. | | | | |
| | Top decile Hispanic | 0.37 | 0. | | | |
| | Top decile Other Ethnicity | − | | | −0.25 | −0.04 |
| Obese non-diabetic | Poverty Rate | 0. | | | | |
| | Popn Density | − | | | −0.08 | −0.05 |
| | Top decile Black | 0.02 | 0.02 | 0.00 | −0.03 | 0.08 |
| | Top decile Hispanic | 0. | | | | |
| | Top decile Other Ethnicity | − | | | −0.23 | −0.05 |
| Overweight non-diabetic | Poverty Rate | −0.42 | 0.10 | 0.02 | −0.64 | −0.24 |
| | Popn Density | − | | | −0.02 | 0.00 |
| | Top decile Black | − | | | −0.16 | −0.03 |
| | Top decile Hispanic | 0.01 | 0.02 | 0.00 | −0.04 | 0.06 |
| | Top decile Other Ethnicity | − | | | −0.12 | −0.03 |
Table 7. Effects of county predictors by diabetes-weight category, model 2.
| Education Coefficients | | Posterior Mean | Standard Devn | MC Error | 2.5% | 97.5% |
|---|---|---|---|---|---|---|
| Obese & Diabetic | High School Graduate | − | | | −0.18 | 0.01 |
| | Some College | 0.02 | 0.05 | 0.01 | −0.07 | 0.13 |
| | College Graduate | −0.66 | 0.05 | 0.01 | −0.74 | −0.56 |
| Overweight & Diabetic | High School Graduate | 0. | | | | |
| | Some College | 0.02 | 0.05 | 0.01 | −0.08 | 0.12 |
| | College Graduate | − | | | −0.34 | −0.14 |
| Normal weight & Diabetic | High School Graduate | − | | | −0.30 | 0.04 |
| | Some College | −0.34 | 0.09 | 0.01 | −0.53 | −0.18 |
| | College Graduate | −0.50 | 0.07 | 0.01 | −0.64 | −0.38 |
| Obese non-diabetic | High School Graduate | 0. | | | | |
| | Some College | 0.37 | 0.03 | 0.00 | 0.30 | 0.43 |
| | College Graduate | − | | | −0.23 | −0.10 |
| Overweight non-diabetic | High School Graduate | 0. | | | | |
| | Some College | 0. | | | | |
| | College Graduate | 0. | | | | |

| Race Coefficients | | Posterior Mean | Standard Devn | MC Error | 2.5% | 97.5% |
|---|---|---|---|---|---|---|
| Obese & Diabetic | Black | 0.60 | 0.06 | 0.01 | 0.48 | 0.73 |
| Overweight & Diabetic | Black | 0.81 | 0.08 | 0.01 | 0.63 | 0.95 |
| Normal weight & Diabetic | Black | 0.56 | 0.11 | 0.01 | 0.35 | 0.79 |
| Obese non-diabetic | Black | 0.39 | 0.03 | 0.00 | 0.33 | 0.44 |
| Overweight non-diabetic | Black | 0. | | | | |
Table 8. Effects of county predictors by diabetes-weight category, model 3.
| Category | Predictor | Posterior Mean | Standard Devn | MC Error | 2.5% | 97.5% |
|---|---|---|---|---|---|---|
| Obese & Diabetic | Poverty Rate | 2. | | | | |
| | Popn Density | − | | | −0.13 | −0.09 |
| | Top decile Black | 0.07 | 0.06 | 0.01 | −0.06 | 0.18 |
| | Top decile Hispanic | − | | | −0.24 | −0.02 |
| | Top decile Other Ethnicity | − | | | −0.17 | 0.01 |
| Overweight & Diabetic | Poverty Rate | 0.58 | 0. | | | |
| | Popn Density | − | | | −0.09 | −0.04 |
| | Top decile Black | − | | | −0.41 | −0.12 |
| | Top decile Hispanic | 0.03 | 0.06 | 0.01 | −0.07 | 0.15 |
| | Top decile Other Ethnicity | − | | | −0.25 | −0.03 |
| Normal weight & Diabetic | Poverty Rate | 0.07 | 0.33 | 0.04 | −0.64 | 0.62 |
| | Popn Density | 0.01 | 0.01 | 0.00 | −0.02 | 0.04 |
| | Top decile Black | − | | | −0.26 | 0.14 |
| | Top decile Hispanic | 0.12 | 0.07 | 0.01 | −0.03 | 0.24 |
| | Top decile Other Ethnicity | − | | | −0.30 | −0.05 |
| Obese non-diabetic | Poverty Rate | 1. | | | | |
| | Popn Density | − | | | −0.12 | −0.06 |
| | Top decile Black | − | | | −0.19 | 0.04 |
| | Top decile Hispanic | − | | | −0.42 | −0.08 |
| | Top decile Other Ethnicity | − | | | −0.25 | −0.09 |
| Overweight non-diabetic | Poverty Rate | 0.15 | 0.17 | 0.02 | −0.12 | 0.52 |
| | Popn Density | − | | | −0.06 | −0.02 |
| | Top decile Black | − | | | −0.15 | 0.00 |
| | Top decile Hispanic | − | | | −0.31 | −0.09 |
| | Top decile Other Ethnicity | − | | | −0.15 | −0.04 |