This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

Typical statistical analysis of epidemiologic data captures uncertainty due to random sampling variation, but ignores more systematic sources of variation such as selection bias, measurement error, and unobserved confounding. Such sources are often only mentioned via qualitative caveats, perhaps under the heading of ‘study limitations.’ Recently, however, there has been considerable interest and advancement in probabilistic methodologies for more integrated statistical analysis. Such techniques hold the promise of replacing a confidence interval reflecting only random sampling variation with an interval reflecting all, or at least more, sources of uncertainty. We survey and appraise the recent literature in this area, giving some prominence to the use of Bayesian statistical methodology.

Much of the methodological literature on inferring exposure-disease relationships from observational data looks, either implicitly or explicitly, at the best-case situation: a random sample from the study population can be obtained, and all pertinent variables can be measured without error on sampled individuals. Commensurately, in real applications it is common to see best-case quantitative methods applied in settings where the best-case assumptions likely, or perhaps surely, fail. This is often ameliorated with some qualitative discussion of how the best-case assumptions might be violated, and some speculation on what impact such violations may have had on the quantitative results which are reported.

Unfortunately, the alliance between the quantitative results based on best-case assumptions and the qualitative comments casting doubt on these assumptions is typically shaky. If the only quantitative results given pertain to the best-case analysis, then these results tend to form the take-away message of the research. However well-intended, caveats and provisos about the realism of best-case assumptions are easily set aside by readers. Plus, even the most diligent readers will not be able to turn the qualitative remarks into clear inferential summaries. For instance, say an estimate and 95% confidence interval are reported for an effect of interest, using a method which assumes best-case conditions. Then qualitative caveats are added about possible discrepancies between the actual sampling scheme and random sampling, possible imperfections in measuring the available variables, and omissions of variables which are possibly germane to the relationship of interest. There may be some intuition about the direction (if any) in which the estimate ought to be shifted given these concerns, but there is seldom clarity on how large a shift is needed. Worse still, there is no recipe to indicate how much wider the confidence interval ought to be in light of the caveats, and it can be hard to generate even rough intuition on this question. Quantitative results based on overly strong assumptions are just not readily synthesized with qualitative remarks on the possible violations of these assumptions.

A situation commonly faced by consumers of research is that the best-case confidence interval for the effect of interest excludes the null value, while qualitative concerns are expressed about the likely divergence of reality from the best case. Given such concerns, there is a literature on various forms of sensitivity analysis, going back to Cornfield’s work on tobacco smoking and lung cancer [

Before proceeding with further discussion of probabilistic sensitivity analysis, we review some of the major ways in which best-case assumptions will often be violated in epidemiological investigations of exposure-disease relationships. We do this in the framework of an exposure variable X, a disease variable Y, and additional covariates C_{1}, . . . , C_{p}.

Implicit in a standard analysis is the assumption that the data are representative of the study population. This is guaranteed for simple random sampling of (

In contrast, case-control studies involve sampling on the outcome, in the sense that the probability of being selected into the study will necessarily depend on

Things go awry if the exposure affects the probability of selection. For example, in case-control studies of magnetic field exposure and childhood leukaemia, Mezei and Kheifets [

In concept selection bias is readily dealt with by modifying a standard analysis,

Many variables of interest in epidemiologic studies are not easily measured. It may be expensive or even technically impossible to measure a particular variable without error. This could apply to an outcome variable, to an exposure variable, or to a confounding variable. There is a particular emphasis in the literature, however, on poor measurement of exposure. This arises in part because many human exposures of a toxicological or nutritional nature are inherently hard to measure well. It also arises in part because parameter estimates from statistical models explaining the conditional distribution of disease outcome given exposure are particularly susceptible to bias as a result of unacknowledged exposure measurement error.

In general, if enough is known about exposure measurement error then the effect of this error can be ‘undone’ statistically. Roughly speaking, this applies if the measured exposure

Unfortunately, many epidemiological settings involve insufficient knowledge of the nature and distribution of exposure measurement error. Thus the problem devolves to one of sensitivity analysis. One can examine the inferences that arise about the desired exposure-disease relationship over a range of assumptions concerning the magnitude of exposure measurement error.

While not emphasized here, it should also be mentioned that poor measurement of confounding variables is also a challenging situation in practice. Intuitively, error in measuring confounders C should result in an analysis which is intermediate between confounder adjustment based on correctly measured C and a crude analysis without confounder adjustment. See Fewell

A further challenge in epidemiology is unobserved confounding. Suppose that U_{1}, . . . , U_{q}

Suppose that we study the distribution of the outcome across levels of exposure, but without taking

The most intuitive approach to dealing with confounding, when the confounding variable is observed, is via stratified analysis. We estimate the exposure effect within levels of

In recent years there has been renewed interest in techniques for sensitivity analysis for unmeasured confounding; see Schneeweiss [

In what follows, Sections 2 through 4 illustrate sensitivity analysis applied to unobserved confounding, selection bias, and misclassification, respectively. Some concluding remarks are given in Section 5. The more technical arguments involved in Sections 2 and 4 appear in the Appendix.

To give some simple illustrations of sensitivity analysis, we focus on a setting which doesn’t involve any observed confounding variables, and the only observed variables are the binary exposure

As an example, consider the data in

The crude odds ratio for these data is 2.75, which, assuming no selection bias or measurement error, we can regard as an estimate of the exposure-disease odds ratio in the study population. Standard methods give a 95% confidence interval around this estimate as (1.66, 4.55).
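The crude odds ratio and its Wald-type confidence interval can be reproduced directly from the 2 × 2 table of counts; a minimal sketch (in Python, though the code accompanying the paper is in R):

```python
import math

# 2x2 table: coffee drinking (>= 1 cup/day) versus pancreatic cancer
a, b = 347, 555   # exposed cases, exposed controls
c, d = 20, 88     # unexposed cases, unexposed controls

or_hat = (a * d) / (b * c)                  # crude odds ratio
se_log = math.sqrt(1/a + 1/b + 1/c + 1/d)   # SE of the log odds ratio (Woolf)
lo = math.exp(math.log(or_hat) - 1.96 * se_log)
hi = math.exp(math.log(or_hat) + 1.96 * se_log)

print(round(or_hat, 2), round(lo, 2), round(hi, 2))  # 2.75 1.66 4.55
```

This recovers the estimate of 2.75 with interval (1.66, 4.55) quoted above.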

Now we focus on what inferences we might draw in the face of concern about unobserved confounding. To wit, suppose we consider the simple case where a binary variable U is a potential unobserved confounder.

Without the ability to observe U, we must make assumptions about the following:

The prevalence of U in the study population.

The extent to which U is associated with the exposure X.

The extent to which U is associated with the disease outcome Y.

How the conditional association between

For convenience, and in line with much of the literature on sensitivity analysis (see, for instance [

Items 1 through 3 are conveniently dealt with together by assuming a relationship of the form

logit Pr(U = 1 | X = x, Y = y) = λ_{u} + λ_{xu}x + λ_{yu}y,

so that (λ_{u}, λ_{xu}, λ_{yu}) serve as bias parameters: λ_{u} governs the prevalence of U, while λ_{xu} and λ_{yu} govern its associations with exposure and disease.

As explained in further detail in the Appendix, the target parameter is the U-adjusted exposure-disease odds ratio OR_{xy}

Thus a basic sensitivity analysis is comprised of assuming values for the three bias parameters (λ_{u}, λ_{xu}, λ_{yu}), and then reporting the resulting adjusted estimate of OR_{xy}

One point of interest under this set-up is that if either λ_{xu} or λ_{yu} equals zero, then U does not act as a confounder, and the adjusted estimate of OR_{xy} coincides with the crude estimate.

Another point of interest is that the labels _{xy}_{u}

Returning to the CPC data, then, a tabular sensitivity analysis (TSA) reports the adjusted estimate of OR_{xy} across a grid of assumed values for (λ_{u}, λ_{xu}, λ_{yu}); we take λ_{u} ∈ {−2, −1} and λ_{xu}, λ_{yu} ∈ {−1, 1}, giving the eight scenarios shown in the table of adjusted estimates.
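The tabulated point estimates can be reproduced by a closed-form multiplicative adjustment of the crude odds ratio: multiply by p_{11}p_{00}/(p_{10}p_{01}), where p_{xy} = expit(λ_{u} + λ_{xu}x + λ_{yu}y). A sketch of the grid computation follows (Python; the confidence-interval computation from the Appendix is omitted):

```python
import math
from itertools import product

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

crude_or = (347 * 88) / (555 * 20)  # 2.75 from the case-control table

def adjusted_or(lam_u, lam_xu, lam_yu):
    # p[x, y] = Pr(U = 1 | X = x, Y = y) under the logistic bias model
    p = {(x, y): expit(lam_u + lam_xu * x + lam_yu * y)
         for x, y in product((0, 1), repeat=2)}
    # closed-form multiplicative adjustment of the crude odds ratio
    factor = (p[1, 1] * p[0, 0]) / (p[1, 0] * p[0, 1])
    return crude_or * factor

for lam_u, lam_xu, lam_yu in product((-2, -1), (-1, 1), (-1, 1)):
    print(lam_u, lam_xu, lam_yu, round(adjusted_or(lam_u, lam_xu, lam_yu), 2))
```

The eight printed values match the tabulated point estimates (e.g., 2.62 for λ = (−2, −1, −1) and 2.16 for λ = (−1, 1, 1)).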

One way to summarize the findings from

An alternative strategy is a Monte Carlo sensitivity analysis (MCSA), which proceeds as follows.

Sample a value for each of (λ_{u}, λ_{xu}, λ_{yu}) from a distribution describing plausible bias-parameter scenarios.

Based on the selected scenario, sample a value of U for each subject in the study.

Based on the resulting set of ‘completed’ data (actual values of X and Y, together with the sampled values of U), compute an adjusted estimate of OR_{xy}.

Repeating these three steps many times produces an ensemble of estimates of OR_{xy}.

While Step 1 simply involves computer-implemented random sampling from the chosen distribution over scenarios, and Step 2 involves such sampling dictated by the assumed model for U given (X, Y), Step 3 amounts to a standard analysis adjusting for the imputed confounder.

We illustrate this method by assigning the following distribution to the bias parameters: λ_{u}, λ_{xu}, and λ_{yu} are drawn independently, each uniformly distributed over a plausible range. The resulting ensemble of OR_{xy} estimates can then be summarized by its 2.5th and 97.5th percentiles.

The use of uniform distributions over bias parameters lends itself to comparison with the TSA, since the support of (λ_{u}, λ_{xu}, λ_{yu}) can be chosen to span the same grid of (λ_{xu}, λ_{yu}) values considered in the tabular analysis.
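The MCSA loop can be sketched as follows (Python). For brevity, the adjusted estimate for each drawn scenario is computed with a closed-form multiplicative adjustment of the crude odds ratio (which reproduces the tabulated TSA point estimates) rather than by imputing U and refitting, and the uniform ranges are illustrative assumptions rather than the paper's exact choices:

```python
import math, random, statistics

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

crude_or = (347 * 88) / (555 * 20)

def adjusted_or(lam_u, lam_xu, lam_yu):
    p = {(x, y): expit(lam_u + lam_xu * x + lam_yu * y)
         for x in (0, 1) for y in (0, 1)}
    return crude_or * (p[1, 1] * p[0, 0]) / (p[1, 0] * p[0, 1])

random.seed(1)
draws = []
for _ in range(5000):
    # Step 1: draw a bias-parameter scenario (illustrative uniform ranges)
    lam_u  = random.uniform(-2.0, 0.0)
    lam_xu = random.uniform(-1.0, 1.0)
    lam_yu = random.uniform(-1.0, 1.0)
    # Steps 2-3 (collapsed): adjusted estimate under that scenario
    draws.append(adjusted_or(lam_u, lam_xu, lam_yu))

draws.sort()
# summarize the ensemble by its 2.5th, 50th and 97.5th percentiles
print(round(draws[124], 2), round(statistics.median(draws), 2), round(draws[4874], 2))
```

The spread of the resulting interval reflects uncertainty about the bias parameters themselves, not just random sampling variation.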

As a more formal step beyond MCSA, we consider Bayesian sensitivity analysis (BSA), in which the prior distribution over bias parameters is combined with the likelihood to yield a posterior distribution for OR_{xy}

In case-control studies, selection bias can result from the manner in which controls are sampled. Conceptually, one can regard the case-control design either as comparing exposure patterns amongst those diseased to exposure patterns amongst the general population, or as comparing exposure patterns amongst those diseased to exposure patterns amongst the disease-free portion of the population. For most applications the disease is very rare in the study population, in which case the distinction between these two formulations becomes unimportant. What matters is that the scheme to select controls yields an exposure prevalence matching that of the target population.

In the CPC example, the cases were hospital patients with pancreatic cancer. The controls were sampled from patients who were hospitalized by the same attending physicians who had hospitalized the cases. MacMahon

In fact, the possibility of selection bias was acknowledged in [

The study generated great controversy because coffee drinking is so common. In later years, researchers tried unsuccessfully to replicate the study findings, and the relationship between coffee drinking and pancreatic cancer has been largely refuted [

We now use this example to illustrate how to adjust for selection bias. In case-control studies, selection bias occurs when the exposure affects the probability of participating in the study. This dependence distorts the exposure distribution among study participants, so that the crude odds ratio is a biased estimate of the population odds ratio. More generally, selection bias results from conditioning on a variable that is affected by both the exposure and the disease. It can also emerge in prospective studies, in which case it is called informative censoring. See Hernán [

As before, let

A simple formula for adjusting for selection bias is given by Greenland [ ]: OR_{XY|S=1} = OR_{XY} × S_{11}S_{00}/(S_{10}S_{01}), where S_{xy} = Pr(S = 1 | X = x, Y = y) is the probability of selection into the study for subjects with exposure status x and disease status y.

In the pancreatic cancer example, the crude odds ratio OR_{XY|S=1} computed from the case-control table is potentially distorted by selection. We model the selection probabilities as S_{xy} = expit(β_{0} + β_{x}x + β_{y}y + β_{xy}xy), so that β_{0} determines the probability of selection among the controls with zero exposure, β_{x} and β_{y} govern how selection depends on exposure and disease status, and β_{xy} allows an interaction. The bias parameters are thus (β_{0}, β_{x}, β_{y}, β_{xy}).

To illustrate bias adjustment in action, we compute bias-corrected odds ratios OR_{XY} under a grid of values for (β_{x}, β_{y}, β_{0}). To generate the table, we lock the parameter β_{xy} at zero, and apply the adjustment formula for each combination of (β_{0}, β_{x}, β_{y}).
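The corrections in the table can be generated mechanically by dividing the crude odds ratio by the selection factor S_{11}S_{00}/(S_{10}S_{01}); a minimal sketch with β_{xy} fixed at zero, as in the table:

```python
import math

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

crude_or = (347 * 88) / (555 * 20)  # 2.75

def corrected_or(b0, by, bx, bxy=0.0):
    # S[x, y] = Pr(S = 1 | X = x, Y = y) under the logistic selection model
    S = {(x, y): expit(b0 + bx * x + by * y + bxy * x * y)
         for x in (0, 1) for y in (0, 1)}
    bias_factor = (S[1, 1] * S[0, 0]) / (S[1, 0] * S[0, 1])
    return crude_or / bias_factor   # observed OR = true OR x bias factor

for b0 in (-5, -3):
    for by in (0, 1):
        for bx in (-1, 0, 1):
            print(b0, by, bx, round(corrected_or(b0, by, bx), 2))
```

Note that whenever β_{x} = 0 or β_{y} = 0 (with β_{xy} = 0) the factor is exactly one, so the corrected odds ratio equals the crude 2.75, as the table shows.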

As was the case with unmeasured confounding, the bias parameters (β_{0}, β_{x}, β_{y}) are not identified by the data, and in the table we see that distinct combinations of (β_{0}, β_{x}, β_{y}) can yield identical corrected odds ratios.

In the table we take β_{x} ∈ {−1, 0, 1} and β_{y} ∈ {0, 1}, with β_{0} equal to −3 or −5 to reflect that non-cases in the population have very low probability of being included as controls in the study, with probabilities expit(−3) = 5% or expit(−5) < 1%.

In _{x}_{y}_{x}_{y}

The significant association between coffee drinking and cancer is very robust to different assumptions about selection bias. For example, the bias parameter combination (_{0}, _{y}_{x}

Taking the analysis a step further, we can conduct MCSA by assigning probability distributions to the bias parameters (β_{0}, β_{x}, β_{y}, β_{xy}). We assign β_{0} a normal prior chosen so that the probability of selection for unexposed controls lies between 0.1% and 5% with probability 95%. We assign β_{y} a normal prior to indicate that cases have roughly +8 greater log-odds of being selected into the study. This seems reasonable because presumably nearly all pancreatic cancer cases are recruited into the study, whereas disease-free individuals constitute the bulk of the study population. Finally we assign β_{x} a normal prior, determined based on the 2 × 2 table presented in Table 4 of Rosenberg

Paralleling the discussion of MCSA in Section 2, we repeatedly implement the following three-step procedure described by Greenland [

Draw a value of (β_{0}, β_{x}, β_{y}, β_{xy}) from the assigned prior distribution.

Compute the inverse probabilities of selection, {expit(β_{0} + β_{x}X + β_{y}Y + β_{xy}XY)}^{−1}, as weights for fitting a weighted logistic regression of Y on X.

Based on the fit, use the estimated coefficient for X as the bias-corrected estimate of log OR_{XY}.

By repeating the steps a large number of times, an ensemble of values for log OR_{XY} is generated, and percentiles of this ensemble summarize the resulting uncertainty about OR_{XY}.
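These steps can be sketched in simplified form (Python). For brevity, the bias-corrected odds ratio for each prior draw is obtained from the closed-form selection factor rather than a weighted logistic regression, β_{xy} is fixed at zero, and the normal prior means and standard deviations shown are illustrative stand-ins; in particular the β_{x} prior below is hypothetical, whereas the paper derives its own from the Rosenberg data:

```python
import math, random, statistics

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

crude_or = (347 * 88) / (555 * 20)

def corrected_or(b0, bx, by):
    S = {(x, y): expit(b0 + bx * x + by * y)
         for x in (0, 1) for y in (0, 1)}
    return crude_or / ((S[1, 1] * S[0, 0]) / (S[1, 0] * S[0, 1]))

random.seed(1)
draws = []
for _ in range(5000):
    # Step 1: draw bias parameters from (illustrative) priors
    b0 = random.gauss(-4.9, 1.0)  # selection prob. for unexposed controls ~0.1%-5%
    by = random.gauss(8.0, 1.0)   # cases ~+8 log-odds more likely to be selected
    bx = random.gauss(0.0, 0.5)   # hypothetical prior on exposure effect on selection
    # Steps 2-3 (collapsed): bias-corrected odds ratio under that draw
    draws.append(corrected_or(b0, bx, by))

draws.sort()
print(round(draws[124], 2), round(statistics.median(draws), 2), round(draws[4874], 2))
```

Because the factor equals one whenever β_{x} = 0, the ensemble is centered near the crude estimate, with its spread reflecting prior uncertainty about how selection depends on exposure.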

Based on these results we arrive at several conclusions. Plausible priors on the bias parameters (β_{0}, β_{x}, β_{y}, β_{xy}

It is difficult to know how to interpret these results. Selection bias alone does not appear to be driving the association in

We conclude with a discussion of a full Bayesian treatment of selection bias. Interestingly, unlike other biases, selection bias adjustment does not have a natural Bayesian interpretation. The reason is that the bias adjustment formula in

In the coffee and pancreatic cancer example, exposure status is obtained by asking study participants about coffee consumption via a questionnaire. Given that exposure is defined in this study as any non-zero level of daily consumption, one might anticipate that the answers provided by participants are quite accurate in this setting. The situation would be muddied, however, for participants whose consumption pattern may have changed over time. More generally, concern about accurate measurement or classification of exposure abounds in epidemiology. Typically this falls under the heading of exposure measurement error when the exposure variable is continuous, and exposure misclassification when the exposure variable is categorical (with absent/present being an important special case).

To give an illustration of how we might conduct an analysis which acknowledges exposure misclassification, say the relationship of interest is

To focus on a simple case for illustration, we consider a model in which the binary surrogate W is related to the true exposure X via logit Pr(W = 1 | X = x) = λ_{w} + λ_{xw}x, so that (λ_{w}, λ_{xw}) play the role of bias parameters describing the quality of exposure classification.

Note that the present framework is somewhat parallel to our treatment of an unobserved confounder in Section 2, with two regression relationships at play. That is, (5), like (2) before, governs the relationship of scientific interest, while (6), like (1) before, governs the link between observed and unobserved variables. Moreover, just as before, (5) and (6) can be subsumed by a Poisson model for the counts of subjects stratified by (

Despite the appealing similarity, it is more challenging to provide some form of probabilistic sensitivity analysis in the present setting than in the Section 2 setting. The challenge arises via the fact that the distribution of the unobserved variable X, given the observed data, depends not only on the bias parameters (λ_{w}, λ_{xw}) but also on the target odds ratio OR_{xy}

With case-control data and nondifferential misclassification, we can focus on the four-parameter model studied in detail in Gustafson, Le and Saskin [ ]: the parameters are the exposure prevalences r_{0} and r_{1} amongst controls and cases, along with the sensitivity SN and specificity SP of the exposure classification, with the target log odds-ratio given by logit(r_{1}) − logit(r_{0}).

To explore this further, the 95% central interval above translates to (0.67, 4.44) on the log odds-ratio, _{xy}

In fact, even in the simpler setting of known (SN, SP)
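When SN and SP are taken as known, the classical matrix-method correction inverts the misclassification: r = (p + SP − 1)/(SN + SP − 1), where p is the apparent exposure prevalence. A sketch applying this to the coffee data under illustrative assumed values SN = SP = 0.99:

```python
def corrected_prevalence(p_apparent, SN, SP):
    # invert p = r*SN + (1 - r)*(1 - SP) for the true prevalence r
    return (p_apparent + SP - 1) / (SN + SP - 1)

def odds(p):
    return p / (1 - p)

# apparent exposure prevalences from the case-control table
p1 = 347 / 367   # cases classified as coffee drinkers
p0 = 555 / 643   # controls classified as coffee drinkers

SN = SP = 0.99   # assumed known classification probabilities (illustrative)
r1 = corrected_prevalence(p1, SN, SP)
r0 = corrected_prevalence(p0, SN, SP)

or_corrected = odds(r1) / odds(r0)
print(round(or_corrected, 2))  # exceeds the crude 2.75: attenuation undone
```

The corrected odds ratio exceeds the crude 2.75, consistent with nondifferential misclassification attenuating associations toward the null; the correction becomes unstable as the apparent prevalence approaches SN or 1 − SP.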

A particular point of interest in this example is comparison of the prior and posterior distributions on the bias parameters, _{xy}

As a second point, the fact that substantial prior-to-posterior updating of a bias parameter can be seen, at least for some datasets, speaks against the applicability of TSA or MCSA in such settings. With TSA or MCSA, there is no attempt to access any information

Probabilistic sensitivity analysis is a suite of relatively new methods to help epidemiologists deal with bias in observational studies. Such methods permit shifting odds ratios away from or toward the null value, based on beliefs about biasing mechanisms. Moreover, they give more plausible quantifications of uncertainty, as they incorporate uncertainty about the magnitude and direction of bias. For example, in the CPC data example, the actual amount of selection bias is debatable: it could be large or small depending on assumptions about the probabilities of selection. Prior distributions on the bias parameters incorporate this uncertainty into the analysis.

The methods are particularly useful in settings where confidence intervals around effect estimates are narrow, e.g., in meta-analyses or in studies using large population databases. In such settings random sampling variation contributes only a small fraction of the total uncertainty at play, so probabilistic bias analysis gives a more honest assessment of the major sources of uncertainty.

In the CPC example, we illustrated that the association between coffee drinking and cancer is robust to different assumptions about selection bias. These findings are surprising in light of the fact that the study results were never successfully replicated. One possible interpretation is that there were multiple biases involved in the data collection process. Indeed the simultaneous modeling of

One outstanding question is determining why exactly the odds ratio in

In the unobserved confounding setting, let n_{xyu} denote the number of study subjects with X = x, Y = y, and U = u.

Turning to the real problem at hand with

In the present setting, for each level of (x, y), only n_{xy•} = n_{xy0} + n_{xy1} is observed, while (n_{xy0}, n_{xy1}) = (n_{xy•} − n_{xy1}, n_{xy1}) are unobserved. As a general technical point about fitting models involving unobserved data, the conditional distribution of n_{xy1} given n_{xy•} is binomial, with expectation n_{xy•} expit(λ_{u} + λ_{xu}x + λ_{yu}y).

Now we consider the MCSA scheme. We formally denote the missing and observed data as n_{mis} = (n_{001}, n_{011}, n_{101}, n_{111}) and n_{obs} = (n_{00•}, n_{01•}, n_{10•}, n_{11•}), noting that the complete data n_{com} = (n_{xyu}) are determined by (n_{mis}, n_{obs}). Step 2 of the MCSA draws n_{mis} by sampling each n_{ij1} given n_{ij•} from its binomial distribution, thereby completing n_{com}.

In contrast, under a fully Bayesian analysis the posterior density is proportional to the product of the prior distribution on the bias parameters and the likelihood of the complete data (n_{xyu}), with the missing counts n_{mis} handled via integration, or, computationally, via weighting of the MCSA draws.

The weighted sample represents the BSA results just as the unweighted sample represents the MCSA results. For instance, the Bayesian 95% interval for the target parameter OR_{xy} is obtained from weighted percentiles of the ensemble of OR_{xy} values.

The R code we developed to implement the TSA, MCSA, and BSA analyses is posted online (

For a case-control setting with binary exposure and nondifferential misclassification, the sampling model can be expressed via the numbers of controls and cases classified as exposed, given the total numbers of controls and cases, as being independent Binomial realizations. The ‘success’ probabilities are given by

```
model {
  wtot0 ~ dbin(p0, n0)
  wtot1 ~ dbin(p1, n1)
  p0 <- r0*SN + (1-r0)*(1-SP)
  p1 <- r1*SN + (1-r1)*(1-SP)
  lor <- logit(r1) - logit(r0)
  r0 ~ dunif(0, 1)
  r1 ~ dunif(0, 1)
  SN ~ dbeta(a.sn, b.sn)
  SP ~ dbeta(a.sp, b.sp)
}
```

This code requires input of the dataset in the form of wtot0 out of n0 controls and wtot1 out of n1 cases having apparent exposure W = 1, along with values for the Beta prior parameters (a.sn, b.sn) and (a.sp, b.sp) governing sensitivity and specificity.
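The deterministic part of this model is easy to mirror outside BUGS; a small Python sketch of how true prevalences map to apparent prevalences, using illustrative parameter values rather than estimates from the study:

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def apparent_prevalence(r, SN, SP):
    # Pr(W = 1) = Pr(X = 1)*sensitivity + Pr(X = 0)*(1 - specificity)
    return r * SN + (1 - r) * (1 - SP)

# illustrative values: control/case exposure prevalences and classification quality
r0, r1, SN, SP = 0.2, 0.4, 0.9, 0.95
p0 = apparent_prevalence(r0, SN, SP)   # 0.22
p1 = apparent_prevalence(r1, SN, SP)   # 0.39
lor = logit(r1) - logit(r0)            # target log odds-ratio on the true scale

print(p0, p1, round(math.exp(lor), 2))
```

Note that the apparent odds ratio computed from (p0, p1) is closer to the null than exp(lor), illustrating the attenuation that the Bayesian model undoes.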

Work supported by grants from the Natural Sciences and Engineering Research Council of Canada and the Canadian Institutes of Health Research (Funding Reference Number 62863).

Posterior distributions of the target parameter OR_{xy}.

Case-control data for coffee drinking and pancreatic cancer.

| Coffee drinking (cups per day) | Cases | Controls | Total |
|---|---|---|---|
| ≥ 1 | 347 | 555 | 902 |
| 0 | 20 | 88 | 108 |
| Total | 367 | 643 | 1,010 |

Point estimate and 95% confidence interval for the target parameter under different values of the bias parameters. Recall that under the assumption of no unobserved confounding the estimated odds ratio is 2.75, with a 95% confidence interval of (1.66, 4.55). The method for determining the confidence intervals is described in the Appendix.

| λ_{U} | λ_{XU} | λ_{YU} | exp(β_{XY}) | 95% CI |
|---|---|---|---|---|
| −2 | −1 | −1 | 2.62 | (1.58, 4.34) |
| −2 | −1 | 1 | 3.06 | (1.85, 5.07) |
| −2 | 1 | −1 | 3.06 | (1.85, 5.07) |
| −2 | 1 | 1 | 2.27 | (1.37, 3.75) |
| −1 | −1 | −1 | 2.47 | (1.49, 4.09) |
| −1 | −1 | 1 | 3.34 | (2.02, 5.52) |
| −1 | 1 | −1 | 3.34 | (2.02, 5.52) |
| −1 | 1 | 1 | 2.16 | (1.31, 3.58) |

Bias corrected odds ratios OR_{XY} under different values of the bias parameters (β_{0}, β_{y}, β_{x}), with β_{xy} = 0.

| β_{0} | β_{y} | β_{x} | OR_{XY} | 95% CI |
|---|---|---|---|---|
| −5 | 0 | −1 | 2.75 | (1.66, 4.55) |
| −5 | 0 | 0 | 2.75 | (1.66, 4.55) |
| −5 | 0 | 1 | 2.75 | (1.66, 4.55) |
| −5 | 1 | −1 | 2.73 | (1.65, 4.52) |
| −5 | 1 | 0 | 2.75 | (1.66, 4.55) |
| −5 | 1 | 1 | 2.80 | (1.69, 4.64) |
| −3 | 0 | −1 | 2.75 | (1.66, 4.55) |
| −3 | 0 | 0 | 2.75 | (1.66, 4.55) |
| −3 | 0 | 1 | 2.75 | (1.66, 4.55) |
| −3 | 1 | −1 | 2.62 | (1.58, 4.34) |
| −3 | 1 | 0 | 2.75 | (1.66, 4.55) |
| −3 | 1 | 1 | 3.06 | (1.85, 5.07) |