# Quantifying and Adjusting for Disease Misclassification Due to Loss to Follow-Up in Historical Cohort Mortality Studies


*Int. J. Environ. Res. Public Health* **2015**, *12*(10), 12834-12846; https://doi.org/10.3390/ijerph121012834

## Abstract

[…] (OR_{DM-LTF}) that compared the mortality of workers with the highest cumulative exposure to those that were considered never-exposed. The geometric mean OR_{DM-LTF} ranged between 1.65 (certainty interval (CI): 0.50–3.88) and 3.33 (CI: 1.21–10.48), and the geometric mean of the disease-misclassification error factor (ε_{DM-LTF}), which is the ratio of the observed odds ratio to the adjusted odds ratio, had a range of 0.91 (CI: 0.29–2.52) to 1.85 (CI: 0.78–6.07). Only when workers in the highest exposure category were more likely than those never-exposed to be misclassified as non-cases did the OR_{DM-LTF} frequency distributions shift further away from the null. The application of uncertainty analysis to historical cohort mortality studies with multi-level exposures can provide valuable insight into the magnitude and direction of study error resulting from losses to follow-up.

## 1. Introduction

## 2. Methods

#### 2.1. Error Term for Disease Misclassification Due to Losses

OR_{DM-LTF} is the odds ratio adjusted for disease misclassification due to loss to follow-up, OR_{observed} is the observed crude odds ratio, and ε_{i} are the terms that quantify the systematic error in a study. Because only one error is evaluated in this manuscript, the denominator has been simplified to ε_{DM-LTF}, the error term for disease misclassification due to loss to follow-up. ε_{DM-LTF} is calculated by taking the ratio of the observed odds ratio to the adjusted odds ratio.
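The single-error relationship described in this section can be written out as a short worked equation. This is a reconstruction from the verbal definitions only (the ratio of observed to adjusted odds ratio, with ε_{DM-LTF} as the denominator), not a copy of the authors' typeset formula; the numerical values are the crude odds ratio of 3.05 and the Scenario 1 geometric mean adjusted odds ratio of 1.65 reported elsewhere in the article.

$$
\mathrm{OR}_{\text{DM-LTF}} = \frac{\mathrm{OR}_{\text{observed}}}{\varepsilon_{\text{DM-LTF}}}
\quad\Longleftrightarrow\quad
\varepsilon_{\text{DM-LTF}} = \frac{\mathrm{OR}_{\text{observed}}}{\mathrm{OR}_{\text{DM-LTF}}},
\qquad \text{e.g.}\quad \frac{3.05}{1.65} \approx 1.85
$$

An error factor above 1 therefore indicates that the observed odds ratio is larger than the adjusted one, and a factor below 1 indicates the reverse.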

#### 2.2. Crude Odds Ratio

**Table 1.** Cell counts used to estimate the crude odds ratio and 95% confidence limits for the association between occupational TCDD exposure and ischemic heart disease using data reported by McBride et al. [13].

Outcome | TCDD Exposure: ≥2085.8 ppt-mo | TCDD Exposure: 0–2085.7 ppt-mo | TCDD Exposure: Never-Exposed |
---|---|---|---|
IHD Cases | 14 | 47 | 14 |
Non-cases (total) | 148 | 925 | 451 |
Non-cases: Alive | 112 | 826 | 414 |
Non-cases: Deceased ^{a} | 36 | 99 | 37 |

^{a} From causes of death other than IHD.
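The crude odds ratio can be reproduced directly from the Table 1 cell counts for the highest exposure category and the never-exposed group. The sketch below is illustrative: the function name is hypothetical, and the Wald (log-scale) interval is an assumption, since the method used for the 95% confidence limits is not stated in this section.

```python
import math

def odds_ratio_with_wald_ci(cases_exp, noncases_exp, cases_ref, noncases_ref, z=1.96):
    """Return (OR, lower, upper) for a 2x2 table.

    The interval is a Wald interval on the log-odds-ratio scale, which is an
    assumption made for this sketch rather than the article's stated method.
    """
    odds_ratio = (cases_exp * noncases_ref) / (noncases_exp * cases_ref)
    se_log_or = math.sqrt(
        1 / cases_exp + 1 / noncases_exp + 1 / cases_ref + 1 / noncases_ref
    )
    log_or = math.log(odds_ratio)
    return (
        odds_ratio,
        math.exp(log_or - z * se_log_or),
        math.exp(log_or + z * se_log_or),
    )

# Table 1 counts: >=2085.8 ppt-mo (14 cases, 148 non-cases)
# versus never-exposed (14 cases, 451 non-cases).
crude_or, lower, upper = odds_ratio_with_wald_ci(14, 148, 14, 451)
print(f"crude OR = {crude_or:.2f}")  # crude OR = 3.05
```

The point estimate agrees with the crude odds ratio of 3.05 quoted in the Figure 3 caption.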

#### 2.3. Number of All-Cause Deaths among Losses to Follow-Up

**Table 2.** Bias-analysis scenarios: description of probability distributions for classification parameters used to estimate the number of workers lost to follow-up that could have died from IHD and corresponding geometric mean (GM) errors (ε_{DM-LTF}), adjusted odds ratios (OR_{DM-LTF}) and 95% bias-analysis certainty intervals.

Scenario | Total All-Cause Deaths: Distribution (Parameters) | Total IHD Deaths: Distribution (Parameters) | IHD Deaths by Exposure Status: Direction of Misclassification | IHD Deaths by Exposure Status: Distribution (Parameters), Never-Exposed | IHD Deaths by Exposure Status: Distribution (Parameters), ≥2085.8 ppt TCDD-mo | ε_{DM-LTF}: GM | ε_{DM-LTF}: 95% Certainty Interval | OR_{DM-LTF}: GM | OR_{DM-LTF}: 95% Certainty Interval |
---|---|---|---|---|---|---|---|---|---|
1 | Negative Binomial (0.02, 3)^{a} | BetaPERT (0, 0.204 ^{b} × AD, AD)^{c} | Differential A ^{d} | BetaPERT (0, 3/4 × ID, ID) | BetaPERT (0, 1/2 × IDE, IDE) ^{e} | 1.85 | 0.78–6.07 | 1.65 | 0.50–3.88 |
3 | Negative Binomial (0.02, 3)^{a} | BetaPERT (0, 0.139 × AD, AD)^{c} | Differential A | BetaPERT (0, 3/4 × ID, ID) | BetaPERT (0, 1/2 × IDE, IDE) | 1.62 | 0.74–4.93 | 1.88 | 0.62–4.11 |
5 | Negative Binomial (0.027, 2)^{f} | BetaPERT (0, 0.204 × AD, AD) | Differential A | BetaPERT (0, 3/4 × ID, ID) | BetaPERT (0, 1/2 × IDE, IDE) | 1.50 | 0.87–3.92 | 2.03 | 0.78–3.51 |
7 | Negative Binomial (0.027, 2)^{f} | BetaPERT (0, 0.139 × AD, AD) | Differential A | BetaPERT (0, 3/4 × ID, ID) | BetaPERT (0, 1/2 × IDE, IDE) | 1.43 | 0.87–3.65 | 2.13 | 0.83–3.50 |
2 | Negative Binomial (0.02, 3)^{a} | BetaPERT (0, 0.204 × AD, AD) | Differential B ^{g} | BetaPERT (0, 1/4 × ID, ID) | BetaPERT (0, 1/2 × IDE, IDE) | 0.91 | 0.29–2.52 | 3.33 | 1.21–10.48 |
4 | Negative Binomial (0.02, 3)^{a} | BetaPERT (0, 0.139 × AD, AD) | Differential B | BetaPERT (0, 1/4 × ID, ID) | BetaPERT (0, 1/2 × IDE, IDE) | 0.92 | 0.32–2.37 | 3.31 | 1.29–9.65 |
6 | Negative Binomial (0.027, 2)^{f} | BetaPERT (0, 0.204 × AD, AD) | Differential B | BetaPERT (0, 1/4 × ID, ID) | BetaPERT (0, 1/2 × IDE, IDE) | 0.95 | 0.42–1.98 | 3.20 | 1.54–7.26 |
8 | Negative Binomial (0.027, 2)^{f} | BetaPERT (0, 0.139 × AD, AD) | Differential B | BetaPERT (0, 1/4 × ID, ID) | BetaPERT (0, 1/2 × IDE, IDE) | 0.96 | 0.46–1.86 | 3.18 | 1.64–6.64 |

^{a} Negative binomial distribution (probability, shape): probability and shape were determined based on minimum, likeliest and maximum counts of (0, 104, 338);
^{b} BetaPERT distribution (minimum, likeliest, maximum);
^{c} 0.204: proportion of all-cause deaths due to IHD among non-Maori males; 0.139: proportion of all-cause deaths due to IHD among Maori females;
^{d} Never-exposed more likely to be misclassified as alive than highest exposed;
^{e} The maximum value for this distribution is capped at 112, which is the number of individuals in the highest exposure group (i.e., ≥2085.8 ppt-mo) that were classified as living non-cases;
^{f} Negative binomial distribution (probability, shape): probability and shape were determined based on minimum, likeliest and maximum counts of (0, 37, 338);
^{g} Never-exposed less likely to be misclassified as alive than highest exposed.

**Figure 1.** Flow diagram describing how losses to follow-up in McBride et al. [13] could result in outcome misclassification. Lighter shapes with bolded text indicate the parameters that were specified in our bias analysis.
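One way to read the misclassification pathway is as a reclassification of Table 1 cell counts: a worker lost to follow-up who actually died of IHD is moved from the "alive non-case" cell to the "IHD case" cell of the same exposure group. The sketch below assumes exactly that mechanism; the function name and the counts passed in the example call (5 and 20) are hypothetical and are not values from the article.

```python
# Table 1 counts for the two groups compared in the odds ratio.
HIGH = {"cases": 14, "noncases": 148, "alive": 112}    # >=2085.8 ppt-mo
NEVER = {"cases": 14, "noncases": 451, "alive": 414}   # never-exposed

def adjusted_odds_ratio(ihd_deaths_high, ihd_deaths_never):
    """Odds ratio after moving hypothesised IHD deaths among losses to
    follow-up from 'alive non-case' to 'IHD case' in each exposure group.

    Assumption of this sketch: only workers recorded as alive non-cases
    can be reclassified, so each count is bounded by the 'alive' cell.
    """
    if not 0 <= ihd_deaths_high <= HIGH["alive"]:
        raise ValueError("ihd_deaths_high must lie between 0 and 112")
    if not 0 <= ihd_deaths_never <= NEVER["alive"]:
        raise ValueError("ihd_deaths_never must lie between 0 and 414")
    cases_high = HIGH["cases"] + ihd_deaths_high
    noncases_high = HIGH["noncases"] - ihd_deaths_high
    cases_never = NEVER["cases"] + ihd_deaths_never
    noncases_never = NEVER["noncases"] - ihd_deaths_never
    return (cases_high * noncases_never) / (noncases_high * cases_never)

or_observed = adjusted_odds_ratio(0, 0)     # no reclassification: crude OR
or_adjusted = adjusted_odds_ratio(5, 20)    # hypothetical counts, illustration only
error_factor = or_observed / or_adjusted    # epsilon_DM-LTF for this single draw
print(f"observed {or_observed:.2f}, adjusted {or_adjusted:.2f}, error {error_factor:.2f}")
```

In this hypothetical draw more never-exposed than highest-exposed workers are reclassified, so the adjusted odds ratio moves toward the null and the error factor exceeds 1, the pattern described for the Differential A scenarios.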

#### 2.4. Number of Total IHD Deaths among Losses to Follow-Up

#### 2.5. Number of IHD Deaths among Losses to Follow-Up: Never-Exposed

#### 2.6. Number of IHD Deaths among Losses to Follow-Up: ≥2085.8 ppt TCDD-mo

#### 2.7. Scenarios and Monte Carlo Simulation Methods

[…] OR_{DM-LTF} and ε_{DM-LTF} as well as 95% certainty intervals. Under specific conditions, a 95% certainty interval may approximate a 95% Bayesian posterior probability interval, such that there is a 95% chance that the true value for the sample population will fall within the interval [17,18,19]. This interpretation is different from that of a 95% confidence interval, which is constructed by a procedure that, over repeated sampling, yields a range of values that includes the true parameter value 95% of the time.
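The summary step can be sketched as a small Monte Carlo loop. This is a simplified illustration, not the authors' Crystal Ball model: the maxima `ID_NEVER_MAX` and `ID_HIGH_MAX` are hypothetical fixed values (in Table 2 they are themselves drawn from upstream distributions), the BetaPERT sampler uses the common parameterisation with shape weight 4, counts are left continuous, and the 95% certainty interval is assumed to be the 2.5th and 97.5th percentiles of the simulated values.

```python
import random
import statistics

def sample_betapert(minimum, likeliest, maximum, rng):
    """Draw from a BetaPERT(min, likeliest, max) via a rescaled beta variate
    (standard PERT parameterisation with weight 4 on the likeliest value)."""
    span = maximum - minimum
    alpha = 1 + 4 * (likeliest - minimum) / span
    beta = 1 + 4 * (maximum - likeliest) / span
    return minimum + span * rng.betavariate(alpha, beta)

def summarise(draws):
    """Geometric mean plus 2.5th and 97.5th percentiles of simulated values."""
    cuts = statistics.quantiles(draws, n=40, method="inclusive")  # 2.5% steps
    return statistics.geometric_mean(draws), cuts[0], cuts[-1]

rng = random.Random(2015)                 # fixed seed so the sketch is repeatable
OR_OBSERVED = (14 * 451) / (148 * 14)     # crude odds ratio from Table 1
ID_NEVER_MAX, ID_HIGH_MAX = 30, 10        # hypothetical maxima, illustration only

adjusted, errors = [], []
for _ in range(10_000):
    # IHD deaths among losses: never-exposed, then highest-exposed group.
    x = sample_betapert(0, 0.75 * ID_NEVER_MAX, ID_NEVER_MAX, rng)
    y = sample_betapert(0, 0.5 * ID_HIGH_MAX, ID_HIGH_MAX, rng)
    # Move the hypothesised deaths from non-cases to cases in each group.
    or_adj = ((14 + y) * (451 - x)) / ((148 - y) * (14 + x))
    adjusted.append(or_adj)
    errors.append(OR_OBSERVED / or_adj)

gm_or, lo_or, hi_or = summarise(adjusted)
gm_err, lo_err, hi_err = summarise(errors)
print(f"OR_DM-LTF GM {gm_or:.2f} ({lo_or:.2f}-{hi_or:.2f})")
print(f"error GM {gm_err:.2f} ({lo_err:.2f}-{hi_err:.2f})")
```

Because the observed odds ratio is a constant, the geometric mean of the error factor times the geometric mean of the adjusted odds ratio equals the observed odds ratio, which is why the paired geometric means in Table 2 multiply to roughly 3.05.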

## 3. Results

[…] (ε_{DM-LTF}) had a range of 0.91 to 1.85. The geometric mean adjusted odds ratio (OR_{DM-LTF}) ranged between 1.65 and 3.33. Estimated certainty intervals (CI) for the geometric mean OR_{DM-LTF} excluded the null for all four scenarios in which those categorized as “never-exposed” were less likely to be misclassified as alive than workers in the highest exposure category.
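The statement about which certainty intervals exclude the null can be checked against the limits reported in Table 2. The snippet below is only a bookkeeping sketch; the dictionary layout and variable names are illustrative choices.

```python
# Scenario -> (direction of misclassification, lower limit, upper limit)
# for OR_DM-LTF, copied from Table 2.
OR_INTERVALS = {
    1: ("A", 0.50, 3.88), 3: ("A", 0.62, 4.11),
    5: ("A", 0.78, 3.51), 7: ("A", 0.83, 3.50),
    2: ("B", 1.21, 10.48), 4: ("B", 1.29, 9.65),
    6: ("B", 1.54, 7.26), 8: ("B", 1.64, 6.64),
}

# An interval excludes the null when it lies entirely above or below 1.
excludes_null = sorted(
    scenario
    for scenario, (_, lower, upper) in OR_INTERVALS.items()
    if lower > 1 or upper < 1
)
print(excludes_null)  # [2, 4, 6, 8]
```

All four scenarios returned are Differential B scenarios, in which the never-exposed were less likely to be misclassified as alive than the highest exposed.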

**Figure 3.** Geometric mean errors (ε_{DM-LTF}) (**a**), adjusted odds ratios (OR_{DM-LTF}) (**b**) and 95% certainty intervals by scenario. The dashed horizontal black line in (**b**) indicates the crude odds ratio (OR_{observed}) of 3.05. In the Differential A scenarios, the “never-exposed” were more likely to be misclassified as alive than the highest exposed. In the Differential B scenarios, the “never-exposed” were less likely to be misclassified as alive than the highest exposed.

[…] OR_{DM-LTF} frequency distributions toward the null, lessening the observed effect for the exposure-disease relationship (Figure 4).

## 4. Discussion

[…] OR_{DM-LTF} in the simulations for this bias analysis were quite wide, with the distance between the upper and lower bounds ranging from 2.67 to 9.27. While the exposure parameters (“IHD deaths: never-exposed” and “IHD deaths: ≥2085.8 ppt-mo”) were the main determinants of the location of the geometric mean adjusted odds ratios, the width of the certainty intervals was likely influenced more by the degree of misclassification.
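The quoted range of interval widths follows from simple subtraction of the OR_{DM-LTF} certainty limits in Table 2, as the short check below illustrates (variable names are illustrative).

```python
# Scenario -> (lower, upper) 95% certainty limits for OR_DM-LTF, from Table 2.
OR_LIMITS = {
    1: (0.50, 3.88), 3: (0.62, 4.11), 5: (0.78, 3.51), 7: (0.83, 3.50),
    2: (1.21, 10.48), 4: (1.29, 9.65), 6: (1.54, 7.26), 8: (1.64, 6.64),
}

# Width of each interval, rounded to the two decimals used in the table.
widths = {s: round(upper - lower, 2) for s, (lower, upper) in OR_LIMITS.items()}
narrowest = min(widths, key=widths.get)
widest = max(widths, key=widths.get)
print(narrowest, widths[narrowest])  # 7 2.67
print(widest, widths[widest])        # 2 9.27
```

The narrowest interval belongs to Scenario 7 and the widest to Scenario 2.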

## 5. Conclusions

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## References

1. Vena, J.E.; Sultz, H.A.; Carlo, G.L.; Fiedler, R.C.; Barnes, R.E. Sources of bias in retrospective cohort mortality studies: A note on treatment of subjects lost to follow-up. J. Occup. Med. **1987**, 29, 256–261.
2. Swaen, G.M.; Meijers, J.M. Influence of design characteristics on the outcome of retrospective cohort studies. Brit. J. Ind. Med. **1988**, 45, 624–629.
3. Savitz, D.A.; Moure, R. Treatment of subjects lost to follow-up: Effect on oil refinery cancer risks. J. Occup. Med. **1988**, 30, 89–91.
4. Checkoway, H.; Pearce, N.; Kriebel, D. Monographs in epidemiology and biostatistics. In Research Methods in Occupational Epidemiology, 2nd ed.; Oxford University Press: New York, NY, USA, 2004; p. 372.
5. Fox, M.P.; Lash, T.L.; Greenland, S. A method to automate probabilistic sensitivity analyses of misclassified binary variables. Int. J. Epidemiol. **2005**, 34, 1370–1376.
6. Jurek, A.M.; Lash, T.L.; Maldonado, G. Specifying exposure classification parameters for sensitivity analysis: Family breast cancer history. Clin. Epidemiol. **2009**, 1, 109–117.
7. Lash, T.L.; Fink, A.K. Semi-automated sensitivity analysis to assess systematic errors in observational data. Epidemiology **2003**, 14, 451–458.
8. Lash, T.L.; Fox, M.P.; Fink, A.K. Statistics for biology and health. In Applying Quantitative Bias Analysis to Epidemiologic Data; Springer: Dordrecht, The Netherlands; New York, NY, USA, 2009; p. 192.
9. Jurek, A.M.; Greenland, S. Adjusting for multiple-misclassified variables in a study using birth certificates. Ann. Epidemiol. **2013**, 23, 515–520.
10. Jurek, A.M.; Maldonado, G.; Greenland, S. Adjusting for outcome misclassification: The importance of accounting for case-control sampling and other forms of outcome-related selection. Ann. Epidemiol. **2013**, 23, 129–135.
11. Jurek, A.M.; Maldonado, G.; Spector, L.G.; Ross, J.A. Periconceptional maternal vitamin supplementation and childhood leukaemia: An uncertainty analysis. J. Epidemiol. Community Health **2009**, 63, 168–172.
12. Maldonado, G. Adjusting a relative-risk estimate for study imperfections. J. Epidemiol. Community Health **2008**, 62, 655–663.
13. McBride, D.I.; Collins, J.J.; Humphry, N.F.; Herbison, P.; Bodner, K.M.; Aylward, L.L.; Burns, C.J.; Wilken, M. Mortality in workers exposed to 2,3,7,8-tetrachlorodibenzo-p-dioxin at a trichlorophenol plant in New Zealand. J. Occup. Environ. Med. **2009**, 51, 1049–1056.
14. Mortality and Demographic Data 2008; New Zealand Ministry of Health: Wellington, New Zealand, 2011.
15. Vose, D. Risk Analysis: A Quantitative Guide, 2nd ed.; Wiley: Chichester, UK; New York, NY, USA, 2000; p. 418.
16. Crystal Ball; Oracle Corporation: Redwood Shores, CA, USA, 2014.
17. Greenland, S. Multiple-bias modelling for analysis of observational data. J. R. Stat. Soc. A **2005**, 168, 267–306.
18. Jurek, A.M.; Maldonado, G.; Greenland, S.; Church, T.R. Uncertainty analysis: An example of its application to estimating a survey proportion. J. Epidemiol. Community Health **2007**, 61, 650–654.
19. MacLehose, R.F.; Gustafson, P. Is probabilistic bias analysis approximately Bayesian? Epidemiology **2012**, 23, 151–158.

© 2015 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Scott, L.L.F.; Maldonado, G. Quantifying and Adjusting for Disease Misclassification Due to Loss to Follow-Up in Historical Cohort Mortality Studies. *Int. J. Environ. Res. Public Health* **2015**, *12*, 12834-12846.
https://doi.org/10.3390/ijerph121012834
