- H1: The cases and controls do not exhibit substantial residential mobility over the life course.
- Rationale: The breast cancer risk factors found significant in the parent study operate at different points in a woman’s life course: some early in life, some later in life, and others through long-term behaviors sustained over several years. Substantial residential mobility in the study group would suggest that residence in Marin County may not reflect exposure to risk factors that operated in Marin County.
- H2: There is no statistically significant global clustering of breast cancer cases relative to controls after accounting for known risk factors and residential mobility.
- Rationale: Global clustering might suggest the action of an unidentified risk factor not accounted for in the original case-control study design that impacts risk for most if not all of the cases (a large-scale signal, Global Q statistic).
- H3: There are no time periods when the breast cancer cases, considered as a group, exhibit statistically significant clustering relative to the controls.
- Rationale: Large-scale spatial clustering at specific time periods may indicate past exposures that impacted many if not all of the cases (Global Qt test).
- H4: None of the cases exhibit statistically significant clustering over their life course, such that they tend to have other cases as neighbors.
- Rationale: Clustering over the life course might indicate cases with similar residential histories: such cases may tend to move together because of behavioral factors (e.g., seeking treatment, friendship) and/or may have lived in areas that have elevated breast cancer risk (Qi test).
- H5: Cases that cluster over the life course (Qi) are not part of local clusters at specific times (Qit).
- Rationale: Such a pattern (excess risk over the life course coupled with local clusters of excess risk) might indicate the action of ephemeral, geographically localized risk factors that were not accounted for in the parent case-control study.
3.1. Space-Time Analysis
3.2.1. A Diagnostic Framework for Q-Statistics in Relation to Disease Processes
3.2.2. Assessing Overall Significance of Cluster Types
|Cluster set||Description||Pattern||Possible etiology|
|Cases (i) that at times t have a significant number of nearest neighbors that are cases||Cases (i) that at times t have a significant number of nearest neighbors that are cases.||Infection: Contagious process such that infection spreads from a case to its susceptible neighbors. Vector-borne disease process such that individuals in specific areas have increased risk of infection.|
Chronic (e.g., cancer): Increased cancer risk for individuals residing in local areas over a defined time period. Duration of elevated risk must be sufficiently long relative to the duration of time individuals live in the affected areas (e.g., exposure time must be sufficient to induce disease response).
|Clustering over the life course||Cases (i) who, over the study, have a significant number of nearest neighbors that are cases.||Infection: The “Typhoid Mary” or “super-spreader” process, whereby case (i) (the super-spreader) is infectious over the study period and transmits infections to nearest neighbors.|
Chronic (e.g., cancer): A process whereby neighbors of case i have increased cancer risk and such risk is elevated over the life course of case i. An example would be behaviors that increase cancer risk for others, such as exposing them to secondhand smoke. May also arise when groups with elevated risk tend to move or remain together over their life course (e.g., familial groups with common genetic and/or behavioral risk factors).
|Temporal case clustering||Large scale spatial clustering of cases at time t. Clustering of cases relative to controls is significant at time t when all cases and controls are considered.||Infection: Infection outbreak such that the infection impacts a large portion of the study population; endemic phase of infection with multiple local outbreaks.|
Chronic (e.g., cancer): Chronic disease with an underlying infectious etiology (e.g., viral hypothesis of cancer) that impacts a large portion of the study participants; Disease risk mediated by environmental exposures that vary across the study area such that risk is elevated for a large number of study participants. Duration of elevated risk must be sufficiently long relative to the duration of time individuals live in the affected areas (e.g., exposure time must be sufficient to induce disease response).
|Ã||Locations and time when cases with significant clustering over their life course are members of a geographically localized cluster. Includes both ephemeral and persistent clusters.||Infection: Local foci of infection occurring at times t from which infected and infectious cases move away.|
Chronic (e.g., cancer): Local areas of persistent elevated risk that are sustained for a sufficient period of time that (1) disease risk is increased for individuals residing in the local area and (2) the duration of residence of cases in the area is of sufficient length to result in a significant Qi statistic.
|Cases (i) who, over the study, have a significant number of nearest neighbors that are cases.||Infection: Large-scale outbreak at specific times, t, that may be comprised of local pockets of infection. For vector-borne diseases this can arise when large portions of the study area have suitable vector habitat during some parts of the study period.|
Chronic (e.g., cancer): Large scale exposures that occur at a specific time(s) t. An example would be leukemia in response to the Chernobyl and Hiroshima incidents.
|Cases that have clustering over their life course and are part of large-scale spatial clusters at times t. Includes cases whose Qit are not statistically significant, and some whose Qit are statistically significant.||Infection: Large-scale outbreak at times t with at least some of the resulting cases that (1) move together over their life course; and/or (2) remain infectious over their life course and continue to infect their neighbors. For a vector-borne disease this may arise when there is an initial large-scale outbreak with some of the resulting cases continuing to be disease reservoirs (e.g., pathogen sources) whose infection can then be transmitted to neighbors.|
Behavioral: Individuals who share a behavioral link that increases their exposure to an environmental factor, pathogen, or vector. The difference from Qi is that here the exposure must be episodic (“outbreak”-like) in nature: it either cycles in the population, as a vector or pathogen can (e.g., bird flu), or varies in severity, as an environmental factor can. For example, imagine a poultry reseller who moves around. He has an elevated risk any time a bird flu epidemic breaks out, so he will show a Qi cluster, and when epidemics break out there will be Qt clusters. An environmental example would be a pesticide that is applied, then phased out by law, and later used again.
Chronic (e.g., cancer): Large scale exposures that occur at a specific time(s) t with some of the resulting cases that (1) move together through life course or (2) continue to reside in the affected area over most of the study period.
|Cases that have clustering over their life course, are part of large scale clusters at time t and whose local clusters Qit are all statistically significant.||Etiology is similar to set , but is restricted to include only those individuals that are centers of significant local clustering of cases at times t. For infection, this may be indicative of index cases; for chronic diseases this may indicate individuals who are within local pockets of the largest exposure.|
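The combinations of significant statistics described in the table above can be illustrated with boolean masks. The following is a minimal sketch, not the authors' implementation: the function name `classify_cluster_sets`, the dictionary keys, and the element-wise reading of each combination (one flag per case and time step) are assumptions made for illustration, and the inputs are taken to be significance flags that have already been computed for Qi, Qt, and Qit.

```python
import numpy as np


def classify_cluster_sets(sig_qi, sig_qt, sig_qit):
    """Combine significance flags into (case, time) membership masks.

    sig_qi  : (N,)   True where case i clusters over the life course (Qi).
    sig_qt  : (T,)   True where cases cluster globally at time t (Qt).
    sig_qit : (N, T) True where case i is a local cluster center at time t (Qit).
    All names and the element-wise interpretation are hypothetical.
    """
    sig_qi = np.asarray(sig_qi, dtype=bool)
    sig_qt = np.asarray(sig_qt, dtype=bool)
    sig_qit = np.asarray(sig_qit, dtype=bool)

    # Life-course clustering combined with a significant local cluster at t.
    life_and_local = sig_qi[:, None] & sig_qit
    # Life-course clustering combined with large-scale clustering at t.
    life_and_temporal = sig_qi[:, None] & sig_qt[None, :]
    # All three statistics significant for the same case and time.
    all_three = life_and_local & sig_qt[None, :]

    return {
        "life_and_local": life_and_local,
        "life_and_temporal": life_and_temporal,
        "all_three": all_three,
    }
```

Each returned mask has one row per case and one column per time step, so `mask.any(axis=1)` would list the cases that belong to a set at any time.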
|Cluster type||Cluster description||Test statistic||Probability of test statistic|
|Temporal case clustering|
|Cluster type||Cluster description||Test statistic||Number of possible elements in each set (n(Q) in Equations (15) and (16))|
|Temporal case clustering||T|
3.2.3. Calculating the Empirical Type I Error under Multiple Testing
3.2.4. Adjusting Critical Values of the Test for Different Cluster Types
3.3. Analysis Steps
- H1: Evaluate residential mobility by mapping places of residence for the study participants at three spatial scales: Marin County, California, and the continental United States.
- H2: Evaluate global clustering over the entire study using the Global Q-statistic.
- H3: Evaluate whether and when there are times that cases cluster relative to controls when all of the study participants are considered together using the Qt statistic.
- H4: Evaluate whether specific cases tend, over their life course, to have other cases as neighbors using the Qi statistic.
- H5: Using the significance of set A, evaluate whether cases that cluster over their life course (significant Qi) are part of local clusters at specific times (Qit).
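The statistics named in these steps can be sketched in a few lines. The example below is illustrative only and rests on several assumptions that are not taken from the text: residences are given as planar coordinates at equally spaced time steps, Qit is taken to be the number of cases among the k nearest neighbors of case i at time t (zero for controls), and Qi, Qt, and the global Q are sums of Qit over time, over participants, and over both. Significance is assessed here by simple random permutation of case labels, which does not adjust for known risk factors as the hypotheses require; the function names and parameters are hypothetical.

```python
import numpy as np


def q_statistics(coords, is_case, k):
    """Return (Qit, Qi, Qt, Q) for k nearest neighbors.

    coords  : (T, N, 2) residence coordinates of N participants at T time steps.
    is_case : (N,) array of 1 (case) or 0 (control).
    """
    coords = np.asarray(coords, dtype=float)
    c = np.asarray(is_case, dtype=int)
    n_times, n_people, _ = coords.shape
    q_it = np.zeros((n_people, n_times), dtype=int)
    for t in range(n_times):
        # Pairwise Euclidean distances between residences at time t.
        diff = coords[t][:, None, :] - coords[t][None, :, :]
        dist = np.hypot(diff[..., 0], diff[..., 1])
        np.fill_diagonal(dist, np.inf)  # a participant is not its own neighbor
        nearest = np.argsort(dist, axis=1, kind="stable")[:, :k]
        # Count cases among the k nearest neighbors; controls contribute zero.
        q_it[:, t] = c * c[nearest].sum(axis=1)
    return q_it, q_it.sum(axis=1), q_it.sum(axis=0), q_it.sum()


def permutation_p_value(coords, is_case, k, n_perm=999, seed=0):
    """One-sided permutation p-value for the global Q (unadjusted for covariates)."""
    rng = np.random.default_rng(seed)
    observed = q_statistics(coords, is_case, k)[3]
    exceed = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(is_case)
        if q_statistics(coords, shuffled, k)[3] >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)
```

The same permutation loop could be applied to Qt, Qi, or Qit by comparing the corresponding element of the returned tuple. Adjusting for covariates would require replacing the uniform label shuffle with case assignment driven by modeled risk, which is outside the scope of this sketch.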
4.2.1. H1: Cases and Controls Do not Exhibit Substantial Residential Mobility over the Life Course
4.2.2. H2: There Is No Statistically Significant Global Clustering of Breast Cancer Cases Relative to Controls When Accounting for Known Risk Factors and Covariates and for Residential Mobility
4.2.3. H3: There Are No Time Periods When the Breast Cancer Cases, Considered as a Group, Exhibit Statistically Significant Clustering Relative to the Controls
4.2.4. H4: Cases Do not Exhibit Statistically Significant Clustering over Their Life Course
4.2.5. H5: Cases that Cluster over the Life Course (Qi) Are not Part of Local Clusters at Specific Times (Qit)
4.3. Synopsis and Synthesis: Locations of Persistent Life Course Clusters
Conflicts of Interest
© 2013 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).