Gene-Environment Interactions in the Development of Complex Disease Phenotypes

The lack of knowledge about the earliest events in disease development is due to the multi-factorial nature of disease risk. This information gap is the consequence of the lack of appreciation for the fact that most diseases arise from the complex interactions between genes and the environment as a function of the age or stage of development of the individual. Whether an environmental exposure causes illness or not is dependent on the efficiency of the so-called “environmental response machinery” (i.e., the complex of metabolic pathways that can modulate response to environmental perturbations) that one has inherited. Thus, elucidating the causes of most chronic diseases will require an understanding of both the genetic and environmental contribution to their etiology. Unfortunately, the exploration of the relationship between genes and the environment has been hampered in the past by the limited knowledge of the human genome, and by the inclination of scientists to study disease development using experimental models that consider exposure to a single environmental agent. Rarely in the past were interactions between multiple genes or between genes and environmental agents considered in studies of human disease etiology. The most critical issue is how to relate exposure-disease association studies to pathways and mechanisms. To understand how genes and environmental factors interact to perturb biological pathways to cause injury or disease, scientists will need tools with the capacity to monitor the global expression of thousands of genes, proteins and metabolites simultaneously. The generation of such data in multiple species can be used to identify conserved and functionally significant genes and pathways involved in gene-environment interactions. Ultimately, it is this knowledge that will be used to guide agencies such as the U.S. Department of Health and Human Services in decisions regarding biomedical research funding and policy.


Defining "Gene-Environment Interaction"
The concept that the phenotype is the consequence of gene-environment interaction is not new. Garrod, in 1902, was one of the first to note that the effects of genes could be modified by the environment [1]. He suggested that individual differences in genetics could play a role in variation in response to drugs, and that this effect of one's genotype could be further modified by the diet. Wright, in 1932, further emphasized the existence of a functional relationship between various biological endpoints and networks of genes and environmental factors in his studies of mutation, selection, and breeding [2]. The phrase gene-environment interaction implies that the direction and magnitude of the clinical effect that a genetic variant has on the disease phenotype can vary as the environment changes. In other words, genetic risk for disease is modifiable in an environment-specific manner. Furthermore, an individual can inherit a predisposition for a devastating disease, yet never develop the disease unless exposed to the appropriate environmental trigger(s). Although concerns about the role of gene-environment interactions in disease etiology have developed over the last century, prioritizing the understanding of these interactions as a means to prevent complex diseases has only emerged in the past 15 years [3][4][5]. In 1992, the National Institute of Environmental Health Sciences identified the role of gene-environment interaction in disease prevention and intervention as one of its two research priorities. A few years later, the initiation of the Environmental Genome Project (EGP) represented the first large-scale effort to discover the susceptibility alleles likely to be important in gene-environment interactions [6][7][8][9]. Prior to the initiation of the EGP, most of the momentum for research on gene-environment interactions was in the field of pharmacogenetics, which seeks to identify genetic variants that influence drug efficacy. As a consequence, many published studies have linked specific polymorphisms with specific drug responses [10,11]. However, even when all the highly relevant genes and their interactions with specific environmental components have been identified, it will still be difficult to relate the influence of an individual's genotype to their disease phenotype due to the added complexity of gene-gene interactions, posttranslational processing, and protein-protein interactions.
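The statement that the effect of a genetic variant "can vary as the environment changes" can be made concrete with a small calculation. The sketch below is illustrative only: the counts are hypothetical, the names `Stratum`, `odds_ratio`, and `interaction_ratio` are ours, and the ratio of stratum-specific odds ratios is just one common way (interaction on the multiplicative scale) to summarize such a pattern, not a method taken from the studies cited here.

```python
from dataclasses import dataclass


@dataclass
class Stratum:
    """Case/control counts for variant carriers and non-carriers
    within a single exposure level (hypothetical data)."""
    carrier_cases: int
    carrier_controls: int
    noncarrier_cases: int
    noncarrier_controls: int


def odds_ratio(s: Stratum) -> float:
    """Odds ratio for disease, comparing carriers with non-carriers."""
    return (s.carrier_cases * s.noncarrier_controls) / (
        s.carrier_controls * s.noncarrier_cases
    )


def interaction_ratio(exposed: Stratum, unexposed: Stratum) -> float:
    """Ratio of the genotype odds ratios in exposed vs. unexposed people.
    A value near 1 suggests no interaction on the multiplicative scale."""
    return odds_ratio(exposed) / odds_ratio(unexposed)


# Hypothetical counts, chosen only to illustrate the idea.
unexposed = Stratum(carrier_cases=20, carrier_controls=80,
                    noncarrier_cases=40, noncarrier_controls=160)
exposed = Stratum(carrier_cases=60, carrier_controls=40,
                  noncarrier_cases=50, noncarrier_controls=100)

print("Genotype OR, unexposed:", odds_ratio(unexposed))
print("Genotype OR, exposed:  ", odds_ratio(exposed))
print("Interaction ratio:     ", interaction_ratio(exposed, unexposed))
```

In this made-up example the variant carries no excess risk among the unexposed but a three-fold higher odds of disease among the exposed, which is the pattern described above: a predisposition that is expressed only in the presence of the environmental trigger.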
Unfortunately, the exploration of the relationship between genes and the environment has been hampered in the past by the limited knowledge of the human genome. This was further complicated by the inclination of scientists to study disease development using experimental models that consider exposure to a single environmental agent. Rarely in the past were interactions between multiple genes or between genes and environmental agents considered in studies investigating the causes of chronic disease. Today, there is a greater appreciation of the view that the variable degree of morbidity associated with chronic diseases (e.g., asthma, obesity, cardiovascular disease) is attributable to the differential environmental exposure of the individual [12][13][14][15][16][17]. However, the use of hypotheses linking genes, the environment, and disease phenotypes and prevalence is rather new. The biomedical research community is at "an opportunistic crossroad" as we explore how the interaction between an individual's intrinsic genetic susceptibility and the environment influences the etiology of the disease phenotype. Equally important in this emerging paradigm shift of investigating the etiology of complex disease phenotypes is the "window of opportunity." In other words, it is essential that we identify the critical timing of exposure to specific environmental agents (i.e., biological, chemical, physical) that results in the highest morbidity/mortality. It is also essential that we redefine the spectrum of environmental agents that may explain the variability of health outcomes amongst individuals with similar disease phenotypes. This spectrum includes co-morbidity, risk behavior/lifestyle, community factors (e.g., the built environment, violence), and opportunities (or lack thereof) for a person's educational and economic development.

Genes, the Environment and the Etiology of Chronic Disease
The increase in prevalence of certain chronic diseases in the latter half of the 20th century in industrialized countries is likely a function of the increase in life expectancy (due to effective public health reforms) coupled with our ever-increasing exposure to environmental toxicants. Yet to date, the exact mechanism(s) or event(s) involved in the early stages of disease development are still unknown, primarily due to the multi-factorial nature of chronic disease. Frequently, it is forgotten that human biology and pathophysiology are choreographed at the level of gene regulation. The quote "genetics loads the gun but the environment pulls the trigger" exemplifies the complex relationship between human disease and the environment. This famous analogy by Dr. Judith Stern, Distinguished Professor of Nutrition and Internal Medicine at the University of California, Davis, conveys the message that disease phenotypes are a result not only of interaction between different genes within the host but also of interaction between genes and the environment.
A prominent role for the environment is supported by geographic differences in incidence of disease, by variations in trends over time, and by studies of disease patterns in immigrant populations. An example of this would be the "thrifty genotype hypothesis," which is often cited to explain the increased prevalence of obesity and incidence of Type 2 diabetes within certain ethnic groups [18]. This hypothesis states that the human genotype has evolved to ensure an insulin-resistant state during famines when food is scarce. However, in the past 100 years, the incidence of famines has decreased as the abundance of food has increased. For ethnic groups who report high prevalence of obesity and Type 2 diabetes, the evolutionary pressure to conserve genes for the "next famine" is now a risk factor in the current environment of abundant food. Furthermore, it is known that the human genotype has evolved very slowly [19]. Therefore, the current epidemic of obesity and Type 2 diabetes is likely the result of "a variable environment reacting with a relatively constant genetic substrate," resulting in an unfavorable combination of genetic variations and environmental exposures.
Population-based, twin cohort studies are the "gold standard" for distinguishing between the contributions of genetics versus the contribution of the environment to disease development. Lichtenstein, et al. [20] and Verkasalo, et al. [21] have both reported that there is likely a substantial contribution of the environment in cancer since less than 50% of cancer incidence amongst twins is attributed to genetic factors. In 1999, Tanner, et al. [22] found no evidence to support the hypothesis of a genetic contribution to the concordance rate of Parkinson's Disease among 172 twin pairs who were diagnosed after the age of fifty.
A positive family history of disease captures the underlying complexities of gene-gene and gene-environment interactions by identifying families with combinations of risk factors (both measured and unmeasured) that lead to disease expression. In their 2003 review of strategies to prevent heart disease, Hunt, et al. [23] stated that even when family members share a family history of heart disease, they also share other risk factors (e.g., diet, activity). Additionally, Greenland, et al. [24] reported that a sizable majority of individuals with fatal or non-fatal coronary heart disease events have at least one major risk factor. The two studies highlight the importance of the synergy between genes and the environment with respect to chronic disease morbidity and mortality, and further emphasize the need to assess all risk factors when assessing the prevalence of complex disease phenotypes.
Autoimmune disease is characterized by a detrimental immune response directed at one's own tissue, and requires a signal from the environment even when a susceptible genetic background is already present [25]. The synergistic relationship between genes and the environment in the etiology of autoimmune disease has been confirmed by numerous studies [26,27]. Recent studies suggest that candidate genes in the human leukocyte antigen (HLA) complex strongly contribute to an individual's genetic predisposition towards autoimmune disease [28].
It is now well-established that alterations in highly penetrant genes explain only a small fraction of complex diseases. Such genes represent a small fraction of variations relative to the more common polymorphisms that have a less disruptive effect on protein function. Furthermore, studies continue to refute the myth that "bad genes gone awry" are the source of increased disease prevalence and health disparities [29]. The evidence for a prominent role for the environment in the development of human disease is so compelling that Rothman, et al. [30] concluded that "the epidemiologic evidence accumulated to date indicates that environmental exposures, broadly defined to include lifestyle factors, are responsible for most cancers."

The Emerging Role of Genetics in the Etiology of Occupationally-Related Disease
Occupational epidemiology has historically provided valuable hypotheses for studies attempting to elucidate the role that environmental exposures play in the etiology of chronic disease. Occupational settings provide a "natural" cohort of exposed and control populations by virtue of the workplace setting (e.g., coal miners vs. office workers). With the evolution of molecular or genetic epidemiology, occupational exposure assessments now consider the threshold of acceptable risk conferred by the prevalence of specific genetic biomarkers among the general population [31,32]. The identification of biomarkers that increase the probability of disease occurrence has global implications in protecting worker safety and health. In 2004, the National Office of Public Health Genomics and the National Institute for Occupational Safety and Health (both located at the U.S. Centers for Disease Control and Prevention) addressed this emerging public health issue with the development of three research priorities in the area of occupational genetics [33].
Scientific evidence supports the need for further investigation into the relationship between an individual's genotype and the development of chronic disease. In 1994, Smith, et al. observed an increased risk (odds ratio, 2.1, p < 0.05) of asbestos-induced pulmonary disease (e.g., asbestosis) among carpenters who possessed a homozygous deletion of the gene encoding glutathione-S-transferase μ (GSTM-1) [34]. Among Caucasian individuals, 50% possess this deletion and therefore do not produce this enzyme, which is critical in the response to the oxidative stress that occurs in the lung upon asbestos exposure. In response to this stress, various inflammatory cytokines are produced, resulting in scarring of the lung tissue, which is replaced with fibrous tissue; this accounts for the progressive breathing difficulties in affected individuals. Asbestosis is also a risk factor for mesothelioma, a malignant cancer of the mesothelium that is difficult to control once it is diagnosed. Recent evidence has shown that individuals with the GSTM-1 polymorphism are at an increased risk for malignant mesothelioma (OR = 1.69, p = 0.034) [35]. In spite of the evidence, the global use of asbestos has continued even though studies have shown a significant linear relationship between its use in industrialized countries and the respective incidence of mesothelioma [36][37][38].
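For readers less familiar with how odds ratios such as those quoted above are derived, the following minimal sketch computes an odds ratio and an approximate 95% confidence interval from a 2x2 table using the standard log (Woolf) method. The counts are hypothetical and are not taken from the cited studies; the function name `odds_ratio_with_ci` is ours.

```python
import math


def odds_ratio_with_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and approximate confidence interval (log/Woolf method).

    a: cases with the genotype       b: controls with the genotype
    c: cases without the genotype    d: controls without the genotype
    z: normal quantile (1.96 gives an approximate 95% interval)
    All four counts must be non-zero for this approximation.
    """
    estimate = (a * d) / (b * c)
    # Standard error of the natural log of the odds ratio.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts for illustration only (not from refs. 34 or 35).
estimate, lower, upper = odds_ratio_with_ci(a=40, b=30, c=20, d=30)
print(f"OR = {estimate:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

With counts this small the interval is wide and spans 1, a reminder that a point estimate alone says little about how reliably a genotype is associated with disease; this is why confidence intervals or p-values accompany the odds ratios reported in the text.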
Berylliosis or chronic beryllium disease (CBD) is another example of an occupational disease associated with a susceptibility genotype [39]. Beryllium is a hard, grayish metal that occurs naturally in mineral rocks, coal, and soil [40]. Once commercially mined, beryllium is purified for use in nuclear weapons and reactors, aircraft and space vehicle structures, instruments, x-ray machines, and mirrors. Exposure to this metal may result in CBD, an irreversible and sometimes fatal scarring of the lungs. Recent cross-sectional studies among beryllium workers have found that up to 10% had been sensitized to beryllium with up to 4% having evidence of the disease [41][42][43]. Although it is unknown what percentage of sensitized individuals progress to CBD, occupational studies have found an increased risk among these individuals for developing CBD. CBD presents the same peril as asbestos-related disease in that it is often asymptomatic for several years after exposure. It is clinically characterized by a gradual decline in lung function with one third of patients eventually succumbing to respiratory failure [44]. At the cellular level, CBD is characterized by an accumulation of beryllium-activated CD4+ T cells and the development of granulomatous inflammation in the lung [45]. The activation of the T cells is made possible by the major histocompatibility complex (MHC) class II molecules known as human leukocyte antigen-DP (HLA-DP). During immunological sensitization, the HLA-DP molecules are responsible for presenting the antigenic beryllium to the pathogenic CD4+ T cells. Susceptibility to developing CBD has been associated with specific alleles of HLA-DP that possess a glutamic acid at the 69th position of the beta chain [46,47]. Further evidence supporting the critical role of this glutamic acid substitution in CBD was published by Wang, et al. who found that among CBD patients, the glutamic acid-modified allele was present in 14 of their 20 study subjects [48]. Studies investigating differential genetic susceptibility to beryllium sensitization and CBD prevalence within the occupational community continue today.
Other examples of fibrotic lung diseases attributed to genetic predisposition and occupational exposure to mineral dust are silicosis and coal workers' pneumoconiosis (CWP) [49][50][51]. Silicosis is a fibronodular lung disease [52] caused by inhalation of dust containing crystalline silica, most commonly found as quartz, which is abundantly present in granite, slate, and sandstone. CWP, which often occurs in conjunction with silicosis, is defined as "the accumulation of coal dust in the lungs and the tissue's reaction to its presence" [53]. Approximately 1.6 million U.S. workers are exposed to silica annually and it is estimated that 60,000 are living with silicosis. Although biomarkers indicative of the presence of disease in the early stages remain lacking, recent evidence demonstrates an important role of the pro-inflammatory cytokines tumor necrosis factor alpha (TNF-α) and interleukin-1 (IL-1) in the transition to the more aggressive phenotype, progressive massive fibrosis (PMF) [54]. This progression has been observed to occur with or without continued exposure to mineral dust.
Studies have identified the alleles HLA-B7 and HLA-DR8 to be associated with silicosis and CWP, respectively. Specific variants of the genes encoding TNF-α and IL-1 appear to synergistically modify an individual's susceptibility to both diseases. Interestingly, the gene encoding TNF-α is located on chromosome 6 between HLA-B and HLA-DR. In a study of Belgian coal miners, a TNF-α gene variant with a SNP at position -308 was found to be present in 50% of those diagnosed with CWP [55]. The TNF-α variant with a SNP at position -308 was found to be a risk factor for both the moderate and severe forms of silicosis (OR = 3.8 and 1.6, respectively) [56]. The same study found that an additional TNF-α variant with a SNP at position -238 conferred a substantial risk for severe silicosis (OR = 4.0, 95% CI 2.4-6.8) but was a "protective" factor for the moderate form of the disease. A member of the IL-1 cytokine family is the IL-1 receptor antagonist (IL-1 RA). A variant of the IL-1 receptor antagonist gene, IL-1 RA (+2018), is increased in coal miners with both moderate and severe silicosis, suggesting that the risk conferred by this variant is for disease occurrence and not disease severity [57]. Finally, the presence of both the IL-1 RA (+2018) and TNF-α (-308) variants conferred over a 2-fold increase in risk for aggressive development of severe silicosis [56].
These examples demonstrate an exciting and emerging area of investigation within the field of occupational health. Although we cited the number of U.S. workers who are exposed and at risk for these diseases, we emphasize that these issues have implications for the global workforce.
Although guidelines for occupational exposure limits to the aforementioned substances are periodically reviewed, understanding the contribution of an individual's genotype during the risk assessment process is crucial when estimating the amount of risk that we are willing to accept in the risk/benefit "tradeoff".

Interdisciplinary Research Teams: The Future of Environmental Health Research
In this section, we present the case for a paradigm shift in research investigating the link between genes, the environment and human disease. Such a shift is required since the "business as usual" approach to understanding disease etiology is outdated and no longer appropriate [58]. The latter statement reflects our personal opinion that the current practice of biomedical researchers to focus on limited or circumscribed components of complex disease needs to be addressed. The beginning of the new century has made it apparent that a fundamental restructuring of the biomedical research enterprise is required with a greater emphasis on the development of sustainable interdisciplinary research teams. The 2002 Nobel laureate Sydney Brenner stated in 1980 that "progress in science depends on new techniques, new discoveries, and new ideas, probably in that order." With the completion of projects such as the Human Genome Project [59,60] and HapMap [61], the major challenge facing biomedical researchers is to elucidate the roles of gene-gene and gene-environment interactions in the etiology of disease. Investments in these projects led to the development of powerful new tools to conduct the large-scale, population-based studies necessary to untangle complex interactions between genes and environmental factors. As a result, descriptive studies by toxicologists and epidemiologists have identified genetic and environmental risk factors for chronic disease [62][63][64][65][66][67]. But the actual causes ("triggers") and mechanisms of disease development remain poorly understood.
Separate disciplinary approaches have led to new insights into disease causation, but a collective explanation is lacking. Epidemiologists tend to focus on exposures, geneticists target susceptibility genes, cell/molecular biologists tend to explore mechanisms, and social scientists study behavior. The design and conduct of interdisciplinary studies in the future will require the development and incorporation of tools (e.g., metabolomics, disease-specific cytokine assays) that possess increased sensitivity, specificity, and predictability for the chronic diseases that have the highest mortality, morbidity, and prevalence. But even more problematic is the fact that methods to accurately measure exposure and to statistically model multiple interactive components that make modest contributions to the phenotype are not yet available. Altering the current state of affairs with respect to technology and database needs will require investments in the following areas of research: (i) identification of disease risk factors; (ii) elucidation of genetic differences and similarities between human and animal models; (iii) development of improved sensitivity/specificity of investigative tools and statistical models to assess exposure and interactions of multiple components; (iv) development of high-throughput, low cost and more informative strategies to assess toxicity of drugs and environmental xenobiotics; and (v) elucidation of mechanisms and metabolic pathways influenced by gene-environment interactions.
Toxicogenomics, a relatively new discipline, is a "systems toxicology" approach that integrates genomics, proteomics, metabolomics, bioinformatics, and conventional toxicology/pathology to study the host's response to the environment [68][69][70]. Genomic technology provides the ability to identify genes related to complex diseases (and their environmental modifiers) and complements the sub-discipline of transcriptomics, the genome-scale analysis of RNA expression. In environmental exposure studies, proteomics provides the cell- and tissue-wide analysis of protein expression, structure, and function, while metabolomics characterizes the metabolic profile in response to stresses like disease, toxic exposure, or dietary change. Bioinformatics is the discipline that merges biology, computer science, and information technology to form a comprehensive picture of cellular activities in both normal and diseased states. Ultimately, bioinformatics will allow the biomedical research community to understand basic biological processes so that the diagnosis, treatment, and prevention of disease will become more efficient. Furthermore, with the evolution of toxicogenomics from a novel discipline to a "standard of practice," the adoption of evidence-based medical care, especially for those with complex disease phenotypes, will be inevitable.

Realizing the Vision for the Future of Environmental Health Research
The environmental health community has moved swiftly to take advantage of new databases and tools derived from the Human Genome Project. The Environmental Genome Project (EGP) was initiated in 1997 with the long-term goal of characterizing how specific human genetic variations (or polymorphisms) contribute to environmentally induced disease susceptibility [6,8,9,68,71,72,73]. This effort was expanded to include functional molecules (e.g., RNA, protein, carbohydrates, lipids, and metabolites) encoded by human and mouse genomes. To date, the EGP has identified "environmentally responsive" genes that are likely to influence the outcome of environmental exposures. Additionally, the functional significance of specific gene variants and polymorphisms is being delineated. The development of a repository is under way that will allow the examination of the sequence of DNA base pairs (sequencing) in a predefined set of human DNA samples that represents the diversity of the United States. Both of the aforementioned are crucial for molecular epidemiology studies that will help identify which genetic factors are correlated with increased or decreased risk of disease. With approximately 35,000 genes, hundreds of thousands of protein species and numerous metabolites, the task of identifying and characterizing them, with respect to function, is an ambitious undertaking. It will require improvement, standardization, and validation of existing methods to achieve reproducibility, increased sensitivity and specificity, and high throughput. This undertaking will also require large-scale, multi-institutional collaboration and interdisciplinary expertise to amass and analyze such large genetic databases. Table 1 lists programs and initiatives that will facilitate such efforts.

The Ultimate Goal
The elucidation of how genes and environmental factors interact to perturb biological pathways that cause injury or disease will require that scientists have investigative tools with the capacity to monitor the global expression of thousands of genes, proteins and metabolites simultaneously. Additionally, techniques employing in vitro analysis, mouse models, and human data are essential. It is imperative that we recognize that genes are not autonomous; genes respond to internal and external signals. In fact, the leading causes of morbidity or mortality (e.g., cancer, cardiovascular disease, respiratory disease, Type 2 diabetes) are likely the result of genes interacting with the environment. However, it is essential that we as biomedical research scientists realize that we are the "gatekeepers" of how such knowledge will translate across the various scientific disciplines with the ultimate goal of improving human health. The "unraveling" of the human genome created optimism that we would have a rational, scientific, and biologically plausible explanation for health disparities. Although this turned out not to be the case, we are closer to identifying the gene-gene and gene-environment interactions that determine the onset of chronic disease, allow us to identify "high risk" individuals, and afford us the opportunity to develop effective and sustainable disease prevention strategies.
The scientific community is at a unique period in history. We now have the building materials (i.e., candidate genes and genetic variability) and toolboxes (i.e., disease phenotype-specific genetically modified animals, genetic characterization of individuals enrolled in disease-specific longitudinal studies) to provide a blueprint of how studies investigating the influence of gene-environment interaction will proceed in the 21st century. Simultaneously, the research community must agree on a standard definition of the environment, the standard measurements of environmental influence (including that of the social/behavioral environment), and refine the tools by which we measure these exposures. Ultimately, it is this concerted effort that will be used to guide national and global agencies such as the U.S. Department of Health and Human Services and the World Health Organization in decisions regarding biomedical research funding and policy with respect to the prevalence of complex disease phenotypes.