2. Overview of Genetic Methodology
There are a variety of methods that can be used to understand the genetic contributions to a phenotype of interest. Brief descriptions of the main methods are provided in Section 2.1, Section 2.2 and Section 2.3 (and in Table 1), but the reader is directed to additional resources for a more thorough overview of twin studies [7] and statistical genetics methods [8]. Note that this section focuses on approaches used within the insomnia genetics literature, and thus does not encompass all possible genetic or genomic methods.
2.1. Twin and Family Studies
Twin and family studies represent a quantitative approach that allows estimation of the relative contributions of genes and environment to a trait, based on known relationships among family members. Family studies compare family members of affected individuals to those of unaffected individuals. If a phenotype has significant genetic influence, members of an affected individual’s family will be more likely to report experiencing the same disorder/symptoms when compared to the family members of unaffected individuals. Further, as the genetic relationship between individuals increases, family members who are more closely related are expected to be more alike on the phenotype of interest. A major limitation of the family study approach is that family members share both genes and environment: the familial aggregation that is observed could be due to either genes or shared environment and thus cannot be narrowed down to purely genetic effects.
Twin studies, however, are able to further disentangle genes and environment by comparing correlations across identical (monozygotic: MZ) and fraternal (dizygotic: DZ) twins who have been raised together [9]. This is important, since being raised together means that the twins have experienced the same family environment, minimizing the potential contributions of family environment to phenotypic differences. Since MZ twins are assumed to share 100% of their genes, and DZ twins share, on average, 50%, similarity across twin pairs can be compared. If MZ twins are more similar (i.e., their scores on a phenotype are more highly correlated with each other) than DZ twins, the greater similarity is attributed to their greater genetic sharing. The classic twin study separates variance components into additive genetic effects (A), which contribute twice as much to MZ as to DZ twin similarity; common environmental effects (C), experiences that twins share that make them more similar (e.g., peer groups, parental attitudes); and unique environmental effects (E), which reflect each individual’s unique experiences and thus contribute to between-twin differences. The E component also encompasses measurement error. From these results, heritability (i.e., the proportion of variance due to genetic effects) can be calculated. It is also possible to examine dominant (non-additive) genetic effects (D), instead of C, in a twin study. Note that broad-sense heritability includes total genetic effects (both additive and dominant), while narrow-sense heritability refers to the proportion of variance due to additive genetic effects only. In the sections that follow, we use the term “heritability” to refer to narrow-sense heritability unless otherwise specified. Extensions of the ACE model allow for comparison of sex differences, the modeling of multiple waves of data across time to examine latent constructs and stability (longitudinal and measurement modeling), and examination of overlap in etiologic influences across multiple phenotypes.
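As a rough illustration of the variance decomposition described above, Falconer's formulas give back-of-the-envelope A, C and E estimates directly from the MZ and DZ twin correlations. This is a simplified sketch, not the structural equation modeling that twin studies typically fit; the function name and the example correlations are hypothetical.

```python
def falconer_ace(r_mz, r_dz):
    """Approximate ACE variance components from twin correlations.

    Assumes purely additive genetic effects and equal environments
    for MZ and DZ pairs (assumptions of this sketch).
    """
    a2 = 2.0 * (r_mz - r_dz)  # additive genetic effects (narrow-sense heritability)
    c2 = 2.0 * r_dz - r_mz    # common (shared) environment
    e2 = 1.0 - r_mz           # unique environment plus measurement error
    return {"A": a2, "C": c2, "E": e2}


# Hypothetical correlations, chosen only for illustration.
estimates = falconer_ace(r_mz=0.40, r_dz=0.25)
```

With these illustrative inputs the three components sum to 1. A negative C estimate (an MZ correlation more than twice the DZ correlation) is one informal hint that dominance effects (D) may be worth modeling in place of C.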
Although twin studies can quantify relative genetic and environmental contributions, they have several limitations: It is not possible to identify specific genes, and other types of genetic effects beyond those that are additive or dominant in nature, such as epigenetics or gene-gene interactions, cannot be measured.
2.2. Candidate Gene Studies
Candidate gene studies represent an early approach to the identification of specific genes. In this method, genes of interest are chosen a priori based on existing knowledge of biology and potential mechanisms (e.g., a neurotransmitter system is chosen due to its known involvement in a process thought to be involved in the disorder; animal models indicate relevance of a certain gene). The purpose of an association study, such as a candidate gene study, is to compare variation in a gene of interest across individuals with and without the disorder (or with different levels of symptoms). The most common type of genetic variation studied is the single nucleotide polymorphism (SNP), which represents a base pair change at a specific position, occurring in some individuals but not others, creating two different alleles. On average, unrelated individuals differ at 1 bp per every 1000 bps across the genome. The allele that is more often observed in the ancestral population under study is the major allele, and the less common is the minor allele. A simple association approach in a case-control design tests if the frequency of either allele (or genotype) is increased in cases versus controls. SNPs can be found in areas of the genome that code for protein sequences (i.e., coding regions), and those that do not (i.e., non-coding regions). Another important type of variation is the variable number tandem repeat (VNTR) where a short sequence of nucleotides is repeated multiple times, and individuals can differ in terms of the length of repeats they possess.
In a candidate gene study, the chosen gene(s) of interest are genotyped for appropriate polymorphisms, and association analyses (e.g., logistic or linear regressions, chi-squared tests; depending on the outcome variable) are conducted to determine if allele or genotype frequencies are significantly related to the phenotype of interest. Extensions of the candidate gene approach investigate gene-environment interactions (G×E), which hypothesize that different genotypes may result in differential responses to environmental exposure and thus contribute to variation in the phenotype of interest. Stressful life events are commonly studied using a G×E approach. Despite initial popularity, the candidate gene approach has fallen out of favor, given limitations such as the choice of gene, which is restricted to what is thought to be biologically plausible. The majority of candidate gene studies have also failed to replicate, which could be due to multiple reasons (e.g., phenotypic heterogeneity, false positives).
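The simple case-control allelic test mentioned above can be sketched as a Pearson chi-squared test on a 2×2 table of allele counts. The example below is a minimal illustration using only the Python standard library; the function name and the allele counts are hypothetical, and real analyses would typically also adjust for covariates such as ancestry.

```python
import math


def allelic_chi_square(case_minor, case_major, control_minor, control_major):
    """Pearson chi-squared test (1 df) on a 2x2 table of allele counts."""
    table = [[case_minor, case_major], [control_minor, control_major]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Upper tail of a chi-squared distribution with 1 df:
    # P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    # Odds of carrying the minor allele in cases relative to controls.
    odds_ratio = (case_minor * control_major) / (case_major * control_minor)
    return chi2, p_value, odds_ratio


# Hypothetical allele counts (two alleles per person, 100 cases and 100 controls).
chi2, p_value, odds_ratio = allelic_chi_square(60, 140, 40, 160)
```

An odds ratio above 1 here would indicate that the minor allele is more frequent in cases than in controls; the sketch assumes all four cell counts are non-zero.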
2.3. Genome-Wide Association Studies (GWAS)
GWAS is an association approach that allows for an unbiased interrogation of the genome, resulting in the simultaneous examination of thousands of SNPs. With recent advances in imputation techniques (which use statistical methods to infer genotypes that were not actually measured), millions of SNPs can now be studied. In a GWAS, association analyses are conducted across all available SNPs, which can be obtained in multiple ways (e.g., from a chip with a specific set of SNPs, whole-genome sequencing). GWAS is generally limited to detecting common variation. While rarer variants are often measured and/or imputed, their effects are difficult to examine: they are typically filtered out during quality control, given the lack of power to detect effects at low allele frequencies, and they are often poorly imputed. There are approaches specifically for examining rare variants, but these usually test variants in aggregate. With the cost of sequencing decreasing, GWAS represents a way to identify novel genes that one would not have hypothesized as being related to the trait of interest. However, the method is not without limitations, which include false positives due to the large number of tests run at once, as well as population stratification (if ancestry is not properly controlled for, results may reflect differences in allele frequencies across populations rather than true associations). In interpretation of both candidate gene and GWAS results, it is also important to consider that the identified variant may not be the causal polymorphism; instead, it could be in linkage disequilibrium (LD) (i.e., inherited together more often than expected by chance) with the actual causal variant. LD is a function of multiple genetic and selection processes, including time [10]. As a result, alleles are inherited in LD blocks, and alleles that fall within the same block are often inherited together. Thus, instead of being the causal variant itself, a SNP could be representing, or “tagging”, other variants within its LD block that are the causal polymorphisms. LD structure (i.e., size/composition of LD blocks) differs by population, since different ancestry groups diverged at different periods of time. For example, European and Asian populations are newer, so there has been less time for recombination to occur, making their LD blocks larger than those of African populations.
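To make the idea of LD concrete, the strength of association between two biallelic loci is commonly summarized by D, D′ and r², all computed from haplotype and allele frequencies. The sketch below is illustrative only; the function name and the frequencies are hypothetical, and both loci are assumed to be polymorphic.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return D, D' and r^2 for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and allele B (locus 2)
    p_a, p_b: frequencies of allele A and allele B
    """
    # D is the deviation of the observed haplotype frequency from independence.
    d = p_ab - p_a * p_b
    # D' rescales D by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    # r^2 is the squared correlation between alleles at the two loci.
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical frequencies, chosen only for illustration.
d, d_prime, r2 = ld_stats(p_ab=0.25, p_a=0.30, p_b=0.40)
```

An r² near 1 means one SNP is a near-perfect proxy ("tag") for the other, which is why an associated SNP need not be the causal variant.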
3. Twin and Family Studies of Insomnia
There is a substantial twin and family literature indicating that insomnia is moderately heritable, despite varying definitions used across studies. Many investigations have focused on insomnia symptoms, as opposed to strict insomnia disorder phenotypes. Several reviews outline the genetics research on insomnia (and other sleep disorders) and provide an overview of early twin and family studies, some of which date as far back as 1966 [6]. The family study literature is small, with only five studies looking at insomnia using this method [13]. Overall, these studies support the idea that insomnia “runs in families”, but have the disadvantage of being unable to disentangle genetic and shared environmental effects. Twin studies do allow for the examination of both genes and shared environment simultaneously, since twins are assumed to have identical shared environments. There is a much larger twin literature, with over 20 studies conducted to date that include insomnia phenotypes [6]. Insomnia heritability estimates range from 22% to 59% in adults, with a much wider range for studies in children (14%–71%). The reader is directed to Table 2 for study-specific heritability estimates.
Newer twin studies of insomnia have incorporated multiple waves of data and more sophisticated models, which can provide useful information about genetic contributions to insomnia. A recent study by Lind and colleagues [31] used a large longitudinal twin sample and confirmed insomnia’s moderate heritability in adults (22%–25%) at single time points. This study was novel in that heritability estimates increased substantially when two time points were incorporated simultaneously and measurement error was removed (59% for females, 38% for males), and it showed that insomnia heritability is stable across time in adults. Further, significant quantitative sex differences were demonstrated in the final longitudinal model, providing some of the first evidence for larger genetic contributions to insomnia in women (vs. men). Another recent longitudinal twin study of insomnia examined its heritability across childhood and adolescence, utilizing four waves of data [39]. For three of the waves, the best-fit models included dominant (non-additive) genetic effects, accounting for 24%–38% of the variance across waves. Heritability was estimated at 14% at Wave 3, where the best-fit model included additive instead of dominant genetic effects. The non-additive genetic contributions to insomnia at Wave 1 contributed to insomnia across all other waves, providing evidence that genetic influences on insomnia are stable across time in youth. Additionally, there were new non-additive genetic effects at Wave 2 (approximate age: 10 years old), representing genetic innovation (i.e., new genes come into play that were not present at previous time points), but also stability, since Wave 2’s effects were also important at Waves 3 and 4. Thus, despite the inconsistency of insomnia phenotypes, findings for insomnia heritability appear to be robust, demonstrating that genetic contributions are important in the development of the disorder. However, a significant limitation of twin studies is the fact that they are unable to identify specific genes involved in a disorder; genetic contributions can only be examined in aggregate.
Given that insomnia has a bidirectional relationship with psychiatric disorders, particularly internalizing disorders [40], and is also related to many other physical health outcomes [4], there is a subset of the literature that examines the overlap between genetic and environmental influences on insomnia and other disorders and traits. Psychopathology, particularly depression and anxiety, has been examined in relation to insomnia in studies of both children and adults, and results show significant genetic overlap with insomnia [36]. There is also a small subset of the literature that examines overlap with externalizing (and other) psychiatric disorders (e.g., [42]). While a detailed discussion of this part of the literature will not be given here, studies demonstrating genetic overlap do have important implications for molecular genetics research, which will be covered in Section 5.
5. Conclusions and Future Directions
It is clear there has been progress in our understanding of the genetics of insomnia, despite limitations, and overall results do shed light on several potential molecular mechanisms for the development of the disorder. Several of the novel genes identified through GWAS also appear to be associated with other psychiatric disorders (e.g., CACNA1C
with bipolar disorder), providing evidence for pleiotropy (where one gene can influence multiple, often unrelated, traits) and suggesting that attempts to test genes from other psychiatric disorders in insomnia may be useful. Many of the polymorphisms investigated in candidate gene studies of insomnia (e.g., serotonin transporter polymorphism, dopamine system polymorphisms) have also been examined in the context of other related psychiatric disorders such as depression and PTSD, although these disorders also report mixed findings (particularly for serotonin) (e.g., [75
]). Further, although not identical across studies, the novel genes converge in function and suggest systems that may not have been previously considered in a candidate gene approach. Many are related to intrinsic neuronal excitability, which has a plausible role in insomnia: one study showed that individuals with insomnia had disturbed intracortical excitability [77
], and some have suggested that insomnia is a result of over-activation of wake-promoting areas during sleep [78
]. For example, the CACNA1C
gene codes for a subunit of voltage-dependent calcium channels, which are important for neuronal excitation and the release of neurotransmitters, among other functions [79
, the main gene of interest from the sleep latency GWAS, was found to be related to the function of neurotransmitters such as GABA [72
]. GABA is the main inhibitory neurotransmitter in the brain, relevant for sleep onset. A decrease in GABA function, however, could lead to excitation of wake-promoting neurons instead [49
]. Although not directly related to insomnia per se, a recent GWAS of sleep duration found that ABCC9
, important for KATP
channels, was relevant, again pointing to an important role for ion channel function. When this gene was utilized in a model system approach (using Drosophila
), RNA interference of ABCC9
resulted in flies that were unable to sleep for the first 3 h of their night, providing additional evidence for ABCC9
’s role in sleep [80
]. Additionally, 5-HTTLPR
is related to the stress response and hyperarousal, and given some associations with insomnia, could reflect an arousal mechanism [58
]. Taken together, these results suggest a mechanism of changes in neuronal excitability, which if it occurs in wake-promoting neurons (or sleep-inhibiting neurons) could lead to hyperarousal and difficulty sleeping, thus promoting insomnia. This might suggest that genetic risk for insomnia is more related to “local” vs. “global” sleep processes.
These mechanisms could also underlie a vulnerability to greater sleep reactivity (the level of sleep disruption an individual experiences following a challenge; often a stressful event [28
]), which could also be related to extended hyperarousal/excitability and in turn influence insomnia. The extant literature supports sleep reactivity as an insomnia risk factor (e.g., [81
]), with a recent prospective study expanding findings to indicate that not only did higher sleep reactivity scores predict development of insomnia, the contribution of this effect was larger than that of family history [82
]. Sleep reactivity has also been shown to be independently related to depression symptoms, although this relationship was mediated by insomnia [83
], further emphasizing the importance of considering sleep reactivity as a potential insomnia mechanism. There is also evidence that sleep reactivity is moderately heritable [28
], and that its genetic influences overlap significantly with those of insomnia (genetic correlation = 0.55 (females) and 0.65 (males); [28
]). Thus, in addition to serving as a predictor of insomnia, sleep reactivity shares genetic vulnerability with insomnia, supporting the hypothesis that the two phenotypes are related on a genetic level. It may be that the genes currently identified for insomnia, since they are involved in excitatory pathways, are instead related to sleep reactivity, but more research is needed to examine this. There is some evidence that genetic mechanisms may operate through hyperarousal, as a recent family study has shown that parents with higher levels of stress-related insomnia have offspring who are more likely to exhibit higher cognitive-emotional hyperarousal [84
]. Investigation of specific genes involved in sleep reactivity may also provide insights into insomnia and represent a potential area for early intervention by identifying individuals at risk for the disorder.
There is still much to understand about the contributions of specific genes and their mechanisms to insomnia. As the field of statistical genetics enters the post-GWAS era, the use of new techniques and advances that improve upon identification of relevant genes and expand the analysis of genome-wide data will be crucial in advancing the genetics of insomnia. For example, genome-wide complex trait analysis (GCTA) has provided a method for using genome-wide data to calculate the heritability of a trait in a large sample of unrelated individuals [85
]. This method, which produces a SNP-based heritability estimate from the additive effect of all available SNPs, is not without limitations (e.g., power) and it should be noted that SNP-based heritability estimates for many common psychiatric phenotypes have been lower than twin estimates, but GCTA does offer a novel approach for analyzing genomic data. To our knowledge, there are currently no published GCTAs specifically of insomnia phenotypes. However, one recent study conducted GCTAs of depression symptom clusters, which included insomnia. The SNP-based heritability of insomnia symptoms was 30% in this sample, with estimates for other depression symptoms ranging from 5% (anxiety) to 30% (appetite) [86
]. This does align with current insomnia heritability estimates, but should be interpreted with caution given that insomnia is presented in the context of depression.
There are also several methodologies that permit examination of molecular genetic overlap across phenotypes. This may be especially relevant for insomnia, as twin studies demonstrate significant (>50%) overlap in the additive genetic contributions to insomnia and common internalizing psychopathology (depression, anxiety) across the lifespan (e.g., [36
]), indicating that understanding molecular overlap with common psychopathology may aid in the identification of insomnia-related genes. Further, recent studies show overlap with other psychiatric traits such as alcohol use disorder and psychotic symptoms (e.g., [42
]). The polygenic risk score (PRS) approach uses the results from a GWAS of phenotype A (discovery sample) to create a risk score that is then applied to the genotypic data for individuals with phenotype B (target sample) and used to predict phenotype B. PRS approaches have been successful for phenotypes such as schizophrenia, identifying overlap with phenotypes such as bipolar disorder [87
]. As GWAS of insomnia improve in phenotype and sample size, it will be important to conduct PRS analyses with insomnia and psychiatric disorders to expand upon twin studies.
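At its core, the PRS described above is a weighted sum: each individual's risk-allele count at a SNP is multiplied by that SNP's effect size from the discovery GWAS, and the products are added up. The sketch below shows only this core step; the SNP identifiers, weights and dosages are hypothetical, and real pipelines usually add steps such as LD-based pruning and p-value thresholding before scoring.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1, or 2) for one individual.

    dosages: dict mapping SNP id -> risk-allele count in the target sample
    weights: dict mapping SNP id -> effect size (e.g., log odds ratio)
             from the discovery GWAS
    """
    # Only SNPs present in both the discovery results and the target data are scored.
    shared = set(dosages) & set(weights)
    return sum(weights[snp] * dosages[snp] for snp in shared)


# Hypothetical discovery effect sizes and one target individual's genotypes.
discovery_weights = {"snp1": 0.10, "snp2": -0.05, "snp3": 0.20}
target_dosages = {"snp1": 2, "snp2": 1, "snp3": 0, "snp4": 1}
score = polygenic_risk_score(target_dosages, discovery_weights)
```

The resulting scores would then be used as a predictor of phenotype B in the target sample, for example in a regression model.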
Another new method for comparing across traits, which only requires data at the level of GWAS summary statistics (i.e., the output from a GWAS, which consists of the SNPs analyzed, their effects, and p-values for the specific phenotype analyzed; this is not individual-specific data), is linkage disequilibrium score regression (LDSC) [89]. LDSC uses summary statistics and information on the LD structure of the population to generate a heritability estimate (similar to GCTA), and can produce a genetic correlation across multiple phenotypes. Bivariate GCTA can also be run to obtain a genetic correlation, but has the disadvantage of requiring full genomic data for all individuals. LDSC may also prove useful for insomnia: a recent study used LD score regression and found significant genetic correlations between sleep duration and type 2 diabetes (rG = 0.26) and schizophrenia (rG = 0.19), suggesting that genetic variants that contribute to these disorders may also be relevant for sleep duration [90]. Thus, conducting genetic correlation analyses for insomnia and other complex traits (psychiatric and non-psychiatric in nature) has the potential to inform gene-finding efforts.
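The intuition behind LDSC can be sketched in a few lines: under a polygenic model, a SNP's expected association chi-squared statistic rises linearly with its LD score, with a slope of N·h²/M (N = sample size, M = number of SNPs), so the slope of that regression yields a SNP-based heritability estimate. The example below is a deliberately simplified, unweighted least-squares version run on synthetic, noise-free data; the function name and all values are hypothetical, and the published method additionally uses regression weights and block-jackknife standard errors.

```python
def ldsc_heritability(chi2, ld_scores, n_samples, n_snps):
    """Estimate SNP heritability from the slope of chi^2 on LD score (plain OLS)."""
    m = len(chi2)
    mean_x = sum(ld_scores) / m
    mean_y = sum(chi2) / m
    sxx = sum((x - mean_x) ** 2 for x in ld_scores)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ld_scores, chi2))
    slope = sxy / sxx
    # An intercept well above 1 would suggest confounding (e.g., stratification).
    intercept = mean_y - slope * mean_x
    # Slope = N * h2 / M, so h2 = slope * M / N.
    h2 = slope * n_snps / n_samples
    return h2, intercept


# Synthetic, noise-free summary statistics generated from a known h2 of 0.2.
n_samples, n_snps, true_h2 = 50000, 1000000, 0.2
ld_scores = [10.0, 50.0, 100.0, 200.0]
chi2 = [1.0 + n_samples * true_h2 / n_snps * score for score in ld_scores]
h2, intercept = ldsc_heritability(chi2, ld_scores, n_samples, n_snps)
```

Because the synthetic data lie exactly on the assumed line, the sketch recovers the heritability used to generate them; real summary statistics are noisy and require the weighting scheme of the full method.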
Beyond identification of genes and examination of overlap, there are methods to follow up on GWAS findings. Gene and pathway analyses can be useful to examine SNPs and genes of interest from GWAS and can determine if there is evidence that specific pathways or genes involved in specific functions are enriched within the sample (e.g., using Ingenuity® Pathway Analysis software; [91]). The programs and methods available are constantly changing, as bioinformatics techniques advance rapidly. Epigenetics, which encompasses alterations in the genome that are not due to base pair changes (e.g., methylation, RNA interference [10]), may also be of particular relevance to insomnia, given that environmental factors are also important in the etiology of the disorder. A recent review suggests that since epigenetic mechanisms may be involved in sleep regulation as well as the stress response, it is plausible that epigenetics is a significant contributor to insomnia risk and symptoms [92].
Another genetic approach that can inform on insomnia mechanisms and may have more direct applications to a clinical population is gene expression (i.e., to what extent certain genes are turned on or off under specific conditions). Results from gene expression studies are beginning to provide new insight into insomnia and treatment outcomes. Gill and colleagues [93] compared gene expression profiles of veterans with and without insomnia, finding differential expression of 43 genes in insomnia. One gene in particular, Urotensin 2, was significantly downregulated, and represents a potential insomnia target, given its role in orexin regulation and rapid eye movement. Another study by the same group investigated gene expression changes in military personnel with insomnia, focusing on differences between individuals who did or did not experience improvements in sleep following three months of insomnia treatment [94]. Positive changes in immune-related genes (i.e., lower expression of inflammatory cytokines, increases in regulatory genes), implication of the ubiquitin pathway (via pathway analysis) and improvement in depression symptoms were all seen in the improved sleep group, providing insight into the molecular changes that occur with successful insomnia interventions. Further, Irwin and colleagues [95] examined the effects of several insomnia interventions (CBT-I, tai chi) on levels of inflammatory markers in a randomized controlled trial design using older individuals. Both interventions resulted in positive changes. More studies of gene expression, particularly those that can extend findings to different populations with insomnia, will be useful in understanding molecular mechanisms of the disorder and how to treat it.
The goal of studies on the genetic underpinnings of insomnia is to gain insight into the etiology of the disorder to inform development of interventions for both prevention and treatment of insomnia. For example, if we know that a certain gene influences an individual’s response to insomnia treatment (e.g., individuals with one genotype respond well to a certain medication while those with other genotypes do not), we could screen individuals before treatment to maximize outcomes. The same strategy could also be used to determine whether an individual should receive cognitive behavioral therapy for insomnia or medication. Further, the identification of risk and protective genes could allow us to identify at-risk individuals based on a combination of genotypes, allowing for early intervention and prevention. Eventually, it may even be possible to alter expression of risk genes (through drugs that target gene regulation) or use gene therapy to replace defective genes. Advances in psychiatric genetics have contributed to our knowledge of insomnia genetics, but there is still more work to do. Researchers should continue to focus on improving phenotypes and utilizing the most updated genetic methodology.