Abstract
Genome-wide association studies (GWAS) are commonly employed to study the genetic basis of complex traits/diseases, and a key question is how much heritability could be explained by all single nucleotide polymorphisms (SNPs) in GWAS. One widely used approach that relies on summary statistics only is linkage disequilibrium score regression (LDSC); however, this approach requires certain assumptions about the effects of SNPs (e.g., all SNPs contribute to heritability and each SNP contributes equal variance). More flexible modeling methods may be useful. We previously developed an approach recovering the “true” effect sizes from a set of observed z-statistics with an empirical Bayes approach, using only summary statistics. However, methods for standard error (SE) estimation are not available yet, limiting the interpretation of our results and the applicability of the approach. In this study, we developed several resampling-based approaches to estimate the SE of SNP-based heritability, including two jackknife and three parametric bootstrap methods. The resampling procedures are performed at the SNP level as it is most common to estimate heritability from GWAS summary statistics alone. Simulations showed that the delete-d-jackknife and parametric bootstrap approaches provide good estimates of the SE. In particular, the parametric bootstrap approaches yield the lowest root-mean-squared-error (RMSE) of the true SE. We also explored various methods for constructing confidence intervals (CIs). In addition, we applied our method to estimate the SNP-based heritability of 12 immune-related traits (levels of cytokines and growth factors) to shed light on their genetic architecture. We also implemented the methods to compute the sum of heritability explained and the corresponding SE in an R package SumVg. 
In conclusion, SumVg may provide a useful alternative tool for calculating SNP heritability and estimating SE/CI, which does not rely on distributional assumptions of SNP effects.
1. Introduction
Genome-wide association studies (GWAS) have proven to be successful in dissecting the genetic basis of a variety of diseases. A number of new susceptibility loci have been discovered, providing novel insight into the pathophysiology of many diseases. Nevertheless, a large proportion of heritability still remains unexplained. It is natural to question the maximum variance that could be explained by all variants in a GWAS (or meta-analyses of GWAS), as we expect that many true susceptibility variants are “hidden” due to limited power.
A number of methods have been developed to estimate total heritability according to all measured SNPs (also known as SNP-based heritability). Regarding methods that require individual-level data, in a pioneering work, Yang et al. [1] derived a method to estimate the variance explained by all SNPs in a GWAS using a linear mixed model with random SNP effects. The approach assumes that all SNPs have non-zero and normally distributed effects (beta), with a mean effect of zero. Each SNP is assumed to contribute the same amount of explained variance (i.e., variance explained by each SNP = total heritability/number of SNPs). Other similar approaches have also been proposed. For example, LDAK [2] assumes that the heritability explained by each SNP varies, depending on the minor allele frequency (MAF), linkage disequilibrium (LD) score and imputation quality of the SNP. Advanced methods have also been developed to estimate SNP-based heritability using summary statistics alone. (Here, summary statistics refer to GWAS results for each SNP, with effect size (beta), standard error of beta and test statistics/p-values available, or at least two of these items available.) LD score regression (LDSC) is one of the most widely used approaches for this purpose [3]. LDSC assumes a mean effect (beta) of zero and equal variance explained by each SNP (i.e., an infinitesimal model). SumHer [4] is an alternative approach based on the LDAK assumptions. For a more detailed technical review, please refer to ref [5]. The broader problem of SNP-based heritability estimation has also been discussed in several other reviews or opinion pieces [6,7,8,9].
Prior to the development of LDSC, we developed an alternative framework (see ref [10]; referred to as “SumVg” in this paper) to achieve the same goal of estimating SNP-based heritability using summary statistics alone. Essentially, we aimed to recover the true effect sizes from a set of observed z-statistics based on formulas presented by Robbins [11] (who attributed the idea to Maurice Kenneth Tweedie), Brown [12] and Efron [13]. The corrected z-statistics are then converted to variance explained. There are several advantages to this method. Most importantly, the SumVg approach does not rely on any distributional assumptions of the effect sizes of susceptibility variants. In addition, it does not assume that an equal amount of heritability is explained by each SNP, or that all SNPs contribute to the heritability (as in an infinitesimal model). There are also no assumptions about the relationship between allele frequencies and variance explained. The method is also computationally fast. In addition, since the LDSC method directly leverages LD patterns, a well-matched LD reference panel is usually required [14]. There is less reliance on LD information when using SumVg, as LD is mainly used for pruning.
Our method has been applied in a number of studies (for example see [15,16,17,18,19,20,21,22,23]). However, there are no methods available to quantify the standard error (SE) or precision of the heritability estimates from SumVg, or the corresponding confidence intervals (CIs). There is considerable technical difficulty in developing a reliable approach for estimating the SE, since usually only the summary data (instead of individual-level data) are available. If raw data are available, a standard non-parametric bootstrap could be employed by sampling individuals with replacement. When only summary statistics are available, however, there is currently no method for evaluating the SE or CI of the SumVg point estimate of heritability.
We summarize the contributions of this study below. Firstly, we proposed five re-sampling approaches to estimate the SE of the total heritability of all SNPs in GWAS, based on summary statistics alone. Extensive simulations were performed to compare and validate the performance of different methods. We also explored various methods for constructing CIs. Secondly, we developed an easy-to-use R program to implement the SumVg approach with different flexible modeling options, available at https://github.com/lab-hcso/Estimating-SE-of-total-heritability/ (accessed on 12 October 2023). Thirdly, we reported heritability estimates for 12 immune-related traits (levels of cytokines and growth factors) [24] based on this approach, for which LDSC was unable to provide reasonable estimates. Such cytokines/growth factors are regulators of immune responses and inflammation, and are important intermediate phenotypes for autoimmune, inflammatory and infectious diseases [25]. As such, it is of scientific and clinical importance to unravel the genetic architecture of these traits, and estimating their heritability may be considered a useful contribution in its own right.
2. Results
2.1. Overview of Methods
We estimated the total heritability (Vg) explained by all variants in a GWAS panel using Tweedie’s formula [10], which corrects selection bias in the observed z-statistics. To estimate the standard errors (SE), we proposed five resampling methods. The first two are based on the jackknife, namely the delete-one and delete-d-jackknife (with d = n/5 observations removed each time). We also proposed three parametric bootstrap methods, where z-statistics were sampled from a normal distribution based on the ‘corrected’ z-statistics, and/or the local false discovery rate (fdr) (i.e., estimated probability that a SNP is null). We also proposed several methods for constructing confidence intervals (CIs), including normal approximation with various bootstrap bias corrections, as well as the percentile and union of CI methods. We tested the performance of the SE and CI estimation methods in simulations under different heritability and sample size scenarios. We applied our methods to estimate the SNP-based heritability and corresponding SEs of 12 immune traits to reveal their genetic architecture.
2.2. Simulation Results for SE Estimation
Standard errors (SEs) of heritability, as estimated by the jackknife and bootstrap approaches, are listed in Table 1 and plotted in Figure 1. Bias, variance and root mean square error (RMSE) of SEs were calculated over 100 simulations (Table 2; Figure 1 and Figure 2).
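The three performance measures can be illustrated with a short sketch. It assumes that the bias is the mean estimated SE minus the true SE, that the variance is the sample variance (n − 1 divisor) of the estimated SEs, and that the RMSE is taken around the true SE; the function name and the choice of divisor are illustrative assumptions rather than details stated in the text.

```python
import math
import statistics


def se_estimator_performance(se_estimates, true_se):
    """Summarise SE estimates from repeated simulations against the true SE."""
    # Bias: average estimated SE minus the true SE.
    bias = statistics.fmean(se_estimates) - true_se
    # Variance of the SE estimates (sample variance; the divisor is an assumption).
    variance = statistics.variance(se_estimates)
    # RMSE: root of the mean squared deviation from the true SE.
    rmse = math.sqrt(statistics.fmean((s - true_se) ** 2 for s in se_estimates))
    return {"bias": bias, "variance": variance, "rmse": rmse}
```

In the simulations described here, `se_estimates` would hold the 100 SE estimates from one resampling method in one scenario, and `true_se` the empirical SE of Vg across the simulated datasets.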
Table 1.
Standard error (SE) of the sum of variance explained (Vg) estimated by different resampling approaches.

Figure 1.
Boxplots of SE estimated by different approaches. Vg is the sum of variance explained, N is the sample size, and the horizontal line refers to true SE calculated by repeating the experiments 100 times based on the true data generating mechanism. jack_del_1, jack_del_d, paraboot, fdrboot1 and fdrboot2 are different SE estimation approaches as described above.
Table 2.
Bias, variance and root mean squared error (RMSE) of SE estimated by different resampling approaches.

Figure 2.
Bar plots of bias, variance and root mean squared error (RMSE) of SE estimated by different approaches in simulations. Vg is the sum of variance explained, and N is the sample size. jack_del_1, jack_del_d, paraboot, fdrboot1 and fdrboot2 are different SE estimation approaches as described above.
The delete-[n/5]-jackknife worked reasonably well when the total heritability explained was low (when heritability = 0.101), but it tended to overestimate the SE when the total heritability was higher, especially with larger sample sizes. The bias was also positive across all simulation scenarios. The standard (delete-1) jackknife approach performed the worst among all methods, producing inflated estimates of SE. The variance and RMSE of this estimator were high compared to other approaches. The SE was, in general, over-estimated at all heritability levels across all sample sizes. This may be explained by the fact that the sum of variance explained is not a very smooth parameter, which impairs the performance of delete-1-jackknife estimators.
The other methods, including the original parametric bootstrap (paraboot) and the modified versions with consideration of local fdr, performed reasonably well, producing SE estimates that closely resembled the true SE. With the exception of one simulation setting, the parametric bootstrap methods achieved the lowest (absolute) bias for SE. For the variance and RMSE of SE, parametric bootstrap also performed the best. In terms of RMSE, the parametric bootstrap approaches modeling the local fdr (i.e., fdrboot1 and fdrboot2) outperformed the other methods. The RMSEs of the different estimators also decreased with increasing sample sizes.
2.3. Performance of Different CI Construction Methods
The full results are presented in Table 3 and Tables S1 and S2. For the standard CI (based on normal approximation), the CIs built from the SE of delete-d-jackknife performed reasonably well (in terms of coverage) for large sample sizes, although the coverage was not always adequate for modest sample sizes, especially for N < 20,000. The coverage of CIs constructed from other types of SEs was more variable, with good coverage for some scenarios but poor coverage for others. Therefore, we primarily focus on the SE from delete-d-jackknife when a standard CI is used. Interestingly, the bias-corrected standard CI, with bias correction based on paraboot or fdrboot2, performed better in the several cases when the standard CI had low coverage (<50%) (we assume that the SE from delete-d-jackknife was employed). The performance of percentile CIs was highly variable across different scenarios.
Table 3.
Coverage probabilities of different union CI (UCI) approaches for 95% CI.
In view of the highly variable performance of different CI construction methods, we expect the union of CI (UCI) to perform better and be more robust across different scenarios. We observed that the UCI, whether constructed from the standard or percentile CI estimators, in general achieved good coverage across most simulation scenarios, although in some cases the coverage was still below the desired level (95%). When we further took the union of standard and percentile UCI estimators (i.e., Method 3 listed under ‘Union CI’ in Section 4), the coverage was adequate for almost all scenarios, except one case in which both the sample size and the sum of variance explained (Vg) were low (N = 5000, Vg = 0.101).
2.4. Results on Immune Traits
PLINK was used to LD-prune the GWAS data for 12 immunological traits (Table 4) under various r2 thresholds to obtain roughly independent SNPs. We only included common variants with an MAF > 0.01 for further analysis. Then, using SumVg, the “true” z-statistics of the pruned SNPs were recovered to estimate the SNP-based heritability. The jackknife and bootstrap methods were used to compute the corresponding SEs (Table 5; Figure S1).
Table 4.
Summary of the immune traits being studied.
Table 5.
SE of the sum of variance explained estimated by different resampling approaches, for 12 immune traits (under different r2 pruning thresholds).
The total SNP-based heritability estimated by SumVg for the selected traits was around 10–20% based on a collection of LD-pruned SNPs, in contrast to the comparatively low or negative heritability estimates from LDSC. We obtained a stable (and likely conservative) estimate of heritability at r2 ~ 0.01 or 0.005. Lower r2 thresholds (i.e., r2 < 0.0025 and r2 < 0.001) had limited impact on the final estimates of heritability. The delete-one jackknife consistently produced the highest standard error, while the bootstrap and delete-d jackknife approaches produced SEs that were more comparable to one another. Out of the 12 cytokines/growth factors studied, the highest heritability was observed for the levels of IL-4 and IL-17.
2.5. R Package Implementation
We also implemented the methods to compute the sum of heritability explained and the corresponding SEs in an R package SumVg, available at https://github.com/lab-hcso/Estimating-SE-of-total-heritability/ (accessed on 12 October 2023).
The computational speed of different resampling approaches using SumVg is presented in Table S3 (assuming 100,000 SNPs and 200 resampling iterations). The speed is generally fast and the time taken was around 2–4 min for each resampling method, using a single core (Intel Xeon Gold 6230 CPU @ 2.10 GHz).
3. Discussion
In this study, we presented an approach for estimating the SE of SNP-based heritability estimates using SumVg, and our applications to immune phenotypes demonstrate the usefulness of this approach.
Our main purpose is to provide an alternative approach for SNP-based heritability and SE estimation, since different approaches have different statistical modeling assumptions, or assumptions about the genetic architecture. In practice, it is almost impossible to know the true genetic architecture of a disease/trait, and as such, it is very difficult to verify the correctness of heritability estimates due to the lack of a ‘gold standard’. It will be more reassuring if one observes similar heritability estimates from diverse methods. SumVg may provide a useful alternative reference for heritability estimates, in conjunction with existing approaches such as LDSC. SumVg may also be useful when standard approaches are unable to give reasonable results (e.g., close to zero heritability for traits that are likely to be heritable from previous studies, or negative estimates). It will be interesting to investigate the reasons underlying negative heritability estimates for LDSC; one possibility is mis-specified model assumptions [26], but the exact reasons will require further studies.
We recommend pruning the SNPs (such that SNPs are roughly in linkage equilibrium) before applying our method of heritability estimation. One approach is to employ a series of r2 thresholds (e.g., decreasing r2 from 0.1 to 0.001) and consider the point at which the heritability estimate becomes stable. Our empirical applications showed that an r2 threshold of ~0.01 may be sufficient. The resulting SNP-based heritability may be considered a conservative estimate (due to the possibility of removing some causal variants during LD-pruning). While not directly modeling LD is a limitation of this approach, the lower reliance on accurate LD information may be advantageous in some cases, for example when in-sample LD information is not available and only limited external reference data are present. On the other hand, we are also investigating methods to model LD in the SumVg framework. Since SumVg and LDSC are based on different modeling strategies and assumptions, and since the main focus of this study is the development of new SE/CI estimation approaches for SumVg (as well as applications to immune traits and presentation of a new R package), we leave a detailed comparison between SumVg and LDSC (or other SNP heritability estimation methods) for future work.
We have not investigated methods for SE estimation when raw genotype data are available. When raw data are available, one potential approach is to simply resample the individuals with replacement (i.e., standard non-parametric bootstrap). However, such an approach is computationally intensive and its performance relative to methods based on summary statistics requires further research. The above resampling methods can also potentially be sped up by splitting the job into multiple processes to be run in parallel, although this approach has not been implemented in our software yet. We also wish to point out that, as the resampling methods are designed for GWAS summary data, the computational speed is generally fast and is not affected by the GWAS sample size.
We have explored various approaches to construct CIs, although we have not yet found a single approach that yields an optimal CI with good coverage across all scenarios. We shall leave the development of more sophisticated and novel methodologies for CI construction for future work. For practical purposes, the union CI appears to perform well in terms of coverage across most scenarios (at the expense of wider CIs). On the other hand, we suspect that the issue of CI construction may not be unique to the SumVg approach; other methods for estimating SNP-based heritability typically require more stringent assumptions on the distribution of effects, and/or that all SNPs contribute to heritability. The violation of such assumptions may lead to biased estimates and inadequate coverage of CIs. Here we have proposed a bootstrap correction of bias, which indeed led to improvement in CI coverage in some cases, for example the standard CI under small sample sizes. Nevertheless, bootstrap correction showed variable performance across different scenarios and did not always reduce bias. The above issues may warrant further studies.
Here we further highlight several important points to note and limitations of our framework. Regarding the SumVg estimator of total SNP-based heritability, one future research direction is to further explore its asymptotic theoretical properties. We did not pursue this direction here. Of note, the key difficulty in Equation (1) (i.e., Tweedie’s formula) is to estimate $f(z)$ and $f'(z)$ accurately. We primarily employed a kernel density estimator here, although other density estimation approaches may also be attempted. Notably, the kernel density estimator has been shown to be asymptotically consistent under certain assumptions [27]. In the paper by Efron [28], the asymptotic regret (Reg) of the empirical Bayes approach (i.e., using Tweedie’s formula) was studied by comparing the Tweedie estimate with the Bayes estimate of the true effect size, for a fixed value of $z$ at $z_0$. It was shown that Reg tends towards zero as N tends towards infinity, and the regret depends on the squared error of $\hat{l}'(z_0)$ as an estimator of $l'(z_0)$, where $l(z) = \log f(z)$. Future theoretical studies of SumVg and other SNP heritability estimation methods are warranted.
In the current work, we assume that the summary statistics have been corrected for population stratification and other types of bias. If the original GWAS study suffered from bias, e.g., confounding, selection/ascertainment bias, sampling bias, bias due to missing data, etc., the resulting Vg estimate will also be affected. We suggest that the above bias should be carefully addressed at the design and/or analysis stage of the GWAS, for example by performing proper random sampling, inverse probability weighting to address selection bias [29], proper imputation of missing data, etc. As with any method, independent replication is also important.
Another limitation is that the proposed approach for calculating SNP heritability and SE/CI estimation may not work well for very small sample sizes. Since GWAS sample sizes are generally getting larger (most with N > 5000), we did not address the performance under very small sample sizes here. In such cases, both the SNP heritability and SE estimates may need to be viewed with greater caution. Meta-analysis of GWAS results across multiple studies may be recommended. Future work may also explore more innovative approaches to addressing small sample sizes, for example whether specifying a prior for the underlying effect sizes (δ) may help. (The current approach does not require any specification of the distribution of δ).
We also note that resampling methods often assume that the data points are independent of each other. In our study, prior to the analysis, we processed the data to remove strongly linked SNPs using LD pruning. The resulting SNPs are therefore roughly independent though some residual LD might remain. As a future direction, it may be useful to explore ways to fully tackle LD, for example by block bootstrap or jackknife [30]. However, external LD data from reference panels would be required, and there may be risks of LD mismatch between the studied and external samples. Further studies are required to investigate these issues.
Different resampling methods like bootstrap and jackknife may have different assumptions and applicability to different kinds of data. We have conducted relatively extensive simulations to compare the performance of different methods across a range of heritability levels and sample sizes, which helps evaluate their applicability. We believe the proposed methods are generally applicable to most GWAS summary data. Note that the parametric bootstrap approaches assume that the observed data (z-statistics) are drawn from a certain specified parametric distribution. In our case, it is assumed that the δ and/or local fdr are estimated reasonably well. For small sample sizes, this assumption may not hold very well. The jackknife approaches do not require parametric assumptions; however, the delete-one-jackknife has been shown to produce inconsistent variance estimators for non-smooth estimators such as the sample quantiles [31]. The delete-d-jackknife can resolve this problem, but the choice of d may not be straightforward. We suggest that multiple types of resampling methods should be performed; similar results across different methods may provide reassurance about the validity of the results. Future work may include more extensive simulations for different genetic architectures and wider applications to complex traits.
There may be a concern that resampling methods may not handle extreme values or skewed distributions well. As discussed above, we recommend the GWAS should be conducted carefully in the first place. For example, skewed phenotypes may require transformation before analysis, and confounding or other kinds of bias need to be addressed. The SumVg method works on summary statistics. It is possible to perform further inverse-rank transformation to the summary statistics if the distribution is skewed or outliers are present, although this may create some bias to the Vg estimate. One may also trim the outlying z-statistics, and increasing the number of resamples may also help. The performance of these approaches will be a topic for further studies.
Importantly, we have also applied our approach to estimate the heritability of different cytokines, which play important roles in immune response and the pathogenesis of autoimmune, inflammatory and infectious diseases. Our analyses suggest that the studied cytokines are moderately heritable in general.
To summarize, SumVg is useful for triangulating evidence from different approaches to support conclusions regarding SNP-based heritability. We present novel methods of computing SE and CI, together with easy-to-use software, which we believe will be helpful for other researchers. Our application to the cytokine levels also sheds light on the genetic architecture of these clinically important immune traits.
4. Materials and Methods
4.1. Estimation of the Total Heritability Explained (Vg)
We previously proposed an approach [10] to estimate the sum of heritability explained by all variants on a GWAS panel. Our approach leverages Tweedie’s formula for estimating the true underlying effect sizes of SNPs, based on the observed GWAS summary statistics. The principles are described in detail in the work by Efron [28].
4.1.1. Estimation of Total Vg Based on Tweedie’s Formula
More specifically, assume that we have a large number of normally distributed variables (here, z-statistics from a GWAS analysis), each with its own unobserved mean parameter δi:
$$z_i \sim N(\delta_i, 1), \quad i = 1, 2, \ldots, k,$$
where k is the total number of variables. The attention is focused on the more extreme values, for example the top SNPs in high-dimensional genomics studies. As described by Efron [28], ‘selection bias’ may be at play here. Intuitively, the more extreme z-statistics might have been ‘lucky’ as random errors pushed them to deviate from zero; as such they can ‘stand out’ among the other z-statistics. In other words, the true underlying effect sizes of these top SNPs tend to be less extreme than the observed values. This phenomenon is also known as the ‘winner’s curse’, for example see [32,33,34]. As a result, if we directly used the observed z-statistics to estimate the true effect sizes, the performance may not be optimal. Some form of ‘correction’ of the observed z-statistics is required.
Efron [28] proposed an empirical Bayes approach to reduce the selection bias, which was first described by Robbins [11], who attributed the ideas to Tweedie. The method assumes that
$$\delta \sim g(\cdot), \qquad z \mid \delta \sim N(\delta, \sigma^2).$$
In other words, we assume that δ was sampled from a prior ‘density’ g(.), then z ~ N(δ, σ2) was observed, with the variance σ2 known. There are no assumptions on the form of the prior density g. According to Tweedie’s formula,
$$E(\delta \mid z) = z + \sigma^2 \frac{d}{dz} \log f(z),$$
where f(z) is the marginal density of z.
In our setting of GWAS analyses, we assume σ2 = 1, since we work with the summary z-statistics. We estimated the true or ‘corrected’ effect sizes of SNPs using
$$\hat{\delta} = E(\delta \mid z) = z + \frac{d}{dz} \log f(z) = z + \frac{f'(z)}{f(z)}, \qquad (1)$$
which is equivalent to the formula above when σ2 = 1. Here z denotes the observed z-statistic, obtained from the estimated regression coefficient divided by its estimated SE (i.e., $z = \hat{\beta}/\widehat{SE}(\hat{\beta})$). δ is the z-statistic derived from the true effect size divided by the estimated SE (i.e., $\delta = \beta/\widehat{SE}(\hat{\beta})$), which can be considered a form of the ‘standardized’ true effect size. We previously proposed to employ a kernel density estimator to compute f(z) [10], which was shown to perform well in simulations. The total variance explained (Vg) can be obtained by converting the underlying effects δ to the Vg scale (see below and ref [10]).
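As a minimal sketch of the correction step, the snippet below applies Tweedie’s formula with σ2 = 1, i.e., corrected effect = z + f′(z)/f(z), where f is estimated by a Gaussian kernel density estimator. The function name, the Gaussian kernel and the rule-of-thumb bandwidth are assumptions made for illustration; they are not taken from the SumVg package, which may estimate the density differently.

```python
import math
import statistics


def tweedie_correct(z_values, bandwidth=None):
    """Return corrected effects z + f'(z)/f(z), with f from a Gaussian KDE."""
    k = len(z_values)
    if bandwidth is None:
        # Rule-of-thumb bandwidth (an assumption; not specified in the text).
        bandwidth = 1.06 * statistics.stdev(z_values) * k ** (-0.2)
    corrected = []
    for z in z_values:
        f = 0.0        # proportional to the density estimate at z
        f_prime = 0.0  # proportional to the derivative of the density at z
        for zj in z_values:
            u = (z - zj) / bandwidth
            w = math.exp(-0.5 * u * u)
            f += w
            f_prime += -u / bandwidth * w
        # The normalising constants of f and f' cancel in the ratio.
        corrected.append(z + f_prime / f)
    return corrected
```

The double loop costs O(k²) operations, so for panels with around 100,000 SNPs a binned or otherwise accelerated density estimate would likely be needed in practice. For the largest observed z-statistic, every other point lies to its left, so the estimated derivative is negative and the corrected value is pulled towards zero, which is the shrinkage behaviour described above.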
4.1.2. Conversion of z-Statistics to Vg
For continuous traits, the conversion formula followed our previous work [10] and can be derived from the ANOVA table of a regression:
$$V_g = \frac{z^2}{n - 2 + z^2},$$
where n is the GWAS sample size.
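The conversion for continuous traits can be sketched as follows, assuming the standard relation between a regression test statistic and the proportion of variance explained, Vg = z²/(n − 2 + z²), with n the GWAS sample size. Both function names are hypothetical, and the same sample size is assumed for every SNP for simplicity.

```python
def z_to_vg(z, n):
    """Variance explained by one SNP for a continuous trait, given z and sample size n."""
    return z * z / (n - 2 + z * z)


def total_vg(corrected_z, n):
    """Sum of variance explained over a set of (corrected) z-statistics."""
    return sum(z_to_vg(z, n) for z in corrected_z)
```

In the SumVg workflow, `corrected_z` would be the Tweedie-corrected statistics rather than the raw z-statistics, since applying the conversion to uncorrected values would carry the selection bias over to the Vg scale.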
For binary outcomes, it is also possible to convert the z-statistics to Vg, provided that the estimated SE (or beta) and minor allele frequencies (MAF) of the SNPs, as well as the outcome prevalence, are available. We followed the methodology described in ref [35], which described how to convert coefficients from a logistic model to the liability scale. Note that the liability is assumed to have a variance of one. We followed Equation (4) from the above paper [35] to derive the coefficient (τ1) on the liability scale. We converted τ1 to the standardized coefficient (τstandard) by multiplying τ1 by sqrt(2 × MAF × (1 − MAF)), which is the standard deviation (SD) of the allelic count (coded as 0, 1, 2). The total variance explained is given by the sum of the squared τstandard.
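The final standardization step for binary outcomes can be sketched as below. The liability-scale coefficient τ1 is taken as an input, because its derivation follows Equation (4) of ref [35], which is not reproduced here; the function names are hypothetical.

```python
import math


def liability_vg(tau1, maf):
    """Variance explained on the liability scale by one SNP."""
    # SD of the allelic count (coded 0, 1, 2) under Hardy-Weinberg equilibrium.
    sd_allele = math.sqrt(2 * maf * (1 - maf))
    tau_standard = tau1 * sd_allele
    # Liability is assumed to have unit variance, so the squared
    # standardized coefficient is the variance explained.
    return tau_standard ** 2


def total_liability_vg(tau1_values, maf_values):
    """Sum of liability-scale variance explained over SNPs."""
    return sum(liability_vg(t, m) for t, m in zip(tau1_values, maf_values))
```

For example, a SNP with τ1 = 0.1 and MAF = 0.5 contributes 0.1² × 0.5 = 0.005 to the total.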
4.1.3. Assumptions
Regarding the assumptions of this approach, we emphasize that it does not require prior assumptions about the underlying distributions of the true effect sizes δ, which is an important advantage over other SNP-heritability estimation methods. On the other hand, we assume that the summary statistics have been corrected for population stratification or other confounding factors. The z-statistics are assumed to follow normal distributions; for very small sample sizes, rare variants, highly imbalanced case to control ratios, or highly skewed continuous outcomes, etc., caution should be taken as to whether the test statistic follows a normal distribution. We assume full GWAS summary statistics as input; if the summary statistics have been selected based on their significance levels (e.g., some GWAS only released the top SNPs, say the top 10,000 SNPs), the proposed Tweedie’s formula may not work well. The effect sizes may be overestimated in this case, as the released SNPs have been selected for being significant.
4.1.4. An Alternative Conditional Estimator
We also proposed an alternative approach by evaluating the expected effect size conditioned on H1 (i.e., δ ≠ 0):
$$E(\delta \mid z, H_1) = \frac{E(\delta \mid z)}{1 - \mathrm{fdr}(z)},$$
where fdr is the local false discovery rate described in Efron [36]. The resulting estimate of Vg can be obtained by first converting $E(\delta \mid z, H_1)$ to the Vg scale (see Section 4.1.2), then multiplying by 1 − fdr(z).
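A minimal sketch of the conditional estimator for a single SNP is given below. It assumes that the conditional effect equals the Tweedie-corrected effect divided by 1 − fdr(z), which follows from E(δ | z) = fdr(z) × 0 + (1 − fdr(z)) × E(δ | z, H1), and it uses the continuous-trait conversion Vg = z²/(n − 2 + z²). The local fdr is taken as an input, and the function name is hypothetical.

```python
def conditional_vg(delta_hat, fdr, n):
    """Vg contribution of one SNP under the conditional estimator (continuous trait)."""
    if fdr >= 1.0:
        # SNP estimated to be null with certainty: no contribution.
        return 0.0
    # Expected effect given that the SNP is non-null.
    delta_h1 = delta_hat / (1.0 - fdr)
    # Convert to the Vg scale, then weight by the probability of being non-null.
    vg_h1 = delta_h1 ** 2 / (n - 2 + delta_h1 ** 2)
    return (1.0 - fdr) * vg_h1
```

Because the division by 1 − fdr(z) inflates the effect when fdr(z) is close to one, small errors in the local fdr can translate into large changes in the result, which illustrates why this estimator tends to be more variable.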
The conditional estimator, however, is prone to large random variations as it involves local fdr estimation of each SNP. In many subsequent applications of our heritability estimation method [15,16,17], the unconditional estimator (Equation (1)) was primarily employed. We shall hence focus on the unconditional estimator in this paper, although the resampling approaches described below can readily be applied to other estimators in our previous work [10] as well.
4.2. Estimation of the Standard Error (SE) of Vg
4.2.1. Standard and Delete-d-Jackknife to Estimate SE
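In the standard (delete-one) jackknife procedure [37], we estimate the standard error (SE) by leaving out one observation at a time. The SE is defined by
$$SE_{jack} = \sqrt{\frac{n-1}{n} \sum_{i=1}^{n} \left(\hat{\theta}_{(i)} - \hat{\theta}_{(\cdot)}\right)^2},$$
where n is the sample size (here, the number of SNPs, as resampling is performed at the SNP level), $\hat{\theta}_{(i)}$ is the parameter estimate from the sample with the ith observation removed and
$$\hat{\theta}_{(\cdot)} = \frac{1}{n} \sum_{i=1}^{n} \hat{\theta}_{(i)}.$$
In our case, the parameter is the sum of heritability from all variants.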
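The delete-one jackknife can be sketched generically as follows, using the usual formula SE = sqrt((n − 1)/n × Σ(θ(i) − mean θ)²), where θ(i) is the estimate with the ith observation removed. The `estimator` argument is a placeholder for any function of the data; in the present setting it would map a list of z-statistics to the estimated Vg.

```python
import math


def jackknife_se(data, estimator):
    """Delete-one jackknife standard error of estimator(data); data is a list."""
    n = len(data)
    # Re-estimate the parameter with each observation left out in turn.
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    centre = sum(leave_one_out) / n
    return math.sqrt((n - 1) / n * sum((t - centre) ** 2 for t in leave_one_out))
```

For the sample mean, this reproduces the familiar s/√n; for example, `jackknife_se([1.0, 2.0, 3.0, 4.0, 5.0], lambda x: sum(x) / len(x))` gives sqrt(0.5). With n re-estimations, the procedure can be costly when the estimator itself is expensive and the number of SNPs is large.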
An extension is the delete-d-jackknife [31], where we leave out d observations at a time. There are in total $\binom{n}{d}$ possibilities of removing d out of n observations, a number that is usually very large in practice. One may simply repeat the procedure m times with randomly chosen subsets instead of exhausting all possibilities. The standard error is given by
$$SE_{jack\text{-}d} = \sqrt{\frac{n-d}{d\,m} \sum_{v=1}^{m} \left(\hat{\theta}_{(v)} - \hat{\theta}_{(\cdot)}\right)^2},$$
where $\hat{\theta}_{(v)}$ denotes the parameter estimate in the vth jackknife replicate where d observations are left out, and $\hat{\theta}_{(\cdot)}$ is the mean of the m replicate estimates. The delete-d-jackknife (when $\sqrt{n} < d < n$) works better than the standard jackknife for non-smooth parameters like the median [31].
There are no clear rules on the choice of d in the delete-d-jackknife. Chatterjee [38] suggested n/5 as a reasonable choice for d based on the consideration of efficiency and likely model conditions. We followed the suggestion by Chatterjee [38] and set d as n/5 (=20,000) in all simulations.
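A sketch of the delete-d-jackknife with m randomly chosen subsets is shown below. It assumes the scaling SE = sqrt((n − d)/(d × m) × Σ(θ(v) − mean θ)²); the function name, the default m = 200 and the optional `rng` argument are illustrative choices rather than details of the SumVg package.

```python
import math
import random
import statistics


def delete_d_jackknife_se(data, estimator, d, m=200, rng=None):
    """Delete-d jackknife SE using m random subsets with d observations removed."""
    rng = rng or random.Random()
    n = len(data)
    if not 0 < d < n:
        raise ValueError("d must satisfy 0 < d < n")
    estimates = []
    for _ in range(m):
        # Indices retained in this replicate (sampled without replacement).
        kept = rng.sample(range(n), n - d)
        estimates.append(estimator([data[i] for i in kept]))
    centre = statistics.fmean(estimates)
    ss = sum((t - centre) ** 2 for t in estimates)
    return math.sqrt((n - d) / (d * m) * ss)
```

With d = n/5, each replicate re-estimates the parameter on 80% of the SNPs. For the sample mean, the result approaches s/√n as m grows, which provides a simple sanity check of the scaling factor.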
4.2.2. Parametric Bootstrap Approaches for Estimating SE
In the parametric bootstrap, in each replication we simulated z-statistics based on $z_{\mathrm{corr}}$, the ‘corrected’ z-statistics from the original sample (this method is referred to as ‘paraboot’). We have
$$z_i^{(b)} \sim N\left(z_{\mathrm{corr},i},\, 1\right)$$
where $z_i^{(b)}$ denotes the ith z-statistic in the bth bootstrap replicate. For small effects, the $z_{\mathrm{corr},i}$ will be shrunken towards zero.
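We further proposed a modified approach that also considers the local fdr (i.e., the probability of being null given z) of each z-statistic. In each replicate, we simulated z-statistics according to the following scheme:
$$z_i^{(b)} \sim \begin{cases} N(0,\, 1) & \text{with probability } \mathrm{fdr}(z_i) \\ N(z_i,\, 1) & \text{with probability } 1 - \mathrm{fdr}(z_i) \end{cases}$$
where $z_i$ denotes the observed z-statistics. The standard error is then computed from the estimates obtained on the simulated z-statistics. This method is referred to as ‘fdrboot1’.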
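Alternatively, one may employ the corrected z-statistics instead of the observed z-statistics as the mean in each simulation, i.e.,
$$z_i^{(b)} \sim \begin{cases} N(0,\, 1) & \text{with probability } \mathrm{fdr}(z_i) \\ N\left(z_{\mathrm{corr},i},\, 1\right) & \text{with probability } 1 - \mathrm{fdr}(z_i) \end{cases}$$
where $z_{\mathrm{corr},i}$ denotes the corrected z-statistic of the ith SNP. This method is referred to as ‘fdrboot2’.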
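The three bootstrap schemes differ only in the mean used to simulate each z-statistic, which the following Python sketch tries to make explicit. It assumes unit variance for the simulated z-statistics and takes the corrected z-statistics and local fdr values as given inputs; the function names, the toy input values and `toy_estimator` (a placeholder, not the SumVg estimator) are all hypothetical.

```python
import math
import random

SCHEMES = ("paraboot", "fdrboot1", "fdrboot2")

def simulate_z(z_obs, z_corr, fdr, scheme, rng):
    """Draw one bootstrap replicate of z-statistics under the named scheme."""
    if scheme not in SCHEMES:
        raise ValueError(f"unknown scheme: {scheme}")
    out = []
    for z, zc, f in zip(z_obs, z_corr, fdr):
        if scheme == "paraboot":
            mu = zc                                   # always centre on the corrected z
        elif rng.random() < f:
            mu = 0.0                                  # treated as null with probability fdr
        else:
            mu = z if scheme == "fdrboot1" else zc    # observed vs corrected z
        out.append(rng.gauss(mu, 1.0))
    return out

def bootstrap_se(z_obs, z_corr, fdr, estimator, scheme, n_boot=200, seed=0):
    """SE as the standard deviation of the estimator across bootstrap replicates."""
    rng = random.Random(seed)
    est = [estimator(simulate_z(z_obs, z_corr, fdr, scheme, rng)) for _ in range(n_boot)]
    centre = sum(est) / n_boot
    return math.sqrt(sum((e - centre) ** 2 for e in est) / (n_boot - 1))

def toy_estimator(zs):
    # Placeholder only: the mean of z, NOT the heritability estimator.
    return sum(zs) / len(zs)

z_obs = [0.3, -1.2, 4.5, 0.1, 2.8]
z_corr = [0.1, -0.6, 4.1, 0.0, 2.2]    # hypothetical corrected z-statistics
fdr = [0.95, 0.70, 0.01, 0.98, 0.20]   # hypothetical local fdr values
for scheme in SCHEMES:
    print(scheme, bootstrap_se(z_obs, z_corr, fdr, toy_estimator, scheme))
```

In practice `toy_estimator` would be replaced by a function that maps a vector of z-statistics to the estimated sum of heritability.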
4.3. Construction of Confidence Intervals (CIs): An Exploratory Analysis
The construction of a proper CI is a more demanding task as it requires the unbiasedness of the estimate and correct estimation of the variability of the estimate. Given the difficulty of constructing accurate CIs, here we consider CI estimation as a secondary or exploratory analysis which requires further investigation and methodological development. We have explored a few approaches as described below.
4.3.1. Normal Approximation (Standard Approach)
Firstly, we explored the standard approach for constructing the 95% CI using the normal approximation, i.e., $\hat{V}_g \pm z_{0.975} \times \mathrm{SE}$, where $z_{0.975}$ is the 97.5th percentile of the standard normal distribution (approximately 1.96). Assuming a polygenic model, the total heritability is the sum of variance explained contributed by many variants of small to modest effect sizes. Hence, it is reasonable to assume normality according to the central limit theorem (as is assumed by other SNP-heritability estimation tools). We examined the performance of different CIs, with SE determined by various methods. Empirically, we found that SE computed by the delete-d-jackknife performed reasonably well.
We also explored using the bootstrap to correct for bias in the point estimates of Vg. In brief, the bias can be estimated by [39]
$$\widehat{\mathrm{bias}} = \bar{V}_g^{*} - \hat{V}_g$$
where $\hat{V}_g$ denotes the observed Vg, and $\bar{V}_g^{*}$ is the mean of the bootstrapped estimates of Vg. The bias-corrected estimator of Vg is given by
$$\hat{V}_{g,\mathrm{bc}} = \hat{V}_g - \widehat{\mathrm{bias}} = 2\hat{V}_g - \bar{V}_g^{*}$$
The 95% CI is then based on $\hat{V}_{g,\mathrm{bc}} \pm z_{0.975} \times \mathrm{SE}$. Since we proposed three bootstrap procedures, there were three bootstrap bias-corrected CIs based on the normal approximation. The standard CI without bias correction was also included as another estimator.
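A short Python sketch of the normal-approximation CI, with and without bootstrap bias correction, is given below. It assumes the usual bootstrap bias estimate (mean of the bootstrapped estimates minus the observed estimate); the function name `normal_ci` and the numerical inputs are hypothetical.

```python
from statistics import NormalDist, mean

def normal_ci(vg_hat, se, boot_estimates=None, level=0.95):
    """Normal-approximation CI, optionally bias-corrected with bootstrap replicates."""
    centre = vg_hat
    if boot_estimates is not None:
        bias = mean(boot_estimates) - vg_hat    # bootstrap estimate of the bias
        centre = vg_hat - bias                  # bias-corrected point estimate
    zq = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    return centre - zq * se, centre + zq * se

# Hypothetical numbers: observed Vg = 0.10, SE = 0.02, three bootstrap estimates.
print(normal_ci(0.10, 0.02))
print(normal_ci(0.10, 0.02, boot_estimates=[0.11, 0.12, 0.13]))
```

In the second call the bootstrap mean is 0.12, so the estimated bias is 0.02 and the interval is centred on 0.08 instead of 0.10; the width is unchanged.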
4.3.2. Percentile Approach
Secondly, we explored the percentile CI approach, namely the construction of 95% CIs from the 2.5th and 97.5th percentiles of the bootstrapped Vg. Again, bias correction can be applied as follows
$$\left[\, q_{0.025} - \widehat{\mathrm{bias}},\;\; q_{0.975} - \widehat{\mathrm{bias}} \,\right]$$
where $q_{0.025}$ and $q_{0.975}$ are the 2.5th and 97.5th percentiles of the bootstrapped replicates of Vg, respectively, and $\widehat{\mathrm{bias}}$ is the bootstrap bias estimate described in Section 4.3.1. Bias correction was based on the same bootstrap method that was used to derive the percentiles. Again, we also included the percentile CIs without bias correction.
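The percentile interval can be sketched in Python as follows. The sketch assumes that bias correction shifts both percentile limits by the bootstrap bias estimate (bootstrap mean minus observed estimate), and it uses the ‘inclusive’ quantile definition from the standard library; the function name and the toy input are hypothetical.

```python
from statistics import mean, quantiles

def percentile_ci(boot_estimates, vg_hat=None):
    """95% percentile CI from bootstrap replicates, optionally bias-corrected."""
    # quantiles(..., n=40) returns 39 cut points in steps of 2.5%:
    # the first is the 2.5th percentile and the last is the 97.5th percentile.
    cuts = quantiles(boot_estimates, n=40, method="inclusive")
    lo, hi = cuts[0], cuts[-1]
    if vg_hat is not None:
        bias = mean(boot_estimates) - vg_hat   # bootstrap estimate of the bias
        lo, hi = lo - bias, hi - bias          # shift both limits by the bias
    return lo, hi

boot = list(range(201))                 # toy replicates 0, 1, ..., 200
print(percentile_ci(boot))              # (5.0, 195.0)
print(percentile_ci(boot, vg_hat=90))   # (-5.0, 185.0)
```

In the toy example the bootstrap mean is 100, so an observed estimate of 90 implies a bias of 10 and both limits move down by 10.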
4.3.3. Union CI
Thirdly, we explored a more robust CI estimator by taking the union of individual CIs (UCI). The union of multiple CIs is constructed by taking the minimum of the lower limits across the different methods as the final lower limit, and the maximum of the upper limits as the final upper limit. This union approach can improve robustness if the CI construction approaches perform differently under different scenarios. The UCI method has been widely employed in instrumental variables regression to improve the robustness of results in the presence of pleiotropy [40].
In summary, the following methods were explored:
- Normal approximation (standard approach), without bias correction (one estimator) or with bootstrap bias correction (3 estimators), then take the union of CIs;
- Percentile approach, without bias correction (3 estimators) and with bias correction (3 estimators), then take the union of CIs;
- Union of the final CIs obtained from the two approaches above.
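The union step itself is simple and can be written in a few lines of Python. The function name `union_ci` and the example intervals are hypothetical.

```python
def union_ci(intervals):
    """Union of several CIs: smallest lower limit and largest upper limit."""
    lowers, uppers = zip(*intervals)
    return min(lowers), max(uppers)

# Hypothetical CIs from three different construction methods.
cis = [(0.05, 0.15), (0.03, 0.12), (0.06, 0.18)]
print(union_ci(cis))  # (0.03, 0.18)
```

By construction the union is at least as wide as each individual interval, so its coverage cannot be lower than that of any of its components.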
4.4. Simulation Studies
We compared the SE estimated by the above methods with the ‘true’ SE obtained from one hundred simulations with known data-generating distributions. The details of the simulations are described in our previous work [10]. Briefly, a gamma distribution was used to simulate three levels of total variance explained (Vg = 0.101, 0.191, 0.295), which were converted to true effect sizes (δ). Z-statistics for 100,000 independent SNPs (of which 0.5% were non-null) were then simulated from N(δ,1) for different sample sizes (N = 5000, 10,000, 20,000, 50,000, 100,000, 200,000) and used as input for SumVg. Two hundred replicates were run for each bootstrap or jackknife procedure. We focused on quantitative traits in our simulations, but the results should most likely apply to binary traits as well, as the only difference between the two scenarios is the formula used to convert z to variance explained (Vg). The performance of the different methods for CI construction was also evaluated.
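The data-generating step can be sketched in Python under several stated assumptions. The gamma shape parameter, the rescaling of per-SNP variances so that they sum to the target Vg, the positive sign of all effects, and the conversion between per-SNP variance explained v and effect size δ (taken here as v = δ²/(N − 2 + δ²) for a quantitative trait) are all illustrative choices, not details confirmed by the text above.

```python
import math
import random

def simulate_z_stats(total_vg, n_samples, n_snps=100_000, frac_nonnull=0.005,
                     gamma_shape=1.0, seed=0):
    """Simulate z-statistics for independent SNPs with a given total Vg."""
    rng = random.Random(seed)
    n_causal = round(n_snps * frac_nonnull)
    # Draw relative per-SNP variances from a gamma distribution, then rescale
    # them so that they sum to the target total variance explained.
    raw = [rng.gammavariate(gamma_shape, 1.0) for _ in range(n_causal)]
    scale = total_vg / sum(raw)
    vg = [r * scale for r in raw]
    # Assumed conversion: v = delta^2 / (N - 2 + delta^2), solved for delta.
    delta = [math.sqrt(v * (n_samples - 2) / (1.0 - v)) for v in vg]
    delta += [0.0] * (n_snps - n_causal)     # null SNPs have delta = 0
    z = [rng.gauss(d, 1.0) for d in delta]   # z ~ N(delta, 1)
    return z, delta

z, delta = simulate_z_stats(total_vg=0.101, n_samples=10_000, n_snps=10_000)
print(len(z), sum(d > 0 for d in delta))
```

Repeating such a simulation many times and taking the standard deviation of the resulting Vg estimates gives an empirical ‘true’ SE against which the resampling-based SEs can be compared.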
4.5. Application to Immune Traits
A selected set of immune-related traits (levels of cytokines/growth factors) was included for study, based on the GWAS by Ahola-Olli et al. [24]. We selected 12 continuous immune traits with (1) sample size N > 5000 and (2) very low (≤3%) or negative SNP-based heritability estimated by LDSC. The LDSC heritability estimates were based on pre-calculated values from GWASAtlas (https://atlas.ctglab.nl/; accessed 1 May 2023). SNPs in strong LD were removed using the PLINK command “--indep-pairwise 100 25 r2” with a series of r2 thresholds (0.1, 0.05, 0.025, 0.01, 0.005, 0.002, 0.001). The 1000G Phase3 EUR sample was used as the reference panel to calculate LD among variants. Independent SNPs with MAF > 0.01 were then used as input to SumVg.
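The series of pruning runs can be organised with a small Python helper that only builds the PLINK argument lists (it does not execute them). The `--indep-pairwise` window, step and r2 values follow the text; the `--bfile` and `--out` options are standard PLINK flags, but the file prefixes shown here are hypothetical placeholders.

```python
R2_THRESHOLDS = [0.1, 0.05, 0.025, 0.01, 0.005, 0.002, 0.001]

def plink_pruning_commands(bfile, out_prefix, window=100, step=25):
    """Build one PLINK LD-pruning command per r2 threshold (not executed here)."""
    cmds = []
    for r2 in R2_THRESHOLDS:
        cmds.append([
            "plink", "--bfile", bfile,
            "--indep-pairwise", str(window), str(step), str(r2),
            "--out", f"{out_prefix}_r2_{r2}",
        ])
    return cmds

# Hypothetical reference-panel prefix and output prefix.
for cmd in plink_pruning_commands("1000G_EUR", "pruned"):
    print(" ".join(cmd))
```

Each argument list could be passed to `subprocess.run` once the reference panel files are in place.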
Supplementary Materials
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijms25021347/s1.
Author Contributions
Conceptualization, H.-C.S.; methodology, H.-C.S. and P.-C.S.; software, H.-C.S.; formal analysis, H.-C.S.; investigation, H.-C.S., X.X., Z.M. and P.-C.S.; data curation, H.-C.S., X.X. and Z.M.; writing—original draft preparation, H.-C.S.; writing—review and editing, H.-C.S., X.X., Z.M. and P.-C.S.; supervision, H.-C.S. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by a Theme-based Research Grant, grant number T44-410/21-N from the Research Grants Council (RGC), an NSFC grant (grant number 81971706), and a Collaborative Research Fund (CRF), grant number C4054-17W from RGC. H.-C.S. was also supported by the KIZ-CUHK Joint Laboratory of Bioresources and Molecular Research of Common Diseases, and the Lo Kwee Seong Biomedical Research Fund.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The GWAS summary statistics of the immune traits were downloaded from the GWASAtlas (https://atlas.ctglab.nl/; accessed 1 May 2023); LDSC heritability estimates were extracted from the same website.
Acknowledgments
We would like to thank Jinghong Qiu for help with formatting the manuscript, and Kenneth C. Y. Wong for help with the software coding and documentation.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Yang, J.; Benyamin, B.; McEvoy, B.P.; Gordon, S.; Henders, A.K.; Nyholt, D.R.; Madden, P.A.; Heath, A.C.; Martin, N.G.; Montgomery, G.W. Common SNPs explain a large proportion of the heritability for human height. Nat. Genet. 2010, 42, 565–569.
- Speed, D.; Hemani, G.; Johnson, M.R.; Balding, D.J. Improved heritability estimation from genome-wide SNPs. Am. J. Hum. Genet. 2012, 91, 1011–1021.
- Bulik-Sullivan, B.K.; Loh, P.R.; Finucane, H.K.; Ripke, S.; Yang, J.; Schizophrenia Working Group of the Psychiatric Genomics Consortium; Patterson, N.; Daly, M.J.; Price, A.L.; Neale, B.M. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat. Genet. 2015, 47, 291–295.
- Speed, D.; Balding, D.J. SumHer better estimates the SNP heritability of complex traits from summary statistics. Nat. Genet. 2019, 51, 277–284.
- Zhu, H.; Zhou, X. Statistical methods for SNP heritability estimation and partition: A review. Comput. Struct. Biotechnol. J. 2020, 18, 1557–1568.
- Barry, C.-J.S.; Walker, V.M.; Cheesman, R.; Davey Smith, G.; Morris, T.T.; Davies, N.M. How to estimate heritability: A guide for genetic epidemiologists. Int. J. Epidemiol. 2023, 52, 624–632.
- Zuk, O.; Hechter, E.; Sunyaev, S.R.; Lander, E.S. The mystery of missing heritability: Genetic interactions create phantom heritability. Proc. Natl. Acad. Sci. USA 2012, 109, 1193–1198.
- Brandes, N.; Weissbrod, O.; Linial, M. Open problems in human trait genetics. Genome Biol. 2022, 23, 131.
- Young, A.I. Solving the missing heritability problem. PLoS Genet. 2019, 15, e1008222.
- So, H.C.; Li, M.; Sham, P.C. Uncovering the total heritability explained by all true susceptibility variants in a genome-wide association study. Genet. Epidemiol. 2011, 35, 447–456.
- Robbins, H. An empirical Bayes approach to statistics. In Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, Cambridge, UK, 26–31 December 1954, July and August 1955; University of California Press: Berkeley, CA, USA; Los Angeles, CA, USA, 1956; Volume 1, pp. 157–163.
- Brown, L.D. Admissible estimators, recurrent diffusions, and insoluble boundary value problems. Ann. Math. Stat. 1971, 42, 855–903.
- Efron, B. Empirical Bayes estimates for large-scale prediction problems. J. Am. Stat. Assoc. 2009, 104, 1015–1028.
- Zhang, Y.; Cheng, Y.; Jiang, W.; Ye, Y.; Lu, Q.; Zhao, H. Comparison of methods for estimating genetic correlation between complex traits using GWAS summary statistics. Brief. Bioinform. 2021, 22, bbaa442.
- Benke, K.S.; Nivard, M.G.; Velders, F.P.; Walters, R.K.; Pappa, I.; Scheet, P.A.; Xiao, X.; Ehli, E.A.; Palmer, L.J.; Whitehouse, A.J.; et al. A genome-wide association meta-analysis of preschool internalizing problems. J. Am. Acad. Child. Adolesc. Psychiatry 2014, 53, 667–676.e667.
- Lubke, G.H.; Hottenga, J.J.; Walters, R.; Laurin, C.; de Geus, E.J.; Willemsen, G.; Smit, J.H.; Middeldorp, C.M.; Penninx, B.W.; Vink, J.M.; et al. Estimating the genetic variance of major depressive disorder due to all single nucleotide polymorphisms. Biol. Psychiatry 2012, 72, 707–709.
- van Beek, J.H.; Lubke, G.H.; de Moor, M.H.; Willemsen, G.; de Geus, E.J.; Hottenga, J.J.; Walters, R.K.; Smit, J.H.; Penninx, B.W.; Boomsma, D.I. Heritability of liver enzyme levels estimated from genome-wide SNP data. Eur. J. Hum. Genet. 2014, 23, 1223–1228.
- Hibar, D.P.; Stein, J.L.; Renteria, M.E.; Arias-Vasquez, A.; Desrivieres, S.; Jahanshad, N.; Toro, R.; Wittfeld, K.; Abramovic, L.; Andersson, M.; et al. Common genetic variants influence human subcortical brain structures. Nature 2015, 520, 224–229.
- Paternoster, L.; Standl, M.; Waage, J.; Baurecht, H.; Hotze, M.; Strachan, D.P.; Curtin, J.A.; Bonnelykke, K.; Tian, C.; Takahashi, A.; et al. Multi-ancestry genome-wide association study of 21,000 cases and 95,000 controls identifies new risk loci for atopic dermatitis. Nat. Genet. 2015, 47, 1449–1456.
- Lo, M.T.; Hinds, D.A.; Tung, J.Y.; Franz, C.; Fan, C.C.; Wang, Y.; Smeland, O.B.; Schork, A.; Holland, D.; Kauppi, K.; et al. Genome-wide analyses for personality traits identify six genomic loci and show correlations with psychiatric disorders. Nat. Genet. 2017, 49, 152–156.
- Minica, C.C.; Verweij, K.J.H.; van der Most, P.J.; Mbarek, H.; Bernard, M.; van Eijk, K.R.; Lind, P.A.; Liu, M.Z.; Maciejewski, D.F.; Palviainen, T.; et al. Genome-wide association meta-analysis of age at first cannabis use. Addiction 2018, 113, 2073–2086.
- Ahluwalia, T.S.; Prins, B.P.; Abdollahi, M.; Armstrong, N.J.; Aslibekyan, S.; Bain, L.; Jefferis, B.; Baumert, J.; Beekman, M.; Ben-Shlomo, Y.; et al. Genome-wide association study of circulating interleukin 6 levels identifies novel loci. Hum. Mol. Genet. 2021, 30, 393–409.
- Shin, S.H.; Park, S.; Wright, C.; D’Astous, V.A.; Kim, G. The Role of Polygenic Score and Cognitive Activity in Cognitive Functioning Among Older Adults. Gerontologist 2021, 61, 319–329.
- Ahola-Olli, A.V.; Würtz, P.; Havulinna, A.S.; Aalto, K.; Pitkänen, N.; Lehtimäki, T.; Kähönen, M.; Lyytikäinen, L.P.; Raitoharju, E.; Seppälä, I.; et al. Genome-wide Association Study Identifies 27 Loci Influencing Concentrations of Circulating Cytokines and Growth Factors. Am. J. Hum. Genet. 2017, 100, 40–50.
- Turner, M.D.; Nedjai, B.; Hurst, T.; Pennington, D.J. Cytokines and chemokines: At the crossroads of cell signalling and inflammatory disease. Biochim. Biophys. Acta (BBA) - Mol. Cell Res. 2014, 1843, 2563–2582.
- Steinsaltz, D.; Dahl, A.; Wachter, K.W. On Negative Heritability and Negative Estimates of Heritability. Genetics 2020, 215, 343–357.
- Wied, D.; Weißbach, R. Consistency of the kernel density estimator: A survey. Stat. Pap. 2012, 53, 1–21.
- Efron, B. Tweedie’s formula and selection bias. J. Am. Stat. Assoc. 2011, 106, 1602.
- Carry, P.M.; Vanderlinden, L.A.; Dong, F.; Buckner, T.; Litkowski, E.; Vigers, T.; Norris, J.M.; Kechris, K. Inverse probability weighting is an effective method to address selection bias during the analysis of high dimensional data. Genet. Epidemiol. 2021, 45, 593–603.
- Horowitz, J.L. Bootstrap methods in econometrics. Annu. Rev. Econ. 2019, 11, 193–224.
- Shao, J.; Wu, C.J. A general theory for jackknife variance estimation. Ann. Stat. 1989, 17, 1176–1197.
- Zhong, H.; Prentice, R.L. Bias-reduced estimators and confidence intervals for odds ratios in genome-wide association studies. Biostatistics 2008, 9, 621–634.
- Sun, L.; Bull, S.B. Reduction of selection bias in genomewide studies by resampling. Genet. Epidemiol. Off. Publ. Int. Genet. Epidemiol. Soc. 2005, 28, 352–367.
- Zöllner, S.; Pritchard, J.K. Overcoming the winner’s curse: Estimating penetrance parameters from case-control data. Am. J. Hum. Genet. 2007, 80, 605–615.
- Gillett, A.C.; Vassos, E.; Lewis, C.M. Transforming summary statistics from logistic regression to the liability scale: Application to genetic and environmental risk scores. Hum. Hered. 2019, 83, 210–224.
- Efron, B.; Tibshirani, R.; Storey, J.D.; Tusher, V. Empirical Bayes analysis of a microarray experiment. J. Am. Stat. Assoc. 2001, 96, 1151–1160.
- Miller, R.G. The jackknife - A review. Biometrika 1974, 61, 1–15.
- Chatterjee, S. Another look at the jackknife: Further examples of generalized bootstrap. Stat. Probab. Lett. 1998, 40, 307–319.
- Efron, B.; Tibshirani, R.J. An Introduction to the Bootstrap; CRC Press: Boca Raton, FL, USA, 1994.
- Conley, T.G.; Hansen, C.B.; Rossi, P.E. Plausibly exogenous. Rev. Econ. Stat. 2012, 94, 260–272.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).