SumVg: Total Heritability Explained by All Variants in Genome-Wide Association Studies Based on Summary Statistics with Standard Error Estimates

Genome-wide association studies (GWAS) are commonly employed to study the genetic basis of complex traits/diseases, and a key question is how much heritability could be explained by all single nucleotide polymorphisms (SNPs) in GWAS. One widely used approach that relies on summary statistics only is linkage disequilibrium score regression (LDSC); however, this approach requires certain assumptions about the effects of SNPs (e.g., all SNPs contribute to heritability and each SNP contributes equal variance). More flexible modeling methods may be useful. We previously developed an approach recovering the “true” effect sizes from a set of observed z-statistics with an empirical Bayes approach, using only summary statistics. However, methods for standard error (SE) estimation are not available yet, limiting the interpretation of our results and the applicability of the approach. In this study, we developed several resampling-based approaches to estimate the SE of SNP-based heritability, including two jackknife and three parametric bootstrap methods. The resampling procedures are performed at the SNP level as it is most common to estimate heritability from GWAS summary statistics alone. Simulations showed that the delete-d-jackknife and parametric bootstrap approaches provide good estimates of the SE. In particular, the parametric bootstrap approaches yield the lowest root-mean-squared-error (RMSE) of the true SE. We also explored various methods for constructing confidence intervals (CIs). In addition, we applied our method to estimate the SNP-based heritability of 12 immune-related traits (levels of cytokines and growth factors) to shed light on their genetic architecture. We also implemented the methods to compute the sum of heritability explained and the corresponding SE in an R package SumVg. In conclusion, SumVg may provide a useful alternative tool for calculating SNP heritability and estimating SE/CI, which does not rely on distributional assumptions of SNP effects.


Introduction
Genome-wide association studies (GWAS) have proven to be successful in dissecting the genetic basis of a variety of diseases.A number of new susceptibility loci have been discovered, providing novel insight into the pathophysiology of many diseases.Nevertheless, a large proportion of the heritability still remained unexplained.It is natural to question the maximum variance that could be explained by all variants in a GWAS (or meta-analyses of GWAS), as we expect many true susceptibility variants are "hidden" due to limited power.
A number of methods have been developed to estimate the total heritability by all measured SNPs (also known as SNP-based heritability.Regarding methods that require individual-level data, in a pioneering work, Yang et al 1 derived a method to estimate the variance explained by all SNPs in a GWAS by a liner mixed model with random SNP effects.The approach assumes all SNPs have non-zero and normally distributed effects (beta), with a mean effect of zero.Each SNP is assumed to contribute to the same level of explained variance (i.e., variance explained by each SNP = total heritability/number of SNPs).Other similar approaches have also been proposed.For example, LDAK 2 assumes different heritability explained each SNP, depending on the minor allele frequencies (MAF), LD score and imputation quality of the SNP.Advanced methods have also been developed to estimate SNP-based heritability using summary statistics alone.LD score regression (LDSC) is one of the most widely used approaches for this purpose 3 .
LDSC assumes a mean effect (beta) of zero and equal variance explained by each SNP (i.e., an infinitesimal model).
SumHer 4 is an alternative approach based on the LDAK assumptions.For a more detailed review, please refer to 5 .
Prior to the development of LDSC, we have developed an alternative framework ( 6 ; referred to as "SumVg" in this paper) to achieve the same goal of estimating SNP-based heritability using summary statistics alone.Essentially, we aimed to recover the "true" z-statistic from a set of observed z-statistics based the following formula established by Brown 7 and Efron 8 .The corrected z-statistics are then converted to variance explained.There are several advantages of this method.Most importantly, the SumVg approach does not rely on any distributional assumptions of the effect sizes of susceptibility variants.In addition, it does not assume an equal heritability explained by each SNP, or that all SNPs contribute to the heritability (infinitesimal model).There are also no assumptions on the relationship between allele frequencies and variance explained.The method is also computationally fast.Also, since the LDSC method directly leverages LD patterns, a well-matched LD reference panel is usually required 9 .There is less reliance on LD information for SumVg as LD is mainly used for pruning.
Our method has been applied in a number of studies [for example see [10][11][12][13][14][15][16][17][18] ].However, there are no available methods to quantify the variance or precision of the heritability estimates from SumVg.If raw data is available, a standard non-parametric bootstrap by sampling individuals with replacement could be employed.However, in most cases only summary statistics are available and there are currently no methods for evaluating the standard error of the point estimate of heritability.
We summarize the contributions of this study below.In this work, we proposed five re-sampling approaches to estimate the SE of the total heritability by all SNPs in GWAS, based on summary statistics alone.Extensive simulations were performed to compare and validate the performance of different methods.Secondly, we also presented an easy-to-use R program to implement the SumVg approach with different flexible modeling options.
Thirdly, we reported heritability estimates for 12 immune-related traits (levels of cytokines and growth factors) 19 based on this approach, for which LDSC was unable to provide reasonable estimates.Such cytokines/growth factors are regulators of immune responses and inflammation, and are important intermediate phenotypes for autoimmune, inflammatory and infectious diseases 20 .As such, it is of scientific and clinical importance to unravel the genetic architecture of these traits, and estimating their heritability may be considered a useful contribution in its own right.

Estimation of the total heritability explained
Readers may refer to our previous paper 6 for details of estimation of the sum of heritability explained.In brief, we estimated the "true" z-statistics by the following correction formula: ----------------------- (1)   where z denotes the observed z-statistic and denotes the "true" z-statistic (i.e. the z-statistic one would obtain if there were no random noise; it reflects the actual effect size).Note that the method does not require prior assumptions on the underlying distributions of the true effect sizes .
We also proposed an alternative approach by evaluating the expected effect size conditioned on H1 (i.e.0) where fdr is the local false discovery rate described in Efron 21 .The resulting estimate of the true Vg is then (1 -fdr(z)) multiplied by the estimator in (2) converted to Vg scale.
The conditional estimator however is prone to large random variations as it involves local fdr estimation of each SNP.In many subsequent applications of our heritability estimation method [10][11][12] , the unconditional estimator [equation (1)] was primarily employed.We shall hence focus on the unconditional estimator in this paper, although the resampling approaches described below can readily be applied to other estimators in our previous work 6 as well.

Standard and delete-d-jackknife
In standard (delete-one) jackknife procedure 22 , we estimate the standard error (SE) by leaving out one observation at a time.The SE is defined by where n is the sample size, is the parameter estimate from the sample with the i th observation removed and In our case the parameter is the sum of heritability from all variants.
An extension is the delete-d-jackknife 23 where we leave out d observations at a time.There are in total N= possibilities of removing d out of n observations.In practice, N is usually very large.One may simply randomly repeat the procedure m times only ( ) instead of exhausting all possibilities of removing d out of n observations.The standard error is given by where denotes the parameter estimate in the v th jackknife replicate where d observations are left out.The delete-d-jackknife (when d>1) works better than the standard jackknife for non-smooth parameters like the median 23 .
There are no clear rules on the choice of d in delete-d-bootstrap.Chatterjee 24 suggested n/5 as a reasonable choice for d based on consideration of efficiency and likely model conditions.We followed the suggestion by Chattejee 24 and set d as n/5 (=20000) in all simulations.

Parametric bootstrap
In parametric bootstrap, in each replication we simulated z-statistics based on  , the 'corrected' z-statistics from original sample.We have where denotes the i th z-statistic in the b th bootstrap replicate.For small effects, the  will be shrunken towards zero.
We further proposed a modified approach by also considering the local fdr (i.e., probably of null given z) of each z-statistic.In each replicate, we simulated z-statistics according to the following scheme: Alternatively, one may employ the original z-statistics instead of the corrected z-statistics as the mean in each simulation, i.e.
The standard error is then computed from the simulated z-statistics.

Tests of resampling-based SE estimates
We compare the SE estimated from the above methods with the 'true' SE obtained from one hundred simulations with known data generating distributions.The details of the simulations follows 6 .Briefly, a gamma distribution was used to simulate three levels of variance explained (Vg =0.101, 0.191, 0.295), which were converted to true effect sizes (δ).
Two hundred replicates were run for each bootstrap or jackknife procedure.We focus on quantitative traits in our simulations, but the results should most likely apply to binary traits as well, as the only difference in these two scenarios is the formula to convert z to variance explained (Vg).

Application to immune traits
A selected set of immune-related traits (levels of cytokines/growth factors) were included for study, based on the GWAS by Ahola-Olli et al. 19 .We selected 12 continuous immune traits with (1) sample size N>5000 and (2) very low (<=3%) or negative SNP-based heritability estimated by LDSC.The LDSC heritability were based on pre-calculated values from GWASAtlas (https://atlas.ctglab.nl/).SNPs in strong LD were removed using the PLINK command "--indep-pairwise 100 25 r2" with a series of r 2 thresholds (0.1, 0.05, 0.025, 0.01, 0.005, 0.002, 0.001).The 1000G Phase3 EUR sample was used as the reference panel to calculate LD among variants.Independent SNPs with MAF>0.01 were then applied to SumVg.

Simulation results for SE estimation
Standard errors (SE) of heritability, as estimated by jackknife and bootstrap approaches, are listed in Table 1 and plotted in Figure 1.Bias, variance, and root mean square error (RMSE) of SE were calculated over 100 simulations (Table 2; Figures 1-2).
The delete-[n/5]-jackknife worked reasonably well when the total heritability explained is low (when heritability=0.101),but it tended to overestimate the SE when the total heritability is higher, especially at larger sample sizes.The bias was also positive across all simulation scenarios.The standard (delete-1) jackknife approach performed the worst among all methods, producing inflated estimates of SE.The variance and RMSE of this estimator were high compared to other approaches.The SE was in general over-estimated at all heritability levels across all sample sizes.This may be explained by the fact that sum of Vg is not a very smooth parameter, which impairs the performance of delete-1-jackknife estimators.
The other methods, including parametric bootstrap (paraboot) and the modified versions with consideration of local fdr, performed reasonably well and closely resembled the true SE.With the exception of one simulation setting, the parametric bootstrap methods achieved the lowest (absolute) bias.For variance and RMSE, parametric bootstrap also performed the best.In terms of RMSE, the parametric bootstrap approaches that also model the local fdr outperformed other methods.The RMSE of different estimators were also observed to reduce with increasing sample sizes.

Results on immune traits
Plink was applied to trim GWAS data for 12 immunological traits at various r 2 criteria to obtain roughly independent SNPs.We only included common variants with MAF>0.01 for further analysis.Then, using SumVg, "true" z-statistics of trimmed SNPs were retrieved to capture the missing heritability.The jackknife and bootstrap methods were used to compute the corresponding SE (Table 4).
The total SNP-based heritabilities predicted by SumVg for the selected traits, in contrast to the comparatively low or negative heritability estimates from LDSC, were around 10-20% based on a collection of LD-pruned SNPs.We obtained a stable (and likely conservative) estimate of heritability at r 2 ~ 0.01 or 0.005.Lower r 2 values (i.e., r 2 <0.0025 and r 2 <0.001) had limited impact on final estimates of heritability.The del-1 jackknife consistently produced the highest standard error, while the bootstrap and del-d jackknife approaches produced SE that were more comparable to one another.Out of the 12 cytokines/growth factors study, highest heritability was observed for IL-4 and IL-17 levels.

R package implementation
We also implemented the methods to compute the sum of heritability explained and the corresponding SE in an R package SumVg, available at https://github.com/lab-hcso/Estimating-SE-of-total-heritability/.

Discussions
Here we presented an approach for estimating SE of SNP-based heritability estimates using SumVg, and our applications demonstrate the usefulness of the approach.
Our main purpose is to provide an alternative approach for SNP-based heritability and SE estimation, since different approaches have different statistical modeling assumptions, or assumptions on the genetic architecture.In practice, it is almost impossible to know the true genetic architecture of a disease/trait, and as such it is very difficult to verify the correctness of heritability estimates due to lack of a 'gold standard'.It will be more reassuring if one observes similar heritability estimates from diverse methods.SumVg may provide a useful alternative reference for heritability estimates, in conjunction with existing approaches such as LDSC.SumVg may also be useful when standard approaches are unable to give reasonable results (e.g.close to zero heritability for traits that are likely heritable from previous studies, or negative estimates).
We recommended pruning of SNPs (such that SNPs are roughly in linkage equilibrium) before applying our method of heritability estimation.One approach is to employ a series of r 2 thresholds (e.g., decreasing r 2 from 0.1 to 0.001) and consider the point at which heritability became stable.Our empirical applications showed that an r 2 threshold of ~0.01 may be sufficient.The resulting SNP-based heritability may be considered a conservative estimate (due to possibility of removing some causal variants during LD-pruning).While not directly modeling LD is a limitation of this approach, the lower reliance on accurate LD information may be advantageous in some cases (e.g. when in-sample LD information is not available and only limited external reference data is present).
We have not investigated methods for SE estimation when raw genotype data are available.When raw data are available, one potential approach is to simply resample the individuals with replacement (i.e., standard non-parametric bootstrap).However, such an approach is computationally intensive and the performance over methods based on summary statistics requires further research.
The above resampling methods can potentially be sped up by splitting the job into multiple processes to be run in parallel, although this approach has not been implemented in our software yet.We have not yet fully evaluated the building of confidence interval (CI) in our study, but a natural approach is to assume normality and calculate CI in the form of .Assuming a polygenic model, the total heritability is the sum of Vg contributed by many variants of small to modest effect sizes.It is hence reasonable to assume normality by the central limit theorem (as is assumed by other heritability estimation tools).Further research may focus on developing other methods of building CIs and their comparisons.
Here we have also applied our approach to estimate the heritability of different cytokines, which play important roles in immune response and the pathogenesis of autoimmune, inflammatory and infectious diseases.Our analyses suggest that the studied cytokines are moderately heritable in general.
To summarize, SumVg is useful for triangulating evidence from different approaches to support conclusions regarding SNP-based heritability.We present novel methods of computing SE and an easy-to-use software here, which we believe will be helpful to other researchers.Our application to cytokine levels also shed light on the genetic architecture of these clinically important immune traits.paraboot, parametric bootstrap approach as described in the text; fdrboot1, a "weighted" bootstrap approach with consideration of the local fdr, using the observed z-statistic as the mean in each simulation; fdrboot2, a "weighted" bootstrap approach with consideration of the local fdr, using the corrected z-statistic as the mean in each simulation; Trait, N, LDSC(h2, se) are the same meaning as in Table 3; h2, heritability estimated by SumVg across a set of r 2 thresholds; r2, the r2 pruning threshold; n_pruned_snp, number of SNPs after LD pruning at the corresponding r 2 threshold; se_jack_1, se_jack_del_d, se_paraboot, se_fdrboot1 and se_fdrboot2 are SE estimated by different approaches as we described above; "NA" was denoted when "locfdr" failed to estimate local false discovery rate.
The estimates with r2=0.01 were highlighted, as we observed that in general the heritability estimates stabilize at r2 ~ 0.01.

Figure 1 Figure 2
Figure 1Boxplot of SE estimated by different approaches

Table 1 .
Standard error (SE) of the sum of variance explained estimated by different resampling approaches

Table 2 .
Bias, Variance and Root mean squared error (RMSE) of SE for the sum of variance explained estimated by different resampling approaches

table 2 :
Sum_Vg, true total heritability explained; N, sample size.The best performing method (for estimation of SE) in each scenario is in bold.For other abbreviations, please refer to table 1.

Table 3 .
Summary of the immune traits being studied

Table 4 .
SE of the sum of variance explained estimated by different resampling approaches for 12 immune traits