Open Access Article

Robust Linear Trend Test for Low-Coverage Next-Generation Sequence Data Controlling for Covariates

1 Department of Psychiatry, New York University School of Medicine, New York, NY 10016, USA
2 Department of Neurology, Chonnam National University Medical School, Gwangju 61469, Korea
3 Department of Applied Statistics, Chung-Ang University, Seoul 06974, Korea
* Author to whom correspondence should be addressed.
Mathematics 2020, 8(2), 217; https://doi.org/10.3390/math8020217
Received: 30 December 2019 / Revised: 4 February 2020 / Accepted: 5 February 2020 / Published: 8 February 2020
(This article belongs to the Special Issue Uncertainty Quantification Techniques in Statistics)
Low-coverage next-generation sequencing experiments, assisted by statistical methods, are popular in genetic association studies. Next-generation sequencing experiments produce genotype data that include allele read counts and read depths. At low sequencing depths, genotypes tend to be highly uncertain; therefore, uncertain genotypes are usually removed or imputed before statistical analysis. This removal or imputation may inflate the type I error rate and reduce statistical power. In this paper, we propose a mixture-based penalized score association test that adjusts for non-genetic covariates. The proposed score test statistic is based on a sandwich variance estimator, so that it is robust to misspecification of the model relating the covariates to the latent genotypes. The proposed method has the advantage of requiring neither external imputation nor elimination of uncertain genotypes. The results of our simulation study show that the type I error rates are well controlled and that the proposed association test has reasonable statistical power. As an illustration, we apply our statistic to pharmacogenomics data for drug responsiveness among 400 epilepsy patients.
Keywords: allele read counts; low-coverage; mixture model; next-generation sequencing; sandwich variance estimator
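
To make concrete why genotypes are uncertain at low read depth, the following minimal sketch computes posterior genotype probabilities from an allele read count and a read depth. It is an illustration only and is not the mixture model proposed in the paper: the binomial read model, the Hardy-Weinberg prior, the function name `genotype_posteriors`, and the default values of `alt_freq` and `error_rate` are all assumptions introduced here.

```python
from math import comb


def genotype_posteriors(alt_reads, depth, alt_freq=0.2, error_rate=0.01):
    """Posterior P(G = g | reads) for g = 0, 1, 2 copies of the alternate allele.

    Assumptions (not taken from the paper): reads are independent, each read
    shows the alternate allele with a probability that depends only on the
    genotype, and genotypes follow Hardy-Weinberg proportions.
    """
    # Probability that one read shows the alternate allele, per genotype.
    p_alt = [error_rate, 0.5, 1.0 - error_rate]

    # Hardy-Weinberg prior for 0, 1, 2 copies of the alternate allele.
    q = alt_freq
    prior = [(1 - q) ** 2, 2 * q * (1 - q), q ** 2]

    # Binomial likelihood of the observed read count under each genotype.
    likelihood = [
        comb(depth, alt_reads) * p ** alt_reads * (1 - p) ** (depth - alt_reads)
        for p in p_alt
    ]

    # Bayes' rule: normalize prior times likelihood.
    joint = [lik * pr for lik, pr in zip(likelihood, prior)]
    total = sum(joint)
    return [j / total for j in joint]


# No alternate reads observed, at a low and at a higher depth.
low_depth = genotype_posteriors(alt_reads=0, depth=2)
high_depth = genotype_posteriors(alt_reads=0, depth=20)

print("depth  2:", [round(p, 4) for p in low_depth])
print("depth 20:", [round(p, 6) for p in high_depth])
```

Under these assumed settings, two reads that both show the reference allele still leave a non-negligible posterior probability that the individual is heterozygous, whereas twenty such reads make the heterozygous genotype very unlikely. A hard genotype call at low depth therefore discards real uncertainty, which is the situation the proposed test is designed to handle without removing or imputing genotypes.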
MDPI and ACS Style

Lee, J.Y.; Kim, M.-K.; Kim, W. Robust Linear Trend Test for Low-Coverage Next-Generation Sequence Data Controlling for Covariates. Mathematics 2020, 8, 217.
