Article

scEpiLock: A Weakly Supervised Learning Framework for cis-Regulatory Element Localization and Variant Impact Quantification for Single-Cell Epigenetic Data

1 Center for Complex Biological Systems, University of California, Irvine, CA 92697, USA
2 Department of Biological Chemistry, School of Medicine, University of California, Irvine, CA 92697, USA
3 Department of Computer Science, University of California, Irvine, CA 92697, USA
* Authors to whom correspondence should be addressed.
Academic Editors: Gang Hu and Kui Wang
Biomolecules 2022, 12(7), 874; https://doi.org/10.3390/biom12070874
Received: 6 May 2022 / Revised: 16 June 2022 / Accepted: 16 June 2022 / Published: 23 June 2022
Recent advances in the single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) allow cellular heterogeneity to be dissected and regulatory landscapes to be reconstructed at unprecedented resolution. However, compared with bulk sequencing, its very high rate of missing data sharply reduces the usable reads for each cell type, which results in broader, fuzzier peak boundaries and limits our ability to pinpoint functional regions and interpret variant impacts precisely. We propose scEpiLock, a weakly supervised learning method that directly identifies core functional regions from coarse peak labels and quantifies variant impacts in a cell-type-specific manner. First, scEpiLock uses a deep convolutional neural network as a multi-label classifier to predict chromatin accessibility. Then, its weakly supervised object detection module refines the peak boundaries using gradient-weighted class activation mapping (Grad-CAM). Finally, scEpiLock provides cell-type-specific variant impacts within a given peak region. We applied scEpiLock to various scATAC-seq datasets and found that it achieves an area under the receiver operating characteristic curve (AUC) of ~0.9 and an area under the precision-recall curve (AUPR) above 0.7. Moreover, scEpiLock's object detection condenses coarse peaks to only ⅓ of their original size while still reporting higher conservation scores. In addition, we applied scEpiLock to brain scATAC-seq data and reported several genome-wide association study (GWAS) variants that disrupt regulatory elements around known risk genes for Alzheimer's disease, demonstrating its potential to provide cell-type-specific biological insights in disease studies.
Keywords: scATAC-seq; cis-regulatory localization; deep learning; brain disorder
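
The boundary-refinement idea in the abstract (condensing a coarse peak to a core region using a per-position importance profile such as a Grad-CAM map) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `refine_peak`, the relative threshold of 0.5, and the toy scores are all assumptions made for illustration, and the rule used here (keep the contiguous run around the highest-scoring position that stays above a fraction of the maximum) is only one plausible way to turn an activation profile into an interval.

```python
from typing import List, Tuple


def refine_peak(scores: List[float], peak_start: int,
                rel_threshold: float = 0.5) -> Tuple[int, int]:
    """Shrink a coarse peak to a core region (hypothetical helper).

    scores        -- one non-negative importance value per base in the
                     coarse peak (e.g. a Grad-CAM-style profile)
    peak_start    -- genomic coordinate of the first base of the peak
    rel_threshold -- assumed cutoff, as a fraction of the maximum score

    Returns a half-open interval (start, end) in genomic coordinates.
    """
    if not scores:
        raise ValueError("scores must be non-empty")

    top = max(scores)
    summit = scores.index(top)      # first position with the highest score
    cutoff = rel_threshold * top

    # Extend left while the neighbouring base stays at or above the cutoff.
    left = summit
    while left > 0 and scores[left - 1] >= cutoff:
        left -= 1

    # Extend right in the same way.
    right = summit
    while right < len(scores) - 1 and scores[right + 1] >= cutoff:
        right += 1

    return peak_start + left, peak_start + right + 1


# Toy profile for a 9-bp "coarse peak" starting at coordinate 1000.
toy_scores = [0.0, 0.1, 0.2, 0.9, 1.0, 0.7, 0.2, 0.1, 0.0]
core = refine_peak(toy_scores, peak_start=1000)
print(core)
```

With the toy profile, only the three central positions clear the cutoff, so the returned interval is much narrower than the input peak. A real profile would come from a trained model, and the threshold would need to be tuned against held-out data.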
MDPI and ACS Style

Gong, Y.; Srinivasan, S.S.; Zhang, R.; Kessenbrock, K.; Zhang, J. scEpiLock: A Weakly Supervised Learning Framework for cis-Regulatory Element Localization and Variant Impact Quantification for Single-Cell Epigenetic Data. Biomolecules 2022, 12, 874. https://doi.org/10.3390/biom12070874

Chicago/Turabian Style

Gong, Yanwen, Shushrruth S. Srinivasan, Ruiyi Zhang, Kai Kessenbrock, and Jing Zhang. 2022. "scEpiLock: A Weakly Supervised Learning Framework for cis-Regulatory Element Localization and Variant Impact Quantification for Single-Cell Epigenetic Data" Biomolecules 12, no. 7: 874. https://doi.org/10.3390/biom12070874
