Open Access Article

Aligning the Aligners: Comparison of RNA Sequencing Data Alignment and Gene Expression Quantification Tools for Clinical Breast Cancer Research

1 Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL 33612, USA
2 Epigenetics & Functional Genomics Laboratory, Department of Research and Development, Bay Pines Veteran Administration Healthcare System, Bay Pines, FL 33744, USA
* Author to whom correspondence should be addressed.
J. Pers. Med. 2019, 9(2), 18; https://doi.org/10.3390/jpm9020018
Received: 27 February 2019 / Revised: 25 March 2019 / Accepted: 28 March 2019 / Published: 3 April 2019
The rapid expansion of transcriptomics and the affordability of next-generation sequencing (NGS) technologies are generating rapidly growing amounts of gene expression data across biology and medicine, including cancer research. Concomitantly, many bioinformatics tools have been developed to streamline gene expression analysis and quantification. We tested the concordance of RNA sequencing (RNA-seq) analysis outcomes between two predominant programs for read alignment, HISAT2 and STAR, and between two of the most popular programs for quantifying gene expression in NGS experiments, edgeR and DESeq2. We used RNA-seq data from a breast cancer progression series that includes histologically confirmed normal, early neoplasia, ductal carcinoma in situ, and infiltrating ductal carcinoma samples microdissected from formalin-fixed, paraffin-embedded (FFPE) breast tissue blocks. We identified significant differences in the aligners' performance: HISAT2 was prone to misaligning reads to retrogene genomic loci, whereas STAR generated more precise alignments, especially for early neoplasia samples. edgeR and DESeq2 produced similar lists of differentially expressed genes, with edgeR producing more conservative, shorter lists of genes. Gene Ontology (GO) enrichment analysis revealed no skew in the significant GO terms identified among differentially expressed genes by edgeR versus DESeq2. As transcriptomics of FFPE samples becomes a vanguard of precision medicine, the choice of bioinformatics tools becomes critical for clinical research. Our results indicate that STAR and edgeR are well-suited tools for differential gene expression analysis from FFPE samples.
Keywords: atypia; breast neoplasms; ductal carcinoma in situ (DCIS); gene expression profiling; high-throughput nucleotide sequencing; infiltrating ductal carcinoma (IDC); paraffin embedding; sequence alignment; transcriptome
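
The abstract describes testing the concordance between lists of differentially expressed genes produced by two tools. As an illustration only, one simple way to summarize the agreement of two such lists is set overlap with a Jaccard index. This is a minimal sketch, not the authors' method: the function name `concordance` and the gene identifiers are hypothetical placeholders, not data or results from the study.

```python
def concordance(list_a, list_b):
    """Summarize the overlap between two lists of gene identifiers.

    Returns the shared genes, the genes unique to each list, and the
    Jaccard index (size of the intersection divided by size of the union).
    """
    a, b = set(list_a), set(list_b)
    shared = a & b
    union = a | b
    return {
        "shared": sorted(shared),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        # Two empty lists are treated as perfectly concordant.
        "jaccard": len(shared) / len(union) if union else 1.0,
    }


# Hypothetical placeholder identifiers, NOT genes reported in the article.
edger_degs = ["GENE_A", "GENE_B", "GENE_C"]
deseq2_degs = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]

result = concordance(edger_degs, deseq2_degs)
print(result["shared"], result["only_b"], result["jaccard"])
```

In this toy case the shorter list is fully contained in the longer one, which is the pattern one would expect if one tool were simply more conservative than the other; real lists would typically also contain genes unique to each tool.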

MDPI and ACS Style

Raplee, I.D.; Evsikov, A.V.; Marín de Evsikova, C. Aligning the Aligners: Comparison of RNA Sequencing Data Alignment and Gene Expression Quantification Tools for Clinical Breast Cancer Research. J. Pers. Med. 2019, 9, 18.
