Open Access Article
Int. J. Mol. Sci. 2019, 20(2), 296; https://doi.org/10.3390/ijms20020296

High-Throughput Omics and Statistical Learning Integration for the Discovery and Validation of Novel Diagnostic Signatures in Colorectal Cancer

1 College of Pharmacy and Research Institute of Pharmaceutical Sciences, Seoul National University, Seoul 08826, Korea
2 Department of Statistics, Seoul National University, Seoul 08826, Korea
3 School of Medicine, Vietnam National University, Ho Chi Minh 70000, Vietnam
* Author to whom correspondence should be addressed.
Received: 30 November 2018 / Revised: 31 December 2018 / Accepted: 4 January 2019 / Published: 12 January 2019
(This article belongs to the Special Issue Basic and Translational Research in Colorectal Cancer)

Abstract

Advances in bioinformatics and machine learning have facilitated the discovery and validation of omics-based biomarkers. This study combined multi-platform transcriptomics with cutting-edge algorithms to introduce novel signatures for the accurate diagnosis of colorectal cancer (CRC). Several random forest (RF)-based feature selection methods, including area under the curve (AUC)-RF, Boruta, and Vita, were used, and the diagnostic performance of the proposed biosignatures was benchmarked using RF, logistic regression, naïve Bayes, and k-nearest neighbors models. All models performed satisfactorily, with RF appearing to perform best. For the RF model, the following were observed: mean accuracy 0.998 (standard deviation (SD) < 0.003), mean specificity 0.999 (SD < 0.003), and mean sensitivity 0.998 (SD < 0.004). Moreover, the proposed biomarker signatures were highly associated with multifaceted hallmarks of cancer. Some biomarkers were enriched in epithelial cell signaling in Helicobacter pylori infection and in inflammatory processes. Overexpression of TGFBI and S100A2 was associated with poor disease-free survival, while down-regulation of NR5A2, SLC4A4, and CD177 was linked to worse overall survival of the patients. In conclusion, novel transcriptome signatures that may improve diagnostic accuracy in CRC are introduced for further validation in various clinical settings.
Keywords: colorectal cancer; transcriptomics; diagnosis; biomarker; machine learning; variable selection
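
The abstract reports mean accuracy, specificity, and sensitivity with standard deviations. As a minimal sketch of how such summary figures are commonly derived, the Python below computes the three metrics from binary predictions and summarizes them over repeated evaluation runs. It is not the authors' pipeline: the function names (`confusion_counts`, `diagnostic_metrics`, `summarize_runs`) are hypothetical, the labels are made-up toy values rather than study data, and the encoding of tumor samples as 1 and normal samples as 0 is an assumption.

```python
from statistics import mean, stdev


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for binary labels."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth != positive and pred != positive:
            tn += 1
        elif truth != positive and pred == positive:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def diagnostic_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity (true positive rate), specificity (true negative rate)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred, positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def summarize_runs(runs):
    """Mean and sample SD of each metric over repeated runs.

    `runs` is a list of (y_true, y_pred) pairs, one per repeat
    (for example, one per resampling split); at least two are needed.
    """
    per_run = [diagnostic_metrics(t, p) for t, p in runs]
    summary = {}
    for name in ("accuracy", "sensitivity", "specificity"):
        values = [m[name] for m in per_run]
        summary[name] = (mean(values), stdev(values))
    return summary


# Toy example (assumed encoding: 1 = tumor, 0 = normal); not study data.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
toy_runs = [
    (truth, [1, 1, 1, 0, 0, 0, 0, 0]),  # one missed tumor sample
    (truth, [1, 1, 1, 1, 0, 0, 0, 1]),  # one false alarm
]
toy_summary = summarize_runs(toy_runs)
for metric, (avg, sd) in toy_summary.items():
    print(f"{metric}: mean={avg:.3f}, SD={sd:.3f}")
```

Reporting the SD alongside the mean, as the abstract does, indicates how stable each metric is across repeats and not only how high it is on average.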

This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).


MDPI and ACS Style

Long, N.P.; Park, S.; Anh, N.H.; Nghi, T.D.; Yoon, S.J.; Park, J.H.; Lim, J.; Kwon, S.W. High-Throughput Omics and Statistical Learning Integration for the Discovery and Validation of Novel Diagnostic Signatures in Colorectal Cancer. Int. J. Mol. Sci. 2019, 20, 296.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Int. J. Mol. Sci. EISSN 1422-0067. Published by MDPI AG, Basel, Switzerland.