Open Access Article
Genes 2018, 9(7), 350; https://doi.org/10.3390/genes9070350

Ensemble Consensus-Guided Unsupervised Feature Selection to Identify Huntington’s Disease-Associated Genes

College of Computer and Control Engineering, Nankai University, Tianjin 300350, China
* Author to whom correspondence should be addressed.
Received: 30 May 2018 / Revised: 6 July 2018 / Accepted: 9 July 2018 / Published: 12 July 2018

Abstract

Because the pathological mechanisms of neurodegenerative diseases are complex, traditional differentially expressed gene selection methods cannot detect disease-associated genes accurately. Recent studies have shown that consensus-guided unsupervised feature selection (CGUFS) performs well in feature selection for identifying disease-associated genes. However, the random initialization of the feature selection matrix in CGUFS makes the final disease-associated gene set unstable. In this study, we propose an ensemble method based on CGUFS, namely ensemble consensus-guided unsupervised feature selection (ECGUFS), to further improve the accuracy of disease-associated gene identification and the stability of the feature gene sets. We also propose a bagging integration strategy to integrate the CGUFS results. We conducted experiments on Huntington’s disease RNA sequencing (RNA-Seq) data and obtained a final feature gene set containing 287 disease-associated genes. Enrichment analysis of these genes shows that the postsynaptic density, postsynaptic membrane, synapse, and cell junction are all affected during disease progression. ECGUFS greatly improved the accuracy of disease-associated gene prediction and the stability of the disease-associated gene set. To assess the feature gene set, we classified the labeled samples with a linear support vector machine under 10-fold cross-validation; the average accuracy was 0.9, which suggests that the feature gene set is effective.
Keywords: ensemble consensus guided unsupervised feature selection; disease-associated genes; Huntington’s disease; RNA-Seq data
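
The abstract does not spell out how the bagging integration strategy combines the CGUFS results, so the following is only a minimal sketch of one common way to stabilize a randomly initialized selector: run it many times on bootstrap resamples and keep the features that are selected often enough. Everything named here is an assumption for illustration. `base_selector` is a hypothetical variance-ranking stand-in and is not CGUFS; `ensemble_select`, `n_runs`, and the vote threshold `min_freq` are likewise hypothetical and do not come from the article.

```python
import random
from collections import Counter


def base_selector(data, n_select, rng):
    """Hypothetical stand-in for a single CGUFS run (NOT the real algorithm).

    Ranks features by variance over the given samples and adds a small random
    perturbation to imitate the instability caused by random initialization.
    """
    n_features = len(data[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in data]
        mean = sum(column) / len(column)
        variance = sum((x - mean) ** 2 for x in column) / len(column)
        scores.append(variance + rng.gauss(0.0, 0.05))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return set(ranked[:n_select])


def ensemble_select(data, n_select, n_runs=50, min_freq=0.5, seed=0):
    """Bagging-style integration (assumed scheme): vote over repeated runs.

    Each run draws a bootstrap resample of the samples (rows), applies the
    base selector, and casts one vote per selected feature. Features whose
    vote share reaches `min_freq` form the final, more stable feature set.
    """
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_runs):
        resample = [rng.choice(data) for _ in data]  # sample rows with replacement
        votes.update(base_selector(resample, n_select, rng))
    return sorted(j for j, count in votes.items() if count / n_runs >= min_freq)


# Toy data: 20 samples x 10 features; only features 0-2 vary strongly.
toy_rng = random.Random(1)
toy = [
    [toy_rng.gauss(0.0, 3.0) if j < 3 else toy_rng.gauss(0.0, 0.1) for j in range(10)]
    for _ in range(20)
]
selected = ensemble_select(toy, n_select=3)
print(selected)
```

In this sketch the vote threshold trades set size against stability: a higher `min_freq` keeps only features that nearly every run agrees on. The selected columns could then be passed to a classifier such as a linear support vector machine under 10-fold cross-validation, as the abstract describes for the evaluation step.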
This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

MDPI and ACS Style

Guo, X.; Jiang, X.; Xu, J.; Quan, X.; Wu, M.; Zhang, H. Ensemble Consensus-Guided Unsupervised Feature Selection to Identify Huntington’s Disease-Associated Genes. Genes 2018, 9, 350.

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.
