Open Access Article
Molecules 2018, 23(1), 52; https://doi.org/10.3390/molecules23010052

Selecting Feature Subsets Based on SVM-RFE and the Overlapping Ratio with Applications in Bioinformatics

School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China
* Author to whom correspondence should be addressed.
Received: 10 November 2017 / Revised: 15 December 2017 / Accepted: 16 December 2017 / Published: 26 December 2017

Abstract

Feature selection is an important topic in bioinformatics. Identifying informative features in complex, high-dimensional biological data is critical in disease studies, drug development, and related areas. Support vector machine-recursive feature elimination (SVM-RFE) is an efficient feature selection technique that has proven effective in many applications. It ranks features by the order in which they are recursively eliminated by an SVM. In this study, we propose a method, SVM-RFE-OA, which combines the classification accuracy rate with the average overlapping ratio of the samples to determine how many features to select from the SVM-RFE feature ranking. In addition, to measure the feature weights more accurately, we propose a modified algorithm, M-SVM-RFE-OA, that temporarily screens out the samples lying in a heavily overlapping area in each iteration. Experiments on eight public biological datasets show that combining the classification accuracy rate with the average overlapping degree of the samples measures the discriminative ability of a feature subset more accurately than the classification accuracy rate alone, and that shielding the samples in the overlapping area makes the calculation of the feature weights more stable and accurate. The methods proposed in this study can also be used with other RFE techniques to identify potential biomarkers in big biological data.
Keywords: SVM-RFE; overlapping degree; feature selection
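
The ranking step that the abstract attributes to SVM-RFE can be illustrated with a minimal sketch. It assumes the commonly used formulation in which a linear SVM is retrained on the remaining features and the feature with the smallest squared weight is eliminated in each round. The function names `train_linear_svm` and `svm_rfe_ranking`, the toy data, and all hyperparameters are hypothetical and are not taken from the article. A simple Pegasos-style subgradient solver without a bias term stands in for a full SVM solver. The overlapping-ratio criterion of SVM-RFE-OA is not shown, because the abstract does not define it.

```python
import random


def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Fit a linear SVM (no bias term) with Pegasos-style subgradient steps.

    X is a list of feature rows, y a list of labels in {-1, +1}.
    This is a stand-in for a proper SVM solver, chosen to avoid dependencies.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    t = 0
    for _ in range(epochs):
        order = list(range(n))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # Shrink the weights (regularisation part of the subgradient).
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Hinge-loss part: only samples violating the margin contribute.
            if margin < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def svm_rfe_ranking(X, y):
    """Return feature indices ordered from most to least important.

    Each round retrains the SVM on the remaining features and eliminates
    the one with the smallest squared weight; the last survivor ranks first.
    """
    remaining = list(range(len(X[0])))
    eliminated = []
    while remaining:
        X_sub = [[row[j] for j in remaining] for row in X]
        w = train_linear_svm(X_sub, y)
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return eliminated[::-1]


# Hypothetical toy data: feature 0 separates the classes, features 1-2 are noise.
data_rng = random.Random(42)
y = [1 if i % 2 == 0 else -1 for i in range(40)]
X = [
    [3.0 * label + data_rng.uniform(-0.2, 0.2),
     data_rng.uniform(-1.0, 1.0),
     data_rng.uniform(-1.0, 1.0)]
    for label in y
]

ranking = svm_rfe_ranking(X, y)
print(ranking)
```

The ranking produced this way is only an ordering. Choosing how many top-ranked features to keep is the question that SVM-RFE-OA addresses by combining classification accuracy with the average overlapping ratio of the samples.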
This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
MDPI and ACS Style

Lin, X.; Li, C.; Zhang, Y.; Su, B.; Fan, M.; Wei, H. Selecting Feature Subsets Based on SVM-RFE and the Overlapping Ratio with Applications in Bioinformatics. Molecules 2018, 23, 52.
Molecules, EISSN 1420-3049, published by MDPI AG, Basel, Switzerland.