Open Access Article

i6mA-DNCP: Computational Identification of DNA N6-Methyladenine Sites in the Rice Genome Using Optimized Dinucleotide-Based Features

by Liang Kong 1,* and Lichao Zhang 2,3
1 School of Mathematics and Information Science & Technology, Hebei Normal University of Science & Technology, Qinhuangdao 066004, China
2 School of Mathematics and Statistics, Northeastern University at Qinhuangdao, Qinhuangdao 066004, China
3 College of Sciences, Northeastern University, Shenyang 110819, China
* Author to whom correspondence should be addressed.
Genes 2019, 10(10), 828; https://doi.org/10.3390/genes10100828
Received: 27 August 2019 / Revised: 16 October 2019 / Accepted: 18 October 2019 / Published: 20 October 2019
(This article belongs to the Section Technologies and Resources for Genetics)
DNA N6-methyladenine (6mA) plays an important role in regulating gene expression in eukaryotes. Accurate identification of 6mA sites helps in understanding the genomic distribution and biological functions of 6mA. Various experimental methods have been applied to detect 6mA sites genome-wide, but they are time-consuming and expensive, so computational methods that rapidly identify 6mA sites are needed. In this paper, a new machine learning-based method, i6mA-DNCP, is proposed for identifying 6mA sites in the rice genome. Dinucleotide composition and dinucleotide-based DNA properties are first employed to represent DNA sequences. After a specially designed DNA property selection process, a bagging classifier is used to build the prediction model. In the jackknife test on a benchmark dataset, i6mA-DNCP achieved 84.43% sensitivity, 88.86% specificity, 86.65% accuracy, a Matthews correlation coefficient (MCC) of 0.734, and an area under the receiver operating characteristic curve (AUC) of 0.926. Moreover, three independent datasets were established to assess the generalization ability of the method. Extensive experiments validated the effectiveness of i6mA-DNCP.
Keywords: N6-methyladenine; dinucleotide composition; DNA properties; bagging
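As an illustration of two ideas named in the abstract, the following minimal Python sketch computes a dinucleotide composition vector for a DNA sequence and derives sensitivity, specificity, accuracy, and MCC from a confusion matrix. It is not the authors' implementation: the function names `dinucleotide_composition` and `binary_metrics`, the use of overlapping dinucleotides, and the handling of ambiguous bases are assumptions made for this example. The DNA property features, the property selection step, and the bagging classifier are not reproduced here.

```python
from itertools import product
from math import sqrt

# All 16 dinucleotides in a fixed order (AA, AC, ..., TT).
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def dinucleotide_composition(seq):
    """Return the 16 normalized frequencies of overlapping dinucleotides.

    Assumption for this sketch: pairs containing symbols other than
    A, C, G, T (for example N) are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DINUCLEOTIDES)
    return [counts[d] / total for d in DINUCLEOTIDES]


def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy, and MCC from a confusion matrix.

    Assumes at least one positive and one negative sample.
    """
    sn = tp / (tp + fn)                      # sensitivity (recall on 6mA sites)
    sp = tn / (tn + fp)                      # specificity (recall on non-6mA sites)
    acc = (tp + tn) / (tp + fn + tn + fp)    # overall accuracy
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sn, sp, acc, mcc


if __name__ == "__main__":
    features = dinucleotide_composition("ACGTAC")  # hypothetical toy sequence
    print(dict(zip(DINUCLEOTIDES, features)))
    print(binary_metrics(tp=8, fn=2, tn=9, fp=1))  # hypothetical counts
```

The 16-element vector returned by `dinucleotide_composition` could serve as one part of a feature representation. With equal numbers of positive and negative samples (an assumption not stated in the abstract), accuracy is the mean of sensitivity and specificity, which is consistent with the reported 84.43%, 88.86%, and 86.65%.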
MDPI and ACS Style

Kong, L.; Zhang, L. i6mA-DNCP: Computational Identification of DNA N6-Methyladenine Sites in the Rice Genome Using Optimized Dinucleotide-Based Features. Genes 2019, 10, 828.

Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.
