Open Access Article
Int. J. Mol. Sci. 2017, 18(9), 1856

PSFM-DBT: Identifying DNA-Binding Proteins by Combing Position Specific Frequency Matrix and Distance-Bigram Transformation

School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen 518055, China
Author to whom correspondence should be addressed.
Received: 28 July 2017 / Revised: 19 August 2017 / Accepted: 22 August 2017 / Published: 25 August 2017
(This article belongs to the Special Issue Special Protein Molecules Computational Identification)


DNA-binding proteins play crucial roles in various biological processes, such as DNA replication and repair, transcriptional regulation, and many other biological activities associated with DNA. Experimental techniques for identifying DNA-binding proteins are both time-consuming and expensive, so effective methods that identify these proteins from protein sequences alone are in high demand. The key to sequence-based methods is an effective representation of protein sequences. Previous studies have reported that evolutionary information is crucial for DNA-binding protein identification. In this study, we employed four methods to extract evolutionary information from the Position Specific Frequency Matrix (PSFM): Residue Probing Transformation (RPT), Evolutionary Difference Transformation (EDT), Distance-Bigram Transformation (DBT), and Trigram Transformation (TT). Each method converts a PSFM into a fixed-length feature vector, which is then fed to a Support Vector Machine (SVM), yielding four predictors: PSFM-RPT, PSFM-EDT, PSFM-DBT, and PSFM-TT. Experimental results on the widely used benchmark dataset PDB1075 and the independent dataset PDB186 showed that these four methods achieved state-of-the-art performance, and that PSFM-DBT outperformed other existing methods in this field. For practical applications, a user-friendly webserver of PSFM-DBT was established.
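To illustrate how a matrix whose row count varies with sequence length can be turned into a fixed-length feature vector, the following is a minimal sketch of a distance-bigram style transformation. The abstract does not state the formula, so the averaging of products at a given residue distance is an assumption, as are the function name `distance_bigram_features` and the parameter `max_distance`. The SVM classification step is omitted.

```python
from typing import List

# One row per sequence position, one column per amino acid type.
Matrix = List[List[float]]


def distance_bigram_features(psfm: Matrix, max_distance: int = 3) -> List[float]:
    """Map an L x A frequency matrix to A * A * max_distance values.

    Assumed formulation (not given in the abstract): for each distance d
    and each ordered column pair (i, j), average psfm[k][i] * psfm[k + d][j]
    over all valid positions k.
    """
    length = len(psfm)
    n_cols = len(psfm[0]) if psfm else 0
    features: List[float] = []
    for d in range(1, max_distance + 1):
        n_pairs = length - d  # number of position pairs separated by d
        for i in range(n_cols):
            for j in range(n_cols):
                if n_pairs <= 0:
                    # Sequence too short for this distance: emit a zero
                    # so the vector length stays fixed.
                    features.append(0.0)
                    continue
                total = sum(psfm[k][i] * psfm[k + d][j] for k in range(n_pairs))
                features.append(total / n_pairs)
    return features


# Toy example with 2 columns instead of 20, for readability.
toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
vec = distance_bigram_features(toy, max_distance=2)
print(len(vec))  # 2 * 2 * 2 = 8, independent of the number of rows
```

With a real 20-column matrix the vector has 400 entries per distance, so sequences of any length map to vectors of the same size, which is what a standard SVM requires as input.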
Keywords: PSFM-DBT; DNA binding protein; distance bigram transformation; PSFM


This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).




MDPI and ACS Style

Zhang, J.; Liu, B. PSFM-DBT: Identifying DNA-Binding Proteins by Combing Position Specific Frequency Matrix and Distance-Bigram Transformation. Int. J. Mol. Sci. 2017, 18, 1856.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Int. J. Mol. Sci. EISSN 1422-0067. Published by MDPI AG, Basel, Switzerland.