Open Access Article
Molecules 2017, 22(9), 1463

EPuL: An Enhanced Positive-Unlabeled Learning Algorithm for the Prediction of Pupylation Sites

School of Information Science and Technology, Northeast Normal University, Changchun 130117, China
School of Computing Science and Engineering, VIT University, Vellore 632014, Tamil Nadu, India
School of Computer Science and Technology, Jiangsu Normal University, Xuzhou 221116, China
Authors to whom correspondence should be addressed.
Received: 23 July 2017 / Revised: 29 August 2017 / Accepted: 30 August 2017 / Published: 5 September 2017
(This article belongs to the Special Issue Computational Analysis for Protein Structure and Interaction)


Protein pupylation is a type of post-translational modification that plays a crucial role in the cellular function of bacterial organisms. To gain better insight into the mechanisms underlying pupylation, an initial but important step is to identify pupylation sites. To date, several computational methods have been established for predicting pupylation sites; these usually construct negative samples artificially from the verified pupylation proteins to train the classifiers. However, if this step is not done properly, it can dramatically degrade the performance of the final predictor. In this work, unlike previous computational methods, we propose an enhanced positive-unlabeled learning algorithm (EPuL) for the pupylation site prediction problem, which uses only positive and unlabeled samples. First, we separate the training dataset into the positive dataset and the unlabeled dataset, which contains the remaining non-annotated lysine residues. Then, the EPuL algorithm is used to select an initial set of reliable negatives and to iteratively pick out further non-pupylation sites. Under 10-fold cross-validation, the proposed method achieved an accuracy of 90.24%, an area under the curve (AUC) of 0.93, and a Matthews correlation coefficient (MCC) of 0.81. A user-friendly web server for predicting pupylation sites was developed and made freely available.
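The two-step idea summarized in the abstract (seed a set of reliable negatives, then iteratively move further unlabeled samples into it) can be illustrated with a minimal sketch. This is not the EPuL algorithm itself, whose details are not given on this page. It assumes fixed-length numeric feature vectors, uses a nearest-centroid rule as a simple stand-in for the support vector machine named in the keywords, and the function name `pu_select_negatives` and the parameters `seed_fraction` and `max_iter` are hypothetical.

```python
import math
import random


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def pu_select_negatives(positives, unlabeled, seed_fraction=0.2, max_iter=20):
    """Generic two-step positive-unlabeled selection (illustrative only).

    Step 1: seed the reliable-negative set with the unlabeled vectors that
            lie farthest from the centroid of the positives.
    Step 2: repeatedly classify the remaining unlabeled vectors with a
            nearest-centroid rule (a stand-in for an SVM) and move those
            classified as negative into the negative set, stopping when
            no vector moves or max_iter is reached.

    Both input lists must be non-empty.
    Returns (negatives, remaining_unlabeled).
    """
    pos_c = centroid(positives)

    # Step 1: farthest-from-positive vectors become the initial negatives.
    ranked = sorted(unlabeled, key=lambda x: math.dist(x, pos_c), reverse=True)
    n_seed = max(1, int(seed_fraction * len(ranked)))
    negatives, remaining = ranked[:n_seed], ranked[n_seed:]

    # Step 2: grow the negative set until it stops changing.
    for _ in range(max_iter):
        neg_c = centroid(negatives)
        newly_negative, still_unlabeled = [], []
        for x in remaining:
            closer_to_neg = math.dist(x, neg_c) < math.dist(x, pos_c)
            (newly_negative if closer_to_neg else still_unlabeled).append(x)
        if not newly_negative:
            break
        negatives.extend(newly_negative)
        remaining = still_unlabeled
    return negatives, remaining


# Toy data (synthetic, not from the paper): labeled positives, plus an
# unlabeled pool that mixes hidden positives with true negatives.
rng = random.Random(0)
positives = [[rng.gauss(2.0, 0.5), rng.gauss(2.0, 0.5)] for _ in range(30)]
hidden_pos = [[rng.gauss(2.0, 0.5), rng.gauss(2.0, 0.5)] for _ in range(10)]
true_neg = [[rng.gauss(-2.0, 0.5), rng.gauss(-2.0, 0.5)] for _ in range(60)]
unlabeled = hidden_pos + true_neg

negatives, remaining = pu_select_negatives(positives, unlabeled)
print(len(negatives), "selected as negative;", len(remaining), "left unlabeled")
```

In a real setting the vectors would be sequence-derived features of the lysine residues, and the selected negatives together with the positives would train the final classifier. The toy clusters here are deliberately well separated so that the selection step is easy to follow.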
Keywords: positive-unlabeled learning algorithm; pupylation sites; prediction; web server; support vector machine

Figure 1

This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).




MDPI and ACS Style

Nan, X.; Bao, L.; Zhao, X.; Zhao, X.; Sangaiah, A.K.; Wang, G.-G.; Ma, Z. EPuL: An Enhanced Positive-Unlabeled Learning Algorithm for the Prediction of Pupylation Sites. Molecules 2017, 22, 1463.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Molecules, EISSN 1420-3049, published by MDPI AG, Basel, Switzerland.