Special Protein Molecules Computational Identification

A special issue of International Journal of Molecular Sciences (ISSN 1422-0067). This special issue belongs to the section "Biochemistry".

Deadline for manuscript submissions: closed (15 October 2017) | Viewed by 96371

Printed Edition Available!
A printed edition of this Special Issue is available here.

Special Issue Editor

Prof. Quan Zou
School of Computer Science and Technology, Tianjin University, Tianjin, China
Interests: bioinformatics; machine learning; string algorithm

Special Issue Information

Dear Colleagues,

Detecting new members of some special classes of protein molecules is time consuming and costly. These special proteins include cytokines, enzymes, cell-penetrating peptides, anticancer peptides, cancerlectins, G protein-coupled receptors, etc. Researchers often employ computer programs to list candidates and then validate those candidates with molecular experiments. The “computer program” is the key issue, because it can reduce the cost of wet experiments. Software with a high false positive rate leads to high costs in the validation process.

In this Special Issue, we will focus on these “computer program” approaches and algorithms. Some “golden features” derived from protein primary sequences have been proposed for these problems, such as Chou’s PseAAC (Pseudo Amino Acid Composition). PseAAC has been tried on nearly all kinds of protein identification, together with SVM (support vector machine, a type of classifier). However, “golden features” cannot work well on all kinds of proteins, so I would prefer that special features and classification methods be proposed for special protein molecules. I hope that submissions will focus on one type of special protein molecule, collect related data sets, achieve better prediction performance (especially low false positives), and develop user-friendly software tools or web servers.

Prof. Quan Zou
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. International Journal of Molecular Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. There is an Article Processing Charge (APC) for publication in this open access journal. For details about the APC please see here. Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • bioinformatics
  • machine learning
  • feature selection
  • protein classification
  • prediction
  • PseAAC features
  • proteomics
  • anticancer peptides
  • cell-penetrating peptides
  • oncogene
  • type III secreted proteins
  • DNA/RNA binding proteins
  • MHC binding peptide

Published Papers (19 papers)

Editorial

9 pages, 961 KiB  
Editorial
Special Protein Molecules Computational Identification
by Quan Zou and Wenying He
Int. J. Mol. Sci. 2018, 19(2), 536; https://doi.org/10.3390/ijms19020536 - 10 Feb 2018
Cited by 5 | Viewed by 3931
Abstract
Computational identification of special protein molecules is a key issue in understanding protein function. It can guide molecular experiments and help to save costs. I assessed 18 papers published in the special issue of Int. J. Mol. Sci., and also discussed the related work. The computational methods employed in this special issue focused on machine learning, network analysis, and molecular docking. New methods and new topics were also proposed. There were in addition several wet experiments, with proven results showing promise. I hope our special issue will help research on the identification of protein molecules.
(This article belongs to the Special Issue Special Protein Molecules Computational Identification)

Research

17 pages, 1487 KiB  
Article
Assessing the Performances of Protein Function Prediction Algorithms from the Perspectives of Identification Accuracy and False Discovery Rate
by Chun Yan Yu, Xiao Xu Li, Hong Yang, Ying Hong Li, Wei Wei Xue, Yu Zong Chen, Lin Tao and Feng Zhu
Int. J. Mol. Sci. 2018, 19(1), 183; https://doi.org/10.3390/ijms19010183 - 08 Jan 2018
Cited by 32 | Viewed by 5883
Abstract
The function of a protein is of great interest in the cutting-edge research of biological mechanisms, disease development and drug/target discovery. Besides experimental explorations, a variety of computational methods have been designed to predict protein function. Among these in silico methods, the prediction of BLAST is based on protein sequence similarity, while that of machine learning is also based on the sequence, but without the consideration of their similarity. This unique characteristic of machine learning makes it a good complement to BLAST and many other approaches in predicting the function of remotely relevant proteins and the homologous proteins of distinct function. However, the identification accuracies of these in silico methods and their false discovery rates have not yet been assessed, which greatly limits the usage of these algorithms. Herein, a comprehensive comparison of the performances among four popular prediction algorithms (BLAST, SVM, PNN and KNN) was conducted. In particular, the performance of these methods was systematically assessed by four standard statistical indexes based on the independent test datasets of 93 functional protein families defined by UniProtKB keywords. Moreover, the false discovery rates of these algorithms were evaluated by scanning the genomes of four representative model organisms (Homo sapiens, Arabidopsis thaliana, Saccharomyces cerevisiae and Mycobacterium tuberculosis). As a result, the substantially higher sensitivity of SVM and BLAST was observed compared with that of PNN and KNN. However, the machine learning algorithms (PNN, KNN and SVM) were found capable of substantially reducing the false discovery rate (SVM < PNN < KNN). In sum, this study comprehensively assessed the performance of four popular algorithms applied to protein function prediction, which could facilitate the selection of the most appropriate method in the related biomedical research.
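
The two quantities contrasted in this abstract, sensitivity and false discovery rate, can be illustrated with a minimal Python sketch. The formulas below are the textbook definitions based on confusion-matrix counts; the paper's own four statistical indexes and its genome-scanning procedure are not given here, so the function names and the counts are hypothetical.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true family members that the predictor recovers."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def false_discovery_rate(tp: int, fp: int) -> float:
    """Fraction of predicted family members that are not true members."""
    return fp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical counts for one protein family (illustrative only).
tp, fp, fn = 90, 30, 10
print(sensitivity(tp, fn))           # 0.9
print(false_discovery_rate(tp, fp))  # 0.25
```

A method can score well on the first number and poorly on the second, which is why the abstract reports both.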

1139 KiB  
Article
Protein Subcellular Localization with Gaussian Kernel Discriminant Analysis and Its Kernel Parameter Selection
by Shunfang Wang, Bing Nie, Kun Yue, Yu Fei, Wenjia Li and Dongshu Xu
Int. J. Mol. Sci. 2017, 18(12), 2718; https://doi.org/10.3390/ijms18122718 - 15 Dec 2017
Cited by 9 | Viewed by 2878
Abstract
Kernel discriminant analysis (KDA) is a dimension reduction and classification algorithm based on the nonlinear kernel trick, which can be used in a novel way to treat high-dimensional and complex biological data before undergoing classification processes such as protein subcellular localization. Kernel parameters make a great impact on the performance of the KDA model. Specifically, for KDA with the popular Gaussian kernel, selecting the scale parameter is still a challenging problem. Thus, this paper introduces the KDA method and proposes a new method for Gaussian kernel parameter selection depending on the fact that the differences between reconstruction errors of edge normal samples and those of interior normal samples should be maximized for certain suitable kernel parameters. Experiments with various standard data sets of protein subcellular localization show that the overall accuracy of protein classification prediction with KDA is much higher than that without KDA. Meanwhile, the kernel parameter of KDA has a great impact on the efficiency, and the proposed method can produce an optimum parameter, which makes the new algorithm not only perform as effectively as the traditional ones, but also reduce the computational time and thus improve efficiency.
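
To make the role of the Gaussian kernel's scale parameter concrete, here is a minimal Python sketch of the standard Gaussian (RBF) kernel. It does not implement the paper's selection criterion based on reconstruction errors; the name `sigma` and the sample vectors are assumptions chosen for illustration.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Standard Gaussian kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def kernel_matrix(samples, sigma):
    """Pairwise kernel values for a list of feature vectors."""
    return [[gaussian_kernel(x, y, sigma) for y in samples] for x in samples]


# Hypothetical two-dimensional samples (illustrative only).
samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
for sigma in (0.1, 1.0, 10.0):
    print(sigma, kernel_matrix(samples, sigma)[0])
```

With a very small `sigma` the off-diagonal entries approach 0 (every sample looks unique), and with a very large `sigma` they approach 1 (every sample looks alike), which is why the choice of scale matters so much.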

1487 KiB  
Article
UltraPse: A Universal and Extensible Software Platform for Representing Biological Sequences
by Pu-Feng Du, Wei Zhao, Yang-Yang Miao, Le-Yi Wei and Likun Wang
Int. J. Mol. Sci. 2017, 18(11), 2400; https://doi.org/10.3390/ijms18112400 - 14 Nov 2017
Cited by 15 | Viewed by 3809
Abstract
With the avalanche of biological sequences in public databases, one of the most challenging problems in computational biology is to predict their biological functions and cellular attributes. Most of the existing prediction algorithms can only handle fixed-length numerical vectors. Therefore, it is important to be able to represent biological sequences with various lengths using fixed-length numerical vectors. Although several algorithms, as well as software implementations, have been developed to address this problem, these existing programs can only provide a fixed number of representation modes. Every time a new sequence representation mode is developed, a new program will be needed. In this paper, we propose the UltraPse as a universal software platform for this problem. The function of the UltraPse is not only to generate various existing sequence representation modes, but also to simplify all future programming work in developing novel representation modes. The extensibility of UltraPse is particularly enhanced. It allows the users to define their own representation mode, their own physicochemical properties, or even their own types of biological sequences. Moreover, UltraPse is also the fastest software of its kind. The source code package, as well as the executables for both Linux and Windows platforms, can be downloaded from the GitHub repository.
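
As a rough illustration of what "representing sequences of various lengths with fixed-length numerical vectors" means, the following Python sketch computes plain amino acid composition, one of the simplest such representations. It is not UltraPse code and does not use its interface; the function name and example sequences are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Map a protein sequence of any length to a 20-dimensional frequency vector."""
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for residue in sequence.upper():
        if residue in counts:  # non-standard symbols are ignored in this sketch
            counts[residue] += 1
    total = sum(counts.values())
    return [counts[aa] / total if total else 0.0 for aa in AMINO_ACIDS]


# Two hypothetical sequences of different lengths give vectors of the same length.
print(len(amino_acid_composition("MKTAYIAKQR")))
print(len(amino_acid_composition("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")))
```

Richer modes such as pseudo amino acid composition extend this idea by adding sequence-order terms, but the output is still a vector whose length does not depend on the sequence.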

620 KiB  
Article
Protein-Protein Interactions Prediction Using a Novel Local Conjoint Triad Descriptor of Amino Acid Sequences
by Jun Wang, Long Zhang, Lianyin Jia, Yazhou Ren and Guoxian Yu
Int. J. Mol. Sci. 2017, 18(11), 2373; https://doi.org/10.3390/ijms18112373 - 08 Nov 2017
Cited by 36 | Viewed by 5311
Abstract
Protein-protein interactions (PPIs) play crucial roles in almost all cellular processes. Although a large number of PPIs have been verified by high-throughput techniques in the past decades, currently known PPI pairs are still far from complete. Furthermore, the wet-lab experiment based techniques for detecting PPIs are time-consuming and expensive. Hence, it is urgent and essential to develop automatic computational methods to efficiently and accurately predict PPIs. In this paper, a sequence-based approach called DNN-LCTD is developed by combining deep neural networks (DNNs) and a novel local conjoint triad description (LCTD) feature representation. LCTD incorporates the advantages of local description and conjoint triad; thus, it is capable of accounting for the interactions between residues in both continuous and discontinuous regions of amino acid sequences. DNNs can not only learn suitable features from the data by themselves, but also learn and discover hierarchical representations of data. When applied to the PPI data of Saccharomyces cerevisiae, DNN-LCTD achieves superior performance, with an accuracy of 93.12%, a precision of 93.75%, a sensitivity of 93.83%, and an area under the receiver operating characteristic curve (AUC) of 97.92%, and it needs only 718 s. These results indicate DNN-LCTD is very promising for predicting PPIs. DNN-LCTD can be a useful supplementary tool for future proteomics study.
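
The conjoint triad idea that LCTD builds on can be sketched in a few lines of Python. This is the classic whole-sequence conjoint triad count, not the paper's local variant, and the seven-class residue grouping used below is the commonly cited one, taken here as an assumption.

```python
# Commonly cited seven-class grouping of the 20 residues (an assumption here).
GROUPS = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
CLASS_OF = {aa: index for index, group in enumerate(GROUPS) for aa in group}


def conjoint_triad(sequence):
    """Count every run of three consecutive residue classes (7 * 7 * 7 = 343 bins)."""
    # Residues outside the 20 standard ones are dropped, which joins their neighbours.
    classes = [CLASS_OF[r] for r in sequence.upper() if r in CLASS_OF]
    counts = [0] * 343
    for a, b, c in zip(classes, classes[1:], classes[2:]):
        counts[a * 49 + b * 7 + c] += 1
    return counts


features = conjoint_triad("AGVILK")  # hypothetical short sequence
print(len(features), sum(features))  # 343 4
```

A local variant would, in spirit, compute such counts over several sub-regions of the sequence and concatenate them, so that residues far apart in the chain can also contribute.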

5730 KiB  
Article
Predicting Amyloidogenic Proteins in the Proteomes of Plants
by Kirill S. Antonets and Anton A. Nizhnikov
Int. J. Mol. Sci. 2017, 18(10), 2155; https://doi.org/10.3390/ijms18102155 - 16 Oct 2017
Cited by 22 | Viewed by 5380
Abstract
Amyloids are protein fibrils with characteristic spatial structure. Though amyloids were long perceived to be pathogens that cause dozens of incurable pathologies in humans and mammals, it is currently clear that amyloids also represent a functionally important form of protein structure implicated in a variety of biological processes in organisms ranging from archaea and bacteria to fungi and animals. Despite their social significance, plants remain the most poorly studied group of organisms in the field of amyloid biology. To date, amyloid properties have only been demonstrated in vitro or in heterologous systems for a small number of plant proteins. Here, for the first time, we performed a comprehensive analysis of the distribution of potentially amyloidogenic proteins in the proteomes of approximately 70 species of land plants using the Waltz and SARP (Sequence Analysis based on the Ranking of Probabilities) bioinformatic algorithms. We analyzed more than 2.9 million protein sequences and found that potentially amyloidogenic proteins are abundant in plant proteomes. We found that such proteins are overrepresented among membrane as well as DNA- and RNA-binding proteins of plants. Moreover, seed storage and defense proteins of most plant species are rich in amyloidogenic regions. Taken together, our data demonstrate the diversity of potentially amyloidogenic proteins in plant proteomes and suggest biological processes where formation of amyloids might be functionally important.

1212 KiB  
Article
Protein Complexes Prediction Method Based on Core—Attachment Structure and Functional Annotations
by Bo Li and Bo Liao
Int. J. Mol. Sci. 2017, 18(9), 1910; https://doi.org/10.3390/ijms18091910 - 06 Sep 2017
Cited by 10 | Viewed by 3496
Abstract
Recent advances in high-throughput laboratory techniques captured large-scale protein–protein interaction (PPI) data, making it possible to create a detailed map of protein interaction networks, and thus enable us to detect protein complexes from these PPI networks. However, most of the current state-of-the-art studies still have some problems, for instance, an inability to identify overlapping clusters, a failure to consider the inherent organization within protein complexes, and neglect of the biological meaning of complexes. Therefore, we present a novel overlapping protein complexes prediction method based on core–attachment structure and function annotations (CFOCM), which performs in two stages: first, it detects protein complex cores with the maximum value of our defined cluster closeness function, in which the proteins are also closely related to at least one common function. Then it appends attachment proteins to these detected cores to form the returned complexes. For performance evaluation, CFOCM and six classical methods have been used to identify protein complexes on three different yeast PPI networks, and three sets of real complexes, including the Munich Information Center for Protein Sequences (MIPS), the Saccharomyces Genome Database (SGD) and the Catalogues of Yeast protein Complexes (CYC2008), are selected as benchmark sets. The results show that CFOCM is indeed effective and robust, achieving the highest F-measure values in all tests.

716 KiB  
Article
CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks
by Min Li, Dongyan Li, Yu Tang, Fangxiang Wu and Jianxin Wang
Int. J. Mol. Sci. 2017, 18(9), 1880; https://doi.org/10.3390/ijms18091880 - 31 Aug 2017
Cited by 79 | Viewed by 10213
Abstract
Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a Cytoscape plugin integrating six clustering algorithms, namely HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), and IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), together with the BinGO (the Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster.

2145 KiB  
Article
PSFM-DBT: Identifying DNA-Binding Proteins by Combing Position Specific Frequency Matrix and Distance-Bigram Transformation
by Jun Zhang and Bin Liu
Int. J. Mol. Sci. 2017, 18(9), 1856; https://doi.org/10.3390/ijms18091856 - 25 Aug 2017
Cited by 66 | Viewed by 5034
Abstract
DNA-binding proteins play crucial roles in various biological processes, such as DNA replication and repair, transcriptional regulation and many other biological activities associated with DNA. Experimental recognition techniques for DNA-binding protein identification are both time consuming and expensive. Effective methods for identifying these proteins based only on protein sequences are highly desirable. The key for sequence-based methods is to effectively represent protein sequences. It has been reported by various previous studies that evolutionary information is crucial for DNA-binding protein identification. In this study, we employed four methods to extract the evolutionary information from Position Specific Frequency Matrix (PSFM), including Residue Probing Transformation (RPT), Evolutionary Difference Transformation (EDT), Distance-Bigram Transformation (DBT), and Trigram Transformation (TT). The PSFMs were converted into fixed length feature vectors by these four methods, and then respectively combined with Support Vector Machines (SVMs); four predictors for identifying these proteins were constructed, including PSFM-RPT, PSFM-EDT, PSFM-DBT, and PSFM-TT. Experimental results on a widely used benchmark dataset PDB1075 and an independent dataset PDB186 showed that these four methods achieved state-of-the-art performance, and PSFM-DBT outperformed other existing methods in this field. For practical applications, a user-friendly webserver of PSFM-DBT was established, which is available at http://bioinformatics.hitsz.edu.cn/PSFM-DBT/.
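
One way to picture how a position-by-residue frequency matrix of variable height becomes a fixed-length vector is the bigram-style sketch below. The exact DBT formula is not given in this abstract, so the averaging form used here (products of column frequencies at positions a fixed distance apart) is an assumption, and the tiny two-column matrix is purely illustrative; a real PSFM has 20 columns.

```python
def distance_bigram(psfm, distance):
    """Average, over positions i, of psfm[i][a] * psfm[i + distance][b] for all column pairs (a, b).

    psfm is a list of rows (one per sequence position), each row a list of
    per-residue frequencies. The output length is columns * columns,
    independent of the number of rows.
    """
    length, columns = len(psfm), len(psfm[0])
    if length <= distance:
        raise ValueError("sequence is too short for this distance")
    pairs = length - distance
    features = []
    for a in range(columns):
        for b in range(columns):
            total = sum(psfm[i][a] * psfm[i + distance][b] for i in range(pairs))
            features.append(total / pairs)
    return features


# Hypothetical 3-position, 2-symbol matrix (illustrative only).
toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
print(distance_bigram(toy, 1))  # [0.0, 0.5, 0.5, 0.0]
```

With 20 columns each distance yields 400 features, and vectors for several distances could be concatenated before being passed to a classifier such as an SVM.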

942 KiB  
Article
IonchanPred 2.0: A Tool to Predict Ion Channels and Their Types
by Ya-Wei Zhao, Zhen-Dong Su, Wuritu Yang, Hao Lin, Wei Chen and Hua Tang
Int. J. Mol. Sci. 2017, 18(9), 1838; https://doi.org/10.3390/ijms18091838 - 24 Aug 2017
Cited by 60 | Viewed by 5627
Abstract
Ion channels (IC) are ion-permeable protein pores located in the lipid membranes of all cells. Different ion channels have unique functions in different biological processes. Due to the rapid development of high-throughput mass spectrometry, proteomic data are rapidly accumulating and provide us with an opportunity to systematically investigate and predict ion channels and their types. In this paper, we constructed a support vector machine (SVM)-based model to quickly predict ion channels and their types. By considering the residue sequence information and their physicochemical properties, a novel feature extraction method which combined dipeptide composition with the physicochemical correlation between two residues was employed. A feature selection strategy was used to improve the performance of the model. Comparison results in jackknife cross-validation demonstrated that our method was superior to other methods for predicting ion channels and their types. Based on the model, we built a web server called IonchanPred which can be freely accessed from http://lin.uestc.edu.cn/server/IonchanPredv2.0.
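
Jackknife (leave-one-out) cross-validation, the evaluation protocol named in this abstract, can be sketched generically in Python. The paper uses an SVM; the nearest-neighbour rule below is only a stand-in so that the sketch needs no third-party library, and the sample data are hypothetical.

```python
def jackknife_accuracy(samples, labels, fit_predict):
    """Hold out each sample once, train on the rest, and report the fraction predicted correctly."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if fit_predict(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


def nearest_neighbour(train_x, train_y, query):
    """Stand-in classifier (not an SVM): label of the closest training sample."""
    distances = [sum((a - b) ** 2 for a, b in zip(x, query)) for x in train_x]
    return train_y[distances.index(min(distances))]


# Hypothetical one-dimensional feature vectors with two classes.
samples = [[0.0], [0.1], [1.0], [1.1]]
labels = [0, 0, 1, 1]
print(jackknife_accuracy(samples, labels, nearest_neighbour))  # 1.0
```

Because every sample is tested exactly once and never seen during its own training round, the result does not depend on a random split.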

6185 KiB  
Article
Understanding Insulin Endocrinology in Decapod Crustacea: Molecular Modelling Characterization of an Insulin-Binding Protein and Insulin-Like Peptides in the Eastern Spiny Lobster, Sagmariasus verreauxi
by Jennifer C. Chandler, Neha S. Gandhi, Ricardo L. Mancera, Greg Smith, Abigail Elizur and Tomer Ventura
Int. J. Mol. Sci. 2017, 18(9), 1832; https://doi.org/10.3390/ijms18091832 - 23 Aug 2017
Cited by 27 | Viewed by 5302
Abstract
The insulin signalling system is one of the most conserved endocrine systems of Animalia from mollusc to man. In decapod Crustacea, such as the Eastern spiny lobster, Sagmariasus verreauxi (Sv) and the red-claw crayfish, Cherax quadricarinatus (Cq), insulin endocrinology governs male sexual differentiation through the action of a male-specific, insulin-like androgenic gland peptide (IAG). To understand the bioactivity of IAG it is necessary to consider its bio-regulators such as the insulin-like growth factor binding protein (IGFBP). This work has employed various molecular modelling approaches to represent S. verreauxi IGFBP and IAG, along with additional Sv-ILP ligands, in order to characterise their binding interactions. Firstly, we present Sv- and Cq-ILP2: neuroendocrine factors that share closest homology with Drosophila ILP8 (Dilp8). We then describe the binding interaction of the N-terminal domain of Sv-IGFBP and each ILP through a synergy of computational analyses. In-depth interaction mapping and computational alanine scanning of IGFBP_N’ highlight the conserved involvement of the hotspot residues Q67, G70, D71, S72, G91, G92, T93 and D94. The significance of the negatively charged residues D71 and D94 was then further exemplified by structural electrostatics. The functional importance of the negative surface charge of IGFBP is exemplified in the complementary electropositive charge on the reciprocal binding interface of all three ILP ligands. When examined, this electrostatic complementarity is the inverse of vertebrate homologues; such physicochemical divergences elucidate towards ligand-binding specificity between Phyla.

1866 KiB  
Article
An Ameliorated Prediction of Drug–Target Interactions Based on Multi-Scale Discrete Wavelet Transform and Network Features
by Cong Shen, Yijie Ding, Jijun Tang, Xinying Xu and Fei Guo
Int. J. Mol. Sci. 2017, 18(8), 1781; https://doi.org/10.3390/ijms18081781 - 16 Aug 2017
Cited by 45 | Viewed by 4148
Abstract
The prediction of drug–target interactions (DTIs) via computational technology plays a crucial role in reducing the experimental cost. A variety of state-of-the-art methods have been proposed to improve the accuracy of DTI predictions. In this paper, we propose a drug–target interaction predictor adopting multi-scale discrete wavelet transform and network features (named DAWN) in order to solve the DTI prediction problem. We encode the drug molecule by a substructure fingerprint with a dictionary of substructure patterns. Simultaneously, we apply the discrete wavelet transform (DWT) to extract features from target sequences. Then, we concatenate and normalize the target, drug, and network features to construct feature vectors. The prediction model is obtained by feeding these feature vectors into the support vector machine (SVM) classifier. Extensive experimental results show that the prediction ability of DAWN is comparable with that of other DTI prediction schemes. The prediction areas under the precision–recall curves (AUPRs) of four datasets are 0.895 (Enzyme), 0.921 (Ion Channel), 0.786 (guanosine-binding protein coupled receptor, GPCR), and 0.603 (Nuclear Receptor), respectively.
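
A multi-scale discrete wavelet transform can be illustrated with the Haar wavelet, the simplest case. The abstract does not say which wavelet or which summary statistics DAWN uses, so the Haar filter, the (max, min, mean) summary, and the numeric input signal below are all assumptions; in practice a target sequence would first be mapped to numbers, for example through a physicochemical property scale.

```python
import math


def haar_step(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:  # pad odd-length input by repeating the last value
        signal = signal + [signal[-1]]
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    return approx, detail


def multiscale_features(signal, levels):
    """Summarise each band at each level by (max, min, mean): 6 values per level."""
    features = []
    approx = list(signal)
    for _ in range(levels):
        if len(approx) < 2:
            break  # signal too short to decompose further
        approx, detail = haar_step(approx)
        for band in (approx, detail):
            features += [max(band), min(band), sum(band) / len(band)]
    return features


# Hypothetical numeric encodings of two sequences with different lengths.
print(len(multiscale_features([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], 2)))
print(len(multiscale_features([float(v) for v in range(11)], 2)))
```

Summarising each band rather than keeping raw coefficients is what makes the feature length independent of the sequence length, provided the signal is long enough for the requested number of levels.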

1491 KiB  
Article
Prediction of Protein Hotspots from Whole Protein Sequences by a Random Projection Ensemble System
by Jinjian Jiang, Nian Wang, Peng Chen, Chunhou Zheng and Bing Wang
Int. J. Mol. Sci. 2017, 18(7), 1543; https://doi.org/10.3390/ijms18071543 - 18 Jul 2017
Cited by 16 | Viewed by 4448
Abstract
Hotspot residues are important in the determination of protein-protein interactions, and they always perform specific functions in biological processes. Hotspot residues are commonly determined by alanine scanning mutagenesis experiments, which are always costly and time consuming. To address this issue, computational methods have been developed. Most of them are structure based, i.e., using the information of solved protein structures. However, the number of solved protein structures is far smaller than that of sequences. Moreover, almost all of the predictors identified hotspots from the interfaces of protein complexes, seldom from the whole protein sequences. Therefore, determining hotspots from whole protein sequences by sequence information alone is urgent. To address the issue of hotspot predictions from the whole sequences of proteins, we proposed an ensemble system with random projections using statistical physicochemical properties of amino acids. First, an encoding scheme involving sequence profiles of residues and physicochemical properties from the AAindex1 dataset was developed. Then, the random projection technique was adopted to project the encoding instances into a reduced space. Next, several better random projections were obtained by training an IBk classifier based on the training dataset, and these were applied to the test dataset. The ensemble of random projection classifiers was thereby obtained. Experimental results showed that although the performance of our method is not good enough for real applications of hotspots, it is very promising in the determination of hotspot residues from whole sequences.
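
The random projection step and the ensemble vote can be sketched as follows. This is a generic Gaussian random projection with majority voting, not the authors' pipeline; the dimensions, seeds, and function names are hypothetical, and the per-projection classifier (IBk in the paper) is left out.

```python
import random
from collections import Counter


def random_projection_matrix(in_dim, out_dim, seed):
    """Gaussian random matrix (out_dim x in_dim), scaled by 1 / sqrt(out_dim)."""
    rng = random.Random(seed)  # a fixed seed makes each projection reproducible
    scale = 1.0 / (out_dim ** 0.5)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)] for _ in range(out_dim)]


def project(vector, matrix):
    """Map an in_dim vector into the reduced out_dim space."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def majority_vote(predictions):
    """Combine the labels predicted by the classifiers of the ensemble."""
    return Counter(predictions).most_common(1)[0][0]


# Three hypothetical projections of a 5-dimensional encoding into 2 dimensions.
encoding = [0.2, 0.5, 0.1, 0.9, 0.3]
for seed in (0, 1, 2):
    print(project(encoding, random_projection_matrix(5, 2, seed)))
print(majority_vote([1, 0, 1]))  # 1
```

In an ensemble of this kind, one classifier would be trained in each projected space, the better-performing projections would be kept, and their predictions on a test residue would be combined by the vote.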
(This article belongs to the Special Issue Special Protein Molecules Computational Identification)

731 KiB  
Article
Relationship of Triamine-Biocide Tolerance of Salmonella enterica Serovar Senftenberg to Antimicrobial Susceptibility, Serum Resistance and Outer Membrane Proteins
by Bożena Futoma-Kołoch, Bartłomiej Dudek, Katarzyna Kapczyńska, Eva Krzyżewska, Martyna Wańczyk, Kamila Korzekwa, Jacek Rybka, Elżbieta Klausa and Gabriela Bugla-Płoskońska
Int. J. Mol. Sci. 2017, 18(7), 1459; https://doi.org/10.3390/ijms18071459 - 11 Jul 2017
Cited by 7 | Viewed by 3367
Abstract
An emerging phenomenon is the association between the incorrect use of biocides for disinfection on farms and the emergence of cross-resistance in Salmonella populations. Adaptation of the microorganisms to sub-inhibitory concentrations of disinfectants is not well understood, but it may result in increased sensitivity or resistance to antibiotics, depending on the biocide used and the challenged Salmonella serovar. Exposure of five Salmonella enterica subsp. enterica serovar Senftenberg (S. Senftenberg) strains to a triamine-containing disinfectant did not result in variants with resistance to antibiotics, but it changed their susceptibility to normal human serum (NHS). Three biocide variants developed reduced sensitivity to NHS in comparison to the sensitive parental strains, while two isolates lost their resistance to serum. For S. Senftenberg, which exhibited the highest triamine tolerance (6 × MIC) and intrinsic sensitivity to 22.5% and 45% NHS, downregulation of flagellin and enolase was demonstrated, which might suggest lower adhesion and virulence of the bacteria. This is the first report demonstrating the influence of biocide tolerance on NHS resistance. In conclusion, S. Senftenberg showed the potential to adapt to conditions in which the triamine-containing biocide was present. However, the adaptation did not increase antibiotic resistance; instead, it manifested as changes in outer membrane protein patterns. Analysis of bacterial membrane proteins provides an opportunity to adjust infection treatments, especially in cases of life-threatening bacteremia caused by Salmonella species.
(This article belongs to the Special Issue Special Protein Molecules Computational Identification)

2868 KiB  
Article
Identification of Direct Activator of Adenosine Monophosphate-Activated Protein Kinase (AMPK) by Structure-Based Virtual Screening and Molecular Docking Approach
by Tonghui Huang, Jie Sun, Shanshan Zhou, Jian Gao and Yi Liu
Int. J. Mol. Sci. 2017, 18(7), 1408; https://doi.org/10.3390/ijms18071408 - 30 Jun 2017
Cited by 13 | Viewed by 4986
Abstract
Adenosine monophosphate-activated protein kinase (AMPK) plays a critical role in the regulation of energy metabolism and has been targeted in drug development for therapeutic intervention in Type II diabetes and related diseases. Recently, there has been renewed interest in the development of direct β1-selective AMPK activators to treat patients with diabetic nephropathy. To investigate the details of AMPK domain structure, sequence alignment and structural comparison were used to identify the key amino acids involved in the interaction with activators and the structural differences between the β1 and β2 subunits. Additionally, a series of potential β1-selective AMPK activators was identified by virtual screening using molecular docking. The retrieved hits were filtered on the basis of Lipinski’s rule of five and drug-likeness. Finally, 12 novel compounds with diverse scaffolds were obtained as potential starting points for the design of direct β1-selective AMPK activators.
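The Lipinski filtering step mentioned in this abstract can be illustrated with a short sketch. It assumes the four descriptors have already been computed by a cheminformatics toolkit; the `Compound` class, the example descriptor values, and the tolerance of one violation (a common convention, not stated in the abstract) are assumptions for illustration only.

```python
from dataclasses import dataclass

@dataclass
class Compound:
    """Precomputed descriptors for one screening hit (hypothetical container)."""
    name: str
    mol_weight: float   # molecular weight in Da
    log_p: float        # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors

def lipinski_violations(c: Compound) -> int:
    """Count how many of the four rule-of-five criteria are violated."""
    return sum([
        c.mol_weight > 500,
        c.log_p > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])

def passes_rule_of_five(c: Compound, max_violations: int = 1) -> bool:
    """Accept a compound with at most max_violations violations (assumed cutoff)."""
    return lipinski_violations(c) <= max_violations

# Hypothetical hits with made-up descriptor values, for illustration only.
hits = [
    Compound("hit_a", 350.4, 2.1, 2, 5),
    Compound("hit_b", 720.0, 6.3, 6, 12),
]
retained = [c.name for c in hits if passes_rule_of_five(c)]
```

Setting `max_violations=0` gives the strictest reading of the rule; which cutoff and which additional drug-likeness criteria the authors applied is not stated in the abstract.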
(This article belongs to the Special Issue Special Protein Molecules Computational Identification)

17057 KiB  
Article
3D-QSAR and Molecular Docking Studies on the TcPMCA1-Mediated Detoxification of Scopoletin and Coumarin Derivatives
by Qiu-Li Hou, Jin-Xiang Luo, Bing-Chuan Zhang, Gao-Fei Jiang, Wei Ding and Yong-Qiang Zhang
Int. J. Mol. Sci. 2017, 18(7), 1380; https://doi.org/10.3390/ijms18071380 - 27 Jun 2017
Cited by 17 | Viewed by 5353
Abstract
The carmine spider mite, Tetranychus cinnabarinus (Boisduval), is an economically important agricultural pest that is difficult to prevent and control. Scopoletin is a botanical coumarin derivative that targets Ca2+-ATPase to exert a strong acaricidal effect on carmine spider mites. In this study, the full-length cDNA sequence of a plasma membrane Ca2+-ATPase 1 gene (TcPMCA1) was cloned. The sequence contains an open reading frame of 3750 bp and encodes a putative protein of 1249 amino acids. The effects of scopoletin on TcPMCA1 expression were investigated. TcPMCA1 was significantly upregulated after exposure to 10%, 30%, and 50% of the lethal concentration of scopoletin. Homology modeling, molecular docking, and three-dimensional quantitative structure-activity relationships were then studied to explore the relationship between structure and TcPMCA1-inhibiting activity for scopoletin and 30 other coumarin derivatives. Results showed that scopoletin inserts into the binding cavity and interacts with amino acid residues at the binding site of the TcPMCA1 protein, driven by hydrogen bonds. Furthermore, CoMFA (comparative molecular field analysis)- and CoMSIA (comparative molecular similarity index analysis)-derived models showed that the steric and H-bond fields of these compounds exert important influences on the activities of the coumarin compounds. Notably, the C3, C6, and C7 positions in the skeletal structure of the coumarins are the most suitable active sites. This work provides insights into the mechanism underlying the interaction of scopoletin with TcPMCA1. The present results can improve the understanding of plasma membrane Ca2+-ATPase-mediated (PMCA-mediated) detoxification of scopoletin and coumarin derivatives in T. cinnabarinus, as well as provide valuable information for the design of novel PMCA-inhibiting acaricides.
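The two sequence lengths quoted in this abstract are consistent with each other if the 3750 bp open reading frame is counted with its stop codon, which is an assumption of this small check: 3750 / 3 = 1250 codons, and one of them is the stop codon. The helper name below is hypothetical.

```python
def protein_length_from_orf(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of orf_bp base pairs.

    Assumes the ORF length is a multiple of three; when includes_stop is True,
    the final codon is treated as a stop codon and does not add a residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons

tcpmca1_residues = protein_length_from_orf(3750)
```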
(This article belongs to the Special Issue Special Protein Molecules Computational Identification)

5790 KiB  
Article
Biochemical and Computational Insights on a Novel Acid-Resistant and Thermal-Stable Glucose 1-Dehydrogenase
by Haitao Ding, Fen Gao, Yong Yu and Bo Chen
Int. J. Mol. Sci. 2017, 18(6), 1198; https://doi.org/10.3390/ijms18061198 - 05 Jun 2017
Cited by 6 | Viewed by 4031
Abstract
Owing to its dual cofactor specificity, glucose 1-dehydrogenase (GDH) has been considered a promising alternative for coenzyme regeneration in biocatalysis. To mine for potential GDHs for practical applications, several genes encoding GDH were heterologously expressed in Escherichia coli BL21 (DE3) for primary screening. Of all the candidates, GDH from Bacillus sp. ZJ (BzGDH) was one of the most robust enzymes. BzGDH was then purified to homogeneity by immobilized metal affinity chromatography and characterized biochemically. It displayed maximum activity at 45 °C and pH 9.0, and was stable at temperatures below 50 °C. BzGDH also exhibited broad pH stability, especially in the acidic region, maintaining around 80% of its initial activity over the pH range of 4.0–8.5 after incubation for 1 hour. Molecular dynamics simulation was conducted to better understand the stability of BzGDH in its structural context. The in silico simulation shows that BzGDH is stable and maintains its overall structure against heat during the simulation at 323 K, which is consistent with the biochemical studies. In brief, the robust stability of BzGDH makes it an attractive candidate for cofactor regeneration in practical applications, especially for catalysis carried out at acidic pH and high temperature.
(This article belongs to the Special Issue Special Protein Molecules Computational Identification)

1883 KiB  
Article
Determination of Genes Related to Uveitis by Utilization of the Random Walk with Restart Algorithm on a Protein–Protein Interaction Network
by Shiheng Lu, Yan Yan, Zhen Li, Lei Chen, Jing Yang, Yuhang Zhang, Shaopeng Wang and Lin Liu
Int. J. Mol. Sci. 2017, 18(5), 1045; https://doi.org/10.3390/ijms18051045 - 13 May 2017
Cited by 19 | Viewed by 5037
Abstract
Uveitis, defined as inflammation of the uveal tract, may cause blindness in both young and middle-aged people. Approximately 10–15% of blindness in the West is caused by uveitis. Therefore, a comprehensive investigation of the disease pathogenesis is urgently needed so that effective treatments can be designed. Identification of the disease genes that cause uveitis is an important requirement to achieve this goal. In this study, a computational method was proposed to identify novel uveitis-related genes. This method was executed on a large protein–protein interaction network and employed a popular ranking algorithm, the Random Walk with Restart (RWR) algorithm. To improve the utility of the method, a permutation test and a procedure for selecting core genes were added, which helped to exclude false discoveries and select the most important candidate genes. Five-fold cross-validation was adopted to evaluate the method, yielding an average F1-measure of 0.189. In addition, we compared our method with a classic GBA-based method to further demonstrate its utility. Based on our method, 56 putative genes were chosen for further assessment. We have determined that several of these genes (e.g., CCL4, Jun, and MMP9) are likely to be important for the pathogenesis of uveitis.
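The Random Walk with Restart ranking named in this abstract can be sketched on a toy network. The update used below, p(t+1) = (1 − r) · W · p(t) + r · p(0) with W the column-normalised adjacency matrix and p(0) uniform over the seed genes, is the standard form of the algorithm; the restart probability, convergence threshold, function name, and the four-node example network are assumptions, and the permutation test and core-gene selection steps are not reproduced.

```python
def random_walk_with_restart(adj, seeds, restart=0.8, tol=1e-6, max_iter=1000):
    """Return a steady-state visiting probability for every node.

    adj   : dict mapping each node to the set of its neighbours (undirected;
            every neighbour must also be a key of adj)
    seeds : set of known disease genes where the walker restarts
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: jump back to the seed distribution.
        nxt = {n: restart * p0[n] for n in nodes}
        # Walk term: each node spreads its mass evenly over its neighbours.
        for u in nodes:
            if not adj[u]:
                continue
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p

# Hypothetical four-gene chain g1 - g2 - g3 - g4 with g1 as the only seed.
toy_network = {
    "g1": {"g2"},
    "g2": {"g1", "g3"},
    "g3": {"g2", "g4"},
    "g4": {"g3"},
}
scores = random_walk_with_restart(toy_network, seeds={"g1"})
ranking = sorted(scores, key=scores.get, reverse=True)
```

Genes that are not seeds but receive high scores are the candidates; in the toy chain the score falls with distance from the seed.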
(This article belongs to the Special Issue Special Protein Molecules Computational Identification)

736 KiB  
Article
PCVMZM: Using the Probabilistic Classification Vector Machines Model Combined with a Zernike Moments Descriptor to Predict Protein–Protein Interactions from Protein Sequences
by Yanbin Wang, Zhuhong You, Xiao Li, Xing Chen, Tonghai Jiang and Jingting Zhang
Int. J. Mol. Sci. 2017, 18(5), 1029; https://doi.org/10.3390/ijms18051029 - 11 May 2017
Cited by 57 | Viewed by 7137
Abstract
Protein–protein interactions (PPIs) are essential for most biological processes in living organisms. Thus, detecting PPIs is extremely important for understanding the molecular mechanisms of biological systems. Although a large amount of PPI data has been generated by high-throughput technologies for a variety of organisms, the whole interactome is still far from complete. In addition, the high-throughput technologies for detecting PPIs have some unavoidable drawbacks, including time consumption, high cost, and a high error rate. In recent years, with the development of machine learning, computational methods have been broadly used to predict PPIs and can achieve good prediction rates. In this paper, we present PCVMZM, a computational method based on a Probabilistic Classification Vector Machines (PCVM) model and a Zernike moments (ZM) descriptor for predicting PPIs from protein amino acid sequences. Specifically, the ZM descriptor is used to extract protein evolutionary information from the Position-Specific Scoring Matrix (PSSM) generated by the Position-Specific Iterated Basic Local Alignment Search Tool (PSI-BLAST). Then, the PCVM classifier is used to infer the interactions among proteins. When applied to PPI datasets of yeast and H. pylori, the proposed method achieves average prediction accuracies of 94.48% and 91.25%, respectively. To further evaluate the performance of the proposed method, the state-of-the-art support vector machine (SVM) classifier is used for comparison with the PCVM model. Experimental results on the yeast dataset show that the PCVM classifier performs better than the SVM classifier. The experimental results indicate that our proposed method is robust, powerful, and feasible, and that it can be used as a helpful tool for proteomics research.
(This article belongs to the Special Issue Special Protein Molecules Computational Identification)