Special Protein or RNA Molecules Computational Identification 2019

A special issue of International Journal of Molecular Sciences (ISSN 1422-0067). This special issue belongs to the section "Molecular Informatics".

Deadline for manuscript submissions: closed (28 October 2019) | Viewed by 31027

Special Issue Editor

Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
Interests: bioinformatics; parallel computing; deep learning; protein classification; genome assembly

Special Issue Information

Dear colleagues,

The discovery of new molecules remains an important and challenging task. For some special proteins and RNA molecules, detecting new members is difficult, time-consuming, and costly. These special proteins include cytokines, enzymes, cell-penetrating peptides, anticancer peptides, cancerlectins, and G protein-coupled receptors. Some noncoding RNAs, such as microRNA, snoRNA, snRNA, circular RNA, and tRNA, also need to be annotated in sequencing data. Researchers often use computer programs to shortlist candidates and then validate those candidates with molecular experiments. The choice of program is therefore a key issue: a good predictor cuts wet-lab costs, whereas software with a high false-positive rate drives up the cost of validation.

In addition to proteins, we encourage authors to pay attention to noncoding RNA molecules. Detecting microRNAs and other noncoding RNAs remains an open challenge for bioinformatics researchers. A sufficiently accurate predictor could remove the cost of Northern blot or RT-PCR validation. Work on RNA function and RNA–disease relationships is also welcome. Some network methods, including random walk and matrix factorization, have been employed in RNA–disease relationship prediction. However, they are not robust: state-of-the-art methods sometimes become invalid when the datasets are updated. I hope to see more novel and robust methods, as well as gold-standard benchmark datasets, in the new Special Issue.

Prof. Dr. Quan Zou
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. International Journal of Molecular Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. There is an Article Processing Charge (APC) for publication in this open access journal. For details about the APC please see here. Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Bioinformatics
  • Machine learning
  • Feature selection
  • Protein classification
  • PseAAC features
  • Anticancer peptides
  • Cell-penetrating peptides
  • Oncogene
  • DNA/RNA binding proteins
  • MHC binding peptide
  • Noncoding RNA
  • MicroRNA
  • RNA–disease relationship
  • Network

Published Papers (9 papers)

Research

19 pages, 928 KiB  
Article
DeepMiR2GO: Inferring Functions of Human MicroRNAs Using a Deep Multi-Label Classification Model
by Jiacheng Wang, Jingpu Zhang, Yideng Cai and Lei Deng
Int. J. Mol. Sci. 2019, 20(23), 6046; https://doi.org/10.3390/ijms20236046 - 30 Nov 2019
Cited by 5 | Viewed by 2529
Abstract
MicroRNAs (miRNAs) are a highly abundant collection of functional non-coding RNAs involved in cellular regulation and various complex human diseases. Although a large number of miRNAs have been identified, most of their physiological functions remain unknown. Computational methods play a vital role in exploring the potential functions of miRNAs. Here, we present DeepMiR2GO, a tool for integrating miRNAs, proteins and diseases, to predict the gene ontology (GO) functions based on multiple deep neuro-symbolic models. DeepMiR2GO starts by integrating the miRNA co-expression network, protein-protein interaction (PPI) network, disease phenotype similarity network, and interactions or associations among them into a global heterogeneous network. Then, it employs an efficient graph embedding strategy to learn potential network representations of the global heterogeneous network as the topological features. Finally, a deep multi-label classification network based on multiple neuro-symbolic models is built and used to annotate the GO terms of miRNAs. The predicted results demonstrate that DeepMiR2GO performs significantly better than other state-of-the-art approaches in terms of precision, recall, and maximum F-measure. Full article
(This article belongs to the Special Issue Special Protein or RNA Molecules Computational Identification 2019)

15 pages, 4746 KiB  
Article
LDAPred: A Method Based on Information Flow Propagation and a Convolutional Neural Network for the Prediction of Disease-Associated lncRNAs
by Ping Xuan, Lan Jia, Tiangang Zhang, Nan Sheng, Xiaokun Li and Jinbao Li
Int. J. Mol. Sci. 2019, 20(18), 4458; https://doi.org/10.3390/ijms20184458 - 10 Sep 2019
Cited by 25 | Viewed by 2557
Abstract
Long non-coding RNAs (lncRNAs) play a crucial role in the pathogenesis and development of complex diseases. Predicting potential lncRNA–disease associations can improve our understanding of the molecular mechanisms of human diseases and help identify biomarkers for disease diagnosis, treatment, and prevention. Previous research methods have mostly integrated the similarity and association information of lncRNAs and diseases, without considering the topological structure information among these nodes, which is important for predicting lncRNA–disease associations. We propose a method based on information flow propagation and convolutional neural networks, called LDAPred, to predict disease-related lncRNAs. LDAPred not only integrates the similarities, associations, and interactions among lncRNAs, diseases, and miRNAs, but also exploits the topological structures formed by them. In this study, we construct a dual convolutional neural network-based framework that comprises the left and right sides. The embedding layer on the left side is established by utilizing lncRNA, miRNA, and disease-related biological premises. On the right side of the frame, multiple types of similarity, association, and interaction relationships among lncRNAs, diseases, and miRNAs are calculated based on information flow propagation on the bi-layer networks, such as the lncRNA–disease network. They contain the network topological structure and they are learned by the right side of the framework. The experimental results based on five-fold cross-validation indicate that LDAPred performs better than several state-of-the-art methods. Case studies on breast cancer, colon cancer, and osteosarcoma further demonstrate LDAPred’s ability to discover potential lncRNA–disease associations. Full article

17 pages, 2041 KiB  
Article
CNNDLP: A Method Based on Convolutional Autoencoder and Convolutional Neural Network with Adjacent Edge Attention for Predicting lncRNA–Disease Associations
by Ping Xuan, Nan Sheng, Tiangang Zhang, Yong Liu and Yahong Guo
Int. J. Mol. Sci. 2019, 20(17), 4260; https://doi.org/10.3390/ijms20174260 - 30 Aug 2019
Cited by 31 | Viewed by 3264
Abstract
It is well known that the unusual expression of long non-coding RNAs (lncRNAs) is closely related to the physiological and pathological processes of diseases. Therefore, inferring potential lncRNA–disease associations is helpful for understanding the molecular pathogenesis of diseases. Most previous methods have concentrated on constructing shallow learning models to predict lncRNA–disease associations, and have failed to deeply integrate heterogeneous multi-source data or to learn low-dimensional feature representations from these data. We propose a method based on a convolutional neural network with an attention mechanism and a convolutional autoencoder for predicting candidate disease-related lncRNAs, and refer to it as CNNDLP. CNNDLP integrates multiple kinds of data from heterogeneous sources, including the associations, interactions, and similarities related to the lncRNAs, diseases, and miRNAs. Two different embedding layers are established by combining diverse biological premises about the cases in which lncRNAs are likely to associate with diseases. We construct a novel prediction model based on the convolutional neural network with attention mechanism and convolutional autoencoder to learn the attention and the low-dimensional network representations of the lncRNA–disease pairs from the embedding layers. The different adjacent edges among the lncRNA, miRNA, and disease nodes contribute differently to association prediction. Hence, an attention mechanism at the adjacent edge level is established, and the left side of the model learns the attention representation of an lncRNA–disease pair. A new type of lncRNA similarity and a new type of disease similarity are calculated by incorporating the topological structures of multiple bipartite networks. The low-dimensional network representation of the lncRNA–disease pairs is further learned by the autoencoder-based convolutional neural network on the right side of the model. The cross-validation experimental results confirm that CNNDLP has superior prediction performance compared to the state-of-the-art methods. Case studies on stomach cancer, breast cancer, and prostate cancer further show the ability of CNNDLP to discover potential disease lncRNAs. Full article

14 pages, 386 KiB  
Article
FKRR-MVSF: A Fuzzy Kernel Ridge Regression Model for Identifying DNA-Binding Proteins by Multi-View Sequence Features via Chou’s Five-Step Rule
by Yi Zou, Yijie Ding, Jijun Tang, Fei Guo and Li Peng
Int. J. Mol. Sci. 2019, 20(17), 4175; https://doi.org/10.3390/ijms20174175 - 26 Aug 2019
Cited by 24 | Viewed by 2761
Abstract
DNA-binding proteins play an important role in cell metabolism. In biological laboratories, detection methods for DNA-binding proteins include yeast one-hybrid methods, bacterial singles, X-ray crystallography, and others, but these methods involve a lot of labor, material, and time. In recent years, many computation-based approaches have been proposed to detect DNA-binding proteins. In this paper, a machine learning-based method, called the Fuzzy Kernel Ridge Regression model based on Multi-View Sequence Features (FKRR-MVSF), is proposed to identify DNA-binding proteins. First, multi-view sequence features are extracted from protein sequences. Next, a Multiple Kernel Learning (MKL) algorithm is employed to combine multiple features. Finally, a Fuzzy Kernel Ridge Regression (FKRR) model is built to detect DNA-binding proteins. Compared with other methods, our model achieves good results. Our method obtains accuracies of 83.26% and 81.72% on two benchmark datasets (PDB1075 and PDB186), respectively. Full article
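To make the pipeline in this abstract more concrete, the following minimal Python sketch builds one RBF kernel per feature view, averages the kernels, and fits an ordinary kernel ridge regression on the combined kernel. It is not the authors' implementation: the uniform kernel average is only a stand-in for their MKL step, the fuzzy membership weighting that gives FKRR its name is omitted, and the toy data and the values of `gamma` and `lam` are assumptions made for the example.

```python
import math

def rbf_kernel(X, Y, gamma):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Y."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))
             for y in Y] for x in X]

def average_kernels(kernels):
    """Uniformly average same-shaped kernel matrices (a stand-in for MKL weights)."""
    rows, cols = len(kernels[0]), len(kernels[0][0])
    return [[sum(K[i][j] for K in kernels) / len(kernels) for j in range(cols)]
            for i in range(rows)]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def krr_fit(K, y, lam):
    """Dual coefficients alpha = (K + lam * I)^-1 y."""
    n = len(y)
    A = [[K[i][j] + (lam if i == j else 0.0) for j in range(n)] for i in range(n)]
    return solve(A, y)

def krr_predict(K_test, alpha):
    """Scores for test samples, where K_test[i][j] = k(test_i, train_j)."""
    return [sum(k * a for k, a in zip(row, alpha)) for row in K_test]

# Toy data (hypothetical): two feature views of four "proteins".
view_a = [[0.0], [0.1], [1.0], [1.1]]
view_b = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
labels = [-1.0, -1.0, 1.0, 1.0]  # -1 = non-binding, +1 = DNA-binding

K_train = average_kernels([rbf_kernel(view_a, view_a, 1.0),
                           rbf_kernel(view_b, view_b, 1.0)])
alpha = krr_fit(K_train, labels, lam=0.1)

# Score one new sample that lies close to the negative examples.
K_new = average_kernels([rbf_kernel([[0.05]], view_a, 1.0),
                         rbf_kernel([[0.05, 0.0]], view_b, 1.0)])
score = krr_predict(K_new, alpha)[0]
print("binding" if score > 0 else "non-binding", round(score, 3))
```

Averaging kernels keeps the combined matrix positive semidefinite, so the regularised system stays solvable; a learned MKL weighting would replace the uniform average with per-view weights.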

12 pages, 1963 KiB  
Article
In Silico Prediction of Drug-Induced Liver Injury Based on Ensemble Classifier Method
by Yangyang Wang, Qingxin Xiao, Peng Chen and Bing Wang
Int. J. Mol. Sci. 2019, 20(17), 4106; https://doi.org/10.3390/ijms20174106 - 22 Aug 2019
Cited by 22 | Viewed by 3632
Abstract
Drug-induced liver injury (DILI) is a major factor in the development and safety of drugs. If DILI cannot be effectively predicted during development, the drug may later have to be withdrawn from the market. Therefore, predicting DILI is crucial at the early stages of drug research. This work presents a 2-class ensemble classifier model for predicting DILI, with 2D molecular descriptors and fingerprints on a dataset of 450 compounds. The purpose of our study is to investigate which key molecular fingerprints may cause DILI risk, and then to obtain a reliable ensemble model to predict DILI risk with these key factors. Experimental results suggested that 8 molecular fingerprints are critical for predicting DILI, and also yielded the best ratio of molecular fingerprints to molecular descriptors. The ensemble vote classifier obtained an accuracy of 77.25% in 5-fold cross-validation and 81.67% on the test set. This model could be used for drug-induced liver injury prediction. Full article
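The idea of an ensemble vote classifier can be sketched in a few lines of Python. The example below uses hard majority voting over the label outputs of three base classifiers; whether the paper uses hard or soft voting, and which base learners it combines, is not stated in the abstract, so the voting rule, the tie-breaking choice, and the toy predictions are all assumptions.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists by majority vote, one label per sample.

    Ties are broken toward the smallest label so the result is deterministic.
    """
    n_samples = len(predictions[0])
    combined = []
    for i in range(n_samples):
        votes = Counter(model_preds[i] for model_preds in predictions)
        top = max(votes.values())
        combined.append(min(label for label, count in votes.items() if count == top))
    return combined

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical outputs of three base classifiers on five compounds (1 = DILI risk).
base_predictions = [
    [1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 1, 1],
]
y_true = [1, 0, 1, 1, 0]

ensemble = majority_vote(base_predictions)
for preds in base_predictions:
    print("base accuracy:", accuracy(y_true, preds))
print("ensemble:", ensemble, "accuracy:", accuracy(y_true, ensemble))
```

In this toy case each base classifier errs on different compounds, so the vote corrects the individual mistakes; the benefit shrinks when the base classifiers make correlated errors.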

19 pages, 2374 KiB  
Article
Inferring the Disease-Associated miRNAs Based on Network Representation Learning and Convolutional Neural Networks
by Ping Xuan, Hao Sun, Xiao Wang, Tiangang Zhang and Shuxiang Pan
Int. J. Mol. Sci. 2019, 20(15), 3648; https://doi.org/10.3390/ijms20153648 - 25 Jul 2019
Cited by 35 | Viewed by 2885
Abstract
Identification of disease-associated miRNAs (disease miRNAs) is critical for understanding etiology and pathogenesis. Most previous methods focus on integrating similarities and association information contained in heterogeneous miRNA-disease networks. However, these methods establish only shallow prediction models that fail to capture complex relationships among miRNA similarities, disease similarities, and miRNA-disease associations. We propose a prediction method on the basis of network representation learning and convolutional neural networks to predict disease miRNAs, called CNNMDA. CNNMDA deeply integrates the similarity information of miRNAs and diseases, miRNA-disease associations, and representations of miRNAs and diseases in low-dimensional feature space. The new framework based on deep learning was built to learn the original and global representation of a miRNA-disease pair. First, diverse biological premises about miRNAs and diseases were combined to construct the embedding layer in the left part of the framework, from a biological perspective. Second, the various connection edges in the miRNA-disease network, such as similarity and association connections, were dependent on each other. Therefore, it was necessary to learn the low-dimensional representations of the miRNA and disease nodes based on the entire network. The right part of the framework learnt the low-dimensional representation of each miRNA and disease node based on non-negative matrix factorization, and these representations were used to establish the corresponding embedding layer. Finally, the left and right embedding layers went through convolutional modules to deeply learn the complex and non-linear relationships among the similarities and associations between miRNAs and diseases. Experimental results based on cross validation indicated that CNNMDA yields superior performance compared to several state-of-the-art methods. Furthermore, case studies on lung, breast, and pancreatic neoplasms demonstrated the powerful ability of CNNMDA to discover potential disease miRNAs. Full article

12 pages, 1272 KiB  
Article
An Ensemble Classifier to Predict Protein–Protein Interactions by Combining PSSM-based Evolutionary Information with Local Binary Pattern Model
by Yang Li, Li-Ping Li, Lei Wang, Chang-Qing Yu, Zheng Wang and Zhu-Hong You
Int. J. Mol. Sci. 2019, 20(14), 3511; https://doi.org/10.3390/ijms20143511 - 17 Jul 2019
Cited by 14 | Viewed by 2485
Abstract
Proteins play a critical role in the regulation of biological cell functions. Whether proteins interact with each other has become a fundamental problem, because proteins usually perform their functions by interacting with other proteins. Although a large amount of protein–protein interaction (PPI) data has been produced by high-throughput biotechnology, biological experimental techniques are time-consuming and costly. Thus, computational methods for predicting protein interactions have become a research hot spot. In this research, we propose an efficient computational method that combines a Rotation Forest (RF) classifier with the Local Binary Pattern (LBP) feature extraction method to predict PPIs from the perspective of the Position-Specific Scoring Matrix (PSSM). The proposed method achieved superior performance on the Yeast, Human, and H. pylori datasets, with average accuracies of 92.12%, 96.21%, and 86.59%, respectively. In addition, we evaluated the performance of the proposed method on four independent datasets: C. elegans, H. pylori, H. sapiens, and M. musculus. These experimental results show that our model has good feasibility and robustness in predicting PPIs. Full article
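As a rough illustration of how a Local Binary Pattern descriptor can turn a PSSM into a fixed-length feature vector, the Python sketch below applies the basic 8-neighbour LBP operator to a small matrix and returns a 256-bin histogram. The abstract does not specify the LBP variant, neighbourhood, or normalisation used in the paper, so the operator, the neighbour ordering, and the toy matrix are assumptions.

```python
# Clockwise 8-neighbourhood, starting at the top-left neighbour (an arbitrary but fixed order).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_histogram(matrix):
    """Basic 8-neighbour LBP codes of all interior cells, counted in 256 bins."""
    rows, cols = len(matrix), len(matrix[0])
    hist = [0] * 256
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = matrix[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(OFFSETS):
                # Set the bit when the neighbour is at least as large as the centre.
                if matrix[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    return hist

# Toy stand-in for a PSSM; a real one has one row per residue and 20 columns.
toy_pssm = [
    [2, -1, 0, 3, -2],
    [0, 4, -3, 1, 1],
    [-1, 2, 5, -2, 0],
    [3, 0, -1, 2, -4],
]
features = lbp_histogram(toy_pssm)
print(len(features), sum(features))
```

Because the histogram length is fixed at 256 regardless of the number of rows, proteins of different lengths map to vectors of the same size, which is what a downstream classifier needs.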

16 pages, 2847 KiB  
Article
MPLs-Pred: Predicting Membrane Protein-Ligand Binding Sites Using Hybrid Sequence-Based Features and Ligand-Specific Models
by Chang Lu, Zhe Liu, Enju Zhang, Fei He, Zhiqiang Ma and Han Wang
Int. J. Mol. Sci. 2019, 20(13), 3120; https://doi.org/10.3390/ijms20133120 - 26 Jun 2019
Cited by 14 | Viewed by 3478
Abstract
Membrane proteins (MPs) are involved in many essential biomolecule mechanisms as a pivotal factor in enabling small-molecule and signal transport between the two sides of the biological membrane; this is why a large portion of modern medicinal drugs target MPs. Therefore, accurately identifying membrane protein-ligand binding sites (MPLs) will significantly improve drug discovery. In this paper, we propose a sequence-based MPLs predictor called MPLs-Pred, where evolutionary profiles, topology structure, physicochemical properties, and primary sequence segment descriptors are combined as features applied to a random forest classifier, and an under-sampling scheme is used to enhance the classification capability with imbalanced samples. Additional ligand-specific models were taken into consideration in refining the prediction. The corresponding experimental results achieved an appreciable performance, with an overall MCC (Matthews correlation coefficient) of 0.63, and MCC values of 0.604, 0.7, and 0.692, respectively, for the three main types of ligands: drugs, metal ions, and biomacromolecules. MPLs-Pred is freely accessible at http://icdtools.nenu.edu.cn/. Full article
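For readers unfamiliar with the metric reported in this abstract, the Matthews correlation coefficient can be computed from the four confusion-matrix counts as shown in the short Python sketch below. The counts in the example are hypothetical and are not taken from the paper; they only illustrate why MCC is informative when binding residues are a small minority.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from confusion-matrix counts; returns 0.0 when any marginal total is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical imbalanced case: 10 binding residues among 1000.
tp, fn, tn, fp = 5, 5, 980, 10
acc = (tp + tn) / (tp + tn + fp + fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
print("accuracy:", round(acc, 3), "MCC:", round(mcc, 3))
```

With these counts the accuracy is dominated by the many correctly rejected non-binding residues, while the MCC stays much lower because half of the true binding residues are missed.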

14 pages, 2973 KiB  
Article
mACPpred: A Support Vector Machine-Based Meta-Predictor for Identification of Anticancer Peptides
by Vinothini Boopathi, Sathiyamoorthy Subramaniyam, Adeel Malik, Gwang Lee, Balachandran Manavalan and Deok-Chun Yang
Int. J. Mol. Sci. 2019, 20(8), 1964; https://doi.org/10.3390/ijms20081964 - 22 Apr 2019
Cited by 134 | Viewed by 6788
Abstract
Anticancer peptides (ACPs) are promising therapeutic agents for targeting and killing cancer cells. The accurate prediction of ACPs from given peptide sequences remains an open problem in the field of immunoinformatics. Recently, machine learning algorithms have emerged as a promising tool for helping experimental scientists predict ACPs. However, the performance of existing methods still needs to be improved. In this study, we present a novel approach for the accurate prediction of ACPs, which involves the following two steps: (i) We applied a two-step feature selection protocol on seven feature encodings that cover various aspects of sequence information (composition-based, physicochemical properties and profiles) and obtained their corresponding optimal feature-based models. The resultant predicted probabilities of ACPs were further utilized as feature vectors. (ii) The predicted probability feature vectors were in turn used as input to a support vector machine to develop the final prediction model, called mACPpred. Cross-validation analysis showed that the proposed predictor performs significantly better than individual feature encodings. Furthermore, mACPpred significantly outperformed the existing methods compared in this study when objectively evaluated on an independent dataset. Full article