Computational Intelligence in Structure and Function Prediction and Modeling of Proteins

A special issue of Biomolecules (ISSN 2218-273X). This special issue belongs to the section "Bioinformatics and Systems Biology".

Deadline for manuscript submissions: closed (15 September 2022) | Viewed by 19515

Special Issue Editor

Dr. Jian Zhang
School of Computer and Information Technology, Xinyang Normal University, Xinyang 464000, China
Interests: molecular biochemistry; high-throughput protein structure and function analysis; data mining and deep learning; intelligent computing

Special Issue Information

Dear Colleagues,

Exploring the functions and structures of proteins is paramount for understanding the molecular mechanisms of life. The analysis and prediction of protein functional residues contribute to research on protein function. The traditional methods used to extract protein structure and function information all involve biophysical or biochemical technologies. These technologies require expensive experimental instruments, complex experimental procedures, and considerable human resources. The field has benefited from the development of bioinformatics, which uses intelligent computing methods to accurately predict protein structure information and functional residues. Currently, only a few proteins have accurate 3D structure information. With the avalanche of hundreds of thousands of unknown proteins, computational intelligence methods have become ever more popular because they can provide informative and valuable clues for biologists.

This Special Issue of Biomolecules is dedicated to computational methods and analyses focusing on the identification, elucidation, and analysis of protein function-related factors. We welcome both original articles and surveys that cover state-of-the-art advances in this rapidly developing area. We also encourage the submission of experimental studies that are coupled with computational analysis.

We look forward to your contributions. 

Dr. Jian Zhang
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Biomolecules is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • protein tertiary structure
  • protein folding
  • protein functional sites
  • protein functions
  • intrinsic disorder
  • drug discovery
  • computational prediction
  • machine learning and deep learning
  • optimization algorithms

Published Papers (9 papers)

Research

14 pages, 2839 KiB  
Article
Mechanistic Insights into the Protection Effect of Argonaute–RNA Complex on the HCV Genome
by Haiming Zhuang, Dong Ji, Jigang Fan, Mingyu Li, Ran Tao, Kui Du, Shaoyong Lu, Zongtao Chai and Xiaohua Fan
Biomolecules 2022, 12(11), 1631; https://doi.org/10.3390/biom12111631 - 03 Nov 2022
Cited by 4 | Viewed by 1411
Abstract
While host miRNA usually plays an antiviral role, the relentless tides of viral evolution have carved out a mechanism to recruit host miRNA as a viral protector. By complementing miR-122 at the 5′ end of the genome, the hepatitis C virus (HCV) gene can form a complex with Argonaute 2 (Ago2) protein to protect the 5′ end of HCV RNA from exonucleolytic attacks. Experiments showed that the disruption of the stem-loop 1 (SL1) structure and the 9th nucleotide (T9) of HCV site 1 RNA could enhance the affinity of the Ago2 protein to the HCV site 1 RNA (target RNA). However, the underlying mechanism of how the conformation and dynamics of the Ago2:miRNA:target RNA complex are affected by the SL1 and T9 remains unclear. To address this, we performed large-scale molecular dynamics simulations on the Ago2-miRNA complex binding with the WT target, T9-abasic target and SL1-disruption target, respectively. The results revealed that the T9 and SL1 structures could induce the departing motion of the PAZ, PIWI and N domains, propping up the mouth of the central groove which accommodates the target RNA, causing the instability of the target RNA and disrupting the Ago2 binding. The coordinated motion among the PAZ, PIWI and N domains was also weakened by the T9 and SL1 structures. Moreover, we proposed a new model wherein the Ago2 protein could adopt a more constrained conformation with the proximity and more correlated motions of the PAZ, N and PIWI domains to protect the target RNA from dissociation. These findings reveal the mechanism of the Ago2-miRNA complex’s protective effect on the HCV genome at the atomic level, which will offer guidance for the design of drugs to confront the protection effect and for the engineering of Ago2 as a gene-regulation tool.

14 pages, 2259 KiB  
Article
EP-Pred: A Machine Learning Tool for Bioprospecting Promiscuous Ester Hydrolases
by Ruite Xiang, Laura Fernandez-Lopez, Ana Robles-Martín, Manuel Ferrer and Victor Guallar
Biomolecules 2022, 12(10), 1529; https://doi.org/10.3390/biom12101529 - 21 Oct 2022
Cited by 3 | Viewed by 1911
Abstract
When bioprospecting for novel industrial enzymes, substrate promiscuity is a desirable property that increases the reusability of the enzyme. Among industrial enzymes, ester hydrolases are highly relevant, and demand for them continues to increase. However, the search for new substrate-promiscuous ester hydrolases is not trivial since the mechanism behind this property is greatly influenced by the active site’s structural and physicochemical characteristics. These characteristics must be computed from the 3D structure, which is rarely available and expensive to measure, hence the need for a method that can predict promiscuity from sequence alone. Here we report such a method, EP-pred, an ensemble binary classifier that combines three machine learning algorithms: SVM, KNN, and a linear model. EP-pred has been evaluated against the Lipase Engineering Database together with a hidden Markov approach, leading to a final set of ten sequences predicted to encode promiscuous esterases. Experimental results confirmed the validity of our method since all ten proteins were found to exhibit a broad substrate ambiguity.

14 pages, 4726 KiB  
Article
Computational Dissection of the Role of Trp305 in the Regulation of the Death-Associated Protein Kinase–Calmodulin Interaction
by Yu-Ping Zhu, Xin-Yi Gao, Guo-Hui Xu, Zhao-Fu Qin, Hai-Xing Ju, De-Chuan Li and De-Ning Ma
Biomolecules 2022, 12(10), 1395; https://doi.org/10.3390/biom12101395 - 29 Sep 2022
Cited by 5 | Viewed by 1346
Abstract
Death-associated protein kinase 1 (DAPK1), a calcium/calmodulin (CaM)-regulated serine/threonine kinase, functions in apoptotic and autophagy pathways and represents an interesting drug target for inflammatory bowel disease and Alzheimer’s disease. The crystal structure of the DAPK1 catalytic domain and the autoregulatory domain (ARD) in complex with CaM provides an understanding of CaM-dependent regulation of DAPK1 activity. However, the molecular basis of how distinct Trp305 (W305Y and W305D) mutations in the ARD modulate different DAPK1 activities remains unknown. Here, we performed multiple, μs-length molecular dynamics (MD) simulations of the DAPK1–CaM complex in three different (wild-type, W305Y, and W305D) states. MD simulations showed that the overall structure of the complex did not change significantly in the wild-type and W305Y systems, but underwent obvious conformational alteration in the W305D system. Dynamical cross-correlation and principal component analyses revealed that the W305D mutation enhanced the anti-correlated motions between the DAPK1 and CaM and sampled a broader distribution of conformational space relative to the wild-type and W305Y systems. Structural and energetic analyses further showed that CaM binding was disfavored in response to the W305D mutation, resulting in the decreased binding of CaM to the W305D mutant. Furthermore, the hydrogen bonds and salt bridges responsible for the loss of CaM binding on the interface of the DAPK1–CaM complex were identified in the W305D mutant. This result may provide insights into the key role of Trp305 in the regulation of CaM-mediated DAPK1 activity.
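
The dynamical cross-correlation analysis named in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `dccm` and the toy trajectory are hypothetical, and it assumes the frames have already been aligned to a common reference. Each entry is the time-averaged dot product of two atoms' displacements from their mean positions, normalised so that +1 means fully correlated and -1 fully anti-correlated motion.

```python
import math


def dccm(trajectory):
    """Return the dynamical cross-correlation matrix of a trajectory.

    trajectory[t][i] is the (x, y, z) position of atom i in frame t.
    Frames are assumed to be pre-aligned (an assumption of this sketch).
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])

    # Mean position of every atom over all frames.
    mean = [
        [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        for i in range(n_atoms)
    ]

    # Displacement of every atom from its mean position, frame by frame.
    disp = [
        [[frame[i][k] - mean[i][k] for k in range(3)] for i in range(n_atoms)]
        for frame in trajectory
    ]

    def avg_dot(i, j):
        # Time average of the dot product of the displacements of atoms i and j.
        total = sum(sum(d[i][k] * d[j][k] for k in range(3)) for d in disp)
        return total / n_frames

    matrix = [[0.0] * n_atoms for _ in range(n_atoms)]
    for i in range(n_atoms):
        for j in range(n_atoms):
            norm = math.sqrt(avg_dot(i, i) * avg_dot(j, j))
            matrix[i][j] = avg_dot(i, j) / norm if norm > 0 else 0.0
    return matrix


# Toy trajectory: atoms 0 and 2 move together, atom 1 moves the opposite way.
toy_trajectory = [
    [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (-2.0, 0.0, 0.0)],
    [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
]
correlations = dccm(toy_trajectory)
```

In practice such matrices are usually computed over selected atoms (for example Cα atoms) with a trajectory-analysis package; the pure-Python loops here only make the definition explicit.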

16 pages, 2791 KiB  
Article
Elucidation of the Correlation between Heme Distortion and Tertiary Structure of the Heme-Binding Pocket Using a Convolutional Neural Network
by Hiroko X. Kondo, Hiroyuki Iizuka, Gen Masumoto, Yuichi Kabaya, Yusuke Kanematsu and Yu Takano
Biomolecules 2022, 12(9), 1172; https://doi.org/10.3390/biom12091172 - 24 Aug 2022
Cited by 2 | Viewed by 2207
Abstract
Heme proteins serve diverse and pivotal biological functions. Therefore, clarifying the mechanisms of these diverse functions of heme is a crucial scientific topic. Distortion of heme porphyrin is one of the key factors regulating the chemical properties of heme. Here, we constructed convolutional neural network models for predicting heme distortion from the tertiary structure of the heme-binding pocket to examine their correlation. For saddling, ruffling, doming, and waving distortions, the experimental structure and predicted values were closely correlated. Furthermore, we assessed the correlation between the cavity shape and molecular structure of heme and demonstrated that hemes in protein pockets with similar structures exhibit near-identical structures, indicating the regulation of heme distortion through the protein environment. These findings indicate that the tertiary structure of the heme-binding pocket is one of the factors regulating the distortion of heme porphyrin, thereby controlling the chemical properties of heme relevant to the protein function; this implies a structure–function correlation in heme proteins.

29 pages, 6858 KiB  
Article
Integrating Conformational Dynamics and Perturbation-Based Network Modeling for Mutational Profiling of Binding and Allostery in the SARS-CoV-2 Spike Variant Complexes with Antibodies: Balancing Local and Global Determinants of Mutational Escape Mechanisms
by Gennady Verkhivker, Steve Agajanian, Ryan Kassab and Keerthi Krishnan
Biomolecules 2022, 12(7), 964; https://doi.org/10.3390/biom12070964 - 10 Jul 2022
Viewed by 1783
Abstract
In this study, we combined all-atom MD simulations, the ensemble-based mutational scanning of protein stability and binding, and perturbation-based network profiling of allosteric interactions in the SARS-CoV-2 spike complexes with a panel of cross-reactive and ultrapotent single antibodies (B1-182.1 and A23-58.1) as well as antibody combinations (A19-61.1/B1-182.1 and A19-46.1/B1-182.1). Using this approach, we quantify the local and global effects of mutations in the complexes, identify protein stability centers, characterize binding energy hotspots, and predict the allosteric control points of long-range interactions and communications. Conformational dynamics and distance fluctuation analysis revealed the antibody-specific signatures of protein stability and flexibility of the spike complexes that can affect the pattern of mutational escape. A network-based perturbation approach for mutational profiling of allosteric residue potentials revealed how antibody binding can modulate allosteric interactions and identified allosteric control points that can form vulnerable sites for mutational escape. The results show that the protein stability and binding energetics of the SARS-CoV-2 spike complexes with the panel of ultrapotent antibodies are tolerant to the effect of Omicron mutations, which may be related to their neutralization efficiency. By employing an integrated analysis of conformational dynamics, binding energetics, and allosteric interactions, we found that the antibodies that neutralize the Omicron spike variant mediate the dominant binding energy hotspots in the conserved stability centers and allosteric control points in which mutations may be restricted by the requirements of the protein folding stability and binding to the host receptor.
This study suggested a mechanism in which the patterns of escape mutants for the ultrapotent antibodies may not be solely determined by the binding interaction changes but are associated with the balance and tradeoffs of multiple local and global factors, including protein stability, binding affinity, and long-range interactions.

10 pages, 1994 KiB  
Article
The Repeating, Modular Architecture of the HtrA Proteases
by Matthew Merski, Sandra Macedo-Ribeiro, Rafal M. Wieczorek and Maria W. Górna
Biomolecules 2022, 12(6), 793; https://doi.org/10.3390/biom12060793 - 07 Jun 2022
Cited by 1 | Viewed by 1997
Abstract
A conserved, 26-residue sequence [AA(X2)[A/G][G/L](X2)GDV[I/L](X2)[V/L]NGE(X1)V(X6)] and corresponding repeating structural module were identified within the HtrA protease family using a non-redundant set (N = 20) of publicly available structures. While the repeats themselves were far from perfect in sequence, they showed statistically significant conservation. Three or more repetitions were identified within each protein despite being statistically expected to randomly occur only once per 1031 residues. This sequence repeat was associated with a six-stranded antiparallel β-barrel module, two of which are present in the core of the structures of the PA clan of serine proteases, while a modified version of this module could be identified in the PDZ-like domains. Automated structural alignment methods had difficulties in superimposing these β-barrels, but the use of a target human HtrA2 structure showed that these modules had an average RMSD across the set of structures of less than 2 Å (mean and median). Our findings support Dayhoff’s hypothesis that complex proteins arose through duplication of simpler peptide motifs and domains.
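
The consensus quoted in this abstract can be written as a pattern for scanning sequences. The sketch below is illustrative only and is not the authors' search procedure: the names `STRICT`, `POSITIONS`, `conserved_matches`, and `scan`, and the toy sequence, are hypothetical. Because the abstract notes that real repeats are far from perfect matches, the sketch also counts how many of the 13 constrained positions agree with the consensus instead of relying on the strict pattern alone.

```python
import re

# Strict regular expression for the 26-residue consensus quoted in the abstract.
# "." stands for an unconstrained position (X).
STRICT = re.compile(r"AA.{2}[AG][GL].{2}GDV[IL].{2}[VL]NGE.V.{6}")

# Allowed residues at each of the 26 positions; None accepts any residue.
POSITIONS = [
    "A", "A", None, None, "AG", "GL", None, None, "G", "D", "V", "IL",
    None, None, "VL", "N", "G", "E", None, "V",
] + [None] * 6


def conserved_matches(window):
    """Count constrained positions of a 26-residue window that fit the consensus."""
    if len(window) != len(POSITIONS):
        raise ValueError("window must be exactly 26 residues long")
    return sum(
        1
        for residue, allowed in zip(window, POSITIONS)
        if allowed is not None and residue in allowed
    )


def scan(sequence):
    """Yield (start, score) for every 26-residue window of the sequence."""
    width = len(POSITIONS)
    for start in range(len(sequence) - width + 1):
        yield start, conserved_matches(sequence[start:start + width])


# Hypothetical toy sequence built to satisfy the consensus exactly.
toy = "AAKKAGSSGDVIQQVNGEPVTTTTTT"
best_start, best_score = max(scan("MM" + toy), key=lambda hit: hit[1])
```

A score threshold below 13 would admit imperfect repeats; the abstract does not state which threshold or statistical test was used, so none is assumed here.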

16 pages, 1699 KiB  
Article
Deep Ensemble Learning with Atrous Spatial Pyramid Networks for Protein Secondary Structure Prediction
by Yuzhi Guo, Jiaxiang Wu, Hehuan Ma, Sheng Wang and Junzhou Huang
Biomolecules 2022, 12(6), 774; https://doi.org/10.3390/biom12060774 - 02 Jun 2022
Cited by 3 | Viewed by 1754
Abstract
The secondary structure of proteins is significant for studying the three-dimensional structure and functions of proteins. Several models from image understanding and natural language modeling have been successfully adapted in the protein sequence study area, such as the Long Short-Term Memory (LSTM) network and the Convolutional Neural Network (CNN). Recently, the Gated Convolutional Neural Network (GCNN) has been proposed for natural language processing. It has achieved strong sentence-scoring performance while reducing latency. Conditionally Parameterized Convolution (CondConv) is another novel study which has gained great success in the image processing area. Compared with vanilla CNN, CondConv uses extra sample-dependent modules to conditionally adjust the convolutional network. In this paper, we propose a novel Conditionally Parameterized Convolutional network (CondGCNN) which utilizes the power of both CondConv and GCNN. CondGCNN leverages an ensemble encoder to combine the capabilities of both LSTM and CondGCNN to encode protein sequences by better capturing protein sequential features. In addition, we explore the similarity between the secondary structure prediction problem and the image segmentation problem, and propose an ASP network (Atrous Spatial Pyramid Pooling (ASPP) based network) to capture fine boundary details in secondary structure. Extensive experiments show that the proposed method can achieve higher performance on the protein secondary structure prediction task than existing methods on the CB513, CASP11, CASP12, CASP13, and CASP14 datasets. We also conducted ablation studies over each component to verify the effectiveness. Our method is expected to be useful for protein-related prediction tasks beyond protein secondary structure prediction.
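
The atrous (dilated) convolution that underlies ASPP can be shown in a few lines. This is a generic sketch of the operation on a one-dimensional signal, not the network described in the abstract: the function names, kernel, and dilation rates are hypothetical, and the abstract does not state which rates the authors used. Dilation spaces the kernel taps apart, so the same number of weights covers a wider stretch of the sequence; a pyramid applies several rates in parallel.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1D dilated cross-correlation, as used in deep learning."""
    span = (len(kernel) - 1) * dilation  # distance between first and last tap
    return [
        sum(w * signal[start + k * dilation] for k, w in enumerate(kernel))
        for start in range(len(signal) - span)
    ]


def receptive_field(kernel_size, dilation):
    """Number of input positions covered by one output value."""
    return (kernel_size - 1) * dilation + 1


# Toy per-residue feature track and a simple difference kernel.
signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]

# A small "pyramid": the same kernel applied at two hypothetical dilation rates.
pyramid = {rate: dilated_conv1d(signal, kernel, rate) for rate in (1, 2)}
```

With a three-tap kernel, rate 1 looks at 3 consecutive positions and rate 2 at a window of 5, which is how a pyramid mixes fine boundary detail with wider context.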

16 pages, 8919 KiB  
Article
How Similar Are Proteins and Origami?
by Hay Azulay, Aviv Lutaty and Nir Qvit
Biomolecules 2022, 12(5), 622; https://doi.org/10.3390/biom12050622 - 21 Apr 2022
Cited by 2 | Viewed by 3212
Abstract
Protein folding and structural biology are highly active disciplines that combine basic research in various fields, including biology, chemistry, physics, and computer science, with practical applications in biomedicine and nanotechnology. However, there are still gaps in the understanding of the detailed mechanisms of protein folding and of protein structure-function relations. In an effort to bridge these gaps, this paper studies the equivalence of proteins and origami. Research on proteins and origami provides strong evidence to support the use of origami folding principles and mechanical models to explain aspects of protein formation and function. Although not identical, the equivalence of origami and proteins emerges in: (i) the folding processes, (ii) the shape and structure of proteins and origami models, and (iii) the intrinsic mechanical properties of the folded structures/models, which allow them to synchronously fold/unfold and effectively distribute forces to the whole structure. As a result, origami can contribute to the understanding of various key protein-related mechanisms and support the design of de novo proteins and nanomaterials.

Review

22 pages, 994 KiB  
Review
Drug-Disease Association Prediction Using Heterogeneous Networks for Computational Drug Repositioning
by Yoonbee Kim, Yi-Sue Jung, Jong-Hoon Park, Seon-Jun Kim and Young-Rae Cho
Biomolecules 2022, 12(10), 1497; https://doi.org/10.3390/biom12101497 - 17 Oct 2022
Cited by 4 | Viewed by 2632
Abstract
Drug repositioning, which involves the identification of new therapeutic indications for approved drugs, considerably reduces the time and cost of developing new drugs. Recent computational drug repositioning methods use heterogeneous networks to identify drug–disease associations. This review surveys existing network-based approaches for predicting drug–disease associations in three major categories: graph mining, matrix factorization or completion, and deep learning. We selected eleven methods from the three categories to compare their predictive performances. The experiment was conducted using two uniform datasets on the drug and disease sides, separately. We constructed heterogeneous networks using drug–drug similarities based on chemical structures and ATC codes, ontology-based disease–disease similarities, and drug–disease associations. An improved evaluation metric was used to reflect data imbalance, as positive associations are typically sparse. The prediction results demonstrated that methods in the graph mining and matrix factorization or completion categories performed well in the overall assessment. Furthermore, prediction on the drug side had higher accuracy than on the disease side. Selecting and integrating informative drug features in drug–drug similarity measurement are crucial for improving disease-side prediction.
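
One common way to assemble the three ingredients listed in this abstract into a single heterogeneous network is a block adjacency matrix. The sketch below assumes that construction; the abstract does not say how the reviewed methods combine the matrices, and the function name and toy values are hypothetical. Drug–drug similarities and disease–disease similarities sit on the diagonal blocks, and the known drug–disease associations and their transpose fill the off-diagonal blocks.

```python
def build_heterogeneous_adjacency(drug_sim, disease_sim, associations):
    """Assemble the block matrix [[drug_sim, A], [A^T, disease_sim]].

    drug_sim:      n_drugs x n_drugs similarity matrix
    disease_sim:   n_diseases x n_diseases similarity matrix
    associations:  n_drugs x n_diseases 0/1 matrix of known associations (A)
    """
    n_drugs, n_diseases = len(drug_sim), len(disease_sim)

    # Top rows: each drug's similarities followed by its disease associations.
    top = [list(drug_sim[i]) + list(associations[i]) for i in range(n_drugs)]

    # Bottom rows: each disease's drug associations (A transposed)
    # followed by its similarities to the other diseases.
    bottom = [
        [associations[i][j] for i in range(n_drugs)] + list(disease_sim[j])
        for j in range(n_diseases)
    ]
    return top + bottom


# Toy data: 2 drugs, 3 diseases (values are made up for illustration).
drug_sim = [[1.0, 0.5], [0.5, 1.0]]
disease_sim = [[1.0, 0.2, 0.0], [0.2, 1.0, 0.4], [0.0, 0.4, 1.0]]
associations = [[1, 0, 0], [0, 1, 1]]

adjacency = build_heterogeneous_adjacency(drug_sim, disease_sim, associations)
```

Graph-mining methods can then walk this combined network, while matrix-factorization methods typically work on the association block directly and use the similarity blocks as side information.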
