Predicting how a point mutation alters a protein’s stability can guide pharmaceutical drug design initiatives which aim to counter the effects of serious diseases. Conducting mutagenesis studies in physical proteins can give insights about the effects of amino acid substitutions, but such wet-lab work is prohibitive due to the time as well as financial resources needed to assess the effect of even a single amino acid substitution. Computational methods for predicting the effects of a mutation on a protein structure can complement wet-lab work, and varying approaches are available with promising accuracy rates. In this work we compare and assess the utility of several machine learning methods and their ability to predict the effects of single and double mutations. We in silico generate mutant protein structures, and compute several rigidity metrics for each of them. We use these as features for our Support Vector Regression (SVR), Random Forest (RF), and Deep Neural Network (DNN) methods. We validate the predictions of our in silico mutations against experimental
stability data, and attain Pearson Correlation values upwards of 0.71 for single mutations, and 0.81 for double mutations. We perform ablation studies to assess which features contribute most to a model’s success, and also introduce a voting scheme to synthesize a single prediction from the individual predictions of the three models.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited