Article

Harnessing Fc/FcRn Affinity Data from Patents with Different Machine Learning Methods

by Christophe Dumet 1,2, Martine Pugnière 3, Corinne Henriquet 3, Valérie Gouilleux-Gruart 1,4, Anne Poupon 2,5,6 and Hervé Watier 1,4,*
1 EA7501, Université de Tours, 37041 Tours, France
2 MAbSilico, 1 Impasse du Palais, 37000 Tours, France
3 Institut de Recherche en Cancérologie de Montpellier, Université de Montpellier, 34090 Montpellier, France
4 Laboratoire d’Immunologie, Centre Hospitalier Universitaire, 37044 Tours, France
5 Physiologie de la Reproduction et des Comportements, INRAE UMR-0085, CNRS UMR-7247, Université de Tours, 37380 Nouzilly, France
6 Musca, Inria Saclay-Île-de-France, 91120 Palaiseau, France
* Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2023, 24(6), 5724; https://doi.org/10.3390/ijms24065724
Submission received: 7 February 2023 / Revised: 7 March 2023 / Accepted: 11 March 2023 / Published: 16 March 2023

Abstract

Monoclonal antibodies are biopharmaceuticals with a very long half-life due to the binding of their Fc portion to the neonatal receptor (FcRn), a pharmacokinetic property that can be further improved through engineering of the Fc portion, as demonstrated by the approval of several new drugs. Many Fc variants with increased binding to FcRn have been found using different methods, such as structure-guided design, random mutagenesis, or a combination of both, and are described in the literature as well as in patents. Our hypothesis is that this material could be subjected to a machine learning approach in order to generate new variants with similar properties. We therefore compiled 1323 Fc variants affecting the affinity for FcRn, which were disclosed in twenty patents. These data were used to train several algorithms, with two different models, in order to predict the affinity for FcRn of new randomly generated Fc variants. To determine which algorithm was the most robust, we first assessed the correlation between measured and predicted affinity in a 10-fold cross-validation test. We then generated variants by in silico random mutagenesis and compared the prediction made by the different algorithms. As a final validation, we produced variants, not described in any patents, and compared the predicted affinity with the experimental binding affinities measured by surface plasmon resonance (SPR). The best mean absolute error (MAE) between predicted and experimental values was obtained with a support vector regressor (SVR) using six features and trained on 1251 examples. With this setting, the error on the log(KD) was less than 0.17. The obtained results show that such an approach could be used to find new variants with better half-life properties that are different from those already extensively used in therapeutic antibody development.

1. Introduction

The wide therapeutic success of monoclonal antibodies (mAbs) in numerous indications is mainly due to their high target specificity and their long half-life, ranging from 3 days to more than 30 days for non-engineered mAbs. Further enhancing the half-life of therapeutic antibodies allows a decrease in the frequency of administration and increases their efficacy [1,2,3]. Antibody half-life depends on many factors, such as the target, target-mediated drug disposition [4], heavy-chain allotype [5,6], and presence of anti-drug antibodies. However, the predominant mechanism determining the half-life is the binding of the IgG Fc portion to FcRn, which protects IgG from catabolism. This binding is pH-dependent due to the presence of histidine residues in the Fc portion and glutamic acid residues in FcRn. The high-affinity complex is formed in endosomal compartments at low pH (pH 6) but not extracellularly at physiological pH (pH 7.4). In order to harness this mechanism, many companies have tested Fc mutations that improve the binding to FcRn at acidic pH only, which improves the endosomal recycling efficiency and enhances the pharmacokinetics of the antibody. For example, Medimmune and Xencor have patented the M252Y/S254T/T256E and M428L/N434S mutations, respectively [1,7,8]. Finding useful mutations is not trivial, since increasing binding at acidic pH often results in a simultaneous increase in affinity at neutral pH, which mitigates the desired effect [9]. Such mutations can even worsen the pharmacokinetic properties [7,10] because of reduced antibody release from FcRn back to the plasma. In contrast, some companies deliberately enhance the binding to FcRn at neutral pH in order to flush out antigens more rapidly [9,11].
To find the right mutants, alanine scanning combined with rational design was initially the most commonly used technique [12], leading to the identification of amino acids that are essential for the binding of Fc to FcRn. For example, mutation of the isoleucine at position 253 [12] or histidine at position 310 [13] by any other amino acid diminishes or abrogates the binding. Conversely, substitution of asparagine at position 434 by a hydrophobic amino acid (N434A, N434W, N434Y, N434F) or other types of amino acids (N434H, N434G, N434S, N434Q) [9,14] enhances the binding. More powerful approaches were then developed to find new variants, such as phage display [9], random plus directed mutagenesis [15], or combinations of in silico methods and rational design [16,17]. However, the generated mutants frequently appear as a combination of already described single mutations. Moreover, these methods still require experimental testing of many variants because of their low performance in predicting the combinatorial effect of several single mutations.
Several in silico methods have been developed to predict protein/protein binding affinity [18]. These methods are generally pre-determined equations (scoring functions) of energy terms, and the weights of the terms are optimized by machine learning on experimental datasets comprising various protein–protein complex structures. Although these methods perform well with the training dataset, they generally show low correlation with a new test set, which is certainly due to the fact that the test set diverges too much from the learning set [19,20]. Indeed, as with all machine learning settings, the final performance is highly dependent on the quality and diversity of the learning dataset. Algorithms dedicated to the prediction of Fc/FcRn binding affinity have been developed [21,22]. However, the precision of these scoring functions is low, especially for evaluating the impact of multiple mutations. Most of these algorithms suffer from learning sets that are too small. Yet a large amount of data on Fc/FcRn variants is available and has not been exploited with these methods. Indeed, only a selection of variants is usually described in the scientific literature, even in supplementary data, although a larger number of tested variants can be retrieved from patent applications or patents. For example, researchers from Chugai Pharma tested more than 1000 variants, but the comprehensive set of mutated variants can only be found in some patent applications (e.g., WO2013046704), whereas only 7 variants are described in the corresponding article [23].
In the present work, we collected these data in order to constitute a specific Fc/FcRn dataset that could be used in machine learning algorithms. Our dataset of Fc variants was mainly collected from the patent literature. We then trained different algorithms with Fc/FcRn parameters calculated with bioinformatic tools, together with affinity data, and assessed the performance of the different algorithms in a 10-fold cross-validation setting. We also evaluated the algorithms by comparing the distribution of predicted affinities for thousands of in silico randomly generated Fc variants. Finally, to validate the robustness of the models, we produced three new variants with three, five, and seven mutations and compared the predicted affinity with the experimental binding affinities measured by SPR.

2. Results

2.1. Description of the Fc Variant Dataset and Creation of the Learning Sets

Global patent database software was queried with various keywords, such as FcRn, antibodies, variant, mutation, or half-life, in the patent claims to specifically retrieve documents related to FcRn-directed antibody engineering. This request resulted in 225 documents (patents or patent applications), which were analyzed in order to eliminate documents that did not contain relevant examples, or that contained only variants with no amino acid substitution directly in the interface of the Fc/FcRn complex. As of December 2020, the dataset contained 1323 variants from 20 patents. Among them, 1099 are variants with an affinity reported at pH 7.0 only, measured with an accurate technique (SPR), with the same protocol (T = 25 °C, same buffer and procedures), and by the same company. The 224 other variants are reported at pH 6.0 only, measured by ELISA or Amplified Luminescent Proximity Homogeneous Assay (temperature unknown, reported as room temperature). The Fc variants (mainly IgG1) of the dataset can have up to 12 mutations. In this study, we built two learning sets of different sizes. The first learning set (FLS) contains very homogeneous data: the 1099 Fc variants evaluated at pH 7.0 by SPR with the same protocol. The second learning set (SLS) contains the 1099 variants of the FLS plus the 224 variants evaluated only at pH 6.0. The contents of the two datasets are summarized in Table 1. In an attempt to use all the available data, despite the pH difference, we homogenized the data by multiplying the KD of the 224 examples reported at pH 6.0 only by 68, since the wild-type Fc was reported to have a KD of 1.3 × 10⁻⁶ M at pH 6.0 and 8.8 × 10⁻⁵ M at pH 7.0. Indeed, it has already been proposed by other authors that the variation in the log(KD) with the pH is fairly linear between pH 6.0 and 7.4 [24]. The relevance of this first approach is examined further in the Discussion section.
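The factor of 68 follows directly from the two wild-type KD values quoted above. The following minimal Python sketch reproduces the arithmetic; the function name `kd_ph6_to_ph7` is hypothetical, and the code assumes, as the text does, that the same fold change applies to every variant.

```python
import math

# Wild-type KD values quoted in the text (mol/L).
KD_WT_PH6 = 1.3e-6
KD_WT_PH7 = 8.8e-5

# Ratio of the two wild-type KD values; rounds to 68.
PH_FACTOR = KD_WT_PH7 / KD_WT_PH6


def kd_ph6_to_ph7(kd_ph6: float, factor: float = PH_FACTOR) -> float:
    """Approximate a pH 7.0 KD from a pH 6.0 KD with a constant factor.

    Hypothetical helper: it assumes one fold change is valid for all variants.
    """
    return kd_ph6 * factor


# On a log10 scale the correction is a constant additive shift.
LOG_SHIFT = math.log10(PH_FACTOR)

print(f"factor = {PH_FACTOR:.1f}, log10 shift = {LOG_SHIFT:.2f}")
```

On a log scale, the homogenization therefore amounts to adding a constant of about 1.83 to the log(KD) of every example reported at pH 6.0.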

2.2. Algorithms and Tested Features

The 3D structures of the 1323 Fc variants were modeled from the Fc/FcRn co-crystal (4N0U.pdb file [13]) with PyMOL v2.5.4, and features reported to be relevant in previous studies [25,26,27] were calculated with the CCP4 software v8.0.009. In total, 147 features were initially considered (Table A1) and collected from the 1323 Fc/FcRn 3D models. In our model, variants considered as different in the original patent can lead to duplicates, since not all amino acid positions are used for computing parameters. For example, if one variant has the S239K/T256E substitutions and another has the L235R/T256E substitutions, they are considered duplicates because the influence of the S239K or L235R substitution is ignored in our model. Including these positions in the study was nevertheless considered. However, in our dataset, mutations at these two positions do not significantly alter the affinity. Consequently, in this example, only the T256E substitution is taken into account, and the two variants appear as duplicates in our set. We thus eliminated such duplicates, which could bias the training results. As a result, the FLS contains 1048 examples and the SLS 1251 examples.
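The duplicate rule can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the set of modeled positions is taken from the list given in Section 2.3, the helper names `variant_key` and `drop_duplicates` are hypothetical, and keeping the first occurrence is an arbitrary choice, since the text does not say which member of a duplicate group was retained.

```python
import re

# Positions used for feature calculation (list quoted in Section 2.3).
MODELED_POSITIONS = frozenset({
    251, 252, 253, 254, 255, 256, 257, 285, 286, 288, 307,
    308, 309, 310, 311, 314, 428, 433, 434, 435, 436,
})


def variant_key(variant: str, positions=MODELED_POSITIONS) -> tuple:
    """Return the substitutions of a variant that fall at modeled positions."""
    kept = []
    for sub in variant.split("/"):
        sub = sub.strip()
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        if int(match.group(2)) in positions:
            kept.append(sub)
    # Sort by position so the key does not depend on the written order.
    return tuple(sorted(kept, key=lambda s: int(s[1:-1])))


def drop_duplicates(variants):
    """Keep the first variant seen for each key (arbitrary tie-break)."""
    seen, unique = set(), []
    for variant in variants:
        key = variant_key(variant)
        if key not in seen:
            seen.add(key)
            unique.append(variant)
    return unique


examples = ["S239K/T256E", "L235R/T256E", "M252Y/T307D/N434Y"]
print(drop_duplicates(examples))
```

Both S239K/T256E and L235R/T256E reduce to the key `("T256E",)`, so only one of them survives.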
We then tested different machine learning (ML) algorithms using the FLS and SLS learning sets. From the scikit-learn library [28], we chose four different algorithms: support vector regressor (SVR), multi-linear regression (MLR), multi-layer perceptron (MLP), and random forest regressor (RFR). These methods were well suited for the type of data we had and the type of predictions we wanted to obtain. Moreover, they are quite simple in their principles, and we wanted to see if the parameters we had in mind were sufficient for the task. Using more complex and opaque artificial intelligence methods can mask problems such as insufficient examples in the learning set or overfitting.
We first used the SelectFromModel method of scikit-learn. This method evaluates the importance of each parameter based on the optimized models. The parameter with the lowest importance is removed, and the performance of the new model is computed. If the performance is not altered, the removal is confirmed, and removal of the next lowest importance parameter is evaluated. The iteration stops when the performance as compared to the initial model is altered by removal of the lowest importance parameter. Application to our two models consistently retained 25 to 28 features for the FLS and 10 to 12 features for the SLS. This first reduction in the number of features greatly improved the performance (evaluated by 10-fold cross-validation) of the MLR algorithm. The performance of SVR, RFR, and MLP remained unchanged (data not shown), but with a net gain in calculation speed.
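As an illustration only, a single `SelectFromModel` call in scikit-learn looks like the sketch below. Note that `SelectFromModel` itself keeps the features whose importance exceeds a threshold in one pass; the iterative removal described above would require wrapping such a step in a loop. The synthetic data, the random forest estimator, and the `"mean"` threshold are all assumptions made for the example, not the settings used in the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel

rng = np.random.default_rng(0)

# Synthetic stand-in for the feature table: 200 variants x 20 features.
X = rng.normal(size=(200, 20))
# Only features 0 and 3 drive the synthetic target (placeholder for log KD).
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(scale=0.1, size=200)

# Fit the estimator and keep features whose importance exceeds the mean importance.
estimator = RandomForestRegressor(n_estimators=100, random_state=0)
selector = SelectFromModel(estimator, threshold="mean").fit(X, y)

selected = np.flatnonzero(selector.get_support())  # indices of retained features
X_reduced = selector.transform(X)                  # table restricted to those features

print("retained feature indices:", selected)
```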
We then removed features that were highly correlated (evaluated by the pandas.DataFrame.corr method) and kept 11 features for the FLS and 6 for the SLS (Figure 1). This second step slightly improved the performance of the MLR with the FLS and slightly decreased the performance of the other algorithms with the SLS. However, this further dimension reduction is useful to prevent overfitting. Removing additional features negatively impacted the performance of all algorithms.
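A common way to carry out this kind of filtering with `pandas.DataFrame.corr` is sketched below. The correlation threshold (0.9), the helper name `drop_correlated`, and the rule of dropping the later column of each correlated pair are assumptions for the example; the text does not specify them.

```python
import numpy as np
import pandas as pd


def drop_correlated(features: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
    """Drop the later column of every pair whose |Pearson r| exceeds threshold."""
    corr = features.corr().abs()
    # Keep only the strict upper triangle so that each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return features.drop(columns=to_drop)


# Toy table: column "b" is almost exactly twice column "a"; "c" is unrelated.
toy = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "b": [2.0, 4.0, 6.0, 8.0, 10.0, 12.5],
    "c": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],
})
print(list(drop_correlated(toy).columns))
```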
We compared the results obtained for the two learning sets using the optimal number of features: FLS with 11 features and SLS with 6 features. The most important feature of the FLS model (35% relative importance) is the number of atoms interacting between the β chain of FcRn (β2-microglobulin) and the Fc (Figure 1). The accessible surface area of residue at position 255 and buried surface area of residue at position 434 of the Fc come in second and third position, respectively (Figure 1). The other features have lower impact but altogether account for about half of the model information (Figure 1). The most important feature retained with the SLS model is the buried surface area of the amino acid at position 129 of FcRn, with a relative importance of 0.7.
Figure 1. Parameter selection and machine learning performance; parameters are defined in Table A1. (A) Impact of the features in the FLS model. (B) Scatterplots of the 10-fold cross-validation predictions with the 4 algorithms trained on FLS. (C) Impact of the features in the SLS model. (D) Scatterplots of the 10-fold cross-validation predictions with the 4 algorithms trained on SLS. The scatterplots show the experimental (X axis) vs predicted (Y axis) affinities of the variants. Regression line is in red; R2: coefficient of determination; MAE: mean absolute error; MSE: mean squared error; the features kept for the models have been evaluated with the “SelectFromModel” of scikit-learn.
To ensure that the models were not overfitting, despite good learning performance on the entire training datasets, we used a 10-fold cross-validation scheme. We performed this cross-validation test several times for each algorithm to ensure that scores were consistent between different runs, because each run of the algorithms can produce different results. With optimized parameters (see Materials and Methods), the regression scores (R2) of the M1048/11 model (FLS, 1048 examples, 11 features) obtained with MLR, MLP, SVR, and RFR are on average 0.45, 0.60, 0.75, and 0.84, respectively, and 0.77, 0.80, 0.82, and 0.88, respectively, with the M1251/6 model (SLS, 1251 examples, 6 features) (Figure 2). The MAE (mean absolute error) and MSE (mean squared error) scores follow the same ranking as the regression scores, with the best scores obtained for the RFR. Although regression scores are better with the SLS model due to the larger range of KD values in the training set, MAE and MSE increased significantly for all algorithms compared to the FLS model. We also shuffled KD values in order to control the fit of our models. As expected, the correlation dropped drastically with R2 below 0 (R2 with no intercept can result in a negative value) while MAE and MSE increased dramatically at the same time for all algorithms. Finally, we tested a model that additionally incorporated the energy terms (60 parameters) calculated from the FoldX suite v5.0 [29] with all variants, following the same procedure of removing duplicates and correlated features, but the performance did not improve, and the best correlation obtained was 0.89 with 11 parameters with the RFR (Figure A1).
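The cross-validation and shuffled-label control described above can be sketched with scikit-learn as follows. This is an illustration on synthetic data, not the study's pipeline: the feature table, the target, the scaling step, and most hyper-parameters are assumptions (only the tanh activation and the alpha value of 20 for the MLP are mentioned in the Discussion).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic stand-in: 200 variants x 6 features, nonlinear target (placeholder for log KD).
X = rng.normal(size=(200, 6))
y = X[:, 0] - 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.2, size=200)

models = {
    "MLR": LinearRegression(),
    "SVR": make_pipeline(StandardScaler(), SVR()),
    "MLP": make_pipeline(
        StandardScaler(),
        MLPRegressor(activation="tanh", alpha=20, max_iter=500, random_state=0),
    ),
    "RFR": RandomForestRegressor(n_estimators=100, random_state=0),
}


def cv_scores(model, X, y, seed=0):
    """Score out-of-fold predictions from a 10-fold cross-validation."""
    cv = KFold(n_splits=10, shuffle=True, random_state=seed)
    pred = cross_val_predict(model, X, y, cv=cv)
    return {
        "R2": r2_score(y, pred),
        "MAE": mean_absolute_error(y, pred),
        "MSE": mean_squared_error(y, pred),
    }


scores = {name: cv_scores(model, X, y) for name, model in models.items()}

# Control: permuting the target breaks any real feature/target relationship.
y_shuffled = rng.permutation(y)
control = {name: cv_scores(model, X, y_shuffled) for name, model in models.items()}

for name in models:
    print(name, scores[name], control[name])
```

Repeating `cv_scores` with different `seed` values gives an idea of the run-to-run variability mentioned in the text.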

2.3. Randomly Generated Variants Predicted Affinity Comparison with the Four Algorithms

To evaluate the capacity of our two models and algorithms to generalize to new data, we tested both models with the four algorithms on in silico randomly generated Fc variants. We generated two sets of more than 8000 variants containing three (mut3 set) and five (mut5 set) random mutations. These mutations were introduced at positions 251, 252, 253, 254, 255, 256, 257, 285, 286, 288, 307, 308, 309, 310, 311, 314, 428, 433, 434, 435, and 436 because the calculated features of our models only included these positions. We generated one additional set of 1000 Fc variants containing six to eight mutations (mut8 set), restricted to mutations that, taken individually, are not too destabilizing or have a positive effect according to our dataset. The number of mutations was limited to eight because the effect of close mutations on the stability and production of the antibody is hard to predict.
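In silico random mutagenesis at these positions can be sketched as below. The sampling scheme (uniform choice of positions and of replacement residues) and the function names are assumptions, since the text does not detail the procedure. Wild-type residues are those named in the text where available; residues at positions 251, 255, 257, 314, 433, and 435 are filled in from the human IgG1 sequence in EU numbering and should be checked against the reference sequence actually used.

```python
import random

# Wild-type residues at the modeled positions (EU numbering, human IgG1).
# Residues at 251, 255, 257, 314, 433, and 435 are not given in the text
# and are an assumption to verify.
WILD_TYPE = {
    251: "L", 252: "M", 253: "I", 254: "S", 255: "R", 256: "T", 257: "P",
    285: "H", 286: "N", 288: "K", 307: "T", 308: "V", 309: "L", 310: "H",
    311: "Q", 314: "L", 428: "M", 433: "H", 434: "N", 435: "H", 436: "Y",
}
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def random_variant(n_mutations: int, rng: random.Random) -> str:
    """Draw n distinct positions and mutate each to a random non-wild-type residue."""
    positions = sorted(rng.sample(sorted(WILD_TYPE), n_mutations))
    subs = []
    for pos in positions:
        wt = WILD_TYPE[pos]
        new = rng.choice([aa for aa in AMINO_ACIDS if aa != wt])
        subs.append(f"{wt}{pos}{new}")
    return "/".join(subs)


def generate_set(size: int, n_mutations: int, seed: int = 0) -> list:
    """Generate `size` unique random variants with `n_mutations` substitutions each."""
    rng = random.Random(seed)
    variants = set()
    while len(variants) < size:
        variants.add(random_variant(n_mutations, rng))
    return sorted(variants)


mut3_examples = generate_set(50, 3)
print(mut3_examples[:5])
```

The mut8 set described in the text would additionally require a whitelist of individually tolerated substitutions, which is not reproduced here.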
We first compared the distribution of the predicted log KD values for the SLS (1323 variants, σ 1.47, log KD values range: [−1.03, −8.49] at pH 7.0) by the four algorithms (Figure 2). With the FLS model (Figure 2 top), the four algorithms have the same overall distributions but fail to reproduce the same distribution of log KD values as the SLS set, in contrast to the SLS models (Figure 2 bottom).
Our two models did not reproduce the same distribution of log KD values with the three sets of random mutants. With our two models, all the algorithms predicted that random variants of the mut3 and mut5 sets, but also variants of the mut8 set, would have on average lower affinity at pH 7.0 than variants of the dataset, with a tendency to predict higher affinity for the mut8 set. The standard deviations and calculated log KD means are far higher for the SLS model than for the M1048/11 model.
Interestingly, different algorithms yield different distributions of log KD, especially for the set of random variants (Figure 2). The RFR has the lowest log KD mean predictions with the SLS and is the only algorithm that does not predict higher log KD mean for the mut8 set. The MLP predicted the same type of distribution of KD values as RFR with a tendency to predict higher values. The MLR is the algorithm with the highest standard deviation with the two models. Finally, the SVR showed a much narrower range of values with a standard deviation decreasing with the number of mutations with the first model in contrast to the second model.

2.4. Experimental Validation

To further validate our prediction method, we predicted the affinity of three new variants, which to our knowledge have never been tested. We then produced them and measured their affinities. We chose variants within our sets of in silico randomly generated variants (A3 (M252W/M428K/N434W), B5 (T256Y/H285Q/N286D/V308A/N434Y), C7 (T256E/N286H/K288E/V308P/L309D/N434Y/Y436K)) and introduced them in tocilizumab. As controls, we also generated two tocilizumab variants reported in the patent application: T8 (M252Y/N286E/T307Q/V308P/Q311A/N434Y/Y436V) and T3 (M252Y/T307D/N434Y). Our first two variants contain at least one substitution reported as a single destabilizing mutation in patents: M428K for the mut3 variant and T256Y for the mut5 variant. The variant with seven mutations was taken from the mut8 set and has a high predicted affinity with all the algorithms. Affinities of the variants T8 and T3 measured in our SPR assay are close to the affinities reported in the patent application (Figure A2 and Figure A3). Overall, with the FLS and SLS, the four algorithms predict the affinity within a good range and are in good correlation with the measured affinities (Table 2, Figure A2 and Figure A3). In accordance with the 10-fold cross-validation results, the model performs poorly on the WT (tocilizumab) because it belongs to a class of antibodies with very weak binding for FcRn at neutral pH, whereas our model has better predictive potency for antibodies with KD values ranging from 1 × 10⁻⁹ to 1 × 10⁻⁶ M for FcRn at neutral pH. The correlation of the six predicted vs actual measured affinities is better with the SLS model for the RFR, SVR, and MLR algorithms, in contrast to the MLP. However, the MAE is reduced for all algorithms (Table 3). For the new variants we produced, the SLS model has better performance than the FLS for all algorithms, especially for the SVR (Table 3).
Overall, with the SLS model, the SVR algorithm has the best performance followed by the RFR, MLR, and MLP.

3. Discussion

Altogether, the present results show that it is possible to computationally predict the affinity for FcRn of Fc variants mutated at the interface of the Fc/FcRn complex with reasonable precision (±1 log). To do so, we carefully collected as many publicly available Fc variant/FcRn affinity data as possible by scrutinizing the scientific literature and relevant patents. Since differences exist between protocols used to measure the affinities, we built two different datasets. The smallest one includes only values obtained using a single protocol; the largest includes all available values. To build the two models based on these data, a large number of features relevant to the affinity prediction of a protein complex as well as features relevant for this particular type of complex were included. We also minimized overfitting as much as possible by eliminating features that were too strongly correlated with one another in each learning set. To further optimize our procedure, we tested four algorithms. The results of these tests showed that random forest has the best capacity to adapt to our learning sets as compared to MLP, MLR, or SVR algorithms (with our hyper-parameters). Indeed, regression, MAE, and MSE scores are always better with this algorithm, regardless of the model used. This study also shows that the learning set has a high impact on the importance of features and on average predictions.
Not only are the models important but also the algorithms, as they show some variability in the predicted values and their distributions. It is, however, difficult to explain the variability between algorithms since their parameters are different. For example, the larger standard deviation of the MLR algorithm is probably due to its mathematical function, which is less sensitive to threshold effects than are MLP, SVR, and RFR. The MLP algorithm has been tuned with the tanh activation function (a sigmoid-shaped function) and with an alpha parameter of 20 to limit overfitting. An alpha parameter of 0.1 would yield a larger range of values, but it would have a tendency to overfit the data. Algorithms with this kind of threshold are more relevant from a biochemical point of view, since the KD of Fc variants rarely falls below 1 × 10⁻¹⁰ M, especially for random variants. This is important to keep in mind because if two algorithms are compared and have more or less the same performance in a cross-validation scheme, then it becomes difficult to decide which of them will better generalize to new data. It is also possible that an algorithm with good performance overfits the data, even with a cross-validation test, and will consequently have less capacity to generalize to new data than an algorithm with lower performance on the same cross-validation test. For example, the RFR has the best performance in the cross-validation test, but the SVR has better performance with new variants. Moreover, the MLR has the worst performance on the cross-validation test, but it performs slightly better at predicting affinities for new variants than the MLP.

3.1. Model FLS

Our entire dataset is composed of 1323 variants. However, we built our FLS model selecting only homogeneous data, derived from an accurate technique (SPR), in order to limit noise that could be induced by outliers. The drawback is that the FLS model is biased towards a particular type of variant, namely variants engineered to have better affinity at pH 7.0. Indeed, despite our efforts to get a maximum of unique variants from the patent database, our approach is still limited by the number and quality of data. For example, the exact KD value of a variant described as a non-binder cannot be known, yet knowing the impact of its mutations would certainly increase performance. In addition, companies tend to only publish good results, i.e., variants with better affinity, and not those with decreased affinity. This results in a dataset with a majority of variants with high affinities for FcRn, which decreases the performance in estimating low affinities. The quality and consistency of data is also a prerequisite of any model. However, the accuracy of measurements may be low, especially for variants that are discarded in the first round of selection. Moreover, there are also sometimes discrepancies between studies reporting affinities. For example, in a recent study [17], mutation N434S has been reported to reduce the binding affinity of Fc to FcRn, whereas in patent US20100204454 this sole mutation has been reported to enhance the binding by threefold. Another effect of the dataset bias is that not only KD but also the weight of the features could be over- or underestimated. The difference in importance of the features in this model can be explained by the composition of the FLS. Indeed, most of the variants of this learning set contain a hydrophobic amino acid at position 434, but they do not systematically have mutations in the region of the Fc near the β2m, which changes the number of interactions between the two molecules. As a result, this feature has a higher importance than the buried surface of residue 434 of the Fc. The relative importance of features with this model is also due to the absence of variants containing mutations at positions that are deeply buried (252, 253, and 310), explaining the very low importance of these positions in this model, although they are crucial for the binding of the complex.

3.2. Model SLS

It has been shown that antibodies binding to FcRn with KD values below 860 nM at physiological pH have reduced half-lives [30]. Having data on the same variants at both acidic and physiological pH could help to better quantify the impact of this parameter. However, affinities at physiological pH are almost always reported as “no binding” because of the low sensitivity of the methods. It has been proposed that the pH impact is fairly linear between pH 6.0 and 7.4 on a log scale [24]; hence, a constant factor could suffice to approximate the pH change. We built the second model, M1251/6, with KD values at pH 7.0 based on this assumption, since all new examples of this second model were reported only at pH 6.0, or with no binding measured at neutral pH. These examples were mainly variants with a single destabilizing mutation, which introduces an interpretation bias for the pH parameter (the algorithms interpret the diminution of pH as a factor reducing the binding). We homogenized the data by multiplying the KD of the examples reported only at pH 6.0 by 68 (i.e., a 68-fold weaker affinity), since tocilizumab was reported to have a KD of 1.3 × 10⁻⁶ M at pH 6.0 and 8.8 × 10⁻⁵ M at pH 7.0. Although this is a crude approximation, the correlation increased for all algorithms. However, the MAE and MSE increased, probably because the 68-fold change in KD cannot be applied to all variants, or because these new examples had their affinities measured by less sensitive techniques such as ELISA. Indeed, we also evaluated the prediction of the four algorithms with our two models. The same transformation was applied to the reported affinities at pH 6.0, but the resulting precision for the described affinity was only ±1.5 log KD for the four algorithms with the second model (Table A2). In addition, if several histidine mutations are considered, the KD change between the two pH values could be more drastic.
In model M1251/6, the buried surface area of the FcRn amino acid 129 is the most discriminant feature (importance: 0.7) because most variants with no hydrophobic mutation at this position have decreased affinities for FcRn in the SLS. The weights of other features calculated by MLR, MLP, and SVR are negligible, which explains why the correlation curves of the second model show very little change in predicted KDs for large, measured KD ranges and can cluster into two groups.
Cross-validation is the classical test to evaluate whether a model overfits. Even if the algorithms performed well with the two models, both models are biased towards variants engineered to have high affinity at neutral pH, as explained above. To evaluate the impact of this bias, we tested whether the models would reproduce the same distribution of predicted KD of the learning set with the random variant sets (mut3 and mut5 sets). All the algorithms predicted ranges of values of lower affinity for the random variant sets than the learning set of the M1251/6 model (Figure 2). Conversely, the M1048/11 model tends to stick to the range of values of the learning set except for the SVR (Figure 2).
In contrast to the SVR and MLR, the RFR and MLP algorithms did not predict higher affinities within the set of eight “good” random mutations in which only individual mutations shown to increase the affinity were kept. However, some mutation combinations incorporated in this set might have decreased affinities.
We also compared the first 20 variants for each set with the higher predicted affinity, considering each algorithm. Most of the experimental Fc variants with significantly better affinity for FcRn at neutral pH have hydrophobic substitution at position 434, whereas histidine 310 and isoleucine 253 are not substituted. However, none of the algorithms tested shows this pattern in its top 20 ranked variants (Table A3).
We challenged our models with mutation combinations not diverging too much from the examples of the learning set. We chose two variants from the sets of three and five mutations, each containing a destabilizing mutation. To ensure that we would be able to measure an affinity for these variants, they also had to contain at least one mutation that showed great improvement in affinity (such as the N434Y or N434W mutations) to counterbalance the negative effect on affinity. Although the chosen mutants do not diverge too much from the learning sets, the results of the experimental measurements show that we are able to accurately predict their affinities.

3.3. Further Improvements

Although our experimental validations show the reliability of the method, the robustness and predictive power of the models would be significantly increased with a larger experimental validation set. In addition, our DS comprises 1323 variants, but this number could be larger if we had taken into account intramolecular interactions or long-range effects. Indeed, some mutations that are not at the interaction surface can impact the affinity of the complex. For example, Booth et al. [16] hypothesized that M428L and A378V could stabilize the 250 pseudo-helix. They also proposed in their study to complement the positively charged N-terminal region of the FcRn β-domain with T256, T307, H285, N286, and N315. Other general descriptors to consider could be the electrostatic complementarity between regions of the complex or the rigidity of the 250 pseudo-helix. It has also been shown that the destabilization of the region of the Fc at low pH could be responsible for higher binding [31]. Although the reasons are not very well understood, Monnet et al. [15] showed that positions outside the interaction site (264 and 389) could favorably impact the binding. More intriguingly, they have also shown that mutations far away from the interaction site (P230S, P228L, or P228R) could enhance FcRn binding, although not consistently. In the same way, Ternant et al. [5] reported the influence of four different G1m allotypes on FcRn binding, although amino acids 214, 356, and 358 are distant from the interaction site. Some of these mutations outside of the Fc/FcRn interaction site have been introduced to optimize binding to Fcγ receptors (or already exist in natural sequences), and they could still have an impact on FcRn binding. Descriptors capturing these effects could thus enhance the performance of our method.
As explained at the beginning of this paper, we chose to use rather simple learning methods because we did not know whether we had enough data, because we wanted to avoid overfitting, and because we wanted to demonstrate the validity of the global approach. The results bring positive answers to these three points, and it would now be worth trying more complex methods such as evolutionary algorithms or neural networks.
Finally, we focused on predicting the overall affinity (KD) because there were too few data on kon and koff. However, to obtain variants with desirable properties, kon and koff should also be taken into account [24]. Indeed, it has been shown that the endosomal trafficking time of the antibody is very short (a half-life of less than 10 min). Thus, it would be more important for an antibody to have a very high kon at pH 6.0 than a low koff, which could prevent the antibody from being released back into the circulation. However, variants generated with a slow off-rate exhibited an extended half-life in mice and cynomolgus monkeys [16]. In any case, integrating these data could help to improve in silico design methods.
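To make the distinction between KD and the kinetic constants concrete, the sketch below uses the textbook relations KD = koff/kon and t1/2 = ln(2)/koff, which hold for a simple 1:1 binding model (an assumption made here for illustration; the SPR data in this study were fitted with a bivalent model). The function names and rate constants are hypothetical.

```python
import math


def dissociation_constant(k_on, k_off):
    """KD (M) from k_on (M^-1 s^-1) and k_off (s^-1), assuming 1:1 binding."""
    if k_on <= 0 or k_off < 0:
        raise ValueError("k_on must be positive and k_off non-negative")
    return k_off / k_on


def dissociation_half_life(k_off):
    """Half-life (s) of the complex for first-order dissociation."""
    if k_off <= 0:
        raise ValueError("k_off must be positive")
    return math.log(2) / k_off


# Two hypothetical variants with the same KD but very different kinetics.
kd_fast = dissociation_constant(k_on=1e6, k_off=1e-1)
kd_slow = dissociation_constant(k_on=1e4, k_off=1e-3)
half_life_fast = dissociation_half_life(1e-1)  # a few seconds
half_life_slow = dissociation_half_life(1e-3)  # more than ten minutes
```

Both hypothetical variants would look identical to a model trained on KD alone, although their complexes dissociate on very different time scales.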

4. Materials and Methods

4.1. Antibody Expression and Purification

T3, T8, A3, B5, and C7 antibodies were produced by RD-Biotech (Besançon, France) following standard procedures by transient transfection of CHO cells. Antibodies were purified with protein A.

4.2. Surface Plasmon Resonance

SPR experiments were performed on a Bia3000 apparatus at 25 °C in 50 mM phosphate buffer with 150 mM NaCl containing 0.05% P20 surfactant (GE Healthcare, Chicago, IL, USA), adjusted to pH 7 or pH 6 as required. hFcRn (Immunitrack, Copenhagen, Denmark) was immobilized in acetate buffer at pH 5 on CM5 sensor chips at a level lower than 200 RU. Increasing concentrations of antibody variants were injected over 180 s. After a dissociation phase of 400 s, the FcRn-coated sensor chip was regenerated by a pulse of 10 mM NaOH and PBS. The multi-cycle kinetics were evaluated by fitting a bivalent model (BiaEvaluation 4.1.1, GE Healthcare). Each variant was analyzed on freshly immobilized hFcRn.

4.3. Structure-Based Feature Extraction

To model the 3D structures of the Fc mutants, the 4N0U.pdb file was used as a template. Using the mutagenesis tool from PyMOL v2.5.4, the 3D structure of the complex between FcRn and each mutant from the dataset was generated and exported as a pdb file. CCP4 software v8.0.009 was used to compute the different features used in the algorithms. The features calculated by CCP4 for each residue were: BSA (buried surface area), ASA (accessible surface area), and solvation energy. The general features calculated by CCP4 for the whole complex were: number of interface residues, ∆G (solvation energy gain score), p-value (hydrophobic score), BE (theoretical binding energy), and numbers of hydrogen bonds and salt bridges between interfaces. In addition to the CCP4-calculated parameters, the following were added: total number of hydrogen bonds (cutoff: 3.5 angstroms), total number of salt bridges (cutoff: 4.0 angstroms), total number of contacts between amino acids’ Cα atoms (cutoff: 4.0 angstroms), average distance between hydrogen bonds, and number of paired hydrophilic amino acids.
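The distance-cutoff descriptors listed above can be illustrated with a short sketch that counts pairs of points, one from each partner, lying within a given cutoff. The function name, the inclusive comparison (`<=`), and the toy coordinates are assumptions; in practice the coordinates would be read from the modeled pdb files, and the exact counting rules used in this work are not specified.

```python
import math


def count_contacts(coords_a, coords_b, cutoff=4.0):
    """Count pairs (one point from each list) at most `cutoff` angstroms apart."""
    return sum(
        1
        for a in coords_a
        for b in coords_b
        if math.dist(a, b) <= cutoff
    )


# Toy coordinates in angstroms (hypothetical, not taken from 4N0U).
fc_atoms = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
fcrn_atoms = [(3.0, 0.0, 0.0), (0.0, 5.0, 0.0), (12.0, 0.0, 0.0)]

contacts_4a = count_contacts(fc_atoms, fcrn_atoms, cutoff=4.0)
contacts_5a = count_contacts(fc_atoms, fcrn_atoms, cutoff=5.0)
```

The same function could serve for hydrogen bonds or salt bridges by restricting the input lists to the relevant donor, acceptor, or charged atoms and changing the cutoff.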
Algorithms from scikit-learn v0.20.3 were used. Data were standardized.
The estimator’s parameters were set to:
RFR: (n_estimators = 'warn', criterion = 'mse', max_depth = 10, min_samples_split = 2, min_samples_leaf = 1, min_weight_fraction_leaf = 0.0, max_features = 'auto', max_leaf_nodes = None, min_impurity_decrease = 0.0, min_impurity_split = None, bootstrap = True, oob_score = False, n_jobs = None, random_state = None, verbose = 0, warm_start = False).
SVR: (kernel = 'rbf', degree = 3, gamma = 'auto_deprecated', coef0 = 0.0, tol = 0.001, C = 1.0, epsilon = 0.1, shrinking = True, cache_size = 200, verbose = False, max_iter = -1).
MLPRegressor: (solver = 'lbfgs', alpha = 20, hidden_layer_sizes = (20,2), random_state = 10, activation = 'tanh', max_iter = 4000, tol = 0.00001, early_stopping = True).
LR: (fit_intercept = True, normalize = False, copy_X = True, n_jobs = None).
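The standardization step mentioned above is not detailed further. A common choice, assumed here, is to center each feature on its mean and divide by its population standard deviation, which is also the behavior of scikit-learn's `StandardScaler`. The sketch below does this with the standard library only; the function name and the toy feature matrix are hypothetical.

```python
import statistics


def standardize_columns(rows):
    """Return a z-scored copy of `rows` (a list of equal-length feature lists)."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Population standard deviation (divides by n, not n - 1).
    stds = [statistics.pstdev(col) for col in columns]
    return [
        # Constant features (zero spread) are mapped to 0.0 to avoid division by zero.
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Toy feature matrix: three variants, two features (the second is constant).
standardized = standardize_columns([[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]])
```

Standardization matters most for the SVR and MLP, whose kernels and weight updates are sensitive to the scale of the input features.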

5. Conclusions

Affinity prediction is one of the toughest bioinformatics challenges, and although progress has been made, there is still room for improvement. We chose to focus on one particular type of protein complex for which many data were available. The results of the training show that this kind of approach is appropriate and also that the diversity of the training set is crucial to avoid bias and to correctly evaluate the importance of the different features. Despite all the limitations of our models, we were able to correctly predict the affinities of the three variants that were produced in this study. However, the results obtained do not allow us to make an educated choice between the methods. The SLS-trained algorithms appear to perform better than the FLS-trained ones, both in 10-fold cross-validation (Figure 1) and in predicting the affinities of the new variants (Table 2 and Table 3). Among the algorithms, the MLS and MLP algorithms perform better in predicting the new variants, whereas the RFR algorithm is better in the 10-fold cross-validation. Thus, deciding between the three methods will require more validations.
The advantage of this method is that it does not require initial knowledge to generate in silico random variants and select mutants with high affinity. However, like most artificial-intelligence-based methods, it does not explain how various combinations of mutations can modulate the affinity of the Fc to FcRn. Still, it provides new interesting combinations of mutations while reducing the number of variants to test.

Author Contributions

Conceptualization, C.D., V.G.-G., A.P. and H.W.; methodology, C.D., V.G.-G., A.P. and H.W.; software, C.D. and A.P.; validation, C.D., M.P., C.H. and V.G.-G.; data curation, C.D.; writing, C.D., V.G.-G., A.P. and H.W.; supervision, H.W.; funding acquisition, V.G.-G. and H.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the French Higher Education and Research Ministry under the program “Investissements d’Avenir”, grant agreement: LabEx MAbImprove ANR-10-LABX-53-01. Christophe Dumet was funded by a Ph.D. grant from LabEx MAbImprove. This work was also part of the MAbMapping technological intelligence platform of the University of Tours; MAbMapping was funded by the European Regional Development Fund and by the regional program ARD 2020 Biopharmaceuticals.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

We thank Yann Jullian and Thomas Bourquard for their helpful comments.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. List of descriptors investigated.
No. | Feature Name | Meaning
1 | FcSOLV251A | Solvation effect of the 251 residue in Å²
2 | FcSOLV252A | Solvation effect of the 252 residue in Å²
3 | FcSOLV253A | Solvation effect of the 253 residue in Å²
4 | FcSOLV254A | Solvation effect of the 254 residue in Å²
5 | FcSOLV309A | Solvation effect of the 309 residue in Å²
6 | FcSOLV310A | Solvation effect of the 310 residue in Å²
7 | FcSOLV311A | Solvation effect of the 311 residue in Å²
8 | FcSOLV314A | Solvation effect of the 314 residue in Å²
9 | FcSOLV428A | Solvation effect of the 428 residue in Å²
10 | FcSOLV433A | Solvation effect of the 433 residue in Å²
11 | FcSOLV434A | Solvation effect of the 434 residue in Å²
12 | FcSOLV435A | Solvation effect of the 435 residue in Å²
13 | FcSOLV436A | Solvation effect of the 436 residue in Å²
14 | FcSOLV253B | Solvation effect of the 253 residue in Å²
15 | FcSOLV255B | Solvation effect of the 255 residue in Å²
16 | FcSOLV256B | Solvation effect of the 256 residue in Å²
17 | FcSOLV257B | Solvation effect of the 257 residue in Å²
18 | FcSOLV285B | Solvation effect of the 285 residue in Å²
19 | FcSOLV286B | Solvation effect of the 286 residue in Å²
20 | FcSOLV288B | Solvation effect of the 288 residue in Å²
21 | FcSOLV307B | Solvation effect of the 307 residue in Å²
22 | FcSOLV308 | Solvation effect of the 308 residue in Å²
23 | FcSOLV309B | Solvation effect of the 309 residue in Å²
24 | FcSOLV310B | Solvation effect of the 310 residue in Å²
25 | FcBSA251A | Buried surface of the 251 residue in Å²
26 | FcBSA252A | Buried surface of the 252 residue in Å²
27 | FcBSA253A | Buried surface of the 253 residue in Å²
28 | FcBSA254A | Buried surface of the 254 residue in Å²
29 | FcBSA309A | Buried surface of the 309 residue in Å²
30 | FcBSA310A | Buried surface of the 310 residue in Å²
31 | FcBSA311A | Buried surface of the 311 residue in Å²
32 | FcBSA314A | Buried surface of the 314 residue in Å²
33 | FcBSA428A | Buried surface of the 428 residue in Å²
34 | FcBSA433A | Buried surface of the 433 residue in Å²
35 | FcBSA434A | Buried surface of the 434 residue in Å²
36 | FcBSA435A | Buried surface of the 435 residue in Å²
37 | FcBSA436A | Buried surface of the 436 residue in Å²
38 | FcBSA253B | Buried surface of the 253 residue in Å²
39 | FcBSA255B | Buried surface of the 255 residue in Å²
40 | FcBSA256B | Buried surface of the 256 residue in Å²
41 | FcBSA257B | Buried surface of the 257 residue in Å²
42 | FcBSA285B | Buried surface of the 285 residue in Å²
43 | FcBSA286B | Buried surface of the 286 residue in Å²
44 | FcBSA288B | Buried surface of the 288 residue in Å²
45 | FcBSA307B | Buried surface of the 307 residue in Å²
46 | FcBSA308 | Buried surface of the 308 residue in Å²
47 | FcBSA309B | Buried surface of the 309 residue in Å²
48 | FcBSA310B | Buried surface of the 310 residue in Å²
49 | FcASA251A | Surface accessible to the solvent of the 251 residue in Å²
50 | FcASA252A | Surface accessible to the solvent of the 252 residue in Å²
51 | FcASA253A | Surface accessible to the solvent of the 253 residue in Å²
52 | FcASA254A | Surface accessible to the solvent of the 254 residue in Å²
53 | FcASA309A | Surface accessible to the solvent of the 309 residue in Å²
54 | FcASA310A | Surface accessible to the solvent of the 310 residue in Å²
55 | FcASA311A | Surface accessible to the solvent of the 311 residue in Å²
56 | FcASA314A | Surface accessible to the solvent of the 314 residue in Å²
57 | FcASA428A | Surface accessible to the solvent of the 428 residue in Å²
58 | FcASA433A | Surface accessible to the solvent of the 433 residue in Å²
59 | FcASA434A | Surface accessible to the solvent of the 434 residue in Å²
60 | FcASA435A | Surface accessible to the solvent of the 435 residue in Å²
61 | FcASA436A | Surface accessible to the solvent of the 436 residue in Å²
62 | FcASA253B | Surface accessible to the solvent of the 253 residue in Å²
63 | FcASA255B | Surface accessible to the solvent of the 255 residue in Å²
64 | FcASA256B | Surface accessible to the solvent of the 256 residue in Å²
65 | FcASA257B | Surface accessible to the solvent of the 257 residue in Å²
66 | FcASA285B | Surface accessible to the solvent of the 285 residue in Å²
67 | FcASA286B | Surface accessible to the solvent of the 286 residue in Å²
68 | FcASA288B | Surface accessible to the solvent of the 288 residue in Å²
69 | FcASA307B | Surface accessible to the solvent of the 307 residue in Å²
70 | FcASA308B | Surface accessible to the solvent of the 308 residue in Å²
71 | FcASA309B | Surface accessible to the solvent of the 309 residue in Å²
72 | FcASA310B | Surface accessible to the solvent of the 310 residue in Å²
73 | FcRnSOLV88A | Solvation effect of the 88
74 | FcRnSOLV112A | Solvation effect of the 112
75 | FcRnSOLV113A | Solvation effect of the 113
76 | FcRnSOLV114A | Solvation effect of the 114
77 | FcRnSOLV115A | Solvation effect of the 115
78 | FcRnSOLV116A | Solvation effect of the 116
79 | FcRnSOLV128A | Solvation effect of the 128
80 | FcRnSOLV129A | Solvation effect of the 129
81 | FcRnSOLV130A | Solvation effect of the 130
82 | FcRnSOLV131A | Solvation effect of the 131
83 | FcRnSOLV132A | Solvation effect of the 132
84 | FcRnSOLV133A | Solvation effect of the 133
85 | FcRnSOLV135A | Solvation effect of the 135
86 | FcRnSOLV1B | Solvation effect of the 1
87 | FcRnSOLV2B | Solvation effect of the 2
88 | FcRnSOLV3B | Solvation effect of the 3
89 | FcRnSOLV4B | Solvation effect of the 4
90 | FcRnSOLV85B | Solvation effect of the 85
91 | FcRnSOLV86B | Solvation effect of the 86
92 | FcRnBSA88A | Buried surface of the 88
93 | FcRnBSA112A | Buried surface of the 112
94 | FcRnBSA113A | Buried surface of the 113
95 | FcRnBSA114A | Buried surface of the 114
96 | FcRnBSA115A | Buried surface of the 115
97 | FcRnBSA116A | Buried surface of the 116
98 | FcRnBSA128A | Buried surface of the 128
99 | FcRnBSA129A | Buried surface of the 129
100 | FcRnBSA130A | Buried surface of the 130
101 | FcRnBSA131A | Buried surface of the 131
102 | FcRnBSA132A | Buried surface of the 132
103 | FcRnBSA133A | Buried surface of the 133
104 | FcRnBSA135A | Buried surface of the 135
105 | FcRnBSA1B | Buried surface of the 1
106 | FcRnBSA2B | Buried surface of the 2
107 | FcRnBSA3B | Buried surface of the 3
108 | FcRnBSA4B | Buried surface of the 4
109 | FcRnBSA85B | Buried surface of the 85
110 | FcRnBSA86B | Buried surface of the 86
111 | FcRnASA88A | Surface accessible to the solvent of the 88
112 | FcRnASA112A | Surface accessible to the solvent of the 112
113 | FcRnASA113A | Surface accessible to the solvent of the 113
114 | FcRnASA114A | Surface accessible to the solvent of the 114
115 | FcRnASA115A | Surface accessible to the solvent of the 115
116 | FcRnASA116A | Surface accessible to the solvent of the 116
117 | FcRnASA128A | Surface accessible to the solvent of the 128
118 | FcRnASA129A | Surface accessible to the solvent of the 129
119 | FcRnASA130A | Surface accessible to the solvent of the 130
120 | FcRnASA131A | Surface accessible to the solvent of the 131
121 | FcRnASA132A | Surface accessible to the solvent of the 132
122 | FcRnASA133A | Surface accessible to the solvent of the 133
123 | FcRnASA135A | Surface accessible to the solvent of the 135
124 | FcRnASA1B | Surface accessible to the solvent of the 1
125 | FcRnASA2B | Surface accessible to the solvent of the 2
126 | FcRnASA3B | Surface accessible to the solvent of the 3
127 | FcRnASA4B | Surface accessible to the solvent of the 4
128 | FcRnASA85B | Surface accessible to the solvent of the 85
129 | FcRnASA86B | Surface accessible to the solvent of the 86
130 | nbaainterFcA | Number of atoms interacting between Fc and the FcRn alpha chain
131 | nbaainterFcB | Number of atoms interacting between Fc and the FcRn beta chain
132 | nbliaiHFcA | Number of hydrogen bonds between Fc and the FcRn alpha chain
133 | nbliaiHFcB | Number of hydrogen bonds between Fc and the FcRn beta chain
134 | nbsaltFcA | Number of salt bridges between Fc and the FcRn alpha chain
135 | nbsaltFcB | Number of salt bridges between Fc and the FcRn beta chain
136 | interFace_solv_en_FcA | Solvation energy gain score calculated by PISA between Fc and the FcRn alpha chain
137 | interface_solv_en_FcB | Solvation energy gain score calculated by PISA between Fc and the FcRn beta chain
138 | p_valueFcA | Hydrophobic score calculated by PISA between Fc and the FcRn alpha chain
139 | p_valueFcB | Hydrophobic score calculated by PISA between Fc and the FcRn beta chain
140 | delta_g_theoriqueFcA | Theoretical binding energy score calculated by PISA between Fc and the FcRn alpha chain
141 | delta_g_theoriqueFcB | Theoretical binding energy score calculated by PISA between Fc and the FcRn beta chain
142 | Bond Strength | Average distance between bonds
143 | paired hydrophilic | Number of paired hydrophilic amino acids
144 | pH | pH
145 | nbr_bounds_h | Total number of hydrogen bonds
146 | nbr_bounds_s | Total number of salt bridges
147 | nbr_bounds_c | Total number of contacts between amino acids’ Cα atoms
A and B stand for the FcRn α chain and β chain, respectively.
Table A2. Predictions of affinities at pH 7 of variants for which the measurement was made at pH 6.0, as reported in [17].
Variant | KD pH 7, Predicted | KD pH 7, Computed | KD pH 6, Measured
RFR
T256E/T307Q | 4.06 × 10⁻⁵ | 1.58 × 10⁻⁵ | 2.32 × 10⁻⁷
T256D/T307W | 1.78 × 10⁻⁴ | 1.15 × 10⁻⁵ | 1.69 × 10⁻⁷
M252Y/T256D | 3.23 × 10⁻⁵ | 6.39 × 10⁻⁶ | 9.40 × 10⁻⁸
M252Y/T256E | 1.87 × 10⁻⁶ | 8.70 × 10⁻⁶ | 1.28 × 10⁻⁷
M252Y/T307W | 2.31 × 10⁻⁵ | 8.02 × 10⁻⁶ | 1.18 × 10⁻⁷
M252Y/T256D/T307Q | 5.48 × 10⁻⁷ | 7.82 × 10⁻⁶ | 1.15 × 10⁻⁷
M252Y/T256E/T307Q | 6.69 × 10⁻⁷ | 1.48 × 10⁻⁵ | 2.18 × 10⁻⁷
MLP
T256E/T307Q | 4.45 × 10⁻⁵ | 1.58 × 10⁻⁵ | 2.32 × 10⁻⁷
T256D/T307W | 4.50 × 10⁻⁵ | 1.15 × 10⁻⁵ | 1.69 × 10⁻⁷
M252Y/T256D | 8.66 × 10⁻⁶ | 6.39 × 10⁻⁶ | 9.40 × 10⁻⁸
M252Y/T256E | 8.74 × 10⁻⁶ | 8.70 × 10⁻⁶ | 1.28 × 10⁻⁷
M252Y/T307W | 1.33 × 10⁻⁶ | 8.02 × 10⁻⁶ | 1.18 × 10⁻⁷
M252Y/T256D/T307Q | 1.46 × 10⁻⁶ | 7.82 × 10⁻⁶ | 1.15 × 10⁻⁷
M252Y/T256E/T307Q | 1.48 × 10⁻⁶ | 1.48 × 10⁻⁵ | 2.18 × 10⁻⁷
MLR
T256E/T307Q | 9.25 × 10⁻⁵ | 1.58 × 10⁻⁵ | 2.32 × 10⁻⁷
T256D/T307W | 1.03 × 10⁻⁴ | 1.15 × 10⁻⁵ | 1.69 × 10⁻⁷
M252Y/T256D | 7.33 × 10⁻⁵ | 6.39 × 10⁻⁶ | 9.40 × 10⁻⁸
M252Y/T256E | 6.92 × 10⁻⁵ | 8.70 × 10⁻⁶ | 1.28 × 10⁻⁷
M252Y/T307W | 3.00 × 10⁻⁵ | 8.02 × 10⁻⁶ | 1.18 × 10⁻⁷
M252Y/T256D/T307Q | 2.81 × 10⁻⁵ | 7.82 × 10⁻⁶ | 1.15 × 10⁻⁷
M252Y/T256E/T307Q | 2.72 × 10⁻⁵ | 1.48 × 10⁻⁵ | 2.18 × 10⁻⁷
SVR
T256E/T307Q | 9.25 × 10⁻⁵ | 1.58 × 10⁻⁵ | 2.32 × 10⁻⁷
T256D/T307W | 1.03 × 10⁻⁴ | 1.15 × 10⁻⁵ | 1.69 × 10⁻⁷
M252Y/T256D | 7.33 × 10⁻⁵ | 6.39 × 10⁻⁶ | 9.40 × 10⁻⁸
M252Y/T256E | 6.92 × 10⁻⁵ | 8.70 × 10⁻⁶ | 1.28 × 10⁻⁷
M252Y/T307W | 3.00 × 10⁻⁵ | 8.02 × 10⁻⁶ | 1.18 × 10⁻⁷
M252Y/T256D/T307Q | 2.81 × 10⁻⁵ | 7.82 × 10⁻⁶ | 1.15 × 10⁻⁷
M252Y/T256E/T307Q | 2.72 × 10⁻⁵ | 1.48 × 10⁻⁵ | 2.18 × 10⁻⁷
Table A3. For the two different models and the four different algorithms, the 20 variants predicted with the highest affinity (lowest KD).
Model FL
RFR learning_set
Variant # | Mutations | KD
833 | 235R, 239K, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 3.46 × 10⁻⁹
831 | 250V, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 3.61 × 10⁻⁹
802 | 235R, 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 3.70 × 10⁻⁹
829 | 235K, 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 3.70 × 10⁻⁹
832 | 235R, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 3.77 × 10⁻⁹
800 | 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 3.87 × 10⁻⁹
801 | 235R, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 4.06 × 10⁻⁹
828 | 235K, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 4.06 × 10⁻⁹
568 | 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y | 4.24 × 10⁻⁹
1 | 239K, 252Y, 270F, 286E, 307Q, 308P, 311A, 428I, 434Y | 4.55 × 10⁻⁹
2 | 239K, 252W, 286E, 308P, 428Y, 434Y | 6.50 × 10⁻⁹
3 | 239K, 252W, 256E, 286E, 308P, 428Y, 434Y | 6.51 × 10⁻⁹
567 | 252Y, 286E, 307Q, 308P, 311A, 434Y | 6.78 × 10⁻⁹
527 | 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y | 7.15 × 10⁻⁹
5 | 239K, 252Y, 270F, 286E, 307Q, 308P, 428I, 434Y | 7.45 × 10⁻⁹
7 | 239K, 252Y, 270F, 286E, 308P, 387E, 428I, 434Y | 7.47 × 10⁻⁹
8 | 239K, 252Y, 270F, 286E, 308P, 428I, 434Y | 7.63 × 10⁻⁹
4 | 239K, 252Y, 270F, 286E, 308P, 311A, 428I, 434Y | 7.68 × 10⁻⁹
565 | 252Y, 286E, 308P, 428I, 434Y | 7.92 × 10⁻⁹
6 | 239K, 252Y, 286E, 308P, 428I, 434Y | 7.97 × 10⁻⁹
MLP learning_set
Variant # | Mutations | KD
568 | 235R, 239K, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 7.65 × 10⁻⁹
1 | 250V, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 7.65 × 10⁻⁹
29 | 235R, 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 7.98 × 10⁻⁹
5 | 235K, 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 8.07 × 10⁻⁹
802 | 235R, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 8.19 × 10⁻⁹
829 | 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 8.19 × 10⁻⁹
800 | 235R, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 8.32 × 10⁻⁹
801 | 235K, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 8.32 × 10⁻⁹
828 | 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y | 8.32 × 10⁻⁹
831 | 239K, 252Y, 270F, 286E, 307Q, 308P, 311A, 428I, 434Y | 8.66 × 10⁻⁹
832 | 239K, 252W, 286E, 308P, 428Y, 434Y | 8.66 × 10⁻⁹
19 | 239K, 252W, 256E, 286E, 308P, 428Y, 434Y | 8.69 × 10⁻⁹
47 | 252Y, 286E, 307Q, 308P, 311A, 434Y | 8.72 × 10⁻⁹
24 | 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y | 8.72 × 10⁻⁹
495 | 239K, 252Y, 270F, 286E, 307Q, 308P, 428I, 434Y | 8.72 × 10⁻⁹
833 | 239K, 252Y, 270F, 286E, 308P, 387E, 428I, 434Y | 9.13 × 10⁻⁹
567 | 239K, 252Y, 270F, 286E, 308P, 428I, 434Y | 9.44 × 10⁻⁹
527 | 239K, 252Y, 270F, 286E, 308P, 311A, 428I, 434Y | 9.44 × 10⁻⁹
3 | 252Y, 286E, 308P, 428I, 434Y | 1.02 × 10⁻⁸
570 | 239K, 252Y, 286E, 308P, 428I, 434Y | 1.09 × 10⁻⁸
MLR learning_set
Variant # | Mutations | KD
544 | 235R, 239K, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 5.13 × 10⁻⁹
530 | 250V, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 6.30 × 10⁻⁹
244 | 235R, 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 7.93 × 10⁻⁹
247 | 235K, 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 8.08 × 10⁻⁹
343 | 235R, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 8.10 × 10⁻⁹
543 | 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 9.51 × 10⁻⁹
568 | 235R, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 9.56 × 10⁻⁹
1 | 235K, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 9.69 × 10⁻⁹
5 | 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y | 9.77 × 10⁻⁹
567 | 239K, 252Y, 270F, 286E, 307Q, 308P, 311A, 428I, 434Y | 1.10 × 10⁻⁸
527 | 239K, 252W, 286E, 308P, 428Y, 434Y | 1.11 × 10⁻⁸
29 | 239K, 252W, 256E, 286E, 308P, 428Y, 434Y | 1.11 × 10⁻⁸
47 | 252Y, 286E, 307Q, 308P, 311A, 434Y | 1.12 × 10⁻⁸
833 | 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y | 1.16 × 10⁻⁸
24 | 239K, 252Y, 270F, 286E, 307Q, 308P, 428I, 434Y | 1.18 × 10⁻⁸
495 | 239K, 252Y, 270F, 286E, 308P, 387E, 428I, 434Y | 1.18 × 10⁻⁸
19 | 239K, 252Y, 270F, 286E, 308P, 428I, 434Y | 1.27 × 10⁻⁸
802 | 239K, 252Y, 270F, 286E, 308P, 311A, 428I, 434Y | 1.31 × 10⁻⁸
829 | 252Y, 286E, 308P, 428I, 434Y | 1.31 × 10⁻⁸
536 | 239K, 252Y, 286E, 308P, 428I, 434Y | 1.45 × 10⁻⁸
SVR learning_set
Variant # | Mutations | KD
831 | 235R, 239K, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 4.90 × 10⁻⁹
832 | 250V, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 4.90 × 10⁻⁹
800 | 235R, 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 5.24 × 10⁻⁹
801 | 235K, 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 5.24 × 10⁻⁹
828 | 235R, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 5.24 × 10⁻⁹
833 | 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 5.87 × 10⁻⁹
2 | 235R, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 6.92 × 10⁻⁹
3 | 235K, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 7.08 × 10⁻⁹
802 | 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y | 7.28 × 10⁻⁹
829 | 239K, 252Y, 270F, 286E, 307Q, 308P, 311A, 428I, 434Y | 7.28 × 10⁻⁹
5 | 239K, 252W, 286E, 308P, 428Y, 434Y | 8.13 × 10⁻⁹
565 | 239K, 252W, 256E, 286E, 308P, 428Y, 434Y | 8.71 × 10⁻⁹
6 | 252Y, 286E, 307Q, 308P, 311A, 434Y | 8.71 × 10⁻⁹
7 | 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y | 8.71 × 10⁻⁹
8 | 239K, 252Y, 270F, 286E, 307Q, 308P, 428I, 434Y | 8.71 × 10⁻⁹
763 | 239K, 252Y, 270F, 286E, 308P, 387E, 428I, 434Y | 1.02 × 10⁻⁸
818 | 239K, 252Y, 270F, 286E, 308P, 428I, 434Y | 1.02 × 10⁻⁸
783 | 239K, 252Y, 270F, 286E, 308P, 311A, 428I, 434Y | 1.02 × 10⁻⁸
567 | 252Y, 286E, 308P, 428I, 434Y | 1.20 × 10⁻⁸
527 | 239K, 252Y, 286E, 308P, 428I, 434Y | 1.20 × 10⁻⁸
RFR 3mut
Variant # | Mutations | KD
22,717 | L309Q, N434D, Y436L | 3.14 × 10⁻⁸
27,638 | H310N, H435G, Y436L | 3.72 × 10⁻⁸
21,828 | M252R, H310E, H433N | 4.01 × 10⁻⁸
23,131 | H310E, N434L, Y436K | 4.03 × 10⁻⁸
26,871 | H310R, N434F, Y436K | 4.18 × 10⁻⁸
22,177 | K288G, H310G, N434W | 5.50 × 10⁻⁸
27,282 | Q311R, M428F, N434F | 6.17 × 10⁻⁸
21,781 | K288G, H310G, H433T | 6.34 × 10⁻⁸
23,175 | M252W, I253D, N286R | 7.10 × 10⁻⁸
25,334 | M252R, K288S, H433S | 7.47 × 10⁻⁸
20,956 | M252R, I253G, H433G | 8.59 × 10⁻⁸
26,312 | I253D, H433A, Y436K | 8.70 × 10⁻⁸
25,958 | K288A, H310T, N434H | 8.87 × 10⁻⁸
23,441 | Q311R, M428N, N434F | 9.07 × 10⁻⁸
28,117 | K288R, H310D, H433Y | 9.16 × 10⁻⁸
27,621 | T256W, H435G, Y436N | 9.43 × 10⁻⁸
20,844 | P257V, N286D, T307Y | 9.79 × 10⁻⁸
27,672 | M252Q, T256N, H433Y | 9.80 × 10⁻⁸
20,219 | M252K, T256Y, H433N | 1.01 × 10⁻⁷
22,805 | N286R, T307R, Y436I | 1.02 × 10⁻⁷
MLP 3mut
Variant # | Mutations | KD
25,115 | L309Q, N434D, Y436L | 9.91 × 10⁻⁹
23,790 | H310N, H435G, Y436L | 1.18 × 10⁻⁸
27,256 | M252R, H310E, H433N | 1.30 × 10⁻⁸
27,568 | H310E, N434L, Y436K | 1.36 × 10⁻⁸
27,086 | H310R, N434F, Y436K | 1.40 × 10⁻⁸
22,280 | K288G, H310G, N434W | 1.42 × 10⁻⁸
21,044 | Q311R, M428F, N434F | 1.49 × 10⁻⁸
20,905 | K288G, H310G, H433T | 1.61 × 10⁻⁸
21,807 | M252W, I253D, N286R | 1.64 × 10⁻⁸
22,937 | M252R, K288S, H433S | 1.67 × 10⁻⁸
22,638 | M252R, I253G, H433G | 1.71 × 10⁻⁸
20,841 | I253D, H433A, Y436K | 1.74 × 10⁻⁸
25,914 | K288A, H310T, N434H | 1.76 × 10⁻⁸
21,608 | Q311R, M428N, N434F | 1.77 × 10⁻⁸
27,445 | K288R, H310D, H433Y | 1.79 × 10⁻⁸
26,619 | T256W, H435G, Y436N | 1.79 × 10⁻⁸
21,178 | P257V, N286D, T307Y | 1.83 × 10⁻⁸
24,744 | M252Q, T256N, H433Y | 1.84 × 10⁻⁸
21,756 | M252K, T256Y, H433N | 1.85 × 10⁻⁸
21,891 | N286R, T307R, Y436I | 1.85 × 10⁻⁸
MLR 3mut
Variant # | Mutations | KD
23,046 | L309Q, N434D, Y436L | 8.56 × 10⁻¹⁰
23,821 | H310N, H435G, Y436L | 1.90 × 10⁻⁹
20,005 | M252R, H310E, H433N | 1.90 × 10⁻⁹
21,325 | H310E, N434L, Y436K | 2.65 × 10⁻⁹
20,606 | H310R, N434F, Y436K | 2.72 × 10⁻⁹
25,146 | K288G, H310G, N434W | 3.56 × 10⁻⁹
23,660 | Q311R, M428F, N434F | 3.96 × 10⁻⁹
23,971 | K288G, H310G, H433T | 4.53 × 10⁻⁹
26,091 | M252W, I253D, N286R | 4.68 × 10⁻⁹
27,856 | M252R, K288S, H433S | 4.85 × 10⁻⁹
25,298 | M252R, I253G, H433G | 5.72 × 10⁻⁹
28,072 | I253D, H433A, Y436K | 5.80 × 10⁻⁹
28,288 | K288A, H310T, N434H | 5.88 × 10⁻⁹
27,058 | Q311R, M428N, N434F | 6.28 × 10⁻⁹
21,705 | K288R, H310D, H433Y | 6.42 × 10⁻⁹
25,221 | T256W, H435G, Y436N | 6.85 × 10⁻⁹
22,944 | P257V, N286D, T307Y | 6.95 × 10⁻⁹
22,506 | M252Q, T256N, H433Y | 7.37 × 10⁻⁹
25,804 | M252K, T256Y, H433N | 7.59 × 10⁻⁹
27,795 | N286R, T307R, Y436I | 7.70 × 10⁻⁹
SVR 3mut
Variant # | Mutations | KD
22,166 | L309Q, N434D, Y436L | 1.09 × 10⁻⁷
25,716 | H310N, H435G, Y436L | 1.16 × 10⁻⁷
26,932 | M252R, H310E, H433N | 1.26 × 10⁻⁷
20,339 | H310E, N434L, Y436K | 1.42 × 10⁻⁷
26,518 | H310R, N434F, Y436K | 1.44 × 10⁻⁷
27,880 | K288G, H310G, N434W | 1.46 × 10⁻⁷
21,576 | Q311R, M428F, N434F | 1.51 × 10⁻⁷
27,672 | K288G, H310G, H433T | 1.51 × 10⁻⁷
22,597 | M252W, I253D, N286R | 1.58 × 10⁻⁷
20,333 | M252R, K288S, H433S | 1.64 × 10⁻⁷
22,168 | M252R, I253G, H433G | 1.64 × 10⁻⁷
23,757 | I253D, H433A, Y436K | 1.69 × 10⁻⁷
21,145 | K288A, H310T, N434H | 1.70 × 10⁻⁷
20,273 | Q311R, M428N, N434F | 1.71 × 10⁻⁷
27,350 | K288R, H310D, H433Y | 1.72 × 10⁻⁷
22,409 | T256W, H435G, Y436N | 1.73 × 10⁻⁷
24,464 | P257V, N286D, T307Y | 1.73 × 10⁻⁷
26,610 | M252Q, T256N, H433Y | 1.74 × 10⁻⁷
27,113 | M252K, T256Y, H433N | 1.76 × 10⁻⁷
23,602 | N286R, T307R, Y436I | 1.77 × 10⁻⁷
RFR 5mut
Variant # | Mutations | KD
31,995 | M252K, H285Q, T307D, L309K, H433Y | 1.24 × 10⁻⁸
32,898 | M252R, N286Q, H310R, M428I, H435N | 1.49 × 10⁻⁸
31,966 | M252W, I253H, N286D, Q311K, H433P | 1.67 × 10⁻⁸
37,526 | L251I, L309K, H433N, N434Q, Y436L | 1.72 × 10⁻⁸
36,771 | I253F, N286R, T307Q, M428F, H435E | 1.85 × 10⁻⁸
33,965 | L251A, M252K, L309K, L314S, H433Y | 1.94 × 10⁻⁸
37,863 | L309K, Q311E, H433G, N434H, Y436R | 2.09 × 10⁻⁸
32,948 | S254N, T307A, L309K, H433T, Y436R | 2.45 × 10⁻⁸
30,050 | T256E, N286Q, Q311N, N434H, Y436N | 2.53 × 10⁻⁸
35,099 | I253D, P257S, Q311E, M428Y, H433D | 2.64 × 10⁻⁸
38,056 | N286E, L309D, L314H, H435N, Y436N | 2.90 × 10⁻⁸
34,584 | L251D, M252Q, N286D, V308A, H433A | 2.97 × 10⁻⁸
32,714 | M252W, S254T, V308F, H310R, H435S | 2.98 × 10⁻⁸
35,697 | M252K, N286R, L309Q, H435N, Y436F | 3.14 × 10⁻⁸
31,821 | L251H, M252Q, N286H, Q311N, H433A | 3.34 × 10⁻⁸
37,707 | T256S, Q311K, L314R, M428W, N434W | 3.44 × 10⁻⁸
38,325 | K288T, V308A, L309K, H433G, Y436K | 3.68 × 10⁻⁸
30,121 | I253W, N286Y, L309V, M428F, H433D | 3.72 × 10⁻⁸
32,551 | M252F, P257W, H285E, Q311E, N434H | 3.84 × 10⁻⁸
34,867 | H285E, N286Y, L309Y, M428H, H433Y | 3.88 × 10⁻⁸
MLP 5mut
Variant # | Mutations | KD
36,622 | M252K, H285Q, T307D, L309K, H433Y | 5.13 × 10⁻⁹
38,413 | M252R, N286Q, H310R, M428I, H435N | 7.18 × 10⁻⁹
32,294 | M252W, I253H, N286D, Q311K, H433P | 7.20 × 10⁻⁹
34,399 | L251I, L309K, H433N, N434Q, Y436L | 7.91 × 10⁻⁹
35,394 | I253F, N286R, T307Q, M428F, H435E | 8.58 × 10⁻⁹
34,608 | L251A, M252K, L309K, L314S, H433Y | 8.68 × 10⁻⁹
31,958 | L309K, Q311E, H433G, N434H, Y436R | 8.91 × 10⁻⁹
34,236 | S254N, T307A, L309K, H433T, Y436R | 9.17 × 10⁻⁹
36,343 | T256E, N286Q, Q311N, N434H, Y436N | 9.26 × 10⁻⁹
35,234 | I253D, P257S, Q311E, M428Y, H433D | 9.47 × 10⁻⁹
38,030 | N286E, L309D, L314H, H435N, Y436N | 9.53 × 10⁻⁹
30,188 | L251D, M252Q, N286D, V308A, H433A | 9.83 × 10⁻⁹
35,109 | M252W, S254T, V308F, H310R, H435S | 9.98 × 10⁻⁹
32,632 | M252K, N286R, L309Q, H435N, Y436F | 1.00 × 10⁻⁸
30,914 | L251H, M252Q, N286H, Q311N, H433A | 1.04 × 10⁻⁸
35,398 | T256S, Q311K, L314R, M428W, N434W | 1.05 × 10⁻⁸
32,539 | K288T, V308A, L309K, H433G, Y436K | 1.05 × 10⁻⁸
35,860 | I253W, N286Y, L309V, M428F, H433D | 1.06 × 10⁻⁸
35,943 | M252F, P257W, H285E, Q311E, N434H | 1.07 × 10⁻⁸
37,387 | H285E, N286Y, L309Y, M428H, H433Y | 1.07 × 10⁻⁸
MLR 5mut
Variant # | Mutations | KD
36,120 | M252K, H285Q, T307D, L309K, H433Y | 4.79 × 10⁻¹⁰
31,116 | M252R, N286Q, H310R, M428I, H435N | 5.14 × 10⁻¹⁰
37,434 | M252W, I253H, N286D, Q311K, H433P | 1.40 × 10⁻⁹
33,162 | L251I, L309K, H433N, N434Q, Y436L | 1.44 × 10⁻⁹
35,517 | I253F, N286R, T307Q, M428F, H435E | 1.66 × 10⁻⁹
37,684 | L251A, M252K, L309K, L314S, H433Y | 1.88 × 10⁻⁹
37,301 | L309K, Q311E, H433G, N434H, Y436R | 1.88 × 10⁻⁹
30,930 | S254N, T307A, L309K, H433T, Y436R | 1.94 × 10⁻⁹
36,097 | T256E, N286Q, Q311N, N434H, Y436N | 1.95 × 10⁻⁹
38,430 | I253D, P257S, Q311E, M428Y, H433D | 1.97 × 10⁻⁹
38,202 | N286E, L309D, L314H, H435N, Y436N | 2.09 × 10⁻⁹
37,863 | L251D, M252Q, N286D, V308A, H433A | 2.09 × 10⁻⁹
30,545 | M252W, S254T, V308F, H310R, H435S | 2.15 × 10⁻⁹
31,317 | M252K, N286R, L309Q, H435N, Y436F | 2.25 × 10⁻⁹
34,813 | L251H, M252Q, N286H, Q311N, H433A | 2.34 × 10⁻⁹
36,045 | T256S, Q311K, L314R, M428W, N434W | 2.40 × 10⁻⁹
38,596 | K288T, V308A, L309K, H433G, Y436K | 2.55 × 10⁻⁹
33,006 | I253W, N286Y, L309V, M428F, H433D | 2.69 × 10⁻⁹
33,288 | M252F, P257W, H285E, Q311E, N434H | 2.72 × 10⁻⁹
33,871 | H285E, N286Y, L309Y, M428H, H433Y | 2.75 × 10⁻⁹
SVR 5mut
Variant # | Mutations | KD
31,131 | M252K, H285Q, T307D, L309K, H433Y | 6.12 × 10⁻⁸
37,573 | M252R, N286Q, H310R, M428I, H435N | 8.12 × 10⁻⁸
34,132 | M252W, I253H, N286D, Q311K, H433P | 8.81 × 10⁻⁸
32,677 | L251I, L309K, H433N, N434Q, Y436L | 9.49 × 10⁻⁸
38,342 | I253F, N286R, T307Q, M428F, H435E | 1.10 × 10⁻⁷
31,134 | L251A, M252K, L309K, L314S, H433Y | 1.12 × 10⁻⁷
37,613 | L309K, Q311E, H433G, N434H, Y436R | 1.23 × 10⁻⁷
36,014 | S254N, T307A, L309K, H433T, Y436R | 1.29 × 10⁻⁷
32,967 | T256E, N286Q, Q311N, N434H, Y436N | 1.48 × 10⁻⁷
30,390 | I253D, P257S, Q311E, M428Y, H433D | 1.52 × 10⁻⁷
31,621 | N286E, L309D, L314H, H435N, Y436N | 1.58 × 10⁻⁷
30,551 | L251D, M252Q, N286D, V308A, H433A | 1.63 × 10⁻⁷
32,946 | M252W, S254T, V308F, H310R, H435S | 1.68 × 10⁻⁷
32,204 | M252K, N286R, L309Q, H435N, Y436F | 1.74 × 10⁻⁷
30,254 | L251H, M252Q, N286H, Q311N, H433A | 1.75 × 10⁻⁷
31,168 | T256S, Q311K, L314R, M428W, N434W | 1.77 × 10⁻⁷
32,902 | K288T, V308A, L309K, H433G, Y436K | 1.78 × 10⁻⁷
30,243 | I253W, N286Y, L309V, M428F, H433D | 1.79 × 10⁻⁷
30,417 | M252F, P257W, H285E, Q311E, N434H | 1.83 × 10⁻⁷
30,211 | H285E, N286Y, L309Y, M428H, H433Y | 1.84 × 10⁻⁷
RFR 8mut
Variant # | Mutations | KD
30,849 | M252W, T256P, N286K, L309K, Q311A, M428F, Y436G | 5.67 × 10⁻⁹
30,198 | L251R, I253T, H285N, N286D, L309K, M428F, N434D | 1.21 × 10⁻⁸
30,864 | L251T, M252Y, I253P, N286K, V308F, L309R, H433G | 1.52 × 10⁻⁸
30,501 | M252Y, I253E, H285I, N286D, V308A, N434H | 1.73 × 10⁻⁸
30,454 | M252Y, N286Q, K288F, L309W, Q311L, N434Y | 1.77 × 10⁻⁸
30,390 | R255Y, P257N, H285D, V308A, L309K, M428W, N434H | 1.94 × 10⁻⁸
30,947 | M252W, I253Y, R255F, N286E, L309D, Q311K, H433P | 2.32 × 10⁻⁸
30,169 | L251T, I253D, R255S, T256S, Q311A, M428F, H433G | 2.60 × 10⁻⁸
30,358 | M252E, P257V, L309K, M428L, H433F, H435K | 2.85 × 10⁻⁸
30,582 | P257A, H285I, T307W, M428W, N434H | 2.85 × 10⁻⁸
30,338 | R255Q, N286K, T307Q, L309P, M428F, H433I, H435R | 3.57 × 10⁻⁸
30,211 | H285E, N286K, T307R, L309E, Q311K, M428W, N434F | 3.59 × 10⁻⁸
30,974 | I253S, T256S, H285D, N286E, V308A, M428L, H435E | 3.94 × 10⁻⁸
30,696 | M252D, N286W, L309R, Q311V, N434H, H435K | 4.33 × 10⁻⁸
30,401 | M252W, I253D, P257A, V308F, L309E, N434W | 4.42 × 10⁻⁸
30,416 | I253P, T256V, N286R, Q311K, M428W, N434F | 4.79 × 10⁻⁸
30,777 | M252W, P257T, N286H, T307F, L309G, H433L | 5.22 × 10⁻⁸
30,280 | M252V, R255Q, T256N, N286H, M428F, N434Y, Y436K | 5.47 × 10⁻⁸
30,913 | L251P, T256P, N286Q, L309K, Q311I, Y436G | 5.66 × 10⁻⁸
30,948 | L251G, T256N, N286E, V308A, L309K, H433L, N434T | 5.66 × 10⁻⁸
MLP 8mut
Variant # | Mutations | KD
30,126 | M252W, T256P, N286K, L309K, Q311A, M428F, Y436G | 5.75 × 10−9
30,501 | L251R, I253T, H285N, N286D, L309K, M428F, N434D | 6.73 × 10−9
30,603 | L251T, M252Y, I253P, N286K, V308F, L309R, H433G | 7.09 × 10−9
30,947 | M252Y, I253E, H285I, N286D, V308A, N434H | 8.43 × 10−9
30,070 | M252Y, N286Q, K288F, L309W, Q311L, N434Y | 9.86 × 10−9
30,582 | R255Y, P257N, H285D, V308A, L309K, M428W, N434H | 1.09 × 10−8
30,198 | M252W, I253Y, R255F, N286E, L309D, Q311K, H433P | 1.14 × 10−8
30,822 | L251T, I253D, R255S, T256S, Q311A, M428F, H433G | 1.18 × 10−8
30,554 | M252E, P257V, L309K, M428L, H433F, H435K | 1.37 × 10−8
30,950 | P257A, H285I, T307W, M428W, N434H | 1.40 × 10−8
30,154 | R255Q, N286K, T307Q, L309P, M428F, H433I, H435R | 1.46 × 10−8
30,942 | H285E, N286K, T307R, L309E, Q311K, M428W, N434F | 1.47 × 10−8
30,842 | I253S, T256S, H285D, N286E, V308A, M428L, H435E | 1.49 × 10−8
30,454 | M252D, N286W, L309R, Q311V, N434H, H435K | 1.52 × 10−8
30,259 | M252W, I253D, P257A, V308F, L309E, N434W | 1.55 × 10−8
30,042 | I253P, T256V, N286R, Q311K, M428W, N434F | 1.59 × 10−8
30,782 | M252W, P257T, N286H, T307F, L309G, H433L | 1.60 × 10−8
30,241 | M252V, R255Q, T256N, N286H, M428F, N434Y, Y436K | 1.62 × 10−8
30,171 | L251P, T256P, N286Q, L309K, Q311I, Y436G | 1.64 × 10−8
30,365 | L251G, T256N, N286E, V308A, L309K, H433L, N434T | 1.68 × 10−8
MLR 8mut
Variant # | Mutations | KD
30,835 | M252W, T256P, N286K, L309K, Q311A, M428F, Y436G | 2.34 × 10−11
30,259 | L251R, I253T, H285N, N286D, L309K, M428F, N434D | 8.60 × 10−11
30,184 | L251T, M252Y, I253P, N286K, V308F, L309R, H433G | 3.28 × 10−10
30,558 | M252Y, I253E, H285I, N286D, V308A, N434H | 4.17 × 10−10
30,317 | M252Y, N286Q, K288F, L309W, Q311L, N434Y | 5.83 × 10−10
30,395 | R255Y, P257N, H285D, V308A, L309K, M428W, N434H | 6.03 × 10−10
30,787 | M252W, I253Y, R255F, N286E, L309D, Q311K, H433P | 6.18 × 10−10
30,968 | L251T, I253D, R255S, T256S, Q311A, M428F, H433G | 1.34 × 10−9
30,762 | M252E, P257V, L309K, M428L, H433F, H435K | 1.36 × 10−9
30,253 | P257A, H285I, T307W, M428W, N434H | 1.68 × 10−9
30,500 | R255Q, N286K, T307Q, L309P, M428F, H433I, H435R | 2.17 × 10−9
30,926 | H285E, N286K, T307R, L309E, Q311K, M428W, N434F | 2.26 × 10−9
30,023 | I253S, T256S, H285D, N286E, V308A, M428L, H435E | 2.27 × 10−9
30,515 | M252D, N286W, L309R, Q311V, N434H, H435K | 2.49 × 10−9
30,087 | M252W, I253D, P257A, V308F, L309E, N434W | 2.51 × 10−9
30,209 | I253P, T256V, N286R, Q311K, M428W, N434F | 3.25 × 10−9
30,832 | M252W, P257T, N286H, T307F, L309G, H433L | 3.36 × 10−9
30,947 | M252V, R255Q, T256N, N286H, M428F, N434Y, Y436K | 3.49 × 10−9
30,179 | L251P, T256P, N286Q, L309K, Q311I, Y436G | 3.51 × 10−9
30,577 | L251G, T256N, N286E, V308A, L309K, H433L, N434T | 3.74 × 10−9
SVR 8mut
Variant # | Mutations | KD
30,401 | M252W, T256P, N286K, L309K, Q311A, M428F, Y436G | 1.53 × 10−7
30,245 | L251R, I253T, H285N, N286D, L309K, M428F, N434D | 1.71 × 10−7
30,105 | L251T, M252Y, I253P, N286K, V308F, L309R, H433G | 1.84 × 10−7
30,625 | M252Y, I253E, H285I, N286D, V308A, N434H | 1.91 × 10−7
30,022 | M252Y, N286Q, K288F, L309W, Q311L, N434Y | 1.92 × 10−7
30,142 | R255Y, P257N, H285D, V308A, L309K, M428W, N434H | 1.96 × 10−7
30,501 | M252W, I253Y, R255F, N286E, L309D, Q311K, H433P | 2.02 × 10−7
30,097 | L251T, I253D, R255S, T256S, Q311A, M428F, H433G | 2.03 × 10−7
30,974 | M252E, P257V, L309K, M428L, H433F, H435K | 2.04 × 10−7
30,684 | P257A, H285I, T307W, M428W, N434H | 2.04 × 10−7
30,186 | R255Q, N286K, T307Q, L309P, M428F, H433I, H435R | 2.04 × 10−7
30,955 | H285E, N286K, T307R, L309E, Q311K, M428W, N434F | 2.04 × 10−7
30,582 | I253S, T256S, H285D, N286E, V308A, M428L, H435E | 2.05 × 10−7
30,012 | M252D, N286W, L309R, Q311V, N434H, H435K | 2.05 × 10−7
30,905 | M252W, I253D, P257A, V308F, L309E, N434W | 2.05 × 10−7
30,502 | I253P, T256V, N286R, Q311K, M428W, N434F | 2.05 × 10−7
30,785 | M252W, P257T, N286H, T307F, L309G, H433L | 2.05 × 10−7
30,685 | M252V, R255Q, T256N, N286H, M428F, N434Y, Y436K | 2.06 × 10−7
30,261 | L251P, T256P, N286Q, L309K, Q311I, Y436G | 2.06 × 10−7
30,206 | L251G, T256N, N286E, V308A, L309K, H433L, N434T | 2.06 × 10−7
Model FLS
RFR Learning_Set
Variant # | Mutations | KD
833 | 235R, 239K, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 3.66 × 10−9
831 | 250V, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 3.79 × 10−9
832 | 235R, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 3.92 × 10−9
802 | 235R, 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 4.09 × 10−9
829 | 235K, 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 4.09 × 10−9
800 | 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 4.24 × 10−9
801 | 235R, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 4.39 × 10−9
828 | 235K, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 4.39 × 10−9
568 | 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y | 4.71 × 10−9
1 | 239K, 252Y, 270F, 286E, 307Q, 308P, 311A, 428I, 434Y | 5.40 × 10−9
567 | 252Y, 286E, 307Q, 308P, 311A, 434Y | 7.07 × 10−9
5 | 239K, 252Y, 270F, 286E, 307Q, 308P, 428I, 434Y | 7.27 × 10−9
2 | 239K, 252W, 286E, 308P, 428Y, 434Y | 7.34 × 10−9
7 | 239K, 252Y, 270F, 286E, 308P, 387E, 428I, 434Y | 7.60 × 10−9
527 | 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y | 7.90 × 10−9
6 | 239K, 252Y, 286E, 308P, 428I, 434Y | 7.95 × 10−9
8 | 239K, 252Y, 270F, 286E, 308P, 428I, 434Y | 8.12 × 10−9
4 | 239K, 252Y, 270F, 286E, 308P, 311A, 428I, 434Y | 8.41 × 10−9
3 | 239K, 252W, 256E, 286E, 308P, 428Y, 434Y | 8.59 × 10−9
565 | 252Y, 286E, 308P, 428I, 434Y | 1.03 × 10−8
MLP learning_set
Variant # | Mutations | KD
20 | 235R, 239K, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 2.96 × 10−8
90 | 250V, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 3.42 × 10−8
40 | 235R, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 3.42 × 10−8
566 | 235R, 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 3.51 × 10−8
570 | 235K, 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 3.51 × 10−8
569 | 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 3.51 × 10−8
204 | 235R, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 3.51 × 10−8
119 | 235K, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 3.51 × 10−8
110 | 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y | 3.51 × 10−8
44 | 239K, 252Y, 270F, 286E, 307Q, 308P, 311A, 428I, 434Y | 3.51 × 10−8
131 | 252Y, 286E, 307Q, 308P, 311A, 434Y | 3.51 × 10−8
581 | 239K, 252Y, 270F, 286E, 307Q, 308P, 428I, 434Y | 3.54 × 10−8
23 | 239K, 252W, 286E, 308P, 428Y, 434Y | 3.54 × 10−8
19 | 239K, 252Y, 270F, 286E, 308P, 387E, 428I, 434Y | 3.55 × 10−8
568 | 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y | 3.57 × 10−8
1 | 239K, 252Y, 286E, 308P, 428I, 434Y | 3.57 × 10−8
5 | 239K, 252Y, 270F, 286E, 308P, 428I, 434Y | 3.59 × 10−8
53 | 239K, 252Y, 270F, 286E, 308P, 311A, 428I, 434Y | 3.80 × 10−8
98 | 239K, 252W, 256E, 286E, 308P, 428Y, 434Y | 3.80 × 10−8
59 | 252Y, 286E, 308P, 428I, 434Y | 3.84 × 10−8
MLR learning_set
Variant # | Mutations | KD
684 | 235R, 239K, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 9.96 × 10−9
163 | 250V, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 1.79 × 10−8
167 | 235R, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 2.03 × 10−8
216 | 235R, 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 2.03 × 10−8
182 | 235K, 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 2.55 × 10−8
128 | 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 2.60 × 10−8
94 | 235R, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 2.64 × 10−8
231 | 235K, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 2.64 × 10−8
120 | 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y | 2.68 × 10−8
495 | 239K, 252Y, 270F, 286E, 307Q, 308P, 311A, 428I, 434Y | 2.69 × 10−8
24 | 252Y, 286E, 307Q, 308P, 311A, 434Y | 2.69 × 10−8
192 | 239K, 252Y, 270F, 286E, 307Q, 308P, 428I, 434Y | 2.84 × 10−8
127 | 239K, 252W, 286E, 308P, 428Y, 434Y | 2.84 × 10−8
145 | 239K, 252Y, 270F, 286E, 308P, 387E, 428I, 434Y | 2.85 × 10−8
496 | 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y | 2.89 × 10−8
235 | 239K, 252Y, 286E, 308P, 428I, 434Y | 2.89 × 10−8
130 | 239K, 252Y, 270F, 286E, 308P, 428I, 434Y | 2.89 × 10−8
77 | 239K, 252Y, 270F, 286E, 308P, 311A, 428I, 434Y | 2.89 × 10−8
82 | 239K, 252W, 256E, 286E, 308P, 428Y, 434Y | 2.89 × 10−8
107 | 252Y, 286E, 308P, 428I, 434Y | 2.89 × 10−8
SVR learning_set
Variant # | Mutations | KD
243 | 235R, 239K, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 7.25 × 10−9
276 | 250V, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 7.26 × 10−9
208 | 235R, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y, 436V | 1.29 × 10−8
8 | 235R, 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 1.31 × 10−8
7 | 235K, 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 1.31 × 10−8
6 | 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 1.31 × 10−8
565 | 235R, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 1.31 × 10−8
4 | 235K, 239K, 250V, 252Y, 286E, 307Q, 308P, 311A, 434Y, 436V | 1.31 × 10−8
800 | 252Y, 286E, 307Q, 308P, 311A, 428I, 434Y | 1.34 × 10−8
828 | 239K, 252Y, 270F, 286E, 307Q, 308P, 311A, 428I, 434Y | 1.34 × 10−8
801 | 252Y, 286E, 307Q, 308P, 311A, 434Y | 1.34 × 10−8
802 | 239K, 252Y, 270F, 286E, 307Q, 308P, 428I, 434Y | 1.34 × 10−8
829 | 239K, 252W, 286E, 308P, 428Y, 434Y | 1.34 × 10−8
633 | 239K, 252Y, 270F, 286E, 308P, 387E, 428I, 434Y | 1.36 × 10−8
59 | 239K, 252Y, 286E, 307Q, 308P, 311A, 434Y | 1.42 × 10−8
43 | 239K, 252Y, 286E, 308P, 428I, 434Y | 1.67 × 10−8
38 | 239K, 252Y, 270F, 286E, 308P, 428I, 434Y | 1.67 × 10−8
37 | 239K, 252Y, 270F, 286E, 308P, 311A, 428I, 434Y | 1.67 × 10−8
42 | 239K, 252W, 256E, 286E, 308P, 428Y, 434Y | 1.67 × 10−8
41 | 252Y, 286E, 308P, 428I, 434Y | 1.67 × 10−8
RFR 3mut
Variant # | Mutations | KD
24,936 | H310L, N434H, Y436Q | 3.32 × 10−8
23,681 | T307K, H310D, N434F | 3.80 × 10−8
22,900 | P257H, H310G, N434F | 4.03 × 10−8
22,177 | K288G, H310G, N434W | 4.59 × 10−8
24,569 | H310V, Q311Y, N434F | 4.84 × 10−8
23,303 | H310D, H433S, N434F | 4.85 × 10−8
24,652 | H310E, Q311G, N434F | 4.88 × 10−8
22,597 | H285W, H310E, N434W | 4.98 × 10−8
20,285 | P257I, T307G, N434F | 5.05 × 10−8
23,152 | I253E, H310A, N434W | 5.38 × 10−8
27,018 | H310A, M428K, N434F | 6.27 × 10−8
25,958 | K288A, H310T, N434H | 6.53 × 10−8
26,256 | R255K, T307K, N434F | 8.46 × 10−8
23,826 | S254A, T307D, N434F | 9.36 × 10−8
26,389 | R255K, T307F, N434W | 9.95 × 10−8
21,973 | H285T, L309N, N434F | 1.04 × 10−7
27,024 | M252F, L309D, N434H | 1.17 × 10−7
26,949 | T256L, V308P, N434W | 1.26 × 10−7
20,052 | L309G, M428L, N434F | 1.27 × 10−7
23,029 | R255Y, T307S, N434H | 1.28 × 10−7
MLP 3mut
Variant # | Mutations | KD
20,285 | H310L, N434H, Y436Q | 2.53 × 10−8
23,242 | T307K, H310D, N434F | 2.71 × 10−8
26,256 | P257H, H310G, N434F | 3.19 × 10−8
22,900 | K288G, H310G, N434W | 3.77 × 10−8
20,052 | H310V, Q311Y, N434F | 4.56 × 10−8
26,389 | H310D, H433S, N434F | 5.99 × 10−8
23,681 | H310E, Q311G, N434F | 6.05 × 10−8
22,715 | H285W, H310E, N434W | 6.82 × 10−8
23,826 | P257I, T307G, N434F | 6.94 × 10−8
21,576 | I253E, H310A, N434W | 7.11 × 10−8
20,293 | H310A, M428K, N434F | 7.80 × 10−8
21,093 | K288A, H310T, N434H | 8.15 × 10−8
22,166 | R255K, T307K, N434F | 8.20 × 10−8
21,460 | S254A, T307D, N434F | 9.15 × 10−8
27,689 | R255K, T307F, N434W | 1.10 × 10−7
20,625 | H285T, L309N, N434F | 1.12 × 10−7
26,125 | M252F, L309D, N434H | 1.15 × 10−7
22,533 | T256L, V308P, N434W | 1.15 × 10−7
26,616 | L309G, M428L, N434F | 1.16 × 10−7
20,670 | R255Y, T307S, N434H | 1.21 × 10−7
MLR 3mut
Variant # | Mutations | KD
20,285 | H310L, N434H, Y436Q | 2.00 × 10−8
23,242 | T307K, H310D, N434F | 2.28 × 10−8
26,256 | P257H, H310G, N434F | 3.15 × 10−8
22,900 | K288G, H310G, N434W | 3.73 × 10−8
21,460 | H310V, Q311Y, N434F | 5.02 × 10−8
20,052 | H310D, H433S, N434F | 5.07 × 10−8
23,681 | H310E, Q311G, N434F | 5.34 × 10−8
23,826 | H285W, H310E, N434W | 6.64 × 10−8
20,625 | P257I, T307G, N434F | 6.95 × 10−8
26,616 | I253E, H310A, N434W | 7.54 × 10−8
20,293 | H310A, M428K, N434F | 7.62 × 10−8
21,093 | K288A, H310T, N434H | 7.68 × 10−8
23,441 | R255K, T307K, N434F | 7.70 × 10−8
27,689 | S254A, T307D, N434F | 8.09 × 10−8
25,977 | R255K, T307F, N434W | 8.61 × 10−8
21,576 | H285T, L309N, N434F | 8.84 × 10−8
26,389 | M252F, L309D, N434H | 8.84 × 10−8
22,715 | T256L, V308P, N434W | 9.07 × 10−8
22,166 | L309G, M428L, N434F | 9.34 × 10−8
27,485 | R255Y, T307S, N434H | 1.02 × 10−7
SVR 3mut
Variant # | Mutations | KD
22,166 | H310L, N434H, Y436Q | 3.15 × 10−8
22,050 | T307K, H310D, N434F | 3.63 × 10−8
26,109 | P257H, H310G, N434F | 5.30 × 10−8
20,052 | K288G, H310G, N434W | 1.15 × 10−7
23,242 | H310V, Q311Y, N434F | 1.21 × 10−7
23,889 | H310D, H433S, N434F | 1.28 × 10−7
25,411 | H310E, Q311G, N434F | 1.40 × 10−7
22,409 | H285W, H310E, N434W | 1.66 × 10−7
25,263 | P257I, T307G, N434F | 1.69 × 10−7
21,432 | I253E, H310A, N434W | 1.70 × 10−7
22,743 | H310A, M428K, N434F | 1.78 × 10−7
22,378 | K288A, H310T, N434H | 1.82 × 10−7
20,285 | R255K, T307K, N434F | 1.88 × 10−7
23,826 | S254A, T307D, N434F | 1.93 × 10−7
24,481 | R255K, T307F, N434W | 2.00 × 10−7
21,447 | H285T, L309N, N434F | 2.01 × 10−7
23,303 | M252F, L309D, N434H | 2.05 × 10−7
28,002 | T256L, V308P, N434W | 2.15 × 10−7
26,447 | L309G, M428L, N434F | 2.32 × 10−7
25,009 | R255Y, T307S, N434H | 2.35 × 10−7
RFR 5mut
Variant # | Mutations | KD
37,435 | S254G, T256Q, H310I, Q311W, N434H | 1.07 × 10−8
35,309 | M252Q, R255K, L309N, N434H, Y436R | 1.16 × 10−8
35,463 | H285Y, N286R, H310N, H433Y, N434H | 1.36 × 10−8
37,379 | S254T, K288G, H310S, M428V, N434F | 2.91 × 10−8
38,616 | M252R, P257E, T307Q, V308L, N434H | 4.50 × 10−8
31,621 | I253E, V308Y, H310D, L314I, N434Y | 5.33 × 10−8
30,182 | I253T, K288N, V308T, H310D, N434W | 5.58 × 10−8
37,088 | R255K, T256Q, V308H, L314S, N434Y | 5.96 × 10−8
31,104 | S254A, V308D, H310I, M428Q, N434H | 6.29 × 10−8
37,174 | M252P, T307Y, V308R, Q311E, N434F | 6.62 × 10−8
33,328 | S254G, H285S, N286F, L309A, N434Y | 6.77 × 10−8
33,978 | M252F, R255A, N286R, H310N, N434H | 7.29 × 10−8
31,320 | V308I, L309N, M428S, N434W, H435D | 8.29 × 10−8
33,091 | R255K, T307W, Q311D, H433Q, N434F | 8.31 × 10−8
31,232 | I253P, P257G, T307E, L309Y, N434H | 8.36 × 10−8
33,342 | I253D, S254D, T256V, T307G, N434F | 8.43 × 10−8
33,591 | M252T, I253V, N286G, H310L, N434H | 8.67 × 10−8
30,984 | P257K, T307A, V308F, M428H, N434Y | 8.69 × 10−8
30,585 | M252T, P257H, V308Y, L309F, N434F | 8.74 × 10−8
38,371 | M252E, S254F, P257Y, T307A, N434H | 8.88 × 10−8
MLP 5mut
Variant # | Mutations | KD
33,091 | S254G, T256Q, H310I, Q311W, N434H | 3.22 × 10−8
37,254 | M252Q, R255K, L309N, N434H, Y436R | 3.28 × 10−8
33,646 | H285Y, N286R, H310N, H433Y, N434H | 3.42 × 10−8
34,469 | S254T, K288G, H310S, M428V, N434F | 4.33 × 10−8
34,320 | M252R, P257E, T307Q, V308L, N434H | 4.47 × 10−8
32,501 | I253E, V308Y, H310D, L314I, N434Y | 4.61 × 10−8
34,132 | I253T, K288N, V308T, H310D, N434W | 5.00 × 10−8
30,984 | R255K, T256Q, V308H, L314S, N434Y | 5.02 × 10−8
32,098 | S254A, V308D, H310I, M428Q, N434H | 5.06 × 10−8
34,494 | M252P, T307Y, V308R, Q311E, N434F | 5.83 × 10−8
34,889 | S254G, H285S, N286F, L309A, N434Y | 5.97 × 10−8
33,342 | M252F, R255A, N286R, H310N, N434H | 6.06 × 10−8
31,505 | V308I, L309N, M428S, N434W, H435D | 6.15 × 10−8
35,586 | R255K, T307W, Q311D, H433Q, N434F | 6.60 × 10−8
37,174 | I253P, P257G, T307E, L309Y, N434H | 6.85 × 10−8
31,465 | I253D, S254D, T256V, T307G, N434F | 7.08 × 10−8
36,149 | M252T, I253V, N286G, H310L, N434H | 7.21 × 10−8
33,080 | P257K, T307A, V308F, M428H, N434Y | 7.22 × 10−8
37,661 | M252T, P257H, V308Y, L309F, N434F | 7.23 × 10−8
30,906 | M252E, S254F, P257Y, T307A, N434H | 7.32 × 10−8
MLR 5mut
Variant # | Mutations | KD
30,906 | S254G, T256Q, H310I, Q311W, N434H | 1.22 × 10−8
37,580 | M252Q, R255K, L309N, N434H, Y436R | 1.33 × 10−8
33,885 | H285Y, N286R, H310N, H433Y, N434H | 1.39 × 10−8
32,098 | S254T, K288G, H310S, M428V, N434F | 1.75 × 10−8
33,646 | M252R, P257E, T307Q, V308L, N434H | 3.23 × 10−8
34,320 | I253E, V308Y, H310D, L314I, N434Y | 3.36 × 10−8
34,469 | I253T, K288N, V308T, H310D, N434W | 3.62 × 10−8
33,091 | R255K, T256Q, V308H, L314S, N434Y | 3.74 × 10−8
37,954 | S254A, V308D, H310I, M428Q, N434H | 4.06 × 10−8
30,984 | M252P, T307Y, V308R, Q311E, N434F | 4.59 × 10−8
37,254 | S254G, H285S, N286F, L309A, N434Y | 4.88 × 10−8
34,889 | M252F, R255A, N286R, H310N, N434H | 5.60 × 10−8
33,342 | V308I, L309N, M428S, N434W, H435D | 5.62 × 10−8
31,505 | R255K, T307W, Q311D, H433Q, N434F | 6.39 × 10−8
33,080 | I253P, P257G, T307E, L309Y, N434H | 6.61 × 10−8
34,248 | I253D, S254D, T256V, T307G, N434F | 6.68 × 10−8
35,586 | M252T, I253V, N286G, H310L, N434H | 6.86 × 10−8
33,509 | P257K, T307A, V308F, M428H, N434Y | 6.93 × 10−8
37,174 | M252T, P257H, V308Y, L309F, N434F | 7.15 × 10−8
34,132 | M252E, S254F, P257Y, T307A, N434H | 7.67 × 10−8
SVR 5mut
Variant # | Mutations | KD
31,465 | S254G, T256Q, H310I, Q311W, N434H | 9.50 × 10−9
33,646 | M252Q, R255K, L309N, N434H, Y436R | 2.70 × 10−8
30,585 | H285Y, N286R, H310N, H433Y, N434H | 2.83 × 10−8
31,612 | S254T, K288G, H310S, M428V, N434F | 3.98 × 10−8
30,423 | M252R, P257E, T307Q, V308L, N434H | 4.17 × 10−8
31,505 | I253E, V308Y, H310D, L314I, N434Y | 4.66 × 10−8
38,216 | I253T, K288N, V308T, H310D, N434W | 6.22 × 10−8
34,132 | R255K, T256Q, V308H, L314S, N434Y | 7.10 × 10−8
32,098 | S254A, V308D, H310I, M428Q, N434H | 7.50 × 10−8
37,379 | M252P, T307Y, V308R, Q311E, N434F | 9.68 × 10−8
33,080 | S254G, H285S, N286F, L309A, N434Y | 1.03 × 10−7
36,338 | M252F, R255A, N286R, H310N, N434H | 1.03 × 10−7
31,469 | V308I, L309N, M428S, N434W, H435D | 1.14 × 10−7
37,661 | R255K, T307W, Q311D, H433Q, N434F | 1.23 × 10−7
36,149 | I253P, P257G, T307E, L309Y, N434H | 1.30 × 10−7
37,777 | I253D, S254D, T256V, T307G, N434F | 1.37 × 10−7
34,998 | M252T, I253V, N286G, H310L, N434H | 1.37 × 10−7
38,029 | P257K, T307A, V308F, M428H, N434Y | 1.44 × 10−7
34,712 | M252T, P257H, V308Y, L309F, N434F | 1.49 × 10−7
32,754 | M252E, S254F, P257Y, T307A, N434H | 1.51 × 10−7
RFR 8mut
Variant # | Mutations | KD
30,401 | M252W, I253D, P257A, V308F, L309E, N434W | 3.06 × 10−8
30,320 | L251Q, P257S, N286P, V308W, L309E, Q311A, N434H | 5.89 × 10−8
30,747 | M252G, T256A, L309D, N434W, H435E | 6.07 × 10−8
30,663 | I253S, P257V, K288G, T307G, N434H, Y436S | 7.74 × 10−8
30,083 | L251P, P257T, K288N, T307R, V308P, L309K, N434H | 8.02 × 10−8
30,582 | P257A, H285I, T307W, M428W, N434H | 9.00 × 10−8
30,549 | T256G, H285D, T307Y, L309T, N434F, H435T | 9.50 × 10−8
30,647 | M252W, T256P, P257A, K288L, T307S, M428I, N434H | 9.59 × 10−8
30,596 | L251Q, K288F, T307I, L309K, Q311T, M428I, N434W | 1.01 × 10−7
30,548 | L251R, M252H, V308R, L309D, N434H, Y436G | 1.06 × 10−7
30,915 | K288E, T307E, V308N, L309W, M428W, N434Y | 1.07 × 10−7
30,501 | M252Y, I253E, H285I, N286D, V308A, N434H | 1.10 × 10−7
30,848 | M252V, T256A, L309G, H433S, N434W, H435K | 1.11 × 10−7
30,912 | R255S, P257N, H285R, L309D, M428I, N434Y | 1.14 × 10−7
30,116 | M252I, I253S, N434W, H435P, Y436K | 1.17 × 10−7
30,780 | I253T, N286Q, V308P, Q311A, N434Y, Y436S | 1.21 × 10−7
30,625 | T256N, N286L, K288P, T307P, Q311A, M428L, N434Y | 1.23 × 10−7
30,245 | M252E, P257T, H285N, V308P, Q311L, N434Y | 1.24 × 10−7
30,045 | P257Y, Q311T, M428L, N434Y, H435P | 1.36 × 10−7
30,560 | P257A, K288T, T307F, Q311V, N434H, H435K | 1.38 × 10−7
MLP 8mut
Variant # | Mutations | KD
30,829 | M252W, I253D, P257A, V308F, L309E, N434W | 4.39 × 10−8
30,549 | L251Q, P257S, N286P, V308W, L309E, Q311A, N434H | 4.75 × 10−8
30,625 | M252G, T256A, L309D, N434W, H435E | 6.72 × 10−8
30,061 | I253S, P257V, K288G, T307G, N434H, Y436S | 7.36 × 10−8
30,860 | L251P, P257T, K288N, T307R, V308P, L309K, N434H | 8.15 × 10−8
30,721 | P257A, H285I, T307W, M428W, N434H | 8.68 × 10−8
30,045 | T256G, H285D, T307Y, L309T, N434F, H435T | 9.13 × 10−8
30,234 | M252W, T256P, P257A, K288L, T307S, M428I, N434H | 9.13 × 10−8
30,852 | L251Q, K288F, T307I, L309K, Q311T, M428I, N434W | 9.38 × 10−8
30,063 | L251R, M252H, V308R, L309D, N434H, Y436G | 9.75 × 10−8
30,022 | K288E, T307E, V308N, L309W, M428W, N434Y | 1.09 × 10−7
30,565 | M252Y, I253E, H285I, N286D, V308A, N434H | 1.09 × 10−7
30,490 | M252V, T256A, L309G, H433S, N434W, H435K | 1.10 × 10−7
30,669 | R255S, P257N, H285R, L309D, M428I, N434Y | 1.13 × 10−7
30,401 | M252I, I253S, N434W, H435P, Y436K | 1.13 × 10−7
30,583 | I253T, N286Q, V308P, Q311A, N434Y, Y436S | 1.17 × 10−7
30,245 | T256N, N286L, K288P, T307P, Q311A, M428L, N434Y | 1.21 × 10−7
30,211 | M252E, P257T, H285N, V308P, Q311L, N434Y | 1.26 × 10−7
30,596 | P257Y, Q311T, M428L, N434Y, H435P | 1.42 × 10−7
30,683 | P257A, K288T, T307F, Q311V, N434H, H435K | 1.43 × 10−7
MLR 8mut
Variant # | Mutations | KD
30,829 | M252W, I253D, P257A, V308F, L309E, N434W | 2.17 × 10−8
30,669 | L251Q, P257S, N286P, V308W, L309E, Q311A, N434H | 2.84 × 10−8
30,549 | M252G, T256A, L309D, N434W, H435E | 4.85 × 10−8
30,583 | I253S, P257V, K288G, T307G, N434H, Y436S | 7.34 × 10−8
30,022 | L251P, P257T, K288N, T307R, V308P, L309K, N434H | 7.90 × 10−8
30,721 | P257A, H285I, T307W, M428W, N434H | 8.09 × 10−8
30,063 | T256G, H285D, T307Y, L309T, N434F, H435T | 8.84 × 10−8
30,100 | M252W, T256P, P257A, K288L, T307S, M428I, N434H | 1.05 × 10−7
30,625 | L251Q, K288F, T307I, L309K, Q311T, M428I, N434W | 1.05 × 10−7
30,401 | L251R, M252H, V308R, L309D, N434H, Y436G | 1.11 × 10−7
30,683 | K288E, T307E, V308N, L309W, M428W, N434Y | 1.11 × 10−7
30,565 | M252Y, I253E, H285I, N286D, V308A, N434H | 1.12 × 10−7
30,225 | M252V, T256A, L309G, H433S, N434W, H435K | 1.13 × 10−7
30,045 | R255S, P257N, H285R, L309D, M428I, N434Y | 1.24 × 10−7
30,605 | M252I, I253S, N434W, H435P, Y436K | 1.30 × 10−7
30,860 | I253T, N286Q, V308P, Q311A, N434Y, Y436S | 1.36 × 10−7
30,061 | T256N, N286L, K288P, T307P, Q311A, M428L, N434Y | 1.40 × 10−7
30,245 | M252E, P257T, H285N, V308P, Q311L, N434Y | 1.52 × 10−7
30,211 | P257Y, Q311T, M428L, N434Y, H435P | 1.66 × 10−7
30,085 | P257A, K288T, T307F, Q311V, N434H, H435K | 1.67 × 10−7
SVR 8mut
Variant # | Mutations | KD
30,501 | M252W, I253D, P257A, V308F, L309E, N434W | 3.20 × 10−8
30,401 | L251Q, P257S, N286P, V308W, L309E, Q311A, N434H | 4.26 × 10−8
30,848 | M252G, T256A, L309D, N434W, H435E | 8.11 × 10−8
30,479 | I253S, P257V, K288G, T307G, N434H, Y436S | 1.19 × 10−7
30,829 | L251P, P257T, K288N, T307R, V308P, L309K, N434H | 1.26 × 10−7
30,045 | P257A, H285I, T307W, M428W, N434H | 1.27 × 10−7
30,397 | T256G, H285D, T307Y, L309T, N434F, H435T | 1.69 × 10−7
30,116 | M252W, T256P, P257A, K288L, T307S, M428I, N434H | 1.70 × 10−7
30,157 | L251Q, K288F, T307I, L309K, Q311T, M428I, N434W | 1.74 × 10−7
30,336 | L251R, M252H, V308R, L309D, N434H, Y436G | 1.86 × 10−7
30,061 | K288E, T307E, V308N, L309W, M428W, N434Y | 1.88 × 10−7
30,560 | M252Y, I253E, H285I, N286D, V308A, N434H | 1.89 × 10−7
30,549 | M252V, T256A, L309G, H433S, N434W, H435K | 2.00 × 10−7
30,891 | R255S, P257N, H285R, L309D, M428I, N434Y | 2.26 × 10−7
30,228 | M252I, I253S, N434W, H435P, Y436K | 2.27 × 10−7
30,911 | I253T, N286Q, V308P, Q311A, N434Y, Y436S | 2.31 × 10−7
30,386 | T256N, N286L, K288P, T307P, Q311A, M428L, N434Y | 2.50 × 10−7
30,605 | M252E, P257T, H285N, V308P, Q311L, N434Y | 2.62 × 10−7
30,150 | P257Y, Q311T, M428L, N434Y, H435P | 2.76 × 10−7
30,924 | P257A, K288T, T307F, Q311V, N434H, H435K | 2.79 × 10−7
Figure A1. Model with FoldX.
Figure A2. SPR experiments’ bivalent fit for variants A3, B5, C7, and T8.
Figure A3. SPR measurements of variant T3 at pH 7.0 (bivalent fit).

References

  1. Zalevsky, J.; Chamberlain, A.K.; Horton, H.M.; Karki, S.; Leung, I.W.L.; Sproule, T.J.; Lazar, G.A.; Roopenian, D.C.; Desjarlais, J.R. Enhanced Antibody Half-Life Improves in Vivo Activity. Nat. Biotechnol. 2010, 28, 157–159.
  2. Ko, S.-Y.; Pegu, A.; Rudicell, R.S.; Yang, Z.; Joyce, M.G.; Chen, X.; Wang, K.; Bao, S.; Kraemer, T.D.; Rath, T.; et al. Enhanced Neonatal Fc Receptor Function Improves Protection against Primate SHIV Infection. Nature 2014, 514, 642–645.
  3. Ramdani, Y.; Lamamy, J.; Watier, H.; Gouilleux-Gruart, V. Monoclonal Antibody Engineering and Design to Modulate FcRn Activities: A Comprehensive Review. Int. J. Mol. Sci. 2022, 23, 9604.
  4. Liu, L. Pharmacokinetics of Monoclonal Antibodies and Fc-Fusion Proteins. Protein Cell 2018, 9, 15–32.
  5. Ternant, D.; Arnoult, C.; Pugnière, M.; Dhommée, C.; Drocourt, D.; Perouzel, E.; Passot, C.; Baroukh, N.; Mulleman, D.; Tiraby, G.; et al. IgG1 Allotypes Influence the Pharmacokinetics of Therapeutic Monoclonal Antibodies through FcRn Binding. J. Immunol. 2016, 196, 607–613.
  6. Vidarsson, G.; Dekkers, G.; Rispens, T. IgG Subclasses and Allotypes: From Structure to Effector Functions. Front. Immunol. 2014, 5, 520.
  7. Dall’Acqua, W.F.; Kiener, P.A.; Wu, H. Properties of Human IgG1s Engineered for Enhanced Binding to the Neonatal Fc Receptor (FcRn). J. Biol. Chem. 2006, 281, 23514–23524.
  8. Dumet, C.; Pottier, J.; Gouilleux-Gruart, V.; Watier, H. Insights into the IgG Heavy Chain Engineering Patent Landscape as Applied to IgG4 Antibody Development. mAbs 2019, 11, 1341–1350.
  9. Yeung, Y.A.; Leabman, M.K.; Marvin, J.S.; Qiu, J.; Adams, C.W.; Lien, S.; Starovasnik, M.A.; Lowman, H.B. Engineering Human IgG1 Affinity to Human Neonatal Fc Receptor: Impact of Affinity Improvement on Pharmacokinetics in Primates. J. Immunol. 2009, 182, 7663–7671.
  10. Deng, R.; Loyet, K.M.; Lien, S.; Iyer, S.; DeForge, L.E.; Theil, F.-P.; Lowman, H.B.; Fielder, P.J.; Prabhu, S. Pharmacokinetics of Humanized Monoclonal Anti-Tumor Necrosis Factor-α Antibody and Its Neonatal Fc Receptor Variants in Mice and Cynomolgus Monkeys. Drug Metab. Dispos.: Biol. Fate Chem. 2010, 38, 600–605.
  11. Ward, E.S.; Ober, R.J. Targeting FcRn to Generate Antibody-Based Therapeutics. Trends Pharmacol. Sci. 2018, 39, 892–904.
  12. Shields, R.L.; Namenuk, A.K.; Hong, K.; Meng, Y.G.; Rae, J.; Briggs, J.; Xie, D.; Lai, J.; Stadlen, A.; Li, B.; et al. High Resolution Mapping of the Binding Site on Human IgG1 for Fc Gamma RI, Fc Gamma RII, Fc Gamma RIII, and FcRn and Design of IgG1 Variants with Improved Binding to the Fc Gamma R. J. Biol. Chem. 2001, 276, 6591–6604.
  13. Oganesyan, V.; Damschroder, M.M.; Cook, K.E.; Li, Q.; Gao, C.; Wu, H.; Dall’Acqua, W.F. Structural Insights into Neonatal Fc Receptor-Based Recycling Mechanisms. J. Biol. Chem. 2014, 289, 7812–7824.
  14. Petkova, S.B.; Akilesh, S.; Sproule, T.J.; Christianson, G.J.; Al Khabbaz, H.; Brown, A.C.; Presta, L.G.; Meng, Y.G.; Roopenian, D.C. Enhanced Half-Life of Genetically Engineered Human IgG1 Antibodies in a Humanized FcRn Mouse Model: Potential Application in Humorally Mediated Autoimmune Disease. Int. Immunol. 2006, 18, 1759–1769.
  15. Monnet, C.; Jorieux, S.; Urbain, R.; Fournier, N.; Bouayadi, K.; De Romeuf, C.; Behrens, C.K.; Fontayne, A.; Mondon, P. Selection of IgG Variants with Increased FcRn Binding Using Random and Directed Mutagenesis: Impact on Effector Functions. Front. Immunol. 2015, 6, 39.
  16. Booth, B.J.; Ramakrishnan, B.; Narayan, K.; Wollacott, A.M.; Babcock, G.J.; Shriver, Z.; Viswanathan, K. Extending Human IgG Half-Life Using Structure-Guided Design. mAbs 2018, 10, 1098–1110.
  17. Mackness, B.C.; Jaworski, J.A.; Boudanova, E.; Park, A.; Valente, D.; Mauriac, C.; Pasquier, O.; Schmidt, T.; Kabiri, M.; Kandira, A.; et al. Antibody Fc Engineering for Enhanced Neonatal Fc Receptor Binding and Prolonged Circulation Half-Life. mAbs 2019, 11, 1276–1288.
  18. Pierce, B.; Weng, Z. ZRANK: Reranking Protein Docking Predictions with an Optimized Energy Function. Proteins Struct. Funct. Bioinform. 2007, 67, 1078–1086.
  19. Kastritis, P.L.; Bonvin, A.M.J.J. Are Scoring Functions in Protein−Protein Docking Ready to Predict Interactomes? Clues from a Novel Binding Affinity Benchmark. J. Proteome Res. 2011, 10, 921–922.
  20. Gromiha, M.M.; Yugandhar, K.; Jemimah, S. Protein–Protein Interactions: Scoring Schemes and Binding Affinity. Curr. Opin. Struct. Biol. 2017, 44, 31–38.
  21. Spassov, V.Z.; Yan, L. pH-Selective Mutagenesis of Protein-Protein Interfaces: In Silico Design of Therapeutic Antibodies with Prolonged Half-Life. Proteins Struct. Funct. Bioinform. 2013, 81, 704–714.
  22. Huang, X.; Zheng, F.; Zhan, C.-G. Binding Structures and Energies of the Human Neonatal Fc Receptor with Human Fc and Its Mutants by Molecular Modeling and Dynamics Simulations. Mol. BioSyst. 2013, 9, 3047.
  23. Igawa, T.; Maeda, A.; Haraya, K.; Tachibana, T.; Iwayanagi, Y.; Mimoto, F.; Higuchi, Y.; Ishii, S.; Tamba, S.; Hironiwa, N.; et al. Engineered Monoclonal Antibody with Novel Antigen-Sweeping Activity in Vivo. PLoS ONE 2013, 8, e63236.
  24. Maas, B.M.; Cao, Y. A Minimal Physiologically Based Pharmacokinetic Model to Investigate FcRn-Mediated Monoclonal Antibody Salvage: Effects of Kon, Koff, Endosome Trafficking, and Animal Species. mAbs 2018, 10, 1322–1331.
  25. Horton, N.; Lewis, M. Calculation of the Free Energy of Association for Protein Complexes. Protein Sci. 1992, 1, 169–181.
  26. Guerois, R.; Nielsen, J.E.; Serrano, L. Predicting Changes in the Stability of Proteins and Protein Complexes: A Study of More than 1000 Mutations. J. Mol. Biol. 2002, 320, 369–387.
  27. Wang, R.; Lai, L.; Wang, S. Further Development and Validation of Empirical Scoring Functions for Structure-Based Binding Affinity Prediction. J. Comput.-Aided Mol. Des. 2002, 16, 11–26.
  28. Varoquaux, G.; Buitinck, L.; Louppe, G.; Grisel, O.; Pedregosa, F.; Mueller, A. Scikit-Learn. GetMobile: Mob. Comput. Commun. 2015, 19, 29–33.
  29. Schymkowitz, J.; Borg, J.; Stricher, F.; Nys, R.; Rousseau, F.; Serrano, L. The FoldX Web Server: An Online Force Field. Nucleic Acids Res. 2005, 33, W382–W388.
  30. Borrok, M.J.; Wu, Y.; Beyaz, N.; Yu, X.-Q.; Oganesyan, V.; Dall’Acqua, W.F.; Tsui, P. pH-Dependent Binding Engineering Reveals an FcRn Affinity Threshold That Governs IgG Recycling. J. Biol. Chem. 2015, 290, 4282–4290.
  31. Walters, B.T.; Jensen, P.F.; Larraillet, V.; Lin, K.; Patapoff, T.; Schlothauer, T.; Rand, K.D.; Zhang, J. Conformational Destabilization of Immunoglobulin G Increases the Low pH Binding Affinity with the Neonatal Fc Receptor. J. Biol. Chem. 2016, 291, 1817–1825.
Figure 2. KD distributions obtained with the four different algorithms in the two models for the three sets of random mutants. RFR in green, MLP in red, MLR in yellow, SVR in blue, trained either with FLS (top) or SLS (bottom) models. Purple star represents the average KD.
Table 1. Datasets and machine learning methods used.
Datasets
Name | Number of Variants | Selection Criteria
First learning set (FLS) | 1099 | Affinities measured by SPR at 25 °C, pH 7.
Second learning set (SLS) | 1323 | FLS variants + 224 variants with affinities only measured at pH 6.
Algorithms
Name | Description
Support vector regressor (SVR) | The objective of support vector machines (SVMs) is to find the hyperplane that best separates the two categories of instances defined in a training sample. Support vector regression (SVR) uses the same principle, adding a constraint on the maximal distance between the instances and the hyperplane.
Multi-linear regression (MLR) | Multiple linear regression optimizes a linear function of the parameters.
Multi-layer perceptron (MLP) | An MLP is a class of feedforward artificial neural networks (ANNs) with at least three layers of nodes (input, hidden, and output), in which the neurons of the hidden and output layers use non-linear activation functions.
Random forest regressor (RFR) | A random forest is a meta-estimator that fits a number of decision trees on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control overfitting.
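As a rough illustration of how the four algorithms of Table 1 can be compared in a 10-fold cross-validation, the sketch below instantiates them with scikit-learn on synthetic data. The feature matrix, the target values, the number of features, and all hyperparameters are placeholders assumed for illustration only; they are not the descriptors or settings used in this work.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for a learning set (ASSUMPTION: 200 variants, 6 features).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
# Hypothetical log10(KD) target: a linear signal plus noise, centred near -7.
weights = np.array([0.5, -0.3, 0.2, 0.0, 0.1, -0.4])
y = -7.0 + X @ weights + rng.normal(scale=0.1, size=200)

# Arbitrary hyperparameters chosen for the sketch, NOT those of the paper.
models = {
    "SVR": make_pipeline(StandardScaler(), SVR()),
    "MLR": LinearRegression(),
    "MLP": make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
    ),
    "RFR": RandomForestRegressor(n_estimators=50, random_state=0),
}

cv = KFold(n_splits=10, shuffle=True, random_state=0)
cv_mae = {}
for name, model in models.items():
    # Each variant is predicted by a model fitted on the other nine folds.
    y_pred = cross_val_predict(model, X, y, cv=cv)
    cv_mae[name] = float(np.mean(np.abs(y_pred - y)))

for name, mae in cv_mae.items():
    print(f"{name}: cross-validated MAE on log10(KD) = {mae:.3f}")
```

Working on log10(KD) rather than on KD itself keeps the error on the same scale as the log error used to assess the predictions.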
Table 2. Comparison of predicted versus experimental affinities at pH 7.0 for 3 randomly generated variants. The 3 variants have 3, 5, and 7 mutations and are predicted with the two different models and with 4 different algorithms. Measured affinities at pH 6.0 are also shown. Cells in green, yellow, and red correspond to very good (log err = |log(pred) − log(KD)| ≤ 0.1), correct (0.1 < log err ≤ 1), and incorrect (log err > 1) predictions, respectively. Statistical analysis is given in Table 3.
| | Tocilizumab * | T8 * | T3 * | C7 | B5 | A3 |
|---|---|---|---|---|---|---|
| Mutations | None | M252Y/N286E/T307Q/V308P/Q311A/N434Y/Y436V | M252Y/T307D/N434Y | T256E/N286H/K288E/V308P/L309D/N434Y/Y436K | T256Y/H285Q/N286D/V308A/N434Y | M252W/M428K/N434W |
| KD at pH 7 (patent) | 8.8 × 10⁻⁵ | 4.4 × 10⁻⁹ | 2.1 × 10⁻⁷ | | | |
| KD at pH 7 (this work) | NB | 7.8 × 10⁻⁹ | 3.8 × 10⁻⁷ | 1.6 × 10⁻⁷ | 6.2 × 10⁻⁷ | 5.7 × 10⁻⁷ |
| KD at pH 6 (this work) | 3.8 × 10⁻⁷ | 1.3 × 10⁻⁹ | 1.3 × 10⁻⁸ | 3.4 × 10⁻⁸ | 4.5 × 10⁻⁸ | 1.1 × 10⁻⁸ |
| Prediction setting | | | | | | |
| SVR/FLS | 6.91 × 10⁻⁷ | 7.29 × 10⁻⁹ | 2.54 × 10⁻⁸ | 1.90 × 10⁻⁷ | 1.90 × 10⁻⁷ | 1.40 × 10⁻⁷ |
| SVR/SLS | 6.70 × 10⁻⁶ | 1.30 × 10⁻⁸ | 9.50 × 10⁻⁸ | 2.30 × 10⁻⁷ | 4.20 × 10⁻⁷ | 4.20 × 10⁻⁷ |
| MLR/FLS | 6.20 × 10⁻⁷ | 1.30 × 10⁻⁸ | 8.00 × 10⁻⁸ | 1.80 × 10⁻⁸ | 4.40 × 10⁻⁸ | 1.30 × 10⁻⁷ |
| MLR/SLS | 1.00 × 10⁻⁴ | 5.40 × 10⁻⁸ | 6.50 × 10⁻⁸ | 7.40 × 10⁻⁸ | 1.40 × 10⁻⁷ | 2.70 × 10⁻⁷ |
| MLP/FLS | 8.00 × 10⁻⁷ | 1.30 × 10⁻⁸ | 8.00 × 10⁻⁸ | 8.70 × 10⁻⁸ | 1.30 × 10⁻⁷ | 7.70 × 10⁻⁸ |
| MLP/SLS | 1.90 × 10⁻⁵ | 4.40 × 10⁻⁸ | 6.00 × 10⁻⁸ | 6.80 × 10⁻⁷ | 7.10 × 10⁻⁶ | 6.90 × 10⁻⁷ |
| RFR/FLS | 1.20 × 10⁻⁶ | 3.80 × 10⁻⁹ | 2.47 × 10⁻⁷ | 6.00 × 10⁻⁸ | 1.60 × 10⁻⁷ | 3.30 × 10⁻⁷ |
| RFR/SLS | 3.40 × 10⁻⁶ | 4.10 × 10⁻⁹ | 1.50 × 10⁻⁷ | 4.90 × 10⁻⁸ | 2.10 × 10⁻⁷ | 3.20 × 10⁻⁷ |
* Tocilizumab, T8, and T3 were removed from the learning set in each prediction setting.
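The colour coding described in the Table 2 caption can be sketched as a small function. The thresholds (0.1 and 1) come from the caption; reading "log" as base 10 is an assumption, and the function names `log_error` and `rate_prediction` are hypothetical helpers introduced only for this example.

```python
import math


def log_error(predicted_kd: float, measured_kd: float) -> float:
    """Return |log10(pred) - log10(KD)| (base 10 is an assumption)."""
    return abs(math.log10(predicted_kd) - math.log10(measured_kd))


def rate_prediction(log_err: float) -> str:
    """Map a log error onto the three classes named in the Table 2 caption."""
    if log_err <= 0.1:
        return "very good"   # green cells
    if log_err <= 1:
        return "correct"     # yellow cells
    return "incorrect"       # red cells


# B5 column of Table 2: SVR/SLS prediction vs. KD measured at pH 7 in this work.
err = log_error(predicted_kd=4.20e-7, measured_kd=6.2e-7)
print(f"log err = {err:.3f} -> {rate_prediction(err)}")
```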
Table 3. Comparison of MAE, Pearson correlation coefficient, and maximum error between predictions at pH 7.0 and measurements, for the 6 antibodies of Table 2 or only for the 3 produced variants (A3, B5, and C7).
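| | SVR/FLS | SVR/SLS | MLR/FLS | MLR/SLS | MLP/FLS | MLP/SLS | RFR/FLS | RFR/SLS |
|---|---|---|---|---|---|---|---|---|
| Log KD MAE (6 Abs) | 0.64 | 0.19 | 0.81 | 0.11 | 0.63 | 0.26 | 0.52 | 0.47 |
| Pearson correlation coefficient (6 Abs) | 0.88 | 0.98 | 0.91 | 0.89 | 0.98 | 0.84 | 0.91 | 0.97 |
| Log KD Maximum error (6 Abs) | 2.11 | 1.12 | 2.15 | 1.09 | 2.04 | 1.06 | 1.87 | 1.41 |
| Log KD MAE (A, B, C Abs) | 0.35 | 0.05 | 0.91 | 0.44 | 0.60 | 0.59 | 0.42 | 0.41 |
| Pearson correlation coefficient (A, B, C Abs) | −0.45 | 0.99 | 0.81 | 0.83 | 0.35 | 0.55 | 0.88 | 0.96 |
| Log KD Maximum error (A, B, C Abs) | 0.61 | 0.17 | 1.15 | 0.65 | 0.87 | 1.06 | 0.59 | 0.51 |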
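The three statistics reported in Table 3 can be computed along the following lines. This is a minimal sketch under stated assumptions: logs are taken in base 10, the Pearson coefficient is computed on the log-transformed values, and `pearson` and `summarize` are hypothetical helper names. The exact conventions behind the published figures are not spelled out here, so the output is illustrative rather than a reproduction of Table 3.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def summarize(predicted_kd, measured_kd):
    """MAE, Pearson r and maximum error, all on log10(KD) (assumed convention)."""
    log_pred = [math.log10(v) for v in predicted_kd]
    log_meas = [math.log10(v) for v in measured_kd]
    abs_err = [abs(p - m) for p, m in zip(log_pred, log_meas)]
    return {
        "mae": sum(abs_err) / len(abs_err),
        "pearson": pearson(log_pred, log_meas),
        "max_error": max(abs_err),
    }


# SVR/SLS predictions and pH 7 measurements for C7, B5 and A3, taken from Table 2.
measured = [1.6e-7, 6.2e-7, 5.7e-7]
predicted = [2.30e-7, 4.20e-7, 4.20e-7]
stats = summarize(predicted, measured)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```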
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Dumet, C.; Pugnière, M.; Henriquet, C.; Gouilleux-Gruart, V.; Poupon, A.; Watier, H. Harnessing Fc/FcRn Affinity Data from Patents with Different Machine Learning Methods. Int. J. Mol. Sci. 2023, 24, 5724. https://doi.org/10.3390/ijms24065724