Deep Learning Modelling of Androgen Receptor Responses to Prostate Cancer Therapies

Gain-of-function mutations in the human androgen receptor (AR) are among the major causes of drug resistance in prostate cancer (PCa). Identifying mutations that cause a resistant phenotype is of critical importance for guiding treatment protocols as well as for designing drugs that do not elicit adverse responses. However, experimental characterization of these mutations is time-consuming and costly, and therefore predictive models are needed to anticipate resistant mutations and to guide the drug discovery process. In this work, we leverage experimental data collected on 69 clinically observed and/or literature-described AR mutants to train a deep neural network (DNN) to predict their responses to currently used and experimental AR anti-androgens. We demonstrate that the use of a DNN provides more accurate prediction of the biological outcome (inhibition, activation, no-response) in AR mutant-drug pairs compared to other machine learning approaches and also allows the use of more general 2D descriptors. Finally, the developed approach was used to predict the effect of the latest AR inhibitor, darolutamide, on all reported AR mutants.


Introduction
Resistance to drug treatments is a common occurrence across many diseases, but it is especially prevalent and lethal in cancers, where it represents a major obstacle for long-term therapies. Acquired drug resistance can be caused by a number of different mechanisms by which cancer cells escape the treatment. 1 One of the major routes is the development of gain-of-function mutations, where the target protein becomes altered under selective therapeutic pressure, rendering the drug ineffective. Gain-of-function mutations are of particular importance in prostate cancer (PCa), where resistance is a common and often deadly occurrence. 2 The main drug target in PCa is the androgen receptor (AR), a nuclear hormone receptor whose increased activation is one of the principal drivers of PCa. Multiple decades of research on the AR have led to a number of targeted drug treatments that have significantly improved patient survival and well-being. However, despite the major gains in AR-targeted treatments, resistance invariably develops to all current drugs. 3,4 One important aspect of resistant AR mutants is that they do not simply render the drug ineffective but can even turn the drug from an antagonist into an agonist, thus promoting cancer growth. This characteristic seems to be unique to the AR and thus emphasises the need to identify gain-of-function mutations that cause this phenotype so that patients can be screened and taken off treatment before resistance develops. 5 Additionally, understanding and predicting the mutations that cause resistance will enable us to design better drugs that might avoid this mechanism in the future.
Previous research had identified a number of new AR mutants from patient DNA using next-generation sequencing technology. 6 The activation of these mutants was then measured and modelled as a binary classification of the AR mutant responses to common PCa drugs and endogenous human steroids using structure-based 4D QSAR descriptors. [9][10][11][12] The QSAR model was able to identify a de novo AR mutant that showed a resistant phenotype to the current anti-androgen drugs. However, the approach was relatively simplistic and, despite achieving an accuracy of ~80%, the model generated several false positive predictions for the external validation set. Importantly, the binary antagonist/agonist classification reduced the complexity of the AR mutant responses, which can range from full antagonism to full agonism and also include partial agonistic response and no-response (non-functional AR mutants) categories.
Recent advances in machine learning theory have led to significant progress and qualitative changes in QSAR and chemoinformatics practice. Interpretable linear QSAR models are increasingly being replaced by support vector machines (SVM), random forests (RF), artificial neural networks (ANN) and other non-linear techniques that demonstrate robust performance on large biological datasets. 13,14 More recently, progress in deep learning techniques, along with the increasing availability of biological data, has brought DNNs into the drug modelling spotlight. 15 Deep networks have been particularly effective due to their ability to learn more complex non-linear trends from larger datasets and their lower requirements for input representations, i.e. a lesser need for precise descriptor engineering. 16,17 In this work, we utilized a deep neural network (DNN), along with general proteochemometric descriptors, to predict more differentiated AR mutant-drug responses on a significantly extended experimental dataset compared to our previous study. Furthermore, we implemented a structure-independent protocol in which protein sequence-based descriptors and 2D drug fingerprints replace the drug docking step in feature construction, saving time and making the model more generalizable. Notably, such an approximation allows the consideration of mutations that occur outside of the ligand-binding domain (LBD) of the AR, where no structural information is available. As mentioned above, the resulting DNN model distinguishes four AR mutant response phenotypes and provides rather accurate discrimination between them. Given that certain drug response phenotypes are less frequent than others (the resistant phenotype is relatively rare), there is a high imbalance between classes that can degrade the model's performance on the minority classes (a model can reach high accuracy simply by predicting the majority class).
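The remark about accuracy under class imbalance can be made concrete with a small worked example. The class names and counts below are invented purely for illustration (they are not the study's data); the point is only that a classifier which always outputs the most frequent class can score well on accuracy while its class-averaged recall stays at chance level.

```python
from collections import Counter

# Hypothetical label counts, invented for illustration only (not the study's data).
labels = (["antagonist"] * 70 + ["agonist"] * 10
          + ["mixed"] * 12 + ["non_responsive"] * 8)

# A trivial "model" that always predicts the most frequent class.
majority_class = Counter(labels).most_common(1)[0][0]
predictions = [majority_class] * len(labels)

accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

def recall(cls):
    """Fraction of the true members of `cls` that were predicted as `cls`."""
    members = [p for p, y in zip(predictions, labels) if y == cls]
    return sum(p == cls for p in members) / len(members)

classes = sorted(set(labels))
macro_recall = sum(recall(c) for c in classes) / len(classes)

print(f"accuracy = {accuracy:.2f}")                # 0.70
print(f"macro-averaged recall = {macro_recall:.2f}")  # 0.25
```

With these toy counts the majority-class predictor reaches 70% accuracy but only 25% macro-averaged recall, which is why class-balanced training data and per-class metrics matter here.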
To address this imbalance, we over-sample the minority classes using the borderline synthetic minority over-sampling technique (borderline SMOTE), which outperformed both regular SMOTE and random oversampling in cross-validation. 28,29

Baseline methods

Results and discussion
To justify the use of the more complex DNN method for this task, we benchmarked the corresponding results against outcomes from SVM and RF models. In particular we have

DNN model training
The general overview of our approach can be seen in Figure 2, where the network takes in protein sequence and 2D chemical descriptors and outputs one of four classes of dose-response curve.
The hyperparameters of the DNN were also optimized using 5-fold cross-validation.
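As a rough illustration of what a 5-fold hyperparameter search involves, the sketch below splits sample indices into five folds and picks the candidate setting with the best mean validation score. The function names and the `score_fn` callback (which would train a model on the training indices and score it on the validation indices) are hypothetical; the text does not state whether the folds were stratified or which score was optimized, so neither is assumed here.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Strided slicing gives folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx

def cross_validate(candidates, score_fn, n_samples, k=5):
    """Return (best_candidate, mean_score) over k validation folds.

    `score_fn(params, train_idx, val_idx)` is a hypothetical callback that
    fits a model with `params` and returns a validation score (higher is better).
    """
    best, best_score = None, float("-inf")
    for params in candidates:
        scores = [score_fn(params, tr, va)
                  for tr, va in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best, best_score = params, mean_score
    return best, best_score

# Each sample appears in exactly one validation fold.
for train_idx, val_idx in k_fold_indices(n_samples=23, k=5):
    print(len(train_idx), len(val_idx))
```

Because every candidate is scored on the same five splits (fixed seed), differences in the mean score reflect the hyperparameters rather than the partitioning.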
The optimal architecture was found to be two hidden layers with 128 neurons in the first layer and 32 in the second. This is a relatively small network, but it has a suitable number of weights to train given the small number of training examples. Categorical cross-entropy loss was used, with a batch size of 16 and the Adam optimizer for training. 30 Each hidden node had a rectified linear unit (ReLU) 31 activation applied and the output layer had a softmax activation applied. 32 Additionally, to reduce over-fitting of the network, a dropout rate of 0.01 was applied to each hidden layer and early stopping was used to terminate training if the validation loss did not improve over 20 epochs.
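The pieces described above can be sketched in plain Python to show how they fit together. This is only an illustration under stated assumptions, not the implementation used in the study: the input width `N_FEATURES`, the weight initialisation and all helper names are hypothetical, the weights are random rather than trained, and the Adam update and dropout are omitted (dropout is inactive at inference time in any case).

```python
import math
import random

def relu(v):
    return [max(0.0, x) for x in v]

def softmax(v):
    m = max(v)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]

def dense(x, weights, biases):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def init_layer(n_in, n_out, rng):
    # He-style initialisation: an assumption, the text does not specify one.
    scale = math.sqrt(2.0 / n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(x, layers):
    """Inference-time forward pass: ReLU hidden layers, softmax output."""
    *hidden, (w_out, b_out) = layers
    for w, b in hidden:
        x = relu(dense(x, w, b))
    return softmax(dense(x, w_out, b_out))

def cross_entropy(probs, true_class):
    """Categorical cross-entropy for a single one-hot target."""
    return -math.log(max(probs[true_class], 1e-12))

class EarlyStopping:
    """Signal a stop once validation loss has not improved for `patience` epochs."""
    def __init__(self, patience=20):
        self.patience, self.best, self.stale = patience, float("inf"), 0

    def should_stop(self, val_loss):
        if val_loss < self.best:
            self.best, self.stale = val_loss, 0
        else:
            self.stale += 1
        return self.stale >= self.patience

rng = random.Random(0)
N_FEATURES = 64  # hypothetical input width; the real descriptor length is not given here
layers = [init_layer(N_FEATURES, 128, rng),  # first hidden layer, 128 units
          init_layer(128, 32, rng),          # second hidden layer, 32 units
          init_layer(32, 4, rng)]            # output layer, four response classes
x = [rng.random() for _ in range(N_FEATURES)]
probs = forward(x, layers)
print(probs, cross_entropy(probs, true_class=0))
```

In practice a deep learning framework would supply these components along with the optimizer and mini-batching; the sketch is meant only to make the layer sizes, activations, loss and stopping rule explicit.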

Prediction results
Interestingly, SVM and random forest achieve fairly high AUC on the training set despite the large dimension of the input, and the DNN performs only slightly better than SVM and the same as random forest (Figure 3). However, further investigation into precision and recall shows that both baseline methods over-predict the majority class, a bias that the AUC does not fully capture. This can be seen in Table 1, where the DNN significantly outperforms the baselines in precision, recall and F1 score. The shallow models had difficulty discriminating between the antagonist and U-shaped phenotypes and thus do not generalize well to validation data. This can be seen clearly in Figure 4, which compares the confusion matrices for the three methods on the test set.
The DNN predictions are noticeably better than the baselines at identifying the non-responsive phenotype, and the DNN is more sensitive in distinguishing between the mixed (U-shaped) response and the antagonist response, which are subtly different.
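For readers less familiar with these metrics, the following sketch shows how a confusion matrix and per-class precision, recall and F1 are derived from predicted and true labels. The class names and the toy labels are invented for illustration and are not results from this study; they are chosen to mimic a model that over-predicts one class.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

def per_class_scores(matrix, classes):
    """Return {class: (precision, recall, f1)} from a confusion matrix."""
    scores = {}
    for i, c in enumerate(classes):
        tp = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column sum
        actual = sum(matrix[i])                    # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[c] = (precision, recall, f1)
    return scores

classes = ["antagonist", "agonist", "mixed", "non_responsive"]
# Invented toy labels (not results from the study).
y_true = ["antagonist", "antagonist", "antagonist", "mixed",
          "mixed", "agonist", "non_responsive"]
y_pred = ["antagonist", "antagonist", "antagonist", "antagonist",
          "mixed", "agonist", "antagonist"]

matrix = confusion_matrix(y_true, y_pred, classes)
for cls, (p, r, f1) in per_class_scores(matrix, classes).items():
    print(f"{cls:15s} precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

In this toy case the over-predicted class keeps perfect recall but loses precision, while the classes absorbed into it lose recall, which is the pattern that per-class metrics expose and an aggregate AUC can hide.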