Prediction of Premature Termination Codon Suppressing Compounds for Treatment of Duchenne Muscular Dystrophy Using Machine Learning

A significant percentage of Duchenne muscular dystrophy (DMD) cases are caused by premature termination codon (PTC) mutations in the dystrophin gene, leading to the production of a truncated, non-functional dystrophin polypeptide. PTC-suppressing compounds (PTCSC) have been developed in order to restore protein translation by allowing the incorporation of an amino acid in place of a stop codon. However, limitations exist in terms of efficacy and toxicity. To identify new compounds that have PTC-suppressing ability, we selected and clustered existing PTCSC, allowing for the construction of a common pharmacophore model. Machine learning (ML) and deep learning (DL) models were developed for prediction of new PTCSC based on known compounds. We conducted a search of the NCI compounds database using the pharmacophore-based model and a search of the DrugBank database using pharmacophore-based, ML and DL models. Sixteen drug compounds were selected as a consensus of pharmacophore-based, ML, and DL searches. Our results suggest notable correspondence of the pharmacophore-based, ML, and DL models in prediction of new PTC-suppressing compounds.


Introduction
Duchenne muscular dystrophy (DMD) is an X-linked neuromuscular disorder characterized by progressive muscular degeneration. DMD affects about 1 in 3500 male births worldwide [1]. Currently, dystrophin is considered one of the most important genes involved in DMD. This protein is a component of the dystrophin-associated protein complex (DAPC) [1,2]. Mutations in the dystrophin gene on the X chromosome cause DMD; they damage the DAPC, causing creatine kinase to pass through muscle cell membranes into the blood, in turn, leading to elevated serum creatine kinase levels [2].
About 5-15% of DMD cases are caused by nonsense mutations, resulting in premature termination codons (PTCs) in the place of a sense codon. This leads to the production of a truncated dystrophin protein [2,3]. Consequently, several therapeutic approaches for treating DMD have focused on suppressing PTCs. One therapeutic approach in particular-the use of PTC-suppressing compounds (PTCSCs)-relies on the fact that translation termination is not 100% efficient [3]. It depends on foundation for future investigation of the identified PTC-suppressing drug candidates for treatment of DMD.

Development of a Pharmacophore Model
Based on the alignment of Cluster A, we created a seven-feature pharmacophore model. The pharmacophore model is described by three donors/acceptors, three acceptors, and one hydrophobic region. Based on the alignment of Cluster B, we developed a five-feature pharmacophore model. The five features consisted of three donors/acceptors, one hydrophobic region, and one acceptor. Three acceptors (Acc) were added to regions containing a high density of oxygen atoms, and two donors/acceptors (Don/Acc) were added to regions containing a high density of oxygen and nitrogen atoms. The pharmacophore models for both clusters are shown below ( Figure 2).

Development of a Pharmacophore Model
Based on the alignment of Cluster A, we created a seven-feature pharmacophore model. The pharmacophore model is described by three donors/acceptors, three acceptors, and one hydrophobic region. Based on the alignment of Cluster B, we developed a five-feature pharmacophore model. The five features consisted of three donors/acceptors, one hydrophobic region, and one acceptor. Three acceptors (Acc) were added to regions containing a high density of oxygen atoms, and two donors/acceptors (Don/Acc) were added to regions containing a high density of oxygen and nitrogen atoms. The pharmacophore models for both clusters are shown below ( Figure 2).

Databases Search
Pharmacophore searches were conducted on the NCI compounds database and DrugBank FDAapproved drug database using each of the two pharmacophore models. Default settings were selected for Cluster A pharmacophore model. Partial matches of eight of 10 and nine of 10 were run on the NCI Database and DrugBank Database, respectively. This was due to a greater number of pharmacophore features elucidated in the Cluster B pharmacophore model than in the Cluster A pharmacophore model (ten, rather than seven), which increased the selectivity of the pharmacophore search. The partial matches, which included compounds with correspondence to at least eight of the 10 features for the NCI database search and nine of the 10 features for the DrugBank database search, allowed us to increase the number of hits generated.

NCI Database Search
We searched an NCI compounds database containing 260,071 experimental compounds using the two pharmacophore models. When we searched the NCI compounds database with the five-offive feature pharmacophore for Cluster A, 35 compounds were elucidated with 234 conformations. These 35 compounds contain all five pharmacophore features. When we searched the database with the eight-of-ten feature pharmacophore for Cluster B, 100 compounds were elucidated containing eight of the ten pharmacophore features, at minimum. Of the 135 total compounds found by the NCI database search, twelve top-scoring compounds were selected with greatest correspondence to the pharmacophore centers (Table 1).

Databases Search
Pharmacophore searches were conducted on the NCI compounds database and DrugBank FDA-approved drug database using each of the two pharmacophore models. Default settings were selected for Cluster A pharmacophore model. Partial matches of eight of 10 and nine of 10 were run on the NCI Database and DrugBank Database, respectively. This was due to a greater number of pharmacophore features elucidated in the Cluster B pharmacophore model than in the Cluster A pharmacophore model (ten, rather than seven), which increased the selectivity of the pharmacophore search. The partial matches, which included compounds with correspondence to at least eight of the 10 features for the NCI database search and nine of the 10 features for the DrugBank database search, allowed us to increase the number of hits generated.

NCI Database Search
We searched an NCI compounds database containing 260,071 experimental compounds using the two pharmacophore models. When we searched the NCI compounds database with the five-of-five feature pharmacophore for Cluster A, 35 compounds were elucidated with 234 conformations. These 35 compounds contain all five pharmacophore features. When we searched the database with the eight-of-ten feature pharmacophore for Cluster B, 100 compounds were elucidated containing eight of the ten pharmacophore features, at minimum. Of the 135 total compounds found by the NCI database search, twelve top-scoring compounds were selected with greatest correspondence to the pharmacophore centers (Table 1). These 35 compounds contain all five pharmacophore features. When we searched the database with the eight-of-ten feature pharmacophore for Cluster B, 100 compounds were elucidated containing eight of the ten pharmacophore features, at minimum. Of the 135 total compounds found by the NCI database search, twelve top-scoring compounds were selected with greatest correspondence to the pharmacophore centers (Table 1). Similarly, we searched the DrugBank database containing 2356 FDA-approved drugs using the two pharmacophore models. The pharmacophore search on the DrugBank database using the five-of-five pharmacophore for Cluster A returned 57 drugs with 1974 conformations. These drugs share all five pharmacophore features. The pharmacophore search on the DrugBank database with the nine-of-ten pharmacophore for Cluster B elucidated 21 drugs containing nine of the ten pharmacophore features, at minimum.

Training Set Preparation
Three datasets were constructed to build and evaluate the classification model. The training set contained 43 PTCSCs and 42 non-PTCSCs. The training set contained the bulk of the known inhibitors so that the algorithm would have enough data to learn an accurate representation. Ten inhibitors were excluded from the training set, so that model could be evaluated on an independent set, one which did not influence the development of the model, to demonstrate that this model can be useful in identification of molecules which it has not seen before. The testing sets also included five times as many non-inhibitors as inhibitors. This was done to simulate the screen's actual purpose; the elucidation of inhibitors in large molecular databases, where only one in many thousands to hundreds of thousands of molecules will end up having the desired activity. Another practice we wished to investigate is the control of drug screens for molecular weights. It is common practice to make sure the molecules in your inactive class, those which are not of interest, are of similar molecular weight to the molecules in your active class, those compounds which are of interest. The thought here being that molecular weight is a poor indicator of chemical activity and should not be used to bias model training. One could easily imagine a situation in which a drug screen has near perfect accuracy on a training or test set purely because the inactive molecules selected are of a wildly different size whether much smaller or larger. We aimed to investigate whether it is more useful to match the distribution of molecular weights to the active molecules in the training or testing set when designing a drug screen by including two test sets, one adapted for each situation. The molecular weight distribution in these sets are described in Tables 2-4 with Test A have a distribution representative of the active molecules in the training set and Test B, being representative of the distribution of active compounds in the test set. Table 2. The molecular weight distribution in the training set. The inactive set, inactive meaning non-inhibitor, contains one less molecule in the 400-500 g/mol range, because said molecule proved to be problematic later in preprocessing.  Table 3. The number of molecules in each molecular weight range in set Test A, the testing set whose non-inhibitor molecular weight range distribution as described in the table reflects those in the training set. The number of molecules needed for each range was multiplied by 50/43 and rounded to the nearest whole number in order to reach the desired 50 non-inhibitor molecules in this set, while keeping the same distribution of molecular weights as the training set.  Table 4. The testing set, which is composed of molecules whose distribution of molecular weights in the non-inhibitor class matches that of the testing set. This means that there is a direct relationship between the number of inhibitor molecules in each molecular weight range as described in the table and the number of non-inhibitors in that molecular weight range. In this case there are 5 non-inhibitors for every inhibitor in the range.

Model Development and Validation
Models were trained to classify between PTCSCs and non-PTCSCs. WEKA and TF were used for comparison of the machine learning platforms. Performance on the training set was evaluated on accuracy and AUC, although other metrics like false positive rate were also recorded and taken into consideration ( Table 5). The models' performance on outside data, that which had not been included in the training set, was evaluated largely on the false positive and false negative rates, due to the 5:1 sample bias against inhibitors molecules ( Table 2). We felt this would indicate how the model would perform when used as a drug screen filtering through large datasets in which only a small number of molecules are of interest. The bias we artificially created will only be exaggerated by the application of our model to large molecular databases.  Tables 5 and 6 list the training test results for the most powerful models we developed using WEKA and TF. The best performing model developed with WEKA achieved an accuracy of 85.88% and an AUC of 0.928 on the training set using an MLP architecture. The top performing model trained using TF, Model 4 (TF M4), achieved an accuracy of 94.12% with an AUC of 1.0. Had TF M4 not been tested on a validation set we would think it to be our most powerful. However, it demonstrates poor accuracy on both our testing sets, 75.81% on Test A and 78.33% on Test B (Table 6). This is true for the WEKA MLP model as well, achieving only 75.00% accuracy on Test A and 70.00% accuracy on Test B, despite predicting with 85.88% accuracy on the training set. Both these models demonstrate over fitting: too strictly learning, adhering to, or memorizing, the training set at the cost of performance on other data sets, in this case Test A and Test B. Model 1 demonstrates the opposite problem, under fitting, where the model over generalizes the training set. Here, it creates a situation in which the model performs better on the testing sets than on the training set. Creating a successful machine learning model relies on balancing under and over fitting. A model that displays neither is no longer a model, it is a solution. Machine learning techniques are not meant to yield rigorous solutions, but rather approximations over the problem space presented in their training sets. Models 2 and 3 are the most balanced, demonstrating accuracies in the mid 80% range for all three datasets. Figure 3 illustrates that these two balanced models fall between the other two developed in TF; they occupy the ML "goldilocks zone" (green box), illustrating the region in which under fitting and over fitting are minimized. Model 2 is slightly under fitting, because the testing set accuracy is slightly higher than that of the training set. Model 3 is the opposite, slightly over trained, as the training accuracy exceeds that of both testing sets.  Models 2 and 3 are the most balanced, demonstrating accuracies in the mid 80% range for all three datasets. Figure 3 illustrates that these two balanced models fall between the other two developed in TF; they occupy the ML "goldilocks zone" (green box), illustrating the region in which under fitting and over fitting are minimized. Model 2 is slightly under fitting, because the testing set accuracy is slightly higher than that of the training set. Model 3 is the opposite, slightly over trained, as the training accuracy exceeds that of both testing sets. Selecting which of these models to apply and how to apply them when searching a large molecular database is not straightforward. We describe two approaches, one either selects the single model they feel suits their problem or situation best, or uses multiple models in a committee approach, one in which multiple models are used as voters in a final decision. A single model should be selected if it greatly exceeds the performance of the other models developed in most or all the Selecting which of these models to apply and how to apply them when searching a large molecular database is not straightforward. We describe two approaches, one either selects the single model they feel suits their problem or situation best, or uses multiple models in a committee approach, one in which multiple models are used as voters in a final decision. A single model should be selected if it greatly exceeds the performance of the other models developed in most or all the metrics being tracked. A committee approach often works very well if the models do not seem to obviously separate those themselves from each other, yet all perform well enough. We considered first the false negative rates on the test sets, and second the false positive rates on the test sets to determine which approach might be best in this situation. False negative rate was given priority because it demonstrates how often we will miss a molecule of the desired function, in this case how often a PTCSC will be classified as a PTCSC. This is the costliest form of error in this problem, because it describes a missed opportunity, a potential drug which will not be identified.
The false positive rate, while not as important as the false negative rate, is also of great importance, because it describes the frequency with which inactive compounds will end up being tested in vitro, or the rate at which molecules with no activity will be tested in vitro. The cost this adds to drug discovery can grow quickly when considering the quantity of false positives present when a screen is applied to a database with hundreds of thousands of molecules, a minute fraction of which will be active.
Model 2 performed best when considering results across datasets, especially considering its correct identification of 9/10 inhibitors in the testing sets and relatively low false positive rate (Table 7). The next step would be to use this trained model on a large molecular database, such as the set of all FDA approved drugs.

Model Deployment
After the development of both our pharmacophore and machine learning based QSAR models, we applied each of the three to the database of FDA approved drugs. Preprocessing of the FDA drug database for machine learning based predictions followed the same exact procedure as the other testing datasets, minus a label designating the molecules' activity. This is the unknown, and the purpose of the models is to designate these labels, WEKA, or a corresponding probability of falling into the active class, TensorFlow. The pharmacophore modeling method yielded 95 molecules with possible activity. The output of the TensorFlow based deep learning model was a probability of activity on the target protein for each of the molecules. We included the 176 most probable molecules predicted by TF, all those which had a greater than 50% probability of activity according to the model. The WEKA model predicted 167 molecules in the set to be active on the target protein. The models predicted a combined total of 350 molecules to be active on our target. All three models agreed on 16 molecules ( Table 8).
Note that a majority of the selected compounds are antibiotics targeting the 30S ribosomal subunit of bacteria.   The overlap between the TensorFlow model and the WEKA model predictions was most significant, sharing 46 of their combined 297 predictions ( Figure 4). The next largest overlap between molecules predicted by two methods is between those chosen using pharmacophore modeling and TensorFlow, 33 molecules predicted to be active out of a combined 238 structures ( Table 9). The overlap between the WEKA model and pharmacophore models is smallest, but still sizable at 25 molecules. The overlap between the TensorFlow model and the WEKA model predictions was most significant, sharing 46 of their combined 297 predictions ( Figure 4). The next largest overlap between molecules predicted by two methods is between those chosen using pharmacophore modeling and TensorFlow, 33 molecules predicted to be active out of a combined 238 structures ( Table 9). The overlap between the WEKA model and pharmacophore models is smallest, but still sizable at 25 molecules. The overlap between the TensorFlow model and the WEKA model predictions was most significant, sharing 46 of their combined 297 predictions ( Figure 4). The next largest overlap between molecules predicted by two methods is between those chosen using pharmacophore modeling and TensorFlow, 33 molecules predicted to be active out of a combined 238 structures ( Table 9). The overlap between the WEKA model and pharmacophore models is smallest, but still sizable at 25 molecules. The overlap between the TensorFlow model and the WEKA model predictions was most significant, sharing 46 of their combined 297 predictions (Figure 4). The next largest overlap between molecules predicted by two methods is between those chosen using pharmacophore modeling and TensorFlow, 33 molecules predicted to be active out of a combined 238 structures ( Table 9). The overlap between the WEKA model and pharmacophore models is smallest, but still sizable at 25 molecules.

Aminoglycoside antibiotic 15.5 30S ribosomal subunit
The overlap between the TensorFlow model and the WEKA model predictions was most significant, sharing 46 of their combined 297 predictions (Figure 4). The next largest overlap between molecules predicted by two methods is between those chosen using pharmacophore modeling and TensorFlow, 33 molecules predicted to be active out of a combined 238 structures ( Table 9). The overlap between the WEKA model and pharmacophore models is smallest, but still sizable at 25 molecules.

Building QSAR Models
We developed QSAR models for 10 of the 16 consensus drugs selected by pharmacophore-based, WEKA-based and TensorFlow-based searches of the FDA-approved drug database that have the same target (30S ribosome) (see Supplementary Materials). We elucidated the QSAR descriptors most useful to our modeling problem to analyze their relation to inhibitory activity on the 30S ribosome. Molecular refractivity (including implicit hydrogens)-SMR-proved to be most suitable for this comparison, displaying the strongest linear relationship to IC50 for these compounds ( Figure 5) [36].

Building QSAR Models
We developed QSAR models for 10 of the 16 consensus drugs selected by pharmacophore-based, WEKA-based and TensorFlow-based searches of the FDA-approved drug database that have the same target (30S ribosome) (see Supplementary Materials). We elucidated the QSAR descriptors most useful to our modeling problem to analyze their relation to inhibitory activity on the 30S ribosome. Molecular refractivity (including implicit hydrogens)-SMR-proved to be most suitable for this comparison, displaying the strongest linear relationship to IC 50 for these compounds ( Figure 5) [36].
A reasonable question arose from the obtained results-how the inhibition of bacterial proteins in the 30S ribosomal subunit is related to inhibition of the analogous human proteins. We conducted a study of the possible homology of bacterial and human proteins and obtained this unexpected but very interesting result ( Figure 6). A reasonable question arose from the obtained results-how the inhibition of bacterial proteins in the 30S ribosomal subunit is related to inhibition of the analogous human proteins. We conducted a study of the possible homology of bacterial and human proteins and obtained this unexpected but very interesting result ( Figure 6). The 30S ribosomal protein S12 sequence of Thermus thermophilus extracted from the crystal structure of this protein bound with the antibiotic streptomycin is shown at the top of Figure 7. The amino acids in contact with streptomycin are Lys46 (yellow), Lys47 (green), and Lys 91 (blue). The second sequence, below, is the analogous ribosomal sequence from Escherichia coli. BLAST searches of a possible alignment to human protein brought very interesting results. On a region spanning more than 110 amino acids in the 30S ribosomal protein S12 of Thermus thermophilus, there is a 46% identity alignment with the 28S ribosomal protein S12 of Homo sapiens. Alignment of the Escherichia coli and Homo sapiens ribosomal proteins produced a sequence identity of 48%. Such identities support the significant structural homology of these proteins in the region that is involved in binding of PTC-suppressing antibiotics. These results justify the use of IC 50 values based on inhibition of the bacterial ribosomal protein for prediction of activity of PTC-suppressing drugs in human patients ( Figure 6). The 30S ribosomal protein S12 sequence of Thermus thermophilus extracted from the crystal structure of this protein bound with the antibiotic streptomycin is shown at the top of Figure 7. The amino acids in contact with streptomycin are Lys46 (yellow), Lys47 (green), and Lys 91 (blue). The second sequence, below, is the analogous ribosomal sequence from Escherichia coli. BLAST searches of a possible alignment to human protein brought very interesting results. On a region spanning more than 110 amino acids in the 30S ribosomal protein S12 of Thermus thermophilus, there is a 46% identity alignment with the 28S ribosomal protein S12 of Homo sapiens. Alignment of the Escherichia coli and Homo sapiens ribosomal proteins produced a sequence identity of 48%. Such identities support the  Figure 6. (A) Alignment of the 30S ribosomal protein S12 of Thermus thermophilus and the 28S ribosomal protein of Homo sapiens. (B) Alignment of the 30S ribosomal protein S12 of Escherichia coli with the 28S ribosomal protein S12 of Homo sapiens.

Discussion
Many compounds that have been developed to treat DMD by stimulating translational readthrough are limited, due to their low efficacy or high toxicity. Aminoglycosides, such as gentamicin and paromomycin, have been found to induce ototoxicity and nephrotoxicity when administered to patients [37]. The use of non-aminoglycosides to treat DMD is still being investigated; however, the efficacy of these compounds varies, due to the stop codon context and sequence specificity.
We implemented pharmacophore-based, ML, and DL approaches to address this issue, with the aim of identifying potent compounds, including FDA-approved compounds, with PTC-suppressing capability. A literature search was conducted to identify known PTC-suppressing compounds. We then clustered the chosen compounds by molecular fingerprints. Compounds within each cluster were structurally aligned. Using these alignments, we developed two pharmacophore models. These models were used to conduct a search on the NCI database and DrugBank database of FDA-approved drugs. We selected 16 FDA-approved drugs as a consensus of all three approaches. Similar studies may be done in the future to further identify PTC-suppressing compounds for the treatment of DMD using larger, more expansive databases, such as ChEMBL.
Based on our QSAR models, we observed approximately linear relationships between the biological activities and structural attributes of each of the compounds. Our QSAR models show that the inhibitory activity against the 30S ribosome of drugs found by the consensus of pharmacophore-based search, ML, and DL methods have a close to linear relationship with the descriptor of molecular refractivity. Moreover, we confirmed that the bacterial ribosomal proteins interacting with antibiotics have a significant homology with the human ribosomal proteins. This suggests an answer to the question of why these antibiotics cause PTC suppression in human patients.
We applied all three models to the database of FDA approved drugs and obtained interesting results. All three models, WEKA, TensorFlow, and Pharmacophore, agreed on 16 molecules. These FDA approved drugs are prime candidates for in vitro testing to start the process of repurposing them to treat DMD. Pharmacophore modeling is a better predictor of biological activity than are QSAR methods, because in QSAR, we are limited to compounds targeting the same molecule. The molecules contained in the overlap between the pharmacophore model and TensorFlow model predictions sets are slightly more promising than the WEKA ML models because the TensorFlow DL model performed better than the WEKA model on independent testing sets.
Molecules predicted to have activity by more than one method should, in general, be weighted significantly higher than those which are only labeled active by only one model. Ensemble modeling methods, those in which multiple models are developed to make a single prediction, have been demonstrated to be more powerful than prediction methods dependent on only one model. Using an ensemble modeling method to make predictions is analogous to getting multiple professional opinions on a subject matter before making a decision. It allows for the prediction to account for many, well informed opinions, selecting the options which make sense from multiple perspectives.
Our analysis presents several potent compounds with PTC-suppressing ability. Ten of the 16 compounds identified using pharmacophore, ML, and DL models target the 30S ribosomal subunit, the primary target of existing PTC-suppressing compounds, indicating that they are viable options for repurposing for DMD treatment. Of the other six compounds, diazolidinyl urea, steviolbioside, streptozocin, and rutin have similar mechanisms of action (inhibition of DNA/protein synthesis), and their targets have similar structures to the 30S ribosomal subunit. This indicates that these compounds may have potential PTC-suppressing ability as well. With regard to the structures of the compounds found from all three approaches, 10 of the 16 compounds contain a 2-deoxystreptamine (2 DOS), which has been identified as a key structural feature in novel aminoglycoside structures [8]. In general, the structures of the compounds indicate PTC-suppressing ability. These compounds warrant further analysis of their pharmacokinetic properties and experimental validation of the development of these compounds for drug design.

Similarity Clustering
MOE (Molecular Operating Environment, Chemical Computing Group, Montreal, Canada) was used to perform similarity clustering. The main objective of similarity clustering is to separate compounds into subsets by their molecular fingerprints, based on the hypothesis that compounds with structural similarity will have similar binding properties. Examples of common molecular fingerprints include MACCS (Molecular ACCess System) and GpiDAPH3 (graph of pi-system-donor-acceptor-polar-hydrophobe three-point pharmacophore). The MACCS fingerprint encodes molecular structures in a bit string representing the presence or absence of sub-structural features [38]. GpiDAPH3 is calculated from the 2D molecular graph of a three-point pharmacophore [38]. The following three atomic properties: "is a hydrophobic", "is a donor", "is an acceptor" are computed to assign each atom to one of eight atom types [38]. Previous studies have ranked the GpiDAPH3 fingerprint above the MACCS fingerprint in terms of performance on datasets [38]. For the similarity clustering in this study, the GpiDAPH3 fingerprint was used. The SO (similarity and overlap) value selected was 0.45.

Flexible Alignment
The primary objective of flexible alignment is to identify the overlap of molecular features in selected compounds. Prior to pharmacophore elucidation, flexible alignment is a necessary step to superimpose the structures of selected compounds. Using the Flexible Alignment module from the Compute application in MOE, we performed flexible alignment separately for each cluster, given that each cluster was composed of distinct compounds with distinct chemical structures. The following parameters were used: Iteration Limit = 200, Failure Limit = 20, Energy Cutoff = 10.

Development of a Pharmacophore Model
The Query Editor of the Pharmacophore module of the Compute application in MOE was employed to build pharmacophore models for both clusters' alignments. The consensus pharmacophore models allowed us to select possible pharmacophore centers. The following pharmacophore centers were applied: H-bond donors (Don), H-bond acceptors (Acc), H-bond donors/acceptors (Don & Acc), and hydrophobic features (Hyd). The following parameters were used: Tolerance = 1.2, Threshold = 50%, and Consensus Score = Weighted Conformations. For the completion of the pharmacophore model, we used centers that belong to 100% of the superimposed compounds for Cluster A and 80% of the superimposed compounds for Cluster B. Based on the generated models, specific features were added accordingly to areas with high density of nitrogen atoms (Don), areas with high density of oxygen atoms (Acc), areas with high density of nitrogen and oxygen atoms (Don & Acc), or areas with high density of hydrophobic atoms (Hyd).

Databases Search
The Search window of the Pharmacophore module of the Compute application in MOE was employed to conduct searches of two databases: the NCI open database and the DrugBank database.

NCI Database Search
We prepared a conformational database from the NCI open database containing 260,071 experimental compounds, using the Conformational Search module of the Compute application in MOE. A pharmacophore search was then conducted on this database using the pharmacophore queries generated from the previous step. The default settings were used for Cluster A. For Cluster B, a partial match of 8 of 10 was run to increase the number of hits generated.

Drugbank Database Search
We also created a conformational database from the DrugBank database containing all 2356 FDA-approved drugs using the Conformational Search module of the Compute application in MOE. A pharmacophore search was conducted on this database, in the same way as with the NCI open database. The default settings were used for Cluster A. For Cluster B, a partial match of 9 of 10 was run to increase the number of hits generated.
Top-scoring compounds were identified based on the following criteria: a) the compounds must be within the ligand shape volume so that no atom centers lie beyond this volume; b) the highest-scoring compounds should have the greatest correspondence to the geometry of the selected pharmacophore centers.

Training and Test Set Preparation
Three data sets where constructed. One was for training the model, containing 43 PTCSCs and 42 random molecules with no documented affinity for the protein. The 37 PTCSCs used in pharmacophore-based methods and an additional 6 PTCSCs found from literature-lividomycin, neomycin sulfate, spectinomycin, isepamicin, clarithromycin, and telithromycin-were selected as inhibitors [39][40][41][42][43]. Non-inhibitors were selected to match the distribution of molecular weights observed in the inhibitor class in order to ensure that molecular weight-a characteristic we believe would bias the algorithm in a non-functionally impactful way-does not influence the model's training parameters ( Table 2).
The other two sets contained a common set of 10 inhibitors gathered from literature, none of which appeared in the training set: viomycin, roxithromycin, troleandomycin, tigecycline, omadacycline, demeclocycline, linezolid, eravacycline, chloramphenicol, and capreomycin [42,[44][45][46][47]. This was necessary to ensure that the sets contained enough distinct compounds that could be used as independent testing sets for model validation. The two testing sets differ in their distributions of molecular weights with respect to those of the non-PTCSCs they contain. The first set contained a set of non-inhibitors whose distribution of molecular weights mimics the training set, Set A (Table 3), while the second set of non-inhibitor molecular weights matches the testing set, Set B (Table 4). Set B was designed to have approximately 5 non-inhibitors for every one of the 10 known inhibitors in the testing set.
SMILES (simplified molecular-input line-entry system) representations of all molecules were collected from PubChem and used to compute QSAR molecular descriptors and fingerprints using the PaDEL online descriptor calculator [48]. Descriptors for the training set were uploaded to WEKA (Waikato Environment for Knowledge Analysis, University of Waikato, New Zealand) [49] and ranked by Information Gain (IG) with respect to the label of active/inactive. The 500 most informative columns were kept and ranked. The same 500 descriptors were isolated for the two testing sets. All three sets were MinMax scaled (Equation (1)), using the minimum and maximum values of each column presented in the training set.
Equation (1). The formula for MinMax scaling where X sc is the scaled value, X is the actual value for a feature in the column before scaling, X min is the minimum value for that feature in the training set, and X max is the maximum value for that feature in the training set.
The scaled sets were then reduced to their principal components using Numerical Python-a library consisting of multidimensional array objects and a collection of routines for processing those arrays (NumPy). Formulation of the principal components was performed using only the training set, with the formula derived from the training set applied to the training set and each of the testing sets. Principal component analysis (PCA) is a mathematical technique by which the unique basis sets of a set of numbers can be determined. It is commonly used to reduce the noise in a data set, allowing machine learning algorithms to converge at a minimal error more quickly. The resulting data sets contained 51 principal components representing each molecule (Figure 7). A label designating activity or lack thereof was added manually. arrays (NumPy). Formulation of the principal components was performed using only the training set, with the formula derived from the training set applied to the training set and each of the testing sets. Principal component analysis (PCA) is a mathematical technique by which the unique basis sets of a set of numbers can be determined. It is commonly used to reduce the noise in a data set, allowing machine learning algorithms to converge at a minimal error more quickly. The resulting data sets contained 51 principal components representing each molecule (Figure 7). A label designating activity or lack thereof was added manually.

Model Development
Machine learning was performed using both WEKA [49] and TensorFlow (TF) [50], comparing results obtained using each program. WEKA is a popular GUI typically used to accomplish basic ML tasks [49]. It is also equipped with number of preprocessing and feature selection functions, such as the previously mentioned IG and PCA [49]. TF is a more sophisticated ML platform requiring knowledge of a computer language, Python in this case, which allows for greater customization of the algorithms [50]. TF was used with Keras backend for access to its neural network libraries [50]. Models were trained through 10-fold cross validation and evaluated on their accuracy and area under receiver operating characteristic curve (AUC), with respect to the training set and the false negative and false positive rates on the testing sets. Modeling in WEKA was performed using default settings for all architectures attempted with multilayer perceptron (MLP) performing best across the sets. Modeling in TF was attempted with neural networks containing between 2 and 5 layers, with the best results achieved using a 3-layer neural network with parameters illustrated in Table 10. Table 10. Illustrating training parameters which led to the best results across all three datasets using TF. The layer densities and activation functions are listed in the order they appear; the layer of density 50 relies on a sigmoid activation function, density 25 corresponds to the elu activation function, and 1 to another sigmoidal layer.

Building QSAR Models
We prepared a database containing the names and experimental IC50 values for 10 of the 16 drugs identified as a consensus of the pharmacophore-based, ML, and DL search of the DrugBank database. The IC50 values were obtained from literature [28][29][30][31][32][33][34][35]. This database was submitted to the Structure-Activity Report (SAReport) editor of the QuaSAR module of the Compute Application in MOE in order to develop QSAR (Quantitative structure-activity relationship) models. The primary objective of developing QSAR models was to compare the therapeutic potential of the compounds found by the pharmacophore search to those currently used to treat DMD. QSAR was employed to identify the relationship between selected descriptors, which quantify the physicochemical properties of the compounds, and the IC50 values of the compounds, which describe the biological activity of the

Model Development
Machine learning was performed using both WEKA [49] and TensorFlow (TF) [50], comparing results obtained using each program. WEKA is a popular GUI typically used to accomplish basic ML tasks [49]. It is also equipped with number of preprocessing and feature selection functions, such as the previously mentioned IG and PCA [49]. TF is a more sophisticated ML platform requiring knowledge of a computer language, Python in this case, which allows for greater customization of the algorithms [50]. TF was used with Keras backend for access to its neural network libraries [50]. Models were trained through 10-fold cross validation and evaluated on their accuracy and area under receiver operating characteristic curve (AUC), with respect to the training set and the false negative and false positive rates on the testing sets. Modeling in WEKA was performed using default settings for all architectures attempted with multilayer perceptron (MLP) performing best across the sets. Modeling in TF was attempted with neural networks containing between 2 and 5 layers, with the best results achieved using a 3-layer neural network with parameters illustrated in Table 10. Table 10. Illustrating training parameters which led to the best results across all three datasets using TF. The layer densities and activation functions are listed in the order they appear; the layer of density 50 relies on a sigmoid activation function, density 25 corresponds to the elu activation function, and 1 to another sigmoidal layer.

Building QSAR Models
We prepared a database containing the names and experimental IC 50 values for 10 of the 16 drugs identified as a consensus of the pharmacophore-based, ML, and DL search of the DrugBank database. The IC 50 values were obtained from literature [28][29][30][31][32][33][34][35]. This database was submitted to the Structure-Activity Report (SAReport) editor of the QuaSAR module of the Compute Application in MOE in order to develop QSAR (Quantitative structure-activity relationship) models. The primary objective of developing QSAR models was to compare the therapeutic potential of the compounds found by the pharmacophore search to those currently used to treat DMD. QSAR was employed to identify the relationship between selected descriptors, which quantify the physicochemical properties of the compounds, and the IC 50 values of the compounds, which describe the biological activity of the compounds. Descriptors displaying the strongest linear relationship with the IC 50 values of the compounds were included in the QSAR models.
Supplementary Materials: The following are available online, Table S1: Drugs selected by Pharmacophore-based, ML-based and DL-based search in the FDA-approved drugs database.

Conflicts of Interest:
The authors declare no conflict of interest.