Convolutional Neural Network and Bidirectional Long Short-Term Memory-Based Method for Predicting Drug–Disease Associations

Identifying novel indications for approved drugs can accelerate drug development and reduce research costs. Most previous studies used shallow models to prioritize potential drug-related diseases and did not deeply integrate the paths between drugs and diseases, which may contain additional association information. A deep-learning-based method that integrates this information to predict drug–disease associations is therefore needed. We propose CBPred, a novel method based on a convolutional neural network (CNN) and bidirectional long short-term memory (BiLSTM), for predicting drug-related diseases. Our method deeply integrates the similarities and associations between drugs and diseases, as well as the paths between drug-disease pairs. The CNN-based framework learns the original representation of a drug-disease pair from their similarities and associations. Because the likelihood of a drug-disease association also depends on the multiple paths between the two nodes, the BiLSTM-based framework learns the path representation of the drug-disease pair. In addition, because different paths contribute differently to association prediction, an attention mechanism is constructed at the path level. CBPred showed better performance than the compared methods and retrieved more real associations at the top of the ranked results, which matters most to biologists. Case studies further confirmed that CBPred can discover potential drug-disease associations.


Introduction
The research and development (R&D) of a novel drug is a time-consuming, complex, and costly process that normally lasts more than ten years and costs approximately 1 billion dollars [1][2][3][4]. At the same time, there is a large gap between the high investment in R&D and the number of new drugs finally approved [5][6][7]. Because approved drugs have already undergone the necessary clinical trials and their safety has been evaluated, identifying new indications for these drugs (i.e., drug repositioning) can effectively reduce the time and cost of drug-related R&D [5,8,9].
Network-based approaches have been widely used to study biological and medical associations [10,11]. Computational prediction of the associations between drugs and diseases can identify candidates for further wet-lab validation [12,13]. Several methods are used to predict and prioritize drug-associated diseases, which can generally be divided into two categories. Methods in the first category capture network topology information using a diffusion algorithm and then provide association scores for candidate diseases [14][15][16][17]. Wang et al. [16] identified candidate diseases using an iterative update

Construction of a Drug-Disease Network
A two-layer heterogeneous drug-disease network, DrDisNet, was constructed from the similarities and associations of drugs and diseases. It consists of a drug network (DrNet), a disease network (DisNet), and the edges (i.e., associations between drugs and diseases) that connect the two networks.

Drug Network Construction
To measure the drug similarities for constructing the drug network (DrNet), we used the method developed by Liang et al. [1] to calculate the cosine similarity between the chemical substructure vectors of the drugs. The chemical substructure vector of a drug is an 869-dimensional binary vector in which the presence or absence of each chemical substructure is encoded as 1 or 0. When the similarity between two drugs was greater than 0, we added an edge connecting the two drug nodes in DrNet; the weight of the edge is the similarity between the drugs (Figure 1). DrNet can be represented by the matrix $R = [R_{ij}] \in \mathbb{R}^{N_r \times N_r}$, where $N_r$ is the number of drugs and $R_{ij}$ is the similarity of drugs $r_i$ and $r_j$, in the range 0 to 1. An $R_{ij}$ closer to 1 indicates greater similarity between $r_i$ and $r_j$. $R_{ij}$ is calculated as follows:

$$R_{ij} = \frac{c_i \cdot c_j}{\|c_i\| \, \|c_j\|}$$

where $c_i$ and $c_j$ are the chemical substructure vectors of $r_i$ and $r_j$, respectively, and $\|\cdot\|$ denotes the magnitude of a vector.
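As a minimal sketch of the DrNet construction described above, the following computes the cosine similarity between binary substructure vectors and keeps an edge wherever the similarity is greater than 0. The helper names (`cosine_similarity`, `build_drnet`) and the 6-dimensional toy vectors are illustrative assumptions; the real vectors are 869-dimensional.

```python
import math

def cosine_similarity(c_i, c_j):
    """Cosine similarity between two binary substructure vectors."""
    dot = sum(a * b for a, b in zip(c_i, c_j))
    norm_i = math.sqrt(sum(a * a for a in c_i))
    norm_j = math.sqrt(sum(b * b for b in c_j))
    if norm_i == 0 or norm_j == 0:
        # Assumption: a drug with no recorded substructure is similar to nothing.
        return 0.0
    return dot / (norm_i * norm_j)

def build_drnet(vectors):
    """Return the similarity matrix R and the weighted edge list of DrNet."""
    n = len(vectors)
    R = [[cosine_similarity(vectors[i], vectors[j]) for j in range(n)]
         for i in range(n)]
    # An edge is added only when the similarity is greater than 0.
    edges = [(i, j, R[i][j])
             for i in range(n) for j in range(i + 1, n) if R[i][j] > 0]
    return R, edges

# Toy vectors for three hypothetical drugs.
toy_vectors = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
]
R, edges = build_drnet(toy_vectors)
```

In this toy case only the first two drugs share substructures, so DrNet contains a single weighted edge between them.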

Disease Network Construction
Disease similarities play an important role in disease network construction. Wang et al. [26] used the MeSH disease terms of each disease to calculate its semantic value. The semantic similarity of any two diseases was then calculated from their semantic values. The more annotation terms two diseases have in common, the higher their semantic similarity.
DisNet consists of all pairs of diseases with similarity values greater than 0. The weight of each edge was set to the similarity between the two diseases it connects. The matrix $D \in \mathbb{R}^{N_d \times N_d}$ denotes DisNet, where $D_{ij}$ is the similarity between diseases $d_i$ and $d_j$ and $N_d$ is the number of diseases.

Edges between DrNet and DisNet
We considered the known associations between drugs and diseases as the edges that connect the corresponding nodes in DrNet and DisNet. The edge set was represented as $A \in \mathbb{R}^{N_r \times N_d}$, where each row represents a drug and each column represents a disease. $A_{ij}$ is 1 when drug $r_i$ has a known association with disease $d_j$ and 0 when no association has been observed between $r_i$ and $d_j$.
Finally, the heterogeneous drug-disease network DrDisNet was constructed by connecting DrNet and DisNet via known drug-disease associations (Figure 1). To illustrate the subsequent methods concisely, we assume that $N_r = 5$ and $N_d = 4$.
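The association matrix for the toy network can be sketched as follows. The four association pairs are the ones mentioned in the worked example of the Embedding Layer section (drug 1 with diseases 2 and 3, drugs 3 and 5 with disease 4); treating them as the complete set of known associations is an assumption made only for illustration.

```python
N_r, N_d = 5, 4  # toy sizes assumed in the text

# Known associations as 1-based (drug, disease) index pairs.
# Assumption: these four pairs are the whole toy association set.
known_pairs = [(1, 2), (1, 3), (3, 4), (5, 4)]

# A[i][j] is 1 when drug i+1 has a known association with disease j+1.
A = [[0] * N_d for _ in range(N_r)]
for drug, disease in known_pairs:
    A[drug - 1][disease - 1] = 1

# Transpose of A: row j lists which drugs are associated with disease j+1.
A_T = [list(column) for column in zip(*A)]
```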

Prediction Model Based on CNN and BiLSTM Module
We propose a novel prediction model based on a CNN and BiLSTM, named CBPred, which is shown in Figure 2. The convolutional module on the left of CBPred learns the association representation from the original features of a node pair $(r_i, d_j)$. Additionally, because the paths from $r_i$ to $d_j$ also reflect the tendency of $r_i$ and $d_j$ to be associated, a BiLSTM module on the right integrates topological information into a path representation.

Embedding Layer
Feature matrix of drug and disease for the CNN module. Normally, the more consistent the similarities of a drug are with the associations of a disease, the more likely the drug and the disease are to be associated, and vice versa. Therefore, we stacked the similarities between drug nodes on top of the associations between drug and disease nodes, as shown on the left side of the feature matrix.
We use drug $r_1$ and disease $d_4$ as an example to illustrate the integration process (Figure 3). The first row of the drug similarity matrix $R$ gives the similarities between $r_1$ and the other drugs, and the fourth row of $A^T$ gives the associations between the drugs and $d_4$. Because $r_1$ is similar to $r_4$ and $r_5$, and $r_3$ and $r_5$ are both related to $d_4$, $r_1$ is likely to be involved in the disease process of $d_4$.
Similarly, the more consistent the relationships of $r_1$ and $d_4$ with each disease are, the higher their propensity for association. $r_1$ is associated with $d_2$ and $d_3$, while $d_4$ is similar to $d_1$ and $d_3$; thus, $r_1$ may be associated with $d_4$. Based on this information, we integrated the first row of $A$ and the fourth row of $D$, as shown in the right part of the feature matrix. The final integration result is the feature matrix $F \in \mathbb{R}^{2 \times (N_r + N_d)}$, whose first and second rows are the feature embeddings of the drug and the disease, respectively.
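The integration of the rows of $R$, $A$, $A^T$, and $D$ into the feature matrix can be sketched as below. The function name `build_feature_matrix` is hypothetical, and the numeric similarity values are invented for the toy network with 5 drugs and 4 diseases; only the pattern of non-zero entries follows the worked example.

```python
def build_feature_matrix(R, A, D, s, t):
    """Assemble the 2 x (N_r + N_d) feature matrix for the pair (r_s, d_t).

    s and t are 0-based indices. Row 0 embeds the drug, row 1 the disease.
    """
    A_T = [list(column) for column in zip(*A)]
    drug_row = list(R[s]) + list(A[s])        # similarities of r_s, then its associations
    disease_row = list(A_T[t]) + list(D[t])   # drugs associated with d_t, then similarities of d_t
    return [drug_row, disease_row]

# Toy matrices; the similarity values are made up for illustration.
R = [[1.0, 0.0, 0.0, 0.6, 0.7],
     [0.0, 1.0, 0.2, 0.0, 0.0],
     [0.0, 0.2, 1.0, 0.0, 0.3],
     [0.6, 0.0, 0.0, 1.0, 0.0],
     [0.7, 0.0, 0.3, 0.0, 1.0]]
A = [[0, 1, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0],
     [0, 0, 0, 1]]
D = [[1.0, 0.0, 0.0, 0.5],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.4],
     [0.5, 0.0, 0.4, 1.0]]

# Feature matrix for the pair (r_1, d_4), i.e. 0-based indices (0, 3).
F = build_feature_matrix(R, A, D, 0, 3)
```

The left five columns of `F` hold the drug-side information (row 1 of $R$ over row 4 of $A^T$) and the right four columns hold the disease-side information (row 1 of $A$ over row 4 of $D$).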
Path sequence features for the BiLSTM module. It is well known that two very similar drugs are likely to be involved in similar disease processes. For example, in the path $r_1$-$r_5$-$d_4$, $r_1$ is similar to $r_5$ and $r_5$ is associated with $d_4$, which suggests an association between $r_1$ and $d_4$. By similar logic, because $d_3$ is similar to $d_4$ and $r_1$ is associated with $d_3$, $d_4$ may be treated by $r_1$; this gives a second path, $r_1$-$d_3$-$d_4$. Finally, we enumerate the paths from the starting node $r_s$ to the end node $d_t$ in the network to obtain the path set $P^{(s,t)} \in \mathbb{R}^{N_{path} \times 1 \times (N_r + N_d)}$, where $N_{path}$ is the number of paths between nodes $r_s$ and $d_t$, and the $i$-th path sequence in $P^{(s,t)}$ is denoted $p_i$. $P^{(1,4)}$ is input into the bidirectional LSTM module as the path feature of the pair $(r_1, d_4)$ to learn the representation at the path level.
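A small sketch of the path enumeration follows. It assumes that only three-node paths of the two kinds shown in the example are collected (drug-drug-disease and drug-disease-disease); the text does not state the maximum path length, so this restriction, the function name `enumerate_paths`, and the toy similarity values are illustrative assumptions.

```python
def enumerate_paths(R, A, D, s, t):
    """List three-node paths from drug r_s to disease d_t (0-based s and t)."""
    paths = []
    # r_s - r_k - d_t: r_s is similar to r_k, and r_k is associated with d_t.
    for k in range(len(R)):
        if k != s and R[s][k] > 0 and A[k][t] == 1:
            paths.append((f"r{s + 1}", f"r{k + 1}", f"d{t + 1}"))
    # r_s - d_k - d_t: r_s is associated with d_k, and d_k is similar to d_t.
    for k in range(len(D)):
        if k != t and A[s][k] == 1 and D[k][t] > 0:
            paths.append((f"r{s + 1}", f"d{k + 1}", f"d{t + 1}"))
    return paths

# Toy network with 5 drugs and 4 diseases; similarity values are made up.
R = [[1.0, 0.0, 0.0, 0.6, 0.7],
     [0.0, 1.0, 0.2, 0.0, 0.0],
     [0.0, 0.2, 1.0, 0.0, 0.3],
     [0.6, 0.0, 0.0, 1.0, 0.0],
     [0.7, 0.0, 0.3, 0.0, 1.0]]
A = [[0, 1, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0],
     [0, 0, 0, 1]]
D = [[1.0, 0.0, 0.0, 0.5],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.4],
     [0.5, 0.0, 0.4, 1.0]]

paths = enumerate_paths(R, A, D, 0, 3)
```

On this toy network the function returns the two paths discussed in the text, $r_1$-$r_5$-$d_4$ and $r_1$-$d_3$-$d_4$.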

Convolutional Module on the Left
The feature matrix $F$ is fed into the convolutional module to learn a latent original representation of the node pair $(r_1, d_4)$ (Figure 4). To capture the boundary information of $F$, we first pad $F$ to obtain $P_{conv} \in \mathbb{R}^{(2 + 2p_{conv}) \times (N_r + N_d + 2p_{conv})}$, where $p_{conv}$ is the number of padding layers around $F$. For the first convolution layer, to apply the filter operators to feature areas of size $w_h \times w_w$, we set the filter size to $(w_h, w_w)$.
Next, we obtain the feature map $Z_1$, where $N_{conv}$ is the number of filters. We use the subscript of the first element covered by the filter in $P_{conv}$ as the filter position. For example, $W_{conv}(i, j, k)$ indicates that the $k$-th filter starts at the feature area at the $i$-th row and $j$-th column of $P_{conv}$. The area and process of convolution are defined as follows:

$$Z_1(i, j, k) = g\big(W_{conv}(i, j, k) * P_{conv}(i : i + w_h - 1,\; j : j + w_w - 1) + b_{conv}(k)\big)$$

where $Z_1(i, j, k)$ is the first convolution output when the $k$-th filter has slid to the $i$-th row and $j$-th column of $P_{conv}$, $g$ is a nonlinear activation function (rectified linear unit, ReLU), and $b_{conv}$ is a bias vector. To integrate features and reduce the number of parameters, we use average pooling to compress $Z_1$ in the pooling layer. With a pooling window of size $a \times b$, we obtain $Q_1$, which also has $N_{conv}$ channels. We then use $Q_1$ as the input to the second convolution layer and, after the second average pooling, obtain a similar output $q$ whose first dimension is 1 and which has $N_{conv}$ channels. $q$ is then flattened to obtain the original representation of the node pair $(r_1, d_4)$, denoted $v_n$:

$$v_n = \mathrm{flatten}(q)$$
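The padding, convolution, ReLU, and average pooling steps can be illustrated with a single filter and a single convolution layer. This is a simplified sketch, not the model itself: it assumes stride 1, non-overlapping pooling windows, one filter instead of $N_{conv}$, and a hand-picked averaging filter instead of learned weights.

```python
def pad(matrix, p):
    """Zero-pad a 2-D list with p layers on every side."""
    width = len(matrix[0]) + 2 * p
    blank = [[0.0] * width for _ in range(p)]
    middle = [[0.0] * p + list(row) + [0.0] * p for row in matrix]
    return [row[:] for row in blank] + middle + [row[:] for row in blank]

def conv2d_relu(P, W, bias):
    """Slide one w_h x w_w filter W over P with stride 1, then apply ReLU."""
    w_h, w_w = len(W), len(W[0])
    out = []
    for i in range(len(P) - w_h + 1):
        row = []
        for j in range(len(P[0]) - w_w + 1):
            z = sum(W[m][n] * P[i + m][j + n]
                    for m in range(w_h) for n in range(w_w)) + bias
            row.append(max(0.0, z))
        out.append(row)
    return out

def avg_pool(Z, a, b):
    """Non-overlapping a x b average pooling; incomplete windows are dropped."""
    out = []
    for i in range(0, len(Z) - a + 1, a):
        out.append([
            sum(Z[i + m][j + n] for m in range(a) for n in range(b)) / (a * b)
            for j in range(0, len(Z[0]) - b + 1, b)
        ])
    return out

def flatten(Q):
    return [value for row in Q for value in row]

# Toy 2 x 9 feature matrix for (r_1, d_4); the values are made up.
F = [[1.0, 0.0, 0.0, 0.6, 0.7, 0.0, 1.0, 1.0, 0.0],
     [0.0, 0.0, 1.0, 0.0, 1.0, 0.5, 0.0, 0.4, 1.0]]

P_conv = pad(F, 1)                        # 4 x 11 after one padding layer
W = [[0.25, 0.25], [0.25, 0.25]]          # hypothetical 2 x 2 filter
Z1 = conv2d_relu(P_conv, W, bias=0.0)     # 3 x 10 feature map
Q1 = avg_pool(Z1, 3, 2)                   # 1 x 5 after 3 x 2 pooling
v_n = flatten(Q1)
```

In the full model this stage is repeated a second time on $Q_1$ before flattening, and each layer applies $N_{conv}$ learned filters rather than one fixed filter.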

BiLSTM Module on the Right
The LSTM module controls the information flow through a gate mechanism, while the BiLSTM module learns the context representation of the input sequence from a forward LSTM and a backward LSTM [27,28]. The previously obtained path set $P^{(1,4)}$ is fed into the BiLSTM module on the right to learn the path representation of $r_1$ and $d_4$ (Figure 5).
The forward LSTM unit has three gates, the forget gate $f^f_{ij}$, input gate $i^f_{ij}$, and output gate $o^f_{ij}$, which control how much information from the path sequence should be forgotten, input, and output, respectively. The three gates are defined as follows:

$$f^f_{ij} = \sigma\big(W^f_f \, (h^f_{i(j-1)} \oplus x_{ij}) + b^f_f\big)$$
$$i^f_{ij} = \sigma\big(W^f_i \, (h^f_{i(j-1)} \oplus x_{ij}) + b^f_i\big)$$
$$o^f_{ij} = \sigma\big(W^f_o \, (h^f_{i(j-1)} \oplus x_{ij}) + b^f_o\big)$$

where $\sigma$ is the sigmoid activation function and $\oplus$ is the concatenation operator. The superscript $f$ indicates a parameter of the forward LSTM unit; for example, $W^f_g$ and $b^f_g$ are the weight matrix and bias vector of gate $g$ in the forward unit. $x_{ij}$ is the embedding of the $j$-th node of the $i$-th path $p_i$ in the path set $P^{(1,4)}$, and $h^f_{i(j-1)}$ is the hidden state of the previous node.
The forward LSTM linearly integrates the candidate state $\hat{c}^f_{ij}$ of $x_{ij}$ with the state $c^f_{i(j-1)}$ of the previous node: $f^f_{ij}$ determines how much information in $c^f_{i(j-1)}$ is retained, and $i^f_{ij}$ determines how much information in $\hat{c}^f_{ij}$ is accepted. This gives the state of the sequence consisting of the 1st to $j$-th nodes of $p_i$:

$$c^f_{ij} = f^f_{ij} \odot c^f_{i(j-1)} + i^f_{ij} \odot \hat{c}^f_{ij}$$

where $\odot$ is the element-wise product operator. The candidate state $\hat{c}^f_{ij}$ of $x_{ij}$ is obtained by jointly considering the information from the previous node and $x_{ij}$:

$$\hat{c}^f_{ij} = \tanh\big(W^f_c \, (h^f_{i(j-1)} \oplus x_{ij}) + b^f_c\big)$$

where $W^f_c$ and $b^f_c$ are the weight matrix and bias vector of the candidate state. The hidden state is then

$$h^f_{ij} = o^f_{ij} \odot \tanh(c^f_{ij})$$

where $h^f_{ij}$ is the forward path representation of the 1st to $j$-th nodes of $p_i$. We take the hidden state $h^f_{il}$ of the last node as the representation of $p_i$, where $l$ is the length of $p_i$. The reversed sequence $p^b_i$ of $p_i$ is then input into a structurally similar backward LSTM module to obtain a backward representation $h^b_{il}$ of $p^b_i$; the superscript $b$ indicates a parameter of the backward LSTM module. Thus, the representation of the $i$-th path in the bidirectional LSTM module is given by the following formula:

$$h_i = h^f_{il} \oplus h^b_{il}$$
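The gate mechanism can be traced with a small standard LSTM cell written in plain Python. This is a sketch under several assumptions: the weights are random and untrained, the hidden size of 3 is arbitrary, each node is embedded as a one-hot indicator over the 9 nodes of the toy network (the actual node embedding may differ), and the forward and backward outputs are combined by concatenation.

```python
import math
import random

def affine(W, b, z):
    """Compute W z + b for a weight matrix W (list of rows) and bias b."""
    return [sum(w * v for w, v in zip(row, z)) + b_k for row, b_k in zip(W, b)]

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

def init_params(input_dim, hidden, rng):
    """Random (untrained) weights for gates f, i, o and candidate state c."""
    def layer():
        W = [[rng.uniform(-0.1, 0.1) for _ in range(hidden + input_dim)]
             for _ in range(hidden)]
        return W, [0.0] * hidden
    return {name: layer() for name in ("f", "i", "o", "c")}

def lstm_last_hidden(path, params, hidden):
    """Run one LSTM direction over a path and return the last hidden state."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x in path:
        z = h + list(x)                                    # previous hidden state joined with x
        f = sigmoid(affine(*params["f"], z))               # forget gate
        i = sigmoid(affine(*params["i"], z))               # input gate
        o = sigmoid(affine(*params["o"], z))               # output gate
        c_hat = [math.tanh(v) for v in affine(*params["c"], z)]  # candidate state
        c = [f_k * c_k + i_k * ch_k for f_k, c_k, i_k, ch_k in zip(f, c, i, c_hat)]
        h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h

def bilstm_path_representation(path, forward, backward, hidden):
    """Join the last forward state with the last state of the reversed path."""
    h_f = lstm_last_hidden(path, forward, hidden)
    h_b = lstm_last_hidden(path[::-1], backward, hidden)
    return h_f + h_b

def one_hot(index, size):
    return [1.0 if k == index else 0.0 for k in range(size)]

rng = random.Random(0)
hidden, input_dim = 3, 9
forward = init_params(input_dim, hidden, rng)
backward = init_params(input_dim, hidden, rng)

# Path r_1 - r_5 - d_4: drugs occupy indices 0..4, diseases indices 5..8.
path = [one_hot(0, 9), one_hot(4, 9), one_hot(8, 9)]
h_i = bilstm_path_representation(path, forward, backward, hidden)
```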

Attention Mechanism at Path Level
From the perspective of $P^{(1,4)}$, not all paths contribute equally to the association prediction of $r_1$ and $d_4$. An attention mechanism at the path level was introduced to extract the paths that are important for the association between the drug and the disease [29]. This yields:

$$\alpha_i = \frac{\exp(u_p^T u_i)}{\sum_{j} \exp(u_p^T u_j)}, \qquad v_p = \sum_{i} \alpha_i h_i$$

where $u_i$ is a hidden representation of $h_i$. The path-level context vector $u_p$ attempts to summarize which paths in $P^{(1,4)}$ contribute strongly to the association between $r_1$ and $d_4$, and $u_p^T$ is the transpose of $u_p$. We measured the importance of $p_i$ in $P^{(1,4)}$ by the similarity between $u_i$ and $u_p$ and obtained the attention weight $\alpha_i$ through the softmax function. $v_p$ is a path vector, the weighted sum of all information from the path set $P^{(1,4)}$ based on the attention weights and path representations.
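A minimal sketch of path-level attention is given below. It assumes that the hidden representation $u_i$ comes from a single tanh layer applied to $h_i$, which is a common choice but is not confirmed by the text; the weight matrix `W`, bias `b`, and context vector `u_p` are hand-picked toy values rather than learned parameters.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]

def path_attention(H, W, b, u_p):
    """Return attention weights alpha and the path vector v_p.

    H is a list of path representations h_i.
    Assumption: u_i = tanh(W h_i + b).
    """
    U = [[math.tanh(sum(w * h for w, h in zip(row, h_i)) + b_k)
          for row, b_k in zip(W, b)]
         for h_i in H]
    # Similarity between each u_i and the context vector u_p.
    scores = [sum(a * c for a, c in zip(u_i, u_p)) for u_i in U]
    alpha = softmax(scores)
    # Weighted sum of the path representations.
    v_p = [sum(alpha[i] * H[i][k] for i in range(len(H)))
           for k in range(len(H[0]))]
    return alpha, v_p

# Two toy path representations.
H = [[1.0, 0.0], [0.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
u_p = [1.0, 0.0]
alpha, v_p = path_attention(H, W, b, u_p)
```

Because `u_p` points along the first coordinate, the first path receives the larger attention weight in this toy setting.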

Combined Strategy
The original representation $v_n$ and the path representation $v_p$ are both high-level representations of $r_1$ and $d_4$ and can be used as features for association classification. We therefore projected $v_n$ and $v_p$ into association distributions over $C$ classes via softmax layers and chose the cross-entropy loss to measure the error between the known association distribution and each predicted distribution:

$$loss_n = -\sum_{t \in T} \sum_{c=1}^{C} p^g_c(t) \log s_{n,c}(t), \qquad loss_p = -\sum_{t \in T} \sum_{c=1}^{C} p^g_c(t) \log s_{p,c}(t)$$

where $t$ is a node pair in the training set $T$, $p^g_c(t)$ is the $c$-th entry of the one-hot label of $t$, and $s_n(t)$ and $s_p(t)$ are the predicted scores of $t$ from the CNN and BiLSTM modules, respectively, with $s_{n,c}(t)$ and $s_{p,c}(t)$ denoting their $c$-th entries. We used the Adam optimization algorithm to optimize the objective function [30]. We designed a combined strategy so that the model makes full use of $v_n$ and $v_p$: a hyperparameter $\lambda$ controls the contributions of the original representation and the path representation of a node pair to the final predicted score.
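The loss and the combined score can be sketched as follows. The cross-entropy is the standard definition for a one-hot label. The convex combination `lam * s_n + (1 - lam) * s_p` is an assumption about how $\lambda$ weights the two modules, since the exact formula is not given in this section; the score values are toy numbers.

```python
import math

def cross_entropy(one_hot_label, predicted):
    """Cross-entropy between a one-hot label and a predicted distribution."""
    return -sum(p * math.log(q)
                for p, q in zip(one_hot_label, predicted) if p > 0)

def combined_score(s_n, s_p, lam):
    """Assumed combination: weight lam on the CNN score, 1 - lam on the BiLSTM score."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(s_n, s_p)]

# Toy predicted distributions over C = 2 classes (not associated, associated).
s_n = [0.2, 0.8]   # hypothetical output of the CNN module
s_p = [0.4, 0.6]   # hypothetical output of the BiLSTM module
label = [0, 1]     # one-hot label of a known association

loss_n = cross_entropy(label, s_n)
loss_p = cross_entropy(label, s_p)
final = combined_score(s_n, s_p, lam=0.5)
```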

Evaluation Metrics
We performed five-fold cross-validation 20 times to evaluate the performance of our prediction method and averaged the results [31,32]. First, the known associated drug-disease pairs were treated as positive samples and randomly divided into five subsets. The remaining pairs were considered negative samples. Because the positive samples were far fewer than the negative samples in our dataset (approximately 1 to 169), we randomly sampled a matching number of non-associated pairs and divided them into five subsets to reduce the impact of class imbalance on the prediction results. In each fold, we used four positive and four negative subsets as the training set for model training and the remaining positive samples as the testing set for performance evaluation. A higher rank for the positive samples indicates better prediction performance.
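The sampling scheme can be sketched as follows. The function name `balanced_folds` and the toy pairs are hypothetical. How the held-out negatives enter the testing set is not fully specified in the text, so the sketch simply returns them alongside the held-out positives.

```python
import random

def balanced_folds(positives, negatives, n_folds=5, seed=0):
    """Split positives and an equally sized negative sample into n_folds.

    Returns one (train, held_out_positives, held_out_negatives) tuple per
    fold, where train is a list of (pair, label) tuples.
    """
    rng = random.Random(seed)
    pos = list(positives)
    rng.shuffle(pos)
    # Undersample the negatives to match the number of positives.
    neg = rng.sample(list(negatives), len(pos))
    pos_folds = [pos[k::n_folds] for k in range(n_folds)]
    neg_folds = [neg[k::n_folds] for k in range(n_folds)]
    splits = []
    for k in range(n_folds):
        train = [(pair, 1) for j in range(n_folds) if j != k
                 for pair in pos_folds[j]]
        train += [(pair, 0) for j in range(n_folds) if j != k
                  for pair in neg_folds[j]]
        splits.append((train, pos_folds[k], neg_folds[k]))
    return splits

# Toy data: 10 known pairs and 190 unobserved pairs (made up).
positives = [(r, d) for r in range(5) for d in range(2)]
negatives = [(r, d) for r in range(5) for d in range(2, 40)]
splits = balanced_folds(positives, negatives)
```

Repeating the call with 20 different seeds and averaging the per-fold metrics would mirror the repeated cross-validation described above.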
A disease whose score is higher than the threshold $\theta$ is identified as a positive sample; otherwise, it is identified as a negative sample. The true-positive rates (TPRs) and false-positive rates (FPRs) under various $\theta$ can be calculated as follows:

$$TPR = \frac{TP}{TP + FN}, \qquad FPR = \frac{FP}{FP + TN}$$

where TP (true-positive) and TN (true-negative) are the numbers of positive and negative samples that were correctly identified, while FN (false-negative) and FP (false-positive) are the numbers of positive and negative samples that were misidentified [33]. The receiver operating characteristic (ROC) curve can be drawn from the TPR and FPR under each $\theta$ [34]. A ROC curve was constructed for each drug, and the area under the ROC curve (AUC) was used to evaluate the predictive performance of the method for that drug [35,36]. The average AUC over all drugs is taken as the overall performance of the prediction model.
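The ROC construction for a single drug can be sketched as follows. The scores and labels are invented, and the thresholds are taken at the observed score values (a sample counts as positive when its score is at least the threshold), which is one common convention.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points obtained by sweeping the threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for theta in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= theta and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= theta and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under a piecewise-linear curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Candidate diseases of one hypothetical drug: predicted score and true label.
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
labels = [1, 0, 1, 0, 0]

points = roc_points(scores, labels)
area = auc(points)
```

For this toy ranking, five of the six positive-negative pairs are ordered correctly, so the AUC is 5/6.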
However, in most cases of class imbalance, the precision-recall (P-R) curve is more informative than the ROC curve [37]. Precision is the proportion of true-positive samples among all samples identified as positive, and recall is the proportion of true-positives among the samples with known associations [38]. Therefore, we used the P-R curve as another measurement to evaluate the performance of each method. The area under the P-R curve (AUPR) is another evaluation metric that focuses on true-positive samples [39]. Precision and recall are defined as follows:

$$Precision = \frac{TP}{TP + FP}, \qquad Recall = \frac{TP}{TP + FN}$$

Additionally, biologists typically select the top part of the predictive result for further validation in wet-lab experiments. Thus, the recall rates of the top $k$ candidate drug-related diseases are more important because they reveal the number of successfully identified positive samples. We calculated the recall rates of the top $k$ candidates to demonstrate the performance of each method on the top rankings of the predictive result.
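Precision, recall, and the top-$k$ recall can be computed as in the sketch below. The scores and labels are invented for one hypothetical drug, and returning a precision of 1.0 when nothing is predicted positive is a convention chosen here, not something stated in the text.

```python
def precision_recall(scores, labels, theta):
    """Precision and recall when scores >= theta are predicted positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= theta and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= theta and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < theta and y == 1)
    precision = tp / (tp + fp) if (tp + fp) > 0 else 1.0
    recall = tp / (tp + fn)
    return precision, recall

def recall_at_k(scores, labels, k):
    """Fraction of all positives that appear among the k top-ranked candidates."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:k])
    return hits / sum(labels)

# Candidate diseases of one hypothetical drug: predicted score and true label.
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
labels = [1, 0, 1, 0, 0]

precision, recall = precision_recall(scores, labels, theta=0.7)
top1 = recall_at_k(scores, labels, 1)
top3 = recall_at_k(scores, labels, 3)
```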

Comparison with Other Methods
To evaluate the performance of CBPred, we compared this method with a series of state-of-the-art methods for predicting associations between drugs and diseases, including MBiRW [15], LRSSL [1], SCMFDD [18], and HGBI [16].
As shown in Figure 6a, CBPred achieved the best average performance over the 763 drugs (AUC = 0.955). Specifically, the AUC of CBPred was 25.3% higher than that of HGBI, 23.2% higher than that of SCMFDD, 12.7% higher than that of MBiRW, and 12.4% higher than that of LRSSL. We also show the predictive results for 15 well-characterized drugs in Table 1; CBPred achieved the best performance for 12 of them. Both CBPred and LRSSL not only consider node attributes based on node similarities but also extract topological information from the drug-disease heterogeneous network. Thus, CBPred and LRSSL achieved the best and second-best performances, respectively. Luo et al. constructed MBiRW, a model based on random walk with restart, for predicting associations between drugs and diseases. It focuses on the topological information of the networks and ignores node attributes. Additionally, the restart probability is difficult to determine, which may result in insufficient global topological information or excessive noise; the performance of MBiRW was therefore worse than that of the second-best method, LRSSL. Zhang et al. applied a matrix factorization-based model, SCMFDD, for predicting novel associations, which relies on the adjacency matrices of the heterogeneous network. However, reducing the dimension of the feature vectors may lose potential information. Thus, SCMFDD performed worse than MBiRW but better than HGBI. Overall, HGBI showed lower performance than the other methods because it depends too heavily on the similarities of drugs and diseases.
The precision-recall curves of each method are shown in Figure 6b. The average AUPR of CBPred was greater than those of all the other methods (AUPR = 0.182). CBPred achieved a 17.0%, 16.9%, 13.7%, and 7.5% higher AUPR than HGBI, SCMFDD, MBiRW, and LRSSL, respectively. As shown in Table 2, CBPred showed the best performance for 12 of the 15 well-characterized drugs.
Table 2. Prediction results of CBPred and four other methods for 15 drugs in terms of the area under the precision-recall curve (AUPR).
A Wilcoxon test on the prediction results of the 763 drugs showed that CBPred significantly outperformed the other methods [40][41][42]. At a p-value threshold of 0.05, CBPred performed better in terms of both AUCs and AUPRs (Table 3).
Among the top k-ranked candidates, a higher recall rate indicates that more drug-associated diseases were correctly identified. CBPred consistently outperformed the other methods under different k values, as shown in Figure 7, with recall rates of 76.38% in the top 30, 85.78% in the top 60, and 92.54% in the top 120. Zhang's method, SCMFDD, showed results very similar to those of Wang's method, HGBI, for most recall rates: the former reached 27.97%, 41.75%, and 55.82% in the top 30, 60, and 120, respectively, while the latter reached 25.70%, 37.39%, and 51.57%. The recall of LRSSL was higher than that of MBiRW up to the top 120, after which MBiRW surpassed it. This may be because LRSSL uses the k-nearest neighbors algorithm, which may make its predictions overly dependent on neighboring node information and make isolated nodes difficult to predict. Luo's method, MBiRW, captures the global information of the drug-disease network and the local topology of each node through the random walk with restart algorithm, and therefore showed better results than LRSSL beyond that point.

In addition, to confirm the performance of CBPred from another perspective, we constructed a new drug-disease network where the disease similarities are calculated using disease ontology and disease-related genes according to Cheng's method [43]. The ROC and P-R curves of CBPred and other methods are shown in Supplementary Materials Figure S1. Our method, CBPred, still achieved the best performance under the new drug-disease network, which also illustrated that CBPred was effective when the disease ontology and disease-related genes were taken into account.

Case Studies of Five Drugs
To demonstrate the ability of CBPred to discover novel drug-disease associations, we conducted case studies of ciprofloxacin, ceftriaxone, ofloxacin, ampicillin, and levofloxacin and analyzed their top ten candidate diseases (Table 4).
The Comparative Toxicogenomics Database (CTD) records the impacts of chemicals (i.e., drugs) on human health; this information was manually collected and verified from published works. DrugBank records various attributes of each drug, including its associations with diseases. As shown in Table 4, 12 candidates are supported by direct evidence in CTD, and 9 candidates are recorded in DrugBank. These records indicate that the candidate diseases are treated with the corresponding drugs.
Clinical Trials is a database of clinical trials conducted worldwide. It provides access to information on ongoing and completed trials, with detailed patient descriptions, dosing regimens, and treatment outcomes. We used only records with a status of "Completed" as supporting material; these records indicate a therapeutic relationship between the drug and the candidate disease. PubChem is a public database of chemicals and their biological activities that is supported by the National Institutes of Health. Fifteen candidates were found in Clinical Trials and 11 in PubChem, which shows that these candidates are supported by clinical trials.
In addition to the manually verified drug-disease associations, CTD also contains associations inferred from the literature that have not yet been confirmed. Four candidates were found in the inferred part of CTD, which suggests that they are likely to be true associations. Direct or indirect evidence was found for all candidate diseases of the five drugs, indicating that CBPred can identify drug-disease association candidates with high reliability and accuracy.

Prediction of Novel Drug-Disease Associations
After evaluating the prediction performance of CBPred through five-fold cross-validation, case studies, and a Wilcoxon test, we applied CBPred to all drugs. All known drug-disease associations were used as the training set to train the prediction model. The high-confidence candidate diseases obtained for the drugs are listed in Supplementary Materials Table S1.

Conclusions
A novel method based on a CNN and BiLSTM, CBPred, was developed for predicting potential disease indications for drugs. The CNN module of CBPred captures the complex, non-linear relationships among the drug similarities, disease similarities, and drug-disease associations of a drug-disease pair. The BiLSTM module deeply integrates the path information. We also established an attention mechanism at the path level to distinguish the different contributions of the paths, which enhanced the prediction performance of CBPred. The experimental results showed that CBPred outperformed other state-of-the-art methods in terms of both AUCs and AUPRs. Case studies of five drugs confirmed the ability of CBPred to discover potential disease indications for drugs. CBPred is a prioritization tool that identifies reliable candidate drug-disease associations for subsequent biological validation in wet-lab experiments.
Supplementary Materials: The following are available online at http://www.mdpi.com/2073-4409/8/7/705/s1. Table S1: The top 10 potential candidates for 763 drugs. Figure S1: Two types of curves of CBPred and other methods under a new drug-disease network.