Inferring the Disease-Associated miRNAs Based on Network Representation Learning and Convolutional Neural Networks

Identification of disease-associated miRNAs (disease miRNAs) is critical for understanding etiology and pathogenesis. Most previous methods focus on integrating the similarity and association information contained in heterogeneous miRNA-disease networks. However, these methods establish only shallow prediction models that fail to capture the complex relationships among miRNA similarities, disease similarities, and miRNA-disease associations. We propose a method for predicting disease miRNAs, called CNNMDA, that is based on network representation learning and convolutional neural networks. CNNMDA deeply integrates the similarity information of miRNAs and diseases, miRNA-disease associations, and representations of miRNAs and diseases in a low-dimensional feature space. The new deep learning framework was built to learn the original and global representations of a miRNA-disease pair. First, diverse biological premises about miRNAs and diseases were combined to construct the embedding layer in the left part of the framework. Second, the various connection edges in the miRNA-disease network, such as similarity and association connections, depend on each other, so it was necessary to learn the low-dimensional representations of the miRNA and disease nodes based on the entire network. The right part of the framework learnt the low-dimensional representation of each miRNA and disease node based on non-negative matrix factorization, and these representations were used to establish the corresponding embedding layer. Finally, the left and right embedding layers were passed through convolutional modules to deeply learn the complex and non-linear relationships among the similarities and associations between miRNAs and diseases. Experimental results based on cross-validation indicated that CNNMDA yields superior performance compared to several state-of-the-art methods. 
Furthermore, case studies on lung, breast, and pancreatic neoplasms demonstrated the powerful ability of CNNMDA to discover potential disease miRNAs.


Introduction
MicroRNAs (miRNAs) are a class of endogenous small RNAs of approximately 20-24 nucleotides in length. miRNAs regulate gene expression in plants and animals after transcription [1][2][3]. Accumulating studies indicate that miRNAs are closely related to the development of human diseases [4][5][6][7]. Therefore, it is imperative to explore potential disease-associated miRNAs (disease miRNAs) in order to understand disease etiology and pathogenesis.
Disease miRNA prediction can provide reliable candidates for experimental research, and several methods have been proposed for predicting potential disease miRNAs. Mainstream methods can be roughly grouped into two categories. The first category primarily uses the regulatory relationships between miRNAs and their target mRNAs to predict potential miRNA-disease associations [8]. First, target genes related to miRNAs are obtained by analyzing base complementarity between the miRNA sequence and the putative target gene sequence. Then, using the interactions between the target genes and known disease-related genes, the potential disease miRNAs are predicted [9][10][11][12]. However, such methods are difficult to use because experimentally validated targets remain insufficient to date. Although more target genes have been obtained through some experiments [13,14], prediction results from these methods have a high false positive rate.
Methods belonging to the second category are based on the prior biological knowledge that miRNAs with similar functions are usually associated with similar diseases [15]. Because network medicine is the mainstream way of defining related diseases [16][17][18], some methods make full use of network topology to identify disease miRNAs [19,20]. Moreover, disease miRNAs have been identified by a random walk on a single miRNA similarity network [21,22]. However, these methods rely too heavily on known disease-associated miRNAs and are ineffective for new diseases that lack associated miRNAs. To address this drawback, disease similarity information and miRNA-disease associations were introduced to form miRNA-disease heterogeneous networks, where random walks on a two-layer network were used to predict candidate miRNA-disease associations [23,24]. Other methods calculate miRNA-disease correlation scores by using non-negative matrix factorization [25][26][27][28][29], by applying structural perturbation [30], by using transduction learning [31], by using the induction matrix [32], through binary network projection [33], or by extracting potential features that pertain to positive sample information [34]. However, there are complex, non-linear relationships among miRNA-miRNA, disease-disease, and miRNA-disease pairs, and previous methods struggle to extract such relationships.
In this study, we present a new approach based on convolutional neural networks for predicting miRNA-disease associations, called CNNMDA. It consists of a left part and a right part. The left part deeply integrates miRNA similarities, disease similarities, and miRNA-disease associations, and uses this prior biological knowledge to construct the left embedding layer of the miRNA-disease node pair. The right part uses network representation learning to obtain a latent low-dimensional representation of each network node while preserving the topology of the network. Integrating the low-dimensional features of miRNAs and diseases helps to estimate the likelihood of association between miRNAs and diseases at the global network level. We construct a deep learning framework based on convolutional neural networks (CNN) for the left and right parts, and learn the original representation and global representation of miRNA-disease node pairs. For some high-frequency diseases, CNNMDA identifies the associated miRNAs with high accuracy. Moreover, case studies on 3 diseases indicate that CNNMDA is able to discover potential disease-associated miRNAs.

Evaluation Metrics
To evaluate the performance of our prediction model, we performed 5-fold cross-validation on CNNMDA. In the miRNA-disease association data set, the known miRNA-disease associations are called positive samples, while the unknown associations are treated as negative samples. First, all positive samples were extracted and randomly divided into five subsets. Next, the same number of negative samples as positive samples was extracted, and these negative samples were also randomly divided into five subsets. In each round of cross-validation, four positive subsets and four negative subsets were used to train the prediction model, and the remaining positive subset and negative subset were used as test data to evaluate the prediction performance.
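The splitting procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `balanced_five_fold`, the random seed, and the toy association matrix are all hypothetical.

```python
import random

def balanced_five_fold(A, n_folds=5, seed=0):
    """Yield (train, test) lists of ((i, j), label) pairs.

    A is a 0/1 miRNA-disease association matrix (list of lists).
    Known associations are positives; unknown pairs are treated as negatives.
    """
    rng = random.Random(seed)
    positives = [(i, j) for i, row in enumerate(A) for j, a in enumerate(row) if a == 1]
    unknowns = [(i, j) for i, row in enumerate(A) for j, a in enumerate(row) if a == 0]
    rng.shuffle(positives)
    # Draw as many negatives as there are positives, in random order.
    negatives = rng.sample(unknowns, len(positives))
    # Split both sample sets into n_folds subsets.
    pos_folds = [positives[f::n_folds] for f in range(n_folds)]
    neg_folds = [negatives[f::n_folds] for f in range(n_folds)]
    for f in range(n_folds):
        test = [(p, 1) for p in pos_folds[f]] + [(n, 0) for n in neg_folds[f]]
        train = [(p, 1) for g in range(n_folds) if g != f for p in pos_folds[g]]
        train += [(n, 0) for g in range(n_folds) if g != f for n in neg_folds[g]]
        yield train, test

# Toy 4 x 5 association matrix (made-up values, not taken from HMDD).
A_toy = [
    [1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 1],
    [1, 0, 0, 0, 0],
]
folds = list(balanced_five_fold(A_toy))
```

Each test fold holds one positive subset and one negative subset of equal size, and the remaining four subsets of each kind form the training set.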
Given a threshold τ, a sample is predicted as positive when its prediction score is higher than τ, and as negative otherwise. Accordingly, the true positive rate (TPR) and false positive rate (FPR) are calculated as follows:

$$\mathrm{TPR} = \frac{TP}{TP + FN}, \qquad \mathrm{FPR} = \frac{FP}{TN + FP}$$

where TP and TN represent the number of positive and negative samples that are judged correctly, respectively. FN indicates the number of positive samples that are misidentified as negative samples, and FP represents the number of negative samples that are misidentified as positive samples. Different TPRs and FPRs are obtained at different thresholds. The obtained TPRs and FPRs can be plotted as a receiver operating characteristic (ROC) curve, and the area under the ROC curve (AUC) can be used as a criterion for evaluating prediction performance. There are only a few known miRNA-disease associations (positive samples), accounting for approximately 1/31 of all miRNA-disease pairs, so there is a serious imbalance between positive and negative samples. In this case, the precision-recall (PR) curve usually reflects more information than the ROC curve [35,36]. Precision is the proportion of correctly identified positive samples among all samples predicted as positive. Recall is the proportion of correctly identified positive samples among all positive samples. They are calculated as follows:

$$\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}$$

Similarly, precisions and recalls are calculated at different thresholds. Based on these values, the PR curve can be plotted and the area under the precision-recall curve (AUPR) can be calculated to evaluate the prediction performance of the model. In addition, biologists usually choose the top-ranked prediction results for experimental validation, so we calculated the average recall value for 15 diseases in the top k ∈ {30, 60, 90, . . . , 240} as another evaluation measure.
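To make the threshold-based metrics and the top-k recall concrete, here is a small sketch in plain Python. The helper names `rates` and `recall_at_k` and the six toy scores are illustrative assumptions, not values from the paper.

```python
def rates(scores, labels, tau):
    """Return TPR, FPR, precision and recall at threshold tau."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        predicted_positive = s > tau
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    # Convention (an assumption): precision is 1.0 when nothing is predicted positive.
    precision = tp / (tp + fp) if tp + fp else 1.0
    return {"TPR": tpr, "FPR": fpr, "precision": precision, "recall": tpr}

def recall_at_k(scores, labels, k):
    """Fraction of all positives that appear among the k highest-scored candidates."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits = sum(y for _, y in ranked[:k])
    total = sum(labels)
    return hits / total if total else 0.0

# Toy example: six candidates for one disease (made-up scores).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 0, 1, 1, 0, 0]
r = rates(scores, labels, tau=0.5)
top2 = recall_at_k(scores, labels, k=2)
```

Sweeping `tau` over the sorted scores yields the points of the ROC and PR curves; note that recall and TPR are the same quantity.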

Comparison with Other Methods
To evaluate the prediction performance of CNNMDA, we compared it with several state-of-the-art methods: DMPred [29], GSTRW [37], BNPMDA [33], and Liu's method [23]. In CNNMDA, the parameters $w_l$, $w_f$, and $w_p$ in the convolution operation were set to 3, 5, and 2, respectively. Thus, the convolution sliding window is $J \in \mathbb{R}^{3 \times 5}$ and the pooling sliding window is $F \in \mathbb{R}^{1 \times 2}$. The number of filters was set to 30 ($n_{conv} = 30$). The parameters $\alpha$, $\beta$, $\lambda_m$, and $\lambda_d$ used in the matrix factorization were selected from the set {0.2, 0.5, 0.8, 1, 2, 5, 8} by cross-validation. CNNMDA achieved the best performance when $\alpha = 0.2$, $\beta = 0.2$, $\lambda_m = 0.2$, and $\lambda_d = 0.2$. In addition, the parameter $\lambda$ in the formula that combines the left part and the right part was set to 0.4. For the other methods, the parameters were set to the values given in the original articles so that each method achieved its best performance.
As shown in Figure 1A and Table 1, CNNMDA achieved the best average performance over the 15 diseases (AUC of ROC curve = 0.968). DMPred's performance was the second best, with an AUC of 0.918, 5% lower than that of CNNMDA. The AUC values of BNPMDA and Liu's method were 0.838 and 0.870, which were 13% and 9.8% lower than CNNMDA, respectively. GSTRW had the lowest AUC, only 0.816, which was 15.2% lower than CNNMDA. GSTRW displayed poor performance because it uses only miRNA and disease similarity information. Liu's method and BNPMDA fully capture the information of the network topology, and DMPred improves performance by integrating multiple sources of effective information. CNNMDA, by deeply learning the original and global representations of miRNA-disease node pairs, achieved the best prediction performance. CNNMDA also obtained the best AUC for each disease.
As shown in Figure 1B and Table 2, we obtained the average AUPR of all the methods over the 15 diseases and plotted the corresponding PR curves. The average AUPR of CNNMDA over the 15 diseases was also significantly higher than that of the other methods. Compared with GSTRW, BNPMDA, Liu's method, and DMPred, CNNMDA displayed AUPR increases of 43.9%, 28.9%, 27.7%, and 24%, respectively. Moreover, CNNMDA achieved the best AUPR in 13 of the 15 diseases.

In addition, to further verify the superior performance of our method compared with other methods, we applied a commonly used method called a paired t-test. After calculation, the p-values of all paired t-test results were less than 0.05 (Table 3), indicating that the performance of CNNMDA is significantly better than other methods.
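As a rough illustration of the paired t-test, the sketch below computes the test statistic from per-disease scores of two methods. The AUC values are made-up placeholders rather than the paper's results, and `paired_t_statistic` is a hypothetical helper. Turning the statistic into a p-value requires the t distribution, for example via `scipy.stats.ttest_rel`.

```python
import math
import statistics

def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for paired samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return mean_d / (sd_d / math.sqrt(n)), n - 1

# Hypothetical per-disease AUCs for two methods (not taken from Table 1).
auc_method_a = [0.95, 0.97, 0.96, 0.98, 0.94]
auc_method_b = [0.90, 0.93, 0.91, 0.95, 0.92]
t_stat, dof = paired_t_statistic(auc_method_a, auc_method_b)
```

With 4 degrees of freedom, a two-sided test at the 0.05 level rejects the null hypothesis when |t| exceeds about 2.776.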
A higher recall rate means that more positive samples have been successfully identified in the top k candidate list, which further indicates the superiority of a model's prediction performance. Therefore, we calculated the average recall rate for all methods over the 15 diseases.

Case Studies of Lung Neoplasms, Breast Neoplasms, and Pancreatic Neoplasms
To demonstrate CNNMDA's ability to discover potential candidate disease miRNAs, we applied our method to case studies of lung, breast, and pancreatic neoplasms. Because of space limitations, we focused on analyzing the candidates for lung neoplasms and listed the top 50 candidate miRNAs in detail (Table 4). For the other two diseases, we briefly analyzed the top 50 candidates, which are listed in Supplementary Table S1 and Supplementary Table S2, respectively. To ensure the reliability of the prediction results, we first verified our predictions against four public databases: dbDEMC [38], PhenomiR [39], miRCancer [40], and TCGA [41]. dbDEMC collects miRNAs with abnormal expression in different cancers, where miRNAs with significantly different expression levels in cancer compared with normal tissues were retrieved and statistically analyzed through the "Significance Analysis of Microarrays" method. Similarly, PhenomiR consists of dysregulated miRNAs associated with diseases. miRCancer provides a comprehensive collection of miRNA expression profiles in a variety of human cancers that are automatically extracted from published literature. TCGA sequenced the entire genome of some neoplasms, including at least 6000 candidate genes and microRNA sequences, and stores genomic characterization and sequence analysis of different tumor types. Since lung cancer is one of the most frequent cancers at present, we took lung neoplasms as an example and analyzed the top 50 candidate miRNAs in detail (Table 4). Among them, dbDEMC contained 43 candidates, and 32 candidates were verified by PhenomiR, indicating that they have been confirmed to be upregulated or downregulated in lung neoplasms. In addition, 10 candidates are included in miRCancer, which further confirms their associations with the disease, and 7 miRNAs are contained in TCGA, indicating their different expression levels between cancer and normal tissues. 
The remaining 7 candidates were verified by the literature, where 5 miRNAs were confirmed to be dysregulated in lung neoplasms compared with normal tissue [42][43][44][45][46]. miR-15a is involved in the regulation of non-small cell lung cancer and controls cell cycle progression in a synergistic and Rb-dependent manner [47], while miR-374a was confirmed to have different effects at different stages of lung cancer [48].
Among the top 50 candidates for breast neoplasms (ST1), dbDEMC and PhenomiR included 46 and 33 candidates, respectively, whose expression levels varied significantly in breast tumors compared with the normal tissues. The miRCancer contained 22 candidates indicating their associations with breast neoplasms, and 3 candidates were confirmed by TCGA, which demonstrates their different expression levels in different biological states. The remaining 3 candidates were verified by the literature. Among them, miR-142 is upregulated in human breast cancer stem cells (BCSCs) as compared to the non-tumorigenic breast cancer cells [49]. In addition, miR-542 can be used to predict the prognosis of breast cancer patients based on the mRNA expression of target gene lymphocyte antigen 9 (LY9), resulting in the secretion of frizzled protein-related protein 1 (SFRP1) [50]. miR-30e has separately been identified as an independent subtype-specific prognostic marker in breast cancer [51].
The top 50 pancreatic neoplasm candidates are listed in ST2, where 45 and 34 candidates are contained in dbDEMC and PhenomiR, respectively. There are 19 candidates in miRCancer that are known to be associated with the disease. Moreover, TCGA comprises 3 candidates. Five other candidates were also confirmed by the literature [52,53], which reports their different regulatory effects on pancreatic tumors. Moreover, the downregulation of the tumor protein UNC51-like kinase 1 (ULK1) by miR-372 inhibits the survival of human pancreatic cancer cells [54], while miR-483 promotes cell proliferation by down-regulating its target gene Smad4 in pancreatic ductal adenocarcinoma (PDAC) cells. The three case studies provided above demonstrated the strong performance of CNNMDA in discovering potential disease-associated miRNAs [55].
Functional enrichment analysis of miRNAs is helpful in understanding the function of disease-related miRNAs. Some tools [56][57][58] can be used to analyze the association between the function of the potential disease-associated miRNAs and disease progression. Among these tools, TAM [57] is a convenient online tool (http://cmbi.bjmu.edu.cn/tam) that groups miRNAs into different sets according to various rules and provides investigators with the potential biological functions of a list of miRNAs. We performed functional enrichment analysis for the predicted top 50 potential disease-related miRNAs based on TAM. Here, we focused on the analysis of candidate miRNAs related to lung neoplasms (Figure 3). The results of the enrichment analysis of breast neoplasms and pancreatic neoplasms are listed in Supplementary Figures S1 and S2, respectively. Among the top 50 candidate miRNAs that relate to lung neoplasms, 12 miRNAs are involved in cell cycle-related functions, and 13 miRNAs are involved in human embryonic stem cell regulation functions. Furthermore, 9 miRNAs are concerned with apoptosis. 
In addition, 7, 7, and 6 miRNAs are related to cell proliferation, hormone regulation, and immune response, respectively. All the miRNA-related functions mentioned above have been confirmed to be closely related to the development of diseases. For instance, numerous studies have confirmed that cell cycle changes are closely related to cancer. When the normal cell cycle changes, the changes may lead to abnormal division of some cells in the body and further cause cancer [59,60]. Specifically, it has been confirmed that cell cycle regulators play an important role in lung neoplasms [61]. As for human embryonic stem cell regulation, some research indicates that these cells may be the origin of some solid tumors, including lung neoplasms, stomach neoplasms, and breast neoplasms [62,63]. Moreover, the metastasis of lung cancer may occur due to the dysregulation of some hormones in the human body [64], and the senescence of the immune system is a possible cause of lung cancer [65]. The other enriched functions associated with more miRNAs, such as apoptosis and cell proliferation, are related to the occurrence and development of diseases [66]. The above analysis can provide some insights into the putative roles of these candidates in lung neoplasms.

Dataset
We obtained miRNA-disease association data from the human miRNA-disease database (HMDD v2.0) [67]. The database has collected thousands of miRNA-disease associations that have been experimentally verified. There were 492 miRNAs and 329 diseases in the dataset of our study, which contained 5218 known associations between them. The disease terms we used were derived from the U.S. National Library of Medicine. In terms of diseases, phenotype similarities and the semantic similarities between them were extracted from related literature [68]. [69], it can be defined as $M_{12}$. miRNA similarities used in this study were calculated according to the above method. The similarities of the $N_m$ miRNAs are represented by the matrix $M \in \mathbb{R}^{N_m \times N_m}$, where each value $M_{ij}$ is between 0 and 1.

Disease Similarity Measure
Similarities between disease pairs can be judged by their semantics and phenotypes; in general, the more semantic terms and phenotypes two diseases share, the more likely they are to be similar. Accordingly, previous work calculated disease similarity based on the phenotypic and semantic information of the disease [29]. Disease similarities used in this study were obtained using Xuan's method. The similarities of the $N_d$ diseases are represented by the matrix $D \in \mathbb{R}^{N_d \times N_d}$, where each value $D_{ij}$ is also between 0 and 1.

miRNA-Disease Associations
We used the matrix $A \in \mathbb{R}^{N_m \times N_d}$ to represent the associations between $N_m$ miRNAs and $N_d$ diseases. If miRNA $m_i$ is known to be associated with a disease $d_j$, then $A_{ij} = 1$; in contrast, $A_{ij} = 0$ indicates that their association has not been explored.

Prediction Model Based on Network Representation Learning and Dual CNN
Here, we developed a novel prediction method based on network representation learning and dual CNN to infer potential miRNA-disease associations. The prediction model is divided into a left part and a right part (Figure 4). The left part learns a feature association representation between a miRNA $m_i$ and a disease $d_j$ from the original feature information. The right part projects all miRNA and disease nodes into a low-dimensional space, thereby integrating their global information to obtain representative low-dimensional features of $m_i$ and $d_j$. The two parts use CNN layers to deeply learn the node-level representation and the global-level representation, respectively. Next, each part obtains a prediction score for $m_i$ and $d_j$ through a fully connected layer. Finally, we integrated the two scores into a final prediction score between $m_i$ and $d_j$.

Embedding Layer on the Left
The left part integrates original feature information of miRNA and disease pairs. This is based on the premise that miRNAs with similar functions are likely to be associated with similar diseases, and vice versa. Therefore, we combined miRNA and disease similarities as well as the associations between them to form the feature representation of the left part. As an example, we describe the integration process for miRNA $m_1$ and disease $d_5$ (Figure 5). The first row of $M$ is denoted as $M_1$. It contains the similarities between miRNA $m_1$ and all of the miRNAs. The fifth row of $A^T$ is denoted as $A^T_5$. It contains the associations of disease $d_5$ with all of the miRNAs. If $m_1$ is similar to several miRNAs and $d_5$ has known associations with some of those same miRNAs, then $m_1$ and $d_5$ are likely to be associated, as both are related to the shared miRNAs. Similarly, we integrate the first row of matrix $A$ ($A_1$) together with the row of matrix $D$ that corresponds to $d_5$ ($D_5$). If $m_1$ is known to be associated with several diseases and $d_5$ is similar to some of those same diseases, then $m_1$ and $d_5$ may be associated with each other. Finally, we integrated $M_1$, $A^T_5$, $A_1$, and $D_5$ to form the feature matrix $B \in \mathbb{R}^{2 \times (N_m + N_d)}$.
Figure 5. Establishment of the left embedding layer of miRNA m1 and disease d5 by combining their similarities and associations.
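The construction of the left embedding for a miRNA-disease pair can be sketched with NumPy as follows. The row layout is an assumption chosen to match the stated shape of 2 × (N_m + N_d): the miRNA's similarity and association rows are placed in the first row, and the disease's association and similarity rows in the second. The function name `left_embedding` and the random toy matrices are hypothetical.

```python
import numpy as np

def left_embedding(M, D, A, i, j):
    """Build the 2 x (N_m + N_d) feature matrix for miRNA i and disease j.

    Assumed layout:
      row 0: [M[i, :], A[i, :]]  miRNA i against all miRNAs and all diseases
      row 1: [A[:, j], D[j, :]]  disease j against all miRNAs and all diseases
    """
    top = np.concatenate([M[i, :], A[i, :]])
    bottom = np.concatenate([A[:, j], D[j, :]])
    return np.vstack([top, bottom])

# Toy data (random placeholders): 3 miRNAs and 6 diseases.
rng = np.random.default_rng(0)
n_m, n_d = 3, 6
M = rng.random((n_m, n_m))                        # miRNA similarities
D = rng.random((n_d, n_d))                        # disease similarities
A = (rng.random((n_m, n_d)) < 0.3).astype(float)  # known associations
B = left_embedding(M, D, A, i=0, j=4)             # m_1 and d_5 with 0-based indices
```

Column-wise, the first N_m entries compare both nodes with every miRNA and the last N_d entries compare both nodes with every disease, which is what lets a convolution filter see the shared neighbors of the pair.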

Embedding Layer on the Right
In the right part, each miRNA (disease) is projected into a k-dimensional space to obtain representative low-dimensional features of miRNA and disease pairs and to integrate their global information. Non-negative matrix factorization (NMF) is an effective way to obtain a low-dimensional representation and is widely used in data representation [70,71]. It aims to calculate two optimal non-negative matrices such that their product approximates the original matrix. Specifically, for the miRNA similarity matrix $M \in \mathbb{R}^{N_m \times N_m}$, each row can be considered as the feature vector of a single miRNA, and we need to find non-negative matrices $W \in \mathbb{R}^{N_m \times k}$ and $X \in \mathbb{R}^{N_m \times k}$ whose product approximates $M$, that is, $M \approx WX^T$. This gives the following optimization problem:

$$\min_{W \ge 0,\, X \ge 0} \left\| M - WX^T \right\|_F^2 \tag{3}$$

where $\|\cdot\|_F$ is the Frobenius norm of a matrix, $X$ represents a low-dimensional feature matrix of miRNAs, and $W$ is the basis matrix, which plays a role similar to a parameter matrix. Finally, $k$ represents the target dimension that we reduce to. Similarly, we also project disease information into the k-dimensional space: for the disease similarity matrix $D \in \mathbb{R}^{N_d \times N_d}$, we calculate matrices $V \in \mathbb{R}^{N_d \times k}$ and $Y \in \mathbb{R}^{N_d \times k}$ such that $D \approx VY^T$. Thus, combined with Equation (3), we obtain the following objective function:

$$\min_{W, X, V, Y \ge 0} \left\| M - WX^T \right\|_F^2 + \alpha \left\| D - VY^T \right\|_F^2 \tag{4}$$

where $\alpha$ is a parameter that controls the contribution of the second term, $Y$ represents a low-dimensional disease feature matrix, and $V$ is a basis matrix. The i-th row of feature matrix $X$, $x_i$, which is a row vector, represents the k-dimensional features of miRNA $m_i$. Similarly, the j-th row of feature matrix $Y$, $y_j$, also a row vector, represents the k-dimensional features of disease $d_j$. If the k-dimensional features of $m_i$ and $d_j$ are mostly consistent, there may be potential links between them. The association probability between them is estimated by $x_i y_j^T = (XY^T)_{ij}$, and this score should be close to $A_{ij}$, which is the true association between $m_i$ and $d_j$. 
As a result, we extend the objective function to:

$$\min_{W, X, V, Y \ge 0} \left\| M - WX^T \right\|_F^2 + \alpha \left\| D - VY^T \right\|_F^2 + \beta \left\| A - XY^T \right\|_F^2 \tag{5}$$

where $\beta$ is a parameter used to adjust the contribution of the third term. In addition, if miRNA $m_i$ is similar to miRNA $m_j$, then $m_i$ is likely related to other miRNAs whose similarity scores with $m_j$ are relatively high. To preserve this network topology information, we introduce a graph regularization term, which requires that if two miRNAs (diseases) are close in the original feature space, they should also be close to each other after their feature dimensions are reduced. Before doing so, we need to establish a graph model for the miRNA and disease feature matrices.
For the miRNA feature matrix, a graph model $S^m$ is constructed. The elements $S^m_{ij}$ are given by: where $m_i$ and $m_j$ represent the i-th miRNA and the j-th miRNA, respectively. The similarity score between them is obtained from matrix $M$, and the similarity scores between $m_i$ and the rest of the miRNAs are sorted to determine whether $m_j$ belongs to the k-nearest neighbors of $m_i$.
For the disease feature matrix, a corresponding graph model $S^d$ is constructed: where $d_p$ and $d_q$ represent disease p and disease q. The similarity between $d_p$ and $d_q$ is obtained from matrix $D$.
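One common way to build such a k-nearest-neighbor graph is sketched below. Two choices here are assumptions, since the defining formula is not reproduced in this text: an edge keeps the original similarity value as its weight, and the graph is symmetrized so that an edge exists when either node is among the other's nearest neighbors. The name `knn_graph` and the toy matrix are hypothetical.

```python
import numpy as np

def knn_graph(S, k):
    """Keep S[i, j] when j is among the k most similar nodes to i, or i to j."""
    n = S.shape[0]
    mask = np.zeros((n, n), dtype=bool)
    for i in range(n):
        sims = S[i].astype(float)   # astype returns a copy, so S is not modified
        sims[i] = -np.inf           # a node is not its own neighbor
        nearest = np.argsort(-sims)[:k]
        mask[i, nearest] = True
    mask = mask | mask.T            # symmetrize (assumption)
    return np.where(mask, S, 0.0)

# Toy 4 x 4 similarity matrix (made-up values).
M_toy = np.array([
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.3, 0.1],
    [0.1, 0.3, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
])
S_m = knn_graph(M_toy, k=1)
```

The same function can be applied to the disease similarity matrix to obtain the disease graph.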
The graph regularization terms for miRNAs and diseases are defined as: where tr(·) represents the trace of a matrix, $x_i$ represents the i-th row of the matrix $X$, and $y_p$ represents the p-th row of the matrix $Y$.
Combining the graph regularization terms into the objective function gives: where $\lambda_m$ and $\lambda_d$ are parameters used to adjust the regularization terms.
Since the objective function in Equation (10) is not convex, it is unrealistic to expect to find a global optimal solution. We therefore find a local minimum by iteratively updating one matrix with the other matrices fixed, for example updating $X$ with $W$, $Y$, and $V$ fixed. In addition, to constrain the matrix elements to be non-negative ($w_{ij} \ge 0$, $x_{ij} \ge 0$, $v_{pq} \ge 0$, $y_{pq} \ge 0$), we add the corresponding Lagrangian terms. Finally, according to the properties of the trace and Frobenius norm of a matrix, the objective function $L$ can also be expressed as: where $\delta$, $\mu$, $\phi$, and $\theta$ are the Lagrange multipliers. Then the partial derivatives of $L$ with respect to $X$, $W$, $Y$, and $V$ can be calculated through the following function: According to the Karush-Kuhn-Tucker (KKT) conditions [72], $\delta_{ij} w_{ij} = 0$, $\mu_{ij} x_{ij} = 0$, $\phi_{ij} v_{ij} = 0$, and $\theta_{ij} y_{ij} = 0$, the following equations are obtained: Finally, we obtained the following update rules: Here, we iteratively update $W$, $X$, $V$, and $Y$ through the above update formulas until convergence. The first row of $X$, $x_1$, is the feature vector of miRNA $m_1$, and the fifth row of $Y$, $y_5$, is the feature vector of disease $d_5$. If the k-dimensional features of $m_1$ and $d_5$ are mostly consistent, there may be potential links between them. Moreover, $x_1$ and $y_5$ are integrated together to form a global feature representation matrix $P \in \mathbb{R}^{2 \times k}$ (Figure 6).
Figure 6. Establishment of the right embedding layer of miRNA m1 and disease d5 by integrating their projection vectors in low-dimensional space.
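The alternating scheme can be illustrated with multiplicative updates, which keep every factor non-negative. The sketch below is not the paper's update rules, which are not reproduced in this text. It assumes the graph regularizers take the standard form tr(XᵀLX) with L equal to the degree matrix minus a symmetric graph S, and the updates were derived from that assumed objective. The function names, the toy data, and the use of dense similarity graphs are all hypothetical.

```python
import numpy as np

def objective(M, D, A, W, X, V, Y, Lm, Ld, alpha, beta, lam_m, lam_d):
    """Assumed objective: three reconstruction terms plus two graph terms."""
    return (np.linalg.norm(M - W @ X.T) ** 2
            + alpha * np.linalg.norm(D - V @ Y.T) ** 2
            + beta * np.linalg.norm(A - X @ Y.T) ** 2
            + lam_m * np.trace(X.T @ Lm @ X)
            + lam_d * np.trace(Y.T @ Ld @ Y))

def factorize(M, D, A, Sm, Sd, k, alpha=0.2, beta=0.2, lam_m=0.2, lam_d=0.2,
              n_iter=200, seed=0, eps=1e-9):
    rng = np.random.default_rng(seed)
    n_m, n_d = A.shape
    W, X = rng.random((n_m, k)), rng.random((n_m, k))
    V, Y = rng.random((n_d, k)), rng.random((n_d, k))
    Dm, Dd = np.diag(Sm.sum(axis=1)), np.diag(Sd.sum(axis=1))  # degree matrices
    Lm, Ld = Dm - Sm, Dd - Sd                                   # graph Laplacians
    history = []
    for _ in range(n_iter):
        # Update one factor at a time with the others fixed: each ratio is
        # (negative gradient part) / (positive gradient part) for that factor.
        W *= (M @ X) / (W @ (X.T @ X) + eps)
        X *= (M.T @ W + beta * A @ Y + lam_m * Sm @ X) / (
            X @ (W.T @ W) + beta * X @ (Y.T @ Y) + lam_m * Dm @ X + eps)
        V *= (D @ Y) / (V @ (Y.T @ Y) + eps)
        Y *= (alpha * D.T @ V + beta * A.T @ X + lam_d * Sd @ Y) / (
            alpha * Y @ (V.T @ V) + beta * Y @ (X.T @ X) + lam_d * Dd @ Y + eps)
        history.append(objective(M, D, A, W, X, V, Y, Lm, Ld,
                                 alpha, beta, lam_m, lam_d))
    return W, X, V, Y, history

# Toy data (random placeholders): 6 miRNAs, 5 diseases, k = 3.
rng = np.random.default_rng(1)
M = rng.random((6, 6)); M = (M + M.T) / 2; np.fill_diagonal(M, 1.0)
D = rng.random((5, 5)); D = (D + D.T) / 2; np.fill_diagonal(D, 1.0)
A = (rng.random((6, 5)) < 0.3).astype(float)
Sm = M - np.diag(np.diag(M))   # dense graph for brevity; a k-NN graph also works
Sd = D - np.diag(np.diag(D))
W, X, V, Y, history = factorize(M, D, A, Sm, Sd, k=3)
P = np.vstack([X[0], Y[4]])    # global representation of (m_1, d_5), shape 2 x k
```

The default parameter values mirror the settings reported in the comparison section (all 0.2). In practice the loop would stop once the change in the objective falls below a tolerance rather than after a fixed number of iterations.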

Convolutional Module on the Left
Feature matrix $B$, consisting of $m_1$ and $d_5$, is input to the CNN module to learn the original node pair representation between $m_1$ and $d_5$. In the convolutional layer, the convolution filter size is set to $w_l \times w_f$, and the number of filters is $n_{conv}$. The convolution filters can therefore be represented as $W_{conv} \in \mathbb{R}^{w_l \times w_f \times n_{conv}}$, and the output after the convolution operation is denoted $C_1$. The convolution process on $X$ is given in Equation (25), where $X(i, j, 1)$ indicates the first column vector in the sliding window when the filter moves to the $j$-th position of the $i$-th layer, and $C_1(i, j, t)$ represents the convolution result when the $t$-th filter slides to the $j$-th position of the $i$-th layer. $g$ is a nonlinear activation function and $b_{conv}$ is a bias vector. The stride is set to 1 by default. In the pooling layer, we apply the max-pooling operation to compress the convolution result $C_1$ and obtain the output $P_1 \in \mathbb{R}^{(N_m+N_d) \times n_{conv}}$, where $P_1(i, p, t)$ is the pooling result for the $p$-th position in the $i$-th row, and $w_p$ is the width of the sliding window in the pooling operation. Next, $P_1$ is fed to the second convolution layer, where the same convolution and pooling operations yield the result $H_1 \in \mathbb{R}^{\frac{1}{2} \times (N_m+N_d) \times 2n_{conv}}$. We then flatten $H_1$ to a column vector $c \in \mathbb{R}^{v \times 1}$ ($v = \frac{1}{2} \times (N_m + N_d) \times 2n_{conv}$). Finally, through the fully connected layer $W_L$ and the softmax layer, we obtain the association prediction score between $m_1$ and $d_5$, denoted $score_1 \in \mathbb{R}^{2 \times 1}$.
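The last step of this module (flatten, fully connected layer, softmax) can be sketched as follows. This is a minimal illustration under stated assumptions: the weights are random rather than trained, the bias term `b_fc` is an assumption not mentioned in the text, and the tensor sizes are arbitrary toy values.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a single column vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def predict_score(H, W_fc, b_fc):
    """Flatten H to a v x 1 column, apply the dense layer, then softmax."""
    c = H.reshape(-1, 1)          # column vector c of length v
    logits = W_fc @ c + b_fc      # 2 x 1 pre-softmax output
    return softmax(logits)        # entries: [not associated, associated]

# Toy feature map standing in for H_1 (hypothetical sizes).
rng = np.random.default_rng(0)
H1 = rng.random((1, 5, 4))
v = H1.size
W_L = rng.normal(size=(2, v))     # fully connected weights (untrained)
b_L = np.zeros((2, 1))            # assumed bias term

score1 = predict_score(H1, W_L, b_L)
print(score1.shape, float(score1.sum()))
```

The two entries of the output sum to one, so they can be read as the probabilities of the "not associated" and "associated" classes.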

Convolutional Module on the Right
The embedding in the right part, $P \in \mathbb{R}^{2 \times k}$, is used as input to learn global information about miRNA $m_1$ and disease $d_5$ through their representative $k$-dimensional features. The process of convolution and pooling on the right is similar to that on the left, and the detailed operations are defined as follows:

$$Y_{conv,i,j} = \big(Y(i, j, 1),\ Y(i, j, 2),\ \ldots,\ Y(i, j, j + w_f - 1)\big), \quad Y_{conv,i,j} \in \mathbb{R}^{w_l \times w_f},$$

$$C_2(i, j, t) = g\big(Y_{conv,i,j} * W_{conv}(:, :, t) + b_{conv}(t)\big),$$

$$P_2(i, p, t) = \max\big(C_2(i,\ w_p (p - 1) + 1,\ t),\ \ldots,\ C_2(i,\ w_p\, p,\ t)\big),$$

where $Y$ indicates the value of the sliding window at different positions. $C_2$ is the feature output after the convolution layer, which then passes through the pooling layer to obtain $P_2$. We also use $P_2$ as the input for the next convolution layer, and obtain the output $H_2 \in \mathbb{R}^{\frac{1}{2} \times k \times 2n_{conv}}$ through convolution and pooling operations. The next step is to flatten $H_2$ to a column vector $o \in \mathbb{R}^{v \times 1}$ ($v = \frac{1}{2} \times k \times 2n_{conv}$). Finally, through the fully connected layer $W_R$ and the softmax layer, we obtain the association prediction score between $m_1$ and $d_5$, denoted $score_2 \in \mathbb{R}^{2 \times 1}$.
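The convolution and max-pooling formulas above can be sketched in NumPy as follows. The sketch makes several assumptions that the text does not state: $g$ is taken to be ReLU, no padding is applied (so the output is narrower than the input), and pooling runs along the width only. The function names and the toy input are hypothetical.

```python
import numpy as np

def conv_layer(Y, W_conv, b_conv):
    """C[i, j, t] = relu(sum(window(i, j) * W_conv[:, :, t]) + b_conv[t]), stride 1."""
    w_l, w_f, n_conv = W_conv.shape
    rows = Y.shape[0] - w_l + 1
    cols = Y.shape[1] - w_f + 1
    C = np.empty((rows, cols, n_conv))
    for i in range(rows):
        for j in range(cols):
            window = Y[i:i + w_l, j:j + w_f]            # sliding window of size w_l x w_f
            # Elementwise product with every filter, summed over the window.
            C[i, j, :] = np.tensordot(window, W_conv, axes=([0, 1], [0, 1])) + b_conv
    return np.maximum(C, 0.0)                           # ReLU as the assumed g

def max_pool_width(C, w_p):
    """P[i, p, t] = max of C[i, w_p*p : w_p*(p+1), t] (zero-based indices)."""
    rows, cols, n_conv = C.shape
    n_out = cols // w_p                                 # drop any incomplete window
    trimmed = C[:, :n_out * w_p, :]
    return trimmed.reshape(rows, n_out, w_p, n_conv).max(axis=2)

# Toy right-hand embedding with k = 6 and a single all-ones 2 x 2 filter.
P_embed = np.arange(12, dtype=float).reshape(2, 6)
W_conv = np.ones((2, 2, 1))
b_conv = np.zeros(1)

C2 = conv_layer(P_embed, W_conv, b_conv)
P2 = max_pool_width(C2, w_p=2)
print(C2[0, :, 0], P2[0, :, 0])
```

With the all-ones filter each convolution output is simply the sum of the four entries in the window, which makes the pooled values easy to check by hand.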

Combined Strategy
Because the two prediction scores for $m_1$ and $d_5$ are obtained from different perspectives, the optimal performance of the two parts may differ. We therefore integrate $score_1$ and $score_2$ into the final association score, defined as follows: where $\lambda \in (0, 1)$ is a parameter used to weigh the contributions of $score_1$ and $score_2$. The left and right CNN models each establish a loss function based on cross entropy, defined as $loss_1$ and $loss_2$, respectively:

$$a = \frac{e^{score_1(1)}}{e^{score_1(0)} + e^{score_1(1)}}, \qquad b = \frac{e^{score_2(1)}}{e^{score_2(0)} + e^{score_2(1)}},$$

where $y_{label}$ represents the actual association label between the miRNA and the disease. If the association between the miRNA and the disease is known, $y_{label} = 1$; otherwise, $y_{label} = 0$. $score_1(0)$ and $score_1(1)$ represent the association scores of the miRNA and the disease on the left side. This is similar to a binary classification problem, where $score_1(0)$ represents the probability that $m_1$ and $d_5$ are not associated, and $score_1(1)$ represents the probability of an association. Finally, we use the softmax function to obtain the association probability $a$. The right-path association probability $b$ is calculated similarly. $score(1)$ indicates the final prediction score between $m_1$ and $d_5$, and $T$ represents the number of training samples.
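The combination and loss computations can be sketched as follows. Two points are assumptions rather than statements from the text: the final score is taken to be the convex combination $\lambda \cdot score_1 + (1 - \lambda) \cdot score_2$, and the cross-entropy loss is summed (not averaged) over the $T$ training samples. The function names and the numeric values are hypothetical.

```python
import numpy as np

def positive_probability(score):
    """a = exp(score[1]) / (exp(score[0]) + exp(score[1]))."""
    s = np.asarray(score, dtype=float)
    e = np.exp(s - s.max())          # shift for numerical stability
    return float(e[1] / e.sum())

def cross_entropy(probs, labels):
    """Binary cross entropy summed over samples (assumed normalization)."""
    p = np.clip(np.asarray(probs, dtype=float), 1e-12, 1 - 1e-12)
    y = np.asarray(labels, dtype=float)
    return float(-np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

def combine(score1, score2, lam):
    """Assumed convex combination of the left and right scores."""
    return lam * np.asarray(score1) + (1 - lam) * np.asarray(score2)

# Hypothetical scores for one miRNA-disease pair: [not associated, associated].
score1 = np.array([0.2, 0.8])
score2 = np.array([0.4, 0.6])
final = combine(score1, score2, lam=0.5)

a = positive_probability(score1)     # left-path association probability
b = positive_probability(score2)     # right-path association probability
loss1 = cross_entropy([a], [1])      # loss for a known association, T = 1
print(final, a, b, loss1)
```

Setting $\lambda$ close to 1 lets the left module dominate the final score, while a value close to 0 favours the right module.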

Predicting Novel Disease-Related miRNAs
After its predictive performance was evaluated through cross-validation and several case studies, CNNMDA was applied to predict potential candidate miRNAs for all 329 diseases. We used all positive and negative samples to train CNNMDA. The predicted results for the 329 diseases are listed in Supplementary Table S3. The candidate miRNAs analyzed in the case studies of 3 diseases are also drawn from Supplementary Table S3.

Conclusions
CNNMDA has been developed as a novel method based on network representation learning and dual convolutional neural networks for predicting potential miRNA-disease associations. CNNMDA captures the internal relationships between miRNAs and diseases, including miRNA similarities and disease similarities. Meanwhile, it also captures the associations between miRNAs and diseases. Moreover, the representations of the miRNA nodes and the disease nodes are learned based on an entire miRNA-disease network, and as such are deeply integrated to enhance logical reasoning. The new framework based on network representation learning and dual convolutional neural networks is able to learn the original and global representations of a miRNA-disease pair. CNNMDA's performance was verified by cross-validation with 15 common diseases and case studies on 3 diseases. Experimental results indicated that CNNMDA outperforms existing methods in terms of both AUCs and AUPRs. It is able to generate reliable candidate miRNA-disease associations for subsequent validation by biologists.