1. Introduction
Drugs play a crucial role in curing diseases and enhancing quality of life [1]. During drug development, drug–drug interactions (DDIs) are a critical consideration, and a drug targeting the selected protein should be bioavailable (e.g., have favorable absorption and metabolism) [2]. However, potential DDIs may lead to a sharp rise or drop in the plasma concentration of the drug or its metabolite, and may even generate toxic compounds [3]. From a clinical perspective, combinations of drugs are used to treat complex diseases, but unexpected DDIs may induce adverse reactions, which can lead to drug withdrawal and even to the death of the patient [4,5]. Thus, the early identification of potential DDIs is critical for drug development and medical safety.
DDIs can alter drug pharmacokinetics and drug pharmacodynamics [6]. Pharmacokinetic DDIs arise when the perpetrator drug disrupts the absorption, distribution, metabolism and elimination (ADME) of the victim drug, and also when the perpetrator drug interacts with the protein of the victim drug or another protein within the same signaling pathway [4]. To screen and analyze unknown DDIs, biological techniques are mainly used; they are regarded as the definitive way to judge and validate DDIs, including metabolism-based and transporter-based DDIs. Examples include testing whether a drug is an inhibitor, inducer or substrate of CYP enzymes [7], and testing whether a drug is an inhibitor, inducer or substrate of the P-gp transporter [3]. Then, based on the in vitro parameters, a cumbersome dynamic model (e.g., a physiologically based pharmacokinetic (PBPK) model) must be constructed and an expensive in vivo experiment conducted and analyzed for the final validation.
Traditional biological technologies face the challenges of high cost, limited numbers of participants, low efficiency and a large number of drug pairs awaiting identification [2]. Additionally, the constantly increasing demand for drug therapy makes the identification of potential DDIs before clinical medications are administered [8] increasingly urgent. Consequently, large-scale computational prediction methods are being exploited as decision aids for prescreening large numbers of DDI candidates, in order to guide or prioritize in vitro–in vivo experiments and to improve the efficiency of DDI research and development (R&D). Computational methods have attracted attention from academia and industry because of their promise to discover drug–drug interactions on a large scale [9,10]. The greatly improved performance of computers is a precondition for using computational methods to realize DDI prediction. Furthermore, researchers have constructed many reliable bioinformatic databases on drugs through extensive biomedical experiments, such as DrugBank [11], ChEBI [12], PubChem [13] and KEGG [14]. Recently, several computational methods have been proposed to reduce the cost of predicting potential DDIs. These methods can be roughly divided into four categories: similarity-based, network-based, matrix-factorization-based and semanticity-based.
Similarity-based methods are one of the main approaches; they assume that drugs with similar functional structures are more likely to have similar interaction patterns. Gottlieb et al. put forward a model named INDI, which extracts feature vectors by calculating seven drug similarities and predicts drug interactions by logistic regression [15]. Cheng et al. merged multiple drug similarities to represent drug–drug pairs and used five classifiers to build prediction models [16]. Ferdousi et al. provided a method that constructs embedding vectors of drugs from four biological elements, namely carriers, transporters, enzymes and targets (CTET), and predicts potential DDIs through Russell–Rao similarity [17]. Rohani et al. predicted DDIs by fusing similarity matrices and tested the performance on three drug similarity datasets of different scales [18]. Because the structural information of the interaction network and the chemical sequence information were not considered, the final prediction performance of these methods was limited.
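As a minimal illustration of the Russell–Rao similarity mentioned above, the sketch below compares two drugs encoded as binary presence/absence vectors over a shared list of biological elements. The function name `russell_rao` and the toy vectors are assumptions made for illustration only; this is not the pipeline of the cited work.

```python
import numpy as np


def russell_rao(x, y):
    """Russell-Rao similarity: shared positive features divided by all features."""
    x = np.asarray(x, dtype=bool)
    y = np.asarray(y, dtype=bool)
    if x.ndim != 1 or x.shape != y.shape or x.size == 0:
        raise ValueError("x and y must be non-empty 1-D vectors of equal length")
    # Count positions where both drugs carry the element, normalise by vector length.
    return np.count_nonzero(x & y) / x.size


# Hypothetical binary profiles over six biological elements (1 = drug acts on it).
drug_a = [1, 0, 1, 1, 0, 0]
drug_b = [1, 1, 1, 0, 0, 0]
print(russell_rao(drug_a, drug_b))
```

Unlike the Jaccard index, the denominator here is the full vector length, so elements absent from both drugs lower the score.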
Network-based methods infer novel drug–drug edges from the topological structure of the drug network and of biological networks involving biomedical entities, or by learning and propagating high-order drug similarity. Zhang et al. predicted DDIs using an integrative label propagation method with high-order similarity transitivity on a multi-scale similarity-based network; the method can also rank multiple drug information sources [19]. Park et al. applied random walk with restart on the protein–protein network to simulate the propagation of signals and predict drug–drug interactions [20]. Deepika et al. proposed a meta-learning framework that extracts representations with Node2vec from four types of feature networks (chemical, biological, phenotypic and disease feature networks) and then integrates the results of four classifiers to predict unknown DDIs [21]. Liu et al. proposed DANN-DDI, an approach that uses five types of drug networks and learns their features through the graph embedding algorithm SDNE; an attention neural network learns the concatenated representations, and a deep neural network generates the prediction results [22]. Although these methods have shown good prediction performance, most of them have difficulty preserving higher-order structural features and easily become stuck in local optima. Moreover, most of them do not use an attention mechanism to process multiple pieces of information, which limits their performance.
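To make the random-walk-with-restart idea concrete, the following sketch iterates the standard update p ← (1 − r)·W·p + r·p0 on a small undirected graph, where W is the column-normalized adjacency matrix and p0 is concentrated on the seed nodes. The function name, the restart probability of 0.3 and the toy path graph are assumptions for illustration; the cited method operates on a protein–protein network with its own settings.

```python
import numpy as np


def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Return the steady-state visiting probabilities of a walker restarting at `seeds`."""
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # avoid division by zero for isolated nodes
    W = adj / col_sums                     # column-stochastic transition matrix
    p0 = np.zeros(adj.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)     # restart distribution over the seed nodes
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (W @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:  # L1 change below tolerance: converged
            return p_next
        p = p_next
    return p


# Toy path graph 0 - 1 - 2 - 3, with the walker seeded at node 0.
path = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]])
print(random_walk_with_restart(path, seeds=[0]))
```

Nodes closer to the seed receive higher scores, which is the property such methods use to rank candidate interaction partners.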
Matrix-factorization-based methods decompose the DDI adjacency matrix into several factor matrices and then reconstruct the adjacency matrix from these factors. Shi et al. developed a model named TMFUF to identify DDIs, which uses additional drug information to rebuild the interaction matrix through triple matrix factorization [23]. Zhang et al. proposed an ensemble model based on sparse feature learning for predicting drug–drug interactions [24]. Zhang et al. designed a DDI prediction method that uses eight types of background information and is based on manifold-regularized matrix factorization [25]. Yu et al. proposed DDINMF, which applies semi-nonnegative matrix factorization to predict enhancive and degressive interactions of pairwise drugs [26]. Shi et al. introduced the BRSNMF model, an optimization of the DDINMF model, which uses drug-binding proteins as features to predict DDIs of new drugs [27]. These approaches exploit a large amount of supplementary biological information to ensure generalization, but the important information in the chemical sequence is not fully considered.
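The decompose-then-reconstruct idea can be sketched with a truncated singular value decomposition, used here only as a generic stand-in: none of the cited models (TMFUF, DDINMF, BRSNMF) is implemented, and the function name `low_rank_scores` and the toy adjacency matrix are assumptions for illustration.

```python
import numpy as np


def low_rank_scores(adj, rank):
    """Factorize the adjacency matrix and rebuild it from the top `rank` components."""
    adj = np.asarray(adj, dtype=float)
    U, s, Vt = np.linalg.svd(adj, full_matrices=False)
    # Keep only the leading singular triplets; the product is the rank-k reconstruction.
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]


# Toy symmetric DDI adjacency matrix for four drugs (1 = known interaction).
ddi = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]])
scores = low_rank_scores(ddi, rank=2)
# Entries that are 0 in `ddi` but comparatively large in `scores` are candidate DDIs.
print(np.round(scores, 2))
```

In practice, known interactions would be held out before factorization so that the reconstructed scores can be evaluated against them.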
Semanticity-based methods generally extract information from the semantics of sentences about drugs through text mining; candidate drug interactions are then detected and classified. Chowdhury et al. applied a framework that extracts information on multi-phase relations, exploiting the scope of negation cues and semantic roles to reduce the skewness of the data, and used an SVM to estimate the likelihood of DDIs [28]. Zhu et al. used BioBERT to obtain pre-trained word vectors of drug descriptions, extracted the semantic representation of sentences with a BiGRU, integrated drug entity information through entity-aware attention and obtained prediction results with an MLP [29]. However, these methods depend heavily on post-market clinical evidence, which means that they cannot provide potential DDI alerts before clinical medications are administered.
Although the aforementioned methods have their own advantages and play a crucial role in the development of computational methods for drug–drug interaction prediction, they still have some limitations. (i) Because several existing computational methods focus on only a single type of drug information, they cannot satisfy the demand for prediction accuracy in real-world applications. (ii) Most of them rely on manually designed molecular representations, which are limited by the knowledge of domain experts. (iii) Ignoring deep network structure information and integrating drug features without an attention mechanism could limit prediction performance.
In in silico research, data availability and accessibility are crucial factors in determining the accuracy and precision of computational methods [30]. In this paper, to address the existing deficiencies, we propose a novel framework, BioChemDDI, which fuses drug chemical sequence, drug biological function similarity and deep network topology structure with an attention mechanism to predict potential DDIs. In particular, we first obtain the drug biochemical features, covering chemical sequence information and biological function information, through a word-embedding algorithm and a similarity matrix fusion method, respectively. Notably, the chemical sequence feature of each drug is first represented as a matrix, whose dimension is reduced by a Convolutional Neural Network (CNN). Second, the rich structural information of the interaction network is extracted by an efficient graph embedding method with graph collapse. Third, we introduce the attention mechanism to fuse multiple features and enhance the feature of each drug node. Finally, the interaction scores are generated by the fully connected layer. The proposed model is trained end-to-end, and each feature vector can be further screened for characteristic factors through the hidden layers. Additionally, four datasets are used to verify the robustness of our model, and the model is compared with state-of-the-art methods to demonstrate the high efficiency of BioChemDDI. The results of five-fold cross-validation and the case studies further investigate whether our model is suitable for studying interactions between drugs. More meaningfully, case studies related to three cancers can give insight into unknown drug–drug interactions for inferring drug combinations in treating complex diseases. The BioChemDDI model can provide accurate predictions of potential DDIs and is anticipated to serve as a prioritization and pre-screening tool for drug development and clinical applications. The computational platform web server is accessible at: http://120.77.11.78/BioChemDDI/ (accessed on 11 April 2022).
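The feature-fusion step can be pictured with a generic attention sketch: each drug has several feature views of equal dimension (for example, chemical sequence, biological function and network topology), each view is scored against a query vector, and the softmax-normalized scores weight the views in the fused representation. This is an illustrative assumption, not the BioChemDDI architecture: the scaled dot-product scoring, the function name `attention_fuse` and the random vectors standing in for learned features and parameters are all hypothetical.

```python
import numpy as np


def attention_fuse(views, query):
    """Fuse feature views of shape (n_views, dim) into one vector using softmax weights."""
    views = np.asarray(views, dtype=float)
    query = np.asarray(query, dtype=float)
    scores = views @ query / np.sqrt(views.shape[1])  # one relevance score per view
    scores = scores - scores.max()                    # stabilise the softmax numerically
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ views, weights                   # weighted sum of views, and the weights


rng = np.random.default_rng(0)
# Hypothetical 8-dimensional chemical, biological and topological views of one drug.
drug_views = rng.normal(size=(3, 8))
query = rng.normal(size=8)  # stands in for a trained attention parameter
fused, weights = attention_fuse(drug_views, query)
print(weights, fused.shape)
```

In a trained model the query and the views would be learned jointly with the downstream fully connected layer rather than drawn at random.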