Clinical Decision Support Systems to Predict Drug–Drug Interaction Using Multilabel Long Short-Term Memory with an Autoencoder

Big data analytics examines large and varied datasets to uncover hidden patterns, trends, and correlations, and can therefore support better decision making in healthcare. Drug–drug interactions (DDIs) are a major concern in drug discovery. Accurate prediction of DDIs improves safety, particularly when multiple drugs are co-prescribed. Conventional machine learning (ML) approaches depend mainly on handcrafted features and generalize poorly. Deep learning (DL) techniques that automatically learn drug features from drug-related networks or molecular graphs have improved the ability of computational approaches to predict unknown DDIs. In this study, we develop a sparrow search optimization with deep learning-based DDI prediction (SSODL-DDIP) technique for healthcare decision making in big data environments. The SSODL-DDIP technique identifies the relationships and properties of drugs from various sources to make predictions. A multilabel long short-term memory with an autoencoder (MLSTM-AE) model is employed for the DDI prediction process, and a lexicon-based approach determines the severity of the interactions among the DDIs. The SSO algorithm is adopted to improve the prediction outcomes of the MLSTM-AE model. A wide range of simulations are performed to assess the performance of the SSODL-DDIP technique. The experimental results show promising performance of the SSODL-DDIP technique compared with recent state-of-the-art algorithms.


Introduction
In the digital era, the velocity and volume of public, environmental, health, and population data from a wide variety of sources are growing rapidly. Big data analytics technologies such as deep learning (DL), statistical analysis, data mining (DM), and machine learning (ML) are used to create state-of-the-art decision models [1]. Decision making based on concrete evidence is crucial and has a strong effect on program implementation and public health. This highlights the significant role of decision models under uncertainty in areas including health intervention, disease control, health services and systems, preventive medicine, quality of life, and health disparities and inequalities. A drug–drug interaction (DDI) can occur when more than one drug is co-prescribed [2]. Although DDIs might have positive impacts, they sometimes have serious negative impacts and can result in a drug being withdrawn from the market. DDI prediction could help reduce the possibility of adverse reactions and improve the post-marketing surveillance and drug development processes [3]. Medical trials are time consuming and impractical for large-scale datasets, and they are limited by experimental conditions. Hence, researchers have presented computational methods to speed up the prediction process [4]. Current computational DDI prediction methods are divided into five classes of models: DL-based, network-based, similarity-based, literature extraction-based, and matrix factorization-based models.
ML is an emerging area whose techniques are applied to large datasets to extract hidden concepts and relationships among attributes [5]. An ML model can be used to forecast outcomes. Because it is extremely complex for humans to process and handle large amounts of data [6], an ML model can play a major role in forecasting healthcare outcomes with high quality and at low cost [7]. ML algorithms are primarily rule-based, probability-based, or tree-based. Large quantities of data gathered from a variety of sources enter the data preprocessing stage, during which the data dimension is reduced by eliminating redundant data. As the amount of data increases, a model may no longer be capable of making a decision. Hence, various methods must be developed so that hidden knowledge or useful patterns can be extracted from previous information [8]. A model built with an ML algorithm is then evaluated on test data to measure its performance, which can be improved further by adjusting rules or parameters. Generally, ML is utilized for prediction, data classification, and pattern recognition [9]. Numerous applications, such as disease prediction, face detection, fraud detection, traffic management, and email filtering, use ML. DL is a subset of ML that makes use of supervised and unsupervised models for feature classification [10]. DL architectures such as restricted Boltzmann machines (RBMs), convolutional neural networks (CNNs), and autoencoders (AEs) are utilized in recommender systems, disease prediction, and image segmentation.
In this study, we develop a sparrow search optimization with deep learning-based DDI prediction (SSODL-DDIP) technique for healthcare decision making in big data environments. The presented SSODL-DDIP technique applies a multilabel long short-term memory with an autoencoder (MLSTM-AE) model for the DDI prediction process. Moreover, a lexicon-based approach is used to determine the severity of interactions among the DDIs. To improve the prediction outcomes of the MLSTM-AE model, the SSO algorithm is adopted in this work. To assess the performance of the SSODL-DDIP technique, a wide range of simulations are performed.

Related Works
In [10], the authors proposed a positive unlabeled (PU) learning model that utilized a one-class support vector machine (SVM) as the learning algorithm. The algorithm could learn the positive distribution from the unified feature vector space of drugs and targets, and regarded unknown pairs as unlabeled rather than labeling them as negative pairs. Wang et al. [11] introduced multi-view graph contrastive representation learning for DDI prediction (MIRACLE for brevity), which captures intra-view interactions and inter-view molecular structure among molecules simultaneously. MIRACLE treated a DDI network as a multi-view graph in which each node in the interaction graph was itself a drug molecular graph instance. The authors employed a bond-aware attentive message propagation method to capture drug molecular structure information and a graph convolutional network (GCN) to encode DDI relations in the MIRACLE learning phase. In addition, the authors designed an unsupervised contrastive learning component to integrate and balance the multi-view information. In [12], the authors devised a deep neural network (DNN) method that precisely identified protein-ligand interactions with particular drugs. The DNN could sense the response of protein-ligand interactions for the particular drugs and could find which drug could effectively combat the virus.
Lin et al. [13] modeled an end-to-end framework, named a knowledge graph neural network (KGNN), for DDI prediction. This framework could capture a drug and its neighborhood by deriving their linked relations in a knowledge graph (KG). To extract semantic relations and high-order structures of the KG, the authors learned the neighborhood of each entity in the KG as its local receptive field, and then aggregated the neighborhood information with the representation of the current entity. Pang et al. [14] presented an attention-based multidimensional feature encoder for DDI prediction, called AMDE. Specifically, in AMDE, the authors encoded drug features from multiple dimensions, including information from the atomic graph of the drug and its simplified molecular-input line-entry system (SMILES) sequence. Salman et al. [15] modeled a DNN-based technique (SEV-DDI: Severity-DDI) that included certain integrated units or layers for attaining higher accuracy and precision. After outperforming other methods on the DDI classification task, the authors went a step further and used the technique to examine the severity of the interaction. The capability to determine DDI severity helps clinical decision support systems make precise and informed decisions, assuring patient safety.
Liu et al. [16] presented a deep attention neural network-based DDI prediction framework (DANN-DDI) for predicting unobserved DDIs. First, using a graph embedding technique, the authors constructed multiple drug feature networks and learned drug representations from these networks. Next, they concatenated the learned drug embeddings and applied an attention neural network to learn representations of drug-drug pairs. Finally, they designed a DNN to predict DDIs. Zhang et al. [17] introduced a sparse feature learning ensemble approach with linear neighborhood regularization (SFLLN) for predicting DDIs. Initially, the authors combined four drug features, i.e., pathways, chemical structures, enzymes, and targets, by mapping drugs in distinct feature spaces into a common interaction space through sparse feature learning. Then, the authors introduced linear neighborhood regularization to describe the DDIs in the interaction space by utilizing known DDIs.

The Proposed Model
In this study, we introduce a novel SSODL-DDIP technique for DDI predictions in big data environments. The presented SSODL-DDIP technique accurately determines the relationship and drug properties from various sources to make predictions. It encompasses data preprocessing, MLSTM-AE-based DDI prediction, SSO hyperparameter tuning, and severity extraction.

Data Preprocessing
Standard text cleaning and preprocessing operations, including but not limited to lemmatization, were carried out on the sentences. Every drug mentioned in a sentence was considered and labeled as potentially interacting with the others [18]. The number of drug pairs (DP) in a sentence is determined by $n$, the number of drugs in the sentence.
In addition, drug blinding was used, whereby all drug names were replaced with labels. For example, for the sentence "Aspirin might reduce the effect of probenecid", the labeled sentence was "Drug A might reduce the effect of Drug B". Drug blinding helps the model identify these labels as "subject" and "object", which ultimately assists it during classification. The processed sentence is then given to the model for DDI detection and classification.
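The pairing and blinding steps can be illustrated with a minimal sketch. It assumes that each unordered pair of distinct drug mentions is counted once, so that $n$ drugs yield $n(n-1)/2$ pairs, and that blinding is a plain string substitution; the function names `count_pairs` and `blind_pairs` are hypothetical and not taken from the original work.

```python
from itertools import combinations


def count_pairs(n):
    """Number of unordered drug pairs among n drugs (assumed: n choose 2)."""
    return n * (n - 1) // 2


def blind_pairs(sentence, drugs):
    """Return one blinded copy of the sentence per unordered drug pair."""
    blinded = []
    for a, b in combinations(drugs, 2):
        # Replace the two drugs of the current pair with neutral labels.
        text = sentence.replace(a, "Drug A").replace(b, "Drug B")
        blinded.append(((a, b), text))
    return blinded


sentence = "Aspirin might reduce the effect of probenecid"
pairs = blind_pairs(sentence, ["Aspirin", "probenecid"])
```

Here `pairs` holds a single entry whose text is the blinded sentence from the example above. A sentence that mentions four drugs would produce six blinded copies, one per candidate pair.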
During word embedding, every word was converted into a real-valued vector. This mapping can be performed with Word2Vec, with the embeddings trained on PubMed abstracts that mention the drugs.
Every preprocessed sentence consists of tokens $s_i$ and $d_j$, where $d_j$ represents a drug label and $s_i$ is any other word in the sentence. Every word $s_i$ is transformed into a word vector using the word embedding matrix $WEMB \in \mathbb{R}^{d_s \times |V|}$, where $V$ denotes the vocabulary of the training dataset, $d_s$ signifies the number of embedding dimensions, and $v_{s_i}$ denotes the index of word $s_i$ in the embedding matrix.
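The lookup described above amounts to selecting columns of the embedding matrix. The following sketch assumes a toy vocabulary and a randomly initialized matrix standing in for trained Word2Vec vectors; the vocabulary, the dimension `d_s = 4`, and the helper `embed` are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary V; in practice it would come from the training corpus.
vocab = ["drug_a", "drug_b", "might", "reduce", "the", "effect", "of"]
index = {word: i for i, word in enumerate(vocab)}   # word -> column index

d_s = 4                                             # embedding dimension (assumed)
WEMB = rng.normal(size=(d_s, len(vocab)))           # random stand-in for Word2Vec


def embed(tokens):
    """Map a token list to a (len(tokens), d_s) array of word vectors."""
    cols = [index[t] for t in tokens]
    return WEMB[:, cols].T


tokens = "drug_a might reduce the effect of drug_b".split()
X = embed(tokens)
```

Each row of `X` is the column of `WEMB` selected by the index of the corresponding token, so a seven-token sentence becomes a 7 × 4 input sequence for the recurrent model.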

DDI Prediction Process
To predict DDIs accurately, the MLSTM-AE model is applied in this study. The MLSTM-AE model learns to reconstruct a time-flipped version of its input [19]. The input electricity signals are indexed by $i = 1, \ldots, N$, where $N$ denotes the overall sample count. Because the final objective of the study is classification, the embedding from the hidden layer, $h_T^i$, is passed through a fully connected (FC) layer, the output of which is the class label. The class label is one-hot encoded. The size of the label vector is equal to the number of appliances; when an appliance is ON, the corresponding entry of the label vector is 1, and otherwise it is 0. For $C$ appliances, the label vector is denoted by $y^i = [y_1^i, y_2^i, \ldots, y_C^i]$. Figure 1 represents the structure of the MLSTM.
When the $c$-th appliance is ON, the corresponding $y_c^i$ is 1; otherwise it is 0. The ground-truth probability vector of the $i$-th sample is described as $p^i = y^i / \|y^i\|_1$. The predicted probability vector is represented as $\hat{p}^i$.
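As a small worked example of this normalization, the sketch below divides a 0/1 label vector by its L1 norm. The helper name `target_distribution` and the handling of an all-zero label vector are assumptions made for illustration.

```python
import numpy as np


def target_distribution(y):
    """Turn a 0/1 label vector into a probability vector p = y / ||y||_1."""
    y = np.asarray(y, dtype=float)
    total = np.abs(y).sum()          # L1 norm; equals the number of active labels
    if total == 0:
        return y                     # assumption: keep all zeros if nothing is active
    return y / total


p = target_distribution([1, 0, 1, 1])
```

With three active labels out of four, each active entry receives probability 1/3 and the inactive entry stays at 0, so the vector sums to 1.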
This algorithm is trained jointly with the reconstruction loss and the multilabel classification loss; hence, the overall loss function is formulated by Equation (5).
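One plausible shape for such a joint objective is sketched below. It is not Equation (5) itself: the mean squared error for the reconstruction term, the cross-entropy between the ground-truth and predicted probability vectors for the classification term, and the weight `lam` are all assumptions chosen for illustration, as is the function name `joint_loss`.

```python
import numpy as np


def joint_loss(x, x_rec, p, p_hat, lam=1.0, eps=1e-12):
    """Reconstruction loss plus weighted multilabel classification loss (sketch)."""
    x = np.asarray(x, dtype=float)
    x_rec = np.asarray(x_rec, dtype=float)
    p = np.asarray(p, dtype=float)
    p_hat = np.clip(np.asarray(p_hat, dtype=float), eps, 1.0)

    target = x[::-1]                          # time-flipped version of the input
    rec = np.mean((x_rec - target) ** 2)      # assumed: mean squared error
    cls = -np.sum(p * np.log(p_hat))          # assumed: cross-entropy on p vs p_hat
    return rec + lam * cls


loss = joint_loss(x=[1.0, 2.0, 3.0], x_rec=[3.0, 2.0, 1.0],
                  p=[0.5, 0.5, 0.0], p_hat=[0.5, 0.5, 0.0])
```

In this toy call the reconstruction is a perfect time-flipped copy, so the reconstruction term is zero and the loss reduces to the cross-entropy of the two-label target, which equals ln 2.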

Hyperparameter Tuning Process
For the hyperparameter tuning process, the SSODL-DDIP technique uses the SSO algorithm. The SSO is a recent metaheuristic approach that simulates the foraging and anti-predation behavior of a sparrow population [20]. In foraging, individual sparrows act in one of two roles: discoverer or joiner. The discoverer is responsible for searching for food and guiding others, and the joiner forages by following the discoverers. A specific percentage of sparrows is chosen as guarders, which transmit alarm signals and carry out anti-predation behavior when they sense danger. The discoverer position is updated according to Equation (6), where $t$ is the current iteration and $T$ is the maximum number of iterations; $X_{ij}^{t}$ defines the present position of the $i$-th sparrow in the $j$-th dimension, and $X_{ij}^{t+1}$ denotes its updated position; $\alpha \in (0, 1]$ refers to a random number; $ST \in (0.5, 1]$ signifies a safety value; $R_2 \in (0, 1]$ defines a warning value; $G$ denotes a $1 \times d$ matrix where each value is 1; and $O$ represents a random variable.
The joiner position is updated according to Equation (7), where $X_b$ signifies the current optimum position of the discoverer, $X_w$ describes the worst position of the sparrow, $B$ denotes the $1 \times d$ matrix where every value is equal to 1 or −1, and $A^{+} = A^{T}(AA^{T})^{-1}$. Figure 2 demonstrates the steps involved in the SSO algorithm.
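The expression $A^{+} = A^{T}(AA^{T})^{-1}$ is the right pseudo-inverse of a row matrix. As a quick numerical check, assuming the matrix being inverted is a $1 \times d$ row of ±1 entries, $AA^{T}$ equals $d$ and the pseudo-inverse reduces to $A^{T}/d$; the concrete values below are illustrative only.

```python
import numpy as np

d = 4
A = np.array([[1.0, -1.0, 1.0, 1.0]])       # 1 x d row of +/-1 entries (illustrative)

# Right pseudo-inverse as written in the text: A^T (A A^T)^{-1}, shape d x 1.
A_plus = A.T @ np.linalg.inv(A @ A.T)

# For +/-1 entries, A A^T = d, so the same result is simply A^T / d.
A_plus_closed_form = A.T / d
```

This is why implementations can avoid an explicit matrix inversion in the joiner update and divide the ±1 vector by the number of dimensions instead.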
The position update for the guarder is defined by Equation (8), where $X_{best}$ stands for the best global location; $\beta$ and $K \in [-1, 1]$ represent two random numbers; $f_i^t$ defines the fitness value of the $i$-th sparrow; $f_w^t$ and $f_g^t$ are the present worst and best fitness values in the population, respectively; and $\varepsilon$ indicates a small constant close to zero. The procedure is summarized in Algorithm 1.
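The three roles can be put together in a compact sketch. The update rules below follow the commonly published form of the sparrow search algorithm and are an assumption here; Equations (6) to (8) of this work may differ in detail. The function name `sparrow_search`, the fractions of discoverers and guarders, and the sphere objective (a stand-in for the validation loss of the MLSTM-AE model as a function of its hyperparameters) are all hypothetical.

```python
import numpy as np


def sparrow_search(f, lo, hi, dim, pop=20, iters=100,
                   disc_frac=0.2, guard_frac=0.1, st=0.8, seed=0):
    """Minimize f over the box [lo, hi]^dim with a sparrow-search-style loop."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lo, hi, size=(pop, dim))
    fit = np.array([f(x) for x in X])
    best_x, best_f = X[fit.argmin()].copy(), float(fit.min())
    n_disc = max(1, int(disc_frac * pop))
    n_guard = max(1, int(guard_frac * pop))
    ones = np.ones(dim)                       # plays the role of G
    eps = 1e-12

    for _ in range(iters):
        order = np.argsort(fit)               # best sparrow first
        X, fit = X[order], fit[order]
        x_worst = X[-1].copy()

        # Discoverers: shrink while safe (R2 < ST), otherwise take a random step.
        r2 = rng.random()
        for i in range(n_disc):
            if r2 < st:
                alpha = 1.0 - rng.random()    # uniform in (0, 1]
                X[i] = X[i] * np.exp(-(i + 1) / (alpha * iters))
            else:
                X[i] = X[i] + rng.normal() * ones
        X[:n_disc] = np.clip(X[:n_disc], lo, hi)
        disc_fit = np.array([f(x) for x in X[:n_disc]])
        x_b = X[int(disc_fit.argmin())].copy()   # best discoverer position

        # Joiners: the worse half relocates, the rest move around x_b.
        for i in range(n_disc, pop):
            if i + 1 > pop / 2:
                X[i] = rng.normal() * np.exp((x_worst - X[i]) / (i + 1) ** 2)
            else:
                a = rng.choice([-1.0, 1.0], size=dim)
                a_plus = a / dim              # A^T (A A^T)^-1 for a +/-1 row
                X[i] = x_b + (np.abs(X[i] - x_b) @ a_plus) * ones
        X = np.clip(X, lo, hi)
        fit = np.array([f(x) for x in X])
        f_worst, x_w_now = fit.max(), X[fit.argmax()].copy()

        # Guarders: a random subset reacts to danger.
        for i in rng.choice(pop, size=n_guard, replace=False):
            if fit[i] > best_f:
                X[i] = best_x + rng.normal() * np.abs(X[i] - best_x)
            else:
                k = rng.uniform(-1.0, 1.0)
                X[i] = X[i] + k * np.abs(X[i] - x_w_now) / ((fit[i] - f_worst) + eps)
        X = np.clip(X, lo, hi)
        fit = np.array([f(x) for x in X])

        # Keep the best solution seen so far.
        if fit.min() < best_f:
            best_x, best_f = X[fit.argmin()].copy(), float(fit.min())
    return best_x, best_f


def sphere(x):
    """Toy objective standing in for a validation loss."""
    return float(np.sum(x ** 2))


best_x, best_f = sparrow_search(sphere, -5.0, 5.0, dim=3)
```

For hyperparameter tuning, each coordinate of a position would encode one hyperparameter (for example a learning rate or a hidden size, rescaled to the search box), and `f` would train and validate the model for that setting. Because every fitness evaluation is then a full training run, small populations and iteration budgets are the practical choice.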

Algorithm 1: Pseudocode of SSO algorithm
Define $Iter_{max}$, $NP$, $n$, $P_{dp}$, $s_f$, $G_c$, $FS_U$, and $FS_L$
Randomly initialize the flying squirrels' locations
Create new locations:
for t = 1 : n1 (n1 = total count of squirrels on acorn trees)
    if $R_1 \geq P_{dp}$
        $FS_{at}^{new} = FS_{at}^{old} + d_g G_c (FS_{ht}^{old} - FS_{at}^{old})$
    else
        $FS_{at}^{new}$ = random location
    end
end
for t = 1 : n2 (n2 = total count of squirrels on normal trees moving to acorn trees)
    if $R_2 \geq P_{dp}$
        $FS_{nt}^{new} = FS_{nt}^{old} + d_g G_c (FS_{at}^{old} - FS_{nt}^{old})$
    else
        $FS_{nt}^{new}$ = random location
    end
end
for t = 1 : n3 (n3 = total count of squirrels on normal trees moving to hickory trees)
    if $R_3 \geq P_{dp}$

Severity Extraction Process
SentiWordNet and WordNet-Affect are general-purpose lexicons commonly used for extracting the sentiment of texts such as movie and social reviews. Subjectivity lexicons have been utilized for extracting subjective expressions in arguments or text statements. Several general-purpose and subjectivity lexicons have been adapted in medical research to distinct healthcare tasks. A broad pharmaceutical lexicon has also been developed specifically for the biomedical and healthcare domains and has been used for extracting the sentiment of clinical and pharmaceutical text. Here, the polarity of each candidate sentence is extracted using SentiWordNet, and the interaction is classified as low, moderate, or high severity, since whether a DDI is dangerous or advantageous depends on the polarity of the candidate sentence.
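The idea of mapping sentence polarity to a severity level can be sketched with a toy lexicon. Everything specific here is hypothetical: the word scores in `TOY_LEXICON` are invented for illustration and are not SentiWordNet values, and the thresholds 0.3 and 0.7 and the use of the polarity magnitude are assumptions rather than the settings used in this work.

```python
# Hypothetical polarity scores in [-1, 1]; NOT taken from SentiWordNet.
TOY_LEXICON = {
    "reduce": -0.4,
    "increase": 0.3,
    "enhance": 0.4,
    "risk": -0.5,
    "toxicity": -0.9,
    "fatal": -1.0,
}


def sentence_polarity(sentence):
    """Average the lexicon scores of the words that appear in the sentence."""
    words = [w.strip(".,;:") for w in sentence.lower().split()]
    scores = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    return sum(scores) / len(scores) if scores else 0.0


def severity(sentence, low=0.3, high=0.7):
    """Map the polarity magnitude to a low / moderate / high level (assumed cut-offs)."""
    magnitude = abs(sentence_polarity(sentence))
    if magnitude >= high:
        return "high"
    if magnitude >= low:
        return "moderate"
    return "low"


level = severity("Drug A might reduce the effect of Drug B")
```

The sign of the polarity could additionally separate dangerous from advantageous interactions, while the magnitude drives the severity level; a real system would replace the toy dictionary with scores looked up in the chosen lexicon.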

Conclusions
In this study, we introduced a novel SSODL-DDIP technique for DDI predictions in big data environments. The presented SSODL-DDIP technique determined the relationships and properties of drugs from various sources to make predictions. In addition, the MLSTM-AE model was employed for the DDI prediction process. Furthermore, a lexicon-based approach was used to determine the severity of interactions among the DDIs. To improve the prediction outcomes of the MLSTM-AE model, the SSO algorithm was adopted in this work. To assess the performance of the SSODL-DDIP technique, a wide range of simulations were performed. The experimental outcomes show the promising performance of the SSODL-DDIP technique over recent state-of-the-art methodologies. Thus, the SSODL-DDIP technique can be employed for improved DDI predictions. In the future, hybrid metaheuristics could be designed to improve the prediction performance. In addition, outlier detection and clustering techniques could be integrated to enhance the predictive results of the proposed model.

Data Availability Statement:
Data sharing is not applicable to this article as no datasets were generated during the current study.

Conflicts of Interest:
The authors declare that they have no conflict of interest. The manuscript was written through contributions of all authors. All authors have given approval for the final version of the manuscript.