Computational Drug Repositioning: Current Progress and Challenges

Novel drug discovery is a time-consuming, costly, and high-investment process because of its high attrition rate. Therefore, many attempts are made to reuse existing drugs to treat pressing conditions and diseases, since their safety profiles and pharmacokinetics are already available. Drug repositioning is a strategy to identify new indications for existing or already approved drugs, beyond the scope of their original use. Various computational and experimental approaches that incorporate available resources have been suggested for gaining a better understanding of disease mechanisms and for identifying repurposed drug candidates for personalized pharmacotherapy. In this review, we introduce publicly available databases for drug repositioning and summarize the approaches taken for drug repositioning. We also highlight and compare their characteristics and the challenges that should be addressed for the future realization of drug repositioning.


Introduction
Many pharmaceutical companies are working on the development of new drugs, but the rate of approval through clinical trials is very low relative to the financial and time investment. In fact, it takes about 14 years to develop a new drug through de novo drug discovery [1] (Figure 1). Despite prodigious increases in pharmaceutical R&D spending over the last several years, productivity has decreased or stagnated in terms of the number of new drugs approved or the number of original Investigational New Drug (IND) applications from commercial sources [2]. In response, interest in repurposing clinically approved drugs has increased. The aim of drug repositioning, also known as drug repurposing, is to identify new indications for existing drugs [3]. Because a repositioned drug has already been shown to be safe in preclinical models and in humans, only its effectiveness against the new disease needs to be tested, so the burden of failure is comparatively small (Table 1). Specifically, development risk is reduced because in vitro/in vivo screening, toxicology, chemical optimization, and formulation development have already been done and can be omitted. Thus, pharmaceutical companies and researchers have started to invest heavily in drug repositioning, which offers a markedly better risk-reward trade-off than de novo drug discovery.
Most drug molecules act on more than one target protein, making an understanding of polypharmacology crucial for developing drug repositioning strategies. With recent advances in technology, various methods of applying polypharmacology to drug repositioning have been developed [2,4-9]. In this review, we present recent advances in the field of computational drug repositioning, classify them according to how each strategy uses available data to achieve its goal, and present recently updated data resources.

Overview of Drug Repositioning
Drug repositioning methods can be broadly divided into computational and experimental approaches. Sometimes these are combined to gain the benefits of both computational methods and experimental screening. Furthermore, studies aiming to find more broadly successful candidate drugs have progressed by combining various types of data, such as genomic data, proteomic data, clinical health data, and literature data.

Profile-Based Drug Repositioning
Profile-based drug repositioning uses the bioactivity profile of a drug in tandem with other drug or disease profiles. These profiles can be categorized as expression profiles, chemical structure profiles, or clinical profiles. Expression profiles can be used to identify both drug-drug and drug-disease relationships. Repositioned drugs reveal shared therapeutic mechanisms, which are reflected in similar expression profiles under different biological conditions, such as cell types or tissues.
These mechanisms allow us to identify candidate drugs and alternative targets of drugs. Comparing transcriptomic profiles between drugs, or between drugs and diseases, can reveal hidden relationships between them. In the drug-disease case, a candidate drug is one whose expression profile is the reverse of that of the target disease: the disease profile captures the dysregulated molecular expression caused by the disease, which can potentially be restored by a drug with the opposite transcriptomic profile. Put simply, genes over-expressed in the diseased condition can be targeted with a drug that induces their downregulation. The principle of this computational strategy is simple, yet it is effective at identifying drugs that shift expression toward a healthy state rather than a diseased state. A strong negative correlation between the transcriptomic profiles of a drug and a disease suggests a potential effect of the drug on the disease. The Connectivity Map (cMap), developed by the Broad Institute, is a publicly accessible gene expression database generated by treating multiple cell types with more than 1300 compounds [17]. cMap can be combined with other public repositories of gene expression data, such as those described in Table 2, to make it more effective. The complete profiles obtained from these databases can be used to identify novel drug-disease relationships and potential candidate drugs. For example, the expression profile-based approach identified a new indication for phenoxybenzamine, an anti-hypertensive drug, as an analgesic and antinociceptive drug [35].
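The negative-correlation idea can be illustrated with a minimal sketch. It assumes that each profile is a mapping from gene name to a log fold-change value and uses Pearson correlation as the reversal score; this is a simplification chosen for illustration, not necessarily the scoring used by cMap or by any cited study. All gene names, drug names, and values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(vx * vy)


def rank_reversing_drugs(disease_profile, drug_profiles):
    """Sort drugs from most negative to most positive correlation
    with the disease signature (most negative = strongest reversal)."""
    genes = sorted(disease_profile)
    disease_vec = [disease_profile[g] for g in genes]
    scores = {}
    for drug, profile in drug_profiles.items():
        drug_vec = [profile[g] for g in genes]
        scores[drug] = pearson(disease_vec, drug_vec)
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical log fold-changes (disease vs. healthy, treated vs. untreated).
disease = {"GENE_A": 2.0, "GENE_B": 1.5, "GENE_C": -1.0, "GENE_D": -2.5}
drugs = {
    "drug_X": {"GENE_A": -1.8, "GENE_B": -1.2, "GENE_C": 0.9, "GENE_D": 2.1},
    "drug_Y": {"GENE_A": 1.9, "GENE_B": 1.0, "GENE_C": -0.7, "GENE_D": -2.0},
}
ranking = rank_reversing_drugs(disease, drugs)
for name, score in ranking:
    print(f"{name}: {score:+.3f}")
```

In this toy example, drug_X ranks first because its profile opposes the disease signature, whereas drug_Y mimics it. Real analyses would need to handle missing genes, batch effects, and multiple cell types, none of which this sketch addresses.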
The second type of profile-based strategy uses chemical structure, including structure-based techniques such as molecular docking [36-44]. Similar chemical structures tend to lead to similar biological activities [45]. Candidate drugs predicted by structural similarity are used to identify new drug-target associations. The similarity of ligand binding sites is of particular use in identifying potential drug-disease pairs [40,41,46,47]. For example, if protein A, a causative target of disease A, has a local structure similar to that of protein B, a therapeutic target of drug B, we can predict that drug B may be used for disease A. This simple strategy is effective in drug repositioning. However, identifying candidate drugs based solely on chemical structure may be difficult, as biological activity can alter the chemical structures of target molecules. Thus, other information should be considered to improve this strategy's predictive performance. Molecular docking is a promising method for predicting potential drug candidates. The computational prediction of complementarity between a ligand (i.e., a drug) and the binding site of its target can identify novel interactions and uses for drug repositioning. A statistics-based chemoinformatics approach for predicting new off-targets of FDA-approved drugs has been developed and validated against known drug-target associations, revealing previously unreported drug polypharmacology [41]. However, the lack of 3D structures of drug targets and of structural information on molecules limits the predictive power of this strategy.
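As a minimal sketch of the structural-similarity principle, the example below compares compounds by the Tanimoto coefficient of their fingerprints. It assumes each compound has already been encoded as a set of "on" fingerprint bit indices by some external cheminformatics tool; the fingerprints, compound names, and the 0.5 threshold are hypothetical and serve only to illustrate the idea.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0  # two empty fingerprints: define similarity as 0
    return len(a & b) / len(union)


def similar_compounds(query_fp, library, threshold=0.5):
    """Return (name, similarity) pairs at or above the threshold, best first."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in library.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical fingerprints: sets of bit positions that are switched on.
query = {1, 4, 7, 9, 12}
library = {
    "drug_B": {1, 4, 7, 9, 15},
    "drug_C": {2, 3, 5},
}
hits = similar_compounds(query, library)
print(hits)
```

A compound that passes the threshold is only a hypothesis that it may share targets with the query, which is consistent with the caution above that structure alone may not be sufficient.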
Profiling based on side-effects is another critical resource for candidate-target prediction, as each drug has its own adverse effects. Similar side-effects reflect a shared physiology between two different drugs [48]. Matching adverse effect profiles of two drugs strongly suggest a similar mode of action (MOA) on targets or pathways. Thus, identifying drugs with similar side-effects points to candidates that could potentially replace an original drug. It is also possible that the adverse effect profile of a specific drug is similar to the expression profile of a particular disease, which gives a critical clue to shared physiology between a drug and a disease [49]. This concept has been successfully applied to identify novel drug-target relationships. Although this is a very promising approach, it is limited by the lack of side-effect profiles from drug manufacturers.
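One way to compare side-effect profiles is sketched below. It assumes each drug is described by a set of side-effect terms and weights each term by its rarity across the drug collection, so that a side effect shared by every drug contributes nothing. The rarity weighting is an assumption of this sketch rather than a detail given in the text, and all drug names and profiles are hypothetical.

```python
from collections import Counter
from math import log


def side_effect_weights(profiles):
    """Weight each side effect by log(N / number of drugs reporting it)."""
    n_drugs = len(profiles)
    counts = Counter(effect for effects in profiles.values() for effect in set(effects))
    return {effect: log(n_drugs / count) for effect, count in counts.items()}


def weighted_jaccard(effects_a, effects_b, weights):
    """Rarity-weighted Jaccard similarity between two side-effect sets."""
    a, b = set(effects_a), set(effects_b)
    union_weight = sum(weights.get(effect, 0.0) for effect in a | b)
    if union_weight == 0:
        return 0.0
    shared_weight = sum(weights.get(effect, 0.0) for effect in a & b)
    return shared_weight / union_weight


# Hypothetical side-effect profiles.
profiles = {
    "drug_P": {"nausea", "headache", "visual_disturbance"},
    "drug_Q": {"nausea", "visual_disturbance", "flushing"},
    "drug_R": {"nausea", "headache", "rash"},
    "drug_S": {"nausea", "dizziness"},
}
weights = side_effect_weights(profiles)
sim_pq = weighted_jaccard(profiles["drug_P"], profiles["drug_Q"], weights)
sim_ps = weighted_jaccard(profiles["drug_P"], profiles["drug_S"], weights)
print(f"P vs Q: {sim_pq:.2f}, P vs S: {sim_ps:.2f}")
```

Here drug_P and drug_S share only a side effect that every drug reports, so their weighted similarity is zero even though an unweighted Jaccard index would be positive.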

Network-Based Drug Repositioning
Along with profile-based drug repositioning methods, methods based on drug-target interactions have been widely used [50-52]. With the success of the Human Genome Project, many studies have been conducted to find genetic variants that affect common diseases using data from genome-wide association studies (GWAS) [53-56]. Drug repositioning studies using GWAS data often focus on whether proteins with disease-associated mutations are the direct targets of drugs treating a disease.
When the genes obtained from a GWAS are not druggable targets, a network-based approach can provide new candidate genes whose products interact with those of the GWAS-associated genes. This approach combines information on drugs and their druggable targets obtained from various databases [10,29,57] to predict potential drug candidates.
However, these studies may be limited in their capacity to distinguish false-positive genes from true-positive genes because of linkage disequilibrium [53]. Some genes show spurious disease associations in GWAS simply because they are co-located with disease-associated genes. Many studies have found new candidate drugs by combining gene-gene and gene-drug networks or by expanding existing target genes using their biological pathways [53-56]. When disease-related genes identified through GWAS are not targets of actual drugs, network-based methods expand them to a set of genes that are then considered as new candidate drug targets. Network-based strategies construct drug-disease networks from various data (e.g., gene expression, protein-protein interaction (PPI), GWAS) and identify potential candidate drugs [53,58]. These interactions can be extracted directly from existing databases and studies or inferred indirectly using computational algorithms.
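The expansion from GWAS-associated genes to druggable interaction partners can be sketched as a simple neighbor lookup. The example assumes an undirected PPI edge list and a target-to-drug mapping, and it expands by exactly one interaction step; the function name, gene identifiers, and drug names are hypothetical, and real pipelines would draw these inputs from curated databases.

```python
def expand_to_druggable(gwas_genes, ppi_edges, drug_targets):
    """Map each drug to the (GWAS gene, targeted gene) pairs that link it to the disease.

    A drug qualifies if it targets a GWAS gene directly or targets a
    first-degree PPI neighbor of a GWAS gene.
    """
    neighbors = {}
    for u, v in ppi_edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    candidates = {}
    for gene in gwas_genes:
        reachable = {gene} | neighbors.get(gene, set())
        for target in reachable:
            for drug in drug_targets.get(target, ()):
                candidates.setdefault(drug, set()).add((gene, target))
    return candidates


# Hypothetical inputs.
gwas_genes = {"G1", "G2"}
ppi_edges = [("G1", "T1"), ("G2", "T2"), ("T2", "T3")]
drug_targets = {"T1": ["drug_A"], "G2": ["drug_B"], "T3": ["drug_C"]}

candidates = expand_to_druggable(gwas_genes, ppi_edges, drug_targets)
print(candidates)
```

In this toy network, drug_A is reached through a neighbor of G1, drug_B targets G2 directly, and drug_C is excluded because its target lies two steps away. Such a lookup does nothing to resolve the linkage disequilibrium problem described above, so false-positive GWAS genes would propagate to the candidate list.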
Various network-based approaches have been applied to extend existing core disease-associated genes [46,59-66]. Network-based clustering algorithms have been used to identify biological modules (also known as subnetworks) in drug-target and drug-disease networks based on their topological structure, and the identified modules can be used effectively to select new candidate drug-target relationships [59-62]. Vismodegib, an inhibitor of the Hedgehog signaling pathway, was identified as a treatment for Gorlin syndrome [63], and iloperidone, an atypical antipsychotic used to treat schizophrenia, was identified as a potential new drug for hypertension [60]. Network-based propagation approaches have also been widely applied to identify potential candidate drugs [47,64,65,67]. Given a set of genes associated with a specific disease, the set is extended to genes in its neighborhood in a PPI network or gene-gene network using a random walk propagation algorithm. Based on this approach, donepezil, methotrexate, gabapentin, risperidone, and cisplatin were identified as candidates for new indications in Parkinson's disease, Crohn's disease, anxiety disorder, obsessive-compulsive disorder, and breast cancer, respectively [67].
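The propagation step can be illustrated with a random walk with restart on a small undirected network. This is a generic sketch, not the implementation used in the cited studies: the restart probability of 0.3, the convergence tolerance, and the node names are all assumptions made for illustration.

```python
def random_walk_with_restart(edges, seeds, restart=0.3, tol=1e-8, max_iter=1000):
    """Return a steady-state visiting probability for every node.

    At each step the walker either jumps back to a seed node (probability
    `restart`) or moves to a uniformly chosen neighbor of its current node.
    """
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    seeds = [s for s in seeds if s in adjacency]
    if not seeds:
        raise ValueError("no seed gene is present in the network")

    nodes = sorted(adjacency)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    prob = dict(start)

    for _ in range(max_iter):
        updated = {n: restart * start[n] for n in nodes}
        for u in nodes:
            # Spread the non-restarting mass evenly over the neighbors of u.
            share = (1.0 - restart) * prob[u] / len(adjacency[u])
            for v in adjacency[u]:
                updated[v] += share
        change = sum(abs(updated[n] - prob[n]) for n in nodes)
        prob = updated
        if change < tol:
            break
    return prob


# Hypothetical path-shaped network: SEED - N1 - N2 - N3.
edges = [("SEED", "N1"), ("N1", "N2"), ("N2", "N3")]
scores = random_walk_with_restart(edges, seeds=["SEED"])
for node, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{node}: {score:.4f}")
```

Genes that score highly but are not seeds are the extended candidates; in this toy path the score falls with distance from the seed. Edge weights, which many real networks carry, are ignored here.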

Data-Based Drug Repositioning
The most representative example of clinical data being used for drug repositioning is sildenafil [2]. This drug was originally developed as a treatment for angina; however, owing to an unexpected side-effect, sildenafil was eventually used for erectile dysfunction and sold under the brand name Viagra. Several other examples have been identified from retrospective analysis, such as raloxifene in breast cancer and aspirin in colorectal cancer. These repositioning discoveries were made not through systematic analysis of clinical data but through simple clinical analyses. Systematic approaches that combine various kinds of clinical data, such as electronic health record (EHR) and clinical trial data, have since increased. EHRs hold enormous amounts of data on the clinical history of patients, including laboratory test results, drug prescription data, symptom descriptions, and image data. These data have been used for drug repositioning; however, limited accessibility and the intrinsic noise of EHR data limit their utility. Ethical issues in using patient data and the unstructured nature of much EHR information remain major challenges. By applying natural language processing (NLP) and machine learning techniques to EHRs, clinical symptom information can be used effectively for drug repositioning.
Another kind of data-based drug repositioning relies on text mining [66,68-71]. The medical and biological literature describing relationships among biological entities, such as drug-disease, drug-target, and disease-gene relationships, is accumulating rapidly. Various text-mining methods have been applied to identify potential disease-drug relationships from the literature. For example, drug-gene or disease-gene relationships are extracted from PubMed abstracts based on frequent co-occurrence patterns, and potential indications of existing drugs are then predicted from sentence-based networks or drug-disease-gene networks. This approach suggested that diltiazem and quinidine, drugs for hypertension and arrhythmia, respectively, could be used to treat Alzheimer's disease [66].
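The co-occurrence idea can be sketched as counting how often a drug name and a disease name appear in the same sentence. The example assumes single-token entity names and exact dictionary matching, which is far simpler than the named-entity recognition a real pipeline would need; the sentences and entity names are hypothetical.

```python
import re
from collections import Counter
from itertools import product


def cooccurrence_counts(sentences, drugs, diseases):
    """Count sentences in which a drug name and a disease name both appear."""
    counts = Counter()
    for sentence in sentences:
        tokens = set(re.findall(r"[a-z0-9]+", sentence.lower()))
        found_drugs = {d for d in drugs if d.lower() in tokens}
        found_diseases = {d for d in diseases if d.lower() in tokens}
        counts.update(product(found_drugs, found_diseases))
    return counts


# Hypothetical sentences standing in for abstract text.
sentences = [
    "drugA reduced amyloid burden in diseaseX models.",
    "No effect of drugB on diseaseX was observed.",
    "drugA is also studied in diseaseX, and in diseaseY.",
]
counts = cooccurrence_counts(sentences, ["drugA", "drugB"], ["diseaseX", "diseaseY"])
for pair, n in counts.most_common():
    print(pair, n)
```

The second sentence shows a known weakness of raw co-occurrence: a negated finding still counts as a link, so frequent pairs are candidates for further review rather than confirmed relationships.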

Barriers to Drug Repositioning
Despite the benefits of drug repositioning, identifying potential new uses for existing drugs still requires a considerable amount of high-risk investment. Repositioned drugs may fail to show adequate efficacy in clinical trials even when safety requirements are satisfied; some reasons are outlined below.

Dose-Dependency
The appropriate dose of a drug can differ from one disease to another. The prescribed dosage of a candidate drug is critical to its therapeutic efficacy. Once a drug is selected for repositioning, its appropriate dose for the new indication should also be studied through clinical trials.

Data Availability and Heterogeneity
Open-source models have been proposed in response to the increase in publicly available expression data; however, public access to certain types of data, such as clinical patient data, is limited, and such data require extensive processing before they can be used and interpreted directly. Moreover, because the data are heterogeneous, combining different types of data, such as transcriptomic data, chemical structure data, and clinical literature data, poses another computational challenge for effective drug repositioning.

Patenting of Drug
Patenting the predicted potential drugs is very challenging when novel indications are identified within the same drug category [72]. Systematic collections of drug repositioning patents, along with support systems to help extract relevant patents, will help determine the patent ownership of potential drug-disease pairs.

Validation of Drug
Combining in silico prediction and in vitro validation is necessary to achieve the ultimate success of drug repositioning. Various drug repositioning approaches identify novel drug-disease relationships and they can be combined with clinical records such as EHR, health insurance records, or physical exam data for effective determination of potential drugs. High throughput screening of chemicals using in vitro or in vivo systems could help to validate the predicted potential drugs. However, most in vitro systems differ from physiological conditions and thus, in vitro cell culture models resembling in vivo tissue and disease pathology should be considered for validation.

Next Step for Drug Repositioning
Evaluating potential candidate drugs in clinical use is critical for carrying out drug repositioning strategies. Although in silico evaluations have shown success in repositioning candidate drugs, some challenges remain. One major challenge is that more drug-target and drug-disease information is needed. Specifically, both true-positive and true-negative pairs are needed to accurately predict new disease-drug pairs. Current knowledge of disease-drug pairs rarely distinguishes pairs that are simply unidentified from pairs that are truly negative, which limits the identification of potential drug-disease pairs. Without true-negative pairs in the training data, computational models for drug prediction cannot reach their full predictive power. In addition, potentially repositioned drugs identified through models cannot be used clinically without passing safety regulations and demonstrating efficacy against their intended diseases. Thus, a combination of in silico prediction and in vitro validation, or validation through retrospective analysis of clinical history, is imperative for successful drug repositioning [73,74].

Conclusions
During the past several years, there has been increased interest in drug repositioning. Although there have been a number of serendipitous drug-repositioning discoveries, pharmaceutical companies have also made a concerted effort to exploit the benefits of drug repositioning. Many methods can lead to the successful identification of potentially repurposed drugs. The effective combination of different strategies and available data would bring dramatic advances to the field of computational drug repositioning. This review shows that, through its various methods, drug repositioning can be more cost-effective, practical, and safe than the conventional de novo drug design approach.

Conflicts of Interest:
The authors declare no conflict of interest.