Information Drug Name Recognition: Approaches and Resources

Drug name recognition (DNR), which seeks to recognize drug mentions in unstructured medical texts and classify them into pre-defined categories, is a fundamental task of medical information extraction, and is a key component of many medical relation extraction systems and applications. A large number of efforts have been devoted to DNR, and great progress has been made in DNR in the last several decades. We present here a comprehensive review of studies on DNR from various aspects such as the challenges of DNR, the existing approaches and resources for DNR, and possible directions.


Introduction
With the rapid development of information technology, more and more medical documents are available, which contain a great amount of medical information, such as medical entities and relations between them. In order to take full advantage of medical texts, it is necessary to extract valuable information from them. Drugs, as one type of the basic medical elements, also need to be recognized. Drug name recognition (DNR), which seeks to recognize drug mentions in unstructured medical texts and classify them into pre-defined categories, is a fundamental task of medical information extraction, and is a key component of many medical relation extraction systems (e.g., drug-drug interactions [1] and adverse drug reactions [2]) and applications (e.g., information retrieval, information management, information tracking, clinical decision support, drug discovery and drug development) [3].
Drug mentions widely exist in various types of medical texts including medical literature, electronic medical records, medical patent applications, clinical trial documents, etc. For different types of medical texts, a large number of efforts have been devoted to DNR in the last several decades, generating various kinds of approaches and resources. DNR has been a subtask of several public challenges in the medical domain such as the medication extraction challenge organized by the Center of Informatics for Integrating Biology and the Bedside (i2b2) in 2009 [4], the chemical and drug named entity recognition (CHEMDNER) challenge of the Critical Assessment of Information Extraction systems in Biology in 2013 (BioCreAtIvE IV) [5] and the drug-drug interaction (DDIExtraction) challenge in 2013 [6].
Although there are some reviews on information extraction approaches for drugs and chemical compounds [7][8][9], none of them discusses DNR specifically. In this study, we focus on DNR and present a comprehensive review including many resources, tools and approaches that are not covered by previous reviews. Moreover, we introduce the challenges and possible directions for DNR.

Challenges of Drug Name Recognition
DNR is a typical named entity recognition (NER) task. It is particularly challenging for the following reasons:
• The ways of naming drugs vary greatly. For example, the drug "quetiapine" (generic name) has the brand name "Seroquel XR", while its systematic International Union of Pure and Applied Chemistry (IUPAC) name is "2-[2-(4-dibenzo[b,f][1,4]thiazepin-11-ylpiperazin-1-yl)ethoxy]ethanol". Furthermore, some drug names and their synonyms are the same as normal English words or phrases. For example, brand names of "oxymetazoline nasal" and "caffeine" are "Duration" and "Stay Awake", respectively.
• The frequent occurrence of abbreviations and acronyms makes it difficult to identify the concepts to which the terms refer. For example, the abbreviation "PN" can refer to the drug "penicillin" or other concepts such as "pneumonia", "polyarteritis nodosa" and "polyneuritis".
• New drugs are constantly and rapidly reported in scientific publications. Moreover, drug names may be misspelled in electronic medical records such as progress notes and discharge summaries. This makes DNR systems that rely only on dictionaries of known drug names ineffective.
• Some drug names may correspond to non-continuous strings of text. For example, "loop diuretics" and "potassium-sparing diuretics" are non-continuous in the sentence "In some patients, the administration of a non-steroidal anti-inflammatory agent can reduce the diuretic, natriuretic, and antihypertensive effects of loop, potassium-sparing and thiazide diuretics". Such examples pose great difficulties to DNR.

Benchmark Datasets
The development and evaluation of DNR approaches require benchmark datasets where all drug names are annotated by human experts. Benchmark datasets can be used for training machine learning-based approaches and comparing different approaches. Table 1 lists some available benchmark datasets for DNR. Some datasets in Table 1 are not developed for DNR, but drug names are annotated in them. Therefore, they can be used for DNR. For example, ADE is developed for extraction of adverse drug effects and PK, PK-DDI and DDIExtraction 2011 are developed for extraction of drug-drug interactions. Moreover, since the datasets in Table 1 are developed for different tasks, definitions of drugs vary significantly in different datasets. Datasets such as ADE, EU-ADR and DDIExtraction 2011 only define a single class of drugs, while other datasets such as PK-DDI and DDIExtraction 2013 define multiple different classes of drugs.
To evaluate the performance of DNR approaches, precision, recall and F-score of DNR approaches on the benchmark datasets are measured. Precision is the percentage of correctly recognized drug names over all recognized results by an approach. Recall is the percentage of correctly recognized drug names over all drug names annotated in the benchmark datasets. F-score is the harmonic mean of precision and recall.
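The three measures defined above can be sketched in a few lines of Python. This is a minimal illustration, assuming exact matching and assuming that each mention is represented as a hypothetical (start, end, class) tuple; the benchmark challenges may use additional matching criteria that are not modeled here.

```python
def evaluate(predicted, gold):
    """Exact-match precision, recall and F-score over drug mentions.

    Each mention is a (start, end, label) tuple; this representation is an
    assumption made for the sketch, not a format prescribed by any dataset.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-score is the harmonic mean of precision and recall.
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Toy example: the second prediction has the right span but the wrong class.
gold = [(0, 9, "drug"), (15, 22, "brand"), (30, 41, "group")]
predicted = [(0, 9, "drug"), (15, 22, "drug"), (30, 41, "group"), (50, 55, "drug")]
p, r, f = evaluate(predicted, gold)
print(p, r, f)
```

Dropping the class label from the tuples turns the same function into a drug detection score, which is why detection scores are never lower than classification scores for the same output.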

Table 1 (columns: Dataset, Data Source, URL).
Figure 1. Typical procedure of a DNR system.
(1) Preprocessing: Preprocessing aims at transforming the original input texts into representations required by DNR approaches and enriching the original texts with lexical and syntactic information. Preprocessing includes sentence splitting, tokenization, part-of-speech (POS) tagging, text chunking, lemmatization, etc. The output information of preprocessing can be used to induce rules or generate features for DNR approaches. The selection of suitable strategies or methods for preprocessing has a significant impact on the performance of DNR systems [16,17]. Dai et al. [16] investigated the effects of coarse-grained and fine-grained tokenization strategies on DNR. For the coarse-grained tokenization, Penn Treebank tokenization rules [18] were used. The fine-grained tokenization strategy applied some extra preprocessing steps to the tokens generated by coarse-grained tokenization, e.g., adding separations before and after symbols such as hyphens and dashes. It was demonstrated that fine-grained tokenization performed better than coarse-grained tokenization. Batista-Navarro et al. [17] focused on the effects of sentence splitting and tokenization on recognition of drugs and chemicals from chemical literature. They compared non-specialized implementations of sentence splitting and tokenization with specialized implementations tuned for chemical literature. Specialized implementations achieved better performance than non-specialized implementations.
There are many open source natural language processing (NLP) toolkits that can be used for preprocessing in DNR systems. Table 2 lists some commonly used NLP toolkits. For each preprocessing task, NLP tools based on different methods and tuned for different types of texts are available. It is important to select appropriate NLP tools to preprocess texts of different fields. The unstructured information management architecture (UIMA) [19,20] makes comparing and selecting NLP tools simple. Based on UIMA, it is easy to plug an NLP tool into existing text processing pipelines or combine NLP tools into text processing pipelines. For example, U-Compare [21,22] is an integrated NLP system based on UIMA. It provides a large collection of NLP tools and allows sets of tools to be run in parallel on the same inputs. Moreover, it can automatically generate statistics for all possible combinations of these tools.
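The difference between the two tokenization strategies described above can be illustrated with a small sketch. It is not the implementation of Dai et al. [16]: whitespace splitting stands in for the Penn Treebank rules, and the symbol set in `SYMBOLS` is an assumption chosen for illustration.

```python
import re

# Symbols around which the fine-grained step adds separations (assumed set:
# hyphen, en dash, em dash, slash, brackets and comma).
SYMBOLS = re.compile(r"([-\u2013\u2014/()\[\],])")


def coarse_tokenize(text):
    # Simplified stand-in for coarse-grained tokenization: whitespace only.
    return text.split()


def fine_tokenize(text):
    # Split every coarse token before and after each symbol, keeping the symbol.
    tokens = []
    for token in coarse_tokenize(text):
        tokens.extend(piece for piece in SYMBOLS.split(token) if piece)
    return tokens


print(coarse_tokenize("potassium-sparing diuretics"))
print(fine_tokenize("potassium-sparing diuretics"))
print(fine_tokenize("2-(4-chlorophenyl)"))
```

With the fine-grained output, "sparing" and "potassium" become separate tokens, so features such as suffixes or dictionary matches can be computed for each part of a hyphenated name.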
Table 2. Open source natural language processing (NLP) toolkits for preprocessing in DNR systems.
(2) ... them into predefined categories. Knowledge resources play important roles in DNR approaches. They can be used to match drug names, induce rules and generate features for DNR approaches. DNR approaches and knowledge resources will be introduced in detail in the following sections.
(3) Postprocessing: In the postprocessing step, heuristic rules and knowledge resources are commonly used to refine the recognition results of DNR approaches [30][31][32]. For instance, Grego et al. [30] filtered out recognized drug names composed entirely of digits and removed characters such as "*", "−" and "." from recognized drug names if the characters appeared at the end of the names. Leaman et al. [31] regarded a mention of a drug or chemical with unbalanced parentheses as an error. They balanced the parentheses by adding or removing one character at the right or left of the mention. Grego et al. [32] calculated the semantic similarities between drugs identified by a DNR system in a given text window based on semantic relationships in a drug knowledge base. They assigned a single validation score to each identified drug based on its similarities to other drugs and then filtered falsely identified drugs using a given threshold to increase the precision of the DNR system.
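The heuristic rules described above can be sketched as follows. This is a simplified illustration rather than the code of the cited systems: mentions with unbalanced parentheses are simply discarded here, whereas Leaman et al. [31] repaired them by extending or shrinking the mention, which requires access to the surrounding text.

```python
# Characters stripped from the end of a mention (hyphen and minus sign included).
TRAILING = "*-\u2212."


def has_balanced_parentheses(mention):
    depth = 0
    for ch in mention:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a closing parenthesis without an opening one
                return False
    return depth == 0


def postprocess(mentions):
    cleaned = []
    for mention in mentions:
        mention = mention.rstrip(TRAILING)
        if not mention or mention.isdigit():
            continue  # drop empty and digit-only mentions
        if not has_balanced_parentheses(mention):
            continue  # simplification: discard instead of repairing
        cleaned.append(mention)
    return cleaned


raw = ["aspirin.", "123", "ibuprofen*", "2-(4-isobutylphenyl", "warfarin"]
print(postprocess(raw))
```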

Approaches for Drug Name Recognition
Approaches for DNR can be classified into four categories: dictionary-based, rule-based, machine learning-based and hybrid approaches.

Dictionary-Based Approaches
Drug dictionaries refer to collections of drug names. They can be constructed manually or automatically from publicly available knowledge resources such as databases and ontologies containing synonyms or spelling variants of drug names. Different knowledge resources contain different terms. Some knowledge resources focus on drugs, while others focus on general chemicals. Therefore, drug dictionaries are usually constructed by merging several knowledge resources. Before reviewing the dictionary-based approaches, we introduce some freely available knowledge resources and describe how to construct drug dictionaries from them. The web-accessible URLs of the knowledge sources are listed in Table 3.
DrugBank is an online database that contains chemical, pharmacological and pharmaceutical information about drugs and comprehensive drug target information [33]. Fields such as "name", "synonyms" and "international-brands" in DrugBank can be extracted to build a drug name dictionary.
Kyoto Encyclopedia of Genes and Genomes (KEGG) DRUG is a drug information resource for approved drugs in Japan, USA and Europe [34]. The "Name" field in KEGG DRUG can be used for the creation of a drug name dictionary.
Pharmacogenomics Knowledgebase (PharmGKB) is a comprehensive resource that curates knowledge about the impact of genetic variation on drug response [35]. It provides a drug list, and the "name", "generic names" and "trade names" fields in the drug list can be collected to construct a drug name dictionary.
Comparative Toxicogenomics Database (CTD) is a publicly available database that provides manually curated information about chemical-gene interactions, chemical-disease and gene-disease relationships [36]. The "ChemicalName" field can be extracted to build a drug name dictionary.
RxNorm is a standardized nomenclature for clinical drugs [37]. It is created by the United States National Library of Medicine (NLM) to let various systems using different drug nomenclatures share and exchange data efficiently. The "ingredient (IN)" and "brand name (BN)" fields can be extracted to build a drug name dictionary.
RxTerms is a drug interface terminology derived from RxNorm for prescription writing or medication history recording [38]. The "FULL_GENERIC_NAME", "BRAND_NAME" and "DISPLAY_NAME" fields can be used to build a drug name dictionary.
Drugs@FDA is provided by the United States Food and Drug Administration (FDA). It contains information about FDA-approved brand name and generic prescription drugs, over-the-counter human drugs, etc. Drug names can be extracted from the "drug name" and "activeingred" fields of Drugs@FDA.
Therapeutic Targets Database (TTD) is a database that provides information about therapeutic targets and corresponding drugs [39]. It contains many drugs including approved, clinical trial and experimental drugs. The "Name", "Synonyms" and "Trade Name" fields in TTD can be collected to build a drug dictionary.
Chemical Entities of Biological Interest (ChEBI) is a freely available dictionary of molecular entities [40]. In addition, it incorporates an ontological classification, whereby the relationships between molecular entities, classes of entities and their parents, children and siblings are specified. Fields such as "ChEBI name", "International Nonproprietary Name (INN)" and "Synonyms" can be extracted for dictionary creation. Moreover, the class information in the ChEBI ontology can be used to classify drugs.
Medical Subject Headings (MeSH) is a controlled vocabulary thesaurus from NLM. It consists of sets of terms named descriptors [41]. MeSH descriptors are used for indexing, cataloging and searching for biomedical and health-related information and documents. MeSH descriptors are divided into 16 categories. Each category is further divided into subcategories. Within each subcategory, descriptors are organized in a hierarchical structure. The category "D" covers drugs and chemicals. Terms belonging to category "D" can be extracted to build a drug dictionary.
PubChem is a public repository for biological properties of small molecules [42]. It consists of three interconnected databases: PubChem Substance, PubChem Compound and PubChem BioAssay. PubChem Substance contains entries of mixtures, extracts, complexes and uncharacterized substances and provides synonyms of the substances. PubChem Compound is a subset of PubChem Substance. It contains pure and characterized chemical compounds but no synonyms. In order to build a high quality dictionary consisting of as many synonyms as possible, names and synonyms of PubChem Substance entries that have links to PubChem Compound entries are usually collected.
Unified Medical Language System (UMLS) Metathesaurus is a large, multi-purpose, and multi-lingual thesaurus that contains millions of biomedical and health related concepts from over 100 vocabularies, their synonymous names and relationships among them [43]. Each concept in UMLS Metathesaurus is assigned to at least one semantic type. The concepts in the UMLS Metathesaurus with semantic types such as "Pharmacologic Substance (PHSU)" and "Antibiotic (ANTB)" can be collected to build a drug dictionary.
The joint chemical dictionary (Jochem) is a dictionary developed for the identification of drugs and small molecules in texts. It combines information from UMLS, MeSH, ChEBI, and so on [44]. Concepts in Jochem can be extracted to build a drug dictionary.
Dictionary-based approaches identify drug names by matching drug dictionaries against given texts. Exact matching approaches [45,46] usually achieve high precision, but suffer from low recall. This is because spelling mistakes and variants of drug names are not covered by drug dictionaries. Therefore, approximate matching is used to improve the recall of dictionary-based approaches. Lexical similarity measures and approximate string matching methods such as edit distance [47], SOUNDEX [48] and Metaphone [49] can be used for approximate matching. For example, Levin et al. [50] used Metaphone to match generic and trade names of drugs in RxNorm [37] against anesthesia electronic health records. Moreover, there are approaches utilizing existing systems to map textual terms to drug dictionaries [51,52]. For example, Rindflesch et al. [51] utilized the UMLS MetaMap program [53] to map biomedical texts to UMLS Metathesaurus concepts. Phrases that were mapped to concepts with the semantic type "Pharmacologic Substance" were considered to be drug names.
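The contrast between exact and approximate dictionary matching described above can be sketched with edit distance. The toy dictionary, the distance threshold of 1 and the minimum token length of 5 are assumptions made for illustration; the sketch also matches single tokens only, whereas real dictionaries contain many multi-word names.

```python
def edit_distance(a, b):
    # Classic dynamic-programming Levenshtein distance, one row at a time.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def match_tokens(tokens, dictionary, max_distance=1, min_length=5):
    matches = []
    for index, token in enumerate(tokens):
        word = token.lower()
        if word in dictionary:
            matches.append((index, token, word, "exact"))
            continue
        if len(word) < min_length:
            continue  # approximate matching of short tokens is too noisy
        for entry in sorted(dictionary):
            if edit_distance(word, entry) <= max_distance:
                matches.append((index, token, entry, "approximate"))
                break
    return matches


dictionary = {"quetiapine", "penicillin", "caffeine"}
tokens = "Patient was started on quetiapine and penicilin".split()
print(match_tokens(tokens, dictionary))
```

Exact matching alone would miss the misspelled "penicilin"; the approximate step recovers it, at the cost of possible false matches when the threshold is raised.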
Dictionary-based approaches may also yield low precision because of the low quality of drug dictionaries. Sirohi et al. [54] investigated the effects of using varying drug dictionaries to extract drugs from electronic medical records and concluded that the precision and recall could be considerably enhanced by refining the dictionaries. Many methods have been used to improve the quality of drug dictionaries [44,55,56]. Hettne et al. [44] proposed several filtering rules to filter terms in a dictionary developed for DNR. For example, the short token filtering rule removed a term if the term was a single character or an Arabic number after tokenization and removal of stop words. Moreover, they manually checked highly frequent terms in a set of randomly selected MEDLINE abstracts. If a term corresponded to a normal English word, it was added to a list of unwanted terms. Xu et al. [55] compared the drug dictionary with the SCOWL list [57], which is a list of normal English words. They manually reviewed ambiguous words and removed unlikely drug terms from the dictionary. At the same time, they expanded the dictionary by adding drug names annotated in a training dataset.
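The dictionary refinement steps described above can be sketched as a simple filter. This is an illustration under assumptions: the stop word and common word lists are tiny hypothetical examples, and terms that coincide with common English words are removed automatically here, whereas the cited studies reviewed such terms manually.

```python
def filter_dictionary(terms, stop_words, common_words):
    kept = set()
    for term in terms:
        tokens = [t for t in term.lower().split() if t not in stop_words]
        if not tokens:
            continue  # nothing left after stop word removal
        # Short token rule: a single character or a number is not a drug name.
        if len(tokens) == 1 and (len(tokens[0]) == 1 or tokens[0].isdigit()):
            continue
        # Unwanted term rule: the term is a normal English word.
        if term.lower() in common_words:
            continue
        kept.add(term)
    return kept


terms = {"Duration", "aspirin", "A", "12", "vitamin A"}
stop_words = {"the", "of"}             # hypothetical stop word list
common_words = {"duration", "today"}   # hypothetical stand-in for an English word list
print(sorted(filter_dictionary(terms, stop_words, common_words)))
```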
Due to the rapid development of pharmaceutical research, new drugs are constantly developed and enter the market. However, drug dictionaries cannot be updated regularly. It is impossible for a drug dictionary to cover all existing drugs. Therefore, approaches that do not rely too much on drug dictionaries are necessary for DNR.

Rule-Based Approaches
Rule-based approaches use rules that describe the composition patterns or context of drug names. Composition pattern-based rules are usually used to identify drug names that are generated following specific rules (e.g., systematic names and international nonproprietary names). For example, Lowe et al. [58] encoded the nomenclature rules as formal grammars to identify systematic names of drugs and chemicals. Segura-Bedmar et al. [3] built a regular expression for each international nonproprietary name stem recommended by the World Health Organization to identify and classify drugs. However, composition pattern-based rules are ineffective for drug names generated without nomenclature rules. Context-based rules identify drug names by the context of drug names in free texts [59,60]. For example, Gold et al. [59] and Hamon et al. [60] used contextual clues to extract misspelled drug names and drug names not in drug dictionaries from discharge summaries. Phrases that were surrounded by enough information such as dosages, frequencies and durations were considered as drug names and were extracted accordingly.
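A composition pattern-based rule of the kind described above can be sketched as one regular expression per name stem. The stem table below is a small illustrative subset with informal class descriptions chosen for this sketch; it is not the rule set of Segura-Bedmar et al. [3].

```python
import re

# Illustrative subset of nonproprietary name stems (assumed for this sketch).
INN_STEMS = {
    "olol": "beta blocker",
    "pril": "ACE inhibitor",
    "cillin": "penicillin antibiotic",
    "mab": "monoclonal antibody",
    "azepam": "benzodiazepine",
}

# One regular expression per stem: a word ending in the stem.
STEM_PATTERNS = {
    stem: re.compile(r"\b[a-z]+" + stem + r"\b", re.IGNORECASE)
    for stem in INN_STEMS
}


def find_by_stem(text):
    hits = []
    for stem, pattern in STEM_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), match.group(0), INN_STEMS[stem]))
    return sorted(hits)  # order by position in the text


text = "Propranolol and enalapril were replaced by amoxicillin."
for start, name, drug_class in find_by_stem(text):
    print(start, name, drug_class)
```

Because the stem also determines the class, such rules detect and classify in one step, but they cannot find names that do not follow the nomenclature, such as most brand names.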
In addition to hand-crafted rules, automatically learned rules have also been used for DNR [61,62]. For example, Xu et al. [61] developed an iterative pattern learning approach to extract drugs and other medical treatment concepts from randomized clinical trial abstracts. The approach started with a seed pattern such as "treated with NP (noun phrase)" or some seed instances (i.e., drug names). Then it looped over a procedure consisting of two steps: pattern discovery and instance extraction. The discovered patterns and extracted instances were scored. Only top ranked patterns were used to extract instances and top ranked instances were considered as reliable instances.
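The loop of pattern discovery and instance extraction described above can be sketched as follows. This is a strongly simplified illustration, not the method of Xu et al. [61]: a pattern is just the two words preceding an instance, an instance is a single word rather than a noun phrase, and the scoring is reduced to a hypothetical minimum support count. The sentences are made up.

```python
import re
from collections import Counter


def bootstrap(sentences, seed_pattern, rounds=2, min_support=1):
    patterns, instances = {seed_pattern}, set()
    for _ in range(rounds):
        # Instance extraction: the word following any known pattern.
        for pattern in patterns:
            regex = re.compile(re.escape(pattern) + r"\s+(\w+)")
            for sentence in sentences:
                instances.update(regex.findall(sentence))
        # Pattern discovery: the two words preceding any known instance.
        support = Counter()
        for sentence in sentences:
            words = re.findall(r"\w+", sentence)
            for i in range(2, len(words)):
                if words[i] in instances:
                    support[" ".join(words[i - 2:i])] += 1
        # Keep only patterns with enough support (stand-in for ranking).
        patterns |= {p for p, n in support.items() if n >= min_support}
    return patterns, instances


sentences = [
    "Patients were treated with aspirin daily.",
    "Controls were treated with warfarin instead.",
    "Group A was randomized to aspirin therapy.",
    "Group B was randomized to metformin therapy.",
]
patterns, instances = bootstrap(sentences, "treated with")
print(sorted(patterns), sorted(instances))
```

In the example, the seed pattern finds "aspirin", "aspirin" reveals the new pattern "randomized to", and that pattern in turn finds "metformin" in the second round.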
Although rule-based approaches perform well when expert rules are available, the generation of rules is time-consuming. Moreover, rules developed for a specific class of drug names are not applicable to other classes of drug names, and rules that are too specific usually achieve high precision but low recall.

Machine Learning-Based Approaches
Machine learning-based approaches usually formalize DNR as a classification problem or a sequence labeling problem. Each token is represented as a set of features and is then assigned a class label by a machine learning algorithm. The class label denotes whether a token is part of a drug name and its position in the drug name. BIO is the most popular tagging scheme used for DNR. Tags in the BIO tagging scheme indicate that a token is at the beginning (B) of a drug name, inside (I) a drug name or outside (O) any drug name. Figure 2 shows an example of BIO tagging results of a sentence from the DDIExtraction 2013 dataset, where four types of drugs (drug, brand, group, non-human) are defined. Moreover, there are some more expressive tagging schemes such as BEIO, BESIO and B12EIO [63]. These tagging schemes are derived from BIO. Tag E represents that a token is at the end of a drug name. Tag S represents a single token drug name. Tags B1 and B2 in B12EIO stand for the first token in a drug name and the second but not the last token in a drug name, respectively. Dai et al. [16] compared the effects of the above four tagging schemes on DNR. It was demonstrated that BESIO outperformed the other tagging schemes under the same experimental settings.
The selection of machine learning models is very important for machine learning-based approaches. Classification models commonly used for DNR include Maximum Entropy (ME) [64] and Support Vector Machine (SVM) [65]. They only consider individual tokens or phrases and do not take the order of tokens into account. Different from classification models, sequence tagging models such as Hidden Markov Model (HMM) [66] and Conditional Random Fields (CRF) [31,[67][68][69] consider the complete sequence of tokens in a sentence. They aim at predicting the most probable sequence of tags for a given sentence. CRF is widely used and has been demonstrated to be superior to other machine learning models used for DNR. For example, CRF-based systems achieved the best performance in the DNR tasks of the i2b2 medication extraction [67], CHEMDNER [31] and DDIExtraction 2013 [68] challenges. In most cases, only one machine learning model is used in a machine learning-based DNR approach. However, there are approaches using multiple models [31,[70][71][72][73]. For example, Leaman et al. [31] employed two independent CRF models with different tokenization strategies and feature sets. Results of the two models were combined with heuristic rules. Lu et al. [70] used a character-level CRF and a token-level CRF to learn the internal structure and context of drugs, respectively. Results of the two CRF models were also merged with a heuristic method. Lamurias et al. [72] trained multiple CRF models on different training datasets and combined the confidence scores returned by the models to rank and filter the identified drug names. Sikdar et al. [73] combined one SVM model and six CRF models that used different features to recognize drug names based on an ensemble framework. Table 4 lists some open source toolkits that can be used as implementations of commonly used machine learning models.
The performance of machine learning-based approaches highly depends on the features they use. Various types of features have been explored for DNR. Table 5 lists some features that are commonly used in machine learning-based DNR systems. Features based on the linguistic, orthographic and contextual information of tokens are widely used and their effectiveness has been extensively studied. For example, Campos et al. [71] investigated the effects of features including lemma, POS, text chunking, dependency parsing, etc. Lemma, POS and text chunking features produced significant positive impacts on the performance of the CRF-based approach, while dependency parsing had a negative effect on the performance. Halgrim [64] examined the effects of POS, affix and orthographic features in an ME-based approach and all the features provided positive outcomes.
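The BIO and BESIO tagging schemes described in this section can be illustrated with a short sketch that encodes annotated drug names as token-level tags. The sentence and its annotations are made up for illustration, and the hyphenated tag format ("B-drug") is an assumption; datasets and toolkits differ in how they write tags.

```python
def bio_tags(tokens, entities):
    """entities: list of (first_token, end_token_exclusive, drug_type)."""
    tags = ["O"] * len(tokens)
    for start, end, label in entities:
        tags[start] = "B-" + label
        for i in range(start + 1, end):
            tags[i] = "I-" + label
    return tags


def bio_to_besio(tags):
    converted = []
    for i, tag in enumerate(tags):
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        continues = next_tag.startswith("I-")
        if tag.startswith("B-") and not continues:
            converted.append("S-" + tag[2:])  # single-token drug name
        elif tag.startswith("I-") and not continues:
            converted.append("E-" + tag[2:])  # last token of a drug name
        else:
            converted.append(tag)
    return converted


tokens = "Thiazide diuretics may potentiate lithium toxicity".split()
entities = [(0, 2, "group"), (4, 5, "drug")]
bio = bio_tags(tokens, entities)
besio = bio_to_besio(bio)
for token, b, s in zip(tokens, bio, besio):
    print(token, b, s)
```

The richer scheme gives the model explicit labels for name boundaries, which is one plausible reason for the advantage of BESIO reported by Dai et al. [16].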

Word shape: Uppercase letters, lowercase letters, digits, and other characters in a word are converted to "A", "a", "0" and "O", respectively. For example, "Phenytoin" is mapped to "Aaaaaaaaa". [17,31,68,71,74,75]
Dictionary feature: Whether an n-gram matches part of a drug name in drug dictionaries. […]
Outputs of NER tools: Features derived from the output of existing chemical NER tools. [31,68,74]
Word representation: Word representation features based on Brown clustering, word2vec, etc. [70,75]
Conjunction feature: Conjunctions of different types of features, e.g., conjunction of lemma and POS features. [17,71,75]
Domain-specific features such as dictionary features and features derived from outputs of existing chemical NER tools are also widely used. For example, Batista-Navarro et al. [17] compiled dictionaries from domain-specific knowledge resources including ChEBI, DrugBank, Jochem, etc. Each token was tagged by the dictionaries and the tagging results were used as features by a CRF-based approach. Rocktäschel et al. [68] generated domain-specific features from ChEBI, Jochem and the outputs of ChemSpot [76], which is a chemical NER tool. In general, domain-specific features can significantly improve the performance of machine learning-based approaches.
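The word shape feature listed in Table 5 is simple enough to sketch directly. The function below follows the mapping given in the table; the function name is chosen for illustration.

```python
def word_shape(word):
    shape = []
    for ch in word:
        if ch.isupper():
            shape.append("A")
        elif ch.islower():
            shape.append("a")
        elif ch.isdigit():
            shape.append("0")
        else:
            shape.append("O")  # any other character, e.g. a hyphen
    return "".join(shape)


print(word_shape("Phenytoin"))  # Aaaaaaaaa
print(word_shape("IL-2"))       # AAO0
```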
Word representation features have recently been exploited and demonstrated to be effective for DNR. Word representation features are generated by unsupervised machine learning algorithms on unstructured texts. They contain rich syntactic and semantic information about words. Many unsupervised machine learning algorithms have been proposed to learn word representation features, of which the Brown clustering algorithm [77] and word2vec [78] are the most commonly used. For example, Lu et al. [70] employed the Brown clustering algorithm and word2vec to learn word representation features on MEDLINE documents. The word representation features were then used to improve the performance of CRF-based DNR systems.
Moreover, conjunction features that combine different types of features are also used for DNR. Conjunction features can capture multiple linguistic characteristics of a word. For example, Batista-Navarro et al. [17] used conjunction features that combined lemmas and POS tags. Liu et al. [75] selected 8 types of features including word feature, POS, text chunking, etc., and combined them into conjunction features in two ways in their CRF-based DNR system.
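A conjunction feature of the kind described above can be sketched as the concatenation of two existing feature values. The feature names and the "|" separator are assumptions made for illustration, not the encoding used in the cited systems.

```python
def add_conjunctions(features, pairs):
    # Copy the basic features, then add one combined feature per pair.
    combined = dict(features)
    for left, right in pairs:
        combined[left + "|" + right] = features[left] + "|" + features[right]
    return combined


token_features = {"lemma": "aspirin", "pos": "NN", "chunk": "B-NP"}
print(add_conjunctions(token_features, [("lemma", "pos"), ("pos", "chunk")]))
```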
Noisy features can significantly affect the performance and efficiency of machine learning-based approaches. Therefore, the selection of informative and discriminative features is very important. However, determining the optimal subset of features by testing different combinations of features is time-consuming. Moreover, it is very likely that the optimal feature subset on a dataset will not perform well on another dataset. Therefore, automatic feature selection is necessary. In [75], Liu et al. employed three automatic feature selection methods, Chi-square [79], mutual information [80] and information gain [81], to eliminate noisy features for a CRF-based DNR system. Experimental results showed that each feature selection method could improve the performance of the CRF-based system.
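As an illustration of automatic feature selection, the following sketch ranks binary features by the chi-square statistic of their 2x2 contingency table with a binary class label (drug token or not). It is a minimal example under these assumptions, with hypothetical feature names, and does not reproduce the setup of Liu et al. [75].

```python
def chi_square(feature_present, is_drug):
    # Contingency counts: a = present & drug, b = present & not drug,
    # c = absent & drug, d = absent & not drug.
    a = b = c = d = 0
    for present, drug in zip(feature_present, is_drug):
        if present and drug:
            a += 1
        elif present:
            b += 1
        elif drug:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # feature or class is constant: no information
    return n * (a * d - b * c) ** 2 / denominator


def select_features(feature_columns, labels, k):
    scores = {name: chi_square(column, labels)
              for name, column in feature_columns.items()}
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda name: (-scores[name], name))[:k]


labels = [1, 1, 0, 0]
feature_columns = {
    "ends_with_ine": [1, 1, 0, 0],   # perfectly aligned with the label
    "is_capitalized": [1, 0, 1, 0],  # independent of the label
}
print(select_features(feature_columns, labels, k=1))
```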
Although machine learning-based approaches can achieve promising results, they require a sufficiently large and high quality annotated dataset for training. However, the creation of an annotated dataset is costly and time-consuming. Moreover, domain experts are required in the process of creating an annotated dataset.

Hybrid Approaches
Hybrid approaches combine multiple types of approaches to exploit the advantages and avoid the limitations of each type of approach. In general, a post-processing step is needed to deal with the conflicting results of multiple approaches. Hybrid approaches usually produce better results than each component. Akhondi et al. [82] proposed a hybrid approach combining a dictionary-based approach and a rule-based approach based on the observation that different classes of drug names have different naming characteristics. The dictionary-based component is used to extract non-systematic names such as brand and generic drug names, and the rule-based component is used to extract systematic names, which are generated following standard nomenclature rules. Finally, the outputs of the dictionary-based and rule-based components are merged and the shorter one of two overlapping terms is removed. He et al. [83] constructed a drug name dictionary from DrugBank and MEDLINE abstracts. Then dictionary look-up was combined with a CRF-based approach to recognize drug names. For the overlapping terms, the results of dictionary look-up were kept. Due to the small size of the training set, Tikk et al. [84] first developed a rule-based approach to label drug names in a large document set. Then a CRF-based approach was trained on the union of a small training set and the output of the rule-based approach. The CRF-based approach achieved better performance than that trained only on the small training set. Korkontzelos et al. [85] developed a voting system to combine a maximum entropy model, a perceptron classifier and a dictionary-based method to enhance the performance for DNR. Usié et al. [86] employed a CRF-based method, a dictionary and some regular expressions to recognize different types of drug names and then integrated the recognition results of the methods.
The performance of representative DNR systems on datasets of different types of texts is listed in Table 6. In the third column, "Dict", "Rule", "ML" and "Hybrid" denote dictionary-based, rule-based, machine learning-based and hybrid approaches, respectively. The fifth column lists the F-scores of DNR systems that only recognize drug names from texts (drug detection), while the sixth column lists the F-scores of DNR systems that not only recognize drugs from texts but also classify the recognized drugs into predefined classes (drug classification).
The fourth column of Table 6 lists the datasets on which the systems have been run. We can see that machine learning-based systems or hybrid systems containing a machine learning component outperform other systems on the same dataset. Therefore, machine learning-based approaches or hybrid approaches that contain a machine learning component are the best choices if annotated datasets are available. Performance differences between machine learning-based approaches are mainly due to the selection of different machine learning models and different features. By comparing the fifth and the sixth columns, we can see that drug classification is more difficult than drug detection. The performance for drug classification is relatively poor and more efforts should be devoted to drug classification.
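The merging step described above for Akhondi et al. [82], in which the shorter of two overlapping terms is removed, can be sketched as follows. The representation of mentions as (start, end) character offsets with an exclusive end is an assumption made for this sketch, and the offsets in the example are made up.

```python
def overlaps(a, b):
    # Two half-open spans overlap if each starts before the other ends.
    return a[0] < b[1] and b[0] < a[1]


def merge_keep_longer(dictionary_spans, rule_spans):
    # Consider the longest spans first so that shorter overlapping ones lose.
    candidates = sorted(set(dictionary_spans) | set(rule_spans),
                        key=lambda span: (-(span[1] - span[0]), span[0]))
    kept = []
    for span in candidates:
        if not any(overlaps(span, other) for other in kept):
            kept.append(span)
    return sorted(kept)


dictionary_spans = [(0, 7), (20, 27)]   # e.g. generic and brand names
rule_spans = [(0, 15), (40, 52)]        # e.g. systematic names
print(merge_keep_longer(dictionary_spans, rule_spans))
```

Other conflict resolution policies from the systems above, such as always preferring the dictionary result [83] or voting [85], would replace the sort key or the acceptance test.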

Concluding Remarks and Future Perspectives
Many approaches have been proposed for DNR, ranging from simple dictionary-based approaches to complex hybrid approaches. These approaches differ in the degree of manual intervention, portability, and applicable situation. Each type of approach has advantages over the other types. Dictionary-based approaches are effective when comprehensive and up-to-date drug dictionaries are available. Moreover, dictionary-based approaches can normalize drug names in texts by mapping them to unique identifiers in drug dictionaries. In contrast, machine learning-based approaches can only identify drug names from texts. However, the creation and maintenance of comprehensive drug dictionaries are costly and time-consuming. Rule-based approaches are suitable when drug names are generated regularly. Rule-based approaches can be easily optimized by modifying existing rules or adding new rules. However, there is an unavoidable trade-off between precision and recall for rule-based approaches. Rules that are too specific achieve high precision but low recall. On the other hand, rules that are too general lead to high recall but low precision. Furthermore, the portability of rule-based approaches is poor. Rules defined for a class of drugs cannot be adapted to other classes. In contrast, machine learning-based approaches for a class of drugs can be easily retrained for other classes on corresponding training datasets. Machine learning-based approaches often outperform dictionary-based and rule-based approaches when sufficiently large and high quality annotated training datasets are available. However, it is costly to annotate datasets manually. Given the above, hybrid approaches that combine different approaches have been increasingly used.
At the present time, the state-of-the-art approaches for DNR are mainly based on traditional machine learning models such as CRF and SVM. Performance improvements of the state-of-the-art approaches depend heavily on exploring and using new effective features. However, performance improvements from new features are limited. It is necessary to explore new machine learning models for DNR. In recent years, deep neural networks (DNNs) [90] have been used in many machine learning tasks such as speech recognition [91] and visual object recognition [92] and have achieved unprecedented success. It is worth exploring the use of DNNs for DNR.
The lack of sufficiently large and high quality training datasets is a major barrier to future work on DNR. Semi-supervised learning is a machine learning technique that requires a small amount of annotated data and a large amount of unannotated data for training. Typical semi-supervised learning methods such as bootstrapping [93] and active learning [94] have demonstrated their effectiveness for improving the performance of systems when annotated data is scarce. Therefore, semi-supervised learning is a promising solution to the lack of training datasets for DNR.
Another barrier to further development of DNR is the imbalance of training datasets. For example, drugs of the "drug" class account for 63% of all drugs in the DDIExtraction 2013 dataset, while drugs of the "no-human" class account for only 4%. As a result of the imbalance of training datasets, the top ranked system of the DDIExtraction 2013 challenge achieved an F-score of 79.0% for "drug", but only 14.1% for "no-human". Automatic text generation techniques based on formal grammar are likely to solve the imbalance problem of training datasets. Formal grammar can capture the morphology, syntax and semantic information of a language. It has been demonstrated that automatic text generation techniques based on formal grammar can automatically build realistic chemical-related training documents for chemical name extraction [95]. Moreover, automatic text generation techniques can control the density of different classes of training examples, the variety and the complexity of contexts, as well as the size of the training sets. For future work, it is worth trying to improve the performance of DNR systems by automatically generating training datasets without data imbalance.
Although a few approaches have been proposed to recognize both continuous and non-continuous named entities such as disorders [96][97][98], they still perform poorly for non-continuous named entities. For example, in the CHEMDNER task of the BioCreative IV challenge, non-continuous drugs (i.e., multiple drugs) accounted for less than one percent of the annotated mentions and no participating system dealt with them specifically. All participating systems achieved poor performance on the multiple drugs. Therefore, effective solutions to non-continuous drug name recognition are needed for future work.

Figure 2. An example of BIO tagging results of a sentence from the DDIExtraction 2013 dataset.

Table 4. Open source implementations of machine learning models.

Table 5. Features used in machine learning-based DNR systems.

Table 6. Performances of representative DNR systems on different types of texts.