Open Access Article

Improving the Named Entity Recognition of Chinese Electronic Medical Records by Combining Domain Dictionary and Rules

1 School of Computer, University of South China, Hengyang 421001, China
2 Center for Complex Networks and Systems Research, Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN 47408, USA
* Author to whom correspondence should be addressed.
Int. J. Environ. Res. Public Health 2020, 17(8), 2687; https://doi.org/10.3390/ijerph17082687
Received: 4 March 2020 / Revised: 4 April 2020 / Accepted: 9 April 2020 / Published: 14 April 2020
Electronic medical records are an integral part of medical texts, and entity recognition in these records has prompted many studies and many entity extraction methods. In this paper, we propose a model for extracting entities from Chinese Electronic Medical Records (CEMR). In the input layer, we use word embeddings and dictionary feature embeddings as input vectors, where each word embedding consists of a character representation and a word representation. The input vectors are then fed to a bidirectional long short-term memory (Bi-LSTM) network to capture contextual features. Finally, a conditional random field (CRF) captures dependencies between neighboring tags. In experiments on a body classification task, the F1 value reached 90.65%; on an anatomic region recognition task, it reached 93.89%. On both tasks, our model outperformed state-of-the-art models such as Bi-LSTM-CRF, Bi-LSTM-Attention, and Vote. The experiments also show that our model performs well on low-frequency entities and unknown entities; with a small training dataset, our method showed a 2–4% improvement in F1 value compared to the basic Bi-LSTM-CRF models. Additionally, on the anatomic region recognition task, we combined the proposed entity extraction model with 12 rules that we designed and a domain dictionary; with this combination, the weighted F1 value for extracting the three specific entities reached 84.36%.
Keywords: entity recognition; electronic medical records; Bi-LSTM-CRF; rules; domain dictionary
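
To make the idea of dictionary features more concrete, the following is a minimal sketch of how per-character dictionary tags might be derived before being embedded and combined with the character and word representations. The abstract does not specify the feature scheme, so everything here is an assumption made for illustration: the function name `dictionary_features`, the B/I/E/S/O tag set, the greedy forward maximum matching strategy, the `max_len` parameter, and the toy dictionary entries are all hypothetical and are not taken from the paper.

```python
from typing import Iterable, List


def dictionary_features(sentence: str,
                        dictionary: Iterable[str],
                        max_len: int = 6) -> List[str]:
    """Tag each character as B/I/E/S when it lies inside a dictionary
    match, and as O otherwise.

    Assumption: greedy forward maximum matching; the paper may use a
    different matching strategy or tag set.
    """
    entries = set(dictionary)
    tags = ["O"] * len(sentence)
    i = 0
    while i < len(sentence):
        matched = 0
        # Try the longest candidate starting at position i first.
        for length in range(min(max_len, len(sentence) - i), 0, -1):
            if sentence[i:i + length] in entries:
                matched = length
                break
        if matched == 0:
            i += 1                      # no entry starts here
        elif matched == 1:
            tags[i] = "S"               # single-character entry
            i += 1
        else:
            tags[i] = "B"               # beginning of the match
            for j in range(i + 1, i + matched - 1):
                tags[j] = "I"           # inside the match
            tags[i + matched - 1] = "E"  # end of the match
            i += matched
    return tags


# Hypothetical toy dictionary of anatomical terms (not from the paper).
toy_dictionary = {"左肺", "左肺上叶", "肝"}
features = dictionary_features("左肺上叶及肝未见异常", toy_dictionary)
print(features)
```

In a setup like the one the abstract describes, each tag would typically be mapped to a small trainable embedding and concatenated with the word embedding for the same position before the Bi-LSTM layer; that wiring is omitted here because the abstract does not give its details.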
MDPI and ACS Style

Chen, X.; Ouyang, C.; Liu, Y.; Bu, Y. Improving the Named Entity Recognition of Chinese Electronic Medical Records by Combining Domain Dictionary and Rules. Int. J. Environ. Res. Public Health 2020, 17, 2687.
