Article

A Comparative Analysis of Active Learning for Biomedical Text Mining

1 School of Computer Science, The University of Sydney, Sydney, NSW 2006, Australia
2 School of Engineering, RMIT University, Carlton, VIC 3053, Australia
3 School of Electrical Engineering and Computing, The University of Newcastle, Newcastle, NSW 2308, Australia
4 UNSW Digital Health, WHO Center for eHealth, Faculty of Medicine, The University of New South Wales, Sydney, NSW 2052, Australia
* Author to whom correspondence should be addressed.
Academic Editor: Teen-Hang Meen
Appl. Syst. Innov. 2021, 4(1), 23; https://doi.org/10.3390/asi4010023
Received: 31 December 2020 / Revised: 24 February 2021 / Accepted: 8 March 2021 / Published: 15 March 2021
(This article belongs to the Special Issue Advanced Machine Learning Techniques, Applications and Developments)
An enormous amount of clinical free-text information, such as pathology reports, progress reports, clinical notes and discharge summaries, has been collected at hospitals and medical care clinics. These data provide an opportunity to develop many useful machine learning (ML) applications if they can be transformed into a learnable structure with appropriate labels for supervised learning. The annotation of these data has to be performed by qualified clinical experts, and the high cost of annotation limits their use. Active learning (AL), an underutilised machine learning technique that can label new data, is a promising candidate for addressing the high cost of labelling. AL has been successfully applied to speech recognition and text classification; however, there is a lack of literature investigating its use for clinical purposes. We performed a comparative investigation of various AL techniques using ML- and deep learning (DL)-based strategies on three unique biomedical datasets. We investigated the random sampling (RS), least confidence (LC), informative diversity and density (IDD), margin, and maximum representativeness-diversity (MRD) AL query strategies. Our experiments show that AL has the potential to significantly reduce the cost of manual labelling. Furthermore, pre-labelling performed using AL expedites the labelling process by reducing the time required for labelling.
Keywords: active learning; machine learning; biomedical natural language processing
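
To make three of the query strategies named in the abstract concrete, the following is a minimal sketch of random sampling (RS), least confidence (LC) and margin sampling over a small pool of unlabelled samples. It is not the authors' implementation: the function names, the toy probability values and the batch size are all assumptions made for illustration, and the IDD and MRD strategies are not covered because the abstract does not define them.

```python
import random
from typing import List, Sequence


def least_confidence(probs: Sequence[Sequence[float]]) -> List[float]:
    """LC score = 1 - highest class probability (higher = more uncertain)."""
    return [1.0 - max(p) for p in probs]


def margin(probs: Sequence[Sequence[float]]) -> List[float]:
    """Margin score = -(top1 - top2), so a small margin gives a high score."""
    scores = []
    for p in probs:
        top = sorted(p, reverse=True)
        scores.append(-(top[0] - top[1]))
    return scores


def select_batch(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring unlabelled samples."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def random_sampling(n: int, k: int, seed: int = 0) -> List[int]:
    """RS baseline: pick k distinct pool indices uniformly at random."""
    return random.Random(seed).sample(range(n), k)


# Hypothetical class probabilities from a classifier for four unlabelled texts.
pool_probs = [
    [0.90, 0.05, 0.05],  # confident prediction
    [0.40, 0.35, 0.25],  # low top probability
    [0.50, 0.49, 0.01],  # two classes nearly tied
    [0.36, 0.33, 0.31],  # close to uniform
]

lc_pick = select_batch(least_confidence(pool_probs), k=2)
margin_pick = select_batch(margin(pool_probs), k=2)
rs_pick = random_sampling(len(pool_probs), k=2)

print("least confidence:", lc_pick)   # [3, 1]
print("margin:", margin_pick)         # [2, 3]
print("random sampling:", rs_pick)
```

On this toy pool the two uncertainty-based strategies pick different samples: LC favours texts whose best class probability is low, while margin sampling favours texts where the two best classes are nearly tied. In an AL loop, the selected samples would be sent to an annotator, added to the labelled set, and the classifier retrained before the next query round.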

MDPI and ACS Style

Naseem, U.; Khushi, M.; Khan, S.K.; Shaukat, K.; Moni, M.A. A Comparative Analysis of Active Learning for Biomedical Text Mining. Appl. Syst. Innov. 2021, 4, 23. https://doi.org/10.3390/asi4010023

AMA Style

Naseem U, Khushi M, Khan SK, Shaukat K, Moni MA. A Comparative Analysis of Active Learning for Biomedical Text Mining. Applied System Innovation. 2021; 4(1):23. https://doi.org/10.3390/asi4010023

Chicago/Turabian Style

Naseem, Usman, Matloob Khushi, Shah K. Khan, Kamran Shaukat, and Mohammad A. Moni. 2021. "A Comparative Analysis of Active Learning for Biomedical Text Mining" Applied System Innovation 4, no. 1: 23. https://doi.org/10.3390/asi4010023
