Article

Rule-Enhanced Active Learning for Semi-Automated Weak Supervision

1 Laboratory for Pathology Dynamics, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA
2 School of Computer Science, Georgia Institute of Technology, Atlanta, GA 30332, USA
3 Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA
4 Machine Learning Center at Georgia Tech, Georgia Institute of Technology, Atlanta, GA 30332, USA
* Author to whom correspondence should be addressed.
Academic Editors: Emma Tonkin, Kristina Yordanova and Rüdiger Buchkremer
AI 2022, 3(1), 211-228; https://doi.org/10.3390/ai3010013
Received: 17 December 2021 / Revised: 4 March 2022 / Accepted: 11 March 2022 / Published: 16 March 2022
(This article belongs to the Topic Methods for Data Labelling for Intelligent Systems)
A major bottleneck preventing the extension of deep learning systems to new domains is the prohibitive cost of acquiring sufficient training labels. Alternatives such as weak supervision, active learning, and fine-tuning of pretrained models reduce this burden but require substantial human input to select a highly informative subset of instances or to curate labeling functions. REGAL (Rule-Enhanced Generative Active Learning) is an improved framework for weakly supervised text classification that performs active learning over labeling functions rather than individual instances. REGAL interactively creates high-quality labeling patterns from raw text, enabling a single annotator to accurately label an entire dataset after initialization with three keywords for each class. Experiments demonstrate that REGAL extracts up to three times as many high-accuracy labeling functions from text as current state-of-the-art methods for interactive weak supervision, dramatically reducing the annotation burden of writing labeling functions for weak supervision. Statistical analysis reveals that REGAL performs equally well as or significantly better than interactive weak supervision on five of six commonly used natural language processing (NLP) baseline datasets.
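
To make the abstract's notion of keyword-seeded labeling functions concrete, the sketch below is an illustrative assumption rather than the authors' implementation: it builds labeling functions that vote a class label when a seed keyword appears (and abstain otherwise), then weakly labels text by majority vote over those functions. All names (make_keyword_lf, weak_label, the example keywords) are hypothetical.

    from collections import Counter

    ABSTAIN = -1

    def make_keyword_lf(keywords, label):
        """Return a labeling function that votes `label` when any keyword appears."""
        def lf(text):
            lowered = text.lower()
            return label if any(k in lowered for k in keywords) else ABSTAIN
        return lf

    # Three seed keywords per class, mirroring the initialization described above.
    lfs = [
        make_keyword_lf({"goal", "match", "tournament"}, label=0),   # e.g., sports
        make_keyword_lf({"election", "senate", "policy"}, label=1),  # e.g., politics
    ]

    def weak_label(text, lfs):
        """Aggregate labeling-function votes by majority; abstain if nothing fires."""
        votes = [vote for vote in (lf(text) for lf in lfs) if vote != ABSTAIN]
        return Counter(votes).most_common(1)[0][0] if votes else ABSTAIN

    print(weak_label("The senate passed a new policy today.", lfs))  # -> 1

In REGAL's setting, candidate functions of this kind are proposed automatically from raw text and accepted or rejected by a single annotator, rather than being written entirely by hand.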
Keywords: weak supervision; active learning; natural language processing; text classification; text mining; data labeling

MDPI and ACS Style

Kartchner, D.; Nakajima An, D.; Ren, W.; Zhang, C.; Mitchell, C.S. Rule-Enhanced Active Learning for Semi-Automated Weak Supervision. AI 2022, 3, 211-228. https://doi.org/10.3390/ai3010013

AMA Style

Kartchner D, Nakajima An D, Ren W, Zhang C, Mitchell CS. Rule-Enhanced Active Learning for Semi-Automated Weak Supervision. AI. 2022; 3(1):211-228. https://doi.org/10.3390/ai3010013

Chicago/Turabian Style

Kartchner, David, Davi Nakajima An, Wendi Ren, Chao Zhang, and Cassie S. Mitchell. 2022. "Rule-Enhanced Active Learning for Semi-Automated Weak Supervision" AI 3, no. 1: 211-228. https://doi.org/10.3390/ai3010013
