Machines usually employ a guess-and-check strategy to analyze data: they take the data, make a guess, check the answer, adjust it with regard to the correct one if necessary, and try again on a new data set. An active learning environment guarantees better performance while training on less, but carefully chosen, data which reduces the costs of both annotating and analyzing large data sets. This issue becomes even more critical for deep learning applications. Human-like active learning integrates a variety of strategies and instructional models chosen by a teacher to contribute to learners’ knowledge, while machine active learning strategies lack versatile tools for shifting the focus of instruction away from knowledge transmission to learners’ knowledge construction. We approach this gap by considering an active learning environment in an educational setting. We propose a new strategy that measures the information capacity of data using the information function from the four-parameter logistic item response theory (4PL IRT). We compared the proposed strategy with the most common active learning strategies—Least Confidence and Entropy Sampling. The results of computational experiments showed that the Information Capacity strategy shares similar behavior but provides a more flexible framework for building transparent knowledge models in deep learning.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited