Evaluating the response to open questions is a complex process since it requires prior knowledge of a specific topic and language. The computational challenge is to analyze the text by learning from a set of correct examples to train a model and then predict unseen cases. Thus, we will be able to capture patterns that characterize answers to open questions. In this work, we used a sequence labeling and deep learning approach to detect if a text segment corresponds to the answer to an open question. We focused our efforts on analyzing the general objective of a thesis according to three methodological questions: Q1: What will be done? Q2: Why is it going to be done? Q3: How is it going to be done? First, we use the Beginning-Inside-Outside (BIO) format to label a corpus of targets with the help of two annotators. Subsequently, we adapted four state-of-the-art architectures to analyze the objective: Bidirectional Encoder Representations from Transformers (BERT-BETO) for Spanish, Code Switching Embeddings from Language Model (CS-ELMo), Multitask Neural Network (MTNN), and Bidirectional Long Short-Term Memory (Bi-LSTM). The results of the F-measure for detection of the answers to the three questions indicate that the BERT-BETO and CS-ELMo architecture obtained the best effectivity. The architecture that obtained the best results was BERT-BETO. BERT was the architecture that obtained more accurate results. The result of a detection analysis for Q1, Q2 and Q3 on a non-annotated corpus at the graduate and undergraduate levels is also reported. We found that for detecting the three questions, only the doctoral academic level reached 100%; that is, the doctoral objectives did contain the answer to the three questions.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.