Current Trends in Natural Language Processing (NLP) and Human Language Technology (HLT)

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Mathematics and Computer Science".

Deadline for manuscript submissions: closed (31 October 2023) | Viewed by 25671

Printed Edition Available!
A printed edition of this Special Issue is available.

Special Issue Editor


Prof. Dr. Florentina Hristea
Guest Editor
Department of Computer Science, Faculty of Mathematics and Computer Science, University of Bucharest, 010014 Bucharest, Romania
Interests: artificial intelligence (AI); knowledge representation; natural language processing; computational linguistics; human language technology; computational statistics applied in natural language processing; data analysis

Special Issue Information

Dear Colleagues,

This Special Issue is concerned with technologies for processing human language in the form of text, with Natural Language Processing (NLP) tools and techniques ultimately responding to the two main existing challenges: natural language understanding and natural language generation. Within the broad spectrum of research areas concerned with computational approaches to natural language, we will be looking at all the main levels at which language processing is performed (the morphological, syntactic, semantic, and pragmatic levels), from both a theoretical and a practical point of view.

AI-powered text processing continues to represent a strong trend in artificial intelligence (AI), primarily due to the genuine explosion of texts on the World Wide Web. NLP is one of the most important technologies in use today, especially because of the large and growing amount of online text, which needs to be understood in order for its enormous value to be fully realized. NLP can make sense of the unstructured data produced by social platforms and other social data sources, and can help organize them into a more structured model that supports various types of tasks and applications, all of which are of great interest to this Special Issue.

The large size, unrestricted nature, and ambiguity of natural language have driven the extensive development of the NLP field in various ways and from different perspectives, all of which are of interest to this Special Issue. Most of the approaches can be viewed as complementary, while in recent years machine-learning methods have emerged strongly and successfully. Large annotated bodies of text (corpora) have been employed to train machine-learning algorithms and to provide gold standards for evaluation corresponding to specific tasks. However, for various types of modern NLP applications (e.g., hate speech detection, stance detection), we are only now moving towards creating appropriate benchmarking systems. We hope this Special Issue will take steps in this respect as well. Although many machine-learning models have been developed for NLP applications, deep learning approaches have recently achieved remarkable results across many NLP tasks. This Special Issue is interested in the use and exploration of current advances in machine learning and deep learning for NLP topics, including (but not limited to) information extraction, information retrieval and text mining, text summarization, computational social science, discourse and dialog systems, interpretability, ethics in NLP, linguistic theories, and NLP for social good.

Although NLP is not a new science, the technology emerging from this field of study is rapidly advancing, thanks to an increased interest in human–machine communication. Human language technology (HLT) has been recognized as a major challenge for computing, requiring advanced NLP as well as the availability of big data, and resulting in large-scale systems and applications. Knowledge of NLP and computational linguistics (CL), as well as of many of their application-oriented aspects, is required for researching software and systems that bridge the linguistic gap between people and machines. The human–computer interaction involved enables various types of real-world applications, all of which are of interest to this Special Issue.

Prof. Dr. Florentina Hristea
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • natural language processing
  • computational linguistics
  • human language technology
  • human–computer interaction
  • knowledge representation
  • sentiment analysis
  • social media mining
  • machine learning
  • deep learning
  • big data

Published Papers (13 papers)


Research

20 pages, 323 KiB  
Article
Leveraging Zero and Few-Shot Learning for Enhanced Model Generality in Hate Speech Detection in Spanish and English
by José Antonio García-Díaz, Ronghao Pan and Rafael Valencia-García
Mathematics 2023, 11(24), 5004; https://doi.org/10.3390/math11245004 - 18 Dec 2023
Viewed by 853
Abstract
Supervised training has traditionally been the cornerstone of hate speech detection models, but it often falls short when faced with unseen scenarios. Zero and few-shot learning offers an interesting alternative to traditional supervised approaches. In this paper, we explore the advantages of zero and few-shot learning over supervised training, with a particular focus on hate speech detection datasets covering different domains and levels of complexity. We evaluate the generalization capabilities of generative models such as T5, BLOOM, and Llama-2. These models have shown promise in text generation and have demonstrated the ability to learn from limited labeled data. Moreover, by evaluating their performance on both Spanish and English datasets, we gain insight into their cross-lingual applicability and versatility, thus contributing to a broader understanding of generative models in natural language processing. Our results highlight the potential of generative models to bridge the gap between data scarcity and model performance across languages and domains. Full article
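As a rough illustration of the prompt-based zero-shot setup evaluated above, the sketch below queries a small open generative model for a hate/non-hate label. The model name, prompt wording, and label parsing are assumptions for illustration, not the authors' exact configuration; a few-shot variant would simply prepend a handful of labelled examples to the prompt.

```python
# Minimal sketch: zero-shot hate speech labelling with a generative model.
# The model, prompt, and answer parsing are illustrative assumptions.
from transformers import pipeline

generator = pipeline("text-generation", model="bigscience/bloom-560m")

def zero_shot_hate_label(text: str) -> str:
    prompt = (
        "Decide whether the following message contains hate speech.\n"
        f"Message: {text}\n"
        "Answer with 'hateful' or 'not hateful': "
    )
    out = generator(prompt, max_new_tokens=5, do_sample=False)[0]["generated_text"]
    answer = out[len(prompt):].strip().lower()
    return "hateful" if answer.startswith("hateful") else "not hateful"

print(zero_shot_hate_label("Have a wonderful day, everyone!"))
```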

25 pages, 1269 KiB  
Article
Transformer-Based Composite Language Models for Text Evaluation and Classification
by Mihailo Škorić, Miloš Utvić and Ranka Stanković
Mathematics 2023, 11(22), 4660; https://doi.org/10.3390/math11224660 - 16 Nov 2023
Viewed by 874
Abstract
Parallel natural language processing systems were previously successfully tested on the tasks of part-of-speech tagging and authorship attribution through mini-language modeling, for which they achieved significantly better results than independent methods in the cases of seven European languages. The aim of this paper is to present the advantages of using composite language models in the processing and evaluation of texts written in arbitrary highly inflective and morphology-rich natural language, particularly Serbian. A perplexity-based dataset, the main asset for the methodology assessment, was created using a series of generative pre-trained transformers trained on different representations of the Serbian language corpus and a set of sentences classified into three groups (expert translations, corrupted translations, and machine translations). The paper describes a comparative analysis of calculated perplexities in order to measure the classification capability of different models on two binary classification tasks. In the course of the experiment, we tested three standalone language models (baseline) and two composite language models (which are based on perplexities outputted by all three standalone models). The presented results single out a complex stacked classifier using a multitude of features extracted from perplexity vectors as the optimal architecture of composite language models for both tasks. Full article
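The following sketch illustrates the core mechanics described above: each sentence is scored by several causal language models, and the resulting perplexity vector feeds a simple meta-classifier. The English stand-in models and the logistic-regression meta-classifier are assumptions; the paper uses GPT variants trained on Serbian corpora and a more elaborate stacked classifier.

```python
# Minimal sketch: perplexity vectors from several causal LMs as features
# for a composite (stacked) classifier. Models and classifier are stand-ins.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression

model_names = ["gpt2", "distilgpt2"]  # stand-ins for the Serbian GPT variants
scorers = [(AutoTokenizer.from_pretrained(n), AutoModelForCausalLM.from_pretrained(n))
           for n in model_names]

def perplexity(tokenizer, model, sentence: str) -> float:
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean per-token cross-entropy
    return math.exp(loss.item())

def perplexity_vector(sentence: str) -> list:
    return [perplexity(tok, mdl, sentence) for tok, mdl in scorers]

sentences = ["This is a fluent sentence.", "Sentence fluent a this is."]
labels = [0, 1]  # e.g. expert translation vs. corrupted translation
meta = LogisticRegression().fit([perplexity_vector(s) for s in sentences], labels)
```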

33 pages, 2317 KiB  
Article
Sentiment Difficulty in Aspect-Based Sentiment Analysis
by Adrian-Gabriel Chifu and Sébastien Fournier
Mathematics 2023, 11(22), 4647; https://doi.org/10.3390/math11224647 - 14 Nov 2023
Cited by 3 | Viewed by 1312
Abstract
Subjectivity is a key aspect of natural language understanding, especially in the context of user-generated text and conversational systems based on large language models. Natural language sentences often contain subjective elements, such as opinions and emotions, that make them more nuanced and complex. The level of detail at which the study of the text is performed determines the possible applications of sentiment analysis. The analysis can be done at the document or paragraph level, or, even more granularly, at the aspect level. Many researchers have studied this topic extensively. The field of aspect-based sentiment analysis has numerous data sets and models. In this work, we initiate the discussion around the definition of sentence difficulty in the context of aspect-based sentiment analysis. To assess and quantify the difficulty of aspect-based sentiment analysis, we conduct an experiment using three data sets: “Laptops”, “Restaurants”, and “MTSC” (Multi-Target-dependent Sentiment Classification), along with 21 learning models from scikit-learn. We also use two textual representations, TF-IDF (term frequency–inverse document frequency) and BERT (Bidirectional Encoder Representations from Transformers), to analyze the difficulty faced by these models in performing aspect-based sentiment analysis. Additionally, we compare the models with a fine-tuned version of BERT on the three data sets. We identify the most challenging sentences using a combination of classifiers in order to better understand them. We propose two strategies for defining sentence difficulty. The first strategy is binary and considers sentences as difficult when the classifiers are unable to correctly assign the sentiment polarity. The second strategy uses a six-level difficulty scale based on how many of the top five best-performing classifiers can correctly identify sentiment polarity. These sentences with assigned difficulty classes are then used to create predictive models for early difficulty detection. The purpose of estimating the difficulty of aspect-based sentiment analysis is to enhance performance while minimizing resource usage. Full article
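The two labelling strategies can be stated compactly in code. The sketch below assumes a matrix of per-classifier correctness indicators and takes one possible reading of the binary strategy (no classifier correct); the values shown are toy data, not results from the paper.

```python
# Minimal sketch of the two sentence-difficulty labelling strategies.
# Toy correctness matrix: rows are sentences, columns are the five
# best-performing classifiers (1 = sentiment polarity predicted correctly).
import numpy as np

correct = np.array([
    [1, 1, 1, 1, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
])

# Strategy 1 (binary): a sentence is difficult when no classifier gets it right.
binary_difficulty = (correct.sum(axis=1) == 0).astype(int)      # [0 0 1]

# Strategy 2 (six levels, 0-5): difficulty grows as fewer of the top five
# classifiers identify the polarity correctly.
six_level_difficulty = 5 - correct.sum(axis=1)                  # [0 3 5]
```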

13 pages, 1683 KiB  
Article
A Study on Double-Headed Entities and Relations Prediction Framework for Joint Triple Extraction
by Yanbing Xiao, Guorong Chen, Chongling Du, Lang Li, Yu Yuan, Jincheng Zou and Jingcheng Liu
Mathematics 2023, 11(22), 4583; https://doi.org/10.3390/math11224583 - 08 Nov 2023
Viewed by 779
Abstract
Relational triple extraction, a fundamental procedure in natural language processing knowledge graph construction, assumes a crucial and irreplaceable role in the domain of academic research related to information extraction. In this paper, we propose a Double-Headed Entities and Relations Prediction (DERP) framework, which divides the entity recognition process into two stages: head entity recognition and tail entity recognition, using the obtained head and tail entities as inputs. By utilizing the corresponding relation and the corresponding entity, the DERP framework further incorporates a triple prediction module to improve the accuracy and completeness of the joint relation triple extraction. We conducted experiments on two English datasets, NYT and WebNLG, and two Chinese datasets, DuIE2.0 and CMeIE-V2, and compared the English dataset experimental results with those derived from ten baseline models. The experimental results demonstrate the effectiveness of our proposed DERP framework for triple extraction. Full article

14 pages, 542 KiB  
Article
Parameter-Efficient Fine-Tuning Method for Task-Oriented Dialogue Systems
by Yunho Mo, Joon Yoo and Sangwoo Kang
Mathematics 2023, 11(14), 3048; https://doi.org/10.3390/math11143048 - 10 Jul 2023
Cited by 1 | Viewed by 2366
Abstract
The use of Transformer-based pre-trained language models has become prevalent in enhancing the performance of task-oriented dialogue systems. These models, which are pre-trained on large text data to grasp the language syntax and semantics, fine-tune the entire parameter set according to a specific task. However, as the scale of the pre-trained language model increases, several challenges arise during the fine-tuning process. For example, the training time escalates as the model scale grows, since the complete parameter set needs to be trained. Furthermore, additional storage space is required to accommodate the larger model size. To address these challenges, we propose a new task-oriented dialogue system called PEFTTOD. Our proposal leverages a method called Parameter-Efficient Fine-Tuning (PEFT), which incorporates an Adapter Layer and prefix tuning into the pre-trained language model. It significantly reduces the overall parameter count used during training and efficiently transfers the dialogue knowledge. We evaluated the performance of PEFTTOD on the Multi-WOZ 2.0 dataset, a benchmark dataset commonly used in task-oriented dialogue systems. Compared to the traditional method, PEFTTOD utilizes only about 4% of the parameters for training, resulting in a 4% improvement in the combined score compared to the existing T5-based baseline. Moreover, PEFTTOD achieved an efficiency gain by reducing the training time by 20% and saving up to 95% of the required storage space. Full article
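For readers unfamiliar with parameter-efficient fine-tuning, the sketch below wraps a T5 backbone with prefix tuning using the Hugging Face peft library. The base model and prefix length are assumptions for illustration; PEFTTOD additionally inserts an Adapter Layer and is trained on the dialogue task itself.

```python
# Minimal sketch: prefix tuning of a frozen T5 backbone with the peft library.
# Base model and hyperparameters are illustrative assumptions.
from transformers import AutoModelForSeq2SeqLM
from peft import PrefixTuningConfig, TaskType, get_peft_model

base = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
config = PrefixTuningConfig(task_type=TaskType.SEQ_2_SEQ_LM, num_virtual_tokens=20)
model = get_peft_model(base, config)

# Only the small prefix parameter set is trainable; the backbone stays frozen.
model.print_trainable_parameters()
```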

16 pages, 1297 KiB  
Article
Research on Relation Classification Tasks Based on Cybersecurity Text
by Ze Shi, Hongyi Li, Di Zhao and Chengwei Pan
Mathematics 2023, 11(12), 2598; https://doi.org/10.3390/math11122598 - 06 Jun 2023
Viewed by 967
Abstract
Relation classification is a significant task within the field of natural language processing. Its objective is to extract and identify relations between two entities in a given text. Within the scope of this paper, we construct an artificial dataset (CS13K) for relation classification in the realm of cybersecurity and propose two models for processing such tasks. For any sentence containing two target entities, we first locate the entities and fine-tune the pre-trained BERT model. Next, we utilize graph attention networks to iteratively update word nodes and relation nodes. A new relation classification model is constructed by concatenating the updated vectors of word nodes and relation nodes. Our proposed model achieved exceptional performance on the SemEval-2010 task 8 dataset, surpassing previous approaches with a remarkable F1 value of 92.3%. Additionally, we propose the integration of a ranking-based voting mechanism into the existing model. Our best results are an F1 value of 92.5% on the SemEval-2010 task 8 dataset and a value of 94.6% on the CS13K dataset. These findings highlight the effectiveness of our proposed models in tackling relation classification tasks. Full article
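A rough sketch of the first stage (locating the target entities and fine-tuning BERT) is shown below. The marker tokens and the 19-class label set (SemEval-2010 task 8) are assumptions for illustration; the graph-attention update of word and relation nodes is not shown.

```python
# Minimal sketch: entity-marked input for BERT-based relation classification.
# Marker tokens and label count are illustrative assumptions.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
tokenizer.add_special_tokens({"additional_special_tokens": ["[E1]", "[/E1]", "[E2]", "[/E2]"]})

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=19)
model.resize_token_embeddings(len(tokenizer))

sentence = "The [E1] malware [/E1] exfiltrates data through the [E2] C2 server [/E2]."
inputs = tokenizer(sentence, return_tensors="pt")
logits = model(**inputs).logits  # one score per candidate relation
```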

18 pages, 874 KiB  
Article
Multilingual Multiword Expression Identification Using Lateral Inhibition and Domain Adaptation
by Andrei-Marius Avram, Verginica Barbu Mititelu, Vasile Păiș, Dumitru-Clementin Cercel and Ștefan Trăușan-Matu
Mathematics 2023, 11(11), 2548; https://doi.org/10.3390/math11112548 - 01 Jun 2023
Viewed by 1024
Abstract
Correctly identifying multiword expressions (MWEs) is an important task for most natural language processing systems since their misidentification can result in ambiguity and misunderstanding of the underlying text. In this work, we evaluate the performance of the mBERT model for MWE identification in a multilingual context by training it on all 14 languages available in version 1.2 of the PARSEME corpus. We also incorporate lateral inhibition and language adversarial training into our methodology to create language-independent embeddings and improve its capabilities in identifying multiword expressions. The evaluation of our models shows that the approach employed in this work achieves better results compared to the best system of the PARSEME 1.2 competition, MTLB-STRUCT, on 11 out of 14 languages for global MWE identification and on 12 out of 14 languages for unseen MWE identification. Additionally, averaged across all languages, our best approach outperforms the MTLB-STRUCT system by 1.23% on global MWE identification and by 4.73% on unseen global MWE identification. Full article
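A bare-bones version of the mBERT backbone used above, framed as token classification over BIO tags for MWE spans, might look as follows. The tag inventory and model head are assumptions; lateral inhibition and the language-adversarial objective are additional components not sketched here.

```python
# Minimal sketch: MWE identification as token classification with mBERT.
# The three-tag BIO scheme is an illustrative assumption.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=3)  # O, B-MWE, I-MWE

sentence = "She kicked the bucket last night."
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    tags = model(**inputs).logits.argmax(dim=-1)  # one predicted tag per subword
```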

22 pages, 394 KiB  
Article
Reduction of Neural Machine Translation Failures by Incorporating Statistical Machine Translation
by Jani Dugonik, Mirjam Sepesy Maučec, Domen Verber and Janez Brest
Mathematics 2023, 11(11), 2484; https://doi.org/10.3390/math11112484 - 28 May 2023
Cited by 1 | Viewed by 1549
Abstract
This paper proposes a hybrid machine translation (HMT) system that improves the quality of neural machine translation (NMT) by incorporating statistical machine translation (SMT). Therefore, two NMT systems and two SMT systems were built for the Slovenian–English language pair, each for translation in one direction. We used a multilingual language model to embed the source sentence and translations into the same vector space. From each vector, we extracted features based on the distances and similarities calculated between the source sentence and the NMT translation, and between the source sentence and the SMT translation. To select the best possible translation, we used several well-known classifiers to predict which translation system generated a better translation of the source sentence. The proposed method of combining SMT and NMT in the hybrid system is novel. Our framework is language-independent and can be applied to other languages supported by the multilingual language model. Our experiment involved empirical applications. We compared the performance of the classifiers, and the results demonstrate that our proposed HMT system achieved notable improvements in the BLEU score, with increases of 1.5 points and 10.9 points for the two translation directions, respectively. Full article
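The selection step described above can be sketched as follows: the source sentence and both candidate translations are embedded with a multilingual model, similarity features are extracted, and a classifier decides which system to trust. The embedding model, feature set, and SVM classifier are assumptions for illustration.

```python
# Minimal sketch: choosing between NMT and SMT outputs with similarity features
# from a multilingual sentence embedder. Models and features are stand-ins.
from sentence_transformers import SentenceTransformer, util
from sklearn.svm import SVC

embedder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

def features(source: str, nmt_out: str, smt_out: str) -> list:
    src, nmt, smt = embedder.encode([source, nmt_out, smt_out])
    return [float(util.cos_sim(src, nmt)), float(util.cos_sim(src, smt))]

# Toy training data: label 1 means the NMT translation should be preferred.
X = [features("Hiša je velika.", "The house is big.", "House is large."),
     features("Danes dežuje.", "Today rain is.", "It is raining today.")]
y = [1, 0]
selector = SVC().fit(X, y)
```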

19 pages, 386 KiB  
Article
A Mathematical Interpretation of Autoregressive Generative Pre-Trained Transformer and Self-Supervised Learning
by Minhyeok Lee
Mathematics 2023, 11(11), 2451; https://doi.org/10.3390/math11112451 - 25 May 2023
Cited by 7 | Viewed by 4934
Abstract
In this paper, we present a rigorous mathematical examination of generative pre-trained transformer (GPT) models and their autoregressive self-supervised learning mechanisms. We begin by defining natural language space and knowledge space, which are two key concepts for understanding the dimensionality reduction process in GPT-based large language models (LLMs). By exploring projection functions and their inverses, we establish a framework for analyzing the language generation capabilities of these models. We then investigate the GPT representation space, examining its implications for the models’ approximation properties. Finally, we discuss the limitations and challenges of GPT models and their learning mechanisms, considering trade-offs between complexity and generalization, as well as the implications of incomplete inverse projection functions. Our findings demonstrate that GPT models possess the capability to encode knowledge into low-dimensional vectors through their autoregressive self-supervised learning mechanism. This comprehensive analysis provides a solid mathematical foundation for future advancements in GPT-based LLMs, promising advancements in natural language processing tasks such as language translation, text summarization, and question answering due to improved understanding and optimization of model training and performance. Full article
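For context, the autoregressive self-supervised objective analysed in the paper can be written, in standard notation (not necessarily the paper's own), as the log-likelihood of each token given its preceding context:

```latex
% Standard autoregressive language-modelling objective (generic notation):
% the model parameters \theta maximise the probability of each token x_t
% conditioned on the preceding tokens.
\mathcal{L}(\theta) = \sum_{t=1}^{T} \log p_\theta\left(x_t \mid x_1, \ldots, x_{t-1}\right)
```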

20 pages, 702 KiB  
Article
Multilingual Multi-Target Stance Recognition in Online Public Consultations
by Valentin Barriere and Alexandra Balahur
Mathematics 2023, 11(9), 2161; https://doi.org/10.3390/math11092161 - 04 May 2023
Cited by 3 | Viewed by 1166
Abstract
Machine Learning is an interesting tool for stance recognition in a large-scale context, in terms of data size, but also regarding the topics and themes addressed or the languages employed by the participants. Public consultations of citizens using online participatory democracy platforms offer this kind of setting and are good use cases for automatic stance recognition systems. In this paper, we propose to use three datasets of public consultations, in order to train a model able to classify the stance of a citizen within a text, towards a proposal or a debate question. We studied stance detection in several contexts: using data from an online platform without interactions between users, using multilingual data from online debates that are in one language, and using data from online intra-multilingual debates, which can contain several languages inside the same unique debate discussion. We propose several baselines and methods in order to take advantage of the different available data, by comparing the results of models using out-of-dataset annotations, and binary or ternary annotations from the target dataset. We finally proposed a self-supervised learning method to take advantage of unlabelled data. We annotated both the datasets with ternary stance labels and made them available. Full article

12 pages, 418 KiB  
Article
Text Simplification to Specific Readability Levels
by Wejdan Alkaldi and Diana Inkpen
Mathematics 2023, 11(9), 2063; https://doi.org/10.3390/math11092063 - 26 Apr 2023
Cited by 1 | Viewed by 2334
Abstract
The ability to read a document depends on the reader’s skills and the text’s readability level. In this paper, we propose a system that uses deep learning techniques to simplify texts in order to match a reader’s level. We use a novel approach with a reinforcement learning loop that contains a readability classifier. The classifier’s output is used to decide if more simplification is needed, until the desired readability level is reached. The simplification models are trained on data annotated with readability levels from the Newsela corpus. Our simplification models perform at sentence level, to simplify each sentence to meet the specified readability level. We use a version of the Newsela corpus aligned at the sentence level. We also produce an augmented dataset by automatically annotating more pairs of sentences using a readability-level classifier. Our text simplification models achieve better performance than state-of-the-art techniques for this task. Full article
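The control loop around the readability classifier can be sketched as follows. Both components are placeholder stubs here (labelled as such); in the paper they are neural models trained on the Newsela data.

```python
# Minimal sketch of the simplify-until-target loop. simplify_once() and
# readability_level() are placeholder stubs standing in for trained models.
def simplify_once(sentence: str) -> str:
    # Placeholder for a trained sentence-simplification model.
    return sentence.replace("utilize", "use")

def readability_level(sentence: str) -> int:
    # Placeholder for a trained readability-level classifier.
    return 5 if "utilize" in sentence else 3

def simplify_to_level(sentence: str, target_level: int, max_steps: int = 5) -> str:
    # Keep simplifying until the classifier reports the target level (or lower).
    for _ in range(max_steps):
        if readability_level(sentence) <= target_level:
            break
        sentence = simplify_once(sentence)
    return sentence

print(simplify_to_level("Researchers utilize complex instruments.", target_level=3))
```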

20 pages, 1327 KiB  
Article
Improving Intent Classification Using Unlabeled Data from Large Corpora
by Gabriel Bercaru, Ciprian-Octavian Truică, Costin-Gabriel Chiru and Traian Rebedea
Mathematics 2023, 11(3), 769; https://doi.org/10.3390/math11030769 - 03 Feb 2023
Cited by 1 | Viewed by 3226
Abstract
Intent classification is a central component of a Natural Language Understanding (NLU) pipeline for conversational agents. The quality of such a component depends on the quality of the training data; however, for many conversational scenarios, the data might be scarce; in these scenarios, data augmentation techniques are used. Having general data augmentation methods that can generalize to many datasets is highly desirable. The work presented in this paper is centered around two main components. First, we explore the influence of various feature vectors on the task of intent classification using RASA’s text classification capabilities. The second part of this work consists of a generic method for efficiently augmenting textual corpora using large datasets of unlabeled data. The proposed method is able to efficiently mine for examples similar to the ones that are already present in standard, natural language corpora. The experimental results show that using our corpus augmentation methods enables an increase in text classification accuracy in few-shot settings. Particularly, the gains in accuracy reach up to 16% when the number of labeled examples is very low (e.g., two examples). We believe that our method is important for any Natural Language Processing (NLP) or NLU task in which labeled training data are scarce or expensive to obtain. Lastly, we give some insights into future work, which aims at combining our proposed method with a semi-supervised learning approach. Full article
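A compressed version of the mining idea is shown below: unlabeled sentences whose embedding is close to an existing labeled example are pulled into the training set. The sentence-embedding model and similarity threshold are assumptions; the paper describes its own, more efficient mining procedure over large corpora.

```python
# Minimal sketch: augmenting an intent's training data by mining similar
# sentences from an unlabeled corpus. Embedder and threshold are stand-ins.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")

labeled = ["book a table for two", "reserve a restaurant for tonight"]      # intent: book_restaurant
unlabeled = ["can you get me a dinner reservation", "what is the weather tomorrow"]

lab_emb = embedder.encode(labeled, convert_to_tensor=True)
unl_emb = embedder.encode(unlabeled, convert_to_tensor=True)

# Keep unlabeled sentences whose best similarity to any labeled example is high.
best = util.cos_sim(unl_emb, lab_emb).max(dim=1).values
mined = [s for s, score in zip(unlabeled, best) if score > 0.5]
print(mined)
```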

16 pages, 790 KiB  
Article
Sentence-CROBI: A Simple Cross-Bi-Encoder-Based Neural Network Architecture for Paraphrase Identification
by Jesus-German Ortiz-Barajas, Gemma Bel-Enguix and Helena Gómez-Adorno
Mathematics 2022, 10(19), 3578; https://doi.org/10.3390/math10193578 - 30 Sep 2022
Cited by 1 | Viewed by 2369
Abstract
Since the rise of Transformer networks and large language models, cross-encoders have become the dominant architecture for various Natural Language Processing tasks. When dealing with sentence pairs, they can exploit the relationships between those pairs. On the other hand, bi-encoders can obtain a vector given a single sentence and are used in tasks such as textual similarity or information retrieval due to their low computational cost; however, their performance is inferior to that of cross-encoders. In this paper, we present Sentence-CROBI, an architecture that combines cross-encoders and bi-encoders to obtain a global representation of sentence pairs. We evaluated the proposed architecture in the paraphrase identification task using the Microsoft Research Paraphrase Corpus, the Quora Question Pairs dataset, and the PAWS-Wiki dataset. Our model obtains competitive results compared with the state-of-the-art by using model ensembles and a simple model configuration. These results demonstrate that a simple architecture that combines sentence pair and single-sentence representations without using complex pre-training or fine-tuning algorithms is a viable alternative for sentence pair tasks. Full article
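The combination of pair-level and single-sentence signals can be sketched as below: a cross-encoder score and bi-encoder embeddings are concatenated into one feature vector for a downstream classifier. The specific models, feature layout, and logistic-regression head are assumptions, not the exact Sentence-CROBI configuration.

```python
# Minimal sketch: joining cross-encoder and bi-encoder signals for paraphrase
# identification. Model choices and feature layout are illustrative assumptions.
import numpy as np
from sentence_transformers import CrossEncoder, SentenceTransformer
from sklearn.linear_model import LogisticRegression

cross = CrossEncoder("cross-encoder/stsb-roberta-base")
bi = SentenceTransformer("all-MiniLM-L6-v2")

def pair_features(a: str, b: str) -> np.ndarray:
    cross_score = cross.predict([(a, b)])[0]     # joint, pair-level signal
    emb_a, emb_b = bi.encode([a, b])             # independent sentence vectors
    return np.concatenate([[cross_score], emb_a, emb_b, np.abs(emb_a - emb_b)])

X = [pair_features("A man is playing guitar.", "Someone plays the guitar."),
     pair_features("A man is playing guitar.", "A cat sleeps on the sofa.")]
y = [1, 0]
clf = LogisticRegression(max_iter=1000).fit(X, y)
```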
