Emerging Theory and Applications in Natural Language Processing, 2nd Edition

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: 15 December 2025 | Viewed by 5059

Special Issue Editors


Guest Editor
School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
Interests: knowledge graph; natural language processing; multimodal

Guest Editor
School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China
Interests: natural language processing; knowledge graph; machine learning

Guest Editor
School of Computer Science and Technology, Dalian University of Technology, Dalian 116081, China
Interests: information retrieval; question answering and dialogue; natural language processing; large language models

Special Issue Information

Dear Colleagues,

In recent years, natural language processing (NLP) has been transformed by groundbreaking advances in deep learning and the emergence of large language models (LLMs). The integration of LLMs with adaptation tuning methods has significantly increased the generalization capabilities of NLP models, potentially enabling the development of general artificial intelligence systems. Given the significance of this progress, it is crucial to explore the potential of LLMs and to understand their relationship with classical methods in shaping the future of NLP and its real-world applications. The aim of this Special Issue is to showcase cutting-edge research in NLP, highlighting novel theories, methods, and applications that advance the state of the art while also promoting interdisciplinary research.

The scope of this Special Issue includes, but is not limited to, the following topics:

  • Novel NLP theory, architectures, and algorithms;
  • Theoretical foundations of LLMs: emergent abilities, scaling effects, etc.;
  • Model training and utilization strategies;
  • Efficiency and scalability of language models;
  • Integration of NLP with other AI technologies;
  • Interpretability of NLP models and LLMs;
  • Evaluating large language models: capabilities and limitations;
  • Ethical considerations and fairness;
  • Safety and alignment in LLMs;
  • Domain-specific NLP applications;
  • Other emerging topics in NLP and LLM research.

Dr. Linmei Hu
Dr. Jian Liu
Dr. Bo Xu
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • natural language processing
  • large language models
  • NLP theory and application

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.


Published Papers (5 papers)


Research

35 pages, 8966 KB  
Article
Verified Language Processing with Hybrid Explainability
by Oliver Robert Fox, Giacomo Bergami and Graham Morgan
Electronics 2025, 14(17), 3490; https://doi.org/10.3390/electronics14173490 - 31 Aug 2025
Viewed by 521
Abstract
The volume and diversity of digital information have led to a growing reliance on Machine Learning (ML) techniques, such as Natural Language Processing (NLP), for interpreting and accessing appropriate data. While vector and graph embeddings represent data for similarity tasks, current state-of-the-art pipelines lack guaranteed explainability, failing to accurately determine similarity for given full texts. These considerations can also be applied to classifiers exploiting generative language models with logical prompts, which fail to correctly distinguish between logical implication, indifference, and inconsistency, despite being explicitly trained to recognise the first two classes. We present a novel pipeline designed for hybrid explainability to address this. Our methodology combines graphs and logic to produce First-Order Logic (FOL) representations, creating machine- and human-readable representations through Montague Grammar (MG). The preliminary results indicate the effectiveness of this approach in accurately capturing full text similarity. To the best of our knowledge, this is the first approach to differentiate between implication, inconsistency, and indifference for text classification tasks. To address the limitations of existing approaches, we use three self-contained datasets annotated for the former classification task to determine the suitability of these approaches in capturing sentence structure equivalence, logical connectives, and spatiotemporal reasoning. We also use these data to compare the proposed method with language models pre-trained for detecting sentence entailment. The results show that the proposed method outperforms state-of-the-art models, indicating that natural language understanding cannot be easily generalised by training over extensive document corpora. This work offers a step toward more transparent and reliable Information Retrieval (IR) from extensive textual data. Full article

31 pages, 855 KB  
Article
A Comparative Evaluation of Transformer-Based Language Models for Topic-Based Sentiment Analysis
by Spyridon Tzimiris, Stefanos Nikiforos, Maria Nefeli Nikiforos, Despoina Mouratidis and Katia Lida Kermanidis
Electronics 2025, 14(15), 2957; https://doi.org/10.3390/electronics14152957 - 24 Jul 2025
Viewed by 1313
Abstract
This research investigates topic-based sentiment classification in Greek educational-related data using transformer-based language models. A comparative evaluation is conducted on GreekBERT, XLM-r-Greek, mBERT, and Palobert using three original sentiment-annotated datasets representing parents of students with functional diversity, school directors, and teachers, each capturing diverse educational perspectives. The analysis examines both overall sentiment performance and topic-specific evaluations across four thematic classes: (i) Material and Technical Conditions, (ii) Educational Dimension, (iii) Psychological/Emotional Dimension, and (iv) Learning Difficulties and Emergency Remote Teaching. Results indicate that GreekBERT consistently outperforms other models, achieving the highest overall F1 score (0.91), particularly excelling in negative sentiment detection (F1 = 0.95) and showing robust performance for positive sentiment classification. The Psychological/Emotional Dimension emerged as the most reliably classified category, with GreekBERT and mBERT demonstrating notably high accuracy and F1 scores. Conversely, Learning Difficulties and Emergency Remote Teaching presented significant classification challenges, especially for Palobert. This study contributes significantly to the field of sentiment analysis with Greek-language data by introducing original annotated datasets, pioneering the application of topic-based sentiment analysis within the Greek educational context, and offering a comparative evaluation of transformer models. Additionally, it highlights the superior performance of Greek-pretrained models in capturing emotional detail, and provides empirical evidence of the negative emotional responses toward Emergency Remote Teaching. Full article
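The evaluation protocol described in this abstract, an overall weighted F1 score plus the same metric restricted to each thematic class, can be sketched with scikit-learn. The labels and topic tags below are hypothetical stand-ins, not the study's Greek datasets:

```python
import numpy as np
from sklearn.metrics import f1_score

# Hypothetical gold and predicted sentiment labels (0 = negative, 1 = positive),
# each tagged with one of the four thematic classes from the study.
topics = np.array(["material", "educational", "psych", "ert",
                   "material", "educational", "psych", "ert"])
y_true = np.array([0, 1, 0, 0, 1, 0, 1, 0])
y_pred = np.array([0, 1, 0, 1, 1, 0, 1, 0])

# Overall weighted F1, the headline metric reported for each model.
overall = f1_score(y_true, y_pred, average="weighted")

# Topic-specific evaluation: the same metric restricted to one theme.
per_topic = {
    t: f1_score(y_true[topics == t], y_pred[topics == t], average="weighted")
    for t in np.unique(topics)
}
print(overall, per_topic)
```

Weighting by class support makes the overall score robust to the class imbalance that negative-heavy educational feedback tends to exhibit.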

29 pages, 1234 KB  
Article
Automatic Detection of the CaRS Framework in Scholarly Writing Using Natural Language Processing
by Olajide Omotola, Nonso Nnamoko, Charles Lam, Ioannis Korkontzelos, Callum Altham and Joseph Barrowclough
Electronics 2025, 14(14), 2799; https://doi.org/10.3390/electronics14142799 - 11 Jul 2025
Viewed by 563
Abstract
Many academic introductions suffer from inconsistencies and a lack of comprehensive structure, often failing to effectively outline the core elements of the research. This not only impacts the clarity and readability of the article but also hinders the communication of its significance and objectives to the intended audience. This study aims to automate the CaRS (Creating a Research Space) model using machine learning and natural language processing techniques. We conducted a series of experiments using a custom-developed corpus of 50 biology research article introductions, annotated with rhetorical moves and steps. The dataset was used to evaluate the performance of four classification algorithms: Prototypical Network (PN), Support Vector Machines (SVM), Naïve Bayes (NB), and Random Forest (RF), in combination with six embedding models: Word2Vec, GloVe, BERT, GPT-2, Llama-3.2-3B, and TEv3-small. Multiple experiments were carried out to assess performance at both the move and step levels using 5-fold cross-validation. Evaluation metrics included accuracy and weighted F1-score, with comprehensive results provided. Results show that the SVM classifier, when paired with Llama-3.2-3B embeddings, consistently achieved the highest performance across multiple tasks when trained on the preprocessed dataset, with 79% accuracy and weighted F1-score on rhetorical moves and strong results on M2 steps (75% accuracy and weighted F1-score). While other combinations showed promise, particularly NB and RF with newer embeddings, none matched the consistency of the SVM–Llama pairing. Compared to existing benchmarks, our model achieves similar or better performance; however, direct comparison is limited due to differences in datasets and experimental setups. Despite the unavailability of the benchmark dataset, our findings indicate that SVM is an effective choice for rhetorical classification, even in few-shot learning scenarios. Full article
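The best-performing setup reported here, an SVM over pretrained embeddings evaluated with 5-fold cross-validation and weighted F1, can be sketched as follows. The synthetic clusters standing in for two rhetorical-move classes are an assumption purely to keep the example self-contained; the paper derives its embeddings from models such as Llama-3.2-3B:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Stand-in sentence embeddings for two rhetorical-move classes
# (two well-separated Gaussian clusters instead of real LLM embeddings).
rng = np.random.default_rng(0)
move1 = rng.normal(loc=0.0, scale=0.5, size=(20, 8))
move2 = rng.normal(loc=3.0, scale=0.5, size=(20, 8))
X = np.vstack([move1, move2])
y = np.array([0] * 20 + [1] * 20)

# 5-fold cross-validated SVM, scored with weighted F1 as in the study.
clf = SVC(kernel="rbf")
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring="f1_weighted")
print(scores.mean())
```

Stratified folds keep the per-move class balance stable across splits, which matters when a 50-introduction corpus yields only a handful of examples of some moves.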

23 pages, 809 KB  
Article
Towards Smarter Assessments: Enhancing Bloom’s Taxonomy Classification with a Bayesian-Optimized Ensemble Model Using Deep Learning and TF-IDF Features
by Ali Alammary and Saeed Masoud
Electronics 2025, 14(12), 2312; https://doi.org/10.3390/electronics14122312 - 6 Jun 2025
Cited by 1 | Viewed by 1565
Abstract
Bloom’s taxonomy provides a well-established framework for categorizing the cognitive complexity of assessment questions, ensuring alignment with course learning outcomes (CLOs). Achieving this alignment is essential for constructing meaningful and valid assessments that accurately measure student learning. However, in higher education, the large volume of questions that instructors must develop each semester makes manual classification of cognitive levels a time-consuming and error-prone process. Despite various attempts to automate this classification, the highest accuracy reported in existing research has not exceeded 93.5%, highlighting the need for further advancements in this area. Furthermore, the best-performing deep learning models only reached an accuracy of 86%. These results emphasize the need for improvement, particularly in the application of deep learning models, which have not been fully exploited for this task. In response to these challenges, our study explores a novel approach to enhance the accuracy of cognitive level classification. We leverage a combination of augmentation through synonym substitution, advanced feature extraction techniques utilizing DistilBERT and TF-IDF, and a robust ensemble model incorporating soft voting. These methods were selected to capture both semantic meaning and term frequency, allowing the model to benefit from contextual depth and statistical relevance. Additionally, Bayesian optimization is employed for hyperparameter tuning to refine the model’s performance further. The novelty of our approach lies in the fusion of sparse TF-IDF features with dense DistilBERT embeddings, optimized through Bayesian search across multiple classifiers. This hybrid design captures both term-level salience and deep contextual semantics, something not fully exploited in prior models focused solely on transformer architectures. Our soft-voting ensemble capitalizes on classifier diversity, yielding more stable and accurate results. 
This integrated approach outperformed previous configurations with an accuracy of 96%, surpassing the current state-of-the-art results and setting a new benchmark for automated cognitive level classification. These findings have significant implications for the development of high-quality, scalable assessments in educational settings. Full article
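The feature-fusion and soft-voting design described in the abstract can be sketched with scikit-learn. Here the random dense vectors are a stand-in for DistilBERT embeddings, and the toy questions, labels, and base classifiers are illustrative assumptions, not the authors' configuration:

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

questions = [
    "Define the term photosynthesis.",
    "Design an experiment to test enzyme activity.",
    "Compare aerobic and anaerobic respiration.",
    "List the stages of mitosis.",
]
labels = [0, 2, 1, 0]  # toy cognitive-level labels

# Sparse term-level features.
tfidf = TfidfVectorizer().fit_transform(questions)

# Stand-in for dense contextual embeddings (the paper uses DistilBERT;
# random vectors keep this sketch self-contained).
rng = np.random.default_rng(0)
dense = csr_matrix(rng.normal(size=(len(questions), 16)))

# Fuse the sparse and dense views into one feature matrix.
X = hstack([tfidf, dense]).tocsr()

# Soft voting averages predicted class probabilities across classifiers.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ],
    voting="soft",
)
ensemble.fit(X, labels)
print(ensemble.predict(X))
```

In the paper, the classifier pool and all hyperparameters are chosen by Bayesian optimization rather than fixed as above.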

19 pages, 3091 KB  
Article
Efficient Data Reduction Through Maximum-Separation Vector Selection and Centroid Embedding Representation
by Sultan Alshamrani
Electronics 2025, 14(10), 1919; https://doi.org/10.3390/electronics14101919 - 9 May 2025
Viewed by 524
Abstract
This study introduces two novel data reduction approaches for efficient sentiment analysis: High-Distance Sentiment Vectors (HDSV) and Centroid Sentiment Embedding Vectors (CSEV). By leveraging embedding space characteristics from DistilBERT, HDSV selects maximally separated sample pairs, while CSEV computes representative centroids for each sentiment class. We evaluate these methods on three benchmark datasets: SST-2, Yelp, and Sentiment140. Our results demonstrate remarkable data efficiency, reducing training samples to just 100 with HDSV and two with CSEV while maintaining comparable performance to full dataset training. Notable findings include CSEV achieving 88.93% accuracy on SST-2 (compared to 90.14% with full data) and both methods showing improved cross-dataset generalization, with less than 2% accuracy drop in domain transfer tasks versus 11.94% for full dataset training. The proposed methods enable significant storage savings, with datasets compressed to less than 1% of their original size, making them particularly valuable for resource-constrained environments. Our findings advance the understanding of data requirements in sentiment analysis, demonstrating that strategically selected minimal training data can achieve robust and generalizable classification while promoting more sustainable machine learning practices. Full article
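The two reduction strategies can be sketched in a few lines of NumPy. The 2-D "embeddings" and labels below are toy stand-ins for the DistilBERT vectors used in the paper, and the nearest-centroid prediction step is an illustrative assumption about how CSEV centroids would be consumed:

```python
import numpy as np

def csev_centroids(embeddings, labels):
    """Collapse each sentiment class to a single centroid vector (CSEV-style)."""
    classes = np.unique(labels)
    return classes, np.stack(
        [embeddings[labels == c].mean(axis=0) for c in classes]
    )

def hdsv_pair(embeddings, labels, pos=1, neg=0):
    """Pick the maximally separated positive/negative pair (HDSV-style)."""
    pos_idx = np.flatnonzero(labels == pos)
    neg_idx = np.flatnonzero(labels == neg)
    # Pairwise distances between every positive and every negative sample.
    d = np.linalg.norm(
        embeddings[pos_idx][:, None, :] - embeddings[neg_idx][None, :, :], axis=-1
    )
    i, j = np.unravel_index(d.argmax(), d.shape)
    return pos_idx[i], neg_idx[j]

# Toy 2-D "embeddings" standing in for DistilBERT vectors.
emb = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [4.8, 5.2]])
lab = np.array([0, 0, 1, 1])

classes, cents = csev_centroids(emb, lab)
i, j = hdsv_pair(emb, lab)
# Nearest-centroid prediction for a query near the positive cluster.
query = np.array([4.9, 5.1])
pred = classes[np.linalg.norm(cents - query, axis=1).argmin()]
print(pred)
```

Either selection shrinks the stored training set to a handful of vectors, which is the source of the sub-1% storage figure reported in the abstract.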
