The Advanced Trends in Natural Language Processing

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 December 2026 | Viewed by 6548

Special Issue Editors


Dr. Minoru Sasaki
Guest Editor
Department of Computer and Information Sciences, Faculty of Engineering, Ibaraki University, Hitachi 316-8511, Japan
Interests: natural language processing

Dr. Francisco García-Sánchez
Guest Editor
Departamento de Informática y Sistemas, Facultad de Informática, Universidad de Murcia, Campus de Espinardo, Espinardo, 30100 Murcia, Spain
Interests: semantic web; knowledge engineering; ontologies; linked data; social semantic web; distributed systems; service oriented architectures; cloud computing; artificial intelligence; natural language processing; intelligent agents and multiagent systems

Special Issue Information

Dear Colleagues,

Natural language processing (NLP) is a branch of computer science and artificial intelligence that combines computational linguistics, machine learning and deep learning to enable computer systems to analyze, understand and generate human language. Research in this field explores how machines can be enabled to understand and interact with text and speech in natural language. In recent years, the emergence of powerful large-scale language models (LLMs), such as OpenAI's GPT series, Google Gemini and Meta Llama, has led to significant advances in NLP tasks such as text classification, machine translation, information extraction, summarization and question answering. However, a number of unresolved theoretical and technical problems in fundamental NLP and NLP applications still await further research.

We invite you to participate in this Special Issue on "The Advanced Trends in Natural Language Processing".

This Special Issue will collect contributions from researchers in various fields of natural language processing, presenting results from the frontline of research and discussing the advances achieved in the state of the art of fundamental NLP and NLP applications.

Suggested themes and article types for submissions:

In this Special Issue, original research articles and reviews are welcome. Research areas may include (but are not limited to) the following:

  • Computational linguistics;
  • Machine learning and deep learning for NLP;
  • Lexical semantics and sentence-level semantics;
  • Terminology extraction and classification;
  • Named-entity recognition and linking;
  • Text classification;
  • Sentiment/emotion analysis;
  • Information retrieval and semantic web;
  • Argument mining;
  • Summarization;
  • Relation extraction and classification;
  • Representation learning for NLP tasks;
  • Resources and evaluation for NLP tasks;
  • Knowledge base/graph construction and alignment from semi-structured content;
  • NLP applications for healthcare, social sciences and education, etc.

We look forward to receiving your contributions.

Dr. Minoru Sasaki
Dr. Francisco García-Sánchez
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • natural language processing
  • machine learning and deep learning
  • large-scale language models (LLMs)
  • named-entity recognition
  • text classification

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (6 papers)


Research

18 pages, 737 KB  
Article
Layer-Wise Attention with Pivot Layers for Effective Fine-Tuning of Encoder-Based Language Models
by Seung-Dong Lee, Jun-Ha Hwang, Miseo Kim and Young-Seob Jeong
Appl. Sci. 2026, 16(9), 4278; https://doi.org/10.3390/app16094278 - 27 Apr 2026
Abstract
Fine-tuning pre-trained encoder-based language models for downstream tasks is typically performed by exploiting the output of the last encoder layer. However, an alternative line of research suggests that leveraging representations from multiple encoder layers may yield richer linguistic information. Previous studies found that different layers convey different linguistic knowledge, suggesting that the last layer might not be optimal for all downstream tasks. In this paper, we propose a layer-wise attention mechanism using a pivot layer as a new fine-tuning method. The pivot layer is used to compute attention scores of encoder layers, and we define three types of pivot layers. We also examine four attention functions and demonstrate through experiments that the attention function plays an important role in layer-wise attention for fine-tuning. The best-performing combination of our proposed mechanism outperformed the standard fine-tuning method and other recent methods on the General Language Understanding Evaluation (GLUE) benchmark. By visualizing the attention distributions, we found that the last layer is not always preferable for every GLUE benchmark task, and that differences in attention distribution are associated with task performance.
(This article belongs to the Special Issue The Advanced Trends in Natural Language Processing)
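To make the mechanism concrete, the following is a minimal PyTorch sketch of layer-wise attention with a pivot layer, assuming dot-product scoring over per-layer [CLS] vectors; all names are illustrative, and this is not the authors' implementation, which additionally defines three pivot-layer types and examines four attention functions.

```python
# Minimal sketch of layer-wise attention with a pivot layer (assumed
# dot-product scoring; illustrative only, not the paper's code).
import torch
import torch.nn.functional as F

def layer_wise_attention(hidden_states: torch.Tensor, pivot_idx: int) -> torch.Tensor:
    """Fuse per-layer [CLS] vectors via attention against a pivot layer.

    hidden_states: (num_layers, batch, hidden), the [CLS] vector of each layer.
    pivot_idx: index of the layer whose representation queries the others.
    """
    pivot = hidden_states[pivot_idx]                      # (batch, hidden)
    # Score every layer against the pivot (dot product is one of several
    # possible attention functions).
    scores = torch.einsum("bh,lbh->bl", pivot, hidden_states)
    weights = F.softmax(scores, dim=-1)                   # (batch, num_layers)
    # The attention-weighted sum of layer representations feeds the task head.
    return torch.einsum("bl,lbh->bh", weights, hidden_states)

# Toy usage: 12 encoder layers, batch of 2, hidden size 768, last layer as pivot.
states = torch.randn(12, 2, 768)
fused = layer_wise_attention(states, pivot_idx=-1)
print(fused.shape)  # torch.Size([2, 768])
```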
22 pages, 1452 KB  
Article
Definition-Anchored Unsupervised Word Sense Induction Using LLM-Generated Glosses
by Shota Yoshikawa and Minoru Sasaki
Appl. Sci. 2026, 16(8), 3797; https://doi.org/10.3390/app16083797 - 13 Apr 2026
Viewed by 262
Abstract
Word sense induction (WSI) aims to automatically discover the different senses of a word from contextual usage without predefined sense inventories. However, existing distributional clustering methods often suffer from dominant-sense bias and struggle to correctly identify minority senses. In this paper, we propose a definition-anchored reclassification framework for WSI that leverages large language models (LLMs) to generate explicit sense descriptions and refine cluster assignments. Unlike purely distributional approaches, our method integrates semantic definitions into the induction process. Our method improves instance-level alignment by introducing a trade-off with global structural consistency, as it shifts the decision process from geometric clustering to definition-based semantic matching. Experiments on the SemEval-2010 and SemEval-2013 datasets demonstrate that the proposed method consistently outperforms traditional clustering baselines and existing WSI systems across both structural metrics (NMI and V-measure) and instance-level metrics (F-B3 and Fuzzy-F-B3). In particular, our approach effectively mitigates dominant-sense bias and improves the recovery of minority senses by preserving them as distinct clusters while correctly assigning their instances. These results suggest that explicit semantic representations generated by LLMs provide a promising direction for addressing long-standing challenges in unsupervised word sense induction. Furthermore, unlike purely distributional clustering approaches, our method explicitly introduces LLM-generated semantic definitions as anchors, enabling more robust mitigation of dominant-sense bias and improved recall of minority senses.
(This article belongs to the Special Issue The Advanced Trends in Natural Language Processing)
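The core reclassification step can be sketched in a few lines: each instance is assigned to the most similar sense definition rather than to a geometric cluster centroid. In the sketch below, TF-IDF stands in for a real sentence encoder and the glosses are hand-written rather than LLM-generated, so both are simplifying assumptions.

```python
# Definition-anchored reassignment sketch: instances go to the nearest
# gloss by similarity. TF-IDF and the hand-written glosses are stand-ins
# for the sentence encoder and LLM-generated definitions in the paper.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

glosses = [
    "bank: a financial institution that accepts deposits and makes loans",
    "bank: sloping land alongside a river or lake",
]
instances = [
    "she deposited her salary at the bank",
    "they had a picnic on the river bank",
    "the bank raised its interest rates",
]

vec = TfidfVectorizer().fit(glosses + instances)
sim = cosine_similarity(vec.transform(instances), vec.transform(glosses))
senses = sim.argmax(axis=1)  # definition-based matching, not geometric clustering
for text, s in zip(instances, senses):
    print(s, text)
```

Because assignment is anchored to explicit definitions, a minority sense keeps its own anchor even when most instances belong to the dominant sense, which is the intuition behind the mitigation of dominant-sense bias.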
29 pages, 8422 KB  
Article
A Transformer-Based Method for Bidirectional French–Lingala Machine Translation in Speech and Text
by Reagan E. Mandiya, Selain K. Kasereka, Christophe B. Wizamo, Milena Savova-Mratsenkova, Ruffin-Benoît M. Ngoie, Tasho Tashev and Nathanaël M. Kasoro
Appl. Sci. 2026, 16(7), 3399; https://doi.org/10.3390/app16073399 - 31 Mar 2026
Viewed by 583
Abstract
Underrepresented languages such as Lingala are a significant part of the world’s cultural and linguistic heritage. Lingala plays a central role in daily communication, business, media, education, and culture for millions of people in the Democratic Republic of Congo (DRC) and the Republic of Congo. However, due to data scarcity and dialectal diversity, natural language processing (NLP) research often overlooks this language. In this paper, we propose a deep neural network pipeline for bidirectional French–Lingala automatic translation, covering both text-to-text and voice-to-text scenarios, by integrating Long Short-Term Memory (LSTM) and Transformer models on a specialized parallel corpus. The Bidirectional Encoder Representations from Transformers (BERT) model is used as a bidirectional source encoder to improve contextual representation, while the Whisper model handles automatic speech recognition as the first stage of the audio translation pipeline. Experimental results show that the standalone Transformer achieves a BLEU score of 35.3, compared to 8.12 for the LSTM Seq2Seq baseline. Fine-tuning with BERT raises the BLEU score to 38.6. Integrating the Whisper ASR module for an end-to-end speech translation task yields a final pipeline BLEU score of 55.4, with a Word Error Rate of 12.3% on the speech recognition sub-task, confirming the effectiveness of each component. These results demonstrate the potential of combining domain-specific pre-trained models with modular neural architectures to achieve competitive translation performance in a critically under-resourced language.
(This article belongs to the Special Issue The Advanced Trends in Natural Language Processing)
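The two-stage speech pipeline can be approximated with the Hugging Face pipeline API, as sketched below; "your-org/fr-ln-transformer" is a hypothetical placeholder, since the authors' fine-tuned French–Lingala model is not identified here, and openai/whisper-small is just one of the public Whisper checkpoints.

```python
# Hedged sketch of the two-stage speech-to-text translation pipeline:
# Whisper transcribes French audio, then a Transformer MT model produces
# Lingala. The MT model name below is a placeholder, not a real checkpoint.
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")
translate = pipeline("translation", model="your-org/fr-ln-transformer")  # placeholder

def speech_to_lingala(audio_path: str) -> str:
    french_text = asr(audio_path)["text"]                 # stage 1: ASR
    return translate(french_text)[0]["translation_text"]  # stage 2: MT
```

Keeping the stages modular means the ASR and MT components can be evaluated separately, consistent with the paper reporting a distinct Word Error Rate for the speech recognition sub-task alongside the pipeline BLEU score.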
25 pages, 29137 KB  
Article
An Empirical Study on Enhancing Large Language Models for Long-Term Conversations in Korean
by Hongjin Kim, Jeonghyun Kang, Yeajin Jang, Yujin Sim and Harksoo Kim
Appl. Sci. 2026, 16(7), 3175; https://doi.org/10.3390/app16073175 - 25 Mar 2026
Viewed by 409
Abstract
Large language models (LLMs) have shown strong performance in open-domain dialogue, yet they continue to struggle with long-term multi-session conversations (MSC), particularly in non-English languages such as Korean. In this work, we present a comprehensive empirical study on enhancing Korean MSC capabilities of LLMs through dataset construction, memory modeling, and parameter-efficient fine-tuning. We introduce an extended Korean MSC dataset that explicitly distinguishes between persona memory (long-term user attributes) and episode memory (short-term, event-driven information), enabling more effective memory management across sessions. Using this dataset, we evaluate LLM performance on three core MSC tasks: session summarization, memory update, and response generation. Our experiments reveal that Korean MSC is intrinsically more challenging than English MSC and that memory update and response generation require substantial reasoning ability. To address these challenges, we compare LoRA, DPO, MoE, CPT, Layer Tuning, and neuron-level tuning methods. Results consistently show that neuron tuning, guided by a novel language-specific neuron identification method based on activation scores and entropy, achieves superior performance and robustness, particularly in continual learning settings. Overall, our findings highlight neuron-level adaptation as an effective and interpretable approach for improving long-term conversational ability in low-resource languages.
(This article belongs to the Special Issue The Advanced Trends in Natural Language Processing)
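As a rough illustration of the neuron identification idea, the sketch below scores feed-forward neurons by how concentrated their activations are on one language, using entropy; the data are random placeholders and the exact scoring in the paper may differ.

```python
# Toy sketch of language-specific neuron identification via activation
# scores and entropy. Random activations stand in for statistics collected
# on Korean vs. English probe inputs; the paper's exact scoring may differ.
import numpy as np

rng = np.random.default_rng(0)
# act[l, n] = mean activation of neuron n on inputs in language l.
act = rng.random((2, 3072))  # 2 languages, 3072 FFN neurons in one layer

p = act / act.sum(axis=0, keepdims=True)         # language distribution per neuron
entropy = -(p * np.log(p + 1e-12)).sum(axis=0)   # low entropy => language-specific
korean_specific = np.where((entropy < 0.5) & (act[0] > act[0].mean()))[0]
print(f"{korean_specific.size} candidate Korean-specific neurons")
# Only the weights touching these neurons would be updated during fine-tuning,
# which is what makes neuron-level tuning parameter-efficient.
```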
16 pages, 489 KB  
Article
Integrating Hybrid AI Approaches for Enhanced Translation in Minority Languages
by Chen-Chi Chang, Yu-Hsun Lin, Yun-Hsiang Hsu and I-Hsin Fan
Appl. Sci. 2025, 15(16), 9039; https://doi.org/10.3390/app15169039 - 15 Aug 2025
Cited by 1 | Viewed by 2683
Abstract
This study presents a hybrid artificial intelligence model designed to enhance translation quality for low-resource languages, specifically targeting the Hakka language. The proposed model integrates phrase-based machine translation (PBMT) and neural machine translation (NMT) within a recursive learning framework. The methodology consists of three key stages: (1) initial translation using PBMT, where Hakka corpus data is structured into a parallel dataset; (2) NMT training with Transformers, leveraging the generated parallel corpus to train deep learning models; and (3) recursive translation refinement, where iterative translations further enhance model accuracy by expanding the training dataset. This study employs preprocessing techniques to clean and optimize the dataset, reducing noise and improving sentence segmentation. A BLEU score evaluation is conducted to compare the effectiveness of PBMT and NMT across various corpus sizes, demonstrating that while PBMT performs well with limited data, the Transformer-based NMT achieves superior results as training data increases. The findings highlight the advantages of a hybrid approach in overcoming data scarcity challenges for minority languages. This research contributes to machine translation methodologies by proposing a scalable framework for improving linguistic accessibility in under-resourced languages.
(This article belongs to the Special Issue The Advanced Trends in Natural Language Processing)
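The recursive learning framework reduces to a simple loop: bootstrap a parallel corpus with PBMT, train an NMT model, fold the model's own translations back into the corpus, and repeat. The sketch below shows that data flow only; the pbmt and train callables are hypothetical stand-ins for the paper's components.

```python
# Schematic of the three-stage recursive hybrid loop. The callables are
# hypothetical placeholders; this illustrates data flow, not a real system.
def recursive_hybrid_training(sentences, seed_parallel, pbmt, train, rounds=3):
    # Stage 1: bootstrap a parallel corpus with phrase-based MT.
    corpus = seed_parallel + [(s, pbmt(s)) for s in sentences]
    model = None
    for _ in range(rounds):
        # Stage 2: train a Transformer NMT model on the current corpus.
        model = train(corpus)
        # Stage 3: recursive refinement -- the model's own translations are
        # folded back in, expanding the training set for the next round.
        corpus += [(s, model(s)) for s in sentences]
    return model

# Toy usage with trivial "translators" just to exercise the loop.
toy_model = recursive_hybrid_training(["an example sentence"], [],
                                      pbmt=str.upper,
                                      train=lambda corpus: str.lower)
```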
22 pages, 989 KB  
Article
Impact of Developer Queries on the Effectiveness of Conversational Large Language Models in Programming
by Viktor Taneski, Sašo Karakatič, Patrik Rek and Gregor Jošt
Appl. Sci. 2025, 15(12), 6836; https://doi.org/10.3390/app15126836 - 17 Jun 2025
Cited by 1 | Viewed by 1704
Abstract
This study investigates the effects of LLM-based coding assistance on web application development by students using a frontend framework. Rather than comparing different models, it focuses on how students interact with LLM tools to isolate the impact of query type on coding success. To this end, participants were instructed to rely exclusively on LLMs for writing code, based on a given set of specifications, and their queries were categorized into seven types: Error Fixing (EF), Feature Implementation (FI), Code Optimization (CO), Code Understanding (CU), Best Practices (BP), Documentation (DOC), and Concept Clarification (CC). The results reveal that students who queried LLMs for error fixing (EF) were statistically more likely to have runnable code, regardless of prior knowledge. Additionally, students seeking code understanding (CU) and error fixing performed better, even when normalizing for previous coding ability. These findings suggest that the nature of the queries made to LLMs influences the success of programming tasks and provides insights into how AI tools can assist learning in software development.
(This article belongs to the Special Issue The Advanced Trends in Natural Language Processing)
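To show how such a categorization might look in code, here is an illustrative keyword-based tagger for the seven query types; the paper does not describe an automatic classifier, so this sketch, its cue lists and its fallback rule are all assumptions.

```python
# Illustrative keyword tagger for the study's seven query types (EF, FI,
# CO, CU, BP, DOC, CC). The cue lists and fallback are assumptions; the
# paper does not specify how queries were categorized.
QUERY_TYPES = {
    "EF": ["error", "fix", "exception", "not working"],
    "FI": ["implement", "add a feature", "create a component"],
    "CO": ["optimize", "faster", "refactor"],
    "CU": ["what does this code", "explain this code", "why does this"],
    "BP": ["best practice", "convention", "recommended way"],
    "DOC": ["document", "docstring", "write comments"],
    "CC": ["what is", "difference between", "concept"],
}

def tag_query(query: str) -> str:
    q = query.lower()
    for label, cues in QUERY_TYPES.items():
        if any(cue in q for cue in cues):
            return label
    return "CC"  # fall back to concept clarification

print(tag_query("Fix this TypeError in my component"))  # EF
```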