Practical Applications of Large Language Models in Natural Language Processing

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 30 April 2026 | Viewed by 2424

Special Issue Editors


Guest Editor
School of Computer Science and Engineering, Southeast University, Nanjing 211189, China
Interests: artificial intelligence; natural language processing; knowledge graph; large language model

Guest Editor
School of Computer Science and Engineering, Southeast University, Nanjing 211189, China
Interests: natural language processing; knowledge graph; multimodal learning

Guest Editor Assistant
School of Engineering, University of Manchester, Manchester M13 9PL, UK
Interests: ontology embeddings; description logic

Special Issue Information

Dear Colleagues,

In the rapidly evolving landscape of artificial intelligence and computational linguistics, Large Language Models (LLMs) have emerged as transformative tools that are revolutionizing how we approach natural language processing tasks. The unprecedented capabilities of these models in understanding, generating, and manipulating human language have opened new frontiers in both theoretical research and practical applications across diverse domains. However, the successful deployment of LLMs in real-world scenarios presents unique challenges related to computational efficiency, domain adaptation, ethical considerations, and integration with existing systems. This Special Issue aims to bridge the gap between cutting-edge LLM research and practical implementation by showcasing innovative approaches that demonstrate the tangible benefits and solutions that these models provide to natural language processing challenges.

We invite the submission of original research contributions including, but not limited to, the following areas: LLM-powered textual analysis and understanding; domain-specific fine-tuning and adaptation; information extraction and knowledge graph construction; advanced conversational AI and personalized user experiences; applications in specialized sectors such as law, healthcare, and software engineering; efficient deployment strategies, including retrieval-augmented generation (RAG) and model compression; ethical AI, focusing on fairness, explainability, and bias mitigation; and the integration of LLMs with external tools and traditional NLP pipelines. Advancements in these areas promise to accelerate the adoption of intelligent language technologies, leading to more sophisticated and accessible natural language processing solutions for diverse industries and applications.

Dr. Yongrui Chen
Prof. Dr. Guilin Qi
Guest Editors

Dr. Hui Yang
Guest Editor Assistant

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • large language models
  • applied natural language processing
  • domain adaptation
  • efficient deployment
  • ethical AI

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (3 papers)


Research

18 pages, 1079 KB  
Article
Feasibility of Using Large Language Models for Structured Medication Extraction from Clinical Text: A Comparative Analysis of Zero-Shot and Few-Shot Paradigms
by Evan Schulte, Mohamed Abusharkh, Kushal Dahal, Michael Klepser and Minji Sohn
Appl. Sci. 2026, 16(5), 2300; https://doi.org/10.3390/app16052300 - 27 Feb 2026
Viewed by 662
Abstract
The digitization of healthcare has been accompanied by a rapid expansion of electronic health records (EHRs); however, a significant proportion of critical patient data, specifically medication regimens, remains entrapped within unstructured clinical narratives. The inability to seamlessly compute this data hinders advancements in pharmacovigilance, clinical decision support, and population health management. This study presents a comprehensive, rigorous evaluation of the feasibility of deploying Large Language Models (LLMs) to automate the extraction of structured dosage information (Dose, Daily Frequency, Duration) from outpatient antimicrobial clinical notes sourced from the Collaboration to Harmonize Antimicrobial Registry Measures (CHARM) registry. We scrutinized the performance of five distinct open-weight architectures, namely GPT-OSS:20B, Gemma 2:9B, Mistral 7B, Qwen3:14B and Llama 3.2, across both Zero-Shot and Retrieval Augmented Generation (RAG)-based Few-Shot prompting paradigms. Our analysis reveals a fundamental architectural trade-off: the reasoning-optimized GPT-OSS:20B dominates the zero-shot landscape (F1 > 0.90) by leveraging abstract schema understanding, whereas the instruction-tuned Gemma 2:9B excels in the few-shot setting (F1 ~ 0.99), effectively utilizing examples as guardrails to surpass larger models. Conversely, smaller models (Mistral, Llama) exhibit a prohibitive “hallucination barrier,” rendering them unsafe for unsupervised clinical application. Furthermore, we identify “Inconsistent Unit Handling” and “Complex Temporal Logic” as persistent failure modes that resist simple scaling laws. This report provides a definitive framework for selecting model architectures based on the availability of few-shot examples and highlights the necessity of dynamic RAG strategies to achieve production-grade reliability in medical informatics.

29 pages, 911 KB  
Article
Boundary-Focused Large Language Model Adaptation for Style Change Detection in Multi-Authored Text
by Abeer Saad Alsheddi and Mohamed El Bachir Menai
Appl. Sci. 2026, 16(4), 1981; https://doi.org/10.3390/app16041981 - 17 Feb 2026
Viewed by 345
Abstract
The style change detection (SCD) task involves identifying the locations of writing style changes in multi-authored documents. This task can be applied to plagiarism detection, security, and commerce applications. Introducing decoder-based Large Language Models (LLMs) marks a pivotal shift in applications. The segment boundaries for SCD models can be represented by concatenating two consecutive segments as pairs. However, LLMs usually restrict their input lengths, where the long-length inputs may exceed the restricted length. This paper seeks to bridge this gap and exploit the power of LLMs by introducing boundary-focused LLM Adaptation for SCD (BF-LLMA-SCD). The proposed solution adapts decoder-based LLMs for SCD using QLoRA. BF-LLMA-SCD truncates long-length input by preserving texts near an examined boundary while removing those at the other sides. BF-LLMA-SCD was trained on three PAN datasets. Comparison results with the top-performing SOTA solutions show that BF-LLMA-SCD achieved the best performance results in terms of F1 on PAN 2021 and PAN 2022/D1, while obtaining competitive results on PAN 2022/D3. BF-LLMA-SCD was also trained on an Arabic SCD dataset comprising three difficulty levels. It achieved an F1 score above 0.99 on easy instances.
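
The truncation idea in this abstract (keep the text nearest the examined boundary, drop the far ends) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `boundary_focused_pair`, the whitespace tokenization (a stand-in for a real model tokenizer), and the roughly even split of the token budget between the two segments are all assumptions made for illustration.

```python
def boundary_focused_pair(left: str, right: str, budget: int) -> tuple[list[str], list[str]]:
    """Trim a pair of consecutive segments to at most `budget` tokens in total,
    keeping the tokens closest to the boundary between them.

    Hypothetical helper: whitespace splitting stands in for a model tokenizer,
    and the even budget split is an assumption, not the paper's stated rule.
    """
    if budget < 2:
        raise ValueError("budget must allow at least one token per side")

    left_tokens = left.split()
    right_tokens = right.split()

    # Start from an even split, then hand any budget a short side
    # does not need over to the other side.
    keep_left = min(len(left_tokens), budget // 2)
    keep_right = min(len(right_tokens), budget - keep_left)
    keep_left = min(len(left_tokens), budget - keep_right)

    # The boundary sits at the END of the left segment and the START of the
    # right segment, so keep the left tail and the right head.
    left_kept = left_tokens[len(left_tokens) - keep_left:]
    right_kept = right_tokens[:keep_right]
    return left_kept, right_kept


if __name__ == "__main__":
    kept = boundary_focused_pair("a b c d e", "f g h i j", budget=4)
    print(kept)
```

With a budget of 4 tokens, the segments `a b c d e` and `f g h i j` are reduced to the two tokens on either side of the boundary, so the far ends of both segments are discarded first.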

17 pages, 1206 KB  
Article
DPATransLLM: Detection of Pronominal Anaphora in Turkish Sentences Using Transformer-Based, Large Language Models and Hybrid Ensemble Approach
by Engin Demir and Metin Bilgin
Appl. Sci. 2025, 15(23), 12480; https://doi.org/10.3390/app152312480 - 25 Nov 2025
Viewed by 755
Abstract
In the current information age, with the exponential growth of data volume and language-based applications, the accurate resolution of intra-contextual relationships in texts has become indispensable for both academic research and industrial Natural Language Processing (NLP) systems. This study focuses on the detection of pronominal anaphora in Turkish sentences. For the detection of pronominal anaphora, a specific dataset comprising 2000 sentences and 72,239 tokens was created, and this dataset was labeled using a BIO tagging method developed with a custom approach for this study. In this work, fine-tuning was performed on Transformer-based language models pre-trained on Turkish data, such as BERT and RoBERTa. Additionally, Large Language Models (LLMs) trained on Turkish data, including Turkcell-LLM-7b-v1 and ytu-ce-cosmos/Turkish-Llama-8b-DPO-v0.1, as well as multilingual models like Microsoft’s Phi-3 Mini-4K-Instruct and OpenAI’s GPT-4o-mini, were also fine-tuned with the created dataset to detect pronominal anaphora in sentences. Following the training of the language models, the resulting performance was evaluated using pronoun accuracy, antecedent accuracy, exact match, and F1-score metrics. According to the results obtained in the pronominal anaphora detection phase of the study, a novel hybrid ensemble approach combining multiple Transformer models with linguistic rules achieved the highest performance. This hybrid system attained scores of 0.987 for pronoun accuracy, 0.977 for antecedent accuracy, 0.505 for exact match, and 0.960 for F1-score, surpassing all individual models, including GPT-4o-mini. These findings reveal the superiority of ensemble methods combined with Turkish-specific linguistic rules over standalone models in Turkish anaphora resolution. This study is considered novel, as it is the first work to apply hybrid ensemble methods with linguistic rule integration to this domain for the Turkish language.
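
For readers unfamiliar with BIO tagging, the following minimal sketch shows how token-level spans can be turned into B-/I-/O labels. The abstract does not describe the paper's custom scheme, so the label names `ANT` (antecedent) and `PRON` (pronoun), the helper `spans_to_bio`, and the example sentence are hypothetical and serve only to illustrate the general technique.

```python
def spans_to_bio(tokens: list[str], spans: list[tuple[int, int, str]]) -> list[str]:
    """Convert (start, end, label) token spans into BIO tags.

    `end` is exclusive. Label names are illustrative assumptions; the paper's
    own tag set is not given in the abstract.
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        if not 0 <= start < end <= len(tokens):
            raise ValueError(f"span ({start}, {end}) is out of range")
        tags[start] = f"B-{label}"          # first token of the span
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the span
    return tags


if __name__ == "__main__":
    # "Ayşe took the red book and read it."
    tokens = ["Ayşe", "kırmızı", "kitabı", "aldı", "ve", "onu", "okudu"]
    spans = [(1, 3, "ANT"), (5, 6, "PRON")]  # hypothetical annotation
    for token, tag in zip(tokens, spans_to_bio(tokens, spans)):
        print(f"{token}\t{tag}")
```

Here the two-token antecedent "kırmızı kitabı" receives `B-ANT` and `I-ANT`, the pronoun "onu" receives `B-PRON`, and every other token is tagged `O`.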