Natural Language Processing in the Era of Artificial Intelligence

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 April 2026 | Viewed by 16907

Special Issue Editors

Dr. Daniela Gîfu

Guest Editor

Institute of Computer Science, Romanian Academy, Iasi Branch, 700011 Iasi, Romania
Interests: natural language processing; computational linguistics; web of linked data; content analysis; social media and health information; applied and computational statistics; integrated health informatics system; assisted decision systems; research ethics

Dr. Kevin Bretonnel Cohen

Guest Editor

Computational Bioscience Program, Department of Pharmacology, University of Colorado School of Medicine, Aurora, CO 80045, USA
Interests: spinal cord injury and regeneration; analysis of the speech of suicidal individuals; temporality in health records; information extraction from epilepsy clinic notes

Special Issue Information

Dear Colleagues,

In an era when massive amounts of data have become available, researchers across various domains increasingly require the expertise of language engineers to process large quantities of literature, data, and records. Whether in healthcare, finance, education, social sciences, or any other field, linking the contents of these documents to each other, as well as to specialized ontologies, can enable access to and discovery of structured information, fostering significant advancements in natural language processing and research.

This Special Issue aims to gather innovative approaches for the exploitation of data using semantic web technologies and linked data by bringing together practitioners, researchers, and scholars to share examples, use cases, theories, and analyses across different fields. Its main objective is to establish an internationally recognized forum for scientific research, with emphasis on crowdsourcing, the semantic web, knowledge integration, and data linking.

Dr. Daniela Gîfu
Dr. Kevin Cohen
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

natural language processing/text mining
data science/applied mathematics
knowledge integration
semantic web technologies
open linked data
crowdsourcing

Benefits of Publishing in a Special Issue

Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information is available in MDPI's Special Issue policies.

Published Papers (7 papers)

Research

23 pages, 3597 KB

Open Access Article

A Cloud-Based Sentiment Analysis System with a BERT Algorithm for Fake News on Twitter

by Nadire Cavus, Bora Oktekin and Murat Goksu

Appl. Sci. 2025, 15(20), 11046; https://doi.org/10.3390/app152011046 - 15 Oct 2025

Cited by 1 | Viewed by 1858

Abstract

The rapid spread of the global COVID-19 pandemic has rapidly changed people’s communication demands and shifted them to digital channels, thus increasing the use of social networks more than ever. However, the increased use of social networks has also led to emotional confusion that has emerged with the fake news problem. As a result of limited studies on fine-grained sentiment analysis of fake news, this study comprehensively presents a sentiment analysis of fake news across seven main categories. For this reason, this study aims to address this problem with a cloud-based system called SA-ES using the BERT algorithm to understand the emotional dimension of fake news spreading on Twitter (now X). In this context, the sentiment analysis of fake news has been examined in seven categories. The SA-ES system consists of 248,262 training datasets and 10,202 test datasets for testing and evaluation. The SA-ES system was trained with the BERT algorithm in two epochs during the modeling phase and reached 99% accuracy. We hope that the development of this SA-ES system will fill a gap in the literature and also help take measures toward a healthier society by analyzing the moods of people who share fake news on social networks.

(This article belongs to the Special Issue Natural Language Processing in the Era of Artificial Intelligence)

24 pages, 2306 KB

Open Access Article

Dual-Path Short Text Classification with Data Optimization

by Wei Li, Guangying Lv and Yunling He

Appl. Sci. 2025, 15(20), 11015; https://doi.org/10.3390/app152011015 - 14 Oct 2025

Viewed by 695

Abstract

In order to solve problems of fragmented information, missing context and difficult-to-capture feature information in short texts, this paper proposes a dual-path classification model combining word-level and sentence-level feature information. Our method is developing the BERT pre-trained model for obtaining word vectors, and presenting attention mechanisms and the BiGRU model to extract local key information and global semantic information, respectively. To tackle the difficulties of models focusing more on hard-to-learn samples during training, a novel hybrid loss function is constructed as an optimization objective, and to address common quality issues in training data, a text data optimization method that integrates data filtering and augmentation techniques is proposed. This method aims to further enhance model performance by improving the quality of input data. Experimental results on three different short text datasets show that our proposed model outperforms existing models (such as Att + BiGRU, BERT + At), with an average F1 score exceeding 90%. Moreover, the performance metrics of the model improved on the datasets optimized with the proposed data optimization method compared to the original datasets, demonstrating the effectiveness of this method in enhancing training data quality and improving model performance.

21 pages, 3788 KB

Open Access Article

Research on BBHL Model Based on Hybrid Loss Optimization for Fake News Detection

by Minghu Tang, Jiayi Zhang, Xuan Bu, Junjie Wang and Peng Luo

Appl. Sci. 2025, 15(18), 10028; https://doi.org/10.3390/app151810028 - 13 Sep 2025

Viewed by 1263

Abstract

With the rapid development of social media, the spread of fake news has become a significant issue affecting social stability. To address the problems of incomplete feature extraction and simplistic loss function design in traditional fake news detection, this paper proposes a BBHL model based on hybrid loss optimization. The model achieves deep extraction of text features by integrating BERT, Bi-LSTM, and attention mechanisms, and innovatively fuses binary cross-entropy (BCE) loss with contrastive loss to enhance feature discriminability and the model’s generalization ability. Experiments on the Weibo, Twitter, and Pheme datasets demonstrate that the BBHL model significantly outperforms baseline models such as EANN and MCNN in metrics including accuracy and F1-score. Ablation experiments verify the effectiveness of contrastive loss, providing a robust and generalizable solution for fake news detection.
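
As a rough illustration of the hybrid loss idea described in this abstract, the sketch below adds a pairwise contrastive term to binary cross-entropy. The function names, the `weight` and `margin` parameters, and the pairwise form of the contrastive term are assumptions made for illustration; the abstract does not specify the paper's exact formulation.

```python
import math


def bce_loss(probs, labels, eps=1e-12):
    """Mean binary cross-entropy over predicted probabilities."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def contrastive_loss(embeddings, labels, margin=1.0):
    """Mean pairwise contrastive loss (assumed form): same-label pairs are
    pulled together, different-label pairs are pushed at least `margin` apart."""
    total, pairs = 0.0, 0
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            d = math.dist(embeddings[i], embeddings[j])
            if labels[i] == labels[j]:
                total += d ** 2
            else:
                total += max(0.0, margin - d) ** 2
            pairs += 1
    return total / pairs if pairs else 0.0


def hybrid_loss(probs, embeddings, labels, weight=0.5, margin=1.0):
    """BCE plus a weighted contrastive term; `weight` is a hypothetical knob."""
    return bce_loss(probs, labels) + weight * contrastive_loss(
        embeddings, labels, margin
    )
```

In this sketch, `weight` controls how strongly the embedding geometry influences training relative to the classification term; setting it to zero recovers plain BCE.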

35 pages, 954 KB

Open Access Article

Beyond Manual Media Coding: Evaluating Large Language Models and Agents for News Content Analysis

by Stavros Doropoulos, Elisavet Karapalidou, Polychronis Charitidis, Sophia Karakeva and Stavros Vologiannidis

Appl. Sci. 2025, 15(14), 8059; https://doi.org/10.3390/app15148059 - 20 Jul 2025

Cited by 4 | Viewed by 3097

Abstract

The vast volume of media content, combined with the costs of manual annotation, challenges scalable codebook analysis and risks reducing decision-making accuracy. This study evaluates the effectiveness of large language models (LLMs) and multi-agent teams in structured media content analysis based on codebook-driven annotation. We construct a dataset of 200 news articles on U.S. tariff policies, manually annotated using a 26-question codebook encompassing 122 distinct codes, to establish a rigorous ground truth. Seven state-of-the-art LLMs, spanning low- to high-capacity tiers, are assessed under a unified zero-shot prompting framework incorporating role-based instructions and schema-constrained outputs. Experimental results show weighted global F1-scores between 0.636 and 0.822, with Claude-3-7-Sonnet achieving the highest direct-prompt performance. To examine the potential of agentic orchestration, we propose and develop a multi-agent system using Meta’s Llama 4 Maverick, incorporating expert role profiling, shared memory, and coordinated planning. This architecture improves the overall F1-score over the direct prompting baseline from 0.757 to 0.805 and demonstrates consistent gains across binary, categorical, and multi-label tasks, approaching commercial-level accuracy while maintaining a favorable cost–performance profile. These findings highlight the viability of LLMs, both in direct and agentic configurations, for automating structured content analysis.
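
For readers unfamiliar with the metric, a support-weighted F1-score can be computed as in the minimal sketch below. This is the common single-label definition (per-class F1 averaged with class frequencies as weights); whether the paper's "weighted global F1" aggregates in exactly this way is an assumption, and the function name is hypothetical.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to class support."""
    support = Counter(y_true)
    total = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp  # true instances of this class that were missed
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1
    return total / len(y_true)
```

Weighting by support keeps rare codes from dominating the aggregate, which matters when a codebook contains many infrequent categories.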

20 pages, 4707 KB

Open Access Article

Entropy-Optimized Dynamic Text Segmentation and RAG-Enhanced LLMs for Construction Engineering Knowledge Base

by Haiyuan Wang, Deli Zhang, Jianmin Li, Zelong Feng and Feng Zhang

Appl. Sci. 2025, 15(6), 3134; https://doi.org/10.3390/app15063134 - 13 Mar 2025

Cited by 4 | Viewed by 3978

Abstract

In the field of construction engineering, there exists a dynamic evolution of extensive technical standards and specifications (e.g., GB/T and ISO series) that permeate the entire lifecycle of design, construction, and operation–maintenance. These standards require continuous version iteration to adapt to technological innovations. Engineers require specialized knowledge bases to assist in understanding and updating these standards. The advancement of large language models (LLMs) and Retrieval-Augmented Generation (RAG) technologies provides robust technical support for constructing domain-specific knowledge bases. This study developed and tested a vertical domain knowledge base construction scheme based on RAG architecture and LLMs, comprising three critical components: entropy-optimized dynamic text segmentation (EDTS), vector correlation-based chunk ranking, and iterative optimization of prompt engineering. This study employs an EDTS method to ensure information clarity and predictability within limited chunk lengths, followed by selecting 10 relevant chunks to form prompts for input into LLMs, thereby enabling efficient retrieval of vertical domain knowledge. Experimental validation using Qwen-series LLMs with a test set of 101 expert-verified questions from Chinese construction industry standard demonstrates that the overall test accuracy reaches 76%. The comparative experiments across model scales (1.5B, 3B, 7B, 14B, 32B, and 72B) quantitatively reveal the relationship between model size, answer accuracy, and execution time, providing decision-making guidance for computational resource-accuracy tradeoffs in engineering practice.
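
The chunk-ranking step mentioned in this abstract can be pictured with the sketch below, which scores pre-embedded chunks against a query vector and keeps the top 10 for the prompt. Cosine similarity, the function names, and the prompt template are assumptions for illustration; the abstract does not state which vector correlation measure or prompt format the authors use.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_chunks(query_vec, chunks, k=10):
    """Return the texts of the k chunks most similar to the query.

    `chunks` is a list of (text, vector) pairs; embeddings are assumed
    to have been computed elsewhere.
    """
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question, selected_chunks):
    """Assemble a simple retrieval-augmented prompt (hypothetical template)."""
    context = "\n\n".join(selected_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}"
```

A fixed `k` keeps prompt length predictable, which is one reason a chunk count such as 10 might be chosen in practice.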

13 pages, 280 KB

Open Access Article

Under-Represented Speech Dataset from Open Data: Case Study on the Romanian Language

by Vasile Păiș, Verginica Barbu Mititelu, Elena Irimia, Radu Ion and Dan Tufiș

Appl. Sci. 2024, 14(19), 9043; https://doi.org/10.3390/app14199043 - 7 Oct 2024

Cited by 3 | Viewed by 2849

Abstract

This paper introduces the USPDATRO dataset. This is a speech dataset, in the Romanian language, constructed from open data, focusing on under-represented voice types (children, young and old people, and female voices). The paper covers the methodology behind the dataset construction, specific details regarding the dataset, and evaluation of existing Romanian Automatic Speech Recognition (ASR) systems, with different architectures. Results indicate that more under-represented speech content is needed in the training of ASR systems. Our approach can be extended to other low-resourced languages, as long as open data are available.

Review

38 pages, 620 KB

Open Access Review

Large Language Models for Early-Stage Software Project Estimation: A Systematic Mapping Study

by Łukasz Radliński and Jakub Swacha

Appl. Sci. 2025, 15(24), 13099; https://doi.org/10.3390/app152413099 - 12 Dec 2025

Cited by 1 | Viewed by 1352

Abstract

Accurate estimation of software project characteristics during the early stages of development remains a constant challenge in software projects. Recent research suggests that large language models (LLMs) offer new opportunities to support such estimation tasks through their ability to interpret natural language specifications and extract contextual information from project descriptions. This paper presents a mapping study providing an overview of research on the applications of LLMs in early software project estimation. Thirty primary studies were systematically identified and categorised to examine estimation targets, used models, reference and supportive techniques, as well as applied evaluation measures. The obtained results provide insights into the methodological considerations, limitations, and challenges associated with LLM-based estimation approaches. The obtained findings inform both researchers and practitioners about the current state and potential of LLMs for supporting early-stage software project estimation.
