Artificial Intelligence (AI) and Natural Language Processing (NLP)

A special issue of Future Internet (ISSN 1999-5903). This special issue belongs to the section "Big Data and Augmented Intelligence".

Deadline for manuscript submissions: 20 July 2026 | Viewed by 2938

Special Issue Editors

Guest Editor
School of Computer Science, University of Sunderland, Sunderland SR1 3SD, UK
Interests: artificial intelligence; natural language processing; data science; large language models; NLP applications; AI ethics and privacy issues; cybersecurity

Guest Editor
School of Computing, Goldsmiths University of London, London SE14 6NW, UK
Interests: artificial intelligence; natural language processing; time-series forecasting; algorithms; Internet of Things (IoT) systems; real-time communication systems; healthcare technology innovation

Guest Editor
School of Computer Science, University of Sunderland, Sunderland SR1 3SD, UK
Interests: smart systems and digital healthcare; artificial intelligence; machine learning; wireless sensor network

Special Issue Information

Dear Colleagues,

Artificial intelligence (AI) and natural language processing (NLP) are at the forefront of transformative technologies reshaping the way we interact with computers and automated machines. AI encompasses the broader discipline of developing systems capable of performing tasks that typically require human intelligence, such as reasoning, decision-making, categorization, prediction, and learning. NLP, a subfield of AI, focuses on enabling computers to understand, analyze, interpret, translate, generate, and respond to human language in a meaningful way.

Recent advancements in machine learning, especially deep learning, and in large language models have significantly enhanced the capabilities of NLP applications, which range from real-time translation, dialogue systems, and human–machine conversation to intelligent virtual assistants and automated content generation. These developments are opening new frontiers in many non-computing disciplines such as healthcare, education, business, economics, politics, and manufacturing.

For this Special Issue, we invite researchers, practitioners, and academics to contribute original research articles, reviews, and case studies that explore the latest innovations, challenges, and future directions in NLP and AI. Submissions may include theoretical insights, practical implementations, interdisciplinary approaches, and novel applications. This Special Issue will cover the following topics (though this list is not exhaustive):

  1. Machine Learning and Deep Learning for NLP
    • Machine learning and deep learning methods;
    • Implementation, integration, testing, and deployment issues of AI and NLP systems;
    • Transfer learning and pre-training techniques.
  2. Language Modeling and Generation
    • Text summarization;
    • Machine translation;
    • Text generation and completion.
  3. Speech and Audio Processing
    • Speech recognition and synthesis;
    • Multimodal NLP (audio + text).
  4. Semantic Analysis
    • Named entity recognition (NER);
    • Word sense disambiguation;
    • Semantic similarity and entailment.
  5. Information Extraction and Retrieval
    • Knowledge graph construction;
    • Question answering systems;
    • Search engines and semantic search.
  6. Emerging Topics
    • Large language models (LLMs);
    • Scaling laws, model compression, prompt engineering;
    • Safety, alignment, and interpretability.
  7. Multimodal AI
    • Vision–language integration (e.g., image captioning, VQA);
    • Cross-modal retrieval.
  8. Ethics, Fairness, and Bias in NLP/AI
    • Algorithmic fairness;
    • Mitigation of harmful content;
    • Transparency and accountability.
  9. Low-Resource and Multilingual NLP
    • Zero-shot and few-shot learning;
    • Cross-lingual transfer.
  10. Human–AI Collaboration
    • Co-creative systems;
    • Conversational agents and chatbots;
    • Explainable AI in NLP.
  11. Application-Focused Areas
    • Healthcare;
    • Clinical NLP;
    • Medical report generation and classification;
    • NLP for finance.
  12. Education and E-learning
    • Automated essay scoring;
    • Intelligent tutoring systems;
    • Assessment and plagiarism detection.
  13. Legal and Financial AI
    • Contract analysis;
    • Document classification and summarization;
    • Privacy, security, trust, and ethical issues in AI;
    • Legal implications of AI models.
  14. Social Media and Sentiment Analysis
    • Misinformation detection;
    • Opinion mining.
  15. Robotics and Human–Robot Interaction
    • Natural language instruction;
    • Context-aware dialogue systems.

Alternatively, you may submit to our Joint Special Issue in Big Data and Cognitive Computing.

Dr. Sardar Jaf
Dr. Basel Barakat
Prof. Dr. Yongqiang Cheng
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Future Internet is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss francs). Submitted papers should be well formatted and written in good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • artificial intelligence
  • AI
  • natural language processing
  • NLP
  • language models
  • large language model
  • speech processing
  • text processing
  • language processing

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found on the MDPI website.

Published Papers (4 papers)


Research

19 pages, 611 KB  
Article
Prompt-Driven and Kubernetes Error Report-Aware Container Orchestration
by Niklas Beuter, André Drews and Nane Kratzke
Future Internet 2025, 17(9), 416; https://doi.org/10.3390/fi17090416 - 11 Sep 2025
Viewed by 291
Abstract
Background: Container orchestration systems like Kubernetes rely heavily on declarative manifest files, which serve as orchestration blueprints. However, managing these manifest files is often complex and requires substantial DevOps expertise. Methodology: This study investigates the use of Large Language Models (LLMs) to automate the creation of Kubernetes manifest files from natural language specifications, utilizing prompt engineering techniques within an innovative error- and warning-report-aware refinement process. We assess the capabilities of these LLMs using Zero-Shot, Few-Shot, Prompt-Chaining, and Self-Refine methods to address DevOps needs and support fully automated deployment pipelines. Results: Our findings show that LLMs can generate Kubernetes manifests with varying levels of manual intervention. Notably, GPT-4 and GPT-3.5 demonstrate strong potential for deployment automation. Interestingly, smaller models sometimes outperform larger ones, challenging the assumption that larger models always yield better results. Conclusions: This research highlights the crucial impact of prompt engineering on LLM performance for Kubernetes tasks and recommends further exploration of prompt techniques and model comparisons, outlining a promising path for integrating LLMs into automated deployment workflows.
(This article belongs to the Special Issue Artificial Intelligence (AI) and Natural Language Processing (NLP))
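The error-report-aware Self-Refine loop this abstract describes can be sketched as follows. This is a minimal illustration, not the authors' pipeline: `call_llm` and `kubernetes_errors` are hypothetical stubs standing in for a real LLM API call and a `kubectl apply --dry-run` validation step.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API call (e.g., GPT-4)."""
    # For illustration, return a fixed, well-formed manifest; a real
    # implementation would send `prompt` to a model endpoint.
    return (
        "apiVersion: apps/v1\n"
        "kind: Deployment\n"
        "metadata:\n"
        "  name: web\n"
        "spec:\n"
        "  replicas: 2\n"
    )

def kubernetes_errors(manifest: str):
    """Toy stand-in for an error report from `kubectl apply --dry-run`."""
    required = ["apiVersion", "kind", "metadata", "spec"]
    return [f"missing field: {field}" for field in required if field not in manifest]

def self_refine(spec: str, max_rounds: int = 3) -> str:
    """Generate a manifest, then feed error reports back until it validates."""
    prompt = f"Generate a Kubernetes manifest for: {spec}"
    manifest = call_llm(prompt)
    for _ in range(max_rounds):
        errors = kubernetes_errors(manifest)
        if not errors:
            break
        feedback = "Fix these errors:\n" + "\n".join(errors)
        manifest = call_llm(prompt + "\n" + feedback)
    return manifest
```

In a real pipeline the feedback string would carry the actual kubectl error and warning report back into the next prompt, which is the refinement process the paper evaluates.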

28 pages, 734 KB  
Article
GPT-4.1 Sets the Standard in Automated Experiment Design Using Novel Python Libraries
by Nuno Fachada, Daniel Fernandes, Carlos M. Fernandes, Bruno D. Ferreira-Saraiva and João P. Matos-Carvalho
Future Internet 2025, 17(9), 412; https://doi.org/10.3390/fi17090412 - 8 Sep 2025
Viewed by 399
Abstract
Large language models (LLMs) have advanced rapidly as tools for automating code generation in scientific research, yet their ability to interpret and use unfamiliar Python APIs for complex computational experiments remains poorly characterized. This study systematically benchmarks a selection of state-of-the-art LLMs in generating functional Python code for two increasingly challenging scenarios: conversational data analysis with the ParShift library, and synthetic data generation and clustering using pyclugen and scikit-learn. Both experiments use structured, zero-shot prompts specifying detailed requirements but omitting in-context examples. Model outputs are evaluated quantitatively for functional correctness and prompt compliance over multiple runs, and qualitatively by analyzing the errors produced when code execution fails. Results show that only a small subset of models consistently generate correct, executable code. GPT-4.1 achieved a 100% success rate across all runs in both experimental tasks, whereas most other models succeeded in fewer than half of the runs, with only Grok-3 and Mistral-Large approaching comparable performance. In addition to benchmarking LLM performance, this approach helps identify shortcomings in third-party libraries, such as unclear documentation or obscure implementation bugs. Overall, these findings highlight current limitations of LLMs for end-to-end scientific automation and emphasize the need for careful prompt design, comprehensive library documentation, and continued advances in language model capabilities.
(This article belongs to the Special Issue Artificial Intelligence (AI) and Natural Language Processing (NLP))
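A benchmarking harness of the kind the abstract describes (structured zero-shot prompts, functional-correctness scoring over multiple runs) might be sketched as below; `build_zero_shot_prompt` and `functional_success_rate` are illustrative names, not the study's code.

```python
def build_zero_shot_prompt(task: str, requirements: list) -> str:
    """Structured zero-shot prompt: explicit requirements, no in-context examples."""
    lines = [f"Task: {task}", "Requirements:"]
    lines += [f"- {req}" for req in requirements]
    lines.append("Return only runnable Python code.")
    return "\n".join(lines)

def functional_success_rate(candidates: list) -> float:
    """Fraction of generated code strings that execute without raising."""
    successes = 0
    for code in candidates:
        try:
            exec(code, {})  # run each candidate in a fresh, empty namespace
            successes += 1
        except Exception:
            pass  # execution failures count against the model
    return successes / len(candidates)
```

A real harness would also check prompt compliance (e.g., that the output calls the required library functions) and sandbox the execution, rather than using a bare `exec`.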

28 pages, 1711 KB  
Article
Identifying Literary Microgenres and Writing Style Differences in Romanian Novels with ReaderBench and Large Language Models
by Aura Cristina Udrea, Stefan Ruseti, Vlad Pojoga, Stefan Baghiu, Andrei Terian and Mihai Dascalu
Future Internet 2025, 17(9), 397; https://doi.org/10.3390/fi17090397 - 30 Aug 2025
Viewed by 527
Abstract
Recent developments in natural language processing, particularly large language models (LLMs), create new opportunities for literary analysis in underexplored languages like Romanian. This study investigates stylistic heterogeneity and genre blending in 175 late 19th- and early 20th-century Romanian novels, each classified by literary historians into one of 17 genres. Our findings reveal that most novels do not adhere to a single genre label but instead combine elements of multiple (micro)genres, challenging traditional single-label classification approaches. We employed a dual computational methodology, combining analysis based on Romanian-tailored linguistic features with general-purpose LLMs. ReaderBench, a Romanian-specific framework, was utilized to extract surface, syntactic, semantic, and discourse features, capturing fine-grained linguistic patterns. Alternatively, we prompted two LLMs (Llama3.3 70B and DeepSeek-R1 70B) to predict genres at the paragraph level, leveraging their ability to detect contextual and thematic coherence across multiple narrative scales. Statistical analyses using Kruskal–Wallis and Mann–Whitney tests identified genre-defining features at both novel and chapter levels. The integration of these complementary approaches enhances microgenre detection beyond traditional classification capabilities. ReaderBench provides quantifiable linguistic evidence, while LLMs capture broader contextual patterns; together, they provide a multi-layered perspective on literary genre that reflects the complex and heterogeneous character of fictional texts. Our results argue that both language-specific and general-purpose computational tools can effectively detect stylistic diversity in Romanian fiction, opening new avenues for computational literary analysis in low-resourced languages.
(This article belongs to the Special Issue Artificial Intelligence (AI) and Natural Language Processing (NLP))
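The Mann–Whitney test used for genre-defining features is available off the shelf (e.g., `scipy.stats.mannwhitneyu`); as a self-contained sketch, the U statistic itself reduces to rank sums and can be computed in plain Python:

```python
def ranks(values):
    """Average ranks (1-based), with ties sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        # extend j over a run of tied values
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def mann_whitney_u(a, b):
    """U statistic: min(U1, U2) over the two samples."""
    r = ranks(list(a) + list(b))
    r1 = sum(r[: len(a)])                      # rank sum of sample a
    u1 = r1 - len(a) * (len(a) + 1) / 2
    u2 = len(a) * len(b) - u1
    return min(u1, u2)
```

Completely separated samples give U = 0, the strongest evidence of a distributional difference; the p-value step (normal approximation or exact tables) is what library implementations add on top.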

33 pages, 4233 KB  
Article
A Comparative Study of PEGASUS, BART, and T5 for Text Summarization Across Diverse Datasets
by Eman Daraghmi, Lour Atwe and Areej Jaber
Future Internet 2025, 17(9), 389; https://doi.org/10.3390/fi17090389 - 28 Aug 2025
Viewed by 1133
Abstract
This study aims to conduct a comprehensive comparative evaluation of three transformer-based models, PEGASUS, BART, and T5 variants (SMALL and BASE), for the task of abstractive text summarization. The evaluation spans three benchmark datasets: CNN/DailyMail (long-form news articles), Xsum (extreme single-sentence summaries of BBC articles), and Samsum (conversational dialogues). Each dataset presents unique challenges in terms of length, style, and domain, enabling a robust assessment of the models’ capabilities. All models were fine-tuned under controlled experimental settings using filtered and preprocessed subsets, with token length limits applied to maintain consistency and prevent truncation. The evaluation leveraged ROUGE-1, ROUGE-2, and ROUGE-L scores to measure summary quality, while efficiency metrics such as training time were also considered. An additional qualitative assessment was conducted through expert human evaluation of fluency, relevance, and conciseness. Results indicate that PEGASUS achieved the highest ROUGE scores on CNN/DailyMail, BART excelled on Xsum and Samsum, while T5 models, particularly T5-Base, narrowed the performance gap with larger models while still offering efficiency advantages compared to PEGASUS and BART. These findings highlight the trade-offs between model performance and computational efficiency, offering practical insights into model scaling, where T5-Small favors lightweight efficiency and T5-Base provides stronger accuracy without excessive resource demands.
(This article belongs to the Special Issue Artificial Intelligence (AI) and Natural Language Processing (NLP))
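ROUGE-1, the headline metric in this study, scores unigram overlap between a candidate summary and a reference. A minimal sketch with whitespace tokenization follows; production implementations (e.g., the `rouge-score` package) add stemming and more careful tokenization:

```python
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

ROUGE-2 replaces unigrams with bigrams, and ROUGE-L scores the longest common subsequence instead of n-gram counts; all three reward lexical overlap rather than semantic equivalence, which is why the study pairs them with human evaluation.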
