Artificial Intelligence (AI) and Natural Language Processing (NLP)

A special issue of Future Internet (ISSN 1999-5903). This special issue belongs to the section "Big Data and Augmented Intelligence".

Deadline for manuscript submissions: 20 July 2026 | Viewed by 2938

Special Issue Editors

Guest Editor
School of Computer Science, University of Sunderland, Sunderland SR1 3SD, UK
Interests: artificial intelligence; natural language processing; data science; large language models; NLP applications; AI ethics and privacy issues; cybersecurity

Guest Editor
School of Computing, Goldsmiths University of London, London SE14 6NW, UK
Interests: artificial intelligence; natural language processing; time-series forecasting; algorithms; Internet of Things (IoT) systems; real-time communication systems; healthcare technology innovation

Guest Editor
School of Computer Science, University of Sunderland, Sunderland SR1 3SD, UK
Interests: smart systems and digital healthcare; artificial intelligence; machine learning; wireless sensor network

Special Issue Information

Dear Colleagues,

Artificial intelligence (AI) and natural language processing (NLP) are at the forefront of transformative technologies reshaping the way we interact with computers and automated machines. AI encompasses the broader discipline of developing systems capable of performing tasks that typically require human intelligence, such as reasoning, decision-making, categorization, prediction, and learning. NLP, a subfield of AI, focuses on enabling computers to understand, analyze, interpret, translate, generate, and respond to human language in a meaningful way.

Recent advancements in machine learning, especially deep learning, and in large language models have significantly enhanced the capabilities of NLP applications, which range from real-time translation, dialogue systems, and human–machine conversation to intelligent virtual assistants and automated content generation. These developments are opening new frontiers in many non-computing disciplines such as healthcare, education, business, economics, politics, and manufacturing.

For this Special Issue, we invite researchers, practitioners, and academics to contribute original research articles, reviews, and case studies that explore the latest innovations, challenges, and future directions in NLP and AI. Submissions may include theoretical insights, practical implementations, interdisciplinary approaches, and novel applications. This Special Issue will cover the following topics (though this list is not exhaustive):

  1. Machine Learning and Deep Learning for NLP
    • Machine learning and deep learning methods;
    • Implementation, integration, testing, and deployment issues of AI and NLP systems;
    • Transfer learning and pre-training techniques.
  2. Language Modeling and Generation
    • Text summarization;
    • Machine translation;
    • Text generation and completion.
  3. Speech and Audio Processing
    • Speech recognition and synthesis;
    • Multimodal NLP (audio + text).
  4. Semantic Analysis
    • Named entity recognition (NER);
    • Word sense disambiguation;
    • Semantic similarity and entailment.
  5. Information Extraction and Retrieval
    • Knowledge graph construction;
    • Question answering systems;
    • Search engines and semantic search.
  6. Emerging Topics
    • Large language models (LLMs);
    • Scaling laws, model compression, prompt engineering;
    • Safety, alignment, and interpretability.
  7. Multimodal AI
    • Vision–language integration (e.g., image captioning, VQA);
    • Cross-modal retrieval.
  8. Ethics, Fairness, and Bias in NLP/AI
    • Algorithmic fairness;
    • Mitigation of harmful content;
    • Transparency and accountability.
  9. Low-Resource and Multilingual NLP
    • Zero-shot and few-shot learning;
    • Cross-lingual transfer.
  10. Human–AI Collaboration
    • Co-creative systems;
    • Conversational agents and chatbots;
    • Explainable AI in NLP.
  11. Application-Focused Areas
    • Healthcare;
    • Clinical NLP;
    • Medical report generation and classification;
    • NLP for finance.
  12. Education and E-learning
    • Automated essay scoring;
    • Intelligent tutoring systems;
    • Assessment and plagiarism detection.
  13. Legal and Financial AI
    • Contract analysis;
    • Document classification and summarization;
    • Privacy, security, trust, and ethical issues in AI;
    • Legal implications of AI models.
  14. Social Media and Sentiment Analysis
    • Misinformation detection;
    • Opinion mining.
  15. Robotics and Human–Robot Interaction
    • Natural language instruction;
    • Context-aware dialogue systems.

Alternatively, you may submit to our Joint Special Issue in Big Data and Cognitive Computing.

Dr. Sardar Jaf
Dr. Basel Barakat
Prof. Dr. Yongqiang Cheng
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Future Internet is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss francs). Submitted papers should be well formatted and written in good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • artificial intelligence
  • AI
  • natural language processing
  • NLP
  • language models
  • large language model
  • speech processing
  • text processing
  • language processing

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found on the MDPI website.

Published Papers (4 papers)


Research

19 pages, 611 KB  
Article
Prompt-Driven and Kubernetes Error Report-Aware Container Orchestration
by Niklas Beuter, André Drews and Nane Kratzke
Future Internet 2025, 17(9), 416; https://doi.org/10.3390/fi17090416 - 11 Sep 2025
Viewed by 291
Abstract
Background: Container orchestration systems like Kubernetes rely heavily on declarative manifest files, which serve as orchestration blueprints. However, managing these manifest files is often complex and requires substantial DevOps expertise. Methodology: This study investigates the use of Large Language Models (LLMs) to automate the creation of Kubernetes manifest files from natural language specifications, utilizing prompt engineering techniques within an innovative error- and warning-report-aware refinement process. We assess the capabilities of these LLMs using Zero-Shot, Few-Shot, Prompt-Chaining, and Self-Refine methods to address DevOps needs and support fully automated deployment pipelines. Results: Our findings show that LLMs can generate Kubernetes manifests with varying levels of manual intervention. Notably, GPT-4 and GPT-3.5 demonstrate strong potential for deployment automation. Interestingly, smaller models sometimes outperform larger ones, challenging the assumption that larger models always yield better results. Conclusions: This research highlights the crucial impact of prompt engineering on LLM performance for Kubernetes tasks and recommends further exploration of prompt techniques and model comparisons, outlining a promising path for integrating LLMs into automated deployment workflows.
(This article belongs to the Special Issue Artificial Intelligence (AI) and Natural Language Processing (NLP))
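The error-report-aware Self-Refine loop this abstract describes can be sketched as follows. This is a minimal illustration, not the authors' pipeline: `call_llm` and `kubernetes_errors` are hypothetical stubs standing in for a real LLM API call and a `kubectl apply --dry-run` validation step.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API call (e.g., GPT-4)."""
    # For illustration, return a fixed, well-formed manifest; a real
    # implementation would send `prompt` to a model endpoint.
    return (
        "apiVersion: apps/v1\n"
        "kind: Deployment\n"
        "metadata:\n"
        "  name: web\n"
        "spec:\n"
        "  replicas: 2\n"
    )

def kubernetes_errors(manifest: str):
    """Toy stand-in for an error report from `kubectl apply --dry-run`."""
    required = ["apiVersion", "kind", "metadata", "spec"]
    return [f"missing field: {field}" for field in required if field not in manifest]

def self_refine(spec: str, max_rounds: int = 3) -> str:
    """Generate a manifest, then feed error reports back until it validates."""
    prompt = f"Generate a Kubernetes manifest for: {spec}"
    manifest = call_llm(prompt)
    for _ in range(max_rounds):
        errors = kubernetes_errors(manifest)
        if not errors:
            break
        feedback = "Fix these errors:\n" + "\n".join(errors)
        manifest = call_llm(prompt + "\n" + feedback)
    return manifest
```

In a real pipeline the feedback string would carry the actual kubectl error and warning report back into the next prompt, which is the refinement process the paper evaluates.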

28 pages, 734 KB  
Article
GPT-4.1 Sets the Standard in Automated Experiment Design Using Novel Python Libraries
by Nuno Fachada, Daniel Fernandes, Carlos M. Fernandes, Bruno D. Ferreira-Saraiva and João P. Matos-Carvalho
Future Internet 2025, 17(9), 412; https://doi.org/10.3390/fi17090412 - 8 Sep 2025
Viewed by 399
Abstract
Large language models (LLMs) have advanced rapidly as tools for automating code generation in scientific research, yet their ability to interpret and use unfamiliar Python APIs for complex computational experiments remains poorly characterized. This study systematically benchmarks a selection of state-of-the-art LLMs in generating functional Python code for two increasingly challenging scenarios: conversational data analysis with the ParShift library, and synthetic data generation and clustering using pyclugen and scikit-learn. Both experiments use structured, zero-shot prompts specifying detailed requirements but omitting in-context examples. Model outputs are evaluated quantitatively for functional correctness and prompt compliance over multiple runs, and qualitatively by analyzing the errors produced when code execution fails. Results show that only a small subset of models consistently generate correct, executable code. GPT-4.1 achieved a 100% success rate across all runs in both experimental tasks, whereas most other models succeeded in fewer than half of the runs, with only Grok-3 and Mistral-Large approaching comparable performance. In addition to benchmarking LLM performance, this approach helps identify shortcomings in third-party libraries, such as unclear documentation or obscure implementation bugs. Overall, these findings highlight current limitations of LLMs for end-to-end scientific automation and emphasize the need for careful prompt design, comprehensive library documentation, and continued advances in language model capabilities.
(This article belongs to the Special Issue Artificial Intelligence (AI) and Natural Language Processing (NLP))
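A benchmarking harness of the kind the abstract describes (structured zero-shot prompts, functional-correctness scoring over multiple runs) might be sketched as below; `build_zero_shot_prompt` and `functional_success_rate` are illustrative names, not the study's code.

```python
def build_zero_shot_prompt(task: str, requirements: list) -> str:
    """Structured zero-shot prompt: explicit requirements, no in-context examples."""
    lines = [f"Task: {task}", "Requirements:"]
    lines += [f"- {req}" for req in requirements]
    lines.append("Return only runnable Python code.")
    return "\n".join(lines)

def functional_success_rate(candidates: list) -> float:
    """Fraction of generated code strings that execute without raising."""
    successes = 0
    for code in candidates:
        try:
            exec(code, {})  # run each candidate in a fresh, empty namespace
            successes += 1
        except Exception:
            pass  # execution failures count against the model
    return successes / len(candidates)
```

A real harness would also check prompt compliance (e.g., that the output calls the required library functions) and sandbox the execution, rather than using a bare `exec`.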

28 pages, 1711 KB  
Article
Identifying Literary Microgenres and Writing Style Differences in Romanian Novels with ReaderBench and Large Language Models
by Aura Cristina Udrea, Stefan Ruseti, Vlad Pojoga, Stefan Baghiu, Andrei Terian and Mihai Dascalu
Future Internet 2025, 17(9), 397; https://doi.org/10.3390/fi17090397 - 30 Aug 2025
Viewed by 527
Abstract
Recent developments in natural language processing, particularly large language models (LLMs), create new opportunities for literary analysis in underexplored languages like Romanian. This study investigates stylistic heterogeneity and genre blending in 175 late 19th- and early 20th-century Romanian novels, each classified by literary historians into one of 17 genres. Our findings reveal that most novels do not adhere to a single genre label but instead combine elements of multiple (micro)genres, challenging traditional single-label classification approaches. We employed a dual computational methodology, combining analysis based on Romanian-tailored linguistic features with general-purpose LLMs. ReaderBench, a Romanian-specific framework, was utilized to extract surface, syntactic, semantic, and discourse features, capturing fine-grained linguistic patterns. Alternatively, we prompted two LLMs (Llama3.3 70B and DeepSeek-R1 70B) to predict genres at the paragraph level, leveraging their ability to detect contextual and thematic coherence across multiple narrative scales. Statistical analyses using Kruskal–Wallis and Mann–Whitney tests identified genre-defining features at both novel and chapter levels. The integration of these complementary approaches enhances microgenre detection beyond traditional classification capabilities. ReaderBench provides quantifiable linguistic evidence, while LLMs capture broader contextual patterns; together, they provide a multi-layered perspective on literary genre that reflects the complex and heterogeneous character of fictional texts. Our results argue that both language-specific and general-purpose computational tools can effectively detect stylistic diversity in Romanian fiction, opening new avenues for computational literary analysis in low-resourced languages.
(This article belongs to the Special Issue Artificial Intelligence (AI) and Natural Language Processing (NLP))
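The Mann–Whitney test used for genre-defining features is available off the shelf (e.g., `scipy.stats.mannwhitneyu`); as a self-contained sketch, the U statistic itself reduces to rank sums and can be computed in plain Python:

```python
def ranks(values):
    """Average ranks (1-based), with ties sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        # extend j over a run of tied values
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def mann_whitney_u(a, b):
    """U statistic: min(U1, U2) over the two samples."""
    r = ranks(list(a) + list(b))
    r1 = sum(r[: len(a)])                      # rank sum of sample a
    u1 = r1 - len(a) * (len(a) + 1) / 2
    u2 = len(a) * len(b) - u1
    return min(u1, u2)
```

Completely separated samples give U = 0, the strongest evidence of a distributional difference; the p-value step (normal approximation or exact tables) is what library implementations add on top.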

33 pages, 4233 KB  
Article
A Comparative Study of PEGASUS, BART, and T5 for Text Summarization Across Diverse Datasets
by Eman Daraghmi, Lour Atwe and Areej Jaber
Future Internet 2025, 17(9), 389; https://doi.org/10.3390/fi17090389 - 28 Aug 2025
Viewed by 1133
Abstract
This study aims to conduct a comprehensive comparative evaluation of three transformer-based models, PEGASUS, BART, and T5 variants (SMALL and BASE), for the task of abstractive text summarization. The evaluation spans three benchmark datasets: CNN/DailyMail (long-form news articles), Xsum (extreme single-sentence summaries of BBC articles), and Samsum (conversational dialogues). Each dataset presents unique challenges in terms of length, style, and domain, enabling a robust assessment of the models’ capabilities. All models were fine-tuned under controlled experimental settings using filtered and preprocessed subsets, with token length limits applied to maintain consistency and prevent truncation. The evaluation leveraged ROUGE-1, ROUGE-2, and ROUGE-L scores to measure summary quality, while efficiency metrics such as training time were also considered. An additional qualitative assessment was conducted through expert human evaluation of fluency, relevance, and conciseness. Results indicate that PEGASUS achieved the highest ROUGE scores on CNN/DailyMail, BART excelled on Xsum and Samsum, while T5 models, particularly T5-Base, narrowed the performance gap with larger models while still offering efficiency advantages compared to PEGASUS and BART. These findings highlight the trade-offs between model performance and computational efficiency, offering practical insights into model scaling, where T5-Small favors lightweight efficiency and T5-Base provides stronger accuracy without excessive resource demands.
(This article belongs to the Special Issue Artificial Intelligence (AI) and Natural Language Processing (NLP))
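ROUGE-1, the headline metric in this study, scores unigram overlap between a candidate summary and a reference. A minimal sketch with whitespace tokenization follows; production implementations (e.g., the `rouge-score` package) add stemming and more careful tokenization:

```python
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

ROUGE-2 replaces unigrams with bigrams, and ROUGE-L scores the longest common subsequence instead of n-gram counts; all three reward lexical overlap rather than semantic equivalence, which is why the study pairs them with human evaluation.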
