MDPI - Publisher of Open Access Journals

21 pages, 2357 KB

Open Access Article

Integrating Thesaurus-Based Knowledge into Transformer Models for Semantic Understanding of Domain-Specific Texts

by Bayangali Abdygalym, Saule Tazhibayeva, Madina Sambetbayeva, Aigerim Yerimbetova, Roman Taberkhan, Manzura Abjalova, Aidos Sabdenov and Elmira Daiyrbayeva

Computers 2026, 15(5), 297; https://doi.org/10.3390/computers15050297 - 7 May 2026

Viewed by 233

Abstract

Integrating structured linguistic resources into deep learning architectures represents a key challenge in domain-oriented NLP. This study proposes a framework for incorporating knowledge from a military thesaurus of the Ground Forces, structured according to the XML Zthes standard, into pre-trained transformer language models, including KazBERT, multilingual BERT, and XLM-RoBERTa. The approach addresses two interrelated tasks in specialized terminology processing: concept linking and semantic search. Unlike existing knowledge-injection methods designed primarily for general-domain applications, this framework formalizes the mapping of Zthes elements, such as Term, Broader term, Narrower term, Related term, ScopeNote, Language, and Source, into structured textual representations that can be directly processed by transformer architectures. Fine-tuning is conducted on a dataset of 18,400 training instances automatically generated from the thesaurus, including synonym pairs, hierarchical relations (hyperonymy and hyponymy), associative links, and definitional descriptions. Experimental evaluation demonstrates that thesaurus-enriched models outperform baseline architectures across all major metrics. The XLM-RoBERTa model achieves F1 = 0.84 and Top-5 accuracy = 0.94 in the concept linking task, representing a five-point improvement over the baseline. The model reaches Macro-F1 = 0.84 across four relation types. Results obtained on a specialized test set derived from terminology databases of Kazakhstan’s Armed Forces confirm robust cross-lingual generalization across Kazakh, Russian and English military discourse. Full article

35 pages, 2657 KB

Open Access Article

Mitigating Metamorphic Malware Through Adversarial Learning Techniques

by Kehinde O. Babaagba and Zhiyuan Tan

Network 2026, 6(2), 22; https://doi.org/10.3390/network6020022 - 8 Apr 2026

Viewed by 596

Abstract

Antivirus (AV) solutions remain a core defence mechanism against malicious software. However, many of these engines struggle to detect metamorphic malware, which continually alters its internal form in unpredictable ways. To address this limitation, we present an adversarially oriented approach that automatically generates novel malicious variants of existing malware that evade detection by a substantial proportion of AV systems, thereby providing material for strengthening defensive techniques. In this work, an Evolutionary Algorithm (EA) is used to evolve undetectable variants, guided by three fitness criteria: the evasiveness of the produced samples, and their behavioural and structural similarity to the original malware. The proposed method is assessed across three malware families to evaluate the effectiveness of the EA-generated variants. Results indicate that the EA produces diverse mutant variants capable of evading up to 94% of AV detectors for a given malware family, significantly surpassing the evasion rate of the original malware. Furthermore, we evaluated whether the mutants produced by the EA could enhance the training of machine learning models. In this context, a pretrained Natural Language Processing (NLP) transformer was employed within a transfer learning framework to improve the classification of metamorphic malware. When the evolved variants were incorporated into the training data, the approach achieved classification accuracies of up to 93%. These results highlight the value of using diverse EA-generated samples to strengthen malware classifiers, thereby improving the robustness of security systems against evolving threats. Full article

14 pages, 449 KB

Open Access Article

Natural Language Processing-Based Triage of Superficial Soft Tissue Ultrasound Reports in Orthopedic Practice

by Nuri Koray Ülgen, Mevlüt Aytaç Demir, Ali Said Nazlıgül, Nihat Yiğit, Sadık Emre Erginoğlu, Ünal Demir and Mehmet Orçun Akkurt

Diagnostics 2026, 16(7), 1068; https://doi.org/10.3390/diagnostics16071068 - 2 Apr 2026

Viewed by 447

Abstract

Background/Objectives: Natural language processing (NLP) has emerged as a promising approach for extracting clinically meaningful information from unstructured radiology reports. While most artificial intelligence applications in musculoskeletal imaging focus on image-based analysis, the potential of NLP for urgency assessment in superficial soft tissue ultrasound reports remains underexplored. This study aimed to develop and evaluate an NLP-based triage model to classify superficial soft tissue ultrasound reports according to clinical urgency in orthopedic practice. Methods: A curated dataset of superficial soft tissue ultrasound reports requested for palpable soft tissue masses and subcutaneous swellings was retrospectively collected from routine orthopedic outpatient practice. Reports were manually annotated into three triage categories: non-pathological (GREEN), non-urgent pathological (YELLOW), and urgent or potentially urgent findings (RED). A pretrained Turkish BERT model was fine-tuned for three-class classification. Model performance was evaluated using accuracy, macro-averaged F1 score, per-class precision and recall, and confusion matrices. An independent dataset of previously unseen reports was additionally used to assess robustness under real-world conditions. Results: After preprocessing and deduplication, 394 unique report segments were included. The baseline BERT model achieved an accuracy of 92.5% and a macro-averaged F1 score of 0.9106 on the test set. High classification performance was observed across all classes, with particularly reliable detection of RED reports representing urgent clinical conditions. External evaluation on independent reports demonstrated high agreement with physician annotations, with discrepancies mainly occurring in borderline or indeterminate cases. Conclusions: This study demonstrates that NLP-based analysis of superficial soft tissue ultrasound reports can effectively support urgency assessment in orthopedic practice. 
The proposed approach offers a practical, scalable, and image-independent solution for triage, with potential to improve workflow efficiency and facilitate timely clinical decision-making in musculoskeletal imaging. Full article

(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
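The macro-averaged F1 score reported in the abstract above averages per-class F1 values with equal weight, so a rare class such as RED counts as much as a frequent one. The following is a minimal sketch of that calculation, assuming the standard definition F1 = 2TP / (2TP + FP + FN); the function name `macro_f1` and the toy labels are illustrative and are not taken from the article.

```python
from collections import Counter


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # A class that never appears and is never predicted scores 0.0 here.
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with the three triage categories (made-up labels).
LABELS = ["GREEN", "YELLOW", "RED"]
y_true = ["GREEN", "GREEN", "YELLOW", "YELLOW", "RED", "RED"]
y_pred = ["GREEN", "YELLOW", "YELLOW", "YELLOW", "RED", "RED"]
print(round(macro_f1(y_true, y_pred, LABELS), 4))
```

In this toy case one GREEN report is misclassified as YELLOW, which lowers the F1 of both classes while RED stays at 1.0.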

29 pages, 8422 KB

Open Access Article

A Transformer-Based Method for Bidirectional French–Lingala Machine Translation in Speech and Text

by Reagan E. Mandiya, Selain K. Kasereka, Christophe B. Wizamo, Milena Savova-Mratsenkova, Ruffin-Benoît M. Ngoie, Tasho Tashev and Nathanaël M. Kasoro

Appl. Sci. 2026, 16(7), 3399; https://doi.org/10.3390/app16073399 - 31 Mar 2026

Viewed by 841

Abstract

Underrepresented languages such as Lingala are a significant part of the world’s cultural and linguistic heritage. Lingala plays a central role in daily communication, business, media, education, and culture for millions of people in the Democratic Republic of Congo (DRC) and the Republic of Congo. However, due to data scarcity and dialectal diversity, natural language processing (NLP) research often overlooks this language. In this paper, we propose a deep neural network pipeline for bidirectional French–Lingala automatic translation, covering both text-to-text and voice-to-text scenarios, by integrating Long Short-Term Memory (LSTM) and Transformer models on a specialized parallel corpus. The Bidirectional Encoder Representations from Transformers (BERT) model is used as a bidirectional source encoder to improve contextual representation, while the Whisper model handles automatic speech recognition as the first stage of the audio translation pipeline. Experimental results show that the standalone Transformer achieves a BLEU score of 35.3, compared to 8.12 for the LSTM SeqToSeq baseline. Fine-tuning with BERT raises the BLEU score to 38.6. Integrating the Whisper ASR module for an end-to-end speech translation task yields a final pipeline BLEU score of 55.4, with a Word Error Rate of 12.3% on the speech recognition sub-task, confirming the effectiveness of each component. These results demonstrate the potential of combining domain-specific pre-trained models with modular neural architectures to achieve competitive translation performance in a critically under-resourced language. Full article

(This article belongs to the Special Issue The Advanced Trends in Natural Language Processing)
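The Word Error Rate cited in the abstract above is conventionally the word-level edit distance between the recognized text and the reference transcript, divided by the number of reference words. The sketch below shows that conventional definition under the assumption of simple whitespace tokenization; the function name `word_error_rate` and the sample strings are hypothetical, and the article may apply additional text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens: one substituted word out of four.
print(word_error_rate("a b c d", "a x c d"))
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.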

22 pages, 1921 KB

Open Access Article

Hybrid Semantic–Syntactic NLP Framework for Intelligent Grading of Short Answers and Cloze Questions

by Olaniyan Julius, Silas Formunyuy Verkijika and Ibidun C. Obagbuwa

Appl. Sci. 2026, 16(7), 3191; https://doi.org/10.3390/app16073191 - 26 Mar 2026

Viewed by 549

Abstract

The increasing demand for scalable and fair assessment of open-form responses in digital education shows the need for intelligent grading systems capable of balancing syntactic precision with semantic understanding. This study proposes a hybrid semantic–syntactic NLP framework for automated grading of short-answer and cloze-type questions. The framework integrates a rule-based matcher for syntactic accuracy, MPNet (Masked and Permuted Pre-trained Network) embeddings for semantic similarity, and a fine-tuned DeBERTa (Decoding-enhanced Bidirectional Encoder Representations from Transformer with Disentangled Attention) regressor for continuous score prediction, while a T5-small model provides pedagogically aligned feedback generation. Evaluations were conducted using benchmark corpora, synthetic cloze datasets, and a domain-specific short-answer corpus. Results demonstrate that the hybrid system outperforms traditional baselines, achieving 91% accuracy, a 0.89 F1 score, a mean absolute error of 0.36, and strong inter-rater agreement (κ = 0.87), aligning closely with human graders. Qualitative analyses show that the framework successfully recognizes paraphrased responses, assigns partial credit, and generates meaningful feedback. Ablation studies further validate the necessity of each subsystem, with performance significantly declining when components were removed. The findings confirm that the proposed framework is both computationally robust and pedagogically valuable, establishing a foundation for scalable, interpretable, and fair automated grading in contemporary educational environments. Full article

(This article belongs to the Special Issue Applications of Artificial Intelligence in Innovative Education)

18 pages, 1834 KB

Open Access Feature Paper Article

Multi-Dataset Training for Improved Accuracy in Spatio-Temporal Problems: An Explainable Analysis

by Javier García-Sigüenza, Alberto Real-Fernández, Faraón Llorens-Largo, Rafael Molina-Carmona and Marc Semper

Mathematics 2026, 14(5), 908; https://doi.org/10.3390/math14050908 - 7 Mar 2026

Viewed by 548

Abstract

Deep learning models used to predict spatio-temporal data usually make use of embeddings to represent the different nodes that make up a graph, and thus are able to represent the characteristics of the nodes to be predicted. While in other fields of deep learning, such as NLP, a pre-training is performed on large datasets to obtain the embeddings and then apply them to another task with a smaller dataset, in the case of spatio-temporal problems, this is a more complex task. Therefore, in this paper, we propose a method for training on several graphs simultaneously to improve embeddings, using a model adapted to the problem and a dataset generated from subgraphs. To validate the method, a new dataset has been generated from several datasets used for traffic forecasting. The results obtained show that embeddings generated with training on multiple datasets increase prediction accuracy, improving metrics in the datasets used for validation. In addition, an analysis of the embeddings has been performed to add explainability to our method, providing a better understanding of how this training affects the generated embeddings. Full article

(This article belongs to the Special Issue Theory and Application of Neural Networks and Complex Networks, 2nd Edition)

16 pages, 668 KB

Open Access Article

Evaluation of a Company’s Media Reputation Based on the Articles Published on News Portals

by Algimantas Venčkauskas, Vacius Jusas and Dominykas Barisas

Appl. Sci. 2026, 16(4), 1987; https://doi.org/10.3390/app16041987 - 17 Feb 2026

Viewed by 648

Abstract

A company’s reputation is an important, intangible asset, which is heavily influenced by media reputation. We developed a method to measure a company’s reputation based on sentiments detected in online articles. The sentiment of each sentence was evaluated and categorized into one of three polarities: positive, negative, or neutral. Then, we developed another method to assess a company’s media reputation using all available online articles about the company. The company’s media reputation is presented as a tuple consisting of their media reputation on a scale from 0 to 100, the number of articles related to the company, and the margin of error. Experiments were conducted using articles written in Lithuanian published on major news portals. We used two different tools to assess the sentiments of the articles: Stanford CoreNLP v.4.5.10, combined with Google API, and the pre-trained transformer model XLM-RoBERTa. Google API was used for translation into English, as Stanford CoreNLP does not support the Lithuanian language. The results obtained were compared with those of existing methods, based on the coefficients of media endorsement and media favorableness, showing that the results of the proposed method are less moderate than the coefficient of media favorableness and less extreme than the coefficient of media endorsement. Full article

(This article belongs to the Special Issue Multimodal Emotion Recognition and Affective Computing)

20 pages, 1239 KB

Open Access Article

Task-Adaptive and Multi-Level Contextual Understanding for Emotion Recognition in Conversations

by Xiaomeng Yao, Wei Cao, Yuyang Xue, Haijun Zhang and Xiaochao Fan

Appl. Sci. 2026, 16(4), 1706; https://doi.org/10.3390/app16041706 - 9 Feb 2026

Viewed by 385

Abstract

Emotion recognition in conversations (ERC) is a significant task in natural language processing, aimed at identifying the emotion of each utterance within a conversation. Current research predominantly relies on pre-trained language models, often incorporating sophisticated network architectures to capture complex contextual semantics in conversations. However, existing approaches have not successfully combined effective task-specific adaptation with adequate modeling of conversational context complexity. To address this, we propose a model named TAMC-ERC (Task-Adaptive and Multi-level Contextual Understanding for Emotion Recognition in Conversations). The model adopts a progressive recognition framework that sequentially builds on foundational utterance representations, integrates conversation-level contexts, and leads to a task-adaptive classification decision. First, the Task-Adaptive Representation Learning module produces highly discriminative utterance representations. It achieves this by integrating emotion space information into prompts and employing contrastive learning. Subsequently, the Multi-Level Contextual Understanding module performs in-depth modeling of the conversational context. It synergistically integrates both macroscopic narratives and microscopic interactions to construct a comprehensive emotional context. Finally, the classifier is directly parameterized by the emotion concept vectors from the task-adaptive stage. This creates a coherent task adaptation process, maintaining task-specific awareness from representation learning through to the final decision. Experiments on three benchmark datasets demonstrate that TAMC-ERC achieves highly competitive performance: it attains weighted average F1 scores of 71.04% on IEMOCAP, 66.95% on MELD, and 40.99% on EmoryNLP. These results set a new state of the art and demonstrate that the model outperforms most existing baselines. 
This work validates that integrating task adaptation with multi-level contextual modeling is key to addressing conversational complexity and improving recognition accuracy. Full article

(This article belongs to the Section Computing and Artificial Intelligence)

13 pages, 1009 KB

Open Access Article

Phishing Email Detection Using BERT and RoBERTa

by Mariam Ibrahim and Ruba Elhafiz

Computation 2026, 14(2), 46; https://doi.org/10.3390/computation14020046 - 7 Feb 2026

Viewed by 2529

Abstract

One of the most harmful and deceptive forms of cybercrime is phishing, which targets users with malicious emails and websites. In this paper, we focus on the use of natural language processing (NLP) techniques and transformer models for phishing email detection. The Nazario Phishing Corpus is preprocessed and blended with real emails from the Enron dataset to create a robustly balanced dataset. Urgency, deceptive phrasing, and structural anomalies were some of the neglected features and sociolinguistic traits of the text, which underwent tokenization, lemmatization, and noise filtration. We fine-tuned two transformer models, Bidirectional Encoder Representations from Transformers (BERT) and the Robustly Optimized BERT Pretraining Approach (RoBERTa), for binary classification. The models were evaluated on the standard metrics of accuracy, precision, recall, and F1-score. Given the context of phishing, emphasis was placed on recall to reduce the number of phishing attacks that went unnoticed. The results show that RoBERTa has more general performance and fewer false negatives than BERT and is therefore a better candidate for deployment on security-critical tasks. Full article

(This article belongs to the Special Issue Sentiment-Driven Modelling in Business, Economics, and Social Sciences)

22 pages, 6241 KB

Open Access Article

Using Large Language Models to Detect and Debunk Climate Change Misinformation

by Zeinab Shahbazi and Sara Behnamian

Big Data Cogn. Comput. 2026, 10(1), 34; https://doi.org/10.3390/bdcc10010034 - 17 Jan 2026

Cited by 3 | Viewed by 1857

Abstract

The rapid spread of climate change misinformation across digital platforms undermines scientific literacy, public trust, and evidence-based policy action. Advances in Natural Language Processing (NLP) and Large Language Models (LLMs) create new opportunities for automating the detection and correction of misleading climate-related narratives. This study presents a multi-stage system that employs state-of-the-art large language models such as Generative Pre-trained Transformer 4 (GPT-4), Large Language Model Meta AI (LLaMA) version 3 (LLaMA-3), and RoBERTa-large (Robustly optimized BERT pretraining approach large) to identify, classify, and generate scientifically grounded corrections for climate misinformation. The system integrates several complementary techniques, including transformer-based text classification, semantic similarity scoring using Sentence-BERT, stance detection, and retrieval-augmented generation (RAG) for evidence-grounded debunking. Misinformation instances are detected through a fine-tuned RoBERTa–Multi-Genre Natural Language Inference (MNLI) classifier (RoBERTa-MNLI), grouped using BERTopic, and verified against curated climate-science knowledge sources using BM25 and dense retrieval via FAISS (Facebook AI Similarity Search). The debunking component employs RAG-enhanced GPT-4 to produce accurate and persuasive counter-messages aligned with authoritative scientific reports such as those from the Intergovernmental Panel on Climate Change (IPCC). A diverse dataset of climate misinformation categories covering denialism, cherry-picking of data, false causation narratives, and misleading comparisons is compiled for evaluation. Benchmarking experiments demonstrate that LLM-based models substantially outperform traditional machine-learning baselines such as Support Vector Machines, Logistic Regression, and Random Forests in precision, contextual understanding, and robustness to linguistic variation. 
Expert assessment further shows that generated debunking messages exhibit higher clarity, scientific accuracy, and persuasive effectiveness compared to conventional fact-checking text. These results highlight the potential of advanced LLM-driven pipelines to provide scalable, real-time mitigation of climate misinformation while offering guidelines for responsible deployment of AI-assisted debunking systems. Full article

(This article belongs to the Special Issue Natural Language Processing Applications in Big Data)

23 pages, 2741 KB

Open Access Article

Subjective Evaluation of Operator Responses for Mobile Defect Identification in Remanufacturing: Application of NLP and Disagreement Tagging

by Abbirah Ahmed, Reenu Mohandas, Arash Joorabchi and Martin J. Hayes

Big Data Cogn. Comput. 2025, 9(12), 312; https://doi.org/10.3390/bdcc9120312 - 4 Dec 2025

Viewed by 830

Abstract

In the context of remanufacturing, particularly mobile device refurbishing, effective operator training is crucial for accurate defect identification and process inspection efficiency. This study examines the application of Natural Language Processing (NLP) techniques to evaluate operator expertise based on subjective textual responses gathered during a defect analysis task. Operators were asked to describe screen defects using open-ended questions, and their responses were compared with expert responses to evaluate their accuracy and consistency. We employed four NLP models, including finetuned Sentence-BERT (SBERT), pre-trained SBERT, Word2Vec, and Dice similarity, to determine their effectiveness in interpreting short, domain-specific text. A novel disagreement tagging framework was introduced to supplement traditional similarity metrics with explainable insights. This framework identifies the root causes of model–human misalignment across four categories: defect type, severity, terminology, and location. Results show that a finetuned SBERT model significantly outperforms other models, achieving a Pearson correlation of 0.93 with MAE and RMSE scores of 0.07 and 0.12, respectively, and providing more accurate and context-aware evaluations. In contrast, other models exhibit limitations in semantic understanding and consistency. The results highlight the importance of finetuning NLP models for domain-specific applications and demonstrate how qualitative tagging methods can enhance interpretability and model debugging. This combined approach indicates a scalable and transparent methodology for the evaluation of operator responses, supporting the development of more effective training programmes in industrial settings where remanufacturing and sustainability are key performance metrics. Full article

(This article belongs to the Special Issue Artificial Intelligence (AI) and Natural Language Processing (NLP))
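Of the four similarity measures named in the abstract above, Dice similarity is the simplest lexical baseline: twice the size of the overlap between two sets, divided by the sum of their sizes. The sketch below assumes token sets built from lowercased whitespace-separated words; the function name `dice_similarity` and the example responses are hypothetical, and the article may tokenize differently (for example with character n-grams).

```python
def dice_similarity(text_a: str, text_b: str) -> float:
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) over word sets."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # assumption: two empty responses count as identical
    return 2 * len(tokens_a & tokens_b) / (len(tokens_a) + len(tokens_b))


# Made-up operator and expert descriptions of a screen defect.
operator = "deep scratch on screen"
expert = "scratch on the screen"
print(dice_similarity(operator, expert))
```

A purely lexical score like this cannot see that "crack" and "fracture" mean the same thing, which is one plausible reason embedding-based models are compared against it.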

14 pages, 3673 KB

Open Access Article

IMAGO: An Improved Model Based on Attention Mechanism for Enhanced Protein Function Prediction

by Meiling Liu, Longchang Liang, Qiutong Wang, Yunmeng Zhang, Lin Shi, Tianjiao Zhang and Zhenxing Wang

Biomolecules 2025, 15(12), 1667; https://doi.org/10.3390/biom15121667 - 29 Nov 2025

Viewed by 745

Abstract

Protein function prediction plays an important role in the field of biology. With the wide application of deep learning in bioinformatics, more and more natural language processing (NLP) technologies are being applied to downstream bioinformatics tasks, and they have also shown excellent performance in protein function prediction. Protein-protein interaction (PPI) networks and other biological attributes contain rich information critical for annotating protein functions. However, existing deep learning networks still suffer from overfitting and noise issues, resulting in low accuracy in protein function prediction. Consequently, developing efficient models for protein function prediction remains a popular and challenging topic in the application of NLP in bioinformatics. In this study, we propose a novel protein function prediction model based on attention mechanisms, termed IMAGO. This model employs the Transformer pre-training process, integrating multi-head attention mechanisms and regularization techniques, and optimizes the loss function to effectively reduce overfitting and noise issues during training. It generates more robust embeddings, ultimately improving the accuracy of protein function prediction. Experimental results on human and mouse datasets indicate that our model surpasses other protein function prediction models across multiple metrics. Thus, this efficient, stable, and accurate deep learning model holds significant promise for protein function prediction. Full article

(This article belongs to the Section Bioinformatics and Systems Biology)

25 pages, 1910 KB

Open Access Review

Natural Language Processing in Generating Industrial Documentation Within Industry 4.0/5.0

by Izabela Rojek, Olga Małolepsza, Mirosław Kozielski and Dariusz Mikołajewski

Appl. Sci. 2025, 15(23), 12662; https://doi.org/10.3390/app152312662 - 29 Nov 2025

Cited by 2 | Viewed by 2173

Abstract

Deep learning (DL) methods have revolutionized natural language processing (NLP), enabling industrial documentation systems to process and generate text with high accuracy and fluency. Modern deep learning models, such as transformers and recurrent neural networks (RNNs), learn contextual relationships in text, making them ideal for analyzing and creating complex industrial documentation. Transformer-based architectures, such as BERT (Bidirectional Encoder Representations from Transformers) and GPT (Generative Pre-trained Transformer), are ideally suited for tasks such as text summarization, content generation, and question answering, which are crucial for documentation systems. Pre-trained language models, tuned to specific industrial datasets, support domain-specific vocabulary, ensuring the generated documentation complies with industry standards. Deep learning-based systems can use sequential models, such as those used in machine translation, to generate documentation in multiple languages, promoting accessibility, and global collaboration. Using attention mechanisms, these models identify and highlight critical sections of input data, resulting in the generation of accurate and concise documentation. Integration with optical character recognition (OCR) tools enables DL-based NLP systems to digitize and interpret legacy documents, streamlining the transition to automated workflows. Reinforcement learning and human feedback loops can enhance a system’s ability to generate consistent and contextually relevant text over time. These approaches are particularly effective in creating dynamic documentation that is automatically updated based on data from sensors, registers, or other sources in real time. The scalability of DL techniques enables industrial organizations to efficiently produce massive amounts of documentation, reducing manual effort and improving overall efficiency. 
NLP has become a fundamental technology for automating the generation, maintenance, and personalization of industrial documentation within the Industry 4.0, 5.0, and emerging Industry 6.0 paradigms. Recent advances in large language models, search-assisted generation, and multimodal architectures have significantly improved the accuracy and contextualization of technical manuals, maintenance reports, and compliance documents. However, persistent challenges such as domain-specific terminology, data scarcity, and the risk of hallucinations highlight the limitations of current approaches in safety-critical manufacturing environments. This review synthesizes state-of-the-art methods, comparing rule-based, neural, and hybrid systems while assessing their effectiveness in addressing industrial requirements for reliability, traceability, and real-time adaptation. Human–AI collaboration and the integration of knowledge graphs are transforming documentation workflows as factories evolve toward cognitive and autonomous systems. The review included 32 articles published between 2018 and 2025. The bibliometric findings suggest that the high percentage of conference papers (69.6%) may indicate a field still in its conceptual phase, which contextualizes the article’s emphasis on proposed architectures rather than on their industrial validation. Most research was conducted in computer science, suggesting early stages of technological maturity. The leading countries were China and India, but these countries did not have large publication counts, nor were leading researchers or affiliations observed, suggesting significant research dispersion. However, the most frequently observed SDGs indicate a clear health context, focusing on “industry innovation and infrastructure” and “good health and well-being”. Full article

(This article belongs to the Special Issue Emerging and Exponential Technologies in Industry 4.0)

17 pages, 1206 KB

Open AccessArticle

DPATransLLM: Detection of Pronominal Anaphora in Turkish Sentences Using Transformer-Based, Large Language Models and Hybrid Ensemble Approach

by Engin Demir and Metin Bilgin

Appl. Sci. 2025, 15(23), 12480; https://doi.org/10.3390/app152312480 - 25 Nov 2025

Viewed by 848

Abstract

In the current information age, with the exponential growth of data volume and language-based applications, the accurate resolution of intra-contextual relationships in texts has become indispensable for both academic research and industrial Natural Language Processing (NLP) systems. This study focuses on the detection of pronominal anaphora in Turkish sentences. For the detection of pronominal anaphora, a specific dataset comprising 2000 sentences and 72,239 tokens was created, and this dataset was labeled using a BIO tagging method developed with a custom approach for this study. In this work, fine-tuning was performed on Transformer-based language models pre-trained on Turkish data, such as BERT and RoBERTa. Additionally, Large Language Models (LLMs) trained on Turkish data, including Turkcell-LLM-7b-v1 and ytu-ce-cosmos/Turkish-Llama-8b-DPO-v0.1, as well as multilingual models like Microsoft’s Phi-3 Mini-4K-Instruct and OpenAI’s GPT-4o-mini, were also fine-tuned with the created dataset to detect pronominal anaphora in sentences. Following the training of the language models, the resulting performance was evaluated using pronoun accuracy, antecedent accuracy, exact match, and F1-score metrics. According to the results obtained in the pronominal anaphora detection phase of the study, a novel hybrid ensemble approach combining multiple Transformer models with linguistic rules achieved the highest performance. This hybrid system attained scores of 0.987 for pronoun accuracy, 0.977 for antecedent accuracy, 0.505 for exact match, and 0.960 for F1-score, surpassing all individual models, including GPT-4o-mini. These findings reveal the superiority of ensemble methods combined with Turkish-specific linguistic rules over standalone models in Turkish anaphora resolution. This study is considered novel, as it is the first work to apply hybrid ensemble methods with linguistic rule integration to this domain for the Turkish language. Full article
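As a rough illustration of the BIO tagging idea mentioned in the abstract above, the following minimal sketch labels pronoun and antecedent spans at the token level. The abstract does not describe the authors' custom scheme, so the label names `ANT` and `PRO`, the helper `bio_tags`, and the English stand-in sentence are all assumptions made for illustration only.

```python
from typing import List, Tuple


def bio_tags(tokens: List[str], spans: List[Tuple[int, int, str]]) -> List[str]:
    """Assign BIO tags to tokens.

    Each span is (start, end_exclusive, label). The first token of a span
    gets "B-<label>", the remaining tokens get "I-<label>", and every token
    outside all spans gets "O".
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


# Hypothetical example: "a book" is the antecedent (ANT), "it" the pronoun (PRO).
tokens = ["Ali", "bought", "a", "book", "and", "read", "it"]
spans = [(2, 4, "ANT"), (6, 7, "PRO")]
tags = bio_tags(tokens, spans)

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

A token-level encoding of this kind lets pronoun and antecedent detection be framed as sequence labeling, which is the usual setting for fine-tuning encoder models such as BERT on span tasks.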

(This article belongs to the Special Issue Practical Applications of Large Language Models in Natural Language Processing)

19 pages, 4893 KB

Open AccessArticle

LLMs in Staging: An Orchestrated LLM Workflow for Structured Augmentation with Fact Scoring

by Giuseppe Trimigno, Gianfranco Lombardo, Michele Tomaiuolo, Stefano Cagnoni and Agostino Poggi

Future Internet 2025, 17(12), 535; https://doi.org/10.3390/fi17120535 - 24 Nov 2025

Cited by 1 | Viewed by 920

Abstract

Retrieval-augmented generation (RAG) enriches prompts with external knowledge, but it often relies on additional infrastructure that may be impractical in resource-constrained or offline settings. In addition, updating the internal knowledge of a language model through retraining is costly and inflexible. To address these limitations, we propose an explainable and structured prompt augmentation pipeline that enhances inputs using pre-trained models and rule-based extractors, without requiring external sources. We describe this approach as an orchestrated LLM workflow: a structured sequence in which lightweight LLM modules assume specialized roles. Specifically, (1) an extractor module identifies factual triples from input prompts by combining dependency parsing with a rule-based extraction algorithm; (2) a scorer module, based on a generic lightweight LLM, evaluates the importance of each triple via its self-attention patterns, leveraging internal beliefs to promote explainability and trustworthy cooperation with the downstream model; (3) a performer module processes the augmented prompt for downstream tasks in supervised fine-tuning or zero-shot settings. Much like in a theater staging, each module operates transparently behind the scenes to support and elevate the performer’s final output. We evaluate this approach across multiple performer architectures (encoder-only, encoder-decoder, and decoder-only) and NLP tasks (multiple-choice QA, open-book QA, and summarization). Our results show that this structured augmentation with scored facts yields consistent improvements compared to baseline prompting: up to a 28.78% accuracy improvement for multiple-choice QA, up to a 9.42% BLEURT improvement for open-book QA, and up to an 18.14% ROUGE-L improvement for summarization. By decoupling knowledge scoring from task execution, our method provides a practical, interpretable, and low-cost alternative to RAG in static or knowledge-limited environments. Full article

(This article belongs to the Special Issue Generative Artificial Intelligence: Systems, Technologies and Applications)

Search Results (166)
