Machine Learning and Knowledge Extraction

54 pages, 2092 KB

Open Access Article

Shared Autoencoder-Based Unified Intrusion Detection Across Heterogeneous Datasets for Binary and Multi-Class Classification Using a Hybrid CNN–DNN Model

by Hesham Kamal and Maggie Mashaly

Mach. Learn. Knowl. Extr. 2026, 8(2), 53; https://doi.org/10.3390/make8020053 - 22 Feb 2026

Viewed by 1395

Abstract

As network environments become increasingly interconnected, ensuring robust cyber-security has become critical, particularly with the growing sophistication of modern cyber threats. Intrusion detection systems (IDSs) play a vital role in identifying and mitigating unauthorized or malicious activities; however, conventional machine learning-based IDSs often rely on handcrafted features and are limited in their ability to detect diverse attack types across disparate network domains. To address these limitations, this paper introduces a novel unified intrusion detection framework that implements “Structural Dualism” to integrate three heterogeneous benchmark datasets (CSE-CIC-IDS2018, NF-BoT-IoT-v2, and IoT-23) into a harmonized, protocol-agnostic representation. The framework employs a shared autoencoder architecture with dataset-specific projection layers to learn a unified latent manifold. This 15-dimensional space captures the underlying semantics of attack patterns (e.g., volumetric vs. signaling) across multiple domains, while dataset-specific decoders preserve reconstruction fidelity through alternating multi-domain training. To identify complex micro-signatures within this manifold, the framework utilizes a synergistic hybrid convolutional neural network–deep neural network (CNN–DNN) classifier, where the CNN extracts spatial latent patterns and the DNN performs global classification across twenty-five distinct classes. Class imbalance is addressed through resampling strategies such as adaptive synthetic sampling (ADASYN) and edited nearest neighbors (ENN). Experimental results demonstrate remarkable performance, achieving 99.76% accuracy for binary classification and 99.54% accuracy for multi-class classification on the merged dataset, with strong generalization confirmed on individual datasets. 
These findings indicate that the shared autoencoder-based CNN–DNN framework, through its unique feature alignment and spatial extraction capabilities, significantly strengthens intrusion detection across diverse and heterogeneous environments.

24 pages, 1064 KB

Open Access Article

Kernel-Based Optimal Subspaces (KOS): A Method for Data Classification

by Lakhdar Remaki

Mach. Learn. Knowl. Extr. 2026, 8(2), 52; https://doi.org/10.3390/make8020052 - 22 Feb 2026

Viewed by 516

Abstract

Support Vector Machine (SVM) is a popular kernel-based method for data classification that has demonstrated high efficiency across a wide range of practical applications. However, SVM suffers from several limitations, including the potential failure of the optimization process, especially in high-dimensional spaces; the inherently high computational cost; the lack of a systematic approach to multi-class classification; difficulties in handling imbalanced classes; and the prohibitive cost of real-time or dynamic classification. This paper proposes an alternative method, referred to as Kernel-based Optimal Subspaces (KOS), which belongs to the family of kernel subspace methods. Mathematically similar to Kernel PCA (KPCA), KOS achieves performance comparable to SVM while addressing the aforementioned weaknesses. The method is based on computing the minimum distance to optimal feature subspaces of the mapped data. Because no optimization process is required, KOS is robust, fast, and easy to implement. The optimal subspaces are constructed independently, enabling high parallelizability and making the approach well-suited for dynamic classification and real-time applications. Furthermore, the issue of imbalanced classes is naturally handled by subdividing large classes into smaller sub-classes, thereby creating appropriately sized sub-subspaces within the feature space.

(This article belongs to the Section Data)

22 pages, 1472 KB

Open Access Review

Innovations in Robots for Weed and Pest Control: A Systematic Review of Cutting-Edge Research

by Nicola Furnitto, Giuseppe Todde, Maria Spagnuolo, Giuseppe Sottosanti, Maria Caria, Giampaolo Schillaci and Sabina I. G. Failla

Mach. Learn. Knowl. Extr. 2026, 8(2), 51; https://doi.org/10.3390/make8020051 - 22 Feb 2026

Cited by 2 | Viewed by 2734

Abstract

In recent years, agriculture has begun to transform thanks to the arrival of robots and autonomous vehicles capable of performing complex operations such as weeding and spraying in an intelligent and targeted manner. In fact, new-generation agricultural robots use artificial intelligence (AI), cameras, and sensors to recognise weeds, analyse crop conditions, and apply plant protection products only where necessary, thus reducing waste and environmental impact. Some systems combine drones and ground vehicles to achieve even more accurate results. This systematic review synthesises recent advances in agricultural robotics for weed and pest management through a PRISMA-based approach. Literature was collected from major scientific databases (Scopus, Web of Science, IEEE Xplore, Google Scholar) and complementary sources, leading to the inclusion of 83 eligible studies. The selected evidence was structured into four application domains: (i) weed detection and mapping, (ii) robotic and non-chemical weed control (mechanical and laser-based approaches), (iii) selective/variable-rate spraying for pest and disease management, and (iv) integrated weeding–spraying solutions, including cooperative Unmanned Aerial Vehicle–Unmanned Ground Vehicle (UAV–UGV) systems. Overall, the reviewed studies confirm rapid progress in real-time perception (deep learning-based detection), navigation/localization (e.g., GNSS/RTK, LiDAR, sensor fusion) and targeted actuation (spot spraying and precision interventions), while also revealing persistent limitations: heterogeneous evaluation protocols, limited system-level comparisons in terms of work rate, scalability, costs and robustness under variable field conditions, and an often unclear distinction between prototype platforms and solutions close to commercialization. However, the large-scale spread of these technologies is still hampered by high costs, technical complexity, and cultural resistance. 
The review highlights how the integration of automation, sustainability, and accessibility is key to the agriculture of the future.

(This article belongs to the Section Thematic Reviews)

40 pages, 1792 KB

Open Access Article

Why So Meme? A Comparative and Explainable Analysis of Multimodal Hateful Meme Detection

by Nor Saiful Azam Bin Nor Azmi, Michal Ptaszynski, Fumito Masui and Abu Nowhash Chowdhury

Mach. Learn. Knowl. Extr. 2026, 8(2), 50; https://doi.org/10.3390/make8020050 - 21 Feb 2026

Viewed by 1617

Abstract

The rise of toxic content, particularly in the form of hateful memes, poses a significant challenge to social media platforms. This paper presents an empirical comparative study of unimodal and multimodal architectures for toxic content detection. Rather than proposing a novel architecture, the study evaluates the efficacy of a modular Late Fusion framework (RoBERViT) against specialized unimodal baselines (RoBERTa and ViT) and a generalist Large Multimodal Model (LLaVA). Both unimodal and multimodal configurations were explored across two distinct benchmarks: the imbalanced Innopolis Hateful Memes dataset and the confounder-driven Facebook Hateful Memes dataset. Beyond quantitative metrics, this study conducts a qualitative analysis using Explainable AI (LIME) and LLaVA to investigate model reasoning. Results demonstrate that the multimodal fusion model consistently outperformed its unimodal counterparts on the Innopolis Hateful Memes dataset, achieving a toxic-class F1-score of 0.6439 compared to the text-only score of 0.5794. However, on the Facebook Hateful Memes dataset, text-only models remain competitive, highlighting the “benign confounder” challenge. The qualitative analysis reveals that text remains the dominant modality, with models often relying on surface-level keywords. Notably, the Vision Transformer frequently uses text overlays as a visual proxy for hate, while the LLaVA model struggles with hallucinated toxicity in benign confounder contexts. These findings underscore the persistent challenge of achieving true multimodal understanding in hate speech detection.
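
To make the late-fusion idea and the reported toxic-class F1-score more concrete, the sketch below averages two unimodal toxic-class probabilities and scores the fused predictions. It is only an illustration: the abstract does not say how RoBERViT fuses its branches, so the weighted average, the function names `late_fusion` and `f1_for_class`, and all toy numbers are assumptions.

```python
from typing import Sequence


def late_fusion(p_text: float, p_image: float, w_text: float = 0.5) -> float:
    """Weighted average of two unimodal toxic-class probabilities (assumed fusion rule)."""
    if not 0.0 <= w_text <= 1.0:
        raise ValueError("w_text must lie in [0, 1]")
    return w_text * p_text + (1.0 - w_text) * p_image


def f1_for_class(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1-score of one class: the harmonic mean of its precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy scores, made up for illustration and not taken from the paper.
text_probs = [0.9, 0.2, 0.6, 0.1]
image_probs = [0.7, 0.4, 0.2, 0.3]
labels = [1, 0, 1, 0]  # 1 = toxic, 0 = benign

fused = [late_fusion(t, i) for t, i in zip(text_probs, image_probs)]
preds = [1 if p >= 0.5 else 0 for p in fused]
toxic_f1 = f1_for_class(labels, preds)
```

In this toy case the third meme is toxic according to the text branch but the image branch pulls the fused score below the 0.5 threshold, so it is missed and recall for the toxic class drops.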

(This article belongs to the Special Issue Language Acquisition and Understanding)

23 pages, 7233 KB

Open Access Article

Plug-and-Play LLM Knowledge Extraction for Robot Navigation: A Fine-Tuning-Free Edge Framework

by Sebastian Rojas-Ordoñez, Mikel Segura, Irune Yarza, Veronica Mendoza and Ekaitz Zulueta

Mach. Learn. Knowl. Extr. 2026, 8(2), 49; https://doi.org/10.3390/make8020049 - 21 Feb 2026

Viewed by 1641

Abstract

Large Language Models are increasingly used for high-level robotic reasoning, yet their latency and stochasticity complicate their direct use in low-level control. Moreover, extracting actionable navigation cues from multimodal context incurs inference costs that are challenging for embedded platforms. We present a plug-and-play framework that augments a finite-state machine with asynchronous velocity suggestions generated by a Large Language Model, using an off-the-shelf DistilGPT-2 model running on-device on a Jetson AGX Orin. The system extracts task-relevant cues from the current context and integrates them only if they satisfy deadline, schema, and kinematic validation, thereby preserving a deterministic 50 Hz control loop with a <5 ms fallback path. We compare multiple Large Language Models for embedded robot control and quantify trade-offs among model size, inference time, and output validity. To assess whether the Large Language Models add value beyond signal processing, we include an ablation against a standard smoothing baseline; the results indicate that the Large Language Models contribute anticipatory, context-dependent adjustments that are not captured by filtering alone. Experiments in Gazebo and on a real TurtleBot3 reduce the final position error from 0.246 m to 0.159 m and improve trajectory efficiency from 0.821 to 0.901 without increasing control-loop latency. Approximately 80% of the Large Language Models’ outputs pass validation and are applied. Overall, the framework reduces developer effort by enabling behavioral changes at the prompt level while maintaining interpretable, robust edge-based navigation.
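
The deadline, schema, and kinematic gate described above could look roughly like the following sketch. It is not the authors' implementation: the JSON field names, the limits `MAX_LINEAR`, `MAX_ANGULAR`, and `MAX_AGE_S`, and the functions `validate_suggestion` and `control_step` are hypothetical choices made for illustration.

```python
import json
import math
from typing import Optional, Tuple

Velocity = Tuple[float, float]  # (linear m/s, angular rad/s)

# Hypothetical limits chosen for illustration; the abstract does not state them.
MAX_LINEAR = 0.22    # m/s
MAX_ANGULAR = 2.84   # rad/s
MAX_AGE_S = 0.5      # suggestions older than this miss the deadline


def validate_suggestion(raw: str, issued_at: float, now: float) -> Optional[Velocity]:
    """Return a velocity pair if the suggestion passes every check, else None."""
    # Deadline check: a late suggestion is ignored rather than awaited.
    if now - issued_at > MAX_AGE_S:
        return None
    # Schema check: expect a JSON object with two finite numeric fields.
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    values = (data.get("linear"), data.get("angular"))
    for value in values:
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return None
        if not math.isfinite(value):
            return None
    linear, angular = float(values[0]), float(values[1])
    # Kinematic check: reject commands outside the assumed velocity envelope.
    if abs(linear) > MAX_LINEAR or abs(angular) > MAX_ANGULAR:
        return None
    return linear, angular


def control_step(fsm_command: Velocity, raw: Optional[str],
                 issued_at: float, now: float) -> Velocity:
    """One control tick: use a validated suggestion, otherwise the FSM command."""
    if raw is not None:
        suggestion = validate_suggestion(raw, issued_at, now)
        if suggestion is not None:
            return suggestion
    return fsm_command
```

Because the finite-state machine command is always available as the return value, the control tick never blocks on the language model; a rejected or missing suggestion simply leaves the deterministic path in charge.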

(This article belongs to the Section Learning)

21 pages, 1805 KB

Open Access Article

Introducing LEAF: LLM Edge Assessment Framework for Generative AI on the Edge

by Mustafa Abdulkadhim and Sandor R. Repas

Mach. Learn. Knowl. Extr. 2026, 8(2), 48; https://doi.org/10.3390/make8020048 - 18 Feb 2026

Cited by 1 | Viewed by 3549

Abstract

The transition of Large Language Models (LLMs) from centralized clouds to edge environments is critical for addressing privacy concerns, latency bottlenecks, and operational costs. However, existing edge benchmarking frameworks remain tailored to discriminative Deep Learning tasks (e.g., object detection), failing to capture the multidimensional challenges of generative AI, specifically the trade-offs between token generation speed, semantic accuracy, and hardware sustainability. To address this gap, we introduce LEAF (LLM Edge Assessment Framework), a novel evaluation methodology that integrates Circular Economy principles directly into performance metrics. LEAF assesses edge deployments across five synergistic pillars: Circular Economy Score, Energy Efficiency (Joules/Token), Performance Speed (Tokens/Second), semantic accuracy (BERTScore), and End-to-End Latency. We validate LEAF through an extensive experimental analysis of five distinct hardware classes, ranging from embedded IoT devices (Raspberry Pi 4 and 5, NVIDIA Jetson Nano) to professional edge servers (NVIDIA T400) and repurposed legacy workstations (NVIDIA GTX 1050 Ti). Utilizing 4-bit quantized models via the Ollama runtime, our results reveal a counterintuitive insight: repurposed consumer hardware significantly outperforms modern purpose-built edge SoCs. The legacy GTX 1050 Ti achieved a 20× speedup over the Raspberry Pi 4 and maintained superior energy-per-task efficiency compared to low-power ARM architectures by minimizing active runtime. These findings challenge the prevailing narrative that newer silicon is essential for Edge AI, demonstrating that sustainable, high-performance inference can be achieved by extending the lifecycle of existing hardware. LEAF thus provides a blueprint for a “Green Edge” ecosystem that balances computational capability with environmental responsibility.
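
Two of the pillars named above, Tokens/Second and Joules/Token, follow directly from a run's token count, duration, and average power draw. The sketch below shows that arithmetic; the `RunMeasurement` type and the two example devices are invented for illustration and are not measurements from the paper.

```python
from dataclasses import dataclass


@dataclass
class RunMeasurement:
    """One generation run; field names are illustrative assumptions."""
    tokens_generated: int
    generation_seconds: float
    average_power_watts: float


def tokens_per_second(run: RunMeasurement) -> float:
    """Throughput: generated tokens divided by wall-clock generation time."""
    return run.tokens_generated / run.generation_seconds


def joules_per_token(run: RunMeasurement) -> float:
    """Energy efficiency: energy (power x time) divided by generated tokens."""
    energy_joules = run.average_power_watts * run.generation_seconds
    return energy_joules / run.tokens_generated


# Made-up numbers: a slow low-power board and a fast higher-power GPU.
low_power_board = RunMeasurement(tokens_generated=100, generation_seconds=100.0,
                                 average_power_watts=5.0)
legacy_gpu = RunMeasurement(tokens_generated=100, generation_seconds=4.0,
                            average_power_watts=75.0)

board_speed = tokens_per_second(low_power_board)   # 1.0 tokens/s
gpu_speed = tokens_per_second(legacy_gpu)          # 25.0 tokens/s
board_energy = joules_per_token(low_power_board)   # 5.0 J/token
gpu_energy = joules_per_token(legacy_gpu)          # 3.0 J/token
```

With these invented figures the device drawing fifteen times more power still spends less energy per token because it finishes much sooner, which is the kind of effect the abstract attributes to minimizing active runtime.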

(This article belongs to the Section Data)

19 pages, 5001 KB

Open Access Article

Novel Loss Functions for Improved Data Visualization in t-SNE

by Sara Nassar, Rachid Hedjam and Samir Brahim Belhaouari

Mach. Learn. Knowl. Extr. 2026, 8(2), 47; https://doi.org/10.3390/make8020047 - 18 Feb 2026

Cited by 1 | Viewed by 1169

Abstract

A popular method for projecting high-dimensional data onto a lower-dimensional space while preserving the integrity of its structure is t-distributed Stochastic Neighbor Embedding (t-SNE). This technique minimizes the Kullback–Leibler ($KL$) divergence to align the similarities between points in the original and reduced spaces. While t-SNE is highly effective, it prioritizes local neighborhood preservation, which results in limited separation between distant clusters and inadequate representation of global relationships. To address these limitations, this work introduces two complementary approaches: (1) The Max-Flipped $KL$ Divergence ($KL^{\max}$) modifies the original divergence by incorporating a contrastive term, $KL'$, which enhances the ranking of point similarities through maximum similarity constraints. (2) The $KL$-Wasserstein Loss ($L_{KL-W}$) combines the $KL$ divergence with the classic Wasserstein distance, allowing the embedding to benefit from the smooth and geometry-aware transport properties of Wasserstein metrics. Experimental results show that these methods lead to improved separation and better structural clarity in the low-dimensional space compared to standard t-SNE.
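
For reference, the objective that standard t-SNE minimizes can be written as below. This is the textbook formulation, not the proposed $KL^{\max}$ or $L_{KL-W}$ losses, whose exact forms the abstract does not give; the symbols $p_{ij}$, $q_{ij}$, and $y_i$ are the usual notation and are not taken from the abstract.

```latex
% Standard t-SNE objective: KL divergence between the high-dimensional
% similarities p_{ij} and the low-dimensional similarities q_{ij}.
KL(P \,\|\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}},
\qquad
q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}
```

Because each term is weighted by $p_{ij}$, pairs that are far apart in the original space (small $p_{ij}$) contribute little to the loss, which is one way to see why global relationships are weakly constrained.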

(This article belongs to the Section Visualization)

21 pages, 7276 KB

Open Access Article

SkySeg-Net: Sky Segmentation-Based Row-Terminal Recognition in Trellised Orchards

by Haiyang Gu, Yong Wang, Huaiyang Liu, Tong Tian, Changxing Geng and Yun Shi

Mach. Learn. Knowl. Extr. 2026, 8(2), 46; https://doi.org/10.3390/make8020046 - 13 Feb 2026

Cited by 1 | Viewed by 825

Abstract

Perception in trellised orchards is often challenged by dense canopy occlusion and overhead plastic coverings, which cause pronounced variations in sky visibility at row terminals. Accurately recognizing row terminals, including both row head and row tail positions, is therefore essential for understanding orchard row structures. This study presents SkySeg-Net, a sky segmentation-based framework for row-terminal recognition in trellised orchards. SkySeg-Net is built on an enhanced multi-scale U-Net architecture and employs ResNeSt residual split-attention blocks as the backbone. To improve feature discrimination under complex illumination and occlusion conditions, the Convolutional Block Attention Module (CBAM) is integrated into the downsampling path, while a Pyramid Pooling Module (PPM) is introduced during upsampling to strengthen multi-scale contextual representation. Sky regions are segmented from both front-view and rear-view camera images, and a hierarchical threshold-based pixel-sum analysis is applied to infer row-terminal locations based on sky-region distribution patterns. To support a comprehensive evaluation, a dedicated trellised vineyard dataset was constructed, featuring front-view and rear-view images and covering three representative grapevine growth stages (BBCH 69–71, 73–77, and 79–89). Experimental results show that SkySeg-Net achieves an mIoU of 91.21% and an mPA of 94.82% for sky segmentation, with a row-terminal recognition accuracy exceeding 98.17% across all growth stages. These results demonstrate that SkySeg-Net provides a robust and reliable visual perception approach for row-terminal recognition in trellised orchard environments.
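
The two segmentation metrics reported above, mIoU and mPA, can be computed from a confusion matrix as in the sketch below. It uses the common definitions (mean per-class intersection over union, and mean per-class pixel accuracy); the abstract does not spell out the exact variant used, and the function names and the tiny masks are assumptions for illustration.

```python
from typing import List, Sequence


def confusion_counts(true_mask: Sequence[Sequence[int]],
                     pred_mask: Sequence[Sequence[int]],
                     num_classes: int = 2) -> List[List[int]]:
    """counts[t][p] = number of pixels with true class t predicted as class p."""
    counts = [[0] * num_classes for _ in range(num_classes)]
    for true_row, pred_row in zip(true_mask, pred_mask):
        for t, p in zip(true_row, pred_row):
            counts[t][p] += 1
    return counts


def mean_iou(counts: List[List[int]]) -> float:
    """Mean over classes of TP / (TP + FP + FN)."""
    ious = []
    for c in range(len(counts)):
        tp = counts[c][c]
        fn = sum(counts[c]) - tp
        fp = sum(row[c] for row in counts) - tp
        if tp + fp + fn > 0:
            ious.append(tp / (tp + fp + fn))
    return sum(ious) / len(ious)


def mean_pixel_accuracy(counts: List[List[int]]) -> float:
    """Mean over classes of correctly labeled pixels / pixels of that class."""
    accuracies = [counts[c][c] / sum(counts[c])
                  for c in range(len(counts)) if sum(counts[c]) > 0]
    return sum(accuracies) / len(accuracies)


# Tiny made-up masks: 1 = sky, 0 = everything else.
true_mask = [[1, 1, 0, 0],
             [1, 0, 0, 0]]
pred_mask = [[1, 1, 1, 0],
             [0, 0, 0, 0]]

counts = confusion_counts(true_mask, pred_mask)
miou = mean_iou(counts)
mpa = mean_pixel_accuracy(counts)
```

For these masks the background class has an IoU of 4/6 and the sky class 2/4, so the mean IoU is 7/12, while the per-class accuracies of 4/5 and 2/3 give a mean pixel accuracy of 11/15.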

(This article belongs to the Section Data)

17 pages, 4681 KB

Open Access Article

Towards Adaptive Adverse Weather Removal via Semantic and Low-Level Visual Perceptual Priors

by Wei Dong, Han Zhou, Terry Ji and Jun Chen

Mach. Learn. Knowl. Extr. 2026, 8(2), 45; https://doi.org/10.3390/make8020045 - 12 Feb 2026

Viewed by 1063

Abstract

Adverse weather removal aims to restore images degraded by haze, rain, or snow. However, existing unified models often rely on implicit degradation cues, making them vulnerable to inaccurate weather perception and insufficient semantic guidance, which leads to over-smoothing or residual artifacts in real scenes. In this work, we propose AWR-VIP, a prior-guided adverse weather removal framework that explicitly extracts semantic and perceptual priors using a frozen vision–language model (VLM). Given a degraded input, we first employ a degradation-aware prompt extractor to produce a compact set of semantic tags describing key objects and regions, and simultaneously perform weather-type perception by prompting the VLM with explicit weather definitions. Conditioned on the predicted weather type and selected tags, the VLM further generates two levels of restoration guidance: a global instruction that summarizes image-level enhancement goals (e.g., visibility/contrast) and local instructions that specify tag-aware refinement cues (e.g., recover textures for specific regions). These textual outputs are encoded by a text encoder into a pair of priors ($P_{global}$ and $P_{local}$), which are injected into a UNet-based restorer through global-prior-modulated normalization and instruction-guided attention, enabling weather-adaptive and content-aware restoration. Extensive experiments on a combined benchmark show that AWR-VIP consistently outperforms state-of-the-art methods. Moreover, the VLM-derived priors are plug-and-play and can be integrated into other restoration backbones to further improve performance.

(This article belongs to the Section Learning)

19 pages, 3635 KB

Open Access Article

Improving Ground Cover Crop Fractional Vegetation Mapping via Causality-Based Deep Representation Learning

by Atif Latif, Masoumeh Hashemi, Matt Yost, Somayeh Esmaeili and Xiaojun Qi

Mach. Learn. Knowl. Extr. 2026, 8(2), 44; https://doi.org/10.3390/make8020044 - 11 Feb 2026

Viewed by 758

Abstract

Semantic segmentation and deep learning methods have rarely been applied to fractional vegetation cover (FVC) segmentation tasks due to the lack of publicly available datasets for training deep learning models. FVC is a key indicator for assessing vegetation distribution, crop density, and crop responses to water availability and fertilizer application, yet conventional field-based measurement methods are time consuming, costly, labor intensive, and may lack the accuracy required for critical applications such as drought stress evaluation and water productivity. In this paper, we introduce causality-based deep learning techniques for FVC segmentation on a publicly available RGB dataset that consists of four ground cover crops: Phyla nodiflora L., Cynodon dactylon, Frankenia thymifolia Desf., and Oxalis stricta L. Applying the stepwise intervention and reweighting (SIR) method at different encoder stages separated causal from spurious correlations in pretrained features, which reduced confounding bias and enabled the models to learn more generalizable and task-relevant features. Extensive experiments on the FVC dataset, conducted with and without causality learning, showed that the proposed FCN + ResNet-50 model with causality learning and data augmentation achieved an accuracy of 94.80%, a precision of 94.97%, a recall of 94.35%, and an F1-score of 94.62%, outperforming non-causal baselines and state-of-the-art transformer-based models including SegFormer and Mask2Former.

(This article belongs to the Section Learning)

37 pages, 20040 KB

Open Access Article

Towards LLM-Driven Cybersecurity in Autonomous Vehicles: A Big Data-Empowered Framework with Emerging Technologies

by Aristeidis Karras, Leonidas Theodorakopoulos, Christos Karras and Alexandra Theodoropoulou

Mach. Learn. Knowl. Extr. 2026, 8(2), 43; https://doi.org/10.3390/make8020043 - 11 Feb 2026

Cited by 1 | Viewed by 1722

Abstract

Modern Autonomous Vehicles generate large volumes of heterogeneous in-vehicle data, making cybersecurity a critical challenge as adversarial attacks become increasingly adaptive, stealthy, and multi-protocol. Traditional intrusion detection systems often fail under these conditions because of their limited contextual understanding, poor robustness to distribution shifts, and insufficient regulatory transparency. This study introduces LLM-Guardian, a hierarchical intrusion detection framework with decision-making mechanisms that integrates Large Language Models (LLMs) with classical statistical detection theory, optimal transport drift analysis, graph neural networks, and formal uncertainty quantification. LLM-Guardian uses semantic anomaly scoring, conformal prediction for distribution-free confidence calibration, adaptive cumulative sum (CUSUM) sequential testing for low-latency detection, and topology-aware GNN reasoning designed to identify coordinated attacks across CAN, Ethernet, and V2X interfaces. In this work, the framework is empirically evaluated on four heterogeneous CAN-bus datasets, while the Ethernet and V2X components are instantiated at the architectural level and left as directions for future multi-protocol experimentation.
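
As background for the CUSUM component mentioned above, the sketch below implements the classic one-sided CUSUM test with a fixed reference mean, slack, and threshold. It is a plain textbook variant, not the adaptive version used in LLM-Guardian; the function name, parameter names, and the score sequence are assumptions for illustration.

```python
from typing import Optional, Sequence


def cusum_first_alarm(samples: Sequence[float], target_mean: float,
                      slack: float, threshold: float) -> Optional[int]:
    """Return the index of the first alarm of a one-sided CUSUM test, or None.

    The statistic accumulates how far each sample exceeds target_mean + slack
    and is clipped at zero, so only sustained upward shifts raise an alarm.
    """
    statistic = 0.0
    for index, value in enumerate(samples):
        statistic = max(0.0, statistic + (value - target_mean - slack))
        if statistic > threshold:
            return index
    return None


# Made-up anomaly scores: quiet traffic, then an upward shift starting at index 4.
scores = [0.0, 0.1, -0.1, 0.0, 1.0, 1.2, 0.9, 1.1]
alarm_index = cusum_first_alarm(scores, target_mean=0.0, slack=0.5, threshold=1.0)
quiet_index = cusum_first_alarm(scores[:4], target_mean=0.0, slack=0.5, threshold=1.0)
```

Here the statistic stays at zero during the quiet phase, reaches 0.5 at index 4, and crosses the threshold at index 5, so the shift is flagged one sample after it begins. A lower threshold would react faster at the cost of more false alarms.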

(This article belongs to the Special Issue Using Large Language Models for Scientific Problem Solving and Engineering Design)

14 pages, 4687 KB

Open Access Article

Extracting Product Improvement Insights from Social Media Comments Using Machine Learning: A Case Study in the Automotive Industry

by Philipp Brunner and Stefanie Vogl

Mach. Learn. Knowl. Extr. 2026, 8(2), 42; https://doi.org/10.3390/make8020042 - 11 Feb 2026

Viewed by 1188

Abstract

This paper presents a scalable machine learning pipeline for extracting actionable, product-related insights from user-generated social media comments. Leveraging sentence embeddings from SBERT and unsupervised clustering (k-Means and agglomerative), the approach structures informal and noisy comments from Instagram and YouTube into topic groups intended to support thematic analysis. A case study on feedback regarding BMW vehicles, comprising more than 26,000 comments, illustrates how the pipeline can reveal recurring user concerns, such as design critiques, usability issues, and technology-related expectations, even in short and unstructured social media comments. The proposed pipeline operates without labeled data or manual annotation, enabling scalable application and transferability across product categories and industries. By transforming large-scale, unstructured consumer feedback into interpretable themes, the pipeline provides product teams with an efficient and structured basis for data-driven product development and improvement.
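
The clustering step of such a pipeline can be illustrated with a small k-Means routine. The sketch below is a bare-bones Lloyd's algorithm in plain Python; the two-dimensional points merely stand in for SBERT sentence embeddings (which would come from a real encoder and have hundreds of dimensions), and the naive initialization from the first `k` points is an assumption made to keep the example deterministic.

```python
from math import dist
from typing import List, Sequence, Tuple


def kmeans(points: Sequence[Sequence[float]], k: int,
           iterations: int = 20) -> Tuple[List[int], List[List[float]]]:
    """Lloyd's algorithm: alternate nearest-centroid assignment and mean update."""
    # Naive deterministic initialization: the first k points become centroids.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(axis) / len(members) for axis in zip(*members)]
    return labels, centroids


# Stand-in "embeddings": two well-separated groups of comments (made-up values).
embeddings = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.2), (4.9, 5.0)]
labels, centroids = kmeans(embeddings, k=2)
```

Comments that share a label would then be read together to name the theme of the group, which is the manual interpretation step the topic groups are meant to support.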

28 pages, 2010 KB

Open Access Article

Prompt Engineering Strategies for Generating Medical Case-Based MCQs with Large Language Models: A Multi-Model Comparative Study

by Somaiya Al Shuraiqi, Adhari AlZaabi and Abdulrahman Aal Abdulsalam

Mach. Learn. Knowl. Extr. 2026, 8(2), 41; https://doi.org/10.3390/make8020041 - 10 Feb 2026

Cited by 1 | Viewed by 2276

Abstract

The use of large language models (LLMs) to automate the generation of medical case-based multiple-choice questions (MCQs) is increasing, but their accuracy, reliability, and educational validity are still not well understood. This comparative study examined nine LLMs with four different prompting methods to evaluate LLM-produced MCQs for clinical coherence and readiness for assessment. A uniform evaluation pipeline was constructed to measure automated text-similarity metrics (BLEU, ROUGE, and METEOR), structural and parsability measures, and operational effectiveness (latency, cost, quality-efficiency ratios). Human validation was performed on the best-performing model and prompt combination (OpenBioLLM-70B with Chain-of-Thought), which demonstrated the best linguistic fidelity and clinically aligned reasoning. Two clinical experts independently reviewed 88 items using a five-domain rubric covering appropriateness, clarity, relevance, distractor quality, and cognitive level. Results indicated significant variation across models and prompting strategies, with Chain-of-Thought yielding the best overall performance. The OpenBioLLM-70B model demonstrated the best overall balance of quality, parsability, and efficiency, achieving a prompt template quality score of 90.4, a consistency score of 88.8, and a response time of 3.28 s, with a quality-per-dollar value of 134.11. The expert ratings confirmed clinical alignment, but there was consensus that distractor quality needed further improvement. These results provide evidence that LLMs under optimal prompting conditions can reliably support MCQ generation and provide large-scale, cost-effective support for medical assessment production.

(This article belongs to the Topic Applications of NLP, AI, and ML in Software Engineering)

38 pages, 3458 KB

Open Access Article

MERGE: Mammogram-Enhanced Representation via Wavelet-Guided CNNs for Computer-Aided Diagnosis of Breast Cancer

by Omneya Attallah

Mach. Learn. Knowl. Extr. 2026, 8(2), 40; https://doi.org/10.3390/make8020040 - 9 Feb 2026

Cited by 2 | Viewed by 955

Abstract

The early and accurate identification of breast cancer is a significant healthcare issue, largely because traditional machine learning approaches rely on handcrafted features that are unable to fully capture the spatial and textural complexity found in mammograms. Even with the advances and improved diagnostic performance made possible by deep learning, most computer-aided diagnosis (CAD) systems based on Convolutional Neural Networks (CNNs) still rely only on single-domain features, normally spatial features, while neglecting important spectral and spatial–spectral features, which limits generalisability, introduces redundancy, and reduces interpretability. Motivated by these limitations, this research proposes MERGE, a novel CAD framework that combines spatial, spectral, and spatial–spectral information in a single multistage architecture built on three fine-tuned CNN models (ResNet-50, Xception, and Inception). The system utilises the Discrete Stationary Wavelet Transform (DSWT) to enhance spectral–spatial features; the Discrete Cosine Transform (DCT) to fuse the features, resulting in enhanced spatial and spatial–spectral representations; and Non-Negative Matrix Factorisation (NNMF) to reduce the feature dimensionality. Finally, Linear Discriminant Analysis (LDA), support vector machine (SVM), and k-nearest neighbours (KNN) classifiers provide the diagnosis. In evaluations on the INBreast and MIAS datasets, accuracy, sensitivity, specificity, and AUC were all around 99%, surpassing state-of-the-art approaches. These findings indicate that MERGE holds significant promise as a dependable and effective diagnostic tool, enhancing the consistency and interpretability of breast cancer screening results.

(This article belongs to the Section Learning)

23 pages, 6344 KB

Open AccessArticle

Visual Perception and Robust Autonomous Following for Orchard Transportation Robots Based on DeepDIMP-ReID

by Renyuan Shen, Yong Wang, Huaiyang Liu, Haiyang Gu, Changxing Geng and Yun Shi

Mach. Learn. Knowl. Extr. 2026, 8(2), 39; https://doi.org/10.3390/make8020039 - 8 Feb 2026

Viewed by 952

Abstract

Dense foliage, severe illumination variations, and interference from multiple individuals with similar appearances in complex orchard environments pose significant challenges for vision-based following robots in maintaining persistent target perception and identity consistency, thereby compromising the stability and safety of fruit transportation operations. To address these challenges, we propose a novel framework, DeepDIMP-ReID, which integrates the Deep Implicit Model Prediction (DIMP) tracker with a person re-identification (ReID) module based on EfficientNet. This visual perception and autonomous following framework is designed for differential-drive orchard transportation robots, aiming to achieve robust target perception and reliable identity maintenance in unstructured orchard settings. The proposed framework adopts a hierarchical perception–verification–control architecture. Visual tracking and three-dimensional localization are jointly achieved using synchronized color and depth data acquired from a RealSense camera, where target regions are obtained via the discriminative model prediction (DIMP) method and refined through an elliptical-mask-based depth matching strategy. Front obstacle detection is performed using DBSCAN-based point cloud clustering techniques. To suppress erroneous following caused by occlusion, target switching, or target reappearance after occlusion, an enhanced HOReID person re-identification module with an EfficientNet backbone is integrated for identity verification at critical decision points. Based on the verified perception results, a state-driven motion control strategy is employed to ensure safe and continuous autonomous following. Extensive long-term experiments conducted in real orchard environments demonstrate that the proposed system achieves a correct tracking rate exceeding 94% under varying human walking speeds, with an average localization error of 0.071 m. In scenarios triggering re-identification, a target discrimination success rate of 93.3% is obtained. These results confirm the effectiveness and robustness of the proposed framework for autonomous fruit transportation in complex orchard environments.

20 pages, 1202 KB

Open AccessPerspective

The Innovative Potential of Artificial Intelligence Applied to Patient Registries to Implement Clinical Guidelines

by Sebastiano Gangemi, Alessandro Allegra, Mario Di Gioacchino, Luca Gammeri, Irene Cacciola and Giorgio Walter Canonica

Mach. Learn. Knowl. Extr. 2026, 8(2), 38; https://doi.org/10.3390/make8020038 - 7 Feb 2026

Cited by 1 | Viewed by 1799

Abstract

Guidelines provide specific recommendations based on the best available medical knowledge, summarizing and balancing the advantages and disadvantages of various diagnostic and treatment options. Currently, consensus methods are the best and most common practice for creating clinical guidelines, even though these approaches have several limitations. However, the rapid pace of biomedical innovation and the growing availability of real-world data (RWD) from clinical registries (containing data such as clinical outcomes, treatment variables, imaging, and laboratory results) call for a complementary paradigm in which recommendations are continuously stress-tested against high-quality, interoperable data and auditable artificial intelligence (AI) pipelines. AI, drawing on information retrieved from patient registries, can optimize the process of creating guidelines: it can analyze large volumes of data and support essential tasks such as feature identification, prediction, classification, and pattern recognition. In this work, we propose a four-phase lifecycle, comprising data curation, causal analysis and estimation, objective validation, and real-time updates, complemented by governance and machine learning operations (MLOps). A comparative analysis with consensus-only methods, a pilot protocol, and a compliance checklist are provided. We believe that AI will be a valuable support in drafting clinical guidelines, complementing expert consensus and ensuring continuous updates to standards while providing a higher level of evidence. The integration of AI with high-quality patient registries has the potential to substantially modernize guideline development, enabling continuously updated, data-driven recommendations.

(This article belongs to the Topic AI and Computational Methods for Modelling, Simulations and Optimizing of Advanced Systems: Innovations in Complexity, 2nd Edition)

25 pages, 11437 KB

Open AccessArticle

Enhancing the Extraction of GHG Emission-Reduction Targets from Sustainability Reports Using Vision Language Models

by Lars Wilhelmi, Christian Bruns and Matthias Schumann

Mach. Learn. Knowl. Extr. 2026, 8(2), 37; https://doi.org/10.3390/make8020037 - 5 Feb 2026

Viewed by 1121

Abstract

This study investigates how Vision Language Models (VLMs) can be used and methodically configured to extract Environmental, Social, and Governance (ESG) metrics from corporate sustainability reports, addressing the limitations of existing text-only and manual ESG data-extraction approaches. Using the Design Science Research Methodology, we developed an extraction artifact comprising a curated page-level dataset containing greenhouse gas (GHG) emission-reduction targets, an automated evaluation pipeline, model and text-preprocessing comparisons, and iterative prompt and few-shot refinement. Pages from oil and gas sustainability reports were processed directly by VLMs to preserve visual–textual structure, enabling a controlled comparison of text, image, and combined input modalities, with extraction quality assessed at page and attribute level using F1-scores. Among tested models, Mistral Small 3.2 demonstrated the most stable performance and was used to evaluate image, text, and combined modalities. Combined text + image modality performed best (F1 = 0.82), particularly on complex page layouts. The findings demonstrate how to effectively integrate visual and textual cues for ESG metric extraction with VLMs, though challenges remain for visually dense layouts and avoiding inference-based hallucinations.

(This article belongs to the Special Issue Using Large Language Models for Scientific Problem Solving and Engineering Design)

22 pages, 541 KB

Open AccessArticle

Perceiving AI as an Epistemic Authority or Algority: A User Study on the Human Attribution of Authority to AI

by Frida Milella and Federico Cabitza

Mach. Learn. Knowl. Extr. 2026, 8(2), 36; https://doi.org/10.3390/make8020036 - 5 Feb 2026

Cited by 1 | Viewed by 2735

Abstract

The increasing integration of artificial intelligence (AI) in decision-making processes has amplified discussions surrounding algorithmic authority, that is, the perceived epistemic legitimacy of AI systems over human judgment. This study investigates how individuals attribute epistemic authority to AI, focusing on psychological, contextual, and sociotechnical factors. Existing research highlights the importance of trust in automation, perceived performance, and moral frameworks in shaping such attributions. Unlike prior conceptual or philosophical accounts of algorithmic authority, our study adopts a relational and empirically grounded perspective by operationalizing algority through psychometric measures and contextual assessments. To address knowledge gaps in the micro-level dynamics of this phenomenon, we conducted an empirical study using psychometric tools and scenario-based assessments. Here, we report key findings from a survey of 610 participants, revealing significant correlations between trust in automation (TiA), perceptions of automated performance (PAS), and the propensity to defer to AI, particularly in high-stakes scenarios like criminal justice and job-matching. Trust in automation emerged as a primary factor, while moral attitudes moderated deference in ethically sensitive contexts. Our findings highlight the practical relevance of transparency and explainability for supporting critical engagement with AI outputs and for informing the design of contextually appropriate decision support. This study contributes to understanding algorithmic authority as a multidimensional construct, offering empirically grounded insights for designing AI systems that are trustworthy and context-sensitive.

(This article belongs to the Topic Theories and Applications of Human-Computer Interaction)

25 pages, 911 KB

Open AccessArticle

Migraine and Epilepsy Discrimination Using DTCWT and Random Subspace Ensemble Classifier

by Tuba Nur Subasi and Abdulhamit Subasi

Mach. Learn. Knowl. Extr. 2026, 8(2), 35; https://doi.org/10.3390/make8020035 - 4 Feb 2026

Viewed by 569

Abstract

Migraine and epilepsy are common neurological disorders that share overlapping symptoms, such as visual disturbances and altered consciousness, making accurate diagnosis challenging. Although their underlying mechanisms differ, both conditions involve recurrent irregular brain activity, and traditional EEG-based diagnosis relies heavily on clinical interpretation, which may be subjective and insufficient for clear differentiation. To address this challenge, this study introduces an automated EEG classification framework combining Dual Tree Complex Wavelet Transform (DTCWT) for feature extraction with a Random Subspace Ensemble Classifier for multi-class discrimination. EEG data recorded under photic and nonphotic stimulation were analyzed to capture both temporal and frequency characteristics. DTCWT proved effective in modeling the non-stationary nature of EEG signals and extracting condition-specific features, while the ensemble classifier improved generalization by training multiple models on diverse feature subsets. The proposed system achieved an average accuracy of 99.50%, along with strong F-measure, AUC, and Kappa scores. Notably, although previous studies suggest heightened EEG activity in migraine patients during flash stimulation, findings here indicate that flash stimulation alone does not reliably distinguish migraine from epilepsy. Overall, this research highlights the promise of advanced signal processing and machine learning techniques in enhancing diagnostic precision for complex neurological disorders.

(This article belongs to the Section Learning)

22 pages, 7547 KB

Open AccessArticle

AuraViT-FL: A Resource-Efficient 2D Hybrid Transformer Framework for Federated Lung Tumor Segmentation

by Mohamed A. Abdelhamed, Hana M. Nassef, Sara Abdelnasser, Sahar Selim and Lobna A. Said

Mach. Learn. Knowl. Extr. 2026, 8(2), 34; https://doi.org/10.3390/make8020034 - 3 Feb 2026

Cited by 1 | Viewed by 884

Abstract

Accurate lung tumor segmentation using computed tomography (CT) scans is needed for efficient tumor treatment. However, the development of deep learning models is often constrained by strict patient privacy regulations that limit direct data sharing. This work presents a system that enables multi-institutional collaboration while training high-quality lung tumor segmentation models without requiring access to sensitive patient data. The proposed framework features the AuraViT suite. It includes the standard AuraViT, a hybrid model with 136 million parameters that combines a Vision Transformer (ViT) encoder, Atrous Spatial Pyramid Pooling (ASPP), and attention-gated residual connections, as well as the Lightweight AuraViT (LAURA) family (Small, Tiny, and Mobile). The LAURA variants are designed for resource-constrained environments and potential edge deployment scenarios. Training is conducted on publicly available datasets (MSD Lung and NSCLC) in a simulated five-client federated learning setup that emulates collaboration among institutions while ensuring patient privacy. The setup uses FedProx, adaptive weighted aggregation, and a dynamic virtual client strategy to handle data and system differences. The framework is further evaluated through ablation studies on model architecture and feature importance. The results show that the standard AuraViT-FL achieves a global mean Dice score of 80.81%, while maintaining performance close to centralized training. Additionally, the LAURA variants show a better trade-off between accuracy and efficiency. Notably, the Mobile variant with ∼5 M parameters reduces model complexity by over 96% while maintaining competitive performance (82.96% Dice on MSD Lung).
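The two federated ingredients named in this abstract, the FedProx proximal term and weighted aggregation, can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, model parameters are flattened to plain lists of floats, and the aggregation weights are simply client sample counts, whereas the abstract's "adaptive" weighting scheme is not specified there.

```python
from typing import List, Sequence


def proximal_penalty(local: Sequence[float], global_: Sequence[float], mu: float) -> float:
    """FedProx proximal term: (mu / 2) * ||w_local - w_global||^2.

    A client adds this to its task loss so that local updates stay
    close to the current global model.
    """
    return 0.5 * mu * sum((l - g) ** 2 for l, g in zip(local, global_))


def weighted_aggregate(
    client_params: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Average client parameters, weighting each client by its sample count.

    Assumption: plain sample-count weighting stands in for the
    unspecified adaptive weighting mentioned in the abstract.
    """
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(params[i] * n for params, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]
```

For example, `weighted_aggregate([[1.0, 0.0], [3.0, 4.0]], [1, 3])` gives `[2.5, 3.0]`, because the second client holds three times as many samples. As a side check on the reported figures, about 5 M parameters out of 136 M is roughly 3.7% of the original size, which is consistent with the stated reduction of over 96%.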

(This article belongs to the Topic Applications of Image and Video Processing in Medical Imaging)

22 pages, 4304 KB

Open AccessArticle

Optimal Information Retrieval System in E-Learning Using Optimization-Driven Bidirectional Long Short-Term Memory

by Hemn Barzan Abdalla and Awder Ahmed

Mach. Learn. Knowl. Extr. 2026, 8(2), 33; https://doi.org/10.3390/make8020033 - 2 Feb 2026

Viewed by 718

Abstract

In an e-learning platform, information retrieval plays a central role in delivering content efficiently. The education sector has recently moved further towards online learning systems, generating large amounts of educational content tailored to students' criteria. Several methods have been employed for analysing these data in recent studies; however, they have suffered from various limitations, including reliability issues, security problems, unauthorized disclosure of data, high cost, and interpretability challenges. To tackle these issues, this research proposes a war strategy optimization-based bidirectional long short-term memory (WSO-BiLSTM) model, designed to reduce sensitivity to local optima and improve convergence stability, thereby achieving robust retrieval performance. The BiLSTM model captures the semantic information of documents in both directions for effective retrieval. The model's key features are extracted by various feature extraction methods. The WSO algorithm's dynamic movement towards the optimal solution enables the proposed model to retrieve information more accurately. Experiments on an e-learning dataset show that, with a 90% training split, the proposed method achieves 97.90% accuracy, 98.45% precision, 97.90% F1-score, and 97.35% recall.
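As a quick consistency check on the reported metrics, the F1-score can be recomputed from the stated precision and recall, assuming the standard definition of F1 as their harmonic mean (the abstract does not say how the scores were averaged across classes):

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.9845 \times 0.9735}{0.9845 + 0.9735}
    = \frac{1.9168}{1.9580}
    \approx 0.9790
```

This agrees with the reported F1-score of 97.90%.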

(This article belongs to the Section Data)

20 pages, 30275 KB

Open AccessArticle

Manifold Integration of Lung Emphysema Signatures (MILES): A Radiomic-Based Study

by Marek Socha, Agata Durawa, Małgorzata Jelito, Katarzyna Dziadziuszko, Witold Rzyman, Edyta Szurowska and Joanna Polanska

Mach. Learn. Knowl. Extr. 2026, 8(2), 32; https://doi.org/10.3390/make8020032 - 30 Jan 2026

Viewed by 744

Abstract

Chronic obstructive pulmonary disease (COPD) is the third leading cause of death worldwide, and emphysema is present in the majority of affected patients and can be identified on computed tomography (CT). This study investigated whether radiomic features derived from automatically and adaptively segmented low-attenuation lung regions can capture distinct imaging characteristics of COPD beyond conventional emphysema measures. Radiomic features were extracted from 6078 chest CT scans of 2243 participants from the COPDGene cohort. Emphysematous regions were segmented using the MimSeg method based on Gaussian mixture modelling with patient-adjusted thresholding, and radiomic features were computed for individual lesion clusters and aggregated per patient using summary statistics, yielding 780 features per subject. Uniform Manifold Approximation and Projection (UMAP) was used to generate a low-dimensional embedding, and feature contributions were evaluated using SHAP analysis and statistical testing. The resulting embedding demonstrated structured patterns broadly aligned with Global Initiative for Chronic Obstructive Lung Disease (GOLD) stages, with greater overlap among GOLD 0–2 and more consolidated groupings for GOLD 3 and 4, reflecting differences in disease severity. The most influential features were predominantly derived from Grey Level Run Length Matrix measures, capturing textural heterogeneity and spatial organisation of emphysematous changes that are not directly described by standard density-based metrics. These findings suggest that radiomic analysis of adaptively segmented CT data may provide complementary and structurally distinct information relative to conventional emphysema measures, supporting a more nuanced characterisation of emphysema patterns in COPD.

(This article belongs to the Section Learning)

37 pages, 13544 KB

Open AccessArticle

Attention-Driven Feature Extraction for XAI in Histopathology Leveraging a Hybrid Xception Architecture for Multi-Cancer Diagnosis

by Shirin Shila, Md. Safayat Hossain, Md Fuyad Al Masud, Mohammad Badrul Alam Miah, Afrig Aminuddin and Zia Muhammad

Mach. Learn. Knowl. Extr. 2026, 8(2), 31; https://doi.org/10.3390/make8020031 - 28 Jan 2026

Viewed by 1778

Abstract

Automated and accurate classification of histopathology images is necessary for the early detection of cancer, especially common cancers such as Colorectal Cancer (CRC) and Lung Cancer (LC). However, classical deep learning frameworks often struggle because intra-class variation is large, different classes look alike, and image quality is inconsistent. To address these constraints, a multi-layer diagnostic framework is proposed. The process starts with a preprocessing pipeline involving gamma correction, bilateral filtering, and adaptive CLAHE, which yields statistically significant changes in quantitative image quality measures. The hybrid attention architecture comprises an Xception backbone, a Convolutional Block Attention Module (CBAM), a Transformer block, and an MLP classifier, combining local features with global context. When tested on three publicly available datasets, the proposed model achieved classification performance of 99.98%, 99.58%, and 99.33% on LC25000, CRC-VAL-HE-7K, and NCT-CRC-HE-100K, respectively. To enhance transparency, detailed explainability analyses are conducted using layer-wise feature visualization and Grad-CAM. Finally, a real-world application of the framework is demonstrated through its implementation in a web-based platform, which can serve as a useful and easy-to-use tool to support pathology diagnosis.

(This article belongs to the Section Learning)

35 pages, 2414 KB

Open AccessArticle

Hierarchical Caching for Agentic Workflows: A Multi-Level Architecture to Reduce Tool Execution Overhead

by Farhana Begum, Craig Scott, Kofi Nyarko, Mansoureh Jeihani and Fahmi Khalifa

Mach. Learn. Knowl. Extr. 2026, 8(2), 30; https://doi.org/10.3390/make8020030 - 27 Jan 2026

Viewed by 2029

Abstract

Large Language Model (LLM) agents depend heavily on multiple external tools such as APIs, databases and computational services to perform complex tasks. However, these tool executions create latency and introduce costs, particularly when agents handle similar queries or workflows. Most current caching methods focus on LLM prompt–response pairs or execution plans and overlook redundancies at the tool level. To address this, we designed a multi-level caching architecture that captures redundancy at both the workflow and tool level. The proposed system integrates four key components: (1) hierarchical caching that operates at both the workflow and tool level to capture coarse and fine-grained redundancies; (2) dependency-aware invalidation using graph-based techniques to maintain consistency when write operations affect cached reads across execution contexts; (3) category-specific time-to-live (TTL) policies tailored to different data types, e.g., weather APIs, user location, database queries, and filesystem and computational tasks; and (4) session isolation to ensure multi-tenant cache safety through automatic session scoping. We evaluated the system using synthetic data with 2.25 million queries across ten configurations in fifteen runs. In addition, we conducted four targeted evaluations (write intensity robustness from 4 to 30% writes, personalized memory effects under isolated vs. shared cache modes, workflow-level caching comparison, and workload sensitivity across five access distributions) on an additional 2.565 million queries, bringing the total experimental scope to 4.815 million executed queries. The architecture achieved 76.5% caching efficiency, reducing query processing time by 13.3× and lowering estimated costs by 73.3% compared to a no-cache baseline. Multi-tenant testing with fifteen concurrent tenants confirmed robust session isolation and 74.1% efficiency under concurrent workloads. Our evaluation used controlled synthetic workloads following Zipfian distributions, which are commonly used in caching research. While absolute hit rates vary by deployment domain, the architectural principles of hierarchical caching, dependency tracking and session isolation remain broadly applicable.
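Three of the four components listed in this abstract (two-level caching, category-specific TTLs, and session scoping) can be illustrated with a short sketch. It is written under stated assumptions and is not the authors' system: the class names, the TTL values, and the `WORKFLOW_TTL` constant are all hypothetical, and dependency-aware invalidation is deliberately left out.

```python
import time
from typing import Any, Callable, Dict, Hashable, Sequence, Tuple

# Placeholder TTLs in seconds per data category. The abstract names the
# categories but gives no values, so these numbers are assumptions.
TTL_SECONDS: Dict[str, float] = {
    "weather": 600.0,
    "location": 60.0,
    "database": 300.0,
    "compute": 3600.0,
}
WORKFLOW_TTL = 120.0  # assumed TTL for whole-workflow results

_MISS = object()  # sentinel, so that a cached None still counts as a hit


class TTLCache:
    """In-memory cache whose entries expire after a per-entry TTL."""

    def __init__(self, clock: Callable[[], float]) -> None:
        self._clock = clock
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Any:
        entry = self._store.get(key)
        if entry is None:
            return _MISS
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # drop the expired entry lazily
            return _MISS
        return value

    def put(self, key: Hashable, value: Any, ttl: float) -> None:
        self._store[key] = (self._clock() + ttl, value)


# One workflow step: (category, tool name, hashable args, callable).
Step = Tuple[str, str, Tuple[Hashable, ...], Callable[..., Any]]


class HierarchicalCache:
    """Workflow-level cache in front of a tool-level cache."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self.workflow = TTLCache(clock)
        self.tool = TTLCache(clock)
        self.tool_calls = 0  # number of real tool executions

    def call_tool(self, session: str, step: Step) -> Any:
        category, tool, args, fn = step
        key = (session, tool, args)  # session in the key isolates tenants
        cached = self.tool.get(key)
        if cached is not _MISS:
            return cached
        self.tool_calls += 1
        value = fn(*args)
        self.tool.put(key, value, TTL_SECONDS[category])
        return value

    def run_workflow(self, session: str, query: str, steps: Sequence[Step]) -> Tuple[Any, ...]:
        key = (session, query)
        cached = self.workflow.get(key)
        if cached is not _MISS:
            return cached  # coarse hit: no tool is touched at all
        results = tuple(self.call_tool(session, step) for step in steps)
        self.workflow.put(key, results, WORKFLOW_TTL)
        return results
```

A repeated query within a session is served from the workflow level; a differently worded query that needs the same tool call with the same arguments is served from the tool level; and a different session never sees another session's entries, because the session identifier is part of every key.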

(This article belongs to the Section Learning)

26 pages, 1707 KB

Open AccessArticle

Axiom Generation for Automated Ontology Construction from Texts Through Schema Mapping

by Tsitsi Zengeya, Jean Vincent Fonou-Dombeu and Mandlenkosi Gwetu

Mach. Learn. Knowl. Extr. 2026, 8(2), 29; https://doi.org/10.3390/make8020029 - 26 Jan 2026

Viewed by 1702

Abstract

Ontology learning from unstructured text has become a critical task for knowledge-driven applications in Big Data and Artificial Intelligence. While significant advances have been made in the automatic extraction of concepts and relations using neural and Transformer-based models, the generation of formal Description Logic axioms required for constructing logically consistent and computationally tractable ontologies remains largely underexplored. This paper puts forward a novel pipeline for automated axiom generation through schema mapping. Our paper introduces three key innovations: a deterministic mapping framework that guarantees logical consistency (unlike stochastic Large Language Models); guaranteed formal consistency verified by OWL reasoners (unaddressed by prior statistical methods); and a transparent, scalable bridge from neural extractions to symbolic logic, eliminating manual post-processing. Technically, the pipeline builds upon the outputs of a Transformer-based fusion model for joint concept and relation extraction. We then map lexical relational phrases to formal ontological properties through a lemmatization-based schema alignment step. Entity typing and hierarchical induction are then employed to infer class structures, as well as domain and range constraints. Using RDFLib and structured data processing, we transform the extracted triples into both assertional (ABox) and terminological (TBox) axioms expressed in Description Logic. Experimental evaluation on benchmark datasets (CoNLL04 and NYT) demonstrates the efficacy of the approach, with expert validation showing high acceptance rates (>95%) and reasoners confirming zero inconsistencies. The pipeline thus establishes a reliable, scalable foundation for automated ontology learning, advancing the field from extraction to formally verifiable knowledge base construction.
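The mapping step described in this abstract, from lemmatized relational phrases to ontology properties and then to ABox and TBox axioms, can be sketched roughly as follows. This is an illustration under stated assumptions rather than the paper's pipeline: the lemma table stands in for a real lemmatizer, the schema entries and all names are hypothetical, and axioms are emitted as OWL 2 functional-style syntax strings instead of being built with RDFLib.

```python
from typing import Dict, List, Sequence, Tuple

# Toy lemma table standing in for a real lemmatizer (assumption).
LEMMAS: Dict[str, str] = {"lives": "live", "lived": "live", "works": "work", "worked": "work"}

# Hypothetical schema: lemmatized relational phrase -> ontology property.
SCHEMA: Dict[str, str] = {"live in": "livesIn", "work for": "worksFor"}

# (subject, subject type, relational phrase, object, object type)
TypedTriple = Tuple[str, str, str, str, str]


def normalize(phrase: str) -> str:
    """Lowercase the phrase and replace each token with its lemma."""
    return " ".join(LEMMAS.get(tok, tok) for tok in phrase.lower().split())


def generate_axioms(triples: Sequence[TypedTriple]) -> Tuple[List[str], List[str]]:
    """Return (ABox, TBox) axioms for the triples whose phrase maps to the schema."""
    abox: Dict[str, None] = {}  # dict keys give ordered de-duplication
    tbox = set()
    for subj, subj_type, phrase, obj, obj_type in triples:
        prop = SCHEMA.get(normalize(phrase))
        if prop is None:
            continue  # the mapping is deterministic: unmapped phrases are skipped
        abox[f"ClassAssertion(:{subj_type} :{subj})"] = None
        abox[f"ClassAssertion(:{obj_type} :{obj})"] = None
        abox[f"ObjectPropertyAssertion(:{prop} :{subj} :{obj})"] = None
        # Domain and range are inferred from the entity types seen with the property.
        tbox.add(f"ObjectPropertyDomain(:{prop} :{subj_type})")
        tbox.add(f"ObjectPropertyRange(:{prop} :{obj_type})")
    return list(abox), sorted(tbox)
```

Given the triples `("Alice", "Person", "lives in", "Paris", "Location")` and `("Bob", "Person", "Lived in", "Rome", "Location")`, both phrases normalize to `live in` and map to the same `livesIn` property, so the TBox contains a single domain axiom and a single range axiom for it.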

(This article belongs to the Section Data)

27 pages, 4789 KB

Open AccessArticle

Assessing Interaction Quality in Human–AI Dialogue: An Integrative Review and Multi-Layer Framework for Conversational Agents

by Luca Marconi, Luca Longo and Federico Cabitza

Mach. Learn. Knowl. Extr. 2026, 8(2), 28; https://doi.org/10.3390/make8020028 - 26 Jan 2026

Cited by 3 | Viewed by 4647

Abstract

Conversational agents are transforming digital interactions across various domains, including healthcare, education, and customer service, thanks to advances in large language models (LLMs). As these systems become more autonomous and ubiquitous, understanding what constitutes high-quality interaction from a user perspective is increasingly critical. Despite growing empirical research, the field lacks a unified framework for defining, measuring, and designing user-perceived interaction quality in human–artificial intelligence (AI) dialogue. Here, we present an integrative review of 125 empirical studies published between 2017 and 2025, spanning text-, voice-, and LLM-powered systems. Our synthesis identifies three consistent layers of user judgment: a pragmatic core (usability, task effectiveness, and conversational competence), a social–affective layer (social presence, warmth, and synchronicity), and an accountability and inclusion layer (transparency, accessibility, and fairness). These insights are formalised into a four-layer interpretive framework (Capacity, Alignment, Levers, and Outcomes), operationalised via a Capacity × Alignment matrix that maps distinct success and failure regimes. The framework also identifies design levers such as anthropomorphism, role framing, and onboarding strategies. It consolidates constructs, positions inclusion and accountability as central to quality, and offers actionable guidance for evaluation and design. This research redefines interaction quality as a dialogic construct, shifting the focus from system performance to co-orchestrated, user-centred dialogue quality.

(This article belongs to the Special Issue Using Large Language Models for Scientific Problem Solving and Engineering Design)

45 pages, 2071 KB

Open AccessSystematic Review

Artificial Intelligence Techniques for Thyroid Cancer Classification: A Systematic Review

by Yanche Ari Kustiawan, Khairil Imran Ghauth, Sakina Ghauth, Liew Yew Toong and Sien Hui Tan

Mach. Learn. Knowl. Extr. 2026, 8(2), 27; https://doi.org/10.3390/make8020027 - 23 Jan 2026

Cited by 1 | Viewed by 2349

Abstract

Artificial intelligence (AI), particularly machine learning and deep learning architectures, has been widely applied to support thyroid cancer diagnosis, but existing evidence on its performance and limitations remains scattered across techniques, tasks, and data types. This systematic review synthesizes recent work on knowledge extraction from heterogeneous imaging and clinical data for thyroid cancer diagnosis and detection published between 2021 and 2025. We searched eight major databases, applied predefined inclusion and exclusion criteria, and assessed study quality using the Newcastle–Ottawa Scale. A total of 150 primary studies were included and analyzed with respect to AI techniques, diagnostic tasks, imaging and non-imaging modalities, model generalization, explainable AI, and recommended future directions. We found that deep learning, particularly convolutional neural networks, U-Net variants, and transformer-based models, dominated recent work, mainly for ultrasound-based benign–malignant classification, nodule detection, and segmentation, while classical machine learning, ensembles, and advanced paradigms remained important in specific structured-data settings. Ultrasound was the primary modality, complemented by cytology, histopathology, cross-sectional imaging, molecular data, and multimodal combinations. Key limitations included diagnostic ambiguity, small and imbalanced datasets, limited external validation, gaps in model generalization, and the use of largely non-interpretable black-box models with only partial use of explainable AI techniques. This review provides a structured, machine learning-oriented evidence map that highlights opportunities for more robust representation learning, workflow-ready automation, and trustworthy AI systems for thyroid oncology.

(This article belongs to the Section Thematic Reviews)

17 pages, 3892 KB

Open Access Article

Transformer-Driven Semi-Supervised Learning for Prostate Cancer Histopathology: A DINOv2–TransUNet Framework

by Rubina Akter Rabeya, Jeong-Wook Seo, Nam Hoon Cho, Hee-Cheol Kim and Heung-Kook Choi

Mach. Learn. Knowl. Extr. 2026, 8(2), 26; https://doi.org/10.3390/make8020026 - 23 Jan 2026

Cited by 1 | Viewed by 1162

Abstract

Prostate cancer is diagnosed through a comprehensive examination of histopathology slides, which is time-consuming and requires expert interpretation. To reduce this burden, we developed a semi-supervised learning technique that combines transformer-based representation learning with a custom TransUNet classifier. To capture a wide range of morphological structures without manual annotation, our method pretrains DINOv2 on 10,000 unlabeled prostate tissue patches. A bespoke convolutional neural network (CNN)-based decoder then takes the transformer-derived features and uses residual upsampling and carefully constructed skip connections to merge information from multiple spatial scales. Expert pathologists labeled only 20% of the patches in the whole dataset; the remaining unlabeled samples were incorporated through a consistency-driven learning method that promoted reliable predictions across various augmentations. The model achieved precision and recall scores of 91.81% and 89.02%, respectively, and an accuracy of 93.78% on an additional test set. These results exceed the performance of a conventional U-Net and a baseline encoder–decoder network. Overall, combining localized CNN decoding with global transformer attention provides a reliable method for prostate cancer classification in situations with little annotated data.

(This article belongs to the Topic Applications of Image and Video Processing in Medical Imaging)
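The abstract above mentions a consistency-driven learning method for the unlabeled patches but does not state the loss. As a rough illustration only, the sketch below shows one common generic form of consistency regularization: a supervised cross-entropy term on labeled samples plus a penalty on the disagreement between predictions for two augmented views of each unlabeled sample. The function names, the mean-squared-error penalty, and the `weight` parameter are all assumptions made for this example and are not taken from the article.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Supervised loss for one labeled sample."""
    return -math.log(softmax(logits)[label])


def consistency(logits_a, logits_b):
    """Mean squared difference between the predicted distributions
    for two augmented views of the same unlabeled sample.
    (Assumed penalty; other choices such as KL divergence are common.)"""
    pa, pb = softmax(logits_a), softmax(logits_b)
    return sum((x - y) ** 2 for x, y in zip(pa, pb)) / len(pa)


def semi_supervised_loss(labeled, unlabeled, weight=1.0):
    """labeled:   list of (logits, label) pairs
    unlabeled: list of (logits_view_a, logits_view_b) pairs
    weight:    hypothetical trade-off between the two terms"""
    sup = (sum(cross_entropy(l, y) for l, y in labeled) / len(labeled)
           if labeled else 0.0)
    unsup = (sum(consistency(a, b) for a, b in unlabeled) / len(unlabeled)
             if unlabeled else 0.0)
    return sup + weight * unsup


# Toy example: one labeled patch, two unlabeled patches with two views each.
labeled = [([2.0, 0.5], 0)]
unlabeled = [([1.0, 0.0], [0.8, 0.1]), ([0.2, 1.5], [0.2, 1.5])]
print(semi_supervised_loss(labeled, unlabeled, weight=0.5))
```

In this form, the unlabeled term is zero when the model gives the same prediction for both views, so minimizing it pushes the model toward augmentation-invariant outputs without needing labels.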

Mach. Learn. Knowl. Extr., Volume 8, Issue 2 (February 2026) – 28 articles
