Machine Learning and Knowledge Extraction

20 pages, 20102 KB

Open Access Article

Explainable Glaucoma Screening via Optic Disc Localization and Comparative Class Activation Map-Based Analysis

by Oscar Ramos-Soto, Ezequiel Perez-Zarate, Jorge Ramos-Frutos, Diego Oliva, Marco Pérez-Cisneros, Guillermo Sosa-Gómez and Sandra E. Balderas-Mata

Mach. Learn. Knowl. Extr. 2026, 8(7), 173; https://doi.org/10.3390/make8070173 (registering DOI) - 24 Jun 2026

Abstract

Glaucoma, the leading cause of irreversible vision loss, often goes undetected in early stages due to its asymptomatic behaviour. Early diagnosis typically involves visual analysis of the optic disc (OD) in eye fundus images. Machine and deep learning techniques have emerged as valuable tools for automating this process; however, their integration into clinical practice still faces limitations. These challenges include the presence of image regions that are not directly related to glaucoma assessment, such as retinal vasculature, the macula, and background structures, which may introduce irrelevant information and negatively affect classification performance, as well as a general lack of transparency in the decision-making process. This article proposes a methodology that enhances both the accuracy and interpretability of glaucoma detection by focusing solely on the OD region. First, a metaheuristic-based strategy is employed for precise OD detection and cropping, generating an OD-centric dataset with glaucoma-labeled images, which is composed of different public datasets. Four convolutional neural networks (CNNs), namely VGG-19, MobileNet-V2, ResNet-50, and DenseNet-161, are trained on this dataset using transfer learning. To address the need for model explainability, Grad-CAM, Score-CAM, and Eigen-CAM are applied to the trained models to generate post hoc visual explanations of their predictions. The experimental results showed that DenseNet-161 achieved the best overall performance on the assembled public dataset, using an 80%-10%-10% training, validation, and testing split, with a test accuracy of 0.9369 and an AUC of 0.9831. By isolating the OD region and incorporating explainability techniques, the methodology provides a robust and interpretable second opinion, supporting more accurate and efficient glaucoma screening. Full article

(This article belongs to the Topic Applications of Image and Video Processing in Medical Imaging)

28 pages, 7592 KB

Open Access Article

An Interactive Visualization Tool for Mining, Comparing Association Rules and Frequent Itemsets Across Multiple Datasets

by Yao Yao, Frank Klawonn, Frank Müller, Dominik Schröder, Sandra Steffens, Marie Mikuteit, Georg M. N. Behrens, Alexandra Dopfer-Jablonka, Lorenz Grigull and Kai Vahldiek

Mach. Learn. Knowl. Extr. 2026, 8(7), 172; https://doi.org/10.3390/make8070172 (registering DOI) - 24 Jun 2026

Abstract

As healthcare data grows in volume and complexity, the use of association rule mining (ARM) and frequent itemset mining (FISM) in disease analysis holds great potential for data-driven decision-making, personalized treatment strategies, and disease prevention. This study introduces an extensible, interactive, self-developed visualization tool designed specifically for ARM and FISM, enabling the intuitive exploration of medical datasets. The tool incorporates an innovative preprocessing method that binarizes datasets from various scaling systems using a systematic multi-threshold evaluation, ensuring standardized analysis across diverse data sources. Its interactive design lets users dynamically explore relevant patterns individually, enhancing both the interpretability and usability of customized results. In addition, the tool integrates exploratory statistical assessments to support the interpretation and comparison of resulting association rules (ARs) and frequent itemsets (FISs). In this paper, we evaluate the tool using two pilot datasets, one on long COVID symptoms and one incorporating rare diseases (RDs), and we also provide sample datasets for user testing. Full article

(This article belongs to the Section Data)

24 pages, 12811 KB

Open Access Article

Real-Time Prediction of Reading Comprehension Levels from Beta-Band EEG Signals Using Kernel Ridge Regression and Principal Component Analysis

by Nuphar Avital, Dana Sadan, May Shikly and Dror Malka

Mach. Learn. Knowl. Extr. 2026, 8(7), 171; https://doi.org/10.3390/make8070171 (registering DOI) - 24 Jun 2026

Abstract

Real-time assessment of reading comprehension remains a challenge in educational research. Traditional evaluation methods, such as questionnaires, provide delayed and retrospective measures and therefore do not capture the dynamic nature of comprehension during reading. This exploratory study investigates whether beta-band electroencephalography (EEG) activity can be used to estimate EEG-derived indicators related to reading comprehension during academic reading. The study included 40 university students who read a conceptually demanding scientific text while EEG signals were continuously recorded. Beta-band activity (13–30 Hz) was extracted from six cognition-related channels and segmented into non-overlapping 2 s windows. Principal component analysis (PCA) was applied for dimensionality reduction, followed by kernel ridge regression (KRR) for prediction. At the window level, the proposed KRR–PCA framework achieved a mean absolute error (MAE) of 5.797, a root mean square error (RMSE) of 7.783, an MAE-based accuracy of 94.2%, and an explained variance of R² = 0.275 on a held-out test set. At the participant level, aggregated predictions showed a significant correlation with questionnaire-based comprehension scores (r = 0.59), indicating that EEG-derived features captured meaningful inter-individual differences. The framework also generated time-resolved prediction profiles that reflected fluctuations in EEG-derived comprehension estimates during reading. These findings suggest that beta-band EEG contains information related to reading comprehension and may support the development of future EEG-based educational monitoring systems. Further validation using larger cohorts and time-resolved comprehension measures is needed to confirm the practical applicability of the approach. Full article
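The error metrics reported in this abstract can be sketched in a few lines. The snippet below is an illustration only: the scores are hypothetical and are not study data, and the reading of "MAE-based accuracy" as 100 minus MAE on a 0–100 scale is an assumption (it is numerically consistent with the reported MAE of 5.797 and accuracy of 94.2%, but the abstract does not define it).

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2

# Hypothetical comprehension scores on a 0-100 scale (not from the study).
y_true = [70.0, 80.0, 90.0, 60.0]
y_pred = [72.0, 78.0, 85.0, 65.0]
mae, rmse, r2 = regression_metrics(y_true, y_pred)

# Assumption: "MAE-based accuracy" is taken as 100 - MAE on a 0-100 scale.
mae_accuracy = 100.0 - mae
```

Under this reading, a high MAE-based accuracy can coexist with a modest R², because the former depends on the score scale while the latter measures variance explained relative to predicting the mean.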

(This article belongs to the Special Issue Clinically Robust and Transparent AI-Assisted Medical Diagnostics: From Learning Dynamics to Real-World Deployment)

23 pages, 24608 KB

Open Access Article

Harmonic and Phase-Modulated Activation Functions for Implicit Neural Representations: A Comprehensive Benchmark Study

by Ahmad S. Tarawneh, Omar Lasassmeh, Anas A. Alkasasbeh, Abdulkareem Alzahrani, Khalid Almohammadi, Maha Alamri and Ahmad B. Hassanat

Mach. Learn. Knowl. Extr. 2026, 8(6), 170; https://doi.org/10.3390/make8060170 (registering DOI) - 21 Jun 2026

Abstract

It is well-known that activation functions are crucial in determining spectral expressiveness, training dynamics, and reconstruction accuracy in implicit neural representations (INRs), which employ coordinate-based multilayer perceptrons to represent continuous signals. Despite showing excellent performance, sinusoidal activations, for example SIREN, are limited in their adaptability to diverse signal types due to their fixed harmonic structure. In this paper, we propose two novel periodic activation functions for INRs. (1) Harmonic generalizes sinusoidal activations by combining the fundamental frequency with learned second and third harmonics through per-neuron trainable amplitude coefficients, resulting in a richer spectral basis within the SIREN initialization framework. (2) PM-FINER (Phase-Modulated FINER) extends the variable-periodic FINER activation by embedding frequency modulation synthesis directly into the instantaneous phase, enabling data-driven phase distortion via a learnable modulation index and carrier ratio. We conducted comprehensive experiments spanning nine architectural configurations (including SIREN, WIRE, FINER, Gaussian, Harmonic, PM-FINER, and an additional direct comparison against the Subtractive Modulative Network (SMN)), using six natural images, three learning rate schedulers, and three random seeds, totaling 486 main training runs (534 runs total including an ω₀ sensitivity sweep). Our evaluation combined peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and rigorous statistical analysis, such as paired t-tests, Wilcoxon signed-rank tests, Cohen’s d effect sizes, and Friedman rank tests. Under cosine annealing, Harmonic achieves a mean PSNR gain of 6.08 dB over SIREN and 2.57 dB over FINER (both p < 0.001, Cohen’s d > 3.7), while PM-FINER ranks statistically on par with Harmonic (mean difference 0.17 dB, p = 0.36), outperforming all of the other baselines. Compared with SMN, Harmonic outperforms it by +7.94 dB under cosine annealing (Bonferroni-adjusted p < 10⁻⁵, Cohen’s d = 12.3), winning on all six images. Additionally, the Friedman ranking across the six images confirmed Harmonic (with mean rank = 1.33) and PM-FINER (with mean rank = 1.67), being the top two methods under cosine annealing. Our results establish interpretable multi-harmonic and phase-modulated activations as real alternatives to the existing INR activation functions. Full article
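The abstract describes the Harmonic activation in words but does not give its closed form. As a rough sketch, one plausible reading is f(x) = sin(ω₀x) + a₂ sin(2ω₀x) + a₃ sin(3ω₀x), where a₂ and a₃ stand in for the per-neuron trainable amplitudes. This form, the function name, and the default ω₀ = 30 are all assumptions made for illustration and may differ from the paper's definition.

```python
import math

def harmonic_activation(x, omega0=30.0, a2=0.0, a3=0.0):
    """Assumed form: fundamental sine plus weighted 2nd and 3rd harmonics.

    omega0, a2 and a3 are hypothetical parameter names; in a trained network
    a2 and a3 would be learned per neuron rather than passed as constants.
    """
    fundamental = math.sin(omega0 * x)
    second = a2 * math.sin(2.0 * omega0 * x)
    third = a3 * math.sin(3.0 * omega0 * x)
    return fundamental + second + third

# With both harmonic amplitudes at zero the sketch reduces to a plain
# sinusoidal activation sin(omega0 * x).
plain = harmonic_activation(0.1)
enriched = harmonic_activation(0.1, a2=0.5, a3=0.25)
```

In this sketch, setting a₂ = a₃ = 0 recovers a purely sinusoidal activation, which is one way to read the abstract's statement that Harmonic generalizes sinusoidal activations.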

(This article belongs to the Section Learning)

34 pages, 11536 KB

Open Access Article

EASE-PVNet: Robust Periocular Identity Verification Across Pre- and Post-Operative Facial Images

by Ziyad Azzaz, Omar Khaled, Esraa Khatab, Hany Said and Omar Shalash

Mach. Learn. Knowl. Extr. 2026, 8(6), 169; https://doi.org/10.3390/make8060169 (registering DOI) - 21 Jun 2026

Abstract

Identity verification across pre-operative and post-operative facial images remains a challenging task, particularly following eyelid surgery, where localized periocular changes can disrupt conventional face recognition systems. This research introduces a novel verification framework using an ensemble-based autoencoder-initialized Siamese eye-region periocular verification network designed to remain resilient to surgically induced appearance variation. The proposed approach integrates anatomy-guided periocular normalization with a Siamese deep metric learning architecture, initialized via unsupervised autoencoder pretraining, enabling the model to acquire periocular-specific representations before supervised learning. Robustness in this data-limited clinical setting is enhanced through a combination of constrained periocular augmentation, dropout-based regularization, L2 weight decay, validation-guided checkpoint selection, staged hard-negative mining, validation-weighted multi-seed ensemble learning, and bootstrap-based threshold calibration. Experimental evaluation demonstrates recognition rates of 96.08% on the test set. These results indicate that the proposed framework maintains discriminative periocular identity representations under post-surgical appearance variation while remaining robust in a limited-data clinical setting. Full article

(This article belongs to the Special Issue Artificial Intelligence for Signal, Image, and Multimodal Data Processing: Algorithms, Models, and Knowledge Extraction)

27 pages, 17235 KB

Open Access Article

XTrail-ID: An Explainable AI Human Footprint Trail Identification on Soil Substrate Using Unsupervised Machine Learning from UAV Imagery

by Wazha Mmereki, Rodrigo S. Jamisola, Jr., Zoe C. Jewell, Tinao Petso, Oduetse Matsebe and Sky K. Alibhai

Mach. Learn. Knowl. Extr. 2026, 8(6), 168; https://doi.org/10.3390/make8060168 - 18 Jun 2026

Abstract

This paper investigates human–AI collaboration through explainable AI, in which we interpret the results of barefoot print clustering using unsupervised machine learning. This can be used to identify the number of individuals from barefoot prints on the ground as a tool in forensics or anti-poaching. A self-supervised vision transformer, DINOv2, is used to automatically extract feature embeddings from localized barefoot-print regions to identify trails belonging to an individual on soil substrate. Furthermore, we introduce an Embedding Spatial Attribution Module (ESAM) to generate spatial attribution heatmaps, enabling visualization of discriminative regions that contribute to individual-specific trail identification and improving model explainability. The proposed method is named XTrail-ID, an explainable human footprint trail identification framework with two variants: OBB-XTrail-ID (oriented bounding box-based) and SEG-XTrail-ID (segmentation-based). We quantify embedding similarity using three complementary metrics: cosine similarity, Pearson correlation coefficient, and Spearman rank correlation. Twenty adults (ten males, ten females) participated, with a total of 1000 trail images extracted from UAV imagery. SEG-XTrail-ID using cosine similarity yielded the highest performance, with a discriminability of 3.21 and an accuracy of 94.2%, while OBB-XTrail-ID using cosine similarity achieved a discriminability of 2.54 and an accuracy of 91.5%. In addition, the latter exhibited reduced consistency in footprint grouping when more than three individuals were present within a single frame. Full article
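The three embedding-similarity metrics named in this abstract have standard definitions, sketched below in plain Python. The vectors are short hypothetical stand-ins for embeddings, and the helper names are illustrative; nothing here is taken from the XTrail-ID implementation.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pearson(u, v):
    """Pearson correlation: cosine similarity of the mean-centred vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine_similarity([a - mean_u for a in u], [b - mean_v for b in v])

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(u, v):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(u), average_ranks(v))

# Hypothetical 4-dimensional "embeddings" (real embeddings are much longer).
emb_a = [1.0, 2.0, 3.0, 4.0]
emb_b = [2.0, 4.0, 6.0, 9.0]
scores = (cosine_similarity(emb_a, emb_b), pearson(emb_a, emb_b), spearman(emb_a, emb_b))
```

The metrics are complementary in the sense that cosine similarity is sensitive to direction only, Pearson additionally removes each vector's mean, and Spearman depends only on the ordering of components.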

(This article belongs to the Section Learning)

23 pages, 2071 KB

Open Access Review

XAI2Brain: A Perspective on Mechanistic Interpretability for Brain–AI Alignment

by Richard Jiang, Yongchen Zhou, Boyuan Wang, Plamen Angelov and Qiang Ni

Mach. Learn. Knowl. Extr. 2026, 8(6), 167; https://doi.org/10.3390/make8060167 - 18 Jun 2026

Abstract

The convergence of artificial intelligence (AI), explainable AI (XAI), and neuroscience is fostering new opportunities for understanding both machine and biological intelligence through interpretable and human-centered learning paradigms. In this Perspective, we introduce XAI2Brain as a conceptual framework for brain–AI alignment, positioning mechanistic interpretability as an intermediate layer connecting neural network representations, human understanding, and neuroscience-inspired AI design. Rather than viewing XAI solely as a post hoc transparency tool, we emphasize its emerging role in enabling mechanistic analysis of internal model representations, concept-level reasoning, and interactive human–AI alignment. We define XAI2Brain as a multi-level conceptual framework rather than a deployable system, explicitly aimed at structuring brain–AI alignment across representation-level, mechanism-level, and interaction-level perspectives. We survey the evolution of XAI methodologies—from feature attribution and concept-based explanations to mechanistic and human-centric interpretability approaches—and discuss how these methods may support bidirectional knowledge transfer between AI systems and cognitive neuroscience. Importantly, we adopt a cautious stance on brain–AI analogy, explicitly recognizing that artificial neural representations are not equivalent to biological neural representations, and instead focusing on functional and informational correspondences rather than structural equivalence. Unlike conventional human-in-the-loop or reinforcement learning from human feedback paradigms that primarily optimize behavioral outputs, XAI2Brain focuses on cognitively interpretable and mechanistically grounded alignment between AI systems and human reasoning processes. 
This alignment promotes interactive human-in-the-loop intelligence, empowering humans to comprehend, guide, and refine AI systems, while enabling AI systems to better interpret human instructions, intentions, and contextual reasoning. We further discuss the challenges of scaling explainability to large generative and multimodal models, including issues of interpretability robustness, cognitive compatibility, evaluation, and ethical accountability. We also highlight key limitations of current mechanistic interpretability methods, including explanation instability, representation superposition, and lack of causal guarantees, underscoring that these challenges remain open research problems. Rather than proposing a complete artificial brain architecture, this Perspective outlines a research roadmap toward more interpretable, adaptive, and neuroscience-inspired AI systems capable of supporting future brain–AI integration and collaborative intelligence. We additionally clarify that this work follows a narrative perspective review methodology with structured thematic synthesis of the literature. By framing explainability as a bridge between mechanistic AI understanding, cognitive science, and human-centered interaction, XAI2Brain highlights the importance of interpretable alignment for the next generation of brain-inspired AI systems. Full article

(This article belongs to the Section Learning)

28 pages, 6366 KB

Open Access Article

Edge-Optimized Deep and Transfer Learning for Efficient DDoS Detection in IIoT Networks

by Mikiyas Alemayehu, Mohamed Chahine Ghanem and Hamza Kheddar

Mach. Learn. Knowl. Extr. 2026, 8(6), 166; https://doi.org/10.3390/make8060166 - 16 Jun 2026

Abstract

The increasing convergence of Operational Technology (OT) and Information Technology (IT) within the Industrial Internet of Things (IIoT) brings about remarkable improvements in monitoring and automation. However, it also exposes industrial systems to large-scale Distributed Denial of Service (DDoS) attacks. Edge-based defences are essential in satisfying low-latency demands and data sovereignty rules, yet they must function under severe resource limitations and adapt to shifting traffic characteristics without cloud assistance. In this work, we introduce a lightweight hybrid deep learning architecture that fuses a Convolutional Neural Network (CNN) with a Convolutional Block Attention Module (CBAM) and a Multi-Layer Perceptron (MLP) in a single detector. A sequential transfer learning scheme is adopted, including a feature projection layer that handles differences in input dimensionality. The model is pre-trained on the CIC-DDoS2019 dataset, then adapted to the more recent CICIoT23 dataset. Evaluations are performed on both datasets while preserving their natural class imbalance. We provide extensive ablation and variance analysis under identical experimental conditions. The proposed method achieves 99.52% accuracy on CICIoT23 while maintaining 99.65% recall, which is a crucial property for critical systems. Real-time measurements on a CPU-only testbed show an average inference latency of 0.013 ms, inference-only throughput exceeding 93,000 packets/s, and end-to-end batch throughput of approximately 38,000 packets/s. The solution demonstrates effective domain adaptation, sub-millisecond latency, and suitability for resource-constrained IIoT edge gateways. Full article

(This article belongs to the Section Safety, Security, Privacy, and Cyber Resilience)

28 pages, 3195 KB

Open Access Article

What PISA Measures and What It Misses: A Two-Stage LLM-Based Alignment of IT Workforce Skills with Educational Proficiency

by Andreea-Maria Tanasă, Oprea Simona-Vasilica and Adela Bâra

Mach. Learn. Knowl. Extr. 2026, 8(6), 165; https://doi.org/10.3390/make8060165 - 15 Jun 2026

Abstract

Aligning information technology (IT) workforce demands with educational assessments is essential for bridging skills gaps; yet, no prior corpus maps IT task reasoning to Programme for International Student Assessment (PISA) proficiency levels. This paper introduces a large language model (LLM)-powered framework aligning IT competencies with PISA 2022 and the OECD (Organisation for Economic Co-operation and Development) Learning Compass 2030, drawing on O*NET v30.2 (Occupational Information Network), ESCO (European Skills, Competences, Qualifications, and Occupations) v1.2.1, PISA descriptors and OECD definitions. The framework operates in two stages: Stage 1 aligns 562 IT task statements with minimum PISA 2022 proficiency levels via LLM annotation and cross-model validation; and Stage 2 extends this mapping to the OECD Learning Compass 2030 through the semantic clustering of task embeddings and a bidirectional gap analysis of 95 ESCO transversal skills. Using Gemini 2.5 Flash, 562 tasks are annotated with minimum PISA levels across Mathematical, Reading, and Science literacy (first stage). Annotation reliability is assessed through a five-model cross-validation against a blind human domain expert (treated as a reference benchmark, not a gold standard) on a stratified 100-task sample (17.8% of the corpus), with agreement ranging from fair (Gemini 2.5 Flash, κ = 0.29) to moderate (Claude Haiku 4.5, κ = 0.50; LLaMA 3.3 70B, κ = 0.44). A bias-correction sensitivity analysis confirms that distributional findings remain stable after accounting for the primary annotator’s systematic overestimation, and OLS-calibrated alignment against O*NET ability ratings provides directional plausibility support. Validated tasks are embedded and clustered into 25 technical profiles via K-Means, each classified against OECD dimensions. The framework is extended to 95 ESCO transversal skills in 24 clusters. 
Bidirectional analysis reveals that, while every PISA proficiency level is engaged by at least one transversal cluster, 33% of these clusters, covering creative, ethical, social–emotional, and dispositional competencies, fall entirely outside PISA’s cognitive scope. This boundary mapping identifies where the PISA-based alignment is valid and where complementary tools are required for a full readiness assessment. Full article
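The agreement values in this abstract are reported as Cohen's κ. A minimal sketch of the unweighted statistic follows; the label lists are hypothetical, and the abstract does not say whether an unweighted or a weighted variant was used, so the unweighted form is an assumption.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1.0 - expected)

# Hypothetical proficiency levels assigned to eight tasks (not study data).
human = [1, 2, 2, 3, 3, 3, 4, 4]
model = [1, 2, 3, 3, 3, 4, 4, 4]
kappa = cohens_kappa(human, model)
```

Because κ subtracts the agreement expected by chance, it is lower than the raw match rate: in this toy example the raters agree on 6 of 8 items (0.75) while κ is about 0.65.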

(This article belongs to the Special Issue LLM-Inspired New Generation Machine Learning: Hyperparameter Optimization and Uncertainty Quantification)

32 pages, 63364 KB

Open Access Article

Do Foundation Models Truly Outperform Domain-Specific Models? Evidence from Digital Pathology

by Chaima Ben Rabah and Ahmed Serag

Mach. Learn. Knowl. Extr. 2026, 8(6), 164; https://doi.org/10.3390/make8060164 - 12 Jun 2026

Abstract

Foundation models (FMs) are increasingly proposed as general-purpose solutions for computational pathology, with the potential to simplify clinical artificial intelligence deployment by reducing the need for task-specific architectures. However, their reliability across cancer domains with distinct morphological characteristics remains unclear, limiting confidence in real-world clinical use. We benchmarked seven general-purpose pathology FMs and three domain-specific FMs across eleven patch-level datasets spanning three clinically relevant domains: pediatric hematology, prostate cancer, and breast cancer, using both linear probing and last-layer fine-tuning adaptation strategies. By jointly evaluating pediatric leukemia, male-predominant prostate cancer, and female-predominant breast cancer, this study is, to our knowledge, the first to explicitly examine specialist-versus-generalist FM behavior across age- and sex-stratified cancer populations. Performance differences were strongly domain dependent. In hematology, the specialist FM DINOBloom matched and, in several datasets, marginally exceeded leading generalist models (AUC 0.990–0.999 vs. GigaPath 0.981–1.000), suggesting advantages for highly distinctive cellular morphology. In prostate cancer grading, the generalist FM UNI2-h consistently outperformed the specialist HistoEncoder (AUC 0.956–0.977 vs. 0.908–0.964). In breast cancer, UNI2-h achieved the best overall performance across all tasks. No publicly available breast-cancer-specific FM currently exists for direct comparison; therefore, breast cancer results characterize general FM transferability rather than specialist-versus-generalist differences. Importantly, cross-dataset experiments revealed substantial performance degradation under dataset shift in both prostate and breast cancer, indicating that current FMs are not yet robust enough for heterogeneous multi-site clinical use. 
These findings support the use of generalist FMs as efficient backbones for well-characterized single-site, patch-level tasks, while challenging the assumption that high benchmark performance necessarily reflects true clinical readiness and demonstrating that pathology FMs are not uniformly superior to specialist models. Full article

(This article belongs to the Special Issue Artificial Intelligence for Signal, Image, and Multimodal Data Processing: Algorithms, Models, and Knowledge Extraction)

35 pages, 2314 KB

Open Access Article

From Legal Text to NP-Complete Decision Models: MPNet Retrieval and Policy Information Extraction

by Aigerim Aitim, Anel Auyezova, Bakhtgerey Sinchev and Aksulu Mukhanova

Mach. Learn. Knowl. Extr. 2026, 8(6), 163; https://doi.org/10.3390/make8060163 - 12 Jun 2026

Abstract

This study addresses the growing need to convert unstructured legal and policy documents into formal computational models that support transparent decision-making. The purpose of the work is to develop an applied framework that connects Legal NLP and policy information extraction with canonical combinatorial decision models, including set cover, set packing, subset sum, vertex cover, and independent set. The proposed method combines MPNet-based dense semantic retrieval for locating relevant legal passages, a Legal NLP layer for extracting obligations, prohibitions, exceptions, thresholds, and eligibility conditions, and a formal modeling stage that maps the extracted constraints to these NP-complete formulations. The framework is designed to transform regulatory text into machine-interpretable structures suitable for constraint-aware reasoning and policy analysis. The results show that the integration of semantic retrieval and structured legal information extraction improves the consistency, interpretability, and practical usability of formal problem construction from legal and policy documents. The proposed approach provides a reproducible bridge between legal text analytics and combinatorial decision modeling and supports legal decision support, compliance analysis, and policy-oriented intelligent systems. Full article
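To make the mapping to a combinatorial model concrete, the sketch below encodes a toy set cover instance: extracted obligations form the universe, and each candidate compliance measure covers a subset of them. The obligation identifiers, measure names, and the use of a greedy heuristic are all hypothetical choices for illustration; the abstract does not state how the resulting instances are solved.

```python
def greedy_set_cover(universe, subsets):
    """Greedy heuristic: repeatedly pick the subset covering the most
    still-uncovered elements. Gives a cover, not necessarily a minimum one."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(subsets, key=lambda name: len(subsets[name] & uncovered))
        gained = subsets[best] & uncovered
        if not gained:
            raise ValueError("universe cannot be covered by the given subsets")
        chosen.append(best)
        uncovered -= gained
    return chosen

# Hypothetical obligations extracted from a policy text.
obligations = {"O1", "O2", "O3", "O4", "O5"}
# Hypothetical measures and the obligations each one would satisfy.
measures = {
    "audit": {"O1", "O2", "O3"},
    "training": {"O2", "O4"},
    "reporting": {"O3", "O4"},
    "encryption": {"O4", "O5"},
}
selected = greedy_set_cover(obligations, measures)
```

The greedy rule is used here only because it is short and well known; since set cover is NP-complete, an exact solver or an integer programming formulation would be needed to guarantee a minimum-size selection.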

(This article belongs to the Topic AI and Computational Methods for Modelling, Simulations and Optimizing of Advanced Systems: Innovations in Complexity, 2nd Edition)

29 pages, 3778 KB

Open Access Article

MedToolica: Finetuning-Free Agentic Compositional Tool Learning for 3D CT Reasoning

by Abdullah Hosseini and Ahmed Serag

Mach. Learn. Knowl. Extr. 2026, 8(6), 162; https://doi.org/10.3390/make8060162 - 11 Jun 2026

Abstract

Clinical reasoning over 3D CT scans is inherently compositional, requiring the integration of anatomical measurement, pathology assessment, spatial comparison, and clinical interpretation. We introduce MedToolica, a finetuning-free, role-based agentic framework for quantitative 3D abdominal CT reasoning that decomposes complex queries into structured sub-tasks coordinated through specialized expert tools. Empirical evaluation across quantitative reasoning benchmarks demonstrates that MedToolica is particularly effective in organ-centric measurement tasks when supported by reliable expert tools, achieving strong quantitative agreement (e.g., CCC = 0.99 for organ HU estimation versus 0.46 for finetuned baselines) and notable gains on multi-step visual reasoning tasks. In contrast, lesion-oriented tasks remain constrained by upstream tool limitations, indicating that reasoning sophistication alone cannot compensate for unreliable perception. Furthermore, we observe that the capability of the core language model substantially influences orchestration quality: smaller LLM orchestrators exhibit reduced overall accuracy due to higher execution failure rates (25% vs. 79%) and increased susceptibility to hallucination (43% vs. 2%). Collectively, these findings identify expert tool reliability and orchestration capability as critical determinants of performance in compositional medical AI and highlight both the promise and current limitations of finetuning-free agentic reasoning for quantitative 3D CT analysis. Full article

(This article belongs to the Special Issue Artificial Intelligence for Signal, Image, and Multimodal Data Processing: Algorithms, Models, and Knowledge Extraction)
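
As a reading aid for the agreement figure quoted above, the following is a minimal sketch of how a concordance correlation coefficient can be computed. It assumes that CCC in the abstract denotes Lin's concordance correlation coefficient; the function name and the sample values are hypothetical and are not taken from the article.

```python
from statistics import fmean


def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx, my = fmean(x), fmean(y)
    # Population (1/n) variances and covariance.
    vx = fmean([(a - mx) ** 2 for a in x])
    vy = fmean([(b - my) ** 2 for b in y])
    cov = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    # Unlike Pearson's r, the (mx - my)^2 term penalizes a systematic offset.
    return 2 * cov / (vx + vy + (mx - my) ** 2)


# Hypothetical values: reference measurements versus estimates shifted by 1 unit.
reference = [1.0, 2.0, 3.0, 4.0]
shifted = [2.0, 3.0, 4.0, 5.0]
ccc_identical = concordance_correlation(reference, reference)
ccc_shifted = concordance_correlation(reference, shifted)
```

In this toy case the shifted estimates are perfectly correlated with the reference yet score below 1, which is why CCC is often preferred over plain correlation for measurement agreement.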


35 pages, 3071 KB

Open AccessArticle

Scenario-Adaptive Evaluation of Trustworthy Fine-Tuned Text Models Across Knowledge-Grounded Generation and Misinformation Detection

by Khrystyna Lipianina-Honcharenko, Pavlo Bykovyy, Andriy Krysovatyy, Myroslav Komar and Borys Yazlyuk

Mach. Learn. Knowl. Extr. 2026, 8(6), 161; https://doi.org/10.3390/make8060161 - 11 Jun 2026

Abstract


Large language models (LLMs) increasingly require robust evaluation under realistic instruction-following conditions, particularly for fine-tuned task-specific adapters operating in multilingual environments. This study proposes a scenario-adaptive evaluation framework for assessing the reliability of fine-tuned text models across two application regimes: misinformation detection (disinfo) and knowledge-grounded factual biography generation (heroes). The framework integrates automated generation of balanced risk-oriented scenarios, bilingual evaluation in English and Ukrainian, the LLM-as-a-Judge paradigm, and multidimensional robustness analysis through the Alignment Robustness Index (ARI). Six LoRA-adapted models based on Qwen2.5-3B-Instruct, SmolLM2-1.7B-Instruct, and TinyLlama-1.1B-Chat-v1.0 were evaluated. The implemented pipeline generated 2052 scenarios and 6156 model responses, producing a final bilingual analytical subset of 4104 judged records. Experimental results show that task-specific adaptation produces task-dependent robustness profiles. In the disinfo case, Qwen2.5-3B achieved the strongest overall performance, combining the highest safety and classification accuracy. In contrast, the heroes case revealed a more compressed and multidimensional vulnerability space without a single dominant model. The results further demonstrate the importance of multilingual evaluation, as weaker adapters exhibited more pronounced cross-lingual safety gaps. Overall, the framework provides a reproducible and practically applicable methodology for evaluating fine-tuned language models under imperfect instruction conditions.

(This article belongs to the Special Issue Trustworthy AI: Integrating Knowledge, Retrieval, and Reasoning)


36 pages, 5712 KB

Open AccessArticle

Interpretable Machine Learning for the Shear Capacity of RC Corbels: A Validated, Application-Driven Model

by Wael Kassem

Mach. Learn. Knowl. Extr. 2026, 8(6), 160; https://doi.org/10.3390/make8060160 - 10 Jun 2026

Abstract


This paper demonstrates the application of a robust machine learning methodology to develop an accurate and, critically, an interpretable data-driven model for RC corbel shear assessment. A primary focus of this work is the use of advanced explainability techniques to rigorously validate the model’s predictive logic against fundamental principles of structural mechanics, directly confronting the limitations of “black-box” approaches. To implement this framework, an extensive database of 515 experimental tests was assembled. Different machine-learning (ML) techniques, including Random Forest, AdaBoost, Support Vector Machine, and XGBoost, were systematically evaluated to define the optimal predictive model. The most accurate algorithm, XGBoost, was selected and optimized to achieve exceptional performance, with a coefficient of determination ($R^{2}$) of 0.98 evaluated across the full database and a mean absolute relative deviation (MARD) of only 4%; on the held-out testing subset the model retains an $R^{2}$ of 0.97 and a MARD of 15%, confirming that predictive performance does not degrade appreciably on unseen specimens. The predictive model was shown to be substantially more accurate and generalizable than current design approaches, including both ACI code provisions and other prominent analytical models from the literature. Crucially, the Shapley Additive exPlanations (SHAP) technique was used to rigorously interrogate the model’s predictive logic. The analysis showed that the model’s feature attributions are consistent with established structural mechanics, correctly identifying the governing influence of parameters like the shear span-to-depth ratio and reinforcement indices for distinct failure modes. This explainability analysis establishes that the learned associations agree with structural expectations; it does not by itself demonstrate mechanistic causality. The study provides a validated methodology for creating trustworthy ML models and indicates, subject to further validation, uncertainty quantification, and a clearly defined applicability domain, how such interpretable tools might complement existing design provisions.

(This article belongs to the Section Learning)
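
The two accuracy measures named in the abstract can be illustrated with a short sketch. It assumes the usual definition of the coefficient of determination and takes MARD to be the mean of |predicted − measured| / |measured|; the abstract does not spell out either formula, and the capacity values below are made up for illustration.

```python
def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def mard(measured, predicted):
    """Mean absolute relative deviation (assumed definition), as a fraction."""
    ratios = [abs(p - m) / abs(m) for m, p in zip(measured, predicted)]
    return sum(ratios) / len(ratios)


# Hypothetical shear capacities in kN (not data from the article).
measured_kn = [100.0, 200.0, 400.0]
predicted_kn = [110.0, 190.0, 400.0]
r2_value = r_squared(measured_kn, predicted_kn)
mard_value = mard(measured_kn, predicted_kn)  # 0.05, i.e. 5%
```

Because MARD weights each specimen by its own measured value, it can move independently of $R^{2}$, which is dominated by the largest capacities in the set.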


19 pages, 1061 KB

Open AccessPerspective

Digital Twins: A Computational Realization of the Scientific Method in Dynamical Systems

by Frank Emmert-Streib

Mach. Learn. Knowl. Extr. 2026, 8(6), 159; https://doi.org/10.3390/make8060159 - 10 Jun 2026

Abstract


The scientific method is widely acknowledged as an authoritative framework that provides guiding principles for empirical research across disciplines. Despite this central role, it is rarely examined explicitly as a conceptual framework. In this paper, we revive attention to its role by revealing a connection to digital twins, which have received considerable attention in recent years. Specifically, we argue that the digital twins framework can be interpreted as a computational realization of the scientific method in the context of dynamical systems. This connection is rooted in the dynamical nature of models, since dynamical systems arise across many scientific fields, from physics to economics, and also constitute a core component of digital twins. The main benefits of this connection include a common scientific language for knowledge transfer, a systematic approach that emphasizes the mechanisms of continuous learning and model selection, and a practical framework for implementing the scientific method computationally across disciplines.

(This article belongs to the Collection Extravaganza Feature Papers on Hot Topics in Machine Learning and Knowledge Extraction)


17 pages, 5529 KB

Open AccessArticle

EA-StrongSORT: An Efficient Attention StrongSORT Framework for Detection-Based Tumor Tracking in Cine-MRI TrackRAD2025 Dataset

by Alyaa Amer, Noha Ghatwary, Salema Fayed, Sahar Magdy, Alla Hussein, Rania Kadry and Amina I. Abdelmaksoud

Mach. Learn. Knowl. Extr. 2026, 8(6), 158; https://doi.org/10.3390/make8060158 - 9 Jun 2026

Abstract


MRI-guided radiotherapy (MRIgRT) enables the real-time visualization of tumor motion, allowing adaptive radiation delivery based on dynamic anatomical changes. However, respiratory-induced tumor motion remains a major challenge, particularly for thoracic and abdominal tumors. Continuous tumor motion may reduce treatment accuracy and increase radiation exposure to surrounding healthy tissues. Therefore, reliable and efficient tumor tracking is essential for real-time motion management in MRI-guided radiotherapy. Recent advances in artificial intelligence have demonstrated significant potential for medical image analysis; however, many existing tumor tracking approaches rely on segmentation-based methods that require detailed annotations and complex processing, which can limit their use in real-time clinical environments. In this work, we propose a detection-based tumor tracking framework that integrates the YOLOv11 object detection model with an enhanced StrongSORT tracking algorithm (EA-StrongSORT). The proposed approach replaces the conventional re-identification backbone with a lightweight EfficientNetV2 architecture augmented with an Efficient Channel Attention (ECA) mechanism. The overall framework follows a tracking-by-detection concept, where tumor regions are first detected and subsequently associated across frames. The proposed framework is evaluated on the TrackRAD2025 dataset using multiple YOLOv11 variants to analyze the balance between performance and model complexity. Experimental results demonstrate that the lightweight YOLOv11n model achieves the best detection performance, with a precision of 0.912, recall of 0.607, mean Average Precision (mAP) of 0.771, and $\mathrm{mAP}_{50-95}$ of 0.608. Furthermore, the proposed tracking framework achieves stable temporal association, with Multiple-Object Tracking Accuracy (MOTA) scores exceeding 91% and Higher-Order Tracking Accuracy (HOTA) scores around 90%. The proposed framework demonstrates the potential of detection-based tumor localization and tracking for real-time MRI-guided radiotherapy applications.

(This article belongs to the Special Issue Artificial Intelligence for Signal, Image, and Multimodal Data Processing: Algorithms, Models, and Knowledge Extraction)
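
For readers unfamiliar with the tracking score quoted above, here is a minimal sketch of MOTA under the standard CLEAR MOT definition, which is assumed here because the abstract does not restate it. The error counts are hypothetical and are not results from the article.

```python
def mota(false_negatives, false_positives, id_switches, num_ground_truth):
    """Multiple-Object Tracking Accuracy: 1 - (FN + FP + IDSW) / GT.

    All counts are summed over every frame of the evaluated sequences.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_ground_truth


# Hypothetical counts: 9 errors over 100 ground-truth boxes.
example_mota = mota(false_negatives=5, false_positives=3,
                    id_switches=1, num_ground_truth=100)
```

Note that MOTA can fall below zero when the total number of errors exceeds the number of ground-truth objects.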


34 pages, 1894 KB

Open AccessArticle

Generative Artificial Intelligence and Probabilistic Trees for the Linguistic Data Summarization in Wave Energy Decision-Making

by Iliana Pérez Pupo, Luis Segundo Alvarado Acuña, Pedro Y. Piñero Pérez, Raykenler Yzquierdo Herrera and Maikel Yelandi Leyva Vázquez

Mach. Learn. Knowl. Extr. 2026, 8(6), 157; https://doi.org/10.3390/make8060157 - 9 Jun 2026

Abstract


This paper presents a hybrid model that combines linguistic data summarization techniques, algorithms for constructing probabilistic trees, and various generative artificial intelligence models for learning and generating linguistic summaries to aid decision-making. The proposal is validated using methodological triangulation techniques that demonstrate high consistency in the knowledge discovered. The study also compares different generative artificial intelligence models; among the evaluated models, Gemini achieved the best performance. However, in certain contexts and tasks, small language models can be effective, yielding results comparable to those of large language models (LLMs) at a lower computational cost. The algorithms are applied in a case study analyzing oceanographic data from Northern Chile. In the validation scenario, the combination of linguistic data summarization methods with unsupervised learning techniques effectively models human tolerance for imprecision when processing complex data and generates linguistic summaries that are easily interpretable by human decision-makers with high levels of confidence. Analyses of energy capacities in the studied region and their behavior in both winter and summer are also presented.

(This article belongs to the Special Issue Using Large Language Models for Scientific Problem Solving and Engineering Design)


29 pages, 828 KB

Open AccessArticle

Decoupling Privacy Noise from Optimization in Transformer Forecasting

by Bhagiradh Kantheti and Carlos A. Paz De Araujo

Mach. Learn. Knowl. Extr. 2026, 8(6), 156; https://doi.org/10.3390/make8060156 - 4 Jun 2026

Abstract


Strong differential privacy often collapses utility in transformer-based time-series forecasting because noise is injected directly into high-dimensional gradients (e.g., DP-SGD), severely corrupting the optimization process. We introduce Low-Dimensional Feature-Path Privacy for Transformers (LDPT), which enforces privacy by routing calibrated perturbations through a low-dimensional feature bottleneck ($D = 16$) that is independent of the model parameter count. LDPT implements noise via classically simulated quantum channels (Lindblad/depolarizing dynamics) and finite-shot POVM measurements, providing an auditable mapping from privacy budget $\varepsilon$ to perturbation magnitude while keeping the transformer gradients clean. Across the ETT datasets and multiple prediction horizons, LDPT substantially preserves forecasting utility under its native local $\varepsilon$-QDP guarantee. At a nominal per-pass $\varepsilon = 0.1$, LDPT limits MSE degradation to under 6%. In contrast, DP-SGD with global $(\varepsilon, \delta)$-DP applied to the identical transformer architecture suffers over 100% MSE degradation. Because these methods operate under different privacy definitions (local $\varepsilon$-QDP vs. global $(\varepsilon, \delta)$-DP), this comparison illustrates the impact of noise placement rather than equivalent privacy protection. To isolate the effect of the calibration mechanism, we further evaluate a classical Gaussian mechanism on the same feature-path bottleneck, which requires orders-of-magnitude larger noise and severely degrades utility. Membership inference attacks confirm that LDPT does not amplify membership leakage beyond the non-private baseline. These results demonstrate that decoupling privacy noise from optimization through low-dimensional feature-path placement and tight channel-based calibration is critical for practical privacy-preserving transformer forecasting.

(This article belongs to the Section Safety, Security, Privacy, and Cyber Resilience)
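
To give a rough sense of why noise placement matters, the sketch below uses the textbook Gaussian mechanism calibration, $\sigma = \Delta \sqrt{2 \ln(1.25/\delta)} / \varepsilon$ for $0 < \varepsilon < 1$, and compares the typical noise norm in a 16-dimensional bottleneck with that in a large parameter vector. This is not LDPT's channel-based calibration and it ignores DP-SGD's clipping and composition accounting; the value of $\delta$, the unit sensitivity, and the one-million parameter count are illustrative assumptions.

```python
import math


def gaussian_sigma(epsilon, delta, l2_sensitivity=1.0):
    """Noise scale of the classical Gaussian mechanism (valid for 0 < epsilon < 1)."""
    if not (0.0 < epsilon < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("this calibration assumes 0 < epsilon < 1 and 0 < delta < 1")
    return l2_sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def rms_noise_norm(sigma, dim):
    """Root-mean-square L2 norm of i.i.d. N(0, sigma^2) noise in `dim` dimensions."""
    return sigma * math.sqrt(dim)


# Illustrative settings only: epsilon matches the abstract, the rest is assumed.
sigma = gaussian_sigma(epsilon=0.1, delta=1e-5)
bottleneck_norm = rms_noise_norm(sigma, dim=16)        # feature bottleneck, D = 16
gradient_norm = rms_noise_norm(sigma, dim=1_000_000)   # hypothetical parameter count
norm_ratio = gradient_norm / bottleneck_norm
```

At equal per-coordinate noise scale, the noise norm grows with the square root of the dimension, so perturbing a small bottleneck injects far less total noise than perturbing a full gradient.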


25 pages, 1132 KB

Open AccessArticle

A Sovereign Conversational Assistant Powered by ALIA and Mistral for the AI Act Age: Architecture, Governance, and Evaluation

by Alejandro Carmona-Martínez, Antonio J. Jara and Alicia Asín

Mach. Learn. Knowl. Extr. 2026, 8(6), 155; https://doi.org/10.3390/make8060155 - 4 Jun 2026

Abstract


Digital Twins and Living Labs are increasingly used to support conservation, safety, accessibility, and visitor experience in cultural-heritage sites. Their practical value, however, depends on interfaces that can explain heterogeneous evidence, expose provenance, and operate under public-sector governance constraints. This paper presents a Sovereign Conversational Assistant (SCA) for the Libelium Heritage Living Lab, implemented as a small-language-model (SLM) and retrieval-augmented generation (RAG) stack that combines curated heritage and operational knowledge bases with provenance logging, refusal controls, and language enforcement. We first compare the Spanish public model BSC-LT/ALIA-40b-instruct-2601 with mistralai/Mistral-Small-3.2-24B-Instruct-2506 using 19 canonical test conditions executed over 155 repeated runs across five categories: historical queries, client experience, data analysis, hallucination resistance, and safety/ethics. Mistral passed all repeated runs, whereas ALIA passed 129/155 runs, showing strong factual and visitor-information behaviour but weaker numerical analysis, cross-lingual safety, and Spanish-language enforcement. To address external validity, we add a non-sovereign baseline comparison over the 13 canonical prompts against claude-opus-4-7, gemini-3.5-flash, and gpt-5.5 under the same RAG-conditioned harness. In this prompt-level comparison, mean final scores were ALIA 0.963, Claude Opus 4.7 0.938, Gemini 3.5 Flash 0.892, GPT-5.5 0.877, and Mistral 0.871; no pairwise difference was significant after Holm correction, and ALIA was non-inferior to the best external baseline at margins of 0.05 and 0.10, whereas Mistral was not. 
The contribution is therefore not a new RAG algorithm, but an operational method for deploying and evaluating a governance-aware, sovereign assistant for cultural-heritage Digital Twins, together with evidence that sovereign models can be competitive in controlled heritage RAG tasks while still requiring larger, human-calibrated benchmarks before stronger claims are made.

(This article belongs to the Special Issue Trustworthy AI: Integrating Knowledge, Retrieval, and Reasoning)
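
The Holm correction mentioned in the abstract can be sketched as follows. This is a generic implementation of Holm's step-down procedure, not the authors' evaluation code, and the p-values are hypothetical.

```python
def holm_reject(p_values, alpha=0.05):
    """Return one boolean per hypothesis: True if rejected by Holm's procedure."""
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, idx in enumerate(order):
        # The (rank + 1)-th smallest p-value is tested at level alpha / (m - rank).
        if p_values[idx] > alpha / (m - rank):
            break  # first failure: this and all larger p-values are retained
        rejected[idx] = True
    return rejected


# Hypothetical pairwise-comparison p-values (not values from the article).
example_p = [0.01, 0.04, 0.03, 0.005]
example_decisions = holm_reject(example_p, alpha=0.05)
```

Holm's procedure controls the family-wise error rate like Bonferroni but is uniformly at least as powerful, since only the smallest p-value faces the full alpha / m threshold.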
