Journal Description
Machine Learning and Knowledge Extraction
Machine Learning and Knowledge Extraction is an international, peer-reviewed, open access, monthly journal on machine learning and applications. See our video on YouTube explaining the MAKE journal concept.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- High Visibility: indexed within Scopus, ESCI (Web of Science), dblp, and other databases.
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 27 days after submission; acceptance to publication takes 4.4 days (median values for papers published in this journal in the second half of 2025).
- Journal Rank: JCR - Q1 (Engineering, Electrical and Electronic) / CiteScore - Q1 (Engineering (miscellaneous))
- Recognition of Reviewers: reviewers who provide timely, thorough peer-review reports receive vouchers entitling them to a discount on the APC of their next publication in any MDPI journal, in appreciation of the work done.
- Journal Cluster of Artificial Intelligence: AI, AI in Medicine, Algorithms, BDCC, MAKE, MTI, Stats, Virtual Worlds and Computers.
Impact Factor: 6.0 (2024); 5-Year Impact Factor: 5.7 (2024)
Latest Articles
CTCF: A Three-Level Coarse-to-Fine Cascade for Unsupervised Deformable Medical Image Registration
Mach. Learn. Knowl. Extr. 2026, 8(5), 122; https://doi.org/10.3390/make8050122 (registering DOI) - 2 May 2026
Abstract
Deformable medical image registration aims to spatially align anatomical structures across volumetric scans. Recent transformer-based methods achieve high overlap accuracy but often produce deformation fields with topological violations. We propose CTCF, a Cascade Transformer for Coarse-to-Fine registration that wraps a lightweight coarse-and-refined envelope around a core registration module. Level 1 provides a coarse displacement estimate at quarter resolution, Level 2 performs the main registration via a Swin Transformer encoder with deformable cross-attention and a learned super-resolution decoder, and Level 3 applies error-driven flow refinement at half resolution. The two outer levels add only 3.0% parameter overhead yet improve registration accuracy while maintaining competitive deformation regularity relative to external baselines. The model is trained end-to-end with a composite unsupervised loss combining local normalized cross-correlation, diffusion regularization, inverse-consistency, and Jacobian-based topology preservation. On the OASIS brain MRI benchmark, CTCF achieves the highest Dice score of 0.8208 among the compared unsupervised methods while maintaining competitive SDlogJ, with all Dice improvements statistically significant by the Wilcoxon signed-rank test. On IXI, CTCF also achieves the best Dice, HD95, SDlogJ, and fold percentage among the compared methods. A five-round ablation study validates each component: cascade decomposition isolates each level’s contribution, and resolution scaling experiments confirm the framework’s scalability, yielding further accuracy gains with zero parameter overhead.
Full article
(This article belongs to the Special Issue Artificial Intelligence for Signal, Image, and Multimodal Data Processing: Algorithms, Models, and Knowledge Extraction)
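The composite loss named in the abstract includes a diffusion regularization term, which penalizes non-smooth displacement fields. As an illustrative sketch only (not the authors' implementation), the term can be written as the mean squared finite difference of a 2-D flow field:

```python
import numpy as np

def diffusion_regularizer(flow):
    """Mean squared spatial gradient of a 2-D displacement field.

    flow: array of shape (H, W, 2) holding per-pixel (dy, dx) displacements.
    Smooth (e.g. rigid) deformations incur little or no penalty.
    """
    dy = np.diff(flow, axis=0)  # finite differences along rows
    dx = np.diff(flow, axis=1)  # finite differences along columns
    return (dy ** 2).mean() + (dx ** 2).mean()

# A constant field (pure translation) is perfectly smooth: zero penalty.
rigid = np.ones((8, 8, 2))
assert diffusion_regularizer(rigid) == 0.0
```

In practice this term is weighted against the similarity loss (e.g. local normalized cross-correlation) so that alignment accuracy is traded off against deformation regularity.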
Open Access Review
A Survey on Self-Supervised Learning in Cybersecurity: Network Intrusion and Malware Detection
by
Josue Genaro Almaraz-Rivera, Jose Antonio Cantoral-Ceballos and Juan Felipe Botero
Mach. Learn. Knowl. Extr. 2026, 8(5), 121; https://doi.org/10.3390/make8050121 - 1 May 2026
Abstract
Self-Supervised Learning (S-SL) is a recent line of research that could represent the next step to understanding human intuition. By blending the strengths of unsupervised and supervised learning paradigms, S-SL endows Deep Learning models with stronger generalization capabilities. Although better known for its applications in Computer Vision and Natural Language Processing, S-SL has also proved its value in other fields, such as cybersecurity. In this work, we review the current progress and future trends in S-SL for the two most relevant problems discussed in the cybersecurity literature: network intrusion and malware detection. The scope of this survey spans from 2019 to 2025. From an initial analysis of over 200 documents, we distill the 50 most relevant papers. We also highlight opportunity areas, such as attack detection over encrypted network traffic, RAM-based analysis of obfuscated malware, creating S-SL models for tabular data and resource-constrained devices, as well as the research on backdooring, encoder extraction, the transferability of vulnerabilities, and data memorization in S-SL. To the best of our knowledge, this is the first comprehensive survey regarding the application of Self-Supervised Learning in cybersecurity, benchmarking contrastive learning vs auxiliary pretext tasks and presenting the data requirements for implementing S-SL solutions in this field. We hope this paper provides a firm ground for further exploration.
Full article
(This article belongs to the Section Thematic Reviews)
Open Access Article
Individual Indicators of the Learning Process for Identifying Critical Thinking in Students in Adaptive Learning
by
Vassiliy Serbin, Mateus Mendes, Aray Kassenkhan, Akbayan Bekarystankyzy, Gulnur Ibragim and Azamat Tolegenov
Mach. Learn. Knowl. Extr. 2026, 8(5), 120; https://doi.org/10.3390/make8050120 - 1 May 2026
Abstract
The rapid digitalization of higher education has intensified the need for reliable methods to assess higher-order cognitive skills, particularly critical thinking, in adaptive learning environments. However, most existing assessment approaches rely primarily on test outcomes and academic performance indicators, which do not adequately capture the multidimensional and process-based nature of critical thinking. This study proposes a multi-criteria hierarchical model for identifying and quantitatively assessing students’ critical thinking based on individual process indicators of learning activity in an intelligent educational environment. The model integrates cognitive, metacognitive, and behavioral indicators, including knowledge dynamics, task complexity, time characteristics, learning activity intensity, error rate, level of doubt, user interaction patterns, and system operating modes. These indicators are aggregated into a three-component structure representing metacognitive awareness, analytical depth, and strategic learning activity. The proposed model was empirically validated through a quasi-experimental longitudinal study involving 500 university students divided into control and experimental groups. The results demonstrate a statistically significant increase in all latent components of critical thinking and in the integral indicator within the experimental group. The model shows satisfactory internal consistency (Cronbach’s α) and acceptable construct validity confirmed by confirmatory factor analysis. The findings indicate that the proposed model can serve as a practical analytical tool for monitoring critical thinking development and supporting personalized learning trajectories in adaptive digital educational systems.
Full article
(This article belongs to the Section Learning)
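The abstract reports internal consistency via Cronbach's alpha. For readers unfamiliar with the statistic, a minimal NumPy sketch (toy data, not the study's) computes it from a respondents-by-items score matrix:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, k_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # per-item variances
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Perfectly parallel items yield an alpha of exactly 1.0.
scores = np.array([[1, 1], [2, 2], [3, 3], [4, 4]])
assert abs(cronbach_alpha(scores) - 1.0) < 1e-12
```

Values around 0.7 or higher are conventionally read as satisfactory internal consistency.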
Open Access Article
Morphology-Aware Multi-Scale Deep Representation Learning for Interpretable Knowledge Extraction in Brain Tumor MRI
by
Helala AlShehri and Mariam Busaleh
Mach. Learn. Knowl. Extr. 2026, 8(5), 119; https://doi.org/10.3390/make8050119 - 1 May 2026
Abstract
Robust brain tumor classification from magnetic resonance imaging (MRI) remains challenging due to complex structural heterogeneity and subtle inter-class variability. Beyond predictive accuracy, conventional convolutional neural networks predominantly rely on texture-dominant features and fixed receptive fields, which may limit the extraction of clinically meaningful structural information. This study proposes a morphology-aware multi-scale deep representation learning framework that embeds morphological inductive bias directly within hierarchical feature extraction. The proposed architecture synergistically integrates trainable morphological operations with multi-scale convolutional feature learning inside a unified residual framework, supported by an in-block morphological refinement mechanism and a morphology-aware downsampling module. Unlike prior approaches that treat morphological operators as preprocessing or auxiliary branches, the proposed design incorporates differentiable dilation and erosion into the core feature hierarchy to guide structure-aware representation formation. The model was evaluated using five-fold cross-validation and an independent test set, achieving an overall test accuracy of 99.31% with consistently high macro-averaged precision, recall, F1-score, and AUC values. Grad-CAM analysis further demonstrates that the learned representations emphasize clinically relevant tumor regions, supporting interpretable structural knowledge extraction. Ablation studies confirm that performance improvements arise from the synergistic integration of multi-scale learning and morphology-aware refinement. Overall, embedding structural inductive bias within multi-scale deep representation learning enhances robustness, stability, and interpretable knowledge extraction for brain tumor MRI analysis.
Full article
(This article belongs to the Section Learning)
Open Access Article
Supervised Machine Learning for Technical Debt in Python: Analysis and Prediction
by
Elif Fırıncı and Mohanad Alayedi
Mach. Learn. Knowl. Extr. 2026, 8(5), 118; https://doi.org/10.3390/make8050118 - 1 May 2026
Abstract
Technical debt (TD) has become a critical problem, posing challenges to software maintainability and quality in the context of fast-paced modern software development. This research examines TD in Python software development through a hybrid approach combining practitioner perception with predictive modeling. Within the scope of the research, the perceptions of 86 IT practitioners and developers were studied with regard to how they react to, adapt to, and prioritize different types of TD. According to the qualitative results, cyclic architectural debt stems from low test coverage, documentation deficiency, and complicated code structure. Based on this information, the research team developed a dataset of 130 real-world Python code samples characterized by code complexity, comments-to-code ratio, code smells, and software maintainability indexes. Decision tree (DT), logistic regression (LR), naive Bayes (NB), support vector machine (SVM), k-nearest neighbors (KNN), and random forest (RF) predictive models were then applied to detect TD. The results show that TD can be predicted using machine learning methods, with the best performance provided by the random forest and optimized logistic regression models.
Full article
(This article belongs to the Section Data)
Open Access Article
COMPAS: Compose Actions and Slots in Object-Centric World Models
by
Vitaliy Vorobyov, Leonid Ugadiarov, Vladimir Frolov, Alexey Kovalev and Aleksandr Panov
Mach. Learn. Knowl. Extr. 2026, 8(5), 117; https://doi.org/10.3390/make8050117 - 29 Apr 2026
Abstract
In this paper, we propose a novel approach, COMPAS (COMPose Actions and Slots), which leverages the strengths of state-of-the-art object-centric approaches for modeling the dynamics of an environment. Our method encodes the environment’s state into symbol-like, object-centric representations, known as slots, where each slot corresponds to an individual object. This approach offers a structured and interpretable way to model complex environments by combining slots with action representations for accurate next-state prediction. The primary contribution of our work is an efficient world model with a dynamics predictor capable of predicting accurate trajectories in action-dependent environments. Additionally, our slot extractor module enhances the predictive capabilities by extracting deterministic slots that remain consistent both within a single trajectory and across episodes. Unlike slots sampled from a trainable distribution, deterministic slots are generated from a single trainable parameter together with slot positional embeddings. This design improves the consistency across episodes, which in turn leads to more accurate dynamics prediction. We present a comprehensive evaluation of our approach in various environments, demonstrating that our proposed method outperforms competing models in environments with discrete and continuous action spaces.
Full article
(This article belongs to the Section Learning)
Open Access Article
A Tiny Vision-Based Model for Real-Time Student Attention Detection in Online Classes
by
Chaymae Yahyati, Ismail Lamaakal, Yassine Maleh, Khalid El Makkaoui and Ibrahim Ouahbi
Mach. Learn. Knowl. Extr. 2026, 8(5), 116; https://doi.org/10.3390/make8050116 - 28 Apr 2026
Abstract
Online and blended classrooms widen access but remove the in-person cues instructors use to gauge attention. Prior work typically relies on heavy, cloud-bound or multimodal models that are hard to deploy on commodity laptops, treats attention as an unordered label without calibrated probabilities, and evaluates on subject-overlapping splits with limited robustness analysis. This creates a gap in tiny, deployable, calibration-aware methods validated under realistic protocols. We address this gap with a TinyML, vision-only pipeline that estimates four attention levels (Very Low, Low, High, Very High) from short webcam clips under strict on-device budgets. Each clip is processed by a compact hybrid encoder: a CNN extracts per-frame spatial features, a BiLSTM models temporal context, and a lightweight GRU refines dynamics; three parallel branches with staggered widths encourage feature diversity before fusion. We apply structured pruning of convolutional channels and recurrent units, post-training INT8 quantization, and temperature scaling for calibrated probabilities; models are exported as ONNX. On DAiSEE with subject-independent splits, the baseline attains high accuracy and macro-F1, with strong ordinal agreement (QWK = 0.998, ordinal MAE = 0.03). The compressed model preserves reliability (macro-F1 = 0.995, QWK = 0.995), remains robust to low light, partial occlusion, and head yaw, and yields ∼4× smaller size and ∼2.3× CPU speedups. These results indicate a deployable, privacy-preserving approach to fine-grained, on-device attention analytics.
Full article
(This article belongs to the Special Issue Next-Generation TinyML: Innovations in Models, Security, and Applications for Constrained Intelligent Systems)
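Temperature scaling, used in the abstract above for calibrated probabilities, divides the logits by a single scalar T fitted on held-out data. A minimal sketch with toy logits, using a plain grid search in place of the usual gradient-based fit (illustrative only, not the authors' code):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(logits, labels, T):
    """Negative log-likelihood of labels under temperature-scaled logits."""
    p = softmax(logits / T)
    return -np.log(p[np.arange(len(labels)), labels]).mean()

def fit_temperature(logits, labels, grid=np.linspace(0.5, 5.0, 91)):
    """Grid-search the single temperature that minimizes validation NLL."""
    return min(grid, key=lambda T: nll(logits, labels, T))

# Overconfident logits with one mislabeled sample: the fitted T exceeds 1,
# softening the predicted probabilities without changing the argmax.
logits = np.array([[4.0, 0.0], [4.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
labels = np.array([0, 0, 1, 1])
assert fit_temperature(logits, labels) > 1.0
```

Because T rescales all logits uniformly, accuracy is unchanged; only the confidence of each prediction is adjusted.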
Open Access Article
Mapping Dialectal Landscape: A Sequence-to-Sequence Approach to Japanese Dialect-to-Standard Normalization
by
Kinga Lasek, Michal Ptaszynski, Fumito Masui, Mujahid Khalifah and Juuso Eronen
Mach. Learn. Knowl. Extr. 2026, 8(5), 115; https://doi.org/10.3390/make8050115 - 26 Apr 2026
Abstract
Despite the ongoing standardization of the Japanese language, regional dialects persist, particularly among older generations, causing communication gaps that are especially problematic in healthcare and emergency contexts. This study proposes a text-to-text normalization method to convert eight Japanese dialects into standard Japanese using a fine-tuned mT5-small architecture. We evaluate the impact of learning rate schedulers, training duration, and data preprocessing on model performance. Our results demonstrate that the CharacTER (Character Translation Edit Rate) metric provides a more accurate evaluation than BLEU, which is ill-suited to the unsegmented nature of Japanese text. The optimal configuration minimizes character error rates by aligning input data with natural, unspaced Japanese orthography. Furthermore, we observe a statistically significant correlation between the model’s conversion error rate and the physical distance of the source dialect from Tokyo. This finding suggests that the model’s performance effectively serves as a proxy for measuring linguistic distance between dialectal variations and the standard language.
Full article
(This article belongs to the Section Learning)
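CharacTER extends character-level edit rate with shift operations; as a simplified stand-in (not the CharacTER metric itself), a plain character error rate can be computed as Levenshtein distance normalized by reference length, which already sidesteps BLEU's dependence on word segmentation:

```python
def char_error_rate(hyp, ref):
    """Levenshtein edit distance between hypothesis and reference strings,
    normalized by reference length: a simple character-level error rate."""
    m, n = len(hyp), len(ref)
    prev = list(range(n + 1))  # DP row for the empty-hypothesis prefix
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            cur[j] = min(prev[j] + 1,         # deletion
                         cur[j - 1] + 1,      # insertion
                         prev[j - 1] + cost)  # substitution / match
        prev = cur
    return prev[n] / max(n, 1)

# One substitution against a five-character reference -> rate 0.2.
assert char_error_rate("kitte", "kitty") == 0.2
```

Operating on raw characters means the metric works directly on unspaced Japanese text, which is the property the abstract highlights.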
Open Access Article
Explainable Combined Spatial Representations for ECG Arrhythmia Classification
by
Iulia Onică and Iulian B. Ciocoiu
Mach. Learn. Knowl. Extr. 2026, 8(5), 114; https://doi.org/10.3390/make8050114 - 25 Apr 2026
Abstract
The paper addresses ECG arrhythmia classification using a novel input fusion strategy that combines spatial representations of ECG time series recordings. Four distinct time series-to-image transformations are considered, namely classical spectrograms, Gramian Angular Field (GAF), Recurrence Plot (RP), and the S-Transform (ST). Classification of combined 2 × 2 images generated from single-lead ECG recordings is performed using both custom and ResNet-50 deep learning architectures. Finally, several distinct explainability algorithms are used to identify the relevant regions in the input images that mainly influence the classification decisions. Experiments performed on the MIT-BIH and Chapman–Shaoxing arrhythmia datasets revealed performance comparable to more sophisticated approaches in terms of accuracy (99%), F1-score (98.6%), and AUC (0.999) values.
Full article
(This article belongs to the Section Data)
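The Gramian Angular Field mentioned above encodes a rescaled series in polar coordinates and takes pairwise cosines of the angles. A minimal sketch of the summation variant (GASF), illustrative only and not the paper's implementation:

```python
import numpy as np

def gramian_angular_field(x):
    """Gramian Angular Summation Field of a 1-D series."""
    x = np.asarray(x, dtype=float)
    # min-max rescale into [-1, 1] so arccos is defined
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1
    phi = np.arccos(np.clip(x, -1, 1))        # polar (angular) encoding
    return np.cos(phi[:, None] + phi[None, :])  # pairwise angle sums

g = gramian_angular_field([0.0, 0.5, 1.0])
assert g.shape == (3, 3)
assert np.allclose(g, g.T)  # GASF matrices are symmetric
```

The resulting square image preserves temporal ordering along its diagonal, which is what lets a 2-D CNN read temporal structure from a 1-D signal.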
Open Access Article
DPA-HiVQA: Enhancing Structured Radiology Reporting with Dual-Path Cross-Attention
by
Ngoc Tuyen Do, Minh Nguyen Quang and Hai Van Pham
Mach. Learn. Knowl. Extr. 2026, 8(5), 113; https://doi.org/10.3390/make8050113 - 24 Apr 2026
Abstract
Structured radiology reporting can improve clinical decision support by standardizing clinical findings into hierarchical formats. However, answering the thousands of questions about clinical findings in structured report templates is prohibitively time-consuming, which can limit clinical adoption. Furthermore, early medical VQA datasets primarily focused on free-text, independent question–answer pairs. A recent dataset, Rad-ReStruct, introduced hierarchical VQA, but the accompanying model still relies heavily on flattened embedding representations and single-path text–image fusion mechanisms that inadequately handle complex hierarchical dependencies in responses. In this paper, we propose DPA-HiVQA (Dual-Path Cross-Attention for Hierarchical VQA), addressing these limitations through two key contributions: (1) a multi-scale image embedding combining global semantic embeddings with patch-level spatial features from the domain-specific BioViL encoder; (2) a dual-path cross-attention mechanism enabling simultaneous holistic semantic understanding and fine-grained spatial reasoning. Evaluated on the Rad-ReStruct benchmark, the model substantially outperforms the established benchmark baseline, improving overall F1-score and Level 3 F1-score by 21.2% and 31.9%, respectively. The proposed model demonstrates that dual-path cross-attention architectures can effectively connect holistic semantic understanding and fine-grained spatial detail, paving the way for practical AI-assisted structured reporting systems that reduce radiologist burden while maintaining diagnostic accuracy.
Full article
Open Access Article
Scaling Laws in the Tiny Regime: How Small Models Change Their Mistakes
by
Mohammed Alnemari, Rizwan Qureshi and Nader Bagherzadeh
Mach. Learn. Knowl. Extr. 2026, 8(5), 112; https://doi.org/10.3390/make8050112 - 24 Apr 2026
Abstract
Neural scaling laws describe how model performance improves as a power law with size, but existing work has focused almost entirely on models above 100 M parameters. The regime below 20 million parameters, where TinyML and edge AI systems operate, remains largely unexamined. We train 90 models spanning 22 K to 19.8 M parameters across two architecture families (a plain ConvNet and MobileNetV2) on CIFAR-100, varying width while holding depth and training protocol fixed. Both architectures follow approximate power laws, with architecture-specific exponents for ScaleCNN and MobileNetV2. However, the power law does not hold uniformly: local exponents decay with scale, and MobileNetV2 saturates at 19.8 M parameters, hitting a data wall. The structure of errors also changes with scale. The Jaccard overlap between the error sets of the smallest and largest ScaleCNN models is only 0.35; compression changes which inputs are misclassified, not merely how many. Small models develop a triage strategy, concentrating capacity on easy classes (Gini of per-class accuracy: 0.26 at 22 K params vs. 0.09 at 4.7 M) while effectively abandoning the hardest ones (bottom-5 class accuracy: 10% vs. 53%). The smallest models achieve the lowest ECE values (0.013 vs. a peak of 0.110 at mid-size), reversing the typical overconfidence–capacity relationship, though this partly reflects a global-mean matching artifact rather than well-calibrated per-bin confidence. On CIFAR-100, aggregate accuracy alone is therefore a misleading basis for edge deployment decisions; validation must happen at the target model size. All findings in this study are based on CIFAR-100 (32 × 32, 100 classes); their generalizability to other datasets, resolutions, and architectures remains to be verified.
Full article
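Scaling-law exponents like those discussed above are conventionally estimated by a least-squares fit in log-log space, since a power law err ≈ c · N^(−α) becomes a straight line after taking logarithms. A minimal sketch on synthetic data (not the paper's measurements):

```python
import numpy as np

def fit_power_law(params, errors):
    """Fit errors ~ c * params**(-alpha) by least squares in log-log space;
    returns the estimated exponent alpha."""
    slope, _intercept = np.polyfit(np.log(params), np.log(errors), 1)
    return -slope  # the power-law exponent is the negated log-log slope

# Synthetic model sizes spanning the paper's 22 K - 19.8 M range,
# obeying an exact power law with exponent 0.2.
n = np.array([2.2e4, 1e5, 1e6, 1.98e7])
err = 3.0 * n ** -0.2
assert abs(fit_power_law(n, err) - 0.2) < 1e-8
```

The paper's observation that local exponents decay with scale corresponds to this fit being a good approximation only piecewise, not globally.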
Open Access Article
A Hybrid Multi-Domain Feature Fusion Model Integrating MEEMD and Dual CNN for Iris Recognition
by
Zine Eddine Louriga, Ismail Jabri, Aziza El Ouaazizi and Anass El Affar
Mach. Learn. Knowl. Extr. 2026, 8(4), 111; https://doi.org/10.3390/make8040111 - 21 Apr 2026
Abstract
Iris biometric systems are recognized as secure alternatives to conventional authentication methods, yet challenges such as variable illumination, noise, and intricate iris textures persist. To address these issues, our study presents a novel hybrid iris recognition framework that integrates advanced deep learning with a pioneering application of Multivariate Ensemble Empirical Mode Decomposition (MEEMD) for feature extraction—a method not previously applied in this context. Our framework first employs MEEMD to extract statistical features that capture the iris’s nonlinear and nonstationary variations. We then combine global semantic information from two pretrained convolutional neural networks—VGG16 and ResNet-152—with local micro-texture details encoded by Local Binary Patterns (LBP) to form a comprehensive feature representation. An efficient pre-processing and segmentation stage precisely isolates the iris region, and the resulting features are refined through dimensionality reduction techniques to yield a robust, compact representation. These features are subsequently classified using multiple models, each rigorously tuned via hyperparameter optimization. Experimental validation on benchmark datasets—including IITD, CASIA, and UBIRIS.v2—shows that our model achieves recognition rates of up to 98% on IITD, 97% on CASIA, and 97.30% on UBIRIS.v2, surpassing existing approaches. This work not only enhances iris recognition performance but also establishes a novel method that bridges advanced deep learning with innovative feature extraction for high-security applications.
Full article
(This article belongs to the Section Learning)
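The Local Binary Patterns used above for micro-texture details threshold each pixel's 8-neighbourhood against the centre pixel and pack the comparisons into a byte. A minimal sketch (illustrative only, not the authors' pipeline; production code would typically use scikit-image's `local_binary_pattern`):

```python
import numpy as np

def lbp_image(img):
    """Basic 8-neighbour Local Binary Pattern codes for the interior
    pixels of a 2-D grayscale image."""
    img = np.asarray(img, dtype=float)
    center = img[1:-1, 1:-1]
    # neighbour offsets, clockwise from the top-left corner
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(center, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:img.shape[0] - 1 + dy,
                        1 + dx:img.shape[1] - 1 + dx]
        # set this bit wherever the neighbour is >= the centre pixel
        code |= (neighbour >= center).astype(np.uint8) << bit
    return code

# On a constant image every comparison succeeds: all 8 bits set (255).
flat = np.full((4, 4), 7.0)
assert (lbp_image(flat) == 255).all()
```

Histograms of these codes over image regions give the compact micro-texture descriptor that is fused with the CNN features in the framework described above.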
Open Access Article
MENARA: Medical Natural Arabic Response Assistant
by
Ahmed Ibrahim, Abdullah Hosseini, Hoda Helmy, Maryam Arabi, Aya AlShareef, Wafa Lakhdhar and Ahmed Serag
Mach. Learn. Knowl. Extr. 2026, 8(4), 110; https://doi.org/10.3390/make8040110 - 21 Apr 2026
Abstract
Dialectal variation presents a major challenge for deploying medical language models in real-world healthcare settings, where patient–clinician communication often occurs in regional vernaculars rather than standardized language forms. This challenge is particularly pronounced in the Arabic-speaking world, where clinical interactions frequently take place in diverse dialects that differ substantially from Modern Standard Arabic. Fine-tuning and maintaining separate models for each dialect is computationally inefficient and difficult to scale, motivating more integrated approaches. In this work, we present MENARA, an Arabic medical language model constructed by merging Egyptian Arabic, Moroccan Darija, and medical-domain specialists through model merging. We extend prior feasibility findings through comprehensive evaluation of cross-dialect performance, medical safety, and cross-lingual knowledge retention. Specifically, we introduce a fine-grained dialect composition analysis to quantify lexical purity and structured code-switching behavior, benchmark against state-of-the-art Arabic LLMs, and conduct subject-matter-expert assessment of both dialectal fidelity and medical appropriateness. The results show that model merging preserves core medical competence while enabling robust dialectal adaptation, achieving strong cross-dialect fidelity while substantially reducing storage and deployment overhead compared to maintaining separate models. These findings establish model merging as a potentially practical and resource-efficient paradigm for dialect-aware medical NLP in linguistically fragmented healthcare environments.
Full article
(This article belongs to the Special Issue Advancing Natural Language Processing for Low-Resource Languages and Dialects)
Open Access Article
Desirability Rating-Based Counterfactual (DeRaC) Framework for Complex Multi-Output Classification Problems
by
Neelabh Kshetry and Mehmed Kantardzic
Mach. Learn. Knowl. Extr. 2026, 8(4), 109; https://doi.org/10.3390/make8040109 - 19 Apr 2026
Abstract
Counterfactual explanations are increasingly vital for understanding and trusting machine learning models. This paper presents Desirability Rating-based Counterfactual (DeRaC), which is a generalized framework for generating valid counterfactual explanations applicable to classification problems with complex output spaces, including single and multi-output classification with binary and multi-class outputs. By expanding the definition of counterfactual validity through a novel “desirability rating,” the approach addresses limitations in existing methods regarding complex output spaces. The framework introduces concepts such as partially valid counterfactuals and a quantitative measure of output desirability. It can be integrated with various objective functions to identify counterfactuals that satisfy properties such as similarity, proximity, and validity. Experiments demonstrate the feasibility of systematically generating counterfactuals using existing optimization techniques, achieving varying degrees of validity and similarity; specifically, the Genetic Algorithm produces consistently higher counterfactual desirability, albeit at the expense of longer computation times. We observed an average counterfactual desirability rating of 0.871 across all tested optimization methods, with Powell’s method combined with DeRaC achieving the lowest average distance of 0.897 when using a mixed-objective function. The research emphasizes the context-dependent nature of counterfactuals and lays the foundation for more transparent and trustworthy machine learning systems.
Full article
(This article belongs to the Section Learning)
Open Access Article
MIDA—Method for Industrial Data Analysis Based on CRISP-DM
by
Mateus Mendes and Torres Farinha
Mach. Learn. Knowl. Extr. 2026, 8(4), 108; https://doi.org/10.3390/make8040108 - 18 Apr 2026
Abstract
As modern computers became increasingly popular and larger amounts of digital data became available, different methodologies were proposed to extract information from data. The CRISP-DM methodology quickly spread and is currently one of the most popular approaches to data analysis. However, it has some shortcomings, such as being too general or too business-centered. Different authors have proposed variations better suited to specific fields in order to overcome those limitations. The present paper reviews CRISP-DM, some of its variations, and similar methodologies, and proposes the Method for Industrial Data Analysis (MIDA), a methodology conceived and improved over time based on previous experience in industrial engineering processes. MIDA consists of eight steps and partially overlaps with CRISP-DM. It has been successfully applied in several previous projects.
Full article
(This article belongs to the Section Data)
Open Access Article
Enhancing Wearable-Based Elderly Activity Recognition Through a Hybrid Deep Residual Network
by
Sakorn Mekruksavanich and Anuchit Jitpattanakul
Mach. Learn. Knowl. Extr. 2026, 8(4), 107; https://doi.org/10.3390/make8040107 - 18 Apr 2026
Abstract
The rapid growth of the elderly population worldwide demands reliable activity recognition technologies to support independent living and continuous health supervision. However, conventional wearable sensor-based human activity recognition (HAR) techniques often fail to capture the complex temporal behaviour and subtle motion patterns characteristic of the elderly. To address these limitations, this study introduces a hybrid deep residual architecture—CNN-CBAM-BiGRU—that integrates convolutional neural networks (CNNs), the convolutional block attention module (CBAM), and bidirectional gated recurrent units (BiGRUs) to improve activity recognition using inertial measurement unit (IMU) data. In the proposed CNN-CBAM-BiGRU framework, CNN layers automatically derive representative features from raw sensor signals, CBAM applies adaptive channel and spatial attention to highlight informative patterns, and BiGRU captures long-range temporal relationships within activity sequences. The approach was evaluated on three benchmark datasets designed for elderly populations—HAR70+, HARTH, and SisFall—covering daily activities and fall events. The proposed model consistently outperforms existing methods across all datasets, achieving accuracies exceeding 96%, F1-scores above 93%, and a fall detection recall of 93.74%, confirming its robustness and suitability for safety-critical monitoring applications. Class-level evaluation indicates excellent recognition of static postures and consistent performance for dynamic actions. Convergence analysis further confirms efficient learning with limited overfitting across datasets. The proposed framework thus provides a robust and accurate solution for wearable-based elderly activity recognition, with strong potential for deployment in fall detection, health monitoring, and ambient assisted living systems.
Full article
(This article belongs to the Special Issue Sustainable Applications for Machine Learning—2nd Edition)
Open Access Article
vinum-Analytics
by
Nuno Ferreira, Filipe Pinto, António Valente, Diana Augusto, Manuela Reis and Salviano Soares
Mach. Learn. Knowl. Extr. 2026, 8(4), 106; https://doi.org/10.3390/make8040106 - 18 Apr 2026
Abstract
Old-vine vineyards often contain dozens of grapevine varieties intermingled and irregularly distributed, making plant-level varietal identification slow and expensive when based on ampelography or molecular approaches. This paper proposes a field-oriented computer-vision pipeline for Vitis vinifera variety identification using images with natural backgrounds from the historic “Vinha Maria Teresa” parcel (Quinta do Crasto, Portugal). A single-class YOLO11 detector is trained to localize the vine leaf and generate standardized crops, and a YOLO11 classifier is then fine-tuned on leaf regions of interest (ROIs) for eight selected varieties in the Douro UNESCO region. We annotated 2015 vineyard images for classification and supplemented detection training with 2648 additional leaf images; detectors (YOLO11n/s/m) were benchmarked under four augmentation regimes and evaluated on a fixed 48-image subset, including runtime on CPU and GPU. The best detector reached an mAP@50–95 of 0.918 on the benchmark, while YOLO11n achieved ∼27 FPS on CPU for fast cropping. On a 303-image test set, the best classifier (YOLO11s with mixed augmentations) achieved 94.06% Top-1 accuracy, 93.92% macro-F1, and 100% Top-5 accuracy, with the remaining errors concentrated among morphologically similar varieties. To assess deployment-oriented performance, classifiers trained under three input settings (manual crops, detector-generated crops, and full images) were evaluated on a held-out 48-image benchmark subset; removing the detection step reduced Top-1 accuracy from 75.00% to 68.75%, while the gap between manual and automatic crops was only 2.44 pp on successfully detected images, with detection failures (14.6%) representing the primary operational bottleneck. Repeated retraining of the best manual-crop YOLO11s configuration across multiple random seeds showed stable performance with low variability in Top-1 accuracy and macro-F1. Under identical training conditions, ResNet50 and EfficientNet-B0 provided competitive baselines, but YOLO11s remained the strongest overall model on the held-out field benchmark. These results indicate that lightweight leaf detection plus crop-based classification can support scalable varietal identification in old vineyards under realistic acquisition conditions.
Full article
(This article belongs to the Section Learning)
Open Access Article
Diffusion-Based Feature Denoising and Using NNMF for Robust Brain Tumor Classification
by
Hiba Adil Al-kharsan and Róbert Rajkó
Mach. Learn. Knowl. Extr. 2026, 8(4), 105; https://doi.org/10.3390/make8040105 - 18 Apr 2026
Abstract
Brain tumor classification from magnetic resonance imaging (MRI) plays a critical role in computer-assisted diagnosis systems. In recent years, deep learning models have achieved high classification accuracy. However, their sensitivity to adversarial perturbations has become an important reliability concern in medical applications. This study proposes a robust brain tumor classification framework that combines non-negative matrix factorization (NNMF or NMF), lightweight convolutional neural networks (CNNs), and diffusion-based feature purification. Initially, MRI images are preprocessed and converted into a non-negative data matrix, from which compact and interpretable NNMF feature representations are extracted. Statistical metrics, including AUC, Cohen’s d, and p-values, are used to rank and select the most discriminative components. A lightweight CNN classifier is then trained directly on the selected feature groups. To improve adversarial robustness, a diffusion-based feature-space purification module is introduced: a forward noising step followed by a learned denoiser network is applied before classification. System performance is evaluated using both clean accuracy and robust accuracy under strong adversarial attacks generated by AutoAttack. The experimental results show that the proposed framework achieves competitive classification performance while significantly enhancing robustness against adversarial perturbations. The findings suggest that combining interpretable NNMF-based representations with a lightweight deep model and a diffusion-based defense provides an effective and reliable solution for medical image classification under adversarial conditions.
Full article
(This article belongs to the Section Learning)
Open Access Article
Evaluating the Efficacy of Large Language Models in Stock Market Decision-Making: A Decision-Focused, Price-Only, Multi-Country Analysis Using Historical Price Data
by
Maria C. Mariani, Sourav Malakar, Amrita Bagchi, Subhrajyoti Basu, Saptarsi Goswami, Osei Kofi Tweneboah, Sarbadeep Biswas, Ankit Dey and Ankit Sinha
Mach. Learn. Knowl. Extr. 2026, 8(4), 104; https://doi.org/10.3390/make8040104 - 17 Apr 2026
Abstract
This study provides a comparative evaluation of three state-of-the-art large language models (LLMs), namely OpenAI’s (San Francisco, CA, USA) GPT-4.0, Google’s (Google LLC, Mountain View, CA, USA) Gemini 2.0 Flash, and Meta’s (Meta Platforms, Menlo Park, CA, USA) LLaMA-4-Scout-17B-16E, in a decision-oriented framework in which the models generate structured outputs based only on historical closing-price data. The evaluation covers 150 stocks sampled from three countries (India, the United States, and South Africa) across ten economic sectors, including Information Technology, Banking, and Pharmaceuticals. Unlike many prior studies that combine numerical and textual inputs, this study relies solely on three years of numerical time series data and examines model responses in terms of decision labels such as buy, sell, or hold. The LLMs were provided with historical closing-price sequences and prompted with three types of finance-related questions: (a) whether to buy a stock, (b) whether to sell or hold a stock, and (c) in a pairwise comparison, which stock to buy or hold. These prompts were evaluated across two investment horizons: 1 month and 3 months. Model outputs were compared against realized market outcomes during the corresponding test periods. Performance was assessed across four key dimensions: country, sector, annualized volatility, and question type. The models were not given any supplementary financial information or instructions on specific analytical methods. The results indicate that GPT-4.0 achieves the highest average accuracy (56%), followed by LLaMA-4-Scout-17B-16E (48%) and Gemini 2.0 Flash (39%). Overall performance remains moderate and varies across market conditions, with relatively higher accuracy observed in high-volatility regimes (51%). 
This work evaluates how LLMs behave when presented with structured numerical price sequences in a controlled decision-labeling setting and contributes to the broader discussion on the potential and limitations of LLMs for numerical decision tasks in finance.
Full article
Open Access Article
Fake News Detection Through LLM-Driven Text Augmentation Across Media and Languages
by
Abdul Sittar, Mateja Smiljanic, Alenka Guček and Marko Grobelnik
Mach. Learn. Knowl. Extr. 2026, 8(4), 103; https://doi.org/10.3390/make8040103 - 15 Apr 2026
Abstract
The proliferation of fake news across social media, headlines, and news articles poses major challenges for automated detection, particularly in multilingual and cross-media settings affected by data imbalance. We propose a fake news detection framework based on LLM-driven, feature-guided text augmentation. The method generates realistic synthetic samples across languages, media types, and text granularities while preserving meaning and stylistic coherence. Experiments with classical and transformer-based models (Random Forest, Logistic Regression, BERT, XLM-R) across social media, headlines, and multilingual news datasets show consistent improvements in performance. For inherently balanced datasets (e.g., social media), synthetic augmentation yields negligible but stable performance changes. Across imbalanced scenarios, synthetic augmentation substantially improves minority-class recall and F1-score (e.g., fake news recall from 0.57 to 0.86), while preserving majority-class performance, leading to more balanced and reliable classifiers, whereas oversampling significantly degrades results due to overfitting on duplicated language patterns. Overall, a hybrid semantic- and style-based model proves to be the most robust strategy, outperforming oversampling and matching or exceeding baseline performance across datasets.
Full article
Topics
Topic in
Applied Sciences, ASI, Blockchains, Computers, MAKE, Software
Recent Advances in AI-Enhanced Software Engineering and Web Services
Topic Editors: Hai Wang, Zhe Hou
Deadline: 31 May 2026
Topic in
AI, Applied Sciences, Electronics, MAKE
Deep Supplement Learning for Healthcare and Biomedical Applications
Topic Editors: Tahir Cetin Akinci, Ömer Faruk Ertuğrul
Deadline: 30 June 2026
Topic in
Atmosphere, Earth, Encyclopedia, Entropy, Fractal Fract, MAKE, Meteorology
Revisiting Butterfly Effect, Multiscale Dynamics, and Predictability Using AI-Enhanced Modeling Framework (AEMF) and Chaos Theory
Topic Editors: Bo-Wen Shen, Roger A. Pielke Sr., Xubin Zeng
Deadline: 31 July 2026
Topic in
Algorithms, Applied Sciences, Electronics, MAKE, AI, Software
Applications of NLP, AI, and ML in Software Engineering
Topic Editors: Affan Yasin, Javed Ali Khan, Lijie Wen
Deadline: 30 August 2026
Conferences
Special Issues
Special Issue in
MAKE
Language Acquisition and Understanding
Guest Editors: Michal Ptaszynski, Rafal Rzepka, Masaharu Yoshioka
Deadline: 15 July 2026
Special Issue in
MAKE
Advancing Natural Language Processing for Low-Resource Languages and Dialects
Guest Editors: Tanjim Mahmud, Michal Ptaszynski, Karl Andersson
Deadline: 31 July 2026
Special Issue in
MAKE
Trustworthy AI: Integrating Knowledge, Retrieval, and Reasoning
Guest Editor: Konstantinos Diamantaras
Deadline: 31 August 2026
Special Issue in
MAKE
Explainable Artificial Intelligence: Theoretical Foundations and Methodological Advances
Guest Editors: Sheng Du, Javier Del Ser Lorente
Deadline: 31 August 2026
Topical Collections
Topical Collection in
MAKE
Extravaganza Feature Papers on Hot Topics in Machine Learning and Knowledge Extraction
Collection Editor: Andreas Holzinger
Topical Collection in
MAKE
Feature Papers in Safety, Security, Privacy, and Cyber Resilience
Collection Editor: Simon Tjoa
Topical Collection in
MAKE
Robust and Uncertainty-Aware Learning from Real-World Data
Collection Editor: Federico Cabitza