Journal Description
Machine Learning and Knowledge Extraction
Machine Learning and Knowledge Extraction is an international, peer-reviewed, open access journal on machine learning and applications. It publishes original research articles, reviews, tutorials, research ideas, short notes and Special Issues that focus on machine learning and applications. Please see our video on YouTube explaining the MAKE journal concept. The journal is published quarterly online by MDPI.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- High Visibility: indexed within Scopus, ESCI (Web of Science), dblp, and other databases.
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 25.5 days after submission; acceptance to publication is undertaken in 3.4 days (median values for papers published in this journal in the first half of 2025).
- Journal Rank: JCR - Q1 (Engineering, Electrical and Electronic) / CiteScore - Q1 (Engineering (miscellaneous))
- Recognition of Reviewers: reviewers who provide timely, thorough peer-review reports receive vouchers entitling them to a discount on the APC of their next publication in any MDPI journal, in appreciation of the work done.
Impact Factor: 6.0 (2024); 5-Year Impact Factor: 5.7 (2024)
Latest Articles
Exploratory Proof-of-Concept: Predicting the Outcome of Tennis Serves Using Motion Capture and Deep Learning
Mach. Learn. Knowl. Extr. 2025, 7(4), 118; https://doi.org/10.3390/make7040118 - 14 Oct 2025
Abstract
Tennis serves heavily impact match outcomes, yet analysis by coaches is limited by human vision. The design of an automated tennis serve analysis system could facilitate enhanced performance analysis. As serve location and serve success are directly correlated, predicting the outcome of a serve could provide vital information for performance analysis. This article proposes a tennis serve analysis system powered by Machine Learning, which classifies the outcome of serves as “in”, “out” or “net”, and predicts the coordinate outcome of successful serves. Additionally, this work details the collection of three-dimensional spatio-temporal data on tennis serves, using marker-based optoelectronic motion capture. The classification uses a Stacked Bidirectional Long Short-Term Memory architecture, whilst a 3D Convolutional Neural Network architecture is harnessed for serve coordinate prediction. The proposed method achieves 89% accuracy for tennis serve classification, outperforming the current state of the art whilst performing finer-grained classification. The results achieve an accuracy of 63% in predicting the serve coordinates, with a mean absolute error of 0.59 and a root mean squared error of 0.68, exceeding the current state of the art with a new method. The system contributes towards the long-term goal of designing a non-invasive tennis serve analysis system that functions in training and match conditions.
Open Access Article
Learning to Partition: Dynamic Deep Neural Network Model Partitioning for Edge-Assisted Low-Latency Video Analytics
by Yan Lyu, Likai Liu, Xuezhi Wang, Zhiyu Fan, Jinchen Wang and Guanyu Gao
Mach. Learn. Knowl. Extr. 2025, 7(4), 117; https://doi.org/10.3390/make7040117 - 13 Oct 2025
Abstract
In edge-assisted low-latency video analytics, a critical challenge is balancing on-device inference latency against the high bandwidth costs and network delays of offloading. Ineffectively managing this trade-off degrades performance and hinders critical applications like autonomous systems. Existing solutions often rely on static partitioning or greedy algorithms that optimize for a single frame. These myopic approaches adapt poorly to dynamic network and workload conditions, leading to high long-term costs and significant frame drops. This paper introduces a novel partitioning technique driven by a Deep Reinforcement Learning (DRL) agent on a local device that learns to dynamically partition a video analytics Deep Neural Network (DNN). The agent learns a farsighted policy to dynamically select the optimal DNN split point for each frame by observing the holistic system state. By optimizing for a cumulative long-term reward, our method significantly outperforms competitor methods, demonstrably reducing overall system cost and latency while nearly eliminating frame drops in our real-world testbed evaluation. The primary limitation is the initial offline training phase required by the DRL agent. Future work will focus on extending this dynamic partitioning framework to multi-device and multi-edge environments.
Open Access Article
Faithful Narratives from Complex Conceptual Models: Should Modelers or Large Language Models Simplify Causal Maps?
by Tyler J. Gandee and Philippe J. Giabbanelli
Mach. Learn. Knowl. Extr. 2025, 7(4), 116; https://doi.org/10.3390/make7040116 - 7 Oct 2025
Abstract
(1) Background: Comprehensive conceptual models can result in complex artifacts, consisting of many concepts that interact through multiple mechanisms. This complexity can be acceptable and even expected when generating rich models, for instance to support ensuing analyses that find central concepts or decompose models into parts that can be managed by different actors. However, complexity can become a barrier when the conceptual model is used directly by individuals. A ‘transparent’ model can support learning among stakeholders (e.g., in group model building) and it can motivate the adoption of specific interventions (i.e., using a model as evidence base). Although advances in graph-to-text generation with Large Language Models (LLMs) have made it possible to transform conceptual models into textual reports consisting of coherent and faithful paragraphs, turning a large conceptual model into a very lengthy report would only displace the challenge. (2) Methods: We experimentally examine the implications of two possible approaches: asking the text generator to simplify the model, either via abstractive (LLMs) or extractive summarization, or simplifying the model through graph algorithms and then generating the complete text. (3) Results: We find that the two approaches have similar scores on text-based evaluation metrics including readability and overlap scores (ROUGE, BLEU, Meteor), but faithfulness can be lower when the text generator decides on what is an interesting fact and is tasked with creating a story. These automated metrics capture textual properties, but they do not assess actual user comprehension, which would require an experimental study with human readers. (4) Conclusions: Our results suggest that graph algorithms may be preferable to support modelers in scientific translations from models to text while minimizing hallucinations.
(This article belongs to the Topic AI and Computational Methods for Modelling, Simulations and Optimizing of Advanced Systems: Innovations in Complexity, Second Edition)
Open Access Review
AI-Enabled IoT Intrusion Detection: Unified Conceptual Framework and Research Roadmap
by Antonio Villafranca, Kyaw Min Thant, Igor Tasic and Maria-Dolores Cano
Mach. Learn. Knowl. Extr. 2025, 7(4), 115; https://doi.org/10.3390/make7040115 - 6 Oct 2025
Abstract
The Internet of Things (IoT) revolutionizes connectivity, enabling innovative applications across healthcare, industry, and smart cities but also introducing significant cybersecurity challenges due to its expanded attack surface. Intrusion Detection Systems (IDSs) play a pivotal role in addressing these challenges, offering tailored solutions to detect and mitigate threats in dynamic and resource-constrained IoT environments. Through a rigorous analysis, this study classifies IDS research based on methodologies, performance metrics, and application domains, providing a comprehensive synthesis of the field. Key findings reveal a paradigm shift towards integrating artificial intelligence (AI) and hybrid approaches, surpassing the limitations of traditional, static methods. These advancements highlight the potential for IDSs to enhance scalability, adaptability, and detection accuracy. However, unresolved challenges, such as resource efficiency and real-world applicability, underline the need for further research. By contextualizing these findings within the broader landscape of IoT security, this work emphasizes the critical importance of developing IDS solutions that ensure the reliability, privacy, and security of interconnected systems, contributing to the sustainable evolution of IoT ecosystems.
Open Access Article
Enhancing Cancer Classification from RNA Sequencing Data Using Deep Learning and Explainable AI
by Haseeb Younis and Rosane Minghim
Mach. Learn. Knowl. Extr. 2025, 7(4), 114; https://doi.org/10.3390/make7040114 - 1 Oct 2025
Abstract
Cancer is one of the deadliest diseases, costing millions of lives and billions of USD every year. There are different ways to identify the biomarkers that can be used to detect cancer types and subtypes. RNA sequencing is steadily taking the lead as the method of choice due to its ability to access global gene expression in biological samples and facilitate more flexible methods and robust analyses. Numerous studies have employed artificial intelligence (AI) and specifically machine learning techniques to detect cancer in its early stages. However, most of the models provided are very specific to particular cancer types and do not generalize. This paper proposes a combined deep learning and explainable AI (XAI) approach to classifying cancer subtypes and a deep learning-based approach for the classification of cancer types using BARRA:CuRDa, an RNA-seq database with 17 datasets for seven cancer types. One architecture is designed to classify cancer subtypes with around 100% accuracy, precision, recall, F1 score, and G-Mean. This architecture outperforms the previous methodologies for all individual datasets. The second architecture is designed to classify multiple cancer types; it classifies eight types with approximately 87% validation accuracy, precision, recall, F1 score, and G-Mean. Within the same process, we employ XAI, which identifies 99 genes out of 58,735 input genes that could be potential biomarkers for different cancer types. We also perform Pathway Enrichment Analysis and Visual Analysis to establish the significance and robustness of our methodology. The proposed methodology can classify cancer types and subtypes with robust results and can be extended to other cancer types.
Open Access Article
TL-Efficient-SE: A Transfer Learning-Based Attention-Enhanced Model for Fingerprint Liveness Detection Across Multi-Sensor Spoof Attacks
by Archana Pallakonda, Rayappa David Amar Raj, Rama Muni Reddy Yanamala, Christian Napoli and Cristian Randieri
Mach. Learn. Knowl. Extr. 2025, 7(4), 113; https://doi.org/10.3390/make7040113 - 1 Oct 2025
Abstract
Fingerprint authentication systems encounter growing threats from presentation attacks, making strong liveness detection crucial. This work presents a deep learning-based framework integrating EfficientNetB0 with a Squeeze-and-Excitation (SE) attention approach, using transfer learning to enhance feature extraction. The LivDet 2015 dataset, composed of both real and fake fingerprints taken using four optical sensors and spoofs made using PlayDoh, Ecoflex, and Gelatine, is used to train and test the model architecture. The input images are scaled and normalized to conform to EfficientNetB0’s format, and stratified splitting is then performed. The SE module adaptively emphasizes informative features to differentiate live from fake inputs. The classification head comprises fully connected layers, dropout, batch normalization, and a sigmoid output. Empirical results exhibit accuracy between 98.50% and 99.50%, with an AUC varying from 0.978 to 0.9995, providing high precision and recall for genuine users, and robust generalization across unseen spoof types. Compared to existing methods like Slim-ResCNN and HyiPAD, the novelty of our model lies in the Squeeze-and-Excitation mechanism, which enhances feature discrimination by adaptively recalibrating the channels of the feature maps, thereby improving the model’s ability to differentiate between live and spoofed fingerprints. This model has practical implications for deployment in real-time biometric systems, including mobile authentication and secure access control, presenting an efficient solution for protecting against sophisticated spoofing methods. Future research will focus on sensor-invariant learning and adaptive thresholds to further enhance resilience against varying spoofing attacks.
(This article belongs to the Special Issue Advances in Machine and Deep Learning)
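To make the channel recalibration described in the abstract above more concrete, here is a minimal pure-Python sketch of a generic Squeeze-and-Excitation step: global average pooling per channel, a small two-layer gate, and channel-wise rescaling. It is not the authors' implementation; the function name `se_recalibrate`, the weight layout, and the omission of bias terms are assumptions made for illustration.

```python
import math


def se_recalibrate(feature_maps, w_reduce, w_expand):
    """Generic Squeeze-and-Excitation step on C channels of H x W values.

    feature_maps: list of C channels, each a list of rows (H x W floats).
    w_reduce:     (C/r) x C weight matrix of the first dense layer (assumed, no bias).
    w_expand:     C x (C/r) weight matrix of the second dense layer (assumed, no bias).
    Returns (rescaled_feature_maps, gates).
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]
    # Excitation: dense -> ReLU -> dense -> sigmoid gives one gate in (0, 1) per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w_reduce]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w_expand
    ]
    # Scale: every value in a channel is multiplied by that channel's gate.
    rescaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_maps, gates)
    ]
    return rescaled, gates
```

In a trained network the two weight matrices are learned, so channels that help separate live from spoofed fingerprints would receive gates closer to 1 while less useful channels are damped.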
Open Access Article
Adversarial Attacks Detection Method for Tabular Data
by Łukasz Wawrowski, Piotr Biczyk, Dominik Ślęzak and Marek Sikora
Mach. Learn. Knowl. Extr. 2025, 7(4), 112; https://doi.org/10.3390/make7040112 - 1 Oct 2025
Abstract
Adversarial attacks involve malicious actors introducing intentional perturbations to machine learning (ML) models, causing unintended behavior. This poses a significant threat to the integrity and trustworthiness of ML models, necessitating the development of robust detection techniques to protect systems from potential threats. The paper proposes a new approach for detecting adversarial attacks using a surrogate model and diagnostic attributes. The method was tested on 22 tabular datasets on which four different ML models were trained. Furthermore, various attacks were conducted, which led to obtaining perturbed data. The proposed approach is characterized by high efficiency in detecting known and unknown attacks—balanced accuracy was above 0.94, with very low false negative rates (0.02–0.10) for binary detection. Sensitivity analysis shows that classifiers trained based on diagnostic attributes can detect even very subtle adversarial attacks.
(This article belongs to the Section Learning)
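The abstract above reports balanced accuracy and false negative rate for binary attack detection. As a reminder of how these two quantities are usually computed, here is a small sketch; the label convention (1 = adversarial sample, 0 = clean sample) and the function name are assumptions, not taken from the paper.

```python
def binary_detection_metrics(y_true, y_pred):
    """Return (balanced_accuracy, false_negative_rate) for 0/1 labels.

    Assumed convention: 1 = adversarial sample, 0 = clean sample.
    Both classes must be present in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = tp / (tp + fn)  # share of attacks that were detected
    specificity = tn / (tn + fp)  # share of clean samples left alone
    balanced_accuracy = (sensitivity + specificity) / 2
    false_negative_rate = fn / (fn + tp)  # share of attacks that were missed
    return balanced_accuracy, false_negative_rate
```

Balanced accuracy averages the per-class recall, so it stays informative when perturbed and clean samples are not equally frequent.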
Open Access Article
Transfer Learning for Generalized Safety Risk Detection in Industrial Video Operations
by Luciano Radrigan, Sebastián E. Godoy and Anibal S. Morales
Mach. Learn. Knowl. Extr. 2025, 7(4), 111; https://doi.org/10.3390/make7040111 - 30 Sep 2025
Abstract
This paper proposes a transfer learning-based approach to enhance video-driven safety risk detection in industrial environments, addressing the critical challenge of limited generalization across diverse operational scenarios. Conventional deep learning models trained on specific operational contexts often fail when applied to new environments with different lighting, camera angles, or machinery configurations, exhibiting a significant drop in performance (e.g., F1-score declining below 0.85). To overcome this issue, an incremental feature transfer learning strategy is introduced, enabling efficient adaptation of risk detection models using only small amounts of data from new scenarios. This approach leverages prior knowledge from pre-trained models to reduce the reliance on large labeled datasets, which is particularly valuable in industrial settings where rare but critical safety risk events are difficult to capture. Additionally, training efficiency is improved compared with a classic approach, supporting deployment on resource-constrained edge devices. The strategy involves incremental retraining using video segments with average durations ranging from approximately 2.5 to 25 min (corresponding to 5–50% of new scenario data), enabling scalable generalization across multiple forklift-related risk activities. Interpretability is enhanced through SHAP-based analysis, which reveals a redistribution of feature relevance toward critical components, thereby improving model transparency and reducing annotation demands. Experimental results confirm that the transfer learning strategy significantly improves detection accuracy, robustness, and adaptability, making it a practical and scalable solution for safety monitoring in dynamic industrial environments.
Open Access Article
Attention-Guided Differentiable Channel Pruning for Efficient Deep Networks
by Anouar Chahbouni, Khaoula El Manaa, Yassine Abouch, Imane El Manaa, Badre Bossoufi, Mohammed El Ghzaoui and Rachid El Alami
Mach. Learn. Knowl. Extr. 2025, 7(4), 110; https://doi.org/10.3390/make7040110 - 29 Sep 2025
Abstract
Deploying deep learning (DL) models in real-world environments remains a major challenge, particularly under resource-constrained conditions where achieving both high accuracy and compact architectures is essential. While effective, conventional pruning methods often suffer from high computational overhead, accuracy degradation, or disruption of the end-to-end training process, limiting their practicality for embedded and real-time applications. We present Dynamic Attention-Guided Pruning (DAGP), a soft channel pruning framework that overcomes these limitations by embedding learnable, differentiable pruning masks directly within convolutional neural networks (CNNs). These masks act as implicit attention mechanisms, adaptively suppressing non-informative channels during training. A progressively scheduled L1 regularization, activated after a warm-up phase, enables gradual sparsity while preserving early learning capacity. Unlike prior methods, DAGP is retraining-free, introduces minimal architectural overhead, and supports optional hard pruning for deployment efficiency. Joint optimization of classification and sparsity objectives ensures stable convergence and task-adaptive channel selection. Experiments on CIFAR-10 (VGG16, ResNet56) and PlantVillage (custom CNN) achieve up to 98.82% FLOPs reduction with accuracy gains over baselines. Real-world validation on an enhanced PlantDoc dataset for agricultural monitoring achieves 60 ms inference with only 2.00 MB RAM on a Raspberry Pi 4, confirming efficiency under field conditions. These results illustrate DAGP’s potential to scale beyond agriculture to diverse edge-intelligent systems requiring lightweight, accurate, and deployable models.
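The abstract above combines three ingredients: soft per-channel masks, an L1 penalty that is switched on after a warm-up phase and then increased, and optional hard pruning. The sketch below shows one plausible way these pieces could fit together. The abstract does not specify the parameterization, so the sigmoid mask, the linear ramp, the 0.5 threshold, and all function names are assumptions rather than the published DAGP design.

```python
import math


def soft_mask(logits):
    """Map one learnable logit per channel to a soft mask value in (0, 1)."""
    return [1.0 / (1.0 + math.exp(-a)) for a in logits]


def l1_weight(epoch, warmup_epochs, ramp_epochs, max_weight):
    """Penalty weight: zero during warm-up, then a linear ramp up to max_weight."""
    if epoch < warmup_epochs:
        return 0.0
    progress = min(1.0, (epoch - warmup_epochs + 1) / ramp_epochs)
    return max_weight * progress


def sparsity_penalty(logits, epoch, warmup_epochs, ramp_epochs, max_weight):
    """Scheduled L1 penalty on the masks (masks are positive, so L1 is a plain sum)."""
    weight = l1_weight(epoch, warmup_epochs, ramp_epochs, max_weight)
    return weight * sum(soft_mask(logits))


def hard_prune(logits, threshold=0.5):
    """Indices of channels kept when masks are binarized for deployment."""
    return [i for i, m in enumerate(soft_mask(logits)) if m >= threshold]
```

During training, the penalty would be added to the classification loss so that both objectives are optimized jointly; because the weight is zero in the warm-up epochs, the network can first learn useful features before channels start being suppressed.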
Open Access Article
Enhancing Soundscape Characterization and Pattern Analysis Using Low-Dimensional Deep Embeddings on a Large-Scale Dataset
by Daniel Alexis Nieto Mora, Leonardo Duque-Muñoz and Juan David Martínez Vargas
Mach. Learn. Knowl. Extr. 2025, 7(4), 109; https://doi.org/10.3390/make7040109 - 24 Sep 2025
Abstract
Soundscape monitoring has become an increasingly important tool for studying ecological processes and supporting habitat conservation. While many recent advances focus on identifying species through supervised learning, there is growing interest in understanding the soundscape as a whole while considering patterns that extend beyond individual vocalizations. This broader view requires unsupervised approaches capable of capturing meaningful structures related to temporal dynamics, frequency content, spatial distribution, and ecological variability. In this study, we present a fully unsupervised framework for analyzing large-scale soundscape data using deep learning. We applied a convolutional autoencoder (Soundscape-Net) to extract acoustic representations from over 60,000 recordings collected across a grid-based sampling design in the Rey Zamuro Reserve in Colombia. These features were initially compared with other audio characterization methods, showing superior performance in multiclass classification, with accuracies of 0.85 for habitat cover identification and 0.89 for time-of-day classification across 13 days. For the unsupervised study, optimized dimensionality reduction methods (Uniform Manifold Approximation and Projection and Pairwise Controlled Manifold Approximation and Projection) were applied to project the learned features, achieving trustworthiness scores above 0.96. Subsequently, clustering was performed using KMeans and Density-Based Spatial Clustering of Applications with Noise (DBSCAN), with evaluations based on metrics such as the silhouette, where scores above 0.45 were obtained, thus supporting the robustness of the discovered latent acoustic structures. 
To interpret and validate the resulting clusters, we combined multiple strategies: spatial mapping through interpolation, analysis of acoustic index variance to understand the cluster structure, and graph-based connectivity analysis to identify ecological relationships between the recording sites. Our results demonstrate that this approach can uncover both local and broad-scale patterns in the soundscape, providing a flexible and interpretable pathway for unsupervised ecological monitoring.
Open Access Article
Learning to Balance Mixed Adversarial Attacks for Robust Reinforcement Learning
by Mustafa Erdem and Nazım Kemal Üre
Mach. Learn. Knowl. Extr. 2025, 7(4), 108; https://doi.org/10.3390/make7040108 - 24 Sep 2025
Abstract
Reinforcement learning agents are highly susceptible to adversarial attacks that can severely compromise their performance. Although adversarial training is a common countermeasure, most existing research focuses on defending against single-type attacks targeting either observations or actions. This narrow focus overlooks the complexity of real-world mixed attacks, where an agent’s perceptions and resulting actions are perturbed simultaneously. To systematically study these threats, we introduce the Action and State-Adversarial Markov Decision Process (ASA-MDP), which models the interaction as a zero-sum game between the agent and an adversary attacking both states and actions. Using this framework, we show that agents trained conventionally or against single-type attacks remain highly vulnerable to mixed perturbations. Moreover, we identify a key challenge in this setting: a naive mixed-type adversary often fails to effectively balance its perturbations across modalities during training, limiting the agent’s robustness. To address this, we propose the Action and State-Adversarial Proximal Policy Optimization (ASA-PPO) algorithm, which enables the adversary to learn a balanced strategy, distributing its attack budget across both state and action spaces. This, in turn, enhances the robustness of the trained agent against a wide range of adversarial scenarios. Comprehensive experiments across diverse environments demonstrate that policies trained with ASA-PPO substantially outperform baselines—including standard PPO and single-type adversarial methods—under action-only, observation-only, and, most notably, mixed-attack conditions.
Open Access Article
Saliency-Guided Local Semantic Mixing for Long-Tailed Image Classification
by Jiahui Lv, Jun Lei, Jun Zhang, Chao Chen and Shuohao Li
Mach. Learn. Knowl. Extr. 2025, 7(3), 107; https://doi.org/10.3390/make7030107 - 22 Sep 2025
Abstract
In real-world visual recognition tasks, long-tailed distributions pose a widespread challenge, with extreme class imbalance severely limiting the representational learning capability of deep models. In practice, due to this imbalance, deep models often exhibit poor generalization performance on tail classes. To address this issue, data augmentation through the synthesis of new tail-class samples has become an effective method. One popular approach is CutMix, which explicitly mixes images from tail and other classes, constructing labels based on the ratio of the regions cropped from both images. However, region-based labels completely ignore the inherent semantic information of the augmented samples. To overcome this problem, we propose a saliency-guided local semantic mixing (LSM) method, which uses differentiable block decoupling and semantic-aware local mixing techniques. This method integrates head-class backgrounds while preserving the key discriminative features of tail classes and dynamically assigns labels to effectively augment tail-class samples. This results in efficient balancing of long-tailed data distributions and significant improvements in classification performance. Experimental validation on three long-tailed benchmark datasets shows classification accuracy improvements of 5.0%, 7.3%, and 6.1%, respectively. Notably, the LSM framework is highly compatible, seamlessly integrating with existing classification models and providing significant performance gains, validating its broad applicability.
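The limitation of region-based labels mentioned in the abstract above is easy to see in code. The sketch below computes the standard CutMix soft label, which depends only on the area of the pasted patch and never on what the patch contains. This illustrates the CutMix baseline, not the proposed LSM method; the function name and the box format are assumptions.

```python
def cutmix_label(label_a, label_b, num_classes, image_hw, box):
    """Soft label for an image A with a rectangular patch pasted from image B.

    image_hw: (height, width) of the image.
    box:      (top, left, height, width) of the pasted patch (assumed format).
    The label weight of each class equals the fraction of pixels it contributes.
    """
    img_h, img_w = image_hw
    _, _, box_h, box_w = box
    lam = 1.0 - (box_h * box_w) / (img_h * img_w)  # share of pixels still from A

    target = [0.0] * num_classes
    target[label_a] += lam
    target[label_b] += 1.0 - lam
    return target
```

For a 32 x 32 image with a 16 x 16 patch, the patch class always receives a weight of 0.25, whether the patch shows the object itself or only background, which is the semantic blind spot the abstract points out.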
Open Access Article
Bayesian Learning Strategies for Reducing Uncertainty of Decision-Making in Case of Missing Values
by Vitaly Schetinin and Livija Jakaite
Mach. Learn. Knowl. Extr. 2025, 7(3), 106; https://doi.org/10.3390/make7030106 - 22 Sep 2025
Abstract
Background: Liquidity crises pose significant risks to financial stability, and missing data in predictive models increase the uncertainty in decision-making. This study aims to develop a robust Bayesian Model Averaging (BMA) framework using decision trees (DTs) to enhance liquidity crisis prediction under missing data conditions, offering reliable probabilistic estimates and insights into uncertainty. Methods: We propose a BMA framework over DTs, employing Reversible Jump Markov Chain Monte Carlo (RJ MCMC) sampling with a sweeping strategy to mitigate overfitting. Three preprocessing techniques for missing data were evaluated: Cont (treating variables as continuous with missing values labeled by a constant), ContCat (converting variables with missing values to categorical), and Ext (extending features with binary missing-value indicators). Results: The Ext method achieved 100% accuracy on a synthetic dataset and 92.2% on a real-world dataset of 20,000 companies (11% in crisis), outperforming baselines (AUC PRC 0.817 vs. 0.803, p < 0.05). The framework provided interpretable uncertainty estimates and identified key financial indicators driving crisis predictions. Conclusions: The BMA-DT framework with the Ext technique offers a scalable, interpretable solution for handling missing data, improving prediction accuracy and uncertainty estimation in liquidity crisis forecasting, with potential applications in finance, healthcare, and environmental modeling.
(This article belongs to the Section Learning)
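Of the three preprocessing techniques named in the abstract above, Ext (extending the features with binary missing-value indicators) is the simplest to illustrate. The following sketch shows the general idea only; encoding missing values as `None`, filling them with a constant, and adding indicators only for features that actually contain missing values are assumptions, not details confirmed by the abstract.

```python
def extend_with_missing_indicators(rows, fill_value=0.0):
    """Append one 0/1 indicator column per feature that has missing values.

    rows: list of equal-length feature lists, with None marking a missing value.
    Missing entries are replaced by fill_value (an assumed placeholder).
    """
    n_features = len(rows[0])
    has_missing = [any(row[j] is None for row in rows) for j in range(n_features)]

    extended = []
    for row in rows:
        values = [fill_value if v is None else v for v in row]
        flags = [1 if row[j] is None else 0 for j in range(n_features) if has_missing[j]]
        extended.append(values + flags)
    return extended
```

With the indicator columns in place, a decision tree can split directly on whether a value was missing, instead of having to infer this from an arbitrary fill constant.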
Open Access Systematic Review
Customer Churn Prediction: A Systematic Review of Recent Advances, Trends, and Challenges in Machine Learning and Deep Learning
by Mehdi Imani, Majid Joudaki, Ali Beikmohammadi and Hamid Reza Arabnia
Mach. Learn. Knowl. Extr. 2025, 7(3), 105; https://doi.org/10.3390/make7030105 - 21 Sep 2025
Abstract
Background: Customer churn significantly impacts business revenues. Machine Learning (ML) and Deep Learning (DL) methods are increasingly adopted to predict churn, yet a systematic synthesis of recent advancements is lacking. Objectives: This systematic review evaluates ML and DL approaches for churn prediction, identifying trends, challenges, and research gaps from 2020 to 2024. Data Sources: Six databases (Springer, IEEE, Elsevier, MDPI, ACM, Wiley) were searched via Lens.org for studies published between January 2020 and December 2024. Study Eligibility Criteria: Peer-reviewed original studies applying ML/DL techniques for churn prediction were included. Reviews, preprints, and non-peer-reviewed works were excluded. Methods: Screening followed PRISMA 2020 guidelines. A two-phase strategy identified 240 studies for bibliometric analysis and 61 for detailed qualitative synthesis. Results: Ensemble methods (e.g., XGBoost, LightGBM) remain dominant in ML, while DL approaches (e.g., LSTM, CNN) are increasingly applied to complex data. Challenges include class imbalance, interpretability, concept drift, and limited use of profit-oriented metrics. Explainable AI and adaptive learning show potential but limited real-world adoption. Limitations: No formal risk of bias or certainty assessments were conducted. Study heterogeneity prevented meta-analysis. Conclusions: ML and DL methods have matured as key tools for churn prediction, yet gaps remain in interpretability, real-world deployment, and business-aligned evaluation. Systematic Review Registration: Registered retrospectively in OSF.

Open Access Article
Screening Smarter, Not Harder: Budget Allocation Strategies for Technology-Assisted Reviews (TARs) in Empirical Medicine
by
Giorgio Maria Di Nunzio
Mach. Learn. Knowl. Extr. 2025, 7(3), 104; https://doi.org/10.3390/make7030104 - 20 Sep 2025
Abstract
In the technology-assisted review (TAR) area, most research has focused on ranking effectiveness and active learning strategies within individual topics, often assuming unconstrained review effort. However, real-world applications such as legal discovery or medical systematic reviews are frequently subject to global screening budgets. In this paper, we revisit the CLEF eHealth TAR shared tasks (2017–2019) through the lens of budget-aware evaluation. We first reproduce and verify the official participant results, organizing them into a unified dataset for comparative analysis. Then, we introduce and assess four intuitive budget allocation strategies—even, proportional, inverse proportional, and threshold-capped greedy—to explore how review effort can be efficiently distributed across topics. To evaluate systems under resource constraints, we propose two cost-aware metrics: relevant found per cost unit (RFCU) and utility gain at budget (UG@B). These complement traditional recall by explicitly modeling efficiency and trade-offs between true and false positives. Our results show that different allocation strategies optimize different metrics: even and inverse proportional allocation favor recall, while proportional and capped strategies better maximize RFCU. UG@B remains relatively stable across strategies, reflecting its balanced formulation. A correlation analysis reveals that RFCU and UG@B offer distinct perspectives from recall, with varying alignment across years. Together, these findings underscore the importance of aligning evaluation metrics and allocation strategies with screening goals. We release all data and code to support reproducibility and future research on cost-sensitive TAR.
(This article belongs to the Topic The Use of Big Data in Public Health Research and Practice)
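As a rough illustration of the budget allocation idea in the abstract above, the sketch below splits a global screening budget across topics in three of the named ways (even, proportional, inverse proportional). It is not the authors' released code: the use of the candidate-pool size as the basis for proportionality, the floor rounding, the function names, and the reading of RFCU as relevant documents found divided by documents screened are all assumptions made for this example.

```python
from fractions import Fraction


def allocate_budget(topic_sizes, budget, strategy="even"):
    """Split a global screening budget across topics.

    topic_sizes maps a topic id to its number of candidate documents
    (assumed here to be the basis for proportional allocation).
    Shares are floored and capped at the topic size, so part of the
    budget may stay unspent.
    """
    if strategy == "even":
        weights = {t: Fraction(1) for t in topic_sizes}
    elif strategy == "proportional":
        weights = {t: Fraction(n) for t, n in topic_sizes.items()}
    elif strategy == "inverse":
        weights = {t: Fraction(1, n) for t, n in topic_sizes.items()}
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    total = sum(weights.values())
    return {
        t: min(topic_sizes[t], int(budget * w / total))
        for t, w in weights.items()
    }


def rfcu(relevant_found, documents_screened):
    """Relevant found per cost unit, assuming one cost unit per screened document."""
    if documents_screened == 0:
        return 0.0
    return relevant_found / documents_screened


# Hypothetical topics with 100, 300 and 600 candidate documents.
sizes = {"A": 100, "B": 300, "C": 600}
for name in ("even", "proportional", "inverse"):
    print(name, allocate_budget(sizes, 300, name))
```

Exact rational arithmetic (`Fraction`) is used only so that the floored shares do not depend on floating-point rounding.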

Open Access Article
Leveraging LLMs for Automated Extraction and Structuring of Educational Concepts and Relationships
by
Tianyuan Yang, Baofeng Ren, Chenghao Gu, Tianjia He, Boxuan Ma and Shin’ichi Konomi
Mach. Learn. Knowl. Extr. 2025, 7(3), 103; https://doi.org/10.3390/make7030103 - 19 Sep 2025
Abstract
Students must navigate large catalogs of courses and make appropriate enrollment decisions in many online learning environments. In this context, identifying key concepts and their relationships is essential for understanding course content and informing course recommendations. However, identifying and extracting concepts can be an extremely labor-intensive and time-consuming task when it has to be done manually. Traditional NLP-based methods to extract relevant concepts from courses heavily rely on resource-intensive preparation of detailed course materials, thereby failing to minimize labor. As recent advances in large language models (LLMs) offer a promising alternative for automating concept identification and relationship inference, we thoroughly investigate the potential of LLMs in automatically generating course concepts and their relations. Specifically, we systematically evaluate three LLM variants (GPT-3.5, GPT-4o-mini, and GPT-4o) across three distinct educational tasks, which are concept generation, concept extraction, and relation identification, using six systematically designed prompt configurations that range from minimal context (course title only) to rich context (course description, seed concepts, and subtitles). We systematically assess model performance through extensive automated experiments using standard metrics (Precision, Recall, F1, and Accuracy) and human evaluation by four domain experts, providing a comprehensive analysis of how prompt design and model choice influence the quality and reliability of the generated concepts and their interrelations. Our results show that GPT-3.5 achieves the highest scores on quantitative metrics, whereas GPT-4o and GPT-4o-mini often generate concepts that are more educationally meaningful despite lexical divergence from the ground truth. Nevertheless, LLM outputs still require expert revision, and performance is sensitive to prompt complexity. Overall, our experiments demonstrate the viability of LLMs as a tool for supporting educational content selection and delivery.
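The abstract above contrasts scores on standard metrics with concepts that are meaningful "despite lexical divergence from the ground truth". A minimal sketch of set-based Precision, Recall, and F1 shows why: with exact string matching, a sensible paraphrase counts as a miss. The matching rule (case-insensitive exact match), the function name, and the concept lists are assumptions for illustration, not the paper's evaluation protocol.

```python
def precision_recall_f1(predicted, reference):
    """Score predicted concepts against reference concepts.

    Assumes case-insensitive exact matching after trimming whitespace.
    """
    pred = {c.strip().lower() for c in predicted}
    ref = {c.strip().lower() for c in reference}
    true_positives = len(pred & ref)

    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical model output and ground truth for one course.
predicted = ["Gradient Descent", "Overfitting", "Weight Updates", "Model Capacity"]
reference = ["gradient descent", "overfitting", "backpropagation", "regularization"]
print(precision_recall_f1(predicted, reference))
```

In this made-up example, "Weight Updates" is arguably close to "backpropagation", yet it earns no credit under exact matching.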

Open Access Article
Exploiting the Feature Space Structures of KNN and OPF Algorithms for Identification of Incipient Faults in Power Transformers
by
André Gifalli, Marco Akio Ikeshoji, Danilo Sinkiti Gastaldello, Victor Hideki Saito Yamaguchi, Welson Bassi, Talita Mazon, Floriano Torres Neto, Pedro da Costa Junior and André Nunes de Souza
Mach. Learn. Knowl. Extr. 2025, 7(3), 102; https://doi.org/10.3390/make7030102 - 18 Sep 2025
Abstract
Power transformers represent critical assets within the electrical power system, and their unexpected failures may result in substantial financial losses for both utilities and consumers. Dissolved Gas Analysis (DGA) is a well-established diagnostic method extensively employed to detect incipient faults in power transformers. Although several conventional and machine learning techniques have been applied to DGA, most of them focus only on fault classification and lack the capability to provide predictive scenarios that would enable proactive maintenance planning. In this context, the present study introduces a novel approach to DGA interpretation, which highlights the trends and progression of faults by exploring the feature space through the algorithms k-Nearest Neighbors (KNN) and Optimum-Path Forest (OPF). To improve accuracy, the following strategies were implemented: statistical filtering based on normal distribution to eliminate outliers from the dataset; augmentation of gas-related features; and feature selection using optimization algorithms such as Cuckoo Search and Genetic Algorithms. The approach was validated using data from several transformers, with fault diagnoses cross-checked against inspection reports provided by the utility company. The findings indicate that the proposed method offers valuable insights into the progression, proximity, and classification of faults with satisfactory accuracy, thereby supporting its recommendation as a complementary tool for diagnosing incipient transformer faults.

Open Access Article
CRISP-NET: Integration of the CRISP-DM Model with Network Analysis
by
Héctor Alejandro Acuña-Cid, Eduardo Ahumada-Tello, Óscar Omar Ovalle-Osuna, Richard Evans, Julia Elena Hernández-Ríos and Miriam Alondra Zambrano-Soto
Mach. Learn. Knowl. Extr. 2025, 7(3), 101; https://doi.org/10.3390/make7030101 - 16 Sep 2025
Abstract
To carry out data analysis, it is necessary to implement a model that guides the process in an orderly and sequential manner, with the aim of maintaining control over software development and its documentation. One of the most widely used tools in the field of data analysis is the Cross-Industry Standard Process for Data Mining (CRISP-DM), which serves as a reference framework for data mining, allowing the identification of patterns and, based on them, supporting informed decision-making. Another tool used for pattern identification and the study of relationships within systems is network analysis (NA), which makes it possible to explore how different components are interconnected. The integration of these tools can be justified and developed under the principles of Situational Method Engineering (SME), which allows for the adaptation and customization of existing methods according to the specific needs of a problem or context. Through SME, it is possible to determine which components of CRISP-DM need to be adjusted to efficiently incorporate NA, ensuring that this integration aligns with the project’s objectives in a structured and effective manner. The proposed methodological process was applied in a real working group, which allowed its functionality to be validated, each phase to be documented, and concrete outputs to be generated, demonstrating its usefulness for the development of analytical projects.
(This article belongs to the Topic AI and Computational Methods for Modelling, Simulations and Optimizing of Advanced Systems: Innovations in Complexity, Second Edition)

Open Access Article
Learnable Petri Net Neural Network Using Max-Plus Algebra
by
Mohammed Sharafath Abdul Hameed, Sofiene Lassoued and Andreas Schwung
Mach. Learn. Knowl. Extr. 2025, 7(3), 100; https://doi.org/10.3390/make7030100 - 13 Sep 2025
Abstract
Interpretable decision-making algorithms are important when used in the context of production optimization. While concepts like Petri nets are inherently interpretable, they are not straightforwardly learnable. This paper presents a novel approach to transform the Petri net model into a learnable entity. This is accomplished by establishing a relationship between the Petri net description in the event domain, its representation in the max-plus algebra, and a one-layer perceptron neural network. This allows us to apply standard supervised learning methods adapted to the max-plus domain to infer the parameters of the Petri net. To this end, the feed-forward and back-propagation paths are modified to accommodate the differing mathematical operations in the context of max-plus algebra. We apply our approach to a multi-robot handling system with potentially varying processing and operation times. The results show that essential timing parameters can be inferred from data with high precision.
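In max-plus algebra, "addition" is the maximum and "multiplication" is ordinary addition, so a matrix-vector product has the same shape as a one-layer perceptron with those two operations swapped in. The sketch below shows only that forward pass, under the usual convention that a missing connection is encoded as negative infinity; it is not the authors' network, and the matrix values and function name are invented for illustration.

```python
import math

NEG_INF = -math.inf  # max-plus "zero": marks a missing connection


def maxplus_matvec(A, x):
    """Max-plus product: result[i] = max over j of (A[i][j] + x[j])."""
    return [max(a_ij + x_j for a_ij, x_j in zip(row, x)) for row in A]


# Hypothetical 2x2 system: entry A[i][j] is the delay from event j to event i.
A = [
    [2, NEG_INF],
    [3, 4],
]
x0 = [0, 1]                 # times of the first firings
x1 = maxplus_matvec(A, x0)  # times of the next firings
x2 = maxplus_matvec(A, x1)
print(x1, x2)
```

In this reading, the entries of `A` play the role of the weights that a learning procedure would have to infer from observed event times.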

Open Access Article
Dynamic Graph Analysis: A Hybrid Structural–Spatial Approach for Brain Shape Correspondence
by
Jonnatan Arias-García, Hernán Felipe García, Andrés Escobar-Mejía, David Cárdenas-Peña and Álvaro A. Orozco
Mach. Learn. Knowl. Extr. 2025, 7(3), 99; https://doi.org/10.3390/make7030099 - 10 Sep 2025
Abstract
Accurate correspondence of complex neuroanatomical surfaces under non-rigid deformations remains a formidable challenge in computational neuroimaging, owing to inter-subject topological variability, partial occlusions, and non-isometric distortions. Here, we introduce the Dynamic Graph Analyzer (DGA), a unified hybrid framework that integrates simplified structural descriptors with spatial constraints and formulates matching as a global linear assignment. Structurally, the DGA computes node-level metrics, degree weighted by betweenness centrality and local clustering coefficients, to capture essential topological patterns at a low computational cost. Spatially, it employs a two-stage scheme that combines global maximum distances and local rescaling of adjacent node separations to preserve geometric fidelity. By embedding these complementary measures into a single cost matrix solved via the Kuhn–Munkres algorithm followed by a refinement of weak correspondences, the DGA ensures a globally optimal correspondence. In benchmark evaluations on the FAUST dataset, the DGA achieved a significant reduction in the mean geodetic reconstruction error compared to spectral graph convolutional networks (GCNs), which learn optimized spectral descriptors akin to classical approaches like heat/wave kernel signatures (HKS/WKS), and to traditional spectral methods. Additional experiments demonstrate robust performance on partial matches in TOSCA and cross-species alignments in SHREC-20, validating resilience to morphological variation and symmetry ambiguities. These results establish the DGA as a scalable and accurate approach for brain shape correspondence, with promising applications in biomarker mapping, developmental studies, and clinical morphometry.
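To make the "single cost matrix" and "global linear assignment" steps concrete, here is a small sketch. It blends a structural and a spatial cost matrix with a hypothetical weight `alpha` and then finds the minimum-cost one-to-one matching by brute force over permutations. The brute-force search is a stand-in that returns the same optimum the Kuhn–Munkres algorithm would on a square matrix, but it is only practical for a handful of nodes; the blending rule, names, and numbers are assumptions, not the DGA implementation.

```python
from itertools import permutations


def combine_costs(structural, spatial, alpha=0.5):
    """Blend two equally sized cost matrices (alpha is a hypothetical weight)."""
    return [
        [alpha * s + (1 - alpha) * d for s, d in zip(s_row, d_row)]
        for s_row, d_row in zip(structural, spatial)
    ]


def solve_assignment(cost):
    """Return (matching, total) minimizing the summed cost of a square matrix.

    matching[i] is the column assigned to row i. Exhaustive search, O(n!).
    """
    n = len(cost)
    best_perm, best_total = None, float("inf")
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_total:
            best_perm, best_total = perm, total
    return list(best_perm), best_total


# Hypothetical 3-node example: rows are source nodes, columns are target nodes.
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
print(solve_assignment(cost))
```

For graphs of realistic size, a polynomial-time assignment solver would replace the exhaustive loop.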

Topics
Topic in
Cancers, IJERPH, IJGI, MAKE, Smart Cities
The Use of Big Data in Public Health Research and Practice
Topic Editors: Quynh C. Nguyen, Thu T. Nguyen
Deadline: 31 December 2025
Topic in
Applied Sciences, Computers, Entropy, Information, MAKE, Systems
Opportunities and Challenges in Explainable Artificial Intelligence (XAI)
Topic Editors: Luca Longo, Mario Brcic, Sebastian Lapuschkin
Deadline: 31 January 2026
Topic in
Applied Sciences, Electronics, J. Imaging, MAKE, Information, BDCC, Signals
Applications of Image and Video Processing in Medical Imaging
Topic Editors: Jyh-Cheng Chen, Kuangyu Shi
Deadline: 30 April 2026
Topic in
Applied Sciences, ASI, Blockchains, Computers, MAKE, Software
Recent Advances in AI-Enhanced Software Engineering and Web Services
Topic Editors: Hai Wang, Zhe Hou
Deadline: 31 May 2026

Special Issues
Special Issue in
MAKE
Deep Learning in Image Analysis and Pattern Recognition, 2nd Edition
Guest Editors: Xianzhi Wang, Guoqing Chao
Deadline: 26 November 2025
Special Issue in
MAKE
Language Acquisition and Understanding
Guest Editors: Michal Ptaszynski, Rafal Rzepka, Masaharu Yoshioka
Deadline: 15 July 2026
Topical Collections
Topical Collection in
MAKE
Extravaganza Feature Papers on Hot Topics in Machine Learning and Knowledge Extraction
Collection Editor: Andreas Holzinger