Mach. Learn. Knowl. Extr., Volume 7, Issue 4 (December 2025) – 34 articles

  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive the table of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view the papers in PDF format, click on the "PDF Full-text" link, and use the free Adobe Reader to open them.
24 pages, 1123 KB  
Article
Democratizing Machine Learning: A Practical Comparison of Low-Code and No-Code Platforms
by Luis Giraldo and Sergio Laso
Mach. Learn. Knowl. Extr. 2025, 7(4), 141; https://doi.org/10.3390/make7040141 (registering DOI) - 7 Nov 2025
Abstract
The growing use of machine learning (ML) and artificial intelligence across sectors has shown strong potential to improve decision-making processes. However, the adoption of ML by non-technical professionals remains limited due to the complexity of traditional development workflows, which often require software engineering and data science expertise. In recent years, low-code and no-code (LC/NC) platforms have emerged as promising solutions to democratize ML by abstracting many of the technical tasks typically involved in software engineering pipelines. This paper investigates whether these platforms can offer a viable alternative for making ML accessible to non-expert users. Beyond predictive performance, this study also evaluates usability, setup complexity, the transparency of automated workflows, and cost management under realistic “out-of-the-box” conditions. This multidimensional perspective provides insights into the practical viability of LC/NC tools in real-world contexts. The comparative evaluation was conducted using three leading cloud-based tools: Amazon SageMaker Canvas, Google Cloud Vertex AI, and Azure Machine Learning Studio. These tools employ ensemble-based learning algorithms such as Gradient Boosted Trees, XGBoost, and Random Forests. Unlike traditional ML workflows that require extensive software engineering knowledge and manual optimization, these platforms enable domain experts to build predictive models through visual interfaces. The findings show that all platforms achieved high accuracy, with consistent identification of key features. Google Cloud Vertex AI was the most user-friendly, SageMaker Canvas offered a highly visual interface with some setup complexity, and Azure Machine Learning delivered the best model performance with a steeper learning curve. Cost transparency also varied considerably, with Google Cloud and Azure providing clearer safeguards against unexpected charges compared to SageMaker Canvas. Full article

41 pages, 21444 KB  
Article
Towards Explainable Machine Learning from Remote Sensing to Medical Images—Merging Medical and Environmental Data into Public Health Knowledge Maps
by Liviu Bilteanu, Corneliu Octavian Dumitru, Andreea Dumachi, Florin Alexandrescu, Radu Popa, Octavian Buiu and Andreea Iren Serban
Mach. Learn. Knowl. Extr. 2025, 7(4), 140; https://doi.org/10.3390/make7040140 - 6 Nov 2025
Abstract
Both the remote sensing and medical fields have benefited greatly from machine learning methods originally developed for computer vision and multimedia. We investigate the applicability of the same data-mining-based machine learning (ML) techniques for exploring the structure of both Earth observation (EO) and medical image data. We use the Support Vector Machine (SVM) as an explainable active learning tool to discover the semantic relations between EO image content classes, and we extend this technique to medical images of various types. The EO image dataset was acquired by multispectral and radar sensors (WorldView-2, Sentinel-2, TerraSAR-X, Sentinel-1, RADARSAT-2, and Gaofen-3) from four different urban areas. In addition, medical images were acquired by camera, microscope, and computed tomography (CT). The methodology has been tested by several experts, and the semantic classification results were checked either by comparing them with reference data or through feedback from these experts in the field. The accuracy of the results amounts to 95% for the satellite images and 85% for the medical images. This study opens the pathway to correlating the information extracted from EO images (e.g., quality-of-life-related environmental data) with that extracted from medical images (e.g., medical imaging disease phenotypes) to obtain geographically refined results in epidemiology. Full article
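As an illustration of the active learning loop described above, the sketch below shows uncertainty sampling with an SVM in scikit-learn. It is a minimal, hypothetical example, not the authors' code: the y_oracle array stands in for the expert feedback used in the paper.

    import numpy as np
    from sklearn.svm import SVC

    def uncertainty_sampling_loop(X_pool, y_oracle, n_init=20, n_rounds=10, batch=10):
        """Toy SVM active learning: query the least-confident pool samples."""
        rng = np.random.default_rng(0)
        labeled = list(rng.choice(len(X_pool), size=n_init, replace=False))
        for _ in range(n_rounds):
            clf = SVC(kernel="rbf", gamma="scale").fit(X_pool[labeled], y_oracle[labeled])
            margins = np.abs(clf.decision_function(X_pool))  # small margin = uncertain
            if margins.ndim > 1:                             # multi-class: use min margin
                margins = margins.min(axis=1)
            candidates = np.argsort(margins)                 # most uncertain first
            lab = set(labeled)
            new = [i for i in candidates if i not in lab][:batch]
            labeled.extend(new)                              # the expert labels these
        return clf, labeled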

44 pages, 7519 KB  
Article
Cover Tree-Optimized Spectral Clustering: Efficient Nearest Neighbor Search for Large-Scale Data Partitioning
by Abderrafik Laakel Hemdanou, Youssef Achtoun, Sara Mouali, Mohammed Lamarti Sefian, Vesna Šešum Čavić and Stojan Radenović
Mach. Learn. Knowl. Extr. 2025, 7(4), 139; https://doi.org/10.3390/make7040139 - 5 Nov 2025
Abstract
Spectral clustering has established itself as a powerful technique for data partitioning across various domains due to its ability to handle complex cluster structures. However, its computational efficiency remains a challenge, especially with large datasets. In this paper, we propose an enhancement of spectral clustering that integrates the Cover tree data structure to optimize nearest neighbor search, a crucial step in the construction of similarity graphs. Cover trees are a type of spatial tree that allows efficient exact nearest neighbor queries in high-dimensional spaces. By embedding this technique into the spectral clustering framework, we achieve significant reductions in computational cost while maintaining clustering accuracy. Through extensive experiments on random, synthetic, and real-world datasets, we demonstrate that our approach outperforms traditional spectral clustering methods in terms of scalability and execution speed, without compromising the quality of the resulting clusters. This work enables more efficient use of spectral clustering in big data applications. Full article
(This article belongs to the Section Data)
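The computational bottleneck the paper targets, building the kNN similarity graph, can be sketched as follows. scikit-learn ships BallTree/KDTree rather than a cover tree, so this illustrative snippet uses a BallTree as a stand-in for the paper's cover tree; the surrounding spectral clustering pipeline is the same.

    from sklearn.neighbors import NearestNeighbors
    from sklearn.cluster import SpectralClustering

    def tree_accelerated_spectral(X, n_clusters=3, k=10):
        # Build a sparse kNN graph via a spatial tree
        # (BallTree here; the paper uses a cover tree for exact queries).
        nbrs = NearestNeighbors(n_neighbors=k, algorithm="ball_tree").fit(X)
        A = nbrs.kneighbors_graph(mode="connectivity")
        A = 0.5 * (A + A.T)  # symmetrize the affinity matrix
        sc = SpectralClustering(n_clusters=n_clusters, affinity="precomputed",
                                assign_labels="kmeans", random_state=0)
        return sc.fit_predict(A.toarray())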

34 pages, 1102 KB  
Article
Personalized Course Recommendations Leveraging Machine and Transfer Learning Toward Improved Student Outcomes
by Shrooq Algarni and Frederick T. Sheldon
Mach. Learn. Knowl. Extr. 2025, 7(4), 138; https://doi.org/10.3390/make7040138 - 5 Nov 2025
Abstract
University advising at matriculation must operate under strict information constraints, typically without any post-enrolment interaction history. We present a unified, leakage-free pipeline for predicting early dropout risk and generating cold-start programme recommendations from pre-enrolment signals alone, with an optional early-warning variant incorporating first-term academic aggregates. The approach instantiates lightweight multimodal architectures: tabular RNNs, DistilBERT encoders for compact profile sentences, and a cross-attention fusion module evaluated end-to-end on a public benchmark (UCI id 697; n = 3630 students across 17 programmes). For dropout, fusing text with numerics yields the strongest thresholded performance (Hybrid RNN–DistilBERT: F1-score ≈ 0.9161, MCC ≈ 0.7750), and simple ensembling modestly improves threshold-free discrimination (Area Under Receiver Operating Characteristic Curve (AUROC) up to ≈0.9488). A text-only branch markedly underperforms, indicating that numeric demographics and early curricular aggregates carry the dominant signal at this horizon. For programme recommendation, pre-enrolment demographics alone support actionable rankings (Demographic Multi-Layer Perceptron (MLP): Normalized Discounted Cumulative Gain @ 10 (NDCG@10) ≈ 0.5793, Top-10 ≈ 0.9380, exceeding a popularity prior by 25.27 percentage points in NDCG@10); adding text offers marginal gains in hit rate but not in NDCG on this cohort. Methodologically, we enforce leakage guards, deterministic preprocessing, stratified splits, and comprehensive metrics, enabling reproducibility on non-proprietary data. Practically, the pipeline supports orientation-time triage (high-recall early warning) and shortlist generation for programme selection. The results position matriculation-time advising as a joint prediction–recommendation problem solvable with carefully engineered pre-enrolment views and lightweight multimodal models, without reliance on historical interactions. Full article
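For readers unfamiliar with the ranking metric quoted above, NDCG@10 can be computed from a ranked recommendation list as follows (a standard textbook definition, not the authors' evaluation code):

    import numpy as np

    def ndcg_at_k(ranked_relevances, k=10):
        """ranked_relevances: relevance of each recommended item, in ranked order
        (e.g., 1 if the recommended programme is the one the student enrolled in)."""
        rel = np.asarray(ranked_relevances, dtype=float)[:k]
        discounts = 1.0 / np.log2(np.arange(2, rel.size + 2))
        dcg = float((rel * discounts).sum())
        ideal = np.sort(rel)[::-1]                # best possible ordering
        idcg = float((ideal * discounts).sum())
        return dcg / idcg if idcg > 0 else 0.0

    # Example: the correct programme ranked 3rd out of 10 gives NDCG@10 = 0.5
    print(ndcg_at_k([0, 0, 1, 0, 0, 0, 0, 0, 0, 0]))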

29 pages, 5549 KB  
Article
A Graph-Structured, Physics-Informed DeepONet Neural Network for Complex Structural Analysis
by Guangya Zhang, Tie Xu, Jinli Xu and Hu Wang
Mach. Learn. Knowl. Extr. 2025, 7(4), 137; https://doi.org/10.3390/make7040137 - 4 Nov 2025
Viewed by 151
Abstract
This study introduces the Graph-Structured Physics-Informed DeepONet (GS-PI-DeepONet), a novel neural network framework designed to address the challenges of solving parametric Partial Differential Equations (PDEs) in structural analysis, particularly for problems with complex geometries and dynamic boundary conditions. By integrating Graph Neural Networks (GNNs), Deep Operator Networks (DeepONets), and Physics-Informed Neural Networks (PINNs), the proposed method employs graph-structured representations to model unstructured Finite Element (FE) meshes. In this framework, nodes encode physical quantities such as displacements and loads, while edges represent geometric or topological relationships. The framework embeds PDE constraints as soft penalties within the loss function, ensuring adherence to physical laws while reducing reliance on large datasets. Extensive experiments have demonstrated the GS-PI-DeepONet’s superiority over traditional Finite Element Methods (FEMs) and standard DeepONets. For benchmark problems, including cantilever beam bending and Hertz contact, the model achieves high accuracy. In practical applications, such as stiffness analysis of a recliner mechanism and strength analysis of a support bracket, the framework achieves a 7–8× speed-up compared to FEMs while maintaining fidelity comparable to FEM, with R² values reaching up to 0.9999 for displacement fields. Consequently, the GS-PI-DeepONet offers a resolution-independent, data-efficient, and physics-consistent approach for real-time simulations, making it ideal for rapid parameter sweeps and design optimizations in engineering applications. Full article
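The idea of embedding PDE constraints as soft penalties in the loss is easiest to see in a toy one-dimensional setting. The PyTorch sketch below is illustrative only and is not the GS-PI-DeepONet; it penalizes the residual of u''(x) = f(x) alongside a data-fitting term.

    import torch

    net = torch.nn.Sequential(torch.nn.Linear(1, 64), torch.nn.Tanh(),
                              torch.nn.Linear(64, 1))

    def physics_informed_loss(x_data, u_data, x_col, f, lam=1.0):
        # Data term: match observed displacements
        loss_data = torch.mean((net(x_data) - u_data) ** 2)
        # Physics term: residual of u''(x) = f(x) at collocation points
        x = x_col.clone().requires_grad_(True)
        u = net(x)
        du = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
        d2u = torch.autograd.grad(du, x, torch.ones_like(du), create_graph=True)[0]
        loss_pde = torch.mean((d2u - f(x)) ** 2)
        return loss_data + lam * loss_pde  # soft penalty in the total loss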

20 pages, 2382 KB  
Article
Explainable Deep Learning for Neonatal Jaundice Classification Using Uncalibrated Smartphone Images
by Ashim Chakraborty, Yeshwanth Thota, Cristina Luca and Ian van der Linde
Mach. Learn. Knowl. Extr. 2025, 7(4), 136; https://doi.org/10.3390/make7040136 - 4 Nov 2025
Viewed by 193
Abstract
Hyperbilirubinemia, commonly known as jaundice, is a prevalent condition in newborns, primarily arising from alterations in red blood cell metabolism during the first week of life. While conventional diagnostic methods, such as serum analysis and transcutaneous bilirubinometry, are effective, there remains a critical need for robust, non-invasive, image-based diagnostic tools. In this study, we propose a custom-designed convolutional neural network for classifying jaundice in neonatal images. Image preprocessing and segmentation techniques were systematically evaluated. The optimal workflow, which incorporated contrast enhancement and the extraction of regular skin patches of 144 × 144 pixels from regions of interest segmented using the Segment Anything Model, achieved a testing F1-score of 0.80. Beyond performance, this study addresses shortcomings in the existing literature relating to trust, replicability, and transparency. To this end, we employ fair performance metrics that are more robust to class imbalance, adopt a transparent workflow, share our source code, and use Gradient-weighted Class Activation Mapping to visualise and quantify the image regions that influence the classifier’s predictions in pursuit of epistemic justification. Full article
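Gradient-weighted Class Activation Mapping, used above to visualise influential image regions, weights the last convolutional feature maps by the spatial mean of their gradients with respect to the target class score. A minimal PyTorch sketch for a generic CNN (not the authors' custom network):

    import torch
    import torch.nn.functional as F

    def grad_cam(model, conv_layer, image, class_idx):
        """Return a heatmap over `image` for `class_idx` (Grad-CAM)."""
        acts, grads = {}, {}
        h1 = conv_layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
        h2 = conv_layer.register_full_backward_hook(lambda m, gi, go: grads.update(g=go[0]))
        score = model(image.unsqueeze(0))[0, class_idx]
        model.zero_grad()
        score.backward()
        h1.remove(); h2.remove()
        weights = grads["g"].mean(dim=(2, 3), keepdim=True)   # global-average gradients
        cam = F.relu((weights * acts["a"]).sum(dim=1))        # weighted sum of maps
        cam = cam / (cam.max() + 1e-8)                        # normalise to [0, 1]
        return F.interpolate(cam.unsqueeze(1), size=image.shape[-2:],
                             mode="bilinear")[0, 0]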

27 pages, 2139 KB  
Article
Generalisation Bounds of Zero-Shot Economic Forecasting Using Time Series Foundation Models
by Jittarin Jetwiriyanon, Teo Susnjak and Surangika Ranathunga
Mach. Learn. Knowl. Extr. 2025, 7(4), 135; https://doi.org/10.3390/make7040135 - 3 Nov 2025
Viewed by 501
Abstract
This study investigates the transfer learning capabilities of Time-Series Foundation Models (TSFMs) in a zero-shot setup for forecasting macroeconomic indicators. New TSFMs are continually emerging, offering significant potential to provide ready-trained, accurate forecasting models that generalise across a wide spectrum of domains. However, the transferability of their learning to many domains, especially economics, is not well understood. To that end, we study the performance profile of TSFMs for economic forecasting, bypassing the need to train bespoke econometric models on extensive training datasets. Our experiments were conducted on a univariate case study dataset, in which we rigorously back-tested three state-of-the-art TSFMs (Chronos, TimeGPT, and Moirai) under data-scarce conditions and structural breaks. Our results demonstrate that appropriately engineered TSFMs can internalise rich economic dynamics, accommodate regime shifts, and deliver well-behaved uncertainty estimates out of the box, while matching or exceeding the state-of-the-art multivariate models currently used in this domain. Our findings suggest that, without any fine-tuning or additional multivariate inputs, TSFMs can match or outperform classical models under both stable and volatile economic conditions. However, like all models, they are vulnerable to performance degradation during periods of rapid shocks, though they recover forecasting accuracy faster than classical models. The findings offer guidance to practitioners on when zero-shot deployments are viable for macroeconomic monitoring and strategic planning. Full article
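The rigorous back-testing mentioned above is commonly implemented as a rolling-origin evaluation. A library-agnostic sketch follows; forecast_fn, assumed to wrap a zero-shot call to Chronos, TimeGPT, or Moirai, is hypothetical and not shown.

    import numpy as np

    def rolling_origin_backtest(series, forecast_fn, horizon=4, min_context=60):
        """Walk forward through a 1-D series; at each origin, forecast `horizon`
        steps from history alone (zero-shot) and score against the actuals."""
        errors = []
        for t in range(min_context, len(series) - horizon):
            context = series[:t]                   # history available at the origin
            y_hat = forecast_fn(context, horizon)  # zero-shot TSFM prediction
            y_true = series[t:t + horizon]
            errors.append(np.abs(np.asarray(y_hat) - y_true).mean())  # MAE per origin
        return float(np.mean(errors))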

41 pages, 743 KB  
Article
An Overview of Large Language Models and a Novel, Large Language Model-Based Cognitive Architecture for Solving Open-Ended Problems
by Hashmath Shaik, Gnaneswar Villuri and Alex Doboli
Mach. Learn. Knowl. Extr. 2025, 7(4), 134; https://doi.org/10.3390/make7040134 - 1 Nov 2025
Viewed by 362
Abstract
Large Language Models (LLMs) offer new opportunities to devise automated implementation generation methods that can tackle problem solving beyond traditional methods, which usually require algorithmic specifications and use only static domain knowledge. LLMs can support new methods for the activities involved in tackling open-ended problems, like problem framing, exploring possible solving approaches, feature elaboration and combination, advanced implementation assessment, and handling unexpected situations. This paper presents a detailed overview of the current work on LLMs, including model prompting, retrieval-augmented generation (RAG), and reinforcement learning. It then proposes a novel, LLM-based Cognitive Architecture (CA) to generate programming code starting from verbal discussions in natural language, a particular kind of problem-solving activity. The CA uses four strategies, three top-down and one bottom-up, to elaborate, adaptively process, memorize, and learn. Experiments are devised to study the CA’s performance, e.g., convergence rate, semantic fidelity, and code correctness. Full article

31 pages, 4855 KB  
Article
Machine Learning Regressors Calibrated on Computed Data for Road Traffic Noise Prediction
by Domenico Rossi, Aurora Mascolo, Daljeet Singh and Claudio Guarnaccia
Mach. Learn. Knowl. Extr. 2025, 7(4), 133; https://doi.org/10.3390/make7040133 - 1 Nov 2025
Viewed by 242
Abstract
Noise is one of the main pollutants in urban contexts, even if it is not perceived as being as severe as other pollutants. Transportation, specifically road traffic, accounts for most urban environmental noise, and its monitoring is very important and sometimes mandated by law. Two different approaches are possible: a direct measurement campaign or a simulation approach. The so-called Road Traffic Noise Models (RTNMs) serve the latter purpose. In recent years, noise assessment has also been attempted with Machine Learning (ML) techniques: ML is particularly attractive because it remains usable in unusual road traffic conditions, such as in the presence of roundabouts, stops, and traffic lights, or, more generally, when the free-flow assumption does not hold and the classic RTNMs fail. In this contribution, a large and comprehensive study of four different ML regressors is presented. After careful hyperparameter tuning, the regressors were calibrated using two different approaches: a classic train/test split on real road traffic data, and a computed dataset. The results provide a quantitative and qualitative description of the ML regressors’ outputs, showing that calibration with computed data instead of real data can yield good simulations. Full article
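The two calibration approaches compared in the paper can be contrasted in a few lines of scikit-learn. This is a hypothetical sketch: X_real/y_real denote measured traffic features and noise levels, and X_comp/y_comp denote a dataset computed by a classic RTNM.

    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import mean_absolute_error

    def calibrate_and_compare(X_real, y_real, X_comp, y_comp):
        X_tr, X_te, y_tr, y_te = train_test_split(X_real, y_real, random_state=0)
        # Approach 1: classic train/test split on real measurements
        m_real = RandomForestRegressor(random_state=0).fit(X_tr, y_tr)
        # Approach 2: calibrate on RTNM-computed data, test on the same real hold-out
        m_comp = RandomForestRegressor(random_state=0).fit(X_comp, y_comp)
        return (mean_absolute_error(y_te, m_real.predict(X_te)),
                mean_absolute_error(y_te, m_comp.predict(X_te)))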

15 pages, 2491 KB  
Article
Federated Learning for Soil Moisture Prediction: Benchmarking Lightweight CNNs and Robustness in Distributed Agricultural IoT Networks
by Salma Zakzouk and Lobna A. Said
Mach. Learn. Knowl. Extr. 2025, 7(4), 132; https://doi.org/10.3390/make7040132 - 31 Oct 2025
Viewed by 316
Abstract
Federated learning (FL) provides a privacy-preserving approach for training machine learning models across distributed datasets; however, its deployment in environmental monitoring remains underexplored. This paper uses the WHIN dataset, comprising 144 weather stations across Indiana, to establish a benchmark for FL in soil moisture prediction. The work presents three primary contributions: the design of lightweight CNNs optimized for edge deployment, a comprehensive robustness assessment of FL under non-IID and adversarial conditions, and the development of a large-scale, reproducible agricultural FL benchmark using the WHIN network. The paper designs and evaluates lightweight (∼0.8 k parameters) and heavy (∼9.4 k parameters) convolutional neural networks (CNNs) under both centralized and federated settings, supported by ablation studies on feature importance and model architecture. Results show that lightweight CNNs achieve near-heavy CNN performance (MAE = 7.8 cbar vs. 7.6 cbar) while reducing computation and communication overhead. Beyond accuracy, this work systematically benchmarks robustness under adversarial and non-IID conditions, providing new insights for deploying federated models in agricultural IoT. Full article
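At the core of such a deployment is federated averaging: each station trains the lightweight CNN locally, and a server averages the weights, weighted by local sample counts. A minimal PyTorch sketch of standard FedAvg (the WHIN-specific details are not shown):

    import torch

    def fed_avg(client_state_dicts, client_sizes):
        """Server-side FedAvg: sample-size-weighted average of client weights."""
        total = float(sum(client_sizes))
        avg = {k: torch.zeros_like(v, dtype=torch.float32)
               for k, v in client_state_dicts[0].items()}
        for sd, n in zip(client_state_dicts, client_sizes):
            for k, v in sd.items():
                avg[k] += v.float() * (n / total)
        return avg  # load into the global model with model.load_state_dict(avg)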

15 pages, 3320 KB  
Article
Diff-KNN: Residual Correction of Baseline Wind Predictions in Urban Settings
by Dimitri Nowak, Jennifer Werner, Franziska Hunger, Tomas Johnson, Andreas Mark, Radostin Mitkov and Fredrik Edelvik
Mach. Learn. Knowl. Extr. 2025, 7(4), 131; https://doi.org/10.3390/make7040131 - 29 Oct 2025
Viewed by 325
Abstract
Accurate prediction of urban wind flow is essential for urban planning and environmental assessment. Classical computational fluid dynamics (CFD) methods are computationally expensive, while machine learning approaches often lack explainability and generalizability. To address the limitations of both approaches, we propose Diff-KNN, a hybrid method that combines coarse-scale CFD simulations with a K-Nearest Neighbors (KNN) model trained on the residuals between coarse- and fine-scale CFD results. Diff-KNN reduces velocity prediction errors by up to 83.5% compared to Pure-KNN and 56.6% compared to coarse CFD alone. Tested on the AIJE urban dataset, Diff-KNN effectively corrects flow inaccuracies near buildings and within narrow street canyons, where traditional methods struggle. This study demonstrates how residual learning can bridge physics-based and data-driven modeling for accurate and interpretable fine-scale urban wind prediction. Full article
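The residual-correction idea in Diff-KNN is compact enough to sketch directly: learn the discrepancy between fine- and coarse-scale CFD with a KNN regressor, then add the predicted correction to new coarse simulations. A hypothetical scikit-learn rendering:

    from sklearn.neighbors import KNeighborsRegressor

    def fit_diff_knn(X_train, u_coarse_train, u_fine_train, k=5):
        """Fit KNN on residuals between fine- and coarse-scale CFD velocities.
        X_train: point features (e.g., coordinates, inflow direction)."""
        residuals = u_fine_train - u_coarse_train
        return KNeighborsRegressor(n_neighbors=k, weights="distance").fit(X_train, residuals)

    def predict_corrected(knn, X_new, u_coarse_new):
        # Corrected field = coarse CFD baseline + learned residual
        return u_coarse_new + knn.predict(X_new)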

34 pages, 3325 KB  
Systematic Review
A Systematic Review of Methods and Algorithms for the Intelligent Processing of Agricultural Data Applied to Sunflower Crops
by Valentina Arustamyan, Pavel Lyakhov, Ulyana Lyakhova, Ruslan Abdulkadirov, Vyacheslav Rybin and Denis Butusov
Mach. Learn. Knowl. Extr. 2025, 7(4), 130; https://doi.org/10.3390/make7040130 - 27 Oct 2025
Viewed by 440
Abstract
Food shortages are becoming increasingly urgent due to the growing global population. Enhancing oil crop yields, particularly sunflowers, is key to ensuring food security and the sustainable provision of vegetable fats essential for human nutrition and animal feed. However, sunflower yields are often reduced by diseases, pests, and other factors. Remote sensing technologies, such as unmanned aerial vehicle (UAV) scans and satellite monitoring, combined with machine learning algorithms, provide powerful tools for monitoring crop health, diagnosing diseases, mapping fields, and forecasting yields. These technologies enhance agricultural efficiency and reduce environmental impact, supporting sustainable development in agriculture. This systematic review aims to assess the accuracy of various machine learning technologies, including classification and segmentation algorithms, convolutional neural networks, random forests, and support vector machines. These methods are applied to monitor sunflower crop conditions, diagnose diseases, and forecast yields. It provides a comprehensive analysis of current methods and their potential for precision farming applications. The review also discusses future research directions, including the development of automated systems for crop monitoring and disease diagnostics. Full article
(This article belongs to the Section Thematic Reviews)

21 pages, 3543 KB  
Article
Exploring New Horizons: fNIRS and Machine Learning in Understanding PostCOVID-19
by Antony Morales-Cervantes, Victor Herrera, Blanca Nohemí Zamora-Mendoza, Rogelio Flores-Ramírez, Aaron A. López-Cano and Edgar Guevara
Mach. Learn. Knowl. Extr. 2025, 7(4), 129; https://doi.org/10.3390/make7040129 - 24 Oct 2025
Viewed by 447
Abstract
PostCOVID-19 is a condition affecting approximately 10% of individuals infected with SARS-CoV-2, presenting significant challenges in diagnosis and clinical management. Portable neuroimaging techniques, such as functional near-infrared spectroscopy (fNIRS), offer real-time insights into cerebral hemodynamics and represent a promising tool for studying postCOVID-19 in naturalistic settings. This study investigates the integration of fNIRS with machine learning to identify neural correlates of postCOVID-19. A total of six machine learning classifiers—Random Forest, Support Vector Machine (SVM), K-Nearest Neighbors (KNNs), XGBoost, Logistic Regression, and Multi-Layer Perceptron (MLP)—were evaluated using a stratified subject-aware cross-validation scheme on a dataset comprising 29,737 time-series samples from 37 participants (9 postCOVID-19, 28 controls). Four different feature representation strategies were compared: raw time-series, PCA-based dimensionality reduction, statistical feature extraction, and a hybrid approach that combines time-series and statistical descriptors. Among these, the hybrid representation demonstrated the highest discriminative performance. The SVM classifier trained on hybrid features achieved strong discrimination (ROC-AUC = 0.909) under subject-aware CV5; at the default threshold, Sensitivity was moderate and Specificity was high, outperforming all other methods. In contrast, models trained on statistical features alone exhibited limited Sensitivity despite high Specificity. These findings highlight the importance of temporal information in the fNIRS signal and support the potential of machine learning combined with portable neuroimaging for postCOVID-19 identification. This approach may contribute to the development of non-invasive diagnostic tools to support individualized treatment and longitudinal monitoring of patients with persistent neurological symptoms. Full article
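Subject-aware cross-validation means that all samples from one participant stay in the same fold, so identity cannot leak across the 29,737 time-series samples. A scikit-learn sketch of the idea (illustrative, not the study's exact protocol):

    import numpy as np
    from sklearn.model_selection import GroupKFold
    from sklearn.svm import SVC
    from sklearn.metrics import roc_auc_score

    def subject_aware_auc(X, y, subject_ids, n_splits=5):
        aucs = []
        for tr, te in GroupKFold(n_splits=n_splits).split(X, y, groups=subject_ids):
            clf = SVC(kernel="rbf", probability=True).fit(X[tr], y[tr])
            aucs.append(roc_auc_score(y[te], clf.predict_proba(X[te])[:, 1]))
        return float(np.mean(aucs))  # no subject appears in both train and test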

17 pages, 1406 KB  
Article
Interleaved Fusion Learning for Trustworthy AI: Improving Cross-Dataset Performance in Cervical Cancer Analysis
by Carlos Martínez, Laura Busto, Olivia Zulaica and César Veiga
Mach. Learn. Knowl. Extr. 2025, 7(4), 128; https://doi.org/10.3390/make7040128 - 23 Oct 2025
Viewed by 323
Abstract
This study introduces a novel Interleaved Fusion Learning (IFL) methodology leveraging transfer learning to generate a family of models optimized for specific datasets while maintaining superior generalization performance across others. The approach is demonstrated in cervical cancer screening, where cytology image datasets present challenges of heterogeneity and imbalance. By interleaving transfer steps across dataset partitions and regulating adaptation through a dynamic learning parameter, IFL promotes both domain-specific accuracy and cross-domain robustness. To evaluate its effectiveness, complementary metrics are used to capture not only predictive accuracy but also fairness in performance distribution across datasets. Results highlight the potential of IFL to deliver reliable and unbiased models in clinical decision support. Beyond cervical cytology, the methodology is designed to be scalable to other medical imaging tasks and, more broadly, to domains requiring equitable AI solutions across multiple heterogeneous datasets. Full article

14 pages, 1361 KB  
Brief Report
A Comprehensive Study on Short-Term Oil Price Forecasting Using Econometric and Machine Learning Techniques
by Gil Cohen
Mach. Learn. Knowl. Extr. 2025, 7(4), 127; https://doi.org/10.3390/make7040127 - 23 Oct 2025
Viewed by 524
Abstract
This paper investigates the short-term predictability of daily crude oil price movements by employing a multi-method analytical framework that incorporates both econometric and machine learning techniques. Utilizing a dataset of 21 financial and commodity time series spanning ten years of trading days (2015–2024), we explore the dynamics of oil price volatility and its key determinants. In the forecasting phase, we applied seven models. The meta-learner model, which consists of three base learners (Random Forest, gradient boosting, and support vector regression), achieved the highest R² value of 0.532, providing evidence that our complex model structure can successfully outperform existing approaches. This ensemble demonstrated that the most influential predictors of next-day oil prices are VIX, OVX, and MOVE (volatility indices for equities, oil, and bonds, respectively), and lagged oil returns. The results underscore the critical role of volatility spillovers and nonlinear dependencies in forecasting oil returns and suggest future directions for integrating macroeconomic signals and advanced volatility models. Moreover, we show that combining multiple machine learning procedures into a single meta-model yields superior predictive performance. Full article
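The meta-learner described above is a standard stacking ensemble, which scikit-learn expresses directly. A sketch with the three named base learners; the hyperparameters are placeholders, not the paper's settings:

    from sklearn.ensemble import (RandomForestRegressor, GradientBoostingRegressor,
                                  StackingRegressor)
    from sklearn.svm import SVR
    from sklearn.linear_model import Ridge

    meta_model = StackingRegressor(
        estimators=[("rf", RandomForestRegressor(n_estimators=300, random_state=0)),
                    ("gb", GradientBoostingRegressor(random_state=0)),
                    ("svr", SVR(kernel="rbf", C=1.0))],
        final_estimator=Ridge(),  # the meta-learner combining base predictions
        cv=5)                     # out-of-fold predictions avoid leakage
    # meta_model.fit(X_train, y_next_day); meta_model.score(X_test, y_test) -> R^2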
(This article belongs to the Special Issue Advances in Machine and Deep Learning)

26 pages, 18977 KB  
Article
Large Language Models for Structured Task Decomposition in Reinforcement Learning Problems with Sparse Rewards
by Unai Ruiz-Gonzalez, Alain Andres and Javier Del Ser
Mach. Learn. Knowl. Extr. 2025, 7(4), 126; https://doi.org/10.3390/make7040126 - 22 Oct 2025
Viewed by 601
Abstract
Reinforcement learning (RL) agents face significant challenges in sparse-reward environments, as insufficient exploration of the state space can result in inefficient training or incomplete policy learning. To address this challenge, this work proposes a teacher–student framework for RL that leverages the inherent knowledge of large language models (LLMs) to decompose complex tasks into manageable subgoals. The capabilities of LLMs to comprehend problem structure and objectives, based on textual descriptions, can be harnessed to generate subgoals, similar to the guidance a human supervisor would provide. For this purpose, we introduce the following three subgoal types: positional, representation-based, and language-based. Moreover, we propose an LLM surrogate model to reduce computational overhead and demonstrate that the supervisor can be decoupled once the policy has been learned, further lowering computational costs. Under this framework, we evaluate the performance of three open-source LLMs (namely, Llama, DeepSeek, and Qwen). Furthermore, we assess our teacher–student framework on the MiniGrid benchmark—a collection of procedurally generated environments that demand generalization to previously unseen tasks. Experimental results indicate that our teacher–student framework facilitates more efficient learning and encourages enhanced exploration in complex tasks, resulting in faster training convergence and outperforming recent teacher–student methods designed for sparse-reward environments. Full article
(This article belongs to the Section Learning)

29 pages, 1377 KB  
Article
Classification of Obfuscation Techniques in LLVM IR: Machine Learning on Vector Representations
by Sebastian Raubitzek, Patrick Felbauer, Kevin Mallinger and Sebastian Schrittwieser
Mach. Learn. Knowl. Extr. 2025, 7(4), 125; https://doi.org/10.3390/make7040125 - 22 Oct 2025
Viewed by 441
Abstract
We present a novel methodology for classifying code obfuscation techniques in LLVM IR program embeddings. We apply isolated and layered code obfuscations to C source code using the Tigress obfuscator, compile them to LLVM IR, and convert each IR code representation into a numerical embedding (vector representation) that captures intrinsic characteristics of the applied obfuscations. We then use two modern boosting classifiers to identify which obfuscation, or layering of obfuscations, was used on the source code from the vector representation. To better analyze classifier behavior and error propagation, we employ a staged, cascading experimental design that separates the task into multiple decision levels, including obfuscation detection, single-versus-layered discrimination, and detailed technique classification. This structured evaluation allows a fine-grained view of classification uncertainty and model robustness across the inference stages. We achieve an overall accuracy of more than 90% in identifying the types of obfuscations. Our experiments show high classification accuracy for most obfuscations, including layered obfuscations, and even perfect scores for certain transformations, indicating that a vector representation of IR code preserves distinguishing features of the protections. In this article, we detail the workflow for applying obfuscations, generating embeddings, and training the model, and we discuss challenges such as obfuscation patterns covered by other obfuscations in layered protection scenarios. Full article
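The staged, cascading design can be sketched as three classifiers applied in sequence to the same IR embedding. Hypothetical scikit-learn code; GradientBoostingClassifier stands in for the boosting classifiers used in the paper:

    from sklearn.ensemble import GradientBoostingClassifier

    def fit_cascade(X, y_is_obf, y_is_layered, y_technique):
        s1 = GradientBoostingClassifier().fit(X, y_is_obf)              # obfuscated at all?
        m = y_is_obf == 1
        s2 = GradientBoostingClassifier().fit(X[m], y_is_layered[m])    # single vs. layered
        s3 = GradientBoostingClassifier().fit(X[m], y_technique[m])     # which technique(s)
        return s1, s2, s3

    def predict_cascade(stages, x):
        s1, s2, s3 = stages
        if s1.predict(x)[0] == 0:
            return "not obfuscated"
        layering = "layered" if s2.predict(x)[0] == 1 else "single"
        return layering, s3.predict(x)[0]  # errors propagate from stage to stage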

18 pages, 11753 KB  
Article
SemiSeg-CAW: Semi-Supervised Segmentation of Ultrasound Images by Leveraging Class-Level Information and an Adaptive Multi-Loss Function
by Somayeh Barzegar and Naimul Khan
Mach. Learn. Knowl. Extr. 2025, 7(4), 124; https://doi.org/10.3390/make7040124 - 20 Oct 2025
Viewed by 420
Abstract
The limited availability of pixel-level annotated medical images complicates training supervised segmentation models, as these models require large datasets. To deal with this issue, SemiSeg-CAW, a semi-supervised segmentation framework that leverages class-level information and an adaptive multi-loss function, is proposed to reduce dependency on extensive annotations. The model combines segmentation and classification tasks in a multitask architecture that includes segmentation, classification, weight generation, and ClassElevateSeg modules. In this framework, the ClassElevateSeg module is initially pre-trained and then fine-tuned jointly with the main model to produce auxiliary feature maps that support the main model, while the adaptive weighting strategy computes a dynamic combination of classification and segmentation losses using trainable weights. The proposed approach enables effective use of both labeled and unlabeled images with class-level information by compensating for the shortage of pixel-level labels. Experimental evaluation on two public ultrasound datasets demonstrates that SemiSeg-CAW consistently outperforms fully supervised segmentation models when trained with equal or fewer labeled samples. The results suggest that incorporating class-level information with adaptive loss weighting provides an effective strategy for semi-supervised medical image segmentation and can improve the segmentation performance in situations with limited annotations. Full article
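One common way to realize trainable loss weights over a segmentation loss and a classification loss is homoscedastic-uncertainty weighting (Kendall et al.); the paper's exact scheme may differ, so the PyTorch sketch below is illustrative only.

    import torch

    class AdaptiveMultiLoss(torch.nn.Module):
        def __init__(self):
            super().__init__()
            # log-variances: trainable, updated by backprop with the model weights
            self.log_var_seg = torch.nn.Parameter(torch.zeros(1))
            self.log_var_cls = torch.nn.Parameter(torch.zeros(1))

        def forward(self, loss_seg, loss_cls):
            w_seg = torch.exp(-self.log_var_seg)
            w_cls = torch.exp(-self.log_var_cls)
            # the additive log-variance terms stop the weights collapsing to zero
            return (w_seg * loss_seg + self.log_var_seg
                    + w_cls * loss_cls + self.log_var_cls)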

27 pages, 3749 KB  
Article
A Lightweight Deep Learning Model for Tea Leaf Disease Identification
by Bo-Yu Lien and Chih-Chin Lai
Mach. Learn. Knowl. Extr. 2025, 7(4), 123; https://doi.org/10.3390/make7040123 - 19 Oct 2025
Viewed by 500
Abstract
Tea is a globally important economic crop, and the ability to quickly and accurately identify tea leaf diseases can significantly improve both the yield and quality of tea production. With advances in deep learning, many recent studies have demonstrated that convolutional neural networks are both feasible and effective for identifying tea leaf diseases. In this paper, we propose a modified EfficientNetB0 lightweight convolutional neural network, enhanced with the Efficient Channel Attention (ECA) module, to reliably identify various tea leaf diseases. We used two tea leaf disease datasets from the Kaggle platform: the Tea_Leaf_Disease dataset, which contains six categories, and the teaLeafBD dataset, which includes seven categories. Experimental results show that our method substantially reduces computational costs, the number of parameters, and overall model size. Additionally, it achieves accuracies of 99.49% and 90.73% on these widely used datasets, making it highly suitable for practical deployment on resource-constrained edge devices. Full article
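The ECA module itself is only a few lines: a channel-wise global average pool followed by a one-dimensional convolution across channels. A standard PyTorch rendering (how it is wired into EfficientNetB0 is not shown here):

    import torch
    import torch.nn as nn

    class ECA(nn.Module):
        """Efficient Channel Attention: local cross-channel interaction, ~k params."""
        def __init__(self, k_size=3):
            super().__init__()
            self.pool = nn.AdaptiveAvgPool2d(1)
            self.conv = nn.Conv1d(1, 1, kernel_size=k_size,
                                  padding=k_size // 2, bias=False)
            self.sigmoid = nn.Sigmoid()

        def forward(self, x):                                   # x: (B, C, H, W)
            y = self.pool(x)                                    # (B, C, 1, 1)
            y = self.conv(y.squeeze(-1).transpose(-1, -2))      # conv across channels
            y = self.sigmoid(y.transpose(-1, -2).unsqueeze(-1))
            return x * y                                        # reweight channels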

20 pages, 777 KB  
Article
Behind the Algorithm: International Insights into Data-Driven AI Model Development
by Limor Ziv and Maayan Nakash
Mach. Learn. Knowl. Extr. 2025, 7(4), 122; https://doi.org/10.3390/make7040122 - 17 Oct 2025
Viewed by 570
Abstract
Artificial intelligence (AI) is increasingly embedded within organizational infrastructures, yet the foundational role of data in shaping AI outcomes remains underexplored. This study positions data at the center of complexity, uncertainty, and strategic decision-making in AI development, aligning with the emerging paradigm of data-centric AI (DCAI). Based on in-depth interviews with 74 senior AI and data professionals, the research examines how experts conceptualize and operationalize data throughout the AI lifecycle. A thematic analysis reveals five interconnected domains reflecting sociotechnical and organizational challenges—such as data quality, governance, contextualization, and alignment with business objectives. The study proposes a conceptual model depicting data as a dynamic infrastructure underpinning all AI phases, from collection to deployment and monitoring. Findings indicate that data-related issues, more than model sophistication, are the primary bottlenecks undermining system reliability, fairness, and accountability. Practically, this research advocates for increased investment in the development of intelligent systems designed to ensure high-quality data management. Theoretically, it reframes data as a site of labor and negotiation, challenging dominant model-centric narratives. By integrating empirical insights with normative concerns, this study contributes to the design of more trustworthy and ethically grounded AI systems within the DCAI framework. Full article

5 pages, 338 KB  
Brief Report
Small or Large? Zero-Shot or Finetuned? Guiding Language Model Choice for Specialized Applications in Healthcare
by Lovedeep Gondara, Jonathan Simkin, Graham Sayle, Shebnum Devji, Gregory Arbour and Raymond Ng
Mach. Learn. Knowl. Extr. 2025, 7(4), 121; https://doi.org/10.3390/make7040121 - 17 Oct 2025
Viewed by 448
Abstract
Objectives: To guide language model (LM) selection by comparing finetuning vs. zero-shot use, generic pretraining vs. domain-adjacent vs. further domain-specific pretraining, and bidirectional language models (BiLMs) such as BERT vs. unidirectional LMs (LLMs) for clinical classification. Materials and Methods: We evaluated BiLMs (RoBERTa, PathologyBERT, Gatortron) and an LLM (Mistral NeMo Instruct 12B) on three British Columbia Cancer Registry (BCCR) pathology classification tasks varying in difficulty/data size. We assessed zero-shot vs. finetuned BiLMs, zero-shot LLM, and further BCCR-specific pretraining using macro-average F1 scores. Results: Finetuned BiLMs outperformed zero-shot BiLMs and zero-shot LLM. The zero-shot LLM outperformed zero-shot BiLMs but was consistently outperformed by finetuned BiLMs. Domain-adjacent BiLMs generally outperformed generic BiLMs after finetuning. Further domain-specific pretraining boosted complex/low-data task performance, with otherwise modest gains. Conclusions: For specialized classification, finetuning BiLMs is crucial, often surpassing zero-shot LLMs. Domain-adjacent pretrained models are recommended. Further domain-specific pretraining provides significant performance boosts, especially for complex/low-data scenarios. BiLMs remain relevant, offering strong performance/resource balance for targeted clinical tasks. Full article
26 pages, 3454 KB  
Article
Hybrid Deep Learning Approaches for Accurate Electricity Price Forecasting: A Day-Ahead US Energy Market Analysis with Renewable Energy
by Md. Saifur Rahman and Hassan Reza
Mach. Learn. Knowl. Extr. 2025, 7(4), 120; https://doi.org/10.3390/make7040120 - 15 Oct 2025
Viewed by 1169
Abstract
Forecasting day-ahead electricity prices is a crucial research area. Both wholesale and retail sectors highly value improved forecast accuracy. Renewable energy sources have grown more influential and effective in the US power market. However, current forecasting models have shortcomings, including inadequate consideration of renewable energy impacts and insufficient feature selection. Many studies lack reproducibility, clear presentation of input features, and proper integration of renewable resources. This study addresses these gaps by incorporating a comprehensive set of input features engineered to capture complex market dynamics. The model’s unique aspect is its inclusion of renewable-related inputs, such as temperature data for solar energy effects and wind speed for wind energy impacts on US electricity prices. The research also employs data preprocessing techniques like windowing, cleaning, normalization, and feature engineering to enhance input data quality and relevance. We developed four advanced hybrid deep learning models to improve electricity price prediction accuracy and reliability. Our approach combines variational mode decomposition (VMD) with four deep learning (DL) architectures: dense neural networks (DNNs), convolutional neural networks (CNNs), long short-term memory (LSTM) networks, and bidirectional LSTM (BiLSTM) networks. This integration aims to capture complex patterns and time-dependent relationships in electricity price data. Among these, the VMD-BiLSTM model consistently outperformed the others across all window implementations. Using 24 input features, this model achieved a remarkably low mean absolute error of 0.2733 when forecasting prices in the MISO market. Our research advances electricity price forecasting, particularly for the US energy market. These hybrid deep neural network models provide valuable tools and insights for market participants, energy traders, and policymakers. Full article
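The forecasting half of such a VMD-BiLSTM hybrid can be sketched in PyTorch as below. The decomposition step is assumed to come from a VMD implementation such as the vmdpy package; that tooling choice is an assumption, not the authors' stated setup.

    import torch
    import torch.nn as nn

    class BiLSTMForecaster(nn.Module):
        def __init__(self, n_modes, hidden=64):
            super().__init__()
            # input at each step: the K VMD modes (plus any engineered features)
            self.lstm = nn.LSTM(n_modes, hidden, batch_first=True, bidirectional=True)
            self.head = nn.Linear(2 * hidden, 1)   # 2x: forward + backward states

        def forward(self, x):                       # x: (batch, window, n_modes)
            out, _ = self.lstm(x)
            return self.head(out[:, -1, :])         # next-day price from last step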

15 pages, 2232 KB  
Article
Image-Based Deep Learning for Brain Tumour Transcriptomics: A Benchmark of DeepInsight, Fotomics, and Saliency-Guided CNNs
by Ali Alyatimi, Vera Chung, Muhammad Atif Iqbal and Ali Anaissi
Mach. Learn. Knowl. Extr. 2025, 7(4), 119; https://doi.org/10.3390/make7040119 - 15 Oct 2025
Viewed by 444
Abstract
Classifying brain tumour transcriptomic data is crucial for precision medicine but remains challenging due to high dimensionality and limited interpretability of conventional models. This study benchmarks three image-based deep learning approaches, DeepInsight, Fotomics, and a novel saliency-guided convolutional neural network (CNN), for transcriptomic classification. DeepInsight utilises dimensionality reduction to spatially arrange gene features, while Fotomics applies Fourier transforms to encode expression patterns into structured images. The proposed method transforms each single-cell gene expression profile into an RGB image using PCA, UMAP, or t-SNE, enabling CNNs such as ResNet to learn spatially organised molecular features. Gradient-based saliency maps are employed to highlight gene regions most influential in model predictions. Evaluation is conducted on two biologically and technologically different datasets: single-cell RNA-seq from glioblastoma GSM3828672 and bulk microarray data from medulloblastoma GSE85217. Outcomes demonstrate that image-based deep learning methods, particularly those incorporating saliency guidance, provide a robust and interpretable framework for uncovering biologically meaningful patterns in complex high-dimensional omics data. For instance, ResNet-18 achieved the highest accuracies of 97.25% on the GSE85217 dataset and 91.02% on GSM3828672, outperforming other baseline models across multiple metrics. Full article
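The expression-to-image transform underpinning these approaches can be sketched simply: embed genes into two dimensions with PCA (or UMAP/t-SNE), rasterize the coordinates onto a pixel grid, and write each sample's expression values into those pixels. An illustrative NumPy/scikit-learn version:

    import numpy as np
    from sklearn.decomposition import PCA

    def expressions_to_images(X, size=64):
        """X: (n_samples, n_genes). Returns (n_samples, size, size) images."""
        coords = PCA(n_components=2).fit_transform(X.T)   # one 2-D point per gene
        coords -= coords.min(axis=0)
        pix = (coords / np.maximum(coords.max(axis=0), 1e-9) * (size - 1)).astype(int)
        imgs = np.zeros((X.shape[0], size, size), dtype=np.float32)
        for g, (i, j) in enumerate(pix):        # genes sharing a pixel overwrite here;
            imgs[:, i, j] = X[:, g]             # real pipelines aggregate instead
        return imgs  # stack/replicate to RGB channels for a CNN such as ResNet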

16 pages, 10962 KB  
Article
Exploratory Proof-of-Concept: Predicting the Outcome of Tennis Serves Using Motion Capture and Deep Learning
by Gustav Durlind, Uriel Martinez-Hernandez and Tareq Assaf
Mach. Learn. Knowl. Extr. 2025, 7(4), 118; https://doi.org/10.3390/make7040118 - 14 Oct 2025
Viewed by 639
Abstract
Tennis serves heavily impact match outcomes, yet analysis by coaches is limited by human vision. The design of an automated tennis serve analysis system could facilitate enhanced performance analysis. As serve location and serve success are directly correlated, predicting the outcome of a serve could provide vital information for performance analysis. This article proposes a tennis serve analysis system powered by Machine Learning, which classifies the outcome of serves as “in”, “out” or “net”, and predicts the coordinate outcome of successful serves. Additionally, this work details the collection of three-dimensional spatio-temporal data on tennis serves, using marker-based optoelectronic motion capture. The classification uses a Stacked Bidirectional Long Short-Term Memory architecture, whilst a 3D Convolutional Neural Network architecture is harnessed for serve coordinate prediction. The proposed method achieves 89% accuracy for tennis serve classification, outperforming the current state-of-the-art whilst performing finer-grained classification. The results achieve an accuracy of 63% in predicting the serve coordinates, with a mean absolute error of 0.59 and a root mean squared error of 0.68, exceeding the current state of the art with a new method. The system contributes towards the long-term goal of designing a non-invasive tennis serve analysis system that functions in training and match conditions. Full article

28 pages, 3456 KB  
Article
Learning to Partition: Dynamic Deep Neural Network Model Partitioning for Edge-Assisted Low-Latency Video Analytics
by Yan Lyu, Likai Liu, Xuezhi Wang, Zhiyu Fan, Jinchen Wang and Guanyu Gao
Mach. Learn. Knowl. Extr. 2025, 7(4), 117; https://doi.org/10.3390/make7040117 - 13 Oct 2025
Viewed by 813
Abstract
In edge-assisted low-latency video analytics, a critical challenge is balancing on-device inference latency against the high bandwidth costs and network delays of offloading. Ineffectively managing this trade-off degrades performance and hinders critical applications like autonomous systems. Existing solutions often rely on static partitioning or greedy algorithms that optimize for a single frame. These myopic approaches adapt poorly to dynamic network and workload conditions, leading to high long-term costs and significant frame drops. This paper introduces a novel partitioning technique driven by a Deep Reinforcement Learning (DRL) agent on a local device that learns to dynamically partition a video analytics Deep Neural Network (DNN). The agent learns a farsighted policy to dynamically select the optimal DNN split point for each frame by observing the holistic system state. By optimizing for a cumulative long-term reward, our method significantly outperforms competitor methods, demonstrably reducing overall system cost and latency while nearly eliminating frame drops in our real-world testbed evaluation. The primary limitation is the initial offline training phase required by the DRL agent. Future work will focus on extending this dynamic partitioning framework to multi-device and multi-edge environments. Full article

22 pages, 3708 KB  
Article
Faithful Narratives from Complex Conceptual Models: Should Modelers or Large Language Models Simplify Causal Maps?
by Tyler J. Gandee and Philippe J. Giabbanelli
Mach. Learn. Knowl. Extr. 2025, 7(4), 116; https://doi.org/10.3390/make7040116 - 7 Oct 2025
Viewed by 588
Abstract
(1) Background: Comprehensive conceptual models can result in complex artifacts, consisting of many concepts that interact through multiple mechanisms. This complexity can be acceptable and even expected when generating rich models, for instance to support ensuing analyses that find central concepts or decompose models into parts that can be managed by different actors. However, complexity can become a barrier when the conceptual model is used directly by individuals. A ‘transparent’ model can support learning among stakeholders (e.g., in group model building) and it can motivate the adoption of specific interventions (i.e., using a model as evidence base). Although advances in graph-to-text generation with Large Language Models (LLMs) have made it possible to transform conceptual models into textual reports consisting of coherent and faithful paragraphs, turning a large conceptual model into a very lengthy report would only displace the challenge. (2) Methods: We experimentally examine the implications of two possible approaches: asking the text generator to simplify the model, either via abstractive (LLMs) or extractive summarization, or simplifying the model through graph algorithms and then generating the complete text. (3) Results: We find that the two approaches have similar scores on text-based evaluation metrics including readability and overlap scores (ROUGE, BLEU, Meteor), but faithfulness can be lower when the text generator decides on what is an interesting fact and is tasked with creating a story. These automated metrics capture textual properties, but they do not assess actual user comprehension, which would require an experimental study with human readers. (4) Conclusions: Our results suggest that graph algorithms may be preferable to support modelers in scientific translations from models to text while minimizing hallucinations. Full article

38 pages, 3764 KB  
Review
AI-Enabled IoT Intrusion Detection: Unified Conceptual Framework and Research Roadmap
by Antonio Villafranca, Kyaw Min Thant, Igor Tasic and Maria-Dolores Cano
Mach. Learn. Knowl. Extr. 2025, 7(4), 115; https://doi.org/10.3390/make7040115 - 6 Oct 2025
Viewed by 2134
Abstract
The Internet of Things (IoT) revolutionizes connectivity, enabling innovative applications across healthcare, industry, and smart cities but also introducing significant cybersecurity challenges due to its expanded attack surface. Intrusion Detection Systems (IDSs) play a pivotal role in addressing these challenges, offering tailored solutions to detect and mitigate threats in dynamic and resource-constrained IoT environments. Through a rigorous analysis, this study classifies IDS research based on methodologies, performance metrics, and application domains, providing a comprehensive synthesis of the field. Key findings reveal a paradigm shift towards integrating artificial intelligence (AI) and hybrid approaches, surpassing the limitations of traditional, static methods. These advancements highlight the potential for IDSs to enhance scalability, adaptability, and detection accuracy. However, unresolved challenges, such as resource efficiency and real-world applicability, underline the need for further research. By contextualizing these findings within the broader landscape of IoT security, this work emphasizes the critical importance of developing IDS solutions that ensure the reliability, privacy, and security of interconnected systems, contributing to the sustainable evolution of IoT ecosystems. Full article
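
To make the AI-based IDS pipeline the review discusses concrete, here is a minimal sketch on synthetic tabular flow features; the feature semantics, data, and model choice are assumptions standing in for the real IoT traffic datasets and methods surveyed.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Synthetic stand-in for tabular IoT flow features
# (e.g., packet rate, byte counts, flow duration).
rng = np.random.default_rng(42)
n = 2000
X = rng.normal(size=(n, 6))
y = (X[:, 0] + 2 * X[:, 3] + rng.normal(scale=0.5, size=n) > 1.5).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te), target_names=["benign", "attack"]))
```

In resource-constrained IoT deployments, the model size and inference cost of such a classifier become first-order concerns, which is one of the open challenges the review highlights.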

37 pages, 3463 KB  
Article
Enhancing Cancer Classification from RNA Sequencing Data Using Deep Learning and Explainable AI
by Haseeb Younis and Rosane Minghim
Mach. Learn. Knowl. Extr. 2025, 7(4), 114; https://doi.org/10.3390/make7040114 - 1 Oct 2025
Viewed by 757
Abstract
Cancer is one of the deadliest diseases, costing millions of lives and billions of USD every year. There are different ways to identify biomarkers that can be used to detect cancer types and subtypes. RNA sequencing is steadily taking the lead as the method of choice due to its ability to access global gene expression in biological samples and facilitate more flexible methods and robust analyses. Numerous studies have employed artificial intelligence (AI), and specifically machine learning techniques, to detect cancer in its early stages. However, most of the models provided are very specific to particular cancer types and do not generalize. This paper proposes a combined deep learning and explainable AI (XAI) approach to classifying cancer subtypes, and a deep learning-based approach for the classification of cancer types, using BARRA:CuRDa, an RNA-seq database with 17 datasets for seven cancer types. One architecture is designed to classify cancer subtypes with close to 100% accuracy, precision, recall, F1 score, and G-Mean, outperforming previous methodologies on all individual datasets. The second architecture is designed to classify multiple cancer types; it classifies eight types with approximately 87% validation accuracy, precision, recall, F1 score, and G-Mean. Within the same process, we employ XAI, which identifies 99 genes out of 58,735 input genes as potential biomarkers for different cancer types. We also perform Pathway Enrichment Analysis and Visual Analysis to establish the significance and robustness of our methodology. The proposed methodology classifies cancer types and subtypes with robust results and can be extended to other cancer types. Full article
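
The two-stage idea (a classifier, then XAI-driven gene ranking) can be sketched as follows. The data are synthetic, the network is far smaller than the paper's architectures, and permutation importance stands in for whatever XAI method the paper actually employs.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic expression matrix; the real study uses 58,735 input genes.
rng = np.random.default_rng(1)
n_samples, n_genes = 300, 200
X = rng.normal(size=(n_samples, n_genes))
informative = [3, 17, 42]                      # hypothetical biomarker genes
y = (X[:, informative].sum(axis=1) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
clf.fit(X_tr, y_tr)

# XAI stand-in: rank genes by how much shuffling each one hurts accuracy.
imp = permutation_importance(clf, X_te, y_te, n_repeats=10, random_state=0)
top = np.argsort(imp.importances_mean)[::-1][:5]
print("top candidate biomarker genes:", top)   # should recover 3, 17, 42
```

The same pattern, train a classifier and then attribute its decisions back to individual genes, is what lets the paper shrink tens of thousands of inputs to a short list of candidate biomarkers.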

20 pages, 14055 KB  
Article
TL-Efficient-SE: A Transfer Learning-Based Attention-Enhanced Model for Fingerprint Liveness Detection Across Multi-Sensor Spoof Attacks
by Archana Pallakonda, Rayappa David Amar Raj, Rama Muni Reddy Yanamala, Christian Napoli and Cristian Randieri
Mach. Learn. Knowl. Extr. 2025, 7(4), 113; https://doi.org/10.3390/make7040113 - 1 Oct 2025
Viewed by 520
Abstract
Fingerprint authentication systems encounter growing threats from presentation attacks, making strong liveness detection crucial. This work presents a deep learning-based framework integrating EfficientNetB0 with a Squeeze-and-Excitation (SE) attention mechanism, using transfer learning to enhance feature extraction. The model is trained and tested on the LivDet 2015 dataset, composed of real and spoofed fingerprints captured with four optical sensors, with spoofs fabricated from PlayDoh, Ecoflex, and Gelatine. Stratified splitting is performed after the input images have been scaled and normalized to EfficientNetB0's expected format. The SE module adaptively emphasizes informative features to differentiate live from fake inputs. The classification head comprises fully connected layers, dropout, batch normalization, and a sigmoid output. Empirical results exhibit accuracy between 98.50% and 99.50%, with an AUC varying from 0.978 to 0.9995, providing high precision and recall for genuine users and robust generalization across unseen spoof types. Compared to existing methods like Slim-ResCNN and HyiPAD, the novelty of our model lies in the Squeeze-and-Excitation mechanism, which enhances feature discrimination by adaptively recalibrating the channels of the feature maps, thereby improving the model's ability to differentiate between live and spoofed fingerprints. This model has practical implications for deployment in real-time biometric systems, including mobile authentication and secure access control, presenting an efficient solution for protecting against sophisticated spoofing methods. Future research will focus on sensor-invariant learning and adaptive thresholds to further enhance resilience against varying spoofing attacks. Full article
(This article belongs to the Special Issue Advances in Machine and Deep Learning)
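
A sketch of the described pipeline, assuming a Keras/TensorFlow implementation; the head sizes, dropout rate, and SE reduction ratio are assumptions, not the paper's reported hyperparameters.

```python
import tensorflow as tf
from tensorflow.keras import layers

def se_block(x, ratio=16):
    """Squeeze-and-excitation: reweight channels by learned global statistics."""
    ch = x.shape[-1]
    s = layers.GlobalAveragePooling2D()(x)            # squeeze
    s = layers.Dense(ch // ratio, activation="relu")(s)
    s = layers.Dense(ch, activation="sigmoid")(s)     # per-channel excitation
    return layers.Multiply()([x, layers.Reshape((1, 1, ch))(s)])

# Frozen ImageNet-pretrained backbone (transfer learning).
backbone = tf.keras.applications.EfficientNetB0(
    include_top=False, weights="imagenet", input_shape=(224, 224, 3))
backbone.trainable = False

inp = layers.Input((224, 224, 3))
x = se_block(backbone(inp, training=False))
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(128, activation="relu")(x)           # head sizes are assumptions
x = layers.BatchNormalization()(x)
x = layers.Dropout(0.3)(x)
out = layers.Dense(1, activation="sigmoid")(x)        # live vs. spoof

model = tf.keras.Model(inp, out)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC()])
```

Freezing the backbone keeps training cheap on a modest spoof dataset, while the SE block is the only liveness-specific addition to the pretrained feature extractor.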

20 pages, 646 KB  
Article
Adversarial Attacks Detection Method for Tabular Data
by Łukasz Wawrowski, Piotr Biczyk, Dominik Ślęzak and Marek Sikora
Mach. Learn. Knowl. Extr. 2025, 7(4), 112; https://doi.org/10.3390/make7040112 - 1 Oct 2025
Viewed by 654
Abstract
Adversarial attacks involve malicious actors introducing intentional perturbations to machine learning (ML) models, causing unintended behavior. This poses a significant threat to the integrity and trustworthiness of ML models, necessitating the development of robust detection techniques to protect systems from potential threats. The paper proposes a new approach for detecting adversarial attacks using a surrogate model and diagnostic attributes. The method was tested on 22 tabular datasets on which four different ML models were trained. Furthermore, various attacks were conducted, yielding perturbed data. The proposed approach is highly effective at detecting both known and unknown attacks: balanced accuracy was above 0.94, with very low false negative rates (0.02–0.10) for binary detection. Sensitivity analysis shows that classifiers trained on diagnostic attributes can detect even very subtle adversarial attacks. Full article
(This article belongs to the Section Learning)
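
A conceptual sketch of the surrogate-model idea: derive diagnostic attributes from the primary and surrogate models, then train a detector on them. The attributes chosen here (confidence and primary/surrogate disagreement) and the noise-based stand-in "attack" are illustrative, not the paper's exact feature set or attack suite.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Synthetic tabular task for the primary model.
rng = np.random.default_rng(7)
X = rng.normal(size=(1500, 8))
y = (X[:, 0] - X[:, 2] > 0).astype(int)

primary = GradientBoostingClassifier(random_state=0).fit(X, y)
# Surrogate mimics the primary model by fitting its predictions.
surrogate = RandomForestClassifier(random_state=0).fit(X, primary.predict(X))

def diagnostics(samples):
    p, s = primary.predict_proba(samples), surrogate.predict_proba(samples)
    return np.column_stack([p.max(axis=1),               # primary confidence
                            np.abs(p - s).sum(axis=1)])  # model disagreement

clean = X[:500]
perturbed = clean + rng.normal(scale=0.8, size=clean.shape)  # crude stand-in attack
D = np.vstack([diagnostics(clean), diagnostics(perturbed)])
labels = np.r_[np.zeros(len(clean)), np.ones(len(perturbed))]
detector = LogisticRegression().fit(D, labels)
print("detector accuracy on diagnostic attributes:", detector.score(D, labels))
```

The design point is that the detector never sees raw features: it operates on model-behavior attributes, which is what lets this style of detector generalize to attacks it was not trained on.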
