Mach. Learn. Knowl. Extr., Volume 7, Issue 3 (September 2025) – 23 articles

  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive the tables of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view a paper in PDF format, click on the "PDF Full-text" link and open it with the free Adobe Reader.
20 pages, 5008 KiB  
Article
Harnessing Large-Scale University Registrar Data for Predictive Insights: A Data-Driven Approach to Forecasting Undergraduate Student Success with Convolutional Autoencoders
by Mohammad Erfan Shoorangiz and Michal Brylinski
Mach. Learn. Knowl. Extr. 2025, 7(3), 80; https://doi.org/10.3390/make7030080 - 8 Aug 2025
Abstract
Predicting undergraduate student success is critical for informing timely interventions and improving outcomes in higher education. This study leverages over a decade of historical data from Louisiana State University (LSU) to forecast graduation outcomes using advanced machine learning techniques, with a focus on convolutional autoencoders (CAEs). We detail the data processing and transformation steps, including feature selection and imputation, to construct a robust dataset. The CAE effectively extracts meaningful latent features, validated through low-dimensional t-SNE visualizations that reveal clear clusters based on class labels, differentiating students likely to graduate from those at risk. A two-year gap strategy is introduced to ensure rigorous evaluation and simulate real-world conditions by predicting outcomes on unseen future data. Our results demonstrate the promise of CAE-derived embeddings for dimensionality reduction and computational efficiency, with competitive performance in downstream classification tasks. While models trained on embeddings showed slightly reduced performance compared to raw input data, with accuracies of 83% and 85%, respectively, their compactness and computational efficiency highlight their potential for large-scale analyses. The study emphasizes the importance of rigorous preprocessing, feature engineering, and evaluation protocols. By combining these approaches, we provide actionable insights and adaptive modeling strategies to support robust and generalizable predictive systems, enabling educators and administrators to enhance student success initiatives in dynamic educational environments.
(This article belongs to the Section Learning)
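For a concrete picture of the approach, below is a minimal sketch of a 1-D convolutional autoencoder of the kind the abstract describes; the layer sizes, 64-feature input, and training loss are illustrative assumptions, not the authors' configuration.

```python
# Minimal 1-D convolutional autoencoder sketch for tabular student records.
# All layer sizes and the 64-feature input are illustrative assumptions.
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    def __init__(self, n_features: int = 64, latent_dim: int = 16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 8, kernel_size=3, padding=1),    # (B, 1, F) -> (B, 8, F)
            nn.ReLU(),
            nn.MaxPool1d(2),                              # halve the feature axis
            nn.Flatten(),
            nn.Linear(8 * (n_features // 2), latent_dim), # compact embedding
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, n_features),
            # no output activation: features assumed standardized to real values
        )

    def forward(self, x):
        z = self.encoder(x)            # latent embedding reused downstream
        return self.decoder(z), z

model = ConvAutoencoder()
x = torch.randn(32, 1, 64)             # batch of 32 encoded student records
recon, embedding = model(x)
loss = nn.functional.mse_loss(recon, x.squeeze(1))
```

The latent vector is what would feed the downstream graduation classifier in place of the raw features.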

18 pages, 973 KiB  
Article
Machine Learning-Based Vulnerability Detection in Rust Code Using LLVM IR and Transformer Model
by Young Lee, Syeda Jannatul Boshra, Jeong Yang, Zechun Cao and Gongbo Liang
Mach. Learn. Knowl. Extr. 2025, 7(3), 79; https://doi.org/10.3390/make7030079 - 6 Aug 2025
Abstract
Rust’s growing popularity in high-integrity systems requires automated vulnerability detection in order to maintain its strong safety guarantees. Although Rust’s ownership model and compile-time checks prevent many errors, unexpected bugs may occasionally pass analysis, underlining the necessity of automated detection of safe and unsafe code. This paper presents Rust-IR-BERT, a machine learning approach to detect security vulnerabilities in Rust code by analyzing its compiled LLVM intermediate representation (IR) instead of the raw source code. The approach is novel in employing LLVM IR’s language-neutral, semantically rich representation of the program, facilitating robust detection by capturing core data- and control-flow semantics and reducing language-specific syntactic noise. Our method leverages GraphCodeBERT, a pretrained graph-based transformer that encodes structural code semantics via data-flow information, followed by CatBoost, a gradient-boosting classifier capable of handling complex feature interactions, to classify code as vulnerable or safe. The model was evaluated on a carefully curated dataset of over 2300 real-world Rust code samples (vulnerable and non-vulnerable snippets) from the RustSec and OSV advisory databases, compiled to LLVM IR and labeled with corresponding Common Vulnerabilities and Exposures (CVE) identifiers to ensure comprehensive and realistic coverage. Rust-IR-BERT achieved an overall accuracy of 98.11%, with a recall of 99.31% for safe code and 93.67% for vulnerable code. Despite these promising results, this study acknowledges potential limitations, such as its focus on known CVEs. Built on a representative dataset spanning over 2300 real-world Rust samples from diverse crates, Rust-IR-BERT delivers consistently strong performance. Looking ahead, practical deployment could take the form of a Cargo plugin or pre-commit hook that automatically generates and scans LLVM IR artifacts during development, enabling developers to catch vulnerabilities early in the development cycle.
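A minimal sketch of the two-stage pipeline named in the abstract: GraphCodeBERT embeddings of LLVM IR followed by a CatBoost classifier. The `microsoft/graphcodebert-base` checkpoint is the public Hugging Face model; the mean-pooling step, truncation length, and hyperparameters are assumptions rather than the paper's exact setup.

```python
# Sketch: embed LLVM IR with GraphCodeBERT, then classify with CatBoost.
# Pooling strategy, truncation length, and hyperparameters are assumptions.
import torch
from transformers import AutoTokenizer, AutoModel
from catboost import CatBoostClassifier

tokenizer = AutoTokenizer.from_pretrained("microsoft/graphcodebert-base")
encoder = AutoModel.from_pretrained("microsoft/graphcodebert-base")

def embed(ir_text: str) -> list:
    """Mean-pooled token embedding of one LLVM IR snippet."""
    inputs = tokenizer(ir_text, truncation=True, max_length=512,
                       return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state   # (1, T, 768)
    return hidden.mean(dim=1).squeeze(0).tolist()

# Toy stand-ins for the labeled IR corpus: 1 = vulnerable, 0 = safe.
ir_samples = ["define i32 @f() {\nentry:\n  ret i32 0\n}",
              "define i32 @g(i32 %x) {\nentry:\n  ret i32 %x\n}"]
labels = [0, 1]

X = [embed(ir) for ir in ir_samples]
clf = CatBoostClassifier(iterations=200, verbose=False)
clf.fit(X, labels)
print(clf.predict(X))
```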

40 pages, 2515 KiB  
Article
AE-DTNN: Autoencoder–Dense–Transformer Neural Network Model for Efficient Anomaly-Based Intrusion Detection Systems
by Hesham Kamal and Maggie Mashaly
Mach. Learn. Knowl. Extr. 2025, 7(3), 78; https://doi.org/10.3390/make7030078 - 6 Aug 2025
Abstract
In this study, we introduce an enhanced hybrid Autoencoder–Dense–Transformer Neural Network (AE-DTNN) model for developing an effective intrusion detection system (IDS) aimed at improving the performance and robustness of threat detection strategies within a rapidly changing and increasingly complex network landscape. The Autoencoder component restructures network traffic data, while a stack of Dense layers performs feature extraction to generate more meaningful representations. The Transformer network then facilitates highly precise and comprehensive classification. Our strategy incorporates adaptive synthetic sampling (ADASYN) for both binary and multi-class classification tasks, complemented by the edited nearest neighbors (ENN) technique and the use of class weights to mitigate class imbalance issues. In experiments conducted on the NF-BoT-IoT-v2 dataset, the AE-DTNN-based IDS achieved outstanding performance, with 99.98% accuracy in binary classification and 98.30% in multi-class classification. On the NSL-KDD dataset, the model reached 98.57% accuracy for binary classification and 97.50% for multi-class classification. Additionally, the model attained 99.92% and 99.78% accuracy in binary and multi-class classification, respectively, on the CSE-CIC-IDS2018 dataset. These results demonstrate the exceptional effectiveness of the proposed model in contrast to conventional approaches, highlighting its strong potential to detect a broad range of network intrusions with high reliability.
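The class-rebalancing step named above (ADASYN oversampling followed by edited-nearest-neighbors cleaning) can be reproduced with the imbalanced-learn library; the synthetic dataset and parameters below are placeholders, not the paper's data.

```python
# Sketch of the rebalancing step: ADASYN oversampling, then ENN cleaning.
from collections import Counter
from imblearn.over_sampling import ADASYN
from imblearn.under_sampling import EditedNearestNeighbours
from sklearn.datasets import make_classification

# Placeholder imbalanced dataset (95% majority / 5% minority).
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.95, 0.05], random_state=0)
print("before:", Counter(y))

X_over, y_over = ADASYN(random_state=0).fit_resample(X, y)      # oversample minority
X_clean, y_clean = EditedNearestNeighbours().fit_resample(X_over, y_over)  # remove noisy points
print("after :", Counter(y_clean))
```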

24 pages, 1993 KiB  
Article
Evaluating Prompt Injection Attacks with LSTM-Based Generative Adversarial Networks: A Lightweight Alternative to Large Language Models
by Sharaf Rashid, Edson Bollis, Lucas Pellicer, Darian Rabbani, Rafael Palacios, Aneesh Gupta and Amar Gupta
Mach. Learn. Knowl. Extr. 2025, 7(3), 77; https://doi.org/10.3390/make7030077 - 6 Aug 2025
Abstract
Generative Adversarial Networks (GANs) using Long Short-Term Memory (LSTM) provide a computationally cheaper approach for text generation compared to large language models (LLMs). The low hardware barrier to training GANs poses a threat because it means more bad actors may use them to mass-produce prompt attack messages against LLM systems. Thus, to better understand the threat of GANs being used for prompt attack generation, we train two well-known GAN architectures, SeqGAN and RelGAN, on prompt attack messages. For each architecture, we evaluate the generated prompt attack messages, comparing results with each other, with generated attacks from another computationally cheap approach, a 1-billion-parameter Llama 3.2 small language model (SLM), and with messages from the original dataset. This evaluation suggests that GAN architectures like SeqGAN and RelGAN have the potential to be used in conjunction with SLMs to readily generate malicious prompts that pose new threats to LLM-based systems such as chatbots. Analyzing the effectiveness of state-of-the-art defenses against prompt attacks, we also find that GAN-generated attacks can deceive most of these defenses with varying degrees of success, with the exception of Meta’s PromptGuard. Further, we suggest an improvement to prompt attack defenses based on analysis of the language quality of the prompts, which we found to be the weakest point of GAN-generated messages.

22 pages, 1969 KiB  
Article
Significance of Time-Series Consistency in Evaluating Machine Learning Models for Gap-Filling Multi-Level Very Tall Tower Data
by Changhyoun Park
Mach. Learn. Knowl. Extr. 2025, 7(3), 76; https://doi.org/10.3390/make7030076 - 3 Aug 2025
Abstract
Machine learning modeling is a valuable tool for gap-filling or prediction, and its performance is typically evaluated using standard metrics. To enable more precise assessments for time-series data, this study emphasizes the importance of considering time-series consistency, which can be evaluated through amplitude—specifically, the interquartile range and the lower bound of the band in the gap-filled time series. To test this hypothesis, a gap-filling technique was applied using long-term (~6 years) high-frequency flux and meteorological data collected at four different levels (1.5, 60, 140, and 300 m above sea level) on a ~300 m tall flux tower. Among several variables, this study focused on turbulent kinetic energy, which is important for estimating sensible and latent heat fluxes and net ecosystem exchange. Five ensemble machine learning algorithms were selected and trained on three different datasets. Among the modeling scenarios, the stacking model trained on a dataset combined with derivative data produced the best metrics for predicting turbulent kinetic energy. Although the metrics before and after gap-filling showed little difference among the scenarios, large distortions were found in the consistency of the time series in terms of amplitude. These findings underscore the importance of evaluating time-series consistency alongside traditional metrics, not only to accurately assess modeling performance but also to ensure reliability in downstream applications such as forecasting, climate modeling, and energy estimation.
(This article belongs to the Section Data)
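A minimal sketch of the amplitude-based consistency check the abstract argues for: after gap-filling a series, compare the interquartile range of the imputed values with that of the observed ones rather than relying on error metrics alone. The synthetic turbulent-kinetic-energy series and the single random-forest gap-filler are simplifying assumptions.

```python
# Sketch: gap-fill a series, then check amplitude (IQR) consistency.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
t = np.arange(5000)
tke = 0.5 + 0.3 * np.sin(2 * np.pi * t / 144) + rng.normal(0, 0.05, t.size)

mask = rng.random(t.size) < 0.2                 # 20% artificial gaps
X = np.column_stack([np.sin(2 * np.pi * t / 144),
                     np.cos(2 * np.pi * t / 144)])

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[~mask], tke[~mask])
filled = tke.copy()
filled[mask] = model.predict(X[mask])

def iqr(a):
    q75, q25 = np.percentile(a, [75, 25])
    return q75 - q25

# Similar IQRs (and lower bounds) indicate the gap-filled series keeps
# the amplitude of the observed one, beyond what RMSE alone reveals.
print("observed IQR:", iqr(tke[~mask]), " filled IQR:", iqr(filled[mask]))
print("observed min:", tke[~mask].min(), " filled min:", filled[mask].min())
```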

65 pages, 8546 KiB  
Review
Quantum Machine Learning and Deep Learning: Fundamentals, Algorithms, Techniques, and Real-World Applications
by Maria Revythi and Georgia Koukiou
Mach. Learn. Knowl. Extr. 2025, 7(3), 75; https://doi.org/10.3390/make7030075 - 1 Aug 2025
Abstract
Quantum computing, with its foundational principles of superposition and entanglement, has the potential to provide significant quantum advantages, addressing challenges that classical computing may struggle to overcome. As data generation continues to grow exponentially and technological advancements accelerate, classical machine learning algorithms increasingly face difficulties in solving complex real-world problems. The integration of classical machine learning with quantum information processing has led to the emergence of quantum machine learning, a promising interdisciplinary field. This work provides the reader with a bottom-up view of quantum circuits, starting from quantum data representation and quantum gates and progressing to the fundamental quantum algorithms and more complex quantum processes. A thorough study of the mathematics behind them is a powerful guide for scientists entering this domain and exploring its connection to quantum machine learning. Quantum algorithms such as Shor’s algorithm, Grover’s algorithm, and the Harrow–Hassidim–Lloyd (HHL) algorithm are discussed in detail. Furthermore, real-world implementations of quantum machine learning and quantum deep learning are presented in fields such as healthcare, bioinformatics, and finance. These implementations aim to enhance time efficiency and reduce algorithmic complexity through the development of more effective quantum algorithms. A comprehensive understanding of the fundamentals of these algorithms is therefore crucial.
(This article belongs to the Section Learning)
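As a taste of the level at which such a review treats these algorithms, the canonical starting points are single-qubit superposition and Grover's quadratic query speedup (standard results, stated here for orientation):

```latex
% Single-qubit state and normalization constraint.
\[
  |\psi\rangle = \alpha|0\rangle + \beta|1\rangle,
  \qquad |\alpha|^2 + |\beta|^2 = 1.
\]
% Grover search over N items needs about (pi/4) sqrt(N) iterations,
% i.e. O(sqrt(N)) oracle queries versus O(N) classically.
\[
  k \approx \frac{\pi}{4}\sqrt{N}, \qquad O\!\left(\sqrt{N}\right) \ \text{vs.} \ O(N).
\]
```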

24 pages, 3121 KiB  
Article
SG-RAG MOT: SubGraph Retrieval Augmented Generation with Merging and Ordering Triplets for Knowledge Graph Multi-Hop Question Answering
by Ahmmad O. M. Saleh, Gokhan Tur and Yucel Saygin
Mach. Learn. Knowl. Extr. 2025, 7(3), 74; https://doi.org/10.3390/make7030074 - 1 Aug 2025
Abstract
Large language models (LLMs) often tend to hallucinate, especially in domain-specific tasks and tasks that require reasoning. Previously, we introduced SubGraph Retrieval Augmented Generation (SG-RAG) as a novel Graph RAG method for multi-hop question answering. SG-RAG leverages Cypher queries to search a given knowledge graph and retrieve the subgraph necessary to answer the question. Our previous results showed the higher performance of SG-RAG compared to traditional Retrieval Augmented Generation (RAG). In this work, we further enhance SG-RAG with an additional step called Merging and Ordering Triplets (MOT). The MOT step decreases redundancy in the retrieved triplets by applying hierarchical merging to the retrieved subgraphs, and it imposes an ordering on the triplets using the Breadth-First Search (BFS) traversal algorithm. We conducted experiments on the MetaQA benchmark, which was proposed for multi-hop question answering in the movie domain. Our experiments showed that SG-RAG MOT provides more accurate answers than Chain-of-Thought and Graph Chain-of-Thought. We also found that merging highly overlapping subgraphs (up to a certain point) and defining an order among the triplets helped the LLM generate more precise answers.
(This article belongs to the Special Issue Knowledge Graphs and Large Language Models)
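A small sketch of what the MOT step might look like in code: subgraphs are merged when their triple sets overlap beyond a threshold, and the surviving triples are ordered by breadth-first traversal from the question entity. The overlap threshold, data structures, and movie-domain toy triples are illustrative assumptions, not the authors' implementation.

```python
# Sketch of Merging and Ordering Triplets (MOT) over retrieved subgraphs.
from collections import defaultdict, deque

def merge(subgraphs, overlap=0.5):
    """Greedily merge subgraphs (sets of triples) whose overlap is high."""
    merged = []
    for sg in subgraphs:
        for m in merged:
            inter = len(sg & m) / min(len(sg), len(m))
            if inter >= overlap:
                m |= sg          # absorb the overlapping subgraph
                break
        else:
            merged.append(set(sg))
    return merged

def bfs_order(triples, start):
    """Order triples by BFS traversal from the question entity."""
    adj = defaultdict(list)
    for h, r, t in triples:
        adj[h].append((h, r, t))
    seen, order, queue = set(), [], deque([start])
    while queue:
        node = queue.popleft()
        for h, r, t in adj[node]:
            if (h, r, t) not in seen:
                seen.add((h, r, t))
                order.append((h, r, t))
                if t not in queue:
                    queue.append(t)
    return order

g1 = {("Inception", "directed_by", "Nolan"), ("Nolan", "born_in", "London")}
g2 = {("Inception", "directed_by", "Nolan"), ("Nolan", "directed", "Tenet")}
merged = merge([g1, g2])
print(bfs_order(merged[0], "Inception"))
```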

19 pages, 6095 KiB  
Article
MERA: Medical Electronic Records Assistant
by Ahmed Ibrahim, Abdullah Khalili, Maryam Arabi, Aamenah Sattar, Abdullah Hosseini and Ahmed Serag
Mach. Learn. Knowl. Extr. 2025, 7(3), 73; https://doi.org/10.3390/make7030073 - 30 Jul 2025
Abstract
The increasing complexity and scale of electronic health records (EHRs) demand advanced tools for efficient data retrieval, summarization, and comparative analysis in clinical practice. MERA (Medical Electronic Records Assistant) is a Retrieval-Augmented Generation (RAG)-based AI system that addresses these needs by integrating domain-specific retrieval with large language models (LLMs) to deliver robust question answering, similarity search, and report summarization functionalities. MERA is designed to overcome key limitations of conventional LLMs in healthcare, such as hallucinations, outdated knowledge, and limited explainability. To ensure both privacy compliance and model robustness, we constructed a large synthetic dataset using state-of-the-art LLMs, including Mistral v0.3, Qwen 2.5, and Llama 3, and further validated MERA on de-identified real-world EHRs from the MIMIC-IV-Note dataset. Comprehensive evaluation demonstrates MERA’s high accuracy in medical question answering (correctness: 0.91; relevance: 0.98; groundedness: 0.89; retrieval relevance: 0.92), strong summarization performance (ROUGE-1 F1-score: 0.70; Jaccard similarity: 0.73), and effective similarity search (METEOR: 0.7–1.0 across diagnoses), with consistent results on real EHRs. The similarity search module empowers clinicians to efficiently identify and compare analogous patient cases, supporting differential diagnosis and personalized treatment planning. By generating concise, contextually relevant, and explainable insights, MERA reduces clinician workload and enhances decision-making. To our knowledge, this is the first system to integrate clinical question answering, summarization, and similarity search within a unified RAG-based framework.
(This article belongs to the Special Issue Advances in Machine and Deep Learning)

23 pages, 8532 KiB  
Article
VisRep: Towards an Automated, Reflective AI System for Documenting Visualisation Design Processes
by Aron E. Owen and Jonathan C. Roberts
Mach. Learn. Knowl. Extr. 2025, 7(3), 72; https://doi.org/10.3390/make7030072 - 25 Jul 2025
Abstract
VisRep (Visualisation Report) is an AI-powered system for capturing and structuring the early stages of the visualisation design process. It addresses a critical gap in predesign: the lack of tools that can naturally record, organise, and transform raw ideation, spoken thoughts, sketches, and evolving concepts into polished, shareable outputs. Users engage in talk-aloud sessions through a terminal-style interface supported by intelligent transcription and eleven structured questions that frame intent, audience, and output goals. These inputs are then processed by a large language model (LLM) guided by markdown-based output templates for reports, posters, and slides. The system aligns free-form ideas with structured communication using prompt engineering to ensure clarity, coherence, and visual consistency. VisRep not only automates the generation of professional deliverables but also enhances reflective practice by bridging spontaneous ideation and structured documentation. This paper introduces VisRep’s methodology, interface design, and AI-driven workflow, demonstrating how it improves the fidelity and transparency of the visualisation design process across academic, professional, and creative domains.
(This article belongs to the Section Visualization)

26 pages, 1276 KiB  
Systematic Review
Harnessing Language Models for Studying the Ancient Greek Language: A Systematic Review
by Diamanto Tzanoulinou, Loukas Triantafyllopoulos and Vassilios S. Verykios
Mach. Learn. Knowl. Extr. 2025, 7(3), 71; https://doi.org/10.3390/make7030071 - 24 Jul 2025
Abstract
Applying language models (LMs) and generative artificial intelligence (GenAI) to the study of Ancient Greek offers promising opportunities. However, it faces substantial challenges due to the language’s morphological complexity and the lack of annotated resources. Despite growing interest, no systematic overview of existing research currently exists. To address this gap, a systematic literature review was conducted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 methodology. Twenty-seven peer-reviewed studies were identified and analyzed, focusing on application areas such as machine translation, morphological analysis, named entity recognition (NER), and emotion detection. The review reveals six key findings, highlighting both technical advances and persistent limitations, particularly the scarcity of large, domain-specific corpora and the need for better integration into educational contexts. Future developments should focus on building richer resources and tailoring models to the unique features of Ancient Greek, thereby fully realizing the potential of these technologies in both research and teaching.

26 pages, 2658 KiB  
Article
An Efficient and Accurate Random Forest Node-Splitting Algorithm Based on Dynamic Bayesian Methods
by Jun He, Zhanqi Li and Linzi Yin
Mach. Learn. Knowl. Extr. 2025, 7(3), 70; https://doi.org/10.3390/make7030070 - 21 Jul 2025
Abstract
Random Forests are powerful machine learning models widely applied in classification and regression tasks due to their robust predictive performance. Nevertheless, traditional Random Forests face computational challenges during tree construction, particularly on high-dimensional data or resource-constrained devices. In this paper, a novel node-splitting algorithm, BayesSplit, is proposed to accelerate decision tree construction via a Bayesian impurity-estimation framework. BayesSplit treats impurity reduction as a Bernoulli event with Beta-conjugate priors for each split point and incorporates two main strategies. First, Dynamic Posterior Parameter Refinement updates the Beta parameters based on observed impurity reductions in batch iterations. Second, Posterior-Derived Confidence Bounding establishes statistical confidence intervals that efficiently filter out suboptimal splits. Theoretical analysis demonstrates that BayesSplit converges to optimal splits with high probability, and experimental results show up to a 95% reduction in training time compared to baselines while maintaining or exceeding generalization performance. Compared to the state-of-the-art MABSplit, BayesSplit achieves similar accuracy on classification tasks and reduces regression training time by 20–70% with lower MSEs. Furthermore, BayesSplit enhances feature importance stability by up to 40%, making it particularly suitable for deployment in computationally constrained environments.
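A minimal sketch of the Beta-Bernoulli bookkeeping the abstract describes: each candidate split keeps a Beta posterior over its chance of yielding an impurity reduction, and candidates whose posterior upper bound falls below the best lower bound are pruned. The priors, confidence level, and pruning rule here are illustrative assumptions.

```python
# Sketch: Beta posteriors per split candidate with confidence-bound pruning.
from scipy.stats import beta

class SplitCandidate:
    def __init__(self):
        self.a, self.b = 1.0, 1.0            # Beta(1, 1) uniform prior

    def update(self, improved: bool):
        # Bernoulli observation: did this batch show an impurity reduction?
        if improved:
            self.a += 1
        else:
            self.b += 1

    def bounds(self, level=0.95):
        lo = beta.ppf((1 - level) / 2, self.a, self.b)
        hi = beta.ppf(1 - (1 - level) / 2, self.a, self.b)
        return lo, hi

candidates = {0.5: SplitCandidate(), 1.2: SplitCandidate()}
candidates[0.5].update(True); candidates[0.5].update(True)   # promising split
candidates[1.2].update(False)                                # weak split

best_lo = max(c.bounds()[0] for c in candidates.values())
survivors = {s: c for s, c in candidates.items() if c.bounds()[1] >= best_lo}
print(sorted(survivors))    # splits still worth evaluating next batch
```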

23 pages, 4997 KiB  
Article
Prediction of Bearing Layer Depth Using Machine Learning Algorithms and Evaluation of Their Performance
by Yuxin Cong, Arisa Katsuumi and Shinya Inazumi
Mach. Learn. Knowl. Extr. 2025, 7(3), 69; https://doi.org/10.3390/make7030069 - 21 Jul 2025
Abstract
In earthquake-prone areas such as Tokyo, accurate estimation of bearing stratum depth is crucial for foundation design, liquefaction assessment, and urban disaster mitigation. However, traditional methods such as the standard penetration test (SPT), while reliable, are labor-intensive and have limited spatial distribution. In this study, 942 geological survey records from the Tokyo metropolitan area were used to evaluate the performance of three machine learning algorithms, random forest (RF), artificial neural network (ANN), and support vector machine (SVM), in predicting bearing stratum depth. The main input variables included geographic coordinates, elevation, and stratigraphic category. The results showed that the RF model performed well in terms of multiple evaluation indicators and had significantly better prediction accuracy than ANN and SVM. In addition, data density analysis showed that the prediction error was significantly reduced in high-density areas. The results demonstrate the robustness and adaptability of the RF method in foundation soil layer identification, emphasizing the importance of comprehensive input variables and spatial coverage. The proposed method can be used for large-scale, data-driven bearing stratum prediction and has the potential to be integrated into geological risk management systems and smart city platforms.
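A minimal sketch of the best-performing setup reported above: a random forest regressor predicting bearing-stratum depth from coordinates, elevation, and an encoded stratigraphic category. The synthetic data and hyperparameters are placeholders for the 942 survey records.

```python
# Sketch: random forest on coordinates, elevation, and stratigraphic class.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
n = 942                                         # matches the record count
lon = rng.uniform(139.6, 139.9, n)
lat = rng.uniform(35.5, 35.8, n)
elev = rng.uniform(0, 40, n)
strat = rng.integers(0, 5, n)                   # encoded stratigraphic class
depth = 5 + 0.5 * elev + 2 * strat + rng.normal(0, 2, n)   # toy target

X = np.column_stack([lon, lat, elev, strat])
X_tr, X_te, y_tr, y_te = train_test_split(X, depth, random_state=0)

rf = RandomForestRegressor(n_estimators=300, random_state=0)
rf.fit(X_tr, y_tr)
print("MAE (m):", mean_absolute_error(y_te, rf.predict(X_te)))
```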

22 pages, 875 KiB  
Article
Towards Robust Synthetic Data Generation for Simplification of Text in French
by Nikos Tsourakis
Mach. Learn. Knowl. Extr. 2025, 7(3), 68; https://doi.org/10.3390/make7030068 - 19 Jul 2025
Abstract
We present a pipeline for synthetic simplification of text in French that combines large language models with structured semantic guidance. Our approach enhances data generation by integrating contextual knowledge from Wikipedia and Vikidia articles and injecting symbolic control through lightweight knowledge graphs. To construct document-level representations, we implement a progressive summarization process that incrementally builds running summaries and extracts key ideas. Simplifications are generated iteratively and assessed using semantic comparisons between input and output graphs, enabling targeted regeneration when critical information is lost. Our system is implemented using LangChain’s orchestration framework, allowing modular and extensible coordination of LLM components. Evaluation shows that context-aware prompting and semantic feedback improve simplification quality across successive iterations.
(This article belongs to the Special Issue Knowledge Graphs and Large Language Models)

24 pages, 2667 KiB  
Article
Transformer-Driven Fault Detection in Self-Healing Networks: A Novel Attention-Based Framework for Adaptive Network Recovery
by Parul Dubey, Pushkar Dubey and Pitshou N. Bokoro
Mach. Learn. Knowl. Extr. 2025, 7(3), 67; https://doi.org/10.3390/make7030067 - 16 Jul 2025
Abstract
Fault detection and remaining useful life (RUL) prediction are critical tasks in self-healing network (SHN) environments and industrial cyber–physical systems. These domains demand intelligent systems capable of handling dynamic, high-dimensional sensor data. However, existing optimization-based approaches often struggle with imbalanced datasets, noisy signals, and delayed convergence, limiting their effectiveness in real-time applications. This study utilizes two benchmark datasets—EFCD and SFDD—which represent electrical and sensor fault scenarios, respectively. These datasets pose challenges due to class imbalance and complex temporal dependencies. To address this, we propose a novel hybrid framework combining Attention-Augmented Convolutional Neural Networks (AACNN) with transformer encoders, enhanced through Enhanced Ensemble-SMOTE for balancing the minority class. The model captures spatial features and long-range temporal patterns and learns effectively from imbalanced data streams. The novelty lies in the integration of attention mechanisms and adaptive oversampling in a unified fault-prediction architecture. Model evaluation is based on multiple performance metrics, including accuracy, F1-score, MCC, RMSE, and score*. The results show that the proposed model outperforms state-of-the-art approaches, achieving up to 97.14% accuracy and a score* of 0.419, with faster convergence and improved generalization across both datasets.

16 pages, 2355 KiB  
Article
Generalising Stock Detection in Retail Cabinets with Minimal Data Using a DenseNet and Vision Transformer Ensemble
by Babak Rahi, Deniz Sagmanli, Felix Oppong, Direnc Pekaslan and Isaac Triguero
Mach. Learn. Knowl. Extr. 2025, 7(3), 66; https://doi.org/10.3390/make7030066 - 16 Jul 2025
Abstract
Generalising deep-learning models to perform well on unseen data domains with minimal retraining remains a significant challenge in computer vision. Even when the target task—such as quantifying the number of elements in an image—stays the same, variations in data quality, shape, or form can deviate from the training conditions, often necessitating manual intervention. As a real-world industry problem, we aim to automate stock level estimation in retail cabinets. As technology advances, new cabinet models with varying shapes emerge alongside new camera types. This evolving scenario poses a substantial obstacle to deploying long-term, scalable solutions. To surmount the challenge of generalising to new cabinet models and cameras from a minimal number of sample images, this research introduces a new solution. This paper proposes a novel ensemble model that combines DenseNet-201 and Vision Transformer (ViT-B/8) architectures to achieve generalisation in stock-level classification. The novelty of our solution lies in combining a transformer with a DenseNet model to capture both the local, hierarchical details and the long-range dependencies within the images, improving generalisation accuracy with less data. Key contributions include (i) a novel DenseNet-201 + ViT-B/8 feature-level fusion, (ii) an adaptation workflow that needs only two images per class, (iii) a balanced layer-unfreezing schedule, (iv) a publicly described domain-shift benchmark, and (v) a 47 pp accuracy gain over four standard few-shot baselines. Our approach leverages fine-tuning techniques to adapt two pre-trained models to the new retail cabinets (i.e., standing or horizontal) and camera types using only two images per class. Experimental results demonstrate that our method achieves high accuracy rates of 91% on new cabinets with the same camera and 89% on new cabinets with different cameras, significantly outperforming standard few-shot learning methods.
(This article belongs to the Section Data)
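A sketch of the feature-level fusion the contribution list names: both backbones embed the same image, the two feature vectors are concatenated, and a linear head classifies stock level. The timm model name, head size, and four-class output are assumptions, and the paper's layer-unfreezing schedule is omitted.

```python
# Sketch: DenseNet-201 + ViT-B/8 feature-level fusion for classification.
import torch
import torch.nn as nn
import timm
from torchvision.models import densenet201

class FusionClassifier(nn.Module):
    def __init__(self, n_classes: int = 4):
        super().__init__()
        self.cnn = densenet201(weights="DEFAULT").features
        self.vit = timm.create_model("vit_base_patch8_224",
                                     pretrained=True, num_classes=0)
        self.pool = nn.AdaptiveAvgPool2d(1)
        # 1920 DenseNet-201 channels + 768 ViT-B embedding dims
        self.head = nn.Linear(1920 + 768, n_classes)

    def forward(self, x):
        c = self.pool(self.cnn(x)).flatten(1)   # (B, 1920)
        v = self.vit(x)                         # (B, 768)
        return self.head(torch.cat([c, v], dim=1))

model = FusionClassifier()
logits = model(torch.randn(2, 3, 224, 224))     # two sample cabinet images
```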

24 pages, 8216 KiB  
Article
Application of Dueling Double Deep Q-Network for Dynamic Traffic Signal Optimization: A Case Study in Danang City, Vietnam
by Tho Cao Phan, Viet Dinh Le and Teron Nguyen
Mach. Learn. Knowl. Extr. 2025, 7(3), 65; https://doi.org/10.3390/make7030065 - 14 Jul 2025
Abstract
This study investigates the application of the Dueling Double Deep Q-Network (3DQN) algorithm to optimize traffic signal control at a major urban intersection in Danang City, Vietnam. The objective is to enhance signal timing efficiency in response to mixed traffic flow and real-world traffic dynamics. A simulation environment was developed using the Simulation of Urban Mobility (SUMO) software version 1.11, incorporating both a fixed-time signal controller and two 3DQN models trained with 1 million (1M-Step) and 5 million (5M-Step) iterations. The models were evaluated using randomized traffic demand scenarios ranging from 50% to 150% of baseline traffic volumes. The results demonstrate that the 3DQN models outperform the fixed-time controller, significantly reducing vehicle delays, with the 5M-Step model achieving average waiting times of under five minutes. To further assess the model’s responsiveness to real-time conditions, traffic flow data were collected using YOLOv8 for object detection and SORT for vehicle tracking from live camera feeds, and integrated into the SUMO-3DQN simulation. The findings highlight the robustness and adaptability of the 3DQN approach, particularly under peak traffic conditions, underscoring its potential for deployment in intelligent urban traffic management systems.
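A minimal sketch of the two ingredients behind "3DQN": a dueling head that splits state value from action advantages, and the double-DQN target in which the online network selects the next action while the target network evaluates it. Network sizes, the state encoding, and the replay batch are illustrative assumptions.

```python
# Sketch: dueling Q-network and the double-DQN target computation.
import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    def __init__(self, n_states=32, n_actions=4):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(n_states, 128), nn.ReLU())
        self.value = nn.Linear(128, 1)          # state-value stream
        self.adv = nn.Linear(128, n_actions)    # advantage stream

    def forward(self, s):
        h = self.body(s)
        a = self.adv(h)
        return self.value(h) + a - a.mean(dim=1, keepdim=True)

online, target = DuelingQNet(), DuelingQNet()
target.load_state_dict(online.state_dict())

# Placeholder replay batch: states, next states, rewards, terminal flags.
s, s2 = torch.randn(64, 32), torch.randn(64, 32)
r, done = torch.randn(64), torch.zeros(64)
gamma = 0.99

with torch.no_grad():
    best_a = online(s2).argmax(dim=1)                       # online selects
    q_next = target(s2).gather(1, best_a.unsqueeze(1)).squeeze(1)
    td_target = r + gamma * (1 - done) * q_next             # target evaluates
```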

22 pages, 2583 KiB  
Article
Helmet Detection in Underground Coal Mines via Dynamic Background Perception with Limited Valid Samples
by Guangfu Wang, Dazhi Sun, Hao Li, Jian Cheng, Pengpeng Yan and Heping Li
Mach. Learn. Knowl. Extr. 2025, 7(3), 64; https://doi.org/10.3390/make7030064 - 9 Jul 2025
Abstract
The underground coal mine environment is complex and dynamic, making the application of visual algorithms for object detection a crucial component of underground safety management as well as a key factor in ensuring the safe operation of workers. We examine this in the context of helmet-wearing detection in underground mines, where over 25% of the targets are small objects. To address challenges such as the lack of effective samples for unworn helmets, significant background interference, and the difficulty of detecting small helmet targets, this paper proposes a novel underground helmet-wearing detection algorithm that combines dynamic background awareness with a limited number of valid samples to improve accuracy for underground workers. The algorithm begins by analyzing the distribution of visual surveillance data and spatial biases in underground environments. Using data augmentation techniques, it then effectively expands the number of training samples by introducing positive and negative samples for helmet-wearing detection from ordinary scenes. Thereafter, based on YOLOv10, the algorithm incorporates a background awareness module with region masks to reduce the adverse effects of complex underground backgrounds on helmet-wearing detection. Specifically, it adds a convolution and attention fusion module in the detection head to enhance the model’s perception of small helmet-wearing objects by enlarging the detection receptive field. By analyzing the aspect ratio distribution of helmet-wearing data, the algorithm improves the aspect ratio constraints in the loss function, further enhancing detection accuracy. Consequently, it achieves precise detection of helmet wearing in underground coal mines. Experimental results demonstrate that the proposed algorithm detects small helmet-wearing objects in complex underground scenes with a 14% reduction in background false detection rates, achieving accuracy, recall, and average precision of 94.4%, 89%, and 95.4%, respectively. Compared to other mainstream object detection algorithms, the proposed algorithm improves detection accuracy by 6.7%, 5.1%, and 11.8% over YOLOv9, YOLOv10, and RT-DETR, respectively. The proposed algorithm can be applied to real-time helmet-wearing detection in underground coal mine scenes, providing safety alerts for standardized worker operations and enhancing the level of underground security intelligence.

19 pages, 1926 KiB  
Article
A Novel Approach to Company Bankruptcy Prediction Using Convolutional Neural Networks and Generative Adversarial Networks
by Alessia D’Ercole and Gianluigi Me
Mach. Learn. Knowl. Extr. 2025, 7(3), 63; https://doi.org/10.3390/make7030063 - 7 Jul 2025
Abstract
Predicting company bankruptcy is a critical task in financial risk assessment. This study introduces a novel approach using Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs) to enhance bankruptcy prediction accuracy. By transforming financial statements into grayscale images and leveraging synthetic data generation, we analyze a dataset of 6249 companies, including 3256 active and 2993 bankrupt firms. Our methodology innovates by addressing dataset limitations through GAN-based data augmentation. CNNs are employed to take advantage of their ability to extract hierarchical patterns from financial statement images, providing a new approach to financial analysis, while GANs help mitigate dataset imbalance by generating realistic synthetic data for training. We generate synthetic financial data that closely mimics real-world patterns, expanding the training dataset and potentially improving classifier performance. The CNN model is trained on a combination of real and synthetic data, with strict separation between training/validation and testing.
(This article belongs to the Section Network)
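A minimal sketch of the image-encoding idea described above: scale a company's financial figures to [0, 255] and tile them into a small grayscale image that a CNN can consume. The 8x8 layout and the toy feature vector are assumptions, not the paper's encoding.

```python
# Sketch: map a vector of financial figures to a grayscale image for a CNN.
import numpy as np

def statement_to_image(features: np.ndarray, side: int = 8) -> np.ndarray:
    """Map a 1-D vector of financial figures to a (side, side) uint8 image."""
    v = np.resize(features, side * side).astype(float)  # tile/trim to 64 values
    lo, hi = v.min(), v.max()
    scaled = np.zeros_like(v) if hi == lo else (v - lo) / (hi - lo)
    return (scaled * 255).astype(np.uint8).reshape(side, side)

# Toy financial row: revenue, assets, ratios, margins, etc.
row = np.array([1.8e6, 9.2e5, 0.43, 1.7, 0.08, 2.3e5, 0.61, 12.4])
img = statement_to_image(row)
print(img.shape, img.dtype)   # (8, 8) uint8, usable as a CNN input channel
```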

22 pages, 568 KiB  
Review
A Review of Methods for Unobtrusive Measurement of Work-Related Well-Being
by Zoja Anžur, Klara Žinkovič, Junoš Lukan, Pietro Barbiero, Gašper Slapničar, Mohan Li, Martin Gjoreski, Maike E. Debus, Sebastijan Trojer, Mitja Luštrek and Marc Langheinrich
Mach. Learn. Knowl. Extr. 2025, 7(3), 62; https://doi.org/10.3390/make7030062 - 1 Jul 2025
Abstract
Work-related well-being is an important research topic, as it is linked to various aspects of individuals’ lives, including job performance. To measure it effectively, unobtrusive sensors are desirable to minimize the burden on employees. Because the psychological literature lacks consensus on the dimensions that define well-being, our work begins by proposing a conceptualization of well-being based on the refined definition of health provided by the World Health Organization. We then review the existing literature on the unobtrusive measurement of well-being, focusing on affect, engagement, fatigue, stress, sleep deprivation, physical comfort, and social interactions. Our initial search resulted in a total of 644 studies, of which we reviewed 35. The reviewed methods capture a variety of behavioral markers, including facial expressions, posture, eye movements, body movement, and speech, with body movement, facial expressions, and posture the most common. The most commonly used sensory devices were red, green, and blue (RGB) cameras, followed by microphones and smartphones. Our work serves as an investigation into unobtrusive measurement methods applicable to the workplace context, aiming to foster a more employee-centric approach to the measurement of well-being and to emphasize its affective component.
(This article belongs to the Special Issue Sustainable Applications for Machine Learning)

13 pages, 2983 KiB  
Article
AI-Driven Intelligent Financial Forecasting: A Comparative Study of Advanced Deep Learning Models for Long-Term Stock Market Prediction
by Sira Yongchareon
Mach. Learn. Knowl. Extr. 2025, 7(3), 61; https://doi.org/10.3390/make7030061 - 1 Jul 2025
Abstract
The integration of artificial intelligence (AI) and advanced deep learning techniques is reshaping intelligent financial forecasting and decision-support systems. This study presents a comprehensive comparative analysis of advanced deep learning models, including state-of-the-art transformer architectures and established non-transformer approaches, for long-term stock market index prediction. Utilizing historical data from major global indices (S&P 500, NASDAQ, and Hang Seng), we evaluate ten models across multiple forecasting horizons. A dual-metric evaluation framework is employed, combining traditional predictive accuracy metrics with critical financial performance indicators such as returns, volatility, maximum drawdown, and the Sharpe ratio. Statistical validation through the Mann–Whitney U test ensures robust differentiation in model performance. The results highlight that model effectiveness varies significantly with forecasting horizon and market conditions: transformer-based models such as PatchTST excel at short-term forecasts, while simpler architectures demonstrate greater stability over extended periods. This research offers actionable insights for the development of AI-driven intelligent financial forecasting systems, enhancing risk-aware investment strategies and supporting practical applications in FinTech and smart financial analytics.
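The financial half of the dual-metric framework is straightforward to state in code; below is a sketch computing annualized return, volatility, Sharpe ratio, and maximum drawdown from daily closes, with a zero risk-free rate as a simplifying assumption.

```python
# Sketch: financial performance indicators from a series of daily closes.
import numpy as np

def financial_metrics(prices: np.ndarray, periods_per_year: int = 252):
    rets = np.diff(prices) / prices[:-1]                  # simple daily returns
    ann_ret = rets.mean() * periods_per_year
    ann_vol = rets.std(ddof=1) * np.sqrt(periods_per_year)
    sharpe = ann_ret / ann_vol                            # risk-free rate = 0
    cum = np.cumprod(1 + rets)
    drawdown = 1 - cum / np.maximum.accumulate(cum)       # peak-to-trough loss
    return ann_ret, ann_vol, sharpe, drawdown.max()

# Placeholder price path standing in for an index series.
prices = 100 * np.cumprod(1 + np.random.default_rng(0).normal(3e-4, 0.01, 500))
print(financial_metrics(prices))
```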

13 pages, 1700 KiB  
Article
A Simple Yet Powerful Hybrid Machine Learning Approach to Aid Decision-Making in Laboratory Experiments
by Bernardo Campos Diocaretz, Ágota Tűzesi and Andrei Herdean
Mach. Learn. Knowl. Extr. 2025, 7(3), 60; https://doi.org/10.3390/make7030060 - 25 Jun 2025
Abstract
High-dimensional experimental spaces and resource constraints challenge modern science. We introduce a hybrid machine-learning (ML) framework that combines Ordinary Least Squares (OLS) for global surface estimation, Gaussian Process (GP) regression for uncertainty modelling, expected improvement (EI) for active learning, and K-means clustering for diversifying conditions. We applied this approach to published growth-rate data of the diatom Thalassiosira pseudonana, originally measured across 25 phosphate–temperature conditions. Using the nutrient–temperature model as a simulator, our ML framework located the optimal growth conditions in only 25 virtual experiments—matching the original study’s outcome. Sensitivity analyses further revealed that fewer iterations and controlled batch sizes maintain accuracy even with higher data variability. This demonstrates that ML-guided experimentation can achieve expert-level decision-making without extensive prior data, reducing experimental burden while preserving rigour. Our results highlight the promise of algorithm-assisted experimentation in biology, agriculture, and medicine, marking a shift toward smarter, data-driven scientific workflows.
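A minimal sketch of the expected-improvement acquisition at the heart of the active-learning loop: given the Gaussian Process's posterior mean and standard deviation at candidate conditions, EI scores how much each candidate is expected to beat the best observed growth rate. The exploration margin xi is an assumption.

```python
# Sketch: expected improvement for picking the next growth condition.
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best, xi=0.01):
    """EI for maximization; mu/sigma are GP posterior moments at candidates."""
    sigma = np.maximum(sigma, 1e-12)          # guard against division by zero
    z = (mu - best - xi) / sigma
    return (mu - best - xi) * norm.cdf(z) + sigma * norm.pdf(z)

mu = np.array([0.8, 1.1, 0.95])               # predicted growth rates
sigma = np.array([0.05, 0.20, 0.10])          # GP posterior uncertainty
print(expected_improvement(mu, sigma, best=1.0))   # run argmax candidate next
```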

24 pages, 3832 KiB  
Article
Stitching History into Semantics: LLM-Supported Knowledge Graph Engineering for 19th-Century Greek Bookbinding
by Dimitrios Doumanas, Efthalia Ntalouka, Costas Vassilakis, Manolis Wallace and Konstantinos Kotis
Mach. Learn. Knowl. Extr. 2025, 7(3), 59; https://doi.org/10.3390/make7030059 - 24 Jun 2025
Abstract
Preserving cultural heritage can be efficiently supported by structured and semantic representation of historical artifacts. Bookbinding, a critical aspect of book history, provides valuable insights into past craftsmanship, material use, and conservation practices. However, existing bibliographic records often lack the depth needed to analyze bookbinding techniques, provenance, and preservation status. This paper presents a proof-of-concept system that explores how Large Language Models (LLMs) can support knowledge graph engineering within the context of 19th-century Greek bookbinding (1830–1900), generating a domain-specific ontology and a knowledge graph as a result. Our ontology encapsulates materials, binding techniques, artistic styles, and conservation history, integrating metadata standards like MARC and Dublin Core to ensure interoperability with existing library and archival systems. To validate its effectiveness, we construct a Neo4j knowledge graph based on the generated ontology and utilize Cypher queries—including LLM-generated queries—to extract insights about bookbinding practices and trends. This study also explores how semantic reasoning over the knowledge graph can identify historical binding patterns, assess book conservation needs, and infer relationships between bookbinding workshops. Unlike previous bibliographic ontologies, our approach provides a comprehensive, semantically rich representation of bookbinding history, methods, and techniques, supporting scholars, conservators, and cultural heritage institutions. By demonstrating how LLMs can assist in ontology/KG creation and query generation, we introduce and evaluate a semi-automated pipeline as a methodological demonstration for studying historical bookbinding, contributing to digital humanities, book conservation, and cultural informatics. Finally, the proposed approach can be applied in other domains, making it broadly applicable to knowledge engineering.
(This article belongs to the Special Issue Knowledge Graphs and Large Language Models)

26 pages, 1838 KiB  
Article
Machine Learning Product Line Engineering: A Systematic Reuse Framework
by Bedir Tekinerdogan
Mach. Learn. Knowl. Extr. 2025, 7(3), 58; https://doi.org/10.3390/make7030058 - 20 Jun 2025
Abstract
Machine Learning (ML) is increasingly applied across various domains, addressing tasks such as predictive analytics, anomaly detection, and decision-making. Many of these applications share similar underlying tasks, offering potential for systematic reuse. However, existing reuse in ML is often fragmented, small-scale, and ad hoc, focusing on isolated components such as pretrained models or datasets without a cohesive framework. Product Line Engineering (PLE) is a well-established approach for achieving large-scale systematic reuse in traditional engineering. It enables efficient management of core assets like requirements, models, and code across product families. However, traditional PLE is not designed to accommodate ML-specific assets—such as datasets, feature pipelines, and hyperparameters—and is not aligned with the iterative, data-driven workflows of ML systems. To address this gap, we propose Machine Learning Product Line Engineering (ML PLE), a framework that adapts PLE principles for ML systems. In contrast to conventional ML reuse methods such as transfer learning or fine-tuning, our framework introduces a systematic, variability-aware reuse approach that spans the entire lifecycle of ML development, including datasets, pipelines, models, and configuration assets. The proposed framework introduces the key requirements for ML PLE and the lifecycle process tailored to machine-learning-intensive systems. We illustrate the approach using an industrial case study in the context of space systems, where ML PLE is applied for data analytics of satellite missions.
(This article belongs to the Section Learning)
