Mach. Learn. Knowl. Extr., Volume 7, Issue 2 (June 2025) – 25 articles

17 pages, 2841 KiB  
Article
Machine Learning Models to Predict Individual Cognitive Load in Collaborative Learning: Combining fNIRS and Eye-Tracking Data
by Wenli Chen, Zirou Lin, Lishan Zheng, Mei-Yee Mavis Ho, Farhan Ali and Wei Peng Teo
Mach. Learn. Knowl. Extr. 2025, 7(2), 51; https://doi.org/10.3390/make7020051 - 6 Jun 2025
Abstract
Effectively leveraging cognitive load predictions helps optimize collaborative learning design and implementation. This study explored the feasibility of predicting individual learners’ cognitive load during collaborative learning using a combination of functional near-infrared spectroscopy (fNIRS) and eye-tracking data. A total of 188 valid collaborative events collected from 78 graduate students who engaged in three collaborative ideation tasks were analyzed using various machine learning algorithms to classify cognitive load levels. Nine features, derived from both fNIRS and eye-tracking data, were used as input for the models. Results demonstrated that machine learning models could accurately predict individual cognitive load, with the Random Forest model achieving the highest performance (F1 score = 0.84). Furthermore, the integration of fNIRS and eye-tracking data significantly enhanced predictive performance, with the multimodal model achieving an F1 score of 0.87, outperforming the eye-tracking-only model (F1 = 0.79) by 8 percentage points and the fNIRS-only model (F1 = 0.68) by 19 percentage points. Analysis of feature importance revealed that “Total Fixation Duration”, “Average Inter-Fixation Degree”, and prefrontal cortex activity were among the strongest predictors of learners’ cognitive load. These findings have implications for understanding cognitive load dynamics and designing effective collaborative learning environments and human–computer interfaces.

18 pages, 2759 KiB  
Article
Simulated Annealing-Based Hyperparameter Optimization of a Convolutional Neural Network for MRI Brain Tumor Classification
by Sofia El Amoury, Youssef Smili and Youssef Fakhri
Mach. Learn. Knowl. Extr. 2025, 7(2), 50; https://doi.org/10.3390/make7020050 - 31 May 2025
Abstract
Brain tumor classification poses significant challenges in medical imaging, largely due to the heterogeneity and structural complexity of tumors. With Magnetic Resonance Imaging (MRI) serving as a cornerstone for diagnosis, manual interpretation by radiologists is time-consuming and prone to inter-observer variability. Recent advances in deep learning, particularly through the application of Convolutional Neural Networks (CNNs), have transformed medical image analysis by enabling automated, high-accuracy feature extraction. Despite their promise, the performance of CNNs is highly contingent upon optimal hyperparameter tuning, a process that can be both computationally demanding and pivotal for model efficacy. In this study, we employ Simulated Annealing (SA), a probabilistic metaheuristic technique, to methodically optimize the hyperparameters of a CNN architecture designed specifically for classifying brain tumors from MRI scans. Our approach employs a direct representation of hyperparameters alongside an efficient perturbation strategy, facilitating a comprehensive exploration of the parameter space. Experimental evaluations conducted on an extensive MRI dataset (N = 7023 scans classified into glioma, meningioma, no tumor and pituitary) demonstrate that our SA-optimized CNN model achieves a validation accuracy of 98.15%, thereby affirming the potential of SA in enhancing the performance of deep learning systems in medical diagnostics. These findings underscore the critical role of advanced hyperparameter optimization techniques in improving diagnostic accuracy and robustness, ultimately contributing to the development of more reliable and efficient brain tumor classification systems in clinical settings.
(This article belongs to the Special Issue Advances in Machine and Deep Learning)
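The annealing loop that the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation: the search space (`log10_lr`, `dropout`), the bounds, the geometric cooling schedule, and `toy_validation_error` (a stand-in for training the CNN and returning its validation error) are all assumptions chosen for illustration.

```python
import math
import random


def simulated_annealing(objective, initial, perturb, *, t_start=1.0, t_end=1e-3,
                        cooling=0.95, steps_per_temp=20, seed=0):
    """Minimize `objective` over hyperparameter dicts with simulated annealing."""
    rng = random.Random(seed)
    current, current_cost = initial, objective(initial)
    best, best_cost = current, current_cost
    temperature = t_start
    while temperature > t_end:
        for _ in range(steps_per_temp):
            candidate = perturb(current, rng)
            cost = objective(candidate)
            delta = cost - current_cost
            # Always accept improvements; accept worse moves with
            # probability exp(-delta / T) (Metropolis criterion).
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                current, current_cost = candidate, cost
                if cost < best_cost:
                    best, best_cost = candidate, cost
        temperature *= cooling  # geometric cooling schedule (an assumption)
    return best, best_cost


# Hypothetical search space: log10 of the learning rate and the dropout rate.
BOUNDS = {"log10_lr": (-5.0, -1.0), "dropout": (0.0, 0.8)}


def perturb(params, rng):
    """Nudge one randomly chosen hyperparameter, clipped to its bounds."""
    new = dict(params)
    key = rng.choice(sorted(new))
    low, high = BOUNDS[key]
    step = 0.1 * (high - low)
    new[key] = min(high, max(low, new[key] + rng.uniform(-step, step)))
    return new


def toy_validation_error(params):
    """Stand-in for 'train the CNN and return 1 - validation accuracy'."""
    return (params["log10_lr"] + 3.0) ** 2 + (params["dropout"] - 0.3) ** 2


start = {"log10_lr": -1.5, "dropout": 0.7}
best, best_cost = simulated_annealing(toy_validation_error, start, perturb)
```

In a real setting each call to the objective would be a full training run, so the number of temperature levels and steps per level would have to be far smaller than in this toy example.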

26 pages, 4439 KiB  
Article
Using N-Version Architectures for Railway Segmentation with Deep Neural Networks
by Philipp Jaß and Carsten Thomas
Mach. Learn. Knowl. Extr. 2025, 7(2), 49; https://doi.org/10.3390/make7020049 - 26 May 2025
Abstract
Autonomous trains require reliable and accurate environmental perception to take over safety-critical tasks from the driver. This paper investigates the application of N-version architectures to rail track detection using Deep Neural Networks (DNNs) as a means to improve the safety of machine learning (ML)-enabled perception systems. We combine three different neural network architectures (WCID, VGG16-UNet, MobileNet–SegNet) in a 3M1I configuration. In this configuration, we apply two fusion methods to increase accuracy and to enable error detection: Maximum Confidence Voting (MCV), combining the DNN predictions at the image level, and Pixel Majority Voting (PMV), a novel approach for combining the predictions at the pixel level. In addition, we implement a new method for evaluating and combining prediction confidence values in the N-version architecture during runtime. We adjust the overall prediction confidence according to the conformity of all individual predictions, which is not possible with an individual network. Our results show that the N-version architecture not only enables the detection of erroneous predictions by utilizing those adjusted confidence values, but can also partially improve the predictions by using the PMV combination algorithm. This work emphasizes the importance of model diversity and appropriate thresholds for an accurate assessment of prediction safety. These approaches can significantly improve the practical applicability of ML-based systems in safety-critical domains such as rail transportation.
(This article belongs to the Section Visualization)
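One plausible reading of pixel-level majority voting is sketched below. The abstract does not give the paper's exact PMV algorithm or its confidence adjustment, so the function name, the binary-mask representation, and the unanimity-based agreement score are assumptions made for illustration only.

```python
def pixel_majority_vote(masks):
    """Fuse an odd number of same-shaped binary masks by per-pixel majority.

    Returns the fused mask and the fraction of pixels on which all
    input masks agree (a simple, assumed proxy for conformity).
    """
    n = len(masks)
    if n % 2 == 0:
        raise ValueError("use an odd number of masks to avoid ties")
    height, width = len(masks[0]), len(masks[0][0])
    fused = [[0] * width for _ in range(height)]
    unanimous = 0
    for r in range(height):
        for c in range(width):
            votes = sum(mask[r][c] for mask in masks)
            fused[r][c] = 1 if votes * 2 > n else 0
            if votes in (0, n):
                unanimous += 1
    agreement = unanimous / (height * width)
    return fused, agreement


# Three hypothetical 2 x 3 rail-track masks (1 = track pixel).
mask_a = [[0, 1, 1], [0, 1, 0]]
mask_b = [[0, 1, 0], [0, 1, 0]]
mask_c = [[1, 1, 1], [0, 0, 0]]
fused, agreement = pixel_majority_vote([mask_a, mask_b, mask_c])
```

A low agreement value could serve as a trigger for flagging a prediction as unreliable, which is the kind of error detection an individual network cannot provide on its own.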

24 pages, 2044 KiB  
Article
Bregman–Hausdorff Divergence: Strengthening the Connections Between Computational Geometry and Machine Learning
by Tuyen Pham, Hana Dal Poz Kouřimská and Hubert Wagner
Mach. Learn. Knowl. Extr. 2025, 7(2), 48; https://doi.org/10.3390/make7020048 - 26 May 2025
Abstract
The purpose of this paper is twofold. On the technical side, we propose an extension of the Hausdorff distance from metric spaces to spaces equipped with asymmetric distance measures. Specifically, we focus on extending it to the family of Bregman divergences, which includes the popular Kullback–Leibler divergence (also known as relative entropy). The resulting dissimilarity measure is called a Bregman–Hausdorff divergence and compares two collections of vectors without assuming any pairing or alignment between their elements. We propose new algorithms for computing Bregman–Hausdorff divergences based on a recently developed Kd-tree data structure for nearest neighbor search with respect to Bregman divergences. The algorithms are surprisingly efficient even for large inputs with hundreds of dimensions. As a benchmark, we use the new divergence to compare two collections of probabilistic predictions produced by different machine learning models trained using the relative entropy loss. In addition to the introduction of this technical concept, we provide a survey. It outlines the basics of Bregman geometry and motivates the Kullback–Leibler divergence using concepts from information theory. We also describe computational geometric algorithms that have been extended to this geometry, focusing on algorithms relevant for machine learning.
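To make the idea concrete, the sketch below computes a directed Hausdorff-style quantity with the Kullback–Leibler divergence in place of a metric. It rests on assumptions: the abstract does not state the paper's exact definition or argument order, and a brute-force double loop is used here instead of the Kd-tree search the authors describe. The names `kl_divergence` and `directed_hausdorff_kl` are hypothetical.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; q must be strictly positive."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def directed_hausdorff_kl(P, Q):
    """max over p in P of min over q in Q of KL(p || q), by brute force.

    One possible (assumed) way to lift the directed Hausdorff distance
    to an asymmetric divergence; no pairing between P and Q is needed.
    """
    return max(min(kl_divergence(p, q) for q in Q) for p in P)


# Two small collections of probability vectors (e.g., model predictions).
P = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
Q = [[0.6, 0.3, 0.1], [0.2, 0.2, 0.6]]
forward = directed_hausdorff_kl(P, Q)
backward = directed_hausdorff_kl(Q, P)
```

Because the underlying divergence is asymmetric, `forward` and `backward` generally differ, which is why the direction of comparison has to be stated explicitly.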

21 pages, 3561 KiB  
Article
Artificial Intelligence Meets Bioequivalence: Using Generative Adversarial Networks for Smarter, Smaller Trials
by Anastasios Nikolopoulos and Vangelis D. Karalis
Mach. Learn. Knowl. Extr. 2025, 7(2), 47; https://doi.org/10.3390/make7020047 - 23 May 2025
Abstract
This study introduces artificial intelligence as a powerful tool to transform bioequivalence (BE) trials. We apply advanced generative models, specifically Wasserstein Generative Adversarial Networks (WGANs), to create virtual subjects and reduce the need for real human participants in generic drug assessment. Although BE studies typically involve small sample sizes (usually 24 subjects), which may limit the use of AI-generated populations, our findings show that these models can successfully overcome this challenge. To show the utility of generative AI algorithms in BE testing, this study applied Monte Carlo simulations of 2 × 2 crossover BE trials, combined with WGANs. After training the WGAN model, several scenarios were explored, including sample size, the proportion of subjects used for the synthesis of virtual subjects, and variabilities. The performance of the AI-synthesized populations was tested in two ways: (a) by assessing the similarity of the performance with the actual population, and (b) by evaluating the statistical power achieved, which aimed to be as high as that of the entire original population. The results demonstrated that WGANs could generate virtual populations with BE acceptance percentages and similarity levels that matched or exceeded those of the original population. This approach proved effective across various scenarios, enhancing BE study sample sizes, reducing costs, and shortening trial durations. This study highlights the potential of WGANs to improve data augmentation and optimize subject recruitment in BE studies.
(This article belongs to the Section Network)

22 pages, 2244 KiB  
Article
Revolutionizing Cardiac Risk Assessment: AI-Powered Patient Segmentation Using Advanced Machine Learning Techniques
by Joan D. Gonzalez-Franco, Alejandro Galaviz-Mosqueda, Salvador Villarreal-Reyes, Jose E. Lozano-Rizk, Raul Rivera-Rodriguez, Jose E. Gonzalez-Trejo, Alexei-Fedorovish Licea-Navarro, Jorge Lozoya-Arandia and Edgar A. Ibarra-Flores
Mach. Learn. Knowl. Extr. 2025, 7(2), 46; https://doi.org/10.3390/make7020046 - 22 May 2025
Abstract
Cardiovascular diseases stand as the leading cause of mortality worldwide, underscoring the urgent need for effective tools that enable early detection and monitoring of at-risk patients. This study combines Artificial Intelligence (AI) techniques, specifically the k-means clustering algorithm, with dimensionality reduction methods like Principal Component Analysis (PCA) and Uniform Manifold Approximation and Projection (UMAP) to identify patient groups with varying levels of heart attack risk. We used a publicly available clinical dataset with 1319 patient records, which included variables such as age, gender, blood pressure, glucose levels, Creatine Kinase MB (CK-MB, abbreviated KCM), and troponin levels. We normalized and prepared the data, then employed PCA and UMAP to reduce dimensionality and facilitate visualization. Using the k-means algorithm, we segmented the patients into distinct groups based on their clinical features. Our analysis revealed two distinct patient groups. Group 2 exhibited significantly higher levels of troponin (mean 0.4761 ng/mL), KCM (18.65 ng/mL), and glucose (mean 148.19 mg/dL) and was predominantly composed of men (97%). These factors indicate an increased risk of cardiac events compared to Group 1, which had lower levels of these biomarkers and a slightly higher average age. Interestingly, no significant differences in blood pressure were observed between the groups. This study demonstrates the effectiveness of combining Machine Learning (ML) techniques with dimensionality reduction methods to enhance risk stratification accuracy in cardiology. By enabling more targeted interventions for high-risk patients, our unsupervised segmentation approach, which focuses on intrinsic data patterns rather than predefined diagnostic labels, serves as a powerful complement to traditional risk assessment tools.
(This article belongs to the Special Issue Sustainable Applications for Machine Learning)

14 pages, 1580 KiB  
Article
Machine Learning Classification of Fossilized Pectinodon bakkeri Teeth Images: Insights into Troodontid Theropod Dinosaur Morphology
by Jacob Bahn, Germán H. Alférez and Keith Snyder
Mach. Learn. Knowl. Extr. 2025, 7(2), 45; https://doi.org/10.3390/make7020045 - 21 May 2025
Abstract
Although the manual classification of microfossils is possible, it can become burdensome. Machine learning offers an alternative that allows for automatic classification. Our contribution is to use machine learning to develop an automated approach for classifying images of Pectinodon bakkeri teeth. This can be expanded for use with many other species. Our approach is composed of two steps. First, PCA and K-means were applied to a numerical dataset with 459 samples collected at the Hanson Ranch Bonebed in eastern Wyoming, containing the following features: crown height, fore-aft basal length, basal width, anterior denticles, and posterior denticles per millimeter. The results obtained in this step were used to automatically organize the P. bakkeri images from two of the three generated clusters. Second, the tooth images were used to train a convolutional neural network with two classes. The model has an accuracy of 71%, a precision of 71%, a recall of 70.5%, and an F1-score of 70.5%.
(This article belongs to the Special Issue Deep Learning in Image Analysis and Pattern Recognition, 2nd Edition)

20 pages, 3299 KiB  
Article
Quantum-Inspired Models for Classical Time Series
by Zoltán Udvarnoki and Gábor Fáth
Mach. Learn. Knowl. Extr. 2025, 7(2), 44; https://doi.org/10.3390/make7020044 - 21 May 2025
Abstract
We present a model of classical binary time series derived from a matrix product state (MPS) Ansatz widely used in one-dimensional quantum systems. We discuss how this quantum Ansatz allows us to generate classical time series in a sequential manner. Our time series are built in two steps: First, a lower-level series (the driving noise or the increments) is created directly from the MPS representation, which is then integrated to create our ultimate higher-level series. The lower- and higher-level series have clear interpretations in the quantum context, and we elaborate on this correspondence with specific examples such as the spin-1/2 Ising model in a transverse field (ITF model), where spin configurations correspond to the increments of discrete-time, discrete-level stochastic processes with finite or infinite autocorrelation lengths, Gaussian or non-Gaussian limit distributions, nontrivial Hurst exponents, multifractality, asymptotic self-similarity, etc. Our time series model is a parametric model, and we investigate how flexible the model is in some synthetic and real-life calibration problems.
(This article belongs to the Section Data)
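The two-step construction in the abstract (binary increments first, then integration) can be sketched as follows. This example only illustrates the integration step: the increments here are independent random signs drawn from Python's `random` module, which is a deliberate simplification and not the MPS-based generator of the paper.

```python
import random
from itertools import accumulate

rng = random.Random(0)

# Lower-level series: binary increments (spin-like values -1 or +1).
# Assumption: i.i.d. signs replace the correlated MPS-generated increments.
increments = [rng.choice((-1, 1)) for _ in range(10)]

# Higher-level series: running sum of the increments
# (a discrete-time, discrete-level path).
series = list(accumulate(increments))
```

With correlated increments, such as those produced by an MPS, the same cumulative sum would yield paths with the nontrivial scaling properties listed in the abstract.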

19 pages, 488 KiB  
Article
Membership Inference Attacks Fueled by Few-Shot Learning to Detect Privacy Leakage and Address Data Integrity
by Daniel Jiménez-López, Nuria Rodríguez-Barroso, M. Victoria Luzón, Javier Del Ser and Francisco Herrera
Mach. Learn. Knowl. Extr. 2025, 7(2), 43; https://doi.org/10.3390/make7020043 - 20 May 2025
Abstract
Deep learning models have an intrinsic privacy issue as they memorize parts of their training data, creating privacy leakage. Membership inference attacks (MIAs) exploit this to obtain confidential information about the data used for training, aiming to steal information. They can be repurposed as a measurement of data integrity by inferring whether the data were used to train a machine learning model. While state-of-the-art attacks achieve significant privacy leakage, their requirements render them infeasible, hindering their use as practical tools to assess the magnitude of the privacy risk. Moreover, the most appropriate evaluation metric of MIA, the true positive rate at a low false positive rate, lacks interpretability. We claim that the incorporation of few-shot learning techniques into the MIA field and a suitable qualitative and quantitative privacy evaluation measure should resolve these issues. In this context, our proposal is twofold. We propose a few-shot learning-based MIA, termed the FeS-MIA model, which eases the evaluation of the privacy breach of a deep learning model by significantly reducing the amount of resources required for this purpose. Furthermore, we propose an interpretable quantitative and qualitative measure of privacy, referred to as the Log-MIA measure. Jointly, these proposals provide new tools to assess privacy leakages and to ease the evaluation of the training data integrity of deep learning models, i.e., to analyze the privacy breach of a deep learning model. Experiments carried out with MIA over image classification and language modeling tasks, and a comparison to the state of the art, show that our proposals excel in identifying privacy leakages in a deep learning model with little extra information.
(This article belongs to the Section Privacy)
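The baseline metric the abstract refers to, the true positive rate at a low false positive rate, can be computed as in the sketch below. The attack scores are made-up illustration data and `tpr_at_fpr` is a hypothetical helper; the paper's Log-MIA measure is not reproduced here.

```python
def tpr_at_fpr(member_scores, nonmember_scores, max_fpr=0.1):
    """Best true positive rate over thresholds whose FPR stays <= max_fpr.

    A sample is predicted 'member' when its attack score >= threshold.
    """
    best_tpr = 0.0
    for threshold in sorted(set(member_scores) | set(nonmember_scores)):
        false_pos = sum(s >= threshold for s in nonmember_scores)
        fpr = false_pos / len(nonmember_scores)
        if fpr <= max_fpr:
            true_pos = sum(s >= threshold for s in member_scores)
            best_tpr = max(best_tpr, true_pos / len(member_scores))
    return best_tpr


# Hypothetical attack scores (higher = more likely a training member).
members = [0.95, 0.90, 0.85, 0.60, 0.40]
nonmembers = [0.92, 0.55, 0.50, 0.30, 0.20, 0.15, 0.10, 0.08, 0.05, 0.02]

tpr_10 = tpr_at_fpr(members, nonmembers, max_fpr=0.1)
tpr_0 = tpr_at_fpr(members, nonmembers, max_fpr=0.0)
```

The single number returned depends heavily on the chosen false positive budget, which hints at why the authors consider this metric hard to interpret on its own.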

18 pages, 1578 KiB  
Article
Leveraging Failure Modes and Effect Analysis for Technical Language Processing
by Mathieu Payette, Georges Abdul-Nour, Toualith Jean-Marc Meango, Miguel Diago and Alain Côté
Mach. Learn. Knowl. Extr. 2025, 7(2), 42; https://doi.org/10.3390/make7020042 - 9 May 2025
Abstract
With the evolution of data collection technologies, sensor-generated data have become the norm. However, decades of manually recorded maintenance data still hold untapped value. Natural Language Processing (NLP) offers new ways to extract insights from these historical records, especially from short, unstructured maintenance texts often accompanying structured database fields. While NLP has shown promise in this area, technical texts pose unique challenges, particularly in preprocessing and manual annotation. This study proposes a novel methodology integrating Failure Mode and Effect Analysis (FMEA), a reliability engineering tool, into the NLP pipeline to enhance Named Entity Recognition (NER) in maintenance records. By leveraging the structured and domain-specific knowledge encapsulated in FMEAs, the annotation process becomes more systematic, reducing the need for exhaustive manual effort. A case study using real-world data from a major electrical utility demonstrates the effectiveness of this approach. The custom NER model, trained using FMEA-informed annotations, achieves high precision, recall, and F1 scores, successfully identifying key reliability elements in maintenance text. The integration of FMEA not only improves data quality but also supports more informed asset management decisions. This research introduces a novel cross-disciplinary framework combining reliability engineering and NLP. It highlights how domain expertise can be used to streamline annotation, improve model accuracy, and unlock actionable insights from legacy maintenance data.
(This article belongs to the Section Data)

24 pages, 4213 KiB  
Article
Automated Grading Through Contrastive Learning: A Gradient Analysis and Feature Ablation Approach
by Mateo Sokač, Mario Fabijanić, Igor Mekterović and Leo Mršić
Mach. Learn. Knowl. Extr. 2025, 7(2), 41; https://doi.org/10.3390/make7020041 - 29 Apr 2025
Abstract
As programming education becomes increasingly complex, grading student code has become a challenging task. Traditional methods, such as dynamic and static analysis, offer foundational approaches but often fail to provide granular insights, leading to inconsistencies in grading and feedback. This study addresses the limitations of these methods by integrating contrastive learning with explainable AI techniques to assess SQL code submissions. We employed contrastive learning to differentiate between student and correct SQL solutions, projecting them into a high-dimensional latent space, and used the Frobenius norm to measure the distance between these representations. This distance was used to predict the percentage of points deducted from each student’s solution. To enhance interpretability, we implemented feature ablation and integrated gradients, which provide insights into the specific tokens in student code that impact the grading outcomes. Our findings indicate that this approach improves the accuracy, consistency, and transparency of automated grading, aligning more closely with human grading standards. The results suggest that this framework could be a valuable tool for automated programming assessment systems, offering clear, actionable feedback and making machine learning models in educational contexts more interpretable and effective.
(This article belongs to the Section Learning)
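The distance step mentioned in the abstract can be sketched in a few lines. The matrices below are made-up stand-ins for latent representations of a student query and a reference query; how the paper produces those embeddings and maps the distance to a point deduction is not specified in the abstract, so only the Frobenius norm itself is shown.

```python
import math


def frobenius_distance(a, b):
    """Frobenius norm of (a - b) for two equally shaped matrices.

    Matrices are given as lists of rows; the result is the square root
    of the sum of squared element-wise differences.
    """
    return math.sqrt(
        sum((x - y) ** 2
            for row_a, row_b in zip(a, b)
            for x, y in zip(row_a, row_b))
    )


# Hypothetical 2 x 2 embeddings of a student and a reference SQL solution.
student = [[1.0, 2.0], [0.0, 1.0]]
reference = [[1.0, -1.0], [4.0, 1.0]]

distance = frobenius_distance(student, reference)
```

Identical representations give a distance of zero, so a larger value would correspond to a submission that sits further from the correct solution in the latent space.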

36 pages, 11592 KiB  
Article
A Novel Approach Based on Hypergraph Convolutional Neural Networks for Cartilage Shape Description and Longitudinal Prediction of Knee Osteoarthritis Progression
by John B. Theocharis, Christos G. Chadoulos and Andreas L. Symeonidis
Mach. Learn. Knowl. Extr. 2025, 7(2), 40; https://doi.org/10.3390/make7020040 - 26 Apr 2025
Abstract
Knee osteoarthritis (KOA) is a highly prevalent musculoskeletal joint disorder affecting a significant portion of the population worldwide. Accurate predictions of KOA progression can assist clinicians in devising preventive strategies for patients. In this paper, we present an integrated approach based on hypergraph convolutional networks (HGCNs) for longitudinal predictions of KOA grades and progressions from MRI images. We propose two novel models, namely, the C_Shape.Net and the predictor network. The C_Shape.Net operates on a hypergraph of volumetric nodes, specially designed to represent the surface and volumetric features of the cartilage. It encompasses deep HGCN convolutions, graph pooling, and readout operations in a hierarchy of layers, providing, at the output, expressive 3D shape descriptors of the cartilage volume. The predictor is a spatio-temporal HGCN network (ST_HGCN), following the sequence-to-sequence learning scheme. Concretely, it transforms sequences of knee representations at the historical stage into sequences of KOA predictions at the prediction stage. The predictor includes spatial HGCN convolutions, attention-based temporal fusion of feature embeddings at multiple layers, and a transformer module that generates longitudinal predictions at follow-up times. We present comprehensive experiments on the Osteoarthritis Initiative (OAI) cohort to evaluate the performance of our methodology for various tasks, including node classification, longitudinal KL grading, and progression. The main finding of the experiments is that the larger the depth of the historical stage, the higher the accuracy of the obtained predictions in all tasks. For the maximum historic depth of four years, our method yielded an average balanced accuracy (BA) of 85.94% in KOA grading, and accuracies of 91.89% (+1), 88.11% (+2), 84.35% (+3), and 79.41% (+4) for the four consecutive follow-up visits. Under the same setting, we also achieved an average value of Area Under Curve (AUC) of 0.94 for the prediction of progression incidence, and follow-up AUC values of 0.81 (+1), 0.77 (+2), 0.73 (+3), and 0.68 (+4), respectively.
(This article belongs to the Section Network)

19 pages, 18858 KiB  
Article
PIDQA—Question Answering on Piping and Instrumentation Diagrams
by Mohit Gupta, Chialing Wei, Thomas Czerniawski and Ricardo Eiris
Mach. Learn. Knowl. Extr. 2025, 7(2), 39; https://doi.org/10.3390/make7020039 - 21 Apr 2025
Abstract
This paper introduces a novel framework enabling natural language question answering on Piping and Instrumentation Diagrams (P&IDs), addressing a critical gap between engineering design documentation and intuitive information retrieval. Our approach transforms static P&IDs into queryable knowledge bases through a three-stage pipeline. First, we recognize entities in a P&ID image and organize their relationships to form a base entity graph. Second, this entity graph is converted into a Labeled Property Graph (LPG), enriched with semantic attributes for nodes and edges. Third, a Large Language Model (LLM)-based information retrieval system translates a user query into a graph query language (Cypher) and retrieves the answer by executing it on the LPG. For our experiments, we augmented a publicly available P&ID image dataset with our novel PIDQA dataset, which comprises 64,000 question–answer pairs spanning four categories: (I) simple counting, (II) spatial counting, (III) spatial connections, and (IV) value-based questions. Our experiments (using gpt-3.5-turbo) demonstrate that grounding the LLM with dynamic few-shot sampling robustly elevates accuracy by 10.6–43.5% over schema contextualization alone, even under high lexical diversity conditions (e.g., paraphrasing, ambiguity). By reducing barriers in retrieving P&ID data, this work advances human–AI collaboration for industrial workflows in design validation and safety audits.
(This article belongs to the Section Visualization)

27 pages, 2387 KiB  
Systematic Review
Knowledge Graphs and Their Reciprocal Relationship with Large Language Models
by Ramandeep Singh Dehal, Mehak Sharma and Enayat Rajabi
Mach. Learn. Knowl. Extr. 2025, 7(2), 38; https://doi.org/10.3390/make7020038 - 21 Apr 2025
Abstract
The reciprocal relationship between Large Language Models (LLMs) and Knowledge Graphs (KGs) highlights their synergistic potential in enhancing artificial intelligence (AI) applications. LLMs, with their natural language understanding and generative capabilities, support the automation of KG construction through entity recognition, relation extraction, and schema generation. Conversely, KGs serve as structured and interpretable data sources that improve the transparency, factual consistency and reliability of LLM-based applications, mitigating challenges such as hallucinations and lack of explainability. This study conducts a systematic literature review of 77 studies to examine AI methodologies supporting LLM–KG integration, including symbolic AI, machine learning, and hybrid approaches. The research explores diverse applications spanning healthcare, finance, justice, and industrial automation, revealing the transformative potential of this synergy. Through in-depth analysis, this study identifies key limitations in current approaches, including scalability challenges in maintaining dynamic, real-time Knowledge Graphs, difficulty in adapting general-purpose LLMs to specialized domains, limited explainability in tracing model outputs to interpretable reasoning, and ethical concerns surrounding bias, fairness, and transparency. In response, the study highlights potential strategies to optimize LLM–KG synergy. The findings from this study provide actionable insights for researchers and practitioners aiming for robust, transparent, and adaptive AI systems to enhance knowledge-driven AI applications through LLM–KG integration, further advancing generative AI and explainable AI (XAI) applications.
(This article belongs to the Section Data)

22 pages, 17414 KiB  
Article
Advancing Particle Tracking: Self-Organizing Map Hyperparameter Study and Long Short-Term Memory-Based Outlier Detection
by Max Klein, Niklas Dormagen, Lukas Wimmer, Markus H. Thoma and Mike Schwarz
Mach. Learn. Knowl. Extr. 2025, 7(2), 37; https://doi.org/10.3390/make7020037 - 17 Apr 2025
Viewed by 363
Abstract
Particle tracking velocimetry (PTV) forms the basis for many fluid dynamic experiments, in which individual particles are tracked across multiple successive images. However, when the experimental setup involves high-speed, high-density particles that are indistinguishable and follow complex or unknown flow fields, matching particles between images becomes significantly more challenging. Reliable PTV algorithms are crucial in such scenarios. Previous work has demonstrated that the Self-Organizing Map (SOM) machine learning approach offers superior outcomes on complex-plasma data compared with traditional methods, though its performance is sensitive to hyperparameter calibration, which requires optimization for specific flow scenarios. In this article, we describe how the dependence of the various hyperparameters on different flow scenarios was studied and the optimal settings for diverse flow conditions were identified. Based on these results, automatic hyperparameter calibration was implemented in the PTV framework. Furthermore, the SOM’s performance was directly compared with that of the preceding conventional PTV method, Trackpy, for complex plasmas using synthetic data. Finally, as a new approach to identifying incorrectly matched particle traces, a Long Short-Term Memory (LSTM) neural network was developed to sort out all inaccuracies to further improve the outcome. Combined with automatic hyperparameter calibration, outlier detection and additional computational speed optimization, this work delivers a robust, versatile and efficient framework for PTV analysis. Full article
(This article belongs to the Section Network)

12 pages, 1766 KiB  
Article
Machine-Learned Codes from EHR Data Predict Hard Outcomes Better than Human-Assigned ICD Codes
by Ying Yin, Yijun Shao, Phillip Ma, Qing Zeng-Treitler and Stuart J. Nelson
Mach. Learn. Knowl. Extr. 2025, 7(2), 36; https://doi.org/10.3390/make7020036 - 17 Apr 2025
Viewed by 425
Abstract
We used machine learning (ML) to characterize 894,154 medical records of outpatient visits from the Veterans Administration Central Data Warehouse (VA CDW) by the likelihood of assignment of 200 International Classification of Diseases (ICD) code blocks. Using four different predictive models, we found the ML-derived predictions for the code blocks were consistently more effective in predicting death or 90-day rehospitalization than the assigned code block in the record. We reviewed records of ICD chapter assignments. The review revealed that the ML-predicted chapter assignments were consistently better than those humanly assigned. Impact factor analysis, a method of explanation of AI findings that was developed in our group, demonstrated little effect on any one assigned ICD code block but a marked impact on the ML-derived code blocks of kidney disease as well as several other morbidities. In this study, machine learning was much better than human code assignment at predicting the relatively rare outcomes of death or rehospitalization. Future work will address generalizability using other datasets, as well as addressing coding that is more nuanced than that of the categorization provided by code blocks. Full article
(This article belongs to the Section Learning)

16 pages, 1816 KiB  
Article
ADTime: Adaptive Multivariate Time Series Forecasting Using LLMs
by Jinglei Pei, Yang Zhang, Ting Liu, Jingbin Yang, Qinghua Wu and Kang Qin
Mach. Learn. Knowl. Extr. 2025, 7(2), 35; https://doi.org/10.3390/make7020035 - 17 Apr 2025
Viewed by 641
Abstract
Large language models (LLMs) have recently demonstrated notable performance, particularly in addressing the challenge of extensive data requirements when training traditional forecasting models. However, these methods encounter significant challenges when applied to high-dimensional and domain-specific datasets. These challenges primarily arise from an inability to effectively model inter-variable dependencies and capture variable-specific characteristics, leading to suboptimal performance in complex forecasting scenarios. To address these limitations, we propose ADTime, an adaptive LLM-based approach for multivariate time series forecasting. ADTime employs advanced preprocessing techniques to identify latent relationships among key variables and temporal features. Additionally, it integrates temporal alignment mechanisms and prompt-based strategies to enhance the semantic understanding of forecasting tasks by LLMs. Experimental results show that ADTime outperforms state-of-the-art methods, reducing MSE by 9.5% and MAE by 6.1% on public datasets, and by 17.1% and 13.5% on domain-specific datasets. Furthermore, zero-shot experiments on real-world refinery datasets demonstrate that ADTime exhibits stronger generalization capabilities across various transfer scenarios. These findings highlight the potential of ADTime in advancing complex, domain-specific time series forecasting tasks. Full article
(This article belongs to the Section Learning)

13 pages, 3467 KiB  
Article
Pattern Matching-Based Denoising for Images with Repeated Sub-Structures
by Anil Kumar Mysore Badarinarayana, Christoph Pratsch, Thomas Lunkenbein and Florian Jug
Mach. Learn. Knowl. Extr. 2025, 7(2), 34; https://doi.org/10.3390/make7020034 - 7 Apr 2025
Viewed by 423
Abstract
In electron microscopy, obtaining low-noise images is often difficult, especially when examining biological samples or delicate materials. Therefore, the suppression of noise is essential for the analysis of such noisy images. State-of-the-art image denoising methods are dominated by supervised convolutional neural network (CNN)-based methods. However, supervised CNNs cannot be used if a noise-free ground truth is unavailable. To address this problem, we propose a method that uses re-occurring patterns in images. Our proposed method does not require noise-free images for the denoising task. Instead, it is based on the idea that averaging images that share the same signal but carry independent noise suppresses the overall noise. In order to evaluate the performance of our method, we compare our results with other state-of-the-art denoising methods that do not require a noise-free image. We show that our method is the best for retaining fine image structures. Additionally, we develop a confidence map for evaluating the denoising quality of the proposed method. Furthermore, we analyze the time complexity of the algorithm to ensure scalability and optimize the algorithm to improve the runtime efficiency. Full article
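The averaging idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the pattern matching step is skipped entirely, and the synthetic signal, the Gaussian noise model, and the names `average_patches` and `demo` are assumptions made for illustration. With N aligned copies of the same signal and independent zero-mean noise, the noise standard deviation of the average falls by roughly a factor of sqrt(N).

```python
import random
import statistics


def average_patches(patches):
    """Average equally sized patches pixel by pixel."""
    n = len(patches)
    return [sum(pixels) / n for pixels in zip(*patches)]


def demo(num_patches=64, length=2000, sigma=1.0, seed=0):
    """Return (noise std of one patch, noise std of the averaged patch)."""
    rng = random.Random(seed)
    # Hypothetical repeated sub-structure: the same signal in every patch.
    signal = [float(i % 7) for i in range(length)]
    # Each patch carries its own independent Gaussian noise (an assumption).
    patches = [
        [s + rng.gauss(0.0, sigma) for s in signal]
        for _ in range(num_patches)
    ]
    averaged = average_patches(patches)
    residual_single = [p - s for p, s in zip(patches[0], signal)]
    residual_avg = [a - s for a, s in zip(averaged, signal)]
    return statistics.pstdev(residual_single), statistics.pstdev(residual_avg)


if __name__ == "__main__":
    single, avg = demo()
    print(f"noise std, single patch: {single:.3f}")
    print(f"noise std, 64-patch average: {avg:.3f}")
```

With 64 patches the residual noise should come out near sigma / 8, which is why finding many occurrences of the same sub-structure matters for this family of methods.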

22 pages, 6363 KiB  
Article
Optimisation-Based Feature Selection for Regression Neural Networks Towards Explainability
by Georgios I. Liapis, Sophia Tsoka and Lazaros G. Papageorgiou
Mach. Learn. Knowl. Extr. 2025, 7(2), 33; https://doi.org/10.3390/make7020033 - 5 Apr 2025
Viewed by 698
Abstract
Regression is a fundamental task in machine learning, and neural networks have been successfully employed in many applications to identify underlying regression patterns. However, they are often criticised for their lack of interpretability and commonly referred to as black-box models. Feature selection approaches address this challenge by simplifying datasets through the removal of unimportant features, while improving explainability by revealing feature importance. In this work, we leverage mathematical programming to identify the most important features in a trained deep neural network with a ReLU activation function, providing greater insight into its decision-making process. Unlike traditional feature selection methods, our approach adjusts the weights and biases of the trained neural network via a Mixed-Integer Linear Programming (MILP) model to identify the most important features and thereby uncover underlying relationships. The mathematical formulation is reported, which determines the subset of selected features, and clustering is applied to reduce the complexity of the model. Our results illustrate improved performance in the neural network when feature selection is implemented by the proposed approach, as compared to other feature selection approaches. Finally, analysis of feature selection frequency across each dataset reveals feature contribution in model predictions, thereby addressing the black-box nature of the neural network. Full article
(This article belongs to the Section Learning)

14 pages, 1442 KiB  
Article
RoSe-Mix: Robust and Secure Deep Neural Network Watermarking in Black-Box Settings via Image Mixup
by Tamara El Hajjar, Mohammed Lansari, Reda Bellafqira, Gouenou Coatrieux, Katarzyna Kapusta and Kassem Kallas
Mach. Learn. Knowl. Extr. 2025, 7(2), 32; https://doi.org/10.3390/make7020032 - 30 Mar 2025
Viewed by 1239
Abstract
Due to their considerable costs, deep neural networks (DNNs) are valuable assets that need to be protected in terms of intellectual property (IP). For this reason, DNN watermarking has gained significant interest, since it allows DNN owners to prove their ownership. Various methods that embed ownership information in the model behavior have been proposed. They must fulfil several requirements, among them security, which represents an attacker’s difficulty in breaking the watermarking scheme, and robustness, which quantifies the resistance against watermark removal techniques. The problem is that the proposed methods generally fail to meet these necessary standards. This paper presents RoSe-Mix, a robust and secure deep neural network watermarking technique designed for black-box settings. It addresses limitations in existing DNN watermarking approaches by integrating key features from two established methods: RoSe, which uses cryptographic hashing to ensure security, and Mixer, which employs image Mixup to enhance robustness. Experimental results demonstrate that RoSe-Mix achieves security across various architectures and datasets with a robustness to removal attacks exceeding 99%. Full article
(This article belongs to the Section Privacy)

17 pages, 1144 KiB  
Article
Leveraging LLMs for Non-Security Experts in Threat Hunting: Detecting Living off the Land Techniques
by Antreas Konstantinou, Dimitrios Kasimatis, William J. Buchanan, Sana Ullah Jan, Jawad Ahmad, Ilias Politis and Nikolaos Pitropakis
Mach. Learn. Knowl. Extr. 2025, 7(2), 31; https://doi.org/10.3390/make7020031 - 30 Mar 2025
Viewed by 1141
Abstract
This paper explores the potential use of Large Language Models (LLMs), such as ChatGPT, Google Gemini, and Microsoft Copilot, in threat hunting, specifically focusing on Living off the Land (LotL) techniques. LotL methods allow threat actors to blend into regular network activity, which makes detection by automated security systems challenging. The study seeks to determine whether LLMs can reliably generate effective queries for security tools, enabling organisations with limited budgets and expertise to conduct threat hunting. A testing environment was created to simulate LotL techniques, and LLM-generated queries were used to identify malicious activity. The results demonstrate that LLMs do not consistently produce accurate or reliable queries for detecting these techniques, particularly for users with varying skill levels. However, while LLMs may not be suitable as standalone tools for threat hunting, they can still serve as supportive resources within a broader security strategy. These findings suggest that, although LLMs offer potential, they should not be relied upon for accurate results in threat detection and require further refinement to be effectively integrated into cybersecurity workflows. Full article
(This article belongs to the Section Privacy)

21 pages, 3831 KiB  
Article
Comparative Analysis of Machine Learning Techniques for Predicting Bulk Specific Gravity in Modified Asphalt Mixtures Incorporating Polyethylene Terephthalate (PET), High-Density Polyethylene (HDPE), and Polyvinyl Chloride (PVC)
by Bhupender Kumar, Navsal Kumar, Rabee Rustum and Vijay Shankar
Mach. Learn. Knowl. Extr. 2025, 7(2), 30; https://doi.org/10.3390/make7020030 - 27 Mar 2025
Viewed by 622
Abstract
In today’s rapidly evolving transportation infrastructure, developing long-lasting, high-performance pavement materials remains a significant priority. Integrating machine learning (ML) techniques provides a transformative approach to optimizing asphalt mix design and performance prediction. This study investigates the use of waste plastics, including Polyethylene Terephthalate (PET), High-Density Polyethylene (HDPE), and Polyvinyl Chloride (PVC), as modifiers in asphalt concrete to enhance durability and mechanical performance. A predictive modeling approach was employed to estimate the bulk-specific gravity (Gmb) of asphalt concrete using various ML techniques, including Artificial Neural Networks (ANNs), Support Vector Machines (SVMs), Gaussian Processes (GPs), and Reduced Error Pruning (REP) Tree. The accuracy of each model was evaluated using statistical performance metrics, including the correlation coefficient (CC), scatter index (SI), mean absolute error (MAE), and root mean square error (RMSE). The results demonstrate that the ANN model outperformed all other ML techniques, achieving the highest correlation (CC = 0.9996 for training, 0.9999 for testing) and the lowest error values (MAE = 0.0004, RMSE = 0.0006, SI = 0.00026). A comparative analysis between actual and predicted Gmb values confirmed the reliability of the proposed ANN model, with minimal error margins and superior accuracy. Additionally, sensitivity analysis identified bitumen content (BC) and volume of bitumen (Vb) as the most influential parameters affecting Gmb, emphasizing the need for precise parameter optimization in asphalt mix design. This study demonstrates the effectiveness of machine learning-driven predictive modeling in optimizing sustainable asphalt mix design, offering a cost-effective, time-efficient, and highly accurate alternative to traditional experimental methods. Full article
(This article belongs to the Section Learning)
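The four metrics named in the abstract above (CC, SI, MAE, RMSE) can be computed as in the following minimal sketch. The function name `regression_metrics` and the sample values are hypothetical, and the scatter index is assumed here to be RMSE divided by the mean of the observed values, which is a common definition but may differ from the one used in the paper.

```python
import math


def regression_metrics(actual, predicted):
    """Return CC, SI, MAE and RMSE for paired observations and predictions."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    cc = cov / math.sqrt(var_a * var_p)  # Pearson correlation coefficient

    si = rmse / mean_a  # scatter index (assumed definition: RMSE / mean observed)
    return {"CC": cc, "SI": si, "MAE": mae, "RMSE": rmse}


if __name__ == "__main__":
    # Made-up values for illustration only, not data from the study.
    observed = [2.0, 2.2, 2.4, 2.6]
    estimated = [2.1, 2.1, 2.5, 2.5]
    for name, value in regression_metrics(observed, estimated).items():
        print(f"{name}: {value:.4f}")
```

Lower SI, MAE and RMSE and a CC closer to 1 indicate a closer match between predicted and measured values.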

31 pages, 24053 KiB  
Article
Optimizing a Double Stage Heat Transformer Performance by Levenberg–Marquardt Artificial Neural Network
by Suset Vázquez-Aveledo, Rosenberg J. Romero, Lorena Díaz-González, Moisés Montiel-González and Jesús Cerezo
Mach. Learn. Knowl. Extr. 2025, 7(2), 29; https://doi.org/10.3390/make7020029 - 27 Mar 2025
Viewed by 1002
Abstract
Waste heat recovery is a critical strategy for optimizing energy consumption and reducing greenhouse gas emissions. In this context, the circular economy highlights the importance of this practice as a key tool to enhance energy efficiency, minimize waste, and decrease environmental impact. Artificial neural networks are particularly well-suited for managing nonlinearities and complex interactions among multiple variables, making them ideal for controlling a double-stage absorption heat transformer. This study aims to simultaneously optimize both user-defined parameters. Levenberg–Marquardt and scaled conjugate gradient algorithms were compared with five to twenty-five neurons to determine the optimal operating conditions while the coefficient of performance and the gross temperature lift were simultaneously maximized. The methodology includes MATLAB R2024a programming, real-time data acquisition, visual engineering environment software, and flow control hardware. The results show that applying the Levenberg–Marquardt algorithm resulted in an increase in the correlation coefficient (R) at 20 neurons, improving the thermodynamic performance and enabling greater energy recovery from waste heat. Full article
(This article belongs to the Special Issue Sustainable Applications for Machine Learning)

27 pages, 941 KiB  
Article
Accelerating Disease Model Parameter Extraction: An LLM-Based Ranking Approach to Select Initial Studies for Literature Review Automation
by Masood Sujau, Masako Wada, Emilie Vallée, Natalie Hillis and Teo Sušnjak
Mach. Learn. Knowl. Extr. 2025, 7(2), 28; https://doi.org/10.3390/make7020028 - 26 Mar 2025
Viewed by 1326
Abstract
As climate change transforms our environment and human intrusion into natural ecosystems escalates, there is a growing demand for disease spread models to forecast and plan for the next zoonotic disease outbreak. Accurate parametrization of these models requires data from diverse sources, including the scientific literature. Despite the abundance of scientific publications, the manual extraction of these data via systematic literature reviews remains a significant bottleneck, requiring extensive time and resources, and is susceptible to human error. This study examines the application of a large language model (LLM) as an assessor for screening prioritisation in climate-sensitive zoonotic disease research. By framing the selection criteria of articles as a question–answer task and utilising zero-shot chain-of-thought prompting, the proposed method achieves a saving of at least 70% work effort compared to manual screening at a recall level of 95% (NWSS@95%). This was validated across four datasets containing four distinct zoonotic diseases and a critical climate variable (rainfall). The approach additionally produces explainable AI rationales for each ranked article. The effectiveness of the approach across multiple diseases demonstrates the potential for broad application in systematic literature reviews. The substantial reduction in screening effort, along with the provision of explainable AI rationales, marks an important step toward automated parameter extraction from the scientific literature. Full article
(This article belongs to the Section Learning)

37 pages, 4565 KiB  
Article
On Classification of the Human Emotions from Facial Thermal Images: A Case Study Based on Machine Learning
by Marius Sorin Pavel, Simona Moldovanu and Dorel Aiordachioaie
Mach. Learn. Knowl. Extr. 2025, 7(2), 27; https://doi.org/10.3390/make7020027 - 25 Mar 2025
Viewed by 764
Abstract
(1) Background: This paper presents a comparative study and analysis of the multiclass classification of facial thermal images into three classes corresponding to predefined emotional states (neutral, happy and sad). The main goal of the comparative analysis is to identify the algorithm from the machine learning field that has the highest accuracy (ACC). Two categories of images were used in the process, i.e., images with Gaussian noise and images with “salt and pepper” type noise that come from two built-in special databases. An augmentation process was applied to the initial raw images that led to the development of the two databases with added noise, as well as the subsequent augmentation of all images, i.e., rotation, reflection, translation and scaling. (2) Methods: The multiclass classification process was implemented through two subsets of methods, i.e., machine learning with random forest (RF), support vector machines (SVM) and k-nearest neighbor (KNN) algorithms and deep learning with the convolutional neural network (CNN) algorithm. (3) Results: The results obtained in this paper with the two subsets of methods belonging to the field of artificial intelligence (AI), together with the two categories of facial thermal images with added noise used as input, were very good, showing a classification accuracy of over 99% for the two categories of images, and the three corresponding classes for each. (4) Discussion: The augmented databases and the additional configurations of the implemented algorithms seem to have had a positive effect on the final classification results. Full article
(This article belongs to the Section Learning)
