Machine Learning and Data Analysis: Bridging Theory and Real-World Solutions

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 February 2026 | Viewed by 3936

Special Issue Editors


Guest Editor
Department of Information and Communication Sciences, Faculty of Humanities and Social Sciences, University of Zagreb, 10000 Zagreb, Croatia
Interests: data science; machine learning; natural language processing; language technologies; machine translation; business analytics; open data

Guest Editor
Department of Information and Communication Sciences, Faculty of Humanities and Social Sciences, University of Zagreb, 10000 Zagreb, Croatia
Interests: machine learning; data science; machine translation; natural language processing; language technologies; information systems; databases

Guest Editor
Department of Informatics, University of Rijeka, Radmile Matejčić 2, 51000 Rijeka, Croatia
Interests: artificial intelligence; machine learning; interpretable machine learning; educational data mining; natural language processing; machine translation

Special Issue Information

Dear Colleagues,

The MDPI journal Applied Sciences invites submissions to a Special Issue on “Machine Learning and Data Analysis: Bridging Theory and Real-World Solutions”.

The goal of this Special Issue is to investigate how machine learning and data analysis can be applied to solve real-world problems across a variety of domains. It presents research that transforms theoretical breakthroughs into practical applications. A wide range of subjects is covered, including the development of machine learning algorithms and innovative data analysis methods, as well as their application in various fields.

In this Special Issue, original and unpublished works with results related to machine learning, data science, natural language processing, and adjacent areas are welcome. We welcome contributions addressing experimental and methodological aspects of novel solutions in machine learning and data analysis, including the following:

  • Research, analysis, or implementation approaches and innovations in machine learning and data analysis methods;
  • User studies on the application of machine learning and data science in various fields;
  • Emerging technologies and evaluation of integrative solutions for data analysis and predictive analytics;
  • Corpora and other digital resources that are essential to data science;
  • Research on natural language processing and language and speech technologies;
  • Strategies, challenges, and opportunities in data science.

Although the emphasis is on applied work, submissions with a strong theoretical contribution are also welcome.

 Topics of interest include, but are not limited to, the following:

  • Innovations in machine learning and data analysis;
  • Analysis of machine learning methods and approaches;
  • Emerging technologies and evaluation of deep learning;
  • Integrative approaches to machine learning and data analysis;
  • Advanced data analysis and predictive data analytics;
  • Open data analytics and innovations;
  • Image recognition and generation;
  • Digital corpora and other resources for data science;
  • Research on natural language processing;
  • Innovative language technologies;
  • Novel approaches to speech technologies;
  • Real-world data-driven solutions and new strategies;
  • Data-driven decision making;
  • Challenges and opportunities in data science;
  • Ethical considerations and legal issues in artificial intelligence;
  • Application of machine learning and data science in various fields.

Prof. Dr. Sanja Seljan
Dr. Ivan Dunđer
Prof. Dr. Marija Brkić Bakarić
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • machine learning
  • deep learning
  • data analysis
  • data science
  • data mining
  • natural language processing
  • language technologies
  • speech technologies
  • open data
  • image recognition
  • predictive analytics
  • digital resources

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (7 papers)


Research


29 pages, 613 KiB  
Article
Hamming Diversification Index: A New Clustering-Based Metric to Understand and Visualize Time Evolution of Patterns in Multi-Dimensional Datasets
by Sarthak Pattnaik and Eugene Pinsky
Appl. Sci. 2025, 15(14), 7760; https://doi.org/10.3390/app15147760 - 10 Jul 2025
Viewed by 237
Abstract
One of the most challenging problems in data analysis is visualizing patterns and extracting insights from multi-dimensional datasets that vary over time. The complexity of data and variations in the correlations between different features add further difficulty to the analysis. In this paper, we provide a framework to analyze the temporal dynamics of such datasets. We use machine learning clustering techniques and examine the time evolution of data patterns by constructing the corresponding cluster trajectories. These trajectories allow us to visualize the patterns and the changing nature of correlations over time. The similarity and correlations of features are reflected in common cluster membership, whereas the historical dynamics are described by a trajectory in the corresponding (cluster, time) space. This allows an effective visualization of multi-dimensional data over time. We introduce several statistical metrics to measure duration, volatility, and inertia of changes in patterns. Using the Hamming distance of trajectories over multiple time periods, we propose a novel metric, the Hamming diversification index, to measure the spread between trajectories. The novel metric is easy to compute, has a simple machine learning implementation, and provides additional insights into the temporal dynamics of data. This parsimonious diversification index can be used to examine changes in pattern similarities over aggregated time periods. We demonstrate the efficacy of our approach by analyzing a complex multi-year dataset of multiple worldwide economic indicators.
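
To illustrate the idea behind trajectory comparison, the minimal Python sketch below computes a mean pairwise normalized Hamming distance between cluster trajectories. The exact definition of the Hamming diversification index in the paper may differ; the trajectory data here are hypothetical.

```python
# Minimal sketch of comparing cluster trajectories with a Hamming distance.
# Assumptions (not taken from the paper): each entity's trajectory is the
# sequence of cluster labels it receives in consecutive time periods, and the
# spread between trajectories is summarized as the mean pairwise normalized
# Hamming distance.
import numpy as np
from itertools import combinations

def hamming_distance(traj_a, traj_b):
    """Fraction of time periods in which two trajectories fall in different clusters."""
    traj_a, traj_b = np.asarray(traj_a), np.asarray(traj_b)
    return np.mean(traj_a != traj_b)

def mean_pairwise_hamming(trajectories):
    """Average normalized Hamming distance over all trajectory pairs (hypothetical index)."""
    pairs = combinations(range(len(trajectories)), 2)
    return np.mean([hamming_distance(trajectories[i], trajectories[j]) for i, j in pairs])

# Example: cluster labels of three entities over five yearly periods.
trajectories = np.array([
    [0, 0, 1, 1, 2],
    [0, 1, 1, 2, 2],
    [2, 2, 2, 2, 2],
])
print(mean_pairwise_hamming(trajectories))  # 0.6 for this toy example
```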

11 pages, 1505 KiB  
Article
Comparison of Dimensionality Reduction Approaches and Logistic Regression for ECG Classification
by Simeon Lappa Tchoffo, Éloïse Soucy, Ismaila Baldé, Jalila Jbilou and Salah El Adlouni
Appl. Sci. 2025, 15(12), 6627; https://doi.org/10.3390/app15126627 - 12 Jun 2025
Viewed by 356
Abstract
This study aims to analyze electrocardiogram (ECG) data for the classification of five cardiac rhythms: sinus bradycardia (SB), sinus rhythm (SR), atrial fibrillation (AFIB), supraventricular tachycardia (SVT), and sinus tachycardia (ST). While SR is considered normal, the other four represent types of cardiac arrhythmias. A range of methods is utilized, including the supervised learning technique K-Nearest Neighbors (KNN), combined with dimensionality reduction approaches such as Principal Component Analysis (PCA) and Uniform Manifold Approximation and Projection (UMAP), a modern method based on Riemannian geometry and topology. Additionally, logistic regression was applied using both maximum likelihood and Bayesian methods, with two distinct prior distributions: an informative normal prior and a non-informative Jeffreys prior. Performance was assessed using evaluation metrics such as positive predictive value (PPV), negative predictive value (NPV), specificity, sensitivity, accuracy, and F1-score. Ultimately, the UMAP-KNN method demonstrated the best overall performance.
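
The sketch below outlines a UMAP-plus-KNN pipeline of the kind described in the abstract, using the umap-learn and scikit-learn packages. The feature matrix, labels, and hyperparameters are placeholders rather than the study's actual settings.

```python
# Hedged sketch: reduce features with UMAP, then classify with KNN.
import numpy as np
import umap
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 12))        # stand-in for ECG-derived features
y = rng.integers(0, 5, size=500)      # stand-in for SB / SR / AFIB / SVT / ST labels

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

reducer = umap.UMAP(n_components=2, random_state=0)   # nonlinear dimensionality reduction
X_train_2d = reducer.fit_transform(X_train)
X_test_2d = reducer.transform(X_test)

knn = KNeighborsClassifier(n_neighbors=5)             # classify in the embedded space
knn.fit(X_train_2d, y_train)
print(classification_report(y_test, knn.predict(X_test_2d)))
```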

27 pages, 19294 KiB  
Article
Classifying X-Ray Tube Malfunctions: AI-Powered CT Predictive Maintenance System
by Ladislav Pomšár, Maryna Tsvietaieva, Maros Krupáš and Iveta Zolotová
Appl. Sci. 2025, 15(12), 6547; https://doi.org/10.3390/app15126547 - 10 Jun 2025
Viewed by 540
Abstract
Computed tomography (CT) scans are among the most used medical imaging modalities. With increased popularity and usage, the need for maintenance also increases. In this work, the problem is tackled using machine learning methods to create a predictive maintenance system for the classification of faulty X-ray tubes. Data for 137 different CT machines were collected, with 128 deemed to fulfil the quality criteria of the study. Of these, 66 subsequently had their X-ray tubes replaced. Afterwards, auto-regressive model coefficients and wavelet coefficients, as standard features in the area, are extracted. For classification, a set of different classical machine learning approaches is used alongside two neural network architectures: a 1D VGG-style CNN and an LSTM RNN. In total, seven different machine learning models are investigated. The best-performing model proved to be an LSTM trained on trimmed and normalised input data, with an accuracy of 87% and a recall of 100% for the faulty class. The developed model has the potential to maximise the uptime of CT machines and help mitigate the adverse effects of machine breakdowns.
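
A minimal sketch of an LSTM classifier over trimmed, normalised sequences is given below. The sequence shape, labels, and hyperparameters are illustrative assumptions, not the configuration reported in the paper.

```python
# Hedged sketch: binary LSTM classifier for "faulty vs. healthy" sequences.
import numpy as np
import tensorflow as tf

T, F = 128, 4                                  # timesteps and features per timestep (placeholders)
X = np.random.rand(256, T, F).astype("float32")   # synthetic normalised sequences
y = np.random.randint(0, 2, size=256)             # 1 = faulty tube, 0 = healthy (synthetic labels)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(T, F)),
    tf.keras.layers.LSTM(32),                      # recurrent encoder of the sequence
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=3, batch_size=32, verbose=0)
```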

18 pages, 438 KiB  
Article
ML-Empowered Microservice Workload Prediction by Dual-Regularized Matrix Factorization
by Xiaoxuan Luo, Hong Shen and Wei Ke
Appl. Sci. 2025, 15(11), 5946; https://doi.org/10.3390/app15115946 - 25 May 2025
Viewed by 503
Abstract
A technical challenge for workload prediction in microservice systems is how to capture both the dynamic features of workload and evolving dependencies among microservices. The existing work focused mainly on modeling dynamic features without taking adequate account of evolving dependencies due to their unpredictable temporal dynamics. To fill this gap, as an illustration of bridging theory and real-world solutions by integrating machine learning with data analysis, we propose a novel framework of Temporality-Dependence Dual-Regularized Matrix Factorization (TDDRMF) by combining matrix factorization with regularization on both workload temporality and microservice dependencies. It models the workload matrix as the product of a microservice dependency matrix W and a workload feature matrix X via matrix factorization, and computes X by temporal regularization and W by low-rank norm regularization as a convex relaxation of rank minimization. To further enhance its adaptability to workload variations in real-time environments, we deploy a dynamic error detection and update mechanism. Experiments on the Alibaba dataset show that TDDRMF achieves 18.5% lower RMSE than TAMF in 10-step prediction, improving on existing matrix factorization methods in accuracy. In comparison with ML-based methods, as TDDRMF uses only 5% of their training data, it requires only a small fraction of their training time.
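
The sketch below illustrates a simplified dual-regularized factorization Y ≈ WX with a temporal-smoothness penalty on X. As a simplification, a Frobenius-norm penalty on W stands in for the paper's low-rank (nuclear-norm) regularization, and all shapes and weights are placeholders.

```python
# Hedged sketch: factorize a workload matrix Y (microservices x time) as W @ X
# by gradient descent, penalising abrupt changes of X over time.
import numpy as np

def factorize(Y, rank=5, lam_t=0.1, lam_w=0.1, lr=1e-3, iters=2000, seed=0):
    rng = np.random.default_rng(seed)
    m, T = Y.shape
    W = rng.normal(scale=0.1, size=(m, rank))   # microservice "dependency" factors
    X = rng.normal(scale=0.1, size=(rank, T))   # time-varying workload features
    for _ in range(iters):
        R = Y - W @ X                           # reconstruction residual
        grad_W = -2 * R @ X.T + 2 * lam_w * W   # Frobenius penalty (stand-in for nuclear norm)
        grad_X = -2 * W.T @ R
        diff = X[:, 1:] - X[:, :-1]             # temporal-smoothness term
        grad_X[:, 1:] += 2 * lam_t * diff
        grad_X[:, :-1] -= 2 * lam_t * diff
        W -= lr * grad_W
        X -= lr * grad_X
    return W, X

Y = np.abs(np.random.default_rng(1).normal(size=(20, 60)))   # synthetic workload matrix
W, X = factorize(Y)
print(np.linalg.norm(Y - W @ X) / np.linalg.norm(Y))          # relative reconstruction error
```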

18 pages, 2863 KiB  
Article
On the Optimum Linear Soft Fusion of Classifiers
by Luis Vergara and Addisson Salazar
Appl. Sci. 2025, 15(9), 5038; https://doi.org/10.3390/app15095038 - 1 May 2025
Cited by 2 | Viewed by 244
Abstract
We present new analytical developments that contribute to a better understanding of the (soft) fusion of classifiers. To this end, we propose an optimal linear combiner based on a minimum mean-square-error class estimation approach. This solution allows us to define a post-fusion mean-square-error improvement factor relative to the best fused classifier. Key elements for this improvement factor are the number of classifiers, their pairwise correlations, the imbalance between their performances, and the bias. Furthermore, we consider exponential models for the class-conditional probability densities to establish the relationship between the classifier's error probability and the mean square error of the class estimate. This allows us to predict the reduction in the post-fusion error probability relative to that of the best classifier. These theoretical findings are evaluated in a biosignal application for the detection of arousals during sleep from EEG signals. The results obtained are reasonably consistent with the theoretical conclusions.
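
Below is a generic sketch of a minimum mean-square-error linear combiner of classifier soft outputs, w = R^-1 p with R = E[s s^T] and p = E[s y], plus a bias column. It is a textbook MMSE estimator on synthetic data, not necessarily the paper's exact formulation.

```python
# Hedged sketch: MMSE linear fusion of soft classifier scores.
import numpy as np

rng = np.random.default_rng(0)
n, k = 2000, 3
y = rng.integers(0, 2, size=n).astype(float)              # true class labels (0/1)
# Soft scores from k imperfect classifiers (synthetic, with different noise levels).
S = y[:, None] + rng.normal(scale=[0.6, 0.8, 1.0], size=(n, k))

S_aug = np.hstack([S, np.ones((n, 1))])                    # append a bias column
R = S_aug.T @ S_aug / n                                    # sample estimate of E[s s^T]
p = S_aug.T @ y / n                                        # sample estimate of E[s y]
w = np.linalg.solve(R, p)                                  # MMSE combiner weights

fused = S_aug @ w
fused_err = np.mean((fused > 0.5) != y)
best_single_err = min(np.mean((S[:, j] > 0.5) != y) for j in range(k))
print(f"best single classifier error: {best_single_err:.3f}, fused error: {fused_err:.3f}")
```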

Review


20 pages, 1776 KiB  
Review
Bridging Theory and Practice: A Review of AI-Driven Techniques for Ground Penetrating Radar Interpretation
by Lilong Zou, Ying Li, Kevin Munisami and Amir M. Alani
Appl. Sci. 2025, 15(15), 8177; https://doi.org/10.3390/app15158177 - 23 Jul 2025
Viewed by 124
Abstract
Artificial intelligence (AI) has emerged as a powerful tool for advancing the interpretation of ground penetrating radar (GPR) data, offering solutions to long-standing challenges in manual analysis, such as subjectivity, inefficiency, and limited scalability. This review investigates recent developments in AI-driven techniques for GPR interpretation, with a focus on machine learning, deep learning, and hybrid approaches that incorporate physical modeling or multimodal data fusion. We systematically analyze the application of these techniques across various domains, including utility detection, infrastructure monitoring, archeology, and environmental studies. Key findings highlight the success of convolutional neural networks in hyperbola detection, the use of segmentation models for stratigraphic analysis, and the integration of AI with robotic and real-time systems. However, challenges remain with generalization, data scarcity, model interpretability, and operational deployment. We identify promising directions, such as domain adaptation, explainable AI, and edge-compatible solutions for practical implementation. By synthesizing current progress and limitations, this review aims to bridge the gap between theoretical advancements in AI and the practical needs of GPR practitioners, guiding future research towards more reliable, transparent, and field-ready systems.

30 pages, 843 KiB  
Review
Optimizing Internet of Things Honeypots with Machine Learning: A Review
by Stefanie Lanz, Sarah Lily-Rose Pignol, Patrick Schmitt, Haochen Wang, Maria Papaioannou, Gaurav Choudhary and Nicola Dragoni
Appl. Sci. 2025, 15(10), 5251; https://doi.org/10.3390/app15105251 - 8 May 2025
Viewed by 1093
Abstract
The increasing use of Internet of Things (IoT) devices has led to growing security concerns, necessitating advanced solutions to address emerging threats. Honeypots enhance IoT security by attracting and analyzing attackers. However, traditional honeypots struggle with adaptability and efficiency. This paper examines how machine learning enhances honeypot capabilities by improving threat detection and response mechanisms. A systematic literature review using the snowballing method explores the application of supervised, unsupervised, and reinforcement learning. Various machine learning classifiers are analyzed to optimize honeypot architectures. This paper focuses on two types of honeypots: dynamic honeypots, which evolve to mislead attackers, and adaptive honeypots, which respond to threats in real time. By evaluating low-interaction, high-interaction, and hybrid honeypots, we determine how different machine learning techniques enhance detection and resource efficiency. Key findings include improved detection rates: supervised learning models such as random forest significantly enhance detection accuracy, achieving up to 0.96 accuracy. Adaptive honeypots utilizing machine learning demonstrate better resource management, reducing false positives and optimizing computational resources. Despite these improvements, high computational demands and limited real-world testing hinder widespread adoption in IoT environments. This paper provides an overview of current trends, identifies research gaps, and offers insights for developing more intelligent IoT honeypots. There is no doubt that machine learning can help create more resilient and adaptive security solutions for IoT networks.
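
As a rough illustration of the supervised approach the review highlights, the sketch below trains a scikit-learn random forest on synthetic session features. The features and labels are placeholders, not data from any surveyed system.

```python
# Hedged sketch: random-forest classifier for malicious vs. benign honeypot sessions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))                 # e.g. session duration, command counts (hypothetical)
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # synthetic "malicious" label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```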