Machine Learning: Techniques, Industry Applications, Code Sharing, and Future Trends

A special issue of Computers (ISSN 2073-431X). This special issue belongs to the section "AI-Driven Innovations".

Deadline for manuscript submissions: 31 October 2026 | Viewed by 19154

Special Issue Editors


Guest Editor
1. Artificial Intelligence and Cyber Futures Institute, Charles Sturt University, Orange, NSW 2800, Australia
2. Rural Health Research Institute, Charles Sturt University, Orange, NSW 2800, Australia
Interests: artificial intelligence; uncertainty quantification; imbalanced data

Guest Editor
School of Computer Science and Engineering, Macau University of Science and Technology, Macau 999078, China
Interests: cloud computing; networks and distributed systems; blockchain; deep learning; natural language processing

Special Issue Information

Dear Colleagues,

This Special Issue aims to highlight the importance of transparency, reproducibility, and openness in machine learning research by encouraging solutions accompanied by publicly shared code. The goal is to promote best practices in sharing code and datasets, making it easier for the research community to reproduce and build upon existing work.

Potential authors are encouraged to submit new concepts according to the submission guidelines. We also encourage researchers to share their code in public repositories and to make it runnable on open platforms such as Kaggle and Code Ocean. Editors and reviewers will aim to improve the presented concepts by providing constructive feedback to the authors. This Special Issue has the potential to drive technological advances and to improve the understanding of these concepts among everyone involved, including readers.

Scope and Topics of Interest:

We invite original research papers, reviews, and case studies that demonstrate innovative applications of machine learning and provide public access to the codebases used for the research. The topics of interest include, but are not limited to, the following:

  • Open-source machine learning frameworks and tools;
  • New machine learning models with publicly available implementations;
  • Benchmarking studies with open access datasets and code;
  • Case studies and applications of machine learning in various domains with shared code;
  • Best practices for reproducibility in machine learning research;
  • Public repositories and tools for collaborative machine learning development;
  • Studies on the impact of code sharing in AI research;
  • Efficient data preprocessing, feature extraction, and model evaluation using shared code;
  • Reusable machine learning pipelines and workflows.

Dr. Hussain Mohammed Dipu Kabir
Dr. Subrota Kumar Mondal
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, authors can proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Computers is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1800 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • open-source machine learning
  • reproducible research
  • code sharing in AI
  • machine learning frameworks
  • publicly available datasets
  • transparent machine learning
  • benchmarking in machine learning
  • collaborative machine learning
  • open science in AI
  • code-based research validation
  • machine learning algorithms with codes
  • open repositories in ML
  • GitHub for machine learning
  • computational experiment reproducibility
  • best practices in code sharing

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies is available on the MDPI website.

Published Papers (10 papers)

Research

27 pages, 6893 KB  
Article
LoRA-Based Deep Learning for High-Fidelity Satellite Image Super-Resolution in Big Data Remote Sensing
by Noha Rashad Mahmoud, Hussam Elbehiery, Basheer Abdel Fattah Youssef and Hanaa Bayomi Ali Mobarz
Computers 2026, 15(5), 313; https://doi.org/10.3390/computers15050313 - 14 May 2026
Viewed by 273
Abstract
High-resolution satellite imagery is pivotal for accurate analysis in remote sensing applications, including land-use monitoring, urban planning, and environmental assessment. However, obtaining such data is often costly and limited. Consequently, super-resolution techniques, such as deep learning models and fine-tuning strategies like LoRA, offer a promising way to address this critical research challenge, especially given the diversity and large scale of satellite datasets. While deep learning-based super-resolution models have recently shown great promise, their effectiveness, efficiency, and scalability across heterogeneous satellite scenes are not well studied. This work studies the performance of representative deep learning super-resolution frameworks, including the Enhanced Super-Resolution Generative Adversarial Network (ESRGAN), Swin Transformer for Image Restoration (SwinIR), and latent diffusion models (LDM), under unified experimental conditions using the WorldStrat dataset. The main goal is to establish whether adaptation strategies for parameter efficiency can boost reconstruction quality while reducing computational and training costs. Toward this goal, we investigate hybrid sequential pipelines, ensemble averaging, and Low-Rank Adaptation (LoRA)-based fine-tuning. The experiments indicate that the multi-model pipelines achieve only marginal performance gains while incurring substantial increases in computational complexity. LoRA-based fine-tuning, by contrast, improves reconstruction accuracy and quality across all model families, despite using only a small percentage of trainable parameters. LoRA-based models outperform multi-model methods in both efficiency and performance. The presented results confirm that LoRA is an effective and accessible technique for high-fidelity satellite-based super-resolution image synthesis.
The manuscript identifies LoRA as one of the enabling technologies advancing the state of the art in deep learning-based super-resolution for large-scale satellite-based image synthesis. Full article

21 pages, 25855 KB  
Article
Semantic Segmentation-Based Identification and Quantitative Analysis of Cross-Sectional Quality Features in Luzhou-Flavor Liquor Daqu
by Zheli Song, Yi Dong, Chao Wang, Xiu Zhang, Aibao Sun, Cuiping You, Jian Mao and Shuangping Liu
Computers 2026, 15(5), 307; https://doi.org/10.3390/computers15050307 - 12 May 2026
Viewed by 296
Abstract
The objective evaluation of Daqu cross-sectional quality is challenging due to its heterogeneous structure, small features, and low contrast. This study proposes a semantic-segmentation-based framework for the automated identification and quantitative analysis of Luzhou-flavor Daqu cross-sections. Four representative architectures—including three convolutional neural network (CNN)-based models (U-Net, U-Net++, and U2-Net) and one Transformer-based model (SegFormer)—were systematically benchmarked. To address severe class imbalance and enhance model robustness, a task-specific data augmentation pipeline was implemented. With these optimized augmentation strategies, the U2-Net model demonstrated the best performance, with a peak mean Intersection over Union (mIoU) of 87.54% and a Dice score of 98.30%. Based on the predicted masks, quantitative indicators such as plaque area ratio, pizhang thickness, and fissure length were precisely extracted. The proposed framework provides an objective and scalable solution for Daqu quality inspection, offering significant practical value for industrial scenarios involving complex materials and fine-grained defect patterns. Full article

35 pages, 4403 KB  
Article
A Reproducible Hybrid Architecture of Fuzzy Logic and XGBoost for Explainable Tabular Classification of Territorial Vulnerability
by Aiman Akynbekova, Ayagoz Mukhanova, Raikhan Muratkhan, Lunara Diyarova, Saya Baigubenova, Gulden Murzabekova, Gulaim Orazymbetova, Aliya Satybaldieva and Zhanat Abdikadyr
Computers 2026, 15(4), 259; https://doi.org/10.3390/computers15040259 - 20 Apr 2026
Viewed by 378
Abstract
This study proposes a reproducible hybrid computational model for the explainable classification of territorial vulnerability using heterogeneous tabular data. The approach integrates fuzzy logic and extreme gradient boosting in a two-stage architecture that balances interpretability and predictive performance. First, a fuzzy transformation is applied to construct interpretable risk and resilience indicators based on multi-source administrative indicators. The analytical dataset was formed by integrating 11 heterogeneous administrative sources into a single matrix of 166 territorial units and 76 features. The model was evaluated on a stratified 75/25 split of the training and test sets using the F1 score, ROC-AUC, precision, recall, and integrated quality criterion. Experimental results show that the proposed Fuzzy-XGBoost framework achieved an F1 score of 0.7333 on the test dataset, an ROC-AUC of 0.8291, and an Integrated Score of 0.768, outperforming the strongest baseline and improving recall in highly vulnerable areas. Furthermore, probabilistic threshold optimization identified an operating point at τ = 0.35, reducing the number of missed high-risk cases while maintaining acceptable specificity. The results demonstrate that fuzzy feature expansion combined with gradient boosting provides an efficient and interpretable solution for tabular risk classification and decision support problems under heterogeneity and uncertainty. Full article
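The threshold-selection step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the function names, the threshold grid, the specificity floor of 0.7, and the toy labels and scores are all assumptions made for illustration. The sketch sweeps candidate thresholds and keeps the one with the highest recall among those that still meet the specificity floor.

```python
from typing import List, Optional, Tuple


def recall_specificity(y_true: List[int], scores: List[float], tau: float) -> Tuple[float, float]:
    """Recall and specificity when predicting positive for score >= tau."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= tau)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < tau)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < tau)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= tau)
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return recall, specificity


def pick_threshold(
    y_true: List[int], scores: List[float], min_specificity: float = 0.7
) -> Optional[Tuple[float, float, float]]:
    """Return (tau, recall, specificity) with the best recall above the floor.

    The grid 0.05, 0.10, ..., 0.95 and the floor are hypothetical choices.
    """
    best = None
    for step in range(5, 100, 5):
        tau = step / 100
        recall, specificity = recall_specificity(y_true, scores, tau)
        if specificity < min_specificity:
            continue  # too many false alarms at this threshold
        if best is None or recall > best[1]:
            best = (tau, recall, specificity)
    return best


# Toy data invented for this example (1 = high-vulnerability unit).
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.55, 0.2, 0.8, 0.45, 0.45, 0.1, 0.1, 0.05]
toy_choice = pick_threshold(toy_labels, toy_scores)
```

Lowering the threshold below the default 0.5 trades specificity for recall, which is why a floor on specificity is a common way to keep the number of false alarms acceptable.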

14 pages, 1902 KB  
Article
Evaluating Machine Learning Classifiers in Detecting Cyberattacks
by Mustafa Hammad, Mohamed Almahmood, Maen Hammad, Bassam A. Y. Alqaralleh and Aymen I. Zreikat
Computers 2026, 15(4), 248; https://doi.org/10.3390/computers15040248 - 16 Apr 2026
Viewed by 323
Abstract
This study aims to develop a machine learning model that can accurately detect cyberattacks. We compare the performance of Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest (RF) in predicting cyberattacks. Furthermore, we investigate whether using Information Gain Attribute Evaluation (IGAE) for feature selection improves the performance of the algorithms. This work provides a clear comparison of the algorithms and shows the most suitable one for classifying cyberattacks. In addition, this study combines LR and RF using a voting classifier along with IGAE and compares its performance with that of the rest of the algorithms. We investigate whether combining algorithms increases the accuracy of the results. The results show that the most accurate algorithm is RF, followed by LR and SVM. Contrary to initial expectations, the findings further indicate that the application of IGAE marginally reduces algorithm accuracy across the tested classifiers, suggesting that feature selection through information gain is not universally beneficial in cyberattack detection tasks. These findings contribute to the growing body of knowledge on effective machine learning methodologies for cybersecurity applications. Full article
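For readers unfamiliar with information-gain feature ranking, the quantity can be computed as below. This is a generic textbook sketch for categorical features, not the tool or configuration used in the paper; the feature names and the four-row toy dataset are invented for illustration.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Reduction in label entropy after splitting on a categorical feature."""
    n = len(labels)
    conditional = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        conditional += len(subset) / n * entropy(subset)
    return entropy(labels) - conditional


# Hypothetical toy records: 1 = attack, 0 = benign.
toy_labels: List[int] = [1, 1, 0, 0]
toy_features = {
    "protocol": ["tcp", "tcp", "udp", "udp"],  # separates the classes perfectly
    "weekday": ["mon", "tue", "mon", "tue"],   # carries no information here
}
ranking = sorted(
    toy_features,
    key=lambda name: information_gain(toy_features[name], toy_labels),
    reverse=True,
)
```

A feature selector would keep the top-ranked features. As the abstract notes, discarding low-gain features does not necessarily improve accuracy.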

16 pages, 1649 KB  
Article
The Seed Optimization Method for Fuzz Testing Based on Neural Network-Guided Genetic Algorithm
by Yongbo Jiang, Zhitao Li, Baofeng Duan and Tao Feng
Computers 2026, 15(3), 170; https://doi.org/10.3390/computers15030170 - 6 Mar 2026
Viewed by 740
Abstract
To address the issues of low initial seed efficiency and a large number of ineffective mutations, this paper proposes an innovative fuzz testing seed optimization method combining neural networks and genetic algorithms. Traditional fuzz testing seed generation typically relies on random selection and the number of covered paths. In contrast, our method significantly improves seed generation efficiency and coverage by incorporating neural network models and genetic algorithms. First, the AFL tool is used to generate seed coverage path data, which is then used to train the neural network model. This model is employed to construct a fitness function to assess the potential of each seed. Subsequently, new seeds are generated through genetic algorithm crossover and mutation operations, with fitness evaluations based on the predictions of the neural network. Ultimately, the genetic algorithm optimizes the seeds through multiple generations, progressively improving coverage and vulnerability discovery capabilities. The experimental results demonstrate that the proposed method achieves significant improvements in fuzz testing performance, with path coverage increased by 28% compared to AFL and 23% compared to AFL++, and vulnerability discovery enhanced by over 200%. Full article
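The loop described in this abstract (score seeds with a learned model, then breed new seeds by crossover and mutation) can be sketched as follows. This is an illustrative outline only: `predicted_coverage` is a hypothetical stand-in for the trained neural network, and the population size, mutation rate, and elitism settings are assumptions rather than the paper's values.

```python
import random
from typing import Callable, List, Tuple


def predicted_coverage(seed: bytes) -> float:
    """Placeholder for a trained model's coverage prediction (assumption).

    Counting distinct byte values is only a stand-in so the sketch runs.
    """
    return float(len(set(seed)))


def crossover(a: bytes, b: bytes, rng: random.Random) -> bytes:
    """Single-point crossover of two seeds (each at least 2 bytes long)."""
    cut = rng.randint(1, min(len(a), len(b)) - 1)
    return a[:cut] + b[cut:]


def mutate(seed: bytes, rng: random.Random, rate: float = 0.1) -> bytes:
    """Replace each byte with a random byte with probability `rate`."""
    return bytes(rng.randrange(256) if rng.random() < rate else byte for byte in seed)


def evolve(
    population: List[bytes],
    fitness: Callable[[bytes], float],
    generations: int = 20,
    elite: int = 2,
    rng_seed: int = 0,
) -> Tuple[bytes, List[float]]:
    """Run a small elitist genetic algorithm; return best seed and fitness history."""
    rng = random.Random(rng_seed)
    history: List[float] = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[: max(2, len(ranked) // 2)]  # keep the fitter half
        children: List[bytes] = []
        while len(children) < len(population) - elite:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = ranked[:elite] + children  # elites survive unchanged
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history
```

Because the top seeds are carried over unchanged, the best predicted fitness never decreases between generations. In a real fuzzing setup the surviving seeds would then be handed to the fuzzer, and the measured coverage could be used to retrain the predictor.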

30 pages, 2239 KB  
Article
Exploring Risk Factors of Mycotoxin Contamination in Fresh Eggs Using Machine Learning Techniques
by Eman Omar, Eman Alsaidi, Abdullah Aref, Sharaf Omar, Wafa’ Bani Mustafa and Hind Milhem
Computers 2026, 15(1), 34; https://doi.org/10.3390/computers15010034 - 7 Jan 2026
Cited by 1 | Viewed by 791
Abstract
Mycotoxins are toxic compounds produced by certain fungi, whose health effects may be significant when they contaminate fresh eggs. Conventional methods of mycotoxin analysis, while accurate, are labor-intensive, time-consuming, and impractical for large-scale screening applications. This study attempts to use machine learning techniques to predict the concentration and presence of deoxynivalenol (DON), aflatoxin B1 (AFB1), and ochratoxin A (OTA) in fresh eggs from Jordan. Rather than replacing analytical detection methods, the proposed approach can enable a risk-based prioritization of samples for laboratory testing by identifying high-risk samples based on environmental and production factors. A dataset consisting of 1250 poultry egg samples collected between January and July 2024 under several factors involving environmental conditions and chemical assay results regarding mycotoxin content in eggs was used. Several machine learning algorithms were used in this study to build predictive models, including decision trees, support vector machines, and neural networks. The results indicate that machine learning can accurately and reliably predict mycotoxin contamination, which demonstrates the potential for integrating machine learning into food safety protocols. This study contributes toward developing predictive analytics for food safety and lays the groundwork for future research aimed at improving contamination monitoring systems. Full article

20 pages, 2549 KB  
Article
RD-RE: Reverse Distillation with Feature Reconstruction Enhancement for Industrial Anomaly Detection
by Youjia Fu and Antao Lin
Computers 2026, 15(1), 21; https://doi.org/10.3390/computers15010021 - 4 Jan 2026
Cited by 1 | Viewed by 1110
Abstract
Industrial anomaly detection methods based on reverse distillation (RD) have shown significant potential. However, existing RD approaches struggle to achieve an effective balance between constraining the feature consistency of the teacher–student networks and maintaining differentiated representation capability, which is crucial for precise anomaly detection. To address this challenge, we propose Reverse Distillation with Feature Reconstruction Enhancement (RD-RE) for Industrial Anomaly Detection. First, we design a cross-stage feature fusion student network to integrate spatial detail information from the encoder with rich semantic information from the decoder. Second, we introduce a Locally Aware Dynamic Attention (LDA) module to enhance local detail feature response, thereby improving the model’s robustness in capturing anomalous regions. Finally, a Context-Aware Adaptive Multi-Scale Feature Fusion (CFFMS-FF) module is designed to constrain the consistency of local feature reconstruction. Experiments on the MVTec AD benchmark dataset demonstrate the effectiveness of RD-RE, achieving competitive results of 99.0%, 95.8%, and 78.3% on pixel-level AUROC, PRO, and AP, respectively, and 99.7% on image-level AUROC, outperforming existing RD-based approaches. These results indicate that the integration of cross-stage fusion and local attention effectively mitigates the representation-consistency trade-off, providing a more robust solution for industrial anomaly localization. Full article

22 pages, 3451 KB  
Article
LSTM-Based Music Generation Technologies
by Yi-Jen Mon
Computers 2025, 14(6), 229; https://doi.org/10.3390/computers14060229 - 11 Jun 2025
Cited by 1 | Viewed by 3800
Abstract
In deep learning, Long Short-Term Memory (LSTM) is a well-established and widely used approach for music generation. Nevertheless, creating musical compositions that match the quality of those created by human composers remains a formidable challenge. The intricate nature of musical components, including pitch, intensity, rhythm, notes, chords, and more, necessitates the extraction of these elements from extensive datasets, making the preliminary work arduous. To address this, we employed various tools to deconstruct the musical structure, conduct step-by-step learning, and then reconstruct it. This article primarily presents the techniques for dissecting musical components in the preliminary phase. Subsequently, it introduces the use of LSTM to build a deep learning network architecture, enabling the learning of musical features and temporal coherence. Finally, through in-depth analysis and comparative studies, this paper validates the efficacy of the proposed research methodology, demonstrating its ability to capture musical coherence and generate compositions with similar styles. Full article

21 pages, 2758 KB  
Article
Enhancing Cognitive Workload Classification Using Integrated LSTM Layers and CNNs for fNIRS Data Analysis
by Mehshan Ahmed Khan, Houshyar Asadi, Mohammad Reza Chalak Qazani, Adetokunbo Arogbonlo, Siamak Pedrammehr, Adnan Anwar, Hailing Zhou, Lei Wei, Asim Bhatti, Sam Oladazimi, Burhan Khan and Saeid Nahavandi
Computers 2025, 14(2), 73; https://doi.org/10.3390/computers14020073 - 17 Feb 2025
Cited by 6 | Viewed by 4828
Abstract
Functional near-infrared spectroscopy (fNIRS) is employed as a non-invasive method to monitor functional brain activation by capturing changes in the concentrations of oxygenated hemoglobin (HbO) and deoxygenated hemoglobin (HbR). Various machine learning classification techniques have been utilized to distinguish cognitive states. However, conventional machine learning methods, although simpler to implement, undergo a complex pre-processing phase before network training and demonstrate reduced accuracy due to inadequate data preprocessing. Additionally, previous research in cognitive load assessment using fNIRS has predominantly focused on differentiating between two levels of mental workload. These studies mainly aim to classify low and high levels of cognitive load or distinguish between easy and difficult tasks. To address these limitations associated with conventional methods, this paper conducts a comprehensive exploration of the impact of Long Short-Term Memory (LSTM) layers on the effectiveness of Convolutional Neural Networks (CNNs) within deep learning models. This is to address the issues related to spatial feature overfitting and the lack of temporal dependencies in CNNs discussed in the previous studies. By integrating LSTM layers, the model can capture temporal dependencies in the fNIRS data, allowing for a more comprehensive understanding of cognitive states. The primary objective is to assess how incorporating LSTM layers enhances the performance of CNNs. The experimental results presented in this paper demonstrate that the integration of LSTM layers with convolutional layers results in an increase in the accuracy of deep learning models from 97.40% to 97.92%. Full article

Review

52 pages, 2937 KB  
Review
Federated Learning: A Survey of Core Challenges, Current Methods, and Opportunities
by Madan Baduwal, Priyanka Paudel and Vini Chaudhary
Computers 2026, 15(3), 155; https://doi.org/10.3390/computers15030155 - 2 Mar 2026
Cited by 1 | Viewed by 5446
Abstract
Federated learning (FL) has emerged as a transformative distributed learning paradigm that enables collaborative model training without sharing raw data, thereby preserving privacy across large, diverse, and geographically dispersed clients. Despite its rapid adoption in mobile networks, Internet of Things (IoT) systems, healthcare, finance, and edge intelligence, FL continues to face several persistent and interdependent challenges that hinder its scalability, efficiency, and real-world deployment. In this survey, we present a systematic examination of six core challenges in federated learning: heterogeneity, computation overhead, communication bottlenecks, client selection, aggregation and optimization, and privacy preservation. We analyze how these challenges manifest across the full FL pipeline, from local training and client participation to global model aggregation and distribution, and examine their impact on model performance, convergence behavior, fairness, and system reliability. Furthermore, we synthesize representative state-of-the-art approaches proposed to address each challenge and discuss their underlying assumptions, trade-offs, and limitations in practical deployments. Finally, we identify open research problems and outline promising directions for developing more robust, scalable, and efficient federated learning systems. This survey aims to serve as a comprehensive reference for researchers and practitioners seeking a unified understanding of the fundamental challenges shaping modern federated learning. Full article
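As a concrete illustration of the global aggregation step in the FL pipeline, the sketch below shows sample-size-weighted parameter averaging in the style of the widely used FedAvg baseline. It is not taken from the survey: the function name, the flat-list representation of model parameters, and the toy client values are assumptions.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]], client_sizes: Sequence[int]
) -> List[float]:
    """Average client parameter vectors, weighted by local dataset size.

    Only parameters and sample counts reach the server in this sketch;
    the raw training data stays on each client.
    """
    if len(client_weights) != len(client_sizes) or not client_weights:
        raise ValueError("need one size per client and at least one client")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(weights[i] * size for weights, size in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical clients: the second holds three times as much data,
# so the global model is pulled toward its parameters.
global_model = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Weighting by dataset size is one simple response to the heterogeneity challenge named in the abstract; it can also let large clients dominate, which is among the fairness trade-offs the survey discusses.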
