Search Results (339)

Search Parameters:
Keywords = ensemble of machine learning (ML) algorithms

28 pages, 3324 KB  
Article
Predicting Flexural Strength of FRP-Strengthened Waste Aggregate Concrete Beams with Machine Learning: A Step Towards Sustainability
by Arissaman Sangthongtong, Burachat Chatveera, Gritsada Sua-iam, Adnan Nawaz, Tahir Mehmood, Suniti Suparp, Muhammad Salman, Muhammad Noman, Qudeer Hussain and Panumas Saingam
Buildings 2026, 16(8), 1512; https://doi.org/10.3390/buildings16081512 (registering DOI) - 12 Apr 2026
Abstract
Using waste materials in the manufacture of concrete has many environmental advantages. However, it can be difficult to estimate structural performance, especially when beams are reinforced with fiber-reinforced polymers (FRP). In order to provide a data-driven approach to sustainable structural design, this work explores the use of machine learning (ML) approaches to forecast the flexural strength of FRP-strengthened waste aggregate concrete beams. A total of 92 experimental datasets were used to develop and assess four ML algorithms: Random Forest (RF), Decision Tree (DT), Neural Network (NN), and Extreme Gradient Boosting (XGBoost). Regression plots, Taylor diagrams, statistical measures (R2, RMSE, MAE, MSE), and explainable AI (XAI) tools, including SHAP, LIME, and partial dependence plots (PDPs), were used to evaluate model performance. RF outperformed NN in terms of predictive accuracy, while XGBoost exhibited similar performance to RF. According to the SHAP analysis, the most significant predictors were beam length and fiber length, followed by steel tensile strength, fiber width, and concrete compressive strength. LIME offered local interpretability for individual predictions, while PDPs demonstrated optimal parameter ranges and a nonlinear feature-strength relationship. The findings provide engineers with a strong decision-support tool for designing green infrastructure, since they show that ensemble-based models can accurately represent the intricate, nonlinear dynamics controlling flexural behavior in sustainable FRP-strengthened waste aggregate concrete beams. Full article
(This article belongs to the Collection Advanced Concrete Materials in Construction)
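The SHAP feature ranking this abstract reports can be approximated in a few lines of scikit-learn. The sketch below is purely illustrative: the synthetic data stands in for the paper's 92 experimental records, the feature names and coefficients are assumptions, and permutation importance is used as a simple stand-in for the SHAP analysis the paper actually performs.

```python
# Hypothetical sketch: ranking predictors of beam flexural strength with a
# Random Forest. Permutation importance is a stand-in for SHAP here; the
# synthetic data and coefficients are NOT from the paper.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
features = ["beam_length", "fiber_length", "steel_tensile_strength",
            "fiber_width", "concrete_compressive_strength"]
X = rng.uniform(0.0, 1.0, size=(92, len(features)))
# Toy target: flexural strength dominated by the first two features.
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(0, 0.1, 92)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
imp = permutation_importance(model, X, y, n_repeats=10, random_state=0)
ranked = sorted(zip(features, imp.importances_mean), key=lambda t: -t[1])
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

On data generated this way, the permutation ranking recovers the dominant inputs, mirroring how the paper's SHAP values single out beam length and fiber length.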
40 pages, 8661 KB  
Article
Explainable Ensemble Machine Learning for the Prediction and Optimization of Pozzolanic Concrete Compressive Strength
by Sebghatullah Jueyendah and Elif Ağcakoca
Polymers 2026, 18(8), 933; https://doi.org/10.3390/polym18080933 - 10 Apr 2026
Abstract
Pozzolanic concrete demonstrates intricate, highly nonlinear material interactions that pose significant challenges for the accurate prediction of compressive strength (CS). This study introduces a novel, interpretable ensemble machine learning (ML) framework for predicting CS based on 759 mixture records encompassing cement, aggregates, supplementary cementitious materials (pozzolans), water/binder (W/B), superplasticizer, water, and curing age. Descriptive analysis and ANOVA were used to identify key predictors, followed by an 80/20 train–test split with 10-fold cross-validation to ensure robust and generalizable modeling. To further enhance model reliability, 5% of outliers were removed using an isolation forest algorithm, after which data were normalized and ensemble hyperparameters optimized. Among the evaluated models, the extra trees algorithm with standard scaling demonstrated the most stable generalization, achieving a coefficient of determination (R2) of 0.978 and a root mean square error (RMSE) of 4.197 MPa on the test set, and R2 = 0.966 (RMSE = 5.053 MPa) under 10-fold cross-validation. Feature importance, SHAP, and partial dependence analyses consistently demonstrated that W/B, curing age, and cement are the principal determinants of CS. Finally, multi-objective optimization generated high-strength, low-impact mixtures, confirming the framework’s effectiveness as a transparent decision-support tool for performance- and sustainability-oriented pozzolanic concrete design. This study is novel in combining interpretable ensemble ML with multi-objective optimization to simultaneously achieve precise CS prediction and the formulation of sustainable, performance-optimized pozzolanic concrete mixtures. Full article
(This article belongs to the Section Artificial Intelligence in Polymer Science)
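The evaluation pipeline described above (isolation-forest outlier removal at 5%, standard scaling, an extra-trees regressor, an 80/20 split, and 10-fold cross-validation) can be sketched as follows. The random mixture data and toy strength formula are placeholders, not the paper's 759 mixture records.

```python
# Minimal sketch of the abstract's pipeline; synthetic data only.
import numpy as np
from sklearn.ensemble import IsolationForest, ExtraTreesRegressor
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
X = rng.uniform(size=(759, 7))  # cement, aggregates, pozzolans, W/B, ... (stand-ins)
y = 60 - 40 * X[:, 3] + 10 * X[:, 6] + rng.normal(0, 2, 759)  # toy CS in MPa

# Drop ~5% of rows flagged as outliers, mirroring the paper's preprocessing.
mask = IsolationForest(contamination=0.05, random_state=42).fit_predict(X) == 1
X, y = X[mask], y[mask]

model = make_pipeline(StandardScaler(),
                      ExtraTreesRegressor(n_estimators=200, random_state=42))
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)
model.fit(X_tr, y_tr)
print(f"test R^2: {model.score(X_te, y_te):.3f}")
print(f"10-fold R^2: {cross_val_score(model, X, y, cv=10).mean():.3f}")
```

Reporting both the held-out score and the cross-validated score, as the paper does, guards against an accidentally favorable single split.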
27 pages, 17215 KB  
Article
Integrated Multi-Omics and Machine Learning Framework Identifies Diagnostic Signatures and Druggable Targets in Breast Cancer
by Zifu Wang, Jinqi Hou, Yimin Chen, Jundi Li and Sivakumar Vengusamy
Genes 2026, 17(4), 396; https://doi.org/10.3390/genes17040396 - 30 Mar 2026
Abstract
Background: Breast cancer (BC) is one of the most commonly diagnosed malignancies and a leading cause of cancer-related mortality among women worldwide, posing a substantial threat to women’s health. However, clinically robust diagnostic biomarkers with high sensitivity and specificity, as well as well-validated molecular targets for targeted therapy, remain limited. Methods: BC transcriptomic data from seven GEO datasets and the TCGA-BRCA cohort (n = 1231) were integrated for analysis. After batch-effect correction, candidate genes were screened through DEA, WGCNA, and PPI network analysis. An ensemble machine learning (ML) framework incorporating 127 algorithmic combinations was constructed, and SHAP analysis was applied to identify hub genes. Further analyses included functional enrichment, immune infiltration, miRNA regulatory network analysis, and SMR analysis. The expression patterns were validated using single-cell transcriptome data. Drug repositioning analysis and AI-assisted virtual screening were performed to prioritize compounds with favorable drug-like properties. The predicted binding modes of candidate compounds with CHEK1 were assessed by molecular docking. Results: Thirty core genes were obtained through differential expression, WGCNA, and PPI screening. Integrated ML (127 algorithms) determined the optimal model (AUC = 0.919), and SHAP identified nine feature genes, among which CHEK1 and KIF23 showed preliminary diagnostic potential across four external cohorts (AUC: 0.625–0.938). Functional enrichment indicated that both are enriched in the cell cycle and p53 pathways and closely associated with BRCA1/ATR; immune infiltration revealed significant correlations with macrophages and CD8+ T cells, with hsa-miR-15a-5p and hsa-miR-607 being common upstream regulatory miRNAs. SMR analysis supported a causal relationship between CHEK1 expression and BC genetic susceptibility (p_SMR < 0.05, p_HEIDI > 0.05); single-cell analysis confirmed its heterogeneous expression. AI-assisted virtual screening identified 25 A-grade computational candidate compounds from 171 candidates. Molecular docking suggested that Olaparib and LY294002 can form favorable interactions with the CHEK1 active pocket. Conclusions: The study identified CHEK1 as a key diagnostic gene for BC through 127 ML algorithms and SMR causal inference. By combining AI-assisted virtual screening and molecular docking, computational candidate compounds targeting CHEK1 were prioritized. These findings represent hypothesis-generating in silico predictions and require experimental validation before any therapeutic conclusions can be drawn. Full article
(This article belongs to the Section Molecular Genetics and Genomics)

24 pages, 4459 KB  
Article
AI-Driven Decision Support System for Proactive Risk Management in Construction Projects
by Jon Zorrilla, Sandra Seijo, Unai Arenal and Juan Ramón Mena
Intell. Infrastruct. Constr. 2026, 2(2), 4; https://doi.org/10.3390/iic2020004 - 26 Mar 2026
Abstract
Construction projects frequently face risks such as anomalies, delays, and bottlenecks, which can substantially affect timelines and budgets. This study proposes a machine learning (ML)-based framework for early identification of risks in construction projects, enabling pattern understanding and decision-making through clustering, outlier and bottleneck detection, and relevant variables identification. It uses a business process management (BPM) dataset of construction documents and applies clustering techniques to both numerical and mixed datasets to group documents with similar characteristics, enabling the detection of temporal deviations and the patterns behind them. Additionally, an ensemble anomaly detection model based on different algorithms is implemented to identify outliers through key variables, which may indicate hidden risks and planning errors. Explainable artificial intelligence (XAI) techniques are then used to analyse the importance of the variables, supporting the identification and analysis of bottlenecks that may compromise project success. The results reveal an F1 score of 0.73 in bottleneck detection using three understandable decision rules, a 6% rate of anomalies within the dataset, and three distinct project clusters. This approach enables accurate and timely detection of risks while providing valuable insights for decision-making, improving risk management, and optimising project execution in the architecture, engineering and construction (AEC) industry. Full article

14 pages, 626 KB  
Systematic Review
Machine Learning Models for Predicting Bleeding Risk in Anticoagulated Patients with Atrial Fibrillation and Venous Thromboembolism: A Comparative Evidence Synthesis
by Winnie Z. Y. Teo, Maggie Wing Yin Wong, Fang Jin Lim, Emmeliene Su-Min Ong, Nesaretnam Barr Kumarakulasinghe and Eng Soo Yap
J. Clin. Med. 2026, 15(6), 2370; https://doi.org/10.3390/jcm15062370 - 20 Mar 2026
Abstract
Background: Accurate prediction of bleeding events in patients receiving oral anticoagulants remains a key challenge in the management of atrial fibrillation (AF) and venous thromboembolism (VTE). Machine learning (ML) algorithms have emerged as powerful tools that capture complex, nonlinear interactions among risk factors, potentially offering superior accuracy. Objectives: To synthesize evidence comparing ML-based bleeding risk models with conventional clinical scores in anticoagulated AF and VTE populations. Methods: We conducted a systematic review with narrative synthesis of studies published between 2015 and 2025 applying ML algorithms to predict bleeding events in anticoagulated AF or VTE patients. Results: Thirteen studies were identified (seven AF and six VTE), including 464,523 participants in total. ML algorithms such as random forest (RF), extreme gradient boosting (XGBoost), and neural networks consistently outperformed traditional tools. In AF, AUCs ranged from 0.64 to 0.76 compared to 0.52–0.61 for HAS-BLED. In VTE, ML models achieved 0.59–0.91 versus 0.61–0.65 for RIETE or VTE-BLEED. Deep learning ensembles reached the highest AUCs (>0.8). Conclusions: ML-based bleeding risk models demonstrated statistically superior discrimination compared to established scores in both AF and VTE contexts, but effect sizes were modest (ΔAUC 0.05–0.15) and clinical utility remains uncertain. Broader validation, calibration assessment, and demonstration of impact on clinical outcomes are necessary before routine adoption. Full article
(This article belongs to the Special Issue Thrombosis and Haemostasis: Clinical Advances)

26 pages, 2590 KB  
Article
A Machine Learning Framework for the Reconstruction of Composite Fatigue and Fracture Properties: A Synthetic Data Study
by Saurabh Tiwari and Aman Gupta
Materials 2026, 19(6), 1131; https://doi.org/10.3390/ma19061131 - 14 Mar 2026
Abstract
This study presents a machine learning framework for the reconstruction of fatigue life and fracture toughness in natural fiber-reinforced composites, evaluating the predictive accuracy of six regression algorithms—Random Forest, Gradient Boosting, Support Vector Machine, Neural Network, Ridge Regression, and Lasso Regression—using a controlled synthetic dataset of 600 samples generated from established Basquin fatigue and Rule of Mixtures fracture equations, incorporating stochastic noise calibrated to experimental scatter (CV = 15–50%), with log-normal noise standard deviation of 0.20 for fatigue life and Gaussian noise standard deviation of 0.15 for fracture toughness. The dataset encompasses eight natural fiber types (flax, jute, sisal, hemp, bamboo, coconut, banana, and pineapple) and five matrix systems (epoxy, polyester, PLA, vinyl ester, and polyurethane). Models were evaluated using a 70-15-15 train–validation–test split with 5-fold cross-validation and exhaustive grid search hyperparameter optimisation. Gradient Boosting achieved R2 = 0.93 for fatigue life and Stacking Ensemble achieved R2 = 0.87 for fracture toughness, representing 97% and 89% of their respective noise-ceiling values (theoretical maximum R2 of 0.96 and 0.98 given the programmed noise levels). The ML models perform supervised function approximation—learning to reconstruct the programmed generation equations rather than discovering novel physical composite behaviour—and function as automated surrogates for the governing equations. Feature importance analysis identified engineered composite indicators, stress amplitude, and fiber length as the most influential parameters. The framework provides a reproducible ML evaluation pipeline as a methodological template for future experimental composite studies. Full article
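The synthetic-data generation described above can be sketched directly from the Basquin relation sigma_a = sigma_f' * (2N)^b, inverted for cycles to failure and perturbed with log-normal scatter of standard deviation 0.20, the noise level the abstract quotes for fatigue life. The coefficient values below are illustrative assumptions, not taken from the paper.

```python
# Sketch: noisy synthetic fatigue-life samples from the Basquin relation.
# sigma_f and b are hypothetical; only the noise level matches the abstract.
import numpy as np

rng = np.random.default_rng(1)
sigma_f, b = 800.0, -0.1                       # assumed Basquin coefficient/exponent
stress = rng.uniform(100.0, 400.0, size=600)   # stress amplitude in MPa

# Invert sigma_a = sigma_f * (2N)^b for N, then apply log-normal scatter.
n_clean = 0.5 * (stress / sigma_f) ** (1.0 / b)
n_noisy = n_clean * rng.lognormal(mean=0.0, sigma=0.20, size=600)

print(f"median cycles to failure: {np.median(n_noisy):.3e}")
```

Because the generating equation is known exactly, the best achievable R^2 of any model is capped by the injected noise, which is what the paper's "noise ceiling" comparison measures.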

16 pages, 625 KB  
Article
Benchmarking Training Emissions of Regression Models for Vehicle CO2 Prediction
by Mahmut Turhan, Murat Emeç and Muzaffer Ertürk
Sustainability 2026, 18(6), 2830; https://doi.org/10.3390/su18062830 - 13 Mar 2026
Abstract
The urgency of climate action has intensified the use of machine learning (ML) to predict vehicular CO2 emissions; however, the training of machine learning models also generates computational emissions that are seldom reported. This study addresses a paradox central to Green AI: can carbon-intensive algorithms be justified for predicting carbon emissions? Using a public dataset of 7385 light-duty vehicles, we trained nine widely used regression models spanning simple linear baselines, polynomial and regularised linear methods, tree-based learners, ensembles, and a neural network. All experiments were instrumented with CodeCarbon to quantify real-time training footprints under a grid carbon intensity of 450 g CO2/kWh. Across models, test performance ranged from R2 = 0.72 to 0.99, yet training emissions varied by four orders of magnitude, from 0.001 g CO2 (simple linear regression) to 2.3 g CO2 (XGBoost). Although XGBoost achieved the highest accuracy (R2 = 0.9947), it emitted approximately 2300× more CO2 than regularised polynomial linear models for only a 0.39-point gain in R2. Pareto analysis identifies Lasso and Ridge regression with degree-4 polynomial features as sustainability-optimal, reaching R2 = 0.9908 at ~0.004 g CO2. To unify predictive and environmental efficiency, we introduce Accuracy-per-Gram (APG = R2/CO2) and Marginal Emissions Cost (MEC = ΔCO2/ΔR2), demonstrating a steep efficiency cliff beyond regularised linear models. At the fleet scale (100 million vehicles with daily retraining), algorithm choice implies ~84 t CO2/year for XGBoost versus ~0.15 t for Lasso, highlighting the potential climate cost of marginal accuracy gains. We provide a reproducible carbon-tracking pipeline, Green-AI evaluation metrics, and deployment guidance, arguing that computational sustainability must co-determine model selection for emissions-related ML systems. 
Most critically, we identify a clear accuracy–carbon emission Pareto frontier, demonstrating that regularised polynomial linear models lie on the sustainability-optimal boundary, while widely used ensemble methods such as XGBoost sit beyond an “efficiency cliff,” where marginal accuracy improvements incur disproportionately high carbon costs. Full article
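The two Green-AI metrics introduced above follow directly from their definitions: Accuracy-per-Gram APG = R^2 / CO2 and Marginal Emissions Cost MEC = dCO2 / dR^2 between two candidate models. The numbers below are the figures quoted in the abstract (the regularised polynomial model versus XGBoost).

```python
# Green-AI metrics from the abstract, computed from their definitions.
def apg(r2: float, co2_g: float) -> float:
    """Accuracy per gram of training CO2."""
    return r2 / co2_g

def mec(r2_a: float, co2_a: float, r2_b: float, co2_b: float) -> float:
    """Extra grams of CO2 paid per unit of R^2 gained moving from model a to b."""
    return (co2_b - co2_a) / (r2_b - r2_a)

lasso = {"r2": 0.9908, "co2": 0.004}   # regularised degree-4 polynomial model
xgb = {"r2": 0.9947, "co2": 2.3}       # XGBoost

print(f"APG(lasso) = {apg(lasso['r2'], lasso['co2']):.1f}")
print(f"APG(xgb)   = {apg(xgb['r2'], xgb['co2']):.2f}")
print(f"MEC lasso->xgb = "
      f"{mec(lasso['r2'], lasso['co2'], xgb['r2'], xgb['co2']):.0f} g CO2 per unit R^2")
```

The steep MEC between these two points is the "efficiency cliff" the abstract refers to: each additional point of R^2 beyond the regularised models costs orders of magnitude more carbon.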

31 pages, 5209 KB  
Review
AI-Driven Fault Detection and O&M for Wind Turbine Drivetrains: A Review of SCADA, CMS and Digital Twin Integration
by Ning Jia, Jiangzhe Feng, Zongyou Zuo, Zhiyi Liu, Tengyuan Wang, Chang Cai and Qingan Li
Energies 2026, 19(5), 1370; https://doi.org/10.3390/en19051370 - 7 Mar 2026
Abstract
The rapid expansion of wind energy has increased the operational complexity of wind turbines, where component degradation, environmental variability, and maintenance decisions are tightly coupled. Artificial intelligence (AI) has been widely applied to support fault detection and operation and maintenance (O&M), yet many existing studies remain fragmented and insufficiently address practical challenges such as heterogeneous data, sparse fault labels, and cross-site generalization. This review provides an engineering-oriented synthesis of AI-based methods for wind turbine fault detection and O&M, focusing on drivetrain diagnostics as a representative application. The literature is organized along an end-to-end O&M workflow, including SCADA-based condition monitoring, component-level fault diagnosis, health assessment and remaining useful life estimation, multi-modal blade inspection, and DT (Digital Twin) integration. Traditional ML (machine learning), ensemble methods, deep learning, physics-informed learning, and transfer learning are reviewed with respect to their data requirements, operational assumptions, and deployment constraints. Beyond algorithmic performance, this review discusses data governance, alarm design, model updating, and interpretability, and summarizes public datasets and emerging data resources. The aim is to bridge methodological advances and practical O&M requirements, supporting reliable and deployable AI applications in wind energy systems. Full article
(This article belongs to the Section A3: Wind, Wave and Tidal Energy)

35 pages, 5289 KB  
Article
Sentiment Classification of Amazon Product Reviews Based on Machine and Deep Learning Techniques: A Comparative Study
by Eman Daraghmi and Noora Zyadeh
Future Internet 2026, 18(3), 138; https://doi.org/10.3390/fi18030138 - 7 Mar 2026
Abstract
Sentiment classification plays a crucial role in analyzing customer feedback to identify market trends, enhance product recommendations, and improve customer satisfaction. This study focuses on sentiment analysis of Amazon reviews using two major datasets—Fine Food Reviews and Unlocked Mobile Reviews—which exhibit label imbalance. To address this challenge, both oversampling and undersampling techniques were applied to balance the datasets. Various machine learning (ML) algorithms, including Random Forest (RF), Logistic Regression (LR), Support Vector Machine (SVM), Naïve Bayes (NB), and Gradient Boosting Machine (GBM), as well as deep learning (DL) models such as Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and transformer-based models like RoBERTa, were implemented. After data cleaning and preprocessing, models were trained, and performance was evaluated. The results indicate that oversampling significantly enhances classification accuracy, particularly for the Fine Food dataset. Among ML models, Random Forest achieved the highest accuracy due to its ensemble approach and robustness in handling high-dimensional data. DL models, particularly RoBERTa, also demonstrated superior performance owing to their capacity to capture contextual dependencies. The findings emphasize the importance of data balancing for optimal sentiment analysis and contribute valuable insights toward advancing automated opinion classification in e-commerce applications. Full article
(This article belongs to the Section Big Data and Augmented Intelligence)
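The oversampling step this abstract relies on can be sketched with scikit-learn's resampling utility: minority-class reviews are drawn with replacement until the classes are balanced. The tiny label array is a placeholder for the Amazon review datasets, and this is one simple balancing scheme, not necessarily the exact one the authors used.

```python
# Minimal oversampling sketch; placeholder data, not the Amazon datasets.
import numpy as np
from sklearn.utils import resample

X = np.arange(10).reshape(-1, 1)              # stand-in review features
y = np.array([1, 1, 1, 1, 1, 1, 1, 1, 0, 0])  # imbalanced sentiment labels

X_min, y_min = X[y == 0], y[y == 0]
X_maj, y_maj = X[y == 1], y[y == 1]
# Draw minority samples with replacement up to the majority-class count.
X_up, y_up = resample(X_min, y_min, replace=True,
                      n_samples=len(y_maj), random_state=0)
X_bal = np.vstack([X_maj, X_up])
y_bal = np.concatenate([y_maj, y_up])
print("class counts after oversampling:",
      {c: int((y_bal == c).sum()) for c in (0, 1)})
```

Balancing must be applied only to the training split; oversampling before the train/test split leaks duplicated minority samples into the test set and inflates accuracy.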

82 pages, 6468 KB  
Article
Correction Functions and Refinement Algorithms for Enhancing the Performance of Machine Learning Models
by Attila Kovács, Judit Kovácsné Molnár and Károly Jármai
Automation 2026, 7(2), 45; https://doi.org/10.3390/automation7020045 - 6 Mar 2026
Abstract
The aim of this study is to investigate and demonstrate the role of correction functions and optimisation-based refinement algorithms in enhancing the performance of machine learning models, particularly in predictive anomaly detection tasks applied in industrial environments. The performance of machine learning models is highly dependent on the quality of data preprocessing, model architecture, and post-processing methodology. In many practical applications—particularly in time-series forecasting and anomaly detection—the conventional training pipeline alone is insufficient, because model uncertainty, structural bias and the handling of rare events require specialised post hoc calibration and refinement mechanisms. This study provides a systematic overview of the role of correction functions (e.g., Principal Component Analysis (PCA), Squared Prediction Error (SPE)/Q-statistics, Hotelling’s T2, Bayesian calibration) and adaptive improvement algorithms (e.g., Genetic Algorithms (GA), Particle Swarm Optimisation (PSO), Simulated Annealing (SA), Gaussian Mixture Model (GMM) and ensemble-based techniques) in enhancing the performance of machine learning pipelines. The models were trained on a real industrial dataset compiled from power network analytics and harmonic-injection-based loading conditions. Model validation and equipment-level testing were performed using a large-scale harmonic measurement dataset collected over a five-year period. The reliability of the approach was confirmed by comparing predicted state transitions with actual fault occurrences, demonstrating its practical applicability and suitability for integration into predictive maintenance frameworks. The analysis demonstrates that correction functions introduce deterministic transformations in the data or error space, whereas improvement algorithms apply adaptive optimisation to fine-tune model parameters or decision boundaries. 
The combined use of these approaches significantly reduces overfitting, improves predictive accuracy and lowers false alarm rates. This work introduces the concept of an Organically Adaptive Predictive (OAP) ML model. The proposed model presents organic adaptivity, continuously adjusting its predictive behaviour in response to dynamic variations in network loading and harmonic spectrum composition. The introduced terminology characterises the organically emergent nature of the adaptive learning mechanism. Full article

25 pages, 3080 KB  
Review
Machine Learning for Alloy Design: A Property-Oriented Review
by Shamim Pourrahimi and Soroosh Hakimian
Alloys 2026, 5(1), 7; https://doi.org/10.3390/alloys5010007 - 6 Mar 2026
Abstract
Machine learning (ML) is becoming an established part of alloy research, offering new ways to link composition, processing routes, and microstructure with measured properties. In this work, recent studies using ML for predicting or optimizing alloy behavior are reviewed, covering mechanical, corrosion, phase-related, and physical properties. Unlike previous reviews organized by alloy system or modeling approach, this review is structured by target property (mechanical, corrosion, phase/structure, and physical), which helps identify the input features commonly used to model each property and highlights existing gaps in data and validation. For each study, the main property of interest, dataset features, model type, algorithm choice, use of hyperparameter tuning, and validation strategy were examined. Comparing these reports shows that ensemble models such as random forest and XGBoost, together with deep neural networks, usually perform better than linear approaches. At the same time, issues related to small datasets and inconsistent reporting remain major challenges. Attention is also drawn to new directions, particularly physics-based learning and multi-objective optimization, that are changing how ML is applied in materials design. Overall, this review summarizes current practices and outlines areas where closer integration of data-driven and experimental methods could accelerate the development of next-generation alloys. Full article

25 pages, 918 KB  
Review
Parkinson’s Disease Detection Using Machine Learning Algorithms: A Comprehensive Review
by Jelica Cincović, Miloš Cvetanović, Milica Djurić-Jovičić, Nebojsa Bacanin and Boško Nikolić
Algorithms 2026, 19(3), 193; https://doi.org/10.3390/a19030193 - 4 Mar 2026
Abstract
Parkinson’s disease (PD) is a progressive neurodegenerative disorder in which early detection remains a major clinical challenge due to heterogeneous motor and non-motor manifestations and the lack of reliable biomarkers. In recent years, machine learning (ML) and deep learning (DL) methods have been increasingly investigated as decision-support tools for PD screening using diverse clinical and behavioral data. This review synthesizes PD detection studies published between 2017 and 2025, systematically analyzing 32 representative works across multiple modalities, including MRI, PET, EEG, REM sleep biomarkers, voice recordings, gait signals, handwriting/drawing tasks, and finger-tapping measurements. Across the reviewed literature, high classification performance is frequently reported, with CNN-based and hybrid DL architectures achieving particularly strong results in imaging and time-series settings, while classical ML approaches such as SVM and ensemble models remain competitive for engineered feature-based datasets. However, the review also reveals major barriers to reliable translation, including small datasets, inconsistent evaluation protocols, limited external validation, and the risk of performance inflation caused by non-subject-independent data splitting. Overall, this review provides a structured and modality-oriented reference of algorithms, datasets, and performance trends, while highlighting key methodological gaps and practical priorities for developing robust and clinically deployable PD detection systems. Full article

17 pages, 1309 KB  
Article
Path Loss Considering Atmospheric Impact in 5G Networks: A Comparison of Machine Learning Models
by Vasileios P. Rekkas, Leandro dos Santos Coelho, Viviana Cocco Mariani, Adamantini Peratikou and Sotirios K. Goudos
Technologies 2026, 14(3), 151; https://doi.org/10.3390/technologies14030151 - 2 Mar 2026
Abstract
Accurate estimation of wireless propagation characteristics is essential for guiding the design and deployment of fifth-generation (5G) communication systems. As network demand increases and 5G infrastructure is introduced in progressive phases, reliable path loss (PL) prediction models are required to refine deployment strategies and improve network efficiency. Conventional propagation models frequently display limited flexibility when applied to diverse environmental conditions and often entail considerable computational expense, reducing their practicality for large-scale 5G planning. Recent developments in data-centric artificial intelligence (AI) have enabled more adaptive and analytically powerful approaches to propagation modeling, resulting in notable gains in PL prediction accuracy. This study employs a comprehensive dataset produced using the NYUSIM channel simulator, integrating a wide spectrum of atmospheric parameters and seasonal variations within South Asian urban microcell environments, complemented by broad empirical observations. The core objective is to construct, optimize, and evaluate four machine learning (ML) models capable of accurately predicting PL at high-frequency bands critical to 5G performance. A fully automated hyperparameter tuning pipeline, based on the Optuna framework, is applied to twelve regression algorithms, including advanced ensemble methods, regularized linear techniques, and classical baseline models. Performance assessment emphasizes predictive reliability, stability, and cross-model generalization. Furthermore, statistical analysis utilizing bootstrap confidence intervals and paired t-tests indicates that all ML methods perform equivalently (p > 0.4), while SHapley Additive exPlanations (SHAP) analysis across all models shows a consistent feature importance distribution, corroborating the statistical results. To showcase the superiority of the ML approaches, a comparison with conventional free-space PL modeling methods is presented, with the AI methodology demonstrating robust performance across seasonal variations and a 95.3% improvement. Full article
(This article belongs to the Section Information and Communication Technologies)
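The conventional free-space baseline against which the ML models are compared can be sketched with the standard Friis-derived path loss formula. The function below is a minimal illustration only; the 28 GHz carrier and 0.2 km distance are hypothetical values chosen for the example, not figures from the article.

```python
import math

def free_space_path_loss_db(distance_km: float, frequency_mhz: float) -> float:
    """Free-space path loss: FSPL(dB) = 20*log10(d_km) + 20*log10(f_MHz) + 32.44."""
    return 20 * math.log10(distance_km) + 20 * math.log10(frequency_mhz) + 32.44

# Illustrative: loss over a 0.2 km urban-microcell link at a 28 GHz mmWave carrier.
print(round(free_space_path_loss_db(0.2, 28000.0), 1))  # → 107.4
```

Because this baseline depends only on distance and frequency, it cannot reflect the atmospheric and seasonal effects the NYUSIM-derived dataset captures, which is where data-driven regressors gain their advantage.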

38 pages, 3811 KB  
Article
Interpretable Machine Learning for Compressive Strength Prediction of Fly Ash-Based Geopolymer Concrete
by Farnaz Ahadian, Ümit Işıkdağ, Gebrail Bekdaş, Sinan Melih Nigdeli, Celal Cakiroglu and Zong Woo Geem
Sustainability 2026, 18(5), 2227; https://doi.org/10.3390/su18052227 - 25 Feb 2026
Viewed by 335
Abstract
Fly ash-based geopolymer concrete (GPC) is a sustainable alternative to conventional cementitious materials; however, its compressive strength is governed by complex and highly correlated mixture parameters, making experimental optimization expensive and data-driven modeling challenging. While machine learning (ML) techniques have been widely applied to predict GPC strength, most studies prioritize predictive accuracy without explicitly addressing multicollinearity among input variables, which can distort feature importance, reduce model stability, and limit engineering interpretability. This study proposes a multicollinearity-integrated and interpretable ML framework that systematically embeds correlation diagnostics and structured feature screening within the modeling pipeline rather than treating interpretability as a post-processing step. Multiple conventional and ensemble learning algorithms were comparatively evaluated using cross-validation to ensure generalization robustness. The proposed framework achieved a maximum coefficient of determination (R2) of 0.96 with low prediction error, outperforming baseline regression models while demonstrating improved stability under correlated input conditions. Unlike existing studies that rely solely on black-box optimization, the integrated interpretability analysis revealed physically consistent dominance of curing temperature, alkali content, and water-related parameters in governing strength development. By explicitly coupling predictive performance with multicollinearity mitigation and engineering-oriented interpretability, this work advances beyond accuracy-driven ML applications and provides a robust and transparent decision-support tool for sustainable geopolymer mix design. Full article
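The correlation diagnostics and structured feature screening described above can be sketched with a variance inflation factor (VIF) filter. The NumPy implementation below is an assumed reconstruction of the general technique, not the authors' pipeline; the cutoff of 10 is a common rule of thumb rather than a value from the paper, and all function names are illustrative.

```python
import numpy as np

def vif(X: np.ndarray) -> np.ndarray:
    """VIF of each column: 1 / (1 - R^2) from regressing it on the other columns."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for i in range(p):
        y = X[:, i]
        A = np.column_stack([np.ones(n), np.delete(X, i, axis=1)])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        out[i] = 1.0 / max(1.0 - r2, 1e-12)
    return out

def screen_features(X: np.ndarray, names: list[str], threshold: float = 10.0) -> list[str]:
    """Iteratively drop the feature with the largest VIF until all fall below threshold."""
    keep = list(range(X.shape[1]))
    while len(keep) > 1:
        v = vif(X[:, keep])
        worst = int(np.argmax(v))
        if v[worst] <= threshold:
            break
        keep.pop(worst)
    return [names[i] for i in keep]
```

Embedding such a filter before model fitting, rather than inspecting importances afterwards, is what distinguishes the paper's multicollinearity-integrated framework from post hoc interpretability.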

13 pages, 1443 KB  
Article
Early Prediction of 90-Day Periprosthetic Joint Infection After Hip Arthroplasty for Proximal Femur Fracture Using Machine Learning: Development and Temporal Validation of a Predictive Model
by Nicolò Giuseppe Biavardi, Francesco Pezone, Federico Morlini, Mattia Alessio-Mazzola, Valerio Pace, Pierluigi Antinolfi, Giacomo Placella and Vincenzo Salini
J. Clin. Med. 2026, 15(4), 1668; https://doi.org/10.3390/jcm15041668 - 23 Feb 2026
Viewed by 451
Abstract
Background: Periprosthetic joint infection (PJI) after hip arthroplasty for proximal femur fracture is a severe complication, and early postoperative identification remains challenging. This study developed and validated machine learning (ML) models for the early prediction of 90-day EBJIS 2021 “confirmed” PJI using routinely available perioperative data. Methods: We performed a single-center retrospective study including 1182 consecutive adults undergoing primary hip arthroplasty for proximal femur fracture (2015–2022). Forty-seven perioperative candidate predictors were extracted, including early postoperative laboratory values (postoperative day 1–2 and maxima within 72 h). Six algorithms were trained and compared (logistic regression, random forest, support vector machine, multilayer perceptron, XGBoost, and stacking ensemble) using a stratified 80/20 training–test split with 10-fold cross-validation, grid-search hyperparameter tuning, and class weighting. A sensitivity-prioritizing classification threshold was derived using training data only and applied unchanged to evaluation cohorts. Uncertainty was estimated via 1000 bootstrap iterations. Calibration was assessed using the Brier score and calibration intercept/slope. Temporal validation was conducted in a same-center 2023 cohort (n = 147). Model explainability used SHAP. Results: EBJIS-confirmed 90-day PJI occurred in 58/1182 (4.9%) patients. In held-out testing, the final XGBoost model demonstrated good discrimination (AUC 0.889, 95% CI 0.804–0.960) with good overall calibration (Brier score 0.043). Using a prespecified sensitivity-prioritizing threshold selected in the training set, test-set sensitivity was 100%, specificity 58.5%, PPV 11.4%, and NPV 100%. The stacking ensemble yielded the highest discrimination (AUC 0.937; 95% CI 0.89–0.98). In temporal validation (same-center 2023 cohort; n = 147), model performance remained stable (AUC 0.892; sensitivity 85.7%; NPV 99.1% at the prespecified threshold). Calibration was favorable in the development cohort (Brier 0.041; intercept −0.04; slope 0.96) and in 2023 (Brier 0.038; intercept −0.06; slope 0.94). SHAP identified postoperative C-reactive protein, operative duration, body mass index, ASA class, and serum sodium as the most influential predictors. Conclusions: ML models, particularly XGBoost, supported early postoperative risk stratification for 90-day EBJIS-confirmed PJI after fracture-related hip arthroplasty, with a consistently high NPV and stable calibration in a temporally independent same-center cohort. Prospective multi-center validation and impact evaluation are needed before clinical implementation. Full article
(This article belongs to the Special Issue Clinical Advances in Trauma and Orthopaedic Surgery)
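The sensitivity-prioritizing threshold described in the Methods can be sketched as follows: on training data only, pick the largest probability cutoff whose recall on positives meets a target, which keeps specificity as high as the sensitivity constraint allows. This is an assumed reconstruction of the general technique, not the authors' exact procedure; the function name and the 0.95 default target are illustrative.

```python
import numpy as np

def sensitivity_first_threshold(y_true, y_prob, target_sensitivity: float = 0.95) -> float:
    """Largest cutoff whose training-set sensitivity meets the target.

    Scans candidate cutoffs (the positive-class scores) from highest to
    lowest and returns the first one that achieves the required recall.
    """
    y_true = np.asarray(y_true)
    pos_scores = np.asarray(y_prob)[y_true == 1]
    for t in np.sort(pos_scores)[::-1]:
        if np.mean(pos_scores >= t) >= target_sensitivity:
            return float(t)
    return float(pos_scores.min())
```

Fixing this cutoff on the training set and applying it unchanged to the test and 2023 cohorts, as the study does, avoids the optimistic bias of re-tuning the threshold on evaluation data.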
