Search Results (341)

Search Parameters:
Keywords = Bayesian-Random Forest model

19 pages, 297 KB  
Article
Integrated Biomarker–Volumetric Profiling Defines Neurodegenerative Subtypes and Predicts Neuroaxonal Injury in Multiple Sclerosis Based on Bayesian and Machine Learning Analyses
by Alin Ciubotaru, Roxana Covali, Cristina Grosu, Daniel Alexa, Laura Riscanu, Bîlcu Robert-Valentin, Radu Popa, Gabriela Dumachita Sargu, Cristina Popa, Cristiana Filip, Laura-Elena Cucu, Albert Vamanu, Victor Constantinescu and Emilian Bogdan Ignat
Biomedicines 2026, 14(1), 42; https://doi.org/10.3390/biomedicines14010042 - 24 Dec 2025
Viewed by 55
Abstract
Background: The clinical–radiological paradox in multiple sclerosis (MS) underscores the need for biomarkers that better reflect neurodegenerative pathology. Serum neurofilament light chain (sNfL) is a dynamic marker of neuroaxonal injury, while brain volumetry provides structural assessment of disease impact. However, the precise link between sNfL and regional atrophy patterns, as well as their combined utility for patient stratification and prediction, remains underexplored. Objective: This study aimed to establish a multimodal biomarker framework by integrating sNfL with comprehensive volumetric MRI to define neurodegenerative endophenotypes and predict neuroaxonal injury using Bayesian inference and machine learning. Methods: In a cohort of 57 MS patients, sNfL levels were measured using single-molecule array (Simoa) technology. Brain volumes for 42 regions were quantified via automated deep learning segmentation (mdbrain software). We employed (1) Bayesian correlation to quantify evidence for sNfL–volumetric associations; (2) mediation analysis to test whether grey matter atrophy mediates the EDSS–sNfL (Expanded Disability Status Scale) relationship; (3) unsupervised K-means clustering to identify patient subtypes based on combined sNfL–volumetric profiles; and (4) supervised machine learning (Elastic Net and Random Forest regression) to predict sNfL from volumetric features. Results: Bayesian analysis revealed strong evidence linking sNfL to total grey matter volume (r = −0.449, BF10 = 0.022) and lateral ventricular volume (r = 0.349, BF10 = 0.285). Mediation confirmed that grey matter atrophy significantly mediates the relationship between EDSS and sNfL (indirect effect = 0.45, 95% CI [0.20, 0.75]). Unsupervised clustering identified three distinct endophenotypes: “High Neurodegeneration” (elevated sNfL, severe atrophy, high disability), “Moderate Injury,” and “Benign Volumetry” (low sNfL, preserved volumes, mild disability). 
Supervised models predicted sNfL with high accuracy (R² = 0.65), identifying total grey matter volume, ventricular volume, and age as top predictors. Conclusions: This integrative multi-method analysis demonstrates that sNfL is robustly associated with global grey matter and ventricular volumes, and that these measures define clinically meaningful neurodegenerative subtypes in MS. Machine learning confirms that a concise set of volumetric features can effectively predict neuroaxonal injury. These findings advance a pathobiology-driven subtyping framework and provide a validated model for using routine MRI volumetry to assess neuroaxonal health, with implications for prognosis and personalised therapeutic strategies.
19 pages, 1682 KB  
Article
Personalized Mortality Risk Stratification in ALD- and MASLD-Related Hepatocellular Carcinoma Using a Machine Learning Approach
by Miguel Suárez, Sergio Gil-Rojas, Pablo Martínez-Blanco, Ana M. Torres, Natalia Martínez-García, Miguel Torralba and Jorge Mateo
Metabolites 2026, 16(1), 8; https://doi.org/10.3390/metabo16010008 - 22 Dec 2025
Viewed by 148
Abstract
Background/Objectives: The epidemiology of hepatocellular carcinoma (HCC) is shifting, with alcohol-associated liver disease (ALD) and metabolic dysfunction-associated steatotic liver disease (MASLD) becoming leading causes in developed countries. This study aimed to identify the main prognostic factors for mortality at diagnosis in HCC patients with ALD and MASLD using machine learning (ML) algorithms. Random Forest (RF) was proposed as reference method. Methods: A multicenter, retrospective cohort of 91 patients diagnosed with HCC due to ALD or MASLD between 2008 and 2023 was analyzed. Demographic, clinical, and biochemical variables were collected. Several ML algorithms were implemented: RF, Support Vector Machine, Decision Tree, Gaussian Naïve Bayes, and K-Nearest Neighbors. Bayesian optimization was applied for hyperparameter tuning. Model performance was evaluated using standard metrics including AUC, precision, recall, and F1 score. Results: RF achieved the highest performance across all metrics (AUC: 0.91, precision: 90.67%, F1 score: 91.05%), surpassing other algorithms by over 10%. The most relevant variables for mortality prediction were serum albumin, CRP/albumin ratio, BCLC stage, and ALBI score. MELD 3.0 showed superior predictive value compared to other MELD variants. Conversely, AFP had limited prognostic utility in this population. Conclusions: In HCC patients related to ALD and MASLD, liver function and systemic inflammation markers outperform tumor markers for early mortality prediction. In this cohort, RF offered the highest predictive performance among the evaluated algorithms and may support personalized prognosis in ALD- and MASLD-related HCC; however, external validation in independent datasets is required before broad clinical implementation. Full article
(This article belongs to the Special Issue Liver Injury and Regeneration—Metabolic Research)
26 pages, 13352 KB  
Article
Robust Rainfall Gap-Filling in Coastal Arid Regions Using Ensemble Fusion Models
by Badar Al-Jahwari, Ghazi Al-Rawas, Mohammad Reza Nikoo, Talal Etri and Jens Grundmann
Hydrology 2026, 13(1), 1; https://doi.org/10.3390/hydrology13010001 - 20 Dec 2025
Viewed by 201
Abstract
In arid regions, the challenges posed by rainfall data availability, missing data, and limited historical records significantly affect hydrological modeling studies and climate change assessments. For various hydrology applications, it is essential to implement advanced techniques in order to obtain a complete dataset series. This study explores the implementation of multiple machine learning techniques to address the complexity of filling daily rainfall data for 88 rainfall stations in the Al-Batinah region of Oman, covering the period from 1993 to 2024. The machine learning models applied in this study include Multiple Linear Regression (MLR), Random Forest (RF), K-Nearest Neighbors (KNN), Support Vector Regression (SVR), and Gradient-Boosting Trees (GBT). A non-clustering approach is used as well as a clustering approach as part of the methodology. In the first method, rainfall stations are not clustered, while in the second method, optimal cluster numbers are calculated using K-means clustering. The target station utilizes the nearby rainfall station data located within a 50 km radius with the highest correlation coefficients. A novel Ensemble Fusion Model has been applied to improve the efficacy of multiple predictive models, including the RF Fusion Model (RF) and Multi-Model Super Ensemble Fusion Model (MMSE). The estimation approaches are further enhanced and evaluated by Bayesian optimization of hyperparameters, dataset imputation utilizing Multiple Imputation by Chained Equations (MICE), and Leave-One-Year-Out (LOYO) cross-validation. Based on the results, it can be concluded that the GBT model performs the best in both cluster and non-cluster approaches. A further benefit of applying Ensemble Fusion Models to rainfall gap-filling methods is that the coefficient of determination (R2) for clustering and non-clustering approaches increases to 22.5% and 22.2%, respectively. Full article
(This article belongs to the Section Hydrological and Hydrodynamic Processes and Modelling)
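The Leave-One-Year-Out (LOYO) cross-validation named in this abstract can be illustrated with a minimal sketch. It assumes each daily record carries a calendar year and that one fold holds out all records of one year; the function name `leave_one_year_out` and the toy years are hypothetical and are not taken from the article.

```python
from collections import defaultdict


def leave_one_year_out(years):
    """Yield (held_out_year, train_indices, test_indices) for each distinct year.

    `years` holds one calendar year per record (hypothetical input layout).
    """
    # Group record indices by the year they belong to.
    by_year = defaultdict(list)
    for idx, year in enumerate(years):
        by_year[year].append(idx)

    # Each distinct year becomes the test fold exactly once.
    for held_out in sorted(by_year):
        test_idx = by_year[held_out]
        train_idx = [
            i for y in sorted(by_year) if y != held_out for i in by_year[y]
        ]
        yield held_out, train_idx, test_idx


if __name__ == "__main__":
    toy_years = [1993, 1993, 1994, 1995, 1995]
    for year, train, test in leave_one_year_out(toy_years):
        print(year, train, test)
```

Splitting by year rather than by random rows keeps all days of one season together, which avoids leaking neighbouring days of the same rainfall event into both training and test folds.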
25 pages, 5917 KB  
Article
Explainable Machine Learning-Based Prediction of Compressive Strength in Sustainable Recycled Aggregate Self-Compacting Concrete Using SHAP Analysis
by Ahmed Almutairi
Sustainability 2025, 17(24), 11334; https://doi.org/10.3390/su172411334 - 17 Dec 2025
Viewed by 298
Abstract
The increasing emphasis on sustainability in construction materials has led to a surge of research focused on recycled aggregate self-compacting concrete (RA-SCC). However, predicting the compressive strength of concrete remains challenging because of the nonlinear interactions among the mix’s constituents. This study develops an interpretable machine learning (ML) framework to accurately forecast the compressive strength of RA-SCC and identify the most influential mix parameters. A dataset comprising 400 experimental samples was compiled, incorporating eight input variables: age, cement strength, cement, fly ash, blast furnace slag, water, recycled aggregate, and superplasticizer, with compressive strength as the output variable. Four ML algorithms, namely support vector regression (SVR), random forest (RF), multilayer perceptron (MLP), and extreme gradient boosting (XGBoost), were trained and optimized using Bayesian-based hyperparameter tuning combined with 10-fold cross-validation. Among the evaluated models, XGBoost demonstrated superior accuracy, with R² = 0.98 and RMSE = 2.95 MPa during training, and R² = 0.96 with RMSE = 3.25 MPa during testing, confirming its robustness and minimal overfitting. SHAP (SHapley Additive exPlanations) evaluation indicated that superplasticizer, cement, and cement strength were the most dominant factors influencing compressive strength, whereas higher water content showed a negative impact. The developed framework demonstrates that explainable ML can effectively capture the complex nonlinear behavior of RA-SCC, offering a reliable tool for mix design optimization and sustainable concrete production. These findings contribute to advancing data-driven decision making in eco-efficient materials engineering.
29 pages, 8414 KB  
Article
Optimized Explainable Machine Learning Protocol for Battery State-of-Health Prediction Based on Electrochemical Impedance Spectra
by Lamia Akther, Md Shafiul Alam, Mohammad Ali, Mohammed A. AlAqil, Tahmida Khanam and Md. Feroz Ali
Electronics 2025, 14(24), 4869; https://doi.org/10.3390/electronics14244869 - 10 Dec 2025
Viewed by 350
Abstract
Monitoring the battery state of health (SOH) has become increasingly important for electric vehicles (EVs), renewable storage systems, and consumer gadgets. It indicates the residual usable capacity and performance of a battery in relation to its original specifications. This information is crucial for the safety and performance enhancement of the overall system. This paper develops an explainable machine learning protocol with Bayesian optimization techniques trained on electrochemical impedance spectroscopy (EIS) data to predict battery SOH. Various robust ensemble algorithms, including HistGradientBoosting (HGB), Random Forest, AdaBoost, Extra Trees, Bagging, CatBoost, Decision Tree, LightGBM, Gradient Boost, and XGB, have been developed and fine-tuned for predicting battery health. Eight comprehensive metrics are employed to estimate the model’s performance rigorously: coefficient of determination (R2), mean squared error (MSE), median absolute error (medae), mean absolute error (MAE), correlation coefficient (R), Nash–Sutcliffe efficiency (NSE), Kling–Gupta efficiency (KGE), and root mean squared error (RMSE). Bayesian optimization techniques were developed to optimize hyperparameters across all models, ensuring optimal implementation of each algorithm. Feature importance analysis was performed to thoroughly evaluate the models and assess the features with the most influence on battery health degradation. The comparison indicated that the GradientBoosting model outperformed others, achieving an MAE of 0.1041 and an R2 of 0.9996. The findings suggest that Bayesian-optimized tree-based ensemble methods, particularly gradient boosting, excel at forecasting battery health status from electrochemical impedance spectroscopy data. 
This result offers an excellent opportunity for practical use in battery management systems that employ diverse industrial state-of-health assessment techniques to enhance battery longevity, contributing to sustainability initiatives for second-life lithium-ion batteries. This capability enables the recycling of vehicle batteries for application in static storage systems, which is environmentally advantageous and ensures continuity.
(This article belongs to the Special Issue Advanced Control and Power Electronics for Electric Vehicles)
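Two of the eight metrics listed in this abstract, Nash–Sutcliffe efficiency (NSE) and Kling–Gupta efficiency (KGE), are less familiar outside hydrology. The sketch below uses their commonly cited definitions (KGE in its original three-component form with correlation, variability ratio, and bias ratio); whether the authors used exactly these variants is an assumption, and the function names are illustrative.

```python
import math


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - SSE / (sum of squared deviations of observations)."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def kge(observed, predicted):
    """Kling-Gupta efficiency from correlation r, variability ratio alpha, bias ratio beta.

    Undefined (division by zero) if either series is constant or the
    observed mean is zero; this sketch does not guard against that.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in observed) / n)
    std_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted) / n)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted)) / n
    r = cov / (std_o * std_p)      # linear correlation
    alpha = std_p / std_o          # variability ratio
    beta = mean_p / mean_o         # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

Both scores equal 1 for a perfect prediction, and NSE equals 0 for a model that always predicts the observed mean, which makes that mean predictor a natural baseline.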
38 pages, 8524 KB  
Article
Prediction of Compressive Strength of Carbon Nanotube Reinforced Concrete Based on Multi-Dimensional Database
by Ao Yan, Shengdong Zhang, Zhuoxuan Li, Peng Zhu and Yuching Wu
Buildings 2025, 15(23), 4349; https://doi.org/10.3390/buildings15234349 - 1 Dec 2025
Viewed by 343
Abstract
The incorporation of carbon nanotubes (CNTs) enhances the mechanical properties of cement-based materials by inhibiting micro-crack propagation. Machine learning provides an efficient approach for predicting the compressive strength of CNT-reinforced concrete, yet existing studies often lack important features and rely on less adaptive models. To address these issues, a multi-dimensional database (429 experimental data points) covering 11 factors (including cement mix ratio, CNT morphology, and dispersion process) was constructed. A hierarchical model verification and optimization was conducted: traditional regression models (Multiple Linear Regression, Multiple Polynomial Regression (MPR), Multivariate Adaptive Regression Splines), mainstream model (Support Vector Regression (SVR)), and ensemble learning models (Random Forest, eXtreme Gradient Boosting (XGB), Light Gradient Boosting Machine optimized by Particle Swarm Optimization (PSO)/Bayesian Optimization (BO)) are trained, compared, and evaluated. MPR performs best (test set R2 = 0.856) among traditional regression models, while SVR (test set R2 = 0.824) is less accurate. The highest accuracy in ensemble models is achieved by the PSO-optimized XGB model, with R2 = 0.910 (test set). PSO outperforms BO in optimization precision, while BO is much more efficient. Water–cement ratio, age, and sand–cement ratio are the primary influencing factors for strength. Among CNT parameters, the inner diameter has greater impact than the length and outer diameter. Optimal CNT parameters are CNT–cement mass ratio 0.1–0.3%, inner diameter ≥ 7.132 nm, and length 1–15 μm. Surfactant polycarboxylate can increase strength, while OH functional groups can decrease it. These findings, integrated into the high-precision PSO-XGB model, provide a powerful tool for optimizing the mix design of CNT-reinforced concrete, accelerating its development and application in the industry. Full article
(This article belongs to the Section Building Materials, and Repair & Renovation)
17 pages, 10990 KB  
Article
Study of Intelligent Identification of Radionuclides Using a CNN–Meta Deep Hybrid Model
by Xiangting Meng, Ziyi Wang, Yu Sun, Zhihao Dong, Xiaoliang Liu, Huaiqiang Zhang and Xiaodong Wang
Appl. Sci. 2025, 15(22), 12285; https://doi.org/10.3390/app152212285 - 19 Nov 2025
Viewed by 417
Abstract
The rapid and accurate identification of radionuclides and the quantitative analysis of their activities have long been key research areas in the field of nuclear spectrum data processing. Traditional nuclear spectrum analysis methods heavily rely on manual feature extraction, making them highly susceptible to interference from factors such as energy resolution, calibration drift, and spectral peak overlap when dealing with complex mixed-radionuclide spectra, ultimately leading to degraded identification performance and accuracy. Based on multi-nuclide energy spectral data acquired via Geant4 simulation, this study compares the performance of partial least squares regression (PLSR), random forest (RF), a convolutional neural network (CNN), and a hybrid CNN–Meta model for radionuclide identification and quantitative activity analysis under conditions of raw energy spectra, Z-score normalization, and min-max normalization. To maximize the potential of each model, principal component selection, Bayesian hyperparameter optimization, iteration tuning, and meta-learning optimization were employed. Model performance was comprehensively evaluated using the coefficient of determination (R2), root mean square error (RMSE), mean relative error (MRE), and computational time. The results demonstrate that deep learning models can effectively capture nonlinear relationships within complex energy spectra, enabling accurate radionuclide identification and activity quantification. Specifically, the CNN achieved a globally optimal test RMSE of 0.00566 and an R2 of 0.999 with raw energy spectra. CNN–Meta exhibited superior adaptability and generalization under min-max normalization, reducing test error by 70.8% compared to RF, while requiring only 49% of the total computation time of the CNN model. 
RF was relatively insensitive to preprocessing but yielded higher absolute errors, whereas PLSR was limited by its linear nature and failed to capture the nonlinear characteristics of complex energy spectra. In conclusion, the CNN–Meta hybrid model demonstrates superior performance in both accuracy and efficiency, providing a reliable and effective approach for the rapid identification of radionuclides and quantitative analysis of activity in complex energy spectra.
32 pages, 18645 KB  
Article
More Trustworthy Prediction of Elastic Modulus of Recycled Aggregate Concrete Using MCBE and TabPFN
by Wei-Tian Lu, Ze-Zhao Wang and Xin-Yu Zhao
Materials 2025, 18(22), 5221; https://doi.org/10.3390/ma18225221 - 18 Nov 2025
Viewed by 366
Abstract
The sustainable use of recycled aggregate concrete (RAC) is a critical pathway toward resource-efficient and environmentally responsible construction. However, the mechanical performance of RAC—particularly its elastic modulus—exhibits pronounced variability due to the heterogeneous quality and microstructural defects of recycled aggregates. This variability complicates the establishment of reliable predictive models and equations for elastic modulus estimation and restricts RAC’s broader structural implementation. Conventional empirical and machine-learning-based models (e.g., support vector machine, random forest, and artificial neural networks) are typically dataset-specific, prone to overfitting, and incapable of quantifying bias and uncertainty, making them unsuitable for heterogeneous materials data. This study introduces a bias-aware and more accurate predictive framework that integrates the Tabular Prior-data Fitted Network (TabPFN) with Monte Carlo Bias Estimation (MCBE)—for the first time applied in RAC materials research. A database containing 1161 RAC samples from diverse literature sources was established. This database includes key parameters such as apparent density ranging from 2270 kg/m3 to 3150 kg/m3, water absorption from 0.75% to 7.81%, replacement ratio from 0% to 100%, and compressive strength values ranging from 10.00 MPa to 108.51 MPa. MCBE quantified representational bias and guided targeted data augmentation, while TabPFN—pretrained on millions of Bayesian inference tasks—achieved R2 = 0.912 and RMSE = 1.65 GPa without any hyperparameter tuning. Feature attribution analysis confirmed compressive strength as the most influential factor governing the elastic modulus, consistent with established composite mechanics principles. The proposed TabPFN–MCBE framework provides a reliable, bias-corrected, and transferable approach for modeling recycled aggregate concrete (RAC). 
It enables accurate predictions that are both trustworthy and interpretable, advancing the use of data-driven methods in sustainable materials design.
20 pages, 1915 KB  
Article
Feature Selection and Model Optimization for Survival Prediction in Patients with Angina Pectoris
by Róbert Bata, Amr Sayed Ghanem and Attila Csaba Nagy
J. Clin. Med. 2025, 14(22), 8111; https://doi.org/10.3390/jcm14228111 - 16 Nov 2025
Viewed by 636
Abstract
Background: With the rapid emergence of novel survival models and feature selection methods, comparing them with traditional approaches is essential to define contexts of optimal performance. Methods: This study systematically evaluates nine survival models combined with nine feature selection methods for predicting the occurrence of angina pectoris using electronic health record (EHR) data from a Hungarian hospital (n = 29,655, features = 1150). Performance was assessed with the concordance index (C-index) and integrated Brier score (IBS) to compare predictive accuracy across methods. Results: Tree-based survival models, particularly gradient-boosted survival (GBS) and random survival forest (RSF), consistently outperformed conventional approaches in terms of C-index, but showed slightly worse calibration, as reflected in their higher IBSs. The best-performing model was RSF, which was optimized using Bayesian hyperparameter tuning. For feature selection, tree-based methods such as Boruta and RSF-based approaches showed superior performance. We further identified clusters of feature selection methods and generated consensus feature sets. We also analyzed the internal relationships between the selected features. Survival model performance was also examined over time using the time-dependent Area Under the Curve (AUC) based on the best-performing feature set. Conclusions: Our findings highlight the substantial impact of recent methodological innovations in survival analysis, which offer significant gains in predictive accuracy and efficiency, ultimately supporting more robust clinical decision-making in the early identification of angina pectoris among patients with diabetes.
(This article belongs to the Special Issue Application of Artificial Intelligence in Cardiology)
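The concordance index used to rank the survival models above can be sketched as follows. This is a plain implementation of Harrell's C-index for right-censored data, assuming higher risk scores mean earlier expected events and counting tied risk scores as half-concordant; it is an illustration under those assumptions, not the evaluation code of the study.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  : observed follow-up time per subject
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk score (higher = earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                # Subject i had the event before subject j was last seen.
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. Because the index only measures ranking, a model can score well on it and still be poorly calibrated, which is why the abstract reports the integrated Brier score alongside it.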
24 pages, 22187 KB  
Article
Predicting the Strength of Fly Ash–Slag–Gypsum-Based Backfill Materials Using Interpretable Machine Learning Modeling
by Tingdi Fan, Siqi Zhang and Wen Ni
Appl. Sci. 2025, 15(22), 12035; https://doi.org/10.3390/app152212035 - 12 Nov 2025
Viewed by 294
Abstract
Predicting unconfined compressive strength (UCS) is essential for the safety and stability of solid waste-based backfill materials, particularly due to the correlation between strength development and hazardous substance immobilization. This study developed a machine learning model to predict UCS and optimize mixtures using fly ash, slag, and desulfurized gypsum. A dataset with 14 input features—including composition, water content, and curing time—was analyzed using Recursive Feature Elimination (RFE) for feature selection. Random Forest, Bayesian, and Gray Wolf Optimizer (GWO)-enhanced models were compared. The GWO-GB model achieved superior accuracy (R2 = 0.9335), with curing time (27.99%), water content (22.16%), and sulfur trioxide (18.98%) identified as the most significant features. The model enables rapid, high-precision UCS prediction, reduces experimental workload, and offers insights for mix design optimization and feature interaction analysis. Full article
28 pages, 2424 KB  
Article
A Novel Application of Choquet Integral for Multi-Model Fusion in Urban PM10 Forecasting
by Houria Bouzghiba, Amine Ajdour, Najiya Omar, Abderrahmane Mendyl and Gábor Géczi
Atmosphere 2025, 16(11), 1274; https://doi.org/10.3390/atmos16111274 - 10 Nov 2025
Viewed by 558
Abstract
Air pollution forecasting remains a critical challenge for urban public health management, with traditional approaches struggling to balance accuracy and interpretability. This study introduces a novel PM10 forecasting framework combining physics-informed feature engineering with interpretable ensemble fusion using the Choquet integral, the first application of this non-linear aggregation operator for air quality forecasting. Using hourly data from 11 monitoring stations in Budapest (2021–2023), we developed four specialized feature sets capturing distinct atmospheric processes: short-term dynamics, long-term patterns, meteorological drivers, and anomaly detection. We evaluated machine learning models including Random Forest variants (RF), Gradient Boosting (GBR), Support Vector Regression (SVR), K-Nearest Neighbors (KNN), and Long Short-Term Memory (LSTM) architectures across six identified pollution regimes. Results revealed the critical importance of feature engineering over architectural complexity. While sophisticated models failed when trained on raw data, the KNN model with 5-dimensional anomaly features achieved exceptional performance, representing an 86.7% improvement over direct meteorological input models. Regime-specific modeling proved essential, with GBR-Regime outperforming GBR-Stable by a remarkable effect size. For ensemble fusion, we compared the novel Choquet integral approach against conventional methods (mean, median, Bayesian Model Averaging, stacking). The Choquet integral achieved near-equivalent performance to state-of-the-art stacking while providing complete mathematical interpretability through interaction coefficients. Analysis revealed predominantly redundant interactions among models, demonstrating that sophisticated fusion must prevent information over-counting rather than merely combining predictions. 
Station-specific interaction patterns showed selective synergy exploitation at complex urban locations while maintaining redundancy management at simpler sites. This work establishes that combining domain-informed feature engineering with interpretable Choquet integral aggregation can match black-box ensemble performance while maintaining the transparency essential for operational deployment and regulatory compliance in air quality management systems.
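To make the Choquet integral fusion described in this abstract concrete, here is a minimal sketch of the discrete Choquet integral over a handful of model forecasts. The fuzzy measure is supplied by hand as a dictionary keyed by model coalitions; the model names and weights are hypothetical, and how the study actually learned its measure is not stated in the abstract.

```python
def choquet_integral(values, measure):
    """Discrete Choquet integral of non-negative `values` w.r.t. a fuzzy `measure`.

    values  : dict mapping model name -> forecast (assumed non-negative)
    measure : dict mapping frozenset of model names -> weight in [0, 1],
              with the full set mapped to 1 (hypothetical, hand-specified)
    """
    # Sort models by forecast, smallest first.
    ordered = sorted(values, key=values.get)
    total = 0.0
    previous = 0.0
    for k, name in enumerate(ordered):
        # Coalition of all models whose forecast is at least the current one.
        coalition = frozenset(ordered[k:])
        total += (values[name] - previous) * measure[coalition]
        previous = values[name]
    return total


if __name__ == "__main__":
    forecasts = {"knn": 10.0, "gbr": 20.0}
    redundant = {
        frozenset({"knn"}): 0.6,
        frozenset({"gbr"}): 0.6,
        frozenset({"knn", "gbr"}): 1.0,
    }
    print(choquet_integral(forecasts, redundant))
```

When the measure is additive, the integral reduces to an ordinary weighted mean. A coalition weight smaller than the sum of its members' weights encodes redundancy, the situation the abstract reports as predominant, so overlapping models are not counted twice.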
27 pages, 2824 KB  
Article
Identifying Predictors of Utilization of Skilled Birth Attendance in Uganda Through Interpretable Machine Learning
by Shaheen M. Z. Memon, Robert Wamala and Ignace H. Kabano
Int. J. Environ. Res. Public Health 2025, 22(11), 1691; https://doi.org/10.3390/ijerph22111691 - 9 Nov 2025
Viewed by 575
Abstract
Skilled Birth Attendance (SBA) is essential for reducing maternal and neonatal mortality, yet access remains limited in many low- and middle-income countries. This study used machine learning to predict SBA use among Ugandan women and identify key influencing factors. We analyzed data from the 2016 Uganda Demographic and Health Survey, focusing on women aged 15 to 49 who had given birth in the preceding five years. After preparing and selecting relevant features, six tree-based models (decision tree, random forest, gradient boosting, XGBoost, LightGBM, CatBoost) and logistic regression were applied. Class imbalance was addressed using cost-sensitive learning, and hyperparameters were tuned via Bayesian optimization. XGBoost performed best (F1-score: 0.52; recall: 0.73; AUC: 0.75). SHapley Additive Explanations (SHAP) were used to interpret model predictions. Key predictors of SBA use included education level, antenatal care visits, region (especially Northern Uganda), perceived distance to a healthcare facility, and urban or rural residence. The results demonstrate the value of interpretable machine learning for identifying at-risk populations and guiding targeted maternal health interventions in Uganda. Full article
(This article belongs to the Section Global Health)
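As a reading aid for the metrics quoted in the abstract above: F1 is the harmonic mean of precision and recall, so the reported F1-score (0.52) and recall (0.73) imply a precision that the abstract does not state. The sketch below back-calculates it. The function name is illustrative, and the result is only approximate because the published figures are rounded.

```python
def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for the precision P."""
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("F1 and recall are mutually inconsistent")
    return f1 * recall / denom


# Values as reported in the abstract (rounded to two decimals there).
p = implied_precision(f1=0.52, recall=0.73)
print(f"implied precision is roughly {p:.2f}")
```

A precision of roughly 0.4 alongside a recall of 0.73 fits a model tuned with cost-sensitive learning to favour catching the minority class.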

27 pages, 6536 KB  
Article
Development of a Tractor Hydrostatic Transmission Efficiency Prediction Model Using Novel Hybrid Deep Kernel Learning and Residual Radial Basis Function Interpolator Model
by Jin Kam Park, Oleksandr Yuhai, Jin Woong Lee, Yubin Cho and Joung Hwan Mun
Agriculture 2025, 15(22), 2325; https://doi.org/10.3390/agriculture15222325 - 8 Nov 2025
Viewed by 609
Abstract
This study proposes a data-efficient surrogate modeling approach for predicting hydrostatic transmission (HST) system efficiency in tractors using minimal data. Only 27 samples were selected from a dataset of 5092 measurements based on the minimum, mean, and maximum values of the input variables (input shaft speed, HST ratio, and load), which were used as the training data. A hybrid prediction model combining deep kernel learning and a residual radial basis function surrogate was developed with hyperparameters optimized via Bayesian optimization. For performance verification, the proposed model was compared with Neural Network (NN), Random Forest, XGBoost, Gaussian Process (GP), and Support Vector Regressor (SVR) models trained using 27 samples. As a result, the proposed model achieved the highest prediction accuracy (R² = 0.93, MAPE = 5.94%, RMSE = 4.05). These findings indicate that the proposed approach can be effectively used to predict the overall HST efficiency using minimal data, particularly in situations where experimental data collection is limited.
(This article belongs to the Special Issue Computers and IT Solutions for Agriculture and Their Application)
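One plausible reading of the 27-sample selection in the abstract above is a three-level full factorial design: three levels (minimum, mean, maximum) for each of the three input variables gives 3³ = 27 combinations. The sketch below illustrates only that count. The measurement values are hypothetical placeholders, not data from the paper, and how the authors matched grid points to actual measurements is not stated in the abstract.

```python
from itertools import product
from statistics import mean

# Hypothetical measurements for each input variable (not from the paper).
measurements = {
    "input_shaft_speed": [1000.0, 1500.0, 2000.0, 2500.0],
    "hst_ratio": [0.2, 0.5, 0.8, 1.0],
    "load": [10.0, 20.0, 40.0],
}

# Three levels per variable: minimum, mean, maximum.
levels = {name: (min(v), mean(v), max(v)) for name, v in measurements.items()}

# Full factorial design: every combination of one level per variable.
design = list(product(*levels.values()))
print(len(design))
```

With three variables at three levels each, the design always has 27 points regardless of how many raw measurements are available.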

19 pages, 2825 KB  
Article
Research on Landslide Displacement Prediction Using Stacking-Based Machine Learning Fusion Model
by Yongqiang Li, Anchen Hu, Yinsheng Wang, Honggang Wu and Daohong Qiu
Appl. Sci. 2025, 15(21), 11747; https://doi.org/10.3390/app152111747 - 4 Nov 2025
Viewed by 429
Abstract
To address the insufficient accuracy and weak generalization capabilities of single models in landslide displacement prediction, this paper proposes a stacking-based machine learning model fusion method for predicting landslide displacement. Taking the landslide displacement data (F) and rainfall (RAINFALL) of the Baishui River landslide in the Three Gorges Reservoir area as the research object, input sequences were constructed through data preprocessing and feature engineering. Prediction models including SVR, XGBoost, Bayesian optimization, and random forest were established. Based on the stacking framework, an integrated landslide displacement prediction model was developed by dynamically weighting the outputs of the base models using prediction accuracy and stability as fusion indicators. The Baishui River landslide, a typical colluvial landslide, was selected as a case study, with typical displacement data from monitoring points ZG118 and XD-01 from December 2006 to December 2012. The results show that the evaluation metrics (R², E_RMSE, and E_MAE) for ZG118 and XD-01 demonstrate satisfactory prediction performance. Compared with traditional single models such as a TCN and XGBoost, the proposed integrated model exhibits improved prediction accuracy, providing scientific support for the real-time monitoring and early warning of landslide hazards.

25 pages, 9622 KB  
Article
Prediction of Compressive Strength of Concrete Using Explainable Machine Learning Models
by Hainan Fu, Xiong Zhou, Pengfei Xu and Dandan Sun
Materials 2025, 18(21), 5009; https://doi.org/10.3390/ma18215009 - 3 Nov 2025
Viewed by 1394
Abstract
Predicting the compressive strength of concrete is essential for engineering design and quality assurance. Traditional empirical formulas often fall short in capturing complex multi-factor interactions and nonlinear relationships. This study employs an interpretable machine learning framework using Gradient Boosting Trees, Random Forest, and Backpropagation Neural Networks to predict concrete compressive strength. Bayesian optimization was employed for hyperparameter tuning, and SHAP analysis was used to quantify feature contributions. Based on 223 sets of compression test data, this study systematically compared the predictive performance of the five models. Results demonstrate that the CatBoost model achieved the best results, with an R² of 0.9388, an RMSE of 2.7131 MPa, and a MAPE of 5.45%, outperforming other models. SHAP analysis indicated that cement content had the greatest impact on strength, followed by water content, water reducer, fly ash, and aggregates, with notable interactive effects between factors. Compared to the empirical formula in the current industry standard Specification for Mix Proportion Design of Ordinary Concrete, the CatBoost model showed higher accuracy under specific raw material and curing conditions, with MAPE values of 2.94% and 5.96%, respectively. The optimized CatBoost model, combined with interpretability analysis, offers a data-driven tool for concrete mix optimization, balancing high precision with practical engineering applicability.
(This article belongs to the Section Construction and Building Materials)
