Search Results (75)

Search Parameters:
Keywords = extra trees regressor

22 pages, 891 KB  
Article
Ensemble Learning with Systematic Hyperparameter Optimization for Urban-Bike-Sharing Demand Prediction
by Ivona Brajevic, Eva Tuba and Milan Tuba
Sustainability 2026, 18(8), 3766; https://doi.org/10.3390/su18083766 - 10 Apr 2026
Viewed by 312
Abstract
Bike sharing is an established component of urban mobility infrastructure, offering a low-emission alternative to motorized transport for short trips in cities worldwide. Accurate demand forecasting is essential for efficient system operation: it enables better bike redistribution, reduces user wait times, and lowers the operational costs associated with rebalancing. This study evaluated multiple ensemble strategies for hourly bike-sharing demand prediction, comparing bagging methods (Random Forest, Extra Trees), boosting methods (AdaBoost, Gradient Boosting Regressor, Histogram-based Gradient Boosting Regressor), and a Voting ensemble, while systematically investigating the impact of hyperparameter optimization. A repeated hold-out protocol was used, in which the dataset was randomly divided into 80% training and 20% test subsets across 10 random splits; 5-fold cross-validation was applied within each training fold exclusively for hyperparameter tuning, ensuring the test set remained unseen during model selection. Random Search and Bayesian Optimization were compared under identical budgets of 60 configurations per model. Results show that optimization substantially improves all models, with the most pronounced gains for AdaBoost (58% RMSE reduction) and Gradient Boosting Regressor (45% RMSE reduction). A Voting ensemble combining a Random Search-tuned Gradient Boosting Regressor and a Bayesian-optimized Histogram-based Gradient Boosting Regressor achieves the best overall performance (RMSE of 38.48, R2 of 0.955) with the lowest variance among all repeated splits. Feature importance analysis confirms that hour of day and temperature are the dominant demand drivers, consistent with the operational patterns of urban bike-sharing systems. The performance difference between Random Search and Bayesian Optimization is negligible for most models, suggesting that well-designed search spaces allow simpler strategies to achieve competitive results. 
A controlled comparison conducted under identical experimental conditions shows that the Voting ensemble is statistically equivalent to XGBoost and nominally better than LightGBM, while CatBoost achieves a statistically significant advantage, highlighting it as a strong individual alternative. Full article
(This article belongs to the Special Issue Artificial Intelligence and Sustainable Development)

29 pages, 4375 KB  
Article
Application of AI in Tablet Development: An Integrated Machine Learning Framework for Pre-Formulation Property Prediction
by Masugu Hamaguchi, Tomoki Adachi and Noriyoshi Arai
Pharmaceutics 2026, 18(4), 452; https://doi.org/10.3390/pharmaceutics18040452 - 8 Apr 2026
Viewed by 353
Abstract
Background/Objectives: Tablet development requires simultaneous optimization of multiple quality attributes under limited experimental budgets, yet formulation–property relationships are highly nonlinear in mixture systems. To support pre-formulation decision-making prior to extensive tablet prototyping, this study proposes an AI framework that organizes formulation and process data together with raw-material property records into a reusable database, and enriches conventional composition/process features with physically motivated mixture descriptors derived from raw-material properties and formulation/process settings. Methods: Mixture-level scalar descriptors are constructed by composition-weighted aggregation of material properties, and particle size distribution (PSD) is incorporated via a compact set of summary statistics computed from composition-weighted mixture PSDs. Three feature sets are compared: (i) Materials + Processes (MP), (ii) MP with scalar Descriptors (MPD), and (iii) MPD with PSD summaries (MPDD). Five target properties are modeled: hardness, disintegration time, flow function, cohesion, and thickness. We train and evaluate Random Forest, Extra Trees Regressor, Lasso, Partial Least Squares, Support Vector Regression, and a multi-branch neural network that processes the three feature blocks separately and concatenates them for prediction. For interpolation assessment, repeated Train/Dev/Test splitting (5:3:2) across multiple random seeds is used, and the effect of feature augmentation is quantified by paired RMSE improvements with bootstrap confidence intervals and paired Wilcoxon signed-rank tests. To assess robustness under practical formulation updates, rolling-origin time-series splits are employed and Applicability Domain indicators are computed to characterize out-of-distribution coverage. 
Results: Across interpolation evaluations, mixture-descriptor augmentation (MPD/MPDD) improves hardness and disintegration time in most settings, whereas gains for flow function are smaller and cohesion/thickness show mixed effects under limited sample sizes. Conclusions: Under extrapolation-oriented evaluation, the descriptors can improve hardness but may degrade disintegration-time prediction under covariate shift, emphasizing the need for careful descriptor selection and dimensionality control when deploying pre-formulation predictors. Full article
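The composition-weighted aggregation of material properties described above reduces to a weighted average per property. A minimal sketch, with hypothetical material names and values (not taken from the paper):

```python
import numpy as np

# rows = raw materials, columns = properties (e.g. true density, d50, moisture);
# all values are hypothetical placeholders
material_props = np.array([
    [1.54, 100.0, 4.5],   # e.g. a filler such as MCC
    [1.52,  60.0, 0.3],   # e.g. lactose
    [0.95,  10.0, 0.1],   # e.g. a lubricant
])
weights = np.array([0.70, 0.295, 0.005])  # mass fractions of the mixture

def mixture_descriptors(props, w):
    """Composition-weighted aggregation: descriptor_j = sum_i w_i * prop_ij."""
    w = np.asarray(w, dtype=float)
    assert np.isclose(w.sum(), 1.0), "mass fractions must sum to 1"
    return w @ props

desc = mixture_descriptors(material_props, weights)
print(desc)  # one mixture-level scalar per property column
```

The paper's PSD handling is analogous: composition-weighted mixture distributions summarized by a compact set of statistics.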

43 pages, 13084 KB  
Article
Machine Learning-Based Prediction of Surface Integrity in High-Pressure Coolant-Assisted Machining of Near-β Ti-5553 Titanium Alloy
by Lokman Yünlü
Machines 2026, 14(4), 367; https://doi.org/10.3390/machines14040367 - 27 Mar 2026
Viewed by 453
Abstract
This study investigates the factors affecting surface integrity during the machining of near-β Ti-5553, a critical material in the aerospace and defense industries. Considering this alloy as a difficult-to-machine material, the turning process was examined by analyzing the effects of cutting speed, feed rate, and cooling strategy (dry, conventional, and 30 MPa/High-Pressure cooling) on cutting force, temperature, surface roughness, and residual stress. The primary novelty of this research lies in its integrated approach: rather than evaluating surface integrity metrics in isolation, it simultaneously models interrelated responses to residual stress, cutting temperature, cutting force, and surface roughness under high-pressure coolant (HPC) conditions. Furthermore, it introduces a robust machine learning framework that uniquely applies data augmentation (Gaussian jittering and interpolation) to overcome the conventional constraints of limited experimental machining data, providing a highly accurate predictive tool. The experimental data were expanded using data augmentation methods (Gaussian jittering and interpolation) and modeled using five different machine learning algorithms (Extra Trees, Random Forest, Gradient Boosting, KNN, and AdaBoost). The results revealed that cooling pressure plays a dominant role, particularly in residual stress (importance score: 0.926) and cutting temperature (0.657). It was observed that high-pressure cooling (HPC) reduces thermal gradients, thereby lowering tensile stresses and improving surface integrity. When algorithm performances were compared, the Extra Trees and Random Forest models achieved the most accurate predictions after hyperparameter optimization. Specifically, the optimized Extra Trees regressor demonstrated exceptional predictive capability for residual stress, achieving an accuracy of 98.47%, a remarkably high coefficient of determination (R2 = 0.9997), and a minimal Mean Squared Error (MSE = 6.8289). 
These quantitative results confirm that the proposed machine learning framework provides a highly reliable and precise tool for controlling surface quality in HPC-assisted machining. Full article
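The Gaussian-jittering augmentation named above can be sketched as follows; the feature scales (cutting speed, feed, coolant pressure), noise fraction, and copy count are assumptions, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(42)

def jitter_augment(X, y, n_copies=3, noise_frac=0.02):
    """Replicate each sample n_copies times, adding zero-mean Gaussian noise
    scaled to a fraction of each feature's standard deviation."""
    sigma = X.std(axis=0) * noise_frac
    X_aug, y_aug = [X], [y]
    for _ in range(n_copies):
        X_aug.append(X + rng.normal(0.0, sigma, size=X.shape))
        y_aug.append(y)  # targets left unchanged for small perturbations
    return np.vstack(X_aug), np.concatenate(y_aug)

# hypothetical machining inputs: cutting speed, feed rate, coolant pressure
X = rng.uniform([40.0, 0.05, 0.0], [120.0, 0.25, 30.0], size=(27, 3))
y = rng.uniform(100.0, 600.0, size=27)  # e.g. a residual-stress-like response
X_big, y_big = jitter_augment(X, y)
print(X_big.shape, y_big.shape)  # (108, 3) (108,)
```

The augmented set would then feed the tree ensembles (Extra Trees, Random Forest, etc.) compared in the study.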

32 pages, 19818 KB  
Article
An Interpretable Ensemble Machine Learning Framework for Predicting the Ultimate Flexural Capacity of BFRP-Reinforced Concrete Beams
by Sebghatullah Jueyendah and Elif Ağcakoca
Polymers 2026, 18(5), 601; https://doi.org/10.3390/polym18050601 - 28 Feb 2026
Cited by 2 | Viewed by 541
Abstract
Prediction of the ultimate moment capacity (Mu) of BFRP-reinforced concrete beams is complicated by nonlinear parameter interactions and the linear-elastic response of BFRP, reducing the accuracy of conventional design models. This study develops an optimized machine learning (ML) framework incorporating random forest, extra trees, gradient boosting, adaboost, bagging, support vector regression, histogram-based gradient boosting, and ensemble voting and stacking strategies for reliable prediction of the Mu of BFRP-reinforced concrete beams. A comprehensive database of material, geometric, reinforcement, and BFRP mechanical parameters was analyzed, and model performance was evaluated using an 80/20 train–test split and 10-fold cross-validation based on R2, RMSE, MAE, and MAPE. The stacking regressor demonstrated superior predictive performance, achieving an R2 of 0.999 (RMSE = 0.590) in training and an R2 of 0.988 (RMSE = 2.487) in testing, indicating excellent robustness and strong generalization capability in predicting Mu. Furthermore, interpretability analyses based on SHAP, PDP, ALE, and ICE demonstrate that span length (L) and beam depth (h) constitute the governing parameters in the prediction of Mu. Unlike prior studies focused mainly on predictive accuracy, this work proposes an optimized and interpretable stacking ensemble framework that integrates explainable AI with classical flexural mechanics for physically consistent and reliable prediction of the ultimate moment capacity of BFRP-reinforced concrete beams. Full article
(This article belongs to the Special Issue Fiber-Reinforced Polymer Composites: Progress and Prospects)
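The stacking strategy described in this abstract combines out-of-fold predictions of several base learners through a meta-learner. A minimal sketch on synthetic data; the base learners listed are a plausible subset and the Ridge meta-learner is an assumption, not the paper's exact configuration:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (ExtraTreesRegressor, GradientBoostingRegressor,
                              RandomForestRegressor, StackingRegressor)
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the beam database (geometry, materials, reinforcement)
X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=1)

stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=50, random_state=1)),
        ("et", ExtraTreesRegressor(n_estimators=50, random_state=1)),
        ("gb", GradientBoostingRegressor(random_state=1)),
    ],
    final_estimator=Ridge(),  # meta-learner fitted on out-of-fold predictions
    cv=5,
)
scores = cross_val_score(stack, X, y, cv=5, scoring="r2")
print(scores.mean())
```

SHAP/PDP/ALE/ICE analyses like those in the paper would then be run on the fitted stack to rank predictors such as span length and beam depth.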

21 pages, 1287 KB  
Article
Machine Learning Calibration of Smartphone-Based Infrared Thermal Cameras: Improved Bias and Persistent Random Error
by Jayroop Ramesh, Tom Loney, Stefan Du Plessis, Homero Rivas, Assim Sagahyroon, Fadi Aloul and Thomas Boillat
Sensors 2026, 26(4), 1295; https://doi.org/10.3390/s26041295 - 17 Feb 2026
Viewed by 624
Abstract
Low-cost, smartphone-based thermal cameras offer unprecedented accessibility for physiological monitoring, yet their validity and reliability for absolute skin temperature measurement in clinical settings remain contentious. This study aims to quantify the agreement and repeatability of a widely used smartphone thermal camera, the FLIR One Pro, against a consumer-grade, non-contact infrared thermometer, the iHealth PT3. A method comparison study was conducted with 40 healthy adult participants, yielding a total of 2400 temperature measurements. Skin temperature of the hand dorsum was measured concurrently with the FLIR One Pro and the iHealth PT3. The protocol involved two rounds: Round 1 (R1) in a stable, static environment to assess baseline repeatability, and Round 2 (R2) in a dynamic environment mimicking clinical repositioning. The performance of the instruments was compared using paired t-tests for mean differences and Bland–Altman analysis for assessing agreement. The iHealth PT3 demonstrated superior precision, with an average intra-participant standard deviation (SD) of 0.030 °C in R1 and 0.092 °C in R2. In stark contrast, the FLIR One Pro exhibited significantly higher variability, with an average SD of 0.34 °C in R1 and 0.30 °C in R2. Bland–Altman analysis revealed a substantial mean bias of −1.42 °C in R1 and −1.15 °C in R2, with critically wide 95% limits of agreement spanning ≈6 °C. The substantial systematic bias and poor agreement of the FLIR One Pro far exceed both its manufacturer-stated accuracy and clinically acceptable error margins for absolute temperature measurement. To further examine whether calibration could mitigate these deficiencies, we applied a suite of ten machine learning regressors to map FLIR readings onto iHealth PT3 values. Calibration reduced systematic bias across all models, with Quantile Gradient-Boosted Regression Trees achieving the lowest MAE (1.162 °C). 
The Extra Trees model yielded the lowest RMSE (1.792 °C) and the highest explained variance (R2 = 0.152), yet this relatively low value confirms that the device’s high intrinsic variability limits the effectiveness of algorithmic correction. As such, the device has limited utility for longitudinal patient monitoring or for diagnostic decisions that rely on precise, absolute temperature thresholds. These findings inform medical practitioners in low-resource settings of the profound limitations of using this device as a standalone clinical thermometer and emphasize that algorithmic correction cannot compensate for fundamental hardware and measurement noise constraints. Full article
(This article belongs to the Special Issue AI-Based Sensing and Imaging Applications)
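The Bland–Altman statistics quoted above (mean bias, 95% limits of agreement) are straightforward to compute from paired readings. A sketch on synthetic paired measurements, not the study's data:

```python
import numpy as np

rng = np.random.default_rng(7)
# synthetic paired readings: a reference thermometer and a biased, noisier camera
reference = rng.normal(33.0, 0.5, size=200)               # e.g. degC at the hand dorsum
device = reference - 1.4 + rng.normal(0.0, 1.5, size=200)  # systematic bias + noise

diff = device - reference
bias = diff.mean()                                  # mean bias
half_width = 1.96 * diff.std(ddof=1)                # 95% limits of agreement
loa_low, loa_high = bias - half_width, bias + half_width
print(f"bias {bias:.2f} degC, 95% LoA [{loa_low:.2f}, {loa_high:.2f}]")
```

Note how a large residual spread keeps the limits of agreement wide even after any constant bias is corrected, which is exactly the failure mode the abstract reports.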

27 pages, 1858 KB  
Article
Temporal Dynamics of UAV Multispectral Vegetation Indices for Accurate Machine Learning-Based Wheat Yield Prediction
by Krstan Kešelj, Zoran Stamenković, Marko Kostić, Vladimir Aćin, Aleksandar Ivezić, Mladen Ivanišević and Nenad Magazin
AgriEngineering 2026, 8(2), 71; https://doi.org/10.3390/agriengineering8020071 - 16 Feb 2026
Viewed by 823
Abstract
Accurate wheat yield prediction is essential for ensuring food security and sustainable resource management under the increasing challenges of climate change. This study investigates the integration of unmanned aerial vehicle (UAV)-based multispectral imaging and machine learning (ML) techniques to improve yield forecasting in European wheat cultivars. Field experiments were conducted on 400 sub-plots with varying NPK fertilization regimes and five wheat varieties, monitored across six phenological stages during the 2023 growing season in Vojvodina, Serbia. A DJI Phantom 4 Multispectral UAV collected high-resolution imagery, from which 65 vegetation indices were computed. Using PyCaret’s automated ML framework, 25 regression algorithms were evaluated for yield prediction. Ensemble models, particularly Random Forest, Extra Trees, Gradient Boosting, and LightGBM, consistently outperformed linear and kernel-based approaches. The highest prediction accuracy was achieved with the Random Forest Regressor during full flowering (BBCH 65–69), yielding an R2 of 0.952 and an RMSE of 0.44 t/ha. Results highlight the temporal dynamics of model performance, with optimal predictions occurring during reproductive stages. The findings confirm that UAV-derived multispectral data, coupled with ensemble machine learning, provide a non-invasive, accurate, and computationally efficient method for yield forecasting. This framework has significant potential for supporting precision agriculture, enabling real-time decision-making, and enhancing the resilience of wheat production systems. Full article
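A vegetation-index-to-yield pipeline like the one described can be sketched with one standard index, NDVI = (NIR − Red) / (NIR + Red), feeding a Random Forest. Reflectances and yields below are synthetic placeholders, and the study computed 65 indices rather than one:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
nir = rng.uniform(0.3, 0.6, size=400)    # near-infrared reflectance per sub-plot
red = rng.uniform(0.05, 0.15, size=400)  # red-band reflectance
ndvi = (nir - red) / (nir + red)         # one of the vegetation indices used

# synthetic yield (t/ha) loosely driven by NDVI
y = 4.0 + 6.0 * ndvi + rng.normal(0.0, 0.3, size=400)

model = RandomForestRegressor(n_estimators=100, random_state=3)
model.fit(ndvi.reshape(-1, 1), y)
print(model.score(ndvi.reshape(-1, 1), y))  # in-sample R^2
```

In the study this comparison was automated across 25 regressors (via PyCaret) and repeated at six phenological stages.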

28 pages, 6311 KB  
Article
Machine Learning-Assisted Optimisation of the Laser Beam Powder Bed Fusion (PBF-LB) Process Parameters of H13 Tool Steel Fabricated on a Preheated to 350 °C Building Platform
by Katsiaryna Kosarava, Paweł Widomski, Michał Ziętala, Daniel Dobras, Marek Muzyk and Bartłomiej Adam Wysocki
Materials 2026, 19(1), 210; https://doi.org/10.3390/ma19010210 - 5 Jan 2026
Viewed by 1167
Abstract
This study presents the first application of Machine Learning (ML) models to optimise Powder Bed Fusion using Laser Beam (PBF-LB) process parameters for H13 steel fabricated on a 350 °C preheated building platform. A total of 189 cylindrical specimens were produced for training and testing machine learning (ML) models using variable process parameters: laser power (250–350 W), scanning speed (1050–1300 mm/s), and hatch spacing (65–90 μm). Eight ML models were investigated: 1. Support Vector Regression (SVR), 2. Kernel Ridge Regression (KRR), 3. Stochastic Gradient Descent Regressor, 4. Random Forest Regressor (RFR), 5. Extreme Gradient Boosting (XGBoost), 6. Extreme Gradient Boosting with limited depth (XGBoost LD), 7. Extra Trees Regressor (ETR), and 8. Light Gradient Boosting Machine (LightGBM). All models were trained using the Fast Library for Automated Machine Learning & Tuning (FLAML) framework to predict the relative density of the fabricated samples. Among these, the XGBoost model achieved the highest predictive accuracy, with a coefficient of determination R2 = 0.977, mean absolute percentage error MAPE = 0.002, and mean absolute error MAE = 0.017. Experimental validation was conducted on 27 newly fabricated samples using ML-predicted process parameters. Relative densities exceeded 99.6% of the theoretical value (7.76 g/cm3) for all models except XGBoost LD and KRR. The lowest MAE = 0.004 and the smallest difference between the ML-predicted and PBF-LB validated density were obtained for samples made with LightGBM-predicted parameters. Those samples exhibited a hardness of 604 ± 13 HV0.5, which increased to approximately 630 HV0.5 after tempering at 550 °C. The LightGBM-optimised parameters were further applied to fabricate a part of a forging die incorporating internal through-cooling channels, demonstrating the efficacy of machine learning-guided optimisation in achieving dense, defect-free H13 components suitable for industrial applications. Full article
(This article belongs to the Special Issue Multiscale Design and Optimisation for Metal Additive Manufacturing)
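The process window above varies laser power, scan speed, and hatch spacing. A common derived feature in PBF-LB work is the volumetric energy density, E = P / (v · h · t); the abstract does not state a layer thickness, so the 40 μm value below is an assumption for illustration:

```python
def energy_density(power_w, speed_mm_s, hatch_mm, layer_mm):
    """Volumetric energy density E = P / (v * h * t), in J/mm^3."""
    return power_w / (speed_mm_s * hatch_mm * layer_mm)

# mid-window parameters from the ranges quoted above; layer thickness assumed
e = energy_density(power_w=300.0, speed_mm_s=1200.0,
                   hatch_mm=0.075, layer_mm=0.040)
print(round(e, 1), "J/mm^3")  # 83.3 J/mm^3
```

Features like this are often added alongside the raw parameters when training density-prediction models such as those listed in the abstract.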

27 pages, 11265 KB  
Article
Using Machine Learning Methods to Predict Cognitive Age from Psychophysiological Tests
by Daria D. Tyurina, Sergey V. Stasenko, Konstantin V. Lushnikov and Maria V. Vedunova
Healthcare 2025, 13(24), 3193; https://doi.org/10.3390/healthcare13243193 - 5 Dec 2025
Viewed by 608
Abstract
Background/Objectives: This paper presents the results of predicting chronological age from psychophysiological tests using machine learning regressors. Methods: Subjects completed a series of psychological tests measuring various cognitive functions, including reaction time and cognitive conflict, short-term memory, verbal functions, and color and spatial perception. The sample included 99 subjects, 68 percent of whom were men and 32 percent were women. Based on the test results, 43 features were generated. To determine the optimal feature selection method, several approaches were tested alongside the regression models using MAE, R2, and CV_R2 metrics. SHAP and Permutation Importance (via Random Forest) delivered the best performance with 10 features. Features selected through Permutation Importance were used in subsequent analyses. To predict participants’ age from psychophysiological test results, we evaluated several regression models, including Random Forest, Extra Trees, Gradient Boosting, SVR, Linear Regression, LassoCV, RidgeCV, ElasticNetCV, AdaBoost, and Bagging. Model performance was compared using the determination coefficient (R2) and mean absolute error (MAE). Cross-validated performance (CV_R2) was estimated via 5-fold cross-validation. To assess metric stability and uncertainty, bootstrapping (1000 resamples) was applied to the test set, yielding distributions of MAE and RMSE from which mean values and 95% confidence intervals were derived. Results: The study identified RidgeCV with winsorization and standardization as the best model for predicting cognitive age, achieving a mean absolute error of 5.7 years and an R2 of 0.60. Feature importance was evaluated using SHAP values and permutation importance. SHAP analysis showed that stroop_time_color and stroop_var_attempt_time were the strongest predictors, followed by several task-timing features with moderate contributions. 
Permutation importance confirmed this ranking, with these two features causing the largest performance drop when permuted. Partial dependence plots further indicated clear positive relationships between these key features and predicted age. Correlation analysis stratified by sex revealed that most features were significantly associated with age, with stronger effects generally observed in men. Conclusions: Feature selection revealed Stroop timing measures and task-related metrics from math and campimetry tests as the strongest predictors, reflecting core cognitive processes linked to aging. The results underscore the value of careful outlier handling, feature selection, and interpretable regularized models for analyzing psychophysiological data. Future work should include longitudinal studies and integration with biological markers to further improve clinical relevance. Full article
(This article belongs to the Special Issue AI-Driven Healthcare Insights)
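The bootstrap uncertainty procedure described above (1000 resamples of the test set, percentile confidence intervals for the error metric) can be sketched as follows, with synthetic residuals standing in for the study's age predictions:

```python
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.uniform(20.0, 70.0, size=30)            # e.g. ages of test subjects
y_pred = y_true + rng.normal(0.0, 6.0, size=30)      # predictions with ~6 y error

abs_err = np.abs(y_pred - y_true)
# resample the per-subject absolute errors 1000 times with replacement
boot_mae = np.array([
    rng.choice(abs_err, size=abs_err.size, replace=True).mean()
    for _ in range(1000)
])
lo, hi = np.percentile(boot_mae, [2.5, 97.5])        # percentile 95% CI
print(f"MAE {abs_err.mean():.2f} y, 95% CI [{lo:.2f}, {hi:.2f}]")
```

The same resampling can be applied to RMSE or any other test-set metric, as the paper does.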

23 pages, 1747 KB  
Article
Machine Learning-Based Prediction of Soybean Plant Height from Agronomic Traits Across Sequential Harvests
by Bruno Rodrigues de Oliveira, Renato Lustosa Sobrinho, Fernando Rodrigues Trindade Ferreira, Fernando Ferrari Putti, Matteo Bodini, Camila Martins Saporetti and Leonardo Goliatt
AgriEngineering 2025, 7(12), 408; https://doi.org/10.3390/agriengineering7120408 - 2 Dec 2025
Cited by 1 | Viewed by 951
Abstract
The accurate prediction of plant height is crucial for optimizing soybean cultivar selection and improving yield estimations. In this study, we investigate the potential of machine learning (ML) algorithms to predict soybean plant height (PH) based on a diverse set of agronomic parameters analyzed from forty soybean cultivars evaluated across sequential harvests. Using a comprehensive dataset, the models Elastic Net (EN), Extra Trees (ET), Gaussian Process Regressor (GPR), K-Nearest Neighbors, and XGBoost (XGB) were compared in terms of predictive accuracy, uncertainty, and robustness. Our results demonstrate that ET outperformed other models with an average correlation coefficient of 0.674, R2 of 0.426 and the lowest RMSE of 6.859 cm and MAE of 5.361 cm, while also showing the lowest uncertainty (5.07%). The proposed ML framework includes an extensive model evaluation pipeline that incorporates the Performance Index (PI), ANOVA, and feature importance analysis, providing a multidimensional perspective on model behavior. The most influential features for PH prediction were the number of stems (NS) and insertion of the first pod (IFP). This research highlights the viability of integrating explainable ML techniques into agricultural decision support systems, enabling data-driven strategies for cultivar evaluation and phenotypic trait forecasting. Full article
(This article belongs to the Special Issue The Future of Artificial Intelligence in Agriculture, 2nd Edition)
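A feature-importance ranking like the one reported above (number of stems and first-pod insertion dominating plant-height prediction) can be sketched with an Extra Trees model; the data below are synthetic stand-ins for the agronomic traits, not the study's measurements:

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

rng = np.random.default_rng(11)
n = 240
ns = rng.poisson(3, n).astype(float)        # number of stems (assumed scale)
ifp = rng.uniform(8.0, 20.0, n)             # insertion of first pod, cm (assumed)
noise_feat = rng.normal(size=n)             # deliberately irrelevant feature
X = np.column_stack([ns, ifp, noise_feat])
y = 20.0 + 5.0 * ns + 2.0 * ifp + rng.normal(0.0, 2.0, n)  # plant height, cm

et = ExtraTreesRegressor(n_estimators=200, random_state=11).fit(X, y)
ranking = sorted(zip(["NS", "IFP", "noise"], et.feature_importances_),
                 key=lambda t: -t[1])
print(ranking)  # informative traits rank above the noise feature
```

Impurity-based importances like these are what tree ensembles report out of the box; the paper additionally cross-checks rankings with ANOVA and a performance index.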

29 pages, 6244 KB  
Article
Application of Long Short-Term Memory and XGBoost Model for Carbon Emission Reduction: Sustainable Travel Route Planning
by Sevcan Emek, Gizem Ildırar and Yeşim Gürbüzer
Sustainability 2025, 17(23), 10802; https://doi.org/10.3390/su172310802 - 2 Dec 2025
Cited by 2 | Viewed by 1083
Abstract
Travel planning is a process that allows users to obtain maximum benefit from their time, cost, and energy. When planning a route from one place to another, presenting alternative travel areas along the route is a valuable option. This study proposes a travel route planning (TRP) architecture using a Long Short-Term Memory (LSTM) and Extreme Gradient Boosting (XGBoost) model to improve both travel efficiency and environmental sustainability in route selection. This model incorporates carbon emissions directly into the route planning process by unifying user preferences, location recommendations, route optimization, and multimodal vehicle selection within a comprehensive framework. By merging environmental sustainability with user-focused travel planning, it generates personalized, practical, and low-carbon travel routes. The carbon emissions observed with TRP’s artificial intelligence (AI) recommendation route are presented comparatively with those of the user-determined route. XGBoost, Random Forest (RF), Categorical Boosting (CatBoost), Light Gradient Boosting Machine (LightGBM), Extra Trees Regressor (ETR), and Multi-Layer Perceptron (MLP) models are applied to the TRP model. LSTM is compared with Recurrent Neural Networks (RNNs) and Gated Recurrent Unit (GRU) models. Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Squared Error (MSE), and Normalized Root Mean Square Error (NRMSE) error measurements of these models are carried out, and the best result is obtained using XGBoost and LSTM. TRP enhances environmental responsibility awareness within travel planning by integrating sustainability-oriented parameters into the decision-making process. Unlike conventional reservation systems, this model encourages individuals and organizations to prioritize eco-friendly options by considering not only financial factors but also environmental and socio-cultural impacts. 
By promoting responsible travel behaviors and supporting the adoption of sustainable tourism practices, the proposed approach contributes significantly to the broader dissemination of environmentally conscious travel choices. Full article
(This article belongs to the Special Issue Design of Sustainable Supply Chains and Industrial Processes)
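The four error measures named in this abstract are easy to state explicitly. NRMSE has several conventions; the sketch below normalizes RMSE by the observed range, which is one common choice and an assumption about the paper's definition:

```python
import numpy as np

def regression_errors(y_true, y_pred):
    """MSE, RMSE, MAE, and range-normalized RMSE for one prediction vector."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mse = np.mean((y_true - y_pred) ** 2)
    rmse = np.sqrt(mse)
    mae = np.mean(np.abs(y_true - y_pred))
    nrmse = rmse / (y_true.max() - y_true.min())  # range normalization (assumed)
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "NRMSE": nrmse}

metrics = regression_errors([10.0, 20.0, 30.0], [12.0, 18.0, 33.0])
print(metrics)
```

These are the quantities the study compares across XGBoost, RF, CatBoost, LightGBM, ETR, MLP, LSTM, RNN, and GRU.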

18 pages, 5042 KB  
Article
Tree-Based Regressor Comparison for Burn Severity Mapping: Spatially Blocked Validation Within and Across Fires
by Linh Nguyen Van and Giha Lee
Remote Sens. 2025, 17(22), 3756; https://doi.org/10.3390/rs17223756 - 19 Nov 2025
Viewed by 755
Abstract
Accurate, timely maps of post-fire burn severity are vital for rehabilitation, hydrologic hazard assessment, and ecosystem recovery in the western United States, where large, frequent wildfires and steep environmental gradients challenge model generalization. Machine learning models, particularly tree-based regressors, are increasingly used to relate satellite-derived spectral features to ground-based severity metrics such as the Composite Burn Index (CBI). However, model generalization across spatial domains, both within and between wildfires, remains poorly characterized. In this study, we benchmarked six tree-based regression models (Decision Tree-DT, Random Forest-RF, Extra Trees-ET, Bagging, Gradient Boosting-GB, and AdaBoost-AB) for predicting wildfire severity from Landsat surface reflectance data across ten U.S. fire events. Two spatial validation strategies were applied: (i) within-fire spatial generalization via Leave-One-Cluster-Out (LOCO) and (ii) cross-fire transfer via Leave-One-Fire-Out (LOFO). Performance is assessed with R2, RMSE, and MAE under identical predictors and default hyperparameters. Results indicate that, under LOCO, variance-reduction ensembles lead: RF attains R2 = 0.679, MAE = 0.397, RMSE = 0.516, with ET statistically comparable (R2 = 0.673, MAE = 0.393, RMSE = 0.518), and Bagging close behind (R2 = 0.668, MAE = 0.402, RMSE = 0.525). Under LOFO, ET transfers best (R2 = 0.616, MAE = 0.450, RMSE = 0.571), followed by GB (R2 = 0.564, MAE = 0.479, RMSE = 0.606) and RF (R2 = 0.543, MAE = 0.490, RMSE = 0.621). These results indicate that tree ensembles, especially ET and RF, are competitive under minimal tuning for rapid severity mapping; in practice, RF is a strong choice for an individual fire with local calibration, whereas ET is preferred when model transferability to unseen fires is paramount. Full article
(This article belongs to the Special Issue Advances in Remote Sensing for Burned Area Mapping)
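The leave-one-fire-out (LOFO) protocol above maps directly onto scikit-learn's `LeaveOneGroupOut`, which holds out every sample from one fire per fold; a minimal sketch with synthetic fire labels and reflectance-like features (not the study's Landsat data):

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(5)
n = 300
X = rng.normal(size=(n, 6))                   # stand-ins for band/index features
y = 0.8 * X[:, 0] + rng.normal(0.0, 0.3, n)   # synthetic CBI-like severity target
fires = rng.integers(0, 10, size=n)           # ten fire events as group labels

# one fold per fire: train on nine fires, test on the held-out fire
scores = cross_val_score(
    ExtraTreesRegressor(n_estimators=100, random_state=5),
    X, y, groups=fires, cv=LeaveOneGroupOut(),
    scoring="neg_mean_absolute_error",
)
print(len(scores), "folds, mean MAE:", -scores.mean())
```

The within-fire LOCO variant is the same construction with spatial clusters inside a single fire as the group labels.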

22 pages, 1957 KB  
Article
GWO-Optimized Ensemble Learning for Interpretable and Accurate Prediction of Student Academic Performance in Smart Learning Environments
by Mohammed Husayn, Oluwatayomi Rereloluwa Adegboye and Ahmad Alzubi
Appl. Sci. 2025, 15(22), 12163; https://doi.org/10.3390/app152212163 - 16 Nov 2025
Cited by 1 | Viewed by 946
Abstract
Accurate and interpretable prediction of student academic performance is a cornerstone of data-driven educational support systems, enabling timely interventions, personalized learning pathways, and equitable resource allocation. While ensemble machine learning models such as Random Forest, Extra Trees, and CatBoost have shown promise in educational data mining, their predictive power and generalizability are often limited by suboptimal weighting schemes and sensitivity to hyperparameter configurations. To address this, we propose a Grey Wolf Optimizer (GWO)-guided ensemble framework that dynamically optimizes each base regressor’s contribution to minimize prediction error while preserving model transparency. Evaluated on a real-world student performance dataset, the proposed approach achieves a coefficient of determination (R2) of 0.93, significantly outperforming individual and conventional ensemble baselines. Furthermore, we integrate SHAP (SHapley Additive exPlanations) to provide educator-friendly interpretability, revealing that daily study hours, study effectiveness, lifestyle score, and screen time are the most influential predictors of exam outcomes. By bridging an optimized machine learning model with educational analytics, this work delivers a robust, transparent, and high-performing AI solution tailored for intelligent tutoring systems, early-warning platforms, and adaptive learning environments. The methodology exemplifies how nature-inspired optimization can enhance not only accuracy but also actionable insight for stakeholders in smart education ecosystems. Full article
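The core idea of the GWO-guided ensemble is to search for per-model weights that minimize validation error. A minimal sketch, using a toy regression dataset and a bare-bones Grey Wolf Optimizer over three base-regressor weights (the real study's dataset, base-model settings, and GWO parameters are not reproduced here):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (ExtraTreesRegressor, GradientBoostingRegressor,
                              RandomForestRegressor)
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=400, n_features=8, noise=10.0, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

bases = [RandomForestRegressor(n_estimators=50, random_state=0),
         ExtraTreesRegressor(n_estimators=50, random_state=0),
         GradientBoostingRegressor(random_state=0)]
preds = np.column_stack([m.fit(X_tr, y_tr).predict(X_val) for m in bases])

def rmse_of(w):
    w = np.abs(w) + 1e-12
    w = w / w.sum()                          # project onto the simplex
    return mean_squared_error(y_val, preds @ w) ** 0.5

# Minimal Grey Wolf Optimizer: wolves are candidate weight vectors,
# updated toward the three best (alpha, beta, delta) each iteration.
rng = np.random.default_rng(0)
wolves = rng.random((12, 3))
for t in range(40):
    a = 2 - 2 * t / 40                       # exploration coefficient decays to 0
    fitness = np.array([rmse_of(w) for w in wolves])
    alpha, beta, delta = wolves[np.argsort(fitness)[:3]]
    for i in range(len(wolves)):
        new = np.zeros(3)
        for leader in (alpha, beta, delta):
            r1, r2 = rng.random(3), rng.random(3)
            A, C = 2 * a * r1 - a, 2 * r2
            new += leader - A * np.abs(C * leader - wolves[i])
        wolves[i] = np.clip(new / 3, 0, 1)

best = wolves[np.argmin([rmse_of(w) for w in wolves])]
best = (np.abs(best) + 1e-12) / (np.abs(best) + 1e-12).sum()
print("weights:", np.round(best, 3), "RMSE:", round(rmse_of(best), 2))
```

The simplex projection inside `rmse_of` keeps the search unconstrained while guaranteeing the ensemble output is a convex combination of the base predictions.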

23 pages, 16680 KB  
Article
Interpretation of Dominant Features Governing Compressive Strength in One-Part Geopolymer
by Yiren Wang, Yihai Jia, Chuanxing Wang, Weifa He, Qile Ding, Fengyang Wang, Mingyu Wang and Kuizhen Fang
Buildings 2025, 15(20), 3661; https://doi.org/10.3390/buildings15203661 - 11 Oct 2025
Cited by 1 | Viewed by 744
Abstract
One-part geopolymers (OPG) offer a low-carbon alternative to Portland cement, yet mix design remains largely empirical. This study couples machine learning with SHAP (Shapley Additive Explanations) to quantify how mix and curing factors govern performance in Ca-containing OPG. We trained six regressors—Random Forest, ExtraTrees, SVR, Ridge, KNN, and XGBoost—on a compiled dataset and selected XGBoost as the primary model based on prediction accuracy. Models were built separately for four targets: compressive strength at 3, 7, 14, and 28 days. SHAP analysis reveals four dominant variables across targets—Slag, Na2O, Ms, and the water-to-binder ratio (w/b)—while the sand-to-binder ratio (s/b), temperature, and humidity are secondary within the tested ranges. Strength evolution follows a reaction–densification logic: at 3 days, Slag dominates as Ca accelerates C–(N)–A–S–H formation; at 7–14 days, Na2O leads as alkalinity/soluble silicate controls dissolution–gelation; by 28 days, Slag and Na2O jointly set the strength ceiling, with w/b continuously regulating porosity. Interactions are strongest for Slag × Na2O (Ca–alkalinity synergy). These results provide actionable guidance: prioritize Slag and Na2O while controlling w/b for strength. The XGBoost+SHAP workflow offers transparent, data-driven decision support for OPG mix optimization and can be extended with broader datasets and formal validation to enhance generalization. Full article
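A model-plus-attribution workflow of the kind described can be sketched with scikit-learn alone. The sketch below uses a synthetic dataset, hypothetical feature names, GradientBoostingRegressor as a stand-in for XGBoost, and permutation importance as a simpler stand-in for SHAP values (the paper's actual data, model, and SHAP analysis are not reproduced here):

```python
import numpy as np
from sklearn.datasets import make_regression  # stand-in for the compiled OPG dataset
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Illustrative mix/curing feature names, not the paper's data
features = ["slag", "Na2O", "Ms", "w_b", "s_b", "temperature", "humidity"]
X, y = make_regression(n_samples=300, n_features=7, n_informative=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)

# Rank features by how much shuffling each one degrades held-out accuracy
imp = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
ranking = sorted(zip(features, imp.importances_mean), key=lambda p: -p[1])
for name, score in ranking[:4]:
    print(f"{name}: {score:.3f}")
```

Unlike permutation importance, SHAP additionally yields signed, per-sample attributions and interaction values (e.g., the Slag × Na2O synergy noted above), which is why the study pairs it with the fitted model.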

32 pages, 16950 KB  
Article
Regression-Based Performance Prediction in Asphalt Mixture Design and Input Analysis with SHAP
by Kemal Muhammet Erten and Remzi Gürfidan
Appl. Sci. 2025, 15(19), 10779; https://doi.org/10.3390/app151910779 - 7 Oct 2025
Cited by 1 | Viewed by 1204
Abstract
The primary aim of this study is to predict the Marshall stability and flow values of hot-mix asphalt samples prepared according to the Marshall design method using regression-based machine learning algorithms. To overcome the limited number of experimental observations, synthetic data generation was applied using the Conditional Tabular Generative Adversarial Network (CTGAN), while the structural consistency of the generated data was validated through Principal Component Analysis (PCA). Two datasets containing 17 physical and mechanical input variables were analyzed, and multiple regression models were compared, including Extra Trees, Random Forest, Gradient Boosting, AdaBoost, and K-Nearest Neighbors. Among these, the Extra Trees Regressor consistently achieved the best results with near-perfect accuracy in flow predictions (MAE ≈ 4.06 × 10⁻¹⁵, RMSE ≈ 4.97 × 10⁻¹⁵, accuracy ≈ 99.99%) and high performance in stability predictions (MAE = 109.52, RMSE = 150.67, accuracy = 90.45%). Furthermore, model interpretability was ensured by applying SHapley Additive Explanations (SHAP), which revealed that parameters such as softening point, VMA, penetration, and void ratios were the most influential features. These findings demonstrate that regression-based ensemble models, combined with synthetic data augmentation and explainable AI methods, can serve as reliable and interpretable tools in asphalt mixture design. Full article
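The PCA consistency check described above amounts to projecting both real and synthetic samples into the same principal-component space and verifying they occupy the same region. A minimal sketch, with random data standing in for the 17-variable asphalt dataset and additive noise standing in for CTGAN output:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Stand-in for the 17 physical/mechanical input variables
real = rng.normal(size=(200, 17)) @ rng.normal(size=(17, 17))
# Stand-in for CTGAN-generated samples (real CTGAN output would be used here)
synthetic = real + rng.normal(scale=0.1, size=real.shape)

pca = PCA(n_components=2).fit(real)
r2d, s2d = pca.transform(real), pca.transform(synthetic)

# Synthetic data should occupy the same principal-component region as the real data
centroid_shift = np.linalg.norm(r2d.mean(axis=0) - s2d.mean(axis=0))
print(round(float(centroid_shift), 3))
```

A small centroid shift (and visually overlapping scatter in the PC1/PC2 plane) supports the claim that the generator preserved the data's structure.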

18 pages, 1174 KB  
Article
Gender Knowledges, Cultures of Equality, and Structural Inequality: Interpreting Female Employment Patterns in Manufacturing Through Interpretable Machine Learning
by Bediha Sahin
Soc. Sci. 2025, 14(9), 545; https://doi.org/10.3390/socsci14090545 - 10 Sep 2025
Cited by 1 | Viewed by 2218
Abstract
Persistent gender inequality in industrial employment continues to challenge inclusive labor systems worldwide. While education and labor market reforms have expanded opportunities for women, structural barriers remain deeply embedded in manufacturing sectors. This study adopts a systems-based perspective to investigate the institutional, demographic, and health-related factors shaping female employment in manufacturing across ten countries from 2013 to 2022. By integrating feminist political economy with interpretable machine learning techniques—including Random Forest, Gradient Boosting, and Extra Trees regressors—the study models non-linear and interactive relationships among thirteen structural indicators drawn from the World Bank’s World Development Indicators. The findings reveal that general female labor force participation is the strongest and most consistent predictor of women’s inclusion in manufacturing. Health-related variables, such as maternal mortality and fertility rates, exhibit strong negative effects, underscoring the continued influence of caregiving burdens and inadequate health systems. Education indicators show more variable impacts, suggesting that institutional context mediates their effectiveness. The use of SHAP and Partial Dependence Plots enhances the transparency of the models and supports a more nuanced understanding of how structural forces shape gendered labor outcomes. In addition to modeling structural inequalities, this study highlights how gender knowledges and cultures of equality are contextually produced and negotiated within the manufacturing sector. The findings underscore the importance of understanding both global systems and local cultural frameworks in shaping gendered employment outcomes. By linking interpretable machine learning with systems thinking, this research provides a holistic and data-driven account of industrial gender inequality. The results offer policy-relevant insights for designing more inclusive labor strategies that address not only economic incentives but also the social and institutional systems in which employment patterns are embedded. Full article
(This article belongs to the Special Issue Gender Knowledges and Cultures of Equalities in Global Contexts)
