Search Results (866)

Search Parameters:
Keywords = SHAP-values

18 pages, 1883 KB  
Article
A Hybrid Predictive Model for Employee Turnover: Integrating Ensemble Learning and Feature-Driven Insights from IBM HR Analytics
by Muna I. Alyousef, Hamza Wazir Khan and Mian Usman Sattar
Information 2026, 17(2), 208; https://doi.org/10.3390/info17020208 - 17 Feb 2026
Abstract
Employee turnover presents a significant challenge to modern organizations, often resulting in operational disruptions, substantial hiring costs, and a loss of institutional knowledge. While traditional human resource practices have historically been reactive, the emergence of machine learning has introduced a proactive capability to anticipate and mitigate attrition before it occurs. This research utilizes the IBM HR Analytics dataset, which contains 1470 employee records and 35 distinct features, to develop a hybrid machine learning model designed to enhance the accuracy of turnover predictions. To ensure the model’s effectiveness, the researchers employed a comprehensive preprocessing phase that included eliminating non-informative features, applying label encoding to categorical data, and using StandardScaler to normalize quantitative values. A critical component of the study addressed the common issue of class imbalance within HR data. To resolve this, a hybrid sampling strategy was implemented, combining Synthetic Minority Over-sampling Technique (SMOTE) and Adaptive Synthetic Sampling (ADASYN) to create a more balanced learning environment for the algorithms. The core of the predictive engine is a soft voting ensemble that integrates three powerful algorithms: Random Forest, XGBoost, and logistic regression. Evaluated on an 80/20 train–test split, the tuned XGBoost model achieved an impressive 84% accuracy and an Area Under the Curve (AUC) of 0.80. Meanwhile, the logistic regression component contributed the highest F1-score, reinforcing the overall strength and balance of the ensemble approach. These metrics confirm that the hybrid model is both robust and reliable for identifying at-risk employees. Beyond simple prediction, the study prioritized interpretability by using SHapley Additive exPlanations (SHAP) to identify the primary drivers of attrition. 
The analysis revealed that the most significant variables influencing an employee’s decision to leave include the interaction between job level and experience, frequent overtime, monthly income, current job level, and total years spent at the company. By providing these data-driven insights, the model empowers HR teams to transition from reactive troubleshooting to proactive retention planning, ultimately securing the organization’s talent and stability. Full article
(This article belongs to the Special Issue Machine Learning Approaches for Prediction and Decision Making)
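The soft voting step mentioned in this abstract can be illustrated with a minimal sketch. It assumes that soft voting means averaging each model's predicted attrition probability and thresholding the mean at 0.5. The function name `soft_vote`, the equal weights, and all probabilities are hypothetical and are not taken from the article.

```python
from typing import List, Optional, Sequence


def soft_vote(
    model_probs: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Return the weighted mean positive-class probability per sample."""
    if not model_probs:
        raise ValueError("need at least one model")
    n_samples = len(model_probs[0])
    if any(len(p) != n_samples for p in model_probs):
        raise ValueError("all models must score the same samples")
    if weights is None:
        weights = [1.0] * len(model_probs)  # equal weights (assumption)
    total = float(sum(weights))
    return [
        sum(w * p[i] for w, p in zip(weights, model_probs)) / total
        for i in range(n_samples)
    ]


# Hypothetical attrition probabilities for three employees from three models.
rf_probs = [0.20, 0.70, 0.55]
xgb_probs = [0.10, 0.80, 0.45]
lr_probs = [0.30, 0.60, 0.35]

ensemble_probs = soft_vote([rf_probs, xgb_probs, lr_probs])
# Threshold the averaged probability at 0.5 to get a class label.
ensemble_labels = [int(p >= 0.5) for p in ensemble_probs]
```

Averaging probabilities rather than hard labels lets a confident model outweigh two uncertain ones, which is the usual motivation for soft over hard voting.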

17 pages, 4034 KB  
Article
Non-Destructive Assessment of Beef Freshness Using Visible and Near-Infrared Spectroscopy with Interpretable Machine Learning
by Ruoxin Chen, Wei Ning, Xufen Xie, Jingran Bi, Gongliang Zhang and Hongman Hou
Foods 2026, 15(4), 728; https://doi.org/10.3390/foods15040728 - 15 Feb 2026
Viewed by 52
Abstract
Beef freshness is a critical indicator of meat quality and safety, and its rapid, non-destructive detection is of significant importance for ensuring consumer health and enhancing quality control throughout the meat industry chain. This study developed a novel methodology for non-destructive beef freshness assessment using visible and near-infrared (Vis-NIR) spectroscopy combined with machine learning, explainable artificial intelligence (xAI) techniques, and the SHapley Additive exPlanations (SHAP) framework. An improved hybrid heuristic method, particle swarm optimization–genetic algorithm (PSOGA), was used for feature selection, optimizing the wavelength subset for predicting beef quality indicators, including total volatile basic nitrogen (TVB-N) and color parameters (L*, a*, and b*). The eXtreme Gradient Boosting (XGBoost) was employed for regression modeling, and the results showed that PSOGA significantly outperforms traditional methods, with the PSOGA-XGBoost model achieving a satisfactory prediction accuracy (R2p values of 0.9504 for TVB-N, 0.9540 for L*, 0.8939 for a*, and 0.9416 for b*). The SHAP framework identified the key wavelengths as 1236 nm and 1316 nm for TVB-N, 728 nm for L*, 576 nm for a*, and 604 nm for b*, providing valuable insights into the determination of key wavelengths and enhancing the interpretability of the model. The results demonstrated the effectiveness of PSOGA and SHAP, providing a promising analytical method for monitoring beef freshness. Full article
(This article belongs to the Special Issue Advances in Meat Quality and Quality Control)

18 pages, 1390 KB  
Article
Predicting Anticipated Telehealth Use: Development of the CONTEST Score and Machine Learning Models Using a National U.S. Survey
by Richard C. Wang and Usha Sambamoorthi
Healthcare 2026, 14(4), 500; https://doi.org/10.3390/healthcare14040500 - 14 Feb 2026
Viewed by 164
Abstract
Objectives: Anticipated telehealth use is an important determinant of whether telehealth can function as a durable component of hybrid care models. However, there are limited practical tools to identify patients at risk of discontinuing telehealth. We aim to (1) identify factors associated with anticipated telehealth use; (2) develop a risk stratification tool (CONTEST); (3) compare its performance with machine learning (ML) models; and (4) evaluate model fairness across sex and race/ethnicity. Methods: We conducted a retrospective cross-sectional analysis of the 2024 Health Information National Trends Survey 7 (HINTS 7), including U.S. adults with ≥1 telehealth visit in the prior 12 months. The primary outcome was anticipated telehealth use. Survey-weighted multivariable logistic regression informed a Framingham-style point score (CONTEST). ML models (XGBoost, random forest, logistic regression) were trained and evaluated using the area under the receiver operating characteristic curve (AUROC), precision, and recall. Global interpretation used SHAP values. Fairness was assessed using group metrics (Disparate Impact, Equal Opportunity) and individual counterfactual-flip rates (CFR). Results: Approximately one-third of adults reported at least one telehealth visit in the prior year. Among these users, nearly one in ten expressed an unwillingness to continue using telehealth in the future. Four telehealth experience factors were independently associated with unwillingness to continue: lower perceived convenience, technical problems, lower perceived quality compared to in-person care, and unwillingness to recommend telehealth. CONTEST demonstrated strong discrimination for identifying individuals with lower anticipated telehealth use (AUROC 0.876; 95% CI, 0.843–0.908). XGBoost performed best among the ML models (AUROC 0.902 with all features). 
With the same four top features, an ML-informed point score achieved an AUROC of 0.872 (95% CI, 0.839–0.904), and a four-feature XGBoost model yielded an AUROC of 0.893 (95% CI, 0.821–0.948, p > 0.05). Group fairness metrics revealed disparities across sex and race/ethnicity, whereas individual counterfactual analyses indicated low flip rates (sex CFR: 0.024; race/ethnicity CFR: 0.013). Conclusions: A parsimonious, interpretable score (CONTEST) and feature-matched ML models provide comparable discrimination for stratifying risk of lower anticipated telehealth use. Sustained engagement hinges on convenience, technical reliability, perceived quality, and patient advocacy. Implementation should pair prediction with operational support and routine fairness monitoring to mitigate subgroup disparities. Full article
(This article belongs to the Special Issue Informatics in Healthcare Outcomes)
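The two kinds of fairness check named in this abstract, a group metric and an individual counterfactual-flip rate, can be sketched as follows. The definitions used here are common conventions and are assumptions: disparate impact as a ratio of positive-prediction rates between two groups, and the flip rate as the share of individuals whose prediction changes when only the sensitive attribute is swapped. The toy classifier, feature names, and data are hypothetical.

```python
from typing import Any, Callable, Dict, List, Sequence

Row = Dict[str, Any]


def positive_rate(preds: Sequence[int], groups: Sequence[str], group: str) -> float:
    """Share of positive predictions within one group."""
    members = [p for p, g in zip(preds, groups) if g == group]
    return sum(members) / len(members)


def disparate_impact(
    preds: Sequence[int], groups: Sequence[str], group_a: str, group_b: str
) -> float:
    """Ratio of positive-prediction rates, group_a over group_b."""
    return positive_rate(preds, groups, group_a) / positive_rate(preds, groups, group_b)


def counterfactual_flip_rate(
    predict: Callable[[Row], int],
    rows: List[Row],
    attribute: str,
    swap: Dict[str, str],
) -> float:
    """Share of rows whose prediction changes when only `attribute` is swapped."""
    flips = 0
    for row in rows:
        altered = dict(row)
        altered[attribute] = swap[row[attribute]]
        if predict(altered) != predict(row):
            flips += 1
    return flips / len(rows)


def toy_predict(row: Row) -> int:
    # Hypothetical rule: flag low perceived convenience, with a deliberate
    # dependence on sex at the boundary so that some predictions flip.
    score = row["convenience"]
    return int(score <= 2 or (score == 3 and row["sex"] == "F"))


rows = [{"sex": s, "convenience": c} for s in ("F", "M") for c in (2, 3, 5, 4)]
preds = [toy_predict(r) for r in rows]
groups = [r["sex"] for r in rows]

di = disparate_impact(preds, groups, "F", "M")
cfr = counterfactual_flip_rate(toy_predict, rows, "sex", {"F": "M", "M": "F"})
```

The two measures answer different questions. A group ratio can differ from 1 because the groups differ in other features, whereas the flip rate isolates the model's direct dependence on the attribute.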

34 pages, 3490 KB  
Article
Forecasting Municipal Financial Distress in South Africa: A Machine Learning Approach
by Nkosinathi Emmanuel Radebe, Bomi Cyril Nomlala and Frank Ranganai Matenda
Forecasting 2026, 8(1), 18; https://doi.org/10.3390/forecast8010018 - 14 Feb 2026
Viewed by 108
Abstract
Persistent fiscal stress in South African municipalities undermines service delivery, yet practical tools for early detection remain limited. This study predicts one-year-ahead municipal financial distress to support risk-based prioritisation. We develop machine learning models using a 2018/19–2022/23 municipality panel, combining 13 financial health indicators from State of Local Government (SoLG) reports with selected socio-economic variables. Penalised logistic regression is benchmarked against random forest and XGBoost under a leakage-aware, time-ordered split into training, validation, and an out-of-time test year; class imbalance is handled through class weighting. Performance is evaluated using PR-AUC, ROC-AUC, calibration, and a capacity-constrained Top-30 rule. All models outperform a naïve last-year baseline on the out-of-time test (PR-AUC 0.934–0.954; ROC-AUC 0.886–0.923), with bootstrap intervals supporting robustness. Random forest performs best overall, while penalised logistic regression remains competitive. Under the Top-30 rule (12.3% workload), precision is high (precision@30 0.967–1.000) while recall is modest (recall@30 0.186–0.192). SHAP values and logistic odds ratios identify liquidity, solvency, cash coverage, and employment deprivation as key drivers. The Top-30 rule corresponds to an annual intensive monitoring portfolio that is reasonable under constrained staffing and budget capacity in national and provincial oversight units, while probability thresholds are reported as conventional benchmarks rather than as policy triggers. Full article
(This article belongs to the Section Forecasting in Economics and Management)
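A capacity-constrained Top-k rule of the kind described in this abstract can be sketched as below: rank units by predicted risk, inspect only the top k, and report precision@k and recall@k. The function name, scores, and labels are hypothetical; k = 3 stands in for the article's Top-30 rule.

```python
from typing import Sequence, Tuple


def precision_recall_at_k(
    scores: Sequence[float], labels: Sequence[int], k: int
) -> Tuple[float, float]:
    """Precision and recall when only the k highest-scored units are flagged."""
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of units")
    top_k = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    hits = sum(labels[i] for i in top_k)
    total_positives = sum(labels)
    return hits / k, hits / total_positives


# Hypothetical predicted distress probabilities and true outcomes (1 = distressed).
scores = [0.95, 0.90, 0.85, 0.60, 0.55, 0.40, 0.30, 0.10]
labels = [1, 1, 1, 1, 0, 1, 0, 1]

precision_at_3, recall_at_3 = precision_recall_at_k(scores, labels, k=3)
```

In this toy case every flagged unit is truly distressed, yet only half of the distressed units are flagged. That is the same qualitative pattern as the high precision@30 and modest recall@30 reported above: a fixed monitoring budget caps recall whenever positives outnumber k.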

36 pages, 31133 KB  
Article
SOBLE-Top5: A Stacking Ensemble Learning-Based Seasonal Downscaling Inversion Framework for Surface Soil Moisture Using Multi-Source Data
by Shengmin Zhu, Haiyang Yu, Bingqian Ji, Qi Liu and Deng Pan
Remote Sens. 2026, 18(4), 585; https://doi.org/10.3390/rs18040585 - 13 Feb 2026
Viewed by 150
Abstract
Surface soil moisture (SSM) serves as a critical indicator for regional water cycles, agricultural management, and drought monitoring. However, existing SMAP data suffers from limited spatial resolution, making it challenging to meet the demands of large-scale, high-resolution applications. Taking Henan Province, located in east-central China with a continental monsoon climate and marked seasonal variability, as the study area, this research integrates multi-source data to develop a seasonal modeling strategy. Based on stacking ensemble learning, the SSM downscaling inversion model (SOBLE-Top5) is constructed. SHAP value attribution analysis is employed to reveal the primary drivers of seasonal dynamics. The results indicate: (1) The SSM exhibits distinct seasonal characteristics. Compared to the all-season modeling, the RMSE and R² metrics significantly improve during spring and summer. The winter ET and RF models show an approximately 9–14% higher R² and a 47–50% lower RMSE. (2) The SOBLE-Top5 strategy achieved up to a 4.65% higher R² and a 21.22% lower RMSE compared to the optimal single base model. (3) Spatial variations in the SSM characteristics reveal stable performance during the winter. The spring saw slight SSM declines in the northern regions due to rising temperatures. The study area reached its annual low (<0.08 m³/m³) in May–June. Driven by flood season precipitation, July–August witnessed local increases exceeding 52%. The autumn exhibited a stable-then-rising trend with pronounced north–south gradient characteristics. (4) The SHAP analysis indicates that the winter SSM is primarily controlled by bulk density and clay content. The spring SSM is most influenced by LST, followed by bulk density. The summer and the autumn SSM are synergistically driven by multiple factors including elevation, temperature, and precipitation, with the summer precipitation exerting the most significant impact on instantaneous SSM variations. Full article
(This article belongs to the Section Remote Sensing in Agriculture and Vegetation)

29 pages, 5892 KB  
Article
Interpretable Machine Learning with SHAP Identifies Key Biomarkers in a Multi-Factorial Spectrum of Age-Related Neurological and Metabolic Conditions
by Daniil V. Artamonov, Polina I. Popova, Ekaterina A. Korf, Natalia G. Voitenko, Alisa A. Chernysheva, Pavel V. Avdonin, Richard O. Jenkins and Nikolay V. Goncharov
Int. J. Mol. Sci. 2026, 27(4), 1805; https://doi.org/10.3390/ijms27041805 - 13 Feb 2026
Viewed by 114
Abstract
Vascular and metabolic disorders in the elderly—including acute ischemic stroke (AIS), chronic cerebral circulation insufficiency (CCCI), type 2 diabetes mellitus (DM), and subcortical ischemic vascular dementia (SIVD)—pose a major diagnostic challenge due to their reliance on multi-parameter blood chemistry. In this study, 49 biochemical features were analyzed within a cohort of 120 patients. The application of variance-aware statistical testing revealed that several features (e.g., Fe, Transf, RDW%, LDL) exhibited statistically significant heterogeneity of variance (p < 0.05), which is known to distort standard ANOVA inference. While standard machine-learning (ML) classifiers demonstrated variable performance across clinical groups, a gradient boosting model with restricted tree depth (max depth = 3) achieved high discriminative accuracy, yielding F1-scores between 0.87 and 0.96 across all five clinical classes. Through the use of Shapley Additive Explanations (SHAP), key stable biomarkers including iron (Fe), transferrin, and glucose were identified as having synergistic interactions in model predictions. A comparative analysis of feature importance ranks indicated consistency between statistical significance and SHAP values, with Spearman correlation coefficients reaching 0.53 for groups 1–2 and 0.59 for groups 1–5. Conversely, unsupervised KMeans clustering (k = 5) revealed a poor correspondence with clinical labels, yielding an Adjusted Rand Index (ARI) of 0.198 and Normalized Mutual Information (NMI) of 0.286. These results underscore that statistical structures in biochemical data do not always map to meaningful clinical categories and advocate for the adoption of variance-aware workflows and interpretable ML to enhance diagnostic reliability in aging populations. Full article
(This article belongs to the Special Issue Challenges and Innovation in Neurodegenerative Diseases, 2nd Edition)
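The rank comparison between statistical significance and SHAP importance described in this abstract can be illustrated with Spearman's coefficient. The sketch assumes no tied ranks, so the closed form 1 − 6Σd²/(n(n² − 1)) applies. The feature names come from the abstract, but every score below is hypothetical and the article's actual procedure may differ.

```python
from typing import Dict


def ranks_descending(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank 1 goes to the largest score; assumes no ties."""
    ordered = sorted(scores, key=lambda name: scores[name], reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def spearman_no_ties(rank_a: Dict[str, int], rank_b: Dict[str, int]) -> float:
    """Spearman's rho via the closed form that is valid when ranks have no ties."""
    n = len(rank_a)
    d_squared = sum((rank_a[name] - rank_b[name]) ** 2 for name in rank_a)
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


# Hypothetical importance scores for five biomarkers.
statistical_scores = {"Fe": 5.0, "Transf": 4.0, "Glucose": 3.0, "RDW%": 2.0, "LDL": 1.0}
shap_scores = {"Fe": 0.30, "Transf": 0.35, "Glucose": 0.20, "RDW%": 0.05, "LDL": 0.10}

rho = spearman_no_ties(
    ranks_descending(statistical_scores), ranks_descending(shap_scores)
)
```

Working on ranks rather than raw scores makes the comparison insensitive to the very different scales of test statistics and SHAP magnitudes.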

29 pages, 123573 KB  
Article
Dynamic Landslide Susceptibility Assessment Integrating SBAS-InSAR and Interpretable Machine Learning: A Case Study of the Baihetan Reservoir Area, Southwest China
by Hongfei Wang, Chuhan Deng, Ziyou Zhang, Zhekai Jiang, Qi Wei, Weijie Yi, Tao Chen and Junwei Ma
Remote Sens. 2026, 18(4), 578; https://doi.org/10.3390/rs18040578 - 12 Feb 2026
Viewed by 120
Abstract
Landslide susceptibility mapping (LSM) is a fundamental approach for identifying and predicting areas prone to slope failure. However, most conventional LSM methods are based on time-invariant conditioning factors or long-term-averaged predictors and seldom incorporate slope-kinematic information from deformation observations, thereby limiting their ability to capture evolving slope instability. Moreover, the black-box nature of many models limits interpretability and confidence in their predictions. In this study, we integrate small baseline subset interferometric synthetic aperture radar (SBAS-InSAR) with interpretable machine learning (ML) methods to develop a dynamic LSM framework that improves the accuracy and reliability of susceptibility assessment. First, static LSM was performed using ML algorithms, and SHapley Additive exPlanations (SHAP) was used to quantify and visualize feature importance. Subsequently, SBAS-InSAR was applied to retrieve surface deformation rates. Finally, a dynamic LSM matrix was constructed to integrate InSAR-derived deformation with static susceptibility classes, producing time-varying landslide susceptibility maps. Application of the framework in the Baihetan Reservoir area, Southwest China, demonstrates its practical value. During the static LSM phase, the extreme gradient boosting (XGBoost) model achieved strong predictive performance (the area under the receiver operating characteristic curve (AUC) = 0.8864; accuracy = 0.8315; precision = 0.8947), outperforming the alternative models. SHAP analysis indicates that elevation and distance to rivers are the primary controls on landslide occurrence. Incorporating SBAS-InSAR deformation data into the dynamic LSM matrix effectively captures the spatiotemporal evolution of slope instability. 
Susceptibility upgrades are observed for multiple inventoried landslides, and the actively deforming Xiaomidi and Gantianba landslides are presented as representative case studies, further supported by multisource observations from satellite imagery, unmanned aerial vehicle (UAV) surveys, and ground-based global navigation satellite system (GNSS) monitoring. Consequently, the proposed dynamic LSM framework overcomes limitations of static approaches by integrating deformation information and enhancing interpretability through explainable artificial intelligence. Full article

17 pages, 1116 KB  
Article
Deep Learning for Emergency Department Sustainability: Interpretable Prediction of Revisit
by Wang-Chuan Juang, Zheng-Xun Cai, Chia-Mei Chen and Zhi-Hong You
Healthcare 2026, 14(4), 464; https://doi.org/10.3390/healthcare14040464 - 12 Feb 2026
Viewed by 72
Abstract
Background: Emergency department (ED) overcrowding strains clinicians and potentially compromises urgent care quality. Unscheduled return visits (URVs), also known as readmissions, contribute to this cycle, motivating tools that identify high-risk patients at discharge. Methods: This retrospective study used ED electronic health records (EHRs) from Kaohsiung Veterans General Hospital from January 2018 to December 2022 (n = 184,653). The model integrates structured variables, such as vital signs, medication and laboratory counts, and ICD-10–based comorbidity measures, with unstructured physician notes. Key physiologic measurements were transformed into binary form using clinical reference intervals, and random under-sampling addressed class imbalance. A multimodal CNN was proposed and evaluated with an 8:2 train–test split and 10-fold Monte Carlo cross-validation. Results: The proposed model achieved a sensitivity of 0.717 (CI: [0.695, 0.738]), accuracy of 0.846 (CI: [0.842, 0.850]), and AUROC of 0.853. Binary transformation improved recall and AUROC relative to the original numeric representations. SHAP analysis showed that unstructured features dominated prediction, while structured variables added complementary value. In a small-scale pilot evaluation using the SHAP-enabled interface, participating physicians reported the system helped surface high-risk cohorts and reduced cognitive workload by consolidating relevant patient information for rapid cross-checking. Conclusions: An interpretable CNN-based clinical decision support system can predict ED revisit risk from multimodal EHR data and demonstrates practical usability in a real-world clinical setting, supporting targeted discharge planning and follow-up as a near-term approach to mitigate overcrowding. Full article

42 pages, 10041 KB  
Article
Probabilistic Prediction of Concrete Compressive Strength Using Copula Functions: A Novel Framework for Uncertainty Quantification
by Cheng Zhang, Senhao Cheng, Shanshan Tao, Shuai Du and Zhengjun Wang
Buildings 2026, 16(4), 754; https://doi.org/10.3390/buildings16040754 - 12 Feb 2026
Viewed by 121
Abstract
Traditional machine learning models for concrete compressive strength prediction provide only single-value estimates without quantifying the probability of meeting design requirements, leaving engineers unable to make risk-informed decisions. This study addresses this critical limitation by developing a novel probabilistic prediction framework that integrates explainable machine learning with Copula-based joint distribution modeling. Using a dataset of 1030 concrete samples with curing ages ranging from 1 to 365 days, we first established an XGBoost 2.1.4 prediction model achieving R2 = 0.9211 (RMSE = 4.51 MPa) on the test set. SHAP 0.49.1 (SHapley Additive exPlanations) analysis identified curing age (33.3%) and water–cement ratio (28.8%) as the dominant features, together accounting for 62.1% of predictive importance. These two controllable engineering parameters were then selected as core variables for probabilistic modeling. The key innovation lies in integrating Copula-based dependence modeling with explainable machine learning (XGBoost–SHAP) to quantify the compliance probability of concrete strength under specific mix designs and curing conditions, thereby supporting risk-informed quality control decisions. Through systematic comparison of five Copula families (Gaussian, Student t, Clayton, Gumbel, and Frank), we identified optimal dependence structures: Gaussian Copula (ρ = −0.54) for the water–cement ratio–strength relationship and Clayton Copula for the age–strength relationship, revealing asymmetric tail dependence patterns invisible to conventional correlation analysis. The three-dimensional Copula model enables engineers to estimate compliance probability—the likelihood of concrete achieving target strength under specific mix designs and curing conditions. 
We propose an illustrative three-tier decision rule for construction quality management based on the compliance probability P: P ≥ 0.95 (high-confidence approval), 0.80 ≤ P < 0.95 (warning zone requiring enhanced monitoring), and P < 0.80 (high risk suggesting corrective actions such as mix adjustment or extended curing), noting that these thresholds can be recalibrated to project-specific risk tolerance and local specifications. This framework supports a paradigm shift from reactive “mix-then-test” quality control to proactive “predict-then-decide” construction management, providing quantitative risk assessment tools previously unavailable in deterministic prediction approaches. Full article
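The three-tier rule stated in this abstract maps directly onto a small function. The thresholds 0.95 and 0.80 are taken from the abstract. The function name, the tier labels, and the choice to expose the thresholds as parameters (reflecting the note that they can be recalibrated) are assumptions made for illustration.

```python
def compliance_tier(p: float, approve_at: float = 0.95, warn_at: float = 0.80) -> str:
    """Map a compliance probability P to one of three decision tiers."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if not warn_at < approve_at:
        raise ValueError("warn_at must be below approve_at")
    if p >= approve_at:
        return "approve"      # high-confidence approval
    if p >= warn_at:
        return "warning"      # enhanced monitoring
    return "high risk"        # corrective action, e.g. mix adjustment


# Hypothetical compliance probabilities for four mix designs.
tiers = [compliance_tier(p) for p in (0.97, 0.95, 0.85, 0.60)]

# A stricter, project-specific calibration (hypothetical thresholds).
strict_tier = compliance_tier(0.95, approve_at=0.99, warn_at=0.90)
```

Both boundaries are inclusive on the upper tier, matching the abstract's "P ≥ 0.95" and "0.80 ≤ P < 0.95".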

21 pages, 3171 KB  
Article
Automated Fiber Placement Gap Width Prediction Using a Transformer-Based Deep Learning Approach
by Diogo Cardoso, António Ramos Silva and Nuno Correia
Processes 2026, 14(4), 609; https://doi.org/10.3390/pr14040609 - 10 Feb 2026
Viewed by 171
Abstract
Automated Fiber Placement (AFP) is a critical process in composite manufacturing, where precise fiber tow placement is essential for achieving high-quality and high-performance engineering components. However, deviations in process variables frequently lead to defects such as gaps and overlaps, which can compromise structural integrity. While various monitoring techniques exist, accurately predicting and understanding the formation of these defects from complex sensor data remains challenging. This work introduces a novel application of a Transformer-based deep learning architecture to enhance the estimation of gap widths in AFP. Leveraging a publicly available industrial AFP dataset, our methodology incorporates a customized positional encoding scheme to effectively integrate the critical spatial context of the tow layup process. The model’s predictive performance was evaluated, achieving a Mean Absolute Percentage Error (MAPE) of 1.04% and an R-squared (R2) value of 0.9143, demonstrating its capability for accurate gap width estimation. Furthermore, SHapley Additive exPlanations (SHAP) analysis was employed to assess the complex interplay between sources of manufacturing process variation. This study establishes the Transformer architecture as a promising and interpretable data-driven tool for AFP process monitoring. The results serve as a proof of concept for attention-based virtual metrology, offering a pathway towards deeper process understanding and defect mitigation. Full article
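The two reported metrics, MAPE and R², can be computed as in the sketch below, which uses their standard definitions. The gap widths are hypothetical and unrelated to the article's dataset.

```python
from typing import Sequence


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent; y_true must be nonzero."""
    errors = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(errors) / len(errors)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted gap widths (mm).
measured = [1.0, 2.0, 4.0, 5.0]
predicted = [1.1, 1.9, 4.2, 4.8]

gap_mape = mape(measured, predicted)
gap_r2 = r_squared(measured, predicted)
```

MAPE is relative to each true value, so it penalises a fixed error more on narrow gaps, while R² measures variance explained. That is why the two are typically reported together.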

18 pages, 8050 KB  
Article
Machine Learning-Based Analysis of Arsenic Migration from Soil to Highland Barley in High Geological Background Areas
by Jiahui Zuo, Chuangchuang Zhang, Xuefeng Liang, Yanming Cai, Ye Li, Yandi Hu and Yujie Zhao
Sustainability 2026, 18(4), 1782; https://doi.org/10.3390/su18041782 - 10 Feb 2026
Viewed by 95
Abstract
To investigate the effect of high-arsenic (As) soil on the absorption of As by highland barley, 135 pairs of soil–crop samples were collected in the main producing areas of highland barley in the middle reaches of the Yarlung Zangbo River. Eight soil variables, including pH, redox potential (Eh), soil organic matter (SOM), total arsenic (T-As), total iron (T-Fe), total manganese (T-Mn), chemically extractable As (KH2PO4-As), and bioavailable As determined by diffusive gradients in thin films (DGT-As), were measured, along with As concentrations in barley grains (HB-As). Machine learning approaches were employed to construct predictive models for HB-As accumulation, and feature influence mechanisms were interpreted using SHapley Additive exPlanations (SHAP) and Partial Dependence Plot (PDP) analyses. The results showed that: (1) among models constructed using the full feature set, the random forest (RF) model exhibited the best predictive performance for HB-As, with R2 values of 0.756 and 0.651 for the training and testing datasets, respectively; (2) SHAP analysis indicated that DGT-As had the greatest contribution to the model (30.5%), followed by T-As and T-Fe/Mn; and (3) significant interaction effects among soil variables jointly influenced HB-As accumulation. This study provides scientific support for agricultural product safety, soil security, and sustainable land use in plateau agroecosystems. Full article
(This article belongs to the Section Soil Conservation and Sustainability)
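Percentage contributions such as the 30.5% reported for DGT-As are often obtained by normalising each feature's mean absolute SHAP value. The abstract does not state how its percentages were computed, so that convention is an assumption in this sketch. The SHAP matrix is hypothetical; only the feature names come from the abstract.

```python
from typing import Dict, List, Sequence


def shap_contribution_shares(
    shap_values: Sequence[Sequence[float]], feature_names: List[str]
) -> Dict[str, float]:
    """Normalise mean |SHAP| per feature so the shares sum to 100 percent."""
    n_samples = len(shap_values)
    mean_abs = [
        sum(abs(row[j]) for row in shap_values) / n_samples
        for j in range(len(feature_names))
    ]
    total = sum(mean_abs)
    return {name: 100.0 * m / total for name, m in zip(feature_names, mean_abs)}


# Hypothetical per-sample SHAP values (rows = samples, columns = features).
features = ["DGT-As", "T-As", "T-Fe"]
shap_matrix = [
    [0.6, -0.2, 0.2],
    [-0.4, 0.4, -0.2],
    [0.5, -0.3, 0.2],
]

shares = shap_contribution_shares(shap_matrix, features)
```

Taking absolute values before averaging matters: a feature that pushes predictions up for some samples and down for others would otherwise appear unimportant.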

21 pages, 7688 KB  
Article
Owner Social Determinants of Health Associated with Exercise Patterns in Golden Retrievers with and Without Cancer
by Elpida Artemiou, Andrea Paredes and Sarah Hooper
Vet. Sci. 2026, 13(2), 172; https://doi.org/10.3390/vetsci13020172 - 9 Feb 2026
Viewed by 220
Abstract
The World Health Organization (WHO) recognizes the impact of social determinants of health (SDHs) on human health and wellbeing factors. Limited research has explored how SDHs, such as the social, economic, and environmental conditions in which individuals are born, live, work, and grow older, shape exercise behaviors and chronic health conditions such as cancer in dogs. This study links SDHs identified through owner-provided continental United States zip codes with levels of physical activity. We hypothesized that owners with higher incomes, education, and access to healthcare services positively influence their dog’s health outcomes, specifically owner-reported physical activity. Our study utilized all owner-provided data, collected between 2012 and 2022, from the first seven years of owner surveys for the 3044 Golden Retrievers enrolled in the Morris Animal Foundation Lifetime Study. Sixteen GPBoost Poisson models were built to assess the impact of twenty-three social determinants in Golden Retrievers with and without a diagnosis of cancer. SHAP values were calculated for each dependent variable. Consistently, economic factors, education, ethnicity, and health care access were identified as important variables. Furthermore, our findings suggest that complex interactions between ethnicities and other SDHs should be explored in future studies. Full article

19 pages, 1553 KB  
Article
Enhancing Student Retention in Higher Education Institutions (HEIs): Machine Learning Approach
by Emeka Cajetan Umendu, Mustansar Ghanzanfar, Aaron Kans and Md Atiqur Rahman Ahad
Electronics 2026, 15(4), 734; https://doi.org/10.3390/electronics15040734 - 9 Feb 2026
Abstract
Student dropout remains a critical challenge for higher education institutions, with significant implications for resource allocation, academic planning, and institutional sustainability. This study applies machine learning techniques to predict student non-continuation and attrition in order to support data-driven retention strategies in higher education. By framing the problem as a multi-class classification task (Dropout, Enrolled, Graduate), the proposed framework enables early and differentiated intervention planning. Using a publicly available higher education student dataset (4424 records, 34 features, multi-class outcome), a structured analytical pipeline was implemented, incorporating Winsorisation for outlier mitigation, SMOTE for class imbalance handling, and targeted feature engineering. Model performance was assessed using a 5-fold nested cross-validation framework. Four classifiers (Extra Trees, Random Forest, Gradient Boosting, and Logistic Regression) were trained on an optimised subset of 28 features. Among these, the Extra Trees model achieved the strongest performance, attaining a mean AUC of 0.96 (±0.0053) and an accuracy of 87.4% (±0.012). Model interpretability was enhanced through SHAP analysis, which identified cumulative approved academic units and tuition fee payment status as the most influential predictors of student outcomes. The findings underscore the value of early predictive analytics for informing proactive institutional interventions, particularly in academic monitoring and financial support, to strengthen student retention frameworks.
(This article belongs to the Special Issue AI-Driven Data Analytics and Mining)

24 pages, 2710 KB  
Article
Improving PDSI Z-Index Prediction with Ensemble Learning: A Case Study from the Troy Region of Türkiye
by Umut Mucan and Ebru Elif Arslantaş Civelekoğlu
Sustainability 2026, 18(4), 1752; https://doi.org/10.3390/su18041752 - 9 Feb 2026
Abstract
Climate change is expected to intensify droughts, thereby increasing the need for reliable predictive tools. In this study, one-month-ahead forecasts of the Palmer Z-Index were generated using long-term monthly data from two meteorological stations (17112 Çanakkale and 18084 Biga) located in the Troy region. The input features included current and lagged meteorological variables, multi-month rolling statistics, and seasonal encodings. Eight machine learning models, including linear and ensemble tree-based approaches, were evaluated using time series cross-validation. Drought events were defined based on the Palmer Z-Index and standardized drought indicators, and model performance was assessed using commonly adopted accuracy and detection measures. Shapley Additive Explanations (SHAP) analysis was used to quantify the feature contributions. Gradient Boosting achieved the highest predictive accuracy at the main station, while XGBoost and CatBoost also performed strongly. High accuracy was maintained at the second station, demonstrating the spatial robustness of the model. The machine learning-predicted Palmer Z-Index values showed strong agreement with observed hydrological drought conditions; severe drought events were detected with high confidence and low false alarm rates. SHAP results identified precipitation inputs as the dominant driver of Z-Index variability. Overall, the findings suggest that ML-based models can provide timely and interpretable forecasts for operational drought early warning systems. Nonetheless, further research is needed to test the generalizability of these findings under different climate regimes and data conditions.
(This article belongs to the Section Sustainable Water Management)

20 pages, 2488 KB  
Article
Network Instability as a Signal of Systemic Financial Stress: An Explainable Machine-Learning Framework
by Livia Valentina Moretti, Enrico Barbierato and Alice Gatti
Future Internet 2026, 18(2), 91; https://doi.org/10.3390/fi18020091 - 9 Feb 2026
Abstract
This paper develops a framework for monitoring and forecasting episodes of systemic financial stress using a combination of market information, macro-financial indicators, and measures derived from time-varying correlation networks, embedded in a sequential machine-learning setting. The contribution is not tied to a single modelling innovation, but rather to the way these ingredients are brought together under an evaluation protocol designed to mimic real-time supervisory use, and to an interpretability layer that makes the resulting predictions easier to inspect. Monthly data covering the period from 2006 to 2025 are used to construct evolving correlation structures and summary indicators of market co-movement. These features are combined with standard predictors and fed into logistic regression, random forest, and gradient boosting models, all estimated in expanding windows and assessed strictly on future observations. Predictive accuracy remains limited, which is consistent with the difficulty of anticipating stress regimes several months ahead at monthly frequency, although gradient boosting attains the highest average AUC across evaluation folds and displays noticeable variation over time. Inspection of SHAP values points to instability in correlation networks, volatility conditions, and short-horizon return behaviour as recurring drivers of the predicted stress probabilities, suggesting that the models draw on information that goes beyond individual market series. Taken together, the results indicate that recurrent statistical regularities and changes in market structure can be exploited for monitoring purposes when models are trained and tested in a sequential fashion. The overall design is intended to be usable in practice and to support supervisory analysis, while remaining transparent enough to allow scrutiny of the signals driving the forecasts.
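The expanding-window protocol mentioned in the abstract above (models estimated on a growing history and assessed strictly on future observations) can be illustrated with a minimal sketch. This is not the authors' code: the function name `expanding_window_splits` and the parameters `min_train`, `horizon`, and `step` are hypothetical choices made for illustration, and the abstract does not state the window sizes actually used.

```python
from typing import Iterator, List, Tuple


def expanding_window_splits(
    n_obs: int, min_train: int, horizon: int = 1, step: int = 1
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs for an expanding-window evaluation.

    The training set always starts at observation 0 and grows by `step`
    each fold; the test block is the `horizon` observations that follow
    it, so no test observation ever precedes a training observation.
    All parameter names here are illustrative assumptions.
    """
    train_end = min_train
    while train_end + horizon <= n_obs:
        train_idx = list(range(0, train_end))
        test_idx = list(range(train_end, train_end + horizon))
        yield train_idx, test_idx
        train_end += step


# Toy example: 10 monthly observations, at least 6 for training,
# 2-month test blocks, window grown by 2 months per fold.
splits = list(expanding_window_splits(n_obs=10, min_train=6, horizon=2, step=2))
for train_idx, test_idx in splits:
    print(f"train 0..{train_idx[-1]}  ->  test {test_idx}")
```

In practice each fold would fit a model on the `train_idx` rows and score it on the `test_idx` rows; averaging a metric such as AUC across folds gives the kind of sequential, out-of-sample assessment the abstract describes.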