MDPI - Publisher of Open Access Journals

25 pages, 3968 KB

Open Access Article

Explainable Data-Driven Approach for Smart Crop Yield Prediction in Sub-Saharan Africa: Performance and Interpretability Analysis

by Damilola D. Olatinwo, Herman C. Myburgh, Allan De Freitas and Adnan Abu-Mahfouz

Agriculture 2026, 16(8), 826; https://doi.org/10.3390/agriculture16080826 - 8 Apr 2026

Viewed by 98

Abstract

The increasing demand for innovative strategies in sustainable food production, driven by rapid global population growth, particularly in sub-Saharan Africa (SSA), necessitates urgent attention to agricultural resilience. Recent technological advancements have enhanced crop productivity, post-harvest preservation, and environmentally sustainable farming practices. However, three critical bottlenecks remain: (i) the lack of accurate, maize-specific yield prediction methods tailored to SSA; (ii) limited multimodal modeling approaches capable of capturing complex, nonlinear interactions among heterogeneous data sources; and (iii) a lack of explainability mechanisms, which renders high-performing models “black boxes” and hinders stakeholder trust. To address these gaps, this study presents an explainable machine learning framework for smart maize yield prediction. We integrate multimodal SSA-specific soil, crop, and weather data to capture the multi-dimensional drivers of maize productivity. Six diverse algorithms were benchmarked to evaluate predictive performance: extreme gradient boosting (XGBoost), light gradient boosting machine (LGBM), categorical boosting (CatBoost), support vector machine (SVM), random forest (RF), and an artificial neural network (ANN) combined with k-nearest neighbors (kNN). To ensure robustness against spatial heterogeneity, we employed a Leave-One-Plot-Out (LOPO) cross-validation strategy. Empirical results on unseen test data identify CatBoost as the best-performing model, achieving a coefficient of determination of R² ≈ 76%, demonstrating its ability to capture complex, nonlinear relationships in agricultural data. To enhance transparency and stakeholder trust, we integrated Local Interpretable Model-agnostic Explanations (LIME), providing plot-level insights into the physiological and environmental drivers of maize yield. Together, these contributions establish a scalable and interpretable modeling framework capable of supporting data-driven agricultural decision-making in SSA.

(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
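
The Leave-One-Plot-Out strategy mentioned in the abstract can be illustrated with a minimal sketch. It assumes each sample carries a plot identifier and that each fold holds out every sample from one plot; the function name `leave_one_plot_out` and the plot labels are hypothetical and are not taken from the article. scikit-learn's `LeaveOneGroupOut` splitter offers comparable grouped splitting.

```python
from collections import defaultdict


def leave_one_plot_out(plot_ids):
    """Yield (held_out_plot, train_idx, test_idx), holding out one plot per fold."""
    by_plot = defaultdict(list)
    for idx, plot in enumerate(plot_ids):
        by_plot[plot].append(idx)
    for plot in sorted(by_plot):
        test_idx = by_plot[plot]
        train_idx = [i for i, p in enumerate(plot_ids) if p != plot]
        yield plot, train_idx, test_idx


# Hypothetical plot labels: six samples drawn from three plots.
plots = ["A", "A", "B", "B", "B", "C"]
folds = list(leave_one_plot_out(plots))
for plot, train_idx, test_idx in folds:
    print(plot, train_idx, test_idx)
```

Because no plot contributes samples to both the training and the test side of a fold, the score reflects performance on locations the model has not seen.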

27 pages, 1404 KB

Open Access Article

Drivers and Barriers to the Use of Generative Artificial Intelligence in the Spanish Active Population: Insights from Artificial Neural Network Modeling and Shapley Additive Explanations

by Teresa Torres-Coronas, Jorge de Andrés-Sánchez, Orlando Lima Rua and Álvaro Carrasco-Aguilar

Computers 2026, 15(4), 215; https://doi.org/10.3390/computers15040215 - 1 Apr 2026

Viewed by 329

Abstract

This study analyzes the determinants of generative artificial intelligence (GAI) use intensity among the Spanish working population, as well as the possible existence of gender gaps in its adoption. To this end, a conceptual model is proposed that incorporates perceived economic and productive usefulness (PEU); perceived social usefulness (PSU); three dimensions of the Technology Readiness Index, namely technological optimism (OPTI), innovativeness (INNOV), and insecurity (INSEC); and three sociodemographic variables: entrepreneurial status, gender, and generational cohort. The model is implemented using artificial neural networks (ANNs) endowed with explanatory capability through Shapley Additive Explanations (SHAP). The application of SHAP enables the assessment of both the global and local importance of the explanatory variables, as well as the potential existence of gender biases in their contribution to GAI use. The results indicate that the most relevant variables are PEU, generational cohort, and INNOV. Although gender does not rank among the most important variables in terms of global importance, women exhibit lower levels of GAI use, and gender-related differences are also observed in the contribution of several explanatory variables. In particular, substantive effect sizes are observed for PSU, OPTI, INSEC, entrepreneurial status, and membership in Generation Y. By contrast, differences associated with especially relevant variables such as PEU and INNOV, as well as membership in Generation Z, do not exhibit meaningful effect sizes.

(This article belongs to the Special Issue Machine Learning: Innovation, Implementation, and Impact)
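
As background on what SHAP attributes to each variable, the sketch below computes exact Shapley values for one prediction by enumerating feature subsets. It assumes that "absent" features are replaced by a fixed baseline value, which is one common convention; the toy model, inputs, and function names are hypothetical and unrelated to the article's ANN. Practical SHAP implementations approximate this sum, since exact enumeration grows exponentially with the number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x; absent features take baseline values."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    # Hypothetical model: one additive term and one interaction term.
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction at `x` and the prediction at the baseline, and the interaction term is split evenly between the two features that share it.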

16 pages, 2243 KB

Open Access Article

A Feature Selection Method for Yarn Quality Prediction Based on SHAP Interpretation

by Chunxue Wei, Tianxiang Liu, Baowei Zhang and Xiao Wang

Algorithms 2026, 19(4), 266; https://doi.org/10.3390/a19040266 - 1 Apr 2026

Viewed by 174

Abstract

This study developed an interpretable framework, RFE-SHAP, designed for yarn quality prediction. It integrates Recursive Feature Elimination (RFE) with SHapley Additive exPlanations (SHAP) theory to refine feature selection and mitigate data redundancy in small-sample environments. With Support Vector Regression (SVR) serving as the foundational evaluator, the RFE process iteratively identifies critical variables. Distinct from conventional methods, our approach employs SHAP values to quantify both the primary effects of individual features and the complex synergistic interactions among variables. This yields a transparent and intuitive strategy for identifying optimal feature subsets for two key quality indicators: yarn strength and hairiness H-value. To assess performance, a comparative analysis was performed between the traditional SVR-RFE method and the proposed RFE-SHAP method, using both as inputs for a Back-Propagation Artificial Neural Network (BP-ANN). The experimental results based on authentic production data demonstrate that the RFE-SHAP-BP model significantly enhances prediction reliability. Notably, compared to the baseline SVR-RFE-BP model, the proposed approach reduced the Mean Absolute Percentage Error (MAPE) by 0.73 and 1.01 percentage points for yarn strength and hairiness H-value, respectively. The final MAPE values reached 2.10% and 2.78%, confirming the model’s superior precision. These findings indicate that the RFE-SHAP method is highly feasible and effectively elevates prediction performance in data-limited industrial scenarios.
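
To make the reported error figures concrete, the sketch below defines MAPE and works out the baseline errors implied by the abstract. It assumes the two final MAPE values are listed in the same order as the two reductions (yarn strength first, then hairiness H-value); the sample inputs to `mape` are hypothetical.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Figures quoted in the abstract (percent and percentage points).
final_mape = {"yarn strength": 2.10, "hairiness H-value": 2.78}
reduction_pp = {"yarn strength": 0.73, "hairiness H-value": 1.01}

# Implied baseline MAPE and relative reduction, derived by arithmetic only.
baseline_mape = {k: final_mape[k] + reduction_pp[k] for k in final_mape}
relative_reduction = {k: 100.0 * reduction_pp[k] / baseline_mape[k] for k in final_mape}

for k in final_mape:
    print(f"{k}: baseline {baseline_mape[k]:.2f}%, relative reduction {relative_reduction[k]:.1f}%")

# Hypothetical usage of the metric itself.
example = mape([100.0, 200.0], [90.0, 220.0])
```

Under that ordering assumption, the implied baseline errors are about 2.83% and 3.79%, so each reduction is roughly a quarter of the baseline error.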

24 pages, 5590 KB

Open Access Article

Knowledge-Guided Interpretable Machine Learning Framework for Ladle Furnace Desulphurisation Control

by Didi Zhao, Yuan Gu, Zemin Chen, Yiliang Liu, Baiqiao Chen and Jingyuan Li

Processes 2026, 14(7), 1118; https://doi.org/10.3390/pr14071118 - 30 Mar 2026

Viewed by 330

Abstract

A hybrid modelling framework is proposed to predict endpoint sulphur content in the ladle furnace (LF) refining process by embedding metallurgical expert knowledge into interpretable machine learning (ML). Industrial process data were extracted from the Level-2 (L2) system of a steel plant, and a desulphurisation dataset comprising 5169 heats with 29 process variables was constructed using a knowledge-guided time window from the joint satisfaction of refining conditions to the final argon-blowing stage. After data cleaning, normalisation and correlation-based feature selection, four algorithms, namely Random Forest (RF), Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM) and Artificial Neural Network (ANN), were trained and compared on a representative cluster of steel grades identified by K-means. The ANN model achieved a coefficient of determination (R²) of 0.7752, a root mean square error (RMSE) of 0.0027 wt%, a mean absolute error (MAE) of 0.0017 wt% and a hit rate (HR, ±0.0025 wt% for S) of 76.40% on the test set. SHapley Additive exPlanations (SHAP) indicate that limestone addition, slag basicity, argon flow rate, refining time and initial sulphur content dominantly govern sulphur removal. The expert-knowledge-guided, interpretable framework provides quantitative support for specification-conforming endpoint sulphur control while mitigating over-desulphurisation and reagent consumption.

(This article belongs to the Special Issue Artificial Intelligence-Based Analytics for Data-Driven Decision-Making in Industrial Process Engineering)
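
The hit rate quoted in the abstract can be read as the share of heats whose predicted endpoint sulphur falls within ±0.0025 wt% of the measured value. The sketch below is one plausible way to compute it, assuming an inclusive tolerance band; the sample values are hypothetical and are not plant data.

```python
def hit_rate(actual, predicted, tolerance=0.0025):
    """Percentage of predictions within +/- tolerance (wt%) of the measured value."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= tolerance)
    return 100.0 * hits / len(actual)


# Hypothetical endpoint sulphur contents in wt%.
measured = [0.0050, 0.0080, 0.0120, 0.0030]
predicted = [0.0060, 0.0110, 0.0100, 0.0031]

hr = hit_rate(measured, predicted)
print(f"hit rate: {hr:.2f}%")
```

Unlike RMSE or MAE, this metric treats every prediction inside the band as equally good, which matches how an operator judges whether a heat meets its specification.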

28 pages, 7001 KB

Open Access Article

Thermal Intelligence for Hydro-Generators: Data-Driven Prediction of Stator Winding Temperature Under Real Operating Conditions

by Zangpo, Munira Batool and Imtiaz Madni

Energies 2026, 19(7), 1671; https://doi.org/10.3390/en19071671 - 28 Mar 2026

Viewed by 386

Abstract

Hydropower remains one of the primary sources of power generation. It can be operated as either a base-load or peak-load plant owing to its rapid, easy start-up and shutdown capability. However, power plants, old or new, need to be operated and maintained optimally to meet energy demand and maximise economic returns. Older plants without digital controls such as a Supervisory Control and Data Acquisition (SCADA) system cannot leverage evolving technologies such as big data and Artificial Intelligence (AI), whereas newer plants, or plants that already have some form of data acquisition system, can use these platforms for efficient operation, monitoring and fault diagnosis. Thus, an Artificial Neural Network (ANN), a machine learning (ML) algorithm, was chosen for this case study to predict the generator’s operational stator temperature from six parameters that could potentially affect it. Real data from the 336 MW Chhukha Hydropower Plant (CHP) in Bhutan were used to train the ANN. The ANN temperature prediction in MATLAB® yielded an R² (coefficient of determination) of 96.8%, a strong result that could be improved further through optimisation and tuning methods and with increased data volume and complexity. The ANN was validated against other regression models and was found to outperform them. This demonstrates its capability to predict and detect generator temperature faults before failures, thereby enhancing hydropower operation and maintenance (O&M) efficiency. The model was also interpreted using Shapley Additive ExPlanations (SHAP).

(This article belongs to the Section F: Electrical Engineering)

27 pages, 3936 KB

Open Access Article

Productivity Prediction in Tight Oil Reservoirs: A Stacking Ensemble Approach with Hybrid Feature Selection

by Zhengyang Kang, Yong Zheng, Tianyang Zhang, Haoyu Chen, Xiaoyan Zhou, Quanyu Cai and Yiran Sun

Processes 2026, 14(7), 1089; https://doi.org/10.3390/pr14071089 - 27 Mar 2026

Viewed by 294

Abstract

To address the challenges of low accuracy and complex influencing factors in predicting horizontal well fracturing productivity during the development of unconventional oil and gas resources such as tight oil, this paper proposes a productivity prediction framework based on an improved feature selection method and an ensemble learning model. This study employs a fusion analysis using the entropy weight method to combine Pearson correlation analysis and improved gray relational analysis (IGRA) for feature selection. Thirteen machine learning models were tested with six distinct parameter combinations to construct a Stacking-based ensemble learning model, with base models including Random Forest (RF), Ridge Regression (RR), and Artificial Neural Network (ANN). Particle Swarm Optimization (PSO) was employed to optimize hyperparameters, followed by interpretability analysis using SHapley Additive exPlanations (SHAP). The results indicate that the model with fused weights demonstrated optimal performance. The Stacking model achieved significantly improved accuracy after PSO optimization, with the coefficient of determination increasing by 4.9%, outperforming all comparison models. The analysis also yields engineering guidance: under current geological conditions, sand ratio and displacement fluid volume require fine-tuning to prevent over-treatment, and fracturing design should implement differentiated strategies based on the target sand body thickness. This study not only delivers a high-precision production prediction tool but also offers decision support for efficient unconventional oil and gas field development through its exceptional interpretability.

(This article belongs to the Section Petroleum and Low-Carbon Energy Process Engineering)

26 pages, 1536 KB

Open Access Article

GraphGPT-Patent: Time-Aware Graph Foundation Modeling on Semantic Similarity Document Graphs for Grant-Time Economic Impact Prediction

by Tianhui Fang, Junru Si, Chi Ye and Hailong Shi

Appl. Sci. 2026, 16(6), 2737; https://doi.org/10.3390/app16062737 - 12 Mar 2026

Viewed by 289

Abstract

Predicting the future impact of technical economic documents at release time is challenging due to delayed supervision signals, long-tailed label distributions, and time- and domain-dependent shifts in language and topics. Moreover, similarity graphs derived from text embeddings can be noisy due to boilerplate and evolve under temporal drift, making robustness and leakage-free evaluation essential. We formulate grant-time patent impact prediction as a node classification and within-domain ranking problem on a large-scale semantic similarity document graph built from patent text embeddings, avoiding any future citation leakage. The document graph is constructed via ANN Top-K retrieval and similarity thresholding, enabling scalable and reproducible sparsification on hundreds of thousands of nodes. We propose GraphGPT-Patent, which adapts a reversible graph-to-sequence foundation backbone to local subgraphs extracted from the similarity network. The model incorporates time- and domain-conditioned edge reliability to suppress drift-induced and template-driven pseudo-similarity, and optimizes a joint objective coupling high-impact classification with ranking consistency within comparable groups. Experiments on USPTO granted patents (2000–2022) across three high-volume CPC domains and three evaluation horizons show consistent gains over text-only and GNN baselines, achieving up to 0.94 recall for the positive class and improved macro-average recall across nine settings. Temporal shift analyses further quantify the effect of training-data freshness, while explanation subgraphs provide auditable structural evidence of model decisions. The proposed framework offers an effective graph-based learning pipeline for scalable impact prediction and downstream triage under strict information constraints.

(This article belongs to the Special Issue Graph-Based Methods in Artificial Intelligence and Machine Learning, 2nd Edition)
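
The graph construction step (Top-K retrieval followed by similarity thresholding) can be sketched as below. In this abstract, "ANN" most plausibly refers to approximate nearest neighbor search rather than an artificial neural network; the sketch swaps in an exact brute-force search, which is only practical for small collections. Cosine similarity, undirected edges, and all names and values are assumptions for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def build_similarity_graph(embeddings, k=2, threshold=0.8):
    """Keep each node's top-k neighbours, then drop edges below the threshold."""
    edges = set()
    for i, u in enumerate(embeddings):
        scored = [(cosine(u, v), j) for j, v in enumerate(embeddings) if j != i]
        scored.sort(reverse=True)
        for sim, j in scored[:k]:
            if sim >= threshold:
                edges.add((min(i, j), max(i, j)))  # store as an undirected edge
    return sorted(edges)


# Hypothetical 2-D document embeddings forming two tight pairs.
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
edges = build_similarity_graph(docs, k=1, threshold=0.9)
print(edges)
```

The Top-K cap bounds the degree of each node, while the threshold removes weak links that would otherwise be kept only because a node has no close neighbours.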

20 pages, 3087 KB

Open Access Article

Classification and Prediction of Average Current in High-Power Semiconductor Devices: A Machine Learning Framework

by Fawad Ahmad, Luis Vaccaro, Armel Asongu Nkembi, Mario Marchesoni and Federico Portesine

Electronics 2026, 15(6), 1149; https://doi.org/10.3390/electronics15061149 - 10 Mar 2026

Viewed by 245

Abstract

The applications of machine learning (ML) in power electronics are expanding with time, providing effective tools that reduce design complexity and enhance predictive accuracy. In high-power semiconductor devices, such as thyristors and high-power diodes, electrical parameters may directly influence electro-thermal behavior, reliability, and overall device performance. Consequently, accurate prediction and classification of average current are critical to ensure optimal device selection, optimize design, and assess performance. In this article, a comprehensive dataset based on data from industrial thyristors capturing electrical and structural parameters relevant to current handling capability is utilized to classify and predict the average current of devices. Additionally, Shapley additive explanation (SHAP) analysis has been performed, highlighting the importance of crucial parameters and identifying the impact of each parameter on model output. Moreover, several ML models, including artificial neural networks (ANNs), support vector machines (SVMs), ensembles, and Gaussian process regression (GPR) are implemented and then compared to assess their performance. The proposed methodology provides manufacturers and designers with data-driven design tools that enhance reliability assessments and facilitate optimized device selection for high-power applications.

(This article belongs to the Section Semiconductor Devices)

25 pages, 7590 KB

Open Access Article

Rock Brittleness Prediction with BDEGTO-Optimized XGBoost

by Yajuan Wu, Tao Wen, Ruozhao Wang, Yunpeng Yang and Xiaohong Xu

Processes 2026, 14(5), 878; https://doi.org/10.3390/pr14050878 - 9 Mar 2026

Viewed by 246

Abstract

Precise assessment of rock brittleness is a prerequisite for effective wellbore integrity and successful reservoir stimulation in drilling programs. To achieve precise prediction of the rock brittleness index (BI), this study proposes an improved variant of the artificial gorilla troops optimizer (GTO), called the Bernoulli Differential Evolution Gorilla Troops Optimizer (BDEGTO). In BDEGTO, Bernoulli mapping is introduced during population initialization, and differential evolution is embedded after the exploration stage of the GTO. These modifications address the early-stage optimization weaknesses and the susceptibility to local optima that are commonly encountered in the traditional GTO. To evaluate the performance of BDEGTO, comparisons are made with other optimization algorithms based on 91 datasets from 32 rock types. The results demonstrate the significant advantages of BDEGTO over the other algorithms. Furthermore, BDEGTO is applied to the optimization of Least Squares Boosting (LSB), Extreme Gradient Boosting (XGB), and Light Gradient Boosting Machine (LGBM). These models are compared with Support Vector Regression (SVR), Artificial Neural Network (ANN), and Convolutional Neural Network (CNN) algorithms for predicting rock brittleness from input parameters such as P-wave velocity (V_p), point load index (Is₅₀), and unit weight (UW). The findings indicate that BDEGTO-XGB achieves the best prediction performance for BI. Additionally, SHapley Additive exPlanations (SHAP) analysis shows that, among the three input parameters, Is₅₀ has the most significant influence. These results provide valuable guidance for the brittleness assessment of similar rocks.

(This article belongs to the Section Petroleum and Low-Carbon Energy Process Engineering)
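
Chaotic initialization with a Bernoulli map can be sketched as follows. The piecewise form used here (with parameter `lam`) is one formulation commonly seen in the metaheuristics literature and is an assumption; the abstract does not give the exact map, seed, or parameter values, so `lam`, `x0`, and the search bounds below are hypothetical.

```python
def bernoulli_sequence(x0, length, lam=0.4):
    """Iterate an assumed Bernoulli shift map on (0, 1) and collect the values."""
    values, x = [], x0
    for _ in range(length):
        if x <= 1.0 - lam:
            x = x / (1.0 - lam)
        else:
            x = (x - (1.0 - lam)) / lam
        values.append(x)
    return values


def init_population(pop_size, bounds, x0=0.37, lam=0.4):
    """Scale the chaotic sequence into the search bounds, one row per individual."""
    dim = len(bounds)
    chaos = bernoulli_sequence(x0, pop_size * dim, lam)
    population = []
    for i in range(pop_size):
        row = chaos[i * dim:(i + 1) * dim]
        population.append([lo + c * (hi - lo) for c, (lo, hi) in zip(row, bounds)])
    return population


# Hypothetical search space with two hyperparameters.
bounds = [(0.01, 0.3), (2.0, 10.0)]
population = init_population(pop_size=5, bounds=bounds)
seq = bernoulli_sequence(0.37, 2)
for individual in population:
    print(individual)
```

The intent of such maps is to spread the initial candidates over the search space more evenly than independent uniform draws tend to do for small populations.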

33 pages, 6006 KB

Open Access Article

An Experimental and Modeling Study on the Interaction of Cements with Varying C₃A Ratios and Different Water-Reducing Admixtures Using the op-ANN and Various Machine Learning Methods

by Veysel Kobya, Hasan Tahsin Öztürk, Kemal Karakuzu, Ali Mardani and Naz Mardani

Polymers 2026, 18(5), 656; https://doi.org/10.3390/polym18050656 - 7 Mar 2026

Cited by 1 | Viewed by 420

Abstract

This study investigates the interaction between polycarboxylate-based water-reducing admixtures (WRAs) and various types of CEM I 42.5R Portland cements, focusing on optimizing input parameters in cementitious systems. Despite the widespread use of WRAs to enhance concrete’s workability, strength, and durability, their compatibility with cement remains a critical challenge, often leading to performance issues such as low initial flow, bleeding, and rapid slump loss. This research addresses two significant gaps in the literature: the unexplored use of input parameter reduction in cementitious systems and the application of novel metaheuristic algorithms in optimizing these systems. In this study, 25 WRAs were first synthesized to enrich the inputs of machine learning (ML) models. Then, a dataset of 750 entries was generated, and advanced prediction models were developed. To ensure scientific rigor and eliminate data leakage, a triple-split dataset strategy (Training–Validation–Test) and 5-fold cross-validation were implemented. Among the machine learning techniques analyzed, the Optimized Artificial Neural Network (opANN) architecture demonstrated the highest prediction performance on the isolated test dataset. In the opANN process, 10 different metaheuristics were tested to evaluate their effectiveness in hyperparameter optimization. The Kepler Optimization Algorithm (KOA) was identified as the best-performing metaheuristic for ANN hyperparameter optimization. Furthermore, Shapley Additive Explanations (SHAP) analysis was used to bridge the gap between empirical observations and algorithmic predictions, quantitatively corroborating the rheological roles of phosphate and sulfonate groups. The results offer new insights into WRA–cement compatibility and present advanced, interpretable modeling approaches that enhance predictive accuracy, contributing to more reliable and sustainable concrete practices.

(This article belongs to the Special Issue Application of Polymers in Cementitious Materials)

34 pages, 5939 KB

Open Access Article

Explainable Machine Learning for Volatile Fatty Acid Soft-Sensing in Anaerobic Digestion: A Pilot Feasibility Study

by Bibars Amangeldy, Assiya Boltaboyeva, Nurdaulet Tasmurzayev, Zhanel Baigarayeva, Baglan Imanbek, Aliya Jemal Getahun, Dinara Turmakhanbet, Moldir Kuatova and Waldemar Wojcik

Algorithms 2026, 19(3), 183; https://doi.org/10.3390/a19030183 - 1 Mar 2026

Viewed by 471

Abstract

Sustainable energy systems such as anaerobic digestion (AD) bioreactors exhibit complex nonlinear dynamics that complicate the monitoring of key stability indicators using conventional laboratory-based methods. As a preliminary investigation, this pilot study explores the feasibility of using machine learning-based soft sensing to estimate Total Volatile Fatty Acids (TVFA(M)) from routinely measured physicochemical parameters. Using a short-term laboratory dataset obtained from controlled CO₂ biomethanisation experiments, several regression models were benchmarked, including an attention-based deep learning architecture (TabNet), multi-architecture artificial neural networks (ANNs), gradient-boosting ensembles (CatBoost, XGBoost, LightGBM), and classical kernel-based approaches. Model performance was evaluated under a cross-validated framework to assess predictive capability and consistency across folds within the limited experimental scope. Among the tested models, TabNet achieved highly competitive performance, yielding an R² of 0.8551, an RMSE of 0.0090, and an MAE of 0.0067. To support model transparency and interpretability, Explainable Artificial Intelligence (XAI) techniques based on SHapley Additive exPlanations (SHAP) were applied, identifying pCO₂ as the dominant contributor to TVFA(M) predictions within the studied operational range. Although restricted to controlled laboratory conditions and a short observation period, this pilot study demonstrates the potential of explainable machine learning models as soft sensors for TVFA(M) estimation and provides a methodological benchmark for future validation using larger and more diverse datasets.

(This article belongs to the Special Issue Machine Learning and Artificial Intelligence in Engineering Applications: 2nd Edition)

20 pages, 2566 KB

Open Access Article

Machine Learning-Based Prediction of Long-Term Mortality in STEMI Patients Using Clinical, Laboratory, and Inflammatory–Metabolic Indices

by Gökhan Keskin, Abdulkadir Çakmak and Mehmet Uğur Çalışkan

J. Clin. Med. 2026, 15(5), 1800; https://doi.org/10.3390/jcm15051800 - 27 Feb 2026

Viewed by 340

Abstract

Background: This study aims to compare the performance of machine learning (ML) models developed to predict long-term mortality risk in patients with ST-segment elevation myocardial infarction (STEMI) undergoing primary percutaneous coronary intervention (pPCI) and to investigate the prognostic value of novel inflammatory–metabolic indices. Methods: In this retrospective study, 329 consecutive STEMI patients who underwent pPCI (292 survivors, 37 deaths) were included. Five ML algorithms, namely Logistic Regression (LR), Random Forest (RF), Extreme Gradient Boosting (XGBoost), Support Vector Machines (SVM), and Artificial Neural Networks (ANN), were developed for mortality prediction. Model performance was evaluated using accuracy, sensitivity, specificity, and the area under the receiver operating characteristic (ROC) curve (AUC). SHAP (Shapley Additive exPlanations) analysis was used to interpret model decision mechanisms. Results: The mortality group had significantly higher door-to-balloon time (DTBT), Systemic Inflammatory Response Index (SIRI), and pan-immune-inflammation value (PIV), whereas body mass index (BMI), Prognostic Nutritional Index (PNI), and Advanced Lung Cancer Inflammation Index (ALI) values were significantly lower (p < 0.001). Among the ML models, the XGBoost algorithm achieved the best performance, with 98.99% accuracy, a ROC-AUC of 0.999, and 100% sensitivity, correctly identifying all mortality cases. SHAP analysis identified DTBT, albumin level, and ALI score as the strongest predictors of mortality, in that order. Conclusions: The XGBoost algorithm provides high accuracy and reliability for predicting long-term mortality in STEMI patients. Beyond DTBT, integrating novel indices, especially ALI and TyG, into ML models may serve as a powerful clinical tool for early identification of high-risk patients and improved risk stratification.

(This article belongs to the Special Issue New Perspectives in Acute Coronary Syndrome)
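
The abstract reports accuracy together with sensitivity and specificity, which matters because the cohort is imbalanced (292 survivors, 37 deaths). The sketch below shows how the three metrics follow from confusion-matrix counts and applies them to a hypothetical trivial classifier that predicts survival for every patient in the full cohort; it does not reproduce the study's test split or model outputs.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Hypothetical trivial classifier: predicts "survivor" for all 329 patients.
# Deaths are the positive class, so all 37 deaths become false negatives.
trivial = classification_metrics(tp=0, fn=37, tn=292, fp=0)
print({name: round(value, 4) for name, value in trivial.items()})
```

Such a classifier would reach about 88.75% accuracy while detecting no deaths at all, which is why sensitivity is the more informative figure for this task.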

22 pages, 3798 KB

Open Access Article

Deciphering Phosphorus Recovery from Wastewater via Machine Learning: Comparative Insights Among Al³⁺, Fe³⁺ and Ca²⁺ Systems

by Yanyu Liu and Baichuan Jiang

Water 2026, 18(2), 182; https://doi.org/10.3390/w18020182 - 9 Jan 2026

Viewed by 409

Abstract

Efficient phosphorus recovery is of great significance for sustainable wastewater management and resource recycling. While chemical precipitation is widely used, its effectiveness under complex multi-factor conditions remains challenging to predict and optimize. This study compiled a multidimensional dataset from recent experimental literature, encompassing key operational parameters (reaction time, temperature, pH, stirring speed) and dosages of three metal precipitants (Al³⁺, Ca²⁺, Fe³⁺). To systematically evaluate and benchmark phosphorus recovery performance across these distinct systems, six machine learning algorithms were developed and cross-validated: Random Forest (RF), eXtreme Gradient Boosting (XGBoost), Gaussian Process Regression (GPR), Elastic Net, Artificial Neural Network (ANN), and Partial Least Squares Regression (PLSR). Among them, the GPR model exhibited superior predictive accuracy and robustness (R² = 0.69, RMSE = 0.54). Beyond achieving high-fidelity predictions, this study advances the field by integrating interpretability analysis with Shapley Additive Explanations (SHAP) and Partial Dependence Plots (PDP). These analyses identified distinct controlling factors across systems: reaction time and pH for aluminum, Ca²⁺ dosage and alkalinity for calcium, and phosphorus loading with stirring speed for iron. The revealed factor-specific mechanisms and synergistic interactions (e.g., among pH, metal dose, and mixing intensity) provide actionable insights that transcend black-box prediction. This work presents an interpretable Machine Learning (ML) framework that offers both theoretical insights and practical guidance for optimizing phosphorus recovery in multi-metal systems and enabling precise control in wastewater treatment operations.

(This article belongs to the Special Issue Sustainable Wastewater Treatment and the Circular Economy—2nd Edition)
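
SHAP attributions of the kind described in the abstract above are based on Shapley values. The following is a minimal sketch of the exact Shapley definition on a toy function, not the study's pipeline. The model `toy_model`, the instance, and the all-zero baseline are hypothetical, and replacing absent features with a single baseline value is a simplifying assumption; practical SHAP tooling generally relies on approximations or model-specific algorithms rather than enumerating every ordering.

```python
from itertools import permutations
from typing import Callable


def shapley_values(value: Callable[[frozenset], float], n_features: int) -> list[float]:
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    totals = [0.0] * n_features
    orderings = list(permutations(range(n_features)))
    for order in orderings:
        coalition = frozenset()
        for i in order:
            with_i = coalition | {i}
            totals[i] += value(with_i) - value(coalition)
            coalition = with_i
    return [t / len(orderings) for t in totals]


# Hypothetical toy model with an interaction term between features 1 and 2.
def toy_model(x: list[float]) -> float:
    return 2.0 * x[0] + x[1] * x[2]


instance = [1.0, 2.0, 3.0]  # the prediction being explained (hypothetical)
baseline = [0.0, 0.0, 0.0]  # values used for "absent" features (assumption)


def coalition_value(present: frozenset) -> float:
    """Model output when only the features in `present` take their instance values."""
    mixed = [instance[i] if i in present else baseline[i] for i in range(len(instance))]
    return toy_model(mixed)


phi = shapley_values(coalition_value, n_features=3)
print(phi)
```

The attributions sum to the difference between the prediction for the instance and the prediction for the baseline, and the interaction term is split equally between the two features that share it. The enumeration grows factorially with the number of features, so this form is only practical for very small examples.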

21 pages, 5970 KB

Open AccessArticle

Evaluation of Multiple Influences on the Unconfined Compressive Strength of Fibre-Reinforced Backfill Using a GWO–LGBM Model

by Xin Chen, Yunmin Wang, Shengjun Miao, Shian Zhang, Zhi Yu and Linfeng Du

Materials 2026, 19(1), 200; https://doi.org/10.3390/ma19010200 - 5 Jan 2026

Viewed by 416

Abstract

Fibres can markedly enhance the uniaxial compressive strength (UCS) of cemented paste backfill (CPB). However, previous studies have mainly verified the effectiveness of polypropylene and straw fibres in improving the UCS of CPB experimentally, while systematic multi-factor evaluation remains limited. In this study, laboratory experiments were conducted on polypropylene- and straw fibre-reinforced CPB to construct a reliable dataset. The factors influencing UCS were grouped into four categories (mixture proportions, physical properties of the cement–tailings mixture, chemical characteristics of tailings, and fibre properties), and four intelligent models were developed for effectiveness analysis and UCS prediction. SHapley Additive exPlanations (SHAP) were employed to quantify the contributions of individual features, and the findings were experimentally validated. The GWO–LGBM model outperformed the SVR, ANN, and LGBM models, achieving R² = 0.907, RMSE = 0.78, MAE = 0.515, and MAPE = 0.157 on the training set, and R² = 0.949, RMSE = 0.627, MAE = 0.38, and MAPE = 0.115 on the testing set. Feature analysis reveals that mixture proportions contribute the most to UCS, followed by the tailings’ physical properties, the fibre properties, and the tailings’ chemical characteristics. This study found that cement content and tailings gradation control CPB structural compactness and that fibres enhance bonding between hydration products and tailings aggregates, while the chemical composition of the tailings plays an inert role, functioning mainly as an aggregate.

(This article belongs to the Section Construction and Building Materials)

19 pages, 3937 KB

Open AccessArticle

Forecasting Daily Ambient PM₂.₅ Concentrations in Qingdao City Using Deep Learning and Hybrid Interpretable Models and Analysis of Driving Factors Using SHAP

by Zhenfang He, Qingchun Guo, Zuhan Zhang, Genyue Feng, Shuaisen Qiao and Zhaosheng Wang

Toxics 2026, 14(1), 44; https://doi.org/10.3390/toxics14010044 - 30 Dec 2025

Cited by 5 | Viewed by 867

Abstract

With the acceleration of urbanization in China, air pollution is becoming increasingly serious, especially PM₂.₅ pollution, which poses a significant threat to public health. The study employed different deep learning models, including recurrent neural network (RNN), artificial neural network (ANN), convolutional neural network (CNN), bidirectional long short-term memory (BiLSTM), Transformer, and novel hybrid interpretable CNN–BiLSTM–Transformer architectures, for forecasting daily PM₂.₅ concentrations on the integrated dataset. The dataset of meteorological factors and atmospheric pollutants in Qingdao City was used as input features for the models. Among the models tested, the hybrid CNN–BiLSTM–Transformer model achieved the highest prediction accuracy by extracting local features, capturing temporal dependencies in both directions, and enhancing global patterns and key information, with a low root mean square error (RMSE) (5.4236 μg/m³), low mean absolute error (MAE) (4.0220 μg/m³), low mean absolute percentage error (MAPE) (22.7791%), and high correlation coefficient (R) (0.9743). SHapley Additive exPlanations (SHAP) analysis further revealed that PM₁₀, CO, mean atmospheric temperature, O₃, and SO₂ are the key influencing factors of PM₂.₅. This study provides a more comprehensive and multidimensional approach for predicting air pollution, and valuable insights for public health and policymakers.
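
The four error measures quoted in the abstract above (RMSE, MAE, MAPE, and the correlation coefficient R) can be computed from paired observations and forecasts. The following is a minimal sketch using their standard definitions, assuming R denotes the Pearson correlation coefficient; the function name and the sample values are hypothetical and are not data from the study.

```python
import math


def regression_metrics(y_true: list[float], y_pred: list[float]) -> dict[str, float]:
    """RMSE, MAE, MAPE (in percent), and Pearson R for paired observations and forecasts.

    Assumes equal-length, non-empty inputs, no zero in y_true (MAPE divides by it),
    and non-constant series (Pearson R divides by the standard deviations).
    """
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n

    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    ss_t = sum((t - mean_t) ** 2 for t in y_true)
    ss_p = sum((p - mean_p) ** 2 for p in y_pred)
    r = cov / math.sqrt(ss_t * ss_p)

    return {"rmse": rmse, "mae": mae, "mape": mape, "r": r}


# Hypothetical daily concentrations in μg/m³, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
forecast = [12.0, 18.0, 33.0, 39.0]
scores = regression_metrics(observed, forecast)
print(scores)
```

Because MAPE divides each error by the observed value, days with low observed concentrations weigh heavily on it, which is one way a forecast can combine a high R with a comparatively large MAPE.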
