Search Results (1,245)

Search Parameters:
Keywords = SHAP methods

17 pages, 4034 KB  
Article
Non-Destructive Assessment of Beef Freshness Using Visible and Near-Infrared Spectroscopy with Interpretable Machine Learning
by Ruoxin Chen, Wei Ning, Xufen Xie, Jingran Bi, Gongliang Zhang and Hongman Hou
Foods 2026, 15(4), 728; https://doi.org/10.3390/foods15040728 (registering DOI) - 15 Feb 2026
Abstract
Beef freshness is a critical indicator of meat quality and safety, and its rapid, non-destructive detection is of significant importance for ensuring consumer health and enhancing quality control throughout the meat industry chain. This study developed a novel methodology for non-destructive beef freshness assessment using visible and near-infrared (Vis-NIR) spectroscopy combined with machine learning, explainable artificial intelligence (XAI) techniques, and the SHapley Additive exPlanations (SHAP) framework. An improved hybrid heuristic method, particle swarm optimization–genetic algorithm (PSOGA), was used for feature selection, optimizing the wavelength subset for predicting beef quality indicators, including total volatile basic nitrogen (TVB-N) and color parameters (L*, a*, and b*). eXtreme Gradient Boosting (XGBoost) was employed for regression modeling, and the results showed that PSOGA significantly outperformed traditional methods, with the PSOGA-XGBoost model achieving satisfactory prediction accuracy (R2p values of 0.9504 for TVB-N, 0.9540 for L*, 0.8939 for a*, and 0.9416 for b*). The SHAP framework identified the key wavelengths as 1236 nm and 1316 nm for TVB-N, 728 nm for L*, 576 nm for a*, and 604 nm for b*, providing valuable insight into the determination of key wavelengths and enhancing the interpretability of the model. The results demonstrated the effectiveness of PSOGA and SHAP, providing a promising analytical method for monitoring beef freshness.
(This article belongs to the Special Issue Advances in Meat Quality and Quality Control)
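The SHAP framework used in the article above rests on Shapley values from cooperative game theory: each feature's attribution is its average marginal contribution over all coalitions of the other features. As an illustration only (the toy predictor and values below are invented, not the paper's PSOGA-XGBoost model over real wavelengths), exact Shapley values for a small model can be computed straight from the coalition formula:

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley values for model f over len(x) features.
    Features outside a coalition are fixed to their baseline value."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            for S in combinations(others, k):
                # Weight |S|! (n-|S|-1)! / n! of this coalition
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                with_i = [x[j] if (j in S or j == i) else baseline[j] for j in range(n)]
                without_i = [x[j] if j in S else baseline[j] for j in range(n)]
                phi[i] += w * (f(with_i) - f(without_i))
    return phi

# Toy "spectral" model over three hypothetical reflectance features
f = lambda z: 3.0 * z[0] + 2.0 * z[1] * z[2]
x, base = [1.0, 2.0, 3.0], [0.0, 0.0, 0.0]
phi = shapley_values(f, x, base)
# Efficiency property: attributions sum to f(x) - f(baseline)
assert abs(sum(phi) - (f(x) - f(base))) < 1e-9
```

For real spectral models this exponential enumeration is replaced by approximations such as TreeSHAP, but the efficiency property checked at the end is what makes per-wavelength attributions like "1236 nm drives the TVB-N prediction" additive and interpretable.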

23 pages, 3515 KB  
Article
Characterizing Cotton Defoliation Progress via UAV-Based Multispectral-Derived Leaf Area Index and Analysis of Influencing Factors
by Yukun Wang, Zhenwang Zhang, Chenyu Xiao, Te Zhang, Keke Yu, Chong Zhang, Qinghua Liao, Fangjun Li, Sumei Wan, Guodong Chen, Xiaoli Tian, Mingwei Du and Zhaohu Li
Remote Sens. 2026, 18(4), 609; https://doi.org/10.3390/rs18040609 (registering DOI) - 15 Feb 2026
Abstract
Timely monitoring of cotton defoliation progress is crucial for optimizing the quality of mechanical harvesting. To accurately assess the defoliation status prior to mechanical picking, a field experiment was conducted in Hejian, Hebei Province, China, in 2022. Using a DJI P4M multispectral drone, canopy images of cotton were collected before and after defoliation at three flight altitudes: 25 m, 50 m, and 100 m. The study employed machine learning algorithms including linear regression, Support Vector Machine (SVM), Generalized Additive Model (GAM), and Random Forest (RF) to invert the Leaf Area Index (LAI). Additionally, SVM-based supervised classification was introduced to eliminate background interference from soil and open cotton bolls, while the XGBoost model and SHAP method were used to analyze the main factors influencing LAI inversion. Key findings include the following: The univariate linear relationship between EVI and LAI proved to be the most robust, with the model constructed from 100 m flight altitude data performing best (validation set: R2 = 0.921, RMSE = 0.284). The rate of LAI change showed a strong positive correlation with field-measured defoliation rate (r = 0.83–0.88), confirming its reliability as a proxy indicator for defoliation progress. Soil and open cotton bolls were identified as major negative factors affecting LAI inversion accuracy. The optimal machine learning prediction model varied with days after spraying, demonstrating significant temporal variability. This study demonstrates that high-throughput LAI inversion based on drone-derived multispectral EVI enables precise and dynamic monitoring of cotton defoliation. The approach provides farmers and field managers with an efficient, non-destructive monitoring tool. By delivering real-time insight into defoliation progress, it plays a pivotal role in enabling precision defoliation management, reducing excessive chemical use, optimizing the scheduling of mechanical operations, and ultimately enhancing both the sustainability and profitability of cotton production.
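For readers unfamiliar with EVI-based LAI inversion, the Enhanced Vegetation Index itself has a standard closed form; the linear inversion coefficients below are purely illustrative placeholders, since the paper fits its own univariate models per flight altitude:

```python
def evi(nir, red, blue):
    # Enhanced Vegetation Index with the standard MODIS-style coefficients
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)

def lai_from_evi(e, slope=5.2, intercept=-0.4):
    # Hypothetical univariate inversion LAI = a*EVI + b; the paper's
    # fitted coefficients depend on altitude and are not reproduced here.
    return slope * e + intercept

# Typical pre-defoliation canopy reflectances (invented example values)
e = evi(nir=0.45, red=0.08, blue=0.04)
lai = lai_from_evi(e)
```

The defoliation-progress proxy described in the abstract is then just the relative drop in inverted LAI between the pre- and post-defoliation flights.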

18 pages, 1390 KB  
Article
Predicting Anticipated Telehealth Use: Development of the CONTEST Score and Machine Learning Models Using a National U.S. Survey
by Richard C. Wang and Usha Sambamoorthi
Healthcare 2026, 14(4), 500; https://doi.org/10.3390/healthcare14040500 (registering DOI) - 14 Feb 2026
Abstract
Objectives: Anticipated telehealth use is an important determinant of whether telehealth can function as a durable component of hybrid care models. However, there are limited practical tools to identify patients at risk of discontinuing telehealth. We aim to (1) identify factors associated with anticipated telehealth use; (2) develop a risk stratification tool (CONTEST); (3) compare its performance with machine learning (ML) models; and (4) evaluate model fairness across sex and race/ethnicity. Methods: We conducted a retrospective cross-sectional analysis of the 2024 Health Information National Trends Survey 7 (HINTS 7), including U.S. adults with ≥1 telehealth visit in the prior 12 months. The primary outcome was anticipated telehealth use. Survey-weighted multivariable logistic regression informed a Framingham-style point score (CONTEST). ML models (XGBoost, random forest, logistic regression) were trained and evaluated using the area under the receiver operating characteristic curve (AUROC), precision, and recall. Global interpretation used SHAP values. Fairness was assessed using group metrics (Disparate Impact, Equal Opportunity) and individual counterfactual-flip rates (CFR). Results: Approximately one-third of adults reported at least one telehealth visit in the prior year. Among these users, nearly one in ten expressed an unwillingness to continue using telehealth in the future. Four telehealth experience factors were independently associated with unwillingness to continue: lower perceived convenience, technical problems, lower perceived quality compared to in-person care, and unwillingness to recommend telehealth. CONTEST demonstrated strong discrimination for identifying individuals with lower anticipated telehealth use (AUROC 0.876; 95% CI, 0.843–0.908). XGBoost performed best among the ML models (AUROC 0.902 with all features). With the same four top features, an ML-informed point score achieved an AUROC of 0.872 (95% CI, 0.839–0.904), and a four-feature XGBoost model yielded an AUROC of 0.893 (95% CI, 0.821–0.948, p > 0.05). Group fairness metrics revealed disparities across sex and race/ethnicity, whereas individual counterfactual analyses indicated low flip rates (sex CFR: 0.024; race/ethnicity CFR: 0.013). Conclusions: A parsimonious, interpretable score (CONTEST) and feature-matched ML models provide comparable discrimination for stratifying risk of lower anticipated telehealth use. Sustained engagement hinges on convenience, technical reliability, perceived quality, and patient advocacy. Implementation should pair prediction with operational support and routine fairness monitoring to mitigate subgroup disparities.
(This article belongs to the Special Issue Informatics in Healthcare Outcomes)
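A Framingham-style point score of the kind CONTEST represents is typically built by dividing each logistic regression coefficient by a base unit (often the smallest coefficient) and rounding to integer points. The coefficients below are hypothetical, not the published CONTEST weights; a minimal sketch of the mechanics:

```python
def to_points(betas, base_unit=None):
    """Framingham-style scoring: scale each regression coefficient by a
    base unit and round to the nearest integer point value."""
    if base_unit is None:
        base_unit = min(abs(b) for b in betas.values())
    return {name: round(b / base_unit) for name, b in betas.items()}

# Hypothetical coefficients for the four experience factors named above
betas = {
    "low_convenience": 1.10,
    "technical_problems": 0.72,
    "lower_quality": 1.45,
    "would_not_recommend": 2.01,
}
points = to_points(betas)
# A patient reporting technical problems and lower perceived quality
score = points["technical_problems"] + points["lower_quality"]
```

A risk threshold on the summed score then stratifies patients, trading a little discrimination (AUROC 0.872 vs. 0.902 in the abstract) for a tool clinicians can apply without a model runtime.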

32 pages, 18424 KB  
Article
Spatial Assessment of Urban Flood Resilience Using a GESIS-ML Framework: A Case Study of Chongqing, China
by Yunyan Li, Huanhuan Yuan, Jiaxing Dai, Binyan Wang, Xing Liu and Chenhao Fang
Sustainability 2026, 18(4), 1988; https://doi.org/10.3390/su18041988 (registering DOI) - 14 Feb 2026
Abstract
Against the backdrop of climate change and rapid urbanization, assessing urban flood resilience requires spatially continuous and interpretable approaches capable of capturing nonlinear interactions between natural and human systems. This study proposes a high-resolution framework for mapping urban flood resilience in the built-up areas of Chongqing, China, grounded in the geography–ecology–society–infrastructure systems (GESIS) concept. A Flood Resilience Index is constructed at a 50 m grid resolution using ten core indicators and objective weighting based on combined entropy and coefficient-of-variation methods. Three machine learning models—multilayer perceptron (MLP), random forest, and XGBoost—are then trained to reproduce the resilience surface by integrating these indicators with additional historical flood-exposure variables, with SHAP used for model interpretation. The MLP model achieves the best performance (R2 ≈ 0.78) and generates spatially coherent resilience patterns. Impervious surface fraction and building density exert dominant negative effects, whereas elevation and ecological connectivity contribute positively. The results reveal pronounced nonlinear thresholds in key drivers, indicating that flood resilience cannot be inferred from monotonic factor effects alone. By combining objective weighting, explainable machine learning, and historical exposure information, this framework supports both accurate prediction and policy-relevant interpretation of urban flood resilience for sustainable urban planning in mountainous megacities.
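The entropy half of the combined entropy/coefficient-of-variation weighting assigns larger objective weights to indicators that vary more across grid cells. The 4 × 3 indicator matrix below is a toy example, not Chongqing data:

```python
import numpy as np

def entropy_weights(X):
    """Entropy weighting method: indicators with more dispersion across
    rows (lower information entropy) receive larger objective weights."""
    P = X / X.sum(axis=0)                 # column-normalise to shares
    n = X.shape[0]
    k = 1.0 / np.log(n)
    with np.errstate(divide="ignore", invalid="ignore"):
        logs = np.where(P > 0, np.log(P), 0.0)
    e = -k * (P * logs).sum(axis=0)       # entropy per indicator, in [0, 1]
    d = 1.0 - e                           # degree of divergence
    return d / d.sum()

# Toy matrix: 4 grid cells x 3 resilience indicators (hypothetical values);
# only the first indicator varies, so it should take essentially all weight.
X = np.array([[0.2, 0.9, 0.5],
              [0.4, 0.9, 0.5],
              [0.6, 0.9, 0.5],
              [0.8, 0.9, 0.5]])
w = entropy_weights(X)
```

In the paper these weights are blended with coefficient-of-variation weights before aggregating the ten indicators into the 50 m Flood Resilience Index.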

26 pages, 1916 KB  
Article
A Temporally Dynamic Feature-Extraction Framework for Phishing Detection with LIME and SHAP Explanations
by Chris Mayo, Michael Tchuindjang, Sarfraz Brohi and Nikolaos Ersotelos
Future Internet 2026, 18(2), 101; https://doi.org/10.3390/fi18020101 (registering DOI) - 14 Feb 2026
Abstract
Phishing remains one of the most pervasive social engineering threats, exploiting human vulnerabilities and continuously evolving to bypass static detection mechanisms. Existing machine learning models achieve high accuracy but often act as opaque systems that lack robustness to evolving tactics and explainability, limiting trust and real-world deployment. In this research, we propose a dynamic Explainable AI (XAI) approach for phishing detection that integrates temporally aware feature extraction with dual interpretability through LIME and SHAP applied to the resulting window-level features. The novelty of this research lies in a temporally dynamic feature framework that simulates a plausible email reading progression using a heuristic temporal model and employs a sliding window aggregation method to capture behavioural and temporal patterns within email content. Using an aggregated dataset of 82,500 phishing and legitimate emails, dynamic features were extracted and used to train four classifiers: Random Forest, XGBoost, Multi-Layer Perceptron, and Logistic Regression. Ensemble models demonstrated strong performance, with XGBoost achieving 94% accuracy and Random Forest 93%. This research addresses an important gap by combining dynamically constructed temporal features with transparent explanations, achieving high detection performance while preserving interpretability. These findings demonstrate that temporally structured features and explainable learning can enhance the trustworthiness and practical deployability of phishing detection systems without incurring excessive computational overhead.
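The sliding-window aggregation idea, simulating a linear reading progression through the email and summarising cues per window, can be sketched in a few lines. The window size, step, and cue lists below are assumptions for illustration, not the paper's actual feature set:

```python
def sliding_windows(tokens, size=5, step=2):
    """Aggregate token-level cues into overlapping window features,
    approximating a left-to-right reading progression."""
    feats = []
    for start in range(0, max(len(tokens) - size + 1, 1), step):
        window = tokens[start:start + size]
        feats.append({
            "start": start,
            # Hypothetical cue sets; a real system would use richer lexicons
            "urgency": sum(t in {"urgent", "now", "immediately"} for t in window),
            "links": sum(t.startswith("http") for t in window),
        })
    return feats

email = "please verify your account immediately via http://example.test now".split()
feats = sliding_windows(email)
```

Each window's feature vector becomes one row of the temporally structured representation the classifiers are trained on, and LIME/SHAP explanations are then expressed at this window level.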
23 pages, 3619 KB  
Article
Unbalanced Data Mining Algorithms from IoT Sensors for Early Cockroach Infestation Prediction in Sewer Systems
by Joaquín Aguilar, Cristóbal Romero, Carlos de Castro Lozano and Enrique García
Algorithms 2026, 19(2), 152; https://doi.org/10.3390/a19020152 (registering DOI) - 14 Feb 2026
Abstract
Predictive pest management in urban sewer networks represents a sustainable alternative to reactive, biocide-based methods. Using data collected through an IoT architecture and validated with manual inspections across eight manholes over 113 days, we implemented a rigorous comparative framework evaluating eleven data mining algorithms, including classical methods (KNN, SVM, decision trees) and advanced ensemble techniques (XGBoost, LightGBM, CatBoost) optimized for unbalanced datasets. Gradient boosting models with explicit handling of class imbalance—where the absence of pests exceeds 77% of observations—showed exceptional performance, achieving a Macro-F1 score above 0.92 and high precision in identifying the minority high-risk class. Explainability analysis using SHAP consistently revealed that elevated CO2 concentrations are the primary predictor of infestation, enabling early identification of critical zones. This study demonstrates that carbon dioxide (CO2) acts as the most robust bioindicator for predicting severe infestations of Periplaneta americana, significantly outperforming conventional environmental variables such as temperature and humidity. The implementation of the model in a real-time monitoring platform generates interpretable heat maps that support proactive and localized interventions, optimizing resource use and reducing dependence on biocides. The resulting predictive system is designed for direct integration into municipal asset management workflows, offering a concrete, industry-ready path from reactive, labor-intensive pest control to a data-driven, proactive operational paradigm. The approach also aligns with the Sustainable Development Goals, providing a scalable, interpretable, and operationally viable system for smart cities.
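Macro-F1, the headline metric above, averages per-class F1 without frequency weighting, which is exactly why it suits a dataset where "no pests" exceeds 77% of observations: the rare infestation class counts as much as the dominant one. A minimal sketch on invented toy labels:

```python
def macro_f1(y_true, y_pred, labels):
    """Macro-F1: unweighted mean of per-class F1 scores."""
    f1s = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        f1s.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(f1s) / len(f1s)

# Imbalanced toy labels: 0 = no pests (majority), 1 = high infestation risk
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
m = macro_f1(y_true, y_pred, labels=[0, 1])
```

A classifier that always predicted "no pests" would score 70% accuracy on these toy labels but a macro-F1 of only about 0.41, which is why the paper reports macro-F1 rather than accuracy.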

26 pages, 1731 KB  
Article
Time-Varying Linkages Between Survey-Based Financial Risk Tolerance and Stock Market Dynamics: Signal Decomposition and Regime-Switching Evidence
by Wookjae Heo
Mathematics 2026, 14(4), 667; https://doi.org/10.3390/math14040667 - 13 Feb 2026
Abstract
This study examines how aggregate financial risk tolerance (FRT), measured from repeated survey responses, co-evolves with stock-market dynamics over time. The observed FRT index is treated as a noisy preference signal containing both gradual drift and episodic deviations, and its market relevance is evaluated under time variation, frequency components, and stress regimes. Using monthly data that align the survey-based FRT index with market returns and risk measures, a three-part econometric design is implemented. First, a time-varying parameter VAR (TVP-VAR) characterizes bidirectional, non-constant linkages between FRT and market outcomes. Second, signal-extraction methods decompose FRT into a smooth “normal” component and a high-frequency “abnormal” component (with robustness to alternative filters) to test whether short-run deviations contain distinct information for volatility and downside risk. Third, a Markov-switching specification assesses state dependence by testing whether the FRT–market relationship differs between low-stress and high-stress regimes. Across specifications, the FRT–market linkage is strongly state dependent: the sign and magnitude of FRT effects drift over time and differ across regimes, with high-frequency FRT deviations aligning more closely with risk dynamics than the smooth component. Predictive validation is provided via out-of-sample forecasting of next-month market risk using elastic net and gradient boosting relative to an AR(1) benchmark; explainability analysis (SHAP) indicates that abnormal FRT contributes incremental predictive content beyond standard market-state variables. Overall, the framework offers a mathematically transparent approach to modeling survey-based preference signals in markets and supports regime-aware forecasting and risk-management applications.
(This article belongs to the Special Issue Signal Processing and Machine Learning in Real-Life Processes)
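The decomposition of the FRT index into "normal" and "abnormal" components can be illustrated with the simplest possible filter, a centred moving average; the paper uses more sophisticated signal-extraction methods with robustness checks, and the monthly series below is invented:

```python
def decompose(series, window=12):
    """Split a series into a smooth 'normal' component (centred moving
    average, shrinking at the edges) and a high-frequency residual."""
    n, half = len(series), window // 2
    smooth = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        smooth.append(sum(series[lo:hi]) / (hi - lo))
    abnormal = [x - s for x, s in zip(series, smooth)]
    return smooth, abnormal

# Invented monthly FRT index values
frt = [50, 51, 49, 55, 48, 52, 50, 60, 47, 51, 50, 49]
smooth, abnormal = decompose(frt, window=4)
# The two components reconstruct the observed index exactly
assert all(abs(s + a - x) < 1e-9 for s, a, x in zip(smooth, abnormal, frt))
```

The abstract's finding is that the `abnormal` residual, not the `smooth` trend, carries most of the incremental information about next-month volatility and downside risk.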

20 pages, 1359 KB  
Article
Development and Temporal Validation of Explainable Machine Learning Models for Predicting Vitamin B12 Deficiency Using Routine Laboratory Analytes
by Ferhat Demirci, Oktay Yıldırım, Aylin Demirci and Pınar Akan
Diagnostics 2026, 16(4), 563; https://doi.org/10.3390/diagnostics16040563 - 13 Feb 2026
Abstract
Background/Objectives: Vitamin B12 deficiency is a prevalent yet frequently underdiagnosed condition, largely due to the limited diagnostic accuracy of serum total B12 and the restricted availability of confirmatory biomarkers such as holotranscobalamin and methylmalonic acid. This study aimed to develop and validate explainable machine learning (ML) models capable of predicting vitamin B12 deficiency using only routinely available laboratory examinations, thereby supporting early detection within standard diagnostic workflows. Methods: This retrospective study included 51,630 adult patients who underwent concurrent vitamin B12 testing and routine laboratory evaluation between 2015 and 2025. An independent temporal validation cohort of 34,744 patients was used to assess generalizability. Eight supervised ML algorithms were developed within a four-stage experimental framework incorporating default modeling, probability-threshold optimization, hyperparameter tuning, and feature engineering. Model performance was evaluated using AUC-ROC, AUC-PR, sensitivity, specificity, F1 score, accuracy, Matthews correlation coefficient, and likelihood ratios. Model explainability and clinical utility were assessed using SHAP, LIME, and decision curve analysis. Results: Among all algorithms, CatBoost demonstrated the most balanced and clinically relevant performance. In the threshold-optimized configuration, the model achieved a sensitivity of 0.92, specificity of 0.67, F1 score of 0.82, AUC-ROC of 0.88, and AUC-PR of 0.86 in the test set. Temporal validation confirmed robust generalizability, with improved discrimination (AUC-ROC 0.90; AUC-PR 0.91) and stable calibration. Explainability analyses identified hematologic indices (MCV, HGB, HCT, RDW), iron-related markers, inflammatory measurands, and age as the most influential contributors, consistent with known pathophysiology. Conclusions: This study presents a large-scale, explainable, and temporally validated ML framework for predicting vitamin B12 deficiency using routine laboratory data alone. The model demonstrates strong diagnostic performance, biological plausibility, and potential for seamless integration into laboratory and clinical decision-support systems, enabling cost-effective and early identification of patients at risk.
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
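Probability-threshold optimization, the second stage of the four-stage framework above, can be sketched as a grid search over cut-offs on validation predictions. The F1 criterion and the toy predictions below are assumptions; the paper may optimize a different sensitivity-oriented objective:

```python
def best_threshold(y_true, probs, grid=None):
    """Scan candidate probability cut-offs and keep the one with the
    highest F1 for the positive (deficient) class."""
    grid = grid or [i / 100 for i in range(5, 100, 5)]

    def f1_at(t):
        pred = [p >= t for p in probs]
        tp = sum(y and q for y, q in zip(y_true, pred))
        fp = sum((not y) and q for y, q in zip(y_true, pred))
        fn = sum(y and (not q) for y, q in zip(y_true, pred))
        return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

    return max(grid, key=f1_at)

# Invented validation labels (1 = deficient) and model probabilities
y = [1, 1, 1, 0, 0, 0, 0, 1]
p = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
t = best_threshold(y, p)
```

Lowering the threshold below the default 0.5 is the usual way such models trade specificity for the high sensitivity (0.92 here) needed in a screening setting.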

29 pages, 123573 KB  
Article
Dynamic Landslide Susceptibility Assessment Integrating SBAS-InSAR and Interpretable Machine Learning: A Case Study of the Baihetan Reservoir Area, Southwest China
by Hongfei Wang, Chuhan Deng, Ziyou Zhang, Zhekai Jiang, Qi Wei, Weijie Yi, Tao Chen and Junwei Ma
Remote Sens. 2026, 18(4), 578; https://doi.org/10.3390/rs18040578 (registering DOI) - 12 Feb 2026
Abstract
Landslide susceptibility mapping (LSM) is a fundamental approach for identifying and predicting areas prone to slope failure. However, most conventional LSM methods are based on time-invariant conditioning factors or long-term-averaged predictors and seldom incorporate slope-kinematic information from deformation observations, thereby limiting their ability to capture evolving slope instability. Moreover, the black-box nature of many models limits interpretability and confidence in their predictions. In this study, we integrate small baseline subset interferometric synthetic aperture radar (SBAS-InSAR) with interpretable machine learning (ML) methods to develop a dynamic LSM framework that improves the accuracy and reliability of susceptibility assessment. First, static LSM was performed using ML algorithms, and SHapley Additive exPlanations (SHAP) was used to quantify and visualize feature importance. Subsequently, SBAS-InSAR was applied to retrieve surface deformation rates. Finally, a dynamic LSM matrix was constructed to integrate InSAR-derived deformation with static susceptibility classes, producing time-varying landslide susceptibility maps. Application of the framework in the Baihetan Reservoir area, Southwest China, demonstrates its practical value. During the static LSM phase, the extreme gradient boosting (XGBoost) model achieved strong predictive performance (the area under the receiver operating characteristic curve (AUC) = 0.8864; accuracy = 0.8315; precision = 0.8947), outperforming the alternative models. SHAP analysis indicates that elevation and distance to rivers are the primary controls on landslide occurrence. Incorporating SBAS-InSAR deformation data into the dynamic LSM matrix effectively captures the spatiotemporal evolution of slope instability. Susceptibility upgrades are observed for multiple inventoried landslides, and the actively deforming Xiaomidi and Gantianba landslides are presented as representative case studies, further supported by multisource observations from satellite imagery, unmanned aerial vehicle (UAV) surveys, and ground-based global navigation satellite system (GNSS) monitoring. Consequently, the proposed dynamic LSM framework overcomes limitations of static approaches by integrating deformation information and enhancing interpretability through explainable artificial intelligence.
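A dynamic LSM matrix of the kind described above is essentially a lookup table crossing the static susceptibility class with an InSAR deformation level. The class labels, rate threshold, and upgrade rules below are hypothetical, meant only to show the structure, not the paper's calibrated scheme:

```python
# Rule matrix: (static susceptibility class, deformation level) -> dynamic
# class. Labels and upgrade rules are an illustrative invention.
MATRIX = {
    ("low", "slow"): "low",           ("low", "fast"): "moderate",
    ("moderate", "slow"): "moderate", ("moderate", "fast"): "high",
    ("high", "slow"): "high",         ("high", "fast"): "very high",
}

def dynamic_class(static_class, rate_mm_yr, threshold=20.0):
    """Upgrade a static susceptibility class when the SBAS-InSAR line-of-sight
    deformation rate exceeds a threshold (the 20 mm/yr value is illustrative)."""
    level = "fast" if abs(rate_mm_yr) > threshold else "slow"
    return MATRIX[(static_class, level)]
```

Re-evaluating the matrix as new deformation rates arrive is what makes the resulting susceptibility maps time-varying rather than static.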

27 pages, 7226 KB  
Article
Interpretable Deep Learning for Landslide Forecasting in Post-Seismic Areas: Integrating SBAS-InSAR and Environmental Factors
by H. Y. Guo and A. M. Martínez-Graña
Appl. Sci. 2026, 16(4), 1852; https://doi.org/10.3390/app16041852 - 12 Feb 2026
Abstract
Forecasting post-seismic landslide displacement is challenged by the difficulty in distinguishing short-term acceleration from creep and the risk of spatiotemporal leakage. To address this, an interpretable deep-learning framework is developed, integrating SBAS-InSAR time series with an Attention-enhanced Gated Recurrent Unit (Attention-GRU). Prior to modeling, a multi-stage preprocessing strategy, including empirical mode decomposition, is applied to mitigate noise and delineate active deformation zones. Unlike standard architectures, the model’s temporal attention mechanism adaptively amplifies critical precursory acceleration phases. Furthermore, a strict landslide-object-based partitioning strategy is employed to rigorously mitigate spatiotemporal leakage. The framework was evaluated in the Le’an Town landslide cluster using multi-source data. Targeting identified hazardous regions, the method achieved an R2 of 0.93 and reduced MAPE by 42.7% relative to the SVR baseline. This reflects location-specific predictive capability within active zones rather than regional generalization. SHapley Additive exPlanations (SHAP) further confirmed that the model captures physical relationships, such as sensitivity to 25–35° slopes and vegetation degradation. Ultimately, the proposed framework offers a transparent, physically interpretable tool for operational hazard mitigation.
(This article belongs to the Special Issue Remote Sensing Image Processing and Application, 2nd Edition)
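At its simplest, a temporal attention layer over GRU hidden states scores each time step, normalises the scores with a softmax, and returns an attention-weighted context vector, which is how precursory acceleration phases can be amplified. The dot-product scoring and toy dimensions below are assumptions, not the paper's exact architecture:

```python
import numpy as np

def temporal_attention(H, w):
    """Score each hidden state, softmax over time, and return the
    attention weights plus the weighted context vector."""
    scores = H @ w                   # (T,): one scalar score per time step
    a = np.exp(scores - scores.max())
    a /= a.sum()                     # softmax attention weights over time
    return a, a @ H                  # (T,) weights and (D,) context vector

rng = np.random.default_rng(0)
H = rng.normal(size=(6, 4))          # stand-in for 6 GRU states, hidden size 4
w = rng.normal(size=4)               # learned scoring vector (random here)
a, context = temporal_attention(H, w)
assert abs(a.sum() - 1.0) < 1e-9 and context.shape == (4,)
```

Inspecting `a` over the displacement time series is also what makes the attention mechanism itself interpretable: large weights should align with the precursory acceleration windows.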

32 pages, 4917 KB  
Article
Optimization of Cultivation Strategies Through Crop Yield Prediction for Rice and Maize Using a Hybrid CatBoost-NSGA-II Model
by Yuyang Zhang, Amir Abdullah Khan, Wei Zhao and Xufeng Xiao
Agriculture 2026, 16(4), 423; https://doi.org/10.3390/agriculture16040423 - 12 Feb 2026
Abstract
In light of the dual challenges of global climate change and the pressure on agricultural resources, increasing crop yields and resource utilization efficiency has become the key to ensuring food security and sustainable agricultural development. This study takes environmental factors and cultivation measures as input and crop yield as output; systematically compares five ensemble learning models: RF, LightGBM, GBDT, XGBoost, and CatBoost; and selects CatBoost as the best-performing algorithm. A CatBoost–Nondominated Sorting Genetic Algorithm II (NSGA-II) hybrid model was then constructed. This model provides data-driven solutions and strategies for cultivating rice and maize through precise yield prediction and multi-objective optimization. To enhance the interpretability of the model, we used the SHAP method to interpret its predictive behavior and ensure that the results conform to common agricultural knowledge. On this basis, we constructed a constrained multi-objective optimization problem and solved it using the NSGA-II algorithm to obtain a Pareto frontier that strikes a balance among yield, resource consumption, and growth cycle. Case studies showed that CatBoost performs best on the selected datasets. SHAP identified precipitation, fertilization/irrigation intensity, and temperature as the main influencing factors; NSGA-II generated a well-distributed Pareto solution set, allowing for the flexible selection of representative cultivation schemes based on different management objectives. This modeling paradigm showed good generalization ability and can be extended to other crop cultivation strategy optimization scenarios based on tabular data.
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)
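The Pareto frontier that NSGA-II maintains is defined by the dominance relation; extracting the first non-dominated front from a candidate set takes only a few lines. The (negative yield, water use) pairs below are invented placeholders, with both objectives minimised:

```python
def dominates(u, v):
    """u dominates v when u is no worse in every objective and strictly
    better in at least one (all objectives minimised)."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))

def pareto_front(points):
    """First non-dominated front, the core set NSGA-II's sorting step builds."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical cultivation schemes as (negative yield, water use) pairs,
# so maximising yield becomes minimising its negation.
schemes = [(-9.1, 420), (-8.7, 300), (-9.1, 500), (-7.0, 280), (-6.0, 350)]
front = pareto_front(schemes)
```

NSGA-II extends this with dominated-front peeling and crowding-distance selection, but every solution it ultimately reports satisfies exactly this non-dominance test.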

28 pages, 3275 KB  
Article
Deep-Learning-Based Classification of Lung Adenocarcinoma and Squamous Cell Carcinoma Using DNA Methylation Profiles: A Multi-Cohort Validation Study
by Maram Fahaad Almufareh, Samabia Tehsin, Mamoona Humayun, Sumaira Kausar and Asad Farooq
Cancers 2026, 18(4), 607; https://doi.org/10.3390/cancers18040607 - 12 Feb 2026
Abstract
Background/Objectives: The precise classification of non-small-cell lung cancer (NSCLC) into lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) plays an important role in treatment decisions and prognosis. Proper subtyping ensures that patients receive the most appropriate therapeutic strategies and allows clinicians to make informed evaluations regarding disease outcomes. This study presents a deep neural-network-based classification approach utilizing genome-wide DNA methylation profiles from the Illumina HumanMethylation450 BeadChip platform. Methods: The presented methodology identifies the 5000 most discriminative CpG probes through variance-based feature selection; samples are then classified by a five-layer deep neural network with batch normalization and dropout regularization. Training and validation were performed using data from The Cancer Genome Atlas (TCGA), with external validation conducted on two independent Gene Expression Omnibus (GEO) datasets: GSE39279 and GSE56044. Results: The model achieved 96.92% accuracy with an area under the receiver-operating characteristic curve (AUC-ROC) of 0.9981 on the TCGA test set. Robust generalization was obtained in cross-dataset validation experiments, with the GEO-trained model achieving 88.92% accuracy and 0.9724 AUC-ROC when validated on TCGA data. The most influential CpG biomarkers contributing to classification decisions are analysed using SHAP (SHapley Additive exPlanations). Conclusions: These findings demonstrate the potential of DNA methylation-based deep learning approaches for reliable NSCLC subtype classification with clinical applicability.
(This article belongs to the Special Issue Artificial Intelligence and Machine Learning in Lung Cancer)
17 pages, 1116 KB  
Article
Deep Learning for Emergency Department Sustainability: Interpretable Prediction of Revisit
by Wang-Chuan Juang, Zheng-Xun Cai, Chia-Mei Chen and Zhi-Hong You
Healthcare 2026, 14(4), 464; https://doi.org/10.3390/healthcare14040464 - 12 Feb 2026
Viewed by 59
Abstract
Background: Emergency department (ED) overcrowding strains clinicians and potentially compromises urgent care quality. Unscheduled return visits (URVs), also known as readmissions, contribute to this cycle, motivating tools that identify high-risk patients at discharge. Methods: This retrospective study used ED electronic health records (EHRs) from Kaohsiung Veterans General Hospital from January 2018 to December 2022 (n = 184,653). The model integrates structured variables, such as vital signs, medication and laboratory counts, and ICD-10–based comorbidity measures, with unstructured physician notes. Key physiologic measurements were transformed into binary form using clinical reference intervals, and random under-sampling addressed class imbalance. A multimodal CNN was proposed and evaluated with an 8:2 train–test split and 10-fold Monte Carlo cross-validation. Results: The proposed model achieved a sensitivity of 0.717 (CI: [0.695, 0.738]), accuracy of 0.846 (CI: [0.842, 0.850]), and AUROC of 0.853. Binary transformation improved recall and AUROC relative to the original numeric representations. SHAP analysis showed that unstructured features dominated prediction, while structured variables added complementary value. In a small-scale pilot evaluation using the SHAP-enabled interface, participating physicians reported the system helped surface high-risk cohorts and reduced cognitive workload by consolidating relevant patient information for rapid cross-checking. Conclusions: An interpretable CNN-based clinical decision support system can predict ED revisit risk from multimodal EHR data and demonstrates practical usability in a real-world clinical setting, supporting targeted discharge planning and follow-up as a near-term approach to mitigate overcrowding. Full article
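The binary transformation of physiologic measurements mentioned here can be sketched as below. The reference intervals shown are illustrative textbook ranges, not the study's actual cut-offs, and the function names are hypothetical; the idea is simply to flag a measurement as 1 when it falls outside its clinical reference interval.

```python
def binarize_vital(value: float, low: float, high: float) -> int:
    """Return 1 if the measurement is outside the reference interval (abnormal), else 0."""
    return int(value < low or value > high)

# Hypothetical reference intervals (illustrative only, not the study's cut-offs)
REFERENCE_INTERVALS = {
    "heart_rate": (60, 100),      # beats per minute
    "temperature": (36.1, 37.2),  # degrees Celsius
    "systolic_bp": (90, 120),     # mmHg
}

def binarize_vitals(vitals: dict) -> dict:
    """Map each known vital sign to a binary abnormality flag."""
    return {
        name: binarize_vital(value, *REFERENCE_INTERVALS[name])
        for name, value in vitals.items()
        if name in REFERENCE_INTERVALS
    }

print(binarize_vitals({"heart_rate": 112, "temperature": 36.8}))
# {'heart_rate': 1, 'temperature': 0}
```

Such flags give the downstream model a representation that is robust to unit and scale differences across measurements.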
28 pages, 8950 KB  
Article
Revealing Spatiotemporal Evolution and Driving Mechanisms of Grey Water Footprint in Land Consolidation Areas Using Explainable Machine Learning Models: Evidence from Yan’an Region, Shaanxi Province
by Qiaoyang Yang, Hui Qian, Qi Long, Yicheng Duan and Zhiming Cao
Sustainability 2026, 18(4), 1854; https://doi.org/10.3390/su18041854 - 11 Feb 2026
Viewed by 135
Abstract
The grey water footprint (GWF) is a critical indicator for assessing the impact of socio-economic activities on the water resources environment. To address the dual challenges of economic growth and water pollution associated with Land Consolidation Projects (LCPs) in the Loess Plateau, this study systematically analyzes the spatiotemporal distribution of GWF in the Yan’an region from 2000 to 2023 and employs the eXtreme Gradient Boosting (XGBoost) model to comprehensively explore its driving mechanisms. The SHapley Additive exPlanations (SHAP) method was used to quantify the dynamic contributions of the driving factors of GWF, while the threshold effects of these factors were assessed using partial dependence plot analysis. Additionally, spatial matching patterns between agricultural GWF (GWFagr) and economic factors were examined using the Gini coefficient and imbalance index. The results indicate that the total GWF (TGWF) peaked at 1.347 billion m³ in 2004 and subsequently declined owing to improvements in water management efficiency. Spatially, TGWF is higher in the central and eastern regions, where GWFagr is predominant. The permanent population and per capita GDP are the key driving factors, accounting for 21.1% and 15% of the total change in TGWF, respectively. In the spatial coupling relationship between agricultural GDP and GWFagr, the overall imbalance index has significantly decreased. The synergistic effect between the Grain for Green Project and LCPs is becoming increasingly evident. These insights provide scientific support and policy guidance for the ecological protection and high-quality development of the Yellow River Basin. Full article
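Quantifying a driving factor's contribution from SHAP values, as this abstract reports (e.g. 21.1% for permanent population), is commonly done by normalizing each feature's mean absolute SHAP value. The sketch below assumes a precomputed samples-by-features SHAP matrix; the function name and toy driver names are hypothetical, and this is one common aggregation, not necessarily the exact scheme the authors used.

```python
import numpy as np

def shap_contribution_shares(shap_values, feature_names):
    """Convert a (samples x features) SHAP value matrix into percentage shares.

    Each feature's mean absolute SHAP value is normalized so that the
    shares sum to 100%, giving a per-driver contribution percentage.
    """
    mean_abs = np.abs(np.asarray(shap_values, dtype=float)).mean(axis=0)
    shares = 100.0 * mean_abs / mean_abs.sum()
    return dict(zip(feature_names, shares))

# Toy SHAP matrix: 4 samples x 3 hypothetical drivers
toy_shap = np.array([
    [ 0.4, -0.1, 0.05],
    [-0.5,  0.2, 0.00],
    [ 0.3, -0.1, 0.10],
    [-0.4,  0.2, 0.05],
])
shares = shap_contribution_shares(toy_shap, ["population", "gdp_per_capita", "rainfall"])
print(shares)
```

Taking absolute values before averaging matters: a driver whose positive and negative effects cancel in the mean still counts as influential.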
12 pages, 2135 KB  
Article
Machine Learning-Assisted In Situ Monitoring System for Identifying and Predicting Components, Concentrations, and Viscosities of Fracturing Flowback Wastewater
by Sai Gong, Haoran Chen, Qiuju Liu, Yao Pan and Jinfeng Wang
Water 2026, 18(4), 464; https://doi.org/10.3390/w18040464 - 11 Feb 2026
Viewed by 166
Abstract
The effective management of fracturing flowback wastewater is critical to oil and gas production sustainability, while its complex and rapidly evolving rheology poses a significant barrier to monitoring and targeted treatment. Traditional offline sampling methods suffer from measurement latency, failing to capture real-time dynamic changes in treatment reactors. To address these limitations, this study develops a novel machine learning-assisted in situ monitoring system integrating ultrasonic time-domain reflectometry (UTDR) to characterize fluid components, concentrations, and viscosity simultaneously. Specifically, the random forest model achieved the highest accuracy (88.0%) in component identification among three tree-based algorithms, while support vector classification (SVC) effectively discriminated concentration levels with an accuracy of 82.4%. For viscosity prediction, the 1D-convolutional neural network (1D-CNN) demonstrated superior performance, achieving an R2 of 0.972. Crucially, interpretability analyses (SHAP and Grad-CAM) confirmed that model decisions align with hydroacoustic principles of attenuation and viscous damping. In dynamic enzymatic degradation tests, the system successfully tracked rapid viscosity transitions with a relative error of less than 13%. This approach provides a high-resolution, cost-effective solution for the intelligent monitoring of fracturing flowback wastewater. Full article
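The hydroacoustic attenuation that the interpretability analysis ties the models to can be illustrated with the standard log-amplitude-ratio estimate: comparing two echo amplitudes over a known extra path length yields attenuation in dB per unit length, which rises with viscous damping. The function below is a generic sketch of that formula, not the paper's processing pipeline, and its name and example values are hypothetical.

```python
import math

def attenuation_db_per_unit(a0: float, a1: float, path_length: float) -> float:
    """Estimate acoustic attenuation in dB per unit length from two echo amplitudes.

    a0, a1: amplitudes of an earlier and a later echo; path_length is the
    extra distance travelled by the later echo. Higher-viscosity fluids damp
    the ultrasonic signal more strongly, raising this estimate.
    """
    if a0 <= 0 or a1 <= 0 or path_length <= 0:
        raise ValueError("amplitudes and path length must be positive")
    return 20.0 * math.log10(a0 / a1) / path_length

# A tenfold amplitude drop over one unit of path corresponds to 20 dB per unit
print(attenuation_db_per_unit(1.0, 0.1, 1.0))  # 20.0
```

Features of this kind give tree-based classifiers and the 1D-CNN physically meaningful signal to discriminate components and track viscosity.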
(This article belongs to the Section Hydraulics and Hydrodynamics)
