MDPI - Publisher of Open Access Journals

22 pages, 4719 KiB

Open Access Article

An Explainable AI Approach for Interpretable Cross-Layer Intrusion Detection in Internet of Medical Things

by Michael Georgiades and Faisal Hussain

Electronics 2025, 14(16), 3218; https://doi.org/10.3390/electronics14163218 - 13 Aug 2025

Viewed by 137

This paper presents a cross-layer intrusion detection framework leveraging explainable artificial intelligence (XAI) and interpretability methods to enhance transparency and robustness in attack detection within the Internet of Medical Things (IoMT) domain. By addressing the dual challenges of compromised data integrity, which span both biosensor and network-layer data, this study combines advanced techniques to enhance interpretability, accuracy, and trust. Unlike conventional flow-based intrusion detection systems that primarily rely on transport-layer statistics, the proposed framework operates directly on raw packet-level features and application-layer semantics, including MQTT message types, payload entropy, and topic structures. The key contributions of this research include the application of K-Means clustering combined with the principal component analysis (PCA) algorithm for initial categorization of attack types, the use of SHapley Additive exPlanations (SHAP) for feature prioritization to identify the most influential factors in model predictions, and the employment of Partial Dependence Plots (PDP) and Accumulated Local Effects (ALE) to elucidate feature interactions across layers. These methods enhance the system’s interpretability, making data-driven decisions more accessible to nontechnical stakeholders. Evaluation on a realistic healthcare IoMT testbed demonstrates significant improvements in detection accuracy and decision-making transparency. Furthermore, the proposed approach highlights the effectiveness of explainable and cross-layer intrusion detection for secure and trustworthy medical IoT environments that are tailored for cybersecurity analysts and healthcare stakeholders.
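
The abstract above names payload entropy and topic structure among its application-layer features but does not say how they are computed. As a minimal sketch, assuming "payload entropy" means the Shannon entropy of the payload bytes and that topic structure can be summarized by the number of topic levels, the two features might look like this. The function names and the sample topic are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def payload_entropy(payload: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (range 0.0 to 8.0)."""
    if not payload:
        return 0.0
    total = len(payload)
    counts = Counter(payload)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def topic_depth(topic: str) -> int:
    """Number of '/'-separated levels in an MQTT topic string."""
    return len(topic.split("/"))


# Hypothetical inputs: a repetitive payload, a two-symbol payload,
# and a payload that uses every byte value exactly once.
low = payload_entropy(b"aaaa")
one_bit = payload_entropy(b"ab")
high = payload_entropy(bytes(range(256)))
depth = topic_depth("hospital/ward3/ecg")
```

Under this assumption, repetitive plaintext readings score near zero while encrypted or compressed payloads approach eight bits per byte, which is one reason such a feature can help separate traffic types.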

(This article belongs to the Special Issue IoT for Healthcare and Wellbeing: Trends, Challenges, and Applications, 2nd Edition)

21 pages, 2896 KiB

Open Access Article

Explainable CNN–Radiomics Fusion and Ensemble Learning for Multimodal Lesion Classification in Dental Radiographs

by Zuhal Can and Emre Aydin

Diagnostics 2025, 15(16), 1997; https://doi.org/10.3390/diagnostics15161997 - 9 Aug 2025

Viewed by 355

Abstract

Background/Objectives: Clinicians routinely rely on periapical radiographs to identify root-end disease, but interpretation errors and inconsistent readings compromise diagnostic accuracy. We, therefore, developed an explainable, multimodal AI framework that (i) fuses two data modalities, deep CNN embeddings and radiomic texture descriptors that are extracted only from lesion-relevant pixels selected by Grad-CAM, and (ii) makes every prediction transparent through dual-layer explainability (pixel-level Grad-CAM heatmaps + feature-level SHAP values). Methods: A dataset of 2285 periapical radiographs was processed using six CNN architectures (EfficientNet-B1/B4/V2M/V2S, ResNet-50, Xception). For each image, a Grad-CAM heatmap generated from the penultimate layer of the CNN was thresholded to create a binary mask that delineated the region most responsible for the network’s decision. Radiomic features (first-order, GLCM, GLRLM, GLDM, NGTDM, and shape2D) were then computed only within that mask, ensuring that handcrafted descriptors and learned embeddings referred to the same anatomic focus. The two feature streams were concatenated, optionally reduced by principal component analysis or SelectKBest, and fed to random forest or XGBoost classifiers; five-view test-time augmentation (TTA) was applied at inference. Pixel-level interpretability was provided by the original Grad-CAM, while SHAP quantified the contribution of each radiomic and deep feature to the final vote. Results: Raw CNNs achieved a ca. 52% accuracy and AUC values near 0.60. The multimodal fusion raised performance dramatically; the Xception + radiomics + random forest model achieved a 95.4% accuracy and an AUC of 0.9867, and adding TTA increased these to 96.3% and 0.9917, respectively. 
The top ensemble, Xception and EfficientNet-V2S fusion vectors classified with XGBoost under five-view TTA, reached a 97.16% accuracy and an AUC of 0.9914, with false-positive and false-negative rates of 4.6% and 0.9%, respectively. Grad-CAM heatmaps consistently highlighted periapical regions, while SHAP plots revealed that radiomic texture heterogeneity and high-level CNN features jointly contributed to correct classifications. Conclusions: By tightly integrating CNN embeddings, mask-targeted radiomics, and a two-tiered explainability stack (Grad-CAM + SHAP), the proposed system delivers state-of-the-art lesion detection and a transparent technique, addressing both accuracy and trust.
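
The mask-targeted step described in this abstract (threshold a Grad-CAM heatmap, then compute descriptors only inside the resulting mask) can be illustrated with a small sketch. The 0.5 threshold, the nested-list image representation, and the choice of first-order statistics are assumptions made for illustration; the abstract does not give the threshold, and the study's radiomic feature set is far richer.

```python
from statistics import mean, pstdev


def threshold_mask(heatmap, threshold=0.5):
    """Binary mask: 1 where the (already normalized) heatmap is >= threshold."""
    return [[1 if v >= threshold else 0 for v in row] for row in heatmap]


def masked_first_order(image, mask):
    """First-order statistics over the pixels selected by the mask."""
    pixels = [
        p
        for img_row, mask_row in zip(image, mask)
        for p, m in zip(img_row, mask_row)
        if m
    ]
    if not pixels:
        return None  # nothing selected, so no features can be computed
    return {
        "n": len(pixels),
        "mean": mean(pixels),
        "std": pstdev(pixels),
        "min": min(pixels),
        "max": max(pixels),
    }


# Toy 2x2 example with hypothetical values.
heatmap = [[0.1, 0.6], [0.9, 0.2]]
image = [[10, 20], [30, 40]]
mask = threshold_mask(heatmap)
features = masked_first_order(image, mask)
```

Restricting the statistics to the masked pixels is what keeps the handcrafted descriptors and the CNN embedding focused on the same region.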

(This article belongs to the Special Issue Lesion Detection and Analysis Using Artificial Intelligence, Third Edition)

21 pages, 2314 KiB

Open Access Article

An Explainable Machine-Learning Framework Based on XGBoost–SHAP and Big Data for Revealing the Socioeconomic Drivers of Population Urbanization in China

by Ziheng Shangguan

Systems 2025, 13(8), 679; https://doi.org/10.3390/systems13080679 - 9 Aug 2025

Viewed by 349

Abstract

The global acceleration of population urbanization has transformed cities into primary spatial hubs of human activity. As urban populations continue to expand, identifying the socioeconomic drivers of urbanization and elucidating their underlying mechanisms are essential for achieving Sustainable Development Goal 11, established by the United Nations. This study leverages machine learning and big data to investigate the determinants of population urbanization in China over the period 1991–2023. Utilizing the XGBoost algorithm combined with SHAP (Shapley Additive Explanations), the analysis reveals a tripartite structure of key drivers encompassing industrial support, employment orientation, and infrastructure accessibility. Regional assessments indicate distinct urbanization patterns: Eastern coastal areas are predominantly driven by finance and service industries; central inland regions follow an investment-led trajectory anchored in infrastructure development and real estate expansion, while the western interior relies mainly on employment-centered strategies. Partial Dependence Plots (PDPs) highlight spatial variations in the effects of sensitive factors, with interaction analyses revealing synergistic effects between tertiary sector shares and the working-age share in eastern coastlands, structural amplification by real estate investment with appropriate working-age population shares in the central inlands, and balancing interactions between GDP growth rates and tertiary sector shares in the western interior. These findings contribute to a more nuanced understanding of the socioeconomic forces shaping urbanization and offer evidence-based recommendations for policymakers in other developing countries seeking to foster sustainable urban growth.

(This article belongs to the Section Systems Practice in Social Science)

17 pages, 1584 KiB

Open Access Article

What Determines Carbon Emissions of Multimodal Travel? Insights from Interpretable Machine Learning on Mobility Trajectory Data

by Guo Wang, Shu Wang, Wenxiang Li and Hongtai Yang

Sustainability 2025, 17(15), 6983; https://doi.org/10.3390/su17156983 - 31 Jul 2025

Viewed by 288

Abstract

Understanding the carbon emissions of multimodal travel—comprising walking, metro, bus, cycling, and ride-hailing—is essential for promoting sustainable urban mobility. However, most existing studies focus on single-mode travel, while underlying spatiotemporal and behavioral determinants remain insufficiently explored due to the lack of fine-grained data and interpretable analytical frameworks. This study proposes a novel integration of high-frequency, real-world mobility trajectory data with interpretable machine learning to systematically identify the key drivers of carbon emissions at the individual trip level. Firstly, multimodal travel chains are reconstructed using continuous GPS trajectory data collected in Beijing. Secondly, a model based on Calculate Emissions from Road Transport (COPERT) is developed to quantify trip-level CO₂ emissions. Thirdly, four interpretable machine learning models based on gradient boosting—XGBoost, GBDT, LightGBM, and CatBoost—are trained using transportation and built environment features to model the relationship between CO₂ emissions and a set of explanatory variables; finally, Shapley Additive exPlanations (SHAP) and partial dependence plots (PDPs) are used to interpret the model outputs, revealing key determinants and their non-linear interaction effects. The results show that transportation-related features account for 75.1% of the explained variance in emissions, with bus usage being the most influential single factor (contributing 22.6%). Built environment features explain the remaining 24.9%. The PDP analysis reveals that substantial emission reductions occur only when the shares of bus, metro, and cycling surpass threshold levels of approximately 40%, 40%, and 30%, respectively. Additionally, travel carbon emissions are minimized when trip origins and destinations are located within a 10 to 11 km radius of the central business district (CBD). 
This study advances the field by establishing a scalable, interpretable, and behaviorally grounded framework to assess carbon emissions from multimodal travel, providing actionable insights for low-carbon transport planning and policy design.
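
The threshold effects reported from the PDP analysis are easier to read with the mechanics of a partial dependence curve in view: one feature is forced to each grid value for every trip, and the predictions are averaged. The sketch below shows that computation by hand. The toy model, its coefficients, and the feature names are hypothetical stand-ins, not the study's fitted gradient boosting models.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction when `feature` is forced to each value in `grid`."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)       # copy so the input data stays untouched
            modified[feature] = value  # override only the feature of interest
            total += predict(modified)
        curve.append((value, total / len(rows)))
    return curve


def toy_model(row):
    """Hypothetical emissions model with a step at a 40% bus share."""
    emissions = 2.0 + 0.1 * row["distance_km"]
    if row["bus_share"] >= 0.4:
        emissions -= 1.0
    return emissions


trips = [
    {"bus_share": 0.1, "distance_km": 5.0},
    {"bus_share": 0.5, "distance_km": 15.0},
]
curve = partial_dependence(toy_model, trips, "bus_share", [0.2, 0.4, 0.6])
```

In this toy setup the averaged prediction drops by one unit between the 0.2 and 0.4 grid points and then stays flat, which is the kind of step a PDP makes visible.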

(This article belongs to the Special Issue Sustainable Transportation Systems and Travel Behaviors)

25 pages, 2761 KiB

Open Access Article

Leveraging Deep Learning, Grid Search, and Bayesian Networks to Predict Distant Recurrence of Breast Cancer

by Xia Jiang, Yijun Zhou, Alan Wells and Adam Brufsky

Cancers 2025, 17(15), 2515; https://doi.org/10.3390/cancers17152515 - 30 Jul 2025

Viewed by 374

Abstract

Background: Unlike most cancers, breast cancer poses a persistent risk of distant recurrence, often years after initial treatment, making long-term risk stratification uniquely challenging. Current tools fall short in predicting late metastatic events, particularly for early-stage patients. Methods: We present an interpretable machine learning (ML) pipeline to predict distant recurrence-free survival at 5, 10, and 15 years, integrating Bayesian network-based causal feature selection, deep feed-forward neural network models (DNMs), and SHAP-based interpretation. Using electronic health record (EHR)-based clinical data from over 6000 patients, we first applied the Markov blanket and interactive risk factor learner (MBIL) to identify minimally sufficient predictor subsets. These were then used to train optimized DNM classifiers, with hyperparameters tuned via grid search and benchmarked against models from 10 traditional ML methods and models trained using all predictors. Results: Our best models achieved area under the curve (AUC) scores of 0.79, 0.83, and 0.89 for 5-, 10-, and 15-year predictions, respectively, substantially outperforming baselines. MBIL reduced input dimensionality by over 80% without sacrificing accuracy. Importantly, MBIL-selected features (e.g., nodal status, hormone receptor expression, tumor size) overlapped strongly with top SHAP contributors, reinforcing interpretability. Calibration plots further demonstrated close agreement between predicted probabilities and observed recurrence rates. The percentage performance improvement due to grid search ranged from 25.3% to 60%. Conclusions: This study demonstrates that combining causal selection, deep learning, and grid search improves prediction accuracy, transparency, and calibration for long-horizon breast cancer recurrence risk. The proposed framework is well-positioned for clinical use, especially to guide long-term follow-up and therapy decisions in early-stage patients.

(This article belongs to the Special Issue AI-Based Applications in Cancers)

22 pages, 1724 KiB

Open Access Article

Development and Clinical Interpretation of an Explainable AI Model for Predicting Patient Pathways in the Emergency Department: A Retrospective Study

by Émilien Arnaud, Pedro Antonio Moreno-Sanchez, Mahmoud Elbattah, Christine Ammirati, Mark van Gils, Gilles Dequen and Daniel Aiham Ghazali

Appl. Sci. 2025, 15(15), 8449; https://doi.org/10.3390/app15158449 - 30 Jul 2025

Viewed by 488

Abstract

Background: Overcrowded emergency departments (EDs) create significant challenges for patient management and hospital efficiency. In response, Amiens Picardy University Hospital (APUH) developed the “Prediction of the Patient Pathway in the Emergency Department” (3P-U) model to enhance patient flow management. Objectives: To develop and clinically validate an explainable artificial intelligence (XAI) model for hospital admission predictions, using structured triage data, and demonstrate its real-world applicability in the ED setting. Methods: Our retrospective, single-center study involved 351,019 patients consulting in APUH’s EDs between 2015 and 2018. Various models (including a cross-validation artificial neural network (ANN), a k-nearest neighbors (KNN) model, a logistic regression (LR) model, and a random forest (RF) model) were trained and assessed for performance with regard to the area under the receiver operating characteristic curve (AUROC). The best model was validated internally with a test set, and the F1 score was used to determine the best threshold for recall, precision, and accuracy. XAI techniques, such as Shapley additive explanations (SHAP) and partial dependence plots (PDP) were employed, and the clinical explanations were evaluated by emergency physicians. Results: The ANN gave the best performance during the training stage, with an AUROC of 83.1% (SD: 0.2%) for the test set; it surpassed the RF (AUROC: 71.6%, SD: 0.1%), KNN (AUROC: 67.2%, SD: 0.2%), and LR (AUROC: 71.5%, SD: 0.2%) models. In an internal validation, the ANN’s AUROC was 83.2%. The best F1 score (0.67) determined that 0.35 was the optimal threshold; the corresponding recall, precision, and accuracy were 75.7%, 59.7%, and 75.3%, respectively. 
The SHAP and PDP XAI techniques (as assessed by emergency physicians) highlighted patient age, heart rate, and presentation with multiple injuries as the features that most specifically influenced the admission from the ED to a hospital ward. These insights are being used in bed allocation and patient prioritization, directly improving ED operations. Conclusions: The 3P-U model demonstrates practical utility by reducing ED crowding and enhancing decision-making processes at APUH. Its transparency and physician validation foster trust, facilitating its adoption in clinical practice and offering a replicable framework for other hospitals to optimize patient flow.
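
The abstract reports that the decision threshold was chosen by maximizing the F1 score, with recall, precision, and accuracy then read off at that threshold. A minimal sketch of such a search follows. The labels, scores, and candidate thresholds are toy values invented for illustration and do not reproduce the study's data or its reported 0.35 threshold.

```python
def confusion_counts(y_true, scores, threshold):
    """Return (tp, fp, fn, tn) when scores >= threshold are predicted positive."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def threshold_report(y_true, scores, threshold):
    """Precision, recall, F1 and accuracy at one threshold."""
    tp, fp, fn, tn = confusion_counts(y_true, scores, threshold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    accuracy = (tp + tn) / len(y_true)
    return {"threshold": threshold, "precision": precision,
            "recall": recall, "f1": f1, "accuracy": accuracy}


def best_f1_threshold(y_true, scores, thresholds):
    """Report for the candidate threshold with the highest F1 score."""
    reports = [threshold_report(y_true, scores, t) for t in thresholds]
    return max(reports, key=lambda r: r["f1"])


# Toy data: 1 = admitted, 0 = discharged (hypothetical).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.35, 0.3, 0.1, 0.6, 0.7]
best = best_f1_threshold(y_true, scores, [0.2, 0.4, 0.5, 0.65])
```

Picking the threshold on F1 trades precision against recall explicitly, which matters when admissions are the minority class.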

(This article belongs to the Special Issue Unlocking Scientific Insights: Data Mining, Large Models, and AI-Driven Discovery)

14 pages, 1209 KiB

Open Access Article

Investigation of Growth Differentiation Factor 15 as a Prognostic Biomarker for Major Adverse Limb Events in Peripheral Artery Disease

by Ben Li, Farah Shaikh, Houssam Younes, Batool Abuhalimeh, Abdelrahman Zamzam, Rawand Abdin and Mohammad Qadura

J. Clin. Med. 2025, 14(15), 5239; https://doi.org/10.3390/jcm14155239 - 24 Jul 2025

Viewed by 336

Abstract

Background/Objectives: Peripheral artery disease (PAD) impacts more than 200 million individuals globally and leads to mortality and morbidity secondary to progressive limb dysfunction and amputation. However, clinical management of PAD remains suboptimal, in part because of the lack of standardized biomarkers to predict patient outcomes. Growth differentiation factor 15 (GDF15) is a stress-responsive cytokine that has been studied extensively in cardiovascular disease, but its investigation in PAD remains limited. This study aimed to use explainable statistical and machine learning methods to assess the prognostic value of GDF15 for limb outcomes in patients with PAD. Methods: This prognostic investigation was carried out using a prospectively enrolled cohort comprising 454 patients diagnosed with PAD. At baseline, plasma GDF15 levels were measured using a validated multiplex immunoassay. Participants were monitored over a two-year period to assess the occurrence of major adverse limb events (MALE), a composite outcome encompassing major lower extremity amputation, need for open/endovascular revascularization, or acute limb ischemia. An Extreme Gradient Boosting (XGBoost) model was trained to predict 2-year MALE using 10-fold cross-validation, incorporating GDF15 levels along with baseline variables. Model performance was primarily evaluated using the area under the receiver operating characteristic curve (AUROC). Secondary model evaluation metrics were accuracy, sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV). Prediction histogram plots were generated to assess the ability of the model to discriminate between patients who develop vs. do not develop 2-year MALE. For model interpretability, SHapley Additive exPlanations (SHAP) analysis was performed to evaluate the relative contribution of each predictor to model outputs. Results: The mean age of the cohort was 71 (SD 10) years, with 31% (n = 139) being female. 
Over the two-year follow-up period, 157 patients (34.6%) experienced MALE. The XGBoost model incorporating plasma GDF15 levels and demographic/clinical features achieved excellent performance for predicting 2-year MALE in PAD patients: AUROC 0.84, accuracy 83.5%, sensitivity 83.6%, specificity 83.7%, PPV 87.3%, and NPV 86.2%. The prediction probability histogram for the XGBoost model demonstrated clear separation for patients who developed vs. did not develop 2-year MALE, indicating strong discrimination ability. SHAP analysis showed that GDF15 was the strongest predictive feature for 2-year MALE, followed by age, smoking status, and other cardiovascular comorbidities, highlighting its clinical relevance. Conclusions: Using explainable statistical and machine learning methods, we demonstrated that plasma GDF15 levels have important prognostic value for 2-year MALE in patients with PAD. By integrating clinical variables with GDF15 levels, our machine learning model can support early identification of PAD patients at elevated risk for adverse limb events, facilitating timely referral to vascular specialists and aiding in decisions regarding the aggressiveness of medical/surgical treatment. This precision medicine approach based on a biomarker-guided prognostication algorithm offers a promising strategy for improving limb outcomes in individuals with PAD.

(This article belongs to the Special Issue The Role of Biomarkers in Cardiovascular Diseases)

38 pages, 5575 KiB

Open Access Article

Explainable Data Mining Framework of Identifying Root Causes of Rocket Engine Anomalies Based on Knowledge and Physics-Informed Feature Selection

by Xiaopu Zhang, Wubing Miao and Guodong Liu

Machines 2025, 13(8), 640; https://doi.org/10.3390/machines13080640 - 23 Jul 2025

Viewed by 350

Abstract

Liquid rocket engines occasionally experience abnormal phenomena with unclear mechanisms, causing difficulty in design improvements. To address the above issue, a data mining method that combines ante hoc explainability, post hoc explainability, and prediction accuracy is proposed. For ante hoc explainability, a feature selection method driven by data, models, and domain knowledge is established. Global sensitivity analysis of a physical model combined with expert knowledge and data correlation is utilized to establish the correlations between different types of parameters. Then a two-stage optimization approach is proposed to obtain the best feature subset and train the prediction model. For the post hoc explainability, the partial dependence plot (PDP) and SHapley Additive exPlanations (SHAP) analysis are used to discover complex patterns between input features and the dependent variable. The effectiveness of the hybrid feature selection method and its applicability under different noise combinations are validated using synthesized data from a high-fidelity simulation model of a pressurization system. Then the analysis of the causes of a large vibration phenomenon in an active engine shows that the prediction model has good accuracy, and the feature selection results have a clear mechanism and align with domain knowledge, providing both accuracy and interpretability. The proposed method shows significant potential for data mining in complex aerospace products.

(This article belongs to the Special Issue Physical-Informed Fault Monitoring and Fault-Tolerant Control of Industrial System)

18 pages, 8113 KiB

Open Access Article

An Interpretable Machine Learning Model Based on Inflammatory–Nutritional Biomarkers for Predicting Metachronous Liver Metastases After Colorectal Cancer Surgery

by Hao Zhu, Danyang Shen, Xiaojie Gan and Ding Sun

Biomedicines 2025, 13(7), 1706; https://doi.org/10.3390/biomedicines13071706 - 12 Jul 2025

Viewed by 512

Abstract

Objective: Tumor progression is regulated by systemic immune status, nutritional metabolism, and the inflammatory microenvironment. This study aims to investigate inflammatory–nutritional biomarkers associated with metachronous liver metastasis (MLM) in colorectal cancer (CRC) and develop a machine learning model for accurate prediction. Methods: This study enrolled 680 patients with CRC who underwent curative resection, randomly allocated into a training set (n = 477) and a validation set (n = 203) in a 7:3 ratio. Feature selection was performed using Boruta and Lasso algorithms, identifying nine core prognostic factors through variable intersection. Seven machine learning (ML) models were constructed using the training set, with the optimal predictive model selected based on comprehensive evaluation metrics. An interactive visualization tool was developed to interpret the dynamic impact of key features on individual predictions. The partial dependence plots (PDPs) revealed a potential dose–response relationship between inflammatory–nutritional markers and MLM risk. Results: Among 680 patients with CRC, the cumulative incidence of MLM at 6 months postoperatively was 39.1%. Multimodal feature selection identified nine key predictors, including the N stage, vascular invasion, carcinoembryonic antigen (CEA), systemic immune–inflammation index (SII), albumin–bilirubin index (ALBI), differentiation grade, prognostic nutritional index (PNI), fatty liver, and T stage. The gradient boosting machine (GBM) demonstrated the best overall performance (AUROC: 0.916, sensitivity: 0.772, specificity: 0.871). The generalized additive model (GAM)-fitted SHAP analysis established, for the first time, risk thresholds for four continuous variables (CEA > 8.14 μg/L, PNI < 44.46, SII > 856.36, ALBI > −2.67), confirming their significant association with MLM development. 
Conclusions: This study developed a GBM model incorporating inflammatory-nutritional biomarkers and clinical features to accurately predict MLM in colorectal cancer. Integrated with dynamic visualization tools, the model enables real-time risk stratification via a freely accessible web calculator, guiding individualized surveillance planning and optimizing clinical decision-making for precision postoperative care.

(This article belongs to the Special Issue Advances in Hepatology)

26 pages, 6730 KiB

Open Access Article

Construction and Application of Carbon Emissions Estimation Model for China Based on Gradient Boosting Algorithm

by Dongjie Guan, Yitong Shi, Lilei Zhou, Xusen Zhu, Demei Zhao, Guochuan Peng and Xiujuan He

Remote Sens. 2025, 17(14), 2383; https://doi.org/10.3390/rs17142383 - 10 Jul 2025

Viewed by 399

Abstract

Accurate forecasting of carbon emissions at the county level is critical to support China’s dual-carbon goals. However, most current studies are limited to national or provincial scales, employing traditional statistical methods inadequate for capturing complex nonlinear interactions and spatiotemporal dynamics at finer resolutions. To overcome these limitations, this study develops and validates a high-resolution predictive model using three advanced gradient boosting algorithms, namely Gradient Boosting Decision Tree (GBDT), Extreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM), based on socioeconomic, industrial, and environmental data from 2732 Chinese counties during 2008–2017. Key variables were selected through correlation analysis, missing values were interpolated using K-means clustering, and model parameters were systematically optimized via grid search and cross-validation. Among the algorithms tested, LightGBM achieved the best performance (R² = 0.992, RMSE = 0.297), demonstrating both robustness and efficiency. Spatial–temporal analyses revealed that while national emissions are slowing, the eastern region is approaching stabilization, whereas emissions in central and western regions are projected to continue rising through 2027. Furthermore, SHapley Additive exPlanations (SHAP) were applied to interpret the marginal and interaction effects of key variables. The results indicate that GDP, energy intensity, and nighttime lights exert the greatest influence on model predictions, while ecological indicators such as NDVI exhibit negative associations. SHAP dependence plots further reveal nonlinear relationships and regional heterogeneity among factors. The key innovation of this study lies in constructing a scalable and interpretable county-level carbon emissions model that integrates gradient boosting with SHAP-based variable attribution, overcoming limitations in spatial resolution and model transparency.
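
SHAP attributions of the kind reported above rest on Shapley values from cooperative game theory. For a handful of features they can be computed exactly by enumerating coalitions, which the sketch below does. The toy model, its coefficients, and the all-zero baseline are hypothetical, and production SHAP tooling for tree ensembles uses faster specialized algorithms rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values for a coalition value function over `features`."""
    n = len(features)
    phi = {}
    for feature in features:
        others = [f for f in features if f != feature]
        total = 0.0
        for size in range(n):  # coalition sizes 0 .. n-1 drawn from `others`
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                gain = value_fn(coalition | {feature}) - value_fn(coalition)
                total += weight * gain
        phi[feature] = total
    return phi


# Hypothetical instance and baseline; features outside the coalition
# are replaced by their baseline value.
instance = {"gdp": 1.0, "energy_intensity": 1.0, "ndvi": 1.0}
baseline = {"gdp": 0.0, "energy_intensity": 0.0, "ndvi": 0.0}


def toy_model(x):
    """Hypothetical emissions model with one interaction term."""
    return (3.0 * x["gdp"] + 2.0 * x["energy_intensity"] - 1.0 * x["ndvi"]
            + x["gdp"] * x["energy_intensity"])


def coalition_value(coalition):
    x = {f: (instance[f] if f in coalition else baseline[f]) for f in instance}
    return toy_model(x)


phi = shapley_values(coalition_value, list(instance))
```

The interaction term is split equally between the two features involved, the negative coefficient yields a negative attribution, and the attributions sum to the gap between the instance prediction and the baseline prediction.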

29 pages, 12455 KiB

Open Access Article

Beyond Linearity: Uncovering the Complex Spatiotemporal Drivers of New-Type Urbanization and Eco-Environmental Resilience Coupling in China’s Chengdu–Chongqing Economic Circle with Machine Learning

by Caoxin Chen, Shiyi Wang, Meixi Liu, Ke Huang, Qiuyi Guo, Wei Xie and Jiangjun Wan

Land 2025, 14(7), 1424; https://doi.org/10.3390/land14071424 - 7 Jul 2025

Viewed by 347

Abstract

Rapid urbanization worldwide has led to ecological challenges, undermining eco-environmental resilience (EER). Understanding the coupling coordination between new-type urbanization (NTU) and EER is critical for achieving sustainable urban development. This study investigates the Chengdu–Chongqing Economic Circle using the coupling coordination degree (CCD) model to evaluate NTU-EER coordination levels and their spatiotemporal evolution. A random forest (RF) model, interpreted with Shapley Additive exPlanations (SHAP) and Partial Dependence Plot (PDP) algorithms, explores nonlinear driving mechanisms, while Geographically and Temporally Weighted Regression (GTWR) assesses drivers’ spatiotemporal heterogeneity. The results reveal the following: (1) NTU and EER levels steadily improved from 2004 to 2022, although coordination between cities still requires enhancement; (2) CCD exhibited a temporal pattern of “progressive escalation and continuous optimization,” and a spatial pattern of “dual-core leadership and regional diffusion,” with most cities shifting from NTU-lagged to synchronized development; (3) environmental regulations (MAR) and fixed asset investment (FIX) emerged as the most influential CCD drivers, and significant nonlinear interactions were observed, particularly those involving population size (HUM); (4) CCD drivers exhibited complex spatiotemporal heterogeneity, characterized by “stage dominance—marginal variation—spatial mismatch.” These findings enrich existing research and offer policy insights to enhance coordinated development in the Chengdu–Chongqing Economic Circle.
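
The abstract does not state which form of the coupling coordination degree model was used. A widely used two-system form computes a coupling degree C = 2·sqrt(U1·U2)/(U1 + U2), a comprehensive index T = a·U1 + b·U2, and the coordination degree D = sqrt(C·T). The sketch below implements that form with equal weights a = b = 0.5, which is an assumption, as are the function name and the sample index values.

```python
from math import sqrt


def coupling_coordination(u1, u2, weight1=0.5, weight2=0.5):
    """Two-system CCD: returns (C, T, D) for subsystem indices in [0, 1]."""
    if u1 + u2 == 0:
        return 0.0, 0.0, 0.0  # both subsystems at zero: no coupling defined
    coupling = 2 * sqrt(u1 * u2) / (u1 + u2)       # C: how balanced the two are
    comprehensive = weight1 * u1 + weight2 * u2    # T: overall development level
    coordination = sqrt(coupling * comprehensive)  # D: coupling coordination degree
    return coupling, comprehensive, coordination


# Hypothetical NTU and EER index values for two cities.
balanced = coupling_coordination(0.64, 0.64)
lopsided = coupling_coordination(0.9, 0.1)
```

In this form, two cities with the same average level can differ sharply in D: the balanced pair scores higher than the lopsided pair because C penalizes the gap between the subsystems.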

19 pages, 2863 KiB

Open AccessArticle

Analysis of Weak Links in the Mechanized Mining of Underground Metal Mines: Insights from Machine Learning and SHAP Explainability Models

by Chengye Yang, Keping Zhou and Jielin Li

Appl. Sci. 2025, 15(13), 7391; https://doi.org/10.3390/app15137391 - 1 Jul 2025

Viewed by 294

Abstract

In the mechanized mining of metal mines, identifying and optimizing vulnerabilities within the production system is essential for enhancing operational efficiency and ensuring sustainable development. By leveraging data from 88 stopes at Guangxi Tongkeng Mine over a decade, we constructed a comprehensive dataset encompassing drilling, charging, blasting, ventilation, support, ore drawing, and maintenance. The XGBoost algorithm was employed to model factors influencing stope production capacity (PC), with its parameters optimized using the Marine Predator Algorithm (MPA). The MPA–XGBoost model demonstrates a high predictive accuracy for PC (R² = 0.958, VAF = 95.981%, MAE = 4.844, RMSE = 7.033). A Shapley Additive Explanations (SHAP) analysis reveals that drilling efficiency (DE) contributes most positively (35.6%), while ventilation time (VT) and equipment maintenance time (EMT) negatively impact PC. SHAP dependence plots indicate that increasing DE significantly enhances PC, whereas excessive VT or EMT leads to a substantial decline in PC. These findings offer valuable insights and a robust foundation for optimizing design and improving production management in mechanized mining operations. Full article

(This article belongs to the Special Issue Rock Mechanics in Geotechnical and Tunnel Engineering)

28 pages, 56125 KiB

Open AccessArticle

Capturing Built Environment and Automated External Defibrillator Resource Interplay in Tianjin Downtown

by Sara Grigoryan, Yike Hu and Nadeem Ullah

ISPRS Int. J. Geo-Inf. 2025, 14(7), 255; https://doi.org/10.3390/ijgi14070255 - 30 Jun 2025

Viewed by 450

Abstract

Automated external defibrillator resources (AEDRs) are the crux of out-of-hospital cardiac arrest (OHCA) responses, enhancing safe and sustainable urban environments. However, existing studies have failed to consider the nexus between built environment (BE) features and AEDRs. Can explainable machine-learning (ML) methods reveal the BE-AEDR nexus? This study applied an Optuna-based extreme gradient boosting (OP_XGBoost) decision tree model with SHapley Additive exPlanations (SHAP) and partial dependence plots (PDPs), aiming to scrutinize the spatial effects, relative importance, and non-linear impact of BE features on AEDR intensity across grid and block urban patterns in Tianjin Downtown, China. The results indicated that (1) marginally, the AEDR intensity was most influenced by the service coverage (SC) at grid scale and nearby public service facility density (NPSF_D) at block scale, while synergistically, it was shaped by comprehensive accessibility and land-use interactions with the prioritized block pattern; (2) block-level granularity and (3) non-linear interdependencies between BE features and AEDR intensity existed as game-changers. The findings suggested an effective and generalizable approach to capture the complex interplay of the BE-AEDR and boost the AED deployment by setting health at the heart of the urban development framework. Full article

(This article belongs to the Special Issue HealthScape: Intersections of Health, Environment, and GIS&T)

23 pages, 7504 KiB

Open AccessArticle

Development and Validation of the Early Gastric Carcinoma Prediction Model in Post-Eradication Patients with Intestinal Metaplasia

by Wulian Lin, Guanpo Zhang, Hong Chen, Weidong Huang, Guilin Xu, Yunmeng Zheng, Chao Gao, Jin Zheng, Dazhou Li and Wen Wang

Cancers 2025, 17(13), 2158; https://doi.org/10.3390/cancers17132158 - 26 Jun 2025

Viewed by 412

Abstract

Background: Gastric cancer (GC) remains a major global health challenge, with rising incidence among patients post-Helicobacter pylori (H. pylori) eradication, particularly those with persistent intestinal metaplasia (IM). Current risk stratification tools are limited in this high-risk population. Aim: To develop, validate, and externally test a machine learning-based prediction model—termed the Early Gastric Cancer Model (EGCM)—for identifying early gastric cancer (EGC) risk in H. pylori-eradicated patients with IM, and to implement it as a web-based clinical tool. Methods: This retrospective, dual-center study enrolled 214 H. pylori-eradicated patients with histologically confirmed IM from 900 Hospital and Fujian Provincial People’s Hospital. The dataset was split into a training cohort (70%) and an internal validation cohort (30%), with an external test cohort from the second center. A total of 21 machine learning algorithms were screened using cross-validation and hyperparameter optimization. Boruta and SHAP analyses were employed for feature selection, and the final EGCM was constructed using the top five predictors: atrophy range, xanthoma, map-like redness (MLR), MLR range, and age. Model performance was evaluated via ROC curves, precision–recall curves, calibration plots, and decision curve analysis (DCA), and compared against conventional inflammatory biomarkers such as NLR and PLR. Results: The CatBoost algorithm demonstrated the best overall performance, achieving an AUC of 0.743 (95% CI: 0.70–0.80) in internal validation and 0.905 in the external test set. The EGCM exhibited superior discrimination compared to individual inflammatory markers (p < 0.01). Calibration analysis confirmed strong agreement between predicted and observed outcomes. DCA showed the EGCM yielded greater net clinical benefit. A web calculator was developed to facilitate clinical application. 
Conclusions: The EGCM is a validated, interpretable, and practical tool for stratifying EGC risk in H. pylori-eradicated IM patients across multiple centers. Its integration into clinical practice could improve surveillance precision and early cancer detection. Full article

(This article belongs to the Section Cancer Causes, Screening and Diagnosis)

28 pages, 3141 KiB

Open AccessArticle

Investigating the Factors Influencing Household Financial Vulnerability in China: An Exploration Based on the Shapley Additive Explanations Approach

by Xi Chen, Guowan Hu and Huwei Wen

Sustainability 2025, 17(12), 5523; https://doi.org/10.3390/su17125523 - 16 Jun 2025

Viewed by 614

Abstract

The increasingly observable financial vulnerability of households in emerging market countries makes it imperative to investigate the factors influencing it. Considering that China stands as a representative of emerging market economies, analyzing the factors influencing household financial vulnerability in China offers a valuable reference for the sustainable development of households in emerging market countries. Using data from the China Household Finance Survey (CHFS) household samples, this paper presents the regional distribution of households with financial vulnerability in China. Utilizing machine learning (ML), this research examines the factors that influence household financial vulnerability in China and determines the most significant ones. The results reveal that the proportion of households with financial vulnerability in China exceeds 63%, and household financial vulnerability is lower in economically developed coastal regions than in medium and small-sized cities in the central and western parts of China. The analysis results of the SHAP method show that the debt leverage ratio of a household is the most significant feature variable in predicting financial vulnerability. The ALE plots demonstrate that, in a household, the debt leverage ratio, the age of the household head, health condition, economic development, and literacy level are significantly nonlinearly related to financial vulnerability. Heterogeneity analysis reveals that, except for household debt leverage and insurance participation, the key characteristic variables exerting the most pronounced effect on financial fragility differ between urban and rural households: household head age for urban families and physical health status for rural families. Furthermore, digital financial inclusion and social security exert distinct impacts on financial vulnerability, showing significantly stronger effects in high per capita GDP regions and low per capita GDP regions, respectively. These findings offer valuable insights for policymakers in emerging economies to formulate targeted financial risk mitigation strategies (such as developing household debt relief and prevention mechanisms and strengthening rural health security systems) and optimize policies for household financial health. Full article

(This article belongs to the Section Health, Well-Being and Sustainability)

Search Results (97)
