MDPI - Publisher of Open Access Journals

22 pages, 7977 KiB

Open Access Article

Predicting De-Handing Point in Bananas Using Crown Morphology and Interpretable Machine Learning

by Lei Zhao, Zhou Yang, Chunxia Wang, Mohui Jin and Jieli Duan

Agronomy 2025, 15(8), 1880; https://doi.org/10.3390/agronomy15081880 (registering DOI) - 3 Aug 2025

Banana de-handing is a critical yet labor-intensive step in postharvest processing, with current manual methods resulting in high costs and occupational risks. This study addresses the automation of de-handing point localization by integrating high-resolution 3D scanning and morphometric analysis of banana crowns with machine learning techniques. A total of 210 crown samples were analyzed to extract key morphological features, including inner arc length (L_i), inner arc radius (R_i), outer arc radius (R_o), and the distance between inner and outer arcs (D_oi), among others. Four machine learning algorithms, namely, Multi-Layer Perceptron (MLP), Gradient Boosted Decision Trees (GBDT), Extreme Gradient Boosting (XGBoost), and Random Forest (RF), were trained to predict the target radius (R_t) and target distance (D_ti) of the de-handing point. The RF models achieved the best predictive performance on the testing set: for R_t, R² = 0.95, MAE = 1.50, and RMSE = 1.94; for D_ti, R² = 0.91, MAE = 1.33, and RMSE = 1.66. A Shapley Additive Explanations (SHAP) analysis revealed that L_i, R_i, and R_o were the most influential features for R_t, while D_oi was the most important for D_ti. Notably, feature threshold effects were observed, with limited gains in prediction accuracy beyond specific morphological values. These results provide a quantitative foundation for vision-guided automated de-handing systems, advancing intelligent and efficient banana postharvest management.

(This article belongs to the Section Precision and Digital Agriculture)
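
The abstract above reports R², MAE, and RMSE for the de-handing predictions. As a reading aid, the following minimal sketch shows how these three metrics are conventionally computed. The function name and the sample values are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, MAE, RMSE) for paired observations and predictions."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares define R^2.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(ss_res / n)
    return r2, mae, rmse


# Hypothetical target-radius values, for illustration only.
observed = [40.0, 42.0, 44.0, 46.0, 48.0]
predicted = [41.0, 41.0, 45.0, 45.0, 48.0]
r2, mae, rmse = regression_metrics(observed, predicted)
print(round(r2, 3), round(mae, 3), round(rmse, 3))
```

RMSE penalizes large errors more heavily than MAE, which is why the two are usually reported together.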

23 pages, 3427 KiB

Open Access Article

Visual Narratives and Digital Engagement: Decoding Seoul and Tokyo’s Tourism Identity Through Instagram Analytics

by Seung Chul Yoo and Seung Mi Kang

Tour. Hosp. 2025, 6(3), 149; https://doi.org/10.3390/tourhosp6030149 (registering DOI) - 1 Aug 2025

Viewed by 101

Abstract

Social media platforms like Instagram significantly shape destination images and influence tourist behavior. Understanding how different cities are represented and perceived on these platforms is crucial for effective tourism marketing. This study provides a comparative analysis of Instagram content and engagement patterns in Seoul and Tokyo, two major Asian metropolises, to derive actionable marketing insights. We collected and analyzed 59,944 public Instagram posts geotagged or location-tagged within Seoul (n = 29,985) and Tokyo (n = 29,959). We employed a mixed-methods approach involving content categorization using a fine-tuned convolutional neural network (CNN) model, engagement metric analysis (likes, comments), Valence Aware Dictionary and sEntiment Reasoner (VADER) sentiment analysis and thematic classification of comments, geospatial analysis (Kernel Density Estimation [KDE], Moran’s I), and predictive modeling (Gradient Boosting with SHapley Additive exPlanations [SHAP] value analysis). A validation analysis using balanced samples (n = 2000 each) was conducted to address Tokyo’s lower geotagged data proportion. While both cities showed ‘Person’ as the dominant content category, notable differences emerged. Tokyo exhibited higher like-based engagement across categories, particularly for ‘Animal’ and ‘Food’ content, while Seoul generated slightly more comments, often expressing stronger sentiment. Qualitative comment analysis revealed Seoul comments focused more on emotional reactions, whereas Tokyo comments were often shorter, appreciative remarks. Geospatial analysis identified distinct hotspots. The validation analysis confirmed these spatial patterns despite Tokyo’s data limitations. Predictive modeling highlighted hashtag counts as the key engagement driver in Seoul and the presence of people in Tokyo. Seoul and Tokyo project distinct visual narratives and elicit different engagement patterns on Instagram. 
These findings offer practical implications for destination marketers, suggesting tailored content strategies and location-based campaigns targeting identified hotspots and specific content themes. This study underscores the value of integrating quantitative and qualitative analyses of social media data for nuanced destination marketing insights.

(This article belongs to the Special Issue Data-Driven Insights in Tourism and Hospitality: Smart Technologies and Data Science)
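
The geospatial analysis in the abstract above names Moran's I as one of its tools. The sketch below computes the global Moran's I statistic for a small set of locations, assuming a simple binary adjacency matrix. The weights and values are hypothetical, and the study's actual weighting scheme is not stated in the abstract.

```python
def morans_i(values, weights):
    """Global Moran's I for values at n locations with an n x n weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations, weighted by spatial proximity.
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * num / den


# Four locations on a line; each is adjacent to its immediate neighbours.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 2, 3, 4], chain)    # similar values sit side by side
alternating = morans_i([1, 4, 1, 4], chain)  # neighbours differ sharply
print(clustered, alternating)
```

Positive values indicate that similar values cluster in space, and negative values indicate that neighbours tend to differ.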

22 pages, 2120 KiB

Open Access Article

Machine Learning Algorithms and Explainable Artificial Intelligence for Property Valuation

by Gabriella Maselli and Antonio Nesticò

Real Estate 2025, 2(3), 12; https://doi.org/10.3390/realestate2030012 - 1 Aug 2025

Viewed by 70

Abstract

The accurate estimation of urban property values is a key challenge for appraisers, market participants, financial institutions, and urban planners. In recent years, machine learning (ML) techniques have emerged as promising tools for price forecasting due to their ability to model complex relationships among variables. However, their application raises two critical issues: (i) the risk of overfitting, especially with small or noisy datasets; (ii) the interpretive issues associated with the “black box” nature of many models. Within this framework, this paper proposes a methodological approach that addresses both issues, comparing the predictive performance of three ML algorithms, namely k-Nearest Neighbors (kNN), Random Forest (RF), and an Artificial Neural Network (ANN), applied to the housing market in the city of Salerno, Italy. For each model, overfitting is preliminarily assessed to ensure predictive robustness. Subsequently, the results are interpreted using explainability techniques, such as SHapley Additive exPlanations (SHAP) and Permutation Feature Importance (PFI). This analysis reveals that the Random Forest offers the best balance between predictive accuracy and transparency, with features such as area and proximity to the train station identified as the main drivers of property prices. kNN and the ANN are viable alternatives that are particularly robust in terms of generalization. The results demonstrate how the defined methodological framework successfully balances predictive effectiveness and interpretability, supporting the informed and transparent use of ML in real estate valuation.

(This article belongs to the Topic Improving Nature-Smart Policies through Innovative Resilient Evaluations)
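
Permutation Feature Importance, mentioned in the abstract above, measures how much a model's error grows when one feature's values are shuffled. The sketch below illustrates the idea with a toy pricing function. The model, the feature layout (area, then an unrelated value), and the data are hypothetical and do not reproduce the paper's setup.

```python
import random


def mse(model, rows, targets):
    """Mean squared error of a callable model over rows of features."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, n_features, repeats=5, seed=0):
    """Average increase in MSE after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = mse(model, rows, targets)
    importances = []
    for j in range(n_features):
        increases = []
        for _ in range(repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mse(model, shuffled, targets) - baseline)
        importances.append(sum(increases) / repeats)
    return importances


# Toy model: price depends on area (feature 0) and ignores feature 1.
def toy_model(row):
    return 2.0 * row[0]


rows = [[50.0, 3.0], [60.0, 1.0], [70.0, 4.0], [80.0, 1.0],
        [90.0, 5.0], [100.0, 9.0], [110.0, 2.0], [120.0, 6.0]]
targets = [2.0 * r[0] for r in rows]
importance = permutation_importance(toy_model, rows, targets, n_features=2)
print(importance)
```

A feature the model never uses receives an importance of zero, because shuffling it cannot change any prediction.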

32 pages, 17155 KiB

Open Access Article

Machine Learning Ensemble Methods for Co-Seismic Landslide Susceptibility: Insights from the 2015 Nepal Earthquake

by Tulasi Ram Bhattarai and Netra Prakash Bhandary

Appl. Sci. 2025, 15(15), 8477; https://doi.org/10.3390/app15158477 (registering DOI) - 30 Jul 2025

Viewed by 178

Abstract

The Mw 7.8 Gorkha Earthquake of 25 April 2015 triggered over 25,000 landslides across central Nepal, with 4775 events concentrated in Gorkha District alone. Despite substantial advances in landslide susceptibility mapping, existing studies often overlook the compound role of post-seismic rainfall and lack robust spatial validation. To address this gap, we validated an ensemble machine learning framework for co-seismic landslide susceptibility modeling by integrating seismic, geomorphological, hydrological, and anthropogenic variables, including cumulative post-seismic rainfall. Using a balanced dataset of 4775 landslide and non-landslide instances, we evaluated the performance of Logistic Regression (LR), Random Forest (RF), and eXtreme Gradient Boosting (XGBoost) models through spatial cross-validation, SHapley Additive exPlanations (SHAP) explainability, and ablation analysis. The RF model outperformed all others, achieving an accuracy of 87.9% and a Receiver Operating Characteristic (ROC) Area Under the Curve (AUC) value of 0.94, while XGBoost closely followed (AUC = 0.93). Ensemble models collectively classified over 95% of observed landslides into High and Very High susceptibility zones, demonstrating strong spatial reliability. SHAP analysis identified elevation, proximity to fault, peak ground acceleration (PGA), slope, and rainfall as dominant predictors. Notably, the inclusion of post-seismic rainfall substantially improved recall and F1 scores in ablation experiments. Spatial cross-validation revealed the superior generalizability of ensemble models under heterogeneous terrain conditions. The findings underscore the value of integrating post-seismic hydrometeorological factors and spatial validation into susceptibility assessments. We recommend adopting ensemble models, particularly RF, for operational hazard mapping in earthquake-prone mountainous regions. 
Future research should explore the integration of dynamic rainfall thresholds and physics-informed frameworks to enhance early warning systems and climate resilience.

(This article belongs to the Section Earth Sciences)
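
Spatial cross-validation, as used in the abstract above, keeps nearby samples in the same fold so that a model is tested on terrain it has not seen. The abstract does not say how folds were formed, so the sketch below assumes one common option, a regular grid of blocks. The function name, block size, and coordinates are hypothetical.

```python
def spatial_block_folds(coords, block_size, n_folds):
    """Assign each (x, y) sample to a fold via the grid block that contains it."""
    def block_of(point):
        x, y = point
        return (int(x // block_size), int(y // block_size))

    # Enumerate occupied blocks in a fixed order, then deal them out to folds.
    blocks = sorted({block_of(p) for p in coords})
    block_to_fold = {b: i % n_folds for i, b in enumerate(blocks)}
    return [block_to_fold[block_of(p)] for p in coords]


# Hypothetical sample locations in arbitrary map units.
coords = [(0.5, 0.5), (0.9, 0.1), (1.5, 0.2), (2.5, 2.5), (2.9, 2.1)]
folds = spatial_block_folds(coords, block_size=1.0, n_folds=2)
print(folds)
```

Samples that share a block always share a fold, which limits the leakage that random splits allow between spatially correlated neighbours.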

14 pages, 2727 KiB

Open Access Article

A Multimodal MRI-Based Model for Colorectal Liver Metastasis Prediction: Integrating Radiomics, Deep Learning, and Clinical Features with SHAP Interpretation

by Xin Yan, Furui Duan, Lu Chen, Runhong Wang, Kexin Li, Qiao Sun and Kuang Fu

Curr. Oncol. 2025, 32(8), 431; https://doi.org/10.3390/curroncol32080431 - 30 Jul 2025

Viewed by 104

Abstract

Purpose: Predicting colorectal cancer liver metastasis (CRLM) is essential for prognostic assessment. This study aims to develop and validate an interpretable multimodal machine learning framework based on multiparametric MRI for predicting CRLM, and to enhance the clinical interpretability of the model through SHapley Additive exPlanations (SHAP) analysis and deep learning visualization. Methods: This multicenter retrospective study included 463 patients with pathologically confirmed colorectal cancer from two institutions, divided into training (n = 256), internal testing (n = 111), and external validation (n = 96) sets. Radiomics features were extracted from manually segmented regions on axial T2-weighted imaging (T2WI) and diffusion-weighted imaging (DWI). Deep learning features were obtained from a pretrained ResNet101 network using the same MRI inputs. A least absolute shrinkage and selection operator (LASSO) logistic regression classifier was developed for clinical, radiomics, deep learning, and combined models. Model performance was evaluated by AUC, sensitivity, specificity, and F1-score. SHAP was used to assess feature contributions, and Grad-CAM was applied to visualize deep feature attention. Results: The combined model integrating features across the three modalities achieved the highest performance across all datasets, with AUCs of 0.889 (training), 0.838 (internal test), and 0.822 (external validation), outperforming single-modality models. Decision curve analysis (DCA) revealed enhanced clinical net benefit from the integrated model, while calibration curves confirmed its good predictive consistency. SHAP analysis revealed that radiomic features related to T2WI texture (e.g., LargeDependenceLowGrayLevelEmphasis) and clinical biomarkers (e.g., CA19-9) were among the most predictive for CRLM. Grad-CAM visualizations confirmed that the deep learning model focused on tumor regions consistent with radiological interpretation. 
Conclusions: This study presents a robust and interpretable multiparametric MRI-based model for noninvasively predicting liver metastasis in colorectal cancer patients. By integrating handcrafted radiomics and deep learning features, and enhancing transparency through SHAP and Grad-CAM, the model provides both high predictive performance and clinically meaningful explanations. These findings highlight its potential value as a decision-support tool for individualized risk assessment and treatment planning in the management of colorectal cancer.

(This article belongs to the Section Gastrointestinal Oncology)
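
The AUC values reported in the abstract above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes AUC directly from that pairwise definition. The labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive-negative pairs ranked correctly."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities for six patients.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; production code usually relies on a sorted-rank implementation.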

26 pages, 8762 KiB

Open Access Article

Clustered Rainfall-Induced Landslides in Jiangwan Town, Guangdong, China During April 2024: Characteristics and Controlling Factors

by Ruizeng Wei, Yunfeng Shan, Lei Wang, Dawei Peng, Ge Qu, Jiasong Qin, Guoqing He, Luzhen Fan and Weile Li

Remote Sens. 2025, 17(15), 2635; https://doi.org/10.3390/rs17152635 - 29 Jul 2025

Viewed by 189

Abstract

On 20 April 2024, an extreme rainfall event occurred in Jiangwan Town, Shaoguan City, Guangdong Province, China, where a historic 24 h precipitation of 206 mm was recorded. This triggered extensive landslides that destroyed residential buildings, severed roads, and drew significant societal attention. Rapid acquisition of landslide inventories, distribution patterns, and key controlling factors is critical for post-disaster emergency response and reconstruction. Based on high-resolution Planet satellite imagery, landslide areas in Jiangwan Town were automatically extracted using the Normalized Difference Vegetation Index (NDVI) differential method, and a detailed landslide inventory was compiled. Combined with terrain, rainfall, and geological environmental factors, the spatial distribution and causes of landslides were analyzed. Results indicate that the extreme rainfall induced 1426 landslides with a total area of 4.56 km², predominantly small-to-medium scale. Landslides exhibited pronounced clustering and linear distribution along river valleys in a NE–SW orientation. Spatial analysis revealed concentrations on slopes at 200–300 m elevation with gradients of 20–30°. Four machine learning models, namely Logistic Regression, Support Vector Machine (SVM), Random Forest (RF), and Extreme Gradient Boosting (XGBoost), were employed to assess landslide susceptibility mapping (LSM) accuracy. RF and XGBoost demonstrated superior performance, identifying high-susceptibility zones primarily on valley-side slopes in Jiangwan Town. Shapley Additive Explanations (SHAP) value analysis quantified key drivers, highlighting elevation, rainfall intensity, profile curvature, and topographic wetness index as dominant controlling factors. This study provides an effective methodology and data support for rapid rainfall-induced landslide identification and machine learning-based susceptibility assessment.

(This article belongs to the Special Issue Study on Hydrological Hazards Based on Multi-Source Remote Sensing)
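
The NDVI differential method named in the abstract above flags pixels whose vegetation index drops sharply between a pre-event and a post-event image. The sketch below shows the basic arithmetic, using NDVI = (NIR − Red) / (NIR + Red). The threshold, the reflectance values, and the function names are hypothetical, and the abstract does not state the study's actual threshold.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0


def ndvi_drop_mask(pre, post, threshold):
    """Flag pixels whose NDVI fell by more than `threshold` between two dates.

    `pre` and `post` are lists of (nir, red) reflectance pairs, one per pixel.
    """
    return [ndvi(*a) - ndvi(*b) > threshold for a, b in zip(pre, post)]


# Two hypothetical pixels: the first loses its vegetation, the second does not.
pre_event = [(0.5, 0.1), (0.5, 0.1)]
post_event = [(0.3, 0.3), (0.48, 0.1)]
mask = ndvi_drop_mask(pre_event, post_event, threshold=0.3)
print(mask)
```

In practice such a mask is usually cleaned further, for example by filtering on slope, to separate landslide scars from other causes of vegetation loss.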

20 pages, 8154 KiB

Open Access Article

Strategies for Soil Salinity Mapping Using Remote Sensing and Machine Learning in the Yellow River Delta

by Junyong Zhang, Xianghe Ge, Xuehui Hou, Lijing Han, Zhuoran Zhang, Wenjie Feng, Zihan Zhou and Xiubin Luo

Remote Sens. 2025, 17(15), 2619; https://doi.org/10.3390/rs17152619 - 28 Jul 2025

Viewed by 321

Abstract

In response to the global ecological and agricultural challenges posed by coastal saline-alkali areas, this study focuses on Dongying City as a representative region, aiming to develop a high-precision soil salinity prediction mapping method that integrates multi-source remote sensing data with machine learning techniques. Utilizing the SCORPAN model framework, we systematically combined diverse remote sensing datasets and established nine distinct strategies for soil salinity prediction. We employed four machine learning models, namely Support Vector Regression (SVR), Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Geographical Gaussian Process Regression (GGPR), for modeling, prediction, and accuracy comparison, with the objective of achieving high-precision salinity mapping under complex vegetation cover conditions. The results reveal that among the models evaluated across the nine strategies, the SVR model demonstrated the highest accuracy, followed by RF. Notably, under Strategy IX, the SVR model achieved the best predictive performance, with a coefficient of determination (R²) of 0.62 and a root mean square error (RMSE) of 0.38 g/kg. Analysis based on SHapley Additive exPlanations (SHAP) values and feature importance indicated that Vegetation Type Factors contributed significantly and consistently to the model’s performance, maintaining higher importance than traditional salinity indices and playing a dominant role. In summary, this research developed a comprehensive, high-resolution soil salinity mapping framework for the Dongying region by integrating multi-source remote sensing data and employing diverse predictive strategies alongside machine learning models.
The findings highlight the potential of Vegetation Type Factors to enhance large-scale soil salinity monitoring, providing robust scientific evidence and technical support for sustainable land resource management, agricultural optimization, ecological protection, efficient water resource utilization, and policy formulation.

(This article belongs to the Special Issue Remote Sensing of Soil Condition Assessment and Degradation Drivers Monitoring)

14 pages, 1209 KiB

Open Access Article

Investigation of Growth Differentiation Factor 15 as a Prognostic Biomarker for Major Adverse Limb Events in Peripheral Artery Disease

by Ben Li, Farah Shaikh, Houssam Younes, Batool Abuhalimeh, Abdelrahman Zamzam, Rawand Abdin and Mohammad Qadura

J. Clin. Med. 2025, 14(15), 5239; https://doi.org/10.3390/jcm14155239 - 24 Jul 2025

Viewed by 295

Abstract

Background/Objectives: Peripheral artery disease (PAD) impacts more than 200 million individuals globally and leads to mortality and morbidity secondary to progressive limb dysfunction and amputation. However, clinical management of PAD remains suboptimal, in part because of the lack of standardized biomarkers to predict patient outcomes. Growth differentiation factor 15 (GDF15) is a stress-responsive cytokine that has been studied extensively in cardiovascular disease, but its investigation in PAD remains limited. This study aimed to use explainable statistical and machine learning methods to assess the prognostic value of GDF15 for limb outcomes in patients with PAD. Methods: This prognostic investigation was carried out using a prospectively enrolled cohort comprising 454 patients diagnosed with PAD. At baseline, plasma GDF15 levels were measured using a validated multiplex immunoassay. Participants were monitored over a two-year period to assess the occurrence of major adverse limb events (MALE), a composite outcome encompassing major lower extremity amputation, need for open/endovascular revascularization, or acute limb ischemia. An Extreme Gradient Boosting (XGBoost) model was trained to predict 2-year MALE using 10-fold cross-validation, incorporating GDF15 levels along with baseline variables. Model performance was primarily evaluated using the area under the receiver operating characteristic curve (AUROC). Secondary model evaluation metrics were accuracy, sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV). Prediction histogram plots were generated to assess the ability of the model to discriminate between patients who develop vs. do not develop 2-year MALE. For model interpretability, SHapley Additive exPlanations (SHAP) analysis was performed to evaluate the relative contribution of each predictor to model outputs. Results: The mean age of the cohort was 71 (SD 10) years, with 31% (n = 139) being female. 
Over the two-year follow-up period, 157 patients (34.6%) experienced MALE. The XGBoost model incorporating plasma GDF15 levels and demographic/clinical features achieved excellent performance for predicting 2-year MALE in PAD patients: AUROC 0.84, accuracy 83.5%, sensitivity 83.6%, specificity 83.7%, PPV 87.3%, and NPV 86.2%. The prediction probability histogram for the XGBoost model demonstrated clear separation for patients who developed vs. did not develop 2-year MALE, indicating strong discrimination ability. SHAP analysis showed that GDF15 was the strongest predictive feature for 2-year MALE, followed by age, smoking status, and other cardiovascular comorbidities, highlighting its clinical relevance. Conclusions: Using explainable statistical and machine learning methods, we demonstrated that plasma GDF15 levels have important prognostic value for 2-year MALE in patients with PAD. By integrating clinical variables with GDF15 levels, our machine learning model can support early identification of PAD patients at elevated risk for adverse limb events, facilitating timely referral to vascular specialists and aiding in decisions regarding the aggressiveness of medical/surgical treatment. This precision medicine approach based on a biomarker-guided prognostication algorithm offers a promising strategy for improving limb outcomes in individuals with PAD.

(This article belongs to the Special Issue The Role of Biomarkers in Cardiovascular Diseases)
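
The secondary metrics listed in the abstract above (accuracy, sensitivity, specificity, PPV, NPV) all derive from the four cells of a confusion matrix. The sketch below shows the standard definitions. The counts are hypothetical and are not the study's data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true events that were flagged
        "specificity": tn / (tn + fp),  # share of non-events correctly cleared
        "ppv": tp / (tp + fp),          # share of flagged cases that were events
        "npv": tn / (tn + fn),          # share of cleared cases that were non-events
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts for a cohort of 150 patients.
metrics = classification_metrics(tp=40, fp=20, tn=80, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity describe the test itself, whereas PPV and NPV also depend on how common the outcome is in the cohort.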

18 pages, 1154 KiB

Open Access Article

Predicting Major Adverse Cardiovascular Events After Cardiac Surgery Using Combined Clinical, Laboratory, and Echocardiographic Parameters: A Machine Learning Approach

by Mladjan Golubovic, Velimir Peric, Marija Stosic, Vladimir Stojiljkovic, Sasa Zivic, Aleksandar Kamenov, Dragan Milic, Vesna Dinic, Dalibor Stojanovic and Milan Lazarevic

Medicina 2025, 61(8), 1323; https://doi.org/10.3390/medicina61081323 - 23 Jul 2025

Viewed by 271

Abstract

Background and Objectives: Despite significant advances in surgical techniques and perioperative care, major adverse cardiovascular events (MACE) remain a leading cause of postoperative morbidity and mortality in patients undergoing coronary artery bypass grafting and/or aortic valve replacement. Accurate preoperative risk stratification is essential yet often limited by models that overlook atrial mechanics and underutilized biomarkers. Materials and Methods: This study aimed to develop an interpretable machine learning model for predicting perioperative MACE by integrating clinical, biochemical, and echocardiographic features, with a particular focus on novel physiological markers. A retrospective cohort of 131 patients was analyzed. An Extreme Gradient Boosting (XGBoost) classifier was trained on a comprehensive feature set, and SHapley Additive exPlanations (SHAPs) were used to quantify each variable’s contribution to model predictions. Results: In a stratified 80:20 train–test split, the model initially achieved an AUC of 1.00. Acknowledging the potential for overfitting in small datasets, additional validation was performed using 10 independent random splits and 5-fold cross-validation. These analyses yielded an average AUC of 0.846 ± 0.092 and an F1-score of 0.807 ± 0.096, supporting the model’s stability and generalizability. The most influential predictors included total atrial conduction time, mitral and tricuspid annular orifice areas, and high-density lipoprotein (HDL) cholesterol. These variables, spanning electrophysiological, structural, and metabolic domains, significantly enhanced discriminative performance, even in patients with preserved left ventricular function. The model’s transparency provides clinically intuitive insights into individual risk profiles, emphasizing the significance of non-traditional parameters in perioperative assessments. 
Conclusions: This study demonstrates the feasibility and potential clinical value of combining advanced echocardiographic, biochemical, and machine learning tools for individualized cardiovascular risk prediction. While promising, these findings require prospective validation in larger, multicenter cohorts before being integrated into routine clinical decision-making.

(This article belongs to the Section Intensive Care/ Anesthesiology)

26 pages, 2219 KiB

Open Access Article

Predicting Cognitive Decline in Parkinson’s Disease Using Artificial Neural Networks: An Explainable AI Approach

by Laura Colautti, Monica Casella, Matteo Robba, Davide Marocco, Michela Ponticorvo, Paola Iannello, Alessandro Antonietti, Camillo Marra and for the CPP Integrated Parkinson’s Database

Brain Sci. 2025, 15(8), 782; https://doi.org/10.3390/brainsci15080782 - 23 Jul 2025

Viewed by 380

Abstract

Background/Objectives: The study aims to identify key cognitive and non-cognitive variables (e.g., clinical, neuroimaging, and genetic data) predicting cognitive decline in Parkinson’s disease (PD) patients using machine learning applied to a sample (N = 618) from the Parkinson’s Progression Markers Initiative database. Traditional research has mainly employed explanatory approaches to explore variable relationships, rather than maximizing predictive accuracy for future cognitive decline. In the present study, we implemented a predictive framework that integrates a broad range of baseline cognitive, clinical, genetic, and imaging data to accurately forecast changes in cognitive functioning in PD patients. Methods: An artificial neural network was trained on baseline data to predict general cognitive status three years later. Model performance was evaluated using 5-fold stratified cross-validation. We investigated model interpretability using explainable artificial intelligence techniques, including Shapley Additive Explanations (SHAP) values, Group-Wise Feature Masking, and Brute-Force Combinatorial Masking, to identify the most influential predictors of cognitive decline. Results: The model achieved a recall of 0.91 for identifying patients who developed cognitive decline, with an overall classification accuracy of 0.79. All applied explainability techniques consistently highlighted baseline MoCA scores, memory performance, the motor examination score (MDS-UPDRS Part III), and anxiety as the most predictive features. Conclusions: From a clinical perspective, the findings can support the early detection of PD patients who are more prone to developing cognitive decline, thereby helping to prevent cognitive impairments by designing specific treatments. This can improve the quality of life for patients and caregivers, supporting patient autonomy.

(This article belongs to the Section Neurodegenerative Diseases)
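
Stratified cross-validation, as used in the abstract above, keeps the class ratio roughly equal across folds, which matters when one outcome is rarer than the other. The sketch below assigns fold indices round-robin within each class. It is a simplified illustration without shuffling, and the labels are hypothetical.

```python
def stratified_folds(labels, k):
    """Assign fold indices so each class is spread evenly over k folds."""
    folds = []
    seen = {}  # how many samples of each class have been placed so far
    for label in labels:
        count = seen.get(label, 0)
        folds.append(count % k)  # deal this class's samples out in turn
        seen[label] = count + 1
    return folds


# Hypothetical outcomes: 1 = cognitive decline, 0 = stable.
labels = [1, 0, 0, 1, 0, 0, 1, 0, 0, 1]
folds = stratified_folds(labels, k=2)
print(folds)

# Count positives per fold to confirm the balance.
positives_per_fold = [
    sum(1 for l, f in zip(labels, folds) if l == 1 and f == fold)
    for fold in range(2)
]
print(positives_per_fold)
```

Library implementations typically shuffle within each class before dealing samples out, so that fold membership does not depend on the input order.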

21 pages, 6005 KiB

Open Access Article

Archetype Identification and Energy Consumption Prediction for Old Residential Buildings Based on Multi-Source Datasets

by Chengliang Fan, Rude Liu and Yundan Liao

Buildings 2025, 15(14), 2573; https://doi.org/10.3390/buildings15142573 - 21 Jul 2025

Viewed by 312

Abstract

Assessing energy consumption in existing old residential buildings is key for urban energy conservation and decarbonization. Previous studies on old residential building energy assessment face challenges due to data limitations and inadequate prediction methods. This study develops a novel approach integrating building energy simulation and machine learning to predict large-scale old residential building energy use using multi-source datasets. Using Guangzhou as a case study, open-source building data was collected to identify 31,209 old residential buildings based on age thresholds and areas of interest (AOIs). Key building form parameters (i.e., long side, short side, number of floors) were then classified to identify residential archetypes. Building energy consumption data for each prototype was generated using EnergyPlus (V23.2.0) simulations. Furthermore, XGBoost and Random Forest machine learning algorithms were used to predict city-scale old residential building energy consumption. Results indicated that five representative prototypes exhibited cooling energy use ranging from 17.32 to 21.05 kWh/m², while annual electricity consumption ranged from 60.10 to 66.53 kWh/m². The XGBoost model demonstrated strong predictive performance (R² = 0.667). SHAP (Shapley Additive Explanations) analysis identified the Building Shape Coefficient (BSC) as the most significant positive predictor of energy consumption (SHAP value = 0.79). This framework enables city-level energy assessment for old residential buildings, providing critical support for retrofitting strategies in sustainable urban renewal planning.

(This article belongs to the Special Issue Enhancing Building Resilience Under Climate Change)

23 pages, 1458 KiB

Open AccessArticle

From Meals to Marks: Modeling the Impact of Family Involvement on Reading Performance with Counterfactual Explainable AI

by Myint Swe Khine, Nagla Ali and Othman Abu Khurma

Educ. Sci. 2025, 15(7), 928; https://doi.org/10.3390/educsci15070928 - 21 Jul 2025

Viewed by 270

Abstract

This study investigates the impact of family engagement on student reading achievement in the United Arab Emirates (UAE) using counterfactual explainable artificial intelligence (CXAI) analysis. Drawing data from 24,600 students in the UAE PISA dataset, the analysis employed Gradient Boosting, SHAP (SHapley Additive exPlanations), and counterfactual simulations to model and interpret the influence of ten parental involvement variables. The results identified time spent talking with parents, frequency of family meals, and encouragement to achieve good marks as the strongest predictors of reading performance. Counterfactual analysis revealed that increasing the time spent talking with parents and frequency of family meals from their minimum (1) to maximum (5) levels, while holding other variables constant at their medians, could increase the predicted reading score from the baseline of 358.93 to as high as 448.68, marking an improvement of nearly 90 points. These findings emphasize the educational value of culturally compatible parental behaviors. The study also contributes to methodological advancement by integrating interpretable machine learning with prescriptive insights, demonstrating the potential of XAI for educational policy and intervention design. Implications for educators, policymakers, and families highlight the importance of promoting high-impact family practices to support literacy development. The approach offers a replicable model for leveraging AI to understand and enhance student learning outcomes across diverse contexts.
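The counterfactual procedure described in the abstract (move selected variables from their minimum to their maximum while holding the rest at their medians) can be sketched as follows. This is an illustration under stated assumptions, not the study's code: `toy_predict` is a hypothetical linear stand-in for the fitted Gradient Boosting model, and the variable names and sample rows are invented.

```python
from statistics import median


def median_baseline(rows):
    """Column-wise medians of a list of dicts that share the same keys."""
    keys = rows[0].keys()
    return {k: median(r[k] for r in rows) for k in keys}


def counterfactual_gain(predict, baseline, feature_names, low, high):
    """Predicted change when the named features move from low to high,
    with every other feature held at its baseline value."""
    at_low = {**baseline, **{f: low for f in feature_names}}
    at_high = {**baseline, **{f: high for f in feature_names}}
    return predict(at_high) - predict(at_low)


# Hypothetical stand-in for a trained model (illustrative coefficients).
def toy_predict(x):
    return 300.0 + 10.0 * x["talk_time"] + 5.0 * x["family_meals"] + 2.0 * x["homework_help"]


# Invented survey rows on a 1-5 scale.
rows = [
    {"talk_time": 1, "family_meals": 2, "homework_help": 3},
    {"talk_time": 3, "family_meals": 4, "homework_help": 1},
    {"talk_time": 5, "family_meals": 3, "homework_help": 2},
]
baseline = median_baseline(rows)
gain = counterfactual_gain(toy_predict, baseline, ["talk_time", "family_meals"], 1, 5)
```

With a real model, `predict` would wrap the fitted estimator's prediction call. The resulting gain describes the model's response, not a causal effect.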

18 pages, 4607 KiB

Open AccessArticle

Multi-Objective Machine Learning Optimization of Cylindrical TPMS Lattices for Bone Implants

by Mansoureh Rezapourian, Ali Cheloee Darabi, Mohammadreza Khoshbin and Irina Hussainova

Biomimetics 2025, 10(7), 475; https://doi.org/10.3390/biomimetics10070475 - 18 Jul 2025

Viewed by 518

Abstract

This study presents a multi-objective optimization framework for designing cylindrical triply periodic minimal surface (TPMS) lattices tailored for bone implant applications. Using an artificial neural network (ANN) as a surrogate model trained on simulated data, four key properties, namely ultimate stress (U), energy absorption (EA), surface area-to-volume ratio (SA/VR), and relative density (RD), were predicted from seven lattice design parameters. To address anatomical variability, a novel implant size-based categorization (small, medium, and large) was introduced, and separate optimization runs were conducted for each group. The optimization was performed via the NSGA-II algorithm to maximize mechanical performance (U and EA) and surface efficiency (SA/VR), while filtering for biologically relevant RD values (20–40%). A total of 105 Pareto-optimal designs were identified, with 75 designs retained after RD filtering. SHapley Additive exPlanations (SHAP) analysis revealed the dominant influence of thickness and unit cell size on target properties. Kernel density and boxplot comparisons confirmed distinct performance trends across size groups. The framework effectively balances competing design goals and enables the selection of size-specific lattices. The proposed approach provides a reproducible pathway for optimizing bioarchitectures, with the potential to accelerate the development of lattice-based implants in personalized medicine.
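The selection step described in the abstract (keep the Pareto-optimal designs, then retain those with RD between 20% and 40%) can be sketched as below. This is not NSGA-II itself, only a brute-force non-dominated filter followed by the RD window. The four candidate designs and their values are invented for illustration, and RD is assumed to be stored as a fraction.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly
    better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(designs, objectives):
    """Return the designs that no other design dominates."""
    front = []
    for d in designs:
        score = tuple(d[k] for k in objectives)
        dominated = any(
            dominates(tuple(o[k] for k in objectives), score)
            for o in designs
            if o is not d
        )
        if not dominated:
            front.append(d)
    return front


def filter_relative_density(designs, lo=0.20, hi=0.40):
    """Keep designs whose relative density lies in the closed range [lo, hi]."""
    return [d for d in designs if lo <= d["RD"] <= hi]


# Invented candidate designs (values are illustrative only).
candidates = [
    {"name": "A", "U": 10.0, "EA": 5.0, "SA_VR": 2.0, "RD": 0.30},
    {"name": "B", "U": 8.0, "EA": 4.0, "SA_VR": 1.0, "RD": 0.25},
    {"name": "C", "U": 12.0, "EA": 3.0, "SA_VR": 2.0, "RD": 0.55},
    {"name": "D", "U": 9.0, "EA": 6.0, "SA_VR": 1.5, "RD": 0.35},
]
front = pareto_front(candidates, ["U", "EA", "SA_VR"])
kept = filter_relative_density(front)
```

Design B drops out because A is better on every objective, and C is Pareto-optimal but falls outside the RD window, which mirrors how a front can shrink after filtering.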

(This article belongs to the Special Issue Biomimicry and Functional Materials: 5th Edition)

28 pages, 7756 KiB

Open AccessArticle

An Interpretable Machine Learning Framework for Unraveling the Dynamics of Surface Soil Moisture Drivers

by Zahir Nikraftar, Esmaeel Parizi, Mohsen Saber, Mahboubeh Boueshagh, Mortaza Tavakoli, Abazar Esmaeili Mahmoudabadi, Mohammad Hassan Ekradi, Rendani Mbuvha and Seiyed Mossa Hosseini

Remote Sens. 2025, 17(14), 2505; https://doi.org/10.3390/rs17142505 - 18 Jul 2025

Viewed by 388

Abstract

Understanding the impacts of the spatial non-stationarity of environmental factors on surface soil moisture (SSM) in different seasons is crucial for effective environmental management. Yet, our knowledge of this phenomenon remains limited. This study introduces an interpretable machine learning framework that combines the SHapley Additive exPlanations (SHAP) method with two-step clustering to unravel the spatial drivers of SSM across Iran. Due to the limited availability of in situ SSM data, the performance of three global SSM datasets (SMAP, MERRA-2, and CFSv2) from 2015 to 2023 was evaluated using agrometeorological stations. SMAP outperformed the others, showing the highest median correlation and the lowest Root Mean Square Error (RMSE). Using SMAP, we estimated SSM across 609 catchments employing the Random Forest (RF) algorithm. The RF model yielded R² values of 0.89, 0.83, 0.70, and 0.75 for winter, spring, summer, and autumn, respectively, with corresponding RMSE values of 0.076, 0.081, 0.098, and 0.061 m³/m³. SHAP analysis revealed that climatic factors primarily drive SSM in winter and autumn, while vegetation and soil characteristics are more influential in spring and summer. The clustering results showed that Iran’s catchments can be grouped into five categories based on the SHAP method coefficients, highlighting regional differences in SSM controls.

(This article belongs to the Special Issue Earth Observation Satellites for Soil Moisture Monitoring)

20 pages, 9405 KiB

Open AccessArticle

Developing a Hybrid Model to Enhance the Robustness of Interpretability for Landslide Susceptibility Assessment

by Xiao Yan, Dongshui Zhang, Yongshun Han, Tongsheng Li, Pin Zhong, Zhe Ning and Shirou Tan

ISPRS Int. J. Geo-Inf. 2025, 14(7), 277; https://doi.org/10.3390/ijgi14070277 - 16 Jul 2025

Viewed by 352

Abstract

Landslides are among the most damaging natural hazards, causing extensive damage to infrastructure and threatening human life. Although explainable machine learning has advanced landslide susceptibility assessment, the interpretability of traditional single-model approaches is still not robust. The interpretable hybrid model proposed in this study addresses this limitation and aims to enhance the stability of landslide susceptibility interpretation. The model integrates three base machine learning models (LightGBM, XGBoost, and Random Forest) using a heterogeneous category strategy, thereby enhancing the robustness of model interpretability. The hybrid model is interpreted using SHAP (Shapley Additive Explanations) values, which quantify feature contributions. A 10-fold cross-validation with the coefficient of variation (CV) metric reveals that the hybrid model outperforms the individual base models in terms of interpretive robustness, yielding a lower CV value of 0.175 compared to 0.208 for LightGBM, 0.240 for XGBoost, and 0.207 for the Random Forest model. Although predictive accuracy remains comparable to the baseline models, the hybrid model provides more stable and reliable interpretability results for landslide susceptibility. It identifies the slope, elevation, and LS factor as the three most important factors for landslide susceptibility in Xi’an city. Furthermore, the quantitative nonlinear relationships between these predisposing factors and susceptibility were identified, providing actionable knowledge for landslide risk prevention and urban planning in landslide-prone regions.
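The abstract does not spell out how the coefficient of variation is computed, so the following sketch assumes one plausible reading: for each feature, take the standard deviation of its importance across cross-validation folds divided by its mean, then average over features. The function name `importance_cv` and the fold values are hypothetical.

```python
from statistics import mean, pstdev


def importance_cv(fold_importances):
    """Average coefficient of variation of feature importances across folds.

    fold_importances: one dict per fold mapping feature name to a positive
    importance (for example, mean absolute SHAP value). Lower results mean
    the importance ranking is more stable from fold to fold.
    """
    features = fold_importances[0].keys()
    cvs = []
    for f in features:
        vals = [fold[f] for fold in fold_importances]
        cvs.append(pstdev(vals) / mean(vals))
    return mean(cvs)


# Invented per-fold importances for two models (illustrative only).
stable_model = [
    {"slope": 0.4, "elevation": 0.2},
    {"slope": 0.4, "elevation": 0.2},
]
unstable_model = [
    {"slope": 0.2, "elevation": 0.1},
    {"slope": 0.6, "elevation": 0.3},
]
cv_stable = importance_cv(stable_model)
cv_unstable = importance_cv(unstable_model)
```

Under this reading, a hybrid model with a lower value than its base models would be the more robust one to interpret, which matches the direction of the comparison reported in the abstract.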

(This article belongs to the Special Issue Advances in Remote Sensing and GIS for Natural Hazards Monitoring and Management)
