Search Results (919)

Search Parameters:
Keywords = Shapley Additive Explanations (SHAPs)

28 pages, 13851 KiB  
Article
A Spatially Aware Machine Learning Method for Locating Electric Vehicle Charging Stations
by Yanyan Huang, Hangyi Ren, Xudong Jia, Xianyu Yu, Dong Xie, You Zou, Daoyuan Chen and Yi Yang
World Electr. Veh. J. 2025, 16(8), 445; https://doi.org/10.3390/wevj16080445 (registering DOI) - 6 Aug 2025
Abstract
The rapid adoption of electric vehicles (EVs) has driven a strong need for optimizing locations of electric vehicle charging stations (EVCSs). Previous methods for locating EVCSs rely on statistical and optimization models, but these methods have limitations in capturing complex nonlinear relationships and spatial dependencies among factors influencing EVCS locations. To address this research gap and better understand the spatial impacts of urban activities on EVCS placement, this study presents a spatially aware machine learning (SAML) method that combines a multi-layer perceptron (MLP) model with a spatial loss function to optimize EVCS sites. Additionally, the method uses the Shapley additive explanation (SHAP) technique to investigate nonlinear relationships embedded in EVCS placement. Using the city of Wuhan as a case study, the SAML method reveals that parking site (PS), road density (RD), population density (PD), and commercial residential (CR) areas are key factors in determining optimal EVCS sites. The SAML model classifies these grid cells into no EVCS demand (0 EVCS), low EVCS demand (from 1 to 3 EVCSs), and high EVCS demand (4+ EVCSs) classes. The model performs well in predicting EVCS demand. Findings from ablation tests also indicate that the inclusion of spatial correlations in the model’s loss function significantly enhances the model’s performance. Additionally, results from case studies validate that the model is effective in predicting EVCSs in other metropolitan cities. Full article
(This article belongs to the Special Issue Fast-Charging Station for Electric Vehicles: Challenges and Issues)
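The SHAP technique named in this abstract, and in most entries of this listing, attributes a prediction to input features using Shapley values. As a rough illustration only (not the authors' model or the `shap` library's algorithm), the sketch below computes exact Shapley values for a hypothetical three-feature toy function, assuming that "absent" features are replaced by a fixed baseline value. The function `model`, the `instance`, and the `baseline` are all made up for the example.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values for a set function value_fn(frozenset) -> float."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley formula.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


# Hypothetical toy model with one interaction term (features 0 and 2).
def model(x):
    return 2.0 * x[0] + 1.0 * x[1] + 3.0 * x[0] * x[2]


instance = [1.0, 2.0, 1.0]   # the prediction being explained (assumed)
baseline = [0.0, 0.0, 0.0]   # reference input for "absent" features (assumed)


def value_fn(subset):
    # Features in the coalition take the instance value, others the baseline.
    mixed = [instance[j] if j in subset else baseline[j] for j in range(3)]
    return model(mixed)


phi = shapley_values(value_fn, 3)
print(phi)
```

The attributions sum to the difference between the prediction for `instance` and the prediction for `baseline`, and the interaction term is split evenly between the two features involved. Exact enumeration scales exponentially with the number of features, which is why practical tools rely on approximations or model-specific shortcuts.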
27 pages, 4506 KiB  
Article
Interpretable Machine Learning Framework for Corporate Financialization Prediction: A SHAP-Based Analysis of High-Dimensional Data
by Yanhe Wang, Wei Wei, Zhuodong Liu, Jiahe Liu, Yinzhen Lv and Xiangyu Li
Mathematics 2025, 13(15), 2526; https://doi.org/10.3390/math13152526 - 6 Aug 2025
Abstract
High-dimensional prediction problems with complex non-linear feature interactions present significant algorithmic challenges in machine learning, particularly when dealing with imbalanced datasets and multicollinearity issues. This study proposes an innovative Shapley Additive Explanations (SHAP)-enhanced machine learning framework that integrates SHAP with advanced ensemble methods for interpretable financialization prediction. The methodology simultaneously addresses high-dimensional feature selection using 40 independent variables (19 CSR-related and 21 financialization-related), multicollinearity issues, and model interpretability requirements. Using a comprehensive dataset of 25,642 observations from 3776 Chinese A-share companies (2011–2022), we implement nine optimized machine learning algorithms with hyperparameter tuning via the Hippopotamus Optimization algorithm and five-fold cross-validation. XGBoost demonstrates superior performance with 99.34% explained variance, achieving an RMSE of 0.082 and R² of 0.299. SHAP analysis reveals non-linear U-shaped relationships between key predictors and financialization outcomes, with critical thresholds at approximately 10 for CSR_SocR, 1.5 for CSR_S, and 5 for CSR_CV. SOE status, EPU, ownership concentration, firm size, and housing prices emerge as the most influential predictors. Notable shifts in factor importance occur during the COVID-19 pandemic period (2020–2022). This work contributes a scalable, interpretable machine learning architecture for high-dimensional financial prediction problems, with applications in risk assessment, portfolio optimization, and regulatory monitoring systems. Full article

25 pages, 4450 KiB  
Article
Analyzing Retinal Vessel Morphology in MS Using Interpretable AI on Deep Learning-Segmented IR-SLO Images
by Asieh Soltanipour, Roya Arian, Ali Aghababaei, Fereshteh Ashtari, Yukun Zhou, Pearse A. Keane and Raheleh Kafieh
Bioengineering 2025, 12(8), 847; https://doi.org/10.3390/bioengineering12080847 (registering DOI) - 6 Aug 2025
Abstract
Multiple sclerosis (MS), a chronic disease of the central nervous system, is known to cause structural and vascular changes in the retina. Although optical coherence tomography (OCT) and fundus photography can detect retinal thinning and circulatory abnormalities, these findings are not specific to MS. This study explores the potential of Infrared Scanning-Laser-Ophthalmoscopy (IR-SLO) imaging to uncover vascular morphological features that may serve as MS-specific biomarkers. Using an age-matched, subject-wise stratified k-fold cross-validation approach, a deep learning model originally designed for color fundus images was adapted to segment optic disc, optic cup, and retinal vessels in IR-SLO images, achieving Dice coefficients of 91%, 94.5%, and 97%, respectively. This process included tailored pre- and post-processing steps to optimize segmentation accuracy. Subsequently, clinically relevant features were extracted. Statistical analyses followed by SHapley Additive exPlanations (SHAP) identified vessel fractal dimension, vessel density in zones B and C (circular regions extending 0.5–1 and 0.5–2 optic disc diameters from the optic disc margin, respectively), along with vessel intensity and width, as key differentiators between MS patients and healthy controls. These findings suggest that IR-SLO can non-invasively detect retinal vascular biomarkers that may serve as additional or alternative diagnostic markers for MS diagnosis, complementing current invasive procedures. Full article
(This article belongs to the Special Issue AI in OCT (Optical Coherence Tomography) Image Analysis)

23 pages, 3831 KiB  
Article
Estimating Planetary Boundary Layer Height over Central Amazonia Using Random Forest
by Paulo Renato P. Silva, Rayonil G. Carneiro, Alison O. Moraes, Cleo Quaresma Dias-Junior and Gilberto Fisch
Atmosphere 2025, 16(8), 941; https://doi.org/10.3390/atmos16080941 (registering DOI) - 5 Aug 2025
Abstract
This study investigates the use of a Random Forest (RF), an artificial intelligence (AI) model, to estimate the planetary boundary layer height (PBLH) over Central Amazonia from climatic elements data collected during the GoAmazon experiment, held in 2014 and 2015, as it is a key metric for air quality, weather forecasting, and climate modeling. The novelty of this study lies in estimating PBLH using only surface-based meteorological observations. This approach is validated against remote sensing measurements (e.g., LIDAR, ceilometer, and wind profilers), which are seldom available in the Amazon region. The dataset includes various meteorological features, though substantial missing data for the latent heat flux (LE) and net radiation (Rn) measurements posed challenges. We addressed these gaps through different data-cleaning strategies, such as feature exclusion, row removal, and imputation techniques, assessing their impact on model performance using the Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and r² metrics. The best-performing strategy achieved an RMSE of 375.9 m. In addition to the RF model, we benchmarked its performance against Linear Regression, Support Vector Regression, LightGBM, XGBoost, and a Deep Neural Network. While all models showed moderate correlation with observed PBLH, the RF model outperformed all others with statistically significant differences confirmed by paired t-tests. SHAP (SHapley Additive exPlanations) values were used to enhance model interpretability, revealing hour of the day, air temperature, and relative humidity as the most influential predictors for PBLH, underscoring their critical role in atmospheric dynamics in Central Amazonia. Despite these optimizations, the model underestimates the PBLH values by an average of 197 m, particularly in the spring and early summer austral seasons when atmospheric conditions are more variable. 
These findings emphasize the importance of robust data preprocessing and highlight the potential of ML models for improving PBLH estimation in data-scarce tropical environments. Full article
(This article belongs to the Special Issue Applications of Artificial Intelligence in Atmospheric Sciences)

17 pages, 6884 KiB  
Article
An Interpretable XGBoost Framework for Predicting Oxide Glass Density
by Pawel Stoch
Appl. Sci. 2025, 15(15), 8680; https://doi.org/10.3390/app15158680 (registering DOI) - 5 Aug 2025
Abstract
Accurately predicting glass density is crucial for designing novel materials. This study aims to develop a robust predictive model for the density of oxide glasses and, more importantly, to investigate how physically informed feature engineering can create accurate and interpretable models that reveal underlying physical principles. Using a dataset of 76,593 oxide glasses from the SciGlass database, three machine learning (ML) models (ElasticNet, XGBoost, MLP) were trained and evaluated. Four distinct feature sets were constructed with increasing physical complexity, ranging from simple elemental composition to the advanced Magpie descriptors. The best model was further analyzed for interpretability using feature importance and SHapley Additive exPlanations (SHAP) analysis. A clear hierarchical improvement in predictive accuracy was observed with increasing feature sophistication across all models. The XGBoost model combined with the Magpie feature set provided the best performance, achieving a coefficient of determination (R²) of 0.97. Interpretability analysis revealed that the model’s predictions were overwhelmingly driven by physical attributes, with mean atomic weight being the most influential predictor. The model learns to approximate the fundamental density equation using mean atomic weight as a proxy for molar mass and electronic structure features to estimate molar volume. This demonstrates that a data-driven approach can function as a scientifically valid and interpretable tool, accelerating the discovery of new materials. Full article

28 pages, 4243 KiB  
Article
Electric Bus Battery Energy Consumption Estimation and Influencing Features Analysis Using a Two-Layer Stacking Framework with SHAP-Based Interpretation
by Runze Liu, Jianming Cai, Lipeng Hu, Benxiao Lou and Jinjun Tang
Sustainability 2025, 17(15), 7105; https://doi.org/10.3390/su17157105 - 5 Aug 2025
Abstract
The widespread adoption of electric buses represents a major step forward in sustainable transportation, but also brings new operational challenges, particularly in terms of improving their efficiency and controlling costs. Therefore, battery energy consumption management is a key approach for addressing these issues. Accurate prediction of energy consumption and interpretation of the influencing factors are essential for improving operational efficiency, optimizing energy use, and reducing operating costs. Although existing studies have made progress in battery energy consumption prediction, challenges remain in achieving high-precision modeling and conducting a comprehensive analysis of the influencing features. To address these gaps, this study proposes a two-layer stacking framework for estimating the energy consumption of electric buses. The first layer integrates the strengths of three nonlinear regression models—RF (Random Forest), GBDT (Gradient Boosted Decision Trees), and CatBoost (Categorical Boosting)—to enhance the modeling capacity for complex feature relationships. The second layer employs a Linear Regression model as a meta-learner to aggregate the predictions from the base models and improve the overall predictive performance. The framework is trained on 2023 operational data from two electric bus routes (NO. 355 and NO. W188) in Changsha, China, incorporating battery system parameters, driving characteristics, and environmental variables as independent variables for model training and analysis. Comparative experiments with various ensemble models demonstrate that the proposed stacking framework exhibits superior performance in data fitting. Furthermore, XGBoost (Extreme Gradient Boosting) is introduced as a surrogate model to approximate the decision logic of the stacking framework, enabling SHAP (SHapley Additive exPlanations) analysis to quantify the contribution and marginal effects of influencing features. 
The proposed stacked and surrogate models achieved superior battery energy consumption prediction accuracy (lowest MSE, RMSE, and MAE), significantly outperforming benchmark models on real-world datasets. SHAP analysis quantified the overall contributions of feature categories (battery operation parameters: 56.5%; driving characteristics: 42.3%; environmental data: 1.2%), further revealing the specific contributions and nonlinear influence mechanisms of individual features. These quantitative findings offer specific guidance for optimizing battery system control and driving behavior. Full article
(This article belongs to the Section Sustainable Transportation)
9 pages, 1436 KiB  
Proceeding Paper
Insights into Air Quality Index (AQI) Variability with Explainable Machine Learning Techniques
by Claudio Andenna and Roberta Valentina Gagliardi
Environ. Earth Sci. Proc. 2025, 34(1), 1; https://doi.org/10.3390/eesp2025034001 - 5 Aug 2025
Abstract
In this study, a combined approach joining the machine learning model Extreme Gradient Boosting (XGBoost) with Shapley Additive Explanation (SHAP) is adopted to simulate the temporal pattern of the air quality index (AQI) and subsequently explore the key factors affecting AQI variability. Based on the analysis of air pollutants and meteorological data acquired from two air quality monitoring stations in Rome (Italy), over the 2018–2022 period, the results demonstrate the effectiveness of the proposed methodological approach in elucidating the role of the main factors driving AQI evolution, and their interaction effects. Full article

19 pages, 2795 KiB  
Article
State Analysis of Grouped Smart Meters Driven by Interpretable Random Forest
by Zhongdong Wang, Zhengbo Zhang, Weijiang Wu, Zhen Zhang, Xiaolin Xu and Hongbin Li
Electronics 2025, 14(15), 3105; https://doi.org/10.3390/electronics14153105 - 4 Aug 2025
Abstract
Accurate evaluation of the operational status of smart meters, as the critical interface between the power grid and its users, is essential for ensuring fairness in power transactions. This highlights the importance of implementing rotation management practices based on meter status. However, the traditional expiration-based rotation method has become inadequate due to the extended service life of modern smart meters, necessitating a shift toward status-driven targeted management. Existing multifactor comprehensive assessment methods often face challenges in balancing accuracy and interpretability. To address these limitations, this study proposes a novel method for analyzing the status of smart meter groups using an interpretable random forest model. The approach incorporates an expert-knowledge-guided grouping assessment strategy, develops a multi-source heterogeneous feature set with strong correlations to meter status, and enhances the random forest model with the SHAP (SHapley Additive exPlanations) interpretability framework. Compared to conventional methods, the proposed approach demonstrates superior efficiency and reliability in predicting the failure rates of smart meter groups within distribution network areas, offering robust support for the maintenance and management of smart meters. Full article

44 pages, 6212 KiB  
Article
A Hybrid Deep Reinforcement Learning Architecture for Optimizing Concrete Mix Design Through Precision Strength Prediction
by Ali Mirzaei and Amir Aghsami
Math. Comput. Appl. 2025, 30(4), 83; https://doi.org/10.3390/mca30040083 - 3 Aug 2025
Abstract
Concrete mix design plays a pivotal role in ensuring the mechanical performance, durability, and sustainability of construction projects. However, the nonlinear interactions among the mix components challenge traditional approaches in predicting compressive strength and optimizing proportions. This study presents a two-stage hybrid framework that integrates deep learning with reinforcement learning to overcome these limitations. First, a Convolutional Neural Network–Long Short-Term Memory (CNN–LSTM) model was developed to capture spatial–temporal patterns from a dataset of 1030 historical concrete samples. The extracted features were enhanced using an eXtreme Gradient Boosting (XGBoost) meta-model to improve generalizability and noise resistance. Then, a Dueling Double Deep Q-Network (Dueling DDQN) agent was used to iteratively identify optimal mix ratios that maximize the predicted compressive strength. The proposed framework outperformed ten benchmark models, achieving an MAE of 2.97, RMSE of 4.08, and R² of 0.94. Feature attribution methods, including SHapley Additive exPlanations (SHAP), Elasticity-Based Feature Importance (EFI), and Permutation Feature Importance (PFI), highlighted the dominant influence of cement content and curing age, as well as revealing non-intuitive effects such as the compensatory role of superplasticizers in low-water mixtures. These findings demonstrate the potential of the proposed approach to support intelligent concrete mix design and real-time optimization in smart construction environments. Full article
(This article belongs to the Section Engineering)

22 pages, 4943 KiB  
Article
Predicting De-Handing Point in Bananas Using Crown Morphology and Interpretable Machine Learning
by Lei Zhao, Zhou Yang, Chunxia Wang, Mohui Jin and Jieli Duan
Agronomy 2025, 15(8), 1880; https://doi.org/10.3390/agronomy15081880 - 3 Aug 2025
Abstract
Banana de-handing is a critical yet labor-intensive step in postharvest processing, with current manual methods resulting in high costs and occupational risks. This study addresses the automation of de-handing point localization by integrating high-resolution 3D scanning and morphometric analysis of banana crowns with machine learning techniques. A total of 210 crown samples were analyzed to extract key morphological features, including inner arc length (Li), inner arc radius (Ri), outer arc radius (Ro), and the distance between inner and outer arcs (Doi), among others. Four machine learning algorithms, namely, Multi-Layer Perceptron (MLP), Gradient Boosted Decision Trees (GBDT), Extreme Gradient Boosting (XGBoost), and Random Forest (RF), were developed to predict the target radius (Rt) and target distance (Dti) of the de-handing point. The RF models achieved the optimal predictive performance on the testing set, with the following results: for Rt, R² = 0.95, MAE = 1.50, and RMSE = 1.94; for Dti, R² = 0.91, MAE = 1.33, and RMSE = 1.66. A Shapley Additive Explanations (SHAP) analysis revealed that Li, Ri, and Ro were the most influential features for Rt, while Doi was the most important for Dti. Notably, feature threshold effects were observed, with limited gains in prediction accuracy beyond specific morphological values. These results provide a quantitative foundation for vision-guided automated de-handing systems, advancing intelligent and efficient banana postharvest management. Full article
(This article belongs to the Section Precision and Digital Agriculture)

24 pages, 1964 KiB  
Article
Data-Driven Symmetry and Asymmetry Investigation of Vehicle Emissions Using Machine Learning: A Case Study in Spain
by Fei Wu, Jinfu Zhu, Hufang Yang, Xiang He and Qiao Peng
Symmetry 2025, 17(8), 1223; https://doi.org/10.3390/sym17081223 - 2 Aug 2025
Abstract
Understanding vehicle emissions is essential for developing effective carbon reduction strategies in the transport sector. Conventional emission models often assume homogeneity and linearity, overlooking real-world asymmetries that arise from variations in vehicle design and powertrain configurations. This study explores how machine learning and explainable AI techniques can effectively capture both symmetric and asymmetric emission patterns across different vehicle types, thereby contributing to more sustainable transport planning. Addressing a key gap in the existing literature, the study poses the following question: how do structural and behavioral factors contribute to asymmetric emission responses in internal combustion engine vehicles compared to new energy vehicles? Utilizing a large-scale Spanish vehicle registration dataset, the analysis classifies vehicles by powertrain type and applies five supervised learning algorithms to predict CO₂ emissions. SHapley Additive exPlanations (SHAPs) are employed to identify nonlinear and threshold-based relationships between emissions and vehicle characteristics such as fuel consumption, weight, and height. Among the models tested, the Random Forest algorithm achieves the highest predictive accuracy. The findings reveal critical asymmetries in emission behavior, particularly among hybrid vehicles, which challenge the assumption of uniform policy applicability. This study provides both methodological innovation and practical insights for symmetry-aware emission modeling, offering support for more targeted eco-design and policy decisions that align with long-term sustainability goals. Full article
(This article belongs to the Section Engineering and Materials)

23 pages, 3427 KiB  
Article
Visual Narratives and Digital Engagement: Decoding Seoul and Tokyo’s Tourism Identity Through Instagram Analytics
by Seung Chul Yoo and Seung Mi Kang
Tour. Hosp. 2025, 6(3), 149; https://doi.org/10.3390/tourhosp6030149 - 1 Aug 2025
Abstract
Social media platforms like Instagram significantly shape destination images and influence tourist behavior. Understanding how different cities are represented and perceived on these platforms is crucial for effective tourism marketing. This study provides a comparative analysis of Instagram content and engagement patterns in Seoul and Tokyo, two major Asian metropolises, to derive actionable marketing insights. We collected and analyzed 59,944 public Instagram posts geotagged or location-tagged within Seoul (n = 29,985) and Tokyo (n = 29,959). We employed a mixed-methods approach involving content categorization using a fine-tuned convolutional neural network (CNN) model, engagement metric analysis (likes, comments), Valence Aware Dictionary and sEntiment Reasoner (VADER) sentiment analysis and thematic classification of comments, geospatial analysis (Kernel Density Estimation [KDE], Moran’s I), and predictive modeling (Gradient Boosting with SHapley Additive exPlanations [SHAP] value analysis). A validation analysis using balanced samples (n = 2000 each) was conducted to address Tokyo’s lower geotagged data proportion. While both cities showed ‘Person’ as the dominant content category, notable differences emerged. Tokyo exhibited higher like-based engagement across categories, particularly for ‘Animal’ and ‘Food’ content, while Seoul generated slightly more comments, often expressing stronger sentiment. Qualitative comment analysis revealed Seoul comments focused more on emotional reactions, whereas Tokyo comments were often shorter, appreciative remarks. Geospatial analysis identified distinct hotspots. The validation analysis confirmed these spatial patterns despite Tokyo’s data limitations. Predictive modeling highlighted hashtag counts as the key engagement driver in Seoul and the presence of people in Tokyo. Seoul and Tokyo project distinct visual narratives and elicit different engagement patterns on Instagram. 
These findings offer practical implications for destination marketers, suggesting tailored content strategies and location-based campaigns targeting identified hotspots and specific content themes. This study underscores the value of integrating quantitative and qualitative analyses of social media data for nuanced destination marketing insights. Full article
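The geospatial step in this abstract mentions Moran's I, a standard measure of global spatial autocorrelation. The sketch below is only an illustration of the textbook formula on made-up data, not the authors' pipeline: the six values, the one-dimensional chain layout, and the binary adjacency weights are all assumptions chosen to keep the example small.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weight matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations, weighted by spatial proximity.
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)


def chain_weights(n):
    """Binary adjacency for n locations laid out on a line (assumed layout)."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
            for i in range(n)]


# Hypothetical post counts at six neighbouring locations.
clustered = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]      # similar values sit together
alternating = [1.0, 5.0, 1.0, 5.0, 1.0, 5.0]    # neighbours always differ

W = chain_weights(6)
i_clustered = morans_i(clustered, W)
i_alternating = morans_i(alternating, W)
print(i_clustered, i_alternating)
```

A positive value indicates that similar values cluster in space (as hotspots do), while a negative value indicates that neighbouring locations tend to differ. Real analyses typically use distance-based or contiguity weights over map units and add a significance test, both of which are omitted here.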

22 pages, 2120 KiB  
Article
Machine Learning Algorithms and Explainable Artificial Intelligence for Property Valuation
by Gabriella Maselli and Antonio Nesticò
Real Estate 2025, 2(3), 12; https://doi.org/10.3390/realestate2030012 - 1 Aug 2025
Abstract
The accurate estimation of urban property values is a key challenge for appraisers, market participants, financial institutions, and urban planners. In recent years, machine learning (ML) techniques have emerged as promising tools for price forecasting due to their ability to model complex relationships among variables. However, their application raises two main critical issues: (i) the risk of overfitting, especially with small datasets or with noisy data; (ii) the interpretive issues associated with the “black box” nature of many models. Within this framework, this paper proposes a methodological approach that addresses both these issues, comparing the predictive performance of three ML algorithms—k-Nearest Neighbors (kNN), Random Forest (RF), and the Artificial Neural Network (ANN)—applied to the housing market in the city of Salerno, Italy. For each model, overfitting is preliminarily assessed to ensure predictive robustness. Subsequently, the results are interpreted using explainability techniques, such as SHapley Additive exPlanations (SHAPs) and Permutation Feature Importance (PFI). This analysis reveals that the Random Forest offers the best balance between predictive accuracy and transparency, with features such as area and proximity to the train station identified as the main drivers of property prices. kNN and the ANN are viable alternatives that are particularly robust in terms of generalization. The results demonstrate how the defined methodological framework successfully balances predictive effectiveness and interpretability, supporting the informed and transparent use of ML in real estate valuation. Full article
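Permutation Feature Importance, named in this abstract alongside SHAP, scores a feature by how much a model's error grows when that feature's column is shuffled. The sketch below is a minimal, assumption-laden illustration: the synthetic data, the hand-written `model`, and the choice of mean squared error are hypothetical and unrelated to the Salerno housing dataset.

```python
import random


def mse(model, X, y):
    """Mean squared error of model predictions on rows X against targets y."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Average increase in MSE when each feature column is shuffled."""
    rng = random.Random(seed)
    base = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:]
                      for row, v in zip(X, column)]
            increases.append(mse(model, X_perm, y) - base)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical data: the target depends strongly on feature 0,
# weakly on feature 1, and not at all on feature 2.
data_rng = random.Random(42)
X = [[data_rng.uniform(0, 1) for _ in range(3)] for _ in range(200)]
y = [5.0 * r[0] + 1.0 * r[1] for r in X]


def model(row):
    # Stand-in for a fitted model; it ignores feature 2 by construction.
    return 5.0 * row[0] + 1.0 * row[1]


pfi = permutation_importance(model, X, y)
print(pfi)
```

Unlike Shapley-based attribution, this gives one global score per feature rather than a per-prediction breakdown, and it can be misleading when features are strongly correlated.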

17 pages, 1584 KiB  
Article
What Determines Carbon Emissions of Multimodal Travel? Insights from Interpretable Machine Learning on Mobility Trajectory Data
by Guo Wang, Shu Wang, Wenxiang Li and Hongtai Yang
Sustainability 2025, 17(15), 6983; https://doi.org/10.3390/su17156983 - 31 Jul 2025
Abstract
Understanding the carbon emissions of multimodal travel (walking, metro, bus, cycling, and ride-hailing) is essential for promoting sustainable urban mobility. However, most existing studies focus on single-mode travel, while the underlying spatiotemporal and behavioral determinants remain insufficiently explored due to the lack of fine-grained data and interpretable analytical frameworks. This study proposes a novel integration of high-frequency, real-world mobility trajectory data with interpretable machine learning to systematically identify the key drivers of carbon emissions at the individual trip level. Firstly, multimodal travel chains are reconstructed using continuous GPS trajectory data collected in Beijing. Secondly, a model based on the Computer Programme to calculate Emissions from Road Transport (COPERT) is developed to quantify trip-level CO₂ emissions. Thirdly, four interpretable machine learning models based on gradient boosting (XGBoost, GBDT, LightGBM, and CatBoost) are trained using transportation and built environment features to model the relationship between CO₂ emissions and a set of explanatory variables. Finally, SHapley Additive exPlanations (SHAP) and partial dependence plots (PDPs) are used to interpret the model outputs, revealing key determinants and their non-linear interaction effects. The results show that transportation-related features account for 75.1% of the explained variance in emissions, with bus usage being the most influential single factor (contributing 22.6%). Built environment features explain the remaining 24.9%. The PDP analysis reveals that substantial emission reductions occur only when the shares of bus, metro, and cycling surpass threshold levels of approximately 40%, 40%, and 30%, respectively. Additionally, travel carbon emissions are minimized when trip origins and destinations are located within a 10 to 11 km radius of the central business district (CBD). This study advances the field by establishing a scalable, interpretable, and behaviorally grounded framework to assess carbon emissions from multimodal travel, providing actionable insights for low-carbon transport planning and policy design. Full article
(This article belongs to the Special Issue Sustainable Transportation Systems and Travel Behaviors)
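To make the Shapley attribution behind SHAP more concrete, here is a minimal sketch that computes exact Shapley values by enumerating every feature coalition. It is an illustration under stated assumptions, not the study's pipeline: the toy model, the instance, and the all-zero baseline are hypothetical, and practical SHAP implementations use faster approximations or tree-specific algorithms rather than brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n_features):
    """Exact Shapley values for a coalition value function over n_features players."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [p for p in range(n_features) if p != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley formula.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def make_value(model, x, baseline):
    """Value of a coalition: features in it take x's values, the rest the baseline's."""
    def value(coalition):
        mixed = [x[j] if j in coalition else baseline[j] for j in range(len(x))]
        return model(mixed)
    return value


# Hypothetical toy model with one interaction term between features 0 and 2.
def toy_model(row):
    return 3.0 * row[0] + 2.0 * row[1] + row[0] * row[2]


x = [1.0, 2.0, 4.0]          # hypothetical instance to explain
baseline = [0.0, 0.0, 0.0]   # hypothetical reference point

phi = shapley_values(make_value(toy_model, x, baseline), len(x))
print(phi)
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is split evenly between the two features involved. Both properties follow from the Shapley definition.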

35 pages, 3218 KiB  
Article
Integrated GBR–NSGA-II Optimization Framework for Sustainable Utilization of Steel Slag in Road Base Layers
by Merve Akbas
Appl. Sci. 2025, 15(15), 8516; https://doi.org/10.3390/app15158516 (registering DOI) - 31 Jul 2025
Viewed by 164
Abstract
This study proposes an integrated, machine learning-based multi-objective optimization framework to evaluate and optimize the utilization of steel slag in road base layers, simultaneously addressing economic costs and environmental impacts. A comprehensive dataset of 482 scenarios was engineered based on literature-informed parameters, encompassing transport distance, processing energy intensity, initial moisture content, gradation adjustments, and regional electricity emission factors. Four advanced tree-based ensemble regression algorithms were rigorously evaluated: Random Forest Regressor (RFR), Extremely Randomized Trees (ERTs), Gradient Boosted Regressor (GBR), and Extreme Gradient Boosting Regressor (XGBR). Among these, GBR demonstrated superior predictive performance (R² > 0.95, RMSE < 7.5), effectively capturing complex nonlinear interactions inherent in slag processing and logistics operations. Feature importance analysis via SHapley Additive exPlanations (SHAP) provided interpretative insights, highlighting transport distance and energy intensity as dominant factors affecting unit cost, while moisture content and grid emission factor predominantly influenced CO₂ emissions. Subsequently, the GBR model was integrated into a Non-Dominated Sorting Genetic Algorithm II (NSGA-II) framework to explore optimal trade-offs between cost and emissions. The resulting Pareto front revealed a diverse solution space, with significant nonlinear trade-offs between economic efficiency and environmental performance, clearly identifying strategic inflection points. To facilitate actionable decision-making, the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) method was applied, identifying an optimal balanced solution characterized by a transport distance of 47 km, energy intensity of 1.21 kWh/ton, moisture content of 6.2%, moderate gradation adjustment, and a grid CO₂ factor of 0.47 kg CO₂/kWh. This scenario offered a substantial reduction (45%) in CO₂ emissions relative to cost-minimized solutions, with a moderate increase (33%) in total cost, presenting a realistic and balanced pathway for sustainable infrastructure practices. Overall, this study introduces a robust, scalable, and interpretable optimization framework, providing valuable methodological advancements for sustainable decision-making in infrastructure planning and circular economy initiatives. Full article
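The TOPSIS step described in this abstract can be sketched as follows: each Pareto-optimal candidate is scored by its relative closeness to an ideal point and its distance from an anti-ideal point. This is a minimal sketch, not the paper's implementation. The four (cost, emissions) pairs are hypothetical, and equal weights with vector normalization are assumptions, since the abstract does not state which were used.

```python
from math import sqrt


def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] per alternative (higher is better).

    matrix  -- one row per alternative, one column per criterion
    weights -- criterion weights
    benefit -- True where larger is better, False where smaller is better
    """
    n_crit = len(weights)
    # Vector-normalize each criterion column, then apply the weights.
    norms = [sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
                for row in matrix]
    columns = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_pos = sqrt(sum((v - b) ** 2 for v, b in zip(row, ideal)))
        d_neg = sqrt(sum((v - w) ** 2 for v, w in zip(row, anti)))
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Hypothetical Pareto candidates: (total cost, CO2 emissions), both minimized.
pareto = [(100.0, 90.0), (115.0, 60.0), (133.0, 50.0), (160.0, 45.0)]
scores = topsis(pareto, weights=[0.5, 0.5], benefit=[False, False])
best = max(range(len(scores)), key=scores.__getitem__)
print(best, [round(s, 3) for s in scores])
```

Changing the weights shifts the selected compromise toward the cheaper or the lower-emission end of the front, which is why the weighting scheme should be reported alongside any TOPSIS result.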