Search Results (244)

Search Parameters:
Keywords = model-agnostic interpretation

36 pages, 4464 KB  
Article
Efficient Image-Based Memory Forensics for Fileless Malware Detection Using Texture Descriptors and LIME-Guided Deep Learning
by Qussai M. Yaseen, Esraa Oudat, Monther Aldwairi and Salam Fraihat
Computers 2025, 14(11), 467; https://doi.org/10.3390/computers14110467 - 1 Nov 2025
Abstract
Memory forensics is an essential cybersecurity tool that comprehensively examines volatile memory to detect the malicious activity of fileless malware that can bypass disk analysis. Image-based detection techniques provide a promising solution by visualizing memory data as images that can be analyzed with image processing tools and machine learning methods. However, effective image-based detection and classification demand high computational effort. This paper investigates the efficacy of texture-based methods in detecting and classifying memory-resident (fileless) malware at different image resolutions, identifying the feature descriptors, classifiers, and resolutions that best classify malware into specific families and differentiate it from benign software. Moreover, this paper uses both local and global descriptors: the local descriptors include Oriented FAST and Rotated BRIEF (ORB), Scale-Invariant Feature Transform (SIFT), and Histogram of Oriented Gradients (HOG), and the global descriptors include Discrete Wavelet Transform (DWT), GIST, and Gray Level Co-occurrence Matrix (GLCM). The results indicate that as image resolution increases, most feature descriptors yield more discriminative features but require more time and processing resources. To address this challenge, this paper proposes a novel approach that integrates Local Interpretable Model-agnostic Explanations (LIME) with deep learning models to automatically identify and crop the most important regions of memory images. The LIME ROI was extracted from ResNet50 and MobileNet model predictions separately, the images were resized to 128 × 128, and sampling was performed dynamically to speed up LIME computation. The image ROIs are cropped to new 100 × 100 images in two stages: a coarse stage and a fine stage. The two LIME-based cropped images generated with ResNet50 and MobileNet are fed to a lightweight neural network to evaluate the effectiveness of the LIME-identified regions. The results demonstrate that LIME cropping guided by the MobileNet model's predictions improves efficiency while preserving important features, achieving a classification accuracy of 85% on multi-class classification.
(This article belongs to the Special Issue Using New Technologies in Cyber Security Solutions (2nd Edition))
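For readers unfamiliar with the cropping step, a minimal sketch of LIME-guided ROI extraction with the lime package follows; the model weights, class count, and random image are placeholders rather than the paper's artifacts (only the 128 × 128 input and 100 × 100 crop sizes come from the abstract):

```python
import numpy as np
from lime import lime_image
from tensorflow.keras.applications import MobileNet

# Placeholder model and image; the paper fine-tunes MobileNet on memory images
model = MobileNet(weights=None, input_shape=(128, 128, 3), classes=25)
image = np.random.rand(128, 128, 3)  # stand-in for a 128 x 128 memory-dump image

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    image,
    lambda imgs: model.predict(imgs.astype("float32")),
    top_labels=1,
    hide_color=0,
    num_samples=500,  # reduced sampling, mirroring the paper's dynamic speed-up
)
_, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=True, num_features=5, hide_rest=False
)

# Crop a 100 x 100 window centred on the highest-weight LIME superpixels
ys, xs = np.nonzero(mask)
cy, cx = (int(ys.mean()), int(xs.mean())) if ys.size else (64, 64)
y0 = int(np.clip(cy - 50, 0, image.shape[0] - 100))
x0 = int(np.clip(cx - 50, 0, image.shape[1] - 100))
roi = image[y0:y0 + 100, x0:x0 + 100]
```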

31 pages, 1368 KB  
Review
eXplainable Artificial Intelligence (XAI): A Systematic Review for Unveiling the Black Box Models and Their Relevance to Biomedical Imaging and Sensing
by Nadeesha Hettikankanamage, Niusha Shafiabady, Fiona Chatteur, Robert M. X. Wu, Fareed Ud Din and Jianlong Zhou
Sensors 2025, 25(21), 6649; https://doi.org/10.3390/s25216649 - 30 Oct 2025
Abstract
Artificial Intelligence (AI) has achieved immense progress in recent years across a wide array of application domains, with biomedical imaging and sensing emerging as particularly impactful areas. However, the integration of AI in safety-critical fields, particularly biomedical domains, continues to face a major challenge of explainability arising from the opacity of complex prediction models. Overcoming this obstacle falls within the realm of eXplainable Artificial Intelligence (XAI), which is widely acknowledged as essential for successfully implementing and accepting AI techniques in practical applications, ensuring transparency, fairness, and accountability in decision-making processes and mitigating potential biases. This article provides a systematic cross-domain review of XAI techniques applied to quantitative prediction tasks, with a focus on their methodological relevance and potential adaptation to biomedical imaging and sensing. Following PRISMA guidelines, we analyzed 44 Q1 journal articles that applied XAI techniques to prediction tasks across different fields using quantitative databases and studied their contributions to explaining the predictions. As a result, 13 XAI techniques were identified for prediction tasks. SHapley Additive exPlanations (SHAP) was identified in 35 out of 44 articles, reflecting its frequent computational use for feature-importance ranking and model interpretation. Local Interpretable Model-Agnostic Explanations (LIME), Partial Dependence Plots (PDPs), and Permutation Feature Importance (PFI) ranked second, third, and fourth in popularity, respectively. The study also recognises theoretical limitations of SHAP and related model-agnostic methods, such as their additive and causal assumptions, which are particularly critical for heterogeneous biomedical data. Furthermore, a synthesis of the reviewed studies reveals that while many provide computational evaluation of explanations, none include structured human–subject usability validation, underscoring an important research gap for clinical translation. Overall, this study offers an integrated understanding of quantitative XAI techniques, identifies methodological and usability gaps for biomedical adaptation, and provides guidance for future research aimed at safe and interpretable AI deployment in biomedical imaging and sensing.
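As context for SHAP's dominance among the reviewed studies, the typical feature-importance workflow looks like the sketch below; the dataset and tree model are illustrative stand-ins, not drawn from any reviewed article:

```python
import shap
import xgboost
from sklearn.datasets import load_diabetes

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = xgboost.XGBRegressor(n_estimators=100).fit(X, y)

explainer = shap.TreeExplainer(model)  # fast exact path for tree ensembles
shap_values = explainer(X)             # additive attribution per feature per row

shap.plots.bar(shap_values)            # global ranking: mean |SHAP| per feature
shap.plots.waterfall(shap_values[0])   # local explanation of one prediction
```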

24 pages, 2742 KB  
Article
Capturing the Asymmetry of Pitting Corrosion: An Interpretable Prediction Model Based on Attention-CNN
by Xiaohai Ran and Changfeng Wang
Symmetry 2025, 17(10), 1775; https://doi.org/10.3390/sym17101775 - 21 Oct 2025
Abstract
Fossil fuels are crucial to the global energy supply, with pipelines being a vital transportation method. However, these vital assets are highly susceptible to pitting corrosion, an insidious form of degradation that can lead to catastrophic failures. Unlike uniform corrosion, which represents a symmetric form of material loss, pitting corrosion is a highly asymmetric and localized phenomenon, and its inherent complexity makes prediction a significant challenge. To address this, the study presents SSA-CNN-Attention, a deep learning model specifically designed to analyze the complex, nonlinear interactions among environmental factors. The model employs a Convolutional Neural Network (CNN) to extract local features, while an attention mechanism allows it to asymmetrically weight the importance of these features, enhancing its ability to recognize intricate interactions. Additionally, the Sparrow Search Algorithm (SSA) optimizes the model's hyperparameters for improved accuracy and stability. Furthermore, a post hoc interpretability analysis using the LIME framework validates that the model's learned feature relationships are consistent with established corrosion science, revealing how the model accounts for the asymmetric influence of key variables. The experimental results demonstrate that the proposed model reduces mean squared error (MSE) by 61.3% and mean absolute error (MAE) by 26.6%, while improving the coefficient of determination (R²) by 28.2% compared to traditional CNNs. These findings highlight the model's superior performance in predicting a fundamentally asymmetric process and provide valuable insights into the underlying corrosion mechanisms.
(This article belongs to the Section Computer)
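The post hoc LIME check described above corresponds to the standard tabular workflow; a minimal sketch, with a random-forest stand-in for SSA-CNN-Attention and illustrative feature names (neither is from the paper):

```python
import numpy as np
from lime.lime_tabular import LimeTabularExplainer
from sklearn.ensemble import RandomForestRegressor

# Placeholder data: rows = environment samples, columns = corrosion factors
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 2 * X[:, 0] + X[:, 1] ** 2 + rng.normal(scale=0.1, size=200)
features = ["pH", "resistivity", "chloride", "moisture"]  # illustrative names

model = RandomForestRegressor(random_state=0).fit(X, y)  # stand-in model

explainer = LimeTabularExplainer(X, feature_names=features, mode="regression")
exp = explainer.explain_instance(X[0], model.predict, num_features=4)
print(exp.as_list())  # signed local weight of each factor for one prediction
```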

27 pages, 5750 KB  
Article
Hybrid Diagnostic Framework for Interpretable Bearing Fault Classification Using CNN and Dual-Stage Feature Selection
by Mohamed Elhachemi Saouli, Mostefa Mohamed Touba and Adel Boudiaf
Sensors 2025, 25(20), 6386; https://doi.org/10.3390/s25206386 - 16 Oct 2025
Abstract
Timely and accurate fault diagnosis in rotary machinery is essential for ensuring system reliability and minimizing unplanned downtime. While deep learning approaches, particularly Convolutional Neural Networks (CNNs), have demonstrated strong performance in vibration-based fault classification, their limited interpretability poses challenges for adoption in safety-critical environments. To address this, the present study introduces a hybrid diagnostic framework that integrates CNN-based transfer learning with interpretable supervised classification, aiming to enhance both predictive accuracy and model transparency. A key innovation of this work lies in the dual-stage feature selection process, combining Analysis of Variance (ANOVA) and Permutation Feature Importance (PFI) to refine deep features extracted from a pre-trained VGG19 network. This strategy improves both dimensionality reduction and classification performance in a statistically grounded, model-agnostic manner. Furthermore, SHapley Additive exPlanations (SHAP) are employed to interpret the predictions, offering insight into the most influential features driving the classification decisions. Experimental evaluation on the Case Western Reserve University (CWRU) bearing dataset confirms the effectiveness of the proposed approach, achieving 100% classification accuracy using ten-fold cross-validation. By uniting high performance with transparent decision-making, the framework demonstrates strong potential for explainable and reliable fault diagnosis in industrial settings.
(This article belongs to the Special Issue AI-Assisted Condition Monitoring and Fault Diagnosis)
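The dual-stage selection can be approximated with scikit-learn primitives; a minimal sketch on synthetic stand-ins for the VGG19 deep features (the 512/64/32 feature counts and the SVM probe are illustrative choices, not the paper's):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Placeholder for deep features from a pre-trained VGG19 (rows = vibration images)
X, y = make_classification(n_samples=400, n_features=512, n_informative=30,
                           random_state=0)

# Stage 1: ANOVA F-test keeps the features best separated across fault classes
anova = SelectKBest(f_classif, k=64).fit(X, y)
X_sel = anova.transform(X)

# Stage 2: model-agnostic permutation importance on a held-out split
X_tr, X_te, y_tr, y_te = train_test_split(X_sel, y, random_state=0)
clf = SVC().fit(X_tr, y_tr)
pfi = permutation_importance(clf, X_te, y_te, n_repeats=10, random_state=0)
keep = np.argsort(pfi.importances_mean)[::-1][:32]  # top-32 refined features
```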

11 pages, 1181 KB  
Communication
Surgical Instrument Segmentation via Segment-Then-Classify Framework with Instance-Level Spatiotemporal Consistency Modeling
by Tiyao Zhang, Xue Yuan and Hongze Xu
J. Imaging 2025, 11(10), 364; https://doi.org/10.3390/jimaging11100364 - 15 Oct 2025
Abstract
Accurate segmentation of surgical instruments in endoscopic videos is crucial for robot-assisted surgery and intraoperative analysis. This paper presents a Segment-then-Classify framework that decouples mask generation from semantic classification to enhance spatial completeness and temporal stability. First, a Mask2Former-based segmentation backbone generates class-agnostic instance masks and region features. Then, a bounding box-guided instance-level spatiotemporal modeling module fuses geometric priors and temporal consistency through a lightweight transformer encoder. This design improves interpretability and robustness under occlusion and motion blur. Experiments on the EndoVis 2017 and 2018 datasets demonstrate that our framework achieves mIoU improvements of 3.06%, 2.99%, and 1.67% and mcIoU gains of 2.36%, 2.85%, and 6.06%, respectively, over previous state-of-the-art methods, while maintaining computational efficiency.
(This article belongs to the Section Image and Video Processing)

38 pages, 7624 KB  
Review
Towards Explainable Deep Learning in Computational Neuroscience: Visual and Clinical Applications
by Asif Mehmood, Faisal Mehmood and Jungsuk Kim
Mathematics 2025, 13(20), 3286; https://doi.org/10.3390/math13203286 - 14 Oct 2025
Abstract
Deep learning has emerged as a powerful tool in computational neuroscience, enabling the modeling of complex neural processes and supporting data-driven insights into brain function. However, the non-transparent nature of many deep learning models limits their interpretability, a significant barrier in neuroscience and clinical contexts where trust, transparency, and biological plausibility are essential. This review surveys structured explainable deep learning methods, such as saliency maps, attention mechanisms, and model-agnostic interpretability frameworks, that bridge the gap between performance and interpretability. We then explore explainable deep learning's role in visual and clinical neuroscience. By surveying the literature and evaluating strengths and limitations, we highlight explainable models' contribution to both scientific understanding and ethical deployment. Challenges such as balancing accuracy, complexity, and interpretability, the absence of standardized metrics, and scalability are assessed. Finally, we propose future directions, including integrating biological priors, implementing standardized benchmarks, and incorporating human-intervention systems. The study positions explainable deep learning not merely as a technical advancement but as a necessary paradigm for transparent, responsible, auditable, and effective computational neuroscience. In total, 177 studies were reviewed following PRISMA, providing evidence across both visual and clinical computational neuroscience domains.
(This article belongs to the Special Issue Methods, Analysis and Applications in Computational Neuroscience)
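Of the surveyed techniques, gradient saliency is the simplest to state in code; a minimal PyTorch sketch on a placeholder network and random input (not from any reviewed study):

```python
import torch
import torchvision.models as models

model = models.resnet18(weights=None).eval()  # placeholder network
x = torch.randn(1, 3, 224, 224, requires_grad=True)

model(x)[0].max().backward()  # gradient of the top class score w.r.t. the input

# Saliency map: per-pixel gradient magnitude, max over color channels
saliency = x.grad.abs().max(dim=1).values  # shape (1, 224, 224)
```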

29 pages, 3821 KB  
Article
Mathematical Framework for Digital Risk Twins in Safety-Critical Systems
by Igor Kabashkin
Mathematics 2025, 13(19), 3222; https://doi.org/10.3390/math13193222 - 8 Oct 2025
Abstract
This paper introduces a formal mathematical framework for Digital Risk Twins (DRTs) as an extension of traditional digital twin (DT) architectures, explicitly tailored to the needs of safety-critical systems. While conventional DTs enable real-time monitoring and simulation of physical assets, they often lack structured mechanisms to model stochastic failure processes; evaluate dynamic risk; or support resilient, risk-aware decision-making. The proposed DRT framework addresses these limitations by embedding probabilistic hazard modeling, reliability theory, and coherent risk measures into a modular and mathematically interpretable structure. The DT-to-DRT transformation is formalized as a composition of operators that project system trajectories onto risk-relevant features, compute failure intensities, and evaluate risk metrics under uncertainty. The framework supports layered integration of simulation, feature extraction, hazard dynamics, and decision-oriented evaluation, providing traceability, scalability, and explainability. Its utility is demonstrated through a case study involving an aircraft brake system, showcasing early warning detection, inspection schedule optimization, and visual risk interpretation. The results confirm that the DRT enables modular, explainable, and domain-agnostic integration of reliability logic into digital twin systems, enhancing their value in safety-critical applications.
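As a flavor of the operator composition the framework formalizes, the sketch below maps an operating-time trajectory through a Weibull failure intensity to a risk bound; the shape, scale, and 5% threshold are illustrative assumptions, not values from the paper:

```python
import numpy as np

def weibull_hazard(t, k=2.5, lam=1000.0):
    """Failure intensity h(t) = (k/lam) * (t/lam)**(k-1) of an ageing component."""
    return (k / lam) * (t / lam) ** (k - 1)

def survival(t, k=2.5, lam=1000.0):
    """Probability the component survives past time t."""
    return np.exp(-((t / lam) ** k))

t = np.linspace(0, 2000, 201)       # operating hours (illustrative horizon)
risk = 1.0 - survival(t)            # cumulative failure probability by time t
alert = t[np.argmax(risk > 0.05)]   # earliest time the 5% risk bound is crossed
print(f"hazard at alert: {weibull_hazard(alert):.2e}/h; inspect before ~{alert:.0f} h")
```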

29 pages, 9465 KB  
Article
Modeling Seasonal Fire Probability in Thailand: A Machine Learning Approach Using Multiyear Remote Sensing Data
by Enikoe Bihari, Karen Dyson, Kayla Johnston, Daniel Marc G. dela Torre, Akkarapon Chaiyana, Karis Tenneson, Wasana Sittirin, Ate Poortinga, Veerachai Tanpipat, Kobsak Wanthongchai, Thannarot Kunlamai, Elijah Dalton, Chanarun Saisaward, Marina Tornorsam, David Ganz and David Saah
Remote Sens. 2025, 17(19), 3378; https://doi.org/10.3390/rs17193378 - 7 Oct 2025
Abstract
Seasonal fires in northern Thailand are a persistent environmental and public health concern, yet existing fire probability mapping approaches in Thailand rely heavily on subjective multi-criteria analysis (MCA) methods and temporally static data aggregation methods. To address these limitations, we present a flexible, replicable, and operationally viable seasonal fire probability mapping methodology using a Random Forest (RF) machine learning model in the Google Earth Engine (GEE) platform. We trained the model on historical fire occurrence and fire predictor layers from 2016–2023 and applied it to 2024 conditions to generate a probabilistic fire prediction. Our novel approach improves upon existing operational methods and scientific literature in several ways. It uses a more representative sample design which is agnostic to the burn history of fire presences and absences, pairs fire and fire predictor data from each year to account for interannual variation in conditions, empirically refines the most influential fire predictors from a comprehensive set of predictors, and provides a reproducible and accessible framework using GEE. Predictor variables include both socioeconomic and environmental drivers of fire, such as topography, fuels, potential fire behavior, forest type, vegetation characteristics, climate, water availability, crop type, recent burn history, and human influence and accessibility. The model achieves an Area Under the Curve (AUC) of 0.841 when applied to 2016–2023 data and 0.848 when applied to 2024 data, indicating strong discriminatory power despite the additional spatial and temporal variability introduced by our sample design. The highest fire probabilities emerge in forested and agricultural areas at mid elevations and near human settlements and roads, which aligns well with the known anthropogenic drivers of fire in Thailand. Distinct areas of model uncertainty are also apparent in cropland and forests which are only burned intermittently, highlighting the importance of accounting for localized burning cycles. Variable importance analysis using the Gini Impurity Index identifies both natural and anthropogenic predictors as key and nearly equally important predictors of fire, including certain forest and crop types, vegetation characteristics, topography, climate, human influence and accessibility, water availability, and recent burn history. Our findings demonstrate the heavy influence of data preprocessing and model design choices on model results. The model outputs are provided as interpretable probability maps and the methods can be adapted to future years or augmented with local datasets. Our methodology presents a scalable advancement in wildfire probability mapping with machine learning and open-source tools, particularly for data-constrained landscapes. It will support Thailand's fire managers in proactive fire response and planning and also inform broader regional fire risk assessment efforts.
(This article belongs to the Special Issue Remote Sensing in Hazards Monitoring and Risk Assessment)
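In the Google Earth Engine Python API, a probabilistic RF prediction of the kind the authors describe is typically configured as below; the asset paths, the 'fire' property, the tree count, and the scale are hypothetical placeholders, not the study's settings:

```python
import ee

ee.Initialize()  # may require a Cloud project argument in newer client versions

# Hypothetical assets: 'fire_samples' holds presence/absence points with a 'fire'
# property; 'predictor_stack' is a multi-band image of one year's fire drivers.
samples = ee.FeatureCollection("users/example/fire_samples_2016_2023")
predictors = ee.Image("users/example/predictor_stack_2024")

training = predictors.sampleRegions(
    collection=samples, properties=["fire"], scale=500  # scale is illustrative
)

rf = (
    ee.Classifier.smileRandomForest(numberOfTrees=500)
    .setOutputMode("PROBABILITY")  # per-pixel fire probability rather than a class
    .train(
        features=training,
        classProperty="fire",
        inputProperties=predictors.bandNames(),
    )
)

fire_prob_2024 = predictors.classify(rf)  # probabilistic fire prediction map
```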

23 pages, 1004 KB  
Review
Toward Transparent Modeling: A Scoping Review of Explainability for Arabic Sentiment Analysis
by Afnan Alsehaimi, Amal Babour and Dimah Alahmadi
Appl. Sci. 2025, 15(19), 10659; https://doi.org/10.3390/app151910659 - 2 Oct 2025
Abstract
The increasing prevalence of Arabic text in digital media offers significant potential for sentiment analysis. However, challenges such as linguistic complexity and limited resources make Arabic sentiment analysis (ASA) particularly difficult. In addition, explainable artificial intelligence (XAI) has become crucial for improving the transparency and trustworthiness of artificial intelligence (AI) models. This paper addresses the integration of XAI techniques in ASA through a scoping review of recent developments. The study critically identifies trends in model usage, examines explainability methods, and explores how these techniques enhance the explainability of model decisions. This review is crucial for consolidating fragmented efforts, identifying key methodological trends, and guiding future research in this emerging area. Online databases (IEEE Xplore, ACM Digital Library, Scopus, Web of Science, ScienceDirect, and Google Scholar) were searched to identify papers published between 1 January 2016 and 31 March 2025. The last search across all databases was conducted on 1 April 2025. From these, 19 peer-reviewed journal articles and conference papers focusing on ASA with explicit use of XAI techniques were selected for inclusion. This time frame was chosen to capture the most recent decade of research, reflecting advances in deep learning, transformer-based models, and explainable AI methods. The findings indicate that transformer-based models and deep learning approaches dominate in ASA, achieving high accuracy, and that local interpretable model-agnostic explanations (LIME) is the most widely used explainability tool. However, challenges such as dialectal variation, small or imbalanced datasets, and the black-box nature of advanced models persist. To address these challenges, future research directions should include the creation of richer Arabic sentiment datasets, the development of hybrid explainability models, and the enhancement of adversarial robustness.
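For reference, the LIME text workflow the review identifies as most widely used looks like this minimal sketch; the toy English classifier stands in for the Arabic transformer models actually reviewed:

```python
from lime.lime_text import LimeTextExplainer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-in for a sentiment classifier (real ASA systems are transformer-based)
texts = ["good service", "bad service", "great quality", "poor quality"]
labels = [1, 0, 1, 0]
clf = make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(texts, labels)

explainer = LimeTextExplainer(class_names=["negative", "positive"])
exp = explainer.explain_instance("great service, poor packaging",
                                 clf.predict_proba, num_features=4)
print(exp.as_list())  # per-word weights toward each sentiment class
```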

25 pages, 8881 KB  
Article
Evaluating Machine Learning Techniques for Brain Tumor Detection with Emphasis on Few-Shot Learning Using MAML
by Soham Sanjay Vaidya, Raja Hashim Ali, Shan Faiz, Iftikhar Ahmed and Talha Ali Khan
Algorithms 2025, 18(10), 624; https://doi.org/10.3390/a18100624 - 2 Oct 2025
Abstract
Accurate brain tumor classification from MRI is often constrained by limited labeled data. We systematically compare conventional machine learning, deep learning, and few-shot learning (FSL) for four classes (glioma, meningioma, pituitary, no tumor) using a standardized pipeline. Models are trained on the Kaggle Brain Tumor MRI Dataset and evaluated across dataset regimes (100%→10%). We further test generalization on BraTS and quantify robustness to resolution changes, acquisition noise, and modality shift (T1→FLAIR). To support clinical trust, we add visual explanations (Grad-CAM/saliency) and report per-class results (confusion matrices). A fairness-aligned protocol (shared splits, optimizer, early stopping) and a complexity analysis (parameters/FLOPs) enable balanced comparison. With full data, Convolutional Neural Networks (CNNs)/Residual Networks (ResNets) perform strongly but degrade with 10% data; Model-Agnostic Meta-Learning (MAML) retains competitive performance (AUC-ROC ≥ 0.9595 at 10%). Under cross-dataset validation (BraTS), FSL—particularly MAML—shows smaller performance drops than CNN/ResNet. Variability tests reveal FSL's relative robustness to down-resolution and noise, although modality shift remains challenging for all models. Interpretability maps confirm correct activations on tumor regions in true positives and explain systematic errors (e.g., "no tumor"→pituitary). Conclusion: FSL provides accurate, data-efficient, and comparatively robust tumor classification under distribution shift. The added per-class analysis, interpretability, and complexity metrics strengthen clinical relevance and transparency.
(This article belongs to the Special Issue Machine Learning Models and Algorithms for Image Processing)
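A minimal PyTorch sketch of the MAML inner/outer update discussed above; the task batching, learning rates, and functional-call style are assumptions (PyTorch ≥ 2.0 is required for torch.func), not the paper's implementation:

```python
import torch
import torch.nn.functional as F

def maml_step(model, tasks, outer_opt, inner_lr=0.01):
    """One meta-update: adapt on each task's support set, score on its query set."""
    params = dict(model.named_parameters())
    meta_loss = 0.0
    for xs, ys, xq, yq in tasks:  # (support_x, support_y, query_x, query_y)
        logits = torch.func.functional_call(model, params, (xs,))
        grads = torch.autograd.grad(
            F.cross_entropy(logits, ys), list(params.values()), create_graph=True
        )
        fast = {  # inner-loop adapted weights (one gradient step)
            name: p - inner_lr * g
            for (name, p), g in zip(params.items(), grads)
        }
        q_logits = torch.func.functional_call(model, fast, (xq,))
        meta_loss = meta_loss + F.cross_entropy(q_logits, yq)
    outer_opt.zero_grad()
    meta_loss.backward()  # second-order gradients flow through the inner step
    outer_opt.step()
```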

30 pages, 401 KB  
Systematic Review
Explainable Artificial Intelligence and Machine Learning for Air Pollution Risk Assessment and Respiratory Health Outcomes: A Systematic Review
by Israel Edem Agbehadji and Ibidun Christiana Obagbuwa
Atmosphere 2025, 16(10), 1154; https://doi.org/10.3390/atmos16101154 - 1 Oct 2025
Abstract
Air pollution is a leading environmental risk that causes respiratory morbidity and mortality. The increasing availability of high-resolution environmental data and air pollution-related health cases has accelerated the use of machine learning (ML) models to estimate environmental exposure–response relationships, forecast health risks, and inform policy and practical interventions. Unfortunately, ML models are opaque in the sense that it is unclear how they combine various data inputs to reach a decision, which limits their trust and use in clinical settings. Explainable artificial intelligence (xAI) models offer the necessary techniques to ensure transparent and interpretable models. This systematic review explores online data repositories through the lens of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guideline to synthesize articles from 2020 to 2025. Inclusion and exclusion criteria narrowed the search to a final selection of 92 articles, which were thoroughly reviewed by independent researchers to reduce bias in article assessment. The ROBINS-I (Risk Of Bias In Non-randomized Studies of Interventions) domain strategy further reduced the risk of bias in article assessment and supported reproducibility. The findings reveal a growing adoption of ML techniques such as random forests, XGBoost, parallel lightweight diagnosis models, and deep neural networks for health risk prediction, with SHAP (SHapley Additive exPlanations) emerging as the dominant technique for model interpretability. The extremely randomized tree (ERT) technique demonstrated optimal performance but lacks explainability. The limitations of these models include generalizability, data limitations, and policy translation. The review finds limited research on integrating LIME (Local Interpretable Model-Agnostic Explanations) into current ML models and recommends that future research focus on causal xAI-ML models. The use of such models for respiratory health issues may be complemented by a medical professional's opinion.
(This article belongs to the Section Air Quality and Health)

28 pages, 3628 KB  
Article
From Questionnaires to Heatmaps: Visual Classification and Interpretation of Quantitative Response Data Using Convolutional Neural Networks
by Michael Woelk, Modelice Nam, Björn Häckel and Matthias Spörrle
Appl. Sci. 2025, 15(19), 10642; https://doi.org/10.3390/app151910642 - 1 Oct 2025
Abstract
Structured quantitative data, such as survey responses in human resource management research, are often analysed using machine learning methods, including logistic regression. Although these methods provide accurate statistical predictions, their results are frequently abstract and difficult for non-specialists to comprehend. This limits their usefulness in practice, particularly in contexts where eXplainable Artificial Intelligence (XAI) is essential. This study proposes a domain-independent approach for the autonomous classification and interpretation of quantitative data using visual processing. This method transforms individual responses based on rating scales into visual representations, which are subsequently processed by Convolutional Neural Networks (CNNs). In combination with Class Activation Maps (CAMs), image-based CNN models enable not only accurate and reproducible classification but also visual interpretability of the underlying decision-making process. Our evaluation found that CNN models with bar chart coding achieved accuracies between 93.05% and 93.16%, comparable to the 93.19% achieved by logistic regression. Compared with conventional numerical approaches, exemplified by logistic regression in this study, the approach achieves comparable classification accuracy while providing additional comprehensibility and transparency through graphical representations. Robustness is demonstrated by consistent results across different visualisations generated from the same underlying data. By converting abstract numerical information into visual explanations, this approach addresses a core challenge: bridging the gap between model performance and human understanding. Its transparency, domain-agnostic design, and straightforward interpretability make it particularly suitable for XAI-driven applications across diverse disciplines that use quantitative response data.
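A minimal Grad-CAM sketch (a gradient-based relative of the CAMs used in the study) on a placeholder CNN, with a random tensor standing in for an encoded bar-chart image:

```python
import torch
import torchvision.models as models

model = models.resnet18(weights=None).eval()  # placeholder CNN
feats, grads = {}, {}

# Capture the last conv block's activations and their gradients
model.layer4.register_forward_hook(lambda m, i, o: feats.update(a=o))
model.layer4.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))

x = torch.randn(1, 3, 224, 224)  # stand-in for an encoded bar-chart image
model(x)[0].max().backward()     # gradient of the top class score

w = grads["a"].mean(dim=(2, 3), keepdim=True)  # channel weights
cam = torch.relu((w * feats["a"]).sum(dim=1))  # coarse activation map
cam = cam / (cam.max() + 1e-8)                 # normalize; upsample to overlay
```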

22 pages, 701 KB  
Article
CuBE: A Customizable Bounds Evaluation Framework for Automated Assessment of RAG Systems in Government Services
by Bolun Yang, Xuhong Yu, Xin Zheng, Jing Nong, Zhentao Liu, Xinmin Dai and Xiaoyao Xie
Appl. Sci. 2025, 15(19), 10447; https://doi.org/10.3390/app151910447 - 26 Sep 2025
Abstract
Retrieval-Augmented Generation (RAG) systems are increasingly adopted in government services, yet different administrations have varying customization needs and lack standardized methods to evaluate performance. In particular, general-purpose evaluation approaches fail to show how well a system meets domain-specific expectations. This paper presents CuBE (Customizable Bounds Evaluation), a tailored evaluation framework for RAG systems in public administration. CuBE integrates large language model (LLM) scoring, customizable evaluation dimensions, and a bounded scoring paradigm with baseline and upper-bound reference sets, enhancing fairness, consistency, and interpretability. We further introduce Lightweight Targeted Assessment (LTA) to support efficient customization. CuBE is validated on GSIA (Guizhou Provincial Government Service Center Intelligent Assistant) using four state-of-the-art language models. The results show that CuBE produces robust, stable, and model-agnostic evaluations while reducing reliance on manual annotation and facilitating system optimization and rapid iteration. Moreover, CuBE informs parameter settings, enabling developers to design RAG systems that better meet customizer needs. This study establishes a replicable paradigm for trustworthy and efficient evaluation of RAG systems in complex government service scenarios.
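One plausible reading of the bounded scoring paradigm, offered only as a hypothetical sketch (the abstract does not give CuBE's actual formula): rescale the judge's score between the baseline and upper-bound reference sets.

```python
def bounded_score(raw, baseline, upper):
    """Rescale an LLM-judge score so 0 = baseline reference, 1 = upper bound.

    Hypothetical reading of CuBE's bounded paradigm: 'raw' scores the system
    answer; 'baseline'/'upper' score the two reference answer sets.
    """
    span = max(upper - baseline, 1e-9)  # guard against degenerate bounds
    return min(max((raw - baseline) / span, 0.0), 1.0)

# e.g. judge gives 7.2; baseline set averaged 4.0 and upper-bound set 9.0
print(bounded_score(7.2, 4.0, 9.0))  # -> 0.64
```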

19 pages, 1025 KB  
Article
Research on Trade Credit Risk Assessment for Foreign Trade Enterprises Based on Explainable Machine Learning
by Mengjie Liao, Wanying Jiao and Jian Zhang
Information 2025, 16(10), 831; https://doi.org/10.3390/info16100831 - 26 Sep 2025
Abstract
As global economic integration deepens, import and export trade plays an increasingly vital role in China's economy. To enhance regulatory efficiency and achieve scientific, transparent credit supervision, this study proposes a trade credit risk evaluation model based on interpretable machine learning, incorporating loss preferences. Key risk features are identified through a comprehensive interpretability framework combining SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME), forming an optimal feature subset. Using Light Gradient Boosting Machine (LightGBM) as the base model, a weight adjustment strategy is introduced to reduce costly misclassification of high-risk enterprises, effectively improving their recognition rate. However, this adjustment leads to a decline in overall accuracy. To address this trade-off, a Bagging ensemble framework is applied, which restores and slightly improves accuracy while maintaining low misclassification costs. Experimental results demonstrate that the interpretability framework improves transparency and business applicability, the weight adjustment strategy enhances high-risk enterprise detection, and Bagging balances the overall classification performance. The proposed method ensures reliable identification of high-risk enterprises while preserving overall model robustness, thereby providing strong practical value for enterprise credit risk assessment and decision-making.
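A minimal sketch of the weight-adjustment-plus-Bagging recipe with LightGBM and scikit-learn; the class-weight ratio, data, and ensemble size are illustrative, not the paper's settings:

```python
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score

# Placeholder data: class 1 = high-risk enterprise (rare and costly to miss)
X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)

# Loss preference: penalize misclassified high-risk firms more heavily
base = LGBMClassifier(class_weight={0: 1, 1: 5}, random_state=0)

# Bagging over the weighted learner to recover overall accuracy
model = BaggingClassifier(estimator=base, n_estimators=10, random_state=0)
print(cross_val_score(model, X, y, scoring="recall").mean())
```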

26 pages, 2197 KB  
Article
LLM-Driven Sentiment Analysis in MD&A: A Multi-Agent Framework for Corporate Misconduct Prediction
by Yeling Liu, Yongkang Liu and Kai Yang
Systems 2025, 13(10), 839; https://doi.org/10.3390/systems13100839 - 24 Sep 2025
Abstract
The textual analysis of Management Discussion and Analysis (MD&A) reveals valuable insights into corporate operational performance and future risks. However, techniques for accurately extracting sentiment from unstructured Chinese MD&A texts still lack comprehensiveness. Existing sentiment analysis studies often use lexicon-based methods, which rely on predefined, context-agnostic word lists and on accurate Chinese word segmentation, and which struggle with domain-specific terminology, leading to limited accuracy and interpretability. Although research has attempted to develop context-aware lexicons and language models, these methods still face limitations when applied to long and complex financial texts. To address these limitations, we propose MDARisk, a novel framework for corporate misconduct prediction. The core of MDARisk is a MultiSenti module, which leverages a multi-agent LLM approach to extract comprehensive and contextual sentiment from MD&A. Unlike lexicon methods, our LLM-based module interprets words based on their surrounding semantic context, allowing it to decipher nuanced expressions and specialized financial language. We first conduct an econometric validation using fixed-effects logit models to test whether the MultiSenti-derived MD&A sentiment is significantly associated with subsequent corporate misconduct. We then evaluate out-of-sample predictive utility by adding this sentiment feature to multiple classifiers and assessing its incremental gains over the baseline model. Empirical results demonstrate that our approach provides a more reliable sentiment-based indicator for misconduct risk, achieves higher predictive accuracy, and outperforms traditional financial sentiment analysis approaches. Our MDARisk framework provides a cost-efficient approach for automated disclosure screening, benefiting auditors, regulators, and investors in assessing potential misconduct risks.
(This article belongs to the Topic Agents and Multi-Agent Systems)
