Integrating Model Explainability and Uncertainty Quantification for Trustworthy Fraud Detection

Mapaila, Tebogo Forster; Senekane, Makhamisa

doi:10.3390/technologies14040212

Open Access Article


by Tebogo Forster Mapaila ¹ and Makhamisa Senekane ¹,²,*

¹ Institute for Intelligent Systems, University of Johannesburg, Johannesburg Auckland Park 2006, South Africa

² National Institute for Theoretical and Computational Sciences, Stellenbosch 7602, South Africa

* Author to whom correspondence should be addressed.

Technologies 2026, 14(4), 212; https://doi.org/10.3390/technologies14040212

Submission received: 31 December 2025 / Revised: 27 January 2026 / Accepted: 3 February 2026 / Published: 3 April 2026

(This article belongs to the Special Issue Privacy-Preserving and Trustworthy AI for Industrial 4.0 and Beyond)


Abstract

Financial fraud and money laundering continue to challenge financial stability and regulatory oversight, motivating the widespread adoption of machine learning models for transaction monitoring. Although ensemble models such as Random Forest and XGBoost achieve strong predictive performance, their deployment in high-stakes financial environments is constrained by limited interpretability, overconfident predictions, and the absence of principled mechanisms for expressing decision uncertainty. Emerging regulatory expectations increasingly emphasise transparency, accountability, and operational reliability, underscoring the need for evaluation frameworks that extend beyond predictive accuracy. This study proposes the Integrated Transparency and Confidence Framework (ITCF), a deployment-oriented approach that unifies model explainability, statistically valid uncertainty quantification, and operational decision support for fraud detection. ITCF combines instance-level explanations generated via Local Interpretable Model-Agnostic Explanations (LIME) with distribution-free uncertainty estimation using split conformal prediction. The framework incorporates selective explainability, abstention-based routing, and uncertainty-driven triage to support human-in-the-loop review. Using the PaySim dataset of 6,362,620 mobile-money transactions, Random Forest and XGBoost models are evaluated under extreme class imbalance using F1-score, AUC–ROC, and Matthews Correlation Coefficient (MCC). At a target coverage level of 90% (α = 0.1), both models achieve empirical coverage close to the target level, with XGBoost producing smaller prediction sets and superior recall, MCC, and latency. ITCF provides transaction-level explanations for uncertain cases and specifies an auditable workflow that is intended to support transparency, traceability, and risk-aware human review, thereby enabling defensible human decision-making in regulated environments. Overall, this study illustrates how explainability and uncertainty quantification can be combined in a deployment-oriented evaluation workflow while noting that real-world validation remains a future endeavour.

Keywords:

artificial intelligence (AI); conformal prediction (CP); explainable artificial intelligence (XAI); financial crime analytics; fraud detection; anti-money laundering (AML); predictive uncertainty; random forest (RF); uncertainty quantification (UQ); XGBoost (XGB)


1. Introduction

1.1. Background and Motivation

Financial fraud and money laundering continue to undermine trust in financial institutions and place sustained pressure on regulatory oversight. As digital and mobile payment ecosystems expand, adversaries increasingly exploit weaknesses in identity verification, transaction velocity, and cross-channel coordination, reducing the effectiveness of static rule-based monitoring. In practice, many deployed detection pipelines reduce model outputs to binary risk labels accompanied by uniform post hoc explanations, limiting the ability to prioritise analyst effort or to distinguish confident predictions from genuinely ambiguous cases. Prior reviews highlight a persistent gap between advances in explainability and uncertainty estimation and their practical integration into governance-oriented workflows that support traceability and human oversight [1,2].

In response, machine learning (ML) and artificial intelligence (AI) techniques have become integral to contemporary financial crime detection systems [3,4]. Ensemble learning models, including Random Forest and Extreme Gradient Boosting (XGBoost), demonstrate strong predictive performance, particularly in highly imbalanced scenarios where fraudulent transactions represent a small fraction of total observations. However, these models are frequently criticised for their lack of transparency, as their internal decision processes are not easily interpretable by human stakeholders. Conversely, interpretable models such as Generalised Additive Models (GAMs) provide greater transparency while maintaining competitive accuracy, though they may not achieve the predictive performance of ensemble methods in complex datasets. This trade-off highlights the need to integrate interpretability with ensemble accuracy to improve both transparency and predictive performance.

In regulated, high-risk sectors such as financial services, predictive accuracy alone is insufficient. Institutions are required to justify automated or semi-automated decisions to regulators, auditors, internal risk committees, and affected customers. The opacity of black-box models poses significant challenges for governance, auditability, and accountability, directly affecting the deployment of ML-based fraud detection systems [5]. Supervisory commentary and policy discussions increasingly emphasise explainability and auditability as important considerations for AI systems deployed in financial services [6,7].

1.2. Explainability and Predictive Uncertainty

Explainable Artificial Intelligence (XAI) has emerged to address concerns regarding the interpretability of model behaviour. Post hoc explanatory techniques, including Local Interpretable Model-Agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP), are widely adopted and frequently cited in regulatory and policy discussions on trustworthy AI [7,8]. In South Africa, the joint Financial Sector Conduct Authority (FSCA) and Prudential Authority (PA) report, Artificial Intelligence in the South African Financial Sector [6], reinforces this emphasis by identifying transparency, accountability, and explainability as essential requirements for responsible AI adoption in financial services.

However, explainability alone does not indicate the confidence level of a model’s predictions. In financial crime detection, where class imbalance is pronounced and decision consequences are asymmetric, probabilistic outputs may appear definitive even when predictions are unreliable. Overconfident model outputs can lead to missed detections, increased customer friction, inefficient investigation throughput, and heightened governance and compliance risk when automated outputs cannot be defended or audited [5,6,9]. Previous research demonstrates that neglecting predictive uncertainty can produce systems that obscure their limitations and undermine risk-based decision-making [10,11].

Uncertainty Quantification (UQ) offers a complementary approach by explicitly conveying the reliability of individual predictions. Conformal Prediction (CP) establishes a statistically rigorous framework for UQ, providing finite-sample, distribution-free coverage guarantees under independent and identically distributed (IID) assumptions. These characteristics render CP especially suitable for regulated and risk-sensitive domains, including financial crime detection.

1.3. Operational and Regulatory Implications of Separate Treatment

Although XAI and uncertainty quantification are well-established research domains, they are often implemented separately in deployed systems [12]. In operational anti-money laundering (AML) systems, this separation introduces three specific risks: (i) escalation of low-quality alerts, (ii) overconfidence in borderline cases, and (iii) reduced audit defensibility during supervisory reviews. Explanations may be provided without reference to predictive confidence, while uncertainty estimates may lack the contextual interpretability necessary for informed escalation or intervention decisions [13].

In transaction-monitoring environments, alert volumes are substantial, and class imbalance is significant. Industry reports frequently indicate that traditional rules-based transaction monitoring can produce false-positive rates above 90%, contributing to alert fatigue and constrained investigative capacity [14]. When explainability is provided without calibrated confidence information, analysts may receive plausible explanations for predictions that remain unreliable, fostering unwarranted trust in borderline cases. In addition, presenting uncertainty measures without explanatory context offers limited insight into the reasons for transaction uncertainty, thereby weakening defensible risk-based routing [15].

These challenges are intensified in regulated environments, where institutions are required to demonstrate robust governance, transparency, and accountability in AI-assisted decision-making. Recent South African supervisory commentary highlights responsible AI risk management, consumer protection, and auditability as key supervisory priorities [6]. As a result, a governance gap may arise in which institutions cannot justify both the rationale and the reliability of automated alerts at the point of decision-making.

1.4. Contribution and Scope

To address these gaps, the study proposes an Integrated Transparency and Confidence Framework (ITCF) for financial fraud detection. The framework integrates established explainability and uncertainty quantification techniques into a coherent, auditable, and risk-based operational workflow. This approach emphasises practical deployment by embedding these techniques within existing workflows, thereby enhancing analyst usability and supporting compliance with regulatory requirements.

The primary contributions of this study are as follows:

A deployment-oriented framework that integrates local explainability (LIME) with set-valued uncertainty quantification (split conformal prediction) for fraud detection under severe class imbalance.
A selective explainability strategy in which explanations are generated only for abstained or high-uncertainty cases, reducing interpretability overhead and analyst cognitive load.
An uncertainty-driven human-in-the-loop triage mechanism that uses conformal abstention and predictive entropy to route ambiguous cases for review.
An empirical evaluation on a large-scale transaction dataset analysing predictive performance, uncertainty behaviour, calibration diagnostics, and operational latency.

The contribution of ITCF lies not in algorithmic innovation but in the governance and operations layer, where uncertainty serves as an early-warning indicator for potential issues and explanations are attached to flagged cases as decision evidence. The novelty lies in specifying when explanations should be generated, how uncertainty should trigger escalation, and how these elements interact within a deployable fraud-triage workflow. This approach transforms two established techniques into a compliance-supporting workflow that is both usable by analysts and auditable by oversight functions. Although operational and regulatory considerations motivate the proposed framework, the discussion of deployment remains conceptual. No real-world implementation or pilot study has been conducted at this stage. This clarification is intended to set appropriate expectations regarding the empirical support for the framework and to indicate directions for future validation.

1.5. Relationship to Existing Research

Research on explainable fraud detection has largely focused on post hoc interpretation of model outputs, often without explicitly accounting for predictive uncertainty [16]. Recent surveys and systematic reviews document extensive use of feature attribution methods, rule extraction, and local explanation techniques, while also highlighting persistent challenges related to highly imbalanced data, interpretability, reliability, and practical deployment constraints in financial systems [17,18,19]. This study emphasises that strong predictive accuracy alone is insufficient for operational adoption in regulated environments.

Uncertainty quantification has been investigated in parallel within financial applications and other trust-critical domains. Common methodologies include Bayesian models, deep ensembles, and heuristic confidence scores [20,21]. Although these approaches can indicate uncertainty, they often lack formal, finite-sample guarantees, making them challenging to justify in high-risk contexts where decisions require auditability and defensibility. Furthermore, as highlighted by [15], uncertainty is frequently considered independently from explainability, which may diminish its utility in supporting analyst decision-making. Recent research in detection domains beyond financial fraud emphasises the need to explicitly manage uncertainty and variability within operational workflows. For example, mixture-of-experts models have been proposed to enhance robustness and generalisation in face forgery detection [22]. Although the application domain is distinct, these studies reinforce the broader conclusion that effective detection systems are strengthened by designs that account for uncertainty and enable selective human intervention.

A structured comparison between the proposed ITCF framework and representative prior approaches is provided in Table 1.

In both fraud-specific and related literature, existing systems often reduce model outputs to binary risk labels or static explanations uniformly applied across cases. This approach restricts the prioritisation of analyst effort and impedes differentiation between confident predictions and genuinely ambiguous instances. Prior reviews have identified a persistent gap between methodological advances in explainability and uncertainty estimation and their practical integration into decision workflows that facilitate governance, traceability, and human oversight.

Within this context, the Integrated Transparency and Confidence Framework (ITCF) is positioned as an applied, deployment-oriented integration rather than a novel methodological contribution. By combining split conformal prediction with selective, instance-level explainability, ITCF illustrates how predictive uncertainty can be operationalised as a control signal for human-in-the-loop review, with explanations generated only where they are most informative. This integration addresses a practical gap in existing fraud detection systems, particularly in relation to governance, traceability, and decision support in regulated financial crime environments. ITCF does not directly address inherent model bias, concept drift, or fairness constraints, which remain distinct and well-recognised challenges in operational fraud detection. Future work may explore extending ITCF by integrating bias mitigation techniques and adaptive modelling approaches to better accommodate evolving data distributions and fairness requirements in dynamic settings.

1.6. Advantages, Limitations, and Trade-Offs

The Integrated Transparency and Confidence Framework (ITCF) is an applied, deployment-oriented integration of established techniques. Consequently, its advantages are accompanied by specific limitations and design trade-offs that must be considered to ensure accurate interpretation of the results.

Probability calibration scope: Although Maximum Calibration Error (MCE) is reported to characterise worst-case probabilistic miscalibration, this study does not assert that the outputs are probability-calibrated. The ITCF uses conformal prediction for uncertainty management, and calibration diagnostics are provided for completeness rather than as the foundation for automated decision-making.
Choice of explainability method: The framework employs LIME for local, instance-level explainability because of its model-agnostic properties and operational flexibility. However, LIME exhibits variability due to stochastic perturbation sampling. In this study, explanations are considered decision-support artefacts rather than definitive ground truth. Future research will investigate complementary or more stable approaches, such as SHAP or counterfactual explanations, and will apply formal stability metrics to evaluate the consistency of LIME explanations across multiple runs and to compare its robustness with these alternatives.
Computational considerations: Generating LIME explanations introduces additional computational overhead, potentially challenging strict real-time deployment requirements. The ITCF addresses this by generating explanations only for uncertain cases. However, latency-sensitive environments may require further optimisation or alternative explanation strategies.
Marginal coverage guarantees: The ITCF’s uncertainty guarantees are marginal rather than class-conditional. Although this approach is suitable for the current evaluation, class-conditional or label-conditional conformal methods may offer improved control in highly imbalanced fraud detection scenarios and represent a logical extension of this research.
Dataset limitations: The experimental evaluation is based on the PaySim synthetic transaction dataset. While this dataset is widely used for controlled experimentation, synthetic data may not fully capture the complexity, behavioural adaptation, and institutional constraints of real-world financial systems, limiting the direct generalisability of the findings.
Model and architectural scope: This study focuses on two tree-based ensemble models, Random Forest and XGBoost, chosen for their strong performance on tabular data and computational efficiency. Although prior research indicates that these models often outperform deep learning architectures in structured financial datasets [26,27], these findings may not be directly applicable to other model classes, such as neural networks or graph-based approaches.
Static evaluation setting: All analyses in this study were conducted under the assumption of static data. In operational environments, fraud patterns evolve over time, and this research did not address the effects of concept drift on predictive performance, uncertainty calibration, or explanation reliability.

2. Materials and Methods

2.1. Integrated Transparency and Confidence Framework (ITCF)

The Integrated Transparency and Confidence Framework (ITCF) unifies uncertainty quantification and model explainability within a deployment-oriented decision workflow for fraud detection. Instead of addressing uncertainty and interpretability as separate diagnostics, the framework incorporates them as interdependent elements within a risk-based decision process.

The workflow proceeds through four sequential roles, each passing its output to the next:

1. Uncertainty identification: Split conformal prediction determines when a model prediction cannot be made with sufficient confidence under a predefined coverage level.
2. Selective explainability: Local explanations (LIME) are generated only for transactions flagged as uncertain or abstained, providing insight into why a prediction is ambiguous.
3. Operational triage: Transactions associated with conformal abstention are routed for human review, with entropy used as a secondary prioritisation signal.
4. Decision support: Explanations accompany escalated cases as contextual evidence to guide analyst judgement, rather than as automated decision rules.

This design intentionally avoids generating explanations for all predictions, thereby reducing computational overhead and mitigating the risk of false interpretability in high-confidence cases. Explanations are provided only when the model’s confidence is low or when the prediction is abstained, ensuring efficient resource allocation and maintaining interpretability where it is most critical.

2.1.1. End-to-End ITCF Workflow

The end-to-end operation of the Integrated Transparency and Confidence Framework (ITCF) is illustrated and summarised below. The workflow explicitly incorporates uncertainty-aware decision logic and conditional explainability to support operational governance and analyst prioritisation.

2.1.2. Workflow Description

Incoming transactions are first subjected to standard feature engineering and validation before being processed by a base machine learning classifier (Random Forest or XGBoost), which produces point probability estimates. These estimates are subsequently passed to a split conformal prediction module to generate calibrated prediction sets with formal coverage guarantees. Decision logic is applied based on the structure of the prediction set: confident single-label outputs may be considered eligible for low-touch handling, while multi-label or empty prediction sets are interpreted as indicators of ambiguity or elevated decision risk and trigger abstention. Explainability is applied conditionally using Local Interpretable Model-agnostic Explanations (LIME), ensuring that explanations are generated only for abstained or high-uncertainty cases. This design choice avoids routine post hoc explanation of confident predictions and instead positions interpretability as a targeted decision-support mechanism. The final output is an integrated risk report combining model score, uncertainty signal, and local explanation, which is routed to a human analyst for review, escalation, or clearance, thereby supporting auditability and human-in-the-loop governance.
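The decision logic on the structure of the prediction set can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the implementation used in the study: the names `Route`, `route_transaction`, and `needs_explanation` are hypothetical, and labels are assumed to be encoded as 0 (legitimate) and 1 (fraud).

```python
from enum import Enum


class Route(Enum):
    """Hypothetical handling routes for a scored transaction."""
    LOW_TOUCH = "low_touch"        # confident single-label prediction set
    HUMAN_REVIEW = "human_review"  # abstention: empty or multi-label set


def route_transaction(prediction_set: set) -> Route:
    """Map a conformal prediction set to a handling route.

    A set with exactly one label is treated as confident. An empty set or
    a set containing both labels is treated as ambiguous and triggers
    abstention with human review.
    """
    if len(prediction_set) == 1:
        return Route.LOW_TOUCH
    return Route.HUMAN_REVIEW


def needs_explanation(route: Route) -> bool:
    """Selective explainability: only abstained cases receive a LIME explanation."""
    return route is Route.HUMAN_REVIEW
```

Keeping the routing rule separate from the explanation trigger makes each policy choice individually auditable, which is in line with the governance focus of the framework.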

All automation-related pathways are presented conceptually to illustrate possible governance-aligned handling strategies and do not imply deployed system behaviour or validated automation outcomes.

2.2. Framework Components

The ITCF framework comprises four integrated modules: (i) data preprocessing, (ii) supervised model training, (iii) uncertainty quantification, and (iv) local explainability. Random Forest and XGBoost classifiers are representative ensemble models frequently used in tabular fraud detection. These models were selected for strong performance on tabular transaction data, computational efficiency, and compatibility with widely used post hoc explanation tools, making them practical baselines for evaluating deployment-oriented fraud detection workflows under realistic latency constraints. Although more complex models, such as neural networks, are available, they typically require greater computational resources and do not provide the same level of interpretability as Random Forest and XGBoost. The outputs from the uncertainty and explainability modules are integrated into structured risk reports and a visual dashboard to facilitate traceable, risk-aware decision-making.

2.2.1. Identification of High-Uncertainty Cases

High-uncertainty transactions are identified through indicators derived from conformal prediction outputs and probabilistic model scores. These indicators are used exclusively for operational triage and should not be interpreted as formal measures of decision risk under marginal coverage guarantees.

Conformal Abstention (Empty Prediction Regions)

In highly imbalanced binary classification contexts, empty conformal prediction regions may arise from the interaction between sharply peaked probability estimates, class imbalance, and fixed conformity thresholds under split conformal prediction [28]. Within ITCF, empty prediction regions are interpreted as conservative decision-risk indicators, conditioned on the selected significance level α, rather than as indicators of pure epistemic doubt. These abstentions identify cases where the model cannot assign a label with the required coverage guarantee and are routed for human review. Conducting sensitivity analysis over alternative α values (such as 0.05, 0.10, or 0.20) is recommended during deployment governance to align abstention rates with institutional risk tolerance and analyst capacity.
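A sensitivity analysis over α of the kind recommended here could look like the sketch below. It is illustrative only: the function names are hypothetical, the data are synthetic stand-ins drawn from a Beta distribution (not PaySim results), and the threshold uses the usual finite-sample rank ⌈(n + 1)(1 − α)⌉, which is an assumption about implementation detail.

```python
import math
import random


def conformal_threshold(cal_scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-indexed order statistic
    if rank > n:
        return float("inf")  # too few calibration points: include every label
    return sorted(cal_scores)[rank - 1]


def set_size_rates(test_probs, q_hat):
    """Fraction of empty (0), single-label (1), and two-label (2) sets."""
    counts = {0: 0, 1: 0, 2: 0}
    for p_fraud in test_probs:
        class_probs = (1.0 - p_fraud, p_fraud)
        size = sum(1 for p in class_probs if 1.0 - p <= q_hat)
        counts[size] += 1
    n = len(test_probs)
    return {size: c / n for size, c in counts.items()}


def sweep_alpha(cal_scores, test_probs, alphas=(0.05, 0.10, 0.20)):
    """Set-size mix for each candidate significance level."""
    return {a: set_size_rates(test_probs, conformal_threshold(cal_scores, a))
            for a in alphas}


# Synthetic stand-in data (illustrative only, not results from the study).
rng = random.Random(0)
cal_scores = [rng.betavariate(1, 20) for _ in range(2000)]  # 1 - P(y | x)
test_probs = [rng.betavariate(1, 20) for _ in range(2000)]  # P(fraud | x)
report = sweep_alpha(cal_scores, test_probs)
```

Because a larger α lowers the threshold, prediction sets can only shrink as α grows, so the empty-set (abstention) rate reported for α = 0.20 is never below the rate for α = 0.05. Comparing these rates with available analyst capacity is one way to choose an operating point.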

Predictive entropy is computed as

H(p) = -\sum_{c \in \{0, 1\}} p_{c} \log(p_{c}),    (1)

where p_{c} denotes the predicted probability of class c and log is the natural logarithm.

Higher entropy values indicate reduced confidence in assigning a single class [29]. Entropy serves as a secondary prioritisation signal among abstained cases, allowing analysts to address the most ambiguous transactions first. An entropy threshold of 0.5 functions as a policy parameter reflecting institutional risk appetite and analyst capacity and is used here for demonstration. Institutions may adjust this threshold according to their specific risk tolerance and available analytical resources [30]. In supplementary experiments (recommended for deployment), institutions can evaluate thresholds in the range 0.3–0.6 and report the resulting review volumes and fraud-capture rates to select a policy-consistent operating point.
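The entropy-based prioritisation can be sketched as follows. This is a minimal example, assuming natural logarithms (consistent with the maximum of log 2 ≈ 0.693 for two classes) and a hypothetical input format of (transaction_id, fraud probability) pairs for the abstained cases. The function names are illustrative.

```python
import math


def binary_entropy(p_fraud: float) -> float:
    """Predictive entropy in nats for a binary classifier, as in Equation (1)."""
    total = 0.0
    for p in (p_fraud, 1.0 - p_fraud):
        if p > 0.0:  # treat 0 * log(0) as 0
            total -= p * math.log(p)
    return total


def prioritise(cases, threshold=0.5):
    """Return cases whose entropy exceeds the threshold, most ambiguous first.

    `cases` is a list of (transaction_id, p_fraud) pairs (hypothetical format).
    """
    scored = [(tid, binary_entropy(p)) for tid, p in cases]
    flagged = [item for item in scored if item[1] > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


def review_volume(cases, thresholds=(0.3, 0.4, 0.5, 0.6)):
    """Number of cases flagged for review at each candidate threshold."""
    return {t: len(prioritise(cases, t)) for t in thresholds}
```

Calling `review_volume` on a held-out set of abstained cases gives the review counts per threshold. Pairing those counts with the number of confirmed fraud cases in each flagged group would yield the fraud-capture rates mentioned above.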

It is important to note that split conformal prediction provides set-level uncertainty guarantees via marginal coverage rather than probability calibration of model outputs. Although diagnostic calibration metrics such as MCE are reported elsewhere in this study, the ITCF does not assume or require calibrated probabilities for operational decision-making. In practical applications, stakeholders may still request probability diagnostics to gain a comprehensive understanding of model behaviour. In this context, uncertainty is interpreted through the properties of conformal prediction regions, specifically coverage, abstention rates, and region size, rather than relying solely on point probability estimates.
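The three set-level properties named here (coverage, abstention rate, and region size) can be computed directly from the prediction sets. The sketch below is illustrative: `conformal_diagnostics` and its output keys are hypothetical names, and prediction sets are assumed to be Python sets of integer labels.

```python
def conformal_diagnostics(prediction_sets, true_labels):
    """Summarise coverage, abstention, and region size for conformal outputs.

    `prediction_sets` is a list of sets of labels; `true_labels` holds the
    corresponding ground-truth labels.
    """
    n = len(true_labels)
    covered = sum(1 for s, y in zip(prediction_sets, true_labels) if y in s)
    empty = sum(1 for s in prediction_sets if len(s) == 0)
    multi = sum(1 for s in prediction_sets if len(s) > 1)
    return {
        "coverage": covered / n,            # share of sets containing the true label
        "empty_rate": empty / n,            # share of empty sets
        "multi_label_rate": multi / n,      # share of sets with more than one label
        "mean_set_size": sum(len(s) for s in prediction_sets) / n,
    }
```

Under the abstention rule described earlier, the abstention rate is the sum of `empty_rate` and `multi_label_rate`.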

2.2.2. Justification of Design Choices

Entropy Threshold Selection

In binary classification, entropy reaches its maximum value of log 2 ≈ 0.693 when class probabilities are evenly distributed. A threshold of 0.5 is selected to identify materially ambiguous cases and to exclude predictions that are nearly certain for the majority class. Rather than asserting this threshold as optimal, it is treated as a tunable parameter. A sensitivity analysis over the range 0.3 to 0.6 is recommended as part of model governance procedures.
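To make the threshold easier to interpret, it can be translated back into a probability. With natural logarithms, an entropy of 0.5 is reached at a minority-class probability of roughly 0.2, so a threshold of 0.5 flags scores between about 0.2 and 0.8. The sketch below inverts the binary entropy by bisection; the helper name `probability_at_entropy` is hypothetical.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy in nats, with 0 * log(0) treated as 0."""
    return -sum(q * math.log(q) for q in (p, 1.0 - p) if q > 0.0)


def probability_at_entropy(target: float, tol: float = 1e-10) -> float:
    """Probability p in (0, 0.5] whose binary entropy equals `target`.

    Uses bisection, which is valid because binary entropy increases
    monotonically on (0, 0.5]. Requires 0 < target <= log(2).
    """
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binary_entropy(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Probability bands implied by the recommended sensitivity range.
bands = {t: probability_at_entropy(t) for t in (0.3, 0.4, 0.5, 0.6)}
```

Each value `p` in `bands` means that scores in the interval from `p` to `1 - p` exceed the corresponding entropy threshold, which gives risk committees a probability-scale reading of the policy parameter.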

Explainability Robustness

LIME explanations are subject to variability due to stochastic perturbation sampling and local surrogate model fitting [23]. The ITCF framework addresses this by limiting LIME usage to abstained cases, where explanations function as contextual decision-support artefacts rather than as automated ground truth or as stable, causal representations of model behaviour. LIME was selected for its simplicity, interpretability, computational efficiency, and ease of integration with existing workflows, in preference to alternatives such as SHAP or counterfactual techniques. However, this choice involves trade-offs: LIME offers neither the global interpretability provided by SHAP nor the actionable insights offered by counterfactual explanations. A comprehensive comparative evaluation of LIME against alternative explanation methods was beyond the scope of this study and is identified as a direction for future model validation. Future research should investigate alternative explanation methods to enhance the robustness and transparency of fraud detection models and stakeholder trust in them.

Exchangeability Assumptions

Split conformal prediction provides marginal coverage guarantees under exchangeability assumptions [11]. In real transaction streams, non-stationarity may arise due to seasonality, behavioural adaptation, and concept drift. ITCF therefore positions conformal validity as conditional on stable data-generating processes and recommends rolling recalibration, drift monitoring, and threshold re-estimation when distributional change is detected. Operational implementations may use drift tests (e.g., Kolmogorov–Smirnov) and stability indicators (e.g., population stability index) to trigger review and recalibration when material distributional change is detected.
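The drift checks mentioned above can be sketched in plain Python. The example computes a population stability index (PSI) with equal-width bins and a two-sample Kolmogorov–Smirnov statistic for a single feature. All names are hypothetical, the binning scheme is one of several reasonable choices, and the PSI cut-off of 0.25 is a commonly quoted rule of thumb rather than a value taken from this study.

```python
import math


def population_stability_index(expected, actual, n_bins=10, eps=1e-6):
    """PSI between a reference sample and a recent sample of one feature."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant features

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1  # clamp out-of-range values
        return [max(c / len(values), eps) for c in counts]  # eps avoids log(0)

    e, a = bin_fractions(expected), bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def needs_recalibration(reference, recent, psi_limit=0.25):
    """Flag material drift using an assumed PSI rule-of-thumb limit."""
    return population_stability_index(reference, recent) > psi_limit
```

In a rolling setup, `reference` would be the feature values from the calibration window and `recent` the latest scoring window. A flag would trigger review, re-estimation of the conformal threshold, and re-examination of the entropy cut-off.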

2.2.3. Human Review and Analyst Guidance

Within the Integrated Transparency and Confidence Framework (ITCF), predictions with high uncertainty are routed for human review when entropy exceeds the predefined threshold or when the conformal prediction region is empty or ambiguous. In such cases, analysts receive a structured risk report that integrates model outputs, uncertainty indicators, and selected explainability artefacts to support informed judgement rather than automated decision-making.

Each risk report contains (i) the predicted class probabilities, (ii) the entropy score and associated uncertainty flag, (iii) the conformal prediction region, and (iv) a local LIME explanation that highlights the features most influential in the model’s prediction for the specific transaction. Analysts are instructed to interpret LIME explanations as indicators of model sensitivity and decision drivers, rather than as causal evidence or prescriptive rules.

In ambiguous cases, guidance directs analysts to assess whether the highlighted features correspond with established domain knowledge, internal fraud typologies, or known risk patterns. Special attention is given to explanations dominated by transaction amounts, balance changes, or transaction types, as these features are frequently linked to borderline fraud scenarios. Explanations with diffuse or unstable feature contributions are interpreted as indicators of increased model uncertainty and prompt further investigation rather than reliance on the model output.

Final decisions are the responsibility of the human analyst, who may corroborate model signals with supplementary information such as historical account behaviour, customer profiles, or rule-based alerts. This human-in-the-loop approach ensures that explainability supports accountability and consistency, while mitigating over-reliance on model-driven recommendations in high-risk or uncertain cases. For governance, analyst decisions on high-uncertainty cases should be documented, including whether the model recommendation was accepted, overridden, or escalated, along with a brief rationale.
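One possible shape for the risk report and the documented analyst decision is sketched below. The class and field names are hypothetical and simply mirror the four report elements and the three decision outcomes described in this section; they are not taken from an implemented system.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class AnalystAction(Enum):
    """Outcome recorded for a reviewed case."""
    ACCEPTED = "accepted"
    OVERRIDDEN = "overridden"
    ESCALATED = "escalated"


@dataclass
class RiskReport:
    """Structured evidence presented to the analyst for one transaction."""
    transaction_id: str
    class_probabilities: dict     # e.g. {"legitimate": 0.62, "fraud": 0.38}
    entropy: float                # predictive entropy in nats
    uncertainty_flag: bool        # True when entropy exceeds the policy threshold
    prediction_region: frozenset  # conformal prediction set (may be empty)
    lime_explanation: list        # (feature description, weight) pairs


@dataclass
class AnalystDecision:
    """Audit record linking a report to the analyst's documented decision."""
    report: RiskReport
    action: AnalystAction
    rationale: str
    decided_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
```

Storing the full report alongside the action and rationale keeps the evidence that the analyst saw at decision time, which supports later audit and supervisory review.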

2.3. Dataset and Preprocessing

Experiments were performed using the publicly available PaySim synthetic transaction dataset (https://www.kaggle.com/datasets/mtalaltariq/paysim-data (accessed on 3 December 2025)), comprising 6,362,620 transactions. Data were divided into training (60%), calibration (20%), and test (20%) sets using stratified sampling to preserve class imbalance. Preprocessing included label encoding for categorical features, standardisation of continuous variables, and removal of high-cardinality identifiers (e.g., nameOrig, nameDest). No resampling techniques (e.g., SMOTE) were applied in order to preserve operationally realistic base rates and decision thresholds under highly imbalanced conditions. In transaction-monitoring settings, oversampling can distort joint feature relationships (e.g., balance transitions and amount constraints) and may yield overly optimistic minority-class performance that does not reflect production alert distributions. Accordingly, the evaluation emphasises risk-based triage, abstention behaviour, and analyst routing under the natural class prevalence.
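A stratified 60/20/20 split of this kind can be sketched with the standard library alone. The function name, the seed, and the rounding rule are assumptions made for illustration. In practice a library routine such as scikit-learn's `train_test_split` with its `stratify` argument, applied twice, would be a common alternative.

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=42):
    """Split row indices into train, calibration, and test sets.

    Each class is shuffled and partitioned separately so that the class
    ratio is approximately preserved in all three subsets.
    """
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    rng = random.Random(seed)  # fixed seed for reproducibility
    train, calib, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_calib = round(fractions[1] * len(indices))
        train += indices[:n_train]
        calib += indices[n_train:n_train + n_calib]
        test += indices[n_train + n_calib:]  # remainder goes to the test set
    return train, calib, test
```

The calibration subset is kept disjoint from the training subset because split conformal prediction relies on calibration scores from data the model has not been fitted on.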

2.4. Models and Training

Random Forest and XGBoost classifiers were trained with fixed hyperparameters to prioritise reproducibility. This strategy prevents overfitting to synthetic data and ensures consistent results across runs, although it may limit optimal performance achievable through hyperparameter tuning. Model performance was evaluated using Precision, Recall, F1-score, AUC-ROC, and the Matthews Correlation Coefficient (MCC), which offers a balanced assessment under severe class imbalance [31].

2.5. Uncertainty Quantification via Split Conformal Prediction

Split conformal prediction was implemented with a target coverage of $1 - \alpha = 0.90$. The non-conformity score for a calibration instance $x$ with true class $y$ is defined as

$$\mathrm{NC}(x) = 1 - P(y \mid x). \tag{2}$$

For a new instance, class-wise scores $1 - P(k \mid x)$ are computed, and the prediction region includes any class $k$ satisfying

$$1 - P(k \mid x) \leq q_{0.90}, \tag{3}$$

where $q_{0.90}$ denotes the empirical 90th percentile of calibration scores. This procedure yields distribution-free marginal coverage guarantees under IID assumptions [11,32].
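The calibration and prediction steps can be sketched in a few lines of standard-library Python. This is a minimal illustration rather than the study's implementation: the helper names and the calibration data are hypothetical, and the quantile uses the common finite-sample correction, rank $\lceil (n+1)(1-\alpha) \rceil$, which is an assumption layered on top of the plain empirical percentile described above.

```python
import math

def conformal_quantile(scores, alpha=0.10):
    """Finite-sample-corrected (1 - alpha) quantile of the calibration scores."""
    n = len(scores)
    rank = min(math.ceil((n + 1) * (1 - alpha)), n)  # 1-based order statistic
    return sorted(scores)[rank - 1]

def prediction_set(probs, q):
    """Classes k whose score 1 - P(k | x) is at most q."""
    return [k for k, p in enumerate(probs) if 1.0 - p <= q]

# Hypothetical calibration data: (class probabilities, true label).
calibration = [
    ([0.99, 0.01], 0), ([0.98, 0.02], 0), ([0.97, 0.03], 0),
    ([0.96, 0.04], 0), ([0.95, 0.05], 0), ([0.99, 0.01], 0),
    ([0.92, 0.08], 0), ([0.90, 0.10], 0), ([0.15, 0.85], 1),
    ([0.20, 0.80], 1),
]
scores = [1.0 - probs[y] for probs, y in calibration]  # NC(x) = 1 - P(y | x)
q = conformal_quantile(scores, alpha=0.10)

confident = prediction_set([0.85, 0.15], q)  # singleton region
ambiguous = prediction_set([0.60, 0.40], q)  # empty region, i.e. abstention
```

With these toy values the threshold is about 0.20, so a class enters the region only when its predicted probability is at least about 0.80.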

2.6. Local Explainability via LIME

Local Interpretable Model-Agnostic Explanations (LIME) generate instance-level explanations by fitting a sparse linear surrogate model in the neighbourhood of a prediction [23]. Tabular LIME was employed with controlled kernel width, sample size, and feature selection. Explanations were generated conditionally for abstained cases and aggregated across sampled instances to analyse recurring feature contributions.

2.7. Evaluation Metrics

In addition to standard classification metrics (Accuracy, Precision, Recall, F1-score, and AUC–ROC), we report the Matthews Correlation Coefficient (MCC), which provides a balanced evaluation of classifier performance under severe class imbalance by accounting for all four confusion-matrix outcomes. To characterise probability calibration, we also report the Maximum Calibration Error (MCE), which quantifies the largest deviation between predicted probabilities and observed empirical frequencies across probability bins. MCE serves as a diagnostic for probabilistic consistency; however, probability calibration does not inform decision-making within the proposed framework. Uncertainty handling and decision routing in ITCF rely solely on set-valued guarantees provided by split conformal prediction, rather than on calibrated point probabilities.
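The distinction between an average and a worst-case calibration gap can be made concrete with a short sketch. It computes the Expected Calibration Error (ECE) and MCE over equal-width bins; the bin count, function name, and toy data are assumptions, since the binning used in the study is not stated here.

```python
def calibration_errors(probs, labels, n_bins=10):
    """Return (ECE, MCE) for positive-class probabilities using equal-width bins."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))

    ece, mce, total = 0.0, 0.0, len(probs)
    for members in bins:
        if not members:
            continue
        mean_conf = sum(p for p, _ in members) / len(members)
        frac_pos = sum(y for _, y in members) / len(members)
        gap = abs(mean_conf - frac_pos)
        ece += gap * len(members) / total  # gap weighted by bin population
        mce = max(mce, gap)                # largest gap over non-empty bins
    return ece, mce

# Toy data (hypothetical): 990 well-calibrated majority cases and 10 cases
# scored 0.75 of which only 2 are actually positive.
probs = [0.0] * 990 + [0.75] * 10
labels = [0] * 990 + [1, 1] + [0] * 8
ece, mce = calibration_errors(probs, labels)
```

In this toy case the ECE is about 0.0055 while the MCE is about 0.55, showing how a densely populated, well-calibrated majority bin can hide a large local gap.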

2.8. Operational Latency Analysis

Operational feasibility was assessed by measuring latency across three inference scenarios: single-instance (real-time), small-batch (32 instances), and full-batch prediction. Latency measurements were obtained using time.perf_counter() over repeated runs. Decision latency, which includes prediction and uncertainty estimation, is distinguished from explanatory latency, which involves conditional LIME computation, to reflect realistic deployment constraints.
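A minimal timing harness in the spirit of this procedure is sketched below. The `dummy_predict` function is a hypothetical stand-in for the model call plus conformal set construction, and the repeat count is an assumption; only the use of `time.perf_counter()` over repeated runs comes from the text.

```python
import statistics
import time

def measure_latency(predict_fn, batch, repeats=50):
    """Return (mean, stdev) wall-clock seconds per call over repeated runs."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict_fn(batch)
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings), statistics.stdev(timings)

def dummy_predict(batch):
    # Hypothetical stand-in for predict_proba followed by the conformal set lookup.
    return [sum(row) > 1.0 for row in batch]

single = [[0.2, 0.9]]        # single-instance (real-time) scenario
small_batch = single * 32    # small-batch scenario
mean_single, sd_single = measure_latency(dummy_predict, single)
mean_batch, sd_batch = measure_latency(dummy_predict, small_batch)
```

Timing the explanation step with a separate call to the same harness keeps decision latency and explanatory latency apart, as the text requires.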

2.9. Software and Implementation Environment

All experiments and analyses were conducted in Python (Python Software Foundation, Wilmington, DE, USA) using the Google Colaboratory (Colab) cloud-based notebook environment (Google LLC, Mountain View, CA, USA). Model training and evaluation were implemented using the scikit-learn machine learning library (scikit-learn developers, open-source project) for Random Forest and the XGBoost library (XGBoost developers, open-source project) for gradient boosting. Split conformal prediction was implemented using the CREPES conformal prediction library for Python (an open-source project). Local explainability was generated using the LIME package (LIME developers, open-source project). Data preprocessing and numerical computations were performed using NumPy and pandas (NumPy and pandas developer communities, open-source projects). Visualisations were produced using Matplotlib and PGFPlots (open-source projects).

3. Results and Discussion

This section presents and discusses the findings from evaluating Random Forest (RF) and XGBoost (XGB) within the Integrated Transparency and Confidence Framework (ITCF). The analysis addresses predictive performance under extreme class imbalance, uncertainty quantification using split conformal prediction, interpretability through LIME, and operational latency. The discussion prioritises practical decision support, auditability, and deployment considerations.

3.1. Predictive Performance Under Extreme Class Imbalance

Table 2 summarises classification performance on the PaySim test set (n = 1,272,524), which exhibits severe class imbalance (Fraud: 1643; No Fraud: 1,270,881). Both models achieve near-perfect accuracy; however, this metric is not informative in this context because the majority class dominates. Therefore, the analysis emphasises Recall, F1-score, and Matthews Correlation Coefficient (MCC), which more accurately reflect minority-class detection and balanced error rates.
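The point about accuracy can be checked directly from the class counts above. The sketch below scores a hypothetical baseline that labels every test transaction as non-fraud; the baseline is an illustration only and is not one of the evaluated models.

```python
import math

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient; taken as 0 when any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical baseline: predict "No Fraud" for all 1,272,524 test transactions.
baseline = dict(tp=0, tn=1_270_881, fp=0, fn=1_643)
baseline_accuracy = accuracy(**baseline)  # about 0.9987
baseline_mcc = mcc(**baseline)            # 0.0
```

A classifier that detects no fraud at all still reaches roughly 99.87% accuracy, whereas its MCC is zero, which is why MCC and Recall carry the comparison.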

XGBoost achieves higher Recall, F1-score, and MCC, indicating improved sensitivity to fraudulent transactions and a better balance across error types. Random Forest achieves slightly higher Precision and AUC-ROC, reflecting marginally stronger suppression of false positives and threshold-independent separability. In fraud detection, where false negatives correspond to missed fraud events, higher Recall is often prioritised even at the cost of a modest increase in false positives. Under this operational framing, XGBoost is selected as the primary base model for subsequent ITCF analyses, while Random Forest is retained for comparative assessment.

3.2. Probability Calibration and Reliability

Table 3 shows that XGBoost exhibits a lower Expected Calibration Error (ECE) and a substantially lower MCE than Random Forest, the latter indicating smaller worst-case deviations from perfect calibration. These values are reported as descriptive diagnostics rather than as evidence of decision-calibrated probabilities. Under extreme imbalance, ECE can be dominated by bins that are almost entirely populated by the majority class, yielding very small averages even when minority-relevant regions exhibit substantial local miscalibration, as reflected in MCE. Sensitivity to binning choices and class-conditional calibration diagnostics are deferred to future work.

Reliability diagrams (Figure 1) illustrate the relationship between predicted probabilities and observed frequencies across bins. Although the global ECE values in Table 3 are numerically small, they should be interpreted cautiously under severe class imbalance, where the dominance of the majority class and binning effects can mask localised miscalibration. For this reason, MCE is reported as a complementary worst-case diagnostic, and calibration plots are used only to describe probability–reliability behaviour. Importantly, ITCF does not rely on calibrated point probabilities for routing decisions; operational decision logic is driven by conformal prediction sets, abstention behaviour, and coverage properties, while calibration diagnostics are included as supplementary evidence of where point probabilities may be locally unreliable.

3.3. Uncertainty Quantification via Split Conformal Prediction

As described in Section 2, we report empirical coverage, abstention behaviour, and prediction-region characteristics observed on the PaySim test set.

Uncertainty was quantified using split conformal prediction with a target coverage of $1 - \alpha = 0.90$. Table 4 reports the global properties of the resulting prediction regions.

In this binary setting, the chosen nonconformity score

$$1 - P(y \mid x), \tag{4}$$

combined with a single global quantile threshold tends to yield either (i) a singleton set when one class probability comfortably exceeds the inclusion threshold or (ii) an empty set when neither class meets the criterion. Two-label sets would require both classes simultaneously satisfying the inclusion inequality, which is uncommon under sharply peaked posteriors and extreme imbalance. This behaviour is therefore a property of the current conformal configuration (score choice, binning-free quantile thresholding, and probability sharpness), not a general guarantee of conformal prediction in binary classification.
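The singleton-or-empty behaviour follows from the inclusion rule in a few lines. The derivation below is a sketch that writes $\Gamma(x)$ for the prediction region (a symbol introduced here for convenience) and uses only the fact that the two class probabilities sum to one.

```latex
\begin{aligned}
k \in \Gamma(x)
  &\iff 1 - P(k \mid x) \le q_{0.90}
   \iff P(k \mid x) \ge 1 - q_{0.90}, \\
\{0, 1\} \subseteq \Gamma(x)
  &\implies 1 = P(0 \mid x) + P(1 \mid x) \ge 2\,(1 - q_{0.90})
   \implies q_{0.90} \ge \tfrac{1}{2}, \\
\Gamma(x) = \emptyset
  &\iff \max_{k} P(k \mid x) < 1 - q_{0.90}.
\end{aligned}
```

Two-label regions are therefore impossible whenever the calibrated threshold is below 0.5, and an instance is abstained exactly when its largest class probability falls short of $1 - q_{0.90}$.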

3.3.1. Class-Conditional Uncertainty Characteristics for XGBoost

To illustrate how uncertainty differs across classes, Table 5 presents entropy and maximum probability statistics for XGBoost, highlighting class-specific uncertainty. Fraud cases exhibited higher mean entropy and lower mean maximum probability than non-fraud cases, indicating that detecting the minority class remains inherently more uncertain in highly imbalanced settings. In this study, uncertainty primarily manifests through abstentions (empty regions) rather than multi-class ambiguity.
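The two statistics summarised per class can be computed as in the sketch below. The natural-log base, the function names, and the probability rows are assumptions for illustration; the values are not those of Table 5.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def class_conditional_summary(prob_rows, labels):
    """Mean entropy and mean maximum probability, grouped by true class."""
    summary = {}
    for cls in sorted(set(labels)):
        rows = [r for r, y in zip(prob_rows, labels) if y == cls]
        summary[cls] = {
            "mean_entropy": sum(predictive_entropy(r) for r in rows) / len(rows),
            "mean_max_prob": sum(max(r) for r in rows) / len(rows),
        }
    return summary

# Hypothetical predicted probabilities [P(non-fraud), P(fraud)] and true labels.
prob_rows = [[0.99, 0.01], [0.98, 0.02], [0.60, 0.40], [0.30, 0.70]]
labels = [0, 0, 1, 1]
summary = class_conditional_summary(prob_rows, labels)
```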

3.3.2. Illustrative High- and Low-Uncertainty Examples

Table 6 presents representative transactions spanning both high- and low-uncertainty regimes within the Integrated Transparency and Confidence Framework (ITCF), clarifying the instance-level behaviour underlying the reported prediction-region statistics and coverage results. These examples are illustrative rather than exhaustive and are intended to provide operational insight into how entropy thresholds and conformal prediction regions inform abstention, automation, and analyst triage during inference.

For instance, case 17,257 shows balanced predicted class probabilities and an entropy value exceeding the operational threshold. Because confidence in the prediction is insufficient, the case is escalated to an analyst for further review. In a deployment setting, each abstention could be accompanied by a structured note documenting the rationale for escalation (e.g., balanced probabilities and entropy exceeding the policy threshold) to support audit trails and evidence of human oversight.

In contrast, low-uncertainty instances are characterised by highly skewed predicted probabilities, near-zero entropy, and single-class conformal regions. Such cases support confident automated decision-making, thereby preserving operational efficiency while maintaining formal coverage guarantees.

The following examples are illustrative and operational in nature and do not constitute additional statistical evidence beyond the aggregate uncertainty results already reported.

For the same instances shown in Table 6, LIME explanations are generated only for high-uncertainty cases and are used to contextualise analyst review by highlighting the features most responsible for predictive ambiguity, as discussed in the explainability analysis section.

3.4. Interpretability with LIME Explanations

LIME generated instance-level explanations and aggregated attribution summaries to enhance transparency. Across sampled instances, the most frequently identified features were transaction amount, origin/destination balances (oldbalanceOrg, newbalanceOrig, oldbalanceDest, newbalanceDest), and transaction type. Table 7 and Figure 2 provide an aggregated summary of the most influential features identified in the sampled explanations.
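One simple way to build such an aggregated summary is sketched below. It assumes each explanation is a list of (feature, weight) pairs, the shape returned by `as_list()` on a LIME explanation object; the weights are hypothetical, and ranking by frequency and then by mean absolute weight is an illustrative choice rather than the procedure behind Table 7.

```python
from collections import defaultdict

def aggregate_explanations(explanations):
    """Summarise per-instance (feature, weight) lists by frequency and mean |weight|."""
    totals = defaultdict(lambda: {"count": 0, "abs_sum": 0.0})
    for explanation in explanations:
        for feature, weight in explanation:
            totals[feature]["count"] += 1
            totals[feature]["abs_sum"] += abs(weight)
    rows = [
        (feature, t["count"], t["abs_sum"] / t["count"])
        for feature, t in totals.items()
    ]
    # Most frequently cited features first; ties broken by mean absolute weight.
    return sorted(rows, key=lambda r: (r[1], r[2]), reverse=True)

# Hypothetical explanations for three escalated transactions.
explanations = [
    [("amount", 0.30), ("oldbalanceOrg", -0.20)],
    [("amount", 0.10), ("type", 0.05)],
    [("amount", -0.20), ("oldbalanceOrg", 0.40)],
]
feature_summary = aggregate_explanations(explanations)
```

Using absolute weights keeps features whose contributions change sign across instances from cancelling out in the summary.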

LIME explanations are interpreted as descriptive indicators of local model behaviour rather than as causal attributions. A key next step for deployment-oriented validation is benchmarking explanation patterns against analyst reason codes or internal fraud typologies, to assess whether the explanations are consistent with domain expectations.

Within ITCF, LIME is applied selectively to cases routed for review (i.e., abstentions or high-entropy predictions). Explanations are used as decision-support artefacts for analysts rather than as definitive causal statements. This selective use mitigates interpretability overhead while acknowledging known variability in LIME explanations due to stochastic perturbation sampling.

LIME explanations were not validated against expert-labeled reason codes or rule-based typologies. As such, explanations should be interpreted as model-consistent descriptors rather than regulatory justifications.

3.5. Effectiveness of the Proposed Integrated Transparency and Confidence Framework

The ITCF operationalises uncertainty and explainability within a unified decision workflow. Low-uncertainty singleton predictions are eligible for low-touch handling, while abstentions and high-entropy cases are escalated for human review. For escalated cases, local explanations provide contextual evidence to support the analyst’s judgement and documentation.

This applied integration aligns established explainability and uncertainty quantification techniques within a coherent operational workflow. The framework focuses on operational decision support, traceability, and governance, which are critical considerations in regulated financial environments. Traceability features aim to align with common model risk management expectations (e.g., documented validation, monitoring, and audit trails) and support governance and supervisory review in regulated financial environments.

A consolidated comparison of model performance, explainability, and probability-diagnostic metrics is presented in Table 8.

The overall score is computed using an illustrative weighted aggregation of performance, explainability, and probability diagnostics, intended to demonstrate comparative trade-offs rather than prescribe a fixed policy-dependent ranking. The weighting scheme was selected by the authors solely for illustrative purposes to reflect a common governance trade-off in fraud operations, where detection effectiveness is typically prioritised while explainability and reliability remain material for auditability and oversight. The weights (Performance 40%, Explainability 35%, Probability diagnostics 25%) are not normative and do not reflect regulatory prescriptions. Institutions may reweight these dimensions based on policy objectives, analyst capacity, or risk appetite. A sensitivity check can be performed by varying weights (e.g., ±10 percentage points) to assess ranking stability under alternative governance priorities.
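The sensitivity check can be sketched as follows. The dimension scores are hypothetical placeholders rather than the values in Table 8, and moving 10 percentage points from one dimension to another (so that the weights still sum to one) is one assumed reading of the variation described above.

```python
BASE_WEIGHTS = {"performance": 0.40, "explainability": 0.35, "diagnostics": 0.25}

def overall_score(scores, weights):
    return sum(weights[dim] * scores[dim] for dim in weights)

def shifted_weights(base, raised, lowered, delta=0.10):
    """Move `delta` of weight from one dimension to another; the total stays 1."""
    weights = dict(base)
    weights[raised] += delta
    weights[lowered] -= delta
    return weights

def ranking(models, weights):
    return sorted(models, key=lambda m: overall_score(models[m], weights), reverse=True)

# Hypothetical normalised dimension scores, not the values reported in Table 8.
models = {
    "XGB": {"performance": 0.90, "explainability": 0.70, "diagnostics": 0.80},
    "RF": {"performance": 0.85, "explainability": 0.72, "diagnostics": 0.60},
}

base_ranking = ranking(models, BASE_WEIGHTS)
stable = all(
    ranking(models, shifted_weights(BASE_WEIGHTS, up, down)) == base_ranking
    for up in BASE_WEIGHTS
    for down in BASE_WEIGHTS
    if up != down
)
```

If `stable` is false for an institution's own scores, the ranking depends on the governance weighting and should be reported alongside the weights used.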

3.5.1. Targeted Human Review and Efficient Analyst Resource Allocation

The ITCF employs a triage-oriented decision workflow that connects predictive uncertainty with human review, so that analyst attention is directed toward the cases where automated decisions are least reliable. Instead of applying a uniform manual review or fixed alert thresholds, the framework routes transactions based on uncertainty signals derived from split conformal prediction and supplementary probability-based measures.

At the chosen confidence level ($1 - \alpha = 0.90$), approximately 9.9% of transactions yield empty conformal prediction regions and are therefore conservatively abstained from. These abstentions serve as the primary trigger for analyst review and indicate cases where the model cannot assign a label with the required coverage guarantees. In principle, abstention-based routing can reduce review volume relative to indiscriminate escalation, but operational impact is not measured in this offline evaluation.

The ITCF decision routing logic and corresponding responsibility allocation are summarised in Table 9.

Following [33], predictive entropy, calculated from point probabilities, provides an additional signal for prioritising cases in the analyst queue. However, entropy does not supersede conformal abstentions; it only informs the ordering and management of cases already routed for review. This distinction preserves the statistical validity of conformal prediction while enabling operational flexibility in workload management for analysts.
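The division of roles between the conformal region and entropy can be sketched as a small routing function. The route names, record fields, and example cases are hypothetical and do not reproduce Table 9; the sketch only illustrates that the region decides the route while entropy orders the review queue.

```python
import math

def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def route(case_id, probs, prediction_set):
    """The conformal region decides the route; entropy is attached for ordering only."""
    if len(prediction_set) == 1:
        return {"case": case_id, "route": "automated", "label": prediction_set[0]}
    # Empty (or, rarely, two-label) regions are escalated to an analyst.
    return {"case": case_id, "route": "analyst_review", "entropy": entropy(probs)}

def review_queue(decisions):
    """Escalated cases, most uncertain first."""
    escalated = [d for d in decisions if d["route"] == "analyst_review"]
    return sorted(escalated, key=lambda d: d["entropy"], reverse=True)

# Hypothetical cases: (id, class probabilities, conformal prediction region).
decisions = [
    route(1, [0.999, 0.001], [0]),
    route(2, [0.55, 0.45], []),
    route(3, [0.70, 0.30], []),
]
queue = review_queue(decisions)
```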

The applied triage design enhances operational efficiency and decision quality by directing expert attention to cases characterised by genuine uncertainty or policy sensitivity. By integrating uncertainty-aware routing and selective explainability, the ITCF facilitates defensible human-in-the-loop decision-making while mitigating the risks of overconfident automation and reducing unnecessary manual workload.

3.5.2. Governance-Oriented Transparency

The ITCF generates auditable artefacts for each decision, such as uncertainty objects (prediction regions) and traceable explanations for escalated cases. Although these artefacts can enhance transparency and documentation in regulated environments, successful deployment necessitates further validation using real-world datasets, including drift-aware evaluation and explicit calibration analysis.

3.5.3. Operational Latency and Deployability

Operational feasibility was evaluated through latency measurements across three inference scenarios: single-instance (real-time), small-batch (32 instances), and full-batch (offline processing). Table 10 reports summary statistics.

The results suggest that XGBoost is better suited to real-time and near-real-time fraud detection workflows, where rapid decision-making is essential. Although both models demonstrate similar conformal coverage and predictive accuracy, XGBoost’s lower latency offers a distinct operational advantage, justifying its selection as the preferred model within the ITCF. Notably, latency is considered a secondary deployment factor rather than a primary optimisation objective, thereby maintaining performance, uncertainty, reliability, and interpretability as the core criteria for model evaluation.

3.5.4. Operational Impact

The following points outline potential operational implications, as opposed to empirically validated outcomes:

Proactive Risk Management: By routing uncertain cases for review, the framework may support earlier intervention in operational settings; however, operational impact is not measured in this offline evaluation.
Explainability for Human Analysts: Aligning model explanations with established fraud behaviour patterns fosters trust and assists analysts in understanding the rationale for each flagged case.
Efficient Resource Allocation: Conformal prediction regions with target coverage facilitate prioritisation, allowing analysts to focus on high-uncertainty cases rather than reviewing all model alerts. This approach enhances both efficiency and decision quality.
Governance readiness: The integrated interpretability and uncertainty artefacts support transparency, traceability, and audit documentation practices commonly used in regulated financial environments.

3.6. Comparison of the ITCF with Prior Research Work

The proposed framework (ITCF) is qualitatively compared with prior research on explainability and uncertainty quantification, while accounting for differences in datasets, imbalance ratios, and operational objectives. The results of this comparison are presented in Table 11.

Table 11 compares our Integrated Transparency and Confidence Framework (ITCF) with prior work and illustrates one practical integration pattern of CP-based triage with selective local explanations under extreme imbalance, specifically at a ratio of 773:1. This applied integration is relevant for regulatory technology discussions, as it illustrates how uncertainty and explainability can be aligned within governance workflows. By offering a structured framework for transparency and uncertainty awareness, the proposed approach supports policy discussions and responsible automation, without asserting direct regulatory compliance or deployment outcomes.

As Table 11 illustrates, the distinguishing feature of our approach is that it integrates Conformal Prediction (CP) with LIME.

Comparing Fraud vs. Non-Fraud Uncertainty for XGBoost

Fraud Cases: Showed a higher mean entropy (0.1415) and a lower mean maximum probability (0.9374) compared to non-fraud cases. This indicates that fraud predictions are inherently more uncertain than non-fraud predictions, a common problem in highly imbalanced datasets where the minority class is harder to predict with high confidence.
Non-Fraud Cases: Exhibited very low mean entropy (0.0010) and very high mean maximum probability (0.9998), confirming the model’s high confidence in predicting the majority non-fraud class.

These comparisons are indicative rather than strictly comparable, as prior studies differ in their dataset structures, class imbalances, and operational objectives.

3.7. Key Insights

The experimental evaluation of the Integrated Transparency and Confidence Framework (ITCF) reveals several key insights:

1. Robust performance under extreme class imbalance: Both Random Forest and XGBoost demonstrate strong predictive performance despite the severe class imbalance present in the PaySim dataset (773.7:1). In this context, the Matthews Correlation Coefficient (MCC) offers a more informative and balanced assessment than accuracy or AUC-ROC, highlighting significant differences in minority-class detection capability.
2. XGBoost as the preferred base model for operational deployment: XGBoost consistently outperforms Random Forest in recall, F1-score, and MCC, and demonstrates substantially lower inference latency in both single-instance and batch scenarios. This combination of enhanced minority-class sensitivity and computational efficiency makes XGBoost more suitable for operational fraud-detection pipelines.
3. Meaningful uncertainty stratification via conformal prediction: Split conformal prediction achieves near-target marginal coverage and expresses uncertainty primarily through conservative abstention, represented by empty prediction regions, rather than ambiguous multi-class outputs. Approximately 9.9% of transactions are abstained, offering a principled mechanism to identify cases where automated decisions are least reliable.
4. Fraud-specific uncertainty characteristics: Fraudulent transactions exhibit higher predictive entropy and lower maximum class probabilities than non-fraud cases, suggesting that minority-class predictions are inherently more uncertain. These characteristics account for most abstentions and support the need for targeted human review in high-risk scenarios.
5. Applied integration of uncertainty and explainability: The ITCF demonstrates how conformal uncertainty can be operationalised as a control signal for human-in-the-loop decision-making, while LIME explanations provide contextual, instance-level support for analyst judgement in uncertain cases. This integration is oriented toward deployment and emphasises workflow design over algorithmic novelty.
6. Governance- and audit-oriented transparency: By coupling uncertainty objects, such as prediction regions and entropy, with selective, case-specific explanations, the framework supports traceability and accountability at the point of decision-making. Notably, these benefits are achieved without overstating probability calibration, causal interpretability, or regulatory compliance guarantees.

Overall, these findings suggest that the ITCF offers a practical, risk-aware approach to integrating predictive performance, uncertainty quantification, and interpretability within a unified operational workflow. Conformal prediction facilitates conservative management of uncertain cases, while explainability enables informed human oversight. Within this framework, XGBoost is identified as the more suitable base model due to its superior fraud-detection capabilities and operational efficiency. Future work will involve testing the ITCF on live data and advancing toward real-world deployment. This subsequent deployment study will aim to confirm the framework’s applicability and reliability in dynamic environments, ensuring that stakeholders realise tangible benefits and improvements in operational workflows.

4. Conclusions

This study examined the integration of predictive performance, explainability, uncertainty quantification, and operational considerations within financial fraud detection systems. While machine learning techniques for fraud detection have advanced substantially, their practical adoption remains constrained by fragmented evaluation practices that treat these dimensions in isolation. The Integrated Transparency and Confidence Framework (ITCF) addresses this gap by combining ensemble classifiers, local explainability, and statistically valid set-level uncertainty within a single, deployment-oriented workflow. This integrated approach supports risk-aware decision-making by explicitly linking model predictions, uncertainty signals, and contextual explanations.

4.1. Summary of Findings

Empirical evaluation using the PaySim dataset demonstrated that both Random Forest and XGBoost models deliver strong predictive performance, with XGBoost achieving higher Recall, F1-score, and MCC, as well as significantly lower inference latency. These characteristics make XGBoost more suitable for high-throughput fraud detection pipelines. Split conformal prediction attained near-target marginal coverage and expressed uncertainty primarily through conservative abstention rather than ambiguous multi-class outputs, providing a clear and auditable signal for escalation. Selectively applied LIME explanations yielded instance-level feature attributions that were consistent with the model’s learned decision patterns; alignment with analyst typologies remains future validation work.

4.2. Contributions and Implications

The principal contribution of this work is operational rather than algorithmic. The study specifies how local explainability and set-valued uncertainty outputs can be combined into a unified triage workflow that supports documentation, prioritisation, and analyst decision-making. By explicitly connecting what a model predicts, how reliable that prediction is, and why a case is flagged, the ITCF demonstrates a practical integration of established explainability and uncertainty quantification techniques within a governance-oriented decision framework.

4.3. Limitations

This study has several limitations. The experimental evaluation relies on the PaySim dataset, which is simulated and may not fully capture the diversity, behavioural adaptation, and temporal dynamics of real-world fraud patterns. In addition, all analyses were conducted in an offline setting, and the framework’s performance under continuously evolving transaction streams was not assessed. Consequently, the generalisability of the findings to production environments requires further empirical validation.

4.4. Future Work

Future research will extend this work along several directions. First, the ITCF will be evaluated on diverse real-world datasets to assess robustness, generalisability, and operational impact. Second, adaptive extensions of the framework will be developed to monitor predictive performance, uncertainty behaviour, and explanation stability over time, enabling responsiveness to evolving fraud patterns. Third, closer integration with analyst workflows will be explored through intuitive dashboards and decision-support interfaces. Finally, broader explainability and uncertainty techniques, including SHAP and Bayesian or ensemble-based uncertainty methods, will be investigated alongside more complex model architectures such as graph neural networks.

Overall, this study demonstrates that trustworthy, operationally aligned fraud detection systems can be designed by evaluating predictive performance alongside uncertainty, explainability, and workflow integration. By embedding these principles into a single framework, the ITCF provides a structured foundation for advancing risk-aware and transparent machine learning systems for financial crime detection.

Author Contributions

Conceptualisation, T.F.M.; methodology, T.F.M.; software, T.F.M.; validation, T.F.M.; formal analysis, T.F.M.; investigation, T.F.M.; data curation, T.F.M.; writing (original draft preparation), T.F.M.; visualisation, T.F.M.; writing (review and editing), M.S., T.F.M.; supervision, M.S.; guidance and critical review, M.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the University of Johannesburg and the South African National Institute for Theoretical and Computational Sciences.

Informed Consent Statement

This study did not involve human or animal subjects and therefore did not require ethical approval.

Data Availability Statement

The PaySim dataset is publicly available: https://www.kaggle.com/datasets/mtalaltariq/paysim-data (accessed on 3 December 2025).

Acknowledgments

During manuscript preparation, generative AI tools (ChatGPT, QuillBot, and Grammarly) were used exclusively for language refinement, grammatical editing, structural clarity, and minor LaTeX formatting assistance. No generative AI tools were used for data generation, model training, experimental design, or result interpretation. All content was reviewed and validated by the authors, who take full responsibility for the accuracy and integrity of the work.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

XAI	Explainable Artificial Intelligence
LIME	Local Interpretable Model-agnostic Explanations
CP	Conformal Prediction
UQ	Uncertainty Quantification
RF	Random Forest
XGB	XGBoost (Extreme Gradient Boosting)
AUC	Area Under the Curve
ROC	Receiver Operating Characteristic
NC	Non-Conformity
ML	Machine Learning
AI	Artificial Intelligence
TP	True Positive
TN	True Negative
FP	False Positive
FN	False Negative
AML	Anti-Money Laundering
Dtrain	Training Dataset
Dcalib	Calibration Dataset
Dtest	Test Dataset
FSCA	Financial Sector Conduct Authority
PA	Prudential Authority
ITCF	Integrated Transparency and Confidence Framework
SHAP	SHapley Additive ExPlanations

References

Seth, P.; Sankarapu, V.K. Bridging the Gap in XAI: Why Reliable Metrics Matter for Explainability and Compliance. arXiv 2025, arXiv:2502.04695. [Google Scholar] [CrossRef]
Zhang, Z.; Song, L.; Bao, E.; Lv, X.; Wang, X. Adaptive Temporal Motif Graph Anomaly Detection for Financial Transaction Networks. arXiv 2025, arXiv:2508.20829. [Google Scholar] [CrossRef]
Ngai, E.W.T.; Hu, Y.; Wong, Y.H.; Chen, Y.; Sun, X. The application of data mining techniques in financial fraud detection: A classification framework and an academic review of literature. Decis. Support Syst. 2011, 50, 559–569. [Google Scholar] [CrossRef]
West, J.; Bhattacharya, M. Intelligent financial fraud detection: A comprehensive review. Comput. Secur. 2016, 57, 47–66. [Google Scholar] [CrossRef]
Casper, S.; Ezell, C.; Siegmann, C.; Kolt, N.; Curtis, T.L.; Bucknall, B.; Haupt, A.; Wei, K.; Scheurer, J.; Hobbhahn, M.; et al. Black-Box Access Is Insufficient for Rigorous AI Audits. arXiv 2024, arXiv:2401.14446. [Google Scholar] [CrossRef]
Financial Sector Conduct Authority (FSCA); Prudential Authority (PA). Artificial Intelligence in the South African Financial Sector: Joint FSCA–PA Report; Financial Sector Conduct Authority; Prudential Authority: Pretoria, South Africa, 2025; Available online: https://www.resbank.co.za/en/home/publications/publication-detail-pages/media-releases/2025/artificial-intelligence-in-the-south-african-financial-sector (accessed on 27 January 2026).
European Commission. Proposal for a Regulation of the European Parliament and of the Council Laying Down Harmonised Rules on Artificial Intelligence (Artificial Intelligence Act); European Union: Luxembourg, 21 April 2021; Available online: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=celex%3A52021PC0206 (accessed on 27 January 2026).
Samek, W.; Wiegand, T.; Müller, K.R. Explainable artificial intelligence: Understanding, visualizing and interpreting deep learning models. ITU J. ICT Discov. 2017, 1, 39–48. [Google Scholar]
Bhatt, U.; Xiang, A.; Sharma, S.; Weller, A.; Taly, A.; Jia, Y.; Ghosh, J.; Puri, R.; Moura, J.M.F.; Eckersley, P.; et al. Explainable machine learning in deployment. In Proceedings of the 2020 ACM Conference on Fairness, Accountability, and Transparency, Barcelona, Spain, 27–30 January 2020. [Google Scholar]
Barber, R.F.; Candès, E.J.; Ramdas, A.; Tibshirani, R.J. Predictive inference with the jackknife+. Ann. Stat. 2021, 49, 486–507.
Vovk, V.; Gammerman, A.; Shafer, G. Algorithmic Learning in a Random World; Springer: Berlin/Heidelberg, Germany, 2005.
Sati, S.G.; Author, V.; Majumder, R.Q. Integrating Explainable AI in Financial Fraud Detection Systems for Enhanced Decision Transparency. Int. J. Emerg. Res. Eng. Technol. 2023, 6, 113–120.
Adadi, A.; Berrada, M. Peeking inside the black-box: A survey on Explainable Artificial Intelligence (XAI). IEEE Access 2018, 6, 52138–52160.
Oztas, B.; Cetinkaya, D. Transaction Monitoring in Anti-Money Laundering: A Qualitative Analysis and Points of View from Industry. Future Gener. Comput. Syst. 2024, 159, 161–171.
Singh, D.S. Explainability-Driven Feature Selection for Financial Fraud Detection. Int. J. Adv. Res. Comput. Sci. Eng. 2025, 1, 9–15.
Almalki, F.; Masud, M. Financial Fraud Detection Using Explainable AI and Stacking Ensemble Methods. arXiv 2025, arXiv:2505.10050.
Chaddad, A.; Peng, J.; Xu, J.; Bouridane, A. Survey of explainable AI techniques in healthcare. Sensors 2023, 23, 634.
Dwivedi, R.; Dave, D.; Naik, H.; Singhal, S.; Omer, R.; Patel, P.; Qian, B.; Wen, Z.; Shah, T.; Morgan, G.; et al. Explainable AI (XAI): Core ideas, techniques, and solutions. ACM Comput. Surv. 2023, 55, 1–33.
Ali, A.; Abd Razak, S.; Othman, S.H.; Eisa, T.A.E.; Al-Dhaqm, A.; Nasser, M.; Elhassan, T.; Elshafie, H.; Saif, A. Financial Fraud Detection Based on Machine Learning: A Systematic Literature Review. Appl. Sci. 2022, 12, 9637.
Kabir, H.D.; Khosravi, A.; Hosen, M.A.; Nahavandi, S. Neural network-based uncertainty quantification: A survey of methodologies and applications. IEEE Access 2018, 6, 36218–36234.
Shi, Y.; Wei, P.; Feng, K.; Feng, D.C.; Beer, M. A survey on machine learning approaches for uncertainty quantification of engineering systems. Mach. Learn. Comput. Sci. Eng. 2025, 1, 11.
Kong, C.; Luo, A.; Bao, P.; Yu, Y.; Li, H.; Zheng, Z.; Wang, S.; Kot, A.C. MoE-FFD: Mixture of Experts for Generalized and Parameter-Efficient Face Forgery Detection. arXiv 2024, arXiv:2404.08452.
Ribeiro, M.T.; Singh, S.; Guestrin, C. “Why should I trust you?”: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016.
Lundberg, S.M.; Lee, S.I. A unified approach to interpreting model predictions. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 30.
Schwalbe, G.; Finzel, B. A comprehensive taxonomy for explainable artificial intelligence: A systematic survey of surveys on methods and concepts. Data Min. Knowl. Discov. 2024, 38, 3043–3101.
Grinsztajn, L.; Oyallon, E.; Varoquaux, G. Why do tree-based models still outperform deep learning on typical tabular data? Adv. Neural Inf. Process. Syst. 2022, 35, 507–520.
Shwartz-Ziv, R.; Armon, A. Tabular data: Deep learning is not all you need. Inf. Fusion 2022, 81, 84–90.
Conformalized Credal Regions for Classification with Ambiguous Ground Truth. arXiv 2024, arXiv:2411.04852.
Houston, A.; Cosma, G. A Meta-Heuristic Approach to Estimate and Explain Classifier Uncertainty. Appl. Intell. 2024, 55, 319.
Li, F.; Chen, Z. Dynamic Quantification Anti-Fraud Machine Learning Model for Real-Time Transaction Fraud Detection in Banking. Discov. Comput. 2025, 28, 59.
Chicco, D.; Jurman, G. The Matthews correlation coefficient (MCC) should replace the ROC AUC as the standard metric for assessing binary classification. BioData Min. 2023, 16, 4.
Shafer, G.; Vovk, V. A tutorial on conformal prediction. J. Mach. Learn. Res. 2008, 9, 371–421.
Kumar, D.; Darabi, N.; Tayebati, S.; Trivedi, A.R. Beyond Confidence: Adaptive Abstention in Dual-Threshold Conformal Prediction for Autonomous System Perception. arXiv 2025, arXiv:2502.07255.
Nouretdinov, I.; Gammerman, A.; Vovk, V. Machine learning classification with confidence: Application of transductive conformal predictors to MRI-based diagnostic and prognostic markers in depression. NeuroImage 2011, 56, 1508–1517.
Bhattacharyya, S.; Jha, S.; Tharakunnel, K.; Westland, J.C. Data mining for credit card fraud: A comparative study. Decis. Support Syst. 2011, 50, 602–613.

Figure 1. Reliability Diagram.

Figure 2. LIME Feature Importance.

Table 1. Prior Methods Comparison.

Aspect	ITCF (This Work)	Representative Prior Approaches	Study
Explainability	Instance-level explanations generated selectively for uncertain cases	XAI applied uniformly, often without uncertainty context	[17,23,24]
Operationalisation	Explicit routing of cases based on uncertainty and confidence signals	Binary risk flags or threshold-based alerts	[18,19]
Uncertainty	Set-valued predictions with marginal coverage guarantees under IID assumptions	Bayesian or ensemble confidence measures without formal guarantees	[20,21]
Computational Considerations	Conditional explanation generation to reduce interpretability overhead	Explanations generated for all predictions	[25]

Table 2. Predictive performance.

Metric	Random Forest	XGBoost
Accuracy	0.999683	0.999705
Precision	0.980620	0.960756
Recall	0.769933	0.804626
F1-score	0.862598	0.875787
MCC	0.8688	0.8791
AUC-ROC	0.999232	0.998545
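
As a consistency check on Table 2, the F1-score can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The sketch below is illustrative only: the helper name `f1_score` and the dictionary layout are assumptions and are not taken from the authors' code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall exactly as reported in Table 2.
reported = {
    "Random Forest": (0.980620, 0.769933),
    "XGBoost": (0.960756, 0.804626),
}

f1_values = {name: f1_score(p, r) for name, (p, r) in reported.items()}
for name, value in f1_values.items():
    print(f"{name}: F1 = {value:.6f}")
```

The recomputed values agree with the tabulated F1-scores to within rounding of the inputs. MCC cannot be checked the same way because it needs the full confusion matrix, which Table 2 does not report.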

Table 3. Probability Calibration Metrics for XGBoost and Random Forest Models.

Model	ECE	MCE
XGBoost	0.0001	0.1978
Random Forest	0.0002	0.3215
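
Table 3 pairs a very small ECE with a much larger MCE. A minimal sketch of how the two quantities are commonly computed may help explain why: ECE weights each bin's calibration gap by the share of samples in that bin, whereas MCE reports the single worst bin. The binning scheme (ten equal-width bins over the positive-class probability) and the function name `calibration_errors` are assumptions; the source does not state which variant was used. The data are toy values, not PaySim results.

```python
def calibration_errors(probs, labels, n_bins=10):
    """Return (ECE, MCE) for positive-class probabilities and 0/1 labels.

    Assumes equal-width bins on [0, 1]; other binning schemes exist.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))

    n = len(probs)
    ece, mce = 0.0, 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_prob = sum(p for p, _ in members) / len(members)
        frac_pos = sum(y for _, y in members) / len(members)
        gap = abs(frac_pos - mean_prob)
        ece += (len(members) / n) * gap  # population-weighted average gap
        mce = max(mce, gap)              # worst single bin
    return ece, mce


# Toy data: one well-calibrated low-probability bin, one poorly calibrated high bin.
toy_probs = [0.05, 0.05, 0.95, 0.95]
toy_labels = [0, 0, 1, 0]
toy_ece, toy_mce = calibration_errors(toy_probs, toy_labels)
print(f"ECE = {toy_ece:.4f}, MCE = {toy_mce:.4f}")
```

Under extreme class imbalance, most transactions fall into a densely populated bin near zero, so a large gap in a sparsely populated bin can dominate MCE while barely moving ECE.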

Table 4. Prediction regions at α = 0.1. Empty regions correspond to abstentions; single-class regions correspond to singleton prediction sets.

Model	Empty Regions	Single-Class Regions	Ambiguous Regions	Coverage	Avg. Set Size
Random Forest	126,510 (9.94%)	1,146,014 (90.06%)	0 (0.00%)	0.9006	0.9006
XGBoost	126,234 (9.92%)	1,146,290 (90.08%)	0 (0.00%)	0.9008	0.9008
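
The percentages and average set sizes in Table 4 follow directly from the region counts. The sketch below recomputes them, counting an empty region as size 0, a single-class region as size 1, and an ambiguous (two-class) region as size 2. The function name `region_summary` and its return layout are illustrative assumptions.

```python
def region_summary(n_empty: int, n_single: int, n_ambiguous: int) -> dict:
    """Summarise conformal prediction regions for a binary classifier."""
    total = n_empty + n_single + n_ambiguous
    # Empty sets have size 0, singletons size 1, two-class sets size 2.
    avg_set_size = (n_single + 2 * n_ambiguous) / total
    return {
        "total": total,
        "empty_share": n_empty / total,
        "single_share": n_single / total,
        "avg_set_size": avg_set_size,
    }


# Region counts exactly as reported in Table 4.
table4 = {
    "Random Forest": region_summary(126_510, 1_146_014, 0),
    "XGBoost": region_summary(126_234, 1_146_290, 0),
}

for model, s in table4.items():
    print(f"{model}: n={s['total']}, empty={s['empty_share']:.2%}, "
          f"avg set size={s['avg_set_size']:.4f}")
```

Both models cover the same 1,272,524 test transactions, which matches the class counts in Table 5 (1643 + 1,270,881). With no ambiguous regions, the average set size equals the share of single-class regions, so a value below 1 reflects abstentions rather than tighter sets.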

Table 5. Uncertainty statistics by class for XGBoost.

Class	Mean Entropy	Mean Max Probability	Ambiguous Regions	Count
Fraud	0.1415	0.9374	0/1643	1643
Non-Fraud	0.0010	0.9998	0/1,270,881	1,270,881

Table 6. Illustrative high- and low-uncertainty transaction examples (XGBoost).

Index	True Label	Prob (No Fraud)	Prob (Fraud)	Entropy	Region
17,257	Fraud	0.5190	0.4810	0.6924	[Abstain]
25,075	Fraud	0.7445	0.2555	0.5683	[Abstain]
57,842	Fraud	0.2722	0.7278	0.5854	[Abstain]
630	Fraud	0.0164	0.9836	0.0839	[Fraud]
901	Fraud	0.0000	1.0000	0.0000	[Fraud]
2278	Fraud	0.0000	1.0000	0.0000	[Fraud]
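
The entropy column in Table 6 is consistent with the Shannon entropy of the two class probabilities, measured in nats (natural logarithm). The sketch below recomputes it from the reported fraud probabilities; the function name `binary_entropy` is an illustrative assumption, and small differences are expected because the tabulated probabilities are rounded to four decimals.

```python
import math


def binary_entropy(p_fraud: float) -> float:
    """Shannon entropy (in nats) of the distribution (1 - p_fraud, p_fraud)."""
    entropy = 0.0
    for p in (p_fraud, 1.0 - p_fraud):
        if p > 0.0:  # treat 0 * log(0) as 0
            entropy -= p * math.log(p)
    return entropy


# (index, Prob(Fraud), reported entropy) rows taken from Table 6.
table6_rows = [
    (17257, 0.4810, 0.6924),
    (25075, 0.2555, 0.5683),
    (57842, 0.7278, 0.5854),
    (630, 0.9836, 0.0839),
    (901, 1.0000, 0.0000),
]

for index, p_fraud, reported in table6_rows:
    recomputed = binary_entropy(p_fraud)
    print(f"index {index}: recomputed {recomputed:.4f}, reported {reported:.4f}")
```

The maximum possible value for two classes is ln 2 ≈ 0.693, so the first row (0.6924) is close to maximal uncertainty.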

Table 7. LIME Features.

Rank	Feature	Importance
1	amount	1.352244
2	oldbalanceOrg	1.319357
3	type	0.390689
4	newbalanceOrig	0.196532
5	newbalanceDest	0.184403
6	oldbalanceDest	0.171549
7	step	0.115079

Table 8. Comprehensive Model Ranking Across Performance, Explainability, and Probability-Diagnostic Dimensions.

Model	F1	AUC	MCC	XAI	Coverage	ECE	MCE	Overall Score
XGBoost	0.8758	0.9985	0.8791	0.92	0.9008	0.0001	0.1978	0.805
Random Forest	0.8626	0.9992	0.8688	0.89	0.9006	0.0002	0.3215	0.795

Table 9. ITCF Decision Pathway.

Pathway	Description	Responsible Party
High Confidence	Single-class conformal prediction region; transaction processed automatically without human intervention.	Automated System
Abstention (High Uncertainty)	Empty conformal prediction region; transaction routed for analyst review with supporting LIME explanation.	Analyst
Policy-Based Review	Optional escalation triggered by business rules or compliance policies, even when model confidence is high.	Compliance Officer
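
The decision pathway in Table 9 can be sketched as a small routing function over the conformal prediction region. This is a hypothetical illustration, not the authors' implementation: the names `Pathway` and `route`, the `policy_flag` parameter, and the choice to send ambiguous (two-class) regions to an analyst are all assumptions, since Table 9 does not define a pathway for ambiguous regions.

```python
from enum import Enum


class Pathway(Enum):
    """Pathways from Table 9; each value is the responsible party."""
    HIGH_CONFIDENCE = "Automated System"
    ABSTENTION = "Analyst"
    POLICY_REVIEW = "Compliance Officer"


def route(prediction_region: frozenset, policy_flag: bool = False) -> Pathway:
    """Map a conformal prediction region to a Table 9 pathway.

    prediction_region: set of class labels in the conformal region.
    policy_flag: hypothetical signal that a business or compliance rule fired.
    """
    if len(prediction_region) == 1:
        # Single-class region: automated unless a policy rule escalates it.
        return Pathway.POLICY_REVIEW if policy_flag else Pathway.HIGH_CONFIDENCE
    # Empty region (abstention). Assumption: ambiguous regions are also
    # treated as uncertain and sent to an analyst with a LIME explanation.
    return Pathway.ABSTENTION


examples = [
    (frozenset({"fraud"}), False),
    (frozenset(), False),
    (frozenset({"non-fraud"}), True),
]
for region, flag in examples:
    pathway = route(region, flag)
    print(f"{sorted(region)!r:15} policy={flag!s:5} -> {pathway.name} ({pathway.value})")
```

Checking the region size before the policy flag keeps abstentions with the analyst, in line with the table's description of policy-based review as an escalation applied even when model confidence is high.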

Table 10. Latency statistics (seconds) across inference scenarios.

Scenario	Model	Mean	Median	Std	Min	Max
Full Batch	Random Forest	6.298480	6.345143	0.550589	5.615543	6.945900
Full Batch	XGBoost	3.552923	3.294554	0.381216	3.247905	4.334907
Single Instance	Random Forest	0.031691	0.024792	0.010064	0.022717	0.063234
Single Instance	XGBoost	0.003906	0.001471	0.004782	0.000902	0.023151
Small Batch (32)	Random Forest	0.038518	0.034834	0.013388	0.023293	0.118237
Small Batch (32)	XGBoost	0.002787	0.001231	0.004599	0.001116	0.027186

Table 11. Framework Comparison with Prior Work.

Study	Dataset	Method	Coverage	Avg Set Size	XAI + CP
Reference [11]	Balanced	CP	90%	1.10	✗
Reference [23]	Various	LIME	N/A	N/A	✗
Reference [34]	Credit (10:1)	CP	91%	1.15	✗
Reference [35]	Network Intrusion	CP	89%	1.08	✗
ITCF	PaySim (773.70:1)	CP + LIME	90.08%	0.9008	✔

Notes: N/A = not applicable (study does not use conformal prediction or prediction sets). ✔ = method combines XAI and conformal prediction; ✗ = method does not combine XAI and conformal prediction.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
