1. Introduction
Decision support systems (DSS) have been incorporated into various domains that require critical decision-making, such as healthcare, criminal justice, and finance [1,2,3,4,5]. DSS with explainable artificial intelligence (XAI) assist decision-makers by generating a high-level summary of models/datasets or clear explanations of suggested decisions. This paper examines a challenge encountered by DSS in the domain of credit evaluation: how to deliver accurate and credible summary–explanations (SEs). An SE is defined as a local explanation rule with global information. The following is an example of a globally consistent SE for an observation instance of the Home Equity Line of Credit (HELOC) dataset used in the FICO explainability challenge [6]: “For all the 100 people satisfying the given conditions on ExternalRiskEstimate and AverageMinFile, the model predicts a high risk of default”. An SE rule being globally consistent means that its outcome y is true for all instances in the support set (the 100 people that satisfy the conditions). Therefore, although an SE is specific to a target observation, it offers a valuable global perspective on the dataset or model.
In credit evaluation, the problem of maximizing support with global consistency (MSGC) aims to deliver a persuasive explanation for the decision by searching for conditions that are satisfied by both the applicant and most other people in the database (thus maximizing support). The MSGC problem can be formulated as an integer programming (IP) problem [7] and solved by branch-and-bound (B&B) solvers [8].
Well-supported explanations are convincing, which fosters trust in the system among users. Nonetheless, owing to the global consistency constraint, solving the MSGC problem on large datasets often leads to rules that are highly complex or have low support, or the problem may even be infeasible. In many real-world scenarios, explanations with strong support significantly influence user acceptance, whereas minor inconsistencies can often be tolerated. Guided by this observation, this paper addresses the problem of maximizing support with q-consistency (MSqC), which substantially enhances the support of the SE rule. Consider, for example, a 0.85-consistent SE rule with a support value of 2000: “For over 85% of the 2000 people satisfying the given conditions on ExternalRiskEstimate and AverageMinFile, the model predicts a high risk of default”.
Although the extension seems straightforward, it turns out that the MSqC problem is much more complex than the original MSGC problem. The primary cause is that maximizing the q-consistent support, which allows a proportion of up to 1 − q of the matched observations to be inconsistent, requires incorporating all observations with outcomes different from the target into the MSqC formulation. This requirement, i.e., the inclusion of outcome-divergent observations, was absent in the original MSGC problem.
Besides formulating the MSqC problem, this paper also addresses the following challenges:
- 1.
Efficiently solving the MSqC problem on large datasets. The B&B method is inefficient for IP problems with large datasets, including MSGC and MSqC; for example, a 60 s time limit had to be imposed to terminate the solver [7]. In our reproduction, solving MSGC with the SCIP solver [9] requires 1852 s for datasets of size K and 101 s on average for datasets of size K. As demonstrated by the experimental results, the performance of the B&B method degrades significantly when solving the more complex MSqC problem. Therefore, a more efficient and scalable solver is needed for MSqC problems on large datasets.
- 2.
Finding explanations that extrapolate well. In certain domains, such as credit evaluation, optimization models are often solved over a subset of the global data, owing to practical considerations including transmission efficiency, privacy protection, and distributed storage. Therefore, explanations with high support and consistency on a local dataset should also exhibit these qualities on the global dataset. This extrapolation effectiveness of explanations should be measured, and the explanations obtained by our method should exhibit good extrapolation effectiveness.
The primary contributions of this study are summarized below:
- 1.
The MSqC problem is formulated for the first time; it yields summary–explanations (SEs) with substantially higher support by allowing slight reductions in consistency.
- 2.
The simplified increased support (SIS)-based weighted column sampling (WCS) method is proposed for solving the MSqC problem efficiently; it is much more scalable than the standard B&B method.
- 3.
A global prior injection technique is proposed to further improve the SIS-based WCS method for finding SEs with better extrapolation effectiveness.
This paper is organized as follows. Section 2 introduces the background and related studies. Section 3 formally defines the globally consistent SE and the q-consistent SE, and formulates the MSGC and MSqC problems. Section 4 introduces the proposed SIS-based WCS method and the global prior injection technique. In Section 5, computer experiments are conducted to evaluate the effectiveness of the proposed methods in terms of both solving time and solution quality. Finally, Section 6 concludes the paper.
2. Background
2.1. Explainable Artificial Intelligence
Explainable Artificial Intelligence (XAI) refers to a collection of techniques and methods aimed at making AI models more transparent and understandable to human users. Traditional black-box machine learning models, such as deep neural networks and ensemble methods, often yield high predictive accuracy but lack interpretability, making it difficult for users to trust and adopt their decisions in critical applications [10]. XAI seeks to bridge this gap by providing insights into model behavior, offering explanations that enhance accountability, regulatory compliance, and user trust [11,12,13].
XAI methods can be broadly classified into intrinsic interpretability and post hoc interpretability [11]. Intrinsic methods involve models that are inherently interpretable, such as decision trees, linear regression, and rule-based models. These models provide transparency by design, allowing users to trace decision pathways directly.
Post hoc interpretability, on the other hand, involves techniques applied after training a complex black-box model. These include feature importance analysis, local interpretable model-agnostic explanations (LIME) [14], and Shapley additive explanations (SHAP) [15]. Post hoc methods can provide global (explaining the overall behavior of a model) or local (explaining a specific decision) insights into model predictions.
Despite significant progress, XAI faces multiple challenges. For example, (1) simpler models are often more interpretable but may lack predictive power [16] (accuracy vs. interpretability trade-off); (2) many XAI methods are computationally expensive and struggle to scale [17] (scalability and efficiency); (3) XAI methods can be susceptible to adversarial manipulations [18] (robustness against adversarial attacks); and (4) integrating XAI tools into existing systems often requires specialized knowledge [11]. Recent studies have also emphasized the value of incorporating diverse contextual factors to enhance interpretability in complex domains, such as travel demand prediction using environmental and socioeconomic variables [19].
2.2. Combinatorial Optimization in XAI
Combinatorial optimization involves mathematical techniques for selecting the optimal solution from a finite set of possibilities. Common approaches include Mixed-Integer Linear Programming (MILP), Constraint Programming (CP), heuristic search, and branch-and-bound methods [20]. These techniques have found applications in enhancing XAI by providing rigorous ways to enforce transparency constraints and ensure optimal interpretability [16].
Feature Selection and Dimensionality Reduction: Combinatorial optimization can be used to identify the smallest subset of features that maintains model performance while improving interpretability. MILP-based feature selection methods have been shown to enhance transparency by reducing unnecessary complexity [18].
Optimized Decision Trees and Rule Lists: Recent studies have demonstrated the use of combinatorial optimization for training globally optimal decision trees and rule-based models. These methods directly optimize for both accuracy and interpretability by minimizing tree depth or the number of decision rules [16,21,22,23,24].
Counterfactual Explanations: Counterfactual explanations provide actionable insights by showing the minimal changes required for a different model outcome. The generation of optimal counterfactual examples is a combinatorial problem that can be effectively solved using MILP [25].
2.3. Enhancing Credit Evaluation with Combinatorial Optimization
Financial decision-making, particularly in credit evaluation, relies heavily on machine learning models. Regulatory requirements such as the General Data Protection Regulation (GDPR) mandate transparency in automated decision-making, requiring lenders to provide clear explanations for loan approvals and rejections [26].
The use of combinatorial optimization techniques in credit evaluation systems has proven beneficial in several key areas.
Monotonic Constraints in Credit Scoring: First, monotonic constraints in credit scoring ensure that model outputs remain aligned with financial intuition. For instance, increasing a borrower’s income should not decrease their creditworthiness. These constraints can be effectively enforced using MILP, allowing models to maintain both interpretability and regulatory compliance [27].
Optimal Credit Scoring Models: Second, combinatorial optimization facilitates the development of optimal credit scoring models. By deriving sparse yet highly predictive models, these techniques enable the creation of interpretable scoring systems that provide transparent creditworthiness assessments while maintaining strong predictive performance [17].
Counterfactual Explanations for Loan Decisions: Finally, counterfactual explanations generated through MILP provide actionable insights for loan applicants. These explanations suggest specific changes that applicants can make to improve their credit standing, such as reducing outstanding debt or increasing savings. By offering tailored guidance, counterfactual explanations enhance the fairness and transparency of credit decision-making [28].
The integration of combinatorial optimization into XAI for credit evaluation not only improves model interpretability but also aligns with regulatory standards. Transparent models enable financial institutions to audit decision-making processes, ensuring fairness and reducing bias. Moreover, by providing clear, actionable explanations, these methods help to build trust between customers and lenders, ultimately leading to more responsible AI deployment in financial services [26].
In some scenarios, no models exist to generate neighborhood data, and the sole source of knowledge is historical or pre-provided data. One prominent instance of this scenario is the FICO explainable machine learning challenge [6], in which the FICO company provided a dataset generated by its black-box model, yet the model itself remained inaccessible to researchers. For situations in which only the data are accessible, a more data-centered approach is necessary. Two possible options are to fit a model to the data and provide interpretations of it [29,30], or to explain the decisions based on data patterns [7,31]. For example, the globally consistent rule-based summary–explanation (SE) was proposed in [7], casting the problem of maximizing the number of samples that support the decision rule as a combinatorial optimization problem, called the maximizing support with global consistency (MSGC) problem.
However, the global consistency constraint in MSGC often results in infeasibility or small support for the SE solution on many practical large datasets. Indeed, in many real-world scenarios, explanations with high support facilitate users’ trust in the explainer system, whereas minor inconsistencies can often be acceptable within certain thresholds. Guided by this idea, we generalize the MSGC problem to MSqC in this paper by requiring only q-consistency, i.e., permitting a small degree of inconsistency in explanations, thereby significantly enhancing the support of the SE rule.
While our proposed MSqC framework focuses on optimizing the trade-off between support and consistency in rule-based explanations, it also shares conceptual similarities with approximation-based rule generation methods, such as “probably approximately correct” (PAC) rules [32]. These methods often allow a small fraction of exceptions or errors in order to improve the generalization capacity of learned rules.
However, there are important differences. PAC-style rule learners typically operate in a probabilistic framework, relying on distributional assumptions and statistical learning theory to guarantee generalization. Moreover, many of these methods are model-dependent and aim to learn classification rules applicable across the dataset. In contrast, our MSqC framework is model-agnostic and deterministic, directly optimizing rule support under a user-defined consistency threshold q without requiring any probabilistic assumptions or model access. Furthermore, MSqC generates target-specific summary–explanations that not only maximize support but also ensure interpretability in high-stakes decision-making settings, such as credit evaluation.
This distinction positions MSqC as a novel contribution in the space of interpretable explanation frameworks with a focus on optimization-based control over consistency and coverage rather than statistical generalization guarantees.
3. Problem Formulation
In this section, we first introduce the typical use cases of SE under a credit risk assessment scenario; we then present the formal definitions of SE, its max-support problems, the consistency level, and the extrapolation challenge.
3.1. A Credit Risk Assessment Scenario
Here, a credit risk assessment scenario is used to illustrate the use cases and specific forms of SEs generated through a decision support system (DSS). Given input data (also called target data, indicating that it is the target to be explained), the DSS outputs a suggested decision together with the SE that explains the target observation.
Figure 1 illustrates two cases in which the DSS responds to the request of a loan applicant or a banker, facilitating their comprehension of the target observation using SEs that reveal the risk levels of instances in the dataset similar to the target. The figure also shows that different SEs can have varying degrees of impact on users’ trust, depending on their level of credibility. In general, SEs with large support and high consistency levels are more convincing, promoting trust, while SEs with small support or low consistency levels are less persuasive, thereby undermining users’ trust in the system. On the right-hand side of the figure, a comparison is made between the globally consistent SE [7] and the q-consistent SE proposed in this paper, using a target observation sampled from the 10 K-sized FICO dataset [6]. It can be seen that while the globally consistent SE has a consistency of 100%, it has very small support (which is typically the case for globally consistent SEs). In contrast, the q-consistent SE attains much greater support at the cost of a lower consistency of 82%, which, given its advantages, should be acceptable for this scenario, as well as potentially for many others.
3.2. Globally Consistent Summary–Explanation
The summary–explanation (SE) is defined on a truth table, i.e., an observation dataset with binary features, where N serves as the index set for observations and P represents the set of indices for binary feature functions. In the credit risk assessment scenario, binary labels are used to denote high (1) or low (0) risk, although it should be noted that SE does not require labels to be binary. In general, such a truth table can be derived from any dataset with arbitrary inputs: the initial input vector is converted into a binary representation through an ordered set of feature functions. Henceforth, the terms ‘feature’ and ‘feature function’ are treated as synonymous.
Let b denote a conjunctive clause constructed by the logical AND (∧) of multiple conditions, where each condition corresponds to a binary feature p drawn from a subset of the feature set P. For the example above, b could be any conjunction of one or more of these binary feature conditions.
A summary–explanation (SE) is a rule of the form b ⇒ y, which describes the binary classifier that predicts the outcome y whenever clause b is satisfied. A globally consistent SE for an observation e is an SE with the following properties:
- 1.
Relevancy, i.e., the target observation e satisfies b;
- 2.
Consistency, i.e., for all observations i ∈ N, if i satisfies b, then y_i = y_e.
In a more accessible manner, a globally consistent SE can be articulated as follows: “for every observation (such as a person or customer) where the clause b holds, the outcome (e.g., predicted risk/decision) matches the label of observation e”. This type of explanation aligns the current observation with historical data points in the dataset, thereby making it more persuasive to users in domains such as credit evaluation.
The quality of the clause b is evaluated using two metrics:
- 1.
Complexity c(b), which is the count of conditions within b;
- 2.
Support |S(b)|, which is the cardinality of the support set S(b), defined as the set of observations in N that satisfy clause b, i.e., S(b) = {i ∈ N : i satisfies b}.
As is customary, the term support can refer to either the support set or its cardinality, depending on the context. For notational simplicity, this paper does not distinguish between index sets and the original sets of observations they index.
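To make these definitions concrete, the following minimal sketch (ours, not code from the paper) evaluates a candidate clause on a binarized dataset, assuming the data are given as a 0/1 NumPy matrix X (rows are observations, columns are feature functions) and a label vector y; the clause is represented by the indices of its selected features.

```python
import numpy as np

def evaluate_clause(X, y, clause_features, target_idx):
    """Return the complexity, support size, relevancy, and global consistency of a clause b.

    X               : (n, p) 0/1 matrix of binary feature values.
    y               : (n,) label vector.
    clause_features : indices of the binary features AND-ed together in b.
    target_idx      : index of the target observation e.
    """
    complexity = len(clause_features)                 # c(b): number of conditions in b
    satisfied = X[:, clause_features].all(axis=1)     # whether each observation satisfies b
    support = np.flatnonzero(satisfied)               # support set S(b)
    relevant = bool(satisfied[target_idx])            # relevancy: e itself satisfies b
    globally_consistent = bool((y[support] == y[target_idx]).all())
    return complexity, len(support), relevant, globally_consistent
```

Here, relevancy and 1-consistency are checked directly; Section 3.4 relaxes the latter to a q-consistency level.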
3.3. Minimizing Complexity and Maximizing Support
The problems of minimizing complexity with global consistency (MCGC) and maximizing support with global consistency (MSGC) for SEs were introduced by Rudin et al. [7]. The MCGC problem aims to identify a solution b that minimizes complexity c(b), and this objective can be modeled as the following IP model:
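(The model below is a reconstruction consistent with the variable definitions that follow, using u_p for the feature variables, A_e and A_i for the activation feature sets, and N_c for the consistent observations; the numbering (1)–(3) is assumed, and the exact presentation in [7] may differ.)

$$
\begin{aligned}
\min_{u}\quad & \sum_{p \in A_e} u_p && (1)\\
\text{s.t.}\quad & \sum_{p \in A_e \setminus A_i} u_p \ge 1, \quad \forall i \in N \setminus N_c, && (2)\\
& u_p \in \{0,1\}, \quad \forall p \in A_e, && (3)
\end{aligned}
$$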
where the binary decision variable u_p signifies whether feature p is included in the resulting clause b, i.e., u_p = 1 if p appears in b and u_p = 0 otherwise. These variables, u_p, are termed feature variables. Additionally, the binary value x_{i,p} signifies whether observation i satisfies binary feature p. The activation feature set A_e for an observation e consists of the features that are satisfied by e, i.e., A_e = {p ∈ P : x_{e,p} = 1} (and A_i is defined analogously for any observation i). In addition, N_c represents the set of consistent observations, defined as N_c = {i ∈ N : y_i = y_e}. Therefore, the set of inconsistent observations is N \ N_c, where for each i ∈ N \ N_c, it holds that y_i ≠ y_e. The relevancy property is guaranteed by selecting features from A_e only, while the consistency property is ensured by constraint (2), which ensures that for any observation i with y_i ≠ y_e, b is not satisfied by i. The solution of the MCGC model is fast, due to its simplicity; however, it does not guarantee large support.
The MSGC problem can be viewed as an extension of the MCGC problem, where the objective is to find b with maximal support |S(b)|, subject to a complexity constraint c(b) ≤ c*. Following Rudin et al. [7], this paper adopts a complexity bound deemed reasonable for SEs in credit evaluation. The MSGC problem is formulated as follows:
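(As above, the model below is a reconstruction consistent with the description that follows, with r_i denoting the support variables, c* the complexity bound, and M a big-M constant; the numbering (4)–(8) is assumed.)

$$
\begin{aligned}
\max_{u,\,r}\quad & \sum_{i \in N_c} r_i && (4)\\
\text{s.t.}\quad & \sum_{p \in A_e \setminus A_i} u_p \ge 1, \quad \forall i \in N \setminus N_c, && (5)\\
& M\,(1 - r_i) \ge \sum_{p \in A_e \setminus A_i} u_p, \quad \forall i \in N_c, && (6)\\
& \sum_{p \in A_e} u_p \le c^{*}, && (7)\\
& u_p \in \{0,1\} \;\; \forall p \in A_e, \qquad r_i \in \{0,1\} \;\; \forall i \in N_c, && (8)
\end{aligned}
$$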
where the binary decision variable r_i denotes whether observation i is included in the support of clause b, i.e., r_i = 1 if and only if i ∈ S(b). Additionally, the constant M must be sufficiently large (e.g., M ≥ |A_e|). Constraint (6) enforces that for an observation i to be part of b’s support, all conditions within b must be satisfied by i. While the support of the MSGC SE is typically larger than that of the MCGC SE, solving the MSGC model (4)–(8) is notably slower due to its elevated complexity.
3.4. The Third Metric: Consistency Level
Formally, a globally consistent SE corresponds to a 1-consistent or 100%-consistent rule, as it mandates perfect consistency across all observations. However, deriving such a strictly consistent rule is often computationally challenging, if not outright impossible, on real-world datasets. For large-scale applications, both the MCGC and MSGC models typically yield rules with excessive complexity or insufficient support, or turn out to be infeasible. The rationale is straightforward: complex datasets rarely contain 1-consistent rules with moderate complexity. Consequently, relaxing the consistency requirement from strict 1-consistency to a more lenient threshold such as 0.9 or 0.8 becomes a pragmatic choice. This relaxation is widely acceptable in practical SE applications, including credit evaluation, where near-consistent rules often suffice for actionable insights.
We characterize q-consistency as the following condition: for at least a fraction q of the supporting samples (those satisfying b), it holds that y_i = y_e. Recall that N_c = {i ∈ N : y_i = y_e} denotes the set of consistent examples. Formally, the consistency level of the rule is defined as the ratio of consistent supporting examples to the total support, i.e.,

$$ q(b) = \frac{|S(b) \cap N_c|}{|S(b)|}. \qquad (9) $$

Consequently, the q-consistency requirement translates to ensuring that this ratio meets or exceeds q, i.e., q(b) ≥ q.
A q-consistent SE for an observation e can be restated as follows: “For more than a proportion q of all the observations where b holds true, the outcome is the same as that of e”. For instance, in the domain of credit evaluation, a q-consistent SE for an observation of the HELOC dataset [6] can be as follows: “For over 80% of all the 1108 people satisfying the given conditions on NumTotalTrades and NumTradesOpeninLast12M, the model predicts a high risk of default”. From the above, it can be seen that the essence of the summary–explanation lies in the following factors: (1) informing the applicant about how many people share similar features with the applicant (i.e., the support), and (2) explaining to the applicant that, among these similar individuals, at least a fraction q of them defaulted. As a result, the system provides a credible explanation of why the applicant is classified as high risk and consequently rejected.
Naturally, the problem of maximizing support with q-consistency (MSqC) can be extended from the MSGC problem (4)–(8). The objective of MSqC is to maximize the support of the SE while subjecting it to the q-consistency constraint q(b) ≥ q, which can be formulated as follows:
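(The model below is again a reconstruction consistent with the four modifications described next, with z_i = 1 if y_i = y_e and z_i = 0 otherwise; the numbering (10)–(15) is assumed.)

$$
\begin{aligned}
\max_{u,\,r}\quad & \sum_{i \in N} r_i && (10)\\
\text{s.t.}\quad & \sum_{p \in A_e \setminus A_i} u_p \ge 1 - r_i, \quad \forall i \in N \setminus N_c, && (11)\\
& M\,(1 - r_i) \ge \sum_{p \in A_e \setminus A_i} u_p, \quad \forall i \in N, && (12)\\
& \sum_{p \in A_e} u_p \le c^{*}, && (13)\\
& \sum_{i \in N} z_i\, r_i \ge q \sum_{i \in N} r_i, && (14)\\
& u_p \in \{0,1\} \;\; \forall p \in A_e, \qquad r_i \in \{0,1\} \;\; \forall i \in N, && (15)
\end{aligned}
$$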
Compared to the MSGC model (4)–(8), four modifications are introduced, listed as follows.
- 1.
Binary support variables r_i (representing supportive observations) are defined for all observations i ∈ N, rather than just for the consistent observations in N_c.
- 2.
Constraints (11) now incorporate r_i, allowing for inconsistent support (r_i = 1 for some i ∈ N \ N_c). Specifically, recall that in MCGC and MSGC, the consistency constraint requires that for any inconsistent observation i, the SE rule is not satisfied, i.e., b does not hold for i. Thus, constraints (2) and (5) in the MCGC and MSGC problems mean that at least one condition p in the SE rule b (i.e., with u_p = 1) needs to be unsatisfied by i (i.e., p ∈ A_e \ A_i). Here, in MSqC, some inconsistent observations may also satisfy the SE rule b. Thus, constraint (11) imposes this requirement only on those inconsistent observations that are not supportive (r_i = 0).
- 3.
Constraints (12) now apply to all i ∈ N instead of only those in N_c, simply because the r_i are now defined for all observations i ∈ N, rather than just for the consistent observations. As in MSGC, this constraint enforces that any supportive observation (r_i = 1) satisfies all selected conditions, i.e., whenever r_i = 1, no selected feature (u_p = 1) may lie in A_e \ A_i.
- 4.
Constraint (14) introduces the q-consistency requirement, where the binary constant z_i denotes whether y_i = y_e (z_i = 1 if so, and z_i = 0 otherwise). Specifically, since the number of consistent supporting examples is the sum of z_i r_i over i ∈ N, and the size of the support is the sum of r_i over i ∈ N, by the definition of the consistency measure (9), constraint (14) describes the q-consistency constraint q(b) ≥ q.
Remark 1.
Previous research [7] has demonstrated that MCGC and MSGC are NP-hard. The model formulations MCGC (1)–(3), MSGC (4)–(8), and MSqC (10)–(15) reveal a clear progression in their complexity levels. Formally proving the NP-hardness of MSqC is left as future work, but its complexity dominance over MSGC and MCGC is evident from the increased model size, i.e., MSqC has strictly more variables and constraints while maintaining a similar structure.
3.5. The Extrapolation Challenge
In domains such as credit evaluation, optimization models are often solved over a subset of the global data, driven by practical factors such as transmission efficiency, privacy concerns, and distributed storage. As shown in Figure 2, the DSS optimizes an SE on one of the local datasets, while the SE may be evaluated later on the global dataset. The extrapolation challenge states that if the SEs given by the DSS have large support and a high consistency level locally, then they should also have large support and a high consistency level when evaluated on the global dataset.
More formally, let G denote the global dataset (index set), and let N ⊆ G be a local dataset (index set) that is drawn randomly from G. For a target observation e to be explained, an SE rule is sought by solving an optimization problem on N. During the solution process, the local support and local consistency level of an SE rule can be computed on the local dataset N. However, as the global dataset is inaccessible during the optimization process, the challenge in extrapolation lies in improving the global support and global consistency level without direct access to the global dataset.
4. Methodology
As formulated in the previous section, the generation of an SE can be implemented by solving an optimization model. Specifically, three IP models, i.e., MCGC, MSGC, and MSqC, have been introduced to optimize the SE for different objectives.
Since we aim to enhance the credibility of SE by increasing its support, our focus is on the solution algorithms for MSGC and MSqC.
The MSGC model [7] is solved with B&B solvers such as SCIP [9]. However, the B&B method is hardly applicable to these IP models on large datasets, not only because B&B is an exact solution method, but also because MSGC and MSqC are NP-hard. Therefore, it is necessary to develop an approximate solution algorithm for large-scale MSqC problems that is not only efficient but also finds SEs that extrapolate well (see Section 3.5).
However, most approximate algorithms, including evolutionary algorithms (e.g., genetic algorithms and particle swarm optimization), are not suitable for our objective, because both MSGC and MSqC are large combinatorial problems with numerous constraints; in both models, the numbers of binary decision variables and constraints grow with the dataset size. As a result, regular evolutionary methods struggle to even find a feasible solution for MSGC and MSqC.
Our method is based on a heuristic sampling approach that shares similarities with stochastic optimization methods such as stochastic subgradient descent [33,34]. However, unlike purely random sampling, each sub-problem in our framework is constructed in a guided manner using SIS scores, which incorporate global prior knowledge about feature importance (called global prior injection). This guided sampling helps improve the representativeness of the selected sub-problems and addresses the extrapolation challenge commonly encountered in local explanation methods.
In this section, we first introduce the SIS-score-based sampling method for feature variables, then the global prior injection technique that aims to improve the quality of SE solutions on the global dataset, and finally the proposed SIS-based WCS optimization algorithm framework. The relationship between the different components of the optimization framework is illustrated in Figure 3. In short, to achieve scalability in time efficiency, the WCS optimization algorithm decomposes a large MSqC problem into multiple smaller MSGC sub-problems and selects the best solution among the solutions of these MSGCs. Each MSGC is obtained by the column (feature) and row (instance) sampling of the dataset, with column sampling weighted by the features’ pre-computed SIS scores. The global prior injection technique embeds global preferences of the features into their SIS score distribution, utilizing the fact that SIS scores can be computed before the solution process.
For the convenience of the reader, the symbols used throughout this paper are listed in the Supplementary Materials.
4.1. Simplified Increased Support
The simplified increased support (SIS) is a score on feature variables by which the smaller IP sub-problems (i.e., the smaller MSGCs) determine their selection of variables. In other words, each smaller MSGC problem is generated by sampling (without replacement) a given number of feature variables from the activation feature set A_e according to the features’ SIS scores, defined in (16); the derivation process is detailed in the Supplementary Materials. In line with intuition, the SIS score weights a feature p by the frequency with which it is satisfied by observations sharing the target’s label (y_i = y_e), minus the frequency at which it holds across observations with a different label (y_i ≠ y_e). The sampling probability for each feature p is the pth element of the Softmax of the normalized SIS vector, where s represents the SIS vector defined in (16), and the normalized vector is scaled by a factor a before the Softmax is applied.
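As an illustration of this sampling step, the sketch below computes per-feature SIS-style scores as the frequency difference described above and turns them into Softmax sampling probabilities; the min–max normalization and the scale factor a are our assumptions about how the normalization mentioned in the text could be realized.

```python
import numpy as np

def sis_scores(X, y, y_target):
    """Frequency with which each feature holds among observations with the target's
    label, minus the frequency among observations with a different label (cf. (16))."""
    same = X[y == y_target]
    diff = X[y != y_target]
    return same.mean(axis=0) - diff.mean(axis=0)

def sampling_probabilities(scores, a=10.0):
    """Softmax of the normalized and scaled SIS vector (normalization scheme assumed)."""
    s = (scores - scores.min()) / (scores.max() - scores.min() + 1e-12)
    logits = a * s
    exp = np.exp(logits - logits.max())              # numerically stable Softmax
    return exp / exp.sum()

def sample_features(active_features, probs, n_features, rng):
    """Sample feature indices without replacement from the activation feature set A_e."""
    p = probs[active_features]
    p = p / p.sum()
    return rng.choice(active_features, size=n_features, replace=False, p=p)
```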
4.2. Global Prior Injection
To address the global extrapolation challenge, we propose that global prior information be incorporated into each WCS IP model’s local solution procedure through a globally informed SIS score, which extends (16) by computing the frequencies over the global dataset G, as given in (18), where N_c^G represents the subset of consistent observations within the global dataset G, specifically defined as N_c^G = {i ∈ G : y_i = y_e}. By pre-computing the per-feature sums over N_c^G and leveraging the binary nature of the features, the complementary sums over the inconsistent observations in G \ N_c^G follow directly. Notably, these quantities are exclusively determined by the global dataset G; as such, they can be pre-computed and applied to multiple summary–explanation queries.
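A sketch of how such pre-computation could be organized is shown below, assuming the globally informed score keeps the frequency-difference form of (16) but uses per-label feature counts computed once on the global dataset G; the exact combination rule of Equation (18) may differ.

```python
import numpy as np

def precompute_global_counts(X_global, y_global):
    """Per-feature counts of satisfied observations for each label, computed once on G."""
    labels = np.unique(y_global)
    counts = {lab: X_global[y_global == lab].sum(axis=0) for lab in labels}
    sizes = {lab: int((y_global == lab).sum()) for lab in labels}
    return counts, sizes

def global_sis_scores(counts, sizes, y_target):
    """Globally informed SIS-style scores for a query whose target label is y_target."""
    same_freq = counts[y_target] / max(sizes[y_target], 1)
    other = [lab for lab in counts if lab != y_target]
    diff_count = sum(counts[lab] for lab in other)
    diff_size = sum(sizes[lab] for lab in other)
    diff_freq = diff_count / max(diff_size, 1)
    return same_freq - diff_freq
```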
4.3. The Weighted Column Sampling Optimization Framework
The proposed WCS optimization framework for approximately solving the MSqC model (10)–(15) can be decomposed into formulating, solving, and integrating the solutions of a number of smaller IPs, each of which is an MSGC model (4)–(8) defined on a sampled observation dataset and a sampled feature set. The overall procedure of the WCS optimization framework is shown in Figure 3 and Algorithm 1.
To be more specific, for each sub-problem, the observation dataset is uniformly randomly sampled (without replacement) to a given size, and the feature set is sampled (without replacement) to a given size from the activation feature set A_e of the original MSqC problem, according to the global-prior-injected SIS-based probabilities. Given that each sub-problem is small in scale, it can be solved efficiently with the B&B algorithm. Moreover, since the iterations within the for-loop are independent of each other, parallel processing can be utilized to accelerate the solution process.
Algorithm 1: The SIS-based WCS optimization algorithm.
Finally, the ChooseBest function can be implemented in multiple ways. Since both the support and consistency level should be maximized, their product can serve as the sole selection metric, or either one can be given priority. In the case of credit risk assessment, this study employs a scheme in which the solution with maximum support is searched for first among those with a consistency level exceeding 80%. If no such solutions exist, the search continues among those with a consistency level exceeding 75%, and so forth. A sketch of the overall procedure is given below.
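The sketch below outlines the main loop of Algorithm 1 as described above. The routine solve_msgc is a caller-supplied placeholder for the B&B solution of a small MSGC sub-problem (4)–(8) (e.g., via an off-the-shelf MILP solver); all function names, parameter names, and default values here are ours.

```python
import numpy as np

def clause_metrics(X, y, clause, y_target):
    """Support size and consistency level of a clause on the dataset (X, y)."""
    sat = X[:, clause].all(axis=1) if len(clause) else np.ones(len(X), dtype=bool)
    support = int(sat.sum())
    consistency = float((y[sat] == y_target).mean()) if support else 0.0
    return support, consistency

def choose_best(candidates, q_levels):
    """Prefer maximum support among solutions above the highest attainable consistency level."""
    for q in q_levels:
        pool = [c for c in candidates if c[2] >= q]
        if pool:
            return max(pool, key=lambda c: c[1])
    return max(candidates, key=lambda c: c[1] * c[2])   # fallback: support x consistency

def wcs_optimize(X, y, target_idx, probs, solve_msgc, n_sub=50, n_rows=200,
                 n_cols=8, q_levels=(0.80, 0.75, 0.70), rng=None):
    """SIS-based weighted column sampling (WCS) sketch for MSqC.

    solve_msgc(X_sub, y_sub, features, x_e, y_e) solves the small MSGC sub-problem
    (e.g., by B&B) and returns the selected feature indices (the clause b).
    """
    rng = rng or np.random.default_rng()
    x_e, y_e = X[target_idx], y[target_idx]
    active = np.flatnonzero(x_e == 1)                    # activation feature set A_e
    candidates = []
    for _ in range(n_sub):                               # independent, hence parallelizable
        rows = rng.choice(len(X), size=min(n_rows, len(X)), replace=False)
        p = probs[active] / probs[active].sum()          # weighted column sampling
        cols = rng.choice(active, size=min(n_cols, len(active)), replace=False, p=p)
        clause = solve_msgc(X[rows], y[rows], cols, x_e, y_e)
        support, consistency = clause_metrics(X, y, clause, y_e)  # scored on the full local data
        candidates.append((clause, support, consistency))
    return choose_best(candidates, q_levels)
```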
The solution time of MSqC(WCS) depends on the number of sub-problems, the solution time of each sub-problem, and the ChooseBest function. As analyzed by Rudin et al. [7], each MSGC sub-problem is NP-hard. However, fixing the numbers of sampled observations and features sets the time complexity of each sub-problem to a constant level, denoted as K. The ChooseBest function has a time complexity linear in the number of sub-problems. The overall time complexity of MSqC(WCS) therefore grows linearly with the number of sub-problems, with the constant factor K. The space complexity of MSqC(WCS) is governed by the number of threads running concurrently: since the space complexity of each sub-problem can also be set to a constant level by fixing its size, the space complexity of MSqC(WCS) is proportional to the number of concurrent threads. With the parameter values used in our experiments (see Table 1), it can be verified in Section 5 that the solution time of MSqC(WCS) does not increase with the size of the local solution dataset.
5. Computer Experiments
It is worth noting that our method differs fundamentally from popular model-dependent explanation techniques such as LIME [14] and SHAP [15]. These methods aim to interpret the predictions of a trained black-box model by quantifying the contribution of each input feature to the output for individual instances. In contrast, our approach is model-agnostic and dataset-driven: it constructs rule-based summary–explanations that describe subpopulations satisfying certain logical conditions with high support and acceptable consistency, without referencing any specific prediction model.
Due to these fundamental differences in scope and goal, direct empirical comparisons with LIME or SHAP are not meaningful. Instead, we focus on comparing our method against structurally similar summary–explanation models (e.g., MSGC and MSqC solved via B&B) to evaluate performance in terms of support, consistency, and runtime. This comparison better reflects the relative advantages of q-consistent SEs in rule-based, global interpretability contexts.
Specifically, in this section, three different SE generation methods are tested and compared: MSGC(B&B), MSqC(B&B), and MSqC(WCS). The SE generation methods are implemented by solving MSGC or MSqC with the B&B or WCS methods. Note that we use the notation <Model>(<Method>) to denote solving model <Model> using solution method <Method>. For example, MSqC(WCS) denotes solving MSqC (10)–(15) with the SIS-based WCS method and using the solution as the SE output (see Figure 3).
5.1. Experimental Setup
Dataset descriptions. Here, three distinct credit-related datasets are utilized to validate the effectiveness of our approach. The first one is the renowned HELOC dataset, employed in the FICO machine learning explainability challenge [6]. The other two datasets are sourced from the UCI Machine Learning Repository: Taiwan Credit [35], focusing on credit card client default cases in Taiwan, and Australian Credit Approval [36], which concerns credit card applications. In the following, we primarily describe the application of different methods to the HELOC dataset. For detailed information about the Taiwan Credit and Australian Credit Approval datasets, as well as a comparison of the different methods on these three datasets, please refer to Appendix A.
The HELOC dataset contains the credit evaluation-related information of individuals, such as their credit history, outstanding balances, and delinquency status. The information is stored in 23 feature variables, including numeric and discrete types with missing values. (Note that, in addition to handling missing values, feature binarization is required to establish the MSGC and MSqC models, which generates more binary features than the original dataset has.) The target variable “RiskPerformance” represents the repayment status of an individual’s credit account, indicating whether the individual paid their debts as negotiated over a 12–36 month period. The dataset contains data for 10,460 individuals, which is reduced after data cleaning. Further details of this dataset are given in the Supplementary Materials.
Data preprocessing. The dataset used in our experiments has been pre-cleaned and does not contain any missing values. All features are either numerical or categorical. To handle categorical features, we first apply label encoding to convert string values into integers, followed by one-hot encoding to produce binary feature indicators. This representation is compatible with our summary–explanation (SE) framework, which operates on boolean-valued feature functions. For numerical features, we use quantile-based thresholding to generate binary features. Specifically, we select a small number of quantile cut-off points (e.g., quartiles) and transform each numerical variable into a set of binary features indicating whether its value exceeds each threshold. This preprocessing ensures that all features used in the SE models are binary-valued and suitable for logical clause construction.
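As an illustration of this preprocessing, the sketch below (ours; the column lists are hypothetical) derives quantile-threshold indicators for numerical columns and one-hot indicators for categorical columns using pandas:

```python
import pandas as pd

def binarize(df, numerical_cols, categorical_cols, quantiles=(0.25, 0.50, 0.75)):
    """Build the 0/1 feature-function table used by the SE models."""
    binary = {}
    for col in numerical_cols:
        for q in quantiles:                                   # quantile-based thresholding
            thr = df[col].quantile(q)
            binary[f"{col}>{thr:g}"] = (df[col] > thr).astype(int)
    for col in categorical_cols:                              # label + one-hot encoding
        for val in df[col].astype(str).unique():
            binary[f"{col}=={val}"] = (df[col].astype(str) == val).astype(int)
    return pd.DataFrame(binary, index=df.index)
```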
Distributed storage simulation: To evaluate the global extrapolation effectiveness of the different methods, it is necessary to mimic the distributed storage of datasets and carry out the local solution of the MSGC and MSqC models. Here, we use a local–global ratio parameter to control the size of the local dataset. At a ratio of 1, this is equivalent to having no distributed storage, and the models are solved directly on the global dataset.
Parameters and hardware: Table 1 presents the parameter settings employed in the experiments. Experiments were carried out on a Mac Mini desktop featuring an M1 chip.
Metrics: The SE generation methods are evaluated from three different aspects, i.e., solution speed, local solution effectiveness, and global extrapolation effectiveness. Specifically, given a target observation and an SE solution of an SE generation method, five metrics are computed, which can be grouped as follows:
- 1.
Local solution performance metrics, i.e., solution time, local support, and local consistency level.
- 2.
Global extrapolation performance metrics, i.e., global support and global consistency level.
As is customary, the metrics are calculated by averaging across multiple algorithm executions using the 1-shifted geometric mean, which exhibits resilience to outliers of all magnitudes [37].
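For completeness, the 1-shifted geometric mean of a set of measurements can be computed as follows (a standard definition, with the shift value of 1 used in this paper):

```python
import numpy as np

def shifted_geometric_mean(values, shift=1.0):
    """1-shifted geometric mean: exp(mean(log(v + shift))) - shift."""
    v = np.asarray(values, dtype=float)
    return float(np.exp(np.mean(np.log(v + shift))) - shift)
```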
Experimental Procedure
The experiment consists of the following three phases:
- 1.
Preprocessing phase: Missing values were handled and features were binarized. Specifically, ordinal features were binarized with quantile thresholds, and categorical features were binarized into one-hot vectors. Then, the global SIS scores (18) of the features were computed.
- 2.
Execution and parameter sweep phase: In order to investigate how the performance of the three methods is impacted by different parameter configurations, a parameter sweep was conducted. Specifically, for each parameter setting (mainly varying the local–global ratio and the WCS sub-problem size, with other parameters fixed), a round of experiments was conducted, which can be divided into the following two steps.
- (a)
Request scenario generation: A local dataset of the prescribed size was sampled from the global dataset, and an explanation request was randomly generated by selecting an observation from the global dataset as the target observation to be explained.
- (b)
Problem formulation and solution: The MSGC and MSqC models were established; the MSGC model was then solved by B&B, and MSqC was solved by both B&B and the proposed SIS-based WCS method. The names of the SE generation methods, i.e., MSGC(B&B), MSqC(B&B), and MSqC(WCS), indicate which model is being solved by which solution method. Metrics were calculated and recorded. This process was repeated a fixed number of times (see Table 1), and the 1-shifted geometric mean of the metrics was then calculated.
- 3.
Results and analysis phase: The results of the experiments were analyzed and discussed.
In the following, we first take a quick look at how the SEs produced by MSqC(WCS) differ from those produced by MSGC(B&B) by sampling a few results from the SE solutions; then, the results are compared in detail in terms of local solution effectiveness and global extrapolation effectiveness, respectively. In Appendix A, the results for two other datasets are also presented.
5.2. Sampled Results of the Summary–Explanation Solutions
For an illustration of the SE solutions in practical applications, Figure 4 compares several example SEs generated by the MSGC(B&B) and MSqC(WCS) methods for random target observations. It can be observed that the SEs produced by MSqC(WCS) have much greater support while maintaining consistency within an acceptable range for the credit evaluation scenario.
5.3. Local Solution Effectiveness
The effectiveness of the local solutions of the various methods is evaluated by contrasting their solution times, local support sizes, and local consistency levels. Table 2 shows the local solution performance of the three methods under different settings of the local–global ratio with the WCS sub-problem size fixed. We have the following observations.
- 1.
Solution time: MSGC(B&B) and MSqC(B&B) scale poorly with respect to the size of the local dataset; the solution time increases drastically as the local–global ratio increases from 0.01 to 1, which corresponds to local dataset sizes increasing from approximately 0.1 K to 10 K. For the largest local datasets, MSqC(B&B) fails to complete within the 2 h time limit. In contrast, the solution time of MSqC(WCS) is independent of the local dataset size; in fact, it relies only on the number and complexities of the WCS sub-problems.
- 2.
Local support: MSGC(B&B) yields SE solutions with the smallest local support, while MSqC(B&B) yields SE solutions with the largest. This is the expected result, which motivates the formulation of the MSqC problem (see Section 3.4). Furthermore, MSqC(WCS) solutions also have much larger local support than MSGC(B&B) solutions, though not as large as MSqC(B&B) solutions. These observations on the local support and solution time of the different methods are also demonstrated in Figure 5.
- 3.
Local consistency level: MSGC(B&B) solutions always have a local consistency level of 100%, and MSqC(B&B) solutions always have a local consistency level of at least q, because B&B is an exact solution method. Compared with the two B&B-based methods, MSqC(WCS) solutions generally have lower local consistency levels.
However, the local consistency levels of the MSqC(WCS) solutions can be raised by increasing the WCS sub-problem size, as demonstrated in Figure 6, for most local–global ratio values. Increasing the sampled feature ratio can also achieve the same goal, though that experiment is omitted. Note that stochastic variations exist in the performance of MSqC(WCS) because of the randomness in observation and feature sampling (see Algorithm 1).
In summary, MSqC(WCS) offers superior time efficiency compared to MSqC(B&B) at the expense of reduced local support and lower local consistency levels. However, its local support is still significantly larger than that of MSGC(B&B), and its local consistency levels remain relatively close to the desired value q. Moreover, the local consistency levels of MSqC(WCS) can be further improved by adjusting parameters, although this may increase the solution time, necessitating a trade-off based on the specific situation.
5.4. Global Extrapolation Effectiveness
As highlighted earlier, in real-world scenarios, model solving typically involves only a subset of the observation dataset, motivated by requirements like privacy constraints and distributed storage efficiency. Consequently, assessing the global extrapolation effectiveness becomes crucial.
Table 3 shows the global extrapolation performance of the three SE generation methods under different settings of the local–global ratio with the WCS sub-problem size fixed. We have the following observations.
- 1.
Global support: Similar to the local support, MSqC(WCS) solutions also have much larger global support than MSGC(B&B) solutions, though not as large as MSqC(B&B) solutions.
- 2.
Global consistency level: In contrast to the local consistency level, the global consistency level of MSqC(WCS) is comparable to that of MSGC(B&B) and MSqC(B&B). Moreover, advantages can be observed when the local–global ratio is small (e.g., 0.01 or 0.04), corresponding to larger distributed systems where each local dataset is significantly smaller than the global dataset.
In addition, similar to the local consistency levels, the global consistency levels of MSqC(WCS) solutions can also be raised by increasing the WCS sub-problem sizes, as demonstrated in Figure 7.
For a local–global ratio of 1, the scenario is equivalent to the absence of distributed storage, with models being solved directly on the global dataset.
6. Conclusions
In conclusion, this paper presents an improved summary–explanation (SE) decision support method that aims to promote trust in critical decision-making domains such as credit evaluation. Our method addresses the challenges associated with globally consistent SE by formulating the MSqC problem, which yields SEs achieving substantially higher support in exchange for marginally reduced consistencies. The major contributions of this study are as follows:
- 1.
Methodologically, this paper formulates the MSqC problem for the first time and proposes a novel solution method for it, which not only yields SEs with much greater support but is also much more scalable in efficiency.
- 2.
From a practical standpoint, this paper offers a valuable tool for decision-makers to generate high-level summaries of datasets and clear explanations of suggested decisions. By generating SEs with greater support, this tool can improve the reliability and trustworthiness of DSS.
While the proposed approach demonstrates clear advantages in support maximization and runtime efficiency, the current method assumes static datasets and does not address dynamic or streaming data scenarios, where SEs may need continuous adaptation. In addition, our current experiments assume i.i.d. sampling between local and global datasets. In real-world applications, such as federated credit scoring or temporal shifts in user behavior, non-i.i.d. data distributions can significantly impact the extrapolation quality of explanations. Extending the framework to accommodate these aspects would further enhance its practical applicability.