Article

Fuzzy Rules for Explaining Deep Neural Network Decisions (FuzRED)

Johns Hopkins Applied Physics Laboratory, Laurel, MD 20723, USA
* Author to whom correspondence should be addressed.
Electronics 2025, 14(10), 1965; https://doi.org/10.3390/electronics14101965
Submission received: 13 March 2025 / Revised: 1 May 2025 / Accepted: 3 May 2025 / Published: 12 May 2025

Abstract

This paper introduces a novel approach to explainable artificial intelligence (XAI) that enhances interpretability by combining local insights from Shapley additive explanations (SHAP)—a widely adopted XAI tool—with global explanations expressed as fuzzy association rules. By employing fuzzy association rules, our method enables AI systems to generate explanations that closely resemble human reasoning, delivering intuitive and comprehensible insights into system behavior. We present the FuzRED methodology and evaluate its performance on models trained across three diverse datasets: two classification tasks (spam identification and phishing link detection), and one reinforcement learning task involving robot navigation. Compared to the Anchors method, FuzRED offers at least one order of magnitude faster execution time (minutes vs. hours) while producing easily interpretable rules that enhance human understanding of AI decision making.

1. Introduction

Artificial intelligence (AI) systems have demonstrated increasingly remarkable modeling capabilities, yet this capability has come at the cost of increased complexity, which obscures the reasoning behind their decisions, predictions, or classifications. This lack of transparency has sparked concerns across various sectors, resulting in a reluctance to adopt AI systems. For example, medical professionals who rely on precise and understandable data to make critical decisions often hesitate to integrate AI into patient care due to the opaque nature of its outputs [1]. Similarly, military leaders have expressed reservations about deploying AI systems, citing a lack of transparency as a significant barrier [2].
Addressing these concerns requires the advancement of explainable AI (XAI), which seeks to make AI systems more transparent and understandable to humans. By offering clear, understandable explanations for their predictions, XAI can enhance trust, accountability, and privacy—key factors for the widespread adoption and integration of AI into real-world applications. Both governmental and non-governmental entities have emphasized the critical role of explainability, tying it to ethical standards and regulatory compliance [3,4]. For instance, the European Union’s General Data Protection Regulation includes provisions such as the “right to explanation”, potentially mandating explainability in AI systems [5]. These regulatory and ethical drivers have spurred increased research and development of XAI techniques, aimed at making AI systems more comprehensible to humans.
XAI techniques generally provide two types of explanations: local and global. Both types play a crucial role in achieving transparency in AI systems, each with its own strengths and limitations [6,7]. Local explanations provide precise, fine-grained insights into individual predictions, but when viewed in isolation, they fail to capture the broader behavior of the model. Global explanations, in contrast, aim to offer a broad understanding of how the model makes decisions across its entire feature space. However, achieving global explanations is challenging in practice, as Molnar states [8], “any model that exceeds a handful of parameters or weights is unlikely to fit into the short-term memory of the average human… [and] any feature space with more than three dimensions is simply inconceivable for humans”. This cognitive limitation makes it difficult to fully grasp complex models at a global level, underscoring the need for balanced approaches that combine both explanation types.
Ideally, we could utilize the precision of local explanations to construct global explanations that remain both intuitive and interpretable, without compromising accuracy or oversimplifying the model’s behavior. The FuzRED method aims to achieve this by leveraging feature importance rankings from local explanations to generate fuzzy IF → THEN rules, providing a high-level representation of model behavior.
We have found that using rules to explain AI model behavior significantly enhances interpretability, especially when these rules are grounded in familiar concepts or domain knowledge. Rules serve as simple, logical statements that describe the relationships between inputs and outputs, making model decisions more transparent. For instance, in a medical diagnosis system, a rule might state, “If a patient has a fever and a persistent cough, then there is a high probability of a respiratory infection”. Such rules are intuitive and align with the way humans naturally reason about cause and effect. However, real-world data are often uncertain and imprecise, which is where fuzzy association rules become valuable. Unlike rigid, binary rules, fuzzy rules introduce degrees of truth, making them more adaptable to complex and nuanced scenarios. For example, instead of classifying a fever as simply “present” or “absent”, a fuzzy rule might describe a patient’s temperature as “high” to a certain degree, allowing for a more flexible and realistic interpretation of medical conditions [9].
This paper introduces a novel XAI approach that leverages local explanations provided by SHAP (Shapley additive explanations) [10] to generate global explanations in the form of fuzzy association rules. By employing fuzzy association rules, our method enables AI systems to explain their decision-making in a way that aligns with human reasoning, offering explanations that are both accurate and intuitively understandable. This approach helps bridge the gap between complex AI models and human interpretability, enabling users to comprehend the underlying logic behind AI-driven decisions. As a result, it enhances trust, transparency, and accountability, facilitating broader adoption of AI systems in high-stakes domains where explainability is critical.
The key contributions of FuzRED center on its novel approach to bridging the gap between complex model behavior and human interpretability. Most notably, it leverages fuzzy logic [11]—a system inherently aligned with human reasoning—to express model decisions using intuitive, linguistic categories such as “low”, “medium”, and “high”. This alignment enables FuzRED to translate abstract statistical patterns into rule-based explanations that people can naturally understand and reason about. By combining this fuzzy logic framework with feature-importance analysis, FuzRED automatically generates concise, meaningful rules that reflect the internal structure of black-box models. It is broadly applicable across various domains, including both classification tasks and deep reinforcement learning (DRL). Additionally, FuzRED is computationally efficient (more details in Section 5), making it suitable for deployment in real-time or iterative workflows. Overall, FuzRED advances the interpretability of AI by enabling clear, human-aligned insights that foster trust, support validation, and guide debugging.
The remainder of the paper is organized as follows: Section 2 provides an overview of XAI and related research. Section 3 outlines our proposed methodology, and Section 4 describes our experimental results. In Section 5, we discuss the key findings and their implications. Finally, Section 6 summarizes our conclusions and outlines directions for future work.

2. Related Work

2.1. AI Explainability

Alongside the development of increasingly complex models, the concept of model interpretability has emerged in the context of AI. Early AI systems, such as decision trees and rule-based systems, were regarded as inherently interpretable due to their transparent structure and straightforward decision-making processes. These systems allowed for easy traceability of decisions, which facilitated their use in critical applications. However, as AI techniques advanced, particularly with the rise of deep learning, models became more complex, leading to significant improvements in predictive accuracy but at the cost of interpretability [12]. In recent years, this tradeoff between accuracy and interpretability has fueled a surge in the development of XAI approaches aimed at making black-box models more interpretable. These methods are crucial in ensuring that AI systems are not only accurate but also trustworthy, especially in domains where understanding the decision-making process is essential [8].
The term model interpretability encompasses a wide range of conflicting definitions. For the purpose of this paper, we adopt the perspective of transparency via decomposability, as described by Lipton [7]: “…that each part of the model—each input, parameter, and calculation—admits an intuitive [to humans] explanation”. Although interesting work has been conducted across this expansive field, in this review we focus on existing techniques that are most relevant for comparison with the FuzRED method. This section is not intended to provide a comprehensive catalogue of the expansive ecosystem of all explainability methods. Rather, we limit our review to model-agnostic, post-hoc analysis methods. Model-specific methods (such as [13]) are promising but differ from the intended FuzRED use case that presumes no direct access to the model itself.
Model-agnostic, post-hoc analysis methods apply interpretability techniques to a previously trained model to gain insights into its internal decision-making processes without direct access to the model itself. This approach is particularly useful for debugging models, building trust before deployment, and performing post-deployment analysis. Post-hoc methods can be broadly categorized into feature importance methods and rule/explanation generation techniques [12].
Feature importance methods quantify the contribution of individual features to a model’s predictions. These methods are model-agnostic, meaning that they can be applied to any type of model, and they offer both local and global insights into the model’s behavior. However, these methods can struggle with correlated features, which can lead to potential misinterpretations of the feature importance scores. Two of the most widely used feature importance methods are LIME (local interpretable model-agnostic explanations) [14] and SHAP [10].
LIME generates local surrogate models to approximate the decision boundary of any black-box model. It perturbs the input data and observes the resulting changes in predictions, creating a simpler model that approximates the behavior of the original model within a local region [14].
SHAP is theoretically founded on Shapley values, a concept originally developed for cooperative game theory. SHAP values quantify the contribution that each player (i.e., feature) brings to the game (i.e., output of the model). They measure the impact of a specific feature’s current value on a prediction, compared to the effect it would have if set to a baseline value. SHAP provides a mathematically rigorous yet intuitive way to explain complex models [10]. The absolute SHAP value indicates the degree of influence a single feature has on a given prediction, helping to interpret the model’s decision-making process.
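To make the game-theoretic idea concrete, the following sketch computes exact Shapley values for a tiny model by enumerating every feature coalition, with absent features set to a baseline value. This is a pure-Python illustration of the underlying definition, not the optimized estimators used by the SHAP library; the model, weights, and values are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values by enumerating all coalitions; features in the
    coalition take their values from x, absent features take the baseline.
    Cost is exponential in the number of features, so illustrative only."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for S in combinations(others, size):
                # Shapley weight for a coalition of this size.
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                with_i = [x[j] if (j in S or j == i) else baseline[j] for j in range(n)]
                without_i = [x[j] if j in S else baseline[j] for j in range(n)]
                phi[i] += weight * (model(with_i) - model(without_i))
    return phi

# Toy linear model: for linear models, the Shapley value of feature i
# reduces to w_i * (x_i - baseline_i).
w = [2.0, -1.0, 0.5]
model = lambda v: sum(wi * vi for wi, vi in zip(w, v))
phi = shapley_values(model, x=[1.0, 3.0, 2.0], baseline=[0.0, 0.0, 0.0])
print([round(p, 6) for p in phi])  # -> [2.0, -3.0, 1.0]
```

The second feature receives a large negative value: it pushes the prediction down, exactly the sign convention described above.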
Rule/explanation generation methods, on the other hand, aim to produce human-interpretable rules that describe a model’s behavior. These rules are typically presented as “if–then” statements, which are designed to be easily understood by non-technical stakeholders.
A prominent example of this approach is Anchors [15], which provides high-precision explanations for individual predictions. Unlike LIME and SHAP, both of which produce explanations based on feature importance, Anchors produces rule-based explanations that are directly interpretable. These “anchors” are conditions in the input space that, when satisfied, guarantee the same model prediction with high probability. Anchors’ rules are local explanations; there is one rule for each data point one wishes to explain. The primary advantage of Anchors is its ability to generate explanations well aligned with rule-based reasoning, which is often preferred in domains like healthcare, finance, and law [6]. Every Anchors rule has coverage and precision metrics estimated from the distribution of the training set. Coverage describes the fraction of instances to which a rule applies (i.e., the percentage of instances that satisfy the antecedent of the rule), and precision describes the fraction of the rule’s predictions that are correct (i.e., the percentage of instances that satisfy both the antecedent and the consequent of the rule).
Coverage of a rule R is defined as follows:
$$\mathrm{Coverage}(R) = \frac{n_{\mathrm{covers}}}{|D|}$$
Precision (also referred to as accuracy) is defined as follows:
$$\mathrm{Precision}(R) = \frac{n_{\mathrm{correct}}}{n_{\mathrm{covers}}}$$
where D is the class-labeled dataset, |D| is the number of instances in D, n_covers is the number of instances covered by R, and n_correct is the number of instances correctly classified by rule R.
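As an illustration of these two metrics, the sketch below evaluates a single crisp IF-THEN rule over a toy labeled dataset; the rule, feature name, and records are hypothetical.

```python
def rule_metrics(rule, dataset):
    """Coverage and precision of a crisp IF-THEN rule over a labeled
    dataset. `rule` is a pair (antecedent, consequent): a predicate on
    a record's features and the class the rule predicts. Each record
    is a (features_dict, label) pair."""
    antecedent, consequent = rule
    covered = [(x, y) for x, y in dataset if antecedent(x)]
    if not covered:
        return 0.0, 0.0
    n_correct = sum(1 for x, y in covered if y == consequent)
    coverage = len(covered) / len(dataset)    # n_covers / |D|
    precision = n_correct / len(covered)      # n_correct / n_covers
    return coverage, precision

# Hypothetical toy data: records with a single feature "links" and a
# spam/ham label.
data = [({"links": 5}, "spam"), ({"links": 7}, "spam"),
        ({"links": 6}, "ham"),  ({"links": 0}, "ham"),
        ({"links": 1}, "ham")]
rule = (lambda x: x["links"] > 3, "spam")  # IF links > 3 THEN spam
cov, prec = rule_metrics(rule, data)
print(cov)  # 3 of 5 records satisfy the antecedent -> 0.6
```

Here precision is 2/3: of the three covered records, two are actually labeled spam.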
There are also two interesting additional categories of explanations: counterfactual explanations and concept-based methods. These methods show promise but are not easily comparable with FuzRED. The goal of counterfactual explanations is to describe the following for an example input to an AI model: “why was the model’s output P rather than Q?” [16] and how the input features can be changed to achieve output Q [17]. Counterfactual explanations focus on actionable and human-understandable changes, helping users grasp decision boundaries more clearly [18]. These explanations are particularly useful in high-stakes domains like finance and healthcare, where understanding “what could be different” can guide users in decision-making or policy compliance [6]. However, generating realistic and feasible counterfactuals remains a technical challenge, requiring careful balance among proximity, plausibility, and diversity [19].
Concept-based methods aim to bridge the gap between low-level model features and high-level human understanding by interpreting model behavior through semantically meaningful concepts. Instead of focusing on individual features or pixels, these approaches identify and reason about abstract concepts, such as “striped texture” or “wheel” in image classification tasks [20]. Concept activation vectors (CAVs), for example, allow users to quantify how sensitive a model’s predictions are to these human-defined concepts. However, defining relevant concepts and ensuring they are faithfully learned remain significant challenges.

2.2. Fuzzy Association Rule Mining (FARM)

There exists a large body of research that describes techniques related to association rule mining (ARM) and fuzzy association rule mining (FARM). The primary objective of data mining is to uncover hidden and previously unknown patterns and insights from data. When these insights are expressed as relationships among different attributes, the process is referred to as ARM. Introduced by Agrawal and Srikant [21], ARM was originally developed to identify interesting co-occurrences within supermarket data, a classic problem known as market basket analysis. The Apriori algorithm [21] is a fundamental technique in association rule mining, widely recognized for its ability to uncover relationships among items in large transactional datasets. It identifies frequent itemsets by leveraging the downward-closure property, which asserts that any subset of a frequent itemset must also be frequent. This principle allows the algorithm to efficiently prune the search space, reducing computational complexity.
The Apriori algorithm begins by identifying individual items that meet a minimum support threshold, then iteratively generates larger itemsets by combining these frequent items. Itemsets that do not meet the support threshold are discarded, ensuring efficiency. Once frequent itemsets are identified, the algorithm generates association rules based on these sets, evaluating them by their confidence or the likelihood of one itemset leading to another.
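The level-wise search just described can be sketched as follows. This is a minimal, illustrative implementation over toy market-basket transactions; production implementations add many optimizations (hash trees, transaction pruning, etc.).

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Minimal Apriori sketch: find all itemsets whose support (the
    fraction of transactions containing them) meets min_support,
    level by level, pruning via the downward-closure property."""
    n = len(transactions)
    support = lambda s: sum(1 for t in transactions if s <= t) / n
    items = sorted({i for t in transactions for i in t})
    frequent = {}
    level = [frozenset([i]) for i in items]
    while level:
        level = [s for s in level if support(s) >= min_support]
        frequent.update({s: support(s) for s in level})
        # Join frequent k-itemsets into (k+1)-itemset candidates.
        candidates = {a | b for a in level for b in level
                      if len(a | b) == len(a) + 1}
        # Downward closure: discard candidates with any infrequent subset.
        level = [c for c in candidates
                 if all(frozenset(sub) in frequent
                        for sub in combinations(c, len(c) - 1))]
    return frequent

baskets = [{"bread", "butter", "milk"}, {"bread", "butter"},
           {"bread", "milk"}, {"beer", "diapers"}]
freq = apriori(baskets, min_support=0.5)
print(freq[frozenset({"bread", "butter"})])  # -> 0.5
```

With `min_support=0.5`, the infrequent singletons (beer, diapers) are pruned in the first pass, so no candidate containing them is ever generated.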
A limitation of traditional association rule mining is that it only works on binary data (i.e., an item is either purchased in a transaction (1) or not (0)). This approach is insufficient for many real-world applications where data are often categorical (e.g., district names, types of public health interventions) or quantitative (e.g., rainfall, temperature, age). Boolean rules are inadequate for such types of data, prompting the development of more advanced methods such as quantitative association rule mining [22] and fuzzy association rule mining [23].
The fuzzy extension of the Apriori algorithm is an adaptation of the classic Apriori algorithm, designed to handle the inherent vagueness and uncertainty present in many real-world datasets [24]. Whereas the traditional algorithm generates frequent itemsets and derives association rules from crisp, binary data (each item either fully satisfies a condition or it does not), the fuzzy version proceeds similarly but operates on fuzzy itemsets, which are generated using predefined fuzzy membership functions (FMFs). By allowing for partial truth values, the fuzzy Apriori algorithm enhances the expressiveness and applicability of association rule mining in complex, real-world scenarios.

3. Materials and Methods

The FuzRED method seeks to deliver a set of rules that provide an accurate, human-understandable picture of broad model behavior. Importantly, this method provides users with the flexibility to restrict or expand the desired level of granularity at which the rulesets are generated and optimized. This flexibility is described later in Section 3.1 and Section 3.2.
The main steps of the explainability method that we developed are shown in Figure 1. We start with an AI model (e.g., shallow neural network, deep neural network, support vector machine, deep reinforcement learning network) that was trained to provide a decision or a prediction. For ease of description, let us assume that the decision is a classification into one of several classes, although in reality the decision or prediction could be of a different type. The first step is to determine, from the large number of input features (sometimes thousands of them), the top n features that are the most important. In the second step, a set of FMFs is built for each of the n features. In the third step, fuzzy rules are extracted for those features and the corresponding membership functions. These fuzzy rules explain why a given instance is classified into a given class (as such, they are local explanations), while the full set of rules is a global explanation of the AI model's behavior. Often the full set of rules is not necessary; the fourth step of FuzRED is therefore to choose a small subset of rules that still explains the model's behavior.

3.1. Determination of the Most Important Features

The first step in the FuzRED method is to determine the most important features contributing to model predictions. The goal of this step is to identify the set of model input features that will be the most useful antecedents to consider when generating the ruleset (i.e., the x in an IF x THEN y rule). This step is important for two reasons: scale and comprehensibility. In terms of scale, the number of input features in a given dataset (or used as input by a given AI model) can vary widely, with some datasets providing thousands of features per sample. If the input feature space is sufficiently large and all features are equally considered as possible antecedents, the computational load of generating the ruleset can rapidly become untenable. Furthermore, an overly large set of distinct antecedents within a resulting ruleset can easily overwhelm humans who attempt to use the ruleset to understand model behavior. Identifying the top n most influential input features therefore supports comprehensibility as well as managing scale. The user has the flexibility to choose however many of the most influential input features fit their needs.
In our method, we use SHAP values to identify a small set of features that are the most important to the model being explained (i.e., the features that have the largest impact on the predictions that the model makes). Lundberg and Lee [10] demonstrated that SHAP values were more consistent with human intuition than other explanation methods, and SHAP values have the advantage of providing a high degree of per-prediction precision when quantifying the input contributions of a single feature. Each SHAP value represents the contribution of a single feature value to a given model prediction, where positive SHAP values indicate features that increase the model prediction for a given example, and negative SHAP values indicate features that decrease the model prediction for a given example.
As previously mentioned, SHAP values are localized to a single example, making them poor candidates on their own for extrapolating global model behavior. However, a useful characteristic of SHAP is that the global importance of each feature can be derived by aggregating SHAP values across all the examples in a dataset. The authors in [10] propose creating a global feature importance plot by calculating the mean absolute SHAP value for each feature across a dataset. In FuzRED, we leverage this metric (i.e., mean absolute value, which we term mean magnitude) and investigate additional candidate metrics for capturing global feature importance. These metrics provide different ways to aggregate feature contributions across all predictions, enabling a deeper analysis of feature importance. Specifically, we evaluated six metrics: frequency, mean contribution, mean magnitude, maximum contribution, maximum magnitude, and minimum contribution. Frequency measures how often a feature appears among the top n important features for a set of predictions. Mean contribution sums the signed (positive and negative) contributions of a feature across a set of predictions and divides the sum by the number of predictions, while mean magnitude averages the absolute values of these contributions (it is the same as the mean absolute SHAP value used by others). The remaining metrics are derived by applying the maximum and minimum operations to the signed contributions and the maximum operation to the magnitudes. Despite its name, the minimum contribution does not find the least important features (we would not be interested in those); because some SHAP values are negative, it finds the features whose contributions are the most negative (these are large contributions, just in the negative direction).
These six metrics are then compared to identify the most informative features and are used to select the top n candidates as the antecedents available for use during rule generation.
The above step presents the first instance of the granular flexibility mentioned at the beginning of Section 3. As noted earlier in the quotation from Molnar [8], humans have limited capacity for holding large volumes of data in their short-term memory. In this step, n can be chosen to suit the needs of the chosen dataset, model, and analyst. For some cases, excessively large datasets might warrant a higher n value in order to provide a detailed, granular analysis of the impact the large feature space has on the resulting model predictions. In other cases, an easy-to-comprehend list of rules based on a small subset of features might be preferable for building trust among subject matter experts (SMEs).
We often choose a value of 20 for n to provide a balance between a sufficiently small feature space and an acceptable level of analysis of the resulting ruleset for benchmarking purposes. Ideally, n should be chosen after a careful analysis of the balance among the density of high-impact features across the dataset, the available computational resources, and the potential for cognitive overwhelm on human analysts from the resulting ruleset.
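The six aggregation metrics can be sketched as follows, assuming the per-prediction SHAP values are available as a matrix whose rows are predictions and whose columns are features. The values below are hypothetical placeholders standing in for the output of a SHAP explainer.

```python
# Hypothetical SHAP values for 4 predictions x 3 features.
shap_values = [[ 0.5, -0.2,  0.1],
               [ 0.4,  0.3, -0.6],
               [-0.1, -0.4,  0.2],
               [ 0.6,  0.1, -0.5]]

n_preds = len(shap_values)
n_feats = len(shap_values[0])
cols = list(zip(*shap_values))  # per-feature columns

metrics = {
    "mean_contribution": [sum(c) / n_preds for c in cols],
    "mean_magnitude":    [sum(abs(v) for v in c) / n_preds for c in cols],
    "max_contribution":  [max(c) for c in cols],
    "min_contribution":  [min(c) for c in cols],  # most negative contribution
    "max_magnitude":     [max(abs(v) for v in c) for c in cols],
}
# Frequency: fraction of predictions in which a feature ranks among the
# top_n by absolute SHAP value.
top_n = 2
top_sets = [sorted(range(n_feats), key=lambda f: -abs(row[f]))[:top_n]
            for row in shap_values]
metrics["frequency"] = [sum(f in s for s in top_sets) / n_preds
                        for f in range(n_feats)]

# Rank features by one metric (here mean magnitude) to pick the top
# candidates for rule antecedents.
ranking = sorted(range(n_feats), key=lambda f: -metrics["mean_magnitude"][f])
print(ranking)  # -> [0, 2, 1]
```

In this toy example feature 0 dominates by mean magnitude, while feature 2 has the most negative single contribution, which is exactly the kind of feature the minimum-contribution metric is designed to surface.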

3.2. Building FMFs

After identifying the top n features, in the second step of FuzRED, we create FMFs for each feature by fitting Gaussian mixture models (GMMs) to the values observed in the training set. GMM [25] is a soft clustering machine learning method used to determine the probability that each data point belongs to a given cluster (it produces fuzzy clusters). The data point can belong to several clusters, each with a different probability, and those probabilities sum to 1. A Gaussian mixture is composed of several Gaussians, each identified by k ∈ {1, …, K}, where K is the number of clusters of our data set. When we require m membership functions per feature, we set K to m (i.e., if we want 2 membership functions, K is set to 2). The user can require a different number of FMFs for different features. This is the second flexible element of FuzRED: the user can choose as many or as few FMFs per feature as desired.
Given that GMMs perform soft clustering, they can be used to automatically learn FMFs from the data. Those FMFs will have a Gaussian shape. This allows the user to skip the process of manually defining all the FMFs for all the features, but instead rely on an automatic method.
GMMs are generated separately for each of the top n features. Our software can produce any number of membership functions per feature that the user desires. We often use two or three membership functions per feature; this is usually sufficient and leads to fewer rules than using more membership functions.
If the user desires three membership functions per feature, K is set to 3, and each mixture model consists of three components, corresponding to three fuzzy classes—e.g., low, medium, and high. These FMFs are then applied to both the training and test datasets to generate fuzzy membership data for 3 × n fuzzy classes (three per feature).
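A sketch of how Gaussian FMFs can be built from fitted mixture components follows. The means and standard deviations below are hypothetical placeholders; in FuzRED they would come from a GMM fitted to each feature's training values (e.g., with scikit-learn's GaussianMixture).

```python
import math

def gaussian_fmf(mean, sigma):
    """Gaussian fuzzy membership function, peaking at 1 at the mean."""
    return lambda x: math.exp(-0.5 * ((x - mean) / sigma) ** 2)

# Hypothetical parameters for one feature's three components: low/medium/high,
# as could be read off a fitted 3-component GMM.
components = {"low": (0.1, 0.15), "medium": (0.5, 0.12), "high": (0.9, 0.10)}
fmfs = {name: gaussian_fmf(m, s) for name, (m, s) in components.items()}

x = 0.55  # a sample feature value
memberships = {name: f(x) for name, f in fmfs.items()}
best = max(memberships, key=memberships.get)
print(best)  # -> medium
```

Note that a value can belong to several fuzzy classes at once: 0.55 has a high "medium" membership but also small, non-zero "low" and "high" memberships, which is precisely the soft-clustering behavior described above.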
Then, we perform rule extraction using the FARM method described in Section 3.3 to automatically extract fuzzy association rules from the test data.

3.3. Extracting Fuzzy Association Rules

FuzRED’s third step is the automatic extraction of fuzzy association rules from the test data. As described in Section 2.2, ARM identifies frequent itemsets (i.e., combinations of items that are purchased together in at least N transactions in the database) and then generates association rules based on these itemsets. For example, if {X, Y} is a frequent itemset, ARM can generate rules such as X → Y or Y → X. A simple association rule in the context of consumer behavior might be:
IF (Bread AND Butter) THEN Milk
This rule indicates that if a customer purchases bread and butter, they are also likely to purchase milk. Such rules can be highly valuable for store managers in determining how to organize products on shelves to maximize sales. While some rules, like the example above, may seem intuitive, ARM is particularly powerful because it can also reveal hidden patterns that may not be immediately obvious to SMEs. One famous example is the surprising rule:
IF (Diapers) THEN Beer
Initially, store managers were skeptical of this finding, suspecting a flaw in the ARM methodology. However, upon closer examination of the transaction data, they found that this pattern was indeed prominent, particularly in the evenings. The explanation was that dads were often asked to pick up diapers on their way home from work and, while doing so, rewarded themselves by purchasing beer. This insight, which would not have been obvious without ARM, demonstrates the method’s ability to reveal non-trivial, actionable knowledge.
Fuzzy association rules [23] extend the concept of association rules by introducing the ability to handle continuous or categorical variables with a degree of uncertainty. These rules take the following form:
IF (X IS A) THEN (Y IS B)
where X and Y are variables, and A and B are fuzzy sets that characterize X and Y respectively. A simple example of fuzzy association rule for a medical application is the following:
IF (Temperature IS Strong Fever) AND (Skin IS Yellowish) AND (Loss of appetite IS Profound) THEN (Hepatitis IS Acute)
This rule states that a patient exhibiting a strong fever, yellowish skin, and a profound loss of appetite is likely suffering from acute hepatitis. “Strong Fever”, “Yellowish”, “Profound”, and “Acute” are membership functions of the variables Temperature, Skin, Loss of appetite, and Hepatitis, respectively. To illustrate this, consider the FMFs for the variable Temperature given in Figure 2. According to the definition in that figure, a person with a body temperature of 100 °F has a “Normal” temperature with a membership value of 0.2 and, simultaneously, has a “Fever” with a membership value of 0.78. These FMFs provide a nuanced way to interpret data, reflecting the complexity of real-world situations. More information on fuzzy logic and fuzzy membership functions can be found in [11].
More formally, let D = {t1, t2, …, tn} represent the transaction database, where each ti is a transaction in D. Let I = {i1, i2, …, im} be the universe of items. A set X⊆I of items is called an itemset. When X has k elements, it is called a k-itemset. An association rule is an implication of the form X → Y, where X⊂I, Y⊂I, and X∩Y = ∅.
ARM and FARM rules have certain metrics associated with them. The three metrics that we use for evaluation and comparison of our method are support, confidence, and coverage. The support of an itemset, X, is defined as follows:
$$\mathrm{Support}(X) = \frac{\text{number of records with } X}{\text{number of records in } D} = \frac{n_x}{n} = p_x$$
where n is the number of records in D, n_x is the number of records with X, and p_x is the associated probability. The support of a rule (X → Y) is defined as follows:
$$\mathrm{Support}(X \to Y) = \frac{\text{number of records with } X \text{ and } Y}{\text{number of records in } D} = \frac{n_{xy}}{n} = p_{xy}$$
where n_xy is the number of records with X and Y, and p_xy is the associated probability.
We can also define the concept of fuzzy support. In order to do this, we first need to define how we determine whether (and to what degree) one fuzzy set is a subset of another. To this end, we define the matching degree (MD) of one set, S 1 , to another, S 2 , as follows:
M D S 1 S 2 = min f i F 1 f i S 2
where F 1 is the set of membership functions inherently selected by S 1 . We note that the min function in this definition is a T-norm. There are many other options for a T-norm that can be chosen, and a discussion of these goes beyond the scope of this paper, but for our purposes we always use the minimum function. We can now define the fuzzy support of a rule (XY) as follows:
$$\mathrm{Fuzzy\ Support}(X \to Y) = \frac{\text{sum of the MD of } X \cup Y \text{ to records in } D}{\text{number of records in } D} = \frac{\sum_{r \in D} MD_{X \cup Y}(r)}{n}$$
The confidence of a rule (X → Y) is defined as follows:
$$\mathrm{Confidence}(X \to Y) = \frac{\text{number of records with } X \text{ and } Y}{\text{number of records with } X} = \frac{n_{xy}}{n_x} = \frac{n_{xy}}{n} \times \frac{n}{n_x} = \frac{p_{xy}}{p_x}$$
Similarly, the fuzzy confidence of a rule (X → Y) can be defined as follows:
$$\mathrm{Fuzzy\ Confidence}(X \to Y) = \frac{\text{sum of the MD of } X \cup Y \text{ to records in } D}{\text{sum of the MD of } X \text{ to records in } D} = \frac{\sum_{r \in D} MD_{X \cup Y}(r)}{\sum_{r \in D} MD_{X}(r)}$$
Confidence can be treated as the conditional probability (P(Y|X)) of a transaction containing X also containing Y. A high confidence value suggests a strong association rule. However, care must be taken in making this inference. For example, if the antecedent (X) and consequent (Y) both have a high support, the rule could have a relatively high confidence even if they were independent.
Support and confidence enable us to compare individual rules, but we also need to be able to compare sets of rules. An important metric for rule sets is coverage. This metric tries to capture how well a rule set “covers” a given data set, or what percentage of the records in the data set are in the support of one of the rules in the rule set. This concept is straightforward for crisp rules, but fuzzy rules introduce some complexities. We can define the coverage of a rule set, R, over a data set, D, as follows:
$$\mathrm{Coverage}(R, D) = \frac{\text{number of records with } X \text{ and } Y \text{ for some } (X \to Y) \text{ in } R}{\text{number of records in } D} = \frac{\left|\{d \in D : X \cup Y \subseteq d \text{ for some } (X \to Y) \in R\}\right|}{n}$$
It is often the case that the MD of a fuzzy rule to a data record is non-zero but extremely small. Thus, including all records with non-zero MD for some rule in a fuzzy rule set does not provide an accurate view of the coverage of that rule set. Therefore, we need to set a threshold which we call minCov for the minimum MD needed for a data record to be counted as “covered”. We also need to define the maximum matching degree (MMD) of a rule set. This is a generalization of the MD defined above for individual rules. The MMD of a rule set, R, on a data record, r, can be defined as follows:
$$MMD_R(r) = \max_{(X \to Y) \in R} MD_{X \cup Y}(r)$$
We can now define the fuzzy coverage of a rule set, R, over a data set, D, as follows:
$$\mathrm{Fuzzy\ Coverage}(R, D) = \frac{\text{number of records with } MMD \ge minCov \text{ for some rule in } R}{\text{number of records in } D} = \frac{\left|\{d \in D : MMD_R(d) \ge minCov\}\right|}{n}$$
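The fuzzy metrics above can be sketched directly in Python. In the sketch below, an itemset is a set of (feature, fuzzy-set) pairs, records are dictionaries, and the membership-function table `mfs` is a hypothetical stand-in for the FMFs; min serves as the T-norm, as described above.

```python
def matching_degree(itemset, record, mfs):
    """MD of a fuzzy itemset to a record: min (T-norm) over the
    membership values of each (feature, fuzzy-set) pair in the itemset."""
    return min(mfs[(feat, fset)](record[feat]) for feat, fset in itemset)

def fuzzy_support(antecedent, consequent, data, mfs):
    """Fuzzy support: mean MD of X union Y over all records in D."""
    union = antecedent | consequent
    return sum(matching_degree(union, r, mfs) for r in data) / len(data)

def fuzzy_confidence(antecedent, consequent, data, mfs):
    """Fuzzy confidence: sum of MD(X union Y) divided by sum of MD(X)."""
    union = antecedent | consequent
    num = sum(matching_degree(union, r, mfs) for r in data)
    den = sum(matching_degree(antecedent, r, mfs) for r in data)
    return num / den if den > 0 else 0.0

def fuzzy_coverage(rules, data, mfs, min_cov=0.1):
    """Fuzzy coverage: fraction of records whose maximum matching
    degree (MMD) over any rule reaches the minCov threshold."""
    covered = 0
    for r in data:
        mmd = max(matching_degree(x | y, r, mfs) for x, y in rules)
        if mmd >= min_cov:
            covered += 1
    return covered / len(data)
```

The min T-norm could be swapped for any other T-norm (e.g., product) by changing a single line in `matching_degree`.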
When executing FuzRED, we set the minimum support threshold for the rules to 0.001, although a higher or lower support can be used. When selecting the number of antecedents that are available for a single rule, we often set the maximum number to three. Humans have difficulty quickly understanding rules with a large number of antecedents; therefore, for most applications, we discourage using a larger number.

3.4. Choosing a Subset of Rules

Although FARM typically generates thousands of rules, only a subset is relevant for interpretability. Users would rather see a much smaller set of rules summarizing what the network does. We therefore use the following greedy method for pruning the rule set to find a minimal set of rules. For each of the six methods for determining the most important features, we order the rules first by decreasing confidence, and then by decreasing support. We take the first rule from the ordered set, determine which examples from the test set it applies to, and remove those examples from the test set. We then take the next rule from the rule set and repeat the process, continuing until there are no more examples in the test set or no more rules.
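The pruning procedure can be sketched as follows; the rule and example representations, and the `applies` predicate, are hypothetical placeholders for the paper's actual data structures.

```python
def prune_rules(rules, test_set, applies):
    """Greedy pruning: sort rules by decreasing confidence, then decreasing
    support; repeatedly take the next rule, keep it if it applies to at
    least one remaining example, and remove the examples it covers."""
    ordered = sorted(rules, key=lambda r: (-r["confidence"], -r["support"]))
    remaining = list(test_set)
    kept = []
    for rule in ordered:
        if not remaining:  # stop when every example is covered
            break
        covered = [ex for ex in remaining if applies(rule, ex)]
        if covered:
            kept.append(rule)
            remaining = [ex for ex in remaining if ex not in covered]
    return kept
```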

4. Results

We applied the FuzRED method to three AI models, each trained for a different learning task with a different dataset. Two models were feed-forward deep neural networks performing binary classification tasks (spam email identification and phishing link detection, respectively). The third model was a DRL model trained to perform a benchmark robotics task. In the sections below, we present results for our FuzRED method, and results generated using the existing Anchors method.

4.1. Spam Email Identification

The SPAM Email Database [26], available from the University of California Irvine (UCI) Machine Learning Repository, was utilized for this study. This dataset contains spam emails identified as such by the recipients or the email postmaster. The non-spam emails consist of both work and personal emails identified as non-spam by the recipients. The dataset was submitted to the UCI Machine Learning Repository in July 1999. The dataset comprises 4601 total instances, 1813 of which are labeled as spam. There are 57 continuous data attributes, and the nominal class label is provided for each instance. Attributes include frequencies of select words and characters, as well as the statistics of uninterrupted sequences of capital letters in the emails. There are no missing attributes in this dataset. We do not have access to the text of the emails but only to frequencies of all the attributes in the emails. For more details on the attributes and their statistics, the reader is referred to [26].
To evaluate our explanation method, we trained a deep neural network model to make predictions on this dataset. The model consists of three fully connected hidden layers, each with 12 nodes using the rectified linear unit (ReLU) activation function, followed by a final fully connected layer with a single node utilizing the sigmoid activation function. We randomly selected 70% of the SPAM email dataset to serve as the training set. All input data were normalized using a StandardScaler object [27] before being fed into the model, and the resulting model achieved 89% accuracy on the test set.

4.1.1. FuzRED Results

We employed six methods for determining the most important features: mean magnitude, frequency, mean contribution, maximum contribution, maximum magnitude, and minimum contribution. These methods are described in Section 3.1.
SHAP can be used for both local and global explanations. Usually, for global explanations, the absolute Shapley values of all instances in the data are averaged. Passing a matrix of SHAP values to the bar plot function [28] creates a global feature importance plot, where the global importance of each feature is taken to be the mean absolute SHAP value for that feature over all the given samples. Figure 3 shows the plot created using the bar plot function with the top 20 features. Figure 4, Figure 5, Figure 6, Figure 7 and Figure 8 show the bar plots we created for the remaining five methods.
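Given the matrix of SHAP values described above (one row per sample, one column per feature), several of the feature-selection methods reduce to simple column-wise aggregations. The sketch below (numpy; function names are ours) shows a plausible reading of five of the method names, with mean magnitude matching the mean absolute SHAP value used by the bar plot; the frequency method, which depends on per-instance rankings, is omitted, and the exact definitions are in Section 3.1.

```python
import numpy as np

def feature_importance(shap_values, method="mean_magnitude"):
    """Aggregate a (n_samples, n_features) SHAP value matrix into one
    global importance score per feature."""
    shap_values = np.asarray(shap_values)
    if method == "mean_magnitude":      # mean |SHAP|, as in shap's bar plot
        return np.abs(shap_values).mean(axis=0)
    if method == "max_magnitude":       # largest |SHAP| seen for the feature
        return np.abs(shap_values).max(axis=0)
    if method == "mean_contribution":   # signed mean SHAP value
        return shap_values.mean(axis=0)
    if method == "max_contribution":    # most positive SHAP value
        return shap_values.max(axis=0)
    if method == "min_contribution":    # most negative SHAP value
        return shap_values.min(axis=0)
    raise ValueError(f"unknown method: {method}")

def top_k(scores, feature_names, k=20):
    """Return the k highest-scoring features, best first."""
    order = np.argsort(scores)[::-1][:k]
    return [feature_names[i] for i in order]
```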
All the features determined to be the most important by frequency and maximum magnitude methods are shared by at least one other method. Mean contribution and mean magnitude methods each have one feature that was not determined to be the most important by any other methods (word_freq_conference and word_freq_85); maximum contribution has three features in the top 20 not identified by any of the other five methods: word_freq_report, word_freq_money, and word_freq_parts. Minimum contribution has the largest number of features not identified as the most important by any other method: word_freq_telnet, word_freq_857, word_freq_table, word_freq_technology, char_freq, and capital_run_length_longest.
Figure 9 and Figure 10 show the FMFs for word_freq_george and word_freq_hp. Figure A1, Figure A2 and Figure A3 in Appendix A.1 show the FMFs for word_freq_meeting, char_freq_$, and char_freq_!. These are some of the features that were determined to be the most important by at least one of the methods and that are used in the rule explanations in the next section.
Word_freq_george describes the frequency of the word george (how many times it appeared) in an email. Figure 9 shows the FMFs for word_freq_george: low, medium, and high. In the spam dataset, the largest frequency of george is 33. FMF low has a value of 1 for word_freq_george values from 0 to 2.223. For values larger than 2.223 but smaller than 12.557, the value of FMF medium is 1.0. Values larger than or equal to 12.557 but smaller than 13.433 belong to two FMFs: medium and high. For example, at 13.0, the value of FMF medium is 0.474 and that of FMF high is 0.526. This means that, when word_freq_george is 13.0, it is both medium and high (to the degree defined by the value of the appropriate FMF). For values larger than 13.433, the value of FMF high is 1. FMFs are continuous; however, when the feature itself is an integer (a frequency is always an integer), the above description can be rewritten as follows. FMF low has a value of 1 for word_freq_george values from 0 to 2. For values between 3 and 12, the value of FMF medium is 1.0. The value 13 belongs to two FMFs: medium and high. Starting at 14, the value of FMF high is 1.
Word_freq_hp describes the frequency of the word hp in an email. Figure 10 shows the three FMFs for word_freq_hp. In the spam dataset, the largest frequency of hp is 21. This is also an integer variable, and its FMFs can be described as follows. FMF low has a value of 1 for the word_freq_hp value of 0. For values between 1 and 7, the value of FMF medium is 1.0. A value of 8 belongs to two FMFs: medium and high. Starting at 9, the value of FMF high is 1.
Figures in Appendix A.1 (A1–A3) show the three FMFs for word_freq_000, word_freq_meeting, and char_freq_$, respectively. Word_freq_000 describes the frequency of 000 in an email. Word_freq_meeting describes the frequency of the word meeting in an email. Char_freq_$ describes the frequency of the character $ in an email.
When running fuzzy association rule mining, we limited the number of antecedents to three, we set the minimum confidence (Equation (8)) of a rule to 0.6, and the minimum support (Equation (6)) to 0.001. Most of the rules found by FuzRED are easy to understand. As an example, below we show some of the rules extracted using the maximum magnitude choice of features.
  • IF (word_freq_george IS High) THEN No-Spam, sup = 0.0263, conf = 1
This rule says that if the frequency of the word george is high (see Figure 9), then the email is no-spam with a confidence value of 1 (the rule is always true). This rule has a support of 2.63%, which means that it explains 2.63% of the dataset. At first, this rule seems a little surprising; however, when we assessed the description of the dataset, we noticed that the no-spam emails were provided from the personal and business emails of the dataset creators, one of whom was George Forman. In view of this information, the rule makes perfect sense.
  • IF (word_freq_hp IS Medium) THEN No-Spam, sup = 0.0753, conf = 1
  • The above rule says that if the frequency of the word hp (Hewlett Packard) is medium (see Figure 10), then the email is no-spam. The support of the rule is 7.53% and the rule is always true. The creators of the dataset were all HP employees, and many of the no-spam emails provided were their business emails, so this rule also makes perfect sense. A similar rule is shown below:
  • IF (word_freq_hp IS High) THEN No-Spam, sup = 0.0145, conf = 1
  • Some no-spam rules are related to the frequency of the words meeting, 1999, or lab:
  • IF (word_freq_meeting IS Medium) THEN No-Spam, sup = 0.0217, conf = 1
  • IF (word_freq_meeting IS High) THEN No-Spam, sup = 0.0046, conf = 1
  • IF (word_freq_george IS Medium AND word_freq_1999 IS Medium) THEN No-Spam, sup = 0.0014, conf = 1
This dataset was donated on 6/30/1999; it is very possible that the no-spam emails were chosen mostly from the beginning of 1999.
The spam rules often use the character frequency of $ or !:
  • IF (char_freq_$ IS High) THEN Spam, sup = 0.0033, conf = 1
  • IF (char_freq_$ IS Medium AND char_freq_! IS Medium) THEN Spam, sup = 0.0028, conf = 1
Since spam emails are often about becoming rich (receiving $), and some use a lot of exclamation points, the above rules are sensible.
  • IF (capital_run_length_average IS Medium) THEN Spam, sup = 0.0026, conf = 1
  • The above rule says that if the average length of uninterrupted sequences of capital letters is Medium, then the email is always spam.
  • IF (char_freq_$ IS Medium AND capital_run_length_total IS High) THEN Spam, sup = 0.0020, conf = 1
  • The above rule says that if the frequency of $ is medium and the total number of capital letters in the email is high, then the email is always spam. Historically, spam emails use a lot of capital letters, so this rule is logical.
Additional FuzRED rules for spam email identification are shown in Appendix A.1.
Depending on which of the six methods is being used for determining the most important features and the minimum confidence of the rules (0.6 to 1), FuzRED extracts a different number of rules. The results are shown in Table 1. The table is arranged in descending order by total coverage (Equation (11)) and then in ascending order by the number of rules after pruning. The total coverage of this set of rules ranges from 0.9717 to only 0.1554.
When explaining the decisions of a neural network, one may wish to minimize the number of rules while still achieving the required level of dataset coverage. For example, if a coverage of at least 0.9 is desired, Table 1 shows that nine methods meet this criterion. Among them, the best option in terms of rule count is method 8 (maximum magnitude, minimum rule confidence of 0.6), which achieves a coverage of 0.9243 with just 96 rules after pruning. If a smaller ruleset is desired and a lower coverage threshold, such as 0.75, is acceptable, then method 11 (minimum contribution, minimum rule confidence of 0.6) provides a ruleset with just 67 rules. These 67 rules are listed in Appendix A.1, in Table A1. For cases where covering just over half of the dataset is sufficient, method 18 (minimum contribution, minimum rule confidence of 0.7) reduces the ruleset further to 62 rules. Thus, the optimal method depends on the specific balance between coverage requirements and the desire to minimize ruleset complexity for a given application.

4.1.2. Anchors Results

In order to evaluate and compare Anchors rules with those from FuzRED, we wrote a script to use the Anchors library to explain the exact same trained model and set of data. The script used anchor_tabular.AnchorTabularExplainer to generate the anchors and wrapped the model for compatibility with the Anchors framework. We wrote a fit_anchor function in order to compute the actual precision (Equation (1)) and coverage (Equation (2)) metrics for the anchor rules. This function parses the body of the rule and converts it to the equivalent logical conditions and then applies it to the full set of test data to find all instances to which the anchor applies. For this to work properly, we needed to round all of the feature data to two decimal places, as the values in the Anchors were rounded. Lastly, the function computes and returns the computed coverage and precision metrics for the anchors.
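The fit_anchor logic can be sketched as below; the rule-string format and function signature are our reconstruction for illustration, not the Anchors library API. Feature values are rounded to two decimal places before comparison, as described above.

```python
import numpy as np

def fit_anchor(rule, X, y, feature_names, predicted_class):
    """Parse an anchor rule such as 'char_freq_$ <= 0.05 AND word_freq_hp > 0.00',
    apply it to the (rounded) data, and return (coverage, precision)."""
    X = np.round(np.asarray(X, dtype=float), 2)  # match Anchors' rounded output
    mask = np.ones(len(X), dtype=bool)
    for cond in rule.split(" AND "):
        name, op, thr = cond.rsplit(" ", 2)  # feature, operator, threshold
        col = X[:, feature_names.index(name)]
        if op == "<=":
            mask &= col <= float(thr)
        elif op == ">":
            mask &= col > float(thr)
        else:
            raise ValueError(f"unsupported operator: {op}")
    coverage = mask.mean()                    # fraction of records matched
    matched = np.asarray(y)[mask]
    precision = (matched == predicted_class).mean() if mask.any() else 0.0
    return coverage, precision
```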
When Anchors was executed to obtain rules with a maximum of three antecedents, it produced 1520 rules. Some of these were duplicates; after we developed software to remove them, 648 distinct rules remained. Most rules have three antecedents, some have two, and a handful have just one. The rules have a precision ranging from 1 to 0.5312 and a coverage ranging from 0.5872 to 0.0007.
Below are some example rules. The rule with the highest coverage is the following:
  • IF (char_freq_$ <= 0.05 AND word_freq_free <= 0.00 AND word_freq_money <= 0.00) THEN No-Spam, coverage = 0.5872, precision = 0.8543
  • This rule says that if the frequency of $ is less than or equal to 0.05, and the frequencies of free and money are less than or equal to 0, then the email is No-Spam in 85.43% of the cases. Of course, a frequency cannot be negative and is an integer; therefore, being less than or equal to 0.05 is the same as being zero. The rule thus actually says that if the frequencies of $, free, and money are all zero, then the email is No-Spam in 85.43% of the cases. The 0.05 threshold in the rule is arbitrary. The rule has a large coverage (58.72%) and is true in 85.43% of cases.
  • IF (word_freq_hp > 0.00 AND char_freq_! <= 0.00 AND word_freq_remove <= 0.00) THEN No-Spam, coverage = 0.1672, precision = 1
  • This rule says that if the frequency of hp is greater than 0 and the frequencies of ! and remove are less than or equal to 0, then the email is always No-Spam. Of course, frequency cannot be a negative number, so the rule actually says that if the frequency of hp is greater than 0 and the frequencies of ! and remove are 0, then the email is always No-Spam. This rule covers 16.72% of the data.
The rules above are relatively easy to understand, as they use zero as the threshold. However, most rules use very different numbers as the threshold. These numbers might seem arbitrary to the person who wants to understand the decisions the network is making.
  • IF (word_freq_remove > 0.00 AND char_freq_$ > 0.05 AND capital_run_length_longest > 43.00) THEN Spam, coverage = 0.0671, precision = 1
  • IF (word_freq_remove > 0.00 AND char_freq_$ > 0.05 AND capital_run_length_average > 3.71) THEN Spam, coverage = 0.0579, precision = 1
  • IF (word_freq_3d > 0.00 AND capital_run_length_average > 3.71 AND word_freq_you > 2.63) THEN Spam, coverage = 0.0007, precision = 1
All the features in the three rules above are integer-valued; therefore, the thresholds used (0.05, 3.71, 2.63) appear arbitrary and confusing.
Additional Anchors rules for the Spam email identification are shown in Appendix A.1.

4.2. Phishing Link Detection

This study leveraged the phishing link detection dataset curated by Hannousse and Yahiouche [29]. This dataset has 11,430 samples, each containing 87 features extracted from each URL. Fifty-six features were extracted by analyzing the text of the URL (URL-based), 24 features were extracted by analyzing the content of the website hosted at the URL (content-based), and seven features were extracted by gathering external information about the URL (external-based). Examples of URL-based features include the presence of common terms in the URL (such as ‘www’ or ‘.com’) or the ratio of digits to other characters in the URL. Content-based feature examples include the presence of invisible iframe objects on the site or an empty HTML <title> tag. External-based feature examples include whether or not the URL is from a WHOIS-registered domain, or the number of visitors to the site. In total, this dataset contains 87 features that are a combination of integer, floating point, and binary values. The authors in [29] provide a full description of the features alongside details regarding dataset collection methods.
The dataset was first split into 7658 training samples, while 3772 samples were reserved for testing. A deep feed-forward neural network was constructed with three hidden layers. The model was trained for 200 epochs with a batch size of 256. All input data were normalized using a StandardScaler object [27] before being fed into the model, and the resulting model achieved 85% accuracy on test data.

4.2.1. FuzRED Results

We employed six methods for determining the most important features: frequency, mean contribution, mean magnitude, maximum contribution, maximum magnitude, and minimum contribution. Figure 11, Figure 12, Figure 13, Figure 14, Figure 15 and Figure 16 show the 20 most important features for mean magnitude, frequency, mean contribution, maximum contribution, maximum magnitude, and minimum contribution, respectively. All the features determined as being the most important by frequency and maximum magnitude methods are shared by at least one other method. Maximum contribution has one feature that was not determined as the most important by any other method (iframe); mean magnitude has two such features (nb_slash, ratio_extMedia), and mean contribution has three features not determined as being the most important by any other method (dns_record, ratio_intHyperlinks, web_traffic). Minimum contribution has the largest number of features (11) not identified as being the most important by any other method: length_words_raw, nb_percent, avg_word_host, longest_word_host, length_url, whois_registered_domain, abnormal_subdomain, domain_in_brand, nb_dslash, onmouseover, and popup_window.
For each feature, we used Gaussian mixture models to construct two membership functions (low and high). Figure 17 and Figure 18 show the membership functions for the features length_words_raw and ratio_intMedia. The FMFs for the features nb_dots and nb_www are shown in Appendix A.2. These features are used in the example rules below. The feature descriptions in this section are taken from [30].
Length_words_raw describes the number of words in a URL. In the phishing dataset, the largest length_words_raw is 106 (Figure 17 shows the FMFs). FMF low has a value of 1 for length_words_raw values from 1 to 28.240; at 32.000, FMF low has a value of 0.299 and FMF high has a value of 0.701. This means that when length_words_raw is 32.000, it is both low and high (to the degree defined by the value of the appropriate FMF). Starting at 35.023, the value of FMF high is 1.
Legitimate websites mostly use media (images, audio, and video) stored in the same domain. Phishing websites use more external media, usually stored in the target website domain, to save storage space. Ratio_intMedia estimates the ratio of internal media file links and uses it for distinguishing legitimate from phishing websites. In the phishing dataset, the largest ratio_intMedia is 100 (Figure 18 shows the FMFs). FMF low has a value of 1 for ratio_intMedia values from 0 to 38.804; at 49.155, FMF low has a value of 0.501 and FMF high has a value of 0.499. This means that when ratio_intMedia is 49.155, it is both low and high (to the degree defined by the value of the appropriate FMF). Starting at 59.516, the value of FMF high is 1.
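One construction consistent with the complementary membership values above (e.g., 0.299/0.701 and 0.501/0.499, which sum to 1) is to take each Gaussian mixture component's posterior responsibility as its membership function. The sketch below illustrates this; the mixture parameters are illustrative, not those fitted to the phishing data.

```python
import math

def gaussian_pdf(x, mean, std):
    """Probability density of a univariate Gaussian."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def make_fmfs(components):
    """Build one membership function per mixture component, defined as that
    component's posterior responsibility; memberships therefore sum to 1."""
    def membership(i):
        def f(x):
            num = components[i][0] * gaussian_pdf(x, *components[i][1:])
            den = sum(w * gaussian_pdf(x, m, s) for w, m, s in components)
            return num / den
        return f
    return [membership(i) for i in range(len(components))]

# Illustrative two-component mixture: (weight, mean, std) per component.
low, high = make_fmfs([(0.6, 10.0, 8.0), (0.4, 60.0, 15.0)])
```

Away from the crossover region each membership saturates toward 1, which matches the plateau behavior of the FMFs shown in the figures.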
When running fuzzy association rule mining, we limited the number of antecedents to three, we set the minimum confidence (Equation (8)) to 0.6, and we set the minimum support (Equation (6)) to 0.001. Most of the rules found by FuzRED are easy to understand. As an example, below we show some of the rules extracted using the maximum magnitude choice of top features.
  • IF (nb_dots IS High) THEN Phishing, sup = 0.0039, conf = 1
This rule says that if nb_dots (the number of dots in the URL) is high, then the URL is always predicted to be a phishing domain by the model. An example of such a URL is below (13 dots in the domain name):
  • http://https.email.office.nhc8fso9liwz2e1rk12vhqaxgeq4g.hgbtmd8eshlu1rkesgz11.tk0mqbhgkkuvsf3u821.1sytn1idkv8s2qm4ehh7jja7d.mne8jnxrh8klahtqu.0fll4ryeb76852jeplwk9ckd6zqof.2wvxm5n6uamkxq7wxhpbxaq1a4.cxdtens.duckdns.org/365NewOfficeG15/jsmith@imaphost.com/paul/
  • IF (nb_www IS High AND length_words_raw IS High) THEN Phishing, sup = 0.0023522, conf = 1.0
  • Length_words_raw is an integer that refers to how many words there are in a URL. The rule says that if the nb_www and length_words_raw are high, then the URL is a phishing one. In the phishing URL below, nb_www is 2, and length_words_raw is 16:
  • http://timetravel.mementoweb.org/reconstruct/20141123205914mp_/https:/www.paypal.com/signin?returnuri=https:/www.paypal.com/cgi-bin/webscr?cmd=_account
  • IF (page_rank IS High AND nb_hyperlinks IS High) THEN No-Phishing, sup = 0.0141772, conf = 0.9991
Page_rank is an algorithm used by Google Search to rank web pages in its search engine results. Phishing web pages are not very popular; hence, they are expected to have low page ranks compared to legitimate web pages. Legitimate (No-Phishing) websites also tend to consist of a larger number of pages than phishing ones; therefore, the number of hyperlinks in the page content (nb_hyperlinks) is used for distinguishing phishing websites. The rule above says that if both page_rank and nb_hyperlinks are high, then the website is almost always (conf = 0.9991) No-Phishing.
Depending on which of the six methods we are using for determining the most important features and the minimum confidence of the rules (0.6 to 1), FuzRED extracts a different number of rules. The results are shown in Table 2. The table is arranged in descending order by total coverage (Equation (11)) and then in ascending order by the number of rules after pruning. The total coverage of this set of rules ranges from 1.0 to only 0.0424.
When explaining the decisions of a neural network, one may wish to minimize the number of rules while still achieving a required level of dataset coverage. For example, if a coverage of at least 0.9 is desired, Table 2 shows that 23 methods meet this criterion. Among them, the best option in terms of rule count is method 16 (maximum contribution, minimum rule confidence of 0.8), which achieves a coverage of 0.9722 with just 52 rules after pruning. If a smaller ruleset is desired and a lower coverage threshold, such as 0.75, is acceptable, then method 27 (maximum magnitude, minimum rule confidence of 0.8) provides a ruleset with just 48 rules. These 48 rules are listed in Appendix A.2, Table A2. For cases where covering just over half of the dataset is sufficient, method 32 (maximum contribution, minimum rule confidence of 0.95) reduces the ruleset further to 34 rules.

4.2.2. Anchors Results

When running Anchors, we requested rules with up to three antecedents. Anchors produced 3808 rules; once the repeating ones had been removed, we obtained 548 rules. Most of the rules have three antecedents, some have two, and a few have one. Some rules are easy to understand:
  • IF (nb_star > 0.00) THEN Phishing, coverage = 0.0011, precision = 1
  • Nb_star refers to the number of stars (*) in a URL. The above rule is always true and describes 0.11% of the dataset.
Most rules are much more difficult to understand.
  • IF (page_rank <= 1.00 AND google_index > 0.00 AND longest_words_raw > 16.00) THEN Phishing, coverage = 0.1021, precision = 1
  • Google_index is a binary variable describing whether a given domain is indexed by Google (1 means indexed, and 0 means not indexed); web pages not indexed by Google have a higher probability of being phishing. The above rule says that if page_rank <= 1.00, google_index > 0.00 (in reality, google_index = 1, meaning indexed by Google), and longest_words_raw > 16.00, the web page is always phishing. It is unclear why URLs indexed by Google would be related to phishing; as such, this rule appears contradictory.
  • IF (google_index <= 0.00 AND ratio_extErrors > 0.00 AND domain_registration_length > 446.75) THEN No-Phishing, coverage = 0.0684, precision = 0.9961
  • The above rule is very difficult to comprehend: when google_index is 0, the page is not indexed by Google, so why would this be related to No-Phishing? In the dataset, ratio_extErrors is always > 0; therefore, this part of the rule does nothing. Finally, the domain_registration_length has a difficult-to-understand threshold value. The rule is true in 99.61% of cases.
  • IF (google_index > 0.00 AND nb_hyperlinks <= 9.00 AND avg_words_raw > 8.00) THEN Phishing, coverage = 0.0551, precision = 0.9952
The rule above has difficult-to-understand thresholds.
Additional Anchors rules are shown in Appendix A.2.

4.3. Robotic Navigation

In this section, we demonstrate the application of the FuzRED explainability method to deep reinforcement learning (DRL) agents, expanding the scope of our explainability framework beyond traditional deep learning models. We created a toy model using the Miniworld package [30] and tasked an agent with vision-based navigation to achieve a sequence of objectives. We utilized a custom simulated environment in which an agent and three colored boxes—red, blue, and yellow—were placed in random positions. The agent’s goal was to reach each of the three boxes in a specified sequence: first the red box, then the blue box, and finally the yellow box. We used the Stable-Baselines3 package ver 2.3.2 [31] to train a proximal policy optimization (PPO) model to accomplish these objectives. To address the multi-goal navigation problem, we employed hierarchical reinforcement learning (HRL), which comprises a high-level policy for generating sub-goals and a low-level policy for executing actions. The high-level controller uses the current environmental observations to provide sub-goals to the low-level planner.
The agent processed the visual information it captured from the environment (shown in Figure 19) using an object detection model. Specifically, a pre-trained You Only Look Once (YOLO) v8 [32] model was fine-tuned on screenshots from the Miniworld environment, which were manually labeled to detect the three colored boxes. YOLO provided reliable information about the position of each box by generating bounding box coordinates, which were then used as part of the state representation for the PPO agent. These coordinates included the x and y positions of each box’s center, as well as the height and width of each bounding box. The coordinates were normalized between 0 and 1 before being fed into the low-level planning agent. The final state space presented to the PPO model was a concatenated vector consisting of bounding box coordinates for the red, blue, and yellow boxes, combined with one-hot encoded vectors indicating the detection status of each box and which box was the current target. The agent had an action space consisting of turning right, turning left, moving forward, and moving backward.
The training of the model involved using both step-based rewards and sub-goal completion rewards. The agent received a positive reward of 0.03 for each step it took that brought it closer to the target box, while moving away from the target resulted in a penalty of −0.03. When the agent successfully reached a sub-goal, it was rewarded with a value that increased based on how many sub-goals had already been completed and how efficiently the current sub-goal was achieved:
$$R_{\mathrm{subgoal}} = 2 \times \mathit{num\_boxes\_found} \times \left(1 - 0.2 \times \frac{\mathit{current\_step}}{\mathit{max\_steps}}\right)$$
This reward function incentivized the agent to find all boxes and do so as efficiently as possible, with a maximum of 1000 steps allowed per episode. The agent was trained using a vectorized environment, which allowed multiple versions of the environment to run simultaneously, thus speeding up the training process and providing diverse experiences for learning. During the training of the PPO model, we performed several rounds of hyperparameter tuning. The final hyperparameters selected are given in Table 3. The training process continued until the model performance converged, achieving all three sub-goals on average within 250 steps after approximately 700,000 training steps.
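For concreteness, the reward scheme can be written as below; the bookkeeping names (num_boxes_found, current_step) are ours, assumed to be tracked by the environment wrapper.

```python
def subgoal_reward(num_boxes_found, current_step, max_steps=1000):
    """Sub-goal completion reward: grows with the number of boxes already
    found and shrinks (by up to 20%) the more steps have been used."""
    return 2.0 * num_boxes_found * (1.0 - 0.2 * current_step / max_steps)

def step_reward(moved_closer):
    """Per-step shaping reward: +0.03 toward the target box, -0.03 away."""
    return 0.03 if moved_closer else -0.03
```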
To generate rules explaining the model, we extracted features from the state space used to train the model, including bounding box coordinates, one-hot encoded detection statuses, and target indicators. These features are summarized in Table 4.

4.3.1. FuzRED Results

For each feature, we created FMFs by fitting Gaussian mixture models to the feature values observed during training. Because there were only 18 features, we did not perform a down selection to the most important features for this case. For continuous-valued features, we selected three components for the mixture model, corresponding to three fuzzy classes (e.g., left, middle, right for position coordinates). For one-hot encoded features, we created two classes (e.g., true or false). The resulting FMFs were then applied to data from 10 episodes where the trained agent ran in deterministic mode and successfully completed the objective.
Figure 20 and Figure 21 show example FMFs. Figure 20 shows the two FMFs for red_box_detected; the FMFs for the other binary variables (blue_box_detected, yellow_box_detected, target_red_box, target_blue_box, and target_yellow_box) look the same. Figure 21 shows the three FMFs for the feature red_box_position: left, middle, and right, describing whether the red box appears in the left, middle, or right of the image the agent sees. In Appendix A.3, Figure A6 shows the three FMFs for the feature red_box_height: short, medium, and tall. The actual height of the box does not change; however, when the agent is far from the box, the box appears short, and when the agent is close to it, it appears tall.
Using the FMFs, we generated fuzzy association rules describing the agent’s behavior in the environment. Rule extraction and pruning were performed using the FARM method described in Section 3.3 and Section 3.4. After pruning, a total of 43 rules remained across the three agent action classes that were actually used. As the agent never moved backward, no rules were generated for that action class.
The three rules below show the behavior of the agent when a given box is the current target, but that box is not detected, meaning that it is not in the field of view. In all cases when this occurs, the agent will turn to try to get the target box into its field of view. This is demonstrated by the fact that each of these rules has 100% confidence.
  • IF target_red_box IS true AND red_box_detected IS false THEN turn_left, sup = 0.104, conf = 1
  • IF target_blue_box IS true AND blue_box_detected IS false THEN turn_left, sup = 0.115, conf = 1
  • IF target_yellow_box IS true AND yellow_box_detected IS false THEN turn_right, sup = 0.116, conf = 1
Notably, the agent learned to always turn left when looking for the red or blue box, but to always turn right when looking for the yellow box.
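The support and confidence values attached to rules like the ones above can be scored as in the sketch below. The product t-norm and the normalizations shown are common choices in fuzzy association rule mining and may differ in detail from the FARM definitions used in the paper.

```python
# Illustrative scoring of one fuzzy rule "IF A1 AND A2 THEN action".
import numpy as np

def rule_support_confidence(antecedent_memberships, action_indicator):
    """antecedent_memberships: (n_samples, n_antecedents) membership degrees.
    action_indicator: (n_samples,) 1.0 where the agent took the rule's action.
    Returns (support, confidence) of the rule over the dataset."""
    # Degree to which each sample matches the IF-part (product t-norm).
    firing = np.prod(antecedent_memberships, axis=1)
    # Support: fraction of samples matching the IF-part AND taking the action.
    support = np.mean(firing * action_indicator)
    # Confidence: of the matched mass, how much coincides with the action.
    confidence = (firing * action_indicator).sum() / firing.sum()
    return support, confidence
```

A confidence of 1 then means the agent took the rule's action every time the antecedents held, exactly as observed for the three turning rules above.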
The rule below shows an interesting behavior of the agent.
  • IF target_blue_box IS true AND red_box_detected IS true THEN turn_left, sup = 0.041, conf = 1
It shows that whenever the target is blue but the red box is anywhere within the field of view, then the agent turns left. This is interesting because it shows learned behavior about the sequence of the sub-goals. For this task the agent was trained to always go from red to blue. As a result, every time the target switched to blue, the agent had just reached the red box and was right up against it and unable to move through it; therefore, it learned to turn until the red box was out of its field of view before proceeding.
The rules below demonstrate the move_forward action of the agent.
  • IF target_blue_box IS true AND blue_box_detected IS true THEN move_forward, sup = 0.240, conf = 1
  • The above rule shows that the agent always moves forward when the blue box is the target and it is in the field of view.
  • IF target_yellow_box IS true AND yellow_box_detected IS true THEN move_forward, sup = 0.223, conf = 0.996
  • The rule above shows that the agent almost always moves forward when the yellow box is the target and it is in the field of view.
However, the same is not true of the red box:
  • IF target_red_box IS true AND red_box_position_x IS middle AND yellow_box_detected IS false THEN move_forward, sup = 0.031, conf = 1
  • IF target_red_box IS true AND red_box_width IS wide AND blue_box_detected IS false THEN move_forward, sup = 0.015, conf = 0.999
  • IF target_red_box IS true AND red_box_height IS tall AND yellow_box_detected IS false THEN move_forward, sup = 0.071, conf = 0.987
  • The above rules show examples of the other conditions that need to hold for the agent to move forward toward the red box: the box must be in the middle of the field of view or take up a large portion of it in height or width, and the other boxes must not be in the way.
Depending on the minimum rule confidence (Equation (8)) used during extraction (0.6 to 1), FuzRED extracts a different number of rules. The results are shown in Table 5, arranged in descending order by total coverage and then in ascending order by the number of rules after pruning. The total coverage (Equation (11)) of these rule sets ranges from 0.9984 down to 0.8606.
When explaining the decisions of a neural network, one may want rules that explain a larger or smaller fraction of the dataset, depending on the application or type of user. For example, a user who requires the rules to describe at least 90% of the test data (i.e., a coverage of 0.9 or above) can choose among the top nine methods in Table 5; of these, method 9 requires the smallest number of rules. The 28 rules for method 9 are shown in Table A3 in Appendix A.3. Most of them are the same rules described earlier in this section.
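The selection logic just described can be sketched as follows; the candidate tuples stand in for the rows of Table 5, and the numbers in the usage example are hypothetical.

```python
# Illustrative sketch: among extraction settings, keep those whose total
# coverage meets a user-chosen threshold, then prefer the fewest rules.
def pick_rule_set(candidates, min_coverage=0.9):
    """candidates: list of (name, total_coverage, n_rules) tuples.
    Returns the eligible candidate with the fewest rules, or None."""
    eligible = [c for c in candidates if c[1] >= min_coverage]
    if not eligible:
        return None
    return min(eligible, key=lambda c: c[2])  # fewest rules wins
```

For example, `pick_rule_set([("m1", 0.9984, 43), ("m9", 0.95, 28), ("m20", 0.8606, 10)])` selects the 28-rule set.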

4.3.2. Anchors Results

Anchors extracted 1270 rules with up to four antecedents. After we removed the repeating rules, 155 rules were left. Most of the rules had three or four antecedents, some had two, and a few had one. Let us start with a few rules with two antecedents, as these are easier to understand.
  • IF (target_red_box <= 0.00 AND blue_box_height > 0.16) THEN move_forward, coverage = 0.1898, precision = 1
  • Target_red_box is a binary variable, meaning that this rule really says that if the target_red_box = 0 (this means the red box is not the target) and blue_box_height > 0.16, move_forward. The meaning of blue_box_height > 0.16 is unknown.
  • IF (0.00 < blue_box_position_x <= 0.06) THEN move_forward, coverage = 0.0819, precision = 0.9712
  • It is difficult to understand what the threshold values of blue_box_position_x mean in this rule.
  • IF (blue_box_detected <= 0.00 AND target_blue_box > 0.00) THEN turn_left, coverage = 0.115, precision = 1
  • Blue_box_detected and target_blue_box are binary features. As such, the rule really says that if blue_box_detected = 0 and target_blue_box = 1, then always turn_left. In other words, if the blue box is not detected and the target is the blue box, then turn_left. This rule is the same as one of the FuzRED rules described in Section 4.3.1; the only difference is that the FuzRED rule was easy to understand:
  • IF target_blue_box IS true AND blue_box_detected IS false THEN turn_left
  • Here are some examples of rules with three antecedents (note that most of the Anchors rules have three antecedents):
  • IF (blue_box_position_x <= 0.06 AND red_box_height > 0.14 AND target_red_box > 0.00) THEN move_forward, coverage = 0.1228, precision = 0.9744
  • Taking into account that some of the features are binary, this rule says that if target_red_box = 1 (so the target is the red_box), blue_box_position_x <= 0.06, and red_box_height > 0.14, then most of the time move_forward. The thresholds 0.06 and 0.14 are difficult to understand.
  • IF (target_red_box > 0.00 AND blue_box_height > 0.16 AND red_box_detected <= 0.00) THEN turn_left, coverage = 0.0244, precision = 1
  • Taking into account that some of the features are binary, this rule says that if target_red_box = 1 (meaning that the red_box is the target), red_box_detected = 0 (red_box not detected), and blue_box_height > 0.16, then always turn_left. The threshold 0.16 is difficult to understand.
Additional Anchors rules with four antecedents are shown in Appendix A.3.
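The de-duplication step mentioned at the start of this subsection (1270 extracted rules reduced to 155 unique ones) can be sketched as follows; this is an illustrative implementation, not the authors' code. Sorting the antecedents also collapses duplicates that differ only in condition order.

```python
# Illustrative sketch: Anchors returns one rule per explained instance, so
# identical rules recur many times; keep each unique rule once.
def dedup_rules(rules):
    """rules: iterable of (antecedents, prediction) pairs, where antecedents
    is a list of condition strings. Returns unique rules in first-seen order."""
    seen, unique = set(), []
    for antecedents, prediction in rules:
        key = (tuple(sorted(antecedents)), prediction)  # order-insensitive key
        if key not in seen:
            seen.add(key)
            unique.append((antecedents, prediction))
    return unique
```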

5. Discussion

Across all three problems, FuzRED significantly outperforms Anchors in terms of computation speed. All experiments were conducted on a server equipped with two Intel Xeon E5-2650 CPUs (24 cores each) and 8 Nvidia GTX 1080ti GPUs. Figure 22 shows the comparison of computation times between FuzRED and Anchors.
For the spam dataset, FuzRED completes the entire process (computing SHAP values, selecting the top 20 SHAP features, generating membership functions, and extracting rules) in 53.7 s, while Anchors requires 16.1 h to complete the same task. Notably, Anchors appeared to utilize only a single CPU core, which may have contributed to its slower performance. FuzRED's computation time is shorter than that of Anchors by roughly three orders of magnitude (about 1000×).
For the phishing dataset, FuzRED performs all the computations in 105.1 s, while Anchors runs for 71.6 h. Again, Anchors appeared able to fully utilize only one CPU core. In addition, generating the Anchors rules for the phishing dataset required more memory than the 256 GB of RAM available on the server: the rule generation process ran out of memory a little over halfway through and had to be restarted on the remaining data. The time spent shutting down and restarting the process was not counted in the runtime reported above. FuzRED's computation time is shorter than that of Anchors by more than three orders of magnitude.

For the robotic navigation task, FuzRED performs all the computations in just seven minutes on the same server used for the spam and phishing experiments, while Anchors takes over 129 min. Here, FuzRED is faster by one order of magnitude. The speed-up for this task is smaller than for the other two because the total number of features is only 18 and we did not perform feature selection, so both FuzRED and Anchors generated rules over the same number of features.
The rules generated by FuzRED are highly interpretable, using familiar linguistic terms such as low, medium, high, left, middle, and right, which align with natural human reasoning. For SMEs, the FMF plots visually indicate the value ranges corresponding to each membership category. Additionally, shorter rules enhance interpretability, and FuzRED predominantly generates rules with only one or two antecedents, with a small subset containing three. By selecting a concise yet representative subset of rules that explains a large portion of the dataset, FuzRED enables users to focus on key patterns, making it easier to assess and validate the model’s behavior efficiently.
Anchors generates a large set of rules, most with three antecedents, some with two, and a few with one. Many of these rules include antecedents that are always true, which could be omitted to create shorter, more interpretable rules. Additionally, Anchors often applies arbitrary numerical thresholds (e.g., 0.76, 0.16), even for integer features, making the rules harder to understand. For binary features, instead of straightforward conditions like “feature = 0” or “feature = 1”, Anchors consistently uses inequality conditions such as “feature <= 0.00” or “feature > 0.00”, further complicating interpretability. Moreover, the rule sets produced by Anchors are significantly larger than those generated by FuzRED, making it more challenging to evaluate whether the overall model behavior aligns with expectations.
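As an illustration, such rules could be post-processed into a more readable form by rewriting threshold conditions on known-binary features as equalities and dropping antecedents that are always true. The sketch below (with names of our choosing; it is not part of FuzRED or Anchors) shows one way to do this.

```python
# Illustrative clean-up of one threshold condition from a rule antecedent.
def simplify_condition(feature, op, threshold, binary_features):
    """Return a readable condition string, or None if the condition is
    vacuously true for a 0/1 feature and can be dropped."""
    if feature in binary_features:
        if op == "<=" and threshold >= 1.0:
            return None                       # always true for a 0/1 feature
        if op == "<=" and threshold < 1.0:
            return f"{feature} = 0"
        if op == ">" and threshold < 1.0:
            return f"{feature} = 1"
    return f"{feature} {op} {threshold}"      # leave non-binary features as-is
```

For example, `yellow_box_detected <= 1.00` would be dropped, while `target_red_box > 0.00` becomes the plainer `target_red_box = 1`.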
While these results have demonstrated many of the advantages of FuzRED, it is important to also note some of its current limitations. First, although our experiments demonstrate that FuzRED can handle moderately complex models efficiently, scaling the approach to extremely large neural networks or high-dimensional datasets could pose challenges. In these scenarios, the number of important features could be in the hundreds or thousands, which would substantially increase the runtime and could become intractable due to the combinatorial explosion that occurs in fuzzy association rule mining as either the number of items or the length of rules increases. In addition, this could create an excessively large collection of rules that would be difficult to manage and interpret. Similarly, while the DRL agent example demonstrated the applicability of FuzRED to multi-class classification problems, the method currently generates separate rulesets for each class, and analyzing all the rulesets could become challenging for many classes. Finally, to date, we have only focused on structured and numeric data. Therefore, extending FuzRED to unstructured modalities such as images or unstructured text would likely require an appropriate feature extraction or encoding step as a precursor to fuzzy rule generation. Each of these issues is an important target for future work.
Table 6 shows a brief comparison of FuzRED, SHAP, LIME, and Anchors methods in terms of interpretability, computation time, explanation scope, scalability, and ease-of-use for non-experts.

6. Conclusions and Future Work

We introduced FuzRED, a novel method for explaining AI model decisions using fuzzy IF-THEN rules. This method translates complex model behaviors into a set of human-understandable rules described with familiar linguistic terms (e.g., “low”, “medium”, “high”). The approach helps to bridge the gap between black-box neural network decision-making and human reasoning. By combining feature-importance analysis with fuzzy logic, FuzRED ensures that the extracted rules remain faithful to the model’s internal patterns while also being intuitively interpretable. FuzRED is fully automated—the user only needs to provide a trained model, along with training and test datasets, and can optionally set a couple of parameters to guide rule generation. We demonstrated its applicability to both standard classification tasks and a DRL scenario, highlighting its flexibility across different types of AI problems. We also showed that it is significantly faster than the Anchors method, and we think additional optimizations could improve the efficiency further.
FuzRED produces a concise set of representative rules that provide a high-level summary of the model’s behavior, allowing users to focus on key decision patterns without being overwhelmed by details. This capability can help build trust in AI systems among subject-matter experts and even non-technical stakeholders by enabling them to see in plain and intuitive language why a model made a certain prediction or recommendation. Likewise, model developers can use FuzRED as a validation and debugging tool to ensure the model’s reasoning aligns with their domain knowledge and the system’s requirements. If an extracted rule reveals that an unexpected or illogical condition drove a prediction, it will alert developers to potential issues in the model or dataset. This, in turn, will prompt further investigation or retraining. In addition, FuzRED’s computational efficiency makes it feasible to integrate into real-world workflows, even for time-sensitive applications or iterative model development.
This capability has important implications for deploying AI in a variety of real-world applications. In healthcare, for example, an interpretable set of rules derived from a diagnostic model can help clinicians and patients trust the model’s recommendations and understand the rationale behind critical decisions. In defense and security, rule-based explanations enable decision-makers to verify that an AI system’s reasoning aligns with established rules of engagement and ethical constraints before they act on its output. Similarly, in autonomous systems such as self-driving vehicles or robotic assistants, understanding the conditions that trigger certain actions improves transparency and safety, fostering greater confidence in the system’s behavior and facilitating compliance with regulatory requirements.
By providing transparent, rule-based insights into a neural network’s workings, FuzRED contributes to the broader goal of making AI systems more interpretable and accountable. Notably, the fuzzy rule-based approach provides a method that yields global explanations (capturing broad patterns across many instances) rather than only case-by-case insights. In other words, FuzRED complements existing local explanation techniques by offering a higher-level view of a model’s decision logic that can be examined holistically.
The integration of fuzzy logic with deep-learning explanations in FuzRED also opens avenues for new research. For instance, future work could leverage FuzRED’s extracted rules to identify higher-level concepts that a neural network has learned, or to distill a complex model into a simpler, more transparent form without significant loss of accuracy. Future concept-based explainability methods could also leverage the output of FuzRED to aid in concept identification by analyzing the interactions between different feature or rule combinations within the larger ruleset. Moving forward, we hope to extend FuzRED to more complex scenarios—for example, scaling it to accommodate much larger networks and higher-dimensional datasets and adapting the approach to unstructured data like image and text inputs. These efforts will further broaden FuzRED’s applicability and amplify its impact on the field of explainable AI.

Author Contributions

Conceptualization, A.L.B.; methodology, A.L.B. and B.D.B.; software, K.Z. and B.D.B.; formal analysis, A.L.B., B.D.B. and K.Z.; writing—original draft preparation, A.L.B. and B.D.B.; writing—review and editing, A.L.B., B.D.B. and K.Z.; visualization, K.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the JHU Institute of Assured Autonomy.

Data Availability Statement

All the data used in this paper will be made available upon publication of the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AI - Artificial intelligence
ARM - Association rule mining
DRL - Deep reinforcement learning
FARM - Fuzzy association rule mining
FMF - Fuzzy membership function
FuzRED - Fuzzy rules for explaining deep neural network decisions
GMM - Gaussian mixture model
HRL - Hierarchical reinforcement learning
LIME - Local interpretable model-agnostic explanations
MD - Matching degree
MMD - Maximum matching degree
ReLU - Rectified linear unit
SHAP - Shapley additive explanations
SME - Subject matter expert
UCI - University of California Irvine
XAI - Explainable artificial intelligence
YOLO - You only look once

Appendix A

Appendix A.1. Additional Results for the Spam Email Identification Dataset

Some FMFs are shown in Section 4.1.1 (FMFs); additional FMFs are shown below. The low, medium, and high FMFs follow a pattern similar to those for word_freq_george and word_freq_hp described in Section 4.1.1.
Figure A1. Spam dataset: FMFs for word_freq_000. Low, medium, and high are shown in red, green, and blue, respectively.
Figure A2. Spam dataset: FMFs for word_freq_meeting. Low, medium, and high are shown in red, green, and blue, respectively.
Figure A3. Spam dataset: FMFs for char_freq_$. Low, medium, and high are shown in red, green, and blue, respectively.
In Section 4.1.1 (FuzRED Rules), we present a small set of rules. Additional rules with their explanations are shown below.
There are also some rules with the word frequency of remove.
  • IF (word_freq_remove IS High) THEN Spam, sup = 0.0119, conf = 1
  • Many spam emails play on people’s fear, claiming their account will be deleted unless they take immediate action, in emails like: “Urgent: Verify your account or we will need to remove it from our system. Click here to verify.”
Below we show some of the rules extracted using the frequency choice of the top features.
  • IF (word_freq_000 IS High AND word_freq_1999 IS Low) THEN Spam, sup = 0.0085, conf = 1
  • The above rule says that if the frequency of 000 is high (spam emails often mention $ and a lot of zeros) and the frequency of 1999 (many no-spam emails were from 1999) is low, then the email is always spam.
  • IF (word_freq_re IS High) THEN No-Spam, sup = 0.0020, conf = 1
  • If someone is replying to someone else’s email (or chain of email replies), this is not spam.
  • IF (word_freq_hpl IS High) THEN No-Spam, sup = 0.0198, conf = 1
  • Hpl is the acronym for Hewlett Packard Labs, and the rule states that if its frequency is high, the email is always no-spam.
Most Anchors rules are presented in Section 4.1.2. Here, we show a few additional Anchors rules.
  • IF (word_freq_remove > 0.00 AND word_freq_hp <= 0.00 AND word_freq_will > 0.00) THEN Spam, coverage = 0.1192, precision = 1
Since frequencies can never be negative, this rule actually states that if the frequency of remove is greater than zero, the frequency of hp is zero, and the frequency of will is greater than zero, then the email is spam.
  • IF (word_freq_george > 0.00 AND word_freq_labs > 0.00) THEN No-Spam, coverage = 0.0533, precision = 1
Table A1. Pruned set of rules for spam: method 11 from Table 1, minimum contribution method, minimum rule confidence of 0.6.
Antecedent 1 | Antecedent 2 | Antecedent 3 | Consequent | Support | Confidence
word_freq_hp; Medium | | | No-Spam | 0.0753 | 1
word_freq_hpl; Medium | | | No-Spam | 0.0598 | 1
word_freq_george; High | | | No-Spam | 0.0263 | 1
capital_run_length_longest; Medium | | | Spam | 0.0237 | 1
word_freq_meeting; Medium | | | No-Spam | 0.0217 | 1
word_freq_hpl; High | | | No-Spam | 0.0198 | 1
word_freq_hp; High | | | No-Spam | 0.0145 | 1
word_freq_1999; Low | word_freq_telnet; Medium | word_freq_technology; Medium | No-Spam | 0.0142 | 1
word_freq_cs; Medium | | | No-Spam | 0.0126 | 1
word_freq_direct; High | | | No-Spam | 0.0112 | 1
word_freq_lab; Low | word_freq_telnet; Medium | word_freq_technology; Medium | No-Spam | 0.0107 | 1
word_freq_telnet; High | | | No-Spam | 0.0099 | 1
word_freq_1999; Medium | word_freq_edu; Medium | | No-Spam | 0.0097 | 1
word_freq_lab; Medium | word_freq_technology; Medium | | No-Spam | 0.0089 | 1
word_freq_project; Medium | | | No-Spam | 0.0079 | 1
word_freq_1999; Medium | word_freq_technology; Medium | | No-Spam | 0.0055 | 1
word_freq_1999; Medium | char_freq_; Medium | | No-Spam | 0.0054 | 1
word_freq_edu; High | | | No-Spam | 0.0053 | 1
char_freq_$; Medium | word_freq_direct; Medium | | Spam | 0.0047 | 1
word_freq_meeting; High | | | No-Spam | 0.0046 | 1
word_freq_email; Medium | char_freq_$; Medium | | Spam | 0.0043 | 1
char_freq_; High | | | No-Spam | 0.0039 | 1
char_freq_$; High | | | Spam | 0.0033 | 1
char_freq_$; Medium | char_freq_; Medium | | Spam | 0.0030 | 1
word_freq_email; High | word_freq_re; Low | word_freq_all; Low | Spam | 0.0029 | 1
word_freq_cs; High | | | No-Spam | 0.0026 | 1
word_freq_email; Medium | word_freq_all; High | | No-Spam | 0.0026 | 1
word_freq_re; High | | | No-Spam | 0.0020 | 1
word_freq_edu; Medium | word_freq_all; Medium | | No-Spam | 0.0020 | 1
word_freq_edu; Low | word_freq_edu; Medium | | No-Spam | 0.0014 | 1
word_freq_george; Medium | word_freq_1999; Medium | | No-Spam | 0.0014 | 1
word_freq_george; Medium | word_freq_1999; High | | No-Spam | 0.0013 | 1
word_freq_george; Medium | char_freq_; Medium | | No-Spam | 0.0013 | 1
word_freq_1999; High | word_freq_all; Medium | | No-Spam | 0.0012 | 1
word_freq_hp; Low | char_freq_$; Medium | word_freq_all; Medium | Spam | 0.0065 | 1
word_freq_857; Medium | | | No-Spam | 0.0217 | 0.9999
word_freq_email; Medium | word_freq_direct; Low | word_freq_direct; Medium | Spam | 0.0011 | 0.9996
word_freq_re; Medium | | | No-Spam | 0.0258 | 0.9940
word_freq_lab; High | | | No-Spam | 0.0099 | 0.9930
word_freq_telnet; Medium | | | No-Spam | 0.0261 | 0.9901
word_freq_technology; Low | word_freq_technology; Medium | | No-Spam | 0.0023 | 0.9788
word_freq_email; Low | word_freq_email; Medium | word_freq_telnet; Low | Spam | 0.0044 | 0.9763
word_freq_george; Medium | | | No-Spam | 0.0155 | 0.9747
word_freq_lab; Medium | | | No-Spam | 0.0158 | 0.9641
word_freq_hp; Low | char_freq_$; Medium | word_freq_meeting; Low | Spam | 0.0306 | 0.9630
word_freq_edu; Medium | | | No-Spam | 0.0329 | 0.9615
word_freq_hp; Low | word_freq_email; Medium | word_freq_direct; Medium | Spam | 0.0083 | 0.9283
word_freq_email; Medium | word_freq_all; Low | word_freq_all; Medium | Spam | 0.0047 | 0.9233
word_freq_1999; Medium | | | No-Spam | 0.0576 | 0.9109
word_freq_table; Medium | word_freq_all; Medium | | No-Spam | 0.0014 | 0.8914
capital_run_length_longest; Low | word_freq_all; High | | No-Spam | 0.0148 | 0.8817
word_freq_technology; Medium | word_freq_all; Medium | | No-Spam | 0.0049 | 0.8816
word_freq_email; Medium | word_freq_telnet; Low | word_freq_all; Medium | Spam | 0.0218 | 0.8791
word_freq_email; Medium | char_freq_$; Low | word_freq_direct; Medium | No-Spam | 0.0229 | 0.8453
word_freq_technology; Medium | | | No-Spam | 0.0387 | 0.8429
word_freq_direct; Low | word_freq_direct; Medium | word_freq_technology; Low | Spam | 0.0020 | 0.8257
word_freq_hp; Low | word_freq_direct; Medium | word_freq_857; Low | Spam | 0.0158 | 0.7933
word_freq_hp; Low | word_freq_email; Medium | word_freq_telnet; Low | Spam | 0.0598 | 0.7865
char_freq_$; Low | char_freq_; Medium | capital_run_length_longest; Low | No-Spam | 0.0123 | 0.7714
word_freq_1999; High | | | No-Spam | 0.0038 | 0.7435
word_freq_hpl; Low | word_freq_technology; Low | word_freq_all_; Medium | Spam | 0.0991 | 0.7326
word_freq_telnet; Low | word_freq_hpl; Low | word_freq_all; Medium | Spam | 0.0998 | 0.7298
word_freq_email; Low | char_freq_$; Low | word_freq_all; Low | No-Spam | 0.5013 | 0.6670
word_freq_email; Low | word_freq_all; Low | | No-Spam | 0.5031 | 0.6476
char_freq_$; Low | word_freq_all; Low | | No-Spam | 0.5194 | 0.6431
word_freq_email; Low | char_freq_$; Low | | No-Spam | 0.5526 | 0.6292
char_freq_$; Low | capital_run_length_longest; Low | | No-Spam | 0.5801 | 0.6155

Appendix A.2. Additional Results for Phishing Link Detection

Some FMFs are shown in Section 4.2.1 (FMFs); additional FMFs are shown below.
Nb_dots describes the number of dots in a URL; in the phishing dataset, the largest nb_dots is 24 (Figure A4 shows the FMFs). Phishing URLs use sensitive words to gain trust on visited web pages, and the number of such words in a URL is considered a phishing indicator. Nb_www is an integer describing how many times ‘www’ appears in a URL; common terms such as ‘www’ tend to be used only once in legitimate URLs but more than once in phishing URLs. In the phishing dataset, the largest nb_www is 2 (Figure A5 shows the FMFs).
Figure A4. Phishing dataset: FMFs for nb_dots. Low and high are shown in red and blue, respectively.
Most Anchors rules are presented in Section 4.2.2. Here, we show a few additional Anchors rules.
  • IF (nb_dollar > 0.00 AND nb_dots > 3.00) THEN Phishing, coverage = 0.0005, precision = 1
The above rule is always true and describes 0.05% of the dataset.
  • IF (google_index > 0.00 AND nb_www <= 0.00 AND domain_age <= 978.50) THEN Phishing, coverage = 0.1169, precision = 0.9615
Nb_www is always greater than or equal to 0; therefore, this antecedent states that nb_www is equal to zero. Again, it is not clear why URLs indexed by Google (google_index > 0.00) would lead to phishing. Domain_age has a difficult-to-understand threshold in the above rule.
Figure A5. Phishing dataset: FMFs for nb_www. Low and high are shown in red and blue, respectively.
Table A2. Pruned set of rules for the phishing dataset, method 27 from Table 3, maximum magnitude and minimum rule confidence of 0.8.
Antecedent 1 | Antecedent 2 | Antecedent 3 | Consequent | Support | Confidence
nb_at; High | | | Phishing | 0.0236 | 1
nb_space; Low | nb_colon; High | | Phishing | 0.0069 | 1
nb_semicolumn; High | | | Phishing | 0.0045 | 1
longest_words_raw; High | | | Phishing | 0.0045 | 1
phish_hints; High | ratio_digits_host; High | | Phishing | 0.0045 | 1
brand_in_subdomain; High | | | Phishing | 0.0040 | 1
nb_dots_; High | | | Phishing | 0.0039 | 1
nb_hyperlinks; High | ratio_extRedirection; High | | No-Phishing | 0.0035 | 1
domain_registration_length; High | nb_hyperlinks; High | | No-Phishing | 0.0028 | 1
nb_www; High | length_words_raw; High | | Phishing | 0.0024 | 1
domain_registration_length; High | ratio_digits_host; High | | Phishing | 0.0011 | 1
phish_hints; High | ratio_extRedirection; High | | Phishing | 0.0095 | 0.9999
ratio_extRedirection; High | ratio_digits_host; High | | Phishing | 0.0037 | 0.9993
page_rank; High | nb_hyperlinks; High | | No-Phishing | 0.0142 | 0.9991
phish_hints; High | page_rank; Low | | Phishing | 0.0711 | 0.9945
nb_space; High | domain_registration_length; High | | No-Phishing | 0.0011 | 0.9931
nb_www; Low | ratio_digits_host; High | | Phishing | 0.0325 | 0.9902
nb_space; High | page_rank; High | nb_www; High | No-Phishing | 0.0016 | 0.9837
domain_registration_length; Low | phish_hints; High | | Phishing | 0.0784 | 0.9832
phish_hints; High | nb_www; Low | | Phishing | 0.0596 | 0.9825
page_rank; High | ratio_extRedirection; High | nb_www; High | No-Phishing | 0.0273 | 0.9803
nb_hyphens; High | page_rank; High | ratio_extRedirection; High | No-Phishing | 0.0035 | 0.9770
nb_hyphens; High | page_rank; High | length_words_raw; Low | No-Phishing | 0.0314 | 0.9758
ratio_digits_host; High | | | Phishing | 0.0350 | 0.9755
nb_hyperlinks; High | nb_www; Low | | No-Phishing | 0.0098 | 0.9721
phish_hints; Low | nb_hyperlinks; High | | No-Phishing | 0.0179 | 0.9702
nb_hyphens; High | page_rank; High | | No-Phishing | 0.0320 | 0.9610
page_rank; High | nb_www; High | | No-Phishing | 0.2250 | 0.9565
nb_hyphens; Low | nb_hyphens; High | nb_www; High | No-Phishing | 0.0029 | 0.9546
domain_registration_length; High | ratio_extRedirection; High | nb_www; High | No-Phishing | 0.0040 | 0.9534
page_rank; High | ratio_extRedirection; Low | ratio_extRedirection; High | No-Phishing | 0.0112 | 0.9530
domain_registration_length; Low | domain_registration_length; High | page_rank; High | No-Phishing | 0.0038 | 0.9527
domain_registration_length; High | nb_www; High | length_words_raw; Low | No-Phishing | 0.0435 | 0.9462
nb_dots; Low | nb_www; Low | length_words_raw; High | No-Phishing | 0.0015 | 0.9422
nb_hyphens; High | ratio_extRedirection; Low | nb_www; High | No-Phishing | 0.0227 | 0.9403
longest_words_raw; Low | domain_registration_length; High | ratio_extRedirection; High | No-Phishing | 0.0072 | 0.9375
nb_space; High | phish_hints; Low | nb_www; High | No-Phishing | 0.0032 | 0.9230
ratio_extRedirection; Low | ratio_extRedirection; High | nb_www; High | No-Phishing | 0.0070 | 0.9084
nb_underscore; High | page_rank; High | nb_colon; Low | No-Phishing | 0.0092 | 0.9045
phish_hints; Low | ratio_extRedirection; High | nb_www; High | No-Phishing | 0.0418 | 0.8770
page_rank; Low | ratio_extRedirection; Low | nb_www; Low | Phishing | 0.2589 | 0.8661
phish_hints; Low | ratio_extRedirection; Low | ratio_extRedirection; High | No-Phishing | 0.0126 | 0.8583
page_rank; Low | nb_www; Low | | Phishing | 0.2845 | 0.8524
nb_hyphens; High | nb_at; Low | phish_hints; Low | No-Phishing | 0.0383 | 0.8489
nb_space; High | page_rank; Low | page_rank; High | No-Phishing | 0.0011 | 0.8375
page_rank; Low | page_rank; High | ratio_extRedirection; High | No-Phishing | 0.0139 | 0.8301
domain_registration_length; High | phish_hints; Low | page_rank; High | No-Phishing | 0.0489 | 0.8230
phish_hints; Low | nb_www; High | | No-Phishing | 0.3468 | 0.8155

Appendix A.3. Additional Results for Robotic Navigation

Some FMFs are shown in Section 4.3.1 (FuzRED Results); an additional FMF is shown below.
Figure A6. Robotic navigation: FMFs for feature red_box_height. Short, medium, and tall are shown in red, green, and blue, respectively.
Many Anchors rules are presented in Section 4.3.2. Some of the Anchors rules with four antecedents are shown below.
  • IF (yellow_box_detected <= 1.00 AND blue_box_height > 0.16 AND yellow_box_position_y <= 0.77 AND blue_box_position_y <= 0.78) THEN move_forward, coverage = 0.0591, precision = 0.96
The first antecedent of this rule, yellow_box_detected <= 1.00, is always true because yellow_box_detected is a binary feature; it does not make sense for this antecedent to be part of the rule, yet Anchors includes it. The other parts of the rule (blue_box_height > 0.16, yellow_box_position_y <= 0.77, and blue_box_position_y <= 0.78) are difficult to understand.
  • IF (blue_box_detected <= 1.00 AND red_box_height > 0.14 AND target_red_box > 0.00 AND yellow_box_detected <= 0.00) THEN move_forward, coverage = 0.1268, precision = 0.9689
This is another rule with an antecedent that is always true: blue_box_detected <= 1.00. Because several of the features are binary, the rule effectively states that if red_box_height > 0.14, target_red_box = 1, and yellow_box_detected = 0, then move_forward in 96.89% of cases. The threshold 0.14 is difficult to interpret.
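A simple post-processing pass could drop such always-true antecedents automatically, given the known range of each feature. A minimal sketch (the feature ranges and the rule literal below are illustrative, not part of the Anchors implementation):

```python
# Sketch: drop antecedents that are tautologies given known feature ranges,
# e.g. "yellow_box_detected <= 1.00" for a binary feature.

FEATURE_RANGES = {              # illustrative [min, max] ranges
    "yellow_box_detected": (0.0, 1.0),   # binary
    "blue_box_height": (0.0, 1.0),
    "yellow_box_position_y": (0.0, 1.0),
    "blue_box_position_y": (0.0, 1.0),
}

def is_tautology(feature, op, threshold):
    """True if the antecedent holds for every value in the feature's range."""
    lo, hi = FEATURE_RANGES[feature]
    if op == "<=":
        return threshold >= hi
    if op == ">":
        return threshold < lo
    return False

def prune(rule):
    """rule: list of (feature, op, threshold) antecedents."""
    return [a for a in rule if not is_tautology(*a)]

rule = [("yellow_box_detected", "<=", 1.00),
        ("blue_box_height", ">", 0.16),
        ("yellow_box_position_y", "<=", 0.77),
        ("blue_box_position_y", "<=", 0.78)]
print(prune(rule))  # the always-true first antecedent is removed
```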
Table A3. Pruned set of rules for the robotic navigation agent (row 9 of Table 5).
Antecedent 1 | Antecedent 2 | Antecedent 3 | Consequent | Support | Confidence
blue_box_detected; True | target_blue_box; True | | move_forward | 0.24 | 1
yellow_box_height; Tall | target_yellow_box; True | | move_forward | 0.118 | 1
yellow_box_detected; False | target_yellow_box; True | | turn_right | 0.116 | 1
blue_box_detected; False | target_blue_box; True | | turn_left | 0.115 | 1
red_box_detected; False | target_red_box; True | | turn_left | 0.104 | 1
yellow_box_width; Narrow | target_yellow_box; True | | move_forward | 0.046 | 1
red_box_width; Medium | yellow_box_position_x; Right | yellow_box_height; Tall | move_forward | 0.046 | 1
blue_box_width; Medium | yellow_box_position_x; Middle | | move_forward | 0.033 | 1
red_box_position_x; Middle | yellow_box_detected; False | target_red_box; True | move_forward | 0.031 | 1
red_box_position_x; Left | blue_box_height; Tall | | move_forward | 0.024 | 1
red_box_width; Medium | red_box_height; Medium | yellow_box_height; Tall | move_forward | 0.023 | 1
red_box_position_x; Middle | red_box_width; Medium | yellow_box_width; Medium | move_forward | 0.022 | 1
red_box_position_x; Right | yellow_box_position_x; Right | | move_forward | 0.02 | 1
yellow_box_width; Wide | target_yellow_box; True | | move_forward | 0.016 | 1
red_box_position_x; Right | target_red_box; True | | move_forward | 0.013 | 1
red_box_position_y; Middle | yellow_box_width; Medium | | move_forward | 0.013 | 1
yellow_box_height; Short | target_yellow_box; True | | move_forward | 0.013 | 1
red_box_width; Narrow | yellow_box_width; Medium | blue_box_detected; False | move_forward | 0.013 | 1
red_box_height; Medium | yellow_box_position_x; Left | yellow_box_position_y; Bottom | move_forward | 0.012 | 1
red_box_width; Narrow | blue_box_width; Medium | | move_forward | 0.011 | 1
red_box_width; Medium | blue_box_width; Medium | yellow_box_detected; False | move_forward | 0.011 | 1
red_box_height; Tall | yellow_box_width; Narrow | | move_forward | 0.01 | 1
red_box_width; Wide | blue_box_detected; False | target_red_box; True | move_forward | 0.015 | 0.999
red_box_width; Narrow | blue_box_detected; False | target_red_box; True | move_forward | 0.017 | 0.997
yellow_box_detected; True | target_yellow_box; True | | move_forward | 0.223 | 0.996
blue_box_width; Medium | yellow_box_position_y; Middle | | move_forward | 0.02 | 0.995
red_box_width; Medium | blue_box_position_x; Middle | blue_box_position_y; Bottom | move_forward | 0.015 | 0.995
blue_box_position_y; Middle | blue_box_width; Medium | yellow_box_height; Medium | move_forward | 0.015 | 0.991
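Compact rule sets like Table A3 can be obtained with a database-coverage pruning scheme in the spirit of CBA [24]: rules are ranked by confidence and support, and a rule is kept only if it covers records no higher-ranked rule already covers. The exact pruning criterion used by FuzRED is described in the methodology; the following is only a generic sketch with toy rules:

```python
# Sketch: database-coverage pruning in the spirit of CBA. Rules are sorted
# by (confidence, support); a rule is kept only if it covers at least one
# record that no earlier-kept rule covers. Rule objects are toy examples.

def prune_rules(rules):
    """rules: list of (name, confidence, support, covered_record_ids)."""
    kept, covered = [], set()
    for name, conf, supp, ids in sorted(rules, key=lambda r: (-r[1], -r[2])):
        new = set(ids) - covered
        if new:                 # rule contributes new coverage -> keep it
            kept.append(name)
            covered |= new
    return kept, covered

rules = [
    ("r1", 1.00, 0.24, {0, 1, 2, 3}),
    ("r2", 1.00, 0.12, {2, 3}),        # adds no new records -> pruned
    ("r3", 0.99, 0.10, {4, 5}),
]
kept, covered = prune_rules(rules)
print(kept)  # r2 is subsumed by r1
```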

References

  1. Towards trustable machine learning. Nat. Biomed. Eng. 2018, 2, 709–710. [CrossRef] [PubMed]
  2. Sayler, K. Artificial Intelligence and National Security; U.S. Congressional Research Service: Washington, DC, USA, 2019. [Google Scholar]
  3. Fjeld, J.; Hilligoss, H.; Achten, N.; Daniel, M.L.; Feldman, J.; Kagay, S. Principled Artificial Intelligence; Berkman Klein Center: Harvard, MA, USA, 2019; Available online: https://ai-hr.cyber.harvard.edu/primp-viz.html (accessed on 1 February 2025).
  4. Doshi-Velez, F.; Kortz, M. Accountability of AI Under the law: The Role of Explanation. arXiv 2017. Available online: https://arxiv.org/abs/1711.01134 (accessed on 1 February 2025). [CrossRef]
  5. Wagner, J. GDPR and Explainable AI. 2019. Available online: https://www.jwtechwriter.com/project-146.php (accessed on 1 February 2025).
  6. Guidotti, R.; Monreale, A.; Ruggieri, S.; Turini, F.; Giannotti, F.; Pedreschi, D. A survey of methods for explaining black box models. ACM Comput. Surv. 2018, 51, 93. [Google Scholar] [CrossRef]
  7. Lipton, Z.C. The mythos of model interpretability. arXiv 2017. Available online: https://arxiv.org/abs/1606.03490 (accessed on 1 February 2025).
  8. Molnar, C. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable, 2nd ed. 2022. Available online: https://christophm.github.io/interpretable-ml-book/ (accessed on 1 February 2025).
  9. Pedrycz, W.; Gomide, F. Fuzzy Systems Engineering: Toward Human-Centric Computing; Wiley-IEEE Press: Hoboken, NJ, USA, 2007. [Google Scholar]
  10. Lundberg, S.M.; Lee, S. A unified approach to interpreting model predictions. In Proceedings of the Advances in Neural Information Processing Systems 30 (NeurIPS 2017), Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
  11. Zadeh, L.A. Fuzzy sets. Inf. Control 1965, 8, 338–353. [Google Scholar] [CrossRef]
  12. Das, A.; Rad, P. Opportunities and challenges in explainable artificial intelligence (XAI): A survey. arXiv 2020. Available online: https://arxiv.org/abs/2006.11371 (accessed on 1 February 2025).
  13. Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. Int. J. Comput. Vis. 2019, 128, 336–359. [Google Scholar] [CrossRef]
  14. Ribeiro, M.T.; Singh, S.; Guestrin, C. “Why should I trust you?”: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; ACM: New York, NY, USA, 2016; pp. 1135–1144. [Google Scholar]
  15. Ribeiro, M.T.; Singh, S.; Guestrin, C. Anchors: High-precision model-agnostic explanations. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32. [Google Scholar] [CrossRef]
  16. Stepin, I.; Alonso, J.M.; Catalá, A.; Pereira-Fariña, M. A Survey of Contrastive and Counterfactual Explanation Generation Methods for Explainable Artificial Intelligence. IEEE Access 2021, 9, 11974–12001. [Google Scholar] [CrossRef]
  17. Mazzine Barbosa, R.; Martens, D. A Framework and Benchmarking Study for Counterfactual Generating Methods on Tabular Data. Appl. Sci. 2021, 11, 7274. [Google Scholar] [CrossRef]
  18. Wachter, S.; Mittelstadt, B.; Russell, C. Counterfactual explanations without opening the black box: Automated decisions and the GDPR. Harv. J. Law Technol. 2017, 31, 841–887. [Google Scholar] [CrossRef]
  19. Verma, S.; Dickerson, J.; Hines, K. Counterfactual explanations for machine learning: A review. arXiv 2020. Available online: https://arxiv.org/abs/2010.10596 (accessed on 1 February 2025).
  20. Kim, B.; Wattenberg, M.; Gilmer, J.; Cai, C.; Wexler, J.; Viegas, F.; Sayres, R. Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV). In Proceedings of the 35th International Conference on Machine Learning (ICML), Stockholm, Sweden, 10–15 July 2018. [Google Scholar]
  21. Agrawal, R.; Srikant, R. Fast algorithms for mining association rules. In Proceedings of the 20th International Conference on Very Large Data Bases, Santiago, Chile, 12–15 September 1994; pp. 487–499. [Google Scholar]
  22. Agrawal, R.; Imielinski, T.; Swami, A. Mining association rules between sets of items in large databases. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Washington, DC, USA, 26–28 May 1993; ACM: New York, NY, USA, 1993; pp. 207–216. [Google Scholar]
  23. Kuok, C.M.; Fu, A.; Wong, M.H. Mining fuzzy association rules in databases. ACM Sigmod Rec. 1998, 27, 41–46. [Google Scholar] [CrossRef]
  24. Liu, B.; Hsu, W.; Ma, Y. Integrating classification and association rule mining. In Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining (KDD), New York, NY, USA, 27–31 August 1998; AAAI Press: Washington, DC, USA, 1998. [Google Scholar]
  25. Reynolds, D. Gaussian mixture models. In Encyclopedia of Biometrics; Li, S.Z., Jain, A., Eds.; Springer: Berlin/Heidelberg, Germany, 2009. [Google Scholar] [CrossRef]
  26. Dua, D.; Graff, C. UCI Machine Learning Repository; University of California, School of Information and Computer Science: Irvine, CA, USA, 2019; Available online: http://archive.ics.uci.edu/ml (accessed on 1 February 2025).
  27. scikit-learn. (n.d.). StandardScaler. Available online: https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html (accessed on 1 February 2025).
  28. Lundberg, S.M. (n.d.). SHAP Documentation. Available online: https://shap.readthedocs.io/en/latest/ (accessed on 1 February 2025).
  29. Hannousse, A.; Yahiouche, S. Towards benchmark datasets for machine learning based website phishing detection: An experimental study. arXiv 2020. Available online: https://arxiv.org/abs/2010.12847 (accessed on 1 February 2025). [CrossRef]
  30. Chevalier-Boisvert, M.; Dai, B.; Towers, M.; de Lazcano, R.; Willems, L.; Lahlou, S.; Pal, S.; Castro, P.S.; Terry, J. Minigrid & Miniworld: Modular & customizable reinforcement learning environments for goal-oriented tasks. arXiv 2023. Available online: https://arxiv.org/abs/2306.13831 (accessed on 1 February 2025).
  31. Raffin, A.; Hill, A.; Gleave, A.; Kanervisto, A.; Ernestus, M.; Dormann, N. Stable-baselines3: Reliable reinforcement learning implementations. J. Mach. Learn. Res. 2021, 22, 1–8. [Google Scholar]
  32. Jocher, G.; Chaurasia, A.; Qiu, J. Ultralytics YOLOv8 (Version 8.0.0). 2023. Available online: https://github.com/ultralytics/ultralytics (accessed on 1 February 2025).
Figure 1. FuzRED approach.
Figure 2. FMFs for the fuzzy feature Temperature: Low, Normal, Fever, Strong Fever, and Hyperthermia.
Figure 3. Spam dataset: bar plot for the top 20 features determined using the mean magnitude method.
Figure 4. Spam dataset: bar plot for the top 20 features determined using the frequency method.
Figure 5. Spam dataset: bar plot for the top 20 features determined using the mean contribution method.
Figure 6. Spam dataset: bar plot for the top 20 features determined using the maximum contribution method.
Figure 7. Spam dataset: bar plot for the top 20 features determined using the maximum magnitude method.
Figure 8. Spam dataset: bar plot for the top 20 features determined using the minimum method.
Figure 9. Spam dataset: FMFs for word_freq_george, i.e., low, medium, and high, are shown in red, green, and blue, respectively.
Figure 10. Spam dataset: FMFs for word_freq_hp, i.e., low, medium, and high are shown in red, green, and blue, respectively.
Figure 11. Phishing dataset: bar plot for the top 20 features determined using the mean magnitude method.
Figure 12. Phishing dataset: bar plot for the top 20 features determined using the frequency method.
Figure 13. Phishing dataset: bar plot for the top 20 features determined using the mean contribution method.
Figure 14. Phishing dataset: bar plot for the top 20 features determined using the maximum contribution method.
Figure 15. Phishing dataset: bar plot for the top 20 features determined using the maximum magnitude method.
Figure 16. Phishing dataset: bar plot for the top 20 features determined using the minimum contribution method.
Figure 17. Phishing dataset: FMFs for length_words_raw. Low and high are shown in red and blue, respectively.
Figure 18. Phishing dataset: FMFs for ratio_intMedia. Low and high are shown in red and blue, respectively.
Figure 19. Robotic navigation agent’s environment.
Figure 20. Robotic navigation: FMFs for feature red_box_detected. False and true are shown in red and green, respectively.
Figure 21. Robotic navigation: FMFs for feature red_box_position. Left, middle, and right are shown in red, green, and blue, respectively.
Figure 22. Comparison of computation times between FuzRED and Anchors. The Y scale is logarithmic.
Table 1. Spam: number of rules extracted by FuzRED for different methods of determining the most important features.
Row Number | Min Rule Confidence | Method | Number of Rules | Number of Rules After Pruning | Total Coverage
1 | 0.6 | Maximum contribution | 609 | 139 | 0.9717
2 | 0.6 | Mean magnitude | 545 | 128 | 0.9671
3 | 0.6 | Frequency | 578 | 133 | 0.9671
4 | 0.7 | Mean magnitude | 387 | 126 | 0.9414
5 | 0.7 | Frequency | 384 | 131 | 0.9414
6 | 0.7 | Maximum contribution | 353 | 135 | 0.9309
7 | 0.6 | Mean contribution | 406 | 109 | 0.9289
8 | 0.6 | Maximum magnitude | 342 | 96 | 0.9243
9 | 0.7 | Mean contribution | 230 | 106 | 0.9019
10 | 0.7 | Maximum magnitude | 238 | 90 | 0.8907
11 | 0.6 | Minimum contribution | 195 | 67 | 0.7966
12 | 0.8 | Mean magnitude | 253 | 114 | 0.6458
13 | 0.8 | Frequency | 250 | 118 | 0.6248
14 | 0.8 | Mean contribution | 182 | 102 | 0.6004
15 | 0.9 | Mean magnitude | 204 | 102 | 0.5747
16 | 0.9 | Frequency | 209 | 107 | 0.5583
17 | 0.95 | Mean magnitude | 158 | 94 | 0.5332
18 | 0.7 | Minimum contribution | 179 | 62 | 0.5346
19 | 0.95 | Frequency | 161 | 99 | 0.5168
20 | 0.96 | Mean magnitude | 149 | 91 | 0.5122
21 | 0.9 | Mean contribution | 144 | 92 | 0.5089
22 | 0.97 | Mean magnitude | 139 | 87 | 0.5010
23 | 0.96 | Frequency | 154 | 96 | 0.4924
24 | 0.8 | Maximum magnitude | 157 | 81 | 0.4852
25 | 0.98 | Mean magnitude | 138 | 85 | 0.4918
26 | 0.97 | Frequency | 146 | 91 | 0.4852
27 | 0.95 | Mean contribution | 125 | 87 | 0.4885
28 | 0.98 | Frequency | 144 | 89 | 0.4760
29 | 0.96 | Mean contribution | 116 | 83 | 0.4635
30 | 0.99 | Mean magnitude | 121 | 76 | 0.4602
31 | 0.99 | Frequency | 126 | 80 | 0.4523
32 | 0.97 | Mean contribution | 113 | 80 | 0.4523
33 | 0.98 | Mean contribution | 111 | 77 | 0.4457
34 | 0.9 | Maximum magnitude | 120 | 71 | 0.4332
35 | 0.99 | Mean contribution | 100 | 70 | 0.4134
36 | 0.8 | Maximum contribution | 212 | 120 | 0.3568
37 | 0.95 | Maximum magnitude | 99 | 66 | 0.4009
38 | 0.8 | Minimum contribution | 104 | 56 | 0.4075
39 | 0.96 | Maximum magnitude | 93 | 63 | 0.3726
40 | 0.97 | Maximum magnitude | 87 | 60 | 0.3568
41 | 0.98 | Maximum magnitude | 88 | 59 | 0.3562
42 | 0.9 | Minimum contribution | 86 | 49 | 0.3805
43 | 0.99 | Maximum magnitude | 82 | 56 | 0.3430
44 | 0.9 | Maximum contribution | 183 | 108 | 0.2877
45 | 1 | Mean magnitude | 72 | 48 | 0.3364
46 | 0.95 | Minimum contribution | 71 | 46 | 0.3476
47 | 0.96 | Minimum contribution | 68 | 46 | 0.3476
48 | 1 | Frequency | 73 | 47 | 0.3246
49 | 0.95 | Maximum contribution | 161 | 101 | 0.2646
50 | 1 | Mean contribution | 64 | 46 | 0.3160
51 | 0.96 | Maximum contribution | 158 | 99 | 0.2600
52 | 0.97 | Maximum contribution | 151 | 94 | 0.2535
53 | 0.98 | Maximum contribution | 148 | 91 | 0.2508
54 | 0.97 | Minimum contribution | 64 | 43 | 0.3226
55 | 0.99 | Maximum contribution | 136 | 87 | 0.2357
56 | 0.99 | Minimum contribution | 60 | 40 | 0.3061
57 | 0.98 | Minimum contribution | 60 | 40 | 0.3061
58 | 1 | Maximum magnitude | 54 | 35 | 0.2791
59 | 1 | Minimum contribution | 60 | 36 | 0.2791
60 | 1 | Maximum contribution | 77 | 49 | 0.1554
Table 2. Phishing: number of rules extracted by FuzRED for different methods for determining the most important features.
Row Number | Min Rule Confidence | Method | Number of Rules | Number of Rules After Pruning | Total Coverage
1 | 0.6 | Frequency | 912 | 163 | 1
2 | 0.6 | Mean magnitude | 939 | 164 | 1
3 | 0.7 | Mean magnitude | 773 | 164 | 1
4 | 0.8 | Mean magnitude | 582 | 164 | 1
5 | 0.7 | Frequency | 765 | 162 | 0.9992
6 | 0.8 | Frequency | 588 | 159 | 0.9984
7 | 0.6 | Minimum contribution | 299 | 100 | 0.9981
8 | 0.6 | Maximum contribution | 113 | 56 | 0.9966
9 | 0.7 | Minimum contribution | 257 | 95 | 0.9960
10 | 0.6 | Mean contribution | 366 | 124 | 0.9952
11 | 0.9 | Frequency | 413 | 141 | 0.9862
12 | 0.7 | Maximum contribution | 100 | 53 | 0.9841
13 | 0.7 | Mean contribution | 294 | 115 | 0.9838
14 | 0.9 | Mean magnitude | 380 | 142 | 0.9825
15 | 0.6 | Maximum magnitude | 220 | 58 | 0.9793
16 | 0.8 | Maximum contribution | 69 | 52 | 0.9722
17 | 0.7 | Maximum magnitude | 187 | 56 | 0.9700
18 | 0.8 | Minimum contribution | 181 | 87 | 0.9669
19 | 0.95 | Frequency | 252 | 119 | 0.9517
20 | 0.95 | Mean magnitude | 238 | 118 | 0.9459
21 | 0.9 | Minimum contribution | 121 | 73 | 0.9427
22 | 0.96 | Frequency | 225 | 108 | 0.9337
23 | 0.96 | Mean magnitude | 210 | 106 | 0.9266
24 | 0.9 | Maximum contribution | 55 | 43 | 0.8834
25 | 0.95 | Minimum contribution | 95 | 61 | 0.7961
26 | 0.8 | Mean contribution | 195 | 89 | 0.7908
27 | 0.8 | Maximum magnitude | 116 | 48 | 0.7704
28 | 0.96 | Minimum contribution | 88 | 58 | 0.7216
29 | 0.97 | Frequency | 185 | 94 | 0.7235
30 | 0.97 | Mean magnitude | 175 | 91 | 0.6779
31 | 0.98 | Frequency | 139 | 75 | 0.6694
32 | 0.95 | Maximum contribution | 44 | 34 | 0.6622
33 | 0.96 | Maximum contribution | 49 | 32 | 0.6516
34 | 0.98 | Mean magnitude | 132 | 71 | 0.6259
35 | 0.99 | Frequency | 91 | 53 | 0.5623
36 | 0.99 | Mean magnitude | 90 | 54 | 0.5411
37 | 0.9 | Mean contribution | 103 | 58 | 0.5427
38 | 0.97 | Minimum contribution | 76 | 52 | 0.5005
39 | 0.9 | Maximum magnitude | 66 | 39 | 0.4907
40 | 0.95 | Maximum magnitude | 44 | 32 | 0.4801
41 | 0.96 | Maximum magnitude | 38 | 28 | 0.4629
42 | 0.98 | Minimum contribution | 65 | 47 | 0.4449
43 | 0.99 | Minimum contribution | 52 | 39 | 0.3624
44 | 0.97 | Maximum contribution | 33 | 25 | 0.2654
45 | 0.95 | Mean contribution | 62 | 40 | 0.2569
46 | 0.97 | Maximum magnitude | 38 | 26 | 0.2450
47 | 0.96 | Mean contribution | 58 | 38 | 0.2169
48 | 0.97 | Mean contribution | 54 | 38 | 0.2155
49 | 0.98 | Maximum contribution | 32 | 23 | 0.2121
50 | 0.98 | Maximum magnitude | 29 | 22 | 0.2092
51 | 0.99 | Maximum contribution | 22 | 16 | 0.1893
52 | 0.98 | Mean contribution | 45 | 35 | 0.1853
53 | 0.99 | Mean contribution | 29 | 27 | 0.1665
54 | 0.99 | Maximum magnitude | 23 | 17 | 0.1575
55 | 1 | Frequency | 23 | 16 | 0.0793
56 | 1 | Mean magnitude | 17 | 13 | 0.0745
57 | 1 | Minimum contribution | 20 | 15 | 0.0567
58 | 1 | Maximum magnitude | 15 | 10 | 0.0536
59 | 1 | Mean contribution | 11 | 9 | 0.0488
60 | 1 | Maximum contribution | 8 | 8 | 0.0424
Table 3. Hyperparameters used to train the PPO model.
Hyperparameter | Value
Discount factor (γ) | 0.99
Entropy coefficient | 0.05
Learning rate | 0.0005
Batch size | 1024
Number of layers | 2
Layer size | 128
Number of environments | 16
Number of steps per update | 2048
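The hyperparameters above can be expressed as keyword arguments for a Stable-Baselines3 PPO model (the paper trains PPO with Stable-Baselines3 [31]). Mapping "number of layers"/"layer size" to net_arch, and the policy name and environment factory in the commented lines, are our assumptions for illustration:

```python
# Sketch: Table 3's hyperparameters as Stable-Baselines3 PPO kwargs.
# The net_arch mapping, policy name, and env factory are illustrative.

ppo_kwargs = {
    "gamma": 0.99,           # discount factor
    "ent_coef": 0.05,        # entropy coefficient
    "learning_rate": 0.0005,
    "batch_size": 1024,
    "n_steps": 2048,         # steps per update (per environment)
    "policy_kwargs": {"net_arch": [128, 128]},  # 2 layers of 128 units
}
N_ENVS = 16                  # number of parallel environments

# With stable-baselines3 installed, training might look like:
# from stable_baselines3 import PPO
# from stable_baselines3.common.env_util import make_vec_env
# env = make_vec_env(make_miniworld_env, n_envs=N_ENVS)  # hypothetical factory
# model = PPO("MlpPolicy", env, **ppo_kwargs)
# model.learn(total_timesteps=1_000_000)
```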
Table 4. Feature names and descriptions.
Feature Name | Description
red_box_position_x | The x coordinate of the center of the YOLO bounding box around the red box.
red_box_position_y | The y coordinate of the center of the YOLO bounding box around the red box.
red_box_width | The width of the YOLO bounding box around the red box.
red_box_height | The height of the YOLO bounding box around the red box.
blue_box_position_x | The x coordinate of the center of the YOLO bounding box around the blue box.
blue_box_position_y | The y coordinate of the center of the YOLO bounding box around the blue box.
blue_box_width | The width of the YOLO bounding box around the blue box.
blue_box_height | The height of the YOLO bounding box around the blue box.
yellow_box_position_x | The x coordinate of the center of the YOLO bounding box around the yellow box.
yellow_box_position_y | The y coordinate of the center of the YOLO bounding box around the yellow box.
yellow_box_width | The width of the YOLO bounding box around the yellow box.
yellow_box_height | The height of the YOLO bounding box around the yellow box.
red_box_detected | Binary feature that indicates if YOLO has detected the red box (0 = false, 1 = true).
blue_box_detected | Binary feature that indicates if YOLO has detected the blue box (0 = false, 1 = true).
yellow_box_detected | Binary feature that indicates if YOLO has detected the yellow box (0 = false, 1 = true).
target_red_box | Binary feature that indicates if the red box is the current target (0 = false, 1 = true).
target_blue_box | Binary feature that indicates if the blue box is the current target (0 = false, 1 = true).
target_yellow_box | Binary feature that indicates if the yellow box is the current target (0 = false, 1 = true).
Table 5. Robotic navigation: number of rules extracted by FuzRED before and after pruning.
Row Number | Minimum Rule Confidence | Total Number of Rules | Total Number of Rules After Pruning | Total Coverage
1 | 0.6 | 1250 | 42 | 0.9984
2 | 0.7 | 1189 | 38 | 0.9921
3 | 0.8 | 1131 | 37 | 0.9913
4 | 0.9 | 1007 | 36 | 0.9874
5 | 0.95 | 824 | 36 | 0.9874
6 | 0.96 | 749 | 36 | 0.9874
7 | 0.97 | 697 | 35 | 0.9819
8 | 0.98 | 647 | 31 | 0.9740
9 | 0.99 | 560 | 28 | 0.9213
10 | 1 | 189 | 22 | 0.8606
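Total coverage in Table 5 is the fraction of records matched by at least one rule in the set. A minimal sketch with toy records and rule predicates (our illustration, not the paper's data):

```python
# Sketch: total coverage as the fraction of records matched by at least
# one rule. Rules here are predicates over a record dict; the toy records
# and rules are illustrative.

def total_coverage(rules, records):
    matched = sum(1 for rec in records if any(rule(rec) for rule in rules))
    return matched / len(records)

records = [
    {"red_box_detected": 1, "target_red_box": 1},
    {"red_box_detected": 0, "target_red_box": 1},
    {"red_box_detected": 0, "target_red_box": 0},
    {"red_box_detected": 1, "target_red_box": 0},
]
rules = [
    lambda r: r["red_box_detected"] == 1,
    lambda r: r["red_box_detected"] == 0 and r["target_red_box"] == 1,
]
print(total_coverage(rules, records))  # 3 of 4 records covered -> 0.75
```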
Table 6. Comparison of FuzRED, SHAP, LIME, and Anchors.
Criteria | FuzRED | SHAP | LIME | Anchors
Interpretability | High; intuitive fuzzy IF–THEN rules | Medium; feature importance values | Medium; local surrogate models (linear rules) | Medium-high; IF–THEN rules, but thresholds can be difficult to understand
Computation time | Low (seconds/minutes) | Medium (minutes/hours, depending on model size) | Medium (minutes/hours) | High (hours/days, especially with large datasets)
Explanation scope | Global and local (integrates both scopes) | Primarily local, aggregated for global insights | Primarily local; global insight limited | Local; separate rules per instance
Scalability to large models | Medium; potential complexity with very large feature sets | Medium; computationally intensive for large datasets/models | Medium; relies on sampling, complexity rises with dataset/model size | Low; challenging for large models/datasets
Ease of use for non-experts | Medium-high; intuitive linguistic rules in plain language, only requires basic understanding of features in data | Low-medium; numeric values less intuitive without visualization | Medium; explanations clear but rely on numeric thresholds and linear approximations | Medium; rules straightforward but numeric thresholds less intuitive
Strengths | Interpretable rules, computational efficiency, combined local/global insights | Rigorous theoretical foundations, strong global feature analysis | Intuitive local approximations, straightforward interpretation | Precise local explanations, high precision

Share and Cite

MDPI and ACS Style

Buczak, A.L.; Baugher, B.D.; Zaback, K. Fuzzy Rules for Explaining Deep Neural Network Decisions (FuzRED). Electronics 2025, 14, 1965. https://doi.org/10.3390/electronics14101965

