Electronics
  • Editor’s Choice
  • Article
  • Open Access

12 May 2025

Fuzzy Rules for Explaining Deep Neural Network Decisions (FuzRED)

Johns Hopkins Applied Physics Laboratory, Laurel, MD 20723, USA
This article belongs to the Special Issue Explainable Artificial Intelligence: Concepts, Techniques, Analytics and Applications

Abstract

This paper introduces a novel approach to explainable artificial intelligence (XAI) that enhances interpretability by combining local insights from Shapley additive explanations (SHAP)—a widely adopted XAI tool—with global explanations expressed as fuzzy association rules. By employing fuzzy association rules, our method enables AI systems to generate explanations that closely resemble human reasoning, delivering intuitive and comprehensible insights into system behavior. We present the FuzRED methodology and evaluate its performance on models trained across three diverse datasets: two classification tasks (spam identification and phishing link detection), and one reinforcement learning task involving robot navigation. Compared to the Anchors method, FuzRED offers at least an order-of-magnitude faster execution time (minutes vs. hours) while producing easily interpretable rules that enhance human understanding of AI decision making.

1. Introduction

Artificial intelligence (AI) systems have demonstrated increasingly remarkable modeling capabilities, yet this capability has come at the cost of growing complexity, which obscures the reasoning behind their decisions, predictions, or classifications. This lack of transparency has sparked concerns across various sectors, resulting in a reluctance to adopt AI systems. For example, medical professionals who rely on precise and understandable data to make critical decisions often hesitate to integrate AI into patient care due to the opaque nature of its outputs [1]. Similarly, military leaders have expressed reservations about deploying AI systems, citing a lack of transparency as a significant barrier [2].
Addressing these concerns requires the advancement of explainable AI (XAI), which seeks to make AI systems more transparent and understandable to humans. By offering clear, understandable explanations for their predictions, XAI can enhance trust, accountability, and privacy—key factors for the widespread adoption and integration of AI into real-world applications. Both governmental and non-governmental entities have emphasized the critical role of explainability, tying it to ethical standards and regulatory compliance [3,4]. For instance, the European Union’s General Data Protection Regulation includes provisions such as the “right to explanation”, potentially mandating explainability in AI systems [5]. These regulatory and ethical drivers have spurred increased research and development of XAI techniques, aimed at making AI systems more comprehensible to humans.
XAI techniques generally provide two types of explanations: local and global. Both types play a crucial role in achieving transparency in AI systems, each with its own strengths and limitations [6,7]. Local explanations provide precise, fine-grained insights into individual predictions, but when viewed in isolation, they fail to capture the broader behavior of the model. Global explanations, in contrast, aim to offer a broad understanding of how the model makes decisions across its entire feature space. However, achieving global explanations is challenging in practice, as Molnar states [8], “any model that exceeds a handful of parameters or weights is unlikely to fit into the short-term memory of the average human… [and] any feature space with more than three dimensions is simply inconceivable for humans”. This cognitive limitation makes it difficult to fully grasp complex models at a global level, underscoring the need for balanced approaches that combine both explanation types.
Ideally, we could utilize the precision of local explanations to construct global explanations that remain both intuitive and interpretable, without compromising accuracy or oversimplifying the model’s behavior. The FuzRED method aims to achieve this by leveraging feature importance rankings from local explanations to generate fuzzy IF → THEN rules, providing a high-level representation of model behavior.
We have found that using rules to explain AI model behavior significantly enhances interpretability, especially when these rules are grounded in familiar concepts or domain knowledge. Rules serve as simple, logical statements that describe the relationships between inputs and outputs, making model decisions more transparent. For instance, in a medical diagnosis system, a rule might state, “If a patient has a fever and a persistent cough, then there is a high probability of a respiratory infection”. Such rules are intuitive and align with the way humans naturally reason about cause and effect. However, real-world data are often uncertain and imprecise, which is where fuzzy association rules become valuable. Unlike rigid, binary rules, fuzzy rules introduce degrees of truth, making them more adaptable to complex and nuanced scenarios. For example, instead of classifying a fever as simply “present” or “absent”, a fuzzy rule might describe a patient’s temperature as “high” to a certain degree, allowing for a more flexible and realistic interpretation of medical conditions [9].
This paper introduces a novel XAI approach that leverages local explanations provided by SHAP (Shapley additive explanations) [10] to generate global explanations in the form of fuzzy association rules. By leveraging fuzzy association rules, our method enables AI systems to explain their decision-making in a way that aligns with human reasoning, offering explanations that are both accurate and intuitively understandable. This approach helps bridge the gap between complex AI models and human interpretability, enabling users to comprehend the underlying logic behind AI-driven decisions. As a result, it enhances trust, transparency, and accountability, facilitating broader adoption of AI systems in high-stakes domains where explainability is critical.
The key contributions of FuzRED center on its novel approach to bridging the gap between complex model behavior and human interpretability. Most notably, it leverages fuzzy logic [11]—a system inherently aligned with human reasoning—to express model decisions using intuitive, linguistic categories such as “low”, “medium”, and “high”. This alignment enables FuzRED to translate abstract statistical patterns into rule-based explanations that people can naturally understand and reason about. By combining this fuzzy logic framework with feature-importance analysis, FuzRED automatically generates concise, meaningful rules that reflect the internal structure of black-box models. It is broadly applicable across various domains, including both classification tasks and deep reinforcement learning (DRL). Additionally, FuzRED is computationally efficient (more details in Section 5), making it suitable for deployment in real-time or iterative workflows. Overall, FuzRED advances the interpretability of AI by enabling clear, human-aligned insights that foster trust, support validation, and guide debugging.
The remainder of the paper is organized as follows: Section 2 provides an overview of XAI and related research. Section 3 outlines our proposed methodology, and Section 4 describes our experimental results. In Section 5, we discuss the key findings and their implications. Finally, Section 6 summarizes our conclusions and outlines directions for future work.

3. Materials and Methods

The FuzRED method seeks to deliver a set of rules that provide an accurate, human-understandable picture of broad model behavior. Importantly, this method provides users with the flexibility to restrict or expand the desired level of granularity at which the rulesets are generated and optimized. This flexibility is described later in Section 3.1 and Section 3.2.
The main steps of the explainability method that we developed are shown in Figure 1. We start with an AI model (e.g., shallow neural network, deep neural network, support vector machine, deep reinforcement learning network) that was trained to provide a decision or a prediction. For ease of description, let us assume that the decision is a classification into one of several classes, although in reality the decision/prediction could be of a different type. The first step is to determine, from the large number of input features (sometimes thousands of them), the top n features that are the most important. In the second step, a set of fuzzy membership functions (FMFs) is built for each of these n features. In the third step, fuzzy rules are extracted for those features and the corresponding membership functions. These fuzzy rules are the explanations of why a given instance is classified into a given class (as such, they are local explanations). The full set of rules is a global explanation, describing the behavior of the AI model. Often the full set of rules is not necessary, and a small subset of rules suffices to explain the model behavior. The fourth step of FuzRED is to choose that small subset of rules.
Figure 1. FuzRED approach.

3.1. Determination of the Most Important Features

The first step in the FuzRED method is to determine the most important features that contribute to model predictions. The goal of this step is to identify the set of model input features that will be the most useful antecedents to consider when generating the ruleset (i.e., the x in an IF x THEN y rule). This step is important for two reasons: scale and comprehensibility. In terms of scale, the number of input features present in a given dataset (or used as input by a given AI model) can vary widely, with some datasets providing thousands of features for each sample. If the input feature space is sufficiently large, and all features are equally considered as possible antecedents, the computational load when generating the ruleset can rapidly become untenable. Furthermore, an overly large set of different antecedents within the resulting ruleset can easily overwhelm humans who attempt to use the ruleset to understand model behavior. Identifying the top n most-influential input features therefore manages scale while preserving comprehensibility. The user has the flexibility to choose however many of the most influential input features fit their needs.
In our method, we use SHAP values to identify a small set of features that are the most important to the model being explained (i.e., the features that have the largest impact on the predictions that the model makes). Lundberg and Lee [10] demonstrated that SHAP values were more consistent with human intuition than other explanation methods, and SHAP values have the advantage of providing a high degree of per-prediction precision when quantifying the input contributions of a single feature. Each SHAP value represents the contribution of a single feature value to a given model prediction, where positive SHAP values indicate features that increase the model prediction for a given example, and negative SHAP values indicate features that decrease the model prediction for a given example.
As previously mentioned, SHAP values are localized to a single example, making them poor candidates on their own for extrapolating global model behavior. However, a useful characteristic of SHAP is that the global importance of each feature can be derived by aggregating SHAP values across all the given examples in a dataset. The authors in [10] propose creating a global feature importance plot by calculating the mean absolute SHAP value for each feature across a dataset. In FuzRED, we leverage this metric (i.e., mean absolute value, which we term mean magnitude) and investigate additional candidate metrics for capturing global feature importance. These metrics provide different ways to aggregate feature contributions across all predictions, enabling a deeper analysis of feature importance. Specifically, we evaluated metrics such as frequency, mean contribution, mean magnitude, maximum contribution, maximum magnitude, and minimum contribution. Frequency measures how often a feature appears among the top n important features for a set of predictions, while mean contribution sums the signed (positive and negative) contributions of a feature over a set of predictions and divides by the number of predictions. Mean magnitude aggregates the absolute values of these contributions and divides by the number of predictions (it is the same as the mean absolute SHAP value used by others). The remaining metrics are derived by applying statistical operations (maximum, minimum) to the signed contributions and the maximum operation to the magnitude. Despite its name, minimum contribution does not find the least important features (we would not be interested in those); because some of the SHAP values are negative, minimum contribution finds the features for which the contributions are the most negative (these are large contributions, just in the negative direction).
These six metrics are then compared to identify the most informative features and are used to select the top n candidates as the antecedents available for use during rule generation.
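As a sketch of how these aggregations might be computed (the helper names below are ours, not part of the SHAP library), given a matrix of SHAP values with one row per prediction and one column per feature:

```python
import numpy as np

def global_importance(shap_values, metric="mean_magnitude", top_n=20):
    """Aggregate a (num_predictions x num_features) SHAP matrix into one
    global importance score per feature, following the six metrics
    described in the text (illustrative implementation)."""
    sv = np.asarray(shap_values, dtype=float)
    if metric == "frequency":
        # how often each feature ranks in the per-prediction top_n by |SHAP|
        ranks = np.argsort(-np.abs(sv), axis=1)[:, :top_n]
        return np.bincount(ranks.ravel(), minlength=sv.shape[1])
    if metric == "mean_contribution":
        return sv.mean(axis=0)          # signed mean over predictions
    if metric == "mean_magnitude":
        return np.abs(sv).mean(axis=0)  # mean absolute SHAP
    if metric == "max_contribution":
        return sv.max(axis=0)
    if metric == "max_magnitude":
        return np.abs(sv).max(axis=0)
    if metric == "min_contribution":
        return sv.min(axis=0)           # most-negative contributions
    raise ValueError(f"unknown metric: {metric}")

def top_features(scores, n=20, most_negative=False):
    """Indices of the top n features; for minimum contribution, rank
    ascending so the most-negative contributions come first."""
    order = np.argsort(scores if most_negative else -np.asarray(scores, dtype=float))
    return order[:n]
```

The per-metric scores can then be passed to `top_features` to obtain the n antecedent candidates for rule generation.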
The above step presents the first instance of the granular flexibility mentioned at the beginning of Section 3. As noted by Molnar [8], humans have limited capacity for holding large volumes of data in their short-term memory. In this step, n can be chosen to suit the needs of the chosen dataset, model, and analyst. In some cases, excessively large datasets might warrant a higher n value in order to provide a detailed, granular analysis of the impact the large feature space has on the resulting model predictions. In other cases, an easy-to-comprehend list of rules based on a small subset of features might be preferable for building trust among subject matter experts (SMEs).
We often choose a value of 20 for n to provide a balance between a sufficiently small feature space and an acceptable level of analysis of the resulting ruleset for benchmarking purposes. Ideally, n should be chosen after a careful analysis of the balance among the density of high-impact features across the dataset, the available computational resources, and the risk of cognitively overloading human analysts with the resulting ruleset.

3.2. Building FMFs

After identifying the top n features, in the second step of FuzRED, we create FMFs for each feature by fitting Gaussian mixture models (GMMs) to the values observed in the training set. GMM [25] is a soft clustering machine learning method used to determine the probability that each data point belongs to a given cluster (it produces fuzzy clusters). The data point can belong to several clusters, each with a different probability, and those probabilities sum to 1. A Gaussian mixture is composed of several Gaussians, each identified by k ∈ {1, …, K}, where K is the number of clusters of our data set. When we require m membership functions per feature, we set K to m (i.e., if we want 2 membership functions, K is set to 2). The user can require a different number of FMFs for different features. This is the second flexible element of FuzRED: the user can choose as many or as few FMFs per feature as desired.
Given that GMMs perform soft clustering, they can be used to automatically learn FMFs from the data. Those FMFs will have a Gaussian shape. This allows the user to skip the process of manually defining all the FMFs for all the features, but instead rely on an automatic method.
GMMs are generated separately for each of the top n features. We wrote software that can produce any number of membership functions per feature. We often use two or three membership functions per feature: this is often sufficient and leads to a smaller number of rules than using more membership functions. However, the method we developed is general and produces any desired number of FMFs per feature.
If the user desires three membership functions per feature, K is set to 3, and each mixture model consists of three components, corresponding to three fuzzy classes—e.g., low, medium, and high. These FMFs are then applied to both the training and test datasets to generate fuzzy membership data for 3 × n fuzzy classes (three per feature).
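A minimal sketch of this step, using scikit-learn's `GaussianMixture` (the function names are ours): one reasonable reading, consistent with the soft-clustering description above, is that the fitted mixture's per-component posterior probabilities serve as the membership degrees (they sum to 1 for each data point), with components sorted by mean to obtain the low/medium/high ordering.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_fmfs(feature_values, n_mfs=3, seed=0):
    """Fit a 1-D Gaussian mixture with n_mfs components to one feature's
    training values; return the model and the component order from
    lowest to highest mean (e.g., low, medium, high)."""
    x = np.asarray(feature_values, dtype=float).reshape(-1, 1)
    gmm = GaussianMixture(n_components=n_mfs, random_state=seed).fit(x)
    order = np.argsort(gmm.means_.ravel())
    return gmm, order

def membership(gmm, order, values):
    """Membership degrees of each value in each fuzzy class; columns are
    ordered low -> high, and each row sums to 1 (soft clustering)."""
    p = gmm.predict_proba(np.asarray(values, dtype=float).reshape(-1, 1))
    return p[:, order]
```

Calling `fit_fmfs` once per top-n feature, with a per-feature `n_mfs`, mirrors the flexibility described above.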
Then, we perform rule extraction using the FARM method described in Section 3.3 to automatically extract fuzzy association rules from the test data.

3.3. Extracting Fuzzy Association Rules

FuzRED’s third step is the automatic extraction of fuzzy association rules from the test data. Uncovering hidden and previously unknown patterns and insights from data is usually the primary objective of data mining. When these insights are expressed as relationships among different attributes, the process is referred to as association rule mining (ARM). Introduced by Agrawal et al. [22], ARM was originally developed to identify interesting co-occurrences within supermarket data, a classic problem known as market basket analysis. ARM identifies frequent itemsets (i.e., combinations of items that are purchased together in at least N transactions in the database) and then generates association rules based on these itemsets. For example, if {X, Y} is a frequent itemset, ARM can generate rules such as X → Y or Y → X. A simple association rule in the context of consumer behavior might be:
IF (Bread AND Butter) THEN Milk
This rule indicates that if a customer purchases bread and butter, they are also likely to purchase milk. Such rules can be highly valuable for store managers in determining how to organize products on shelves to maximize sales. While some rules, like the example above, may seem intuitive, ARM is particularly powerful because it can also reveal hidden patterns that may not be immediately obvious to SMEs. One famous example is the surprising rule:
IF (Diapers) THEN Beer
Initially, store managers were skeptical of this finding, suspecting a flaw in the ARM methodology. However, upon closer examination of the transaction data, they found that this pattern was indeed prominent, particularly in the evenings. The explanation was that dads were often asked to pick up diapers on their way home from work and, while doing so, rewarded themselves by purchasing beer. This insight, which would not have been obvious without ARM, demonstrates the method’s ability to reveal non-trivial, actionable knowledge.
Fuzzy association rules [23] extend the concept of association rules by introducing the ability to handle continuous or categorical variables with a degree of uncertainty. These rules take the following form:
IF (X IS A) THEN (Y IS B)
where X and Y are variables, and A and B are fuzzy sets that characterize X and Y, respectively. A simple example of a fuzzy association rule for a medical application is the following:
  • IF (Temperature IS Strong Fever) AND (Skin IS Yellowish) AND (Loss of appetite IS Profound) THEN (Hepatitis is Acute)
The rule states that a patient exhibiting a strong fever, yellowish skin, and a profound loss of appetite is likely suffering from acute hepatitis. “Strong Fever”, “Yellowish”, “Profound”, and “Acute” are membership functions of the variables Temperature, Skin, Loss of appetite, and Hepatitis, respectively. To illustrate this, consider the FMFs for the variable Temperature given in Figure 2. According to the definition in that figure, a person with a body temperature of 100 °F has a “Normal” temperature with a membership value of 0.2 and, simultaneously, has a “Fever” with a membership value of 0.78. These FMFs provide a nuanced way to interpret data, reflecting the complexity of real-world situations. More information on fuzzy logic and fuzzy membership functions can be found in [11].
    Figure 2. FMFs for the fuzzy feature Temperature: Low, Normal, Fever, Strong Fever, and Hyperthermia.
More formally, let D = {t1, t2, …, tn} represent the transaction database, where each ti is a transaction in D. Let I = {i1, i2, …, im} be the universe of items. A set X⊆I of items is called an itemset. When X has k elements, it is called a k-itemset. An association rule is an implication of the form X → Y, where X⊂I, Y⊂I, and X∩Y = ∅.
ARM and FARM rules have certain metrics associated with them. The three metrics that we use for evaluation and comparison of our method are support, confidence, and coverage. The support of an itemset, X, is defined as follows:
$$\mathrm{Support}(X) = \frac{\text{number of records with } X}{\text{number of records in } D} = \frac{n_x}{n} = p_x$$
where $n$ is the number of records in $D$, $n_x$ is the number of records with $X$, and $p_x$ is the associated probability. The support of a rule ($X \to Y$) is defined as follows:
$$\mathrm{Support}(X \to Y) = \frac{\text{number of records with } X \text{ and } Y}{\text{number of records in } D} = \frac{n_{xy}}{n} = p_{xy}$$
where $n_{xy}$ is the number of records with $X$ and $Y$, and $p_{xy}$ is the associated probability.
We can also define the concept of fuzzy support. In order to do this, we first need to define how we determine whether (and to what degree) one fuzzy set is a subset of another. To this end, we define the matching degree (MD) of one set, $S_1$, to another, $S_2$, as follows:
$$\mathrm{MD}_{S_1}(S_2) = \min_{f_i \in F_1} f_i(S_2)$$
where $F_1$ is the set of membership functions inherently selected by $S_1$. We note that the min function in this definition is a T-norm. There are many other options for a T-norm that can be chosen, and a discussion of these goes beyond the scope of this paper, but for our purposes we always use the minimum function. We can now define the fuzzy support of a rule ($X \to Y$) as follows:
$$\mathrm{Fuzzy\ Support}(X \to Y) = \frac{\text{sum of the MD of } X \text{ and } Y \text{ to records in } D}{\text{number of records in } D} = \frac{\sum_{r \in D} \mathrm{MD}_{X \cup Y}(r)}{n}$$
The confidence of a rule ($X \to Y$) is defined as follows:
$$\mathrm{Confidence}(X \to Y) = \frac{\text{number of records with } X \text{ and } Y}{\text{number of records with } X} = \frac{n_{xy}}{n_x} = \frac{n_{xy}}{n} \times \frac{n}{n_x} = \frac{p_{xy}}{p_x}$$
Similarly, the fuzzy confidence of a rule ($X \to Y$) can be defined as follows:
$$\mathrm{Fuzzy\ Confidence}(X \to Y) = \frac{\text{sum of the MD of } X \text{ and } Y \text{ to records in } D}{\text{sum of the MD of } X \text{ to records in } D} = \frac{\sum_{r \in D} \mathrm{MD}_{X \cup Y}(r)}{\sum_{r \in D} \mathrm{MD}_{X}(r)}$$
Confidence can be treated as the conditional probability (P(Y|X)) that a transaction containing X also contains Y. A high confidence value suggests a strong association rule. However, care must be taken in making this inference. For example, if the antecedent (X) and consequent (Y) both have high support, the rule could have a relatively high confidence even if they were independent.
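To make the crisp definitions concrete, here is a minimal sketch in Python (function names and the toy basket data are ours, purely illustrative), with transactions represented as sets of items:

```python
def support(itemset, transactions):
    """Support(X): fraction of transactions containing every item in X."""
    itemset = set(itemset)
    n_x = sum(1 for t in transactions if itemset <= set(t))
    return n_x / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Confidence(X -> Y) = Support(X and Y) / Support(X)."""
    return (support(set(antecedent) | set(consequent), transactions)
            / support(antecedent, transactions))

# Hypothetical market-basket data echoing the bread/butter/milk rule:
baskets = [{"bread", "butter", "milk"},
           {"bread", "butter"},
           {"milk"},
           {"bread", "butter", "milk"}]
# support({bread, butter}) = 3/4
# confidence(bread AND butter -> milk) = (2/4) / (3/4) = 2/3
```

The note above about high-support antecedents and consequents can be checked directly with these functions: independent itemsets with high marginal support still yield high confidence.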
Support and confidence enable us to compare individual rules, but we also need to be able to compare sets of rules. An important metric for rule sets is coverage. This metric tries to capture how well a rule set “covers” a given data set, or what percentage of the records in the data set are in the support of one of the rules in the rule set. This concept is straightforward for crisp rules, but fuzzy rules introduce some complexities. We can define the coverage of a rule set, R, over a data set, D, as follows:
$$\mathrm{Coverage}(R, D) = \frac{\text{number of records with } X \text{ and } Y \text{ for some } X \to Y \text{ in } R}{\text{number of records in } D} = \frac{\left|\{\, d \in D \mid d \text{ contains } X \cup Y \text{ for some } X \to Y \in R \,\}\right|}{n}$$
It is often the case that the MD of a fuzzy rule to a data record is non-zero but extremely small. Thus, including all records with non-zero MD for some rule in a fuzzy rule set does not provide an accurate view of the coverage of that rule set. Therefore, we need to set a threshold which we call minCov for the minimum MD needed for a data record to be counted as “covered”. We also need to define the maximum matching degree (MMD) of a rule set. This is a generalization of the MD defined above for individual rules. The MMD of a rule set, R, on a data record, r, can be defined as follows:
$$\mathrm{MMD}_{R}(r) = \max_{X \to Y \in R} \mathrm{MD}_{X \cup Y}(r)$$
We can now define the fuzzy coverage of a rule set, R, over a data set, D, as follows:
$$\mathrm{Fuzzy\ Coverage}(R, D) = \frac{\text{number of records with } \mathrm{MD} \ge \mathrm{minCov} \text{ for some rule in } R}{\text{number of records in } D} = \frac{\left|\{\, d \in D \mid \mathrm{MMD}_{R}(d) \ge \mathrm{minCov} \,\}\right|}{n}$$
When executing FuzRED, we set the minimum support threshold for the rules to 0.001, although a higher or lower support can be used. When selecting the number of antecedents that are available for a single rule, we often set the maximum number to three. Humans have difficulty quickly understanding rules with a large number of antecedents; therefore, for most applications, we discourage using a larger number.
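The fuzzy metrics above can be sketched as follows (an illustrative implementation, with our own names: each fuzzy set is a (feature name, membership function) pair, records are feature dictionaries, and the min T-norm is used as stated in the text):

```python
def matching_degree(fuzzy_sets, record):
    """MD of a fuzzy itemset to a record: the minimum (T-norm) of the
    membership degrees of each fuzzy set on the record's feature value."""
    return min(mf(record[feat]) for feat, mf in fuzzy_sets)

def fuzzy_support(rule, records):
    """rule = (antecedent_sets, consequent_sets); X ∪ Y is their union."""
    both = rule[0] + rule[1]
    return sum(matching_degree(both, r) for r in records) / len(records)

def fuzzy_confidence(rule, records):
    both = rule[0] + rule[1]
    num = sum(matching_degree(both, r) for r in records)
    den = sum(matching_degree(rule[0], r) for r in records)
    return num / den

def fuzzy_coverage(rules, records, min_cov=0.1):
    """Fraction of records whose maximum matching degree (MMD) over the
    rule set reaches the minCov threshold."""
    covered = sum(
        1 for r in records
        if max(matching_degree(rule[0] + rule[1], r) for rule in rules) >= min_cov
    )
    return covered / len(records)
```

Any other T-norm could be substituted for `min` in `matching_degree` without changing the rest of the sketch.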

3.4. Choosing a Subset of Rules

Although FARM typically generates thousands of rules, only a subset is relevant for interpretability. Such a large number of rules is difficult for humans to digest; a much smaller subset summarizing what the network does is preferable. We use the following method for pruning the rule set to find a near-minimal set of rules. For each of the six methods for determining the most important features, we order the rules first by decreasing confidence, and then by decreasing support. We then take the first rule from the ordered set and determine to which examples from the test set the rule applies. We remove those examples from the test set. Then, we take the next rule from the rule set and repeat the process. We continue until there are no more examples in the test set or no more rules.
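The greedy selection above can be sketched as follows (an illustrative reading: the rule/example representations and the `applies` predicate are ours, and we assume rules that match no remaining examples are simply skipped, which the text does not specify):

```python
def prune_rules(rules, examples, applies):
    """Greedy rule-subset selection: order rules by decreasing confidence,
    then decreasing support; each selected rule removes the test examples
    it applies to, until examples or rules run out.

    rules    -- dicts with at least 'confidence' and 'support' keys
    applies  -- predicate applies(rule, example) -> bool
    """
    ordered = sorted(rules, key=lambda r: (-r["confidence"], -r["support"]))
    remaining = list(examples)
    selected = []
    for rule in ordered:
        matched = [e for e in remaining if applies(rule, e)]
        if matched:
            selected.append(rule)
            remaining = [e for e in remaining if e not in matched]
        if not remaining:
            break
    return selected
```

Running this once per feature-importance metric yields the six candidate rule subsets compared in Section 4.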

4. Results

We applied the FuzRED method to three AI models, each trained for a different learning task with a different dataset. Two models were feed-forward deep neural networks performing binary classification tasks (spam email identification and phishing link detection, respectively). The third model was a DRL model trained to perform a benchmark robotics task. In the sections below, we present results for our FuzRED method, and results generated using the existing Anchors method.

4.1. Spam Email Identification

The SPAM Email Database [26], available from the University of California Irvine (UCI) Machine Learning Repository, was utilized for this study. This dataset contains spam emails identified as such by the recipients or the email postmaster. The non-spam emails consist of both work and personal emails identified as non-spam by the recipients. The dataset was submitted to the UCI Machine Learning Repository in July 1999. The dataset comprises 4601 total instances, 1813 of which are labeled as spam. There are 57 continuous data attributes, and the nominal class label is provided for each instance. Attributes include frequencies of select words and characters, as well as the statistics of uninterrupted sequences of capital letters in the emails. There are no missing attributes in this dataset. We do not have access to the text of the emails but only to frequencies of all the attributes in the emails. For more details on the attributes and their statistics, the reader is referred to [26].
To evaluate our explanation method, we trained a deep neural network model to make predictions on this dataset. The model consists of three fully connected hidden layers, each with 12 nodes using the rectified linear unit (ReLU) activation function, followed by a final fully connected layer with a single node utilizing the sigmoid activation function. We randomly selected 70% of the SPAM email dataset to serve as the training set. All input data were normalized using a StandardScaler object [27] before being fed into the model, and the resulting model achieved 89% accuracy on the test set.
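As an illustrative sketch of this setup (not the authors' code), the same architecture and preprocessing can be approximated with scikit-learn; here `MLPClassifier`'s logistic output for binary labels stands in for the single sigmoid output node, and all function names are ours:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def build_spam_model(X, y, seed=0):
    """Sketch of the experimental setup: a random 70/30 train/test split,
    StandardScaler normalization, and a network with three hidden layers
    of 12 ReLU units feeding a logistic (sigmoid-like) binary output."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, train_size=0.7, random_state=seed)
    model = make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(12, 12, 12),
                      activation="relu", max_iter=1000, random_state=seed))
    model.fit(X_tr, y_tr)
    return model, model.score(X_te, y_te)
```

A SHAP explainer can then be applied to the fitted pipeline's predictions to produce the per-example values used in the next section.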

4.1.1. FuzRED Results

We employed six methods for determining the most important features: mean magnitude, frequency, mean contribution, maximum contribution, maximum magnitude, and minimum contribution. These methods are described in Section 3.1.
SHAP can be used for both local and global explanations. Usually, for global explanations, the absolute Shapley values of all instances in the data are averaged. Passing a matrix of SHAP values to the bar plot function [28] creates a global feature importance plot, where the global importance of each feature is taken to be the mean absolute SHAP value for that feature over all the given samples. Figure 3 shows the plot created using the bar plot function with the top 20 features. Figure 4, Figure 5, Figure 6, Figure 7 and Figure 8 show the bar plots we created for the remaining five methods.
Figure 3. Spam dataset: bar plot for the top 20 features determined using the mean magnitude method.
Figure 4. Spam dataset: bar plot for the top 20 features determined using the frequency method.
Figure 5. Spam dataset: bar plot for the top 20 features determined using the mean contribution method.
Figure 6. Spam dataset: bar plot for the top 20 features determined using the maximum contribution method.
Figure 7. Spam dataset: bar plot for the top 20 features determined using the maximum magnitude method.
Figure 8. Spam dataset: bar plot for the top 20 features determined using the minimum contribution method.
All the features determined to be the most important by frequency and maximum magnitude methods are shared by at least one other method. Mean contribution and mean magnitude methods each have one feature that was not determined to be the most important by any other methods (word_freq_conference and word_freq_85); maximum contribution has three features in the top 20 not identified by any of the other five methods: word_freq_report, word_freq_money, and word_freq_parts. Minimum contribution has the largest number of features not identified as the most important by any other method: word_freq_telnet, word_freq_857, word_freq_table, word_freq_technology, char_freq, and capital_run_length_longest.
Figure 9 and Figure 10 show the FMFs for word_freq_george and word_freq_hp. Figure A1, Figure A2 and Figure A3 in Appendix A.1 show the FMFs for word_freq_meeting, char_freq_$, and char_freq_!. These are some of the features that were determined to be the most important by at least one of the methods and that are used in the rule explanations in the next section.
Figure 9. Spam dataset: FMFs for word_freq_george, i.e., low, medium, and high, are shown in red, green, and blue, respectively.
Figure 10. Spam dataset: FMFs for word_freq_hp, i.e., low, medium, and high are shown in red, green, and blue, respectively.
Word_freq_george describes the frequency of the word george (how many times it appeared) in an email. Figure 9 shows the FMFs for word_freq_george: low, medium, and high. In the spam dataset, the largest frequency of george is 33. FMF low has a value of 1 for word_freq_george values from 0 to 2.223. For values larger than 2.223 but smaller than 12.557, the value of FMF medium is 1.0. Values larger than or equal to 12.557 but smaller than 13.433 belong to two FMFs: medium and high. For example, for 13.0, the value of FMF medium is 0.474 and that of FMF high is 0.526. This means that, when word_freq_george is 13.0, it is both medium and high (to the degree defined by the value of the appropriate FMF). For values larger than 13.433, the value of FMF high is 1. FMFs are continuous; however, when the feature itself is an integer (frequency is always an integer), the above description can be rewritten as follows. FMF low has a value of 1 for word_freq_george values from 0 to 2. For values between 3 and 12, the value of FMF medium is 1.0. The value 13 belongs to two FMFs: medium and high. Starting at 14, the value of FMF high is 1.
Word_freq_hp describes the frequency of the word hp in an email. Figure 10 shows the three FMFs for word_freq_hp. In the spam dataset, the largest frequency of hp is 21. This is also an integer variable, and its FMFs can be described as follows. FMF low has a value of 1 for the word_freq_hp value of 0. For values between 1 and 7, the value of FMF medium is 1.0. A value of 8 belongs to two FMFs: medium and high. Starting at 9, the value of FMF high is 1.
Word_freq_000 describes the frequency of 000 in an email, word_freq_meeting describes the frequency of the word meeting, and char_freq_$ describes the frequency of the character $.
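The construction of such FMFs can be sketched in code. Following the Gaussian-mixture approach described for the phishing dataset in Section 4.2.1, the sketch below treats the posterior responsibilities of a fitted one-dimensional mixture as membership degrees; this is one plausible implementation under that assumption, and the exact FMF construction used in the paper (e.g., the shape of the plateaus) may differ. The data and the names `fit_fmfs` and `membership` are illustrative.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_fmfs(values, n_classes=3, seed=0):
    """Fit a 1-D Gaussian mixture and return a function mapping a feature
    value to its membership degrees, taken here as the posterior
    responsibilities of the components sorted by mean (low, ..., high)."""
    gmm = GaussianMixture(n_components=n_classes, random_state=seed)
    gmm.fit(np.asarray(values, dtype=float).reshape(-1, 1))
    order = np.argsort(gmm.means_.ravel())
    def membership(x):
        post = gmm.predict_proba(np.array([[float(x)]]))[0]
        return post[order]  # degrees for low, medium, high
    return membership

# Hypothetical word-frequency-like data: most values near zero,
# with smaller clusters at higher frequencies
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(1.0, 0.5, 500),
                       rng.normal(8.0, 2.0, 100),
                       rng.normal(15.0, 1.0, 50)])
membership = fit_fmfs(data)
low, medium, high = membership(13.0)  # a value between two clusters
```

A value lying between two mixture components receives partial membership in both classes, mirroring the behavior described above for word_freq_george = 13.0.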
When running fuzzy association rule mining, we limited the number of antecedents to three, we set the minimum confidence (Equation (8)) of a rule to 0.6, and the minimum support (Equation (6)) to 0.001. Most of the rules found by FuzRED are easy to understand. As an example, below we show some of the rules extracted using the maximum magnitude choice of features.
  • IF (word_freq_george IS High) THEN No-Spam, sup = 0.0263, conf = 1
This rule says that if the frequency of the word george is high (see Figure 9), then the email is no-spam with a confidence value of 1 (the rule is always true). The rule has a support of 2.63%, which means that it explains 2.63% of the dataset. At first, this rule seems a little surprising; however, the description of the dataset says that the no-spam emails came from the personal and business emails of the dataset creators, one of whom was George Forman. In view of this information, the rule makes perfect sense.
  • IF (word_freq_hp IS Medium) THEN No-Spam, sup = 0.0753, conf = 1
  • The above rule says that if the frequency of the word hp (Hewlett Packard) is medium (see Figure 10), then the email is no-spam. The support of the rule is 7.53%, and the rule is always true. The creators of the dataset were all HP employees, and many of the no-spam emails provided were their business emails, so this rule also makes perfect sense. A similar rule is shown below:
  • IF (word_freq_hp IS High) THEN No-Spam, sup = 0.0145, conf = 1
  • Some no-spam rules are related to the frequency of the words meeting, 1999, or lab:
  • IF (word_freq_meeting IS Medium) THEN No-Spam, sup = 0.0217, conf = 1
  • IF (word_freq_meeting IS High) THEN No-Spam, sup = 0.0046, conf = 1
  • IF (word_freq_george IS Medium AND word_freq_1999 IS Medium) THEN No-Spam, sup = 0.0014, conf = 1
This dataset was donated on 6/30/1999; it is very possible that the no-spam emails were chosen mostly from the beginning of 1999.
The spam rules often use the character frequency of $ or !:
  • IF (char_freq_$ IS High) THEN Spam, sup = 0.0033, conf = 1
  • IF (char_freq_$ IS Medium AND char_freq_! IS Medium) THEN Spam, sup = 0.0028, conf = 1
Since spam emails are often about becoming rich (receiving $), and some use a lot of exclamation points, the above rules are sensible.
  • IF (capital_run_length_average IS Medium) THEN Spam, sup = 0.0026, conf = 1
  • The above rule says that if the average length of uninterrupted sequences of capital letters is Medium, then the email is always spam.
  • IF (char_freq_$ IS Medium AND capital_run_length_total IS High) THEN Spam, sup = 0.0020, conf = 1
  • The above rule says that if the frequency of $ is medium and the total number of capital letters in the email is high, then the email is always spam. Historically, spam emails use a lot of capital letters, so this rule is logical.
Additional FuzRED rules for spam email identification are shown in Appendix A.1.
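The support and confidence values attached to these rules can be computed from membership degrees. The sketch below uses a common fuzzy association rule formulation, with the minimum t-norm for conjunction; the paper's Equations (6) and (8) may differ in detail, and the membership values shown are hypothetical.

```python
import numpy as np

def fuzzy_support(membership_columns):
    """Fuzzy support of an itemset: the mean, over all instances, of the
    minimum membership degree across the itemset's fuzzy terms."""
    return np.minimum.reduce(membership_columns).mean()

def fuzzy_confidence(antecedents, consequent):
    """Fuzzy confidence: support(antecedents AND consequent) divided by
    support(antecedents), with AND taken as the minimum t-norm."""
    ant = np.minimum.reduce(antecedents)
    return np.minimum(ant, consequent).mean() / ant.mean()

# Hypothetical membership degrees over five emails
george_high = np.array([0.0, 0.0, 1.0, 0.9, 0.0])  # word_freq_george IS High
no_spam = np.array([0.0, 1.0, 1.0, 0.9, 0.0])      # membership in No-Spam
sup = fuzzy_support([george_high, no_spam])
conf = fuzzy_confidence([george_high], no_spam)
```

With these hypothetical degrees, IF (word_freq_george IS High) THEN No-Spam holds with conf = 1, as in the first rule above.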
Depending on which of the six methods is being used for determining the most important features and the minimum confidence of the rules (0.6 to 1), FuzRED extracts a different number of rules. The results are shown in Table 1. The table is arranged in descending order by total coverage (Equation (11)) and then in ascending order by the number of rules after pruning. The total coverage of this set of rules ranges from 0.9717 to only 0.1554.
Table 1. Spam: number of rules extracted by FuzRED for different methods of determining the most important features.
When explaining the decisions of a neural network, one may wish to minimize the number of rules while still achieving the required level of dataset coverage. For example, if a coverage of at least 0.9 is desired, Table 1 shows that nine methods meet this criterion. Among them, the best option in terms of rule count is method 8 (maximum magnitude, minimum rule confidence of 0.6), which achieves a coverage of 0.9243 with just 96 rules after pruning. If a smaller ruleset is desired and a lower coverage threshold, such as 0.75, is acceptable, then method 11 (minimum contribution, minimum rule confidence of 0.6) provides a ruleset with just 67 rules. These 67 rules are listed in Appendix A.1, in Table A1. For cases where covering just over half of the dataset is sufficient, method 18 (minimum contribution, minimum rule confidence of 0.7) reduces the ruleset further to 62 rules. Thus, the optimal method depends on the specific balance between coverage requirements and the desire to minimize ruleset complexity for a given application.

4.1.2. Anchors Results

In order to evaluate and compare Anchors rules with those from FuzRED, we wrote a script that uses the Anchors library to explain the exact same trained model and set of data. The script used anchor_tabular.AnchorTabularExplainer to generate the anchors and wrapped the model for compatibility with the Anchors framework. We wrote a fit_anchor function to compute the actual precision (Equation (1)) and coverage (Equation (2)) metrics for the anchor rules. This function parses the body of the rule, converts it to the equivalent logical conditions, and then applies these conditions to the full set of test data to find all instances to which the anchor applies. For this to work properly, we needed to round all of the feature data to two decimal places, as the values in the Anchors rules were rounded. Lastly, the function returns the computed coverage and precision metrics for the anchors.
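A minimal sketch of such a fit_anchor function is shown below. The rule-string parsing step is replaced by a pre-parsed list of (feature index, operator, threshold) conditions, and all names and data here are illustrative rather than the paper's actual implementation.

```python
import numpy as np

def fit_anchor(conditions, X, y_pred, target_class):
    """Apply an anchor's conditions to the full test set and return its
    actual coverage and precision.

    conditions: list of (feature_index, operator, threshold) tuples,
    e.g. [(5, "<=", 0.05), (12, ">", 0.0)], standing in for the parsed
    body of an Anchors rule.
    """
    ops = {"<=": np.less_equal, "<": np.less,
           ">=": np.greater_equal, ">": np.greater}
    Xr = np.round(X, 2)  # anchor thresholds are rounded to two decimals
    mask = np.ones(len(Xr), dtype=bool)
    for idx, op, thr in conditions:
        mask &= ops[op](Xr[:, idx], thr)
    coverage = mask.mean()
    precision = (y_pred[mask] == target_class).mean() if mask.any() else 0.0
    return coverage, precision

# Toy check: a single condition on feature 0
X = np.array([[0.04], [0.50], [0.01], [0.05]])
y_pred = np.array([0, 1, 0, 1])
cov, prec = fit_anchor([(0, "<=", 0.05)], X, y_pred, target_class=0)
```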
When Anchors was executed to obtain rules with a maximum of three antecedents, it produced 1520 rules. Some of these rules were duplicates; therefore, we developed software to remove the repeated rules and ended up with 648 distinct rules. Most rules have three antecedents, some have two, and a handful have just one. The rules have a precision ranging from 1 to 0.5312 and a coverage ranging from 0.5872 to 0.0007.
Below are some example rules. The rule with the highest coverage is the following:
  • IF (char_freq_$ <= 0.05 AND word_freq_free <= 0.00 AND word_freq_money <= 0.00) THEN No-Spam, coverage = 0.5872, precision = 0.8543
  • This rule says that if the frequency of $ is less than or equal to 0.05, and the frequencies of free and money are less than or equal to 0, then the email is No-Spam in 85.43% of the cases. Of course, a frequency cannot be negative, and it is an integer; therefore, being less than or equal to 0.05 is the same as being zero. The rule therefore actually says that if the frequencies of $, free, and money are zero, then the email is No-Spam in 85.43% of the cases. The 0.05 threshold in the rule is arbitrary. The rule has a large coverage (58.72%) and is true in 85.43% of cases.
  • IF (word_freq_hp > 0.00 AND char_freq_! <= 0.00 AND word_freq_remove <= 0.00) THEN No-Spam, coverage = 0.1672, precision = 1
  • This rule says that if the frequency of hp is greater than 0 and the frequencies of ! and remove are less than or equal to 0, then the email is always No-Spam. Of course, frequency cannot be a negative number, so the rule actually says that if the frequency of hp is greater than 0 and the frequencies of ! and remove are 0, then the email is always No-Spam. This rule covers 16.72% of the data.
The rules above are relatively easy to understand, as they use zero as the threshold. However, most rules use very different numbers as the threshold. These numbers might seem arbitrary to the person who wants to understand the decisions the network is making.
  • IF (word_freq_remove > 0.00 AND char_freq_$ > 0.05 AND capital_run_length_longest > 43.00) THEN Spam, coverage = 0.0671, precision = 1
  • IF (word_freq_remove > 0.00 AND char_freq_$ > 0.05 AND capital_run_length_average > 3.71) THEN Spam, coverage = 0.0579, precision = 1
  • IF (word_freq_3d > 0.00 AND capital_run_length_average > 3.71 AND word_freq_you > 2.63) THEN Spam, coverage = 0.0007, precision = 1
All the features in the three rules above are integer-valued; therefore, the thresholds used (0.05, 3.71, 2.63) are arbitrary and confusing.
Additional Anchors rules for the Spam email identification are shown in Appendix A.1.

4.2. Phishing Link Detection

This study leveraged the phishing link detection dataset curated by Hannousse and Yahiouche [29]. This dataset has 11,430 samples, each containing 87 features extracted from each URL. Fifty-six features were extracted by analyzing the text of the URL (URL-based), 24 features were extracted by analyzing the content of the website hosted at the URL (content-based), and seven features were extracted by gathering external information about the URL (external-based). Examples of URL-based features include the presence of common terms in the URL (such as ‘www’ or ‘.com’) or the ratio of digits to other characters in the URL. Content-based feature examples include the presence of invisible iframe objects on the site or an empty HTML <title> tag. External-based feature examples include whether or not the URL is from a WHOIS-registered domain, or the number of visitors to the site. The 87 features are a combination of integer, floating-point, and binary values. The authors in [29] provide a full description of the features alongside details regarding dataset collection methods.
The dataset was first split into 7658 training samples, while 3772 samples were reserved for testing. A deep feed-forward neural network was constructed with three hidden layers. The model was trained for 200 epochs with a batch size of 256. All input data were normalized using a StandardScaler object [27] before being fed into the model, and the resulting model achieved 85% accuracy on test data.
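The training setup can be sketched as follows, using scikit-learn's MLPClassifier as a lightweight stand-in for the deep feed-forward network (the paper does not specify the framework or layer widths here, so both are assumptions), with synthetic data in place of the 87 phishing features.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier

# Hypothetical stand-in for the 87-feature phishing data
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 87))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Normalize all input data before feeding it to the model
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Three hidden layers; the layer widths are illustrative assumptions
model = MLPClassifier(hidden_layer_sizes=(128, 64, 32), batch_size=256,
                      max_iter=200, random_state=0)
model.fit(X_scaled, y)
accuracy = model.score(X_scaled, y)
```

With MLPClassifier's stochastic solvers, max_iter corresponds to the number of training epochs, matching the 200 epochs reported above.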

4.2.1. FuzRED Results

We employed six methods for determining the most important features: frequency, mean contribution, mean magnitude, maximum contribution, maximum magnitude, and minimum contribution. Figure 11, Figure 12, Figure 13, Figure 14, Figure 15 and Figure 16 show the 20 most important features for mean magnitude, frequency, mean contribution, maximum contribution, maximum magnitude, and minimum contribution, respectively. All the features determined as being the most important by frequency and maximum magnitude methods are shared by at least one other method. Maximum contribution has one feature that was not determined as the most important by any other method (iframe); mean magnitude has two such features (nb_slash, ratio_extMedia), and mean contribution has three features not determined as being the most important by any other method (dns_record, ratio_intHyperlinks, web_traffic). Minimum contribution has the largest number of features (11) not identified as being the most important by any other method: length_words_raw, nb_percent, avg_word_host, longest_word_host, length_url, whois_registered_domain, abnormal_subdomain, domain_in_brand, nb_dslash, onmouseover, and popup_window.
Figure 11. Phishing dataset: bar plot for the top 20 features determined using the mean magnitude method.
Figure 12. Phishing dataset: bar plot for the top 20 features determined using the frequency method.
Figure 13. Phishing dataset: bar plot for the top 20 features determined using the mean contribution method.
Figure 14. Phishing dataset: bar plot for the top 20 features determined using the maximum contribution method.
Figure 15. Phishing dataset: bar plot for the top 20 features determined using the maximum magnitude method.
Figure 16. Phishing dataset: bar plot for the top 20 features determined using the minimum contribution method.
For each feature, we used Gaussian mixture models to construct two membership functions (low and high). Figure 17 and Figure 18 show the membership functions for the features length_words_raw and ratio_intMedia. The FMFs for the features nb_dots and nb_www are shown in Appendix A.2. These features are used in the example rules described below. The feature descriptions in this section are taken from [30].
Figure 17. Phishing dataset: FMFs for length_words_raw. Low and high are shown in red and blue, respectively.
Figure 18. Phishing dataset: FMFs for ratio_intMedia. Low and high are shown in red and blue, respectively.
Length_words_raw describes the number of words in a URL. In the phishing dataset, the largest length_words_raw is 106 (Figure 17 shows the FMFs). FMF low has a value of 1 for length_words_raw values from 1 to 28.240; at 32.000, FMF low has a value of 0.299 and FMF high has a value of 0.701. This means that when length_words_raw is 32.000, it is both low and high (to the degree defined by the value of the appropriate FMF). Starting at 35.023, the value of FMF high is 1.
Legitimate websites mostly use media (images, audio, and video) stored in the same domain. Phishing websites use more external media, usually stored in the target website's domain, to save storage space. Ratio_intMedia estimates the ratio of internal media file links and uses it to distinguish legitimate from phishing websites. In the phishing dataset, the largest ratio_intMedia is 100 (Figure 18 shows the FMFs). FMF low has a value of 1 for ratio_intMedia values from 0 to 38.804; at 49.155, FMF low has a value of 0.501 and FMF high has a value of 0.499. This means that when ratio_intMedia is 49.155, it is both low and high (to the degree defined by the value of the appropriate FMF). Starting at 59.516, the value of FMF high is 1.
When running fuzzy association rule mining, we limited the number of antecedents to three, we set the minimum confidence (Equation (8)) to 0.6, and we set the minimum support (Equation (6)) to 0.001. Most of the rules found by FuzRED are easy to understand. As an example, below we show some of the rules extracted using the maximum magnitude choice of top features.
  • IF (nb_dots IS High) THEN Phishing, sup = 0.0039, conf = 1
This rule says that if nb_dots (the number of dots in the URL) is high, then the URL is always predicted to be a phishing domain by the model. An example of such a URL is below (13 dots in the domain name):
  • http://https.email.office.nhc8fso9liwz2e1rk12vhqaxgeq4g.hgbtmd8eshlu1rkesgz11.tk0mqbhgkkuvsf3u821.1sytn1idkv8s2qm4ehh7jja7d.mne8jnxrh8klahtqu.0fll4ryeb76852jeplwk9ckd6zqof.2wvxm5n6uamkxq7wxhpbxaq1a4.cxdtens.duckdns.org/365NewOfficeG15/jsmith@imaphost.com/paul/
  • IF (nb_www IS High AND length_words_raw IS High) THEN Phishing, sup = 0.0023522, conf = 1.0
  • Length_words_raw is an integer that refers to how many words there are in a URL. The rule says that if the nb_www and length_words_raw are high, then the URL is a phishing one. In the phishing URL below, nb_www is 2, and length_words_raw is 16:
  • http://timetravel.mementoweb.org/reconstruct/20141123205914mp_/https:/www.paypal.com/signin?returnuri=https:/www.paypal.com/cgi-bin/webscr?cmd=_account
  • IF (page_rank IS High AND nb_hyperlinks IS High) THEN No-Phishing, sup = 0.0141772, conf = 0.9991
Page_rank is an algorithm used by Google Search to rank web pages in its search engine results. Phishing web pages are not very popular; hence, they are expected to have low page ranks compared to legitimate web pages. Legitimate (No-Phishing) websites also tend to consist of a larger number of pages than phishing ones; therefore, the number of hyperlinks in a URL's contents (nb_hyperlinks) is useful for distinguishing phishing websites. The rule above says that if both page_rank and nb_hyperlinks are high, then the website is almost always (conf = 0.9991) No-Phishing.
Depending on which of the six methods we are using for determining the most important features and the minimum confidence of the rules (0.6 to 1), FuzRED extracts a different number of rules. The results are shown in Table 2. The table is arranged in descending order by total coverage (Equation (11)) and then in ascending order by the number of rules after pruning. The total coverage of this set of rules ranges from 1.0 to only 0.0424.
Table 2. Phishing: number of rules extracted by FuzRED for different methods for determining the most important features.
When explaining the decisions of a neural network, one may wish to minimize the number of rules while still achieving a required level of dataset coverage. For example, if a coverage of at least 0.9 is desired, Table 2 shows that 23 methods meet this criterion. Among them, the best option in terms of rule count is method 16 (maximum contribution, minimum rule confidence of 0.8), which achieves a coverage of 0.9722 with just 52 rules after pruning. If a smaller ruleset is desired and a lower coverage threshold, such as 0.75, is acceptable, then method 27 (maximum magnitude, minimum rule confidence of 0.8) provides a ruleset with just 48 rules. These 48 rules are listed in Table A2 in Appendix A.2. For cases where covering just over half of the dataset is sufficient, method 32 (maximum contribution, minimum rule confidence of 0.95) reduces the ruleset further to 34 rules.
Table 3. Hyperparameters used to train the PPO model.

4.2.2. Anchors Results

When running Anchors, we requested rules with up to three antecedents. Anchors produced 3808 rules; once the repeating ones had been removed, we obtained 548 rules. Most of the rules have three antecedents, some have two, and a few have one. Some rules are easy to understand:
  • IF (nb_star > 0.00) THEN Phishing, coverage = 0.0011, precision = 1
  • Nb_star refers to the number of stars (*) in a URL. The above rule is always true and describes 0.11% of the dataset.
Most rules are much more difficult to understand.
  • IF (page_rank <= 1.00 AND google_index > 0.00 AND longest_words_raw > 16.00) THEN Phishing, coverage = 0.1021, precision = 1
  • Google_index is a binary variable describing whether a given domain is indexed by Google (1 means indexed, and 0 means not indexed); web pages not indexed by Google have a higher probability of being phishing. The above rule says that if page_rank <= 1.00, google_index > 0.00 (in reality, google_index = 1, meaning indexed by Google), and longest_words_raw > 16.00, then the web page is always phishing. It is unclear why URLs indexed by Google would be related to phishing; as such, this rule appears contradictory.
  • IF (google_index <= 0.00 AND ratio_extErrors > 0.00 AND domain_registration_length > 446.75) THEN No-Phishing, coverage = 0.0684, precision = 0.9961
  • The above rule is very difficult to comprehend: when google_index is 0, the page is not indexed by Google, so why would this be related to No-Phishing? In the dataset, ratio_extErrors is always > 0; therefore, this part of the rule does nothing. Finally, the domain_registration_length has a difficult-to-understand threshold value. The rule is true in 99.61% of cases.
  • IF (google_index > 0.00 AND nb_hyperlinks <= 9.00 AND avg_words_raw > 8.00) THEN Phishing, coverage = 0.0551, precision = 0.9952
The rule above has difficult-to-understand thresholds.
Additional Anchors rules are shown in Appendix A.2.

4.3. Robotic Navigation

In this section, we demonstrate the application of the FuzRED explainability method to deep reinforcement learning (DRL) agents, expanding the scope of our explainability framework beyond traditional deep learning models. We created a toy model using the Miniworld package [30] and tasked an agent with vision-based navigation to achieve a sequence of objectives. We utilized a custom simulated environment in which an agent and three colored boxes—red, blue, and yellow—were placed in random positions. The agent’s goal was to reach each of the three boxes in a specified sequence: first the red box, then the blue box, and finally the yellow box. We used the Stable-Baselines3 package ver 2.3.2 [31] to train a proximal policy optimization (PPO) model to accomplish these objectives. To address the multi-goal navigation problem, we employed hierarchical reinforcement learning (HRL), which comprises a high-level policy for generating sub-goals and a low-level policy for executing actions. The high-level controller uses the current environmental observations to provide sub-goals to the low-level planner.
The agent processed the visual information it captured from the environment (shown in Figure 19) using an object detection model. Specifically, a pre-trained you only look once (YOLO) v8 [32] model was fine-tuned on screenshots from the Miniworld environment, which were manually labeled to detect the three colored boxes. YOLO provided reliable information about the position of each box by generating bounding box coordinates, which were then used as part of the state representation for the PPO agent. These coordinates included the x and y positions of each box’s center, as well as the height and width of each bounding box. The coordinates were normalized between 0 and 1 before being fed into the low-level planning agent. The final state space presented to the PPO model was a concatenated vector consisting of bounding box coordinates for the red, blue, and yellow boxes, combined with one-hot encoded vectors indicating the detection status of each box and which box was the current target. The agent had an action space consisting of turning right, turning left, moving forward, and moving backward.
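The state construction described above can be sketched as follows; the exact ordering of the concatenated vector and the convention for undetected boxes (zero-filled coordinates here) are assumptions, and the detection values shown are hypothetical.

```python
import numpy as np

def build_state(detections, target_idx):
    """Concatenate normalized bounding boxes, detection flags, and a
    one-hot target indicator into the agent's state vector.

    detections maps a color to (cx, cy, w, h) in [0, 1], or to None if
    that box is not detected; undetected boxes are zero-filled, which is
    an assumption about the encoding.
    """
    colors = ["red", "blue", "yellow"]
    boxes, detected = [], []
    for c in colors:
        det = detections.get(c)
        if det is None:
            boxes.extend([0.0, 0.0, 0.0, 0.0])
            detected.append(0.0)
        else:
            boxes.extend(det)
            detected.append(1.0)
    target = [1.0 if i == target_idx else 0.0 for i in range(len(colors))]
    return np.array(boxes + detected + target, dtype=np.float32)

# Hypothetical frame: red and yellow detected, blue not; red is the target
state = build_state({"red": (0.5, 0.6, 0.1, 0.2), "blue": None,
                     "yellow": (0.2, 0.3, 0.05, 0.1)}, target_idx=0)
```

The resulting vector has 18 entries (12 bounding box coordinates, 3 detection flags, and 3 target indicators), matching the 18 features listed in Table 4.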
Figure 19. Robotic navigation agent’s environment.
The training of the model involved using both step-based rewards and sub-goal completion rewards. The agent received a positive reward of 0.03 for each step it took that brought it closer to the target box, while moving away from the target resulted in a penalty of −0.03. When the agent successfully reached a sub-goal, it was rewarded with a value that increased based on how many sub-goals had already been completed and how efficiently the current sub-goal was achieved:
R_subgoal = 2 × num_boxes_found × (1 − 0.2 × current_step/max_steps)
This reward function incentivized the agent to find all boxes and do so as efficiently as possible, with a maximum of 1000 steps allowed per episode. The agent was trained using a vectorized environment, which allowed multiple versions of the environment to run simultaneously, thus speeding up the training process and providing diverse experiences for learning. During the training of the PPO model, we performed several rounds of hyperparameter tuning. The final hyperparameters selected are given in Table 3. The training process continued until the model performance converged, achieving all three sub-goals on average within 250 steps after approximately 700,000 training steps.
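The reward scheme described above can be expressed directly in code. This is a minimal sketch under our reading of the sub-goal formula, with the decay term grouped as (1 − 0.2 · current_step/max_steps); the function names are illustrative.

```python
def subgoal_reward(num_boxes_found, current_step, max_steps=1000):
    """Sub-goal completion reward: scales with the number of boxes already
    found and decays as more of the step budget is consumed."""
    return 2.0 * num_boxes_found * (1.0 - 0.2 * current_step / max_steps)

def step_reward(moved_closer):
    """Per-step shaping reward: +0.03 toward the target, -0.03 away."""
    return 0.03 if moved_closer else -0.03
```

Under this grouping, finding the first box immediately yields a reward of 2, while finding the third box on the very last of 1000 steps still yields 2 × 3 × 0.8 = 4.8, so efficiency modulates but never eliminates the completion bonus.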
To generate rules explaining the model, we extracted features from the state space used to train the model, including bounding box coordinates, one-hot encoded detection statuses, and target indicators. These features are summarized in Table 4.
Table 4. Feature names and descriptions.

4.3.1. FuzRED Results

For each feature, we created FMFs by fitting Gaussian mixture models to the feature values observed during training. Because there were only 18 features, we did not down-select to the most important features in this case. For continuous-valued features, we selected three components for the mixture model, corresponding to three fuzzy classes (e.g., left, middle, and right for position coordinates). For one-hot encoded features, we created two classes (e.g., true or false). The resulting FMFs were then applied to data from 10 episodes in which the trained agent ran in deterministic mode and successfully completed the objective.
Figure 20 and Figure 21 show some example FMFs. Figure 20 shows the two FMFs for red_box_detected. The FMFs for the other binary variables (blue_box_detected, yellow_box_detected, target_red_box, target_blue_box, and target_yellow_box) look like those in Figure 20. Figure 21 shows the three FMFs for the feature red_box_position: left, middle, and right. It describes whether the red box appears to the left, in the middle, or to the right of the image that the agent sees. In Appendix A.3, Figure A6 shows the three FMFs for the feature red_box_height: short, medium, and tall. Of course, the height of the box does not change; however, when the agent is far from the box, the box appears short, and when the agent is close to it, it appears tall.
Figure 20. Robotic navigation: FMFs for feature red_box_detected. False and true are shown in red and green, respectively.
Figure 21. Robotic navigation: FMFs for feature red_box_position. Left, middle, and right are shown in red, green, and blue, respectively.
Using the FMFs, we generated fuzzy association rules describing the agent’s behavior in the environment. Rule extraction and pruning were performed using the FARM method described in Section 3.3 and Section 3.4. After pruning, a total of 43 rules remained across the three agent action classes that were actually utilized. As the agent never moved backward, no rules were generated for this action class.
The three rules below show the behavior of the agent when a given box is the current target, but that box is not detected, meaning that it is not in the field of view. In all cases when this occurs, the agent will turn to try to get the target box into its field of view. This is demonstrated by the fact that each of these rules has 100% confidence.
  • IF target_red_box IS true AND red_box_detected IS false THEN turn_left, sup = 0.104, conf = 1
  • IF target_blue_box IS true AND blue_box_detected IS false THEN turn_left, sup = 0.115, conf = 1
  • IF target_yellow_box IS true AND yellow_box_detected IS false THEN turn_right, sup = 0.116, conf = 1
An interesting thing to note here is that the agent learned to always turn left when looking for the red or blue box, but to always turn right when looking for the yellow box.
The rule below shows an interesting behavior of the agent.
  • IF target_blue_box IS true AND red_box_detected IS true THEN turn_left, sup = 0.041, conf = 1
It shows that whenever the target is blue but the red box is anywhere within the field of view, then the agent turns left. This is interesting because it shows learned behavior about the sequence of the sub-goals. For this task the agent was trained to always go from red to blue. As a result, every time the target switched to blue, the agent had just reached the red box and was right up against it and unable to move through it; therefore, it learned to turn until the red box was out of its field of view before proceeding.
The rules below demonstrate the move_forward action of the agent.
  • IF target_blue_box IS true AND blue_box_detected IS true THEN move_forward, sup = 0.240, conf = 1
  • The above rule shows that the agent always moves forward when the blue box is the target and it is in the field of view.
  • IF target_yellow_box IS true AND yellow_box_detected IS true THEN move_forward, sup = 0.223, conf = 0.996
  • The rule above shows that the agent almost always moves forward when the yellow box is the target and it is in the field of view.
However, the same is not true of the red box:
  • IF target_red_box IS true AND red_box_position_x IS middle AND yellow_box_detected IS false THEN move_forward, sup = 0.031, conf = 1
  • IF target_red_box IS true AND red_box_width IS wide AND blue_box_detected IS false THEN move_forward, sup = 0.015, conf = 0.999
  • IF target_red_box IS true AND red_box_height IS tall AND yellow_box_detected IS false THEN move_forward, sup = 0.071, conf = 0.987
  • The above rules show examples of some of the other conditions that need to hold for the agent to move forward toward the red box: the red box needs to be in the middle of the field of view, or to take up a large portion of the field of view in height or width, while the other boxes are not in the way.
Depending on the minimum confidence (Equation (8)) of the rules (0.6 to 1), FuzRED extracts a different number of rules. The results are shown in Table 5. The table is arranged in descending order by total coverage (Equation (11)) and then in ascending order by the number of rules after pruning. The total coverage of this set of rules ranges from 0.9984 to 0.8606.
Table 5. Robotic navigation: number of rules extracted by FuzRED before and after pruning.
When explaining the decisions of a neural network, one might be interested in rules covering a higher or lower percentage of the dataset, depending on the application or type of user. If the rules should describe at least 0.9 of the test data (i.e., a coverage of 0.9 or above), Table 5 shows that the nine methods at the top of the table satisfy this requirement. Among them, method 9 requires the smallest number of rules; its 28 rules are shown in Table A3 in Appendix A.3. Most of these rules are the same as those described earlier in this section.
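One plausible reading of the total coverage metric (Equation (11)) is the fraction of test instances matched by at least one rule in the set. The sketch below assumes that reading, with hypothetical boolean match masks, and ignores any fuzzy matching threshold the actual metric may apply.

```python
import numpy as np

def total_coverage(rule_masks):
    """Total coverage of a ruleset: the fraction of test instances matched
    by at least one rule. rule_masks is a list of boolean arrays (one per
    rule) over the same test set."""
    return np.logical_or.reduce(rule_masks).mean()

# Toy example: three rules over six instances
masks = [np.array([1, 0, 0, 1, 0, 0], dtype=bool),
         np.array([0, 1, 0, 0, 0, 0], dtype=bool),
         np.array([0, 1, 1, 0, 0, 0], dtype=bool)]
cov = total_coverage(masks)
```

Instances matched by several rules count once, so pruning redundant rules can shrink the ruleset without reducing total coverage.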

4.3.2. Anchors Results

Anchors extracted 1270 rules with up to four antecedents. After we removed the repeating rules, 155 rules were left. Most of the rules had three or four antecedents, some had two, and a few had one. Let us start with a few rules with two antecedents, as these are easier to understand.
  • IF (target_red_box <= 0.00 AND blue_box_height > 0.16) THEN move_forward, coverage = 0.1898, precision = 1
  • Target_red_box is a binary variable, meaning that this rule really says that if target_red_box = 0 (i.e., the red box is not the target) and blue_box_height > 0.16, then move_forward. The meaning of the threshold blue_box_height > 0.16 is unclear.
  • IF (0.00 < blue_box_position_x <= 0.06) THEN move_forward, coverage = 0.0819, precision = 0.9712
  • It is difficult to understand what the threshold values of blue_box_position_x mean in this rule.
  • IF (blue_box_detected <= 0.00 AND target_blue_box > 0.00) THEN turn_left, coverage = 0.115, precision = 1
  • Blue_box_detected and target_blue_box are binary features. As such, the rule really says that if blue_box_detected = 0.00 and target_blue_box = 1.0, then always turn_left. In other words, it says that if the blue_box is not detected and the target is the blue_box, then turn_left. This rule is the same as one of the FuzRED rules described in Section 4.3.1. The difference is that the FuzRED rule is easier to understand:
  • IF target_blue_box IS true AND blue_box_detected IS false THEN turn_left
  • Here are some examples of rules with three antecedents (note that most of the Anchors rules have three antecedents):
  • IF (blue_box_position_x <= 0.06 AND red_box_height > 0.14 AND target_red_box > 0.00) THEN move_forward, coverage = 0.1228, precision = 0.9744
  • Taking into account that some of the features are binary, this rule says that if target_red_box = 1 (so the target is the red_box), blue_box_position_x <= 0.06, and red_box_height > 0.14, then most of the time move_forward. The thresholds 0.06 and 0.14 are difficult to understand.
  • IF (target_red_box > 0.00 AND blue_box_height > 0.16 AND red_box_detected <= 0.00) THEN turn_left, coverage = 0.0244, precision = 1
  • Taking into account that some of the features are binary, this rule says that if target_red_box = 1 (meaning that the red_box is the target), red_box_detected = 0 (red_box not detected), and blue_box_height > 0.16, then the agent always turns left. The threshold 0.16 is difficult to understand.
Additional Anchors rules with four antecedents are shown in Appendix A.3.
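The vacuous thresholds discussed above can be stripped mechanically. The following sketch is a hypothetical post-processing helper (not part of Anchors or FuzRED; the feature names and the triple-based rule encoding are assumptions) that drops always-true conditions on binary features and rewrites the remaining binary thresholds as equalities:

```python
BINARY_FEATURES = {"target_red_box", "target_blue_box", "red_box_detected",
                   "blue_box_detected", "yellow_box_detected"}

def simplify(antecedents):
    """Drop always-true conditions and rewrite binary thresholds as equalities.

    Each antecedent is a (feature, operator, threshold) triple with the
    operator being "<=" or ">", mirroring the Anchors rule format above.
    """
    simplified = []
    for feature, op, threshold in antecedents:
        if feature in BINARY_FEATURES:
            # "feature <= 1.0" (or "feature > negative") is vacuous for a 0/1 feature.
            if (op == "<=" and threshold >= 1.0) or (op == ">" and threshold < 0.0):
                continue
            simplified.append(f"{feature} = {1 if op == '>' else 0}")
        else:
            simplified.append(f"{feature} {op} {threshold}")
    return simplified

rule = [("yellow_box_detected", "<=", 1.00),  # always true, dropped
        ("blue_box_height", ">", 0.16),
        ("target_red_box", ">", 0.00)]        # binary: becomes "= 1"
print(simplify(rule))  # ['blue_box_height > 0.16', 'target_red_box = 1']
```

Applied to the rules above, this kind of rewrite would shorten most Anchors rules by one antecedent and state the binary conditions in the same equality form FuzRED uses.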

5. Discussion

Across all three problems, FuzRED significantly outperforms Anchors in terms of computation speed. All experiments were conducted on a server equipped with two Intel Xeon E5-2650 CPUs (24 cores each) and 8 Nvidia GTX 1080ti GPUs. Figure 22 shows the comparison of computation times between FuzRED and Anchors.
Figure 22. Comparison of computation times between FuzRED and Anchors. The Y-axis is logarithmic.
For the spam dataset, FuzRED completes the entire process—including computing SHAP values, selecting the top 20 SHAP features, generating membership functions, and extracting rules—in 53.7 s, while Anchors requires 16.1 h to complete the same task. Notably, Anchors appeared to utilize only a single CPU core, which may have contributed to its slower performance. FuzRED is thus more than two orders of magnitude faster than Anchors.
For the phishing dataset, FuzRED performs all the computations in 105.1 s, whereas Anchors runs for 71.6 h. Again, Anchors appeared to fully utilize only one CPU core. In addition, generating the Anchors rules for the phishing dataset required more memory than the 256 GB of RAM available on the server: the rule generation process ran out of memory a little over halfway through and had to be restarted on the remaining data. The time spent shutting down and restarting the process was not counted in the runtime reported above. FuzRED is thus more than three orders of magnitude faster than Anchors.
For the robotic navigation task, FuzRED performs all the computations in just seven minutes on the same server used for the spam and phishing experiments, whereas Anchors takes over 129 min. Here, FuzRED is one order of magnitude faster. The speed-up for the robotic navigation task is smaller than for the other two tasks because this task has only 18 features in total and we did not perform feature selection, so both FuzRED and Anchors were generating rules over the same number of features.
The rules generated by FuzRED are highly interpretable, using familiar linguistic terms such as low, medium, high, left, middle, and right which align with natural human reasoning. For SMEs, the FMF plots visually indicate the value ranges corresponding to each membership category. Additionally, shorter rules enhance interpretability, and FuzRED predominantly generates rules with only one or two antecedents, with a small subset containing three. By selecting a concise yet representative subset of rules that explain a large portion of the dataset, FuzRED enables users to focus on key patterns, making it easier to assess and validate the model’s behavior efficiently.
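As an illustration of how a crisp feature value maps to one of these linguistic terms, the sketch below uses triangular and shoulder membership functions with assumed breakpoints over a feature normalized to [0, 1]; the paper itself derives its FMFs from Gaussian mixture models, so this is a simplified stand-in:

```python
def triangular(x, a, b, c):
    """Membership in a triangle with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Hypothetical low/medium/high partition over a feature normalized to [0, 1].
FMFS = {
    "low":    lambda x: max(0.0, min(1.0, (0.4 - x) / 0.4)),  # left shoulder
    "medium": lambda x: triangular(x, 0.2, 0.5, 0.8),
    "high":   lambda x: max(0.0, min(1.0, (x - 0.6) / 0.4)),  # right shoulder
}

def linguistic_term(x):
    """Pick the term with the maximum matching degree (MMD)."""
    return max(FMFS, key=lambda term: FMFS[term](x))

print(linguistic_term(0.1), linguistic_term(0.5), linguistic_term(0.9))
# low medium high
```

This is the step that lets a rule antecedent read "word_freq_hp IS High" instead of an opaque numeric threshold.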
Anchors generates a large set of rules, most of which contain three antecedents, some with two, and a few with one. Many of these rules include antecedents that are always true, which could be omitted to create shorter, more interpretable rules. Additionally, Anchors often applies arbitrary numerical thresholds (e.g., 0.76, 0.16)—even for integer features—making the rules harder to understand. For binary features, instead of straightforward conditions like “feature = 0” or “feature = 1”, Anchors consistently uses greater than or equal to (≥) or less than or equal to (≤) conditions, further complicating interpretability. Moreover, the rule sets produced by Anchors are significantly larger than those generated by FuzRED, making it more challenging to evaluate whether the overall model behavior aligns with expectations.
While these results have demonstrated many of the advantages of FuzRED, it is important to also note some of its current limitations. First, although our experiments demonstrate that FuzRED can handle moderately complex models efficiently, scaling the approach to extremely large neural networks or high-dimensional datasets could pose challenges. In these scenarios, the number of important features could be in the hundreds or thousands, which would substantially increase the runtime and could become intractable due to the combinatorial explosion that occurs in fuzzy association rule mining as either the number of items or the length of rules increases. It could also produce an excessively large collection of rules that would be difficult to manage and interpret. Similarly, while the DRL agent example demonstrated the applicability of FuzRED to multi-class classification problems, the method currently generates separate rulesets for each class, and analyzing all the rulesets could become challenging for many classes. Finally, to date, we have only focused on structured and numeric data; extending FuzRED to unstructured modalities such as images or unstructured text would likely require an appropriate feature extraction or encoding step as a precursor to fuzzy rule generation. Each of these issues is an important target for future work.
Table 6 shows a brief comparison of FuzRED, SHAP, LIME, and Anchors methods in terms of interpretability, computation time, explanation scope, scalability, and ease-of-use for non-experts.
Table 6. Comparison of FuzRED, SHAP, LIME, and Anchors.

6. Conclusions and Future Work

We introduced FuzRED, a novel method for explaining AI model decisions using fuzzy IF-THEN rules. This method translates complex model behaviors into a set of human-understandable rules described with familiar linguistic terms (e.g., “low”, “medium”, “high”). The approach helps to bridge the gap between black-box neural network decision-making and human reasoning. By combining feature-importance analysis with fuzzy logic, FuzRED ensures that the extracted rules remain faithful to the model’s internal patterns while also being intuitively interpretable. FuzRED is fully automated—the user only needs to provide a trained model, along with training and test datasets, and can optionally set a couple of parameters to guide rule generation. We demonstrated its applicability to both standard classification tasks and a DRL scenario, highlighting its flexibility across different types of AI problems. We also showed that it is significantly faster than the Anchors method, and we think additional optimizations could improve the efficiency further.
FuzRED produces a concise set of representative rules that provide a high-level summary of the model’s behavior, allowing users to focus on key decision patterns without being overwhelmed by details. This capability can help build trust in AI systems among subject-matter experts and even non-technical stakeholders by enabling them to see in plain and intuitive language why a model made a certain prediction or recommendation. Likewise, model developers can use FuzRED as a validation and debugging tool to ensure the model’s reasoning aligns with their domain knowledge and the system’s requirements. If an extracted rule reveals that an unexpected or illogical condition drove a prediction, it will alert developers to potential issues in the model or dataset. This, in turn, will prompt further investigation or retraining. In addition, FuzRED’s computational efficiency makes it feasible to integrate into real-world workflows, even for time-sensitive applications or iterative model development.
This capability has important implications for deploying AI in a variety of real-world applications. In healthcare, for example, an interpretable set of rules derived from a diagnostic model can help clinicians and patients trust the model’s recommendations and understand the rationale behind critical decisions. In defense and security, rule-based explanations enable decision-makers to verify that an AI system’s reasoning aligns with established rules of engagement and ethical constraints before they act on its output. Similarly, in autonomous systems such as self-driving vehicles or robotic assistants, understanding the conditions that trigger certain actions improves transparency and safety, fostering greater confidence in the system’s behavior and facilitating compliance with regulatory requirements.
By providing transparent, rule-based insights into a neural network’s workings, FuzRED contributes to the broader goal of making AI systems more interpretable and accountable. Notably, the fuzzy rule-based approach provides a method that yields global explanations (capturing broad patterns across many instances) rather than only case-by-case insights. In other words, FuzRED complements existing local explanation techniques by offering a higher-level view of a model’s decision logic that can be examined holistically.
The integration of fuzzy logic with deep-learning explanations in FuzRED also opens avenues for new research. For instance, future work could leverage FuzRED’s extracted rules to identify higher-level concepts that a neural network has learned, or to distill a complex model into a simpler, more transparent form without significant loss of accuracy. Future concept-based explainability methods could also leverage the output of FuzRED to aid in concept identification by analyzing the interactions between different feature or rule combinations within the larger ruleset. Moving forward, we hope to extend FuzRED to more complex scenarios—for example, scaling it to accommodate much larger networks and higher-dimensional datasets and adapting the approach to unstructured data like image and text inputs. These efforts will further broaden FuzRED’s applicability and amplify its impact on the field of explainable AI.

Author Contributions

Conceptualization, A.L.B.; methodology, A.L.B. and B.D.B.; software, K.Z. and B.D.B.; formal analysis, A.L.B., B.D.B. and K.Z.; writing—original draft preparation, A.L.B. and B.D.B.; writing—review and editing, A.L.B., B.D.B. and K.Z.; visualization, K.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the JHU Institute of Assured Autonomy.

Data Availability Statement

All the data used in this paper will be made available upon publication of the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AI: Artificial intelligence
ARM: Association rule mining
DRL: Deep reinforcement learning
FARM: Fuzzy association rule mining
FMF: Fuzzy membership function
FuzRED: Fuzzy rules for explaining deep neural network decisions
GMM: Gaussian mixture model
HRL: Hierarchical reinforcement learning
LIME: Local interpretable model-agnostic explanations
MD: Matching degree
MMD: Maximum matching degree
ReLU: Rectified linear unit
SHAP: Shapley additive explanations
SME: Subject matter expert
UCI: University of California Irvine
XAI: Explainable artificial intelligence
YOLO: You only look once

Appendix A

Appendix A.1. Additional Results for the Spam Email Identification Dataset

Some FMFs are shown in Section 4.1.1 (FMFs). Additional FMFs are shown below. The low, medium, and high FMFs follow a similar pattern to those for word_freq_george and word_freq_hp described in Section 4.1.1.
Figure A1. Spam dataset: FMFs for word_freq_000. Low, medium, and high are shown in red, green, and blue, respectively.
Figure A2. Spam dataset: FMFs for word_freq_meeting. Low, medium, and high are shown in red, green, and blue, respectively.
Figure A3. Spam dataset: FMFs for char_freq_$. Low, medium, and high are shown in red, green, and blue, respectively.
In Section 4.1.1 (FuzRED Rules), we present a small set of rules. Additional rules with their explanations are shown below.
There are also some rules with the word frequency of remove.
  • IF (word_freq_remove IS High) THEN Spam, sup = 0.0119, conf = 1
  • Many spam emails play on people’s fear, claiming their account will be deleted unless they take immediate action, in emails like: “Urgent: Verify your account or we will need to remove it from our system. Click here to verify.”
Below we show some of the rules extracted using the frequency choice of the top features.
  • IF (word_freq_000 IS High AND word_freq_1999 IS Low) THEN Spam, sup = 0.0085, conf = 1
  • The above rule says that if the frequency of 000 is high (spam emails often mention $ and a lot of zeros) and the frequency of 1999 (many no-spam emails were from 1999) is low, then the email is always spam.
  • IF (word_freq_re IS High) THEN No-Spam, sup = 0.0020, conf = 1
  • If someone is replying to someone else’s email (or chain of email replies), this is not spam.
  • IF (word_freq_hpl IS High) THEN No-Spam, sup = 0.0198, conf = 1
  • Hpl is the acronym for Hewlett Packard Labs, and the rule states that if its frequency is high, the email is always no-spam.
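The sup/conf values attached to these rules follow the standard association-rule bookkeeping: support is the fraction of all records matching both antecedent and consequent, and confidence is the fraction of antecedent-matching records whose class matches the consequent. Fuzzy ARM generalizes this with membership degrees; the crisp sketch below, over an invented five-record dataset, is enough to show the computation:

```python
def support_confidence(records, antecedent, consequent):
    """Return (support, confidence) of an IF-antecedent-THEN-consequent rule."""
    matching = [r for r in records if antecedent(r)]
    hits = [r for r in matching if consequent(r)]
    support = len(hits) / len(records)
    confidence = len(hits) / len(matching) if matching else 0.0
    return support, confidence

# Hypothetical fuzzified records: (linguistic term for word_freq_remove, label).
data = [("High", "Spam"), ("High", "Spam"), ("Low", "No-Spam"),
        ("Low", "Spam"), ("Low", "No-Spam")]

sup, conf = support_confidence(
    data,
    antecedent=lambda r: r[0] == "High",  # word_freq_remove IS High
    consequent=lambda r: r[1] == "Spam",
)
print(sup, conf)  # 0.4 1.0
```

A confidence of 1, as in the rules above, means the rule never misfires on the data; the support tells how much of the dataset it actually explains.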
Most Anchors rules are presented in Section 4.1.2. Here, we show a few additional Anchors rules.
  • IF (word_freq_remove > 0.00 AND word_freq_hp <= 0.00 AND word_freq_will > 0.00) THEN Spam, coverage = 0.1192, precision = 1
Since frequencies can never be negative, this rule actually states that if the frequency of remove is greater than zero, the frequency of hp is zero, and the frequency of will is greater than zero, then the email is spam.
  • IF (word_freq_george > 0.00 AND word_freq_labs > 0.00) THEN No-Spam, coverage = 0.0533, precision = 1
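The coverage and precision figures reported with each Anchors rule can be reproduced directly from a labeled sample: coverage is the fraction of instances the anchor applies to, and precision is the fraction of those that the model assigns to the rule's consequent. The tiny dataset below is invented for illustration:

```python
# Each row: (word_freq_george, word_freq_labs, model prediction). Invented data.
rows = [
    (0.0, 0.0, "Spam"),
    (0.5, 0.4, "No-Spam"),
    (1.2, 0.0, "Spam"),
    (0.3, 0.7, "No-Spam"),
]

# The anchor applies where both word frequencies are greater than zero.
covered = [r for r in rows if r[0] > 0.00 and r[1] > 0.00]
coverage = len(covered) / len(rows)
precision = sum(r[2] == "No-Spam" for r in covered) / len(covered)
print(coverage, precision)  # 0.5 1.0
```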
Table A1. Pruned set of rules for spam: method 11 from Table 1, minimum contribution method, minimum rule confidence of 0.6.
Antecedent 1 | Antecedent 2 | Antecedent 3 | Consequent | Support | Confidence
word_freq_hp; Medium | | | No-Spam | 0.0753 | 1
word_freq_hpl; Medium | | | No-Spam | 0.0598 | 1
word_freq_george; High | | | No-Spam | 0.0263 | 1
capital_run_length_longest; Medium | | | Spam | 0.0237 | 1
word_freq_meeting; Medium | | | No-Spam | 0.0217 | 1
word_freq_hpl; High | | | No-Spam | 0.0198 | 1
word_freq_hp; High | | | No-Spam | 0.0145 | 1
word_freq_1999; Low | word_freq_telnet; Medium | word_freq_technology; Medium | No-Spam | 0.0142 | 1
word_freq_cs; Medium | | | No-Spam | 0.0126 | 1
word_freq_direct; High | | | No-Spam | 0.0112 | 1
word_freq_lab; Low | word_freq_telnet; Medium | word_freq_technology; Medium | No-Spam | 0.0107 | 1
word_freq_telnet; High | | | No-Spam | 0.0099 | 1
word_freq_1999; Medium | word_freq_edu; Medium | | No-Spam | 0.0097 | 1
word_freq_lab; Medium | word_freq_technology; Medium | | No-Spam | 0.0089 | 1
word_freq_project; Medium | | | No-Spam | 0.0079 | 1
word_freq_1999; Medium | word_freq_technology; Medium | | No-Spam | 0.0055 | 1
word_freq_1999; Medium | char_freq_; Medium | | No-Spam | 0.0054 | 1
word_freq_edu; High | | | No-Spam | 0.0053 | 1
char_freq_$; Medium | word_freq_direct; Medium | | Spam | 0.0047 | 1
word_freq_meeting; High | | | No-Spam | 0.0046 | 1
word_freq_email; Medium | char_freq_$; Medium | | Spam | 0.0043 | 1
char_freq_; High | | | No-Spam | 0.0039 | 1
char_freq_$; High | | | Spam | 0.0033 | 1
char_freq_$; Medium | char_freq_; Medium | | Spam | 0.0030 | 1
word_freq_email; High | word_freq_re; Low | word_freq_all; Low | Spam | 0.0029 | 1
word_freq_cs; High | | | No-Spam | 0.0026 | 1
word_freq_email; Medium | word_freq_all; High | | No-Spam | 0.0026 | 1
word_freq_re; High | | | No-Spam | 0.0020 | 1
word_freq_edu; Medium | word_freq_all; Medium | | No-Spam | 0.0020 | 1
word_freq_edu; Low | word_freq_edu; Medium | | No-Spam | 0.0014 | 1
word_freq_george; Medium | word_freq_1999; Medium | | No-Spam | 0.0014 | 1
word_freq_george; Medium | word_freq_1999; High | | No-Spam | 0.0013 | 1
word_freq_george; Medium | char_freq_; Medium | | No-Spam | 0.0013 | 1
word_freq_1999; High | word_freq_all; Medium | | No-Spam | 0.0012 | 1
word_freq_hp; Low | char_freq_$; Medium | word_freq_all; Medium | Spam | 0.0065 | 1
word_freq_857; Medium | | | No-Spam | 0.0217 | 0.9999
word_freq_email; Medium | word_freq_direct; Low | word_freq_direct; Medium | Spam | 0.0011 | 0.9996
word_freq_re; Medium | | | No-Spam | 0.0258 | 0.9940
word_freq_lab; High | | | No-Spam | 0.0099 | 0.9930
word_freq_telnet; Medium | | | No-Spam | 0.0261 | 0.9901
word_freq_technology; Low | word_freq_technology; Medium | | No-Spam | 0.0023 | 0.9788
word_freq_email; Low | word_freq_email; Medium | word_freq_telnet; Low | Spam | 0.0044 | 0.9763
word_freq_george; Medium | | | No-Spam | 0.0155 | 0.9747
word_freq_lab; Medium | | | No-Spam | 0.0158 | 0.9641
word_freq_hp; Low | char_freq_$; Medium | word_freq_meeting; Low | Spam | 0.0306 | 0.9630
word_freq_edu; Medium | | | No-Spam | 0.0329 | 0.9615
word_freq_hp; Low | word_freq_email; Medium | word_freq_direct; Medium | Spam | 0.0083 | 0.9283
word_freq_email; Medium | word_freq_all; Low | word_freq_all; Medium | Spam | 0.0047 | 0.9233
word_freq_1999; Medium | | | No-Spam | 0.0576 | 0.9109
word_freq_table; Medium | word_freq_all; Medium | | No-Spam | 0.0014 | 0.8914
capital_run_length_longest; Low | word_freq_all; High | | No-Spam | 0.0148 | 0.8817
word_freq_technology; Medium | word_freq_all; Medium | | No-Spam | 0.0049 | 0.8816
word_freq_email; Medium | word_freq_telnet; Low | word_freq_all; Medium | Spam | 0.0218 | 0.8791
word_freq_email; Medium | char_freq_$; Low | word_freq_direct; Medium | No-Spam | 0.0229 | 0.8453
word_freq_technology; Medium | | | No-Spam | 0.0387 | 0.8429
word_freq_direct; Low | word_freq_direct; Medium | word_freq_technology; Low | Spam | 0.0020 | 0.8257
word_freq_hp; Low | word_freq_direct; Medium | word_freq_857; Low | Spam | 0.0158 | 0.7933
word_freq_hp; Low | word_freq_email; Medium | word_freq_telnet; Low | Spam | 0.0598 | 0.7865
char_freq_$; Low | char_freq_; Medium | capital_run_length_longest; Low | No-Spam | 0.0123 | 0.7714
word_freq_1999; High | | | No-Spam | 0.0038 | 0.7435
word_freq_hpl; Low | word_freq_technology; Low | word_freq_all_; Medium | Spam | 0.0991 | 0.7326
word_freq_telnet; Low | word_freq_hpl; Low | word_freq_all; Medium | Spam | 0.0998 | 0.7298
word_freq_email; Low | char_freq_$; Low | word_freq_all; Low | No-Spam | 0.5013 | 0.6670
word_freq_email; Low | word_freq_all; Low | | No-Spam | 0.5031 | 0.6476
char_freq_$; Low | word_freq_all; Low | | No-Spam | 0.5194 | 0.6431
word_freq_email; Low | char_freq_$; Low | | No-Spam | 0.5526 | 0.6292
char_freq_$; Low | capital_run_length_longest; Low | | No-Spam | 0.5801 | 0.6155

Appendix A.2. Additional Results for Phishing Link Detection

Some FMFs are shown in Section 4.2.1 (FMFs). Additional FMFs are shown below.
Nb_dots describes the number of dots in a URL; in the phishing dataset, the largest nb_dots is 24 (Figure A4 shows the FMFs). Phishing URLs often use sensitive words to gain visitors' trust, and the number of such words in a URL is considered a phishing indicator. Nb_www is an integer describing how many times 'www' appears in a URL: common terms such as 'www' tend to appear at most once in legitimate URLs, but more than once in phishing URLs. In the phishing dataset, the largest nb_www is 2 (Figure A5 shows the FMFs).
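For small-integer features like nb_www (0 to 2 in this dataset), only two linguistic terms are needed. The sketch below uses simple complementary linear shoulders as illustrative stand-ins; the breakpoints are assumed and are not the GMM-derived FMFs used in the paper:

```python
def low(x, x_max=2):
    """Linear shoulder: full membership at 0, none at x_max."""
    return max(0.0, 1.0 - x / x_max)

def high(x, x_max=2):
    """Complementary shoulder: no membership at 0, full at x_max."""
    return min(1.0, x / x_max)

for n in range(3):  # nb_www takes the values 0, 1, 2 in this dataset
    print(n, low(n), high(n))
```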
Figure A4. Phishing dataset: FMFs for nb_dots. Low and high are shown in red and blue, respectively.
Most Anchors rules are presented in Section 4.2.2. Here, we show a few additional Anchors rules.
  • IF (nb_dollar > 0.00 AND nb_dots > 3.00) THEN Phishing, coverage = 0.0005, precision = 1
The above rule always holds (precision = 1) but covers only 0.05% of the dataset.
  • IF (google_index > 0.00 AND nb_www <= 0.00 AND domain_age <= 978.50) THEN Phishing coverage = 0.1169, precision = 0.9615
Nb_www is always greater than or equal to 0; therefore, this antecedent states that nb_www is equal to zero. Again, it is not clear why URLs indexed by Google (google_index > 0.00) would lead to phishing. Domain_age has a difficult-to-understand threshold in the above rule.
Figure A5. Phishing dataset: FMFs for nb_www. Low and high are shown in red and blue, respectively.
Table A2. Pruned set of rules for the phishing dataset, method 27 from Table 3, maximum magnitude and minimum rule confidence of 0.8.
Antecedent 1 | Antecedent 2 | Antecedent 3 | Consequent | Support | Confidence
nb_at; High | | | Phishing | 0.0236 | 1
nb_space; Low | nb_colon; High | | Phishing | 0.0069 | 1
nb_semicolumn; High | | | Phishing | 0.0045 | 1
longest_words_raw; High | | | Phishing | 0.0045 | 1
phish_hints; High | ratio_digits_host; High | | Phishing | 0.0045 | 1
brand_in_subdomain; High | | | Phishing | 0.0040 | 1
nb_dots_; High | | | Phishing | 0.0039 | 1
nb_hyperlinks; High | ratio_extRedirection; High | | No-Phishing | 0.0035 | 1
domain_registration_length; High | nb_hyperlinks; High | | No-Phishing | 0.0028 | 1
nb_www; High | length_words_raw; High | | Phishing | 0.0024 | 1
domain_registration_length; High | ratio_digits_host; High | | Phishing | 0.0011 | 1
phish_hints; High | ratio_extRedirection; High | | Phishing | 0.0095 | 0.9999
ratio_extRedirection; High | ratio_digits_host; High | | Phishing | 0.0037 | 0.9993
page_rank; High | nb_hyperlinks; High | | No-Phishing | 0.0142 | 0.9991
phish_hints; High | page_rank; Low | | Phishing | 0.0711 | 0.9945
nb_space; High | domain_registration_length; High | | No-Phishing | 0.0011 | 0.9931
nb_www; Low | ratio_digits_host; High | | Phishing | 0.0325 | 0.9902
nb_space; High | page_rank; High | nb_www; High | No-Phishing | 0.0016 | 0.9837
domain_registration_length; Low | phish_hints; High | | Phishing | 0.0784 | 0.9832
phish_hints; High | nb_www; Low | | Phishing | 0.0596 | 0.9825
page_rank; High | ratio_extRedirection; High | nb_www; High | No-Phishing | 0.0273 | 0.9803
nb_hyphens; High | page_rank; High | ratio_extRedirection; High | No-Phishing | 0.0035 | 0.9770
nb_hyphens; High | page_rank; High | length_words_raw; Low | No-Phishing | 0.0314 | 0.9758
ratio_digits_host; High | | | Phishing | 0.0350 | 0.9755
nb_hyperlinks; High | nb_www; Low | | No-Phishing | 0.0098 | 0.9721
phish_hints; Low | nb_hyperlinks; High | | No-Phishing | 0.0179 | 0.9702
nb_hyphens; High | page_rank; High | | No-Phishing | 0.0320 | 0.9610
page_rank; High | nb_www; High | | No-Phishing | 0.2250 | 0.9565
nb_hyphens; Low | nb_hyphens; High | nb_www; High | No-Phishing | 0.0029 | 0.9546
domain_registration_length; High | ratio_extRedirection; High | nb_www; High | No-Phishing | 0.0040 | 0.9534
page_rank; High | ratio_extRedirection; Low | ratio_extRedirection; High | No-Phishing | 0.0112 | 0.9530
domain_registration_length; Low | domain_registration_length; High | page_rank; High | No-Phishing | 0.0038 | 0.9527
domain_registration_length; High | nb_www; High | length_words_raw; Low | No-Phishing | 0.0435 | 0.9462
nb_dots; Low | nb_www; Low | length_words_raw; High | No-Phishing | 0.0015 | 0.9422
nb_hyphens; High | ratio_extRedirection; Low | nb_www; High | No-Phishing | 0.0227 | 0.9403
longest_words_raw; Low | domain_registration_length; High | ratio_extRedirection; High | No-Phishing | 0.0072 | 0.9375
nb_space; High | phish_hints; Low | nb_www; High | No-Phishing | 0.0032 | 0.9230
ratio_extRedirection; Low | ratio_extRedirection; High | nb_www; High | No-Phishing | 0.0070 | 0.9084
nb_underscore; High | page_rank; High | nb_colon; Low | No-Phishing | 0.0092 | 0.9045
phish_hints; Low | ratio_extRedirection; High | nb_www; High | No-Phishing | 0.0418 | 0.8770
page_rank; Low | ratio_extRedirection; Low | nb_www; Low | Phishing | 0.2589 | 0.8661
phish_hints; Low | ratio_extRedirection; Low | ratio_extRedirection; High | No-Phishing | 0.0126 | 0.8583
page_rank; Low | nb_www; Low | | Phishing | 0.2845 | 0.8524
nb_hyphens; High | nb_at; Low | phish_hints; Low | No-Phishing | 0.0383 | 0.8489
nb_space; High | page_rank; Low | page_rank; High | No-Phishing | 0.0011 | 0.8375
page_rank; Low | page_rank; High | ratio_extRedirection; High | No-Phishing | 0.0139 | 0.8301
domain_registration_length; High | phish_hints; Low | page_rank; High | No-Phishing | 0.0489 | 0.8230
phish_hints; Low | nb_www; High | | No-Phishing | 0.3468 | 0.8155

Appendix A.3. Additional Results for Robotic Navigation

Some FMFs are shown in Section 4.3.1 (FuzRED Rules). An additional FMF is shown below.
Figure A6. Robotic navigation: FMFs for feature red_box_height. Short, medium, and tall are shown in red, green, and blue, respectively.
Many Anchors rules are presented in Section 4.3.2. Some of the Anchors rules with four antecedents are shown below.
  • IF (yellow_box_detected <= 1.00 AND blue_box_height > 0.16 AND yellow_box_position_y <= 0.77 AND blue_box_position_y <= 0.78) THEN move_forward, coverage = 0.0591, precision = 0.96
The first antecedent of this rule, yellow_box_detected <= 1.00, is always true because yellow_box_detected is a binary feature. It therefore adds nothing to the rule, yet Anchors includes it. The other parts of the rule—blue_box_height > 0.16, yellow_box_position_y <= 0.77, and blue_box_position_y <= 0.78—are difficult to understand.
  • IF (blue_box_detected <= 1.00 AND red_box_height > 0.14 AND target_red_box > 0.00 AND yellow_box_detected <= 0.00) THEN move_forward, coverage = 0.1268, precision = 0.9689
This is another rule with an always-true antecedent: blue_box_detected <= 1.00. Taking into account that some of the features are binary, this rule really states that if red_box_height > 0.14, target_red_box = 1, and yellow_box_detected = 0, then move_forward in 96.89% of cases. The threshold 0.14 is difficult to understand.
Table A3. Pruned set of rules for robotic navigation agent method 9 from Table 5.
Antecedent 1 | Antecedent 2 | Antecedent 3 | Consequent | Support | Confidence
blue_box_detected; True | target_blue_box; True | | move_forward | 0.24 | 1
yellow_box_height; Tall | target_yellow_box; True | | move_forward | 0.118 | 1
yellow_box_detected; False | target_yellow_box; True | | turn_right | 0.116 | 1
blue_box_detected; False | target_blue_box; True | | turn_left | 0.115 | 1
red_box_detected; False | target_red_box; True | | turn_left | 0.104 | 1
yellow_box_width; Narrow | target_yellow_box; True | | move_forward | 0.046 | 1
red_box_width; Medium | yellow_box_position_x; Right | yellow_box_height; Tall | move_forward | 0.046 | 1
blue_box_width; Medium | yellow_box_position_x; Middle | | move_forward | 0.033 | 1
red_box_position_x; Middle | yellow_box_detected; False | target_red_box; True | move_forward | 0.031 | 1
red_box_position_x; Left | blue_box_height; Tall | | move_forward | 0.024 | 1
red_box_width; Medium | red_box_height; Medium | yellow_box_height; Tall | move_forward | 0.023 | 1
red_box_position_x; Middle | red_box_width; Medium | yellow_box_width; Medium | move_forward | 0.022 | 1
red_box_position_x; Right | yellow_box_position_x; Right | | move_forward | 0.02 | 1
yellow_box_width; Wide | target_yellow_box; True | | move_forward | 0.016 | 1
red_box_position_x; Right | target_red_box; True | | move_forward | 0.013 | 1
red_box_position_y; Middle | yellow_box_width; Medium | | move_forward | 0.013 | 1
yellow_box_height; Short | target_yellow_box; True | | move_forward | 0.013 | 1
red_box_width; Narrow | yellow_box_width; Medium | blue_box_detected; False | move_forward | 0.013 | 1
red_box_height; Medium | yellow_box_position_x; Left | yellow_box_position_y; Bottom | move_forward | 0.012 | 1
red_box_width; Narrow | blue_box_width; Medium | | move_forward | 0.011 | 1
red_box_width; Medium | blue_box_width; Medium | yellow_box_detected; False | move_forward | 0.011 | 1
red_box_height; Tall | yellow_box_width; Narrow | | move_forward | 0.01 | 1
red_box_width; Wide | blue_box_detected; False | target_red_box; True | move_forward | 0.015 | 0.999
red_box_width; Narrow | blue_box_detected; False | target_red_box; True | move_forward | 0.017 | 0.997
yellow_box_detected; True | target_yellow_box; True | | move_forward | 0.223 | 0.996
blue_box_width; Medium | yellow_box_position_y; Middle | | move_forward | 0.02 | 0.995
red_box_width; Medium | blue_box_position_x; Middle | blue_box_position_y; Bottom | move_forward | 0.015 | 0.995
blue_box_position_y; Middle | blue_box_width; Medium | yellow_box_height; Medium | move_forward | 0.015 | 0.991

References

  1. Towards trustable machine learning. Nat. Biomed. Eng. 2018, 2, 709–710. [CrossRef] [PubMed]
  2. Sayler, K. Artificial Intelligence and National Security; U.S. Congressional Research Service: Washington, DC, USA, 2019. [Google Scholar]
  3. Fjeld, J.; Hilligoss, H.; Achten, N.; Daniel, M.L.; Feldman, J.; Kagay, S. Principled Artificial Intelligence; Berkman Klein Center: Harvard, MA, USA, 2019; Available online: https://ai-hr.cyber.harvard.edu/primp-viz.html (accessed on 1 February 2025).
  4. Doshi-Velez, F.; Kortz, M. Accountability of AI Under the law: The Role of Explanation. arXiv 2017. Available online: https://arxiv.org/abs/1711.01134 (accessed on 1 February 2025). [CrossRef]
  5. Wagner, J. GDPR and Explainable AI. 2019. Available online: https://www.jwtechwriter.com/project-146.php (accessed on 1 February 2025).
  6. Guidotti, R.; Monreale, A.; Ruggieri, S.; Turini, F.; Giannotti, F.; Pedreschi, D. A survey of methods for explaining black box models. ACM Comput. Surv. 2018, 51, 93. [Google Scholar] [CrossRef]
  7. Lipton, Z.C. The mythos of model interpretability. arXiv 2017. Available online: https://arxiv.org/abs/1606.03490 (accessed on 1 February 2025).
  8. Molnar, C. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable, 2nd ed. 2022. Available online: https://christophm.github.io/interpretable-ml-book/ (accessed on 1 February 2025).
  9. Pedrycz, W.; Gomide, F. Fuzzy Systems Engineering: Toward Human-Centric Computing; Wiley-IEEE Press: Hoboken, NJ, USA, 2007. [Google Scholar]
  10. Lundberg, S.M.; Lee, S. A unified approach to interpreting model predictions. In Proceedings of the Advances in Neural Information Processing Systems 30 (NeurIPS 2017), Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
  11. Zadeh, L.A. Fuzzy sets. Inf. Control 1965, 8, 338–353. [Google Scholar] [CrossRef]
  12. Das, A.; Rad, P. Opportunities and challenges in explainable artificial intelligence (XAI): A survey. arXiv 2020. Available online: https://arxiv.org/abs/2006.11371 (accessed on 1 February 2025).
  13. Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. Int. J. Comput. Vis. 2019, 128, 336–359. [Google Scholar] [CrossRef]
  14. Ribeiro, M.T.; Singh, S.; Guestrin, C. “Why should I trust you?”: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; ACM: New York, NY, USA, 2016; pp. 1135–1144. [Google Scholar]
  15. Ribeiro, M.T.; Singh, S.; Guestrin, C. Anchors: High-precision model-agnostic explanations. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32. [Google Scholar] [CrossRef]
  16. Stepin, I.; Alonso, J.M.; Catalá, A.; Pereira-Fariña, M. A Survey of Contrastive and Counterfactual Explanation Generation Methods for Explainable Artificial Intelligence. IEEE Access 2021, 9, 11974–12001. [Google Scholar] [CrossRef]
  17. Mazzine Barbosa, R.; Martens, D. A Framework and Benchmarking Study for Counterfactual Generating Methods on Tabular Data. Appl. Sci. 2021, 11, 7274. [Google Scholar] [CrossRef]
  18. Wachter, S.; Mittelstadt, B.; Russell, C. Counterfactual explanations without opening the black box: Automated decisions and the GDPR. Harv. J. Law Technol. 2017, 31, 841–887. [Google Scholar] [CrossRef]
  19. Verma, S.; Dickerson, J.; Hines, K. Counterfactual explanations for machine learning: A review. arXiv 2020. Available online: https://arxiv.org/abs/2010.10596 (accessed on 1 February 2025).
  20. Kim, B.; Wattenberg, M.; Gilmer, J.; Cai, C.; Wexler, J.; Viegas, F.; Sayres, R. Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV). In Proceedings of the 35th International Conference on Machine Learning (ICML), Stockholm, Sweden, 10–15 July 2018. [Google Scholar]
  21. Agrawal, R.; Srikant, R. Fast algorithms for mining association rules. In Proceedings of the 20th International Conference on Very Large Data Bases, Santiago, Chile, 12–15 September 1994; pp. 487–499. [Google Scholar]
  22. Agrawal, R.; Imielinski, T.; Swami, A. Mining association rules between sets of items in large databases. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Washington, DC, USA, 26–28 May 1993; ACM: New York, NY, USA, 1993; pp. 207–216. [Google Scholar]
  23. Kuok, C.M.; Fu, A.; Wong, M.H. Mining fuzzy association rules in databases. ACM Sigmod Rec. 1998, 27, 41–46. [Google Scholar] [CrossRef]
  24. Liu, B.; Hsu, W.; Ma, Y. Integrating classification and association rule mining. In Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining (KDD), New York, NY, USA, 27–31 August 1998; AAAI Press: Washington, DC, USA, 1998. [Google Scholar]
  25. Reynolds, D. Gaussian mixture models. In Encyclopedia of Biometrics; Li, S.Z., Jain, A., Eds.; Springer: Berlin/Heidelberg, Germany, 2009. [Google Scholar] [CrossRef]
  26. Dua, D.; Graff, C. UCI Machine Learning Repository; University of California, School of Information and Computer Science: Irvine, CA, USA, 2019; Available online: http://archive.ics.uci.edu/ml (accessed on 1 February 2025).
  27. scikit-learn. (n.d.). StandardScaler. Available online: https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html (accessed on 1 February 2025).
  28. Lundberg, S.M. (n.d.). SHAP Documentation. Available online: https://shap.readthedocs.io/en/latest/ (accessed on 1 February 2025).
  29. Hannousse, A.; Yahiouche, S. Towards benchmark datasets for machine learning based website phishing detection: An experimental study. arXiv 2020. Available online: https://arxiv.org/abs/2010.12847 (accessed on 1 February 2025). [CrossRef]
  30. Chevalier-Boisvert, M.; Dai, B.; Towers, M.; de Lazcano, R.; Willems, L.; Lahlou, S.; Pal, S.; Castro, P.S.; Terry, J. Minigrid & Miniworld: Modular & customizable reinforcement learning environments for goal-oriented tasks. arXiv 2023. Available online: https://arxiv.org/abs/2306.13831 (accessed on 1 February 2025).
  31. Raffin, A.; Hill, A.; Gleave, A.; Kanervisto, A.; Ernestus, M.; Dormann, N. Stable-baselines3: Reliable reinforcement learning implementations. J. Mach. Learn. Res. 2021, 22, 1–8. [Google Scholar]
  32. Jocher, G.; Chaurasia, A.; Qiu, J. Ultralytics YOLOv8 (Version 8.0.0). 2023. Available online: https://github.com/ultralytics/ultralytics (accessed on 1 February 2025).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
