COVAS: Highlighting the Importance of Outliers in Classification Through Explainable AI

Roth, Sebastian; Cerrito, Adrien; Orth, Samuel; Hartmann, Ulrich; Friemert, Daniel

doi:10.3390/make8010024

Open Access Article

by Sebastian Roth ¹,*, Adrien Cerrito ², Samuel Orth ¹, Ulrich Hartmann ¹,³ and Daniel Friemert ¹,³

¹ Faculty of Mathematics, Informatics, Technology, University of Applied Sciences Koblenz, 53424 Remagen, Germany

² Haute-Ecole Arc Santé, HES-SO University of Applied Sciences and Arts Western Switzerland, 2800 Delémont, Switzerland

³ Institut für Medizintechnik und Informationsverarbeitung, 56070 Koblenz, Germany

* Author to whom correspondence should be addressed.

Mach. Learn. Knowl. Extr. 2026, 8(1), 24; https://doi.org/10.3390/make8010024

Submission received: 8 December 2025 / Revised: 8 January 2026 / Accepted: 16 January 2026 / Published: 19 January 2026

(This article belongs to the Topic Opportunities and Challenges in Explainable Artificial Intelligence (XAI))


Abstract

Understanding the decision-making behavior of machine learning models is essential in domains where individual predictions matter, such as medical diagnosis or sports analytics. While explainable artificial intelligence (XAI) methods such as SHAP provide instance-level feature attributions, they mainly summarize typical decision behavior and offer limited support for systematically exploring atypical yet correctly classified cases. In this work, we introduce the Classification Outlier Variability Score (COVAS), a framework designed to support hypothesis generation through the analysis of explanation variability. COVAS operates in the explanation space and builds directly on SHAP value representations. It quantifies how strongly an individual instance’s SHAP-based explanation deviates from class-specific attribution patterns by aggregating standardized SHAP deviations into a single score. Consequently, the applicability of COVAS inherits the model- and data-agnostic properties of SHAP, provided that explanations can be computed for the underlying model and data. We evaluate COVAS on publicly available datasets from the medical and sports domains. The results show that COVAS reveals explanation-space outliers not captured by feature-space outlier detection or prediction uncertainty measures. Robustness analyses demonstrate stability across parameter choices, class imbalance, model initialization, and model classes. Overall, COVAS complements existing XAI techniques by enabling targeted instance-level inspection and facilitating XAI-guided hypothesis formulation.

Keywords: explainable artificial intelligence (XAI); SHAP values; outlier detection; model interpretability; machine learning classification; instance-level analysis; anomaly detection

1. Introduction

When using machine learning (ML) algorithms for diagnostic or classification tasks, it is essential to understand how the model reaches its conclusions [1,2]. Thus, explainable artificial intelligence (XAI) plays a crucial role in this domain [1,3]. The necessity of XAI depends strongly on the application [1,3,4]. In fields such as medicine and biomechanics, model transparency is imperative [2,3,5]. Furthermore, AI systems must translate their outputs into domain-specific language that is comprehensible to experts while taking into account their varying levels of expertise [2,3,5].

One widely used method for interpreting model decisions is Shapley Additive Explanations (SHAP) [6,7,8]. SHAP attributes feature-wise contributions to individual predictions, enabling a detailed understanding of model behavior. Its model-agnostic nature allows SHAP to be applied across a broad spectrum of ML models and datasets [3,6,7,8]. In addition to numerical attributions, SHAP also provides graphical visualizations such as decision plots, which illustrate how features accumulate toward a final prediction [6].

However, interpreting outliers—instances that deviate substantially from the expected feature attribution pattern—remains a challenge [1,2]. Outliers are often dismissed because they are difficult to interpret or may stem from measurement errors. Yet, in the absence of evidence for systematic errors, atypical instances may contain meaningful and novel insights [9,10]. In medicine, such outliers may reveal rare or borderline manifestations of diseases [9,10], and the same holds true for movement disorders such as cerebral palsy or Parkinson’s disease [11,12].

Existing outlier detection approaches typically rely on statistical distances in the feature space, such as z-score filtering, Mahalanobis distance, or density-based methods like Local Outlier Factor (LOF) [13,14,15]. However, these approaches generally ignore the underlying model behavior and do not incorporate localized explanations from XAI tools [8].

Recent work has shown that local feature attributions can vary across instances and modeling conditions, motivating the analysis of explanation patterns beyond aggregate summaries [16,17]. In parallel, explanations have increasingly been treated as objects of analysis in their own right, for example, through systematic evaluation and comparison of explanation outputs across models and datasets [18]. However, existing approaches typically do not focus on systematically characterizing atypical explanation patterns for correctly classified instances or providing a quantitative score to assess such deviations.

To address this gap, we introduce the Classification Outlier Variability Score (COVAS), a method that leverages SHAP values to identify instances whose decision paths deviate substantially from the class-specific mean. Since COVAS is based on SHAP values, it too is model- and data-agnostic and can, in principle, be extended to XAI methods with characteristics similar to SHAP. This study demonstrates the applicability of COVAS in two domains: medical diagnosis, using the Breast Cancer Wisconsin Diagnostic dataset [19], and sports science, illustrated with the FIFA 2018 Man of the Match dataset [20], where the availability of large-scale biomechanical datasets remains limited [21]. Since SHAP explanations are available for multiclass settings, COVAS is not inherently limited to binary classification [6]. Yet in this study, we focus on binary case studies.

Unlike classical outlier detection methods, COVAS is not designed to identify mislabeled samples, noise, or anomalies in the input space. Instead, its purpose is to highlight atypical but correctly classified instances whose SHAP-based decision paths deviate from typical class behavior. Such cases may reveal non-obvious patterns that can support further analysis or motivate hypothesis generation beyond simplified assumptions about feature relevance. COVAS, therefore, serves as an explanation-driven knowledge-discovery tool rather than a performance-oriented anomaly detection algorithm.

This work, therefore, focuses on explanation-level variability, not on anomaly detection performance, and evaluates COVAS through qualitative, domain-oriented case analyses.

2. Materials and Methods

All experiments in this work focus on binary classification tasks and serve as illustrative case studies for the proposed framework.

2.1. Hardware and Software

All experiments were conducted on an Apple M3 Max system with 36 GB of unified memory using Python 3.9.15. Key libraries included numpy, pandas, tensorflow-keras, scikit-learn, shap, and matplotlib. A fixed random seed of 100 was used throughout all experiments to ensure reproducibility. A detailed list of software versions and hardware specifications is provided in Appendix A.

2.2. Datasets

Both datasets were processed using the same machine learning (ML) model and similar preprocessing steps. For both datasets, feature and target names were extracted. Since neither dataset included unique instance identifiers, these were created manually to allow detected outliers to be traced back to the original data.

In both cases, a train–test split was performed using the train_test_split function from scikit-learn with a random state of 100 and a test set size of 30% of the full dataset. All features were standardized using the StandardScaler from scikit-learn to improve network performance.

2.2.1. Breast Cancer (BC) Dataset

The breast cancer dataset from Wolberg et al. [19] comprises a total of 569 observations, each represented by 30 quantitative features extracted from digitized images of fine needle aspiration (FNA) biopsies of breast tissue. These features quantify the morphological characteristics of cell nuclei, including parameters such as radius, texture, perimeter, area, and smoothness, calculated using statistical metrics such as the mean, standard error, and extreme values. The dataset supports binary classification, with each sample classified as either malignant (212 cases) or benign (357 cases), and it is widely used for developing and evaluating ML models for breast cancer diagnosis. In this study, it was used to demonstrate the application of COVAS in a medical classification setting.

2.2.2. FIFA 2018 Man of the Match (MotM) Dataset

The FIFA 2018 World Cup match statistics dataset from Kaggle [20] contains information on 64 matches, including a total of 26 features describing team performance (e.g., goals, shots, ball possession, fouls), player performance (e.g., goals, assists, cards), and the label “Man of the Match”. It also includes advanced match statistics such as shots on target, corner kicks, and yellow/red cards. The dataset supports detailed analyses of individual and team performance and enables statistical exploration of factors influencing match outcomes and Man of the Match selections. It was selected because its variables can be readily interpreted using SHAP, thereby facilitating an intuitive demonstration of how COVAS utilizes SHAP values for outlier detection in the sports domain.

2.3. Neural Network Architecture

A compact feedforward neural network was used to demonstrate the COVAS method rather than to optimize predictive performance for a specific dataset. The network consists of three hidden layers with 64, 32, and 32 neurons, respectively, each using rectified linear unit (ReLU) activation. The output layer contains a single neuron with sigmoid activation for binary classification.

Training was performed using the Adam optimizer and binary cross-entropy loss for ten epochs with a batch size of 16 on the standardized input features. No hyperparameter tuning or architecture search was conducted. A simple model was deliberately chosen to facilitate the analysis of the decision-making process. Other architectures or parameter settings might achieve higher accuracy, but performance optimization is outside the scope of this work.

While the core experiments are conducted using a feedforward neural network, COVAS is model-agnostic by construction through SHAP explanations. An additional validation using a tree-based model is provided in Appendix B.

2.4. SHAP Implementation

SHAP values were computed using the shap library [6]. First, an appropriate model explainer was selected and fitted on the scaled training data and the trained neural network. The SHAP values and the corresponding base values were then extracted, where the base value represents the average model prediction. For the binary classification tasks considered here, there is only one base value.

SHAP values quantify the contribution of each feature to the prediction for a specific instance based on cooperative game theory [6]. They indicate how much each feature adds to or subtracts from the base value to yield the final prediction, thereby making the model output more interpretable. For visualization, decision plots were generated using the built-in SHAP functions.

For outlier identification, only correctly classified instances were considered. Model outputs ranged from 0 to 1; values greater than or equal to 0.5 were interpreted as predictions for class label 1, and values below 0.5 as predictions for class label 0. SHAP values, base values, and instance IDs were stored in a class-keyed dictionary based on the indices of the original data. These data structures served as the basis for the subsequent COVAS computation and visualization.
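The filtering and grouping step described above can be sketched as follows. This is a minimal illustration, assuming the SHAP values are already available as an array with one row per test instance; the function name `group_correct_by_class`, the dictionary layout, and the toy numbers are hypothetical and not taken from the authors' repository. Base values are omitted for brevity.

```python
import numpy as np

def group_correct_by_class(probs, y_true, shap_values, instance_ids, cutoff=0.5):
    """Keep correctly classified instances and group their SHAP rows by class.

    Hypothetical helper: name and return layout are illustrative assumptions.
    """
    probs = np.asarray(probs, dtype=float).ravel()
    y_true = np.asarray(y_true).ravel()
    shap_values = np.asarray(shap_values, dtype=float)
    instance_ids = np.asarray(instance_ids)

    # Outputs >= 0.5 are read as class 1, outputs below 0.5 as class 0.
    y_pred = (probs >= cutoff).astype(int)
    correct = y_pred == y_true

    grouped = {}
    for label in np.unique(y_true):
        mask = correct & (y_true == label)
        grouped[int(label)] = {
            "ids": instance_ids[mask],    # trace back to the original data
            "shap": shap_values[mask],    # shape (N_class, M)
        }
    return grouped

# Toy data: four instances, two features (values are illustrative only).
probs = [0.9, 0.5, 0.2, 0.7]
y_true = [1, 1, 0, 0]
shap_values = [[0.3, 0.1], [0.0, 0.05], [-0.2, -0.1], [0.2, 0.1]]
ids = [10, 11, 12, 13]

grouped = group_correct_by_class(probs, y_true, shap_values, ids)
```

In this toy example the last instance (ID 13) is predicted as class 1 but labeled 0, so it is excluded before any COVAS statistics are computed.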

2.5. COVAS Framework

COVAS provides a systematic procedure for detecting anomalous instances by combining SHAP-based explanations with class-specific statistical descriptors of feature contributions. For each class, the distribution of SHAP values is analyzed by computing the mean and standard deviation of each feature’s SHAP values. These statistics are stored and later used to quantify the degree to which a given instance deviates from typical class behavior. Since COVAS operates in the explanation space, in principle, it can be applied to any predictive model for which SHAP explanations can be computed.

We consider two modes for applying COVAS:

Continuous mode: The strength of an outlier is quantified by the absolute deviation from the mean, measured in units of standard deviations.
Threshold mode: A binary matrix is derived, indicating whether the deviation of a feature exceeds a predefined standard deviation threshold.

Before constructing the COVA (Classification Outlier Variability) matrix $K$, all SHAP values are z-transformed. Because SHAP values are feature-specific in scale and therefore not directly comparable across features, this normalization renders the deviations dimensionless and enables contributions from different features to be evaluated on a common scale. Furthermore, COVAS is designed to quantify atypicality in explanation patterns rather than the direction of individual feature contributions. Accordingly, we consider the absolute values of the standardized SHAP deviations, treating unusually strong positive and negative contributions as equally indicative of atypical behavior. The resulting matrix $K \in \mathbb{R}^{N \times M}$ contains $N$ instances and $M$ features, with each entry defined as

$$κ_{P_{n}, F_{m}} = \left| \frac{X_{P_{n}, F_{m}} - μ_{F_{m}}}{σ_{F_{m}}} \right|, \tag{1}$$

where $X_{P_{n}, F_{m}}$ is the SHAP value of feature $F_{m}$ for instance $P_{n}$, and $μ_{F_{m}}$ and $σ_{F_{m}}$ denote the mean and standard deviation of the SHAP values associated with feature $F_{m}$ for the corresponding class. The term $P_{n}$ refers to the n-th instance. The absolute value reflects the magnitude of the deviation irrespective of direction and thus captures the overall “outlier strength” of each feature for a given instance.

The continuous COVA matrix $K_{cont}$ is defined as

$$K_{cont} = \begin{bmatrix} κ_{P_{1}, F_{1}} & κ_{P_{1}, F_{2}} & \dots & κ_{P_{1}, F_{M}} \\ κ_{P_{2}, F_{1}} & κ_{P_{2}, F_{2}} & \dots & κ_{P_{2}, F_{M}} \\ \vdots & \vdots & \ddots & \vdots \\ κ_{P_{N}, F_{1}} & κ_{P_{N}, F_{2}} & \dots & κ_{P_{N}, F_{M}} \end{bmatrix}.$$

To compute class-specific COVAS representations, the SHAP values corresponding to each class are used. Since the SHAP values are stored as matrices, feature names are assigned to each column to ensure correct alignment with their respective statistical distributions.

A scalar COVA score for each instance is obtained by averaging the absolute z-scores across all features:

$$\bar{κ_{P_{n}}} = \frac{1}{M} \sum_{m = 1}^{M} κ_{P_{n}, F_{m}}. \tag{2}$$

This average score reflects the overall degree of abnormality of an instance relative to the class-specific feature distributions.
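Equations (1) and (2) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function name `cova_continuous` is hypothetical, the population standard deviation (NumPy's default, `ddof=0`) is an assumption because the text does not state which estimator is used, and the handling of constant features is likewise an assumption.

```python
import numpy as np

def cova_continuous(shap_class):
    """Return (K_cont, scores) for the SHAP rows of one class.

    shap_class: array of shape (N, M), one row per correctly classified
    instance of the class. Hypothetical helper for illustration.
    """
    S = np.asarray(shap_class, dtype=float)
    mu = S.mean(axis=0)       # per-feature mean of the SHAP values
    sigma = S.std(axis=0)     # per-feature population std (assumption: ddof=0)
    # Assumption: a constant feature has zero deviation everywhere, so
    # dividing by 1 instead of 0 yields 0 rather than NaN.
    safe_sigma = np.where(sigma > 0, sigma, 1.0)
    K = np.abs((S - mu) / safe_sigma)   # Equation (1), element-wise
    scores = K.mean(axis=1)             # Equation (2), one score per instance
    return K, scores

# Toy class with three instances and two features (illustrative values).
S = np.array([[1.0, 0.0],
              [2.0, 0.0],
              [3.0, 6.0]])
K, scores = cova_continuous(S)
most_atypical = int(np.argmax(scores))  # index of the highest COVA score
```

Here the third instance deviates most in both features and therefore receives the highest score. Sorting `scores` in descending order gives the ranking used to pick candidates for closer inspection.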

It is important to emphasize that the aim of COVAS is not to outperform existing anomaly detection algorithms in terms of accuracy. COVAS quantifies deviations in the explanation space rather than in the raw feature space. This distinction reflects its role as a model interpretability tool: the goal is to surface unexpected, meaningful explanation patterns that warrant closer inspection and may support hypothesis formation, rather than to flag erroneous or noisy samples.

2.5.1. Continuous Mode

In the continuous mode, the z-transformed SHAP values of each instance (Equation (1)) are stored in the class-specific matrix $K_{cont}$. Each entry represents the standardized deviation of a feature’s SHAP value from its expected distribution. A scalar outlier score for an instance is then obtained by averaging its absolute z-scores across all features (Equation (2)). This continuous score quantifies the overall atypicality of an instance with respect to its class-specific SHAP behavior.

2.5.2. Threshold Mode

In the threshold mode, the COVA matrix $K_{thresh}$ is constructed by applying a fixed cut-off to the z-transformed SHAP values. For each instance–feature pair, a binary value is assigned based on whether the absolute z-score exceeds a predefined threshold $τ$:

$$κ_{P_{n}, F_{m}} = \begin{cases} 1, & \text{if } \left| \frac{X_{P_{n}, F_{m}} - μ_{F_{m}}}{σ_{F_{m}}} \right| > τ, \\ 0, & \text{otherwise}. \end{cases}$$

The resulting matrix contains a value of 1 if the SHAP value of a feature for a given instance lies outside the threshold range, and a value of 0 if it lies within the range. The threshold $τ$ is typically set based on the desired sensitivity to deviations (e.g., $τ = 2$ for values beyond two standard deviations). The final COVA score for each instance is computed as the mean of the binary indicators across all features, as in Equation (2), and reflects the proportion of features for which the instance is considered an outlier. In addition to the default setting $τ = 2$, a sensitivity analysis across multiple $τ$ values was performed to assess the robustness of the proposed framework (see Section 3.3.1).

2.6. COVAS Workflow

To provide a clearer overview of the COVAS workflow, Figure 1 illustrates the main steps of the COVAS computation pipeline. The corresponding computational procedure is formalized in Algorithm 1, which summarizes the individual processing steps in a concise and reproducible manner.

Algorithm 1 COVAS Computation Framework
Require: Feature set X, model predictions y, SHAP explainer S
Ensure: Outlier score vector $κ$
1:	Compute SHAP values for correctly classified instances: $\hat{S} \leftarrow S (X)$
2:	for each class c do
3:	Extract SHAP values for class: ${\hat{S}}_{c}$
4:	Compute $μ_{c} \leftarrow mean ({\hat{S}}_{c})$ , $σ_{c} \leftarrow std ({\hat{S}}_{c})$
5:	end for
6:	for each instance i do
7:	Compute deviation vector for class c (element-wise absolute value): ${\hat{κ}}_{i} \leftarrow |({\hat{S}}_{i, c} - μ_{c}) / σ_{c}|$
8:	Compute outlier score: $κ_{i} \leftarrow mean ({\hat{κ}}_{i})$
9:	end for
10:	return $κ$

2.7. Custom SHAP Decision Plots

SHAP decision plots were used to visualize how individual features contribute to the model output for correctly classified instances within each class, as illustrated in Figure 2. The y-axis lists input features ordered by their average importance, and the x-axis shows the model output (e.g., class probability) progressing from the base value (gray vertical line) toward the final prediction. Each colored line represents a single instance and traces how feature contributions accumulate.

A custom extension of the standard SHAP decision plot includes the mean SHAP path (solid black line) and the $\pm 2$ standard deviation bounds (dashed green lines), highlighting the central tendency and variability of feature contributions across instances. These additions contextualize individual paths with respect to overall model behavior and facilitate the identification of outliers and atypical decision paths. This visualization format is used consistently in the subsequent figures to interpret model behavior across different classes.
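One way to obtain the mean path and the $\pm 2$ standard deviation bounds is to accumulate the SHAP values feature by feature and summarize the resulting paths across instances. The sketch below is an assumption about how such an overlay could be computed, not a description of the authors' plotting code: the function names are hypothetical, the bounds are taken from the spread of the cumulative paths, and features are accumulated in ascending order of mean absolute SHAP value.

```python
import numpy as np

def decision_paths(base_value, shap_class, feature_order):
    """Cumulative SHAP paths per instance, starting at the base value."""
    S = np.asarray(shap_class, dtype=float)[:, feature_order]
    start = np.full((S.shape[0], 1), float(base_value))
    return np.hstack([start, base_value + np.cumsum(S, axis=1)])

def path_summary(paths, n_std=2.0):
    """Mean path plus lower and upper bounds at n_std standard deviations."""
    mean_path = paths.mean(axis=0)
    spread = paths.std(axis=0)   # assumption: spread of the cumulative paths
    return mean_path, mean_path - n_std * spread, mean_path + n_std * spread

# Toy class: two instances, two features (illustrative values).
S = np.array([[0.1, 0.3],
              [0.1, -0.1]])
base_value = 0.5

# Assumption: accumulate from the least to the most important feature.
order = np.argsort(np.abs(S).mean(axis=0))
paths = decision_paths(base_value, S, order)
mean_path, lower, upper = path_summary(paths)
```

Because SHAP values are additive, the last point of each path equals the base value plus the sum of that instance's SHAP values. The three returned arrays can be drawn on top of the individual paths with any plotting library.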

3. Results

3.1. Classification Performance

For the presented experiments, the full COVAS analysis, including SHAP computation, completed in under one minute per dataset. On the breast cancer (BC) dataset, the model achieved a test accuracy of 97.08%, resulting in 64 correctly classified malignant and 102 correctly classified benign instances.

On the Man of the Match (MotM) dataset, the model reached a test accuracy of 66.67%, with 15 correctly classified instances for the Not MotM class and 11 for the MotM class.

3.2. COVAS Case Studies

COVAS is used to analyze deviations in SHAP-based explanation patterns among correctly classified instances. We report class-wise COVAS scores and SHAP decision plots for both datasets; full COVA matrices are omitted due to size and are available in the public repository (see Code and Data Availability).

3.2.1. Results on the Breast Cancer (BC) Dataset

Figure 2 and Figure 3 visualize class-specific SHAP decision paths for correctly classified instances. The malignant class shows higher dispersion around the mean SHAP path, with several instances deviating noticeably for features such as worst concave points and mean perimeter. In contrast, benign instances cluster more tightly around the mean path, indicating more homogeneous explanation patterns within this class. Table 1 reports representative COVAS scores for both classes and highlights instances with atypical explanation patterns despite correct classification.

Instance-Level Illustration

To illustrate how COVAS supports the analysis of individual samples, we consider a representative instance with a high COVAS score from the malignant class of the breast cancer dataset. As shown in Table 1, this instance (patient 3) is correctly classified but exhibits a markedly atypical explanation pattern, reflected by its elevated $\bar{κ_{P_{n}}}$ score.

Figure 4 visualizes the SHAP-based decision paths for all correctly classified malignant instances, with patient 3 highlighted. While most instances follow a compact, class-specific mean trajectory, the highlighted instance shows pronounced deviations from the mean SHAP path across several high-impact features, including mean perimeter and worst area.

This example demonstrates how COVAS enables the systematic identification of atypical yet correctly classified instances and supports targeted, instance-level inspection of explanation patterns. Such cases may serve as starting points for further domain-specific investigation or hypothesis formulation by experts, without implying any clinical conclusions.

3.2.2. Results on the Man of the Match (MotM) Dataset

For the MotM dataset, the SHAP decision plots (Figure 5 and Figure 6) show larger within-class variability than in the BC case study, particularly for the MotM class. This indicates that correctly classified matches can still exhibit diverse attribution patterns across the available match statistics. Table 2 lists representative high- and low-scoring matches per class, illustrating that COVAS distinguishes typical from atypical explanation patterns within both MotM and Not MotM.

3.3. Robustness Analyses

To strengthen the reliability of the empirical findings, we conducted robustness analyses evaluating the sensitivity of COVAS to key design choices and parameter settings. The following subsection focuses on the threshold parameter $τ$ used in threshold-mode COVAS.

3.3.1. Threshold Sensitivity

We evaluated the robustness of threshold-mode COVAS across $τ \in \{1.0, 1.5, 2.0, 2.5, 3.0\}$ on both datasets. As summarized in Table 3, increasing $τ$ leads to a smooth decrease in the mean and standard deviation of COVAS scores, reflecting increasingly strict deviation criteria.

The stability of the most atypical instances was assessed using the top-10 overlap between consecutive $τ$ values (Table 4). Overlaps are generally high (0.7–1.0), indicating stable identification of atypical instances under moderate threshold variations. A lower overlap (0.40) is observed for the malignant class of the BC dataset when increasing $τ$ from 2.5 to 3.0, which reflects the conservative nature of very high thresholds that suppress borderline deviations rather than an instability of the method. For the MotM dataset at $τ = 3.0$, no feature-level deviations exceed the threshold, resulting in zero COVAS scores and further illustrating the strictness of large threshold values.
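The top-10 overlap used here can be read as the fraction of instance IDs shared by two top-k sets. The following sketch shows one possible computation; the function names are hypothetical, and the tie-breaking rule (stable ordering by position) is an assumption, since threshold-mode scores can contain many ties and the text does not say how they are resolved.

```python
import numpy as np

def top_k_ids(scores, ids, k=10):
    """IDs of the k highest-scoring instances (ties keep their input order)."""
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    return set(np.asarray(ids)[order[:k]].tolist())

def top_k_overlap(scores_a, scores_b, ids, k=10):
    """Fraction of IDs shared by the top-k sets of two score vectors."""
    k = min(k, len(ids))
    shared = top_k_ids(scores_a, ids, k) & top_k_ids(scores_b, ids, k)
    return len(shared) / k

# Toy scores for five instances at two thresholds (illustrative values).
ids = [0, 1, 2, 3, 4]
scores_tau_20 = [0.5, 0.1, 0.4, 0.0, 0.2]
scores_tau_25 = [0.4, 0.0, 0.1, 0.0, 0.3]

overlap = top_k_overlap(scores_tau_20, scores_tau_25, ids, k=2)
```

In this toy case the two top-2 sets are {0, 2} and {0, 4}, so the overlap is 0.5. Applying the same function to each pair of consecutive $τ$ values yields one overlap value per pair.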

3.3.2. Ablation Study

We tested the sensitivity of COVAS to key design choices, including (i) using raw instead of z-standardized SHAP values, and (ii) using global instead of class-conditional feature distributions. Across both datasets, these variants produced identical top-10 selections and near-perfect rank agreement, indicating that the identification of atypical instances is not driven by a specific normalization choice in the studied settings.

3.4. Comparison with Feature-Space Methods

We compared COVAS to Local Outlier Factor (LOF) and a feature-wise z-score baseline by selecting the top-10 outliers among correctly classified instances. For both datasets, the overlap between the COVAS top-10 and the feature-space baselines is zero (Table 5), indicating that COVAS highlights instances that are not classical feature-space outliers. Consistently, LOF and feature-wise z-scores identify samples with higher feature-space extremeness, whereas COVAS outliers show lower mean absolute feature z-scores.

3.5. Impact of Class Imbalance

To assess the impact of class imbalance, we retrained the neural network on the Breast Cancer dataset using a class-weighted loss while keeping the test set unchanged. Because the set of correctly classified instances differs between the balanced and unbalanced models, all comparisons were restricted to the intersection of samples correctly classified in both settings and evaluated separately for each class.

As shown in Table 6, class weighting induces only minor shifts in the COVAS score distributions for both classes. For benign instances ($n = 33$), the mean COVAS score increases slightly from 0.62 to 0.66 with a marginal reduction in variability, while for malignant instances ($n = 16$), the mean increases from 0.73 to 0.74 with a small increase in standard deviation.

Importantly, the relative ranking of instances remains largely preserved. Spearman rank correlations between balanced and unbalanced COVAS scores are high for both benign ($ρ = 0.88$) and malignant ($ρ = 0.86$) classes, and the overlap among the top-10 most atypical samples remains substantial (0.8 and 0.7, respectively). Overall, these results suggest that class imbalance has a limited effect on explanation-space outlier identification in the evaluated setting.
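The comparison between the two training settings involves two steps: restricting both score vectors to the instances that are correctly classified in both settings, and then correlating their ranks. The sketch below illustrates this with NumPy only; the helper names and the toy scores are hypothetical and do not reproduce the values in Table 6.

```python
import numpy as np

def align_on_shared_ids(ids_a, scores_a, ids_b, scores_b):
    """Restrict two score vectors to the IDs present in both settings."""
    lookup_a = dict(zip(ids_a, scores_a))
    lookup_b = dict(zip(ids_b, scores_b))
    shared = sorted(set(lookup_a) & set(lookup_b))
    return shared, [lookup_a[i] for i in shared], [lookup_b[i] for i in shared]

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks

def spearman_rho(a, b):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return float(np.corrcoef(average_ranks(a), average_ranks(b))[0, 1])

# Toy COVAS scores for one class under both settings (illustrative values).
ids_unweighted = [1, 2, 3, 4, 5, 6]
scores_unweighted = [0.62, 0.70, 0.55, 0.90, 0.66, 0.40]
ids_weighted = [2, 3, 4, 5, 6, 7]
scores_weighted = [0.75, 0.50, 0.95, 0.58, 0.60, 0.80]

shared, a, b = align_on_shared_ids(
    ids_unweighted, scores_unweighted, ids_weighted, scores_weighted
)
rho = spearman_rho(a, b)
```

Instances 1 and 7 are each correctly classified in only one setting and are therefore dropped before the correlation is computed. The same aligned vectors can also be used for a top-k overlap comparison.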

4. Discussion

4.1. Network Performance

The neural network achieved a final accuracy of 66.67% on the MotM dataset. It was trained on a total of 89 instances and evaluated on 39 test instances. In contrast, on the breast cancer (BC) dataset, the model reached an accuracy of 97.08%, based on 398 training instances and 171 test instances.

The substantially higher performance on the BC dataset suggests that either the larger amount of data or the more clearly separable class structure supported more stable learning, whereas the smaller and noisier MotM dataset limited the achievable accuracy. Since the aim of this study was to provide a proof of concept for COVAS rather than to develop a highly optimized classifier for each dataset, we deliberately used the same simple network architecture for both tasks to ensure comparability and transparency of the decision process.

4.2. Interpretation of COVAS Results

To clarify the relationship between COVAS scores and predictive uncertainty, we provide an additional analysis in Appendix C, showing that COVAS captures aspects of model behavior that are largely independent of standard uncertainty measures.

COVAS is computed on correctly classified instances by design, as its goal is to identify atypical explanation patterns within the set of successful model decisions. This avoids conflating explanation-space deviations with misclassification effects and ensures that detected outliers reflect unusual but still valid decision paths. A consequence is that the number of analyzable instances depends on predictive accuracy; when accuracy is lower, fewer test samples remain for COVAS analysis, which we report explicitly and consider a practical limitation.

4.2.1. Breast Cancer Dataset

As shown in Table 1, patient 3 exhibits the highest average deviation from the mean classification path, with a COVA score of 1.994. This instance is therefore a promising candidate for further investigation. A closer analysis of the features that contributed most strongly to the classification, particularly those falling outside the $\pm 2$ standard deviation bounds in the SHAP decision plots (Figure 2 and Figure 3), could support the formulation of new hypotheses for breast cancer research.

By systematically highlighting such borderline cases, COVAS may contribute to a more comprehensive understanding of breast cancer characteristics [8]. Rather than focusing solely on typical presentations, COVAS directs attention towards atypical but correctly classified instances, which may reveal rare constellations of features or alternative decision pathways that remain invisible in aggregate analyses [2].

4.2.2. Man of the Match Dataset

For the MotM dataset, Table 2 shows that the selection of the Man of the Match in Sweden’s game on 3 July 2018 exhibits the greatest deviation from the average MotM decision. This match represents a strong outlier in terms of feature attributions and is therefore a suitable candidate for more detailed analysis. In this context, COVAS can support the formulation of new hypotheses about the relevance of individual performance statistics in player evaluation.

By examining such borderline games, where the statistical profile of the MotM differs markedly from the typical pattern, COVAS may help to uncover alternative performance constellations that lead to recognition; for example, roles that are less focused on scoring but highly influential in other metrics. This could advance research in performance analysis and support more nuanced discussions about what constitutes an “outstanding” performance in football.

4.3. Potential Applications of COVAS

COVAS is a versatile algorithm that can be applied across multiple domains. Each domain offers distinct opportunities for leveraging outlier-aware explanations to generate new insights. Below, we outline three areas in which COVAS shows particular potential.

4.3.1. Medical Domain

In the medical domain, COVAS can be used to focus on unique or specific symptom patterns in patients. In Figure 3, for example, there is at least one instance that exceeds the $\pm 2$ standard deviation band yet ends around the mean model output. The decision path of this instance clearly differs from the average path, but the patient is still correctly classified as benign.

Such cases may motivate new hypotheses; for instance, high values of features such as worst radius do not consistently indicate malignancy in breast cancer. More generally, COVAS has the potential to yield new insights into disease manifestations by emphasizing atypical but valid clinical presentations. By focusing on abnormal patient profiles that are nonetheless correctly classified, COVAS may facilitate the identification of additional symptoms, variant patterns, or subgroups, thereby supporting diagnosis across a broader patient spectrum.

4.3.2. Sports Domain

A further potential application lies in athlete performance and movement analysis. A common misconception in sports is that a single, idealized movement pattern exists and should be replicated by all athletes [22]. In contrast, modern approaches such as nonlinear pedagogy encourage the exploration of individualized, functional movement solutions [23].

With COVAS, movement patterns can be analyzed in the context of performance outcomes; for example, the run-up and jump phases in the high jump in relation to the vaulted height. Athletes achieving exceptional results, such as world-class performances, may exhibit movement patterns that systematically differ from the average. A comprehensive COVAS analysis could identify key movement components in such outlier athletes and inform new training approaches. Similarly, this concept could be applied to general movement analysis in rehabilitation or talent development.

4.3.3. COVAS for Model Enhancement

Beyond domain-specific interpretation, COVAS also offers potential for model development in ML. By focusing on edge cases, it can reveal previously overlooked aspects of the dataset and highlight regions of the input space where model behavior differs from the majority. Because of its model-agnostic design through the use of SHAP values, COVAS can, in principle, be applied to a wide variety of ML models, including tree-based methods, neural networks, and linear models. For linear models, SHAP attributions are closely related to weighted feature values, and explanation patterns may therefore reduce to simpler statistical deviations that can be computed directly from the data. In this sense, COVAS can be viewed as a generalization that becomes particularly informative for non-linear models, where explanation patterns are no longer trivially linked to the input space.
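The reduction for linear models can be checked numerically. The following minimal sketch assumes the closed-form SHAP values of a linear model under feature independence, $φ_j = w_j (x_j - E[x_j])$, and uses synthetic data; the weights, the background mean, and the helper `mean_abs_z` are hypothetical and not taken from the COVAS repository.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the instances of one class (5 features, different scales).
X_class = rng.normal(size=(80, 5)) * np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Hypothetical linear model f(x) = w @ x + b; all weights are non-zero.
w = np.array([0.5, -1.0, 2.0, 0.1, -0.3])
background_mean = rng.normal(size=5)  # stands in for E[x] over a background set

# Closed-form SHAP values of a linear model, assuming independent features.
phi = w * (X_class - background_mean)


def mean_abs_z(values):
    """Average absolute z-score per row, standardized column-wise."""
    mu = values.mean(axis=0)
    sigma = values.std(axis=0)
    return (np.abs(values - mu) / sigma).mean(axis=1)


score_from_shap = mean_abs_z(phi)          # score computed in explanation space
score_from_features = mean_abs_z(X_class)  # same statistic computed on raw features

max_abs_diff = float(np.max(np.abs(score_from_shap - score_from_features)))
print(max_abs_diff)
```

Under these assumptions, the weight $w_j$ and the background mean cancel in the class-conditional standardization, so both scores coincide up to floating-point error. For non-linear models no such cancellation is available, which is where the explanation-space score carries additional information.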

When examining a confusion matrix, COVAS can be used to analyze false positives (FP) and false negatives (FN) in more detail by identifying which misclassified instances are particularly atypical compared with the average FP or FN case. Combined with SHAP, this enables a feature-level decomposition of why the model failed for specific edge cases. Such insights can guide architectural modifications (e.g., additional layers, regularization, alternative activation functions) or data-centric interventions (e.g., targeted data collection or augmentation) to improve model robustness and fairness [2].
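As an illustration of this use case, the sketch below ranks misclassified instances by their deviation from the average FP or FN explanation. It is an assumption-laden extension rather than part of the evaluated method, which is defined on correctly classified instances; the function names and the label encoding (1 = positive, 0 = negative) are hypothetical.

```python
import numpy as np


def mean_abs_z(phi, eps=1e-12):
    """Average absolute z-score of SHAP values per instance (continuous mode)."""
    mu = phi.mean(axis=0)
    sigma = np.maximum(phi.std(axis=0), eps)  # guard against constant columns
    return (np.abs(phi - mu) / sigma).mean(axis=1)


def rank_errors(shap_values, y_true, y_pred):
    """Rank FP and FN instances by atypicality within their own error group."""
    shap_values = np.asarray(shap_values, dtype=float)
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    groups = {
        "FP": (y_true == 0) & (y_pred == 1),
        "FN": (y_true == 1) & (y_pred == 0),
    }
    ranking = {}
    for name, mask in groups.items():
        idx = np.flatnonzero(mask)
        if idx.size < 2:  # a group statistic needs at least two instances
            ranking[name] = []
            continue
        scores = mean_abs_z(shap_values[idx])
        order = np.argsort(-scores)  # most atypical first
        ranking[name] = [(int(idx[i]), float(scores[i])) for i in order]
    return ranking


# Toy example: instance 2 is an unusual false positive.
toy_shap = [[0, 0], [0, 0], [3, 0], [0, 0], [1, 1], [-1, -1], [0, 0], [0, 0]]
toy_true = [0, 0, 0, 0, 1, 1, 1, 1]
toy_pred = [1, 1, 1, 0, 0, 0, 1, 1]
toy_ranking = rank_errors(toy_shap, toy_true, toy_pred)
```

The top entries of each list are the candidates for a feature-level SHAP inspection. With very small error groups the group mean and standard deviation are unstable, so such rankings should be read with caution.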

Crucially, COVAS is not intended as a competitive outlier detection method. Its strength lies in identifying explanation-level outliers—instances whose SHAP trajectories diverge from class-typical decision paths while still being classified correctly. Such borderline cases are often the most informative for domain experts, as they may reveal alternative mechanisms, rare manifestations, or overlooked feature interactions. Consequently, COVAS functions primarily as a tool for explanation-based hypothesis generation rather than as a diagnostic or anomaly-screening system.

4.4. Limitations and Future Directions

In the current formulation, COVAS depends on SHAP values as input, and only through this does it gain its model- and data-agnostic characteristics. The current formulation implicitly assumes that standardized SHAP value distributions provide a meaningful measure of deviation; this assumption may be less appropriate in domains with highly skewed or multi-modal attribution distributions. In addition, the present study focuses on correctly classified instances and therefore does not directly address model calibration or misclassification patterns. Extending COVAS to jointly analyze correctly and incorrectly classified cases could provide a richer perspective on model behavior.

While the computational effort required to obtain SHAP values was moderate for the tabular case studies analyzed here, extending the evaluation to high-dimensional data or temporally ordered inputs such as time series or sequence models would substantially increase computational demands. As a consequence, obtaining explanations in real-time or low-latency application scenarios may not always be feasible. Prior work has demonstrated that applying SHAP-based explanations to temporal models requires specialized extensions, such as TimeSHAP, which computes feature-, timestep-, and cell-level attributions for ordered inputs [24], or WindowSHAP, which improves efficiency by partitioning sequences into time windows [25]. These approaches highlight both the feasibility and the increased computational cost of explainability in temporal domains, which was beyond the scope of the present study.

A seed stability analysis further indicates that the identified explanation-space outliers remain consistent across different random initializations, suggesting that the reported patterns are not driven by stochastic effects associated with model initialization or network size (see Appendix C (Seed Stability Analysis on the MotM Dataset)).

Future work includes extending the empirical evaluation of COVAS to multiclass and non-binary classification settings, as well as to additional model classes and data domains. While SHAP was chosen due to its wide use and model-agnostic properties, the general idea of COVAS could also be applied to other explainability methods with similar characteristics, such as Integrated Gradients. Furthermore, investigating the integration of COVAS into interactive analysis workflows for domain experts represents an important direction for future research.

We emphasize that the presented examples are intended to illustrate the type of deviations identified by COVAS rather than to provide validated clinical or domain-specific conclusions. Assessing the practical relevance of such deviations requires expert evaluation and is beyond the scope of this work.

5. Conclusions

In this study, we introduced COVAS, a framework designed to highlight atypical but correctly classified instances by leveraging SHAP-based explanations, from which it inherits its model- and data-agnostic character. By quantifying the deviation of individual instances from class-specific SHAP distributions, COVAS provides a structured approach for identifying cases that follow uncommon decision paths and therefore represent candidates for deeper investigation. Across two distinct domains—medical diagnosis and sports analytics—COVAS demonstrated its potential to reveal instance-level variability that is not captured by standard performance metrics or aggregate interpretability methods. These findings suggest that COVAS can support hypothesis generation, enhance model inspection, and complement existing XAI techniques by directing attention toward informative outliers.

The core contribution of COVAS lies in using explanation-based deviations to support scientific hypothesis generation rather than in detecting mislabeled or erroneous data points.

Future work includes extending COVAS to additional data modalities, such as sequential or high-dimensional settings, where explanation methods pose additional computational challenges. Moreover, the current formulation focuses on correctly classified instances and does not directly address misclassification patterns or model calibration.

Looking ahead, the broader impact of COVAS will depend on its application across a wider range of datasets, model architectures, and domains. We strongly encourage other researchers to apply and evaluate COVAS within their respective fields to further test, refine, and extend the method. Future work should investigate alternative attribution backends, assess sensitivity across model types, and explore adaptations of COVAS for more complex data modalities. Such efforts will help establish the generality of COVAS and support its integration into practical machine learning workflows.

6. Reproducibility Statement

All datasets used in this study are publicly available benchmark datasets and are cited in Section 2. The complete codebase for computing COVAS, including model training, SHAP value computation, and post-processing pipelines for generating figures and tables, is available in a public repository at https://github.com/SebClone/covas (accessed on 8 January 2026). This provides all necessary components to reproduce the reported experiments and to apply COVAS to additional datasets.

Author Contributions

Conceptualization, S.R., A.C. and D.F.; methodology, S.R. and D.F.; software, S.R. and S.O.; validation, S.R., A.C. and S.O.; formal analysis, S.R. and D.F.; investigation, S.R.; resources, U.H. and D.F.; data curation, S.R.; writing—original draft preparation, S.R.; writing—review and editing, A.C., S.O., U.H. and D.F.; visualization, S.R.; supervision, U.H., A.C. and D.F.; project administration, D.F.; funding acquisition, D.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All datasets used in this study are publicly available: Breast Cancer Wisconsin (Diagnostic) dataset [19], and the FIFA 2018 Man of the Match dataset [20]. All code used for model training, SHAP computation, COVAS implementation, and figure generation is available at: https://github.com/SebClone/covas. No additional data were generated in this study.

Acknowledgments

Generative artificial intelligence tools were used to assist with language editing and structural refinement of the manuscript. All methodological descriptions, analyses, results, and conclusions were conceived, implemented, and verified by the authors. No data, figures, numerical results, or scientific content were generated or modified using generative AI.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AI	Artificial Intelligence
BC	Breast Cancer (Dataset)
COVAS	Classification Outlier Variability Score
COVA	Classification Outlier Variability
FNA	Fine Needle Aspiration
FN	False Negative
FP	False Positive
ID	Identifier
ML	Machine Learning
MotM	Man of the Match
Not MotM	Not Man of the Match
ReLU	Rectified Linear Unit
SHAP	Shapley Additive Explanations
Std/SD	Standard Deviation
XAI	Explainable Artificial Intelligence
z-score	Standardized score (z-transformed value)
$τ$	Threshold parameter in COVA threshold mode

Appendix A. Hardware and Software Details

All experiments were conducted on an Apple M3 Max processor with 36 GB of unified memory running macOS 15.5. Computations were implemented in Visual Studio Code (version 1.99.2) using Python 3.9.15. The following Python packages and versions were used throughout the experiments:

numpy (version 1.23.5)
pandas (version 2.2.1)
tensorflow-keras (version 2.12)
scikit-learn (version 1.3)
shap (version 0.42.1)
matplotlib (version 3.8)

To ensure reproducibility, a fixed random seed of 100 was used across all experiments unless stated otherwise.

Appendix B. COVAS on a Tree-Based Model

Appendix B.1. Random Forest Setup

We conducted an additional validation on the Breast Cancer dataset using a Random Forest as a representative tree-based model to assess the compatibility of COVAS across different model classes rather than to benchmark predictive performance.

The Random Forest was trained with 300 trees using default Gini-based splits and a fixed random seed for reproducibility. A stratified 70/30 train–test split was applied, and no class weighting or hyperparameter tuning was performed. Feature standardization was retained for consistency with the neural network experiments, although it is not required for tree-based models. The model configuration was deliberately kept simple to provide a stable basis for explanation-space analysis.
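The setup described above could be reproduced along the following lines. This is a sketch with scikit-learn, not the code from the repository; in particular, the seed value (taken from Appendix A) and fitting the scaler on the training split only are assumptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

SEED = 100  # assumption: the default seed reported in Appendix A

X, y = load_breast_cancer(return_X_y=True)

# Stratified 70/30 train-test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=SEED
)

# Standardization is kept for consistency with the neural network experiments,
# although tree-based models do not require it.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# 300 trees, Gini splits, no class weighting, no hyperparameter tuning.
rf = RandomForestClassifier(n_estimators=300, criterion="gini", random_state=SEED)
rf.fit(X_train_s, y_train)

test_accuracy = rf.score(X_test_s, y_test)
correct_mask = rf.predict(X_test_s) == y_test  # instances eligible for COVAS
```

Only the test instances selected by `correct_mask` would enter the explanation-space analysis, in line with the restriction to correctly classified cases.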

SHAP values were computed using TreeExplainer and subsequently processed using the same COVAS pipeline as for the neural network, including class-conditional normalization and aggregation. This setup enables a direct comparison of explanation-space outlier identification across model classes.

Appendix B.2. Comparison Across Model Classes

To compare explanation-space outlier rankings obtained from the feedforward neural network (FFNN) and the Random Forest (RF), the analysis was restricted to instances correctly classified by both models, thereby avoiding confounding effects due to misclassification.

Agreement between the two rankings was assessed using two complementary measures: the top-10 overlap, which captures consistency among the most atypical instances, and the Spearman rank correlation ($ρ$), which quantifies global rank agreement. The Spearman correlation is invariant to score scaling and distributional differences, enabling a meaningful comparison of explanation-based rankings across model classes.
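Both agreement measures can be sketched with the Python standard library. The function names, the dictionary-based score format, and the toy scores are illustrative assumptions; the handling of fewer than $k$ common samples (overlap computed over all available samples) is likewise an assumption.

```python
from statistics import mean


def average_ranks(values):
    """Ranks starting at 1; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(a, b):
    """Spearman correlation as the Pearson correlation of the ranks."""
    ra, rb = average_ranks(a), average_ranks(b)
    ma, mb = mean(ra), mean(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        return float("nan")  # undefined if one ranking has no variability
    return cov / (var_a * var_b) ** 0.5


def top_k_overlap(scores_a, scores_b, k=10):
    """Fraction of shared instances among the k highest-scoring ones."""
    k = min(k, len(scores_a), len(scores_b))
    top_a = set(sorted(scores_a, key=scores_a.get, reverse=True)[:k])
    top_b = set(sorted(scores_b, key=scores_b.get, reverse=True)[:k])
    return len(top_a & top_b) / k


# Toy scores for two models; only instances correct under both are compared.
ffnn_scores = {"p1": 2.0, "p2": 1.5, "p3": 0.4, "p4": 0.3, "p9": 1.1}
rf_scores = {"p1": 1.8, "p2": 0.5, "p3": 0.9, "p4": 0.2}
common = sorted(set(ffnn_scores) & set(rf_scores))
rho = spearman_rho([ffnn_scores[i] for i in common], [rf_scores[i] for i in common])
overlap = top_k_overlap(
    {i: ffnn_scores[i] for i in common}, {i: rf_scores[i] for i in common}, k=2
)
```

Because only ranks enter both measures, the comparison does not depend on the different output scales of the two models.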

Appendix B.3. Results

Table A1. Comparison of COVAS rankings obtained from a feedforward neural network (FFNN) and a Random Forest (RF) on the Breast Cancer dataset. The comparison is restricted to samples correctly classified by both models.

Class	Common Samples	Top-10 Overlap	Spearman $ρ$	p-Value
Malignant	16	0.80	0.73	0.001
Benign	32	0.70	0.72	$3 \times 10^{- 6}$

The results indicate substantial agreement between the two model classes. Despite local ranking differences, highly atypical instances are largely identified consistently, and moderate to high rank correlations confirm that the overall ordering of explanation-space atypicality is largely preserved across models.

Appendix B.4. Decision Plots of the Random Forest Model

SHAP-based decision plots (Figures A1 and A2) were generated for the Random Forest model. In the Random Forest setting, decision paths for both benign and malignant classes converge toward high output values, as the model output represents the predicted class probability rather than a class-specific margin. Consequently, correctly classified instances of both classes are associated with high confidence scores, resulting in trajectories that approach values close to one.

Figure A1. SHAP decision plot for the benign class of the scikit-learn breast cancer dataset using a Random Forest classifier. The plot shows cumulative SHAP contributions of input features for all correctly classified benign instances, with features ordered by average contribution magnitude. The horizontal axis represents the model output, progressing from the SHAP base value to the final prediction. Individual instance paths are overlaid with a mean SHAP trajectory (solid black line) and $\pm 2$ standard deviation bands (green dashed lines), summarizing central tendencies and variability in explanation-space decision paths.

Figure A2. SHAP decision plot for the malignant class of the scikit-learn breast cancer dataset using a Random Forest classifier. The plot depicts cumulative SHAP contributions of input features for all correctly classified malignant instances, with features ordered by average contribution magnitude. The horizontal axis shows the progression of the model output from the SHAP base value to the final prediction. Individual decision paths are overlaid with a mean SHAP trajectory (solid black line) and $\pm 2$ standard deviation bands (green dashed lines), summarizing central tendencies and variability in explanation-space decision paths.

Appendix C. Relationship Between COVAS Scores and Prediction Uncertainty

COVAS is designed to quantify atypicality in explanation space rather than predictive uncertainty. To examine its relationship to common uncertainty measures, we compared COVAS scores with simple uncertainty proxies derived from model predictions.

For binary classification, prediction entropy and a margin-based distance to the decision boundary were used as uncertainty measures. The analysis was restricted to correctly classified test instances, consistent with the definition of COVAS. For each class, Spearman rank correlations were computed to assess monotonic relationships without assuming specific score distributions.
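For a predicted positive-class probability $p$, the two uncertainty proxies might be computed as follows. The exact definitions used in the study are not spelled out in this section, so the base-2 entropy and the margin $|p - 0.5|$ are assumptions, as are the function names.

```python
import math


def binary_entropy(p, eps=1e-12):
    """Entropy (in bits) of a binary prediction with positive-class probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def margin_to_boundary(p, threshold=0.5):
    """Distance of the predicted probability from the decision threshold."""
    return abs(p - threshold)


# Hypothetical predicted probabilities of three correctly classified instances.
probabilities = [0.99, 0.80, 0.55]
entropies = [binary_entropy(p) for p in probabilities]
margins = [margin_to_boundary(p) for p in probabilities]
```

High entropy and a small margin both indicate an uncertain prediction. Either list can then be rank-correlated with the COVAS scores of the same instances, separately for each class.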

Table A2. Spearman rank correlations between COVAS scores and prediction uncertainty measures for correctly classified instances.

Class	n	$ρ$ (COVAS, Entropy)	$ρ$ (COVAS, Margin)
Malignant	64	$- 0.14$	$- 0.15$
Benign	102	$- 0.20$	$- 0.20$

Across both classes, only weak correlations are observed, indicating that high COVAS scores do not primarily reflect prediction uncertainty. Instead, COVAS captures deviations in explanation patterns that can occur even for confident predictions, supporting its interpretation as a complementary explanation-space measure.

Seed Stability Analysis on the MotM Dataset

To assess the sensitivity of COVAS to random model initialization, the MotM experiment was repeated using three different random seeds (0, 50, and 100). For each seed, the full training, explanation, and COVAS computation pipeline was executed independently. Stability was evaluated based on rank consistency among samples correctly classified in all runs.

As summarized in Table A3, the top-ranked COVAS instances for the MotM class are identical across all seeds, resulting in a top-10 overlap of 1.0. Since no rank variability is observed among the common samples, the Spearman correlation is undefined in this case.

Table A3. Seed stability analysis for the MotM dataset. Tested on seeds 0, 50 and 100.

Class	$n_{common}$	Spearman $ρ$	Top-10 Overlap
MotM	8	n/a	1.0

Overall, these results indicate that the explanation-space deviations identified by COVAS are robust to random initialization effects, despite the relatively large network architecture.

References

1. Clement, T.; Kemmerzell, N.; Abdelaal, M.; Amberg, M. XAIR: A Systematic Metareview of Explainable AI (XAI) Aligned to the Software Development Process. Mach. Learn. Knowl. Extr. 2023, 5, 78–108.
2. Garcke, J.; Roscher, R. Explainable Machine Learning. Mach. Learn. Knowl. Extr. 2023, 5, 169–170.
3. Kalasampath, K.; Spoorthi, K.N.; Sajeev, S.; Kuppa, S.S.; Ajay, K.; Maruthamuthu, A. A Literature Review on Applications of Explainable Artificial Intelligence (XAI). IEEE Access 2025, 13, 41111–41140.
4. Mersha, M.; Lam, K.; Wood, J.; AlShami, A.K.; Kalita, J. Explainable artificial intelligence: A survey of needs, techniques, applications, and future direction. Neurocomputing 2024, 599, 128111.
5. Sadeghi, Z.; Alizadehsani, R.; CIFCI, M.A.; Kausar, S.; Rehman, R.; Mahanta, P.; Mahanta, P.; Almasri, A.; Alkhawaldeh, R.S.; Hussain, S.; et al. A review of Explainable Artificial Intelligence in healthcare. Comput. Electr. Eng. 2024, 118, 109370.
6. Lundberg, S.M.; Lee, S.I. A Unified Approach to Interpreting Model Predictions. Adv. Neural Inf. Process. Syst. 2017, 30.
7. Molnar, C. Interpreting Machine Learning Models with SHAP: A Guide with Python Examples and Theory on Shapley Values, 1st ed.; Christoph Molnar: Munich, Germany, 2023.
8. Buhrmester, V.; Münch, D.; Arens, M. Analysis of Explainers of Black Box Deep Neural Networks for Computer Vision: A Survey. Mach. Learn. Knowl. Extr. 2021, 3, 966–989.
9. Janoudi, G.; Uzun (Rada), M.; Fell, D.B.; Ray, J.G.; Foster, A.M.; Giffen, R.; Clifford, T.; Walker, M.C. Outlier analysis for accelerating clinical discovery: An augmented intelligence framework and a systematic review. PLoS Digit. Health 2024, 3, e0000515.
10. Cai, J.; Hu, W.; Yang, Y.; Yan, H.; Chen, F. Outlier detection in spatial error models using modified thresholding-based iterative procedure for outlier detection approach. BMC Med. Res. Methodol. 2024, 24, 89.
11. Pearson, T.; Pons, R.; Ghaoui, R.; Sue, C. Genetic mimics of cerebral palsy. Mov. Disord. 2019, 34, 625–636.
12. Suppa, A.; Asci, F.; Kamble, N.; Chen, K.H.; Sciacca, G.; Merchant, S.H.; Tijssen, M.; Chen, R.; Hallett, M.; Pal, P. Neurophysiology of Atypical Parkinsonian Syndromes: A Study Group Position Paper. Mov. Disord. 2025, 40, 1451–1510.
13. Datta, A.; Sen, S.; Zick, Y. Algorithmic Transparency via Quantitative Input Influence. In Transparent Data Mining for Big and Small Data; Springer: Berlin/Heidelberg, Germany, 2017; pp. 71–94.
14. De Maesschalck, R.; Jouan-Rimbaud, D.; Massart, D.L. The Mahalanobis distance. Chemom. Intell. Lab. Syst. 2000, 50, 1–18.
15. Breunig, M.; Kriegel, H.; Ng, R.; Sander, J. LOF: Identifying Density-Based Local Outliers. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, New York, NY, USA, 15–18 May 2000; pp. 93–104.
16. Ballegeer, M.; Bogaert, M.; Benoit, D.F. Evaluating the Stability of Model Explanations in Instance-Dependent Cost-Sensitive Credit Scoring. Eur. J. Oper. Res. 2025, 326, 630–640.
17. Chen, Z.; Lessmann, S.; Baesens, B. Interpretable Machine Learning for Imbalanced Credit Scoring. Eur. J. Oper. Res. 2023, 310, 346–365.
18. Saarela, M. Recent Applications of Explainable Artificial Intelligence: A Systematic Review. Appl. Sci. 2024, 14, 8884.
19. Wolberg, W.; Mangasarian, O.; Street, N. Breast Cancer Wisconsin (Diagnostic); UCI Machine Learning Repository: Irvine, CA, USA, 1993.
20. Mathan, J.; Community, K. FIFA 2018 Match Statistics: Man of the Match Dataset. 2018. Available online: https://www.kaggle.com/datasets/mathan/fifa-2018-match-statistics (accessed on 20 June 2025).
21. Friemert, D.; Schnur, D.; Runkel, S.; Borsch, J.; Karamanidis, K.; Dellen, B.; Thieme, L.; Fiedler, A.; Jaeckel, U.; Hartmann, U. Limitations of Public Biomechanical Movement Datasets for Deep Learning: Issues of Metadata, Standardization, and Variety in Motion Types. medRxiv 2025.
22. Seifert, L.; Button, C.; Davids, K. Key Properties of Expert Movement Systems in Sport. Sports Med. 2013, 43, 167–178.
23. Correia, V.; Carvalho, J.; Araujo, D.; Pereira, E.; Davids, K. Principles of nonlinear pedagogy in sport practice. Phys. Educ. Sport Pedagog. 2018, 24, 117–132.
24. Bento, J.; Saleiro, P.; Cruz, A.F.; Figueiredo, M.A.T.; Bizarro, P. TimeSHAP: Explaining Recurrent Models through Sequence Perturbations. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, Virtual, 14–18 August 2021; pp. 2565–2573.
25. Nayebi, A.; Tipirneni, S.; Reddy, C.K.; Foreman, B.; Subbian, V. WindowSHAP: An Efficient Framework for Explaining Time Series Classifiers. J. Biomed. Inform. 2023, 145, 104435.

Figure 1. Overview of the COVAS computation pipeline. The process begins with SHAP values for correctly classified instances, from which feature-wise distributions (mean and standard deviation) are derived. These statistics are used to compute the COVA matrix either in continuous mode (absolute z-scores) or threshold mode (binary outlier flags). A final per-instance score is obtained by averaging across features, providing a class-specific measure of outlier strength.
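The pipeline summarized in Figure 1 can be sketched in a few lines of NumPy. This is an illustrative reading of the caption, not the reference implementation from the repository: the function names, the use of the population standard deviation, the guard `eps` for constant features, and the default threshold value are assumptions.

```python
import numpy as np


def cova_matrix(shap_values, mode="continuous", tau=2.0, eps=1e-12):
    """Per-feature deviation matrix for the instances of one class.

    shap_values: array of shape (n_instances, n_features) containing the SHAP
    values of correctly classified instances of a single class.
    """
    phi = np.asarray(shap_values, dtype=float)
    mu = phi.mean(axis=0)                     # feature-wise mean SHAP value
    sigma = np.maximum(phi.std(axis=0), eps)  # feature-wise standard deviation
    abs_z = np.abs(phi - mu) / sigma          # absolute z-scores
    if mode == "continuous":
        return abs_z
    if mode == "threshold":
        return (abs_z > tau).astype(float)    # binary outlier flags
    raise ValueError("mode must be 'continuous' or 'threshold'")


def covas_scores(shap_values, **kwargs):
    """Per-instance score: average of the COVA matrix across features."""
    return cova_matrix(shap_values, **kwargs).mean(axis=1)


# Toy class with four instances and two features; the last instance deviates.
toy_shap = np.array([[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [4.0, 0.0]])
continuous = covas_scores(toy_shap)
thresholded = covas_scores(toy_shap, mode="threshold", tau=1.5)
```

Sorting the instances of a class by `continuous` in descending order yields a ranking of the kind reported in Tables 1 and 2, while the threshold mode reports the fraction of features whose attribution lies beyond $τ$ standard deviations.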

Figure 2. SHAP decision plot for the malignant class of the scikit-learn breast cancer dataset. The figure illustrates how individual features contribute to the model’s output for all correctly classified malignant instances. The y-axis ranks the input features by their average contribution magnitude, while the x-axis represents the cumulative model output, progressing from the base value (gray vertical line) toward the final prediction. Each blue line corresponds to a single instance, visualizing the sequential contribution of each feature to the final classification. To enhance interpretability, a custom mean SHAP path is included as a solid black line, along with green dashed lines indicating $\pm 2$ standard deviations, providing insight into both central tendencies and variability among instances. This visualization supports the identification of outliers and atypical decision paths within the malignant class.

Figure 3. SHAP decision plot for the benign class of the scikit-learn breast cancer dataset. The figure illustrates how individual features contribute to the model’s output for all correctly classified benign instances. The y-axis ranks the input features by their average contribution magnitude, while the x-axis represents the cumulative model output, progressing from the base value (gray vertical line) toward the final prediction. Each magenta line corresponds to a single instance, visualizing the sequential contribution of each feature to the final classification. A custom mean SHAP path is included as a solid black line, together with green dashed lines indicating $\pm 2$ standard deviations. This visualization supports the identification of outliers and atypical decision paths within the benign class.

Figure 4. The plot shows SHAP-based decision paths for all correctly classified malignant instances. Thin blue lines represent individual instances, the solid black line denotes the mean SHAP path, and the dashed green lines indicate the ±2 standard deviation envelope across instances. The highlighted yellow dashed line corresponds to patient 3. The numerical annotations along this path indicate the SHAP contribution values of the highlighted instance at each feature step.

Figure 5. SHAP decision plot for the Not MotM class of the FIFA 2018 Man of the Match dataset. The figure illustrates how individual match statistics contribute to the model’s output for all correctly classified Not Man of the Match instances. The y-axis ranks the input features by their average contribution magnitude, while the x-axis represents the cumulative model output, progressing from the base value (gray vertical line) toward the final prediction. Each blueish line corresponds to a single instance, visualizing the sequential contribution of each feature to the final classification. A custom mean SHAP path is included as a solid black line, along with green dashed lines indicating $\pm 2$ standard deviations, providing insight into both central tendencies and variability among instances.

Figure 6. SHAP decision plot for the MotM class of the FIFA 2018 Man of the Match dataset. The figure illustrates how individual match statistics contribute to the model’s output for all correctly classified Man of the Match instances. The y-axis ranks the input features by their average contribution magnitude, while the x-axis represents the cumulative model output, progressing from the base value (gray vertical line) toward the final prediction. Each magenta line corresponds to a single instance, visualizing the sequential contribution of each feature to the final classification. A custom mean SHAP path is included as a solid black line, together with green dashed lines indicating $\pm 2$ standard deviations, supporting the identification of outliers and atypical decision paths within the MotM class.

Table 1. Overview of the COVA score for selected instances from the breast cancer dataset, sorted by class. The table presents exemplary instances from the scikit-learn breast cancer dataset, ordered by their class-specific Classification Outlier Variability Score (COVA score). The COVA score $\bar{κ_{P_{n}}}$ reflects the average absolute z-score of SHAP values across all features, quantifying the degree of deviation from typical class-specific decision behavior. Higher values indicate stronger outlier characteristics. Instances from both the malignant and benign classes are shown, with vertical dots indicating omitted intermediate values for clarity.

ID	COVA Score $\bar{κ_{P_{n}}}$	Class
patient 3	1.994	malignant
patient 41	1.582	malignant
patient 197	1.418	malignant
patient 261	1.394	malignant
patient 31	1.260	malignant
⋮	⋮	⋮
patient 323	0.257	malignant
patient 152	2.889	benign
patient 157	2.137	benign
patient 484	1.791	benign
patient 225	1.582	benign
patient 455	1.492	benign
⋮	⋮	⋮
patient 175	0.255	benign
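
The score described in the caption of Table 1 can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a matrix `shap_values` of shape (instances, features) holding the SHAP values of the correctly classified instances of one class, standardizes each feature column with the class mean and standard deviation, and averages the absolute z-scores per instance. The function name `cova_scores`, the `eps` guard, and the toy data are hypothetical.

```python
import numpy as np


def cova_scores(shap_values: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Mean absolute z-score of SHAP values per instance, within one class.

    shap_values: array of shape (n_instances, n_features) with the SHAP
    values of the instances belonging to a single class.
    """
    mu = shap_values.mean(axis=0)            # class-specific mean per feature
    sigma = shap_values.std(axis=0)          # class-specific std per feature
    z = (shap_values - mu) / (sigma + eps)   # eps guards against zero std (assumption)
    return np.abs(z).mean(axis=1)            # average over all features


# Toy data for illustration only; not taken from the paper.
rng = np.random.default_rng(0)
toy_shap = rng.normal(size=(50, 8))
toy_shap[7] += 4.0                           # make instance 7 atypical on purpose

scores = cova_scores(toy_shap)
ranking = np.argsort(scores)[::-1]           # most atypical instance first
```

Sorting the instances of each class by this score in descending order gives a ranking of the kind shown in the table, with the most atypical explanations at the top.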

Table 2. Overview of the COVA score for selected instances from the FIFA 2018 Man of the Match dataset, sorted by class. The table presents exemplary instances from the FIFA 2018 Man of the Match dataset, ordered by their class-specific Classification Outlier Variability Score (COVA score). The COVA score $\bar{κ_{P_{n}}}$ reflects the average absolute z-score of SHAP values across all match statistics, quantifying the degree of deviation from typical class-specific decision behavior. Higher values indicate stronger outlier characteristics. Instances from both the MotM and Not MotM classes are shown, with vertical dots indicating omitted intermediate values for clarity.

ID	COVA Score $\bar{κ_{P_{n}}}$	Class
Morocco 25-06-2018	1.147	Not MotM
Morocco 20-06-2018	0.996	Not MotM
Mexico 02-07-2018	0.977	Not MotM
Tunisia 18-06-2018	0.886	Not MotM
Egypt 19-06-2018	0.884	Not MotM
⋮	⋮	⋮
Peru 21-06-2018	0.317	Not MotM
Sweden 03-07-2018	1.083	MotM
Croatia 21-06-2018	0.984	MotM
Russia 19-06-2018	0.883	MotM
Belgium 14-07-2018	0.813	MotM
Sweden 18-06-2018	0.798	MotM
⋮	⋮	⋮
Japan 19-06-2018	0.589	MotM

Table 3. Sensitivity analysis of the threshold parameter $τ$. Mean and standard deviation of the COVAS score distributions for different $τ$ values on the Breast Cancer (BC) and FIFA 2018 Man of the Match (MotM) datasets.

Dataset	Class	$τ$	n	Mean COVAS	Std COVAS
BC	Malignant	1.0	64	0.23	0.17
BC	Malignant	1.5	64	0.11	0.13
BC	Malignant	2.0	64	0.05	0.08
BC	Malignant	2.5	64	0.02	0.05
BC	Malignant	3.0	64	0.02	0.04
BC	Benign	1.0	102	0.21	0.18
BC	Benign	1.5	102	0.09	0.13
BC	Benign	2.0	102	0.05	0.11
BC	Benign	2.5	102	0.03	0.08
BC	Benign	3.0	102	0.02	0.05
MotM	Not MotM	1.0	15	0.29	0.15
MotM	Not MotM	1.5	15	0.11	0.08
MotM	Not MotM	2.0	15	0.06	0.06
MotM	Not MotM	2.5	15	0.02	0.03
MotM	Not MotM	3.0	15	0.01	0.02
MotM	MotM	1.0	11	0.31	0.10
MotM	MotM	1.5	11	0.13	0.08
MotM	MotM	2.0	11	0.06	0.07
MotM	MotM	2.5	11	0.01	0.02
MotM	MotM	3.0	11	0.00	0.00
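
A sweep over $τ$ like the one in Table 3 might be set up as follows. The sketch rests on a strong assumption that is not confirmed by this section: it treats the threshold-mode score as the fraction of features whose absolute standardized SHAP deviation exceeds $τ$. The definition in Section 2.5.2 is authoritative and may differ. The function name `threshold_scores`, the `eps` guard, and the toy data are hypothetical.

```python
import numpy as np


def threshold_scores(shap_values: np.ndarray, tau: float,
                     eps: float = 1e-12) -> np.ndarray:
    """Fraction of features with |z| > tau per instance (assumed definition)."""
    mu = shap_values.mean(axis=0)                        # per-feature class mean
    sigma = shap_values.std(axis=0)                      # per-feature class std
    abs_z = np.abs((shap_values - mu) / (sigma + eps))   # standardized deviations
    return (abs_z > tau).mean(axis=1)                    # share of features above tau


# Toy data for illustration only; not taken from the paper.
rng = np.random.default_rng(1)
toy_shap = rng.normal(size=(100, 10))

taus = [1.0, 1.5, 2.0, 2.5, 3.0]
summary = {}
for tau in taus:
    s = threshold_scores(toy_shap, tau)
    summary[tau] = (float(s.mean()), float(s.std()))     # mean and std per threshold
```

Under this assumed definition, the mean score cannot increase as $τ$ grows, because every feature that exceeds a larger threshold also exceeds a smaller one.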

Table 4. Top-10 overlap between consecutive threshold values $τ$ for the Breast Cancer (BC) and FIFA 2018 Man of the Match (MotM) datasets. The overlap quantifies the stability of the most atypical instances identified by COVAS across different threshold settings.

Dataset	Class	$τ_{prev}$	$τ_{curr}$	Top-10 Overlap
BC	Malignant	1.0	1.5	1.00
BC	Malignant	1.5	2.0	1.00
BC	Malignant	2.0	2.5	0.70
BC	Malignant	2.5	3.0	0.40
BC	Benign	1.0	1.5	0.90
BC	Benign	1.5	2.0	0.80
BC	Benign	2.0	2.5	1.00
BC	Benign	2.5	3.0	0.90
MotM	Not MotM	1.0	1.5	0.80
MotM	Not MotM	1.5	2.0	0.80
MotM	Not MotM	2.0	2.5	0.80
MotM	Not MotM	2.5	3.0	0.70
MotM	MotM	1.0	1.5	0.90
MotM	MotM	1.5	2.0	1.00
MotM	MotM	2.0	2.5	0.90
MotM	MotM	2.5	3.0	0.90
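
The overlap measure reported in Table 4 can be illustrated with a short sketch. It assumes that the overlap is the share of instances common to the two top-k sets, divided by k. The function name `top_k_overlap`, the stable tie-breaking, and the toy scores are hypothetical choices, not details taken from the paper.

```python
import numpy as np


def top_k_overlap(scores_a, scores_b, k: int = 10) -> float:
    """Share of instances that appear in the top-k of both score vectors."""
    # Negate so that argsort returns the highest scores first; a stable sort
    # breaks ties by original index (an assumption of this sketch).
    top_a = np.argsort(-np.asarray(scores_a, dtype=float), kind="stable")[:k]
    top_b = np.argsort(-np.asarray(scores_b, dtype=float), kind="stable")[:k]
    return len(set(top_a.tolist()) & set(top_b.tolist())) / k


# Toy scores for the same six instances under two threshold settings.
scores_prev = np.array([0.9, 0.1, 0.8, 0.3, 0.7, 0.2])
scores_curr = np.array([0.8, 0.0, 0.1, 0.6, 0.5, 0.0])

# Top-3 sets are {0, 2, 4} and {0, 3, 4}, so two of three instances are shared.
overlap = top_k_overlap(scores_prev, scores_curr, k=3)
```

Two points are worth keeping in mind when reading such values. For a class with n instances, two top-k sets always share at least max(0, 2k - n) members, so small classes have a lower bound on the overlap that is above zero. In addition, when many instances receive the same score, the top-k set depends on how ties are broken.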

Table 5. Comparison of COVAS with feature-space outlier detection methods (top-10). Overlap with COVAS and mean absolute feature-wise z-score are reported for both datasets.

Dataset	Method	Overlap with COVAS	Mean Abs. Feature z-Score
BC	COVAS	1.0	0.893
BC	LOF	0.0	1.554
BC	Feature z-score	0.0	1.730
MotM	COVAS	1.0	0.764
MotM	LOF	0.0	0.898
MotM	Feature z-score	0.0	0.928

Table 6. Impact of class balancing on COVAS scores for the Breast Cancer dataset. All comparisons are performed on the intersection of instances that are correctly classified by both the balanced (bal.) and unbalanced (unbal.) models, reported separately for each class.

Class	n	Mean (Unbal.)	Mean (Bal.)	Std (Unbal.)	Std (Bal.)	$ρ$	Top-10 Ovl.
Benign	33	0.622	0.659	0.348	0.339	0.88	0.80
Malignant	16	0.725	0.741	0.267	0.295	0.86	0.70

Roth, S.; Cerrito, A.; Orth, S.; Hartmann, U.; Friemert, D. COVAS: Highlighting the Importance of Outliers in Classification Through Explainable AI. Mach. Learn. Knowl. Extr. 2026, 8, 24. https://doi.org/10.3390/make8010024
