Abstract
Local Interpretable Model-agnostic Explanations (LIME) is a widely used technique for interpreting individual predictions of complex “black-box” models by fitting a simple surrogate model to synthetic perturbations of the input. However, its standard perturbation strategy of sampling features independently from a Gaussian distribution often generates unrealistic samples and neglects inter-feature dependencies, which can lead to low local fidelity (a poor approximation of the model’s behavior) and unstable explanations across runs. This paper presents CoLIME, a copula-based perturbation generation framework for LIME designed to capture the underlying data distribution and inter-feature dependencies more accurately. The framework employs bivariate (2D) copula models to jointly sample correlated features while fitting suitable marginal distributions to individual features. In addition, perturbation localization strategies restrict perturbations to a defined local radius and preserve specific property values, ensuring that the synthesized samples remain representative of the instance’s local neighborhood. The approach is evaluated on network intrusion detection datasets, comparing the fidelity and stability of LIME under Gaussian versus copula-based perturbations, with Ridge regression as the surrogate explainer. Empirically, for the most dependent feature pairs, CoLIME increases mean surrogate fidelity by 21.84–50.31% on the merged CIC-IDS2017/2018 dataset and by 29.28–60.24% on the UNSW-NB15 dataset. Stability improves as well, with mean Jaccard similarity gains of 3.78–5.45% and 1.95–2.12%, respectively. These improvements demonstrate that dependency-preserving perturbations provide a significantly more reliable foundation for explaining complex network intrusion detection models.
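The sketch below illustrates the core idea summarized above: perturbing a correlated feature pair through a bivariate copula with fitted marginals, localizing the samples around the explained instance, and fitting a Ridge surrogate on the black-box predictions. It is a minimal illustration only; the Gaussian copula, normal marginals, radius-based filter, and names such as `copula_perturb` and `black_box` are assumptions for this example, not the paper’s exact implementation.

```python
# Minimal sketch of copula-based LIME perturbation for one feature pair.
# Assumptions (not from the paper): Gaussian copula, normal marginals,
# radius measured in per-feature standard deviations.
import numpy as np
from scipy import stats
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

def copula_perturb(x_pair, data_pair, n_samples=500, radius=1.0):
    """Sample perturbations of a 2-feature instance that preserve the pair's
    dependency structure and stay within a local radius of the instance."""
    # Fit a marginal distribution per feature (normal, as an assumption here).
    marg = [stats.norm(*stats.norm.fit(data_pair[:, j])) for j in range(2)]
    # Estimate the copula correlation from ranks (Spearman -> Pearson for a
    # Gaussian copula).
    rho_s, _ = stats.spearmanr(data_pair[:, 0], data_pair[:, 1])
    rho = 2.0 * np.sin(np.pi * rho_s / 6.0)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    # Correlated standard normals -> uniforms -> fitted marginal quantiles.
    z = rng.multivariate_normal(np.zeros(2), cov, size=n_samples)
    u = stats.norm.cdf(z)
    samples = np.column_stack([marg[j].ppf(u[:, j]) for j in range(2)])
    # Localization: keep samples within `radius` of the explained instance,
    # measured in standard deviations of each marginal.
    scale = np.array([m.std() for m in marg])
    keep = np.linalg.norm((samples - x_pair) / scale, axis=1) <= radius
    return samples[keep]

# Toy data and black-box model standing in for the intrusion-detection setting.
data = rng.multivariate_normal([0, 0], [[1.0, 0.8], [0.8, 1.0]], size=2000)
black_box = lambda X: 1.0 / (1.0 + np.exp(-(2 * X[:, 0] - X[:, 1])))

x = data[0]                   # instance to explain
Z = copula_perturb(x, data)   # dependency-preserving local perturbations
surrogate = Ridge(alpha=1.0).fit(Z, black_box(Z))
print("local feature attributions:", surrogate.coef_)
```

The surrogate is trained only on samples that respect the pair’s empirical dependency and lie near the explained instance, which is the mechanism the abstract credits for the reported fidelity and stability gains over independent Gaussian perturbations.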