Abstract
Local Interpretable Model-agnostic Explanations (LIME) is a widely used technique for interpreting individual predictions of complex “black-box” models by fitting a simple surrogate model to synthetic perturbations of the input. However, its standard perturbation strategy of sampling features independently from a Gaussian distribution often generates unrealistic samples and neglects inter-feature dependencies, which can lead to low local fidelity (a poor approximation of the model’s behavior) and unstable explanations across runs. This paper presents CoLIME, a copula-based perturbation generation framework for LIME designed to capture the underlying data distribution and inter-feature dependencies more accurately. The framework employs bivariate (2D) copula models to jointly sample correlated features while fitting suitable marginal distributions to individual features. In addition, perturbation localization strategies restrict perturbations to a defined local radius and preserve specific property values, ensuring that the synthesized samples remain representative of the instance’s local neighborhood. The approach is evaluated on network intrusion detection datasets, comparing the fidelity and stability of LIME under Gaussian versus copula-based perturbations, with Ridge regression as the surrogate explainer. Empirically, for the most dependent feature pairs, CoLIME increases mean surrogate fidelity by 21.84–50.31% on the merged CIC-IDS2017/2018 dataset and by 29.28–60.24% on the UNSW-NB15 dataset. Stability improves as well, with mean Jaccard similarity gains of 3.78–5.45% and 1.95–2.12%, respectively. These improvements demonstrate that dependency-preserving perturbations provide a significantly more reliable foundation for explaining complex network intrusion detection models.
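The sketch below illustrates the core idea summarized above: perturbing a correlated feature pair through a bivariate copula with fitted marginals, localizing the samples around the explained instance, and fitting a Ridge surrogate on the black-box predictions. It is a minimal illustration only; the Gaussian copula, normal marginals, radius-based filter, and names such as `copula_perturb` and `black_box` are assumptions for this example, not the paper’s exact implementation.

```python
# Minimal sketch of copula-based LIME perturbation for one feature pair.
# Assumptions (not from the paper): Gaussian copula, normal marginals,
# radius measured in per-feature standard deviations.
import numpy as np
from scipy import stats
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

def copula_perturb(x_pair, data_pair, n_samples=500, radius=1.0):
    """Sample perturbations of a 2-feature instance that preserve the pair's
    dependency structure and stay within a local radius of the instance."""
    # Fit a marginal distribution per feature (normal, as an assumption here).
    marg = [stats.norm(*stats.norm.fit(data_pair[:, j])) for j in range(2)]
    # Estimate the copula correlation from ranks (Spearman -> Pearson for a
    # Gaussian copula).
    rho_s, _ = stats.spearmanr(data_pair[:, 0], data_pair[:, 1])
    rho = 2.0 * np.sin(np.pi * rho_s / 6.0)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    # Correlated standard normals -> uniforms -> fitted marginal quantiles.
    z = rng.multivariate_normal(np.zeros(2), cov, size=n_samples)
    u = stats.norm.cdf(z)
    samples = np.column_stack([marg[j].ppf(u[:, j]) for j in range(2)])
    # Localization: keep samples within `radius` of the explained instance,
    # measured in standard deviations of each marginal.
    scale = np.array([m.std() for m in marg])
    keep = np.linalg.norm((samples - x_pair) / scale, axis=1) <= radius
    return samples[keep]

# Toy data and black-box model standing in for the intrusion-detection setting.
data = rng.multivariate_normal([0, 0], [[1.0, 0.8], [0.8, 1.0]], size=2000)
black_box = lambda X: 1.0 / (1.0 + np.exp(-(2 * X[:, 0] - X[:, 1])))

x = data[0]                   # instance to explain
Z = copula_perturb(x, data)   # dependency-preserving local perturbations
surrogate = Ridge(alpha=1.0).fit(Z, black_box(Z))
print("local feature attributions:", surrogate.coef_)
```

The surrogate is trained only on samples that respect the pair’s empirical dependency and lie near the explained instance, which is the mechanism the abstract credits for the reported fidelity and stability gains over independent Gaussian perturbations.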