Article

Scrub-and-Learn: Category-Aware Weight Modification for Machine Unlearning

by Jiali Wang, Hongxia Bie *, Zhao Jing and Yichen Zhi
Intelligent Media Computing Center, School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
AI 2025, 6(6), 108; https://doi.org/10.3390/ai6060108
Submission received: 14 April 2025 / Revised: 19 May 2025 / Accepted: 20 May 2025 / Published: 22 May 2025
(This article belongs to the Special Issue Controllable and Reliable AI)

Abstract

(1) Background: Machine unlearning plays a crucial role in privacy protection and model optimization, particularly in forgetting entire categories of data in classification tasks. However, existing methods often struggle with high computational costs, such as estimating the inverse Hessian, or require access to the original training data, limiting their practicality. (2) Methods: In this work, we introduce Scrub-and-Learn, a category-aware weight modification framework designed to remove class-level knowledge efficiently. By modeling unlearning as a continual learning task, our method leverages re-encoded labels of samples from the target category to guide weight updates, effectively scrubbing unwanted knowledge while preserving the rest of the model’s capacity. (3) Results and Conclusions: Experimental results on multiple benchmarks demonstrate that our method effectively eliminates targeted categories—achieving a recognition rate below 5%—while preserving the performance of retained classes within a 4% deviation from the original model.

1. Introduction

With the advancement of machine learning, efficiently removing sensitive data from models—while ensuring privacy and regulatory compliance—has become a critical research challenge. Machine unlearning [1,2,3,4,5] has emerged as a promising solution, offering privacy protection, computational efficiency, and model optimization. By selectively removing information related to specific data, unlearning eliminates sensitive content without requiring full retraining, thereby safeguarding user privacy [6,7,8,9], defending against backdoor attacks [10,11], and ensuring compliance with regulations such as GDPR [12]. Additionally, by removing biased data, unlearning improves model fairness and transparency [13,14], demonstrating broad application potential.
Machine unlearning aims to eliminate the influence of specific data from a trained model, ensuring that the model behaves as if it has never seen those data, without requiring retraining [15]. In classification tasks, the relationship between whether a model has seen a sample and whether it can correctly classify it is more complex. Due to the model’s generalization ability, unlearning can maintain high classification accuracy even on unseen test sets. Additionally, research on core datasets [16,17,18] shows that training a model with a small, representative dataset can still yield strong classification performance. Therefore, in classification tasks, the focus is primarily on forgetting entire categories rather than individual samples.
For classification tasks, existing unlearning methods primarily follow two approaches. The first approach focuses on the network parameters, aiming to remove the information about the deleted data embedded within them. These methods are typically based on influence functions [19] and include techniques such as Certified Removal [20,21,22], which generally require computing the inverse Hessian matrix. Although some studies approximate it using the Fisher Information Matrix (FIM) [23,24,25], the computational cost remains high, limiting their practicality for large-scale datasets and deep neural networks.
Another approach is to treat the network as the starting point, define an appropriate loss function, and implement unlearning through continual learning. This method avoids computing the Hessian matrix or FIM but requires a dataset for continued training. For instance, Variational Bayesian Unlearning [26], Saliency Unlearning [27], and other methods [28,29,30,31,32] rely on access to the original training data. This requirement increases storage overhead, because the dataset needs to be stored indefinitely. In contrast, other methods [33,34,35,36,37] require only the forgotten data, reducing storage demands. However, they rely on additional generators to produce artificial data to compensate for the missing information.
To address the challenges of high computational complexity and storage overhead, we propose a novel unlearning method that avoids computing the Hessian inverse matrix while effectively removing specific data categories. Our approach leverages the model’s natural forgetting tendency during continual learning, using a small number of samples as new tasks. With just a few training iterations, the method effectively helps the model unlearn the targeted data categories.
The main contributions of this paper are as follows:
  • We propose Scrub-and-Learn, a novel machine unlearning method that enables effective forgetting of specific data categories, supporting privacy protection and regulatory compliance. Unlike existing approaches, it does not require Hessian inverse computation, access to the original training data, or the creation of auxiliary datasets.
  • We introduce a submaximum one-hot label encoding strategy for unlearning, assigning a probability of 1 to the second-highest predicted class and 0 to others to signal the removal of category-specific knowledge.
  • We analyze the weight-sharing patterns across data categories, revealing how neural network weights contribute to cross-category representations. Additionally, we explore catastrophic forgetting in continual learning, where forgetting the target class disrupts other classes.
  • Extensive experiments on benchmark datasets, including MNIST, FashionMNIST, SVHN, CIFAR-10, and CIFAR-100, demonstrate that Scrub-and-Learn effectively forgets targeted classes while preserving the performance of retained ones. The method generalizes well across datasets of varying sizes and model architectures.

2. Related Work

Existing machine unlearning methods often face challenges related to high computational complexity and significant storage overhead. Our proposed fast unlearning method, guided by a small number of samples, is inspired by several related studies.
Machine Unlearning: Ref. [20] used the inverse Hessian matrix computed over the entire dataset to introduce perturbations to network parameters, aiming to balance utility and removal effectiveness. However, the inverse Hessian matrix is computationally expensive and challenging to scale to complex datasets and deep networks. To address this, ref. [21] proposed storing the inverse Hessian matrix for the retained data as a data statistic, thereby reducing computation during optimization. Other methods approximate the Hessian for each forgotten sample using the average Hessian across the dataset [22] or replace it with the Fisher Information Matrix (FIM) [23]. Despite these efforts, the computational cost remains significant due to the complexity of models and data. Ultimately, the most effective way to reduce computation costs is to eliminate the need to compute the inverse Hessian entirely.
Unlearning methods that avoid computing the Hessian matrix typically apply different operations to retained and forgotten data, aiming to remove information from the forgotten data while preserving knowledge from the retained data. For example, ref. [27] avoided updating network parameters on the retained dataset and updated only those parameters with significant gradients from the forgotten data. Similarly, ref. [29] used gradient descent on the retained data and gradient ascent on the forgotten data. Ref. [30] updated the feature extraction parameters based on the feature distance of the forgotten data while updating all parameters using the retained data. However, these methods require continuous access to the original dataset, increasing storage overhead. To mitigate this, some approaches use only a small subset of samples. While this reduces storage requirements, the network still needs data to learn effectively, so generative and inversion models are used to synthesize data when the original dataset is unavailable. For instance, refs. [33,34] generated adversarial noise that maximized the model error, while refs. [35,36] used inversion techniques to reconstruct both retained and forgotten samples.
Continual Learning: The continuous acquisition of incremental information from nonstationary data distributions often leads to catastrophic forgetting or interference [38,39,40]. Catastrophic forgetting [41,42] refers to a significant decline in performance on previously learned tasks after learning a new task. The degradation of previously learned skills worsens as the task sequence progresses [43]. Some researchers [44,45] argue that the root cause of catastrophic forgetting lies in a set of shared weights that provide the network with remarkable generalization and graceful degradation abilities. In pre-trained models, different class representations become more orthogonal as they scale, which helps the network become more resistant to forgetting.
Mitigating catastrophic forgetting is a key challenge in continual learning. However, since unlearning is often intentional, we explore whether we can exploit the network’s natural tendency to forget to erase sensitive data. The challenge is to maintain performance on retained data while forgetting sensitive information. Our research addresses this problem.

3. Method

In this section, we explore how network weights change during the training process and investigate the patterns of weight sharing among different data categories. We then introduce Scrub-and-Learn, a category-aware framework that efficiently removes class-specific knowledge by modifying network weights.

3.1. Weight Dynamics and Category Sharing

The neural network typically used in classification tasks consists of a feature extractor and a classifier, where the classifier is usually composed of one to three fully connected layers. Our previous work [15] discusses in detail the case of a single fully connected layer serving as the classifier. Building on this, we further investigate classifiers with multiple fully connected layers; the following analysis takes the two-layer case as an example.
The classifier takes the feature vector $\hat{x} \in \mathbb{R}^{d}$, extracted by the feature extractor, as input to generate classification decisions. The output of the first layer of the classifier is $h = \mathrm{ReLU}(\Omega_1 \hat{x} + b_1)$, where $\Omega_1 \in \mathbb{R}^{m \times d}$ and $b_1 \in \mathbb{R}^{m}$ denote the weights and biases of the first fully connected layer, respectively. The final output of the classifier is $\hat{y} = \mathrm{softmax}(\Omega_2 h + b_2)$, where $\Omega_2 \in \mathbb{R}^{C \times m}$ and $b_2 \in \mathbb{R}^{C}$ denote the weights and biases of the output layer, respectively. During training, the classifier uses cross-entropy as the loss function and gradient descent to improve classification accuracy.
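For concreteness, a minimal PyTorch sketch of such a two-layer classifier head is shown below; the class name and dimension arguments are illustrative placeholders, not code from the paper:

```python
import torch
import torch.nn as nn

class TwoLayerClassifier(nn.Module):
    """Two-layer classifier head: h = ReLU(W1 x_hat + b1), y_hat = softmax(W2 h + b2)."""

    def __init__(self, d: int, m: int, num_classes: int):
        super().__init__()
        self.fc1 = nn.Linear(d, m)            # Omega_1 in R^{m x d}, b_1 in R^m
        self.fc2 = nn.Linear(m, num_classes)  # Omega_2 in R^{C x m}, b_2 in R^C

    def forward(self, x_hat: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.fc1(x_hat))
        return self.fc2(h)  # logits; softmax is applied inside the cross-entropy loss
```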
According to the analysis in previous work [15], at both the initial and later stages of training, the weights of a classifier with a single fully connected layer are updated under the guidance of the Equiangular Tight Frame (ETF) and eventually converge to the vertices of the ETF, as shown in Figure 1a. A classifier that converges to the ETF vertices ensures that the cosine similarity between the row weight vectors of different classes approaches $-1/(C-1)$, thereby minimizing their similarity. As a result, the last-layer representations of samples from each category become as orthogonal as possible, enhancing class separability in the feature space.
The feature vectors extracted from samples of the $C$ different classes are represented as $\hat{X} = \{\hat{x}_0, \hat{x}_1, \ldots, \hat{x}_{C-1}\} \in \mathbb{R}^{d \times C}$, where $\hat{x}_c$ denotes the feature extracted from a sample of class $c$, and $d$ is the dimension of the feature vector. After passing through the classifier, the gradient of the first-layer weight $\Omega_1$ is given by Equation (1):
$$\nabla \Omega_1 = P \odot H \cdot (A - Y)^{\top} \cdot (A - Y) \cdot \hat{X}^{\top} \quad (1)$$
where $P$ is an element-wise mask controlling weight updates, $H = \{h_0, h_1, \ldots, h_{C-1}\} \in \mathbb{R}^{m \times C}$ is the first-layer output corresponding to $\hat{X}$ as the classifier input and contains the neuron activation markers, $A = \{\hat{y}_0, \hat{y}_1, \ldots, \hat{y}_{C-1}\} \in \mathbb{R}^{C \times C}$ is the classifier output (softmax probabilities), and $Y = \{y_0, y_1, \ldots, y_{C-1}\} \in \mathbb{R}^{C \times C}$ represents the one-hot encoded labels corresponding to $\hat{X}$. This formulation shows how the weight updates in the first layer are driven by the output differences $A - Y$ and the feature vectors $\hat{X}$, demonstrating how class relationships affect weight modifications.
This analysis shows that in a two-layer fully connected classifier, the behavior of the weight gradients aligns with the one-layer case in updating class weights according to the ETF structure. Specifically, the update direction for the weight $\omega_{1c}$ of class $c$ follows the corresponding row feature vector of class $c$, while the weights $\omega_{1i}$ for the other classes $i \neq c$ are updated in the opposite direction, consistent with prior work [15]. However, compared to the one-layer case, the gradient expression in Equation (1) introduces an additional modulation term $P \odot H$. This component implies that the weight update process is not purely determined by the ETF structure but also undergoes further sparse recombination, as shown in Figure 1b. In particular, $P \odot H$ controls how neuron activations and selective importance contribute to weight adjustments, adding a layer of complexity to the learning dynamics of multi-layer classifiers.
This observation suggests that while ETF alignment continues to guide class separation, the presence of $P \odot H$ introduces an additional selection mechanism that may influence how efficiently specific classes are forgotten or retained during training. Understanding the effect of this term is crucial for designing efficient unlearning mechanisms that leverage the weight evolution properties of deep classifiers.
We focus on the weight gradient of the $i$-th neuron in the first layer:
$$\nabla \omega_{1i} = \left[\frac{C(1-a)}{C-1}\right]^{2} \cdot p_i^{\top} \odot h_i^{\top} \cdot \left(I - \frac{1}{C}\mathbf{1}\mathbf{1}^{\top}\right) \cdot \hat{X}^{\top} \quad (2)$$
where $h_i \in \mathbb{R}^{C}$, $i = 0, 1, \ldots, m-1$, is the $i$-th row vector of $H$, formed by the $i$-th element of each of $\{h_0, h_1, \ldots, h_{C-1}\}$. The weight gradient in Equation (2) reveals that each neuron’s update depends on a sparse combination of class features, shaped by the activation pattern $p_i^{\top} \odot h_i^{\top}$. Since neurons in the first layer respond to similar features across different classes, they naturally integrate information from multiple categories during backpropagation. The empirical evidence in Figure 2 reinforces this theoretical insight. The confusion matrices of the MNIST data on a classification model with two fully connected layers show that different subsets of network weights exhibit different class-recognition tendencies, illustrating how weights are shared among multiple classes. Specifically, the weights in Figure 2b are primarily responsible for recognizing class 0 but also misclassify other classes (e.g., 2, 3, 5, 8, and 9) as class 0. The weights in Figure 2c exhibit strong recognition of class 1 but frequently misclassify classes 2 and 7 as class 1. Both sets of weights significantly influence the classification of class 2, suggesting that certain weights are crucial across multiple categories.
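The per-weight-subset confusion matrices of Figure 2 can be approximated with a simple ablation: zero out all first-layer neurons outside a chosen subset and re-evaluate the classifier. The sketch below assumes a model exposing its first fully connected layer as `fc1` (as in the earlier classifier sketch); how the weight subsets are actually chosen in the paper is not specified here, so `keep_neurons` is an illustrative input:

```python
import torch

@torch.no_grad()
def confusion_with_weight_subset(model, loader, keep_neurons, num_classes):
    """Zero every first-layer neuron except `keep_neurons`, rebuild the confusion
    matrix on `loader`, then restore the original weights."""
    saved_w = model.fc1.weight.clone()
    saved_b = model.fc1.bias.clone()
    mask = torch.zeros(model.fc1.out_features, dtype=torch.bool)
    mask[keep_neurons] = True
    model.fc1.weight[~mask] = 0.0
    model.fc1.bias[~mask] = 0.0

    cm = torch.zeros(num_classes, num_classes, dtype=torch.long)
    for x, y in loader:
        pred = model(x).argmax(dim=1)
        for t, p in zip(y.tolist(), pred.tolist()):
            cm[t, p] += 1  # rows: true class, columns: predicted class

    model.fc1.weight.copy_(saved_w)  # restore the original parameters
    model.fc1.bias.copy_(saved_b)
    return cm
```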

3.2. Category-Aware Weight Modification

Deep neural networks typically learn class discrimination by minimizing the cross-entropy between predicted log probabilities and one-hot encoded labels. In this setup, the one-hot vector provides a strict supervision signal, assigning “1” to the correct class and “0” to all others. This encoding influences the model’s memory by directing the gradient flow during backpropagation. We hypothesize that altering the label of a single class can guide the model to forget that class without impairing its ability to learn others.
We introduce Scrub-and-Learn, a category-aware weight modification framework that removes class-level knowledge from a trained network. This approach reframes unlearning as continual learning, treating the forgetting task as a new objective for the pre-trained classification network. It leverages the network’s tendency to forget during continual learning and uses label guidance to discard class-specific information.
To support this, we design a lightweight yet effective forgetting mechanism that directly manipulates the one-hot label representations. By identifying samples from the forgotten class and substituting their labels with those of alternative classes inferred from the model’s predictions, we enable class-level forgetting while maintaining the integrity of the overall training process.
Our algorithm modifies the label assignment before loss computation, ensuring that the model no longer receives explicit gradient signals to reinforce the forgotten class. This approach, called submaximum one-hot encoding, assigns a value of 1 to the index corresponding to the second-highest probability in the model’s output for forgotten samples while setting all other positions to 0. The detailed procedure is outlined in Algorithm 1. This design preserves the standard training structure while inducing forgetting through label-level supervision adjustment. For example, in a 10-class classification network aiming to forget class 0, if the probability output is [ 0.8 , 0.01 , 0.1 , 0.01 , 0.01 , 0.02 , 0.01 , 0.01 , 0.01 , 0.02 ] , the new encoding would be [ 0 , 0 , 1 , 0 , 0 , 0 , 0 , 0 , 0 , 0 ] , and the gradient computed using the new one-hot vector guides the modification of the model weights, as illustrated in Figure 3.
Algorithm 1 Submaximum one-hot encoding for the labels of forgotten-class samples.
Require: samples $X \in \mathbb{R}^{N \times \cdot}$, ground-truth labels $l \in \mathbb{N}^{N}$, forgotten class index $c_f$, model $M$
Ensure: submaximum one-hot encoding $Y \in \mathbb{R}^{N \times C}$
  1: Probability distribution of samples: $A \leftarrow M(X)$
  2: Identify forgotten-class sample indices: $I \leftarrow \{\, i \mid l_i = c_f \,\}$
  3: for $i \in I$ do
  4:     Remove column $c_f$: $\tilde{A}_i \leftarrow \mathrm{concat}(A_i[:c_f],\ A_i[c_f+1:])$
  5:     Compute the predicted alternative class: $l'_i \leftarrow \arg\max_j \tilde{A}_{i,j}$
  6:     Update the label: $l_i \leftarrow l'_i$
  7: end for
  8: $Y \leftarrow \mathrm{one\_hot}(l,\ \mathrm{num\_classes} = C)$
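A minimal PyTorch sketch of Algorithm 1 follows; the function and variable names are ours, and the batched handling of samples is one possible implementation rather than the paper’s code:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def submaximum_one_hot(model, x, labels, forgotten_class, num_classes):
    """Algorithm 1 sketch: re-encode labels of forgotten-class samples with the
    second-highest predicted class; all other labels are kept unchanged."""
    probs = F.softmax(model(x), dim=1)                                   # step 1: A <- M(X)
    new_labels = labels.clone()
    forget_idx = (labels == forgotten_class).nonzero(as_tuple=True)[0]   # step 2
    for i in forget_idx:                                                 # steps 3-7
        p = probs[i].clone()
        p[forgotten_class] = float("-inf")          # drop the forgotten-class column
        new_labels[i] = p.argmax()                  # highest remaining = second-highest overall
    return F.one_hot(new_labels, num_classes=num_classes).float()        # step 8
```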
During forward propagation, the input of a forgotten sample activates the weights most strongly associated with it, which is captured in $p_{c_f} \odot h_{c_f}$, where $c_f$ denotes the forgotten class. The new encoding vector serves as a supervisory signal—central to memory formation in the model—and effectively intervenes in the gradient dynamics that reinforce the forgotten class during learning. Specifically, our method replaces the one-hot label of the forgotten class with an alternative label inferred from the model’s predictions. This removes the original supervisory signal and redirects the model’s focus to other classes. Consequently, the gradient of the model weights shifts from $\nabla_{\omega}(-\log \hat{y}_{l})$ to $\nabla_{\omega}(-\log \hat{y}_{l'})$, where $l'$ denotes the alternative class. The model updates its weights to strengthen the new class representation rather than the forgotten one, causing a gradual decay of the original class-specific representation. This mechanism leverages the inherent plasticity of neural networks, where continued learning can overwrite prior knowledge, and reframes forgetting as a guided relearning process. Our label-level redirection offers a clean and effective approach for targeted category-level forgetting while preserving the structural integrity of the training process.
We also explored two alternative forgetting encoding methods. The first is random one-hot encoding, where a randomly selected position in the encoding vector (corresponding to a nonforgotten class) is set to 1, while all others remain as 0. For instance, in the task of forgetting class 0, the label encoding for a class-0 sample is represented as [ 0 , s , s , s , s , s , s , s , s , s ] , where a randomly selected s is assigned a value of 1, and the rest are set to 0. However, this method introduces uncertainty into the encoding, causing the model weights to receive different supervisory signals at each iteration. It also leads to a significant change in the model’s loss function, as shown in Equation (3):
$$L = -\sum_{c} y_c \log(\hat{y}_c) = -\log(\hat{y}_s) \gg -\log(\hat{y}_{l}) \quad (3)$$
As a result, the model’s performance on retained data deteriorates. The second method is all-zero encoding, where the supervisory signal for the forgotten class is a vector of zeros. For example, in this setting, samples from class 0 are encoded as $[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]$. However, this approach results in a constant loss of 0, as shown in Equation (4):
$$L = -\sum_{c} y_c \log(\hat{y}_c) = 0 \quad (4)$$
This approach eliminates the learning signal, making the model ineffective at forgetting the target data.
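To make the contrast between the three encodings concrete, the following sketch evaluates the cross-entropy loss each one produces for the example probability vector from earlier in this section (the randomly drawn class index is illustrative):

```python
import torch

# Example probability output for a forgotten-class sample (from the 10-class example above).
probs = torch.tensor([0.8, 0.01, 0.1, 0.01, 0.01, 0.02, 0.01, 0.01, 0.01, 0.02])

submax   = torch.zeros(10); submax[2] = 1.0    # second-highest class (index 2)
random_y = torch.zeros(10); random_y[7] = 1.0  # an arbitrarily drawn retained class
all_zero = torch.zeros(10)                     # all-zero encoding

def ce(target, p):
    """Cross-entropy of a (possibly all-zero) target encoding against probabilities p."""
    return -(target * p.clamp_min(1e-12).log()).sum()

print(ce(submax, probs))    # -log(0.10) ~ 2.30: moderate, stable signal
print(ce(random_y, probs))  # -log(0.01) ~ 4.61: large, and different on every random draw
print(ce(all_zero, probs))  # 0.0: no learning signal at all
```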
As illustrated in Figure 4, the accuracy of VGG16 on CIFAR-10 after forgetting is shown separately for retained data (top of each subimage) and forgotten data (bottom of each subimage). The horizontal axis k indicates the total number of iterations, the vertical axis represents the batch size, and the color scale denotes the average accuracy over 30 experiments. Each subimage corresponds to a different encoding method used during the forgetting task: (a) submaximum one-hot encoding effectively forgets class 0 while preserving accuracy on retained data; (b) random one-hot encoding achieves forgetting but also reduces accuracy on retained classes; and (c) all-zero encoding fails to clean class 0 data, with its accuracy still around 60% after 10 iterations, indicating ineffective forgetting.
Compared to entropy maximization or adversarial perturbation-based forgetting, our method achieves stable forgetting with minimal impact on the remaining knowledge by selectively replacing class labels using the second-largest predicted class. This approach avoids introducing excessive uncertainty or instability into the network. Moreover, unlike adversarial methods that rely on carefully designed noise, our approach is simple, efficient, and less prone to degrading general performance.

3.3. Challenges in the Unlearning Process

Since the model’s weights are shared across different categories, in the forgetting task, samples from the forgetting class primarily adjust the network weights most relevant to them. However, as the model learns new tasks, its shared weights are continuously modified, leading to catastrophic forgetting of the retained data.
We experimentally reproduced the forgetting process of one epoch of continued learning across different datasets and models. A new task requires a trained classification network to forget class-0 samples. To evaluate the unlearning model’s forgetting behavior, we explicitly divided the test dataset into two subsets: a forgetting set containing only class-0 samples and a retention set containing samples from all other classes. The total number of iterations $k$ is defined as $k = \mathrm{Num\_samples} / \mathrm{batchsize}$, where $\mathrm{Num\_samples}$ is the total number of samples required in the forgetting task. In the forgetting experiments, both $k$ and the batch size ranged from 1 to 10, except on the CIFAR100 dataset, where $k$ took values from the set $\{3, 6, 9, 12, 15, 18, 21, 24, 27, 30\}$. In the $i$-th experiment, the sample index range was $[\,10i,\ 10i + \mathrm{Num\_samples}\,]$. For each batch size and iteration count, the classification accuracy was averaged over 30 runs to improve robustness and ensure broader sample coverage, enhancing the evaluation of method stability. Figure 5 displays the complete experimental results. Each subgraph corresponds to a specific model–dataset pair. In each subgraph, the top panel shows the classification accuracy on the retained dataset under different batch sizes and iteration counts, and the bottom panel shows the accuracy on the forgotten dataset. The horizontal axis indicates the number of iterations, the vertical axis represents batch size, and the color scale reflects classification accuracy. As previously mentioned, networks trained continually typically experience catastrophic forgetting within a single cycle when learning new tasks. Notably, the model exhibits a sequential forgetting pattern: it first forgets samples from the forgetting set, followed by a gradual decline in accuracy on the retained set.
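The retained/forgotten accuracy split used throughout this evaluation can be computed with a small helper like the following sketch (function and variable names are ours):

```python
import torch

@torch.no_grad()
def retained_forgotten_accuracy(model, test_loader, forgotten_class=0):
    """Accuracy on the retention set (all other classes) and on the forgetting set (the forgotten class)."""
    model.eval()
    correct_r = total_r = correct_f = total_f = 0
    for x, y in test_loader:
        pred = model(x).argmax(dim=1)
        is_f = (y == forgotten_class)
        correct_f += (pred[is_f] == y[is_f]).sum().item()
        total_f += is_f.sum().item()
        correct_r += (pred[~is_f] == y[~is_f]).sum().item()
        total_r += (~is_f).sum().item()
    return 100.0 * correct_r / max(total_r, 1), 100.0 * correct_f / max(total_f, 1)
```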
Therefore, when the model forgets a specific category of data in a continual learning scenario, the core challenge lies in effectively balancing the targeted removal of forgotten data with the preservation of classification performance on retained data. It is essential to halt learning of the forgotten task before catastrophic forgetting begins to degrade performance on the retained data, thereby maintaining a balance between forgetting and retention.

3.4. Sample Selection for Unlearning

When the model continues learning, if the new task differs significantly from the old one, such as identifying various bird species instead of different cat breeds, the new task’s training set should be no smaller than that of the old task. These samples help the model effectively learn the new task. In continual learning, forgetting a specific category is interpreted as introducing a new task that excludes that category. For example, in the task of forgetting class 0 and retaining classification of the remaining classes, the classification of the remaining data is part of the old task. The correlation between the new and old tasks implies that the samples of the new task also come from the old task. Additionally, to prevent catastrophic forgetting of the retained data, the number of iterations for the new task cannot be too large, so only a limited number of samples are needed to achieve effective forgetting.
Based on experimental observations, we found that the sample size for a new task should not exceed $5C$, where $C$ is the total number of categories. In our experiments, using a small batch size accelerated the forgetting process. Specifically, batch sizes smaller than 10 on a 10-class dataset and smaller than 20 on a 100-class dataset proved effective for facilitating forgetting. However, when the number of iterations exceeded 5 on the 10-class dataset or 20 on the 100-class dataset, the model’s performance on the retained data dropped significantly. Therefore, we provide these values as rough guidelines.
When using this algorithm, if the model’s retention performance is compromised when training with only forgotten class samples, adding retained class samples can help mitigate the performance loss. As shown in Figure 6, when VGG16 continues learning on CIFAR-10 with the same total number of samples, the selection of different sample categories affects performance. The horizontal axis represents the total number of samples. In the figure, (a) includes only forgotten class samples, while (b) includes retained class samples, with one sample selected per retained class. The red circles indicate that the model can effectively forget class-0 samples when the total number of samples is 13 and the batch size is 4. Additionally, including retained class samples in the training set helps the model better preserve retention performance.

4. Experiment

This section outlines the experimental design, implementation process, and research results. The primary goal of the experiments is to verify the effectiveness of the proposed method and compare it with state-of-the-art approaches to evaluate its performance across different datasets.

4.1. Experimental Setting

We used different network architectures based on task difficulty across several datasets. For MNIST [46], we trained the MLP and LeNet [46] classifiers. On FashionMNIST [47], we trained the MLP, LeNet, and AlexNet [48] models. For SVHN [49], we used the AlexNet, VGG11 [50], and ResNet18 [51] networks. The CIFAR10 [52] experiments involved the VGG16, ResNet34, InceptionV3 [53], and ViT-S [54] classifiers. Finally, on CIFAR100, we employed the VGG16, ResNet50, InceptionV3, and ViT-S architectures.
To assess the practical applicability of our method in privacy-critical scenarios, we further conducted experiments on the Augmented Skin Conditions Image (ASCI) dataset [55], which contains dermatological images annotated with six different skin diseases.
We conducted all experiments in PyTorch and optimized the models with stochastic gradient descent (SGD). To evaluate model performance and establish a baseline model, we trained each network from scratch and used almost the same hyperparameters across datasets. The momentum was 0.9, the weight decay was 0.0005, and the initial learning rate was 0.1. CosineAnnealingLR was used to adjust the learning rate during training. For MNIST, FashionMNIST, and SVHN, we trained the models for 100 epochs with a batch size of 256. For CIFAR10 and CIFAR100, the batch size was 128, with training extended to 200 epochs. After training, we applied a sample-guided fast unlearning method to each baseline model to remove an entire class.
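In PyTorch terms, this baseline training setup corresponds roughly to the following sketch (the function name and the `model`/`train_loader` arguments are placeholders, not code from the paper):

```python
import torch
import torch.nn.functional as F

def train_baseline(model, train_loader, num_epochs):
    """Baseline training as described: SGD with momentum 0.9, weight decay 5e-4,
    initial learning rate 0.1, and a cosine-annealed learning rate schedule."""
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                                momentum=0.9, weight_decay=5e-4)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_epochs)
    model.train()
    for _ in range(num_epochs):  # 100 epochs (MNIST/FashionMNIST/SVHN) or 200 (CIFAR10/100)
        for images, labels in train_loader:
            optimizer.zero_grad()
            loss = F.cross_entropy(model(images), labels)
            loss.backward()
            optimizer.step()
        scheduler.step()
    return model
```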
In the single-class experiments, we designated class 0 as the forgetting class. For MNIST, FashionMNIST, and SVHN, eight class-0 samples were randomly selected, with a learning rate of 0.01 and a batch size of 4. For CIFAR-10, one sample from each retained class and four from the forgotten class were used, with a learning rate of 0.05 and the same batch size. For CIFAR-100, one sample from each retained class and ten from the forgotten class were selected, with a learning rate of 0.2 and a batch size of 16. For the multi-class forgetting experiments, classes 0 and 4 were designated as the forgotten classes in the 10-class datasets, while 20 classes were selected randomly for forgetting in the 100-class dataset. For MNIST, FashionMNIST, and SVHN, eight samples from each forgotten class were selected, with a learning rate of 0.01 and batch sizes of 8, 6, and 2, respectively. For CIFAR-10, one sample from each retained class and four from each forgotten class were selected following the sample-selection strategy of Section 3.4, using a learning rate of 0.05 and a batch size of 4. For CIFAR-100, one sample from each retained class and four from each forgotten class were selected, with a learning rate of 0.5 and a batch size of 16. The optimizer configuration was kept consistent with that used during baseline training.
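Combining these settings with the label re-encoding from Section 3.2, the unlearning step could look roughly like the sketch below (the function name is ours, `submaximum_one_hot` refers to the earlier sketch, and `forget_loader` is an assumed loader over the small unlearning sample set):

```python
import torch
import torch.nn.functional as F

def scrub_and_learn(model, forget_loader, forgotten_class, num_classes, lr):
    """Few-iteration unlearning fine-tune: forgotten-class labels are re-encoded with
    submaximum_one_hot; retained-class samples, if included, keep their original labels."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr,
                                momentum=0.9, weight_decay=5e-4)
    model.train()
    for x, y in forget_loader:  # only a handful of small batches, as described above
        targets = submaximum_one_hot(model, x, y, forgotten_class, num_classes)
        optimizer.zero_grad()
        log_probs = F.log_softmax(model(x), dim=1)
        loss = -(targets * log_probs).sum(dim=1).mean()  # cross-entropy with re-encoded labels
        loss.backward()
        optimizer.step()
    return model
```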
We conducted all experiments on a Linux server equipped with dual Intel® Xeon® Silver 4214R CPUs (24 cores and 48 threads), 440 GB of RAM, and an NVIDIA GeForce RTX 3080 Ti GPU with 12 GB of VRAM. The software environment included Python 3.8, PyTorch 1.10.0 (with CUDA 11.3 and cuDNN 8.2), running on Ubuntu 20.04 LTS.

4.2. Evaluation of Unlearning Networks

Table 1 and Table 2 present the accuracy of the original and unlearning models on retained and forgotten data across different network models and datasets. We selected 30 sample groups using the sample index method described earlier and recorded the best accuracy on retained and forgotten data. The experimental results in Table 1 show that our forgetting algorithm performed well across multiple network models and datasets in the single-class forgetting task. The forgetting accuracy of all models was less than 5%, while the difference between the accuracy on the retained data and the original accuracy did not exceed 4%. Table 2 shows that forgetting two categories in the 10-class dataset yielded similar results to forgetting one category. However, forgetting 20 categories in the 100-class dataset led to a noticeable drop in retained accuracy. This result may be due to the large number of parameters involved in the forgetting process in multi-class forgetting experiments, which leads to more extensive model modifications and consequently undermines generalization performance.
Notably, the results on the medical imaging dataset (see Table 1, last two rows) demonstrate that the proposed method maintained consistent forgetting behavior while preserving competitive accuracy on the retained classes. These results support the applicability of our approach to real-world privacy-sensitive domains.

4.3. Comparison with Other Unlearning Methods

Table 3 compares our method with state-of-the-art approaches, including PBU [7], GKT [33], WF-Net [31], and NG-IR [34], on MNIST, SVHN, CIFAR-10, and CIFAR-100. While the accuracy of other methods dropped significantly on the retained data, our method had a smaller negative impact on the model’s accuracy for the retained dataset. All baseline methods were evaluated on the identical class-level forgetting task using the results reported in their original papers.

4.4. Analyze the Impact of Learning Rate on the Unlearning Methods

This experiment analyzed the impact of learning rate on the forgetting algorithm using a VGG16 classification model trained on the CIFAR10 dataset. In forgetting class 0, all other conditions remained constant, while the learning rate varied across values of 0.01, 0.02, 0.03, …, up to 0.1. Figure 7 illustrates the forgetting performance under different learning rates. The learning rate primarily influenced the number of iterations needed for forgetting; the higher the learning rate, the fewer iterations required. In practice, selecting the learning rate involves balancing the tradeoff between effective forgetting and retention.

4.5. T-SNE Visualization of the Classification Results of the Unlearning Model

This experiment investigated the impact of our forgetting method on models trained on the MNIST dataset using LeNet. Using the proposed forgetting method, we obtained two forgetting models: one that forgets only class 0 (single-class forgetting) and another that forgets classes 0 and 4 (multi-class forgetting). Figure 8 visualizes the classification results of the original model (a), the single-class forgetting model (b), and the multi-class forgetting model (c) using the t-SNE technique. The results show that our forgetting method causes the model to reinterpret the forgotten-class data as belonging to other classes by re-encoding the sample labels with the class of second-highest confidence. This adjustment shifts the model’s overall decision boundary, as highlighted by the black dotted circles in the figure.
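A visualization like Figure 8 can be produced with a short script along these lines (a sketch with assumed helper names; whether the paper embeds logits or penultimate-layer features is not stated, so logits are used here for illustration):

```python
import numpy as np
import torch
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

@torch.no_grad()
def plot_tsne(model, loader, title):
    """Embed the classifier outputs of test samples with t-SNE and color points by label."""
    model.eval()
    feats, labels = [], []
    for x, y in loader:
        feats.append(model(x).cpu().numpy())  # logits used as the embedding here
        labels.append(y.numpy())
    feats = np.concatenate(feats)
    labels = np.concatenate(labels)
    emb = TSNE(n_components=2, init="pca", random_state=0).fit_transform(feats)
    plt.scatter(emb[:, 0], emb[:, 1], c=labels, cmap="tab10", s=3)
    plt.title(title)
    plt.show()
```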

5. Discussion

5.1. Method Analysis

In this section, we present our results alongside related unlearning methods and analyze the advantages and limitations of our proposed framework. Compared to existing approaches such as influence-based unlearning [20,21,22,23], and sample-based methods [26,27,28,29,30,31,32], our method offers a simple and effective alternative for class-level forgetting.
Influence-based methods rely on second-order statistics or influence functions to estimate parameter importance. Although effective, they are typically computationally expensive due to the need to approximate or compute the Fisher Information Matrix or Jacobian–Hessian products, which in practice restricts them to shallow networks. In contrast, our method avoids such overhead by operating directly at the label level—selectively replacing the original label with the model’s second-highest prediction—to induce forgetting. This design simplifies the forgetting process by eliminating the need for second-order gradient computations and retraining loops.
Compared to retraining-based strategies like SISA [56], which require partitioning the model and iteratively retraining submodels, our method adopts a simplified fine-tuning strategy guided solely by the modified labels. This approach reduces the computational burden and improves applicability to deep models and diverse datasets.
Sample-based forgetting methods generally employ gradient ascent or loss maximization to drive forgetting during retraining or fine-tuning. However, these methods require access to the entire dataset or involve generating large volumes of synthetic data. Our approach significantly reduces the sample requirements. Furthermore, unlike approaches that demand retraining from scratch or extensive fine-tuning, our method offers an efficient plug-and-play forgetting solution for trained models.
Experimental results across multiple datasets demonstrate that our method consistently achieves low accuracy on the forgotten classes (below 5%) while keeping the accuracy drop on the retained classes within 4%.

5.2. Limitations and Future Work

Despite its promising results, our method has several limitations. First, it is limited to class-level forgetting and does not support instance-level or feature-level forgetting, which is essential in more fine-grained unlearning scenarios. Second, the method’s effectiveness diminishes when multiple classes need to be removed from the model simultaneously; the larger number of target classes involves more parameters in the modification, which can significantly affect overall model performance. Third, our approach focuses on the empirical evaluation of forgetting effectiveness and does not offer formal privacy guarantees. These areas warrant further exploration.
In future work, we plan to address these limitations by extending our approach to multi-class and continual unlearning settings, where multiple classes are removed from the model sequentially or dynamically. We also aim to enhance the method by incorporating parameter-based importance metrics to improve its accuracy and interpretability. Furthermore, we intend to explore adaptive label replacement strategies that better reflect inter-class relationships and to develop theoretical guarantees for knowledge removal. Finally, we will evaluate the applicability of our method in broader domains, such as language models and privacy-sensitive applications.

6. Conclusions

In this work, we propose Scrub-and-Learn, a novel unlearning method that effectively removes knowledge associated with a specific class without requiring Hessian inverse computation. By leveraging submaximum one-hot encoding and using only a small number of samples, Scrub-and-Learn enables the model to forget the target class while preserving performance on the retained classes. Extensive experiments across multiple datasets and architectures demonstrate the effectiveness and efficiency of Scrub-and-Learn in single-class forgetting scenarios. These results highlight the practical value of our approach for safe, targeted, and scalable model unlearning.

Author Contributions

Conceptualization and methodology, J.W.; Validation, J.W., Z.J. and Y.Z.; investigation, Z.J. and Y.Z.; writing—original draft preparation, J.W.; writing—review and editing, H.B. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the Beijing Natural Science Foundation-Joint Funds of Haidian Original Innovation Project (Grant No. L232056) and the Science and Technology Innovation 2030 Major Projects (Grant No. 2022ZD0211603).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets used in this study are publicly available. The CIFAR-10 and CIFAR-100 datasets can be accessed at [https://www.cs.toronto.edu/~kriz/cifar.html (accessed on 31 October 2024)], the SVHN dataset at [http://ufldl.stanford.edu/housenumbers/ (accessed on 31 October 2024)], the Fashion-MNIST dataset at [https://github.com/zalandoresearch/fashion-mnist (accessed on 31 October 2024)], the MNIST dataset at [http://yann.lecun.com/exdb/mnist/ (accessed on 31 October 2024)], and the Augmented Skin Conditions Image (ASCI) dataset at [https://www.kaggle.com/datasets/syedalinaqvi/augmented-skin-conditions-image-dataset (accessed on 31 October 2024)].

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
FIM: Fisher Information Matrix
ETF: Equiangular Tight Frame
RAM: Random-access memory
GPU: Graphics Processing Unit
VRAM: Video RAM
CUDA: Compute Unified Device Architecture
PBU: Partially Blinded Unlearning
GKT: Gated Knowledge Transfer
WF-Net: Weight Filtering-Net
NG-IR: Noise Generation-Impair Repair
ASCI: Augmented Skin Conditions Image

References

  1. Xu, J.; Wu, Z.; Wang, C.; Jia, X. Machine Unlearning: Solutions and Challenges. IEEE Trans. Emerg. Top. Comput. Intell. 2024, 8, 2150–2168. [Google Scholar] [CrossRef]
  2. Zhang, H.; Nakamura, T.; Isohara, T.; Sakurai, K. A review on machine unlearning. SN Comput. Sci. 2023, 4, 337. [Google Scholar] [CrossRef]
  3. Shaik, T.; Tao, X.; Xie, H.; Li, L.; Zhu, X.; Li, Q. Exploring the Landscape of Machine Unlearning: A Comprehensive Survey and Taxonomy. IEEE Trans. Neural Netw. Learn. Syst. 2024; early access. [Google Scholar]
  4. Cevallos, I.D.; Benalcázar, M.E.; Valdivieso Caraguay, Á.L.; Zea, J.A.; Barona-López, L.I. A Systematic Literature Review of Machine Unlearning Techniques in Neural Networks. Computers 2025, 14, 150. [Google Scholar] [CrossRef]
  5. Li, N.; Zhou, C.; Gao, Y.; Chen, H.; Zhang, Z.; Kuang, B.; Fu, A. Machine unlearning: Taxonomy, metrics, applications, challenges, and prospects. IEEE Trans. Neural Netw. Learn. Syst. 2025; early access. [Google Scholar]
  6. De Min, T.; Mancini, M.; Lathuilière, S.; Roy, S.; Ricci, E. Unlearning Personal Data from a Single Image. Transactions on Machine Learning Research. March 2025. Available online: https://openreview.net/pdf?id=VxC4PZ71Ym (accessed on 18 May 2025).
  7. Panda, S.; Sourav, S. Partially Blinded Unlearning: Class Unlearning for Deep Networks from Bayesian Perspective. In Proceedings of the AAAI Conference on Artificial Intelligence, Philadelphia, PA, USA, 25 February–4 March 2025; Volume 39, pp. 6372–6380. [Google Scholar]
  8. Chen, K.; Huang, Y.; Wang, Y.; Zhang, X.; Mi, B.; Wang, Y. Privacy preserving machine unlearning for smart cities. Ann. Telecommun. 2024, 79, 61–72. [Google Scholar] [CrossRef]
  9. Chen, K.; Wang, Z.; Mi, B. Private Data Protection with Machine Unlearning in Contrastive Learning Networks. Mathematics 2024, 12, 4001. [Google Scholar] [CrossRef]
  10. Wei, S.; Zhang, M.; Zha, H.; Wu, B. Shared adversarial unlearning: Backdoor mitigation by unlearning shared adversarial examples. Adv. Neural Inf. Process. Syst. 2023, 36, 25876–25909. [Google Scholar]
  11. Guo, Y.; Zhao, Y.; Hou, S.; Wang, C.; Jia, X. Verifying in the dark: Verifiable machine unlearning by using invisible backdoor triggers. IEEE Trans. Inf. Forensics Secur. 2023, 19, 708–721. [Google Scholar] [CrossRef]
  12. Protection, Formerly Data. General Data Protection Regulation (GDPR). Intersoft Consulting. 2018, p. 24. Available online: https://gdpr-info.eu/ (accessed on 14 October 2024).
  13. Zhang, D.; Pan, S.; Hoang, T.; Xing, Z.; Staples, M.; Xu, X.; Yao, L.; Lu, Q.; Zhu, L. To be forgotten or to be fair: Unveiling fairness implications of machine unlearning methods. AI Ethics 2024, 4, 83–93. [Google Scholar] [CrossRef]
  14. Chen, R.; Yang, J.; Xiong, H.; Bai, J.; Hu, T.; Hao, J.; Feng, Y.; Zhou, J.T.; Wu, J.; Liu, Z. Fast Model Debias with Machine Unlearning. Adv. Neural Inf. Process. Syst. 2023, 36, 14516–14539. [Google Scholar]
  15. Wang, J.; Bie, H.; Jing, Z.; Zhi, Y.; Fan, Y. Weight Masking in Image Classification Networks: Class-Specific Machine Unlearning. Knowl. Inf. Syst. 2025, early access, 1–21. [Google Scholar] [CrossRef]
  16. Dolatabadi, H.M.; Erfani, S.M.; Leckie, C. Adversarial coreset selection for efficient robust training. Int. J. Comput. Vis. 2023, 131, 3307–3331. [Google Scholar] [CrossRef]
  17. Maalouf, A.; Eini, G.; Mussay, B.; Feldman, D.; Osadchy, M. A unified approach to coreset learning. IEEE Trans. Neural Netw. Learn. Syst. 2022, 35, 6893–6905. [Google Scholar] [CrossRef] [PubMed]
  18. Huang, Y.; Yuan, X.; Wang, H.; Du, Y. Coreset selection can accelerate quantum machine learning models with provable generalization. Phys. Rev. Appl. 2024, 22, 014074. [Google Scholar] [CrossRef]
  19. Zhao, P.; Zhang, K.; Zhang, H.; Chen, H. Alternating minimization differential privacy protection algorithm for the novel dual-mode learning tasks model. Expert Syst. Appl. 2024, 259, 125279. [Google Scholar] [CrossRef]
  20. Guo, C.; Goldstein, T.; Hannun, A.; Van Der Maaten, L. Certified Data Removal from Machine Learning Models. arXiv 2019, arXiv:1911.03030. [Google Scholar]
  21. Sekhari, A.; Acharya, J.; Kamath, G.; Suresh, A.T. Remember What You Want to Forget: Algorithms for Machine Unlearning. Adv. Neural Inf. Process. Syst. 2021, 34, 18075–18086. [Google Scholar]
  22. Suriyakumar, V.; Wilson, A.C. Algorithms that Approximate Data Removal: New Results and Limitations. Adv. Neural Inf. Process. Syst. 2022, 35, 18892–18903. [Google Scholar]
  23. Peste, A.; Alistarh, D.; Lampert, C.H. SSSE: Efficiently Erasing Samples from Trained Machine Learning Models. arXiv 2021, arXiv:2107.03860. [Google Scholar]
  24. Zhang, Y.; Lu, Z.; Zhang, F.; Wang, H.; Li, S. Machine unlearning by reversing the continual learning. Appl. Sci. 2023, 13, 9341. [Google Scholar] [CrossRef]
  25. Mahadevan, A.; Mathioudakis, M. Certifiable unlearning pipelines for logistic regression: An experimental study. Mach. Learn. Knowl. Extr. 2022, 4, 591–620. [Google Scholar] [CrossRef]
  26. Nguyen, Q.P.; Low, B.K.H.; Jaillet, P. Variational Bayesian Unlearning. Adv. Neural Inf. Process. Syst. 2020, 33, 16025–16036. [Google Scholar]
  27. Fan, C.; Liu, J.; Zhang, Y.; Wong, E.; Wei, D.; Liu, S. SalUn: Empowering Machine Unlearning via Gradient-Based Weight Saliency in Both Image Classification and Generation. arXiv 2023, arXiv:2310.12508. [Google Scholar]
  28. Kurmanji, M.; Triantafillou, P.; Hayes, J.; Triantafillou, E. Towards Unbounded Machine Unlearning. Adv. Neural Inf. Process. Syst. 2023, 36, 1957–1987. [Google Scholar]
  29. Trippa, D.; Campagnano, C.; Bucarelli, M.S.; Tolomei, G.; Silvestri, F. ∇τ: Gradient-Based and Task-Agnostic Machine Unlearning. CoRR 2024, arXiv:2403.14339. [Google Scholar]
  30. Cotogni, M.; Bonato, J.; Sabetta, L.; Pelosin, F.; Nicolosi, A. DUCK: Distance-Based Unlearning via Centroid Kinematics. arXiv 2023, arXiv:2312.02052. [Google Scholar]
  31. Poppi, S.; Sarto, S.; Cornia, M.; Baraldi, L.; Cucchiara, R. Multi-Class Unlearning for Image Classification via Weight Filtering. IEEE Intell. Syst. 2024, 39, 40–47. [Google Scholar] [CrossRef]
  32. Wang, W.; Zhang, C.; Tian, Z.; Yu, S. Machine Unlearning via Representation Forgetting with Parameter Self-Sharing. IEEE Trans. Inf. Forensics Secur. 2023, 19, 1099–1111. [Google Scholar] [CrossRef]
  33. Chundawat, V.S.; Tarun, A.K.; Mandal, M.; Kankanhalli, M. Zero-Shot Machine Unlearning. IEEE Trans. Inf. Forensics Secur. 2023, 18, 2345–2354. [Google Scholar] [CrossRef]
  34. Tarun, A.K.; Chundawat, V.S.; Mandal, M.; Kankanhalli, M. Fast yet Effective Machine Unlearning. IEEE Trans. Neural Netw. Learn. Syst. 2023, 35, 13046–13055. [Google Scholar] [CrossRef]
  35. Abbasi, A.; Thrash, C.; Akbari, E.; Zhang, D.; Kolouri, S. CovarNav: Machine Unlearning via Model Inversion and Covariance Navigation. arXiv 2023, arXiv:2311.12999. [Google Scholar]
  36. Yoon, Y.; Nam, J.; Yun, H.; Lee, J.; Kim, D.; Ok, J. Few-Shot Unlearning by Model Inversion. arXiv 2022, arXiv:2205.15567. [Google Scholar]
  37. Ma, Z.; Liu, Y.; Liu, X.; Liu, J.; Ma, J.; Ren, K. Learn to forget: Machine unlearning via neuron masking. IEEE Trans. Dependable Secur. Comput. 2022, 20, 3194–3207. [Google Scholar] [CrossRef]
  38. De Lange, M.; Aljundi, R.; Masana, M.; Parisot, S.; Jia, X.; Leonardis, A.; Slabaugh, G.; Tuytelaars, T. A Continual Learning Survey: Defying Forgetting in Classification Tasks. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 3366–3385. [Google Scholar]
  39. Wang, L.; Zhang, X.; Su, H.; Zhu, J. A comprehensive survey of continual learning: Theory, method and application. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 5362–5383. [Google Scholar] [CrossRef]
  40. Parisi, G.I.; Kemker, R.; Part, J.L.; Kanan, C.; Wermter, S. Continual Lifelong Learning with Neural Networks: A Review. Neural Netw. 2019, 113, 54–71. [Google Scholar] [CrossRef] [PubMed]
  41. Masana, M.; Liu, X.; Twardowski, B.; Menta, M.; Bagdanov, A.D.; Van De Weijer, J. Class-Incremental Learning: Survey and Performance Evaluation on Image Classification. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 5513–5533. [Google Scholar] [CrossRef] [PubMed]
  42. Kong, Y.; Liu, L.; Chen, H.; Kacprzyk, J.; Tao, D. Overcoming Catastrophic Forgetting in Continual Learning by Exploring Eigenvalues of Hessian Matrix. IEEE Trans. Neural Netw. Learn. Syst. 2023, 35, 16196–16210. [Google Scholar] [CrossRef]
  43. Peng, J.; Tang, B.; Jiang, H.; Li, Z.; Lei, Y.; Lin, T.; Li, H. Overcoming Long-Term Catastrophic Forgetting Through Adversarial Neural Pruning and Synaptic Consolidation. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 4243–4256. [Google Scholar] [CrossRef]
  44. Zhang, M.; Li, H.; Pan, S.; Chang, X.; Zhou, C.; Ge, Z.; Su, S. One-Shot Neural Architecture Search: Maximising Diversity to Overcome Catastrophic Forgetting. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 2921–2935. [Google Scholar] [CrossRef]
  45. French, R.M. Catastrophic Forgetting in Connectionist Networks. Trends Cogn. Sci. 1999, 3, 128–135. [Google Scholar] [CrossRef]
  46. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-Based Learning Applied to Document Recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef]
  47. Xiao, H.; Rasul, K.; Vollgraf, R. Fashion-MNIST: A Novel Image Dataset for Benchmarking Machine Learning Algorithms. arXiv 2017, arXiv:1708.07747. [Google Scholar]
  48. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet Classification with Deep Convolutional Neural Networks. Adv. Neural Inf. Process. Syst. 2012, 25, 2. [Google Scholar] [CrossRef]
  49. Netzer, Y.; Wang, T.; Coates, A.; Bissacco, A.; Wu, B.; Ng, A.Y. Reading Digits in Natural Images with Unsupervised Feature Learning. In Proceedings of the NIPS Workshop on Deep Learning and Unsupervised Feature Learning, Granada, Spain, 10 December 2011; Volume 2011, p. 4. [Google Scholar]
  50. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
  51. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  52. Krizhevsky, A.; Hinton, G. Learning Multiple Layers of Features from Tiny Images. 2009. Available online: http://www.cs.utoronto.ca/~kriz/learning-features-2009-TR.pdf (accessed on 31 October 2024).
  53. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the Inception Architecture for Computer Vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826. [Google Scholar]
  54. Lee, S.H.; Lee, S.; Song, B.C. Vision transformer for small-size datasets. arXiv 2021, arXiv:2112.13492. [Google Scholar]
  55. Naqvi, S.A.R. Augmented Skin Conditions Image Dataset. Kaggle. 2023. Available online: https://www.kaggle.com/datasets/syedalinaqvi/augmented-skin-conditions-image-dataset (accessed on 31 October 2024).
  56. Bourtoule, L.; Chandrasekaran, V.; Choquette-Choo, C.A.; Jia, H.; Travers, A.; Zhang, B.; Lie, D.; Papernot, N. Machine unlearning. In Proceedings of the 2021 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 24–27 May 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 141–159. [Google Scholar]
Figure 1. Weight gradients of classifiers with different numbers of layers. (a) Classifier with one fully connected layer. (b) Classifier with two fully connected layers.
Figure 2. Confusion matrix of MNIST on a classifier. (a) Full model. (b) Only some weights in the first connected layer. (c) Another part of the weights in the first connected layer.
Figure 3. Category-aware weight modification and submaximum one-hot encoding to the labels of forgotten samples.
Figure 4. Network performance on the forgetting task using different encoding strategies: (a) submaximum one-hot encoding, (b) random one-hot encoding, and (c) all-zero encoding.
Figure 5. The performance of the model after continued learning on different data–model pairs. The forgotten set data are forgotten first, and then the retained set data are forgotten.
Figure 6. The forgetting performance of VGG16 under different training sample sizes: (a) includes only the forgotten class samples; (b) includes both the forgotten and retained class samples.
Figure 7. The performance of the model after forgetting learning with different learning rates on CIFAR10.
Figure 8. t-SNE visualization of the classification results of the LeNet model on the MNIST dataset: (a) original model, (b) unlearning class 0, (c) unlearning classes 0 and 4.
Table 1. The performance of single-class unlearning evaluated across multiple datasets and various networks.
Dataset | Model | Original Acc_R (%) | Original Acc_F (%) | Unlearning Acc_R (%) | Unlearning Acc_F (%) ↓ | ΔAcc_R (%) ↑
MNIST | MLP | 98.49 | 99.39 | 98.04 | 1.22 | −0.45
MNIST | LeNet | 99.09 | 99.59 | 98.57 | 0.00 | −0.52
FashionMNIST | MLP | 90.49 | 86.20 | 89.14 | 1.30 | −1.35
FashionMNIST | LeNet | 90.83 | 87.50 | 90.18 | 2.00 | −0.65
FashionMNIST | AlexNet | 92.50 | 87.50 | 89.84 | 2.20 | −2.66
SVHN | AlexNet | 93.97 | 94.38 | 92.73 | 4.58 | −1.24
SVHN | VGG11 | 95.68 | 96.62 | 95.32 | 0.23 | −0.36
SVHN | ResNet18 | 96.17 | 96.62 | 94.40 | 3.15 | −1.77
CIFAR10 | VGG16 | 89.52 | 92.80 | 89.50 | 0.00 | −0.02
CIFAR10 | ResNet34 | 89.10 | 91.00 | 87.33 | 0.70 | −1.77
CIFAR10 | InceptionV3 | 93.31 | 93.80 | 91.91 | 4.90 | −1.40
CIFAR10 | ViT-S | 95.84 | 97.10 | 93.19 | 3.10 | −2.65
CIFAR100 | VGG16 | 64.99 | 86.00 | 64.45 | 0.00 | −0.54
CIFAR100 | ResNet50 | 66.05 | 88.00 | 64.60 | 1.00 | −1.45
CIFAR100 | InceptionV3 | 75.15 | 88.00 | 72.17 | 0.00 | −2.98
CIFAR100 | ViT-S | 82.78 | 92.00 | 79.26 | 0.00 | −3.52
ASCI | ResNet50 | 98.00 | 92.41 | 94.25 | 2.53 | −3.75
ASCI | ViT-S | 97.00 | 94.94 | 94.25 | 3.80 | −2.75
Table 2. The performance of multi-class unlearning evaluated across multiple datasets and various networks.
Dataset | Model | Original Acc_R (%) | Original Acc_F (%) | Unlearning Acc_R (%) | Unlearning Acc_F (%) ↓ | ΔAcc_R (%) ↑
MNIST | MLP | 98.52 | 98.83 | 97.44 | 4.17 | −1.08
MNIST | LeNet | 99.04 | 99.54 | 97.20 | 3.56 | −1.84
FashionMNIST | MLP | 91.23 | 85.40 | 91.23 | 3.60 | 0.00
FashionMNIST | LeNet | 91.71 | 84.60 | 91.68 | 4.20 | −0.03
SVHN | VGG11 | 96.31 | 95.62 | 94.25 | 0.02 | −2.05
SVHN | ResNet18 | 96.07 | 96.86 | 95.50 | 1.89 | −0.57
CIFAR10 | VGG16 | 89.46 | 91.40 | 89.91 | 0.44 | 0.45
CIFAR10 | ResNet34 | 89.16 | 89.80 | 87.74 | 3.50 | −1.42
CIFAR100 | VGG16 | 64.60 | 67.60 | 50.66 | 0.10 | −13.94
CIFAR100 | ResNet50 | 65.84 | 68.00 | 57.59 | 4.65 | −8.25
Table 3. Comparison of single-class unlearning with state-of-the-art methods on MNIST, SVHN, CIFAR10, and CIFAR100.
Dataset | Method | Model | Original Acc_R (%) | Original Acc_F (%) | Unlearning Acc_R (%) | Unlearning Acc_F (%) ↓ | ΔAcc_R (%) ↑
MNIST | PBU | ResNet18 | 99.00 | 99.19 | 96.24 | 0.21 | −2.76
MNIST | PBU | AllCNN | 99.49 | 99.37 | 98.40 | 0.07 | −1.09
MNIST | PBU | ResNet34 | 99.45 | 99.43 | 96.02 | 0.01 | −3.43
MNIST | GKT (zero-shot) | AllCNN | 97.84 | 99.61 | 97.12 | 0.00 | −0.72
MNIST | GKT (zero-shot) | LeNet | 98.15 | 99.59 | 95.79 | 0.00 | −2.36
MNIST | GKT (zero-shot) | ResNet9 | 98.57 | 99.10 | 94.57 | 0.00 | −4.00
MNIST | WF-Net | VGG16 | 99.60 | 99.60 | 73.20 | 0.00 | −26.40
MNIST | WF-Net | ResNet18 | 99.60 | 99.60 | 94.00 | 9.68 | −5.60
MNIST | WF-Net | ViT-T | 98.90 | 98.90 | 93.50 | 0.00 | −5.40
MNIST | Ours | MLP | 98.49 | 99.39 | 98.04 | 1.22 | −0.45
MNIST | Ours | LeNet | 99.09 | 99.59 | 98.57 | 0.00 | −0.52
SVHN | GKT (zero-shot) | AllCNN | 94.52 | 95.16 | 92.43 | 0.00 | −2.09
SVHN | GKT (zero-shot) | LeNet | 85.69 | 81.42 | 78.27 | 0.00 | −7.42
SVHN | GKT (zero-shot) | ResNet9 | 82.76 | 87.11 | 39.44 | 0.00 | −43.32
SVHN | Ours | AlexNet | 93.97 | 94.38 | 92.73 | 4.59 | −1.24
SVHN | Ours | VGG11 | 95.68 | 96.62 | 95.32 | 0.23 | −0.36
SVHN | Ours | ResNet18 | 96.17 | 96.62 | 94.40 | 3.15 | −1.77
CIFAR10 | PBU | ResNet18 | 76.54 | 71.82 | 66.16 | 4.50 | −10.38
CIFAR10 | PBU | AllCNN | 84.92 | 79.64 | 76.15 | 1.04 | −8.77
CIFAR10 | PBU | ResNet34 | 76.78 | 68.52 | 68.61 | 0.59 | −8.10
CIFAR10 | GKT (zero-shot) | AllCNN | 94.05 | 87.49 | 81.97 | 0.00 | −12.08
CIFAR10 | GKT (zero-shot) | LeNet | 59.80 | 62.25 | 41.32 | 0.00 | −18.48
CIFAR10 | GKT (zero-shot) | ResNet9 | 84.83 | 88.50 | 56.83 | 0.00 | −28.00
CIFAR10 | NG-IR | ResNet18 | 77.86 | 81.01 | 71.60 | 0.00 | −6.26
CIFAR10 | NG-IR | AllCNN | 82.64 | 91.02 | 73.90 | 0.00 | −8.74
CIFAR10 | WF-Net | VGG16 | 93.00 | 93.00 | 80.20 | 18.30 | −12.80
CIFAR10 | WF-Net | ResNet18 | 93.90 | 94.00 | 79.70 | 9.25 | −14.20
CIFAR10 | WF-Net | ViT-T | 78.00 | 78.00 | 73.50 | 0.00 | −4.50
CIFAR10 | Ours | VGG16 | 89.52 | 92.80 | 89.50 | 0.00 | −0.02
CIFAR10 | Ours | ResNet34 | 89.10 | 91.00 | 87.33 | 0.70 | −1.77
CIFAR10 | Ours | InceptionV3 | 93.31 | 93.80 | 91.91 | 4.90 | −1.40
CIFAR10 | Ours | ViT-S | 95.84 | 97.10 | 93.19 | 3.10 | −2.65
CIFAR100 | PBU | ResNet18 | 76.06 | 80.11 | 69.55 | 1.50 | −6.51
CIFAR100 | PBU | ResNet50 | 75.95 | 78.44 | 69.29 | 0.33 | −6.66
CIFAR100 | PBU | ResNet34 | 75.21 | 84.00 | 65.34 | 0.17 | −9.87
CIFAR100 | NG-IR | ResNet18 | 78.68 | 83.00 | 75.36 | 0.00 | −3.32
CIFAR100 | NG-IR | MobileNetV2 | 77.43 | 90.00 | 75.76 | 0.00 | −1.67
CIFAR100 | WF-Net | VGG16 | 93.00 | 93.00 | 80.20 | 18.30 | −12.80
CIFAR100 | WF-Net | ResNet18 | 93.90 | 94.00 | 79.70 | 9.25 | −14.20
CIFAR100 | WF-Net | ViT-T | 78.00 | 78.00 | 73.50 | 0.00 | −4.50
CIFAR100 | Ours | VGG16 | 64.99 | 86.00 | 64.44 | 0.00 | −0.55
CIFAR100 | Ours | ResNet50 | 66.05 | 88.00 | 64.60 | 1.00 | −1.46
CIFAR100 | Ours | InceptionV3 | 75.15 | 88.00 | 72.17 | 0.00 | −2.98
CIFAR100 | Ours | ViT-S | 82.78 | 92.00 | 79.26 | 0.00 | −3.52
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
