Comparative Analysis of Membership Inference Attacks in Federated and Centralized Learning


Introduction
Machine learning (ML) is gaining popularity thanks to the increasing availability of extensive datasets and technological advancements [1,2]. Centralized learning (CL) techniques become impractical in the context of abundant private data, as they mandate transmitting and processing the data through a central server. Google's federated learning (FL) has emerged as a distributed machine learning paradigm since its inception in 2017 [3].
In FL, a central server supports participants in training a model by exchanging trained models or gradients of training data without revealing raw or sensitive information either to the central server or to other participants. FL is particularly valuable for processing sensitive and personal data, such as in healthcare, where ML is increasingly prevalent and must comply with regulations such as the GDPR [4] and HIPAA [5]. Despite its advancements, FL is susceptible to membership inference attacks (MIA), a method employed to gain insights into training data. Although FL primarily aims for privacy protection, attackers can infer specific data by intercepting FL updates transmitted between the training parties and the central server [6,7]. For instance, if an attacker knows that a patient's data are part of the model's training set, they could deduce the patient's current health status [8]. Prior research has explored MIA in a centralized environment where the data are owned by a single data owner. It is imperative to extend this investigation to MIA in federated learning. This article analyzes various MIA techniques initially proposed in the CL environment [9-11], examines their applicability in the FL environment, and evaluates the effectiveness of countermeasures to mitigate these attacks in both FL and CL environments. An earlier version of this work has already been published [12], focusing solely on MIA in the FL environment. In that iteration, we scrutinized nine mitigation techniques [9,10,13-19] against MIA and showed that knowledge distillation [19] performs best in reducing the attack recall while keeping accuracy as high as possible. We also conducted experiments to observe the effects of three different optimizers in deep learning, Stochastic Gradient Descent (SGD) [20], Root Mean Squared Propagation (RMSProp) [21], and Adaptive Gradient (Adagrad) [22], on MIA recall and FL model accuracy, and found no difference between these optimizers with respect to MIA recall. In this paper, we investigate two more optimizers and three more countermeasures in both the CL and FL environments and compare the results. To the best of our knowledge, this is the first comprehensive study that investigates MIA in both CL and FL environments and applies twelve mitigation techniques against MIA with five different optimizers for the target model. Our contributions in this paper are summarized below.

• We conducted a comprehensive study of the effectiveness of the membership inference attack in the FL and CL environments, considering different attack techniques, optimizers, datasets, and countermeasures. Existing related work focuses on the CL environment and the effectiveness of a single countermeasure. In this paper, we investigated the FL environment, compared it with the CL environment, and studied the effectiveness of combining two mitigation techniques.

• We established a trade-off relationship between model accuracy and attack recall. Our investigation revealed that employing knowledge distillation in conjunction with either Gaussian noise, Gaussian dropout, or activity regularization yields the most favorable balance between model accuracy and attack recall across both image and tabular datasets.
The remainder of this article is organized as follows. Section 2 presents the related work. Section 3 explains the different attacks on a model for membership inference. Countermeasures are detailed in Section 4. The setup and results of the experiments are described and analyzed in Section 5. Finally, we conclude the article in Section 6.

MIA against CL
Shokri et al. [9] performed the first MIA on ML models, identifying the presence of a data sample in the training set of an ML model with black-box access. They created a target model, shadow models, and attack models, and made two main assumptions: first, the attacker must create multiple shadow models, each with the same structure as the target model; second, the dataset used to train the shadow models comes from the same distribution as the target model's training data. Subsequently, Salem et al. [10] widened the scope of the MIA of Shokri et al. [9], showing that the MIA is possible without any prior assumption about the target model's dataset and without multiple shadow models. Nasr et al. [31] showed that more realistic attack scenarios are possible in both FL and CL environments; they designed a white-box attack on the target model in FL and CL by assuming different levels of adversary prior knowledge. Liu et al. [11] studied perturbations in feature space and found that the sensitivity of trained data to a fully trained machine learning model is lower than that of untrained data. They calculated sensitivity by comparing the sensitivity values of different data samples using a Jacobian matrix, which measures the relationship between the target model's predictions and the feature values of the target sample.
Numerous attacks in the existing literature draw inspiration from Shokri's research [9]. Carlini et al. [2] introduced a novel attack called the Likelihood Ratio Attack (LiRA), which amalgamates concepts from various research papers. They advocate a shift in the evaluation metric for MIA, recommending the use of the true positive rate (recall) while maintaining a very low false alarm rate. Their findings reveal that, when measured by recall, many attacks prove less effective than previously believed. In our study, we adopt recall, rather than accuracy, as the measure of MIA effectiveness.

MIA against FL
Nasr et al. [31] showed that MIA seriously compromises the privacy of FL participants even when the universal model achieves high prediction accuracy. A common defense against such attacks is the differential privacy (DP) [38] approach, which perturbs each update with random noise; however, it incurs a significant loss of FL classification accuracy. Bai et al. [39] proposed a homomorphic-cryptography-based privacy enhancement mechanism impacting MIA. They used homomorphic cryptography to encrypt the collaborators' parameters and added a parameter selection method to the FL aggregator to select specific participant updates with a given probability. Another FL MIA defense technique is the digestive neural network (DNN) [35], which modifies inputs and skews updates, maximizing FL classification accuracy while minimizing inference attack accuracy. Wang et al. [36] proposed a new privacy mechanism, the Federated Regularization Learning Model, to prevent information leakage in FL. Xie et al. [37] proposed an adversarial noise generation method in which noise is added to the attack features of the attack model in MIA against FL.

Attack Techniques for Membership Inference
In this section, we summarize the different MIA methods [9-11] that we applied in this paper. A summary of the considered membership attacks is shown in Table 2. We employed four well-known attacks, each with its own characteristics. MIA can be formulated [40] as follows: given a data sample (x, y) and additional knowledge K_{M_Target} about the target model M_Target, the attacker typically tries to create an attack model M_Attack that returns either 0 or 1, where 0 indicates the sample is not a member of the training set and 1 indicates that it is. The additional knowledge can include the distribution of the target data and the type of the target model. Figure 1 summarizes the general idea of the first MIA on ML models, proposed by Shokri et al. [9]. The target model takes a data sample as input and, after training, generates a probability prediction vector. Suppose D_Target^Train is the private training dataset of the target model M_Target, with labeled data records (x_i, y_i), where x_i is the input to the target model and y_i is the class label of x_i from the set {1, 2, ..., C_Target}. The output of M_Target is a vector of probabilities of size C_Target, whose elements range from 0 to 1 and sum to 1. The attacker creates multiple shadow models to mimic the behavior of the target model and to generate the data needed to train the attack model. D_Attack^Train is the attack model's training dataset, which contains the labeled data records (x_i, y_i) and the probability vector generated by the shadow model for each data sample x_i. The label for x_i in the attack model is "in" if x_i was used to train the shadow model and "out" if x_i was used to test it. This attack is named Attack 1 in our experiments.
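The construction of the attack model's training set described above can be sketched in a few lines. The sketch below is illustrative rather than the paper's exact pipeline: the trained shadow model is stood in for by a hypothetical `toy_shadow_predict` function, and the helper names are our own.

```python
import numpy as np

def build_attack_dataset(shadow_predict, train_x, train_y, test_x, test_y):
    """Pair each record's class label and shadow-model probability vector
    with an 'in'/'out' membership label, as in Shokri et al.'s setup."""
    rows, labels = [], []
    for x, y in zip(train_x, train_y):   # members of the shadow training set
        rows.append((y, shadow_predict(x)))
        labels.append("in")
    for x, y in zip(test_x, test_y):     # non-members (shadow test set)
        rows.append((y, shadow_predict(x)))
        labels.append("out")
    return rows, labels

# Hypothetical stand-in for a trained shadow model: a fixed softmax over 3 classes.
def toy_shadow_predict(x):
    z = np.array([x.sum(), 1.0, -1.0])
    e = np.exp(z - z.max())
    return e / e.sum()

rows, labels = build_attack_dataset(
    toy_shadow_predict,
    train_x=[np.ones(4)], train_y=[0],
    test_x=[np.zeros(4)], test_y=[1],
)
```

The attack model is then trained on `rows` (features) and `labels` (targets), one model per output class of the target data.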

Salem et al.'s MIA
Early demonstrations by Shokri et al. [9] on the feasibility of MIA rest on many assumptions, e.g., the use of multiple shadow models, knowledge of the structure of the target model, and the availability of a dataset from the same distribution as the target model's training data. Salem et al. [10] relaxed all these key assumptions, showing that the MIA is generally applicable at low cost and carries greater risk than previously thought. They provided two MIA attacks: (I) with knowledge of the dataset distribution and model architecture and only one shadow model, and (II) with no knowledge of the dataset distribution or model architecture. The former attack is named Attack 2 and the latter Attack 3 in Table 2.

Prediction Sensitivity MIA
The idea behind this attack is that training data of a fully trained ML model generally have lower prediction sensitivity than untrained data (i.e., test data). An overview of this attack [11] is shown in Figure 2. The only allowed interaction between the attacker and the target model M is to query M with a sample x and obtain the prediction result. The target model M maps the n-dimensional vector x ∈ R^n to the m-dimensional output y ∈ R^m. The Jacobian matrix of M is the m × n matrix whose element in the j-th row and i-th column is ∂y_j/∂x_i, where y = M(x). For an input sample x = [x_1, x_2, ..., x_n], the element ∂y_j/∂x_i captures the relationship between a change in the i-th feature value of the input record and the change in the predicted probability that the sample belongs to the j-th class. The Jacobian matrix comprises a series of first-order partial derivatives, which can be approximated by numerical differentiation:

∂y_j/∂x_i ≈ (M(x + ε e_i)_j − M(x − ε e_i)_j) / (2ε),

where ε is a small value added to the i-th feature value of the input sample and e_i is the i-th standard basis vector. Adding ±ε to the i-th feature value of the target sample x_t, whose membership is to be inferred, provides two modified samples with which to query the target model and derive the partial derivatives of the i-th feature. This process is then repeated for each feature of x, and the partial derivatives are combined into the Jacobian matrix. For simplicity, the approximation of the Jacobian matrix is denoted J(x; M). The L2 norm of J(x; M) represents the prediction sensitivity for the target sample, as described by Novak et al. [41]. For an m × n matrix A with elements a_ij, where i and j are the row and column indices, the L2 norm can be computed as

||A||_2 = sqrt( Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij² ).

There is a difference in prediction sensitivity between samples from the training set and samples from the testing set. Once the prediction sensitivities are calculated, an unsupervised clustering method (such as k-means) partitions the set of target records (their prediction sensitivity values) into two subsets, and the cluster with the lower mean sensitivity is associated with membership in M's training set. During the inference stage, the samples are clustered into three or more groups and ordered by average norm; the groups with lower average norms are predicted to be from the target model's training set, whereas the others are not.
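The finite-difference approximation of the Jacobian and its L2 norm can be sketched as follows. This is a minimal illustration under our own assumptions (central differences, a toy linear model), not the exact implementation of Liu et al. [11].

```python
import numpy as np

def prediction_sensitivity(model, x, eps=1e-4):
    """Approximate the L2 norm of the Jacobian of `model` at `x`.

    Each column i of the Jacobian is estimated with a central difference:
    dy/dx_i ≈ (model(x + eps*e_i) - model(x - eps*e_i)) / (2*eps).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    cols = []
    for i in range(n):
        e = np.zeros(n)
        e[i] = eps
        cols.append((model(x + e) - model(x - e)) / (2 * eps))
    J = np.stack(cols, axis=1)          # shape (m, n)
    return np.sqrt((J ** 2).sum())      # L2 norm of the Jacobian

# Toy target model: y = [x0 + 2*x1, 3*x0], whose Jacobian is [[1, 2], [3, 0]].
toy = lambda x: np.array([x[0] + 2 * x[1], 3 * x[0]])
s = prediction_sensitivity(toy, [0.5, -0.5])   # sqrt(1 + 4 + 9 + 0) = sqrt(14)
```

For a real target model, `model` would be a black-box query to M; members of the training set are expected to yield lower `s` values than non-members.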

Defense Mechanisms
Attackers exploit the fact that ML models behave differently when predicting on new data than on training data to differentiate members from nonmembers. This property is associated with the degree of overfitting, measured by the generalization gap: the difference between the model's accuracy at training and at testing time. When overfitting is high, the model is more vulnerable to MIA. Therefore, any method that reduces overfitting is also beneficial for MIA mitigation. We applied the following methods to see how they mitigate the MIA.
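As a concrete illustration of the quantity attackers exploit, the generalization gap is simply the difference between training and test accuracy (the accuracy values below are hypothetical):

```python
def generalization_gap(train_accuracy: float, test_accuracy: float) -> float:
    """Generalization gap = training accuracy - test accuracy.

    A large positive gap indicates overfitting, which correlates with
    higher vulnerability to membership inference attacks.
    """
    return train_accuracy - test_accuracy

# Hypothetical model that fits the training data far better than the test data.
gap = generalization_gap(0.99, 0.71)
```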

• Dropout (D): It prevents overfitting by randomly deleting units in the neural network and allows for an approximately efficient combination of many different neural network architectures [16]. It was suggested by Salem et al. [10] and implemented as an MIA mitigation technique in ML models in a centralized framework.

• Monte Carlo Dropout (MCD): Proposed by Gal et al. [13], it captures the uncertainty of the model. The various networks (in which several neurons have been randomly disabled) can be viewed as Monte Carlo samples from the space of all available models. This provides a mathematical basis for the model to infer its uncertainty, often improving its performance. The approach allows dropout to be applied to the neural network during model inference [42]. Therefore, instead of making one prediction, multiple predictions are made, one for each sampled model (each with randomly disabled neurons), their distributions are averaged, and the average is taken as the final prediction.

• Batch Normalization (BN): A technique that improves accuracy by normalizing activations in the middle layers of deep neural networks [14]. Normalization has been used as a defense in label-only MIA, and the results show that both regularization and normalization can slightly decrease the average accuracy of the attack [32].

• Gaussian Noise (GN): The most practical perturbation-based model for describing the nonlinear effects caused by additive Gaussian noise [23]. GN has also been used to counter adversarial attacks [15].

• Gaussian Dropout (GD): The integration of Gaussian noise with the random probability of nodes. Unlike standard dropout, nodes are not entirely deleted; instead of being ignored, neurons are subjected to Gaussian noise. Srivastava's experiments [16] suggest that Gaussian dropout reduces computation time because the weights do not have to be rescaled each time to match the skipped nodes, as in standard dropout.

• Knowledge Distillation (KD): It distills and transfers knowledge from one deep neural network to another [19,46]. According to many MIA mitigation articles, KD outperforms the cutting-edge approaches [33,34] in terms of MIA mitigation, while other FL articles report that it facilitates effective communication [47-49] while maintaining the heterogeneity of the collaborating parties.

• Combination of KD with AR (AR-KD): In our early experiments [12], we noticed that, in most test cases, KD lowers the recall while preserving the model accuracy. In this work, we combine AR as a mitigation technique with KD. To the best of our knowledge, this is the first work that combines AR and KD and evaluates the results in both CL and FL.

• Combination of KD with GN (GN-KD): As with AR, we also combine GN and KD to see how they affect the attack recall and model accuracy. This is also the first paper that combines GN and KD and evaluates the performance of this combination in the CL and FL environments.

• Combination of KD with GD (GD-KD): We also combine KD and GD to see their effects on attack recall and model accuracy using five different optimizers on image datasets. To our knowledge, no prior work combines these two methods to evaluate how they behave against MIA; this is therefore the first paper that combines them and analyzes the results in both CL and FL environments.
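At the core of all three KD-based combinations is the teacher-student transfer via temperature-softened probabilities. The sketch below shows only that softening step, with a hypothetical temperature and hypothetical teacher logits; it is not the full distillation training loop used in our experiments.

```python
import numpy as np

def soft_targets(logits, T=4.0):
    """Temperature-softened softmax used in knowledge distillation.

    A higher temperature T spreads probability mass across classes,
    exposing more of the teacher's inter-class similarity structure
    for the student model to learn from.
    """
    z = np.asarray(logits, dtype=float) / T
    e = np.exp(z - z.max())   # subtract max for numerical stability
    return e / e.sum()

teacher_logits = [8.0, 2.0, -1.0]           # hypothetical teacher outputs
hard = soft_targets(teacher_logits, T=1.0)  # ordinary softmax
soft = soft_targets(teacher_logits, T=4.0)  # softened targets for the student
```

The student is trained against `soft` (often mixed with the ground-truth labels), which flattens the confidence peaks that MIA typically exploits.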

Performance Analysis
In this section, a summary of the experimental setup and results is provided. We performed our experiments on a 2.30 GHz 12th Gen Intel(R) Core(TM) i7-12700H processor with 16.00 GB RAM on the x64-based Windows 11 OS. We used open-source frameworks and standard libraries, such as Keras and TensorFlow in Python. The code of this work is available at [50].

Experimental Setup
In the following, we detail the experimental setup.

Datasets
The datasets in our experiments are CIFAR-10 [29], MNIST [27], FMNIST [28], and Purchase [30]. These are benchmark datasets for validating MIA, and they are the same as those used in recent related work [51]. CIFAR-10, MNIST, and FMNIST are image datasets; by normalizing, we fit the image pixel data into the range [0, 1], which helps to train the model more accurately. Purchase is a tabular dataset with 600 dimensions and 100 labels; we one-hot encoded it to feed it into the neural network [51]. Each dataset is split into 30,000 samples for training and 10,000 for testing. For training in the FL environment, the training dataset is uniformly divided among three FL participants, who train their local models separately and update the central server based on the FedAvg [3] algorithm to reach a globally optimal model.
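The uniform split of the training data among the three participants can be sketched as follows (the seed is arbitrary):

```python
import numpy as np

# Shuffle the 30,000 training indices and split them uniformly
# across the three FL participants.
rng = np.random.default_rng(seed=0)
train_idx = rng.permutation(30_000)
shards = np.array_split(train_idx, 3)  # one index shard per participant
```

Each participant then trains its local model only on the samples indexed by its own shard.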

Model Architecture
The models are based on the Keras Sequential API, a linear stack of neural network layers. In these models, we first defined a flattened input layer, followed by three dense layers. The MNIST and FMNIST input sizes are 28 × 28, while the CIFAR-10 input size is 32 × 32; the Purchase input size is 600, since the dataset has 600 features. We added all countermeasure layers between the dense layers. As knowledge distillation is an architectural mitigation technique, we ran a separate experiment to measure its performance. We specified an output size of 10 for MNIST, FMNIST, and CIFAR-10, whose class labels range from 0 to 9, and an output size of 100 for the Purchase dataset, whose labels range from 0 to 99. In addition, we set the activation function of the output layer to softmax so that the outputs sum to 1.

Training Setup
For training, we used the SGD, RMSProp, Adagrad, Nadam, and Adadelta optimizers with a learning rate of 0.01. The loss function for all optimizers is categorical cross-entropy. We used a batch size of 32 and 10 epochs for each participant during training. We reproduced the FL process, including local participant training and FedAvg aggregation. The data flow is illustrated in Figure 3.
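With equal-sized participants, the FedAvg aggregation step reduces to an element-wise mean of the clients' weight tensors. A minimal sketch (the client weights below are hypothetical toy values laid out like Keras's `model.get_weights()`):

```python
import numpy as np

def fed_avg(client_weights):
    """FedAvg for equal-sized clients: element-wise mean of each weight tensor.

    `client_weights` is a list (one entry per client) of lists of numpy
    arrays, one array per layer tensor.
    """
    return [np.mean(layer_stack, axis=0) for layer_stack in zip(*client_weights)]

# Three hypothetical clients, each holding one 2x2 kernel and one bias vector.
c1 = [np.array([[1., 2.], [3., 4.]]), np.array([0., 0.])]
c2 = [np.array([[3., 2.], [1., 0.]]), np.array([3., 3.])]
c3 = [np.array([[2., 2.], [2., 2.]]), np.array([0., 3.])]
global_w = fed_avg([c1, c2, c3])
```

The server installs `global_w` into the global model and broadcasts it back to the participants for the next round. With unequal client dataset sizes, the mean would be weighted by each client's sample count, as in the original FedAvg formulation.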

Evaluation Metrics
We focus on test accuracy as the evaluation metric for the FL model and on recall as the evaluation metric for successful attacks in the FL setting. The recall (true positive rate) represents the fraction of the members of the training dataset that are correctly inferred as members by the attacker.
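The recall metric can be stated precisely as TP / (TP + FN) over the true members. A minimal sketch with hypothetical membership flags:

```python
def mia_recall(member_flags, predicted_member):
    """Recall (true positive rate) of a membership inference attack:
    the fraction of true training-set members that the attacker flags."""
    tp = sum(1 for m, p in zip(member_flags, predicted_member) if m and p)
    fn = sum(1 for m, p in zip(member_flags, predicted_member) if m and not p)
    return tp / (tp + fn)

# 4 true members, of which the attacker correctly flags 3 -> recall 0.75.
r = mia_recall([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 1, 0])
```

Note that the false positive on the fifth (non-member) sample does not affect recall, which is why Carlini et al. [2] pair it with a constraint on the false positive rate.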

Comparison Methods
We investigated the performance of the four attacks listed in Table 2. Attack 1 employs multiple shadow models mimicking both the structure and the data distribution of the target model. Attack 2 applies a single shadow model whose structure differs from the target model but whose training data distribution imitates it. Unlike Attacks 1 and 2, in Attack 3, both the model structure and the training data distribution differ from the target model. Finally, Attack 4 applies the Jacobian matrix paradigm, an entirely different membership inference attack against the target model.

Experimental Results
In this section, we compare FL and CL and experimentally analyze the effect of the MIA and of the mitigation techniques in both environments, considering image and tabular datasets.

CL vs. FL
Many studies have thoroughly compared the CL and FL approaches [52,53], and FL has been found to be a network-efficient alternative to CL [54]. In our comparison of the two approaches, as shown in Figures 4-7, CL outperformed FL in terms of accuracy in most cases, which is expected. In Figure 7, the accuracy in CL is considerably lower than in FL for GN, GD, and AR. This is explained by the nature of the tabular dataset, which appears to overfit with the Adadelta and Nadam optimizers in the CL environment; the overfitting disappears when we apply these optimizers in the FL environment. The accuracy values are also tabulated in Tables 3 and 4 for the CL and FL environments, respectively. In all figures and tables, WC denotes the model accuracy (or attack recall) without any countermeasure included in the model. Figures 8-11 illustrate the attack recall in our experiments. An interesting observation concerns the Adadelta optimizer on the image datasets. Examining Adadelta's performance in Figures 4-6, we observe minimal loss in accuracy when using this optimizer, while the experiments depicted in Figures 8-10 indicate that, even when we do not implement any countermeasure (WC) to mitigate MIA, Adadelta alone results in a substantial reduction in attack recall; it is thus capable of functioning as a countermeasure without significantly compromising utility. For the tabular dataset, however, Adadelta does not perform significantly differently from the other optimizers, as shown in Figure 11. In all tables in this paper, the value in parentheses is the difference between that countermeasure and the corresponding value in the without-countermeasure (WC) column. Generally, the recall of Attack 1 in FL is almost the same as, if not lower than, the recall in CL across the different mitigation techniques. Figure 12 illustrates the effects of the five optimizers and the various countermeasures on FL model accuracy, where the y-axis gives the test accuracy of the FL model. As DP-SGD is specialized for the SGD optimizer, we applied DP only with SGD and not with the other optimizers. The first group in all plots is WC, the baseline without countermeasures. The full details of our experiments in the CL and FL environments are given in Tables 3 and 4, respectively.

• CL model accuracy with countermeasures: As per Table 3, the combination that yields the highest CL model accuracy for MNIST after applying countermeasures is Nadam with M. When we apply M as the countermeasure and Nadam as the optimizer, the accuracy of the model decreases slightly compared to the case with no countermeasure (WC); this is accompanied by an increase in attack recall, as per Table 5. In general, Nadam with M slightly decreases model accuracy and significantly increases attack recall for the MNIST dataset, while Adadelta with MCD provides the lowest model accuracy. For the FMNIST dataset, Adagrad with M yields even higher accuracy than no countermeasure; however, the attack recall is correspondingly high, as shown in Table 5. In CIFAR-10, AR and BN hold the highest accuracy, while MCD has the lowest. In the Purchase dataset, AR-KD yields the highest accuracy for all optimizers, even better than without countermeasures, while the attack recall in the Purchase dataset, as per Table 5, is reduced for SGD and Adagrad.

• FL model accuracy without countermeasure: As shown in Table 4, the highest FL model accuracy belongs to Nadam, RMSProp, Adadelta, and Adagrad on the MNIST, FMNIST, CIFAR-10, and Purchase datasets, respectively, whereas RMSProp on the CIFAR-10 and Purchase datasets and SGD on MNIST and FMNIST yield the lowest accuracy. In general, FL model accuracy is the lowest for Purchase and the highest for MNIST. This is explained by the nature of the datasets and the distribution of their features, which make each data record more or less distinguishable from the others. The reason some optimizers perform very well for a specific dataset in the CL environment but not for the same dataset in the FL environment is that these optimizers are sensitive to the FedAvg algorithm, in which the weights computed locally by the clients are averaged to generate the global model.

• FL model accuracy with countermeasures: As per Table 4, BN has no significant effect on the CIFAR-10 model accuracy. For CIFAR-10, the highest accuracy belongs to AR-KD with the SGD optimizer, and the lowest to MCD with the Adagrad optimizer. For MNIST and FMNIST, the countermeasure that maintains the maximum accuracy varies between optimizers. For instance, in FMNIST, the mitigation technique that keeps the model accuracy at its maximum is AR-KD for four optimizers (SGD, Adagrad, RMSProp, and Adadelta), while for Nadam, KD yields the highest accuracy in the FL environment; the lowest accuracy for FMNIST belongs to AR with the Nadam optimizer. For MNIST, the best accuracy goes to AR-KD with SGD and Adagrad, whereas M provides the highest accuracy with RMSProp and Nadam, and BN with Adadelta; the lowest accuracy for MNIST belongs to GD-KD with the Nadam optimizer. The highest accuracy for the Purchase dataset belongs to AR with SGD and Adagrad, and to KD with Nadam and Adadelta.

Reducing the attacks' recall is the clearest sign of MIA mitigation. Figure 13 illustrates the results of the four aforementioned attacks applying the five optimizers, with and without countermeasures, on the MNIST, FMNIST, CIFAR-10, and Purchase datasets; the y-axis represents the recall of the attack. The attack recall in CL is tabulated in Table 5, whereas the attack recall in FL is tabulated in Table 6 for the various datasets and optimizers. In CIFAR-10, the strongest attack is Attack 1 with SGD, with a recall value of 92.6%. GN-KD is capable of reducing this recall to 58.6% while causing a minimal 4.2% drop in FL accuracy, as detailed in Table 4. In the Purchase dataset, the most potent attack, Attack 4 using the RMSProp optimizer (96% recall), loses 80% of its effectiveness when AR-KD is applied. Notably, AR-KD not only avoids a decline in accuracy for the Purchase dataset with the RMSProp optimizer but also substantially boosts accuracy, by 52%. This improvement is attributed to the capacity of AR-KD to modify the model's architecture, thereby averting overfitting.

Accuracy-Recall Trade-Off
To obtain a clear comparison of the efficiency of the countermeasures, we calculated the ratio of accuracy over recall; the higher the ratio, the better the trade-off. Figure 14 illustrates the accuracy-recall ratio of each countermeasure. As shown in Figure 14, for almost all optimizers, the highest trade-off belongs to one of the combinatory approaches (AR-KD, GN-KD, or GD-KD). This figure shows that the combinational approaches we tested provide a better trade-off between the accuracy of the target model and MIA recall: a higher value means the mitigation technique keeps the accuracy of the target model high while reducing the attack recall as much as possible.

Privacy and Utility
Concluding from Tables 3-6, the combination of KD with either AR, GN, or GD has significant advantages over using each technique separately, as well as over the other conventional countermeasures. The experiments show that these new combinations of countermeasures successfully handle the trade-off between privacy and utility. Generally speaking, across all datasets and almost all optimizers, combining KD with AR, GD, or GN reduces the attack recall while preserving the accuracy of the model at a high level. Not only do these combinations preserve the utility of the model but, due to the nature of KD, in some cases they even increase model accuracy.

Conclusions
This research paper presents a thorough examination of the accuracy of centralized and federated learning models, as well as the recall rates associated with different membership inference attacks. Additionally, it evaluates the effectiveness of various defense mechanisms within both centralized and federated learning environments. Our experimental findings reveal that Attack 1 [9] yields the highest advantage for potential attackers,

Figure 3. Overview of the FL system.

Figure 4. Comparison of model accuracy of CL and FL using various optimizers and countermeasures-MNIST dataset.

Figure 5. Comparison of model accuracy of CL and FL using various optimizers and countermeasures-FMNIST dataset.

Figure 6. Comparison of model accuracy of CL and FL using various optimizers and countermeasures-CIFAR-10 dataset.

Figure 7. Comparison of model accuracy of CL and FL using various optimizers and countermeasures-Purchase dataset.

Figure 8. Comparison of Attack 1 recall on CL and FL using various optimizers and countermeasures-MNIST dataset.

Figure 9. Comparison of Attack 1 recall on CL and FL using various optimizers and countermeasures-FMNIST dataset.

Figure 10. Comparison of Attack 1 recall on CL and FL using various optimizers and countermeasures-CIFAR-10 dataset.

Figure 11. Comparison of Attack 1 recall on CL and FL using various optimizers and countermeasures-Purchase dataset.

Figure 12. A comparison of FL model accuracy with five different optimizers, with and without countermeasures-MNIST, FMNIST, CIFAR-10, and Purchase datasets.

• CL model accuracy without countermeasure: As per Table 3, the highest CL model accuracy results for Nadam, SGD, Adagrad, and Adagrad on the MNIST, FMNIST, CIFAR-10, and Purchase datasets, respectively. In contrast, Nadam on CIFAR-10 and Adadelta on MNIST, FMNIST, and Purchase yield the lowest accuracy. Generally speaking, the model accuracy values change depending on the dataset, the optimizer, and the batch size used in each round of training.

Figure 13. A comparison of the four attacks on the FL environment using five optimizers with and without countermeasures.

Figure 14. The ratio of the accuracy of the model over the recall of the attack model in the FL environment.

Table 1. Related work summary.

Table 2. Comparison of the considered attacks.

(Columns of Table 2: attack type, number of shadow models, target model structure, training data distribution, and prediction sensitivity.)
The attacker creates several (n) shadow models M^i_Shadow, where each shadow model i is trained on a dataset D^i_Shadow. The attacker first splits its dataset D^i_Shadow into two sets, D^i_TrainShadow and D^i_TestShadow, such that D^i_TrainShadow ∩ D^i_TestShadow = ∅. Then, the attacker trains each shadow model M^i_Shadow on the training set D^i_TrainShadow and tests the same model on the test set D^i_TestShadow. The attack model is a collection of models, one for each output class of the target data.

• Activity Regularization (AR): Activity regularization [17] is a technique used to encourage the model to exhibit specific properties in the activations (outputs) of the network's neurons during training. Its purpose is to prevent overfitting and to encourage desirable characteristics in the network's behavior. The L1 regularizer and the L2 regularizer are two such regularization techniques [24]: L1 regularization penalizes the sum of the absolute values of the weights, while L2 regularization penalizes the sum of their squares. Shokri et al. [9] used a conventional L2 regularizer as a defense against MIA in neural network models.

• Masking (M): Masking tells the sequence-processing layers that some steps are missing from the input and should be ignored during data processing [17]. If all input tensor values in a timestep are equal to the mask value, that timestep is masked (ignored) in all subsequent layers.

• Differential Privacy (DP): Differentially Private Stochastic Gradient Descent (DPSGD) is a differentially private version of the Stochastic Gradient Descent (SGD) algorithm applied during model training [18]: it clips per-example gradients and adds Gaussian noise to the gradient updates to provide differential privacy. DP [43-45] is a solid standard for ensuring the privacy of distributed datasets.

• Knowledge Distillation (KD): KD distills and transfers knowledge from one deep neural network (DNN) to another DNN.
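The shadow-model construction described above can be sketched as follows. This is a minimal illustration, not the paper's setup: a toy class-centroid classifier stands in for the shadow networks, and the synthetic data, split sizes, and number of shadow models are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_centroids(X, y):
    # Toy "model": one centroid per class.
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def predict_proba(centroids, X):
    # Softmax over negative distances to the centroids.
    d = np.stack([np.linalg.norm(X - c, axis=1) for c in centroids], axis=1)
    e = np.exp(-d)
    return e / e.sum(axis=1, keepdims=True)

def make_data(n=200):
    X = rng.normal(size=(n, 5))
    y = (X[:, 0] > 0).astype(int)
    return X, y

def build_attack_dataset(n_shadow=3):
    """For each shadow model i: train on D^i_TrainShadow and label its
    prediction vectors 'member' (1); label predictions on the disjoint
    D^i_TestShadow 'non-member' (0)."""
    AX, Ay = [], []
    for _ in range(n_shadow):
        X, y = make_data()                         # D^i_Shadow
        half = len(X) // 2
        shadow = fit_centroids(X[:half], y[:half])  # trained on D^i_TrainShadow
        AX.append(predict_proba(shadow, X[:half])); Ay.append(np.ones(half))
        AX.append(predict_proba(shadow, X[half:])); Ay.append(np.zeros(half))
    return np.vstack(AX), np.concatenate(Ay)

AX, Ay = build_attack_dataset()
# AX (prediction vectors) and Ay (member labels) train the attack model;
# in the paper, one attack model is built per output class of the target data.
```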

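The DPSGD update described in the DP bullet above can be sketched as per-example gradient clipping followed by Gaussian noise on the summed update. This is a numpy sketch under assumed hyperparameters: the clipping norm, noise multiplier, and learning rate are illustrative, not values used in the paper.

```python
import numpy as np

def dpsgd_update(per_example_grads, clip_norm=1.0, noise_multiplier=1.1,
                 lr=0.1, rng=None):
    """One DPSGD step: clip each per-example gradient to L2 norm <= clip_norm,
    sum them, add Gaussian noise scaled by the clipping norm, and average."""
    rng = rng or np.random.default_rng(0)
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return -lr * (total + noise) / len(per_example_grads)

grads = [np.array([3.0, 4.0]), np.array([0.3, 0.4])]  # L2 norms 5.0 and 0.5
step = dpsgd_update(grads)
```

With the noise multiplier set to zero, the step reduces to ordinary averaged SGD on the clipped gradients, which makes the clipping behavior easy to check in isolation.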
Table 6. Cont.

As shown in Table 5, for the MNIST dataset, the strongest attack is Attack 1 with RMSProp. The recall of this attack without any countermeasure is 99.4%, the highest among all attacks; merely changing the optimizer to Adadelta drops this value to 59.7% without using any countermeasure. The weakest attack is Attack 4 with Adadelta optimization, with a recall of 44%. For the FMNIST dataset, the strongest attack is Attack 1 with the SGD optimizer, and the weakest is Attack 4 with the Nadam optimizer. For CIFAR-10, the strongest attack is Attack 1 with the SGD optimizer, and the weakest is Attack 4 with the Adadelta optimizer. For the Purchase dataset, the strongest attack is Attack 4 with the RMSProp optimizer, and the weakest is Attack 4 with the Adagrad optimizer.

• CL attacks recall with countermeasures: As per Table 5, the different mitigation techniques yield different recall values for every attack. The strongest attack in the case of MNIST, Attack 1 with RMSProp, is defended by GD-KD with an impressive 65.2% reduction in recall, while GD-KD reduces model accuracy by only 8% according to Table 3. We can conclude that, in the CL environment, GD-KD provides the strongest defense with the lowest degradation of model accuracy, which is important for developing future ML models. For the FMNIST dataset, the strongest attack is Attack 1 with SGD. This attack is also defended by GD-KD, with a 37.4% reduction in recall, although the strongest defense for this particular attack and dataset is GN-KD, with a 37.9% recall reduction. It is noteworthy that GD-KD and GN-KD drop model accuracy by 15.4% and 13.9%, respectively, as shown in Table 3. The same holds for the CIFAR-10 dataset: the strongest attack is Attack 1 with the SGD optimizer, and GN-KD defends against it with a 34% reduction in attack recall. In the Purchase dataset, the strongest attack, Attack 4 with the RMSProp optimizer, is defended by GN-KD with an 84% reduction in recall. In general, we observe that, in the CL environment, combinations of KD with another countermeasure provide lower attack recall than the other mitigation techniques in most experiments, making these combinations the best defense against MIA on ML models in the CL environment.

• FL attacks recall without countermeasure: As shown in Figure 13 and Table 6, for the MNIST dataset, the highest attack recall (99%) belongs to Attack 1 with RMSProp. This value is significantly reduced to 65.7% by merely changing the optimizer to Adadelta, and impressively, this change does not drop model accuracy significantly: according to Table 4, Adadelta reduces FL model accuracy by approximately 1% compared to Nadam. For the FMNIST dataset, Attack 1 with Adagrad yields the highest attack recall (83.6%); changing the optimizer to Adadelta drops the attack recall to 68.9% without any mitigation technique, with accuracy falling from 91.7% to 84.1% according to Table 4. For the CIFAR-10 dataset, the highest attack recall is 79%, for Attack 1 with the SGD optimizer; this value drops to 65.9% by merely changing the optimizer to Adadelta. As with MNIST and FMNIST, this change does not significantly affect FL model accuracy: as shown in Table 4, the accuracy on CIFAR-10 with Adadelta drops by only roughly 2%. For the Purchase dataset, the strongest attack is Attack 4 with the RMSProp optimizer, with a 96% recall; without any countermeasure, the lowest recall for this dataset belongs to Attack 3 with the Nadam optimizer.

• FL attacks recall with countermeasures: As shown in Table 6, the various mitigation techniques exhibit varying performance. In general, however, the combinations of KD with either GD, GN, or AR consistently offer improved protection while preserving the model's utility. For MNIST with RMSProp, GN-KD reduces the recall of Attack 1, the most potent attack in our FL MNIST experiments, by 32.5%. Remarkably, this reduction is achieved with only an 11% decrease in FL model accuracy, as indicated in Table 4. In the case of FMNIST, Table 6 reveals that Attack 1 with Adagrad exhibits a high recall of 83.6%. This attack can be mitigated by GN-KD, resulting in a 20.7% reduction in recall; this defense incurs a modest accuracy drop of 9.7%, as reflected in Table 4.
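The knowledge-distillation component used in the KD-based defenses above can be sketched as training a student network on the teacher's temperature-softened predictions mixed with the hard labels. The temperature T, mixing weight alpha, and example logits below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax, numerically stabilized.
    e = np.exp((z - z.max(axis=-1, keepdims=True)) / T)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Weighted sum of (a) cross-entropy with the hard labels and
    (b) cross-entropy with the teacher's softened predictions,
    rescaled by T^2 as is conventional in distillation."""
    p_student = softmax(student_logits)
    hard = -np.log(p_student[np.arange(len(labels)), labels] + 1e-12).mean()
    p_teacher_T = softmax(teacher_logits, T)
    p_student_T = softmax(student_logits, T)
    soft = -(p_teacher_T * np.log(p_student_T + 1e-12)).sum(axis=1).mean()
    return alpha * hard + (1 - alpha) * (T ** 2) * soft

# Illustrative logits for a single 3-class example:
logits_teacher = np.array([[2.0, 0.5, -1.0]])
logits_student = np.array([[1.5, 0.2, -0.5]])
loss = distillation_loss(logits_student, logits_teacher, labels=np.array([0]))
```

Minimizing this loss pulls the student's output distribution toward the teacher's smoothed distribution, which is the mechanism the KD-based defenses rely on to reduce the membership signal leaked by overconfident predictions.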