Electronics
  • Article
  • Open Access

13 November 2023

FLIBD: A Federated Learning-Based IoT Big Data Management Approach for Privacy-Preserving over Apache Spark with FATE

1 Computer Engineering and Informatics Department, University of Patras, 26504 Patras, Greece
2 Department of Management Science and Technology, University of Patras, 26334 Patras, Greece
* Authors to whom correspondence should be addressed.
This article belongs to the Special Issue Data Privacy and Cybersecurity in Mobile Crowdsensing

Abstract

In this study, we introduce FLIBD, a novel strategy for managing Internet of Things (IoT) Big Data, intricately designed to ensure privacy preservation across extensive system networks. By utilising Federated Learning (FL), Apache Spark, and Federated AI Technology Enabler (FATE), we skilfully investigated the complicated area of IoT data management while simultaneously reinforcing privacy across broad network configurations. Our FLIBD architecture was thoughtfully designed to safeguard data and model privacy through a synergistic integration of distributed model training and secure model consolidation. Notably, we delved into an in-depth examination of adversarial activities within federated learning contexts. The Federated Adversarial Attack for Multi-Task Learning (FAAMT) was thoroughly assessed, unmasking its proficiency in showcasing and exploiting vulnerabilities across various federated learning approaches. Moreover, we offer an incisive evaluation of numerous federated learning defence mechanisms, including Romoa and RFA, in the scope of the FAAMT. Utilising well-defined evaluation metrics and analytical processes, our study demonstrated a resilient framework suitable for managing IoT Big Data across widespread deployments, while concurrently presenting a solid contribution to the progression and discussion surrounding defensive methodologies within the federated learning and IoT areas.

1. Introduction

Federated Learning (FL) is poised to play an essential role in extending the Internet of Things (IoT) and Big Data ecosystems by enabling entities to harness the computational power of private devices, thus safeguarding user data privacy [1]. Despite its benefits, FL is vulnerable to multiple types of assaults, including label-flipping and covert attacks [2,3,4]. The label-flipping attack specifically targets the central model by manipulating its decisions for a specific class, which can result in biased or incorrect results. To protect federated learning from data poisoning, researchers have devised techniques like differential privacy, which conceals user data details, and secure multi-party computation, which confidentially processes data across different sources. These methods strengthen FL’s defences, preserving both data privacy and integrity [5,6,7,8]. To counter these threats, the research community has developed a plethora of privacy-preserving mechanisms and advanced techniques for improved model training, optimisation, and deployment while preserving the accuracy of the central model.
In the context of the Internet of Things and Big Data systems, Federated Learning (FL) has emerged as a vital paradigm for addressing the challenges associated with distributed data processing, privacy preservation, and resource utilisation. FL enables decentralised machine learning on edge devices, facilitating efficient data processing without compromising privacy [9,10]. Federated Learning (FL) stands out as a pivotal approach to addressing the complex issues arising from the integration of the Internet of Things (IoT) with Big Data analytics. In the vast and dynamic landscape of the IoT, where Intelligent Transportation Systems (ITSs) stand as a cornerstone, FL is being adopted to navigate the complexities of privacy and data management. ITSs, which are a quintessential source of Big Data, benefit from FL’s decentralised nature, enabling enhanced data privacy without compromising the utility of the information extracted from countless daily transactions [11]. In parallel, FL is advancing the field of healthcare, particularly in the management of Sexually Transmitted Infections (STIs), by providing a framework that respects patient confidentiality while harnessing large datasets for better disease control [12].
The potential of FL extends beyond healthcare into the automotive industry, particularly in the vehicular IoT. The system’s ability to harness a multitude of end-user data for local model training makes it a promising alternative to traditional GPS navigation in urban environments. This is because it allows for the collection and processing of data points across diverse vehicular trajectories, thus enabling more-precise and context-aware navigation systems [13]. The privacy-centric design of FL ensures that sensitive user data remain on local devices, thereby reducing the risk of breaches and unauthorised access, a compelling advantage for massive user data applications [14].
Despite its significant advantages, FL is not without its vulnerabilities, with data poisoning attacks representing a salient threat to its security paradigm. These attacks compromise the integrity of learning models by injecting malicious data into the training process. Recent studies have shown that FL systems are susceptible to such threats, prompting an increase in research focused on fortifying these systems against potential breaches [15]. The research presented in [16,17] examined the security vulnerabilities of federated learning within IoT ecosystems and suggested measures to strengthen its security. Ultimately, as FL continues to be integrated across various sectors, ensuring the security of its systems against such attacks is imperative, warranting further investigation and development of sophisticated defence mechanisms, as illustrated by the extensive research in the studies [18,19,20,21,22,23,24,26].
In an FL scenario, the label-flipping attack is a significant concern because it can manipulate the central decisions of the model for a specific class, without reducing its overall accuracy. This manipulation can result in biased or incorrect results, which can have far-reaching consequences for businesses and individuals. To address this issue, the research community has developed various countermeasures to mitigate the risk of data poisoning attacks. The results of the experiments described in this paper provide valuable insights into the efficacy of these privacy-preserving mechanisms and their potential to mitigate the risk of data poisoning attacks in FL systems. The findings are particularly relevant for businesses that rely on FL for training machine learning models as they can help ensure the security and integrity of the models while promoting reliable results.
The contribution of this work is to initiate a comprehensive exploration of the security aspects of federated learning within the scope of IoT Big Data management using the perspective of the Federated Adversarial Attack for Multi-Task Learning (FAAMT) algorithm. Within our proposed model framework, we traversed the complex routes of managing large-scale data, ensuring that the inherent privacy features of federated learning are not jeopardised by potential adversarial attacks. Our research clearly outlines the potential vulnerabilities and susceptibilities of federated multi-task learning systems to data poisoning attacks, shedding light on essential insights to strengthen the robustness and security of the FLIBD approach. By integrating the robust, privacy-preserving Big Data management approach with a thorough analysis and mitigation of adversarial attacks via FAAMT and relevant defence mechanisms, this study creates a well-defined, stable foundation that ensures both the efficient management of IoT Big Data and the protection of collaborative, federated learning experiences in multi-task learning environments. Thus, our work presents a balanced combination of efficient data management and enhanced security, driving the implementation of federated learning in real-world IoT and Big Data applications towards a secure, privacy-preserving future.
The remainder of this work is structured as follows. In Section 2, related work in the field of federated learning and data poisoning is surveyed in detail, along with state-of-the-art methods and data encryption applications. Section 3 outlines the methodology at both the theoretical and application levels, while Section 4 delves into the various types of attacks on federated learning. Section 5 highlights the experimental results and their findings, and Section 6 discusses them. Finally, Section 7 concludes this work, and Section 7.1 provides a roadmap for future research directions.

3. Methodology

In this section, we provide a comprehensive overview of the steps taken in the data poisoning attack strategies that we propose for federated machine learning. Initially, we introduce the concept of federated multi-task learning, which functions as a general framework for multi-task learning in a federated environment. We then present the mathematical formulation of our poisoning attack and describe how to further optimise the model. The abbreviations for the variables used in our work are given in Table 2. To make it simpler for the reader to follow our methodology and comprehend the underlying principles, we introduce and elucidate these notations in depth throughout the paper. Moreover, we provide a representation of our methodology, which includes plots, to help the reader understand the steps involved in our proposed attack.
Table 2. Notation for utilised variables.

Objectives of Federated Machine Learning

In this section, we address the issue of data poisoning attacks in federated multi-task learning. We introduce three types of attacks based on real-world scenarios and propose a bilevel formulation to determine the optimal attacks. The goal of the attacker is to degrade the performance of a set of target nodes $\mathcal{T}$ by injecting corrupted or poisoned data into a set of source attacking nodes $\mathcal{S}$. The objective of federated machine learning is to develop a model from data generated by $n$ distinct devices denoted as $D_i$ with $i \in [n]$, which may have different data distributions. To address this, separate models $w_1, \ldots, w_n$ are trained on individual local datasets. The focus of this paper is on a horizontal (sample-based) federated learning model known as federated multi-task learning [83], which is a general multi-task learning framework applied in a federated setting. The federated multi-task learning model can be represented as:
$$\sum_{i=1}^{n} \sum_{j=1}^{m_i} \mathcal{L}_i\!\left(w_i^{\top} x_{i,j},\, y_{i,j}\right),$$
regularised by the trace of the product $W \Omega W^{\top}$ scaled by $\lambda_1$ and the squared Frobenius norm of $W$ scaled by $\lambda_2$:
$$\lambda_1 \operatorname{Tr}\!\left(W \Omega W^{\top}\right) + \lambda_2 \|W\|_F^{2}.$$
The symbol $\operatorname{Tr}$ denotes the trace, and $\|W\|_F$ denotes the Frobenius norm of $W$.
Here, $(x_{i,j}, y_{i,j})$ is the $j$-th sample of the $i$-th device and $m_i$ represents the number of clean samples in the $i$-th device. The matrix $W \in \mathbb{R}^{d \times n}$, whose $i$-th column is the weight vector for the $i$-th device, comprises the weight vectors $w_1, \ldots, w_n$. The relationships among the devices are modelled by $\Omega \in \mathbb{R}^{n \times n}$, and the regularisation terms are controlled by the parameters $\lambda_1 > 0$ and $\lambda_2 > 0$.
In the context of federated machine learning, the $\Omega$ matrix can be computed separately from the local data, as its optimisation does not require direct access to them. A key contribution of [83] was the creation of an effective method for distributed optimisation to update the $W$ matrix. This was accomplished by incorporating the ideas of distributed primal–dual optimisation, which were previously outlined in [84].
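To make the objective above concrete, the following minimal sketch (Python/NumPy; the names, the toy data, and the use of a squared loss as $\mathcal{L}_i$ are illustrative assumptions, not the authors' implementation) evaluates the regularised federated multi-task objective for a handful of devices:

```python
import numpy as np

def fmtl_objective(W, Omega, devices, lam1=0.1, lam2=0.01):
    """W: (d, n) weight matrix, one column per device.
    Omega: (n, n) task-relationship matrix.
    devices: list of (X_i, y_i) pairs with X_i of shape (m_i, d)."""
    loss = 0.0
    for i, (X_i, y_i) in enumerate(devices):
        preds = X_i @ W[:, i]                       # per-device predictions
        loss += 0.5 * np.sum((preds - y_i) ** 2)    # squared loss standing in for L_i
    reg = lam1 * np.trace(W @ Omega @ W.T) + lam2 * np.linalg.norm(W, "fro") ** 2
    return loss + reg

# Toy usage: 3 devices, 5-dimensional features, 20 samples each.
rng = np.random.default_rng(0)
devices = [(rng.normal(size=(20, 5)), rng.normal(size=20)) for _ in range(3)]
W = rng.normal(size=(5, 3))
Omega = np.eye(3)
print(fmtl_objective(W, Omega, devices))
```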
Let $m = \sum_{i=1}^{n} m_i$ and $X = \operatorname{Diag}(X_1, \ldots, X_n) \in \mathbb{R}^{nd \times m}$. Holding the $\Omega$ matrix constant, the optimisation problem can be re-expressed in its dual form with the dual variables $\alpha_i \in \mathbb{R}^{m_i}$ as follows:
$$\min_{\alpha, W, \Omega} \ \sum_{i=1}^{n} \sum_{j=1}^{m_i} \mathcal{L}_i^{*}\!\left(-\alpha_{i,j}\right) + \lambda_1 \operatorname{Tr}\!\left(W \Omega W^{\top}\right),$$
where $\operatorname{Tr}$ denotes the trace function and $\lambda_1$ is a hyperparameter. The optimisation process of finding the $\Omega$ matrix is separable from the data, allowing it to be calculated at a central location. The study by [83] contributed a robust method for distributed optimisation to refine the $W$ matrix, building upon the concepts of distributed primal–dual optimisation described in [84]. The optimisation involves the conjugate dual function $\mathcal{L}_i^{*}$ of $\mathcal{L}_i$ and the term $\lambda_1$ that regularises $\operatorname{Tr}(W \Omega W^{\top})$. The variable $\alpha_{i,j}$ in $\alpha$ represents the $j$-th sample $(x_{i,j}, y_{i,j})$ from the $i$-th device. In this work, we designate $D_i = \left\{ (x_{i,j}, \alpha_i, y_{i,j}) \mid x_{i,j} \in \mathbb{R}^{d},\ \alpha_i \in \mathbb{R}^{m_i},\ y_{i,j} \in \mathbb{R}^{m_i} \right\}$ as the uncorrupted data in device $i$. The malicious data injected into node $i$ are represented as:
$$\tilde{D}_i = \left\{ (\tilde{X}_i, \tilde{\alpha}_i, \tilde{y}_i) \mid \tilde{X}_i \in \mathbb{R}^{d \times \tilde{n}_i},\ \tilde{\alpha}_i \in \mathbb{R}^{\tilde{n}_i},\ \tilde{y}_i \in \mathbb{R}^{\tilde{n}_i} \right\},$$
where $\tilde{n}_i$ is the number of injected samples for node $i$. If $i \notin \mathcal{S}$, then $\tilde{D}_i = \emptyset$, meaning that $\tilde{n}_i = 0$. The three types of attacks we focused on for FL systems are listed below (a brief illustrative sketch of the corresponding source and target sets follows the list):
  • Direct attack: $\mathcal{T} = \mathcal{S}$. The attacker can directly inject data into all the target nodes, exploiting a vulnerability during data collection. For instance, in the case of mobile phone activity recognition, an attacker can provide counterfeit sensor data to directly attack the target mobile phones (nodes).
  • Indirect attack: When no direct data injection is possible into the targeted nodes, symbolised by $\mathcal{T} \cap \mathcal{S} = \emptyset$, the attacker can exert influence on these nodes indirectly. This is achieved by introducing tainted data into adjacent nodes, exploiting the communication protocol to subsequently impact the target nodes. Such an attack leverages the interconnected nature of the network, allowing for the propagation of the attack effects through the established data-sharing pathways.
  • Hybrid attack: This form of attack represents a combination of both direct and indirect methods, wherein the aggressor has the capability to contaminate the data pools of both the target and source nodes at once. By employing this dual approach, the attacker enhances the potential to disrupt the network, manipulating data flows and learning processes by simultaneously injecting poisoned data into multiple nodes, thereby magnifying the scope and impact of the attack across the network.
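As a quick illustration of how the three cases differ (the node indices below are arbitrary and purely for demonstration), the following sketch encodes the relationship between the target set $\mathcal{T}$ and the source set $\mathcal{S}$ for each attack type:

```python
# Illustrative only: how the source set S and target set T relate
# for the three attack types (node indices are arbitrary).
nodes = set(range(10))                       # 10 participating devices

direct   = {"T": {2, 5}, "S": {2, 5}}        # T = S: poison the targets directly
indirect = {"T": {2, 5}, "S": {7, 8}}        # T ∩ S = ∅: poison neighbouring nodes only
hybrid   = {"T": {2, 5}, "S": {2, 7}}        # poison a mix of target and non-target nodes

assert all(d["T"] <= nodes and d["S"] <= nodes for d in (direct, indirect, hybrid))
assert direct["T"] == direct["S"]
assert indirect["T"].isdisjoint(indirect["S"])
assert hybrid["S"] & hybrid["T"] and hybrid["S"] - hybrid["T"]
```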
The objective to maximally impair the functioning of the target nodes is structured as a two-tiered optimisation challenge, in line with the model proposed by [55]:
$$\max_{\{\tilde{D}_s \mid s \in \mathcal{S}\}} \ \sum_{t \in \mathcal{T}} \mathcal{L}_t(D_t, w_t), \quad \text{s.t.} \quad \min_{\alpha, W, \Omega} \ \sum_{\ell=1}^{m} \frac{1}{n_\ell + \tilde{n}_\ell}\, \mathcal{L}_\ell\!\left(D_\ell \cup \tilde{D}_\ell\right) + \lambda_1 \mathcal{R}(X\alpha).$$
Here, $\{\tilde{D}_s \mid s \in \mathcal{S}\}$ denotes the set of injected data for the source attacking nodes. The upper-level problem is defined as maximising the performance degradation of the target nodes and is denoted as $\mathcal{H}$. The secondary optimisation issue, addressed by the second condition, concerns a federated multi-task learning scenario in which the training dataset comprises a combination of unaltered and compromised data points.
$$\max_{\{\tilde{D}_s \mid s \in \mathcal{S}\}} \ \sum_{t \in \mathcal{T}} \mathcal{L}_t(D_t, w_t), \quad \text{s.t.} \quad \min_{\alpha, W, \Omega} \ \sum_{\ell=1}^{m} \frac{1}{n_\ell + \tilde{n}_\ell}\, \mathcal{L}_\ell\!\left(D_\ell \cup \tilde{D}_\ell, W\right) + \lambda_1 \mathcal{R}(X\alpha) + \lambda_2 \|W\|_F^{2}, \qquad \|\alpha_i\|_2 \le C,\ \ i \in \{1, \ldots, n\}.$$
The adjusted optimisation challenge, as specified in Equation (8), is detailed as follows:
  • The aim, symbolised by $\max_{\{\tilde{D}_s \mid s \in \mathcal{S}\}} \sum_{t \in \mathcal{T}} \mathcal{L}_t(D_t, w_t)$, seeks to escalate the collective loss for all target nodes within the group $\mathcal{T}$, with $\tilde{D}_s$ representing the data manipulated by the attackers.
  • The initial limitation strives to diminish the total loss for all learning tasks under the federation, depicted by $\min_{\alpha, W, \Omega} \sum_{\ell=1}^{m} \frac{1}{n_\ell + \tilde{n}_\ell}\, \mathcal{L}_\ell(D_\ell \cup \tilde{D}_\ell, W) + \lambda_1 \mathcal{R}(X\alpha) + \lambda_2 \|W\|_F^{2}$. This includes both the original and tampered data, with $\mathcal{L}_\ell$ denoting the loss for each task $\ell$.
  • To deter overfitting, which is fitting the model too closely to a limited set of data points, the model includes the penalty terms $\lambda_1 \mathcal{R}(X\alpha)$ and $\lambda_2 \|W\|_F^{2}$. These terms penalise the model's complexity, balanced by the parameters $\lambda_1$ and $\lambda_2$.
  • A constraint $\|\alpha_i\|_2 \le C$ ensures that the model's parameters $\alpha$ do not exceed a certain threshold $C$, maintaining the model's general performance and stability.
In essence, this framework streamlines the complex aspects of federated learning when tasked with handling both unmodified and tampered data, a situation often arising in extensive data settings such as the IoT.
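For intuition, the following minimal sketch (assumed names, not the paper's code) spells out two of the ingredients highlighted above: the upper-level objective $\mathcal{H}$ as the summed loss over target nodes, and enforcement of the $\|\alpha_i\|_2 \le C$ constraint by rescaling:

```python
# Illustrative sketch: upper-level attack objective and the dual-norm constraint.
import numpy as np

def upper_level_objective(target_losses, T):
    """H: summed loss L_t(D_t, w_t) over the target nodes t in T."""
    return sum(target_losses[t] for t in T)

def enforce_dual_constraint(alpha_i, C):
    """Rescale alpha_i so that ||alpha_i||_2 <= C (a simple projection)."""
    norm = np.linalg.norm(alpha_i)
    return alpha_i if norm <= C else alpha_i * (C / norm)

# Toy usage: three candidate nodes, two of which are targets, plus a violating dual vector.
print(upper_level_objective({2: 0.8, 5: 1.1, 7: 0.3}, T={2, 5}))
print(enforce_dual_constraint(np.array([3.0, 4.0]), C=1.0))
```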

4. Attacks on Federated Learning

This section presents the newly developed FAAMT algorithm, designed to identify the most-strategic approaches for launching attacks within federated learning environments. The algorithm builds upon the commonly used approach in data poisoning attack research, based on related works, and utilises a projected stochastic gradient ascent method to effectively raise the empirical loss of the target nodes, thereby reducing their performance in classification or regression tasks.
In the health IoT domain, a notable innovation is the DPFL-HIoT model, which applies Gradient Boosting Trees (GBTs) to detect healthcare fraud [85]. This method effectively safeguards patient data privacy while still enabling the machine learning model training process. Considering the sensitivity of health information and the difficulties in merging such data, FL emerges as a practical AI-powered tool. It serves the vast amounts of medical data stored in healthcare institutions, crucial for enabling smart health services like telehealth, diagnostics, and ongoing patient monitoring. FL, thus, offers a critical solution that allows for the use of these data while prioritising privacy.
A recent study described a method for executing tensor decomposition while safeguarding privacy within federated cloud systems [86]. This method is detailed extensively, alongside a comprehensive analysis of its complexity. It employs the Paillier encryption scheme and secure multi-party computation to ensure the confidentiality of the decomposition process. Additionally, the paper evaluates the precision of its predictive ratings alongside the computational and communication overhead incurred by users. The analysis of complexity includes the unfolding of tensors, the application of the Lanczos algorithm to these unfolded matrices, the determination of truncated orthogonal bases for tridiagonal matrices, and the construction of the core tensor using encrypted data. The findings within this paper validate the method’s efficacy in maintaining privacy while facilitating tensor decomposition in a federated cloud context.
An innovative strategy has been developed to address the complexity of IoT device data and privacy concerns in Recurrent Neural Network (RNN) models [87]. This approach introduces a Differentially Private Tensor-based RNN (DPTRNN) framework dedicated to maintaining the confidentiality of sensitive user information within IoT applications. The proposed DPTRNN model is designed to tackle the challenges of dealing with heterogeneous IoT data and privacy issues in RNN models. Employing a tensor-based algorithm for back-propagation through time, enhanced with perturbation techniques to ensure privacy, the efficacy of the DPTRNN model is evaluated against video datasets. The results indicated that the model surpasses previous methods in terms of accuracy while providing a higher level of privacy protection.

4.1. Strategies for Attacking via Alternating Minimisation

The task of refining the attack strategy within bilevel problems presents a significant hurdle, mainly attributed to the intricate and non-convex characteristics of the subordinate problem. In our bilevel construct, the dominant problem is a straightforward primal issue, whereas the subordinate issue reveals a complex, non-linear, and non-convex nature. To navigate these complexities, we adopted an alternating minimisation technique. This method involves a cyclic update of the injected data to amplify the impact on the function H , which quantifies the performance deterioration of the targeted nodes. This technique is instrumental in addressing the challenging non-convexity of the subordinate problem, paving the way for an enhanced attack stratagem tailored to federated learning contexts.
By implementing this alternating minimisation, we capitalise on its inherent flexibility to fine-tune our approach iteratively, aligning our injected data with the optimal points of disruption. Each iteration brings us closer to a strategy that can subtly, yet substantially weaken the model’s performance from within. Moreover, this method’s iterative nature allows us to monitor and adjust the attack in real-time, reacting to changes and defences put in place by the network, ensuring that our attack remains effective throughout the learning process. Such adaptability makes it an indispensable tool in the arsenal for compromising federated learning systems.
To tackle the non-convex nature of the lower-level problem in the bilevel optimisation of the attack problem, we adopted an alternating minimisation approach, where we optimise the features of the injected data, denoted by $(\tilde{D}_{s,i}, \alpha_{s,i})$, while fixing the labels of the injected data. The updates to the injected data $\tilde{D}_{s,i}$ are obtained via the projected stochastic gradient ascent method, as shown below:
$$\tilde{D}_{s,i}^{k} \leftarrow \operatorname{Proj}_{\mathcal{X}}\!\left( \tilde{D}_{s,i}^{k-1} + \eta_1 \nabla_{\tilde{D}_{s,i}^{k-1}} \mathcal{H} \right).$$
Here, $\eta_1$ denotes the learning rate, $k$ indexes the current iteration, and $\mathcal{X}$ defines the set of all permissible injected data, as outlined by the initial restriction in the primary problem $\mathcal{H}$. The projection operator $\operatorname{Proj}_{\mathcal{X}}(\cdot)$ ensures that the injected data lie within the $\ell_2$-norm ball of radius $r$. The corresponding dual variable $\alpha_{s,i}$ is updated accordingly as $\tilde{D}_{s,i}$ evolves:
$$\alpha_{s,i}^{k} \leftarrow \alpha_{s,i}^{k-1} + \Delta\alpha_{s,i},$$
where $\Delta\alpha_{s,i}$ is the step in the dual space that maximises the upper-level problem $\mathcal{H}$. The equation for computing the gradient of the $t$-th target node in Equation (9) is derived using the chain rule:
$$\nabla_{\hat{x}_a^t} \mathcal{L}_t\!\left(D_t, w_t\right) = \nabla_{w_t} \mathcal{L}_t\!\left(D_t, w_t\right) \cdot \frac{\partial w_t}{\partial \hat{x}_a^t}.$$
The gradient of the objective function at the upper level with respect to both $\hat{x}_a^t$ and $w_t$ is given by $\nabla_{\hat{x}_a^t} \mathcal{L}_t(D_t, w_t) = \nabla_{w_t} \mathcal{L}_t(D_t, w_t) \times \frac{\partial w_t}{\partial \hat{x}_a^t}$. The first term on the right-hand side, $\nabla_{w_t} \mathcal{L}_t(D_t, w_t)$, is readily computable as it solely relies on the loss function $\mathcal{L}_t(\cdot)$. Conversely, the second term, $\frac{\partial w_t}{\partial \hat{x}_a^t}$, hinges on the optimal conditions of the lower-level problem as set out in Equation (7).
In order to ascertain $\frac{\partial w_t}{\partial \hat{x}_a^t}$, we commence by fixing the parameter $\Omega$ in the lower-level issue to sidestep its constraints, leading us to the dual problem stated as:
$$\min_{\alpha, W} \ \sum_{\ell=1}^{m} \frac{1}{n_\ell + \tilde{n}_\ell}\, \mathcal{L}_\ell\!\left(D_\ell \cup \tilde{D}_\ell\right) + \lambda_1 \mathcal{R}(X\alpha).$$
Considering the optimality conditions of the lower-level problem as constraints within the upper-level problem, we treat $\Omega$ as a constant matrix during the gradient determination. Furthermore, given that $\mathcal{R}(X\alpha)$ is continuously differentiable, we correlate $W$ and $\alpha$ through the relationship $w(\alpha) = \nabla \mathcal{R}(X\alpha)$. Lastly, we proceed with updating the dual variable $\alpha$ and calculating the ensuing gradients as shown in Equation (9).
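A minimal sketch of this projected stochastic gradient ascent step, assuming the feasible set $\mathcal{X}$ is the $\ell_2$-ball of radius $r$ and that the gradient of $\mathcal{H}$ is supplied externally (names are illustrative):

```python
import numpy as np

def project_l2_ball(X_tilde, r):
    """Scale each injected point back inside the l2-ball of radius r if needed."""
    norms = np.linalg.norm(X_tilde, axis=1, keepdims=True)
    return X_tilde * np.minimum(1.0, r / np.maximum(norms, 1e-12))

def psga_step(X_tilde, grad_H, eta1=0.1, r=1.0):
    """One ascent step on H followed by projection onto the feasible set."""
    return project_l2_ball(X_tilde + eta1 * grad_H, r)

# Toy usage: two injected points in 4 dimensions pushed outside the ball and projected back.
X_new = psga_step(np.ones((2, 4)), grad_H=0.5 * np.ones((2, 4)), eta1=0.2, r=1.0)
print(X_new)
```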
To update the dual variable $\alpha$, we seek to maximise the dual objective with a least-squares loss or hinge loss function by updating each element of $\alpha$ individually. The optimisation problem in Equation (11) can be reformulated into a constrained optimisation problem for node $\ell$ as:
$$\min_{\alpha} \ \sum_{\ell=1}^{m} \frac{1}{n_\ell + \hat{n}_\ell} \left( \mathcal{L}_\ell(\alpha_i) + \mathcal{L}_\ell(\hat{\alpha}_i) \right) + \lambda_1 \mathcal{R}\!\left(X\,[\alpha; \hat{\alpha}]\right).$$
To enable distributed computation across multiple nodes, the optimisation problem in Equation (10) can be separated into smaller, data-local subproblems. This is achieved by making a quadratic approximation of the dual problem. At each step $k$, two samples are chosen randomly, one from the original clean data (i.e., $i \in \{1, \ldots, n\}$) and one from the injected data (i.e., $i \in \{1, \ldots, \hat{n}\}$). The updates for both $\alpha_i$ and $\hat{\alpha}_i$ in node $\ell$ can be computed as:
$$\alpha_i^{k} = \alpha_i^{k-1} + \Delta\alpha_i, \qquad \hat{\alpha}_i^{k} = \hat{\alpha}_i^{k-1} + \Delta\hat{\alpha}_i,$$
where $\Delta\alpha_i$ and $\Delta\hat{\alpha}_i$ are the stepsizes that maximise the dual objective in Equation (15) when all other variables are fixed. To achieve maximum dual ascent, we optimise:
$$\Delta\alpha_i = \arg\min_{a \in \mathbb{R}} \ \mathcal{L}^{*}\!\left(\alpha_i + a\right) + a\,\langle w(\alpha), x_i \rangle + \lambda_2\, \| x_i a \|_{M}^{2},$$
$$\Delta\hat{\alpha}_i = \arg\min_{\hat{a} \in \mathbb{R}} \ \mathcal{L}^{*}\!\left(\hat{\alpha}_i + \hat{a}\right) + \hat{a}\,\langle w(\hat{\alpha}), x_i \rangle + \lambda_2\, \| x_i \hat{a} \|_{M}^{2}.$$
Here, $M$ is a symmetric positive definite matrix and $M_{\ell\ell} \in \mathbb{R}^{d \times d}$ is the $\ell$-th diagonal block of $M$. The inverse of $M$ is expressed as
$$M^{-1} = \bar{\Omega} + \frac{\lambda_2}{\lambda_1}\, I_{md \times md},$$
where $\bar{\Omega} := \Omega \otimes I_{d \times d} \in \mathbb{R}^{md \times md}$ and $\lambda_1$ and $\lambda_2$ are positive constants. For the least-squares loss function, the closed-form solution for $\Delta\alpha_i$ is:
$$\Delta\alpha_i = \frac{y_i - x_i^{\top} x_i\, \alpha_i - 0.5\,\alpha_i^{k-1}}{0.5 + \lambda_1 \| x_i \|_{M}^{2}}.$$
For the hinge loss, $\Delta\hat{\alpha}_{i_r}$ can be computed as follows:
$$\Delta\hat{\alpha}_{i_r} = y_{i_r} \max\!\left( 0,\ \min\!\left( 1,\ \frac{1 - x_{i_r}^{\top} x_{i_r}\, \alpha^{k-1} y_{i_r}}{\lambda_1 \| x_{i_r} \|_{M}^{2}} + \alpha^{k-1} y_{i_r} \right) \right) - \alpha^{k-1}.$$
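The matrix $M$ above can be formed explicitly for small problems. The sketch below (toy sizes and illustrative names) builds $M^{-1} = \bar{\Omega} + \frac{\lambda_2}{\lambda_1} I$ with $\bar{\Omega} = \Omega \otimes I_d$ via a Kronecker product and inverts it, which is only practical when $md$ is modest:

```python
import numpy as np

def build_M(Omega, d, lam1, lam2):
    """Return M given the (m x m) task-relationship matrix Omega."""
    m = Omega.shape[0]
    Omega_bar = np.kron(Omega, np.eye(d))              # Omega ⊗ I_d, shape (md, md)
    M_inv = Omega_bar + (lam2 / lam1) * np.eye(m * d)  # M^{-1} = Omega_bar + (lam2/lam1) I
    return np.linalg.inv(M_inv)

# Toy example: 3 tasks, feature dimension d = 4.
M = build_M(np.eye(3), d=4, lam1=1.0, lam2=0.1)
print(M.shape)
```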
The gradient associated with each of the injected data $\hat{x}_{s_i}$ for the corresponding target node $t$ is expressed as:
$$\nabla_{\hat{x}_{s_i}} \mathcal{L}_t\!\left(w_t^{\top} x_{t_j}, y_{t_j}\right) = 2\left(w_t^{\top} x_{t_j} - y_{t_j}\right) x_{t_j} \cdot \Delta\hat{\alpha}_{s_i}\, \Omega(t, s).$$
For the hinge loss, the gradient is:
$$\nabla_{\hat{x}_{s_i}} \mathcal{L}_t\!\left(w_t^{\top} x_{t_j}, y_{t_j}\right) = y_{t_j}\, x_{t_j} \cdot \Delta\alpha_{s_i}\, \Omega(t, s).$$
The algorithm for the attack on federated learning is given in Algorithm 1.
Algorithm 1 Federated Adversarial Attack for Multi-Task Learning (FAAMT).
1: procedure FAAMT($\mathcal{T}$, $\mathcal{S}$, $\hat{n}_s$)
2:     Randomly initialise the current state $f(\hat{X}^0, \hat{v}^0, \hat{y}^0) \in \mathcal{S}$
3:     Initialise the set of injected data as $\hat{D} = \hat{D}^0$, and set the iteration counter $k = 0$
4:     Set the global learning rate $\alpha$ and tolerance $\epsilon$ for convergence
5:     while $k <$ maximum number of iterations and not converged do
6:         # Parallel computations for each node
7:         for all nodes $i = 1$ to $m$ in parallel do
8:             Calculate the solution $\delta v_i$ for node $i$ using Equation (15) or Equation (16)
9:             Update the local variables $v_i \leftarrow v_i + \alpha \cdot \delta v_i$
10:            if $i \in \mathcal{S}$ then
11:                Calculate the solution $\delta \hat{v}_i$ for node $i$ using Equation (15) or Equation (16)
12:                Update the adversarial variables $\hat{v}_i \leftarrow \hat{v}_i + \alpha \cdot \delta \hat{v}_i$
13:            end if
14:        end for
15:        Aggregate local updates to update the Global Model parameters $W^k$
16:        Check for convergence with a predefined criterion, e.g., $\|W^k - W^{k-1}\| < \epsilon$
17:        Increase counter $k \leftarrow k + 1$
18:        # Injected data point updates for source nodes
19:        for all source nodes $s = 1$ to $|\mathcal{S}|$ in parallel do
20:            Update the injected data point $\hat{x}_s$ using Equation (7)
21:        end for
22:        Update the set of injected data as $\hat{D} \leftarrow \hat{D} \cup \{\hat{x}_s\}_{s=1}^{|\mathcal{S}|}$
23:    end while
24:    if converged then
25:        return Successfully converged with adversarial examples $\hat{D}$
26:    else
27:        return Algorithm reached maximum iterations without convergence
28:    end if
29: end procedure
The Federated Adversarial Attack for Multi-Task Learning (FAAMT) Algorithm 1 is an essential mechanism designed to test the robustness of federated learning systems against malicious attacks. It systematically integrates adversarial samples into a network’s nodes, specifically within a multi-task learning setting. Through iterative updates governed by a central learning rate and convergence criteria, FAAMT evaluates the vulnerability of federated learning structures. This assessment is essential for strengthening the defence mechanisms of such systems against advanced adversarial strategies.
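To make the control flow of Algorithm 1 easier to follow, the following Python sketch mirrors its structure; the local solver and the gradient of $\mathcal{H}$ are stubbed out, and every function body and name here is a simplifying assumption, not the exact update of Equations (15)/(16) or (9):

```python
import numpy as np

def local_dual_step(X, y, w):
    """Placeholder for the per-node solution of Eq. (15)/(16): a least-squares-style step."""
    return X.T @ (y - X @ w) / len(y)

def grad_H_wrt_injected(x_hat, W, targets):
    """Placeholder for the Eq. (9) gradient of the target loss w.r.t. an injected point
    (x_hat is unused in this stub)."""
    return np.mean(W[:, list(targets)], axis=1)

def faamt(nodes, S, T, max_iters=50, lr=0.1, eps=1e-4):
    """nodes: list of dicts {"X": (m_i, d) array, "y": (m_i,) array}; S, T: sets of node indices."""
    rng = np.random.default_rng(0)
    d = nodes[0]["X"].shape[1]
    W = rng.normal(size=(d, len(nodes)))               # one weight column per node
    injected = {s: rng.normal(size=d) for s in S}      # current injected point per source node

    for _ in range(max_iters):
        W_prev = W.copy()
        for i, node in enumerate(nodes):               # done in parallel in Algorithm 1
            W[:, i] += lr * local_dual_step(node["X"], node["y"], W[:, i])
            if i in S:                                 # extra adversarial local update
                x_hat = injected[i].reshape(1, -1)
                W[:, i] += lr * local_dual_step(x_hat, node["y"][:1], W[:, i])
        if np.linalg.norm(W - W_prev) < eps:           # convergence check on the global model
            return injected, True
        for s in S:                                    # ascend the upper-level loss H
            injected[s] += lr * grad_H_wrt_injected(injected[s], W, T)
    return injected, False

# Toy run: 4 nodes, source set {0}, target set {2, 3}.
rng = np.random.default_rng(1)
nodes = [{"X": rng.normal(size=(15, 6)), "y": rng.normal(size=15)} for _ in range(4)]
print(faamt(nodes, S={0}, T={2, 3})[1])
```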

Further Analysis

To attack the federated learning model, we focused on the bilevel optimisation problem. The upper-level problem aims to find the optimal attack strategy, while the lower-level problem is to update the model weights. The bilevel problem can be expressed as:
$$\max_{\tilde{D}_{s,i},\, \alpha_{s,i}} \ \mathcal{H}\!\left(D, W; \tilde{D}_{s,i}, \alpha_{s,i}\right) \quad \text{subject to} \quad \|\tilde{D}_{s,i}\|_2 \le r, \qquad W = \arg\min_{W}\ \mathcal{L}\!\left(D \cup \tilde{D}_{s,i}, W; \alpha_{s,i}\right).$$
Given the non-convex nature of the lower-level problem, we used the alternating minimisation approach to optimise the bilevel problem. We initiate by fixing the labels of the injected data and optimising the features of the injected data, denoted by $(\tilde{D}_{s,i}, \alpha_{s,i})$, as follows:
$$\tilde{D}_{s,i}^{k} \leftarrow \operatorname{Proj}_{\mathcal{X}}\!\left( \tilde{D}_{s,i}^{k-1} + \eta_1 \nabla_{\tilde{D}_{s,i}^{k-1}} \mathcal{H} \right).$$
To compute the gradient, we first derive the gradient of the target node loss function $\mathcal{L}_t(D_t, W)$ with respect to the model weights $W$:
$$\nabla_{W} \mathcal{L}_t(D_t, W) = \sum_{j=1}^{n_t} \nabla_{w}\, \ell\!\left(w_t^{\top} x_{t,j},\, y_{t,j}\right),$$
where $\ell(\cdot)$ is the loss function, $x_{t,j}$ and $y_{t,j}$ are the $j$-th data point and its label in the target node dataset, and $n_t$ is the number of data points in the target node dataset. Next, we need to compute the gradient of the upper-level objective with respect to the injected data $\tilde{D}_{s,i}$:
$$\nabla_{\tilde{D}_{s,i}} \mathcal{H} = \sum_{t \in \mathcal{T}} \nabla_{\tilde{D}_{s,i}} \mathcal{L}_t\!\left(D_t \cup \tilde{D}_{s,i}, W\right).$$
To calculate this gradient, we utilise the chain rule, obtaining:
$$\nabla_{\tilde{D}_{s,i}} \mathcal{H} = \sum_{t \in \mathcal{T}} \nabla_{W} \mathcal{L}_t(D_t, W) \cdot \frac{\partial W}{\partial \tilde{D}_{s,i}}.$$
For the lower-level problem, we update the model weights $W$ using gradient descent with learning rate $\eta_2$:
$$W^{k} \leftarrow W^{k-1} - \eta_2 \nabla_{W} \mathcal{L}\!\left(D \cup \tilde{D}_{s,i}, W^{k-1}; \alpha_{s,i}\right).$$
After updating the features of the injected data, we fix the features and optimise the labels of the injected data:
$$\alpha_{s,i}^{k} \leftarrow \operatorname{Proj}_{\mathcal{Y}}\!\left( \alpha_{s,i}^{k-1} + \eta_3 \nabla_{\alpha_{s,i}^{k-1}} \mathcal{H} \right),$$
where $\operatorname{Proj}_{\mathcal{Y}}$ projects the labels into the feasible set $\mathcal{Y}$.
To compute the gradient of the upper-level objective with respect to the labels of the injected data, we have:
$$\nabla_{\alpha_{s,i}} \mathcal{H} = \sum_{t \in \mathcal{T}} \nabla_{\alpha_{s,i}} \mathcal{L}_t\!\left(D_t \cup \tilde{D}_{s,i}, W\right).$$
We apply the chain rule to obtain the following expression:
$$\nabla_{\alpha_{s,i}} \mathcal{H} = \sum_{t \in \mathcal{T}} \nabla_{W} \mathcal{L}_t(D_t, W) \cdot \frac{\partial W}{\partial \alpha_{s,i}}.$$
Finally, the alternating minimisation algorithm consists of iterating through these updates for the features of the injected data, the labels of the injected data, and the model weights until convergence. By focusing on the gradients of the loss function, the attacker can effectively manipulate the model’s behaviour, leading to a successful attack.
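Putting the three updates together, the following compact sketch illustrates the alternating loop; the gradients of $\mathcal{H}$ are passed in as placeholder callables so that the loop runs end-to-end without claiming to reproduce the exact derivatives above:

```python
import numpy as np

def project_rows_l2(X, r):
    """Project each row of X onto the l2-ball of radius r."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X * np.minimum(1.0, r / np.maximum(norms, 1e-12))

def grad_train(w, X, y):
    """Least-squares training gradient used for the lower-level weight update."""
    return X.T @ (X @ w - y) / len(y)

def alternating_minimisation(X, y, X_til, y_til, w, grad_H_feat, grad_H_lab,
                             iters=100, eta1=0.05, eta2=0.05, eta3=0.05, r=2.0):
    for _ in range(iters):
        # (1) features of injected data: projected gradient ascent on H
        X_til = project_rows_l2(X_til + eta1 * grad_H_feat(X_til, w), r)
        # (2) model weights: gradient descent on the poisoned training loss
        X_all, y_all = np.vstack([X, X_til]), np.concatenate([y, y_til])
        w = w - eta2 * grad_train(w, X_all, y_all)
        # (3) labels of injected data: projected ascent on H, clipped to a feasible range
        y_til = np.clip(y_til + eta3 * grad_H_lab(y_til, w), -1.0, 1.0)
    return X_til, y_til, w

# Toy usage with heuristic placeholder gradients for H.
rng = np.random.default_rng(2)
X, y = rng.normal(size=(30, 5)), rng.normal(size=30)
out = alternating_minimisation(X, y, rng.normal(size=(3, 5)), np.zeros(3), np.zeros(5),
                               grad_H_feat=lambda Xt, w: np.tile(w, (len(Xt), 1)),
                               grad_H_lab=lambda yt, w: np.sign(yt + 1e-9))
```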

5. Experimental Results

5.1. System Specifications of Experiments

The ensuing experimental design, crucial for substantiating the efficacy of FLIBD, was scrupulously devised to accurately emulate a pragmatic IoT Big Data management context. This, in turn, allows for a discerning evaluation of the approach's potency in safeguarding privacy within expansive systems. In particular, the experiments were enacted within a high-performance computing environment, marked by its robust computational and storage capabilities, and thereby well aligned with resource-demanding federated learning tasks while handling copious IoT Big Data. A comprehensive exposition of the essential system specifications leveraged for the experiments is proffered in Table 3.
Table 3. System specifications for experimental evaluation.
The above system was orchestrated to emulate large-scale system scenarios for federated learning with a lens toward IoT Big Data management. The CPU, equipped with an extensive number of cores, alongside the robust RAM, facilitates efficient parallel processing capabilities, ensuring expedited computational procedures. The GPU, boasting a substantive video memory, becomes quintessential in efficiently managing and processing massive datasets, especially under operations requiring enhanced parallelisation, such as model training in federated learning frameworks. The ultra-fast NVMe storage plays a critical role in minimising data retrieval and storage times, thus mitigating potential bottlenecks arising from disk I/O operations. Furthermore, the experiment leveraged Apache Spark with Federated AI Technology Enabler (FATE) to enable secure computations and model training, allowing data privacy preservation through federated learning mechanisms.
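As a rough illustration of the Spark side of this setup, a session of the following kind could be used to prepare IoT data before federated training; the resource settings, toy records, and column names are assumptions for demonstration, not the exact experimental configuration, and FATE-specific initialisation is omitted because it depends on the deployed FATE cluster:

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("flibd-iot-preprocessing")
         .config("spark.executor.memory", "32g")          # assumed resource settings
         .config("spark.executor.cores", "8")
         .config("spark.sql.shuffle.partitions", "256")
         .getOrCreate())

# Toy in-memory stand-in for IoT telemetry; real data would be read from distributed storage.
iot_df = spark.createDataFrame(
    [("dev-001", 1699875600, 21.4, 0), ("dev-002", 1699875601, 19.8, 1)],
    ["device_id", "timestamp", "sensor_value", "label"],
)
local_features = iot_df.select("device_id", "sensor_value", "label")
local_features.show()
```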

5.2. Experimental Evaluation of Attacking Mechanism

In the ever-expanding domain of Federated Learning (FL), the imperative to guard against adversarial attacks whilst ensuring accurate model predictions has become paramount. The ensuing analysis offers a rigorous examination of the Federated Adversarial Attack for Multi-Task Learning (FAAMT) algorithm (Algorithm 1). This adversarial approach, developed with meticulous attention to the intricacies of multi-task learning, seeks to diminish the predictive accuracy of various federated learning methods, even those that are conventionally recognised for their robustness against adversarial machinations.
To unravel the efficacy of FAAMT, it was assessed against several widely acknowledged federated learning methods and defences, such as the Global Model, Median Model, Trimmed Mean Model, Krum, Bulyan, RFA, and Romoa. Each method typically exhibits substantial resistance to conventional adversarial tactics, establishing them as formidable entities in federated learning defences. However, through the lens of the FAAMT, we sought to decipher whether their robustness holds steadfast or gradually attenuates under the sophisticated attack mechanisms unleashed by the algorithm.
The outcomes derived from applying the FAAMT method (Figure 2) show the performance variations among the federated learning mechanisms previously discussed, unmasking their potential susceptibilities. This analytical exploration, thus, enriches our comprehension of the resilience landscape among adversarial challenges within federated learning frameworks, underscoring the specific vulnerabilities and strengths inherent in each approach.
Figure 2. Efficacy assessment of FAAMT against various federated learning defence mechanisms.
Intriguingly, the derived results emanating from the visual representation expose a compelling narrative pertaining to the potency of the FAAMT mechanism. A discerning glance at the plot elucidates a conspicuous decrement in the predictive accuracies of the defended federated learning methods under attack, reinforcing the assertion that FAAMT has managed to permeate their defensive veils to a certain degree.
Certain defence mechanisms, which previously heralded unwavering resilience in the face of adversarial onslaughts, now showcase perceptible vulnerabilities when entwined with the FAAMT. This not only lays bare latent susceptibilities within these methods, but also accentuates the detail and sophistication embedded within the FAAMT algorithm. Notably, certain defences exhibited a more-pronounced susceptibility, suggesting a hierarchical differentiation in robustness among the examined methods and emphasising the necessity for an incessant evolution in defence strategies.
As the realm of federated learning continues to evolve, so too will the complexity and cunning of adversarial attacks. Thus, the illumination of FAAMT’s ability to infiltrate these robust defence mechanisms propels the discourse on the perennial battle between adversarial attacks and defence methods forward. It beckons the scientific community to explore further, excavate deeper, and innovate with a renewed zeal to safeguard federated learning models against progressively shrewd adversarial endeavours. It is within this iterative process of attack and defence that the future robustness and reliability of federated learning models will be forged, ensuring the integrity and durability of decentralised learning mechanisms in real-world applications.

5.3. Experimental Evaluation of Defences

This section analyses the plots produced during the experimental evaluation of the implemented defences against the proposed attack. In each case, the maximum number of clients was 10 and the number of attackers in the system varied from none to 5 clients, i.e., 50% of the system. A higher percentage than this was not considered, as there is no point in defending a system in which malicious users hold the majority. The section starts with the results of the attack on a network using the standard FedAvg aggregation method and then splits the analysis by the percentage of exposure. For the experiments, we utilised the widely used CIFAR-10 dataset, which contains 32 × 32 images classified into various object categories such as birds, cats, and aeroplanes. The training dataset contains 5000 images for each of the 10 classes. The purpose of this analysis was to determine the effectiveness of the implemented defences against the proposed attack and to identify any vulnerabilities or limitations. The experimental evaluation aimed to provide insights into the defences' strengths and weaknesses under different conditions, and its results may guide the development of future defences and enhance the security of federated learning systems. We evaluated the Median, Trimmed Mean, Krum, and Bulyan methods before and after the attack. Before the label-flipping attack, we observed that a very simple logistic regression model achieved high accuracy on this task, approaching 98%; in other words, the task is relatively tractable. This is illustrated in Figure 3a.
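For concreteness, the following sketch shows how a label-flipping poisoning of this kind can be applied by the malicious clients before local training; the dataset handling, client counts, and the flip of class 5 to class 2 (echoing the case discussed in the results below) are illustrative assumptions:

```python
import numpy as np

def flip_labels(y, src=5, dst=2):
    """Relabel every sample of class `src` as class `dst` (label-flipping poisoning)."""
    y = y.copy()
    y[y == src] = dst
    return y

def build_clients(X, y, n_clients=10, n_malicious=2, seed=0):
    """Shard a dataset across clients; the first n_malicious clients flip their labels."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    shards = np.array_split(idx, n_clients)
    clients = []
    for c, shard in enumerate(shards):
        y_c = y[shard]
        if c < n_malicious:
            y_c = flip_labels(y_c)
        clients.append((X[shard], y_c))
    return clients

# Toy usage with random data standing in for the image features and labels.
rng = np.random.default_rng(1)
clients = build_clients(rng.normal(size=(500, 32)), rng.integers(0, 10, size=500))
print(len(clients), clients[0][1][:10])
```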
Figure 3. Performance of the Global Model after each round of FL.
Here, we observe the same results as previously, with the difference that 20% of the users in our system were now malicious. The F1-score shows a significant decrease in accuracy, which can be catastrophic. We also provide the F1-score for the case where 50% of the clients were malicious. In Figure 3b, we can see that the Digit 5 class suffered a significant decrease even with 20% of the clients being malicious. An important by-product of this attack is that the class to which we converted Digit 5 (i.e., Class 2) also had a significant reduction in its F1-score. In Figure 4a, we can observe that the convergence in the F1-score was not as good as in the simple FedAvg method, which converged from the third round, whereas the Median converged from the fourth or even fifth round. In Figure 4b, we can see that, even with a method as simple as the Median, the increase in the F1-score was immediate. Also, the convergence was similar.
Figure 4. Performance of the Median Model after each round of FL.
For the Trimmed Mean method, the results are shown in Figure 5a. Similar to the previous method, convergence occurred later. The Trimmed Mean method (Figure 5b) was observed to not perform as well, but this may be attributed to the fact that the client sample remaining after trimming the edges was small. For the Krum method, the results are shown in Figure 6a. Compared to the previous method, it appeared to converge faster. However, after the attack (Figure 6b), its performance dropped, while still remaining higher than that of the Trimmed Mean method. For the Bulyan method, the results are shown in Figure 7a. Here, the convergence was similar to the Trimmed Mean method, but lower than the Krum model. However, under attack (Figure 7b), the model appeared to have a more-robust performance.
Figure 5. Performance of the Trimmed Mean Model after each round of FL.
Figure 6. Performance of the Krum model after each round of FL.
Figure 7. Performance of the Bulyan model after each round of FL.

Evaluation Metrics:

To rigorously assess the effectiveness of the Federated Adversarial Attack for Multi-Task Learning (FAAMT) and the robustness of various federated learning defensive mechanisms, we employed two pivotal metrics: accuracy and the F1-score. The accuracy metric gauges the overall performance of the model by evaluating the ratio of correctly predicted instances to the total instances and is defined as
$$\text{Accuracy} = \frac{\text{TP} + \text{TN}}{P + N},$$
where TPs and TNs denote the number of true positive and true negative predictions, respectively, and P and N represent the total actual positive and negative instances, respectively. The F1-score, the harmonic mean of precision and recall, offers a balanced perspective, especially pivotal when dealing with imbalanced datasets, and is expressed as:
$$F_1\text{-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}},$$
where
$$\text{Precision} = \frac{\text{TP}}{\text{TP} + \text{FP}}$$
and
$$\text{Recall} = \frac{\text{TP}}{\text{TP} + \text{FN}}.$$
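The following sketch (illustrative variable names) computes these metrics directly from predictions, including a per-class F1-score of the kind reported in the figures by treating each class as the positive label in turn:

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of correctly predicted instances."""
    return np.mean(y_true == y_pred)

def f1_for_class(y_true, y_pred, cls):
    """F1-score for one class treated as the positive label."""
    tp = np.sum((y_pred == cls) & (y_true == cls))
    fp = np.sum((y_pred == cls) & (y_true != cls))
    fn = np.sum((y_pred != cls) & (y_true == cls))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Toy usage on a handful of predictions over 10 classes.
y_true = np.array([5, 5, 2, 1, 0, 5, 2])
y_pred = np.array([2, 5, 2, 1, 0, 2, 2])
print(accuracy(y_true, y_pred), f1_for_class(y_true, y_pred, cls=5))
```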
Utilising these metrics, the ensuing sections deliberate the comprehensive evaluation of the FAAMT and the subjected defence mechanisms, shedding light on their respective strengths and vulnerabilities amidst adversarial incursions.
The results for the proposed attack and defence mechanisms are shown in Table 4, where we assess the accuracy and F1-score for the attack (FAAMT method), as well as the defence mechanism behaviour. Lastly, we evaluated the performance of both the attack and defence on 2 clients infected out of 10.
Table 4. Metrics assessing the performance of attack and defence mechanisms.

6. Discussion

The experimental analysis presented delved into the complex interactions between various federated learning defence mechanisms and a sophisticated adversarial attack, the FAAMT. The results from the high-performance computational environment underscored the imperative of robust defences in the realm of IoT Big Data management. The experiments revealed that even well-established defences, such as Krum and Bulyan, exhibit vulnerabilities when confronted with the FAAMT, hinting at the subtleties in the defence mechanisms that must be addressed. This observation is crucial, for it demonstrates that the current landscape of federated learning is not impervious to novel attack strategies and must continuously evolve.
Notably, the FAAMT elucidated a significant reduction in the F1-scores of the federated learning models, which was particularly pronounced for certain classes within the CIFAR-10 dataset. This degradation in performance suggests that, while current defences offer a degree of resilience, they are not entirely foolproof against such calculated adversarial intrusions. This is a stark reminder that the security of federated learning models remains a moving target, requiring ongoing research and refinement to mitigate the risk of adversarial exploitation.
The experimental evaluation also offered insights into the performance of various defence strategies post-attack. Methods like the Median and Trimmed Mean, traditionally considered less sophisticated than Krum or Bulyan, displayed a delayed convergence, indicating a slower recovery from adversarial disturbances. This behaviour highlights the need for a balance between the complexity of defence algorithms and their adaptability in the face of adversarial manoeuvres. The nuanced decrease in the performance of the Krum and Trimmed Mean Models post-attack further accentuates the need for dynamic defence strategies capable of coping with the evolving nature of adversarial threats.
In contrast, the Bulyan method showcased a relatively more-robust performance under adversarial conditions, suggesting that certain combinations of defence methodologies may offer enhanced resilience. The experimental results suggested a promising direction for future investigation: examining combined or multi-level defence strategies that could offer stronger protection against adversarial attacks. The field of federated learning, especially within the context of the IoT and Big Data, is, thus, poised at a crucial juncture where the iterative process of developing and testing new defence strategies is not just beneficial, but necessary to safeguard the integrity of decentralised learning models against increasingly sophisticated adversarial tactics.

7. Conclusions

In our investigation, we presented the Federated Adversarial Attack for Multi-Task Learning (FAAMT) algorithm, which significantly exposed the susceptibilities of federated multi-task learning frameworks to data poisoning. Our results provided clear evidence that FAAMT can critically undermine the training process, leading to a noticeable deterioration in model accuracy and performance. This highlights an urgent necessity for the establishment of robust defence strategies within such systems.
As we look to the horizon of federated multi-task learning, it is imperative to address the challenges presented by the FAAMT algorithm. The development and in-depth examination of defence strategies such as secure aggregation, robust outlier detection, and the integration of differential privacy mechanisms are pivotal areas requiring immediate attention. The future landscape should include extensive evaluations of these defences’ effectiveness against sophisticated data poisoning tactics with respect to multi-task learning environments.
The fine-tuning of FAAMT can lead to many potential enhancements. Its adaptability could be tested against various data poisoning strategies to assess impacts on user data confidentiality. Furthermore, its extension to other federated learning paradigms, such as federated transfer learning, could provide significant insights. Investigating the resilience of optimisation techniques like gradient clipping against adversarial interventions in multi-task scenarios could yield beneficial strategies for maintaining system integrity. Additionally, adversarial tactics that specifically target individual tasks or users, while minimising disruption to other system components, could reveal specific system vulnerabilities and open avenues for enhancing privacy-preserving measures.
In summary, this study highlighted a critical and immediate need to reinforce the security and privacy protocols of federated multi-task learning systems. The safe and effective application of these systems in practical settings demands the creation and implementation of robust defence mechanisms. Addressing these open challenges is essential to advancing the adoption of federated multi-task learning, ensuring that it can be confidently deployed to facilitate collaborative learning processes across heterogeneous devices, all while maintaining a steady defence of user privacy in the interconnected ecosystems of the IoT and Big Data.

7.1. Future Work

The prospect of combining Federated Learning (FL) and Automated Machine Learning (AutoML) opens new opportunities for enhancing data privacy and optimising Big Data management. The integration of FL with AutoML, as envisioned in the studies [88,89], provides a robust foundation for the development of the Federated Adversarial Attack for Multi-Task Learning (FAAMT) algorithm. This algorithm aims to address the complexities of multi-task learning within a federated framework, where the goal is to enable the collaborative training of models across multiple tasks while ensuring data privacy and robustness against adversarial attacks. The FAAMT approach could leverage the privacy-preserving nature of FL to protect data across distributed networks and the efficiency of AutoML to optimise learning tasks, thereby ensuring that the integrity of multi-task learning is maintained even in the presence of potential adversarial threats.
The integration of edge intelligence and edge caching mechanisms outlined in the first study with the advanced hyperparameter optimisation techniques from the second study provides the foundational elements necessary for the further development of FAAMT. The algorithm could harness the computational power available at the network edge, alongside sophisticated AutoML methods, to effectively manage the multi-faceted challenges of Big Data in a federated context. This synergy promises to deliver not only more-personalised and immediate data processing at the edge, but also a robust framework for multi-task models that are inherently more resistant to adversarial attacks. In such a way, the FAAMT algorithm has the potential to become a quintessential part of smart, decentralised networks, enabling them to remain resilient in the face of evolving cyber threats while making the most of the vast datasets generated across IoT environments.
The development of the FAAMT algorithm points to a new research direction that combines the protective features of federated learning with the dynamic fine-tuning abilities of AutoML. By utilising both, future work can focus on improving this algorithm to better manage multi-task learning in environments where adversaries may attempt to disrupt learning. This includes creating better methods to detect and stop such attacks in distributed networks, designing algorithms that learn multiple tasks more effectively and efficiently, and ensuring that the system can handle the unevenly distributed data often found in everyday situations. Further studies can also investigate how best to distribute the computational workload across the network edge, making the most of local computing for AutoML tasks and further protecting data privacy within the federated learning setup. As the FAAMT algorithm matures, it could be applied in many areas, turning multi-task learning over distributed data into not just a possibility, but a strong method for keeping data private and safe in an increasingly connected world.

Author Contributions

A.K., A.G., L.T., G.A.K., G.K., C.K. and S.S. conceived of the idea, designed and performed the experiments, analysed the results, drafted the initial manuscript and revised the final manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available upon request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Kairouz, P.; McMahan, H.B.; Avent, B.; Bellet, A.; Bennis, M.; Bhagoji, A.N.; Bonawitz, K.; Charles, Z.; Cormode, G.; Cummings, R.; et al. Advances and open problems in federated learning. Found. Trends Mach. Learn. 2021, 14, 1–210. [Google Scholar] [CrossRef]
  2. Jebreel, N.M.; Domingo-Ferrer, J.; Sánchez, D.; Blanco-Justicia, A. Defending against the label-flipping attack in federated learning. arXiv 2022, arXiv:2207.01982. [Google Scholar] [CrossRef]
  3. Jiang, Y.; Zhang, W.; Chen, Y. Data Quality Detection Mechanism Against Label Flipping Attacks in Federated Learning. IEEE Trans. Inf. Forensics Secur. 2023, 18, 1625–1637. [Google Scholar] [CrossRef]
  4. Li, D.; Wong, W.E.; Wang, W.; Yao, Y.; Chau, M. Detection and mitigation of label-flipping attacks in federated learning systems with KPCA and K-means. In Proceedings of the 2021 8th International Conference on Dependable Systems and Their Applications (DSA), Yinchuan, China, 5–6 August 2021; pp. 551–559. [Google Scholar]
  5. Cheng, Y.; Liu, Y.; Chen, T.; Yang, Q. Federated learning for privacy-preserving AI. Commun. ACM 2020, 63, 33–36. [Google Scholar] [CrossRef]
  6. Truex, S.; Baracaldo, N.; Anwar, A.; Steinke, T.; Ludwig, H.; Zhang, R.; Zhou, Y. A hybrid approach to privacy-preserving federated learning. In Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security; 2019; pp. 1–11. Available online: https://arxiv.org/abs/1812.03224 (accessed on 18 October 2023).
  7. Wei, K.; Li, J.; Ding, M.; Ma, C.; Yang, H.H.; Farokhi, F.; Jin, S.; Quek, T.Q.; Poor, H.V. Federated learning with differential privacy: Algorithms and performance analysis. IEEE Trans. Inf. Forensics Secur. 2020, 15, 3454–3469. [Google Scholar] [CrossRef]
  8. Pan, K.; Feng, K. Differential Privacy-Enabled Multi-Party Learning with Dynamic Privacy Budget Allocating Strategy. Electronics 2023, 12, 658. [Google Scholar] [CrossRef]
  9. Karras, A.; Karras, C.; Giotopoulos, K.C.; Tsolis, D.; Oikonomou, K.; Sioutas, S. Peer to Peer Federated Learning: Towards Decentralized Machine Learning on Edge Devices. In Proceedings of the 2022 7th South-East Europe Design Automation, Computer Engineering, Computer Networks and Social Media Conference (SEEDA-CECNSM), Ioannina, Greece, 23–25 September 2022; pp. 1–9. [Google Scholar] [CrossRef]
  10. Abreha, H.G.; Hayajneh, M.; Serhani, M.A. Federated Learning in Edge Computing: A Systematic Survey. Sensors 2022, 22, 450. [Google Scholar] [CrossRef]
  11. Kaleem, S.; Sohail, A.; Tariq, M.U.; Asim, M. An Improved Big Data Analytics Architecture Using Federated Learning for IoT-Enabled Urban Intelligent Transportation Systems. Sustainability 2023, 15, 15333. [Google Scholar] [CrossRef]
  12. Javed, A.R.; Hassan, M.A.; Shahzad, F.; Ahmed, W.; Singh, S.; Baker, T.; Gadekallu, T.R. Integration of blockchain technology and federated learning in vehicular (iot) networks: A comprehensive survey. Sensors 2022, 22, 4394. [Google Scholar] [CrossRef]
  13. Kong, Q.; Yin, F.; Xiao, Y.; Li, B.; Yang, X.; Cui, S. Achieving Blockchain-based Privacy-Preserving Location Proofs under Federated Learning. In Proceedings of the ICC 2021—IEEE International Conference on Communications, Montreal, QC, Canada, 14–23 June 2021; pp. 1–6. [Google Scholar] [CrossRef]
  14. Math, S.; Tam, P.; Kim, S. Reliable Federated Learning Systems Based on Intelligent Resource Sharing Scheme for Big Data Internet of Things. IEEE Access 2021, 9, 108091–108100. [Google Scholar] [CrossRef]
  15. Nuding, F.; Mayer, R. Data poisoning in sequential and parallel federated learning. In Proceedings of the 2022 ACM on International Workshop on Security and Privacy Analytics; 2022; pp. 24–34. Available online: https://dl.acm.org/doi/abs/10.1145/3510548.3519372 (accessed on 18 October 2023).
  16. Tolpegin, V.; Truex, S.; Gursoy, M.E.; Liu, L. Data poisoning attacks against federated learning systems. In Proceedings of the Computer Security—ESORICS 2020: 25th European Symposium on Research in Computer Security, ESORICS 2020, Guildford, UK, 14–18 September 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 480–501. [Google Scholar]
  17. Sun, G.; Cong, Y.; Dong, J.; Wang, Q.; Lyu, L.; Liu, J. Data poisoning attacks on federated machine learning. IEEE Internet Things J. 2021, 9, 11365–11375. [Google Scholar] [CrossRef]
  18. Li, J.; Guo, W.; Han, X.; Cai, J.; Liu, X. Federated Learning based on Defending Against Data Poisoning Attacks in IoT. arXiv 2022, arXiv:2209.06397. [Google Scholar]
  19. Xu, W.; Fang, W.; Ding, Y.; Zou, M.; Xiong, N. Accelerating Federated Learning for IoT in Big Data Analytics with Pruning, Quantization and Selective Updating. IEEE Access 2021, 9, 38457–38466. [Google Scholar] [CrossRef]
  20. Fu, A.; Zhang, X.; Xiong, N.; Gao, Y.; Wang, H.; Zhang, J. VFL: A verifiable federated learning with privacy-preserving for big data in industrial IoT. IEEE Trans. Ind. Inform. 2020, 18, 3316–3326. [Google Scholar] [CrossRef]
  21. Can, Y.S.; Ersoy, C. Privacy-preserving federated deep learning for wearable IoT-based biomedical monitoring. ACM Trans. Internet Technol. (TOIT) 2021, 21, 1–17. [Google Scholar] [CrossRef]
  22. Zhang, Y.; Zhang, Y.; Zhang, Z.; Bai, H.; Zhong, T.; Song, M. Evaluation of data poisoning attacks on federated learning-based network intrusion detection system. In Proceedings of the 2022 IEEE 24th Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys), Hainan, China, 18–20 December 2022; pp. 2235–2242. [Google Scholar] [CrossRef]
  23. Singh, S.; Rathore, S.; Alfarraj, O.; Tolba, A.; Yoon, B. A framework for privacy preservation of IoT healthcare data using Federated Learning and blockchain technology. Future Gener. Comput. Syst. 2022, 129, 380–388. [Google Scholar] [CrossRef]
  24. Albelaihi, R.; Alasandagutti, A.; Yu, L.; Yao, J.; Sun, X. Deep-Reinforcement-Learning-Assisted Client Selection in Nonorthogonal-Multiple-Access-Based Federated Learning. IEEE Internet Things J. 2023, 10, 15515–15525. [Google Scholar] [CrossRef]
  25. Yang, Q.; Liu, Y.; Chen, T.; Tong, Y. Federated machine learning: Concept and applications. ACM Trans. Intell. Syst. Technol. (TIST) 2019, 10, 1–19. [Google Scholar] [CrossRef]
  26. Alam, T.; Gupta, R. Federated Learning and Its Role in the Privacy Preservation of IoT Devices. Future Internet 2022, 14, 246. [Google Scholar] [CrossRef]
  27. Dhiman, G.; Juneja, S.; Mohafez, H.; El-Bayoumy, I.; Sharma, L.K.; Hadizadeh, M.; Islam, M.A.; Viriyasitavat, W.; Khandaker, M.U. Federated Learning Approach to Protect Healthcare Data over Big Data Scenario. Sustainability 2022, 14, 2500. [Google Scholar] [CrossRef]
  28. Aono, Y.; Hayashi, T.; Wang, L.; Moriai, S. Privacy-preserving deep learning via additively homomorphic encryption. IEEE Trans. Inf. Forensics Secur. 2017, 13, 1333–1345. [Google Scholar]
  29. Chen, Y.R.; Rezapour, A.; Tzeng, W.G. Privacy-preserving ridge regression on distributed data. Inf. Sci. 2018, 451, 34–49. [Google Scholar] [CrossRef]
  30. Fang, H.; Qian, Q. Privacy Preserving Machine Learning with Homomorphic Encryption and Federated Learning. Future Internet 2021, 13, 94. [Google Scholar] [CrossRef]
  31. Park, J.; Lim, H. Privacy-Preserving Federated Learning Using Homomorphic Encryption. Appl. Sci. 2022, 12, 734. [Google Scholar] [CrossRef]
  32. Angulo, E.; Márquez, J.; Villanueva-Polanco, R. Training of Classification Models via Federated Learning and Homomorphic Encryption. Sensors 2023, 23, 1966. [Google Scholar] [CrossRef]
  33. Shen, X.; Jiang, H.; Chen, Y.; Wang, B.; Gao, L. PLDP-FL: Federated Learning with Personalized Local Differential Privacy. Entropy 2023, 25, 485. [Google Scholar] [CrossRef]
  34. Wang, X.; Wang, J.; Ma, X.; Wen, C. A Differential Privacy Strategy Based on Local Features of Non-Gaussian Noise in Federated Learning. Sensors 2022, 22, 2424. [Google Scholar] [CrossRef]
  35. Zhao, J.; Yang, M.; Zhang, R.; Song, W.; Zheng, J.; Feng, J.; Matwin, S. Privacy-Enhanced Federated Learning: A Restrictively Self-Sampled and Data-Perturbed Local Differential Privacy Method. Electronics 2022, 11, 4007. [Google Scholar] [CrossRef]
  36. McMahan, H.B.; Ramage, D.; Talwar, K.; Zhang, L. Learning differentially private recurrent language models. arXiv 2017, arXiv:1710.06963. [Google Scholar]
  37. So, J.; Güler, B.; Avestimehr, A.S. Turbo-Aggregate: Breaking the Quadratic Aggregation Barrier in Secure Federated Learning. IEEE J. Sel. Areas Inf. Theory 2021, 2, 479–489. [Google Scholar] [CrossRef]
  38. Xu, G.; Li, H.; Liu, S.; Yang, K.; Lin, X. VerifyNet: Secure and Verifiable Federated Learning. IEEE Trans. Inf. Forensics Secur. 2020, 15, 911–926. [Google Scholar] [CrossRef]
  39. Kim, H.; Park, J.; Bennis, M.; Kim, S.L. Blockchained On-Device Federated Learning. IEEE Commun. Lett. 2020, 24, 1279–1283. [Google Scholar] [CrossRef]
  40. Mahmood, Z.; Jusas, V. Blockchain-Enabled: Multi-Layered Security Federated Learning Platform for Preserving Data Privacy. Electronics 2022, 11, 1624. [Google Scholar] [CrossRef]
  41. Liu, H.; Zhou, H.; Chen, H.; Yan, Y.; Huang, J.; Xiong, A.; Yang, S.; Chen, J.; Guo, S. A Federated Learning Multi-Task Scheduling Mechanism Based on Trusted Computing Sandbox. Sensors 2023, 23, 2093. [Google Scholar] [CrossRef] [PubMed]
  42. Mortaheb, M.; Vahapoglu, C.; Ulukus, S. Personalized Federated Multi-Task Learning over Wireless Fading Channels. Algorithms 2022, 15, 421. [Google Scholar] [CrossRef]
  43. Du, W.; Atallah, M.J. Privacy-preserving cooperative statistical analysis. In Proceedings of the Seventeenth Annual Computer Security Applications Conference, New Orleans, LA, USA, 10–14 December 2001; pp. 102–110. [Google Scholar]
  44. Du, W.; Han, Y.S.; Chen, S. Privacy-preserving multivariate statistical analysis: Linear regression and classification. In Proceedings of the 2004 SIAM International Conference on Data Mining, Lake Buena Vista, FL, USA, 22–24 April 2004; pp. 222–233. [Google Scholar]
  45. Khan, A.; ten Thij, M.; Wilbik, A. Communication-Efficient Vertical Federated Learning. Algorithms 2022, 15, 273. [Google Scholar] [CrossRef]
  46. Hardy, S.; Henecka, W.; Ivey-Law, H.; Nock, R.; Patrini, G.; Smith, G.; Thorne, B. Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption. arXiv 2017, arXiv:1711.10677. [Google Scholar]
  47. Schoenmakers, B.; Tuyls, P. Efficient binary conversion for Paillier encrypted values. In Proceedings of the Advances in Cryptology—EUROCRYPT 2006: 25th Annual International Conference on the Theory and Applications of Cryptographic Techniques, St. Petersburg, Russia, 28 May–1 June 2006; Springer: Berlin/Heidelberg, Germany, 2006; pp. 522–537. [Google Scholar]
  48. Zhong, Z.; Zhou, Y.; Wu, D.; Chen, X.; Chen, M.; Li, C.; Sheng, Q.Z. P-FedAvg: Parallelizing Federated Learning with Theoretical Guarantees. In Proceedings of the IEEE INFOCOM 2021—IEEE Conference on Computer Communications, Vancouver, BC, Canada, 10–13 May 2021; pp. 1–10. [Google Scholar] [CrossRef]
  49. Barreno, M.; Nelson, B.; Sears, R.; Joseph, A.D.; Tygar, J.D. Can machine learning be secure? In Proceedings of the 2006 ACM Symposium on Information, Computer and Communications Security, Taipei, Taiwan, 21–24 March 2006; pp. 16–25. [Google Scholar]
  50. Huang, L.; Joseph, A.D.; Nelson, B.; Rubinstein, B.I.; Tygar, J.D. Adversarial machine learning. In Proceedings of the 4th ACM Workshop on Security and Artificial Intelligence, Chicago, IL, USA, 21 October 2011; pp. 43–58. [Google Scholar]
  51. Chu, W.L.; Lin, C.J.; Chang, K.N. Detection and Classification of Advanced Persistent Threats and Attacks Using the Support Vector Machine. Appl. Sci. 2019, 9, 4579. [Google Scholar] [CrossRef]
  52. Chen, Y.; Hayawi, K.; Zhao, Q.; Mou, J.; Yang, L.; Tang, J.; Li, Q.; Wen, H. Vector Auto-Regression-Based False Data Injection Attack Detection Method in Edge Computing Environment. Sensors 2022, 22, 6789. [Google Scholar] [CrossRef]
  53. Jiang, Y.; Zhou, Y.; Wu, D.; Li, C.; Wang, Y. On the Detection of Shilling Attacks in Federated Collaborative Filtering. In Proceedings of the 2020 International Symposium on Reliable Distributed Systems (SRDS), Shanghai, China, 21–24 September 2020; pp. 185–194. [Google Scholar] [CrossRef]
  54. Alfeld, S.; Zhu, X.; Barford, P. Data poisoning attacks against autoregressive models. In Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; Volume 30. [Google Scholar]
  55. Biggio, B.; Nelson, B.; Laskov, P. Poisoning attacks against support vector machines. arXiv 2012, arXiv:1206.6389. [Google Scholar]
  56. Zügner, D.; Akbarnejad, A.; Günnemann, S. Adversarial attacks on neural networks for graph data. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining; 2018; pp. 2847–2856. Available online: https://arxiv.org/abs/1805.07984 (accessed on 18 October 2023).
  57. Zhao, M.; An, B.; Yu, Y.; Liu, S.; Pan, S. Data poisoning attacks on multi-task relationship learning. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32. [Google Scholar]
  58. Dhiman, S.; Nayak, S.; Mahato, G.K.; Ram, A.; Chakraborty, S.K. Homomorphic Encryption based Federated Learning for Financial Data Security. In Proceedings of the 2023 4th International Conference on Computing and Communication Systems (I3CS), Shillong, India, 16–18 March 2023; pp. 1–6. [Google Scholar]
  59. Jia, B.; Zhang, X.; Liu, J.; Zhang, Y.; Huang, K.; Liang, Y. Blockchain-Enabled Federated Learning Data Protection Aggregation Scheme With Differential Privacy and Homomorphic Encryption in IIoT. IEEE Trans. Ind. Inform. 2022, 18, 4049–4058. [Google Scholar] [CrossRef]
  60. Zhang, S.; Li, Z.; Chen, Q.; Zheng, W.; Leng, J.; Guo, M. Dubhe: Towards data unbiasedness with homomorphic encryption in federated learning client selection. In Proceedings of the 50th International Conference on Parallel Processing, Lemont, IL, USA, 9–12 August 2021; pp. 1–10. [Google Scholar]
  61. Guo, X. Federated Learning for Data Security and Privacy Protection. In Proceedings of the 2021 12th International Symposium on Parallel Architectures, Algorithms and Programming (PAAP), Xi’an, China, 10–12 December 2021; pp. 194–197. [Google Scholar] [CrossRef]
  62. Fan, C.I.; Hsu, Y.W.; Shie, C.H.; Tseng, Y.F. ID-Based Multireceiver Homomorphic Proxy Re-Encryption in Federated Learning. ACM Trans. Sens. Netw. 2022, 18, 1–25. [Google Scholar] [CrossRef]
  63. Kou, L.; Wu, J.; Zhang, F.; Ji, P.; Ke, W.; Wan, J.; Liu, H.; Li, Y.; Yuan, Q. Image encryption for Offshore wind power based on 2D-LCLM and Zhou Yi Eight Trigrams. Int. J. Bio-Inspired Comput. 2023, 22, 53–64. [Google Scholar] [CrossRef]
  64. Li, L.; Li, T. Traceability model based on improved witness mechanism. CAAI Trans. Intell. Technol. 2022, 7, 331–339. [Google Scholar] [CrossRef]
  65. Lee, J.; Duong, P.N.; Lee, H. Configurable Encryption and Decryption Architectures for CKKS-Based Homomorphic Encryption. Sensors 2023, 23, 7389. [Google Scholar] [CrossRef]
  66. Ge, L.; Li, H.; Wang, X.; Wang, Z. A review of secure federated learning: Privacy leakage threats, protection technologies, challenges and future directions. Neurocomputing 2023, 561, 126897. [Google Scholar] [CrossRef]
  67. Zhang, J.; Zhu, H.; Wang, F.; Zhao, J.; Xu, Q.; Li, H. Security and privacy threats to federated learning: Issues, methods, and challenges. Secur. Commun. Netw. 2022, 2022, 2886795. [Google Scholar] [CrossRef]
  68. Zhang, J.; Li, M.; Zeng, S.; Xie, B.; Zhao, D. A survey on security and privacy threats to federated learning. In Proceedings of the 2021 International Conference on Networking and Network Applications (NaNA), Lijiang City, China, 29 October–1 November 2021; pp. 319–326. [Google Scholar] [CrossRef]
  69. Li, Y.; Bao, Y.; Xiang, L.; Liu, J.; Chen, C.; Wang, L.; Wang, X. Privacy threats analysis to secure federated learning. arXiv 2021, arXiv:2106.13076. [Google Scholar]
  70. Asad, M.; Moustafa, A.; Yu, C. A critical evaluation of privacy and security threats in federated learning. Sensors 2020, 20, 7182. [Google Scholar] [CrossRef]
  71. Manzoor, S.I.; Jain, S.; Singh, Y.; Singh, H. Federated Learning Based Privacy Ensured Sensor Communication in IoT Networks: A Taxonomy, Threats and Attacks. IEEE Access 2023, 11, 42248–42275. [Google Scholar] [CrossRef]
  72. Benmalek, M.; Benrekia, M.A.; Challal, Y. Security of federated learning: Attacks, defensive mechanisms, and challenges. Rev. Sci. Technol. L’Inform. Série RIA Rev. D’Intell. Artif. 2022, 36, 49–59. [Google Scholar] [CrossRef]
  73. Arbaoui, M.; Rahmoun, A. Towards secure and reliable aggregation for Federated Learning protocols in healthcare applications. In Proceedings of the 2022 Ninth International Conference on Software Defined Systems (SDS), Paris, France, 12–15 December 2022; pp. 1–3. [Google Scholar]
  74. Abdelli, K.; Cho, J.Y.; Pachnicke, S. Secure Collaborative Learning for Predictive Maintenance in Optical Networks. In Proceedings of the Secure IT Systems: 26th Nordic Conference, NordSec 2021, Virtual Event, 29–30 November 2021; Springer International Publishing: Berlin/Heidelberg, Germany, 2021; pp. 114–130. [Google Scholar]
  75. Jeong, H.; Son, H.; Lee, S.; Hyun, J.; Chung, T.M. FedCC: Robust Federated Learning against Model Poisoning Attacks. arXiv 2022, arXiv:2212.01976. [Google Scholar]
  76. Lyu, X.; Han, Y.; Wang, W.; Liu, J.; Wang, B.; Liu, J.; Zhang, X. Poisoning with Cerberus: Stealthy and Colluded Backdoor Attack against Federated Learning. In Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023. [Google Scholar]
  77. Wei, K.; Li, J.; Ding, M.; Ma, C.; Jeon, Y.S.; Poor, H.V. Covert model poisoning against federated learning: Algorithm design and optimization. IEEE Trans. Dependable Secur. Comput. 2023. [Google Scholar] [CrossRef]
  78. Sun, J.; Li, A.; DiValentin, L.; Hassanzadeh, A.; Chen, Y.; Li, H. Fl-wbc: Enhancing robustness against model poisoning attacks in federated learning from a client perspective. Adv. Neural Inf. Process. Syst. 2021, 34, 12613–12624. [Google Scholar]
  79. Mao, Y.; Yuan, X.; Zhao, X.; Zhong, S. Romoa: Robust Model Aggregation for the Resistance of Federated Learning to Model Poisoning Attacks. In Proceedings of the Computer Security—ESORICS 2021: 26th European Symposium on Research in Computer Security, Darmstadt, Germany, 4–8 October 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 476–496. [Google Scholar]
  80. Pillutla, K.; Kakade, S.M.; Harchaoui, Z. Robust Aggregation for Federated Learning. IEEE Trans. Signal Process. 2022, 70, 1142–1154. [Google Scholar] [CrossRef]
  81. Wang, X.; Liang, Z.; Koe, A.S.V.; Wu, Q.; Zhang, X.; Li, H.; Yang, Q. Secure and efficient parameters aggregation protocol for federated incremental learning and its applications. Int. J. Intell. Syst. 2022, 37, 4471–4487. [Google Scholar] [CrossRef]
  82. Hao, M.; Li, H.; Xu, G.; Chen, H.; Zhang, T. Efficient, private and robust federated learning. In Proceedings of the Annual Computer Security Applications Conference, Virtual Event, 6–10 December 2021; pp. 45–60. [Google Scholar]
  83. Smith, V.; Chiang, C.K.; Sanjabi, M.; Talwalkar, A.S. Federated multi-task learning. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar]
  84. Shalev-Shwartz, S.; Zhang, T. Accelerated proximal stochastic dual coordinate ascent for regularized loss minimization. In Proceedings of the International Conference on Machine Learning, Beijing, China, 22–24 June 2014; pp. 64–72. [Google Scholar]
  85. Wassan, S.; Suhail, B.; Mubeen, R.; Raj, B.; Agarwal, U.; Khatri, E.; Gopinathan, S.; Dhiman, G. Gradient Boosting for Health IoT Federated Learning. Sustainability 2022, 14, 16842. [Google Scholar] [CrossRef]
  86. Feng, J.; Yang, L.T.; Zhu, Q.; Choo, K.K.R. Privacy-preserving tensor decomposition over encrypted data in a federated cloud environment. IEEE Trans. Dependable Secur. Comput. 2018, 17, 857–868. [Google Scholar] [CrossRef]
  87. Feng, J.; Yang, L.T.; Ren, B.; Zou, D.; Dong, M.; Zhang, S. Tensor recurrent neural network with differential privacy. IEEE Trans. Comput. 2023. [Google Scholar] [CrossRef]
  88. Karras, A.; Karras, C.; Giotopoulos, K.C.; Tsolis, D.; Oikonomou, K.; Sioutas, S. Federated Edge Intelligence and Edge Caching Mechanisms. Information 2023, 14, 414. [Google Scholar] [CrossRef]
  89. Karras, A.; Karras, C.; Schizas, N.; Avlonitis, M.; Sioutas, S. AutoML with Bayesian Optimizations for Big Data Management. Information 2023, 14, 223. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.