Review

Privacy-Preserving Data Aggregation Mechanisms in Mobile Crowdsensing Driven by Edge Intelligence

Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580, China
* Author to whom correspondence should be addressed.
Electronics 2026, 15(1), 26; https://doi.org/10.3390/electronics15010026
Submission received: 6 November 2025 / Revised: 18 December 2025 / Accepted: 18 December 2025 / Published: 21 December 2025

Abstract

Edge Intelligence (EI) empowers Mobile Crowdsensing (MCS) with real-time, distributed processing capabilities, but these advancements exacerbate long-standing privacy challenges. The strict requirements for low-latency computation on heterogeneous, resource-constrained edge nodes often conflict with the significant overhead imposed by traditional privacy-preserving techniques. Furthermore, distributed data flows and dynamic network conditions expand the attack surface, complicating risk containment. However, existing surveys do not examine privacy-preserving data aggregation through the lens of EI-specific constraints, a gap that this work aims to address. To this end, this paper systematically reviews recent privacy-preserving aggregation mechanisms from an EI-oriented perspective that accounts for real-time constraints, energy limitations, and decentralized cooperation. The survey examines emerging attack models and defense strategies associated with distributed collaboration and evaluates their implications for aggregation security in EI environments. Existing methods are categorized and assessed according to MCS system architecture and lifecycles, revealing limitations in applicability, scalability, and suitability under EI constraints. By integrating current techniques with experimental findings, this paper identifies open challenges and outlines promising directions for enhancing privacy protection in EI-driven MCS, offering both conceptual and analytical insights and practical guidance for future system design.

1. Introduction

Mobile Crowdsensing (MCS) depends on extensively distributed devices (e.g., mobile terminals, sensors, and edge nodes) for data acquisition and initial processing. MCS collects data from devices and sends it to a central server for processing and analysis [1]. In EI-driven MCS, the location of data processing is optimized: more computing jobs are executed locally at the edge nodes, which reduces data transmission demands and latency. EI improves real-time responsiveness through collaborative computing between edge nodes and the cloud [2]. Under this design, the cloud server primarily supplements the limited computational capacity of the edge nodes, mitigating performance bottlenecks in large-scale data transmission and enhancing the system’s adaptability to complex and dynamic application contexts [3].
In EI-driven MCS systems, distributed devices collect heterogeneous data from multiple sources, including sensor readings, location information, and user behavior. The scale and complexity of such data make direct processing inefficient, rendering data aggregation essential for reducing redundancy and improving system performance. However, aggregation also introduces privacy risks, particularly for sensitive data, which may lead to privacy breaches and inference attacks. Consequently, EI-driven MCS systems face several obstacles in designing privacy-preserving data aggregation strategies [3]. In this work, EI-driven MCS is regarded as a paradigm in which sensing data are processed and aggregated primarily at the network edge, enabling decentralized learning, real-time decision-making, and collaborative model optimization under resource and communication constraints. This distinguishes EI-driven MCS from traditional MCS, which mainly relies on centralized cloud-based aggregation and offline or near-offline processing.
The distributed edge architecture also increases robustness challenges. Although federated learning supports secure aggregation, it still suffers from inherent security vulnerabilities. Federated learning is a distributed training paradigm in which multiple clients collaboratively optimize a shared model by exchanging model updates rather than raw data. The distributed and heterogeneous nature of EI reduces the consistency of client behaviors and weakens global coordination, thereby lowering the effectiveness of anomaly detection and identity verification. As a result, adversaries can more easily hide malicious updates or inject multiple identities, making poisoning and Sybil attacks harder to contain. These attacks not only threaten model security but also directly undermine the reliability of aggregated results, thereby intensifying the robustness challenges for EI-driven MCS systems. Therefore, EI-driven MCS systems need to design targeted mechanisms to defend against various types of attacks in different scenarios and enhance the robustness of federated learning systems to ensure secure data aggregation.
EI-driven MCS faces a fundamental trade-off between data utility and privacy protection under multi-layer architectures. Limited computation and energy resources at edge nodes restrict the applicability of complex privacy mechanisms, while lightweight solutions often provide insufficient guarantees. In addition, privacy risks may propagate across layers during data transmission, as most traditional mechanisms are designed for single-layer protection. Consequently, the overall system faces a higher level of privacy risk, making the trade-off between privacy and utility even more difficult to achieve. Thus, EI-driven MCS needs lightweight, resource-efficient privacy mechanisms and a cross-layer framework that coordinates privacy needs across system layers.
Real-time processing is another key feature of EI-driven MCS, centered on enabling rapid decision-making through low latency. This poses challenges for privacy protection under resource constraints, as real-time requirements demand continuous improvements in local processing capabilities while still ensuring data privacy. In EI-driven MCS system communications, common transmission prioritizes rapid and efficient data transfer but leaves private information vulnerable to tracking and surveillance. Anonymous connectivity schemes, which require multiple relay or mix nodes, often introduce significant latency. Consequently, developing privacy protection mechanisms and communication protocols that do not incur significant delays remains a critical research focus.
Within the federated learning framework, the additive secret sharing method from Secure Multi-Party Computation (SMC) is utilized to facilitate secure data aggregation across participants while safeguarding the original data from exposure. Simultaneously, the integration of symmetric key cryptography (e.g., a double-layer lattice-based cryptosystem) efficiently safeguards data privacy during transmission and storage and prevents the leakage of intermediate results [4,5]. The integration of local differential privacy with blockchain and smart contracts guarantees that privacy protection during data aggregation is both transparent and reliable, mitigating the risk of potential privacy breaches [6]. Despite the success of these approaches in privacy protection and data aggregation within MCS, further optimization is necessary for EI-driven MCS systems, as they are primarily tailored for the traditional MCS framework and have not adequately resolved issues related to multi-layer privacy diffusion and real-time requirements.
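As a concrete illustration of the additive secret sharing idea mentioned above, the following sketch splits each participant's private reading into random shares whose sum reconstructs the value modulo a prime. The field modulus, participant count, and function names are illustrative choices, not the exact protocol of any cited scheme.

```python
import random

PRIME = 2**61 - 1  # illustrative field modulus; real protocols fix this by design

def share(value, n):
    """Split an integer into n additive shares that sum to value (mod PRIME)."""
    shares = [random.randrange(PRIME) for _ in range(n - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def aggregate(all_shares):
    """Each aggregator sums one share per participant; summing the partial
    sums reveals only the total, never any individual reading."""
    partials = [sum(column) % PRIME for column in zip(*all_shares)]
    return sum(partials) % PRIME

readings = [23, 17, 42]                   # private sensor readings
shared = [share(r, 3) for r in readings]  # each participant shares its value
assert aggregate(shared) == sum(readings)
```

No single share, and no partial sum held by one aggregator, reveals anything about an individual reading; this is the property that makes the primitive attractive for resource-constrained aggregation.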
Several prior surveys have examined mobile crowdsensing or its associated privacy challenges, including system-level overviews of MCS architectures and privacy issues [1], studies focusing on participant and task-management privacy [7,8], and broader investigations of privacy protection techniques in MCS [9,10]. Although informative, these works primarily address traditional MCS settings and do not consider how the architectural characteristics of edge intelligence, such as device heterogeneity, resource constraints, dynamic connectivity and real-time processing requirements, reshape the design space of privacy-preserving data aggregation. Surveys on edge intelligence and trustworthy distributed learning [2,3,11] provide valuable insights into EI architectures and security, but they do not focus on privacy-preserving aggregation within MCS workflows. To the best of our knowledge, no existing review systematically analyzes aggregation-oriented privacy mechanisms from an EI-driven perspective or evaluates their suitability under EI-specific constraints. This gap motivates the present work. Given that EI-driven MCS sits at the intersection of edge computing, security, and data aggregation, this survey intentionally adopts a cross-disciplinary perspective. It aims to equip readers who are familiar with MCS or distributed learning, but not necessarily experts in EI or security, with the conceptual foundations needed to understand EI-specific aggregation and privacy challenges.
This study categorizes and reviews contemporary privacy-preserving data aggregation algorithms based on the system architecture and lifecycle of MCS, considering the specific requirements of the EI environment. This study examines existing MCS methodologies and evaluates their extension and applicability in EI-driven systems. This work aims to provide conceptual and analytical insights and practical references for the advancement of more comprehensive and efficient privacy-preserving data aggregation methodologies in EI-driven MCS systems. The main contributions and innovations of this paper can be summarized as follows:
  • Stage-wise Review and Comparative Analysis Oriented to EI: Based on the core requirements of EI for real-time performance, resource efficiency, and collaborative capability, we systematically organize and compare existing privacy-preserving data aggregation techniques through a phased, multidimensional approach. The intrinsic advantages, design focuses, and inherent limitations of various methods under EI environments are thoroughly elaborated.
  • Privacy Protection Technology Migration and Adaptability Assessment from MCS to EI: This study systematically reviews and reconstructs mature privacy-preserving data aggregation schemes from traditional MCS, clarifying their feasibility for migration to EI architectures, adaptation bottlenecks, and performance trade-offs. It provides critical reference for privacy technology selection and path exploration in edge intelligence applications.
  • Empirical Performance and Data Utility Evaluation of Multiple Schemes: Through experiments, we compared the performance of various privacy protection techniques in the EI scenario and their impact on data quality, intuitively revealing the distinct characteristics of different approaches in balancing data utility, privacy strength, and system overhead.
  • Outlook on Future Research Directions: Based on the review and experimental results, this work identifies the key gaps in existing technologies when addressing the demands of EI-driven systems. Consequently, several promising research directions for future exploration are proposed, including lightweight privacy algorithms, edge-cloud collaboration mechanisms, and the design of attack-resistant models.
The structure of this paper is as follows: Section 2 outlines the focus of this work. Section 3 presents a classification framework. Section 4 classifies, discusses, and compares data aggregation privacy protection methods in EI-driven MCS. Section 5 experimentally evaluates the performance of multiple methods. Section 6 provides recommendations for future research directions.

2. Architectures, Data Aggregation, Privacy Protection and Security Vulnerabilities: A Brief Overview

2.1. Architecture of Mobile Crowdsensing Driven by Edge Intelligence

Figure 1 illustrates a conceptual multi-layer architecture of an EI-driven MCS system and contrasts it with the conventional MCS workflow. In a traditional MCS system, several task initiators, mobile users, and a cloud-based sensing platform interact in a centralized manner: task initiators submit sensing requirements; the platform assigns tasks to users, and users collect data through their devices and return the results for cloud-side processing. EI-driven MCS systems differ fundamentally in how they interface with the sensing platform. In conventional MCS, user devices may transmit raw data, partially processed data, or both, requiring the cloud to handle large data volumes. In contrast, EI-driven MCS shifts substantial computation to the edge, where edge nodes perform local processing and transmit only results or aggregated data. This architectural shift reduces raw data transmission, improves communication efficiency, and reshapes the privacy and security landscape.
Accordingly, we abstract the EI-driven MCS architecture into three interrelated components, the sensing part, the communication part, and the application part [1], together with a cross-cutting vulnerabilities part that captures the security risks spanning these layers. These components form a multi-layer architecture in which data aggregation, privacy protection mechanisms, and security vulnerabilities are tightly coupled across layers.
  • The vulnerabilities part represents core risks threatening the security, data integrity, and user privacy of EI-driven MCS, directly impacting system trust and stability. The local processing capabilities of EI-driven MCS edge nodes and the dynamic access features of mobile nodes provide entry points for various attacks. These vulnerabilities can lead to task failures, diminished data credibility, and privacy breaches, posing challenges to system efficiency and security.
  • The application part addresses the advanced elements of MCS activities, emphasizing activity design and organization, including task assignment and management, user recruitment, incentivizing user engagement through effective strategies, and ensuring that tasks are assigned and executed efficiently. In EI-driven MCS, the decentralized computing capabilities of edge nodes improve task execution and feedback velocity, facilitating real-time processing and decision-making, hence rendering the system superior to conventional MCS in reaction speed and performance.
  • The communication part is accountable for data transfer and administration. In traditional MCS, it typically transfers all data from mobile devices to the cloud for processing. Conversely, in EI-driven MCS, the edge nodes initially analyze and aggregate data locally, with the communication part often transmitting the aggregated results instead of the raw data. This method diminishes data communication volume, mitigates the danger of privacy breaches during transmission, and enhances both the efficiency and security of the transmission process.
  • The sensing part includes sensors on various mobile devices that collect and initially process raw environmental data. It acts as the data source for the entire system, making efficient and accurate data collection crucial for performance and reliability. In EI-driven MCS, sensor data is typically aggregated and processed in real-time at edge nodes, which reduces raw data transfer, lowers privacy risks, and enhances real-time data throughput and processing efficiency.

2.2. Data Aggregation Techniques: Enhancing Efficiency and System Performance

Data aggregation is essential for enhancing data quality, minimizing redundancy, and facilitating informed decision-making. Different aggregation forms serve different purposes but face challenges such as node heterogeneity, dynamic resources, and secure collaboration. It is essential to thoroughly investigate and categorize current data aggregation methods to meet the varied requirements in intricate contexts. This article delineates five techniques for data aggregation:
  • Common data aggregation involves directly combining data based on the initiator’s request or privacy protection needs. This typically includes operations like sum, mean, variance, density, p-order moment, skewness, and kurtosis [4]. Some argue that an ideal privacy-protected data aggregation method should satisfy both basic aggregation needs and specific custom needs (such as min, max, and top-K) to enhance the method’s versatility.
  • Weighted aggregation is closely related to truth discovery, a technique used to extract accurate information from data provided by many participants. Since participants may offer erroneous, biased, or noisy data, data aggregation helps identify the most trustworthy information. In this process, Truth Discovery assesses the quality of each data provider’s input, which informs the weighting used during aggregation [12].
  • Cluster aggregation is often combined with encryption-based privacy-preserving methods, such as task clustering methods that group users with tasks of the same interest into the same cluster with user-encrypted task-bid pairs [13], or the use of K-Means clustering algorithms to directly group encrypted user data [14].
  • Random matrix aggregation changes the form of the data before aggregation. For example, the data is stored as a random matrix and the matrices are then aggregated; when privacy protection is performed within a confidence framework, the reputation values can be replaced and embedded in the generated random position matrix [15].
  • The idea of model aggregation originates from federated learning, which considers global and local models in the privacy protection process and aggregates the transmitted data directly at the model level in order to achieve collusion-resistant privacy [5].
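To make the weighted-aggregation idea concrete, the sketch below implements a CRH-style truth discovery loop in the spirit of [12]: the estimated truth is a weighted mean, and each participant's weight shrinks as its deviation from the current estimate grows. The update rule and constants are illustrative, not the exact formulation of any cited scheme.

```python
import math

def truth_discovery(observations, iterations=10):
    """Alternate between truth estimation (weighted mean) and weight update
    (log of inverse relative squared error), CRH-style."""
    weights = [1.0] * len(observations)
    for _ in range(iterations):
        truth = sum(w * o for w, o in zip(weights, observations)) / sum(weights)
        errors = [(o - truth) ** 2 + 1e-9 for o in observations]  # epsilon avoids log of zero
        total = sum(errors)
        weights = [math.log(total / e) for e in errors]
    return truth, weights

# Three consistent participants and one outlier
obs = [20.1, 19.9, 20.0, 35.0]
truth, weights = truth_discovery(obs)
# The outlier receives the smallest weight, and the estimate stays near 20
```

The loop converges quickly in practice: after the first round the outlier's weight collapses, so its influence on the weighted mean becomes negligible.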

2.3. Privacy Protection Methods: Protecting Data Security and System Reliability

This section categorizes modern privacy protection strategies into four main types, based on their core principles and focus in addressing privacy risks. Each technique addresses privacy concerns at different stages and contexts within the system, forming the foundation of current privacy protection mechanisms. The following descriptions detail each method:
  • Cryptography: Cryptographic methods often include homomorphic encryption, secure multi-party computation, and key encryption. Homomorphic encryption allows computation on data without decryption; secure multi-party computation allows multiple parties to compute together without revealing their private data [16]; and key encryption includes symmetric encryption (e.g., AES), which encrypts and decrypts with the same key, and asymmetric encryption (e.g., RSA), which encrypts with a public key and decrypts with a private key.
  • Anonymization: Anonymization methods protect privacy by removing or generalizing identifying information so that individuals cannot be re-identified. For instance, k-anonymity generalizes or suppresses data so that each combination of key attributes appears at least k times in the dataset.
  • Data Perturbation: Data perturbation methods protect privacy by altering data while preserving its statistical properties for analysis. Differential privacy, a common method, resists inference attacks by ensuring that attackers with background knowledge cannot learn additional private user data.
  • Confidence Framework: The framework builds trust relationships by assessing and managing the historical behavior of participants. It gathers data like transaction history, feedback ratings, and interaction count, calculates reputation using weighted averaging or Bayesian inference, and updates it dynamically based on recent behaviors to ensure data privacy and security.
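As a minimal example of the data perturbation approach, the sketch below releases a differentially private mean using the Laplace mechanism. The clipping bounds and epsilon are illustrative parameters that a real deployment would set from its privacy budget.

```python
import random

def dp_mean(values, lower, upper, epsilon):
    """Release the mean of bounded values with epsilon-differential privacy."""
    n = len(values)
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / n
    # Changing one record moves the mean by at most (upper - lower) / n
    sensitivity = (upper - lower) / n
    scale = sensitivity / epsilon
    # Laplace(0, scale) noise as the difference of two exponential draws
    noise = random.expovariate(1 / scale) - random.expovariate(1 / scale)
    return true_mean + noise

random.seed(0)
readings = [20.0] * 1000  # e.g., temperature readings from 1000 participants
private_mean = dp_mean(readings, 0.0, 50.0, epsilon=1.0)
# With n = 1000 the noise scale is only 0.05, so utility is largely preserved
```

The example also illustrates why aggregation and perturbation pair well: the sensitivity, and hence the added noise, shrinks as the number of contributors grows.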

2.4. Analysis of Key Challenges in Privacy Protection for EI-Driven MCS

Although EI enhances MCS through distributed processing, it also introduces new challenges for privacy-preserving aggregation. These challenges primarily stem from the resource-limited, heterogeneous, and dynamic nature of the edge environment [3]. We therefore identify and elaborate on four pivotal challenges arising from these characteristics.
  • Computational Resource Heterogeneity and Scarcity: The computational capabilities of edge nodes span a wide spectrum, from embedded sensors to edge servers, and are typically constrained in terms of processing, storage, and energy [3,17]. This heterogeneity directly precludes the straightforward application of many computation-intensive privacy techniques. For instance, fully homomorphic encryption [16] and complex public-key cryptosystems may introduce unacceptable latency and energy consumption on low-power devices. Consequently, the research focus must shift towards lightweight cryptographic primitives, data perturbation, and secret sharing schemes with manageable computational overhead, which demonstrate better suitability in resource-constrained environments [4,18].
  • Dynamic and Unstable Network Topology: The sensing network composed of mobile devices is highly dynamic, with nodes frequently joining or departing and network connection quality often fluctuating. This instability poses a significant threat to protocols requiring multiple rounds of low-latency interaction, such as many Secure Multi-Party Computation (SMC) protocols and secret sharing schemes reliant on persistent connections [19]. Mechanisms designed for the EI environment must inherently incorporate strong fault tolerance, asynchronous operation capabilities, or optimized communication patterns that reduce dependency on network stability [12,20]. Similar challenges have also been reported in other highly dynamic edge-based systems, such as vehicular networks, where efficient conditional privacy-preserving authentication protocols like EBCPA [21] are designed to operate under strict real-time and mobility requirements.
  • Stringent Real-Time Processing Requirements: Low latency is a core value proposition of EI, essential for enabling instant decision-making [2,3]. This implies that the overhead of any privacy-preserving mechanism must be strictly bounded within the application’s acceptable latency budget. Techniques that introduce significant delays, such as complex anonymous communication circuits, thus require careful re-evaluation. Architectural innovations like edge-side pre-processing and hierarchical aggregation [22,23] can be leveraged to distribute and mitigate this latency. In this context, efficiency becomes a design objective equally critical to privacy strength.
  • Cross-Layer Diffusion of Data and Privacy Risks: The multi-tier architecture of EI (device-edge-cloud) means data traverses multiple trust domains throughout its lifecycle. Data protected at the end-device level might face new leakage risks during aggregation at edge nodes or in transit [11]. This cross-layer diffusion of privacy risks necessitates that the protection mechanism cannot be a mere stack of isolated techniques. Instead, it must constitute a cross-layer collaborative defense-in-depth system, integrating cryptography, anonymization, trust management, and policy enforcement [5,15,24].
Accordingly, these four challenges constitute the conceptual basis for our subsequent evaluation of privacy-preserving mechanisms.

2.5. Security Vulnerabilities in Federated Learning: Security Challenges and Typical Attacks

Federated learning is a decentralized learning framework in which each participant trains a local model on its own data and sends only model updates to a coordinating server. The server aggregates these updates to construct a global model without accessing raw data, thereby reducing direct data exposure. Despite this design, federated learning remains vulnerable in practice, as the server cannot verify the correctness, integrity, or origin of the received updates, creating opportunities for adversarial manipulation and privacy leakage [11,25]. In this section, we categorize the major security vulnerabilities in federated learning into four classes and provide a systematic exposition of the corresponding attack patterns.
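The server-side aggregation step that these attacks target can be sketched in a few lines; the coordinate-wise weighted average below follows the FedAvg idea, with hypothetical update vectors standing in for real model parameters.

```python
def fed_avg(client_updates, client_sizes):
    """Aggregate client model updates as a coordinate-wise average,
    weighted by local dataset size (FedAvg-style aggregation)."""
    total = sum(client_sizes)
    dim = len(client_updates[0])
    return [
        sum(u[i] * s for u, s in zip(client_updates, client_sizes)) / total
        for i in range(dim)
    ]

# Two clients; the server sees only these update vectors, never raw data
updates = [[1.0, 2.0], [3.0, 4.0]]
sizes = [10, 30]
global_update = fed_avg(updates, sizes)  # [2.5, 3.5]
```

Note that nothing in this step authenticates or validates an update: a poisoned vector enters the average with the same weight as an honest one, which is precisely the opening that the following attack classes exploit.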
  • Poisoning Attack: A poisoning attack occurs when an adversary injects harmful data or updates to disrupt system behavior or steal information. Such attacks exploit the model’s dependence on training data to compromise its reliability. The attack takes place during the model training phase, but the damage only surfaces after the model is deployed. Under normal inputs, the model may appear to function properly for a period of time; however, due to the contamination of its training data, it continuously produces erroneous outputs that deviate from expectations, while the traces of the attack remain difficult to detect after training. Recent studies in federated learning have further explored poisoning attacks in secure aggregation settings. Xhemrishi et al. [26] investigate malicious client identification under secure aggregation, demonstrating a practical privacy–security trade-off. Wang et al. [27] examine the vulnerability of similarity-based reliability assessment to poisoning, revealing both strong attack effectiveness and efficient defenses. In EI-driven MCS, heterogeneous client updates increase natural variance, making poisoned updates harder to distinguish from normal behavior.
  • Backdoor Attack: Malicious clients inject triggers into training data, sending locally trained models with backdoors to the server, causing the global model to inherit these backdoors during the aggregation process. These attacks are stealthy because the model behaves normally under regular inputs. When a trigger input appears, the backdoor activates and forces the model to output attacker-defined results, undermining reliability while remaining hard to detect. For backdoor threats, recent work has examined constrained and collusive attack models. Huang et al. [28] present a detection framework with strong robustness and generalization. Lyu et al. [29] analyze collusive backdoor attacks that exhibit high stealth and sparsity, highlighting the difficulty of ensuring resilience in edge-based collaborative learning.
  • Sybil Attack: Attackers create numerous fake identities to impersonate legitimate participants. These identities can collude to manipulate the system and influence model aggregation. Such attacks can forge majorities, tamper with reputations, and disrupt data consistency. It poses a particular threat to systems that rely on node collaboration, thereby undermining the overall reliability and security of the system. Sybil attacks have also been extensively investigated. Dong et al. [30] model realistic Sybil behaviors with dishonest participants, while Jin et al. [31] propose a reputation-constrained truth discovery mechanism that mitigates inflated worker weights but degrades under high Sybil prevalence. Frequent node churn and uneven device capabilities reduce identity stability, allowing Sybil identities to blend more easily into the system.
  • Inference Attack: Inference attacks analyze model outputs or background knowledge to deduce sensitive training data. The characteristic of this attack is that the adversary does not need direct access to the raw data but can instead deduce sensitive content through inference. This poses challenges to privacy protection measures such as data anonymization and de-identification, potentially leading to the leakage of individual or group privacy. Inference attacks remain a central challenge in distributed learning. Wang et al. [32] introduce a hierarchical noise injection mechanism to address accuracy degradation under DP, whereas Hu et al. [33] design a lightweight framework that reduces membership inference risk with minimal overhead.
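To convey the intuition behind membership inference without any ML machinery, the toy sketch below uses a "model" that memorizes its training points (extreme overfitting); an attacker who can only query confidence scores then guesses membership by thresholding. The names, decay function, and threshold are all illustrative.

```python
def memorizing_model(train):
    """A toy model that memorizes its training set: confidence is exactly 1.0
    on memorized points and decays with distance from them."""
    def confidence(x):
        return max(1.0 / (1.0 + abs(x - t)) for t in train)
    return confidence

train = [1.0, 2.0, 3.0]          # private training data
model = memorizing_model(train)

def membership_guess(x, threshold=0.99):
    """Threshold attack: suspiciously high confidence suggests x was a member."""
    return model(x) > threshold

# The attacker infers membership from outputs alone, never seeing `train`
```

The same intuition carries over to real models: the more a model's outputs differ between training members and non-members, the more an output-only adversary can infer, which is why defenses such as noise injection [32] target exactly this gap.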

3. Classification Framework

(1) Vulnerabilities Part: We focus on the common security vulnerabilities that currently exist in federated learning. Owing to its distributed architecture, federated learning is highly susceptible to various attacks, which we categorize into four types. For each category, we summarize representative attacks and their impacts, with emphasis on recent methods. Furthermore, we compare their respective advantages and disadvantages to clarify the threats they pose to MCS.
(2) Application Part: We focus on the algorithm content and stages, categorizing them into three groups: task assignment, user recruitment and incentives, and the overall system. Because some research lacks phase specificity, we introduce a comprehensive system class. In these categories, we provide a concise overview of the privacy protection techniques and data aggregation methods used in each algorithm, with attention to operation localization and noise introduction.
(3) Communication Part: We categorize data transmissions into two groups: Anonymous Connections and Common Transmission. Anonymous connections protect user privacy by obfuscating identity and encrypting communication, preventing tracking and identification. Common transmission prioritizes fast and efficient data transfer. We also consider whether the algorithm provides comprehensive explanations of data release.
(4) Sensing Part: We focus on the categories and origins of collected data, including sensitive data protected for privacy and the range of sensors used. Privacy-protected data targets include Location, Identity, Sensory, Bid, and Reputation privacy. Sensors are used in various domains such as general, medical, automotive, and industrial networks.

4. Privacy-Preserving Data Aggregation Mechanism Algorithms

4.1. Novel Attack and Defense Methods—Vulnerabilities Part

Federated learning (FL) is vulnerable to poisoning attacks due to its decentralized nature. Xhemrishi et al. [26] pointed out that although privacy-enhancing secure aggregation protocols protect data privacy, they limit the server’s ability to detect individual client updates, forming a core trade-off between privacy and security. To address this issue, they proposed the FedGT framework, which adopts the idea of group testing to divide clients into overlapping groups. Through secure aggregation, the aggregated model result of each group is obtained and tested to determine whether malicious clients exist. Then, malicious clients are identified and removed through a decoding operation, thereby improving model security and utility while ensuring privacy. Wang et al. [27] revealed that commonly used similarity metrics in federated learning have a security vulnerability where benign and malicious models may have the same similarity evaluation results in high-dimensional models, although their actual parameter values differ significantly. They designed a new untargeted attack method for model poisoning, Faker, which simultaneously maximizes the evaluation similarity and parameter discrepancy between malicious and benign models, enabling it to bypass multiple mainstream defense mechanisms. Meanwhile, the Selective Partial Parameter (SPP) strategy proposed in the paper can effectively defend against this attack by randomly evaluating only part of the parameters.
Federated learning is vulnerable to both targeted and untargeted data and model poisoning attacks, and it also faces Byzantine failures, where even a single malicious client can compromise the global model. Because traditional methods require centralized data or cause privacy risks, Valadi et al. [34] proposed a solution to mitigate poisoning attacks and fairness loss in heterogeneous settings. Their method constructs a scoring function on the server side based on a verification dataset, which includes a bias-reduction term, a slope term, and a baseline score. It can dynamically assess the weights of client updates without needing more client information, all while maintaining robustness, fairness, and differential privacy.
Backdoor attacks are a form of poisoning, and federated learning is vulnerable because it lacks supervision over local training. Huang et al. [28] demonstrated that attackers in backdoor attacks upload malicious model parameters through local training, causing the global model to output incorrect labels for specific trigger inputs while maintaining correctness for normal inputs, achieving strong concealment. Therefore, the proposed Scope defense method effectively detects constrained backdoor attacks through dimension normalization, differential scaling, and dominant gradient clustering. Lyu et al. [29] proposed the CoBA method, enabling colluding malicious participants to execute backdoor attacks with high sparsity and stealth. This attack successfully bypassed 15 mainstream robust federated learning defense methods by optimizing backdoor triggers to facilitate backdoor data learning, controlling malicious local model bias, applying projection gradient descent techniques to enhance backdoor persistence, and increasing malicious model diversity to evade similarity detection.
Many approaches have been proposed to defend against Sybil attacks. Dong et al. [30] showed that traditional Sybil defenses assume attackers bid honestly, while real attackers may bid dishonestly to gain more utility. To address this, they proposed a new attack method in which an attacker splits into multiple identities whose total bids equal that of a single identity but do not follow truthful valuation, rendering traditional mechanisms ineffective. Accordingly, they designed the PRAM mechanism, which effectively defends against this attack through the merging of suspicious identities, bidder ranking independent of bid values, and specific winner selection and payment methods. Similarly, the Sybil attack also poses challenges in crowdsourcing environments. Jin et al. [31] pointed out that Sybil attacks threaten crowdsourcing quality because truth-discovery methods often ignore workers’ historical reputations. To address this issue, they proposed the RCTD method, which refines the approval rate using the Wilson lower bound to enhance confidence, introduces a similarity penalty term between weight ranking and refined approval rate ranking to constrain weight estimation, and solves the objective function through block coordinate descent combined with a heuristic algorithm, effectively mitigating this problem.
Inference attacks may allow adversaries to infer or reconstruct clients’ training data from shared model parameters. Wang et al. [32] pointed out that the cumulative privacy budget across multiple iterations, influenced by model dimension and iteration count, triggers a “privacy explosion” problem, further exacerbating privacy risks. Consequently, they proposed the Rényi Differential Privacy-based model RAFLS to balance privacy protection and model accuracy in federated learning. RAFLS employs hierarchical adaptive noise injection, model parameter shuffling, and fine-grained model weight aggregation to achieve privacy protection while minimizing noise’s impact on accuracy. Hu et al. [33] demonstrated that, because federated learning protects only local data while exposing model updates, attackers can exploit the exposed weights to capture training traces and steal private information, typically through membership inference attacks and gradient inversion attacks. To mitigate these threats, they proposed a mechanism that combines gradient-guided selective homomorphic encryption with a Mask Consensus unified masking mechanism. This approach effectively defends against the aforementioned attacks while reducing communication overhead, lowering the accuracy of membership inference attacks to 49.2%, and preventing data reconstruction attacks with only a minimal encryption rate.
In addition, federated learning faces several other types of attacks and challenges. Xu et al. [35] pointed out that due to limited wireless bandwidth resources, attackers can send a large number of malicious requests to the communication channels between users and servers or between service providers and network operators, occupying scarce spectrum resources and causing increased FL latency or even service interruption. To address this vulnerability, the study employed an evolutionary game mechanism and a bilateral auction mechanism to mitigate DDoS attacks in user-side and server-side bandwidth allocation, respectively. Jiang et al. [36] addressed Simpson’s paradox in heterogeneous FL, where conflicting local and global trends lead to inaccurate aggregated models. They proposed the FedCFA framework, which generates counterfactual samples through counterfactual learning and optimizes feature independence via factor disentanglement loss, effectively alleviating this challenge and improving the global model’s accuracy and training efficiency.
Security vulnerabilities in FL arise from the tension between decentralization and strict privacy requirements. In EI scenarios, the architectural and device characteristics do not simply inherit the security weaknesses of federated learning. Instead, they amplify the stealthiness of attacks, reduce the effectiveness of defenses, and accelerate the accumulation of privacy risks, making security threats more difficult to contain. These attacks not only endanger model security but also undermine the reliability of data aggregation, thereby intensifying the robustness challenges of EI-MCS systems.
Table 1 provides a detailed summary of these methods, including their goals, advantages, and disadvantages. Privacy-enhancing mechanisms naturally trade off against attack detection capabilities, and in certain scenarios, resource constraints may also conflict with security guarantees. These issues threaten both data privacy and model reliability. As a major practical implementation of federated learning, EI-driven MCS further amplifies existing security risks while introducing new requirements for coordinating data aggregation efficiency with privacy protection. Based on this, the following sections focus on EI-driven MCS scenarios, systematically exploring the integration of data aggregation and privacy protection, and offering guidance on addressing the challenges of security and efficiency in such settings.

4.2. Privacy Scheme and Data Aggregation—Application Part

4.2.1. Task Assignment Stages

The design of task assignment schemes is a critical component of management systems, exerting a significant influence on their stability and efficiency [7]. In practical task assignment scenarios, addressing allocation efficiency and resource consumption amid multiple concurrent tasks presents a challenge. Simultaneously, achieving optimal task assignment relies on participants’ personal information, such as precise user locations or bids, making privacy protection the second significant challenge [8]. In EI scenarios, heterogeneous edge resources and network conditions require task assignment to adapt dynamically to device status, which further complicates scheduling.
Current privacy protection methods for task assignment focus on multi-objective optimization while also exploring integration with novel approaches such as fog computing and blockchain. Peng et al. [37] proposed a privacy-preserving multi-objective task assignment (PMTA) scheme that provides multidimensional privacy protection for sensor users under multiple constraints. This enables participants to obfuscate sensitive data using differential privacy techniques and perform local data aggregation. The scheme avoids TTP dependency, resists prior knowledge attacks, reduces users’ expected travel distance, and lowers costs for task publishers, achieving synergistic optimization of multidimensional privacy protection and task assignment efficiency. However, when applied to the EI scenario, the real-time joining of edge devices requires the differential privacy (DP) mechanism to dynamically adjust noise intensity. Traditional MCS task assignment uses static privacy policies that cannot adapt to rapidly changing EI device states, making fixed DP noise unstable and less effective.
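The obfuscation step in such DP-based schemes reduces, at its core, to the Laplace mechanism: add noise with scale sensitivity/ε before data leave the device. The sketch below is a generic illustration, not part of PMTA; the sensor values, ε, and function names are assumptions.

```python
import numpy as np

def privatize_reading(value, sensitivity, epsilon, rng):
    """epsilon-DP Laplace mechanism: noise scale b = sensitivity / epsilon.
    Smaller epsilon means stronger privacy but noisier individual readings."""
    return value + rng.laplace(0.0, sensitivity / epsilon)

rng = np.random.default_rng(42)
readings = np.full(10_000, 21.5)  # hypothetical true sensor value (e.g., °C)
noisy = np.array([privatize_reading(v, sensitivity=1.0, epsilon=0.5, rng=rng)
                  for v in readings])
# Each individual report is heavily obscured (noise std ≈ 2.83),
# yet the aggregate mean stays close to the true value 21.5.
print(noisy.mean())
```

The example also illustrates the tension noted above for EI: the noise scale is fixed by ε at submission time, so a static policy cannot react to edge devices whose state (and hence acceptable utility loss) changes rapidly.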
Song et al. [20] proposed a privacy-preserving task matching scheme (PPTM) that converts the interests of vehicular workers into binary vectors. The scheme uses matrix factorization and proxy re-encryption for multi-user, multi-keyword search and speeds matching by aggregating binary vectors with the same prefix. During the task assignment phase, it achieves privacy-preserving Jaccard similarity search under interest constraints and Euclidean distance similarity search under location constraints. However, in the highly dynamic link of EI, multiple rounds of encrypted searches may accumulate significant delays. Peng et al. [38] designed a secure computation protocol SCP based on Shamir’s secret sharing and Carmichael’s theorem, which enables efficient and fine-grained task matching and privacy protection across multiple users and multiple tasks in MCS. In EI settings with frequent node joins and departures, the robustness of secret-share reconstruction may decrease, suggesting the need for resilience-enhancing redundancy.
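Since SCP builds on Shamir's secret sharing, a minimal textbook (t, n) sharing over a prime field helps fix ideas; this is the standard construction, not the SCP protocol itself, and the field size and parameters are demo choices.

```python
import random

P = 2_147_483_647  # Mersenne prime field modulus (demo-sized)

def split(secret, n, t, rng=random):
    """Shamir (t, n): sample a degree-(t-1) polynomial with f(0) = secret
    and hand out the points f(1), ..., f(n) as shares."""
    coeffs = [secret] + [rng.randrange(P) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):        # Horner evaluation mod P
            y = (y * x + c) % P
        shares.append((x, y))
    return shares

def reconstruct(shares):
    """Lagrange interpolation at x = 0 over GF(P)."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % P
                den = (den * (xi - xj)) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

shares = split(123456, n=5, t=3)
print(reconstruct(shares[:3]))  # any 3 of the 5 shares recover 123456
```

The (t, n) threshold is also the crux of the EI robustness concern raised above: if node churn leaves fewer than t shares reachable, reconstruction fails, motivating redundancy in share placement.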
To reduce communication delay and server load, some studies introduce fog computing by deploying fog nodes with data collection, communication, and computation capabilities. However, compared with traditional MCS that focuses solely on data collection and centralized cloud processing, in EI scenarios fog nodes serve as core edge units and must additionally address the unique risks of privacy leakage caused by potential collusion among multiple entities, including edge devices, cloud servers, and users, as well as the limited computational power of edge devices. Yan et al. [22] proposed a fog-computing-based aggregation scheme combining secret sharing and Paillier encryption to balance privacy and efficiency through hierarchical processing at fog nodes and the cloud. While its hierarchical design aligns with EI architectures, Paillier operations remain heavy for low-power edge devices and may increase latency in resource-constrained environments.
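The additive homomorphism that such Paillier-based schemes exploit (ciphertext multiplication equals plaintext addition) can be shown with a minimal textbook Paillier implementation. This is a generic sketch under the standard g = n + 1 simplification, with deliberately tiny, insecure demo primes; it is not the scheme of [22].

```python
import math, random

def paillier_keygen(p, q):
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    g = n + 1                      # standard simple generator choice
    mu = pow(lam, -1, n)           # valid precisely because g = n + 1
    return (n, g), (lam, mu)

def encrypt(pk, m, rng=random):
    n, g = pk
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(pk, sk, c):
    n, _ = pk
    lam, mu = sk
    l = (pow(c, lam, n * n) - 1) // n
    return (l * mu) % n

# Demo primes (the 10,000th and 100,000th primes) -- far too small for security.
pk, sk = paillier_keygen(104_729, 1_299_709)
c1, c2 = encrypt(pk, 41), encrypt(pk, 1)
print(decrypt(pk, sk, (c1 * c2) % (pk[0] ** 2)))  # homomorphic add -> 42
```

The modular exponentiations in `encrypt` are exactly the "heavy operations" flagged in the text: at real key sizes (2048+ bits) they dominate the cost on low-power edge devices.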
To address the unique requirements of mobile crowdsensing in EI scenarios for low-latency response and bandwidth optimization, Yan et al. [23] further introduced an asynchronous group authentication and aggregation mechanism based on threshold secret sharing and Paillier encryption, enabling verifiable fog-assisted crowdsensing. Threshold-based reconstruction, however, may become less reliable under EI’s unstable participation patterns where nodes intermittently disconnect.
Zhao et al. [39] developed iTAM, a bilateral privacy-preserving task assignment mechanism leveraging Paillier encryption to protect both requester and worker privacy while supporting hybrid constraints. In practical EI deployments, Paillier’s computational cost may challenge real-time scheduling on low-capability edge nodes, requiring lightweight optimization.
In vehicular EI scenarios, edge nodes such as roadside units have limited computing power. Therefore, Cheng et al. [15] adopted a lightweight design and implemented multi-level privacy protection measures, including a confidence framework and k-anonymity in the task assignment phase to safeguard location privacy and reputation privacy. At the system level, identity privacy is protected, while homomorphic encryption is employed to encrypt sensing data. Data validity is verified through privacy-preserving comparison algorithms, and secure aggregation of valid data is achieved by leveraging homomorphic addition properties, ensuring the privacy and security of sensing data. Asheralieva et al. [40] designed a coding model based on Lagrange coded computing for a private, security-resilient distributed mobile edge computing system. By designating base stations as master nodes and edge nodes as worker nodes, the model achieves computation offloading, efficiently allocates tasks, and incentivizes worker nodes.
In addition, blockchain technology is employed in secure data aggregation, where it is particularly effective at addressing two issues that are more prominent in EI: the risk of edge-node collusion and the joint optimization of energy consumption. Wang et al. [41] proposed a blockchain-based approach that introduces sensitive task decomposition and task-receiver partitioning during the task assignment phase to defend against collusion attacks. Secret sharing ensures security during task decomposition, and a softmax-based reputation management algorithm is applied throughout the entire process.
In summary, privacy-preserving task assignment in EI must reconcile traditional MCS challenges with the resource heterogeneity and stringent real-time demands of the edge. While cryptographic approaches like homomorphic encryption ensure robust data confidentiality across layers, their computational cost often conflicts with device limitations. Conversely, secret sharing and data perturbation offer lighter-weight alternatives but introduce trade-offs in communication robustness or data utility, respectively. The optimal choice is thus highly context-dependent, dictated by the dominant EI constraints in a given deployment scenario.

4.2.2. User Recruitment and Incentives Stages

User recruitment is a crucial step in MCS, involving the attraction, motivation, and management of users. Because system performance depends on many users contributing quality data, ensuring active participation is essential [42]. Research on user recruitment and incentive mechanisms has long been a popular topic, and numerous schemes have been proposed. The privacy solutions adopted in these schemes can be categorized into cryptography, data perturbation, and confidence frameworks. It is worth noting that privacy protection schemes focusing on incentive mechanisms aim to protect not only sensing privacy but also bidding privacy. Unlike conventional MCS, users in EI may act as both data providers and edge nodes. Incentive mechanisms should therefore account for local resource contributions in addition to sensing quality.
Zhao et al. [4] proposed a privacy-preserving mobile crowdsensing framework based on additive secret sharing, enabling secure local aggregation without introducing noise. The scheme protects both sensing data and bidding privacy while supporting common aggregation operations, such as statistical moments, and significantly improves computational efficiency compared with traditional secure aggregation approaches. Under EI’s intermittently connected edges, however, share transmission may be disrupted, affecting aggregation completeness and quality.
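The core of additive secret sharing can be captured in a few lines. The sketch below is a generic illustration with an assumed modulus and three non-colluding servers, not the concrete scheme of [4]; variable names are invented for the example.

```python
import random

Q = 2 ** 61 - 1  # prime modulus; all arithmetic is over Z_Q

def share(value, n_servers, rng=random):
    """Split a value into n additive shares that sum to value mod Q.
    Any n-1 shares are uniformly random and reveal nothing."""
    shares = [rng.randrange(Q) for _ in range(n_servers - 1)]
    shares.append((value - sum(shares)) % Q)
    return shares

# Each worker sends one share of its bid to each of 3 non-colluding servers.
bids = [17, 250, 33, 99]
server_inbox = [[] for _ in range(3)]
for b in bids:
    for inbox, sh in zip(server_inbox, share(b, 3)):
        inbox.append(sh)

# Servers publish only their share totals; summing the totals reveals
# just the aggregate, never any individual bid.
totals = [sum(inbox) % Q for inbox in server_inbox]
print(sum(totals) % Q)  # -> 399
```

Note the noise-free property mentioned above: unlike DP-based perturbation, the reconstructed aggregate is exact. The fragility is also visible, however: if any single share transmission is lost, that worker's contribution cannot be recovered, matching the completeness concern under intermittent EI connectivity.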
Sun et al. [43] designed the first personalized privacy protection incentive mechanism for truth discovery in contract-based MCS systems. The method proposed by Feng et al. [19] aims to achieve both privacy protection and real-time rewards. It employs a replicated secret sharing technique to preserve privacy, supporting arithmetic operations such as multiplication, logarithm, division, and Euclidean distance on shared secrets, while also incorporating private truth discovery into consideration. Unlike Feng et al. [19], Peng et al. [12] focus more on truth discovery, utilizing secret sharing to ensure privacy protection. The weights generated through the truth discovery process are used as quantitative indicators of data quality, which are then applied to dynamically adjust users’ rewards, thereby constructing a data-quality-driven incentive mechanism. As EI network conditions vary, inter-server synchronization essential for accurate weight estimation may become less stable.
In the user recruitment phase, some studies have also introduced fog computing. Sun et al. [44] proposed a fog-computing-based sensing framework for vehicular mobile crowdsensing. By employing buses as fog nodes and integrating zero-knowledge verification, homomorphic encryption, partially blind signatures, and one-way hash chains, the framework separates user identity from data during data reporting, reward distribution, and reputation management.
Employing differential privacy techniques with added noise is also a common approach. Yu et al. [6] proposed a privacy-preserving incentive mechanism that integrates differential privacy with truth discovery and blockchain technology. By injecting noise into sensed data and aggregating participant weights in a weighted manner, the scheme protects sensing privacy while ensuring data quality. In addition, blockchain enables transparent and reliable aggregation and incentive distribution without relying on a trusted third party. Jin et al. [45] adopted a reverse auction mechanism to motivate user participation and compensate for both sensing costs and privacy leakage costs. They applied differential privacy with Laplace noise to protect user privacy and employed a weighted data aggregation method based on workers’ skill levels to enhance the accuracy of the results. In EI deployments where connectivity is intermittent, reward feedback may be delayed, affecting incentive responsiveness.
Another notable line of work designs differentially private incentive mechanisms around the exponential mechanism. Wang et al. [13] employ an exponential mechanism to achieve differential privacy during bidding, enabling users to locally encrypt tasks of interest via homomorphic encryption and perform data aggregation through local encrypted task clustering. Jiang et al. [46] proposed an incentive mechanism for uncertain tasks in MCS scenarios without real-time constraints. Their method employs the exponential mechanism to realize differential privacy, thereby protecting users’ bidding privacy and ensuring both the truthfulness and individual rationality of the incentive process.
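The exponential mechanism selects an outcome with probability proportional to exp(ε·u/(2Δu)), so high-utility winners are likely but no bid is revealed deterministically. The sketch below is a generic implementation applied to a toy winner-selection problem; the bids, utility function, and names are assumptions, not details of [13] or [46].

```python
import math, random

def exponential_mechanism(candidates, utility, epsilon, sensitivity, rng=random):
    """Sample a candidate with probability proportional to
    exp(epsilon * u(c) / (2 * sensitivity))."""
    weights = [math.exp(epsilon * utility(c) / (2 * sensitivity))
               for c in candidates]
    r = rng.random() * sum(weights)
    for c, w in zip(candidates, weights):
        r -= w
        if r <= 0:
            return c
    return candidates[-1]

# Toy winner selection among worker bids; a lower bid means higher utility.
bids = {"w1": 5.0, "w2": 3.0, "w3": 9.0}
pick = exponential_mechanism(list(bids), lambda w: -bids[w],
                             epsilon=1.0, sensitivity=1.0,
                             rng=random.Random(7))
print(pick)  # most likely "w2", the cheapest worker, but not guaranteed
```

The randomness is the privacy guarantee: observing a single winner leaks only a bounded amount of information about any individual bid, which is exactly what makes the approach suitable for protecting bidding privacy.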
Some studies also incorporate privacy externalities into MCS auction design. Zhang et al. [47] proposed an auction-based solution in which the platform, acting as the auctioneer, recruits workers. The approach addresses privacy issues through a differential privacy method based on the divisibility of the Laplace distribution and takes into account the privacy externalities that depend on the aggregate noise contributed by workers. Two mechanisms were designed for passive privacy and active privacy scenarios, respectively. Both mechanisms satisfy truthfulness, individual rationality, and computational efficiency, while minimizing data procurement costs and ensuring the required aggregate accuracy.
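The "divisibility of the Laplace distribution" exploited above rests on a standard fact: Laplace(0, b) noise can be decomposed across n workers, each adding Gamma(1/n, b) − Gamma(1/n, b), so that the aggregate noise is exactly Laplace even though each worker contributes only a small share. The sketch below checks this empirically with the stdlib `random.gammavariate`; it is an illustration of the distributional identity, not the mechanisms of [47].

```python
import random, statistics

def worker_noise(n_workers, b, rng):
    """One worker's noise share: Gamma(1/n, b) - Gamma(1/n, b).
    Summed over n workers this is distributed as Laplace(0, b)."""
    return (rng.gammavariate(1.0 / n_workers, b)
            - rng.gammavariate(1.0 / n_workers, b))

rng = random.Random(0)
n, b = 10, 1.0
aggregate_noise = [sum(worker_noise(n, b, rng) for _ in range(n))
                   for _ in range(20_000)]
# Laplace(0, b) has mean 0 and variance 2*b^2 = 2.0;
# the empirical moments of the summed shares should match.
print(statistics.mean(aggregate_noise), statistics.pvariance(aggregate_noise))
```

This is why privacy externalities arise: the accuracy of the aggregate depends on the total noise contributed by all workers, so each worker's noise share affects every requester's utility.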
These methods provide valuable insights into how privacy protection can be integrated with efficient data processing in EI-driven MCS. They demonstrate that privacy-preserving techniques such as local data processing, dynamic incentive mechanisms, and blockchain-assisted distributed management can enhance data quality and processing speed while meeting the privacy requirements of edge nodes. At the same time, user recruitment schemes offer strong guarantees for bid and sensing privacy, although under EI conditions their performance is influenced by synchronization reliability and communication overhead, which underscores the need for resilient and lightweight privacy-preserving mechanisms.

4.2.3. The Overall System

The overall system refers to privacy protection schemes designed at the system level without emphasizing whether they are applied specifically during the task assignment or user recruitment phase. In EI, balancing model personalization with communication overhead becomes more critical than in traditional MCS. Wei et al. [48] proposed a personalized federated learning framework based on differential privacy and combined it with meta-learning mechanisms. This framework provides guidance for balancing privacy budget allocation and model convergence performance through theoretical analysis.
In distributed privacy protection scenarios, Wei et al. [49] focused on distributed differential privacy by formally defining an aggregation model and designing both Gaussian and Laplace aggregation protocols. They conducted a comparative study of privacy protection methods under the shuffle model and the aggregation model, demonstrating that the aggregation model not only provides a privacy amplification effect but also significantly outperforms the shuffle model in terms of accuracy, functional support, and practicality. Their results show that the aggregation model is more suitable for real-world distributed privacy protection. Luo et al. [50] proposed a distributed differential privacy matrix factorization method for implicit data. It protects gradient privacy through user local gradient clipping and Gaussian noise injection, achieves secure data aggregation by splitting gradients via additive secret sharing, and does not require a trusted recommender.
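The clip-then-noise step used in such gradient-perturbation schemes can be sketched generically: bound each user's gradient to an L2 norm of at most C, then add Gaussian noise scaled to C. Function and parameter names below are illustrative, not taken from [50].

```python
import math, random

def dp_gradient(grad, clip_norm, sigma, rng):
    """Clip the per-user gradient to L2 norm <= clip_norm, then add
    Gaussian noise with standard deviation sigma * clip_norm per coordinate."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale + rng.gauss(0.0, sigma * clip_norm) for g in grad]

rng = random.Random(1)
noisy = dp_gradient([3.0, 4.0], clip_norm=1.0, sigma=0.5, rng=rng)
# Before noise, the gradient [3, 4] (norm 5) is clipped to [0.6, 0.8] (norm 1),
# so no single user's update can dominate the aggregate.
print(noisy)
```

Clipping is what makes the sensitivity of the aggregate bounded, which in turn makes the Gaussian noise calibration meaningful; in [50], the noisy gradients are additionally split via additive secret sharing before aggregation.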
Unlike simple data collectors, EI devices perform local computation, creating new challenges in coordinating training and privacy protection. Shamsabadi et al. [51] achieved privacy protection without exchanging plaintext data parameters through methods such as training single-class reconstruction adversarial networks locally on user devices, sharing parameters via additive secret sharing with non-collusive service providers and regulators, and employing 2PC protocol encryption during prediction-phase computations. To address the problem of utility loss in traditional local differential privacy caused by the lack of prior knowledge utilization during data aggregation, Jiang et al. [52] focused on the local data collection phase and proposed a context-aware local information privacy framework. By applying randomized response, random sampling, and additive noise in conjunction with prior knowledge, their approach achieves privacy protection while optimizing the trade-off between utility and privacy.
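Randomized response, one of the local perturbation primitives mentioned above, is easy to state concretely: each user reports the true bit with probability p = e^ε/(1 + e^ε) and flips it otherwise, and the collector debiases the aggregate. This is the classic textbook mechanism, not the context-aware framework of [52]; all numbers are demo assumptions.

```python
import math, random

def randomized_response(bit, epsilon, rng):
    """epsilon-LDP randomized response over a single bit."""
    p = math.exp(epsilon) / (1 + math.exp(epsilon))
    return bit if rng.random() < p else 1 - bit

def debias(reported_mean, epsilon):
    """Unbiased estimate of the true proportion from perturbed reports:
    E[report] = q*(2p - 1) + (1 - p), solved for q."""
    p = math.exp(epsilon) / (1 + math.exp(epsilon))
    return (reported_mean - (1 - p)) / (2 * p - 1)

rng = random.Random(0)
truth = [1] * 3000 + [0] * 7000            # 30% of users hold the attribute
reports = [randomized_response(b, 1.0, rng) for b in truth]
est = debias(sum(reports) / len(reports), 1.0)
print(est)  # close to the true proportion 0.3
```

The server never sees raw bits, yet the population statistic survives; the context-aware framework in [52] improves on this baseline by using prior knowledge to shrink the utility loss of the perturbation.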
Edge-cloud computing is a key carrier of edge intelligence; to address its privacy protection issues, Zeng et al. [17] proposed a distributed DNN computing system that ensures data security by restricting data circulation to user-controlled heterogeneous edge device clusters rather than uploading it to remote clouds. In collaborative DNN inference, the system aggregates feature image segments generated by multiple devices onto a single device during the classification stage to reduce communication overhead. Tchaye-Kondi et al. [53], addressing the privacy and latency issues in centralized neural network training within edge-cloud environments, constructed edge-side feature extractors through an inductive learning mechanism and designed a novel local differential privacy algorithm that combines random unit responses with Laplace differential privacy to add noise and further enhance privacy. The proposed framework preserves privacy while achieving accuracy comparable to or even higher than existing approaches.
To balance user needs and privacy protection, Li et al. [54] proposed an adaptive scheme based on Bayesian classification and genetic algorithms that deeply integrates application scenarios, user requirements, and privacy policies. This approach builds a metric linking privacy mechanisms with user demands and uses an adaptive classifier to identify QoS preferences. It automatically selects appropriate privacy protection mechanisms and parameters rather than fixed privacy levels. Ren et al. [55] developed a game-theoretic framework for privacy protection to balance user location privacy and data aggregation accuracy. This framework introduces two strategy-learning algorithms, LEFS and LSRE, which are respectively designed for fixed user satisfaction constraints and adaptive satisfaction adjustment. Through iterative user strategy learning, the framework achieves equilibrium convergence between privacy protection and aggregation accuracy.
Several studies on privacy protection have employed blockchain technology. Peng et al. [56] proposed a blockchain-based MCS scheme (BPPC), which integrates multiple blockchain technologies, K-anonymity, and cryptographic algorithms to establish a decentralized system. This system addresses privacy leakage issues during data upload and reward distribution stages, thereby achieving user privacy protection and precise reward distribution. Wang et al. [24] introduced a triple real-time trajectory privacy protection mechanism based on edge computing and blockchain. It effectively protects the trajectory privacy of task participants by using local differential privacy, spatiotemporal dynamic pseudonym mechanisms, and blockchain technology.
The following works all employ cryptographic methods for privacy protection while introducing innovations in data aggregation and addressing specific challenges in EI scenarios. Zhang et al. [5] proposed a privacy-enhanced aggregation method for federated learning that utilizes a dual-trapdoor cryptosystem to protect both global and local models. This model aggregation approach aligns with the core requirement for lightweight processing on edge devices in EI scenarios; despite its robustness, however, the required cryptographic operations may impose considerable load on low-capability EI edge devices. Zhao et al. [57] introduced a multi-dimensional improved encrypted medical data aggregation scheme (VMEMDA), which uses time and space as dimensions to perform fundamental data aggregation operations (sum, mean, variance). Gope et al. [18] proposed a lightweight spatial data aggregation scheme that employs cryptographic primitives such as hash functions and XOR operations, achieving data privacy protection and secure, efficient aggregation through masking techniques and a temporary identity mechanism. Zhang et al. [58], focusing on smart grid scenarios, enhanced a public-key cryptosystem into a four-prime dual-message encryption mode and combined it with super-increasing sequences for multi-dimensional encrypted data aggregation. They employed Shamir’s secret sharing to achieve transmission fault tolerance and identity-based aggregate signatures to ensure data integrity. This addresses an issue more prominent in EI than in traditional MCS: data transmission interruptions caused by edge device failures. Its reliance on stable transmission paths, however, limits deployment in EI settings with volatile topologies. Palazzo et al. [59] proposed a publicly verifiable privacy-preserving data aggregation protocol that tolerates arbitrary collusion between malicious aggregators and malicious users. By integrating privacy protection methods such as Shamir secret sharing and bilinear pairings with a data-flow aggregation process involving user signature generation, aggregator signature aggregation, and constant-time verification, they ensured data confidentiality, integrity, and authenticity.
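The masking idea behind several of the lightweight schemes above can be sketched with pairwise additive masks: each pair of users agrees on a random mask that one adds and the other subtracts, so the masks cancel in the aggregate while each individual report looks random. This is a generic secure-sum illustration in the spirit of masked aggregation, not the concrete protocol of [18]; names and the toy values are assumptions.

```python
import random

def masked_reports(values, rng):
    """Pairwise additive masks: user i adds +m[i][j] for each j > i and
    subtracts m[j][i] for each j < i. All masks cancel in the sum."""
    n = len(values)
    masks = {(i, j): rng.randrange(1 << 32)
             for i in range(n) for j in range(i + 1, n)}
    reports = []
    for i, v in enumerate(values):
        r = v
        r += sum(masks[(i, j)] for j in range(i + 1, n))
        r -= sum(masks[(j, i)] for j in range(i))
        reports.append(r)
    return reports

vals = [5, 11, 2, 7]
reports = masked_reports(vals, random.Random(3))
print(sum(reports))  # -> 25: masks cancel, individual values stay hidden
```

Only cheap additions are needed at the edge, which is why mask-based designs suit constrained devices; the cost is resilience, since a dropped report leaves uncancelled masks in the sum, a failure mode that threshold secret sharing (as in [58]) is designed to tolerate.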
Furthermore, homomorphic encryption is also a commonly employed method. Agate et al. [14] and Wu et al. [60] applied homomorphic encryption in clustering and federated learning contexts, respectively. Zhao et al. [61] address real-time incentivization, data reliability assessment, and privacy protection in MCS through homomorphic encryption and secure computation protocols SecLog, SecMulDiv, and SecDist, employing a two-layer truth discovery model for dynamic weight data aggregation. Rezaeibagha et al. [62] proposed an efficient and secure scheme based on authenticated additive homomorphic encryption and cryptographic accumulators, achieving encrypted aggregation of medical sensor data in the Internet of Things via a binary tree structure. Zheng et al. [63] introduced the EPSet scheme, which employs homomorphic encryption-based privacy-preserving filtering and refinement protocols (PPR), combined with a pivot k-d tree index, to efficiently execute privacy-preserving set similarity range queries on encrypted data.
System-level schemes provide strong privacy and verifiability but often rely on stable networks or moderate computational resources, which limits their applicability in the heterogeneous, dynamic, and resource-constrained environments typical of EI. As shown in Table 2, the privacy schemes and aggregation mechanisms used in existing methods offer valuable guidance for EI-driven MCS, as their privacy-preserving clustering and model aggregation techniques are closely aligned with the principles of federated learning and EI. Since both paradigms require distributed data processing and iterative model updating, their successful deployment in federated learning highlights their potential effectiveness in EI. By enabling data aggregation and model updates at edge nodes, these techniques enhance both processing efficiency and privacy protection, providing a robust foundation for EI-driven MCS in large-scale distributed settings.

4.3. Data Collection and Transmission: Communication Part and Sensing Part

4.3.1. Common Transmission

Common transmission prioritizes fast, efficient data transfer to meet real-time requirements. Under this transmission paradigm, however, private information within the data becomes susceptible to tracking and monitoring. Its privacy protection techniques rely on homomorphic encryption, secret sharing, combined security mechanisms, and scenario-specific technologies.
Homomorphic encryption is a highly prevalent technique. Zhang et al. [5] ensured data confidentiality through TTH homomorphic encryption, used the HKE protocol to generate blind factors against collusion attacks, and applied data packing techniques to improve transmission efficiency. Their method divides the secure transmission of encrypted data into three phases: model upload, update, and result generation. By incorporating key distribution and collaborative decryption, they established a privacy-enhanced transmission system that simultaneously protects local models, sample counts, and the global model. Zheng et al. [63] utilized secure channels and SHE homomorphic encryption to enable the transmission of keys, pivot parameters, encrypted datasets, encrypted trapdoors, and query results among data owners, query users, and cloud servers. All transmitted data are either in encrypted form or protected via secure channels. The system operates under the assumptions that neither the two servers nor the data owners engage in collusion with each other. The core of the data flow transmission in Wu et al. [60] involves the exchange of encrypted model parameters. The cloud distributes homomorphically encrypted parameters of an initialized GCAE model to edge clients. After training on local data, clients encrypt and upload their updated local model parameters to the cloud. The cloud then aggregates these parameters to update the global model, which is again encrypted and distributed to clients. By leveraging the parameter sharing and dimensionality reduction features of GCAE, the number of parameters is reduced, lowering transmission overhead.
As a type of homomorphic encryption, Paillier is also widely applied. Zhao et al. [57] employ a modified Paillier cryptosystem to ensure data confidentiality, utilize homomorphic hash functions to safeguard encrypted data integrity, and leverage unpredictable random sequences with provable data ownership mechanisms to guarantee aggregated data correctness. This approach resists eavesdropping, tampering, and replay attacks while preventing privacy leakage to the MCS platform. The data transmission process in Zhao et al. [61] primarily involves three entities: the workers, the SP, and the CSP. Workers collect sensing data and encrypt it using the requester’s public key before sending it to the SP. The SP then applies random numbers to mask intermediate results, such as data weights, and forwards the masked encryption data to the CSP. The CSP decrypts the masked data, performs the necessary computations, and returns the results to the SP. Throughout this process, core data are transmitted in encrypted form. Privacy of the workers’ sensing data and data weights, as well as the requester’s estimated ground truth, is ensured through Paillier encryption and random masking.
The following data transmission schemes all employ secret sharing technology. Feng et al. [19] designed a fragmented transmission scheme based on replicated secret sharing, where workers divide their sensed data into secret shares and transmit them to the SP and two CSPs. The SP and CSPs then collaborate through share exchange to complete the computation. During the entire process, only data fragments are transmitted, and no single participant can reconstruct the complete information, which supports real-time incentive computation under privacy protection. Peng et al. [12] adopted secret shares as the transmission carrier. In the three key stages of data uploading, inter-server computational interaction, and truth recovery, the transmission of original data is replaced by the transmission of divided shares. In addition, SSL and SSH secure channels are used to ensure transmission security. Through the combination of share division and secure channel communication, their scheme integrates privacy protection and data transmission functions. Yan et al. [65] employed secure channels as the basic carrier and applied additive secret sharing to divide the original data into shares that cannot be individually decrypted. They further used encryption techniques to handle intermediate data exchanged between platforms. Transmission uses only secret shares or encrypted values, enabling aggregation while protecting worker data and requester results, thereby preventing privacy leakage during transmission.
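The share-division step common to these schemes can be sketched with plain additive secret sharing over a prime field. This is a generic illustration under assumed parameters, not the replicated-sharing protocol of [19] or the multi-stage protocols of [12,65]: each worker splits a reading into per-server shares, servers sum their own columns locally, and only the recombined column sums reveal the aggregate.

```python
# Additive secret sharing sketch: no single server's share column reveals any
# individual reading, yet recombining the servers' local sums yields the total.
import random

P = 2**61 - 1  # prime modulus; any prime larger than the aggregate works

def share(value: int, n_parties: int = 3) -> list[int]:
    """Split `value` into n additive shares that sum to it mod P."""
    shares = [random.randrange(P) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

readings = [12, 7, 30]
per_worker_shares = [share(m) for m in readings]

# Server j sums the j-th share from every worker; each column alone is
# uniformly random and carries no information about any single reading.
server_sums = [sum(col) % P for col in zip(*per_worker_shares)]

aggregate = sum(server_sums) % P     # recombination reveals only the total
assert aggregate == sum(readings)    # 49, without any raw value in transit
```

The non-collusion assumption these papers state maps directly onto this sketch: any strict subset of the columns is statistically independent of the inputs, so privacy fails only if all servers pool their shares.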
In addition, several transmission methods employ more sophisticated techniques. Yan et al. [22] relied on a fog computing-based hierarchical transmission architecture and adopted a combination of pseudo-ID anonymization, hash-based authentication, digital signatures, finite-field encryption, and region-based transmission. This approach simultaneously defends against internal and external attacks, safeguarding the privacy and security of user data, decisions, and requester task results. Due to heterogeneous latency across EI fog regions, aggregation may exhibit temporal misalignment without additional coordination. Jiang et al. [52] considered data transmission in an untrusted server environment, where users do not directly transmit raw data. They locally perturb the original data under LIP privacy constraints using mechanisms such as randomized response, random sampling, or additive noise. The perturbed non-original data are then transmitted to the server. Song et al. [20] designed a transmission scheme for vehicular environments based on KGC-enabled secure key distribution. Encrypted interest indices and location data are transmitted between vehicle workers and the crowdsourcing server, while all sensitive data exchanged between task requesters and the crowdsourcing server are transmitted in encrypted form. The KGC ensures secure key delivery through protected communication, and the crowdsourcing server processes and matches only encrypted data and re-encryption keys, thereby preventing any leakage of raw sensitive information during transmission.
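Among the perturbation mechanisms mentioned for [52], randomized response is the simplest to make concrete. The sketch below is a generic binary randomized-response example with assumed parameters (truth probability `p = 0.75`, a synthetic population), not the LIP-constrained mechanisms of the paper: users flip their bit with a fixed probability before upload, and the server de-biases the observed frequency.

```python
# Randomized response sketch: each user reports truthfully with probability p
# and flips the bit otherwise; the server recovers an unbiased frequency
# estimate from the perturbed, non-original reports it receives.
import random

def perturb(bit: int, p: float) -> int:
    """Report truthfully with probability p, else flip the bit."""
    return bit if random.random() < p else 1 - bit

def estimate_mean(reports: list[int], p: float) -> float:
    """De-bias the observed frequency: E[obs] = f*(2p-1) + (1-p)."""
    observed = sum(reports) / len(reports)
    return (observed - (1 - p)) / (2 * p - 1)

random.seed(0)
true_bits = [1] * 6000 + [0] * 4000          # true frequency = 0.6
reports = [perturb(b, p=0.75) for b in true_bits]
est = estimate_mean(reports, p=0.75)
assert abs(est - 0.6) < 0.05                 # accurate in aggregate only
```

Each individual report is deniable (a `1` may be a flipped `0`), which is why the server can only learn population statistics, matching the untrusted-server setting described above.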
Applying these schemes in EI settings faces practical constraints: homomorphic encryption can add substantial latency on resource-limited edge nodes, node dynamism can weaken secret-sharing collaboration, and local perturbation may reduce accuracy under constrained computation. Nevertheless, EI’s distributed architecture can still enable efficient, privacy-preserving transmission for secure real-time processing.

4.3.2. Anonymous Connectivity

Anonymous connectivity protects privacy by making transmitted data difficult to trace to specific users. It effectively conceals users’ real identities and locations, thereby safeguarding their privacy. However, since data must pass through multiple relay or mix nodes, the transmission delay is generally higher than that of conventional transmission methods. Wang et al.’s [13] BIP mechanism utilizes anonymous linking technology to enable random, anonymous transmission of user bids to agents, who then forward them to the platform. The TBIP mechanism, supported by encryption technology, facilitates specific data transfers between the platform and users, users and third parties, and the platform and third parties. Encryption and third-party mediation ensure the platform cannot access users’ private information. However, anonymous multi-hop forwarding can significantly increase latency under EI’s strict real-time constraints.
Sun et al. [44] proposed a fog computing-based distributed anonymous encrypted transmission framework that covers the entire process of sensed data reporting, group aggregation, reputation querying, and reward redemption. The architecture centers on privacy protection, integrating cryptographic methods with fog-node collaboration. This approach not only prevents identity and location leakage during data transmission but also reduces latency and communication overhead through a distributed design, thereby accommodating the high-mobility and large-scale data transmission requirements of vehicular networks. In terms of data transmission, Yu et al. [6] leverage blockchain-assisted storage and addressing mechanisms to enhance data integrity and participant anonymity with minimal impact on aggregation accuracy.
Some data transmission schemes do not rely on trusted third parties. In Peng et al. [37], data transmission in the mobile crowdsourcing system is implemented through an anonymous connectivity channel. The task requester transmits task requirements to the platform, while users first desensitize their sensitive location and bid information via the anonymous connectivity channel. Afterward, they apply local differential privacy-based obfuscation to the data and upload the obfuscated location and bid information to the platform through the same anonymous channel. The platform then uses the received data for subsequent task assignment.
In the communication layer, Zhao et al. [4] show that secret-sharing-based transmission can support privacy-preserving data aggregation and incentive allocation without exposing raw sensing data or bids. Agate et al. [14] improved the K-Means algorithm to perform cluster grouping during encrypted data transmission. All data are exchanged through secure and authenticated TLS communication channels, and users transmit only encrypted personal data to the SP. The SP and BH operate using blinded encrypted data, each accessing only what its role requires. Consequently, no party can infer users’ raw information, cluster membership, or outlier identities, achieving a balance between privacy protection and data computation utility.
Anonymous connectivity has been applied across various scenarios. In the smart grid context, Gope et al. [18] employed lightweight cryptographic primitives to replace real identities with pseudo-identities and one-time temporary identities. They combined hash values to verify data integrity, symmetric encryption to protect sensitive information, and timestamps to prevent replay attacks. This approach hides real identities and electricity usage while ensuring secure, reliable transmission. In the communication layer, Cheng et al. [15] demonstrate that confidence-based reputation verification can support anonymous connectivity by decoupling identity verification from direct vehicle-to-server interactions.
In EI-driven MCS, anonymous connectivity is essential for protecting user privacy, yet traditional multi-relay forwarding introduces latency that conflicts with EI’s low-latency demands. The limited computing power and high dynamism of edge devices further challenge the stability of anonymous connections. By leveraging EI to build edge relay networks and enabling lightweight edge-side authentication, anonymity and task responsiveness can be better balanced. Although anonymous connectivity offers strong privacy guarantees, its performance can still be hindered by latency in highly mobile and dynamic EI topologies, highlighting the need for adaptive relay control or lightweight anonymity mechanisms.

4.3.3. Comparison of Methods

EI-driven MCS depends on edge nodes for data processing and decision-making. The diversity and decentralization of edge nodes demand enhanced real-time connectivity, with data no longer centrally uploaded to the cloud but instead dispersed and processed across decentralized nodes. Thus, privacy solutions must adapt to EI’s low-latency and high-concurrency demands to maintain a privacy–efficiency balance.
After completing privacy protection and data aggregation, the next step is data release. We focus on whether the algorithmic methods clearly describe the data release process, as shown in Table 3.
Focusing on the types of data objects for privacy protection allows for more effective design and implementation of targeted privacy protection mechanisms, thereby improving the overall protection effectiveness. Privacy-protected data objects include sensitive information such as location, identity, perception, bidding, and reputation values. Precisely protecting specific data types improves algorithm specificity, avoids overprotection, and allocates resources efficiently. In addition, precise protection can decrease computational and communication overhead, maintaining system efficiency while fulfilling privacy needs and making privacy protection techniques more efficient and versatile in varied contexts.
The distribution of sensors impacts data gathering and accuracy. Sensor distribution domains include general, medical, automotive, and industrial networks. Considering sensor distribution ensures precise, broad data collection suited to application needs [10].

4.3.4. Key Attributes for Expanding to EI-Driven MCS Systems

Table 4 summarizes the four main properties of EI in existing methods: Decentralized Learning, Real-Time Processing, Collaborative Learning, and Model Adaptability. Decentralized Learning distributes data and computation across edge nodes, supporting EI-local processing [6,19]. Real-Time Processing requires low latency and highlights privacy challenges under resource constraints [19]. Collaborative Learning involves sharing models or information among edge nodes, fostering a smarter network while highlighting the need for efficient resource management and synchronization [12,14]. Model Adaptability reflects the ability to adjust models dynamically based on edge device capabilities and environmental changes, essential for cross-layer privacy protection [6,57]. Summarizing these features helps evaluate the applicability and flexibility of privacy-preserving data aggregation methods in an EI-driven context.
To avoid ambiguity, the classification in Table 4 follows explicit criteria rather than subjective judgment. Specifically, an attribute is labeled as “Fully satisfied” if the corresponding work explicitly incorporates the attribute into its system architecture or algorithm design and provides experimental validation or concrete implementation details. An attribute is labeled as “Partially satisfied” if it is implicitly supported or partially addressed (e.g., discussed conceptually or enabled under specific assumptions) but not fully implemented or evaluated. “Not satisfied” indicates that the attribute is neither explicitly designed for nor evaluated in the corresponding work. It should be noted that this classification aims to provide a qualitative comparative overview of existing methods under EI requirements, rather than a quantitative performance ranking.

5. Performance Comparison of Data Aggregation Schemes Under Different Privacy Schemes

This section evaluates the performance of representative privacy-preserving aggregation mechanisms in an EI-driven MCS environment. Our goal is to examine how different classes of privacy techniques behave under realistic EI constraints, including heterogeneous device capabilities, fluctuating network conditions, and real-time processing demands. It should be noted that the objective of this experimental study is not to fully emulate real-world EI-driven MCS workloads, such as large-scale spatiotemporal trajectory sensing or high-frequency sensor streams. Instead, the experiments are designed as an illustrative and controlled comparison to examine how different classes of privacy-preserving mechanisms influence aggregation accuracy and computational efficiency under EI-inspired constraints. Therefore, the results should be interpreted as mechanism-level insights rather than as performance benchmarks for specific EI applications.

5.1. Experimental Setup and Dataset

Simulation Framework and Environment: The experiments were conducted using a lightweight Python-based discrete-event simulation framework, a commonly adopted tool for modeling distributed sensing and edge–cloud interactions. The simulated environment consists of 50 nodes organized into an EI-style multi-tier topology. To reflect realistic heterogeneity, nodes are divided into three categories:
  • Low-end edge devices resembling IoT-class sensors with limited processing power.
  • Mobile devices with moderate computational capability.
  • Fog/edge servers providing relatively high processing capacity.
Each node type is assigned different computation speeds and memory budgets to emulate typical EI resource diversity.
Network Conditions: To model EI’s dynamic and heterogeneous communication environment, network latency is sampled from a time-varying distribution ranging from 5 to 40 ms, while bandwidth varies between 5 and 20 Mbps, and a small packet loss rate of 0.5–2% is incorporated to reflect unstable wireless connectivity near the edge. These parameters approximate the common characteristics of mobile and edge networks observed in practical EI deployments.
Device Heterogeneity Modeling: The computational cost of each operation is scaled according to node type:
  • Perturbation operations (DP) incur minimal overhead on all devices.
  • Secure multi-party computation (MPC)-style aggregation requires negligible local computation but relies on multi-round communication, making it sensitive to unstable links.
  • Anonymization adds an additional transformation step before transmission, increasing communication but not computation.
This modeling approach captures the contrasting impacts of privacy mechanisms under heterogeneous EI capabilities.
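The cost model described above can be sketched as a simple scaling scheme. All concrete numbers below (speed factors, baseline costs, round counts) are illustrative assumptions consistent with the stated 5–40 ms latency range, not the simulator's exact parameters: local operations scale with a node-type speed factor, while MPC-style aggregation pays per-round communication latency rather than local computation.

```python
# Illustrative per-node cost model: local ops are scaled by node capability,
# MPC cost is dominated by sampled per-round network latency (5-40 ms range).
import random

SPEED = {"iot": 0.25, "mobile": 1.0, "edge_server": 4.0}   # relative capability
BASE_COST_MS = {"dp_perturb": 0.2, "anonymize": 0.5}       # local op baselines (assumed)

def local_cost_ms(op: str, node_type: str) -> float:
    """Cost of a local operation, slower on weaker devices."""
    return BASE_COST_MS[op] / SPEED[node_type]

def mpc_round_cost_ms(rounds: int, rng: random.Random) -> float:
    """MPC interaction cost: one latency draw per communication round."""
    return sum(rng.uniform(5, 40) for _ in range(rounds))

rng = random.Random(42)
# DP perturbation stays cheap even on IoT-class nodes...
assert local_cost_ms("dp_perturb", "iot") < 1.0
# ...while a 3-round MPC interaction already costs tens of milliseconds,
# which is why MPC is sensitive to unstable links in this model.
assert mpc_round_cost_ms(3, rng) > 15.0
```

This separation — computation scaled per device, communication sampled per round — is what lets the simulation expose the contrasting bottlenecks of perturbation-based and MPC-based mechanisms.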
Dataset and Task Description: The COMPAS dataset is used to simulate non-image MCS attribute aggregation. Although originally intended for classification tasks, its structured categorical and numerical attributes resemble common user-provided data in MCS environments, such as demographic information, behavioral indicators, or risk-level scores. This makes COMPAS a suitable proxy for evaluating how different privacy configurations affect the accuracy of aggregated results.

5.2. Experimental Results and Analysis

This section evaluates how different combinations of privacy-preserving techniques influence the accuracy and efficiency of data aggregation in an EI-driven MCS environment. The experiment includes three categories of privacy mechanisms: secure multiparty computation (MPC) within cryptographic techniques, differential privacy (DP) within data perturbation, and anonymization techniques. Based on these mechanisms, several combined privacy configurations are constructed and tested. The evaluated configurations include Only DP, Only MPC, DP combined with MPC, DP combined with anonymization, MPC combined with anonymization, and DP combined with both MPC and anonymization. These configurations allow us to observe how privacy mechanisms reinforce or offset each other when applied jointly.
The evaluation metrics include the error rate of the aggregated data and the running time of the aggregation process. These two metrics reflect core requirements of EI-driven MCS. EI systems must manage real-time constraints while preserving data quality. Therefore, both the accuracy and the computational cost of privacy mechanisms are central to the system’s practical performance.
Figure 2 presents the average aggregation error under different privacy budgets for all examined privacy configurations. The results show that the error rate is markedly higher when using only DP. This occurs because DP introduces random noise directly at the data source, and this noise accumulates through the aggregation process. When MPC is applied together with DP, the error rate is significantly reduced. MPC mitigates the effect of noise because distributed computation and secret sharing enable participants to compute the result without exposing raw data, and the accuracy of MPC-based computation helps the aggregated result remain closer to the true value. When anonymization is added to DP or MPC, the error rate increases slightly. This increase remains within an acceptable range, indicating that anonymization provides additional privacy with a moderate accuracy cost.
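The accuracy gap between DP-only and MPC-assisted aggregation can be reproduced in miniature. The toy comparison below uses assumed values (a per-user Laplace mechanism with ε = 0.5, sensitivity 1, and 1000 synthetic readings), not the paper's experimental configuration: source-level noise survives into the aggregate, whereas a secret-sharing-style exact sum carries none.

```python
# Toy illustration of the Figure 2 trend: per-user Laplace noise (DP-only)
# leaves residual error in the aggregate; an MPC-style exact sum does not.
import math
import random

rng = random.Random(7)
values = [rng.uniform(0, 1) for _ in range(1000)]   # users' normalized readings
true_mean = sum(values) / len(values)

def laplace_noise(scale: float) -> float:
    """Sample Laplace(0, scale) via inverse-CDF transform."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

epsilon = 0.5                        # per-user privacy budget; sensitivity = 1
dp_reports = [v + laplace_noise(1 / epsilon) for v in values]
dp_mean = sum(dp_reports) / len(dp_reports)

# Secret-sharing-based aggregation recombines to the exact total: no noise
# enters the pipeline, so the aggregate equals the true mean.
mpc_mean = true_mean

dp_err = abs(dp_mean - true_mean)
mpc_err = abs(mpc_mean - true_mean)
assert mpc_err == 0.0
assert dp_err > mpc_err              # DP-only aggregation carries residual noise
```

Averaging shrinks the DP error at rate O(1/√n), which matches the observation that combining DP with MPC recovers most of the accuracy while keeping source-level privacy.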
Figure 3 reports the execution time for data aggregation under the same set of privacy configurations. The results show that DP alone requires the least running time, because the perturbation step is performed locally and involves minimal computation. When MPC is introduced, the running time increases significantly. MPC requires encryption and decryption operations as well as multi-party interaction protocols, which increases computational complexity and communication overhead. Adding anonymization to either DP or MPC results in only a small change in execution time. This indicates that anonymization introduces limited computational load and does not significantly affect the overall aggregation efficiency.
In summary, the experimental results demonstrate a clear trade-off between privacy strength, aggregation accuracy, and running time. DP alone offers efficiency but produces higher error. MPC preserves accuracy more effectively but introduces considerable computational and communication cost. Anonymization provides an additional layer of privacy with relatively small influence on performance. The combination of DP and MPC provides the most balanced outcome, achieving substantial accuracy improvements compared with DP alone while maintaining acceptable computational overhead. These observations align with the broader analytical discussion in Section 4.2 and Section 4.3 and highlight the importance of selecting privacy mechanisms that match the resource and communication conditions of EI-driven MCS systems.

6. Future Research Directions

Future research on privacy-preserving data aggregation in EI-driven MCS should move beyond general adaptations of traditional MCS solutions and be grounded in the specific constraints and challenges introduced by edge intelligence. Based on the gaps identified in the reviewed literature, several concrete research directions emerge.
First, although adaptive privacy techniques have been explored in existing MCS studies, most adaptive mechanisms primarily focus on adjusting privacy parameters (e.g., noise magnitude or encryption strength) under relatively static system assumptions. In EI-driven MCS, however, edge nodes exhibit strong heterogeneity in computation capability, energy availability, and network conditions. This raises an open research question of how privacy-preserving aggregation mechanisms can dynamically adapt not only privacy budgets but also aggregation strategies and computational complexity in response to real-time variations in edge resources while still satisfying latency and accuracy requirements [3,14,57].
Second, current user-centric privacy solutions mainly empower users to configure privacy preferences at a coarse level, often without considering how these preferences interact with collaborative learning and aggregation processes at the edge. In EI-driven MCS, users may simultaneously act as data providers and contributors to edge intelligence, making privacy control more complex. Future research should therefore investigate how user-centric privacy management can be systematically integrated with decentralized aggregation and learning mechanisms, enabling fine-grained and context-aware privacy control without significantly increasing system overhead [13].
Third, multi-dimensional privacy protection techniques have been proposed to address different privacy targets, such as location, identity, and sensing data. However, most existing approaches treat these dimensions independently and are primarily designed for single-layer MCS architectures. In EI-driven MCS, privacy risks often propagate across sensing, communication, and learning layers due to cross-layer optimization and collaboration. An important research challenge is how to design lightweight, cross-layer multi-dimensional privacy mechanisms that explicitly model these interdependencies and prevent privacy leakage amplification in multi-tier EI architectures [57].
In addition, many privacy-preserving data aggregation schemes in the literature are developed within isolated technical domains, such as IoT or cloud computing, and are not directly applicable to EI-driven MCS without modification. While these technologies offer valuable building blocks, future work should explore how privacy mechanisms from IoT, cloud computing, and related fields can be cohesively integrated into EI systems. In particular, there is a need for unified privacy frameworks that can coordinate privacy protection across edge and cloud layers while maintaining scalability, robustness, and real-time performance [6,57].
Overall, advancing privacy-preserving data aggregation in EI-driven MCS requires a shift from generic solution reuse toward EI-aware, problem-driven design. Future studies should place greater emphasis on how effectively proposed mechanisms address core EI challenges, including resource heterogeneity, dynamic network topology, and real-time constraints, as well as where their limitations remain. Such a perspective is essential for developing practical, scalable, and resilient EI-driven MCS systems.

Author Contributions

Conceptualization, X.L. and S.C.; methodology, X.L., S.C. and Z.X.; software, S.C. and Z.X.; validation, S.C. and Z.X.; formal analysis, S.C. and Z.X.; investigation, X.L., S.C. and Z.X.; data curation, S.C. and Z.X.; writing—original draft preparation, S.C.; writing—review and editing, X.L., S.C. and Z.X.; visualization, S.C. and Z.X.; supervision, X.L. and S.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

This study is primarily a literature review; therefore, no new data were generated or analyzed for the review portion. All data generated from the original experiments conducted in this study are included in the published article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Capponi, A.; Fiandrino, C.; Kantarci, B.; Foschini, L.; Kliazovich, D.; Bouvry, P. A survey on mobile crowdsensing systems: Challenges, solutions, and opportunities. IEEE Commun. Surv. Tutor. 2019, 21, 2419–2465. [Google Scholar] [CrossRef]
  2. Gong, T.; Zhu, L.; Yu, F.R.; Tang, T. Edge intelligence in intelligent transportation systems: A survey. IEEE Trans. Intell. Transp. Syst. 2023, 24, 8919–8944. [Google Scholar] [CrossRef]
  3. Xu, D.; Li, T.; Li, Y.; Su, X.; Tarkoma, S.; Jiang, T.; Crowcroft, J.; Hui, P. Edge intelligence: Empowering intelligence to the edge of network. Proc. IEEE 2021, 109, 1778–1837. [Google Scholar] [CrossRef]
  4. Zhao, B.; Li, X.; Liu, X.; Pei, Q.; Li, Y.; Deng, R.H. Crowdfa: A privacy-preserving mobile crowdsensing paradigm via federated analytics. IEEE Trans. Inf. Forensics Secur. 2023, 18, 5416–5430. [Google Scholar] [CrossRef]
  5. Zhang, M.; Chen, S.; Shen, J.; Susilo, W. Privacyeafl: Privacy-enhanced aggregation for federated learning in mobile crowdsensing. IEEE Trans. Inf. Forensics Secur. 2023, 18, 5804–5816. [Google Scholar] [CrossRef]
  6. Yu, R.; Oguti, A.M.; Ochora, D.R.; Li, S. Towards a privacy-preserving smart contract-based data aggregation and quality-driven incentive mechanism for mobile crowdsensing. J. Netw. Comput. Appl. 2022, 207, 103483. [Google Scholar] [CrossRef]
  7. Pournajaf, L.; Xiong, L.; Garcia-Ulloa, D.A.; Sunderam, V. A Survey on Privacy in Mobile Crowd Sensing Task Management; Technical Report TR-2014-002; Department of Computer Science, Emory University: Atlanta, GA, USA, 2014. [Google Scholar]
  8. Pournajaf, L.; Garcia-Ulloa, D.A.; Xiong, L.; Sunderam, V. Participant privacy in mobile crowd sensing task management: A survey of methods and challenges. ACM Sigmod Rec. 2016, 44, 23–34. [Google Scholar] [CrossRef]
  9. Wang, Z.; Pang, X.; Hu, J.; Liu, W.; Wang, Q.; Li, Y.; Chen, H. When mobile crowdsensing meets privacy. IEEE Commun. Mag. 2019, 57, 72–78. [Google Scholar] [CrossRef]
  10. Wang, Y.; Yan, Z.; Feng, W.; Liu, S. Privacy protection in mobile crowd sensing: A survey. World Wide Web 2020, 23, 421–452. [Google Scholar] [CrossRef]
  11. Ma, C.; Li, J.; Wei, K.; Liu, B.; Ding, M.; Yuan, L.; Han, Z.; Poor, H.V. Trusted ai in multiagent systems: An overview of privacy and security for distributed learning. Proc. IEEE 2023, 111, 1097–1132. [Google Scholar] [CrossRef]
  12. Peng, T.; Zhong, W.; Wang, G.; Luo, E.; Yu, S.; Liu, Y.; Yang, Y.; Zhang, X. Privacy-preserving truth discovery based on secure multi-party computation in vehicle-based mobile crowdsensing. IEEE Trans. Intell. Transp. Syst. 2024, 25, 7767–7779. [Google Scholar] [CrossRef]
  13. Wang, Z.; Li, J.; Hu, J.; Ren, J.; Wang, Q.; Li, Z.; Li, Y. Towards privacy-driven truthful incentives for mobile crowdsensing under untrusted platform. IEEE Trans. Mob. Comput. 2021, 22, 1198–1212. [Google Scholar] [CrossRef]
  14. Agate, V.; Ferraro, P.; Re, G.L.; Das, S.K. BLIND: A privacy preserving truth discovery system for mobile crowdsensing. J. Netw. Comput. Appl. 2024, 223, 103811. [Google Scholar] [CrossRef]
  15. Cheng, Y.; Ma, J.; Liu, Z.; Wu, Y.; Wei, K.; Dong, C. A lightweight privacy preservation scheme with efficient reputation management for mobile crowdsensing in vehicular networks. IEEE Trans. Dependable Secur. Comput. 2022, 20, 1771–1788. [Google Scholar] [CrossRef]
  16. Marcolla, C.; Sucasas, V.; Manzano, M.; Bassoli, R.; Fitzek, F.H.; Aaraj, N. Survey on fully homomorphic encryption, theory, and applications. Proc. IEEE 2022, 110, 1572–1609. [Google Scholar] [CrossRef]
  17. Zeng, L.; Chen, X.; Zhou, Z.; Yang, L.; Zhang, J. Coedge: Cooperative dnn inference with adaptive workload partitioning over heterogeneous edge devices. IEEE/ACM Trans. Netw. 2020, 29, 595–608. [Google Scholar] [CrossRef]
  18. Gope, P.; Sikdar, B. Lightweight and privacy-friendly spatial data aggregation for secure power supply and demand management in smart grids. IEEE Trans. Inf. Forensics Secur. 2018, 14, 1554–1566. [Google Scholar] [CrossRef]
  19. Feng, Q.; He, D.; Luo, M.; Huang, X.; Choo, K.K.R. EPRICE: An efficient and privacy-preserving real-time incentive system for crowdsensing in industrial Internet of Things. IEEE Trans. Comput. 2023, 72, 2482–2495. [Google Scholar] [CrossRef]
  20. Song, F.; Qin, Z.; Liu, D.; Zhang, J.; Lin, X.; Shen, X. Privacy-preserving task matching with threshold similarity search via vehicular crowdsourcing. IEEE Trans. Veh. Technol. 2021, 70, 7161–7175. [Google Scholar] [CrossRef]
  21. Lin, C.; Huang, X.; He, D. EBCPA: Efficient blockchain-based conditional privacy-preserving authentication for VANETs. IEEE Trans. Dependable Secur. Comput. 2022, 20, 1818–1832. [Google Scholar] [CrossRef]
  22. Yan, X.; Ng, W.W.; Zhao, B.; Liu, Y.; Gao, Y.; Wang, X. Fog-enabled privacy-preserving multi-task data aggregation for mobile crowdsensing. IEEE Trans. Dependable Secur. Comput. 2023, 21, 1301–1316. [Google Scholar] [CrossRef]
  23. Yan, X.; Ng, W.W.; Zeng, B.; Lin, C.; Liu, Y.; Lu, L.; Gao, Y. Verifiable, reliable, and privacy-preserving data aggregation in fog-assisted mobile crowdsensing. IEEE Internet Things J. 2021, 8, 14127–14140. [Google Scholar] [CrossRef]
  24. Wang, W.; Wang, Y.; Duan, P.; Liu, T.; Tong, X.; Cai, Z. A triple real-time trajectory privacy protection mechanism based on edge computing and blockchain in mobile crowdsourcing. IEEE Trans. Mob. Comput. 2022, 22, 5625–5642. [Google Scholar] [CrossRef]
  25. Hu, K.; Gong, S.; Zhang, Q.; Seng, C.; Xia, M.; Jiang, S. An overview of implementing security and privacy in federated learning. Artif. Intell. Rev. 2024, 57, 204. [Google Scholar] [CrossRef]
  26. Xhemrishi, M.; Östman, J.; Wachter-Zeh, A.; i Amat, A.G. FedGT: Identification of malicious clients in federated learning with secure aggregation. IEEE Trans. Inf. Forensics Secur. 2025, 20, 2577–2592. [Google Scholar] [CrossRef]
  27. Wang, Z.; Hu, Q.; Zou, X.; Hu, P.; Cheng, X. Can we trust the similarity measurement in federated learning? IEEE Trans. Inf. Forensics Secur. 2025, 20, 3758–3771. [Google Scholar] [CrossRef]
  28. Huang, S.; Li, Y.; Yan, X.; Gao, Y.; Chen, C.; Shi, L.; Chen, B.; Ng, W.W. Scope: On Detecting Constrained Backdoor Attacks in Federated Learning. IEEE Trans. Inf. Forensics Secur. 2025, 20, 3302–3315. [Google Scholar] [CrossRef]
  29. Lyu, X.; Han, Y.; Wang, W.; Liu, J.; Wang, B.; Chen, K.; Li, Y.; Liu, J.; Zhang, X. Coba: Collusive backdoor attacks with optimized trigger to federated learning. IEEE Trans. Dependable Secur. Comput. 2024, 22, 1506–1518. [Google Scholar] [CrossRef]
  30. Dong, X.; Zhang, Y.; Guo, Y.; Gong, Y.; Shen, Y.; Ma, J. PRAM: A practical sybil-proof auction mechanism for dynamic spectrum access with untruthful attackers. IEEE Trans. Mob. Comput. 2021, 22, 1143–1156. [Google Scholar] [CrossRef]
  31. Jin, X.; Gong, Z.; Jiang, J.; Wang, C.; Zhang, J.; Wang, Z. RCTD: Reputation-Constrained Truth Discovery in Sybil Attack Crowdsourcing Environment. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Barcelona, Spain, 25–29 August 2024. [Google Scholar]
  32. Wang, S.; Gai, K.; Yu, J.; Zhu, L.; Wu, H.; Wei, C.; Yan, Y.; Zhang, H.; Choo, K.K.R. RAFLS: RDP-based adaptive federated learning with shuffle model. IEEE Trans. Dependable Secur. Comput. 2024, 22, 1181–1194. [Google Scholar] [CrossRef]
  33. Hu, C.; Li, B. Maskcrypt: Federated learning with selective homomorphic encryption. IEEE Trans. Dependable Secur. Comput. 2024, 22, 221–233. [Google Scholar] [CrossRef]
  34. Valadi, V.; Qiu, X.; De Gusmao, P.P.B.; Lane, N.D.; Alibeigi, M. FedVal: Different good or different bad in federated learning. In Proceedings of the 32nd USENIX Security Symposium (USENIX Security 23), Anaheim, CA, USA, 9–11 August 2023. [Google Scholar]
  35. Xu, Y.; Zhang, S.; Lyu, C.; Liu, J.; Shen, Y.; Norio, S. Mitigating Distributed DoS Attacks on Bandwidth Allocation for Federated Learning in Mobile Edge Networks. IEEE Trans. Dependable Secur. Comput. 2024, 22, 1941–1960. [Google Scholar] [CrossRef]
  36. Jiang, Z.; Xu, J.; Zhang, S.; Shen, T.; Li, J.; Kuang, K.; Cai, H.; Wu, F. Fedcfa: Alleviating simpson’s paradox in model aggregation with counterfactual federated learning. In Proceedings of the AAAI Conference on Artificial Intelligence, Philadelphia, PA, USA, 25 February–4 March 2025. [Google Scholar]
  37. Peng, T.; You, W.; Guan, K.; Luo, E.; Zhang, S.; Wang, G.; Wang, T.; Wu, Y. Privacy-preserving multiobjective task assignment scheme with differential obfuscation in mobile crowdsensing. J. Netw. Comput. Appl. 2024, 224, 103836. [Google Scholar] [CrossRef]
  38. Peng, T.; Zhong, W.; Wang, G.; Zhang, S.; Luo, E.; Wang, T. Spatiotemporal-aware privacy-preserving task matching in mobile crowdsensing. IEEE Internet Things J. 2023, 11, 2394–2406. [Google Scholar] [CrossRef]
  39. Zhao, B.; Tang, S.; Liu, X.; Zhang, X.; Chen, W.N. iTAM: Bilateral privacy-preserving task assignment for mobile crowdsensing. IEEE Trans. Mob. Comput. 2020, 20, 3351–3366. [Google Scholar] [CrossRef]
  40. Asheralieva, A.; Niyato, D.; Xiong, Z. Auction-and-learning based lagrange coded computing model for privacy-preserving, secure, and resilient mobile edge computing. IEEE Trans. Mob. Comput. 2021, 22, 744–764. [Google Scholar] [CrossRef]
  41. Wang, X.; Garg, S.; Lin, H.; Kaddoum, G.; Hu, J.; Hossain, M.S. A secure data aggregation strategy in edge computing and blockchain-empowered internet of things. IEEE Internet Things J. 2020, 9, 14237–14246. [Google Scholar] [CrossRef]
  42. Tang, W.; Ren, J.; Deng, K.; Zhang, Y. Secure data aggregation of lightweight E-healthcare IoT devices with fair incentives. IEEE Internet Things J. 2019, 6, 8714–8726. [Google Scholar] [CrossRef]
  43. Sun, P.; Wang, Z.; Wu, L.; Feng, Y.; Pang, X.; Qi, H.; Wang, Z. Towards personalized privacy-preserving incentive for truth discovery in mobile crowdsensing systems. IEEE Trans. Mob. Comput. 2020, 21, 352–365. [Google Scholar] [CrossRef]
  44. Sun, G.; Sun, S.; Yu, H.; Guizani, M. Toward incentivizing fog-based privacy-preserving mobile crowdsensing in the Internet of Vehicles. IEEE Internet Things J. 2019, 7, 4128–4142. [Google Scholar] [CrossRef]
  45. Jin, H.; Su, L.; Xiao, H.; Nahrstedt, K. Inception: Incentivizing privacy-preserving data aggregation for mobile crowd sensing systems. In Proceedings of the 17th ACM International Symposium on Mobile Ad Hoc Networking and Computing, Paderborn, Germany, 5–8 July 2016; pp. 341–350. [Google Scholar]
  46. Jiang, X.; Ying, C.; Li, L.; Düdder, B.; Wu, H.; Jin, H.; Luo, Y. Incentive Mechanism for Uncertain Tasks under Differential Privacy. IEEE Trans. Serv. Comput. 2024, 17, 977–989. [Google Scholar] [CrossRef]
  47. Zhang, M.; Yang, L.; He, S.; Li, M.; Zhang, J. Privacy-preserving data aggregation for mobile crowdsensing with externality: An auction approach. IEEE/ACM Trans. Netw. 2021, 29, 1046–1059. [Google Scholar] [CrossRef]
  48. Wei, K.; Li, J.; Wang, M.; Zhou, X. Personalized Federated Learning With Differential Privacy and Convergence Guarantee. IEEE Trans. Inf. Forensics Secur. 2023, 18, 4488–4503. [Google Scholar] [CrossRef]
  49. Wei, Y.; Jia, J.; Wu, Y.; Hu, C.; Dong, C.; Liu, Z.; Chen, X.; Peng, Y.; Wang, S. Distributed differential privacy via shuffling versus aggregation: A curious study. IEEE Trans. Inf. Forensics Secur. 2024, 19, 2501–2516. [Google Scholar] [CrossRef]
  50. Luo, C.; Wang, Y.; Zhang, Y.; Zhang, L.Y. Distributed Differentially Private Matrix Factorization for Implicit Data via Secure Aggregation. IEEE Trans. Comput. 2024, 74, 705–716. [Google Scholar] [CrossRef]
  51. Shamsabadi, A.S.; Gascón, A.; Haddadi, H.; Cavallaro, A. PrivEdge: From local to distributed private training and prediction. IEEE Trans. Inf. Forensics Secur. 2020, 15, 3819–3831. [Google Scholar] [CrossRef]
  52. Jiang, B.; Seif, M.; Tandon, R.; Li, M. Context-aware local information privacy. IEEE Trans. Inf. Forensics Secur. 2021, 16, 3694–3708. [Google Scholar] [CrossRef]
  53. Tchaye-Kondi, J.; Zhai, Y.; Shen, J.; Zhu, L. Privacy-preserving offloading in edge intelligence systems with inductive learning and local differential privacy. IEEE Trans. Netw. Serv. Manag. 2023, 20, 5026–5037. [Google Scholar] [CrossRef]
  54. Li, F.; Yin, P.; Chen, Y.; Niu, B.; Li, H. Achieving fine-grained qos for privacy-aware users in lbss. IEEE Wirel. Commun. 2020, 27, 31–37. [Google Scholar] [CrossRef]
  55. Ren, Y.; Li, X.; Miao, Y.; Luo, B.; Weng, J.; Choo, K.K.R.; Deng, R.H. Towards privacy-preserving spatial distribution crowdsensing: A game theoretic approach. IEEE Trans. Inf. Forensics Secur. 2022, 17, 804–818. [Google Scholar] [CrossRef]
  56. Peng, T.; Guan, K.; Liu, J.; Chen, J.; Wang, G.; Zhu, J. A blockchain-based mobile crowdsensing scheme with enhanced privacy. Concurr. Comput. Pract. Exp. 2023, 35, e6664. [Google Scholar] [CrossRef]
  57. Zhao, J.; Huang, H.; Zhang, X.; He, D.; Choo, K.K.R.; Jiang, Z.L. VMEMDA: Verifiable multidimensional encrypted medical data aggregation scheme for cloud-based wireless body area networks. IEEE Internet Things J. 2024, 11, 18647–18662. [Google Scholar] [CrossRef]
  58. Zhang, X.; Huang, C.; Gu, D.; Zhang, J.; Xue, J.; Wang, H. Privacy-preserving statistical analysis over multi-dimensional aggregated data in edge computing-based smart grid systems. J. Syst. Archit. 2022, 127, 102508. [Google Scholar] [CrossRef]
  59. Palazzo, M.; Dekker, F.W.; Brighente, A.; Conti, M.; Erkin, Z. Privacy-Preserving Data Aggregation with Public Verifiability Against Internal Adversaries. In Proceedings of the 33rd USENIX Security Symposium (USENIX Security 24), Philadelphia, PA, USA, 14–16 August 2024; pp. 6957–6974. [Google Scholar]
  60. Wu, Q.; Chen, X.; Zhou, Z.; Zhang, J. Fedhome: Cloud-edge based personalized federated learning for in-home health monitoring. IEEE Trans. Mob. Comput. 2020, 21, 2818–2832. [Google Scholar] [CrossRef]
  61. Zhao, B.; Liu, X.; Chen, W.N.; Liang, W.; Zhang, X.; Deng, R.H. PRICE: Privacy and reliability-aware real-time incentive system for crowdsensing. IEEE Internet Things J. 2021, 8, 17584–17595. [Google Scholar] [CrossRef]
  62. Rezaeibagha, F.; Mu, Y.; Huang, K.; Chen, L. Secure and efficient data aggregation for IoT monitoring systems. IEEE Internet Things J. 2020, 8, 8056–8063. [Google Scholar] [CrossRef]
  63. Zheng, Y.; Lu, R.; Guan, Y.; Zhang, S.; Shao, J.; Wang, F.; Zhu, H. EPSet: Efficient and privacy-preserving set similarity range query over encrypted data. IEEE Trans. Serv. Comput. 2024, 17, 524–536. [Google Scholar] [CrossRef]
  64. Zhang, W.; Jiang, B.; Li, M.; Lin, X. Privacy-preserving aggregate mobility data release: An information-theoretic deep reinforcement learning approach. IEEE Trans. Inf. Forensics Secur. 2022, 17, 849–864. [Google Scholar] [CrossRef]
  65. Yan, X.; Zeng, B.; Zhang, X. Privacy-preserving and customization-supported data aggregation in mobile crowdsensing. IEEE Internet Things J. 2022, 9, 19868–19880. [Google Scholar] [CrossRef]
Figure 1. A conceptual multi-layer architecture of EI-driven mobile crowdsensing (MCS). The figure illustrates the sensing, communication, and application layers, together with data aggregation workflows, privacy protection mechanisms, and architecture-dependent security vulnerabilities that arise from edge intelligence characteristics.
Figure 2. Average error rate under different privacy schemes.
Figure 3. Execution time for different privacy approaches and degrees.
Table 1. Comparison of attacks and defenses related to federated learning security vulnerabilities.
| Literature | Attack Type | Focus | Advantages | Disadvantages |
|---|---|---|---|---|
| Xhemrishi et al. [26] | Poisoning attack | Malicious client identification in secure aggregation scenarios, achieving a privacy-security trade-off | High flexibility and robustness; high practicality | Evident scenario limitations |
| Wang et al. [27] | Poisoning attack | Security of local model reliability assessment using similarity metrics in FL | Superior attack performance; practical and efficient defense | Limited to similarity-based metrics |
| Huang et al. [28] | Backdoor attack | Detection of constrained backdoor attacks and stability verification | Excellent robustness and generalization; balanced performance | Slight computational overhead |
| Lyu et al. [29] | Backdoor attack | Collusive backdoor attack with high accuracy and stealth | Leading performance; high sparsity; high concealment | Weak anti-interference capability |
| Dong et al. [30] | Sybil attack | Realistic Sybil attack model with dishonest attackers | Realistic attacker behavior modeling | Limited scenario assumptions |
| Jin et al. [31] | Sybil attack | Addresses unreasonable worker weight estimation in truth discovery | Resolves inflated approval rates; high efficiency | Degrades under high Sybil ratio |
| Wang et al. [32] | Inference attack | Defense against accuracy degradation from DP noise injection | Hierarchical noise injection; high adaptability | Assumes honest server; increased overhead |
| Hu et al. [33] | Inference attack | Defense against membership inference with reduced overhead | Low overhead; flexible design | Performance bottlenecks in large models |
Table 2. Comparison of privacy schemes and aggregation schemes at the application layer.
Column groups: TA = Task Assignment; UR = User Recruitment and Incentives; Sys = System. Each group reports the privacy method, the data aggregation type, and Local/Noise *; "–" indicates the stage is not addressed.

| Literature | TA Methods | TA Aggreg. | TA Local/Noise * | UR Methods | UR Aggreg. | UR Local/Noise | Sys Methods | Sys Aggreg. | Sys Local/Noise |
|---|---|---|---|---|---|---|---|---|---|
| Zhang et al. [64] | – | – | – | – | – | – | Data Perturbation | Common | NO/YES |
| Zhao et al. [4] | – | – | – | Cryptography | Common | YES/NO | Cryptography | Common | YES/NO |
| Zhang et al. [5] | – | – | – | – | – | – | Cryptography | Model | NO/NO |
| Yan et al. [22] | Cryptography | Common | NO/NO | – | – | – | – | – | – |
| Zhao et al. [57] | – | – | – | – | – | – | Cryptography | Common | NO/NO |
| Feng et al. [19] | – | – | – | Cryptography | Common | NO/NO | – | – | – |
| Peng et al. [12] | – | – | – | Cryptography | Weighted | YES/NO | – | – | – |
| Cheng et al. [15] | Cryptography, Confidence | Random Matrix | YES/NO | Confidence | Random Matrix | YES/NO | Cryptography, Anonymization | Common | NO/NO |
| Agate et al. [14] | – | – | – | – | – | – | Cryptography | Cluster | YES/NO |
| Yu et al. [6] | – | – | – | Data Perturbation | Weighted | YES/YES | – | – | – |
| Peng et al. [37] | Data Perturbation | Common | YES/NO | – | – | – | – | – | – |
| Yan et al. [23] | Cryptography | Common | NO/NO | – | – | – | – | – | – |
| Yan et al. [65] | – | – | – | – | – | – | Cryptography | Common | NO/NO |
| Wu et al. [60] | – | – | – | – | – | – | Cryptography, Confidence | Model | NO/NO |
| Shamsabadi et al. [51] | Cryptography, Confidence | Model | NO/NO | Cryptography | Model | NO/NO | – | – | – |
| Jiang et al. [52] | – | – | – | – | – | – | Data Perturbation | Weighted | NO/YES |
| Zheng et al. [63] | Cryptography | Random Matrix | NO/NO | – | – | – | Cryptography | Random Matrix | NO/NO |
| Jin et al. [45] | – | – | – | Data Perturbation | Weighted | NO/NO | – | – | – |
| Gope et al. [18] | – | – | – | – | – | – | Cryptography, Anonymization, Data Perturbation | Common | NO/NO |
| Zhao et al. [61] | – | – | – | Cryptography | Cryptography | NO/NO | Cryptography | Weighted | NO/NO |
| Song et al. [20] | Cryptography | Cluster | YES/NO | – | – | – | – | – | – |
| Rezaeibagha et al. [62] | – | – | – | – | – | – | Cryptography | Common | NO/NO |
| Sun et al. [44] | – | – | – | Cryptography, Anonymization | Weighted | YES/NO | – | – | – |

* Local: whether aggregation is performed locally; Noise: whether noise is present in the aggregation results.
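To make the Local/Noise distinction in Table 2 concrete, the sketch below shows a generic local data perturbation step, Laplace noise in the style of local differential privacy, applied before aggregation. This is an illustrative mechanism, not the implementation of any surveyed scheme; the epsilon, sensitivity, and reading values are assumptions for the example. Because each participant adds independent zero-mean noise, the aggregated mean stays close to the true mean even though individual readings are protected, which is the trade-off Figure 2 examines.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def perturb(value, epsilon, sensitivity, rng):
    """Perturb one sensory reading with Laplace(0, sensitivity/epsilon) noise
    before it leaves the device (LDP-style; parameters are illustrative)."""
    return value + laplace_noise(sensitivity / epsilon, rng)

rng = random.Random(42)  # fixed seed for reproducibility
# Hypothetical readings from 1000 participants, e.g. temperatures.
true_readings = [20.0 + 0.1 * i for i in range(1000)]
epsilon, sensitivity = 1.0, 1.0

# Each participant perturbs locally; the aggregator only sees noisy values.
noisy = [perturb(x, epsilon, sensitivity, rng) for x in true_readings]

true_mean = sum(true_readings) / len(true_readings)
noisy_mean = sum(noisy) / len(noisy)
error = abs(noisy_mean - true_mean)  # shrinks as participants increase
```

The per-reading noise has standard deviation sqrt(2)·sensitivity/epsilon, but the error of the mean falls roughly with the square root of the number of participants, so a "NO/YES" scheme in Table 2 can still deliver accurate aggregates at sufficient scale.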
Table 3. Comparison of approaches in the communication and sensing layers.
Transmission and Data Release describe the communication layer; Targets and Distribution Area describe the sensing layer.

| Literature | Transmission | Data Release | Targets | Distribution Area |
|---|---|---|---|---|
| Zhang et al. [64] | Common Transmission | YES | Location Privacy | General Networks |
| Zhao et al. [4] | Anonymous Connection | NO | Sensory Privacy; Bid Privacy | General Networks |
| Zhang et al. [5] | Common Transmission | YES | Sensory Privacy | General Networks |
| Yan et al. [22] | Common Transmission | YES | Sensory Privacy | General Networks |
| Zhao et al. [57] | Common Transmission | NO | Sensory Privacy | Medical Networks |
| Feng et al. [19] | Common Transmission | YES | Sensory Privacy; Weights Privacy | Industrial Networks |
| Peng et al. [12] | Common Transmission | NO | Sensory Privacy | Automotive Networks |
| Cheng et al. [15] | Anonymous Connection | YES | Location Privacy; Identity Privacy; Sensory Privacy; Reputation Privacy | Automotive Networks |
| Agate et al. [14] | Anonymous Connection | YES | Sensory Privacy | General Networks |
| Yu et al. [6] | Anonymous Connection | YES | Sensory Privacy | General Networks |
| Peng et al. [37] | Anonymous Connection | YES | Location Privacy; Sensory Privacy; Bid Privacy | General Networks |
| Yan et al. [65] | Common Transmission | YES | Sensory Privacy | General Networks |
| Wang et al. [13] | Anonymous Connection | NO | Identity Privacy | General Networks |
| Wu et al. [60] | Common Transmission | YES | Location Privacy; Identity Privacy; Sensory Privacy | General Networks |
| Asheralieva et al. [40] | Common Transmission | NO | Identity Privacy | Industrial Networks |
| Jiang et al. [52] | Common Transmission | YES | Location Privacy | Medical Networks |
| Zheng et al. [63] | Common Transmission | NO | Sensory Privacy | Medical Networks |
| Jin et al. [45] | Common Transmission | YES | Location Privacy; Sensory Privacy | Automotive Networks |
| Gope et al. [18] | Anonymous Connection | NO | Identity Privacy; Sensory Privacy | Industrial Networks |
| Zhao et al. [61] | Common Transmission | NO | Location Privacy; Sensory Privacy; Bid Privacy; Weights Privacy | General Networks |
| Song et al. [20] | Common Transmission | NO | Location Privacy | Automotive Networks |
| Rezaeibagha et al. [62] | Common Transmission | NO | Sensory Privacy | Medical Networks |
| Sun et al. [44] | Anonymous Connection | NO | Location Privacy; Identity Privacy; Sensory Privacy; Reputation Privacy | Automotive Networks |
Table 4. Qualitative comparison of key edge intelligence attributes supported in existing studies.
| Literature | Decentralized Learning | Real-Time Processing | Collaborative Learning | Model Adaptability |
|---|---|---|---|---|
| Zhang et al. [64] | FS | NS | FS | FS |
| Zhao et al. [4] | FS | PS | NS | FS |
| Zhang et al. [5] | NS | PS | PS | PS |
| Yan et al. [22] | FS | FS | NS | FS |
| Zhao et al. [57] | NS | PS | NS | FS |
| Feng et al. [19] | FS | FS | NS | PS |
| Peng et al. [12] | FS | NS | NS | FS |
| Cheng et al. [15] | FS | NS | NS | FS |
| Agate et al. [14] | FS | NS | NS | FS |
| Yu et al. [6] | FS | NS | NS | FS |
| Peng et al. [37] | FS | NS | NS | FS |
| Wang et al. [13] | FS | NS | NS | PS |
| Wu et al. [60] | FS | FS | FS | FS |
| Asheralieva et al. [40] | FS | PS | NS | FS |

Abbreviations: FS = Fully Satisfied; PS = Partially Satisfied; NS = Not Satisfied.

Share and Cite

MDPI and ACS Style

Liu, X.; Chen, S.; Xu, Z. Privacy-Preserving Data Aggregation Mechanisms in Mobile Crowdsensing Driven by Edge Intelligence. Electronics 2026, 15, 26. https://doi.org/10.3390/electronics15010026
