Security in Collaborative Driving: A Survey of Threats, Defenses, and Emerging Trends

Nayak, Sahil; Gungor, Onat; Rosing, Tajana

doi:10.3390/electronics15112389

Open AccessReview

Security in Collaborative Driving: A Survey of Threats, Defenses, and Emerging Trends

by

Sahil Nayak

^*

,

Onat Gungor

and

Tajana Rosing

Department of Computer Science and Engineering, University of California, San Diego, CA 92093, USA

^*

Author to whom correspondence should be addressed.

Electronics 2026, 15(11), 2389; https://doi.org/10.3390/electronics15112389

Submission received: 2 May 2026 / Revised: 26 May 2026 / Accepted: 27 May 2026 / Published: 1 June 2026

(This article belongs to the Special Issue Novel Methods Applied to Security and Privacy Problems in Future Networking Technologies, Volume II)

Download

Browse Figures

Review Reports Versions Notes

Abstract

Collaborative driving, in which autonomous vehicles cooperate with other vehicles and roadside infrastructure to improve safety, perception, and traffic efficiency, is emerging as a key paradigm for next-generation transportation systems. While such collaboration enhances situational awareness, it also introduces new security vulnerabilities across perception, communication, planning, decision-making, and control layers. In this survey, we present a unified taxonomy of security threats and defense mechanisms in collaborative driving systems, systematically organizing attacks and countermeasures across system layers. We further examine the integration of language models, including vision-based and multimodal reasoning models, into collaborative driving pipelines, highlighting the resulting security risks and design challenges. Finally, we identify key open research challenges, including cross-layer and end-to-end security, uncertainty-aware defenses, and real-world validation, outlining promising directions for future work toward secure and resilient collaborative autonomous mobility.

Keywords:

collaborative driving; autonomous vehicles; connected and autonomous vehicles (CAVs); security; adversarial attacks; defense mechanisms; language models

1. Introduction

Collaborative driving (CD) has emerged as a fundamental paradigm in autonomous driving (AD), enabling vehicles to share perceptual information and collectively reason about their environment [1,2,3,4]. By leveraging distributed observations, CD significantly extends perception beyond the limitations of individual vehicles, improving object detection, tracking, and situational awareness [5,6,7,8]. To support this capability, CD architectures are typically organized into three interconnected layers: perception, communication and fusion, and planning and control [9]. The perception layer captures and interprets the surrounding environment using onboard sensors [5]. The communication and fusion layer aggregates information from multiple agents (including neighboring vehicles and infrastructure) primarily through vehicle-to-everything (V2X) communication [6,10]. Finally, the planning and control layer utilizes this aggregated information to generate and execute driving decisions [9].

This layered architecture introduces a broad attack surface, with each layer exposing distinct security vulnerabilities. The perception layer is vulnerable to physical and sensor-level attacks, such as spoofing and tampering, which can corrupt raw environmental observations and lead to incorrect scene understanding [9]. The communication and fusion layer is susceptible to network-level threats, including radio jamming and message manipulation, which can disrupt or distort the exchange of critical information among vehicles and infrastructure [10]. The planning and control layer, in turn, is exposed to evasion attacks on machine learning models and latency-inducing disruptions [11,12,13], which can degrade decision-making reliability and responsiveness. These vulnerabilities are particularly critical in CD, where information is shared and aggregated across multiple agents; as a result, a single compromised component can propagate errors throughout the system and impact multiple vehicles simultaneously. Consequently, a substantial body of research has emerged to investigate both attack strategies and corresponding defense mechanisms across these layers [14,15,16,17,18,19,20,21]. In this work, we systematically analyze these vulnerabilities and provide a comprehensive survey of existing defenses, highlighting their strengths and limitations.

Beyond traditional models, the integration of language models (LMs) into CD has recently emerged as a growing research direction [22,23]. Large language models (LLMs) enable human-like, commonsense-driven reasoning and planning capabilities [24], while multimodal LLMs (MLLMs) extend these abilities to diverse data modalities, improving robustness in complex traffic scenarios. However, these advancements also introduce new security challenges [25]. These models expand the attack surface, exposing CD systems to emerging threats such as prompt injection attacks [26]. To mitigate these risks, recent work has proposed a range of frameworks for securing language-enabled CD systems [25,27,28].

The security aspect of CD networks becomes even more critical as these systems move toward cross-domain V2X environments. In addition to conventional communication, future systems may increasingly incorporate drones, cloud services, and other emerging infrastructures [29,30]. This shift requires new cybersecurity adaptations around these technologies. For example, recent work on hybrid satellite networks has investigated data aggregation with dynamic group key agreement [31]. Additionally, Internet of Vehicle authentication studies have proposed conditional privacy-preserving batch verification schemes to support scalable message authentication [32]. These directions highlight the broader need for efficient, cross-domain security systems in CD environments.

While existing surveys on CD security largely focus on vulnerabilities in V2X communication, broader analyses of the end-to-end CD pipeline remain limited [10,33,34,35,36,37,38]. In particular, the integration of LMs into CD introduces a new class of threats that has received little systematic study. We address these gaps by providing a comprehensive survey of cybersecurity challenges across all layers of CD systems, covering both traditional and language-enabled architectures. Our main contributions are as follows:

We provide a systematic review of security threats in CD systems, spanning the perception, communication and fusion, and planning and control layers.
We further extend this analysis to language-enabled CD systems, highlighting vulnerabilities introduced by integrating LLMs and other emerging AI components.
We propose a unified taxonomy of threats and defenses, identify key open research challenges, and outline directions for future work.

By situating these vulnerabilities within a cross-layer perspective, this survey provides a foundation for designing more secure and resilient CD systems and guides future research on emerging threats in ML-enabled autonomous mobility.

Figure 1 provides an overview of the structure of this survey, and Table 1 shows a list of key acronyms used throughout this paper. Section 2 provides a background for this paper, covering the CD Architecture, LM integration, and a brief overview of the security landscape of CD. Section 3 discusses related work. Section 4 contains an overview of models and datasets within CD. Section 5 describes cybersecurity threats and mitigations. Section 6 discusses research gaps and future directions, and we conclude in Section 7.

2. Background

AD systems have demonstrated large-scale real-world deployment, with hundreds of millions of autonomous miles driven and hundreds of thousands of weekly rides in operational fleets such as Waymo [39]. CD extends the broader AD paradigm by enabling communication and joint decision-making among multiple agents within a connected environment [6]. These agents include vehicles, infrastructure, and drones [30,40,41]. By leveraging shared information and cooperative perception (CP), CD enhances decision accuracy and situational awareness, improving the safety and robustness of AD systems [5,7,8,42,43]. LM integration is also a key component of advanced CD, improving the reasoning and data processing abilities of these systems [44].

2.1. CD Architecture

Figure 2 illustrates the CD architecture, which comprises three primary layers: perception; communication and fusion; and planning and control [9]. CD starts in the perception layer, where a variety of vehicles and infrastructure units collect information about the surroundings in a process called collaborative perception (CP) [12,45]. Next, these nodes share data among each other and undergo a fusion process to combine this data [46,47,48]. Finally, the ego vehicle handles the planning and control stage locally, using an underlying machine learning (ML) model to make and execute decisions [6,10].

2.1.1. Perception Layer

The perception layer utilizes multiple sensing modalities, including RGB cameras, Light Detection and Ranging (LiDAR) sensors, and radar, for environmental perception [5]. RGB cameras provide rich visual information for object detection and classification [9]. However, their performance degrades under low-light or poor illumination conditions. To address these limitations, LiDAR and radar sensors are incorporated as complementary modalities. LiDAR employs laser pulses to generate high-resolution three-dimensional representations of the environment in the form of point clouds [49]. It is particularly effective for accurate distance estimation and object detection and operates independently of ambient lighting conditions. Radar, in contrast, uses radio waves for perception and is robust to adverse weather and lighting conditions, making it a reliable sensing modality in challenging environments [9].

As the foundational component of AD systems, the perception layer enables vehicles to interpret their surroundings and make informed decisions for safe operation [6]. Despite significant advances in individual sensing technologies, single-vehicle perception remains insufficient to ensure comprehensive environmental awareness. This limitation arises from restricted fields of view, occlusions, and potential sensor degradation or hardware failures [50]. To mitigate these challenges, CP across multiple vehicles has emerged as a key paradigm in CD, enabling the sharing and integration of complementary sensory information.

Furthermore, the perception layer extends beyond onboard vehicle sensors to include data from external infrastructure and connected entities [6]. In particular, intelligent roadside units provide complementary perception capabilities by sharing contextual information, improving coverage of occlusions and enhancing situational awareness in CD environments [50].

2.1.2. Communication and Fusion Layer

The second layer in CD systems is the communication and data fusion layer, which enables the exchange and integration of information across distributed perception modules [10]. This layer plays a critical role in extending situational awareness beyond the sensing capabilities of a single vehicle.

V2X communication enables information exchange among vehicles, infrastructure, and other connected entities [51]. Its primary modes include vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I), which support cooperative awareness and coordinated decision-making in traffic environments. Beyond these core modes, V2X extends to emerging paradigms such as vehicle-to-home (V2H), vehicle-to-building (V2B), vehicle-to-grid (V2G), and vehicle-to-drone (V2D) communication, enabling deeper integration with smart environments and cyber–physical infrastructures [29]. Collectively, these communication paradigms underpin cooperative perception, coordination, and control in connected transportation systems [24,51].

V2X communication is primarily supported by two technologies: dedicated short-range communications (DSRC), based on the IEEE 802.11p standard for communication, and cellular V2X (C-V2X) standardized by the 3rd Generation Partnership Project (3GPP) [10,17]. IEEE 802.11p enables low-latency, direct communication in highly dynamic vehicular environments by simplifying network association procedures, reducing communication overhead and delay [52]. In contrast, C-V2X leverages cellular infrastructure and has evolved from 4G LTE-V2X to 5G NR-V2X, with ongoing research toward 6G-enabled systems [53,54]. C-V2X supports both direct communication and network-assisted communication, offering improved coverage for large-scale deployments [53].

To effectively utilize shared information, data fusion techniques are employed to combine observations from multiple sources into a unified representation of the environment [6]. Fusion strategies are typically categorized into early, late, and intermediate fusion approaches. Early fusion aggregates raw sensor data, providing high information richness at the cost of significant communication bandwidth [5]. Late fusion, in contrast, exchanges high-level decisions or predictions, reducing communication overhead but often sacrificing contextual detail. Intermediate fusion strikes a balance by transmitting learned feature representations, offering a trade-off between information fidelity and communication efficiency [2,3,4,47,48]. As a result, intermediate fusion has become the dominant approach in modern V2X-enabled CD systems [5].

Despite these advances, the communication and fusion layer faces several practical challenges, including latency constraints, limited bandwidth, synchronization across heterogeneous sources, and the reliability and trustworthiness of shared information [10,33,55]. These factors critically influence the performance of cooperative perception and decision-making in real-world deployments. Overall, by integrating V2X communication with efficient data fusion mechanisms, this layer enables collaborative situational awareness, forming the foundation for downstream planning and decision-making processes.

2.1.3. Planning and Control Layer

The planning and control layer generates and executes safe and efficient driving strategies based on fused information from collaborative perception and communication modules [9]. It bridges high-level decision-making and low-level control to enable real-time action planning and execution. This layer is split into two key processes: the planning and decision-making procedure, and the control and actuation operation.

At the planning and decision-making stage, the system predicts the future trajectories of surrounding agents using historical motion patterns and contextual information [56]. In V2X-enabled environments, these predictions leverage observations from multiple vehicles and infrastructures, allowing for a more comprehensive understanding of agent interactions and improving prediction accuracy and robustness, particularly in complex or occluded scenarios. Based on these predictions, the system determines feasible and optimal driving maneuvers, such as lane changes, merging, and intersection traversal, while satisfying safety, efficiency, and traffic rule constraints [57].

The control and actuation component translates planned trajectories into low-level commands such as steering, acceleration, and braking [46]. It must account for vehicle dynamics, actuator constraints, and real-time feedback to ensure accurate trajectory tracking. In addition to nominal driving behavior, the control system must handle safety-critical scenarios such as sudden obstacle avoidance and emergency braking under strict latency and reliability requirements.

While planning and control are fundamental components of conventional autonomous driving systems, the integration of V2X information enhances their performance. By incorporating shared multi-source context beyond ego-vehicle perception, this layer supports more informed and robust decision-making and control in dynamic driving environments.

2.2. LM Integration

LMs are a subset of AI models that generate and process natural language by modeling the probability distribution over word sequences [58]. This capability supports language-based reasoning and enables multimodal understanding by representing diverse data modalities in textual form.

LLMs are a class of LMs trained on large-scale datasets with billions to trillions of parameters, enabling strong generative and predictive capabilities [58]. They demonstrate enhanced reasoning and generalization capabilities compared to smaller-scale models. Vision–language models (VLMs) extend LLMs by integrating visual inputs with textual understanding, enabling multimodal perception [59]. MLLMs further generalize this paradigm by supporting additional modalities such as images, video, and audio [60].

In CD systems, the integration of these models offers two primary advantages: First, LMs introduce higher level semantic and commonsense reasoning capabilities into the driving stack [61]. Unlike conventional perception and planning modules that operate primarily on geometric and statistical representations, LMs can encode abstract knowledge about traffic norms, human intent, and contextual constraints. This enables more flexible decision making in long tail and unforeseen scenarios, thereby improving robustness and operational safety. Second, LMs, particularly MLLMs, enable unified processing across heterogeneous data modalities [62]. AD systems inherently rely on diverse sensor inputs, including camera images, LiDAR point clouds, and radar signals. MLLMs provide a framework for jointly reasoning over these inputs within a shared representation space. This capability becomes increasingly critical in CD, where information is aggregated not only from the ego vehicle but also from external vehicles and infrastructure. By facilitating cross modal fusion and semantic level interpretation, LMs support scalable and context-aware cooperative driving systems.

2.3. Security Landscape of CD

Cybersecurity is critical in CD systems [63], as vulnerabilities can directly compromise safety, performance, and reliability. Figure 3 shows these threats in each layer. In conventional AD systems, multiple layers are susceptible to attacks. On the perception layer, physical sensors such as cameras, LiDAR, and radar can be manipulated; for instance, LiDAR spoofing injects malicious laser signals to generate false obstacles, potentially misleading vehicle perception [9,64]. Similarly, the planning and control layer has its own vulnerabilities. The underlying ML model is threatened by evasion attacks, where small, carefully crafted perturbations in input data can significantly degrade decision-making performance at inference time [9,65].

CD introduces additional security challenges beyond those of traditional AD systems. V2X communication introduces the communication and fusion layer, which opens new attack surfaces [9]. Attacks unique to CD include Sybil attacks, where multiple fake nodes are created to overwhelm the network and compromise data integrity [36], and jamming attacks, which disrupt inter-vehicle communications by injecting interfering signals [55]. Additionally, the control and actuation process within the planning and control layer is vulnerable to cross-layer threats, where compromises from previous layers result in the execution of a malicious or unsafe decision [36]. This is especially critical in CD systems, where more layers means a broader attack surface through which attacks can propagate. Overall, while CD enhances perception and decision-making capabilities, it also significantly expands the attack surface. The combination of traditional AD vulnerabilities and CD-specific threats underscores the critical importance of robust cybersecurity strategies in CD systems.

3. Related Work

This survey examines security in CD environments, with a particular focus on the emerging role of LMs. While prior studies have explored individual aspects of this space, a unified and systematic taxonomy of security challenges in CD systems is still missing. To address this gap, we organize the existing literature along five key dimensions: CD surveys, V2X security, general AD security, ML security in AD, and the integration of LMs in AD. Table 2 provides a structured overview of the existing literature and the specific aspects each work addresses.

3.1. Surveys on CD

Although no prior survey directly addresses security in CD, a substantial body of work reviews CD and CP systems more broadly. Existing surveys [5,7,8,42,43,45,73] examine various aspects of these systems, including perception, communication, system design, and datasets.

Huang et al. [5] provide a comprehensive survey of CP, including mathematical formulations and key components such as sensor fusion and communication. It also provides a brief overview of V2X communication security. Bai et al. [7] emphasize architectural design and system structure. Cui et al. [8] review CP techniques in connected autonomous vehicles, focusing on multi-sensor fusion, communication strategies, and V2X integration. Ji et al. [42] survey cooperative vehicle-infrastructure systems (CVISs), analyzing system architectures, traffic applications, and challenges in collaborative decision-making. Yu et al. [43] examine both CP and control, highlighting interactions between vehicles and infrastructure and their impact on safety and efficiency. Yazgan et al. [45] focus on datasets for collaborative perception. Finally, Song et al. [73] adopt an information-centric perspective, treating V2X communication as a sensing modality.

Despite their breadth, these surveys largely overlook security considerations. Addressing security is essential for the reliable deployment of CD systems, motivating the need for a dedicated taxonomy that systematically characterizes threats, vulnerabilities, and defense mechanisms in this domain.

3.2. Surveys on V2X Communication Security

V2X security research primarily focuses on securing communication between vehicles and infrastructure, with an emphasis on reliable and secure data exchange. A substantial body of survey work [10,28,33,34,35,36,37,38,66,67] examines this domain from multiple perspectives, including threat modeling, privacy, and communication protocols.

Gularte et al. [33] provide a broad literature review of V2X cybersecurity, extending beyond technical vulnerabilities to include financial investments and trends in research activity. Alnasser et al. [34] analyze fundamental security challenges in V2X networks, including threats to availability, integrity, confidentiality, authenticity, and non-repudiation. Sedar et al. [10] offer a comprehensive overview of cybersecurity mechanisms designed for V2X communication systems. Yoshizawa et al. [35] focus on privacy, discussing threat models and corresponding protection mechanisms.

El-Rewini et al. [36] take a multi-layered perspective, analyzing threats and countermeasures across sensing, communication, and control layers, making their work one of the most comprehensive in this space. Ghosal and Conti [37] categorize communication-layer attacks such as impersonation, replay, and jamming, and propose corresponding mitigation strategies. Hasan et al. [38] examine threats targeting communication infrastructure, including unauthorized access, data manipulation, and denial-of-service attacks. Qian et al. [66] focus on 5G-enabled V2X systems, reviewing vulnerabilities such as authentication attacks, privacy leakage, and network-level threats, along with associated security mechanisms. Ying et al. [28] study attack models in vehicular networks with an emphasis on national cryptographic standards used by the Chinese State Cryptography Administration. Finally, Rishiwal et al. [67] explore V2X security and privacy in smart city environments from a human-centric perspective.

These surveys primarily focus on communication-layer security and do not capture the full CD pipeline. While El-Rewini et al. [36] extend beyond communication to multiple system layers, they provide limited coverage of machine learning security and do not address emerging LM-based threats. This gap highlights the need for a unified framework that captures security risks across the entire CD stack, including learning-based components.

3.3. Surveys on AD System Security

AD system security research examines cybersecurity risks across the vehicle stack, including sensors, onboard networks, ML components, and external communication interfaces. Existing surveys [9,68,69,70,71,72] analyze these vulnerabilities and review corresponding mitigation strategies.

Lippi et al. [9] provide a broad overview of the AD pipeline, covering security in perception, V2X communication, and ML components. Kim et al. [68] conduct a large-scale analysis of 151 studies, categorizing attacks across control systems, AD components, and V2X communication, with limited discussion of ML-based defenses. Deng et al. [72] focus on deep learning-based AD systems, categorizing attacks and defenses across perception, cloud services, and learning models. Cui et al. [69] examine both safety failures and cybersecurity threats, analyzing attacks targeting sensors, communication networks, and onboard software.

A subset of surveys focuses on connected and autonomous vehicles (CAVs), which incorporate V2X-enabled communication with other vehicles and infrastructure. Pham and Xiong [70] analyze attack models across communication protocols (e.g., IEEE 802.11p), onboard systems, and infrastructure. Hossain et al. [71] categorize threats spanning V2X communication, sensors, hardware, and adversarial ML attacks.

These surveys primarily emphasize security at the level of individual vehicles and their internal components. While works such as Lippi et al. [9] and CAV-focused surveys [70,71] extend partially to interconnected settings, they do not fully capture the CD paradigm. Furthermore, the growing integration of LMs and multimodal reasoning systems introduces new attack surfaces that remain largely unexplored in the literature. In contrast, this work centers on CD systems and develops a comprehensive taxonomy of security threats and mitigation strategies spanning the full perception-to-control pipeline, explicitly incorporating machine learning and LM-driven components.

3.4. Surveys on LMs for AD

An emerging body of work explores the integration of LMs into AD, motivating several recent surveys that examine their roles in perception, reasoning, and decision-making. These surveys cover LLMs, VLMs, and MLLMs, highlighting their advantages in enhancing AD capabilities [24,74,75,76,77,78].

Yang et al. [74] review LLM-based approaches in AD, categorizing applications across planning, perception, question answering, generation, and evaluation. Zhou et al. [75] focus on VLMs, analyzing how visual perception is combined with language reasoning for tasks such as scene understanding, trajectory planning, and decision explanation. Cui et al. [76] survey MLLMs in AD, organizing prior work into perception, planning, and industrial deployment. Gao et al. [77] discuss foundation models, including LLMs, VLMs, and MLLMs, emphasizing their role in environment understanding, trajectory planning, and human-interpretable reasoning.

A limited number of works consider security aspects. Wang [78] examines VLMs from a security perspective, analyzing threats such as backdoor and adversarial attacks, along with corresponding defenses. Panyam et al. [24] study LLM integration in V2X-enabled settings, focusing on architectures that support communication, coordination, and decision-making in connected vehicle environments.

Despite these contributions, existing surveys emphasize model capabilities, multimodal reasoning, and system performance, with limited attention to security. LMs introduce new vulnerabilities, including prompt injection, hallucination, and adversarial manipulation of multimodal inputs, which remain insufficiently explored. Furthermore, prior work typically considers language-enabled AD systems in isolation and does not examine their interaction with CD environments. In contrast, this work investigates the cybersecurity implications of integrating LMs into CD systems, providing a unified taxonomy of threats and mitigation strategies spanning both communication and decision-making components.

4. Overview of Collaborative Driving Models and Datasets

This section covers recent work in CD models. First, we discuss innovations in non-language-enabled CD. Next, we provide a foundation for language-enabled CD by creating a taxonomy of single-perception language AD models. Finally, we showcase the convergence of these two paradigms and survey language-enabled CD models. As a whole, this section aims to provide context on key recent advancements in collaborative and language-based AD.

4.1. V2X CD Models

V2X CD models leverage this communication to enhance perception, planning, and decision-making. Research in this area shows a clear shift in focus towards intermediate fusion methods, as a middle ground between information density and bandwidth cost. This subsection categorizes learning models into early, intermediate, and late fusion approaches, highlighting their contributions and limitations. Table 3 is a taxonomy of these models. It categorizes important features such as the data shared and machine learning architecture behind these systems. It also describes the task of these models, such as object detection and end-to-end driving.

4.1.1. Early Fusion Models

Cooper (2019) [79] shares raw LiDAR data among vehicles to enhance CP. Cooper demonstrates significant improvements over single-vehicle perception, but requires high bandwidth and exhibits limited scalability, motivating subsequent feature-level fusion approaches.

CooPre (2025) [91] introduces a self-supervised pretraining framework for V2X CP that learns from unlabeled multi-agent LiDAR data, reducing reliance on expensive 3D annotations.

CoPAD (2025) [90] fuses raw trajectory data from infrastructure and other vehicles. It introduces a novel early fusion module, producing more accurate multi-agent predictions.

4.1.2. Late Fusion Models

Late fusion models combine decisions or outputs from multiple perception sources to enhance robustness and safety.

V2X-Communication-Aided AD (2020) [46] combines multimodal data from local sensors and V2X transmissions, applying fusion selectively on dynamic obstacles. It represents an early real-world implementation of V2X-assisted driving.

Demonstrations of Cooperative Perception (2021) [50] implements collective perception through V2X, integrating roadside sensors and autonomous vehicles. Experiments in real-world and simulated settings show improved pedestrian detection and overall safety, validating the effectiveness of V2X integration.

5G NR-V2X (2021) [84] introduces a framework integrating 5G networking into V2X communication, covering layers, reliability, and security, while discussing early ML applications for AD.

4.1.3. Intermediate Fusion Models

Intermediate fusion models transmit feature-level representations rather than raw data, balancing communication efficiency with perception accuracy.

F-Cooper (2019) [80] transmits feature maps between CAD-enabled vehicles, maintaining strong perception performance while reducing communication overhead.

When2Com (2020) [82] learns when and with whom to communicate via graph-based grouping, selectively sharing intermediate feature representations to reduce bandwidth while maintaining strong perception performance.

V2VNet (2020) [83] focuses on vehicle to vehicle communication and iteratively exchanges and aggregates intermediate feature representations between vehicles, improving detection accuracy through learned V2V communication.

DiscoNet (2021) [81] introduces distilled collaboration graphs using a teacher–student learning framework, sharing these representations among student models to enhance performance while reducing communication costs.

V2X-ViT (2022) [85] integrates a vision transformer to combine multimodal inputs from vehicles and infrastructure. It automatically assesses the reliability of each source, demonstrating robustness under noisy conditions and outperforming prior SOTA models.

CoBEVT (2022) [86] enables cooperative generation of Bird’s Eye View (BEV) maps. By sharing compressed features instead of raw data, it reduces bandwidth usage while preserving accuracy even under strong compression.

Where2Comm (2022) [87] utilizes spatial confidence maps to selectively transmit the most crucial intermediate features, further reducing communication overhead while preserving strong CP performance.

BM2CP (2023) [88] enables efficient multimodal collaborative perception by fusing LiDAR and camera features across agents, selectively sharing intermediate representations and outperforming SOTA methods with a 50x communication volume reduction.

SCOPE (2023) [1] incorporates spatial and temporal awareness into intermediate fusion, improving robustness to misalignment and dynamic environments.

How2Comm (2023) [89] uses an information-aware mechanism to communicate the most important features between agents, reducing bandwidth while maintaining strong multi-agent perception performance.

CoMamba (2024) [2] applies state space modeling for low-latency perception, achieving real-time inference at 26.9 FPS, demonstrating the efficiency of temporal models.

UniV2X (2024) [3] proposes an end-to-end framework integrating sparse and dense data streams, balancing bandwidth efficiency and perception fidelity.

SiCP (2024) [4] introduces a generic framework that models individual and CP by fusing intermediate features across agents, improving 3D object detection while balancing local and shared information.

XET-V2X (2025) [47] fuses multimodal spatial and temporal data, addressing sensor calibration and latency challenges while improving environmental detection accuracy.

V2X-R (2025) [48] incorporates 4D radar into V2X perception pipelines, mitigating weather and spatial challenges, supporting multiple fusion methods, and setting a foundation for radar-enhanced CP.

4.2. LMs in Single-Perception AD

LLMs and VLMs provide reasoning, interpretability, and human-like planning capabilities for AD systems. Recent work explores their integration with single-vehicle perception pipelines. These models are crucial for providing a foundation for language-enabled CD models, which integrate V2X communication alongside the advanced reasoning capabilities of these models. Table 4 presents these models. It covers important categories specific to language-enabled AD systems, such as the underlying LM and the reasoning system.

Drive Like a Human (2023) [61] demonstrates GPT-3.5-based closed-loop simulation, showing human-like decision-making and the feasibility of reasoning in AD.

DriveGPT4 (2023) [22] processes RGB data and user instructions, predicting low-level vehicle controls and enabling explainable reasoning.

LanguageMPC (2023) [92] leverages LLMs for model predictive control, converting sensor inputs into textual prompts for reasoning, resulting in lower overall cost metrics.

LMDrive (2023) [93] decomposes driving into tasks handled via an LLM, supporting closed-loop self-learning and improved robustness in complex scenarios.

DriveLLM (2024) [23] integrates prediction generation and feedback mechanisms, allowing verbal self-reflection for improved control decisions.

HighwayLLM (2024) [94] combines reinforcement learning and LLM-based trajectory planning or safety evaluation, improving collision rates and vehicle velocity.

Talk2Drive (2024) [95] converts natural-language commands into driving actions, enabling human-interactive AD with low latency.

DriveVLM (2024) [96] uses VLMs for improved scene understanding, integrating dual pipelines to address latency and perception accuracy.

DriveLM (2025) [97] introduces graph-based visual question answering (VQA) for reasoning and planning, with a corresponding dataset supporting future language-driven AD research.

ReasonDrive (2025) [98] uses explicit reasoning in VLM-based AD, through reasoning-enhanced fine-tuning. It focuses on using small VLMs, which are more practical for real-world scenarios.

DriveAgent (2025) [99] integrates LiDAR, GPS, and intertial measurement unit (IMU) data with LLM reasoning for multimodal end-to-end control and natural-language explanation.

DriveMLM (2025) [62] synthesizes multimodal inputs via an MLLM tokenizer-decoder pipeline, achieving state-of-the-art driving decisions while maintaining explainability.

KLDrive (2026) [100] introduces knowledge graphs to augment LLM reasoning in AD, achieving SOTA performance while showing significantly fewer hallucinations and improved fact-based explainability.

4.3. Language-Based CD Models

Language-based CD models combine CP and LLM/VLM reasoning to enable natural language communication, negotiation, and planning among vehicles. Table 5 is a taxonomy of these models.

Talking Vehicles (2024) [101] enables LLM-facilitated natural-language communication between vehicles, improving cooperative behavior while slightly increasing collision risk.

V2V-LLM (2025) [40] applies MLLM-based late fusion to V2V communication, improving grounding, object identification, and planning.

V2X-LLM (2025) [44] processes multimodal V2X data for scenario explanation, state prediction, and navigation advisory, achieving high accuracy with low latency.

CoLMDriver (2025) [102] integrates language-based negotiation and a VLM intention planner, improving driving scores and negotiation quality through reinforcement-based actor–critic evaluation.

LangCoop (2025) [103] transmits semantic information via natural language (LangPack), reducing bandwidth while maintaining cooperative performance.

LLMDriver (2025) [104] introduces a collaborative planning module with an Experience Memory Module, enhancing safety and comfort decisions based on past driving scenarios.

V2X-VLM (2025) [105] fuses multimodal data from vehicles and infrastructure using VLMs, achieving robustness and efficiency improvements over prior unified V2X models.

V2V-GoT (2025) [106] extends V2V-LLM with temporal feature maps to better capture environmental dynamics, improving planning performance while maintaining communication costs.

UNCAP (2026) [107] enhances LangCoop with uncertainty-guided communication, optimizing reliability and bandwidth while supporting VLM-based planning, significantly improving driving performance and efficiency.

4.4. Datasets

Datasets are crucial in the underlying ML systems in CD models [45]. CD datasets are more complex than general AD sets, as they involve different sensor types and information fusion. Datasets used in CD models involve data collected from infrastructure, vehicles, and other sources. Table 6 is an overview of these datasets. It shows key categories such as the modality of data and the task enabled by the dataset, such as object detection. Additionally, this table shows whether each set was created with real-world data.

BAAI-VANJEE (2021) [108] focuses on roadside architecture data and includes varying conditions in weather and traffic.

IPS300+ (2022) [109] contains open-source multimodal data, collected using a roadside intersection perception system (IPS).

V2X-Sim (2022) [110] is a simulated dataset containing multimodal data from vehicles and roadside units, enabling wide-ranging V2X communication.

Rope3D (2022) [111] involves a high diversity of roadside data, focusing on capturing 3D objects with varying camera positions, specifications, viewpoints, and environments.

A9-Dataset (2022) [112] uses roadside gantry bridges to provide an infrastructure-based high-view dataset, supporting real-world traffic scenario.

OPV2V (2022) [113] focuses on inter-vehicle communication, containing simulated frames from vehicles.

DAIR-V2X (2022) [114] is a large-scale real-world dataset containing frames from both vehicles and infrastructure, focusing on synchronized camera and LiDAR data.

V2XSet (2022) [85] focuses on real-world noise within V2X communication. This dataset was proposed alongside V2X-ViT, which focused on robustness against noisy conditions, and thus enables building CD models that can handle similar environments.

DOLPHINS (2023) [115] is a simulated V2X-focused dataset including data from vehicles and roadside infrastructure, including dynamic weather conditions across the set.

LUCOOP (2023) [116] contains multimodal real-world data, collected by three vehicles. It focuses on highly precise and meticulous measurements of the vehicles and the surrounding environment.

V2V4Real (2023) [117] includes real-world data focused on V2V applications, collected by two sensor-ridden vehicles in diverse driving scenarios.

V2X-Seq (2023) [56] is a real-world sequential dataset, focusing on the temporal ordering of collected data. It enables object tracking and trajectory planning, rather than just object detection.

DeepAccident (2024) [118] is a simulated dataset focusing on the safety aspect of V2X communication. This dataset includes wide-ranging collision accidents, enabling the accident prediction task.

HoloVIC (2024) [119] provides large-scale, multi-view “holographic” intersection sensing with dense multi-sensor layouts, using real-world sensors to create these layouts.

TUMTraf (2024) [120] focuses on infrastructure data in challenging scenarios like overtaking and u-turns.

RCooper (2024) [30] is a roadside perception dataset that involves multiple agents, covering a larger area and enabling CD communication tasks with a broader network.

Multi-V2X (2024) [121] introduces a large-scale multimodal CP dataset with varying vehicle penetration rates, enabling realistic evaluation of V2X systems under partial connectivity conditions.

V2X-R (2025) [48] also provides a proprietary dataset focusing on the 4D radar modality. This dataset includes a combination of camera images, LiDAR point clouds, and 4D radar point clouds.

V2XScenes (2025) [122] focuses on challenging scenarios in V2X communications. This dataset includes real-world data obtained from both vehicles and roadside infrastructure.

AGC-Drive (2025) [29] is an aerial-ground collaborative dataset containing real-world data gathered from two vehicles and one Unmanned Aerial Vehicle (UAV). This data enables V2D communication, bringing an additional layer to V2X systems.

CATS-V2V (2025) [123] contains V2V-focused real-world data, collected by two vehicles. It focuses on temporal consistency, as well as adverse traffic scenarios.

WHALES (2025) [124] centers around high amounts of agents in each scenario. This enables the agent scheduling task, determining which agents should communicate, what they should send, and when they should send information under communication constraints.

V2X-Radar (2026) [125] focuses on combining 4D radar with the other modalities of LiDAR and camera in a real-world setting, containing adverse weather and dark environment conditions.

UrbanIng-V2X (2026) [126] introduces a real-world dataset spanning multiple intersections with multi-vehicle and multi-infrastructure sensor data. This dataset enables realistic evaluation of complex urban V2X scenarios.

V2U4Real (2026) [41] is another large-scale real-world vehicle and UAV dataset, which contains data gathered from UAVs and ground vehicles.

Table 6. Comparison of datasets for CD. The modality column represents the type of data in the set, where L is LiDAR, C is Camera, and R is 4D Radar. In the task column, OD is Object Detection, AP is Accident Prediction, AOD is Aerial Object Detection, and AS is Agent Scheduling. The V2X category represents the forms of communication covered by the dataset, where V is V2V, I is V2I, and D is V2D.

Dataset	Year	V2X	Modality	Task	Real-World	Open-Source
[30,108,109,111,112,119,120]	2021–2024	I	L, C	OD	✓	✓
[113]	2022	V	L, C	OD	✗	✓
[85,110,115,121]	2022–2024	V, I	L, C	OD	✗	✓
[56,114,116,126]	2022-2026	V, I	L, C	OD	✓	✓
[117,123]	2023–2025	V	L, C	OD	✓	✓
[118]	2024	V, I	L, C	AP	✗	✓
[48]	2025	V, I	L, C, R	OD	✗	✓
[122]	2025	V, I	L, C, R	OD	✓	✗
[124]	2025	V, I	L, C	AS	✗	✓
[29,41]	2025–2026	D	L, C	OD, AOD	✓	✓

Note: ✓ indicates that the dataset satisfies the corresponding criterion, while ✗ indicates that the dataset does not satisfy the corresponding criterion.

5. Cybersecurity Threats and Defense Mechanisms in CD

This section examines security vulnerabilities in CD systems and reviews recent research on mitigating these threats. We focus on our CD architecture layers: perception, communication and fusion, and planning and control. Each vulnerability is organized into attack and defense frameworks, detailing the types of threats and the corresponding mitigation strategies. Figure 3 is a high-level overview of the layers and their vulnerability, and this section approaches each layer with a more in-depth lens.

5.1. Threat Model

The threat model for CD systems consists of three main categories: attacker type, knowledge, and objective. Figure 4 is a brief overview of this model.

5.1.1. Attacker Type

CD environments involve two primary types of attackers: internal and external [127]. Internal (insider) attackers can communicate directly with other nodes in the network, whereas external (outsider) attackers lack such direct access. Insider attackers are generally more dangerous, as they possess greater knowledge of the system and more opportunities to compromise its operations. External attackers execute attacks that do not require authentication [10], such as jamming or spoofing, which exploit signal transmissions. They may also attempt to gather information through eavesdropping. Insider attackers, in contrast, can carry out more severe threats to vehicle safety [10]. They are capable of launching Sybil, DoS, and replay attacks, among others. With insider credentials, attackers can inject false messages, manipulate cooperative decision-making, and otherwise disrupt the vehicular network in ways unavailable to outsiders.

5.1.2. Attacker Knowledge

Attacker knowledge refers to the amount of information an adversary possesses about the target system, including its architecture, algorithms, and training data [128]. Security analyses often consider multiple knowledge assumptions to capture both realistic and worst-case scenarios. Attacker knowledge is commonly categorized into three classes: white-box, black-box, and gray-box.

In the white-box scenario, the attacker has complete information about the targeted vehicular system [128]. Although this scenario may be less representative of real-world threats (since an attacker rarely has full knowledge of a system), it remains valuable for analyzing worst-case vulnerabilities and informing robust defense strategies.

Conversely, in the black-box scenario, the attacker has no access to the internal structure of the system [129]. Here, the adversary can only observe the inputs and outputs of the target model, often relying on probing and empirical observations rather than direct manipulation of internal components.

The gray-box scenario represents the most common real-world setting [130]. In this case, the attacker possesses partial knowledge of the system, such as specific message formats or perception pipelines, but lacks full access. Gray-box models provide a realistic basis for analyzing average-case attacks, particularly in CD contexts where attackers may exploit limited system knowledge to compromise safety or decision-making.

5.1.3. Attacker Objective

Attackers can also be categorized as malicious or rational, based on their behavior and objectives [10]. This classification provides insight into the type, persistence, and intent of attacks in vehicular networks.

Malicious attackers aim primarily to disrupt, damage, or destabilize the vehicular communication network without seeking personal benefit [127]. Their objective is often to degrade system reliability or cause widespread disruption. Such attackers may target the availability and integrity of V2X communication services. For instance, Denial-of-Service (DoS) or Distributed DoS (DDoS) attacks can overwhelm network nodes or communication channels, preventing vehicles from receiving critical safety messages [10]. Other malicious strategies include message flooding and false information injection, which can compromise CP and decision-making processes. Because CD relies on timely and accurate information exchange between vehicles and infrastructure, these attacks can significantly degrade system performance and create unsafe driving conditions.

In contrast, rational attackers are motivated by specific incentives or benefits rather than purely destructive intent [127]. They act strategically to maximize gain while minimizing the risk of detection. In CD environments, rational attackers may exploit communication systems to obtain financial rewards, sensitive information, or strategic advantages. Common examples include eavesdropping on V2X messages to collect private vehicle information, e.g., location, or speed, and manipulating data to influence navigation systems or traffic management for personal benefit. Rational adversaries generally favor stealthier attacks that allow prolonged exploitation without obvious disruption.

Distinguishing between malicious and rational attackers is critical for modeling security threats in CD. While malicious actors primarily target system availability and reliability, rational actors often focus on confidentiality and economic gain. Understanding these differing objectives enables the design of tailored detection and mitigation strategies to secure V2X networks effectively.

5.2. Perception Layer

Attacks on the perception layer target the initial stage of the CD pipeline, compromising the vehicle’s ability to accurately sense its environment via RGB cameras, LiDAR, and Radar [5]. Threats at this layer can be broadly categorized into two types: sensor spoofing and temporal attacks. Figure 5 is an overview of security on this layer.

5.2.1. Sensor Spoofing

Sensor spoofing is a low-level attack that primarily targets LiDAR and Radar sensors [9]. In this attack, adversaries generate phantom obstacles by manipulating the timing and angles of laser pulses to deceive LiDAR, or by transmitting false signals to mislead Radar. Sensor spoofing can also propagate beyond the initial node after fusion. Before fusion, corrupted LiDAR, radar, or camera inputs mainly affect local perception [131]. In early fusion, spoofed raw data can contaminate the shared scene representation; in intermediate fusion, corrupted feature maps may be harder to inspect because the attack is embedded in learned representations; and in late fusion, false detections or messages can bias downstream planning [50,79,86].

Defensive strategies against sensor spoofing focus on validating incoming sensor data and detecting anomalies that indicate malicious manipulation [9]. Techniques such as consistency checks, sensor fusion, and anomaly detection have been proposed to enhance the robustness of perception systems against these attacks.

Attacks: Shin et al. [131] demonstrated an early LiDAR spoofing attack, injecting laser signals to cause the perception system to record phantom reflections. This manipulation can lead to false obstacle detections, missed detections of real obstacles, or partial blindness of the perception system. Building on this work, Cao et al. [64] transmitted carefully timed laser pulses toward the target LiDAR sensor, causing it to register incorrect distance measurements that are subsequently used by the perception model.

Tu et al. [132] developed a LiDAR spoofing system for more practical scenarios, targeting vehicles that were distant and moving at high speed. This attack employed high-precision laser emitters to manipulate the victim sensor from long ranges. Nagata et al. [49] proposed SLAMSpoof, which specifically targeted LiDAR-based localization, i.e., the process of determining a vehicle’s precise position and orientation using LiDAR data. This attack injected falsified point cloud data into the target sensor, effectively compromising the Simultaneous Localization and Mapping (SLAM) system and demonstrating the real-world feasibility of sensor spoofing. Extending this work, Nagata et al. [133] developed D-SLAMSpoof, capable of operating in more dynamic environments by gradually altering the injected data to shift the perceived position over time. Yahia et al. [134] introduced the use of mirrors to disrupt LiDAR laser pulses, corrupting the resulting point cloud output.

Spoofing attacks against Radar systems are also prevalent. Komissarov and Wool [135] demonstrated an attack that transmitted carefully crafted radar signals to manipulate the victim radar’s distance and velocity measurements, potentially triggering emergency braking or acceleration responses in autonomous vehicles. Vennam et al. [136] introduced a reflect array attack, capturing emitted radar waves and modulating them to inject ghost objects. Ordean and Garcia [137] showcased multiple spoofing techniques against 76–78 GHz radar systems, including object injection and noise disruption. Hunt et al. [138] proposed a black-box spoofing attack that did not require knowledge of the underlying radar architecture. Finally, Zhu et al. [139] utilized low-cost reflective tiles to interfere with radar perception, highlighting the feasibility of cost-effective physical-layer attacks.

Defenses: Several methods have been proposed to detect and mitigate LiDAR spoofing attacks. Sun et al. [140] introduced CARLO, a system that ensures LiDAR point clouds follow expected physical patterns (for example, verifying that laser pulses stop at detected objects as anticipated) and flags any deviations as potentially spoofed. Cho et al. [141] proposed ADoPT, which leverages temporal consistency across LiDAR scans to detect abrupt or incoherent changes in object structure over time. In the context of SLAM spoofing, Nagata et al. [133] developed ISD-SLAM, which compares LiDAR SLAM estimates with onboard sensor measurements, identifying sudden deviations as signs of spoofing. Khattab et al. [142] presented a decision-tree-based detector that distinguishes spoofed LiDAR measurements from legitimate point clouds by analyzing characteristic patterns in the sensor data.

Several approaches have been proposed to defend against radar spoofing attacks. Wu et al. [143] introduced a collaborative framework in which multiple radars share estimated range, velocity, and angle/location of detected objects. The system performs cross-verification across radars to identify inconsistencies indicative of spoofing. Zhou et al. [144] proposed a method that transforms radar signals into a time–frequency distribution, where legitimate reflections form structured patterns while interference or spoofed signals appear as irregular streaks, enabling easy detection. Finally, Akhtar et al. [145] developed AOHDL, a deep learning-based solution that extracts radar features from coherent pulse maps using specialized neural networks and employs a neural classifier to distinguish between genuine and manipulated radar signals.

5.2.2. Temporal Attacks and Defenses

Temporal attacks manipulate the ordering, synchronization, or continuity of sensor data streams, exploiting the time-sensitive nature of CD systems [14]. A specialized form of temporal attack is the misalignment attack, in which an adversary intentionally introduces delays in perception processing. This causes the vehicle to operate on outdated or misaligned environmental information, potentially leading to incorrect decisions or unsafe maneuvers.

Attacks: Shariar et al. [14] introduced DejaVu, an attack that induces delays across multiple sensor streams, causing temporal misalignment. This misalignment results in data from different streams being incorrectly synchronized, degrading perception performance. Finkenzeller et al. [15] demonstrated another form of temporal misalignment attack by injecting controlled delays between sensors, showing that detection performance declines steadily as the delay increases.

Defenses: Shahriar et al. [146] proposed AION, a framework designed to mitigate temporal misalignment attacks. AION detects inconsistencies across sensor streams by analyzing temporal continuity and alignment, effectively defending against attacks such as DejaVu. Liu et al. [16] introduced a physics-aware spatiotemporal defense framework that enforces physically plausible trajectories for detected objects across frames, providing robust protection against temporal attacks. Mu et al. [147] developed a multi-frame comparison mechanism that monitors data over time, ensuring temporal consistency and defending the AD system against time-based inconsistencies.

5.3. Communication and Fusion Layer

The communication and fusion layer enables vehicles and infrastructure to exchange and integrate information across the network [3]. While this collaboration enhances situational awareness and decision-making, it also introduces significant security vulnerabilities. In particular, this layer is susceptible to a range of attacks, including jamming, Sybil attacks, and DoS [10]. Because the fusion module aggregates data from multiple sources, malicious or compromised inputs can propagate across the network and simultaneously impact multiple vehicles. Furthermore, the reliance on wireless communication and multi-agent sensing in V2X systems introduces additional attack surfaces, increasing the risk of large-scale and coordinated adversarial disruptions. Figure 6 is an overview of security on this layer.

5.3.1. Jamming

Radio jamming involves the intentional transmission of interfering signals over the same frequency band as vehicular receivers, with the goal of disrupting communication between vehicles and infrastructure [55]. Such attacks can significantly degrade or completely block communication, undermining the reliability of V2X systems. In AD contexts, jamming attacks can lead to traffic slowdowns or complete stoppages [33]. More critically, they prevent vehicles from accessing essential information required for safe decision-making. For instance, vehicles may be unable to detect the presence or position of nearby agents during safety-critical maneuvers, such as navigating blind intersections, thereby introducing substantial safety risks.

Attacks: Souse et al. [17] proposed a cross-technology jamming attack on DSRC (IEEE 802.11p) communication systems using a C-V2X transmitter, demonstrating effective signal degradation across heterogeneous communication standards. Arif and Kim [18] introduced a clustered jamming strategy leveraging UAVs, where multiple coordinated jammers synchronize their transmissions to maximize disruption. Talukder and Xie [19] proposed symbiotic jamming attacks on NR-V2X systems, in which the jammer exploits the transmission patterns of legitimate V2X nodes to inject interference selectively during critical packet exchanges, thereby increasing efficiency and stealth.

Defenses: Bousalem et al. [20] developed a deep learning-based intrusion detection system for identifying jamming attacks in 5G-V2X environments, where the model learns to distinguish normal communication patterns from jamming-induced anomalies. Krayani et al. [21] proposed a detection framework based on a Generalized Dynamic Bayesian Network, which models the V2X environment to identify disruptions caused by jamming. Pourranjbar et al. [148] introduced a neural network-based approach to predict jammer behavior and forecast channel usage, enabling proactive mitigation, particularly in scenarios involving multiple coordinated jammers.

5.3.2. Message Spoofing

Message spoofing involves the transmission of falsified signals or messages to deceive CAVs [33]. In such attacks, adversaries impersonate legitimate entities to inject misleading information into the network, causing receivers to trust and act upon incorrect data [33]. As an emerging and critical threat, message spoofing undermines the integrity and reliability of V2X communication, posing significant risks to the safety of CD systems [36].

Attacks: Ansari et al. [149] developed a comprehensive message spoofing framework capable of falsifying Basic Safety Messages (BSMs) in V2X systems, supporting 68 distinct spoofing strategies. This framework enables attackers to inject forged yet legitimate-looking BSMs into the network, deceiving nearby vehicles and influencing their perception of the environment. By encompassing a wide range of attack scenarios, this work highlights the flexibility and severity of message spoofing threats in CD environments.

Defenses: Boualouache and Engel [150] surveyed ML-based misbehavior detection systems for vehicular networks, highlighting their ability to monitor message content and GPS signals to ensure validity and detect anomalies, including message spoofing. Silva et al. [151] proposed an AI-based framework that compared the Direction of Attack estimated transmitter location with visual vehicle detection results, enabling the system to identify spoofed messages corresponding to non-existent vehicles. Greco et al. [152] introduced a physical-layer defense mechanism leveraging a graph neural network to analyze network communication patterns and physical-layer features, ensuring the authenticity of received messages and protecting the V2X network from spoofing attacks.

5.3.3. Eavesdropping

Eavesdropping attacks involve intercepting communications between vehicles and infrastructure to acquire sensitive information, such as user data or vehicle locations [10,33]. These attacks pose significant privacy risks and may enable adversaries to exploit confidential information. Notably, within the eavesdropping domain, research has predominantly focused on developing defenses rather than modeling eavesdropping attacks.

Defenses: Li et al. [153] proposed a deep reinforcement learning approach to mitigate eavesdropping attacks, employing a soft actor–critic algorithm to iteratively adapt the network to dynamic environmental conditions while maintaining secure connections. Gu et al. [154] introduced a hybrid framework combining deep learning for authentication of legitimate nodes with reinforcement learning to dynamically optimize the model’s effectiveness over time. Mamun et al. [155] evaluated the use of homomorphic encryption, enabling computations on encrypted data without decryption, thus protecting sensitive information from eavesdroppers. Finally, Pan et al. [156] proposed a secure semantic communication system, which transmits task-relevant information rather than raw data, integrating secure semantic encoding and trust management to defend against eavesdropping in next-generation vehicular networks.

5.3.4. DoS/DDoS

Denial-of-Service (DoS) attacks aim to prevent legitimate users from accessing network services critical to vehicular communication and AD functionality [10]. Distributed Denial-of-Service (DDoS) attacks amplify this threat by originating from multiple locations simultaneously, complicating detection and mitigation. These attacks overwhelm the network with excessive requests, thereby disrupting the regular flow of information and denying service to legitimate vehicles and infrastructure [33]. In the context of V2X systems, DoS and DDoS attacks have the potential to incapacitate entire communication networks, posing severe safety and operational risks to AD systems.

Attacks: Trkulja et al. [157] introduced and evaluated DoS attacks of varying sophistication. They categorized the attacks into three types: oblivious attacks, which randomly select resources; smart attacks, which monitor channel activity to target active blocks; and cooperative attacks, which coordinate multiple attackers to maximize interference. Twardokus and Rahbari [158] proposed two DoS attacks designed to jam vehicles’ BSM systems and degrade V2X network performance, effectively preventing legitimate vehicles from transmitting critical safety messages. Finally, Tine et al. [159] presented DoS attacks targeting the forward collision warning system in autonomous vehicles. These attacks relied on flooding User Datagram Protocol (UDP) and BSM packets, carefully crafted to comply with 3GPP and SAE protocol standards, highlighting the practical feasibility of protocol-compliant DoS attacks.

Defenses: Jayakrishna and Prasanth [160] proposed a deep learning and reinforcement learning-based framework to detect and mitigate DDoS attacks. Their model captured spatiotemporal patterns of network traffic to identify abnormal communication indicative of DDoS activity, while the reinforcement learning component dynamically adjusted network flow to mitigate ongoing attacks. Yigit et al. [161] developed a defense mechanism for DDoS attacks in V2I communication using digital twin simulations, comparing real-world behavior with a virtual twin to detect anomalies caused by potential attacks. Sadaf et al. [162] presented a fuzzy logic-based approach to detect and mitigate DoS attacks in V2X networks. This framework analyzed message characteristics such as frequency, entropy, and timestamps, identifying abnormal patterns that signaled ongoing DoS activity.

5.3.5. Replay

Replay attacks occur when legitimate messages are recorded and retransmitted at a different time or location [34]. In V2X systems, this can cause vehicles to respond to outdated or fabricated information, potentially triggering inappropriate decisions [33].

Attacks: Sohail et al. [163] evaluated the susceptibility of V2X communication to replay attacks, demonstrating that DSRC-based networks can be compromised if message freshness is not properly verified. Oza et al. [164] presented a replay attack targeting messages exchanged between sensors and traffic control systems. In this scenario, retransmitted messages caused the controller to incorrectly detect vehicle presence or absence at intersections, resulting in improper traffic light scheduling and potential safety hazards.

Defenses: Oza et al. [164] proposed a framework to detect replay attacks in traffic control systems by leveraging an expected physical model of traffic flow and signal behavior. Incoming sensor data that deviated from the system’s predicted behavior was flagged as a potential replay attack. Dai et al. [165] employed timestamp-based Hash-based Message Authentication Code verification, ensuring that each message contained a fresh timestamp and authentication log, providing protection against replay attacks with minimal communication overhead. Huo et al. [166] introduced a hash chain-based authentication framework, where each message depends on the hash of the previous message, effectively preventing the reuse of previously captured messages. Yang et al. [167] developed a credential-based anonymous authentication system, which continuously updated vehicle credentials and incorporated a revocation mechanism to maintain secure communication.

5.3.6. Sybil

Sybil attacks involve an adversary creating multiple fake identities or nodes within a vehicular network [36]. These counterfeit nodes can overwhelm communication channels, increase network traffic, and manipulate cooperative decision-making processes. Due to the reliance of CD systems on shared information and trust among vehicles, Sybil attacks are particularly feasible and pose a severe threat to the integrity and efficiency of V2X communications.

Attacks: Guven and Taysi [168] investigated Sybil attacks in V2X communication by simulating various attack scenarios and introducing a dataset to represent these threats more realistically. They implemented four primary strategies, including generating attack packets with random values and packets following a structured grid pattern. Azam et al. [169] developed a Sybil attack model aimed at supporting defense research. In this model, the attacker could generate multiple coordinated fake identities, allowing the malicious nodes to act in unison and manipulate the network.

Defenses: Azam et al. [169] proposed a defense framework against Sybil attacks that leveraged collaborative learning, allowing communication nodes to share insights and build a global model for detection. Tadesse et al. [170] developed a multi-factor authentication system that verified ID number, status, security key, and speed to ensure the legitimacy of nodes. Morton et al. [171] introduced TASER, a trust-aware Sybil detection system that operates without relying on external infrastructure and employs dual antennas to incorporate directionality into verification. Finally, Baza et al. [172] applied a proof-of-work mechanism to roadside units, increasing the difficulty of creating fake identities.

5.3.7. Ransomware

Ransomware attacks involve using malware to infect a computer system to extort the user [173]. In the context of AD, these attacks can threaten the privacy and safety of vehicles and users, by restricting access to vehicle functions, connected services, or supporting infrastructure [174]. These attacks could also play a significant role in damaging the communication between vehicles, disrupting key V2X nodes and damaging CD systems as a whole [175].

Attacks: Bajpai et al. [174] present a proof-of-concept ransomware attack by exploiting exposed services within the vehicle, launching ransomware payloads. The attack was an early demonstration that ransomware could threaten vehicle availability and privacy even without directly attacking driving-control components. Malik et al. [176] developed a series of ransomware attacks on CD systems, considering the attack vectors of Wi-Fi hotspot, physical connection, and malicious update. They then investigated how these attacks propagated through the CD network, infecting other vehicles.

Defenses: Malik et al. [176] proposed a two-stage mitigation framework for their own attacks. The first stage scanned network activity for abnormal activity, while the second stage used a k-nearest neighbors algorithm to detect an attack and close network connections if one is detected. Alsharabi et al. [177] proposed a reverse-engineering-based ransomware defense approach in which ransomware samples were analyzed to find distinct indicators. These indicators were then used to detect and create countermeasures against these attacks.

5.4. Planning and Control Layer

The planning and control layer is responsible for trajectory prediction and high-level decision-making and low-level execution in autonomous vehicles [178]. This stage relies heavily on machine learning models, making it vulnerable to various ML-specific threats. Adversarial attacks are the broad description of ML attacks. We distinguish them into inference-time (evasion) and training-time (data poisoning and backdoor) attacks. Additionally, LM integration creates a host of new vulnerabilities, such as prompt injection, which fall under this layer. Finally, the control and actuation component is vulnerable to the propagation of attacks through previous layers. Figure 7 is an overview of security on this layer.

5.4.1. Evasion Attacks

Evasion attacks are among the most common threats faced by CD models [179]. These attacks involve introducing subtle but strategically crafted perturbations to the input data of a machine learning model. In the context of V2X AD, such perturbations can degrade the accuracy of inferences, leading to misclassification of objects, incorrect trajectory predictions, or unsafe control decisions at inference time. The primary risk of evasion attacks lies in their stealth: perturbations are often imperceptible to humans but can cause significant deviations in model behavior, compromising the safety and reliability of AD systems.

Attacks: Tao et al. [12] introduced the Blind Area Confusion (BAC) attack, which targets the perception system’s least confident regions. By mimicking actual vehicles and gradually injecting optimized perturbations, this attack caused inaccuracies in downstream tasks such as sensor fusion and trajectory planning. Tao et al. [13] developed an attack leveraging a Mutual View Information Graph (MVIG), which encodes known vulnerabilities in a CD environment. Patterns learned from the evolving MVIG allowed the attacker to plan effective strategies against the network over time. Tu et al. [180] proposed a multi-agent evasion attack, where perturbed transmissions between vehicles accumulate over time, exploiting temporal redundancy to maximize impact. Zhang et al. [181] introduced evasion attacks specifically designed for different fusion models in CD. These attacks focused on data fabrication, injecting perturbations that led to false object detection and misinformed decision-making.

Defenses: Tao et al. [12] proposed a defense framework against evasion attacks, including the BAC threat. This approach integrated both spatial and temporal consistency checks into a unified system. Hu et al. [182] introduced the Probability-Agnostic Sample Consensus method, which randomly samples subsets of collaborating agents to reach consensus with the ego vehicle’s perception. Agents that deviate from consensus are flagged as potentially malicious and removed from the network. Zhao et al. [183] developed Malicious Agent Detection, targeting evasion attacks against multi-agent collaborative object detectors by identifying and eliminating malicious participants injecting perturbations. Zhang et al. [181] proposed a Collaborative Anomaly Detection system, where each vehicle shares an occupancy map indicating free and occupied spaces. Cross-vehicle validation is then used to detect abnormalities caused by data fabrication or adversarial perturbations.

5.4.2. Data Poisoning

Data poisoning is a training-time attack on ML models, where an adversary deliberately manipulates the training data to degrade model performance [184]. These attacks can affect various labels and categories within the dataset, leading to a significant drop in prediction accuracy and overall reliability. It is important to distinguish data poisoning from evasion attacks: while evasion attacks target model inputs after training to induce immediate mispredictions, data poisoning corrupts the training process itself, causing the model to learn incorrect behaviors that manifest during deployment.

Attacks: Patel et al. [185] proposed Bait and Switch, a real-world data poisoning attack targeting learning in AD. This attack involved placing electric billboards near traffic signals and displaying images synchronized with the lights. Vehicles trained on these images would then learn misleading behaviors. At deployment, the attacker could switch the order of the images, causing incorrect reactions from the vehicle. Garg et al. [186] introduced FLStealth, a poisoning attack against federated learning (FL) in CD systems. FLStealth created two models: an honest model and a byzantine model. The byzantine model was trained to subtly degrade overall performance while remaining close to the honest model, making detection difficult.

Defenses: Bataineh et al. [187] proposed a framework leveraging explainable AI (XAI) to detect data poisoning. The system exploits XAI’s sensitivity to small changes in input data; by analyzing shifts in explanations, it can identify when training data has been tampered with. Chaabene et al. [188] developed a defense against FL poisoning attacks in AD. Their framework detects anomalies caused by inconsistent data labeling, effectively preventing significant drops in model accuracy under poisoning. Finally, Kabir et al. [189] introduced FLShield, a general FL poisoning defense framework. Although not specifically designed for CD, FLShield combines multiple local models into representative global models and validates them to ensure robustness. This approach demonstrates strong performance against diverse poisoning strategies, making it applicable to federated learning-based CD systems.

5.4.3. Backdoor Attacks and Defenses

Backdoor attacks create hidden vulnerabilities in ML model, typically in the form of specific triggers or patterns [26]. When this trigger is present in the input, the model produces an attacker-specified output or action. Executing a backdoor attack generally requires access to the training data and procedure, which may be difficult in black-box scenarios. However, if the attacker gains such access, these attacks become dangerous and difficult to detect, as the malicious behavior remains dormant until the trigger is activated.

Attacks: Garg et al. [186] proposed an Off-Track Attack, injecting a square pattern trigger into data. This trigger forced the car to turn, and had a significant impact on model accuracy. Chen et al. [190] used a temporal trigger, injecting false trajectories forcing the ego vehicle to take a malicious action. Zhang et al. [191] presented the BadLANE attack, with mud element-inspired triggers, randomly generating and injecting patterns into data. This attack showed high success rates, as well as high robustness under changing conditions. Liao et al. [192] also targeted the lane detection task in their DBALD attack, involving three strategies: offsetting the lane, removing lane boundaries, and rotating the lane.

Defenses: Kumar et al. [193] proposed SecFedDrive, a framework to defend against FL backdoor attacks for AD. This method involved two security layers: Residual Check and Neural Cleanse. Residual Check blocked images with pixel triggers from being injected into the data, while Neural Cleanse worked as an anomaly detection system. Together, these measures were able to reduce attack efficiency and effectiveness. Wang et al. [194] developed Backdozer, a detection system for backdoor attacks. This system involved extracting abstract features from data into a latent subspace, and test the structure of these features. Benign data would form a consistent structure in latent space, while backdoor data would distort the structure. Finally, Kumari et al. [195] proposed a Bayesian backdoor defense against FL backdoor attacks. This method involved modeling client updates probabilistically, using Bayesian modeling to capture variations in clients’ weights. This allowed the framework to filter out malicious updates without hurting accuracy. Although this framework was not specifically intended for FL defense in an AD or CD context, it is still extremely applicable to models that use FL.

5.4.4. Latency Attacks and Defenses

Latency attacks are an emerging subset of adversarial threats in CD [11]. These attacks aim to disrupt the availability of CD. While they use perturbations similar to those in evasion attacks, the objective is different: instead of causing incorrect predictions, latency attacks slow down the perception or decision-making system, rendering it less effective. Early studies indicate that this class of attacks can significantly degrade system performance. Latency attacks in the planning and control layer differ from temporal attacks in the perception layer. Whereas temporal attacks target the alignment and timing of sensor streams, latency attacks focus on the ML models themselves, aiming to slow down collaborative decision-making across vehicles rather than individual vehicle perception.

Attacks: Wang et al. [11] conducted the first study on latency attacks in CD. They introduced CP-FREEZER, an attack that leveraged adversarial V2V messages to intentionally slow down the CD pipeline, introducing critical delays. Experimental results demonstrated a 90-fold increase in latency and a 3 s processing time per frame, emphasizing the severity of such attacks. Additionally, Ma et al. [196] proposed SlowPerception, an attack that generates numerous phantom objects via projected perturbations, overloading perception algorithms and slowing down processing.

Defenses: Given the recent emergence of latency attacks on ML-based CD systems, there are currently no dedicated defenses against this threat. However, some existing adversarial defense frameworks could potentially mitigate latency attacks. For instance, the framework proposed by Tao et al. [12], which incorporates temporal history to defend against adversarial attacks, could be adapted to flag abnormal temporal patterns caused by latency attacks. Similarly, Hu et al.’s [182] consensus-based framework identifies and removes malicious agents in the network, ignoring their contributions, which may help reduce the impact of latency-inducing inputs.

5.4.5. Language-Based Attacks and Defenses

LM vulnerabilities are crucial and increasingly important threats in the planning and control layer. This category covers various types of vulnerabilities. For example, prompt injection attacks involve the insertion of malicious instructions into user input, causing LMs to behave unexpectedly or undesirably [26]. These attacks pose a significant security risk for language-based models, as LMs cannot inherently distinguish between legitimate user input and instructions from developers. In addition to prompt injection, there have been some training-time attacks developed against language-enabled CD systems.

Attacks: Burbano et al. [197] introduced CHAI, a prompt injection attack that embeds malicious text in the real world targeting text prompts in the underlying LLM. For instance, this attack could manipulate road signs to be interpreted as instructions by the LLM, successfully hijacking driving decisions and causing unsafe behaviors such as incorrect turns or collisions. CHAI was the first prompt injection attack specifically targeting LLMs in AD, marking a significant step in this research area. Liu et al. [198] proposed PINA, which targeted the planning and reasoning outputs of LM-based systems. This attack injected prompts that altered navigation decisions through a three-component system consisting of an attack evaluator, a distribution analyzer, and a prompt refiner. Together, these components enabled the attacker to degrade the performance of LM-based navigation systems via prompt injection. Long and Li [199] proposed FuncPoison, a novel poisoning attack targeting LLM-integrated CD systems. FuncPoison injected malicious text-based prompts into the LLM reasoning module, biasing outputs and hijacking the model. Experiments demonstrated high effectiveness and stealth, highlighting it as a significant emerging threat to LLM-based CD. Finally, Ni et al. [200] proposed an attack against VLMs in AD, targeting the visual inputs of these models. They presented BadVLMDriver, which used common physical items (for example, a red balloon) to cause malicious actions. This was a significant departure from other backdoor attacks, which used patterns in their data to induce these actions. This attack showed an extremely high effectiveness, with a 92% success rate for certain triggers. This paper showed the emerging threat of backdoor attacks against VLMs and other LMs in AD, emphasizing the need for defense against these attacks.

Defenses: Lu et al. [201] proposed ARGUS, a comprehensive defense framework against prompt injection in multimodal LLMs. This system safeguards decision-making by detecting malicious prompts and preserving trustworthy inputs, guiding the model toward safe and accurate outputs. While ARGUS was not specifically designed for AD or CD systems, its principles could still be applied effectively in such contexts. Similarly, Wang et al. [202] developed AegisAgent, a three-layered defense framework. The first layer is an input sanitizer, which removes anomalies and interferences; the second is a consistency verifier, detecting deeper manipulations within the model; and the third is a robust reasoner, ensuring accurate decision-making even under prompt injection attacks. Although AegisAgent was not originally aimed at AD, it could be adapted to defend language-model-enabled CD systems.

5.4.6. Cross-Layer Propagation of Attacks

The control and actuation stage within the planning and control layer is vulnerable to the propagation of attacks from previous layers [36]. This cross-layer propagation results from attacks in the perception and the communication and fusion layers. Additionally, attacks against the planning and decision-making process also propagate down to the control stage. Islam and El-Wakeel [203] conducted tests on this cross-layer propagation. Their experimental results showed how attacks on previous layers caused delayed control, incorrect trajectories, and traffic violations. Eslami and Yu [204] performed a systematic analysis on cross-layer threats in agentic AVs, modeling how small upstream attacks can result in major disruptions on the control system. Ultimately, cross-layer propagation is still an emerging area of research, but it plays a critical role in accurately modeling real-world threats to CD systems.

6. Research Gaps and Future Directions

Despite significant progress in CD, security, particularly in LM-enabled AD systems, remains an emerging and underexplored area. Several critical research gaps merit attention, both from a practical and theoretical perspective.

1. Cybersecurity for 6G-enabled C-V2X Communication: The integration of 6G technology into vehicular networks promises unprecedented bandwidth, ultra-low latency, and support for massive heterogeneous devices [27]. However, current cybersecurity solutions remain largely incremental extensions of 5G mechanisms and fail to account for the unique characteristics of 6G networks, including highly dynamic topologies, AI-driven network management, and ultra-dense deployments. Future research should focus on:

Designing comprehensive security frameworks specifically for 6G-enabled AD systems.
Identifying novel attack vectors that exploit network intelligence, ultra-high-speed communication, and new physical-layer technologies.
Developing scalable, adaptive defense strategies capable of protecting large-scale, multi-agent vehicular networks.

2. Emerging ML Attacks in CD: ML models in CD are increasingly vulnerable to novel attacks beyond classical adversarial methods. Latency-based ML attacks, such as CP-FREEZER [11], demonstrate that delaying perception or decision-making pipelines can have catastrophic consequences, including collisions. Future work should prioritize:

Designing detection and mitigation strategies for latency attacks and other time-sensitive adversarial threats.
Developing robust ML pipelines that can maintain safety-critical performance under adversarial or delayed inputs.
Exploring hybrid defense strategies that combine consensus mechanisms, anomaly detection, and temporal validation for multi-agent systems.

3. Cross-Layer and End-to-End Security: While individual layer-focused defense frameworks are common, attacks can trickle down through layers. For example, a DoS attack can impact the fusion of data across nodes, which can result in errors in the planning and control layer due to an incorrect trajectory [205]. In this regard, future work should consider:

Developing end-to-end security frameworks that model and mitigate attack propagation across the full CD stack.
Designing cross-layer detection mechanisms that correlate anomalies across multiple layers to improve attack attribution and robustness.
Creating unified threat models that capture interdependencies between layers, enabling analysis of how attacks in one layer can cascade into others.

4. Real-World Testing and Validation of CD Models: Though there have been numerous developments in CD models over the last few years, an important gap is the lack of real-world testing and validation. Existing approaches are evaluated primarily in simulated environments [44,80,85]. While models are shifting towards using real-world datasets such as DAIR-V2X [114], they still evaluate the behavior of their models through simulations rather than in the real world. While this is useful for controlled experimentation, simulated environments fail to fully capture the complexity of real-world driving scenarios. Future work should focus on:

Developing large-scale real-world testbeds that incorporate V2X interactions under realistic conditions.
Incorporating realistic communication constraints, such as latency, packet loss, and bandwidth limitations, into evaluation pipelines.
Creating standardized benchmarks and datasets that include adversarial scenarios, multi-agent interactions, and diverse environmental conditions.

5. Uncertainty-Aware CD: CD systems operate in highly dynamic environments. Uncertainty arises from multiple sources, including sensor noise, occlusions, and communication delays. Despite this, most existing CD models adopt deterministic pipelines that do not account for uncertainty. UNCAP [107] is a work that uses this uncertainty, but this direction is still emerging. Future work should focus on:

Designing fusion frameworks that propagate and aggregate uncertainty across agents and modalities.
Incorporating uncertainty into planning and decision-making processes, allowing vehicles to adopt risk-aware behaviors under ambiguous or unreliable conditions.
Establishing calibration methods to ensure that model confidence aligns with real-world performance.

6. Security of LLM-Enabled CD: The integration of LLMs into CD introduces entirely new attack surfaces. Emerging threats include:

Data poisoning attacks targeting LLM reasoning, such as FuncPoison [199], which bias model outputs by injecting malicious training prompts.
Physical backdoor attacks, such as BadVLMDriver [200], which exploit environmental cues to trigger malicious LLM behavior.
Prompt injection attacks [197,198] that manipulate model instructions in real time, potentially causing unsafe driving decisions.

Currently, there are no dedicated defense frameworks for these threats in the CD context. Future research should aim to:

Develop proactive and reactive defenses against LLM-targeted attacks, combining input sanitization, anomaly detection, and robust reasoning.
Investigate secure training and update mechanisms for federated LLM systems in multi-agent environments.
Establish benchmark datasets and evaluation protocols for assessing LLM security in CD.

Addressing these gaps is essential not only for safeguarding CD systems but also for enabling safe, scalable deployment of AI and LLM-based technologies in real-world transportation. By systematically investigating 6G security, novel ML attacks, and LM vulnerabilities, the research community can develop a foundation for trustworthy autonomous driving systems.

7. Conclusions

CD represents a transformative advancement in autonomous transportation, enabling vehicles to leverage shared perception and collective intelligence to enhance safety, efficiency, and robustness. However, as this survey has demonstrated, the integration of V2X communication and emerging language-model-based reasoning significantly expands the attack surface across all layers of the CD pipeline. In this work, we presented a comprehensive taxonomy of security threats and defense mechanisms spanning perception, communication and fusion, planning, and control, while highlighting the unique vulnerabilities introduced by LM integration. By synthesizing existing research across these domains, we identified critical gaps in realistic evaluation and emerging paradigms, including multi-agent adversaries and AI-driven decision-making. Looking ahead, ensuring the security and trustworthiness of CD systems requires a shift from isolated, component-level defenses to holistic, system-level approaches. This entails the development of unified security frameworks, robust multimodal and language-aware defenses, and next-generation communication security tailored to evolving technologies such as 6G. Addressing these challenges is essential to fully realize the potential of CD in real-world, safety-critical environments.

Author Contributions

Conceptualization, S.N. and O.G.; investigation, S.N. and O.G.; project administration, T.R.; supervision, O.G. and T.R.; validation, S.N. and O.G.; visualization, S.N.; writing—original draft, S.N.; writing—review and editing, S.N., O.G. and T.R. All authors have read and agreed to the published version of the manuscript.

Funding

This work has been funded in part by NSF, with award numbers #2112665, #2112167, #2003279, #2120019, #2211386, #2052809, #1911095 and in part by PRISM and CoCoSys, centers in JUMP 2.0, an SRC program sponsored by DARPA.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Yang, K.; Yang, D.; Zhang, J.; Li, M.; Liu, Y.; Liu, J.; Wang, H.; Sun, P.; Song, L. Spatio-Temporal Domain Awareness for Multi-Agent Collaborative Perception. In Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Los Alamitos, CA, USA, 2–6 October 2023; pp. 23326–23335. [Google Scholar] [CrossRef]
Li, J.; Liu, X.; Li, B.; Xu, R.; Li, J.; Yu, H.; Tu, Z. CoMamba: Real-time Cooperative Perception Unlocked with State-Space Models. In Proceedings of the 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hangzhou, China, 19–25 October 2025; pp. 16993–17000. [Google Scholar] [CrossRef]
Yu, H.; Yang, W.; Zhong, J.; Yang, Z.; Fan, S.; Luo, P.; Nie, Z. End-to-End Autonomous Driving Through V2X Cooperation. Proc. Aaai Conf. Artif. Intell. 2025, 39, 9598–9606. [Google Scholar] [CrossRef]
Qu, D.; Chen, Q.; Bai, T.; Lu, H.; Fan, H.; Zhang, H.; Fu, S.; Yang, Q. SiCP: Simultaneous Individual and Cooperative Perception for 3D Object Detection in Connected and Automated Vehicles. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2024, Abu Dhabi, United Arab Emirates, 14–18 October 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 8905–8912. [Google Scholar] [CrossRef]
Huang, T.; Liu, J.; Zhou, X.; Nguyen, D.C.; Rahimi Azghadi, M.; Xia, Y.; Han, Q.L.; Sun, S. Vehicle-to-Everything Cooperative Perception for Autonomous Driving. Proc. IEEE 2025, 113, 443–477. [Google Scholar] [CrossRef]
Liu, S.; Gao, C.; Chen, Y.; Peng, X.; Kong, X.; Wang, K.; Xu, R.; Jiang, W.; Xiang, H.; Ma, J.; et al. Towards vehicle-to-everything autonomous driving: A survey on collaborative perception. arXiv 2023, arXiv:2308.16714. [Google Scholar]
Bai, Z.; Wu, G.; Barth, M.J.; Liu, Y.; Akin Sisbot, E.; Oguchi, K.; Huang, Z. A Survey and Framework of Cooperative Perception: From Heterogeneous Singleton to Hierarchical Cooperation. IEEE Trans. Intell. Transp. Syst. 2024, 25, 15191–15209. [Google Scholar] [CrossRef]
Cui, G.; Zhang, W.; Xiao, Y.; Yao, L.; Fang, Z. Cooperative Perception Technology of Autonomous Driving in the Internet of Vehicles Environment: A Review. Sensors 2022, 22, 5535. [Google Scholar] [CrossRef] [PubMed]
Lippi, G.; Aljawarneh, M.; Al-Naamneh, Q.; Hazaymih, R.; Dhomeja, L. Security and Privacy Challenges and Solutions in Autonomous Driving Systems: A Comprehensive Review. J. Cyber Secur. Risk Audit. 2025, 2025, 23–41. [Google Scholar] [CrossRef]
Sedar, R.; Kalalas, C.; Vázquez-Gallego, F.; Alonso, L.; Alonso-Zarate, J. A Comprehensive Survey of V2X Cybersecurity Mechanisms and Future Research Paths. IEEE Open J. Commun. Soc. 2023, 4, 325–391. [Google Scholar] [CrossRef]
Wang, C.; Song, R.; Muller, R.; Monteuuis, J.P.; Celik, Z.B.; Petit, J.; Gerdes, R.; Li, M. CP-FREEZER: Latency Attacks Against Vehicular Cooperative Perception. Proc. Aaai Conf. Artif. Intell. 2026, 40, 1114–1122. [Google Scholar] [CrossRef]
Tao, Y.; Hu, S.; Hu, Y.; An, H.; Cao, H.; Fang, Y. GCP: Guarded Collaborative Perception with Spatial-Temporal Aware Malicious Agent Detection. IEEE Trans. Dependable Secur. Comput. 2026, 23, 1–14. [Google Scholar] [CrossRef]
Tao, Y.; Hu, S.; An, H.; Fang, Z.; Cao, H.; Fang, Y. Learning Mutual View Information Graph for Adaptive Adversarial Collaborative Perception. arXiv 2026, arXiv:2602.19596. [Google Scholar] [CrossRef]
Shahriar, M.H.; Barat, M.M.A.; Sundar, H.; Zhang, N.; Ramakrishnan, N.; Hou, Y.T.; Lou, W. Temporal Misalignment Attacks against Multimodal Perception in Autonomous Driving. arXiv 2026, arXiv:2507.09095. [Google Scholar] [CrossRef]
Finkenzeller, A.; Roberts, A.; Bellone, M.; Maennel, O.; Hamad, M.; Steinhorst, S. Sensor Fusion Desynchronization Attacks. In Proceedings of the 37th Euromicro Conference on Real-Time Systems (ECRTS 2025), Toulouse, France, 8–11 July 2025. [Google Scholar] [CrossRef]
Liu, Y.; Nie, Z.; Yu, T.; Chen, M.; Yao, Z.; Lu, J.; Peng, L.; Fan, F. Physics-Aware Spatiotemporal Consistency for Transferable Defense of Autonomous Driving Perception. Sensors 2026, 26, 835. [Google Scholar] [CrossRef]
Sousa, B.; Magaia, N.; Silva, S.; Nguyen, H.; Guan, Y.L. Jamming Attack on DSRC Communication Caused by a C-V2X Sidelink Device. In Proceedings of the 2025 IEEE 101st Vehicular Technology Conference (VTC2025-Spring), Oslo, Norway, 17–20 June 2025; pp. 1–7. [Google Scholar] [CrossRef]
Arif, M.; Kim, W. Clustered jamming in U-V2X communications with 3D antenna beam-width fluctuations. Comput. Commun. 2024, 216, 209–228. [Google Scholar] [CrossRef]
Talukder, M.; Xie, J. SymJam: Symbiotic Jamming Attacks on NR-V2X. In Proceedings of the GLOBECOM 2024—2024 IEEE Global Communications Conference, Cape Town, South Africa, 8–12 December 2024; pp. 523–528. [Google Scholar] [CrossRef]
Bousalem, B.; Silva, V.F.; Boualouache, A.; Langar, R.; Cherrier, S. Deep Learning-based Smart Radio Jamming Attacks Detection on 5G V2I/V2N Communications. In Proceedings of the GLOBECOM 2023—2023 IEEE Global Communications Conference, Kuala Lumpur, Malaysia, 4–8 December 2023; pp. 7139–7144. [Google Scholar] [CrossRef]
Krayani, A.; William, N.J.; Marcenaro, L.; Regazzoni, C. Jammer Detection in Vehicular V2X Networks. In Proceedings of the 2022 Microwave Mediterranean Symposium (MMS), Milan, Italy, 9–13 May 2022; pp. 1–5. [Google Scholar] [CrossRef]
Xu, Z.; Zhang, Y.; Xie, E.; Zhao, Z.; Guo, Y.; Wong, K.Y.K.; Li, Z.; Zhao, H. DriveGPT4: Interpretable End-to-End Autonomous Driving Via Large Language Model. IEEE Robot. Autom. Lett. 2024, 9, 8186–8193. [Google Scholar] [CrossRef]
Cui, Y.; Huang, S.; Zhong, J.; Liu, Z.; Wang, Y.; Sun, C.; Li, B.; Wang, X.; Khajepour, A. DriveLLM: Charting the Path Toward Full Autonomous Driving with Large Language Models. IEEE Trans. Intell. Veh. 2024, 9, 1450–1464. [Google Scholar] [CrossRef]
Panyam, S.; Donvir, A.; Paliwal, G.; Gujar, P. Survey of LLMs and AI Agents in V2X: Simulation, Analysis & Architectures. In Proceedings of the 2025 Systems of Signals Generating and Processing in the Field of on Board Communications, Moscow, Russia, 12–14 March 2025; pp. 1–11. [Google Scholar] [CrossRef]
Gao, X.; Lin, T.H.; Song, R.; Wu, Y.; Huang, K.R.; Jin, Z.; Lin, F.; Liu, S.; Tu, Z. SafeCoop: Unravelling Full Stack Safety in Agentic Collaborative Driving. arXiv 2025, arXiv:2510.18123. [Google Scholar] [CrossRef]
Gulyamov, S.; Gulyamov, S.; Rodionov, A.; Khursanov, R.; Mekhmonov, K.; Babaev, D.; Rakhimjonov, A. Prompt Injection Attacks in Large Language Models and AI Agent Systems: A Comprehensive Review of Vulnerabilities, Attack Vectors, and Defense Mechanisms. Information 2026, 17, 54. [Google Scholar] [CrossRef]
Xin, S. 6G-V2X Security: Overcoming Challenges for a Safer, Smarter Transportation Future. Appl. Comput. Eng. 2025, 149, 202–208. [Google Scholar] [CrossRef]
Ying, Z.; Wang, K.; Xiong, J.; Ma, M. A literature review on V2X communications security: Foundation, solutions, status, and future. IET Commun. 2024, 18, 1683–1715. [Google Scholar] [CrossRef]
Hou, Y.; Zou, B.; Zhang, M.; Chen, R.; Yang, S.; Zhang, Y.; Zhuo, J.; Chen, S.; Chen, J.; Ma, H. AGC-Drive: A Large-Scale Dataset for Real-World Aerial-Ground Collaboration in Driving Scenarios. In Proceedings of the The Thirty-Ninth Annual Conference on Neural Information Processing Systems Datasets and Benchmarks Track, Sydney, Australia, 6–12 December 2026. [Google Scholar]
Hao, R.; Fan, S.; Dai, Y.; Zhang, Z.; Li, C.; Wang, Y.; Yu, H.; Yang, W.; Yuan, J.; Nie, Z. RCooper: A Real-World Large-Scale Dataset for Roadside Cooperative Perception. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 17–21 June 2024; pp. 22347–22357. [Google Scholar] [CrossRef]
Tan, H.; Fang, C.; Shen, J.; Bhuiyan, Z.A.; Wu, Q.M.J. Cross-Domain Heterogeneous Data Aggregation with Dynamic Group Key Agreement for Hybrid Satellite Networks. IEEE Trans. Dependable Secur. Comput. 2026, 23, 4830–4844. [Google Scholar] [CrossRef]
Sutrala, A.K.; Bagga, P.; Das, A.K.; Kumar, N.; Rodrigues, J.J.P.C.; Lorenz, P. On the Design of Conditional Privacy Preserving Batch Verification-Based Authentication Scheme for Internet of Vehicles Deployment. IEEE Trans. Veh. Technol. 2020, 69, 5535–5548. [Google Scholar] [CrossRef]
Herman Muraro Gularte, K.; Alfredo Ruiz Vargas, J.; Paulo Javidi da Costa, J.; Santos da Silva, A.; Almeida Santos, G.; Wang, Y.; Alfons Müller, C.; Lipps, C.; Timóteo de Sousa Júnior, R.; de Britto Vidal Filho, W.; et al. Safeguarding the V2X Pathways: Exploring the Cybersecurity Landscape Through Systematic Review. IEEE Access 2024, 12, 72871–72895. [Google Scholar] [CrossRef]
Alnasser, A.; Sun, H.; Jiang, J. Cyber security challenges and solutions for V2X communications: A survey. Comput. Netw. 2019, 151, 52–67. [Google Scholar] [CrossRef]
Yoshizawa, T.; Singelée, D.; Muehlberg, J.T.; Delbruel, S.; Taherkordi, A.; Hughes, D.; Preneel, B. A Survey of Security and Privacy Issues in V2X Communication Systems. ACM Comput. Surv. 2023, 55, 1–36. [Google Scholar] [CrossRef]
El-Rewini, Z.; Sadatsharan, K.; Selvaraj, D.F.; Plathottam, S.J.; Ranganathan, P. Cybersecurity challenges in vehicular communications. Veh. Commun. 2020, 23, 100214. [Google Scholar] [CrossRef]
Ghosal, A.; Conti, M. Security issues and challenges in V2X: A Survey. Comput. Netw. 2020, 169, 107093. [Google Scholar] [CrossRef]
Hasan, M.; Mohan, S.; Shimizu, T.; Lu, H. Securing Vehicle-to-Everything (V2X) Communication Platforms. IEEE Trans. Intell. Veh. 2020, 5, 693–713. [Google Scholar] [CrossRef]
Waymo Safety Impact. Available online: https://waymo.com/safety/impact/ (accessed on 15 April 2026).
Chiu, H.; Hachiuma, R.; Wang, C.Y.; Smith, S.F.; Wang, Y.C.F.; Chen, M.H. V2V-LLM: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multimodal Large Language Models. arXiv 2026, arXiv:2502.09980. [Google Scholar] [CrossRef]
Li, W.; Xiang, H.; Wang, T.; Wu, S.; Xia, Q.; Wang, C.; Wen, C. V2U4Real: A Real-World Large-Scale Dataset for Vehicle-to-UAV Cooperative Perception. arXiv 2026, arXiv:2603.25275. [Google Scholar] [CrossRef]
Ji, Y.; Zhou, Z.; Yang, Z.; Huang, Y.; Zhang, Y.; Zhang, W.; Xiong, L.; Yu, Z. Toward autonomous vehicles: A survey on cooperative vehicle-infrastructure system. iScience 2024, 27, 109751. [Google Scholar] [CrossRef] [PubMed]
Yu, G.; Li, H.; Wang, Y.; Chen, P.; Zhou, B. A review on cooperative perception and control supported infrastructure-vehicle system. Green Energy Intell. Transp. 2022, 1, 100023. [Google Scholar] [CrossRef]
Wu, K.; Li, P.; Zhou, Y.; Gan, R.; You, J.; Cheng, Y.; Zhu, J.; Parker, S.T.; Ran, B.; Noyce, D.A.; et al. V2X-LLM: Enhancing V2X Integration and Understanding in Connected Vehicle Corridors. arXiv 2025, arXiv:2503.02239. [Google Scholar] [CrossRef]
Yazgan, M.; Akkanapragada, M.V.; Marius Zöllner, J. Collaborative Perception Datasets in Autonomous Driving: A Survey. In Proceedings of the 2024 IEEE Intelligent Vehicles Symposium (IV), Jeju Island, Republic of Korea, 2–5 June 2024; pp. 2269–2276. [Google Scholar] [CrossRef]
Jung, C.; Lee, D.; Lee, S.; Shim, D.H. V2X-Communication-Aided Autonomous Driving: System Design and Experimental Validation. Sensors 2020, 20, 2903. [Google Scholar] [CrossRef]
Yang, Z.; Ai, Y.; Zhang, W. End-to-End 3D Spatiotemporal Perception with Multimodal Fusion and V2X Collaboration. arXiv 2025, arXiv:2512.21831. [Google Scholar] [CrossRef]
Huang, X.; Wang, J.; Xia, Q.; Chen, S.; Yang, B.; Li, X.; Wang, C.; Wen, C. V2X-R: Cooperative LiDAR-4D Radar Fusion with Denoising Diffusion for 3D Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 11–17 June 2025; pp. 27390–27400. [Google Scholar]
Nagata, R.; Koide, K.; Hayakawa, Y.; Suzuki, R.; Ikeda, K.; Sako, O.; Chen, Q.A.; Sato, T.; Yoshioka, K. SLAMSpoof: Practical LiDAR Spoofing Attacks on Localization Systems Guided by Scan Matching Vulnerability Analysis. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Atlanta, GA, USA, 19–23 May 2025. [Google Scholar]
Shan, M.; Narula, K.; Wong, Y.F.; Worrall, S.; Khan, M.; Alexander, P.; Nebot, E. Demonstrations of Cooperative Perception: Safety and Robustness in Connected and Automated Vehicle Operations. Sensors 2021, 21, 200. [Google Scholar] [CrossRef]
Zhang, X.; Li, J.; Zhou, J.; Zhang, S.; Wang, J.; Yuan, Y.; Liu, J.; Li, J. Vehicle-to-everything communication in Intelligent Connected Vehicles: A survey and taxonomy. Automot. Innov. 2025, 8, 13–45. [Google Scholar] [CrossRef]
Jiang, D.; Delgrossi, L. IEEE 802.11p: Towards an International Standard for Wireless Access in Vehicular Environments. In Proceedings of the VTC Spring 2008 - IEEE Vehicular Technology Conference, Singapore, 11–14 May 2008; pp. 2036–2040. [Google Scholar] [CrossRef]
Chen, S.; Hu, J.; Shi, Y.; Zhao, L.; Li, W. A Vision of C-V2X: Technologies, Field Testing, and Challenges with Chinese Development. IEEE Internet Things J. 2020, 7, 3872–3881. [Google Scholar] [CrossRef]
Noor-A-Rahim, M.; Liu, Z.; Lee, H.; Khyam, M.O.; He, J.; Pesch, D.; Moessner, K.; Saad, W.; Poor, H.V. 6G for Vehicle-to-Everything (V2X) Communications: Enabling Technologies, Challenges, and Opportunities. Proc. IEEE 2022, 110, 712–734. [Google Scholar] [CrossRef]
Da Silva, A.S.; Da Costa, J.P.J.; Santos, G.A.; Miri, Z.; Fauzi, M.I.B.M.; Vinel, A.; de Freitas, E.P.; Kastell, K. Radio Jamming in Vehicle-to-Everything Communication Systems: Threats and Countermeasures. In Proceedings of the 2023 23rd International Conference on Transparent Optical Networks (ICTON), Bucharest, Romania, 2–6 July 2023; pp. 1–4. [Google Scholar] [CrossRef]
Yu, H.; Yang, W.; Ruan, H.; Yang, Z.; Tang, Y.; Gao, X.; Hao, X.; Shi, Y.; Pan, Y.; Sun, N.; et al. V2X-Seq: A Large-Scale Sequential Dataset for Vehicle-Infrastructure Cooperative Perception and Forecasting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 18–22 June 2023; pp. 5486–5495. [Google Scholar]
Ding, W.; Zhang, L.; Chen, J.; Shen, S. EPSILON: An Efficient Planning System for Automated Vehicles in Highly Interactive Environments. IEEE Trans. Robot. 2022, 38, 1118–1138. [Google Scholar] [CrossRef]
Zhao, W.X.; Zhou, K.; Li, J.; Tang, T.; Wang, X.; Hou, Y.; Min, Y.; Zhang, B.; Zhang, J.; Dong, Z.; et al. A Survey of Large Language Models. Front. Comput. Sci. 2026, 20, 2012627. [Google Scholar] [CrossRef]
Zhang, J.; Huang, J.; Jin, S.; Lu, S. Vision-Language Models for Vision Tasks: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 5625–5644. [Google Scholar] [CrossRef]
Yin, S.; Fu, C.; Zhao, S.; Li, K.; Sun, X.; Xu, T.; Chen, E. A survey on multimodal large language models. Natl. Sci. Rev. 2024, 11, nwae403. [Google Scholar] [CrossRef]
Fu, D.; Li, X.; Wen, L.; Dou, M.; Cai, P.; Shi, B.; Qiao, Y. Drive Like a Human: Rethinking Autonomous Driving with Large Language Models. In Proceedings of the 2024 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW), Waikoloa, HI, USA, 2–8 January 2024; pp. 910–919. [Google Scholar] [CrossRef]
Cui, E.; Wang, W.; Li, Z.; Xie, J.; Zou, H.; Deng, H.; Luo, G.; Lu, L.; Zhu, X.; Dai, J. DriveMLM: Aligning multi-modal large language models with behavioral planning states for autonomous driving. Vis. Intell. 2025, 3, 12. [Google Scholar] [CrossRef]
Abdo, A.; Chen, H.; Zhao, X.; Wu, G.; Feng, Y. Cybersecurity on Connected and Automated Transportation Systems: A Survey. IEEE Trans. Intell. Veh. 2024, 9, 1382–1401. [Google Scholar] [CrossRef]
Cao, Y.; Xiao, C.; Cyr, B.; Zhou, Y.; Park, W.; Rampazzi, S.; Chen, Q.A.; Fu, K.; Mao, Z.M. Adversarial Sensor Attack on LiDAR-based Perception in Autonomous Driving. In Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security, London, UK, 11–15 November 2019; CCS ’19; ACM: New York, NY, USA, 2019; pp. 2267–2281. [Google Scholar] [CrossRef]
Gong, H.; Li, D.; Wong, W.E.; Li, H. A Survey of Adversarial Methods in Autonomous Driving. In Proceedings of the 2025 IEEE 49th Annual Computers, Software, and Applications Conference (COMPSAC), Turin, Italy, 30 June–4 July 2025; pp. 27–38. [Google Scholar] [CrossRef]
Qian, J.; Wang, W.; Yang, X.; Xu, H. Survey on Security and Privacy in 5G V2X. In Proceedings of the 2022 6th International Conference on Electronic Information Technology and Computer Engineering, New York, NY, USA, Xiamen, China, 21–23 October 2023; EITCE ’22, pp. 1056–1062. [Google Scholar] [CrossRef]
Rishiwal, V.; Agarwal, U.; Alotaibi, A.; Tanwar, S.; Yadav, P.; Yadav, M. Exploring Secure V2X Communication Networks for Human-Centric Security and Privacy in Smart Cities. IEEE Access 2024, 12, 138763–138788. [Google Scholar] [CrossRef]
Kim, K.; Kim, J.S.; Jeong, S.; Park, J.H.; Kim, H.K. Cybersecurity for autonomous vehicles: Review of attacks and defense. Comput. Secur. 2021, 103, 102150. [Google Scholar] [CrossRef]
Cui, J.; Liew, L.S.; Sabaliauskaite, G.; Zhou, F. A review on safety failures, security attacks, and available countermeasures for autonomous vehicles. Ad Hoc Netw. 2019, 90, 101823. [Google Scholar] [CrossRef]
Pham, M.; Xiong, K. A survey on security attacks and defense techniques for connected and autonomous vehicles. Comput. Secur. 2021, 109, 102269. [Google Scholar] [CrossRef]
Mostaq Hossain, S.M.; Banik, S.; Banik, T.; Shibli, A.M. Survey on Security Attacks in Connected and Autonomous Vehicular Systems. In Proceedings of the 2023 IEEE International Conference on Computing (ICOCO), Langkawi, Malaysia, 4–6 October 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 295–300. [Google Scholar] [CrossRef]
Deng, Y.; Zhang, T.; Lou, G.; Zheng, X.; Jin, J.; Han, Q.L. Deep Learning-Based Autonomous Driving Systems: A Survey of Attacks and Defenses. IEEE Trans. Ind. Inform. 2021, 17, 7897–7912. [Google Scholar] [CrossRef]
Song, Z.; Xie, T.; Wen, F.; Li, J. Wireless Communication as an Information Sensor for Multi-agent Cooperative Perception: A Survey. arXiv 2025, arXiv:2505.00747. [Google Scholar] [CrossRef]
Yang, Z.; Jia, X.; Li, H.; Yan, J. LLM4Drive: A Survey of Large Language Models for Autonomous Driving. In Proceedings of the NeurIPS 2024 Workshop on Open-World Agents, Vancouver, BC, Canada, 15 December 2024. [Google Scholar]
Zhou, X.; Liu, M.; Yurtsever, E.; Zagar, B.L.; Zimmer, W.; Cao, H.; Knoll, A.C. Vision Language Models in Autonomous Driving: A Survey and Outlook. IEEE Trans. Intell. Veh. 2024, 9, 1–20. [Google Scholar] [CrossRef]
Cui, C.; Ma, Y.; Cao, X.; Ye, W.; Zhou, Y.; Liang, K.; Chen, J.; Lu, J.; Yang, Z.; Liao, K.D.; et al. A Survey on Multimodal Large Language Models for Autonomous Driving. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops, Waikoloa, HI, USA, 1–6 January 2024; pp. 958–979. [Google Scholar]
Gao, Y.; Piccinini, M.; Zhang, Y.; Wang, D.; Moller, K.; Brusnicki, R.; Zarrouki, B.; Gambi, A.; Totz, J.F.; Storms, K.; et al. Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis. IEEE Open J. Intell. Transp. Syst. 2026, 1. [Google Scholar] [CrossRef]
Wang, J. Vision-Language Model Security in Autonomous Driving: A Survey. Appl. Comput. Eng. 2025, 146, 1–10. [Google Scholar] [CrossRef]
Chen, Q.; Tang, S.; Yang, Q.; Fu, S. Cooper: Cooperative Perception for Connected Autonomous Vehicles Based on 3D Point Clouds. In Proceedings of the 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS), Dallas, TX, USA, 7–10 July 2019; pp. 514–524. [Google Scholar] [CrossRef]
Chen, Q.; Ma, X.; Tang, S.; Guo, J.; Yang, Q.; Fu, S. F-cooper: Feature based cooperative perception for autonomous vehicle edge computing system using 3D point clouds. In Proceedings of the 4th ACM/IEEE Symposium on Edge Computing, SEC ’19. Arlington, VA, USA, 7–9 November 2019; ACM: New York, NY, USA, 2019; pp. 88–100. [Google Scholar] [CrossRef]
Li, Y.; Ren, S.; Wu, P.; Chen, S.; Feng, C.; Zhang, W. Learning Distilled Collaboration Graph for Multi-Agent Perception. In Proceedings of the Advances in Neural Information Processing Systems; Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P., Vaughan, J.W., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2021; Volume 34, pp. 29541–29552. [Google Scholar]
Liu, Y.C.; Tian, J.; Glaser, N.; Kira, Z. When2com: Multi-Agent Perception via Communication Graph Grouping. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 4105–4114. [Google Scholar] [CrossRef]
Wang, T.H.; Manivasagam, S.; Liang, M.; Yang, B.; Zeng, W.; Urtasun, R. V2VNet: Vehicle-to-Vehicle Communication for Joint Perception and Prediction. In Proceedings of the Computer Vision—ECCV 2020, Glasgow, UK, 23–28 August 2020; Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M., Eds.; Springer: Cham, Switzerland, 2020; pp. 605–621. [Google Scholar]
Bagheri, H.; Noor-A-Rahim, M.; Liu, Z.; Lee, H.; Pesch, D.; Moessner, K.; Xiao, P. 5G NR-V2X: Toward Connected and Cooperative Autonomous Driving. IEEE Commun. Stand. Mag. 2021, 5, 48–54. [Google Scholar] [CrossRef]
Xu, R.; Xiang, H.; Tu, Z.; Xia, X.; Yang, M.H.; Ma, J. V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer. In Proceedings of the Computer Vision—ECCV 2022: 17th European Conference, Tel Aviv, Israel, 23–27 October 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 107–124. [Google Scholar] [CrossRef]
Xu, R.; Tu, Z.; Xiang, H.; Shao, W.; Zhou, B.; Ma, J. CoBEVT: Cooperative Bird’s Eye View Semantic Segmentation with Sparse Transformers. In Proceedings of the 6th Conference on Robot Learning, Auckland, New Zealand, 14–18 December 2022; Liu, K., Kulic, D., Ichnowski, J., Eds.; Proceedings of Machine Learning Research: Cambridge, MA, USA, 2022; Volume 205, pp. 989–1000. [Google Scholar]
Hu, Y.; Fang, S.; Lei, Z.; Zhong, Y.; Chen, S. Where2comm: Communication-Efficient Collaborative Perception via Spatial Confidence Maps. In Proceedings of the Advances in Neural Information Processing Systems, New Orleans, LA, USA, 28 November–9 December 2022; Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2022; Volume 35, pp. 4874–4886. [Google Scholar]
Zhao, B.; ZHANG, W.; Zou, Z. BM2CP: Efficient Collaborative Perception with LiDAR-Camera Modalities. In Proceedings of the 7th Conference on Robot Learning, Atlanta, GA, USA, 6–9 November 2023; Tan, J., Toussaint, M., Darvish, K., Eds.; PMLR: Cambridge, MA, USA, 2023; Volume 229, pp. 1022–1035. [Google Scholar]
Yang, D.; Yang, K.; Wang, Y.; Liu, J.; Xu, Z.; Yin, R.; Zhai, P.; Zhang, L. How2comm: Communication-Efficient and Collaboration-Pragmatic Multi-Agent Perception. In Proceedings of the Advances in Neural Information Processing Systems, New Orleans, LA, USA, 10–16 December 2023; Oh, A., Naumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2023; Volume 36, pp. 25151–25164. [Google Scholar]
Wu, K.; Qiao, J.; Zhang, Y. CoPAD: Multi-source Trajectory Fusion and Cooperative Trajectory Prediction with Anchor-oriented Decoder in V2X Scenarios. In Proceedings of the 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hangzhou, China, 19–25 October 2025; pp. 2690–2696. [Google Scholar] [CrossRef]
Zhao, S.Z.; Xiang, H.; Xu, C.; Xia, X.; Zhou, B.; Ma, J. CooPre: Cooperative Pretraining for V2X Cooperative Perception. In Proceedings of the 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hangzhou, China, 19–25 October 2025; pp. 11765–11772. [Google Scholar] [CrossRef]
Sha, H.; Mu, Y.; Jiang, Y.; Chen, L.; Xu, C.; Luo, P.; Li, S.E.; Tomizuka, M.; Zhan, W.; Ding, M. LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving. arXiv 2025, arXiv:2310.03026. [Google Scholar] [CrossRef]
Shao, H.; Hu, Y.; Wang, L.; Song, G.; Waslander, S.L.; Liu, Y.; Li, H. LMDrive: Closed-Loop End-to-End Driving with Large Language Models. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 17–21 June 2024; pp. 15120–15130. [Google Scholar] [CrossRef]
Yildirim, M.; Dagda, B.; Asodia, V.; Fallah, S. HighwayLLM: Decision-making and navigation in highway driving with RL-informed language model. Robot. Auton. Syst. 2025, 193, 105114. [Google Scholar] [CrossRef]
Cui, C.; Yang, Z.; Zhou, Y.; Ma, Y.; Lu, J.; Li, L.; Chen, Y.; Panchal, J.; Wang, Z. Personalized Autonomous Driving with Large Language Models: Field Experiments. In Proceedings of the 2024 IEEE 27th International Conference on Intelligent Transportation Systems (ITSC), Edmonton, AB, Canada, 24–27 September 2024; pp. 20–27. [Google Scholar] [CrossRef]
Tian, X.; Gu, J.; Li, B.; Liu, Y.; Wang, Y.; Zhao, Z.; Zhan, K.; Jia, P.; Lang, X.; Zhao, H. DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models. arXiv 2024, arXiv:2402.12289. [Google Scholar] [CrossRef]
Sima, C.; Renz, K.; Chitta, K.; Chen, L.; Zhang, H.; Xie, C.; Beißwenger, J.; Luo, P.; Geiger, A.; Li, H. DriveLM: Driving with Graph Visual Question Answering. In Proceedings of the Computer Vision—ECCV 2024: 18th European Conference, Milan, Italy, 29 September–4 October 2024; Proceedings, Part LII; Springer: Berlin/Heidelberg, Germany, 2024; pp. 256–274. [Google Scholar] [CrossRef]
Chahe, A.; Zhou, L. ReasonDrive: Efficient Visual Question Answering for Autonomous Vehicles with Reasoning-Enhanced Small Vision-Language Models. In Proceedings of the 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Nashville, TN, USA, 11–17 June 2025; pp. 3870–3879. [Google Scholar] [CrossRef]
Hou, X.; Wang, W.; Yang, L.; Lin, H.; Feng, J.; Min, H.; Zhao, X. DriveAgent: Multi-Agent Structured Reasoning With LLM and Multimodal Sensor Fusion for Autonomous Driving. IEEE Robot. Autom. Lett. 2025, 10, 12189–12196. [Google Scholar] [CrossRef]
Tian, Y.; Zhang, J.; Wang, Z.; Ren, X.; Yu, X.; Gungor, O.; Rosing, T. KLDrive: Fine-Grained 3D Scene Reasoning for Autonomous Driving based on Knowledge Graph. arXiv 2026, arXiv:2603.21029. [Google Scholar] [CrossRef]
Cui, J.; Tang, C.; Holtz, J.; Nguyen, J.; Allievi, A.G.; Qiu, H.; Stone, P. Talking Vehicles: Cooperative Driving via Natural Language. arXiv 2025, arXiv:2503.12345. [Google Scholar]
Liu, C.; Liu, G.; Wang, Z.; Yang, J.; Chen, S. CoLMDriver: LLM-based Negotiation Benefits Cooperative Autonomous Driving. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, HI, USA, 6–12 October 2025; pp. 25951–25960. [Google Scholar]
Gao, X.; Wu, Y.; Wang, R.; Liu, C.; Zhou, Y.; Tu, Z. LangCoop: Collaborative Driving with Language. In Proceedings of the 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Nashville, TN, USA, 11–17 June 2025; pp. 4226–4237. [Google Scholar] [CrossRef]
Vilho, J.; Liang, T.; Guo, C.; Zhang, T. Autonomous Driving Planning Based on Large Language Model: Collaborative Driving. In Proceedings of the 2025 IEEE 101st Vehicular Technology Conference (VTC2025-Spring), Oslo, Norway, 15–18 June 2025; pp. 1–6. [Google Scholar] [CrossRef]
You, J.; Jiang, Z.; Huang, Z.; Shi, H.; Gan, R.; Wu, K.; Cheng, X.; Li, X.; Ran, B. V2X-VLM: End-to-End V2X cooperative autonomous driving through large vision-Language models. Transp. Res. Part C Emerg. Technol. 2026, 183, 105457. [Google Scholar] [CrossRef]
kuang Chiu, H.; Hachiuma, R.; Wang, C.Y.; Wang, Y.C.F.; Chen, M.H.; Smith, S.F. V2V-GoT: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multimodal Large Language Models and Graph-of-Thoughts. arXiv 2026, arXiv:2509.18053. [Google Scholar] [CrossRef]
Bhatt, N.P.; han Li, P.; Gupta, K.; Siva, R.; Milan, D.; Hogue, A.; Chinchali, S.P.; Fridovich-Keil, D.; Wang, Z.; Topcu, U. UNCAP: Uncertainty-Guided Neurosymbolic Planning Using Natural Language Communication for Cooperative Autonomous Vehicles. arXiv 2025, arXiv:2506.12345. [Google Scholar]
Yongqiang, D.; Dengjiang, W.; Gang, C.; Bing, M.; Xijia, G.; Yajun, W.; Jianchao, L.; Yanming, F.; Juanjuan, L. BAAI-VANJEE Roadside Dataset: Towards the Connected Automated Vehicle Highway technologies in Challenging Environments of China. arXiv 2021, arXiv:2105.14370. [Google Scholar] [CrossRef]
Wang, H.; Zhang, X.; Li, Z.; Li, J.; Wang, K.; Lei, Z.; Haibing, R. IPS300+: A Challenging multi-modal data sets for Intersection Perception System. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; pp. 2539–2545. [Google Scholar] [CrossRef]
Li, Y.; Ma, D.; An, Z.; Wang, Z.; Zhong, Y.; Chen, S.; Feng, C. V2X-Sim: Multi-Agent Collaborative Perception Dataset and Benchmark for Autonomous Driving. IEEE Robot. Autom. Lett. 2022, 7, 10914–10921. [Google Scholar] [CrossRef]
Ye, X.; Shu, M.; Li, H.; Shi, Y.; Li, Y.; Wang, G.; Tan, X.; Ding, E. Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 21341–21350. [Google Scholar]
Creß, C.; Zimmer, W.; Strand, L.; Fortkord, M.; Dai, S.; Lakshminarasimhan, V.; Knoll, A. A9-Dataset: Multi-Sensor Infrastructure-Based Dataset for Mobility Research. In Proceedings of the 2022 IEEE Intelligent Vehicles Symposium (IV), Aachen, Germany, 4–8 June 2022; IEEE Press: Piscataway, NJ, USA, 2022; pp. 965–970. [Google Scholar] [CrossRef]
Xu, R.; Xiang, H.; Xia, X.; Han, X.; Li, J.; Ma, J. OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; IEEE Press: Piscataway, NJ, USA, 2022; pp. 2583–2589. [Google Scholar] [CrossRef]
Yu, H.; Luo, Y.; Shu, M.; Huo, Y.; Yang, Z.; Shi, Y.; Guo, Z.; Li, H.; Hu, X.; Yuan, J.; et al. DAIR-V2X: A Large-Scale Dataset for Vehicle-Infrastructure Cooperative 3D Object Detection. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 21329–21338. [Google Scholar] [CrossRef]
Mao, R.; Guo, J.; Jia, Y.; Sun, Y.; Zhou, S.; Niu, Z. DOLPHINS: Dataset for Collaborative Perception Enabled Harmonious and Interconnected Self-driving. In Proceedings of the Computer Vision—ACCV 2022, Macao, China, 4–8 December 2022; Springer: Cham, Switzerland, 2023; pp. 495–511. [Google Scholar] [CrossRef]
Axmann, J.; Moftizadeh, R.; Su, J.; Tennstedt, B.; Zou, Q.; Yuan, Y.; Ernst, D.; Alkhatib, H.; Brenner, C.; Schön, S. LUCOOP: Leibniz University Cooperative Perception and Urban Navigation Dataset. In Proceedings of the 2023 IEEE Intelligent Vehicles Symposium (IV), Anchorage, AK, USA, 4–7 June 2023; pp. 1–8. [Google Scholar] [CrossRef]
Xu, R.; Xia, X.; Li, J.; Li, H.; Zhang, S.; Tu, Z.; Meng, Z.; Xiang, H.; Dong, X.; Song, R.; et al. V2V4Real: A Real-World Large-Scale Dataset for Vehicle-to-Vehicle Cooperative Perception. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 18–22 June 2023; pp. 13712–13722. [Google Scholar] [CrossRef]
Wang, T.; Kim, S.; Wenxuan, J.; Xie, E.; Ge, C.; Chen, J.; Li, Z.; Luo, P. DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving. Proc. AAAI Conf. Artif. Intell. 2024, 38, 5599–5606. [Google Scholar] [CrossRef]
Ma, C.; Qiao, L.; Zhu, C.; Liu, K.; Kong, Z.; Li, Q.; Zhou, X.; Kan, Y.; Wu, W. HoloVic:Large-scale Dataset and Benchmark for Multi-Sensor Holographic Intersection and Vehicle-Infrastructure Cooperative. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 17–21 June 2024; pp. 22129–22138. [Google Scholar] [CrossRef]
Zimmer, W.; Wardana, G.A.; Sritharan, S.; Zhou, X.; Song, R.; Knoll, A.C. TUMTraf V2X Cooperative Perception Dataset. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 17–21 June 2024; pp. 22668–22677. [Google Scholar] [CrossRef]
Li, R.; Pei, X. Multi-V2X: A Large Scale Multi-modal Multi-penetration-rate Dataset for Cooperative Perception. arXiv 2024, arXiv:2409.04980. [Google Scholar] [CrossRef]
Wang, B.; Wang, Y.; Gong, W.; Chen, S.; Liu, G.; Xiong, M.; Ng, C.L. V2XScenes: A Multiple Challenging Traffic Conditions Dataset for Large-Range Vehicle-Infrastructure Collaborative Perception. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, HI, USA, 6–12 October 2025; pp. 28385–28395. [Google Scholar]
Li, H.; Cao, B.; Liang, Z.; Li, W.; Oh, J.; Chen, Y.; Liang, S.; Zhou, H.; Ma, C.; Liu, J.; et al. CATS-V2V: A Real-World Vehicle-to-Vehicle Cooperative Perception Dataset with Complex Adverse Traffic Scenarios. arXiv 2025, arXiv:2511.11168. [Google Scholar] [CrossRef]
Wang, Y.R.; Chen, S.; Song, Z.; Zhou, S. WHALES: A Multi-Agent Scheduling Dataset for Enhanced Cooperation in Autonomous Driving. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2025, Hangzhou, China, 19–25 October 2025; IEEE: Piscataway, NJ, USA, 2025; pp. 20487–20493. [Google Scholar] [CrossRef]
Yang, L.; Zhang, X.; Li, J.; Wang, C.; Ma, J.; Song, Z.; Zhao, T.; Song, Z.; Wang, L.; Zhou, M.; et al. V2X-Radar: A Multi-modal Dataset with 4D Radar for Cooperative Perception. arXiv 2026, arXiv:2411.10962. [Google Scholar] [CrossRef]
Sekaran, K.C.; Geisler, M.; Rößle, D.; Mohan, A.; Cremers, D.; Utschick, W.; Botsch, M.; Huber, W.; Schön, T. UrbanIng-V2X: A Large-Scale Multi-Vehicle, Multi-Infrastructure Dataset Across Multiple Intersections for Cooperative Perception. In Proceedings of the The Thirty-Ninth Annual Conference on Neural Information Processing Systems Datasets and Benchmarks Track, San Diego, CA, USA, 8–14 December 2025; Curran Associates, Inc.: Red Hook, NY, USA, 2026. [Google Scholar]
Wang, J.; Shao, Y.; Ge, Y.; Yu, R. A Survey of Vehicle to Everything (V2X) Testing. Sensors 2019, 19, 334. [Google Scholar] [CrossRef]
Wang, F.; Wang, X.; Ban, X.J. Data poisoning attacks in intelligent transportation systems: A survey. Transp. Res. Part C Emerg. Technol. 2024, 165, 104750. [Google Scholar] [CrossRef]
Papernot, N.; McDaniel, P.; Goodfellow, I.; Jha, S.; Celik, Z.B.; Swami, A. Practical Black-Box Attacks against Machine Learning. In Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security, Abu Dhabi, United Arab Emirates, 2–6 April 2017; ASIA CCS ’17; ACM: New York, NY, USA, 2017; pp. 506–519. [Google Scholar] [CrossRef]
Mazzone, F.; Badawi, A.A.; Polyakov, Y.; Everts, M.; Hahn, F.; Peter, A. Investigating Privacy Attacks in the Gray-Box Setting to Enhance Collaborative Learning Schemes. arXiv 2024, arXiv:2409.17283. [Google Scholar] [CrossRef]
Shin, H.; Kim, D.; Kwon, Y.; Kim, Y. Illusion and Dazzle: Adversarial Optical Channel Exploits against Lidars for Automotive Applications. In International Conference on Cryptographic Hardware and Embedded Systems; Springer: Cham, Switzerland, 2017. [Google Scholar]
Tu, N.; Huang, S.; Huang, Q.; Chen, Y.; Zhang, Z.M. On the Realism of LiDAR Spoofing Attacks against Autonomous Driving Vehicle at High Speed and Long Distance. In Proceedings of the 29th USENIX Security Symposium (USENIX Security), Boston, MA, USA, 12–14 August 2020; USENIX Association: Berkeley, CA, USA, 2020; pp. 1303–1320. [Google Scholar]
Nagata, R.; Koide, K.; Ikeda, K.; Sako, O.; Yoshioka, K. D-SLAMSpoof: An Environment-Agnostic LiDAR Spoofing Attack using Dynamic Point Cloud Injection. arXiv 2026, arXiv:2603.11365. [Google Scholar]
Yahia, S.; Alla, I.; Mohan, G.B.; Rau, D.; Singh, M.; Loscri, V. Seeing is Deceiving: Mirror-Based LiDAR Spoofing for Autonomous Vehicle Deception. arXiv 2025, arXiv:2509.17253. [Google Scholar]
Komissarov, R.; Wool, A. Spoofing Attacks Against Vehicular FMCW Radar. In Proceedings of the 5th Workshop on Attacks and Solutions in Hardware Security, Virtual, 15 November 2021; ASHES ’21; ACM: New York, NY, USA, 2021; pp. 91–97. [Google Scholar] [CrossRef]
Reddy Vennam, R.; Jain, I.K.; Bansal, K.; Orozco, J.; Shukla, P.; Ranganathan, A.; Bharadia, D. mmSpoof: Resilient Spoofing of Automotive Millimeter-wave Radars using Reflect Array. In Proceedings of the 2023 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 21–25 May 2023; pp. 1807–1821. [Google Scholar] [CrossRef]
Ordean, M.; Garcia, F.D. Millimeter-Wave Automotive Radar Spoofing. arXiv 2022, arXiv:2205.06567. [Google Scholar] [CrossRef]
Hunt, D.; Angell, K.; Qi, Z.; Chen, T.; Pajic, M. MadRadar: A Black-Box Physical Layer Attack Framework on mmWave Automotive FMCW Radars. In Proceedings of the Network and Distributed System Security Symposium (NDSS), San Diego, CA, USA, 26 February–1 March 2024. [Google Scholar] [CrossRef]
Zhu, Y.; Miao, C.; Xue, H.; Li, Z.; Yu, Y.; Xu, W.; Su, L.; Qiao, C. TileMask: A Passive-Reflection-based Attack against mmWave Radar Object Detection in Autonomous Driving. In Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, Copenhagen, Denmark, 26–30 November 2023; CCS ’23; ACM: New York, NY, USA, 2023; pp. 1317–1331. [Google Scholar] [CrossRef]
Sun, J.; Cao, Y.; Chen, Q.A.; Mao, Z.M. Towards Robust LiDAR-based Perception in Autonomous Driving: General Black-box Adversarial Sensor Attack and Countermeasures. In Proceedings of the 29th USENIX Security Symposium (USENIX Security 20), Boston, MA, USA, 12–14 August 2020; USENIX Association: Berkeley, CA, USA, 2020; pp. 877–894. [Google Scholar]
Cho, M.; Cao, Y.; Zhou, Z.; Mao, Z.M. ADoPT: LiDAR Spoofing Attack Detection Based on Point-Level Temporal Consistency. In Proceedings of the 34th British Machine Vision Conference 2023, BMVC, Aberdeen, UK, 20–24 November 2023; BMVA: Durham, UK, 2023. [Google Scholar]
Alheeti, K.M.A.; Alzahrani, A.; Al Dosary, D. LiDAR Spoofing Attack Detection in Autonomous Vehicles. In Proceedings of the 2022 IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, NV, USA, 7–9 January 2022; pp. 1–2. [Google Scholar] [CrossRef]
Wu, H.; Onen, C.; Pandharipande, A. Cooperative Automotive Radar System for Ghost Target Spoofing Detection. In Proceedings of the 2025 IEEE SENSORS, Sydney, Australia, 26–29 October 2025; pp. 1–4. [Google Scholar] [CrossRef]
Zhou, Y.; Cao, R.; Zhang, A.; Li, P. An Interference Mitigation Method for FMCW Radar Based on Time–Frequency Distribution and Dual-Domain Fusion Filtering. Sensors 2024, 24, 3288. [Google Scholar] [CrossRef]
Akhtar, M.M.; Li, Y.; Cheng, W.; Dong, L.; Tan, Y.; Geng, L. AOHDL: Adversarial Optimized Hybrid Deep Learning Design for Preventing Attack in Radar Target Detection. Remote Sens. 2024, 16, 3109. [Google Scholar] [CrossRef]
Shahriar, M.H.; Barat, M.M.A.; Sundar, H.; Zhang, N.; Ramakrishnan, N.; Hou, T.; Lou, W. Detecting Temporal Misalignment Attacks in Multimodal Fusion for Autonomous Driving. In Proceedings of the The Fourteenth International Conference on Learning Representations, Vienna, Austria, 3–7 May 2026. [Google Scholar]
Mu, J. A Real-Time Defense Against Object Vanishing Adversarial Patch Attacks for Object Detection in Autonomous Vehicles. In Proceedings of the Security and Privacy in Cyber-Physical Systems and Smart Vehicles, San Diego, CA, USA, 23–25 October 2025; Hei, X., Garcia, L., Kim, T., Kim, K., Eds.; Springer: Cham, Switzerland, 2025; pp. 296–307. [Google Scholar]
Pourranjbar, A.; Kaddoum, G.; Saad, W. Recurrent-Neural-Network-Based Anti-Jamming Framework for Defense Against Multiple Jamming Policies. IEEE Internet Things J. 2023, 10, 8799–8811. [Google Scholar] [CrossRef]
Ansari, M.; Petit, J.; Monteuuis, J.P.; Chen, C. VASP: V2X Application Spoofing Platform. In Proceedings of the Inaugural International Symposium on Vehicle Security & Privacy, San Diego, CA, USA, 27 February 2023. [Google Scholar] [CrossRef]
Boualouache, A.; Engel, T. A Survey on Machine Learning-Based Misbehavior Detection Systems for 5G and Beyond Vehicular Networks. Commun. Surveys Tuts. 2023, 25, 1128–1172. [Google Scholar] [CrossRef]
Silva, D.A.D.; Silva, A.S.D.; Valle De Lima, D.; Paulo Javidi Da Costa, J.; Melo, L.F.O.D.; Miranda, C.; Santos, G.A.; Vinel, A.; Mendes, P.; Verhoeven, S.; et al. Spoofer Detection Framework for V2X Systems via Tensor-Based DoA Estimation and YOLO-Based Object Detection. IEEE Access 2026, 14, 23624–23643. [Google Scholar] [CrossRef]
Greco, D.; Sohail, M.S.; Marchese, M. Detection of C-V2X Spoofing Attacks using Physical Layer Features and Graph Neural Networks. In Proceedings of the 2025 IEEE International Conference on Cyber Security and Resilience (CSR), Venice, Italy, 15–17 October 2025; pp. 801–806. [Google Scholar] [CrossRef]
Li, Z.; Liao, L.; Gu, S.; Zhao, J. Physical layer eavesdropping defense scheme for V2X based on improved SAC algorithm. Phys. Commun. 2026, 74, 102980. [Google Scholar] [CrossRef]
Gu, S.; Wei, M.; Liao, L.; Zhao, J. Eavesdropping defense scheme in C-V2X using deep learning and reinforcement learning. Phys. Commun. 2025, 71, 102673. [Google Scholar] [CrossRef]
Mamun, A.A.; Yates, K.; Rakotondrafara, A.; Chowdhury, M.; Cartor, R.; Gao, S. Experimental Evaluation of Post-Quantum Homomorphic Encryption for Privacy-Preserving I2I Communication in ITS. arXiv 2025, arXiv:2508.02461. [Google Scholar]
Pan, Y.; Wang, Y.; Guo, S.; Yin, C.; Li, R.; Su, Z.; Wu, Y. Trustworthy Semantic Communication for Vehicular Networks: Challenges and Solutions. IEEE Veh. Technol. Mag. 2025, 2–11. [Google Scholar] [CrossRef]
Trkulja, N.; Starobinski, D.; Berry, R.A. Denial-of-Service Attacks on C-V2X Networks. In Proceedings of the NDSS Workshop on Automotive and Autonomous Vehicle Security (AutoSec), Virtual, 21 February 2021. [Google Scholar]
Twardokus, G.; Rahbari, H. Vehicle-to-Nothing? Securing C-V2X Against Protocol-Aware DoS Attacks. In Proceedings of the IEEE INFOCOM 2022—IEEE Conference on Computer Communications, London, UK, 2–5 May 2022; pp. 1629–1638. [Google Scholar] [CrossRef]
Tine, J.M.; Aldeen, M.; Enan, A.; Salek, M.S.; Cheng, L.; Chowdhury, M. Real-World Evaluation of Protocol-Compliant Denial-of-Service Attacks on C-V2X-based Forward Collision Warning Systems. arXiv 2026, arXiv:2508.02805. [Google Scholar]
Jayakrishna, N.; Prasanth, N.N. Detection and mitigation of distributed denial of service attacks in vehicular ad hoc network using a spatiotemporal deep learning and reinforcement learning approach. Results Eng. 2025, 26, 104839. [Google Scholar] [CrossRef]
Yigit, Y.; Panitsas, I.; Maglaras, L.; Tassiulas, L.; Canberk, B. Cyber-Twin: Digital Twin-Boosted Autonomous Attack Detection for Vehicular Ad-Hoc Networks. In Proceedings of the ICC 2024—IEEE International Conference on Communications, Denver, CO, USA, 9–13 June 2024; pp. 2167–2172. [Google Scholar] [CrossRef]
Sadaf, M.; Iqbal, Z.; Anwar, Z.; Noor, U.; Imran, M.; Gadekallu, T.R. A novel framework for detection and prevention of denial of service attacks on autonomous vehicles using fuzzy logic. Veh. Commun. 2024, 46, 100741. [Google Scholar] [CrossRef]
Sohail, M.S.; Portomauro, G.; Gaggero, G.B.; Patrone, F.; Marchese, M. Performance Analysis and Security Preservation of DSRC in V2X Networks. Electronics 2025, 14, 3786. [Google Scholar] [CrossRef]
Oza, P.; Foruhandeh, M.; Gerdes, R.; Chantem, T. Secure Traffic Lights: Replay Attack Detection for Model-based Smart Traffic Controllers. In Proceedings of the Second ACM Workshop on Automotive and Aerial Vehicle Security, New Orleans, LA, USA, 17 March 2020; AutoSec ’20; ACM: New York, NY, USA, 2020; pp. 5–10. [Google Scholar] [CrossRef]
Dai, Y.; Wang, Q.; Song, X.; Wang, S. A Lightweight Key Agreement Protocol for V2X Communications Based on Kyber and Saber. Sensors 2025, 25, 6938. [Google Scholar] [CrossRef]
Huo, Q.; Ning, Y.; Bian, C.; Sun, D. Research on anti-replay attack mechanism of intelligent connected vehicles based on hashing chain and V2X communication. In Proceedings of the The International Conference Optoelectronic Information and Optical Engineering (OIOE2024), Wuhan, China, 25–27 October 2024; Yue, Y., Leng, L., Eds.; International Society for Optics and Photonics, SPIE: Bellingham, WA, USA, 2025; Volume 13513, p. 135133H. [Google Scholar] [CrossRef]
Yang, Y.; Wei, Z.; Zhang, Y.; Lu, H.; Choo, K.K.R.; Cai, H. V2X security: A case study of anonymous authentication. Pervasive Mob. Comput. 2017, 41, 160–172. [Google Scholar] [CrossRef]
Guven, T.; Taysi, Z.C. Creating a Realistic Sybil Attack Dataset for Inter-Vehicle Communication. Peer-to-Peer Netw. Appl. 2025, 18, 234. [Google Scholar] [CrossRef]
Azam, S.; Bibi, M.; Riaz, R.; Rizvi, S.S.; Kwon, S.J. Collaborative Learning Based Sybil Attack Detection in Vehicular AD-HOC Networks (VANETS). Sensors 2022, 22, 6934. [Google Scholar] [CrossRef]
Tadesse, E.M.; Girma, A.; Mebrte, A. Sybil Attack Prevention and Detection Mechanism in VANET Based on Multi-factor Authentication. Int. J. Inf. Commun. Sci. 2026, 11, 1–12. [Google Scholar] [CrossRef]
Morton, T.L.; Borah, A.; Paranjothi, A. Trust-Aware Sybil Attack Detection for Resilient Vehicular Communication. Internet Technol. Lett. 2025, 8, e617. [Google Scholar] [CrossRef]
Baza, M.; Nabil, M.; Mahmoud, M.M.E.A.; Bewermeier, N.; Fidan, K.; Alasmary, W.; Abdallah, M. Detecting Sybil Attacks Using Proofs of Work and Location in VANETs. IEEE Trans. Dependable Secur. Comput. 2022, 19, 39–53. [Google Scholar] [CrossRef]
Bendiab, G.; Hameurlaine, A.; Germanos, G.; Kolokotronis, N.; Shiaeles, S. Autonomous Vehicles Security: Challenges and Solutions Using Blockchain and Artificial Intelligence. IEEE Trans. Intell. Transp. Syst. 2023, 24, 3614–3637. [Google Scholar] [CrossRef]
Bajpai, P.; Enbody, R.; Cheng, B.H. Ransomware Targeting Automobiles. In Proceedings of the Second ACM Workshop on Automotive and Aerial Vehicle Security, New Orleans, LA, USA, 17 March 2020; AutoSec ’20; ACM: New York, NY, USA, 2020; pp. 23–29. [Google Scholar] [CrossRef]
Parker, C. Ransomware Vehicle Embedded System Attacks. In Proceedings of the 2021 Ground Vehicle Systems Engineering and Technology Symposium, Novi, MI, USA, 10–12 August 2021. [Google Scholar] [CrossRef]
Malik, A.W.; Anwar, Z.; Rahman, A.U. A Novel Framework for Studying the Business Impact of Ransomware on Connected Vehicles. IEEE Internet Things J. 2023, 10, 8348–8356. [Google Scholar] [CrossRef]
Alsharabi, N.; Alshammari, M.; Alharbi, Y. Analysis of Ransomware Using Reverse Engineering Techniques to Develop Effective Countermeasures. J. Adv. Inf. Technol. 2023, 14, 284–294. [Google Scholar] [CrossRef]
Zhu, R.; Zhu, X.; Zhang, A.; Zhang, X.; Sun, J.; Qian, F.; Qiu, H.; Mao, Z.M.; Lee, M. Boosting Collaborative Vehicular Perception on the Edge with Vehicle-to-Vehicle Communication. In Proceedings of the 22nd ACM Conference on Embedded Networked Sensor Systems, Hangzhou, China, 4–7 November 2024; SenSys ’24; ACM: New York, NY, USA, 2024; pp. 141–154. [Google Scholar] [CrossRef]
Ulmasov, J.; Sun, P.; Boukerche, A. Adversarial Collaborative Perception in Autonomous Driving. In Proceedings of the 2025 29th International Symposium on Distributed Simulation and Real Time Applications (DS-RT), Atlanta, GA, USA, 6–8 October 2025; pp. 1–6. [Google Scholar] [CrossRef]
Tu, J.; Wang, T.; Wang, J.; Manivasagam, S.; Ren, M.; Urtasun, R. Adversarial Attacks on Multi-Agent Communication. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 7768–7777. [Google Scholar]
Zhang, Q.; Jin, S.; Zhu, R.; Sun, J.; Zhang, X.; Chen, Q.A.; Mao, Z.M. On Data Fabrication in Collaborative Vehicular Perception: Attacks and Countermeasures. In Proceedings of the 33rd USENIX Security Symposium (USENIX Security 24), Philadelphia, PA, USA, 14–16 August 2024; pp. 6309–6326. [Google Scholar]
Hu, S.; Tao, Y.; Xu, G.; Qian, X.; Deng, Y.; Chen, X.; Kwong, S.T.W.; Fang, Y. CP-uniGuard: A Unified, Probability-Agnostic, and Adaptive Framework for Malicious Agent Detection and Defense in Multi-Agent Embodied Perception Systems. IEEE Trans. Mob. Comput. 2026, 25, 8798–8811. [Google Scholar] [CrossRef]
Zhao, Y.; Xiang, Z.; Yin, S.; Pang, X.; Wang, Y.; Chen, S. MADE: Malicious Agent Detection for Robust Multi-Agent Collaborative Perception. In Proceedings of the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Abu Dhabi, United Arab Emirates, 14–18 October 2024; pp. 13817–13823. [Google Scholar] [CrossRef]
Grosse, K.; Alahi, A. A qualitative AI security risk assessment of autonomous vehicles. Transp. Res. Part C Emerg. Technol. 2024, 169, 104797. [Google Scholar] [CrossRef]
Patel, N.; Krishnamurthy, P.; Garg, S.; Khorrami, F. Bait and Switch: Online Training Data Poisoning of Autonomous Driving Systems. arXiv 2020, arXiv:2011.04065. [Google Scholar] [CrossRef]
Garg, S.; Jönsson, H.; Kalander, G.; Nilsson, A.; Pirange, B.; Valadi, V.; Östman, J. Poisoning Attacks on Federated Learning for Autonomous Driving. In Proceedings of the 4th International Conference on AI Research (SCAI 2024), Milan, Italy, 4–6 December 2024; pp. 11–18. [Google Scholar]
Bataineh, A.S.; Zulkernine, M.; Abusitta, A.; Halabi, T. Detecting Poisoning Attacks in Collaborative IDSs of Vehicular Networks Using XAI and Shapley Value. ACM J. Auton. Transp. Syst. 2024, 2, 18. [Google Scholar] [CrossRef]
Chaabene, R.B.; Ameyed, D.; Jaafar, F.; Cheriet, M. Robust Federated Learning Frameworks Guarding Against Data Flipping Threats for Autonomous Vehicles. arXiv 2025, arXiv:2504.12345. [Google Scholar]
Kabir, E.; Song, Z.; Ur Rashid, M.R.; Mehnaz, S. FLShield: A Validation Based Federated Learning Framework to Defend Against Poisoning Attacks. In Proceedings of the 2024 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 20–23 May 2024; pp. 2572–2590. [Google Scholar] [CrossRef]
Chen, X.; Feng, S.; Xiong, Z.; An, S.; Mao, Y.; Yan, L.; Tao, G.; Guo, W.; Zhang, X. Temporal Logic-Based Multi-Vehicle Backdoor Attacks Against Offline RL Agents in End-to-End Autonomous Driving. arXiv 2025, arXiv:2509.16950. [Google Scholar]
Zhang, X.; Liu, A.; Zhang, T.; Liang, S.; Liu, X. Towards Robust Physical-world Backdoor Attacks on Lane Detection. In Proceedings of the 32nd ACM International Conference on Multimedia, Melbourne, Australia, 28 October–1 November 2024; MM ’24; ACM: New York, NY, USA, 2024; pp. 5131–5140. [Google Scholar] [CrossRef]
Liao, Y.; Cao, Y.; Zhang, Y.; He, W.; Xiao, Y.; Du, X.; Huang, Z.; Dong, J.S. Towards Stealthy and Effective Backdoor Attacks on Lane Detection: A Naturalistic Data Poisoning Approach. arXiv 2025, arXiv:2508.15778. [Google Scholar]
Kumar, R.; Ebbrecht, G.; Farooq, J.; Wei, W.; Mao, Y.; Chen, J. SecFedDrive: Securing Federated Learning for Autonomous Driving Against Backdoor Attacks. In Proceedings of the 2024 IEEE Conference on Communications and Network Security (CNS), Rome, Italy, 30 September–2 October 2024; pp. 1–6. [Google Scholar] [CrossRef]
Wang, Y.; Li, W.; Alam, M.; Maniatakos, M.; Jabari, S.E. Backdozer: A Backdoor Detection Methodology for DRL-Based Traffic Controllers. ACM J. Auton. Transp. Syst. 2024, 1, 15. [Google Scholar] [CrossRef]
Kumari, K.; Rieger, P.; Fereidooni, H.; Jadliwala, M.; Sadeghi, A.R. BayBFed: Bayesian Backdoor Defense for Federated Learning. In Proceedings of the 2023 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 21–25 May 2023; pp. 737–754. [Google Scholar] [CrossRef]
Ma, C.; Wang, N.; Zhao, Z.; Chen, Q.A.; Shen, C. SlowPerception: Physical-World Latency Attack against Visual Perception in Autonomous Driving. arXiv 2024, arXiv:2406.05800. [Google Scholar]
Burbano, L.; Ortiz, D.; Sun, Q.; Yang, S.; Tu, H.; Xie, C.; Cao, Y.; Cardenas, A.A. CHAI: Command Hijacking against embodied AI. arXiv 2026, arXiv:2510.00181. [Google Scholar] [CrossRef]
Liu, J.; He, Y.; Fan, L.; Zhong, Q.; Cheng, Y.; Zhang, M.; Chen, Y.; Xu, W. PINA: Prompt Injection Attack against Navigation Agents. arXiv 2026, arXiv:2601.13612. [Google Scholar] [CrossRef]
Long, Y.; Li, S. FuncPoison: Poisoning Function Library to Hijack Multi-agent Autonomous Driving Systems. arXiv 2025, arXiv:2509.24408. [Google Scholar]
Ni, Z.; Ye, R.; Wei, Y.; Xiang, Z.; Wang, Y.; Chen, S. Physical Backdoor Attack can Jeopardize Driving with Vision-Large-Language Models. In Proceedings of the Trustworthy Multi-Modal Foundation Models and AI Agents (TiFA), Milan, Italy, 29 September 2024. [Google Scholar]
Lu, W.; Zeng, Z.; Zhang, K.; Li, H.; Zhuang, H.; Wang, R.; Chen, C.; Peng, H. ARGUS: Defending Against Multimodal Indirect Prompt Injection via Steering Instruction-Following Behavior. arXiv 2025, arXiv:2512.05745. [Google Scholar] [CrossRef]
Wang, Y.; Yang, H.; Pal, S.; Xu, W. AegisAgent: An Autonomous Defense Agent Against Prompt Injection Attacks in LLM-HARs. arXiv 2025, arXiv:2512.20986. [Google Scholar] [CrossRef]
Islam, M.A.; El-Wakeel, A.S. Adversarial Robustness Analysis of Cloud-Assisted Autonomous Driving Systems. arXiv 2026, arXiv:2604.04349. [Google Scholar] [CrossRef]
Eslami, A.; Yu, J. Security Risks of Agentic Vehicles: A Systematic Analysis of Cognitive and Cross-Layer Threats. arXiv 2025, arXiv:2512.17041. [Google Scholar] [CrossRef]
He, C.; Xu, X.; Jiang, H.; Jiang, J.; Chen, T.; Long, Y. Resilient Control of Trajectory Tracking for Cloud-Based Intelligent Connected Vehicle Under DoS Attacks. IEEE Trans. Autom. Sci. Eng. 2025, 22, 22817–22832. [Google Scholar] [CrossRef]

Figure 1. Structure of this survey.

Figure 2. CD Architecture.

Figure 3. CD Security Landscape.

Figure 4. Threat Model in CD Systems.

Figure 5. Taxonomy of perception-layer vulnerabilities.

Figure 6. Taxonomy of communication and fusion-layer vulnerabilities.

Figure 7. Taxonomy of planning and control-layer vulnerabilities.

Table 1. List of Key Acronyms.

Acronym	Description
CD	Collaborative Driving
AD	Autonomous Driving
LM	Language Model
LLM	Large Language Model
VLM	Vision–Language Model
MLLM	Multimodal Large Language Model
CP	Cooperative Perception
ML	Machine Learning
AI	Artificial Intelligence
RGB	Red, Green, Blue
LiDAR	Light Detection and Ranging
V2X	Vehicle-to-Everything
V2V	Vehicle-to-Vehicle
V2I	Vehicle-to-Infrastructure
V2H	Vehicle-to-Home
V2B	Vehicle-to-Building
V2G	Vehicle-to-Grid
V2D	Vehicle-to-Drone
DSRC	Dedicated Short-Range Communications
C-V2X	Cellular Vehicle-to-Everything
3GPP	3rd Generation Partnership Project
CVIS	Cooperative Vehicle-Infrastructure System
CAV	Connected and Autonomous Vehicle
BEV	Bird’s Eye View
SOTA	State-of-the-Art
VQA	Visual Question Answering
IMU	Inertial Measurement Unit
UAV	Unmanned Aerial Vehicle
DoS	Denial-of-Service
DDoS	Distributed Denial-of-Service
SLAM	Simultaneous Localization and Mapping
BSM	Basic Safety Message
UDP	User Datagram Protocol
BAC	Blind Area Confusion
MVIG	Mutual View Information Graph
FL	Federated Learning
XAI	Explainable AI
CNN	Convolutional Neural Network
GNN	Graph Neural Network
SSM	State Space Model
CoT	Chain-of-Thought
NL	Natural Language

Table 2. Comparison of related survey papers. CD refers to surveys on CD/CP systems. AD Sec. covers general AD cybersecurity. V2X Sec. covers communication-layer security within V2X systems. Full CD Sec. requires security coverage across the CD pipeline, over multiple layers. LMs in AD requires coverage of LM applications within AD. LM Sec. in CD requires explicit treatment of security risks or defenses for language-enabled CD systems.

Paper	Year	Coverage
Paper	Year	CD	AD Sec.	V2X Sec.	Full CD Sec.	LMs in AD	LM Sec. in CD
[5,9,10,28,33,34,35,36,37,38,66,67,68,69,70,71]	2019–2025	✓	✓	✓	✗	✗	✗
[72]	2021	✗	✓	✗	✗	✗	✗
[7,8,42,43,45,73]	2022–2025	✓	✗	✗	✗	✗	✗
[74,75,76,77]	2023–2026	✗	✗	✗	✗	✓	✗
[78]	2025	✗	✓	✗	✗	✓	✗
[24]	2025	✓	✗	✗	✗	✓	✗
Ours	2026	✓	✓	✓	✓	✓	✓

Note: ✓ indicates primary coverage of the topic, while ✗ indicates no primary coverage or only brief mention.

Table 3. Comparison of CD Models. The V2X category represents the forms of communication covered by the dataset, where V is V2V and I is V2I. In the data shared column, FM represents Feature Maps and PC is point clouds. In the architecture category, CNN is convolutional neural network, a class of neural network for image classification. Att. is attention. GNN is graph neural network, a class of neural network designed to process data as graphs. SSM is state space model, which compresses previous context, making it efficient for large inputs. In the task category, OD is object detection, E-E is end-to-end driving, SS is semantic segmentation, a task classifies pixels within data.

Model	Year	V2X	Fusion	Data Shared	Architecture	Task
[79]	2019	V	Early	LiDAR PC	CNN	OD
[80,81]	2019–2021	V	Inter.	LiDAR FM	CNN	OD
[82]	2020	V	Inter.	LiDAR FM	CNN + Att.	OD
[83]	2020	V	Inter.	LiDAR FM	GNN	E-E
[46,84]	2020–2021	V, I	Late	Messages	None	E-E
[50]	2021	V, I	Late	Messages	None	OD
[85]	2022	V, I	Inter.	LiDAR FM	Transformer	OD
[86]	2022	V	Inter.	BEV FM	Transformer	SS
[87,88,89]	2022–2023	V	Inter.	BEV FM	CNN + Att.	OD
[1]	2023	V, I	Inter.	BEV FM	CNN + Att.	OD
[2]	2024	V	Inter.	LiDAR FM	SSM	OD
[3]	2024	V	Inter.	BEV FM	CNN	E-E
[4]	2024	V	Inter.	BEV FM	CNN	OD
[90]	2025	V, I	Early	Trajectory	Transformer	E-E
[91]	2025	V, I	Early	BEV FM	CNN	OD
[47]	2025	V, I	Inter.	MM FM	Transformer	OD
[48]	2025	V, I	Inter.	BEV FM	Diffusion	OD

Table 4. Comparison of LM-Based Single-Perception Models. In the reasoning column, NL is Natural Language reasoning, where the model uses human language to come to conclusions. In the Closed/Open Loop category, CL is closed-loop and OL is open-loop.

Model	Year	Modality	LM	Reasoning	Closed/Open Loop	Input	Explainable
[61]	2023	LLM	GPT-3.5	NL	CL	Prompts	✓
[22]	2023	MLLM	LLaMa2	NL	OL	VQA	✓
[92]	2023	LLM	GPT-3.5	Structured	CL	Prompts	✗
[93]	2023	MLLM	LLaVa-1.5	NL	CL	Prompts	✓
[23]	2024	LLM	GPT-4	NL	OL	Prompts	✓
[94]	2024	LLM	Mistral	Structured	CL	Prompts	✗
[95]	2024	LLM	GPT-4	Structured	CL	Prompts	✗
[96]	2024	VLM	Qwen-VL	NL	OL	Prompts	✓
[97]	2025	VLM	BLIP-2	NL	OL	VQA	✓
[98]	2025	VLM	GPT-4o	NL	OL	VQA	✓
[99]	2025	VLM	LLaMa-3.2-vision	Structured	OL	Prompts	✓
[62]	2025	MLLM	LLaMa-7B	Structured	CL, OL	Prompts	✓
[100]	2026	MLLM	Qwen3-7B	Structured	OL	VQA	✓

Note: ✓ indicates that the model provides explainable outputs, while ✗ indicates that the model does not provide explainable outputs.

Table 5. Comparison of LM-Based CD Models. The V2X category represents the forms of communication covered by the dataset, where V is V2V and I is V2I. FM in the data shared column represents Feature Maps. In the reasoning column, CoT is Chain-of-Thought, a reasoning strategy where the model breaks down a problem into a series of steps. NL is Natural Language reasoning, where the model uses human language to come to conclusions.

Model	Year	V2X	Fusion	Data Shared	Modality	LM	Input	Reasoning
[101]	2024	V	Late	Messages	LLM	GPT-4o-mini	Prompts	CoT
[40]	2025	V	Late	Scene FM	MLLM	LLaVA-v1.5-7b	VQA	NL
[44]	2025	V, I	Late	Messages	LLM	GPT-4	Prompts	NL
[102]	2025	V	Late	Messages	VLM	InternVL2-4B	Prompts	NL
[103]	2025	V	Late	Messages	VLM	Agnostic	Prompts	CoT
[104]	2025	V	Late	Messages	LLM	GPT-3.5	Prompts	CoT
[105]	2025	V, I	Inter.	Camera FM	VLM	Florence-2	Prompts	NL
[106]	2025	V	Late	Scene FM	MLLM	LLaVA-v1.5-7b	VQA	NL
[107]	2026	V	Late	Messages	VLM	GPT-4o	Prompts	Structured

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Nayak, S.; Gungor, O.; Rosing, T. Security in Collaborative Driving: A Survey of Threats, Defenses, and Emerging Trends. Electronics 2026, 15, 2389. https://doi.org/10.3390/electronics15112389

AMA Style

Nayak S, Gungor O, Rosing T. Security in Collaborative Driving: A Survey of Threats, Defenses, and Emerging Trends. Electronics. 2026; 15(11):2389. https://doi.org/10.3390/electronics15112389

Chicago/Turabian Style

Nayak, Sahil, Onat Gungor, and Tajana Rosing. 2026. "Security in Collaborative Driving: A Survey of Threats, Defenses, and Emerging Trends" Electronics 15, no. 11: 2389. https://doi.org/10.3390/electronics15112389

APA Style

Nayak, S., Gungor, O., & Rosing, T. (2026). Security in Collaborative Driving: A Survey of Threats, Defenses, and Emerging Trends. Electronics, 15(11), 2389. https://doi.org/10.3390/electronics15112389

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Security in Collaborative Driving: A Survey of Threats, Defenses, and Emerging Trends

Abstract

1. Introduction

2. Background

2.1. CD Architecture

2.1.1. Perception Layer

2.1.2. Communication and Fusion Layer

2.1.3. Planning and Control Layer

2.2. LM Integration

2.3. Security Landscape of CD

3. Related Work

3.1. Surveys on CD

3.2. Surveys on V2X Communication Security

3.3. Surveys on AD System Security

3.4. Surveys on LMs for AD

4. Overview of Collaborative Driving Models and Datasets

4.1. V2X CD Models

4.1.1. Early Fusion Models

4.1.2. Late Fusion Models

4.1.3. Intermediate Fusion Models

4.2. LMs in Single-Perception AD

4.3. Language-Based CD Models

4.4. Datasets

5. Cybersecurity Threats and Defense Mechanisms in CD

5.1. Threat Model

5.1.1. Attacker Type

5.1.2. Attacker Knowledge

5.1.3. Attacker Objective

5.2. Perception Layer

5.2.1. Sensor Spoofing

5.2.2. Temporal Attacks and Defenses

5.3. Communication and Fusion Layer

5.3.1. Jamming

5.3.2. Message Spoofing

5.3.3. Eavesdropping

5.3.4. DoS/DDoS

5.3.5. Replay

5.3.6. Sybil

5.3.7. Ransomware

5.4. Planning and Control Layer

5.4.1. Evasion Attacks

5.4.2. Data Poisoning

5.4.3. Backdoor Attacks and Defenses

5.4.4. Latency Attacks and Defenses

5.4.5. Language-Based Attacks and Defenses

5.4.6. Cross-Layer Propagation of Attacks

6. Research Gaps and Future Directions

7. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI