Review

Integrating Knowledge Graphs into Autonomous Vehicle Technologies: A Survey of Current State and Future Directions

by Swe Nwe Nwe Htun and Ken Fukuda *

National Institute of Advanced Industrial Science and Technology, Artificial Intelligence Research Center, Tokyo 135-0064, Japan

* Author to whom correspondence should be addressed.
Information 2024, 15(10), 645; https://doi.org/10.3390/info15100645
Submission received: 31 August 2024 / Revised: 6 October 2024 / Accepted: 10 October 2024 / Published: 16 October 2024
(This article belongs to the Special Issue Knowledge Graph Technology and its Applications II)

Abstract

Autonomous vehicles (AVs) represent a transformative innovation in transportation, promising enhanced safety, efficiency, and sustainability. Despite these promises, achieving robustness, reliability, and adherence to ethical standards in AV systems remains challenging due to the complexity of integrating diverse technologies. This survey reviews literature from 2017 to 2023, analyzing over 90 papers to explore the integration of knowledge graphs (KGs) into AV technologies. Our findings indicate that KGs significantly enhance AV systems by providing structured semantic understanding, improving real-time decision-making, and ensuring compliance with regulatory standards. The paper identifies that while KGs contribute to better environmental perception and contextual reasoning, challenges remain in their seamless integration with existing systems and in maintaining processing speed. We also address the ethical dimensions of AV decision-making, advocating for frameworks that prioritize safety and transparency. This review underscores the potential of KGs to address critical challenges in AV technologies, offering an optimistic outlook for the development of robust, reliable, and socially responsible autonomous transportation solutions.

1. Introduction

Autonomous vehicles (AVs) lead a transportation revolution, promising enhanced safety, reduced traffic congestion, and improved energy efficiency [1,2]. AVs are commonly defined as driverless vehicles capable of operating on standard road infrastructure without human intervention. However, the term AV can encompass a range of automation levels [3]. According to the Society of Automotive Engineers (SAE), vehicle automation is classified into six levels, from Level 0 (no automation) to Level 5 (full automation), as illustrated in Figure 1. Our paper focuses on AVs operating at Levels 3 to 5. At these levels, AVs can perform most or all driving tasks: Level 5 represents full autonomy in any environment, while Level 3 still requires occasional human intervention. Although the default definition of an AV aligns most closely with Levels 4 and 5, this paper takes the broader view and also addresses systems operating at Level 3 automation.
The journey from early prototypes in the mid-20th century to today’s advanced AV systems has been marked by significant research and technological breakthroughs. In the United States, major tech companies and automakers, such as Waymo [4] and General Motors [5], are pioneering innovations and conducting extensive real-world testing. Despite these advancements, regulatory hurdles and public acceptance remain critical challenges, as noted by the National Highway Traffic Safety Administration (NHTSA) [6].
In China, rapid AV development is propelled by strong government support and a vast market, with large-scale data availability and extensive testing positioning the country as a global leader. Notably, China led the development of the first international standard for autonomous driving test scenarios, as reported by the Ministry of Industry and Information Technology (MIIT) [7].
Meanwhile, Japan’s approach emphasizes safety and the integration of AVs with existing transportation systems, addressing the needs of an aging population and advancing smart city initiatives. Reports from Japan’s Ministry of Economy, Trade and Industry (METI) [8] and the Japan Automobile Manufacturers Association (JAMA) [9] highlight Japan’s significant progress in developing AVs capable of navigating complex urban environments. These global advancements underscore the critical need for sophisticated systems that encompass perception, decision-making, control, and communication to navigate the complexities of real-world environments successfully. However, transitioning from human-driven to fully autonomous vehicles presents significant challenges across technology, regulation, and public acceptance.
Knowledge graphs (KGs) have emerged as a powerful tool to manage complex, interconnected data, offering significant potential to address key challenges in AV technologies. By enhancing data integration, contextual understanding, and operational efficiency, KGs can revolutionize how AVs interpret and navigate their environments. Current AV perception systems, relying on sensors like cameras, LiDAR, and radar, often struggle with accurately understanding complex driving scenarios, especially in unpredictable conditions such as varying weather, lighting, or sudden obstacles [10,11,12]. Misinterpretations—whether due to false positives, incorrect object classification, or delayed processing—can compromise vehicle safety and reliability [13]. KGs help mitigate these issues by providing a structured representation of diverse data, allowing AVs to better understand and interpret their surroundings [14]. KGs improve object recognition and scene interpretation, especially in challenging situations like distinguishing between a stationary object and a pedestrian about to cross the street.
In addition to perception, AVs must make rapid, safe, and ethically sound decisions in complex environments like urban streets, where pedestrians, cyclists, and other vehicles create dynamic scenarios [15]. These decisions must minimize harm while aligning with societal values, such as prioritizing pedestrian safety, even at the risk of a collision with another vehicle. AVs also face challenges related to uncertainty (incomplete or unclear sensor data) and ambiguity (situations that can be interpreted in multiple ways, like whether a pedestrian is about to cross). Traditional rule-based systems [16], while effective in predictable environments, struggle to adapt to unexpected situations, such as sudden pedestrian movements or unclear traffic signals. KGs address these limitations by enabling AVs to connect real-time sensor inputs with pre-established knowledge, making context-aware decisions in complex, uncertain scenarios [14,17]. This integration could significantly enhance the safety, adaptability, and trustworthiness of autonomous driving systems.
Existing research on AVs often focuses on individual system components, such as perception, decision-making, and control. However, there is a gap in understanding how KGs can significantly enhance these systems by improving how they process and interpret information. Despite their potential, few reviews address how KGs can be cohesively integrated with AV technologies. Yurtsever et al. [18] comprehensively review general AV technologies, covering core functions, system architectures, and key challenges. While their work establishes a solid foundation for understanding AV systems, it does not explore emerging technologies like KGs. Similarly, Badue et al. [19] offer an in-depth examination of AV technologies, discussing aspects like architecture, perception, decision-making, and control mechanisms while highlighting key advancements. However, this survey likewise does not cover KGs. Zhao et al. [20] present an extensive overview of AV technologies, addressing critical components such as perception, localization, mapping, planning, and control. Their work emphasizes challenges related to real-time decision-making, safety, scalability, and sensor integration but does not delve into the potential role of KGs in enhancing these systems. While these surveys provide valuable insights, there is a clear need for more comprehensive reviews that specifically examine how KGs could transform AV technologies.
Luettin et al. [14] focus specifically on the role of KGs in AVs, exploring their application in tasks like perception, decision-making, validation, and scene understanding. However, their survey does not extensively address the challenges involved in implementing KGs, leaving a gap in understanding the practical hurdles of integration.
Our survey addresses a gap in the literature by specifically exploring the role of KGs in AV systems. Unlike other works, such as those by Yurtsever et al. [18], Badue et al. [19], and Zhao et al. [20], which do not delve into the potential of KGs, our work emphasizes their importance in providing semantic understanding, decision-making support, and the ability to tackle complex AV challenges. While Luettin et al. [14] provide a good starting point by discussing the application of KGs in AV tasks like perception and decision-making, their work does not sufficiently address the challenges involved in implementing KGs in real-world systems. Our contribution fills this gap by examining not only the benefits but also the practical hurdles of integrating KGs into AV systems, such as scalability, data availability, and the complexity of real-time decision-making. The comparison between our work and closely related survey studies is presented in Table 1.
This paper comprehensively reviews state-of-the-art AV technologies and the current state of integrating KGs into AV technologies. Moreover, we identify research challenges and propose future directions for developing robust, reliable, and socially responsible autonomous transportation solutions. By synthesizing existing knowledge and shedding light on emerging trends, this survey aims to contribute to the advancement of AV technologies that are proficient, ethically sound, and accepted by society.
The paper is organized as follows. Section 2 presents the review strategy. Section 3 covers fundamental concepts and recent advancements in AV technologies, such as perception, localization, path planning, and decision-making. Section 4 examines the role of KGs in enhancing AV systems, focusing on their contributions to semantic understanding, data integration, and real-time decision-making. In addition, it investigates the practical integration of KGs with AV technologies, addressing challenges and impacts on performance. Section 5 explores the ethical dimensions of AV decision-making, emphasizing the need for tailored ethical frameworks. Finally, Section 6 presents the conclusions.

2. Review Strategy

Defining the review strategy is a fundamental aspect of a systematic review [17]. This section details the review strategies employed in this paper. The review strategy we propose is structured to explore studies that contribute to the integration of KGs into AV technologies. It is composed of three key elements: research questions, publication retrieval, and article selection. The following research questions, as presented in Table 2, were formulated to guide the article analysis.
For a comprehensive review, we retrieved publications from reputable databases, including IEEE Xplore and Google Scholar. These databases were selected for their broad coverage of relevant journals and conferences, ensuring a thorough review. Keywords used in the search include “knowledge graphs”, “autonomous vehicles”, and “integration”. The initial search yielded a significant number of results from the databases, with publications dated between 2017 and 2023. Articles were then selected for detailed review based on the following filtering process:
  • Relevance: Articles were selected not only for their focus on the integration of KGs with autonomous vehicle technologies but also for their contribution to understanding the background and fundamentals of AV technologies.
  • Manual Screening: Abstracts of the identified articles were manually reviewed to assess alignment with the research questions. Only studies that were directly relevant to AV technologies and KG integration or offered valuable insights into AV technologies were retained for further analysis.
  • Institutional Expertise: To capture cutting-edge research and institutional expertise, we specifically included sources from Toyota Research Institute, Kanazawa University, The University of Tokyo, and the National Institute of Advanced Industrial Science and Technology (AIST) as part of the filtering process.
  • Peer-Review Status: We ensured that the review is based on peer-reviewed sources. Preprint articles from repositories such as ArXiv were excluded unless their final, peer-reviewed versions were available. Articles with discrepancies between preprint and published versions were cross-checked, and only the final published versions were retained for analysis.
After applying these criteria, approximately 85 articles were identified as most relevant for inclusion in this review. Figure 2 illustrates the percentage of papers published each year between 2017 and 2024, reflecting the final set of articles selected for analysis.

3. Background and Fundamentals

This section aims to provide a comprehensive overview of the key components of AV technologies. In support of this analysis, we reference around 50 journals and conferences that have made significant contributions to AV research. Notable among them are the IEEE/CVF International Conference on Computer Vision (ICCV), IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Robotics: Science and Systems (RSS), IEEE Transactions on Intelligent Transportation Systems, IEEE Robotics and Automation Letters, and the International Conference on Robotics and Automation (ICRA). These venues are renowned for their substantial impact on AV-related topics, as summarized in Table 3, which is based on a selection of surveys referencing these notable sources.
Furthermore, key contributors to the AV field, such as V.C. Guizilini, A. Gaidon, and R. Ambrus, have collectively authored significant papers on essential AV topics. Their extensive work underscores the depth of research covered in our survey. Table 4 outlines the primary research contributions made by these scholars.
To further elucidate the landscape of AV research, we analyze the popularity and focus areas of key components in AV technologies. Table 5 details the popularity of these components and the reasons behind the research emphasis they receive. This analysis highlights which components are receiving significant attention and experiencing substantial growth. Among these, we briefly discuss the components where the research focus is either already high or growing significantly in Section 3.1 through Section 3.6. By examining the interrelationships between these components, we aim to demonstrate how advancements in each area contribute to AV systems’ performance and reliability. For a clear and concise overview of the research focus across different AV technologies, refer to Table 6.

3.1. Perception

The perception of AVs has evolved significantly, incorporating advanced techniques to handle the complex and dynamic environments encountered during operation. These systems rely on data from various sensors, including cameras and LiDAR. Approaches include using cameras alone, LiDAR alone, or combining both through sensor fusion to enhance perception. Since sensors are essential for determining both the state of the AV and its environment, Table 7 provides an overview of the different sensors integrated into autonomous vehicles.
Building on this foundational sensor data, key areas of focus in perception include segmentation [1,2,38], which divides visual data into meaningful parts, enabling the identification and localization of objects. Street-view change detection [25] enhances these systems by recognizing environmental alterations that could impact navigation. Monocular depth estimation [21,26,30,39] helps the system understand the 3D structure of a scene using just one camera, which is especially important when other sensors, like LiDAR, are not available.
Innovations such as sparse view synthesis [22] and rigorous calibration methods [28,57] ensure that sensor data are both accurate and comprehensive, aiding in reliable object detection and scene interpretation. Multi-object tracking [32], 3D object detection [31,63], and LiDAR-based flow estimation in bird’s eye view (BeV) [29] enhance obstacle state estimation by accurately predicting the dynamic state of objects from consecutive point cloud data, enabling AV systems to maintain situational awareness and predict potential hazards in real time.
Advanced techniques like ego-motion estimation [27], occupancy prediction [33], and visual odometry [44] further enhance the vehicle’s understanding of its surroundings, ensuring robust navigation even in challenging conditions. Additionally, the detection of driver alertness [46], pedestrian locomotion [47,63], and traffic signals [51,57] contribute to the system’s ability to interact safely and effectively with human operators and other road users.
By recognizing and adapting to environmental conditions [56] and road surface features [52], these systems can adjust their behavior to ensure safe operation. However, the reliability of perception systems remains paramount, with ongoing research addressing challenges such as sensor noise, adverse weather, poor lighting, and high traffic density, all of which can affect the accuracy and safety of autonomous driving systems.

3.2. Localization and Mapping

Localization ensures that an AV accurately identifies its position in the environment. This is typically achieved by fusing data from a global positioning system (GPS), inertial measurement units (IMUs), and visual-based sensors, such as LiDAR and cameras. Localization approaches generally fall into two categories: simultaneous localization and mapping (SLAM), which performs localization and mapping concurrently, and offline mapping, where the map is constructed separately. Mapping, essential for creating detailed HD maps, can be classified into online and offline approaches. LiDAR remains a primary sensor for HD map generation, although vision-based methods also contribute through visual SLAM and deep learning techniques. HD map generation typically involves collecting and aligning point clouds, labeling map elements, and frequent updates, making it labor intensive.
Updating and maintaining maps [25] is necessary to ensure that the system’s understanding of its environment remains accurate over time, especially in dynamic settings. Techniques such as depth-aware mapping [26] integrate depth information from sensors like LiDAR to enhance map detail and accuracy, while multi-camera approaches [28] leverage data from multiple cameras to create comprehensive map representations.
Ego-motion estimation [27] updates the vehicle’s position on the map and is further refined using probabilistic localization methods [34] that estimate the most likely position based on sensor data. Similarly, creating and updating occupancy maps [33], which delineate occupied and accessible spaces, is vital for safe navigation and obstacle avoidance. Visual odometry [41,44] estimates movement through the environment, complementing other mapping techniques.
Advanced mapping methods that integrate global navigation satellite system/inertial navigation system–real-time kinematic (GNSS/INS-RTK) positioning [48] offer high-precision localization. Different map representations [53]—such as grid maps and topological maps—affect how the system navigates and localizes itself in various scenarios.
Additionally, generating 2.5D maps using LiDAR and Graph SLAM [49] provides a simplified yet effective 3-dimensional representation of the environment, particularly useful in multilevel environments [50], like parking garages, for precise vehicle localization. The dual process of map generation and localization [57] underpins the system’s ability to navigate effectively, with 3D LiDAR mapping [35] offering detailed environmental scans that support robust localization even in complex terrains. Comprehensive 3D mapping frameworks [36,42] integrate these diverse techniques, facilitating the creation and continuous updating of maps critical for autonomous navigation.

3.3. Path Planning

Path planning in autonomous systems involves generating safe and efficient trajectories for navigating complex environments. Control-aware prediction [37] enhances this process by considering vehicle dynamics and control constraints when predicting future states and planning paths. Flow estimation [29] also plays a role in understanding traffic patterns and the movement of other vehicles, which is crucial for effective path planning and collision avoidance.
In scenarios where potential accidents are a concern, planning near-accident driving scenarios [45] ensures that the vehicle can respond appropriately to sudden environmental changes or hazards. Point-to-point navigation [34] calculates optimal routes between specific locations, ensuring the vehicle reaches its destination efficiently.
Safety trajectory generation [58] is a crucial aspect of path planning, aiming to create trajectories that minimize risk and enhance overall safety. This involves considering the movements of the vehicle and other road users to avoid collisions and ensure smooth operation. The driver’s target trajectory [54] provides a reference for planning and aligning the vehicle’s trajectory with the expected path of human drivers.
Interactive trajectory prediction [60] involves adapting the planned trajectory based on real-time interactions with other road users, such as adjusting to changes in traffic conditions or the actions of nearby vehicles. Additionally, lane-change style classification [64] helps the system understand different lane-change behaviors, which can inform planning strategies and improve the overall driving experience. Together, these advanced methods address the challenges of real-time decision-making and long-term prediction, significantly enhancing both path and trajectory planning in autonomous driving systems.

3.4. Control

Control in autonomous systems encompasses various strategies and technologies to ensure the vehicle operates safely and effectively. Safety verification [24] is crucial for validating that the control systems adhere to safety standards and prevent dangerous behaviors. Control-aware prediction [37] is also a key strategy that integrates vehicle dynamics into the prediction process. This ensures that the control commands are not just realistic but also feasible, given the vehicle’s capabilities and constraints, resulting in a better alignment of the planned trajectories with the vehicle’s control limits.
Interpretable policies [40] are also a significant aspect of control strategies. These policies are designed to be transparent and understandable, fostering better analysis and trust in the decision-making process. They guide the vehicle’s actions in different situations, making the control system more robust and reliable.
Additionally, controlling a vehicle’s actions [45] focuses on implementing control strategies to execute the planned maneuvers, ensuring that the vehicle performs the desired actions accurately. This includes generating control commands [34] that translate high-level plans into specific actions for the vehicle. Behavior cloning [23] involves training control systems using data from human drivers to replicate their driving behaviors, which can enhance the vehicle’s ability to handle complex driving scenarios.
Chassis performance [59] refers to the system’s ability to manage and optimize the vehicle’s physical movements, including steering and acceleration. Controlling vehicle steering [55] is a specific aspect of control that involves adjusting the vehicle’s direction to ensure precise maneuvering. Automated lane-change control [64] involves managing the vehicle’s transitions between lanes, typically integrating with path planning to ensure smooth and safe lane changes. Controlling lateral and longitudinal vehicle movements [61] covers the broader scope of managing both the vehicle’s side-to-side and forward/backward movements, which is essential for maintaining desired trajectories and ensuring smooth operation.
In summary, effective control strategies integrate predictive models, transparent decision-making, and precise maneuver execution to enhance AVs’ overall safety and performance.

3.5. Decision-Making

Decision-making in autonomous systems involves complex processes to ensure safe and effective driving by interpreting data and predicting outcomes. Control-aware prediction [37] integrates vehicle dynamics into the decision-making process, helping ensure that decisions are feasible and align with the vehicle’s capabilities. Pedestrian intent prediction [43] enables the vehicle to make informed decisions to avoid potential collisions and ensure safety. This involves analyzing pedestrian movements and predicting their future actions.
Moreover, switching between different driving modes [45] allows the vehicle to adapt to different driving conditions and scenarios, enhancing its flexibility and responsiveness. This capability is vital for transitioning between manual and autonomous driving modes or different levels of automation. Behavior cloning [23] leverages data from human drivers to replicate their decision-making processes, providing a basis for the vehicle to handle complex driving scenarios by mimicking human behavior. Another technique, predicted probability distribution [34], helps evaluate the likelihood of various outcomes based on current data, aiding in making more informed decisions by assessing potential risks and benefits.
Weather conditions in decision-making [56] involve incorporating environmental factors such as rain, snow, or fog into the decision-making process, ensuring that the vehicle can adapt to changing weather conditions and maintain safe operation. Additionally, decision-making on predicted driver behavior [58] focuses on anticipating and responding to the behavior of other drivers, enhancing the vehicle’s ability to interact safely and effectively in mixed-traffic scenarios. The driver’s target trajectory [54] aligns the vehicle’s actions with the driver’s intended path, ensuring that the vehicle’s decisions support the overall driving goals.
Monitoring the duration of takeover time [62] involves assessing the time required for a driver to take control of the vehicle from autonomous mode, which is crucial for ensuring a smooth transition and maintaining safety. Additionally, decision-making on lane changes [64] encompasses strategies for safely and efficiently changing lanes, integrating with path planning and control systems to execute smooth lane transitions. Decision-making based on dynamic traffic conditions [61] involves adapting decisions based on real-time traffic data, ensuring that the vehicle can effectively navigate through varying traffic scenarios.
Despite advancements, decision-making in autonomous systems faces challenges such as accurately predicting complex driver behaviors, adapting to dynamic and adverse conditions, and ensuring seamless transitions between different driving modes, all of which are critical for enhancing overall system reliability and safety.

3.6. Human–Machine Interaction (HMI)

HMI in semi-autonomous systems optimizes communication between the driver and the vehicle, improving safety and usability. In driver assistance systems (Levels 2–3), the vehicle relies on driver inputs for tasks like steering and monitoring the environment. Technologies such as sparse view synthesis and scene visualization [22] help enhance the vehicle’s perception of its surroundings, which is then communicated to the driver through real-time feedback systems [46]. This feedback includes driver state monitoring to ensure the driver is engaged, enabling timely interventions when needed. Furthermore, eye tracking and monitoring duration [62] provide insights into driver attention patterns to refine interaction interfaces.
In fully autonomous systems (Levels 4–5), the HMI’s main role is to keep the passenger informed about the route, the vehicle’s status, and any important conditions. Since the vehicle handles all driving tasks, the passenger does not need to take control. Instead, the HMI focuses on providing clear updates, such as navigation and system performance. Studies involving driving simulations and user interactions [64] help improve these interfaces, making sure they are easy to use and provide passengers with the information they need in a simple and intuitive way.

4. Knowledge Graphs and Ontologies

KGs are pivotal in structuring and representing knowledge through graph formats, where nodes represent entities and edges capture relationships between them. According to Hogan et al. [65], a KG is “a graph of data with the objective of accumulating and conveying real-world knowledge”, with entities being either tangible objects or abstract concepts and relationships indicating how these entities are connected. KGs utilize factual triples (e.g., Albert Einstein, WinnerOf, Nobel Prize), formalized under the Resource Description Framework (RDF). Formally, a KG can be represented as a set of triples G = {(h, r, t)}, where each head h is drawn from a set of entities H, each relation r from a set of relations R, and each tail t from a set T comprising entities and literals.
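To make the triple formalism concrete, the following minimal sketch encodes the (Albert Einstein, WinnerOf, Nobel Prize) fact as an RDF triple using the open-source rdflib library; the example.org namespace and identifier names are illustrative placeholders rather than part of any standard vocabulary.

```python
# A minimal RDF triple, assuming the rdflib library; names are illustrative.
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/")

g = Graph()
# The factual triple (Albert Einstein, WinnerOf, Nobel Prize) from the text:
g.add((EX.Albert_Einstein, EX.winnerOf, EX.Nobel_Prize))

# Every statement in the graph is a (head, relation, tail) triple.
for h, r, t in g:
    print(h, r, t)
```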
Different graph models can be used to construct KGs, including directed labeled graphs, which feature nodes connected by directed, labeled edges to represent binary relationships intuitively; hyper-relational graphs, which are directed multigraphs with nodes and edges that can have associated key–value pairs for more complex representations; and hypergraphs, which extend the concept of binary edges to model multiple and complex relationships, including nested graphs within hypernodes [14]. These models provide flexible frameworks for capturing diverse data structures.
Building on these fundamental structures, knowledge graph embeddings (KGEs) convert the symbolic information in KGs into low-dimensional vectors that reflect the semantic meanings of entities and relationships [14]. Unsupervised KGE methods generate embeddings based on the inherent structure and attributes of the KG, using approaches such as statistical relational learning and embedding techniques [66,67]. In contrast, supervised KGE methods optimize embeddings for specific tasks using labeled data and include techniques such as graph neural networks (GNNs), graph convolutional networks (GCNs), and graph attention networks (GATs) [68,69]. Recent surveys, including Monka et al. [70], illustrate the growing interest in integrating KGs with machine learning methods, highlighting their applications in fields like automated driving. Thus, while the graph models provide the structure and organization of knowledge, KGEs enhance its practical application by translating this structured data into meaningful representations for different tasks.
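As a concrete illustration of the embedding idea, the sketch below scores a triple with a translational model in the style of TransE, one of the classic KGE techniques; the vocabulary, dimensionality, and random initialization are placeholders, and a real system would learn these vectors from the KG rather than sample them.

```python
# Illustrative TransE-style scoring of a triple; the vectors here are random
# placeholders, whereas a trained model would learn them from the KG.
import numpy as np

rng = np.random.default_rng(0)
dim = 50

entities = {name: rng.normal(size=dim) for name in ["Einstein", "NobelPrize"]}
relations = {name: rng.normal(size=dim) for name in ["winnerOf"]}

def transe_score(h, r, t):
    # TransE treats a relation as a translation in vector space, so a
    # plausible triple should have h + r close to t (less negative score).
    return -np.linalg.norm(entities[h] + relations[r] - entities[t])

print(transe_score("Einstein", "winnerOf", "NobelPrize"))
```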
In the context of KGs, ontologies provide a formal schema that defines the types of entities and relationships within a specific domain. Philosophically, ontology refers to the study of categories and their interrelations, but in computer science, it represents a formal and explicit specification of a shared conceptualization [14]. Ontologies address semantic heterogeneity and enable interoperability by establishing a common understanding of terms and their interrelationships. In the domain of automated driving (AD), ontologies are essential for modeling and integrating knowledge related to vehicles, drivers, routes, and driving environments. They ensure consistent interpretations of concepts such as vehicle types, driver behaviors, and traffic scenarios.
Building upon this base, specific ontologies tailored to different aspects of AD provide structured frameworks for integrating domain-specific knowledge. For example, Vehicle Model Ontologies enhance data interoperability by representing vehicle-related concepts, including types, components, and sensors. Driver Model Ontologies focus on characterizing driver behaviors and styles, supporting driver assistance systems, while Driver Assistance Ontologies underpin decision-making and information sharing related to driver assistance features and human–machine interactions. Additionally, Routing and Context Model Ontologies facilitate route planning and contextual modeling by addressing elements like road geometry, traffic signs, and environmental conditions. Cross-Cutting Ontologies encompass different aspects of driving automation, such as different levels of automation, risk assessment, and interactions with human–driven vehicles. Collectively, these ontologies promote semantic consistency across systems and enhance the capabilities of AD systems by enabling the comprehensive integration of complex scenarios and diverse data sources [14].

4.1. Knowledge Graphs Integrated into AV Technologies

KGs have been increasingly recognized for their crucial role in autonomous driving, particularly in enhancing situation comprehension. The integration of KGs into these domains provides a structured approach to handling and interpreting complex driving scenarios and sensor data, thereby improving the decision-making process of autonomous vehicles. The foundational understanding presented in Section 3 sets the stage for exploring the challenges and research opportunities that KGs can address when integrated into AV technologies. In the following subsections, we delve into KGs’ contributions to AVs across 12 papers. These contributions highlight how KGs enable more efficient and explainable models, ultimately leading to safer and more reliable AV systems. Table 8 provides an overview of KGs’ contributions to AV applications.

4.1.1. Scene Representation

The research effort [71] on context and situation intelligence (CoSI) introduces a KG-based framework aimed at enhancing situation comprehension in driving scenarios, as illustrated in Figure 3. This framework integrates diverse information sources—such as driver state, destination, personal preferences, and the surrounding environment—into a unified KG structure. Figure 3 provides an excerpt of the CoSI knowledge graph (CKG), demonstrating how environmental information is represented through instances (assertional box) of ontological concepts (terminological box). The CoSI ontology, which underpins this KG, models the key aspects as follows:
  • Scene: Refers to a snapshot of the environment, including both static and dynamic elements, as well as the self-representations of actors and observers and the relationships among these entities.
  • Situation: Represents the complete set of circumstances considered when choosing an appropriate behavioral pattern at a specific moment. It includes all relevant conditions, options, and factors influencing behavior.
  • Scenario: Describes the progression over time across multiple scenes, including actions, events, and goals that define this temporal development.
  • Observation: Involves the process of performing a procedure to estimate or determine the value of a property of a feature of interest.
  • Driver: A user with attributes specific to the driving context.
  • Profile: Structured representation of user characteristics.
  • Preference: A concept used in psychology, economics, and philosophy to describe a choice between alternatives. For instance, a person shows a preference for A over B if they would opt for A rather than B.
This ontology exemplifies how semantic modeling helps integrate and interpret both sensor data and driver-related information, including preferences and abilities. By representing this information as entities and their inter-relationships and incorporating semantic axioms, the KG enables advanced reasoning and inference capabilities. The use of axiomatic rules and KG embedding techniques in a GNN architecture has shown significant improvements in situation classification, difficulty assessment, and trajectory prediction.
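A hedged sketch of how such an ontology-backed KG might be populated is given below: the terminological box declares concepts such as Scene and Driver, and the assertional box instantiates a concrete snapshot, mirroring the structure described above. All class, property, and instance names are our own illustration, not the published CoSI vocabulary.

```python
# Hypothetical CoSI-style scene assertion with rdflib; the cosi# namespace
# and all names are illustrative, not the actual CoSI ontology.
from rdflib import Graph, Namespace, RDF, RDFS

COSI = Namespace("http://example.org/cosi#")
g = Graph()

# Terminological box: ontological concepts.
g.add((COSI.Scene, RDF.type, RDFS.Class))
g.add((COSI.Driver, RDF.type, RDFS.Class))

# Assertional box: one snapshot of the environment and its actors.
g.add((COSI.scene_42, RDF.type, COSI.Scene))
g.add((COSI.driver_1, RDF.type, COSI.Driver))
g.add((COSI.scene_42, COSI.hasActor, COSI.driver_1))
g.add((COSI.driver_1, COSI.hasPreference, COSI.avoidHighways))
```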
Additionally, Wickramarachchi et al. [82] highlight the benefits of knowledge graph embeddings (KGEs) in facilitating neuro-symbolic fusion. This approach improves the predictive performance of machine learning models in autonomous driving by integrating symbolic reasoning with neural network capabilities.
Further, Wang et al. [83] presented an approach to predicting pedestrian trajectories. As illustrated in Figure 4, the framework is composed of two main modules: the spatio-temporal interaction aware module and the trajectory distribution aware module, followed by a trajectory decoder. First, the spatio-temporal interaction-aware module captures both spatial and temporal relationships between pedestrians. This module includes spatial self-attention, which focuses on the most relevant pedestrian interactions at each time step, and a graph convolutional network (GCN) that models pedestrian interactions by treating them as nodes in a graph, with edges representing interactions between them. Additionally, it employs a temporal asymmetric network to process the temporal evolution of trajectories, with a stronger emphasis on recent movements, and a temporal self-attention mechanism to highlight the most critical time steps. The module also incorporates a spatial asymmetric network, which accounts for the unequal influence pedestrians have on each other based on their positions and movements. Residual connections ensure that important features are preserved as they pass through the model. Second, the trajectory distribution aware module handles uncertainty in trajectory prediction by encoding both observed trajectories (past movements) and ground truth trajectories (actual future paths). This allows the model to capture the underlying distribution of possible future movements, enabling it to generate potential future trajectories that reflect the inherent variability in pedestrian behavior. Lastly, the trajectory decoder uses a combination of the temporal asymmetric CNN (TA-CNN), which focuses on the temporal dynamics of predicted trajectories, and the trajectory embedding CNN (TE-CNN), which converts the learned features into spatial coordinates. This results in the generation of multiple plausible future trajectories, offering possible pedestrian paths. By modeling both spatio-temporal interactions and trajectory distributions, the framework enhances the understanding of pedestrian movement in complex environments, improving decision-making and planning.
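To ground the graph-convolution step described above, the following minimal numpy sketch applies one generic GCN propagation layer (in the sense of Kipf and Welling) to a toy pedestrian-interaction graph; it is not the paper’s exact architecture, and the adjacency, features, and weights are placeholders.

```python
# Minimal sketch of one GCN layer over a pedestrian-interaction graph:
# pedestrians are nodes, interactions are edges, and one propagation step
# mixes each pedestrian's features with those of its neighbors.
import numpy as np

def gcn_layer(A, H, W):
    """One GCN layer: symmetrically normalized adjacency, then linear + ReLU."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # D^{-1/2}
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

# Three pedestrians; 0 and 1 interact, 1 and 2 interact.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = np.random.default_rng(0).normal(size=(3, 4))  # per-pedestrian features
W = np.random.default_rng(1).normal(size=(4, 8))  # learnable weights
print(gcn_layer(A, H, W).shape)                   # (3, 8)
```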
Road scene-graph representations, when combined with graph learning techniques, have recently surpassed deep learning methods in action classification, risk assessment, and collision prediction. Ref. [72] introduced roadscene2vec, an open-source tool designed to facilitate research into road scene-graph applications. roadscene2vec provides capabilities for generating scene graphs from video clips or CARLA (Car Learning to Act) simulator data, creating spatio-temporal embeddings with various models, and visualizing and analyzing scene graphs. It supports risk assessment, collision prediction, transfer learning, and model explainability evaluation. Similarly, Zipfl et al. [73] introduced a semantic scene graph model where traffic participants are represented as nodes, and their relationships are captured as semantically classified edges. This model provides a structured way to describe traffic scenes beyond just the road geometry, focusing on traffic participants’ interactions and relative positions. Using graph-based representations for dynamic objects in traffic scenes, such as pedestrians and vehicles, highlights an approach to organizing and interpreting complex scene data.
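The sketch below illustrates the kind of semantic scene graph these works describe, with traffic participants as nodes and semantically classified relations as directed labeled edges; it uses the networkx library, and the node and relation labels are illustrative rather than taken from roadscene2vec or Zipfl et al.

```python
# Hedged sketch of a semantic scene graph: participants as nodes,
# semantically classified relations as directed labeled edges.
import networkx as nx

scene = nx.MultiDiGraph()
scene.add_node("ego", kind="vehicle")
scene.add_node("ped_1", kind="pedestrian")
scene.add_node("car_2", kind="vehicle")
scene.add_node("lane_0", kind="lane")

scene.add_edge("ego", "lane_0", relation="isIn")
scene.add_edge("car_2", "lane_0", relation="isIn")
scene.add_edge("ped_1", "ego", relation="inFrontOf")
scene.add_edge("car_2", "ego", relation="behind")

for u, v, data in scene.edges(data=True):
    print(u, data["relation"], v)
```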
In addition, the SemanticFormer [84] approach uses a semantic traffic scene graph to represent and process high-level information about traffic participants, road topology, and traffic signs. The ontologies of the traffic scene are shown in Figure 5. The ontology depicted in the figure represents a comprehensive semantic framework for modeling traffic scenes, particularly designed for applications such as trajectory prediction, behavior analysis, and decision-making in autonomous driving. This structured KG integrates static and dynamic entities in a traffic scene, capturing their relationships and interactions over time. The graph follows a modular approach, where different components represent the environment (road structures, lanes, intersections) and participants (vehicles, pedestrians, obstacles). These components, when interconnected through predefined relationships, form a foundation for understanding and predicting real-world driving scenarios.
The elements in the ontology include the EgoVehicle, representing the autonomous vehicle navigating the scene, and Participants, which encapsulate all dynamic entities, such as Humans, Vehicles, and MovableObjects. The human participants are further subclassed into entities like Children, Adults, and Police Officers, reflecting their distinct behaviors and interactions in a traffic environment. Similarly, vehicles are divided into types like Cars, Bicycles, Trucks, and Motorcycles, capturing the diverse range of interactions and speeds these entities exhibit on the road. In addition to dynamic entities, the ontology models static objects such as RoadSegments, LaneDividers, Intersections, and CarparkAreas. These elements define the structure of the road environment and are essential for path planning and navigation. Relationships like switchVia or hasNextLaneSnippet model how vehicles can move between lanes, offering insight into potential lane-switching behaviors or turns at intersections. The inclusion of StaticObjects, such as BicycleRacks or TrafficCones, also ensures that the autonomous vehicle can account for obstacles that may affect route planning or cause obstructions.
Geometric and positional data, represented through entities like Point and Polygon, offer spatial information about road elements and participants. This information is crucial for defining the location and shape of lanes, intersections, or other key objects in the scene. The integration of Geometry into the ontology allows the autonomous vehicle to map its environment and understand the spatial constraints in real time. The temporal aspect of the ontology is managed through elements like Scene and Sequence. A Scene represents a snapshot of the traffic environment, containing all entities and their relationships at a given moment. Multiple scenes can be grouped into a Sequence, forming a temporal chain that models the evolution of a traffic situation over time. This temporal structure is critical for predicting the future trajectories of participants, as the system can understand not only the current state of the environment but also how it is likely to change. In summary, this ontology provides a structured framework that captures the complexities of real-world traffic environments, integrating spatial, temporal, and relational data. Modeling both the static infrastructure and the dynamic behaviors of participants serves as a foundational tool for enhancing the decision-making capabilities of autonomous vehicles, enabling them to navigate safely and effectively in diverse and unpredictable scenarios.
Additionally, [74] introduces an approach using KGs to model various entities and their semantic connections within traffic scenes. It presents the nuScenes knowledge graph (nSKG), which explicitly models scene participants and road elements, including their semantic and spatial relationships. Similarly, Sun et al. [85] investigate the effects of spatial resolution, the relationship between graphs and trajectory predictions, and methods for embedding knowledge into graphs.
Urbieta et al. [79] introduced and formalized knowledge-based entity prediction (KEP), which improves scene understanding by predicting potentially missing entities using a knowledge-infused learning approach. The proposed solution includes (1) a dataset-agnostic ontology for describing driving scenes, (2) a comprehensive scene representation with knowledge graphs, and (3) a novel mapping of KEP to link prediction (LP) using KGE. Evaluations are performed with real urban driving data.
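A minimal sketch of the KEP-to-link-prediction mapping is shown below: given a partial triple (scene, includes, ?), candidate entities are ranked by a KGE plausibility score. The TransE-style scorer and all names are illustrative stand-ins for the paper’s actual embedding model.

```python
# Illustrative sketch of entity prediction cast as link prediction: rank
# candidate tails for a partial triple by a KGE score. Names and the
# TransE-style scorer are placeholders, not the KEP implementation.
import numpy as np

rng = np.random.default_rng(0)
dim = 32
emb = {name: rng.normal(size=dim)
       for name in ["scene_7", "includes", "pedestrian", "cyclist", "traffic_light"]}

def score(h, r, t):
    # Translational plausibility: h + r should land near t.
    return -np.linalg.norm(emb[h] + emb[r] - emb[t])

candidates = ["pedestrian", "cyclist", "traffic_light"]
ranked = sorted(candidates, key=lambda t: score("scene_7", "includes", t), reverse=True)
print(ranked)  # most plausible missing entity first
```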

4.1.2. Object Tracking

Focused on 3D multi-object tracking, Zaech et al. [75] use graph structures to integrate detection and track states, improving tracking accuracy and stability. This approach aligns with enhancing scene understanding through dynamic object tracking. The learning-based graph approach integrates object detections and tracks, creating a unified representation of the environment that supports effective scene understanding.

4.1.3. Road Sign Detection

Accurate road sign annotation is crucial for AI-based road sign recognition (RSR) systems but is often hindered by annotators’ difficulties with diverse road sign systems. Ref. [76] proposes a novel method combining knowledge graphs with a machine learning algorithm—variational prototyping encoder (VPE)—to enhance road sign classification. Annotators use the road sign knowledge graph to query visual attributes, receiving candidate suggestions from the VPE model.

4.1.4. Scene Graph Augmented Risk Assessment

Despite significant autonomous driving progress, navigating complex road conditions remains challenging. There is notable evidence that assessing the subjective risk level of different decisions can improve AD safety in normal and complex driving scenarios. Traditional deep learning methods often fail to model traffic interactions and lack explainability. Thus, [77] proposes a novel approach using scene graphs with a multi-relation graph convolution network, a long short-term memory network, and attention layers to assess driving maneuver risks. By leveraging KGs, this research demonstrates how KGs can enhance object recognition and scene comprehension, aligning with scene graphs to model relationships between traffic participants. This integration supports improved object detection algorithms and a more comprehensive understanding of driving environments.

4.1.5. Scene Creation

To address the challenges of identifying diverse scenarios, Bagschik et al. [78] proposed the use of ontologies to model expert knowledge and generate a wide array of traffic scenes. Ontologies facilitate the creation of detailed and varied scenarios by translating specialist knowledge into structured formats that can be used for computer-aided processing. This approach enhances the scenario creation process, ensuring that automated vehicles undergo rigorous testing across a comprehensive set of scenarios, contributing to their safety and development. Furthermore, [79] proposed the automotive global ontology (AGO) as a knowledge organization system (KOS) implemented with the Neo4j graph database. The AGO is demonstrated through two use cases—semantic labeling and scenario-based testing.

4.1.6. Lane Graph Estimation for Urban Driving

Lane-level scene annotations are crucial for trajectory planning in autonomous vehicles but are labor intensive and costly. Zurn et al. [80] propose a novel method for estimating lane geometry from bird’s-eye-view images by framing it as a graph estimation problem. Lane anchor points are represented as graph nodes, and lane segments as graph edges. The model, trained on multimodal data from the NuScenes dataset, estimates lane shapes and connections, resulting in a directed lane graph. The LaneGraphNet model demonstrates strong performance on urban scenes in the NuScenes dataset, offering a promising step towards automated HD lane annotation. Traditional methods struggle with lane connectivity and often overlook interaction modeling, while traffic element-to-lane assignments remain limited to the image domain. To address these challenges, Li et al. [81] introduced TopoNet, which uses a scene graph neural network to model relationships in driving scenes. This end-to-end framework abstracts traffic knowledge and understands the connections between traffic elements and lanes beyond traditional perception tasks. Figure 6 shows how TopoNet builds the connections between traffic elements and lanes.
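To make the lane-graph formulation concrete, the sketch below encodes lane anchor points as nodes and lane segments as directed edges using networkx; the coordinates and branching are invented for illustration, whereas LaneGraphNet predicts such a graph from bird’s-eye-view imagery.

```python
# Sketch of a directed lane graph: anchor points as nodes, lane segments
# as directed edges. All coordinates are made up for illustration.
import networkx as nx

lane_graph = nx.DiGraph()
# Anchor points in bird's-eye-view coordinates (x, y), in meters.
lane_graph.add_node("a0", pos=(0.0, 0.0))
lane_graph.add_node("a1", pos=(10.0, 0.2))
lane_graph.add_node("a2", pos=(20.0, 1.0))
lane_graph.add_node("a3", pos=(20.0, -3.0))  # branch: right-turn lane

# Directed edges are lane segments; a branch encodes lane connectivity.
lane_graph.add_edge("a0", "a1")
lane_graph.add_edge("a1", "a2")
lane_graph.add_edge("a1", "a3")

# Successor lookup answers "which lanes are reachable from here?"
print(list(lane_graph.successors("a1")))  # ['a2', 'a3']
```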

4.2. Challenges and Potential Solutions in Existing Work

In this section, we outline the limitations identified in integrating KGs and semantic technologies as presented in Section 4.1, followed by potential solutions and directions for future research.

4.2.1. Maturity of Semantic Technologies

Semantic technologies, which enable machines to understand and reason with human language, have made significant strides. However, their integration into automotive applications still presents challenges, particularly regarding performance and interoperability. In AVs, processing large amounts of data in real time remains an issue, and the task of merging different systems and data sources, such as sensors and maps, is complicated by varying formats and standards. Triple stores, specialized databases for storing and querying data relationships, play a key role in addressing these challenges [71]. The balance between virtual data access, which allows for flexible, on-the-fly queries, and materialization, which speeds up queries by pre-storing data, is crucial for optimizing performance. Advances in triple stores are helping to improve query efficiency, but further research is needed to refine these techniques. Future work needs to focus on optimizing this balance to more effectively handle the vast and complex data involved in AD.
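The sketch below shows the basic access pattern at stake: a SPARQL query over a triple store, run here in-memory with rdflib, whereas a deployed AV stack would target a dedicated triple store and choose between answering such queries virtually over live data or against pre-materialized triples. The av# namespace and terms are illustrative.

```python
# Minimal SPARQL query over an in-memory triple store via rdflib;
# the av# vocabulary is a made-up illustration.
from rdflib import Graph, Namespace, RDF

EX = Namespace("http://example.org/av#")
g = Graph()
g.add((EX.sign_12, RDF.type, EX.StopSign))
g.add((EX.sign_12, EX.locatedOn, EX.laneSegment_3))

results = g.query("""
    PREFIX ex: <http://example.org/av#>
    SELECT ?sign ?lane WHERE {
        ?sign a ex:StopSign ;
              ex:locatedOn ?lane .
    }
""")
for sign, lane in results:
    print(sign, lane)
```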

4.2.2. Knowledge Graph Embeddings (KGE) and Data Preparation

Preparing data for KGE can be complex, particularly when information is dispersed across multiple relationships [71,77]. Special queries and the creation of views can help manage this complexity, but more advanced methods are needed. Future research should explore automated techniques for optimizing KGE in dynamic, multi-relational contexts. Additionally, current datasets often lack the volume and variety necessary for complex semantic inferences. Expanding datasets to include risk and accident scenarios and employing advanced methods like scene captioning and risk classification could address these limitations. Scalability remains a concern as datasets grow larger, requiring more efficient data processing methods.

4.2.3. Handling Long Frame Gaps and Occlusions in Tracking

Tracking systems often struggle with long frame gaps and occlusions, leading to ID switches and false positives. Developing adaptive methods that intelligently manage these trade-offs is crucial for improving tracking continuity. Advanced occlusion models and temporal context-aware algorithms present promising avenues for AV research [75].

4.2.4. Intersection and Lane Detection Challenges

Complex intersections and lane configurations present significant challenges for lane detection models, often leading to inaccuracies. Enhancing these models with additional contextual information (traffic signal and sign information, road markings and signage, surrounding vehicle behavior, GPS and map data, road geometry), improving data augmentation techniques, and expanding training datasets to cover a broader range of scenarios can improve model robustness in these challenging environments.

4.2.5. Expanding Predictive Capabilities and Scalability

While current predictive systems excel in specific scenarios like “pedestrian crossing” and “lane changes”, they fall short in more complex situations, such as near-miss maneuvers. Expanding predictive models to include a wider range of use cases, particularly in diverse cultural contexts, is necessary for more comprehensive behavior prediction in autonomous vehicles. Integration with AV behavior planners is a critical next step [72]. Additionally, the scalability of deploying GNNs on large graphs, as seen in dataset-specific ontologies, is another critical challenge [74]. Techniques to manage the increased computational complexity and memory requirements associated with large graphs are necessary to fully realize the potential of GNNs in AV applications.

4.2.6. Validation of Automated Driving Systems

One of the primary challenges in validating highly automated driving systems is the impractical requirement of millions of test kilometers to cover critical situations [71]. This challenge can be addressed by utilizing simulation data for validation, which is expected to generalize well to real-world scenarios. However, the effectiveness of simulation data needs further exploration, such as validation with real-world data, transfer learning techniques, and human-in-the-loop testing, to ensure reliability in diverse and unpredictable real-world conditions.

5. Discussion of Ethical and Practical Considerations in AV Technologies

In this section, we discuss ethical and practical considerations in AV technologies.

5.1. Challenges of Knowledge Graphs in AVs

As presented in Sections 3 and 4, KGs are known for their ability to represent complex information in a way that helps humans understand relationships between objects and situations. However, their exact role in helping AV systems recognize and make decisions is still debated. While KGs can improve decision-making by organizing information about the environment and objects, it is unclear how essential or effective they are when it comes to the fast, real-time decisions that AV systems need to make.
Recent research has explored how KGs can serve as valuable tools for contextual enrichment in AV systems, particularly in areas such as environmental perception and situation awareness. For instance, KGs have been employed to improve the semantic understanding of traffic environments, support sensor data fusion, and provide a knowledge base for real-time decision-making. However, integrating KGs into AV systems must balance improving decision accuracy with maintaining the processing speed needed for safe operations. For instance, KGs can help an AV interpret “traffic signs” or predict “pedestrian behavior” in contexts like “construction zones” or “school areas”, but this information must be processed quickly enough for the vehicle to respond in real time, such as “stopping for a crossing pedestrian”. KGs may not serve as a standalone solution but rather as a component within a broader decision-making framework, particularly in machine learning pipelines where they enhance prediction accuracy and contextual reasoning.
Future research should focus on evaluating the real-time performance of KGs in AV decision processes and determining whether they offer tangible benefits compared to machine learning approaches. Key questions include how well KGs scale with the complexity of AV environments and how they integrate with high-performance machine learning algorithms that enable fast decision-making.

5.2. Ethical Decision-Making in AVs: Moving Beyond Human Analogy

One of the critical aspects of AV decision-making lies in its ethical dimension. Unlike human drivers, who rely on intuition and often make decisions under time pressure with limited information, AVs are pre-programmed to follow specific rules and logic. However, this raises an important concern: how are AVs programmed to make ethically sound decisions in real-world scenarios, particularly when their perception of the environment differs fundamentally from that of humans?
In AVs, decision-making algorithms need to consider the limitations of machine perception, which depends on sensors like cameras, LiDAR, and radar. Unlike human senses, which directly interpret the environment, these sensors generate data streams that the AV must process. Because machines perceive the world differently from humans, the ethical frameworks guiding AV decisions need to reflect the specific strengths and weaknesses of these sensors. Instead of just copying human decision-making, the ethical guidelines for AVs should be tailored to how these machines actually see and understand their surroundings.
Failures in other safety-critical automated systems, such as the Boeing MCAS (maneuvering characteristics augmentation system) disaster in aviation [86], underscore the risks of poorly designed decision-making systems that fail to account for complex, real-world situations. In AVs, ethical decisions should be tightly coupled with safety regulations, focusing on minimizing harm while adhering to legal standards. The debate around whether AVs should be programmed to make difficult ethical choices (e.g., whom to protect in an unavoidable accident) often leads to the spread of “fake ethics” that are not grounded in the practical realities of AV technology.
To mitigate such concerns, it is crucial to develop decision-making algorithms that prioritize risk avoidance and harm minimization, rather than merely replicating human ethical dilemmas in AV systems. Furthermore, the ethical reasoning framework of AVs should be transparent and accountable, ensuring that decisions can be traced and reviewed in the event of an incident.
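Traceability of this kind can be supported by logging every decision together with its inputs and the rule or model output that produced it. The sketch below shows one minimal, append-only logging scheme; the record fields and the JSON Lines format are illustrative assumptions rather than a mandated audit standard.

```python
# A minimal sketch of traceable decision logging, so every actuation can be
# reviewed after an incident.
import json
import time

def log_decision(log_path: str, inputs: dict, decision: str, rationale: str) -> None:
    """Append one timestamped decision record (JSON Lines, append-only)."""
    record = {
        "timestamp": time.time(),
        "inputs": inputs,          # e.g., detections and their confidences
        "decision": decision,      # e.g., "brake"
        "rationale": rationale,    # the rule or model output that fired
    }
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

log_decision(
    "decisions.jsonl",
    inputs={"pedestrian_confidence": 0.45, "distance_m": 25.0},
    decision="brake",
    rationale="pedestrian detected below 2 s time-to-reach",
)
```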

5.3. Addressing Fake Ethics in AVs

In discussions about ethics and AVs, speculative scenarios often emerge, such as an AV having to choose between “saving a mother and a child” or “avoiding a cat”. These hypothetical situations, while interesting in ethical debates, do not reflect the real operational challenges AVs face. Such scenarios lead to “fake ethics”, which distract from the real goal of AV development—preventing accidents and ensuring safe driving.
For example, human drivers are expected to follow traffic laws and avoid dangerous situations, like speeding through a crowded intersection. AVs should be programmed the same way: to avoid risks entirely rather than to make moral decisions in split seconds. The focus should be on designing AV systems that recognize and react safely to potential hazards before they escalate into accidents. Take a concrete case: instead of asking an AV to choose between hitting a pedestrian and swerving into oncoming traffic, the goal should be to prevent either scenario from arising by programming the vehicle to slow down when approaching a crowded area. This proactive approach avoids the need for moral decision-making altogether. Concerns have also been raised about AVs being programmed to take risks that human drivers would avoid, such as speeding through yellow lights to save time. Rather than forcing AVs into risky decisions, developers should ensure that AVs always prioritize safety, for example by stopping at yellow lights even when they could technically make it through. Ultimately, the ethical focus for AVs should be on preventing accidents and minimizing harm, not on programming them to make difficult moral choices in high-pressure moments.
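The proactive stance described above can be expressed as plain, auditable rules rather than moral trade-offs. The sketch below encodes two such rules; the zone categories, speed-reduction factors, and deceleration constant are illustrative assumptions, not actual traffic regulations.

```python
# A minimal sketch of proactive risk avoidance: rules that keep the vehicle
# out of dilemma situations rather than resolving them.
def target_speed_kmh(zone: str, base_limit_kmh: float) -> float:
    """Reduce speed pre-emptively in zones where pedestrians are likely."""
    reductions = {"school": 0.5, "construction": 0.6, "crowded": 0.6}
    return base_limit_kmh * reductions.get(zone, 1.0)

def on_yellow_light(distance_to_stopline_m: float, speed_mps: float) -> str:
    """Stop on yellow whenever a comfortable stop is physically possible."""
    comfortable_decel = 3.0  # m/s^2, conservative assumption
    stopping_distance = speed_mps ** 2 / (2 * comfortable_decel)
    return "stop" if stopping_distance <= distance_to_stopline_m else "proceed"

print(target_speed_kmh("school", 50))                            # 25.0 km/h
print(on_yellow_light(distance_to_stopline_m=40, speed_mps=12))  # "stop"
```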

5.4. Accountability in AV Decision-Making

A critical concern raised in AV ethics is accountability—specifically, who or what is responsible when an AV makes a decision that results in harm or an accident. Unlike human drivers, who may act unpredictably or make judgment calls, AVs operate under strict programming, meaning that every decision can be traced back to the algorithms that govern their behavior.
There are growing concerns that if AVs are programmed to take risks that human drivers would not be expected to take, the responsibility for accidents could shift toward the manufacturers or software developers. For example, if an AV is designed to prioritize the safety of its passengers over pedestrians in certain scenarios, such programming could be interpreted as premeditated risk-taking, raising legal and ethical questions about liability.
To address these concerns, regulations must ensure that AVs follow responsible driving behaviors comparable to the standards and legal rules that apply to human drivers. Developers of AV systems must ensure that their algorithms do not create unnecessary risks on the road, such as making unsafe lane changes or failing to prioritize emergency vehicles, and AV decision-making processes should be audited regularly. Beyond legal responsibility, AVs also need fail-safe mechanisms that respond appropriately when unexpected situations arise. This could include handing control back to a human driver or executing predefined safety maneuvers when the system cannot confidently make an ethically sound decision.
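A minimal sketch of such a fail-safe selector follows; the confidence threshold and maneuver names are assumptions for illustration, loosely echoing SAE Level 3-style takeover behavior rather than implementing any standard.

```python
# A minimal sketch of a fail-safe fallback: hand control back to a human or
# execute a predefined safety maneuver when system confidence is too low.
from enum import Enum

class Fallback(Enum):
    CONTINUE = "continue_autonomous"
    HANDOVER = "request_human_takeover"
    MINIMAL_RISK = "minimal_risk_maneuver"  # e.g., pull over and stop

def select_fallback(confidence: float, driver_available: bool) -> Fallback:
    if confidence >= 0.8:
        return Fallback.CONTINUE
    if driver_available:
        return Fallback.HANDOVER      # takeover request to a human driver
    return Fallback.MINIMAL_RISK      # no driver: degrade to a safe stop

print(select_fallback(confidence=0.55, driver_available=False).value)
```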

6. Conclusions

This paper has surveyed the integration of knowledge graphs (KGs) into autonomous vehicle (AV) technologies, addressing the research questions outlined in Section 2. In response to the first question—what are the key applications in AV technologies and KGs—we discussed the critical components of AV systems in Section 3, including perception, localization, path planning, and decision-making, and highlighted the significant challenges these systems face in real-world environments. In Section 4, we explored the specific ways in which KGs can enhance these components by providing a structured framework for organizing and interpreting complex environmental data, contributing to improved perception, localization, path planning, and decision-making within AV systems.
For the second question—which aspects of KG integration does the article address—we discussed the application of KGs in AV systems in Section 4, demonstrating how they can enhance perception and decision-making by providing structured and contextualized data from diverse sources. We highlighted that while KGs have the potential to improve AVs’ ability to interpret traffic environments and make more informed decisions, their practical integration still faces challenges, particularly in terms of scalability, robustness, and real-time performance across various driving conditions.
In relation to the third question—what methods does the article discuss for integrating KGs with AV systems—we presented the approaches for KG integration, particularly in enhancing sensor data processing and real-time decision-making. These methods involve employing KGs to provide contextualized data that enhances the performance of existing AV systems. Future work is required to evaluate and refine these methods in diverse and unpredictable driving scenarios.
Lastly, in response to the question—what are the limitations of AVs, and what future research does the article suggest—we discussed the limitations, including the need to improve the scalability and real-time performance of KG-based systems. In Section 5, we also explored the ethical considerations in AV decision-making, emphasizing that AVs should prioritize safety and risk minimization over replicating human ethical reasoning. Transparency and accountability in AV decision-making are crucial, as is ensuring that manufacturers take responsibility for any pre-programmed risks embedded in AV systems.
In conclusion, while the integration of KGs into AV technologies holds great potential for addressing the complexities of perception and decision-making, significant research gaps remain. Enhancing the scalability, robustness, and real-time processing capabilities of KGs is essential to meet the stringent demands of AV systems. Future research should focus on real-world evaluations of KG performance across diverse scenarios, as well as deeper integration with machine learning pipelines. Ethical concerns, particularly those related to safety and accountability, must remain central to the development of AV technologies. By addressing these challenges, KGs could unlock new possibilities for creating more reliable, ethical, and safe autonomous driving systems.

Author Contributions

Conceptualization, S.N.N.H. and K.F.; methodology, S.N.N.H.; validation, S.N.N.H. and K.F.; formal analysis, K.F.; investigation, K.F.; resources, S.N.N.H.; data curation, S.N.N.H.; writing—original draft preparation, S.N.N.H.; writing—review and editing, K.F.; visualization, S.N.N.H.; supervision, K.F.; project administration, K.F.; funding acquisition, K.F. All authors have read and agreed to the published version of the manuscript.

Funding

This work is based on results obtained from the project JPNP20006, commissioned by the New Energy and Industrial Technology Development Organization (NEDO).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Guizilini, V.C.; Li, J.; Ambrus, R.; Gaidon, A. Geometric Unsupervised Domain Adaptation for Semantic Segmentation. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 8517–8527. [Google Scholar]
  2. Hou, R.; Li, J.; Bhargava, A.; Raventós, A.; Guizilini, V.C.; Fang, C.; Lynch, J.P.; Gaidon, A. Real-Time Panoptic Segmentation from Dense Detections. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 14–19 June 2020; pp. 8520–8529. [Google Scholar]
  3. Faisal, A.; Kamruzzaman, M.; Yigitcanlar, T.; Currie, G. Understanding autonomous vehicles: A systematic literature review on capability, impact, planning and policy. J. Transp. Land Use 2019, 12, 45–72. [Google Scholar] [CrossRef]
  4. Waymo Safety Report; Waymo LLC: Mountain View, CA, USA, 2020.
  5. 2023 Sustainability Report; Journey to Zero; General Motors (GM): Detroit, MI, USA, 2023.
  6. Automated Driving Systems; A Vision for Safety; U.S. Department of Transportation, National Highway Traffic Safety Administration (NHTSA): Washington, DC, USA, 2017.
  7. Ministry of Industrial and Information Technology of the People’s Republic of China. Available online: https://wap.miit.gov.cn/jgsj/zbys/qcgy/art/2022/art_6fae62605ce34907939028daf6021c48.html (accessed on 30 September 2024).
  8. Society 5.0 and SIP Autonomous Driving. Available online: https://www.sip-adus.go.jp/exhibition/a2.html (accessed on 30 September 2024).
  9. Automated Driving Safety Evaluation Framework Ver. 3.0; Sectional Committee of AD Safety Evaluation, Automated Driving Subcommittee; Japan Automobile Manufacturers Association, Inc.: Tokyo, Japan, 2022.
  10. Xu, R.; Xia, X.; Li, J.; Li, H.; Zhang, S.; Tu, Z.; Meng, Z.; Xiang, H.; Dong, X.; Song, R.; et al. V2V4Real: A Real-World Large-Scale Dataset for Vehicle-to-Vehicle Cooperative Perception. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 13712–13722. [Google Scholar]
  11. Ogunrinde, I.O.; Bernadin, S. Deep Camera–Radar Fusion with an Attention Framework for Autonomous Vehicle Vision in Foggy Weather Conditions. Sensors 2023, 23, 6255. [Google Scholar] [CrossRef] [PubMed]
  12. Alaba, S.Y.; Gurbuz, A.C.; Ball, J.E. Emerging Trends in Autonomous Vehicle Perception: Multimodal Fusion for 3D Object Detection. World Electr. Veh. J. 2024, 15, 20. [Google Scholar] [CrossRef]
  13. Baczmanski, M.; Synoczek, R.; Wasala, M.; Kryjak, T. Detection-segmentation convolutional neural network for autonomous vehicle perception. In Proceedings of the 2023 27th International Conference on Methods and Models in Automation and Robotics (MMAR), Miedzyzdroje, Poland, 22–25 August 2023; pp. 117–122. [Google Scholar]
  14. Luettin, J.; Monka, S.; Henson, C.; Halilaj, L. A Survey on Knowledge Graph-Based Methods for Automated Driving. In Knowledge Graphs and Semantic Web. KGSWC 2022. Communications in Computer and Information Science; Villazón-Terrazas, B., Ortiz-Rodriguez, F., Tiwari, S., Sicilia, M.A., Martín-Moncunill, D., Eds.; Springer: Cham, Switzerland, 2022; Volume 1686. [Google Scholar] [CrossRef]
  15. Rezwana, S.; Lownes, N. Interactions and Behaviors of Pedestrians with Autonomous Vehicles: A Synthesis. Future Transp. 2024, 4, 722–745. [Google Scholar] [CrossRef]
  16. Sayed, S.A.; Abdel-Hamid, Y.; Hefny, H.A. Artificial intelligence-based traffic flow prediction: A comprehensive review. J. Electr. Syst. Inf. Technol. 2023, 10, 1–42. [Google Scholar] [CrossRef]
  17. Liu, Q.; Li, X.; Tang, Y.; Gao, X.; Yang, F.; Li, Z. Graph Reinforcement Learning-Based Decision-Making Technology for Connected and Autonomous Vehicles: Framework, Review, and Future Trends. Sensors 2023, 23, 8229. [Google Scholar] [CrossRef]
  18. Yurtsever, E.; Lambert, J.; Carballo, A.; Takeda, K. A Survey of Autonomous Driving: Common Practices and Emerging Technologies. IEEE Access 2019, 8, 58443–58469. [Google Scholar] [CrossRef]
  19. Badue, C.S.; Guidolini, R.; Carneiro, R.V.; Azevedo, P.; Cardoso, V.B.; Forechi, A.; Jesus, L.F.; Berriel, R.; Paixão, T.M.; Mutz, F.W.; et al. Self-Driving Cars: A Survey. Expert Syst. Appl. 2021, 165, 113816. [Google Scholar] [CrossRef]
  20. Zhao, J.; Zhao, W.; Deng, B.; Wang, Z.; Zhang, F.; Zheng, W.; Cao, W.; Nan, J.; Lian, Y.; Burke, A.F. Autonomous driving system: A comprehensive survey. Expert Syst. Appl. 2023, 242, 122836. [Google Scholar] [CrossRef]
  21. Guizilini, V.C.; Vasiljevic, I.; Chen, D.; Ambrus, R.; Gaidon, A. Towards Zero-Shot Scale-Aware Monocular Depth Estimation. In Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; pp. 9199–9209. [Google Scholar]
  22. Irshad, M.; Zakharov, S.; Liu, K.; Guizilini, V.C.; Kollar, T.; Gaidon, A.; Kira, Z.; Ambrus, R. NeO 360: Neural Fields for Sparse View Synthesis of Outdoor Scenes. In Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; pp. 9153–9164. [Google Scholar]
  23. Codevilla, F.; Santana, E.; López, A.M.; Gaidon, A. Exploring the Limitations of Behavior Cloning for Autonomous Driving. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9328–9337. [Google Scholar]
  24. DeCastro, J.A.; Liebenwein, L.; Vasile, C.I.; Tedrake, R.; Karaman, S.; Rus, D. Counterexample-Guided Safety Contracts for Autonomous Driving. In Proceedings of the Workshop on the Algorithmic Foundations of Robotics, Mérida, Mexico, 9–11 December 2018. [Google Scholar]
  25. Alcantarilla, P.F.; Stent, S.; Ros, G.; Arroyo, R.; Gherardi, R. Street-view change detection with deconvolutional networks. Auton. Robot. 2016, 42, 1301–1322. [Google Scholar] [CrossRef]
  26. Guizilini, V.C.; Li, J.; Ambrus, R.; Pillai, S.; Gaidon, A. Robust Semi-Supervised Monocular Depth Estimation with Reprojected Distances. In Proceedings of the Conference on Robot Learning, Virtual, 16–18 November 2020; Volume 100, pp. 503–512. [Google Scholar]
  27. Ambrus, R.; Guizilini, V.C.; Li, J.; Pillai, S.; Gaidon, A. Two Stream Networks for Self-Supervised Ego-Motion Estimation. In Proceedings of the Conference on Robot Learning, Osaka, Japan, 30 October–1 November 2019. [Google Scholar]
  28. Kanai, T.; Vasiljevic, I.; Guizilini, V.C.; Gaidon, A.; Ambrus, R. Robust Self-Supervised Extrinsic Self-Calibration. In Proceedings of the 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Detroit, MI, USA, 1–5 October 2023; pp. 1932–1939. [Google Scholar]
  29. Lee, K.; Kliemann, M.; Gaidon, A.; Li, J.; Fang, C.; Pillai, S.; Burgard, W. PillarFlow: End-to-end Birds-eye-view Flow Estimation for Autonomous Driving. In Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA, 24 October 2020–24 January 2021; pp. 2007–2013. [Google Scholar]
  30. Guizilini, V.C.; Ambrus, R.; Pillai, S.; Gaidon, A. 3D Packing for Self-Supervised Monocular Depth Estimation. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 14–19 June 2020; pp. 2482–2491. [Google Scholar]
  31. Manhardt, F.; Kehl, W.; Gaidon, A. ROI-10D: Monocular Lifting of 2D Detection to 6D Pose and Metric Shape. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 2064–2073. [Google Scholar]
  32. Chiu, H.; Li, J.; Ambrus, R.; Bohg, J. Probabilistic 3D Multi-Modal, Multi-Object Tracking for Autonomous Driving. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 14227–14233. [Google Scholar]
  33. Guizilini, V.C.; Senanayake, R.; Ramos, F.T. Dynamic Hilbert Maps: Real-Time Occupancy Predictions in Changing Environments. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 4091–4097. [Google Scholar]
  34. Amini, A.; Rosman, G.; Karaman, S.; Rus, D. Variational End-to-End Navigation and Localization. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 8958–8964. [Google Scholar]
  35. Koide, K.; Yokozuka, M.; Oishi, S.; Banno, A. Globally Consistent and Tightly Coupled 3D LiDAR Inertial Mapping. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; pp. 5622–5628. [Google Scholar]
  36. Koide, K.; Oishi, S.; Yokozuka, M.; Banno, A. Tightly Coupled Range Inertial Localization on a 3D Prior Map Based on Sliding Window Factor Graph Optimization. In Proceedings of the 2024 IEEE International Conference on Robotics and Automation (ICRA), Yokohama, Japan, 13–17 May 2024; pp. 1745–1751. [Google Scholar]
  37. McAllister, R.T.; Wulfe, B.; Mercat, J.; Ellis, L.; Levine, S.; Gaidon, A. Control-Aware Prediction Objectives for Autonomous Driving. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; pp. 1–8. [Google Scholar]
  38. Lee, K.; Ros, G.; Li, J.; Gaidon, A. SPIGAN: Privileged Adversarial Learning from Simulation. In Proceedings of the International Conference on Learning Representations (ICLR), Vancouver, BC, Canada, 30 April–3 May 2018. [Google Scholar]
  39. Guizilini, V.C.; Hou, R.; Li, J.; Ambrus, R.; Gaidon, A. Semantically-Guided Representation Learning for Self-Supervised Monocular Depth. In Proceedings of the International Conference on Learning Representations (ICLR), Addis Ababa, Ethiopia, 26–30 April 2020. Available online: https://openreview.net/pdf?id=ByxT7TNFvH (accessed on 1 August 2024).
  40. DeCastro, J.A.; Leung, K.; Aréchiga, N.; Pavone, M. Interpretable Policies from Formally-Specified Temporal Properties. In Proceedings of the 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), Rhodes, Greece, 20–23 September 2020; pp. 1–7. [Google Scholar]
  41. Honda, K.; Koide, K.; Yokozuka, M.; Oishi, S.; Banno, A. Generalized LOAM: LiDAR Odometry Estimation With Trainable Local Geometric Features. IEEE Robot. Autom. Lett. 2022, 7, 12459–12466. [Google Scholar] [CrossRef]
  42. Koide, K.; Yokozuka, M.; Oishi, S.; Banno, A. Globally Consistent 3D LiDAR Mapping With GPU-Accelerated GICP Matching Cost Factors. IEEE Robot. Autom. Lett. 2021, 6, 8591–8598. [Google Scholar] [CrossRef]
  43. Liu, B.; Adeli, E.; Cao, Z.; Lee, K.; Shenoi, A.; Gaidon, A.; Niebles, J. Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction. IEEE Robot. Autom. Lett. 2020, 5, 3485–3492. [Google Scholar] [CrossRef]
  44. Ghaffari Jadidi, M.; Clark, W.; Bloch, A.M.; Eustice, R.M.; Grizzle, J.W. Continuous Direct Sparse Visual Odometry from RGB-D Images. In Proceedings of the Robotics: Science and Systems (RSS), Freiburg im Breisgau, Germany, 22–26 June 2019; Available online: https://www.roboticsproceedings.org/rss15/p44.pdf (accessed on 1 August 2024).
  45. Cao, Z.; Biyik, E.; Wang, W.Z.; Raventós, A.; Gaidon, A.; Rosman, G.; Sadigh, D. Reinforcement Learning based Control of Imitative Policies for Near-Accident Driving. In Proceedings of the Robotics: Science and Systems (RSS), Virtually, 12–16 July 2020. [Google Scholar]
  46. Gideon, J.; Stent, S.; Fletcher, L. A Multi-Camera Deep Neural Network for Detecting Elevated Alertness in Drivers. In Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada, 15–20 April 2018; pp. 2931–2935. [Google Scholar]
  47. Mangalam, K.; Adeli, E.; Lee, K.; Gaidon, A.; Niebles, J. Disentangling Human Dynamics for Pedestrian Locomotion Forecasting with Noisy Supervision. In Proceedings of the 2020 IEEE Winter Conference on Applications of Computer Vision (WACV), Snowmass Village, CO, USA, 1–5 March 2020; pp. 2773–2782. [Google Scholar]
  48. Aldibaja, M.; Suganuma, N.; Yoneda, K.; Yanase, R. Challenging Environments for Precise Mapping Using GNSS/INS-RTK Systems: Reasons and Analysis. Remote Sens. 2022, 14, 4058. [Google Scholar] [CrossRef]
  49. Aldibaja, M.; Suganuma, N. Graph SLAM-Based 2.5D LIDAR Mapping Module for Autonomous Vehicles. Remote Sens. 2021, 13, 5066. [Google Scholar] [CrossRef]
  50. Aldibaja, M.; Suganuma, N.; Yanase, R. 2.5D Layered Sub-Image LIDAR Maps for Autonomous Driving in Multilevel Environments. Remote Sens. 2022, 14, 5847. [Google Scholar] [CrossRef]
  51. Yoneda, K.; Kuramoto, A.; Suganuma, N.; Asaka, T.; Aldibaja, M.; Yanase, R. Robust Traffic Light and Arrow Detection Using Digital Map with Spatial Prior Information for Automated Driving. Sensors 2020, 20, 1181. [Google Scholar] [CrossRef]
  52. Yanase, R.; Hirano, D.; Aldibaja, M.; Yoneda, K.; Suganuma, N. LiDAR- and Radar-Based Robust Vehicle Localization with Confidence Estimation of Matching Results. Sensors 2022, 22, 3545. [Google Scholar] [CrossRef]
  53. Aldibaja, M.; Yanase, R.; Suganuma, N. Waypoint Transfer Module between Autonomous Driving Maps Based on LiDAR Directional Sub-Images. Sensors 2024, 24, 875. [Google Scholar] [CrossRef]
  54. Yan, Z.; Yang, B.; Wang, Z.; Nakano, K. A Predictive Model of a Driver’s Target Trajectory Based on Estimated Driving Behaviors. Sensors 2023, 23, 1405. [Google Scholar] [CrossRef]
  55. Nacpil, E.J.; Nakano, K. Surface Electromyography-Controlled Automobile Steering Assistance. Sensors 2020, 20, 809. [Google Scholar] [CrossRef] [PubMed]
  56. Yoneda, K.; Suganuma, N.; Yanase, R.; Aldibaja, M. Automated driving recognition technologies for adverse weather conditions. Iatss Res. 2019, 43, 253–262. [Google Scholar] [CrossRef]
  57. Aldibaja, M.; Suganuma, N.; Yoneda, K.; Yanase, R.; Kuramoto, A. Supervised Calibration Method for Improving Contrast and Intensity of LIDAR Laser Beams. In Proceedings of the International Conference on Multisensor Fusion and Integration for Intelligent Systems, Daegu, Republic of Korea, 16–18 November 2017. [Google Scholar]
  58. Yoneda, K.; Kuramoto, A.; Suganuma, N. Convolutional neural network based vehicle turn signal recognition. In Proceedings of the International Conference on Intelligent Informatics and Biomedical Sciences (ICIIBMS), Okinawa, Japan, 24–26 November 2017; pp. 204–205. [Google Scholar]
  59. Cheng, S.; Wang, Z.; Yang, B.; Li, L.; Nakano, K. Quantitative Evaluation Methodology for Chassis-Domain Dynamics Performance of Automated Vehicles. IEEE Trans. Cybern. 2022, 53, 5938–5948. [Google Scholar] [CrossRef] [PubMed]
  60. Hou, L.; Li, S.E.; Yang, B.; Wang, Z.; Nakano, K. Structural Transformer Improves Speed-Accuracy Trade-Off in Interactive Trajectory Prediction of Multiple Surrounding Vehicles. IEEE Trans. Intell. Transp. Syst. 2022, 23, 24778–24790. [Google Scholar] [CrossRef]
  61. Cheng, S.; Yang, B.; Wang, Z.; Nakano, K. Spatio-Temporal Image Representation and Deep-Learning-Based Decision Framework for Automated Vehicles. IEEE Trans. Intell. Transp. Syst. 2022, 23, 24866–24875. [Google Scholar] [CrossRef]
  62. Huang, C.; Yang, B.; Nakano, K. Impact of duration of monitoring before takeover request on takeover time with insights into eye tracking data. Accid. Anal. Prev. 2023, 185, 107018. [Google Scholar] [CrossRef]
  63. Shimizu, T.; Koide, K.; Oishi, S.; Yokozuka, M.; Banno, A.; Shino, M. Sensor-independent Pedestrian Detection for Personal Mobility Vehicles in Walking Space Using Dataset Generated by Simulation. In Proceedings of the 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy, 10–15 January 2021; pp. 1788–1795. [Google Scholar]
  64. Wang, Z.; Guan, M.; Lan, J.; Yang, B.; Kaizuka, T.; Taki, J.; Nakano, K. Classification of Automated Lane-Change Styles by Modeling and Analyzing Truck Driver Behavior: A Driving Simulator Study. IEEE Open J. Intell. Transp. Syst. 2022, 3, 772–785. [Google Scholar] [CrossRef]
  65. Hogan, A.; Blomqvist, E.; Cochez, M.; d’Amato, C.; de Melo, G.; Gutierrez, C.; Gayo, J.E.; Kirrane, S.; Neumaier, S.; Polleres, A.; et al. Knowledge Graphs. ACM Comput. Surv. (CSUR) 2020, 54, 1–37. [Google Scholar] [CrossRef]
  66. Cai, H.; Zheng, V.W.; Chang, K.C. A Comprehensive Survey of Graph Embedding: Problems, Techniques, and Applications. IEEE Trans. Knowl. Data Eng. 2017, 30, 1616–1637. [Google Scholar] [CrossRef]
  67. Ji, S.; Pan, S.; Cambria, E.; Marttinen, P.; Yu, P.S. A Survey on Knowledge Graphs: Representation, Acquisition, and Applications. IEEE Trans. Neural Netw. Learn. Syst. 2020, 33, 494–514. [Google Scholar] [CrossRef]
  68. Lilis, Y.; Zidianakis, E.; Partarakis, N.; Antona, M.; Stephanidis, C. Personalizing HMI Elements in ADAS Using Ontology Meta-Models and Rule Based Reasoning. In Universal Access in Human–Computer Interaction. Design and Development Approaches and Methods, Proceedings of the 11th International Conference, UAHCI 2017, Held as Part of HCI International 2017, Vancouver, BC, Canada, 9–14 July 2017; Springer: Cham, Switzerland, 2017. [Google Scholar]
  69. Velickovic, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio’, P.; Bengio, Y. Graph Attention Networks. In Proceedings of the 6th International Conference on Learning Representations (ICLR), Conference Track Proceedings, Vancouver, BC, Canada, 30 April–3 May 2018. [Google Scholar]
  70. Monka, S.; Halilaj, L.; Rettinger, A. A Survey on Visual Transfer Learning using Knowledge Graphs. Semant. Web 2022, 13, 477–510. [Google Scholar] [CrossRef]
  71. Halilaj, L.; Dindorkar, I.; Lüttin, J.; Rothermel, S. A Knowledge Graph-Based Approach for Situation Comprehension in Driving Scenarios. In Proceedings of the Extended Semantic Web Conference, Heraklion, Greece, 6–10 June 2021. [Google Scholar]
  72. Malawade, A.V.; Yu, S.Y.; Hsu, B.; Kaeley, H.; Karra, A.; Faruque, M.A. roadscene2vec: A Tool for Extracting and Embedding Road Scene-Graphs. Knowl. Based Syst. 2021, 242, 108245. [Google Scholar] [CrossRef]
  73. Zipfl, M.; Zöllner, J.M. Towards Traffic Scene Description: The Semantic Scene Graph. In Proceedings of the 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC), Macau, China, 8–12 October 2022; pp. 3748–3755. [Google Scholar]
  74. Mlodzian, L.; Sun, Z.; Berkemeyer, H.; Monka, S.; Wang, Z.; Dietze, S.; Halilaj, L.; Luettin, J. nuScenes Knowledge Graph—A comprehensive semantic representation of traffic scenes for trajectory prediction. In Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), Paris, France, 2–6 October 2023; pp. 42–52. [Google Scholar]
  75. Zaech, J.; Dai, D.; Liniger, A.; Danelljan, M.; Gool, L.V. Learnable Online Graph Representations for 3D Multi-Object Tracking. IEEE Robot. Autom. Lett. 2022, 7, 5103–5110. [Google Scholar] [CrossRef]
  76. Kim, J.E.; Henson, C.; Huang, K.; Tran, T.A.; Lin, W.Y. Accelerating Road Sign Ground Truth Construction with Knowledge Graph and Machine Learning. In Intelligent Computing. Lecture Notes in Networks and Systems; Arai, K., Ed.; Springer: Cham, Switzerland, 2021; Volume 284. [Google Scholar]
  77. Yu, S.Y.; Malawade, A.V.; Muthirayan, D.; Khargonekar, P.P.; Faruque, M.A. Scene-Graph Augmented Data-Driven Risk Assessment of Autonomous Vehicle Decisions. IEEE Trans. Intell. Transp. Syst. 2020, 23, 7941–7951. [Google Scholar] [CrossRef]
  78. Bagschik, G.; Menzel, T.; Maurer, M. Ontology based Scene Creation for the Development of Automated Vehicles. In Proceedings of the 2018 IEEE Intelligent Vehicles Symposium (IV), Suzhou, China, 26–30 June 2018; pp. 1813–1820. [Google Scholar]
  79. Urbieta, I.R.; Nieto, M.; García, M.; Otaegui, O. Design and Implementation of an Ontology for Semantic Labeling and Testing: Automotive Global Ontology (AGO). Appl. Sci. 2021, 11, 7782. [Google Scholar] [CrossRef]
  80. Zürn, J.; Vertens, J.; Burgard, W. Lane Graph Estimation for Scene Understanding in Urban Driving. IEEE Robot. Autom. Lett. 2021, 6, 8615–8622. [Google Scholar] [CrossRef]
  81. Li, T.; Chen, L.; Geng, X.; Wang, H.; Li, Y.; Liu, Z.; Jiang, S.; Wang, Y.; Xu, H.; Xu, C.; et al. Graph-based Topology Reasoning for Driving Scenes. arXiv 2023, arXiv:2304.05277. [Google Scholar]
  82. Wickramarachchi, R.; Henson, C.A.; Sheth, A. An Evaluation of Knowledge Graph Embeddings for Autonomous Driving Data: Experience and Practice. In Proceedings of the AAAI Spring Symposium Combining Machine Learning with Knowledge Engineering, Palo Alto, CA, USA, 23–25 March 2020. [Google Scholar]
  83. Wang, R.; Song, X.; Hu, Z.; Cui, Y. Spatio-Temporal Interaction Aware and Trajectory Distribution Aware Graph Convolution Network for Pedestrian Multimodal Trajectory Prediction. IEEE Trans. Instrum. Meas. 2023, 72, 1–11. [Google Scholar] [CrossRef]
  84. Sun, Z.; Wang, Z.; Halilaj, L.; Luettin, J. SemanticFormer: Holistic and Semantic Traffic Scene Representation for Trajectory Prediction Using Knowledge Graphs. IEEE Robot. Autom. Lett. 2024, 9, 7381–7388. [Google Scholar] [CrossRef]
  85. Sun, R.; Lingrand, D.; Precioso, F. Exploring the Road Graph in Trajectory Forecasting for Autonomous Driving. In Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), Paris, France, 2–6 October 2023; pp. 71–80. [Google Scholar]
  86. Herkert, J.; Borenstein, J.; Miller, K. The Boeing 737 MAX: Lessons for Engineering Ethics. Sci. Eng. Ethics 2020, 26, 2957–2974. [Google Scholar] [CrossRef]
Figure 1. Autonomous driving under limited conditions (A) and application in a more diverse environment (B) by Strategic Innovation Promotion Program (SIP): source available from https://www.sip-adus.go.jp/exhibition/a2.html (accessed on 10 September 2024).
Figure 2. Distribution of papers published each year from 2017 to 2024, based on the final set of articles selected for analysis.
Figure 3. CoSI knowledge graph [71] proposed by Halilaj et al. An excerpt of the CoSI KG representing respective situations occurring in two consecutive scenes: (1) the bottom layer depicts scenery information among participants; (2) the top layer includes concepts such as classes and relationships representing the domain knowledge; and (3) the middle layer contains concrete instances capturing the scenery information based on the ontological concepts [71].
Figure 4. Framework proposed by Wang et al. [83], including the spatiotemporal interaction-aware module, trajectory distribution-aware module, and trajectory decoder.
Figure 5. Traffic scene ontologies proposed by Sun et al. [84]. Agent ontology defines agent attributes like category, speed, position, and trajectory, and relationships to map like distance to lane, and path distance. Map ontology defines map elements like lane snippet, lane slice, traffic light, etc., and relations within map elements like left/right lane and switch via double dashed line [84].
Figure 6. Graph-based topology reasoning proposed by Li et al. [81]. TopoNet is introduced to directly grasp the topology comprehension of the heterogeneous graph. “Topology LL” and “Topology LT” denote the relationships between lane centerlines and the connections between lane centerlines and traffic elements, respectively (left). TopoNet builds all connections between traffic elements and lanes (Top Right: GT, Bottom Right Prediction) [81].
Table 1. Comparison between our work and related survey studies.
Survey Coverage | [18] | [19] | [20] | [14] | Ours
Perception
Localization
Mapping
Moving Object Detection and Tracking
Traffic Signalization Detection
Path Planning
Behavior Selection
Motion Planning
Obstacle Avoidance
Control
Sensors and Hardware
Road and Lane Detection
Assessment
Decision-Making
Human–Machine Interaction
Datasets and Tools
Semantic Segmentation
Trajectory Prediction
Simulator and Scenario Generation
KGs Applied to AVs
  • Object Detection
  • Semantic Segmentation
  • Mapping
  • Scene Understanding
  • Object Behavior Prediction
  • Motion Planning
  • Validation
  • Scene Representation
  • Object Tracking
  • Road Sign Detection
  • Scene-Graph Augmented Risk Assessment
  • Scene Creation
Current Challenges and Limitations
Future Directions
Development in Industry
Ethical and Practical Considerations in AV Technologies
√ indicates that the survey study includes coverage of the respective topic.
Table 2. Overview of research questions and focus areas.
Research Question | Focus
What are the key applications in AV technologies and KGs? | Application
Which aspects of KG integration does the article address? | Contribution
What methods does the article discuss for integrating KGs with AV systems? | Methodology
What are the limitations of AVs, and what future research does the article suggest? | Limitations and Future Work
Table 3. Selected conferences and journals contributing to AV technologies.
Conferences/Journals | Publisher | References/Published Year
IEEE/CVF International Conference on Computer Vision (ICCV) | IEEE | [1] 2021, [21] 2023, [22] 2023, [23] 2019
Workshop on the Algorithmic Foundations of Robotics | Springer | [24] 2018
Autonomous Robots | Springer | [25] 2016
Conference on Robot Learning (CoRL) | PMLR | [26] 2020, [27] 2019
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) | IEEE | [28] 2023, [29] 2020
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) | IEEE | [2] 2020, [10] 2023, [30] 2019, [31] 2018
International Conference on Robotics and Automation (ICRA) | IEEE | [32] 2020, [33] 2019, [34] 2018, [35] 2022, [36] 2024, [37] 2022
International Conference on Learning Representations (ICLR) | ICLR | [38] 2018, [39] 2020
IEEE International Conference on Intelligent Transportation Systems (ITSC) | IEEE | [40] 2020
IEEE Robotics and Automation Letters | IEEE | [41] 2022, [42] 2021, [43] 2020
Robotics: Science and Systems (RSS) | MIT Press | [44] 2019, [45] 2020
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) | IEEE | [46] 2018
IEEE Winter Conference on Applications of Computer Vision (WACV) | IEEE | [47] 2019
Remote Sensing | MDPI | [48] 2022, [49] 2021, [50] 2022
Sensors (Basel, Switzerland) | MDPI | [11] 2023, [17] 2023, [51] 2020, [52] 2022, [53] 2024, [54] 2023, [55] 2020
International Association of Traffic and Safety Sciences (IATSS) | Elsevier | [56] 2019
International Conference on Multisensor Fusion and Integration for Intelligent Systems | IEEE | [57] 2017
International Conference on Intelligent Informatics and Biomedical Sciences (ICIIBMS) | IEEE | [58] 2017
IEEE Transactions on Cybernetics | IEEE | [59] 2022
IEEE Transactions on Intelligent Transportation Systems | IEEE | [60] 2022, [61] 2022
Accident Analysis and Prevention | Elsevier | [62] 2023
International Conference on Pattern Recognition (ICPR) | IEEE | [63] 2020
PMLR: Proceedings of Machine Learning Research, IEEE: Institute of Electrical and Electronics Engineers, MIT: Massachusetts Institute of Technology, MDPI: Multidisciplinary Digital Publishing Institute.
Table 4. Contributions of scholars in AV technologies.
Scholar | Research Focus | No. of Published Papers | Conferences/Journals
V.C. Guizilini | Semantic segmentation [1,2,39]; Monocular Depth Estimation [21,26,30,39]; Sparse View Synthesis [22]; Calibration [28]; Ego-Motion Estimation [27]; Occupancy Prediction [33] | 10 | ICCV, IROS, ICLR, CVPR, CoRL, ICRA
A. Gaidon | Semantic segmentation [1,2,39]; Monocular Depth Estimation [21,26,30,39]; Sparse View Synthesis [22]; Calibration [28]; Object Detection [31]; Flow Estimation [29]; Ego-Motion Estimation [27]; Occupancy Prediction [33]; Sparse Visual Odometry [44]; Pedestrian Locomotion Forecasting [47]; Near-Accident Driving [45]; Behavior Cloning [23]; Pedestrian Intent Prediction [43] | 18 | ICCV, IROS, ICLR, CVPR, CoRL, ICRA, IEEE Robotics and Automation Letters, RSS, WACV
R. Ambrus | Semantic segmentation [1,39]; Monocular Depth Estimation [21,26,30,39]; Sparse View Synthesis [22]; Calibration [28]; Multi-Object Tracking [32]; Ego-Motion Estimation [27] | 9 | ICCV, CoRL, IROS, ICRA, ICLR, CVPR
Table 5. Popularity and reasons for key components of AV technologies.
Key Component | Popularity | Reasons
Sensors and perception systems | Very high | High research volume on sensor accuracy and data processing.
Localization and mapping | High | Significant focus on SLAM and GPS-based localization.
Path planning | High | Advancements in planning algorithms for efficient AV navigation.
Control systems | Moderate to high | Ongoing research in control mechanisms integral to AV operation.
Decision-making | High | Significant focus on machine learning and AI-based decision-making.
Human–machine interface (HMI) | Moderate | Increasing considerations for user experience.
Communication systems | Moderate | Growing interest in 5G and V2X technologies.
Safety and redundancy | Moderate to high | Substantial interest in AV reliability and public acceptance.
Ethical and legal considerations | Moderate | Rising importance due to regulatory and societal impact.
Societal impact and infrastructure | Moderate | Long-term AV integration amid growing policy discussions.
Table 6. Overview of research focus in AV technologies.
Key Component | Research Focus
Perception | Segmentation [1,2,38]; Street-View Change Detection [25]; Monocular Depth Estimation [21,26,30,39]; Sparse View Synthesis [22]; Calibration [28,57]; Multi-Object Tracking [32]; Object Detection [31]; Flow Estimation [29]; Ego-Motion Estimation [27]; Occupancy Prediction [33]; Visual Odometry and Image Registration [44]; Driver Alertness Detection [46]; Pedestrian Locomotion Prediction [47,63]; Traffic Light and Arrow Detection [51]; Interpreting Environmental Conditions [56]; Recognition and Matching Road Surface Features [52]; Turn Signal Recognition [58]; Gaze Tracking [62]; Spatio-Temporal Image Representation [61]
Localization and Mapping | Updating and Maintaining Maps [25]; Depth-Aware Map [26]; Multi-Camera Maps [28]; Ego-Motion Estimation [27]; Creating and Updating Occupancy Maps [33]; Visual Odometry [41,44]; Probabilistic Localization [34]; Mapping with GNSS/INS-RTK [48]; Transferring Lane Graphs and Different Map Representation [53]; Generating 2.5D Maps Using LIDAR and Graph SLAM [49]; 2.5D Maps for Multilevel Environments and Vehicle Localization [50]; Map Generation and Localization [57]; 3D LiDAR Mapping and Location [35]; 3D Mapping [36,42]
Path Planning | Flow Estimation [29]; Point-to-Point Navigation [34]; Control-Aware Prediction [37]; Planning Near-Accident Driving Scenarios [45]; Safety Trajectory Generation [58]; Driver’s Target Trajectory [54]; Interactive Trajectory Prediction [60]; Lane-Change Styles Classification [64]
Control | Generating Control Commands [34]; Control-Aware Prediction [37]; Controlling Vehicle’s Actions [45]; Automated Lane-Change Control [64]; Safety Verification [24]; Interpretable Policies [40]; Behavior Cloning [23]; Chassis Performance [59]; Controlling Vehicle Steering [55]
Decision-Making | Weather Conditions in Decision-Making [56]; Predicted Probability Distribution [34]; Control-Aware Prediction [37]; Switching Between Different Driving Modes [45]; Decision-Making on Predicted Driver’s Behavior [58]; Driver’s Target Trajectory [54]; Decision-Making on Lane-Changes [64]; Interpretable Policies [40]; Behavior Cloning [23]; Chassis Performance [59]; Decision-Based Dynamic Traffic Conditions [61]; Pedestrian Intent Prediction [43]; Monitoring Duration on Takeover Time [62]
Human–Machine Interaction (HMI) | Sparse View Synthesis and Scene Visualization [22]; Real-Time Feedback Based on Driver State [46]; Driving Simulation and User Interaction [64]; Controlling Vehicle Steering [55]; Monitoring Duration and Eye Tracking [62]
Table 7. A summary of common types of sensors used in AVs.
Sensor Type | Placement in Automated Car | Example Use Cases
Cameras | Front, sides, rear, roof | Lane-keeping, pedestrian detection, object recognition
LiDAR 1 | Roof, bumpers, sides | 3D object detection, terrain mapping, localization
Radar 2 | Front and rear bumpers | Adaptive cruise control, collision detection
Ultrasonic Sensors | Front and rear bumpers, sides | Parking assistance, close-range obstacle detection
GPS | Roof, dashboard | Route planning, navigation, localization
IMU | Integrated in-vehicle systems | Stabilization, motion tracking, localization
Odometry Sensors | Wheels or chassis | Localization, motion planning, distance tracking
V2X 3 Sensors | Roof, exterior antennas | Traffic management, safety alerts, cooperative driving
Infrared (IR) Sensors | Front bumper, roof | Night vision, obstacle detection in low visibility
Magnetic Sensors | Bottom of the vehicle | Lane-keeping in autonomous shuttles
Barometric Pressure Sensors | Inside vehicle sensor suite | Altitude measurement, terrain planning
Laser Rangefinders | Front and rear of the vehicle | Object detection, parking assistance
Proximity Sensors | Front and rear bumpers | Parking, collision avoidance
Environmental Sensors | Exterior, often on the roof | Adjusting driving in response to weather
1 LiDAR: light detection and ranging; 2 Radar: radio detection and ranging; 3 V2X: vehicle-to-everything.
Table 8. Key contributions of knowledge graphs in AV applications.
Research Focus | Approach | Key Contributions
Scene Representation | CoSI [71] | Integrates heterogeneous sources into a unified KG structure for situation classification, difficulty assessment, and trajectory prediction using GNN architecture.
Scene Representation | roadscene2vec [72] | Generates scene graphs for risk assessment, collision prediction, and model explainability.
Scene Representation | Semantic Scene Graph [73] | Captures traffic participants’ interactions and relative positions.
Scene Representation | nSKG [74] | Represents scene participants and road elements, including semantic and spatial relationships.
Object Tracking | 3D multi-object tracking [75] | Graph structures integrate detection and track states to improve 3D multi-object tracking accuracy and stability.
Road Sign Detection | KGs with VPE [76] | Combines KGs with variational prototyping encoder (VPE) for improved road sign classification and accurate annotation.
Scene Graph-Augmented Risk Assessment | Scene graph sequence [77] | Scene graphs with multi-relation GCN, LSTM, and attention layers assess driving maneuver risks, improving object recognition and scene comprehension.
Scene Creation | Ontologies [78] | Ontologies model expert knowledge for generating diverse traffic scenes and enhancing scenario creation for AV testing.
Scene Creation | AGO [79] | Automotive global ontology (AGO) as a knowledge organization system (KOS) for semantic labeling and scenario-based testing.
Lane Graph Estimation | LaneGraphNet [80] | Estimates lane geometry from BEV images by framing it as a graph estimation problem.
Lane Graph Estimation | TopoNet [81] | Uses a scene graph neural network to model relationships in driving scenes, understanding traffic element connections.
