Design of an Extended DCAT-Based Metadata Schema and Data Catalog for Autonomous Vehicle Accident Investigation
Abstract
1. Introduction
2. Related Works
2.1. Accident Investigation Process for Traffic Accidents Involving AVs
2.2. Usable Data for Accident Investigation
2.3. Metadata Schema and Data Catalog
2.4. Contributions
3. Methodology
3.1. Mapping of Entities-Items-Data on Investigation Process
- Vehicle Investigation: The vehicle investigation team conducts physical defect investigations and system defect investigations. Physical defect investigations examine the vehicle’s basic information (operating mode, key functions, vehicle status, etc.), its hardware (H/W), and whether the chassis system is functioning properly. System defect investigations check for errors in the Human–Machine Interface (HMI), software, and functional modules, and also analyze software versions and system logs.
- Digital Forensics: The digital forensics team focuses on investigating communication failures and security vulnerabilities. It inspects the V2X communication status of AVs and the safety of the communication infrastructure, assesses the potential for cybersecurity breaches, and verifies whether the accident was caused by external factors.
- Virtual Environment Investigation: The virtual environment investigation team reviews environmental factors such as high-precision maps, road design, and traffic operation status to determine whether the AV accurately perceived the actual road environment through its sensors.
3.2. Definition of Classes and Properties Under Essential DCAT and AP
3.3. Metadata Schema for AV Accident Investigation
3.4. The Visualization UI Based on the Designed Metadata Schema
4. Case Study: Application of the New Methodology for Autonomous Vehicle Accident Investigation
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Chougule, A.; Chamola, V.; Sam, A.; Yu, F.R.; Sikdar, B. A comprehensive review on limitations of autonomous driving and its impact on accidents and collisions. IEEE Open J. Veh. Technol. 2023, 5, 142–161.
- Kim, H.; Han, H.; You, Y.; Cho, M.J.; Hong, J.; Song, T.J. A Comprehensive Traffic Accident Investigation System for Identifying Causes of the Accident Involving Events with Autonomous Vehicle. J. Adv. Transp. 2024, 2024, 9966310.
- Terrizzano, I.G.; Schwarz, P.M.; Roth, M.; Colino, J.E. Data Wrangling: The Challenging Journey from the Wild to the Lake. In Proceedings of the CIDR, Asilomar, Pacific Grove, CA, USA, 10 October 2015.
- Beamer, A. Map metadata: Essential elements for search and storage. Program 2009, 43, 18–35.
- Stillerman, J.; Fredian, T.; Greenwald, M.; Manduchi, G. Data catalog project—A browsable, searchable, metadata system. Fusion Eng. Des. 2016, 112, 995–998.
- Yeong, D.; Velasco-Hernandez, G.; Barry, J.; Walsh, J. Sensor and Sensor Fusion Technology in Autonomous Vehicles: A Review. Sensors 2021, 21, 2140.
- Korea Law Information Center. Enforcement Decree of the Road Traffic Act. 2017. Available online: https://www.law.go.kr/ (accessed on 20 July 2025).
- National Transportation Safety Board. The Investigative Process. 2025. Available online: https://www.ntsb.gov/ (accessed on 20 July 2025).
- Employment and Social Development Canada. Investigations of Motor Vehicle Accidents on Public Roads—IPG-066. Effective Date: January 2009. 2009. Available online: https://www.canada.ca/en/employment-social-development/programs/laws-regulations/labour/interpretations-policies/066.html (accessed on 20 July 2025).
- Essex Police. H 0602 Procedure—Road Traffic Collisions (Investigations). Version 18—October 2024. 2024. Available online: https://www.essex.police.uk/ (accessed on 20 July 2025).
- Bundesministerium der Justiz und für Verbraucherschutz. Straßenverkehrsgesetz (StVG). 2025. Available online: https://www.gesetze-im-internet.de/stvg/ (accessed on 20 July 2025).
- Government of Sweden. Accident Investigation Act (1990:712). 1990. Available online: https://www.shk.se/ (accessed on 20 July 2025).
- Korea Road Traffic Authority (Koroad). Engineering Analysis of Traffic Accidents on Request from Judicial Institutions. 2025. Available online: https://www.koroad.or.kr/eng/content/view/ME02080000.do (accessed on 20 July 2025).
- College of Policing. Investigation of Fatal and Serious Injury Road Collisions. 2023. Available online: https://www.college.police.uk/ (accessed on 20 July 2025).
- California Department of Motor Vehicles. Autonomous Vehicle Collision Reports. 2025. Available online: https://www.dmv.ca.gov/ (accessed on 20 July 2025).
- Hoque, M.A.; Hasan, R. AVGuard: A Forensic Investigation Framework for Autonomous Vehicles. In Proceedings of the ICC 2021—IEEE International Conference on Communications, Montreal, QC, Canada, 14–23 June 2021; pp. 1–6.
- Giovannini, E.; Giorgetti, A.; Pelletti, G.; Giusti, A.; Garagnani, M.; Pascali, J.; Pelotti, S.; Fais, P. Importance of dashboard camera (Dash Cam) analysis in fatal vehicle-pedestrian crash reconstruction. Forensic Sci. Med. Pathol. 2021, 17, 379–387.
- Niehoff, P.; Gabler, H.C.; Brophy, J.; Chidester, C.; Hinch, J.; Ragland, C. Evaluation of event data recorders in full systems crash tests. In Proceedings of the 19th International Conference on the Enhanced Safety of Vehicles, Washington, DC, USA, 6–9 June 2005.
- Oh, G.; Ko, W.; Park, J.; Yun, I.; So, J.J. Study on the improvement of traffic accident report for automated vehicle test scenarios. J. Korea Inst. Intell. Transp. Syst. 2022, 21, 167–182.
- Kwayu, K.M.; Kwigizile, V.; Lee, K.; Oh, J.S. Discovering latent themes in traffic fatal crash narratives using text mining analytics and network topology. Accid. Anal. Prev. 2021, 150, 105899.
- Pisu, P.; Soliman, A.; Rizzoni, G. Vehicle chassis monitoring system. Control Eng. Pract. 2003, 11, 345–354.
- Smith, T.; Toth, C.; Timcho, T. Sharing and Using Connected Device Data to Improve Traveler Safety and Traffic Management—Concept of Operations, Use Cases, Traveler Information Needs, Messages, and Requirements; Report FHWA-HRT-23-030; WSP USA Inc.: New York, NY, USA; Cambridge Systematics, Inc.: Medford, MA, USA; Federal Highway Administration, Office of Operations Research, Development, and Technology: McLean, VA, USA, 2023.
- National Transportation Safety Board. Collision Between Car Operating with Partial Driving Automation and Truck-Tractor Semitrailer, Delray Beach, Florida, March 1, 2019; Highway Accident Brief NTSB/HAB-20/01; National Transportation Safety Board: Washington, DC, USA, 2020.
- National Transportation Safety Board. Collision Between Vehicle Controlled by Developmental Automated Driving System and Pedestrian, Tempe, Arizona, March 18, 2018; Highway Accident Report NTSB/HAR-19/03; National Transportation Safety Board: Washington, DC, USA, 2019.
- Feifel, H.; Erdem, B.; Menzel, D.; Gee, R. Reducing Fatalities in Road Crashes in Japan, Germany, and USA with V2X-enhanced-ADAS. In Proceedings of the 27th Enhanced Safety of Vehicles (ESV) Conference, Yokohama, Japan, 3–6 April 2023; pp. 3–6.
- Khattak, M.; De Backer, H.; De Winne, P.; Brijs, T.; Pirdavani, A. Analysis of Road Infrastructure and Traffic Factors Influencing Crash Frequency: Insights from Generalised Poisson Models. Infrastructures 2024, 9, 47.
- da Silva, M.P. Analysis of Event Data Recorder Data for Vehicle Safety Improvement; John, A., Ed.; Technical Report HS-810 935; DOT-VNTSC-NHTSA-08-01; Volpe National Transportation Systems Center (U.S.): Cambridge, MA, USA, 2008.
- Kim, I.; Lee, G.; Lee, S.; Choi, W. Data Storage System Requirement for Autonomous Vehicle. In Proceedings of the 2022 22nd International Conference on Control, Automation and Systems (ICCAS), Busan, Republic of Korea, 27 November–1 December 2022; pp. 45–49.
- Hyun, S.; Son, J.; Oh, Y.; You, B. A study of the DSSAD data elements derivation through autonomous driving data analysis on expressways. J. Korea Inst. Intell. Transp. Syst. 2024, 23, 97–106.
- Jung, C.; Lee, D.; Lee, S.; Shim, D. V2X-Communication-Aided Autonomous Driving: System Design and Experimental Validation. Sensors 2020, 20, 2903.
- Girdhar, M.; You, Y.; Song, T.J.; Ghosh, S.; Hong, J. Post-accident cyberattack event analysis for connected and automated vehicles. IEEE Access 2022, 10, 83176–83194.
- Pai, V.N.; Barosan, I.; Khabbaz Saberi, A. Map and Its Impact on the Functional Safety of Automated Driving Vehicles. J. Softw. Eng. Auton. Syst. 2023, 1, 17–27.
- Moura, D.; Zhu, S.; Zvitia, O. Nexar Dashcam Collision Prediction Dataset and Challenge. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 11–15 June 2025.
- Che, Z.; Li, G.; Li, T.; Jiang, B.; Shi, X.; Zhang, X.; Lu, Y.; Wu, G.; Liu, Y.; Ye, J. D2-City: A Large-Scale Dashcam Video Dataset of Diverse Traffic Scenarios. arXiv 2019, arXiv:1904.01975.
- Chen, R.J.; Tatem, W.M.; Gabler, H.C. Event Data Recorders (EDRs) Duration Study: Final Report; Final Report NHTSA Supplemental Report; Submitted to National Highway Traffic Safety Administration; Virginia Tech, Department of Biomedical Engineering and Mechanics: Blacksburg, VA, USA, 2017.
- Gabler, H.; Gabauer, D.; Newell, H.; Glassboro, N. Use of Event Data Recorder (EDR) Technology for Highway Crash Data Analysis. NCHRP Project 2004, 17–24.
- Chapman, S. Automated Vehicle Safety Assurance—In-Use Safety and Security Monitoring: Task 2—Minimum Dataset Specification; Published Project Report PPR2017 TETI0042; Prepared for Department for Transport; Version 1.0; Copyright © TRL Limited; TRL Limited: Wokingham, UK, 2022.
- UNECE World Forum for Harmonization of Vehicle Regulations (WP.29). DSSAD Guidance Document; Informal Document WP.29-196-09; Submitted to the 196th Session of the World Forum for Harmonization of Vehicle Regulations (WP.29); UNECE: Geneva, Switzerland, 19 June 2025; Available online: https://unece.org/transport/documents/2025/06/informal-documents/grva-dssad-guidance-document (accessed on 20 July 2025).
- SAE International. V2X Communications Message Set Dictionary; Technical Report SAE J2735_202409; Revised September 2024; SAE International: Warrendale, PA, USA, 2024.
- ETSI. Intelligent Transport Systems (ITS); Vehicular Communications; Basic Set of Applications; Part 2: Specification of Cooperative Awareness Basic Service. Draft ETSI TS 2011, 20, 448–451.
- No. EN 302 637-3; Intelligent Transport Systems (ITS); Vehicular Communications; Basic Set of Applications; Part 3: Specifications of Decentralized Environmental Notification Basic Service. ETSI: Sophia Antipolis, France, 2019.
- Geiger, A.; Lenz, P.; Urtasun, R. Are we ready for autonomous driving? The KITTI vision benchmark suite. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 3354–3361.
- KITTI Vision Benchmark Suite. Available online: https://www.cvlibs.net/datasets/kitti/raw_data.php (accessed on 20 July 2025).
- Sun, P.; Kretzschmar, H.; Dotiwalla, X.; Chouard, A.; Patnaik, V.; Tsui, P.; Guo, J.; Zhou, Y.; Chai, Y.; Caine, B.; et al. Scalability in perception for autonomous driving: Waymo open dataset. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 2446–2454.
- Caesar, H.; Bankiti, V.; Lang, A.H.; Vora, S.; Liong, V.E.; Xu, Q.; Krishnan, A.; Pan, Y.; Baldan, G.; Beijbom, O. nuScenes: A multimodal dataset for autonomous driving. arXiv 2019, arXiv:1903.11027.
- nuScenes Dataset. Available online: https://www.nuscenes.org/nuscenes (accessed on 20 July 2025).
- Tampa CV Pilot Signal Phasing and Timing (SPaT) Sample. Available online: https://data.transportation.gov/Automobiles/Tampa-CV-Pilot-Signal-Phasing-and-Timing-SPaT-Samp/xn7c-yu2n/about_data (accessed on 20 July 2025).
- Wilson, B.; Qi, W.; Agarwal, T.; Lambert, J.; Singh, J.; Khandelwal, S.; Pan, B.; Kumar, R.; Hartnett, A.; Pontes, J.K.; et al. Argoverse 2: Next generation datasets for self-driving perception and forecasting. arXiv 2023, arXiv:2301.00493.
- Yee, M.; Surkis, A.; Lamb, I.; Contaxis, N. The NYU Data Catalog: A modular, flexible infrastructure for data discovery. J. Am. Med. Inform. Assoc. 2023, 30, 1693–1700.
- Dibowski, H.; Schmid, S.; Svetashova, Y.; Henson, C.; Tran, T. Using Semantic Technologies to Manage a Data Lake: Data Catalog, Provenance and Access Control. In Proceedings of the SSWS@ISWC, Athens, Greece, 2–6 November 2020; pp. 65–80.
- Cherradi, M.; Bouhafer, F.; Haddadi, A.E. Data lake governance using IBM-Watson knowledge catalog. Sci. Afr. 2023, 21, e01854.
- Anil Hirwade, M. A study of metadata standards. Libr. Hi Tech News 2011, 28, 18–25.
- Shin, D.K.; Lee, S.H.; Kang, J.; Park, E.M. Data catalogue standards based on DCAT for transportation data: DCAT-Trans. J. Korean Soc. Transp. 2019, 37, 430–444.
- Albertoni, R.; Browning, D.; Cox, S.; Beltran, A.; Perego, A.; Winstanley, P. Data Catalog Vocabulary (DCAT)-Version 3, 2024. W3C Recommendation. Available online: https://www.w3.org/TR/vocab-dcat-3/ (accessed on 1 October 2025).
- European Commission. DCAT Application Profile for Data Portals in Europe (DCAT-AP)—Version 3.0.0, 2024. Interoperable Europe Portal. Available online: https://interoperable-europe.ec.europa.eu/collection/semic-support-centre/solution/dcat-application-profile-data-portals-europe/release/300 (accessed on 1 October 2025).
- DCAT-AP.de. DCAT-AP.de Specification—Version 3.0, 2024. DCAT-AP.de Portal. Available online: https://www.dcat-ap.de/def/dcatde/3.0/spec/ (accessed on 1 October 2025).
- GeoDCAT-AP. GeoDCAT-AP 3.0.0. 2025. Available online: https://semiceu.github.io/GeoDCAT-AP/releases/3.0.0/ (accessed on 20 July 2025).
- Canham, S.; Ohmann, C. A metadata schema for data objects in clinical research. Trials 2016, 17, 557.
- Labropoulou, P.; Gkirtzou, K.; Gavriilidou, M.; Deligiannis, M.; Galanis, D.; Piperidis, S.; Rehm, G.; Berger, M.; Mapelli, V.; Rigault, M.; et al. Making metadata fit for next generation language technology platforms: The metadata schema of the European Language Grid. In Proceedings of the Twelfth Language Resources and Evaluation Conference, Marseille, France, 11–16 May 2020; pp. 3428–3437.
- Welten, S.; Neumann, L.; Yediel, Y.U.; da Silva Santos, L.O.B.; Decker, S.; Beyan, O. DAMS: A distributed analytics metadata schema. Data Intell. 2021, 3, 528–547.
- Mukherjee, S.; Das, R. Integration of domain-specific metadata schema for cultural heritage resources to DSpace: A prototype design. J. Libr. Metadata 2020, 20, 155–178.
- Abaza, H.; Shutsko, A.; Klopfenstein, S.A.; Vorisek, C.N.; Schmidt, C.O.; Brünings-Kuppe, C.; Clemens, V.; Darms, J.; Hanß, S.; Intemann, T.; et al. Toward a Domain-Overarching Metadata Schema for Making Health Research Studies FAIR (Findable, Accessible, Interoperable, and Reusable): Development of the NFDI4Health Metadata Schema. JMIR Med. Inform. 2025, 13, e63906.
- Kim, E.; Kim, J.; Woo, W. Metadata schema for context-aware augmented reality applications in cultural heritage domain. In 2015 Digital Heritage; IEEE: Piscataway, NJ, USA, 2015; Volume 2, pp. 283–290.
- Bermudez-Edo, M.; Elsaleh, T.; Barnaghi, P.; Taylor, K. IoT-Lite: A lightweight semantic model for the internet of things and its use with dynamic semantics. Pers. Ubiquitous Comput. 2017, 21, 475–487.
- Specka, X.; Gärtner, P.; Hoffmann, C.; Svoboda, N.; Stecker, M.; Einspanier, U.; Senkler, K.; Zoarder, M.M.; Heinrich, U. The BonaRes metadata schema for geospatial soil-agricultural research data–Merging INSPIRE and DataCite metadata schemes. Comput. Geosci. 2019, 132, 33–41.
- Manouselis, N.; Costopoulou, C. Quality in metadata: A schema for e-commerce. Online Inf. Rev. 2006, 30, 217–237.
- Cano, M.A.; Tsueng, G.; Zhou, X.; Xin, J.; Hughes, L.D.; Mullen, J.L.; Su, A.I.; Wu, C. Schema Playground: A tool for authoring, extending, and using metadata schemas to improve FAIRness of biomedical data. BMC Bioinform. 2023, 24, 159.
- Koç, H.; Erdoğan, A.M.; Barjakly, Y.; Peker, S. UML Diagrams in Software Engineering Research: A Systematic Literature Review. Proceedings 2021, 74, 13.






| Type of Accident | Layers | Data Source | Type of Data Generation | Temporal Resolution | Data Type | Data Contents | Refs. |
|---|---|---|---|---|---|---|---|
| AV & MV | Dashcam footage | Dashcam | Continuously recorded; when an event occurs | 25–30 fps | Unstructured (MP4) | Visual context of the driving environment | [33,34] |
| AV & MV | Accident overview | Accident report | When an accident occurs | N/A | Unstructured (PDF) | Vehicle information; driver information (e.g., DUI); accident severity; accident circumstances; accident overview (e.g., weather, road condition) | [15] |
| AV & MV | Recording devices within vehicles | EDR | When an event occurs (airbag deployment, and so forth) | 2 fps | Unstructured (PDF) | Pre-crash information (e.g., speed, brake engagement status); in-crash information (e.g., airbag deployment time) | [35,36] |
| AV | Recording devices within AV | DSSAD | Continuously recorded; when an event occurs (crash detection, system failure, and so forth) | 2 fps | Structured (timestamped event log) | System status codes (e.g., ADS activation status); control command logs (e.g., steering angle); V2X messages (e.g., vehicle received messages); sensor fusion information (e.g., object detection); vehicle location (e.g., latitude, longitude) | [37,38] |
| AV | V2V communication | BSM (Basic Safety Message) | Continuously recorded | 10 fps | Structured (ASN.1/UPER) | Vehicle location; braking information (e.g., brake system status); vehicle dimensions (e.g., length, width); protected communication zone information | [39] |
| AV | V2V communication | CAM (Cooperative Awareness Message) | Continuously recorded | 1–10 fps | Structured (ASN.1/UPER) | Vehicle location; vehicle dimensions | [40] |
| AV | V2I communication | TIM (Traveler Information Message) | When an event occurs (road condition changes, pre-defined zones, and so forth) | Event-based | Structured (ASN.1/UPER) | Recommended information (e.g., road construction); road sign types; emergency alert | [39] |
| AV | V2I communication | DENM (Decentralized Environmental Notification Message) | When an event occurs (accident, emergency vehicles approaching, and so forth) | Event-based | Structured (ASN.1/UPER) | Event overview (e.g., type, location); emergency vehicle information (e.g., speed, direction); geographic area warning information | [41] |
| AV | Vehicle sensor | Camera | Continuously recorded | 10–12 fps | Unstructured (PNG, TFRecord) | Video of vehicle perception; 3D bounding box for vehicle object perception | [42,43,44,45,46] |
| AV | Vehicle sensor | LiDAR | Continuously recorded | 10–20 fps | Unstructured (bin, TFRecord) | Video of vehicle perception; 3D bounding box for vehicle object perception | [42,43,44,45,46] |
| AV | Vehicle sensor | Radar | Continuously recorded | 13 fps | Unstructured (bin) | Video of vehicle perception; 3D bounding box for vehicle object perception | [45,46] |
| AV | Road infrastructure | Traffic Signal | When an event occurs (signal state changes) | 1 fps | Structured (CSV) | Signal information (e.g., signal state, remaining time) | [47] |
| AV | Road infrastructure | HD Map | Periodically updated | N/A | Unstructured (GeoTIFF) | Road geometry (e.g., lane boundaries); ODD | [48] |
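
The generation types in the table above matter in practice because they determine which sources can be expected to hold data for the seconds before a crash. As a minimal sketch, a few rows can be encoded as Python records and filtered by how they are generated. The `DataSource` class, its field names, and the two helper functions are illustrative assumptions and are not part of the proposed schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    """One row of the data-source table (hypothetical, simplified fields)."""
    name: str
    layer: str
    continuous: bool       # "continuously recorded" in the table
    event_triggered: bool  # "when an event occurs" in the table
    structured: bool       # Structured vs. Unstructured data type


# A subset of the table rows, transcribed by hand.
SOURCES = [
    DataSource("Dashcam", "Dashcam footage", True, True, False),
    DataSource("Accident report", "Accident overview", False, True, False),
    DataSource("EDR", "Recording devices within vehicles", False, True, False),
    DataSource("DSSAD", "Recording devices within AV", True, True, True),
    DataSource("BSM", "V2V communication", True, False, True),
    DataSource("CAM", "V2V communication", True, False, True),
    DataSource("TIM", "V2I communication", False, True, True),
    DataSource("DENM", "V2I communication", False, True, True),
]


def event_only(sources):
    """Names of sources that exist only when an event triggers them."""
    return [s.name for s in sources if s.event_triggered and not s.continuous]


def structured_continuous(sources):
    """Names of structured sources that are recorded continuously."""
    return [s.name for s in sources if s.structured and s.continuous]


print(event_only(SOURCES))
print(structured_continuous(SOURCES))
```

Keeping the two generation modes as separate flags, rather than a single enumerated value, reflects that the Dashcam and DSSAD rows list both modes.
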
| Class | Description | DCAT [54] | DCAT-AP [55] | DCAT-AP.de [56] | GeoDCAT-AP [57] | DCAT-Trans [53] |
|---|---|---|---|---|---|---|
| Catalog | To provide a list of datasets and data services included in the catalog including title, description, and list of included resources | O | O | O | O | O |
| DatasetSeries | To represent a collection of datasets with temporal or periodic continuity, including names, descriptions, and information about the included datasets | O | O | O | O | |
| Resource | To express the characteristics of resources commonly used across multiple classes, including descriptive information such as unique identifiers and names | O | O | O | O | |
| Dataset | To describe the characteristics of a specific dataset, it includes information such as the title, description, subject, temporal scope, and spatial scope | O | O | O | O | O |
| DataService | To explain how to access and utilize the data, it includes information on the service’s name, description, access address, and provided functions | O | O | O | O | |
| Distribution | To explain the actual availability of the dataset, it includes information on data format, access path, download, location, and media type | O | O | O | O | O |
| CatalogRecord | To provide the management history of resources registered in the catalog, including information such as catalog’s creation time and modification time | O | O | O | O | |
| Agent | To describe the organization or individual associated with the dataset, including information on the administrator’s name, contact details, and type | O | O | O | ||
| Location | To express the spatial scope related to the dataset, it includes location coordinates, administrative districts, and geographic area information | O | O | O | ||
| LicenseDocument | To explain the terms of use for the dataset, including the required license type, rights description, and related documentation information | O | O | |||
| Checksum | To verify data integrity, it includes information such as verification algorithms and hash values | O | O | |||
| Relationship | To explain the relationships between data, include the names of related data, hierarchical relationships, and association information | O | O | O | ||
| Kind | To describe the category to which the dataset belongs, it includes information about the characteristics of the type, classification, and category | O | O | O | ||
| Attribution | To describe the entities contributing to the dataset, including information about the relevant organizations or individuals, their roles, and their level of contribution | O | ||||
| Taxonomy | To systematically classify the topics within the dataset, it includes information on the topic classification system and topic items | O |
| Types of Property | Description | Sub-Properties | Refs. |
|---|---|---|---|
| Data modification and creation | Information recording the history of data creation, updates, and modifications, including the creator, publication date, and modification history, which enables tracking of the data lifecycle | wasGeneratedBy, issued | [53,54,55,56,57] |
| Index and classification | Information for indexing, including the dataset’s classification system, subject terms, and keywords, which helps users find data suited to their purposes | type, keyword | [53,54,55,56,57] |
| Description | Basic information necessary for understanding the overall characteristics and content of the data, including its title, description, format, and other essential details that support data comprehension | description, title | [53,54,55,56,57] |
| Resolution | Information on the temporal and spatial resolution of the data, which supports evaluating the data for a given analysis or use | spatialResolutionInMeters | [53,54,55,56,57] |
| Metadata | Information on the metadata’s own compliance with standards, reference schemas, and other details that support the structural reliability and interoperability of the metadata | conformsTo, isReferencedBy | [53,54,55,56,57] |
| Distribution | Technical distribution information, including the distribution method, format, download path, file size, and other details that support data accessibility, such as the data provision method | downloadURL, byteSize | [53,54,55,56,57] |
| Spatiotemporal | Information on the temporal and spatial scope covered by the data, which supports spatiotemporal analysis or use limited to a region and period | temporal, spatial | [53,54,55,56,57] |
| Identification | The data’s unique identifier and version information, which support tracking of the data’s identity and history | version, identifier | [53,54,55,56,57] |
| Linkage and Relationship | Information on references and linkages between a given dataset and other related data, which supports related-data indexing and integrated data utilization | relation, isReferencedBy | [53,54,55,56,57] |
| Access and Rights | Information on permissions, licenses, access paths, and other details for data utilization, which supports legal restrictions on data use and the management of authorized access | accessRights, accessURL | [53,54,55,56,57] |
| Provider and Manager | Information on the entity that created and provides the data, the managing agency, and the responsible personnel, which supports the reliability of the data’s source and responsible entity | publisher, contactPoint | [53,54,55,56,57] |
| Assistance | Additional attributes not covered by the above items, such as other reference information and data integrity verification, which are necessary for schema composition but do not describe the data itself | servesDataset, checksumValue | [53,54,55,56,57] |
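
To illustrate how several of the property types above combine in a single catalog entry, the following sketch builds a JSON-LD style record for one dataset and one distribution, using only the Python standard library. The `dcat:` and `dct:` terms are existing DCAT and Dublin Core vocabulary. The helper `make_dataset`, the `urn:example:` identifier scheme, the download URL, and the byte size are placeholders assumed for illustration and do not come from the paper.

```python
import json

# Namespace prefixes for DCAT and Dublin Core terms.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def make_dataset(identifier, title, description, issued, keywords,
                 download_url, byte_size):
    """Return a JSON-LD style dict for one dcat:Dataset with one distribution."""
    return {
        "@context": CONTEXT,
        "@id": f"urn:example:dataset:{identifier}",  # placeholder URN scheme
        "@type": "dcat:Dataset",
        "dct:identifier": identifier,        # Identification
        "dct:title": title,                  # Description
        "dct:description": description,      # Description
        "dct:issued": issued,                # Data modification and creation
        "dcat:keyword": list(keywords),      # Index and classification
        "dcat:distribution": [{              # Distribution
            "@type": "dcat:Distribution",
            "dcat:downloadURL": {"@id": download_url},
            "dcat:byteSize": byte_size,
        }],
    }


record = make_dataset(
    identifier="lidar-dataset-2025-06",
    title="AV LiDAR perception virtual environment data",
    description="LiDAR point cloud data collected during AV operation",
    issued="2025-06-01",
    keywords=["Perception", "Virtual environment"],
    download_url="https://example.org/data/lidar-2025-06.las",  # placeholder
    byte_size=1048576,                                           # placeholder
)

print(json.dumps(record, indent=2))
```

A dedicated RDF library would be the more robust choice for validation and serialization; plain dictionaries are used here only to keep the sketch free of dependencies.
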
| Authors (Year) | Research Domain | Proposed Schema | Diagram | Case Study | Validation | Note | Ref. |
|---|---|---|---|---|---|---|---|
| S. Canham and C. Ohmann (2016) | Clinical research | O | X | X | X | - | [58] |
| Shin et al. (2019) | Transportation | O | X | X | X | - | [53] |
| X. Specka et al. (2019) | Soil-agricultural | O | O | X | X | - | [65] |
| N. Manouselis and C. Costopoulou (2006) | E-commerce | O | O | X | X | - | [66] |
| M.A. Cano et al. (2023) | Biomedical | O | O | X | X | - | [67] |
| P. Labropoulou et al. (2020) | Language technology | O | O | O | X | Pilot implementation utilizing the schema | [59] |
| M. Bermudez-Edo et al. (2017) | IoT | O | O | O | O | Schema validation based on testbed data | [64] |
| S. Welten et al. (2021) | Medical | O | O | O | X | Visualization implementation utilizing the schema | [60] |
| S. Mukherjee and R. Das (2020) | Cultural heritage | O | O | O | X | Schema validation based on real-world data | [61] |
| H. Abaza et al. (2025) | Health research | O | O | O | X | Visualization implementation utilizing the schema | [62] |
| Kim et al. (2015) | Cultural heritage | O | O | O | X | Visualization implementation utilizing the schema | [63] |
| Items | Sub-Items | Data Resource | Current Data Availability |
|---|---|---|---|
| Essential information | ODD area | N/A | N/A |
| Party | behavior | accident report | O |
| Party | trajectory | DSSAD | O |
| Party | forward attention status | accident report | O |
| Object | cellphone usage status | accident report | O |
| Object | fixed | accident report | O |
| Object | movable | accident report | O |
| Traffic | traffic flow progression | N/A | N/A |
| Environment | road facility location | HD Map | O |
| Environment | sun glare | Camera, LiDAR, Radar | O |
| Environment | traffic signal information | Traffic Signal | O |
| Vehicle information | vehicle level | accident report | O |
| Vehicle information | autonomous mode | DSSAD | O |
| Vehicle information | conventional mode | accident report | O |
| H/W function fault | sense function | DSSAD | O |
| H/W function fault | perception & localize function | DSSAD | O |
| H/W function fault | scene function | DSSAD, Camera, LiDAR, Radar | O |
| H/W function fault | plan & decide function | DSSAD | O |
| H/W function fault | EV system | DSSAD | O |
| Chassis system | chassis type | N/A | N/A |
| Chassis system | chassis status | N/A | N/A |
| HMI | HMI type | N/A | N/A |
| HMI | HMI location | N/A | N/A |
| S/W function fault | sense function | DSSAD | O |
| S/W function fault | perception & localize function | DSSAD | O |
| S/W function fault | scene function | DSSAD, Camera, LiDAR, Radar | O |
| S/W function fault | plan & decide function | DSSAD | O |
| S/W function fault | EV system | DSSAD | O |
| Other function fault | ADS operational status | DSSAD | O |
| Other function fault | DDT fallback moment | N/A | O |
| Other function fault | risk minimization driving interval | DSSAD | O |
| Violation | type of violation | accident report | O |
| System version | software version | N/A | N/A |
| System version | firmware version | N/A | N/A |
| System version | hardware version | N/A | N/A |
| Communication | in-vehicle | N/A | N/A |
| Communication | external | BSM, CAM, TIM, DENM | O |
| Communication infrastructure | infrastructure type | N/A | N/A |
| Communication infrastructure | infrastructure status | N/A | N/A |
| Communication infrastructure | infrastructure location | N/A | N/A |
| Security | physical | N/A | N/A |
| Security | cyber | N/A | N/A |
| Virtual environment | road facility | HD Map | O |
| Virtual environment | visibility condition | Camera, LiDAR, Radar | O |
| Virtual environment | road configuration | HD Map | O |
| Virtual environment | road operation condition | TIM | O |
| Virtual environment | road type | HD Map | O |
| Virtual environment | | Camera | O |
| Virtual environment | road condition | LiDAR | O |
| Virtual environment | | Radar | O |
| Virtual environment | security & communication alert area | DENM | O |
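
The item-to-resource mapping above can be queried in two directions: which investigation sub-items currently have no data resource, and which sub-items a given resource supports. The sketch below shows both lookups on a small hand-transcribed subset of rows. The dictionary layout and the function names `uncovered` and `supported_by` are assumptions made for illustration, with an empty list standing in for the table's "N/A" entries.

```python
# (item, sub-item) -> list of data resources; an empty list stands for "N/A".
MAPPING = {
    ("Party", "trajectory"): ["DSSAD"],
    ("Environment", "sun glare"): ["Camera", "LiDAR", "Radar"],
    ("Environment", "traffic signal information"): ["Traffic Signal"],
    ("Communication", "in-vehicle"): [],
    ("Communication", "external"): ["BSM", "CAM", "TIM", "DENM"],
    ("System version", "software version"): [],
    ("Virtual environment", "road type"): ["HD Map"],
}


def uncovered(mapping):
    """Sub-items for which no data resource is currently listed."""
    return sorted(key for key, resources in mapping.items() if not resources)


def supported_by(mapping, resource):
    """Sub-items that list the given data resource."""
    return sorted(key for key, resources in mapping.items()
                  if resource in resources)


print(uncovered(MAPPING))
print(supported_by(MAPPING, "DSSAD"))
print(supported_by(MAPPING, "LiDAR"))
```
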
| Class | 12 Types of Properties | Sub-Properties | Description | Example | Remark |
|---|---|---|---|---|---|
| Dataset | Data modification and Creation | issued * | Date the dataset was first created | 1 June 2025 | Re-use existing DCAT and APs definitions or extend meanings |
| modified * | Date the dataset was updated or modified | 5 June 2025 | |||
| wasGeneratedBy * | How the dataset was created | LiDAR sesnsor raw data extraction | |||
| accrualPeriodicity * | Update cycle of the dataset | Daily | |||
| Identification | identifier | A unique identifier for a dataset | lidar-dataset-2025-06 | ||
| version | A version for managing modification and update history of the same dataset | v.1.2 | |||
| Index and Classification | keyword | Keywords of the dataset | Perception, Virtual environment | ||
| Description | title * | Title of the dataset | AV LiDAR perception virtual environment data | ||
| description * | Description of the dataset | LiDAR point cloud data collected during AV operation | |||
| distribution * | Distribution method of the dataset | Provided in compressed file format | |||
| Resolution | temporalResolution | Temproal resolution of the dataset | 12 fps | ||
| spatialResolution | Spatial resolution of the dataset, point (individual object), line (linear object), area (area obejct) units | Driving trajectory at 0.5 m resolution | |||
| Spatiotemporal | spatial | Spatial scope of the dataset | Major arterial roads within Seoul Metropolitan City | ||
| temporal | Temporal scope of the dataset | 1 June 2025 5 June 2025 | |||
| Provider and Manager | creator | Name of creating agency or administrator | OO University AV research Lab. | ||
| contactPoint | Contact information for creating agency or administrator | lidarlab@univ.ac.kr | |||
| Metadata | conformsTo | Standards (e.g., metadata, technical specifications) that the dataset or service complies with | International Standard | ||
| isReferencedBy | Information on how the dataset is referenced and utilized by other datasets, documents | 2025 AV environment perception report | |||
| Relation | qualifiedRelation | Relationship with other datasets | prov:wasDerivedFrom AV2025_v1 | ||
| videoResolution | Resolution of the video dataset | 1920 × 1080 @ 30 fps | New Properties (Data heterogeneity management) | ||
| triggerMechanism | Data storage or generation condition | IMU sensor crash detection | |||
| samplingRate | Temporal collection frequency or cycle of data | 12 fps | |||
| dataModality | Data representation format | Video (Point cloud) | |||
| clipLength | Clip length of log or video data | 45 s | |||
| sensorType | Data collection equipment or sensor type | LiDAR (Front) | |||
| dataGranularityLevel | Spatial and temporal resolution level of data | Spatial: High resolution (1 m or less) Temporal: High resolution (1 frame per second or more) | |||
| eventTimeMarker | Synchronization reference point for multimodal data | 15 May 2025 T08:30:10Z | New Properties (Multimodal data linkage and spatial information) | ||
| prePostEventWindow | Recording time interval before and after a specific event | Pre: 15 s Post: 5 s | New Properties (Accident investigation process linkage) | ||
| analysisSupportLevel | Level of support data provides for analysis | All levels (Raw data provided) | |||
| investigationStep | Stage of accident investigation where data is utilized | Pre-crash (Virtual environment investigation) | |||
| reportingPurpose | Purpose for creating data or documentation | Comparison of Real-World environments and AV-perception virtual environments | |||
| dataSourceEntity | Data collection entity or equipment | DSSAD_Extractor_Unit_42 | New Properties (Special causes specific to AV) | ||
| cyberSecurityEvent | Security events detected during the data collection process | Authentication failure warning | |||
| Distribution | Description | title | Title of distribution | LiDAR data distribution file | Re-use existing DCAT and APs definitions or extend meanings |
| description | Description of distribution | Compressed LAS (LASer) format LiDAR data | |||
| Data Modification and Creation | issued | Date the distribution was first created | 1 June 2025 | ||
| modified | Date the distribution was updated or modified | 5 June 2025 | |||
| Distribution | format * | Distribution format | LAS | ||
| compressFormat * | Distribution file compression format | zip | |||
| byteSize * | Distribution file size | 25 GB | |||
| downloadURL * | Distribution file download URL | https://data.example/lidar202506.zip | |||
| Access and Rights | accessURL * | URL providing access to the specified distribution method | https://api.data.example/lidar | ||
| accessService * | Service endpoint provided for interacting with data distribution | REST API (JSON) | |||
| beforebyteSize | Pre-compression distribution file size information | 36 GB | New Properties (Data heterogeneity management) | ||
| formatDetail | Detailed file format (e.g., encoding method) or standard name of the distribution data | Video data, MP4 (H.264) | |||
| Catalog | Assistance | catalog * | Hierarchical relationships between catalogs | Top-level: City data catalog/Sub-level: Vehicle sensor catalog | Re-use existing DCAT and APs definitions or extend meanings |
| service * | Provided information for datasets included in the catalog | LiDAR Data Service 2025 | |||
| dataset * | Individual data included in the catalog | LiDARDataset_202506 | |||
| Index and Classification | themeTaxonomy * | Subject classification criteria supporting dataset organization and retrieval | Road environment perception | ||
| Relationship | Linkage and Relationship | relation * | Related datasets | AV camera perception virtual environment data | Re-use existing DCAT and APs definitions or extend meanings |
| isRequiredBy * | ‘A’ dataset required for using ‘B’ dataset to enable complex data interpretation | 3D Object Detection Dataset | |||
| requires * | ‘B’ dataset required for using ‘A’ dataset to enable complex data interpretation | Road Surface Condition Dataset | |||
| caseNum | Unique identifier information for individual accidents | CASE-20250601-0123 | New Properties (Accident investigation process linkage) | ||
| multiModalLinkage | Information on the relationship between ‘A’ dataset and other sensors, devices, and so forth | Camera → LiDAR Matching | New Properties (Multimodal data linkage and spatial information) | ||
| Location | Spatiotemporal | geometry * | Spatial geometry of the data | LINESTRING (127.03 37.50, 127.05 37.52) | Re-use existing DCAT and APs definitions or extend meanings |
| bbox * | Geographic boundaries of the data | POLYGON (127.02 37.49, 127.06 37.53) | |||
| adminUnitL1 | Highest-level administrative district to which the data belongs | Republic of Korea | |||
| adminUnitL2 | Second-highest administrative district to which the data belongs | Seoul Metropolitan City | |||
| adminUnitL3 | Third-highest administrative district to which the data belongs | Gangnam District | |||
| adminUnitL4 | Lowest-level administrative district to which the data belongs | Yeoksam 1-dong | |||
| videodatabbox | Geographic boundary information in video data | FRAMEBOX (Seoul) 00:00:00–01:00:00 FRAMEBOX (Incheon) 01:00:00–02:00:00 | New Properties (Multimodal data linkage and spatial information) | ||
| geocoordinate | Spatial coordinate system information referenced by the data | 37.7749° N | |||
| geoContextType | Reference system type for spatial data (e.g., road network-based, administrative boundary-based) | Road network-based | |||
| DataService | Access and Rights | endpointDescription * | Description of operations possible using the endpoint | RESTful API for LiDAR data query | Re-use existing DCAT and APs definitions or extend meanings |
| endpointURL * | Endpoint URL of the service providing the dataset | https://api.data.example/lidar/v1 | |||
| license * | Download and operation permissions for the dataset | CC BY 4.0 | |||
| Assistance | servesDataset | Datasets that can be deployed by the data service | LiDARDataset_202506 | ||
| PeriodOfTime | Spatiotemporal | endDate * | Data release end date | 5 June 2025 | Re-use existing DCAT and APs definitions or extend meanings |
| startDate * | Data release start date | 1 June 2025 | |||
| Agent | Provider and Manager | type * | Agent type | Organization | Re-use existing DCAT and APs definitions or extend meanings |
| name * | Name of managing agency or administrator | OO University AV research center | |||
| contactPoint | Contact information for managing agency or administrator | avcenter@univ.ac.kr | |||
| License | Access and Rights | type * | Required License type | CC BY 4.0 | Re-use existing DCAT and APs definitions or extend meanings |
| Checksum | Assistance | algorithm * | Algorithm used to generate the checksumValue | SHA-256 | Re-use existing DCAT and APs definitions or extend meanings |
| checksumValue * | Checksum value generated using the specified algorithm | 9f2c7b3e4a…c12a |
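To make the Dataset rows of the table concrete, the following minimal sketch encodes some of the example values as a plain Python dictionary and checks that the starred properties are present. It assumes that the asterisk marks mandatory properties and lists only the starred Dataset properties visible in the rows above; the name `REQUIRED_DATASET_KEYS` and the helper `missing_required` are hypothetical and are not part of DCAT or of the proposed schema.

```python
# Starred Dataset properties taken from the rows above.
# Assumption: "*" marks a mandatory property; the full table may star others.
REQUIRED_DATASET_KEYS = {
    "modified",
    "wasGeneratedBy",
    "accrualPeriodicity",
    "title",
    "description",
    "distribution",
}

# Example record built from the table's example values.
dataset = {
    "identifier": "lidar-dataset-2025-06",
    "title": "AV LiDAR perception virtual environment data",
    "description": "LiDAR point cloud data collected during AV operation",
    "modified": "2025-06-05",
    "wasGeneratedBy": "LiDAR sensor raw data extraction",
    "accrualPeriodicity": "Daily",
    "distribution": "Provided in compressed file format",
    "keyword": ["Perception", "Virtual environment"],
    # Newly proposed properties (example values from the table).
    "sensorType": "LiDAR (Front)",
    "samplingRate": "12 fps",
    "clipLength": "45 s",
    "investigationStep": "Pre-crash (Virtual environment investigation)",
    "cyberSecurityEvent": "Authentication failure warning",
}


def missing_required(record, required=REQUIRED_DATASET_KEYS):
    """Return the sorted names of required properties that are absent or empty."""
    return sorted(key for key in required if not record.get(key))


print(missing_required(dataset))  # []
```

A check of this kind could run when a record is registered in the catalog, so that incomplete entries are flagged before investigators search for them.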
| Content | Description |
|---|---|
| Accident Location | Signalized Intersection |
| Scenario description | The traffic signal at the intersection was displaying a green light, but the attacker hacked the signal control system and transmitted a red light message to the AV. This caused the AV to brake abruptly, resulting in a rear-end collision by the MV following behind. |
| Driver task | Automated Driving System (ADS) fully engaged |
| AV function issue | Security breach |
| Issue type | Malicious message injection |
| System state | Hacked V2I system |
| Weather | Clear |
| Collision type | AV to MV |
| Crash type | Rear-end collision of MV |
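In a scenario such as this one, the data of interest are those recorded around the abrupt braking event. As a rough illustration of how the `eventTimeMarker` and `prePostEventWindow` properties could be combined, the sketch below derives the recording window from the example values in the schema table (marker 15 May 2025 T08:30:10Z, Pre: 15 s, Post: 5 s). The function name `event_window` is hypothetical, and writing the marker as an ISO 8601 string is an assumption made for this example.

```python
from datetime import datetime, timedelta


def event_window(marker_iso, pre_s, post_s):
    """Return (start, end) of the recording window around an event marker."""
    # Replace the trailing "Z" so that fromisoformat accepts the string
    # on Python versions older than 3.11.
    marker = datetime.fromisoformat(marker_iso.replace("Z", "+00:00"))
    return marker - timedelta(seconds=pre_s), marker + timedelta(seconds=post_s)


# Example values from the schema table, with the marker written in ISO 8601.
start, end = event_window("2025-05-15T08:30:10Z", pre_s=15, post_s=5)
print(start.isoformat(), end.isoformat())
# 2025-05-15T08:29:55+00:00 2025-05-15T08:30:15+00:00
```

Because the window is anchored to a single marker, the same interval can be requested from every sensor log that shares that marker.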
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Kim, M.; Kim, N.; Kim, H.; Song, T.-J. Design of an Extended DCAT-Based Metadata Schema and Data Catalog for Autonomous Vehicle Accident Investigation. Sustainability 2025, 17, 11237. https://doi.org/10.3390/su172411237

