AI-Assisted CAN Trace Analysis for State Identification to Improve Structure-Aware Fuzz Testing of Automotive ECUs

Popescu, Aurelian; Kifor, Claudiu Vasile; Lisaru, Codrina Victoria

doi:10.3390/automation7030083


by Aurelian Popescu *, Claudiu Vasile Kifor and Codrina Victoria Lisaru

Faculty of Engineering, Lucian Blaga University of Sibiu, 55024 Sibiu, Romania

* Author to whom correspondence should be addressed.

Automation 2026, 7(3), 83; https://doi.org/10.3390/automation7030083 (registering DOI)

Submission received: 27 April 2026 / Revised: 16 May 2026 / Accepted: 20 May 2026 / Published: 22 May 2026

(This article belongs to the Section Smart Transportation and Autonomous Vehicles)


Abstract

Fuzz testing is a key verification technique for identifying robustness and cybersecurity weaknesses in automotive electronic control units (ECUs). However, conventional CAN-based fuzz testing suffers from extremely low acceptance rates because randomly generated frames often violate protocol constraints such as counters, checksums, and state dependencies. This study addresses the test-preparation bottleneck by proposing an AI-assisted approach for automated identification of stable operational system states from Controller Area Network (CAN) traces. These states can serve as valid starting points for mutation-based and model-based fuzzing. CAN traces generated in a Hardware-in-the-Loop (HIL) environment were analyzed using multiple publicly accessible large language model (LLM) systems. The objective was to evaluate whether AI/LLM tools can (i) identify unique system states, (ii) compute dwell-time distributions, and (iii) derive state transition maps directly from raw CAN traces and DBC definitions. Additionally, we examined whether these tools can analyze the quality of CAN communication (message cycle time). Finally, we ran the experimental tasks on CAN logs recorded from a production car. Results show that AI-assisted analysis can extract operational states and transitions with varying levels of agreement with the deterministic baseline, supporting preparatory analysis during fuzzing test preparation. While performance varies across tools, AI support demonstrates strong potential for accelerating and assisting structured fuzz testing workflows.

Keywords:

automotive cybersecurity; Controller Area Network (CAN); fuzz testing; state machine extraction; AI-assisted testing; grey-box fuzzing; HIL testing; protocol-aware fuzzing; Microsoft Copilot; Claude AI; ChatGPT; Google Gemini; ISO/SAE 21434

1. Introduction

Modern vehicles are complex cyber–physical systems composed of dozens of interconnected electronic control units (ECUs) communicating over in-vehicle networks such as the Controller Area Network (CAN). While this connectivity enables advanced functionality, it also increases the attack surface of automotive systems. As a result, cybersecurity assurance has become a mandatory engineering activity, reinforced by industry standards and regulatory frameworks, including SAE J3061 [1], ISO/SAE 21434 [2], and UNECE Regulation R155 [3]. These documents require systematic verification and validation measures to identify software weaknesses, communication vulnerabilities, and unsafe system behaviors throughout the vehicle development lifecycle.

Among the available security verification techniques, fuzz testing has emerged as a practical and effective method for identifying robustness issues and cybersecurity vulnerabilities. Fuzzing is an automated testing strategy in which malformed, unexpected, or semi-valid inputs are injected into a System Under Test (SUT) to provoke abnormal behavior, crashes, or unintended state transitions. Its strength lies in its ability to expose implementation flaws that are difficult to detect through specification-based testing. ISO/SAE 21434 explicitly recommends fuzz testing and penetration testing as distinct but complementary validation methods during integration and verification phases [2].

Fuzz testing approaches are commonly classified as black-box, grey-box, or white-box depending on the level of internal knowledge and runtime feedback available [4]. Black-box fuzzing simulates an external attacker’s perspective and requires no source code access, making it suitable for interface robustness and compliance testing. However, unguided input generation typically results in low coverage and slow vulnerability discovery. White-box fuzzing uses full program knowledge, instrumentation, and constraint-based techniques to achieve deep path exploration, but its high computational cost and scalability limitations restrict practical use in large embedded systems. Grey-box fuzzing represents a compromise, incorporating lightweight execution feedback such as coverage information to guide mutation while maintaining scalability [5,6]. This balance explains its widespread use in automotive security testing, particularly when OEMs (Original Equipment Manufacturers) provide partial protocol or interface knowledge to independent testing teams.

Applying fuzz testing to automotive networks presents domain-specific challenges [7,8]. CAN communication relies on structured signals, message timing, counters, and checksums. Random or structurally invalid frames are often rejected at the protocol level, preventing them from reaching functional ECU logic [9]. Consequently, unguided fuzzing leads to extremely low acceptance rates—defined here as the proportion of injected test frames that satisfy protocol constraints and are processed by the ECU application logic—and inefficient testing [10]. Structure-aware CAN fuzzing techniques have been proposed to address this limitation by extracting message structure from DBC files and generating semi-valid frames. Nevertheless, many existing approaches focus primarily on payload mutation [11,12,13] and do not sufficiently address the prerequisite of placing the SUT in a valid operational state before fuzzing.
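The acceptance rate defined above can be written compactly as follows. The symbols are introduced here for illustration only and do not appear in the cited works: N_{\mathrm{injected}} is the number of injected test frames, and N_{\mathrm{processed}} is the number of those frames that satisfy the protocol constraints and are processed by the ECU application logic.

R_{\mathrm{acc}}\,(\%) = \frac{N_{\mathrm{processed}}}{N_{\mathrm{injected}}} \times 100

Under this definition, an acceptance rate below 1% means that fewer than 1 in 100 injected frames reaches the application logic.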

Kim et al. [14] introduce a Structure-Aware CAN Fuzzing protocol, a grey-box fuzzing method that identifies the structure of CAN messages in Phase 1 and then systematically generates fuzzing inputs in Phase 2 to detect vulnerabilities in the SUT (see Figure 1). This approach reduces testing time and accelerates issue discovery within CAN DBC (database CAN) files. However, the analyzed system does not include feedback fuzzing, a feature commonly found in advanced software fuzzing techniques. The final test results and reports are generated after the Monitoring phase (Phase 3) is completed. Our study concentrates on Phase 1 of structure-aware CAN fuzzing testing.

Another critical challenge is maintaining stable and testable system states during fuzzing. Automotive ECUs often rely on specific sequences of prerequisite signals. Missing or invalid conditions can cause fail-safe modes in which injected frames are ignored. Experimental studies report acceptance rates below 1% when fuzzed inputs are not aligned with valid system states [10]. This highlights the importance of state-aware fuzz testing, in which the operational context of the SUT is considered before input injection. Emerging research in intelligent or learning-based fuzzing, including generative adversarial approaches for CAN traffic modeling, further indicates the potential of data-driven techniques to improve protocol compliance and test efficiency. Huiwen et al. [15] demonstrate with the SimADFuzz framework that using high-quality test scenarios during fuzzing improves test results for autonomous driving systems (ADSs). Within the same ADAS domain, Jin et al. [16] propose a multiphase seed optimization method for fuzz testing, in which scenarios are prioritized during the pre-fuzzing phase using an index-based ranking approach.

In addition to the identification of operating system states, the identification of state transitions during the preparation phase of fuzz testing can also improve test results. In the context of model-based fuzzing, Huang et al. [17] present a method that leverages large language models to automatically generate sequences of identified system states for testing network protocol implementations.

The identification of operational states and their transitions from recorded CAN traffic is therefore a crucial preparatory step for structured fuzz testing. Traditional methods rely on manual signal inspection using tools such as Vector CANalyzer 19.4 [18], vSignalyzer [19], MATLAB CAN Explorer [20] and BUSMASTER [21], which is time-consuming and requires domain expertise. At the same time, advances in large language models (LLMs) and AI-assisted data analysis suggest new possibilities for automating trace interpretation and state extraction.

Golam et al. [22] employed a Doc2Vec model, based on a neural machine translation approach, to parse CAN logs and identify Unified Diagnostic Services (UDS) messages. Doc2Vec is an embedding model that captures the contextual information of an entire document and aggregates it to support more effective classification. It is built upon the Word2Vec embedding framework, which represents words in a high-dimensional vector space and thereby facilitates subsequent processing within machine learning models. This approach was tested by Golam et al. only for UDS messages, not for all types of CAN messages.

Parsing methodologies for automotive CAN traces typically begin with the transformation of unstructured hexadecimal frame data into structured representations via template extraction and tokenization, drawing on principles established in general log parsing research. Adaptive log parsing frameworks, such as LogParse, employ incremental learning mechanisms to dynamically infer and update log templates over time. This adaptability is particularly important in automotive environments, where ongoing firmware and software updates can introduce changes to log formats, necessitating continuous template refinement to maintain accurate and reliable parsing [23].

Advanced parsing techniques increasingly incorporate pattern recognition and machine learning methodologies, including clustering algorithms and sequence modeling frameworks. For example, hierarchical clustering methods facilitate precision tuning and computational efficiency, thereby supporting the high-throughput, real-time parsing requirements characteristic of large-scale automotive data environments [24]. Furthermore, parsing approaches that utilize token interdependency graphs, such as Tipping, enable the efficient and accurate differentiation between log templates and variable parameters. This mechanism enhances both scalability and analytical precision, particularly when processing the substantial log volumes generated by complex automotive systems [25].

In addition, the application of machine learning frameworks, such as Long Short-Term Memory (LSTM) networks integrated with autoencoder–decoder architectures and Support Vector Machines (LADSVM), for log-based anomaly detection can be effectively extended to automotive CAN data. By first performing rigorous parsing and structuring of raw CAN traces, these models can be employed to identify anomalies and system faults with improved accuracy and reliability [26]. Emerging studies suggest that LLMs, leveraged through few-shot prompting techniques (e.g., ChatGPT), demonstrate considerable potential for log parsing and interpretation tasks [27].

To summarize the state of CAN log parsing: traditional approaches remain heavily dependent on manual inspection tools, which are labor-intensive and require substantial domain expertise. Recent AI-based approaches offer promising alternatives, yet current evidence remains limited. The Doc2Vec-based method of Golam et al. [22] demonstrates that machine learning can support CAN log parsing, but its validation is restricted to UDS messages, limiting its generalizability to broader CAN traffic. More advanced parsing frameworks, including adaptive template extraction, clustering, and token interdependency graph methods, appear better suited to the evolving and high-volume nature of automotive logs. These approaches offer greater scalability, flexibility, and suitability for real-time analysis, particularly in environments affected by frequent software updates. Moreover, structured parsing can serve as a foundation for downstream anomaly detection using LSTM-, autoencoder-, or SVM-based models. Although LLMs and few-shot prompting show considerable promise, their application to CAN trace parsing remains preliminary and requires more systematic validation in automotive contexts.

This paper investigates the use of AI-assisted analysis of CAN traces to automatically identify operational system states and state transitions for fuzz testing preparation. Multiple publicly accessible AI/LLM tools are evaluated on real and simulated CAN datasets to assess their capability to support automated interpretation of CAN traces for structured fuzz testing preparation. The proposed approach addresses a key bottleneck in automotive fuzz testing and contributes to improving the effectiveness of protocol-aware and state-aware testing strategies.

The effectiveness of structure-aware fuzz testing in automotive networks strongly depends on the availability of valid operational system states from which test-case injection can begin. This study addresses the test-preparation bottleneck by introducing an AI-assisted method for extracting stable system states and state transitions directly from recorded CAN traffic. In earlier work, we analyzed how AI tools can support requirements-based testing by generating test cases automatically [28].

Protocol-based fuzz testing for CAN networks presents inherent limitations [29]. When CAN messages contain counters and checksum signals, conventional fuzzers frequently generate invalid frames that are discarded by ECUs before reaching application logic. This leads to low acceptance rates, inefficient use of test resources, and limited coverage.

Machine learning-driven fuzzing approaches trained on authentic CAN traffic offer a promising alternative. By learning structural and statistical properties of valid communication, such systems can produce protocol-compliant yet anomalous test inputs. In this context, field-associative mutation generative adversarial networks (FAMGAN) have been proposed to improve fuzzing efficiency [29]. A key conclusion across these approaches is that reinitializing the SUT into valid operational states, ideally after each test case, significantly improves fuzzing effectiveness. The AI-assisted state extraction method proposed in this paper directly supports this conclusion.

The primary objective of this study is to investigate the feasibility of using AI-assisted analysis of CAN traces to support the preparation phase of structured automotive fuzz testing. To operationalize this goal, the study pursues three specific objectives:

(i) To evaluate whether AI/LLM tools can accurately identify unique operational system states from recorded CAN traffic;

(ii) To assess the capability of AI/LLM tools to compute dwell-time distributions of detected states;

(iii) To examine the ability of these systems to derive state transition relationships suitable for integration into protocol-aware fuzz testing workflows.

The main contributions of this work are the following:

AI-Assisted State Identification. We demonstrate that automated analysis using AI/LLM tools can support the preparation phase of fuzz testing by identifying stable operational states from CAN traces. These states can serve as reliable starting points for mutation-based and model-based fuzzing.
Comparative Evaluation of Commercial AI/LLM tools. Multiple commercial AI/LLM tools are experimentally evaluated regarding their ability to parse CAN traces and DBC files, detect unique system states, and compute dwell-time distributions. The comparison highlights practical capabilities and limitations of AI-assisted automotive data analysis.
Automated Extraction of State Transitions. The study investigates the ability of AI/LLM tools to infer system state transitions and generate state-machine representations from CAN traces. Such models are valuable not only for structured fuzz testing but also for functional and safety-oriented test-case derivation, as suggested in prior work on fuzz-based functional validation [30].
Automated Analysis of CAN Communication Quality. The study demonstrates that AI/LLM tools are able to compute the cycle time of CAN messages and to calculate its minimum, maximum, and average values.

The remainder of this paper is organized as follows: Section 2 describes the materials and methods employed in the study, including the Hardware-in-the-Loop (HIL) experimental setup, CAN data acquisition process, datasets, and the evaluated AI/LLM tools. Section 3 presents the execution and results of the AI-assisted analysis, structured according to the defined evaluation tasks, namely operational state identification, dwell-time analysis, visualization outputs, and state transition extraction. Section 4 discusses the findings, highlighting comparative tool performance, practical implications for fuzz testing preparation, and identified limitations. Finally, Section 5 concludes the paper and outlines directions for future research.

2. Materials and Methods

The experimental setup used in this study is based on a Hardware-in-the-Loop (HIL) system originally developed for functional and safety system testing. The same infrastructure can be adapted, with minor modifications, to support cybersecurity-oriented fuzz testing. The overall architecture of the fuzz testing implementation within the HIL environment is illustrated in Figure 2.

The framework consists of four principal components: a fuzz engine responsible for generating test inputs, an injector that transmits CAN frames into the network, a monitor that records system responses, and the System Under Test (SUT), which represents the ECU being evaluated. As shown in Figure 2, these components interact through the CAN communication channel and analog/digital interfaces typical for HIL configurations.

The present work focuses on the test preparation phase of fuzz testing. In this stage, AI-assisted analysis is employed to process extended CAN recordings and extract operational system states that represent valid and protocol-compliant starting conditions for fuzzing scenarios. The AI support module (indicated in Figure 2) analyzes long-duration CAN traces and identifies stable states and state transitions that can be used to initialize mutation-based or model-based fuzz testing.

2.1. CAN Data Acquisition and Network Description

To determine valid starting conditions for fuzz testing, Controller Area Network (CAN) traffic was analyzed using traces generated specifically for research purposes.

CAN bus communication was monitored and analyzed using CANalyzer 19.4 (test version) developed by Vector Informatik (Stuttgart, Germany). The tool was used for message logging, frame decoding, and real-time bus analysis during system validation [18].

Although synthetic traces were used to ensure controlled signal behavior, the same methodology is applicable to CAN traces collected from real vehicles, as demonstrated at the end of the experiment.

In CAN-based communication, ECUs interpret raw frames according to a database definition file (.dbc), which specifies message identifiers, frame length, signal layouts, scaling rules, and transmission cycle times. The DBC file used in this study defines the following (Figure 3):

Two network nodes (including the ECU under test);
Three CAN messages: ClimateState, TireInfos, and VehicleState;
Multiple functional signals.

Figure 3. CAN database edited with CANdb++ 3.1.30 [31].

Within the HIL setup, the SUT is the Tire Pressure Monitoring (TPM) Electronic Control Unit (ECU), which receives and processes signals from the VehicleState message. The TPM ECU serves as the dedicated control module responsible for supervising, analyzing, and managing data provided by the vehicle’s Tire Pressure Monitoring System (TPMS).

Its primary functions include monitoring tire pressure and tire temperature; generating warnings under specific conditions, such as underpressure, overpressure, or excessive temperature; performing data processing and fault diagnostics, including reference pressure compensation; communicating with other vehicle systems and control units; managing the energy consumption of wheel-mounted sensors; and handling sensor registration, pairing, and relearning procedures.

2.2. CAN Trace Generation

Two CAN trace files were generated using the Interactive Generator module of Vector CANalyzer (see Table 1), and a third file used in the study was recorded from a production car (Tesla Model 3, 2022) [32].

The temporal evolution of selected signals from both generated datasets is shown in Figure 4 and Figure 5, respectively. Signal behavior was validated using the Graph function in CANalyzer. The vehicle recording is used in this study only to confirm the experimental results obtained with the first two traces.

To ensure diverse system dynamics, signals with different temporal characteristics were included:

Slow-varying ramps (e.g., vehicle speed);
Pulsed signals (e.g., clamp15, MotorState);
Random fluctuations (simulated sensor faults);
Fast counters (message counters).

The .dbc file used contains CAN messages with different cycle times (VehicleState: 100 ms; ClimateState: 1000 ms) to keep the experiment close to practical conditions. No multiframe or multiplexed messages were used in the experiment.

2.3. Evaluation of AI/LLM Tools

Four commercial AI/LLM tools were evaluated for automated CAN trace analysis (see Table 2).

The evaluation focused on correctness, completeness, and consistency with the given instructions rather than on the robustness, reliability, or variance of the AI/LLM tools. In the experiment, we use the evaluation variable “task completion (%)”, which indicates the extent to which the assigned task was successfully completed. The study was conducted between 15 January and 8 February 2026 on a notebook running Microsoft Windows 11, and every task was run at least three times for each tool.

2.4. Analysis Procedure

The objective of the AI-assisted analysis was to extract the following:

Unique operational system states;
Dwell-time distribution of states;
State transition relationships;
Cycle time statistics of transmitted CAN messages.

Each AI tool received the same core analytical tasks (see Table 3), with minor prompt adaptations when required for platform-specific input handling (for example, file type—instead of .dbc, only files ending with .txt were accepted) or visualization requests, for both the short and long CAN traces.

The analysis considered the signals clamp15, MotorState, HeatingState, DoorState_driver, and DoorState_codriver. These signals were chosen to be accessible and comprehensible to readers without a specialized background in the automotive domain. Any alternative set of signals from the .dbc file could have been used, and the selection need not be limited to five signals. The .dbc file developed for this work was deliberately kept simple, with self-explanatory CAN message and signal names, eliminating the need for additional signal descriptions.

In this paper we use the terms dwell time and dwell share. The dwell time is the difference between the exit time and the entry time of a state:

T_{\mathrm{dwell}} = t_{\mathrm{exit}} - t_{\mathrm{entry}}

(1)

The dwell share relates the dwell time to the total duration T_{\mathrm{total}} of the analyzed trace:

\mathrm{dwell\ share}\,(\%) = \frac{T_{\mathrm{dwell}}}{T_{\mathrm{total}}} \times 100

(2)
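As a minimal illustration of Equations (1) and (2), the following Python sketch accumulates the dwell time per state and derives the dwell share. The function name `dwell_shares`, the episode tuples, and the state labels are hypothetical and are not taken from the study's pipeline. The sketch assumes that the episodes cover the analyzed trace without gaps, so that the total duration equals the sum of all episode durations.

```python
from collections import defaultdict


def dwell_shares(episodes):
    """Return {state: (total_dwell_s, dwell_share_percent)}.

    episodes: list of (state, t_entry, t_exit) tuples, times in seconds.
    Assumption: episodes cover the analyzed trace without gaps.
    """
    totals = defaultdict(float)
    for state, t_entry, t_exit in episodes:
        totals[state] += t_exit - t_entry          # Equation (1), summed per state
    t_total = sum(totals.values())                 # assumed total duration
    return {s: (t, 100.0 * t / t_total) for s, t in totals.items()}  # Equation (2)


# Hypothetical episodes: state "A" is visited twice, state "B" once.
episodes = [("A", 0.0, 4.0), ("B", 4.0, 5.0), ("A", 5.0, 10.0)]
result = dwell_shares(episodes)
for state, (dwell, share) in sorted(result.items()):
    print(f"{state}: {dwell:.1f} s, {share:.1f} %")
```

In this toy input, state "A" accumulates 9 s out of 10 s and state "B" the remaining 1 s.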

2.5. Methodological Relevance

In CAN-based fuzz testing, invalid counters or checksum values frequently lead to frame rejection before application-level processing. Therefore, identifying stable operational states from recorded traffic is essential for initializing mutation-based or model-based fuzzing effectively. Without valid system preconditions, injected test frames are often discarded at the protocol validation stage, resulting in low acceptance rates and limited test coverage.

The AI-assisted workflow investigated in this study addresses this preparatory requirement by extracting operational states and their temporal characteristics directly from CAN traces. These outputs provide structured initialization conditions for fuzzing scenarios, enabling protocol-compliant system setup prior to test input injection.

2.6. Deterministic Baseline Parsing and Ground-Truth Generation

To establish a reproducible reference for evaluating the outputs generated by the investigated AI/LLM tools, a deterministic Python (3.12)-based parsing pipeline was developed for the .dbc and .asc files used in this study. The purpose of this baseline pipeline was to generate benchmark results for the same analytical tasks assigned to the AI/LLM tools, namely operational state identification, dwell-time computation, state transition extraction, and cycle time analysis. This deterministic reference was introduced to ensure that LLM-generated outputs could be assessed against a stable and verifiable ground truth rather than only through descriptive comparison.

The baseline parser processes the CAN database (.dbc) and the CAN trace files (.asc) without relying on interactive interpretation. First, the parser extracts message definitions, signal definitions, scaling parameters, and value tables from the DBC file. For message identifiers stored in Vector DBC format, the extended-frame identifier representation is normalized before decoding. The parser then reads the .asc trace sequentially and decodes the payload of relevant CAN messages according to the signal definitions specified in the DBC. In the present study, the messages of primary interest were VehicleState and ClimateState, from which the signals clamp15, MotorState, HeatingState, DoorState_driver, and DoorState_codriver were reconstructed.
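The identifier normalization and signal decoding steps can be sketched as follows. This is not the study's parser. It relies on the common DBC convention that extended (29-bit) identifiers are stored with bit 31 set, and it assumes unsigned signals in Intel (little-endian) byte order. The function names, the stored value 0x8C00FF31, and the bit positions in the usage example are hypothetical; only the resulting identifier 0xC00FF31 for VehicleState is reported in this paper.

```python
EXTENDED_FLAG = 0x80000000   # bit 31 marks an extended frame in the stored DBC ID
ID_MASK_29BIT = 0x1FFFFFFF   # extended identifiers use 29 bits
ID_MASK_11BIT = 0x7FF        # standard identifiers use 11 bits


def normalize_dbc_id(raw_id: int) -> tuple[int, bool]:
    """Return (can_id, is_extended) for an identifier as stored in a DBC file."""
    if raw_id & EXTENDED_FLAG:
        return raw_id & ID_MASK_29BIT, True
    return raw_id & ID_MASK_11BIT, False


def decode_signal(payload: bytes, start_bit: int, length: int,
                  factor: float = 1.0, offset: float = 0.0) -> float:
    """Decode an unsigned little-endian signal and apply DBC scaling."""
    raw = int.from_bytes(payload, "little")
    value = (raw >> start_bit) & ((1 << length) - 1)
    return value * factor + offset


# Assumed stored form of the VehicleState identifier.
can_id, is_ext = normalize_dbc_id(0x8C00FF31)

# Hypothetical layout: a 1-bit signal at bit 0 and a 2-bit signal at bit 1.
payload = bytes.fromhex("0500000000000000")
bit0_signal = decode_signal(payload, start_bit=0, length=1)
two_bit_signal = decode_signal(payload, start_bit=1, length=2)
```

Signed signals and Motorola (big-endian) byte order need additional handling that is omitted here.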

Because the selected signals are transmitted asynchronously and do not necessarily appear in the same frame, state reconstruction was performed using a last-value-held strategy. More precisely, at each signal update, the most recent known value of each of the five signals was retained, and the current system state was updated whenever at least one signal value changed. Consecutive duplicate states were collapsed into a single state episode. By default, partially defined initial states were excluded from the benchmark until all five selected signals had been observed at least once. This design choice was made to avoid artificial inflation of the state count caused by undefined initialization values. However, the parser also supports inclusion of the initial partially defined state for sensitivity analysis.
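The last-value-held strategy described above can be sketched in a few lines of Python. The signal names are those used in this study, but the function `reconstruct_states`, its `include_partial` flag, and the sample updates are hypothetical and only approximate the behavior described in the text.

```python
SIGNALS = ("clamp15", "MotorState", "HeatingState",
           "DoorState_driver", "DoorState_codriver")


def reconstruct_states(updates, include_partial=False):
    """Return state-change events as a list of (timestamp, state_tuple).

    updates: iterable of (timestamp, {signal: value}) in time order.
    A signal that has not been observed yet is represented by None.
    """
    last = {name: None for name in SIGNALS}
    events = []
    for ts, values in updates:
        for name, value in values.items():
            if name in last:
                last[name] = value               # hold the most recent value
        state = tuple(last[name] for name in SIGNALS)
        if not include_partial and None in state:
            continue                             # skip partially defined states
        if not events or events[-1][1] != state:
            events.append((ts, state))           # collapse consecutive duplicates
    return events


# Hypothetical updates: the message carrying HeatingState arrives later.
vehicle = {"clamp15": 0, "MotorState": 0,
           "DoorState_driver": 0, "DoorState_codriver": 0}
updates = [
    (0.0, dict(vehicle)),
    (0.1, dict(vehicle)),
    (2.0, {"HeatingState": 0}),
    (2.1, {**vehicle, "clamp15": 1}),
    (2.2, {**vehicle, "clamp15": 1}),
]
strict = reconstruct_states(updates)
relaxed = reconstruct_states(updates, include_partial=True)
print(len({s for _, s in strict}), len({s for _, s in relaxed}))
```

With the default setting, the toy input yields two unique states. Including the partially defined initial state yields three, which illustrates how initialization handling alone can change the state count by one.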

The deterministic parser generated structured benchmark files containing the following:

(1): Reconstructed state-change events;
(2): The set of unique operational states with occurrence counts and total dwell times;
(3): The transition table with source state, target state, and transition frequency;
(4): Cycle time summary statistics.

These outputs were used as the reference results for evaluating the correctness of LLM-generated analyses in Section 3.
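For output (3), a transition table can be derived from the ordered list of state-change events by counting consecutive state pairs. The sketch below is illustrative only: the function name `transition_table` and the state labels are hypothetical, and the study's benchmark files may be organized differently.

```python
from collections import Counter


def transition_table(events):
    """Count directed transitions between consecutive state-change events.

    events: list of (timestamp, state) in time order, duplicates collapsed.
    Returns a Counter mapping (source_state, target_state) to its frequency.
    """
    states = [state for _, state in events]
    return Counter(zip(states, states[1:]))


# Hypothetical sequence of five state-change events.
events = [(0.0, "A"), (1.0, "B"), (2.5, "A"), (4.0, "B"), (6.0, "C")]
table = transition_table(events)
total_transitions = sum(table.values())
unique_paths = len(table)
for (source, target), count in sorted(table.items()):
    print(f"{source} -> {target}: {count}")
```

A sequence of n state-change events always yields n - 1 transitions, while the number of unique transition paths can be smaller because a path may occur more than once.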

The use of a deterministic baseline is particularly important in the present context because it enables separation between plausible but approximate AI-generated interpretations and exact trace-derived results.

3. Execution and Results

This section presents the results of the CAN trace analysis, combining the deterministic baseline described in Section 2.6 with the outputs generated by the evaluated AI/LLM tools. The purpose of this evaluation is to comparatively assess the capability of the selected AI/LLM tools to support the preparation phase of structured automotive fuzz testing, while also examining how closely their outputs align with a reproducible reference baseline.

The analysis focuses on five key dimensions: operational state identification, dwell-time analysis, visualization generation, state transition extraction, and cycle time analysis of CAN messages. To support a rigorous comparison, the deterministic parser was first used to generate benchmark results for the short and long CAN traces. These benchmark results serve as the ground truth for interpreting the correctness, consistency, and limitations of the LLM-generated analyses.

Both short- and long-duration CAN datasets were used to examine tool consistency, scalability, and analytical depth. Although the same analytical tasks were assigned to each platform, some prompt adaptations were required to accommodate platform-specific input handling, such as file-format conversion from .asc to .txt or clarification of visualization requests. The results presented in the following subsections therefore distinguish between exact agreement with the deterministic baseline, approximate agreement, and deviations attributable to interpretation or tool limitations.

3.1. Deterministic Baseline Results

Before evaluating the outputs generated by the investigated AI/LLM tools, a deterministic reference baseline was established using the Python-based parsing pipeline (see Section 2.6). The purpose of this baseline was to generate reproducible benchmark results for the same analytical tasks assigned to the AI/LLM tools, namely operational state identification, dwell-time computation, state transition extraction, and cycle time analysis. The baseline results were derived directly from the .dbc and .asc files and were used as the reference for assessing the correctness of the LLM-generated outputs.

For the short CAN trace, the deterministic parser identified 16 unique operational system states when partially defined initial states were excluded. For the long CAN trace, the parser identified 24 unique operational system states under the same reconstruction rules. These values provide a stable benchmark for interpreting differences between AI-generated results, especially in cases where state counts may be affected by initialization handling or approximate interpretation.

The deterministic baseline further identified 19 state-change events and 18 directed transition edges in the short trace. For the long trace, the parser identified 49 state-change events, corresponding to 48 transitions and 38 unique transition paths. These transition counts are particularly important because they provide an exact reference for evaluating the fidelity of AI-generated state transition tables and transition maps. In the long-trace case, these baseline values are consistent with the transition totals reported for AI_2, indicating that the main behavioral structure of the dataset was captured correctly in that analysis.

For cycle time analysis of the VehicleState message (CAN ID 0xC00FF31), the deterministic baseline produced the following results for the short trace: minimum cycle time 98.347 ms, maximum cycle time 101.974 ms, average cycle time 100.0003 ms, and population standard deviation 0.4696 ms. For the long trace, the corresponding values were minimum 98.528 ms, maximum 101.649 ms, average 99.999999 ms, and a population standard deviation of 0.4311 ms. The population standard deviation was calculated as

σ = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - μ)^2}

(3)

where x_i represents each observation, μ is the population mean, and N is the total number of observations.

These results provide a precise reference for comparison with the cycle time statistics reported by the evaluated AI/LLM tools.
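The cycle time statistics, including the population standard deviation of Equation (3), can be reproduced with the Python standard library. The sketch below is not the study's parser. The function name `cycle_time_stats` and the timestamps are hypothetical, and the timestamps are assumed to be the reception times in seconds of a single message identifier, in time order.

```python
import statistics


def cycle_time_stats(timestamps_s):
    """Return min, max, mean and population std. dev. of cycle times in ms."""
    deltas_ms = [(later - earlier) * 1000.0
                 for earlier, later in zip(timestamps_s, timestamps_s[1:])]
    return {
        "min": min(deltas_ms),
        "max": max(deltas_ms),
        "mean": statistics.fmean(deltas_ms),
        "pstdev": statistics.pstdev(deltas_ms),  # divides by N, as in Equation (3)
    }


# Hypothetical reception times of one message with a nominal 100 ms cycle.
timestamps = [0.000, 0.099, 0.200, 0.300]
stats = cycle_time_stats(timestamps)
print({name: round(value, 3) for name, value in stats.items()})
```

`statistics.pstdev` is used rather than `statistics.stdev` because the latter divides by N - 1 (sample standard deviation) and would give slightly larger values.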

An additional benefit of the deterministic baseline is that it clarifies the likely origin of the short-trace discrepancy observed in one LLM result, where 17 instead of 16 states were reported. The parser was designed to exclude partially defined initial states by default; however, if such states are included, an additional initialization-related state may appear before all five selected signals have been observed. This indicates that the discrepancy is most likely related to state-initialization semantics rather than to a fundamentally different interpretation of the operational behavior recorded in the trace.

Table 4 summarizes the deterministic baseline results for both datasets and serves as the benchmark reference for the comparative analysis of the AI/LLM tools presented in the following subsections.

3.2. Task 1—Operational State Identification

The first evaluation task focused on the ability of AI/LLM tools to identify unique operational system states based on signal combinations extracted from CAN traces. All evaluated tools successfully parsed the provided .ASC trace files and corresponding DBC definitions, demonstrating the feasibility of AI-assisted interpretation of structured automotive communication data.
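For readers unfamiliar with the trace format, the sketch below parses one data-frame line of an .ASC log. It assumes a simplified layout (timestamp, channel, hexadecimal identifier with an `x` suffix for extended frames, direction, `d`, DLC, data bytes) and hexadecimal base; real logs contain further line types that this hypothetical helper simply ignores.

```python
def parse_asc_line(line):
    """Parse one classical CAN data-frame line; return None for any other line."""
    parts = line.split()
    if len(parts) < 6 or parts[3] not in ("Rx", "Tx") or parts[4] != "d":
        return None
    try:
        timestamp = float(parts[0])
        raw_id = parts[2]
        extended = raw_id.lower().endswith("x")  # assumed marker for 29-bit IDs
        can_id = int(raw_id[:-1] if extended else raw_id, 16)
        dlc = int(parts[5], 16)
        data = bytes(int(byte, 16) for byte in parts[6:6 + dlc])
    except ValueError:
        return None
    return {
        "timestamp": timestamp,
        "channel": parts[1],
        "id": can_id,
        "extended": extended,
        "data": data,
    }


# Hypothetical sample lines, not copied from the supplementary traces.
frame = parse_asc_line("   0.100000 1  C00FF31x       Rx   d 8 01 00 00 00 00 00 00 00")
header = parse_asc_line("date Mon Jan 1 10:00:00 2026")
print(frame, header)
```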

For the long-duration dataset, all four AI/LLM tools consistently identified 24 unique operational system states (Figure 6). This result indicates strong convergence in AI-driven state detection when sufficient temporal data is available. The larger dataset allowed for clearer differentiation between stable operational modes and transient signal combinations, facilitating consistent classification across tools.

For the short-duration dataset, three platforms (AI_1, AI_2, and AI_4) identified 16 states, while AI_3 detected 17 states.

In its chat interface, the AI_4 tool lets the user choose between “Fast Mode”, “Thinking Mode”, and “Pro Mode”. For the research tasks, the results returned in “Fast Mode” and “Thinking Mode” were incomplete, so only the “Pro Mode” results are presented here.

The 16 unique system states identified for the short CAN trace for Task 1 are displayed in Table 5.

For Task 1 on the long CAN trace, the AI_2 tool reported 24 different combinations of signal values (unique system states) over approximately 1964 s (~33 min), with 48 total transitions and 38 unique transition paths. Figure 7 presents the 10 most stable system states observed in the long CAN trace.

Further inspection revealed that the 17-state result reported by AI_3 for the short CAN log originated from the interpretation of initialization conditions associated with the ClimateState signal. Because this message has a longer transmission cycle than the other messages, and the first ClimateState message arrived 2 s after the first VehicleState message, AI_3 classified its initial undefined value as a distinct operational state. While technically valid, this interpretation represents a modeling nuance rather than a detection error.

For Task 1, AI_3 required 3 min and 54 s to complete the analysis and identified 17 unique operational system states. As in Figure 6, the AI_3 output shows the dwell-time share of system states for the short CAN file. Each bar represents one unique system state (a combination of clamp15, MotorState, HeatingState, DoorState_driver, and DoorState_codriver), and its height shows the percentage of total observation time spent in that state.

Overall, the results confirm that LLM-based analysis can reliably extract operational state combinations from CAN traces. Minor variations are attributable to interpretation of initialization semantics rather than parsing limitations. This capability is highly relevant for fuzz testing preparation, as it enables automated identification of valid system contexts required for protocol-compliant input injection.

3.3. Task 2—Dwell-Time Analysis

The second task evaluated the ability of AI/LLM tools to compute dwell-time distributions associated with previously identified operational states. Temporal analysis is critical for fuzz testing preparation because it highlights dominant system modes and guides prioritization of test-case injection scenarios.
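A dwell-time computation of this kind can be sketched as follows, assuming the state-change events are already available as (timestamp, state) pairs. The function name, the state labels, and the sample values are hypothetical.

```python
from collections import defaultdict


def dwell_times(state_changes, trace_end_s):
    """Sum the time spent in each state and express it as a share of the total.

    state_changes: time-ordered list of (timestamp_s, state) at each state change.
    trace_end_s: timestamp of the last frame, which closes the final state.
    """
    totals = defaultdict(float)
    # Each state lasts until the next state change (or the end of the trace).
    boundaries = [t for t, _ in state_changes[1:]] + [trace_end_s]
    for (start, state), end in zip(state_changes, boundaries):
        totals[state] += end - start
    observed = sum(totals.values())
    if observed <= 0:
        return {}
    return {
        state: {"seconds": seconds, "share_pct": 100.0 * seconds / observed}
        for state, seconds in totals.items()
    }


# Hypothetical events: S0 for 10 s, S1 for 5 s, then S0 again until the trace ends at 20 s.
dwell = dwell_times([(0.0, "S0"), (10.0, "S1"), (15.0, "S0")], trace_end_s=20.0)
print(dwell)
```

Reporting both absolute seconds and percentage shares makes it easy to check whether the two are mutually consistent in a given tool output.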

All evaluated platforms generated dwell-time metrics; however, differences were observed in computational accuracy and analytical richness. AI_1 produced dwell-time tables and associated distribution charts, but minor numerical inaccuracies were detected in absolute dwell-time values. Despite this, proportional dwell-share calculations remained coherent, suggesting that visualization layers relied on internally normalized values.

The output generated by AI_1 for Task 2 consisted of a .csv table (Table 6); however, the dwell-time values for several system states were computed incorrectly.

AI_2 demonstrated the most comprehensive analytical output. In addition to dwell-time computation, it generated operational insights such as dominant state identification, ignition-state distributions, motor-activity breakdowns, and transient-state detection. These enriched analytics provide practical value for testers by contextualizing system exposure durations and identifying high-priority operational modes.

AI_3 and AI_4 produced accurate dwell-time share distributions. Notably, AI_3 extended visualization capabilities by integrating Pareto analysis, combining absolute dwell durations with cumulative contribution curves. This representation supports rapid identification of the most operationally significant system states.
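The data behind such a Pareto representation can be derived in a few lines. The sketch below sorts states by dwell time and accumulates their percentage contribution; it does not reproduce the AI_3 output, and the function name and sample values are hypothetical.

```python
def pareto_table(dwell_seconds):
    """Return rows (state, seconds, share_pct, cumulative_pct), largest dwell first."""
    total = sum(dwell_seconds.values())
    if total <= 0:
        return []
    rows = []
    cumulative = 0.0
    ordered = sorted(dwell_seconds.items(), key=lambda item: item[1], reverse=True)
    for state, seconds in ordered:
        share = 100.0 * seconds / total
        cumulative += share
        rows.append((state, seconds, share, cumulative))
    return rows


# Hypothetical dwell times in seconds.
rows = pareto_table({"S1": 30.0, "S0": 60.0, "S2": 10.0})
for row in rows:
    print(row)
```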

These results indicate that AI/LLM tools can effectively perform temporal behavioral analysis of automotive systems.

However, numerical outputs may require validation when engineering-grade precision is needed for safety-critical testing within the scope of ISO 26262 (Road vehicles—Functional Safety) [37] or ISO 21448 (Road vehicles—Safety of the intended functionality) [38].

3.4. Task 3—Visualization Outputs

Visualization generation was successfully completed by all evaluated platforms. The produced outputs varied in graphical sophistication, interactivity, and analytical augmentation.

AI_1 (Figure 8) and AI_4 generated static bar charts representing dwell-time distributions across operational states. These visualizations provide clear comparative representations but lack interactive exploration capabilities.

AI_2 produced interactive visual analytics, including sortable bar charts, hover-based tooltips, summary statistics, and extended legend tables. This interactive layer enhances interpretability, allowing testers to dynamically explore operational hierarchies and temporal dominance patterns.

Figure 9 presents the output of the AI_2 tool for Task 3 on the short CAN trace, rendered as an interactive HTML bar chart:

AI_3 introduced hybrid visual analytics through Pareto chart integration. By combining frequency distributions with cumulative curves, this approach highlights the disproportionate operational weight of dominant states and supports prioritization strategies for fuzz testing campaigns.

For Task 2 and Task 3 (long CAN trace), the AI_4 tool identified 24 unique states and generated Figure 10 for the dwell-time share. It also output a summary of findings: “The system spent the majority of its time in active or standby running states. The most frequent state observed was Ignition ON, Motor Running, Heating Active, and all doors Closed, accounting for approximately 29% of the trace duration. Significant dwell time was also observed in states where the motor was running but the heating was either inactive or in sleep mode, suggesting thermal management transitions or stationary testing” [36].

Overall, AI-generated visualization outputs significantly reduce manual data interpretation effort and provide decision-support artifacts for structured fuzz testing planning.

3.5. Task 4—State Transition Extraction

This task examined the ability of AI/LLM tools to infer and visualize operational state transitions from CAN traces. Transition modeling is essential for structured fuzz testing because it enables systematic navigation across valid system contexts.
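The relationship between state-change events, transitions, and unique transition paths can be illustrated with a short sketch: a sequence of N state-change events yields N − 1 transitions, and the unique paths are the distinct directed (source, target) pairs. The function name and the sample sequence are hypothetical.

```python
from collections import Counter


def transition_counts(state_sequence):
    """Count directed transitions between consecutive, differing states."""
    edges = Counter()
    for source, target in zip(state_sequence, state_sequence[1:]):
        if source != target:  # ignore repeated samples of the same state
            edges[(source, target)] += 1
    return edges


# Hypothetical sequence of five state-change events.
edges = transition_counts(["S0", "S1", "S0", "S1", "S2"])
total_transitions = sum(edges.values())
unique_paths = len(edges)
print(total_transitions, unique_paths, dict(edges))
```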

AI_1 generated static transition diagrams and grouped-state representations (Figure 11). These visualizations provided clear structural overviews of system dynamics but were limited in interactivity.

For Task 4 on the long trace, the output was a state transition table with 24 system states and a .png file (see Figure 12A).

We extended Task 4 for the long CAN trace by asking the AI_1 tool to produce a state transition map in which the states were grouped by the signal “HeatingState” (Figure 12B).

For the long CAN trace, AI_3 generated a state legend in table form and a .png file with the system state transitions. The state machine diagram also shows the count and direction of every transition; for example, four transitions were observed between S0 and S1.

AI_2 produced the most advanced transition modeling outputs, including interactive HTML transition maps built with vis.js network (a dynamic, browser-based visualization library) [39], transition frequency reports, and Graphviz DOT export files. These artifacts support integration into automated testing toolchains and continuous integration environments.

The Task 4 output for the long CAN trace differed from that for the short CAN trace: for the system state transition map, the AI_2 tool did not use a bubble map but rather state stickers with details for every state (Figure 13). The HTML visualization makes it easy to see how the system flows through states, with error states (red), normal operational states (green), and transitional states (blue) clearly distinguished.

In addition to the HTML state transition map, a comprehensive state transition report was generated, including complete state definitions, the full transition matrix, a timeline of the first 20 transitions, and the most frequent transition paths. A corresponding .dot file in Graphviz format was also produced to enable automated diagram generation.
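To indicate how little is needed to obtain such a file, the following sketch serializes a dictionary of transition counts into Graphviz DOT text. It is an illustrative assumption about the structure of such an export, not the file produced by AI_2; the function name, graph name, and counts are hypothetical.

```python
def to_dot(edges, graph_name="can_states"):
    """Render {(source, target): count} as a Graphviz DOT digraph string."""
    lines = [f"digraph {graph_name} {{", "  rankdir=LR;"]
    for (source, target), count in sorted(edges.items()):
        # The edge label carries the observed number of transitions.
        lines.append(f'  "{source}" -> "{target}" [label="{count}"];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical transition counts.
dot_text = to_dot({("S0", "S1"): 4, ("S1", "S0"): 3, ("S1", "S2"): 1})
print(dot_text)
```

The resulting text can be saved as a .dot file and rendered with the Graphviz command-line tools, for example `dot -Tpng states.dot -o states.png`.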

After prompting AI_2 to generate a state transition bubble map, the tool produced an interactive HTML-based visualization using a force-directed layout.

AI_3 generated state-machine diagrams and transition matrices, also exportable in DOT format. Additionally, transition heatmaps were produced to illustrate high-frequency state pathways.

For Task 4 (short CAN trace), the AI_3 tool needed additional clarification before generating a bubble-based state transition map with human-readable labels (Figure 14). The tool also created a Graphviz [40] DOT description file, which can be rendered in development toolchains and continuous integration systems.

AI_4 attempted transition modeling using Mermaid graph syntax [41]; however, rendering errors prevented successful visualization of the generated models. This highlights variability in visualization robustness across platforms.

Collectively, the results demonstrate that AI/LLM tools are capable of extracting state transition relationships from automotive communication traces. Such outputs can serve as structural blueprints for state-aware fuzz testing, enabling controlled traversal of operational modes during vulnerability discovery campaigns.

3.6. Task 5—Message/Signal Decoding and Visualization

The task of visualizing signal behavior over time was successfully completed by all AI/LLM tools. Figure 15 presents the generated plots, which follow a similar approach across the evaluated AI/LLM tools, indicating that, once the CAN files are correctly parsed, interpreting and visualizing the dataset is a straightforward task.

The data characteristics reported by AI_4 show a good understanding of signal behavior, information that can help the tester prepare test cases: “The plotted data shows significant fluctuation in the temperature values (ranging approximately between 10 °C and 35 °C) within the 10-s window. This suggests the signal in this simulation trace might be simulated with random noise or a specific test pattern, rather than representing a stable physical ambient temperature” [36].

3.7. Task 6—Analyze the Quality of CAN Transmission

The cycle time of a CAN message is the interval at which that message is transmitted repeatedly on the CAN bus. A short cycle time means faster system response, whereas a long cycle time means slower updates but lower bus load. If the cycle time is not respected and the next CAN frame is missing for longer than 3 × cycle time, a timeout is raised for the respective message. The cycle time for safety-critical CAN functions is typically very short and tightly controlled because these messages are directly related to vehicle control, human safety, or legal compliance (airbag/restraint pre-trigger monitoring: 1–5 ms; brake, wheel speed, ABS, steering angle, stability control: 5–10 ms).
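The timeout rule described above can be checked on a recorded trace with a sketch such as the following. The factor of 3 follows the text; the function name and the sample timestamps are hypothetical.

```python
def find_timeouts(timestamps_s, nominal_cycle_s, factor=3.0):
    """Return (start_time_s, gap_s) for every gap longer than factor * nominal cycle."""
    limit = factor * nominal_cycle_s
    return [
        (previous, current - previous)
        for previous, current in zip(timestamps_s, timestamps_s[1:])
        if current - previous > limit
    ]


# Hypothetical 100 ms message with three missing frames after t = 0.2 s.
timeouts = find_timeouts([0.0, 0.1, 0.2, 0.6, 0.7], nominal_cycle_s=0.1)
print(timeouts)
```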

For Task 6, all tools produced the requested cycle-time statistics for the VehicleState message (see Table 7), and a histogram of cycle time values from the analyzed CAN trace (only the short CAN trace) was also generated.

For histogram creation, AI_1 chose a larger time sample for the plot, so its graph differs from those of the other tools: the message counts per time sample exceed 400 for AI_1 but stay below 350 for AI_2, AI_3, and AI_4 (see Figure 16).

The AI_2 tool (Claude AI) additionally plotted the jitter (variation) in the actual cycle time, which gives deeper insight into CAN stack stability.
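If the "time sample" is read as the histogram bin width (an assumption, since the tools' plotting code is not reported), the effect on the per-bin counts can be reproduced with a small sketch. The function name and the sample cycle times are hypothetical.

```python
import math
from collections import Counter


def histogram(values_ms, bin_width_ms):
    """Count values per bin; keys are the lower bin edges in ms."""
    counts = Counter(math.floor(value / bin_width_ms) for value in values_ms)
    return {round(index * bin_width_ms, 6): n for index, n in sorted(counts.items())}


# Hypothetical cycle times in ms.
cycle_times = [99.1, 99.6, 100.1, 100.4, 100.9]
fine = histogram(cycle_times, bin_width_ms=0.5)
coarse = histogram(cycle_times, bin_width_ms=1.0)
print(fine)
print(coarse)
```

The same five samples produce a lower peak with the narrower bins and a higher peak with the wider bins, so peak heights from different tools are only comparable when the bin width is the same.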

3.8. Real CAN Log Analysis

We also ran the experiment tasks on a real CAN log (duration 8 min 59 s) taken from a Tesla Model 3 [32]. The log is 244 MB in size and contains one CAN-FD channel; the messages use 11-bit CAN identifiers. The .dbc file contains four network nodes and 159 messages with 2752 signals. In this case, the definition of a system state comprises the signals “DI_trackModeState”, “DI_systemState”, “VCLEFT_frontDoorState”, and “VCLEFT_rearDoorState”.

Our deterministic baseline parser identified 12 unique system states, 16 state-change events, and 15 state transitions. For CAN message analysis, the message ID 118DriveSystemStatus (CAN ID 0x118) was chosen. The comparison of the system state analysis and cycle time results between the baseline script and the AI tool (AI_2) is presented in Tables 8 and 9.

The message with CAN ID 0x118 follows a nominal 10 ms cycle (100 Hz), with the distribution forming a clean bimodal bell around 9 ms and 11 ms (see Figure 17)—this is a common artifact of timer jitter when a 10 ms ECU task is sampled by the logger at a slightly different rate, causing frames to alternate between arriving a touch early and a touch late.

The single maximum of 38,226 ms is not a real cycle time anomaly—it corresponds to the ~38 s bus-silent gap between the two recording segments visible in the timeline. Excluding that outlier, the true worst-case gap between frames is well under 20 ms.

The P1–P99 (latency) window of 7.8–12.2 ms captures the central 98% of all frames, confirming that the message is extremely regular with no meaningful jitter in normal operation.
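A percentile window of this kind, with recording gaps excluded, can be computed as sketched below. By definition, the interval between the 1st and 99th percentiles spans roughly the central 98% of the samples. The threshold `max_valid_gap_ms` is a hypothetical parameter for separating recording gaps from real cycle times, and the sample data are synthetic.

```python
import statistics


def percentile_window(gaps_ms, max_valid_gap_ms):
    """Return (P1, P99, number_of_excluded_gaps) after dropping recording gaps."""
    valid = [gap for gap in gaps_ms if gap <= max_valid_gap_ms]
    # quantiles(..., n=100) returns the 99 cut points P1 .. P99.
    cuts = statistics.quantiles(valid, n=100, method="inclusive")
    return cuts[0], cuts[98], len(gaps_ms) - len(valid)


# Synthetic gaps: alternating 9 ms and 11 ms plus one long recording gap.
gaps = [9.0, 11.0] * 500 + [38226.0]
p1, p99, excluded = percentile_window(gaps, max_valid_gap_ms=1000.0)
print(p1, p99, excluded)
```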

4. Discussion

The experimental results indicate that AI-assisted analysis of CAN traces can effectively support the preparation phase of structured automotive fuzz testing. Across all evaluated tasks, the investigated AI/LLM tools demonstrated the ability to parse CAN traces, interpret DBC definitions, generate analytical artifacts relevant to state-aware testing workflows, and analyze the quality of CAN communication (see Table 10).

From an execution perspective, it was observed that repeated submission of similar analytical tasks across multiple datasets led to reduced response latency and more structured outputs. This behavior suggests that AI/LLM tools may internally reuse previously generated parsing logic or analytical templates when processing structurally similar data. Although this phenomenon was not formally measured, it indicates potential efficiency gains in iterative cybersecurity testing scenarios involving multiple CAN recordings.

Task 1—State Identification. All evaluated AI/LLM tools successfully identified operational system states within the long-duration dataset, converging on a total of 24 unique states. For the short-duration dataset, three platforms identified 16 states, while one platform identified 17. The discrepancy was traced to the interpretation of initialization values associated with the ClimateState signal prior to its first valid transmission.

Because the ClimateState message operates at a longer transmission cycle (1000 ms) compared to other signals (100 ms), its initial undefined value was interpreted as a distinct operational state by one tool. This behavior reflects a modeling interpretation difference rather than a parsing limitation and highlights the sensitivity of AI-assisted state extraction to signal initialization semantics.

Fuzz testing practitioners should not blindly trust the results of AI tools and should not fuzz only the system states discovered in the preparation phase. To ensure the completeness of a fuzz testing session, a preliminary “smoke test” phase, encompassing broad fuzzing of all available inputs over a defined time interval or a predetermined number of test cases, must not be omitted from the testing workflow.

During DBC parsing, the AI/LLM tools also detected structural encoding nuances, including extended-frame identifier representations. For example, the VehicleState message utilized a 29-bit identifier stored according to Vector DBC conventions. The ability of AI/LLM tools to interpret such protocol-level metadata demonstrates analytical capabilities beyond simple signal extraction.
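The convention referred to here is that a DBC file marks an extended (29-bit) identifier by setting bit 31 of the stored message ID. The sketch below converts between the stored value and the identifier seen on the bus; the function and constant names are hypothetical.

```python
DBC_EXTENDED_FLAG = 0x80000000  # bit 31 marks an extended frame in the DBC file
CAN_EXTENDED_MASK = 0x1FFFFFFF  # 29-bit identifier
CAN_STANDARD_MASK = 0x7FF       # 11-bit identifier


def dbc_id_to_frame_id(dbc_id):
    """Return (frame_id, is_extended) for a message ID as stored in a DBC file."""
    if dbc_id & DBC_EXTENDED_FLAG:
        return dbc_id & CAN_EXTENDED_MASK, True
    return dbc_id & CAN_STANDARD_MASK, False


def frame_id_to_dbc_id(frame_id, is_extended):
    """Return the ID value as it would be stored in a DBC file."""
    return (frame_id | DBC_EXTENDED_FLAG) if is_extended else frame_id


# VehicleState uses the 29-bit identifier 0xC00FF31.
stored = frame_id_to_dbc_id(0xC00FF31, True)
print(hex(stored), dbc_id_to_frame_id(stored))
```

A parser that ignores this flag would look for the stored value on the bus and never match the VehicleState frames, which is why handling it correctly matters for state extraction.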

Additionally, one platform proposed higher-level state aggregation by grouping ignition status (clamp15) and engine activity (MotorState), offering an abstraction layer useful for functional interpretation of vehicle operational modes (Figure 18).
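Such an aggregation amounts to projecting each five-signal state onto the two grouping signals and summing the dwell times. The following sketch illustrates the idea; the function name, the value encoding (1/0), and the dwell times are hypothetical and do not reproduce the platform's output.

```python
from collections import defaultdict

SIGNALS = ("clamp15", "MotorState", "HeatingState",
           "DoorState_driver", "DoorState_codriver")


def aggregate_dwell(dwell_by_state, signals=SIGNALS,
                    group_by=("clamp15", "MotorState")):
    """Sum dwell times of all states that share the same group_by signal values."""
    positions = [signals.index(name) for name in group_by]
    grouped = defaultdict(float)
    for state, seconds in dwell_by_state.items():
        key = tuple(state[i] for i in positions)
        grouped[key] += seconds
    return dict(grouped)


# Hypothetical dwell times (seconds) keyed by five-signal state tuples.
grouped = aggregate_dwell({
    (1, 1, 1, 0, 0): 30.0,
    (1, 1, 0, 0, 0): 20.0,
    (0, 0, 0, 0, 0): 50.0,
})
print(grouped)
```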

Task 2—Dwell-Time Analysis. Dwell-time computation results revealed differences in numerical precision and analytical depth across platforms. While all tools generated temporal distributions of system states, minor inaccuracies were identified in absolute dwell-time values produced by one platform. However, proportional dwell-share calculations remained consistent, indicating that visualization layers relied on normalized temporal distributions rather than raw duration values.

More advanced analytical summaries were generated by other platforms, including identification of dominant operational states, ignition-state distributions, motor-activity breakdowns, and transient-state detection. These synthesized insights provide practical value for testers by highlighting high-exposure operational contexts and supporting prioritization of fuzz testing injection scenarios.

Such temporal behavioral profiling is particularly relevant for structured fuzz testing, where the selection of realistic system states directly influences vulnerability discovery efficiency.

Task 3—Visualization Capabilities. All the evaluated platforms successfully generated graphical representations of dwell-time distributions. However, visualization sophistication varied considerably.

Static bar charts provided baseline interpretability of operational exposure, while interactive dashboards enabled dynamic exploration of system behavior through tooltips, sortable state hierarchies, and embedded statistical summaries. Advanced visualization approaches, such as Pareto analysis, further enhanced interpretability by highlighting the disproportionate contribution of dominant operational states to total system runtime.

These visualization capabilities significantly reduce manual interpretation effort and provide decision-support artifacts for planning structured fuzz testing campaigns. By identifying dominant and rare system modes, testers can strategically prioritize mutation injection points and optimize test resource allocation.

Task 4—State Transition Modeling. State transition extraction revealed the most significant variability across AI/LLM tools. While all tools attempted to infer transition relationships from CAN traces, the structure and usability of generated artifacts differed substantially.

Export capabilities, such as Graphviz DOT descriptions [40], further increased the practical applicability of the generated models by enabling integration into automated testing toolchains and continuous integration environments. These artifacts allow testers to reuse AI-derived behavioral models within structured fuzzing frameworks.

Prior research has demonstrated that incorporating state transition modeling into protocol fuzz testing frameworks can significantly improve testing coverage and vulnerability discovery rates [42]. The AI-assisted transition extraction demonstrated in this study aligns with these findings by enabling automated derivation of system behavioral models directly from recorded CAN traffic.

Task 5—Message/Signal Decoding and Visualization. Parsing the .ASC trace files, decoding the signals, and plotting signal behavior over different time windows appear to be trivial tasks for the tested AI/LLM tools. The tools differed only slightly, namely in how they interpreted the analyzed signal. AI_3 output only the signal plot without further comment. AI_1, AI_2, and AI_4 added a key finding or a signal analysis, suggesting that the signal does not represent a stable physical ambient temperature or that it is “corrupted” with random noise.

Task 6—Analysis of CAN communication quality. All AI/LLM tools produced good results when calculating the cycle time of a CAN message. It is worth mentioning that the analyzed CAN trace was “clean”, without ErrorFrames and with only one CAN channel. The behavior of the message cycle time provides information about the stability of the ECU that sends the CAN messages and about the stability of communication, for example whether the bus load is high or whether two senders are trying to transmit messages with the same CAN ID. For ECUs that are relevant for functional safety, this kind of measurement and analysis over a long period of time and under “stressed” conditions is mandatory (ISO 26262:2018 [37]). For CAN messages that are gated from one channel to another, AI support could readily analyze the delay that the gateway introduces in message transmission or how the gateway affects the cycle time of the message.

5. Conclusions and Future Work

This study investigated the use of AI-assisted analysis of Controller Area Network traces to support the test preparation phase of structured automotive fuzz testing. The results demonstrate that LLM-based tools are capable of extracting operational system states, computing dwell-time distributions, and deriving state transition relationships directly from CAN traces and DBC descriptions.

Across all evaluated platforms, consistent operational state sets were identified within extended CAN recordings, confirming that AI-assisted parsing scales effectively to long-duration datasets. Minor discrepancies observed in short traces were attributable to differences in the interpretation of initialization conditions rather than to fundamental analytical limitations. In addition, the ability of AI/LLM tools to detect structural inconsistencies within DBC files highlights their potential role as auxiliary verification instruments within automotive testing workflows.

From a fuzz testing perspective, the most significant contribution of this work lies in the automated identification of stable operational states and state transition pathways without requiring programming knowledge from the tester. These artifacts support structured fuzz testing preparation by enabling initialization from protocol-consistent system contexts, reducing frame rejection caused by invalid preconditions, and facilitating systematic exploration of operational modes.

Consequently, AI/LLM tools can function as state discovery and behavioral modeling engines that enhance the effectiveness of mutation-based and model-based fuzz testing approaches, particularly within gray-box automotive cybersecurity testing environments.

Despite these promising findings, several limitations were identified. Quantitative outputs—such as dwell-time computations—may contain minor numerical inaccuracies and therefore require validation prior to safety-critical application. Variability was also observed in file-format handling robustness and visualization generation capabilities across platforms. For these reasons, current AI/LLM tools should be regarded as decision-support tools that augment, rather than replace, established engineering-grade analysis environments.

Future research should expand both the technical scope and industrial applicability of AI-assisted CAN trace analysis.

A primary research direction involves evaluation using large-scale automotive datasets, including gigabyte-scale vehicle recordings collected under real driving conditions. Such experimentation would allow for the assessment of scalability, processing latency, and analytical consistency in production-level cybersecurity validation workflows.

Further work should also explore support for additional automotive data formats, such as Binary Logging Format (BLF) and Measurement Data Format version 4 (MF4), which are widely used in OEM testing and validation pipelines. Extending AI-assisted parsing capabilities to these formats would significantly broaden practical deployment potential.

Another promising avenue involves hybrid analytical workflows in which AI/LLM tools generate executable data-processing scripts—such as Python-based parsing or state-mining routines—that can be executed offline within controlled engineering environments. This approach could combine the interpretive flexibility of AI/LLM tools with the computational rigor of deterministic analysis pipelines.

Finally, quantitative benchmarking studies are needed to compare AI-assisted state discovery against traditional signal-engineering methodologies. Metrics such as state-detection accuracy, transition coverage, preprocessing time, and fuzzing efficiency gains would provide objective validation of the added value introduced by AI-driven preparatory analysis.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/automation7030083/s1; articleFuzz.dbc; Trace_long.asc; Trace_short.asc.

Author Contributions

Conceptualization, C.V.K. and A.P.; methodology, A.P.; software, A.P. and C.V.L.; validation, C.V.K., A.P. and C.V.L.; formal analysis, A.P.; investigation, A.P.; resources, A.P.; data curation, A.P. and C.V.L.; writing—original draft preparation, A.P.; writing—review and editing, C.V.K., A.P. and C.V.L.; visualization, A.P.; supervision, C.V.K.; project administration, C.V.K. and A.P.; funding acquisition, C.V.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors thank Vector Informatik GmbH for providing the CANalyzer license.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AI	Artificial Intelligence
BLF	Binary Logging Format
CAN	Controller Area Network
DBC	Database CAN
ECU	Electronic Control Unit
FAMGAN	Field-associative Mutation Generative Adversarial Network
HIL	Hardware In the Loop
HTML	Hypertext Markup Language
LLM	Large Language Model
MF4	Measurement Data Format version 4
OEM	Original Equipment Manufacturer
SUT	System Under Test

References

SAE International–Vehicle Cybersecurity Systems Engineering Committee. SAE J3061—Cybersecurity Guidebook for Cyber-Physical Vehicle Systems; Society of Automotive Engineers, SAE International: Warrendale, PA, USA, 2016.
ISO/SAE 21434:2021; Road Vehicles—Cybersecurity Engineering. ISO: Geneva, Switzerland, 2021. Available online: https://www.iso.org/standard/70918.html (accessed on 2 February 2026).
UN-ECE R155—Cyber Security and Cyber Security Management System. Available online: https://eur-lex.europa.eu/legal-content/DE/TXT/?uri=OJ:L:2021:082:TOC (accessed on 2 February 2026).
ISTQB International Software Testing Qualification Board. Available online: https://www.istqb.org (accessed on 3 February 2026).
Vikram, V.; Laybourn, I.; Li, A.; Nair, N.; OBrien, K.; Sanna, R.; Padhye, R. Guiding Greybox Fuzzing with Mutation Testing. In Proceedings of the ISSTA 2023: Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis; ACM: New York, NY, USA, 2023; Volume 2023, pp. 929–941.
Moukahal, L.J.; Zulkernine, M.; Soukup, M. Boosting Grey-Box Fuzzing for Connected Autonomous Vehicle Systems. In Proceedings of the 2021 IEEE 21st International Conference on Software Quality, Reliability and Security Companion (QRS-C); IEEE: New York, NY, USA, 2021; pp. 516–527.
Oka, D.K. Improving Fuzz Testing Coverage by Using Agent Instrumentation. In Building Secure Cars: Assuring the Automotive Software Development Lifecycle; Wiley: Hoboken, NJ, USA, 2021; pp. 179–209.
Kifor, C.; Popescu, A. Automotive Cybersecurity: A Survey on Frameworks, Standards, and Testing and Monitoring Technologies. Sensors 2024, 24, 6139.
Oka, D.K. Building Secure Cars: Assuring the Automotive Software Development Lifecycle; Wiley: Hoboken, NJ, USA, 2021.
Santos, T.; Grümer, P.; Parsamehr, R.; Pacheco, H. OCANada: A Generation-Based Fuzzer for ECUs over CAN. In Proceedings of the 2025 IEEE Vehicular Networking Conference (VNC), Porto, Portugal, 2–4 June 2025.
Jiang, Y.; Li, Z.; Duan, B.; Feng, T. The Seed Optimization Method for Fuzz Testing Based on Neural Network-Guided Genetic Algorithm. Computers 2026, 15, 170.
Im, S.; Jung, Y.; Park, J. Algorithmic Optimization for Accelerated UDS Fuzzing in Cyber–Physical Automotive Networks: The BB-FAST Approach on LIN-Bus. Electronics 2026, 15, 1223.
Shen, Z.; Li, X.; Xie, H. GRLFuzz: A Fuzz Testing Method for Optimizing Mutation Strategies Based on Reinforcement Learning. In Intelligent Vehicles; Li, H., Wang, Z., Zhao, S., Sun, P., Herrmann, M., Zheng, X., Liu, Y., Eds.; Springer Nature: Singapore, 2026; pp. 286–300.
Kim, H.; Jeong, Y.; Choi, W.; Lee, D.H.; Jo, H.J. Efficient ECU Analysis Technology Through Structure-Aware CAN Fuzzing. IEEE Access 2022, 10, 23259–23271.
Yang, H.; Zhou, Y.; Chen, T. SimADFuzz: Simulation-Feedback Fuzz Testing for Autonomous Driving Systems. ACM Trans. Softw. Eng. Methodol. 2025, 35, 1–32.
Jin, Q.; Wu, T.; Dong, Y.; Ding, Z.; Xu, Y. ReinSeed: Reinforcement Fuzz Testing with Multiphase Seed Optimization for Autonomous Driving Systems. IET Softw. 2025, 2025, 8657455.
Huang, C.; Wang, D.; Zhou, Z.Q. LLM-Assisted Model-Based Fuzzing of Protocol Implementations. arXiv 2025, arXiv:2508.01750.
Vector Informatik GmbH. Vector CANalyzer. Available online: https://www.vector.com/de/de/produkte/produkte-a-z/software/canalyzer/ (accessed on 2 February 2026).
Vector Informatik GmbH VSignalyzer. Available online: https://www.vector.com/int/en/products/products-a-z/software/vsignalyzer/ (accessed on 2 February 2026).
The MathWorks, Inc. MATLAB CAN Explorer—Acquire and Visualize CAN Data. Available online: https://in.mathworks.com/help/vnt/ug/canexplorer-app.html (accessed on 4 February 2026).
Robert Bosch Engineering and Business Solutions (RBEI) BUSMASTER. Available online: https://github.com/rbei-etas/busmaster (accessed on 4 February 2026).
Kayas, G.; Pelletier, Z.; Gordon, D.; Arntson, T.; Payton, J. AI-Assisted Vulnerability Analysis And Classification Framework for UDS on CAN-Bus Fuzzer. In Proceedings of the 10th escar USA—The World’s Leading Automotive Cyber Security Conference; isits AG: Bochum, Germany, 2023; p. 11.
Meng, W.B.; Liu, Y.; Zaiter, F.; Zhang, S.L.; Chen, Y.H.; Zhang, Y.Z.; Zhu, Y.C.; Wang, E.; Zhang, R.Z.; Tao, S.M.; et al. LogParse: Making Log Parsing Adaptive through Word Classification. In Proceedings of the 2020 29th International Conference on Computer Communications and Networks (ICCCN), Honolulu, HI, USA, 3–6 August 2020.
Li, Z.Y.; Song, J.; Zhang, T.Y.; Yang, T.; Ou, X.J.; Ye, Y.J.; Duan, P.F.; Lin, M.C.; Chen, J.J. Adaptive and Efficient Log Parsing as a Cloud Service. In Proceedings of the SIGMOD/PODS ′25: Companion of the 2025 International Conference on Management of Data; ACM: New York, NY, USA, 2025; Volume 2025, pp. 512–524.
Hashemi, S.; Mäntylä, M. Token Interdependency Parsing (Tipping)—Fast and Accurate Log Parsing. arXiv 2024, arXiv:2408.00645.
Zhang, H.; Zhou, Y.; Xu, H.H.; Shi, J.A.; Lin, X.H.; Gao, Y.Q. Anomaly Detection in Virtual Machine Logs against Irrelevant Attribute Interference. PLoS ONE 2025, 20, e0315897.
Le, V.H.; Zhang, H.Y. Log Parsing: How Far Can ChatGPT Go? In Proceedings of the 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE); IEEE: New York, NY, USA, 2023; pp. 1699–1704.
Lisaru, C.; Kifor, C. Acta Technica Napocensis—Chatgpt vs Github Copilot—A Requirement-Based Test Cases Generation Capabilities Evaluation. ACTA Tech. Napoc.—Appl. Math. Mech. Eng. 2025, 68, 927–932.
Li, Z.; Jiang, W.; Liu, X.; Tan, K.; Jin, X.; Yang, M. GAN Model Using Field Fuzz Mutation for In-Vehicle CAN Bus Intrusion Detection. Math. Biosci. Eng. 2022, 19, 6996–7018.
Popescu, A.; Kifor, C.V. Application of Fuzz Testing for Functional Validation in Automotive. ACTA Tech. Napoc.—Appl. Math. Mech. Eng. 2025, 68, 943–950.
Vector Informatik GmbH. CANdb++. Available online: https://www.vector.com/int/en/products/products-a-z/software/candb/ (accessed on 2 February 2026).
CSS Electronics. Electric Vehicle “Data Pack” [CAN/UDS Log Files & DBC Files]. Available online: https://www.csselectronics.com/pages/ev-data-pack-electric-vehicles (accessed on 3 February 2026).
Microsoft. Microsoft Copilot. Available online: https://copilot.microsoft.com/ (accessed on 2 February 2026).
Anthropic. Claude AI. Available online: https://claude.ai (accessed on 2 February 2026).
OpenAI. ChatGPT. Available online: https://chatgpt.com/ (accessed on 2 February 2026).
Google. Google Gemini. Available online: https://gemini.google.com/app?hl=en (accessed on 2 February 2026).
ISO 26262-1:2018; Road Vehicles—Functional Safety. ISO: Geneva, Switzerland, 2018. Available online: https://www.iso.org/standard/68383.html (accessed on 2 February 2026).
ISO 21448:2022; Road Vehicles—Safety of the Intended Functionality. ISO: Geneva, Switzerland, 2022. Available online: https://www.iso.org/standard/77490.html (accessed on 2 February 2026).
vis.js Community. vis.js. Available online: https://visjs.org/ (accessed on 2 February 2026).
The Graphviz Authors. Graphviz. Available online: https://graphviz.org/ (accessed on 2 February 2026).
Mermaid. Mermaid. Available online: https://mermaid.js.org/ (accessed on 2 February 2026).
Feng, X.; Tan, W.; Qiu, T.; Yu, W.; Zhang, Z.; Cui, B. A PFCP Protocol Fuzz Testing Framework Integrating Data Mutation Strategies and State Transition Algorithms. In Innovative Mobile and Internet Services in Ubiquitous Computing; Springer: Berlin/Heidelberg, Germany, 2024; pp. 272–279.

Figure 1. Structure-aware CAN Fuzzing protocol (adapted from [14]).

Figure 2. Fuzz testing implementation in a system test HIL (AI_1 support).

Figure 4. Short CAN trace—signal behavior (visualized with the Vector CANalyzer tool).

Figure 5. Long CAN trace—signal behavior (visualized with the Vector CANalyzer tool).

Figure 6. Dwell-time share per SystemState for long CAN trace—AI_1 tool.

Figure 7. Top 10 most stable system states observed in long CAN trace—AI_2 tool.

Figure 8. Bar chart of dwell share of short CAN trace—AI_1 tool.

Figure 9. Bar chart of dwell share of short CAN trace—AI_2 tool.

Figure 10. Bar chart of dwell share of long CAN trace—AI_4 tool.

Figure 11. Task 4 for short CAN trace with AI_1 tool (state transition diagram).

Figure 12. (A) Task 4 for long CAN trace with AI_1 tool. (B) System states grouped by “HeatingState” signal (inactive—blue; active—green; sleep—yellow; error—red).

Figure 13. System state transition map created with AI_2 tool (for the long CAN trace).

Figure 14. Task 4 for short CAN trace with AI_3 tool (system state machine diagram).

Figure 15. Task 5 plot of AmbientAirTemperature signal from all AI tools.

Figure 16. Cycle time histogram of VehicleState CAN message.

Figure 17. Cycle time histogram of CAN message with ID 0x118.

Figure 18. System state timeline (clamp15, MotorState).

Table 1. CAN traces used in research.

Dataset	Duration	File Size	Reference
Short trace (Trace_short.asc)	4 min 15 s	342 KB	Figure 4
Long trace (Trace_long.asc)	~32 min	4.8 MB	Figure 5
Vehicle trace (00000001_CAN.asc)	8 min 52 s	244 MB	-

Table 2. AI/LLM tools used in the study.

Index	Tool Name	Version	Account Type
AI_1	Microsoft Copilot [33]	core model version GPT 5 released in June 2024	Copilot Business
AI_2	Claude AI [34]	Claude Sonnet 4.5—specific model is “claude-sonnet-4-5-20250929”	Claude Pro
AI_3	ChatGPT [35]	GPT-5.2	ChatGPT Plus
AI_4	Google Gemini [36]	Gemini 3 Flash	Free—Pro Mode

Table 3. AI analysis tasks.

Task Name	Task Description
Task 1	The attached file “Trace_short/long.asc” is a CAN trace and the file “articleFuzz.dbc” is a .dbc file. Search and display the unique system states observed in CAN trace taking into consideration the system signals “clamp15”, “MotorState”, “HeatingState”, “DoorState_driver” and “DoorState_codriver”.
Task 2	Compute dwell-time share of system states discovered earlier.
Task 3	Show a bar chart of dwell share.
Task 4	Create a bubble-based (optional) state transition map.
Task 5	Make a graph with signal “ambientAirTemperature” from CAN message VehicleState for a duration of 10 s, starting with timestamp 60 s.
Task 6	Calculate from CAN trace the Cycle time of CAN message VehicleState (CAN ID 0xC00FF31). Display the minimum, maximum and average cycle time.
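
As an illustration of what Tasks 1, 2 and 4 ask for, the following Python sketch derives unique system states, dwell-time shares, and transition counts from already decoded signal samples. It is a minimal sketch and not the baseline parser used in this study: the function name analyze_states, the sample layout (a timestamp in seconds plus a tuple of the five signals), and the rule that a state lasts until the next change is observed are assumptions made for illustration. Decoding the .asc trace with the .dbc file is assumed to have been done beforehand.

```python
from collections import Counter


def analyze_states(samples):
    """Summarize system states from decoded samples.

    samples: list of (timestamp_s, state_tuple), sorted by timestamp.
    The state tuple layout is an assumption for this sketch:
    (clamp15, MotorState, HeatingState, DoorState_driver, DoorState_codriver).
    Returns (unique_states, dwell_share, transitions).
    """
    if not samples:
        return [], {}, Counter()

    # Collapse consecutive identical states into [state, start, end] segments.
    segments = []
    for t, state in samples:
        if segments and segments[-1][0] == state:
            segments[-1][2] = t
        else:
            if segments:
                # Assumption: the previous state lasts until the change is seen.
                segments[-1][2] = t
            segments.append([state, t, t])

    # Task 2: accumulate dwell time per state and normalize to a share.
    dwell = Counter()
    for state, start, end in segments:
        dwell[state] += end - start
    total = sum(dwell.values())
    dwell_share = {s: (d / total if total else 0.0) for s, d in dwell.items()}

    # Task 4: count transitions between consecutive segments.
    transitions = Counter((a[0], b[0]) for a, b in zip(segments, segments[1:]))

    # Task 1: unique states in order of first appearance.
    unique_states = list(dict.fromkeys(seg[0] for seg in segments))
    return unique_states, dwell_share, transitions


if __name__ == "__main__":
    # Hypothetical samples, not taken from the traces used in the study.
    parked = ("OFF", "Not Running", 0, "Closed", "Closed")
    door_open = ("OFF", "Not Running", 0, "Open", "Closed")
    samples = [(0.0, parked), (1.0, parked), (2.0, door_open), (3.0, door_open)]
    states, share, trans = analyze_states(samples)
    print(len(states), share, dict(trans))
```

The transition counter can be passed to a graph tool such as Graphviz or Mermaid to draw the state transition map, with each key being one directed edge and each value the number of times that edge was taken.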

Table 4. Deterministic baseline results for the analyzed CAN traces.

Metric	Short Trace	Long Trace
Unique states	16	24
State-change events	19	49
Total transitions	18	48
Unique transition paths	18	38
VehicleState min cycle time (ms)	98.347	98.528
VehicleState max cycle time (ms)	101.974	101.649
VehicleState avg cycle time (ms)	100.0003	99.999999
VehicleState std. dev. (ms)	0.4696	0.4311
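
The cycle-time rows of Table 4 correspond to the computation requested in Task 6: the differences between consecutive receive timestamps of one CAN message, summarized by minimum, maximum, average, and standard deviation. The Python sketch below shows one way to compute these values. It is an illustration under stated assumptions and not the baseline implementation of this study: the function name cycle_time_stats is hypothetical, the timestamps are assumed to be already filtered to a single CAN ID and given in seconds, and the population standard deviation is used because the source does not state which variant was applied.

```python
import statistics


def cycle_time_stats(timestamps_s):
    """Return cycle-time statistics in milliseconds for one CAN message.

    timestamps_s: receive timestamps in seconds, sorted ascending and
    already filtered to a single CAN ID (assumption of this sketch).
    """
    if len(timestamps_s) < 2:
        raise ValueError("at least two frames are needed to compute a cycle time")

    # Cycle time is the gap between consecutive frames, converted to ms.
    deltas_ms = [
        (later - earlier) * 1000.0
        for earlier, later in zip(timestamps_s, timestamps_s[1:])
    ]
    return {
        "frames": len(timestamps_s),
        "min_ms": min(deltas_ms),
        "max_ms": max(deltas_ms),
        "avg_ms": statistics.fmean(deltas_ms),
        # Population standard deviation; swap in statistics.stdev for the
        # sample variant if that matches the reference tool.
        "std_ms": statistics.pstdev(deltas_ms),
    }


if __name__ == "__main__":
    # Hypothetical timestamps for a nominal 100 ms message.
    print(cycle_time_stats([0.000, 0.098, 0.200, 0.300]))
```

Note that n frames yield n - 1 cycle-time values, so the number of frames reported and the number of intervals that enter the statistics differ by one.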

Table 5. Unique system states of short CAN trace—AI_4 tool.

State Nr.	StartTime (s)	Clamp15	MotorState	Heating State	Driver Door	Co-Driver Door	Context/Description
1	0.00	OFF	Not Running	0	Closed	Closed	Initial state: Vehicle parked and off.
2	15.51	OFF	Not Running	0	Open	Closed	Driver opens door.
3	21.81	ON	Not Running	0	Open	Closed	Ignition turned ON (Clamp 15).
4	26.51	ON	Not Running	0	Open	Open	Co-driver opens door.
5	31.71	ON	Starting	0	Open	Open	Motor starts cranking.
6	35.61	ON	Running	0	Open	Open	Motor is running.
7	40.11	ON	Running	0	Open	Closed	Co-driver door closes.
8	44.01	ON	Running	0	Closed	Closed	Driver door closes; car ready to drive.
9	61.45	ON	Running	1	Closed	Closed	Climate control heating activates.
10	194.45	ON	Running	2	Closed	Closed	Heating goes to sleep mode.
11	228.91	ON	Not Running	2	Closed	Closed	Motor stops.
12	235.01	ON	Not Running	2	Closed	Open	Co-driver opens door to exit.
13	240.01	OFF	Not Running	2	Open	Open	Ignition OFF; Driver opens door.
14	248.45	OFF	Not Running	0	Open	Open	Heating system shuts down completely.
15	253.01	OFF	Not Running	0	Open	Closed	Co-driver door closes.
16	255.51	OFF	Not Running	0	Closed	Closed	Driver door closes (Return to initial state).

Table 6. Calculated dwell time (from short CAN trace) for every identified SystemState with AI_1 tool.

Clamp 15	Motor State	Heating State	Door State Driver	Door State Codriver	First Time (s)	Last Time (s)	Occurrences	Dwell Time (s)
OFF	0	Inactive	Closed	Closed	3.512	259.012	158	255.5
OFF	0	Inactive	Open	Closed	15.512	255.212	90	239.7
OFF	0	Inactive	Open	Open	246.512	252.512	61	6.0
OFF	0	Sleep	Open	Open	239.412	246.412	71	7.0
ON	0	Inactive	Open	Closed	21.812	26.412	47	4.6
ON	0	Inactive	Open	Open	26.512	31.612	52	5.1
ON	0	Sleep	Closed	Closed	228.913	232.212	34	3.29
ON	0	Sleep	Closed	Open	232.312	235.413	32	3.1
ON	0	Sleep	Open	Open	235.512	239.312	39	3.8
ON	1	Active	Closed	Closed	61.512	194.412	1207	132.9
ON	1	Active	Closed	Error	92.212	104.412	123	12.2
ON	1	Inactive	Closed	Closed	44.012	61.412	175	17.4
ON	1	Inactive	Open	Closed	40.112	43.912	39	3.8
ON	1	Inactive	Open	Open	35.613	40.012	45	4.49
ON	1	Sleep	Closed	Closed	194.512	228.812	344	34.3
ON	1	Inactive	Open	Open	31.713	35.512	39	3.79

Table 7. Cycle time values taken from the short CAN trace.

AI_Tool	Minimum (ms)	Maximum (ms)	Average (ms)	Standard Deviation (ms)	Messages Analyzed
AI_1	98.347	101.974	100	0.470	2576
AI_2	98.347	101.974	100	0.470	2576
AI_3	98.347	101.974	100	0.4696	2576
AI_4	98.347	101.974	100	0.470	2576

Table 8. Cycle time analysis for a production CAN log.

Tool	Frames Identified	Minimum Cycle Time (ms)	Maximum Cycle Time (ms)	Average Cycle Time (ms)	Standard Deviation Cycle Time (ms)
Baseline	50,097	6.100	38.225	10.762	170.746
AI tool	50,097	6.100	38.226	10.76	170.7

Table 9. System state analysis for a production CAN log.

Tool	Unique System States	System Change Events	Unique State Transitions	Highest Dwell Time (ms)	2nd Dwell Time (ms)
Baseline	12	16	15	471.903	32.869
AI tool	12	16	15	471.904	32.869

Table 10. Evaluation of task completion (%) for proposed tasks, per AI tool.

Task	AI_1	AI_2	AI_3	AI_4
Task 1	100%	100%	100%	100%
Task 2	100%	100%	100%	100%
Task 3	100%	100%	100%	100%
Task 4	100%	100%	100%	100%
Task 5	100%	100%	100%	100%
Task 6	100%	100%	100%	100%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
