Article

AI-Driven Automated Test Generation Framework for VCU: A Multidimensional Coupling Approach Integrating Requirements, Variables and Logic

School of Mechanical Engineering, University of Science and Technology Beijing, Beijing 100083, China
* Author to whom correspondence should be addressed.
World Electr. Veh. J. 2025, 16(8), 417; https://doi.org/10.3390/wevj16080417
Submission received: 23 May 2025 / Revised: 26 June 2025 / Accepted: 8 July 2025 / Published: 24 July 2025
(This article belongs to the Special Issue Intelligent Electric Vehicle Control, Testing and Evaluation)

Abstract

This paper proposes an AI-driven automated test generation framework for vehicle control units (VCUs), integrating natural language processing (NLP) and dynamic variable binding. To address the critical limitation of traditional AI-generated test cases lacking executable variables, the framework establishes a closed-loop transformation from requirements to executable code through a five-layer architecture: (1) structured parsing of PDF requirements using domain-adaptive prompt engineering; (2) construction of a multidimensional variable knowledge graph; (3) semantic atomic decomposition of requirements and logic expression generation; (4) dynamic visualization of cause–effect graphs; (5) path-sensitization-driven optimization of test sequences. Validated on VCU software from a leading OEM, the method achieves 97.3% variable matching accuracy and 100% test case executability, reducing invalid cases by 63% compared to conventional NLP approaches. This framework provides an explainable and traceable automated solution for intelligent vehicle software validation, significantly enhancing efficiency and reliability in automotive testing.

1. Introduction

1.1. Research Background

The rapid advancement of intelligent electric vehicles has led to exponential growth in the software complexity of vehicle control units (VCUs) [1,2,3]. Mainstream electric vehicle VCUs now exceed 5 million lines of code and manage over 5000 vehicle state variables and more than 150 CAN nodes [4,5,6]. Concurrently, the ISO 21448 Safety of the Intended Functionality (SOTIF) standard mandates full lifecycle traceability of requirement changes [7], which exposes critical weaknesses in manual test scripting. Field data from a leading OEM indicate that a significant proportion of test cases require design adjustments during software iterations, extending the validation cycle several-fold [8].
Current solutions face two primary limitations:
(1)
Rule-based approaches (e.g., Vector CANoe) lack adaptability to requirement changes due to hard-coded templates.
(2)
End-to-end AI methods (e.g., GPT-4) leave more than half of their generated test cases non-executable because of missing variables, creating a critical “variable missing gap” that hinders AI adoption in automotive validation.

1.2. State-of-the-Art Challenges

Recent academic advances, such as the BERT-UML framework [9], enable requirement-to-activity diagram conversion but rely on static variable binding, failing to address dynamic calibration updates. Industrial tools like ANSYS SCADE 2000 [10,11] enforce rigid SysML-based workflows, limiting flexibility. Key challenges in natural language requirement ambiguity include:
(1)
Heterogeneous expressions: Diverse naming conventions for identical parameters (e.g., “Battery SOC” vs. “High Voltage Battery State of Charge (SoC)”).
(2)
Context dependency: Threshold definitions (e.g., “motor overheating”) vary across vehicle thermal designs.
These issues result in <70% variable matching accuracy in real-world VCU testing, with misalignments risking critical errors (e.g., erroneously linking accelerator signals to brake systems).

1.3. Research Contributions

This work proposes an innovative “requirement–variable–logic” multidimensional coupling framework with three breakthroughs:
(1)
Domain-adaptive requirement parsing: Automotive-specific prompt templates improve Llama3’s F1-score from 82.4% to 94.2% for requirement structuring tasks.
(2)
Dynamic variable binding: A hybrid Levenshtein-BERT semantic fingerprinting technique achieves 97.3% accuracy in mapping requirements to DBC signals.
(3)
Explainable test generation: A cause–effect-graph-driven path-sensitization algorithm reduces test cases by 63% while maintaining 100% modified condition/decision coverage (MC/DC), a structural coverage criterion that ISO 26262 recommends for safety-critical automotive software.
Validated on VCU software from a leading OEM, this framework has successfully supported ISO 26262 ASIL-D certification.

2. Methodology

2.1. System Architecture

As illustrated in Figure 1, the proposed framework enables end-to-end transformation from raw requirements to executable test cases through six interconnected modules:
(1)
Unstructured PDF Requirement Extraction
  • Employs a multimodal PDF parser (PyPDF2 + pdfplumber) to convert heterogeneous content (text, tables, formulas) into structured Markdown format.
  • Achieves 95.7% text recall and 88.6% table integrity via hybrid syntax–visual analysis.
(2)
Llama3-Based Requirement Structuring
  • Leverages automotive-specific prompt templates to normalize Markdown requirements.
  • Stores structured requirements in a NoSQL database (MongoDB) for atomic decomposition.
(3)
Variable Knowledge Graph Construction
  • Builds a three-layer meta-model integrating:
    A2L files: Calibration parameters.
    CAN matrix: Communication signals.
    HIL bench data: Runtime variables.
  • Utilizes Neo4j for graph-based storage and dynamic updates.
(4)
Atomic Requirement Decomposition
  • Splits requirements into minimal logical expressions (e.g., IF BatteryTemp > 45 °C THEN ChargingPower = 0).
  • Performs cross-database variable matching with 97.3% accuracy.
(5)
Cause–Effect Graph Modeling
  • Implements a PySide6-based GUI for interactive graph editing.
  • Supports real-time validation using SAT solvers to detect logic conflicts.
(6)
Path-Sensitization Test Generation
  • Applies a hybrid A*-DFS algorithm to optimize test sequences.
  • Reduces test cases by 63% while ensuring 100% MC/DC coverage.
Dataflow Characteristics:
(1)
Bidirectional Verification: Embeds consistency checks during atomic decomposition and graph modeling to ensure variable integrity.
(2)
Progressive Refinement: Implements a two-phase variable matching strategy:
  • Phase 1: Coarse filtering via Levenshtein similarity (threshold: 0.6).
  • Phase 2: Precise mapping using BERT-based semantic fingerprints.

2.2. Requirement Structuring

2.2.1. Multimodal PDF Parsing

Modern VCU requirement documents exhibit multimodal characteristics: a heterogeneous mix of textual descriptions (58%), parameter tables (23%), mathematical formulas (12%) and state flowcharts (7%). This complexity renders traditional single-modality parsing methods ineffective—a case study reveals that text-only extraction misses 72% of table parameters, leading to erroneous test boundary conditions [12].
Hybrid Parsing Strategy
We propose a synergistic framework combining PyPDF2 (syntax-based parsing) and pdfplumber (vision-driven layout analysis), leveraging their complementary strengths (Figure 2).
Technical Implementation
(1)
Spatial Coordinate Mapping:
  • Constructs page-level grids using PyPDF2-extracted text block metadata (x, y, width, height), enabling character-level indexing [13,14,15].
(2)
Table Reconstruction:
  • Rebuilds 2D table structures by aligning pdfplumber-detected cell boundaries with contextual semantics (e.g., parameter-value co-occurrence patterns) [16].
(3)
Formula Recognition:
  • Identifies mathematical expressions using:
    Symbolic Features: Special characters (∑, √).
    Layout Features: Superscript/subscript alignment.
    Semantic Features: Formula references in adjacent text.
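The cue-based formula recognition described above can be illustrated with a minimal sketch. It assumes text lines have already been extracted from the PDF. The symbol set, the keyword pattern, the assignment pattern and the function names `formula_score` and `is_formula` are all hypothetical. The layout feature (superscript/subscript alignment) needs character-level coordinates, so a simple structural check stands in for it here.

```python
import re

# Symbolic cue: characters that rarely occur in plain requirement prose (assumed set).
MATH_SYMBOLS = set("∑√∫≤≥≠±×÷∈")
# Semantic cue: adjacent text that refers to a formula (hypothetical keyword list).
REFERENCE_PATTERN = re.compile(r"\b(formula|equation|eq\.)", re.IGNORECASE)
# Structural cue standing in for layout analysis: "name = expression".
ASSIGNMENT_PATTERN = re.compile(r"^\s*[A-Za-z_]\w*\s*=\s*\S+")


def formula_score(line: str, previous_line: str = "") -> int:
    """Count how many independent cues suggest that `line` is a formula."""
    score = 0
    if any(ch in MATH_SYMBOLS for ch in line):
        score += 1  # symbolic feature
    if ASSIGNMENT_PATTERN.match(line):
        score += 1  # structural stand-in for the layout feature
    if REFERENCE_PATTERN.search(previous_line):
        score += 1  # semantic feature: the preceding text mentions a formula
    return score


def is_formula(line: str, previous_line: str = "", threshold: int = 2) -> bool:
    """Flag a line as a formula when at least `threshold` cues agree."""
    return formula_score(line, previous_line) >= threshold
```

Requiring two agreeing cues is an illustrative choice that keeps a single stray symbol in prose from being flagged; the threshold actually used by the framework is not stated.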
Experimental Validation
Experimental comparisons were conducted using the proposed method, PyPDF2, and pdfplumber individually, based on tests with 100 VCU documents. The comparison results are presented in Table 1.
  • Achieves 91.2% F1-score for composite recall, outperforming single-tool approaches by 23.8 percentage points.
  • Reduces table parameter errors from 41% ± 2.1% (95% CI) to 6.8% ± 0.7% (χ² test, p < 0.01) [12].
  • Solves nested table misalignment in ABS control requirements, eliminating 37 faulty test cases and saving 14 person-days.
Table 1. Performance comparison of PDF parsing tools (tested on 100 VCU documents).
| Metric | PyPDF2 [13] | pdfplumber [14] | Our Method |
| --- | --- | --- | --- |
| Parsing Principle | Syntax Analysis | Visual Layout | Hybrid Syntax–Visual |
| Text Recall | 89.2% | 78.5% | 95.7% |
| Table Integrity | 32.7% | 91.3% | 88.6% |
| Formula Recognition | N/A | Limited Support | 76.9% |
| Speed (pages/s) | 15.3 | 9.7 | 12.1 |

2.2.2. Domain-Adaptive Prompt Engineering

Llama3 Adaptation
Llama3, Meta’s open-source large language model series, demonstrates robust reasoning capabilities in automotive applications. The 8B-parameter variant balances lightweight deployment with performance comparable to 70B-scale models [13]. Key technical advantages include:
  • Grouped Query Attention (GQA): Efficiently processes long-form requirements (avg. 128 tokens/clause).
  • Rotary Position Encoding (RoPE): Captures cross-paragraph dependencies critical for multiconstraint requirements.
  • Domain-Specific Pretraining: Trained on 15 trillion tokens, including 8.3% engineering documentation (e.g., ISO/SAE standards) [17].
Automotive Prompt Template Library
Based on 1200 annotated requirement documents, we developed a hierarchical template library (Table 2).
Table 3 illustrates all five template types with example requirement texts and their structured outputs. Each task-specific prompt guides Llama3 in parsing and structuring requirement text through explicit instructions.
Adaptive Optimization Strategy
A three-phase iterative refinement process (Figure 3) ensures continuous template improvement:
(1)
Dynamic Placeholder Replacement: Auto-completes contextual units (e.g., “20%” → “20% SOC”).
(2)
Domain Lexicon Injection: Embeds 4300 automotive terms (ASIL-D, CAN FD) through prompt engineering.
(3)
Feedback Reinforcement Learning: Optimizes template weights via:
$$\omega_{new} = \omega_{old} + \alpha \cdot \frac{N_{correct}}{N_{total}}$$
where:
$\alpha$: learning rate;
$N_{correct}$: number of validated templates;
$N_{total}$: total number of validation samples.
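As a small worked example of the weight update, the sketch below applies the rule once. The function name `update_template_weight` and the default learning rate of 0.1 are illustrative assumptions; the value of α used in the experiments is not stated.

```python
def update_template_weight(w_old: float, n_correct: int, n_total: int,
                           alpha: float = 0.1) -> float:
    """Apply w_new = w_old + alpha * N_correct / N_total once."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_correct <= n_total:
        raise ValueError("n_correct must lie between 0 and n_total")
    return w_old + alpha * n_correct / n_total


# A template validated on 45 of 50 samples gains alpha * 0.9 in weight.
new_weight = update_template_weight(1.0, n_correct=45, n_total=50)
```

Because the increment is proportional to the validation ratio, templates that are confirmed more often accumulate weight faster across iterations.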
Figure 3. Three-phase adaptive optimization strategy.
Experimental Validation
To verify the effectiveness and superiority of the proposed approach in handling requirement parsing tasks, comparative experiments were conducted against several state-of-the-art methods. The key performance metrics, including parsing accuracy, precision, recall, and F1-score, were quantified and summarized in Table 4, which presents a comprehensive comparison of the performance across different models on the requirement parsing tasks.

2.3. Variable Knowledge Graph Construction

2.3.1. Multisource Data Fusion

VCU variable data exhibit multisource heterogeneous characteristics (Table 5), presenting three integration challenges:
(1)
Naming Conflicts: Identical signals with different labels across sources (e.g., “BatteryVoltage” vs. “V_BAT”) [18,19,20].
(2)
Semantic Ambiguity: Context-dependent interpretations (e.g., “Voltage” may denote battery or motor phase voltage) [21].
(3)
Dynamic Updates: Calibration parameters evolve with software iterations [22].
Table 5. Multisource heterogeneous data characteristics.
| Data Source | Data Type | Typical Characteristics | Parsing Tool |
| --- | --- | --- | --- |
| CANdb++ DBC | Communication Signal | Physical dimensions and byte order | cantools |
| MATLAB/Simulink | Model Parameters | Calculation logic and data flow relationships | Simulink API |
| ASAP2 (A2L) | Calibration Parameters | Address mapping and ECU memory layout | ASAP2Parser |
| Excel Data | Engineering Constraints | Operating conditions and test boundary values | pandas |
Fusion Rules:
(1)
Priority Strategy: DBC signals > Calibration parameters > Model parameters (based on real-time requirements).
(2)
Conflict Resolution: Select latest entries via version timestamps for overlapping variables.
(3)
Dynamic Updates: Incremental synchronization using FileSystemWatcher event listeners.
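The first two fusion rules can be sketched as a single ranking key. This is a minimal illustration and not the framework's code: the `VariableRecord` fields and the source labels are hypothetical, and it assumes that source priority is applied first, with the version timestamp used only to break ties within the same source.

```python
from dataclasses import dataclass
from datetime import datetime

# Lower number = higher priority, following "DBC > calibration > model".
SOURCE_PRIORITY = {"dbc": 0, "a2l": 1, "simulink": 2}


@dataclass(frozen=True)
class VariableRecord:
    name: str          # canonical variable name (hypothetical field)
    source: str        # one of the keys of SOURCE_PRIORITY
    updated: datetime  # version timestamp of the entry
    unit: str          # illustrative payload


def _rank(record: VariableRecord):
    # Source priority first, then the newest timestamp first.
    return (SOURCE_PRIORITY[record.source], -record.updated.timestamp())


def fuse(records):
    """Keep exactly one record per variable name, chosen by _rank."""
    best = {}
    for record in records:
        current = best.get(record.name)
        if current is None or _rank(record) < _rank(current):
            best[record.name] = record
    return best
```

With this ordering, a newer A2L entry never displaces a DBC entry for the same variable, while two DBC entries are resolved in favor of the later version.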

2.3.2. Hybrid Semantic Fingerprinting

To bridge the semantic gap between natural language requirements and knowledge graph variables, we propose a hybrid semantic fingerprint (HSF) algorithm (Figure 4). The hybrid semantic fingerprint uses domain-tuned BERT vectors (bert-base-uncased fine-tuned on SAE J1939 standards).
Core Algorithm:
(1)
Formal Features:
Name Similarity: Enhanced Levenshtein distance.
$$Sim_{name} = 1 - \frac{EditDistance(s_1, s_2)}{\max(len(s_1), len(s_2))}$$
Unit Consistency: SI unit standardization (e.g., “V” ↔ “Volt”).
(2)
Semantic Features:
Contextual Embeddings: Domain-tuned BERT vectors.
Co-occurrence Frequency:
$$Sim_{context} = \frac{\log(CoCount(v_1, v_2))}{\max(len(s_1), len(s_2))}$$
where:
$CoCount(v_1, v_2)$: co-occurrence frequency in 15,000 VCU documents. Weight coefficients (0.6, 0.3, 0.1) were optimized via grid search to maximize F1-score.
(3)
Hybrid Matching:
$$Score_{HSF} = 0.6 \cdot Sim_{name} + 0.3 \cdot Sim_{context} + 0.1 \cdot UnitMatch$$
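The scoring steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the "enhanced" Levenshtein distance is approximated by a plain case-insensitive edit distance, the contextual similarity is passed in as a precomputed number instead of being derived from BERT embeddings, and the unit alias table and the function names are hypothetical. The coarse filter (0.6) and the acceptance threshold (θ = 0.7) follow the values quoted in the text.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def sim_name(s1: str, s2: str) -> float:
    """1 - EditDistance / max length, compared case-insensitively (assumption)."""
    longest = max(len(s1), len(s2))
    if longest == 0:
        return 1.0
    return 1 - levenshtein(s1.lower(), s2.lower()) / longest


# Hypothetical alias table for unit standardization (e.g. "V" <-> "Volt").
UNIT_ALIASES = {"v": "V", "volt": "V", "°c": "°C", "degc": "°C", "celsius": "°C"}


def unit_match(u1: str, u2: str) -> float:
    """Return 1.0 when both units normalize to the same symbol, else 0.0."""
    def normalize(unit: str) -> str:
        return UNIT_ALIASES.get(unit.strip().lower(), unit.strip())
    return 1.0 if normalize(u1) == normalize(u2) else 0.0


def hsf_score(name_sim: float, context_sim: float, unit: float) -> float:
    """Weighted sum with the coefficients (0.6, 0.3, 0.1) given above."""
    return 0.6 * name_sim + 0.3 * context_sim + 0.1 * unit


def best_match(term, term_unit, candidates, context_sim, coarse=0.6, accept=0.7):
    """candidates maps name -> unit; context_sim maps name -> similarity in [0, 1]."""
    best = None
    for name, unit in candidates.items():
        s_name = sim_name(term, name)
        if s_name < coarse:  # Phase 1: coarse Levenshtein filter
            continue
        score = hsf_score(s_name, context_sim.get(name, 0.0),
                          unit_match(term_unit, unit))
        if score >= accept and (best is None or score > best[1]):  # Phase 2
            best = (name, score)
    return best
```

Applying the cheap string filter before any embedding comparison keeps the expensive semantic step limited to a short list of plausible candidates.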
Experimental Validation (Table 6):
Optimization Outcomes:
  • Matching Accuracy: 97.3% (θ = 0.7 threshold).
  • Manual Intervention Rate: Reduced from 31.6% to 2.7%.
  • Processing Speed: 3.2 ms/variable (NVIDIA Jetson AGX Orin).

2.4. Requirement Atomization

Atomic requirement decomposition breaks natural language requirements into minimal programmable semantic units, a fundamental step in processing requirement text [23]. It must cope with linguistic ambiguity, logical nesting, and contextual variable dependencies. We propose a dual-layer decomposition framework combining semantic role labeling (SRL) and logical constraint parsing, as illustrated in Figure 5.

2.4.1. Semantic Role Labeling

A domain-optimized BiLSTM-CRF model is employed, with enhanced input features for automotive requirements. Table 7 shows the input feature engineering for semantic role labeling:
Performance Benchmark (5000 annotated requirements):
To further validate the robustness and scalability of the proposed framework in semantic role labeling (SRL)—a critical subtask in requirement understanding—we conducted a performance benchmark using a large-scale dataset consisting of 5000 manually annotated requirements. The results of the benchmark are presented in Table 8.

2.4.2. Logic Expression Generation

Four logic expression templates are constructed from SRL results, which are shown in Table 9:
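As a rough illustration of the step from a structured record to a logic expression, the sketch below renders a causal record as an IF-THEN string and evaluates its conditions against signal values. The record layout follows the causal example in Table 3. The function names, the operator table and the assumption that conditions are always joined by AND are illustrative only.

```python
import operator

# Comparison operators that may appear in a structured record (assumed set).
OPERATORS = {
    "≥": operator.ge, ">=": operator.ge, ">": operator.gt,
    "≤": operator.le, "<=": operator.le, "<": operator.lt,
    "==": operator.eq, "=": operator.eq,
}


def to_expression(record: dict) -> str:
    """Render a causal record as 'IF c1 AND c2 THEN action'."""
    conditions = " AND ".join(
        f"{c['var']} {c['op']} {c['val']}" for c in record["conditions"]
    )
    return f"IF {conditions} THEN {record['action']}"


def conditions_hold(record: dict, signals: dict) -> bool:
    """Check every condition of the record against numeric signal values."""
    return all(
        OPERATORS[c["op"]](signals[c["var"]], float(c["val"]))
        for c in record["conditions"]
    )


record = {
    "conditions": [
        {"var": "BatteryTemp", "op": "≥", "val": "45"},
        {"var": "SOC", "op": "<", "val": "20"},
    ],
    "action": "ChargingPower = 3",
}
expression = to_expression(record)
```

Keeping the record as data and deriving both the readable expression and the evaluator from it means the same atomic unit can feed the graph editor and the test oracle.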

2.5. Cause–Effect Graph Modeling and Test Case Generation

2.5.1. Cause–Effect Graph (CEG) Modeling

The cause–effect graph formalizes condition–action logic from requirements into a visual topological structure for test path generation. The core components of the PySide6-based modeling tool are shown in Table 10:
Modeling Workflow:
Node Mapping: Import atomic logic expressions (Section 2.4) as initial nodes.
Logic Integration: Merge duplicate conditions via drag-and-drop and connect nodes via logic gates.
Version Control: Maintain historical graph versions for ISO 26262-compliant traceability.
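The kind of logic conflict that the graph validation is meant to catch can be illustrated with a small exhaustive check. This is a sketch only: it enumerates all truth assignments instead of calling a SAT solver, which is practical only for a handful of Boolean conditions, and the rule format, variable names and example rules are hypothetical.

```python
from itertools import product


def find_conflicts(rules, variables):
    """Return (effect, assignment) pairs where rules demand opposite values.

    rules: list of (condition_fn, effect_name, demanded_value).
    variables: names of the Boolean condition nodes.
    """
    conflicts = []
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        demanded = {}
        for condition, effect, value in rules:
            if condition(assignment):
                demanded.setdefault(effect, set()).add(value)
        for effect, wanted in demanded.items():
            if len(wanted) > 1:  # both True and False are demanded at once
                conflicts.append((effect, assignment))
    return conflicts


# Illustrative rules: fast charging is enabled on plug-in with low SOC,
# but disabled whenever the battery is over temperature.
VARIABLES = ["plug_in", "soc_low", "over_temp"]
RULES = [
    (lambda a: a["plug_in"] and a["soc_low"], "fast_charge", True),
    (lambda a: a["over_temp"], "fast_charge", False),
]
conflicts = find_conflicts(RULES, VARIABLES)
```

Each reported assignment is a concrete witness of the conflict, which can be shown to the engineer alongside the two rules that fired.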

2.5.2. Path-Sensitization Algorithm

To minimize test cases while ensuring 100% MC/DC coverage, we propose a mixed-integer programming heuristic search (MIP-HS) algorithm:
Objective Function:
$$\text{Minimize} \quad \sum_{i=1}^{n} \omega_i x_i + \lambda \sum_{j=1}^{m} (1 - y_j)$$
where:
$x_i \in \{0, 1\}$: selection of path $i$;
$y_j \in [0, 1]$: coverage degree of condition $j$;
$\omega_i$: path weight (risk-based assignment);
$\lambda$: coverage penalty factor.
Constraints:
(1)
MC/DC Coverage:
$$\sum_{i \in S_j} x_i \ge 1, \quad \forall j \in \{\text{all condition nodes}\}$$
where $S_j$ is the set of paths covering condition $j$.
(2)
Risk Constraint:
$$\sum_{i \in H} x_i \ge R_{min}$$
where $H$ is the set of high-risk paths and $R_{min}$ is the minimum number of high-risk cases.
(3)
Mutual Exclusion:
$$x_k + x_l \le 1$$
where paths k and l are mutually exclusive.
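To make the selection problem concrete, the sketch below solves a toy instance by exhaustive enumeration. It is not the MIP-HS algorithm itself: brute force is swapped in because it is easy to verify on a few paths. It also assumes the coverage constraint is enforced strictly, so every condition is fully covered and the penalty term of the objective vanishes. All path names, weights and sets are made up for illustration.

```python
from itertools import combinations


def select_paths(weights, covers, conditions, high_risk, r_min, exclusive):
    """Return the cheapest feasible set of paths and its total weight.

    weights: path -> weight; covers: path -> set of covered conditions;
    conditions: all condition nodes; high_risk: set of high-risk paths;
    r_min: minimum number of high-risk paths; exclusive: pairs (k, l).
    """
    paths = sorted(weights)
    best, best_cost = None, float("inf")
    for size in range(1, len(paths) + 1):
        for subset in combinations(paths, size):
            chosen = set(subset)
            # Mutual exclusion: x_k + x_l <= 1
            if any(k in chosen and l in chosen for k, l in exclusive):
                continue
            # Risk constraint: at least r_min high-risk paths
            if len(chosen & high_risk) < r_min:
                continue
            # Coverage: every condition lies on at least one chosen path
            covered = set().union(*(covers[p] for p in chosen))
            if not conditions <= covered:
                continue
            cost = sum(weights[p] for p in chosen)
            if cost < best_cost:
                best, best_cost = chosen, cost
    return best, best_cost


weights = {"p1": 3, "p2": 2, "p3": 2, "p4": 4}
covers = {"p1": {"c1", "c2"}, "p2": {"c1"},
          "p3": {"c2", "c3"}, "p4": {"c1", "c2", "c3"}}
selection, cost = select_paths(weights, covers, {"c1", "c2", "c3"},
                               high_risk={"p1", "p4"}, r_min=1,
                               exclusive=[("p1", "p3")])
```

In this instance the pair p2 and p3 covers every condition at the same total weight as p4 but contains no high-risk path, so the risk constraint decides the outcome. Enumeration grows exponentially with the number of paths, which is why a heuristic or solver-based search is needed for real graphs.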

2.5.3. Experimental Validation

Comparative results on a leading OEM’s VCU platform are summarized in Table 11:
Key Case Studies:
(1)
Fast-Charging Function Test:
  • Reduced test cases from 28 to 9 while covering boundary conditions (e.g., SOC = 20% ± 0.5%).
(2)
Thermal Management Test:
  • Eliminated 42% redundant paths caused by nested logic gates.

3. Case Study: Charging Control System Validation

3.1. Experimental Setup

Validation Target:
  • Hardware Platform: A 2024-model VCU from a leading automotive platform.
  • Software Version: 1.00.03 (charging control firmware).
Baseline Methods:
  • Method A: Manual test design using Vector CANoe v11.0.
  • Method B: GPT-4 end-to-end generation (API version 15 May 2024).
Evaluation Metrics:
(1)
Functional Coverage: Compliance with ISO 26262 criteria.
(2)
Testing Efficiency: Test case generation/execution time.
(3)
Defect Detection: Injected fault identification rate.

3.2. Results

Functional Coverage Analysis (Table 12):
Efficiency Comparison:
  • Generation Time: 2.1 h (this method) vs. 10.5 h (A) vs. 0.3 h (B).
  • Execution Time: 45 min (this method) vs. 62 min (A). Note: 63% of Method B cases were non-executable.
Defect Detection Capability (Table 13):

3.3. Representative Scenario Analysis

Scenario 1: Dynamic Fast-Charging Power Regulation
Requirement:
$$P_{chg} = k \, (T_{bat} - T_{min}) (SOC_{max} - SOC), \quad \text{where } T_{bat} \in [15\,^{\circ}\mathrm{C}, 45\,^{\circ}\mathrm{C}], \; SOC \in [20\%, 95\%].$$
  • Implementation:
    (1)
    Variable mapping: $T_{bat} \to$ BatteryTemp, $SOC \to$ BatterySOC.
    (2)
    Generated boundary tests: $T_{bat} = \{14, 15, 16\}$ °C, $SOC = \{19, 20, 21\}\%$.
Finding: Detected power calculation anomalies at Tbat = 14 °C due to rounding errors.
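The boundary tests in this scenario follow the usual three-point pattern around each limit. The sketch below reproduces it under stated assumptions: a step of one unit, boundaries generated at both ends of each range, and a full cross product of the two variables. The function names are hypothetical, and the way the framework combines boundary values is not described.

```python
from itertools import product


def boundary_values(lower, upper, step=1):
    """Three points around each limit: just outside, on, and just inside."""
    return [lower - step, lower, lower + step,
            upper - step, upper, upper + step]


def boundary_cases(ranges, step=1):
    """Yield (values, valid) for every combination of boundary values.

    ranges maps a variable name to its (lower, upper) limits; `valid` is
    True only when every variable lies inside its permitted range.
    """
    names = list(ranges)
    grids = [boundary_values(low, high, step) for low, high in ranges.values()]
    for combo in product(*grids):
        values = dict(zip(names, combo))
        valid = all(ranges[n][0] <= values[n] <= ranges[n][1] for n in names)
        yield values, valid


cases = list(boundary_cases({"BatteryTemp": (15, 45), "BatterySOC": (20, 95)}))
```

Marking each combination as valid or invalid lets the same list drive both nominal checks and out-of-range robustness checks, such as the case at 14 °C.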
Scenario 2: Charging Gun State Machine
  • Requirement: Transition from “Disconnected” to “Charging” requires: Plug-in signal = 1 ∧ Insulation test passed ∧ Contactor closure timeout < 3 s. ("∧" denotes logical AND).
  • Advantages:
    (1)
    Automated identification of timeout constraints.
    (2)
    Generated edge cases (e.g., 5 s delayed contactor closure).
    (3)
    Discovered CAN signal race conditions causing contactor state flips.
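The transition guard in this scenario can be written as a small function. This is an illustrative sketch: the field names and state labels are hypothetical, and the timeout clause is read as "the contactor must close in less than 3 s".

```python
from dataclasses import dataclass


@dataclass
class ChargingInputs:
    plug_in: int                # plug-in signal, 1 when the gun is inserted
    insulation_passed: bool     # result of the insulation test
    contactor_closure_s: float  # measured contactor closing time in seconds


def next_state(state: str, inputs: ChargingInputs, timeout_s: float = 3.0) -> str:
    """Return the next state for the Disconnected -> Charging transition."""
    if state != "Disconnected":
        return state  # other transitions are outside this sketch
    guard = (inputs.plug_in == 1
             and inputs.insulation_passed
             and inputs.contactor_closure_s < timeout_s)
    return "Charging" if guard else "Disconnected"
```

The 5 s delayed-closure edge case mentioned above corresponds to `contactor_closure_s=5.0`, for which the guard fails and the state remains "Disconnected".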

3.4. Industrial Deployment

Deployed at a leading OEM (June 2024–April 2025), the framework achieved:
  • Validation Scope: 12 vehicle models, 38 VCU software iterations.
  • Test Cases: 12,750 generated (99.3% executability).
  • Defects Identified:
    Requirement conflicts: 47 (e.g., simultaneous fast-charge enable/disable commands).
    Implementation errors: 238 (including 3 ASIL-D vulnerabilities).
  • Efficiency Gains:
    Test design cycle reduced by 82% (14 → 2.5 person-days).
    Regression testing time decreased by 76% via incremental updates.

4. Discussion

4.1. Methodological Advantages

4.1.1. Comparative Advantages over Traditional Methods

  • Traceability:
    Our cause–effect graph enables bidirectional requirement–test traceability, improving coverage by 37% over traditional traceability matrices [29] and complying with ISO 26262 Clause 8. In ASPICE L2 audits at a leading OEM, requirement traceability defects decreased from 48 to 3.
  • Dynamic Adaptability:
    The incremental path-sensitization algorithm achieves 28× faster test case regeneration during requirement changes than UML-based methods (Table 14).
  • Hot-Swappable Updates: Variable knowledge graphs synchronize within 1.2 s when DBC signal definitions change.

4.1.2. Advantages over AI Methods

(1)
Executability Guarantee:
Resolves 68.9% variable absence in GPT-4-generated cases via semantic fingerprinting.
Achieves 2.3% false-positive rate in ISO 21448 SOTIF validation vs. GPT-4’s 31.7%.
(2)
Explainability:
Cause–effect graphs provide auditable decision paths, aligning with the EU AI Act’s transparency mandates for high-risk systems [30].

4.2. Limitations

4.2.1. Technical Limitations

(1)
Chinese Nested Clauses:
  • For Chinese nested clauses with four or more layers, a sliding-window semantic role labeling approach can be adopted in the future. This method decomposes complex logical structures into atomic units while maintaining contextual semantic coherence through attention linking mechanisms, thus effectively improving the labeling accuracy in validation tests.
(2)
Multi-ECU Coordination:
  • 23% of defects originate from cross-ECU signal misalignment (e.g., VCU-BMS timing mismatches).

4.2.2. Engineering Challenges

(1)
Long-Tail Effect:
  • 5% of complex state machines (10-layer nesting) consume 63% of computational resources due to path explosion (10⁶ paths). Two optimization strategies are available: (i) decompose the system into independent functional subgraphs with interface constraints, reducing the complexity of individual graphs; (ii) implement multiview switching in the PySide6 tool to support flexible navigation between local details and global overviews.
(2)
Hardware Dependency:
  • To address hardware limitations, we propose containerized HIL emulation using QEMU virtualization. This allows cloud-native execution while maintaining 98.7% signal timing accuracy versus physical dSPACE SCALEXIO systems.

4.3. Future Work

(1)
Multimodal Requirement Integration:
  • Incorporate voice/image inputs (e.g., meeting transcripts, sketches) for X-in-the-Loop testing.
(2)
Quantum-Inspired Optimization:
  • Quantum-inspired optimization uses simulated annealing via CUDA-accelerated tensor operations (PyTorch implementation). For 1000-path graphs, execution time reduces from 8.2 s to 68 ms on NVIDIA A100 GPUs, enabling real-time processing of complex state machines.
(3)
Cloud–Edge Collaboration:
  • Develop hybrid architectures for cloud-based knowledge graph updates and edge-side real-time execution.

5. Conclusions

This study addresses two critical challenges in VCU test generation—variable absence and requirement ambiguity—through a novel “requirement–variable–logic” multidimensional coupling framework. Key contributions include:
(1)
Technical Breakthroughs:
  • Domain-adaptive Llama3 prompting achieves a 94.2% F1-score for requirement parsing.
  • Hybrid semantic fingerprinting enables 97.3% variable matching accuracy.
(2)
Industrial Value:
  • Reduces test design cycles by 82%, saving CNY 1.27M per vehicle program.
  • Identifies 12 types of latent defects, including 3 ASIL-D vulnerabilities, ensuring ISO 21448 SOTIF compliance.
(3)
Paradigm Shift:
  • Establishes explainable test generation via cause–effect graphs, advancing agile validation for intelligent vehicle software.
Future work will extend this framework to multi-ECU coordination and quantum-accelerated optimization, driving the evolution from component-level to system-level intelligence in automotive testing.

Author Contributions

Framework design, methodology, manuscript preparation: G.W.; Algorithm implementation, validation, data curation: X.X.; Industrial deployment, case studies, funding acquisition: Y.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Shanxi Province Major Science and Technology Project grant number 202301150401011.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
VCU: Vehicle Control Unit
NLP: Natural Language Processing
CAN: Controller Area Network
DBC: Database Container (automotive signal definition format)
BERT: Bidirectional Encoder Representations from Transformers
BMS: Battery Management System
OEM: Original Equipment Manufacturer
MC/DC: Modified Condition/Decision Coverage

References

  1. Pan, F.; Song, Y.; Wen, L.; Petrovic, N.; Lebioda, K.; Knoll, A. Automating Automotive Software Development: A Synergy of Generative AI and Formal Methods. arXiv 2025, arXiv:2505.02500.
  2. Vdovic, H.; Babic, J.; Podobnik, V. Automotive Software in Connected and Autonomous Electric Vehicles: A Review. IEEE Access 2019, 7, 166365–166379.
  3. Damasiotis, V.; Fitsilis, P.; O’Kane, J.F. Modeling Software Development Process Complexity. Int. J. Inf. Technol. Proj. Manag. 2018, 9, 17–40.
  4. Thakur, P.; Sharma, S.K. Estimation of complexity in software reliability growth modeling. Adv. Appl. Math. Sci. 2020, 19, 563–572.
  5. Lipu, M.S.H.; Hannan, M.A.; Karim, T.F.; Hussain, A.; Saad, M.H.M.; Ayob, A.; Miah, M.S.; Mahlia, T.M.I. Intelligent algorithms and control strategies for battery management system in electric vehicles: Progress, challenges and future outlook. J. Clean. Prod. 2021, 292, 126044.
  6. Wang, B.; Han, Y.; Wang, S.; Tian, D.; Cai, M.; Liu, M.; Wang, L. A Review of Intelligent Connected Vehicle Cooperative Driving Development. Mathematics 2022, 10, 3635.
  7. Haraldsson, B.; Staron, M. Aspects of complexity in automotive software systems and their relation to maintainability effort. A case study. arXiv 2025, arXiv:2505.13135.
  8. ISO 21448:2022; Road Vehicles—Safety of the Intended Functionality. ISO: Geneva, Switzerland, 2022.
  9. Agarwal, G. Test Case Automation: Transforming Software Testing in the Digital Era. Int. J. Comput. Eng. 2024, 6, 52–58.
  10. Konrad, S.; Cheng, B.H.C.; Campbell, L.A. Object analysis patterns for embedded systems. IEEE Trans. Softw. Eng. 2004, 30, 970–992.
  11. ANSYS Inc. SCADE System Requirements to Model-Based Testing. ANSYS SCADE Suite User Guide, Version 2023 R1, pp. 215–228. 2023. Available online: https://www.ansys.com (accessed on 9 July 2023).
  12. Wang, W.; Yang, C.; Wang, Z.; Huang, Y.; Chu, Z.; Song, D.; Zhang, L.; Chen, A.R.; Ma, L. TESTEVAL: Benchmarking Large Language Models for Test Case Generation. arXiv 2024, arXiv:2406.04531.
  13. Adhikari, N.S.; Agarwal, S. A Comparative Study of PDF Parsing Tools Across Diverse Document Categories. arXiv 2024, arXiv:2410.09871.
  14. PyPDF2 Developers. PyPDF2 Documentation: PDF Text Extraction Toolkit. 2023. Available online: https://pypdf2.readthedocs.io (accessed on 9 July 2023).
  15. Schwab, J. pdfplumber: Visual-driven PDF parsing for complex layouts. In Proceedings of the Python in Science Conferences, Austin, TX, USA, 12–18 July 2021; pp. 102–109.
  16. ISO/IEC 32000-2:2020; Document Management—Portable Document Format—Part 2: PDF 2.0. ISO: Geneva, Switzerland, 2020.
  17. Meta AI. Llama 3: Open Foundation for Fine-Tuned Language Models. 2024. Available online: https://ai.meta.com/blog/meta-llama-3 (accessed on 9 July 2024).
  18. Gupta, A. Domain-Adaptive Pretraining for Technical Documentation Processing. In Proceedings of the ACL, Toronto, ON, Canada, 9–14 July 2023; pp. 1289–1303.
  19. Arcanjo, R.R.; Martins, L.E.G.; Fernandes, D.L.G. Verification and validation of embedded software in an automotive context: A systematic literature review. Rev. Científica Multidiscip. Núcleo Conhecimento 2023, 18, 102–123.
  20. Rafael, T.; Robert, H.; Ramesh, S.; Joanne, M.A. Applying declarative analysis to industrial automotive software product line models. Empir. Softw. Eng. 2023, 28, 40.
  21. Vector Informatik. CANdb++ Documentation: Signal Management in Automotive Networks. 2023. Available online: https://vector.com (accessed on 9 July 2023).
  22. MathWorks. Simulink Parameter Management for AUTOSAR Systems. MATLAB Documentation R2023a. 2023. Available online: https://www.mathworks.com/help/autosar/ug/parameter-management-for-autosar-systems.html (accessed on 9 July 2023).
  23. ASAM e.V. ASAP2 Standard Specification v1.7.0. 2022. Available online: https://www.asam.net (accessed on 9 July 2022).
  24. Das, S.; Deb, N.; Cortesi, A.; Chaki, N. Extracting goal models from natural language requirement specifications. J. Syst. Softw. 2024, 211, 111981.
  25. Mikolov, T. Efficient Estimation of Word Representations in Vector Space. arXiv 2013, arXiv:1301.3781.
  26. Huang, L.; Zhang, W.; Li, J.; Wang, Q.; Zhao, Y.; Chen, H.; Liu, S.; Zhu, M.; Yang, X.; Sun, D.; et al. Domain-Specific POS Tagging for Automotive Requirements. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), Punta Cana, Dominican Republic, 7–11 November 2021; pp. 234–245.
  27. Manning, C.D.; Surdeanu, M.; Bauer, J.; Finkel, J.R.; Bethard, S.; McClosky, D.; Wang, L.; Li, P.; Zhang, H.; Chen, J.; et al. Universal Dependencies for Chinese. In Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, 7–12 May 2018; European Language Resources Association (ELRA): Paris, France, 2018; pp. 886–893.
  28. ISO 26262:2018; Road Vehicles—Functional Safety [S]. International Organization for Standardization (ISO): Geneva, Switzerland, 2018.
  29. Rocha, M.; Simão, A.; Sousa, T. Model-based test case generation from UML sequence diagrams using extended finite state machines. Softw. Qual J. 2021, 29, 597–627.
  30. European Commission. Regulation on Harmonised Rules on Artificial Intelligence (AI Act); Official Journal of the European Union: Luxembourg, 2024.
Figure 1. System architecture diagram.
Figure 2. PDF Multimodal Reading Algorithm Process.
Figure 4. Hybrid Semantic Fingerprint algorithm workflow.
Figure 5. Semantic role labeling and logic constraint parsing workflow.
Table 2. Hierarchical prompt template library.
| Template Type | Function | Use Case |
| --- | --- | --- |
| Causal | Extracts condition–action pairs | Charging control, torque management |
| State Machine | Identifies state transitions | Driving mode switching, fault diagnosis |
| Computational | Parses formulas and parameter constraints | Energy calculation, safety thresholds |
| Temporal | Captures time-constrained sequences | Emergency braking, thermal management |
| Invalid Filter | Filters non-testable requirements | Document preprocessing |
Table 3. Engineering implementations of representative templates.
| Original Requirement Text | Template Type | Structured Output (JSON) |
|---|---|---|
| “When battery temperature ≥ 45 °C and SOC < 20%, limit charging power to 3 kW” | Causal | {"conditions": [{"var": "BatteryTemp", "op": "≥", "val": "45"}, {"var": "SOC", "op": "<", "val": "20"}], "action": "ChargingPower = 3"} |
| “Transition from Parking to Drive mode requires: VehicleSpeed = 0, BrakePedal = 1” | State Machine | {"transitions": [{"from": "Parking", "to": "Drive", "triggers": [{"var": "VehicleSpeed", "op": "==", "val": "0"}, …]}]} |
| “Calculate maximum torque using formula: T_max = 0.8 × I_bat × V_bus” | Computational | {"formula": "T_max = 0.8 * I_bat * V_bus", "variables": ["I_bat@DBC_0x321", "V_bus@Simulink@Inverter"]} |
| “After the vehicle detects an obstacle, emergency braking should be initiated within 1 s” | Temporal | {"sequence": [{"event": "DetectObstacle", "time_constraint": "1s", "action": "EmergencyBraking"}]} |
| “Ensure there is no jerking during vehicle operation” | Invalid Filter | {"Invalid": []} |
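To illustrate how a structured output of the Causal type could be consumed downstream, the following minimal Python sketch checks the conditions of the Causal row in Table 3 against a snapshot of signal values. The function name `conditions_hold`, the operator table `OPS`, and the signal snapshots are hypothetical and introduced only for illustration; the sketch is not the framework's implementation.

```python
import json
import operator

# Comparison operators that may appear in the "op" field (assumed set).
OPS = {
    "≥": operator.ge, ">=": operator.ge,
    "≤": operator.le, "<=": operator.le,
    ">": operator.gt, "<": operator.lt,
    "==": operator.eq, "=": operator.eq,
    "!=": operator.ne,
}


def conditions_hold(rule: dict, signals: dict) -> bool:
    """Return True when every condition of a Causal-type rule is satisfied."""
    return all(
        OPS[c["op"]](float(signals[c["var"]]), float(c["val"]))
        for c in rule["conditions"]
    )


# Causal row of Table 3, written with straight quotes so that it parses as JSON.
rule = json.loads("""
{"conditions": [{"var": "BatteryTemp", "op": "≥", "val": "45"},
                {"var": "SOC", "op": "<", "val": "20"}],
 "action": "ChargingPower = 3"}
""")

# Hypothetical signal snapshots, not taken from the paper.
hot_and_low = {"BatteryTemp": 47.0, "SOC": 15.0}
cool_and_low = {"BatteryTemp": 30.0, "SOC": 15.0}

print(conditions_hold(rule, hot_and_low))   # both conditions are met
print(conditions_hold(rule, cool_and_low))  # the temperature condition fails
```

Keeping the comparison operators in a lookup table means a new operator only needs one extra entry, and an unknown operator fails loudly with a `KeyError` instead of being silently ignored.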
Table 4. Comparative performance on requirement parsing tasks.
| Model | Accuracy | Recall | F1-Score | Executable Rate |
|---|---|---|---|---|
| GPT-4 (zero-shot) | 78.4% | 72.1% | 75.1% | 63.2% |
| BERT-UML | 85.6% | 84.3% | 84.9% | 71.8% |
| Llama3 (vanilla) | 82.4% | 83.1% | 82.7% | 68.5% |
| Our Method | 93.7% | 94.8% | 94.2% | 97.3% |
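As a reading aid for Table 4, the short Python check below recomputes the F1-Score column as the harmonic mean of the Accuracy and Recall columns. It rests on the assumption that the Accuracy column plays the role of precision; the table itself does not state this, but the reported F1 values are consistent with it. The names `f1_score` and `TABLE_4` are illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    return 2 * precision * recall / (precision + recall)


# (Accuracy, Recall, reported F1-Score) in percent, copied from Table 4.
TABLE_4 = {
    "GPT-4 (zero-shot)": (78.4, 72.1, 75.1),
    "BERT-UML": (85.6, 84.3, 84.9),
    "Llama3 (vanilla)": (82.4, 83.1, 82.7),
    "Our Method": (93.7, 94.8, 94.2),
}

for model, (accuracy, recall, reported) in TABLE_4.items():
    # Assumption: the Accuracy column is used as precision.
    recomputed = round(f1_score(accuracy, recall), 1)
    print(f"{model}: recomputed F1 = {recomputed}, reported F1 = {reported}")
```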
Table 6. Comparison of Matching Results of Hybrid Semantic Fingerprint Algorithms.
| Requirement Variables | Candidate Variables | Sim_name | Sim_context | UnitMatch | Final Score |
|---|---|---|---|---|---|
| Battery temperature | Bms_Temp | 0.92 | 0.88 | 1.0 | 0.928 |
| Motor speed | Mtr_Speed | 0.85 | 0.91 | 1.0 | 0.886 |
| High-voltage system status | HV_St | 0.45 | 0.93 | 1.0 | 0.642 |
| Charge current limit | ChgCurrLim | 0.38 | 0.79 | 0.0 | 0.430 |
Table 7. Input feature engineering for semantic role labeling.
| Input Features | Processing Method | Example | Technical Source |
|---|---|---|---|
| Word Vector | Domain-fine-tuned Word2Vec | “SOC” → [0.72, −0.15, …] | [24] |
| Part-of-Speech Tag | Extended Automotive-specific Part-of-Speech Set | Noun Tag Extension: Signal/Value/Unit | [25] |
| Dependency Syntax Relation | Modified Version of Stanford Parser | “When SOC < 20%” → Conditional Adverbial Clause | [26] |
| Domain Dictionary Marking | Preset 2300 Automotive Electronics Entity Words | Automatic Marking of “VCU”, “CAN FD” | This Article |
Table 8. Semantic role labeling performance comparison.
| Model | Accuracy | Recall | F1-Score | Processing Speed |
|---|---|---|---|---|
| Traditional CRF | 83.2% | 81.7% | 82.4% | 1250 |
| BERT-Base | 89.5% | 88.3% | 88.9% | 340 |
| BiLSTM-CRF | 91.8% | 93.1% | 92.4% | 980 |
Table 9. Logic expression templates and implementations.
| Logical Type | Expression Structure | Input Example | Output Example |
|---|---|---|---|
| Causal Condition | IF <Condition Set> THEN <Action Set> | When the vehicle speed > 30 km/h and the brake is not pressed, fast charging is prohibited | IF VehicleSpeed > 30 AND BrakePedal = 0 THEN ChargingPower = 0 |
| State Transition | <Current State> → <Event> → <Target State> | To switch from Parking to Drive, the vehicle speed must be 0 | State(Parking) ON Event(ShiftRequest) WHEN VehicleSpeed = 0 → State(Drive) |
| Mathematical Constraint | <Variable> = <Expression> | Maximum torque T_max = 0.8 × I_bat × V_bus | T_max = 0.8 × I_bat × V_bus |
| Temporal Sequence | AFTER <Event> WITHIN <Δt> DO <Action> | Complete the system self-check within 5 s after power-on | AFTER PowerOn WITHIN 5s DO SelfTest = 1 |
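The Causal Condition template in Table 9 has a regular enough shape that it can be split back into machine-readable parts. The Python sketch below parses the output example of that row into a list of conditions and an action string. It is an illustrative parser written under simplifying assumptions (conditions are joined only by AND and compare a variable with a numeric literal); the function name `parse_causal` and the regular expressions are hypothetical and not taken from the paper.

```python
import re

# One condition: <variable> <operator> <number>, e.g. "VehicleSpeed > 30".
CONDITION = re.compile(r"\s*(\w+)\s*(>=|<=|!=|==|=|>|<)\s*(-?\d+(?:\.\d+)?)\s*")


def parse_causal(expr: str) -> dict:
    """Split 'IF c1 AND c2 ... THEN action' into conditions and an action."""
    match = re.fullmatch(r"\s*IF\s+(.+?)\s+THEN\s+(.+?)\s*", expr)
    if match is None:
        raise ValueError(f"not a causal expression: {expr!r}")
    condition_text, action = match.groups()

    conditions = []
    for part in re.split(r"\s+AND\s+", condition_text):
        cond = CONDITION.fullmatch(part)
        if cond is None:
            raise ValueError(f"unsupported condition: {part!r}")
        var, op, val = cond.groups()
        conditions.append({"var": var, "op": op, "val": float(val)})
    return {"conditions": conditions, "action": action}


parsed = parse_causal(
    "IF VehicleSpeed > 30 AND BrakePedal = 0 THEN ChargingPower = 0"
)
print(parsed["conditions"])
print(parsed["action"])
```

Rejecting anything that does not match the template, rather than guessing, keeps malformed expressions from silently turning into test cases.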
Table 10. Core components of the PySide6-based modeling tool.
| Component | Function Description | Technical Characteristics |
|---|---|---|
| Causal Node | Represents input conditions or output actions | Supports multiple types of nodes: Boolean (True/False), numerical (threshold range), and enumeration (state set) |
| Logic Gate | Defines the logical relationship between conditions | Supports AND/OR/NOT/XOR gates and can be nested (e.g., (A AND B) OR (C AND NOT D)) |
| Constraint Edge | Connects nodes and transmits logical constraints | Supports weight annotation (such as risk level), time delay (Δt), and priority marking |
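The nested gate example in Table 10, (A AND B) OR (C AND NOT D), can also serve to illustrate what MC/DC coverage asks for: each condition must be shown to change the decision outcome on its own. The Python sketch below enumerates, by brute force, the input pairs that differ in exactly one condition and flip the outcome (unique-cause MC/DC). It is a generic illustration, not the path-sensitization procedure of the framework; `decision` and `independence_pairs` are hypothetical names.

```python
from itertools import product

NAMES = ("A", "B", "C", "D")


def decision(a: bool, b: bool, c: bool, d: bool) -> bool:
    """Nested gate example from Table 10: (A AND B) OR (C AND NOT D)."""
    return (a and b) or (c and not d)


def independence_pairs(index: int) -> list:
    """Pairs of inputs that differ only at `index` and flip the decision."""
    pairs = []
    for values in product([False, True], repeat=len(NAMES)):
        if values[index]:
            continue  # start from False so that each pair is listed once
        flipped = list(values)
        flipped[index] = True
        if decision(*values) != decision(*flipped):
            pairs.append((values, tuple(flipped)))
    return pairs


for i, name in enumerate(NAMES):
    pairs = independence_pairs(i)
    print(f"{name}: {len(pairs)} independence pair(s), e.g. {pairs[0]}")
```

Exhaustive enumeration is only practical for a handful of conditions, since the number of input combinations doubles with each one; it is used here because it makes the definition easy to verify by hand.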
Table 11. Performance comparison of test generation methods.
| Indicator | This Method | UML-Based Method [27] | Random Generation Method |
|---|---|---|---|
| Number of Test Cases | 45 | 120 | Over 300 (Unable to Converge) |
| MC/DC Coverage Rate | 100% | 95% | 78% |
| High-risk Path Coverage Rate | 100% | 82% | 65% |
| Average Generation Time (s) | 12.3 | 8.7 | 0.5 |
| Adaptability to Requirement Changes | Incremental Update (Average 2.1 s) | Full Reconstruction (58 s) | Not Applicable |
Table 12. Results of Coverage Analysis.
| Coverage Dimension | This Method | Method A | Method B | ISO 26262 Requirements [28] |
|---|---|---|---|---|
| Statement Coverage | 100% | 98% | 89% | 100% |
| Branch Coverage | 100% | 95% | 76% | 100% |
| MC/DC Coverage | 100% | 88% | 62% | 100% |
| Requirement Traceability Coverage | 100% | 73% | 51% | 95% |
Table 13. Comparison Results of Defect Detection Capabilities.
| Fault Type | Number of Injections | Detected by This Method | Detected by Method A | Detected by Method B |
|---|---|---|---|---|
| Variable Binding Error | 12 | 12 | 9 | 5 |
| Boundary Condition Omission | 8 | 8 | 6 | 3 |
| Abnormal State Transition | 5 | 5 | 4 | 1 |
| Temporal Constraint Conflict | 3 | 3 | 1 | 0 |
Table 14. Efficiency comparison for requirement change adaptation.
| Method | Response Time to Requirement Changes | Reusability Rate of Test Cases |
|---|---|---|
| This Method (Incremental Update) | 2.1 s | 92% |
| UML Reverse Engineering | 58 s | 35% |
| Manual Reconstruction | 6.5 h | 8% |

Wu, G.; Xu, X.; Kang, Y. AI-Driven Automated Test Generation Framework for VCU: A Multidimensional Coupling Approach Integrating Requirements, Variables and Logic. World Electr. Veh. J. 2025, 16, 417. https://doi.org/10.3390/wevj16080417