For the fault types that we defined, we compared injecting faults into the automotive software using ASFIT with injecting them using existing fault injection methods. Figure 8f compares the numbers of faults, out of a total of 54, that can be injected by ASFIT and by the other fault injection methods. ASFIT can inject all 54 faults, whereas the software-based FIT tools can inject 5–23 faults and the hardware-based FIT tools can inject 42 faults. Figure 8a–e shows the results for the 54 faults by error type.
The software-based FIT tools Kayotee, CaNoe, and G-SWFIT can inject 5, 12, and 12 faults, respectively, all related to data errors, as shown in Figure 8a. As shown in Table 4, CaNoe and G-SWFIT can only inject data error faults that change the variables, parameters, or return values used in calling relationships between functions. Kayotee can only inject data error faults for variables used in the software. None of these three tools can inject software faults related to program flow, access, timing, or asymmetric errors.
According to ISO 26262, GRINDER, another software-based FIT tool, can inject bit flips, data type-based corruption, and timing faults for every layer of AUTOSAR [12]. However, bit flips and data type-based corruption faults can be injected only in the calling relationships between functions; data errors for global variables cannot be injected. Moreover, GRINDER cannot inject CPU clock corruption, a timing error fault that is injected through the CPU clock control in the MCAL layer. In addition, GRINDER does not support fault injection for error types other than data and timing errors, namely access, program flow, and asymmetric errors.
TRACE32, GHS Probe, and CW IDE, which inject faults using a hardware debugger, can inject faults at the desired position by adding a breakpoint in the source code [19]. These tools can inject most faults but cannot inject some timing errors or any access and asymmetric errors. In the case of timing errors, they can inject data loss and CPU clock corruption faults by manipulating data after stopping execution at a breakpoint, but they cannot inject faults that occur during execution, such as data delay and no response.
The greatest advantage of ASFIT is that, unlike other methods, it can inject all faults related to access, timing, and asymmetric errors. As automotive software is a hard real-time system composed of various ECUs, the management of shared memory is critical, and delays must be prevented when these ECUs interact. Furthermore, a fault that occurs during execution of the software must be accurately communicated to the other interacting ECUs. Therefore, functional safety must be verified through fault injection for the related access, timing, and asymmetric errors. In the following, we analyze the proposed ASFIT more concretely using particular cases of these faults.
1 Case 1 (ESCL access error): fault ID 7—invalid address fault injection for global variables in the callee of the runnable–runnable integration test level.
The ESCL locks or unlocks the steering wheel using the vehicle’s smart key. After the handle lock state of the vehicle is checked for consistency between the ESCL hardware and the control system, the fault that we injected changes the address of a variable used to supply power to the ESCL to an invalid value. When this fault occurs, normal power supply to the ESCL is impossible because the variable value cannot be read, and the task that contains the runnable related to the ESCL control is rebooted for safety.
This fault is injected in the step when the runnables inside the SWC are integrated. The fault that we inject changes the address of the global variable b_ESCLPowerSupplied, which is used in ESCLPowerSupply(), corresponding to the callee when the runnable EsclConsistencyCheck() calls the runnable ESCLPowerSupply(), as shown in Figure 9.
The call between ESCLConsistencyCheck() and ESCLPowerSupply() occurs when the task FuncOSTask_BSW_FG3_AppModeRequest() is executed. Therefore, the T_wrapper of the corresponding task checks whether the task is the fault injection target, then injects the fault by calling the F_wrapper (①), and finally calls the original task (②). Because an access error is a fault in which code accesses an address for which it does not have access permission, the address of the global variable b_ESCLPowerSupplied is changed to one registered in the protected memory managed by the OS. Then, when ESCLPowerSupply() attempts to access the variable, an access error occurs because it does not have the access permission.
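The wrapper sequence described above can be illustrated with a minimal host-side sketch. It is not the code generated by ASFIT: the names p_power_flag, protected_region, F_wrapper_InvalidAddress(), and the T_wrapper name are hypothetical, and the OS memory protection is only simulated by an address-range check, whereas on the target the protection mechanism of the OS would raise the access error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Global variable named in the text. */
static uint8_t b_ESCLPowerSupplied = 0u;

/* Hypothetical stand-in for a memory area protected by the OS. */
static uint8_t protected_region[16];

/* Hypothetical: address through which the callee reaches the variable. */
static uint8_t *p_power_flag = &b_ESCLPowerSupplied;

static bool fault_injection_enabled = false;
static int access_error_count = 0;

/* Simulated permission check (assumption: the real check is done by the OS). */
static bool address_is_protected(const uint8_t *addr)
{
    uintptr_t a = (uintptr_t)addr;
    uintptr_t lo = (uintptr_t)protected_region;
    return a >= lo && a < lo + sizeof protected_region;
}

/* Callee runnable: uses b_ESCLPowerSupplied to supply power to the ESCL. */
static void ESCLPowerSupply(void)
{
    if (address_is_protected(p_power_flag)) {
        access_error_count++;   /* here the safety mechanism would restart the task */
        return;
    }
    *p_power_flag = 1u;
}

/* Caller runnable. */
static void EsclConsistencyCheck(void)
{
    ESCLPowerSupply();
}

/* Original task that contains the runnable-runnable call. */
static void FuncOSTask_BSW_FG3_AppModeRequest(void)
{
    EsclConsistencyCheck();
}

/* F_wrapper (hypothetical name): redirect the variable to a protected address. */
static void F_wrapper_InvalidAddress(void)
{
    p_power_flag = protected_region;
    assert(address_is_protected(p_power_flag));
}

/* T_wrapper: inject the fault if this task is the target, then run the task. */
void T_wrapper_FuncOSTask_BSW_FG3_AppModeRequest(void)
{
    if (fault_injection_enabled) {
        F_wrapper_InvalidAddress();          /* (1) inject the fault */
    }
    FuncOSTask_BSW_FG3_AppModeRequest();     /* (2) call the original task */
}
```

With fault_injection_enabled left false, the wrapper behaves like the original task; once it is set, the next access in ESCLPowerSupply() is counted as an access error instead of writing the variable.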
To check the results of fault injection using ASFIT, we operated LEDs when the fault injection code was executed in an environment wherein the automotive software operated. When a fault is not injected and the operation is normal, the initial state of the LED (all LEDs are on) is maintained, and when a fault is injected, a specific LED is turned off in the fault injection function. As with the initial state of the LED, all LEDs return to the on state when the target board is rebooted or when the LED control function is called within the safety mechanism.
Figure 10a shows that the third LED from the left is turned off to confirm the occurrence of an access error due to fault injection. After the fault was injected, the task was restarted, and all the LEDs turned on, as shown in Figure 10a. This finding shows that the safety mechanism that we defined works well.
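The LED-based observation can be sketched as a small bit mask. This is only an illustration under assumptions: the number of LEDs (LED_COUNT), the variable led_state standing in for the LED port of the target board, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define LED_COUNT   8u      /* hypothetical number of LEDs on the board */
#define ALL_LEDS_ON 0xFFu   /* initial state: all LEDs are on */

/* Hypothetical stand-in for the LED port register of the target board. */
static uint8_t led_state = ALL_LEDS_ON;

/* Called from the fault injection function: turn off one LED,
 * counted from the left starting at 1. */
void led_mark_fault(unsigned index_from_left)
{
    assert(index_from_left >= 1u && index_from_left <= LED_COUNT);
    led_state &= (uint8_t)~(1u << (LED_COUNT - index_from_left));
}

/* Called on reboot or from the safety mechanism: all LEDs on again. */
void led_restore_all(void)
{
    led_state = ALL_LEDS_ON;
}

/* Returns 1 if the given LED (1-based, from the left) is on, else 0. */
int led_is_on(unsigned index_from_left)
{
    assert(index_from_left >= 1u && index_from_left <= LED_COUNT);
    return (int)((led_state >> (LED_COUNT - index_from_left)) & 1u);
}
```

In this sketch, calling led_mark_fault(3) corresponds to the third LED from the left being turned off when the access error is injected, and led_restore_all() corresponds to the state after the task is restarted.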
The existing software-based fault injection methods do not support changing the memory area protected by the OS, and with the hardware-based fault injection methods it is difficult to find the memory area protected by the OS without the complete source code of the OS. In contrast, ASFIT can perform this fault injection because it finds the protected memory area of the OS from the binary when the fault injection code is generated.
2 Case 2 (ESCL timing error): fault ID 13—data delay fault injection for the parameter of the caller at the SWC–RTE integration test level.
The ESCL locks or unlocks the steering wheel using the vehicle’s smart key. The fault that we injected delays the value of the data exchanged during the communication that checks consistency between the ESCL hardware and the control system.
This fault is injected in the step when the RTE function is called by a runnable inside the SWC, at the integration test level of the SWC and the RTE layer. This fault delays the transmission of the value of the parameter l_ESCLUnlock transmitted from the caller EsclControl() when the runnable EsclControl() calls Rte_Write_P_ConsistencyCheck_L_ESCLUnlock(). When this fault occurs, consistency between the ESCL hardware and the control system cannot be checked, and the safety mechanism that stops power to the ESCL until consistency can be confirmed is activated.
Figure 11 shows the fault injection process in detail. The T_wrapper of a task that includes a SWC–RTE call, which is the fault injection position, injects a fault (①) by calling the F_wrapper, and then calls the original task (②). At this time, the F_wrapper TE_Delay() activates the fault by changing the address of the RTE function called by the SWC–RTE calling code to the address of Delay().
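The redirection performed by TE_Delay() can be sketched with a function pointer. The sketch below rests on several assumptions: the call site is modeled by the hypothetical pointer rte_write_target, the RTE function is simplified to return void, the delay is simulated by a tick counter (elapsed_ticks, DELAY_TICKS) rather than real waiting, and Delay() is assumed to forward the value after the delay. The names T_wrapper_EsclTask() and EsclTask() are also hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DELAY_TICKS 50u   /* hypothetical length of the injected delay */

typedef void (*rte_write_fn)(uint8_t value);

static uint8_t consistency_port = 0u;   /* value seen by the receiver */
static uint32_t elapsed_ticks = 0u;     /* simulated time spent before the write */
static bool delay_fault_enabled = false;

/* RTE function named in the text (simplified stand-in returning void). */
static void Rte_Write_P_ConsistencyCheck_L_ESCLUnlock(uint8_t value)
{
    consistency_port = value;
}

/* Delay(): assumed to wait and then forward the value to the RTE function. */
static void Delay(uint8_t value)
{
    elapsed_ticks += DELAY_TICKS;
    Rte_Write_P_ConsistencyCheck_L_ESCLUnlock(value);
}

/* Hypothetical model of the address used by the SWC-RTE calling code. */
static rte_write_fn rte_write_target = Rte_Write_P_ConsistencyCheck_L_ESCLUnlock;

/* Caller runnable: transmits the parameter l_ESCLUnlock through the RTE. */
static void EsclControl(void)
{
    uint8_t l_ESCLUnlock = 1u;
    assert(rte_write_target != NULL);
    rte_write_target(l_ESCLUnlock);
}

/* Original task containing the SWC-RTE call (hypothetical name). */
static void EsclTask(void)
{
    EsclControl();
}

/* F_wrapper: replace the address of the RTE function with that of Delay(). */
static void TE_Delay(void)
{
    rte_write_target = Delay;
}

/* T_wrapper: inject the fault if enabled, then call the original task. */
void T_wrapper_EsclTask(void)
{
    if (delay_fault_enabled) {
        TE_Delay();     /* (1) inject the fault */
    }
    EsclTask();         /* (2) call the original task */
}
```

The value still reaches the receiver in this sketch, but only after the simulated delay, which is the condition under which the consistency check described above can no longer be completed in time.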
As in Case 1, to check the results of fault injection by ASFIT, we modified the fault injection code such that it would control the LEDs when the code was executed in an environment wherein the automotive software was operated. Furthermore, as in Case 1, the LEDs were turned off and the ESCL was rebooted after the fault injection. All the LEDs were then turned on, indicating that our proposed design for the safety mechanism works well.
This fault can be injected by software-based fault injection but not by hardware-based fault injection. As mentioned above, the hardware-based fault injection method stops execution in order to inject faults. However, unlike a change in the value of allocated memory or a register, a delay requires additional executable code, which in turn requires rebuilding the software.
3 Case 3 (asymmetric error): fault ID 40—asymmetric value fault injection for the return statement of the callee at the SWC–BSW integration test level.
The VCU performs control related to vehicle driving, such as control of the motor and the steering wheel. The fault that we injected targets the call to a function that reports an error to the entire system when an error occurs. At this time, instead of the same value being sent to every recipient, the error value of one call is changed. When this fault occurs, the system sequentially stops operations and shuts down.
This fault occurs at the integration test level of the SWC and the BSW layer. The fault that we injected occurs in the SWC, as shown in Figure 12. When Det_ReportError() is called, the error value transmitted from the callee Det_ReportError() to another SWC or the ECU is changed uniquely for a specific SWC. When this fault occurs, the safety mechanism that stops all the operations sequentially and shuts down the system is activated. It is difficult to confirm the operation of the safety mechanism from the LEDs alone. Thus, in this case, a serial port was connected to the target for debugging to verify the operation of the safety mechanism.
The ASFIT fault injection process for this case is as follows. In the case of an asymmetric error, the T_wrapper is generated for the function of the BSW layer that communicates a fault or event to the entire system when the fault or event occurs anywhere in the system. In Case 3, the target for generating the T_wrapper is Det_ReportError(), which is a function that communicates a fault to other SWCs and the ECU. The T_wrapper _wrap_Det_ReportError() calls the F_wrapper (①) to inject a fault and then calls the original function (②). At this time, the fault injection function ASE_value() generates an asymmetric error by changing the error value for a specific caller.
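The asymmetric value injection can be sketched as follows. This is a simplified host-side illustration, not the generated ASFIT code: Det_ReportError() is replaced by a stand-in that records the reported value per module, the calling SWC is assumed to be identified by ModuleId, and NUM_SWC, TARGET_SWC, CORRUPTED_ERROR, the parameters of ASE_value(), and report_from_all_swcs() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_SWC         3u      /* hypothetical number of reporting SWCs */
#define TARGET_SWC      1u      /* hypothetical SWC that receives the altered value */
#define CORRUPTED_ERROR 0xEEu   /* hypothetical altered error value */

static uint8_t reported_error[NUM_SWC];   /* value communicated per SWC */
static bool asym_fault_enabled = false;

/* Simplified stand-in for the original error report function. */
static uint8_t Det_ReportError(uint16_t ModuleId, uint8_t InstanceId,
                               uint8_t ApiId, uint8_t ErrorId)
{
    (void)InstanceId;
    (void)ApiId;
    if (ModuleId < NUM_SWC) {
        reported_error[ModuleId] = ErrorId;
    }
    return 0u;
}

/* F_wrapper: change the error value only for one specific caller. */
static uint8_t ASE_value(uint16_t ModuleId, uint8_t ErrorId)
{
    return (ModuleId == TARGET_SWC) ? (uint8_t)CORRUPTED_ERROR : ErrorId;
}

/* T_wrapper: inject the fault if enabled, then call the original function. */
uint8_t _wrap_Det_ReportError(uint16_t ModuleId, uint8_t InstanceId,
                              uint8_t ApiId, uint8_t ErrorId)
{
    if (asym_fault_enabled) {
        ErrorId = ASE_value(ModuleId, ErrorId);                     /* (1) */
    }
    return Det_ReportError(ModuleId, InstanceId, ApiId, ErrorId);   /* (2) */
}

/* Hypothetical driver: every SWC reports the same error value. */
void report_from_all_swcs(uint8_t error_id)
{
    /* The altered value must differ, or the asymmetry would be invisible. */
    assert(error_id != CORRUPTED_ERROR);
    for (uint16_t m = 0u; m < NUM_SWC; m++) {
        (void)_wrap_Det_ReportError(m, 0u, 0u, error_id);
    }
}
```

Without the fault, every entry of reported_error holds the same value after report_from_all_swcs(); with the fault enabled, only the entry of TARGET_SWC differs, which is the asymmetric condition described in the text.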
Other fault injection methods can inject a fault that sends the same value to the entire system. However, neither hardware-based nor software-based fault injection can send a different value to only part of the system in a situation where the same value must be sent throughout the system. ASFIT makes asymmetric fault injection possible by injecting the fault uniquely into a specific calling relationship of the error report function.