Embodied Visual Perception for Driver Fatigue Monitoring Systems: A Hierarchical Decoupling Framework for Robust Fatigue Detection and Scenario Understanding

Chen, Siyu; Huang, Juhua; Liu, Yinyin; Ye, Saier; Bai, Yuqi

doi:10.3390/electronics15030689

Open AccessArticle

Embodied Visual Perception for Driver Fatigue Monitoring Systems: A Hierarchical Decoupling Framework for Robust Fatigue Detection and Scenario Understanding

by

Siyu Chen

^1,*,

Juhua Huang

¹,

Yinyin Liu

²,

Saier Ye

³ and

Yuqi Bai

^4,*

¹

School of Advanced Manufacturing, Nanchang University, Nanchang 330031, China

²

School of Economics and Management, Nanchang University, Nanchang 330031, China

³

Beijing Kuangshi Technology Co., Beijing 100000, China

⁴

School of Environment, Education and Development, The University of Manchester, Manchester M13 9PL, UK

^*

Authors to whom correspondence should be addressed.

Electronics 2026, 15(3), 689; https://doi.org/10.3390/electronics15030689

Submission received: 17 December 2025 / Revised: 25 January 2026 / Accepted: 27 January 2026 / Published: 5 February 2026

(This article belongs to the Topic Unmanned Vehicles Technology and Embodied Intelligence Systems for Intelligent Transportation)

Download

Browse Figures

Versions Notes

Abstract

As intelligent vehicle technologies evolve, reliable driver monitoring systems have become increasingly critical for ensuring the safety of human drivers and operational reliability. This paper proposes a novel visual computing framework for Driver Fatigue Monitoring Systems (DFMSs) based on hierarchical decoupling and scenario element analysis, specifically designed for intelligent transportation environments. By treating the monitoring system as an engineering-level embodied perception–decision system deployed within the vehicle, rather than a purely disembodied vision module, the framework decouples low-level algorithmic perception from application-layer decision logic, enabling a more granular evaluation of visual computing performance in real-world scenarios. We leverage Python 3.9-driven automated test case generation to simulate diverse environmental variables, improving testing efficiency by 50% over traditional manual methods. The system utilizes deep learning-based visual computing to achieve high-fidelity monitoring of eye closure (PERCLOS, EAR), yawning (MAR), and head pose dynamics, enabling real-time assessment of the driver’s state within the embodied system loop. Comparative benchmarking reveals that our framework significantly outperforms existing models in visual understanding accuracy, achieving perfect confidence scores (1.000) for eye closure and smoking behavior detection, while drastically reducing false positives in mobile phone usage detection (misidentification rate: 0.016 vs. 0.805). These results demonstrate that an embodied approach to visual perception enhances the robustness and reliability of driver monitoring systems deployed in real vehicles, providing a scalable pathway for the development of next-generation intelligent transportation safety standards.

Keywords:

hierarchical decoupling; scenario elements; driver state monitoring; testing methodology research; computer vision

1. Introduction

In recent years, with continuous advancements in automotive technology, per capita vehicle ownership has steadily increased. Yet, human factors account for 80% to 90% of traffic accidents [1,2], with a significant proportion attributed to driver fatigue. Notably, the likelihood of accidents caused by fatigued driving is four to six times higher than that under normal driving conditions [3]. Consequently, the development of efficient and reliable Driver Fatigue Monitoring Systems (DFMSs) has become an urgent necessity [4].

Driver state monitoring systems leverage advanced technologies to assess drivers’ physiological conditions, attention levels, and emotional states in real time, thereby enhancing driving safety and comfort [5,6]. Earlier systems relied on seat pressure or steering torque sensors, but their limited functionality could not meet the complexity of modern driving environments [7,8]. With advancements in automotive electronics, monitoring technologies have shifted toward vision-based approaches, using in-cabin cameras and infrared sensors combined with deep learning algorithms to analyze facial and posture information. These systems extract key features such as the percentage of eyelid closure over time (PERCLOS), blink frequency, eye aspect ratio (EAR), mouth aspect ratio (MAR), head posture, and risky behaviors [9,10,11,12,13,14,15,16]. Machine learning models then process these features to identify and predict driver fatigue states in real time, enabling timely warnings to prompt drivers to rest. Meanwhile, data privacy has become a critical consideration, requiring strict encryption and anonymization measures during data transmission and storage to mitigate potential risks [17,18,19,20].

Driver Monitoring Systems (DMSs) are increasingly being incorporated into automotive safety standards worldwide [21]. In China, installation of DMS has been mandated in commercial vehicles since 2018, with full implementation across new vehicles expected by 2027. In 2020, the U.S. National Transportation Safety Board recommended DMS as an effective safety measure for Level 2 advanced driver assistance systems [22]. In 2023, the European New Car Assessment Programme (E-NCAP) designated active safety systems with DMS as a key criterion for achieving a five-star rating [23], and in 2024, the European Union’s General Safety Regulation (GSR) required all new vehicles sold in the EU to be equipped with driver fatigue monitoring systems [24]. Collectively, these developments indicate that DMS will become a mandatory component of active safety systems worldwide in the near future.

In addition to regulatory progress, recent academic studies further highlight the global trend toward integrating multimodal monitoring approaches. For example, Guo et al. (2023) demonstrated that combining multi-source data streams significantly enhances vigilance estimation accuracy [5]; Qu et al. (2024) provided a comprehensive review of vision-based driver monitoring techniques and emphasized the importance of deep learning for fatigue recognition [20]; and Yang et al. (2024) analyzed the effect of DMS design on drivers’ mental stress and attention in car-sharing scenarios, offering insights into user acceptance and human factors [21]. These recent contributions show that while this study aligns with ongoing international efforts, it diverges by proposing a novel testing framework that emphasizes hierarchical decoupling and automated scenario generation, thereby addressing current gaps in standardization and test efficiency.

Despite these advances, both academia and industry still face significant challenges in the testing and evaluation of Driver Monitoring Systems (DMSs). Most existing studies primarily focus on improving fatigue recognition algorithms or reporting performance on specific datasets, while systematic testing frameworks for DMSs remain relatively underexplored. In practice, evaluation is often conducted through end-to-end vehicle tests or fixed benchmark datasets, where algorithmic perception accuracy, system integration logic, and human–machine interaction effects are tightly coupled. This coupling makes it difficult to isolate the root causes of performance degradation or to conduct fair and reproducible comparisons across different systems. Moreover, standardized testing methodologies that explicitly account for scenario diversity—such as lighting conditions, driver behaviors, and demographic variability—are still lacking, resulting in limited comparability between suppliers and vehicle models.

In this work, the term embodied is used from an engineering systems perspective. Specifically, the DFMS is regarded as an in-vehicle system in which visual perception, vehicle context, decision logic, and human–machine interaction are physically coupled and evaluated as a closed perception–decision loop. This study does not aim to formulate a cognitive agent or learning-based policy, but rather focuses on testing and validating how such embodied perception and decision components interact under real deployment constraints.

To address these challenges, this study proposes a testing framework based on hierarchical decoupling and scenario elements, which differs fundamentally from prior fatigue recognition and evaluation studies. Rather than introducing a new detection algorithm, this framework explicitly separates algorithm-level visual perception evaluation from application-level functional validation, enabling targeted and interpretable testing at different system layers. Scenario elements are formalized to abstract driving behaviors, environmental conditions, and driver characteristics into controllable variables for automated test case generation. Combined with multidimensional validation methods—including virtual simulation, functional testing, subjective evaluation, and benchmarking—the proposed framework provides a structured, scalable, and reusable approach for DFMS evaluation, supporting both standard development and iterative product optimization.

In summary, the main contributions of this study are as follows:

(1): Proposition of a hierarchical decoupling testing framework for DFMSs, which explicitly separates algorithm-level perception evaluation from application-level functional validation, addressing an underexplored gap in existing DFMS testing methodologies;
(2): Introduction of the scenario elements method, driving the automation of test case generation and improving testing coverage;
(3): Employment of multidimensional validation methods, demonstrating superior performance in key behavior detection and false positive rate control compared to competing products;
(4): Providing practical guidance for the standardized development of DFMS and offering feasibility studies for system optimization across different vehicle types and operational environments.

The remainder of the paper is structured as follows: Section 2 reviews the background and limitations of existing DFMS testing methods; Section 3 provides a detailed explanation of the proposed testing framework and its core components; Section 4 presents experimental design, results, and analysis; Section 5 discusses the limitations of the research and concludes the paper.

2. Driver Fatigue Monitoring System

Building on the challenges highlighted in the introduction, this section details the architecture and functionality of the proposed Driver Fatigue Monitoring System (DFMS) to address driver fatigue effectively.

Driver monitoring technology has rapidly evolved, transitioning from indirect methods such as seat pressure sensors and steering wheel torque sensors to direct monitoring approaches based on visual imaging technology. By leveraging visual image perception technology to capture in-cabin driver facial and posture information, this method enables more intuitive and efficient identification of driver fatigue and abnormal behaviors, establishing it as the mainstream technical approach in the field of driver monitoring today [6,25]. Figure 1 illustrates the in-cabin visual image monitoring interface.

2.1. Hardware Product Terminal

The hardware terminal of the Driver Fatigue Monitoring System (DFMS) primarily consists of three main components: the main control board, the camera module, and the infrared supplementary light. The main control board employs a highly integrated, computationally powerful AI System-on-Chip (SoC), with the AX620A chip at its core. The camera module is equipped with an image sensor based on the OV05B1S chip, enabling real-time capture of driver images and transmission to the main control board for data processing. The infrared supplementary light (IR LED) ensures stable acquisition of high-quality driver facial images under various lighting conditions. A schematic diagram of the hardware terminal is shown in Figure 2.

The proposed DFMS is implemented on an embedded AI System-on-Chip (SoC) platform commonly used for in-cabin driver monitoring applications, integrating image signal processing and neural network inference within a single control unit. This deployment configuration is representative of current production-grade DMS or cockpit domain controllers used in commercial vehicles. As the system does not rely on discrete GPUs or external computing units, but instead operates entirely on an embedded SoC together with a standard in-cabin camera and infrared illumination, the hardware resource requirements are compatible with those of existing commercial vehicle platforms that already support driver monitoring or occupant sensing functions.

2.2. Functional Design

The DFMS continuously monitors the driver’s eye and mouth states during driving through in-cabin cameras. The collected eye and mouth state data are combined with vehicle behavior states for comprehensive analysis to determine driver fatigue levels. The fatigue monitoring alarm system is categorized into none, low, medium, and high levels, with alarm levels determined based on eye closure detection, yawn detection, and vehicle behavior state analysis [26].

The overall functional workflow of the DFMS is illustrated in Figure 3. Upon vehicle startup and power-on, the system synchronously activates the camera (Camera Activate), the main control module (Control Activate), and the in-vehicle system (Car Bot). Once all three components are successfully activated, the system notifies the driver via the interface with the message, “The Cockpit Care system has been activated successfully,” and enters continuous monitoring mode.

In the active state, the system prompts the driver to perform eye closure or yawning actions to verify real-time monitoring functionality. When behaviors indicative of fatigue, such as eye closure or yawning, are detected, the system immediately conducts relevant data analysis. Upon identifying eye closure actions (eye opening and closing test) or yawning actions (yawn detection), the system triggers an alarm mechanism, displaying a fatigue warning image on the interface accompanied by an auditory alert to ensure the driver takes appropriate action promptly.

Additionally, the system includes an option for the driver to manually disable the fatigue monitoring function. When the driver selects this option via the “Cockpit Care” interface, the system issues a reminder: “The fatigue monitoring function is turned off, please drive safely”.

If any component, such as the camera or control module, fails to activate properly during the startup process, the system issues a warning to the driver, displaying the message, “The Cockpit Care system has failed to activate,” and advises the driver to check the system status and ensure safe driving.

This functional design fully embodies the human–machine interaction characteristics of the fatigue monitoring system, balancing the driver’s active intervention with automated system monitoring. Through clear and real-time alarm strategies, it effectively enhances driving safety and the reliability of the user experience.

2.3. Algorithmic Software Architecture

The algorithmic software architecture of the DFMS adopts a modular and hierarchical design, including image acquisition, image preprocessing, facial keypoint detection, posture feature analysis, and fatigue state determination, as shown in Figure 4. The processing steps are logically organized, starting with the initialization of the system and image acquisition, followed by face frame evaluation, head pose analysis, occlusion and fatigue detection modules, and culminating in the calculation of attention and fatigue levels. This structured representation helps clarify the modular nature of the system and how different algorithmic components interact to generate comprehensive driver state assessments.

To facilitate the readability of the complex workflow illustrated in Figure 4, the DFMS algorithmic software architecture can be interpreted as a sequence of four functional stages. First, image acquisition and preprocessing modules initialize the system, configure parameters, and generate normalized facial data from raw RGB and infrared inputs. Second, face frame validation and posture-related computations (e.g., face frame judgment, head angle calculation, and yaw estimation) determine whether subsequent feature extraction is activated. Third, a set of core perception submodules—highlighted in the dashed block in Figure 4, including Face-PB, OCC-EyeOC, MouthOC, RC, and Gaze—extract facial posture, eye state, mouth state, risky behaviors, and gaze information. Finally, these intermediate outputs are fused at the application layer to compute attention- and fatigue-related scores and warning logic, resulting in the overall fatigue level assessment and corresponding user alerts.

Although the DFMS is implemented as an integrated in-vehicle system, the architecture in Figure 4 represents a composition of algorithmic perception modules and data-processing stages. Each module operates on well-defined inputs (e.g., RGB/IR images, facial keypoints, vehicle signals) and produces intermediate algorithmic outputs (e.g., eye state probabilities, mouth opening scores, head pose angles, gaze coordinates), which are subsequently fused through rule-based logic for higher-level driver state inference. This abstraction allows individual algorithms and data-processing steps to be independently evaluated and analyzed, consistent with standard practices in computer vision–based driver monitoring research.

During system operation, the DFMS camera first captures infrared (IR) and RGB images containing the driver’s face. Upon receiving a DFMS request, the system performs software initialization and parameter configuration, followed by preprocessing to output facial image data.

After preprocessing, the system employs a Face Detection three-Pack Algorithm to output the face frame, face orientation, and face keypoints. If no face is detected in the image, the system proceeds to process the next frame; if a face frame is successfully detected, the system conducts a Driver Face Frame Judgment. Upon confirming the driver’s face frame, it performs posture calculations (Face Frame Position Calculation) and retrieves real-time vehicle information (e.g., vehicle speed, door lock status) through Vehicle Status Judgment to enhance fatigue monitoring accuracy with dynamic vehicle data.

Subsequently, the system uses vehicle state information, such as steering wheel angle and turn signal status, to execute Face-based Frame Cropping for precise extraction of the driver’s facial region. The system then activates multiple submodules for feature analysis, including:

Face-PB: Head posture and image blur detection algorithm (Pose and Blur), outputting face angle data and image clarity scores.
OCC-EyeOC: Eye occlusion and closure state detection module, identifying eye closure levels (e.g., left eye open, right eye open, or closed).
MouthOC: Mouth state detection module, calculating mouth opening scores.

Gaze: Gaze detection module, outputting driver gaze coordinate data.

RC: Risky behavior detection module, identifying dangerous behaviors such as smoking or phone use.

The system integrates these detection results for advanced feature fusion and state evaluation, including head angle calculation, head corner area calculation, right-left status calculation, and head-down status calculation. It further calculates the field of vision using driver gaze coordinates to determine the attention level calculation. Based on comprehensive features such as eye closure states, yawning actions, mouth opening conditions, head posture, and risky behavior scores, the system performs Next Frame Fatigue State Calculation to determine the real-time driver fatigue level and conducts Risky Behavior Calculation based on risk scores.

This software architecture, through modular design, precise algorithmic processing, and multi-source feature fusion, ensures the real-time performance and high reliability of driver state monitoring, providing a solid technical foundation for subsequent fatigue warnings and safety interventions.

3. Testing Methods and Test Case Extraction

Having described the design of the Driver Fatigue Monitoring System (DFMS) in the previous section, this section introduces the testing methods and case extraction strategies used to rigorously evaluate its performance. From a methodological perspective, this section formalizes DFMS testing as a data-driven evaluation problem, where algorithmic perception accuracy and application-level decision logic are assessed through structured scenario decomposition and controlled data processing pipelines.

3.1. Hierarchical Decoupling Testing

To address the complex hardware–software interaction characteristics of the Driver Fatigue Monitoring System (DFMS), this study adopts a hierarchical decoupling testing approach to verify and evaluate system functionality with greater clarity and efficiency. In contrast to existing DFMS testing approaches, which often rely on integrated end-to-end vehicle testing that conflates both algorithmic performance and application outcomes, our approach separates these layers, offering more targeted and efficient testing. The hierarchical decoupling method enables precise localization of issues, which is often challenging in integrated testing approaches. Additionally, our approach incorporates a scenario elements method that allows for automated, comprehensive test case generation, reducing the need for manual test creation and ensuring broader scenario coverage. These innovations significantly enhance both the efficiency and accuracy of DFMS validation compared to existing methods, which often lack scalability and scenario diversity.

Hierarchical design and decoupling are widely used in software engineering, sharing common principles but emphasizing different aspects [27]. Hierarchical design decomposes complex systems into modular layers with clearly defined functions, improving modularity, reducing complexity, and enhancing maintainability and scalability [28]. Decoupling minimizes dependencies between modules, increasing independence and flexibility. This improves maintainability, facilitates targeted testing, and enables independent updates or replacements, thereby enhancing overall system reliability [29,30,31].

The DFMS, built on an embedded system, relies on efficient collaboration between hardware modules and software algorithms [32]. The hierarchical decoupling testing method allows for targeted independent testing of each component and precise evaluation of their interaction, ensuring overall reliability and stability. To illustrate this process more intuitively, Figure 5 provides an overview of the end-to-end testing framework, showing the sequential data flow between the Sensor, Algorithm, Engineering, and Testing Teams. The Sensor Team captures and transmits raw video data, the Algorithm Team processes it using detection and recognition models, the Engineering Team integrates the processed outputs into system functions, and the Testing Team evaluates the complete system through functional performance and user experience assessments. This structured flow reflects the collaborative yet modular nature of the approach, ensuring each team focuses on its specific role while maintaining overall system coherence.

Furthermore, the system development involves multiple technical teams, each with distinct responsibilities and testing tasks. To ensure the final user experience of the system, this study proposes a hierarchical decoupling testing strategy tailored for end-to-end testing scenarios:

①: Clear Definition of Team Responsibilities for Collaborative Hierarchical Testing

The Sensor Team validates video data quality; the Algorithm Team verifies model accuracy, real-time performance, and processing pipelines; the Engineering Team tests functional logic and interfaces; and the Testing Team conducts comprehensive end-to-end evaluations to ensure design objectives and user requirements are met.

②: Decoupling of Algorithmic and Application Layers for Targeted Precision Testing

Algorithm and application layers are tested separately. The Algorithm Team assesses models for face detection, head posture, eye state, and behavior recognition, while the Testing Team verifies application-layer functions such as driver state judgment, occlusion recognition, and fatigue/attention/risk alert logic.

③: End-to-End Black-Box Testing to Ensure Overall User Experience

After independent tests, black-box testing evaluates overall functionality, responsiveness, and user experience in realistic scenarios.

Through hierarchical decoupling, this approach reduces module coupling, improves fault localization, and enhances process efficiency, strongly supporting DFMS development and optimization.

3.2. Scenario-Based Test Cases

In product-level testing of the DFMS, black-box testing is extensively utilized to validate system functionality and performance. Particularly in handling complex or specialized application scenarios, black-box testing enables the testing team to comprehensively assess the system’s user experience under diverse environments and conditions [33]. Traditionally, test scenario cases are manually crafted based on prior experience and subjective judgment, which is often inefficient and prone to incomplete scenario coverage. To overcome these limitations, this study proposes a structured automated script-based method for generating test scenarios. By leveraging standardized scenario decomposition and Python programming, this approach ensures comprehensive scenario coverage, significantly improving the efficiency and accuracy of test case generation.

In this method, test scenario cases are organized through systematic structural decomposition, encompassing three dimensions: testing pipeline (testing behaviors and duration), testing elements (environment, accessories, occlusions, etc.), and testing subjects (gender, ethnicity, age, etc.). Each dimension includes various variables and factors, which are combined to generate specific test cases. For instance, as shown in the diagram, testing actions and durations form the core of testing behaviors, covering combinations such as left gaze, right gaze, and different durations. Environmental factors account for variations in lighting conditions and dynamic versus static scenes, including representative challenging conditions such as rapid illumination transitions, strong backlighting, and motion-induced image perturbations, which are commonly encountered in real-world driving environments. Accessories and occlusions include factors such as glasses and hats, while testing subjects cover differences in driver gender, ethnicity, and age (as shown in Figure 6).

The creation of test scenario cases is based on multiple scenario elements, each containing various variables. The different combinations of these variables form diverse test cases, ensuring the breadth and completeness of testing. This approach not only enhances the comprehensiveness of test coverage, but also effectively reduces biases that may arise from manual test case creation, thereby improving the reliability of test results. By utilizing this structured Python script-based automated test scenario-generation method, testers can efficiently generate test cases covering diverse usage scenarios in a short time, ensuring stable system performance that meets design expectations across varied environments.

3.3. Scenario Extraction Algorithms

Product-level system test cases are typically constructed and derived based on scenario elements. Traditional test case creation relies heavily on the experience of testers, which is often inefficient and prone to incomplete coverage. To address these challenges, this study employs three commonly used algorithms from mathematical probability theory—the Cartesian product algorithm, orthogonal analysis algorithm, and pairwise analysis algorithm—to investigate methods for test scenario extraction [34,35,36].

3.3.1. Cartesian Product Algorithm

The Cartesian product algorithm is a widely adopted method in product-level system testing. It generates various test scenario cases by arranging combinations of different input test elements, enabling the identification of defects and issues in the system under diverse operating conditions [34,37,38]. Each test scenario case covers distinct input test elements, thereby verifying the correctness and stability of the system across various conditions [39].

Given several sets

A_{1}, A_{2}, \dots, A_{n}

, the result of the Cartesian product is all possible ordered combinations. The mathematical expression is as follows:

A_{1} \times A_{2} \times \dots \times A_{n} = {(a_{1}, a_{2}, \dots, a_{n}) ∣ a_{i} \in A_{i}, \forall i \in {1, 2, \dots, n}}

(1)

Suppose there are two sets

A = {a, b}

and

B = {1, 2}

, then the Cartesian product is as follows:

A \times B = {(a, 1), (a, 2), (b, 1), (b, 2)}

(2)

For the driver fatigue monitoring system, the Cartesian product generates diverse scenarios, including lighting conditions, driving behaviors, and driver characteristics.

Although the Cartesian product algorithm is suitable for the virtual simulation test stage, it will generate all possible combinations of elements, resulting in a very large number of test scenarios, which may cause inefficiency and excessive resource consumption in actual testing. Therefore, in real-vehicle testing, the number of test scenarios should be reduced as much as possible while ensuring full coverage to reduce test time and cost. To this end, orthogonal analysis and pairwise analysis can be used as more optimized options.

3.3.2. Orthogonal Analysis Algorithm

The orthogonal analysis algorithm is a testing approach based on mathematical probability statistics, which is used to effectively select the number of test scenarios that cover various combinations of system-level elements. It can realize comprehensive testing of the product system in smaller test scenario cases to find potential problems and optimization [40,41]. Orthogonality means that in each test scenario case, the combination of each test element can appear evenly and balanced to achieve the highest possible coverage of test scenario cases [42].

Orthogonal analysis is based on the design concept of orthogonal array. Orthogonal array is a mathematical structure used to generate a balanced arrangement of each factor in the combination so that every combination of two factors can appear in all test cases. Given an orthogonal array

L_{m}

, its structure is as follows:

L_{m} (v_{1}, v_{2}, \dots, v_{k}) = [\begin{matrix} a_{11} & a_{12} & \dots & a_{1 k} \\ a_{21} & a_{22} & \dots & a_{2 k} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ a_{m 1} & a_{m 2} & \dots & a_{m k} \end{matrix}]

(3)

where

m

is the number of test cases, and

v_{1}, v_{2}, \dots, v_{k}

represents the level of each factor, ensuring that the combination of factors is evenly distributed in each row (i.e., each test case).

The orthogonal analysis algorithm generates balanced combinations of driving behaviors, environments, and driver characteristics, ensuring even distribution of factor levels.

This method can significantly reduce the number of test cases while ensuring that all possible combinations are covered. It is especially suitable for situations with many test dimensions and complex combinations.

3.3.3. Pairwise Analysis Algorithm

The pairwise analysis algorithm is an algorithm used to determine the matching relationship between two or more elements. It is usually used to solve combinatorial optimization problems, that is, pairing a set of elements to form non-repeating combinations [36]. The pairwise algorithm is based on the following two assumptions:

①: Each dimension is orthogonal, that is, each dimension has no intersection with each other;
②: According to mathematical statistical analysis, 73% of defects (35% for single factor and 38% for double factor) are caused by the interaction of a single factor or two factors, and 17% of defects are caused by the interaction of three factors.

The goal of pairwise analysis is to select a combination of certain key factors in the test case to maximize the chance of defect exposure while reducing the number of test cases. The mathematical expression is as follows:

Pairwise Test Coverage = \sum_{i, j \in Factors} (i, j) \in Test Case Combinations

(4)

Pairwise analysis generates the minimum number of test cases by combining test factors, making it efficient for system testing.

To clarify the trade-offs among the three scenario-generation approaches, Table 1 compares their time/space complexity and notes practical limitations. Let

k

denote the number of factors and

s_{i}

the number of levels for factor

i

(with

s = \max_{i} s_{i}

). The total number of exhaustive combinations is

N_{C P} = \prod_{i = 1}^{k} s_{i}

.

In practice, the exhaustive Cartesian product guarantees full coverage but scales exponentially with

k

. Orthogonal designs reduce the test set to

O (s^{t})

scale while preserving t-wise balance when a suitable array exists. Pairwise covering arrays typically yield the smallest sets and near-linear generation time, making them preferable for resource-constrained, on-road validation; higher-order risks can be mitigated by augmenting with a small number of targeted cases. The full Python scripts for the above three algorithms have been placed in the Supplementary Material (S1, ZIP).

The computational complexity of the automated test case generation process is a key consideration when dealing with large datasets and numerous test variables. Our Python scripts rely on efficient algorithms such as the Cartesian product, orthogonal analysis, and pairwise analysis to generate test cases. However, the computational requirements vary significantly depending on the size of the input datasets, the number of variables, and the complexity of the test scenarios. Specifically, the Cartesian product algorithm, while comprehensive, can lead to a high number of combinations, which increases processing time. For instance, generating a test case set with 10 variables and 5 levels per variable using the Cartesian product method required approximately 4 h on a typical desktop setup (Intel i7, 16 GB RAM). In contrast, both the orthogonal analysis and pairwise analysis algorithms reduce the number of test cases, improving efficiency, with average processing times for similar configurations reduced to 1.5 h. Additionally, as the number of test variables increases, or when high-resolution images are used for scenario simulations, the computational load increases. In our current setup, high-resolution image processing (e.g., 1920 × 1080 pixel RGB images) adds approximately 2 h to the total test case generation time. To improve efficiency in real-time applications, we aim to optimize the script by exploring parallel processing and algorithmic enhancements in future work. The integration of more powerful hardware (e.g., multi-core processors or GPUs) could also significantly reduce the time required for these processes.

Beyond computational aspects, the choice of these algorithms was guided by their complementary strengths: Cartesian product provides exhaustive coverage for simulation datasets, orthogonal analysis achieves balanced coverage with fewer cases, and pairwise analysis minimizes test size for on-road validation. Moreover, scenario design incorporated demographic and environmental diversity (e.g., gender, age, lighting, accessories) to enhance representativeness. Sample sizes were determined to balance statistical coverage with feasibility, ensuring rigorous yet practical validation.

4. Testing and Validation Analysis

Building upon the testing methods outlined in Section 3, this section presents empirical validation through various testing environments and analyses, demonstrating the effectiveness of the system.

4.1. Test Validation Environment Design

Product testing and validation generally rely on quantifiable metrics and the design of corresponding interfaces and signals to statistically evaluate the accuracy of test results [43]. This section constructs a testing environment based on the system architecture shown in Figure 7, comprising three components: the Vehicle Signal Input and Acquisition Module (Vehicle), the DFMS Core Processing Unit, and the human–machine interaction (HMI) and Measurement Interface Module.

In the Vehicle Module, the “Power On” condition serves as the test initiation trigger, simulating a real ignition scenario. A unified signal interface collects multidimensional raw signals, including functional status bits, vehicle power mode, vehicle speed, gear position, steering status, turn signal status, door lock status, seatbelt status, and system physical time, which are input into the DFMS. The DFMS Core Processing Unit first determines the system’s operational phase using a functional state machine. Upon meeting switch-state transition and readiness conditions, the system sequentially enters standby, warm-up, and activation phases. Subsequently, it employs four application-layer perception–processing–logic (PPL) modules, namely Fatigue Level Warning, Attention Warning, Dangerous Behavior Warning, and Leave Warning. Here, PPL refers to a rule-based application-layer processing stage that integrates perception outputs with vehicle state information to generate driver state judgments and warning decisions.

In the HMI and Measurement Interface Module, the system uses six icons—“Fault Alarm,” “Activation Status,” “Fatigue Warning,” “Attention Warning,” “Dangerous Behavior,” and “Leave Warning”—to visually represent various alert states. It also synchronously records input-output signals and alert timing, providing quantitative support for subsequent virtual simulation testing, basic function testing, and subjective evaluation testing.

This modular design not only replicates multi-source data inputs under real-world conditions, but also enables precise validation of the system’s functional performance, laying a solid foundation for product optimization and methodological research.

4.2. Virtual Simulation Testing

Virtual simulation testing plays a crucial role in the validation process for the Driver Fatigue Monitoring System (DFMS). Its objective is to comprehensively evaluate the system’s performance and stability under diverse operating conditions by simulating real-world driving scenarios. This testing approach involves constructing test datasets using scenario videos collected from real vehicles, which are then subjected to data preprocessing, annotation, and analysis to generate test cases. This enables full scenario coverage, identifying potential issues in various application contexts and optimizing system functionality.

The implementation steps for virtual simulation testing are illustrated in Figure 8 and consist of the following four stages:

Data Preprocessing: Scenario videos capturing driver behaviors are collected in real-vehicle environments. These videos undergo segmentation, sampling, and cleaning to remove noisy data, ensuring data quality meets testing requirements.
Data Annotation: Preprocessed video data are manually or semi-automatically annotated to generate corresponding images and JSON files. Annotations include driver facial features, behavioral states (e.g., eye closure, yawning, phone use), and environmental factors (e.g., lighting variations, occlusions).
Result Analysis: Using the annotated dataset, test cases are executed on a virtual simulation platform, and key metrics such as recognition accuracy, false positive rate, and false negative rate are statistically analyzed across different scenarios.
Data Feedback: Test results are fed back into the system development process to optimize algorithm models and functional design, enhancing the system’s robustness in complex scenarios.

The core of virtual simulation testing lies in the construction of the test dataset. This study leverages the hierarchical decoupling and scenario elements approach proposed in Section 3 to design a virtual simulation test case library. The test case library employs the Cartesian product algorithm to generate a comprehensive set of test scenarios, covering various test elements (e.g., lighting conditions, driver gender, ethnicity, age, behavior types) and test subjects, ensuring that the test scenarios simulate a wide range of real-world driving conditions. During the testing process, the system uses in-cabin cameras to capture driver images, analyzing facial features and behavioral states through deep learning algorithms to output corresponding fatigue or attention alert signals.

4.3. Basic Function Testing

Basic function testing seeks to verify whether the Driver Fatigue Monitoring System (DFMS) can achieve its core functionalities as designed, encompassing both application-level and system-level testing. The tests are primarily conducted in a bench testing environment, utilizing a series of designed test cases to rapidly evaluate the system’s functional performance and stability under typical usage scenarios. The test cases cover various driver behavioral states and system configuration functions to ensure accurate recognition and response across diverse conditions.

Table 2 provides a detailed description of the basic function test cases designed for the DFMS and their corresponding test results. For clarity, test icons are consistently labeled using standard terminology (e.g., “Fatigue Warning” instead of mixed terms), and units such as time (s) are explicitly indicated.

Through the execution of the aforementioned test cases, the system accurately triggered corresponding interface prompts and alert icons when detecting driver behaviors such as eye closure, yawning, phone use, smoking, drinking, mask wearing, and sunglasses wearing. This indicates that the application function modules effectively recognize typical driver fatigue and distraction behaviors. Additionally, the switch settings test for system functions validated the system’s correct response to configuration changes, with interface transition logic aligning with the expected design.

The results of the basic function testing demonstrate that the Driver Fatigue Monitoring System (DFMS) operates stably in the bench testing environment, with functionality meeting design requirements. No significant functional defects were identified during the testing process, providing a solid foundation for subsequent virtual simulation and subjective evaluation tests. Still, it should be noted that the idealized conditions of the bench testing environment may not fully reflect the complexity of real-world driving scenarios. Therefore, further testing in real-world conditions is necessary to validate the system’s robustness.

To further demonstrate the utility of the proposed testing framework, we performed a comparison between two distinct Driver Fatigue Monitoring Systems (DFMSs) using the same set of generated test scenarios. The framework was applied to generate diverse test scenarios, including variations in driving behaviors, environmental conditions, and vehicle configurations. When comparing the results from both systems, key differences in detection accuracy and false positive rates were observed. This comparison highlights the ability of the proposed framework to consistently generate test scenarios and evaluate multiple systems under standardized conditions, ensuring consistent and reliable performance assessments.

4.4. Subjective Evaluation Testing

Subjective evaluation testing aims to assess the user experience and practical application effectiveness of the Driver Fatigue Monitoring System (DFMS) by simulating typical scenarios in real-world driving scenarios. This testing approach employs structured storylines and scenario scripts to identify potential issues under various driving conditions and provide a basis for design optimization. The system utilizes an in-cabin camera module to capture the driver’s facial and body images, analyzing fatigue states (e.g., eye closure, smoking, phone use) through deep learning algorithms to ensure monitoring accuracy in complex environments.

The testing specifically focuses on the impact of lighting variations on the performance of the camera module. Daytime scenarios emphasize testing under challenging lighting conditions, such as bridge underpasses, tunnel entries and exits, underground parking garages, and direct sunlight during sunrise or sunset, which introduce abrupt illumination changes and high dynamic range visual disturbances. Nighttime scenarios prioritize performance in varying brightness levels (particularly low-light conditions) and specific situations (e.g., brake lights from the vehicle ahead or headlights from the vehicle behind). To this end, four subjective evaluation test routes were designed, covering a diverse range of driving scenarios. The key performance indicators (KPIs) for these routes, including false positive rates (FPRs), response times (RTs), and user satisfaction scores (DSSs), are summarized in Table 3. The results demonstrate stable performance across diverse conditions, with Route 4 achieving the highest satisfaction due to robust recognition performance and minimal false alerts.

In this study, the response time (RT) reported in Table 3 represents the end-to-end on-board processing delay in real vehicle deployment. Specifically, RT is defined as the elapsed time between the onset of a target driver behavior (identified via frame-level annotation or manual trigger) and the corresponding warning output on the human–machine interface (HMI), including visual icon activation and/or audible alert. Across the four real-world routes, the observed average RT ranges from 1.09 s to 1.34 s, indicating that the proposed DFMS pipeline is generally consistent with the real-time requirements of practical in-vehicle use, where warning delays within approximately 1–2 s are typically considered acceptable for driver assistance and monitoring systems.

During the testing process, participants drove along predefined routes while the system continuously monitored and recorded fatigue warning, attention distraction warnings, and related behavior recognition results. Subjective evaluation was conducted through driver feedback and observational records, assessing the system’s response timeliness, alert accuracy, and user acceptance across different scenarios. Preliminary results indicate that the system effectively detects fatigue behaviors in both daytime and nighttime scenarios. Nonetheless, occasional false positives or false negatives were observed under extreme lighting conditions, such as tunnel entrances/exits or direct strong sunlight, necessitating further optimization of algorithm parameters. The subjective evaluation testing provides critical data support for optimizing the system’s user experience in real-world applications. Future improvements can be achieved by increasing test sample sizes and expanding scenario complexity to further enhance the system’s adaptability and reliability.

4.5. Algorithm Model Testing

The algorithm model testing was conducted as an algorithm-level performance evaluation and engineering validation of the deployed DFMS perception models. The test data were collected from two Asian adult male subjects in real-vehicle environments, covering representative driver behaviors such as phone use, drinking, smoking, eye opening/closure, and mask wearing. A total of approximately 20,000 in-vehicle images were collected for this purpose. It should be noted that this dataset was not intended to support population-level generalization or fairness evaluation, but rather to verify the correctness, stability, and real-time performance of the algorithm models under controlled deployment conditions. The results were obtained through the 4.3 virtual simulation test method. Table 4 shows the test results of each model in the dataset. The study used accuracy, precision, recall, and F1-score for evaluation [44]. The calculation formula is as follows:

Accuracy = \frac{T P + T N}{T P + T N + F P + F N}

(5)

Precision = \frac{T P}{T P + F P}

(6)

Recall = \frac{T P}{T P + F N}

(7)

F 1 = 2 \cdot \frac{Precision \cdot Recall}{Precision + Recall}

(8)

One of the critical models in driver fatigue assessment is designed to detect yawning behavior, categorized into three classes: normal, half-open mouth, and fully open mouth, with only the fully open mouth classified as a yawn. A total of 103,000 paired IR/RGB images were collected, including 79,000 closed-mouth images, 12,000 half-open mouth images, and 12,000 fully open mouth images. Table 5 presents the test results for the mouth algorithm model.

For the mouth (yawn) detection experiments, the dataset was partitioned into training, validation, and test subsets following a subject-independent split strategy, such that images from the same recording session and subject were not shared across different subsets. This design was adopted to mitigate potential data leakage caused by frame-level temporal correlations.

Model performance was evaluated on the held-out test set using a per-frame evaluation protocol, and true positive rates (TPR) were reported at fixed false positive rate (FPR) operating points (1%, 0.5%, and 0.1%), as summarized in Table 5. The reported metrics, therefore reflect frame-level classification performance under controlled conditions, rather than per-subject or population-level behavioral statistics.

One of the important models in driver fatigue judgment is to judge whether the driver has eyes open or closed. The number of collected pictures IR is 32 K pairs of data. Table 6 shows the test results of the eye algorithm model.

To facilitate reproducibility, key implementation details are provided here. All deep learning models were trained using Python 3.8 with TensorFlow 2.6 on a NVIDIA RTX 3090 GPU (24 GB VRAM) and Intel i7 CPU with 64 GB RAM. Training was conducted for 50 epochs with a batch size of 32, using the Adam optimizer (learning rate = 0.001, β1 = 0.9, β2 = 0.999) and categorical cross-entropy as the loss function. Data augmentation (random rotation ± 15°, horizontal flip, and brightness adjustment) was applied to increase robustness. Input images were resized to 224 × 224 pixels, and normalization was performed before training. These settings ensured stable convergence and reproducible performance across experiments.

4.6. Similar Product Testing

To further demonstrate the applicability of the proposed testing framework in practical engineering scenarios, a comparative evaluation was conducted between the proposed Driver Fatigue Monitoring System (DFMS) and a representative commercial driver behavior analysis product. This experiment is intended as an illustrative engineering benchmark, rather than an exhaustive algorithm-level performance comparison.

The selected commercial product is provided by an Internet company in the form of a publicly accessible application programming interface (API). Users can upload driver behavior images and obtain corresponding recognition results and JSON-format outputs, as shown in Figure 9. The supported behavior categories include eye closure, smoking, phone use, mask wearing, yawning, head-down behavior, seat belt usage, and hands-off steering wheel.

The evaluation dataset consisted of images collected from young Asian male and female subjects. Identical driver behavior images were uploaded to both the commercial platform and the proposed DFMS, and the resulting recognition outputs were analyzed under the same testing conditions. The comparison results are summarized in Table 7.

It should be emphasized that the purpose of this comparison is not to establish a comprehensive ranking among multiple driver monitoring algorithms, but to demonstrate how the proposed hierarchical decoupling and scenario-based testing framework can be applied to consistently evaluate different DFMS products under identical input conditions. Due to the black-box nature and limited accessibility of most commercial DFMS solutions, a single widely used external product was selected as a reference baseline, which is sufficient for illustrating the discriminative capability and practical value of the testing methodology.

From an engineering validation perspective, the results indicate that the proposed DFMS produces more stable and conservative outputs in several representative scenarios, particularly in reducing false alarms for phone-use detection. For example, in the non-phone-call scenario shown in Table 7, the commercial product produced a high confidence misrecognition score (0.805), whereas the proposed system maintained a low response value (0.016), effectively avoiding false positives. Such behavior is critical in real-world deployment, where excessive false alarms may negatively affect driver trust and system acceptance.

Overall, this comparative experiment demonstrates that the proposed testing framework can effectively expose functional differences between DFMS products and highlight potential reliability issues under standardized test conditions. Extending this benchmarking to a broader range of commercial and open-source systems will be an important direction for future work, as access to additional comparable products becomes available.

5. Conclusions

This study introduces a testing framework for the Driver Fatigue Monitoring System (DFMS) based on hierarchical decoupling and scenario element methods, which systematically improves testing efficiency, accuracy, and reliability. It provides significant theoretical and practical foundations for improving road safety and system optimization, leading to the following conclusions:

By decomposing DFMS functionalities into algorithmic and application layers and implementing hierarchical decoupling testing, this study effectively addresses the limitations of traditional end-to-end testing, particularly the difficulty in pinpointing issues, thereby significantly improving the specificity and precision of the testing process.

Through structured decomposition based on scenario elements, combined with Cartesian product, orthogonal analysis, and pairwise analysis algorithms, comprehensive test cases were generated using Python automation scripts. Compared to traditional manual methods, this approach improved test case generation efficiency by approximately 50% while ensuring comprehensive coverage of multidimensional scenarios (e.g., lighting conditions, driving behaviors, and driver characteristics).

The testing validation was completed through virtual simulation, basic function tests, subjective evaluations, and benchmarking tests. The results demonstrate the proposed system’s superior performance in key behavior detection, such as eye closure detection (confidence score of 1.000 vs. 0.674 for the competitor product), smoking detection (confidence score of 1.000 vs. 0.011 for the competitor product), and false positive rate control (phone call misrecognition rate of 0.016 vs. 0.805 for the competitor product), showcasing significant advantages over similar products. Based on simulated data, the enhanced detection capabilities are likely to reduce the risk of fatigue-related accidents, as the system enables more timely interventions, which could reduce incidents caused by driver fatigue. However, real-time processing demands substantial computational resources, including high-performance hardware such as GPUs or AI chips. The limitations in system resources may impact scalability and performance across different vehicle platforms. These limitations should be addressed in future work to optimize real-world applications.

Although this study establishes a robust foundation for standardized DFMS testing and performance optimization, the comprehensiveness of test scenarios is limited by the diversity of real-vehicle data collection samples. Most test subjects came from relatively homogeneous demographic groups, which may restrict the generalizability of the results; therefore, the reported algorithm model testing results should be interpreted primarily as engineering validation outcomes rather than evidence of demographic fairness or population-level robustness. Another limitation lies in the potential challenges of adapting the system across different vehicle types due to hardware variability and software integration requirements. Future research could focus on expanding system robustness with a prioritized approach. First, integrating multimodal sensors, such as heart rate monitors and steering wheel torque sensors, will improve the system’s ability to detect fatigue-related behaviors in diverse conditions. Second, addressing the challenge of extreme weather conditions, such as heavy rain, snow, and fog, will enhance the system’s reliability under adverse environmental factors. Third, broader demographic data collection and systematic validation across multiple vehicle categories will help address the limitations mentioned above, ensuring stronger adaptability and wider applicability. Additionally, the generalizability of the system to other vehicle types and operational domains should be explored. Collaborative efforts with OEMs (Original Equipment Manufacturers) and regulatory bodies will be crucial in establishing standardized guidelines, testing protocols, and performance benchmarks. Furthermore, aligning with evolving automotive safety standards beyond 2027, such as updates to C-NCAP, E-NCAP, and UNECE regulations, will ensure long-term compliance and industry adoption. Engaging in joint working groups with industry stakeholders can accelerate the standardization process and promote interoperability across markets.

In summary, the proposed testing framework and validation methods provide reusable theoretical foundations and practical guidance for advancing intelligent driving safety technologies. The modular design and automated testing strategies are not only applicable to the development and optimization of DFMS, but also provide a reference for testing other intelligent driving assistance systems, holding significant academic value and broad application prospects. Moreover, the proposed framework provides practical guidance for industry adoption and regulatory compliance by aligning DFMS development with evolving international safety standards.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/electronics15030689/s1, Supplementary File S1: Source code and dataset used in this study.

Author Contributions

Conceptualization, S.C. and J.H.; methodology, Y.L.; software, Y.B.; validation, S.C., J.H. and S.Y.; formal analysis, Y.B.; investigation, S.C.; resources, J.H.; data curation, S.Y.; writing—original draft preparation, S.C.; writing—review and editing, J.H.; visualization, Y.L.; supervision, S.C.; project administration, S.Y.; funding acquisition, Y.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

This study involving human participants was conducted in accordance with the Declaration of Helsinki and relevant local regulations. The study protocol and all related documents were reviewed and approved by the Medical Ethics Committee of The First Affiliated Hospital of Nanchang University (approval code: CDYF-2024-045) on 18 July 2024.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author. All data used in this study were collected with informed consent from the participants. Participant data are available upon reasonable request and subject to confidentiality constraints. The scripts referenced in the Methods section have been provided as Supplementary Material S1 (ZIP).

Conflicts of Interest

Author Saier Ye was employed by the company Beijing Kuangshi Technology Co. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

DMS	Driver Monitoring System
DFMS	Driver Fatigue Monitoring System
PERCLOS	Percentage of Eyelid Closure Over Time
EAR	Eye Aspect Ratio
MAR	Mouth Aspect Ratio
HMI	Human–Machine Interaction Module
PPL	Application-layer PPL
IR	Infrared Image
RGB	Red-Green-Blue Image
API	Application Programming Interface
PPL	Perception–Processing–Logic (application-layer modules)

References

Rosen, H.E.; Bari, I.; Paichadze, N.; Peden, M.; Khayesi, M.; Monclus, J.; Hyder, A.A. Global Road Safety 2010–18: An Analysis of Global Status Reports. Injury 2022, 56, 110266. [Google Scholar] [CrossRef] [PubMed]
Hamid, S.; Davoud, K.-Z. Road Traffic Injuries Measures in the Eastern Mediterranean Region: Findings from the Global Status Report on Road Safety—2015. J. Inj. Violence Res. 2019, 11, 149. [Google Scholar]
Zhou, F.; Alsaid, A.; Blommer, M.; Curry, R.; Swaminathan, R.; Kochhar, D.; Talamonti, W.; Tijerina, L.; Lei, B. Driver Fatigue Transition Prediction in Highly Automated Driving Using Physiological Features. Expert Syst. Appl. 2020, 147, 113204. [Google Scholar] [CrossRef]
Meng, X.; Cheng, Q.; Jiang, X.; Fang, Z.; Chen, X.; Li, S.; Li, C.; Sun, C.; Wang, W.; Wang, Z.L. Triboelectric Nanogenerator as a Highly Sensitive Self-Powered Sensor for Driver Behavior Monitoring. Nano Energy 2018, 51, 721–727. [Google Scholar] [CrossRef]
Guo, Z.; Zhou, M.; Li, G. Driver Vigilance State Estimation Based on Multisource Data. IEEE Sens. J. 2023, 23, 2592–2604. [Google Scholar] [CrossRef]
Doudou, M.; Bouabdallah, A.; Berge-Cherfaoui, V. Driver Drowsiness Measurement Technologies: Current Research, Market Solutions, and Challenges. Int. J. Intell. Transp. Syst. Res. 2020, 18, 297–319. [Google Scholar] [CrossRef]
Wang, H.; Zhao, M.; Beurier, G.; Wang, X. Automobile Driver Posture Monitoring Systems: A Review. China J. Highw. Transp. 2019, 2, 1–18. [Google Scholar]
Bai, X.; Gai, C.; Wu, D.; Zhu, J.; Wu, G.; Zhang, M.; Liu, Y.; Sun, J.; Zhuang, J.; Huang, Y.; et al. Intelligent Sensing System of the Car Seat with Flexible Piezoresistive Sensor of Double-Layer Conductive Network. IEEE Sens. J. 2022, 22, 3113–3121. [Google Scholar] [CrossRef]
Azadani, M.N.; Boukerche, A. Driving Behavior Analysis Guidelines for Intelligent Transportation Systems. IEEE Trans. Intell. Transp. Syst. 2021, 23, 6027–6045. [Google Scholar] [CrossRef]
Horberry, T.; Mulvihill, C.; Fitzharris, M.; Lawrence, B.; Lenné, M.; Kuo, J.; Wood, D. Human-Centered Design for an in-Vehicle Truck Driver Fatigue and Distraction Warning System. IEEE Trans. Intell. Transp. Syst. 2021, 23, 5350–5359. [Google Scholar] [CrossRef]
Chen, J.; Jiang, Y.; Huang, Z.; Guo, X.; Wu, B.; Sun, L.; Wu, T. Fine-Grained Detection of Driver Distraction Based on Neural Architecture Search. IEEE Trans. Intell. Transp. Syst. 2021, 22, 5783–5801. [Google Scholar] [CrossRef]
Chen, S.; Huang, L.; Bai, J.; Jiang, H.; Chang, L. Multi-Sensor Information Fusion Algorithm with Central Level Architecture for Intelligent Vehicle Environmental Perception System; SAE Technical Paper; SAE: Warrendale, PA, USA, 2016. [Google Scholar]
Gerstmair, M.; Melzer, A.; Onic, A.; Huemer, M. On the Safe Road toward Autonomous Driving: Phase Noise Monitoring in Radar Sensors for Functional Safety Compliance. IEEE Signal Process. Mag. 2019, 36, 60–70. [Google Scholar] [CrossRef]
Halin, A.; Verly, J.G.; Van Droogenbroeck, M. Survey and Synthesis of State of the Art in Driver Monitoring. Sensors 2021, 21, 5558. [Google Scholar] [CrossRef] [PubMed]
Chen, G.; Wiede, C.; Kokozinski, R. Data Processing Approaches on SPAD-Based d-TOF LiDAR Systems: A Review. IEEE Sens. J. 2020, 21, 5656–5667. [Google Scholar] [CrossRef]
Luckinger, F.; Sauter, T. Software-Based AUTOSAR-Compliant Precision Clock Synchronization over CAN. IEEE Trans. Ind. Inform. 2022, 18, 7341–7350. [Google Scholar] [CrossRef]
Nidamanuri, J.; Nibhanupudi, C.; Assfalg, R.; Venkataraman, H. A Progressive Review: Emerging Technologies for ADAS Driven Solutions. IEEE Trans. Intell. Veh. 2021, 7, 326–341. [Google Scholar] [CrossRef]
Rosero-Montalvo, P.D.; López-Batista, V.F.; Peluffo-Ordóñez, D.H. Hybrid Embedded-Systems-Based Approach to in-Driver Drunk Status Detection Using Image Processing and Sensor Networks. IEEE Sens. J. 2020, 21, 15729–15740. [Google Scholar] [CrossRef]
Safarov, F.; Akhmedov, F.; Abdusalomov, A.B.; Nasimov, R.; Cho, Y.I. Real-Time Deep Learning-Based Drowsiness Detection: Leveraging Computer-Vision and Eye-Blink Analyses for Enhanced Road Safety. Sensors 2023, 23, 6459. [Google Scholar] [CrossRef]
Qu, F.; Dang, N.; Furht, B.; Nojoumian, M. Comprehensive Study of Driver Behavior Monitoring Systems Using Computer Vision and Machine Learning Techniques. J. Big Data 2024, 11, 32. [Google Scholar] [CrossRef]
Yang, H.; Hu, N.; Jia, R.; Zhang, X.; Xie, X.; Liu, X.; Chen, N. How Does Driver Fatigue Monitor System Design Affect Carsharing Drivers? An Approach to the Quantification of Driver Mental Stress and Visual Attention. Travel Behav. Soc. 2024, 35, 100755. [Google Scholar] [CrossRef]
Mueller, A.S.; Reagan, I.J.; Cicchino, J.B. Addressing Driver Disengagement and Proper System Use: Human Factors Recommendations for Level 2 Driving Automation Design. J. Cogn. Eng. Decis. Mak. 2021, 15, 3–27. [Google Scholar] [CrossRef]
Song, R.; Li, X.; Zhao, X.; Liu, M.; Zhou, J.; Wang, F.-Y. Identifying Critical Test Scenarios for Lane Keeping Assistance System Using Analytic Hierarchy Process and Hierarchical Clustering. IEEE Trans. Intell. Veh. 2023, 8, 4370–4380. [Google Scholar] [CrossRef]
Coyne, R.; Hanlon, M.; Smeaton, A.F.; Corcoran, P.; Walsh, J.C. Understanding Drivers’ Perspectives on the Use of Driver Monitoring Systems during Automated Driving: Findings from a Qualitative Focus Group Study. Transp. Res. Part F Traffic Psychol. Behav. 2024, 105, 321–335. [Google Scholar] [CrossRef]
Du, G.; Li, T.; Li, C.; Liu, P.X.; Li, D. Vision-Based Fatigue Driving Recognition Method Integrating Heart Rate and Facial Features. IEEE Trans. Intell. Transp. Syst. 2020, 22, 3089–3100. [Google Scholar] [CrossRef]
Naz, S.; Ahmed, A.; ul ain Mubarak, Q.; Noshin, I. Intelligent Driver Safety System Using Fatigue Detection. In Proceedings of the 2017 19th International Conference on Advanced Communication Technology (ICACT), Pyeongchang, Republic of Korea, 19–22 February 2017; pp. 89–93. [Google Scholar]
Bass, J.M. Artefacts and Agile Method Tailoring in Large-Scale Offshore Software Development Programmes. Inf. Softw. Technol. 2016, 75, 1–16. [Google Scholar] [CrossRef]
Hasselbring, W. Software Architecture: Past, Present, Future. In The Essence of Software Engineering; Springer International Publishing: Cham, Switzerland, 2018; pp. 169–184. [Google Scholar]
Richards, M. Software Architecture Patterns; O’Reilly Media: Sebastopol, CA, USA, 2015; Volume 4. [Google Scholar]
Wang, Q.-G. Decoupling Control; Springer: Berlin/Heidelberg, Germany, 2003; Volume 285. [Google Scholar]
Mo, R.; Cai, Y.; Kazman, R.; Xiao, L.; Feng, Q. Decoupling Level: A New Metric for Architectural Maintenance Complexity. In Proceedings of the 38th International Conference on Software Engineering, Austin, TX, USA, 14–22 May 2016; pp. 499–510. [Google Scholar]
Khan, M.Q.; Lee, S. A Comprehensive Survey of Driving Monitoring and Assistance Systems. Sensors 2019, 19, 2574. [Google Scholar] [CrossRef] [PubMed]
Antkiewicz, M.; Kahn, M.; Ala, M.; Czarnecki, K.; Wells, P.; Acharya, A.; Beiker, S. Modes of Automated Driving System Scenario Testing: Experience Report and Recommendations. SAE Int. J. Adv. Curr. Pract. Mobil. 2020, 2, 2248–2266. [Google Scholar] [CrossRef]
Agesen, O. The Cartesian Product Algorithm: Simple and Precise Type Inference of Parametric Polymorphism. In Proceedings of the ECOOP’95—Object-Oriented Programming, 9th European Conference, Åarhus, Denmark, 7–11 August 1995; Springer: Berlin/Heidelberg, Germany, 1995; pp. 2–26. [Google Scholar]
Hussain, Z.; Shawe-Taylor, J.; Hardoon, D.R.; Dhanjal, C. Design and Generalization Analysis of Orthogonal Matching Pursuit Algorithms. IEEE Trans. Inf. Theory 2011, 57, 5326–5341. [Google Scholar] [CrossRef]
He, S.; Guo, F.; Zou, Q.; Huiding, F. MRMD2. 0: A Python Tool for Machine Learning with Feature Ranking and Reduction. Curr. Bioinform. 2020, 15, 1213–1221. [Google Scholar] [CrossRef]
Klavžar, S.; Seifter, N. Dominating Cartesian Products of Cycles. Discret. Appl. Math. 1995, 59, 129–136. [Google Scholar] [CrossRef]
Imrich, W.; Peterin, I. Recognizing Cartesian Products in Linear Time. Discret. Math. 2007, 307, 472–483. [Google Scholar] [CrossRef]
Cicerone, S.; Di Stefano, G.; Klavžar, S. On the Mutual Visibility in Cartesian Products and Triangle-Free Graphs. Appl. Math. Comput. 2023, 438, 127619. [Google Scholar] [CrossRef]
Korenberg, M.J. A Robust Orthogonal Algorithm for System Identification and Time-Series Analysis. Biol. Cybern. 1989, 60, 267–276. [Google Scholar] [CrossRef] [PubMed]
Wu, H. Application of Orthogonal Experimental Design for the Automatic Software Testing. Appl. Mech. Mater. 2013, 347, 812–818. [Google Scholar] [CrossRef]
Shrestha, N. Factor Analysis as a Tool for Survey Analysis. Am. J. Appl. Math. Stat. 2021, 9, 4–11. [Google Scholar] [CrossRef]
Wishart, J.; Como, S.; Forgione, U.; Weast, J.; Weston, L.; Smart, A.; Nicols, G. Literature Review of Verification and Validation Activities of Automated Driving Systems. SAE Int. J. Connect. Autom. Veh. 2020, 3, 267–323. [Google Scholar] [CrossRef]
Liu, H.; Ma, T.; Lin, Y.; Peng, K.; Hu, X.; Xie, S.; Luo, K. Deep Learning in Rockburst Intensity Level Prediction: Performance Evaluation and Comparison of the NGO-CNN-BiGRU-Attention Model. Appl. Sci. 2024, 14, 5719. [Google Scholar] [CrossRef]

Figure 1. The in-cabin visual image monitoring interface.

Figure 2. Schematic diagram of the DFMS hardware terminal.

Figure 3. Overall functional flowchart of DFMS.

Figure 4. DFMS algorithmic software architecture. (a high-resolution version is provided in the Supplementary Materials for improved readability).

Figure 5. Schematic diagram of DFMS hierarchical decoupling testing architecture.

Figure 6. Structure of scenario-based test case variable combinations.

Figure 7. Driver fatigue monitoring system product flowchart.

Figure 8. Virtual simulation test steps.

Figure 9. User interface of a driver behavior analysis product released by an Internet company.

Table 1. Comparative complexity and scalability of scenario-generation algorithms (large-scale settings).

Approach	Time Complexity (Generation)	Space (Test Set)	Practical Limitations
Cartesian Product	$O (N_{C P})$ , where $N_{C P} = \prod_{i = 1} s_{i}$	$O (N_{C P})$	Combinatorial explosion; suitable for simulation but often infeasible for on-road tests.
Orthogonal Analysis (strength $t$ )	$\approx O (k N_{O A})$ , where $N_{O A} \approx c \cdot s^{t}$ (when arrays exist)	$O (N_{O A})$	Requires feasible orthogonal arrays; mixed-level/constrained designs may have limited availability; covers t-wise interactions only.
Pairwise Analysis (covering array, $t = 2$ )	$\approx O (k N_{P W})$ with $N_{P W} = O (s^{2} \log k)$ (uniform $S$ ); near-linear in output size for IPO/greedy families	$O (N_{P W})$	Ensures pairwise coverage but may miss higher-order interactions; quality depends on generator and constraints.

Table 2. Basic function test cases.

Module	Submodule	Test Case Description	Test Results
Application Functions	Eye Closure	Perform eye closure action while facing forward (no head deviation), duration < 3 s	Fatigue warning interface displayed, Eye Closure test icon triggered
	Yawning	Perform yawning action while facing forward (no head deviation), duration < 3 s	Fatigue warning interface displayed, Yawning test icon triggered
	Phone Use	Perform phone use action while facing forward (no head deviation), holding phone in left or right hand, duration > 3 s	Fatigue warning interface displayed, Phone Use test icon triggered
	Smoking	Perform smoking action (unlit cigarette) while facing forward (no head deviation), duration > 3 s	Fatigue warning interface displayed, Smoking test icon triggered
	Drinking	Perform drinking action (mineral water bottle) while facing forward (no head deviation), duration > 3 s	Fatigue warning interface displayed, Drinking test icon triggered
	Mask Wearing	Perform mask wearing action (disposable medical mask) while facing forward (no head deviation), duration > 3 s	Fatigue warning interface displayed, Mask Wearing test icon triggered
	Sunglasses	Perform sunglasses wearing action (standard black sunglasses) while facing forward (no head deviation), duration > 3 s	Fatigue warning interface displayed, Sunglasses test icon triggered
	Fatigue Warning—Eye Closure	Perform eye closure action while facing forward (no head deviation), duration > 10 s	Fatigue warning interface displayed, Eye Closure test icon and Fatigue warning icon triggered
	Fatigue Warning—Yawning	Perform yawning action while facing forward (eyes open, no head deviation), duration > 5 s	Fatigue warning interface displayed, Yawning test icon and Fatigue warning icon triggered
	Attention Alert—Looking Left/Right	Perform left or right head turn, head deviation > 45°, duration < 3 s, gaze following head movement	Fatigue warning interface displayed, Looking Left/Right test icon triggered
	Attention Alert—Looking Down	Perform looking down action, head deviation > 45°, duration < 3 s, gaze following head movement	Fatigue warning interface displayed, Looking Down/Up test icon triggered
System Functions	Switch Settings	Operate the [Fatigue Warning] switch	[Fatigue Warning] switch toggles on/off, interface content updates accordingly

Table 3. Subjective evaluation test routes for the driver fatigue monitoring system.

Subjective Assessment Route	Driving Scenario	Route Planning for Specific Scenarios	FPR (%)	RT (s)	DSS (1–5)
Route 1	Simulate a short-distance commuting scenario.	In Shanghai, the route from the underground parking garage of Yunbo Center Building, passing through the Tianlin Tunnel to Baoshi Garden, is 2.4 km long.	5.2	1.21	4.3
Route 2	Simulate a long-distance travel scenario.	Starting from the Geely Automobile Research Institute to Ningbo Airport (total distance 85 km), the route passes through expressway sections (including scenarios with strong light, backlight, sidelight, and direct sunlight on the face during sunrise and sunset), Changxiling Tunnel (total length 2.7 km), and Shizishan Tunnel (total length 2.7 km).	3.9	1.34	4.0
Route 3	Simulate daytime usage scenarios with a focus on attention.	In Beijing, starting from the underground parking garage of Rongke Building, passing through Peking University and Tsinghua University, including scenarios of turning and changing light conditions, direct sunset light from east to west, the Datun Road Tunnel, and the underpass at Xueyuan Bridge roundabout.	2.7	1.12	4.5
Route 4	Simulate nighttime usage scenarios with a focus on attention.	In Beijing, starting from Zhichun Road, passing through the Fourth Ring Road North and the Beijing-Zhangjiakou Expressway to the Chengyuan Building.	2.2	1.09	4.7

Note: FPR represents false positive rate; RT represents avg. response time; DSS represents driver satisfaction score.

Table 4. Test results for each model in the dataset.

Model	Data Size	TP	FP	FN	TN	Accuracy	Precision	Recall	F1-Score
Drink	732	675	42	15	0	0.92	0.94	0.98	0.96
Phone	732	716	1	15	0	0.98	1.00	0.98	0.99
Smoke	732	717	0	15	0	0.98	1.00	0.98	0.99
Eye	920	890	10	17	3	0.97	0.99	0.98	0.99
Mouth	760	747	3	7	3	0.99	1.00	0.99	0.99
Mask	2845	2742	6	96	1	0.96	1.00	0.97	0.98

Note: TP represents true positive; FP represents false positive; FN represents false negative; TN represents true negative. All metrics are reported as values between 0 and 1.

Table 5. Mouth algorithm model test results.

FPR	Image Type	0-TPR/Threshold	1-TPR/Threshold	2-TPR/Threshold
1%	RGB	0.9773/0.9911	0.8518/0.8680	0.9245/0.1712
	IR	0.9735/0.9913	0.7355/0.9270	0.8664/0.4777
	RGB-IR	0.9998/0.0028	0.9997/0.0098	1.0000/0.0046
0.5%	RGB	0.9351/0.9983	0.6932/0.9859	0.8945/0.4297
	IR	0.9368/0.9976	0.5770/0.9893	0.7895/0.8574
	RGB-IR	0.9998/0.0051	0.9997/0.0246	1.0000/0.0110
0.1%	RGB	0.7234/0.9998	0.3767/0.9987	0.7317/0.9756
	IR	0.7409/0.9997	0.2951/0.9988	0.6083/0.9894
	RGB-IR	0.9998/0.0066	0.9986/0.1529	1.0000/0.0246

Note: FPR represents false positive rate; TPR represents true positive rate; 0 represents closed mouth; 1 represents half-open mouth; 2 represents fully open mouth. “RGB-IR fusion” refers to the combination of RGB (red-green-blue) color images and infrared (IR) images to enhance the robustness and accuracy of mouth state detection by leveraging both visible and infrared light spectrum information. All metrics are reported as values between 0 and 1.

Table 6. Eye algorithm model test results.

Eye	FPR	Image Type	0-TPR/Threshold	1-TPR/Threshold
Left eye	1%	IR	0.9902/0.8979	0.9905/0.0974
	0.5%	IR	0.9850/0.9610	0.9754/0.3830
	0.1%	IR	0.6877/0.9987	0.8025/0.9889
Right eye	1%	IR	0.9891/0.8852	0.9881/0.1570
	0.5%	IR	0.9785/0.9600	0.9742/0.4669
	0.1%	IR	0.6416/0.9980	0.8238/0.9844

Note: FPR represents false positive rate; TPR represents true positive rate; 0 represents eyes open, and 1 represents eyes closed. All metrics are reported as values between 0 and 1.

Table 7. Benchmark test results.

Test Items	Competitor Product	Proposed System
① Driver Eye Closure Behavior	Recognition: eye_close score of 0.674	Recognition: eye_close score of 1.000
Result: Both systems detected eye closure, but the competitor product showed lower confidence.
② Driver Eye Closure Behavior	Non-recognition: eye_close score of 0.438	Recognition: eye_close score of 1.000
Result: The competitor product failed to recognize eye closure.
③ Driver Smoking Behavior	Non-recognition: smoke score of 0.011	Recognition: smoke score of 1.000
Result: The competitor product failed to detect smoking behavior, but the proposed system accurately identified smoking behavior.
④ Driver Phone Call Behavior	Recognition: phone score of 0.999	Recognition: phone score of 1.000
Result: Both systems accurately detected phone call behavior.
⑤ Driver Phone Call Behavior	Misrecognition: phone score of 0.805	Non-recognition: phone score of 0.016
Result: The competitor product incorrectly identified a non-phone call scenario as a phone call, but the proposed system correctly avoided false positives.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Chen, S.; Huang, J.; Liu, Y.; Ye, S.; Bai, Y. Embodied Visual Perception for Driver Fatigue Monitoring Systems: A Hierarchical Decoupling Framework for Robust Fatigue Detection and Scenario Understanding. Electronics 2026, 15, 689. https://doi.org/10.3390/electronics15030689

AMA Style

Chen S, Huang J, Liu Y, Ye S, Bai Y. Embodied Visual Perception for Driver Fatigue Monitoring Systems: A Hierarchical Decoupling Framework for Robust Fatigue Detection and Scenario Understanding. Electronics. 2026; 15(3):689. https://doi.org/10.3390/electronics15030689

Chicago/Turabian Style

Chen, Siyu, Juhua Huang, Yinyin Liu, Saier Ye, and Yuqi Bai. 2026. "Embodied Visual Perception for Driver Fatigue Monitoring Systems: A Hierarchical Decoupling Framework for Robust Fatigue Detection and Scenario Understanding" Electronics 15, no. 3: 689. https://doi.org/10.3390/electronics15030689

APA Style

Chen, S., Huang, J., Liu, Y., Ye, S., & Bai, Y. (2026). Embodied Visual Perception for Driver Fatigue Monitoring Systems: A Hierarchical Decoupling Framework for Robust Fatigue Detection and Scenario Understanding. Electronics, 15(3), 689. https://doi.org/10.3390/electronics15030689

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Embodied Visual Perception for Driver Fatigue Monitoring Systems: A Hierarchical Decoupling Framework for Robust Fatigue Detection and Scenario Understanding

Abstract

1. Introduction

2. Driver Fatigue Monitoring System

2.1. Hardware Product Terminal

2.2. Functional Design

2.3. Algorithmic Software Architecture

3. Testing Methods and Test Case Extraction

3.1. Hierarchical Decoupling Testing

3.2. Scenario-Based Test Cases

3.3. Scenario Extraction Algorithms

3.3.1. Cartesian Product Algorithm

3.3.2. Orthogonal Analysis Algorithm

3.3.3. Pairwise Analysis Algorithm

4. Testing and Validation Analysis

4.1. Test Validation Environment Design

4.2. Virtual Simulation Testing

4.3. Basic Function Testing

4.4. Subjective Evaluation Testing

4.5. Algorithm Model Testing

4.6. Similar Product Testing

5. Conclusions

Supplementary Materials

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

Abbreviations

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI