Review

Research Progress in Multi-Domain and Cross-Domain AI Management and Control for Intelligent Electric Vehicles

1
School of Automotive and Traffic Engineering, Jiangsu University, Zhenjiang 212013, China
2
School of Automotive Engineering, Shandong Jiaotong University, Jinan 250357, China
3
Research Centre for Electric Vehicles, Department of Electrical and Electronic Engineering, The Hong Kong Polytechnic University, Hong Kong 999077, China
4
China Automotive Engineering Research Institute Co., Ltd., Chongqing 401122, China
5
Chongqing Special Equipment Testing and Research Institute (Chongqing Special Equipment Accident Emergency Investigation and Handling Center), Chongqing 401121, China
*
Author to whom correspondence should be addressed.
Energies 2025, 18(17), 4597; https://doi.org/10.3390/en18174597
Submission received: 23 July 2025 / Revised: 21 August 2025 / Accepted: 27 August 2025 / Published: 29 August 2025

Abstract

Recent breakthroughs in artificial intelligence are accelerating the intelligent transformation of vehicles. Vehicle electronic and electrical architectures are converging toward centralized domain controllers. Deep learning, reinforcement learning, and deep reinforcement learning now form the core technologies of domain control. This review surveys advances in deep reinforcement learning in four vehicle domains: intelligent driving, powertrain, chassis, and cockpit. It identifies the main tasks and active research fronts in each domain. In intelligent driving, deep reinforcement learning handles object detection, object tracking, vehicle localization, trajectory prediction, and decision making. In the powertrain domain, it improves power regulation, energy management, and thermal management. In the chassis domain, it enables precise steering, braking, and suspension control. In the cockpit domain, it supports occupant monitoring, comfort regulation, and human–machine interaction. The review then synthesizes research on cross-domain fusion. It identifies transfer learning as a crucial method to address scarce training data and poor generalization. These limits still hinder large-scale deployment of deep reinforcement learning in intelligent electric vehicle domain control. The review closes with future directions: rigorous safety assurance, real-time implementation, and scalable on-board learning. It offers a roadmap for the continued evolution of deep-reinforcement-learning-based vehicle domain control technology.

1. Introduction

Recent advances in artificial intelligence (AI) and computing hardware have catalyzed a new wave of innovation in robotics and control, opening unprecedented opportunities for traditional manufacturing sectors [1,2,3]. Within this context, automotive intelligence has emerged as a pivotal driver of industry transformation: governments worldwide now issue supportive policies that foster the high-quality development of intelligent electric vehicles. Such vehicles are highly integrated cyber-physical systems that combine sophisticated onboard sensors, controllers, and actuators with modern communication and networking technologies. By enabling seamless information exchange among vehicles, users, infrastructure, and cloud platforms [4,5,6], these technologies lay the foundation for a new generation of cars capable of high-level autonomous driving [7,8].
With the rapid development of intelligent electric vehicles, the limits of the traditional distributed automotive electronic and electrical (E/E) architecture have become apparent. Such an architecture often contains hundreds of electronic control units (ECUs), and upgrading or expanding intelligent functions relies on adding more ECUs and sensors, which leads to sharply rising research and development (R&D) and production costs, reduced security, and insufficient computing power [9,10]. Facing these limitations, automotive E/E architectures have gradually evolved toward centralization [11], integrating ECUs that were originally isolated from each other, and domain controllers have emerged. Under the centralized architecture, improvements in chip computing power and software algorithm performance become the core of automotive intelligence upgrading, and the marginal cost of R&D and upgrading for intelligent functions is significantly reduced [12]. The five most common functional domain controllers, such as those of Continental and Bosch, cover the intelligent driving domain, powertrain domain, chassis domain, cockpit domain, and body domain. Intelligent technologies in the intelligent driving domain can usually be divided into two main modules: a perception and localization module and a decision-planning module. The perception and localization module is mainly responsible for detecting and tracking targets around the vehicle and for vehicle localization, and the decision-planning module is mainly responsible for motion prediction, path planning, and behavioral decision making. The powertrain domain is mainly responsible for the optimization and control of the vehicle powertrain, including power control, energy management strategy (EMS), and thermal management [13]. The chassis domain is mainly responsible for vehicle driving behavior and attitude control, including brake control, steering control, and suspension control [14]. The cockpit domain is mainly responsible for the electronic information system functions of the intelligent cockpit, including control of seats, air conditioning, and other equipment, driver and passenger status monitoring, and human–computer interaction [15]. The body domain is mainly responsible for the control of various body functions, including lights, wipers, windows and doors, and wireless charging [16].
With the rapid development of AI algorithms, deep learning (DL), reinforcement learning (RL), and deep reinforcement learning (DRL) algorithms have shown clear advantages in accuracy, efficiency, real-time capability, and ease of use. As a result, they have gradually surpassed traditional algorithms, attracted widespread attention, and been applied at scale to domain control technology in intelligent electric vehicles [17,18,19]. DL is a neural-network-based approach that builds a computational model with multiple processing layers, covers both supervised and unsupervised learning, and extracts and learns features automatically [20,21,22]. Take the target detection task in the intelligent driving domain as an example: target detection requires accurate identification of image content, and DL, with its powerful feature extraction capability, can be trained on a large amount of labeled data to learn the complex features in an image automatically, which makes this task well suited to DL. RL is an approach based on the Markov decision process in which an agent interacts continuously with its environment and obtains rewards, thereby optimizing its action policy to maximize its return [23,24]. Take energy management in the powertrain domain as an example: energy management is essentially an optimal control problem with a desired control objective and specific physical constraints, which makes it well suited to RL.
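The interaction loop described above can be illustrated with a minimal tabular Q-learning sketch on a toy energy-management problem. Everything in it is an assumption made for illustration and is not taken from any cited study: the five discretized state-of-charge levels, the two actions, the reward values, and the function names `step` and `train`.

```python
import random

N_SOC = 5          # hypothetical discretized battery state-of-charge levels 0..4
ACTIONS = (0, 1)   # 0 = engine drives and recharges, 1 = battery drives


def step(soc, action):
    """Toy transition and reward; all numbers are illustrative assumptions."""
    if action == 1:                        # try to drive on the battery
        if soc == 0:
            return soc, -5.0               # physical constraint violated: battery empty
        return soc - 1, 0.0                # electric driving burns no fuel
    return min(soc + 1, N_SOC - 1), -1.0   # engine burns fuel and recharges


def train(episodes=5000, horizon=20, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Learn a Q-table by epsilon-greedy interaction with the toy environment."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_SOC)]
    for _ in range(episodes):
        soc = rng.randrange(N_SOC)
        for _ in range(horizon):
            if rng.random() < eps:         # explore
                a = rng.choice(ACTIONS)
            else:                          # exploit the current estimate
                a = max(ACTIONS, key=lambda x: q[soc][x])
            nxt, r = step(soc, a)
            # Temporal-difference update toward reward plus discounted best next value
            q[soc][a] += alpha * (r + gamma * max(q[nxt]) - q[soc][a])
            soc = nxt
    return q


q_table = train()
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_SOC)]
print(policy)
```

In this toy setting the learned policy uses the engine only when the battery is empty, which reflects the control objective (low fuel use) and the physical constraint (no discharge at zero charge) encoded in the reward.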
As shown in Table 1, DRL integrates DL’s high-dimensional feature perception with RL’s trial-and-error interaction, yielding three main algorithm families [25,26]: value-based methods that learn action or state values to guide choice, policy-gradient methods that optimize the policy directly, and actor–critic hybrids that couple a value estimator (critic) with a policy generator (actor) to exploit the advantages of both. These three families offer more solutions for the domain controller tasks of intelligent electric vehicles. Through continuous learning and iteration, they improve the intelligence level and decision-making ability of vehicle systems so that the vehicles can adapt to a variety of complex driving scenarios and road conditions.
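For orientation, the three families can be summarized by their textbook update rules. The notation below is an assumption introduced here, not the paper's own: state $s_t$, action $a_t$, reward $r_t$, discount factor $\gamma$, learning rates $\alpha$, policy parameters $\theta$, critic parameters $w$, and return $G_t$.

```latex
% Value-based (Q-learning form): move the action value toward a bootstrapped target
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_t + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]

% Policy gradient (REINFORCE form): ascend the expected return directly
\nabla_\theta J(\theta)
  = \mathbb{E}_{\pi_\theta}\!\left[ \nabla_\theta \log \pi_\theta(a_t \mid s_t)\, G_t \right]

% Actor-critic: the critic's TD error \delta_t drives both updates
\delta_t = r_t + \gamma V_w(s_{t+1}) - V_w(s_t)
\theta \leftarrow \theta + \alpha_\theta\, \delta_t\, \nabla_\theta \log \pi_\theta(a_t \mid s_t)
w \leftarrow w + \alpha_w\, \delta_t\, \nabla_w V_w(s_t)
```

Deep variants replace the table $Q$, the policy $\pi_\theta$, and the critic $V_w$ with neural networks; the update structure stays the same.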
Several scholars have conducted reviews focusing on individual functional domains or specific tasks within the five domain controllers of intelligent electric vehicles. For instance, references [27,28,29,30,31] concentrated on tasks in the intelligent driving domain, references [32,33,34] examined the powertrain domain, references [35,36,37] addressed the chassis domain, references [38,39,40] focused on the cockpit domain, and reference [41] reviewed the body domain, as summarized in Table 2. Overall, Table 2 highlights that most studies are concentrated in the intelligent driving and powertrain domains, where DRL has been extensively applied to perception, decision making, and energy management. In contrast, research on the chassis, cockpit, and body domains remains limited and exploratory. This imbalance reflects the domain-dependent nature of DRL applications and the evident maturity gaps across functional areas.
Although industry efforts toward hardware-based multi-domain fusion have begun, algorithm-level integration remains underexplored. As Table 2 indicates, no study has yet provided a systematic review of cross-domain DRL applications, underscoring the need for research that transcends single-domain tasks and advances toward unified, scalable solutions. At the same time, while hardware integration promises cost reduction, true cross-domain fusion faces substantial challenges due to the wide variety of tasks and the need for diverse domain expertise, which significantly increases software and algorithmic complexity [42,43,44,45]. To address these challenges, recent academic work has shifted attention toward algorithm-level integration across domains. However, a comprehensive synthesis of such approaches is still lacking, leaving the pathways and mechanisms for effective cross-domain DRL integration insufficiently understood.
In response to the above issues and to further advance the application of DRL in intelligent electric vehicle domain control, this paper systematically examines four key functional areas: intelligent driving, powertrain, chassis, and cockpit. As shown in Figure 1, research efforts are unevenly distributed. The intelligent driving and powertrain domains have been extensively studied, while the chassis and cockpit domains remain comparatively underexplored. Beyond these single-domain studies, two emerging research directions can be highlighted. Domain fusion seeks to integrate control tasks across domains such as driving, braking, and suspension, yet it faces the challenge of rapidly increasing system complexity. Domain transfer, in contrast, aims to leverage knowledge gained in one domain to improve adaptability in another, but the lack of unified frameworks constrains its effectiveness. Taken together, these insights suggest that future research should move beyond isolated investigations toward the development of unified DRL strategies that enable collaborative control across multiple domains.
Figure 2 sorts the keywords of the literature related to intelligent electric vehicle domain control technology. Figure 2a is based on a systematic search of the Web of Science database for articles related to smart cars published between 2015 and 2025, using the title of each section as the keywords. Articles related to smart cars were included, irrelevant ones were excluded, and the results were summarized to form the figure. Figure 2b identifies the relationships and data links between each domain and each functional module. This article does not review the body domain because its tasks are relatively fixed and simple, its functional safety level and computing power requirements are low, and relatively little DRL research has addressed it.
The existing research mainly focuses on review and algorithm application within a single functional domain. Although rich results have been accumulated in the intelligent driving domain, powertrain domain, and cockpit domain, the exploration of cross-domain fusion and unified architecture is still insufficient. In addition, different reviews mostly focus on a single task or a single type of algorithm, and they lack a systematic comparison and challenge summary of DRL in multi-task and multi-domain scenarios. Therefore, there is an urgent need for a review that can comprehensively sort out the application of DRL in different functional domains from the perspective of the domain controller and reveal cross-domain fusion and future development directions. Based on this, the innovations of this article are that it systematically summarizes the application and challenges of DRL in the four major functional domains from the perspective of the intelligent vehicle domain controller; reviews the research progress of single-domain multi-task fusion and cross-domain multi-task fusion, especially the application of end-to-end algorithm architecture in the intelligent driving domain; and explores potential paths to combine transfer learning with DRL to improve the generalization and adaptability of intelligent vehicles in complex working conditions. The details are as follows:
(1) Innovatively starting from the perspective of the intelligent vehicle domain controller, the research results and application status of DRL algorithms in the four functional domains of intelligent vehicles are sorted out, and the solution ideas regarding DRL algorithms’ application to the difficult problems of intelligent vehicle domain control technology are discussed.
(2) From the two aspects of single-domain multi-task fusion and multi-domain multi-task fusion, the research progress on cross-domain fusion methods of intelligent vehicles is reviewed, focusing on the research progress of end-to-end algorithm architecture in multi-task fusion in the intelligent driving domain, and summarizing the status of multi-domain fusion research guided by the three elements of “vehicle, road, and person.”
(3) We analyze the limitations and challenges of using the DRL algorithm in solving the technical problems of intelligent vehicle domain control, and explore the application effect and potential of combining the transfer learning method with the DRL algorithm in intelligent vehicle domain control technology.
As shown in Figure 3, the rest of this paper is structured as follows: Section 2 summarizes the application status of the DRL algorithm in the intelligent driving domain of intelligent vehicles. Section 3 summarizes the application status of the DRL algorithm in the powertrain domain of intelligent vehicles. Section 4 summarizes the application status of the DRL algorithm in the chassis domain of intelligent vehicles. Section 5 summarizes the application status of the DRL algorithm in the cockpit domain of intelligent vehicles. Section 6 discusses the application progress of the DRL algorithm in the multi-domain fusion stage of intelligent vehicles. Section 7 discusses the application direction of transfer learning in the domain control technology of intelligent vehicles. Finally, the conclusion is given, and the future development trend for the DRL algorithm in the domain control technology of intelligent vehicles is discussed.

2. Intelligent Driving Domain

As shown in Figure 4, this section sorts out the key technology research of target detection, target tracking, positioning technology, trajectory prediction, and decision planning, and presents a comprehensive summary and analysis of the application of the DRL algorithm in various tasks of the intelligent driving domain controller.

2.1. Target Detection

Target detection is an important task of the environment perception system of intelligent electric vehicles. It analyzes the image or video data collected by the sensors to obtain information about road traffic participants other than the vehicle itself, including vehicle and pedestrian recognition, traffic sign recognition, obstacle position recognition, and lane line and road edge detection [46,47,48]. The sensors used for target detection mainly include cameras, ultrasonic radars, millimeter-wave radars, and lidars. According to the type of sensor used, intelligent electric vehicle target detection can be divided into vision-based, radar-based, and multi-sensor-fusion-based methods [49,50]. The current difficulties and solutions for intelligent electric vehicle target detection are shown in Table 3.

2.1.1. Vision-Based Target Detection

Cameras are the most widely used automotive target detection sensors. They can obtain rich geometric features and color information from the traffic environment and fully express target details [58,59,60,61,62]. Traditional vision-based target detection methods mainly include Viola–Jones (VJ) detectors [63], histogram of oriented gradients detectors [64], and deformable part models [65]. These methods combine manually designed features with machine learning classifiers. They suffer from high computational complexity and poor feature robustness and cannot meet the high reliability requirements of intelligent vehicles. According to the characteristics of the algorithm process, DL algorithms can be divided into single-stage and two-stage target detection algorithms. A single-stage algorithm directly applies a single detection network [51,66], whereas a two-stage algorithm first generates candidate regions and then classifies and refines them.
In terms of single-stage algorithms, Naimi et al. [52] used MobileNetV2 to replace the original VGG16 backbone of the SSD algorithm and used only the first two feature mapping layers of the SSD to address its weakness in handling small objects such as traffic signs and signal lights. The results show that the improved algorithm achieves higher accuracy and speed on smaller traffic signs and signal lights. Wang et al. [67] proposed an improved YOLOv4 algorithm for real-time vehicle target detection in severe weather conditions. Wang et al. [68] introduced the large kernel attention (LKA) technology based on the YOLOv5 algorithm to decouple the large kernel convolution and proposed an LKA-YOLO algorithm. While reducing the computational workload of the model and improving its deployability on the vehicle platform, it also significantly enhanced the ability of YOLOv5 to handle dense and small targets. In terms of two-stage algorithms, Yang et al. [69] proposed replacing the original VGG-16 feature extractor of the faster region-based convolutional neural network with MA-ResNet, extracting different layers of MA-ResNet to construct a feature pyramid. The improved target detection model achieved more accurate detection results in multi-size, and especially small-size, target detection. Luo et al. [70] improved the multi-scale representation capability of the traditional feature pyramid network (FPN) model by simplifying the weighted dual-path FPN to solve the problem of real-time vehicle detection in congested traffic. They also applied DIoU-soft to off-target detection in scenarios with dense target distribution, reducing the impact of missed detection caused by densely distributed vehicles.
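To make the DIoU term mentioned above concrete, the sketch below computes the distance-IoU score as it is commonly defined: plain IoU minus the squared distance between box centers divided by the squared diagonal of the smallest enclosing box. It is not the procedure of Luo et al. [70]; the function name `iou_and_diou` and the `(x1, y1, x2, y2)` box layout are assumptions.

```python
def iou_and_diou(a, b):
    """Return (IoU, DIoU) for two boxes given as (x1, y1, x2, y2), x1 < x2, y1 < y2."""
    # Intersection rectangle (zero area if the boxes do not overlap)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    iou = inter / (area_a + area_b - inter)

    # Squared distance between the two box centers
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both
    ex1, ey1 = min(a[0], b[0]), min(a[1], b[1])
    ex2, ey2 = max(a[2], b[2]), max(a[3], b[3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2

    return iou, iou - d2 / c2


iou, diou = iou_and_diou((0, 0, 2, 2), (1, 0, 3, 2))
print(round(iou, 4), round(diou, 4))
```

Because the center-distance penalty lowers the score of overlapping boxes whose centers are far apart, a suppression step based on DIoU is less likely than one based on plain IoU to discard a neighboring vehicle in dense traffic.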

2.1.2. Radar-Based Target Detection

Radar can present object size, position, and motion status information through point cloud features and is not easily affected by ambient lighting, which helps reduce false and missed detections [71,72,73,74,75]. Traditional radar-based target detection methods include likelihood ratio detection [76], track-before-detect [77], and constant false alarm rate detection [78]. These methods are based on statistical characteristics and depend heavily on prior knowledge, so they cannot meet the requirements of intelligent electric vehicle target detection [79]. DL algorithms can automatically learn target detection features from complex point clouds, obtain richer features, and improve vehicle detection accuracy [80].
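As an illustration of the statistical style of the traditional methods, the following is a minimal sketch of one common constant false alarm rate variant, cell-averaging CFAR, on a one-dimensional power profile. The function name `ca_cfar`, the window sizes, and the fixed threshold factor `scale` are assumptions; in practice the factor is derived from the desired false alarm probability and a noise model, which is exactly the prior knowledge the text refers to.

```python
def ca_cfar(power, num_train=4, num_guard=1, scale=3.0):
    """Cell-averaging CFAR: flag cells whose power exceeds scale * local noise mean.

    num_train and num_guard are counted per side of the cell under test.
    """
    half = num_train + num_guard
    detections = []
    for i in range(half, len(power) - half):
        # Training cells on each side, skipping the guard cells next to cell i
        left = power[i - half:i - num_guard]
        right = power[i + num_guard + 1:i + half + 1]
        noise = (sum(left) + sum(right)) / (len(left) + len(right))
        if power[i] > scale * noise:
            detections.append(i)
    return detections


profile = [1.0] * 20
profile[10] = 20.0          # one strong return in flat noise
hits = ca_cfar(profile)
print(hits)
```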
Sun et al. [81] established a scene-aware lidar point cloud coding framework that fully utilizes 2D texture and 3D topological information to segment movable objects, and established a deep network to eliminate the temporal and spatial redundancy of point cloud semantic segmentation, thereby reducing the impact of moving objects in lidar data. Li et al. [82] proposed a densely connected kernel point network to effectively extract rich semantic context information and valuable geometric features from local point cloud regions, and introduced a kernel point convolution attention module to effectively enhance the distinguishability of point cloud data features. Li et al. [53] constructed a dual-stream YOLOv4-MobileNetv3 target detection network model with a backbone network (MobileNetv3) structure. This model structure not only reduces the time of point-cloud-based target detection but can also achieve high-precision target detection under low-light driving conditions. In recent years, scholars have begun to study the application of graph neural networks (GNNs) in point cloud data processing. Graph neural networks are an effective method for applying graph structured data to deep learning. They can capture local and global information of point cloud data and achieve point cloud data classification tasks. Chen et al. [54] proposed a point cloud processing method based on graph neural networks, called point cloud neural transform, thereby improving the point cloud data processing speed of large-scale scenes. The DuEqNet structure proposed by Wang et al. [55] performs equivariant column encoding on a graph neural network (GNN), thereby extracting local and global equivariance features, respectively, and reducing the impact of vehicle steering on detection accuracy.

2.1.3. Fusion-Based Target Detection

Different types of sensors have different functions and present traffic environment information in different forms [83]. Target detection based on a single sensor often cannot achieve accurate perception of the vehicle under all working conditions due to the characteristics of each sensor. Multi-sensor fusion perception technology can effectively utilize cross-modal complementary information, overcome the limitations of a single sensor, and improve the performance of the perception module [84,85,86,87]. Due to the complementary characteristics of cameras and lidars, fusing the two is one of the most researched areas.
According to the different fusion stages, fusion perception technology can be mainly divided into post-fusion and pre-fusion [88]. Post-fusion technology is based on the traditional multi-source information fusion theory. It performs data fusion after the detection and segmentation tasks are respectively completed on data from multiple sensors. It mainly uses Kalman filtering [89], weighted averaging [90], Bayesian estimation [91], and other algorithms to achieve the fusion of processing results. However, with this solution, it is difficult to overcome the inherent defects of different sensors. Therefore, relevant scholars prefer pre-fusion technology based on the DL algorithm, which directly inputs the raw data collected by the sensor into the fusion model [92]. Xu et al. [56] proposed a multimodal fusion framework called FusionPainting. Compared with the use of image data or point cloud data alone, it significantly improves the target detection effect. Li et al. [57] proposed a multimodal feature fusion network called LRPN. Sparse point cloud and image information is fused in an overlapping manner, which improves the detection accuracy under different lighting conditions. Lin et al. [93] proposed the CL3D two-stream architecture to generate pseudo-LiDAR from RGB images, perform pseudo-point enhancement on the original LiDAR, and design a point-guided fusion module to combine semantic and geometric features, thereby improving the performance of multimodal 3D object detection. Liu et al. [94] used the CB-Fusion module to enhance point cloud features by utilizing the rich semantic information absorbed from image features in a cascaded bidirectional interactive fusion manner.
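The weighted-averaging form of post-fusion can be sketched as inverse-variance weighting of per-sensor results that have already been computed independently. The function name `fuse_weighted` and the example variances are assumptions for illustration, not values from the cited work.

```python
def fuse_weighted(estimates):
    """Fuse independent (value, variance) estimates by inverse-variance weighting."""
    weights = [1.0 / var for _, var in estimates]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total
    # The fused variance is smaller than any single input variance
    return value, 1.0 / total


# Hypothetical range to the same object from a camera (noisier) and a lidar
fused_value, fused_var = fuse_weighted([(10.0, 4.0), (12.0, 1.0)])
print(fused_value, fused_var)
```

The fused estimate lies closer to the more reliable sensor, but because fusion happens after each sensor has produced its own result, information that one sensor missed entirely cannot be recovered at this stage, which is the limitation that motivates pre-fusion.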

2.2. Target Tracking

Building on target detection, target tracking selects an area or object in an image sequence or video as the target and then finds its motion trajectory, shape, and position in the following consecutive frames [95,96,97,98,99]. The appearance of a target can differ greatly between consecutive video frames because of undirected target motion that changes the appearance of the target and the scene, non-rigid target structure, occlusion between targets and between targets and the scene, and the movement of the on-board sensors. Traditional target tracking methods, including generative and discriminative model methods, therefore struggle to meet actual driving needs [100,101]. DL algorithms have powerful semantic information extraction and generalization capabilities and, depending on the tracking objects and application scenarios, are increasingly adopted for single object tracking (SOT) and multiple object tracking (MOT) in intelligent electric vehicles [102]. The current difficulties and solutions for intelligent electric vehicle target tracking are shown in Table 4.
In terms of SOT, Lyu et al. [106] developed a Siamese-based tracker for real-time autonomous driving. Their design blends ensemble learning, uses two modern backbones for feature extraction, and adds a channel-attention module to the high-dimensional features, boosting robustness against occlusion. Teng et al. [103] designed a single target tracking model using the features of the target in each frame and its dynamic changes. The model consists of a spatial long short-term memory (LSTM) module and a temporal LSTM module, which can obtain the temporal change characteristics of the video target across frames and record the target’s motion and appearance changes. In terms of MOT, DL-based MOT algorithms can be divided into two categories, detection-based tracking (DBT) and joint detection and tracking (JDT), depending on whether the detection and tracking stages are independent. The DBT algorithm is mainly divided into two stages, detection and tracking, while the JDT algorithm integrates detection and tracking into one framework [107]. Li et al. [108] proposed a fast-moving target detection and motion state tracking method based on point cloud information. They also designed a point cloud registration method to achieve high-precision estimation of the target motion state. Hassaballah et al. [104] proposed a detection and tracking method based on visibility enhancement of YOLOv3 to reduce the impact of severe weather conditions on the performance of target detection and tracking algorithms for autonomous vehicles. Li et al. [109] developed a convolutional neural network (CNN) online single-shot multi-target tracking model. The tracker embeds a feature combination module to cope with target shape changes, employs an attention mechanism network that selectively highlights tracking-relevant cues, and couples triplet loss with online instance-matching loss to separate look-alike objects, cutting false and missed alarms in dense traffic. Dong et al. [105] proposed a segmentation-based JDT multi-target tracking model. Based on the Track R-CNN framework, the model adds position normalization features to refine target segmentation and uses a Bi-LSTM module to perform forward and reverse matching of consecutive frames of the target, reducing the impact of the surrounding background environment on target tracking.
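The two-stage structure of DBT can be illustrated with a minimal association step: detections arrive from an independent detector, and the tracker only links them to existing tracks. The sketch below uses greedy matching on box overlap, which is one simple choice among many and is not the method of any cited work; the names `box_iou` and `associate` and the threshold `iou_min` are assumptions.

```python
def box_iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def associate(tracks, detections, iou_min=0.3):
    """Greedily link track boxes to detection boxes, best overlap first.

    tracks: dict mapping track id -> last known box
    detections: list of boxes from the current frame
    Returns (matches, unmatched): track id -> detection index, and the
    indices of detections that would start new tracks.
    """
    pairs = sorted(
        ((box_iou(box, det), tid, j)
         for tid, box in tracks.items()
         for j, det in enumerate(detections)),
        key=lambda p: p[0],
        reverse=True,
    )
    matches, used = {}, set()
    for score, tid, j in pairs:
        if score < iou_min:
            break                       # remaining pairs overlap too little
        if tid in matches or j in used:
            continue                    # each track and detection is used once
        matches[tid] = j
        used.add(j)
    unmatched = [j for j in range(len(detections)) if j not in used]
    return matches, unmatched


tracks = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
detections = [(21, 21, 31, 31), (1, 0, 11, 10), (50, 50, 60, 60)]
matches, unmatched = associate(tracks, detections)
print(matches, unmatched)
```

A JDT model, by contrast, would produce the detections and the identity links from one network rather than from two separate stages.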

2.3. Positioning Technology

Positioning technology helps intelligent vehicles estimate their own position and pose. It plays an important auxiliary role in the perception, planning, and control processes and is an indispensable part of the intelligent driving domain [110,111,112,113,114]. Depending on the type of sensor used, intelligent vehicle positioning technology can be divided into satellite-and-inertial-navigation-based, vision-based, radar-based, and fusion-based methods [115]. Among them, positioning based on satellite and inertial navigation has poor accuracy: the signal is easily interfered with and blocked, and errors readily accumulate. Positioning based on millimeter-wave radar also cannot meet the accuracy needs of intelligent vehicles. Although lidar-based positioning has higher accuracy, the positioning algorithm may fail when the environmental structure changes or the intensity of light reflected from objects in the environment is insufficient. Therefore, this section does not discuss these positioning technologies. The current difficulties and solutions of intelligent vehicle positioning technology are shown in Table 5.

2.3.1. Vision-Based Positioning Technology

Visual positioning refers to predicting the absolute position of a camera in a map coordinate system based on the current camera image [121,122]. Compared with positioning methods based on the Global Positioning System (GPS) and radar, it has a large detection range, provides rich and intuitive information, and offers high-precision perception [123]. Traditional visual positioning methods extract features and match key points or key frames to solve for the position [124]. Although they achieve high accuracy, they rely on matching local features and generalize poorly [125]. DL algorithms can map images directly to position and pose, replacing the feature extraction, pose solution, and other steps of the traditional methods and improving the robustness of image feature representation [126].
Ma et al. [116] developed ASD-SLAM, a visual SLAM system based on an adaptive scale descriptor whose loss function exploits the negative-to-positive distance ratio as a scale cue, enabling its CNN to achieve more accurate localization in challenging environments. Shi et al. [117] designed a dual-branch CNN structure that can extract features from visual images and satellite maps, respectively, and designed a Levenberg–Marquardt module to reduce the difference between projected features and observed features, thereby improving the longitudinal positioning accuracy of the vehicle. Tian et al. [127] introduced a loss function term for 3D scene geometry constraints, and used motion information, 3D scene geometry information, and image content to train a visual localization network based on ResNet-50, thereby improving localization accuracy in different scenarios. Hu et al. [128] combat environmental drift in visual localization by embedding a Grad-SAM loss into a multi-domain image-translation network, using feature-map cues for high-accuracy retrieval. To handle motion and illumination changes, Song et al. [129] devise a hybrid-attention framework that models long-range pixel dependencies, yielding geometry-robust features and sturdier CNN localizers in dynamic scenes.

2.3.2. Fusion-Based Positioning Technology

Similar to the target detection task, a single sensor has inevitable defects when used for positioning. In practical intelligent electric vehicle positioning systems, sensor fusion solutions are often used to achieve high-precision and highly robust positioning [130]. Most current fusion positioning technologies fuse visual sensors with radar sensors so that their respective strengths and weaknesses complement each other and improve overall positioning performance [131]. At present, traditional fusion positioning methods are mainly either filtering-based or optimization-based. Filtering-based fusion positioning is limited by the Markov assumption and is mostly used in odometry; it is difficult for such methods to optimize and update all historical position and attitude data. Optimization-based fusion positioning is highly dependent on the environment. In scenes with similar features, uneven point cloud density, and unbalanced features, the estimation of position and attitude is prone to degradation, resulting in large positioning errors [132]. The advantages of DL algorithms in loop closure detection and feature description give them higher positioning accuracy in dynamic scenes and different environments than traditional fusion positioning methods [133].
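As a minimal sketch of the filtering-based idea, the following scalar Kalman filter fuses an odometry displacement with an absolute position measurement. The one-dimensional state, the function name `kalman_step`, and the noise variances are illustrative assumptions and are not taken from any of the cited works. The prediction uses only the previous estimate, which reflects the Markov assumption: earlier poses are never revisited or re-optimized.

```python
def kalman_step(x, p, u, z, q, r):
    """Run one predict/update cycle of a scalar Kalman filter.

    x, p : previous position estimate and its variance
    u    : odometry displacement since the previous step
    z    : absolute position measurement (e.g., from a map match)
    q, r : process and measurement noise variances (assumed known)
    """
    # Predict: propagate the previous estimate only (Markov assumption).
    x_pred = x + u
    p_pred = p + q
    # Update: weight prediction and measurement by their uncertainties.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new
```

Calling `kalman_step` once per time step yields a running estimate; because each call overwrites the previous state, a past pose error cannot be corrected later, which is the limitation that optimization-based methods try to address.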
For multi-sensor fusion-based localization, several deep learning frameworks have emerged. Kang et al. [134] extract RGB cues with a CNN, fuse LiDAR descriptors via a GNN, and match the joint features to landmark maps for vehicle pose estimation. Ibrahim et al. [135] propose UnLoc, which applies sparse 3D convolutions to cylindrical point-cloud features and combines them—through slot-attention-filtered 2D convolutions—with radar and image data, markedly improving single- and multi-sensor robustness. Tibebu et al. [118] design an end-to-end LiDAR–camera network that compresses latent representations and feeds them into a recurrent neural network (RNN), reducing both translation/rotation errors and training time. Li et al. [119] present a LiDAR framework that segments moving objects using spatiotemporal cues and employs an iterative extended Kalman filter to register raw point clouds with IMU data. Finally, Almalioglu et al. [120] introduce the self-supervised GRAMME model, which leverages attention-based fusion of vision and radar signals to maintain high-precision ego motion estimates even under adverse weather.

2.4. Trajectory Prediction

Trajectory prediction is usually located between the perception and positioning module and the decision-making and planning module in the intelligent driving domain. Its main task is to process the input information of the perception and positioning module and predict the target’s movement trajectory in the future [136]. Depending on the prediction object, it can be divided into pedestrian trajectory prediction and vehicle trajectory prediction, because pedestrians and vehicles have significant differences in behavior patterns, movement characteristics, and prediction requirements. Accurate prediction of the trajectory of vehicles and moving pedestrians can enhance the intelligent driving system’s ability to perceive changes [137]. The current difficulties and solutions for trajectory prediction of intelligent driving vehicles are shown in Table 6.

2.4.1. Pedestrian Trajectory Prediction

Pedestrian trajectory prediction is a difficult task because each pedestrian on the road has different movement patterns. The movement trajectory of each pedestrian is affected by many factors, including their own movement mode, potential goals or intentions, and environmental structure [138]. Conventional approaches, such as social-force models, Markov models, and Bayesian networks, struggle to capture the intricate dynamics of pedestrian–pedestrian and pedestrian–environment interactions, and they become unwieldy when handling high-dimensional data [139]. DL algorithms treat pedestrian trajectory prediction as the time-series problem it essentially is, and they can capture the interaction characteristics between the predicted pedestrian and the surrounding pedestrians [140].
Song et al. [142] proposed a convolutional LSTM to solve the problem that ordinary DL algorithms cannot learn the spatial information of pedestrians in dense crowds. Through multi-channel tensors and convolutions, it can better learn the spatiotemporal interaction between pedestrians and the environment. Xue et al. [143] proposed PoPPL, an LSTM-based pedestrian path prediction algorithm that classifies observed trajectories into routes defined by clustering and generates trajectories corresponding to the predicted destination area, thereby improving prediction accuracy in busy areas. Hsieh et al. [144] proposed a semi-supervised conditional generative adversarial network (GAN) for trajectory prediction, built on a social conditional GAN. Yang et al. [145] addressed the error accumulation of RNN-based pedestrian trajectory prediction models under long-term prediction and proposed a prediction network, SGAMTE-Net, based on a heterogeneous graph attention mechanism and multimodal trajectory endpoints, which improves the accuracy of long-term trajectory prediction and can predict multiple potential trajectories at the same time. Youssef et al. [146] proposed a spatiotemporal multi-graph convolutional network. By constructing a multi-graph adjacency matrix, they captured the social interaction relationship between pedestrians based on position and velocity, thereby simulating the spatiotemporal structure in real scenes and using the relationship between multiple graphs to predict more accurate trajectories.
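To make the notion of a multi-graph adjacency matrix concrete, the sketch below builds one adjacency matrix from pedestrian positions and another from velocities. The Gaussian kernel, the kernel widths, and the function names are assumptions chosen for illustration; the weighting actually used in [146] is not stated here and may differ.

```python
import math


def gaussian_adjacency(vectors, sigma):
    """Build a symmetric adjacency matrix from pairwise Euclidean distances.

    vectors : list of (x, y) tuples, one per pedestrian
    sigma   : kernel width (hypothetical tuning parameter)
    """
    n = len(vectors)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # no self-loops
                dx = vectors[i][0] - vectors[j][0]
                dy = vectors[i][1] - vectors[j][1]
                adj[i][j] = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma ** 2))
    return adj


def multi_graph(positions, velocities, sigma_pos=2.0, sigma_vel=1.0):
    """Return one graph per interaction cue: proximity and similar motion."""
    return {
        "position": gaussian_adjacency(positions, sigma_pos),
        "velocity": gaussian_adjacency(velocities, sigma_vel),
    }
```

Pedestrians that are close together receive a large weight in the position graph, and pedestrians moving alike receive a large weight in the velocity graph; a graph convolutional network could then aggregate neighbor features over each graph separately.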

2.4.2. Vehicle Trajectory Prediction

Compared with pedestrians, vehicle movement is affected by its own volume, inertia, driving rules, and various road environments, and is also highly uncertain [147]. Traditional vehicle trajectory prediction methods mainly predict driver behavior intentions through probability and statistical methods or use kinematic models to predict trajectories. Since manually designed features do not have learning capabilities and are sensitive to noise, they are difficult to expand to complex scenarios and complex trajectories. DL algorithms can consider the interaction characteristics between multiple vehicles, driving intentions, and a diversity of trajectories when predicting vehicle trajectories, making the predicted trajectory more feasible in real-world scenarios [148].
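For reference, a kinematic baseline of the kind mentioned above can be sketched as a constant-velocity model. The function name, the argument layout, and the choice of a constant-velocity (rather than constant-turn-rate) model are assumptions made for illustration only.

```python
import math


def predict_constant_velocity(x, y, speed, heading, horizon, dt):
    """Extrapolate a vehicle position assuming constant speed and heading.

    x, y    : current position [m]
    speed   : current speed [m/s]
    heading : current heading [rad], measured from the x-axis
    horizon : prediction horizon [s]
    dt      : time step between predicted points [s]
    """
    steps = int(round(horizon / dt))
    vx = speed * math.cos(heading)
    vy = speed * math.sin(heading)
    # One predicted point per time step, starting one step ahead.
    return [(x + vx * dt * k, y + vy * dt * k) for k in range(1, steps + 1)]
```

Such a model has no notion of surrounding vehicles or driver intention, so its error grows quickly during lane changes and turns, which is the gap that learning-based predictors aim to close.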
Dai et al. [149] designed a semi-supervised vehicle trajectory prediction model that combines the AND/OR graph (AOG) and the spatiotemporal LSTM (ST-LSTM). The concept of sub-intention was introduced to improve the vehicle trajectory classification and reduce the prediction model’s dependence on high-quality data sets. The model learned the sub-intention and used it to infer the possible trajectory through the AOG, which improved the interpretability of the trajectory prediction process. Choi et al. [141] proposed a driving style attention GAN (DSA-GAN) to account for the impact of different driving styles on target vehicle trajectory prediction. Hou et al. [150] speed up multi-vehicle prediction by replacing LSTMs with a structured Transformer that processes spatio-temporal interactions in parallel. To handle junction uncertainty, Wang et al. [151] combine a hybrid driving style classifier with a Transformer interaction encoder, outputting multimodal, driver-like motion plans. Jeon et al. [152] introduce SCALE-Net, which couples edge-enhanced GCN interaction embeddings with an LSTM sequence model; its node-scalable design remains stable across traffic densities.

2.5. Decision Planning

Decision planning is the core of intelligent vehicles. It makes task decisions based on driving needs and then plans a safe path between two points as the vehicle’s driving trajectory, avoiding possible obstacles [153,154,155]. Trajectory planning usually operates on two tiers: global and local. The global planner scans the map to chart a feasible route from start to goal without any time-dependent constraints [156,157]. After receiving the global path, the behavioral decision-making module combines it with the environmental information obtained from the perception module to make specific behavioral decisions. Local trajectory planning builds on the results of global planning and behavioral decision making, and combines them with local environmental information and the vehicle’s own state to generate a spatiotemporal trajectory that meets specific constraints [158]. The current difficulties and solutions for intelligent electric vehicle decision making and planning are shown in Table 7.
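The global tier can be illustrated with a small graph search. The sketch below runs a breadth-first search over a 4-connected occupancy grid and returns a route with no timing information attached; the grid encoding and the function name `plan_global_route` are assumptions for illustration, and production planners typically search a road network rather than a grid.

```python
from collections import deque


def plan_global_route(grid, start, goal):
    """Breadth-first search on a 4-connected occupancy grid.

    grid  : list of rows, where 0 marks a free cell and 1 an obstacle
    start : (row, col) of the start cell
    goal  : (row, col) of the goal cell
    Returns the list of cells from start to goal, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    parents = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in parents:
                parents[(nr, nc)] = cell
                queue.append((nr, nc))
    return None
```

The returned cell sequence is purely geometric; speeds and time stamps are added later by the local trajectory planner, in line with the two-tier split described above.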

2.5.1. Behavioral Decisions

Traditional decision-making algorithms are usually rule-oriented models that cannot flexibly respond to sudden changes and uncertainty in the environment [163]. They do not work well in some specific scenarios, such as intersections, high-speed lane changes, and ramp merging [164]. RL algorithms and DRL algorithms can effectively cope with such complex environmental changes and uncertainties faced by intelligent vehicles.
Seong et al. [165] proposed a driving strategy based on soft actor–critic (SAC) to deal with the problem of autonomous driving conflicts at intersections. By combining the spatiotemporal attention mechanism, the spatiotemporal information of the agent during driving is extracted, thereby helping the agent to make reasonable decisions. Jafari et al. [159] combined hand-designed logic with data-driven reinforcement learning agents to design a cognitive hybrid autonomous motion decision-making model. It is used to solve the problem of unprotected left turns at T-shaped junctions and crossroads without traffic control and signal protection under congested traffic conditions. Guo et al. [166] used deep deterministic policy gradient (DDPG) to learn continuous longitudinal acceleration and deceleration, integrated information from adjacent lanes, and made discrete lane change decisions based on deep Q-network (DQN), thereby achieving ecological and safe driving of autonomous vehicles at continuous signal intersections. Mirchevska et al. [160] embed a DQN safety filter that keeps highway lane changes collision-free. Zhao et al. [167] apply TD3 to co-optimize lane-keeping and on-ramp merges, delivering smoother, faster entries. Liu et al. [161] craft a graph-RL framework that fuses GNNs with DRL, tackling diverse interactive-traffic decisions on highways.
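One common way to combine a learned value function with a safety layer is action masking: unsafe actions are removed before the greedy choice is made. The sketch below illustrates this general idea only; it is not the specific filter of [160], and the action names, the `is_safe` flags, and the `keep_lane` fallback are hypothetical.

```python
def safe_greedy_action(q_values, is_safe, fallback="keep_lane"):
    """Pick the highest-valued action among those flagged as safe.

    q_values : dict mapping action name -> estimated Q-value
    is_safe  : dict mapping action name -> bool from a rule-based check
               (hypothetical; e.g., a gap test on the target lane)
    fallback : action returned when no action is flagged safe
    """
    safe_actions = [a for a in q_values if is_safe.get(a, False)]
    if not safe_actions:
        return fallback
    return max(safe_actions, key=lambda a: q_values[a])
```

For example, if the agent values `change_left` most but the safety check rejects it, the function returns the best remaining safe action instead of the unsafe one.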

2.5.2. Trajectory Planning

Existing intelligent electric vehicle trajectory planning methods are widely borrowed from robot trajectory planning methods, such as graph search-based algorithms, sampling-based algorithms, and optimization-based algorithms [168,169]. However, the movement speed of robots is much slower than that of vehicles, and the application scenarios are also very different. Therefore, with conventional robot trajectory planning methods, it is difficult to meet the challenge of changing road scenes during vehicle driving and effectively avoid road obstacles [170,171]. RL algorithms and DRL algorithms provide new ideas and directions for the trajectory planning problems of intelligent electric vehicles in complex environments, enabling vehicles to achieve more human-like driving trajectories by learning from natural driving data or expert experience.
Qiao et al. [172] proposed a hierarchical DRL framework based on deep Q-network (DQN) and a dual deep Q-network (DDQN) for autonomous vehicle trajectory planning in urban environments. They also used a hybrid reward mechanism and heuristic exploration in the training process to improve the efficiency of vehicle behavior selection and the speed of model training. Chen et al. [173] used a distributed multi-agent reinforcement learning (MARL) method to coordinate the parking trajectories of multiple vehicles equipped with an automatic valet parking system in a parking lot. They accelerated the parking trajectory learning by introducing the traditional hybrid A* trajectory planning method and considered the conflict constraints of multiple vehicles to ensure the safety and efficiency of multi-vehicle planning. Li et al. [162] proposed an ecological driving planning method based on a hierarchical framework. The upper layer uses a queue-based traffic model to estimate the impact of traffic uncertainty. The lower layer builds a DDPG-based vehicle speed and trajectory controller based on traffic lights and traffic flow to ensure driving safety while reducing energy consumption. Zhu et al. [174] proposed a motion planning model based on hybrid reinforcement learning to address the impact of pedestrian distraction on the safety of autonomous vehicles at street crosswalks. When approaching pedestrians are detected, a more reasonable deceleration is adopted to reduce unnecessary braking. Tang et al. [175] proposed a driving controller based on SAC to solve the decision-making and motion planning problems in interactive traffic environments. They also achieved switching between high-speed safe lane change maneuvers and comfortable cruising by adjusting the weights of different reward functions. Cao et al. [176] proposed a reinforcement-learning-enhanced highway exit planner, which constructed a tree of possible actions and their results, simulated different scenarios to find the best action, and adjusted the vehicle motion when the AV could not exit, thereby increasing the probability of successfully exiting the highway.
This section systematically analyzes the key technologies of intelligent driving domain controllers, covering five core aspects: target detection, target tracking, positioning technology, trajectory prediction, and decision planning. It also discusses the current application status and typical solutions of DRL in each task. Across these five tasks, several common patterns emerge. In target detection, DRL has proven effective for robustness against occlusion and environmental variability, yet most improvements rely on deep feature extraction, leaving real-time deployment on vehicle hardware a persistent challenge. In target tracking, while Siamese-based and attention-enhanced networks improve resilience to occlusion and appearance changes, they often trade accuracy for latency, limiting applicability in dense traffic. Positioning technologies increasingly integrate multi-sensor fusion with DRL to mitigate drift and environmental noise, but high dependence on high-definition maps and costly sensors remains unresolved. In trajectory prediction, methods based on GANs, GNNs, and Transformers provide richer modeling of multi-agent interactions and uncertainty, but long-term prediction stability is still weak. Decision planning studies show that DRL can approximate human-like reasoning and interactive strategies, yet the safety verification and personalization issues remain open. Collectively, these findings indicate that while DRL has significantly advanced the intelligent driving domain by enhancing perception accuracy, adaptability, and planning intelligence, systematic solutions for cross-task collaboration, real-time efficiency, and interpretability are still lacking. These limitations indicate the necessity of an integrative review capable of synthesizing progress across tasks, identifying cross-domain challenges, and outlining potential research directions.

3. Powertrain Domain

As shown in Figure 5, this section summarizes the key technical research on power control, energy management, and thermal management in the powertrain domain, and provides a comprehensive summary and analysis of the application of the DRL algorithm in various tasks of the powertrain domain controller, as well as an analysis of its limitations.

3.1. Power Control

Power control aims to accurately control and adjust the power and torque output by multiple power components (internal combustion engine, motor/generator, battery, gearbox) to achieve longitudinal control of the vehicle [177,178]. Rule-based power control methods struggle with the nonlinearity and increasing complexity of the power system configurations of modern intelligent electric vehicles, as well as with the complex coupling relationships between components [179]. RL and DRL algorithms can match the continuity of their actions to that of the control actuators, which avoids continuity mismatch and sudden changes in action trajectories; this helps the vehicle account for various driving conditions and emergencies and realize multi-component coordinated control [180]. The current difficulties and solutions of power control for intelligent electric vehicles are shown in Table 8.
Beaudoin et al. [181] calibrated the parameters of the shift controller using a model-based reinforcement learning algorithm. The algorithm combined a saturated instantaneous cost function, allowing customization of the cost function width for each state dimension, and enhancing the shift performance evaluation. Zhang et al. [182] constructed a power control framework based on DDQN to avoid the overestimation problem in the DQN algorithm. By setting appropriate hyperparameters and batch training, it is easier to obtain an adaptive optimization model, enabling the vehicle to achieve optimal switching between multiple working modes. Liu et al. [183] optimized the parameters of the transmission shift controller. The algorithm can complete the controller optimization process by learning the probabilistic dynamic model of the system with only a small number of experiments, thereby achieving efficient and accurate shifting under different slopes and loads. Li et al. [184] introduced a game theory traffic simulation environment into the Q-learning-based highway autonomous driving power controller, which can characterize the interactive behavior of vehicles in traffic. By adding the power system operating state to the decision-making strategy and appropriately defining the reward function, the safety, comfort, performance, and energy efficiency requirements of the autonomous vehicle can be met. Kerbel et al. [185] proposed a training scheme for an agent that optimizes traction torque and transmission shifting strategies. It uses an off-policy actor–critic architecture that can handle mixed action spaces, i.e., the combination of continuous actions (torque) and discrete actions (shifting), and balances torque magnitude with shifting frequency.
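The overestimation issue mentioned for DQN can be seen by comparing the two bootstrap targets. The sketch below assumes that DDQN here refers to the standard double DQN formulation, in which the online network selects the next action and the target network evaluates it; the function names and the plain-list representation of Q-values are illustrative assumptions.

```python
def dqn_target(reward, gamma, next_q_target, done):
    """Standard DQN target: the target network both selects and evaluates."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, gamma, next_q_online, next_q_target, done):
    """Double DQN target: select with the online net, evaluate with the target net.

    next_q_online : Q-values of the next state from the online network
    next_q_target : Q-values of the next state from the target network
    """
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]
```

Because the maximum over noisy estimates is biased upward, the DQN target tends to be too large; decoupling selection from evaluation removes the `max` over the evaluating network, so the double DQN target never exceeds the DQN target computed from the same target-network values.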

3.2. Energy Management

Energy management aims to coordinate the power distribution between different power sources (including internal combustion engines, fuel cells, electric motors, and power batteries) according to the power demand, the operating status of the power system, and the differences in the working characteristics of different power sources [186,187,188]. Since different subsystems of a vehicle are usually independent of each other, optimizing multiple systems at the same time may introduce more complex problems. Therefore, the energy management system (EMS) of traditional hybrid vehicles usually focuses on optimizing the energy distribution between power sources and cannot take into account the influence of other factors and system requirements [189]. The application of the RL algorithm and DRL algorithm in the EMS of intelligent electric vehicles can more effectively process high-dimensional state information, more comprehensively optimize multiple complex objectives, and realize the integrated control of intelligent electric vehicle energy management and other tasks [190]. The current difficulties and solutions of intelligent electric vehicle energy management are shown in Table 9.
Tang et al. [191] combined DDPG with expert-assisted rules to achieve comprehensive optimization of energy management by weighing the fuel consumption cost, battery aging cost, and SOC sustainability reward function under different weight coefficients. Wu et al. [194] incorporated an over-temperature penalty and multi-stress-driven degradation cost of on-board lithium-ion batteries into the EMS evaluation indicators, and used the SAC algorithm to intelligently balance the multiple objectives of buses to achieve the optimization of LIB thermal safety and overall driving cost. Du et al. [192] used a new optimization method to update the weights of the Q neural network, achieving faster training speed and making the designed EMS show near-global optimal fuel economy in different driving cycles. Cui et al. [195] proposed a vehicle EMS based on TD3 that integrates driving style and traffic conditions. By using the simulated annealing-genetic algorithm to optimize the fuzzy c-means method, the traffic conditions were accurately identified in three categories, and different energy management strategies were implemented according to different traffic conditions. Tang et al. [193] develop a dual-DRL energy manager that couples DDPG for continuous throttle control with DQN for discrete gear selection, enabling coordinated multi-component control in a hybrid action space. In the energy management of power batteries or fuel cells, when the design of a DRL reward function involves life extension or state of charge maintenance, accurately detecting and estimating their state of charge and lifespan is crucial. In recent years, to meet this demand, some scholars have used DL methods to estimate or predict state time series data, which has become a research hotspot in this field [196,197,198].
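A weighted multi-objective reward of the kind described above can be sketched as follows. The quadratic SOC penalty, the reference SOC, the default weights, and the function name `ems_reward` are all assumptions for illustration; the cited works define their own cost terms and coefficients.

```python
def ems_reward(fuel_cost, aging_cost, soc, soc_ref=0.6,
               w_fuel=1.0, w_aging=1.0, w_soc=10.0):
    """Combine fuel, battery aging, and SOC deviation into one scalar reward.

    fuel_cost  : fuel consumption cost for the step (non-negative)
    aging_cost : battery degradation cost for the step (non-negative)
    soc        : battery state of charge in [0, 1]
    soc_ref    : SOC level the controller should sustain (assumed)
    w_*        : weight coefficients trading the objectives against each other
    """
    soc_penalty = (soc - soc_ref) ** 2
    # Costs are negated so that a lower total cost yields a higher reward.
    return -(w_fuel * fuel_cost + w_aging * aging_cost + w_soc * soc_penalty)
```

Raising `w_aging` relative to `w_fuel` would push a trained policy toward gentler battery usage at the expense of fuel economy, which is the trade-off that the weight coefficients are meant to expose.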

3.3. Thermal Management

Thermal management aims to control the operating temperature of power components (including batteries, motors, and fuel cells) according to the temperature characteristics of the power system to ensure the performance, safety, and service life of the power components [199,200,201]. However, with the improvement of the performance parameters of intelligent vehicles, more stringent requirements have been put forward for the effectiveness, dynamic response characteristics, and economy of the thermal management system. Traditional thermal management methods have high development costs, long transmission paths resulting in low control accuracy and poor reliability, and difficulty in achieving refined and energy-efficient optimal control [202]. The DRL algorithm can monitor the working status and environmental conditions of the vehicle, and adjust the cooling fan, coolant flow rate, and other parameters in real time based on the information, so that the power components can maintain the optimal temperature under various conditions. On the other hand, it can extend the life of the power components through predictive management, identify potential problems in real time, and take intervention measures to improve system reliability [203]. The current difficulties and solutions of thermal management of intelligent vehicles are shown in Table 10.
Arjmandzadeh et al. [204] proposed a thermal management optimization method for electric vehicle battery packs under extreme fast charging conditions based on the proximal policy optimization (PPO) algorithm. This method fully considers the detailed dynamic characteristics of the battery at the vehicle level to reduce battery pack degradation and battery thermal management power consumption. Li et al. [205] treat the coolant pump and radiator as separate agents and train them jointly with the CESL-MATD3 algorithm; their decentralized execution brings the two actuators into harmony, raising proton exchange membrane fuel cell thermal-loop efficiency. Billert et al. [206] design a CNN that deterministically forecasts battery temperature under varying heating and cooling set-points, furnishing accurate inputs for predictive thermal management. The automotive powertrain domain is a multi-physics coupling system. For thermal management, it is necessary to ensure good temperature control accuracy when the power output changes. Therefore, relevant scholars have studied the integration of automotive thermal management and energy management. Huang et al. [207] considered the impact of battery thermal effects on energy efficiency, extracted vehicle state sequence features including battery heat generation through gated recurrent units (GRU), and obtained the optimal battery thermal management and energy management strategies through DDQN. Wei et al. [208] integrated thermal management strategies with energy management strategies through DDPG, where thermal management requirements were determined by energy management strategies that introduced thermal awareness and health awareness, thereby achieving comprehensive optimization among efficiency, fuel economy, and temperature tracking performance.
This section systematically reviews key technologies and typical applications of DRL around the three core tasks of the powertrain domain controller: power control, energy management, and thermal management. Across power control, energy management, and thermal management, DRL has demonstrated the ability to coordinate mixed continuous–discrete actions, balance multi-objective trade-offs, and adapt dynamically to changing conditions. A clear trend is the integration of EMS with thermal and life-cycle management, moving from isolated subsystem optimization toward holistic powertrain coordination. Nevertheless, most studies still rely heavily on simulation-based training and simplified system models, which raises concerns about real-time feasibility and safety validation in real vehicles. Furthermore, while DRL reward functions can encode objectives such as energy economy, SOC stability, or temperature control, they often lack interpretability and are sensitive to hyperparameter design. These limitations highlight the necessity of developing hybrid approaches, including the integration of DRL with model-based control or transfer learning, to enhance robustness, scalability, and verifiability in safety-critical automotive applications.

4. Chassis Domain

As shown in Figure 6, this section summarizes the key technical research on steering control, braking control, and suspension control in the chassis domain, and provides a comprehensive summary and analysis of the application of the DRL algorithm in various tasks of the chassis domain controller, as well as an analysis of its limitations.

4.1. Steering Control

Steering control aims to control the lateral movement of the vehicle and improve its handling and safety [209,210,211]. In the electronic control steering system of intelligent electric vehicles, active steering technology has an increasing potential [212]. RL algorithms and DRL algorithms have good fitting capabilities and can effectively control the nonlinearity and unknown parameters of the active steering system, and they have strong robustness to system uncertainties [213]. The current difficulties and solutions of intelligent electric vehicle steering control are shown in Table 11.
Zhao et al. [216] proposed an active steering control strategy based on DDPG. The expected value and actual value of the steering wheel angle are used as reward functions to obtain the action value of the controller, that is, the expected current of the brushless DC motor. Compared with proportional integral derivative control, this strategy effectively controls the active steering of the vehicle without the need for prior knowledge. Morais et al. [217] proposed a hybrid lateral control framework combining PPO and robust linear quadratic regulator to improve the anti-interference performance of autonomous vehicles driving on main roads. However, this method cannot guarantee the generalization of learning and can only be used in similar training scenarios. Wasala et al. [214] used vehicle parameters and path trajectories as state spaces to improve the generalization ability of the DRL algorithm in the lateral control of autonomous vehicles. They generated a generalized reinforcement learning agent that can adapt to different external environmental conditions, such as different road friction, road topology, tires, and vehicles, without the need for additional training. Chao-zhong et al. [215] mitigate the mismatch in steering authority between human and automation with a DDPG-based cooperative controller that continuously reallocates control, drawing on real-time driver wheel angles, system steering commands, and vehicle–road cues.

4.2. Brake Control

Effective braking control mitigates both the frequency and severity of accidents, making it pivotal to vehicle safety. In intelligent electric vehicles’ electronic braking systems, anti-lock braking (ABS) and automatic emergency braking (AEB) are assuming an increasingly central role [218,219,220]. Traditional rule-based braking control strategies, such as constant spacing strategy and sliding mode control strategy, have poor adaptability to complex and changing driving environments, especially in environments where intelligent electric vehicles and traditional driver-controlled vehicles coexist, and it is difficult to achieve continuous changes in braking deceleration during braking [221,222]. In the design of intelligent electric vehicle braking control, the braking strategy obtained by RL algorithm and DRL algorithm can adjust the expected braking acceleration in real time according to the change of vehicle safety status and realize more precise control of the braking process [223]. The current difficult problems and solutions of intelligent electric vehicle braking control are shown in Table 12.
Mantripragada et al. [226] proposed a PPO-based ABS control method that adapts to changing tire characteristics by utilizing the available grip at the tire–road interface to achieve the optimal wheel slip rate, and that shortens the agent training time through parallelization. Dubey et al. [227] proposed an automatic throttle braking system based on DDPG, which is used for braking and throttle control strategies in two situations: a static obstacle in front of the vehicle and an intersection where two vehicles are approaching. Fanti et al. [224] proposed an AEB control method based on DDPG to solve the problem of autonomous vehicles braking in uncertain environments, especially when pedestrians cross the crosswalk. The method takes into account the uncertainty of the vehicle’s initial speed, the pedestrian’s initial position, and whether the pedestrian has crossed the road, and it manages the vehicle’s speed change through braking. Hou et al. [225] present a residual-RL post-braking controller, built on PPO, that adaptively modulates brake torque to suppress body pitch and longitudinal shake under varied braking scenarios, markedly smoothing ride quality.
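The wheel slip rate that an ABS controller regulates can be computed from the vehicle speed and the wheel's rotational speed. The sketch below uses the usual braking definition of longitudinal slip; the target slip of 0.2 and the absolute-error reward are hypothetical choices for illustration, since the slip that maximizes grip depends on the tire and road surface.

```python
def braking_slip_ratio(vehicle_speed, wheel_speed, wheel_radius, eps=1e-6):
    """Longitudinal slip during braking: 0 = free rolling, 1 = locked wheel.

    vehicle_speed : longitudinal vehicle speed [m/s]
    wheel_speed   : wheel angular speed [rad/s]
    wheel_radius  : effective rolling radius [m]
    """
    if vehicle_speed < eps:
        return 0.0  # slip is undefined at standstill; report no slip
    slip = (vehicle_speed - wheel_speed * wheel_radius) / vehicle_speed
    return min(max(slip, 0.0), 1.0)


def slip_tracking_reward(slip, target_slip=0.2):
    """Reward that peaks at zero when the slip equals the (assumed) target."""
    return -abs(slip - target_slip)
```

An RL agent rewarded this way would learn to modulate brake pressure so that the slip stays near the target rather than letting the wheel lock.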

4.3. Suspension Control

Suspension control governs the vertical movement of the vehicle by adjusting the height, stiffness, and damping of the suspension to achieve precise control of the driving posture [228,229,230]. In modern intelligent vehicle suspension systems, intelligent control of active and semi-active suspensions, represented by air suspension and magnetorheological suspension systems, has become a research hotspot [231,232]. Traditional suspension control methods such as model predictive control and linear quadratic Gaussian control usually face many unfavorable factors, such as numerous uncertain model parameters, unknown nonlinear disturbances, and limited signal feedback. Model parameter calibration is time consuming, and control design is extremely difficult [233]. The current difficulties and solutions for intelligent electric vehicle suspension control are shown in Table 13.
Ming et al. [235] designed a control method for a quarter-vehicle semi-active suspension system based on the DDPG algorithm. However, this control method only has a good control effect on a single type of road surface. On this basis, Yong et al. [236] proposed a full-vehicle semi-active suspension control method based on a switching SAC algorithm, which can identify different road types including horizontal roads and speed bumps, and thus apply a suitable SAC control model. Liang et al. [234] studied the parameter-free H∞ control problem of an active suspension system, designed a parameter-free Q-learning algorithm that can quickly calculate the H∞ control gain, used the adaptive judgment method to adjust the actor network and critic network, solved the game Riccati equation of the system, and gave the optimal solution of active suspension control. Wang et al. [237] devise an expert-guided TD3 variant with integrated soft- and hard-constraint modules for active suspension control: the soft layer speeds learning of the relationships among body acceleration, dynamic load, and control force, while the hard layer enforces safety bounds throughout operation.
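The quarter-vehicle model that such controllers act on can be written as two coupled masses. The sketch below gives the textbook two-degree-of-freedom dynamics together with a simple force saturation that stands in for a hard safety bound; all parameter values, the sign convention, and the clipping step are illustrative assumptions and are not taken from the cited studies.

```python
def quarter_car_accelerations(state, road, force, ms=300.0, mu=40.0,
                              ks=20000.0, cs=1500.0, kt=200000.0):
    """Accelerations of the sprung and unsprung masses of a quarter-car model.

    state : (zs, vs, zu, vu) body/wheel displacement [m] and velocity [m/s]
    road  : road surface displacement under the tire [m]
    force : actuator force between body and wheel [N], positive lifts the body
    ms, mu: sprung and unsprung mass [kg]
    ks, cs: suspension stiffness [N/m] and damping [N s/m]
    kt    : tire stiffness [N/m]
    """
    zs, vs, zu, vu = state
    suspension = ks * (zs - zu) + cs * (vs - vu)  # passive spring-damper force
    accel_body = (-suspension + force) / ms
    accel_wheel = (suspension - force - kt * (zu - road)) / mu
    return accel_body, accel_wheel


def clip_force(force, limit):
    """Saturate the commanded force, a stand-in for a hard safety bound."""
    return max(-limit, min(limit, force))
```

In a training loop, the agent's raw action would pass through `clip_force` before being applied to the model, and the body acceleration returned by `quarter_car_accelerations` would typically enter the reward as a ride-comfort term.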
This section reviews recent technological advances and representative DRL applications in the three core tasks of the chassis domain controller, namely steering, braking, and suspension control. Across steering, braking, and suspension control, deep reinforcement learning demonstrates notable strengths in managing system nonlinearity, adapting to uncertain environments, and coordinating human–machine interaction. Actor–critic methods, including DDPG, PPO, and TD3, have become the dominant frameworks because they balance continuous and discrete control actions, which improves lateral stability, braking precision, and ride comfort. Despite these advances, most approaches remain highly dependent on simulation environments and carefully tuned reward functions, which raises concerns regarding their robustness and reliability in real-world deployment. Research in the chassis domain is also fragmented: steering studies focus on human–automation collaboration, braking control emphasizes emergency safety interventions, and suspension research targets ride comfort. However, integration across these tasks has not yet been achieved, limiting the overall potential of chassis domain control. Furthermore, rigorous safety verification, generalization across road types and vehicle models, and the incorporation of physical constraints into DRL training remain unresolved challenges. Future research should move toward coordinated chassis control that unifies steering, braking, and suspension, while embedding safety constraints and combining data-driven adaptability with model-based guarantees to ensure both performance and reliability in safety-critical scenarios.

5. Cockpit Domain

As shown in Figure 7, this section summarizes key research on personnel monitoring, comfort control, and human–machine interaction in the cockpit domain. It analyzes how DRL algorithms are applied to the tasks of the cockpit domain controller and discusses their limitations.

5.1. Personnel Monitoring

Personnel monitoring aims to identify the distraction, fatigue, and dangerous behavior of personnel, especially drivers, by extracting key facial features, motion features, and audio information, so as to avoid potential accidents and improve driving safety [238]. The cockpit of an intelligent vehicle is a dynamic and complex scene. During the driving process, factors such as driving operations, road conditions, and noise inside the vehicle may affect the expression and recognition accuracy of facial expressions and voice emotions. Therefore, relevant scholars have applied deep learning algorithms to cockpit personnel monitoring to achieve better recognition results. The current difficulties and solutions for intelligent electric vehicle personnel monitoring are shown in Table 14.
Fang et al. [241] proposed a driving behavior and eye direction monitoring system based on the YOLOv4 algorithm. The system collects images through a camera installed in front of the driver at nose height and defines the safe gaze direction, thereby monitoring 11 categories, including driver gaze, whether the driver’s eyes are open, and whether the driver is wearing a seat belt. Du et al. [242] proposed a multimodal fusion recurrent neural network that fuses heart rate and eye- and mouth-opening signals from an RGB-D camera to elevate driver-fatigue detection accuracy; its reliability, however, drops with sudden head motion or camera jitter. On this basis, Lu et al. [239] proposed a fatigue detection method for arbitrary postures, called JHPFA-Net. This method synthesizes realistic frontal faces through the FF-Module, which can be used to capture facial movements in arbitrary postures, and integrates head posture attributes and facial morphology through the GK-Module to improve detection accuracy. Unlike the above image- and video-based personnel monitoring systems, Pistolesi et al. [243] recorded the conversations in the vehicle through audio sensors and classified the audio into calm conversations or quarrels based on the denoised spectrograms using CNN and GRU. However, this system is easily affected by other environmental noises and needs further improvement. Beyond monitoring personnel during driving to reduce collision-related accidents, researchers have also studied personnel monitoring to reduce accidents not related to collisions, such as preventing children left in closed vehicles from dying of overheating. Hu et al. [244] designed a multi-task cascade convolution framework to detect children in a vehicle and classify the age of the detected faces to reduce the recognition failure rate and verified the feasibility of deploying the framework on an embedded platform. Moussa et al. [240] improved four deep learning models to detect children, pets, and adults, respectively, reducing the false alarms raised when children or pets are accompanied by an adult and thereby improving monitoring accuracy.

5.2. Comfort Control

Vehicle air conditioning, seats, seat belts, and other in-cabin equipment have a significant impact on the riding comfort and safety of the driver and passengers. However, traditional control strategies, such as air conditioning control, require the driver and passengers to manually set the target temperature of the vehicle air conditioning based on the cabin temperature and their own feelings. This control method cannot meet the needs of the development of intelligent electric vehicle cabins [245]. Therefore, relevant scholars have begun to study the intelligent cockpit equipment control strategies based on the DRL algorithm, controlling the equipment according to the cockpit environment, equipment operation characteristics, and passenger characteristics, and improving the comfort of the driver and passengers. The current difficulties and solutions of intelligent electric vehicle comfort control are shown in Table 15.
In terms of cabin air conditioning, Brusey et al. [246] proposed an RL-based automotive cabin thermal comfort control method that fully considers factors such as cabin air temperature, external air temperature, and the temperature of the driver and passengers; controls the air outlet temperature; improves the cabin thermal comfort time; and reduces energy consumption. However, this method only simulates the temperature felt by cabin occupants through a head position temperature sensor, and the comfort evaluation is not accurate enough. Hu et al. [247] embedded a human thermal comfort model into a PPO-driven HVAC controller, using the model’s comfort scores as the optimization target; evaluations across four dynamic scenarios and diverse subjects showed markedly higher comfort accuracy and robustness. In terms of cockpit seats, Lee et al. [248] proposed an RL-based electric seat driving method to reduce the motion sickness symptoms of passengers in the vehicle. The method intelligently adjusts the position of the power seat according to the vehicle acceleration state, the passenger acceleration state, and the passenger otolith response state. In terms of cabin seat belts, Şener et al. [249] proposed a visual seat belt fastening detector based on the CNN algorithm to address the safety and comfort issues caused by improper seat belt fastening. The method also automatically adjusts the vertical height of the four positions of the seat belt based on the detection results, thereby improving the comfort of the belt fastening and avoiding neck injuries. To curb cabin noise, Nascimento et al. [250] designed a Q-learning controller that reacts to sound levels by adjusting window positions and cruise speed, lessening acoustic distraction, and boosting driver comfort.
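For readers unfamiliar with the tabular Q-learning update that underlies controllers such as the one in [250], a minimal Python sketch follows. The states, actions, reward, and hyperparameters are hypothetical and are not taken from the cited work; only the update rule itself is standard.

```python
import random
from collections import defaultdict

# Hypothetical discrete action set for a cabin-noise controller.
ACTIONS = ("hold", "close_window", "reduce_speed")


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One Q-learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def choose_action(q, state, epsilon=0.1, rng=random):
    """Epsilon-greedy selection over the current Q-table."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


q_table = defaultdict(float)   # unseen state-action pairs start at 0.0
# Hypothetical transition: closing the window in a loud cabin earns reward 1.0.
q_update(q_table, "loud", "close_window", reward=1.0, next_state="quiet")
greedy = choose_action(q_table, "loud", epsilon=0.0)
```

A table of this kind is only practical when sound level, window position, and speed are discretized into a small number of bins; continuous formulations would need the function-approximation methods discussed elsewhere in this review.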

5.3. Human–Machine Interaction

The intelligent cockpit human–machine interaction (HMI) system is based on relevant software and hardware technologies. It provides assistance by judging the needs of people in different situations, realizes information transmission and exchange, achieves the purpose of people using vehicles and vehicles serving people, and improves user experience [251]. With the development of DRL algorithms, HMI can perceive, in a timely way, the status and changes of drivers and passengers and their environment; accurately infer their behavior and intentions; and actively respond to their needs, achieving active responsive interaction.
In terms of visual interaction, Shishavan et al. [252] studied a closed-loop brain–machine interface (BMI) system that controls in-vehicle functions through a windshield head-up display (HUD). In terms of gesture interaction, Sachara et al. [253] converted the point cloud data of the on-board ToF camera into CNN-interpretable data, achieving a 100% recognition rate within the execution time of some gestures. In terms of device interaction, Kim et al. [254] proposed a vehicle interaction control method that uses a smartwatch and the user’s knees as touch platforms: three-dimensional gestures in space are converted into two-dimensional touches on the knees, and touch interactions are recognized by a GRU algorithm from the multivariate time series (MTS) motion data captured by the smartwatch. In terms of voice interaction, Paranjape et al. [255] developed an in-vehicle voice assistant application based on the RASA framework, which converts the user’s speech into text and sends it to the NLU component for semantic entity and intent recognition. The core then decides the corresponding response and either calls the corresponding API or replies to the user based on its knowledge. In other areas, Dewalegama et al. [256] developed a taxi human–computer interaction system based on YOLO and CNN algorithms, which consists of four functional components, including a music player based on passenger facial expressions. These systems are connected to the Internet through a HUD, allowing drivers and taxi companies to better serve passengers.
This section systematically reviews key technological advances and typical applications of DRL around the three core tasks of the cockpit domain controller: personnel monitoring, comfort control, and human–machine interaction. Within the cockpit domain, DRL has demonstrated clear advantages in personnel monitoring, comfort control, and human–machine interaction by enhancing adaptability to dynamic environments, enabling learning from multimodal information, and supporting the delivery of personalized services. Existing studies commonly rely on deep perception models integrated with DRL to identify driver states and occupant behaviors, which improves recognition accuracy and responsiveness. Despite these advances, current research remains fragmented. Work on personnel monitoring is largely centered on fatigue and distraction detection, and comfort control research emphasizes thermal regulation and seat adjustment, while human–machine interaction studies are often confined to single modalities such as gesture or voice. The absence of integration across these subsystems restricts cockpit controllers from achieving holistic and context-aware decision making. Moreover, the strong dependence on large labeled datasets and simulation environments raises concerns about their capacity to generalize under real-world variability, and issues such as privacy leakage and secure data handling have received insufficient attention. Future research should therefore advance toward multimodal collaboration that unifies monitoring, comfort, and interaction, while embedding privacy-preserving mechanisms and developing adaptive policies capable of transferring effectively across vehicles, occupants, and environmental conditions.

6. Domain Fusion

As shown in Figure 8, although the performance of the centralized architecture in the intelligent vehicle domain is better than that of the traditional distributed architecture, with the rapid development of intelligent driving, the perception, decision-making, and control systems involved are more complex. Once an error occurs in one of the functional modules, the error may be amplified in subsequent tasks, leading to unsafe decision-making behavior. In addition, more and more sub-modules use different deep reinforcement learning algorithms. The entire system is very complex and large, which will bring a catastrophic explosion of computing requirements and bring great difficulties and challenges to developers [257]. Therefore, multi-domain and multi-task fusion is becoming the development trend of DRL algorithm applications in the field of intelligent vehicles.

6.1. Single-Domain Multi-Task Fusion

Single-domain multi-task fusion mostly integrates hierarchical tasks or related coupled tasks in the same domain. The former integrates the different modules by treating the system as a black box, implementing all modules as one or more trained neural networks, and obtaining a direct mapping from input to control commands. The most typical example is the end-to-end model in the intelligent driving domain. The end-to-end intelligent electric vehicle decision-making and planning model unifies multiple tasks, such as environmental perception, target recognition, target tracking, and planning decisions, into a deep neural network. Information about the vehicle and driving environment (such as vehicle turning angle, speed, road distance, and environmental images) is processed by the neural network, which directly outputs the vehicle control signal, unifying the chain from cognition to control decision making [258,259].
End-to-end decision making largely follows two paths: imitation learning (IL) and reinforcement learning [260]. IL frames driving as supervised learning, using expert demonstrations to pair sensor observations (camera, LiDAR, GPS, and more) with the associated control outputs (throttle, steering, brake). Trained on these state–action pairs, the network learns the expert’s mapping from perception to commands. Behavior cloning, the most common IL variant, is essentially supervised policy learning that directly copies expert trajectories. Hawke et al. [261] developed an end-to-end conditional IL system that takes monocular camera images and high-level route commands as input, outputs vehicle control signals, and achieves longitudinal and lateral control in complex urban environments. Cai et al. [262] proposed an imitation learning end-to-end autonomous driving planning model based on the CNN-LSTM network architecture. The model consists of three sub-networks that execute straight, left, and right turn commands, respectively, and output collision-free trajectories after being fed back to the LSTM network through different sub-networks. Within imitation learning, inverse reinforcement learning can also model driving behavior by recovering a reward function from expert experience and then learning a policy based on that reward function. Couto et al. [263] proposed an end-to-end autonomous driving framework based on the combination of generative adversarial imitation learning (GAIL) and behavior cloning (BC), which is used to autonomously navigate vehicles in urban environments in the CARLA simulation environment. Compared with traditional GAIL, the GAIL-enhanced BC structure can calculate the desired navigation trajectory more quickly.
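Because behavior cloning is ordinary supervised regression on expert state–action pairs, its core can be shown in a few lines. The Python sketch below is a deliberately tiny illustration under stated assumptions: a single observation feature (lateral offset), a linear policy without bias, and a hypothetical expert with steering gain 0.5. The end-to-end systems cited above use deep networks on images, not this toy model.

```python
def fit_bc_policy(observations, actions):
    """Least-squares fit of the linear policy a = w * o to expert pairs."""
    num = sum(o * a for o, a in zip(observations, actions))
    den = sum(o * o for o in observations)
    return num / den


# Hypothetical expert demonstrations: steer against the lateral offset.
offsets = [-1.0, -0.5, 0.25, 0.5, 1.0]          # m, lateral offset from lane center
expert_steer = [-0.5 * o for o in offsets]      # expert steering command

w = fit_bc_policy(offsets, expert_steer)


def cloned_policy(offset):
    """Policy learned purely by copying the expert; no reward is involved."""
    return w * offset
```

The sketch also hints at the known weakness of behavior cloning: the fitted policy is only trustworthy for offsets similar to those the expert visited, so states outside the demonstrated range are extrapolated rather than learned.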
RL is an end-to-end decision-making planning method that reduces the need for driving data. This method uses a reward function as an incentive and adopts a trial-and-error approach to learn how to make decisions and actions in the environment to maximize cumulative rewards [264,265,266]. Compared with imitation learning, this method does not rely on labeled driving data and can avoid the problem of insufficient data in various driving situations. However, its trial-and-error process poses safety hazards, so training is largely confined to virtual simulation environments, and using it directly in actual road environments carries risk. Peng et al. [267] proposed an end-to-end autonomous driving lane-keeping strategy based on the value-based DDQN algorithm, taking the original image and velocity vector as mixed input, introducing the dueling neural network architecture to improve sampling efficiency, and visualizing the end-to-end autonomous driving network through a saliency map. Song et al. [268] used the change between adjacent actions as a regularization term in an end-to-end autonomous driving strategy based on the DDPG algorithm. Adding this term to the reward function constrains the next action to an acceptable neighborhood of the current action, which improves the smoothness of the output actions while accelerating the convergence of the reward function. Wu et al. [269] augmented the PPO algorithm with a curiosity-driven RNN that generates intrinsic reward signals to guide the agent’s exploration of the environment, improving exploration efficiency and achieving high performance in overtaking tasks. Chen et al. [270] introduced maximum entropy reinforcement learning with sequence latent variables to improve the interpretability of the DRL-based end-to-end strategy. The high-dimensional data are decoded into a semantic bird’s eye mask to explain how the strategy explores the environment, while greatly reducing the sample complexity of the learning strategy. Chen et al. [271] strengthen DDPG end-to-end driving by IL-assisted pre-training of the feature extractor on sparse expert data and an auxiliary branch that supplies pose, position, and road-edge labels, sharpening perception and limiting futile exploration.
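The idea of penalizing the change between adjacent actions, as described for [268], can be sketched as a small reward wrapper. The quadratic form of the penalty, the weight lam, the step bound max_step, and both function names are assumptions for illustration; the cited paper may define the term differently.

```python
def smoothed_reward(task_reward, action, prev_action, lam=0.1):
    """Subtract a penalty proportional to the squared change between adjacent actions."""
    change = sum((a - p) ** 2 for a, p in zip(action, prev_action))
    return task_reward - lam * change


def clip_to_neighborhood(action, prev_action, max_step=0.2):
    """Optional explicit bound: keep each action component within max_step of the last one."""
    return [p + max(-max_step, min(max_step, a - p))
            for a, p in zip(action, prev_action)]


# Hypothetical two-dimensional action: [steering, throttle].
prev = [0.0, 0.0]
jerky = [1.0, 0.0]
r_smooth = smoothed_reward(1.0, prev, prev)              # no change, no penalty
r_jerky = smoothed_reward(1.0, jerky, prev, lam=0.5)     # large change is penalized
bounded = clip_to_neighborhood(jerky, prev)
```

The penalty leaves the action space untouched and only makes abrupt commands less attractive, whereas the clipping helper enforces the neighborhood directly; the two can be combined or used separately.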
For tasks with related coupling, due to the coupling relationship between tasks or components, we can directly consider additional variables by expanding the action, state space, and reward function of the deep reinforcement learning algorithm, thereby achieving multi-task integrated control, such as energy management and thermal management in the powertrain domain [272], energy management and fault diagnosis [273], and energy management and power component control [274]. In the chassis domain, since the suspension plays an indispensable role in the vehicle posture stability during the steering and braking control processes, relevant scholars have also studied the integrated control of these three tasks, including steering and suspension integrated control [275,276,277], braking and suspension integrated control [278,279,280], and integrated control of the three [278]. However, these studies did not examine the application of the DRL algorithm in them. In the cockpit domain, human–computer interaction is basically multi-task fusion, because human–computer interaction involves highly integrated functions such as communication, control, execution, interaction, management, safety, and entertainment. Not only are the interactive technologies diversified, such as face recognition, multimodal interaction, augmented reality (AR), HUD, surround view, etc., but the media of human–computer interaction also tend to be diversified, such as cockpit space layout, central control display, rearview mirror, audio, seats, lighting, etc. Therefore, human–computer interaction tasks must rely on multiple software and hardware modules for multi-task collaboration to provide drivers and passengers with a comprehensive human–vehicle interaction experience.

6.2. Multi-Domain and Multi-Task Fusion

Multi-domain and multi-task fusion refers to the integration and coordination of different tasks in two or more domains to achieve information sharing and function expansion among multiple domains. For example, the intelligent driving domain can adjust the driving strategy according to the real-time status of the powertrain domain to improve energy efficiency and safety; the cockpit domain and the intelligent driving domain can coordinate to adjust the in-vehicle environment according to the driving status to improve ride comfort. Since the multi-source information fusion and multi-objective control coordination involved in multi-domain fusion are more complex than tasks within a single domain, it is difficult for a single DRL algorithm to fuse and process these multimodal heterogeneous data, which results in information omissions and unbalanced decisions. It is therefore necessary to integrate multiple algorithms or improve the algorithm framework, for example with hybrid deep learning models [281,282], hierarchical reinforcement learning frameworks [283], integrated learning frameworks, and multi-agent reinforcement learning (MARL) frameworks, to cope with the complexity and variability in multi-domain fusion. In addition, although intelligent vehicles have many functional domains and tasks, the direction of multi-domain integration is not arbitrary. People (drivers and passengers), vehicles (vehicle status), and roads (vehicle trajectories) constitute the three core elements of the transportation system, and the driving behavior of intelligent vehicles is itself a process of interactive coupling among people, vehicles, and roads. Therefore, multi-domain integration in vehicles is also mostly guided by these three elements, further breaking down the barriers among them and optimizing the multi-functional performance of the vehicle.

6.2.1. Fusion with Road (Vehicle Trajectory) as the Target

“Road” refers to the road on which intelligent vehicles travel and its surrounding facilities, including the physical characteristics of the road, traffic signs and signals, traffic participants (pedestrians, other vehicles), obstacles, and weather. These factors have an important impact on the perception, decision-making, and control functions of intelligent vehicles and jointly determine the vehicle’s driving trajectory. Multi-domain fusion with vehicle trajectory as the target not only makes full use of global road information and improves the degree of vehicle–road coupling, but is also a key way to achieve intelligent driving and optimize vehicle performance.
In terms of integration of the intelligent driving domain and powertrain domain, Chen et al. [284] present JAC, an RL framework that unifies perception, decision making, and actuation. An attention-enhanced CNN sharpens sensing, and joint training links tactical choices to throttle/brake commands, boosting control accuracy and safety. Wang et al. [285] designed a MARL-based coordinated control of vehicle energy management and vehicle following behavior. The two agents were used for cooperative optimization and interactive learning, which can avoid pre-emptive vehicle speed prediction and further achieve vehicle state optimization and real-time decision making. In terms of the integration of the intelligent driving domain and chassis domain, Li et al. [286] decomposed the vehicle lateral control, i.e., steering control system, into perception and control modules. The CNN algorithm based on multi-task learning (MTL) learned the driver’s perspective image to predict the road features and obtain the vehicle’s position on the road. Then, the RL algorithm based on MTL made control decisions based on the trajectory features to keep the vehicle driving in the center of the curve. Porav and Newman [287] also combined image perception and recognition tasks into a DDPG-based vehicle braking control strategy. They used the CNN algorithm to predict the real-time dynamic trajectory of obstacles, vehicles, and pedestrians in front of the vehicle based on the front camera image; used it as state input; and trained the agent to achieve throttle and brake command control to reduce the probability of vehicle collision accidents. In terms of the fusion of the powertrain domain and chassis domain, Chen et al. [288] implemented an integrated control strategy for hybrid electric vehicles through two DRL algorithms, using DDPG to control the horizontal speed and heading angle of the vehicle, and using DQN to control the power distribution of the power system. 
They also designed a step-by-step training method to realize the logical relationship between the three strategies and improve the control accuracy in a real road environment.

6.2.2. Fusion with Vehicle (Vehicle Status) as the Target

“Vehicle” refers to the state of the intelligent vehicle itself, including the state of the vehicle hardware (including power system, control system, sensor) and the vehicle driving state (including motion characteristics, power state, working mode). In the research on automotive domain control technology, relevant scholars mostly focus on the two states of vehicle posture and vehicle energy. Vehicle posture refers to the position and direction of the vehicle in three-dimensional space, including pitch angle, roll angle, yaw angular velocity, etc., which reflect the stability and safety performance of the vehicle during driving. Vehicle energy refers to the vehicle’s power source, energy management system, and energy consumption status, reflecting the vehicle’s power and endurance. Multi-domain fusion with vehicle status (posture and energy) as the target can make full use of vehicle information, improve driving safety and power efficiency, and realize real-time adjustment of the vehicle in complex driving environments.
In terms of vehicle energy fusion, Tang et al. [289] proposed a hierarchical EMS combined with visual target detection. The DQN algorithm is used at the lower layer to control the speed of the following vehicle and the engine power distribution, thus achieving energy reduction in the safe following scenario. Zhang et al. [290] proposed a strategy-hierarchical and execution-coordinated ACC-EMS framework to solve the coupling relationship between vehicle adaptive cruise control (ACC) and energy management. The upper layer plans the following distance and charging status of the vehicle and guides the lower layer controller to control the working point of the powertrain components through rewards, thus avoiding the extra energy consumption caused by speed fluctuations and achieving ecological cruise. Min et al. [291] proposed an energy-saving driving method for autonomous vehicles. They used the LSTM algorithm to predict the trajectory of surrounding vehicles including driving intentions and lane change timing, and then set the reward function of the PPO algorithm based on the predicted trajectory to achieve collaborative optimization that takes into account energy efficiency, driving safety, and traffic efficiency. In terms of fusion with vehicle posture as the target, Li et al. [292] introduce a MARL-based cooperative framework that handles lateral control for vehicles merging from high-speed ramps, enabling smooth, safe integration with mainline traffic. Chen et al. [293] used a CNN-based road recognition network to identify various road surfaces including asphalt, snow, and cobblestones, and trained the DQN-based multi-objective control strategy based on the recognition results to achieve the coordinated optimization of motor speed, lane change strategy, and engine power distribution during braking, thereby ensuring the stability of the braking posture and high fuel economy of the hybrid electric vehicle (HEV) under harsh road conditions. Hu et al. [294] enhance semi-active suspension by using YOLOv2 to spot speed bumps and an improved Skyhook algorithm to adapt damping in real time, enabling predictive control of vertical vehicle posture.

6.2.3. Integration with People (Drivers and Passengers) as the Goal

“People” refers to the driver or passengers in an intelligent vehicle, including the driver’s driving behavior and habits and health status, and the needs, experience, and preferences of the passengers. Multi-domain fusion targeting drivers and passengers can better monitor and respond to the status of drivers and passengers, provide personalized services and experience, and improve driving comfort and safety, thus achieving a smarter and more humane driving and riding experience.
In terms of improving driving comfort, Du et al. [295] used a hierarchical reinforcement learning framework to collaboratively control the speed planning and suspension system of an autonomous driving vehicle. The DP algorithm was used at the upper level to solve the longitudinal comfort speed on rough roads. The suspension control strategy of the DDPG algorithm with external knowledge was trained at the lower level based on the speed planning results to achieve suspension parameter adaptation under different road conditions, thereby improving ride comfort under harsh road conditions. Deng et al. [296] proposed a collaborative EMS for fuel cell vehicles based on the SAC algorithm. In both the cooling and heating working modes of the air-conditioning system, the temperature of the passengers in the vehicle can be kept in the most comfortable zone. In terms of improving driving safety, Roh and Lee [297] combined deep learning with AR. Using a mask region-based CNN (Mask R-CNN), they detected pedestrians and personal mobility devices with abnormal movements, such as scooters, and visualized the detection information on an AR-based HUD, improving the driver’s situational awareness and ability to respond to sudden dangerous situations. Fang et al. [298] proposed a human–machine co-driving control strategy for fatigued driving. They first designed a driver fatigue evaluation model based on multiple facial feature points using a multi-task CNN algorithm. On the basis of the quantified fatigue level, they proposed a new human–machine control authority allocation strategy to reduce abnormal operations caused by driver fatigue and improve driving safety. In terms of improving personalized services, Ling et al. [299] proposed an emotion- and preference-adaptive autonomous driving framework, which identifies the driver’s emotions from EEG signals, uses fuzzy classifiers to analyze the driver’s driving preferences, and then feeds both, together with safety indicators, into a DDPG-based autonomous driving control model that adjusts the vehicle’s driving style to match the driver’s habitual preferences.

7. Domain Transfer

As shown in Figure 8, as research on DRL algorithms in automotive domain control deepens, several shortcomings are gradually being exposed, such as low data utilization, weak generalization ability, blind exploration, and a lack of reasoning and representation ability. These shortcomings make model and strategy development difficult and slow, which greatly restricts the application of DRL algorithms in real vehicles.
Transfer learning (TL) is one of the most effective ways to solve this problem. It transfers knowledge from one domain to another to accelerate the learning process or improve the learning effect [300], and it is suitable for situations such as insufficient data, high feature dimension, or large model complexity. The two domains are called the source domain and the target domain, and their learning tasks are the source task and the target task. Transfer learning completes the target task in the target domain by using knowledge learned from the source domain and its source task. Leveraging transfer learning is pivotal for enabling the practical deployment of DRL in real-world vehicle control systems [301,302,303]. Domain adaptation can markedly reduce deployment costs by transferring policies trained in simulation to real vehicles with minimal fine-tuning, effectively correcting feature and dynamics mismatches across domains. This approach not only reduces the need for expensive real-world data collection but also improves robustness by mitigating performance degradation under environmental variations such as road friction changes or sensor noise. Multi-task learning lowers the cost of acquiring diverse control skills by jointly training on multiple related driving tasks such as lane keeping, adaptive cruise control, and energy management within a unified framework. By learning shared representations, it enhances generalization across scenarios and increases resilience to unseen traffic patterns and weather conditions. Meta-learning minimizes adaptation costs for new vehicle platforms or operational contexts by enabling policies to rapidly adapt online using only a small number of interactions. This capability is particularly valuable for handling unexpected disturbances such as actuator faults or sudden traffic changes without extensive retraining. The selection of the most appropriate paradigm or a hybrid strategy should be guided by the dominant cost and robustness bottlenecks as well as the anticipated variability in the deployment environment.
Transfer learning is introduced into intelligent electric vehicle domain control to address two problems: the distributions of datasets from different sensor types differ widely, and experimental data for different vehicle models and working conditions are difficult to obtain. Both leave the DRL algorithm with few usable samples.
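The source/target terminology used in this section can be stated compactly. The notation below follows a convention that is common in the transfer-learning literature; it is an assumption introduced here for illustration and is not taken from the reviewed works.

```latex
% Assumed notation: \mathcal{X} feature (state) space, P(X) its marginal distribution,
% \mathcal{Y} label (action) space, f(\cdot) predictive function (policy).
\mathcal{D} = \{\mathcal{X},\, P(X)\}, \qquad \mathcal{T} = \{\mathcal{Y},\, f(\cdot)\}

% Transfer learning: given a source pair and a target pair that differ,
\text{given } (\mathcal{D}_S, \mathcal{T}_S),\ (\mathcal{D}_T, \mathcal{T}_T)
\ \text{with}\ \mathcal{D}_S \neq \mathcal{D}_T \ \text{or}\ \mathcal{T}_S \neq \mathcal{T}_T,
\quad \text{improve } f_T(\cdot) \text{ using knowledge from } \mathcal{D}_S \text{ and } \mathcal{T}_S .
```

Read this way, a change of weather or road type mainly alters P(X) or the system dynamics, whereas a change of camera or powertrain topology can alter the feature space or the task itself.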

7.1. Transfer Between Different Working Scenarios

Intelligent vehicles operate in different scenarios, such as urban, rural, and highway driving. Most existing control strategies are trained and tested under standard conditions or one specific condition. When an RL agent faces a completely new driving condition, it must re-explore and relearn through interaction with the environment, which is very time consuming. Vehicles also encounter adverse environments, such as bad weather and bumpy roads. Taking object detection as an example, most DL-based detectors perform well in normal weather because the training data are well labeled, clearly recognizable, and visually clean, but they perform poorly at night or in rain, snow, and dense fog. This is partly because of the gap in feature distributions and partly because data for these driving scenes are hard to obtain. Transfer learning across working conditions can therefore help intelligent vehicles adapt to changes in conditions and environment, mitigate data scarcity, and improve model performance in data-scarce scenarios.
Domain adaptation aligns feature distributions between a labeled source domain and an unlabeled target domain that share classes but differ statistically [304]. Building on this idea, Li et al. [305] enhanced Faster R-CNN for foggy scenes: image-level and object-level adaptations jointly narrow the global-style and local-appearance gaps, while an augmented “auxiliary” domain regularizes the transfer, improving detection in haze. Chen et al. [306] applied a graph convolutional network with domain-adversarial training (DANN) to cooperative vehicle localization; by drawing on cues from neighboring vehicles and adversarially cancelling environment-specific biases, they improved positioning accuracy across diverse conditions. In addition, DANN reached the smallest error with the least training data. Shu et al. [307] proposed a decision framework for autonomous vehicles at intersections based on TL and DDQN. They set one of three driving tasks (left turn, straight driving, and right turn) as the source task and the other two as target tasks. They also defined three rules for action selection in the target tasks, thereby accelerating the real-time establishment of the decision-making strategy. Hu et al. [308] proposed an RL-based strategy that jointly optimizes energy distribution and cabin thermal comfort for extended-range electric buses. The EMS learned by the DDPG algorithm was transferred from the AC-off state to the AC-on state, and five indicators were used to measure the impact of the transferred EMS on powertrain performance and thermal comfort.
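As a rough illustration of the two ingredients that a TL-plus-DDQN framework combines, the sketch below computes Double DQN targets and warm-starts a target-task network from source-task weights. It is not the implementation of Shu et al. and omits their three action-selection rules; the function names, the dictionary layout of the weights, and the discount factor are assumptions made for this example.

```python
import numpy as np


def ddqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Double DQN targets: the online network selects the next action,
    the target network evaluates it.

    rewards, dones: shape (batch,)
    q_online_next, q_target_next: shape (batch, n_actions), Q-values at s'
    """
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=float)
    q_online_next = np.asarray(q_online_next, dtype=float)
    q_target_next = np.asarray(q_target_next, dtype=float)

    best_actions = np.argmax(q_online_next, axis=1)       # action selection
    batch_idx = np.arange(q_target_next.shape[0])
    next_values = q_target_next[batch_idx, best_actions]  # action evaluation
    # Terminal transitions (done = 1) do not bootstrap from the next state.
    return rewards + gamma * (1.0 - dones) * next_values


def warm_start(source_weights):
    """Initialise target-task parameters as an independent copy of the
    source-task parameters (hypothetical dict of name -> array)."""
    return {name: w.copy() for name, w in source_weights.items()}
```

In this reading, a network trained on one manoeuvre (for example, going straight) would be copied with `warm_start` and then trained further on the other manoeuvres using the same target rule, rather than starting from random weights.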

7.2. Transfer Between Different Models of Equipment

Because vehicle models differ in structure and sensors, most learned models apply only to a specific type of vehicle or sensor. For example, the EMS of a power-split hybrid vehicle is not applicable to a series hybrid vehicle, and the same object detection model performs differently on different camera models. Although intelligent vehicles come in rich and diverse configurations, they share commonalities in vehicle systems, hardware, and task objectives. Transfer learning across equipment models can improve the portability and reusability of strategies, greatly reducing the development time and workload required for domain control technology.
Dinh et al. [309] remapped detections from a short-lens camera to a long-lens view via focal-ratio geometry, enabling simultaneous wide-field and long-range sensing. Li et al. [310] reduced labeling needs across cameras with ML-ANet, which aligns source- and target-domain features through multi-kernel maximum mean discrepancy. Lian et al. [311] ported a DDPG-trained Prius energy-management network to three other hybrid EVs by reusing and fine-tuning its weights. To improve the adaptability and online applicability of the EMS, Huang et al. [312] combined an improved SAC algorithm with TL and migrated the EMS between two different types of urban fuel cell vehicles. In addition to the network parameters, the prioritized experience replay (PER) buffer and its sum tree were transferred, so that the learned knowledge was fully reused. The source-domain model was trained on a large number of real-vehicle driving conditions to obtain a robust model representative of the source domain.
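The multi-kernel maximum mean discrepancy used for feature alignment can be sketched as a statistic on two batches of features. The example below is a minimal NumPy version of a biased squared-MMD estimate with an equally weighted sum of Gaussian kernels; the bandwidths, the equal weighting, and the function names are assumptions, and the network layers and kernel weighting of ML-ANet are not reproduced.

```python
import numpy as np


def _sq_dists(a, b):
    """Pairwise squared Euclidean distances between rows of a and rows of b."""
    a2 = np.sum(a * a, axis=1)[:, None]
    b2 = np.sum(b * b, axis=1)[None, :]
    # Clip tiny negative values caused by floating-point rounding.
    return np.maximum(a2 + b2 - 2.0 * a @ b.T, 0.0)


def mk_mmd2(source, target, bandwidths=(0.5, 1.0, 2.0, 4.0)):
    """Biased estimate of squared MMD between two feature batches,
    using the average of several Gaussian kernels (assumed bandwidths)."""
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)

    def kernel(a, b):
        d2 = _sq_dists(a, b)
        return sum(np.exp(-d2 / (2.0 * s ** 2)) for s in bandwidths) / len(bandwidths)

    return (kernel(source, source).mean()
            + kernel(target, target).mean()
            - 2.0 * kernel(source, target).mean())
```

In a domain-adaptation setting, a term of this kind is typically added to the task loss so that features extracted from source-camera and target-camera images are pushed toward the same distribution; a value near zero indicates that the two batches are hard to tell apart under the chosen kernels.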

8. Conclusions

Compared with prior reviews, which have often focused on single domains such as intelligent driving or powertrain control, this article provides a broader synthesis across four major functional domains of intelligent electric vehicles. Earlier studies have typically emphasized descriptive summaries of DRL applications but seldom conducted systematic comparisons of different algorithmic families or evaluated their suitability under varying operational constraints. By contrast, the present review critically examines how DRL methods perform across steering, braking, energy management, cockpit interaction, and other tasks, and highlights both their advantages and their persistent limitations. This critical comparison underscores that while DRL improves adaptability, robustness, and decision-making capacity, it also faces unresolved challenges in real-time deployment, safety assurance, and generalization, thereby clarifying the research gaps that future work must address.
This article comprehensively reviews the application of DRL algorithms in intelligent electric vehicle domain control, covering the intelligent driving, powertrain, chassis, and cockpit domains, as well as two cross-cutting directions—multi-domain fusion and algorithm migration. A critical synthesis reveals that although DRL has demonstrated strong adaptability and flexibility compared with traditional model-based and rule-based methods, its practical application still faces structural fragmentation, data scarcity, and safety concerns.
In the intelligent driving domain, numerous DRL variants have been developed to improve perception–decision–control performance in dynamic traffic. Compared with conventional MPC or heuristic planning, DRL provides superior adaptability to uncertain environments and enables real-time decision making. However, these methods often suffer from instability, poor interpretability, and limited generalization to unseen scenarios. Hybrid frameworks that combine DRL with model-based control exhibit a better balance between adaptability and safety, suggesting a future trend toward hybridization rather than purely data-driven learning. End-to-end models have further advanced multi-task fusion, yet they risk overfitting and require vast datasets, indicating the importance of incorporating priors and constraints.
In the powertrain domain, DRL algorithms have been widely used to optimize energy management strategies, improving fuel economy and prolonging battery life. Compared with static rule-based or DP-based approaches, DRL provides better adaptability to dynamic loads and operating conditions. Nevertheless, the effectiveness of these methods largely depends on the careful design of states, actions, and rewards, and many existing studies remain task-specific, limiting scalability. Current research demonstrates that combining DRL with prediction models or transfer learning offers stronger adaptability, but challenges remain in handling long-term degradation and rare events.
In the chassis domain, applications of DRL are still emerging. Most works focus on single-function control such as suspension or braking, while few address multi-task collaborative control. Compared with traditional control methods, DRL has shown potential in coordinating nonlinear dynamics and adapting to uncertain environments, but the field lacks systematic validation, benchmarking, and integration with other subsystems. This highlights both the promise of DRL in chassis control and the necessity of developing standardized testbeds and evaluation metrics.
In the cockpit domain, the increasing need for multimodal human–vehicle interaction has expanded the research frontier beyond traditional DRL applications. While DRL has been applied to adaptive displays and driver state monitoring, large language models and generative AI now show greater potential for realizing emotional and context-aware interaction. Compared with DRL, these foundation models offer richer multimodal understanding but lack the reinforcement-signal-driven adaptability. Future cockpit intelligence will likely emerge from the integration of DRL’s adaptive control with generative models’ semantic and emotional reasoning.
Beyond single-domain studies, two research directions are particularly noteworthy. For multi-domain fusion, most current works remain fragmented, integrating at most two or three subsystems, which limits scalability. How far integration can go depends on innovation in algorithmic frameworks, and the transition from partial fusion to true multi-domain collaboration remains an open challenge. For algorithm migration, studies indicate that differences across vehicle platforms, including hardware configurations and sensor layouts, often necessitate substantial retraining. Migration across operating conditions, such as varying traffic density, road terrain, and weather patterns, is more mature and effective than migration across hardware platforms. However, systematic frameworks that can ensure universal adaptability are still lacking.
Finally, several challenges and promising directions emerge across all domains, as outlined below.

8.1. Intelligent Electric Vehicle Universal Model

Generative AI and world models demonstrate the possibility of integrating perception, prediction, decision, and control into a unified framework. Compared with domain-specific DRL, such models can extract multi-source correlations and achieve end-to-end optimization. The success of models such as UniAD [313] and world-model architectures [314] shows that general foundation models can support multi-task, multi-domain collaboration. The key research problem is how to achieve an efficient closed data loop and modular collaboration while maintaining interpretability and safety in decision making.

8.2. Data Sample Acquisition

DRL methods remain heavily data-dependent, and real-world vehicle testing is expensive and difficult to scale. Compared with traditional model-based methods, DRL requires far more labeled and diverse data. While transfer learning and domain adaptation can partially reduce data demand, obtaining reliable and representative datasets remains a bottleneck. Addressing this issue requires advances in high-fidelity simulation, self-supervised learning, and collaborative data sharing across platforms.

8.3. Possibility of Vehicle-Side Model Deployment

Current DRL models are computationally intensive and typically run in cloud or laboratory environments. Unlike lightweight model-based controllers, they remain impractical to deploy on-vehicle. Model compression techniques such as pruning, quantization, and knowledge distillation provide feasible solutions; distillation in particular enables small on-vehicle models to inherit the knowledge of larger models. Achieving efficient vehicle-side deployment requires balancing computational cost, reliability, and adaptability.
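To make the idea of a small model inheriting knowledge from a larger one concrete, the sketch below computes a standard knowledge-distillation loss between teacher and student outputs. It assumes a discrete-action policy (or classifier) whose outputs are logits; the temperature value and function names are assumptions for illustration, not settings from any reviewed work.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def distillation_loss(student_logits, teacher_logits, temperature=4.0, eps=1e-12):
    """KL(teacher || student) on temperature-softened outputs, scaled by T^2."""
    p = softmax(teacher_logits, temperature)  # soft targets from the large model
    q = softmax(student_logits, temperature)  # predictions of the small model
    kl = np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)
    return float(temperature ** 2 * kl.mean())
```

A higher temperature flattens the teacher distribution, so the student also learns how the teacher ranks the non-preferred actions. In practice this term is usually combined with the student's own task loss.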

8.4. Portability of DRL Algorithms

As intelligent vehicles diversify, including aerial and underwater platforms, portability becomes a critical issue. Unlike conventional rule-based systems that require redesigning for each platform, DRL offers transferable learning through simulation-to-real adaptation. The ability of DRL to generalize across navigation, obstacle avoidance, and environment perception tasks highlights its potential to accelerate cross-domain development. However, systematic frameworks for ensuring reliable portability across drastically different modalities remain underexplored.

8.5. Interoperability with Charging Infrastructure

Future intelligent vehicles must coordinate with energy systems through interoperable charging infrastructures. DRL enables joint optimization of vehicle powertrain control and charging scheduling, balancing battery health with grid-level objectives such as renewable integration. Compared with static scheduling or heuristic-based methods, DRL offers dynamic adaptability. Integrating DRL with transfer learning and domain adaptation can further reduce data costs and improve interoperability across different charging standards, advancing from vehicle-centric energy management to cooperative vehicle–grid–home ecosystems.
Despite the promising performance of DRL in simulations, its application in safety-critical real-world systems faces persistent challenges arising from fundamental design and learning constraints. One major issue is poor generalization, often caused by the domain gap between training and operational environments, where policies tend to overfit to simulation-specific dynamics or limited field data, resulting in degraded performance under unseen conditions. Another challenge lies in data scarcity, which is intensified in safety-critical contexts as exploration must be strictly constrained to avoid unsafe states, leading to insufficient coverage of rare yet critical scenarios. In addition, reward design plays a pivotal role in maintaining long-term policy stability, where sparse or delayed rewards can slow convergence, misaligned rewards may encourage unintended or unsafe behaviors (often referred to as reward exploitation), and short-term reward maximization may conflict with long-term safety objectives, causing instability over prolonged operation. Moreover, current DRL methods lack formal safety guarantees and remain susceptible to distributional shifts and adversarial perturbations, making it difficult to ensure consistent and robust behavior under diverse operating conditions. Addressing these limitations will require the integration of safety-aware exploration, robust policy optimization, interpretable decision making, and formal verification into DRL pipelines.
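Safety-aware exploration can take many forms. One simple form is an action shield that removes unsafe actions before the agent explores. The sketch below restricts epsilon-greedy exploration of longitudinal acceleration to actions that keep a minimum time headway after one control step. The headway rule, the worst-case assumption of a stationary obstacle, and all names and parameter values are hypothetical; this is a toy rule and provides no formal safety guarantee.

```python
import random


def safe_actions(gap_m, ego_speed_mps, actions, min_headway_s=1.5, dt=0.1):
    """Keep only accelerations whose one-step lookahead respects a minimum
    time headway.

    Simplifying assumption: the lead object's speed is unknown, so the gap is
    treated as shrinking at ego speed (worst case: stationary obstacle).
    """
    allowed = []
    for a in actions:
        next_speed = max(ego_speed_mps + a * dt, 0.0)
        next_gap = gap_m - next_speed * dt
        if next_speed == 0.0 or next_gap / next_speed >= min_headway_s:
            allowed.append(a)
    # Fall back to the strongest braking action if nothing passes the check.
    return allowed or [min(actions)]


def shielded_epsilon_greedy(q_values, actions, gap_m, ego_speed_mps,
                            epsilon=0.1, rng=random):
    """Epsilon-greedy action selection restricted to the shielded action set.

    q_values: dict mapping each action to its estimated value.
    """
    allowed = safe_actions(gap_m, ego_speed_mps, actions)
    if rng.random() < epsilon:
        return rng.choice(allowed)                 # explore, but only safely
    return max(allowed, key=lambda a: q_values[a])  # exploit among safe actions
```

With a short gap, only braking survives the shield, so both exploration and exploitation return the braking action; with a long gap, the agent is free to pick the highest-valued action.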

Author Contributions

Conceptualization, D.L. and D.H.; methodology, D.L., Y.C., Y.S., W.W. and D.H.; software, D.L. and Y.S.; validation, D.L. and Y.C.; resources, D.L. and D.H.; data curation, D.L., Y.S. and W.W.; writing—original draft preparation, D.L., Y.C., Y.S., W.W., S.J., H.R., F.Y., C.J., K.T., S.H., J.W. and D.H.; writing—review and editing, D.L. and D.H.; visualization, D.L. and Y.S.; supervision, D.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Postgraduate Research & Practice Innovation Program of Jiangsu Province, grant number KYCX25_4187, and Zhangjiagang City pre-research fund project, grant number zk20230036.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

Author Kunpeng Tang was employed by the company China Automotive Engineering Research Institute Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AI     Artificial Intelligence
DRL    Deep Reinforcement Learning
RL     Reinforcement Learning
DL     Deep Learning
DDPG   Deep Deterministic Policy Gradient
DQN    Deep Q-Network
LSTM   Long Short-Term Memory
EMS    Energy Management System
DDQN   Double Deep Q-Network
SSD    Single Shot MultiBox Detector
MARL   Multi-Agent Reinforcement Learning
GAIL   Generative Adversarial Imitation Learning
TL     Transfer Learning
ECUs   Electronic Control Units
GRU    Gated Recurrent Unit
AEB    Automatic Emergency Braking
ABS    Anti-lock Braking System
DANN   Domain Adversarial Neural Networks
E/E    Electronic and Electrical
LKA    Large Kernel Attention
MTL    Multi-Task Learning
R&D    Research and Development
SOC    State of Charge
GNN    Graph Neural Network
SAC    Soft Actor–Critic
R-CNN  Region-based CNN
CNN    Convolutional Neural Network
IL     Imitation Learning
BC     Behavior Cloning
IMU    Inertial Measurement Unit
GAN    Generative Adversarial Network
RNN    Recurrent Neural Network
PPO    Proximal Policy Optimization
ACC    Adaptive Cruise Control
AR     Augmented Reality
GPS    Global Positioning System

References

  1. Jia, C.; Liu, W.; He, H.; Chau, K.T. Superior energy management for fuel cell vehicles guided by improved DDPG algorithm: Integrating driving intention speed prediction and health-aware control. Appl. Energy 2025, 394, 126195.
  2. Chen, J.; Zhang, M.; Xu, B.; Sun, J.; Mujumdar, A.S. Artificial intelligence assisted technologies for controlling the drying of fruits and vegetables using physical fields: A review. Trends Food Sci. Technol. 2020, 105, 251–260.
  3. Taherdoost, H.; Madanchian, M. AI Advancements: Comparison of Innovative Techniques. AI 2024, 5, 38–54.
  4. Chai, X.; Yan, J.; Zhang, W.; Sulowicz, M.; Feng, Y. Recent Progress on Digital Twins in Intelligent Connected Vehicles: A Review. Elektron. Elektrotech. 2024, 30, 4–17.
  5. Liu, H.; Yan, S.; Shen, Y.; Li, C.; Zhang, Y.; Hussain, F. Model predictive control system based on direct yaw moment control for 4WID self-steering agriculture vehicle. Int. J. Agric. Biol. Eng. 2021, 14, 175–181.
  6. Datta, S.K.; Haerri, J.; Bonnet, C.; Costa, R.F.D. Vehicles as Connected Resources: Opportunities and Challenges for the Future. IEEE Veh. Technol. Mag. 2017, 12, 26–35.
  7. Huang, B.; Yu, W.; Ma, M.; Wei, X.; Wang, G. Artificial-Intelligence-Based Energy Management Strategies for Hybrid Electric Vehicles: A Comprehensive Review. Energies 2025, 18, 3600.
  8. Ye, Z.X.; Chikangaise, P.; Dong, S.W.; Hua, C.W.; Qi, Y.S. Review of intelligent sprinkler irrigation technologies for remote autonomous system. Int. J. Agric. Biol. Eng. 2018, 11, 23–30.
  9. Jia, C.; Zhou, J.; He, H.; Li, J.; Wei, Z.; Li, K. Health-conscious deep reinforcement learning energy management for fuel cell buses integrating environmental and look-ahead road information. Energy 2024, 290, 130146.
  10. Căpriță, H.V.; Selișteanu, D. A Novel Configurable End-to-End Communication Protection Hardware Module for Automotive Sensors. IEEE Sens. J. 2024, 24, 8949–8961.
  11. Ayres, N.; Deka, L.; Paluszczyszyn, D. Container-Based Electronic Control Unit Virtualisation: A Paradigm Shift Towards a Centralised Automotive E/E Architecture. Electronics 2024, 13, 4283.
  12. Hu, J.; Wang, X.; Tan, S. Electric Vehicle Integration in Coupled Power Distribution and Transportation Networks: A Review. Energies 2024, 17, 4775.
  13. Jia, C.; Liu, W.; He, H.; Chau, K.T. Deep reinforcement learning-based energy management strategy for fuel cell buses integrating future road information and cabin comfort control. Energy Convers. Manag. 2024, 321, 119032.
  14. Liu, Y.; Du, S.; Micallef, C.; Jia, Y.; Shi, Y.; Hughes, D.J. Optimisation and Management of Energy Generated by a Multifunctional MFC-Integrated Composite Chassis for Rail Vehicles. Energies 2020, 13, 2720.
  15. Trachtler, A. Integrated vehicle dynamics control using active brake, steering and suspension systems. Int. J. Veh. Des. 2004, 36, 1–12.
  16. Kabir, M.R.; Boddupalli, S.; Nath, A.P.D.; Ray, S. Automotive Functional Safety: Scope, Standards, and Perspectives on Practice. IEEE Consum. Electron. Mag. 2025, 14, 10–25.
  17. Huang, Y.; Li, Z.; Bian, Z.; Jin, H.; Zheng, G.; Hu, D.; Sun, Y.; Fan, C.; Xie, W.; Fang, H. Overview of Deep Learning and Nondestructive Detection Technology for Quality Assessment of Tomatoes. Foods 2025, 14, 286.
  18. Guo, J.; Zhang, K.; Adade, S.Y.-S.S.; Lin, J.; Lin, H.; Chen, Q. Tea grading, blending, and matching based on computer vision and deep learning. J. Sci. Food Agric. 2025, 105, 3239–3251.
  19. Yi, F.; Shu, X.; Zhou, J.; Zhang, J.; Feng, C.; Gong, H.; Zhang, C.; Yu, W. Remaining useful life prediction of PEMFC based on matrix long short-term memory. Int. J. Hydrogen Energy 2025, 111, 228–237.
  20. Tian, Y.; Sun, J.; Zhou, X.; Yao, K.; Tang, N. Detection of soluble solid content in apples based on hyperspectral technology combined with deep learning algorithm. J. Food Process. Preserv. 2022, 46, e16414.
  21. Alom, M.Z.; Taha, T.M.; Yakopcic, C.; Westberg, S.; Sidike, P.; Nasrin, M.S.; Hasan, M.; Van Essen, B.C.; Awwal, A.A.S.; Asari, V.K. A State-of-the-Art Survey on Deep Learning Theory and Architectures. Electronics 2019, 8, 292.
  22. Nunekpeku, X.; Zhang, W.; Gao, J.; Adade, S.Y.-S.S.; Li, H.; Chen, Q. Gel strength prediction in ultrasonicated chicken mince: Fusing near-infrared and Raman spectroscopy coupled with deep learning LSTM algorithm. Food Control 2025, 168, 110916.
  23. Jia, C.; Liu, W.; He, H.; Chau, K.T. Health-conscious energy management for fuel cell vehicles: An integrated thermal management strategy for cabin and energy source systems. Energy 2025, 333, 137330.
  24. Li, K.; Zhou, J.; Jia, C.; Yi, F.; Zhang, C. Energy sources durability energy management for fuel cell hybrid electric bus based on deep reinforcement learning considering future terrain information. Int. J. Hydrogen Energy 2024, 52, 821–833.
  25. Xie, F.; Guo, Z.; Li, T.; Feng, Q.; Zhao, C. Dynamic Task Planning for Multi-Arm Harvesting Robots Under Multiple Constraints Using Deep Reinforcement Learning. Horticulturae 2025, 11, 88.
  26. Jia, C.; He, H.; Zhou, J.; Li, J.; Wei, Z.; Li, K.; Li, M. A novel deep reinforcement learning-based predictive energy management for fuel cell buses integrating speed and passenger prediction. Int. J. Hydrogen Energy 2025, 100, 456–465.
  27. Wang, Z.; Zhan, J.; Duan, C.; Guan, X.; Lu, P.; Yang, K. A Review of Vehicle Detection Techniques for Intelligent Vehicles. IEEE Trans. Neural Netw. Learn. Syst. 2023, 34, 3811–3831.
  28. Reda, M.; Onsy, A.; Haikal, A.Y.; Ghanbari, A. Path planning algorithms in the autonomous driving system: A comprehensive review. Robot. Auton. Syst. 2024, 174, 104630.
  29. Liu, Q.; Li, X.; Tang, Y.; Gao, X.; Yang, F.; Li, Z. Graph Reinforcement Learning-Based Decision-Making Technology for Connected and Autonomous Vehicles: Framework, Review, and Future Trends. Sensors 2023, 23, 8229.
  30. Ni, J.; Chen, Y.; Chen, Y.; Zhu, J.; Ali, D.; Cao, W. A Survey on Theories and Applications for Self-Driving Cars Based on Deep Learning Methods. Appl. Sci. 2020, 10, 2749.
  31. Lu, Y.; Ma, H.; Smart, E.; Yu, H. Real-Time Performance-Focused Localization Techniques for Autonomous Vehicle: A Review. IEEE Trans. Intell. Transp. Syst. 2022, 23, 6082–6100.
  32. Egan, D.; Zhu, Q.; Prucka, R. A Review of Reinforcement Learning-Based Powertrain Controllers: Effects of Agent Selection for Mixed-Continuity Control and Reward Formulation. Energies 2023, 16, 3450.
  33. Ganesh, A.H.; Xu, B. A review of reinforcement learning based energy management systems for electrified powertrains: Progress, challenge, and potential solution. Renew. Sustain. Energy Rev. 2022, 154, 111833.
  34. Al Miaari, A.; Ali, H.M. Batteries temperature prediction and thermal management using machine learning: An overview. Energy Rep. 2023, 10, 2277–2305.
  35. Hua, X.; Zeng, J.; Li, H.; Huang, J.; Luo, M.; Feng, X.; Xiong, H.; Wu, W. A Review of Automobile Brake-by-Wire Control Technology. Processes 2023, 11, 994.
  36. Zhang, L.; Wang, Q.; Chen, J.; Wang, Z.-P.; Li, S.-H. Brake-by-wire system for passenger cars: A review of structure, control, key technologies, and application in X-by-wire chassis. Etransportation 2023, 18, 100292.
  37. Mortazavizadeh, S.A.; Ghaderi, A.; Ebrahimi, M.; Hajian, M. Recent Developments in the Vehicle Steer-by-Wire System. IEEE Trans. Transp. Electrif. 2020, 6, 1226–1235.
  38. Lajunen, A.; Yang, Y.; Emadi, A. Review of Cabin Thermal Management for Electrified Passenger Vehicles. IEEE Trans. Veh. Technol. 2020, 69, 6025–6040.
  39. Park, J.; Park, W. Functional requirements of automotive head-up displays: A systematic review of literature from 1994 to present. Appl. Ergon. 2019, 76, 130–146.
  40. Lu, J.; Peng, Z.; Yang, S.; Ma, Y.; Wang, R.; Pang, Z.; Feng, X.; Chen, Y.; Cao, Y. A review of sensory interactions between autonomous vehicles and drivers. J. Syst. Archit. 2023, 141, 102932.
  41. Guo, Y.; Wang, T.; Zheng, X. A brief analysis of the development process and future trend of automobile headlights. SHS Web Conf. 2023, 165, 2003.
  42. Sharma, S.; Pandey, A.; Sharma, V.; Mishra, S.; Alkhayyat, A. Federated Learning and Blockchain: A Cross-Domain Convergence. In Proceedings of the 2023 3rd International Conference on Technological Advancements in Computational Sciences (ICTACS), Tashkent, Uzbekistan, 1–3 November 2023; pp. 1121–1127.
  43. Bultmann, S.; Quenzel, J.; Behnke, S. Real-time multi-modal semantic fusion on unmanned aerial vehicles with label propagation for cross-domain adaptation. Robot. Auton. Syst. 2023, 159, 104286.
  44. Abdallaoui, S.; Aglzim, E.-H.; Chaibet, A.; Kribèche, A. Thorough Review Analysis of Safe Control of Autonomous Vehicles: Path Planning and Navigation Techniques. Energies 2022, 15, 1358.
  45. Ahmed, S.; Qiu, B.; Ahmad, F.; Kong, C.-W.; Xin, H. A State-of-the-Art Analysis of Obstacle Avoidance Methods from the Perspective of an Agricultural Sprayer UAV’s Operation Scenario. Agronomy 2021, 11, 1069.
  46. Luo, Y.; Wei, L.; Xu, L.; Zhang, Q.; Liu, J.; Cai, Q.; Zhang, W. Stereo-vision-based multi-crop harvesting edge detection for precise automatic steering of combine harvester. Biosyst. Eng. 2022, 215, 115–128.
  47. Chen, L.; Li, G.; Xie, W.; Tan, J.; Li, Y.; Pu, J.; Chen, L.; Gan, D.; Shi, W. A Survey of Computer Vision Detection, Visual SLAM Algorithms, and Their Applications in Energy-Efficient Autonomous Systems. Energies 2024, 17, 5177.
  48. Zhang, T.; Zhou, J.; Liu, W.; Yue, R.; Yao, M.; Shi, J.; Hu, J. Seedling-YOLO: High-Efficiency Target Detection Algorithm for Field Broccoli Seedling Transplanting Quality Based on YOLOv7-Tiny. Agronomy 2024, 14, 931.
  49. Bai, Z.; Wu, G.; Barth, M.J.; Liu, Y.; Sisbot, E.A.; Oguchi, K. PillarGrid: Deep Learning-Based Cooperative Perception for 3D Object Detection from Onboard-Roadside LiDAR. In Proceedings of the 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC), Macau, China, 8–12 October 2022; pp. 1743–1749.
  50. Wang, Z.; Wu, Y.; Niu, Q. Multi-Sensor Fusion in Automated Driving: A Survey. IEEE Access 2020, 8, 2847–2868.
  51. Xu, B.; Zhang, X.; Wang, L.; Hu, X.; Li, Z.; Pan, S.; Li, J.; Deng, Y. RPFA-Net: A 4D RaDAR Pillar Feature Attention Network for 3D Object Detection. In Proceedings of the 2021 IEEE International Intelligent Transportation Systems Conference (ITSC), Indianapolis, IN, USA, 19–22 September 2021; pp. 3061–3066.
  52. Naimi, H.; Akilan, T.; Khalid, M.A.S. Fast Traffic Sign and Light Detection using Deep Learning for Automotive Applications. In Proceedings of the 2021 IEEE Western New York Image and Signal Processing Workshop (WNYISPW), Rochester, NY, USA, 22 October 2021; pp. 1–5.
  53. Li, G.; Fan, H.; Jiang, G.; Jiang, D.; Liu, Y.; Tao, B.; Yun, J. RGBD-SLAM Based on Object Detection with Two-Stream YOLOv4-MobileNetv3 in Autonomous Driving. IEEE Trans. Intell. Transp. Syst. 2024, 25, 2847–2857.
  54. Chen, S.; Niu, S.; Lan, T.; Liu, B. PCT: Large-Scale 3d Point Cloud Representations Via Graph Inception Networks with Applications to Autonomous Driving. In Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 22–25 September 2019; pp. 4395–4399.
  55. Wang, X.; Lei, J.; Lan, H.; Al-Jawari, A.; Wei, X. DuEqNet: Dual-equivariance network in outdoor 3D object detection for autonomous driving. 2023 IEEE International Conference on Robotics and Automation (ICRA). arXiv 2023, arXiv:2302.13577.
  56. Xu, S.; Zhou, D.; Fang, J.; Yin, J.; Bin, Z.; Zhang, L. FusionPainting: Multimodal Fusion with Adaptive Attention for 3D Object Detection. In Proceedings of the 2021 IEEE International Intelligent Transportation Systems Conference (ITSC), Indianapolis, IN, USA, 22 September 2021; pp. 3047–3054. [Google Scholar] [CrossRef]
  57. Li, S.; Liu, B.; Zhao, Y.; Zheng, K.; Cheng, H. An Object Detection Method Enhanced by Sparse Point Cloud for Low Illumination in Autonomous Driving. In Proceedings of the 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC), Macau, China, 8–12 October 2022; pp. 293–297. [Google Scholar] [CrossRef]
  58. Bai, Y.; Zhang, B.; Xu, N.; Zhou, J.; Shi, J.; Diao, Z. Vision-based navigation and guidance for agricultural autonomous vehicles and robots: A review. Comput. Electron. Agric. 2023, 205, 107584. [Google Scholar] [CrossRef]
  59. Yuan, L.-m.; Cai, J.-r.; Sun, L.; Ye, C. A Preliminary Discrimination of Cluster Disqualified Shape for Table Grape by Mono-Camera Multi-Perspective Simultaneously Imaging Approach. Food Anal. Methods 2016, 9, 758–767. [Google Scholar] [CrossRef]
  60. Li, B.; Chan, P.H.; Baris, G.; Higgins, M.D.; Donzella, V. Analysis of Automotive Camera Sensor Noise Factors and Impact on Object Detection. IEEE Sens. J. 2022, 22, 22210–22219. [Google Scholar] [CrossRef]
  61. Wang, J.; Gao, Z.; Zhang, Y.; Zhou, J.; Wu, J.; Li, P. Real-Time Detection and Location of Potted Flowers Based on a ZED Camera and a YOLO V4-Tiny Deep Learning Algorithm. Horticulturae 2022, 8, 21. [Google Scholar] [CrossRef]
  62. Wu, Q.; Gu, J. Research on Robot Visual Servo Control Based on Image Identification. In Proceedings of the 2016 International Conference on Modeling, Simulation and Optimization Technologies and Applications (MSOTA2016), Xiamen, China, 18–19 December 2016; pp. 384–387. [Google Scholar] [CrossRef]
  63. Viola, P.; Jones, M. Rapid object detection using a boosted cascade of simple features. In Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), Kauai, HI, USA, 8–14 December 2001; Volume 1, pp. 511–518. [Google Scholar] [CrossRef]
  64. Dalal, N.; Triggs, B. Histograms of oriented gradients for human detection. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–25 June 2005; Volume 1, pp. 886–893. [Google Scholar] [CrossRef]
  65. Felzenszwalb, P.; McAllester, D.; Ramanan, D. A discriminatively trained, multiscale, deformable part model. In Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA, 23–28 June 2008; pp. 1–8. [Google Scholar] [CrossRef]
  66. Varghese, R.; Sambath, M. A comprehensive review on two-stage object detection algorithms. In Proceedings of the 2023 International Conference on Quantum Technologies, Communications, Computing, Hardware and Embedded Systems Security (iQ-CCHESS), Kottayam, India, 15–16 September 2023; pp. 1–7. [Google Scholar] [CrossRef]
  67. Wang, R.; Zhao, H.; Xu, Z.; Ding, Y.; Li, G.; Zhang, Y.; Li, H. Real-time vehicle target detection in inclement weather conditions based on YOLOv4. Front. Neurorobot. 2023, 17, 1058723. [Google Scholar] [CrossRef]
  68. Wang, Z.; Li, Y.; Liu, Y.; Meng, F. Improved object detection via large kernel attention. Expert Syst. Appl. 2024, 240, 122507. [Google Scholar] [CrossRef]
  69. Yang, L.; Zhong, J.; Zhang, Y.; Bai, S.; Li, G.; Yang, Y.; Zhang, J. An Improving Faster-RCNN With Multi-Attention ResNet for Small Target Detection in Intelligent Autonomous Transport With 6G. IEEE Trans. Intell. Transp. Syst. 2023, 24, 7717–7725. [Google Scholar] [CrossRef]
  70. Luo, J.; Fang, H.; Shao, F.; Hu, C.; Meng, F. Vehicle Detection in Congested Traffic Based on Simplified Weighted Dual-Path Feature Pyramid Network with Guided Anchoring. IEEE Access 2021, 9, 53219–53231. [Google Scholar] [CrossRef]
  71. Wang, L.; Tang, J.; Liao, Q. A study on radar target detection based on deep neural networks. IEEE Sens. Lett. 2019, 3, 1–4. [Google Scholar] [CrossRef]
  72. Alipour, H. Point Cloud-Based Analysis of Integrated Drone-Based Tracking, Mapping, and Anomaly Detection for GPS-Denied Environments; University of British Columbia Library: Vancouver, BC, Canada, 2024. [Google Scholar] [CrossRef]
  73. Lu, G.; He, Z.; Zhang, S.; Huang, Y.; Zhong, Y.; Li, Z.; Han, Y. A Novel Method for Improving Point Cloud Accuracy in Automotive Radar Object Recognition. IEEE Access 2023, 11, 78538–78548. [Google Scholar] [CrossRef]
  74. Xu, J.; Liu, H.; Shen, Y.; Zeng, X.; Zheng, X. Individual nursery trees classification and segmentation using a point cloud-based neural network with dense connection pattern. Sci. Hortic. 2024, 328, 112945. [Google Scholar] [CrossRef]
  75. Yang, N.; Chang, K.; Dong, S.; Tang, J.; Wang, A.; Huang, R.; Jia, Y. Rapid image detection and recognition of rice false smut based on mobile smart devices with anti-light features from cloud database. Biosyst. Eng. 2022, 218, 229–244. [Google Scholar] [CrossRef]
  76. Quan, C.; Liu, F.; Qi, L.; Tie, Y. LRT-CLUSTER: A New Clustering Algorithm Based on Likelihood Ratio Test to Identify Driving Genes. Interdiscip. Sci. Comput. Life Sci. 2023, 15, 217–230. [Google Scholar] [CrossRef]
  77. Liang, M.; Kropfreiter, T.; Meyer, F. A BP Method for Track-Before-Detect. IEEE Signal Process. Lett. 2023, 30, 1137–1141. [Google Scholar] [CrossRef]
  78. Diskin, T.; Beer, Y.; Okun, U.; Wiesel, A. CFARnet: Deep learning for target detection with constant false alarm rate. Signal Process. 2024, 223, 109543. [Google Scholar] [CrossRef]
  79. Ranft, B.; Stiller, C. The Role of Machine Vision for Intelligent Vehicles. IEEE Trans. Intell. Veh. 2016, 1, 8–19. [Google Scholar] [CrossRef]
  80. Pravallika, A.; Hashmi, M.F.; Gupta, A. Deep Learning Frontiers in 3D Object Detection: A Comprehensive Review for Autonomous Driving. IEEE Access 2024, 12, 173936–173980. [Google Scholar] [CrossRef]
  81. Sun, X.; Wang, M.; Du, J.; Sun, Y.; Cheng, S.S.; Xie, W. A Task-Driven Scene-Aware LiDAR Point Cloud Coding Framework for Autonomous Vehicles. IEEE Trans. Ind. Inform. 2023, 19, 8731–8742. [Google Scholar] [CrossRef]
  82. Li, Y.; Li, X.; Zhang, Z.; Shuang, F.; Lin, Q.; Jiang, J. DenseKPNET: Dense Kernel Point Convolutional Neural Networks for Point Cloud Semantic Segmentation. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–13. [Google Scholar] [CrossRef]
  83. Klein, L.A. ITS Sensors and Architectures for Traffic Management and Connected Vehicles; CRC Press: Boca Raton, FL, USA, 2017. [Google Scholar] [CrossRef]
  84. Yuanyuan, Z.; Bin, Z.; Cheng, S.; Haolu, L.; Jicheng, H.; Kunpeng, T.; Zhong, T. Review of the field environmental sensing methods based on multi-sensor information fusion technology. Int. J. Agric. Biol. Eng. 2024, 17, 1–13. [Google Scholar] [CrossRef]
  85. Xu, S.; Xu, X.; Zhu, Q.; Meng, Y.; Yang, G.; Feng, H.; Yang, M.; Zhu, Q.; Xue, H.; Wang, B. Monitoring leaf nitrogen content in rice based on information fusion of multi-sensor imagery from UAV. Precis. Agric. 2023, 24, 2327–2349. [Google Scholar] [CrossRef]
  86. Xiang, C.; Feng, C.; Xie, X.; Shi, B.; Lu, H.; Lv, Y.; Yang, M.; Niu, Z. Multi-Sensor Fusion and Cooperative Perception for Autonomous Driving: A Review. IEEE Intell. Transp. Syst. Mag. 2023, 15, 36–58. [Google Scholar] [CrossRef]
  87. Yang, B.; Li, J.; Zeng, T. A Review of Environmental Perception Technology Based on Multi-Sensor Information Fusion in Autonomous Driving. World Electr. Veh. J. 2025, 16, 20. [Google Scholar] [CrossRef]
  88. Wang, H.; Liu, J.; Dong, H.; Shao, Z. A Survey of the Multi-Sensor Fusion Object Detection Task in Autonomous Driving. Sensors 2025, 25, 2794. [Google Scholar] [CrossRef]
  89. Zhu, C.; Liu, X.; Chen, H.; Tian, X. Automatic cruise system for water quality monitoring. Int. J. Agric. Biol. Eng. 2018, 11, 220–228. [Google Scholar] [CrossRef]
  90. Csiszar, O. Ordered Weighted Averaging Operators: A Short Review. IEEE Syst. Man Cybern. Mag. 2021, 7, 4–12. [Google Scholar] [CrossRef]
  91. Wei, L.; Yang, H.; Niu, Y.; Zhang, Y.; Xu, L.; Chai, X. Wheat biomass, yield, and straw-grain ratio estimation from multi-temporal UAV-based RGB and multispectral images. Biosyst. Eng. 2023, 234, 187–205. [Google Scholar] [CrossRef]
  92. Barreto-Cubero, A.J.; Gómez-Espinosa, A.; Escobedo Cabello, J.A.; Cuan-Urquizo, E.; Cruz-Ramírez, S.R. Sensor Data Fusion for a Mobile Robot Using Neural Networks. Sensors 2022, 22, 305. [Google Scholar] [CrossRef]
  93. Lin, C.; Tian, D.; Duan, X.; Zhou, J.; Zhao, D.; Cao, D. CL3D: Camera-LiDAR 3D Object Detection With Point Feature Enhancement and Point-Guided Fusion. IEEE Trans. Intell. Transp. Syst. 2022, 23, 18040–18050. [Google Scholar] [CrossRef]
  94. Liu, Z.; Huang, T.; Li, B.; Chen, X.; Wang, X.; Bai, X. EPNet++: Cascade Bi-Directional Fusion for Multi-Modal 3D Object Detection. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 8324–8341. [Google Scholar] [CrossRef] [PubMed]
  95. Zhang, T.; Zhao, M. Multi-Scale Vehicle Detection and Tracking Method in Highway Scene. In Proceedings of the 2020 Chinese Control and Decision Conference (CCDC), Hefei, China, 22–24 August 2020; pp. 2066–2071. [Google Scholar] [CrossRef]
  96. Li, J.; Shang, Z.; Li, R.; Cui, B. Adaptive Sliding Mode Path Tracking Control of Unmanned Rice Transplanter. Agriculture 2022, 12, 1225. [Google Scholar] [CrossRef]
  97. Peng, C.; Zeng, Z.; Gao, J.; Zhou, J.; Tomizuka, M.; Wang, X.; Zhou, C.; Ye, N. PNAS-MOT: Multi-Modal Object Tracking With Pareto Neural Architecture Search. IEEE Robot. Autom. Lett. 2024, 9, 4377–4384. [Google Scholar] [CrossRef]
  98. Lu, E.; Xue, J.; Chen, T.; Jiang, S. Robust Trajectory Tracking Control of an Autonomous Tractor-Trailer Considering Model Parameter Uncertainties and Disturbances. Agriculture 2023, 13, 869. [Google Scholar] [CrossRef]
  99. Shi, R.; Han, X.; Guo, W. Uncertain multi-objective programming approach for planning supplementary irrigation areas in rainfed agricultural regions. Irrig. Drain. 2025, 74, 1193–1214. [Google Scholar] [CrossRef]
  100. Wang, D.; Huang, C.; Wang, Y.; Deng, Y.; Li, H. A 3D Multiobject Tracking Algorithm of Point Cloud Based on Deep Learning. Math. Probl. Eng. 2020, 2020, 8895696. [Google Scholar] [CrossRef]
  101. Sun, J.; Wang, Z.; Ding, S.; Xia, J.; Xing, G. Adaptive disturbance observer-based fixed time nonsingular terminal sliding mode control for path-tracking of unmanned agricultural tractors. Biosyst. Eng. 2024, 246, 96–109. [Google Scholar] [CrossRef]
  102. Zheng, L.; Tang, M.; Chen, Y.; Zhu, G.; Wang, J.; Lu, H. Improving multiple object tracking with single object tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 11–15 June 2021; pp. 2453–2462. [Google Scholar] [CrossRef]
  103. Teng, Z.; Xing, J.; Wang, Q.; Zhang, B.; Fan, J. Deep Spatial and Temporal Network for Robust Visual Object Tracking. IEEE Trans. Image Process. 2020, 29, 1762–1775. [Google Scholar] [CrossRef] [PubMed]
  104. Hassaballah, M.; Kenk, M.A.; Muhammad, K.; Minaee, S. Vehicle Detection and Tracking in Adverse Weather Using a Deep Learning Framework. IEEE Trans. Intell. Transp. Syst. 2021, 22, 4230–4242. [Google Scholar] [CrossRef]
  105. Dong, X.; Niu, J.; Cui, J.; Fu, Z.; Ouyang, Z. Fast Segmentation-Based Object Tracking Model for Autonomous Vehicles. In Proceedings of the International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP 2020), New York, NY, USA, 2–4 October 2020; pp. 259–273. [Google Scholar] [CrossRef]
  106. Lyu, P.; Wei, M.; Wu, Y. High-precision and real-time visual tracking algorithm based on the Siamese network for autonomous driving. Signal Image Video Process. 2023, 17, 1235–1243. [Google Scholar] [CrossRef]
  107. Luo, W.; Xing, J.; Milan, A.; Zhang, X.; Liu, W.; Kim, T.-K. Multiple object tracking: A literature review. Artif. Intell. 2021, 293, 103448. [Google Scholar] [CrossRef]
  108. Li, J.; Huang, X.; Zhan, J. High-Precision Motion Detection and Tracking Based on Point Cloud Registration and Radius Search. IEEE Trans. Intell. Transp. Syst. 2023, 24, 6322–6335. [Google Scholar] [CrossRef]
  109. Li, G.; Chen, X.; Li, M.; Li, W.; Li, S.; Guo, G.; Wang, H.; Deng, H. One-shot multi-object tracking using CNN-based networks with spatial-channel attention mechanism. Opt. Laser Technol. 2022, 153, 108267. [Google Scholar] [CrossRef]
  110. Tang, S.; Xia, Z.; Gu, J.; Wang, W.; Huang, Z.; Zhang, W. High-precision apple recognition and localization method based on RGB-D and improved SOLOv2 instance segmentation. Front. Sustain. Food Syst. 2024, 8, 1403872. [Google Scholar] [CrossRef]
  111. Guan, X.; Shi, L.; Yang, W.; Ge, H.; Wei, X.; Ding, Y. Multi-Feature Fusion Recognition and Localization Method for Unmanned Harvesting of Aquatic Vegetables. Agriculture 2024, 14, 971. [Google Scholar] [CrossRef]
  112. Bisogni, C.; Cascone, L.; Nappi, M.; Pero, C. IoT-enabled Biometric Security: Enhancing Smart Car Safety with Depth-based Head Pose Estimation. ACM Trans. Multimed. Comput. Commun. Appl. 2024, 20, 1–24. [Google Scholar] [CrossRef]
  113. Chen, J.; Zhu, F.; Guan, Z.; Zhu, Y.; Shi, H.; Cheng, K. Development of a combined harvester navigation control system based on visual simultaneous localization and mapping-inertial guidance fusion. J. Agric. Eng. 2024, 55. [Google Scholar] [CrossRef]
  114. Wang, J.; Ni, D.; Li, K. RFID-Based Vehicle Positioning and Its Applications in Connected Vehicles. Sensors 2014, 14, 4225–4238. [Google Scholar] [CrossRef]
  115. Alkendi, Y.; Seneviratne, L.; Zweiri, Y. State of the Art in Vision-Based Localization Techniques for Autonomous Navigation Systems. IEEE Access 2021, 9, 76847–76874. [Google Scholar] [CrossRef]
  116. Ma, T.; Wang, Y.; Wang, Z.; Liu, X.; Zhang, H. ASD-SLAM: A Novel Adaptive-Scale Descriptor Learning for Visual SLAM. In Proceedings of the 2020 IEEE Intelligent Vehicles Symposium (IV), Las Vegas, NV, USA, 19 October–13 November 2020; pp. 809–816. [Google Scholar] [CrossRef]
  117. Shi, Y.; Li, H. Beyond cross-view image retrieval: Highly accurate vehicle localization using satellite image. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 17010–17020. [Google Scholar] [CrossRef]
  118. Tibebu, H.; De-Silva, V.; Artaud, C.; Pina, R.; Shi, X. Towards Interpretable Camera and LiDAR Data Fusion for Autonomous Ground Vehicles Localisation. Sensors 2022, 22, 8021. [Google Scholar] [CrossRef]
  119. Li, Q.; Zhuang, Y.; Huai, J. Multi-sensor fusion for robust localization with moving object segmentation in complex dynamic 3D scenes. Int. J. Appl. Earth Obs. Geoinf. 2023, 124, 103507. [Google Scholar] [CrossRef]
  120. Almalioglu, Y.; Turan, M.; Trigoni, N.; Markham, A. Deep learning-based robust positioning for all-weather autonomous driving. Nat. Mach. Intell. 2022, 4, 749–760. [Google Scholar] [CrossRef] [PubMed]
  121. Wang, J.; Zhang, Y.; Gu, R. Research Status and Prospects on Plant Canopy Structure Measurement Using Visual Sensors Based on Three-Dimensional Reconstruction. Agriculture 2020, 10, 462. [Google Scholar] [CrossRef]
  122. Sabziev, E. Determining the Location of an Unmanned Aerial Vehicle Based on Video Camera Images. Adv. Inf. Syst. 2021, 5, 136–139. [Google Scholar] [CrossRef]
  123. Ersü, C.; Petlenkov, E.; Janson, K. A Systematic Review of Cutting-Edge Radar Technologies: Applications for Unmanned Ground Vehicles (UGVs). Sensors 2024, 24, 7807. [Google Scholar] [CrossRef]
  124. Cai, L.; Ye, Y.; Gao, X.; Li, Z.; Zhang, C. An improved visual SLAM based on affine transformation for ORB feature extraction. Optik 2021, 227, 165421. [Google Scholar] [CrossRef]
  125. Al-Refai, R.; Nandakumar, K. A unified model for face matching and presentation attack detection using an ensemble of vision transformer features. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–7 January 2023; pp. 662–671. [Google Scholar] [CrossRef]
  126. Kamel, A.; Sheng, B.; Yang, P.; Li, P.; Shen, R.; Feng, D.D. Deep convolutional neural networks for human action recognition using depth maps and postures. IEEE Trans. Syst. Man Cybern. Syst. 2018, 49, 1806–1819. [Google Scholar] [CrossRef]
  127. Tian, M.; Nie, Q.; Shen, H. 3D Scene Geometry-Aware Constraint for Camera Localization with Deep Learning. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 31 May–31 August 2020; pp. 4211–4217. [Google Scholar] [CrossRef]
  128. Hu, H.; Wang, H.; Liu, Z.; Chen, W. Domain-Invariant Similarity Activation Map Contrastive Learning for Retrieval-Based Long-Term Visual Localization. IEEE/CAA J. Autom. Sin. 2022, 9, 313–328. [Google Scholar] [CrossRef]
  129. Song, X.; Li, H.; Liang, L.; Shi, W.; Xie, G.; Lu, X.; Hei, X. TransBoNet: Learning camera localization with Transformer Bottleneck and Attention. Pattern Recognit. 2024, 146, 109975. [Google Scholar] [CrossRef]
  130. Wang, S.; Ahmad, N.S. A Comprehensive Review on Sensor Fusion Techniques for Localization of a Dynamic Target in GPS-Denied Environments. IEEE Access 2025, 13, 2252–2285. [Google Scholar] [CrossRef]
  131. Solanki, A.; Amiri, W.A.; Mahmoud, M.; Swieder, B.; Hasan, S.R.; Guo, T.N. Survey of Navigational Perception Sensors’ Security in Autonomous Vehicles. IEEE Access 2025, 13, 104937–104965. [Google Scholar] [CrossRef]
  132. Thrun, S.; Montemerlo, M. The Graph SLAM Algorithm with Applications to Large-Scale Mapping of Urban Structures. Int. J. Robot. Res. 2006, 25, 403–429. [Google Scholar] [CrossRef]
  133. Lu, W.; Wan, G.; Zhou, Y.; Fu, X.; Yuan, P.; Song, S. DeepVCP: An end-to-end deep neural network for point cloud registration. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 12–21. [Google Scholar] [CrossRef]
  134. Kang, Q.; She, R.; Wang, S.; Tay, W.P.; Navarro, D.N.; Hartmannsgruber, A. Location Learning for AVs: LiDAR and Image Landmarks Fusion Localization with Graph Neural Networks. In Proceedings of the 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC), Macau, China, 8–12 October 2022; pp. 3032–3037. [Google Scholar] [CrossRef]
  135. Ibrahim, M.; Akhtar, N.; Anwar, S.; Mian, A. UnLoc: A Universal Localization Method for Autonomous Vehicles using LiDAR, Radar and/or Camera Input. In Proceedings of the 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Detroit, MI, USA, 1–5 October 2023; pp. 5187–5194. [Google Scholar] [CrossRef]
  136. Yan, Z.; Li, P.; Fu, Z.; Xu, S.; Shi, Y.; Chen, X.; Zheng, Y.; Li, Y.; Liu, T.; Li, C. Int2: Interactive trajectory prediction at intersections. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 1–6 October 2023; pp. 8536–8547. [Google Scholar] [CrossRef]
  137. Zipfl, M.; Hertlein, F.; Rettinger, A.; Thoma, S.; Halilaj, L.; Luettin, J.; Schmid, S.; Henson, C. Relation-based Motion Prediction using Traffic Scene Graphs. In Proceedings of the 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC), Macau, China, 8–12 October 2022; pp. 825–831. [Google Scholar] [CrossRef]
  138. Geng, M.; Cai, Z.; Zhu, Y.; Chen, X.; Lee, D.H. Multimodal Vehicular Trajectory Prediction with Inverse Reinforcement Learning and Risk Aversion at Urban Unsignalized Intersections. IEEE Trans. Intell. Transp. Syst. 2023, 24, 12227–12240. [Google Scholar] [CrossRef]
  139. Ban, Y.; Li, X.; Rosman, G.; Gilitschenski, I.; Meireles, O.; Karaman, S.; Rus, D. A Deep Concept Graph Network for Interaction-Aware Trajectory Prediction. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; pp. 8992–8998. [Google Scholar] [CrossRef]
  140. Cheng, H.; Liu, M.; Chen, L.; Broszio, H.; Sester, M.; Yang, M.Y. GATraj: A graph- and attention-based multi-agent trajectory prediction model. ISPRS J. Photogramm. Remote Sens. 2023, 205, 163–175. [Google Scholar] [CrossRef]
  141. Choi, S.; Kweon, N.; Yang, C.; Kim, D.; Shon, H.; Choi, J.; Huh, K. DSA-GAN: Driving Style Attention Generative Adversarial Network for Vehicle Trajectory Prediction. In Proceedings of the 2021 IEEE International Intelligent Transportation Systems Conference (ITSC), Indianapolis, IN, USA, 19–22 September 2021; pp. 1515–1520. [Google Scholar]
  142. Song, X.; Chen, K.; Li, X.; Sun, J.; Hou, B.; Cui, Y.; Zhang, B.; Xiong, G.; Wang, Z. Pedestrian Trajectory Prediction Based on Deep Convolutional LSTM Network. IEEE Trans. Intell. Transp. Syst. 2021, 22, 3285–3302. [Google Scholar] [CrossRef]
  143. Xue, H.; Huynh, D.Q.; Reynolds, M. PoPPL: Pedestrian Trajectory Prediction by LSTM With Automatic Route Class Clustering. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 77–90. [Google Scholar] [CrossRef] [PubMed]
  144. Hsieh, T.J.; Shih, C.S.; Lin, C.W.; Chen, C.W.; Tsung, P.K. Trajectory Prediction at Unsignalized Intersections using Social Conditional Generative Adversarial Network. In Proceedings of the 2021 IEEE International Intelligent Transportation Systems Conference (ITSC), Indianapolis, IN, USA, 19–22 September 2021; pp. 844–851. [Google Scholar] [CrossRef]
  145. Yang, X.; Bingxian, L.; Xiangcheng, W. SGAMTE-Net: A pedestrian trajectory prediction network based on spatiotemporal graph attention and multimodal trajectory endpoints. Appl. Intell. 2023, 53, 31165–31180. [Google Scholar] [CrossRef]
  146. Youssef, T.; Zemmouri, E.; Bouzid, A. STM-GCN: A spatiotemporal multi-graph convolutional network for pedestrian trajectory prediction. J. Supercomput. 2023, 79, 20923–20937. [Google Scholar] [CrossRef]
  147. Khakzar, M.; Bond, A.; Rakotonirainy, A.; Trespalacios, O.O.; Dehkordi, S.G. Driver influence on vehicle trajectory prediction. Accid. Anal. Prev. 2021, 157, 106165. [Google Scholar] [CrossRef]
  148. Xing, Y.; Lv, C.; Cao, D. Personalized Vehicle Trajectory Prediction Based on Joint Time-Series Modeling for Connected Vehicles. IEEE Trans. Veh. Technol. 2020, 69, 1341–1352. [Google Scholar] [CrossRef]
  149. Dai, S.; Li, Z.; Li, L.; Zheng, N.; Wang, S. A Flexible and Explainable Vehicle Motion Prediction and Inference Framework Combining Semi-Supervised AOG and ST-LSTM. IEEE Trans. Intell. Transp. Syst. 2022, 23, 840–860. [Google Scholar] [CrossRef]
  150. Hou, L.; Li, S.E.; Yang, B.; Wang, Z.; Nakano, K. Structural Transformer Improves Speed-Accuracy Trade-Off in Interactive Trajectory Prediction of Multiple Surrounding Vehicles. IEEE Trans. Intell. Transp. Syst. 2022, 23, 24778–24790. [Google Scholar] [CrossRef]
  151. Wang, X.; Tang, K.; Dai, X.; Xu, J.; Xi, J.; Ai, R.; Wang, Y.; Gu, W.; Sun, C. Safety-Balanced Driving-Style Aware Trajectory Planning in Intersection Scenarios with Uncertain Environment. IEEE Trans. Intell. Veh. 2023, 8, 2888–2898. [Google Scholar] [CrossRef]
  152. Jeon, H.; Choi, J.; Kum, D. SCALE-Net: Scalable Vehicle Trajectory Prediction Network under Random Number of Interacting Vehicles via Edge-enhanced Graph Convolutional Neural Network. In Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA, 24 October 2020–24 January 2021; pp. 2095–2102. [Google Scholar] [CrossRef]
  153. Schwarting, W.; Alonso-Mora, J.; Rus, D. Planning and Decision-Making for Autonomous Vehicles. Annu. Rev. Control Robot. Auton. Syst. 2018, 1, 187–210. [Google Scholar] [CrossRef]
  154. Zhu, Z.; Zhao, H. Joint Imitation Learning of Behavior Decision and Control for Autonomous Intersection Navigation. In Proceedings of the 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Detroit, MI, USA, 1–5 October 2023; pp. 1564–1571. [Google Scholar] [CrossRef]
  155. Zhang, F.; Teng, S.; Wang, Y.; Guo, Z.; Wang, J.; Xu, R. Design of bionic goat quadruped robot mechanism and walking gait planning. Int. J. Agric. Biol. Eng. 2020, 13, 32–39. [Google Scholar] [CrossRef]
  156. Ma, W.; Wu, J.; Sun, B.; Leng, X.; Miao, W.; Gao, Z.; Li, W. Intelligent vehicle decision-making strategy integrating spatiotemporal features at roundabout. Expert Syst. Appl. 2025, 273, 126779. [Google Scholar] [CrossRef]
  157. Lu, E.; Xu, L.; Li, Y.; Tang, Z.; Ma, Z. Modeling of working environment and coverage path planning method of combine harvesters. Int. J. Agric. Biol. Eng. 2020, 13, 132–137. [Google Scholar] [CrossRef]
  158. Feraco, S.; Luciani, S.; Bonfitto, A.; Amati, N.; Tonoli, A. A local trajectory planning and control method for autonomous vehicles based on the RRT algorithm. In Proceedings of the 2020 AEIT International Conference of Electrical and Electronic Technologies for Automotive (AEIT Automotive), Turin, Italy, 18–20 November 2020; pp. 1–6. [Google Scholar] [CrossRef]
  159. Jafari, R.; Ashari, A.E.; Huber, M. CHAMP: Integrated Logic with Reinforcement Learning for Hybrid Decision Making for Autonomous Vehicle Planning. In Proceedings of the 2023 American Control Conference (ACC), San Diego, CA, USA, 31 May–2 June 2023; pp. 3310–3315. [Google Scholar] [CrossRef]
  160. Mirchevska, B.; Pek, C.; Werling, M.; Althoff, M.; Boedecker, J. High-level Decision Making for Safe and Reasonable Autonomous Lane Changing using Reinforcement Learning. In Proceedings of the 2018 21st International Conference on Intelligent Transportation Systems (ITSC), Maui, HI, USA, 4–7 November 2018; pp. 2156–2162. [Google Scholar] [CrossRef]
  161. Liu, Q.; Li, Z.; Li, X.; Wu, J.; Yuan, S. Graph convolution-based deep reinforcement learning for multi-agent decision-making in interactive traffic scenarios. In Proceedings of the 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC), Macau, China, 8–12 October 2022; pp. 4074–4081. [Google Scholar] [CrossRef]
  162. Li, J.; Fotouhi, A.; Pan, W.; Liu, Y.; Zhang, Y.; Chen, Z. Deep reinforcement learning-based eco-driving control for connected electric vehicles at signalized intersections considering traffic uncertainties. Energy 2023, 279, 128139. [Google Scholar] [CrossRef]
  163. Yang, R.; Xu, S.; Li, H.; Zhu, H.; Zhao, H.; Wang, X. Action-Oriented Deep Reinforcement Learning Method for Precast Concrete Component Production Scheduling. Buildings 2025, 15, 697. [Google Scholar] [CrossRef]
  164. Bevly, D.; Cao, X.; Gordon, M.; Ozbilgin, G.; Kari, D.; Nelson, B.; Woodruff, J.; Barth, M.; Murray, C.; Kurt, A.; et al. Lane Change and Merge Maneuvers for Connected and Automated Vehicles: A Survey. IEEE Trans. Intell. Veh. 2016, 1, 105–120. [Google Scholar] [CrossRef]
  165. Seong, H.; Jung, C.; Lee, S.; Shim, D.H. Learning to Drive at Unsignalized Intersections using Attention-based Deep Reinforcement Learning. In Proceedings of the 2021 IEEE International Intelligent Transportation Systems Conference (ITSC), Indianapolis, IN, USA, 19–22 September 2021; pp. 559–566. [Google Scholar] [CrossRef]
  166. Guo, Q.; Angah, O.; Liu, Z.; Ban, X. Hybrid deep reinforcement learning based eco-driving for low-level connected and automated vehicles along signalized corridors. Transp. Res. Part C Emerg. Technol. 2021, 124, 102980. [Google Scholar] [CrossRef]
  167. Zhao, R.; Sun, Z.; Ji, A. A Deep Reinforcement Learning Approach for Automated On-Ramp Merging. In Proceedings of the 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC), Macau, China, 8–12 October 2022; pp. 3800–3806. [Google Scholar] [CrossRef]
  168. En, L.; Zheng, M.; Yaoming, L.; Lizhang, X.; Zhong, T. Adaptive backstepping control of tracked robot running trajectory based on real-time slip parameter estimation. Int. J. Agric. Biol. Eng. 2020, 13, 178–187. [Google Scholar] [CrossRef]
  169. Gautam, A.; He, Y.; Lin, X. An Overview of Motion-Planning Algorithms for Autonomous Ground Vehicles with Various Applications. SAE Int. J. Veh. Dyn. Stab. NVH 2024, 8, 179–213. [Google Scholar] [CrossRef]
  170. Katrakazas, C.; Quddus, M.; Chen, W.-H.; Deka, L. Real-time motion planning methods for autonomous on-road driving: State-of-the-art and future research directions. Transp. Res. Part C Emerg. Technol. 2015, 60, 416–442. [Google Scholar] [CrossRef]
  171. Guo, Y.; Guo, Z.; Wang, Y.; Yao, D.; Li, B.; Li, L. A Survey of Trajectory Planning Methods for Autonomous Driving—Part I: Unstructured Scenarios. IEEE Trans. Intell. Veh. 2024, 9, 5407–5434. [Google Scholar] [CrossRef]
  172. Qiao, Z.; Schneider, J.; Dolan, J.M. Behavior Planning at Urban Intersections through Hierarchical Reinforcement Learning. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 2667–2673. [Google Scholar] [CrossRef]
  173. Chen, S.; Wang, M.; Yang, Y.; Song, W. Conflict-constrained Multi-agent Reinforcement Learning Method for Parking Trajectory Planning. In Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), London, UK, 29 May–2 June 2023; pp. 9421–9427. [Google Scholar] [CrossRef]
  174. Zhu, H.; Han, T.; Alhajyaseen, W.K.M.; Iryo-Asano, M.; Nakamura, H. Can automated driving prevent crashes with distracted pedestrians? An exploration of motion planning at unsignalized mid-block crosswalks. Accid. Anal. Prev. 2022, 173, 106711. [Google Scholar] [CrossRef]
  175. Tang, X.; Huang, B.; Liu, T.; Lin, X. Highway Decision-Making and Motion Planning for Autonomous Driving via Soft Actor-Critic. IEEE Trans. Veh. Technol. 2022, 71, 4706–4717. [Google Scholar] [CrossRef]
  176. Cao, Z.; Yang, D.; Xu, S.; Peng, H.; Li, B.; Feng, S.; Zhao, D. Highway Exiting Planner for Automated Vehicles Using Reinforcement Learning. IEEE Trans. Intell. Transp. Syst. 2021, 22, 990–1000. [Google Scholar] [CrossRef]
  177. Chen, Y.X.; Chen, L.; Wang, R.C.; Xu, X.; Shen, Y.J.; Liu, Y.L. Modeling and test on height adjustment system of electrically-controlled air suspension for agricultural vehicles. Int. J. Agric. Biol. Eng. 2016, 9, 40–47. [Google Scholar] [CrossRef]
  178. Li, J.; Yu, T.; Yang, B. Adaptive Controller of PEMFC Output Voltage Based on Ambient Intelligence Large-Scale Deep Reinforcement Learning. IEEE Access 2021, 9, 6063–6075. [Google Scholar] [CrossRef]
  179. Sun, Z.; Guo, R.; Xue, X.; Hong, Z.; Luo, M.; Wong, P.K.; Liu, J.J.R.; Wang, X. Application-oriented mode decision for energy management of range-extended electric vehicle based on reinforcement learning. Electr. Power Syst. Res. 2024, 226, 109896. [Google Scholar] [CrossRef]
  180. Su, Q.; Zhou, J.; Yi, F.; Hu, D.; Lu, D.; Wu, G.; Zhang, C.; Deng, B.; Cao, D. An intelligent control method for PEMFC air supply subsystem to optimize dynamic response performance. Fuel 2024, 361, 130697. [Google Scholar] [CrossRef]
  181. Beaudoin, M.-A.; Boulet, B. Improving gearshift controllers for electric vehicles with reinforcement learning. Mech. Mach. Theory 2022, 169, 104654. [Google Scholar] [CrossRef]
  182. Zhang, Z.; Zhang, T.; Hong, J.; Zhang, H.; Yang, J.; Jia, Q. Double deep Q-network guided energy management strategy of a novel electric-hydraulic hybrid electric vehicle. Energy 2023, 269, 126858. [Google Scholar] [CrossRef]
  183. Liu, Y.; Zhang, J.; Lv, Z.; Ye, J. The Optimization of RBFNN Gearshift Controller Parameters for Electric Vehicles Using PILCO Reinforcement Learning. IEEE Access 2023, 11, 92807–92821. [Google Scholar] [CrossRef]
  184. Li, H.; Li, N.; Kolmanovsky, I.; Girard, A. Energy-Efficient Autonomous Vehicle Control Using Reinforcement Learning and Interactive Traffic Simulations. In Proceedings of the 2020 American Control Conference (ACC), Denver, CO, USA, 1–3 July 2020; pp. 3029–3034. [Google Scholar] [CrossRef]
  185. Kerbel, L.; Ayalew, B.; Ivanco, A.; Loiselle, K. Driver Assistance Eco-driving and Transmission Control with Deep Reinforcement Learning. In Proceedings of the 2022 American Control Conference (ACC), Atlanta, GA, USA, 8–10 June 2022; pp. 2409–2415.
  186. Lu, D.; Hu, D.; Wang, J.; Wei, W.; Zhang, X. A Data-Driven Vehicle Speed Prediction Transfer Learning Method with Improved Adaptability Across Working Conditions for Intelligent Fuel Cell Vehicle. IEEE Trans. Intell. Transp. Syst. 2025, 26, 10881–10891.
  187. Zhu, Z.; Yang, Y.; Wang, D.; Cai, Y.; Lai, L. Energy Saving Performance of Agricultural Tractor Equipped with Mechanic-Electronic-Hydraulic Powertrain System. Agriculture 2022, 12, 436.
  188. Wu, Y.; Huang, Z.; Zhang, R.; Huang, P.; Gao, Y.; Li, H.; Liu, Y.; Peng, J. Driving style-aware energy management for battery/supercapacitor electric vehicles using deep reinforcement learning. J. Energy Storage 2023, 73, 109199.
  189. Liu, J.; Xia, C.; Jiang, D.; Sun, Y. Development and Testing of the Power Transmission System of a Crawler Electric Tractor for Greenhouses. Appl. Eng. Agric. 2020, 36, 797–805.
  190. Tang, X.; Jia, T.; Hu, X.; Huang, Y.; Deng, Z.; Pu, H. Naturalistic Data-Driven Predictive Energy Management for Plug-In Hybrid Electric Vehicles. IEEE Trans. Transp. Electrif. 2021, 7, 497–508.
  191. Tang, X.; Zhang, J.; Pi, D.; Lin, X.; Grzesiak, L.M.; Hu, X. Battery Health-Aware and Deep Reinforcement Learning-Based Energy Management for Naturalistic Data-Driven Driving Scenarios. IEEE Trans. Transp. Electrif. 2022, 8, 948–964.
  192. Du, G.; Zou, Y.; Zhang, X.; Liu, T.; Wu, J.; He, D. Deep reinforcement learning based energy management for a hybrid electric vehicle. Energy 2020, 201, 117591.
  193. Tang, X.; Chen, J.; Pu, H.; Liu, T.; Khajepour, A. Double Deep Reinforcement Learning-Based Energy Management for a Parallel Hybrid Electric Vehicle with Engine Start–Stop Strategy. IEEE Trans. Transp. Electrif. 2022, 8, 1376–1388.
  194. Wu, J.; Wei, Z.; Li, W.; Wang, Y.; Li, Y.; Sauer, D.U. Battery Thermal- and Health-Constrained Energy Management for Hybrid Electric Bus Based on Soft Actor-Critic DRL Algorithm. IEEE Trans. Ind. Inform. 2021, 17, 3751–3761.
  195. Cui, N.; Cui, W.; Shi, Y. Deep Reinforcement Learning Based PHEV Energy Management with Co-Recognition for Traffic Condition and Driving Style. IEEE Trans. Intell. Veh. 2023, 8, 3026–3039.
  196. Tao, S.; Guo, R.; Lee, J.; Moura, S.; Casals, L.C.; Jiang, S.; Shi, J.; Harris, S.; Zhang, T.; Chung, C.Y.; et al. Immediate remaining capacity estimation of heterogeneous second-life lithium-ion batteries via deep generative transfer learning. Energy Environ. Sci. 2025, 18, 7413–7426.
  197. Wang, Y.; Wang, K.; Wang, B.; Yin, Y.; Zhao, H.; Han, L.; Jiao, K. A Data-Driven Approach to Lifespan Prediction for Vehicle Fuel Cell Systems. IEEE Trans. Transp. Electrif. 2023, 9, 5049–5060.
  198. Tao, S.; Ma, R.; Zhao, Z.; Ma, G.; Su, L.; Chang, H.; Chen, Y.; Liu, H.; Liang, Z.; Cao, T.; et al. Generative learning assisted state-of-health estimation for sustainable battery recycling with random retirement conditions. Nat. Commun. 2024, 15, 10154.
  199. Hu, D.; Wang, Y.; Li, J.; Yang, Q.; Wang, J. Investigation of optimal operating temperature for the PEMFC and its tracking control for energy saving in vehicle applications. Energy Convers. Manag. 2021, 249, 114842.
  200. Yang, S.; Zhai, C.; Gao, Y.; Dou, H.; Zhao, X.; He, Y.; Wang, X. Planting uniformity performance of motor-driven maize precision seeding systems. Int. J. Agric. Biol. Eng. 2022, 15, 101–108.
  201. Zhang, Y.; Huang, J.; He, L.; Zhao, D.; Zhao, Y. Reinforcement learning-based control for the thermal management of the battery and occupant compartments of electric vehicles. Sustain. Energy Fuels 2024, 8, 588–603.
  202. Saraireh, M. Thermal Management in Electric Vehicles: Modeling and Prospects. Int. J. Heat Technol. 2023, 41, 103–116.
  203. Choi, W.; Kim, J.W.; Ahn, C.; Gim, J. Reinforcement Learning-based Controller for Thermal Management System of Electric Vehicles. In Proceedings of the 2022 IEEE Vehicle Power and Propulsion Conference (VPPC), Merced, CA, USA, 1–4 November 2022; pp. 1–5.
  204. Arjmandzadeh, Z.; Abbasi, M.H.; Wang, H.; Zhang, J.; Xu, B. Electric Vehicle Battery Thermal Management Under Extreme Fast Charging with Deep Reinforcement Learning. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5011679 (accessed on 26 August 2025).
  205. Li, J.; Li, Y.; Yu, T. An optimal coordinated proton exchange membrane fuel cell heat management method based on large-scale multi-agent deep reinforcement learning. Energy Rep. 2021, 7, 6054–6068.
  206. Billert, A.M.; Frey, M.; Gauterin, F. A Method of Developing Quantile Convolutional Neural Networks for Electric Vehicle Battery Temperature Prediction Trained on Cross-Domain Data. IEEE Open J. Intell. Transp. Syst. 2022, 3, 411–425.
  207. Huang, G.; Zhao, P.; Zhang, G. Real-Time Battery Thermal Management for Electric Vehicles Based on Deep Reinforcement Learning. IEEE Internet Things J. 2022, 9, 14060–14072.
  208. Wei, Z.; Song, R.; Ji, D.; Wang, Y.; Pan, F. Hierarchical thermal management for PEM fuel cell with machine learning approach. Appl. Therm. Eng. 2024, 236, 121544.
  209. Chen, Y.; Chen, L.; Huang, C.; Lu, Y.; Wang, C. A dynamic tire model based on HPSO-SVM. Int. J. Agric. Biol. Eng. 2019, 12, 36–41.
  210. Wei, J.; Zheng, Z.A.; Chen, J. Research on the Control Strategy for Handling Stability of Electric Power Steering System with Active Front Wheel Steering Function. SAE Int. J. Veh. Dyn. Stab. NVH 2024, 8, 81–97.
  211. Pan, Q.; Zhou, B.; Wu, X.; Cui, Q.; Zheng, K. Steering collision avoidance and lateral stability coordinated control based on vehicle lateral stability region. Proc. Inst. Mech. Eng. Part D J. Automob. Eng. 2024, 239, 1699–1716.
  212. Zhou, Q.; Liu, L.; Xu, Z.; Wang, X. Design and Control of Personalized Steering Feel for Steer-by-Wire Systems. IEEE Trans. Intell. Transp. Syst. 2025, 26, 6288–6303.
  213. Lin, X.; Huang, J.; Zhang, B.; Zhou, B.; Chen, Z. A velocity adaptive steering control strategy of autonomous vehicle based on double deep Q-learning network with varied agents. Eng. Appl. Artif. Intell. 2025, 139, 109655.
  214. Wasala, A.; Byrne, D.; Miesbauer, P.; O’Hanlon, J.; Heraty, P.; Barry, P. Trajectory based lateral control: A Reinforcement Learning case study. Eng. Appl. Artif. Intell. 2020, 94, 103799.
  215. Chao-zhong, W.; Yao, L.; Zhi-jun, C.; Peng, L. Human-machine integration method for steering decision-making of intelligent vehicle based on reinforcement learning. J. Traffic Transp. Eng. 2022, 22, 55.
  216. Zhao, J.; Cheng, S.; Li, L.; Li, M.; Zhang, Z. A model free controller based on reinforcement learning for active steering system with uncertainties. Proc. Inst. Mech. Eng. Part D J. Automob. Eng. 2021, 235, 2470–2483.
  217. de Morais, G.A.P.; Marcos, L.B.; Bueno, J.N.A.D.; de Resende, N.F.; Terra, M.H.; Grassi, V., Jr. Vision-based robust control framework based on deep reinforcement learning applied to autonomous ground vehicles. Control Eng. Pract. 2020, 104, 104630.
  218. Zhu, Q.; Zhu, Z.; Zhang, H.; Gao, Y.; Chen, L. Design of an Electronically Controlled Fertilization System for an Air-Assisted Side-Deep Fertilization Machine. Agriculture 2023, 13, 2210.
  219. Zhang, Y.; Li, Z.; Hu, C.; Zhang, Y.; Chen, J.; Du, C. Human–Machine Shared Steering Decision-Making of Intelligent Vehicles Based on Heterogeneous Synchronous Reinforcement Learning. IEEE Trans. Intell. Transp. Syst. 2025, 1–13.
  220. Wu, J.; Yang, H.; Yang, L.; Huang, Y.; He, X.; Lv, C. Human-Guided Deep Reinforcement Learning for Optimal Decision Making of Autonomous Vehicles. IEEE Trans. Syst. Man Cybern. Syst. 2024, 54, 6595–6609.
  221. Yu, X. Switching in Sliding Mode Control: A Spatio-Temporal Perspective. IEEE/CAA J. Autom. Sin. 2025, 12, 1063–1071.
  222. Ghazi, G.A.; Al-Ammar, E.A.; Hasanien, H.M.; Ko, W.; Lee, S.M.; Turky, R.A.; Tostado-Véliz, M.; Jurado, F. Circle Search Algorithm-Based Super Twisting Sliding Mode Control for MPPT of Different Commercial PV Modules. IEEE Access 2024, 12, 33109–33128.
  223. Fu, Y.; Li, C.; Yu, F.R.; Luan, T.H.; Zhang, Y. A Decision-Making Strategy for Vehicle Autonomous Braking in Emergency via Deep Reinforcement Learning. IEEE Trans. Veh. Technol. 2020, 69, 5876–5888.
  224. Fanti, M.P.; Mangini, A.M.; Martino, D.; Olivieri, I.; Parisi, F.; Popolizio, F. Safety and Comfort in Autonomous Braking System with Deep Reinforcement Learning. In Proceedings of the 2022 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Prague, Czech Republic, 9–12 October 2022; pp. 1786–1791.
  225. Hou, X.; Gan, M.; Zhang, J.; Zhao, S.; Ji, Y. Vehicle ride comfort optimization in the post-braking phase using residual reinforcement learning. Adv. Eng. Inform. 2023, 58, 102198.
  226. Mantripragada, V.K.T.; Kumar, R.K. Deep reinforcement learning-based antilock braking algorithm. Veh. Syst. Dyn. 2023, 61, 1410–1431.
  227. Dubey, V.S.; Kasad, R.; Agrawal, K. Autonomous braking and throttle system: A deep reinforcement learning approach for naturalistic driving. arXiv 2020, arXiv:2008.06696.
  228. Theunissen, J.; Tota, A.; Gruber, P.; Dhaens, M.; Sorniotti, A. Preview-based techniques for vehicle suspension control: A state-of-the-art review. Annu. Rev. Control 2021, 51, 206–235.
  229. Cui, L.; Mao, H.; Xue, X.; Ding, S.; Qiao, B. Optimized design and test for a pendulum suspension of the crop spray boom in dynamic conditions based on a six DOF motion simulator. Int. J. Agric. Biol. Eng. 2018, 11, 76–85.
  230. Kimball, J.B.; DeBoer, B.; Bubbar, K. Adaptive control and reinforcement learning for vehicle suspension control: A review. Annu. Rev. Control 2024, 58, 100974.
  231. Lin, J.; Lian, R.J. Intelligent Control of Active Suspension Systems. IEEE Trans. Ind. Electron. 2011, 58, 618–628.
  232. Wang, Z.; Liu, C.; Zheng, X.; Zhao, L.; Qiu, Y. Advancements in semi-active automotive suspension systems with magnetorheological dampers: A review. Appl. Sci. 2024, 14, 7866.
  233. Feng, J.; Yin, Z.; Xia, Z.; Wang, W.; Shangguan, W.-B.; Rakheja, S. Control Strategy of Semi-Active Suspension Based on Road Roughness Identification. SAE Int. J. Veh. Dyn. Stab. NVH 2024, 8, 231–252.
  234. Liang, T.; Han, S.Y.; Zhou, J.; Chen, Y.H.; Yang, J.; Zhao, J. Adaptive Vibration Control of Vehicle Semi-Active Suspension System Based on Ensemble Fuzzy Logic and Reinforcement Learning. In Proceedings of the 2022 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Prague, Czech Republic, 9–12 October 2022; pp. 2627–2632.
  235. Ming, L.; Yibin, L.; Xuewen, R.; Shuaishuai, Z.; Yanfang, Y. Semi-Active Suspension Control Based on Deep Reinforcement Learning. IEEE Access 2020, 8, 9978–9986.
  236. Yong, H.; Seo, J.; Kim, J.; Kim, M.; Choi, J. Suspension Control Strategies Using Switched Soft Actor-Critic Models for Real Roads. IEEE Trans. Ind. Electron. 2023, 70, 824–832.
  237. Wang, C.; Cui, X.; Zhao, S.; Zhou, X.; Song, Y.; Wang, Y.; Guo, K. Enhancing vehicle ride comfort through deep reinforcement learning with expert-guided soft-hard constraints and system characteristic considerations. Adv. Eng. Inform. 2024, 59, 102328.
  238. Jannusch, T.; Shannon, D.; Völler, M.; Murphy, F.; Mullins, M. Cars and distraction: How to address the limits of Driver Monitoring Systems and improve safety benefits using evidence from German young drivers. Technol. Soc. 2021, 66, 101628.
  239. Lu, Y.; Liu, C.; Chang, F.; Liu, H.; Huan, H. JHPFA-Net: Joint Head Pose and Facial Action Network for Driver Yawning Detection Across Arbitrary Poses in Videos. IEEE Trans. Intell. Transp. Syst. 2023, 24, 11850–11863.
  240. Moussa, M.M.; Shoitan, R.; Cho, Y.-I.; Abdallah, M.S. Visual-Based Children and Pet Rescue from Suffocation and Incidence of Hyperthermia Death in Enclosed Vehicles. Sensors 2023, 23, 7025.
  241. Fang, Y.C.; Zhao, X.L.; Lin, H.Y.; Yang, Y.C.; Guo, J.I.; Fan, C.P. YOLO Deep-Learning Based Driver Behaviors Detection and Effective Gaze Estimation by Head Poses for Driver Monitor System. In Proceedings of the 2023 IEEE 12th Global Conference on Consumer Electronics (GCCE), Nara, Japan, 10–13 October 2023; pp. 82–83.
  242. Du, G.; Li, T.; Li, C.; Liu, P.X.; Li, D. Vision-Based Fatigue Driving Recognition Method Integrating Heart Rate and Facial Features. IEEE Trans. Intell. Transp. Syst. 2021, 22, 3089–3100.
  243. Pistolesi, F.; Baldassini, M.; Lazzerini, B. Speech-Based Detection of In-Car Escalating Arguments to Prevent Distracted Driving. In Proceedings of the 2023 IEEE International Conference on Big Data (BigData), Sorrento, Italy, 15–18 December 2023; pp. 5330–5337.
  244. Hu, Y. Solving Pediatric Vehicular Heatstroke with Efficient Multi-Cascaded Convolutional Networks. In Proceedings of the 2021 IEEE International Conference on Consumer Electronics and Computer Engineering (ICCECE), Guangzhou, China, 15–17 January 2021; pp. 93–99.
  245. Vashisht, S.; Rakshit, D. Recent advances and sustainable solutions in automobile air conditioning systems. J. Clean. Prod. 2021, 329, 129754.
  246. Brusey, J.; Hintea, D.; Gaura, E.; Beloe, N. Reinforcement learning-based thermal comfort control for vehicle cabins. Mechatronics 2018, 50, 413–421.
  247. Hu, D.; Qiu, C.; Lu, D.; Wang, J.; Huang, H.; Xue, H. An intelligent thermal comfort control strategy for air conditioning of fuel cell vehicles. Appl. Therm. Eng. 2024, 248, 123286.
  248. Lee, C.-G.; Kwon, O. Reinforcement Learning Based Power Seat Actuation to Mitigate Carsickness of Autonomous Vehicles. In Proceedings of the HCI International 2023 Posters, Copenhagen, Denmark, 23–28 July 2023; pp. 36–41.
  249. Şener, A.Ş.; Ince, I.F.; Baydargil, H.B.; Garip, I.; Ozturk, O. Deep learning based automatic vertical height adjustment of incorrectly fastened seat belts for driver and passenger safety in fleet vehicles. Proc. Inst. Mech. Eng. Part D J. Automob. Eng. 2021, 236, 639–654.
  250. Nascimento, E.R.; Bajcsy, R.; Gregor, M.; Huang, I.; Villegas, I.; Kurillo, G. On the Development of an Acoustic-Driven Method to Improve Driver’s Comfort Based on Deep Reinforcement Learning. IEEE Trans. Intell. Transp. Syst. 2021, 22, 2923–2932.
  251. Gao, F.; Ge, X.; Li, J.; Fan, Y.; Li, Y.; Zhao, R. Intelligent cockpits for connected vehicles: Taxonomy, architecture, interaction technologies, and future directions. Sensors 2024, 24, 5172.
  252. Shishavan, H.H.; Behzadi, M.M.; Lohan, D.J.; Dede, E.M.; Kim, I. Closed-Loop Brain Machine Interface System for In-Vehicle Function Controls Using Head-Up Display and Deep Learning Algorithm. IEEE Trans. Intell. Transp. Syst. 2024, 25, 6594–6603.
  253. Sachara, F.; Kopinski, T.; Gepperth, A.; Handmann, U. Free-hand gesture recognition with 3D-CNNs for in-car infotainment control in real-time. In Proceedings of the 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC), Yokohama, Japan, 16–19 October 2017; pp. 959–964.
  254. Kim, H.; Lee, H.; Park, J.; Paillat, L.; Kim, S.C. Vehicle Control on an Uninstrumented Surface with an Off-the-Shelf Smartwatch. IEEE Trans. Intell. Veh. 2023, 8, 3366–3374.
  255. Paranjape, A.; Patwardhan, Y.; Deshpande, V.; Darp, A.; Jagdale, J. Voice-Based Smart Assistant System for Vehicles Using RASA. In Proceedings of the 2023 International Conference on Computational Intelligence, Networks and Security (ICCINS), Mylavaram, India, 22–23 December 2023; pp. 1–6.
  256. Dewalegama, M.P.; Zoysa, A.D.S.d.; Kodikara, L.M.; Dissanayake, D.M.J.C.; Kuruppu, T.A.; Rupasinghe, S. Deep Learning-Based Smart Infotainment System for Taxi Vehicles. In Proceedings of the 2022 International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA), Ankara, Turkey, 9–11 June 2022; pp. 1–6.
  257. Jiang, T.; Fang, H.; Wang, H. Blockchain-Based Internet of Vehicles: Distributed Network Architecture and Performance Analysis. IEEE Internet Things J. 2019, 6, 4640–4649.
  258. Li, G.; Yang, L.; Li, S.; Luo, X.; Qu, X.; Green, P. Human-Like Decision Making of Artificial Drivers in Intelligent Transportation Systems: An End-to-End Driving Behavior Prediction Approach. IEEE Intell. Transp. Syst. Mag. 2022, 14, 188–205.
  259. Cheng, J.; Sun, J.; Yao, K.; Xu, M.; Dai, C. Multi-task convolutional neural network for simultaneous monitoring of lipid and protein oxidative damage in frozen-thawed pork using hyperspectral imaging. Meat Sci. 2023, 201, 109196.
  260. Hussein, A.; Gaber, M.M.; Elyan, E.; Jayne, C. Imitation learning: A survey of learning methods. ACM Comput. Surv. 2017, 50, 1–35.
  261. Hawke, J.; Shen, R.; Gurau, C.; Sharma, S.; Reda, D.; Nikolov, N.; Mazur, P.; Micklethwaite, S.; Griffiths, N.; Shah, A.; et al. Urban Driving with Conditional Imitation Learning. In Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France, 31 May–31 August 2020; pp. 251–257.
  262. Cai, P.; Sun, Y.; Chen, Y.; Liu, M. Vision-Based Trajectory Planning via Imitation Learning for Autonomous Vehicles. In Proceedings of the 2019 IEEE Intelligent Transportation Systems Conference (ITSC), Auckland, New Zealand, 27–30 October 2019; pp. 2736–2742.
  263. Couto, G.C.K.; Antonelo, E.A. Generative Adversarial Imitation Learning for End-to-End Autonomous Driving on Urban Environments. In Proceedings of the 2021 IEEE Symposium Series on Computational Intelligence (SSCI), Orlando, FL, USA, 5–7 December 2021; pp. 1–7.
  264. Huang, Z.; Liu, H.; Wu, J.; Lv, C. Conditional Predictive Behavior Planning with Inverse Reinforcement Learning for Human-Like Autonomous Driving. IEEE Trans. Intell. Transp. Syst. 2023, 24, 7244–7258.
  265. Jia, C.; He, H.; Zhou, J.; Li, J.; Wei, Z.; Li, K. Learning-based model predictive energy management for fuel cell hybrid electric bus with health-aware control. Appl. Energy 2024, 355, 122228.
  266. Chen, Y.; Yu, Z.; Han, Z.; Sun, W.; He, L. A Decision-Making System for Cotton Irrigation Based on Reinforcement Learning Strategy. Agronomy 2024, 14, 11.
  267. Peng, B.; Sun, Q.; Li, S.E.; Kum, D.; Yin, Y.; Wei, J.; Gu, T. End-to-End Autonomous Driving Through Dueling Double Deep Q-Network. Automot. Innov. 2021, 4, 328–337.
  268. Song, W.; Liu, S.; Li, Y.; Yang, Y.; Xiang, C. Smooth Actor-Critic Algorithm for End-to-End Autonomous Driving. In Proceedings of the 2020 American Control Conference (ACC), Denver, CO, USA, 1–3 July 2020; pp. 3242–3248.
  269. Wu, Y.; Liao, S.; Liu, X.; Li, Z.; Lu, R. Deep Reinforcement Learning on Autonomous Driving Policy with Auxiliary Critic Network. IEEE Trans. Neural Netw. Learn. Syst. 2023, 34, 3680–3690.
  270. Chen, J.; Li, S.E.; Tomizuka, M. Interpretable End-to-End Urban Autonomous Driving with Latent Deep Reinforcement Learning. IEEE Trans. Intell. Transp. Syst. 2022, 23, 5068–5078.
  271. Chen, S.; Wang, M.; Song, W.; Yang, Y.; Li, Y.; Fu, M. Stabilization Approaches for Reinforcement Learning-Based End-to-End Autonomous Driving. IEEE Trans. Veh. Technol. 2020, 69, 4740–4750.
  272. Zhang, H.; Chen, B.; Lei, N.; Li, B.; Li, R.; Wang, Z. Integrated Thermal and Energy Management of Connected Hybrid Electric Vehicles Using Deep Reinforcement Learning. IEEE Trans. Transp. Electrif. 2024, 10, 4594–4603.
  273. Shi, W.; Huangfu, Y.; Xu, L.; Pang, S. Online energy management strategy considering fuel cell fault for multi-stack fuel cell hybrid vehicle based on multi-agent reinforcement learning. Appl. Energy 2022, 328, 120234.
  274. Ruan, S.; Ma, Y.; Yang, N.; Yan, Q.; Xiang, C. Multiobjective optimization of longitudinal dynamics and energy management for HEVs based on nash bargaining game. Energy 2023, 262, 125422.
  275. Liang, J.; Lu, Y.; Pi, D.; Yin, G.; Zhuang, W.; Wang, F.; Feng, J.; Zhou, C. A Decentralized Cooperative Control Framework for Active Steering and Active Suspension: Multi-Agent Approach. IEEE Trans. Transp. Electrif. 2022, 8, 1414–1429.
  276. Xu, H.; Zhao, Y.; Pi, W.; Wang, Q.; Lin, F.; Zhang, C. Integrated Control of Active Front Wheel Steering and Active Suspension Based on Differential Flatness and Nonlinear Disturbance Observer. IEEE Trans. Veh. Technol. 2022, 71, 4813–4824.
  277. Wang, C.; Deng, K.; Zhao, W.; Zhou, G.; Li, X. Robust control for active suspension system under steering condition. Sci. China Technol. Sci. 2017, 60, 199–208.
  278. Tchamna, R.; Youn, E.; Youn, I. Combined control effects of brake and active suspension control on the global safety of a full-car nonlinear model. Veh. Syst. Dyn. 2014, 52, 69–91.
  279. Soltani, A.; Bagheri, A.; Azadi, S. Integrated vehicle dynamics control using semi-active suspension and active braking systems. Proc. Inst. Mech. Eng. Part K J. Multi-Body Dyn. 2017, 232, 314–329.
  280. Termous, H.; Shraim, H.; Talj, R.; Francis, C.; Charara, A. Coordinated control strategies for active steering, differential braking and active suspension for vehicle stability, handling and safety improvement. Veh. Syst. Dyn. 2018, 57, 1494–1529.
  281. Zhou, J.; Shu, X.; Zhang, J.; Yi, F.; Jia, C.; Zhang, C.; Kong, X.; Zhang, J.; Wu, G. A deep learning method based on CNN-BiGRU and attention mechanism for proton exchange membrane fuel cell performance degradation prediction. Int. J. Hydrogen Energy 2024, 94, 394–405.
  282. Deng, J.; Zhao, X.; Luo, W.; Bai, X.; Xu, L.; Jiang, H. Microwave detection technique combined with deep learning algorithm facilitates quantitative analysis of heavy metal Pb residues in edible oils. J. Food Sci. 2024, 89, 6005–6015.
  283. Pateria, S.; Subagdja, B.; Tan, A.-h.; Quek, C. Hierarchical reinforcement learning: A comprehensive survey. ACM Comput. Surv. 2021, 54, 1–35.
  284. Chen, L.; He, Y.; Wang, Q.; Pan, W.; Ming, Z. Joint Optimization of Sensing, Decision-Making and Motion-Controlling for Autonomous Vehicles: A Deep Reinforcement Learning Approach. IEEE Trans. Veh. Technol. 2022, 71, 4642–4654.
  285. Wang, Y.; Wu, Y.; Tang, Y.; Li, Q.; He, H. Cooperative energy management and eco-driving of plug-in hybrid electric vehicle via multi-agent reinforcement learning. Appl. Energy 2023, 332, 120563.
  286. Li, D.; Zhao, D.; Zhang, Q.; Chen, Y. Reinforcement Learning and Deep Learning Based Lateral Control for Autonomous Driving [Application Notes]. IEEE Comput. Intell. Mag. 2019, 14, 83–98.
  287. Porav, H.; Newman, P. Imminent Collision Mitigation with Reinforcement Learning and Vision. In Proceedings of the 2018 21st International Conference on Intelligent Transportation Systems (ITSC), Maui, HI, USA, 4–7 November 2018; pp. 958–964.
  288. Chen, J.; Li, S.; Yang, K.; Wei, C.; Tang, X. Deep Reinforcement Learning-Based Integrated Control of Hybrid Electric Vehicles Driven by Lane-Level High-Definition Map. IEEE Trans. Transp. Electrif. 2024, 10, 1642–1655.
  289. Tang, X.; Chen, J.; Yang, K.; Toyoda, M.; Liu, T.; Hu, X. Visual Detection and Deep Reinforcement Learning-Based Car Following and Energy Management for Hybrid Electric Vehicles. IEEE Trans. Transp. Electrif. 2022, 8, 2501–2515.
  290. Zhang, H.; Peng, J.; Dong, H.; Tan, H.; Ding, F. Hierarchical reinforcement learning based energy management strategy of plug-in hybrid electric vehicle for ecological car-following process. Appl. Energy 2023, 333, 120599.
  291. Min, H.; Xiong, X.; Yang, F.; Sun, W.; Yu, Y.; Wang, P. An Energy-Efficient Driving Method for Connected and Automated Vehicles Based on Reinforcement Learning. Machines 2023, 11, 168.
  292. Li, W.; Zhao, Z.; Liang, K.; Zhao, K. Coordinated Longitudinal and Lateral Motions Control of Automated Vehicles Based on Multi-Agent Deep Reinforcement Learning for On-Ramp Merging; SAE Technical Paper: Warrendale, PA, USA, 2024; ISSN 0148-7191.
  293. Chen, J.; Shu, H.; Tang, X.; Liu, T.; Wang, W. Deep reinforcement learning-based multi-objective control of hybrid power system combined with road recognition under time-varying environment. Energy 2022, 239, 122123.
  294. Hu, H.; Wu, G.; Mao, L. Preview Control of Semi-Active Suspension with Adjustable Damping Based on Machine Vision. In Proceedings of the 2021 IEEE 16th Conference on Industrial Electronics and Applications (ICIEA), Chengdu, China, 1–4 August 2021; pp. 117–123.
  295. Du, Y.; Chen, J.; Zhao, C.; Liao, F.; Zhu, M. A hierarchical framework for improving ride comfort of autonomous vehicles via deep reinforcement learning with external knowledge. Comput. Aided Civ. Infrastruct. Eng. 2023, 38, 1059–1078.
  296. Deng, L.; Li, S.; Tang, X.; Yang, K.; Lin, X. Battery thermal- and cabin comfort-aware collaborative energy management for plug-in fuel cell electric vehicles based on the soft actor-critic algorithm. Energy Convers. Manag. 2023, 283, 116889.
  297. Roh, D.H.; Lee, J.Y. Augmented Reality-Based Navigation Using Deep Learning-Based Pedestrian and Personal Mobility User Recognition - A Comparative Evaluation for Driving Assistance. IEEE Access 2023, 11, 62200–62211.
  298. Fang, Z.; Wang, J.; Wang, Z.; Chen, J.; Yin, G.; Zhang, H. Human–Machine Shared Control for Path Following Considering Driver Fatigue Characteristics. IEEE Trans. Intell. Transp. Syst. 2024, 25, 7250–7264.
  299. Ling, J.; Li, J.; Tei, K.; Honiden, S. Towards Personalized Autonomous Driving: An Emotion Preference Style Adaptation Framework. In Proceedings of the 2021 IEEE International Conference on Agents (ICA), Kyoto, Japan, 13–15 December 2021; pp. 47–52.
  300. Zhao, Z.; Alzubaidi, L.; Zhang, J.; Duan, Y.; Gu, Y. A comparison review of transfer learning and self-supervised learning: Definitions, applications, advantages and limitations. Expert Syst. Appl. 2024, 242, 122807.
  301. Tao, S.; Sun, C.; Fu, S.; Wang, Y.; Ma, R.; Han, Z.; Sun, Y.; Li, Y.; Wei, G.; Zhang, X.; et al. Battery Cross-Operation-Condition Lifetime Prediction via Interpretable Feature Engineering Assisted Adaptive Machine Learning. ACS Energy Lett. 2023, 8, 3269–3279.
  302. Zhu, Z.; Lin, K.; Jain, A.K.; Zhou, J. Transfer Learning in Deep Reinforcement Learning: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 13344–13362.
  303. Fu, S.; Tao, S.; Fan, H.; He, K.; Liu, X.; Tao, Y.; Zuo, J.; Zhang, X.; Wang, Y.; Sun, Y. Data-driven capacity estimation for lithium-ion batteries with feature matching based transfer learning method. Appl. Energy 2024, 353, 121991.
  304. Fan, C.Y.; Liu, P.; Xiao, T.; Zhao, W.; Tang, X.L. A review of deep domain adaptation: General situation and complex situation. Acta Autom. Sin. 2021, 47, 515–548.
  305. Li, J.; Xu, R.; Ma, J.; Zou, Q.; Ma, J.; Yu, H. Domain adaptive object detection for autonomous driving under foggy weather. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–7 January 2023; pp. 612–622.
  306. Chen, W.Y.; Chang, H.Y.; Wang, C.Y.; Chung, W.H. Cooperative Neighboring Vehicle Positioning Systems Based on Graph Convolutional Network: A Multi-Scenario Transfer Learning Approach. In Proceedings of the ICC 2022 - IEEE International Conference on Communications, Seoul, Korea, 16–20 May 2022; pp. 3226–3231.
  307. Shu, H.; Liu, T.; Mu, X.; Cao, D. Driving Tasks Transfer Using Deep Reinforcement Learning for Decision-Making of Autonomous Vehicles in Unsignalized Intersection. IEEE Trans. Veh. Technol. 2022, 71, 41–52.
  308. Hu, D.; Huang, C.; Yin, G.; Li, Y.; Huang, Y.; Huang, H.; Wu, J.; Li, W.; Xie, H. A transfer-based reinforcement learning collaborative energy management strategy for extended-range electric buses with cabin temperature comfort consideration. Energy 2024, 290, 130097.
  309. Quang Dinh, V.; Munir, F.; Azam, S.; Yow, K.-C.; Jeon, M. Transfer learning for vehicle detection using two cameras with different focal lengths. Inf. Sci. 2020, 514, 71–87.
  310. Li, G.; Ji, Z.; Chang, Y.; Li, S.; Qu, X.; Cao, D. ML-ANet: A Transfer Learning Approach Using Adaptation Network for Multi-label Image Classification in Autonomous Driving. Chin. J. Mech. Eng. 2021, 34, 78.
  311. Lian, R.; Tan, H.; Peng, J.; Li, Q.; Wu, Y. Cross-Type Transfer for Deep Reinforcement Learning Based Hybrid Electric Vehicle Energy Management. IEEE Trans. Veh. Technol. 2020, 69, 8367–8380.
  312. Huang, R.; He, H.; Su, Q. Towards a fossil-free urban transport system: An intelligent cross-type transferable energy management framework based on deep transfer reinforcement learning. Appl. Energy 2024, 363, 123080.
  313. Hu, Y.; Yang, J.; Chen, L.; Li, K.; Sima, C.; Zhu, X.; Chai, S.; Du, S.; Lin, T.; Wang, W. Planning-oriented autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 17853–17862.
  314. Niranjan, D.R.; VinayKarthik, B.C. Deep Learning based Object Detection Model for Autonomous Driving Research using CARLA Simulator. In Proceedings of the 2021 2nd International Conference on Smart Electronics and Communication (ICOSEC), Trichy, India, 7–9 October 2021; pp. 1251–1258.
Figure 1. Schematic diagram of intelligent electric vehicle domain control technology.
Figure 2. Keyword highlights of literature: (a) clustering graph; (b) association picture.
Figure 3. Section diagram of paper.
Figure 4. Section 2 structure.
Figure 5. Section 3 structure.
Figure 6. Section 4 structure.
Figure 7. Section 5 structure.
Figure 8. Section 6 and Section 7 structure.
Table 1. Comparative analysis of representative algorithm categories.
CategoryStrengthsLimitations Typical Scenarios
Value-BasedHigh sample efficiency in discrete spaces.
Stable with replay buffers.
Simple to implement.
Cannot directly handle continuous actions.
Overestimation bias in vanilla forms.
Weak for stochastic policies.
Discrete control, low-dimensional actions, tabular tasks.
Policy-GradientDirect stochastic policy optimization.
Handles continuous/discrete actions.
Good exploration in high-entropy tasks.
High gradient variance.
Slow in early methods.
Prone to sub-optimal convergence.
Continuous control, constrained actions.
Actor–CriticBalances bias variance.
High sample efficiency.
Supports continuous/discrete actions.
More complex networks.
Critic error can destabilize.
Sensitive to hyperparameters.
Real-world control, multi-agent, sparse-reward tasks.
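The overestimation bias that Table 1 lists for vanilla value-based methods can be illustrated with a minimal Python sketch. It contrasts the standard Q-learning target with a double-Q style target, in which one estimate selects the next action and a second estimate evaluates it. The function names and the list-based Q-value representation are illustrative assumptions and are not taken from any of the reviewed works.

```python
def q_learning_target(reward, next_q_values, gamma=0.99):
    """Vanilla target: the same estimates both select and evaluate the action.

    Taking the max over noisy estimates tends to bias the target upward.
    """
    return reward + gamma * max(next_q_values)


def double_q_target(reward, next_q_online, next_q_target, gamma=0.99):
    """Double-Q style target: select with one estimate, evaluate with another."""
    # Index of the greedy action according to the online estimate.
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # The value of that action is read from the second (target) estimate.
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    online = [0.5, 2.0]   # hypothetical online estimates for two actions
    target = [1.0, 0.4]   # hypothetical target-network estimates
    print(q_learning_target(1.0, online, gamma=0.5))
    print(double_q_target(1.0, online, target, gamma=0.5))
```

When the online estimate of an action is inflated by noise, the vanilla target inherits the inflated value, whereas the double-Q target is limited by what the second estimate assigns to that same action.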
Table 2. Current literature review.

| Ref. | Functional Domain | Application of DRL | Main Contributions |
| --- | --- | --- | --- |
| [27] | Intelligent driving domain | Yes | Summarizes recent automotive target detection methods, classified by sensor type and algorithm. |
| [28] | Intelligent driving domain | Yes | Discusses, in its conclusion, the recent development of algorithms for vehicle trajectory planning. |
| [29] | Intelligent driving domain | Yes | Summarizes the application of different RL algorithms to behavioral decision making in different autonomous driving scenarios. |
| [30] | Intelligent driving domain | Yes | Summarizes the application of RL and DRL algorithms in multiple vehicle tasks such as obstacle detection, scene recognition, lane detection, navigation, and path planning. |
| [31] | Intelligent driving domain | No | / |
| [32] | Powertrain domain | Yes | Analyzes how the choice of RL actions and reward function settings affects the performance of the powertrain controller. |
| [33] | Powertrain domain | Yes | Analyzes and reviews RL-based EMS research according to different types of hybrid vehicle architectures. |
| [34] | Powertrain domain | Yes | Explores approaches for predicting battery temperatures and optimizing their thermal control. |
| [35] | Chassis domain | No | / |
| [36] | Chassis domain | No | / |
| [37] | Chassis domain | No | / |
| [38] | Cockpit domain | Yes | Discusses various automotive cabin thermal modeling techniques and evaluates control technologies for cabin thermal management in different weather conditions. |
| [39] | Cockpit domain | No | / |
| [40] | Cockpit domain | No | / |
| [41] | Body domain | No | / |
Table 3. Difficulties and solutions for target detection.

| Challenge | DRL Applied | Ref. | Main Contributions |
| --- | --- | --- | --- |
| Partial occlusion and overlap issues | Yes | [49] | Fusion of information from multiple 3D lidars to enhance situational awareness of occluded objects in dense scenes. |
| Radar sensor noise interference | Yes | [51] | Uses only 4D radar, with the RPFA-Net method. |
| Long distance, small target | Yes | [52] | Replaces the original backbone of the Single Shot MultiBox Detector (SSD) with MobileNet V2 and uses only the first two SSD feature mapping layers to improve small object detection performance. |
| Slow detection speed | Yes | [53] | Constructs a model that can quickly generate a global sparse graph and construct a dense graph. |
| Changing scenes | Yes | [54] | The 3D environment is partitioned into voxels, and a novel graph-based initialization network is introduced that encodes the points residing within each voxel. |
| Changes in vehicle perspective | Yes | [55] | The dual equivariance of the model can extract local and global equivariance features, respectively, thereby alleviating the impact of vehicle steering. |
| Low detection accuracy | Yes | [56] | 2D RGB imagery is integrated with 3D point clouds at the semantic level to boost 3D object-detection accuracy. |
| Strong light, low light | Yes | [57] | A sparse point-cloud–image fusion strategy is adopted, and fog augmentation is added to the dataset images. |
| Camera image is blurry | No | / | / |
| Photo stitching technology | No | / | / |
| Bad weather | No | / | / |
| Color fusion environment is high | No | / | / |
Table 4. Difficulties and solutions for target tracking.

| Challenge | DRL Applied | Ref. | Main Contributions |
| --- | --- | --- | --- |
| Target scale changes | Yes | [95] | By embedding an improved SNIPER sampling strategy within Faster R-CNN, the method achieves reliable vehicle detection across variable scales. |
| Tracking inefficiencies | Yes | [97] | Applies neural architecture search to uncover efficient tracking models that cut real-time latency. |
| Target is partially occluded | Yes | [100] | An object detector extracts an oriented 3D bounding box from the point cloud, after which similarity-based re-identification matches it to known instances. |
| Changes in appearance features | Yes | [103] | A single-target tracking model is designed that captures the temporal variation characteristics of video targets across frames. |
| Affected by sunlight, bad weather | Yes | [104] | The scheme consists of three stages: illumination enhancement, reflectance component enhancement, and linear weighted fusion. |
| Background interference problem | Yes | [105] | Using position-normalized features, a general convolutional layer is used to enhance the object contour. |
| High target appearance similarity | No | / | / |
| Limited tracking capabilities | No | / | / |
Table 5. Difficulties and solutions of positioning technology.

| Challenge | DRL Applied | Ref. | Main Contributions |
| --- | --- | --- | --- |
| Positioning reliability | Yes | [116] | To improve the distinguishability and matching of feature descriptors, an effective multi-level intensity map representation is adopted, and a new sampling method based on coverage fraction is proposed. |
| Low positioning accuracy | Yes | [117] | A panoramic camera is added next to the positioning camera to assist gaze control. |
| Changes in vehicle perspective | Yes | [118] | The compressed representation mode explains the learned features and processes, significantly reducing translation and rotation errors. |
| Error accumulation problem | Yes | [119] | Uses PointPillars to detect and remove objects, then performs lidar odometry and mapping to obtain a more static map. |
| Affected by sunlight, bad weather | Yes | [120] | An effective multi-scale feature discriminator is proposed for adversarial training, and the features of visual sensors and radar are fused together. |
| Map data overfitting problem | No | / | / |
| Dependency on HD maps | No | / | / |
| High-speed positioning delay | No | / | / |
Table 6. Difficulties in trajectory prediction and their solutions.

| Challenge | DRL Applied | Ref. | Main Contributions |
| --- | --- | --- | --- |
| Impact of complex road environment | Yes | [137] | Traffic scenes are modeled in the form of spatial semantic scene graphs to make various predictions about traffic participants. |
| Multimodal prediction | Yes | [138] | The system unites a multimodal trajectory generator with modules for inverse reinforcement learning and risk avoidance. |
| Multi-agent interaction problem | Yes | [139] | Explicitly modeling interactions using graph structures can lead to better predictions of agent interactions. |
| Real-time prediction constraints | Yes | [140] | An attention-based graph model, GATraj, is proposed, which balances prediction accuracy and inference speed well. |
| Trajectory uncertainty | Yes | [141] | A driving style attention generative adversarial network is proposed. |
| Low prediction reliability | No | / | / |
Table 7. Difficulties in decision making and planning and their solutions.

| Challenge | DRL Applied | Ref. | Main Contributions |
| --- | --- | --- | --- |
| Human-like decision planning | Yes | [154] | High- and low-level policies are co-learned from human demonstrations. |
| Cognitive reasoning questions | Yes | [159] | Combines hand-designed logic with data-driven reinforcement learning agents. |
| Planning optimality problem | Yes | [160] | Designs a safety mechanism for lane-changing decisions of autonomous vehicles on highways. |
| Vehicle interaction | Yes | [161] | A general graph reinforcement learning (GRL) framework is proposed to solve the decision-making problem in interactive traffic scenarios on highways. |
| Multi-objective optimization | Yes | [162] | An eco-driving planning method based on a hierarchical framework is proposed to reduce energy consumption while ensuring driving safety. |
| Difficult to satisfy personalization | No | / | / |
| Self-learning and self-performing | No | / | / |
Table 8. Difficult problems and solutions of power control.

| Challenge | DRL Applied | Ref. | Main Contributions |
| --- | --- | --- | --- |
| Security issues | Yes | [19] | A twin delayed DDPG (TD3) strategy smooths torque delivery, steadies training, and supports safe, energy-efficient driving. |
| Environmental adaptability issues | Yes | [178] | Adds intelligent environment exploration to the original DDPG algorithm. |
| Inefficient power switching | Yes | [179] | By defining the action as the mode choice, the method uses a continuous state space and richer decision criteria, enabling more precise mode transitions. |
| Slow response problem | Yes | [180] | A DRL-based control method for the fuel cell gas supply subsystem that optimizes dynamic response performance. |
| Intent and state are difficult to identify | No | / | / |
| Smoothness problem | No | / | / |
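The twin delayed DDPG strategy listed for [19] in Table 8 builds its critic target from two ideas: target policy smoothing and a clipped double-Q minimum. The following Python sketch shows that target for a scalar action such as a normalized torque command. It is a generic illustration of the published TD3 target, not the implementation used in [19]; the function name, the callable critics, and all default parameter values are assumptions made for the example.

```python
import random


def td3_target(reward, next_action, q1_target, q2_target, gamma=0.99,
               noise_std=0.2, noise_clip=0.5, action_limit=1.0,
               done=False, rng=random):
    """Critic target with target policy smoothing and clipped double-Q.

    next_action: action proposed by the target actor for the next state.
    q1_target, q2_target: callables mapping an action to a Q-value for the
    next state (hypothetical stand-ins for the two target critics).
    """
    # Target policy smoothing: add clipped Gaussian noise to the action.
    noise = rng.gauss(0.0, noise_std)
    noise = max(-noise_clip, min(noise_clip, noise))
    smoothed = max(-action_limit, min(action_limit, next_action + noise))
    # Clipped double-Q: use the smaller of the two critic estimates.
    q_min = min(q1_target(smoothed), q2_target(smoothed))
    # No bootstrapping past a terminal transition.
    return reward if done else reward + gamma * q_min


if __name__ == "__main__":
    # Hypothetical linear critics, used only to exercise the function.
    y = td3_target(1.0, 0.5, lambda a: 2.0 * a, lambda a: 3.0 * a, gamma=0.5)
    print(y)
```

Taking the minimum of the two critics counters value overestimation, and smoothing the target action discourages the critic from exploiting sharp peaks, which is consistent with the steadier training that the table attributes to this family of methods.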
Table 9. Difficult problems and solutions of energy management.

| Challenge | DRL Applied | Ref. | Main Contributions |
| --- | --- | --- | --- |
| Consider the driver’s intentions | Yes | [188] | A semi-supervised SVM classifier identifies driver types and derives driving-style features. |
| Low utilization rate of information | Yes | [190] | A wavelet neural network is used to predict future traffic information and generate a long-term global driving state. |
| Lifetime optimization | Yes | [191] | Weighs fuel consumption cost, battery aging cost, and state of charge (SOC) sustainability in the reward function under different weight coefficients. |
| Optimality problem | Yes | [192] | The designed EMS, which updates the weights of the Q neural network, achieves fuel economy close to the global optimum in different driving cycles. |
| Powertrain status optimization | Yes | [193] | The comprehensive synchronous control of multiple components is realized in the mixed motion space through the dual DRL algorithm. |
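The reward design summarized for [191] in Table 9 combines fuel consumption cost, battery aging cost, and SOC sustainability under adjustable weights. A minimal Python sketch of such a weighted reward follows. The quadratic SOC penalty, the reference SOC, the weight values, and the function name are assumptions chosen for illustration; the exact formulation in [191] may differ.

```python
def ems_reward(fuel_cost, aging_cost, soc, soc_ref=0.6,
               w_fuel=1.0, w_aging=1.0, w_soc=10.0):
    """Negative weighted cost for one control step of an EMS agent.

    fuel_cost:  fuel consumption cost over the step (hypothetical units).
    aging_cost: battery aging cost over the step (same units, assumed).
    soc:        battery state of charge in [0, 1].
    """
    # Penalize deviation from the reference SOC to keep charge sustained.
    soc_penalty = (soc - soc_ref) ** 2
    cost = w_fuel * fuel_cost + w_aging * aging_cost + w_soc * soc_penalty
    # The agent maximizes reward, so the cost enters with a negative sign.
    return -cost


if __name__ == "__main__":
    # Same operating point under two weightings of battery aging.
    print(ems_reward(2.0, 0.5, 0.55, w_aging=1.0))
    print(ems_reward(2.0, 0.5, 0.55, w_aging=5.0))
```

Raising `w_aging` makes the same operating point less rewarding, which is the mechanism by which different weight coefficients trade fuel economy against battery lifetime.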
Table 10. Difficult problems and solutions of thermal management.

| Challenge | DRL Applied | Ref. | Main Contributions |
| --- | --- | --- | --- |
| Multi-system collaboration | Yes | [201] | A novel RL-based cooling control strategy is proposed to coordinately control the passenger compartment and battery cold plate. |
| Temperatures are difficult to track | Yes | [203] | Tuned for fast convergence on multiple-input problems, minimizing tracking error and power consumption. |
| Thermal runaway issues | Yes | [204] | By modelling fine-grained, vehicle-level battery dynamics, the method curbs pack aging. |
| Low efficiency of waste heat reuse | No | / | / |
| Module temperature uniformity | No | / | / |
Table 11. Difficult problems and solutions of steering control.

| Challenge | DRL Applied | Ref. | Main Contributions |
| --- | --- | --- | --- |
| Not very robust | Yes | [214] | Generates a generalized reinforcement learning agent by selecting vehicle parameters and path trajectories as the state space. |
| Human–machine co-driving control | Yes | [215] | Makes the optimal driving rights allocation based on the driver’s steering angle, the vehicle’s autonomous steering angle, and vehicle–road information. |
| Optimal steering parameters | No | / | / |
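The driving rights allocation described for [215] in Table 11 can be pictured as a weight that decides how much of the final steering command comes from the driver and how much from the automated system. The Python sketch below assumes the simplest form, a convex combination of the two steering angles; this blending rule, the function name, and the parameter names are assumptions for illustration, and in [215] the allocation itself is what the learning agent decides.

```python
def shared_steering(driver_angle, auto_angle, driver_authority):
    """Blend driver and automated steering angles (same unit for both).

    driver_authority: share of control given to the driver, in [0, 1].
    A value of 1.0 passes the driver's command through unchanged, and
    0.0 hands full control to the automated system.
    """
    if not 0.0 <= driver_authority <= 1.0:
        raise ValueError("driver_authority must lie in [0, 1]")
    return (driver_authority * driver_angle
            + (1.0 - driver_authority) * auto_angle)


if __name__ == "__main__":
    # Hypothetical case: driver commands 10 degrees, the system commands 0.
    for authority in (0.0, 0.25, 1.0):
        print(authority, shared_steering(10.0, 0.0, authority))
```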
Table 12. Difficult problems and solutions of brake control.

| Challenge | DRL Applied | Ref. | Main Contributions |
| --- | --- | --- | --- |
| Braking timeliness | Yes | [224] | A DDPG-based AEB control method is proposed to manage the vehicle’s speed change through braking action. |
| Braking stability | Yes | [225] | The strategy enhances ride comfort by modulating brake torque to suppress body pitch and longitudinal oscillations across varied braking scenarios. |
| Braking timing problem | No | / | / |
| Brake energy recovery | No | / | / |
Table 13. Difficult problems and solutions of suspension control.

| Challenge | DRL Applied | Ref. | Main Contributions |
| --- | --- | --- | --- |
| Time lag problem | Yes | [233] | An adaptive suspension system is introduced, combining random road-profile detection with semi-active control technology. |
| Poor adaptability and robustness | Yes | [234] | To meet the differing performance demands of various road conditions, a fuzzy-logic reward is designed that dynamically steers the optimization objective. |
| Nonlinear phenomena | No | / | / |
| Synergy of posture and comfort | No | / | / |
Table 14. Difficult problems and solutions of personnel monitoring.

| Challenge | DRL Applied | Ref. | Main Contributions |
| --- | --- | --- | --- |
| Person’s head turning | Yes | [239] | The method synthesizes realistic frontal faces through the FF-Module, which can be used to capture facial movements in any posture, and integrates head posture attributes and facial morphology through the GK-Module. |
| False alarm problem | Yes | [240] | By improving four DL models to detect children, pets, and adults, respectively, the problem of false alarms can be avoided while improving accuracy. |
| Driver interaction behavior | No | / | / |
| Noise interference problem | No | / | / |
| Privacy leakage issue | No | / | / |
| Expression and Difference | No | / | / |
Table 15. Difficult problems and solutions of comfort control.

| Challenge | DRL Applied | Ref. | Main Contributions |
| --- | --- | --- | --- |
| Environmental changes | Yes | [246] | Fully accounts for factors such as air temperature, external air temperature, and passenger temperature, and controls the air outlet temperature. |
| Difference in comfort | Yes | [247] | A human thermal comfort model is embedded, and its comfort scores serve as the optimization target for the PPO-driven cabin HVAC controller. |
| Comfort and mood changes | No | / | / |
| Degree of intelligence | No | / | / |
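The PPO-driven controller listed for [247] in Table 15 belongs to the policy-gradient family, whose standard PPO variant limits each policy update through a clipped surrogate objective. The Python sketch below computes that generic objective for a small batch. It illustrates the published PPO-Clip formula only; the function name, the list-based inputs, and the idea that the advantages are derived from comfort scores are assumptions, and the sketch is not the implementation from [247].

```python
import math


def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch of samples.

    logp_new / logp_old: log-probabilities of the taken actions under the
    current and the data-collecting policy.
    advantages: advantage estimates (assumed here to come from a
    comfort-based reward signal).
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(lp_new - lp_old)  # probability ratio
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # Hypothetical batch: the new policy doubles the probability of one
    # action with a positive advantage, so its contribution is clipped.
    print(ppo_clip_objective([math.log(2.0)], [0.0], [1.0]))
```

Because of the clipping, an action that already became much more likely contributes no further incentive to increase its probability, which keeps successive updates of the controller policy close to each other.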
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Lu, D.; Chen, Y.; Sun, Y.; Wei, W.; Ji, S.; Ruan, H.; Yi, F.; Jia, C.; Hu, D.; Tang, K.; et al. Research Progress in Multi-Domain and Cross-Domain AI Management and Control for Intelligent Electric Vehicles. Energies 2025, 18, 4597. https://doi.org/10.3390/en18174597

APA Style

Lu, D., Chen, Y., Sun, Y., Wei, W., Ji, S., Ruan, H., Yi, F., Jia, C., Hu, D., Tang, K., Huang, S., & Wang, J. (2025). Research Progress in Multi-Domain and Cross-Domain AI Management and Control for Intelligent Electric Vehicles. Energies, 18(17), 4597. https://doi.org/10.3390/en18174597

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.