1. Introduction
Since the 1930s, science fiction authors have envisioned the advent of self-driving cars, and constructing such cars has been a challenge for the Artificial Intelligence (AI) community since the 1960s [1]. Nowadays, advances in sensing (e.g., computer vision, laser and microwave ranging), mapping, networking, and machine learning have facilitated the rapid growth of research and prototyping of semi-autonomous and autonomous vehicles [2]. However, the solutions implemented so far seldom assume unconditional and complete autonomy of the vehicle. Moreover, the currently available industrial applications of semi-autonomous vehicles, which reach up to level 3 of the six levels (0–5) of autonomous driving defined in 2014 by the Society of Automotive Engineers, require that drivers keep their hands on the steering wheel at all times during the operation of the “auto-pilot” [3]. Indeed, in the case of an unforeseen traffic situation or road condition, control might have to be transferred to the presumably more dependable human driver [4]. We believe that the automation should abstain from unconditionally ceding control of the car to the human driver, especially in heavy traffic or in challenging (for example, slippery) road conditions, because of the major drawback of this transfer: the reduced cognitive load of the (passive) human driver might result in a rough transition of control between the automation and the human [5]. In such situations, the reaction of the human might be either inadequate or too slow, as they might be subjected to a sharp increase in cognitive load and to the psychological stress of dealing with a suddenly arising, challenging traffic situation (e.g., an obstacle suddenly appearing in front of the car in dense multi-lane traffic). In addition, the human driver might be underqualified to control the car in challenging road conditions (e.g., retaining control of an oversteering car on a slippery road). Recent traffic accidents involving autonomous cars, some of them fatal, drew attention to the problem of the cognitive conditions required of human drivers to allow them to serve as a dependable backup for the potentially imperfect automation [6,7,8]. Nevertheless, autonomous vehicles are expected to outperform human drivers in terms of the overall safety of road traffic, and to decrease the total number of accidents caused by a slow or inadequate response of a human driver due to fatigue, inattention, or a lack of experience or qualification in dealing with extreme traffic situations and road conditions [3].
The task of automated driving can be decomposed into the following subtasks: (i) defining the desired trajectory (driving line) and the desired pattern of speed along this trajectory, (ii) keeping the actual trajectory of the car as close as possible to the desired one, and (iii) maintaining the actual speed of the car along this trajectory as close as possible to the desired one [9,10]. We considered the first and third subtasks to be beyond the scope of our current work; instead, we focused on the second one in challenging, slippery road conditions, which could be solved by an appropriate steering of the car.
Currently, the canonical servo-control of steering [11], which, as we will elaborate later, could be seen as an example of a proportional-derivative (PD) controller, is adopted as a formal model of the steering angle function (SAF): the function that continuously decides the current steering angle of the front (steering) wheels depending on both the lateral and angular deviation of the car from its intended trajectory. The model continuously attempts to minimize the values of these two deviations (i.e., the errors) by setting the steering angle to such a value that would result in both a prompt and stable (non-oscillatory) return of the car to its desired trajectory. This model mimics well the steering behavior of a human driver and, similarly to such a driver, provides a good quality of steering on dry, non-slippery roads. With an appropriate tuning of the relevant gain coefficients (depending on the specific features of the physical model of the particular car), on dry roads the servo-control could achieve a steering behavior that is very similar to that of a human driver in adequate cognitive condition [12].
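To illustrate why the servo-control can be read as a PD controller, the following minimal Python sketch steers a kinematic bicycle model back to the center of the lane. The gain names `k_lateral` and `k_angular`, their values, the sign convention, and the slip-free kinematic model are all assumptions made for this illustration; they are not the car model or the coefficients used in this work. For small angles the angular deviation is proportional to the rate of change of the lateral deviation, so the second term plays the role of the derivative term.

```python
import math


def servo_steering_angle(lateral_dev, angular_dev, k_lateral=0.2, k_angular=1.0):
    """Steering angle (rad) from lateral (m) and angular (rad) deviation.

    Assumed convention: positive deviations and positive steering both
    point to the same side, so the command opposes the deviations.
    Gain names and values are hypothetical.
    """
    return -(k_lateral * lateral_dev + k_angular * angular_dev)


def simulate_lane_return(y0=1.0, speed=10.0, wheelbase=2.5, dt=0.01, steps=500):
    """Euler integration of a kinematic bicycle model (no tire slip).

    Returns the history of lateral deviations, starting from y0 with the
    car initially parallel to the lane.
    """
    y, psi = y0, 0.0
    history = [y]
    for _ in range(steps):
        delta = servo_steering_angle(y, psi)
        y += speed * math.sin(psi) * dt                      # lateral drift
        psi += speed / wheelbase * math.tan(delta) * dt      # heading change
        history.append(y)
    return history


deviations = simulate_lane_return()
final_deviation = deviations[-1]
```

Under these assumed values the car, starting 1 m off-center, settles back onto the lane within a few seconds. On a slippery road the tires would not follow the commanded angle this faithfully, which is exactly the regime this sketch leaves out.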
However, to the best of our knowledge, there is no documented research on the applicability of the PD servo-control model for automated control of the car under more challenging, slippery (e.g., wet, snowy, or icy) road conditions. We speculate that PD servo-control might not be adequate in such conditions because the vehicle dynamics model of a directionally unstable (e.g., understeering or oversteering) car on slippery roads is more complex and involves additional variables (beyond the lateral and angular deviations from the desired trajectory) pertinent to the state of the car than are involved in a car driven on normal, non-slippery roads [13]. Such complexity might not be expressed adequately by the relatively simple PD servo-control.
Moreover, should the PD servo-control prove unacceptable as a steering model in such slippery conditions, it would remain an open question which alternative model would be more appropriate and how to develop it.
These two concerns, the possible inapplicability of the PD servo-control model and the lack of understanding of its alternative(s) for the steering of a car on slippery roads, motivated our research. Our objectives were (i) to examine the applicability of PD servo-control as an auto-steering model of a car on slippery roads, and (ii) to investigate the feasibility of heuristically developing the optimal (possibly nonlinear) steering model in such road conditions by means of simulated evolution via genetic programming (GP).
Our work is additionally motivated by the fact that despite the significant body of research on computational intelligence in The Open Racing Car Simulator (TORCS) [14,15], we are not aware of any implementation of a SAF of the car on slippery roads in this simulation environment. Rather, the focus of such research is on the automated development (via either simulated evolution or machine learning) of racing agents aimed at competing in (and, ultimately, winning) simulated car races. Incorporating a SAF for slippery roads would be seen as unnecessary there, as the races are usually held on dry, grippy tracks where the cars seldom experience any significant directional instability.
On the other hand, the inspiration for the proposed heuristic development of SAF via GP is that, to the best of our knowledge, there are no documented attempts to automatically develop the SAF on slippery roads via evolutionary computing and GP in particular.
The related work includes the seminal work of Huang et al. [11], who have demonstrated that GP, as an unsupervised machine learning approach, could be successfully applied to heuristically develop from scratch a PD controller that stabilizes a car on slippery roads by eliciting precisely quantified asymmetric brake forces on its wheels. The control of the steering, however, has not been subjected to any optimization for the given slippery road conditions, but rather implemented as a generic, handcrafted PD servo-control that has performed well on dry, grippy roads. Similarly to GP, end-to-end reinforcement learning (e.g., of deep neural networks) does not require an explicit modularization of the controller [16,17]. Instead, the internal structure of the controller is automatically developed by the learning framework. The advantage of this approach is that it could discover and utilize the unusual features of the environment and develop sophisticated controllers that could solve the tasks for given (generic) environmental conditions. The drawbacks include the huge search space, a lack of understanding of how (and why) exactly the developed controllers work, and consequently a lack of confidence in their robustness and generality. Moreover, in contrast to the proposed approach of employing GP, end-to-end reinforcement learning requires a human-annotated training set of data, i.e., the mapping of the states of the car and the environment (e.g., consecutive frames of a video feed) into appropriate steering commands obtained from a human (expert) driver. However, in our work we assume that such perception information is not always fully accessible (or, even if accessible, not necessarily with the same quality as the information used for training), for example, when driving in heavy-traffic, poor-visibility conditions (fog, snow, rain, night, etc.) or immediately behind a heavy vehicle (bus, truck, etc.). Therefore, in our work we propose the use of very simple perception information: only the current lateral and angular deviation of the car from the intended trajectory (i.e., the center of the lane). With such simple perceptions, it would be virtually impossible for a human driver to achieve driving (let alone on slippery roads) that is good enough to serve as a representative trainer for the learner.
Rather than being developed automatically (via GP), the steering controller of the car on slippery roads could be handcrafted by the developers applying various top-down approaches. Usually, handcrafted solutions are based on certain assumptions intended to simplify the complex dynamics of a skidding car. The challenge of these approaches is in deciding the adequate assumptions and abstractions for the domain-specific knowledge. In predictive control [18,19,20], for example, the controller decides the value of the control signal based on the predicted, rather than the current, values of parameters pertinent to the state of the car and its environment. The prediction could be seen as approximating the values of these parameters from their current values, their rate of change, and the prediction time that corresponds to the latencies in the control loop. Such prediction would be very accurate if the latencies are well-known, the laws that govern the changes of these latencies are also well-known, and the rate of change is (nearly) constant throughout the latency period. In contrast to these approaches, the proposed method of automated, heuristic development of SAF via GP, as elaborated in detail in Section 2.5, assumes very little of such knowledge.
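The first-order prediction described above can be sketched as follows. The function names and all numbers in the example are hypothetical; the sketch only illustrates extrapolation from the current value and its rate of change over the loop latency, and it is not taken from any of the cited predictive controllers.

```python
def predict_state(value, rate, latency):
    """Extrapolate a state variable over the control-loop latency,
    assuming its rate of change stays constant during that time."""
    return value + rate * latency


def state_with_changing_rate(value, rate, rate_change, latency):
    """Reference value when the rate itself changes linearly in time
    (rate_change is the derivative of the rate)."""
    return value + rate * latency + 0.5 * rate_change * latency ** 2


# Made-up example: lateral deviation of 0.40 m, shrinking at 0.50 m/s,
# with 0.20 s of latency in the control loop.
predicted_dev = predict_state(value=0.40, rate=-0.50, latency=0.20)

# If the rate is not constant during the latency (here it changes by
# 2.0 m/s^2), the first-order prediction misses by 0.5 * 2.0 * 0.2**2.
actual_dev = state_with_changing_rate(0.40, -0.50, 2.0, 0.20)
prediction_error = actual_dev - predicted_dev
```

The second function makes the stated condition concrete: the prediction is exact only while the rate of change is constant over the latency, and the error grows with the square of the latency otherwise.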
The remainder of the article is organized as follows. Section 2 elaborates on the adopted model of the car and its environment. It also explains the basics of the PD and PID servo-control models and the proposed approach of employing GP for evolution of the optimal SAF. Section 3 presents our experimental results, and Section 4 discusses some of the limitations of the results obtained by GP. Finally, Section 5 draws a conclusion.
3. Experimental Results
For each of the six road conditions (as described in Table 3) we employed GP and performed 20 independent evolutionary runs to obtain the best SAF. The fitness convergence characteristics of these independent runs are shown in Figure 7.
As Figure 7 illustrates, in all road conditions the fitness of the best-evolved SAF converges to values that are better (i.e., lower) than those of the best PD and PID controllers. We obtained the optimal values of the parameters of the best PD and PID controllers by complete enumeration (i.e., a “brute-force” search) over discretized parameter values. For the PD controller, we evaluated 25 discrete values of each of the two parameters k1 and k2 (resulting in a search space of size 25 × 25 = 625). For the PID controller, we borrowed the optimal values of the two parameters of the PD controller, and used only 10 discrete values of each in the vicinity of these values, together with 25 values of the third parameter k3 (i.e., the size of the search space is 10 × 10 × 25 = 2500). The best fitness of the three proposed controllers (PD, PID, and GP-RMEP) and the optimal values of the coefficients of the PD and PID controllers are shown in Table 5.
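The two-stage enumeration can be sketched in Python as follows. The quadratic `toy_fitness` is a hypothetical stand-in for one simulated drive of the car (lower is better), and the gain ranges are invented for the example; only the grid sizes (25 × 25 for PD, then 10 × 10 × 25 for PID) follow the text.

```python
import itertools


def grid(lo, hi, n):
    """Return n evenly spaced values from lo to hi (inclusive)."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def brute_force(fitness, axes):
    """Evaluate every combination of gains.

    Returns the best (lowest-fitness) combination and the number of
    combinations that were evaluated.
    """
    candidates = list(itertools.product(*axes))
    best = min(candidates, key=lambda gains: fitness(*gains))
    return best, len(candidates)


def toy_fitness(k1, k2, k3=0.0):
    """Hypothetical placeholder for the fitness of one simulated run."""
    return (k1 - 0.3) ** 2 + (k2 - 1.2) ** 2 + (k3 - 0.05) ** 2


# Stage 1 (PD): 25 discrete values for each of the two gains.
pd_best, pd_evals = brute_force(
    toy_fitness, [grid(0.0, 1.2, 25), grid(0.0, 2.4, 25)]
)

# Stage 2 (PID): 10 values around each PD optimum, 25 for the third gain.
k1_opt, k2_opt = pd_best
pid_axes = [
    grid(k1_opt - 0.1, k1_opt + 0.1, 10),
    grid(k2_opt - 0.1, k2_opt + 0.1, 10),
    grid(0.0, 0.24, 25),
]
pid_best, pid_evals = brute_force(toy_fitness, pid_axes)
```

Seeding the PID search with the PD optimum keeps the number of evaluations at 2500 instead of the 25 × 25 × 25 = 15,625 that a full three-dimensional grid of the same resolution would need, at the cost of possibly missing a PID optimum that lies far from the PD one.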
As the results shown in Table 5 demonstrate, the PID controller outperforms the PD controller in that its fitness is lower in all road conditions. However, the quality of steering of the best-evolved GP-RMEP controller is even better than that of PID, and the difference between them widens with the decrease of the friction coefficient, reaching a maximum of about 4 times (1532 versus 374) on icy roads (µ = 0.3). Conversely, on grippy, dry roads (µ = 0.8 and µ = 1.0), this difference is not very significant (587 versus 545 and 613 versus 498, respectively), implying that the PID model provides good enough steering of the car in these road conditions.
The lack of generality is one of the well-documented drawbacks of solutions obtained via GP [4,5,26] that hinders the applicability of this algorithm to real-world problems. Indeed, we could not be sure how well a SAF that was evolved on a single car driven at a fixed speed on a fixed track featuring a fixed coefficient of friction would perform in different situations. Ultimately, we should have considered evolving a SAF that performs (nearly) equally well on several fitness cases that correspond to these different conditions. Moreover, in order to bridge the inevitable reality gap, we should have implemented an evolutionary adaptation of a set of the best SAFs, evolved on the simulated car, to a real one driven on a real track. However, in our current, seminal work, we report the results of testing the best SAF evolved via GP with a single fitness case (fixed track, fixed coefficient of friction, and fixed speed V = 0.85 VCR) in different road conditions. As shown in Figure 8, all GP-RMEP controllers evolved for a particular slippery condition (e.g., for µ equal to 0.3, 0.4, 0.5, or 0.6, respectively) feature a reasonably small degradation (if any) when tested in different road conditions that were unforeseen during the evolution.
For example, the GP-RMEP controller evolved on snowy roads with µ = 0.5 offers a better quality of steering (i.e., lower fitness values) than the alternative PD and PID controllers when tested on roads with friction µ equal to 0.3 (icy), 0.4 (snowy), and 0.6 (rainy). The analytical expression of the SAF of this GP-RMEP controller is shown in Equation (8).
It is not uncommon that the solutions obtained via GP are too complex to be easily comprehended by a human [5]. The presented best-evolved SAF is not an exception to this trend, and we cannot explain precisely either why or how the SAF shown in Equation (8) works. We could only confirm that the SAF implements a proportional-derivative (PD) control of the steering in that both (i) the direct values of parameters and (ii) their derivatives are incorporated in its code. The only integral term included in the terminal set of the GP, the integral of the lateral deviation, is not incorporated by the GP in the best-evolved SAF, which is consonant with the findings that the integral term could be superfluous in some nonlinear PID controllers [27].
The dynamics of the steering angle and the deviation from the center of the lane of the car steered by the sample best-evolved SAF shown in Equation (8) in snowy (µ = 0.5) road conditions are illustrated in Figure 9 and Figure 10, respectively. The same figures also show the behavior of the car steered by the best PD and PID controllers with values of parameters optimized for this particular road condition (as shown in Table 5). As these two figures illustrate, the lateral deviation of the car steered by the sample best-evolved steering function is significantly lower than that of the best solution of the servo-control model, especially during steady-state cornering in the left Turn 1 and right Turn 2. Also, during the transition between these two turns the steering is smoother and more stable (non-oscillatory).
As Figure 9 illustrates, the maximum value of the steering angle of the car steered by GP-RMEP during the reaction to the initial step disturbance is about half (0.3 rad versus 0.62 rad) of that of the PD and PID controllers. This in turn results in a smoother return of the car to the center of the lane. Moreover, the steering angle produced by GP-RMEP appears to be limited to a particular maximum value of 0.3 rad. A similar phenomenon could be observed during the transitions between Turn 1 and Turn 2, with the only difference that the limit is much lower in magnitude (−0.06 rad), which facilitates the smooth, non-oscillatory transitions between the turns. We think that the PD and PID controllers could not yield limited values of the steering angle in these two situations because the linearity of these controllers implies that such a limitation would compromise their ability to satisfy the other (somewhat conflicting) requirements of the steering controller; namely, to follow the center of the lane closely and to return to the lane swiftly when having deviated from it. We assume that we could borrow this know-how, discovered by GP, to design a PID steering controller with coefficients that adaptively (possibly nonlinearly) vary the gain of the three terms P, I, and D depending on the current driving situation. Moreover, these coefficients, rather than being constants, could be evolved (e.g., via GP) as functions of some parameters pertinent to the state of the car.
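As a rough illustration of such a PID controller with state-dependent coefficients, the following Python sketch lets each of the three gains be a function of the current error. The class `ScheduledPID`, the helper `soft_gain`, and the particular scheduling rule are hypothetical and chosen only because they bound the proportional term while keeping the full gain near the center of the lane; they are not the evolved SAF of Equation (8), and in the proposed approach such functions would be evolved rather than handwritten.

```python
class ScheduledPID:
    """PID controller whose three gains are functions of the current error."""

    def __init__(self, kp_fn, ki_fn, kd_fn):
        self.kp_fn, self.ki_fn, self.kd_fn = kp_fn, ki_fn, kd_fn
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        """Return the control output for the current error sample."""
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0  # no previous sample on the first call
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return (self.kp_fn(error) * error
                + self.ki_fn(error) * self.integral
                + self.kd_fn(error) * derivative)


def soft_gain(k0, e_ref):
    """Gain equal to k0 near zero error that fades as |error| grows.

    The resulting term k0 * e / (1 + |e| / e_ref) never exceeds
    k0 * e_ref in magnitude (assumed scheduling rule).
    """
    return lambda error: k0 / (1.0 + abs(error) / e_ref)


def constant_gain(k):
    """Conventional fixed gain, for comparison or for unused terms."""
    return lambda error: k


# Proportional-only example: the output is bounded by 0.6 * 0.5 = 0.3.
controller = ScheduledPID(soft_gain(0.6, 0.5), constant_gain(0.0), constant_gain(0.0))
small_error_output = controller.step(0.01, dt=0.01)
large_error_output = controller.step(100.0, dt=0.01)
```

For small errors the controller behaves almost like a linear one with gain 0.6, whereas for large errors its output levels off below 0.3, which is the kind of limiting that a fixed-gain linear controller cannot produce.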
4. Discussion
Some of the best SAFs evolved on icy and snowy roads featured slight steering oscillations with frequencies of about 1 to 5 Hz. We cannot conclude whether these oscillations have a beneficial effect on the controllability of the car. Most likely, they appear as a result of neutral genetic code in the evolved SAF: code that has neither a beneficial nor a detrimental effect on the behavior of the car. Indeed, these steering oscillations do not manifest themselves in an oscillating trajectory of the car because, due to the slippery conditions, the front tires slip excessively and cannot provide sharp directional control of the car. Consequently, the oscillations do not result in any measurable oscillations of the lateral acceleration of the car, which was a metric that we used in the calculation of the fitness of the evolved SAF. Therefore, the GP could not impose any selection pressure against the evolved SAFs that produce these oscillations. Although the oscillations apparently do not detrimentally affect the steering of the car, from the standpoint of the practical implementation of the evolved SAF on real cars they might be highly undesirable due to uncomfortable vibrations and accelerated wear of the tires and the components of the steering system. We speculate that by modifying the fitness function, e.g., by introducing a third additive component (in addition to the area under the trajectory of the car around the center of the lane and the average of its lateral velocity) that explicitly reflects the severity of steering oscillations, we could discourage the GP from evolving oscillating steering functions.
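A fitness function extended in this way might look like the following Python sketch. The first two components follow the description above; the oscillation measure (mean absolute steering rate), the weight `w_osc`, and all function names are assumptions introduced for the example, and a different measure of oscillation severity could serve equally well.

```python
def mean_abs(values):
    """Mean of the absolute values of a non-empty sequence."""
    return sum(abs(v) for v in values) / len(values)


def steering_oscillation(steering_angles, dt):
    """Mean absolute steering rate (rad/s); assumed severity measure."""
    rates = [(b - a) / dt for a, b in zip(steering_angles, steering_angles[1:])]
    return mean_abs(rates)


def fitness(lateral_devs, lateral_vels, steering_angles, dt, w_osc=0.1):
    """Lower is better. Sum of three additive components:

    1. area between the trajectory and the center of the lane,
    2. average magnitude of the lateral velocity,
    3. weighted steering-oscillation penalty (the proposed addition).
    """
    area = sum(abs(d) for d in lateral_devs) * dt
    avg_lateral_velocity = mean_abs(lateral_vels)
    penalty = w_osc * steering_oscillation(steering_angles, dt)
    return area + avg_lateral_velocity + penalty


# Two made-up runs with identical trajectories but different steering.
dt = 0.02
devs = [0.05] * 100
vels = [0.01] * 100
smooth_steering = [0.10] * 100
jittery_steering = [0.10 + 0.02 * (-1) ** i for i in range(100)]

fitness_smooth = fitness(devs, vels, smooth_steering, dt)
fitness_jittery = fitness(devs, vels, jittery_steering, dt)
```

Because the two runs differ only in the steering signal, the original two-component fitness cannot tell them apart, whereas the added penalty makes the oscillating run score worse and thus exposes it to selection pressure.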
The realization of steering controllers in the real world could be done in two steps in accordance with the concept of evolutionary robotics [33]. The first step, as discussed in our work, involves the evolution of a generic steering solution on the software model of the car in TORCS. The second step, intended to bridge the inevitable reality gap, could be implemented as an evolutionary adaptation of the evolved generic solutions to real cars in real-world environments.
Also, in our discussion about the generality of the evolved SAF, we assumed the possibility to deploy a single, general SAF that could be good enough in any (slippery) road condition. The disadvantage of such an approach would be that such a general SAF might be somewhat inferior to a dedicated SAF evolved for a particular road condition (as illustrated in Figure 8). On the other hand, the advantage is that the system is not required to determine in real time the current (instant) road conditions, i.e., the current (instant) coefficient of friction between the tires and the road. Our assumption was that such a determination is rather challenging: it requires significant computational (signal-processing) power, the obtained result is approximate, and it might be obtainable in only a limited number of driving situations. This assumption is supported by the fact that most of the existing driving aids activated in slippery road conditions, such as anti-lock braking systems, traction control, and electronic stability programs, rely on the detection of the slippage of the tires rather than on the actual coefficient of friction (one of the underlying reasons for such slippage). However, the recent advances in automotive control suggest that in the near future these challenges might be successfully addressed [34].