Autonomous Vehicle Control Comparison

Abstract: Self-driving features rely upon autonomous control of vehicle kinetics, and this manuscript compares several disparate approaches to controlling the predominant kinetics. Classical control using feedback of state position and velocity, open-loop optimal control, real-time optimal control, double-integrator patching filters with and without gain-tuning, and control law inversion patching filters accompanied by velocity control are assessed in Simulink, and their performances are compared. Optimal controls are found via Pontryagin's method of optimization utilizing three necessary conditions: Hamiltonian minimization, the adjoint equations, and terminal transversality of the endpoint Lagrangian. Real-time optimal control and the control law inversion patching filter with velocity control incorporating optimization are found to be the two best methods overall, as judged in Monte Carlo analysis by the means and standard deviations of position error, rate error, and cost.


Introduction
Vehicle kinetics are commonly studied using predominant kinetics represented by double integrators, equivalently for translation and rotation, neglecting the coupling cross-products generated by the transport theorem when motion is represented in the basis coordinates of rotating reference frames. Invocation of Chasles' theorem [1] permits a full description of six-degrees-of-freedom motion by assertion of Euler's [2] and Newton's [3] equations. Controlling mechanical motion has an even longer pedigree, culminating in the utilization of deterministic artificial intelligence, a burgeoning field proposed by Cooper/Heidlauf in 2017 [4], expanded to space vehicles by Smeresky/Rizzo [5], and, after the introduction of autonomous trajectory generation by Baker et al. [6], formalized as a process that very same year [7] and proven to be optimal. The provenance of the approach stems from nonlinear adaptive control as proposed by Slotine [8], modified by Fossen [9], and improved in 2012 with experimental validation [10]; Cooper/Heidlauf reformulated the Slotine/Fossen feed-forward element as a deterministic statement of self-awareness. The physics-based methods of Lorenz [11,12] provided the formalization which amplified the importance of the first principles [1-3] in establishing physics-based statements of self-awareness. Simple adaptation techniques remained prevalent until the codification of optimal learning by Smeresky/Rizzo in 2020 [13]. One noteworthy feature is the method's ubiquitous applicability to fields as disparate as spacecraft, robotics, and power electronics. Additionally, Rätze et al. [14] leverage the physics-based methods for optimal control of a CO2 methanation reactor, and Bukhari et al. [15] leveraged the methods to monitor and mitigate air pollution in megacities. Thus, the utility of the present study is manifest.
Deterministic artificial intelligence necessitates autonomously generated trajectories [16]; therefore, presented here is a comparison of techniques to autonomously generate optimal trajectories that minimize fuel usage. Trajectory research by Cooper/Heidlauf was evident immediately following their seminal publication: the work that immediately followed concentrated on autonomous vehicle trajectory generation [17] and substantiation of sensor-less state error tracking [18], since the optimal regulator performed so poorly. Shortly thereafter, Cooper collaborated with Smeresky to combine their efforts to date [19]. This combined work initiated the several lines of research proposed since 2020 seeking to optimize the current instantiation of deterministic artificial intelligence.
Optimization [20] can take many forms: linear quadratic optimal feedback regulation [21], time-optimal control [22], minimum fuel [23], minimum tracking error [24], etc., where minimum tracking error methods were proposed to facilitate autonomous vehicle collision avoidance [25]. These are directly compared in [26] using several figures of merit in the face of parameter uncertainty and noisy sensor fusion: state tracking error, rate tracking error, fuel consumption, and computational burden.
This article recommends the best method (of six methods compared) for moving a vehicle from an initial normalized state of zero to a final normalized state of unity in a scaled time of one second in the presence of uncertain mass and mass moments of inertia, and noisy state and rate sensor data. Uncertainty in the moment of inertia is assumed uniform within ±10%, while noise in the state and rate measurements is assumed Gaussian with zero mean (µ = 0) and standard deviation σ = 0.01.
Six methods are derived, modeled, and compared in simulation experiments:
1. P+V control (proportional control plus velocity negation);
2. Open-loop optimal control;
3. Real-time optimal control (RTOC);
4. Double-integrator patching filter with P+V control;
5. Double-integrator patching filter with gain-tuning for P+V control;
6. Control law inversion patching filter with P+V control.
The methods for implementing these control schemes are outlined in the following sections. Section 2 establishes a benchmark control for comparison, namely velocity control augmenting state feedback. Section 3 describes the methods for ascertaining the control minimizing state, rate, and acceleration trajectories and optimal control utilizing Pontryagin's minimization condition, the adjoint equations, and terminal transversality of the endpoint Lagrangian. Following the introduction of optimization methods, slight modifications are introduced to enable real-time optimization based on noisy sensor data. Section 4 introduces methods to incorporate the optimization results from Section 3 into a pre-existing system designed without optimization in mind. Three so-called patching filters are introduced: the double-integrator patching filter, a gain-tuned double-integrator patching filter, and a control law inverting patching filter. Section 5 presents the results of one thousand simulation experiments using the models developed in Sections 3 and 4, and figures of merit are presented to reveal relative superiority compared to the classical benchmark controller.

Materials and Methods
Methods are developed with variables defined in proximal tables (e.g., Tables 1 and 2), where simulation topologies are provided in Appendix A to aid repeatability.

Classical Benchmark Control
Using an end time of t_f = 1, a P+V control system is used with gain parameters tuned to performance specifications using the closed-loop system in Equation (1) below [26]. The following equations match the desired rise and settling times to the position and velocity gains K_P and K_V.
The damping coefficient is ξ = 0.7, and the desired settling time is t_s = 0.6 s, where the settling time is defined as the time it takes for oscillations to stabilize within 2% of steady state: t_s = 4.6/(ξ·ω_n) → ω_n = 4.6/(t_s·ξ) = 10.95 → K_V = 2ξω_n = 15.33. The rise time is t_r = 1.8/ω_n = 0.164, and K_P = ω_n² = 119.95. t_s is set to 0.6 s to ensure settling before the stop time of 1 s, so that θ_f and ω_f can be compared to their desired final values of 1 and 0, respectively. In the following analysis, µ_θf, σ_θf, µ_ωf, σ_ωf, µ_Jcost, and σ_Jcost are figures of merit compared across the error simulations specified in the previous section. Here, µ denotes the average value of the subscripted parameter over N = 1000 Monte Carlo simulations, and σ denotes the standard deviation.
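The tuning arithmetic above can be cross-checked in a few lines. The sketch below uses Python/NumPy purely for illustration (the paper's controllers are built in Simulink, so the variable names here are our own):

```python
import numpy as np

# Performance specifications from the text
xi = 0.7    # damping coefficient
t_s = 0.6   # desired 2% settling time [s]

# t_s = 4.6/(xi*w_n)  ->  w_n = 4.6/(t_s*xi)
w_n = 4.6 / (t_s * xi)
K_V = 2.0 * xi * w_n   # velocity gain
K_P = w_n ** 2         # position gain
t_r = 1.8 / w_n        # rise-time estimate

print(w_n, K_V, K_P, t_r)  # ~10.95, ~15.33, ~119.95, ~0.164
```

The rounded values reported in the text (K_P = 119.95, K_V = 15.33) fall out directly.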

Finding the Optimal Control
To develop a real-time optimal controller using Pontryagin's optimization principles, an analytic solution for optimal state and rate trajectories will be found given certain constraints and arbitrary initial and final conditions (θ_0, ω_0) and (θ_f, ω_f), with arbitrary start and end times t_0 and t_f. Here, I is the moment of inertia of the vehicle, and τ is the torque applied.
The goals are as indicated in Equation (3): J_cost, the quadratic cost function, is chosen to be (1/2)∫_{t_0}^{t_f} τ²(t) dt because it scales with applied torque (and is thus proportional to fuel usage), is positive-definite, and increases with time.
A Hamiltonian is defined as a new cost function that combines a running cost F that can increase as a function of time, an endpoint cost E(θ_f) that penalizes inaccuracy in the final state, and endpoint constraints. Minimizing this and solving for the optimal constants via Pontryagin's method leads to an optimal control scheme. To solve this double-integrator quadratic control (DQC) problem, the following steps will be used:
1. Formulate the Hamiltonian;
2. Minimize the Hamiltonian;
3. Formulate the adjoint equation;
4. Apply terminal transversality of the endpoint Lagrangian.
Finally, the resulting solution will be given in a matrix form which allows for the optimal solution to be found at every time step, leading to RTOC.

Formulate the Hamiltonian
There is no endpoint cost E(θ_f) specified, and only the running cost F is present. F is defined as the integrand of J_cost:

Minimize the Hamiltonian
Here, the '*' denotes the optimal control or trajectory.

Formulate the Adjoint Equation
In addition, using the initial conditions, assuming I = 1, and substituting t̄_f = t_f − t_0 and t̄ = t − t_0, the final conditions imply Equation (12), leading to the optimal trajectory and control of Equation (13). This control is used for the open-loop guidance control.
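For the nominal rest-to-rest maneuver studied here (θ_0 = ω_0 = 0, θ_f = 1, ω_f = 0, t_0 = 0, t_f = 1, I = 1), the DQC solution reduces to a cubic state trajectory driven by a linear torque profile. A short Python sketch (illustrative only, not the authors' Simulink model) evaluates the boundary conditions and the quadratic cost:

```python
import numpy as np

# Nominal DQC solution for theta: 0 -> 1, omega: 0 -> 0 over t in [0, 1], I = 1
t = np.linspace(0.0, 1.0, 1001)
theta = 3.0 * t**2 - 2.0 * t**3   # optimal state trajectory theta*(t)
omega = 6.0 * t - 6.0 * t**2      # optimal rate trajectory omega*(t)
tau = 6.0 - 12.0 * t              # optimal torque tau*(t), linear as Pontryagin implies

# Quadratic cost J = (1/2) * integral of tau^2 dt, trapezoidal rule
dt = t[1] - t[0]
integrand = 0.5 * tau**2
J = dt * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))

print(theta[-1], omega[-1], J)  # -> 1.0, 0.0, ~6.0
```

Note that the torque is linear in time, consistent with the adjoint equations (a constant λ_θ and a linearly varying λ_ω).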

Terminal Transversality of the Endpoint Lagrangian
The endpoint cost E(θ_f) is kept as 0, and the endpoint function e(θ_f) is set to equal zero. Thus, the endpoint Lagrangian is Ē(θ_f, ν) = E + ν^T e, where ν is defined as the covector and is composed of the adjoints (λ) found in Section 2.2.3. Because Ē(θ_f, ν) = 0, there is no terminal transversality condition to be applied. The optimal state and rate trajectory are shown in Figure 1.
Figure 1. Through the Artemis missions, NASA will land the first woman and first person of color on the Moon, using innovative technologies to explore more of the lunar surface than ever before (image courtesy of NASA [27]; image used in accordance with NASA's image use policy [28]).

Real-Time Optimal Control (RTOC)
At each time step, the simulation solves a matrix equation to find the constants p = [c_1, c_2, c_3, c_4]^T that will achieve the desired end state and rate given the current state and rate. If no noise or external disturbances are present, these are always the optimal constants found in Equations (8) and (9). However, the presence of noise and disturbances necessitates adjustment to reach the end goal. The form of the matrix equation is taken directly from the set of linear equations found in Equations (10), (12), and (13) and is presented as follows: [T] is defined as the matrix in Equation (19), and p is defined as the vector of constants c_i, where I is absorbed into these constants when the matrix equation is solved.
b is defined as the vector of current and final states and rates, as in Equation (19). Performing the matrix inverse in Equation (20) solves for the optimal constants p at every time step.
When |det[T]| ≤ 0.001, the control is switched to the open-loop optimal solution, since a matrix with near-zero determinant is effectively noninvertible, and the matrix inversion would provide a diverging solution. The resulting constants p are fed into the control as in Equation (13).
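The per-step solve can be sketched as follows. This Python/NumPy fragment is a hedged stand-in for the Simulink block; the cubic parameterization θ(t) = c1 + c2·t + c3·t² + c4·t³ is one way to write the four-constant family implied by Equations (10), (12), and (13), and the function name is our own:

```python
import numpy as np

def rtoc_constants(t, theta, omega, t_f, theta_f, omega_f, det_tol=0.001):
    """Solve T p = b for the cubic-trajectory constants at time t.

    Returns (p, used_open_loop): p = [c1, c2, c3, c4] for
    theta(t) = c1 + c2*t + c3*t**2 + c4*t**3, or (None, True) when
    |det T| <= det_tol and the open-loop optimal solution is used."""
    T = np.array([
        [1.0, t,   t**2,    t**3],       # current state row
        [0.0, 1.0, 2.0*t,   3.0*t**2],   # current rate row
        [1.0, t_f, t_f**2,  t_f**3],     # final state row
        [0.0, 1.0, 2.0*t_f, 3.0*t_f**2], # final rate row
    ])
    b = np.array([theta, omega, theta_f, omega_f])
    if abs(np.linalg.det(T)) <= det_tol:
        return None, True    # fall back to open-loop optimal control
    return np.linalg.solve(T, b), False

# Nominal case at t = 0: recovers theta*(t) = 3t^2 - 2t^3
p, fallback = rtoc_constants(0.0, 0.0, 0.0, 1.0, 1.0, 0.0)
print(p, fallback)  # -> [0, 0, 3, -2], False
```

As t approaches t_f, the current-state rows converge on the final-state rows, det[T] collapses to zero, and the guard triggers the switch to the open-loop solution described above.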

Patching Filters
Instead of feeding the desired end state to the P+V controller, patching filters take in the optimal control and feed in the desired current state as determined by the optimal control. Here, except for the case of gain-tuning (Section 2.3.2), it is assumed that the gains of the P+V controller cannot be altered from the form found in Section 2. The desired state θ_d(t) is calculated in a few different ways:

Double-Integrator Patching Filter
Here, the open-loop optimal solution is fed into a double integrator and then fed as an input to the P+V controller. The double-integrator model essentially feeds exactly the optimal trajectory θ*(t) = θ_d(t) to the P+V controller instead of θ_{f,d}, the desired final state.
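In effect, the filter double-integrates the open-loop optimal torque (from rest, recovering θ*(t)) and hands that reference to the unmodified P+V loop. The Python sketch below (our stand-in for the Simulink model; fixed-step RK4 mirrors the study's ode4 solver) illustrates why this arrangement leaves a systematic error: the P+V loop cannot track the moving reference without lag.

```python
import numpy as np

K_P, K_V = 119.95, 15.33                  # benchmark P+V gains (unchanged)
theta_d = lambda t: 3*t**2 - 2*t**3       # tau* = 6 - 12t double-integrated from rest

def deriv(t, x):
    th, om = x
    u = K_P * (theta_d(t) - th) - K_V * om  # P+V control law, I = 1 assumed
    return np.array([om, u])

# Fixed-step RK4, h = 0.01 s, t in [0, 1]
h, x = 0.01, np.array([0.0, 0.0])
for k in range(100):
    t = k * h
    k1 = deriv(t, x)
    k2 = deriv(t + h/2, x + h/2 * k1)
    k3 = deriv(t + h/2, x + h/2 * k2)
    k4 = deriv(t + h, x + h * k3)
    x = x + h/6 * (k1 + 2*k2 + 2*k3 + k4)

print(x)  # final [theta, omega]: systematic state and rate errors remain
```

The final rate in particular misses ω_f = 0 by a wide margin, consistent with the systematic rate error this method exhibits in the Results.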

Double-Integrator Patching Filter with Gain-Tuning
Here, it is assumed that the gains in the P+V controller can be altered. By performing manual tuning, K_P = 280 and K_V = 15.33 were found to work best in arriving at the desired end state, correcting for errors that arose in the case of the double-integrator patching filter without gain-tuning. Higher or lower velocity gains worsened accuracy, while higher position gains only increased the velocity. These newfound gains are used in the simulation for the double-integrator patching filter with gain-tuning.

Control Law Inversion Patching Filter
Here, the open-loop optimal solution is fed into the following transfer function, whose output is then fed as an input to the P+V controller: This accounts for the fact that the P+V controller gains will inherently change the resulting θ(t) to one that is not the θ*(t) fed into the filter. The transfer function aims to predict the effect that the P+V controller will have and to input a θ_d(t) that counters that effect to retain optimality. According to Equation (2), this inverts the effect of the P+V controller and adds a double integrator (a factor of 1/s²) to it.
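Under our reading of that description, the filter applied to the optimal torque is (s² + K_V·s + K_P)/(K_P·s²), which in the time domain gives θ_d(t) = θ*(t) + (K_V/K_P)·θ̇*(t) + (1/K_P)·θ̈*(t); the closed loop θ̈ + K_V·θ̇ + K_P·θ = K_P·θ_d then reproduces θ*(t) exactly when the plant model is perfect. A hedged Python sketch (names and discretization are ours, not the authors' Simulink block):

```python
import numpy as np

K_P, K_V = 119.95, 15.33
theta_s   = lambda t: 3*t**2 - 2*t**3   # theta*(t)
dtheta_s  = lambda t: 6*t - 6*t**2      # theta*'(t)
ddtheta_s = lambda t: 6 - 12*t          # theta*''(t) = tau*(t)/I, I = 1

# Inversion filter in the time domain: counter the P+V dynamics in advance
theta_d = lambda t: theta_s(t) + (K_V/K_P)*dtheta_s(t) + (1.0/K_P)*ddtheta_s(t)

def deriv(t, x):
    th, om = x
    u = K_P * (theta_d(t) - th) - K_V * om  # unchanged P+V law
    return np.array([om, u])

h, x = 0.01, np.array([0.0, 0.0])
for k in range(100):  # RK4, t in [0, 1]
    t = k * h
    k1 = deriv(t, x); k2 = deriv(t + h/2, x + h/2*k1)
    k3 = deriv(t + h/2, x + h/2*k2); k4 = deriv(t + h, x + h*k3)
    x = x + h/6 * (k1 + 2*k2 + 2*k3 + k4)

print(x)  # final [theta, omega], close to the desired [1, 0]
```

Comparing this result with the double-integrator filter's residual error makes the benefit of the inversion step concrete.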

Results
A Monte Carlo simulation was run for each of the six methods using a uniform ±10% uncertainty in I so that I varied from 0.9 to 1.1. The state and rate sensors are subject to a Gaussian distribution of "white noise" with mean 0 and a standard deviation of 0.01. N = 1000 simulations are done for each method. MATLAB's ode4 Runge-Kutta integration solver was used with a step size of h = 0.01 s.
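The spread attributable to the inertia uncertainty alone can be previewed with a small Monte Carlo over the open-loop optimal control, for which the final state has the closed form θ_f = 1/I when the torque was designed assuming I = 1. A Python sketch with N = 1000 draws (the full study of course runs all six Simulink models with sensor noise as well):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000
I_true = rng.uniform(0.9, 1.1, N)  # uniform +/-10% inertia uncertainty

# Open-loop tau*(t) = 6 - 12t was designed for I = 1, so the actual
# final state is theta_f = (1/I) * double integral of tau* = 1/I.
theta_f = 1.0 / I_true

mu, sigma = theta_f.mean(), theta_f.std()
print(mu, sigma)  # mean near 1; spread from inertia mismatch alone
```

The resulting standard deviation of a few percent gives a sense of the scatter the feedforward method carries into Figure 2 before any sensor noise is added.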
State and rate data are taken prior to going through the noisy sensors. The large, systematic diagonal spread in final state and rate values most likely stems from the uniform uncertainty in I, not the noisy sensor data.
In Figure 2, the standard deviation ellipses plotted are those of the measured data in θ f and ω f . In addition, 1, 2, and 3 standard deviations are plotted.

Discussion
The results summarized in Table 3 will be compared and contrasted. It is noted that the state and rate data presented are the "actual" final states and rates before passing through the noisy sensors.
It is clear from the results that the double-integrator patching filter without gain-tuning is the least accurate method. The errors in accuracy are four orders of magnitude higher in state and rate than RTOC. While the quadratic cost of the maneuver is 28% lower than the case of open-loop optimal control and RTOC, the loss in accuracy is too large for this method to be considered a good option. With gain-tuning, the state error was decreased by two orders of magnitude, making it the second-most accurate method for state out of the six. However, it still has a large, systematic error in rate, as well as a higher cost than the double-integrator patching filter without gain-tuning and the control law inversion patching filter. This systematic error makes the double-integrator patching filter with gain-tuning also a relatively poor method. The P+V controller has accuracy and precision within the same order of magnitude as open-loop optimal control, except in the rate, where open-loop optimal control wins by three orders of magnitude. The major issue with P+V control is cost: for essentially the same accuracy, the maneuver costs 1.5 orders of magnitude more. The benefit of P+V control over open-loop optimal control, however, is that it is a feedback mechanism rather than a feedforward mechanism, which allows it to correct for external disturbances. A feedforward mechanism, on the other hand, is unaffected by sensor noise (even noise of nonzero mean). Which is more desirable in a given scenario is determined by the context of the engineering problem and the presence of disturbances and noise of nonzero mean, which is not tested here.
Open-loop optimal control may have the highest accuracy in rate, but the control law inversion patching filter has the highest accuracy in state and outperforms all other methods except RTOC in rate accuracy. Its precision is lower than that of RTOC, but its increase in accuracy outweighs this. The control law inversion patching filter has a slightly higher mean cost than RTOC and open-loop optimal guidance, but the difference is relatively insignificant; the spread in cost is also higher. RTOC's roughly half an order of magnitude lower state accuracy and one order of magnitude higher rate accuracy, combined with slightly improved rate precision and lower state precision, make it a strong competitor to the control law inversion patching filter, depending on which traits are desirable.
All panels of Figure 2 are drawn with equal axes. It should be noted that Figure 2d,e have a shifted graph center to encompass the data points; these points center on a different point than the other methods due to a systematic error inherent in the double-integrator patching filter. The graphs also provide visual cues indicating the spread patterns of the various methods. The open-loop optimal feedforward control has a relatively circular distribution of points, as it is subject only to uncertainty in the moment of inertia, not to noise in the state and rate sensors. The spreads of the other methods are more elliptical in nature. RTOC has a smaller spread than any other method but has a noticeable diagonality to it, indicating a bias from the varying moment of inertia.
Computational burden was not measured as a figure of merit due to the inability to obtain meaningful results. Other processes were running on the same computer during these simulations, thus affecting run time, or computational burden, measurements.

Conclusions
The control law inversion patching filter and RTOC work best out of the six options for general purposes. The decision for which control scheme is best ultimately depends on the relative use case, context, and requirements. Strict cost requirements or precision requirements may outweigh benefits that the control law inversion patching filter provides over RTOC, for example. The control law inversion filter also provides a benefit over RTOC in computational burden due to the lack of a matrix inversion scheme.
Using optimality will reduce cost significantly over traditional methods, as indicated by the steep decrease in J cost between P+V control and any optimal control scheme. If traditional methods are being used, one can incorporate optimality by the use of patching filters and combine the benefits of traditional feedback controllers with the cost benefits of Pontryagin optimization.
A future study could run the control law inversion patching filter together with RTOC and measure its performance. Future work could also study the effects of these control schemes with the inclusion of full six-degrees-of-freedom, three-dimensional, coupled, nonlinear equations of motion with external forces and Coriolis forces present.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations and symbols are used in this manuscript:

Appendix A

Figure A7. Simulink implementation of the patching filters. A manual switch is present to switch between a simple double-integrator patching filter and a control law inversion patching filter.

Figure A8. Simulink implementation of a P+V feedback controller.