Design Framework for Achieving Guarantees with Learning-Based Observers

Abstract: The paper proposes a novel framework for state observer design in which learning-based observers are incorporated. The aim is a framework that guarantees a bound on the observation error even if the error of the learning-based observer cannot be verified under all scenarios. The framework is based on the robust $H_\infty$ design method, which provides guarantees on the resulting observer. Moreover, the observer design process is extended with a controller design, which leads to a joint robust $H_\infty$ controller-observer design. The proposed method is applied to a vehicle control problem, namely lateral path following. In this problem the goal of the observer is to provide an accurate lateral velocity signal, which is used in the controlled system to generate the front wheel steering angle. The effectiveness of the method is illustrated through simulation examples in the high-fidelity vehicle dynamics simulator CarMaker.


Introduction and Motivation
The development of complex automatic control systems has become a major challenge for the industry. One of the most important fields is the control of autonomous vehicles, in which various safety performance requirements and enhanced functionality of the vehicle systems must be guaranteed. This requires many measurements using a high number of sensors. During the sensing process in industrial applications, several states of a given system, which play a crucial role in the control system, can be measured. In many cases, however, not all states of a system can be measured directly, or the appropriate sensors are too expensive for wide use. The increasing number of available signals makes it possible to observe, estimate and predict the states of the system, which can lead to enhanced functionality in the controlled system. Several approaches to the observation problem have been developed in recent years. In terms of solutions, two main groups can be distinguished. The first group contains the classical approaches. In [1] a gain-scheduled $H_\infty$ observer is presented, in which the time delays and the saturation of the actuator are taken into account. An $H_\infty$ filtering method for state of charge and state of health monitoring in electric vehicles is used in [2]. Orientation angles are determined using a nonlinear Luenberger observer in [3], where a low-cost inertial measurement unit is used in the estimation process. Moreover, in [4] a method is proposed for minimizing the disturbances and the errors of the estimation using an $H_\infty$ norm approach. Furthermore, a polytopic system-based solution is presented in [5], whose goal is to solve the state estimation and the fault detection problems at the same time. The work of [6] describes a control method for linear parameter varying systems using a polytopic observer.
Although these methods are able to handle the nonlinearities of the system, they require accurate knowledge of the observed system. In many cases the nonlinearities of the system are unknown. Methods based on big data analysis can be used to improve the accuracy of the estimation process.
The second group contains the non-conventional methods. In these methods the estimation process is extended with the results of machine learning algorithms, with which the accuracy can be increased, especially in the nonlinear operating range. In [7] the estimation is an essential part of the control system of an induction motor, which applies a neural network-based solution. Furthermore, machine learning-based observers can also be used for mobile robots, see [8]. However, these approaches cannot provide analytical guarantees on the performance of the estimation. By combining neural network results with a model-based estimation approach, the performance can be increased significantly. For example, a Luenberger observer is extended with the results of a neural network in [9]. A solution for the estimation of the motor inertia value is presented in [10]: the inertia value is observed using an extended Luenberger observer, in which the gain matrix of the observer is adjusted using a neural network. Moreover, in [11] an $H_\infty$ filtering algorithm is combined with the results of a neural network in order to determine the roll angle of the vehicle. In that solution only the on-board sensor signals are used and the method is based on a sample vehicle model. Kalman filtering is another important approach to state estimation with many practical implementation possibilities. For example, [12] proposes a cascaded Kalman filtering method for state estimation in the field of cooperative lateral vehicle following. In the context of electric vehicles, Kalman filtering can be used for the state estimation of the batteries [13]. Through an appropriate method the real-time operation of the filtering process can be guaranteed [14].
The benefit of the classical approaches is that they are able to provide provable guarantees on the observation. For example, in a model-based observer design process it is possible to bound the maximum error of the process, i.e., the difference between the estimated and the real signal. Nevertheless, this requires an accurate model of the system, and the achievable observation performance is limited by the limited complexity of the model. In contrast, learning-based observers have the advantage of providing accurate observation without requiring a preliminary physical model of the process. The design of the observer is based on a training process in which several scenarios are used, e.g., in a supervised learning or in a reinforcement learning process. Since it can be difficult to formulate some types of nonlinear dynamics of the system, an advantage of the learning-based approach is that their effect on the performance can be captured through learning. Thus, it is unnecessary to use complex nonlinear identification methods to obtain a control-oriented state-space model. A further advantage of learning-based observers is that a high number of measured signals can be used to achieve an accurate observation; in particular, the inputs of the agent can contain unstructured data (e.g., camera frames). Although learning-based techniques are highly effective in practical applications, it is difficult to provide provable guarantees on the performance level of the observation process.
The aim of this paper is to propose a framework for the design of observers in which the model-based and the learning-based approaches are integrated. The goal is to bridge the gap between the two observer design methods, i.e., to provide guarantees on the minimum performance level of the observation process and, simultaneously, the possibility of improving the maximum performance level. The role of the model-based observer is to provide an observation which has a guarantee on the minimum performance level. The aim of the learning-based observer is to provide another observation, which is potentially more accurate. The output of the learning-based observer is fed into the model-based observer to improve the final observation signal. The contribution of the paper is a design framework for the model-based observer in which some information on the learning-based observer is incorporated. The advantage of the method is that it is independent of the internal structure of the learning-based observer, and thus it can be used to provide guarantees for various agents. In this paper the proposed framework is applied to an observation problem in the field of vehicle control, i.e., lateral path following.
The paper is organized as follows. The design framework and the concept behind the observation are presented in Section 2. The design of the model-based observer with the consideration of the learning-based observer is presented in Section 3. Section 4 presents the application of the proposed method to a vehicle control problem, together with simulation results that illustrate the effectiveness of the observation. Finally, the conclusions of the paper and the further challenges are summarized in Section 5.

Design Framework
In this section the framework for the observer design is presented. The structure of the observer, together with a controller, is illustrated in Figure 1.
The idea behind the framework is to provide an estimate $\hat{x}$ which is as close as possible to the real state vector $x$ of the system. Since in various industrial applications the observer is used for control purposes, $\hat{x}$ can be used for the generation of the control input $u$ through the controller $u = K(\hat{x})$. Nevertheless, the design of the observer is independent of the control design, and thus $\hat{x}$ can also be used for non-control purposes, e.g., monitoring the operation of the system. In the design framework it is considered that the input of the model-based observer and the input of the learning-based observer can be different, i.e., the measurements $y_m$ and $y_L$ do not have to be the same. Generally, learning-based observers can use a high number of measured signals, due to their complex and nonlinear structure. For example, in environment sensing applications the estimation of the position of the autonomous vehicle on the road is based on camera information, which can be considered as unstructured data. In contrast, model-based observers have a structure with limited complexity, and thus the number of measured signals is also limited. Moreover, model-based observers can only use structured data. Therefore, it is advantageous to differentiate $y_m$ and $y_L$, although the measured signals in $y_m$ can be part of $y_L$.
The output of the learning-based observer is denoted by $\hat{x}_L$, which is the estimate of the state vector of the system. This information is used by the model-based observer to improve the estimate $\hat{x}$, with which $|x - \hat{x}|$ is minimized. The idea is close to the concept of Kalman filtering, in which an innovation term is used to update the model-based estimation. In the update process of Kalman filtering it is considered that the measurement used for the innovation is accurate, and thus the estimation is fitted to it. In contrast, in the proposed concept $\hat{x}_L$ is considered to be accurate in most cases, but not necessarily in all scenarios. For example, there can be scenarios in which the output of the learning-based observer is highly inaccurate, such as faults or rare inputs which differ strongly from the samples in the training set. The goal of the proposed observer structure is to avoid an unlimited increase of inaccuracy in the observation, i.e., a bound on the error between $x$ and $\hat{x}$ must be guaranteed. This is achieved through the model-based observer design, which provides the minimum performance level of the observer, i.e., bounds on the observation error. Nevertheless, the learning-based observer is considered to be designed in such a way that it provides accurate observation under normal circumstances, and thus the consideration of $\hat{x}_L$ is beneficial for the minimization of the error $|x - \hat{x}|$. This improves the maximum performance level of the observer. The decision on the accuracy of $\hat{x}_L$ is part of the operation of the model-based observer, whose design is detailed in the following section.

Robust Design of the Model-Based Observer
The goal of this section is to propose the design of the model-based observer, in which the output $\hat{x}_L$ of the learning-based observer is incorporated. The model-based observer design is based on the robust $H_\infty$ method, with which a guarantee on the error of the observation can be provided. Moreover, in this section the design of the observer is extended with the design of a robust controller for closed-loop purposes, which results in an output-feedback $H_\infty$ controller with guarantees.
The designed model-based observer must guarantee the following features.
• The model-based observer must provide an observation $\hat{x}$ with which the observation error $(x - \hat{x})^2$ is minimized. This requires an accurate model of the process and the measurement $y_m$. Moreover, the observation is improved through $\hat{x}_L$.
• The model-based observer must decide on the acceptability of $\hat{x}_L$. The reason is that the learning-based observation process can degrade, because its performance level is not guaranteed. For example, $\hat{x}_L$ is unacceptable if there are faults in the operation of the learning-based agent. Another example is an input sample for the agent that differs significantly from the samples in the training set, so that $\hat{x}_L$ can lead to a reduced performance level. This feature is achieved through the robust design.
The model for the observer design is based on the state-space representation

$$\dot{x} = Ax + B_2 u, \qquad y_m = C_2 x, \tag{1}$$

where $A$, $B_2$, $C_2$ are matrices, $x$ represents the state vector of the system with $n$ states and $u$ is the control input. For simplicity, one control input is considered in the rest of the paper. Moreover, the signal $\hat{x}_L$ is considered as a bounded disturbance in the system. In contrast to classical disturbance signals, $\hat{x}_L$ is beneficial for the system in most cases, as presented in Section 2. During the design of the model-based observer, it is required that the observation of the learning-based observer be inside a bounded range around the model-based observation. Thus, the model-based observer must be robust against the bounded disturbance. Consequently, the maximum observation error, i.e., the minimum performance level of the observer, is guaranteed by the $H_\infty$ design. The goal of the observer is to minimize the difference between the states of the system and the estimated states, such as

$$|x - \hat{x}| \to \min. \tag{2}$$

Thus, it is required to find an observer matrix $L$ which minimizes the objective (2). The structure of the observer, which contains $L$ and the model of the system, is formed as in (3), where the vector $\Delta$ is the improvement based on the learning-based estimation $\hat{x}_L$. The values in $\Delta$ are formed as in (4), where the max and min functionals represent the selection of the higher or lower values and the index $i$ in $\hat{x}_{L,i}$, $\hat{x}_i$ refers to the elements of the state vector. It means that $\Delta$ can be interpreted as a state correction from $\hat{x}_L$, which is bounded to avoid the degradation of $\hat{x}$ if $\hat{x}_L$ is degraded. Rule (4) expresses that $\hat{x}_i - \hat{x}_{L,i}$ must be between $\Delta_{min,i}$ and $\Delta_{max,i}$. The selection of $\Delta_{min,i}$, $\Delta_{max,i}$ depends on the requirements on the acceptable maximum observation error, i.e., on how much degradation of $\hat{x}_i$ is acceptable if $\hat{x}_{L,i}$ is degraded. If $\Delta_i$ varies in a wide range, it can lead to increased degradation. However, if $\Delta_i$ varies in a small range, the benefits of the learning-based observer have less impact on the observation process.
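The bounded correction described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that the raw correction is the difference $\hat{x}_L - \hat{x}$ (the sign convention is an assumption) and that each element is saturated to $[\Delta_{min,i}, \Delta_{max,i}]$ with nested max and min operations. The function name `bounded_correction` and the sample state values are hypothetical; the bounds are the ones used later in the vehicle example for the state ordering $[\dot\psi, v_y, y]$.

```python
import numpy as np

def bounded_correction(x_hat, x_hat_L, delta_min, delta_max):
    """Element-wise saturated correction (assumed form of the Delta rule)."""
    raw = np.asarray(x_hat_L, dtype=float) - np.asarray(x_hat, dtype=float)
    # max(...) enforces the lower bound, min(...) enforces the upper bound
    return np.minimum(np.maximum(raw, delta_min), delta_max)

# Symmetric bounds for [yaw rate (rad/s), lateral velocity (m/s), lateral error (m)]
delta_max = np.array([0.1, 0.15, 0.2])
delta_min = -delta_max

x_hat = np.array([0.05, 0.40, 0.10])        # hypothetical model-based estimate
x_hat_L_ok = np.array([0.07, 0.45, 0.12])   # learning-based estimate close to x_hat
x_hat_L_bad = np.array([0.07, 3.00, 0.12])  # degraded lateral velocity estimate

delta_ok = bounded_correction(x_hat, x_hat_L_ok, delta_min, delta_max)
delta_bad = bounded_correction(x_hat, x_hat_L_bad, delta_min, delta_max)
```

In the first case the correction passes through unchanged; in the second case the lateral velocity element is limited to 0.15 m/s, so a degraded learning-based output can shift the estimate only by a bounded amount.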
From the aspect of the observer design, the vectors $\Delta$ and $u$ can be handled as known disturbances, which means that the model for the design of the observer is transformed as in (5), where the disturbance vector $w$ is composed of $\Delta$ and $u$. Since the goal of the observer is to minimize the error in (2), the objective of the observer design is formed as the minimization of the cost function (6), where the minimization of $z_o = C_{1,o}(x - \hat{x})$ is the performance criterion of the observer with the $n \times n$ identity matrix $C_{1,o}$, $l = L(y_m - C_2\hat{x})$ is the control signal for the correction in the observer, $Q_o$ is a weighting matrix which expresses priorities between the performances and $r_o$ is a scalar weight for the correction. The design of the observer is based on the solution of the algebraic Riccati inequalities (7) [15], where $Y$ is a symmetric matrix. The scalar $\gamma > 0$ represents the upper bound of the $H_\infty$ norm of the transfer function from $w$ to the observation performance $z_o$. The goal of the observer design is to minimize $\gamma$, and $\gamma < 1$ must be guaranteed to achieve robustness against the disturbances. The result of the minimization is $Y$, from which the observer matrix $L$ for (3) is created. The computation of $Y$, $L$ is based on a minimization process. The goal is to find the minimum of $\gamma$ for which the feasibility of the Riccati inequalities (7) is guaranteed. The Riccati inequalities are feasible if the solution $Y$ can be computed. In practice, the minimum of $\gamma$ can be found through an iterative process, e.g., line search.
Since in several industrial problems the observers are used for control purposes, the joint design of the robust $H_\infty$ controller and observer is presented. The goal of the control design is to minimize a quadratic cost function in which the vector $z_c = C_{1,c}x$ with the matrix $C_{1,c}$ forms the performances of the control system, $Q_c$ is the weighting matrix for creating priorities between the control performances and $r_c$ scales the control input $u$.
The robust $H_\infty$ design process of the observer and the controller is based on joint Riccati inequalities [15], in which $X$ is a symmetric matrix. Thus, in this case the minimization of $\gamma$ is constrained by five inequalities, i.e., it is necessary to find $\gamma < 1$ for which $X$, $Y$ exist. The control input of the system is computed as $u = -K\hat{x}$, where the controller matrix $K$ is derived from the solution $X$. The results of the $\gamma$ minimization are the matrices $L$ and $K$, with which the controlled system can be formulated. Figure 2 illustrates the implementation of the controlled system with the observer. The rule (4) is implemented in the block for the generation of $\Delta$.
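How the controller, the observer and the bounded correction might interact in an implementation can be sketched as one forward-Euler update. This is only a sketch under explicit assumptions: the observer is assumed to be of Luenberger type with the saturated correction $\Delta$ entering additively, the control law is assumed to be $u = -K\hat{x}$, and the function name `observer_step`, the scalar test system and the step size are hypothetical. The gains $L$ and $K$ are taken as given; they are not computed here.

```python
import numpy as np

def observer_step(x_hat, y_m, x_hat_L, A, B2, C2, L, K, d_min, d_max, dt):
    """One forward-Euler step of an observer with a bounded additive correction.

    Assumed structure (not taken from the paper):
        x_hat_dot = A x_hat + B2 u + L (y_m - C2 x_hat) + Delta
    """
    u = -K @ x_hat                                  # state feedback on the estimate
    delta = np.clip(x_hat_L - x_hat, d_min, d_max)  # saturated correction
    x_hat_dot = A @ x_hat + B2 @ u + L @ (y_m - C2 @ x_hat) + delta
    return x_hat + dt * x_hat_dot, u

# Hypothetical scalar system, used only to exercise the update
A = np.array([[-1.0]])
B2 = np.array([[1.0]])
C2 = np.array([[1.0]])
L = np.array([[2.0]])
K = np.array([[0.5]])

x_next, u = observer_step(
    x_hat=np.array([0.0]), y_m=np.array([1.0]), x_hat_L=np.array([5.0]),
    A=A, B2=B2, C2=C2, L=L, K=K,
    d_min=np.array([-0.2]), d_max=np.array([0.2]), dt=0.01,
)
```

Although the learning-based value in this toy call is far from the estimate, its contribution to the update is capped at 0.2, which mirrors the role of the $\Delta$ generation block.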

Application of the Observer Design to a Vehicle Control Problem
In the rest of the paper the proposed observer design framework is applied to a vehicle control problem. The goal of the observer is to provide a precise lateral velocity value $v_y$ for the path tracking control when the yaw rate $\dot\psi$ and the lateral error of the vehicle from the path, $e_y = y - y_{ref}$, are measured.
The model of the system is based on the two-wheel bicycle model of a medium-size passenger car [16]. In its state-space formulation the state vector is $x^T = [\dot\psi, v_y, y]$, $C_f = 126{,}000$ N/rad and $C_r = 126{,}000$ N/rad are the cornering stiffness values of the front and rear wheels, and $l_f = 1.47$ m, $l_r = 1.5$ m are the distances between the front/rear axles and the center of gravity. The mass of the vehicle is $m = 2108$ kg and the inertia about the vertical axis is $J = 1585.3$ kg m$^2$. In the design of the observer and the controller a constant longitudinal velocity $v = 20$ m/s is considered. The performance of the controller is the minimization of the lateral tracking error, i.e., the difference between $y$ and the reference position $y_{ref}$ of the vehicle. Since the reference signal $y_{ref}$ can be handled as a disturbance in the system, which offsets the value of $y$ for the controller, it is not considered directly in the design of $K$.
The computation of the control input $u$ requires the states of the system because of the full-state feedback. Since $\dot\psi$ and $y$ are considered to be measured, an accurate observation of the state $v_y$ must be provided. Therefore, the performance criterion of the observer design is to minimize $(v_y - \hat{v}_y)^2$. In the implementation of the controller and the observer the coordinate system is handled as moving together with the vehicle. As a result, the lateral position of the vehicle in the implementation of the controller is equal to the measured lateral error $e_y$. Thus, for implementation purposes, the vector of the measured signals is $y_m^T = [\dot\psi, e_y]$. The values $\Delta_{min,i}$, $\Delta_{max,i}$ for the generation of the $\Delta$ signal are selected with the same absolute values, i.e., $|\Delta_{min,i}| = |\Delta_{max,i}|$ for all $i = 1, \dots, n$. The value for $v_y$ is ±0.15 m/s, for $\dot\psi$ it is ±0.1 rad/s and for $e_y$ it is ±0.2 m. Moreover, the design of the model-based observer requires the selection of $Q_o$ and $r_o$. In the given observer design problem it is suggested to select $Q_o$ as a diagonal matrix. The values in the main diagonal represent the priority of the $v_y$ observation, i.e., it is suggested to select a high value for the element of $Q_o$ related to $v_y$. Nevertheless, the selection of $Q_o$, $r_o$ is a tuning process, which is continued until the requested performance level of the observer is reached.
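The parameter values listed above can be assembled into numerical system matrices. The sketch below uses the textbook linear single-track (bicycle) model, which is an assumption and not necessarily the exact formulation of [16]; in particular, the lateral position kinematics are simplified to $\dot{y} = v_y$ (vehicle-fixed frame), and the measurement matrix assumes that the yaw rate and the lateral position are measured. All variable names are hypothetical.

```python
import numpy as np

# Parameters from the text
m, J = 2108.0, 1585.3              # mass (kg), yaw inertia (kg m^2)
C_f, C_r = 126_000.0, 126_000.0    # front/rear cornering stiffness (N/rad)
l_f, l_r = 1.47, 1.5               # axle distances from the center of gravity (m)
v = 20.0                           # design longitudinal velocity (m/s)

# State x = [yaw rate, lateral velocity, lateral position], input = steering angle.
# Textbook single-track dynamics (assumed form), with y_dot = v_y as a simplification.
A = np.array([
    [-(C_f * l_f**2 + C_r * l_r**2) / (J * v), -(C_f * l_f - C_r * l_r) / (J * v), 0.0],
    [-v - (C_f * l_f - C_r * l_r) / (m * v),   -(C_f + C_r) / (m * v),             0.0],
    [0.0,                                       1.0,                                0.0],
])
B2 = np.array([[C_f * l_f / J], [C_f / m], [0.0]])
C2 = np.array([[1.0, 0.0, 0.0],     # measured yaw rate
               [0.0, 0.0, 1.0]])    # measured lateral position / lateral error

# Kalman rank test: is the unmeasured lateral velocity observable from y_m?
obsv = np.vstack([C2 @ np.linalg.matrix_power(A, k) for k in range(A.shape[0])])
observable = np.linalg.matrix_rank(obsv) == A.shape[0]
```

Under these assumptions the pair $(A, C_2)$ has full observability rank, which is consistent with the requirement that $v_y$ be reconstructed from the measured yaw rate and lateral error.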
The result of the $\gamma$ minimization is illustrated in Figure 3. The initial candidate value of $\gamma$ is 100 and the achieved minimum value is 0.4028, see step 14. In step 9, with the candidate value 0.3906, the minimization fails, i.e., the matrices $X$, $Y$ do not exist. Since the value in step 14 is close to the value in step 9, it can be selected as the minimum of $\gamma$. The achieved $\gamma$ value guarantees the robustness of the system due to $\gamma < 1$.
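The reported candidate values are consistent with a simple search rule: halving from 100 gives $100/2^8 = 0.390625$ at step 9, and bisecting between the last feasible and the first infeasible candidate then gives 0.4028 at step 14. The exact rule is not stated in the text, so the sketch below is an assumption. The feasibility test is also a stand-in: `is_feasible` is a hypothetical threshold function that replaces the actual check of whether the Riccati inequalities admit solutions $X$, $Y$.

```python
def minimize_gamma(is_feasible, gamma0=100.0, max_steps=14):
    """Halve gamma while feasible, then bisect between the largest infeasible
    and the smallest feasible candidate (assumed search rule)."""
    history = []
    smallest_feasible = None
    largest_infeasible = None
    gamma = gamma0
    for _ in range(max_steps):
        ok = is_feasible(gamma)
        history.append((gamma, ok))
        if ok:
            smallest_feasible = gamma
        else:
            largest_infeasible = gamma
        if largest_infeasible is None:
            gamma = smallest_feasible / 2.0      # still feasible: keep halving
        elif smallest_feasible is None:
            gamma = largest_infeasible * 2.0     # nothing feasible yet: enlarge
        else:
            gamma = 0.5 * (smallest_feasible + largest_infeasible)  # bisection
    return smallest_feasible, history

# Hypothetical feasibility oracle: pretend the inequalities are solvable for gamma >= 0.40
best_gamma, history = minimize_gamma(lambda g: g >= 0.40)
```

With this stand-in oracle the ninth candidate is 0.390625 and is rejected, and the fourteenth candidate, 0.40283203125, is the smallest accepted value. In a real design the oracle would be replaced by a solver call for the joint Riccati inequalities.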

Training of the Learning-Based Observer
In the vehicle control example a neural network-based observer is implemented in order to increase the accuracy of the estimation process. The goal of the learning-based observer is to provide the estimate $\hat{x}_L$, which is computed from the following signals, all of which can be measured by the on-board sensors of the vehicle:
• longitudinal velocity $v$,
• lateral acceleration $\dot{v}_y$,
• steering angle $\delta$,
• yaw rate $\dot\psi$.
The training of the neural network is performed using supervised learning, for which a previously recorded training dataset is used. For the data generation several simulations are performed in the CarMaker vehicle dynamics simulation software. The steering angle of the vehicle is randomized within a reasonable range.
Generally, neural networks are able to handle fitting problems in which the process is influenced by strong nonlinearities. Neural networks consist of several layers, which can be divided into three main groups: the input layer, the hidden layers and the output layer. The layers consist of neurons, which are built up from activation functions and weights. Before the training process several parameters must be chosen, such as the number of layers and neurons, which are determined using the k-fold cross-validation technique [17]. Moreover, taking into account the chosen activation functions, the number of neurons can be determined, see [18].
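The k-fold cross-validation mentioned above partitions the training samples into k folds and uses each fold once for validation while the remaining folds are used for training. The following is a generic sketch of the index bookkeeping only; it is not the procedure of [17], and the function name `kfold_indices`, the fold count and the seed are hypothetical. Training and scoring of each candidate network structure would be placed inside the loop over the splits.

```python
import numpy as np

def kfold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)    # shuffle once
    folds = np.array_split(perm, k)      # k nearly equal folds
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx

splits = list(kfold_indices(n_samples=10, k=5))
```

Each candidate choice of layer and neuron numbers would be trained on `train_idx` and evaluated on `val_idx` for every split, and the choice with the lowest average validation error would be kept.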
In Figure 4 the structure of the selected neural network is illustrated, in which the inputs of the network are given by $y_{L,i}$ and the output is $\hat{x}_L$. Since the accuracy of the estimation process can be increased by considering past values, the neural network-based observer takes into account the actual and 3 past values of the measured attributes. The sampling time of the past values can be determined using spectral analysis [19]. In Figure 5 an example can be seen for the results of the estimation process with various numbers of neurons in the hidden layers ($n_1$: first hidden layer, $n_2$: second hidden layer). It can be seen that a better estimation accuracy can be achieved by increasing the number of neurons. However, using too many neurons leads to overfitting, which greatly decreases the usability of the network. Table 1 summarizes the main parameters of the neural network with the lowest sum of error value, i.e., the selected number of neurons and the types of activation functions. The training process is performed using a backpropagation algorithm, and the parameters of the neural network are calculated using the Levenberg-Marquardt optimization process.
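Using the actual and 3 past values of each measured signal means that every network input vector stacks four consecutive samples per signal. A minimal sketch of this stacking is given below. It assumes that the past values are taken at consecutive rows of an already resampled signal matrix; the function name `lagged_inputs` and the column ordering are hypothetical.

```python
import numpy as np

def lagged_inputs(signals, n_past=3):
    """Stack the current and n_past previous samples of every signal.

    signals: array of shape (T, n_signals), one row per time step.
    Returns an array of shape (T - n_past, n_signals * (n_past + 1)) whose rows
    are [s(t), s(t-1), ..., s(t-n_past)].
    """
    signals = np.asarray(signals, dtype=float)
    T = signals.shape[0]
    blocks = [signals[n_past - lag: T - lag] for lag in range(n_past + 1)]
    return np.hstack(blocks)

# Six time steps of four signals (e.g., v, lateral acceleration, delta, yaw rate)
demo = np.arange(24).reshape(6, 4)
X = lagged_inputs(demo, n_past=3)
```

With four measured signals and three past values each input vector has 16 elements, and the first three time steps are dropped because their history is incomplete.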

Simulation Results
Finally, the effectiveness of the proposed observer design framework is presented through a comprehensive simulation example. The simulations are performed in the CarMaker vehicle dynamics simulation software, in which the vehicle is driven along a predefined path. Two cases are compared. In the first case, the signals measured by the sensors on the vehicle are considered to be accurate. In the second case, additional high-amplitude noise is added to the measured signals in order to simulate the case when the learning-based observer provides an inaccurate $\hat{x}_L$. The goal of the simulations is to show that the state observation process can be improved through the learning-based observer and, furthermore, that the proposed design framework provides guarantees if the output of the learning-based observer is degraded.
In Figure 6a the reference trajectory of the vehicle can be seen, which is based on the data of a section of the Hockenheimring, Germany. The steering angle provided by the resulting controller $K$ is shown in Figure 6b. During the simulation example, the longitudinal velocity of the controlled vehicle is set to 50 km/h. In Figure 7 the yaw rate of the vehicle is depicted. The figure shows that the measured yaw rate signal is quite noisy. Using the proposed observer, which is augmented with the results of the neural network, the obtained yaw rate value can be used during the lateral control of the vehicle. Thus, the impact of the noise on the control performance can be reduced through the proposed method. In Figure 8a the estimated lateral velocities can be seen. The real value of the lateral velocity is represented by the black line and the red line illustrates the results of the model-based observer. Moreover, the lateral velocity provided by the observer extended with the neural network is shown by the blue line. It can be seen that with the proposed observer structure the results of the estimation process are more accurate than those of the purely model-based solution. Furthermore, the choice of $\Delta_{min}$, $\Delta_{max}$ has a high impact on the results. In the cases when the neural network provides more accurate results than the model-based observer, the estimation accuracy can be highly increased. When the neural network provides poor results, however, the avoidance of inaccuracy in the state observation process is guaranteed by the robust $H_\infty$ design of the model-based observer. In Figure 9 a statistical analysis, i.e., a histogram, is presented to illustrate the effectiveness of the observer. In the histogram the probability values of the estimation errors on $v_y$ are illustrated. The blue bars represent the results with the proposed method, in which the outputs of the neural network are taken into account. The statistical analysis confirms the conclusions of the simulation results.
Figure 9. Comparison between the model-based and the combined estimation.
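A histogram of this kind can be obtained by counting the estimation errors per bin and normalizing the counts so that they sum to one. The sketch below shows only this normalization step; the function name `error_probabilities`, the error samples and the bin edges are made-up illustrative values and are not results from the paper.

```python
import numpy as np

def error_probabilities(errors, bin_edges):
    """Return the empirical probability of the estimation error in each bin."""
    counts, edges = np.histogram(errors, bins=bin_edges)
    return counts / counts.sum(), edges

# Illustrative numbers only (not simulation results)
errors = np.array([-0.05, 0.0, 0.01, 0.02, 0.12])   # v_y estimation errors (m/s)
bin_edges = np.array([-0.15, -0.03, 0.03, 0.15])
probs, edges = error_probabilities(errors, bin_edges)
```

Samples outside the outermost bin edges are ignored by `numpy.histogram`, so the edges should cover the full error range when an unbounded estimator is included in the comparison.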
In the second example high-amplitude noise is added to the measurements, which leads to the inaccurate operation of the learning-based observer. The result of the observation of $v_y$ is shown in Figure 10. It can be seen that in this case the error between $\hat{x}_L$ and $x$ is significantly increased, and thus $\hat{x} \neq \hat{x}_L$ in most of the simulation. Nevertheless, the degradation of the observation process is limited, due to the limitation of $\Delta$. The statistical analysis of the results of the second simulation, in the form of a histogram, is found in Figure 11. It shows the main benefit of the method, i.e., the guarantee on the estimation error. With the proposed method the histogram is bounded, while without the limitation of $\Delta$ the neural network-based observer leads to a flatter histogram without limits on the error. Since in the proposed example the observed state $\hat{x}$ is used for control purposes, the impact of the observation accuracy on the tracking performance is also examined. Figure 12 shows the histogram of the absolute value of the lateral tracking error in the second scenario. It can be seen that the proposed method results in a reduced lateral error with increased probability, compared to the simulation with the learning-based observer. The improvement is approximately 25% for $|e_y| < 0.05$ m.

Conclusions
The simulation examples demonstrated the effectiveness of the method on a vehicle control problem from two aspects. First, through the design framework the higher accuracy of the learning-based observer can be utilized in the state observation process. Second, if the learning-based observation degrades, the degradation of the state observation process can be limited. The advantageous operation of the designed observer is achieved through the robust $H_\infty$ observer design method. The effectiveness of the design method has been illustrated through simulations on the example of state estimation for the lateral dynamics of vehicles. The statistical evaluation of the results has shown that the estimation error can be limited through the proposed method and, at the same time, the performance level of the observation process is improved. Finally, the observer design method is extended with the design method of the controller, which leads to a joint $H_\infty$ design.
Nevertheless, the proposed method has some limitations, which must be handled in its practical application. First, the training of the learning-based observer requires a large amount of data, which can be obtained through simulations or test measurements. In some applications the collection of a large amount of data can be expensive, and unsuccessful scenarios during the training process can lead to critical operation. Therefore, a challenge of learning-based observation is to provide a method with which a quantitative index of the required number of samples can be given. A further limitation of the method is that it uses a linear model for the design of the model-based observer. A future goal is to extend the design process to further classes of systems, e.g., linear parameter varying systems. Another future challenge is to extend the state observation process to a prediction method for the states. This requires the development of a method for the comprehensive analysis of the learning-based agent, e.g., observer or predictor. In the proposed method the decision on the acceptability of the learning-based observation is based on the actual signals, but for prediction the output of the learning-based agent on a longer horizon must be examined, which is a challenge in the design framework. Furthermore, another challenge is to guarantee observability with the learning-based observer for a system which is otherwise unobservable. This can require an analysis method for the global observability of the augmented system, i.e., the system together with the learning-based observer.