A New Method for Determining the Degree of Controllability of State Variables for the LQR Problem Using the Duality Theorem

The control performance of a dynamic system can be checked by the degree of controllability. In this work, we present a new method for determining the degree of observability of state variables for the linear quadratic optimal estimation (LQE) problem. We carried out the calculation of the degree of controllability for the linear quadratic optimal control (LQR) problem using a duality theorem. Compared with the traditional measures of controllability, such as the determinant, trace, and maximal eigenvalue of the inverse controllability Gramian, the proposed degree of controllability is defined for each state variable and takes into account both the controllability Gramian and the cost function. The new method is convenient to apply to the LQR problem. In the numerical simulation, we determined the influence of the model parameters on the degree of controllability. In addition, we analyzed the degree of controllability, which gives an insight into the relationship between the system model design and the control performance.


Introduction
"Degree of controllability" is a scalar measure of controllability to check how controllable a given system is [1][2][3]. It reflects the efficiency of the control effort.
The check of controllability only provides an answer to the binary question of whether or not a dynamical system is completely controllable. However, in practice we always need to know how controllable a given system is. Therefore, several authors proposed the concept of degree of controllability (DOC) as a measure of the controllability of a given system [1][2][3]. The degree of controllability analysis can help us choose the optimal parameters, number, and locations of actuators for a control system, which is of substantial importance in the aerospace industries for solving optimal control problems (actuator placement in flexible spacecraft, sensor and actuator positioning for active vibration control, etc.) [2][4][5][6]. In addition, studying how important each control variable is to the various outputs of the system is useful for decoupling or model reduction [2,7].
There are many definitions of the degree of controllability. The most widely known definition is related to the minimum control input energy required to transfer any initial state $x_0$ to the origin [1][2][3][4][7]. If a system requires less input energy than others to be regulated, it can be considered more controllable [3].
Kalman et al. first discussed the method for determining the degree of controllability in the work [1], where the authors considered the weighted trace of the controllability Gramian as a measure of models and obtain the characteristics of the nonstationarity. The study of the degree of observability and controllability using these methods makes it possible to obtain a more complete connection between the qualitative characteristics of the systems under study.
The paper has the following structure. In Section 2, we present our study of the duality of the recursive process $P$ in linear quadratic regulation (LQR) and estimation (LQE). In Section 3, we present a method for determining the degree of observability of state variables. In Section 4, the degree of controllability is calculated in a similar way by using the duality theorem. In Section 5, we determine the dependence of the degree of controllability on the parameters of the model matrix and the control matrix. The numerical example of an inverted pendulum on a carriage shows that the degree of controllability analysis can help us predict the control performance of different state variables. As the degree of controllability increases, the settling time of the response process becomes shorter and the amplitude of the response process decreases.

The Duality of Linear Quadratic Regulation (LQR) and Estimation (LQE)
Linear quadratic regulation (LQR) and linear quadratic estimation (LQE) are widely used control and estimation methods. These methods are based on Pontryagin's maximum principle and Bellman's dynamic programming [31][32][33]. In 1960, Kalman established the duality theorem while solving the optimal control (LQR) and optimal estimation (LQE) problems [19,20]. Kalman proved that the filtering problem is the dual of the noise-free regulator problem. The solutions of the optimal estimation problem and the optimal regulator problem are equivalent under the duality relations [19,20].

Linear Quadratic Estimation (LQE)
Let us consider a linear dynamic system in which $x_k$ is the $n$-dimensional state vector of the system (the components $x_k^i$ of $x_k$ are called state variables); $w_k$ is an $r$-dimensional Gaussian random process noise vector with zero mean and covariance matrix $Q = E[w_j w_k^T]$; $z_k$ is the $m$-dimensional measurement vector; $v_k$ is an $m$-dimensional Gaussian random measurement noise vector with zero mean and covariance matrix $R = E[v_j v_k^T]$; $F_{k,k-1}$ is the $n \times n$ transition matrix; and $H_k$ is the $m \times n$ observation matrix. Moreover, $v_j$ and $w_k$ are uncorrelated with each other, i.e., $E[v_j w_k^T] = 0$ for any $j$ and $k$. In the LQE problem, the optimal solutions $P_k$ and $K_k$ are generated by recursion relations [19,20] in which $\hat{x}_k$ is the estimate of the state vector, $\tilde{x}_k = x_k - \hat{x}_k$ is the estimation error, $P_k$ is the covariance matrix of the estimation errors $\tilde{x}_k$, and $K_k$ is the optimal gain matrix. $P_k$ and $K_k$ are obtained iteratively by minimizing the cost function $J_k = P_k = E[\tilde{x}_k \tilde{x}_k^T]$. Once the initial conditions $P_0$, $Q$, and $R$ are set, the optimal values of $K_k$ and $P_k$ can be calculated iteratively.
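The displayed relations for the LQE problem are not reproduced in this text. As a reference sketch only, the textbook discrete Kalman filter that matches the definitions above can be written as follows. It assumes that the process noise enters the state equation additively (so that $r = n$) and introduces the symbol $P_{k/k-1}$ for the predicted covariance, which is not part of the original notation; the exact form used in [19,20] may differ.

```latex
\begin{aligned}
x_k &= F_{k,k-1}\,x_{k-1} + w_{k-1}, \qquad z_k = H_k x_k + v_k,\\
P_{k/k-1} &= F_{k,k-1}\,P_{k-1}\,F_{k,k-1}^{T} + Q,\\
K_k &= P_{k/k-1} H_k^{T}\left(H_k P_{k/k-1} H_k^{T} + R\right)^{-1},\\
\hat{x}_k &= F_{k,k-1}\,\hat{x}_{k-1} + K_k\left(z_k - H_k F_{k,k-1}\,\hat{x}_{k-1}\right),\\
P_k &= \left(I - K_k H_k\right) P_{k/k-1}.
\end{aligned}
```

Here $\hat{x}_k$ denotes the state estimate and $I$ the identity matrix. The recursion runs forward in time from the initial covariance $P_0$.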

Linear Quadratic Regulator (LQR)
Consider a dynamic system in which $x_k$ is the $n$-dimensional state vector of the system, the control law $u_k$ is an $r$-dimensional vector with $u_k = -K_k x_k$, $F_{k+1,k}$ is the $n \times n$ state transition matrix, $B_k$ is the $n \times r$ control matrix, and $K_k$ is the $r \times n$ optimal control gain matrix.
In the LQR problem, the optimal solutions $P_k$ and $K_k$ are determined by recursion relations [19,20] in which $P_k$ is the matrix of the quadratic form of the performance index under optimal regulation, $Q$ is the state-cost weighting matrix, $R$ is the control-cost weighting matrix, and $K_k$ is the $r \times n$ optimal control gain matrix, which is determined by minimizing the cost function. $Q$ and $R$ are positive-definite symmetric matrices. It should be noted that the physical meanings of $P_k$, $Q$, and $R$ in the LQR problem and in the LQE problem are different. The term $x_k^T Q x_k$ denotes the distance between the actual state and the desired state in the hyperplane, and $u_k^T R u_k$ denotes the cost of the energy. The selection of $Q$ and $R$ influences the performance of the transient response process.
The optimal values of $K_k$ and $P_k$ can be calculated iteratively after setting the initial conditions $P_N$, $Q$, and $R$, where $N$ is the terminal time.
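For orientation, a standard finite-horizon form of the LQR problem that is consistent with the symbols defined above is sketched below. This is the usual textbook statement, given here as an assumption because the displayed equations are not reproduced in this text; in particular, the terminal term $x_N^T P_N x_N$ and the absence of a factor $1/2$ in the cost are choices made for this sketch.

```latex
\begin{aligned}
x_{k+1} &= F_{k+1,k}\,x_k + B_k u_k, \qquad u_k = -K_k x_k,\\
J &= x_N^{T} P_N x_N + \sum_{k=0}^{N-1}\left(x_k^{T} Q x_k + u_k^{T} R u_k\right),\\
K_k &= \left(R + B_k^{T} P_{k+1} B_k\right)^{-1} B_k^{T} P_{k+1} F_{k+1,k},\\
P_k &= Q + F_{k+1,k}^{T} P_{k+1}\left(F_{k+1,k} - B_k K_k\right).
\end{aligned}
```

In contrast to the estimation recursion, this recursion runs backward in time, from the terminal condition $P_N$ toward $k = 0$.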

Duality Theorem
Consider a dynamic model in the LQR regulation problem and its dual system model in the LQE estimation problem. According to the duality theorem [19,20], if the initial conditions $P_N$, $Q$, and $R$ given for these two systems are respectively equal, then the optimal solutions $P_k$ and $K_k$ obtained for systems (5) and (6) are also the same: the matrices not only have equal ranks, but every element of $P_k$ and $K_k$ coincides.
Since the optimal solutions $P_k$ and $K_k$ of the LQR problem and of the LQE problem for the dual system are the same, the criteria for qualitative characteristics based on $P$, $Q$, and $R$ are also the same for the original system and the dual system. That is, the degrees of observability and controllability are dual. This duality can be demonstrated in various forms; see, for example, [4,7,10].
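The duality statement can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes a time-invariant system with a scalar control, uses the one-step-predictor form of the filter (the recursion for the prior covariance), and takes the dual estimation model to have transition matrix $F^T$ and observation matrix $B^T$. The function names and the double-integrator example are hypothetical. Exact rational arithmetic is used so that the two recursions can be compared element by element.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def riccati_update(Q, A, b, c, s):
    """Return Q + A - (b c) / s for an n x 1 column b and a 1 x n row c."""
    n = len(Q)
    return [[Q[i][j] + A[i][j] - b[i][0] * c[0][j] / s for j in range(n)]
            for i in range(n)]

def lqr_step(F, B, Q, R, P_next):
    """Backward Riccati step for x_{k+1} = F x_k + B u_k with scalar control."""
    BtP = mat_mul(transpose(B), P_next)              # 1 x n
    s = R + mat_mul(BtP, B)[0][0]                    # scalar R + B^T P B
    BtPF = mat_mul(BtP, F)                           # 1 x n
    K = [[x / s for x in BtPF[0]]]                   # regulator gain, 1 x n
    FtP = mat_mul(transpose(F), P_next)
    P = riccati_update(Q, mat_mul(FtP, F), mat_mul(FtP, B), BtPF, s)
    return P, K

def lqe_step(Phi, H, Q, R, M):
    """Forward Riccati step for the prior covariance M with scalar measurement."""
    MHt = mat_mul(M, transpose(H))                   # n x 1
    s = R + mat_mul(H, MHt)[0][0]                    # scalar R + H M H^T
    PhiMHt = mat_mul(Phi, MHt)                       # n x 1
    K = [[x[0] / s] for x in PhiMHt]                 # predictor gain, n x 1
    HMPhit = mat_mul(mat_mul(H, M), transpose(Phi))  # 1 x n
    PhiMPhit = mat_mul(mat_mul(Phi, M), transpose(Phi))
    return riccati_update(Q, PhiMPhit, PhiMHt, HMPhit, s), K

# Hypothetical example: double integrator with unit weights.
I2 = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
F = [[Fraction(1), Fraction(1)], [Fraction(0), Fraction(1)]]
B = [[Fraction(0)], [Fraction(1)]]
Q, R = I2, Fraction(1)

P, M = I2, I2  # equal initial conditions for both recursions
for _ in range(5):
    P, K_lqr = lqr_step(F, B, Q, R, P)                        # original system
    M, K_lqe = lqe_step(transpose(F), transpose(B), Q, R, M)  # dual system

print(P == M, K_lqe == transpose(K_lqr))
```

In this sketch the two $P$ sequences coincide exactly, while the gains coincide up to a transpose, because the regulator gain is a row vector and the predictor gain is a column vector.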
Next, we consider a numerical criterion for the degree of observability, which is more useful in practice. By applying this criterion to the state variables of the dual system, it is possible to calculate the degree of controllability of the corresponding state variables of the original system.

A Method for Determining the Degree of Observability of State Variables
A qualitative characteristic of the observed components of the state vector is the degree of observability. The degree of observability is an abstract characteristic, and the approach presented in [34][35][36] allows for determining which of the components of the state vector are observed better.
The approach not only gives an assessment of the quality of observability of the system but also allows a comparison of the observability of the components of the state vectors of various systems.
We assume that the directly measured components of the state vector have the maximum degree of observability. The degree of observability of a directly measured component of the state vector is considered to be 1.
Assume that one component of the state vector is measured, so that the observation matrix is $H_k = [1 \;\, 0 \;\, \cdots \;\, 0]$.
Divide each measurement interval into $n$ subintervals and express the measurements $z_1, z_2, \ldots, z_n$ in terms of the state $x_1$ at the first subinterval.
These relations can also be written in matrix form. From Equation (8), the state vector $x_1$ can be rewritten in terms of the measurements. We introduce the concept of equivalent measurement noise. According to Equation (7), the equivalent measurement noise for an arbitrary component of the state vector is formed with the coefficients $\alpha_{i1}, \alpha_{i2}, \ldots, \alpha_{in}$, which are the elements in the $i$th row of the matrix $O^{-1}$. The variance $R_i^*$ of the equivalent measurement noise for the $i$th component is determined by the coefficients $\alpha_{i1}, \alpha_{i2}, \ldots, \alpha_{in}$.
Here $R = E[v_j v_k]$ is the variance of the measurement noise $v$ of the directly measured component of the state vector. We can judge a measure of observability by two characteristics: accuracy of estimation and time of convergence. The criterion for the degree of observability is given in [34][35][36]; in it, $P_0^1$ is the initial variance of the directly measured component of the state vector. The criterion (12) may be used for any $t_0$.
The degree of observability for an arbitrary state component is a scalar in the criterion (12). Thus, the proposed criterion allows comparing the degrees of observability for a specific state component in various systems.
This criterion directly uses the ratio of $P$ and $R$. If the variance $P$ of the directly measured state component converges in one step, then $P$ for the other state components may need several steps to converge. According to Shannon's information theory, if the projection of the measurement noise $v^*$ on the $i$th coordinate is relatively small, then the $i$th state component receives more information from the measurement vector $z^*$. As a result, the estimation error of the $i$th component is reduced to 0 faster. The larger the value of $P_0^i / R_i^*$, the faster the performance index $P$ converges. This analysis is applicable to any initial time $t_0$.
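The procedure of this section can be sketched in code. The sketch below rests on several assumptions that are not confirmed by the text, since the displayed equations are not reproduced here: the system is time-invariant; the matrix $O$ has the rows $H F^{j}$, $j = 0, \ldots, n-1$; the stacked noise components are treated as uncorrelated with equal variance $R$, so that $R_i^* = R \sum_j \alpha_{ij}^2$; and criterion (12) is taken in the form $P_0^i R / (P_0^1 R_i^*)$, which gives the value 1 for the directly measured component. All function names are hypothetical.

```python
from fractions import Fraction

def inverse(A):
    """Gauss-Jordan inverse in exact rational arithmetic (A must be nonsingular)."""
    n = len(A)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(A)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)  # pivot row
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [x / piv for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def observability_matrix(F, H):
    """Rows H, H F, ..., H F^(n-1) for a single-row observation matrix H."""
    n = len(F)
    rows, row = [], list(H)
    for _ in range(n):
        rows.append(row)
        row = [sum(row[k] * F[k][j] for k in range(n)) for j in range(n)]
    return rows

def degree_of_observability(F, H, R, P0_diag):
    """Assumed form of criterion (12); component 0 is the directly measured one."""
    O_inv = inverse(observability_matrix(F, H))
    # Assumed equivalent noise variance: R times the squared row norm of O^{-1}.
    R_star = [R * sum(a * a for a in row) for row in O_inv]
    return [P0_diag[i] * R / (P0_diag[0] * R_star[i]) for i in range(len(F))]

# Hypothetical example: position-velocity model with the position measured.
F = [[1, 1], [0, 1]]
H = [1, 0]
doo = degree_of_observability(F, H, Fraction(1), [Fraction(1), Fraction(1)])
print(doo)
```

For this example the directly measured position has degree 1 and the velocity has degree 1/2, which agrees with the convention that a directly measured component has the maximum degree of observability.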

A Method for Determining the Degree of Controllability of State Variables
Assume, without loss of generality, that the control u is a scalar, and the control matrix B is an n × 1 matrix.
The cost function is the quadratic LQR cost with a scalar control $u$. Since the values of $P_k$ and $K_k$ in the estimation process of the dual system and of $P_k$ and $K_k$ in the regulation process of the original system are the same, we can use the degree of observability (12) of the dual system to determine the degree of controllability of the original system.
The dual system for the estimation problem is formed as in Section 2. The controllability matrix of the dual system is denoted by $O$, and the equivalent measurement noise is formed with the coefficients $\alpha_{i1}, \alpha_{i2}, \ldots, \alpha_{in}$, which are the elements in the $i$th row of the matrix $O^{-1}$. Thus the degree of observability (12) of the dual system is expressed through $P_N^1$ and $R_1$, the initial conditions of the estimation process in the dual system. Since the initial conditions $P_N$, $Q$, and $R$ of the dual system are the same as those of the original system, the degree of controllability of the original system takes the same form, where $R_1$ is the initial control-cost weighting matrix for the control $u$ and $R_i^*$ is the equivalent control-cost weighting matrix of the $i$th component of the state vector, whose value is the same as in (17). As with Equation (17), the criterion (18) for the degree of controllability can be used for any terminal time $t_N$.
The physical meaning of expression (18) may be explained as follows. If the projection of the control costs $R_i^*$ on the $i$th coordinate is smaller, less energy is consumed to control the $i$th state component. In other words, the $i$th state variable is easier to control. This idea was first mentioned in [1] to evaluate the controllability of the whole system.
In summary, we proposed a new method for determining the degree of controllability of specific state variables using the duality theorem. This criterion allows for comparing the degree of controllability of state components in different systems.
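The central quantity of the method, the ratio $R / R_i^*$, can be computed with a short sketch. It is an illustration under stated assumptions rather than the authors' code: the system is time-invariant with a scalar control; the matrix $O$ of the dual system is built with the rows $B^T (F^T)^{j}$, $j = 0, \ldots, n-1$; and $R_i^* = R \sum_j \alpha_{ij}^2$, as in the assumed form for the observability case. Only the ratio is computed, because the full expression (18) is not reproduced in this text. The function names and the example system are hypothetical.

```python
from fractions import Fraction

def inverse(A):
    """Gauss-Jordan inverse in exact rational arithmetic (A must be nonsingular)."""
    n = len(A)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(A)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)  # pivot row
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [x / piv for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def dual_matrix(F, B):
    """Rows B^T (F^T)^j = (F^j B)^T for a single-column control matrix B."""
    n = len(F)
    rows, v = [], list(B)
    for _ in range(n):
        rows.append(v)
        v = [sum(F[i][k] * v[k] for k in range(n)) for i in range(n)]
    return rows

def control_cost_ratios(F, B, R):
    """Ratios R / R*_i; a larger ratio means the component is easier to control."""
    O_inv = inverse(dual_matrix(F, B))
    # Assumed equivalent control cost: R times the squared row norm of O^{-1}.
    R_star = [R * sum(a * a for a in row) for row in O_inv]
    return [R / r for r in R_star]

# Hypothetical example: double integrator, control acts on the velocity.
F = [[1, 1], [0, 1]]
B = [0, 1]
ratios = control_cost_ratios(F, B, Fraction(1))
print(ratios)
```

In this example the velocity, on which the control acts directly, has ratio 1, while the position has ratio 1/2, so under these assumptions the position is the harder component to control.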

Simulation Results and Its Discussion
In order to study the proposed degree of controllability for specific state variables, we used the stabilization model of a pendulum on a carriage as an example. We studied the degree of controllability and the control performance of the system under different model parameters. The inverted pendulum model is shown in Figure 1, which introduces the following symbols: $m$, pendulum mass; $M$, carriage mass; $f$, force applied by the motor to the carriage through the belt; $2l$, pendulum length; $l$, distance from the pivot point to the center of mass of the pendulum; $I$, moment of inertia of the pendulum; $\theta(t)$, deviation angle of the pendulum from the vertical (zero at the position of unstable equilibrium, increasing clockwise); $x(t)$, carriage position.
Near the equilibrium point, the nonlinear model can be linearized. The discrete state-space model of the system is written in terms of the state transition matrix $F$, the control matrix $B$, the output matrix $C$, and the input $f_k$. Here $\Delta T$ is the sample time, $\Delta T = 1$ s [37,38], and
$$D = \det \begin{bmatrix} M + m & ml \\ ml & I + ml^2 \end{bmatrix} = (I + ml^2)(M + m) - m^2 l^2.$$
The degree of controllability for each component is expressed through $R$, the control-cost weighting matrix; $R_i^*$, the equivalent control-cost weighting matrix of the $i$th state component; and $P_N^i$, the cost function of the $i$th component of the state vector.
Since the controllability matrix $O$ and its inverse $O^{-1}$ are difficult to express analytically, we study the degree of controllability by simulation. From Equation (17), it is clear that the larger the value of $R / R_i^*$, the larger the degree of controllability $DoC_i$ of the $i$th component, i.e., the easier the $i$th state component is to control. The ratio $R / R_i^*$ depends on the parameters of the matrices $B$ and $F$. We studied the relationship between the degree of controllability $DoC_i$ and the parameters $M$, $m$, $l$, and $g$.
The parameters of the system model are given in Table 1. The initial values of the weighting matrices $P_N$, $Q$, and $R$ are set before the simulation. We investigated the control performance for different values of the parameters $m$ and $l$. First, we studied the system as $m$ changes while the other parameters keep the values shown in Table 1. The resulting relationship between $DoC_i$ and $m$ is shown in Figures 2 and 3: as the pendulum mass $m$ increases, the value of $R / R_i^*$ decreases and the degree of controllability $DoC_i$ decreases, i.e., the state variables of the system become harder to control.
(2) Pendulum angle $\theta$ response process (Figure 5)
The simulation results confirm the relationship between the degree of controllability $DoC_i$ and the change in the parameter $m$. When $m$ increases, the degree of controllability $DoC_i$ becomes smaller, the system converges from the initial state $x_0$ more slowly, and the amplitude of the response becomes larger.
Next, the other parameters are fixed and only $l$ changes. The control performance is shown in Figures 8 and 9.
(1) Carriage position $x$ response process
(2) Pendulum angle $\theta$ response process
With a relatively small rod length $l$, the initial state $x_0$ of the system converges to 0 more quickly and the amplitude of the response process is smaller, i.e., the system becomes easier to control. The settling time and the rise time of the system become shorter. The simulation results can be explained by physical principles: the shorter the rod is, the easier it is to control the deviation angle of the pendulum. Consequently, the response process converges more quickly.

Conclusions
In this work, we proposed a new method for determining the degree of controllability of specific state variables. The proposed method considers not only the effect of the controllability/observability Gramian, but also the effect of the cost function and the weighting matrices. This makes the method convenient for investigating LQR (linear quadratic optimal control) problems. We carried out a numerical simulation of the inverted pendulum on a carriage. Furthermore, we demonstrated the relationship between the degree of controllability and the control performance through the transient response processes at different system model parameters. We determined the influence of the model parameters on the degree of controllability, which allows us to design system models with the desired properties. The proposed criterion can compare the degree of controllability of the same component of the state vector in two different systems.