Article

Design of Optimal Controllers for Unknown Dynamic Systems through the Nelder–Mead Simplex Method

1 Department of Biomechatronics Engineering, National Pingtung University of Science and Technology, Pingtung 91201, Taiwan
2 Department of Mechanical and Mechatronic Engineering, National Taiwan Ocean University, Keelung 20224, Taiwan
3 Department of Mechanical Engineering, National Central University, Taoyuan 320317, Taiwan
* Author to whom correspondence should be addressed.
Mathematics 2021, 9(16), 2013; https://doi.org/10.3390/math9162013
Submission received: 29 July 2021 / Revised: 16 August 2021 / Accepted: 19 August 2021 / Published: 23 August 2021

Abstract

This paper presents an efficient method for designing optimal controllers. First, we established a performance index according to the system characteristics. To ensure that this performance index is applicable even when the state/output of the system is not within the allowable range, we added a penalty function. When a given controller keeps the state/output of the system within the allowable range throughout the preset time interval, the penalty function value is zero. Conversely, if the system state/output leaves the allowable range before the preset termination time, the experiment/simulation is terminated immediately, and the penalty function value is proportional to the time difference between the preset termination time and the time at which the experiment was terminated. Then, we used the Nelder–Mead simplex method to search for the optimal controller parameters. The proposed method has the following advantages: (1) the dynamic equation of the system need not be known; (2) the method can be used regardless of the stability of the open-loop system; (3) the method can be used in nonlinear systems; (4) the method can be used in systems with measurement noise; and (5) the method can improve design efficiency.

1. Introduction

In the real world, all dynamic systems are nonlinear; linear systems exist only in theory. Many methods of analysis and control have been proposed for nonlinear systems [1,2]. Among these, Lyapunov’s method can not only be used to analyze the stability of nonlinear systems; it is also often used to design feedback controllers. At present, many nonlinear control techniques based on Lyapunov theory have been proposed and applied to actual physical systems, such as Lyapunov redesign, backstepping, adaptive control, sliding mode control, etc. The recent developments and applications of the above methods are as follows.
Tavasoli and Enjilela utilized Lyapunov redesign to stabilize the vibration of a boundary-controlled flexible rectangular plate in the presence of exogenous disturbances [3]. Xu et al. applied output-feedback Lyapunov redesign to control a magnetic suspension system [4]. A backstepping controller was applied to control a quadrotor unmanned aerial vehicle [5], a microgyroscope [6], and multiphase motor drives [7]. Adaptive control is a control method that can adapt to parameter changes or initially uncertain controlled systems. Recently, Liu et al. developed a novel adaptive fault-tolerant control strategy to suppress the vibrations of a flexible panel [8]. Liang et al. proposed neural-network-based event-triggered adaptive control for nonaffine nonlinear multiagent systems with dynamic uncertainties [9]. Wang and Na presented parameter estimation and adaptive control for servo mechanisms with friction compensation [10]. Since the publication of the survey paper on sliding mode control in IEEE Transactions on Automatic Control in 1977 [11], sliding mode control has been extensively studied and used in practical applications due to its simplicity and robustness with respect to external disturbances and modeling uncertainties [12,13,14]. In addition, gain scheduling is used to design corresponding linear controllers for nonlinear systems at different operating points or regions [15,16]. Feedback linearization is based on the theory of differential geometry to find an appropriate conversion between the control input and state variables and to convert the nonlinear system into an equivalent linear system. This method has been used for induction motors [17], boost converters [18], an unmanned bicycle robot [19], etc.
However, these methods usually require information on the approximate/nominal dynamic equation of the system and the upper bound of uncertainty or disturbance. Moreover, real controllers are always limited in magnitude. Therefore, in the actual design of a controller, we usually adopt a trial-and-error method. However, during adjustment of the parameters, the state or output of the system may not be within the allowable range. When this occurs, the experiment must be immediately terminated and relevant safety or protective measures must be initiated to prevent injury to the operator or damage to equipment. Therefore, designing a controller with good performance for real applications is difficult and time consuming.
To overcome the aforementioned difficulties and complexity associated with designing actual controllers, we propose an efficient and optimized controller design method for nonlinear unstable systems. First, we establish a performance index (objective function) according to the characteristics of the system to be controlled. This performance index may include a time function and any measurable system state or output signal; the dynamic equation of the system need not be known in advance. Moreover, the performance index itself or the dynamic equation of the system can also contain non-differentiable terms. Most importantly, to ensure that the performance index is applicable even when the state or output of the system is not within the allowable range, we add a special penalty function. The penalty function is used to solve constrained optimization problems. It is used to convert constrained problems into unconstrained problems by introducing an artificial penalty for violating the constraint.
The implication of the penalty function proposed in this paper is as follows. We assume that for the same system, the time interval between the beginning and termination of each experiment or simulation is fixed. When we use a certain set of controller parameters to control the system, if the state or output of the system is within the allowable range within the preset termination time, the penalty function value is zero. Conversely, if the system state or output is not within the allowable range before the preset termination time, the experiment or simulation is terminated immediately, and the penalty function value is proportional to the time difference between the preset termination time and the time at which the experiment was terminated.
To keep the performance index as low as possible, we used the Nelder–Mead (N–M) simplex method (also known as the downhill simplex method) to search for controller parameters. The original concept of the N–M simplex method was proposed by Spendley, Hext, and Himsworth [20], and it was further improved by Nelder and Mead [21]. The convergence properties of the N–M simplex method are discussed in references [22,23,24,25]. Because the N–M simplex method is easy to implement, it has been widely used to search for the minimum or maximum value of the objective function in a multidimensional parameter space. In particular, the N–M simplex method does not require the derivative of the objective function; it therefore remains applicable even if the real system is nondifferentiable or the objective function values contain noise. The N–M simplex method continues to be applied in different fields, such as parameter estimation [26,27], optimization of machining parameters [28], power plant optimization [29], optimization of the production parameters for bread rolls [30], etc.
To verify the feasibility of the proposed method, we adopted an inverted pendulum system with measurement noise as an example. Then, we employed the N–M simplex method to search for the controller parameters iteratively. The simulation results revealed that even if the initial controller parameters cannot stabilize the system, after the algorithm reaches the iterative termination condition we set in advance, the system is stable and exhibits good transient response performance.

2. Design of Optimal Controllers for Unknown Parameter Systems

Optimal control is a branch of optimization problems that deals with finding a controller for a dynamical system over a period of time such that an objective function is optimized [31,32]. An objective function is usually called a performance index in the field of control. The purpose of optimization is to obtain a parameter vector such that the objective function is at a minimum. However, in many cases, the choice of parameters is not arbitrary but subject to certain restrictive conditions. We term this the constrained optimization problem. A general constrained minimization problem may be written as follows:
$$\min\; J(\mathbf{p})$$
$$\text{subject to}\quad g_i(\mathbf{p}) \le 0,\quad i = 1, 2, \ldots, m_1$$
$$h_j(\mathbf{p}) = 0,\quad j = 1, 2, \ldots, m_2$$
For a constrained optimization problem, we usually convert the constraints into a suitable penalty function and add this function to the original objective function. Thus, we transform a constrained optimization problem into an unconstrained problem; moreover, the solution of the unconstrained problem converges to the solution of the original constrained problem.
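As a small illustration of this conversion (a toy problem of our own, not from the paper), a quadratic penalty turns a constrained minimization into an unconstrained one:

```python
# Toy example (ours): minimize J(p) = (p1 - 2)^2 + (p2 - 1)^2
# subject to g(p) = p1 + p2 - 2 <= 0, via a quadratic penalty.

def objective(p):
    return (p[0] - 2.0) ** 2 + (p[1] - 1.0) ** 2

def constraint(p):
    return p[0] + p[1] - 2.0          # feasible when <= 0

def penalized(p, mu=1e6):
    # Feasible points are unchanged; violations are charged mu * g(p)^2,
    # so minimizers of the penalized function are pushed toward feasibility.
    g = constraint(p)
    return objective(p) + (mu * g ** 2 if g > 0.0 else 0.0)
```

For a sufficiently large penalty weight mu, any unconstrained search (such as the N–M simplex method used below) applied to `penalized` approximates the solution of the original constrained problem.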
In the field of optimization control, the commonly used performance indices are as follows [31,32,33]:
  • Integral squared error (ISE):
    $$J = \int_0^{T_f} e^2(t)\,dt$$
    The smaller the value of this index, the closer the error of the control system in the time interval [0, T_f] is to zero.
  • Integral absolute error (IAE):
    $$J = \int_0^{T_f} |e(t)|\,dt$$
    The meaning of this index is similar to that of the ISE.
  • Integral time-weighted absolute error (ITAE):
    $$J = \int_0^{T_f} t\,|e(t)|\,dt$$
    The smaller the value of this index, the closer the error of the control system in the time interval [0, T_f] is to zero and the faster the convergence.
  • Integral time-squared error (ITSE):
    $$J = \int_0^{T_f} t\,e^2(t)\,dt$$
    The meaning of this index is similar to that of the ITAE.
  • Quadratic performance index:
    $$J = \mathbf{x}^T(T_f)\,F(T_f)\,\mathbf{x}(T_f) + \int_0^{T_f} \left( \mathbf{x}^T Q\, \mathbf{x} + \mathbf{u}^T R\, \mathbf{u} \right) dt$$
    where T_f, e, x = [x_1 x_2 ⋯ x_n]^T, and u = [u_1 u_2 ⋯ u_m]^T denote the terminal time, output error, system state, and control input, respectively. Additionally, F and Q are positive semidefinite matrices, and R is a positive definite matrix. When the control objective is to keep the state small, the control input not too large, and the final state as close to zero as possible, we can use this performance index.
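Given a sampled error signal, the first four indices above can be approximated by simple Riemann sums (a sketch; the function and variable names are ours):

```python
import numpy as np

# Discrete approximations of the performance indices, assuming the error
# has been sampled as e[k] = e(k * dt) over [0, Tf].
def ise(e, dt):
    return float(np.sum(e ** 2) * dt)           # integral squared error

def iae(e, dt):
    return float(np.sum(np.abs(e)) * dt)        # integral absolute error

def itae(e, dt):
    t = np.arange(len(e)) * dt                  # sample times
    return float(np.sum(t * np.abs(e)) * dt)    # time-weighted absolute error

def itse(e, dt):
    t = np.arange(len(e)) * dt
    return float(np.sum(t * e ** 2) * dt)       # time-weighted squared error
```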
The method proposed in this study does not require information on the dynamic equation of the system in advance, and it can use any of the above performance indices or other suitable performance indices. To ensure that the selected performance index is applicable even when the state or output of the system is not within the allowable range, we add a penalty function to the performance index. Before formally defining the penalty function, we first assume that, for the same system, the start time of each experiment (or simulation) is zero, and the terminal time is fixed and represented by T f . If the state or output of the system is within the allowable variation range until the terminal time is reached, the penalty function weight W r is equal to zero. Conversely, if the state or output leaves the allowable variation range before reaching the terminal time, the experiment (or simulation) is terminated immediately; we denote this instant as T r , and we then let the penalty function weight W r be a sufficiently large positive constant. Then, we define the penalty function as follows:
$$P = W_r \left( T_f - T_r \right)$$
To design controller parameters such that the performance index reaches the minimum value, we use the N–M simplex method to search for the controller parameters.
The N–M simplex method proposed by Nelder and Mead [21] is used for solving N-dimensional unconstrained optimization problems of the following form:
$$\min_{\mathbf{p} \in \mathbb{R}^N} J(\mathbf{p})$$
where J ( p ) is defined as an objective function, which is usually called the performance index in the control field.
After the form of the performance index is determined, the N–M simplex method generates a sequence of simplices, where each simplex is defined by N + 1 distinct vertices, namely, p_0, …, p_N, for which the corresponding function values are J_0, …, J_N. The points p_0, …, p_N are assumed to be sorted such that J_0 ≤ ⋯ ≤ J_{N−1} < J_N, and p̄ represents the centroid of the points p_0, …, p_{N−1}. In each iteration, simplex transformations in the N–M simplex method are controlled by the parameters α, β, and γ. These parameters should satisfy the following conditions:
$$0 < \beta < 1, \qquad 0 < \alpha < \gamma$$
These parameters have typical values of α = 1, β = 0.5, and γ = 2. The coefficients α, γ, β, and −β yield the reflection point p_r, expansion point p_e, outer contraction point p_c, and inner contraction point p_cc, respectively. The objective function values at these four points are denoted as J_r, J_e, J_c, and J_cc, respectively. If none of the four points improves on the current worst point p_N, the algorithm shrinks the points p_1, …, p_N toward the best point p_0, thereby producing a new simplex. During the shrinking process, each p_i is replaced by p_0 + 0.5(p_i − p_0) for i = 1, …, N. A new iteration is automatically triggered after the shrinking process is complete. The iterative process continues until the specified termination criteria are satisfied (e.g., the number of iterations reaches the allowed maximum or the function value J_0 falls below a preset value). We list the various vertices that may be tried during an iteration of the N–M simplex method in Table 1. The pseudocode of the N–M simplex method is shown in Algorithm 1.
Algorithm 1 Pseudocode of the N–M simplex method.
Define α = 1, β = 0.5, γ = 2
Choose initial p_0, …, p_N and calculate J_0, …, J_N
while the termination conditions are not satisfied
  Sort p_0, …, p_N such that J_0 ≤ J_1 ≤ ⋯ ≤ J_N
  p̄ = (1/N) Σ_{i=0}^{N−1} p_i
  p_r = p̄ + α(p̄ − p_N)  (Reflection)
  Calculate J_r
  if J_r < J_0
    p_e = p̄ + γ(p̄ − p_N)  (Expansion)
    Calculate J_e
    if J_e < J_r
      p_N = p_e; J_N = J_e
    else
      p_N = p_r; J_N = J_r
    end
  else if J_r < J_N
    p_c = p̄ + β(p̄ − p_N)  (Outer contraction)
    Calculate J_c
    if J_c < J_r
      p_N = p_c; J_N = J_c
    else
      p_N = p_r; J_N = J_r
    end
  else
    p_cc = p̄ − β(p̄ − p_N)  (Inner contraction)
    Calculate J_cc
    if J_cc < J_N
      p_N = p_cc; J_N = J_cc
    else
      for i = 1, …, N
        p_i = 0.5(p_0 + p_i)  (Shrink)
        Calculate J_i
      end
    end
  end
end
Print out p_0 and J_0
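Algorithm 1 translates almost line for line into Python. The following is our own sketch (not the authors' code); the termination test on the spread of function values is one simple choice among several:

```python
import numpy as np

def nelder_mead(J, vertices, max_iter=500, tol=1e-10):
    """Minimize J over N parameters, starting from N+1 simplex vertices."""
    alpha, beta, gamma = 1.0, 0.5, 2.0        # standard N-M coefficients
    p = [np.asarray(v, dtype=float) for v in vertices]
    f = [float(J(v)) for v in p]
    N = len(p) - 1
    for _ in range(max_iter):
        order = np.argsort(f)                 # sort so that f[0] <= ... <= f[N]
        p = [p[i] for i in order]
        f = [f[i] for i in order]
        if f[N] - f[0] < tol:                 # simple termination condition
            break
        c = np.mean(p[:N], axis=0)            # centroid of the best N vertices
        pr = c + alpha * (c - p[N])           # reflection
        fr = float(J(pr))
        if fr < f[0]:
            pe = c + gamma * (c - p[N])       # expansion
            fe = float(J(pe))
            p[N], f[N] = (pe, fe) if fe < fr else (pr, fr)
        elif fr < f[N]:
            pc = c + beta * (c - p[N])        # outer contraction
            fc = float(J(pc))
            p[N], f[N] = (pc, fc) if fc < fr else (pr, fr)
        else:
            pcc = c - beta * (c - p[N])       # inner contraction
            fcc = float(J(pcc))
            if fcc < f[N]:
                p[N], f[N] = pcc, fcc
            else:                             # shrink toward the best vertex
                for i in range(1, N + 1):
                    p[i] = 0.5 * (p[0] + p[i])
                    f[i] = float(J(p[i]))
    i_best = int(np.argmin(f))
    return p[i_best], f[i_best]
```

For example, `nelder_mead(lambda v: (v[0] - 1) ** 2 + (v[1] + 2) ** 2, [[0, 0], [1, 0], [0, 1]])` converges to a point near (1, −2) using only function evaluations, with no derivative information.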
The N–M simplex method is easy to implement; therefore, it has been widely used to solve unconstrained optimization problems in an N-dimensional parameter space. In particular, the N–M simplex method does not require the derivative of the objective function; it thus remains applicable even if the real system is nondifferentiable or the objective function values contain noise. For the concept and detailed algorithm of the N–M simplex method, please refer to the literature [20,21,22,23,24,25,26].
Because the N–M simplex method has the abovementioned characteristics, it is suitable for use in optimal controller design. If all the signals in the performance index are available, we need not know the dynamic equation of the system, and we can calculate the performance index values corresponding to each set of controller parameters. We then use the N–M simplex method to gradually find the optimal controller parameters that will allow the performance index to reach the minimum.

3. Numerical Simulation

Let us consider an inverted pendulum system (Figure 1).
We assume that the length of the linear cart rail is 2 m with its middle point at x = 0, and that the pendulum rod is rigid and massless; all frictional forces in the system are neglected. Under these assumptions, the entire pendulum mass is concentrated at the center of the pendulum ball. The symbol definitions and simulation conditions for this system are as follows:
  • M = 0.5 kg (cart mass);
  • m = 0.1 kg (ball mass);
  • L = 0.3 m (distance from the pendulum pivot to the center of the ball);
  • g = 9.8 m/s² (gravity constant);
  • θ (rad): rotational displacement of the pendulum;
  • x (m): horizontal displacement of the cart; and
  • u (N): control force,
where u is subject to the following saturation condition:
$$u = \begin{cases} 30, & \text{if } u > u_{\max} = 30 \\ u, & \text{if } u_{\min} \le u \le u_{\max} \\ -30, & \text{if } u < u_{\min} = -30 \end{cases}$$
The dynamic equation of the inverted pendulum is expressed as follows [34,35]:
$$\ddot{\theta} = \frac{u \cos\theta - (M + m) g \sin\theta + m L \cos\theta \sin\theta\, \dot{\theta}^2}{m L \cos^2\theta - (M + m) L}$$
$$\ddot{x} = \frac{u + m L \sin\theta\, \dot{\theta}^2 - m g \cos\theta \sin\theta}{M + m - m \cos^2\theta}$$
The state variables of the system are defined as follows:
$$x_1 = \theta, \quad x_2 = \dot{\theta}, \quad x_3 = x, \quad x_4 = \dot{x}$$
The dynamic equation of the inverted pendulum system can be rewritten as follows:
$$\dot{x}_1 = x_2$$
$$\dot{x}_2 = \frac{u \cos x_1 - (M + m) g \sin x_1 + m L \cos x_1 \sin x_1\, x_2^2}{m L \cos^2 x_1 - (M + m) L}$$
$$\dot{x}_3 = x_4$$
$$\dot{x}_4 = \frac{u + m L \sin x_1\, x_2^2 - m g \cos x_1 \sin x_1}{M + m - m \cos^2 x_1}$$
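The state equations above can be coded directly. The following is our transcription (not the authors' code), with the parameter values and saturation limits listed in this section:

```python
import math

# Parameters from the paper: M = 0.5 kg, m = 0.1 kg, L = 0.3 m, g = 9.8 m/s^2.
M, m, L, g = 0.5, 0.1, 0.3, 9.8

def saturate(u, u_min=-30.0, u_max=30.0):
    # Actuator saturation from the problem statement.
    return max(u_min, min(u_max, u))

def dynamics(x, u):
    # x = [theta, theta_dot, cart position, cart velocity]
    x1, x2, x3, x4 = x
    u = saturate(u)
    s, c = math.sin(x1), math.cos(x1)
    dx2 = ((u * c - (M + m) * g * s + m * L * c * s * x2 ** 2)
           / (m * L * c ** 2 - (M + m) * L))
    dx4 = (u + m * L * s * x2 ** 2 - m * g * c * s) / (M + m - m * c ** 2)
    return [x2, dx2, x4, dx4]
```

At the upright equilibrium x = [0, 0, 0, 0] with u = 0 the derivatives vanish, while a small uncontrolled tilt produces a growing angle, as expected for an inverted pendulum.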
The state vector is defined as follows:
$$\mathbf{x} = \begin{bmatrix} x_1 & x_2 & x_3 & x_4 \end{bmatrix}^T = \begin{bmatrix} \theta & \dot{\theta} & x & \dot{x} \end{bmatrix}^T$$
In this example, we assume that the states x_1 = θ and x_3 = x can be measured but are disturbed by v_θ and v_x, respectively. Both v_θ and v_x are Gaussian noises with a mean value of zero and a standard deviation of 0.0001. The states x_2 = θ̇ and x_4 = ẋ are estimated using the Euler method, where θ̇ ≈ [θ(k) − θ(k − 1)]/Δt and ẋ ≈ [x(k) − x(k − 1)]/Δt. The controller used in the inverted pendulum system is the following state feedback controller:
$$u = \mathbf{k}\mathbf{x} = \begin{bmatrix} k_1 & k_2 & k_3 & k_4 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = k_1 x_1 + k_2 x_2 + k_3 x_3 + k_4 x_4$$
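The finite-difference velocity estimates described above can be sketched as follows (a minimal illustration; Δt and the noise level follow the paper, while the constant measured angle is a made-up input of ours):

```python
import random

# Euler (backward-difference) rate estimate, as used for x2 and x4.
dt = 0.002

def euler_rate(prev, curr, dt=dt):
    return (curr - prev) / dt

# Noisy measurements of a constant angle: zero-mean Gaussian noise with
# standard deviation 1e-4, per the paper.
random.seed(0)
theta_meas = [0.1 + random.gauss(0.0, 1e-4) for _ in range(5)]
theta_dot = [euler_rate(theta_meas[k - 1], theta_meas[k])
             for k in range(1, 5)]
# Note: differencing amplifies measurement noise by roughly 1/dt, so these
# rate estimates scatter on the order of 1e-4 / 0.002 = 0.05 rad/s even
# though the true rate is zero.
```

This noise amplification is one reason a derivative-free search such as the N–M simplex method is attractive here: the performance index evaluated through these noisy estimates need not be differentiable or even smooth.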
The purpose of control is to fix the cart at the middle point of the rail and to maintain the angle between the inverted pendulum and the plumb line at 0. In addition, the initial state is x = [0.7 0 0 0]^T; the sampling time Δt is 0.002 s; the simulation termination time T_f is 2 s; the allowable variation range of the pendulum is −θ_max = θ_min ≤ θ ≤ θ_max, where θ_max = 0.7; the allowable range of the cart is −x_max = x_min ≤ x ≤ x_max, where x_max = 0.8; and T_r denotes the time at which θ or x moves out of the allowable range. Additionally, the discrete performance index is defined as follows:
$$J = \sum_{k=1}^{N_f} k \left( \frac{|\theta(k)|}{\theta_{\max}} + \frac{|x(k)|}{x_{\max}} \right) + W_r \left( T_f - T_r \right)$$
where N_f = T_f/Δt = 1000 and N_r = T_r/Δt. W_r is the weighting of the penalty function. When the output signal remains within the allowable range up to the termination time T_f, W_r is zero. Conversely, if the output signal leaves the allowable range before the time reaches T_f, then W_r is a positive constant that is much larger than the value of the first term on the right-hand side of Equation (20). A useful reference value is W_r ≈ N_f² (θ_max/θ_max + x_max/x_max) = 2N_f². The actual value used in this example is 10⁶. According to the above description, we define W_r in this example as follows:
$$W_r = \begin{cases} 10^6, & \text{if } \theta \text{ or } x \text{ exceeds the allowable range at the } N_r\text{th sampling instant} \\ 0, & \text{otherwise} \end{cases}$$
When the output θ or x is not within the allowable range, along with adding the penalty function to the performance index, we also immediately terminate (stop) the control.
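Putting the pieces together, evaluating the penalized index for one candidate gain vector might look as follows. This is our sketch: the paper does not state its integrator, so forward Euler is used here, and measurement noise is omitted for brevity:

```python
import math

# Parameters, sampling, ranges, and penalty weight from the paper.
M, m, L, g = 0.5, 0.1, 0.3, 9.8
dt, Nf = 0.002, 1000
theta_max, x_max, Wr = 0.7, 0.8, 1e6

def evaluate(k, x0=(0.7, 0.0, 0.0, 0.0)):
    """Simulate u = k.x with saturation and return the penalized index J."""
    x1, x2, x3, x4 = x0
    J = 0.0
    for kk in range(1, Nf + 1):
        u = k[0] * x1 + k[1] * x2 + k[2] * x3 + k[3] * x4
        u = max(-30.0, min(30.0, u))        # actuator saturation
        s, c = math.sin(x1), math.cos(x1)
        dx2 = ((u * c - (M + m) * g * s + m * L * c * s * x2 ** 2)
               / (m * L * c ** 2 - (M + m) * L))
        dx4 = (u + m * L * s * x2 ** 2 - m * g * c * s) / (M + m - m * c ** 2)
        x1, x2, x3, x4 = (x1 + dt * x2, x2 + dt * dx2,
                          x3 + dt * x4, x4 + dt * dx4)
        if abs(x1) > theta_max or abs(x3) > x_max:
            # Leave the allowable range: stop at once and add the penalty.
            return J + Wr * (Nf - kk) * dt
        J += kk * (abs(x1) / theta_max + abs(x3) / x_max)
    return J
```

Feeding `evaluate` to an N–M simplex search then yields the optimization loop of the proposed method; a destabilizing gain vector such as p_4 = [5 5 5 5] triggers the penalty almost immediately and receives a far larger index than the stabilizing gains reported below.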
Because the state feedback controller used in this example has four parameters, we first arbitrarily designed five sets of controller parameters as the five vertices of the initial simplex. The five sets of parameters and the corresponding performance indices are listed in Table 2. The time responses of the system are shown in Figure 2. Among these, the parameters p_0 and p_1 keep the system state within the allowable safe range until the time reaches T_f; therefore, both penalty function values are 0 and the corresponding index values are small. When p_2, p_3, or p_4 is used, the states exceed the allowable range before the time reaches T_f. The penalty function therefore achieves the expected effect: the corresponding index values are much larger than those corresponding to p_0 and p_1. This result also shows that the earlier the simulation/experiment is interrupted, the larger the corresponding index value (indicating a worse parameter set).
Then, we used the above five sets of parameters as the five vertices of the initial simplex. We set the maximum number of iterations for searching the optimal parameters to 50. The results obtained using the N–M simplex optimal search method are shown in Figure 3 and Figure 4, where the resulting parameters are k_1 = 276.7, k_2 = 48.12, k_3 = 289.5, and k_4 = 122.1. The corresponding performance index is J = 2.292 × 10⁴, which is lower than the index values of all five initial sets of controllers.
Obtaining a global optimal controller for nonlinear systems is difficult, especially when the dynamic equation of the system is unknown and the state or output signal includes measurement noise. Therefore, the results obtained in the above examples may be local optimal controllers based on specific initial conditions. In practical applications, the most important goal is to design a stable or robust controller effectively, not necessarily to obtain a global optimal controller.
To demonstrate the feasibility of this method, we simulated the above inverted pendulum system with the same state feedback controller, u = kx = [276.7 48.12 289.5 122.1]x. In this simulation, we used 50 different states as initial conditions. The distribution ranges of the initial states were x_1(0) = θ(0) ∈ [−0.349, 0.349] rad (i.e., [−20°, 20°]), x_2(0) = θ̇(0) = 0, x_3(0) = x(0) ∈ [−0.25, 0.25] m, and x_4(0) = ẋ(0) = 0. Figure 5 shows the time responses of the above simulation. The results show that all 50 trajectories approached the equilibrium within 2 s.

4. Conclusions

In this study, we proposed a systematic method for designing optimal controllers for systems with unknown dynamic equations. First, we proposed an original performance index based on the characteristics and control aim of the controlled system. The performance index can include the state, output, error, or control input of the system. To ensure that this performance index was applicable even when the state or output of the system was not within the allowable safety range, we added a key penalty function. Then, we used the N–M simplex method to search for the optimal controller parameters iteratively. In addition to the ease of implementation of the N–M simplex method, another important advantage of the algorithm is that it only needs all the signals in the performance index to be available, without the need to know the dynamic equation of the system in advance.
To demonstrate the feasibility of the proposed method, we adopted an inverted pendulum system with measurement noise as the example. The simulation results showed that even if the initial controller parameters could not stabilize the system, after the algorithm reached the iterative termination condition, not only was the system stable but it also exhibited good transient response performance.
The optimal controller parameter search method proposed in this study has the following advantages: (1) the dynamic equation of the system need not be known; (2) the method can be used regardless of the stability of the open-loop system; (3) the method can be applied to both linear and nonlinear systems; (4) the method can be used in systems containing measurement noise; and (5) the systematic nature of the method can improve the design efficiency.

Author Contributions

Conceptualization, H.-H.T. and C.-C.F.; software, C.-C.F.; validation, H.-H.T. and C.-C.F.; resources, J.-R.H.; writing—original draft preparation, H.-H.T. and C.-C.F.; writing—review and editing, H.-H.T., C.-C.F., J.-R.H. and C.-K.L.; project administration, C.-K.L.; funding acquisition, J.-R.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Ministry of Science and Technology, Taiwan, ROC, under Grant MOST 110-2218-E-008-007-.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Slotine, J.-J.E.; Li, W. Applied Nonlinear Control; Prentice Hall: Englewood Cliffs, NJ, USA, 1991. [Google Scholar]
  2. Khalil, H. Nonlinear Systems, 3rd ed.; Prentice Hall: Upper Saddle River, NJ, USA, 2002. [Google Scholar]
  3. Tavasoli, A.; Enjilela, V. Active disturbance rejection and Lyapunov redesign approaches for robust boundary control of plate vibration. Int. J. Syst. Sci. 2017, 48, 1656–1670. [Google Scholar] [CrossRef]
  4. Xu, J.; Fridman, L.M.; Fridman, E.; Niu, Y. Output-feedback Lyapunov redesign of uncertain systems with delayed measurements. Int. J. Robust Nonlinear Control 2021, 31, 3747–3766. [Google Scholar] [CrossRef]
  5. Zhou, L.; Zhang, J.; She, H.; Jin, H. Quadrotor UAV flight control via a novel saturation integral backstepping controller. Automatika 2019, 60, 193–206. [Google Scholar] [CrossRef]
  6. Fang, Y.; Fei, J.; Yang, Y. Adaptive backstepping design of a microgyroscope. Micromachines 2018, 9, 338. [Google Scholar] [CrossRef] [Green Version]
  7. Mossa, M.A.; Echeikh, H. A novel fault tolerant control approach based on backstepping controller for a five phase induction motor drive: Experimental investigation. ISA Trans. 2021, 112, 373–385. [Google Scholar] [CrossRef]
  8. Liu, Z.; Han, Z.; Zhao, Z.; He, W. Modeling and adaptive control for a spatial flexible spacecraft with unknown actuator failures. Sci. China Inf. Sci. 2021, 64, 152208. [Google Scholar] [CrossRef]
  9. Liang, H.; Liu, G.; Zhang, H.; Huang, T. Neural-network-based event-triggered adaptive control of nonaffine nonlinear multiagent systems with dynamic uncertainties. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 2239–2250. [Google Scholar] [CrossRef]
  10. Wang, S.; Na, J. Parameter estimation and adaptive control for servo mechanisms with friction compensation. IEEE Trans. Ind. Inform. 2020, 16, 6816–6825. [Google Scholar] [CrossRef]
  11. Utkin, V.I. Variable structure systems with sliding modes. IEEE Trans. Autom. Control 1977, 22, 212–222. [Google Scholar] [CrossRef]
  12. Šabanovic, A. Variable structure systems with sliding modes in motion control—A survey. IEEE Trans. Ind. Inform. 2011, 7, 212–223. [Google Scholar] [CrossRef]
  13. Wang, J.; Zhu, P.; He, B.; Deng, G.; Zhang, C.; Huang, X. An adaptive neural sliding mode control with eso for uncertain nonlinear systems. Int. J. Control Autom. Syst. 2021, 19, 687–697. [Google Scholar] [CrossRef]
  14. Shao, K.; Zheng, J.; Wang, H.; Xu, F.; Wang, X.; Liang, B. Recursive sliding mode control with adaptive disturbance observer for a linear motor positioner. Mech. Syst. Signal Process. 2021, 146, 107014. [Google Scholar] [CrossRef]
  15. Charfeddine, S.; Jerbi, H. A survey on non-linear gain scheduling design control for continuous and discrete time systems. Int. J. Model. Identif. Control 2013, 19, 203–216. [Google Scholar] [CrossRef]
  16. Liu, C.; Zhao, W.; Li, J. Gain scheduling output feedback control for vehicle path tracking considering input saturation. Energies 2020, 13, 4570. [Google Scholar] [CrossRef]
  17. Accetta, A.; Alonge, F.; Cirrincione, M.; D’Ippolito, F.; Pucci, M.; Rabbeni, R.; Sferlazza, A. Robust control for high performance induction motor drives based on partial state-feedback linearization. IEEE Trans. Ind. Appl. 2019, 55, 490–503. [Google Scholar] [CrossRef]
  18. Wu, J.; Lu, Y. Exact feedback linearisation optimal control for single-inductor dual-output boost converter. IET Power Electron. 2020, 13, 2293–2301. [Google Scholar] [CrossRef]
  19. Owczarkowski, A.; Horla, D.; Zietkiewicz, J. Introduction of feedback linearization to robust LQR and LQI control– analysis of results from an unmanned bicycle robot with reaction wheel. Asian J. Control. 2018, 21, 1028–1040. [Google Scholar] [CrossRef]
  20. Spendley, W.; Hext, G.R.; Himsworth, F.R. Sequential application of simplex designs in optimisation and evolutionary operation. Technometrics 1962, 4, 441–461. [Google Scholar] [CrossRef]
  21. Nelder, J.A.; Mead, R. A simplex method for function minimization. Comput. J. 1965, 7, 308–313. [Google Scholar] [CrossRef]
  22. Lagarias, J.C.; Reeds, J.A.; Wright, M.H.; Wright, P.E. Convergence properties of the Nelder-Mead simplex method in low dimensions. SIAM J. Optim. 1998, 9, 112–147. [Google Scholar] [CrossRef] [Green Version]
  23. McKinnon, K.I.M. Convergence of the Nelder-Mead simplex method to a nonstationary point. SIAM J. Optim. 1998, 9, 148–158. [Google Scholar] [CrossRef]
  24. Byatt, D. Convergent Variants of the Nelder-Mead Algorithm. Master’s Thesis, University of Canterbury, Christchurch, New Zealand, 2000. [Google Scholar] [CrossRef]
  25. Price, C.J.; Coope, I.D.; Byatt, D. A convergent variant of the Nelder-Mead algorithm. J. Optim. Theory Appl. 2002, 113, 5–19. [Google Scholar] [CrossRef] [Green Version]
  26. Fuh, C.-C.; Tsai, H.-H.; Lin, H.-C. Parameter identification of linear time-invariant systems with large measurement noises. In Proceedings of the 2016 12th World Congress on Intelligent Control and Automation (WCICA), Guilin, China, 12–15 June 2016; pp. 2874–2878. [Google Scholar] [CrossRef]
  27. Xu, S.; Wang, Y.; Wang, Z. Parameter estimation of proton exchange membrane fuel cells using eagle strategy based on JAYA algorithm and Nelder-Mead simplex method. Energy 2019, 173, 457–467. [Google Scholar] [CrossRef]
  28. Lee, Y.; Resiga, A.; Yi, S.; Wern, C. The optimization of machining parameters for milling operations by using the Nelder–Mead simplex method. J. Manuf. Mater. Process. 2020, 4, 66. [Google Scholar] [CrossRef]
  29. Niegodajew, P.; Marek, M.; Elsner, W.; Kowalczyk, Ł. Power plant optimisation—Effective use of the Nelder-Mead approach. Processes 2020, 8, 357. [Google Scholar] [CrossRef] [Green Version]
  30. Zettel, V.; Hitzmann, B. Optimization of the production parameters for bread rolls with the Nelder–Mead simplex method. Food Bioprod. Process. 2017, 103, 10–17. [Google Scholar] [CrossRef]
  31. Naidu, D.S. Optimal Control Systems; CRC Press: Boca Raton, FL, USA, 2003. [Google Scholar]
  32. Anderson, B.D.O.; Moore, J.B. Optimal Control: Linear Quadratic Methods; Prentice-Hall: Englewood Cliffs, NJ, USA, 1990. [Google Scholar]
  33. Dorf, R.C.; Bishop, R.H. Modern Control Systems, 12th ed.; Prentice Hall: Englewood Cliffs, NJ, USA, 2014. [Google Scholar]
  34. Mahmoodabadi, M.J.; Haghbayan, H.K. An optimal adaptive hybrid controller for a fourth-order under-actuated nonlinear inverted pendulum system. Trans. Inst. Meas. Control 2019, 42, 285–294. [Google Scholar] [CrossRef]
  35. Waszak, M.; Langowski, R. An automatic self-tuning control system design for an inverted pendulum. IEEE Access 2020, 8, 26726–26738. [Google Scholar] [CrossRef]
Figure 1. Schematic of an inverted pendulum system.
Figure 2. Time responses of the inverted pendulum system controlled by the five sets of controller parameters given in Table 2.
Figure 3. Time response of the inverted pendulum system controlled by the state feedback controller in which the parameters are searched via the N–M simplex method based on the initial vertices given in Table 2.
Figure 4. Convergence graph of performance index J 0 when the simplex method based on the initial vertices given in Table 2 is used to search for the controller parameters of the inverted pendulum system.
Figure 5. Time response of the inverted pendulum system controlled by the state feedback controller u = kx = [276.7 48.12 289.5 122.1]x with 50 different random initial states.
Table 1. Various vertices that may be tried during the iteration of the N–M simplex method.
(α = 1, β = 0.5, γ = 2)
Reflection: p_r = p̄ + α(p̄ − p_N), J_r = J(p_r)
Expansion: p_e = p̄ + γ(p̄ − p_N), J_e = J(p_e)
Outer contraction: p_c = p̄ + β(p̄ − p_N), J_c = J(p_c)
Inner contraction: p_cc = p̄ − β(p̄ − p_N), J_cc = J(p_cc)
Shrink: p_i = 0.5(p_0 + p_i), i = 1, …, N; J_i = J(p_i), i = 1, …, N
Table 2. Initial controller parameters for the inverted pendulum system.
i: p_i = [k_1 k_2 k_3 k_4] — J_i = J(p_i)
0: p_0 = [200 10 80 10] — J_0 = 1.941 × 10⁵
1: p_1 = [150 100 150 100] — J_1 = 4.293 × 10⁵
2: p_2 = [100 20 150 20] — J_2 = 5.651 × 10⁸
3: p_3 = [80 150 100 30] — J_3 = 7.300 × 10⁸
4: p_4 = [5 5 5 5] — J_4 = 7.560 × 10⁸
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Tsai, H.-H.; Fuh, C.-C.; Ho, J.-R.; Lin, C.-K. Design of Optimal Controllers for Unknown Dynamic Systems through the Nelder–Mead Simplex Method. Mathematics 2021, 9, 2013. https://doi.org/10.3390/math9162013
