A Performance-Driven MPC Algorithm for Underactuated Bridge Cranes

Abstract: A crane system often works in a complex environment, and it is difficult to model or learn its true dynamics by traditional system identification approaches. If a dynamics model is created by minimizing its prediction error, its use tends to introduce inaccuracies and thus leads to suboptimal performance. Is it possible to learn a dynamics model of a crane that achieves the best control performance, instead of learning its true dynamics? This work answers the question by presenting a performance-driven model predictive control (P-MPC) algorithm for a two-dimensional underactuated bridge crane. In the proposed dual-layer control architecture, the inner loop uses a proportional–integral–derivative (PID) controller to achieve rapid anti-sway, and the outer loop uses MPC to ensure accurate trolley positioning under control constraints. Compared with classical MPC, this work proposes a data-driven method for plant modeling and controller parameter updating. By considering the control target at the learning stage, the method can avoid adjusting the controller to deal with uncertainty. We use Bayesian optimization in an active learning framework where a locally linear dynamics model is learned with the intent of maximizing control performance and then used in conjunction with optimal control schemes to efficiently design a controller for a given task. The model is updated directly based on the performance observed in experiments on the physical system in an iterative manner until a desired performance is achieved. The controller parameters and prediction model with the best closed-loop performance are found through repeated experiments and iterative optimization. Simulation and experimental results show that we can explicitly find the dynamics model that produces the best performance for an actual system, and that the method can quickly suppress swing and realize accurate trolley positioning. The results verify its effectiveness, feasibility, and superior performance in comparison with state-of-the-art methods.


Introduction
A mechanical system with fewer actuators than degrees of freedom is called an underactuated one [1]. Reducing the number of actuators can make such a system light and flexible. Therefore, underactuated systems are widely used in industry [2]. They are often divided into two categories. The first contains underactuated mechanical systems with restricted movement [3,4], including mobile robots, shuttles, underwater vehicles, and underwater underactuated robots. They cannot move sideways, or they must follow a fixed trajectory. The second covers underactuated mechanical arm type systems, mainly including different types of cranes (such as bridge, cantilever, and tower cranes), the inverted pendulum system, the ball and beam system, the translational oscillator with rotating actuator, and the pendulum robot. The state of this type of system is analogous to a shift from a connecting rod or a rotation. Reducing the number of actuators increases the coupling among the system components and the complexity of the system, making its controller design more difficult. For decades, the control of underactuated mechanical systems has been a long-standing challenge in control engineering. Moreover, under particular circumstances, a fully actuated system may become an underactuated one when any actuator partially fails or a running trajectory is fixed. Therefore, the in-depth study of underactuated mechanical systems is of essential theoretical and practical significance.
As a typical underactuated mechanical arm system, a bridge crane is an essential means of cargo transportation. It is widely used in construction sites, ports, production workshops, warehouses, and other industrial fields [5]. Its main control objectives can be summarized as accurately transporting goods to a target location and suppressing load swing as much as possible. The payload's swing angle must be small enough to avoid accidents in the process of transportation. If a bridge crane's swing is too large, it affects operational safety and production efficiency. Because the crane has fewer actuators than degrees of freedom, its payload swing angle is underactuated, which makes the angle difficult to control. How to suppress payload swing and ensure a trolley's accurate positioning is an important yet challenging issue to be addressed. The coupling or accompanying nonholonomic constraints on a crane system's state increase the difficulty of designing an underactuated crane control system.
In recent years, researchers have made some noticeable progress in controlling bridge cranes. An in-depth analysis of their dynamics has been conducted and a reasonable motion trajectory for their trolley while considering the load swing angle is planned [6][7][8]. Ouyang et al. proposed an s-shaped motion trajectory generation method that could realize anti-sway [6]. Jaafar et al. proposed a feedforward command shaping method of anti-sway for a bridge crane system [9], which requires fewer sensors than some existing methods. These methods [6][7][8][9] are all open-loop control-based ones and fail to deal with external disturbance. Some closed-loop feedback control methods have been proposed to deal with external disturbance to such a system [10]. The common method is to use proportional-integral-derivative (PID) controllers [11]. Some control algorithms based on state observers are proposed to handle partially unmeasurable states [12,13]. To reveal the effects of unknown disturbances on cranes, Zhang et al. proposed a finite-time trajectory tracking control method based on a state observer [13]. Some complex control methods have been proposed, e.g., passivity-based control schemes in [14,15] and a Lyapunov-based controller in [16]. However, such types of controllers have some inherent defects, e.g., difficult controller design and a narrow scope of application. Some robust controllers have also been proposed to control cranes, such as sliding mode control (SMC) methods [17][18][19][20][21][22][23][24][25]. The work [19] proposed an integral-barrier Lyapunov function (IBLF)-based control method to suppress the undesirable vibrations of the flexible crane system with a boundary output constraint. However, it needs precise partial-ordinary differential equations. Besides, there are some crane system control schemes combined with artificial intelligence, such as fuzzy logic-based [26,27] and neural network-based control [28,29]. 
These methods draw on human experience to help improve the control performance of a crane.
Most of the methods mentioned above can implement anti-swing. However, they fail to guarantee that such swing is always within its allowable range [30]. Due to the limited power of the crane's actuator, the large control input tends to cause actuator saturation, thereby reducing control performance and even causing safety problems.
Model predictive control (MPC) can predict the state of a system and handle various constraints. It has been applied to the control problem of bridge cranes [30][31][32][33][34][35]. For example, to deal with constraints, Chen et al. proposed a novel MPC algorithm for 2-D overhead cranes [30], which could deal with actuator saturation. By combining multivariable MPC with particle swarm optimizer (PSO), Smoczek et al. proposed a novel anti-sway method [31]. The methods [30,31] are based on an accurate linearized model. Their application is limited due to model error between the linearized models and their modeled crane systems. Meanwhile, an MPC algorithm requires a high-precision model to achieve superior closed-loop performance.
In adaptive control, model parameters are generally updated to obtain a good prediction model but not necessarily to maximize control performance. Aiming at finding the best predictive model and parameters of a controller from experimental data, we proposed a control method based on performance-driven MPC, which directly considers the crane's control target at a learning stage. This method requires us to continuously conduct experiments and collect closed-loop data. Although this work is for a small-scale physical crane (e.g., 1:10 size crane) instead of an actual large-scale crane, we can obtain sufficient closed-loop data for a large-scale crane and synthesize a controller by the proposed method. Through continuous iteration and optimization of parameters, the controller for the best performance can be learned and then applied. The performance-driven MPC algorithm framework is shown in Figure 1. It mainly includes three modules: a closed-loop experimental module, a closed-loop control module, and a Bayes-based controller parameter optimization and model learning module [36]. Following [37], we designed a dual-layer control architecture. An inner controller aims to quickly suppress the swing angle, while an outer one can handle control constraints and state ones. The merits of this paper can be summarized as follows.
(1) This work proposes a conversion method to construct a new state augmented system for solving the problem where classical MPC fails to deal with underactuated systems [31,38]. Although a bridge crane's structure is different from other types of underactuated systems, the dual-layer control architecture in this article can also be applied to other underactuated systems.
(2) Compared with classical MPC algorithms, which require a precise linearized model of a plant to achieve high closed-loop performance, this work proposes a method that does not need specific knowledge of the system dynamics. Using the experimental data for parameter tuning, we can synthesize a high-performance controller that achieves the control objectives of fast and accurate trolley positioning and anti-swing. From the application point of view, the control method is based on data, thereby making it highly applicable to various industrial systems.
Machines 2021, 9, 177
The rest of this article is organized as follows. Section 2 briefly introduces classic MPC and Bayesian optimization. Section 3 introduces the controller design, including the parameterization method of each layer of the controller. Section 4 introduces the method of optimizing the parameters from data with Bayesian optimization. Section 5 presents simulation, experimental, and comparison results. Section 6 concludes the paper.

Classic MPC and Bayesian Optimization for a Bridge Crane
This section briefly reviews the methods of classic MPC and Bayesian optimization.

Classic MPC for Bridge Crane
MPC has gained significant success in recent decades and has become an important control method for handling system constraints [31] as well as a common approach for crane anti-sway. The discrete-time dynamics of a crane can be described as follows:

$$x(k+1) = f(x(k), u(k), w(k), k) \quad (1)$$

where $u(k) \in \mathbb{R}^{n_u}$, $x(k) \in \mathbb{R}^{n_x}$, and $w(k) \in \mathbb{R}^{n_w}$ are the crane's input, state, and noise sampled at time $k$, respectively, and $n_u$, $n_x$, and $n_w$ are the dimensions of the input, state, and noise, respectively. Classical MPC schemes do not consider any uncertainty in the prediction model. Instead, they rely on feedback and re-solve the following problem at each sampling instant to compensate for the uncertainty:

$$\min_{U} \sum_{i=0}^{N} l(x_{i|k}, u_{i|k}, k+i) \quad (2)$$

$$\text{s.t.} \quad x_{i+1|k} = f(x_{i|k}, u_{i|k}, k+i),$$
$$U = \{u_{0|k}, \cdots, u_{N|k}\} \in \mathcal{U}_j \ (j = 1, \cdots, n_{cu}),$$
$$X = \{x_{0|k}, \cdots, x_{N|k}\} \in \mathcal{X}_j \ (j = 1, \cdots, n_{cx}),$$

where $U$ is a control vector, $X$ is a state vector, and $l(x_{i|k}, u_{i|k}, k+i)$ is a cost function at time $k$. $n_{cu}$ and $n_{cx}$ are the numbers of elements in the input sets and state sets. Usually, the cost function is a weighted quadratic cost suitable for tracking tasks. In many MPC formulations, the terminal components $l_f$ and $\mathcal{X}_f$ are imposed to meet system stability requirements. The MPC control law is obtained by solving Equation (2), resulting in

$$u(k) = u^*_{0|k} \quad (3)$$

where $u^*_{0|k}$ is the first element of the computed optimal control sequence $U^*$, which is applied to the crane at time step $k$.
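The receding-horizon idea described above can be illustrated with a minimal sketch. It assumes a linear double-integrator model, a quadratic cost, and no constraints, none of which come from this paper; the names `batch_matrices`, `mpc_first_input`, and `simulate` are hypothetical. A constrained problem such as the one above would need a QP solver instead of the closed-form least-squares solution used here.

```python
import numpy as np

def batch_matrices(A, B, N):
    """Stack predictions x_{1..N|k} = Phi @ x_{0|k} + Gamma @ U for a linear model."""
    nx, nu = B.shape
    Phi = np.vstack([np.linalg.matrix_power(A, i + 1) for i in range(N)])
    Gamma = np.zeros((N * nx, N * nu))
    for i in range(N):
        for j in range(i + 1):
            block = np.linalg.matrix_power(A, i - j) @ B
            Gamma[i * nx:(i + 1) * nx, j * nu:(j + 1) * nu] = block
    return Phi, Gamma

def mpc_first_input(A, B, x0, x_ref, N=30, q=1.0, r=0.1):
    """Minimize sum_i q*||x_i - x_ref||^2 + r*||u_i||^2 and return u*_{0|k}."""
    nu = B.shape[1]
    Phi, Gamma = batch_matrices(A, B, N)
    ref = np.tile(x_ref, N)
    H = q * Gamma.T @ Gamma + r * np.eye(N * nu)
    g = q * Gamma.T @ (ref - Phi @ x0)
    U = np.linalg.solve(H, g)      # unconstrained optimum of the quadratic cost
    return U[:nu]                  # only the first input is applied

def simulate(steps=100):
    # Illustrative double integrator (position, velocity) with Ts = 0.1 s (assumed).
    Ts = 0.1
    A = np.array([[1.0, Ts], [0.0, 1.0]])
    B = np.array([[0.5 * Ts**2], [Ts]])
    x = np.array([0.0, 0.0])
    x_ref = np.array([1.0, 0.0])
    for _ in range(steps):
        u = mpc_first_input(A, B, x, x_ref)   # re-solve at every sampling instant
        x = A @ x + B @ u
    return x
```

Re-solving at every step is what lets feedback compensate for model mismatch, even though the prediction model itself carries no uncertainty description.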

Bayesian Optimization for a Bridge Crane
Bayesian optimization is a standard method for training models in machine learning. It finds the minimal value of an objective function by building a surrogate function (a probability model) from the past evaluation results of the MPC objective function. We can therefore use Bayesian optimization to optimize the crane model parameters and controller parameters. Assume that a hyperparameter vector of the bridge crane is $X = (x_1, x_2, \ldots, x_n)$. Different hyperparameters lead to different effects. Bayesian optimization assumes that there is a functional relationship between the hyperparameters and the loss function.
Suppose that there is a function $f : x \to \mathbb{R}$, $x \subseteq X$. The optimization problem can be described as:

$$x^* = \arg\min_{x \subseteq X} f(x)$$

We initialize the data set $D = \{(x_1, y_1), \ldots, (x_n, y_n)\}$ and assume $f \sim GP(\mu, \kappa)$ (GP: Gaussian process, $\mu$: mean, and $\kappa$: covariance kernel). The forecast also obeys a normal distribution, i.e.,

$$p(y \mid x, D) = \mathcal{N}(y \mid \hat{\mu}, \hat{\sigma})$$

where $\hat{\sigma}$ is a covariance matrix and $\hat{\mu}$ is a mean vector. They are computed from $k$, the covariance matrix between the test sample input and the training sample inputs, and $K$, the covariance matrix among the training sample inputs.
In the next step, we select the parameters $X$ that satisfy (5) based on the calculated hypothetical model, apply these hyperparameters in training to obtain the output $y_i$, and update the data set as $D = D \cup \{(x_i, y_i)\}$.
In this paper, we need to consider the balance between exploration and exploitation, which are defined as follows:
Exploitation: Based on the data collected in the past, search in the area with higher mean value to optimize the performance index, with a high probability of obtaining better results. Note that such a search easily falls into a local optimum.
Exploration: Learn more about $J$ in an area of the parameter space with larger variance.
The acquisition function can be selected in a variety of ways, such as Expected Improvement (EI) and Probability of Improvement. The EI algorithm can balance exploration and exploitation [39][40][41][42][43]. Thus, the EI acquisition function used in this paper is introduced here. Let $\hat{f} = \min f$ represent the minimum value of $f$. The utility function is defined as $u(x) = \max(0, \hat{f} - f(x))$, and the acquisition function can be defined as:

$$\alpha_{EI}(x) = \mathbb{E}[u(x) \mid x, D]$$

The point that maximizes $\alpha_{EI}$ is selected as the best point.

Control Architecture
This section presents a dual-layer, multi-rate tracking control structure, as shown in Figure 2. The inner loop uses a PID controller to achieve rapid anti-sway at sampling time $T_s$. To deal effectively with state and control constraints, the outer loop uses MPC to solve an online constrained optimization problem at sampling time $T_{MPC}$ ($T_{MPC} = N T_s$ with $N \in \mathbb{N}$). $p(t)$ and $\theta(t)$ are the crane's position and the payload's angle from the vertical direction, respectively. $e(t)$ is the tracking error of the inner-loop system s, and $r(t)$ is the reference value. $u_s(t)$ and $u^*(t)$ are the input to the inner-loop system s and the real input to the crane, respectively. $A_s \in \mathbb{R}^{2\times2}$, $B_s \in \mathbb{R}^{2\times1}$, $C_s \in \mathbb{R}^{2\times2}$, and $D_s \in \mathbb{R}^{2\times1}$ are the prediction model parameters. $\nu = [\nu_P\ \nu_I\ \nu_D]$ is the parameter vector of the PID controller. $N_P$ and $N_C$ are MPC parameters.

Inner PID Controller Parameterization
An inner PID controller is parameterized by a vector $\nu \in \mathbb{R}^{n_\nu}$ and implemented as a discrete transfer function at the sampling time $T_s$, where $\nu = [\nu_P\ \nu_I\ \nu_D]$ is the parameter vector of the PID controller. A filter with coefficient $N_d \gg 1$ is added to the derivative term to suppress the high-frequency gain of noise. The value of $N_d$ has little effect on the overall performance. Thus, there is no need to optimize it.
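As an illustration of such a controller, the sketch below implements a discrete PID with a first-order filter on the derivative term. The forward-Euler form $C(z) = \nu_P + \nu_I \frac{T_s}{z-1} + \nu_D \frac{N_d}{1 + N_d T_s/(z-1)}$ is an assumption made here for concreteness, not necessarily the transfer function used in this paper, and the class name `FilteredPID` is hypothetical.

```python
class FilteredPID:
    """Discrete PID with filtered derivative (forward-Euler form, assumed)."""

    def __init__(self, kp, ki, kd, ts, nd=10.0):
        # The forward-Euler derivative filter is stable only if 0 < nd * ts < 2.
        if not 0.0 < nd * ts < 2.0:
            raise ValueError("nd * ts must lie in (0, 2) for a stable filter")
        self.kp, self.ki, self.kd = kp, ki, kd
        self.ts, self.nd = ts, nd
        self.integ = 0.0   # integrator state
        self.filt = 0.0    # derivative-filter state

    def step(self, error):
        """Return the control output for the current tracking error e(t)."""
        deriv = self.nd * (self.kd * error - self.filt)
        u = self.kp * error + self.ki * self.integ + deriv
        # Update the states for the next sample.
        self.integ += self.ts * error
        self.filt += self.ts * deriv
        return u
```

For example, `FilteredPID(kp=2.0, ki=0.0, kd=0.0, ts=0.1).step(1.5)` returns the pure proportional action `3.0`. In a tuning loop, the three gains would be the entries of $\nu$, while `nd` stays fixed.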

Outer MPC Controller Parameterization
Consider a one-input and two-output (OITO) plant $S$ as the prediction model of the inner closed-loop system, where $u_s \in \mathbb{R}$ and $y = [p\ \theta]^\top \in \mathbb{R}^2$ are its input and output, respectively. Considering the discrete sampling time $T_s$, we have the following discrete state-space representation:

$$\xi(t+1) = A_s \xi(t) + B_s u_s(t), \qquad y(t) = C_s \xi(t) + D_s u_s(t)$$

where $\xi \in \mathbb{R}^{n_\xi}$ is the system state, and the model has transfer functions with the same poles as the crane model. To learn the parameters of the predictive model, we parameterize $A_s \in \mathbb{R}^{2\times2}$, $B_s \in \mathbb{R}^{2\times1}$, $C_s \in \mathbb{R}^{2\times2}$, and $D_s \in \mathbb{R}^{2\times1}$ into a vector $\mu \in \mathbb{R}^{n_\mu}$. At time $t = nT_{MPC}$ ($n \in \mathbb{N}$), the outer MPC solves a constrained optimization problem over the prediction horizon, in which $\Delta u_s(t+k \mid t)$, $y(t+k \mid t)$, and $u_s(t+N_c \mid t)$ are the $k$th control increment, output, and input of plant $S$ at time $t$, respectively. $\hat{y} = [5, 0.2, 10]$ and $\check{y} = [0, -0.2, -10]$ are the upper and lower bounds of the output, $\hat{u}_s = 10$ and $\check{u}_s = -10$ are the upper and lower bounds of the input, and $\Delta\hat{u}_s = 1$ and $\Delta\check{u}_s = -1$ are the upper and lower bounds of the control increment. $u_r$ is an input reference, and $y_r$ is an output reference. $N_p$ is the prediction horizon, and $N_c$ is the control horizon. $Q_y$, $Q_u$, $Q_{\Delta u}$, and $Q_\varepsilon$ are the non-negative weights of the output, input, control increment, and relaxation factor, respectively. $\varepsilon$ is the corresponding positive relaxation factor, and $\lambda_y$, $\lambda_u$, and $\lambda_{\Delta u}$ are the coefficients of $\varepsilon$. According to the standard MPC design, the hard constraint $N_c \le N_p$ needs to be enforced in the parameter learning.
In this paper, we only consider $N_p$ and $N_c$ as the MPC tuning parameters, since they significantly affect the closed-loop control performance. We define the overall vector of tuning parameters as $\eta$, and $n_\mu$ is the number of parameters in the controller. To reduce the number of parameters to be tuned, we use empirical values for $\lambda_y$, $\lambda_u$, $\lambda_{\Delta u}$, $Q_y$, $Q_u$, $Q_{\Delta u}$, and $Q_\varepsilon$ instead of learning them by optimization.
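To make the parameterization concrete, the following sketch packs the four model matrices into a flat vector and rolls the model forward. The row-major ordering, the resulting length of 12, and the function names `unpack_model`, `predict_outputs`, and `valid_horizons` are assumptions for illustration; the paper may use a structured parameterization with fewer free entries.

```python
import numpy as np

def unpack_model(mu):
    """Split a flat vector mu into (A_s, B_s, C_s, D_s); row-major order is assumed."""
    mu = np.asarray(mu, dtype=float)
    if mu.size != 12:
        raise ValueError("expected 4 + 2 + 4 + 2 = 12 model parameters")
    A = mu[0:4].reshape(2, 2)
    B = mu[4:6].reshape(2, 1)
    C = mu[6:10].reshape(2, 2)
    D = mu[10:12].reshape(2, 1)
    return A, B, C, D

def predict_outputs(mu, xi0, u_seq):
    """Roll out y = [p, theta] for an input sequence u_s, one row per step."""
    A, B, C, D = unpack_model(mu)
    xi = np.asarray(xi0, dtype=float).reshape(2, 1)
    outputs = []
    for u in u_seq:
        outputs.append((C @ xi + D * u).ravel())
        xi = A @ xi + B * u
    return np.array(outputs)

def valid_horizons(n_p, n_c):
    """Hard constraint on the horizons: 1 <= N_c <= N_p."""
    return 1 <= n_c <= n_p
```

A tuner that proposes candidate horizons can call `valid_horizons` before running an experiment, so that infeasible pairs are rejected without spending a closed-loop run.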

P-MPC Parameter Tuning
The controller architecture and its parameterization were introduced in the previous section. This section introduces a closed-loop performance function and a parameter tuning method based on Bayesian optimization.

Closed-Loop Performance Index
To realize accurate trolley positioning and payload anti-swing, an evaluation function $J(y_T, u_T; \eta, \nu)$ (12) is designed to evaluate the performance of a closed-loop control algorithm, together with the penalty function

$$\tau(p) = \begin{cases} 20(|p| - p_{set}), & |p| > p_{set} \\ 0, & |p| \le p_{set} \end{cases} \quad (13)$$

where $(y_T, u_T)$ are the measured output and input signals at sampling times $t = 1, \ldots, T$, respectively. $T$ is the duration of the closed-loop simulation. $\nu_i$ and $\eta_i$ are the $i$-th Bayesian optimization results of $\nu$ and $\eta$, and $i_o$ is the optimal solution index. $\tau$ is a penalty function that accounts for the physical constraint $p_{set}$ on the positioning of the trolley. It is intended to limit the overshoot of the trolley displacement and prevent safety accidents. The crane's displacement and swing angle data are obtained through closed-loop simulation experiments, and the closed-loop performance $J$ is calculated according to (12). The parameter optimization problem of the controller can be described as follows:

$$\min_{(\nu, \eta) \in D} J(y_T, u_T; \eta, \nu) \quad (14)$$

where $D$ is the set of controller parameter candidates.
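The overshoot penalty of Equation (13) is simple enough to state in code. In the sketch below, `overshoot_penalty` follows (13) directly, whereas `closed_loop_cost` is only a hypothetical stand-in for $J$ (mean positioning error plus mean absolute swing plus the worst penalty), since it is an assumption made here rather than the index of Equation (12).

```python
def overshoot_penalty(p, p_set):
    """Penalty tau(p) of Equation (13): 20*(|p| - p_set) beyond p_set, else 0."""
    excess = abs(p) - p_set
    return 20.0 * excess if excess > 0.0 else 0.0

def closed_loop_cost(positions, angles, p_set):
    """Hypothetical stand-in for J, NOT the paper's Equation (12)."""
    n = len(positions)
    tracking = sum(abs(p - p_set) for p in positions) / n   # positioning error
    swing = sum(abs(a) for a in angles) / n                 # payload swing
    penalty = max(overshoot_penalty(p, p_set) for p in positions)
    return tracking + swing + penalty
```

With this shape, a trajectory that overshoots the target by 0.3 m is charged a penalty of about 6, which dominates small differences in tracking error and steers the tuner away from unsafe parameter sets.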

P-MPC Controller Parameter Tuning
The Bayesian optimization algorithm has two steps: 1. Construct a Gaussian process regression model [40] and update its parameters through a sampling dataset D; and 2. Build an acquisition function to guide the next sampling step.
A controller parameter tuning problem can be solved by minimizing (14). In this paper, we use a Bayesian optimization (BO) strategy to do so. Similar to the work [37], parameter tuning is summarized in Algorithm 1. Our goal is to update the hyperparameters effectively when new data are observed. In this paper, we assume that the cost $J_i$ corresponding to controller parameters $(\nu, \eta)$ obeys a Gaussian distribution, which has two advantages for our work according to [39,40]:
1. The Gaussian regression model is more accurate than such regression models as principal component regression and least squares.
2. It allows priors of a hyperparameter to be defined or a particular structure of a covariance function to be constructed. This feature can help us achieve controller parameter optimization by using the prior experimental data.
These advantages enable us to introduce domain knowledge into a GP model to improve its accuracy.
According to the past evaluation results of the objective function, Bayesian optimization can establish a surrogate function (a probability model) for minimizing the value of the objective function [41][42][43]. The acquisition function $\alpha(\cdot)$ is constructed based on a GP model learning step, i.e., (10), and the parameters of the next controller can be selected by maximizing the acquisition function $\alpha(\cdot)$. The quality of the initial points directly affects the convergence rate of the algorithm and the quality of the final solution. The acquisition function often produces a locally optimal solution. Therefore, we need to balance the exploration and exploitation of the parameters. We adopt the EI acquisition function in this paper, which has a unique advantage in balancing exploration with exploitation of parameters [42]. Although our proposed method can find parameters that give good control performance, its optimality requires a theoretical proof, which remains open [43].
The algorithm is initialized by selecting $m > 1$ different controller parameter combinations (for example, $m$ can be randomly selected or set to a fixed value). Then, we conduct a closed-loop experiment for each pair of parameters $(\nu_i, \eta_i)$ to collect data and calculate the performance index $J_i$ by Equations (12) and (13), which yields the initial set $D \leftarrow \{((\nu_m, \eta_m), J_m)\}$ of parameters and performance. In practice, if safety constraints are violated, the experiment is interrupted, and a very large cost is allocated to $J_i$. We repeat the above steps and iterate the experiment until the stop condition is met. In each iteration, the following two steps are executed:

Learning a GP Model
We fit $J$ to the available data set $D \leftarrow \{((\nu_m, \eta_m), J_m)\}$ under the prior $J \sim GP(\mu_0, \kappa)$ ($\mu_0$: zero mean, $\kappa((\nu, \eta), (\nu, \eta))$: covariance kernel). The posterior distribution of $J(y_T, u_T; \eta_o, \nu_o)$ at a test point $(\nu_o, \eta_o)$ is Gaussian. In its mean and variance, $k_i \in \mathbb{R}^i$ is the covariance vector between the test and training sample inputs, whose $n$-th element is $\kappa((\nu_o, \eta_o), (\nu_n, \eta_n))$. $K \in \mathbb{R}^{i \times i}$ is the covariance matrix among the training sample inputs, whose $[n, m]$-th entry is $\kappa((\nu_n, \eta_n), (\nu_m, \eta_m))$. $I$ represents an identity matrix, and $\sigma_e^2$ represents the variance of the additive (Gaussian) noise. The covariance function $\kappa((\nu, \eta), (\nu, \eta))$ of the GP can be chosen as a radial basis function, in which $\sigma_0^2$ is the prior covariance of the function that controls the degree of local correlation, and $W$ is a weight matrix. Both can be computed by maximizing the likelihood function.
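The GP learning step can be sketched with the standard zero-mean regression formulas, mean $k_i^\top (K + \sigma_e^2 I)^{-1} J_{1:i}$ and variance $\kappa(\cdot,\cdot) - k_i^\top (K + \sigma_e^2 I)^{-1} k_i$, which use the same ingredients as the symbols defined above. The sketch assumes an isotropic radial basis kernel with a single length scale in place of the weight matrix $W$, and fixed hyperparameters instead of likelihood maximization; the function names are hypothetical.

```python
import numpy as np

def rbf_kernel(X1, X2, sigma0=1.0, length=1.0):
    """Isotropic RBF kernel (assumed form): sigma0^2 * exp(-||a - b||^2 / (2 length^2))."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=2)
    return sigma0**2 * np.exp(-0.5 * d2 / length**2)

def gp_posterior(X_train, J_train, X_test, sigma_e=1e-3, sigma0=1.0, length=1.0):
    """Posterior mean and variance of a zero-mean GP at the rows of X_test."""
    n = len(X_train)
    K = rbf_kernel(X_train, X_train, sigma0, length) + sigma_e**2 * np.eye(n)
    k = rbf_kernel(X_train, X_test, sigma0, length)        # shape (n, n_test)
    L = np.linalg.cholesky(K)                              # K = L @ L.T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, J_train))
    mean = k.T @ alpha
    v = np.linalg.solve(L, k)
    var = sigma0**2 - (v**2).sum(axis=0)
    return mean, np.maximum(var, 0.0)                      # clip round-off negatives
```

Each row of `X_train` would hold one tested parameter combination $(\nu, \eta)$ and `J_train` the measured costs. Near tested points the variance collapses toward zero, while far from them it returns to the prior value `sigma0**2`, which is what the acquisition step exploits.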

Parameter Tuning by Bayesian Optimization
The next controller's parameters $(\nu_{i+1}, \eta_{i+1})$ can be selected by maximizing the acquisition function $\alpha(\nu, \eta \mid D)$:

$$(\nu_{i+1}, \eta_{i+1}) = \arg\max_{\nu \in Y,\ \eta \in X} \alpha(\nu, \eta \mid D)$$

where $Y$ and $X$ are the domains of $\nu$ and $\eta$, respectively.
The exploitation step chooses a point with a high mean of parameters, whereas the exploration step chooses a point with a large variance of parameters, which keeps the algorithm from falling into locally optimal parameters and increases the chance of finding the best-performing controller parameters [39][40][41][42][43]. The EI acquisition function balances the two. In it, $\hat{J} = \min_{j=1,\ldots,i} J(y_T, u_T; \nu_j, \eta_j)$ is the best value of the objective function over the previous $i$ iterations, and $\Phi$ and $\phi$ are the cumulative distribution function and the probability density function of the standard normal distribution, respectively.
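For a Gaussian posterior with mean $\mu$ and standard deviation $\sigma$, the expected improvement for minimization has the well-known closed form $(\hat{J} - \mu)\,\Phi(z) + \sigma\,\phi(z)$ with $z = (\hat{J} - \mu)/\sigma$. The sketch below evaluates it and picks the best candidate from a finite list. Maximizing over a finite candidate list is a simplification assumed here, and the function names are hypothetical.

```python
import math

def expected_improvement(mean, std, best):
    """Closed-form EI for minimization under a Gaussian posterior N(mean, std^2)."""
    if std <= 0.0:
        return max(0.0, best - mean)       # no uncertainty: plain improvement
    z = (best - mean) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mean) * cdf + std * pdf

def select_next(candidates, means, stds, best):
    """Return the candidate with the largest expected improvement."""
    scores = [expected_improvement(m, s, best) for m, s in zip(means, stds)]
    return candidates[scores.index(max(scores))]
```

The first term rewards candidates whose predicted cost lies below the incumbent $\hat{J}$ (exploitation), and the second rewards large predictive uncertainty (exploration). Between two candidates with equal predicted cost, the more uncertain one is therefore chosen.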

Simulation and Experiment Results
In this section, the dynamics of a bridge crane is first described. Simulations are provided next to verify the performance of the proposed control method.

Bridge Crane Dynamics
A bridge crane is usually composed of a rope, a payload, and a trolley. Its corresponding 2-D simplified physical model is shown in Figure 3 [31][32][33][34][35]. The dynamic equation is defined as follows:

$$F = (m+M)\ddot{x} - m\ddot{l}\sin\theta - ml\ddot{\theta}\cos\theta - 2m\dot{l}\dot{\theta}\cos\theta + ml\dot{\theta}^2\sin\theta + \gamma\dot{x}$$

where M = 5 kg and m = 5 kg denote the mass of the trolley and payload, respectively. θ is the payload's angle from the vertical direction. g = 9.81 m/s² represents the gravitational acceleration. l = 1 m is the length of the hoisting rope, which is fixed during transportation. γ = 0.1 is the friction between the trolley and the platform. ς = 0.1 is the friction between the payload and air. F denotes a driving force, and x is the horizontal displacement; O and $p_{set}$ are the trolley's starting and target points, respectively. According to our method, we do not need to know the dynamic characteristics of a bridge crane. The dynamic Equations (21) and (22) are only used for producing closed-loop experimental data.
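A small simulator can generate closed-loop data of this kind. The sketch below uses the parameter values quoted above with a fixed rope length, so the rope-length derivatives vanish in the trolley equation. The payload equation, $ml^2\ddot{\theta} - ml\ddot{x}\cos\theta + mgl\sin\theta + \varsigma\dot{\theta} = 0$, is an assumption: it is the Lagrangian counterpart under the same sign convention, not a formula taken from this text. All function names are hypothetical.

```python
import math

# Parameters quoted in the text (SI units); rope length is fixed.
M, m, g, l = 5.0, 5.0, 9.81, 1.0
GAMMA, ZETA = 0.1, 0.1   # trolley friction, payload air friction

def accelerations(state, force):
    """Return (x_ddot, theta_ddot) for state = (x, x_dot, theta, theta_dot)."""
    _, dx, th, dth = state
    c, s = math.cos(th), math.sin(th)
    # Trolley:           (m+M)*x_dd - m*l*c*th_dd = F - gamma*x_d - m*l*th_d^2*s
    # Payload (assumed): -m*l*c*x_dd + m*l^2*th_dd = -m*g*l*s - zeta*th_d
    a11, a12 = m + M, -m * l * c
    a21, a22 = -m * l * c, m * l * l
    b1 = force - GAMMA * dx - m * l * dth * dth * s
    b2 = -m * g * l * s - ZETA * dth
    det = a11 * a22 - a12 * a21          # = m*l^2*(M + m*sin^2(theta)) > 0
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det

def derivative(state, force):
    ddx, ddth = accelerations(state, force)
    return (state[1], ddx, state[3], ddth)

def rk4_step(state, force, dt):
    """One classical Runge-Kutta step with the force held constant."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))
    k1 = derivative(state, force)
    k2 = derivative(shifted(k1, dt / 2), force)
    k3 = derivative(shifted(k2, dt / 2), force)
    k4 = derivative(shifted(k3, dt), force)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def simulate(force_fn, duration, dt=0.01, state=(0.0, 0.0, 0.0, 0.0)):
    """Integrate from the given initial state; force_fn maps time to force F."""
    t = 0.0
    for _ in range(int(round(duration / dt))):
        state = rk4_step(state, force_fn(t), dt)
        t += dt
    return state
```

Under this sign convention, a positive push accelerates the trolley forward while the payload lags behind, which appears as a positive swing angle during the first moments of motion.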

Simulation Results
MATLAB/Simulink was used to conduct the simulations. We conducted 200 closed-loop experiments and collected data for calculating the control performance and expanding a historical database. Using the data in this database as a prior, BO optimizes the controller parameters. The system is initialized at [x(0) ẋ(0) θ(0) θ̇(0)] = [0 0 0 0] for each experiment. The sample time is Ts = 0.1 s and TMPC = 1 s (n = 10). The closed-loop performance of each experiment is shown in Figure 4, where the current test point is marked by a blue cross, the best point up to iteration i by the black line, and the best closed-loop performance, reached at iteration 92, by a red square. We select the parameters with the best closed-loop performance to verify the algorithm's feasibility and effectiveness under the following conditions.
Case 1: pset = 3 m.
Case 2: pset = 5 m.
PID and its improved variants are currently the primary methods used for anti-sway control. Model predictive control with equivalent input disturbance (EID for short) is currently the best MPC method for anti-sway control [35]. Thus, the performance-driven MPC (P-MPC for short) proposed in this work is compared with the classical double-closed-loop PID (PID for short) [44] and EID.
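The "best point up to iteration i" curve described for Figure 4 is a running minimum of the per-experiment performance index. A minimal sketch of how such a curve and the last improving iteration could be computed is given below; the helper names `running_best` and `last_improvement` are hypothetical, and the cost values are illustrative only, not data from the paper.

```python
import numpy as np

def running_best(costs):
    """Best (lowest) performance index seen up to each experiment."""
    return np.minimum.accumulate(np.asarray(costs, dtype=float))

def last_improvement(costs):
    """1-based index of the last experiment that lowered the running best."""
    best = running_best(costs)
    # The first experiment sets the initial best; later experiments count
    # only when they strictly improve on it.
    improved = np.flatnonzero(np.diff(best) < 0) + 1
    return int(improved[-1]) + 1 if improved.size else 1

# Illustrative values only (not data from the paper).
J = [5.0, 4.2, 4.8, 3.1, 3.5, 3.1, 2.9, 3.0]
print(running_best(J))       # running minimum, one value per experiment
print(last_improvement(J))   # 7
```

Applied to the 200 recorded performance values, the second function would return the iteration marked by the red square, provided that no later experiment improves on it.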
During the transportation process, the output signals x and θ and the input force F are disturbed by additive white Gaussian noise.
In the early iterations, the closed-loop performance index J̄ of the system is very large. Once enough closed-loop data have accumulated, BO finds better parameters. Figure 4 shows that the performance index decreases faster after the 20th experiment. As the number of experiments increases, J̄ becomes lower and lower, and the remaining experiments concentrate on the low-cost region. After the 92nd experiment, J̄ no longer declines. Hence, the parameters of the 92nd experiment are selected as the best experimental parameters.
Figures 5 and 6 show that, with the proposed method, the swing angle stays within 7° during the entire transportation process. The trolley reaches the target location in only 8 s, after which the swing angle oscillates within 2°. The entire transportation process is smooth. The algorithm quickly restores the payload to normal swing while achieving precise positioning, and finally maintains a small angle fluctuation. Figures 5 and 6 also compare the three approaches, and the control performances are listed in Tables 1 and 2. P-MPC has the best performance. All three methods take the same time to carry the payload to the target point. In Case 1, the maximum swing angle is 12° for PID, 7° for EID, and only 3.5° for P-MPC, which is significantly smaller than the others. In Case 2, the maximum swing angle is 27° for PID, 12° for EID, and only 7° for P-MPC, which is also much smaller than that of its two peers. The closed-loop performance of P-MPC is 0.021 for Case 1 and 0.477 for Case 2, which is much better than that of its peers (0.366 and 0.491 for Case 1; 0.743 and 0.796 for Case 2). Among the methods compared, the proposed P-MPC is the best solution for safety and efficiency in real applications.
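The maximum swing angles quoted above can be turned into relative reductions with a short calculation. The sketch below uses only the angles stated in the text; the helper name `reduction_percent` is hypothetical.

```python
def reduction_percent(baseline, value):
    """Relative reduction of `value` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline

# Maximum swing angles in degrees, as quoted in the text.
max_swing = {
    "Case 1": {"PID": 12.0, "EID": 7.0, "P-MPC": 3.5},
    "Case 2": {"PID": 27.0, "EID": 12.0, "P-MPC": 7.0},
}

for case, angles in max_swing.items():
    for peer in ("PID", "EID"):
        r = reduction_percent(angles[peer], angles["P-MPC"])
        print(f"{case}: P-MPC vs {peer}: {r:.1f}% smaller maximum swing")
```

With these figures, the maximum swing of P-MPC is about 70.8% smaller than that of PID and 50.0% smaller than that of EID in Case 1, and about 74.1% and 41.7% smaller, respectively, in Case 2.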

Experiment Results
A lab was specially built to validate the proposed method, as shown in Figure 7. The experimental platform used three alternating current (AC) asynchronous motors to drive the trolley along the track. The maximum speed was 0.2 m/s. Owing to the limitations of the experimental site, the track length of the crane was 5.5 m, the usable length was 5 m, and the maximum lifting rope length was 3 m. The maximum payload mass was 1 t. The displacement sensor used in this experiment had an accuracy of 1 mm. With the aircraft altitude angle sensor, the accuracy of both the dynamic and the static swing angle measurements could reach 0.01°. The friction coefficient was 0.2. The payload swing was required to settle within ±50 mm within 5 s after the mechanism stopped.
As in the simulation, we conducted 200 closed-loop experiments and collected data to calculate the control performance and to expand the historical database. Using the data in this database as a prior, BO optimizes the controller parameters. The system is initialized at [x(0) ẋ(0) θ(0) θ̇(0)] = [0 0 0 0] for each experiment. The sample time is Ts = 0.003 s and TMPC = 0.03 s (n = 10). The closed-loop performance of each experiment is shown in Figure 8, where the current test point is marked by a blue cross, the best point up to iteration i by the black line, and the best closed-loop performance, reached at iteration 170, by a red square. We then selected the parameters with the best closed-loop performance to verify the algorithm's feasibility and effectiveness under the following condition.
Case 3: pset = 4.5 m.
As in the simulation, the closed-loop performance index J̄ of the system is very large in the early iterations. Once enough closed-loop data have accumulated, BO finds better parameters. Figure 8 shows that the performance index decreases faster after the 15th experiment. As the number of experiments increases, J̄ becomes lower and lower, and the remaining experiments concentrate on the low-cost region. After the 170th experiment, J̄ no longer declines. Hence, the parameters of the 170th experiment are selected as the best experimental parameters.
Figure 9 shows that, with the proposed method, the swing angle stays within 1° during the entire transportation process. The trolley reaches the target location in only 20 s, after which the swing angle oscillates within 0.1°. The entire transportation process is smooth. The algorithm quickly restores the payload to normal swing while achieving precise positioning, and finally maintains a small angle fluctuation. Figure 9 also compares the three approaches, and the control performances are listed in Table 3. P-MPC has the best performance. All three methods take the same time to carry the payload to the target point. In Case 3, the maximum swing angle is 2.5° for PID, 1.5° for EID, and only 1° for P-MPC, which is significantly smaller than the others. The closed-loop performance of P-MPC is 0.003 for Case 3, which is much better than that of its peers (0.015 and 0.042). Among the methods compared, the proposed P-MPC is the best solution for safety and efficiency in real applications.
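The ±50 mm swing requirement stated for the experimental platform can be related to the swing angles reported above with a small conversion. The sketch below makes two assumptions that the text does not state: the ±50 mm limit is read as the horizontal offset of the payload from the vertical through the trolley, and the rope is taken at its 3 m maximum, since the length actually used is not given. The helper name `swing_limit_deg` is hypothetical.

```python
import math

def swing_limit_deg(offset_m, rope_length_m):
    """Swing angle (degrees) at which the payload is offset_m off the vertical."""
    return math.degrees(math.asin(offset_m / rope_length_m))

# ASSUMPTION: rope at its 3 m maximum; the length actually used is not stated.
print(round(swing_limit_deg(0.05, 3.0), 2))   # 0.95
# A shorter rope (hypothetical 1 m) allows a larger angle for the same offset.
print(round(swing_limit_deg(0.05, 1.0), 2))   # 2.87
```

Under these assumptions, a ±50 mm offset corresponds to roughly ±0.95° of swing, which is larger than the 0.1° residual oscillation reported for P-MPC in Case 3.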

Conclusions
This work proposed a performance-driven MPC algorithm for an underactuated 2-D bridge crane system. The MPC controller with the best performance, including its predictive model and controller parameters, can be found from closed-loop experimental data, which allows the method to deal with unknown dynamic systems. The learned model does not necessarily give the best fit to the input/output data, but it yields a good controller corresponding to the best closed-loop performance. The proposed method can effectively handle the system's various constraints without causing the controller to saturate, so it is easy to apply to actual cranes. The simulation and experimental results show that this method achieves high-precision positioning of the trolley and rapid anti-sway of the payload, while outperforming the traditional PID controller and the EID controller. Future work will focus on the stability analysis of the proposed method and on determining the minimum number of closed-loop experiments required to construct a stable and reliable controller. It is also necessary to construct a more accurate nonlinear model for large-scale cranes, thereby improving the controller's robustness and stability in application. We further aim to extend P-MPC to 3D cranes, carry out a feasibility analysis of the optimization results, and study chance-constrained stochastic control in the P-MPC framework. Moreover, P-MPC can be extended to other control problems, e.g., the inverted pendulum system, the ball and beam system, a translational oscillator with a rotating actuator, and a power-line inspection robot [45][46][47][48]. Engineers with little knowledge of control theory can use our method to design a controller with the desired performance.
