Hierarchical Sliding Mode Control Combined with Nonlinear Disturbance Observer for Wheeled Inverted Pendulum Robot Trajectory Tracking

Featured Application: The research presented herein could be used mainly in warehouse logistics and transport activities in smart manufacturing.

Abstract: This study presents an optimized control scheme for the trajectory tracking of a wheeled inverted pendulum robot (WIPR) system. It addresses the poor trajectory tracking performance that arises in the presence of unknown disturbances because of the nonlinear and underactuated characteristics of the system. First, a kinematic controller was used to track a reference trajectory and generate a control law that specifies the desired forward and rotation speeds of the system. Next, a nonlinear disturbance observer (NDO) was designed to enhance the system's robustness to external disturbances and improve its tracking performance. Then, the coupled system state variables were decoupled into two subsystems: a forward-rotation subsystem and a tilt-angle velocity subsystem. An improved hierarchical sliding mode controller was designed to control these subsystems separately. Finally, simulation experiments were conducted to compare the proposed method with a common sliding mode control approach. The simulation results demonstrate that the proposed method achieves better tracking performance in the presence of unknown disturbances.


Introduction
With the rapid development of technology, human society has achieved increased convenience and comfort [1]. In today's factory warehouses and production lines, a variety of robots add to the possibilities of Industry 4.0. In the warehouses of more advanced companies, transport robots can be found everywhere, replacing traditional manpower and eliminating the need for workers to carry out repetitive lifting and carrying. These "smart" robots can accomplish a task as long as they can follow a given transport trajectory. This paper studies how a mobile wheeled inverted pendulum tracks a given reference trajectory, with the aim of following the ideal motion trajectory accurately, meeting the requirements of factory transport robots, and supporting the realization of smart factories [2][3][4][5].
Mobile wheeled inverted pendulum models, such as WIPRs, have attracted much attention because of their compactness, mobility, and humanlike functions. WIPRs are widely used to verify the effectiveness of nonlinear underactuated control methods. Compared with the traditional inverted pendulum, WIPRs have more applications, especially in unknown, dynamic, and nonlinear environments, and are commonly used in logistics transportation, commuting, and navigation, as well as in the factory transportation setting mentioned above. However, a WIPR is a typical nonlinear underactuated system, with two input torques driving two wheels and three degrees of freedom (forward motion, rotation, and tilt angle of the pendulum), and achieving high-performance motion control is still a challenging task for the control community [6][7][8][9].
First, when a WIPR moves, it is always assumed that the ground can provide enough friction to prevent the robot from side-slipping and wheel-sliding (i.e., the robot is guaranteed to move with purely rolling wheels without skidding), which imposes a nonholonomic constraint. Second, the underactuated inverted pendulum body must use the input torques of the two driving wheels to control three degrees of freedom: forward movement, rotation, and the tilt angle of the pendulum. To track a trajectory with a WIPR, three state variables must be controlled in real time with two input variables, which is a typical underactuated problem. Finally, in the real world, WIPRs operate in factories or similar environments and always encounter various unknown disturbances. Therefore, nonholonomic constraints, underactuation, and unknown disturbances are the three main challenges faced by this type of mobile robot in trajectory tracking control [10][11][12][13].
These three issues are important for the following reasons. First, the nonholonomic constraint means that the WIPR cannot follow arbitrary trajectories, especially at high speed under heavy load; if the constraint is not considered in motion planning, the result is likely to be untimely obstacle avoidance and unreachable trajectories. Second, underactuated robots often have excellent dynamic performance or a cost advantage in their drive hardware, but they place higher demands on controller design. Finally, unknown disturbances degrade the control accuracy of the system to a certain extent and, more seriously, can affect the stability of the control system [14,15].
Researchers and practitioners have proposed several control algorithms to overcome the difficulties associated with WIPR systems. One widely used method is fuzzy control, an empirical, rule-based control technique that can effectively control nonlinear systems. By establishing a dynamic model of the WIPR, a fuzzy-logic-based controller can be designed that takes the position and angle information of the WIPR as input and outputs control signals that govern its motion state. For instance, Jian Huang [16] proposed an Integral Interval Type 2 Fuzzy Logic (IT2FL) method that can maintain the MTWIP equilibrium while reaching the desired position and orientation, allowing it to work in an uncertain environment. However, the disadvantages of fuzzy control include low control accuracy, strong dependence on control rules, and difficulty in designing those rules.
The second type of control algorithm is neural network control. Chenguang Yang [17] decomposed the underactuated WIPR model into two subsystems. The approximation capability of the neural network was used for motion control of the fully actuated subsystem, and this subsystem was then used to indirectly control the tilt angle motion of the pendulum. However, the method requires a large number of wavelet coefficient vectors, making the neural network computationally intensive.
Finally, sliding mode control, as a typical robust control method, shows good tracking performance and strong robustness, which support its wide use in linear and nonlinear systems. For underactuated systems, researchers have proposed various sliding mode control methods to achieve different control effects, such as integral sliding mode control, terminal sliding mode control, and hierarchical sliding mode control [18][19][20]. Among them, the application of hierarchical sliding mode control to practical underactuated systems is receiving more and more attention, for example in balancing control of a double inverted pendulum and trajectory tracking control of a wheeled inverted pendulum. Nabanita Adhikary [21] proposed an integral backstepping sliding mode controller for underactuated system control. A feedback control law was designed based on the backstepping method, and a sliding surface was introduced in the final stage of the algorithm. Jian Huang [22] designed two terminal sliding mode controllers to control the speed and braking of a UW-Car based on its dynamic model. He Ping [23] proposed a hierarchical sliding mode controller (HSMC) that simultaneously performs speed control and balance control of a two-wheeled self-balancing vehicle (TWSBV).
Hierarchical sliding mode control is a control strategy based on sliding mode control that divides the sliding surface into two layers. In the first layer, a high-speed sliding surface is introduced, and the control system approaches the desired state quickly. In the second layer, a low-speed sliding surface is introduced, and the control system stabilizes near the desired state. Hierarchical sliding mode control can improve control accuracy and stability and also benefits the response speed and robustness of the system. However, it shares a drawback of conventional sliding mode control: the discontinuous switching action can easily cause the "jitter" (chattering) phenomenon in the system. To address this shortcoming, this paper proposes an improved hierarchical sliding mode control method with an adaptive exponential convergence law, which adjusts the convergence law according to the control state and smooths the sign function, thus reducing the strong jitter of traditional sliding mode control. The method is combined with a nonlinear disturbance observer (NDO), which complements sliding mode control well. The NDO mitigates the negative impact of unknown disturbances, makes the system more robust, and improves the trajectory tracking ability of the WIPR system [24][25][26][27][28][29][30].
Overall, this paper consists of four parts. The first part constructs the dynamic model of the WIPR system and decouples the coupled state variables to facilitate the subsequent controller design. The second part establishes the kinematic trajectory tracking controller of the system and obtains the desired velocities for the dynamic control system. In the third part, an optimized control scheme for the WIPR system combining a nonlinear disturbance observer and hierarchical sliding mode control is designed, and the convergence of the nonlinear disturbance observer and the stability of the improved hierarchical sliding mode controller are demonstrated. The fourth part constructs the simulation model on the MATLAB/Simulink platform and conducts comparative numerical simulation experiments.
The contributions of this paper are as follows: (1) A wheeled inverted pendulum robot with a transport platform is envisioned for moving goods in warehouses or other application scenarios. (2) The convergence law of hierarchical sliding mode control is improved to mitigate the jitter phenomenon of the sliding mode control system, and an adaptive function is introduced to further reduce the system jitter. (3) A nonlinear disturbance observer is combined with hierarchical sliding mode control to estimate unknown external disturbances and feed the estimate in as input compensation, making the control more accurate.

WIPR Model
A WIPR is a wheeled inverted pendulum transport robot with a placement table, as illustrated in Figure 1. Its left and right wheels are independent drive wheels that control the robot's movement speed, rotation direction, and pendulum tilt angle using the principle of differential drive, thereby managing the position and posture of the WIPR. The generalized world coordinate system is denoted $OXYZ$, while $(x, y)$ represents the center coordinate of the robot wheels. The robot's forward velocity and rotational angular velocity are denoted $v$ and $w$, respectively. The angle of the robot's direction of motion with respect to the $X$-axis is represented by $\theta$, while $\alpha$ is the tilt angle of the pendulum with respect to the $Z$-axis. $M$ refers to the total weight of the transport platform plus the pendulum, whereas $m$ denotes the weight of each drive wheel. The distance between the two wheels is represented by $d$, while $\tau_r$ and $\tau_l$ are the torques of the right wheel and the left wheel, respectively. The rotational inertia of each driven wheel is denoted by $I_w$, and $I_M$ represents the rotational inertia of the transport platform and the pendulum together. The length of the pendulum is represented by $L$. The robot parameters are described in detail in Table 1.

Assumption 1:
The tires of the WIPR do not skid, and there is no lateral deflection during its motion. According to Assumption 1, the nonholonomic constraint of the WIPR is given by Equation (1). The position and posture of the WIPR in the world coordinate system are represented by $q = [x, y, \theta, \alpha]^T$. Because the Lagrangian modeling method does not require the internal forces of the system to be included, it is a quick and straightforward way of building a model. This property makes it well suited to constructing multivariable, nonlinear dynamic models such as that of the WIPR. The vector $q$ is divided into $q_m$ and $\alpha$, where $q_m$ denotes the position of the robot in the coordinate system and $\alpha$ is the angle of the pendulum. The Lagrangian method [31] is then employed to establish the dynamic model of the WIPR, and the resulting mathematical model is presented as Equation (2).
By defining the constraint matrix $A(q_m)$, the nonholonomic constraint of Equation (1) can be rewritten in matrix form. The nonholonomic constraint force of the WIPR system is $A^T(q_m)\lambda$, where $\lambda$ is the Lagrange multiplier. To eliminate the constraint force from the system, a matrix $S(q)$ that annihilates the constraint matrix is sought, from which Equation (4) can be deduced.

Using $S(q)$ to eliminate the nonholonomic constraint force results in Equation (5), which describes the dynamics of the system. Because the state variables in the system are coupled, Equation (5) is decoupled into Equation (6).
where  m m m m    .
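As a point of reference, the no-skid condition of Assumption 1 is commonly written for differential-drive robots in the form sketched below. This is a sketch under assumptions, not a reproduction of Equations (1), (4), and (5): it assumes $q_m = [x, y, \theta]^T$, and the names $A$ and $S$ and the sign convention are illustrative choices that may differ from the original notation.

```latex
% No lateral slip: the velocity component along the wheel axis vanishes
\dot{x}\sin\theta - \dot{y}\cos\theta = 0
\;\Longleftrightarrow\;
A(q_m)\,\dot{q}_m = 0, \qquad
A(q_m) = \begin{bmatrix} \sin\theta & -\cos\theta & 0 \end{bmatrix}.

% A full-rank S with A S = 0 maps (v, w) to the admissible velocities
\dot{q}_m = S(q_m)\begin{bmatrix} v \\ w \end{bmatrix}, \qquad
S(q_m) = \begin{bmatrix} \cos\theta & 0 \\ \sin\theta & 0 \\ 0 & 1 \end{bmatrix}, \qquad
A(q_m)\,S(q_m) = 0.

% Pre-multiplying the constrained dynamics by S^T removes the multiplier term
S^{T}(q_m)\,A^{T}(q_m)\,\lambda = \bigl(A(q_m)\,S(q_m)\bigr)^{T}\lambda = 0.
```

The last line is the reason the Lagrange multiplier no longer appears once the dynamics are projected onto the admissible velocity directions.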

The Design of the Kinematic Control Law
In kinematic trajectory tracking control for a WIPR, the system can be simplified to a general two-wheeled nonholonomic mobile robot. The process uses a reference trajectory state vector and requires the tracking error between the reference state and the actual state to converge to zero as $t \to \infty$. The Lyapunov function is selected as Equation (9).
Lyapunov's stability theorem [32] establishes that the system can achieve asymptotic stability if the derivative of the Lyapunov function is negative definite, i.e., if $\dot{V}_1 \le 0$. Accordingly, the sought control law is given in Equation (11).
Therefore, $\dot{V}_1 \le 0$ can be proven.
At this point, the desired velocity required for the design of the dynamic system is given by Equation (11); the velocity tracking problem of the dynamic system and the angle tracking problem of the pendulum are solved next.
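To make the role of the kinematic layer concrete, the following minimal sketch tracks the unit-circle reference used later in the simulation section (1 m/s, 1 rad/s, centered at the origin). It does not reproduce the control law of this paper, whose expression is not available here; a widely used Kanayama-type tracking law is swapped in as an assumption. The gains `k1`, `k2`, `k3`, the initial pose, and the function names are all hypothetical, and the dynamic layer is assumed to track the commanded velocities perfectly.

```python
import math

def tracking_errors(pose, ref):
    """Express the reference-minus-robot pose error in the robot body frame."""
    x, y, th = pose
    xr, yr, thr = ref
    dx, dy = xr - x, yr - y
    e1 = math.cos(th) * dx + math.sin(th) * dy    # along-track error
    e2 = -math.sin(th) * dx + math.cos(th) * dy   # cross-track error
    e3 = thr - th                                 # heading error
    return e1, e2, e3

def kinematic_law(err, v_r, w_r, k1=2.0, k2=4.0, k3=4.0):
    """Kanayama-type law (assumed form): returns the desired (v_d, w_d)."""
    e1, e2, e3 = err
    v_d = v_r * math.cos(e3) + k1 * e1
    w_d = w_r + v_r * (k2 * e2 + k3 * math.sin(e3))
    return v_d, w_d

def simulate(pose0=(1.3, -0.3, math.pi / 2), t_end=20.0, dt=0.01):
    """Track a unit circle with an ideal unicycle and return the final position error."""
    v_r, w_r = 1.0, 1.0
    x, y, th = pose0
    steps = int(round(t_end / dt))
    for k in range(steps):
        t = k * dt
        ref = (math.cos(t), math.sin(t), t + math.pi / 2)
        v_d, w_d = kinematic_law(tracking_errors((x, y, th), ref), v_r, w_r)
        # Euler step of the kinematic model; velocity tracking is assumed perfect
        x += v_d * math.cos(th) * dt
        y += v_d * math.sin(th) * dt
        th += w_d * dt
    t = steps * dt
    return math.hypot(math.cos(t) - x, math.sin(t) - y)

final_position_error = simulate()
```

For this assumed law, the function $\frac{1}{2}(e_1^2 + e_2^2) + (1 - \cos e_3)/k_2$ has the derivative $-k_1 e_1^2 - v_r k_3 \sin^2 e_3 / k_2$, which is nonpositive for positive gains and a positive reference speed, mirroring the Lyapunov argument used in this section.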

The Design of NDO
A nonlinear disturbance observer is developed to estimate the unknown disturbance $D$ acting on the system, thereby strengthening the system's robustness. For practical purposes, it is assumed that any disturbance is bounded [33][34][35].
Lemma 1. For bounded initial conditions, the state $x(t)$ is uniformly bounded if there exists a continuous positive definite Lyapunov function $V(x)$ that is bounded above and below by class-$\mathcal{K}$ functions of $\lVert x \rVert$ and whose derivative satisfies $\dot{V}(x) \le -cV(x) + \epsilon$, where $c$ and $\epsilon$ are positive constants.
Assumption 2. Since no disturbance can be infinite in the real world, the disturbances in the WIPR system studied in this paper are assumed to be bounded, and their first-order and second-order derivatives are also assumed to be bounded, where $\lVert \cdot \rVert$ denotes the Euclidean norm of a vector. The NDO is designed as in Equation (14).
$\hat{D}$ and $\hat{\dot{D}}$ represent the estimates of the total disturbance and its derivative, respectively, while $\tilde{\dot{D}} = \dot{D} - \hat{\dot{D}}$ represents the error between the derivative of the actual disturbance and the derivative of its estimate.
Differentiating the estimation errors and substituting Equations (7) and (14) leads to the error equations of the NDO, Equations (15) and (16).
By defining the estimation error vector $E$ and using Equations (15) and (16), Equation (17) can be obtained, which relates $\dot{E}$ to $LE$ and a bounded term in the disturbance derivative.
The stability of the observer is then examined. By selecting an appropriate Lyapunov function $V_2$, the stability of the observer and the boundedness of its estimation error are proven.
Property 2. $L$ is a skew-symmetric matrix.
The derivative of $V_2$ is evaluated to complete the proof. The above design of an NDO for a WIPR can be summarized in the following theorem.
Theorem 1. If an unknown disturbance exists in a WIPR system, the disturbance estimation error is bounded for the observer designed according to Equation (14).
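The boundedness statement of Theorem 1 can be illustrated on a much simpler problem. The sketch below is not the observer of Equation (14), which also estimates the disturbance derivative; it is a standard first-order nonlinear disturbance observer applied to a hypothetical scalar plant $\dot{x} = f(x) + u + d$. The plant, the observer gain, and the disturbance signal are all assumptions chosen for illustration.

```python
import math

def plant_f(x):
    """Known part of the illustrative scalar plant."""
    return -x

def disturbance(t):
    """Bounded disturbance with bounded derivative (unknown to the observer)."""
    return 1.0 + 0.5 * math.sin(t)

def run_ndo(gain=20.0, t_end=10.0, dt=0.001):
    """Run a first-order NDO and return the final estimation error magnitude."""
    x = 0.0
    z = -gain * x                      # internal state chosen so that d_hat(0) = 0
    steps = int(round(t_end / dt))
    for k in range(steps):
        t = k * dt
        d_hat = z + gain * x           # estimate = z + p(x) with p(x) = gain * x
        u = -d_hat                     # feed the estimate back as input compensation
        # Observer: z_dot = -gain * (z + p(x)) - gain * (f(x) + u)
        z_dot = -gain * d_hat - gain * (plant_f(x) + u)
        x_dot = plant_f(x) + u + disturbance(t)
        z += z_dot * dt
        x += x_dot * dt
    d_hat = z + gain * x
    return abs(disturbance(steps * dt) - d_hat)

estimation_error = run_ndo()
```

With this structure the estimate obeys $\dot{\hat{d}} = \text{gain} \cdot (d - \hat{d})$, so the estimation error stays within roughly $\max|\dot{d}| / \text{gain}$ once the initial transient has decayed, which is the scalar analogue of a bounded estimation error under a bounded disturbance derivative.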

The Design of Improved Sliding Mode Control
The sliding mode control algorithm consists of two key elements: (i) the design of the sliding surface and (ii) the design of the convergence law. The design of the sliding surface is based mainly on the system structure and the control objective. As for the convergence law, there are four common choices: the isokinetic convergence law, the exponential convergence law, the power convergence law, and the general convergence law. In this paper, to cope with the nonlinearity, underactuation, and unknown disturbances of the WIPR, the traditional exponential convergence law is improved by introducing an adaptive control function. The resulting improved sliding mode control, based on the adaptive exponential convergence law, weakens the system jitter while speeding up the system response, making sliding mode control more suitable for tracking the reference trajectory of the WIPR system under unknown disturbances [36]. The traditional exponential convergence law is given in Equation (20), where $s$ denotes the sliding surface function, the two coefficients of the law are the convergence coefficients, and $\mathrm{sgn}(s)$ denotes the sign function.
The traditional exponential convergence law contains an isokinetic (constant-rate) term, proportional to $\mathrm{sgn}(s)$, and an exponential term, proportional to $s$. When the state of the system is far from the sliding surface, both terms act simultaneously to drive the system toward the sliding surface, and their magnitudes are determined mainly by the two convergence coefficients. When the system is close to the surface, the exponential term is small and the isokinetic term dominates. This paper improves the traditional exponential convergence law by introducing an adaptive function $o(s)$, shown in Figure 2, which adjusts the convergence law according to the control state of the system. It accelerates convergence to the sliding mode and weakens overshoot, which allows the sliding mode control to reduce the jitter of the system.
Including the adaptive function $o(s)$ in Equation (20) yields a new exponential convergence law. Compared with the case without the adaptive function (i.e., $o(s) = 1$), the analysis shows that when the system state is far from the sliding surface (namely, when $s$ is far from the origin), $o(s)$ strengthens the convergence law, which speeds up convergence and shortens the time the system state takes to reach the target. When the system state is close to the sliding surface, $s$ converges to 0 and $o(s)$ becomes less than 1. Here the role of $o(s)$ is to suppress the jitter amplitude and weaken the fluctuation of the state variables after the system has stabilized, and the suppression becomes more pronounced as the parameter $b$ increases. To further weaken the jitter of the stabilized system, the traditional sign function $\mathrm{sgn}(s)$ [37] is smoothed in this paper, and the smoothed function replaces $\mathrm{sgn}(s)$ in the convergence law.
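The effect of smoothing the sign function can be illustrated with a small numerical experiment. The sketch below is built on assumptions: the expressions for $o(s)$ and for the smoothed sign function are not available here, so `adaptive_gain` is a hypothetical stand-in that only reproduces the qualitative behavior described above (below 1 near $s = 0$, above 1 far from it, stronger suppression for larger `b`), and `smooth_sgn` uses the common boundary-layer form $s / (|s| + \delta)$. The coefficients `EPS` and `K` are illustrative values.

```python
import math

def sgn(s):
    """Discontinuous sign function used by the traditional law."""
    return (s > 0) - (s < 0)

def smooth_sgn(s, delta=0.05):
    """Continuous stand-in for sgn(s); delta sets the boundary-layer width."""
    return s / (abs(s) + delta)

def adaptive_gain(s, b=2.0):
    """Hypothetical o(s): below 1 near s = 0, approaching 2 far from it."""
    return 2.0 / (1.0 + math.exp(-b * (abs(s) - 1.0)))

def simulate(law, s0=2.0, t_end=5.0, dt=0.01):
    """Euler-integrate s_dot = law(s) and return the trajectory of s."""
    s, history = s0, [s0]
    for _ in range(int(round(t_end / dt))):
        s += law(s) * dt
        history.append(s)
    return history

EPS, K = 1.0, 2.0   # illustrative isokinetic and exponential coefficients

def traditional(s):
    """Traditional exponential law with the discontinuous sign function."""
    return -EPS * sgn(s) - K * s

def improved(s):
    """Assumed improved law: adaptive gain times the smoothed exponential law."""
    return -adaptive_gain(s) * (EPS * smooth_sgn(s) + K * s)

s_trad = simulate(traditional)
s_impr = simulate(improved)
# Residual oscillation amplitude over the last second of each run
chatter_trad = max(abs(v) for v in s_trad[-100:])
chatter_impr = max(abs(v) for v in s_impr[-100:])
```

In discrete time the traditional law keeps switching around $s = 0$ with an amplitude set by the step size and the isokinetic coefficient, whereas the smoothed law settles without switching. How quickly the assumed law approaches the surface depends entirely on the hypothetical `adaptive_gain`, so the sketch should not be read as evidence about the convergence speed of the proposed law.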

The Design of the Forward-Rotation Subsystem
For convenience, the system is reorganized into the form of Equation (24).
The state variables in the system described by Equation (24) are highly coupled. To address this issue and to expand the system's domain of asymptotic stability, a hierarchical sliding mode controller was designed. The controller's primary objective is to use an input control law that simultaneously controls both system variables, thereby mitigating the problem of system coupling [38].
Having obtained the expected forward velocity ($v_d$) and angular velocity ($w_d$) from Equation (11), the errors between the actual and expected values are defined as the tracking errors. To design the sliding mode error tracking scheme for the $v$-$w$ subsystem, two mutually independent first-layer sliding surfaces were first constructed, as given in Equations (26) and (27), whose design constants are positive. Equations (26) and (27) are then differentiated.
According to Filippov's equivalent control theory, the equivalent control laws for the two first-layer surfaces are obtained from these derivatives. The second-layer sliding surface is expressed as a linear combination of the first-layer sliding surfaces.
To control both variables, the equivalent control laws must be applied at the same time so that each variable enters its designed sliding surface. The total control law therefore consists of the equivalent control laws and a switching law $\tau_{sw}$, which governs the phase of convergence to the sliding surface.
To mitigate the jitter phenomenon of the system, the isokinetic and exponential terms of the sliding mode control are improved. To prove that the designed controller is stable, a Lyapunov function $V_3$ is chosen, and its derivative with respect to time is computed.
From Lemma 1 in [38], it follows that $V_3(t)$ converges exponentially to 0, and the rate of convergence depends on a design parameter of the convergence law.
Consequently, the error state can reach the sliding surface in finite time. Subsequently, the first-layer sliding surfaces $s_1$ and $s_2$ converge asymptotically to zero, so that both the rotational and forward velocities of the WIPR converge to their desired values.
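The exponential convergence claimed for $V_3$ follows a standard pattern, which can be sketched as below. The notation is assumed rather than taken from the original equations: $S$ stands for the second-layer sliding surface, $\varepsilon_1$ and $k_1$ for the positive isokinetic and exponential coefficients, and $o(S) > 0$ for the adaptive function. The sketch also assumes that the disturbance is fully compensated, so that the closed-loop surface dynamics reduce to the convergence law itself.

```latex
% Quadratic Lyapunov candidate and assumed closed-loop surface dynamics
V_3 = \tfrac{1}{2} S^2, \qquad
\dot{S} = -o(S)\,\varepsilon_1\,\mathrm{sgn}(S) - k_1 S .

% Both terms of the derivative are nonpositive
\dot{V}_3 = S\dot{S} = -o(S)\,\varepsilon_1\,|S| - k_1 S^2 \le -2 k_1 V_3 .

% Comparison lemma: exponential decay at a rate set by k_1
V_3(t) \le V_3(0)\,e^{-2 k_1 t} .
```

Under these assumptions the decay rate is governed by the exponential-term coefficient, and a residual disturbance estimation error would add a bounded term to the right-hand side of the inequality for $\dot{V}_3$.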

The Design of the Tilt-Angle Subsystem
The system discussed in the previous section can achieve complete tracking of the desired forward and rotational velocities. As the WIPR aims to keep the pendulum vertical and stable during its motion, the desired tilt angle and its first and second derivatives can all be set to zero, and the tilt-angle tracking error is defined accordingly. The sliding surface is defined as Equation (39), with its derivative expressed as Equation (40).
After substituting Equation (37), the control law for the tilt-angle subsystem is derived as presented in Equation (41).
To prove the stability of the designed system, a Lyapunov function $V_4$ is chosen.

Simulation
This section discusses the trajectory tracking performance of the system in a simulation environment, verifies the feasibility of the proposed control scheme, and examines its advantages over other control systems. The simulation results of the different control systems under the same disturbance are compared to verify the control effectiveness of each system. The parameters of the system are shown in Table 2.
Table 2. The value of each parameter variable in the system.

Parameter (Unit) Value
The simulation experiments in MATLAB/Simulink verified the high-precision trajectory tracking capability of the system and the stability of the pendulum during robot motion. The initial position of the robot was set at the start of the simulation study. To demonstrate the superiority of the proposed method, three comparative experiments were conducted under the given disturbance conditions: the first used the unimproved HSMC method, the second used the improved HSMC method (IHSMC) with the adaptive law but without the nonlinear disturbance observer, and the third used the method proposed in this paper (referred to as PC).
First, Figures 4-6 show that the proposed method is better than the other two control methods in terms of both the speed with which the error converges to the steady state and the magnitude of the error fluctuation after the steady state is reached. This reflects the effectiveness of the method, which can track the given reference trajectory accurately.
Figures 7 and 8 show the tracking of the desired speed of the WIPR system under the three control methods. The proposed method tracks the desired speed more rapidly, reaching it at 0.7 s, whereas the other two methods take more than 1 s and exhibit a high fluctuation frequency, which may affect the stability of the WIPR. As can be seen from Figure 8, the proposed method also exceeds the other two methods in tracking the rotational velocity. The convergence speed is fast and, more importantly, the proposed method is very stable after the velocity tracking reaches the steady state, which can be regarded as showing no fluctuation compared with the other two methods.
As can be seen from the angle of the WIPR pendulum shown in Figure 9, the HSMC method with the general exponential convergence law exhibits more frequent angle oscillations and is less stable, whereas with the IHSMC method the pendulum oscillates more smoothly after reaching stability, and the control effect is clearly better than that of HSMC. In terms of response time and maximum overshoot, the proposed method outperforms the other two, and it continues to operate smoothly and without oscillations once it has reached the stabilization point.
The input torque results indicate that when HSMC follows the general exponential convergence law, the jitter of the input torque for the left and right wheels of the WIPR is evident, which adversely affects the output of the actuators (i.e., the drive motors of the left and right wheels). In contrast, Figure 11 illustrates that the improved convergence law significantly reduces the jitter, which benefits the actuators. Figure 12 presents the variation of the input torque under the proposed control method. The input torque obtained by this method is smoother than that of IHSMC, and the jitter suppression is more satisfactory. The proposed control method therefore performs best among the three in reducing jitter and easing the load on the actuators.
Figure 13 shows the trajectory tracking diagrams of the different control systems. Visually, the proposed scheme is also significantly better than the other two schemes. The simulation comparison shows that the method proposed in this paper is feasible and effective.

Conclusions
The purpose of this paper was to study the trajectory tracking problem for WIPRs and to propose a hierarchical sliding mode controller with a nonlinear disturbance observer that tracks the reference trajectory accurately and maintains the stability of the pendulum during motion. A nonlinear disturbance observer was designed to make the system more robust to unknown external disturbances. The underactuated coupling of WIPRs was addressed by dividing the system into two subsystems through the decoupling of its control state variables. The hierarchical sliding mode control method with an improved convergence law was then applied to control the system and suppress the "jitter" phenomenon. Finally, Lyapunov functions were chosen to verify the stability of the system mathematically.
The feasibility of the control system was verified in simulation. However, considering the complexity of real-world environments and external uncertainty, future work will focus on building a hardware system for the robot to study how the control method performs on a physical WIPR.

MII 1 .
The total weight of the transport platform plus the pendulum w The rotational inertia of each driven wheel M The rotational inertia of the transport platform and the pendulum d The distance between the two wheels L The tilt angle of the pendulum Remark The forward velocity of MWIPR is v x , and cos sin
defined and used to transform the equation.The transformation involves multiplying both sides of the equation by a scalar   T m control objective designed to manage linear and angular velocities.The objective is to ensure that the actual robot travel trajectory aligns with the reference trajectory, even if the trajectory error zero.To meet the requirement expressed above, the control laws for d v and d w can be devised as Equation(8).

While the error persists, the value 1 V
remains greater than zero, thereby rendering the function positive definite.Equation(10) describes the derivative of 1

Figure 2. The adaptive function $o(s)$.

The coefficients of the isokinetic and exponential terms of the convergence law designed above are both positive constants, and the adaptive and sign functions are those designed above.

The derivative of $V_4$ with respect to time is then evaluated. The chosen design ensures that $\dot{V}_4 \le 0$ holds, indicating that the system achieves asymptotic stability. Figure 3 displays the schematic block diagram of the control system.
The desired tracking velocity is 1 m/s and the desired angular velocity is 1 rad/s. The reference trajectory of the robot is therefore a circle with a radius of 1 m centered at $(0, 0)$, and the time function of the reference trajectory was chosen accordingly. A disturbance was added to the system during the simulation.

Figure 4. The error of x.

Figure 5. The error of y.

The terms of the dynamic model in Equation (2) include the gravity matrix, the control input matrix, and the total unknown disturbance $\tau_d$. The detailed expressions of each vector or matrix accompany the model.