Article

Design, Implementation, and Control of a Wheel-Based Inverted Pendulum

by Dominik Zaborniak 1,†, Krzysztof Patan 2,† and Marcin Witczak 2,*
1 Faculty of Computer, Electrical and Control Engineering, University of Zielona Góra, 65-516 Zielona Góra, Poland
2 Institute of Control and Computation Engineering, University of Zielona Góra, 65-516 Zielona Góra, Poland
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Electronics 2024, 13(3), 514; https://doi.org/10.3390/electronics13030514
Submission received: 30 December 2023 / Revised: 22 January 2024 / Accepted: 23 January 2024 / Published: 26 January 2024
(This article belongs to the Special Issue State-of-the-Art Research in Systems and Control Engineering)

Abstract
Control of an inverted pendulum is a classical example of the stabilisation problem for systems that are unstable by nature. The reaction wheel and the motor act as actuators, generating the torque needed to stabilise the system and counteract inevitable disturbances. This paper begins by describing the design and physical implementation of a wheel-based inverted pendulum. Subsequently, proportional–integral–derivative (PID) and unknown input Kalman-filter-based linear quadratic regulator (LQR) controllers are designed and tested. In particular, the design and pre-validation were carried out in the Matlab/Simulink environment. The final validation step was realised using the constructed physical pendulum, with a digital controller implemented on an STM32 board. Finally, a set of physical disturbances was introduced to the system to show the high reliability and superiority of the proposed Kalman-filter-based LQR strategy.

1. Introduction

Irrespective of the system under investigation, the main aim of any control strategy is to stabilise it, that is, to bring it to a so-called equilibrium state, while satisfying a set of predefined performance indices. There is, of course, a wide set of control strategies that can cope with this task, and their usefulness is usually validated on challenging benchmark systems. Undoubtedly, the inverted pendulum belongs to this group. At first glance, it seems to be a relatively simple system: a suspended pendulum that must be forced to stand in a vertical position. Unfortunately, this system raises a number of important challenges. These include suitable system modelling, coping with non-linearity, non-minimum-phase behaviour, and under-actuation. Moreover, the goal is to keep the pendulum at an unstable equilibrium point. To make this possible, the controller must continuously and appropriately balance the system’s centre of gravity above the axis of rotation. Additionally, the control system must be fast enough to counteract attempts to destabilise it. Apart from the control-oriented problems, the design of the inverted pendulum system raises several issues. These involve, but are not limited to, proper selection of electrical and actuator components, control device programming, sensor fusion, and data filtering.
Taking into account these preliminary discussions, several authors have attacked the above-defined problem from different angles. In [1], the authors focus on the stabilisation problem in the presence of a constant unknown bias in the pendulum angle measurements. Another study [2] proves that even in the presence of a time delay in the system, a wheeled inverted pendulum can be well stabilised with data taken from only one accelerometer. The authors of [3] made a comparison between different control strategies for a rotary inverted pendulum. In [4], the authors performed an investigation devoted to robust generalised dynamic inversion. Another interesting study concerning robust control was proposed in [5], along with $H_\infty$ analysis. A neural network-based control approach was introduced in [6]. Yet another method that involves fuzzy controllers was proposed in [7]. Similarly, the authors of [8] developed a fuzzy controller and investigated its performance by comparing it with an LQR controller. An interesting study concerning an extended platform that includes one additional reaction wheel is described in [9]. Furthermore, [10] can be considered a useful guide to the optimal mechanical design of a pendulum and reaction wheel for a given electric motor, as well as a wheel diameter that maximises the recovery angle. A mathematical model of inverted pendulum systems was derived in [11]. Special attention has also been paid to the system modelling problem [12]. Finally, since not all state variables are available, the importance of an accurate state estimation is discussed in [13]. Inverted pendulums with a higher degree of freedom have also been discussed by several authors. For example, [14] describes a dual-axis, self-balancing, reaction-wheel-based inverted pendulum system. Another impressive work [15] presents a novel 3D inverted pendulum that can be balanced using only a single reaction wheel.
Another very interesting system that involves a cube that contains three reaction wheels and a nonlinear control algorithm was described in [16]. The inverted pendulum model was also investigated in a study concerning walking robots [17]. The above-presented review of the latest state-of-the-art methods clearly indicates the problems facing the current research, which can be summarised as follows:
  • The need for a simple and cost-efficient hardware design (mechanics and electronics).
  • The necessity of developing a representative mathematical model, along with a strategy that allows its efficient parameter estimation.
  • The need to develop efficient control and estimation strategies that allow for the desired and reliable performance of inverted pendulums.
Taking into account the above discussion and literature review, the contributions of this paper can be summarised as follows:
  • Determination of the pendulum structure and its nonlinear model (Section 2).
  • A proposal for a cost-efficient hardware architecture (Section 2.2).
  • Application of the small-angle approach for the determination of the linear state-space model of an inverted pendulum (Section 2).
  • Modelling and data-based identification of an inverted pendulum (Section 3).
  • Validation of the identified model (Section 3).
  • Matlab/Simulink-based preliminary validation of the model (Section 3).
  • Design and analysis of dedicated cascade PID and Kalman-filter-based LQR controllers (Section 4).
  • Experimental validation of the proposed design and control strategies (Section 5).
Finally, the last section summarises the paper and proposes future research directions.

2. Inverted Pendulum Model and Design

2.1. Mathematical Model

In order to analyse the system and design the controller, a mathematical model of a reaction wheel inverted pendulum (RWIP) was first derived. Figure 1 shows the scheme of the RWIP system. It consists of a fixed base, a rotating arm, and a rotating reaction wheel. The frame $(x_0, y_0, z_0)$ is an inertial one attached to the pendulum’s base, where $z_0$ coincides with the pendulum arm’s axis of rotation. The frame $(x_1, y_1, z_1)$ is a non-inertial frame of reference attached to the end of the moving pendulum arm, where $z_1$ coincides with the wheel’s axis of rotation. The force $F_G$ is the gravitational force acting at the system’s centre of mass. The force $F_{GT}$ is the component of $F_G$ perpendicular to the arm of the pendulum. The angle $\theta$ denotes the pendulum’s roll and is measured in the frame $(x_0, y_0, z_0)$ between the vertical, that is, $y_0$, and the current position of the pendulum’s arm. Its acceleration is given by
$$\ddot{\theta} = \frac{M_p d_M g}{J}\sin(\theta) - \frac{b_\theta}{J}\dot{\theta} + \frac{b_\alpha}{J}\dot{\alpha} + \frac{1}{J}\tau_d - \frac{1}{J}\tau_w, \tag{1}$$
where $M_p$ is the combined mass of the pendulum’s arm and reaction wheel; $d_M$ stands for the distance between the axis $z_0$ and the centre of mass; $g$ is the gravitational acceleration; $J$ is the pendulum’s total moment of inertia (including the arm, motor, and reaction wheel) with respect to the $z_0$ axis; $b_\theta$ and $b_\alpha$ are the coefficients of friction associated with the angles $\theta$ and $\alpha$, respectively; $\dot{\theta}$ and $\dot{\alpha}$ are the first derivatives (angular velocities) of $\theta$ and $\alpha$, respectively; $\tau_d$ stands for the torque caused by disturbing forces acting on the pendulum’s arm; and $\tau_w$ denotes the torque generated by the reaction wheel. The angle $\alpha$, measured in the frame $(x_1, y_1, z_1)$, describes how much the reaction wheel is rotated around the $z_1$ axis. Its acceleration is given by the following equation:
$$\ddot{\alpha} = -\frac{M_p g d_M}{J}\sin(\theta) + \frac{b_\theta}{J}\dot{\theta} - \frac{(J + J_w)\, b_\alpha}{J J_w}\dot{\alpha} - \frac{1}{J}\tau_d + \frac{J + J_w}{J J_w}\tau_w, \tag{2}$$
where $J_w$ is the wheel’s moment of inertia with respect to the $z_1$ axis.
The reaction wheel is driven by a DC motor, which is described by the following set of equations:
$$\tau_w = k_m i_a, \qquad R_a i_a + L_a \frac{d i_a}{dt} = u - k_e \dot{\alpha}, \tag{3}$$
where $u$ is the motor’s supply voltage, with a maximum value of $U_{dd} = 12$ V; $i_a$ is the armature circuit current; $R_a$ is the armature circuit resistance; $L_a$ is the armature circuit inductance; $k_m$ is the motor’s mechanical constant; and $k_e$ is the motor’s back electromotive force (EMF) constant.
It was assumed that the time constant of the armature circuit is much smaller than the mechanical time constants of the rest of the system. Therefore, $L_a = 0$ was assumed. Solving the second equation in (3) for $i_a$ and substituting it into the first one gives the following simplified motor model:
$$\tau_w = \frac{k_m}{R_a} u - \frac{k_m k_e}{R_a}\dot{\alpha}. \tag{4}$$
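Introducing (4) into (1) and (2) gives the following combined model of the inverted pendulum:
$$\begin{aligned} \ddot{\theta} &= \frac{M_p d_M g}{J}\sin(\theta) - \frac{b_\theta}{J}\dot{\theta} + \frac{1}{J}\left(b_\alpha + \frac{k_m k_e}{R_a}\right)\dot{\alpha} + \frac{1}{J}\tau_d - \frac{1}{J}\frac{k_m}{R_a}u, \\ \ddot{\alpha} &= -\frac{M_p g d_M}{J}\sin(\theta) + \frac{b_\theta}{J}\dot{\theta} - \frac{J + J_w}{J J_w}\left(b_\alpha + \frac{k_m k_e}{R_a}\right)\dot{\alpha} - \frac{1}{J}\tau_d + \frac{J + J_w}{J J_w}\frac{k_m}{R_a}u. \end{aligned} \tag{5}$$
Model (5) is nonlinear. For small values of $\theta$, $\sin(\theta) \approx \theta$. Using this approximation, the nonlinear model (5) can be rewritten in a simplified linear form, as follows:
$$\begin{aligned} \ddot{\theta} &= \frac{M_p d_M g}{J}\theta - \frac{b_\theta}{J}\dot{\theta} + \frac{1}{J}\left(b_\alpha + \frac{k_m k_e}{R_a}\right)\dot{\alpha} + \frac{1}{J}\tau_d - \frac{1}{J}\frac{k_m}{R_a}u, \\ \ddot{\alpha} &= -\frac{M_p g d_M}{J}\theta + \frac{b_\theta}{J}\dot{\theta} - \frac{J + J_w}{J J_w}\left(b_\alpha + \frac{k_m k_e}{R_a}\right)\dot{\alpha} - \frac{1}{J}\tau_d + \frac{J + J_w}{J J_w}\frac{k_m}{R_a}u. \end{aligned} \tag{6}$$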
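How far the small-angle approximation can be trusted is easy to check numerically. The sketch below is an illustration only (the function name and the chosen angles are assumptions, not part of the original study); it prints the relative error made when $\sin(\theta)$ is replaced by $\theta$.

```python
import math

def small_angle_rel_error(theta):
    """Relative error of replacing sin(theta) by theta (theta in rad, non-zero)."""
    return abs(theta - math.sin(theta)) / abs(math.sin(theta))

# Illustrative angles; the error grows roughly as theta**2 / 6.
for deg in (1, 5, 10, 20):
    theta = math.radians(deg)
    print(f"{deg:2d} deg: relative error = {small_angle_rel_error(theta):.4%}")
```

The error stays well below one percent for tilts of a few degrees, which is the range in which a balancing controller is expected to operate.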
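Model (6) can be rewritten in the following equivalent state-space form:
$$\dot{\boldsymbol{x}} = \boldsymbol{A}\boldsymbol{x} + \boldsymbol{B}\boldsymbol{u}, \qquad \boldsymbol{y} = \boldsymbol{C}\boldsymbol{x}, \tag{7}$$
with:
$$\boldsymbol{x} = \begin{bmatrix} \theta & \dot{\theta} & \alpha & \dot{\alpha} \end{bmatrix}^T, \qquad \boldsymbol{u} = \begin{bmatrix} u & \tau_d \end{bmatrix}^T,$$
$$\boldsymbol{A} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ \frac{M_p d_M g}{J} & -\frac{b_\theta}{J} & 0 & \frac{1}{J}\left(b_\alpha + \frac{k_m k_e}{R_a}\right) \\ 0 & 0 & 0 & 1 \\ -\frac{M_p g d_M}{J} & \frac{b_\theta}{J} & 0 & -\frac{J + J_w}{J J_w}\left(b_\alpha + \frac{k_m k_e}{R_a}\right) \end{bmatrix}, \quad \boldsymbol{B} = \begin{bmatrix} 0 & 0 \\ -\frac{1}{J}\frac{k_m}{R_a} & \frac{1}{J} \\ 0 & 0 \\ \frac{J + J_w}{J J_w}\frac{k_m}{R_a} & -\frac{1}{J} \end{bmatrix}, \quad \boldsymbol{C} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
The variable $\alpha$ does not appear on the right-hand side of Equations (5) and (6); therefore, it does not directly affect the dynamics of the system. This is reflected in the structure of $\boldsymbol{A}$, whose third column is zero. Moreover, as can be seen in the matrix $\boldsymbol{C}$, our physical realisation of the pendulum does not have a sensor that measures the angular velocity of the wheel $\dot{\alpha}$. Thus, its value should be estimated based on measurements of $\alpha$.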

2.2. Physical System Design and Implementation

The pendulum was designed using Fusion 360 software. The final version of the simulation model can be seen in Figure 2a. All of the components were created based on this model. Most of them were cut from wood, but a couple of parts were manufactured using a 3D printer. The complete laboratory stand can be seen in Figure 2b.
The electrical system is built around an STM32 Nucleo-F103RB board that acts as a digital controller. It is responsible for collecting data from the following sensors:
  • A rotary encoder (E38S6-C-(600)B5-26G2) to measure $\theta$.
  • An encoder inside the gear motor (Pololu 4752) to measure $\alpha$.
  • An inertial measurement unit (IMU), LSM6DS33, to record $\dot{\theta}$.
The controller stabilises the pendulum by calculating the motor control signal $u$. It sends a corresponding pulse-width modulation (PWM) signal to the H-bridge module L298N, which feeds the motor. For the sake of simplification, the H-bridge is assumed to be an ideal component that maps a PWM duty cycle between 0 and 1 linearly, and without any losses, to a motor voltage between 0 and $U_{dd}$. The microcontroller unit can operate without the need for external computing systems. It is programmed in C++ and uses an internal clock to ensure a constant sampling time. The conceptual component connection diagram is shown in Figure 3a. Moreover, the implementation phases are portrayed in Figure 3b.
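The idealised H-bridge assumption can be written down in a few lines. The following sketch is not the authors' firmware (which is written in C++); the function name and the separate direction flag are assumptions made for illustration. It maps a requested motor voltage to a saturated duty cycle and a rotation direction.

```python
U_DD = 12.0  # maximum supply voltage in volts, as stated for the motor model

def voltage_to_pwm(u, u_dd=U_DD):
    """Map a requested motor voltage u to (duty_cycle, direction).

    duty_cycle lies in [0, 1]; direction is +1, -1, or 0 when u == 0.
    Assumes the ideal, lossless, linear H-bridge described in the text.
    """
    direction = (u > 0) - (u < 0)        # sign of u as an integer
    duty = min(abs(u) / u_dd, 1.0)       # saturate at full duty cycle
    return duty, direction

print(voltage_to_pwm(6.0))    # half of the supply voltage, forward
print(voltage_to_pwm(-15.0))  # request beyond the supply is clipped
```

The clipping step makes the actuator limit explicit, which matters later when the wheel speed drifts and the motor approaches saturation.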

3. Parameter Estimation

The following parameters were taken directly from the physical model: $d_M = 0.131$ m, $M_p = 0.4753$ kg, and $M_w = 0.2003$ kg. The other parameters have to be derived from physical laws or estimated from measurements. This process is described in the following sections.

3.1. The Reaction Wheel’s Moment of Inertia

The reaction wheel consists of a wooden disk and weights made of nuts and bolts. The disk’s mass is $m_d = 0.0670$ kg, and the mass of each weight is $m_w = 0.0333$ kg. The disk was modelled as a uniform annulus with an inner radius $r_i = 0.06$ m and an outer radius $r_o = 0.10$ m. Its moment of inertia $J_d$ is described by the following formula:
$$J_d = \frac{1}{2} m_d \left(r_i^2 + r_o^2\right) = 4.56 \cdot 10^{-4}\ \mathrm{kg \cdot m^2}. \tag{8}$$
The weights are located at a distance $r_w = 0.08$ m from the centre of the annulus. The moment of inertia of an individual weight $J_{iw}$ is approximated by the moment of inertia of a point mass:
$$J_{iw} = m_w r_w^2 = 2.13 \cdot 10^{-4}\ \mathrm{kg \cdot m^2}. \tag{9}$$
In this way, the moment of inertia of the entire wheel, which carries four weights, is equal to the following:
$$J_w = J_d + 4 J_{iw} = 1.3 \cdot 10^{-3}\ \mathrm{kg \cdot m^2}. \tag{10}$$
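The arithmetic above can be reproduced with a short script. This is only a check of the stated formulas; the variable and function names are chosen here for illustration.

```python
M_DISK = 0.0670     # kg, mass of the wooden disk
R_INNER = 0.06      # m, inner radius of the annulus
R_OUTER = 0.10      # m, outer radius of the annulus
M_WEIGHT = 0.0333   # kg, mass of one weight
R_WEIGHT = 0.08     # m, distance of each weight from the centre
N_WEIGHTS = 4       # number of weights, as in J_w = J_d + 4 J_iw

def annulus_inertia(m, r_in, r_out):
    """Moment of inertia of a uniform annulus about its axis."""
    return 0.5 * m * (r_in**2 + r_out**2)

def point_mass_inertia(m, r):
    """Moment of inertia of a point mass at distance r from the axis."""
    return m * r**2

J_d = annulus_inertia(M_DISK, R_INNER, R_OUTER)
J_iw = point_mass_inertia(M_WEIGHT, R_WEIGHT)
J_w = J_d + N_WEIGHTS * J_iw

print(f"J_d  = {J_d:.3e} kg m^2")
print(f"J_iw = {J_iw:.3e} kg m^2")
print(f"J_w  = {J_w:.3e} kg m^2")
```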

3.2. The Pendulum’s Moment of Inertia

The moment of inertia $J$ was determined based on the model of a physical pendulum. Knowing the oscillation period $T$, the moment of inertia can be calculated as follows:
$$J = \frac{T^2}{4\pi^2} M_p g d_M. \tag{11}$$
The reaction wheel was locked, making the pendulum behave like a rigid body, oscillating freely with a small amplitude. The period of oscillation $T$ was determined by analysing the pendulum’s response, as presented in Figure 4:
$$T = \frac{19.32 - 0.53}{23} = 0.8170\ \mathrm{s}. \tag{12}$$
Using (11), one can calculate the moment of inertia $J$ as follows:
$$J = 0.0103\ \mathrm{kg \cdot m^2}. \tag{13}$$
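The two steps above (averaging the period over 23 oscillations and applying the physical-pendulum formula) can be checked as follows. The value $g = 9.81\ \mathrm{m/s^2}$ is an assumption, since the text does not state which value was used.

```python
import math

G = 9.81        # m/s^2, assumed value of the gravitational acceleration
M_P = 0.4753    # kg, combined mass of arm and wheel
D_M = 0.131     # m, distance from the pivot to the centre of mass

# Period averaged over 23 oscillations read from the recorded response.
T = (19.32 - 0.53) / 23

# Physical pendulum: T = 2*pi*sqrt(J / (M_p*g*d_M)), solved for J.
J = T**2 / (4 * math.pi**2) * M_P * G * D_M

print(f"T = {T:.4f} s")
print(f"J = {J:.4f} kg m^2")
```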

3.3. Friction Coefficient

The reaction wheel was kept locked in order to estimate the coefficient of friction $b_\theta$. To do this, the pendulum needs to swing faster (with larger $\dot{\theta}$) and thus with a greater amplitude. In this case, the motion of the pendulum, with motion resistance and nonlinear gravity, is described by the following equation:
$$J\ddot{\theta} - M_p d_M g \sin(\theta) + b_\theta \dot{\theta} = 0. \tag{14}$$
Based on (14), a nonlinear dynamic model was built in Simulink (see Figure 5). Again, the laboratory stand was forced to swing and measurements of $\theta$ were collected. The obtained measurements were compared with the outputs of the simulation model and the parameter $b_\theta$ was adjusted accordingly. The process was repeated until the lowest value of the mean square error (MSE) was obtained. In this way, the value of $b_\theta = 5.6799 \cdot 10^{-6}\ \mathrm{kg\,m^2\,s^{-1}}$ was estimated.
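The fitting loop described above can be sketched without Simulink. The code below is a minimal illustration, not the authors' procedure: it integrates Equation (14) with a fourth-order Runge–Kutta scheme and picks, from a list of candidates, the friction coefficient with the lowest MSE. The "measurement" is synthetic, and `b_true`, the candidate grid, the initial angle, and the step size are all hypothetical values; $g = 9.81\ \mathrm{m/s^2}$ is assumed.

```python
import math

J = 0.0103                     # kg m^2, pendulum inertia from Section 3.2
MGD = 0.4753 * 9.81 * 0.131    # M_p * g * d_M, with g = 9.81 m/s^2 assumed

def simulate(b_theta, theta0, n_steps, dt):
    """Integrate J*th'' - MGD*sin(th) + b_theta*th' = 0 with RK4."""
    def deriv(theta, omega):
        return omega, (MGD * math.sin(theta) - b_theta * omega) / J
    theta, omega, out = theta0, 0.0, []
    for _ in range(n_steps):
        out.append(theta)
        k1t, k1w = deriv(theta, omega)
        k2t, k2w = deriv(theta + 0.5 * dt * k1t, omega + 0.5 * dt * k1w)
        k3t, k3w = deriv(theta + 0.5 * dt * k2t, omega + 0.5 * dt * k2w)
        k4t, k4w = deriv(theta + dt * k3t, omega + dt * k3w)
        theta += dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6
        omega += dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
    return out

def mse(a, b):
    """Mean square error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def fit_friction(measured, candidates, theta0, dt):
    """Return the candidate b_theta whose simulation best matches the data."""
    return min(candidates,
               key=lambda b: mse(simulate(b, theta0, len(measured), dt), measured))

# Synthetic stand-in for the recorded swing (hypothetical values).
b_true = 1.0e-3
theta0 = math.pi + 0.5         # hanging pendulum displaced by 0.5 rad
measured = simulate(b_true, theta0, 1000, 0.005)

candidates = [0.0, 0.5e-3, 1.0e-3, 1.5e-3, 2.0e-3]
b_hat = fit_friction(measured, candidates, theta0, 0.005)
print("estimated b_theta:", b_hat)
```

Because $\theta$ is measured from the upright position, the freely swinging pendulum oscillates around $\theta = \pi$. With real data, a finer grid or a scalar optimiser would replace the short candidate list.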

3.4. The Motor’s Parameters

In order to determine the parameters of the motor, the pendulum’s arm was kept fixed ($\theta = \mathrm{const}$). In this case, the reaction wheel accelerates according to the following equation:
$$J_w \ddot{\alpha} + \left(b_\alpha + \frac{k_m k_e}{R_a}\right)\dot{\alpha} = \frac{k_m}{R_a} u. \tag{15}$$
Let $T_w = J_w \left(b_\alpha + \frac{k_m k_e}{R_a}\right)^{-1}$ and $K_w = \left(b_\alpha + \frac{k_m k_e}{R_a}\right)^{-1} \frac{k_m}{R_a}$. Then, (15) can be rewritten in the following form:
$$T_w \ddot{\alpha} + \dot{\alpha} = K_w u. \tag{16}$$
The time constant $T_w$ and gain $K_w$ were estimated based on the step response of the motor, as illustrated in Figure 6.
From the step response shown in Figure 6, the value of $T_w$ was found to be equal to $0.1$ s, while $K_w$ was derived from the slope of the step response; its value is equal to $3.4114\ \mathrm{V^{-1}\,s^{-1}}$. The remaining parameters were derived from (15) as follows:
$$\left(b_\alpha + \frac{k_m k_e}{R_a}\right) = \frac{J_w}{T_w} = 0.0131, \qquad \frac{k_m}{R_a} = K_w \left(b_\alpha + \frac{k_m k_e}{R_a}\right) = 0.0447. \tag{17}$$
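The back-calculation of the lumped motor parameters from $T_w$ and $K_w$ can be sketched as below. The wheel inertia is taken as the unrounded value from Section 3.1, so the last digits may differ slightly from the figures quoted in the text; the helper `wheel_speed` is an illustrative addition showing the first-order step response implied by the model.

```python
import math

J_W = 1.308e-3   # kg m^2, wheel inertia from Section 3.1 (unrounded)
T_W = 0.1        # s, time constant read from the step response
K_W = 3.4114     # 1/(V s), gain read from the step response

# Lumped damping term (b_alpha + k_m*k_e/R_a) and input gain k_m/R_a.
damping = J_W / T_W
input_gain = K_W * damping

def wheel_speed(t, u):
    """Wheel speed response of T_w*a'' + a' = K_w*u to a voltage step u at t = 0."""
    return K_W * u * (1.0 - math.exp(-t / T_W))

print(f"b_alpha + k_m*k_e/R_a = {damping:.4f}")
print(f"k_m/R_a               = {input_gain:.4f}")
print(f"speed after one time constant at 1 V: {wheel_speed(T_W, 1.0):.3f} rad/s")
```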

3.5. Summary of the Model’s Parameters

All the necessary parameters concerning successful system modelling are summarised in Table 1. A final comparison between the derived model and the laboratory stand can be seen in Figure 7. This comparison presents the rectangular pulse response of the considered pendulum system.

4. Control of Inverted Pendulum

In order to control the RWIP, two closed-loop methods were examined and compared: a cascade control system with two PID controllers and a linear–quadratic–Gaussian (LQG) control system.

4.1. Cascade PID Control

As an input, the PID controller is fed with an error signal, which is the difference between the reference and the measured output, i.e., $e = y_{ref} - y$. Based on this error, a control signal is generated that affects the system in such a way as to minimise it. For an ideal PID controller, the control signal is the sum of a term proportional to the current error, $K_p$ (P term), to the integral of the previous error values, $K_i \frac{1}{s}$ (I term), and to the error rate of change, $K_d s$ (D term). In practice, the differentiating component amplifies high-frequency signals, including noise and oscillations caused by the step changes of the digital signal. To avoid this, a low-pass filter is added, resulting in a filtered version of the PID controller, as follows:
$$G_{\mathrm{PIDF}}(s) = K_p + \frac{K_i}{s} + \frac{K_d s}{T_f s + 1}, \tag{18}$$
where $T_f$ is the time constant of the low-pass filter. This form was implemented within the microcontroller. However, for design purposes, a representation that exhibits the positions of the controller’s zeros and poles was employed, as follows:
$$G_{\mathrm{PIDF}}(s) = K_m \frac{(s - s_{z1})(s - s_{z2})}{s\,(s - s_p)}, \tag{19}$$
where $s_{z1}$ and $s_{z2}$ represent the zeros, $s_p$ denotes the pole, and $K_m$ is the overall gain of the controller. It is possible to transition from notation (19) to (18) by equating the coefficients of the two forms, which gives the following substitutions:
$$K_p = K_m\left[(s_{z1} + s_{z2})s_p - s_{z1}s_{z2}\right]s_p^{-2}, \quad K_i = -K_m s_{z1}s_{z2}s_p^{-1}, \quad K_d = -K_m s_p^{-1} + K_m\left[(s_{z1} + s_{z2})s_p - s_{z1}s_{z2}\right]s_p^{-3}, \quad T_f = -s_p^{-1}. \tag{20}$$
For the remainder of the paper, the name PID is used to denote the filtered version.
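The conversion between the zero–pole–gain form and the parallel form can be verified numerically. The sketch below is an illustration under the sign conventions obtained by equating the coefficients of the two forms; the function names are assumptions. As an example it uses zeros at $-3$ and $-2$, a pole at $-1$, and a gain of $-4$, and it also checks that both forms give the same value at an arbitrary test frequency.

```python
def zpk_to_parallel(k_m, s_z1, s_z2, s_p):
    """Convert K_m (s - z1)(s - z2) / (s (s - p)) to (K_p, K_i, K_d, T_f)."""
    k_p = k_m * ((s_z1 + s_z2) * s_p - s_z1 * s_z2) / s_p**2
    k_i = -k_m * s_z1 * s_z2 / s_p
    k_d = (k_p - k_m) / s_p      # algebraically equal to the K_d expression in (20)
    t_f = -1.0 / s_p
    return k_p, k_i, k_d, t_f

def pidf_parallel(s, k_p, k_i, k_d, t_f):
    """Filtered PID in parallel form, evaluated at complex frequency s."""
    return k_p + k_i / s + k_d * s / (t_f * s + 1)

def pidf_zpk(s, k_m, s_z1, s_z2, s_p):
    """Filtered PID in zero-pole-gain form, evaluated at complex frequency s."""
    return k_m * (s - s_z1) * (s - s_z2) / (s * (s - s_p))

gains = zpk_to_parallel(-4.0, -3.0, -2.0, -1.0)
print("K_p, K_i, K_d, T_f =", gains)

s_test = complex(1.0, 2.0)       # arbitrary test point
print(abs(pidf_parallel(s_test, *gains) - pidf_zpk(s_test, -4.0, -3.0, -2.0, -1.0)))
```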
The cascade structure of the control system, shown in Figure 8, was used to stabilise the pendulum. It consists of an inner loop with a controller, represented by the transfer function $G_{\mathrm{PIDin}}(s)$, and an outer loop with the controller $G_{\mathrm{PIDout}}(s)$. The transfer function of the pendulum is denoted by $G_{\mathrm{RWIP}}(s)$. The task of the inner control loop is to quickly stabilise $\theta$. The outer loop stabilises the variable $\alpha$ by providing the reference signal $\theta_{ref}$ for the inner control loop. Because of the derivative term in $G_{\mathrm{PIDout}}$, if the velocity $\dot{\alpha}$ increases, then $\theta_{ref}$ decreases. This forces the pendulum to tilt slightly in the opposite direction. To maintain stability, the wheel begins to accelerate in the opposite direction. As a result, the velocity $\dot{\alpha}$ decreases, and $\theta_{ref}$ and $\theta$ go back to zero.
Note that both controllers were designed based on the root locus method. As shown in Figure 9, using state-space representation (7), the pole–zero maps of the pendulum were derived with respect to the input signal $u$.
The system has an unstable pole at $s = 7.439$. In addition, in the transfer function representing the output $\alpha$, there is a right-half-plane zero at $s = 7.25$, which creates a non-minimum-phase system. It should also be mentioned that, for the $\theta$ output, the system has one zero at the origin. Moreover, for the $\dot{\theta}$ output, there are two zeros at the origin.
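The location of the unstable pole can be cross-checked from the linear model. Because the third column of the state matrix is zero, one pole sits at the origin and the remaining three are the roots of a cubic. The sketch below finds the single positive root by bisection; it uses the rounded parameter values quoted in Section 3 and assumes $g = 9.81\ \mathrm{m/s^2}$, so the result should only be expected to agree with the quoted pole to about two decimal places.

```python
M_P, D_M, G = 0.4753, 0.131, 9.81   # kg, m, m/s^2 (g is an assumed value)
J, J_W = 0.0103, 1.3e-3             # kg m^2, rounded values from Section 3
B_THETA = 5.6799e-6                 # kg m^2 / s, pendulum friction
BETA = 0.0131                       # lumped term b_alpha + k_m*k_e/R_a

a = M_P * D_M * G / J               # gravity term
e = B_THETA / J                     # pendulum friction term
c = BETA / J                        # coupling from wheel speed to the arm
d = (J + J_W) / (J * J_W) * BETA    # wheel damping term

def char_poly(s):
    """det(sI - A) of the linear model with the pole at s = 0 factored out."""
    return s**3 + (d + e) * s**2 + (e * (d - c) - a) * s - a * (d - c)

def bisect(f, lo, hi, tol=1e-9):
    """Root of f in [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

unstable_pole = bisect(char_poly, 0.0, 20.0)
print(f"unstable open-loop pole: s = {unstable_pole:.3f}")
```

The cubic has exactly one sign change in its coefficients, so by Descartes' rule it has exactly one positive real root, which is the pole that the controllers have to move into the left half-plane.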
Let us analyse the root locus of the inner loop. Using the proportional term of $G_{\mathrm{PIDin}}$ alone, the unstable pole tends towards the zero at the origin of the complex plane. This feature is illustrated in Figure 10a. As a result, the P controller itself is not able to stabilise the system. An additional pole is therefore added at the origin. The system is still unstable; however, the unstable pole is now no longer attracted to the origin (see Figure 10b). Finally, additional zeros and a pole were added at $s_{z1} = -3$, $s_{z2} = -2$, and $s_p = -1$. As portrayed in Figure 10c, this causes the unstable pole to be attracted to $s_{z2}$.
The gain was chosen so that the unstable pole is moved to the left half-plane and the dominant poles provide an overshoot of around $12\%$, which is obtained for $K_m = -4$. The negative gain is due to the opposite directions in which the reaction wheel and the pendulum arm rotate. This can be seen in the matrix $\boldsymbol{B}$: the coefficient from the input $u$ to $\ddot{\theta}$ is negative, while the coefficient from $u$ to $\ddot{\alpha}$ is positive. The purpose of the inner loop is to control $\theta$ and, hence, a negative gain is needed. By applying the pole and zero locations to (19) and using (20), the transfer function of the inner controller in the parallel form (18) is obtained as follows:
$$G_{\mathrm{PIDin}}(s) = -4\,\frac{(s + 3)(s + 2)}{s\,(s + 1)} = 4 - \frac{24}{s} - \frac{8s}{s + 1}, \tag{21}$$
where $K_p = 4$, $K_i = -24$, $K_d = -8$, and $T_f = 1$. Such settings make it possible to stabilise $\theta$ at the required reference level. Figure 11 shows the root locus for the outer loop, which is based on the inner loop. A pole at the origin indicates that an integral component is already present. For that reason, an additional integral component would make the regulation worse. Therefore, the structure of the outer controller was simplified to include only the proportional and derivative components. The root locus gain was set to $K_m = 0.01$ and the transfer function of $G_{\mathrm{PIDout}}$ was chosen as follows:
$$G_{\mathrm{PDout}}(s) = 0.01\,\frac{s}{s + 15} = \frac{0.000667\,s}{0.0667\,s + 1}, \tag{22}$$
with $K_p = 0$, $K_i = 0$, $K_d = 0.000667$, and $T_f = 0.0667$.
Figure 12 shows the simulated step responses of the pendulum with respect to $u$ and $\tau_d$. The system is stable in the bounded-input, bounded-output (BIBO) sense for a step in the signal $u$ (the left part of Figure 12). However, it is not fully stable for the input $\tau_d$ (the right part of Figure 12). This can be clearly seen in the response of the signal $\alpha$. It was not considered to be a major problem because $\alpha$ is not the control target. However, it introduces a non-zero constant velocity $\dot{\alpha}$, which, in practice, with a large $\tau_d$, may saturate the motor and destabilise the pendulum.

4.2. Linear–Quadratic–Gaussian Controller

Alternatively, a state-feedback-based controller was designed. This control scheme is based on full information about the system state $\boldsymbol{x}$. If it is not fully available, a state observer is created to provide its estimate $\hat{\boldsymbol{x}}$. The state feedback makes it possible to shift the poles of the closed-loop system to any position in the complex plane [18]. Such a design method is very desirable and is suitable for dealing with multiple-input and multiple-output (MIMO) systems. One important characteristic is that it is able to handle dynamic coupling between state variables. LQG combines such an LQR controller with a Kalman filter as a state estimator. According to the separation principle [19], the design process of the controller and the state estimator can be split without exerting a negative influence on the control performance.
The controller was realised using a microcontroller, which is a digital device. Therefore, the state-space model (7) was discretised. For that purpose, the zero-order hold method was employed, with the sampling period of the microcontroller set to 12 ms. The discrete-time state-space model of the open-loop system has the following form:
$$\boldsymbol{x}[k+1] = \boldsymbol{A}_d \boldsymbol{x}[k] + \boldsymbol{B}_d \boldsymbol{u}[k],$$
$$\boldsymbol{y}[k] = \boldsymbol{C}_d \boldsymbol{x}[k],$$
where
$$\boldsymbol{A}_d = \begin{bmatrix} 1.0042 & 0.0120 & 0 & 0.0001 \\ 0.7056 & 1.0042 & 0 & 0.0142 \\ -0.0041 & 0.0000 & 1.0000 & 0.0112 \\ -0.6649 & -0.0041 & 0 & 0.8735 \end{bmatrix},$$
$$\boldsymbol{B}_d = \begin{bmatrix} -0.0036 & 0.0069 \\ -0.5832 & 1.1552 \\ 0.0318 & -0.0067 \\ 5.1780 & -1.0884 \end{bmatrix}, \qquad \boldsymbol{C}_d = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.$$
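The zero-order hold discretisation can be sketched in plain Python by exponentiating an augmented matrix, since $\exp\left(\begin{bmatrix} A & B \\ 0 & 0 \end{bmatrix} T\right) = \begin{bmatrix} A_d & B_d \\ 0 & I \end{bmatrix}$. The continuous-time matrices below are rebuilt from the rounded parameters in Section 3 with $g = 9.81\ \mathrm{m/s^2}$ assumed, so the result is only expected to be close to the matrices above; in particular, any input scaling used by the authors is not reproduced. All function names are illustrative.

```python
def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def mat_exp(M, terms=30):
    """Matrix exponential by a truncated Taylor series (adequate for small norms)."""
    n = len(M)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, M)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def zoh_discretise(A, B, T):
    """Return (A_d, B_d) for sampling period T under a zero-order hold."""
    n, m = len(A), len(B[0])
    aug = [[A[i][j] * T for j in range(n)] + [B[i][j] * T for j in range(m)]
           for i in range(n)]
    aug += [[0.0] * (n + m) for _ in range(m)]
    E = mat_exp(aug)
    return [row[:n] for row in E[:n]], [row[n:] for row in E[:n]]

# Continuous-time model from rounded Section 3 values (g assumed).
M_P, D_M, G = 0.4753, 0.131, 9.81
J, J_W = 0.0103, 1.3e-3
B_THETA, BETA, KAPPA = 5.6799e-6, 0.0131, 0.0447   # KAPPA stands for k_m/R_a
a = M_P * D_M * G / J
r = (J + J_W) / (J * J_W)
A = [[0.0, 1.0, 0.0, 0.0],
     [a, -B_THETA / J, 0.0, BETA / J],
     [0.0, 0.0, 0.0, 1.0],
     [-a, B_THETA / J, 0.0, -r * BETA]]
B = [[0.0, 0.0],
     [-KAPPA / J, 1.0 / J],
     [0.0, 0.0],
     [r * KAPPA, -1.0 / J]]

A_d, B_d = zoh_discretise(A, B, 0.012)
for row in A_d:
    print(["%.4f" % v for v in row])
```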
It should be kept in mind that the second column of the matrix $\boldsymbol{B}_d$ is related to the disturbance input $\tau_d$ and cannot be used to control the system. This means that its effect should be decoupled while estimating the unknown state $\boldsymbol{x}[k]$. To settle this problem, a Kalman filter with an unknown input is employed [20]. In particular, the matrix $\boldsymbol{B}_d$ is split into two components, $\boldsymbol{B}_{dc}$ and $\boldsymbol{B}_{ø}$, which correspond to the first and second columns of $\boldsymbol{B}_d$, respectively. As indicated in [20], the primary design condition underlying the unknown input Kalman filter is
$$\mathrm{rank}(\boldsymbol{C}_d \boldsymbol{B}_{ø}) = \dim \tau_d = 1,$$
which is clearly satisfied. Finally, the unknown input Kalman filter is given by the following:
$$\hat{\boldsymbol{x}}[k/k-1] = \boldsymbol{A}_d \hat{\boldsymbol{x}}[k-1/k-1] + \boldsymbol{B}_{dc}\, u_1[k-1],$$
$$\hat{\tau}_d[k-1] = \boldsymbol{M}\left(\boldsymbol{y}[k] - \boldsymbol{C}_d \hat{\boldsymbol{x}}[k/k-1]\right),$$
$$\hat{\boldsymbol{x}}^{*}[k/k] = \hat{\boldsymbol{x}}[k/k-1] + \boldsymbol{B}_{ø}\, \hat{\tau}_d[k-1],$$
$$\hat{\boldsymbol{x}}[k/k] = \hat{\boldsymbol{x}}^{*}[k/k] + \boldsymbol{K}_k\left(\boldsymbol{y}[k] - \boldsymbol{C}_d \hat{\boldsymbol{x}}^{*}[k/k]\right),$$
where $u_1$ denotes the first (control) input, the matrix $\boldsymbol{M} = (\boldsymbol{C}_d \boldsymbol{B}_{ø})^{+}$, and $(\cdot)^{+}$ denotes the pseudo-inverse of its argument. The Kalman filter gain matrix $\boldsymbol{K}_k$ is calculated according to the strategy proposed in [20].
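The disturbance-decoupling steps of the filter (the second and third recursions) can be illustrated in isolation. The sketch below is not the authors' implementation: it covers only the unknown-input estimate and the corresponding state correction, and leaves out the gain computation from [20]. The disturbance column uses the magnitudes of the second column of $\boldsymbol{B}_d$, with signs taken from the continuous-time model, which is an assumption.

```python
# Disturbance column of B_d (signs follow the continuous-time model; assumption).
B_PHI = [0.0069, 1.1552, -0.0067, -1.0884]
C_D = [[1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 0]]          # the wheel speed is not measured

def mat_vec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def pinv_column(v):
    """Pseudo-inverse of a non-zero column vector: v^T / (v^T v)."""
    norm_sq = sum(x * x for x in v)
    return [x / norm_sq for x in v]

CB = mat_vec(C_D, B_PHI)       # non-zero, so rank(C_d B_phi) = 1
M = pinv_column(CB)

def estimate_disturbance(y, y_pred):
    """tau_hat = M (y - C_d x_pred), with y_pred = C_d x_pred."""
    innovation = [a - b for a, b in zip(y, y_pred)]
    return sum(m * r for m, r in zip(M, innovation))

def correct_prediction(x_pred, tau_hat):
    """x_star = x_pred + B_phi * tau_hat."""
    return [x + b * tau_hat for x, b in zip(x_pred, B_PHI)]

# If the innovation is caused purely by a disturbance, it is recovered exactly.
tau = 0.05
y_pred = [0.0, 0.0, 0.0, 0.0]
y = [c * tau for c in CB]
print(estimate_disturbance(y, y_pred))
```

Since $\boldsymbol{C}_d \boldsymbol{B}_{ø}$ is a single non-zero column, its pseudo-inverse reduces to a scaled transpose, which keeps the computation cheap on a microcontroller.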
The scheme of the closed-loop system with state feedback and the Kalman filter is shown in Figure 13, with $\boldsymbol{K}_{\mathrm{lqr}}$ denoting the optimal feedback gain matrix.

4.2.1. Feedback Gain

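When designing a control system, it should be ensured that the system is controllable. The rank of the controllability matrix $\boldsymbol{T}$ is equal to the dimension of the state vector,
$$\mathrm{rank}(\boldsymbol{T}) = \mathrm{rank}\left(\begin{bmatrix} \boldsymbol{B}_{dc} & \boldsymbol{A}_d \boldsymbol{B}_{dc} & \boldsymbol{A}_d^2 \boldsymbol{B}_{dc} & \boldsymbol{A}_d^3 \boldsymbol{B}_{dc} \end{bmatrix}\right) = 4,$$
which means that the pendulum is controllable.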
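In order to calculate the feedback gain $\boldsymbol{K}_{\mathrm{lqr}}$, the state-cost weighting matrix $\boldsymbol{Q}_{\mathrm{lqr}}$ and the input-cost weighting matrix $R_{\mathrm{lqr}}$ were defined. The values of the $\boldsymbol{Q}_{\mathrm{lqr}}$ matrix were adjusted to prioritise the stabilisation of the angle $\theta$ over $\alpha$. Then, based on the simulations carried out, the value of $R_{\mathrm{lqr}}$ was selected so that $u$ stayed between $-U_{dd}$ and $U_{dd}$ for the desired degree of stabilisation:
$$\boldsymbol{Q}_{\mathrm{lqr}} = \begin{bmatrix} 10 & 0 & 0 & 0 \\ 0 & 10 & 0 & 0 \\ 0 & 0 & 0.1 & 0 \\ 0 & 0 & 0 & 0.1 \end{bmatrix}, \qquad R_{\mathrm{lqr}} = 2.$$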
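Finally, the optimal gain $\boldsymbol{K}_{\mathrm{lqr}}$ minimises the quadratic cost function $J_d$, defined as follows:
$$J_d(u) = \sum_{k=1}^{\infty}\left(\boldsymbol{x}[k]^T \boldsymbol{Q}_{\mathrm{lqr}}\, \boldsymbol{x}[k] + u[k]^T R_{\mathrm{lqr}}\, u[k]\right),$$
with the control law $u[k] = -\boldsymbol{K}_{\mathrm{lqr}} \boldsymbol{x}[k]$. This yields the following:
$$\boldsymbol{K}_{\mathrm{lqr}} = \begin{bmatrix} -18.2432 & -2.6777 & -0.0966 & -0.1529 \end{bmatrix}.$$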
After closing the feedback loop, the discrete closed-loop system is described by the following equations:
$$\boldsymbol{x}[k+1] = (\boldsymbol{A}_d - \boldsymbol{B}_{dc} \boldsymbol{K}_{\mathrm{lqr}})\, \boldsymbol{x}[k] + \boldsymbol{B}_d \boldsymbol{u}[k],$$
$$\boldsymbol{y}[k] = \boldsymbol{C}_d \boldsymbol{x}[k].$$
The closed-loop poles are located at $[\,0.1167 + 0.0000i,\ 0.9370 + 0.0272i,\ 0.9370 - 0.0272i,\ 0.9881 + 0.0000i\,]$, with the following magnitudes: $[\,0.1167,\ 0.9374,\ 0.9374,\ 0.9881\,]$. All poles are located inside the unit circle, which means that the state feedback successfully stabilises the model. The system’s step responses are shown in Figure 14. In contrast to the cascade PID controller, this time, the velocity $\dot{\alpha}$ converges to zero. For this reason, the motor is not saturated in a steady state.
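The stability check on the closed-loop poles is a one-liner once the poles are known. The sketch below recomputes the magnitudes from the quoted pole locations and, as an additional illustration not taken from the paper, converts each magnitude into an equivalent decay time constant using the 12 ms sampling period.

```python
import math

T_S = 0.012   # s, sampling period of the controller

# Closed-loop pole locations as quoted in the text.
poles = [complex(0.1167, 0.0), complex(0.9370, 0.0272),
         complex(0.9370, -0.0272), complex(0.9881, 0.0)]

magnitudes = [abs(p) for p in poles]
is_stable = all(m < 1.0 for m in magnitudes)

# A discrete pole of magnitude |z| decays like exp(-t / tau), tau = -T_S / ln|z|.
time_constants = [-T_S / math.log(m) for m in magnitudes]

print([round(m, 4) for m in magnitudes])
print("stable:", is_stable)
print("slowest time constant: %.2f s" % max(time_constants))
```

The pole closest to the unit circle dominates the response; here it corresponds to a decay time constant of roughly one second.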

4.2.2. State Observer

The state estimator aims at reconstructing the current state of the system based on the measurements and a mathematical model of the system. In this case, this mainly refers to the estimation of $\dot{\alpha}$. However, a Kalman filter is used as a full-state observer; thus, estimates of the other state variables were also obtained.
First, it was verified whether the system is observable. The rank of the observability matrix $\boldsymbol{O}$ is equal to the dimension of the state vector,
$$\mathrm{rank}(\boldsymbol{O}) = \mathrm{rank}\left(\begin{bmatrix} \boldsymbol{C}_d \\ \boldsymbol{C}_d \boldsymbol{A}_d \\ \boldsymbol{C}_d \boldsymbol{A}_d^2 \\ \boldsymbol{C}_d \boldsymbol{A}_d^3 \end{bmatrix}\right) = 4,$$
which means that the system is observable.
The estimator is based on the linear model (6), and it is itself a dynamic system, which can be described by the following equations:
$$\hat{\boldsymbol{x}}[k+1] = (\boldsymbol{A}_d - \boldsymbol{B}_{dc} \boldsymbol{K}_{\mathrm{lqr}})\, \hat{\boldsymbol{x}}[k] + \boldsymbol{B}_d \boldsymbol{u}[k] + \boldsymbol{K}_{\mathrm{kf}}\left(\boldsymbol{y}[k] - \hat{\boldsymbol{y}}[k]\right), \qquad \hat{\boldsymbol{y}}[k] = \boldsymbol{C}_d \hat{\boldsymbol{x}}[k],$$
where $\hat{\boldsymbol{x}}[k]$ is the estimate of the system’s state at time $k$, while $\hat{\boldsymbol{y}}[k]$ is the estimate of the system’s output at time $k$.
The covariances of the process noise $\boldsymbol{Q}_{\mathrm{kf}}$ and the measurement noise $\boldsymbol{R}_{\mathrm{kf}}$ were assumed to be constant and have the following form:
Q kf = 10 1 0 0 10 0 , R kf = 2 · 10 3 0 0 0 0 5 · 10 1 0 0 0 0 1 · 10 1 0 0 0 0 10 10 .
It was assumed that both the measurement and the process noise are uncorrelated. This assumption leads to the diagonal form of covariance matrices. The covariance of the measurement noise was selected based on an analysis of the recorded data. In turn, the covariance of the process noise was selected via a trial and error method. Such settings guarantee a good unknown input Kalman filter performance for the whole operational range of the inverted pendulum.

4.3. Comparative Evaluation

In order to compare the control systems based on PID and LQR controllers, four performance indices were considered:
  • The maximum absolute value of the controlled signal $x_{\max}$.
  • The settling time $t_r$, calculated as the time from the moment the system is excited until the signal error $e_x(t) = x(t) - x_{final}$ reaches and remains within the tolerance zone $\pm 0.05 \cdot e_{max}$, with $e_{max}$ being the maximum error.
  • The percentage overshoot $PO_0$, calculated for signals with a zero steady-state value as the ratio of two adjacent peak amplitudes: $PO_0 = x_{peak1} / x_{peak2} \cdot 100\%$.
  • The percentage overshoot $PO_n$, calculated for signals with a non-zero steady-state value: $PO_n = (x_{peak1} - x_{final}) / x_{final} \cdot 100\%$.
The quantitative control indices are listed in Table 2.
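Two of the indices defined above can be computed from sampled data as sketched below. This is an illustration of the definitions rather than the authors' evaluation script; the function names are assumptions, and the first peak $x_{peak1}$ is taken here to be the sample of largest magnitude.

```python
def settling_time(t, x, x_final, band=0.05):
    """Time after which |x - x_final| stays within band * (maximum error).

    Returns None if the signal is still outside the zone at the last sample.
    """
    errors = [abs(v - x_final) for v in x]
    limit = band * max(errors)
    outside = [i for i, e in enumerate(errors) if e > limit]
    if not outside:
        return 0.0
    last = outside[-1]                 # last sample outside the tolerance zone
    if last + 1 >= len(t):
        return None
    return t[last + 1] - t[0]

def overshoot_nonzero(x, x_final):
    """PO_n in percent for a signal with a non-zero steady-state value."""
    peak = max(x, key=abs)             # assumed to be the first peak
    return (peak - x_final) / x_final * 100.0

# Small hand-made response converging to 1.0 (illustrative data).
t = [0, 1, 2, 3, 4, 5]
x = [0.0, 1.2, 0.9, 1.04, 1.0, 1.0]
print(settling_time(t, x, 1.0))
print(overshoot_nonzero(x, 1.0))
```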
The cascade PID controller achieves smaller $x_{\max}$ values than the LQR controller. In the case of the control input, the settling time of $\theta$ is better for the cascade PID controller, but the overshoot is slightly lower for the LQR controller. In the case of the disturbance $\tau_d$, the LQR controller performs better than the PID controller. The cascade PID controller has problems with the proper control of $\alpha$, so the quality indices could not be determined in that case. Taking into account all performance indices as well as disturbance compensation abilities, it can be concluded that the LQR controller performs better than the cascade PID controller.

5. Experimental Results

The designed and investigated control systems were tested in two ways: (i) under undisturbed operating conditions and (ii) with a constant disturbance force caused by additional weights. Both the video recordings of the experiments and the collected data are presented in the following subsections.

5.1. Impulse Disturbance Torque

Both control systems were tested under normal operating conditions. Additionally, in the steady state, the pendulum was lightly pushed with a wooden bar.
The cascade PID controller was able to stabilise the system. Interested readers can find a video of the controller’s performance on the following webpage: https://www.youtube.com/shorts/TG2SWD9OdJk. However, this method results in a non-zero reaction wheel speed ($\dot{\alpha} \neq 0$). This can be seen in Figure 15. It can also be observed that the control signal $u$ is very noisy.
The LQG controller also successfully stabilised the pendulum. For a video of the controller’s performance, please visit the following webpage: https://youtube.com/shorts/Kn4MKzpV6JI. The system balances around its equilibrium point with the speed of the wheel $\dot{\alpha}$ oscillating around 0 rad/s. The control signal $u$ is much less noisy than in the case of the cascade controller.
Both systems were able to function properly following a light push of the bar, but the LQG controller was much more robust. The results presented in Figure 15 include the response of the pendulum to pushes occurring at the time instants t = 63.6 s and t = 64.2 s, while Figure 16 presents the response of the pendulum when the bar was pushed at t = 146.5 s, t = 149.4 s, and t = 153.2 s.

5.2. Constant Disturbance Torque

As part of this experiment, an arm was attached to the pendulum. Weights of 35 g each were added to the pendulum in a stepwise manner. The system recognises these weights as step changes in the disturbance torque signal $\tau_d$. A video showing the performance of the pendulum in the case of the cascade controller can be found at https://www.youtube.com/shorts/pgWTVsNW0Pg; a video showing the LQR controller can be found at https://youtu.be/yshctkoC1pk. A portion of the measured data for the cascade structure is shown in Figure 17; data for the LQR controller can be found in Figure 18. The closed-loop system with the LQR controller can handle additional weights very well. On the other hand, adding weights to the cascade design increases the average wheel speed $\dot{\alpha}$ during steady-state operation. It saturates the motor very quickly, which causes the pendulum to lose stability. It should be noted that the control signal $u$ in the case of the LQR controller is, again, smoother than that in the cascade control system.
A constant, non-zero value of the torque disturbance $\tau_d$ contributes a constant static error of $\theta$. This is true for both the cascade and LQR controllers. This appears to degrade the performance of the control system. However, it can be a desirable property. A constant $\tau_d$ shifts the equilibrium point of the pendulum. The LQR controller brings $\theta$ to such a position that the torque created by the gravity force neutralises the influence of the constant $\tau_d$.
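
The shifted equilibrium can be illustrated with a short calculation. The Python sketch below assumes a static torque balance of the form $m_l\, g \sin\theta_{eq} + \tau_d = 0$, where $m_l$ is a lumped mass-times-distance term of the pendulum, and it assumes that each weight hangs on a horizontal arm. The sign convention, the arm length, and the value of the lumped term are hypothetical placeholders chosen for illustration; only the 35 g weight and the value of $g$ come from this work.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def disturbance_torque(mass_kg: float, arm_m: float) -> float:
    """Torque of a weight hanging on a horizontal arm (assumed geometry)."""
    return mass_kg * G * arm_m


def equilibrium_angle(tau_d: float, mass_moment: float) -> float:
    """Angle at which the gravity torque cancels a constant disturbance torque.

    Assumes the static balance  mass_moment * G * sin(theta) + tau_d = 0,
    with mass_moment (kg*m) a lumped mass-times-distance of the pendulum.
    """
    ratio = tau_d / (mass_moment * G)
    if abs(ratio) > 1.0:
        raise ValueError("disturbance exceeds the maximum gravity torque")
    return -math.asin(ratio)


if __name__ == "__main__":
    ARM_M = 0.10          # hypothetical arm length, not given in the paper
    MASS_MOMENT = 0.0885  # hypothetical lumped value in kg*m
    for n in range(1, 4):
        tau = disturbance_torque(n * 0.035, ARM_M)
        theta = equilibrium_angle(tau, MASS_MOMENT)
        print(f"{n} weight(s): tau_d = {tau:.4f} N*m, theta_eq = {theta:.4f} rad")
```

Under these assumptions, each additional weight increases $\tau_d$ and moves the balancing angle further from the upright position, which is consistent with the static error of $\theta$ described above.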

6. Conclusions

In this study, the development of a mathematical model of a pendulum with a reaction wheel was described. Based on this model, a physical inverted pendulum system was designed and successfully implemented. For this purpose, the appropriate sensors, motor, and microcontroller were selected. The linearised pendulum model was implemented in the Matlab environment and used to develop the cascade PID and LQG controllers. Finally, the controllers were implemented in the physical pendulum system. In order to collect the measurements, a connection was established between the microcontroller and a personal computer. The estimation of the model parameters was based on the knowledge acquired from the basic physical relations. This proved to be sufficient for developing a good mathematical model. The stabilisation of the RWIP was also successful: both the cascade PID and LQG controllers were able to stabilise the pendulum. Using the cascade controller, the pendulum balanced within approximately $\pm 0.01$ rad of the equilibrium point, while with the LQG controller it balanced within approximately $\pm 0.025$ rad. However, it was evident that the LQG controller dealt much better with noise and disturbances. Moreover, the design process of the LQG controller was much easier than that of the cascade PID controller. The application of state feedback made it straightforward to move the poles of the closed-loop system. Because of the coupling between $\theta$ and $\alpha$, designing a cascade controller was more difficult.
Note that the main limitation of this approach was eliminated by applying an unknown input Kalman filter, which served as a torque disturbance estimator, thereby enabling greater robustness to external disturbances.
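
The idea of treating the disturbance torque as a quantity to be estimated can be sketched on a toy example. The Python code below is not the unknown input Kalman filter used in this work; it is a simpler augmented-state Kalman filter in which a slowly varying disturbance $d$ is appended to the state of a scalar plant $x_{k+1} = a x_k + b (u_k + d)$. The plant, all numerical values, and the names `DisturbanceKF` and `simulate` are assumptions chosen purely for illustration.

```python
import random


class DisturbanceKF:
    """Kalman filter for the augmented state z = [x, d] of a scalar plant.

    Assumed model (illustrative, not the pendulum model):
        x[k+1] = a*x[k] + b*(u[k] + d[k]) + process noise
        d[k+1] = d[k] + slow random walk
        y[k]   = x[k] + measurement noise
    """

    def __init__(self, a, b, q_x, q_d, r):
        self.a, self.b = a, b
        self.q_x, self.q_d, self.r = q_x, q_d, r
        self.x, self.d = 0.0, 0.0                      # state estimates
        self.p00, self.p01, self.p11 = 1.0, 0.0, 1.0   # symmetric covariance

    def predict(self, u):
        a, b = self.a, self.b
        self.x = a * self.x + b * (u + self.d)
        # P <- F P F^T + Q with F = [[a, b], [0, 1]]
        p00 = a * a * self.p00 + 2 * a * b * self.p01 + b * b * self.p11 + self.q_x
        p01 = a * self.p01 + b * self.p11
        p11 = self.p11 + self.q_d
        self.p00, self.p01, self.p11 = p00, p01, p11

    def update(self, y):
        s = self.p00 + self.r                 # innovation variance, H = [1, 0]
        k0, k1 = self.p00 / s, self.p01 / s   # Kalman gain
        e = y - self.x                        # innovation
        self.x += k0 * e
        self.d += k1 * e
        # P <- (I - K H) P
        p00 = (1 - k0) * self.p00
        p01 = (1 - k0) * self.p01
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p11 = p00, p01, p11


def simulate(d_true=0.5, steps=500, noise_std=0.01, seed=0):
    """Run the toy plant with u = 0 and return the final disturbance estimate."""
    rng = random.Random(seed)
    a, b = 0.9, 0.1
    kf = DisturbanceKF(a, b, q_x=1e-6, q_d=1e-6, r=max(noise_std, 1e-3) ** 2)
    x = 0.0
    for _ in range(steps):
        u = 0.0
        x = a * x + b * (u + d_true)          # true plant, disturbance unknown to the filter
        y = x + rng.gauss(0.0, noise_std)     # noisy measurement
        kf.predict(u)
        kf.update(y)
    return kf.d
```

Calling `simulate()` returns an estimate that should settle close to the true disturbance value. In a control loop, such an estimate could be used to compensate the disturbance, which is the role played by the torque disturbance estimator described above.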
Motivated by these promising results with the LQG approach, further research will be oriented towards:
  • Investigating the selection of the Kalman filter covariance matrices in order to better deal with both changing noise and disturbances.
  • Relaxing the small-angle assumption and deriving a linear parameter-varying (LPV) model of the pendulum.
  • Designing LPV controllers and state observers.

Author Contributions

Conceptualisation, M.W. and K.P.; methodology, D.Z.; software, D.Z.; validation, D.Z., K.P. and M.W.; formal analysis, K.P.; investigation, D.Z.; data curation, D.Z.; writing—original draft preparation, D.Z.; writing—review and editing, K.P. and M.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no funding.

Data Availability Statement

The datasets presented in this article are not readily available because the data are part of an ongoing study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Aranovskiy, S.; Biryuk, A.; Nikulchev, E.V.; Ryadchikov, I.; Sokolov, D. Observer design for an inverted pendulum with biased position sensors. J. Comput. Syst. Sci. Int. 2019, 58, 297–304. [Google Scholar] [CrossRef]
  2. Xu, Q.; Stepan, G.; Wang, Z. Balancing a wheeled inverted pendulum with a single accelerometer in the presence of time delay. J. Vib. Control 2017, 23, 604–614. [Google Scholar] [CrossRef]
  3. Hamza, M.F.; Yap, H.J.; Choudhury, I.A.; Isa, A.I.; Zimit, A.Y.; Kumbasar, T. Current development on using Rotary Inverted Pendulum as a benchmark for testing linear and nonlinear control algorithms. Mech. Syst. Signal Process. 2019, 116, 347–369. [Google Scholar] [CrossRef]
  4. Mehedi, I.M.; Ansari, U.; Al-Saggaf, U.M. Three degrees of freedom rotary double inverted pendulum stabilization by using robust generalized dynamic inversion control: Design and experiments. J. Vib. Control. 2020, 26, 2174–2184. [Google Scholar] [CrossRef]
  5. Baimukashev, D.; Sandibay, N.; Rakhim, B.; Varol, H.A.; Rubagotti, M. Deep learning-based approximate optimal control of a reaction-wheel-actuated spherical inverted pendulum. In Proceedings of the 2020 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM), Boston, MA, USA, 6–9 July 2020; pp. 1322–1328. [Google Scholar]
  6. Du, D.; Zhang, C.; Song, Y.; Zhou, H.; Li, X.; Fei, M.; Li, W. Real-time Hinf control of networked inverted pendulum visual servo systems. IEEE Trans. Cybern. 2019, 50, 5113–5126. [Google Scholar] [CrossRef] [PubMed]
  7. Bezci, Y.E.; Aghaei, V.T.; Akbulut, B.E.; Tan, D.; Allahviranloo, T.; Fernandez-Gamiz, U.; Noeiaghdam, S. Classical and intelligent methods in model extraction and stabilization of a dual-axis reaction wheel pendulum: A comparative study. Results Eng. 2022, 16, 100685. [Google Scholar] [CrossRef]
  8. Nguyen, B.H.; Cu, M.P.; Nguyen, M.T.; Tran, M.S.; Tran, H.C. Lqr and fuzzy control for reaction wheel inverted pendulum model. Robot. Manag. 2019, 24. [Google Scholar]
  9. Trentin, J.F.S.; Da Silva, S.; Ribeiro, J.M.D.S.; Schaub, H. Inverted pendulum nonlinear controllers using two reaction wheels: Design and implementation. IEEE Access 2020, 8, 74922–74932. [Google Scholar] [CrossRef]
  10. Belascuen, G.; Aguilar, N. Design, modeling and control of a reaction wheel balanced inverted pendulum. In Proceedings of the 2018 IEEE Biennial Congress of Argentina (ARGENCON), San Miguel de Tucuman, Argentina, 6–8 June 2018; pp. 1–9. [Google Scholar]
  11. Önen, Ü.; Çakan, A. Multibody modeling and balance control of a reaction wheel inverted pendulum using lqr controller. Int. J. Robot. Control Syst. 2021, 1, 84–89. [Google Scholar] [CrossRef]
  12. Chinelato, C.I.G.; Neves, G.P.D.; Angélico, B.A. Safe control of a reaction wheel pendulum using control barrier function. IEEE Access 2020, 8, 160315–160324. [Google Scholar] [CrossRef]
  13. Ding, H.; Zhou, Z.; Dang, H.; Zhao, Z. Control Wheel Rotation Inverted Pendulum Control Based on Unscented Kalman Filter. In Proceedings of the 2019 IEEE 2nd International Conference on Information Communication and Signal Processing (ICICSP), Weihai, China, 28–30 September 2019; pp. 81–86. [Google Scholar]
  14. Türkmen, A.; Korkut, M.Y.; Erdem, M.; Gönül, Ö.; Sezer, V. Design, implementation and control of dual axis self balancing inverted pendulum using reaction wheels. In Proceedings of the 2017 10th International Conference on Electrical and Electronics Engineering (ELECO), Chengdu, China, 10–17 August 2017; pp. 717–721. [Google Scholar]
  15. Hofer, M.; Muehlebach, M.; D’Andrea, R. The One-Wheel Cubli: A 3D inverted pendulum that can balance with a single reaction wheel. Mechatronics 2023, 91, 102965. [Google Scholar] [CrossRef]
  16. Kim, Y.; Park, J.; Han, S. Balancing the Cubli Frame with LQR-controlled Reaction Wheel. J. Sens. Sci. Technol. 2018, 27, 165–169. [Google Scholar]
  17. Ryadchikov, I.; Sokolov, D.; Biryuk, A.; Sechenev, S.; Svidlov, A.; Volkodav, P.; Mamelin, Y.; Popko, K.; Nikulchev, E. Stabilization of a hopper with three reaction wheels. In Proceedings of the ISR 2018, 50th International Symposium on Robotics, VDE, Munich, Germany, 20–21 June 2018; pp. 1–4. [Google Scholar]
  18. Sontag, E.D. Mathematical Control Theory: Deterministic Finite Dimensional Systems; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2013; Volume 6. [Google Scholar]
  19. Georgiou, T.T.; Lindquist, A. The Separation Principle in Stochastic Control, Redux. IEEE Trans. Autom. Control 2013, 58, 2481–2494. [Google Scholar] [CrossRef]
  20. Gillijns, S.; De Moor, B. Unbiased minimum-variance input and state estimation for linear discrete-time systems. Automatica 2007, 43, 111–116. [Google Scholar] [CrossRef]
Figure 1. Scheme of an inverted pendulum.
Figure 2. Angled view of the inverted pendulum with reaction wheel. Graphical design stage: a model that was realised in Fusion 360 software (a). Implementation stage: the final form of the constructed laboratory stand (b).
Figure 3. Conceptual connection diagram (a) and implementation chart flow (b).
Figure 4. Free oscillations of the pendulum with the reaction wheel locked. $\theta_{down} = \theta - \pi$ indicates that the pendulum is pointing downwards.
Figure 5. Simulink system model used to determine the value of the coefficient $b_\theta$.
Figure 6. Step response of $\alpha$ with the pendulum’s arm locked.
Figure 7. Open-loop rectangular pulse response: laboratory stand measurements (blue); system model outputs (orange).
Figure 8. Control diagram using cascade PID controller.
Figure 9. Pole–zero maps of the pendulum.
Figure 10. Root locus for inner open loop: the P controller (a), the PI controller (b), and the PID controller (c).
Figure 11. Root locus of open outer loop.
Figure 12. Simulated step response for the cascade PID control system.
Figure 13. Control system with the LQG controller.
Figure 14. Simulated step response of closed system with LQR controller.
Figure 15. Time waveforms of the implemented closed-loop system with the cascade PID controller: measured data (blue line); $\theta_{ref}$ value (orange line).
Figure 16. Time waveforms of the implemented system with the LQG controller: measured data (blue line); estimated data (orange line).
Figure 17. Step response for $\tau_d$ of the cascade PID controller: measured data (blue line); $\theta_{ref}$ value (orange line).
Figure 18. Step response for $\tau_d$ of the LQG controller: collected data (blue line); data estimated by the Kalman filter (orange line).
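Table 1. List of all the parameters needed for system modelling.
| Parameter | Value | Unit |
| $d_M$ | 0.131 | $\mathrm{m}$ |
| $M_p$ | 0.4753 | $\mathrm{kg}$ |
| $M_w$ | 0.2003 | $\mathrm{kg}$ |
| $J$ | 0.0103 | $\mathrm{kg\,m^2}$ |
| $J_w$ | 0.0013 | $\mathrm{kg\,m^2}$ |
| $b_\theta$ | $5.6799 \cdot 10^{-6}$ | $\mathrm{kg\,m^2\,s^{-1}}$ |
| $\left(b_\alpha + \frac{k_m k_e}{R_a}\right)$ | 0.0131 | $\mathrm{kg\,m^2\,s^{-1}}$ |
| $\frac{k_m}{R_a}$ | 0.0447 | $\mathrm{kg\,m^2\,s^{-2}\,V^{-1}}$ |
| $g$ | 9.81 | $\mathrm{m\,s^{-2}}$ |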
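Table 2. Performance indices for cascade PID and LQR controllers.

Cascade PID
| Index | Input $u$: $\theta$ | Input $u$: $\alpha$ | Input $u$: $\dot{\alpha}$ | Disturbance $\tau_d$: $\theta$ | Disturbance $\tau_d$: $\alpha$ | Disturbance $\tau_d$: $\dot{\alpha}$ |
| $x_{\max}$ | 0.0073 rad | 1 rad | 1.14 rad/s | 0.033 rad | n/a | 40 rad/s |
| $t_r$ [s] | 1.72 | 5.52 | 3.95 | 6.3 | n/a | 7 |
$PO_0$ [%]: 28
$PO_n$ [%]: 0, 0, 0

LQR
| Index | Input $u$: $\theta$ | Input $u$: $\alpha$ | Input $u$: $\dot{\alpha}$ | Disturbance $\tau_d$: $\theta$ | Disturbance $\tau_d$: $\alpha$ | Disturbance $\tau_d$: $\dot{\alpha}$ |
| $x_{\max}$ | 0.033 rad | 7.6 rad | 5.55 rad/s | 1.93 rad | 302 rad | 215 rad/s |
| $t_r$ [s] | 2.6 | 3.28 | 3.6 | 2.1 | 3.2 | 3.6 |
$PO_0$ [%]: 22
$PO_n$ [%]: 0, 18.3, 0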
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
