Article

A Two-Step Controller for Vision-Based Autonomous Landing of a Multirotor with a Gimbal Camera

1 School of Electrical Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
2 Robot Innovation Group, Samsung Display Co., Ltd., Yongin-si 17113, Republic of Korea
* Author to whom correspondence should be addressed.
Drones 2024, 8(8), 389; https://doi.org/10.3390/drones8080389
Submission received: 3 July 2024 / Revised: 31 July 2024 / Accepted: 7 August 2024 / Published: 9 August 2024

Abstract

This article presents a novel vision-based autonomous landing method for a multirotor with a gimbal camera, designed to be applicable from any initial position within a broad space by addressing the field-of-view and singularity problems so as to ensure stable performance. The proposed method employs a two-step controller based on the integrated dynamics of the multirotor and the gimbal camera, in which the multirotor approaches the landing site horizontally in the first step and descends vertically in the second step. Because the stabilizing controller is designed for the integrated dynamics of the multirotor and the gimbal camera, the multirotor and the camera converge simultaneously to the desired configuration. The controller requires only one feature point and reduces unnecessary camera rolling. The effectiveness of the proposed method is demonstrated through simulation and real-environment experiments.

1. Introduction

Multirotors have gained popularity as a preferred platform for unmanned aerial vehicles (UAVs) due to their maneuverability, ability to take off and land vertically, low cost, and minimal maintenance requirements [1]. Recently, researchers have begun equipping multirotors with cameras to enable autonomous and precise control through visual input from the surrounding environment [2,3]. This trend has become widespread with the development of lightweight, small cameras. One area of research in this field is vision-based autonomous landing (VBAL), where a UAV uses a camera to autonomously detect, approach, and land on a designated site [4].
Various techniques are used in VBAL, most of which rely on visual servoing for UAV control. Visual servoing refers to controlling the UAV from images alone, although additional sensors such as an ultrasonic sensor [5], an inertial measurement unit (IMU) [6], GPS [7], or LiDAR [7] may be used in some cases to obtain the depth of the target or the attitude, position, and altitude of the multirotor. Using this visual servoing technique, we can conduct a sequence of procedures that involves detecting a landing site, approaching it at close range, and landing safely.
Several studies on VBAL employ UAVs equipped with fixed cameras positioned mostly downward. For instance, Kumar [5] uses a fixed stereo camera and an ultrasonic sensor to determine the relative position of the quadrotor with respect to the landing site and applies four basic visual servoing controllers independently. Many recent studies have introduced the concept of a virtual camera. Lee et al. [8] devise a VBAL technique utilizing an adaptive sliding mode controller and a virtual camera that requires at least four point features in an image, while Zheng et al. [6] design a controller that employs a virtual camera and the backstepping approach, which requires at least three point features. Li et al. [9] propose a VBAL control scheme for an arbitrarily oriented planar visual target using the virtual camera and the image moment feature that still requires at least two coplanar feature points. One issue with the fixed camera is that the target features within its field of view can disappear during flight. Some solutions to this problem include using a control barrier function [10] or nonlinear model predictive control [11], which employ optimization with a visibility constraint to generate control inputs while keeping the landing site in the field of view. Recently, visual servoing control using reinforcement learning has also been conducted to keep the target within the field of view [12].
An alternative approach to address the issue is to mount a gimbal camera on the UAV [13]. As the UAV moves along any trajectory, the gimbal camera adjusts its attitude to keep the landing site close to the center of the image. Some VBAL systems that use gimbal cameras have separate controllers for the camera and the UAV. These systems typically use an image-tracking algorithm for the gimbal camera controller, which is independent of the UAV dynamics. Examples of such systems include those using target-tracking error dynamics [14], proportional navigation [15], classical visual servoing with a tilting camera [7], or model predictive control with a pan-tilt camera [16]. In these systems, the image-tracking algorithm begins to control the gimbal camera only after the target moves away from the image center, which implies that the gimbal moves after the UAV moves. Thus, fast image tracking is necessary to prevent losing the landing site, but using a high gain may exacerbate the impact of noise and cause control-input chattering. As Jiang et al. [14] have noted, there is a trade-off between speed and stability when tracking the image after the camera deviates from the target. Cabecinhas et al. [17] have proposed an attitude estimator and an integrated controller that considers both a pan-tilt camera and the quadrotor dynamics, but it still requires at least four feature points.
The focus of our study is a two-step VBAL controller for a multirotor with a gimbal camera that addresses the aforementioned issues. The integrated dynamics of the multirotor and the gimbal camera are derived using one feature point, enabling them to converge simultaneously to the desired configurations through feedback linearization. We show theoretically that the proposed method allows the multirotor to stably approach a target, track it, and land on it from any starting position, even at a low altitude and over a long horizontal distance from the target. Importantly, the implementation relies solely on onboard sensor information. In the first step, the multirotor moves horizontally toward the landing site. Here, we define a five-dimensional state consisting of the image coordinates of one feature point, the depth of the feature point, the camera rolling variable relative to the world frame, and the altitude of the multirotor. Once the multirotor is close enough to the landing site horizontally, we initiate the second step, which we call vertical approaching. Here, we define a three-dimensional state consisting of the virtual camera coordinates of one feature point and the altitude of the multirotor. We validate the method through real experiments, confirming that the proposed states reach their desired values, leading to successful landings at the landing site.
This paper is organized as follows. Section 2 describes our two-step controller along with the derivation of the integrated dynamics. Section 3 shows the performance of the controller in simulation and real environments. The conclusion is presented in Section 4.

2. Materials and Methods

2.1. Notation

The set of all 3 × 3 rotation matrices is defined as $\mathrm{SO}(3) = \{ A \in \mathbb{R}^{3\times 3} \mid A^T A = I,\ \det A = 1 \}$. The set of all 3 × 3 skew-symmetric matrices is defined as $\mathfrak{so}(3) = \{ A \in \mathbb{R}^{3\times 3} \mid A^T + A = 0 \}$. The hat map $\wedge : \mathbb{R}^3 \to \mathfrak{so}(3)$ is defined as
\[
\hat{\Omega} = \begin{bmatrix} 0 & -\Omega_3 & \Omega_2 \\ \Omega_3 & 0 & -\Omega_1 \\ -\Omega_2 & \Omega_1 & 0 \end{bmatrix}
\]
for $\Omega = [\Omega_1, \Omega_2, \Omega_3]^T \in \mathbb{R}^3$. The standard basis vectors are denoted as $e_1 = [1, 0, 0]^T$, $e_2 = [0, 1, 0]^T$, and $e_3 = [0, 0, 1]^T$.
Lemma 1.
1. $\hat{u}v = u \times v = -\hat{v}u$ for all $u, v \in \mathbb{R}^3$.
2. $R\hat{\Omega}R^T = (R\Omega)^{\wedge}$ for all $R \in \mathrm{SO}(3)$ and $\Omega \in \mathbb{R}^3$.
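The following minimal NumPy sketch (our own illustration, not code from the paper) implements the hat map and verifies both identities of Lemma 1 numerically; the helper name hat and the random-sample check are assumptions of this sketch.

```python
import numpy as np

def hat(w):
    """Hat map R^3 -> so(3), so that hat(w) @ v == np.cross(w, v)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

# Standard basis vectors e1, e2, e3.
e1, e2, e3 = np.eye(3)

# Numerical check of Lemma 1 for random vectors and a random rotation.
rng = np.random.default_rng(0)
u, v, Omega = rng.standard_normal((3, 3))
# 1) hat(u) v = u x v = -hat(v) u
assert np.allclose(hat(u) @ v, np.cross(u, v))
assert np.allclose(np.cross(u, v), -hat(v) @ u)
# 2) R hat(Omega) R^T = hat(R Omega) for R in SO(3)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
R = Q * np.sign(np.linalg.det(Q))   # project onto SO(3)
assert np.allclose(R @ hat(Omega) @ R.T, hat(R @ Omega))
print("Lemma 1 identities verified numerically.")
```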
We define three coordinate frames for our multirotor environment as shown in Figure 1. The world frame {w}, which has $X_w Y_w Z_w$ coordinates, is an inertial frame. The body frame {b}, which has $X_b Y_b Z_b$ coordinates, is attached to the multirotor with its origin at the center of the multirotor. The gimbal camera frame {c}, which has $X_c Y_c Z_c$ coordinates, is attached to the gimbal camera with its z-axis coinciding with the optical axis. The rotation, position, and linear velocity of a frame {q} with respect to a frame {p} are denoted as $R_{pq} \in \mathrm{SO}(3)$, $P_{pq} = [X_{pq}, Y_{pq}, Z_{pq}]^T \in \mathbb{R}^3$, and $V_{pq} \in \mathbb{R}^3$, respectively. The body angular velocity of a frame {q} in its own frame is denoted as $\Omega_q \in \mathbb{R}^3$. They satisfy
\[
\dot{P}_{pq} = V_{pq}, \qquad \dot{R}_{pq} = R_{pq}\hat{\Omega}_q. \tag{1}
\]
Define a function $L : \mathbb{R}^2 \to \mathbb{R}^{2\times 3}$ as
\[
L(u, n) = \begin{bmatrix} e_1^T \\ e_2^T \end{bmatrix}\left( K - \begin{bmatrix} u \\ n \\ 1 \end{bmatrix} e_3^T \right) \tag{2}
\]
for $[u, n]^T \in \mathbb{R}^2$, where the constant camera intrinsic matrix is given by
\[
K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}, \tag{3}
\]
with $f_x$ and $f_y$ being the focal lengths in pixels, and $c_x$ and $c_y$ being the pixel coordinates of the principal point in the camera image. Define another function $M : \mathbb{R}^2 \to \mathfrak{so}(3)$ as
\[
M(u, n) = \left( K^{-1}\begin{bmatrix} u \\ n \\ 1 \end{bmatrix} \right)^{\wedge} \tag{4}
\]
for $[u, n]^T \in \mathbb{R}^2$. Finally, $\tau \in \mathbb{R}$ used in this paper represents time.
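As a concrete illustration of these definitions, the following sketch builds K, L(u, n), and M(u, n) in NumPy and checks the identity $M(u_c, n_c) = \frac{1}{Z_{ct}}\hat{P}_{ct}$ used later in (15). The function names and the sample target position are our own choices; the intrinsics are the simulation values from Table 1.

```python
import numpy as np

def hat(w):
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def intrinsics(fx, fy, cx, cy):
    """Camera intrinsic matrix K, Equation (3)."""
    return np.array([[fx, 0.0, cx],
                     [0.0, fy, cy],
                     [0.0, 0.0, 1.0]])

def L_mat(u, n, K):
    """L(u, n) in R^{2x3}, Equation (2): [e1^T; e2^T](K - [u, n, 1]^T e3^T)."""
    e3 = np.array([0.0, 0.0, 1.0])
    p = np.array([u, n, 1.0])
    return (K - np.outer(p, e3))[:2, :]

def M_mat(u, n, K):
    """M(u, n) in so(3), Equation (4): hat of the normalized ray K^{-1}[u, n, 1]^T."""
    p = np.array([u, n, 1.0])
    return hat(np.linalg.solve(K, p))

# Consistency check of Equation (15): M(u_c, n_c) = (1/Z_ct) hat(P_ct).
K = intrinsics(205.47, 205.47, 320.50, 180.50)   # simulation intrinsics from Table 1
P_ct = np.array([0.4, -0.2, 2.5])                # an arbitrary target position in {c}
uc, nc, _ = (K @ P_ct) / P_ct[2]                 # pinhole projection, Equation (11)
assert np.allclose(M_mat(uc, nc, K), hat(P_ct) / P_ct[2])
```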

2.2. The First Step: Horizontal Approaching

In this section, we devise a controller that steers the multirotor toward a target landing site in the horizontal direction. Let us assume that the landing site is fixed in the world frame {w} and has already been detected by the camera on the multirotor. The state is defined as
\[
s_1 = [u_c, n_c, Z_{ct}, \eta, h]^T \in \mathbb{R}^5,
\]
where $[u_c, n_c]^T$ are the pixel coordinates of the detected feature, which is the center of the landing site in the camera image, and $Z_{ct}$ is the depth of the feature in the camera frame {c}, which is the third component of
\[
P_{ct} = [X_{ct}, Y_{ct}, Z_{ct}]^T, \tag{5}
\]
representing the position of the target in the camera frame {c}, and
\[
\eta = e_3^T R_{wc} e_1 \tag{6}
\]
is a quantitative measure of the camera rolling, which can be utilized for stable feature tracking. The final component h is the altitude of the multirotor, which equals the third component of
\[
P_{wb} = [X_{wb}, Y_{wb}, Z_{wb}]^T, \tag{7}
\]
representing the position of the body frame in the world frame {w}. Figure 1 illustrates the state variables.
 Theorem 1. 
The dynamics of the state $s_1$ are given by
\[
\dot{s}_1 = A_1(s_1) + B_1(s_1)\begin{bmatrix} \alpha \\ \beta \end{bmatrix}, \tag{8}
\]
where
\[
A_1(s_1) = -\begin{bmatrix} \frac{1}{Z_{ct}} L(u_c, n_c) \\ e_3^T \\ 0_{1\times 3} \\ 0_{1\times 3} \end{bmatrix} R_{bc}^T\hat{\Omega}_b P_{bc}, \qquad
B_1(s_1) = \begin{bmatrix} L(u_c, n_c)M(u_c, n_c) & -\frac{1}{Z_{ct}} L(u_c, n_c) \\ Z_{ct}\, e_3^T M(u_c, n_c) & -e_3^T \\ -e_3^T R_{wc}\hat{e}_1 & 0_{1\times 3} \\ 0_{1\times 3} & e_3^T R_{wc} \end{bmatrix}, \tag{9}
\]
and $[\alpha^T, \beta^T]^T$ is related to $[\Omega_c^T, V_{wb}^T]^T$ by
\[
\alpha = \Omega_c + R_{bc}^T\Omega_b \in \mathbb{R}^3, \qquad \beta = R_{bc}^T R_{wb}^T V_{wb} \in \mathbb{R}^3. \tag{10}
\]
Proof. 
The first two variables of $s_1$ are given by the pinhole camera model [18] as
\[
\begin{bmatrix} u_c \\ n_c \end{bmatrix} = \frac{1}{Z_{ct}}\begin{bmatrix} e_1^T \\ e_2^T \end{bmatrix} K P_{ct}, \tag{11}
\]
where K, $Z_{ct}$, and $P_{ct}$ are given in (3) and (5), respectively. By differentiation and the chain rule, we get
\[
\begin{bmatrix} \dot{u}_c \\ \dot{n}_c \end{bmatrix} = \frac{1}{Z_{ct}} L(u_c, n_c)\dot{P}_{ct}, \tag{12}
\]
where $L(u_c, n_c)$ is given in (2). To obtain $\dot{P}_{ct}$ in (12), we express $P_{ct}$ as follows:
\[
P_{ct} = R_{bc}^T R_{wb}^T(P_{wt} - R_{wb}P_{bc} - P_{wb}), \tag{13}
\]
where $P_{wt}$ and $P_{bc}$ are the position of the target in the world frame {w} and the position of the camera in the body frame {b}, respectively, and are both constant. By differentiating (13) with respect to time and using Lemma 1, (1), and (10), we get
\[
\begin{aligned}
\dot{P}_{ct} &= \dot{R}_{bc}^T R_{wb}^T(P_{wt} - R_{wb}P_{bc} - P_{wb}) + R_{bc}^T\dot{R}_{wb}^T(P_{wt} - R_{wb}P_{bc} - P_{wb}) + R_{bc}^T R_{wb}^T(-\dot{R}_{wb}P_{bc} - V_{wb}) \\
&= Z_{ct}\,M(u_c, n_c)\,\alpha - \beta - R_{bc}^T\hat{\Omega}_b P_{bc},
\end{aligned} \tag{14}
\]
where $M(u_c, n_c)$ is the evaluation of $M(u, n)$ given in (4) at $(u, n) = (u_c, n_c)$, which can be simplified to
\[
M(u_c, n_c) = \frac{1}{Z_{ct}}\hat{P}_{ct}, \tag{15}
\]
which can be verified with (4) and (11). By substituting (14) into (12), we obtain the time derivatives of the first two variables of $s_1$, which are
\[
\begin{bmatrix} \dot{u}_c \\ \dot{n}_c \end{bmatrix} = L(u_c, n_c)M(u_c, n_c)\,\alpha - \frac{1}{Z_{ct}} L(u_c, n_c)\,\beta - \frac{1}{Z_{ct}} L(u_c, n_c)R_{bc}^T\hat{\Omega}_b P_{bc}. \tag{16}
\]
The time derivative of the third variable of $s_1$ is
\[
\dot{Z}_{ct} = e_3^T\dot{P}_{ct} = Z_{ct}\, e_3^T M(u_c, n_c)\,\alpha - e_3^T\beta - e_3^T R_{bc}^T\hat{\Omega}_b P_{bc} \tag{17}
\]
by (5) and (14). The time derivative of the fourth variable of $s_1$ is computed as
\[
\dot{\eta} = e_3^T(\dot{R}_{wb}R_{bc} + R_{wb}\dot{R}_{bc})e_1 = -e_3^T R_{wc}\hat{e}_1\,\alpha \tag{18}
\]
by (1), (6), (10), and Lemma 1. The time derivative of the fifth variable of $s_1$ is given by
\[
\dot{h} = e_3^T\dot{P}_{wb} = e_3^T R_{wc}\,\beta \tag{19}
\]
by (1), (7), and (10). By incorporating (16)–(19), we get the dynamics of the state $s_1$ in (8). □
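For readers who prefer code, the following sketch assembles $A_1(s_1)$ and $B_1(s_1)$ as in (9). The function name and its argument list (which measurements are passed in) are our own assumptions, consistent with Remark 2 below.

```python
import numpy as np

def hat(w):
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def dynamics_step1(uc, nc, Zct, K, R_wb, R_bc, Omega_b, P_bc):
    """Assemble A1(s1) and B1(s1) of Theorem 1, Equations (8)-(9).

    Returns A1 with shape (5,) and B1 with shape (5, 6) such that
    s1_dot = A1 + B1 @ [alpha; beta].
    """
    e1, e2, e3 = np.eye(3)
    p = np.array([uc, nc, 1.0])
    L = (K - np.outer(p, e3))[:2, :]            # L(uc, nc), Equation (2)
    M = hat(np.linalg.solve(K, p))              # M(uc, nc), Equation (4)
    R_wc = R_wb @ R_bc

    drift_dir = R_bc.T @ hat(Omega_b) @ P_bc    # common drift term R_bc^T hat(Omega_b) P_bc
    A1 = -np.concatenate([(L / Zct) @ drift_dir,
                          [e3 @ drift_dir],
                          [0.0, 0.0]])

    B1 = np.zeros((5, 6))
    B1[0:2, 0:3] = L @ M
    B1[0:2, 3:6] = -L / Zct
    B1[2, 0:3] = Zct * (e3 @ M)
    B1[2, 3:6] = -e3
    B1[3, 0:3] = -e3 @ R_wc @ hat(e1)
    B1[4, 3:6] = e3 @ R_wc
    return A1, B1
```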
 Remark 1. 
Since the position and yaw angle of the multirotor can be controlled independently [19], we control the yaw angle to be fixed in this paper. Additionally, the orientation of the gimbal can also be controlled independently. Therefore, we choose the linear velocity V w b of the multirotor and the angular velocity Ω c of the gimbal as the control inputs in our proposed system. Lower-level control for thrust and torque can be handled using a commercial flight controller or a backstepping technique.
 Lemma 2. 
Assume $Z_{ct} > 0$. The matrix $B_1(s_1)$ in (9) has full rank if and only if the following condition holds:
\[
(e_1 \times (R_{wc}^T e_3)) \cdot P_{ct} \neq 0. \tag{20}
\]
 Proof. 
The matrix $B_1(s_1)$ satisfies
\[
V B_1(s_1) W = \begin{bmatrix} 0_{3\times 3} & -I_{3\times 3} \\ (e_1 \times R_{wc}^T e_3)^T & 0_{1\times 3} \\ -Z_{ct}\,(M(u_c, n_c)R_{wc}^T e_3)^T & 0_{1\times 3} \end{bmatrix},
\]
where
\[
V = \begin{bmatrix} \begin{bmatrix} \frac{1}{Z_{ct}} L(u_c, n_c) \\ e_3^T \end{bmatrix}^{-1} & 0_{3\times 1} & 0_{3\times 1} \\ 0_{1\times 3} & 1 & 0 \\ e_3^T R_{wc}\begin{bmatrix} \frac{1}{Z_{ct}} L(u_c, n_c) \\ e_3^T \end{bmatrix}^{-1} & 0 & 1 \end{bmatrix}
\]
and
\[
W = \begin{bmatrix} I_{3\times 3} & 0_{3\times 3} \\ Z_{ct}M(u_c, n_c) & I_{3\times 3} \end{bmatrix}.
\]
The matrices V and W are both invertible because $\begin{bmatrix} \frac{1}{Z_{ct}} L(u_c, n_c) \\ e_3^T \end{bmatrix}$ is always invertible. Hence, the matrix $B_1(s_1)$ has full rank if and only if
\[
(e_1 \times (R_{wc}^T e_3)) \times (M(u_c, n_c)R_{wc}^T e_3) \neq 0_{3\times 1}. \tag{21}
\]
By using $R_{wc}^T e_3 \neq 0$, (15), and the triple product expansion, we can show that (21) is equivalent to (20). □
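The scalar on the left-hand side of (20) can be monitored online as a distance-to-singularity margin. A minimal sketch, with a hypothetical function name:

```python
import numpy as np

def singularity_margin(P_ct, R_wb, R_bc):
    """Left-hand side of condition (20): (e1 x (R_wc^T e3)) . P_ct.

    B1(s1) loses full rank exactly when this value is zero, so its magnitude
    can be watched online as a margin to the singular configuration
    (our own interpretation of Lemma 2).
    """
    e1 = np.array([1.0, 0.0, 0.0])
    e3 = np.array([0.0, 0.0, 1.0])
    R_wc = R_wb @ R_bc
    return np.cross(e1, R_wc.T @ e3) @ P_ct
```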
Having derived the dynamics of the state $s_1$, we now devise a controller that stabilizes it at the desired state given by
\[
s_{1,d} = [c_x, c_y, Z_{ct,d}, 0, h_d]^T,
\]
where $Z_{ct,d}$ is the desired depth of the feature from the camera and $h_d$ is the desired altitude of the multirotor.
 Theorem 2. 
Define the error state as follows:
\[
\epsilon_1 = s_1 - s_{1,d}.
\]
Suppose that (20) holds. Then $\epsilon_1$ converges to zero as $\tau \to \infty$ if the control law is given by
\[
\Omega_c = \alpha_1 - R_{bc}^T\Omega_b, \qquad V_{wb} = R_{wc}\beta_1, \tag{22}
\]
where
\[
\begin{bmatrix} \alpha_1 \\ \beta_1 \end{bmatrix} = B_1(s_1)^{\dagger}\,\bigl(-\lambda_1\epsilon_1 - A_1(s_1)\bigr), \tag{23}
\]
with $B_1(s_1)^{\dagger} = B_1(s_1)^T\bigl(B_1(s_1)B_1(s_1)^T\bigr)^{-1}$ the pseudo-inverse of $B_1(s_1)$ and $\lambda_1 \in \mathbb{R}^{5\times 5}$ a positive-definite diagonal matrix.
Proof. 
The dynamics of the error state $\epsilon_1$ become
\[
\dot{\epsilon}_1 = A_1(s_1) + B_1(s_1)\begin{bmatrix} \alpha_1 \\ \beta_1 \end{bmatrix}
\]
by Theorem 1. Since (20) holds, we obtain
\[
\dot{\epsilon}_1 = -\lambda_1\epsilon_1
\]
by Lemma 2 and (22), so that $\epsilon_1$ converges to zero as $\tau \to \infty$. □
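A sketch of the resulting horizontal-approaching law follows, using NumPy's Moore–Penrose pseudo-inverse, which coincides with $B_1^T(B_1B_1^T)^{-1}$ when $B_1$ has full row rank. The function name and argument list are our own.

```python
import numpy as np

def horizontal_control(s1, s1_d, A1, B1, R_wb, R_bc, Omega_b, lam1):
    """Horizontal-approaching law of Theorem 2, Equations (22)-(23), as a sketch.

    s1, s1_d : current and desired 5-dimensional states
    A1, B1   : drift vector and input matrix from Theorem 1
    lam1     : positive-definite diagonal gain matrix (5x5)
    Returns the gimbal rate command Omega_c and the body velocity command V_wb.
    """
    eps1 = s1 - s1_d
    # [alpha_1; beta_1] = B1^+ (-lam1 eps1 - A1), right pseudo-inverse of B1
    ab = np.linalg.pinv(B1) @ (-lam1 @ eps1 - A1)
    alpha1, beta1 = ab[:3], ab[3:]
    Omega_c = alpha1 - R_bc.T @ Omega_b          # Equation (22)
    V_wb = (R_wb @ R_bc) @ beta1                 # V_wb = R_wc beta_1
    return Omega_c, V_wb
```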
 Remark 2. 
Calculation of the horizontal approaching controller in (22) needs the values of $u_c$, $n_c$, h, $R_{wb}$, $\Omega_b$, $R_{bc}$, $Z_{ct}$, and η. The values of $u_c$ and $n_c$ can be obtained from a camera image. The value of h can be obtained from a range finder. The values of $R_{wb}$ and $\Omega_b$ can be obtained from an IMU or a commercial state estimator. The value of $R_{bc}$ can be obtained from the gimbal. The value of $Z_{ct}$ can be obtained from a depth camera. The value of η is calculated by (6).
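A small sketch of how these measurements could be stacked into $s_1$; the function name and argument order are assumptions of this sketch.

```python
import numpy as np

def assemble_state_s1(uc, nc, Zct, h, R_wb, R_bc):
    """Stack the measurements listed in Remark 2 into the state s1 (a sketch).

    uc, nc : target pixel coordinates from the camera image
    Zct    : target depth from the depth camera
    h      : altitude from the range finder
    R_wb   : body attitude from the IMU / state estimator
    R_bc   : gimbal attitude relative to the body, reported by the gimbal
    """
    e1 = np.array([1.0, 0.0, 0.0])
    e3 = np.array([0.0, 0.0, 1.0])
    R_wc = R_wb @ R_bc
    eta = e3 @ R_wc @ e1          # camera rolling variable, Equation (6)
    return np.array([uc, nc, Zct, eta, h])
```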

2.3. The Second Step: Vertical Approaching

We adapt the virtual camera definition in [6,20], as depicted in the left part of Figure 1, to avoid the ill-conditioned case of Theorem 2. Rewrite (20) as follows:
\[
(e_1 \times (R_{wc}^T e_3)) \cdot P_{ct} = -e_3^T R_{wc}\hat{e}_1 R_{wc}^T\bar{R}P_{vt} = e_3 \cdot \bigl((\bar{R}P_{vt}) \times (R_{wc}e_1)\bigr), \tag{24}
\]
where
\[
P_{vt} = [X_{vt}, Y_{vt}, Z_{vt}]^T = \bar{R}^T R_{wc}P_{ct} \tag{25}
\]
is the position of the target in the virtual camera frame {v} and
\[
\bar{R} = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & -1 \end{bmatrix}.
\]
There are three conditions under which (24) becomes zero. The first is $e_3 = R_{wc}e_1$, but we can ignore this condition because the two vectors are not close to each other during the horizontal approaching step. The second is $\bar{R}P_{vt} = kR_{wc}e_1$ for some $k \in \mathbb{R}$, which is equivalent to $P_{ct} = ke_1$ by (25), but this condition cannot occur during image feature tracking. The third is $e_3 = k\bar{R}P_{vt}$ for some $k \in \mathbb{R}$, which is equivalent to $u_v = c_x$ and $n_v = c_y$, where $u_v$ and $n_v$ are the pixel coordinates of the target in the virtual camera image, which has the same intrinsic matrix K as the actual camera. The pixel coordinates $[u_v, n_v]^T$ are derived as follows (see the sketch after this paragraph):
\[
\begin{bmatrix} u_v \\ n_v \\ 1 \end{bmatrix} = \mu^{-1} K\bar{R}^T R_{wc}K^{-1}\begin{bmatrix} u_c \\ n_c \\ 1 \end{bmatrix}, \tag{26}
\]
where
\[
\mu = e_3^T K\bar{R}^T R_{wc}K^{-1}\begin{bmatrix} u_c \\ n_c \\ 1 \end{bmatrix}. \tag{27}
\]
Hence, we switch from the horizontal approaching to the vertical approaching when the following condition is satisfied:
\[
|u_v - c_x| \le \epsilon_{u,1}, \qquad |n_v - c_y| \le \epsilon_{n,1}, \tag{28}
\]
where $\epsilon_{u,1}$ and $\epsilon_{n,1}$ are threshold values on the absolute values of the errors, chosen so that the singular case in which (20) fails is avoided. When condition (28) holds, we anchor the actual camera so that it always looks downward relative to the body frame {b} and start the vertical approaching step derived below. In this step, the multirotor descends while staying close to the center of the target. The state for the controller in this second step is defined as
\[
s_2 = [u_v, n_v, h]^T \in \mathbb{R}^3.
\]
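The sketch below computes the virtual-camera pixel coordinates via (26)–(27) and evaluates the switching condition (28). The constant R_BAR uses the sign convention of $\bar{R}$ as reconstructed above, and the function names are our own.

```python
import numpy as np

R_BAR = np.array([[0.0, 1.0, 0.0],    # fixed rotation R_bar of the virtual camera frame,
                  [1.0, 0.0, 0.0],    # with its optical axis pointing straight down
                  [0.0, 0.0, -1.0]])

def virtual_pixel(uc, nc, K, R_wb, R_bc):
    """Map the real pixel (uc, nc) to the virtual downward camera, Equations (26)-(27)."""
    R_wc = R_wb @ R_bc
    H = K @ R_BAR.T @ R_wc @ np.linalg.inv(K)
    q = H @ np.array([uc, nc, 1.0])
    mu = q[2]                          # mu = e3^T K R_bar^T R_wc K^{-1} [uc, nc, 1]^T
    uv, nv = q[:2] / mu
    return uv, nv, mu

def switch_to_vertical(uv, nv, cx, cy, eps_u1, eps_n1):
    """Switching condition (28) from horizontal to vertical approaching."""
    return abs(uv - cx) <= eps_u1 and abs(nv - cy) <= eps_n1
```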
 Theorem 3. 
The dynamics of the state $s_2 = [u_v, n_v, h]^T \in \mathbb{R}^3$ are given by
\[
\dot{s}_2 = A_2(s_2) + B_2(s_2)\gamma, \tag{29}
\]
where
\[
A_2(s_2) = \begin{bmatrix} -\frac{1}{Z_{vt}} L(u_v, n_v)\bigl(\bar{R}^T R_{wb}\hat{\Omega}_b P_{bc}\bigr) \\ 0 \end{bmatrix}, \qquad
B_2(s_2) = \begin{bmatrix} -\frac{1}{Z_{vt}} L(u_v, n_v) \\ -e_3^T \end{bmatrix},
\]
and γ is related to $V_{wb}$ as
\[
\gamma = \bar{R}^T V_{wb} \in \mathbb{R}^3. \tag{30}
\]
Proof. 
The first two variables of $s_2$ satisfy
\[
\begin{bmatrix} \dot{u}_v \\ \dot{n}_v \end{bmatrix} = \frac{1}{Z_{vt}} L(u_v, n_v)\dot{P}_{vt}, \tag{31}
\]
as in the derivation of (12), where $Z_{vt}$ and $P_{vt}$ are given in (25). To obtain $\dot{P}_{vt}$ in (31), we express $P_{vt}$ as
\[
P_{vt} = \bar{R}^T(P_{wt} - R_{wb}P_{bc} - P_{wb}), \tag{32}
\]
whose time derivative is
\[
\dot{P}_{vt} = \bar{R}^T(-R_{wb}\hat{\Omega}_b P_{bc} - V_{wb}) = -\bar{R}^T R_{wb}\hat{\Omega}_b P_{bc} - \gamma \tag{33}
\]
by Lemma 1, (1), and (30). By substituting (33) into (31), we obtain the time derivatives of the first two variables of $s_2$ as
\[
\begin{bmatrix} \dot{u}_v \\ \dot{n}_v \end{bmatrix} = -\frac{1}{Z_{vt}} L(u_v, n_v)\bigl(\bar{R}^T R_{wb}\hat{\Omega}_b P_{bc} + \gamma\bigr). \tag{34}
\]
The time derivative of the third variable of $s_2$ is given by
\[
\dot{h} = e_3^T\bar{R}\gamma = -e_3^T\gamma. \tag{35}
\]
By incorporating (34) and (35), we get (29). □
The desired state is given as
\[
s_{2,d} = [c_x, c_y, h_{2,d}]^T,
\]
where $h_{2,d}$ is a sufficiently small desired distance between the multirotor and the landing site.
 Theorem 4. 
Define the error state as follows:
\[
\epsilon_2 = s_2 - s_{2,d}.
\]
Then $\epsilon_2$ converges to zero as $\tau \to \infty$ if the control law is given by
\[
V_{wb} = \bar{R}\gamma_2, \tag{36}
\]
where
\[
\gamma_2 = B_2(s_2)^{-1}\bigl(-\lambda_2\epsilon_2 - A_2(s_2)\bigr) \tag{37}
\]
with a positive-definite diagonal matrix $\lambda_2 \in \mathbb{R}^{3\times 3}$.
Proof. 
The dynamics of the error state $\epsilon_2$ become
\[
\dot{\epsilon}_2 = A_2(s_2) + B_2(s_2)\gamma_2
\]
by Theorem 3. Since the matrix $B_2(s_2)$ is always invertible, we obtain
\[
\dot{\epsilon}_2 = -\lambda_2\epsilon_2
\]
by (36), so that $\epsilon_2$ converges to zero as $\tau \to \infty$. □
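A sketch of the vertical-approaching law of Theorems 3 and 4, assembling $A_2$, $B_2$, and $\gamma_2$ as above; the function name and argument list are assumptions of this sketch, and R_bar is the fixed rotation $\bar{R}$.

```python
import numpy as np

def vertical_control(uv, nv, h, Zvt, K, R_wb, Omega_b, P_bc, s2_d, lam2, R_bar):
    """Vertical-approaching law of Theorems 3 and 4, Equations (29)-(37), as a sketch."""
    def hat(w):
        return np.array([[0.0, -w[2], w[1]],
                         [w[2], 0.0, -w[0]],
                         [-w[1], w[0], 0.0]])
    e3 = np.array([0.0, 0.0, 1.0])
    p = np.array([uv, nv, 1.0])
    L = (K - np.outer(p, e3))[:2, :]                     # L(uv, nv)

    drift = -(L / Zvt) @ (R_bar.T @ R_wb @ hat(Omega_b) @ P_bc)
    A2 = np.concatenate([drift, [0.0]])
    B2 = np.vstack([-L / Zvt, -e3])                      # 3x3 and always invertible

    eps2 = np.array([uv, nv, h]) - s2_d
    gamma2 = np.linalg.solve(B2, -lam2 @ eps2 - A2)      # Equation (37)
    V_wb = R_bar @ gamma2                                # Equation (36)
    return V_wb
```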
Through Theorems 2 and 4 and the switching condition (28), our control algorithm operates continuously, driving the defined errors to zero and thereby bringing the multirotor to the landing position. The method in [14] assumes a small gimbal roll angle to avoid singular cases; in contrast, our algorithm identifies the singular case theoretically and addresses it with the two-step controller. Additionally, recent methods [7,14,15,16] implement gimbal controllers without considering the UAV motion or treat it merely as a disturbance. In contrast, we design the controller based on the integrated dynamics, which ensures that every component of the error converges to zero simultaneously and exponentially.
 Remark 3. 
Calculation of the vertical approaching controller in (36) needs the values of $u_v$, $n_v$, h, $Z_{vt}$, $R_{wb}$, and $\Omega_b$. The way to obtain the values of h, $R_{wb}$, and $\Omega_b$ is explained in Remark 2. The values of $u_v$ and $n_v$ can be obtained by (26). Normally, $Z_{vt}$ can be obtained by $Z_{vt} = \mu Z_{ct}$ with μ defined in (27). If $P_{bc}$ is small, we can approximate $Z_{vt}$ by h.
The comprehensive workflow for regulating the multirotor is depicted in Figure 2. Initially, the multirotor surveys the surrounding area to identify the target landing site within the field of view of the camera. Subsequently, the horizontal approaching step outlined in Section 2.2 steers the multirotor. Once the multirotor approaches the target and fulfils the requirements specified in (28), the controller shifts to the vertical approaching step as discussed in Section 2.3. When the multirotor reaches a vicinity of the target vertically and satisfies
\[
|u_v - c_x| \le \epsilon_{u,2}, \qquad |n_v - c_y| \le \epsilon_{n,2}, \qquad |h - h_{2,d}| \le \epsilon_h, \qquad \|V_{wb}\| \le V_{wb}^{th}, \tag{38}
\]
where $\epsilon_{u,2}$, $\epsilon_{n,2}$, $\epsilon_h$, and $V_{wb}^{th}$ are threshold values on the absolute values of the corresponding errors, the landing process is initiated by directing the multirotor to descend and powering off the multirotor.
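A minimal sketch of the landing trigger (38); the commented usage example reflects the thresholds reported for the real experiment in Section 3, and the function name is our own.

```python
import numpy as np

def ready_to_land(uv, nv, h, V_wb, cx, cy, h2_d,
                  eps_u2, eps_n2, eps_h, v_th=np.inf):
    """Landing trigger, condition (38): pixel, altitude, and speed errors all small.

    v_th defaults to infinity to reproduce the simulation setting in which no
    velocity threshold was imposed.
    """
    return (abs(uv - cx) <= eps_u2 and abs(nv - cy) <= eps_n2
            and abs(h - h2_d) <= eps_h and np.linalg.norm(V_wb) <= v_th)

# Example with the real-experiment thresholds (Section 3):
# ready_to_land(uv, nv, h, V_wb, cx=217.39, cy=112.67, h2_d=0.5,
#               eps_u2=20, eps_n2=20, eps_h=0.2, v_th=0.3)
```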

3. Results

To evaluate the effectiveness of our two-step VBAL controller, we first conducted experiments in simulation using Gazebo on Ubuntu 18.04. The controller was implemented in C++ and ROS [21], and control inputs for both the multirotor and the gimbal were sent via the PX4 autopilot. We gave $V_{wb}$ as the multirotor control command. Our gimbal controller, on the other hand, only accepts three-axis angle values as control commands, so we calculate a rotation matrix $R_{cmd}$ from $\Omega_c$ as follows:
\[
R_{cmd} = R_{bc}(\tau)\, e^{\hat{\Omega}_c\,\Delta\tau},
\]
where Δτ is the sampling time. We then convert $R_{cmd}$ into Euler angles in the Z-X-Y sequence to provide the yaw, roll, and pitch angle commands.
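A sketch of this conversion using SciPy's rotation utilities, assuming the gimbal expects intrinsic Z-X-Y (yaw-roll-pitch) angles in degrees; the function name is our own.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def gimbal_angle_command(R_bc, Omega_c, dt):
    """Convert the rate command Omega_c into gimbal angle commands (a sketch).

    R_cmd = R_bc * exp(hat(Omega_c) * dt), then Z-X-Y (yaw, roll, pitch) Euler angles.
    """
    R_cmd = R_bc @ Rotation.from_rotvec(Omega_c * dt).as_matrix()  # matrix exponential of hat(Omega_c * dt)
    yaw, roll, pitch = Rotation.from_matrix(R_cmd).as_euler('ZXY', degrees=True)
    return yaw, roll, pitch

# Example: identity gimbal attitude, 0.1 rad/s rate about the camera x-axis,
# sampling time 0.1 s (the simulation value).
print(gimbal_angle_command(np.eye(3), np.array([0.1, 0.0, 0.0]), 0.1))
```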
We utilized the center pixel of the Apriltag as the target landing site, and the detection algorithm [22] was used by the multirotor solely for detecting this specific target. The initial hover position of the multirotor was $P_{wb} = [0, 0.05, 2.65]^T$, as illustrated in Figure 3a, and the target position was $P_{wt} = [5.5, 3.5, 0]^T$, the center of a 1 m × 1 m Apriltag. We initiated the horizontal approaching step with the desired state $s_{1,d} = [c_x, c_y, 2.5, 0, 2.5]^T$, followed by the vertical approaching step with the desired state $s_{2,d} = [c_x, c_y, 0.5]^T$. The gain matrices for the controllers were set to $\lambda_1 = \mathrm{diag}(30, 30, 20, 1, 1)$ and $\lambda_2 = I_{3\times 3}$ based on experience. The camera intrinsic parameters are given in Table 1. We set $(\epsilon_{u,1}, \epsilon_{n,1}) = (160, 80)$ pixels, $(\epsilon_{u,2}, \epsilon_{n,2}) = (20, 20)$ pixels, and $\epsilon_h = 0.05$ m, and no threshold was imposed on $V_{wb}^{th}$. The sampling time Δτ was 0.1 s. As seen in Figure 3b, the multirotor landed securely on the landing site.
The three-dimensional trajectory of the multirotor is shown in Figure 4, with the blue and red lines representing the trajectories of the horizontal approaching step and the vertical approaching step, respectively, while the green pyramid markers indicate the camera orientations. No unnecessary rolling of the gimbal camera was observed.
Figure 5 shows the measurement plots beginning when the gimbal camera of the multirotor initially detects the target. The intervals for horizontal approaching, vertical approaching, and landing are indicated by dotted blue, red, and violet boxes, respectively. The trajectories of the pixel errors $u_c - c_x$ and $n_c - c_y$ between the target feature and the camera principal point in the image are plotted in Figure 5a, demonstrating that the gimbal camera was directed to place the target feature near the principal point of the camera. Moreover, the gimbal camera consistently focused on the target throughout the experiment, as demonstrated in Figure 4. The horizontal distance $\|[X_{wb}, Y_{wb}]^T - [X_{wt}, Y_{wt}]^T\|$ between the multirotor and the target is depicted in Figure 5b, illustrating that the multirotor flew horizontally towards the target during the horizontal approaching step. The altitude h of the multirotor is presented in Figure 5c, indicating a small decrease in altitude during the horizontal approaching step, while the multirotor descended vertically towards the target during the vertical approaching step. In Figure 5d, the yaw, roll, and pitch angle trajectories of the gimbal camera are shown.
Our method was also validated in a real-world setting using the multirotor in Figure 6a. This multirotor was equipped with an onboard computer (NVIDIA's Jetson TX2) for algorithm processing and a flight controller with PX4 firmware for controlling the actuators. We attached a gimbal (STorM32) to the multirotor bottom and mounted a camera (GoPro Hero 4) on the gimbal. The diagonal size and height of the multirotor are 37 cm and 21 cm, respectively. We defined the center of the multirotor as being 16.8 cm from the bottom of the multirotor. The total weight of the multirotor with onboard computer, gimbal camera, and battery is 2.46 kg. We used the state estimator of the PX4 controller, which fused data from a motion capture system (OptiTrack), an IMU, and a range finder, so we were able to get the altitude, attitude, and angular velocity of the multirotor center and give linear velocity and yaw commands. The landing site consisted of two Apriltags, as shown in Figure 6b. The experimental site was an indoor area with dimensions of 6 m × 7 m × 3 m (width × depth × height), with the center of the ground defined as the origin using the motion capture system.
When the multirotor was far from the target, it detected the larger Apriltag first, and as it approached the target, it detected the smaller Apriltag. The initial position of the multirotor was $P_{wb} = [1.69, 0.94, 1.77]^T$, and the target position was $P_{wt} = [2, 1, 0]^T$. The desired states for the horizontal approaching step and the vertical approaching step were $s_{1,d} = [c_x, c_y, 1.8, 0, 1.8]^T$ and $s_{2,d} = [c_x, c_y, 0.5]^T$, respectively. The gain matrices of the controller were selected as $\lambda_1 = \mathrm{diag}(60, 60, 80, 5, 5)$ and $\lambda_2 = \mathrm{diag}(0.5, 0.5, 1.0)$ from experience. The camera intrinsic parameters are given in Table 1. We set $(\epsilon_{u,1}, \epsilon_{n,1}) = (108, 60)$ pixels, $(\epsilon_{u,2}, \epsilon_{n,2}) = (20, 20)$ pixels, $\epsilon_h = 0.2$ m, and $V_{wb}^{th} = 0.3$ m·s$^{-1}$. The sampling time Δτ was 0.033 s. We captured four images of the multirotor flight, as shown in Figure 7. The initial and final states of the multirotor are depicted in Figure 7a,d, respectively.
The three-dimensional trajectory of the multirotor is shown in Figure 8 using the same representation as in Figure 4. We plotted the pixel distance between the target feature and the principal point of the camera in the camera image, the horizontal distance between the multirotor and the target, the altitude of the multirotor, and the three-axis angles of the gimbal camera in the real environment in Figure 9, using the same representation as in Figure 5. The multirotor safely landed on the target landing site during the real experiment, confirming the feasibility of our two-step VBAL controller. We also conducted ten experiments for each of three different tests, keeping all the real experimental setup parameters mentioned previously except for the initial position. We plotted all three-dimensional position trajectories of the multirotor in Figure 10. The red trajectories represent the first test, the blue trajectories represent the second test, and the green trajectories represent the third test. In Table 2, the three-dimensional initial positions, success rates, and mean final horizontal landing position errors $[X_{wb}, Y_{wb}] - [X_{wt}, Y_{wt}]$ for the three tests are shown. The landing was successful in all experiments, and the total mean and standard deviation of the final landing position errors were $[0.029, 0.043] \pm [0.044, 0.036]$. Compared to the landing site size, the total mean distance error is less than 10%, demonstrating that stable and precise landings are possible. Our successful landing results from various initial multirotor positions are shown in the video available at https://youtu.be/MtBXix4g638 (accessed on 6 August 2024) (see Supplementary Video S1).

4. Conclusions

This study proposes a novel two-step controller for achieving vision-based autonomous landing of a multirotor equipped with a gimbal camera. We define state variables for each step, derive the integrated dynamics of the multirotor and the gimbal camera, and formulate controllers using feedback linearization. The first step involves horizontal approaching of the multirotor towards the landing site, while the second step comprises vertical descent. The controller requires only one feature point and effectively eliminates unnecessary camera rolling. The efficacy of the controller is demonstrated in both simulation and a real environment, exhibiting desirable multirotor control. We expect our control algorithm to be applicable when a multirotor needs to explore a wide area for surveillance and detection and then approach a target closely, or when a delivery drone needs to land at a site whose location is not precisely known. Our proposed algorithm, based on a velocity-level controller, will be further developed with a thorough consideration of the multirotor dynamics to ensure robustness against target movement and disturbances caused by external forces in future work. Additionally, we will conduct experiments under various conditions to derive metrics for quantitative performance, allowing for comparison with other state-of-the-art VBAL methods.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/drones1010000/s1, Video S1. Demonstrations in a real environment.

Author Contributions

Conceptualization, S.Y. and J.-H.P.; methodology, S.Y.; software, S.Y.; validation, S.Y. and J.-H.P.; formal analysis, S.Y.; investigation, S.Y. and J.-H.P.; resources, S.Y. and J.-H.P.; data curation, S.Y. and J.-H.P.; writing—original draft preparation, S.Y. and J.-H.P.; writing—review and editing, D.E.C.; visualization, S.Y.; supervision, D.E.C.; project administration, D.E.C.; funding acquisition, D.E.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by Institute for Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2022-0-00469) and by Center for Applied Research in Artificial Intelligence (CARAI) grant funded by Defense Acquisition Program Administration (DAPA) and Agency for Defense Development (ADD) (UD190031RD).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available upon request from the corresponding author. The data are not publicly available due to legal restrictions.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
UAV    Unmanned aerial vehicle
VBAL   Vision-based autonomous landing
IMU    Inertial measurement unit
GPS    Global positioning system
LiDAR  Light detection and ranging

References

1. Tomic, T.; Schmid, K.; Lutz, P.; Domel, A.; Kassecker, M.; Mair, E.; Grixa, I.L.; Ruess, F.; Suppa, M.; Burschka, D. Toward a Fully Autonomous UAV: Research Platform for Indoor and Outdoor Urban Search and Rescue. IEEE Robot. Autom. Mag. 2012, 19, 46–56.
2. Lu, Y.; Xue, Z.; Xia, G.S.; Zhang, L. A survey on vision-based UAV navigation. Geo-Spat. Inf. Sci. 2018, 21, 21–32.
3. Kanellakis, C.; Nikolakopoulos, G. Survey on computer vision for UAVs: Current developments and trends. J. Intell. Robot. Syst. 2017, 87, 141–168.
4. Kong, W.; Zhou, D.; Zhang, D.; Zhang, J. Vision-based autonomous landing system for unmanned aerial vehicle: A survey. In Proceedings of the 2014 International Conference on Multisensor Fusion and Information Integration for Intelligent Systems (MFI), Beijing, China, 28–29 September 2014; pp. 1–8.
5. Kumar, A. Real-time performance comparison of vision-based autonomous landing of quadcopter on a ground moving target. IETE J. Res. 2023, 69, 5455–5472.
6. Zheng, D.; Wang, H.; Wang, J.; Chen, S.; Chen, W.; Liang, X. Image-based visual servoing of a quadrotor using virtual camera approach. IEEE/ASME Trans. Mechatron. 2016, 22, 972–982.
7. Keipour, A.; Pereira, G.A.; Bonatti, R.; Garg, R.; Rastogi, P.; Dubey, G.; Scherer, S. Visual Servoing Approach to Autonomous UAV Landing on a Moving Vehicle. Sensors 2022, 22, 6549.
8. Lee, D.; Ryan, T.; Kim, H.J. Autonomous landing of a VTOL UAV on a moving platform using image-based visual servoing. In Proceedings of the 2012 IEEE International Conference on Robotics and Automation (ICRA), St Paul, MN, USA, 14–18 May 2012; pp. 971–976.
9. Li, J.; Xie, H.; Low, K.H.; Yong, J.; Li, B. Image-based visual servoing of rotorcrafts to planar visual targets of arbitrary orientation. IEEE Robot. Autom. Lett. 2021, 6, 7861–7868.
10. Zheng, D.; Wang, H.; Wang, J.; Zhang, X.; Chen, W. Toward visibility guaranteed visual servoing control of quadrotor UAVs. IEEE/ASME Trans. Mechatron. 2019, 24, 1087–1095.
11. Zhang, K.; Shi, Y.; Sheng, H. Robust nonlinear model predictive control based visual servoing of quadrotor UAVs. IEEE/ASME Trans. Mechatron. 2021, 26, 700–708.
12. Fu, G.; Chu, H.; Liu, L.; Fang, L.; Zhu, X. Deep reinforcement learning for the visual servoing control of UAVs with FOV constraint. Drones 2023, 7, 375.
13. Hansen, J.G.; de Figueiredo, R.P. Active Object Detection and Tracking Using Gimbal Mechanisms for Autonomous Drone Applications. Drones 2024, 8, 55.
14. Jiang, T.; Lin, D.; Song, T. Vision-based autonomous landing of a quadrotor using a gimbaled camera. Proc. Inst. Mech. Eng. Part G J. Aerosp. Eng. 2019, 233, 5093–5106.
15. Liu, X.; Yang, Y.; Ma, C.; Li, J.; Zhang, S. Real-time visual tracking of moving targets using a low-cost unmanned aerial vehicle with a 3-axis stabilized gimbal system. Appl. Sci. 2020, 10, 5064.
16. Yang, L.; Wang, X.; Zhou, Y.; Liu, Z.; Shen, L. Online predictive visual servo control for constrained target tracking of fixed-wing unmanned aerial vehicles. Drones 2024, 8, 136.
17. Cabecinhas, D.; Brás, S.; Cunha, R.; Silvestre, C.; Oliveira, P. Integrated visual servoing solution to quadrotor stabilization and attitude estimation using a pan and tilt camera. IEEE Trans. Control Syst. Technol. 2017, 27, 14–29.
18. Hartley, R.; Zisserman, A. Multiple View Geometry in Computer Vision; Cambridge University Press: Cambridge, UK, 2003.
19. Chang, D.E. A New Bundle Picture for the Drone. IEEE Trans. Autom. Control 2023, 68, 4218–4224.
20. Jabbari, H.; Oriolo, G.; Bolandi, H. An adaptive scheme for image-based visual servoing of an underactuated UAV. Int. J. Robot. Autom. 2014, 29, 92–104.
21. Quigley, M.; Conley, K.; Gerkey, B.; Faust, J.; Foote, T.; Leibs, J.; Wheeler, R.; Ng, A.Y. ROS: An open-source Robot Operating System. In Proceedings of the ICRA Workshop on Open Source Software, Kobe, Japan, 12–17 May 2009; Volume 3, p. 5.
22. Wang, J.; Olson, E. AprilTag 2: Efficient and robust fiducial detection. In Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Daejeon, Republic of Korea, 9–14 October 2016; pp. 4193–4198.
Figure 1. Overall framework of our vision-based autonomous landing system. The mission is to guide the multirotor, positioned at an arbitrary point in three-dimensional space, to land using the proposed vertical and horizontal approachings. The first difference between the two approachings is when they are employed concerning distance from the vertical axis of the target, and the second is whether gimbal control is utilized.
Figure 2. Overall workflow for the multirotor control. The condition of transition from horizontal approaching to vertical approaching is (28). The condition of transition from vertical approaching to landing is (38).
Figure 3. Gazebo environment. The red, green, and blue lines represent the X, Y, and Z axes of the world frame. (a) The initial hovering state of the multirotor. (b) The final landing state of the multirotor.
Figure 4. The three-dimensional trajectory of the multirotor in the Gazebo environment.
Figure 5. Trajectories in the Gazebo environment of (a) the pixel errors between the target feature and the camera principal point in the camera image, (b) the horizontal distance $\|[X_{wb}, Y_{wb}]^T - [X_{wt}, Y_{wt}]^T\|$ between the multirotor and the target, (c) the altitude h of the multirotor, and (d) the three-axis angles of the gimbal camera. I: horizontal approaching (blue dotted box), II: vertical approaching (red dotted box), III: landing (violet dotted box).
Figure 6. Real environment for the experiment. (a) The multirotor, (b) Apriltag for the landing site.
Figure 7. Images of the multirotor flying with our controller, where the red circle locates the multirotor. Time sequence: (a–d).
Figure 8. The three-dimensional trajectory of the multirotor in the real environment.
Figure 9. Trajectories in the real environment: (a) the pixel errors between the target feature and the camera principal point in the camera image, (b) the horizontal distance $\|[X_{wb}, Y_{wb}]^T - [X_{wt}, Y_{wt}]^T\|$ between the multirotor and the target, (c) the altitude h of the multirotor, and (d) the three-axis angles of the gimbal camera. I: horizontal approaching (blue dotted box), II: vertical approaching (red dotted box), III: landing (violet dotted box).
Figure 10. The three-dimensional position trajectories for the three tests. The red, blue, and green trajectories represent the trajectories of the first, second, and third tests, respectively.
Table 1. Camera intrinsic parameters.
Parameter: $(f_x, f_y, c_x, c_y)$
Simulation: (205.47, 205.47, 320.50, 180.50)
Real experiment: (207.86, 205.42, 217.39, 112.67)
Table 2. Initial positions and quantitative results for the three tests.
                              Test 1           Test 2           Test 3
Initial position (m)          [1, 1.5, 1.8]    [1.5, 1, 1.8]    [1.5, 0.5, 1.8]
Success rate (%)              100              100              100
Landing position error (m)    [0.027, 0.042]   [0.003, 0.046]   [0.057, 0.042]
