Robust Vision-Based Control of a Rotorcraft UAV for Uncooperative Target Tracking

This paper investigates the problem of using an unmanned aerial vehicle (UAV) to track and hover above an uncooperative target, such as an unvisited area or an object that is newly discovered. A vision-based strategy integrating the metrology and the control is employed to achieve target tracking and hovering observation. First, by introducing a virtual camera frame, the reprojected image features can change independently of the rotational motion of the vehicle. The image centroid and an optimal observation area on the virtual image plane are exploited to regulate the relative horizontal and vertical distance. Then, the optic flow and gyro measurements are utilized to estimate the relative UAV-to-target velocity. Further, a gain-switching proportional-derivative (PD) control scheme is proposed to compensate for the external interference and model uncertainties. The closed-loop system is proven to be exponentially stable, based on the Lyapunov method. Finally, simulation results are presented to demonstrate the effectiveness of the proposed vision-based strategy in both hovering and tracking scenarios.


Introduction
Unmanned aerial vehicles (UAVs) have received growing interest due to their advantages of vertical takeoff and landing, rapid maneuverability, and low cost. With improvements in sensing devices, batteries, materials, and other technologies, UAVs now have sufficient payload and flight endurance to support many applications, such as transportation, real-time monitoring, search and rescue, and security and surveillance [1][2][3][4]. There have been a variety of studies related to missions using autonomous hovering and tracking technologies [5][6][7]. In [5], a finite-time controller was proposed to drive a quadrotor to hover above a target within a limited duration. In [6], a novel fuzzy PID-type iterative learning control was developed for trajectory tracking of a quadrotor under the effects of external disturbances and uncertain factors of the system. The problem of energy-efficient path planning while simultaneously anticipating disturbances was addressed in [7]. However, the available studies have mainly focused on hovering above or tracking a target with a known trajectory and definite position and velocity information. Tracking passive, uncooperative, or even unknown targets, such as vehicles in traffic accidents or fire areas in groves and forests, is still a challenging problem for UAV control. These cases usually occur suddenly, and detailed geometric information (e.g., dimensions and size) of the target is not available. On the other hand, rough information (e.g., shape and structure) or a simplified model of the target can be stored onboard, which can help the vehicle to identify and lock on to the target.
A wide range of state-of-the-art sensors can be equipped on the airborne platform, and multi-sensor fusion is a promising trend in UAV navigation and control [8]. Inertial measurement

Problem Formulation
The problem addressed in this paper corresponds to UAV tracking and hovering scenarios in which the target is newly discovered. The target, which is referred to as "uncooperative", lacks accurate dimension/size information and real-time communication with the vehicle. It can be identified by rough information about generic features stored onboard, such as shape and structure. The UAV studied here is a quadrotor equipped with multiple sensors, including an IMU, an ultrasonic sensor, and a monocular camera. A vision-based strategy integrating the metrology and the resulting control is employed to achieve target tracking and hovering observation. The quadrotor is a typical underactuated mechanism, with more degrees of freedom than control inputs. To analyze the tracking maneuvers, we first define several reference coordinate frames and present the equations of motion of the quadrotor, and then give an overall control framework for tracking and hovering observation.

Quadrotor Model
The quadrotor considered in this paper consists of a rigid cross-frame equipped with four rotors, as shown in Figure 1. The two rotors on each diagonal rotate in the same direction, while adjacent rotors rotate in opposite directions. Six degrees of freedom of the quadrotor's position and attitude can be achieved by adjusting the rotation speeds of the four motors. Two coordinate frames are introduced to describe the equations of motion of the quadrotor. An inertial frame I is fixed to some point O i on the earth, with a basis {X i , Y i , Z i } whose elements are oriented north, east, and down, respectively. A body-fixed frame B is attached to the center of mass O b of the quadrotor. The unit vectors of the body-fixed frame are represented by {X b , Y b , Z b }, which are oriented forward, right, and down, respectively.

The gravity vector of the quadrotor is denoted by F g . As the quadrotor may be disturbed by the wind and other external factors, unstructured forces and moments in the translational and rotational dynamics, described as F d and τ d , are introduced into the system. Consider a quadrotor with mass m and inertia matrix J ∈ R 3×3 . The translational dynamics in the inertial frame and the rotational dynamics in the body frame are given as follows. For the translational dynamics,

ξ̇ = v,  m v̇ = F g − u 1 R E 3 + F d ,  (1)

and for the rotational dynamics,

J ω̇ = −ω × (J ω) + τ + τ d ,  (2)

where ξ = [x, y, z] T and v = [v x , v y , v z ] T are the position and linear velocity of the quadrotor, respectively, expressed in the inertial frame; E 3 = [0, 0, 1] T is the unit vector in the body frame; the attitude of the vehicle, Φ, is given by three Euler angles ϕ, θ, and ψ denoting the roll, pitch, and yaw, respectively; and ω = [ω x , ω y , ω z ] T is the quadrotor's angular velocity expressed in the body frame. The corresponding rotation matrix from the body frame to the inertial frame is denoted by R, and the matrix W(Φ) associating the Euler angle rates with the angular velocity, Φ̇ = W(Φ) ω, can be written as

W(Φ) = [ 1, sin ϕ tan θ, cos ϕ tan θ ; 0, cos ϕ, −sin ϕ ; 0, sin ϕ / cos θ, cos ϕ / cos θ ].

Assumption 1. If the quadrotor does not perform maneuvers that are too aggressive, the roll and pitch angles will both be very small (< 15°). Then, the matrix W(Φ) can be replaced with a unit matrix, and the Euler angle rates ϕ̇, θ̇, ψ̇ and accelerations ϕ̈, θ̈, ψ̈ can be regarded as approximately equal to ω x , ω y , ω z and ω̇ x , ω̇ y , ω̇ z , respectively.
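As a concrete illustration of the model above, the following sketch integrates the translational and rotational dynamics with one explicit Euler step. The mass and inertia values are the paper's simulation parameters; the ZYX Euler rotation and the NED (down-positive) convention are assumptions consistent with the frames defined here, not code from the paper.

```python
import numpy as np

def euler_rotation(phi, theta, psi):
    """Rotation matrix R (body to inertial) for ZYX Euler angles."""
    cph, sph = np.cos(phi), np.sin(phi)
    cth, sth = np.cos(theta), np.sin(theta)
    cps, sps = np.cos(psi), np.sin(psi)
    return np.array([
        [cth*cps, sph*sth*cps - cph*sps, cph*sth*cps + sph*sps],
        [cth*sps, sph*sth*sps + cph*cps, cph*sth*sps - sph*cps],
        [-sth,    sph*cth,               cph*cth],
    ])

def quadrotor_step(xi, v, Phi, omega, u1, tau, dt, m=2.1,
                   J=np.diag([0.0096, 0.0098, 0.016]), g=9.81):
    """One explicit-Euler step of the rigid-body model.
    Translational: m*v_dot = F_g - u1*R*E3 (disturbances omitted here).
    Rotational:    J*omega_dot = -omega x (J*omega) + tau.
    Under Assumption 1 (small roll/pitch), Phi_dot is approximated by omega."""
    E3 = np.array([0.0, 0.0, 1.0])                    # down axis (NED)
    R = euler_rotation(*Phi)
    v_dot = g * E3 - (u1 / m) * R @ E3                # gravity minus body thrust
    omega_dot = np.linalg.solve(J, tau - np.cross(omega, J @ omega))
    return xi + dt*v, v + dt*v_dot, Phi + dt*omega, omega + dt*omega_dot
```

At exact hover (zero angles, u1 = m g), the translational acceleration vanishes, which is a quick sanity check on the sign conventions.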
According to the working principle of the quadrotor (see Figure 1), by adjusting the rotational speeds of each group of rotors, the quadrotor can generate a thrust force u 1 and a torque vector τ = [u 2 , u 3 , u 4 ] T , which can be described as functions of the squared rotor speeds, where ω 1 , ω 2 , ω 3 , and ω 4 are the speeds of the four motors; b and d represent the lift and drag coefficients, respectively; and l is the distance from the center of each rotor to the center of mass of the quadrotor. Assumption 2. The external disturbances F d and τ d are assumed to be bounded; that is, ‖F d ‖ and ‖τ d ‖ are bounded above by positive constants, where ‖·‖ denotes the standard Euclidean vector norm and induced matrix norm. Assumption 3. The mass is m = m̄ + ∆m, where m̄ and ∆m are the nominal and uncertain parts of the mass, respectively. The inertia matrix is J = J̄ + ∆J, where J̄ and ∆J are the nominal and uncertain parts of the inertia matrix, respectively. ∆m and ∆J satisfy the inequalities |∆m| ≤ c m and ‖∆J‖ ≤ c J , where c m and c J are positive constants.
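The motor-speed-to-thrust/torque mapping described above can be sketched as follows. This is a plausible reconstruction of the standard cross-configuration allocation; the rotor numbering, the sign pattern, and the coefficient values for b, d, and l are assumptions, since the paper's equation is not reproduced here.

```python
def motor_mixing(omega_sq, b=3.1e-5, d=7.5e-7, l=0.23):
    """Map squared motor speeds [w1^2, w2^2, w3^2, w4^2] to the thrust u1
    and torques u2..u4.  b: lift coefficient, d: drag coefficient,
    l: rotor-to-center distance (all illustrative values)."""
    w1, w2, w3, w4 = omega_sq
    u1 = b * (w1 + w2 + w3 + w4)        # total thrust
    u2 = l * b * (w4 - w2)              # roll torque
    u3 = l * b * (w3 - w1)              # pitch torque
    u4 = d * (w2 + w4 - w1 - w3)        # yaw (drag) torque
    return u1, u2, u3, u4
```

With all four rotors at equal speed the torques cancel and only the collective thrust remains, matching the description of diagonal rotor pairs spinning in the same direction.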

Control Framework of Tracking and Hovering Observation
A flowchart of the proposed vision-based UAV tracking strategy is presented in Figure 2. During the cruise of the UAV, when a sensitive target is captured and identified by the onboard camera, the tracking and hovering observation process starts. First, a set of feature points on the target is extracted. By using the center point of the feature points and their pixel velocities, we can determine the relative distance and velocity, respectively, between the UAV and the target. Then, the relative distance and velocity are input into the controller as error states. Based on the gain-switching PD control, the target centroid is expected to coincide with the center of the image plane, and the target image is required to be within the optimal observation area.

Visual Measurement Using a Virtual Camera Frame
When a sensitive target is first detected and determined by the camera during the quadrotor flight, its projection on the image plane can be described by a set of parameters, including slope, curvature, area, image centroid, and some parameters related to the shape of the target [36]. The selected target for tracking is assumed to be horizontal to the X i -Y i plane of the inertial frame, and its projection onto the image plane is a compact area with large values of shape parameters, such as sphericity or rectangularity.

Relative Distance Estimation in Horizontal Direction
As the target is uncooperative and cannot send any position information to the quadrotor, the relative distance measurement relies entirely on the vision-based method. However, the target does not have specific markers and its detailed geometry is unknown to the quadrotor, such that it is challenging to use PnP solvers or Template Matching approaches for relative pose acquisition. In this work, we utilize an effective and simple method based on the image error to estimate the relative horizontal distance between the quadrotor and the target. On the other hand, the quadrotor is an underactuated system with only four independently controllable degrees of freedom. Rolling and translational motions along the Y b axis are coupled, as are pitching and translational motions along the X b axis, which means that the vehicle will inevitably tilt when maneuvering horizontally. Thus, the image features of the target will change with not only the translational, but also the rotational, motion of the quadrotor, which makes it more complicated to estimate the relative distance and velocity.
Without loss of generality, the camera frame considered in this paper, C = {X c , Y c , Z c }, is assumed to coincide with the body frame B. To solve the problem mentioned above, we introduce a virtual camera coordinate frame V, which has the same origin as the frame B(C). The corresponding virtual image plane X v -Y v is always parallel to the X i -Y i plane and has the same yaw angle as the frame C; that is, its roll and pitch angles remain zero (see Figure 3).

The coordinates of a point P in the inertial frame and camera frame are denoted by i P = [ i P x , i P y , i P z ] T and c P = [ c P x , c P y , c P z ] T , respectively. They have the following geometric relationship,

i P = R c P + i O c ,

where i O c denotes the coordinates of the origin O c in the inertial frame. The pixel coordinates of the point P on the image plane are given by the perspective projection equations [37]:

u = f ( c P x / c P z ) + u 0 ,  n = f ( c P y / c P z ) + n 0 ,

in which f is the focal length of the camera and [u 0 , n 0 ] T is the coordinate of the image plane center. Now, reproject the image coordinates [u, n] T onto the virtual image plane using a matrix, R θϕ , associated with a rotation in the roll angle around X i and in the pitch angle around Y i :

[ v u, v n] T = ( f / ( R 3 θϕ p) ) [ R 1 θϕ p, R 2 θϕ p] T ,

in which p = [u, n, f ] T and R 1 θϕ , R 2 θϕ , and R 3 θϕ are the row vectors of the matrix R θϕ . It is assumed that N (N ≥ 3) non-collinear points are fixed on the selected target. The image centroid of the target is computed as

[ v u g , v n g ] T = (1/N) Σ k=1,…,N [ v u k , v n k ] T .

In this paper, we need to drive the quadrotor directly above the target, such that the desired image feature is determined as the center of the virtual image plane, [ v u 0 , v n 0 ] T , as shown in Figure 3. The reprojection of the feature points onto the virtual camera frame decouples the pitch and roll motion of the vehicle through the change of coordinates [ v u, v n] T . Therefore, the relative horizontal distance can be estimated directly, using the deviation of the image centroid of the target from the center of the image plane:

∆x = a · v z / f ,  ∆y = b · v z / f ,

where a = v u g − v u 0 and b = v n g − v n 0 are image errors defined in image space, and v z is the vertical distance of the virtual camera frame, obtained from the ultrasonic sensor measurement.
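The reprojection onto the virtual plane and the centroid-based horizontal error can be sketched as below. This is a minimal illustration: coordinates are taken relative to the principal point, the focal-length-over-pixel-size value 213 is the paper's simulation setting, and the similar-triangles relation ∆x = a·vz/f is the assumed form of the distance estimate.

```python
import numpy as np

def reproject_to_virtual(uv, R_tp, f=213.0):
    """Reproject pixel features onto the virtual (level) image plane.
    uv: (N,2) image coordinates relative to the principal point.
    R_tp: roll-pitch rotation matrix R_theta_phi built from IMU attitude."""
    p = np.column_stack([uv, np.full(len(uv), f)])   # p_k = [u_k, n_k, f]^T
    q = p @ R_tp.T                                   # rotate each p_k by R_theta_phi
    return f * q[:, :2] / q[:, 2:3]                  # perspective-normalize by row 3

def horizontal_error(uv_virtual, vz, f=213.0):
    """Relative horizontal distance from the virtual-plane centroid offset.
    vz: vertical distance from the ultrasonic sensor (similar triangles)."""
    a, b = uv_virtual.mean(axis=0)   # centroid error (principal point at origin)
    return a * vz / f, b * vz / f
```

With zero roll and pitch the reprojection is the identity, so the virtual plane coincides with the real image plane, as expected.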

Relative Distance Estimation in Vertical Direction
When tracking an uncooperative target, the relative horizontal distance estimated in the above section should converge to and remain at zero. The height control is relatively flexible, depending on different observation requirements. The simplest approach is to keep the quadrotor flying at a constant height; however, this cannot guarantee that the target is observed in an optimal area on the image plane. Intuitively, the image size of the target varies with the height of the camera. The quadrotor is required to fly neither too high nor too low, so that the target image neither becomes too small to distinguish nor leaves the field of view. Now, introduce a circle on the virtual image plane, centered at O 1 with radius r opt , as the optimal observation area. Then, construct a circle that passes through the feature point farthest from the image centroid of the target. The radius of the constructed circle, denoted by r, is defined as

r = max k ‖ [ v u k , v n k ] T − [ v u g , v n g ] T ‖, k = 1, …, N. (10)

According to Equation (10), the constructed circle will cover all of the feature points of the selected compact target. Given that the relative horizontal error converges to zero, the vertical distance of the quadrotor can be controlled by adjusting r to be equal to or less than r opt , which guarantees that the target image is kept in the optimal observation area. Therefore, the desired value of the radius r satisfies the following condition,

r d = σ r opt , 0 < σ ≤ 1, (11)

where σ is the radius scaling factor and σ = 1 indicates that the optimal observation area circumscribes the target image. From Equations (10) and (11), the desired flying height of the quadrotor can be written as

z d = ( r / (σ r opt ) ) v z . (12)

Then, we can define the position error in the vertical direction as ∆z = z − z d . To control the translational motion of the quadrotor, we define the full position error as ∆ξ = [∆x, ∆y, ∆z] T , which is desired to converge to [0, 0, 0] T .
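The enclosing-circle radius and the resulting desired height can be sketched as follows. The inverse scaling of image size with height assumed in `desired_height` is an illustration of the reasoning above, not the paper's exact equation.

```python
import numpy as np

def enclosing_radius(uv_virtual):
    """Radius of the circle through the feature point farthest from the
    image centroid (covers all features of a compact target)."""
    c = uv_virtual.mean(axis=0)
    return np.max(np.linalg.norm(uv_virtual - c, axis=1))

def desired_height(vz, r, r_opt=40.0, sigma=1.0):
    """Height at which the target circle shrinks to sigma * r_opt pixels.
    Image size scales inversely with height, so z_d = vz * r / (sigma * r_opt).
    r_opt = 40 px matches the reference circle used in the simulations."""
    return vz * r / (sigma * r_opt)
```

For example, if the current circle is twice the reference radius, the vehicle should fly at twice its current height to shrink the image back into the optimal area.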

Relative Velocity Estimation
Generally, the control actions require knowledge of not only the position error but also the velocity error. The quadrotor is equipped with an IMU providing the angular velocity and acceleration of the vehicle. If the target is stationary, the relative velocity can be obtained directly by integrating the measured acceleration. However, while the precision of the gyro is satisfactory for the needs of vehicle maneuvers, UAV-mounted accelerometers are usually not accurate enough for evaluating the platform velocity [29]. Velocity error acquisition is more complicated when the uncooperative target is moving. In this paper, we use optic flow and gyro measurements to estimate the relative quadrotor-to-target velocity, a procedure referred to as partial velocity evaluation.
Taking the first derivative of Equation (7), we obtain the image feature dynamics (14), in which v v and v v pk are the velocities of the quadrotor and of the point P k , respectively, expressed in the virtual camera frame, and L vk and L ψk are the translational and rotational interaction matrices. Assume that the velocities of the N points are approximately equal to the target velocity, v v pk = v v t . Then, Equation (14) can be stacked over all N feature points. The finite time difference of the image features, v ṗ, can be computed by directly measuring the optic flow of the visual features in the images. Denoting by ∆v the relative quadrotor-to-target velocity expressed in the inertial frame, we obtain the velocity estimate, in which L + v ∈ R 3×2N is the Moore-Penrose pseudo-inverse of the matrix L v and R ψ is the rotation matrix corresponding to the yaw angle, by the relation R = R ψ R θϕ .
The relative translational velocity ∆v (expressed in the inertial frame) and the pixel velocities v ṗ are related by the interaction matrices L vk and L ψk . At least two points on the target are required to determine the three components of the vector ∆v. Because the virtual image plane is introduced, the depth information of each feature point is not necessary, which is an advantage over traditional visual odometry.
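The least-squares velocity solve can be sketched with the Moore-Penrose pseudo-inverse. Illustrative assumptions: the image-Jacobian structure below uses the common target height vz as the depth of every feature (which the level virtual plane permits), and the yaw-rate compensation sign follows the standard visual-servoing convention.

```python
import numpy as np

def relative_velocity(uv, uv_dot, vz, omega_z, f=213.0):
    """Least-squares estimate of the relative camera-to-target velocity on the
    virtual image plane.  uv: (N,2) virtual-plane features (relative to the
    principal point); uv_dot: measured optic flow (N,2); omega_z: gyro yaw rate.
    All features share the height vz, so no per-point depth is needed."""
    blocks, flow = [], []
    for (u, n), (du, dn) in zip(uv, uv_dot):
        # translational image Jacobian L_vk at the common depth vz
        blocks.append(np.array([[-f/vz, 0.0, u/vz],
                                [0.0, -f/vz, n/vz]]))
        # remove the yaw-rotation component of the flow (L_psi_k * omega_z)
        flow.extend([du - n*omega_z, dn + u*omega_z])
    L_v = np.vstack(blocks)                      # stacked (2N x 3) interaction matrix
    return np.linalg.pinv(L_v) @ np.array(flow)  # Moore-Penrose least-squares solve
```

With N ≥ 2 non-degenerate features, the stacked system is overdetermined and the pseudo-inverse returns the least-squares velocity, mirroring the text's observation that two points already suffice for the three unknowns.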

Gain-Switching PD Controller Design
The control objective in this work is to drive a quadrotor tracking an uncooperative target at an appropriate height for optimal observation. The quadrotor, as an underactuated system, has six degrees of freedom with only four control inputs. Therefore, the control of a quadrotor is typically divided into an outer position loop and an inner attitude loop. The outer loop provides the reference attitude signal to the inner loop, while the inner loop tracks the orientation reference of the vehicle. A block diagram of the UAV control loop structure is shown in Figure 4. The objective is equivalent to designing a control input for translational motion based on visual feedback, such that the controlled underactuated system with model uncertainties can guarantee ∆ξ → 0 and ∆v → 0, then designing a control input for rotational motion with the knowledge of reference Euler angles derived from the outer loop.
PD controllers are commonly used in UAV control. Given the uncertainties in the dynamics of the vehicle, a traditional PD controller can produce large steady-state errors or even affect the overall stability of the system. To improve the robustness of the PD controller, we introduce a gain-switching term to deal with the uncertainties by avoiding using large gains.


Control of the Translational Motion
The translational dynamics (1) can be rewritten, in matrix form, as

m ξ̈ = F g + u v + F d , (17)

where u v = −u 1 R E 3 denotes the thrust vector expressed in the inertial frame. The translational motion of the quadrotor is actually accomplished by both the thrust and the orientation of the body. Let us define u v = [u x , u y , u z ] T as a set of virtual control inputs for the translational motion. (18) From Equations (17) and (18), we have

m ∆v̇ = u v + F g + F d − m v̇ t . (19)

The control law for the translational motion is proposed as

u v = m̄ ( −K P ∆ξ − K D ∆v + K ε n(ε, s) ) − F̄ g , (20)

where quantities with overbar symbols indicate that a priori estimates are used, which may deviate from the true values; ∆ξ and ∆v are obtained by visual measurement, as detailed in Section 3; −F̄ g is the model compensation term designed to eliminate nonlinear elements of the dynamics; and K P , K D , and K ε represent the proportional, differential, and gain-switching coefficient matrices, respectively. The switching function n(ε, s) can be any piecewise-continuous function with the following properties,

s T n(ε, s) = −‖n(ε, s)‖ ‖s‖,  ‖n(ε, s)‖ ≥ 1 − ε/‖s‖, s ≠ 0.
In this paper, the switching function is chosen as

n(ε, s) = −s/‖s‖ for ‖s‖ ≥ ε,  n(ε, s) = −s/ε for ‖s‖ < ε,

where ε is the gain-switching threshold and s is the error feedback term given by Equation (23). Substituting Equation (20) into (19) yields the error dynamics for the translational motion (24), in which the disturbance function, denoted by h, can be written as

h = −K P δm ∆ξ − K D δm ∆v − δF g − m v̇ t + F d ,

where δm = m̄ − m and δF g = F̄ g − F g denote the deviations between the a priori estimates and the true values, and v̇ t is the target acceleration, which is assumed to be unknown but bounded.
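The switching function and the gain-switching PD law can be sketched as follows. The overall structure (proportional, derivative, and switching terms plus gravity compensation) follows the text; the specific error-feedback form s = K_P ∆ξ + K_D ∆v and all gain values are assumptions made for illustration.

```python
import numpy as np

def switching_fn(s, eps):
    """Switching function n(eps, s): -s/||s|| outside the threshold eps,
    -s/eps inside (continuous near the origin, avoiding chattering)."""
    return -s / max(eps, np.linalg.norm(s))

def translational_control(d_xi, d_v, K_P, K_D, K_eps, eps, m_bar=2.1, g=9.81):
    """Gain-switching PD force command (sketch): PD and robustifying terms
    scaled by the nominal mass, plus gravity compensation -F_g_bar.
    The error feedback s = K_P @ d_xi + K_D @ d_v is an assumed form."""
    s = K_P @ d_xi + K_D @ d_v
    F_g_bar = m_bar * g * np.array([0.0, 0.0, 1.0])   # a priori gravity (NED, down positive)
    u_pd = -K_P @ d_xi - K_D @ d_v                    # proportional-derivative part
    u_eps = K_eps @ switching_fn(s, eps)              # robustifying gain-switching part
    return m_bar * (u_pd + u_eps) - F_g_bar
```

Note that this `switching_fn` satisfies the stated properties: s·n = −‖n‖‖s‖, and ‖n‖ saturates at 1 once ‖s‖ exceeds the threshold, so the robustifying term never demands unbounded gain.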

Remark 1.
The disturbance function h represents all sources of uncertainty in the translational error dynamics, including the model uncertainty resulting from the inaccurately measured mass of the quadrotor, −K P δm ∆ξ and −K D δm ∆v; the gravitational error −δF g ; the unknown target motion −m v̇ t ; and the unstructured forces F d . Under Assumptions 2 and 3, the disturbance function h is bounded.
Now, the properties of the gain-switching coefficient will be addressed. The matrix K ε is symmetric and positive definite, and must satisfy conditions (26), where k ε is a positive bounding scalar designated to ensure that the term u ε provides greater acceleration than that resulting from the disturbances in h. The proportional and differential coefficient matrices K P and K D are selected to satisfy constraints (27) with the scalars k P and k D , which are feedback gains designated as functions of the desired rate of convergence α, as in Equation (28). These conditions guarantee that the norms of K P and K D are large enough that the commanded force specified by u P and u D delivers, at a minimum, the acceleration specified by the vector u ε .

Remark 2.
Compared with traditional PD control, the proposed PD control consists of not only the proportional and differential terms u P and u D , which can eliminate the position and velocity errors, but also a gain-switching term u ε , which acts as a robustifying term based on the switching function n(ε, s). By using Equation (26), appropriate robust control parameters can be selected to restrain the uncertainties h. In addition, the transient behavior of the system can be characterized analytically, based on the desired rate of convergence in feedback gains (28).
Referring to Equation (24) and defining the system state e t = [∆ξ T , ∆v T ] T , the tracking dynamics model for translational motion can be written as Equation (29).

Lemma 1. Consider a system ẋ(t) = f (t, x(t)). If there exist a continuously differentiable function V(x) and scalars c 1 , c 2 > 0 which satisfy [38]

(i) c 1 ‖x‖ 2 ≤ V(x) ≤ c 2 ‖x‖ 2 , and
(ii) V̇(x) ≤ −2αV(x) + E for some α > 0 and E ≥ 0,

then the system is (globally and uniformly) exponentially convergent, with rate α, to the ball B(q) = {x : V(x) ≤ q}, where q = E/(2α).

Theorem 1. Consider the closed-loop system (29) with the controller designed by Equation (20). If the corresponding parameters are assigned as in Equations (26)-(28), then the state e t will exponentially converge to zero.

Proof. The Lyapunov function candidate is designated as

V(e t ) = e t T A e t ,

where

A = [ k L I  k P I ; k P I  k D I ]

and I is the identity matrix. The constant parameter k L is also a function of α, given in Equation (32). By using the values of k P , k D , and k L in Equations (28) and (32), the matrix A is positive definite. Thus, V(e t ) is a positive definite function satisfying

λ min ‖e t ‖ 2 ≤ V(e t ) ≤ λ max ‖e t ‖ 2 ,

and Condition (i) in Lemma 1 is satisfied with c 1 = λ min and c 2 = λ max , where λ min and λ max are the smallest and largest eigenvalues of A.
Taking the time derivative of V(e t ) and substituting Equation (29) into it yields Equation (38), which involves the switching term n(ε, s). When ‖s‖ ≥ ε, n = −s/‖s‖, and when ‖s‖ < ε, n = −s/ε; the switching term can thus be bounded in both cases (Equations (39) and (40)). Combining Equations (39) and (40), we can obtain a global upper bound (41). Substituting Equation (41) into the time derivative of V(e t ) in Equation (38) leads to

V̇(e t ) ≤ −2α V(e t ) + Ē, (42)

in which V * = Ē/(2α).
Inequality (42) ensures that the state of the system e t can exponentially converge to a small ball around the origin defined by V(e t ) < V * . Thus, the condition (ii) of Lemma 1 is satisfied.
It is worth noting that the control law given by Equation (20) yields virtual control inputs for the translational motion. Using Equation (18), we can compute the command thrust u 1 , as well as the desired roll and pitch angles ϕ d and θ d for the attitude controller, given a reference yaw value ψ d . As the yaw angle is independent of the outer loop, we can prescribe it as its initial value or another constant.
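Inverting the virtual-input relation for the command thrust and reference angles can be sketched as below. The formulas assume the virtual input is the inertial-frame thrust vector u_v = −u1·R·E3 under NED, ZYX-Euler conventions; treat them as an illustrative inversion, not the paper's equation.

```python
import numpy as np

def thrust_and_attitude(u_v, psi_d):
    """Invert u_v = -u1 * R * E3 for the command thrust u1 and the
    reference roll/pitch angles (NED, ZYX-Euler conventions assumed)."""
    ux, uy, uz = u_v
    u1 = np.linalg.norm(u_v)                          # command thrust magnitude
    cps, sps = np.cos(psi_d), np.sin(psi_d)
    phi_d = np.arcsin((uy*cps - ux*sps) / u1)         # reference roll
    theta_d = np.arctan2(-(ux*cps + uy*sps), -uz)     # reference pitch
    return u1, phi_d, theta_d
```

At hover, where the virtual input reduces to pure gravity compensation, the inversion returns u1 = m g and zero roll/pitch references, as expected.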

Control of the Rotational Motion
In the inner attitude loop, a similar gain-switching PD control law (44) is presented, where ∆Φ = Φ − Φ d and ∆ω = ω − ω d denote the quadrotor attitude error and angular rate error, respectively, and Φ d = [ϕ d , θ d , ψ d ] T are the desired Euler angles output from the outer loop. The desired angular velocity ω d is taken to be zero. The switching function n′(ε′, s′) is chosen in the same way as before, and the error feedback term s′ is given by Equation (45). Referring to Equation (2) and defining the system state e r = [∆Φ T , ∆ω T ] T , the tracking dynamics model for rotational motion, under Assumption 1, can be written as Equation (46), where the disturbance function h′ collects the corresponding model uncertainties and external torques.

Theorem 2. Consider the closed-loop system for the rotational motion in Equation (46) with the controller designed by Equation (44). If the corresponding parameters are assigned by the rotational counterparts of Equations (26)-(28), then the state e r will exponentially converge to zero.

Proof. The stability analysis of the system for rotational motion is similar to that for translational motion, so it is not described in detail here.

Simulation Results
In this section, MATLAB simulations are presented to validate the performance of the proposed vision-based control scheme. We considered two scenarios in this work: hovering above a stationary target and tracking a moving target. In both scenarios, the target to be observed was assumed to be uncooperative and without detailed geometric information. The physical parameters of the simulated quadrotor were m = 2.1 kg and J = diag{0.0096, 0.0098, 0.016} kg·m 2 /rad. The nominal parts of the quadrotor's mass and moment of inertia were m̄ = 2.1 kg and J̄ = diag{0.0081, 0.0081, 0.0142} kg·m 2 /rad. The focal length divided by the pixel size of the camera was set as 213, and the image resolution was 160 × 120 pixels, with the principal point located at [80, 60] T . Considering that the quadrotor may be affected by wind disturbances in the environment, we applied sinusoidal (cosinusoidal) disturbance forces and torques to the vehicle with the following values:

Scenario 1: Hovering Observation
In some cases, such as traffic and fire accidents, we usually drive a UAV equipped with optical sensors to approach the scene of the accident and hover over the damaged vehicle or burning object. To provide distinct images and effective air support for the subsequent rescue, it is required to adjust the height of the UAV, keeping the designated target within the optimal area of the camera's field of view. The proposed vision-based control scheme in this work is applicable to the above missions.
In the simulation, the target was set as a rectangular object. Its visual features were its four vertices, with the following coordinates: For the optimal observation of the target, we defined a reference circle of radius 40 pixels on the virtual image plane, and the radius scaling factor σ was designated to be 1. The corresponding desired height was 1.7048 m, which can be computed using Equation (12). The control gains used in the simulation are listed in Table 1.

The transient response of the system mainly depends on the feedback gains k P , k D and k′ P , k′ D , which are determined by the constraint Equations (28) and (50). Based on trial and error, the desired rates of convergence for the translational and rotational motion were selected as √2/2 and 2, respectively. Then, the proportional and differential matrices K P , K D and K′ P , K′ D could be determined by using Equations (27) and (48). To guarantee the robustness of the system in the presence of wind and modelling errors, the control gains in the gain-switching terms, k ε , K ε and k′ ε , K′ ε , must be selected to be larger than the effect resulting from the unstructured disturbances h and h′, based on Equations (26) and (49). The gain-switching thresholds ε and ε′ were tuned repeatedly to restrain the chattering phenomenon of the control efforts. The simulation results are illustrated in Figures 5-7. The translational motion of the quadrotor is shown in Figure 5. The horizontal position errors converged to less than 0.05 m, with a transient response time of about 5 s. The error in the vertical direction was relatively large, about 0.12 m, mainly resulting from the uncertainty of the quadrotor mass. Figure 6 describes the rotational motion of the quadrotor. The Euler angles were kept within a small range, satisfying Assumption 1. The properties of the target centroid in image space are illustrated in Figure 7. It is shown that the vehicle could rapidly and smoothly approach the target, and the control accuracy of the target centroid on the image plane was kept within 5 pixels. The proposed system, therefore, is satisfactory for continuous hovering observation of the target. For comparison, we ran a simulation of traditional PD control using the same initial conditions and gains.
The results are reported in Figures 8 and 9. They indicate that the proposed gain-switching PD controller outperformed the traditional PD controller in steady-state performance, with lower oscillation and a smaller error bound of the target centroid in image space. Further, we placed the quadrotor at two different initial positions to compare the steady-state behavior of the gain-switching PD controller and the traditional PD controller. The mean, amplitude, and accuracy (3σ) of the position error are listed in Table 2. The results show that the proposed controller was effective under different initial conditions. Compared with the traditional PD controller, the performance of the proposed controller was greatly improved, with the position-control accuracy increasing by a factor of 4-5.
To evaluate the imaging effects of the target and the corresponding flying height of the quadrotor under different observation requirements, we ran another simulation ignoring the external disturbance and model uncertainties. In the simulation, three criteria were set: flying at a constant height (i.e., ∆z = 0), radius scaling factor σ = 1, and σ = 0.8. As plotted in Figure 10, the results showed that introducing a reference circle on the virtual image plane and adjusting the radius scaling factor can regulate the imaging effect, according to different observation requirements. To illustrate that the proposed strategy is applicable to different kinds of targets, we ran another simulation to hover above an irregularly shaped target. The Cartesian coordinates of the feature points that enclose the target were:

Scenario 2: Tracking a Moving Target
We also considered more challenging cases, such as when the target vehicle is moving or when a fire area changes in real time. In such cases, not only must the UAV be positioned over the target, but its (unknown) trajectory must also be followed. Compared with the traditional PD control, the robust controller proposed in this paper was more competent in this challenging mission. In this simulation, the target had the same geometric characteristics as in Scenario 1 but followed three different trajectories on the flat ground: circular movement, S-type movement, and linear movement. The quadrotor initial position was set to [1, 1.5, −3] m. To evaluate the performance of the proposed controllers in a more realistic environment, we added noise to the measurement information. White noise with covariances of 0.5 and 10 −4 was added to the visual data (image features and their pixel velocities) and angular rates, respectively.
The tracking performance for the circular, S-type, and linear movements is illustrated in Figures 12-17. The trajectories of the quadrotor and the target in a 3D environment are plotted in Figures 12, 14 and 16, which show satisfactory tracking performance in spite of the presence of disturbance and noise. Meanwhile, Figures 13, 15 and 17 depict the position tracking errors of the two controllers for the different target movements. The results indicate that the proposed control strategy applies to different target maneuvers, and shows better robustness and higher tracking accuracy than the traditional PD controller. The only exception was that some fluctuations existed during the linear tracking (Figures 16 and 17), because of the vehicle's reaction time when the target suddenly changed its direction.

Conclusions
In this paper, we have developed a vision-based control scheme for a quadrotor to track a target in the absence of location information and geometric features of the target. After transforming the image features to a virtual camera frame, optical metrology is exploited to estimate the relative distance and velocity. At the same time, the height of the quadrotor and the image size can be adjusted by regulating the optimal observation area and the radius scaling factor. Considering the presence of external interference and model uncertainties, we presented a gain-switching proportional-derivative (PD) control strategy to improve the robustness of the system. Two case studies, corresponding to hovering and tracking scenarios, are presented in this work. The simulation results indicated that the proposed vision-based scheme performed better in both hovering and tracking missions, compared with the traditional PD control.
In future work, we are going to add a field-of-view constraint to the system, as the proposed algorithm cannot guarantee that all visual features are always kept inside the field of view of the camera. We also plan to implement the proposed control scheme on a real quadrotor.
Author Contributions: S.Z. provided the main idea and supervised the whole process. X.Z. performed the simulation and finished the draft manuscript. B.Z. reviewed and edited the manuscript. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.