2.1. Problem Statement and Method Overview
UAV formations can estimate the relative positions between UAVs by exchanging GNSS global position data or local positioning data provided by INS or vision systems. However, in complex flight environments, GNSS positioning data are susceptible to interference, leading to significant positioning inaccuracies. While an INS can maintain normal operation in these environments, it is prone to drift over time, making it less suitable for extended-range missions. Vision-based relative positioning techniques are constrained by their detection range and may encounter difficulties in large-scale flight operations. To address the challenge of relative positioning between UAVs in formation during long-duration missions in GNSS-denied environments, this paper proposes a novel relative pose estimation method that integrates inertial navigation and datalink measurements. The overall framework of the proposed approach is illustrated in
Figure 1. Each UAV in the formation is equipped with a formation positioning system consisting of four key modules: the inertial navigation module, the data link measurement processing module, the time alignment module, and the relative positioning module.
Each UAV is equipped with an IMU sensor to capture acceleration and angular velocity data during flight. An inertial navigation integration algorithm is used to estimate its position and attitude in real time, and the estimated flight state is then transmitted to the data link module for further processing.
The integrated data link system on each UAV within the formation facilitates bidirectional communication, enabling it to transmit its own flight status to teammates while receiving theirs. During communication, the data link can measure the relative distance, azimuth angle, and elevation angle of neighboring UAVs. To enhance accuracy, the data link measurement processing module filters out erroneous measurements and computes the relative positions of teammates based on the refined data.
The timestamps of sensor measurements from the IMU and data link are inherently asynchronous due to differences in system clocks across UAVs and variations in measurement frequencies between the IMU and data link. To address this issue, a decentralized time calibration method based on the peer delay mechanism in the Precision Time Protocol (PTP) is first employed to estimate the time offset. The data preprocessing module then aligns the timestamps by using the data link measurement time as a reference. Spherical linear interpolation is applied to synchronize the states of the two communicating UAVs to the reference timestamp. Additionally, the geometric characteristics of UAV states within the continuous time series are analyzed, and flight states with significant geometric variations are selected for optimization.
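The peer-delay-style offset estimation can be illustrated with a minimal sketch; the timestamp layout and helper names below are assumptions for illustration rather than the paper's implementation:

```cpp
#include <cstdint>

// Timestamps (in nanoseconds) captured during one two-way exchange:
// t1: request transmitted by UAV A, t2: request received by UAV B,
// t3: response transmitted by UAV B, t4: response received by UAV A.
struct TwoWayExchange {
    int64_t t1, t2, t3, t4;
};

// Under the symmetric-link assumption of the peer delay mechanism,
// the one-way path delay and the offset of B's clock relative to A's are:
//   delay  = ((t4 - t1) - (t3 - t2)) / 2
//   offset = ((t2 - t1) - (t4 - t3)) / 2
inline int64_t PathDelayNs(const TwoWayExchange& e) {
    return ((e.t4 - e.t1) - (e.t3 - e.t2)) / 2;
}

inline int64_t ClockOffsetNs(const TwoWayExchange& e) {
    return ((e.t2 - e.t1) - (e.t4 - e.t3)) / 2;
}
```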
The relative positioning module processes data received from the data preprocessing module. Using the flight states of the UAV as vertices, it integrates the relative pose measurements obtained from the data link with the UAV state estimates to construct relative position error edges between UAVs. A relative pose factor graph, composed of these vertices and edges, is then established within a sliding window. The graph optimization method is employed to iteratively refine the relative pose estimates in real time, enhancing the accuracy of UAV positioning within the formation.
2.4. Inertial Navigation Method
The inertial integration algorithm employs a differential method to compute the UAV's position, attitude, and velocity at each measurement moment based on the acceleration and angular velocity measured by the IMU [34]. For instance, the computation of the flight state parameters of a UAV at timestamp $k$ proceeds as follows.

The transformation matrix from the body coordinate system to the navigation coordinate system at the current time instant, computed using the angular velocity measured by the gyroscope, can be represented by Equation (3):

$$C_{b_k}^{n_k} = C_{n_{k-1}}^{n_k}\, C_{b_{k-1}}^{n_{k-1}}\, C_{b_k}^{b_{k-1}} \tag{3}$$
In Equation (3), $C_{b_k}^{n_k}$ represents the attitude transformation matrix from the body coordinate system to the navigation coordinate system at timestamp $k$; $C_{n_{k-1}}^{n_k}$ represents the transformation matrix from the navigation coordinate system at timestamp $k-1$ to the navigation coordinate system at timestamp $k$; $C_{b_{k-1}}^{n_{k-1}}$ represents the transformation matrix from the body coordinate system to the navigation coordinate system at timestamp $k-1$; and $C_{b_k}^{b_{k-1}}$ represents the transformation matrix from the body coordinate system at timestamp $k$ to the body coordinate system at timestamp $k-1$. The matrices $C_{b_k}^{b_{k-1}}$ and $C_{n_{k-1}}^{n_k}$ are calculated using the two-interval coning algorithm as follows:

$$\phi_k = \Delta\theta_1 + \Delta\theta_2 + \tfrac{2}{3}\,\Delta\theta_1 \times \Delta\theta_2 \tag{4}$$

$$C_{b_k}^{b_{k-1}} = \mathcal{R}\!\left(\phi_k\right) \tag{5}$$

$$C_{n_{k-1}}^{n_k} = \mathcal{R}\!\left(-\,\omega_{in}^{n}\,T\right) \tag{6}$$

where $\Delta\theta_1$ and $\Delta\theta_2$ are the measurements of angular increments sampled at equal time intervals of $T/2$ within the update interval $T$, $\omega_{in}^{n}$ represents the angular velocity of rotation of the navigation frame relative to the inertial frame, and $\mathcal{R}(\cdot)$ denotes the conversion between the angle–axis representation and the rotation matrix.
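To make the update concrete, the following is a minimal sketch of the attitude step in Equations (3)–(6), using Eigen; the function names are illustrative and the rotation-vector conversion stands in for $\mathcal{R}(\cdot)$, so this is a reading aid rather than the paper's implementation:

```cpp
#include <Eigen/Dense>
#include <Eigen/Geometry>

// Angle-axis (rotation vector) to rotation matrix, i.e., R(phi).
Eigen::Matrix3d RotVecToMatrix(const Eigen::Vector3d& phi) {
    const double angle = phi.norm();
    if (angle < 1e-12) return Eigen::Matrix3d::Identity();
    return Eigen::AngleAxisd(angle, phi / angle).toRotationMatrix();
}

// One attitude update: C_b^n(k) = C_{n(k-1)}^{n(k)} * C_b^n(k-1) * C_{b(k)}^{b(k-1)}.
// dtheta1, dtheta2: gyro angular increments over two equal sub-intervals (T/2 each).
// omega_in_n: rotation rate of the navigation frame w.r.t. the inertial frame.
Eigen::Matrix3d AttitudeUpdate(const Eigen::Matrix3d& C_bn_prev,
                               const Eigen::Vector3d& dtheta1,
                               const Eigen::Vector3d& dtheta2,
                               const Eigen::Vector3d& omega_in_n,
                               double T) {
    // Two-interval coning compensation for the body rotation vector, Eq. (4).
    const Eigen::Vector3d phi_b =
        dtheta1 + dtheta2 + (2.0 / 3.0) * dtheta1.cross(dtheta2);
    const Eigen::Matrix3d C_bk_bprev = RotVecToMatrix(phi_b);        // Eq. (5)
    const Eigen::Matrix3d C_nprev_nk = RotVecToMatrix(-omega_in_n * T);  // Eq. (6)
    return C_nprev_nk * C_bn_prev * C_bk_bprev;                      // Eq. (3)
}
```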
The accelerometer measures acceleration within the UAV's body coordinate system. By applying the transformation matrix $C_{b_k}^{n_k}$, the measured acceleration is converted into the navigation coordinate system, as expressed in Equation (7). This transformation enables velocity computation within the navigation coordinate system:

$$a_k^{n} = C_{b_k}^{n_k}\, a_k^{b} \tag{7}$$

In Equation (7), $a_k^{n}$ represents the UAV's acceleration in the navigation coordinate system at timestamp $k$, while $a_k^{b}$ denotes the acceleration in the body coordinate system at the same time. Using $a_k^{n}$ to calculate the velocity change over the measurement interval, the velocity $v_k^{n}$ can be calculated as follows:

$$v_k^{n} = v_{k-1}^{n} + C_{b_k}^{n_k}\,\Delta v_{sf,k} + \delta v_{k-1}^{n}$$

$$\Delta v_{sf,k} = \Delta v_k + \Delta v_{rot,k} + \Delta v_{scul,k}, \qquad
\Delta v_{rot,k} = \tfrac{1}{2}\,\Delta\theta_k \times \Delta v_k, \qquad
\Delta v_{scul,k} = \tfrac{1}{12}\left(\Delta\theta_{k-1} \times \Delta v_k + \Delta v_{k-1} \times \Delta\theta_k\right)$$

$$\delta v_{k-1}^{n} = \left[\,g_{k-1}^{n} - \left(2\,\omega_{ie,k}^{n} + \omega_{en,k}^{n}\right) \times v_{k-1}^{n}\,\right] T$$

where $v_k^{n}$ and $v_{k-1}^{n}$ represent the velocities of the UAV in the navigation coordinate system at timestamps $k$ and $k-1$, respectively; $\Delta v_k$ and $\Delta v_{k-1}$ are the accelerometer-measured velocity increments at the corresponding timestamps; $\Delta v_{sf,k}$ denotes the velocity increment with errors, where $\Delta v_{scul,k}$ comprises the sculling error and $\Delta v_{rot,k}$ is the rotation error; and $\delta v_{k-1}^{n}$ signifies the velocity error from the previous timestamp. $C_{b_k}^{n_k}$ is the pose transformation matrix from the body coordinate system to the navigation coordinate system at timestamp $k$. $\omega_{ie,k}^{n}$ represents the rotation angular velocity of the Earth in the navigation frame at timestamp $k$, while $\omega_{en,k}^{n}$ denotes the rotation angular velocity of the UAV around the Earth in the navigation coordinate system at timestamp $k$. $g_{k-1}^{n}$ represents the gravitational acceleration in the navigation coordinate system at timestamp $k-1$, $T$ is the time between timestamp $k-1$ and timestamp $k$, and $\Delta\theta_{k-1}$ and $\Delta\theta_k$ are the angle changes corresponding to the rotation matrix at moments $k-1$ and $k$, respectively.
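Under the same assumptions (Eigen types, illustrative names, and the standard two-sample compensation terms used in the reconstruction above), the velocity step might be sketched as follows:

```cpp
#include <Eigen/Dense>

// One velocity update step in the navigation frame.
// dv_k, dv_prev: accelerometer velocity increments at steps k and k-1.
// dtheta_k, dtheta_prev: gyro angular increments at steps k and k-1.
Eigen::Vector3d VelocityUpdate(const Eigen::Vector3d& v_prev,      // v^n(k-1)
                               const Eigen::Matrix3d& C_bn_k,      // C_b^n(k)
                               const Eigen::Vector3d& dv_k,
                               const Eigen::Vector3d& dv_prev,
                               const Eigen::Vector3d& dtheta_k,
                               const Eigen::Vector3d& dtheta_prev,
                               const Eigen::Vector3d& omega_ie_n,  // Earth rate, n-frame
                               const Eigen::Vector3d& omega_en_n,  // transport rate, n-frame
                               const Eigen::Vector3d& g_n,         // gravity, n-frame
                               double T) {
    // Rotation compensation of the specific-force increment.
    const Eigen::Vector3d dv_rot = 0.5 * dtheta_k.cross(dv_k);
    // Two-sample sculling compensation.
    const Eigen::Vector3d dv_scul =
        (dtheta_prev.cross(dv_k) + dv_prev.cross(dtheta_k)) / 12.0;
    // Specific-force velocity increment resolved in the navigation frame.
    const Eigen::Vector3d dv_sf_n = C_bn_k * (dv_k + dv_rot + dv_scul);
    // Coriolis and gravity correction over the interval.
    const Eigen::Vector3d dv_cor_g =
        (g_n - (2.0 * omega_ie_n + omega_en_n).cross(v_prev)) * T;
    return v_prev + dv_sf_n + dv_cor_g;
}
```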
The trapezoidal integration method is utilized to update the UAV's position within the navigation coordinate system, as expressed by the following equation:

$$p_k = p_{k-1} + M_{pv}\,\frac{v_{k-1}^{n} + v_k^{n}}{2}\,T$$

where $p_{k-1}$ and $p_k$ represent the UAV's position in the navigation coordinate system at times $k-1$ and $k$, respectively. The velocity components $v_{k-1}^{n}$ and $v_k^{n}$ correspond to the UAV's velocities in the navigation frame at these time instances, while $T$ denotes the time interval between time steps $k-1$ and $k$. The transformation matrix $M_{pv}$ maps the velocity from the navigation coordinate system to the geographic coordinate system at time $k-1$ and is defined as follows:

$$M_{pv} = \begin{bmatrix} \dfrac{1}{R_M + h} & 0 & 0 \\[2mm] 0 & \dfrac{1}{\left(R_N + h\right)\cos L} & 0 \\[2mm] 0 & 0 & 1 \end{bmatrix}$$

where $R_M$ and $R_N$ denote the meridional and prime vertical radii of curvature at the UAV's location, respectively, and the parameters $L$, $\lambda$, and $h$ correspond to the latitude, longitude, and altitude of the UAV, so that $p = \left(L, \lambda, h\right)$ with the velocity ordered as $\left(v_N, v_E, v_U\right)$. The matrix accounts for the variations in Earth's curvature, ensuring accurate position updates in the navigation frame.
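A corresponding sketch of the trapezoidal position step, assuming WGS-84 ellipsoid parameters and the $(L, \lambda, h)$ / $(v_N, v_E, v_U)$ ordering used above:

```cpp
#include <Eigen/Dense>
#include <cmath>

// WGS-84 constants (assumed here for illustration).
constexpr double kSemiMajor = 6378137.0;
constexpr double kEcc2 = 6.69437999014e-3;  // first eccentricity squared

// Curvature matrix M_pv mapping (v_N, v_E, v_U) to rates of (L, lambda, h).
Eigen::Matrix3d CurvatureMatrix(double L, double h) {
    const double s = std::sin(L), w = std::sqrt(1.0 - kEcc2 * s * s);
    const double RM = kSemiMajor * (1.0 - kEcc2) / (w * w * w);  // meridional radius
    const double RN = kSemiMajor / w;                            // prime vertical radius
    Eigen::Matrix3d M = Eigen::Matrix3d::Zero();
    M(0, 0) = 1.0 / (RM + h);
    M(1, 1) = 1.0 / ((RN + h) * std::cos(L));
    M(2, 2) = 1.0;
    return M;
}

// Trapezoidal update of p = (L, lambda, h) from velocities at k-1 and k.
Eigen::Vector3d PositionUpdate(const Eigen::Vector3d& p_prev,
                               const Eigen::Vector3d& v_prev,  // (v_N, v_E, v_U) at k-1
                               const Eigen::Vector3d& v_k,     // (v_N, v_E, v_U) at k
                               double T) {
    const Eigen::Matrix3d M = CurvatureMatrix(p_prev(0), p_prev(2));
    return p_prev + M * (0.5 * (v_prev + v_k)) * T;
}
```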
By solving the inertial navigation equations, the transformation matrix $C_{b_k}^{n_k}$, position $p_k$, and velocity $v_k^{n}$ of the UAV are integrated into the state parameters $s_k = \left(q_k, p_k, v_k^{n}\right)$, where $q_k$ represents the quaternion representation of $C_{b_k}^{n_k}$. These state parameters are then transmitted to the other teammates within the formation via the data link.
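For illustration, the state parameters broadcast over the data link could be packaged in a simple structure such as the following (field names are hypothetical):

```cpp
#include <Eigen/Dense>
#include <Eigen/Geometry>
#include <cstdint>

// Flight state broadcast to teammates over the data link.
struct FlightState {
    int64_t timestamp_ns;   // local time of the INS solution
    Eigen::Quaterniond q;   // attitude (body -> navigation frame)
    Eigen::Vector3d p;      // position (latitude, longitude, height)
    Eigen::Vector3d v;      // velocity in the navigation frame
};
```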
2.5. Data Link Ranging Anomaly Detection and Processing
During formation flight, the dynamic repositioning of UAVs can cause the data link communication system to switch the active receiving sensor, potentially introducing anomalies in distance measurements. As the mission environment becomes more complex, external obstacles (e.g., buildings, foliage, or airborne objects) may block or reflect the communication signal path, leading to an overestimation of the measured distances. These factors result in erratic ranging information, as demonstrated experimentally: for instance, placing an iPad (as a signal-blocking object) directly in front of the data link receiver caused an overestimation of approximately 1 m in the measured distance. Such inconsistencies necessitate robust anomaly detection and filtering mechanisms to obtain reliable ranging data.
Given a new data link measurement $d_k$, the predicted distance $\hat{d}_k$ at time $k$ can be computed based on the UAV's own state $s_k^{a}$, the teammate's flight state $s_k^{b}$, and the relative pose transformation matrix $T_{k-1}^{ab}$ from the previous time step. To facilitate this computation, the state parameters of both UAVs are utilized to determine the transformation matrices $C_{n_a}^{e}$ and $C_{n_b}^{e}$, which map each UAV's navigation coordinate system to the ECEF coordinate system. Additionally, the attitude transformation matrices $C_{b_a}^{n_a}$ and $C_{b_b}^{n_b}$, which convert each UAV's body coordinate system to its respective navigation coordinate system at time $k-1$, are obtained. These transformations enable the projection of the UAV state parameters from time $k$ to time $k-1$, resulting in the transformed positions $\tilde{p}_{k-1}^{a}$ and $\tilde{p}_{k-1}^{b}$, as follows:

$$\tilde{p}_{k-1}^{a} = \left(C_{b_a}^{n_a}\right)^{-1}\left(C_{n_a}^{e}\right)^{-1} C_{n_a,k}^{e}\; p_{k}^{a}, \qquad
\tilde{p}_{k-1}^{b} = \left(C_{b_b}^{n_b}\right)^{-1}\left(C_{n_b}^{e}\right)^{-1} C_{n_b,k}^{e}\; p_{k}^{b}$$

where $p_k^{a}$ and $p_k^{b}$ represent the UAV positions in their respective navigation coordinate systems at time $k$, and $C_{n_a,k}^{e}$ and $C_{n_b,k}^{e}$ denote the navigation-to-ECEF transformations evaluated at time $k$. The transformation matrices sequentially map these positions from the navigation frame to the ECEF frame and then back to the body coordinate system at time $k-1$, ensuring temporal alignment for further computations.
Using the relative pose transformation $T_{k-1}^{ab}$, which represents the estimated transformation between the body coordinate systems of UAVs $a$ and $b$ at time $k-1$, the position of UAV $b$, denoted as $\tilde{p}_{k-1}^{b}$, is mapped into the body coordinate system of UAV $a$ at the same time step. This transformation yields the position $\tilde{p}_{k-1}^{ab}$, ensuring that the state parameters of both UAVs are represented within a common spatial coordinate system:

$$\tilde{p}_{k-1}^{ab} = T_{k-1}^{ab}\; \tilde{p}_{k-1}^{b}$$

By computing the distance between UAVs $a$ and $b$ within this unified coordinate system, the predicted value $\hat{d}_k$ of the data link measurement can be obtained:

$$\hat{d}_k = \left\lVert \tilde{p}_{k-1}^{ab} - \tilde{p}_{k-1}^{a} \right\rVert_{2}$$
The ranging error $e_{d,k}$ at time $k$ is then computed as the difference between the predicted distance $\hat{d}_k$ and the actual distance $d_k$ measured by the data link:

$$e_{d,k} = \hat{d}_k - d_k$$
Considering potential inaccuracies in the optimized transformation $T_{k-1}^{ab}$, the computed error $e_{d,k}$ is compared against the corresponding error $e_{d,k-1}$ at time $k-1$. If the discrepancy exceeds a predefined threshold relative to the ranging error at time $k-1$, it can be inferred that the data link measurement has been affected by interference. In practical implementation, this threshold is set to 1.2 times the ranging error at time $k-1$. When the system detects anomalous interference signals in the data link measurements, the affected data are excluded to prevent state estimation errors. In such scenarios, the system employs a constant velocity trajectory assumption to predict the relative distance between UAVs. Specifically, the predicted distance is calculated based on the last valid measurement and the time-averaged velocity vector, which then serves as a dynamic reference baseline for detecting anomalies in subsequent data link ranging results.
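A minimal sketch of this detection logic, using the 1.2-times threshold from the text; the function names and the exact bookkeeping of the last valid measurement are assumptions:

```cpp
#include <Eigen/Dense>
#include <cmath>

// Decide whether the current data link range is anomalous by comparing the
// current ranging error against 1.2x the ranging error at the previous step.
bool IsRangeAnomalous(double predicted_dist_k, double measured_dist_k,
                      double ranging_error_prev) {
    const double error_k = std::abs(predicted_dist_k - measured_dist_k);
    return error_k > 1.2 * std::abs(ranging_error_prev);
}

// Constant-velocity fallback: predict the current range from the last valid
// relative position and the time-averaged relative velocity vector.
double PredictRangeConstantVelocity(const Eigen::Vector3d& last_valid_rel_pos,
                                    const Eigen::Vector3d& avg_rel_velocity,
                                    double dt_since_last_valid) {
    return (last_valid_rel_pos + avg_rel_velocity * dt_since_last_valid).norm();
}
```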
2.7. Time Alignment
The IMU on the UAV operates at a frequency of 100 Hz, while the data link communication frequency is 30 Hz. Additionally, the IMU sensors on different UAVs within the formation have asynchronous reference times, complicating the synchronization of data from these sensors. To address this issue, a time alignment method based on the data link signal timestamp is employed, as illustrated in
Figure 2.
In
Figure 2, the square nodes represent the flight states estimated by the inertial navigation integration algorithm, as described in
Section 2.4. The circular nodes represent the data link measurements between the two UAVs. The green dashed lines indicate the measurements between the data link equipment pairs, as discussed in
Section 2.5, and the dashed box represents the time alignment window, which includes a data link measurement and the two nearest flight states. By interpolating the states of both UAVs within this time window, the state parameters of each UAV, aligned with the data link measurement moment, can be calculated, as shown by the triangular nodes. Specifically, the UAV state parameters include both position and orientation. While position data exist in a linear space and can be effectively interpolated using direct linear interpolation, orientation data, represented as rotations, are not continuous in linear space. Instead, they exist on the unit quaternion manifold, where direct linear interpolation can introduce significant errors, particularly for large rotations. Therefore, we employ SLERP, which provides smooth, constant-speed interpolation along the shortest path on the quaternion manifold, ensuring the accurate alignment of orientation data, as described in Equations (24)–(26):

$$\lambda = \frac{t_s - t_i}{t_{i+1} - t_i} \tag{24}$$

$$p_{t_s} = (1-\lambda)\,p_{t_i} + \lambda\,p_{t_{i+1}}, \qquad v_{t_s} = (1-\lambda)\,v_{t_i} + \lambda\,v_{t_{i+1}} \tag{25}$$

$$q_{t_s} = q_{t_i} \otimes \left(q_{t_i}^{-1} \otimes q_{t_{i+1}}\right)^{\lambda} \tag{26}$$

In the interpolation process, $t_s$ represents the time point to be interpolated, while $p_{t_s}$, $v_{t_s}$, and $q_{t_s}$ denote the corresponding UAV state parameters after time alignment. $\lambda$ represents the interpolation scale factor, while $\otimes$ denotes quaternion multiplication. Additionally, $t_i$ and $t_{i+1}$ are the timestamps of the inertial navigation estimates immediately before and after the moment $t_s$, respectively. Similarly, $p_{t_i}$, $v_{t_i}$, $q_{t_i}$, $p_{t_{i+1}}$, $v_{t_{i+1}}$, and $q_{t_{i+1}}$ represent the UAV's state parameters at the timestamps $t_i$ and $t_{i+1}$.
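Using Eigen's built-in quaternion slerp, Equations (24)–(26) can be sketched as follows (state layout and names are illustrative):

```cpp
#include <Eigen/Dense>
#include <Eigen/Geometry>

struct AlignedState {
    Eigen::Vector3d p, v;
    Eigen::Quaterniond q;
};

// Interpolate the UAV state to the data link measurement time ts,
// given the bracketing INS states at ti and ti1 (ti <= ts <= ti1).
AlignedState InterpolateState(double ts, double ti, double ti1,
                              const AlignedState& s_i, const AlignedState& s_i1) {
    const double lambda = (ts - ti) / (ti1 - ti);  // Equation (24)
    AlignedState out;
    // Linear interpolation in the (linear) position/velocity space, Equation (25).
    out.p = (1.0 - lambda) * s_i.p + lambda * s_i1.p;
    out.v = (1.0 - lambda) * s_i.v + lambda * s_i1.v;
    // SLERP along the shortest arc on the unit quaternion manifold, Equation (26).
    out.q = s_i.q.slerp(lambda, s_i1.q);
    return out;
}
```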
Consider UAVs $a$ and $b$ flying in formation as an example. The two UAVs exchange their flight state parameters via data link communication. To ensure consistency, each UAV performs time synchronization between its own state parameters and the received flight state parameters of its teammate. Subsequently, the aligned flight state parameters and the data link measurement are integrated into a data frame $F$.
To ensure both the diversity of the optimization data and the real-time performance of relative pose estimation, the time-aligned data are preselected before the optimization process. The selection of optimization data is based on the diversity of geometric features in the flight trajectories. Specifically, the state parameters after the current alignment are compared with those from the previous alignment. If all parameters of the data frame, except for the position parameters, remain unchanged, it is determined that the UAVs in the formation are flying in parallel along a straight-line trajectory. In such cases, these data are considered redundant and excluded from subsequent optimization, as sketched below.
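A possible form of this redundancy check follows; the numerical tolerance and the choice of attitude and velocity as the compared non-position parameters are assumptions consistent with the description above:

```cpp
#include <Eigen/Dense>
#include <Eigen/Geometry>

// Returns true if the new aligned frame adds no geometric diversity, i.e.,
// all state parameters except position are (numerically) unchanged, which
// indicates straight-line parallel flight.
bool IsRedundantFrame(const Eigen::Quaterniond& q_prev, const Eigen::Vector3d& v_prev,
                      const Eigen::Quaterniond& q_curr, const Eigen::Vector3d& v_curr,
                      double eps = 1e-6) {
    const double dq = q_prev.angularDistance(q_curr);  // attitude change (rad)
    const double dv = (v_curr - v_prev).norm();        // velocity change
    return dq < eps && dv < eps;
}
```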
2.8. Relative Pose Optimization
Following data alignment and selection using the methods outlined in Sections 2.5 and 2.7, the aligned data frames are utilized to construct the relative pose optimization factor graph model, as illustrated in Figure 3. Factor graph optimization is a powerful approach for state estimation that represents the relationships between various measurements and system states as a graph. In this framework, the nodes represent the states of the UAVs, including their positions and orientations, while the edges (factors) represent the constraints provided by sensor measurements, such as inertial data and ranging information. Unlike traditional filtering methods such as the Kalman filter, which in effect rely on a first-order, implicit sliding window, factor graph optimization maintains an explicit sliding window, typically encompassing multiple historical states for joint optimization. This approach offers several key advantages:
1. By optimizing multiple frames simultaneously, factor graphs can correct past state estimates when new information becomes available, reducing long-term drift and improving overall consistency.
2. Factor graphs can incorporate a wide range of measurement types and constraints, including nonlinear relationships, making them well suited for complex multi-UAV scenarios.
3. Unlike recursive, forward-only methods, factor graph optimization allows for the redistribution of errors across the sliding window, providing more accurate state estimation in GNSS-denied or bias-prone environments.
This structure provides a robust foundation for accurate relative pose estimation in UAV formations, as it effectively fuses diverse sensor data while mitigating the impact of drift and noise.
In this factor graph model, each node represents the state parameters of a UAV. Specifically, red nodes denote the state parameters of the UAV itself, green nodes correspond to the state parameters of its teammate, and blue rectangular nodes represent inertial navigation factors, which encode the inertial integration constraints between connected state parameter nodes. The blue triangular nodes denote data link measurement factors, capturing the relative measurements between the two UAVs’ state parameters. The yellow square nodes represent the relative pose transformations between the connected UAV state parameters, serving as the variables to be optimized within the factor graph.
To ensure real-time computational efficiency while maintaining an overdetermined equation system for optimization, a sliding window mechanism is employed, indicated by a red dashed box in the figure. The optimization process is constrained to the nodes within this sliding window. The size of the sliding window is determined by balancing the available computational resources against the need for consistent optimization results. Specifically, the window size is influenced by three key factors:
1. Equation closure: the factor graph must remain well constrained, requiring that the number of residuals (constraints) exceeds the number of optimization variables.
2. Real-time performance: the number of nodes directly affects the dimensionality of the optimization problem; to achieve real-time performance, the dimensionality must be limited according to the available computational capacity.
3. Optimization consistency: including more nodes in the sliding window improves the global consistency of the optimization result.
Considering these factors, the sliding window is set to include as many nodes as possible without compromising real-time performance. In this study, the sliding window is set to a time interval of 10 s, encompassing $n$ measurements (where $n \leq 300$), i.e., up to 300 data frames. Additionally, to mitigate non-convergence issues caused by the long-term drift of inertial navigation integration, the time interval between the first and last frames within the sliding window is constrained.
In Figure 3, each time-aligned data frame, along with the corresponding relative pose transformation node and the associated inertial navigation integration factors, forms a complete closed loop. Each of these closed loops enables the construction of a relative position residual $e_i$. Taking the pose residual $e_i$ derived from the $i$th data frame as an example, its computation is given by the following:

$$e_i = \tilde{p}_{b,i}^{a} - \hat{p}_{b,i}^{a} \tag{29}$$
where $\tilde{p}_{b,i}^{a}$ represents the position of UAV $b$ in the data link coordinate system of UAV $a$ at timestamp $i$, as measured by the data link. Meanwhile, $\hat{p}_{b,i}^{a}$ denotes the position of UAV $b$ in the data link coordinate system of UAV $a$ at timestamp $i$, as estimated by the inertial navigation system.
The data link system is capable of measuring the position $p_{b,i}^{NEU}$ of UAV $b$ in the NEU coordinate system of UAV $a$ at timestamp $i$. By applying the pose transformation matrix $C_{NEU}^{b_a}$, this position can be transformed into the body coordinate system of UAV $a$, yielding the position $\tilde{p}_{b,i}^{a}$ of UAV $b$ at timestamp $i$. The transformation is expressed as follows:

$$\tilde{p}_{b,i}^{a} = C_{NEU}^{b_a}\; p_{b,i}^{NEU}$$
By utilizing the inertial navigation factors and the relative pose node involved in the closed loop, the position of UAV $b$ in the body coordinate system of UAV $a$ can be calculated as follows:

$$\hat{p}_{b,i}^{a} = C_{b_k^{a}}^{b_i^{a}}\; T_{k}^{ab}\; C_{b_i^{b}}^{b_k^{b}}\; p_{b,i}^{b}$$
where $C_{b_k^{a}}^{b_i^{a}}$ represents the transformation matrix from the body coordinate system of UAV $a$ at timestamp $k$ to the body coordinate system of UAV $a$ at timestamp $i$, and $C_{b_i^{b}}^{b_k^{b}}$ denotes the transformation matrix from the body coordinate system of UAV $b$ at timestamp $i$ to the body coordinate system of UAV $b$ at timestamp $k$. Additionally, $T_{k}^{ab}$ represents the relative pose of the body coordinate system of UAV $b$ with respect to the body coordinate system of UAV $a$ at timestamp $k$, and $p_{b,i}^{b}$ represents the position of UAV $b$ in its own body coordinate system at timestamp $i$. The transformation matrices $C_{b_k^{a}}^{b_i^{a}}$ and $C_{b_i^{b}}^{b_k^{b}}$ can be computed using the transitivity property of the inertial data, as follows:

$$C_{b_k^{a}}^{b_i^{a}} = C_{b_{i+1}^{a}}^{b_i^{a}}\, C_{b_{i+2}^{a}}^{b_{i+1}^{a}} \cdots C_{b_k^{a}}^{b_{k-1}^{a}}, \qquad
C_{b_i^{b}}^{b_k^{b}} = C_{b_{k-1}^{b}}^{b_k^{b}} \cdots C_{b_{i+1}^{b}}^{b_{i+2}^{b}}\, C_{b_i^{b}}^{b_{i+1}^{b}}$$
In each closed loop, the inertial factors on one side of the loop can be multiplied to obtain the transformation matrices $C_{b_k^{a}}^{b_i^{a}}$ and $C_{b_i^{b}}^{b_k^{b}}$, which describe the transform of the body coordinate system from timestamp $i$ to timestamp $k$. Inertial navigation can compute the position and orientation of the UAV's body frame in its navigation coordinate system at continuous time intervals. By using the ECEF coordinate system as an intermediary, the transformation matrix corresponding to each inertial factor can be derived. Taking UAV $b$ as an example, the individual transformation matrices composing $C_{b_i^{b}}^{b_k^{b}}$ are computed as follows:

$$C_{b_i^{b}}^{b_k^{b}} = C_{n_k^{b}}^{b_k^{b}}\; C_{e}^{n_k^{b}}\; C_{n_i^{b}}^{e}\; C_{b_i^{b}}^{n_i^{b}}$$

where $C_{n_k^{b}}^{b_k^{b}}$ denotes the transformation matrix from the east-north-up navigation coordinate system of UAV $b$ at timestamp $k$ to the body coordinate system; $C_{e}^{n_k^{b}}$ represents the transformation matrix from the geocentric (ECEF) coordinate system to the east-north-up navigation coordinate system of UAV $b$ at timestamp $k$; $C_{n_i^{b}}^{e}$ signifies the transformation matrix from the east-north-up navigation coordinate system of UAV $b$ at timestamp $i$ to the geocentric coordinate system; and $C_{b_i^{b}}^{n_i^{b}}$ represents the transformation matrix from the body coordinate system to the east-north-up navigation coordinate system of UAV $b$ at timestamp $i$. The matrix $C_{b_k^{a}}^{b_i^{a}}$ for UAV $a$ is derived analogously.
As illustrated in Figure 3, each closed loop, comprising a data frame and a relative transformation node, can construct a relative pose residual. By integrating the $n$ relative pose residuals within the sliding window, the optimization equation for refining the relative transformation node is formulated as follows:

$$x^{*} = \arg\min_{x}\; \frac{1}{2} \sum_{i=1}^{n} e_i^{\top}\, \Omega_i\, e_i$$

where $\Omega_i$ represents the information matrix of the relative position error at the $i$th instance. The cost term $e_i$ corresponds to the relative position residual as described in Equation (29). The optimization variable $x$ consists of the relative pose between UAVs, denoted by $T_{k}^{ab}$. By transforming the rotation matrix $R$ in $T_{k}^{ab}$ into its corresponding Lie algebra $\phi \in \mathfrak{so}(3)$, the optimization variable can be expressed as $x = \left[\phi^{\top}, t^{\top}\right]^{\top}$. Within the sliding window framework, only a single type of residual is considered. Given that the time interval of the optimized sliding window is relatively short, the drift of the inertial navigation factors contributing to the residual can be neglected. As a result, it is reasonable to assume that the confidence level of each residual is equivalent, leading to the assignment of $\Omega_i = I$.
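To illustrate how such a residual can be posed in Ceres (the solver the paper uses, as noted at the end of this section), a sketch follows; it uses Ceres's automatic differentiation and angle-axis helpers for brevity, whereas the paper derives the Jacobian analytically, and all names are illustrative:

```cpp
#include <ceres/ceres.h>
#include <ceres/rotation.h>
#include <array>

// One relative-position residual e_i = p_meas - C * (R(phi) * p' + t), cf.
// Equation (29): p_meas is the data-link-derived position of UAV b in UAV a's
// body frame at timestamp i, C is the 3x3 inertial chain matrix (row-major)
// from a's body frame at k to a's body frame at i, and p' is the b-side
// chained position. The optimized variables are phi (angle-axis) and t.
struct RelativePositionResidual {
  RelativePositionResidual(const std::array<double, 3>& p_meas,
                           const std::array<double, 9>& C,
                           const std::array<double, 3>& p_prime)
      : p_meas_(p_meas), C_(C), p_prime_(p_prime) {}

  template <typename T>
  bool operator()(const T* const phi, const T* const t, T* residual) const {
    const T p_prime[3] = {T(p_prime_[0]), T(p_prime_[1]), T(p_prime_[2])};
    T rotated[3];
    ceres::AngleAxisRotatePoint(phi, p_prime, rotated);  // R(phi) * p'
    const T q[3] = {rotated[0] + t[0], rotated[1] + t[1], rotated[2] + t[2]};
    for (int r = 0; r < 3; ++r) {  // residual = p_meas - C * (R*p' + t)
      T Cq = T(0);
      for (int c = 0; c < 3; ++c) Cq += T(C_[3 * r + c]) * q[c];
      residual[r] = T(p_meas_[r]) - Cq;  // Omega_i = I, cf. the text above
    }
    return true;
  }

  std::array<double, 3> p_meas_;
  std::array<double, 9> C_;
  std::array<double, 3> p_prime_;
};
```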
During the optimization process, the Jacobian matrix of $e_i$ with respect to $x$ is derived analytically. The relationship between $e_i$ and $x$ is established as follows:

$$e_i = \tilde{p}_{b,i}^{a} - C_{b_k^{a}}^{b_i^{a}}\left(R\,p_i' + t\right), \qquad p_i' = C_{b_i^{b}}^{b_k^{b}}\; p_{b,i}^{b}$$

where $R$ and $t$ are the rotation and translation components of $T_{k}^{ab}$.
The multiplication by $T_{k}^{ab}$ is equivalent to first performing a rotational transformation with the Lie group element corresponding to $\phi$, and then linearly adding the translation $t$. Therefore, the differentiation can be carried out separately for $\phi$ and $t$. The corresponding Jacobian matrix $J$ can be computed as follows:

$$J = \begin{bmatrix} \dfrac{\partial e_i}{\partial \phi} & \dfrac{\partial e_i}{\partial t} \end{bmatrix}$$
Using the left perturbation model approach, the partial derivative of $e_i$ with respect to $\phi$ can be calculated as follows:

$$\frac{\partial e_i}{\partial \phi} = C_{b_k^{a}}^{b_i^{a}}\left(R\,p_i'\right)^{\wedge}$$
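For reference, the left-perturbation identity behind this expression is the standard result:

$$\exp\!\left(\delta\phi^{\wedge}\right) R\,p_i' \approx \left(I + \delta\phi^{\wedge}\right) R\,p_i' = R\,p_i' - \left(R\,p_i'\right)^{\wedge}\,\delta\phi$$

so that $\partial\!\left(R\,p_i'\right)/\partial\phi = -\left(R\,p_i'\right)^{\wedge}$; the leading factor $-C_{b_k^{a}}^{b_i^{a}}$ in the residual then flips the sign, yielding the expression above.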
The partial derivative of $e_i$ with respect to $t$ is as follows:

$$\frac{\partial e_i}{\partial t} = -\,C_{b_k^{a}}^{b_i^{a}}$$
Combining the partial derivatives obtained for $\phi$ and $t$, the Jacobian matrix $J$ of $e_i$ with respect to $x$ is obtained as $J = \left[\, C_{b_k^{a}}^{b_i^{a}}\left(R\,p_i'\right)^{\wedge} \;\; -\,C_{b_k^{a}}^{b_i^{a}} \,\right]$.
The factor graph model depicted in Figure 3 is solved using the open-source C++ library Ceres Solver. The Levenberg–Marquardt method is chosen to solve the optimization equations, with the Huber function selected as the robust kernel.
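A solver configuration along these lines would be, for example, the following (the Huber scale of 1.0 and the dense QR linear solver are assumed values; Levenberg–Marquardt is Ceres's default trust-region strategy):

```cpp
#include <ceres/ceres.h>

// Assemble and solve the sliding-window problem. One residual block per data
// frame is assumed to have been added beforehand, e.g.:
//   problem.AddResidualBlock(
//       new ceres::AutoDiffCostFunction<RelativePositionResidual, 3, 3, 3>(
//           new RelativePositionResidual(p_meas, C, p_prime)),
//       new ceres::HuberLoss(1.0),  // robust kernel (scale assumed)
//       phi, t);
void SolveRelativePose(ceres::Problem& problem) {
  ceres::Solver::Options options;
  options.trust_region_strategy_type = ceres::LEVENBERG_MARQUARDT;
  options.linear_solver_type = ceres::DENSE_QR;  // small, dense problem
  ceres::Solver::Summary summary;
  ceres::Solve(options, &problem, &summary);
}
```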