AOA-Based Three-Dimensional Positioning and Tracking Using the Factor Graph Technique

In this paper, an angle-of-arrival (AOA)-based algorithm is proposed for tracking the position of an anonymous target in three-dimensional (3D) space. Distributed sensors are deployed, which can measure both the azimuth and elevation angles of the AOAs. Assuming the target movement is non-linear, the extended Kalman filter (EKF) is applied, where the observation process is realized by a practical AOA-based position detector, to form a unified factor graph (FG) framework. Moreover, the variance of observation errors, which is needed by EKF, is estimated in real time by using both the AOA measurements and the predicted target state. Such a dynamic estimating approach exhibits higher performance robustness compared to the conventional method, especially when the sensing environment is unstable. Additionally, the predicted target state is also used as the a priori information of the system, in order to reduce the impacts of burst sensing errors. According to the simulations, the proposed system is shown to achieve less root mean squared errors (RMSE) in different evaluation scenarios, with fast convergence behavior.


Introduction
Future wireless networks should not only realize communication, but also support new services, such as connected transportation systems, smart logistics, unmanned factories, etc. [1][2][3][4]. As one of the enabling techniques, position detection is becoming increasingly important. Its technical evolution can be found in discussions within the Third-Generation Partnership Project (3gpp), the Bluetooth Special Interest Group (SIG), and other standard development organizations. Such applications should not only be confined to two-dimensional (2D) planes, but also be implementable to three-dimensional (3D) spaces.Moreover, communication may also rely on positioning techniques. For example, in multiple-input multiple-output (MIMO) systems, beamforming is well utilized to focus the signal power towards a specified receiver. However, conventional approaches for deciding the beam direction may not satisfy the future requirement, especially when the target is moving. Hence, a robust and low-complexity tracking solution should be considered.
To track non-linear target movement in practice, the extended Kalman filter (EFK) is used. One of the factors that influences the accuracy of EKF is the variance estimation of observation errors [5,6]. In this paper, the observation process in EKF is realized by a specific location detector, instead of a theoretical model in many previous works [7,8]. Usually, it is difficult to estimate the variance of detection errors in real-time scenarios. Such estimation may get even worse when the environment changes. For example, in connected car networks, car sensing may not only be performed by fixed infrastructures on the roads, but also be conducted by surrounding vehicles, which forms a simultaneous localization and mapping (SLAM) problem [9,10]. In SLAM, the sensing environment may change as devices move, and the variance of detection errors may become unstable. Hence, it is a crucial challenge for the tracking system to perform accurate estimations of such variance in real time.
In SLAM, multiple sensors can be used for detecting a single target. Actually, since the network density continues to grow, the utilization of multiple distributed sensing devices will become more convenient in the future. Therefore, this paper considers a scenario with distributed sensors. To perform a location detection with distributed sensors, many algorithms have been proposed. For example, in [11], the least squares (LS) method was applied for indoor positioning, and a hybrid system combining time-difference-of-arrival (TDOA) and angle-of-arrival (AOA) measurements was considered. In [12], a factor-graph (FG) method [13] was proposed for location detection using time-of-arrival (TOA) measurement. The performance was shown to achieve the maximum likelihood (ML) bound with relatively low complexity. Due to such advantages, many extensions of the FG algorithm to other applications can be found in [14][15][16] for received-signal-strength (RSS), TDOA, and AOA, respectively. This work will also focus on FG for both location detection and tracking.
A location detection and tracking algorithm using TOA was proposed in [17]. However, the use of TOA always requires strict time synchronization, because the signal emitting time should be known. Such a limitation will increase the difficulties of implementing such a technique in practice. Moreover, for an anonymous target that may not provide any transmission information to sensors or the fusion center, TOA cannot be used. There are still some limitations for RSS-based detection [14]. For example, RSS is also regarded as a technique only for a known target, because the training of received signal strength has to be conducted before the real deployment. To deal with the drawbacks mentioned above, TDOA-based techniques can be used, which does not require synchronization between the target and sensors, as well as off-line training. However, the performances are too sensitive to the measurement errors, due to the light speed.
In contrast, the performances of AOA-based techniques are not so sensitive to measurement errors, while still enjoying the same advantages as TDOA for detecting anonymous targets. Therefore, this paper will focus on AOA. In [16,18,19], AOA-based location detection and tracking was studied. However, the previous works only considered the scenarios in 2D. Actually, the use cases in 3D are currently increasing. For example, it can be applied to aviation control for airports, where not only the scheduled aircraft, but also the anonymous invaders can be detected and tracked. Moreover, live TV broadcasts via cameras on drones will become more popular, which requires accurate drone control in 3D by the fusion center. Additionally, future intelligent transportation systems may realize 3D detection capability to assist the navigation of vehicles running in the complex multi-layer overpass roads. Note that the measurement of AOA samples, including azimuth and elevation angles in 3D, may require more complicated soft/hard designs, such as MUSIC and ESPRIT [20,21], but they are out of the scope of this paper.
For location detectors that work in a relatively stable condition, the variance of detection errors will also be stable. In this case, a fixed variance value can be fed to EKF for estimating the performance of the observer. Usually, such a value can be calculated from off-line training. However, this paper considers a dynamic system, which indicates that the measuring abilities of sensors may vary through the tracking process. Therefore, a real-time tuning approach of the observation variance is required. This paper proposes an estimation method by using an approximated Cramer-Rao bound (ACRB). Specifically, the Cramer-Rao bound (CRB) provides a low bound on the detection performance according to the measurement of indirect parameters. For the AOA-based location detection problem, the CRB can be calculated by using the variances of AOA measurements and the real position of the target. However, the latter parameter is unavailable in practice. Instead, it is possible to get a predicted target position with EKF, with which the ACRB can be calculated as the estimation of the observation variance. Such a process can work in dynamic environments and has been shown to achieve a robust tracking performance in this paper.
The main contribution of this paper can be summarized as follows.
1. An integrated FG structure is formulated to realize 3D location detection and tracking with AOA measurements.

2.
It has been proven that the information from the elevation angle can help to improve the detection of 2D space, from both theoretical and simulation analyses.

3.
The proposed technique has exhibited robust performances even with an unstable sensing environment.

4.
By utilizing the ACRB as side information (SI) to the detector, the proposed system can greatly reduce the impacts of burst sensing errors.
The organization of this paper is as follows. The proposed system model is described in Section 2. Then, the proposed FG algorithms for realizing the location detector and tracking are provided in Sections 3 and 4, respectively. After that, the derivations of CRB and ACRB are given in Section 5. The simulation results that prove the superiority of the proposed technique are provided in Section 6. Finally, the conclusion is given in Section 7 with some remarks.

System Model
In this paper, a discrete and non-linear state space model (SSM) is considered. A single target is to be detected, and its state at time index k can be denoted by the 3D location s k = [x k , y k , z k ] T , with k = {1, 2, ..., K}. Moreover, there are N distributed sensors deployed in this system, and the fusion center is assumed to know their position (X n , Y n , Z n ), with n = {1, 2, ..., N}. The azimuth and elevation angles, denoted by φ and θ, respectively, are shown in Figure 1. For the sake of performing location detection, the relative distance ∆d n from the n-th sensor to the target is computed by: (1) x y z Figure 1. 3D coordinate with azimuth (φ) and elevation (θ) angles.
The measured AOA at each sensor is comprised of both azimuth φ and elevation θ angles. The definition of φ and θ in the spherical coordinate exactly follows the standard way shown in Figure 1. The relative distances calculated above can be connected by the true AOAs as follows, where the subscripts are omitted for simplicity. ∆x = ∆z · tan (θ) · sin (φ) (2) = ∆y · tan (φ) , As shown in Equations (3)-(6), the calculations of relative distances are self-contained. In order to obtain a converged result, iterative detection is performed based on an FG framework, which is known as the message passing algorithm. Note that there are always two ways to calculate the relative distance, which indicates that there will be two factor nodes included in the proposed FG structure, connecting to the variable node of the relative distance. A detailed discussion of the FG structure will be given in the next section. Moreover, it is assumed that for every time index, sensors are able to measure L samples of AOA from the target, and the variable of measured AOA can be described as a Gaussian model as: where l = {1, 2, ..., L}, and the noise components n φ and n θ follow N 0, σ 2 φ and N 0, σ 2 θ , respectively. Even though the quantization errors may break the Gaussianity of the measured AOAs, they are assumed to be very small in this paper, such that the input of the FG detector can be still regarded as Gaussian distributed variables. Moreover, it is assumed that the measured AOAs can be sent from sensors to the fusion center with error-free transmissions. Due to the Gaussian assumption of the measured AOAs at each sensor, they can be described by only parameters of the Gaussian probability density function (PDF), instead of L snapshots. By omitting the indexes n and k for simplicity, mφ and σ 2 φ are used to represent the mean and the variance ofφ, respectively. Similarly, mθ and σ 2 θ are used to represent the mean and the variance ofθ, respectively. Clearly, when L becomes larger, the means and variances obtained from AOA measurements will be closer to the real values.
To describe the tracking model, the state process equation in EKF is first given by: where w k = w x,k , w y,k , w z,k T denotes the noise vector, and each of the elements follows a Gaussian distribution with N 0, σ 2 w . As a non-linear function, f (·) determines the movement of the target. To realize EFK, the conventional way is to approximate , according to the Taylor expansion of the first order. Unfortunately, in this paper, the target's movement is unknown, which should be dynamically adjusted at each time index. Therefore, this work introduces an α such that f (α) = s k−1 holds. Therefore, (10) can be expressed by: where v k−1 = f (α) (s k−1 − α), known as the adjustment term, which should be updated dynamically at every time of the tracking process. Obviously, the accuracy of such a model can be guaranteed only if the target does not move far between two adjacent time indexes. Then, the observation equation can be also expressed as a function of the real target state, as: where the observation noise vector is represented by e k . Since s k is fully determined by the true AOAs φ and θ, g (s k ) can be also expressed byg (φ k , θ k ). Note thatg (φ k , θ k ) is not theoretically formulated in this paper, but realized by a practical 3D location detector. Within the proposed FG structure, all messages passed through are approximated to be Gaussian, including the noise of observation. Therefore, it is assumed that each element from e k follows N 0, σ 2 e . Unfortunately, σ 2 e , which is needed to perform EKF, cannot be directly measured at the sensors. The theoretical bound of the observation result is determined by the CRB. In other words, the smallest achievable σ 2 e can be calculated based on the variances of AOA measurements.

Location Detector
This section introduces the FG structure in detail for AOA-based 3D location detection, as shown in Figure 2. It is also known as the observer in EFK. Note that in the following discussion, the indexes of the sensor are ignored from the variables for an easier expression. First of all, the variable nodes (A φ , A θ ) are used, which contain the means and variances of the measured AOAs at each sensor. The outputs of (A φ , A θ ), i.e., (m φ , σ 2 φ ) and (m θ , σ 2 θ ), are then sent to the factor nodes (T A , T B , T C ) for trigonometric calculations, which triggers the FG iterations.  As shown in Figure 2, T A only takes the information from A φ and then feeds its outputs to the variable nodes ∆x and ∆y, which forms a completed 2D location detection as in [16]. The calculations within the factor node T A are referred to (3) and (5). In order to further fulfill 3D location detection, the information from A θ should also be utilized. Therefore, factor nodes T B and T C are introduced, which connect both A φ and A θ and feed to (∆x, ∆z) and (∆y, ∆z), respectively. The operations within T B and T C are based on (2), (4), (6), and (7). As a result, each of the relative distance variables is linked to two different trigonometric factor nodes, as shown in Figure 2.
It should also be noted that in T A , the multiplication of two independent Gaussian variables is needed, while in T B and T C , the multiplication of three independent Gaussian variables is required. Then, the means and variances of such multiplication results have to be calculated, assuming all messages are Gaussian.
Specifically, given two independent variables a ∼ N (m a , σ 2 a ) and b ∼ N (m b , σ 2 b ), the product a · b can be expressed by the mean and the variance as: If a third variable c ∼ N (m c , σ 2 c ) is introduced, the product of them a · b · c can be expressed in a similar way, as: However, the original trigonometric calculations shown in (3)- (6) are not linear, which may break the Gaussianity of the output of the factor nodes. To deal with this problem, the first order TSexpansion is used to approximate the non-linear function f (α) as: where the linear approximation is assumed to be around α = m α . As a result, three constant values f (m α ), f (m α ), and m α are obtained. Based on (17), the approximated mean and variance of f (α) can be expressed by: Based on (18) and (19), it is straightforward to calculate the means and the variances of particular trigonometric functions of (3)-(6), which are summarized in Table 1.

Approximated Mean Approximated Variance
Given the mathematical preliminaries above, the proposed FG detection will be explained in detail as follows. First of all, messages containing means and variances of the measured AOAs will be fed to the trigonometric factor nodes T A , T B , and T C . After that, the output messages go to relative distance variable nodes ∆x, ∆y, and ∆z. As can be seen in Figure 2, each of the relative distance variable nodes connect to two trigonometric factor nodes, which requires the multiplication of two independent Gaussian PDFs as, Based on (20), particular calculations of the means and variances of the iterative messages are made possible, from the relative distance variable nodes to the relative distance factor nodes, which are expressed by, m ∆y→D B = m T A →∆y ·σ 2 T C →∆y +m T C →∆y ·σ 2 and: On the other hand, the means of messages passing from the trigonometric factor nodes to the relative distance variable nodes are calculated as: In addition, the variances passing from the factor node T A to the variable node ∆x can be calculated by: and the variance passing from the factor node T B to the variable node ∆x is given by: However, calculating the rest of the variances required in the proposed FG detection is omitted, due to the limited space. Actually, the way of such calculation is very similar to (33) and (34). Even though many calculations are involved in the equations mentioned above, some of them are reusable, which guarantees a low implementation complexity.
Finally, according to (1), messages passing from the factor nodes of a relative distance to different axes can be expressed by: and: In the equations shown above, the variable subscripts indicating the sensor index are ignored for simplicity, because such steps are conducted for all sensors in parallel. At the variable nodes of estimated target position, messages coming from different sensors will be exchanged in order to make the FG iteration work. Therefore, the variable subscripts indicating the sensor index will be used when calculating the means and variances, as [13]: According to the design of real systems, the FG iteration can be performed until some particular conditions are satisfied. For example, if the gap of detected positions between two adjacent time indexes is less than a pre-defined threshold, or the maximum allowed iteration number is simply reached, the FG iteration can stop. Finally, the target location detected by the observer can be obtained by (50)-(52), where σ 2 x , σ 2 y , and σ 2 z are calculated by (47)-(49), respectively. (52)

FG-EKF
The FG structure-based EKF is explained in this section. Based on (11) and (12), the goal of the tracking system is to obtain the values of s k and v k , which allow the a posteriori probability p (s k , v k |o 1:k ) to be maximized, with (·) 1:k denoting time indexes. The following equation calculates the marginal function of s k and v k , denoted byp (s k , v k ), as: where ∼ is the operator for exclusion. Factorization of the conditional PDF function of (53) can be given by: where (54) is calculated by following Bayes' rule, and (55) is computed since o k only depends on s k . Moreover, assuming that s k only depends on s k−1 and v k−1 , while v k only depends on v k−1 . Then, by substituting (56) into (55), p (s 1:k , v 1:k |o 1:k ) can be expressed by: where the denominator of (56) is omitted, i.e., p (o k |o k−1 ). It should be noted that p (s 1:k−1 , v 1:k−1 |o 1:k−1 ) in (56) denotes the output of FG-EKF from the previous time index. Hence, ∏ is used in (57) considering the case with the time index from one to k. Given (57), the FG-EKF algorithm used in this paper can be detailed in two steps as follows.

Prediction of the State
At the beginning, state prediction of the current time index is conducted based on the FG-EKF outputs at the previous time index. According to Figure 3, the message µ c (s k|k−1 ) indicating the state prediction is given by: where the message flows µ a (s k−1 ) and µ b (v k−1 ) are taken from the FG-EKF outputs at time index

State Refinement
Then, the predicted state s k|k−1 can be further refined by the detected target position of the observer, yielding the FG-EKF results at the current timing k. According to Figure 3, the message flow of the FG-EKF output µ e (s k ) is given by: where µ d (o k ) represents the message flow of the observed target state, computed by the proposed 3D location detector.

ACRB Derivation
In this section, the proposed ACRB for the DOA-based 3D location detector is derived. Based on the result in [16], the conventional CRB is given by: where FI M represents the Fisher information matrix and s denotes the real target position. Given the PDF of the random variableÂ, denoted by p(·), the Fisher information matrix can be expressed by: whereÂ represents the measured DOAs with L samples. Then, the CRB of the DOA-based 3D location detector can be expressed by: where J represents the Jacobian function, which can be further expressed by: where ∆x n = X n − x, ∆y n = Y n − y and ∆z n = Z n − z, ∆xy n = (X n − x) 2 + (Y n − y) 2 , and ∆xyz n = Finally, the ACRB at time index k can be expressed by: where the expression of J k|k−1 is equal to J, but with ∆x n = X n − x k|k−1 , ∆y n = , and ∆xyz n =

Results
In this section, simulation results are provided in order to prove the accuracy, convergence, and robustness of the proposed technique. The real path of the moving target is formulated by functions with regard to the time index k, as: where φ = π/60 and s 0 = [x 0 , y 0 , z 0 ] T = [0, 0, 0] T indicating the initial location. According to the specific simulation scenarios, the setting of sensor positions, FG iteration time, and AOA measurement variances will be different.

Accuracy Evaluation
In this subsection, the accuracy of the proposed tracking technique is evaluated by simulations. Three distributed sensors are deployed on the ground, with the positions at [100, −10, 0] T , [−10, −10, 0] T , and [−50, 100, 0] T , respectively. Such a deployment is only based on practical considerations. Actually, the optimal sensor location cannot be guaranteed due to the mobility of the target, as in (65) and (66). The FG iteration number of the detector is fixed at 15, and 100 snapshots are assumed for the detection of each time index.
First of all, the real path of the target and tracking trajectories are shown in Figure 4. Note that the standard deviations of AOAs from both the azimuth and elevation angles are set equal in the simulations, denoted by σ φ,θ , in order to evaluate the general tendency of the system behavior. However, in practice, σ φ and σ θ can be different. The tracking results with σ φ = 5 • can be seen in Figure 4a. In order to evaluate the accuracy of the proposed tracking technique, RMSEs are calculated. Specifically, the observer, realized by the 3D location detector described in Section 3, achieves RMSE = 1.6 meters, while this value is reduced to 1.4 with the proposed EKF. The RMSE values of the observer and the EKF are 6.15 and 4.6 in Figure 4b, which are higher than Figure 4a, because of a larger σ φ,θ = 20 • is assumed in the simulation.  Moreover, another simulation was conducted to evaluate whether the 2D tracking performance can be improved with the help of the measured elevation angles. Actually, in many applications, the importance of 2D location accuracy is more than that of 3D. Therefore, it will be very beneficial to see if the 2D detection can be enhanced with additional assistance from elevation measurement. As shown in Figure 5, the RMSE values achieved by using only the azimuth angle φ for 2D detection are plotted, versus the standard deviation σ φ . Interestingly, considering both φ and θ with 3D detection, the RMSEs in 2D are found smaller than the case where only φ is used, which are closer to the theoretical CRB bound. This finding proves the fact that the information carried by the elevation angle θ also contributes to the detection in the 2D plane, which was ignored in the previous works [16]. Therefore, the introduction of the elevation angle is not only for realizing 3D detection, but also for enhancing the entire system performance. The mathematical proof of such improvement in terms of a lower CRB is left as future work. Only φ is used Both φ and θ are used CRB 2D Figure 5. Improvement of 2D location with elevation measurement.

Convergence
In this subsection, the system convergence behavior in terms of the FG iteration number is evaluated. The sensor deployment is assumed to be the same as the previous simulation, while both σ φ,θ = 5 • and σ φ,θ = 20 • are set for comparison. According to Figure 6, the tracking errors in terms of RMSEs can be much reduced only after five FG iterations. The performances will be further improved and stable with 10 iterations. Moreover, even the standard deviations of the AOA measurement are different, and the convergence behavior is found to be very similar.

Robustness Analysis
In this subsection, the robustness of tracking performance is evaluated, which is also regarded as the main advantage of the proposed technique. First of all, the effect of sudden sensing errors is considered, such as the false alarm, which may cause unreasonable measurement AOAs. In contrast to the previous simulations, four sensors are used with their positions at [−100, −10, 0] T , [−10, −10, 0] T , [−50, −100, 0] T , and [−50, −5, 10] T . The reason why more sensors are utilized is to avoid the case in which not enough sensors can work normally when some suffer from significant sensing errors. The FG iteration number is fixed at 15, while 100 snapshots are assumed for measurement at each time index. The real path of the target is set the same as in the previous simulations.
Specifically, at each time index, one of the four sensors is assumed to encounter sudden errors with a probability equal to 0.2, besides the true AOA measurement. Without the help of SI, the fusion center can simply discard the data of a sensor if AOAs from more than one origin are detected. However, the system performance will also decrease when fewer sensors are used. In the proposed system, a predicted AOA will be calculated for each sensor by the fusion center, based on the predicted sensor location by EKF. If the measured AOA is close to the predicted one within a predefined threshold, set by 10 • in the simulation, it will be used for location detection at the current time index. Otherwise, the measured data will be discarded. By doing this, sudden errors such as false alarms can be eliminated while still keeping the desired measurement, in order to achieve a better tracking performance.
The performance improvement with SI is shown in Figure 7. Actually, SI is regarded as the a priori information generated from ACRB, which helps the fusion center make a pre-decision to eliminate the impacts of burst sensing errors. In other words, ACRB enables the prediction of the expected AOA measurements. For example, in this simulation of false alarms, if the measured mean AOAs are larger than the predicted values beyond a threshold, i.e., 5 • , they will be directly eliminated. It can be seen that the RMSEs achieved with SI are always smaller than the cases without SI. The improvement is found to be more significant when the standard deviation of the AOAs becomes larger. Moreover, simulations are conducted to evaluate the robustness of the system against the dynamic measurement variances. In this case, σ φ,θ is assumed to be randomly chosen from a set of {5 • , 10 • , 15 • , 20 • }, for each time index. After location detection, the observation variance is calculated based on the proposed ACRB, which is dynamically fed to the EKF. As a comparative study, another scheme assuming a fixed σ φ,θ equal to two is also simulated. According to Figure 8, the RMSE achieved by the proposed technique is 4.3, which is smaller than the comparative scheme with the RMSE being 5.4. Obviously, the two local maximum RMSEs appeared for the comparative scheme happen when σ φ,θ = 20 • was chosen. However, such errors are much reduced with the proposed technique. Therefore, the robustness of the proposed system is proven.

Conclusions
This paper provided an AOA-based tracking technique in 3D with an FG algorithm. The system accuracy was proven via simulations even with large measurement variances. It can be found that the introduction of elevation angels not only realized 3D detection, but also enhanced the detection accuracy in 2D. Moreover, the convergence performance of the proposed FG algorithm was evaluated by simulations, which was shown to be achieved within around 10 iterations despite a large measurement variance. Finally, the predicted target state by EKF was used to eliminate sudden sensing errors and estimate the observation variance through calculating the ACRB, which significantly enhanced the robustness of the tracking performances.