Optimal Shadowing Filter for a Positioning and Tracking Methodology with Limited Information

Positioning and tracking a moving target from limited positional information is a frequently-encountered problem. For given noisy observations of the target’s position, one wants to estimate the true trajectory and reconstruct the full phase space including velocity and acceleration. The shadowing filter offers a robust methodology to achieve such an estimation and reconstruction. Here, we highlight and validate important merits of this methodology for real-life applications. In particular, we explore the filter’s performance when dealing with correlated or uncorrelated noise, irregular sampling in time and how it can be optimised even when the true dynamics of the system are not known.


Introduction
A wide range of filters has been developed for the purposes of tracking moving objects and approximating their trajectories. Currently, tracking is typically achieved using sequential statistical filters, such as Kalman or particle filters, which take only the current and future states into account [1][2][3][4][5][6][7]. The classic Kalman filter has been extended to consider converted measurements and nonlinear problems [8]. A review of how to use the particle filter and its capability to handle general models and applications has been introduced in [9,10]. In [11][12][13], we developed and introduced a new tracking methodology based on shadowing filters. We proved the concept and demonstrated the method's superiority over other tracking methods, as well as its applicability to real-life problems. In this paper, we conduct further investigations to demonstrate and verify the additional advantages of the shadowing filter tracking method, which are important for real applications. Namely, we introduce the two-dimensional solution when the observational error's correlation is taken into account and compare the method's performance when this correlation is considered or ignored. We additionally test the filter's performance when dealing with non-uniform sampling in time and demonstrate the filter's capability to overcome singularities.
As a motivation, consider the frequently-occurring tracking problem that arises when observations are recorded in a non-Cartesian coordinate system. Assume that the bearing and range of an object are observed. These observations are independently determined from some reference point. However, when these measurements are transformed into Cartesian data, they will have correlated errors resulting from this transformation. To make matters worse, we often need to combine multiple measurements. In practice, such problems arise in: (a) active radar locations; (b) satellite interferometry; (c) GPS; and (d) passive sonar locations. The challenges in these applications vary. Applications (a), (b) and (d) all use bearing and range data, but in (b), the range is very accurately measured, while the bearing has large errors. In (d), the situation is the opposite, with observations of the bearing being high quality. Application (c), on the other hand, uses only range information, but from several different satellites. All of these cases can be dealt with using the shadowing filter by taking into account the correlation of the observational error. Section 3 of this paper introduces the methodology for approximating the trajectory of an object being tracked with noisy position data and potential correlations in the observational error. The solution in two-dimensions is derived for both uncorrelated and correlated error scenarios. Section 4 utilizes a simple, non-trivial system example to perform a numerical analysis in order to assess whether or not error correlation has to be considered to ensure a good approximation of the object's path and if the extra computational effort required is worth it.
Other common challenges one could face in practice are singularities and non-uniform sampling of data caused by the failure of the recording devices. Section 5 demonstrates the robustness of the shadowing filter tracking methodology to avoid singularity impacts, as well as deal successfully with irregular time resolution. We also apply the tracking technique to a noisy chaotic trajectory, specifically that of the Lorenz model, and analyse the results from the perspective of noise reduction. In the final section, we draw conclusions and make some suggestions for future directions of this work.

Shadowing Filters
The shadowing filter method is unlike sequential filter methods in that it estimates the full trajectory based on all observations simultaneously. In this aspect, our approach is somewhat similar to variational filters [14,15]; however, it does not fall into the trap of local minima [16,17]. This is due to the very simple principle these filters are based on: if the model is good enough, state estimations must be close to the observations and consistent with the model's equations. In practice, this principle imposes a quadratic norm on the filter, which only has one minimum. Specifically regarding the tracking approach, previous investigations showed and verified some merits and advantages of the shadowing filter tracking methodology. In [12,13], we showed the capability of shadowing filter to track single-particle and multi-particle moving objects using GPS data from flying birds; in Figure 1, we show an example of how we successfully used this method to track eight pigeons' trajectories and extracted the corresponding acceleration profiles only from positional GPS data. To track rigid bodies, the method was extended [11] to include rotational motion and moments of inertia. Our method is able to reconstruct the full dynamical phase space from positional information [11][12][13].
In general, varieties of the shadowing filter have proven to have better performance than Kalman filters [18], particle filters [19], sliding average filters [12] and variational filters [17]. They have been successfully implemented from simple low-dimensional maps and flows [17] up to operational weather models [20]. In [11], we addressed a benchmark with the state-of-art of tracking methods. We conducted direct detailed comparisons with the Kalman filter [21], extended Kalman filter [22] and particle filter [23,24] approaches. Our computational comparisons confirm the superiority of the shadowing filter tracking methodology over these sequential filter tracking methods with respect to accuracy and complexity (computational time). In this manuscript, we will not redo these comprehensive numerical comparisons. However, as the motivation, we confirm the performance compared to existing methods in Figure 2 with systems later investigated in this manuscript. Specifically, correlated observational errors are taken into account in Figure 2a   In this demonstration, we refer to our previous works [12,13]. In (a), we track the trajectories of the eight pigeons (the coloured solid lines) from the GPS signals (red dots). In (b), we use the tracking algorithm to estimate the corresponding acceleration profiles for each pigeon (colours refer to different pigeons).  The superiority of the shadowing filter tracking methodology over a sequential filter methodology (Kalman filter). Error correlation has been taken into account in (a) and (b), while the singularity scenario is presented in (c) and (d). The shadowing filter shows obvious superiority, especially where error correlation increases the nonlinearity of the system far from the reference point at the origin in the first scenario and around the singular records in the second scenario. Note that the large error of the shadowing filter estimates at the beginning of the trajectory is expected, as explained in previous investigations [11,12].

Summary of the Tracking Methodology
Typically, tracking data will contain only noisy position observations, and so, we are faced with the challenge of having no explicit information about the object's acceleration or velocity. Using the acceleration as a free parameter, we are able to implement the shadowing filter to find a good approximation of the full phase space including acceleration and velocity information based on the position observations. This free parameter can then be tuned in order to optimize the quality of the solution of the approximated states.
We aim to track a point object that is moving in a d-dimensional Euclidean space with Cartesian coordinates, given a sequence of noisy observations. Let R i ∈ R d be the true states and P i ∈ R d be the noisy observations of the position at time t i for i = 0, 1, . . . , n, where the observational errors have a d × d covariance matrix C i and corresponding information matrix I i = C −1 i , which is used to account for the correlations of the observational error.
The object's dynamics are modelled based on its observed position P i ∈ R d , velocity v i ∈ R d and acceleration a i ∈ R d for t i ≤ t ≤ t i+1 . Our aim is to approximate p i ∈ R d close to the true state R i . To do this, we minimize the total squared error ∑ n i=0 (P i − p i ) T I i (P i − p i ). We assume that the acceleration is constant over one time interval (T i = t i+1 − t i ) and that its magnitude is bounded over the entire trajectory by the relation (∑ n−1 i=0 T i a T i a i ≤ (t n − t 0 )ξ 2 ). Using Newton's laws and a Galilean transformation, we can solve this optimization problem with the Lagrange multipliers method [25]. The appropriate Lagrange function is: where the Lagrange multipliers λ i ∈ R d , µ i ∈ R d are d-dimensional column vectors, η ∈ R and ξ ∈ R. Equation (1b) represents Newton's first law, Equation (1c) Newton's second law, while Equation (1d) bounds the magnitude of the acceleration. While using the Lagrange multiplier method seems only natural in the context of a Newtonian model, the optimization could be also implemented using for example gradient descent [16,17] or penalty methods [26]. The solution to the scalar case and rigid-body problems can be found in [11,27]. In this paper, using similar linear algebra methods and appropriately-constructed vector and matrix equations, we extend the solution to higher dimensions, specifically two-dimensional problems, taking into account the correlation of the observational errors. Note that this solution can be generalised for the d-dimensional problem where all vectors and matrices increase in dimension by a factor of d.

Correlated Observational Errors
For higher dimensions, even in two dimensions, the complexity increases as we have to determine whether observational errors are not independent, but correlated. Note that the Weighted Least Squares (WLS) method and its extension, the constrained-WLS, are related to our shadowing filter [28][29][30][31]. For the tracking problem, they are able to deal with correlated/uncorrelated noise and irregular sampling in time. However, unlike the shadowing filter, they do not consider explicitly the velocity and acceleration [32,33].
Consider the situation of an object moving in the xy-plane. Given a time resolution of measurements T i = t i+1 − t i for i = 0, 1, . . . , n − 1, the true position is R i = (R x i , R y i ) T ∈ R 2 and the noisy observed position is P i = (X i , Y i ) T ∈ R 2 . The error in the x-direction has variance σ 2 x i , and similarly, in the y-direction the error has a variance σ 2 y i . Therefore, the observational error has the following 2 × 2 covariance matrix: with the corresponding information matrix: where: Note that generally, σ x i and σ y i can vary with time.
In addition, defining the unknown variables, η ∈ R and ξ ∈ R, then the Lagrangian expressed by Equation (1) can be stated as: Obviously, the optimal solution occurs when all the partial derivatives of L are zero: The partial derivative equations and the complete solution of this situation are detailed in Appendix A. For the purpose of approximating the trajectory, we require only the final equation of the solution given by Equation (A24): (ηB +ĀI)p =ĀIP whereĀ andB are numerical matrices used to construct the solution, I is the information matrix, P is the observational states vector, p is the approximated states vector and η is the smoothing parameter of the filter.

Uncorrelated Observational Errors
If the observational errors of the x-component are independent of those in the y-component, then they are uncorrelated and at each time t i where i = 0, 1, . . . , n the covariance between x i and y i equals zero, i.e., cov x i y i = 0, ∀i = 0, 1, . . . , n. Therefore, S x i = σ −2 x i , S y i = σ −2 y i , S x i y i = 0. Consequently, Equations (A1) and (A2) can be stated as: This gives us two independent systems of equations: the x-component system Equations (A3), (A5), (A7), (A9) and (3) and the y-component system Equations (A4), (A6), (A8), (A10) and (4). For each component, the systems are identical to the scalar case, which has been solved previously [11,27], and the optimization problem can be solved independently for x and y.

Numerical Investigations: Correlated vs. Uncorrelated Observational Errors
Given a dataset with correlated observational errors, we can either take the correlation into account or ignore it. There are practical reasons why one might choose to ignore the correlation. For instance, the algorithms are more complex with correlation and much more computationally intensive, as a d × 5-dimensional system of equations must be solved (see Equations (A1)-(A10)), but if we ignore correlation, the problem is reduced to d independent scalar problems.
We refer herein to these two situations, of taking the Correlation into Account and Ignoring it, as CA and CI, respectively.
In the following investigations, a target's position is tracked in the xy-plane by the position s = (x, y) T given by the simple, non-trivial observation model: where 0 ≤ t ≤ 150, i.e., the length of the time series is n = 151. Note that the choice of this particular path is motivated by the range and bearing problem set up below with the non-linearity in y simplifying the description of the dynamics in polar coordinates.
We introduce correlated observational errors by transforming the position into range and bearing coordinates q = (r, θ) T with the inverse of the standard transformation s = f (q): where r is the range from a reference point (a, b) T and θ is the bearing in radians measured anti-clockwise from the x-axis. Thus, the transformed observations are (r i , θ i ) T , where i = 0, 1, . . . , 150 and the origin (0, 0) T is the reference point. We then add white noise to each measurement: is the observational noise of the range component with variance σ 2 r and ψ i ∼ N(0, 1) is the observational noise of the bearing component with variance σ 2 θ . Finally, we obtain the noisy observations, P i = (X i , Y i ) T , in Cartesian coordinates, by applying the transformation defined in Equation (6) Under the assumption that r is not close to zero and the variances of r and θ are small, the information matrix of s = (x, y) T can be approximated as [27]: Given that this information matrix is estimated in advance, we are now able to apply the shadowing technique via Equation (A24) to approximate the trajectory, which we illustrate in Figure 3.
In Figure 3a, the range measurement is four-times more accurate than that of the bearing, as, for this experiment, the standard deviations were chosen to be σ θ = 0.60 and σ r = 0.15. Conversely, in Figure 3b, the bearing measurement is four-times more accurate than the range, as σ r = 0.60 and σ θ = 0.15. Whilst the filter does work well in either case, note that σ θ has a significant effect on the observations. This is not the case for σ r .

Quality Measures
In order to assess the quality of the filter, we must specify the aim of the tracking. Our aim is to find an approximate trajectory from noisy observations of a moving object's positions, which lies optimally close to the true trajectory of the object. In this context, there are two questions that one might hope to answer: (1) Where is the object in a given time interval? (2) What is the current position?
The different nature of these questions is subtle, but profound, with each lending itself to a different class of possible applications. For Question (1), we need to find an approximate trajectory that is close to the true trajectory over the entire interval and therefore shadows the true dynamics. Such a scenario would be required, for example, in order to identify the ship responsible for leaking oil into the ocean based on its trajectory. Question (2) does not need an approximated trajectory that is close to all states, but only close to the last state, for example finding a missing flight. A filter capable of solving both questions will likely need to be optimized differently for the particular case. This could be either "tracking the path that the object travelled" or "knowing the recent position of the object". Consequently, we require two measures.
The first measure we define details how close the approximated states p i = (x i , y i ) T are to the true states R i = (R x i , R y i ) T along the whole trajectory for i = 0, 1, . . . , n where n + 1 is the length of the time series. For this, we use the root mean squared error defined by: The second measure details how close the approximated final state, p n = (x n , y n ) T , is to the true final state, R n = (R x n , R y n ) T and can be quantified by the end-point error: For more accurate results, we use the ensemble averages of these quantities as measures of the filter's performance. We consider N time series, giving N true trajectories, N sequences of observations and therefore N sequences of approximates: In the experiments that follow, we assess the quality of the filter for both tracking objectives and optimize parameters such that the respective average errors are minimized.

Optimization of the Filter Performance
The quality of the solution, i.e., the approximated states, can be optimized by tuning the free parameter of our filter. Generally, the dataset will contain noisy position observations, and so, acceleration can be used as the free parameter. Thus, the smoothing parameter η of the acceleration Lagrangian term in Equation (2d) approximates a suitable value of the object's inertia.
The optimal value is dependent on the time resolution of the time series, the system's dynamics, the noise intensity, etc. Therefore, the following investigation should be understood as a proof of concept only, whereby an optimal value is determined for some specific application. Furthermore, as we only want to show that η does in fact have some optimum value, we can use a rather simple numerical scheme.
The algorithm implemented starts with a broad parameter sweep, η = 0.1, 1, 10, 100, 1000, and calculates the average errors for each tracking objective ( E and e ) from N = 100 time series. It then identifies the sub-interval in which the minimum error occurs and divides this η-interval into five new subintervals. Repeating this procedure six times, with a new time series for each narrowing of the parameter range, we get an approximation of the optimal η. This approximation is sufficient to show that an optimal value exists.
Tables A1 and A2 show the results when optimizing η subject to minimizing E for the CA and CI cases, respectively. The minimum in each interval is indicated in bold. We note that the optimal values for the two types of correlation consideration lie close together, and there does not seem to be a large difference between the resulting errors.
Tables A3 and A4 show the optimization data when our aim is to minimize e for the CA and CI cases, respectively. Again, minimum values in each interval are indicated in bold. Here, we note a huge difference between the approximated optimal values for the two types of correlation considered. When correlation was ignored, we were required to expand the starting interval as the best η value in Experiment 1 (the zeroth iteration) was the interval's endpoint η = 1000. Consequently, the almost optimal value of this method η = 1600 was much higher than the value of η = 91, found when correlation was considered. On the other hand, the average end-point error was 1.4-times larger when correlation was accounted for.
The results of this investigation are summarized in Figure 4. Blue lines show the results when correlation is accounted for, and red lines show the results when correlation is ignored. The general shape of the graphs-large values of the error for small and large η values-support our initial assumption that an optimal value of η exists. Since our primitive iterative algorithm can only be used to find an approximation of the optimal η, we cannot say anything about whether the found best value is a local or a global property. However, as was pointed out in the beginning, the purpose of this investigation was only to show that the η value can be optimized for a specific application. We show both scenarios when the error's correlation is taken into account (blue) or completely ignored (red). The minimum illustrates the existence of the optimal value of η that minimises the error and shows that incorporating the error's correlation does not significantly improve the filter's performance.

Comparison of the CA and CI Cases
Recall that CA refers to the situation where correlation is taken into account, and CI ignores it. Surprisingly, the CI algorithm seems to lead to similar or even better estimates. To get a clearer picture, we need to understand the error distributions, as well. Figures 5 and 6 show histograms of the two error measures, E and e , resulting from each correlation algorithm. We used data from 100 time series and chose the approximated optimal η values found previously.    We are not surprised that there was not much difference between the CA and CI algorithms when we were interested in minimizing E , as seen in Figure 5. Both average errors E were about 0.21, and the distributions were peaked around this value. If there was any difference, it was that the percentage of errors contained in the interval [0. 18,24] was about 10% higher for the CA algorithm. It is not immediately clear whether this is statistically significant, but given that both methods showed similar result, we do not think that investigating this statistical significance is important.
Recall that for the case of minimizing e , unlike for E , our η optimization gave us quite different optimal values depending on which algorithm was used. This situation is depicted in Figure 6. Surprisingly, the resulting distributions have much in common: In addition, there was a peak around the average value in both cases with e ≈ 0.2.
We conclude that the shadowing filter's tracking methodology was robust enough to consider or ignore the correlations of the observational errors. That is, there was not a marked improvement in the shadowing filter's performance if we took correlations into account. This is further illustrated in Figure 7, which shows estimates for both tasks ((a) minimizing average error and (b) minimizing last point error).

Applications
In this section, we demonstrate some applications with common challenges one frequently faces in practice. Specifically, we emphasise three challenges relevant to tracking: irregular/non-uniform time resolution, records with large singularities, tracking chaotic trajectories and real-world multiple object application.

Non-Uniform Sampling
Here, we highlight the fact that our shadowing filter is flexible enough to be used even for non-uniformly sampled data. Often, data points are not recorded with the same sampling rate, and so, t h+2 − t h+1 = t h+1 − t h = ∆t does not hold for all h in the dataset. Such a situation is quite common for example in GPS tracking, as the GPS device often fails to log one or more data points [34,35]. Our filter can be used with arbitrary values of ∆t i between any two data points; however, we focus here on the sub-problem that some data points are not recorded with each ∆t i a multiple of some sampling time such that ∆t i = m × ∆t.
It has been shown that our tracking methodology is sufficiently robust to ignore the error correlation without effecting performance. Consequently, any d-dimensional problem can be treated as d independent scalar problems. Hence, for our numerical investigation, we will use data generated from the following scalar function that gives true states Y and noisy observations P: where 0 ≤ t ≤ n, the observation's noise t ∼ N(0, 1) with t a white noise process, χ t is an independent cumulative white noise process, χ t = t ∑ i=1ξ i whereξ i ∼ N(0, 1), α is the variance of the cumulative white noise and β is the variance of the observational noise.
We generated data with n initial length of n = 500 using β = 9 and α = 1. From the dataset, we randomly deleted data points using a uniform distribution (cf. Figure 8a) so that the final dataset had n ≈ 125 data points. Again, we used our usual measures < E > and < e > estimated from N = 100 datasets for η = 1.3 or 19 (depending on the tracking objective). To incorporate the non-constant sampling time, we included the sampling times into the matrix Equation (A12). Our numerical investigation shows that for increasing levels of deletion, the error increased relative to the uniform sampling. Specifically, we found that < E >≈ 3.3 (η = 1.3) and < e >≈ 1.5 (η = 19). Overall, our filter was still able to find a good enough approximation, as can be seen in Figure 8b.

Multiple Bearing Observations: Singularities
In this section, we will study another situation where non-Cartesian observations are used. Consider the case when the target is tracked in the xy-plane, by the position s = (x, y), using bearing observations q = (θ, θ ) from two distinct reference points (a, b) and (a , b ). The transformation f between the different coordinates is defined by: (x, y) = (a, b) + m(cos θ, sin θ), (12) = (a , b ) + m (cos θ , sin θ ), where under most circumstances, there exist unique m, m ∈ R. Hence, if we have bearing observations, then we need to find the raw position estimates, and that requires calculating m and m . These can be estimated by solving the linear equations: which are equivalent to: and these can be expressed in matrix form as follows: Solving these linear equations enables us to find the raw position estimates from bearing observations using the transformation f [27].
In this situation, it is difficult to compute the Jacobian matrix directly from the transformation f . However, note that the inverse of the transformation f is given by: Consequently, the matrix K, the Jacobian matrix of f −1 , is easily computed as follows: where r 2 = (x − a) 2 + (y − b) 2 and r 2 = (x − a ) 2 + (y − b ) 2 , which requires simple computations once an estimate of (x, y) is obtained. Therefore, under the assumption that both r and r are not close to zero, the information matrix can be approximated using the matrix K as follows: where σ −2 θ is the variance of the observational error in the first bearing component and σ −2 θ is the variance in the second bearing component. It follows that the information matrix is given by: Given this information matrix, we can apply the shadowing filter to find an approximated trajectory.
A problem arises in situations when the sensors and the target are collinear; i.e., θ ≈ ±θ . In such situations, the linear Equations (14) are singular or badly conditioned. This can lead to raw position estimates far from their true positions. To solve this problem, we have three options: we can drop the bad observation, replace it with a forecasted position or take into account that the information in this observation is less reliable than in other observations. Since the shadowing filter is estimating a trajectory from a sequence of observations, generally there is no harm in either of these solutions. Although, dropping bad observations leads to a non-uniform time-gap between observations where the shadowing filter still works successfully, as verified in the above investigations. We have chosen here to give our observations different weights, according to their reliability. To do this, we scaled the information matrix by using the rcond command in MATLAB. This command calculates the one-norm estimate of the reciprocal condition number as returned by LAPACK. rcond was used on the matrix giving information on the angles: cos θ − cos θ sin θ − sin θ .
The output of rcond varies between zero and one. For angles that lead to an ill-conditioned Equation (14), we get small weights, but good observations will get maximum weighting.
In the example shown in Figure 9 and [27], we tracked an object using bearing observations provided by two moving sensors. The target moved on a circular path (sin (t/25), cos (t/25) for 0 ≤ t ≤ 100). It moved in a clockwise direction, starting from the 12 o'clock position. The first sensor moved between (−3, 3) and (3,1), and the other between (−3, −2) and (3,1). Both sensors had a constant speed. To generate the bearing observations, we transformed the true states of the object to the bearing coordinates using Equations (15) and (16). We added white noise to both components with variances α θ = α θ = 0.05 to obtain the desired multiple bearing noisy observations. After that, we transformed these noisy observations to the Cartesian coordinates using the transformation f after computing m and m from Equation (14). We applied the shadowing filter at η = 1 on these observations using the scaled information matrices to deal with the bad observations. In Figure 9, we can also see the effect of the different weighting. Note that when the target was between the four and five o'clock position, it was directly between the sensors θ = ±θ . This led to a very bad observation, as seen by the large error. Similarly, at the end of the trajectory, the object was almost directly behind both sensors, which also led to a poorly-conditioned observations. However, using the one-norm estimates as described above enabled us to determine good approximations. Observations True Approximation Figure 9. Performance of the shadowing filter in tracking an object using multiple bearing observations with η = 1, α θ = α θ = 0.05. Note how the filter is able to overcome the singularities in the observations and estimates the true dynamics quite well.

Chaotic Trajectory: Lorenz Model
In the previous sections and examples, we investigated the performance of the tracking filter for deterministic trajectories in one-and two-dimensional spaces. The natural step from here is to consider the realistic case of a tracking such as the chaotic motions seen in nature in insects, fish or birds.
We now apply the tracking technique to the Lorenz model [36]: Using the standard Lorenz model and typical parameters ρ = 28, Σ = 10 and B = 8/3, we initialized the system with (x 0 , y 0 , z 0 ) = (0, 1, 1.05) and generated a trajectory containing 2501 data points for t = [0, 25]. The data evolved as per Equation (17) and were measured with some uncertainty following the incorporation of noise with variance of β = 1 to each component as per Equation (11).
The dynamics were tracked using the shadowing filter with η = 0.1, which was the optimal value for the chosen noise variance as found in Section 4, and a sampling rate of 0.1. We considered uncorrelated errors and treated the system as per the scalar case outlined in Section 3. Figure 10a-c shows the successful tracking of a target in each of the variables, while Figure 11 shows these details in a projection of the three-dimensional state space. We note that as the shadowing filter did not contain any information about the vector field, the noisy observations were projected down on to the attracting manifold of the Lorenz model. Therefore, our filter acted similar to the other shadowing filters, which had information regarding the vector field encoded in them [17]. Although there were some estimation error, one can see at the outer loops in Figure 11, we note that our filter estimated the true time series most of the time.   Up until this point, we have been concerned with forecasting. However, this is closely related to noise reduction, and the results of the previous section can be viewed in such a context [37,38].
We assumed that the clean time series of each variable,x,ỹ andz, is in the presence of measurement noise and aimed to separate these from the observed noisy data in order to restore the structure of the system and improve the quality of predictions [38], i.e., x n =x t + β t , where t is defined as in Equation (11).
For measurement noise applied to Equation (17), the approximated trajectory (green triangles) of Figures 10 and 11 is the reduced noise trajectory. Figure 11 illustrates how the noise (red line) has smeared the chaotic attractor and obscured its structure and how the chaotic dynamics of the clean (blue line) trajectory were successfully captured.

Real-World Multiple Object Application
As a final application, we present how the shadowing filter can be used to track multiple objects. We used two-dimensional data produced from images of a real flock of ducks moving on a water surface. The dataset was provided by the authors of [39]. We focussed on one of the data files where a turning event occurred. We consider 25 individuals observed at 30 time points. We tracked the trajectories using the positional observations and extracted the corresponding acceleration for each bird. The difference between this application and the tracking of pigeons illustrated in Figure 1 was the size of the flock. Here, we have a larger number of individuals, but a relatively short trajectory. The results are shown in Figure 12, where the circles are the observations and the estimated paths are given as solid lines. The filter accurately tracked the position of each individual. The acceleration of each bird is shown in the lower panels of the figure. Note that this additional information about each individual's state can be used to forecast its future dynamics and could be used to infer that the forces dominate such a collective behaviour [13].

Conclusions
In this paper and our previous ones [11][12][13], a new class of shadowing filters, applicable to observational data of objects moving in more than one dimension, was introduced and investigated. Our investigations demonstrate that this approach is highly versatile. First of all, it may be easily adapted to different settings. For instance, it can be implemented to track a single moving particle, a system of multiple objects [13] or be extended to consider rigid bodies instead [11]. Secondly, it is computationally inexpensive in comparison to alternative methods [11]. Finally, it outperformed the established and widely-used Kalman and particle filters in situations where nonlinearities were present [11]. Our tracking technique is applicable to real data and enables us to minimise measurement errors and overcome device failures, as well as allows for a reconstruction of the full dynamical state space from limited position-only data [12,13].
In this paper, specifically, optimization of the filter quality was achieved for two tracking objectives, namely "tracking the path that the object travelled" and "knowing the recent position of the object", by tuning the smoothing parameter such that the (task-specific) error was minimized. Numerical investigations indicated that our tracking method was sufficiently robust that the benefit of taking error correlation into account was negligible, allowing for many systems to be treated as per the scalar case. Furthermore, we demonstrated that this tracking method can be applied to irregularly-sampled real-world data without additional effort, a situation often encountered when measurement devices fail to log some measurement points. We highlight the ability of the shadowing filter tracking method to avoid singularities in data, another challenge encountered frequently in practice. The shadowing filter was successfully applied to noisy data from the Lorenz model, and we briefly commented on the relation of this approach to noise reduction algorithms. Finally, a real-world application of multiple object tracking was provided, as we tracked a flock of 25 ducks moving on the water surface.
Lightweight and inexpensive tracking devices are now available and have opened up a new area of tracking possibilities [34,35,[40][41][42][43][44], resulting in new observational data, which enable the study of movement, of animals for example, with so far unreached precision. Given these advantages and the resulting big data, it is clear that the new generation of filters needs to be fast and efficient. The outlined benefits of our filter together with its clear optimization possibility make it, in our opinion, a very strong contender for such applications, and we are looking forward to seeing the progress in this area based on our filter.

Conflicts of Interest:
The authors declare no conflict of interest.

Code Availability
Contact the author for the MATLAB code. A simple code for one-dimensional tracking is provided at: MathWorks-File Exchange-One-Dimensional Tracking Methodology Using Shadowing Filter-Ayham Zaitouny.

Abbreviations
The following abbreviations are used in this manuscript: CA Taking observational errors' Correlation into Account CI Ignoring observational errors' Correlation

Appendix A. Detailed Solution
We will show how to solve the problem in two-dimensions when the correlation is taken into account. We will represent the equations using vector and matrix forms. We will define some vectors and matrices; please note that in our definitions, 0 refers to a 2 × 2 matrix of zeros, as follows: Obviously, P denotes the observations, p the estimates, λ and µ our Lagrangian parameters, while a the piecewise constant accelerations. Moreover, to take correlations into account, we define the information matrix as: Similar to the scalar case, we need a time matrix T. However, in the two-dimensional problem, each time resolution is repeated twice along the main diagonal of T, since we have two components: To simplify the problem as before to one compact equation, we need two additional matrices, a 2n × 2n matrix D and a (2n + 2) × 2n matrix E defined as: Using our previous definitions, we can see that Equations (A1) and (A2) can be combined into this compact form: Similarly, Equations (A3) and (A4) can be expressed by: Following our previous analysis, Equations (A5) and (A6) can be combined by: These three new Equations (A13)-(A15) can be used together to eliminate the dual variables λ x i , λ y i , µ x i and µ y i . To do that, we need to define two more matrices M 2n×2n and L 2n×(2n+2) where the matrix L is similar to the matrix M, but with an extra two columns of zeros at the end, as follows: With these definitions, it can be easily verified that D 2n×2n M 2n×2n = I 2n×2n , which means that M is the inverse of D. Then, multiplying both sides of this identity by (Tλ) and using Equation (A14), it follows that: Similarly, it can be shown that E (2n+2)×2n L 2n×(2n+2) = J (2n+2)×(2n+2) , where J is defined as: Multiplying the identity by: we find that: Therefore, using the identity of J and Equation (A13), it follows: Now, Equations (A7) and (A9) can be combined to eliminate v x i , so that: Similarly, from Equations (A8) and (A10), it follows that: To combine Equations (A19) and (A20), we need to define two additional matrices G (2n−2)×2n and B (2n−2)×(2n+2) as follows: Using these two matrices, it is easy to see that Equations (A19) and (A20) are combined as: Now, Equations (A18) and (A21) can be combined, and we obtain the equation: (ηB + AI )p = AI P, where: Finally, to include the additional two conditions ∑ n i=0 (S x i (X i − x i ) + S x i y i (Y i − y i )) = 0 and ∑ n i=0 (S y i (Y i − y i ) + S x i y i (X i − x i )) = 0, we define a (2n × (2n + 2)) matrixB to be matrix B augmented with two final rows of zeros and a (2n × (2n + 2)) matrixĀ to be A augmented with two final rows of 2 × 2 identity matrices, then we get the final equation: (ηB +ĀI)p =ĀIP (A24) which can be solved by singular-value decomposition to give the approximation trajectory. Similar to the scalar case, to avoid the large errors at the end of the trajectory, we reverse the columns and rows of matrices A and B to obtain the desired solution. Therefore, we expect that the errors at the beginning of our estimations will now be large [11,12]. Tables   Table A1. Results of the experiments to identify the almost optimum value of η giving the smallest E when Correlation is Accounted for (CA