Synthetic ground-truth data are widely used for algorithm validation in several applications, including pose (orientation and position) estimation methods [1
]. Algorithm validation typically proceeds in three steps: design of the algorithm, assessment of its performance on simulated data and, finally, experimental validation. Although experimental validation is essential to legitimize the conclusions of any scientific investigation, it is well known that several undesired and sometimes uncontrollable factors may affect the experimental results. These factors are related either to the sensors, including those generating ground-truth data (intrinsic factors), or to the external environment (extrinsic factors). Simulation environments, on the other hand, make it possible to control both intrinsic and extrinsic factors, generate accurate ground-truth data and perform replications and statistical analyses.
The work carried out on sensor simulations in a pose estimation context is analyzed below.
An inertial measurement unit (IMU) signal simulator was presented in [6
]. The object-oriented approach, implemented in C++, made the project extensible and modular. Realistic motion trajectories were generated starting from synthetic instances of angular velocity and acceleration signals. These signals were then modified to account for several error sources affecting sensor outputs, e.g., sensitivity and cross-axis sensitivity, bias, misalignments between reference frames, measurement (white) noise and quantization noise. The simulation framework was validated by comparing the generated signals with those from a real IMU.
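The error sources listed above can be illustrated with a minimal sketch. It is not the implementation of [6]: the function name, the parameters and the choice of a single shared cross-axis coefficient are assumptions made for illustration, and frame misalignment is left out for brevity.

```python
import random

def apply_imu_errors(true_xyz, scale, cross_axis, bias, noise_std, lsb, rng):
    """Corrupt one ideal 3-axis sample with common IMU error sources.

    All names and the single shared cross-axis coefficient are illustrative
    assumptions; frame misalignment is not modeled here.
    """
    # Sensitivity matrix: per-axis scale factors on the diagonal,
    # the same cross-axis coefficient everywhere else.
    S = [[scale[i] if i == j else cross_axis for j in range(3)] for i in range(3)]
    out = []
    for i in range(3):
        value = sum(S[i][j] * true_xyz[j] for j in range(3))  # sensitivity and cross-axis
        value += bias[i]                                       # constant bias
        value += rng.gauss(0.0, noise_std)                     # white measurement noise
        value = round(value / lsb) * lsb                       # quantization to one LSB
        out.append(value)
    return out

# Example: an accelerometer at rest with mild, made-up error parameters.
rng = random.Random(0)
sample = apply_imu_errors([0.0, 0.0, 9.81], scale=[1.01, 0.99, 1.0],
                          cross_axis=0.002, bias=[0.05, -0.03, 0.02],
                          noise_std=0.01, lsb=0.001, rng=rng)
```

Passing a seeded `random.Random` instance keeps each simulated run repeatable, which matters when replications and statistical analyses are the goal.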
An IMU signal generator was developed in [7
] to gain insight into the operation and performance of pedestrian dead reckoning (PDR) algorithms. In particular, the authors aimed to generate simple and realistic foot motion trajectories in three-dimensional (3D) space, without specifically accounting for any error source other than white measurement noise. The IMU signal generator revealed some critical points in the PDR algorithm, leading to a new design with better performance.
An object-oriented approach was also adopted to implement IMUSim, a Python-based magnetic-inertial measurement unit (MIMU) simulator proposed specifically for applications in the field of human motion analysis [8
]. The ground-truth pose was obtained using stereophotogrammetric reference data from an optical motion capture system. The filtered motion trajectories were processed through the rigid body kinematic equations to obtain the angular velocity and the linear acceleration of the simulated IMU. Models of the measurement noises and biases that affect magneto/inertial sensors were implemented. Additional interesting aspects of IMUSim were the introduction of models describing the Earth’s magnetic field distribution and the simulation of wireless sensor network operation.
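The rigid-body kinematic step can be written compactly. The notation below is an assumed convention, not taken from [8]: R(t) is the rotation matrix from the sensor frame to the navigation frame, p(t) is the sensor position in the navigation frame, g is the gravity vector in the navigation frame and b is the local magnetic field in the navigation frame.

```latex
\begin{align}
  [\boldsymbol{\omega}]_\times &= R^{\mathsf{T}} \dot{R}
    && \text{(ideal gyroscope output, sensor frame)} \\
  \mathbf{f} &= R^{\mathsf{T}} \left( \ddot{\mathbf{p}} - \mathbf{g} \right)
    && \text{(ideal accelerometer output, i.e., specific force)} \\
  \mathbf{m} &= R^{\mathsf{T}} \mathbf{b}
    && \text{(ideal magnetometer output)}
\end{align}
```

Here $[\boldsymbol{\omega}]_\times$ denotes the skew-symmetric matrix of the angular velocity. Noise and bias models are then added on top of these ideal outputs.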
], another MIMU simulator was implemented, which modeled the sensor frequency response as first-order dynamics. In addition, a more complex model for the sensor biases was introduced. Finally, the Earth-fixed frame, which rotates together with the Earth, was distinguished from the inertial reference frame, which does not.
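As a rough illustration of first-order sensor dynamics, the sketch below filters an ideal signal through the differential equation y' = (x - y) / tau. The backward-Euler discretization, the function name and the parameter values are assumptions; the cited simulator may discretize its model differently.

```python
def first_order_response(samples, dt, tau):
    """Pass samples through y' = (x - y) / tau, discretized with backward Euler.

    dt is the sampling period and tau the (assumed) sensor time constant,
    both in seconds.
    """
    alpha = dt / (tau + dt)   # smoothing factor of the discrete filter
    y = samples[0]            # assume the sensor starts at steady state
    out = []
    for x in samples:
        y += alpha * (x - y)  # move a fraction alpha toward the current input
        out.append(y)
    return out

# Example: response to a unit step sampled at 100 Hz with tau = 50 ms.
step = [0.0] + [1.0] * 200
response = first_order_response(step, dt=0.01, tau=0.05)
```

The output lags the step and settles toward one, which is the band-limiting effect that a frequency-response model adds to an otherwise ideal sensor signal.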
As for the simulation of vision-based systems for, e.g., navigation and ego-motion estimation, the visual measurements of interest typically consist of two-dimensional (2D) discriminative features that can be present in an image, such as corners or lines [10
]. These features, resolved in a 2D reference frame attached to the image plane, are typically extracted from grayscale images using ad hoc algorithms called feature trackers [10
]. Therefore, from the algorithm designer’s point of view, the expected outputs from a camera simulator are the synthetic 2D points (or lines) returned from a virtual feature tracker. The standard procedure to simulate a vision system involves the creation of ground-truth three-dimensional (3D) points associated with their projection onto the image plane [1
]. Such a projective transformation requires a camera model (e.g., pinhole, fisheye) and the ground-truth camera position and orientation with respect to the reference frame in which the features are defined. Camera pose estimation usually requires correspondences between the same features in consecutive images. In real conditions, noise, time-varying lighting conditions, fast movements and occlusions contribute to producing feature mismatches, which represent a major problem in vision-based ego-motion estimation [14
]. It would be desirable for a camera simulator to account for this kind of disturbance.
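The procedure described above, together with the mismatch disturbance, can be sketched as follows. This is an illustrative virtual feature tracker built on a pinhole model, not code from any of the cited works. The function names, the intrinsics tuple `(fx, fy, cx, cy)` and the way mismatches are drawn are all assumptions.

```python
import random

def project_pinhole(point_world, R_cw, t_cw, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates with a pinhole model.

    R_cw and t_cw (assumed convention) map world coordinates to camera
    coordinates. Returns None for points behind the camera.
    """
    pc = [sum(R_cw[i][j] * point_world[j] for j in range(3)) + t_cw[i]
          for i in range(3)]
    if pc[2] <= 0:
        return None
    return (fx * pc[0] / pc[2] + cx, fy * pc[1] / pc[2] + cy)

def simulate_tracker(points, R_cw, t_cw, intrinsics, pixel_std, mismatch_prob, rng):
    """Return {feature index: noisy pixel} for all features in front of the camera."""
    ideal = [project_pinhole(p, R_cw, t_cw, *intrinsics) for p in points]
    visible = [i for i, uv in enumerate(ideal) if uv is not None]
    tracks = {}
    for i in visible:
        # With probability mismatch_prob, report the pixel of a randomly chosen
        # visible feature instead (the draw may pick feature i itself).
        src = rng.choice(visible) if rng.random() < mismatch_prob else i
        u, v = ideal[src]
        tracks[i] = (u + rng.gauss(0.0, pixel_std), v + rng.gauss(0.0, pixel_std))
    return tracks

# Example: camera at the world origin looking along +z, made-up intrinsics.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
points = [(0, 0, 2), (1, 0, 2), (0, 0, -1)]
tracks = simulate_tracker(points, identity, [0, 0, 0], (500.0, 500.0, 320.0, 240.0),
                          pixel_std=0.5, mismatch_prob=0.05, rng=random.Random(0))
```

Occlusions and lighting changes are not modeled here; they could be approximated by dropping features from the visible set with some probability.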
The aim of this work is to present the simulation environment that we developed for benchmarking six-degree-of-freedom (DOF) pose estimators. The code is attached to this paper as Supplementary Material. This paper therefore serves as a reference for prospective users of the framework presented here. Unlike the works analyzed above, the framework supports the simultaneous simulation of different kinds of sensors. The suite of sensors includes magnetic/inertial sensors and monocular cameras; however, new sensing modalities can be easily added by inheritance. Several sensors may be instantiated and integrated during a simulation session in order to obtain a sensing pool whose output data can be used to feed a pose estimation algorithm. In this paper, we first describe the simulation environment and the steps taken for its validation against real-life sensor data. Then, we report how IMU and camera data from a real experiment were replicated in a simulation to show the correctness of the synthetic sensor outputs. In addition, an actual benchmarking example concerning a sensor fusion-based extended Kalman filter (EKF) for orientation estimation is presented to show the usefulness of the proposed simulator. Finally, an example of magnetic disturbance simulation is shown, since it represents a useful functionality for replicating a realistic magnetic environment.
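To make the idea of extension by inheritance concrete, the following sketch shows one way such a design could look. It does not reproduce the Supplementary Material: the class names, the `measure` method, the dictionary-based pose and the barometer example are all hypothetical.

```python
from abc import ABC, abstractmethod

class Sensor(ABC):
    """Hypothetical base class: maps a ground-truth pose to a measurement."""

    def __init__(self, name, rate_hz):
        self.name = name
        self.rate_hz = rate_hz

    @abstractmethod
    def measure(self, t, pose):
        """Return the simulated output at time t for the given ground-truth pose."""

class Gyroscope(Sensor):
    """Angular velocity plus a constant bias (noise omitted for brevity)."""

    def __init__(self, name, rate_hz, bias):
        super().__init__(name, rate_hz)
        self.bias = bias

    def measure(self, t, pose):
        return [w + b for w, b in zip(pose["angular_velocity"], self.bias)]

class Barometer(Sensor):
    """A new modality added by subclassing; crude linear pressure-altitude model."""

    def measure(self, t, pose):
        # Roughly 12 Pa per meter near sea level (a coarse approximation).
        return 101325.0 - 12.0 * pose["position"][2]

# A sensing pool: every sensor is queried with the same ground-truth pose.
pool = [Gyroscope("gyro", 100.0, bias=[0.01, 0.0, 0.0]), Barometer("baro", 10.0)]
pose = {"angular_velocity": [0.1, 0.0, 0.0], "position": [0.0, 0.0, 10.0]}
outputs = {s.name: s.measure(0.0, pose) for s in pool}
```

Because the pool only relies on the shared `measure` interface, a new sensor type can feed the same pose estimator without changes to the rest of the simulation loop.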
On the whole, the results shown in Section 3.1 demonstrated agreement between the simulated and real sensor measurements. The RMSE values were about one tenth of the respective peak-to-peak magnitude for both the IMU and the camera classes. High correlations were achieved as well, with a minimum value of 0.90 for the y-axis of the gyroscope. On that channel, the measurement noise of the real sensor was not negligible with respect to the strength of the measured signal. The magnetometer results, in particular, are surprisingly good, because they were obtained without attempting any magnetic compensation. However, either enlarging the volume explored during the experiment or approaching a ferromagnetic object would certainly have degraded the correspondence between the simulated and real magnetometer data. In fact, as demonstrated in [20
], uncompensated magnetic distortions can severely affect the results of the magnetic sensor simulation simply because the sensed magnetic field is not the Earth’s pure magnetic field. The camera instance also replicated the actual measurements accurately. The positive results prove the overall correctness of both the simulation environment implementation and the experimental design. In fact, R values were very close to one, and the RMSEs collected in about 140 s were lower than 10 pixels. Therefore, the proposed simulation framework proved to be a suitable tool for reproducing a real multi-sensor experiment.
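The agreement metrics discussed above can be computed as in the sketch below. It assumes that R denotes the Pearson correlation coefficient and that the simulated and real signals are already time-aligned and equally sampled. The function names and the sample values are illustrative, not data from the experiment.

```python
import math

def rmse(sim, real):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((s - r) ** 2 for s, r in zip(sim, real)) / len(sim))

def pearson_r(sim, real):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(sim)
    mean_s, mean_r = sum(sim) / n, sum(real) / n
    cov = sum((s - mean_s) * (r - mean_r) for s, r in zip(sim, real))
    var_s = sum((s - mean_s) ** 2 for s in sim)
    var_r = sum((r - mean_r) ** 2 for r in real)
    return cov / math.sqrt(var_s * var_r)

def relative_rmse(sim, real):
    """RMSE expressed as a fraction of the peak-to-peak magnitude of the real signal."""
    return rmse(sim, real) / (max(real) - min(real))

# Made-up example: the simulated channel has a constant 0.5 offset.
real = [0.5, 1.5, 2.5, 3.5]
sim = [0.0, 1.0, 2.0, 3.0]
print(rmse(sim, real), pearson_r(sim, real), relative_rmse(sim, real))
```

A constant offset leaves the correlation at one while still producing a non-zero RMSE, which is why the two metrics are reported together.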
To show one of the typical applications of the presented data simulator, we performed a consistency test on an EKF-based orientation estimator relying on both IMU and vision data. Referring to Table 4, the estimator proved to be accurate, returning errors of about 0.1° (on each Euler angle) in simulation and less than 1° with real data. However, we were interested in comparing the behavior of the EKF in the two scenarios, rather than the mere errors. In fact, the usefulness of the simulated study emerged from the trends shown in Figure 7. The plot reported in Figure 7, for example, shows that, had we not performed a simulated test, we would have had no assurance about filter consistency. The real errors, i.e., the errors computed with respect to the Vicon ground-truth data, are in fact considerably larger than the estimated covariances predict. This condition is usually interpreted as a sign of filter malfunctioning. However, by performing the same test with simulated data (from the same real experiment), we obtained the expected behavior, as shown in Figure 7. The errors remained within the estimated uncertainty during the entire trial, which means that the uncertainty was reliably estimated. Therefore, in a real case scenario, the inconsistency between the estimated covariances and the estimation errors (which are relatively large compared to those obtained with simulated data) should be attributed to slight imperfections of the experimental setup (misalignments, imperfect calibrations, etc.), rather than to filter instability. In fact, it should be noted that, for our setup, we estimated that the stereophotogrammetric errors propagated to the angles of interest in this study, causing a maximal inaccuracy of 0.5°. The EKF code can therefore be considered reasonably reliable, as can the overall filter structure.
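A consistency test of this kind can be sketched as a check of the estimation errors against bounds derived from the estimated covariance. The 3-sigma threshold, the function name and the sample values below are assumptions for illustration. The text above only states that the errors were compared with the estimated uncertainty.

```python
import math

def fraction_within_bounds(errors, variances, n_sigma=3.0):
    """Fraction of samples whose |error| lies inside n_sigma * sqrt(variance).

    errors: estimation errors of one state (e.g., one Euler angle) over time.
    variances: the matching diagonal entries of the estimated covariance.
    """
    inside = sum(abs(e) <= n_sigma * math.sqrt(v) for e, v in zip(errors, variances))
    return inside / len(errors)

# Made-up example: the third error exceeds its 3-sigma bound of 0.3.
errors = [0.1, -0.2, 0.5]
variances = [0.01, 0.01, 0.01]
print(fraction_within_bounds(errors, variances))
```

A fraction close to one suggests that the filter reports its uncertainty reliably, whereas a low fraction on real data may reflect either a filter problem or, as argued above, imperfections in the reference measurements.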
Finally, the proposed simulator allows the simulation of spatial magnetic disturbance distributions such as those shown in Figure 8. This functionality has several practical applications, e.g., evaluating the robustness of orientation estimators with respect to magnetic disturbances or benchmarking localization methods based on spatial magnetic anomalies [29