Article

A Novel Shallow Neural Network-Augmented Pose Estimator Based on Magneto-Inertial Sensors for Reference-Denied Environments

1 Faculty of Engineering, University of Szeged, 6725 Szeged, Hungary
2 Department of Mechanical, Energy and Management Engineering, University of Calabria, 87036 Cosenza, Italy
3 Institute of Informatics, University of Dunaujvaros, 2400 Dunaujvaros, Hungary
* Author to whom correspondence should be addressed.
Sensors 2025, 25(22), 6864; https://doi.org/10.3390/s25226864
Submission received: 28 September 2025 / Revised: 6 November 2025 / Accepted: 7 November 2025 / Published: 10 November 2025

Abstract

Magnetic, angular rate, and gravity (MARG) sensor-based inference is the de facto standard for mobile robot pose estimation, yet its sensor limitations necessitate fusion with absolute references. In environments where such references are unavailable, the system must rely solely on the uncertain MARG-based inference, posing significant challenges due to the resulting estimation uncertainties. This paper addresses the challenge of enhancing the accuracy of position/velocity estimations based on the fusion of MARG sensor data with shallow neural network (NN) models. The proposed methodology develops and trains a feasible cascade-forward NN to reliably estimate the true acceleration of dynamical systems. Three types of NNs are developed for acceleration estimation. The effectiveness of each topology is comprehensively evaluated in terms of input combinations of MARG measurements and signal features, number of hidden layers, and number of neurons. The proposed approach also incorporates extended Kalman and gradient descent orientation filters during the training process to further improve estimation effectiveness. Experimental validation is conducted through a case study on position/velocity estimation for a low-cost flying quadcopter. This process utilizes a comprehensive database of random dynamic flight maneuvers captured and processed in an experimental test environment with six degrees of freedom (6DOF), where both raw MARG measurements and ground truth data (three positions and three orientations) of system states are recorded. The proposed approach significantly enhances the accuracy in calculating the rotation matrix-based acceleration vector. The Pearson correlation coefficient reaches 0.88 compared to the reference acceleration, surpassing 0.73 for the baseline method. 
This enhancement ensures reliable position/velocity estimation even during typical quadcopter maneuvers over 10 s timeframes (flying 50 m), with a position error margin ranging from 2 to 4 m when evaluated across a diverse set of representative quadcopter maneuvers. The findings validate the engineering feasibility and effectiveness of the proposed approach for pose estimation in GPS-denied or landmark-deficient environments, while its application in unknown environments constitutes the main future research direction.

1. Introduction

A magnetic, angular rate, and gravity (MARG) sensor unit includes accelerometer, gyroscope, and magnetometer sensors. These units have become the de facto standard in current mobile robot systems because they offer low cost, a simple interface, and high-resolution measurements. They are utilized for collecting navigation data, for example in attitude and heading reference systems (AHRS) that estimate orientation and heading information, as mentioned in [1,2,3,4]. They are also used to perform attitude control and to incorporate the results in simultaneous localization and mapping (SLAM), for example as reported in [5,6,7].
In practice, reference measurements such as GPS signals and visual or landmark information are often not available, for instance due to phase tracking loss and jamming sensitivity. This can be attributed to both the characteristics of the environment and the limitations of the embedded system, which result in serious performance degradation of both the localization algorithms and control approaches that rely on the estimated pose data [8,9]. In these cases, solely MARG-based pose estimation (i.e., 3D positions, velocity, and acceleration vector estimation) becomes crucial and can be seriously affected by the deficiencies of low-cost MARG sensors [10,11,12,13]. The inevitable noise superimposed on the measurements, temperature dependent bias, effect of gravity vector, and magnetic disturbances constitute the main error sources, which make MARG-based pose estimation a difficult problem even when required for only a short period of time [14,15]. Three-dimensional position estimation based on double integration of the acceleration signal provided by a MARG sensor is a challenge even if a robot’s 3D acceleration is known. The aforementioned uncertainties of the measurements become even more significant because of noisy environments, for example as highlighted in [16,17,18].
Pose estimation in mobile robotics is realized within a multi-sensor fusion framework, where relative measurements provide the prediction step of the extended Kalman filter (EKF) and reference measurements (e.g., from LiDAR, camera, or radio beacons) perform the correction step, compensating for drift and other numerical integration errors. Several recent works have successfully demonstrated such approaches: ref. [19], an EKF-based localization method integrating inertial measurement unit (IMU), wheel odometry, and ultra-wideband (UWB) measurements, where a complementary filter processes IMU and odometry data to generate EKF control inputs; ref. [20], an event-triggered EKF framework fusing barometer, IMU, and wheel encoder data for three-dimensional state estimation; ref. [21], a LiDAR-assisted fusion of IMU heading and wheel odometry for indoor localization; ref. [22], a UWB-, odometry-, and AHRS-based Kalman-filter fusion for precise indoor trajectory tracking; ref. [23], a wheel—inertial—visual odometry system using an RGB-D camera, IMU, and encoders enhanced by a fuzzy inference system and an iterated error-state Kalman filter; ref. [24], a multi-sensor fusion factor-graph method combining IMU, odometer, and LiDAR; and ref. [25], a UWB-, IMU-, odometry-, and laser-based fusion employing EKF for robot navigation. These methods represent recent advances that achieve high precision and robustness in robot localization under favorable sensing conditions. However, their reliability depends strongly on the continuous availability of update measurements from absolute sensors (e.g., GPS, camera, LiDAR, or radio beacons). When such reference updates are lost due to environmental or sensor limitations, the system must rely solely on prediction based on relative measurements, leading to rapid degradation of trajectory-tracking accuracy as drift accumulates. 
In such critical situations, the system’s safety and reliability can only be maintained if accurate motion and acceleration predictions can be derived purely from relative sensors such as MARG units. This inherent limitation of multi-sensor fusion frameworks motivated the present work, which focuses on achieving reliable acceleration vector estimation based exclusively on MARG-only measurements.
Quadrotors are among the most prominent applications in which reliable pose estimation is an inevitable requirement, as these platforms would otherwise damage both the hardware and their environment [26,27,28]. UAVs are increasingly employed in remote surveillance missions where their flight performance is heavily affected by uncertain environmental changes, external disturbances, and unreliable sensor information. Recent research has addressed these challenges using advanced control and estimation frameworks. Hierarchical control architectures integrating adaptive PID-based MPC have been proposed to improve optimal trajectory tracking under unpredictable environmental conditions [29], while variable disturbance observer-based controllers enhance robustness through adaptive tuning of the Q-filter according to the dynamic model and sensor noise characteristics [30]. High-gain disturbance observers have also been adopted to integrate disturbance estimation with sliding-mode control [31], and higher-order sliding-mode estimators have been utilized to handle parameter variations and perturbations, resulting in improved quadrotor dynamics [32]. More recently, data-driven approaches have emerged in which online learning-based control frameworks fuse uncertainty estimation with nominal controllers to cope with complex and time-varying dynamics [33]. Comparative analyses demonstrate that such hybrid learning-control schemes achieve an advantageous tradeoff between estimation accuracy, real-time performance, and overall flight stability, in particular those using echo state networks. Although robust control frameworks can effectively handle system uncertainties, their performance ultimately depends on the quality of the sensor data. Reliable estimation can only be achieved through accurate processing and fusion of multi-sensor information in order to provide reliable inputs to the control loops.
Consequently, recent studies focus on addressing the issue at its source, namely, developing advanced fusion algorithms that can ensure robust state estimation under dynamic conditions.
In [28], the authors reported on an inertial measurement unit-based (IMU-based) pose estimation (which relied on only accelerometer and gyroscope sensors) for quadrotors, where an unscented Kalman Filter (KF) framework was utilized with motor speed- and torque-based unique process models and IMU data driven measurement model. In a similar fashion, [34] derived sophisticated dynamical models for multirotor systems that allow for obtaining accurate state estimation when data updates from reference sensors are unavailable. However, the proposed models require detailed knowledge of the state vector characteristics and noise parameters (e.g., the covariance matrices of KF), and as such are difficult to generalize and practically set up.
Pedestrian dead reckoning is a popular application in which IMU-only algorithms are developed for pose estimation [35,36,37]. In [38], a tightly-coupled EKF framework was proposed for IMU-only pose estimation with the application of neural networks (NN). Significant advantages were demonstrated, including limited drift errors compared with other popular state estimation approaches. However, the bandwidth of the dynamics in pedestrian applications is significantly smaller than in the case of robots or multirotors. In [39], an IMU-based dead-reckoning algorithm was developed for wheeled vehicles that employed a KF and a deep NN (DNN) to robustly handle noise effects in state estimation. A novel state-space motion model was incorporated in the KF framework and deep learning was utilized to dynamically adapt the covariance matrices in the KF, yielding robust performance. Despite the technological advances of small and powerful onboard computers that can execute multiple NNs in real time, deep learning methods continue to increase computational costs significantly, reducing the applicability of these methods on standard low-cost embedded system-based robot platforms. In the work presented in [40], a triple-channel DNN was developed based on physical and mathematical models of IMUs, resulting in a significant accuracy improvement on a micro-aerial vehicle dataset.
Convolutional NNs (CNN) have also been employed for attitude estimation, for example in [17]. However, similar to DNNs, it is difficult to apply CNN networks within embedded systems because of their large capacity requirements [17]. Similarly, a bidirectional long short-term memory (LSTM) deep learning model was developed for estimating relative 6D poses from noisy low-cost IMU data in [41]. In [42], a complex DNN was employed for velocity estimation based on 9-axis IMU sensor data. Deep networks were successfully applied to estimate vehicle movement directions from accelerometer data [43]. Machine learning methods were used to estimate position, speed, and acceleration from a 9-axis IMU data in an iPhone application with the aim of statistically evaluating the behavior of the driver [44], although with some accuracy limitations.
Given the limitations of the above-mentioned approaches, shallow neural network (SNN) architectures appear to be a promising alternative for addressing the problem. Because the pose estimation problem is characterized by a dataset of relatively low input and output dimensions, the processing pipeline is based on individual samples rather than sequences of data; therefore, the SNN provides a more suitable topology than the DNN. Specifically, SNNs do not introduce unnecessary complexity or require significant memory capabilities, aligning well with the hardware and software requirements of low-cost embedded system-based architectures and allowing for efficient processing while maintaining adequate performance levels. It was shown in [45] that this approach can significantly reduce computational costs and hardware requirements while providing reliable results when no event recognition is required. The advantage of SNNs is their smaller computational cost (similar to that of an EKF) and, consequently, their applicability to microcontrollers. Recently, SNNs have also proven to be a feasible solution for the classification of vehicle movements based on accelerometer signals [46]. Cascade-forward NNs (CFNNs) are applicable for time series data [47], especially when the input signal needs to be adjusted with a correction signal. The cascade architecture enables correction of the directly calculated sensor frame acceleration with a mixture of input channels, i.e., magnetic, angular rate, and gravity signals, along with estimated orientation angles.
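To illustrate the cascade connections described above, the following minimal NumPy sketch implements a two-hidden-layer cascade-forward network in which every layer also receives a direct link from the raw input vector. The layer sizes and the 12-dimensional input (nine raw MARG channels plus three estimated orientation angles) are illustrative assumptions, not the tuned topology reported later in this paper.

```python
import numpy as np

def tansig(x):
    """Hyperbolic tangent activation (MATLAB's 'tansig')."""
    return np.tanh(x)

class CascadeForwardNN:
    """Minimal two-hidden-layer cascade-forward network: every layer
    also receives a direct connection from the raw input vector."""

    def __init__(self, n_in, n_h1, n_h2, n_out, seed=0):
        rng = np.random.default_rng(seed)
        # Hidden layer 1 sees the input only.
        self.W1 = rng.normal(0, 0.1, (n_h1, n_in))
        self.b1 = np.zeros(n_h1)
        # Hidden layer 2 sees hidden 1 output AND the raw input (cascade link).
        self.W2 = rng.normal(0, 0.1, (n_h2, n_h1 + n_in))
        self.b2 = np.zeros(n_h2)
        # Output sees hidden 2, hidden 1, and the raw input.
        self.W3 = rng.normal(0, 0.1, (n_out, n_h2 + n_h1 + n_in))
        self.b3 = np.zeros(n_out)

    def forward(self, x):
        h1 = tansig(self.W1 @ x + self.b1)
        h2 = tansig(self.W2 @ np.concatenate([h1, x]) + self.b2)
        # Linear output layer, as usual for regression targets.
        return self.W3 @ np.concatenate([h2, h1, x]) + self.b3

# Hypothetical sizing: 12 inputs (9 MARG channels + 3 orientation angles),
# 3 outputs (corrected 3D acceleration).
net = CascadeForwardNN(n_in=12, n_h1=16, n_h2=8, n_out=3)
```

The direct input-to-output links let the network learn a small correction on top of the directly calculated acceleration rather than the full mapping from scratch.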
The above comprehensive review reveals that accurate 3D position and velocity estimation based on the integration of acceleration signals remains a challenging task, primarily due to the significant uncertainty of MARG sensor measurements arising from both environmental noise and inherent sensor limitations [16,48,49]. As highlighted in the literature, when reference measurements are unavailable owing to sensory or environmental constraints, the system must rely solely on MARG data to sustain performance, which necessitates the reliable estimation of the true acceleration vector. While sophisticated analytical models can theoretically address this issue, their practical application is hindered by model complexity, high dimensionality, and the presence of unknown dynamics and parameters. Alternatively, data-driven methods such as deep neural networks (DNNs) have shown promising performance, yet their deployment in embedded systems is often infeasible due to excessive computational demands.
Motivated by these challenges, this study aims to identify an optimal SNN topology that balances computational efficiency and estimation accuracy, thereby enabling precise MARG-based acceleration estimation. The proposed approach contributes to achieving consistent velocity and position estimates over short time windows—particularly in the absence of external reference updates—based exclusively on MARG sensor inputs. To this end, the true acceleration is learned using SNNs trained on datasets generated from a validated robot–sensor model, ensuring both practical applicability and generalization capability.
The main contributions of this work are summarized as follows:
1.
A complete learning framework is developed that produces a computationally efficient pose estimation model suitable for implementation on microcontroller-based embedded systems. The core element of this framework is an SNN that achieves high estimation performance with low computational cost.
2.
A robot–sensor model driven by real-world measurements is constructed to generate the training dataset. This model provides reference data for network training together with realistic MARG sensor readings, forming a comprehensive database for MARG-only pose estimation approaches.
3.
A novel CFNN architecture with two hidden layers is proposed. The results demonstrate that this topology represents the most suitable configuration among the SNNs, effectively integrating useful information from various input sensor signals (e.g., raw MARG data and orientation estimates) to learn the complex nonlinear mapping required for estimating the true acceleration.
4.
A comprehensive performance evaluation is conducted, including different MARG data configurations within NNs and a comparison with established attitude estimation filters such as EKF-based and gradient-descent algorithms, with the results used to validate the proposed approach.
The remainder of this paper is structured as follows: Section 2 introduces the tools and methods applied to develop and evaluate the NN-based algorithms, from the real robot setup through database generation and the MARG-based algorithms to the employed NN structure and performance evaluation methods; Section 3 presents the obtained results, analyses, and the influence of the parameters on performance; Section 4 evaluates the results and provides a final overview of both the applicability and performance of the developed methods. Finally, Section 5 concludes the work.

2. Proposed Method

The main concept of the research is shown in Figure 1. The goal of the trained SNN is to achieve high accuracy in pose estimation. The mobile platform (robot), equipped with multiple sensors (e.g., IMU and encoder), executes various motions; during these motions, the raw sensor data are registered. The proposed procedure then applies a rough pose estimation block, whose aim is to approximate both the trajectories and dynamics of the executed motions. The output of this stage consists of roughly estimated 6DOF pose time series, which are supplied to a realistic simulation environment. The simulation environment utilizes a 6DOF dynamical system equipped with realistic MARG sensors, which executes the input trajectories with the desired dynamics. Because the simulated system has an exactly known dynamical model, both the true states (ground truth of spatial coordinates and orientations) of the system and the raw MARG measurements are generated in an annotated database for NN training. This annotated database is also augmented with statistical signals, which represent different features of the instantaneous MARG measurements. This yields a comprehensive database composed of multiple realistic robot maneuvers; for each scenario of executed motion, the ground truth data (target), raw sensor measurements (input), and statistical signals (input) are given. These training data are then used to train a shallow NN that estimates the 3D acceleration based on the input signals. This strategy ensures that the trained network shows both better performance and less bias in the estimated pose than the benchmark EKF-based methods during the real robot implementation.
Figure 2 shows the block diagram of the conducted work in more detail. A quadcopter constituted the benchmark setup on which the proposed work was evaluated. The detailed development process is discussed below:
1.
First, a physical low-cost quadcopter was used to generate various motions. Multiple trajectories were executed by the robot and the raw sensor data were simultaneously registered on an SD card. This process comprised more than 42 min of quadcopter flight with various maneuvers (i.e., random flights with different acceleration, loop trajectories, flip maneuvers, and dedicated rotations).
2.
Next, multiple filtration steps were executed to roughly estimate the pose of the quadcopter during the executed flights. The filtration steps included simple low- and high-pass filters and their combination, with the cut-off frequency adapted to the dynamics of the real system. The applied filters were used to both combine the raw sensor data and remove bias, thereby providing approximated trajectories. The aim of this rough estimation of position and orientation is to enable the generation of real-world scenarios with the proposed simulation model. This method ensured that the NN teaching was based on realistic maneuvers and that the generated signals had the same spectral characteristics as the real-world motions. The measurements included the accelerometer, gyroscope, magnetometer, barometer, and estimated attitude results, from which rough trajectory curves were obtained. These rough trajectories were only approximations (as the real trajectories executed by the quadcopter were unknown); however, the dynamics, shapes, and frequencies included in the obtained trajectories were the same as those in the real flight maneuvers.
3.
These roughly estimated pose data were employed within the simulation environment introduced in [49] to generate realistic sensor data along with the ground truth poses; this was a key step, since the ground truth pose was unknown during the real quadcopter flights. Based on the raw sensor data, attitude estimation was executed in two state-of-the-art filter frameworks to obtain the orientation of the quadcopter.
4.
After the database was generated, the NN architecture was set up and the NN teaching process was executed. Finally, the obtained NN was evaluated for acceleration, velocity, and pose estimation. The advantage of shallow networks over deep networks lies in their simplicity, reduced need for training data, lower susceptibility to overlearning, and easier applicability in embedded systems (with calculation requirements similar to a traditional EKF). The proposed environment executes orientation estimation with EKF and gradient descent (GRD) algorithms, which were used as input signals in the NN teaching process. This partly regularizes the network via physics (similarly to physics-enhanced NNs); moreover, it reduces the required network complexity, since the NN is supplied with compressed and semi-processed data. Additionally, statistical signals (e.g., magnitude) were utilized for similar reasons.
For performance evaluation of these topologies, Pearson's correlation coefficient was calculated, which quantifies how well the value of a dependent variable can be estimated from a particular value of an independent variable through regression equations. The proposed analysis contributed to setting up the most appropriate NN in terms of real-time pose estimation when only sparse reference signals are available.
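As an illustration, the Pearson correlation coefficient between an estimated and a reference signal can be computed as follows (a minimal NumPy sketch; the function name is our own):

```python
import numpy as np

def pearson_r(est, ref):
    """Pearson correlation coefficient between an estimated and a
    reference signal (both 1-D arrays of equal length)."""
    est, ref = np.asarray(est, float), np.asarray(ref, float)
    de, dr = est - est.mean(), ref - ref.mean()
    return float(np.sum(de * dr) / np.sqrt(np.sum(de**2) * np.sum(dr**2)))
```

A value of 1 indicates a perfect linear relationship between the estimate and the reference, while values near 0 indicate no linear dependence; the abstract's figures (0.88 vs. 0.73) are values of this coefficient.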

2.1. Accessibility Experiments with a Real Quadcopter System

The low-cost quadcopter setup used in this study is depicted in Figure 3. It was constructed from the following modules: an RD290 290 mm Carbon Fiber HexaCopter frame, 5045 Three-Blade Bullnose propellers, Video TX-TS832 600 mW, Radio RX-Frsky R9 Slim Plus, and ESC BLHeli 30A. Further, the quadcopter was equipped with an Omnibus F4 Pro control unit (Airbot Co. Ltd., London, UK), an MPU6050 accelerometer and gyroscope IMU platform (TDK InvenSense, San Jose, CA, USA), an HMC5883L magnetometer (Honeywell International Inc., USA), and a BMP180 barometer (Bosch Sensortec GmbH, Reutlingen, Germany; see the details in Table 1). All these sensors were mounted on the body of the drone, with the X-axis indicating the front of the drone, the Y-axis representing lateral movement, and the Z-axis denoting the vertical direction. The drone was controlled with a Taranis QX7 controller joystick (FrSky Electronic Co., Ltd., Wuxi, Jiangsu Province, China). During the executed flight maneuvers, measurements were collected through the Betaflight firmware logging module, with MARG signals acquired at a 1000 Hz sampling rate. Various maneuvers were performed by the quadcopter to create a comprehensive and diverse dataset. The generated dataset included typical flight trajectories, aggressive rotations and spins, and repeated loops, and is publicly available at [50].
Four scenarios are demonstrated in Figure 4: fast elevation, random flight, rotation around X, and repeated loops. These scenarios provided various maneuvers in the training database. The obtained dataset was not suitable for learning from directly due to the lack of ground truth data; for this reason, a simulation model was used to generate the training data, with the real measurements providing the input signals for the simulation.

2.2. Rough Estimation of Pose Data

As a first step, rough estimation of pose data was conducted from real quadcopter measurements, with the aim of ensuring similar maneuvers, angles, and frequencies as in reality.
The Betaflight firmware [51] computes the 3D attitude (roll, pitch, and yaw) using an embedded nonlinear complementary filter [52], which employs a proportional-integral (PI) feedback mechanism to correct gyroscope drift and fuse MARG data for attitude estimation. This filter (depicted on the left side of Figure 5) was used during the initial data collection phase to provide a rough real-time attitude estimate.
Planar coordinates (X, Y) were obtained by integrating the acceleration in the world coordinate system. The acceleration was estimated by decomposing the gravity vector from the accelerometer signals. The double integration was complemented by a first-order high-pass filter with a 0.03 Hz cut-off frequency to minimize drift. This cut-off frequency was set empirically and proved to be an effective compromise between attenuating the low-frequency drift components (mainly originating from accelerometer bias, small orientation errors causing imperfect gravity compensation, and low-frequency sensor disturbances) and preserving the genuine low-frequency motion dynamics of the quadcopter.
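The drift-limited double integration described above can be sketched as follows. This is an illustrative NumPy implementation under our own assumptions (a simple first-order IIR high-pass applied after each integration stage, cumulative-sum integration); it is not the authors' exact code.

```python
import numpy as np

def highpass(signal, fs, fc):
    """First-order IIR high-pass filter (cut-off fc in Hz, sample rate fs)."""
    dt = 1.0 / fs
    rc = 1.0 / (2.0 * np.pi * fc)
    alpha = rc / (rc + dt)
    out = np.zeros_like(signal)
    for k in range(1, len(signal)):
        out[k] = alpha * (out[k - 1] + signal[k] - signal[k - 1])
    return out

def drift_limited_double_integration(acc, fs, fc=0.03):
    """Double-integrate acceleration to position, high-pass filtering
    after each integration stage to suppress low-frequency drift."""
    dt = 1.0 / fs
    vel = highpass(np.cumsum(acc) * dt, fs, fc)
    pos = highpass(np.cumsum(vel) * dt, fs, fc)
    return pos
```

With a 0.03 Hz cut-off, constant biases and slow drifts are attenuated while maneuver dynamics well above 0.03 Hz pass through essentially unchanged.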
The altitude (Z position) was approximated using another linear complementary filter with a 2-Hz cut-off frequency, wherein the integrated Z position and barometer altitude estimation were fused. Figure 5 shows a block diagram of the rough estimation process.
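The altitude fusion step can be sketched as a linear complementary filter: the double-integrated vertical acceleration is trusted at high frequency and the barometric altitude at low frequency, with the crossover near 2 Hz. This is an illustrative discrete form under our own assumptions, not the firmware's implementation.

```python
import numpy as np

def complementary_altitude(z_acc_integrated, z_baro, fs, fc=2.0):
    """Linear complementary filter fusing the double-integrated vertical
    acceleration (high-frequency branch) with the barometric altitude
    (low-frequency branch) around an fc-Hz crossover."""
    dt = 1.0 / fs
    rc = 1.0 / (2.0 * np.pi * fc)
    alpha = rc / (rc + dt)  # weight on the high-frequency branch
    z = np.zeros(len(z_baro))
    z[0] = z_baro[0]
    for k in range(1, len(z_baro)):
        # Increment from the integrated-acceleration branch over one sample.
        dz = z_acc_integrated[k] - z_acc_integrated[k - 1]
        z[k] = alpha * (z[k - 1] + dz) + (1.0 - alpha) * z_baro[k]
    return z
```

The filter tracks fast altitude changes through the acceleration branch while the barometer branch removes the slow drift accumulated by integration.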

2.3. Simulation Environment

The limitation of the previously introduced quadcopter setup is that the ground truth data (i.e., real positions, velocities, and accelerations) are unknown during the different flights. The flight controller firmware does obtain pose-related information about the system; however, these results are based on unreliable estimations with unknown accuracy. Thus, specifying appropriate target signals for the NN was impracticable. To solve this problem, the simulation model described in our earlier work [49] was utilized to generate a database of both true system states and corresponding noisy MARG data based on the roughly estimated pose data. The derivation of this simulation model is described in detail in [49]; only key information is provided below.

2.4. Model Dynamics

The environment simulates a 6DOF mechanical system that alters the pose of the MARG platform. The system is composed of six joints, each operated in closed loop with a PID controller. The equations of motion are described in a 12-dimensional state space with state vector $x = [q, \dot{q}]^T$, where $q = [x_b, y_b, z_b, \phi, \theta, \psi]^T$ is the vector of generalized coordinates (i.e., the MARG pose). The base coordinates of the MARG unit are denoted by $x_b, y_b, z_b$, whereas the orientation is described by $\phi, \theta, \psi$. The motion equations are given in closed form as

$$\dot{x}(t) = \begin{bmatrix} \dot{q} \\ M(q)^{-1}\left(\tau_a - \tau_f - V(q, \dot{q})\right) \end{bmatrix}, \qquad y(t) = x(t). \tag{1}$$
In Equation (1), $M(q)$ represents the inertia matrix of the system and $V(q, \dot{q})$ includes the potential (gravity) force terms; $M(q)$ and $V(q, \dot{q})$ are derived from the Lagrange equations, with the variables containing the mass and inertia parameters of each joint (see [49]). Further, $\tau_a$ denotes the vector of external forces, whereas $\tau_f$ models the friction effects; each joint is characterized by both static and viscous (damping) friction. The external force of the $i$th joint ($i = 1 \dots 6$) is defined as

$$\tau_{a,i} = K_{P,i}\left(q_{d,i} - q_i\right) + K_{I,i}\int_0^t \left(q_{d,i} - q_i\right) d\xi - K_{D,i}\,\dot{q}_i + V\!\left(q_i, \dot{q}_i\right). \tag{2}$$
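A single joint's control law from Eq. (2) can be sketched as a discrete PID step. This is an illustrative implementation under our assumptions (rectangular integration of the error, the potential term passed in as a precomputed compensation value); the class and parameter names are our own.

```python
class JointPID:
    """Discrete PID controller for one simulated joint, following Eq. (2):
    tau = Kp*e + Ki*integral(e) - Kd*q_dot + potential-term compensation."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.e_int = 0.0  # running integral of the position error

    def step(self, q_d, q, q_dot, v_comp=0.0):
        """One control step: q_d is the desired joint coordinate, q the
        measured one, q_dot its rate, v_comp the potential (gravity) term."""
        e = q_d - q
        self.e_int += e * self.dt  # rectangular integration of the error
        return self.kp * e + self.ki * self.e_int - self.kd * q_dot + v_comp
```

In the simulation environment, six such controllers drive the joints so that the 6DOF test bench tracks the roughly estimated pose trajectories.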
A rational MARG model is implemented at the end of the kinematic chain, which contains scale-factor ($\Delta S$) and misalignment ($M$) errors, biases ($\bar{\omega}$, $a_0$, $h_0$), and white-noise components ($\mu$, $\nu$, $\epsilon$) according to Equation (3):

$$\begin{aligned} \Omega_k &= (I + \Delta S_\Omega)\, M_\Omega \left(\omega_k + \bar{\omega}_k\right) + \mu_k \\ A_k &= (I + \Delta S_A)\, M_A \left(\alpha_k + g_k + a_0\right) + \nu_k \\ H_k &= (I + \Delta S_H)\, M_H \left(B_{si}\, h_k + b_{hi} + h_0\right) + \epsilon_k \end{aligned} \tag{3}$$
In the MARG model, Ω denotes the gyroscope measurement, while A and H represent the accelerometer and magnetometer data, respectively. The 6DOF test bench produces the true states of the MARG (i.e., the real angular velocity ω , real external acceleration α , and real magnetic field vector h). Further, an artificial magnetic disturbance generator is employed in the environment to simulate realistic disturbances during the flights. This simulation environment (see Figure 6) was used to generate a database of true system states along with realistic MARG data.
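A simplified instance of the Eq. (3) error model can be sketched as follows. For brevity, this illustrative version sets the misalignment matrices to identity, uses a common scalar scale factor, and omits the soft-iron term; all parameter values are assumptions for demonstration only.

```python
import numpy as np

def marg_measurement(omega, alpha, h, g, rng,
                     dS=0.02, bias_w=0.01, bias_a=0.05, bias_h=0.1,
                     sigma=(0.005, 0.02, 0.05)):
    """Sketch of the Eq. (3) MARG error model: scale-factor, bias, and
    white-noise corruption of the true angular rate (omega), external
    acceleration (alpha), and magnetic field (h); g is the gravity vector."""
    scale = (1.0 + dS) * np.eye(3)  # (I + dS) with identity misalignment
    Omega = scale @ (omega + bias_w) + rng.normal(0.0, sigma[0], 3)
    A = scale @ (alpha + g + bias_a) + rng.normal(0.0, sigma[1], 3)
    H = scale @ (h + bias_h) + rng.normal(0.0, sigma[2], 3)
    return Omega, A, H
```

Driving this model with the true states produced by the 6DOF test bench yields synthetic MARG readings that are paired with exact ground truth, which is precisely what the NN training database requires.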

2.5. Error Analysis

It is important to note that the aim is to derive the trajectories from real sensor data, despite their lack of precision in terms of derived spatial position signals; the rationale behind this approach is to fill the database with authentic data in order to ensure that the generated trajectories contain realistic acceleration maneuvers akin to those performed by the real drone. The error of the rough estimation stage (Section 2.2) does not significantly impact the performance, since these errors primarily affect the specific trajectories within the dataset rather than the coherence between the ground truth and sensor data in the training and validation databases. Instead, what truly influences the performance of NNs are the underlying characteristics of velocity and acceleration during flight.
Comparative statistics show the error analysis of the rough estimation stage in Figure 7, providing an evaluation of the proposed methodology. This involves taking the modeled sensor measurements (Equations (1)–(3)), running them through the rough estimation algorithm, and then comparing the results to those obtained from the estimation algorithm using the original sensor measurements. The difference between these two sets of movement signals represents the error of the rough estimation algorithm while also confirming the characteristic similarities between the real flight movements and the modeled flight movements (i.e., it confirms both the precision and validity of this stage). Figure 7 compares these two sets of signals for a flying scenario in which:
(A)
The graph shows the estimated position components based on the original sensor data (Xb, Yb, Zb) and modeled sensor data (Xc, Yc, Zc) using the same rough estimation algorithm.
(B)
The graph shows the regression analysis of the obtained position signals, where the title contains important metrics such as the Pearson correlation (r = 0.894), root mean square (rms = 18.5 m), and number of used samples (N = 598125).
(C)
The graph shows the estimated velocities based on the original sensor data (VXb, VYb, VZb) and modeled sensor data (VXc, VYc, VZc) using the same rough estimation algorithm. As stated previously, these signals represent the maneuver characteristics rather than exact trajectories.
(D)
The graph shows the spectrogram of the velocity signals, where the rows of the spectrogram are listed in the following order: VXb, VXc, VYb, VYc, VZb, VZc. This graph provides a three-dimensional report in which the horizontal axis represents time, the vertical axis represents frequency, and the amplitude is color-coded. The first two spectrogram rows are the X channels, the third and fourth rows highlight the Y channels, and the last two rows show the Z channels. The similarity of these color strips confirms the similar spectral characteristics of the pairwise signals.
Figure 7 highlights that the simulation model driven by the rough estimation stage fulfills its intended role, namely, reliably capturing the characteristics of flight maneuvers. This means that the modeling stage reproduces the original flight characteristics, allowing the modeling-based database to inherit realistic flight movement characteristics. As a result, both the robustness and reliability of the proposed method are ensured despite any potential inaccuracies in the rough estimation stage.

2.6. Attitude and External Acceleration Estimation

Two popular filter structures were utilized to estimate the orientation based on the noisy measurements.

2.6.1. Extended Kalman Filter

The EKF executes both the quaternion ($q_k$) and gyro-bias ($\bar{\omega}_k$) propagation based on gyroscope measurements ($\Omega_k$) to obtain the a priori estimate $\hat{x}_k^- = [\hat{q}_k^-, \hat{\bar{\omega}}_k^-]^T$. The update phase determines the a posteriori estimate $\hat{x}_k = [\hat{q}_k, \hat{\bar{\omega}}_k]^T$ by incorporating the accelerometer and magnetometer measurements $z_k = [A_k^S, H_k^S]^T$ into the model (i.e., measurements in the sensor frame). This is a traditional EKF-based state estimation framework and provides a good balance between estimation accuracy and computational efficiency [53].
The predict phase outputs the a priori state estimate $\hat{x}_k^-$ and the corresponding error covariance $P_k^-$. The quaternion propagation ($\hat{q}_k^-$) is derived by the integration of the rate of change of attitude:

$$\hat{x}_k^- = \begin{bmatrix} \hat{q}_k^- \\ \hat{\bar{\omega}}_k^- \end{bmatrix} = f(\hat{x}_{k-1}, u_k), \quad \text{where } \hat{q}_k^- = \hat{q}_{k-1} + \frac{T_s}{2}\, Q(\hat{q}_{k-1}) \begin{bmatrix} 0 \\ \Omega_k - \hat{\bar{\omega}}_{k-1} \end{bmatrix} \text{ and } \hat{\bar{\omega}}_k^- = \hat{\bar{\omega}}_{k-1},$$
$$P_k^- = \Phi P_{k-1} \Phi^T + \Sigma_Q, \quad \text{where } \Phi = \left.\frac{\partial f}{\partial x}\right|_{\hat{x}_{k-1}}. \tag{4}$$
The correct phase obtains the a posteriori estimate $\hat{x}_k$ and the corresponding error covariance $P_k$. The predicted observations $\hat{z}_k$ are obtained by the quaternion product-based (⊗) coordinate transformations of the reference vectors $g^E, h^E$:

$$\hat{z}_k = \begin{bmatrix} \hat{A}_k^S \\ \hat{H}_k^S \end{bmatrix} = h(\hat{x}_k^-), \quad \text{where } \hat{A}_k^S = {}^{S}_{E}\hat{q}_k^- \otimes g^E \otimes {}^{S}_{E}\hat{q}_k^{-*} \text{ and } \hat{H}_k^S = {}^{S}_{E}\hat{q}_k^- \otimes h^E \otimes {}^{S}_{E}\hat{q}_k^{-*},$$
$$G_k = P_k^- \Psi^T \left( \Psi P_k^- \Psi^T + \Sigma_R \right)^{-1}, \quad \text{where } \Psi = \left.\frac{\partial h}{\partial x}\right|_{\hat{x}_k^-},$$
$$\hat{x}_k = \begin{bmatrix} \hat{q}_k \\ \hat{\bar{\omega}}_k \end{bmatrix} = \hat{x}_k^- + G_k \left( z_k - \hat{z}_k \right), \quad P_k = \left( I - G_k \Psi \right) P_k^-. \tag{5}$$
In Equations (4) and (5), $\hat{A}_k^S$ denotes the estimated accelerometer measurement in the sensor frame, $\hat{H}_k^S$ indicates the estimated magnetometer measurement, $T_s$ represents the sampling period, $G_k$ is the Kalman gain, and the quaternion matrix is denoted by $Q(\hat{q}_{k-1})$. Moreover, $\Sigma_Q$ and $\Sigma_R$ denote the process and measurement noise covariance matrices, respectively. The derivation of the aforementioned equations is described in detail in our earlier work [54].
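As a minimal sketch of the predict phase in Equation (4), assuming the Hamilton quaternion convention and a random-walk bias model (the function names below are illustrative, not the original implementation):

```python
import numpy as np

def quat_matrix(q):
    # Q(q): matrix that maps a quaternion p to the product q ⊗ p (Hamilton convention)
    w, x, y, z = q
    return np.array([[w, -x, -y, -z],
                     [x,  w, -z,  y],
                     [y,  z,  w, -x],
                     [z, -y,  x,  w]])

def ekf_predict(q_prev, bias_prev, gyro, Ts):
    """A priori quaternion/bias propagation of Eq. (4): integrate the
    bias-corrected angular rate over one sampling period Ts."""
    omega = gyro - bias_prev                        # bias-corrected rate
    dq = 0.5 * quat_matrix(q_prev) @ np.hstack(([0.0], omega))
    q_pred = q_prev + Ts * dq
    q_pred /= np.linalg.norm(q_pred)                # keep a unit quaternion
    return q_pred, bias_prev                        # bias modeled as a random walk
```

The covariance propagation and the correct phase of Equation (5) follow the standard EKF recursion and are omitted here for brevity.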

2.6.2. Gradient Descent-Based Orientation Filter

This algorithm is characterized by simple structure and shows good performance even in dynamic environments. It incorporates the accelerometer and magnetometer readings within a simple gradient descent algorithm to derive the quaternion derivative, which compensates for the gyroscope measurement error (drift correcting step):
$$f_k = \begin{bmatrix} \hat{A}^S(\hat{q}_k) - A_k^S \\ \hat{H}^S(\hat{q}_k) - H_k^S \end{bmatrix}, \quad \nabla f_k = J_k^T f_k, \quad \hat{q}_{k+1} = \hat{q}_k + T_s \left( \frac{1}{2}\, Q(\hat{q}_k) \begin{bmatrix} 0 \\ \Omega_k \end{bmatrix} - \beta\, \frac{\nabla f_k}{\| \nabla f_k \|} \right),$$
where β denotes the learning rate of the GRD algorithm and J k denotes the Jacobian matrix of the objective function f k [55]. The gain of the filter is obtained based on earlier findings [49].
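A minimal sketch of one drift-correcting update step is given below, using only the accelerometer (gravity) term of the objective function for brevity; the full filter also includes the magnetometer term. All names are illustrative, and the measured acceleration is assumed to be normalized:

```python
import numpy as np

def quat_mult(q, p):
    # Hamilton quaternion product q ⊗ p
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = p
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def grd_update(q, gyro, acc, Ts, beta=0.1):
    """One gradient descent orientation filter step (accelerometer term only)."""
    w, x, y, z = q
    # objective: gravity reference rotated into the sensor frame minus measurement
    f = np.array([2*(x*z - w*y)       - acc[0],
                  2*(w*x + y*z)       - acc[1],
                  2*(0.5 - x*x - y*y) - acc[2]])
    J = np.array([[-2*y,  2*z, -2*w, 2*x],
                  [ 2*x,  2*w,  2*z, 2*y],
                  [ 0.0, -4*x, -4*y, 0.0]])
    grad = J.T @ f
    grad /= np.linalg.norm(grad) + 1e-12            # normalized gradient direction
    q_dot = 0.5 * quat_mult(q, np.hstack(([0.0], gyro))) - beta * grad
    q_new = q + Ts * q_dot
    return q_new / np.linalg.norm(q_new)
```

When the sensor is level and at rest (acc = (0, 0, 1), zero gyro rate), the correction term vanishes and the quaternion is unchanged.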

2.6.3. External Acceleration Estimation

External acceleration was estimated based on the orientation obtained with the EKF or GRD algorithm, i.e., as the difference between the rotated raw accelerometer measurement and the gravity vector:
$$\hat{\alpha}_k = R\!\left({}^{E}_{S}\hat{q}_k\right) A_k^S - g^E,$$
where $g^E = (0, 0, 9.8)^T$ denotes the constant gravity reference vector in the inertial frame and $R\!\left({}^{E}_{S}\hat{q}_k\right) \in SO(3)$ indicates the rotation matrix, which is constructed from the estimated quaternion.
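The external acceleration computation can be sketched as follows, assuming a Hamilton-convention quaternion-to-rotation-matrix conversion (function names are illustrative):

```python
import numpy as np

def quat_to_rotmat(q):
    # rotation matrix R(q) taking sensor-frame vectors into the inertial frame
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)]])

def external_acceleration(q, acc_sensor, g=np.array([0.0, 0.0, 9.8])):
    """Rotate the raw accelerometer vector into the inertial frame and
    subtract the gravity reference, as in the equation above."""
    return quat_to_rotmat(q) @ acc_sensor - g
```

For a hovering platform at identity orientation, the accelerometer reads (0, 0, 9.8), so the estimated external acceleration is zero, as expected.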

2.7. Neural Network Model

Each training task requires a specific network architecture and algorithm parameterization [56]. In this research, a minimally sized shallow network was sought that could effectively process the sensor and orientation data sample-by-sample.
Cascade-forward NNs (CFNNs) are similar to feed-forward networks, but include connections from the input and every previous layer to the following layers. The cascade topology of the network supports the "correction signal" logic, which fits well with knowledge-based algorithms. In this work, we compare one hidden layer with a larger number of neurons (CFNN1) against two hidden layers with a smaller number of neurons (CFNN2).
Time-delay NNs (FTDNNs) are regularly used for time series signal processing [57] and exhibit memory-like behavior similar to recurrent NNs. Time series data often exhibit temporal dependencies, with the current value depending on past values. By including delayed input samples, the NN can capture these dependencies, allowing the model to consider the history of the time series, which is crucial for making accurate predictions. We also used this network type for trial and comparison.
Therefore, this research tested the following three types of NNs for accurate 3D acceleration estimation (see Figure 8):
1.
CFNN1—Cascade-forward NN with one hidden layer, which provides lower nonlinear capabilities. No time delay.
2.
CFNN2—Cascade-forward NN with two hidden layers, with the aim of providing higher nonlinear capabilities. No time delay.
3.
FTDNN—Focused time-delay NN with three time delays (1:3) and one hidden layer; for example, with 16 real input channels this network artificially adds an extra 48 input neurons, resulting in 64 input neurons altogether.
Each developed shallow NN constitutes a regression model with 3–16 dimension inputs and three-dimensional output. Table 2 provides a network structure comparison that lists the key parameters of the addressed architectures. Table 3 summarizes both the characteristics of the utilized database and the inputs and outputs of the addressed problem.
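The cascade topology described above can be illustrated with a minimal forward pass of the CFNN2 variant (logistic sigmoid hidden layers, linear output, as listed in Section 2.9); the weight names and shapes below are illustrative:

```python
import numpy as np

def logsig(x):
    # logistic sigmoid activation used in the hidden layers
    return 1.0 / (1.0 + np.exp(-x))

def cfnn2_forward(u, W1, b1, W2, b2, W3, b3):
    """Forward pass of a two-hidden-layer cascade-forward network (CFNN2):
    every layer also receives the raw input vector u via cascade connections."""
    h1 = logsig(W1 @ u + b1)                     # hidden layer 1 (sees u)
    h2 = logsig(W2 @ np.hstack((u, h1)) + b2)    # hidden layer 2 (sees u and h1)
    y = W3 @ np.hstack((u, h1, h2)) + b3         # linear output (sees u, h1, h2)
    return y
```

For the CFNN2H34 configuration with 16 inputs and 3 outputs, the shapes would be W1: (34, 16), W2: (34, 50), and W3: (3, 84).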

2.8. Dataset Splitting and Performance Evaluation

The original measurements were sampled at 1000 Hz; however, they were downsampled to 100 Hz because the position estimation stage was executed at lower frequencies. This sampling rate is sufficient for both tracking and controlling dynamical systems. Joining all the available scenarios yielded 2500 s of measurements (250,000 samples), which were split into three sets with the scenarios ordered randomly: training (TR, 70%, 175,000 samples), validation (VA, 15%, 37,500 samples), and test (TE, 15%, 37,500 samples). The samples were used in their original order because the FTDNN requires previous samples, which were repeated at the beginning as needed.
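The ordered 70/15/15 split can be sketched as follows (an illustrative helper, not the original code):

```python
def split_indices(n, tr=0.70, va=0.15):
    """Order-preserving 70/15/15 split (the FTDNN's delayed inputs require
    consecutive samples); scenarios are assumed to be shuffled beforehand."""
    n_tr = int(n * tr)
    n_va = int(n * va)
    return (range(0, n_tr),               # training set
            range(n_tr, n_tr + n_va),     # validation set
            range(n_tr + n_va, n))        # test set
```

Applying this to the 250,000 available samples reproduces the 175,000/37,500/37,500 set sizes reported above.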
The performance of the velocity or acceleration estimation can be represented by Pearson's correlation coefficient, as in [58]. A value >0.9 is interpreted as good accuracy, since the baseline filter methods reach only 0.5–0.7 (see Table 4). The estimation performance in this research was calculated as the Pearson correlation between the ground truth and estimated accelerations, applied to the three directions (X, Y, Z) and to the overall samples A:
$$r(a) = \frac{\sum_i (a_i - \bar{a})(b_i - \bar{b})}{\sqrt{\sum_i (a_i - \bar{a})^2}\, \sqrt{\sum_i (b_i - \bar{b})^2}},$$
where i, b, and a represent the sample iterator, reference acceleration with its average value b ¯ , and estimated acceleration with its average value a ¯ , respectively.
The aforementioned approach provides unitless comparison metrics. The correlation is calculated separately for the X, Y, and Z axis channels and for all three axes at the same time (denoted with A), which represents the overall performance. The three axes are joined with simple concatenation A = [X; Y; Z]. It is possible to calculate the following four performances for the three sets of data, i.e., training: rTR(X), rTR(Y), rTR(Z), rTR(A); validation: rVA(X), rVA(Y), rVA(Z), rVA(A); and test: rTE(X), rTE(Y), rTE(Z), rTE(A), of which the independent test set performance rTE(A) is used for the evaluation.
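The per-axis and concatenated correlation computation can be sketched as follows (illustrative names):

```python
import numpy as np

def pearson_r(a, b):
    """Pearson correlation between estimated (a) and reference (b) signals."""
    a = np.asarray(a, float).ravel()
    b = np.asarray(b, float).ravel()
    ac, bc = a - a.mean(), b - b.mean()
    return float(ac @ bc / np.sqrt((ac @ ac) * (bc @ bc)))

def axis_performances(est, ref):
    """est/ref: (N, 3) arrays; returns r(X), r(Y), r(Z), and the overall r(A)
    computed on the simple concatenation A = [X; Y; Z]."""
    rs = {axis: pearson_r(est[:, i], ref[:, i]) for i, axis in enumerate("XYZ")}
    rs["A"] = pearson_r(est.T.ravel(), ref.T.ravel())
    return rs
```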
The observed overfitting was marginal because of the small number of neurons in the network compared to the large number of training samples. The training correlation performance was slightly higher than the test performance; however, this was a very small difference, e.g., rTR(A)-rTE(A) ≈ 0.005 for the largest NN that was trained. Consequently, the overfitting can be considered negligible for these shallow networks.

2.9. Network Initialization and Training Method

Levenberg–Marquardt backpropagation was used for the aforementioned SNNs because it provides better performance than gradient-based training. The GPU could not be utilized for faster training, since GPU calculations support only gradient training and not Jacobian training. The Levenberg–Marquardt backpropagation algorithm approximates the Hessian matrix, as in quasi-Newton methods; its mathematical description can be found in the official documentation [59].
The weights of NNs must be initialized to small random numbers because of the stochastic optimization algorithm [60]. Owing to both this stochastic nature and the randomness of the initialization, the same network trained on the same data can produce different results. Validating with only one network provides a rough performance estimate, as performance varies between differently initialized networks; however, in this paper, only one network was trained for each configuration due to the long calculation time. CPU-based training requires 3–24 h per network (on a desktop PC with an Intel i5 CPU and 16 GB RAM). We accept this performance estimate because its variation was only var(r) ≈ 0.003, verified over 30 Monte Carlo repetitions with random samples and random network initialization.
Because the embeddability of the target NN is highly dependent on its computational resource requirements, it is important to choose the network size such that the number of neurons is reduced as much as possible while keeping performance as high as possible.
The executed training process was based on the default hyperparameters provided by MATLAB's cascadeforwardnet function for the following reasons.
1.
Dataset size: Our dataset was sufficiently large, which often mitigates the need for extensive hyperparameter tuning. With a large dataset, NNs tend to exhibit robustness to variations in hyperparameters.
2.
Convergence: We observed satisfactory convergence of the NN training process with the default parameters. This convergence indicates that the default parameters were suitable for our dataset and task.
3.
Low variability: The variability in the performance of the NN outputs was minimal, indicating that the default parameters provided both stable and consistent results across different runs.
The utilized hyperparameters are provided as follows.
  • Learning rate: 0.01
  • Activation function of hidden layers: logistic sigmoid
  • Activation function of output layer: linear activation function
  • Training algorithm: Levenberg–Marquardt backpropagation
  • Mini-batch size: entire training dataset
  • Performance function: mean squared error (MSE)
  • Regularization: no regularization
  • Normalization: no normalization
  • Stop criteria
    -
    Maximum number of epochs: 1000
    -
    Maximum time: infinity (no limit)
    -
    Performance goal: 0 (perfect case)
    -
    Maximum gradient: 1 × 10⁻⁷
    -
    Damping parameter ( μ ): 0.001
    -
    Training stops: if μ > 1 × 10¹⁰
    -
    Maximum validation checks: 6

3. Results

The final solution was selected after three performance-based evaluations: first, the combination of input channels; second, the contribution of additional statistical feature channels; and third, the optimal number of neurons. A network with fast training characteristics and a single hidden layer (CFNN1) was used in the first benchmark, whereas a network with two hidden layers (CFNN2) was employed in the second benchmark. Finally, the larger networks were run with more neurons. Keeping this order of training helped to shorten the whole training procedure.
The first set of comparisons covered both the sensor data and the orientation filter outputs (generated by the EKF and GRD algorithms). The second set then incorporated additional statistical channels into the performance evaluation process, e.g., the magnitudes of the accelerometer and gyroscope signals.
The third comparison investigated the optimal size of the networks. The selected final best model consisted of a CFNN2 architecture with H = 34 hidden neurons (network CFNN2H34), and the development leading to this solution is presented in what follows.

3.1. First Performance Comparison—Raw Sensor Data and Orientation Angles

Table 4 lists the obtained performance results, which include both the reference baseline methods and the NN models used for comparison. All performance results are expressed with correlation indexes between the ground truth acceleration and estimated acceleration. Table 4 lists three groups of results, which are discussed as follows.
1.
The reference baseline group shows the baseline method estimation results without NN elaboration. In the evaluated cases, the orientation is estimated with the EKF or GRD method based on the accelerometer, gyroscope, and magnetometer measurements (Acc + Gyr + Mag).
2.
The applicable NN group represents the elaborated solutions that can be applied in real mobile robot systems. The input relies fully on the accelerometer, gyroscope, and magnetometer measurements (Acc + Gyr + Mag).
3.
The reference NN group uses the ground truth orientation instead of the estimated orientation (ANG), which proves the theoretical maximum performance in the case of a perfect attitude estimation.
The best results are marked in bold for each group. The abbreviations are as follows: Acc, Gyr, Mag, EKFe, EKFq, GRDe, GRDq, ANGe, and ANGq represent the accelerometer, gyroscope, magnetometer, orientation estimation with EKF in Euler representation, orientation estimation with EKF in quaternion representation, orientation estimation with GRD in Euler representation, orientation estimation with GRD in quaternion representation, ground-truth orientation in Euler representation, and ground-truth orientation in the quaternion representation, respectively. The acceleration results generated by already-existing methods (i.e., EKFa generated by EKF and GRDa generated by GRD) are included; for example, in Table 4, AccGyrMagEKFeEKFa indicates that 3D acceleration EKFa results are included along with the estimated 3D Euler representation EKFe. The abbreviation of the NNs comprises two parts: NN topology and number of hidden neurons; e.g., CFNN1H18 means a cascade-forward network with one hidden layer (CFNN1) and 18 neurons per hidden layer (H18).
Based on the achieved results, the performance evaluation is discussed as follows. The EKF method yielded the highest correlation result in the baseline group (r = 0.725). The best reference NN was the AccGyrANGe version (r = 0.964); the version with an additional magnetometer (AccGyrMagANGe) performed very similarly. These results prove that the most important factor for NN-based acceleration estimation is the correctness of the estimated attitude angles, as the reference NNs using the ground truth attitude reached fairly high correlation results. The best applicable NN was the AccGyrMagGRDe combination (r = 0.848), which lies halfway between the baseline and theoretical solutions. The quaternion or Euler format of the orientation only slightly influenced the performance. Including the outputs (GRDa, EKFa) of the baseline methods as NN inputs did not increase the performance.
The input channel importance test indicated that NNs mostly relied on the accelerometer and magnetometer channels (see details in Appendix B).

3.2. Second Performance Comparison—Additional Features

In the second benchmark, an NN architecture with higher nonlinearity was used. The CFNN2H15 network was characterized by two hidden layers, each with 15 neurons. The required training time of this network was five times that of the CFNN1H18 network used in the first comparison. As additional inputs, some simple feature channels were included; namely, IMUf comprises four features calculated from the accelerometer and gyroscope signals (see Table 5). The last two features are averages over the last 20 samples, representing the signal characteristics of the last 200 ms. This time window was defined empirically.
Table 6 summarizes the obtained performance results. The first row highlights the best applicable NN solution from the previous benchmark (Table 4), which is included in the table for comparison. Two reference NNs were evaluated to determine how the true angle influences the network performance.
The best applicable NN was driven by the AccGyrMagGRDeIMUf input combination. This NN received 16 channels: three accelerometer inputs (AX, AY, AZ), three gyroscope inputs (GX, GY, GZ), three magnetometer inputs (MX, MY, MZ), three estimated angles based on the gradient method (GrdX, GrdY, GrdZ), and four feature channels (ACCmag, GYRmag, ACCma20, ACCdma20). This NN architecture is evaluated in different tests in the next subsection and in two additional tests in Appendix A and Appendix B.

3.3. Third Performance Comparison—Number of Neurons

Three types of NNs were evaluated in a comparison to analyze the number of hidden neurons; the results are highlighted in Figure 9. The network had 16 input channels; therefore, the optimal applicable number of neurons was estimated as twice the number of input channels, i.e., 32 neurons. The overfitting was measured as the performance difference between the training, validation, and test sets of samples. The overfitting rate in the case of the maximum number of neurons (H50) was only Δr = 0.005, which is similar to the measurement error expressed from performance variability owing to the NN initialization. NN configurations above 50 neurons (50 + 50 neurons in the case of CFNN2) would have high memory and processing requirements in a real embedded implementation. Therefore, NNs with >50 neurons were not evaluated in this comparison; even >35 neurons per layer is not considered applicable for embedding, as it would require more resources than an EKF. In the case of CFNN2, there were two hidden layers; therefore, this network had double the number of neurons. This is the principal reason why it outperformed the CFNN1 version with only a single hidden layer. With 32 neurons, CFNN1 reached the performance level of CFNN2 with 10 + 10 neurons. This indicates the advantage of a double hidden layer over a single hidden layer, even one with twice the number of neurons. The maximum performance was reached when CFNN2 had 45 neurons; however, the version with 34 neurons was selected due to its similarly high performance. Based on these results, the CFNN2H34 configuration is proposed for subsequent applications.
The FTDNN starts with lower performance; however, above 24 neurons it yields performance similar to CFNN1. Moreover, the FTDNN shows higher performance variability, which suggests that its performance depends more on the weight initialization (in other words, the training algorithm is less effective for it). We conclude that the FTDNN type is not worth using for acceleration estimation compared to CFNN-type networks.

4. Discussion

The accuracy of MARG-based 3D position estimation is influenced by the errors of both the NN-based estimated acceleration and the double integration method. The proposed NN-based acceleration estimation allows the 3D position of moving objects to be derived for short time windows, i.e., estimation results with a length of 5–10 s are characterized as reliable outputs. Table 7 shows typical 3D position errors after 5, 10, and 15 s for the double-integrated acceleration signals obtained with the GRD method (GRDa), the EKF method (EKFa), and the developed shallow network (CFNN2H34). The typical values are expressed as the median over 100 sample periods taken from the validation test set. While every method shows high position errors, the proposed NN-based method is characterized by significantly smaller errors: 0.5–2 m after 5 s, 2–4 m after 10 s, and 5–10 m after 15 s. Figure 10 supports these results by highlighting the median position error (Euclidean distance) as it grows over time (0–15 s) for the proposed method (CFNN2H34) and the two baseline methods (GRDa, EKFa). This provides a more intuitive visualization of the performance improvement.
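The double integration used to obtain position from the estimated acceleration can be sketched with simple Euler integration (illustrative; the paper's exact integration scheme may differ):

```python
import numpy as np

def double_integrate(acc, Ts, v0=None, p0=None):
    """Velocity and position from estimated acceleration by simple double
    (Euler) integration; drift accumulates quickly, so only short time
    windows yield reliable position estimates."""
    acc = np.asarray(acc, float)
    v0 = np.zeros(acc.shape[1]) if v0 is None else v0
    p0 = np.zeros(acc.shape[1]) if p0 is None else p0
    vel = v0 + np.cumsum(acc * Ts, axis=0)   # first integration: velocity
    pos = p0 + np.cumsum(vel * Ts, axis=0)   # second integration: position
    return vel, pos
```

A constant 1 m/s² acceleration along X for 5 s at Ts = 0.01 s yields a final velocity of 5 m/s and a position of about 12.5 m, illustrating how quickly small acceleration errors translate into position drift.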
Figure 11, Figure 12 and Figure 13 depict a few scenarios of the estimated quadcopter movements obtained by simple double integration of the estimated acceleration. The figures plot 30 s integration results, with samples taken from the validation set. The accumulated drift error generates misleading results very quickly; however, the proposed NN shows the trajectory shape most similar to the ground truth.
The left side of Figure 11, Figure 12 and Figure 13 highlights the NN-based position (XYZ*), velocity (Vel*), and acceleration (Acc*) signal coordinates compared to the ground truth coordinates, while the right side shows the executed spatial trajectory with the black curve (i.e., the ground truth). In addition, the red curve indicates the 3D position derived by the NN-based acceleration estimation, the green curve highlights the 3D position obtained based on the output of GRD algorithm, and the blue curve indicates the 3D position derived based on the Extended Kalman filter results.
The runtime of the proposed NN was compared to the EKF runtime using the same PC and the same samples. Although the MATLAB implementations of these algorithms do not exactly represent the runtime in an embedded system, this provides a rough estimate of the expected processing resources. The analysis shows that the runtime of CFNN2H34 is approximately 20% smaller than that of the EKF, based on 1000 samples. This means that the computational cost of the EKF is similar to that of running a small shallow NN. The proposed CFNN2H34 requires 5449 floating point operations (FLOPs) per input sample (including addition, multiplication, division, and exponentiation). This is a very low computational load that is well within the capabilities of typical low-cost microcontrollers (e.g., ARM Cortex-M4/M7), which are rated for millions of FLOPs per second (MFLOPs). Additionally, the network has a total of 2567 trainable parameters (weights and biases). Storing these as 32-bit floating-point numbers requires approximately 10.3 kB, which is a minimal footprint for modern microcontrollers. The RAM required for storing neuron activations during a forward pass is negligible, on the order of a few hundred bytes. The proposed model is extremely lightweight, orders of magnitude smaller than even simple DNNs such as LeNet-5 (≈60 k parameters, 1–10 MFLOPs) or small LSTM networks (≈50–200 k parameters, 1–5 MFLOPs). These low computation and memory requirements strongly support the feasibility of implementing the proposed model on low-cost embedded hardware.
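Assuming full cascade connections (each layer fed by the raw input and all previous layers), the reported parameter count of CFNN2H34 can be reproduced; the helper below is illustrative:

```python
def cfnn2_params(n_in=16, h1=34, h2=34, n_out=3):
    """Trainable parameter count of a two-hidden-layer cascade-forward
    network: every layer receives the input and all previous layers."""
    p1 = h1 * n_in + h1                        # hidden layer 1: weights + biases
    p2 = h2 * (n_in + h1) + h2                 # hidden layer 2: sees input and h1
    p3 = n_out * (n_in + h1 + h2) + n_out      # output layer: sees input, h1, h2
    return p1 + p2 + p3
```

With 16 inputs, 34 + 34 hidden neurons, and 3 outputs, this yields exactly 2567 parameters, matching the value reported above (578 + 1734 + 255).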
The aforementioned results show that the proposed NN-based 3D tracking algorithm provides useful information about the movement of dynamical systems such as quadcopters. The algorithm is useful when the localization of a moving object must be obtained solely from MARG data because of environmental limitations; for instance, GPS signals are commonly characterized by phase tracking loss, jamming sensitivity, and a discontinuous nature. In these cases, appropriate robot poses must be supplied to the path planning and control algorithms, otherwise the system may damage its environment. The proposed method provides significantly better estimates than the benchmark approaches, as clearly presented in Figure 11, Figure 12 and Figure 13 as well as Table 7. This method can be improved in the future with more sophisticated integration methods and the incorporation of additional sensor signals (e.g., barometer signals are very useful for altitude estimation). Similarly, evaluating the impact of hyperparameters on the calculation results could contribute to a comprehensive understanding and optimization of the proposed NN models. These issues are left open for future studies. Although the presented network was trained on data from a single quadcopter platform, thereby capturing platform-specific dynamics and sensor noise characteristics, the proposed methodology remains generic and can be straightforwardly applied to other systems through retraining with platform-specific datasets.

5. Conclusions

This paper proposes a new procedure for improved 3D pose estimation based on MARG data fusion in combination with an SNN. The proposed approach is based on four main stages: data acquisition on a real dynamical system, rough pose estimation with baseline filtering, application of both model dynamics and sensor models to generate a database for NN training, and an optimal NN architecture suited for use in microcontroller-based systems and developed in combination with an orientation filter (EKF or GRD) for improved pose estimation. A specific case study has been investigated by reference to a quadcopter scenario. In particular, a database of real quadcopter maneuvers containing both realistic maneuvers and magnetic disturbances was generated by an appropriate simulation environment to successfully train NNs without significant overfitting. A comprehensive evaluation of NN setups was conducted, including the impact of raw sensor inputs, derived signals and signal features, sampling rate, channel importance, number of hidden layers, and number of neurons.
It was found that a CFNN with two hidden layers and 2 × 34 neurons was the optimal topology for accurate MARG-only acceleration estimation. The results showed that this NN can reach a significantly higher performance compared to the state-of-the-art methods (EKF or GRD). The Pearson correlation between the reference and estimated acceleration for the presented shallow NN was 0.878, whereas in the case of EKF and GRD we obtained results of 0.725 and 0.491, respectively. The performance of the proposed NN increased with more hidden neurons; however, as its implementation was also considered, we set 35 neurons as the maximum number per layer in the investigation process. The acceleration vector provided by the obtained topology yielded significantly more reliable position/velocity estimates than the baseline methods. Based on the performed quadcopter flights, the NN-based approach was characterized by a position estimation accuracy with an error range of 0.6–1.5 m in a 5 s time window and 1.9–4.2 m in a 10 s time window.
As supervised methods, shallow NNs represent powerful models for both aggregating available measurements and estimating the external accelerations of mobile platforms. Augmenting an orientation filter with an SNN results in a MARG-only pose estimator that provides reliable position/velocity estimates for shorter time windows in reference-denied environments. Future work will include validation of the proposed method during quadcopter maneuvers, where GPS measurements will be used to generate the reference trajectories. In addition, the integration of invariant filters (e.g., invariant EKF) will be investigated in order to improve robustness and accuracy in dynamic maneuvers.

Author Contributions

Conceptualization, A.O.; methodology, A.O. and I.K.; software, A.O., P.S. and I.K.; validation, A.O., P.S., G.C., P.O. and I.K.; formal analysis, A.O. and I.K.; investigation, A.O., P.S., and I.K.; resources, A.O., P.S., G.C., P.O. and I.K.; data curation, A.O. and I.K.; writing—original draft preparation, A.O. and I.K.; writing—review and editing, P.S., G.C. and P.O.; visualization, A.O., P.S. and I.K.; supervision, A.O.; project administration, A.O.; funding acquisition, A.O. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Research, Development, and Innovation Fund of Hungary through project no. 142790 under the FK_22 funding scheme.

Data Availability Statement

The data supporting the findings of this study are available at [50].

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ACC (Acc): Accelerometer signal
ANGe: Orientation estimation in Euler format
ANGq: Orientation estimation in Quaternion format
CFNN: Cascade-Forward Neural Network (a specific type of shallow network)
CFNN1H18: CFNN with 1 hidden layer and 18 neurons
CFNN2H34: CFNN with 2 hidden layers and 2 × 34 neurons
CNN: Convolutional Neural Network (not a shallow network)
DNN: Deep Neural Network (i.e., not a shallow network)
DOF: Degrees of Freedom
EKF: Extended Kalman Filter
EKFe: EKF-based orientation signal in Euler format
EKFq: EKF-based orientation signal in Quaternion format
FTDNN: Focused Time-Delay Neural Network
GPS: Global Positioning System
GRD: Gradient Descent algorithm
GRDe: GRD-based orientation signal in Euler format
GRDq: GRD-based orientation signal in Quaternion format
GYR (Gyr): Gyroscope signal
IMU: Inertial Measurement Unit
MAG (Mag): Magnetometer signal
MARG: Magnetic, Angular Rate, and Gravity
NN: Neural Network
RPY: Roll–Pitch–Yaw signal
SNN: Shallow Neural Network

Appendix A. Effect of Sampling Rate

In this analysis, the NN performance is examined based on the sampling rate. Our choice of default sampling rate was 100 Hz; however, an analysis was also conducted to better understand the nature of the system. The sampling rates were selected based on integer divisors of the original 1000 Hz, i.e., the divisors 50 (1000 Hz/50 = 20 Hz), 40, 32, 25, 20, 16, 12, 10 (default: 1000 Hz/10 = 100 Hz), 8, 6, and 4 were evaluated. The best NN architecture was employed; however, a lower number of neurons was applied to collect the results more quickly. The obtained trend is interesting: Figure A1 shows the achieved r(X), r(Y), r(Z), and r(A) performance results as a function of the input sampling rate. The trend shows a slight performance decrease for increased sampling rates, without any large fluctuations. This outcome was expected, as the lower frequency components can still be estimated while the high frequencies are filtered out at lower sampling rates.
The aforementioned analyses suggest that high frequencies are not required for position estimation by double integration of the estimated accelerometer signals. The appropriate sampling rate can be even lower than the currently used 100 Hz. It is also shown that the estimation accuracy does not degrade at lower sampling rates (e.g., from 100 Hz to 50 Hz), and in some cases slightly improves. This is a significant finding for practical implementation. It means that the sampling rate can be reduced to 50 Hz without sacrificing performance, thereby halving the computational load. This is an excellent tradeoff for resource-constrained systems.
Figure A1. NN performance based on different sampling rates.

Appendix B. Channel Importance Test

The contribution of each input channel to the NN-based acceleration estimation was tested in a channel skip test: one channel was skipped at a time and the NN was retrained to check the extent of the performance drop. The performance of each output channel (r(X), r(Y), r(Z)) depends on individual input channels; therefore, these three output performances were evaluated separately. For example, based on our observations, the MX input channel influences the r(X) output performance but not r(Z). The results are depicted alongside the overall performance r(A) in Figure A2, which shows the outcome of the channel skip test on two NNs with different numbers of neurons, i.e., 15 (CFNN2H15) and 25 (CFNN2H25). The first point in Figure A2 represents the original performance including all 16 channels according to the previously selected configuration, AccGyrMagGRDeIMUf. The performance drop varies between 0% and 10%, with an average of approximately 1%, representing a relatively small change. This confirms that redundancy exists among the selected 16 channels; the remaining channels can substitute for the information of the missing channel. In addition, the relative importance of each input channel to each output channel can be observed through the performance metrics:
1. The X output channel relies heavily on the AZ, AX, and MX input channels.
2. The Y output channel relies heavily on the AY, AZ, and MY input channels.
3. The Z output channel relies heavily on the AZ channel.
The biggest drop in overall performance was caused by the absence of the accelerometer or magnetometer channels. It can therefore be concluded that the NNs rely mostly on the accelerometer and magnetometer channels, as these sensors supply absolute information about the reference vectors.
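The channel skip procedure can be sketched as follows. In this hypothetical Python example, a linear least-squares model stands in for the CFNN and synthetic data replaces the flight database; one input channel is removed at a time, the model is refitted, and the drop in Pearson correlation is recorded. A channel with a redundant copy produces only a small drop, mirroring the behavior observed above.

```python
import numpy as np

rng = np.random.default_rng(0)

def pearson(a, b):
    """Pearson correlation coefficient between two 1-D signals."""
    return np.corrcoef(a, b)[0, 1]

def fit_and_score(X, y):
    """Least-squares fit (a stand-in for NN training), scored by Pearson r."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return pearson(X @ w, y)

# Synthetic stand-in data: 16 input channels, one target channel.
n, n_ch = 2000, 16
X = rng.standard_normal((n, n_ch))
X[:, 1] = X[:, 0] + 0.1 * rng.standard_normal(n)  # channel 1 nearly duplicates channel 0
y = X[:, 0] + 0.5 * X[:, 2] + 0.1 * rng.standard_normal(n)

r_full = fit_and_score(X, y)                      # performance with all channels
drops = [r_full - fit_and_score(np.delete(X, ch, axis=1), y)
         for ch in range(n_ch)]                   # channel skip test

# Channel 0 is covered by its redundant copy (small drop);
# channel 2 has no substitute (large drop).
```

The paper's test additionally evaluates the drop per output channel (r(X), r(Y), r(Z)); the sketch uses a single output for brevity.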
Figure A2. Input channel importance test. The performance drop was compared when skipping each of the input channels. Two NNs are depicted: CFNN2H15 and CFNN2H25.

Figure 1. Block diagram of the applied method. The colors represent three main research phases: the rough estimation stage based on robot motions is indicated in green; the training of NNs is highlighted in red; and the real robot activities are indicated in purple.
Figure 2. Block diagram of the NN development process. The colors represent three main research phases: the rough estimation stage based on quadcopter flights is indicated in green (estimated position vector is XYZ~, while estimated attitude expressed with roll, pitch, and yaw is RPY~); the training of NNs for acceleration estimation using the generated database is highlighted in red; and the NN implementation process to the robot is indicated in purple. NN-based estimates are indicated with an asterisk in the superscript, whereas EKF- and GRD-based estimates are denoted with a hat symbol in the superscript.
Figure 3. Low-cost quadcopter setup used to generate input signals for database generation: (a) quadcopter in flight during measurement, (b) the utilized quadcopter setup without the battery, and (c) the main components of the quadcopter and their interconnections.
Figure 4. Various scenarios performed by the quadcopter: fast elevation, random flight, rotation around the X coordinate, and repeated loops (GYR denotes gyroscope measurements in the first row, ACC denotes accelerometer data in the second row, and RPY denotes the calculated attitude in the third row).
Figure 5. Block diagram of preliminary rough estimation of quadcopter pose (XYZ~ and RPY~).
Figure 6. Block diagram of test environment used for the generation of the coherent database of true poses and realistic MARG data.
Figure 7. Validation of the model-driven database generation process with error analysis of the rough estimation stage.
Figure 8. Block diagram of both CFNNs (with one and two hidden layers) and FTDNN with 16 input, 25 hidden, and three output neurons; the structures were generated using MATLAB R2024b.
Figure 9. NN performance comparison based on the number of hidden neurons.
Figure 10. Median position error (Euclidean distance) over time for the proposed CFNN2H34 method and the two baseline algorithms (GRDa and EKFa).
Figure 11. Three-dimensional position, velocity, and acceleration estimation based on the obtained NN model. The acceleration was estimated by the best applicable NN (CFNN2H34) on AccGyrMagGRDeIMUf.
Figure 12. Another example of 3D position, velocity, and acceleration estimation based on the obtained NN model (second scenario).
Figure 13. Third scenario of 3D position, velocity, and acceleration estimation based on the obtained NN model.
Table 1. Specifications of the sensors used in the measurement setup.

| Sensor | Type | Key Parameter | Range | Resolution/Noise |
|---|---|---|---|---|
| MPU6050 | 3-axis gyroscope | Angular velocity | User-programmable: ±250 deg/s to ±2000 deg/s | 16-bit ADC; noise: ∼0.005 deg/s/√Hz |
| MPU6050 | 3-axis accelerometer | Acceleration | User-programmable: ±2 g to ±16 g | 16-bit ADC |
| HMC5883L | 3-axis magnetometer | Magnetic field | ±8 Gauss (full scale) | 12-bit ADC; field resolution: 2 milligauss; noise floor: 2 milligauss |
| BMP180 | Barometric pressure, temperature, and altitude sensor | Pressure | 300 to 1100 hPa (+9000 m to −500 m relative to sea level) | 16- to 19-bit output; noise (high-resolution mode): 0.02 hPa (∼0.17 m) |
Table 2. Key structural parameters of the evaluated neural network architectures.

| Architecture | Input Dimension (Real Channels) | No. of Hidden Layers | Neurons per Hidden Layer | Total Hidden Neurons | Time Delay |
|---|---|---|---|---|---|
| CFNN1 | 16 | 1 | H (variable, 10–50 tested) | H | No |
| CFNN2 | 16 | 2 | H (variable, 10–50 tested) | 2 × H | No |
| FTDNN | 16 | 1 | H (variable, 10–50 tested) | H | Yes (1:3 delays) |
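As a structural illustration (not the authors' MATLAB implementation), the cascade-forward topology of CFNN1/CFNN2 can be sketched in Python: every layer, hidden and output alike, also receives a direct connection from the network input, so layer l sees the concatenation of the raw input and all previous activations. The layer widths below (16 inputs, H = 15 per hidden layer, 3 outputs) follow Table 2; the random weights are placeholders for trained parameters.

```python
import numpy as np

class CascadeForwardNN:
    """Minimal cascade-forward forward pass: each layer's input is
    [network input, a_1, ..., a_{l-1}]. Weights are random placeholders."""

    def __init__(self, n_in, hidden, n_out, rng):
        self.W, self.b = [], []
        feed = n_in                             # width of the concatenated feed
        for width in hidden + [n_out]:
            self.W.append(0.1 * rng.standard_normal((feed, width)))
            self.b.append(np.zeros(width))
            feed += width                       # later layers also see this output

    def forward(self, x):
        feed = x
        for i, (W, b) in enumerate(zip(self.W, self.b)):
            z = feed @ W + b
            if i == len(self.W) - 1:            # linear output layer
                return z
            feed = np.concatenate([feed, np.tanh(z)], axis=-1)

# CFNN2 with 16 inputs, two hidden layers of H = 15 neurons, and 3 outputs
net = CascadeForwardNN(16, [15, 15], 3, np.random.default_rng(0))
out = net.forward(np.zeros((4, 16)))            # batch of 4 zero samples
```

Note how the output layer's weight matrix spans 16 + 15 + 15 = 46 rows; this direct input-to-output path is what distinguishes a cascade-forward network from a plain feed-forward one.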
Table 3. Overview of the database channels and their characteristics.

| Source | Channel | Symbol | Unit | Fs (Hz) | Role |
|---|---|---|---|---|---|
| Sensors on quadcopter and Betaflight calculations | 3D accelerometer sensor signal | ACCb | m/s² | 1000 | Input measurements for database generation based on the 6-DOF test environment |
| | 3D gyroscope sensor signal | GYRb | deg/s | 1000 | |
| | 3D magnetometer sensor signal | MAGb | µT | 1000 | |
| | Barometer (altitude) sensor signal | BARb | m | 1000 | |
| | 3D attitude (orientation) | RPYb | rad | 1000 | |
| Rough estimation of quadcopter pose | Rough 3D position | XYZ∼ | m | 1000 | |
| | Rough 3D attitude (orientation) | RPY∼ | rad | 1000 | |
| Output of 6-DOF test environment | Reference 3D acceleration | True Acc (ground truth) | m/s² | 1000 | Target for NN training |
| | Reference 3D position | True XYZ (ground truth) | m | 1000 | 3D position validation |
| | Noisy 3D accelerometer sensor signal | Acc | m/s² | 1000 | NN inputs |
| | Noisy 3D gyroscope sensor signal | Gyr | deg/s | 1000 | |
| | Noisy 3D magnetometer sensor signal | Mag | µT | 1000 | |
| Filter algorithms | Orientation estimation by EKF | EKFe, EKFq | rad | 100 | NN inputs (RPY^ in Figure 2) |
| | Orientation estimation by GRD | GRDe, GRDq | rad | 100 | |
| | 3D acceleration estimation based on EKF orientation results | EKFa | m/s² | 100 | Baseline methods (Acc^ in Figure 2) |
| | 3D acceleration estimation based on GRD orientation results | GRDa | m/s² | 100 | |
| Additional features (IMUf) | Magnitude of 3D accelerometer signal | ACCmag | m/s² | 100 | NN inputs |
| | Magnitude of 3D gyroscope signal | GYRmag | deg/s | 100 | |
| | Moving average of the accelerometer signal over the last 20 samples | ACCma20 | m/s² | 100 | |
| | Moving average of the accelerometer derivatives over the last 20 samples | ACCdma20 | m/s² | 100 | |
| NN output and derived signals | Estimated 3D acceleration | Acc* | m/s² | 100 | NN output |
| | Estimated 3D velocity | Vel* | m/s | 100 | Derived output |
| | Estimated 3D position | XYZ* | m | 100 | Validation output |
Table 4. First performance comparison of NNs on different inputs.

| Group | Model and Channels | Input Dim | X | Y | Z | All |
|---|---|---|---|---|---|---|
| Reference (baseline) | Sensor acceleration (Acc) | 3 | 0.422 | 0.205 | 0.891 | 0.265 |
| | GRD on AccGyrMag (GRDa) | 9 | 0.178 | 0.354 | 0.782 | 0.491 |
| | EKF on AccGyrMag (EKFa) | 9 | 0.548 | 0.479 | 0.933 | 0.725 |
| Applicable NN (CFNN1H18) | NN on Acc | 3 | 0.538 | 0.210 | 0.933 | 0.674 |
| | NN on AccGyr | 6 | 0.570 | 0.565 | 0.940 | 0.758 |
| | NN on AccMag | 6 | 0.709 | 0.611 | 0.946 | 0.803 |
| | NN on AccEKFe | 6 | 0.676 | 0.562 | 0.943 | 0.783 |
| | NN on AccEKFq | 7 | 0.650 | 0.575 | 0.941 | 0.779 |
| | NN on AccGRDe | 6 | 0.573 | 0.412 | 0.935 | 0.721 |
| | NN on AccGRDq | 7 | 0.567 | 0.414 | 0.933 | 0.717 |
| | NN on AccGyrEKFe | 9 | 0.697 | 0.639 | 0.947 | 0.808 |
| | NN on AccGyrGRDe | 9 | 0.593 | 0.602 | 0.941 | 0.772 |
| | NN on AccGyrMag | 9 | 0.728 | 0.703 | 0.946 | 0.830 |
| | NN on AccGyrMagEKFe | 12 | 0.751 | 0.702 | 0.948 | 0.837 |
| | NN on AccGyrMagGRDe | 12 | 0.779 | 0.728 | 0.945 | 0.848 |
| | NN on AccGyrMagEKFeEKFa | 15 | 0.745 | 0.722 | 0.949 | 0.841 |
| | NN on AccGyrMagEKFeGRDa | 15 | 0.766 | 0.730 | 0.949 | 0.848 |
| | NN on AccGyrMagGRDeGRDa | 15 | 0.762 | 0.729 | 0.946 | 0.845 |
| | NN on AccGyrMagGRDeEKFa | 15 | 0.778 | 0.722 | 0.948 | 0.848 |
| Reference NN (CFNN1H18) with ground truth angles | NN on AccANGe | 6 | 0.969 | 0.955 | 0.959 | 0.957 |
| | NN on AccGyrANGe | 9 | 0.977 | 0.973 | 0.960 | 0.964 |
| | NN on AccGyrANGq | 9 | 0.959 | 0.956 | 0.960 | 0.956 |
| | NN on AccGyrMagANGe | 12 | 0.977 | 0.973 | 0.960 | 0.964 |

Performance values (X, Y, Z, All) are correlation coefficients; bold numbers indicate the best results within each group.
Table 5. The additional feature channels (IMUf).

| Name [Unit] | Description | Equation |
|---|---|---|
| ACCmag [m/s²] | Magnitude of 3D accelerometer signal | $ACC_{mag,i} = \sqrt{AX_i^2 + AY_i^2 + AZ_i^2}$ |
| GYRmag [deg/s] | Magnitude of 3D gyroscope signal | $GYR_{mag,i} = \sqrt{GX_i^2 + GY_i^2 + GZ_i^2}$ |
| ACCma20 [m/s²] | Moving average of the accelerometer signal over the last 20 samples | $ACC_{ma20,i} = \frac{1}{20}\sum_{k=i-19}^{i}\left(\lvert AX_k\rvert + \lvert AY_k\rvert + \lvert AZ_k\rvert\right)$ |
| ACCdma20 [m/s²] | Moving average of the accelerometer derivatives over the last 20 samples | $ACC_{dma20,i} = \frac{1}{20\,T_s}\sum_{k=i-19}^{i}\left(\lvert \Delta AX_k\rvert + \lvert \Delta AY_k\rvert + \lvert \Delta AZ_k\rvert\right),\ \Delta A_k = A_k - A_{k-1}$ |
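These feature channels can be computed directly from the raw samples. The following Python sketch (a hypothetical re-implementation, not the authors' code) realizes the four IMUf equations of Table 5 with a causal 20-sample window; `Ts` denotes the sampling period.

```python
import numpy as np

def imu_features(acc, gyr, Ts=0.01, window=20):
    """Compute the IMUf feature channels of Table 5.
    acc, gyr: (N, 3) arrays of accelerometer [m/s^2] and gyroscope [deg/s] samples."""
    acc_mag = np.linalg.norm(acc, axis=1)            # ACCmag
    gyr_mag = np.linalg.norm(gyr, axis=1)            # GYRmag

    abs_sum = np.abs(acc).sum(axis=1)                # |AX| + |AY| + |AZ|
    d_abs_sum = np.abs(np.diff(acc, axis=0, prepend=acc[:1])).sum(axis=1)

    kernel = np.ones(window) / window
    # full convolution truncated to N samples gives a causal window over the
    # last `window` samples (the first window-1 values are zero-padded)
    acc_ma20 = np.convolve(abs_sum, kernel)[: len(abs_sum)]       # ACCma20
    acc_dma20 = np.convolve(d_abs_sum, kernel)[: len(d_abs_sum)] / Ts  # ACCdma20
    return acc_mag, gyr_mag, acc_ma20, acc_dma20

# hovering-like sanity check: constant 1 g on Z, no rotation
acc = np.tile([0.0, 0.0, 9.81], (50, 1))
gyr = np.zeros((50, 3))
am, gm, ma, dma = imu_features(acc, gyr)
```

For a constant signal, ACCmag stays at 9.81 m/s², the moving average converges to the same value once the window is full, and the derivative feature is zero.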
Table 6. The second benchmark performance comparison of NNs on different inputs.

| Group | Model and Channels | Input Dim | X | Y | Z | All |
|---|---|---|---|---|---|---|
| Best from Table 4 | NN on AccGyrMagGRDe | 12 | 0.779 | 0.728 | 0.945 | 0.848 |
| Applicable NN (CFNN2H15) | NN on AccGyrMag | 9 | 0.764 | 0.771 | 0.952 | 0.860 |
| | NN on AccGyrMagIMUf | 13 | 0.776 | 0.767 | 0.953 | 0.862 |
| | NN on AccGyrMagEKFe | 12 | 0.771 | 0.769 | 0.952 | 0.861 |
| | NN on AccGyrMagGRDe | 12 | 0.801 | 0.783 | 0.951 | 0.871 |
| | NN on AccGyrMagEKFeIMUf | 16 | 0.786 | 0.774 | 0.954 | 0.867 |
| | NN on AccGyrMagGRDeIMUf | 16 | 0.805 | 0.805 | 0.952 | 0.878 |
| | NN on AccGyrMagEKFeEKFa | 15 | 0.777 | 0.768 | 0.952 | 0.862 |
| | NN on AccGyrMagGRDeGRDa | 15 | 0.809 | 0.798 | 0.952 | 0.877 |
| CFNN2H15 with ground truth angles | NN on AccGyrMagANGe | 12 | 0.977 | 0.973 | 0.960 | 0.964 |
| | NN on AccGyrMagANGeIMUf | 16 | 0.977 | 0.972 | 0.962 | 0.965 |

Performance values (X, Y, Z, All) are correlation coefficients; bold numbers indicate the best results within each group.
Table 7. Typical 3D position errors after 5, 10, and 15 s of double-integrated acceleration estimated by GRDa, EKFa, and CFNN2H34.

| Time | GRDa X [m] | GRDa Y [m] | GRDa Z [m] | EKFa X [m] | EKFa Y [m] | EKFa Z [m] | CFNN2H34a X [m] | CFNN2H34a Y [m] | CFNN2H34a Z [m] |
|---|---|---|---|---|---|---|---|---|---|
| 5 s | 7.23 | 4.97 | 1.59 | 4.17 | 3.75 | 1.45 | 1.51 | 1.15 | 0.61 |
| 10 s | 54.69 | 13.42 | 6.37 | 11.22 | 11.78 | 5.23 | 4.26 | 3.55 | 1.89 |
| 15 s | 142.11 | 35.20 | 13.73 | 19.16 | 18.33 | 9.44 | 10.04 | 7.37 | 4.59 |

Bold values denote that the best performance is obtained by CFNN2H34.
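The rapid growth of the GRDa and EKFa errors in Table 7 is the usual signature of double integration: even a constant residual acceleration bias b maps to a position error of roughly 0.5·b·t². The Python sketch below (using an illustrative, hypothetical bias of 0.1 m/s², not a value from the paper) reproduces this quadratic growth at the 100 Hz rate used in the study.

```python
import numpy as np

def drift_from_bias(bias, t_end, fs=100):
    """Final position error from double-integrating a constant
    acceleration bias (rectangle-rule integration at rate fs)."""
    dt = 1.0 / fs
    vel = np.cumsum(np.full(int(t_end * fs), bias) * dt)  # v(t) = b*t
    pos = np.cumsum(vel * dt)                             # p(t) ~ 0.5*b*t^2
    return pos[-1]

for t in (5, 10, 15):
    print(f"{t:2d} s: {drift_from_bias(0.1, t):6.2f} m")
```

Tripling the horizon from 5 s to 15 s multiplies the bias-induced error by roughly nine, which is why reducing the residual acceleration error, as the NN does, pays off disproportionately over longer windows.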

Share and Cite

MDPI and ACS Style

Odry, A.; Sarcevic, P.; Carbone, G.; Odry, P.; Kecskes, I. A Novel Shallow Neural Network-Augmented Pose Estimator Based on Magneto-Inertial Sensors for Reference-Denied Environments. Sensors 2025, 25, 6864. https://doi.org/10.3390/s25226864
