A Study about Kalman Filters Applied to Embedded Sensors

Over the last decade, smart sensors have grown in complexity and can now handle multiple measurement sources. This work establishes a methodology to achieve better estimates of physical values by processing raw measurements within a sensor using multi-physical models and Kalman filters for data fusion. A driving constraint being production cost and power consumption, this methodology focuses on algorithmic complexity while meeting real-time constraints and improving both precision and reliability despite low power processors limitations. Consequently, processing time available for other tasks is maximized. The known problem of estimating a 2D orientation using an inertial measurement unit with automatic gyroscope bias compensation will be used to illustrate the proposed methodology applied to a low power STM32L053 microcontroller. This application shows promising results with a processing time of 1.18 ms at 32 MHz with a 3.8% CPU usage due to the computation at a 26 Hz measurement and estimation rate.


Introduction
Since the "Smart Dust" project [1] from Berkeley in 1999, Smart Sensors technologies and the Internet of Things (IoT) have been growing fields of research, focussing on data collection and interpretation [2]. The currently most used paradigm is to measure raw data from the sensor and send the data to the cloud, or a computer, in order to be processed by complex algorithms [3,4]. Another school behind those devices is to process sensor data within the sensor and provide the user with a readily usable, filtered and normalized measurement. The resulting lower data throughput and lower latency allow for lower consumption in wireless applications and easier measurement usage. The ultimate goal would be compensating all sensor dispersion and deviation, independently of environmental conditions, resulting in smart self correcting sensors.
The current and developing processing capabilities of sensors and the growing demand for Smart Sensors, in a wide array of fields from hobbies to industrial automation, call for new embedded, in-line, real-time data processing applications. One of these applications is automatic and continuous sensor calibration [5] with correction over time. Such a sensor does accelerate the manufacturing processes and provides more precise measurements over time, without costly human intervention or sensor replacement due to age related deviation.
State of the art lab sensors, such as pH-meters or spectrometers, use systematic manual recalibration procedures before each measurement to ensure good environmental conditions, dispersion, and deviation compensation. This method is, by definition, not applicable to embedded sensors providing continuous and autonomous measurement without any external intervention.
As most sensing elements are sensitive to multiple physical parameters, it is theoretically possible to automatically enhance measurement precision by merging complementary sensors data using multi-physical parameters estimation. One such common method was initially presented by Kalman in 1960 [6] and is known as the Kalman filter. This filter is a specialization of Bayesian filters [7] restricted to discrete time, linear systems with Gaussian noise, and a state space model of the system. Such a filter allows estimating the internal state of a system based on its measurements and model.
For the last 50 years, Kalman filters and its extensions for non-linear systems, Extended Kalman Filter (EKF) and Unscented Kalman Filter (UKF) [8], have been wildly used in various applications such as satellites, spacecrafts or planes to help automatic control of the systems. Due to its computing complexity, it has however not been wildly used in embedded systems until recent improvements to microcontrollers technologies and processing power. Such applications includes smartphones or drones for pose estimation.
This paper discusses a multi-sensor and multi-physical model coupled with a Kalman filter to achieve precise continuous estimation of a physical value without environmental bias while constrained to low processing power of embedded systems. Additionally, an automatic system re-calibration procedure in known conditions is derived.
The remainder of this paper is organized as follows: Section 2 summarizes the technical background used for this work; Section 3 exposes the proposed methodology; Section 4 applies the methodology to a 2D orientation estimation problem based on inertial measurement units; Section 5 exposes and discusses the results obtained with the application; Section 6 references the used materials and methods; and Section 7 describes the conclusions and the future of this work.

Technical Background
A multi-physical system model is a multiple input/multiple outputs (MIMO) system with physical values for some of its inputs, calibration values for some of its parameters and multi-parametric equations relying on these values for its measurement outputs. Such a system can be described using the state space representation using two equations: an evolution equation and an output/measurement equation.

Modeling the System
The state space representation [9] is a common MIMO system modeling toolkit. It relies on three vectors and two equations to describe the relation between the system commands, its state, and its outputs: • an input vector U, containing all the system known inputs; • a state vector X, containing the system internal state, which will evolve depending on the inputs; • an output vector Y, containing the system outputs/measurement; • an evolution Equation (1), describing the evolution of the internal state of the system, depending on the previous state and the command input vector; and • a measurement Equation (2), describing the measurements at the output of the system depending on its state and its command input.
This representation can be applied to large systems composed of multiple subsystems, whether those are coupled, uncoupled, or unidirectionally coupled-respectively dependent, independent, or semi-dependent. Unless coupled, the larger system can be decomposed into the sum of its uncoupled subsystems, enabling major optimizations for computational complexity [10].

Kalman Filters
As exposed earlier, Kalman filters rely on the state representation of a system. They are specialized Bayesian estimators for linear systems with discrete time and Gaussian noises. To do the estimation, the Kalman filter updates Equations (3) and (4) to Equations (5) and (6), where V k and W k are state and measurement noise vectors.
The Kalman filter is a recursive filter (i.e., it uses the output of its previous corrected estimation to process the next one). This process can be represented by Figure 1, with each step detailed by Table 1. This filter tends to reduce the quadratic error ofX k . Once the model is fixed, the fitting of the filter is done by adjusting the values of covariance matrices of V k and W k , [Q] and [R] respectively, and the initial estimated covariance of X, [P 0 ]. Step

Kalman Filter Real System
This filter has excellent estimation performances on well known linear system. For non-linear system, extensions have been developed, the best known being the Extended Kalman Filter (EKF) and the Unscented Kalman Filter (UKF).

The Extended Kalman Filter
The first, and easiest to understand, method to handle a non-linear system is the well-known EKF.
To estimate a non-linear system, the EKF filter does a local linearization of the system equations around the current estimated state. The probability densities of the estimated state and measurement vectors are obtained through the usage of the non-linear function for the mean value and the multiplication of the standard deviation of the estimated state by the corresponding linear function ( Figure 2). As the Kalman filter, the EKF considers the system as noisy, the non-linear equations for the system (Equations (1) and (2)) are transformed into Equations (7) and (8). In the case of state spaces represented systems, this is done by the computation of the Jacobian matrices of the evolution function (Equation (7)). [F X ] is the state relative Jacobian and [F V ] is the noise relative Jacobian.
Using the same pattern for the measurements, the state relative Jacobian will be named [H X ] and the noise relative one [H W ].
The whole estimation process is expose in Table 2. Step

EKF Real System
EvolutionX This method's main asset is its simplicity, as the operations are identical to the standard Kalman filter. The only addition is the computation of Jacobian matrices and the usage of non-linear functions for evolution and measurement predictions.
A known limitation is the slower convergence and instability of EKF compared to UKF when applied to systems with high non-linearities [11].

The Unscented Kalman Filter
To avoid large non-linearity estimation problems due to the local linearization used by the EKF, the UKF [12] was developed based on a mix between the Particle filter [13] and the Kalman filter. The method is based on the Unscented Transform to propagate the probability density directly through the non-linear function.

The Unscented Transform
The Unscented Transform (Appendix A) replaces the approximated linear projection of the Gaussian noise through the linear function by the projection of 2n + 1 weighted points, n being the size of the state vector, through the non-linear function. The mean and standard deviation of the weighted projected points is then computed to approximate the new, more precise, Gaussian density function at the output [8,14,15].  As shown in Figure 3, the estimated probability density used by the UKF is far more precise than the EKF in the case of highly non-linear systems: the output of the studied function is limited to [1; 2], the corresponding probability density should be zero outside these boundaries. In the case of the local-linearization method with the selected input probability density, the probability of a value higher than 2 is still important. On the other hand, with the Unscented Transform projection, the estimated probability density is far more coherent with the theoretical one.
Using this method, the computation for the UKF is decomposed into three steps: system evolution, measurements projection and correction.

System Evolution
• Generate a weighted point set for the following state estimation.
With X k a [n, 2n + 1] matrix representing the 2n + 1 states to propagate, weighted with ω c and ω µ previously computed according to Appendix A. The spread of those states around the mean value is adjusted using the λ parameter. The square root of a matrix is defines as , as explained in [16], and implies [A] is a square matrix. • Propagate the state through the evolution functioñ • Compute the projection statistics using the Unscented method Measurements Projection • Generate a weighted point set from the estimated statẽ • Propagate the points through the measurement functioñ • Estimate the mean and covariance of the measurement • Estimate the crossed covariance between the state and measurement • Correct the stateX With its precise projection, the UKF is much faster to converge and gives more precise results on highly non-linear systems. This precision is possible at the expense of a far more computational estimation process.

Algorithm Complexity and Computing Power
As the goal of this work is to provide a real-time estimation of the physical values on embedded low-power hardware, the processing time of the used algorithms has to be taken into account during the design. In this section, the algorithmic complexity of the different kinds of Kalman filters will be discussed, and compared, taking into account the processing capabilities of common microcontrollers.
To compare the algorithm complexities, it is mandatory to choose a complexity indicator. The commonly used indicators are: • the processed lines counts to do an operation; • the number of Multiplication and Accumulation (MAC) operations; and • the number of Floating point Operations (FLOP) ( i.e., the number of operation on "Real numbers" in the algorithm).
In the case of operations on microcontrollers and matrix related operations, the FLOP is the most representative indicator for the algorithms complexity. Using this indicator, the mathematical operations relative to Kalman filtering will be discussed in the next part.

Algorithms Complexity
Kalman filtering is all about matrices and vectors operations, from the simple addition of two vectors to the inversion of a matrix. In those kinds of operations, algorithmic complexity can be expressed in relation with the vectors and or matrices dimensions.
As an example, the steps required for the computation of the average of the components of a vector of size n are: • one affectation for the initialization of the sum variable; • one addition and affectation per element (n addictions and affectations); • one division; and • one affectation for the result.
Let's define the following complexity indicators: • T( f (n)), the number of operations to be executed to solve the problem; and Then, the algorithmic complexity of the averaging operation is T(3 + 2n) with a complexity order of O(2n). However, as the affectations can be considered as simple operations, the results can be simplified as T(1 + n), with an order of O(n).
Following the same method, we can get the complexities of every matrix operation as exposed in Table 3. Table 3. Matrix operations complexity sum-up.

T(.) O(.)
Matrix multiplication 2 × n × m × p 2 × n × m × p Adding two vectors of size n n n Adding two matrices of size (n, m) n × m n × m Transpose a matrix 0 0 Invert a matrix 4 × n 3 + 2 × n 2 4 × n 3 Mean vector of a matrix n × (m + 1) n × m Mean value of a vector n + 1 n Covariance de deux matrices In the case of Kalman filters, the matrix inversions can be simplified by a factor of 2 as the matrix to invert is Hermitian. The Cholesky inversion can be used in this case, giving a T(2n 3 + n 2 ) complexity, with an order of O(2n 3 ) [17].
From this point, the Kalman filters complexity are shown in Table 4, using n as the state vector size, m as the measurement vector size and p as the command vector size. Table 4. Kalman filtering complexity depending on n, m and p.

Algo
Opération As a result, the Kalman filters computing complexities are summed-up in the Table 5. (E)KF 4n 3 + 4m 3 + 6m 2 n + 4n 2 m + 3n 2 + · · · 4n 3 UKF 10n 3 + 4n 2 m + 14m 2 n + 23n 2 + 6m 2 + · · · 10n 3 Using this analysis, the UKF algorithm demands about twice the computing time of an equivalent EKF algorithm, which can be decisive in small applications. Moreover, the computational complexity of these filters grows extremely fast with the size of the system model, limiting their real-time usage to bounded complexity system model, with reasonable state, command, and measurement vectors size.
It also has to be noted that the previous study does not take into account the processing time of the non-linear functions called by the EKF and UKF algorithms.

A Computing Power Overview
Embedded systems, and thus Smart Sensors, are mainly targeting low power consumption, since most of them are battery powered, and aim for low manufacturing costs. Consequently, such systems are often designed around single core microcontroller architectures with low operating frequencies. Moreover, only high-end microcontrollers implement hardware Floating Point Unit (FPU) to accelerate the computation of "real numbers", due to their manufacturing cost and power consumption. Recent technological advances tend to improve this part [18].
Using this knowledge, the processing power of the used controller has to be acknowledged to ensure the complexity of the filter is not too important to ensure real-time operations. As an example, the comparison can be done between three largely used microcontrollers: • the ATMega328, a 8 bits microcontroller, the most commonly used in hobbyists designs as core controller of the Arduino Uno board; • the STM32L053, a ultra low-consumption 32 bits microcontroller, used in the 2D orientation estimation demonstration; and • the STM32F4xx, a high-end 32 bits microcontrollers family, embedding a FPU to accelerate computation of floating points numbers.
The computing power of those units is described in Table 6. Table 6. Computing performances. With this technical background, it is now possible to establish a method allowing to use Kalman filters into Smart Sensors or any other embedded system, keeping in mind the complexity problem.

The Proposed Methodology
The proposed methodology focuses on the system modelization and the algorithmic complexity containment, the main steps being discussed in this section. The next section will provide an example detailing and illustrating these steps.

Specify the Use-Cases
The first major step in a system design is the use-cases identification. For embedded sensors data fusion, the focus will mainly be on two parameters: • the operation context of the system-to what end it is being used (e.g., Calibration mode, Normal estimation mode); and • for each context, what are the parameters: known and controlled parameters, parameters to be estimated...
For the example of Calibration and Normal estimation use cases, we can sum-up the process as expressed in Table 7. Once the use-cases have been clearly identified, the main task focusses on the system behavioral equations identification.

Identify the System Equations
The system modelization is mostly about behavioral equations identification. In this part, the study consists in: • defining all the physical parameters affecting the system outputs; • defining all the calibration parameters (i.e., the dispersion parameters due to the sensors manufacturing process), and checking if it is possible to measure them independently of the desired measurement; and • defining all the equations linking these physical parameters (those are mainly differential equations).

Create the System Models for Each Use-Case
Once the use-cases and the system equations are established, the model creation part is decomposed as follows, with one model per use-case: • the known parameters are put into the command vector U k of the system; • the parameters to be estimated and all the intermediate parameters in the differential equations are put into the state vector X k ; • the measured output values of the system are put into the measurement vector Y k ; • the evolution and measurement equations are written according to the previously established equations; and • the system equations time discretization for continuous time equations (as the Kalman filters only works with discrete time models).
At this point, the designer should look for uncoupled, or unidirectionally coupled subsystems into the main system, especially if this subsystem has its own measurement outputs and can be expressed as a linear subsystem. If subsystems can be identified, the designer should consider dividing the system into multiple systems, easier to process: for example, a linear system composed of two commands, seven states and five measurements will have a complexity order about O(1372), according to Equation (18a). However, if this system can be decomposed in two subsystems, one with one command, three states and two measurement, the other with two commands (one from the previous system state, for the coupling), four states and three measurements, the overall complexity drops to O(364), according to Equation (18b), which gives a 3.7 times complexity optimization.

Apply an Adapted Filter
As the main goal of the design, the filter selection and implementation have to be carefully studied in order to assess the best possible performances. The designer will have to implement one Kalman filter per subsystem designed in the previous step.
To select the best possible solution for each equation, the following rules should be applied.
• If the subsystem is purely linear (i.e., its evolution and measurement equations are in the form of Equations (3) and (4)), the implemented estimator should be a Kalman filter.
• If the subsystem is purely non-linear (i.e., its evolution and measurement equation are in the form of Equations (1) and (2)), the implemented estimator should be an EKF or UKF, depending on the non-linearity, the state vector length and the available processing power. • If the subsystem is mixed (i.e., its evolution equation is linear and its measurement equation is non-linear), the evolution part should be handled by Kalman filter implementation and the measurement part should be implemented using EKF or UKF method, in order to optimize the processing load.
Finally, to optimize the processing time, some basic equations should be rewritten to their bare minimum: for example, a linear evolution equation in the form of Equation (19a) can be simplified to a couple of operations (Equations (19b) and (19c)) (X k [n] being the nth element of the vector X k ).
In this example, the unoptimized version is composed of: • two matrix multiplications by a vector, of complexity T(2 × 2 × 2 × 1) = T(8) each; and • an addition of two vectors of two elements, of a complexity of T(2) The optimized version is composed of one addition of two elements and two affectations (with virtually no computational cost), which gives a complexity optimization of T(10) to T(1).

Application to a 2D Orientation Estimation Problem
The 2D orientation estimation is a common problem in robotics [19]. For instance, in a self-balancing robot, the orientation of the robot has to be accurately measured at high speed in order to control the system using a feedback loop.

The Sensing Elements
To estimate the orientation of the robot in the XZ plane (Figure 4), the chosen approach relies on a three-axis accelerometer and three-axis gyroscope integrated circuit, the LSM6DSL sensor from ST Microelectronics. This circuit is used as a part of the development sensor board X-NUCLEO-IKS01A2 [20], which can be directly plugged on the microcontroller development board. This sensor has the following features: • raw measurements for the Accelerations and Rotational speed on X, Y and Z axis; • internal processing for free-fall detection, movement detection, 6D/4D orientation, click and double-click detection, pedometer, step detector and counter; • an independent automatic sampling with data storage in FIFO; • I 2 C or SPI serial interface; and • two configurable interrupt output lines.
A corresponding driver library is also provided with the development kit for STM32 development boards. As for all sensors, the measurements on the accelerometer and gyroscope can be biased. In the following study, only the gyroscope bias will be considered to have a significant impact on the measurement and thus will be compensated.

The Processing Unit
As a processing unit, the ultra-low power STM32L053R8 has been selected, using the corresponding Nucleo development kit. This microcontroller features: This microcontroller targets battery powered application, which is the main scope of the current study.

Applying the Methodology
The first step to design the sensor requires in the use-cases listing establishment. The proposed solution has two use-cases.
• Calibration mode: The sensor is still, on a table. Using this measurement, the gyroscope measurements should be zero, and the sensor bias is estimated by the filter. • Orientation estimation mode: The sensor bias is known and used as a control input, and the sensor orientation is estimated by the filter.
With the use-cases established, the system equations have to be written.

System Equations
To measure the 2D orientation of the system, the process relies on the measurement of the gravity acceleration by the accelerometers. This information is, however, sensitive to noise and parasite accelerations (e.g., chaos relative to the movement of the robot). The measurement stability is enhanced by the inclusion of gyroscope measurement, which assess the rotational speed.
The equations express the projection of the gravity vector in the XZ 2D plane of the robot and the measurement of its angle and norm ( Figure 5). Given this hypothesis, the three main parameters to the system are: • the gravity vector projection norm |grav XZ |; • the gravity vector projection angle θ Y , which is the desired measurement translating the system orientation in the 2D plane; and • the gyroscope measurement bias b gyro .
The system noises are: • the acceleration noise on X and Z axes: R ax and R az .
From these parameters, the sensor measurements are expressed as Equation (20a) for the gyroscope and Equations (20b) and (20c) for the accelerometers.
The system evolution can also be expressed by Equations (21a) and (21b).
At this point, the system equations have to be established for each use-case.

Calibration Mode Equations
In calibration mode, the system known input is the rotational velocity: the system being still, As the global system orientation is not relevant at this point, θ and |g XZ | will not be estimated in this mode, and thus a x and a z will not be monitored.
The state space equations for the calibration mode can be written as Equations (23a) and (23b).
When the system is stil, the gyroscope bias observation is trivial using this description.

Estimation Mode Equations
In estimation mode, the known input vector of the system is composed of the gyroscope bias. If the gyroscope noise is considered to be neglectable, the gyroscope Y axis measurement can also be added to the command vector.
Therefore, the state vector is composed of the gravity projection norm and angle. The system measurements are the accelerations along X and Z axes. The system equations can be written as Equations (24a) and (24b).Ẋ The discrete time equations can be approximated in this case to Equations (25a) and (25b) without precision loss. The sampling rate is set to 26 Hz in the current study per arbitrary choice among available hardware settings.
Now that the equations have been established for each use-case, the Kalman filters have to be applied to those systems.

Applying an Adapted Filter
A different filter has to be applied to each use-case, as follows.

Calibration Mode Filter
As this equation system is relatively simple, having a single state which translates to the single measurement, a simple low-pass filter should be used for the gyroscope bias compensation.
The selected method here was to compute the average value of the first 100 samples, and use this value as a b gyro .

Estimation Mode Filter
The system equations for the Estimation mode are: linear for the evolution and non-linear for the measurement. As explained in Section 3.4, the optimal implementation uses Kalman filter equations for the evolution estimation and EKF equations for the Prediction and Correction parts (Table 8). Table 8. Filter applied to Estimation Mode.
Step Used Equation Where [H X,k ] is the measurement Jacobian matrix listed in Equation (26).

Results and Discussion
The estimator has been implemented in C and deployed for the selected target. The results of every fourth estimation were sent through a virtual UART connection to the computer to be displayed in real-time.
There were no means available at the moment of the test to establish the precision of the measurement and assess the dynamic precision of the algorithm. It was however possible to establish the following results (Table 9): Table 9. Measurement performances results.

Parameter Result
Start-up convergence time ∼30 s @ ±1 • Still measurement noise <0.1 • Measurement repeatability <1 • in two consecutive tests At the microcrontroller nominal speed of 32 MHz, the CPU takes 1.18 ms to compute each estimation, executing the complete algorithm during that time slot (Figure 6). For a 26 Hz measurement rate, the CPU usage due to the estimation process is only 3.8%, giving a large amount of processing power to other tasks or to update with a more complex system model. The large untapped processing power available came as a surprise to the writers beating best expectations, as the previous computational burden estimation for a similar project (3D pose estimation on a 9 axis IMU at a 50 Hz sampling rate, discussed in the AREM project part of [21]) were far more important with a 465 kFLOPs/s requirement using an EKF estimator without optimizations. Future studies may consider including continuous estimation of the gyroscope bias. Furthermore, the impact of the accelerometers bias and gain should be studied on the precision of the results.
Finally, the integration of the whole system model (i.e., the robot-relative behavior) should be studied and integrated into the estimation process, providing the ability for a better control loop for the global application.

Materials and Methods
All the measurement have been done on development kits available from STMicroelectronics: • NUCLEO-L053R8 for the microcontroller development kit; and • X-NUCLEO-IKS01A2 for sensing elements.

Conclusions
The primary objective of this work was to develop a systematic approach for Smart Sensors data fusion designs using Kalman filters. The main pitfall when working with such computationally expensive algorithms embedded in microcontrollers is the limited processing power available. Consequently, the proposed methodology focused on complexity aware techniques to optimize the filter equations in order to fit the low-power requirements. The described optimizations make possible bounding the complexity of the Kalman filters. It should however be noted that large system models are known to be expensive to compute and cannot be addressed at high frequencies with low-power targets.
With less than 4% of the CPU time dedicated to the filter computation, the proposed 2D orientation estimation illustration gave unexpectedly good results in terms of processing time on an ultra-low power target. As a result, the writers attend to explore the capabilities of these optimized filters on a more complex application-specific model for the self-balancing robot in the near future.
The goal for the writers is to continue applying the methodology to a wider range of Smart Sensors projects, targeting new sensing elements, lower-end microcontrollers, and more complex models [21].