1. Introduction
Motion-capture systems are used in a wide range of applications, such as sports, film animation, video games and, importantly, the diagnosis and treatment of human movement disorders. They provide a method to analyze multibody movements [
1,
2,
3,
4]. Analysis of human movement begins with motion-capture systems [
5] that provide measurements from a collection of sensors, from which the subject’s kinematics and kinetics can be evaluated. Motion-capture measurements typically take the form of 3D marker positions, 3D sensor orientations from Inertial Measurement Units (IMUs), or joint positions from markerless video-based systems [
6,
7].
IMUs allow multibody motion capture in more varied environments and settings compared to optical marker-based systems, which are typically restricted to controlled laboratory conditions [
6,
8]. This flexibility makes IMU-based systems particularly valuable for applications such as patient rehabilitation, where portability, accessibility and privacy are important [
3]. However, due to limitations in accuracy, interpretability, and sensor-to-body calibration, IMU-based systems have not been widely adopted in clinical applications [
9,
10]. In this work, we focus on an estimation framework based on a tightly coupled motion-capture algorithm that leverages knowledge of the multibody system’s dynamics, which are directly related to the IMU’s accelerometer and gyroscope measurements, to simultaneously estimate system kinematics and kinetics, expressed in biomechanical motion axes [
11]. By enforcing dynamic consistency between the estimated kinematics and kinetics, our approach aims to improve the accuracy of IMU-based motion estimation.
IMUs capture linear acceleration and angular velocity using accelerometers and gyroscopes [
12]. In addition, IMUs typically incorporate a magnetometer to measure the local magnetic field intensity. Using these measurements, the orientation of the IMU can be estimated [
12]. Inaccuracies in IMU-derived orientations arise from multiple sources: magnetic distortions (especially indoors), gyroscope bias and noise, and the difficulty of separating inertial acceleration from gravitational acceleration during motion. Consequently, orientation estimates contain errors that propagate to subsequent joint angle estimates. While these angular estimates are often used directly for kinematic analysis, in cases where kinetics are also required, they are typically low-pass-filtered and then differentiated twice and filtered to compute the joint torques using inverse dynamics (ID) [
13]. Due to this process, the resulting kinetics are often inconsistent with the kinematic constraints imposed by joint definitions and with external forces measured by additional sensors such as force-plates [
12,
13,
14,
15,
16,
17,
18].
Musculoskeletal modeling software such as OpenSim [
19,
20,
21] enables joint angle estimation while imposing joint mobility constraints through inverse kinematics (IK). OpenSense [
22], an open-source algorithm within OpenSim (version 4.4), utilizes IMU orientations in conjunction with OpenSim’s internal coordinate formulation [
23] to estimate multibody joint kinematics via IK. In this method, virtual IMUs are placed on the model’s body segments with the same initial orientations as the real IMUs. In each time frame of movement, IK is solved for the model’s joint angles that best reproduce the experimentally measured IMU orientations by minimizing the angular distance (error) between the orientations of the virtual and experimental IMUs. Because the measured IMU orientations are directly used to estimate joint angles, errors in IMU orientation estimates propagate to the resulting kinematics, making accurate IMU orientation estimation critical for this approach. Alternatively, kinematically coupled approaches directly integrate joint kinematic definitions with raw IMU measurements, bypassing the need for precomputed orientations. For example, Weygers et al. [
24] enforce joint kinematic constraints by requiring consistency in acceleration at the joint center shared by adjacent body segments, as measured by the IMUs mounted on each segment. However, while this approach improves kinematic consistency, joint kinetics are not estimated directly.
Incorporating the multibody dynamics of the system, along with appropriate joint definitions, has been found to improve the accuracy of the estimated kinematics and kinetics [
25]. Similar to the works of Dorschky et al. [
25], Haraguchi and Hase [
26], Nitschke et al. [
27], and Koelewijn et al. [
28], we leverage dynamic models to estimate joint kinematics and kinetics directly from IMU measurements (linear accelerations and angular velocities). However, our approach differs in several key aspects. First, unlike Dorschky et al. [
25], who employed a planar model, we use the full multibody system dynamics to enforce the system equations of motion as part of the estimation. Second, contrary to Nitschke et al. [
27], we validate our method using real IMU measurements rather than relying on simulated inertial data generated from marker-based motion capture. Third, unlike trajectory optimization approaches such as that of Haraguchi and Hase [
26], which require predefined movement parameterizations (e.g., fixed nodes per gait cycle) and are limited to periodic movements with known cycle boundaries, our method performs state estimation without requiring prior knowledge of movement type, duration, or periodicity. Similarly, Koelewijn et al. [
28] require a predefined time discretization over a fixed movement window and estimate the entire trajectory simultaneously, whereas our approach is sequential and makes no such assumptions. We present an iterated extended Kalman filter (IEKF)-based algorithm that enables the use of different sensor modalities and measurement types. We demonstrate this extensibility by integrating marker-based motion capture and zero-torque updates (described in detail in
Section 2) along with IMU measurements. We tested our algorithm using real sensor measurements on two experimental setups: a 6-DoF KUKA robot (KUKA AG, Augsburg, Germany) and a 3-DoF pendulum, each of which provided accurate ground-truth data.
2. Methodology
To exploit the underlying multibody dynamics in the movement estimation problem, we developed an algorithm to directly couple the multibody system’s equations of motion with IMU sensor measurements (
Figure 1). Estimating the system’s kinematic and kinetic states directly from the system dynamics requires formulating a continuous nonlinear state-space model that describes the measurements in terms of the system states [
12,
29,
30,
31]. In this work, we leverage OpenSim (version 4.4) [
20], an open-source software for musculoskeletal modeling and simulation, to evaluate the equations of motion and kinematic Jacobians of the multibody system, and we use it to derive different sensor measurement models [
32].
The motion of a multibody system with
joints,
body segments and
joint angles at each time instant
t is estimated in terms of joint angles
, joint angular speeds
and joint torques
. Hence, the state vector
that we are interested in estimating is given by
where
,
and
.
2.1. General Multibody State-Space Model
The multibody system dynamics describes the dynamics of the states in (
1) as
for which
is a symmetric, positive definite mass matrix,
is a function that groups all the velocity terms, and
is a function that represents the gravity force component [
19,
33]. Furthermore,
is a function representing the torque that results from gravity, the Coriolis force and the centrifugal force. Using (
2), the joint angular accelerations can be expressed as
The system’s dynamic model is expressed to include process noise, which accounts for uncertainties and disturbances in the system as follows:
where
is zero-mean Gaussian noise with covariance
. The joint torques are assumed to evolve in a random-walk process [
34] with
, where
. Since the process noise is assumed to only affect the torque states, the process noise covariance matrix
has a block-diagonal structure with zero blocks for the generalized coordinates and velocities and
for the joint torques.
The system’s measurement model relates the states to the measurements as follows:
Here,
is a function that maps the states to the
sensor measurements, and
is the measurement error, where
. These measurements consist of IMU data, optical markers, and zero-torque updates, which will be described in
Section 2.2 and
Section 2.3.
2.2. IMU Sensor Measurement Model
We assume that each body segment has at least one IMU attached. The total number of IMUs on the system’s body segments is
. Each IMU
s, with
, has a gyroscope that measures the angular velocity
and an accelerometer that measures linear acceleration
, where the subscript indicates the measurement of sensor
s, and the superscript denotes the frame in which the measurement is expressed. This notation will be used throughout the rest of this paper. The axes of the frame
s are assumed to be aligned with the sensor’s axes, and its origin is assumed to be located at the center of the accelerometer triad. The measurement model for the gyroscope angular velocity is modeled by
where
is the measured angular velocity, and
is the measurement noise. Measured angular velocities can be expressed in terms of the body segment’s angular velocities, which can be expressed in terms of the system’s current generalized coordinates and velocities [
33] as
Here, two rotation matrices are introduced. First,
is the constant rotation matrix between the body frame
b and the sensor frame
s, which is assumed to be known. Second,
is the rotation matrix that describes the orientation transformation from the navigation frame
n to the body segment frame
b as a function of the joint angles. Note that the body frame and sensor frame have the same angular velocity, provided that the velocities are expressed in a common frame.
The IMU’s accelerometer measures the linear accelerations that the sensor experiences, including the acceleration due to gravity
. The measured accelerations depend on the current generalized joint angles, angular velocities and angular accelerations and can be expressed as
Here,
is the measured linear acceleration, and
is the gravity vector expressed in the navigation frame. We model the accelerometer measurement noise as
. The acceleration
is defined in terms of the body’s angular velocity, angular acceleration and linear acceleration as
Here,
,
and
are respectively the body segment’s linear acceleration, angular acceleration and angular velocity. These kinematic quantities are computed by propagating the joint angles, velocities, and accelerations,
,
, and
, recursively through the multibody chain from the fixed base, where
depends on
through (
2) and is evaluated by OpenSim. Furthermore,
is a translation vector expressed in the body frame
b that locates the position of the IMU frame
s to the body frame
b, as shown in
Figure 2. The rotation matrix
maps the body frame to the navigation frame.
2.3. Marker-Based Motion Capture and Zero-Torque Update Measurement Models
Marker-based motion-capture systems use multiple high-speed cameras fitted within a laboratory. After calibration, these systems provide marker position measurements with respect to a laboratory frame
l. Each body segment can have any number of markers attached for a total number of markers
. We denote each individual marker by
. The marker position with respect to the lab frame can then be defined as
where
is the marker position measured in the laboratory frame, and
is the position measurement noise. Marker position measurements can be written in terms of the body position in the navigation frame as
The translation vector
is a vector that maps the position of the body segment frame
b to the marker frame
m expressed in the navigation frame
n, as shown in
Figure 2. The two matrices
and
map the navigation frame
n to the body frame
b and the body frame
b to the laboratory
l, respectively.
We also consider the occasional availability of zero-torque measurement updates. For this, virtual joint torque sensors
that assume joint torque measurements of zero are expressed as
where
is the joint zero-torque virtual measurement, and
is the virtual torque measurement noise.
2.4. Tightly Coupled Kinematics and Kinetics Estimation Algorithm
To couple the system dynamics with the measurement models (above), we employ an iterated extended Kalman filter (IEKF), specified as follows (see Algorithm 1). The multibody system dynamics are defined by the OpenSim model, which includes body segment masses, inertial properties, joint types, and sensor attachment locations. OpenSim provides the computational framework to evaluate the mass matrix
, velocity-dependent terms
, and gravitational forces
in the process model
from (
4). Additionally, OpenSim computes the kinematic quantities for the measurement model
in (
14), including rotation matrices
in (
7) and (
8), body segment accelerations
, angular velocities
, and angular accelerations
. Furthermore, OpenSim enables the computation of the process and measurement Jacobians required for the IEKF [
23,
35]. The dynamic model is discretized at the sensor measurement frequency (100 Hz) to predict the future state at time
by numerically integrating the equations of motion using OpenSim’s Runge–Kutta integrator [
35], resulting in a discrete-time update equation
where
denotes the state at time instance
t,
represents the numerical integration of the continuous dynamics in (
4) over the time step
, and
is the discrete-time process noise. The algorithm measurement model that includes all the previously discussed sensor measurement models can then be expressed as
Here,
is the measurement model,
is the total measurement error, and
is the measurement error covariance.
| Algorithm 1: Tightly coupled motion-capture algorithm (TCMA) |
| Input: Sensor measurements , an initial state estimate with covariance , process and measurement noise covariance matrices and ; see (4), (13) and (14). |
| Output: Estimates of the system states and the respective covariance for . |
| 1: for do |
| 2: Time update Compute the state prediction by numerically integrating the noise-free dynamics in (4) from the previous estimate over one time step , |
| Compute the predicted state covariance as |
| where the discrete-time state transition Jacobian is approximated by discretizing the continuous-time Jacobian as , with computed numerically. |
| 3: Measurement update |
| 4: for until do |
| 5: Update the state with measurements as |
| where . |
| 6: Compute the filtered state and covariance |
The process of estimating the system states and their covariance using an IEKF is described in Algorithm 1. The first step is prediction, in which the state and covariance are predicted as and respectively. In this step the Jacobian is calculated to compute . In the second step the algorithm uses the measurements to iteratively update the values of the state and covariance. In this step the Jacobian is calculated to compute the Kalman gain , the updated states and updated state covariance . To handle the nonlinearities of the measurement model, the IEKF re-linearizes around the most recent state estimate at each iteration within the same time step, refining the state estimate until convergence, as shown in Algorithm 1. This reduces the linearization error compared to the standard EKF, which linearizes only once around the predicted state. The discrete-time state transition Jacobian is approximated via the matrix exponential of the continuous-time Jacobian rather than by perturbing the numerical integrator directly, as the latter would require additional integrations per time step with no observable gain in accuracy.
2.5. Real-World Testing on Physical Systems
To validate our approach, we performed two separate sets of experiments on two different systems. In the first setup we used a 3-degree-of-freedom (DoF) physical four-link pendulum consisting of aluminum segments connected by roller-bearing hinge joints, as shown in
Figure 3. Each body segment was equipped with an APDM Opal inertial sensor (APDM Wearable Technologies, Portland, OR, USA) and a set of optical motion-capture markers tracked by a Vicon motion-capture system (Vicon Motion Systems Ltd., Oxford, UK). Experiments were conducted on this passive multibody system, in which the lowest body segment, shown in silver in
Figure 3, was raised to a certain height and then released to swing under its own weight; the applied joint torques are assumed to be zero upon release. We evaluated Algorithm 1 (TCMA) using three different combinations of sensor measurements to estimate the pendulum motion:
IMU measurements only ();
IMU and marker measurements ();
IMU, marker, and virtual zero-torque measurements ().
We compared the output of our algorithm with these different measurement combinations against OpenSim marker-based IK/ID (/) and OpenSense IMU orientation-based IK/ID (/) across six trials, each with different initial generalized joint angles. While marker-based IK/ID is commonly used as a reference standard, we additionally include as a reference. This allows a comparison of OpenSense (which uses only IMU orientation estimates) against a method that integrates both measurement modalities.
The second experimental setup was a KUKA LBR 7 iiwa R800 robot (
Figure 4) consisting of 7 bodies and 6 revolute joints. Markers tracked by an OptiTrack motion-capture system (NaturalPoint, Inc., Corvallis, OR, USA) and Xsens MTw2 IMUs (Movella/Xsens, Enschede, The Netherlands) were fitted on all robot body segments except the base and the end-effector. Five separate trials were conducted in which we extracted the markers’ 3D positions, IMU accelerations and angular velocities, as well as ground-truth joint angles and torques from the robot’s internal encoders and sensors. We applied the TCMA to estimate the robot motion using two measurement configurations: IMU-only measurements in 4 trials and combined IMU and marker-based measurements in 1 trial. For the combined measurement trial, we also compared results with
/
. The OpenSense
/
method was not feasible due to magnetic disturbances from the robot’s metallic structure and motors, which compromise magnetometer-based orientation estimates [
15]. For the KUKA robot setup, all IMUs were pre-calibrated before each trial to remove accelerometer and gyroscope biases.
The measurement noise covariance matrix
was constructed using the KUKA robot setup by computing the sample covariance of residuals between sensor measurements and the corresponding encoder-derived robot states, treating the encoder measurements as ground truth given their high accuracy [
36]. The process noise covariance
was set as a diagonal matrix, with elements manually tuned to approximately 1 (Nm)
2 to balance dynamic flexibility and estimation accuracy. These same
and
values were used for both the KUKA and pendulum experiments without further adjustment, as the algorithm exhibited low sensitivity to the precise values of
, and the chosen
provided adequate performance across both experimental setups.
3. Results
We present the motion estimated by the TCMA using the sensor measurement combinations described in
Section 2.5. For each experimental setup, we report root-mean-square difference (RMSD) values across all trials and illustrate the estimated joint angles and torques for a representative trial.
Figure 5 shows representative results from a randomly selected pendulum trial (trial 3). At time 0 s, the pendulum was in its initial position, as shown in
Figure 3. The pendulum was raised and then released to swing freely under its own weight from 1.2 s to 7.8 s. All other trials followed the same procedure with different initial joint angles. The virtual zero-torque measurements shown in
Figure 5 were only applied to
after the pendulum was released.
Table 1 presents RMSD values across all six pendulum trials, comparing estimates from the TCMA using the different measurement combinations (
,
, and
), alongside
/
, evaluated against
/
and
as reference standards. Estimates from the TCMA showed small deviations in joint angles (≤4.23°) and joint torques (≤3.02 Nm) when compared to
/
. However, a steady offset of approximately 2° is observed in the
estimate from
after 2.4 s compared with the other
estimates (
Figure 5).
The KUKA robot experiments show similar performance. Joint angle and torque estimates from the TCMA, using either IMU-only measurements (
) or combined IMU and marker measurements (
), are compared with
/
and the robot’s internal encoders and torque sensors (
Figure 6). The robot’s initial pose is shown in
Figure 7. The shaded regions in
Figure 6 indicate the robot torque sensor accuracy (±2% of maximum torque, ranging from 176 Nm for joint 1 to 40 Nm for joint 6 [
36,
37]), providing context for evaluating whether our estimates fall within sensor uncertainty. The robot torque measurements shown are smoothed in
Figure 6 using a 20-sample moving average filter to reduce high-frequency noise.
In the trial shown in
Figure 6, all joints except the end-effector were simultaneously actuated. This was consistent across all five trials.
Figure 7 shows the resulting end-effector motion path for trial 1. For trial 1,
showed a maximum joint angle error of 3.24°, while
showed a maximum joint angle error of 2.84°. Torque errors were below 4.27 Nm for
and 3.71 Nm for
(
Table 2). Similarly, across trials 2 to 5,
maintained maximum errors of 3.01° for joint angles and 2.33 Nm for joint torques (
Table 3). Processing the experimental data with the TCMA took approximately 21.5 times longer than the data duration for the robot experiments (e.g., 412 s for a 19.2 s trial) and 9.8 times longer for the pendulum experiments (e.g., 77 s for a 7.9 s trial). Computations were performed using MATLAB R2025b on a Dell Latitude 7440 laptop equipped with an Intel Core i7-1365U processor (13th Gen, 1.80 GHz) with 16 GB RAM.
4. Discussion
We developed a tightly coupled motion-capture algorithm (TCMA) that is based on an iterated extended Kalman filter (IEKF) to tightly couple IMU measurements and a multibody system model to estimate the system kinematics and kinetics directly. Unlike traditional IMU-based methods such as OpenSense [
22], our formulation does not depend on computing sensor orientations. Hence, it can be used in indoor applications, where there are often strong ferromagnetic disturbances affecting the magnetometer, as demonstrated in the robot experiment.
The results of the trials conducted on the pendulum experimental setup indicate that, with respect to
, the
joint angle RMSD values were on average 19% lower than those from
(
Table 1). The maximum RMSD between the
joint angle estimates and
was 3.75°, compared to 6.56° for
(
Table 1).
Across all six pendulum trials, torque estimates from the TCMA with different measurement combinations were within
Nm RMSD of the marker-based ID (
Table 1). The TCMA with different measurement combinations, particularly the virtual zero-torque measurements, produced torque estimates that fluctuated around zero from about 3 s onward (
Figure 5,
Table 1). This behavior aligns with the expected physical dynamics of the pendulum, where only the low friction of the joints is present during free passive swinging. However, the lack of joint friction in the model may have caused the observed offset in
’s
estimate (
Figure 5). Since the virtual zero-torque measurements pull the estimated torques toward zero, any unmodeled friction torque cannot be absorbed by the torque state and instead shows up as a small bias in the joint angles. This effect is most visible at
, which is the only intermediate joint and therefore the only one coupling two freely swinging segments, so friction contributions from motion on both sides affect it.
The robot trials further demonstrate the ability of our approach to provide accurate motion estimates in environments with strong magnetic distortions, where using the OpenSense approach was not possible. Five motion paths were selected to exploit the high maneuverability of the robot (
Figure 7), showcasing the capability of our method to track the corresponding kinematics and kinetics (
Figure 6). The
joint angle estimates were consistent with measurements in the pendulum trials, demonstrating joint angle RMSD values below 3.25° with respect to the joint encoders (
Table 2 and
Table 3).
The torque estimates produced by
were within 4.27 Nm RMSD of the robot’s integrated torque sensor measurements across all trials (
Table 2 and
Table 3). Furthermore, the TCMA tracked abrupt changes in the torque profile. For example, around 1.4 s in the
estimate, the motion capture ID overshot by approximately 12 Nm, whereas
did not (
Figure 6). In the worst-case scenario, the RMSD of
exceeded the torque sensors’ uncertainty by 0.5% in the estimate of
in trial 1, where the torque RMSD was 4.27 Nm (
Table 2).
A key advantage of our approach is the elimination of kinematic filtering requirements inherent to traditional ID methods. Conventional ID approaches (both OpenSense ID and marker-based ID) require low-pass filtering of kinematic data to reduce high-frequency artifacts introduced by inverse kinematics. Hence, selecting an appropriate cutoff frequency presents a significant challenge, as different body segments may move at varying frequencies, making a single cutoff frequency suboptimal across all joints [
38]. This issue is particularly relevant in motions involving contact events, where making and breaking contact with the environment introduces higher-frequency dynamics, while limb movements remain predominantly low-frequency [
39]. Our algorithm mitigates this limitation by directly estimating kinetics from the dynamic model without requiring explicit cutoff frequency selection, thereby reducing sensitivity to segment-dependent motion speeds and arbitrary filtering choices. This was evident in both the pendulum experiment, where torque estimates naturally fluctuated around zero during free swinging (
Figure 5), and the robot experiment, where abrupt torque changes were accurately tracked (
Figure 6).
The inclusion of markers in the TCMA measurement model improved accuracy, with a 15–40% reduction versus the robot joint angle encoder RMSD (
Table 2). Additionally, the virtual zero-torque measurements applied during the pendulum experiment produced the smoothest torque profile (
Figure 5). These results show the effectiveness of combining different measurement models to enhance accuracy. For example, IMU measurements can be combined with pressure insoles or other force reaction measurements to capture human kinematics and kinetics outside the laboratory.
While our results from multibody mechanical systems are very encouraging and highlight distinct advantages, they rely on several assumptions that may limit the broader applicability of our approach. First, the locations of the IMUs were assumed to be rigid (i.e., free of soft tissue artifacts) and accurately known a priori, which is generally not the case in practice. In this work, we used motion-capture measurements to calibrate the sensor-to-body locations during our experiments. Second, although the TCMA exhibited low sensitivity to the measurement noise covariance matrix , it was more sensitive to the torque process noise . However, the development of adaptive process noise estimation methods lies beyond the scope of this work. Although the TCMA is sequential and thus potentially real-time-capable, currently, it does not yet meet real-time computational requirements.
Future improvements could address the sensor-to-body calibration limitation by incorporating an automatic sensor registration algorithm, which would solve for the sensor orientation and translation with respect to the body segment using the model’s joint properties in addition to the sensor measurements [
40]. The algorithm could also include an adaptive process and measurement covariance estimator to improve estimation accuracy, particularly when applied to different multibody systems in different environments. Additionally, estimating the gyroscope bias within the system states could further enhance accuracy. The TCMA also facilitates integration of additional sensor modalities, such as markerless motion-capture systems and force sensors, by defining corresponding measurement models in terms of the system states. This could enable estimation of additional quantities, such as joint contact forces and loading, while also improving overall motion estimation accuracy. An automatically differentiable model would provide exact Jacobians instead of numerical perturbations, potentially improving both TCMA accuracy and computational efficiency. Since TCMA uses the dynamics in OpenSim, the same approach can be applied to other models, including human musculoskeletal models that use joint and coordinate definitions commonly used by movement scientists and clinicians. This compatibility with existing biomechanical frameworks could facilitate broader adoption of IMU-based motion capture in clinical settings.