Article

Three-Dimensional Extended Target Tracking and Shape Learning Based on Double Fourier Series and Expectation Maximization

1 School of Information Engineering, Chang’an University, Xi’an 710064, China
2 School of Artificial Intelligence and Software Engineering, Nanyang Normal University, Nanyang 473061, China
* Author to whom correspondence should be addressed.
Sensors 2025, 25(15), 4671; https://doi.org/10.3390/s25154671
Submission received: 9 June 2025 / Revised: 25 July 2025 / Accepted: 25 July 2025 / Published: 28 July 2025

Abstract

This paper investigates the problem of tracking targets with unknown but fixed 3D star-convex shapes using point cloud measurements. While existing methods typically model shape parameters as random variables evolving according to predefined prior models, this evolution process is often unknown in practice. We propose a particular approach within the Expectation Conditional Maximization (ECM) framework that circumvents this limitation by treating shape-defining quantities as parameters estimated directly via optimization. The objective is the joint estimation of target kinematics, extent, and orientation in 3D space. Specifically, the 3D shape is modeled using a radial function estimated via double Fourier series (DFS) expansion, and orientation is represented using the compact, singularity-free axis-angle method. The ECM algorithm facilitates this joint estimation: an Unscented Kalman Smoother infers kinematics in the E-step, while the M-step estimates DFS shape parameters and rotation angles by minimizing regularized cost functions, promoting robustness and smoothness. The effectiveness of the proposed algorithm is substantiated through two experimental evaluations.

1. Introduction

The accurate perception and tracking of targets in three-dimensional (3D) environments are a fundamental requirement for many modern intelligent systems, including autonomous vehicles, robotic platforms, and advanced surveillance systems [1,2,3,4]. While traditional point target tracking algorithms assume that one target gives rise to only one measurement, they are insufficient when dealing with objects possessing significant spatial extent relative to the sensor resolution, known as extended targets [5,6]. The effective tracking of these extended targets demands algorithms that can estimate not only their kinematic state (position, velocity) but also their physical characteristics, including shape and orientation, within a 3D space.
A substantial body of work considers the extended target tracking (ETT) problem, especially in 2D space [5,6]. Corresponding methods process 2D measurements based on diverse extent models. One common approach employs random matrices to model the target’s shape as an ellipse [7,8]. These extent models often assume that the extent state follows a specific probability distribution, such as a Gaussian distribution. However, this underlying distribution is frequently unknown in practice. Addressing this limitation, paper [9] presents an alternative method where extent characteristics, including semi-axis lengths, are treated as parameters for direct estimation. For complex star-convex targets, more sophisticated shape modeling methods, such as B-splines [10] and Gaussian processes (GPs) [11,12], have been developed. Notably, the random hypersurface model (RHM) is a typical approach for representing arbitrary star-convex shapes within the frequency domain [13,14]. This model defines the target’s spatial extent using a Fourier series expansion, with its coefficients jointly estimated alongside the target’s kinematics. The RHM offers a parameterization particularly well-suited for star-convex extents, effectively balancing representational accuracy with computational complexity. Numerous adaptations of such Fourier-based approaches have subsequently been explored across diverse application settings [15,16,17,18,19,20,21].
When 3D surface measurements from devices such as multilayer LiDAR sensors or depth cameras are available, it becomes possible to model the target’s 3D shape. A few approaches to 3D extended target tracking involve measurement models using basic geometric shapes [22] or restricted shape estimation [23,24,25]. These methods are suitable when detailed shape information is not critical for the application. For instance, where the target’s shape is unknown beforehand, random matrix (RM) models can represent the target using forms such as ellipsoids or rectangular cuboids. However, the representational power of RM models remains limited to these relatively simple geometric shapes. To address more complex geometries, Gaussian processes have been extended to 3D space, allowing the modeling of arbitrarily shaped star-convex targets [26,27]. While GPs effectively capture complex 3D shapes, their representation (often via a radius function) typically results in a high-dimensional state vector. This high dimensionality poses challenges for filtering algorithms and increases the computational burden. In contrast, the double Fourier series (DFS) shape model proposed in [28] utilizes a shape vector of significantly lower dimensionality compared to GP models, while also enabling the estimation of both shape and pose. It is worth noting that a common characteristic across many of these 3D shape measurement models is the modeling of the shape-characterizing state as a random variable, typically assumed to follow a Gaussian distribution.
Differing from traditional 3D methods that rely on prior shape evolution models, our approach models the parameters defining the extent state directly, treating them as quantities with clear physical meaning. Within the Expectation Conditional Maximization (ECM) framework, we propose an extended target tracking method for the joint estimation of kinematics, extent, and orientation in 3D space. The main contributions are listed as follows.
(1)
The unknown but fixed 3D shape is represented via a radial function in spherical coordinates, with a Double Fourier Series (DFS) expansion employed for its modeling. This approach converts the continuous radial function into a parametric representation characterized by a finite set of coefficients.
(2)
The axis-angle representation constitutes a compact and singularity-free approach for parameterizing 3D orientation. It describes a rotation by specifying a unit vector, which defines the axis of rotation, and a scalar angle, which denotes the magnitude of rotation about this axis. Furthermore, this method exhibits both geometric intuitiveness and notable computational efficiency.
(3)
Joint estimation of the target’s kinematic state and shape parameters is achieved via the Expectation Conditional Maximization (ECM) framework. The E-step infers the kinematic state using an unscented Kalman filter combined with a Rauch–Tung–Striebel smoother, while the M-step estimates the shape and rotation parameters by minimizing separate cost functions with added regularization for robustness and smoothness.
The remainder of this paper is organized as follows. Section 2 formulates the problem and presents the system model, including the motion model, shape representation, and measurement model. Section 3 derives an ECM-based algorithm for the 3D ETT problem. Section 4 presents simulation results to validate the proposed approach across two distinct scenarios. Finally, Section 5 summarizes the main findings and concludes the article.
Notation: Throughout this paper, the superscripts “−1” and “⊤” represent the inverse and transpose of a matrix, respectively; N(μ, Σ) represents the Gaussian distribution with mean μ and covariance Σ; E[·] and E[·|·] denote mathematical expectation and conditional expectation, respectively; ⊗ refers to the Kronecker product; I denotes the identity matrix of appropriate dimension; |·| and Tr[·] denote the determinant and trace of a matrix, respectively; ‖·‖_2 denotes the Euclidean norm.

2. Problem Formulation

This section first introduces a model for the spatial extent of a 3D target, which employs a radial function parameterized via a Double Fourier Series (DFS). The target’s orientation is then described using the axis-angle representation, a method characterized by its compactness and singularity-free nature. Following this, two kinematic state evolution models specifically designed for 3D extended target tracking are presented. Finally, the section details the measurement model that integrates this DFS-based shape representation with the selected axis-angle orientation.

2.1. DFS-Based Shape Model for 3D Target

Advanced sensors, such as depth cameras and LiDAR, can generate detailed three-dimensional (3D) point cloud data; extracting precise shape estimates from such data is crucial for object classification and for predicting future behavior. To effectively learn shapes from these point clouds, it is essential to formulate a suitable object extent description that possesses high representational power for diverse 3D shapes while also being sufficiently compact to support efficient online tracking algorithms.
This section details the modeling of the target shape in spherical coordinates via a radial function f ( θ , φ ) . Such functions relate angular coordinate pairs ( θ , φ ) to a radial distance in each orientation and are adept at representing star-convex geometries. In our methodology, this radial function is modeled using a Double Fourier Series (DFS) expansion, the specific form of which is given by:
$$f(\zeta, \theta, \phi) = \zeta_0 + \sum_{n=1}^{N} \sum_{m=1}^{M} \left[ \zeta_{nm}^{cc} \cos(n\theta)\cos(m\phi) + \zeta_{nm}^{ss} \sin(n\theta)\sin(m\phi) + \zeta_{nm}^{sc} \sin(n\theta)\cos(m\phi) + \zeta_{nm}^{cs} \cos(n\theta)\sin(m\phi) \right] = S(\theta, \phi)\,\zeta$$
where
$$\zeta = \left[ \zeta_0, \zeta_{11}^{cc}, \zeta_{11}^{ss}, \zeta_{11}^{sc}, \zeta_{11}^{cs}, \zeta_{12}^{cc}, \zeta_{12}^{ss}, \zeta_{12}^{sc}, \zeta_{12}^{cs}, \ldots, \zeta_{NM}^{cc}, \zeta_{NM}^{ss}, \zeta_{NM}^{sc}, \zeta_{NM}^{cs} \right]^\top$$
$$S(\theta, \phi) = \left[ 1, \cos\theta\cos\phi, \sin\theta\sin\phi, \sin\theta\cos\phi, \cos\theta\sin\phi, \cos\theta\cos 2\phi, \sin\theta\sin 2\phi, \sin\theta\cos 2\phi, \cos\theta\sin 2\phi, \ldots, \cos N\theta\cos M\phi, \sin N\theta\sin M\phi, \sin N\theta\cos M\phi, \cos N\theta\sin M\phi \right]$$
Here, ζ_0 is the base radial parameter, ζ_nm^cc, ζ_nm^ss, ζ_nm^sc, ζ_nm^cs are the Fourier coefficients, and N and M are the numbers of harmonics in θ and φ, respectively. θ ∈ [−π, π] is the azimuth and φ ∈ [−π/2, π/2] is the elevation. The shape parameter ζ collects ζ_0 and all 4NM coefficients, resulting in a dimension of d_ζ = 1 + 4NM.
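For concreteness, the DFS basis evaluation can be sketched as below. This is a minimal Python illustration of the expansion above; the function names are ours, and the coefficient ordering (outer loop over n, inner loop over m) follows the layout of S(θ, φ) given in the text.

```python
import numpy as np

def dfs_basis(theta, phi, N, M):
    """Row vector S(theta, phi) of the double Fourier series basis.

    Layout: a constant term followed, for each harmonic pair (n, m), by the
    cos-cos, sin-sin, sin-cos and cos-sin products, giving 1 + 4*N*M entries.
    """
    s = [1.0]
    for n in range(1, N + 1):
        for m in range(1, M + 1):
            s += [np.cos(n * theta) * np.cos(m * phi),
                  np.sin(n * theta) * np.sin(m * phi),
                  np.sin(n * theta) * np.cos(m * phi),
                  np.cos(n * theta) * np.sin(m * phi)]
    return np.array(s)

def radial_function(zeta, theta, phi, N, M):
    """Radial distance f(zeta, theta, phi) = S(theta, phi) @ zeta."""
    return dfs_basis(theta, phi, N, M) @ zeta
```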

2.2. Orientation Representation (Axis-Angle)

To represent the target’s three-dimensional (3D) orientation, this work utilizes the axis-angle parameterization, often expressed compactly as an orientation vector ω = γa. In this formulation, a is a unit vector denoting the axis of rotation, and γ is the scalar angle defining the magnitude of rotation about this axis according to the right-hand rule.
This choice of representation is underpinned by several compelling advantages pertinent to tracking applications. Firstly, the axis–angle representation (particularly in its three-parameter rotation vector form ω ) provides a geometrically intuitive and unique mapping for any 3D rotation (up to 2π periodicity in angle), without requiring the satisfaction of complex inter-parameter constraints for its validity—unlike, for instance, the nine-parameter rotation matrix which must adhere to strict orthogonality and determinant conditions. Secondly, the practical application and conversion of this representation are significantly facilitated by Rodrigues’ rotation formula. This formula not only provides a straightforward and computationally efficient method for determining the effect of the rotation on a vector (i.e., rotating a vector) but also, more fundamentally, offers an explicit algorithm to compute the exponential map from the Lie algebra so (3) (the space of 3 × 3 skew-symmetric matrices, representing infinitesimal rotations) to the Special Orthogonal group SO (3) (the Lie group of finite 3D rotations) [29]. The corresponding rotation matrix R ( ω ) is computed using Rodrigues’ rotation formula:
$$R(\omega) = I_3 + \sin(\gamma)\,[a]_\times + \left(1 - \cos(\gamma)\right)[a]_\times^2$$
where
$$[a]_\times = \begin{bmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{bmatrix}$$
is the skew-symmetric cross-product matrix of the unit axis a = [a_1, a_2, a_3]^⊤. The axis-angle coordinates ω = γa are a natural and compact rotation representation in terms of its geometric building blocks.
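As an illustration, Rodrigues’ formula maps the orientation vector ω to a rotation matrix as in the following Python sketch; the function names are ours, and the near-zero-angle guard is an implementation choice rather than part of the paper.

```python
import numpy as np

def skew(a):
    """Skew-symmetric cross-product matrix [a]_x."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def rotation_from_axis_angle(omega, eps=1e-12):
    """Rotation matrix R(omega) via Rodrigues' formula, with omega = gamma * a."""
    gamma = np.linalg.norm(omega)
    if gamma < eps:                    # near-zero rotation: R is the identity
        return np.eye(3)
    a = omega / gamma                  # unit rotation axis
    A = skew(a)
    return np.eye(3) + np.sin(gamma) * A + (1.0 - np.cos(gamma)) * (A @ A)
```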

2.3. Kinematic State Model

A.
Linear kinematic model
The kinematic state x_t = [c_t^⊤ v_t^⊤]^⊤ at time t includes the center position c_t = [c_{x,t}, c_{y,t}, c_{z,t}]^⊤ and the velocity v_t = [v_{x,t}, v_{y,t}, v_{z,t}]^⊤. The dynamics of the extended target are modeled by
x t + 1 = F x t + w t
where F is the state transition matrix and the process noise is w_t ∼ N(0, C).
For constant velocity model (CV) [27]:
$$F = \begin{bmatrix} 1 & T \\ 0 & 1 \end{bmatrix} \otimes I_3, \qquad C = \begin{bmatrix} T^3/3 & T^2/2 \\ T^2/2 & T \end{bmatrix} \otimes \left( \sigma_c^2 I_3 \right)$$
where T is the sampling time, σ_c is the process noise standard deviation of the centroid, ⊗ is the Kronecker product, and I_3 denotes the 3 × 3 identity matrix.
B.
Nonlinear kinematic model
When the target moves with constant acceleration (CA) between time steps, the acceleration vector a_t = [a_{x,t}, a_{y,t}, a_{z,t}]^⊤ augments the kinematic state. Combining these into the 9-dimensional state vector x_t = [c_t^⊤, v_t^⊤, a_t^⊤]^⊤, the state evolution model is given as follows [30],
x t + 1 = F C A x t + w t C A
where F_CA is the state transition matrix and the process noise is w_t^CA ∼ N(0, Q_CA), with
$$F_{CA} = \begin{bmatrix} I_3 & T I_3 & \tfrac{1}{2} T^2 I_3 \\ 0_3 & I_3 & T I_3 \\ 0_3 & 0_3 & I_3 \end{bmatrix}, \qquad Q_{CA} = \sigma_a^2 \begin{bmatrix} \tfrac{T^5}{20} I_3 & \tfrac{T^4}{8} I_3 & \tfrac{T^3}{6} I_3 \\ \tfrac{T^4}{8} I_3 & \tfrac{T^3}{3} I_3 & \tfrac{T^2}{2} I_3 \\ \tfrac{T^3}{6} I_3 & \tfrac{T^2}{2} I_3 & T I_3 \end{bmatrix}$$
The scalar σ a 2 represents the intensity of the underlying noise process and is a crucial tuning parameter for the CA model.
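Both transition models can be assembled directly from the Kronecker-product forms above. The following Python sketch is illustrative; the function names are ours, and σ_c and σ_a are interpreted as the scalar noise intensities described in the text.

```python
import numpy as np

def cv_model(T, sigma_c):
    """Constant-velocity transition matrix F and process noise covariance C."""
    F = np.kron(np.array([[1.0, T],
                          [0.0, 1.0]]), np.eye(3))
    C = np.kron(np.array([[T**3 / 3.0, T**2 / 2.0],
                          [T**2 / 2.0, T]]), sigma_c**2 * np.eye(3))
    return F, C

def ca_model(T, sigma_a):
    """Constant-acceleration transition matrix F_CA and process noise covariance Q_CA."""
    F = np.kron(np.array([[1.0, T, 0.5 * T**2],
                          [0.0, 1.0, T],
                          [0.0, 0.0, 1.0]]), np.eye(3))
    Q = sigma_a**2 * np.kron(np.array([[T**5 / 20.0, T**4 / 8.0, T**3 / 6.0],
                                       [T**4 / 8.0, T**3 / 3.0, T**2 / 2.0],
                                       [T**3 / 6.0, T**2 / 2.0, T]]), np.eye(3))
    return F, Q
```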

2.4. Measurement Model

Sensor data at time t consist of a set of N_t noisy 3D measurements {z_{j,t}}_{j=1}^{N_t} originating from the target’s surface. Each measurement z_{j,t} is related to the state and parameters via a nonlinear measurement function plus noise. A single measurement can be expressed as
z j , t = c t + f l o c a l + e t
where c t is the center of the target, f l o c a l is the radial function representation which describes the target’s shape in local coordinate. However, the measurement z j , t is obtained in global coordinates. Consequently, a global-to-local transformation is essential. This transformation yields the following local representation,
$$f_{local} = R_{global \to local}\, f_{global} = R(\omega) \begin{bmatrix} f(\zeta, \theta_{j,t}, \phi_{j,t}) \cos(\phi_{j,t}) \cos(\theta_{j,t}) \\ f(\zeta, \theta_{j,t}, \phi_{j,t}) \cos(\phi_{j,t}) \sin(\theta_{j,t}) \\ f(\zeta, \theta_{j,t}, \phi_{j,t}) \sin(\phi_{j,t}) \end{bmatrix}$$
f(ζ, θ_{j,t}, φ_{j,t}) represents the radial function in global coordinates, modeled using the DFS parameterized by the coefficient vector ζ, and e_t is zero-mean Gaussian measurement noise with covariance R_t. (θ_{j,t}, φ_{j,t}) is the angle pair of the measurement source on the target surface that originates z_{j,t}.
R(ω) is the rotation matrix that converts global coordinates to local coordinates for the corresponding orientation parameter ω. The parameter ω uses the axis-angle representation, expressed as ω = γa, where a = [a_x, a_y, a_z]^⊤ is a unit vector (‖a‖_2 = 1) indicating the axis of rotation, and γ is the angle of rotation about the axis a (in radians, following the right-hand rule). The parameter ω compactly stores both axis and angle information: its magnitude ‖ω‖_2 = γ gives the rotation angle, while its direction ω/‖ω‖_2 = a gives the rotation axis (for γ ≠ 0). This axis-angle representation uses only three parameters (the components of ω) and advantageously avoids the gimbal-lock singularities associated with Euler angles.
The parameter ζ is the set of coefficients of the DFS that defines the entire 3D shape of the target. The core assumption of the model is that the target is a rigid body. Therefore, this single, unified shape parameter ζ is used to calculate the surface radius at any point (θ_t, φ_t) on the object’s surface. When processing the j-th measurement, we use its projected angles (θ_{j,t}, φ_{j,t}) as inputs to the shape function defined by ζ to obtain the predicted radius at that exact spot.
Let Z_t^G = z_{j,t} − c_t; the angle pair (θ_{j,t}, φ_{j,t}) is then expressed as
$$\theta_{j,t} = \arccos\!\left( \frac{x_t^G}{\| z_{j,t} - c_t \|_2} \right)$$
$$\phi_{j,t} = \operatorname{atan2}\!\left( y_t^G, x_t^G \right)$$
where Z_t^G = (x_t^G, y_t^G, z_t^G)^⊤.
As the measurement z j , t cannot be isolated as an explicit function of the state vector and noise, Equation (8) defines an implicit measurement model. It can be rewritten as:
$$0 = \underbrace{-z_{j,t} + c_t + R(\omega) \begin{bmatrix} f(\zeta, \theta_{j,t}, \phi_{j,t}) \cos(\phi_{j,t}) \cos(\theta_{j,t}) \\ f(\zeta, \theta_{j,t}, \phi_{j,t}) \cos(\phi_{j,t}) \sin(\theta_{j,t}) \\ f(\zeta, \theta_{j,t}, \phi_{j,t}) \sin(\phi_{j,t}) \end{bmatrix}}_{h(x_t, z_{j,t}, \zeta, \omega)} + e_t = h(x_t, z_{j,t}, \zeta, \omega) + e_t$$
The zero vector appearing in Equation (12) can be interpreted as representing a pseudo-measurement. This pseudo-measurement is effectively a nonlinear function of the state and the original measurement, assumed to be corrupted by additive Gaussian noise.
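A minimal Python sketch of the implicit measurement function h in Equation (12) is given below. It reuses the illustrative helpers radial_function() and rotation_from_axis_angle() from the earlier sketches, assumes the position occupies the first three state components, and extracts the source angles with the standard spherical inverse of the forward map; these are our assumptions, not prescriptions from the paper.

```python
import numpy as np

def pseudo_measurement(x_t, z_jt, zeta, omega, N, M):
    """Evaluate h(x_t, z_jt, zeta, omega); a perfect fit yields the zero vector."""
    c_t = x_t[:3]                         # target centre (assumed first three states)
    d = z_jt - c_t                        # measurement relative to the centre
    r = np.linalg.norm(d)
    theta = np.arctan2(d[1], d[0])        # azimuth of the measurement source
    phi = np.arcsin(d[2] / r)             # elevation of the measurement source
    f = radial_function(zeta, theta, phi, N, M)
    surface = f * np.array([np.cos(phi) * np.cos(theta),
                            np.cos(phi) * np.sin(theta),
                            np.sin(phi)])
    # Implicit model: 0 = -z + c + R(omega) * surface point (+ measurement noise)
    return -z_jt + c_t + rotation_from_axis_angle(omega) @ surface
```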

3. ECM Tracking Algorithm Based on DFS

This section details the DFS-ECM framework for joint motion state, shape, and orientation estimation. Its core, the Expectation Conditional Maximization (ECM) algorithm, uses a batch method involving iterative Kalman smoothing over a sliding data window. This incorporates information from previous and subsequent scans. For a batch window from time 1 to T b , the ECM framework defines these variables:
$$X = \left( x_1, \ldots, x_t, \ldots, x_{T_b} \right)$$
$$Z = \left( Z_1, \ldots, Z_t, \ldots, Z_{T_b} \right)$$
$$\Lambda = \left( \Lambda_1, \ldots, \Lambda_t, \ldots, \Lambda_{T_b} \right)$$
where X denotes the latent variables within the sliding window; Z represents the set of measurements within the sliding window, with Z_t = (z_{1,t}, …, z_{j,t}, …, z_{N_t,t}); and Λ denotes the complete sequence of unknown parameters, where each Λ_t = (ω, ζ) of the sliding window at time step t. ECM is an iterative optimization method whose objective is to maximize the likelihood of the observed data Z with respect to the parameters Λ, treating X as latent variables.
The main terms used in this algorithm can be found in Table 1.
We define the complete-data log-likelihood function L ( Λ ( i + 1 ) ) and its corresponding conditional expectation function Q ( Λ ( i + 1 ) ; Λ ( i ) ) of the ( i + 1 ) th iteration as follows:
$$L\left( \Lambda^{(i+1)} \right) = \ln p\left( X^{(i+1)}, Z \mid \Lambda^{(i+1)} \right)$$
$$Q\left( \Lambda^{(i+1)}; \Lambda^{(i)} \right) = \mathbb{E}\left[ L\left( \Lambda^{(i+1)} \right) \mid Z, \Lambda^{(i)} \right]$$
As the Q function involves multiple unknown parameters, a direct application of the traditional maximization step is not feasible. Instead, we utilize multiple conditional maximization steps to iteratively estimate the parameters.

3.1. E-Step

The primary purpose of the E-step is to estimate the state x_t and its corresponding error covariance matrix P_t, a process that involves a forward unscented Kalman filter (UKF) [31,32] and a backward Rauch–Tung–Striebel smoother [33,34]. The forward propagation is performed as follows.
Let n be the dimension of the kinematic state x_t. We generate 2n + 1 sigma points X_{k,t−1} based on the previous state estimate x_{t−1} and error covariance P_{t−1} (the state is initialized as x_0 = μ_0 with covariance P_0):
$$\mathcal{X}_{k,t-1} = \begin{cases} x_{t-1}, & k = 0 \\ x_{t-1} + \left( \sqrt{(n+\lambda)\, P_{t-1}} \right)_k, & k = 1, \ldots, n \\ x_{t-1} - \left( \sqrt{(n+\lambda)\, P_{t-1}} \right)_{k-n}, & k = n+1, \ldots, 2n \end{cases}$$
The weights for mean W k ( m ) and covariance W k ( c ) are:
$$W_k^{(m)} = \begin{cases} \dfrac{\lambda}{n+\lambda}, & k = 0 \\ \dfrac{1}{2(n+\lambda)}, & k \neq 0 \end{cases}$$
$$W_k^{(c)} = \begin{cases} \dfrac{\lambda}{n+\lambda} + \left( 1 - \alpha^2 + \beta \right), & k = 0 \\ \dfrac{1}{2(n+\lambda)}, & k \neq 0 \end{cases}$$
where λ = α²(n + κ) − n is a scaling parameter, α controls the spread of the sigma points around the mean, κ is a secondary scaling parameter, and β is used to incorporate prior knowledge of the state distribution.
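The sigma-point construction and weights can be written compactly as below; this is an illustrative Python sketch that uses the lower-triangular Cholesky factor as the matrix square root (an implementation choice), with default parameters matching those used later in the simulations.

```python
import numpy as np

def sigma_points_and_weights(x, P, alpha=1.0, beta=2.0, kappa=0.0):
    """Generate the 2n+1 sigma points and the UKF mean/covariance weights."""
    n = x.size
    lam = alpha**2 * (n + kappa) - n
    S = np.linalg.cholesky((n + lam) * P)        # matrix square root
    X = np.tile(x, (2 * n + 1, 1))
    for k in range(n):
        X[k + 1] += S[:, k]                      # positive spread
        X[n + k + 1] -= S[:, k]                  # negative spread
    Wm = np.full(2 * n + 1, 0.5 / (n + lam))
    Wc = Wm.copy()
    Wm[0] = lam / (n + lam)
    Wc[0] = lam / (n + lam) + (1.0 - alpha**2 + beta)
    return X, Wm, Wc
```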
Prediction step of UKF:
(1) Propagate Sigma Points: Pass each state sigma point through the process model F ,
X k , t | t 1 = F X k , t 1
(2) Compute the predicted state x ^ t | t 1 and predict error covariance P t | t 1 :
$$\hat{x}_{t|t-1} = \sum_{k=0}^{2n} W_k^{(m)} \mathcal{X}_{k,t|t-1}$$
$$P_{t|t-1} = \sum_{k=0}^{2n} W_k^{(c)} \left( \mathcal{X}_{k,t|t-1} - \hat{x}_{t|t-1} \right) \left( \mathcal{X}_{k,t|t-1} - \hat{x}_{t|t-1} \right)^\top + C$$
Update step of UKF: the update proceeds sequentially for each measurement. Initialize the updated state and error covariance for this time step with predicted values,
x ^ 0 , t = x ^ t | t 1
P 0 , t = P t | t 1
the measurements { z j , t } j = 1 N t are processed using the following steps,
(1) Predict the j-th measurement Z_{j,k,t|t−1} for each sigma point X_{k,t|t−1} using the parameters ζ, ω,
$$Z_{j,k,t|t-1} = h\left( \mathcal{X}_{k,t|t-1}, z_{j,t}, \zeta, \omega \right)$$
(2) Compute the j th weighted average measurement z ^ j , t | t 1 of the predicted sigma point,
$$\hat{z}_{j,t|t-1} = \sum_{k=0}^{2n} W_k^{(m)} Z_{j,k,t|t-1}$$
(3) Estimate the j th innovation covariance P j , z z , t | t 1 and the cross-covariance P j , x z , t | t 1 ,
$$P_{j,zz,t|t-1} = \sum_{k=0}^{2n} W_k^{(c)} \left( Z_{j,k,t|t-1} - \hat{z}_{j,t|t-1} \right) \left( Z_{j,k,t|t-1} - \hat{z}_{j,t|t-1} \right)^\top + R_t$$
$$P_{j,xz,t|t-1} = \sum_{k=0}^{2n} W_k^{(c)} \left( \mathcal{X}_{k,t|t-1} - \hat{x}_{t|t-1} \right) \left( Z_{j,k,t|t-1} - \hat{z}_{j,t|t-1} \right)^\top$$
(4) Calculate the j th Kalman Gain K j , t ,
$$K_{j,t} = P_{j,xz,t|t-1}\, P_{j,zz,t|t-1}^{-1}$$
(5) Update the j th state x ^ j , t and its error covariance P j , t ,
$$\hat{x}_{j,t} = \hat{x}_{j-1,t} + K_{j,t} \left( 0 - \hat{z}_{j,t|t-1} \right)$$
$$P_{j,t} = P_{j-1,t} - K_{j,t} P_{j,zz,t|t-1} K_{j,t}^\top$$
After processing all N_t measurements, the UKF state estimate x̂_t and its error covariance P_t are:
x ^ t = x ^ N t , t
P t = P N t , t
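The sequential measurement update can be organized as in the following sketch. It regenerates sigma points before each measurement (one common variant; the equations above reuse the predicted sigma points), and relies on the illustrative pseudo_measurement() and sigma_points_and_weights() helpers defined earlier; the measurement covariance R_t is assumed fixed within the scan.

```python
import numpy as np

def ukf_sequential_update(x_pred, P_pred, Z_t, zeta, omega, R_t, N, M):
    """Fold the N_t pseudo-measurements of one scan into the predicted state."""
    x, P = x_pred.copy(), P_pred.copy()
    for z_j in Z_t:
        X, Wm, Wc = sigma_points_and_weights(x, P)
        Zs = np.array([pseudo_measurement(Xk, z_j, zeta, omega, N, M) for Xk in X])
        z_hat = Wm @ Zs                                  # predicted pseudo-measurement
        dZ = Zs - z_hat
        dX = X - x
        P_zz = dZ.T @ (Wc[:, None] * dZ) + R_t           # innovation covariance
        P_xz = dX.T @ (Wc[:, None] * dZ)                 # state/measurement cross-covariance
        K = P_xz @ np.linalg.inv(P_zz)                   # Kalman gain
        x = x + K @ (np.zeros(3) - z_hat)                # the pseudo-measurement is 0
        P = P - K @ P_zz @ K.T
    return x, P
```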
After sequentially processing the data from time t = 1 to t = T b with an Unscented Kalman Filter (UKF), we obtain the series of filter states { x ^ t } t = 1 T b and error covariances { P t } t = 1 T b . Subsequently, the Rauch–Tung–Striebel smoothing algorithm is applied to compute the smoothed state x ^ t | T b and its error covariance P t | T b as follows,
The backward pass begins at the final time step T b . (1) Initialize the smoothed state x ^ 0 , t | T b and error covariance P 0 , t | T b for this time step with UKF estimation,
x ^ 0 , t | T b = x ^ T b
P 0 , t | T b = P T b
The algorithm then iterates backward in time, from t = T b 1 down to t = 1 . For each time step t , the following computations with the measurements { z j , t } j = 1 N t are performed:
(2) Compute the j th Smoothed Gain G j , t ,
G j , t = P j , x z , t + 1 | t P t + 1 | t 1
where P_{j,xz,t+1|t} is the j-th cross-covariance and P_{t+1|t} is the predicted error covariance, both obtained from the forward UKF pass.
(3) Compute the j th Smoothed state x ^ j , t | T b ,
$$\hat{x}_{j,t|T_b} = \hat{x}_{j-1,t|T_b} + G_{j,t} \left( \hat{x}_{t+1|T_b} - \hat{x}_{t+1|t} \right)$$
(4) Compute the j th Smoothed error Covariance P j , t | T b ,
$$P_{j,t|T_b} = P_{j-1,t|T_b} + G_{j,t} \left( P_{t+1|T_b} - P_{t+1|t} \right) G_{j,t}^\top$$
After processing all N_t measurements, the smoothed state estimate x̂_{t|T_b} and its error covariance P_{t|T_b} are,
x ^ t | T b = x ^ N t , t | T b
P t | T b = P N t , t | T b
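A sketch of the backward pass is given below. For brevity it collapses the per-measurement indexing above into one smoother gain per time step, which is an assumption of this illustration; the inputs are the filtered estimates, one-step predictions, and cross-covariances produced by the forward UKF.

```python
import numpy as np

def rts_backward_pass(x_filt, P_filt, x_pred, P_pred, C_cross):
    """Rauch-Tung-Striebel backward pass over the sliding window.

    x_filt[t], P_filt[t] : filtered estimates from the forward UKF
    x_pred[t], P_pred[t] : one-step predictions for time t (made at t-1)
    C_cross[t]           : cross-covariance between the state at t-1 and the
                           prediction at t, stored during the forward pass
    """
    Tb = len(x_filt)
    x_s, P_s = [None] * Tb, [None] * Tb
    x_s[-1], P_s[-1] = x_filt[-1], P_filt[-1]          # initialize at t = T_b
    for t in range(Tb - 2, -1, -1):
        G = C_cross[t + 1] @ np.linalg.inv(P_pred[t + 1])        # smoother gain
        x_s[t] = x_filt[t] + G @ (x_s[t + 1] - x_pred[t + 1])    # smoothed state
        P_s[t] = P_filt[t] + G @ (P_s[t + 1] - P_pred[t + 1]) @ G.T
    return x_s, P_s
```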
Considering the Markov property of state propagation and the independence of measurement sources, the log-likelihood function L ( Λ ( i + 1 ) ) is written as,
$$L\left( \Lambda^{(i+1)} \right) = \ln p(x_0) + \sum_{t=1}^{T_b} \ln p\left( x_{t+1} \mid x_t \right) + \sum_{t=1}^{T_b} \sum_{j=1}^{N_t} \ln p\left( z_{j,t} \mid x_t, \Lambda^{(i+1)} \right)$$
So,
$$Q\left( \Lambda^{(i+1)}; \Lambda^{(i)} \right) = \mathbb{E}_{Z, \Lambda^{(i)}}\left[ \ln p(x_0) \right] + \sum_{t=1}^{T_b} \mathbb{E}_{Z, \Lambda^{(i)}}\left[ \ln p\left( x_t \mid x_{t-1} \right) \right] + \sum_{t=1}^{T_b} \sum_{j=1}^{N_t} \mathbb{E}_{Z, \Lambda^{(i)}}\left[ \ln p\left( z_{j,t} \mid x_t, \Lambda^{(i+1)} \right) \right] = \sum_{t=1}^{T_b} \sum_{j=1}^{N_t} \mathbb{E}_{Z, \Lambda^{(i)}}\left[ \ln p\left( z_{j,t} \mid x_t, \Lambda^{(i+1)} \right) \right] + \mathrm{Const}$$
Under the assumption of Gaussian measurement noise, the measurement likelihood becomes,
$$p\left( z_{j,t} \mid x_t, \omega, \zeta \right) = \mathcal{N}\left( z_{j,t} - h\left( x_t, z_{j,t}, \omega, \zeta \right);\; 0, R_t \right)$$
Inserting Equation (41) into (40), Q becomes
$$Q\left( \Lambda^{(i+1)}; \Lambda^{(i)} \right) = \sum_{t=1}^{T_b} \sum_{j=1}^{N_t} \mathbb{E}_{Z, \Lambda^{(i)}}\left[ \ln p\left( z_{j,t} \mid x_t, \Lambda_t^{(i+1)} \right) \right] + \mathrm{Const} = -\frac{1}{2} \sum_{j=1}^{N_t} \left\{ \sum_{t=1}^{T_b} \ln \left| R_{j,t} \right| + \mathrm{Tr}\left[ \sum_{t=1}^{T_b} R_{j,t}^{-1}\, \mathbb{E}_{Z, \Lambda^{(i)}}\left[ \left( z_{j,t} - h\left( x_t, z_{j,t}, \omega, \zeta \right) \right) (\cdot)^\top \right] \right] \right\} + \mathrm{Const}$$
where (·) denotes the same term as the preceding one.

3.2. M-Step

After obtaining Q(Λ^{(i+1)}; Λ^{(i)}) in the E-step of the algorithm, the task of the M-step is to find Λ^{(i+1)}, thereby updating the target’s shape and axis-angle parameters from the previous iteration. Since Q(Λ^{(i+1)}; Λ^{(i)}) is a nonlinear function of the parameters, its analytical solution is generally difficult to obtain. Here, the parameters are estimated by minimizing regularized cost functions, which is equivalent to maximizing Q(Λ^{(i+1)}; Λ^{(i)}).
(1)
Shape parameter optimization
The shape parameter ζ is estimated by minimizing the cost function J(ζ) over a sliding window of T_b time steps (t = 1 to T_b).
$$J_{original}(\zeta) = \frac{1}{\sum_{t=1}^{T_b} N_t} \sum_{t=1}^{T_b} \sum_{j=1}^{N_t} \left\| z_{j,t} - h\left( \hat{x}_t, z_{j,t}, \zeta, \hat{\omega} \right) \right\|_{R_t^{-1}}^2 + \eta_\zeta \left\| \zeta_{2:end} \right\|_2^2 + P_s(\zeta)$$
where x̂_t, ω̂ are the estimates of the state and orientation at time t within the window, and ‖e‖²_{R⁻¹} = e^⊤ R^{−1} e denotes the squared Mahalanobis distance. The first term is the average squared Mahalanobis distance over all valid measurements in the window. η_ζ is the L2 regularization parameter applied to the coefficients (excluding ζ_0) to prevent overfitting and improve conditioning. P_s(ζ) is the smoothness penalty term.
Remark 1.
Using the original cost function J_original(ζ) can lead to artificial bulges or distortions near the poles (φ = ±π/2). This occurs because the spherical coordinate system used by the DFS is degenerate at the poles, and measurements near these regions can exert a disproportionate influence during optimization, leading to overfitting localized at the poles.
To mitigate these artifacts, we introduce a latitude-dependent weight factor w p o l e ( ϕ ) that down-weights measurements near the poles. A common choice is
$$w_{pole}(\phi) = \cos^{h}\left( |\phi| \right)$$
where h 1 (e.g., h = 1 or h = 2 ) is a tuning parameter controlling the strength of the suppression.
The modified cost function J ( ζ ) is,
$$J(\zeta) = \frac{1}{\sum_{t=1}^{T_b} \sum_{j=1}^{N_t} w_{pole}(\hat{\phi}_{j,t})} \sum_{t=1}^{T_b} \sum_{j=1}^{N_t} w_{pole}(\hat{\phi}_{j,t}) \left\| z_{j,t} - h\left( \hat{x}_t, z_{j,t}, \zeta, \hat{\omega} \right) \right\|_{R_t^{-1}}^2 + \eta_\zeta \left\| \zeta_{2:end} \right\|_2^2 + P_s(\zeta)$$
Remark 2.
By multiplying the squared Mahalanobis distance of each measurement by w_pole(φ), we reduce the contribution of measurements where |φ| is close to π/2. This is mathematically equivalent to assuming a higher effective measurement noise variance (R*_{j,t} = R_t / w_pole(φ̂_{j,t})) near the poles. It prevents the optimizer from overfitting to potentially unreliable projections in these geometrically sensitive regions, allowing the regularization terms to enforce a smoother, more plausible shape. The tuning parameter h controls how aggressively measurements near the poles are down-weighted.
Improved Smoothness Penalty: A physically meaningful smoothness penalty for the DFS on a sphere penalizes higher spatial frequencies more heavily. The indices n and m in the DFS correspond to the frequencies in the azimuth and elevation directions, respectively. Higher values of n and m represent finer details or oscillations on the surface. A suitable penalty term P_s(ζ) is given by:
$$P_s(\zeta) = \eta_s \sum_{n=1}^{N} \sum_{m=1}^{M} \left( n^{\nu} + m^{\nu} \right) \left[ \left( \zeta_{nm}^{cc} \right)^2 + \left( \zeta_{nm}^{ss} \right)^2 + \left( \zeta_{nm}^{sc} \right)^2 + \left( \zeta_{nm}^{cs} \right)^2 \right]$$
where η_s is the smoothness regularization strength parameter. The term (n^ν + m^ν) acts as a weighting factor that increases with the frequency indices n and m. A common choice is ν = 2, making the penalty proportional to (n² + m²). The sum of squares involves all four coefficient types for the given (n, m) pair.
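A minimal sketch of the weighted, regularized cost J(ζ) of Equation (45) is given below. The hyper-parameter values, the window data layout, the assumption of a time-invariant measurement covariance, and the reuse of the illustrative pseudo_measurement() helper are our own choices; in practice the cost can be handed to a generic optimizer such as scipy.optimize.minimize.

```python
import numpy as np

def shape_cost(zeta, windows, omega_hat, R_inv, N, M,
               eta_zeta=1e-3, eta_s=1e-3, nu=2, h_pole=1):
    """Weighted data fit + L2 regularization + frequency-weighted smoothness penalty.

    windows : list of (x_hat_t, measurements) pairs over the sliding window.
    """
    num, den = 0.0, 0.0
    for x_hat, Z_t in windows:
        for z_j in Z_t:
            d = z_j - x_hat[:3]
            phi = np.arcsin(d[2] / np.linalg.norm(d))        # elevation of the source
            w = np.cos(abs(phi)) ** h_pole                   # pole down-weighting
            e = pseudo_measurement(x_hat, z_j, zeta, omega_hat, N, M)
            num += w * (e @ R_inv @ e)                       # weighted Mahalanobis term
            den += w
    data_term = num / den
    l2_term = eta_zeta * np.sum(zeta[1:] ** 2)               # exclude the base radius
    # Frequency-weighted smoothness penalty P_s(zeta)
    smooth, idx = 0.0, 1
    for n in range(1, N + 1):
        for m in range(1, M + 1):
            smooth += (n ** nu + m ** nu) * np.sum(zeta[idx:idx + 4] ** 2)
            idx += 4
    return data_term + l2_term + eta_s * smooth
```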
(2)
Orientation parameter optimization
Minimize J ( ω ) at step t , regularizing towards the previous estimate ω ^ :
$$J(\omega) = \frac{1}{N_t} \sum_{j=1}^{N_t} \left\| z_{j,t} - h\left( \hat{x}_t, z_{j,t}, \hat{\zeta}, \omega \right) \right\|_{R_t^{-1}}^2 + \eta_\omega \left\| \omega - \hat{\omega} \right\|_2^2$$
where x̂_t, ζ̂ are the current estimates of the state and shape, ω̂ is the estimated orientation parameter from the previous time step, and η_ω is the regularization parameter controlling the penalty for deviating from the previous orientation estimate, promoting temporal smoothness. A minimal sketch combining this cost with the constraints below is given after the constraint list.
(3)
Optimization Constraints
Constraints can be applied during optimization:
 A.
Shape parameter constraints:
ζ 0 > ε (small positive value)
 B.
Rotation parameter constraints:
Angle limit: ‖ω‖_2 ≤ π.
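Putting the orientation cost of Equation (47) together with the constraints above, one possible Python sketch is shown below; the hyper-parameter value and the helper pseudo_measurement() are illustrative assumptions, and the commented call to scipy.optimize.minimize shows one way the angle limit could be enforced.

```python
import numpy as np

def orientation_cost(omega, x_hat, Z_t, zeta_hat, omega_prev, R_inv, N, M, eta_omega=1e-2):
    """Average Mahalanobis fit plus a pull towards the previous orientation estimate."""
    errs = [pseudo_measurement(x_hat, z_j, zeta_hat, omega, N, M) for z_j in Z_t]
    data_term = np.mean([e @ R_inv @ e for e in errs])
    return data_term + eta_omega * np.sum((omega - omega_prev) ** 2)

# Possible constrained minimization enforcing ||omega||_2 <= pi (illustrative):
# from scipy.optimize import minimize
# res = minimize(orientation_cost, omega_prev,
#                args=(x_hat, Z_t, zeta_hat, omega_prev, R_inv, N, M),
#                constraints=[{"type": "ineq",
#                              "fun": lambda w: np.pi - np.linalg.norm(w)}])
```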
Iteration of the EM algorithm ceases upon convergence or reaching a limit. Convergence is typically determined when the change in the likelihood function value between successive iterations is sufficiently small. Alternatively, termination occurs if the iteration count reaches a predefined upper bound.
The overall algorithm flow is summarized in Algorithm 1.
Algorithm 1. DFS-ECM
Initial parameters: Measurement set batch Z , State batch X , orientation and extent parameters Λ , Maximum Iterations I
begin
  Setup i = 0
1       while not converged and i < I do
2              expectation:
3                  for t = 1 : T b do
4                       for j = 1 : N t do
5                           Calculate x ^ j , t and P j , t according to (28–29)
6                        end for
7                    end for
8                    for t = T b : 1 : 1 do
9                        Calculate smoothed x ^ t | T b and P t | T b  according to (37–38)
10                    end for
11               maximization:
12                   Calculate ζ , ω according to (45,47)
13                i + + ;
14         end while
end

4. Simulation Results

Our proposed method, referred to as DFS-ECM in this section, is evaluated in two distinct scenarios. The simulations are run on an Intel Core™ i7-7700 processor at 3.6 GHz with 32 GB of RAM using MATLAB 2021a. Root Mean Square Error (RMSE) and the Intersection-over-Union (IOU) are used to assess overall performance.

4.1. Evaluation Index

Root-Mean-Square Error (RMSE) is defined below:
$$RMSE_c = \sqrt{ \frac{1}{M} \sum_{i=1}^{M} \left( C_i - c_i \right)^2 }$$
where RMSE_c represents the RMSE of the quantity c, C_i is the true value, c_i is the estimated value in the i-th Monte Carlo run, and M is the number of runs.
The shape estimation performance is evaluated by Intersection-Over-Union (IOU). IOU is defined as follows:
$$IOU\left( v_{true}, \hat{v} \right) = \frac{ \mathrm{volume}\left( v_{true} \cap \hat{v} \right) }{ \mathrm{volume}\left( v_{true} \cup \hat{v} \right) }$$
where v_true is the true target shape and v̂ represents the estimate. IOU(v_true, v̂) ∈ [0, 1]; 1 corresponds to a perfect match, while 0 indicates no intersection between the true and the estimated extent.
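The two metrics can be computed as in the following sketch; the voxel-based approximation of the volume IOU is our own illustration, not the evaluation code used in the paper.

```python
import numpy as np

def rmse(true_vals, est_vals):
    """Root-mean-square error over M Monte Carlo runs."""
    err = np.asarray(true_vals, dtype=float) - np.asarray(est_vals, dtype=float)
    err = err.reshape(err.shape[0], -1)              # one row per run
    return np.sqrt(np.mean(np.sum(err ** 2, axis=1)))

def iou_voxel(inside_true, inside_est, grid):
    """Volume IOU approximated on a voxel grid.

    inside_true / inside_est are predicates returning True for points inside
    the respective shapes; grid is a (K, 3) array of voxel centres.
    """
    t = np.array([inside_true(p) for p in grid])
    e = np.array([inside_est(p) for p in grid])
    union = np.logical_or(t, e).sum()
    return np.logical_and(t, e).sum() / union if union > 0 else 0.0
```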

4.2. Scenario I

In this scenario, we test the DFS-ECM algorithm with ellipsoidal and cubic targets. Each target moves along a linear trajectory under the constant velocity (CV) model, with a sensor located at the origin. The trajectory of target motion is presented in Figure 1. The initial kinematic state is set to [0 m, 0 m, 0 m, 10 m/s, 10 m/s, 0 m/s]. At each instant, 100 point measurements are generated from random sources sampled from a uniform distribution defined over the target surface. The sampling time of all algorithms is set to 1 s. The orientation is represented by the axis-angle parameter ω = (0, 0, 0).
The process noise standard deviation is set to σ_c = 0.1. For the sigma points, the scaling parameters are α = 1, κ = 0, β = 2. For the extent parameter, ζ_0 = 1.0 and ζ_2, …, ζ_{d_ζ} = 0, and the covariance matrix is set to diag(0.1, …, 0.1) of size d_ζ × d_ζ. The degree of the double Fourier series is chosen as M = N = 2.
Figure 2 illustrates the typical results and tracking trajectories in one Monte Carlo run. Figure 2a,b depict the tracking scene for the ellipsoid and cube, respectively. For enhanced clarity regarding the reconstruction outcomes, Figure 3 illustrates the reconstruction results for 3D shapes obtained without incorporating the weight. A comparison between Figure 2 and Figure 3 reveals that optimizing the shape reconstruction with the weight yields superior accuracy.
We evaluated centroid estimation for linearly moving ellipsoidal and cubic targets using 100 Monte Carlo trials (randomizing noise and sample points). Figure 4 shows the centroid RMSE for sliding window lengths T b = 5 , 10 , and 15 . For the ellipsoid (Figure 4a), T b = 5 yields the minimum RMSE; performance worsens for T b = 10 and T b = 15 , suggesting potential overfitting. For the cube (Figure 4b), T b = 15 is optimal. Generally, errors decrease over the simulation (indicating stabilization), and the relative performance of different window sizes remains consistent.
Figure 5 illustrates the RMSE values of the velocity estimates with the sliding window length T b = 5 , 10 , and 15 . This indicates that while there’s a large initial uncertainty in velocity, the estimation becomes highly accurate over time, and the choice of window size (within this range) has minimal impact on the long-term velocity estimation accuracy.
Figure 6 presents the Intersection over Union (IOU) values for the shape estimates, obtained using sliding window lengths of T b = 5 , 10 , and 15 . For both the ellipsoidal and cubic targets, the analysis indicates that the highest IOU score (representing the best shape estimate accuracy) is achieved with the window length T b = 15 .
As detailed in Table 2, the errors associated with the estimated orientation parameter components remain consistently below 0.1 for both ellipsoidal and cubic target types, indicating a high degree of estimation accuracy.
For comparison, we use a standard random matrix-based extended target tracker [35], henceforth denoted as RM. Table 3 summarizes the comparative performance of the RM and DFS-ECM approaches. The results consistently show that the DFS-ECM method outperforms the RM model in terms of centroid estimation accuracy, velocity estimation accuracy, and shape estimation quality (as measured by IOU) across all tested scenarios.
As presented in Table 4, increasing the number of measurements positively impacts overall tracking performance. Specifically, a higher measurement count leads to a lower Centroid RMSE, indicating improved positioning accuracy. Concurrently, the shape estimation quality, measured by IOU, steadily improves for both the ellipsoid and the cube, confirming that more data yields more accurate reconstructions.

4.3. Scenario II

The performance of the DFS-ECM algorithm is further evaluated using a nonlinear scenario. In this experiment, a cubic target moves along a curved trajectory in three-dimensional space, as depicted in Figure 7. The target’s motion is governed by the constant acceleration (CA) model, which is defined in Equation (6). This simulation utilizes the same DFS model hyperparameters and EM parameters as those employed in Scenario Ⅰ. Within the CA model, the scalar parameter σ a 2 is set to 0.001. At each instant, 100 measurement points are sampled uniformly from the target’s surface. The orientation is represented by the axis angle parameter ω = ( 0.1 , 0.2 , 0.05 ) .
To better illustrate the reconstruction quality, Figure 8 presents magnified views of typical reconstruction examples for the cubic target. These examples demonstrate satisfactory reconstruction performance.
Figure 9 presents the centroid RMSE over time for the cubic target, comparing window lengths T b = 5 , 10 , and 15 . Optimal performance (lowest RMSE) is achieved with the shortest window, T b = 5 . Conversely, the larger window lengths ( T b = 10 and T b = 15 ) yield higher errors, potentially indicating underfitting in this nonlinear scenario. However, all three RMSE curves exhibit convergence towards stable values over time.
The velocity RMSE for the cubic target is presented in Figure 10. Notably, while the larger window sizes ( T b = 10 and T b = 15 ) show similar performance with lower peak errors and better long-term stability (lower RMSE), the smallest window ( T b = 5 ) suffers from a higher peak error (occurring later) and maintains a higher error level beyond approximately 18 s.
Figure 11 plots the volume Intersection over Union (IOU), representing shape estimation accuracy, over time for three different window sizes. Towards the end of the observation period (after 40 s), the performance stabilizes for all window sizes. They converge to similar and relatively high IOU values, approximately in the range of 0.86–0.88. While these results are strong, the accuracy is slightly below that achieved for the cube in the linear scenario. We attribute this minor performance difference to the inherent challenges of the nonlinear model. The nonlinearity likely introduces small errors into the centroid position estimate, which subsequently impacts the overall accuracy of the shape reconstruction.
The performance of orientation parameter estimation is detailed in Figure 12. The error dynamics for individual rotation components within a single Monte Carlo run are illustrated in Figure 12a–c. The overall accuracy, depicted by the average Root Mean Square Error (RMSE) across all Monte Carlo runs, is shown in Figure 12d. The low error values demonstrate the algorithm’s capability for accurate angle estimation.
Figure 13 illustrates the computational cost, specifically the average processing time per step required by the algorithm, as a function of different window lengths (T_b = 5, 10, and 15). A clear positive correlation is evident: longer window lengths necessitate increased computation time per step.

5. Conclusions

This paper proposed a method for joint 3D extended object tracking with an unknown but fixed shape in the framework of Expectation Conditional Maximization (ECM). By introducing a shape representation based on a Double Fourier Series (DFS) radial function and utilizing the axis-angle representation for orientation, this method enables the direct estimation of target shape and orientation parameters, overcoming the reliance on prior shape evolution models. Leveraging the ECM algorithm, kinematic state inference is performed using an Unscented Kalman Filter (UKF) in the E-step, while shape and orientation parameters are estimated via minimization of regularized cost functions in the M-step. This ultimately achieves the joint, robust, and smooth estimation of kinematics, shape, and orientation. The method provides a full expression of the target’s shape; this detailed information can be used in 3D applications such as navigation, identification, and classification. Future work will focus on adaptive tuning and joint estimation schemes to further advance the capability of 3D target tracking.

Author Contributions

Methodology, X.Y.; Software, H.M.; Investigation, X.Y.; Writing—original draft, H.M.; Writing—review & editing, X.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available upon request from the corresponding authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Faion, F.; Zea, A.; Steinbring, J.; Baum, M.; Hanebeck, U.D. Recursive Bayesian Pose and Shape Estimation of 3D Objects Using Transformed Plane Curves. In Proceedings of the 2015 Sensor Data Fusion: Trends, Solutions, Applications (SDF), Bonn, Germany, 6–8 October 2015; pp. 1–6. [Google Scholar]
  2. Moosmann, F.; Stiller, C. Joint self-localization and tracking of generic objects in 3D range data. In Proceedings of the 2013 IEEE International Conference on Robotics and Automation (ICRA), Karlsruhe, Germany, 6–10 May 2013; pp. 1146–1152. [Google Scholar]
  3. Held, D.; Levinson, J.; Thrun, S.; Savarese, S. Robust Real-time Tracking Combining 3D Shape, Color, and motion. Int. J. Robot. Res. 2016, 35, 30–49. [Google Scholar] [CrossRef]
  4. Kraemer, S.; Bouzouraa, M.E.; Stiller, C. Simultaneous Tracking and Shape Estimation Using a Multi-layer Laserscanner. In Proceedings of the 20th IEEE International Conference on Intelligent Transportation Systems (ITSC), Yokohama, Japan, 16–19 October 2017; pp. 1–7. [Google Scholar]
  5. Granström, K.; Baum, M.; Reuter, M. Extended Object Tracking: Introduction, Overview and Applications. J. Adv. Inf. Fusion. 2017, 2, 139–174. [Google Scholar]
  6. Mihaylova, L.; Carmi, A.Y.; Septier, F.; Gning, F.; Pang, S.K.; Godsill, S. Overview of Bayesian Sequential Monte Carlo Methods for Group and Extended Object Tracking. Digit. Signal Process. 2014, 25, 1–16. [Google Scholar] [CrossRef]
  7. Thormann, K.; Yang, S.; Baum, S. A Comparison of Kalman filter Based Approaches for Elliptic Extended Object Tracking. In Proceedings of the IEEE 23rd International Conference on Information Fusion (FUSION), Rustenburg, South Africa, 6–9 July 2020; pp. 1–8. [Google Scholar]
  8. Thormann, K.; Baum, K. Fusion of Elliptical Extended Object Estimates Parameterized with Orientation and Axes Lengths. IEEE Trans. Aerosp. Electron. Syst. 2021, 4, 2369–2382. [Google Scholar] [CrossRef]
  9. Liu, S.; Liang, Y.; Xu, L.; Li, T.; Hao, X. EM-based Extended Object Tracking Without a Prior Extension Evolution Model. Signal Process. 2021, 188, 108181. [Google Scholar] [CrossRef]
  10. Dahlén, K.M.; Lindberg, C.; Yoneda, M.; Ogawa, T. An Improved B-spline Extended Object Tracking Model Using the Iterative Closest Point Method. In Proceedings of the 25th International Conference on Information Fusion (FUSION), Linköping, Sweden, 4–7 July 2022; pp. 1–8. [Google Scholar]
  11. Wahlström, N.; Özkan, E. Extended Target Tracking Using Gaussian Processes. IEEE Trans. Signal Process. 2015, 16, 4165–4178. [Google Scholar] [CrossRef]
  12. Thormann, K.; Baum, M.; Honer, J. Extended Target Tracking Using Gaussian Processes with High-resolution Automotive Radar. In Proceedings of the 21st International Conference on Information Fusion (FUSION), Cambridge, UK, 10–13 July 2018; pp. 1764–1770. [Google Scholar]
  13. Baum, M.; Hanebeck, U.D. Shape Tracking of Extended Objects and Group Targets with Star-convex RHMs. In Proceedings of the 14th International Conference on Information Fusion(FUSION), Chicago, IL, USA, 5–8 July 2011; pp. 1–8. [Google Scholar]
  14. Baum, M.; Hanebeck, U.D. Extended Object Tracking with Random Hypersurface Models. IEEE Trans. Aerosp. Electron. Syst. 2014, 50, 149–159. [Google Scholar] [CrossRef]
  15. Baum, M.; Feldmann, M.; Fränken, D.; Hanebeck, U.D.; Koch, W. Extended Object and Group Tracking: A Comparison of Random Matrices and Random Hypersurface Models. In Proceedings of the Informatik 2010 Service Science—Neue Perspektiven für die Informatik, Beiträge der 40. Jahrestagung der Gesellschaft für Informatik e.V., Leipzig, Germany, 1 January 2010; pp. 904–906. [Google Scholar]
  16. Zea, A.; Faion, F.; Baum, M.; Hanebeck, U.D. Level-set Random Hypersurface Models for Tracking Nonconvex Extended Objects. IEEE Trans. Aerosp. Electron. Syst. 2016, 52, 2990–3007. [Google Scholar] [CrossRef]
  17. Liu, Y.; Ji, H.; Zhang, Y. Measurement Transformation Algorithm for Extended Target Tracking. Signal Process. 2021, 186, 1–12. [Google Scholar] [CrossRef]
  18. Steinbring, J.; Baum, M.; Zea, A.; Faion, F.; Hanebeck, U.D. A Closed-Form Likelihood for Particle Filters to Track Extended Objects with Star-Convex RHMs. In Proceedings of the IEEE International Conference on Multisensor Fusion and lntegration for Intelligent Systems (MFI), San Diego, CA, USA, 14–16 September 2015; pp. 1–6. [Google Scholar]
  19. Kaulbersch, H.; Baum, M.; Willett, P. EM Approach for Tracking Star-convex Extended Objects. In Proceedings of the 20th International Conference on Information Fusion, Xi’an, China, 10–13 July 2017; pp. 1–8. [Google Scholar]
  20. Liu, Y.; Ji, H.; Zhang, Y. Gaussian-like Measurement Likelihood Based Particle Filter for Extended Target Tracking. IET Radar Sonar Navig. 2022, 12, 1–15. [Google Scholar] [CrossRef]
  21. Özkan, E.; Wahlström, N.; Godsill, S.J. Rao-Blackwellised Particle Filter for Star-convex Extended Target Tracking Models. In Proceedings of the International Conference on Information Fusion, Heidelberg, Germany, 5–8 July 2016; pp. 1193–1199. [Google Scholar]
  22. Faion, F.; Baum, M.; Hanebeck, U.D. Tracking 3D Shapes in Noisy Point Clouds with Random Hypersurface Models. In Proceedings of the 15th International Conference on Information Fusion (FUSION), Singapore, 9–12 July 2012; pp. 2230–2235. [Google Scholar]
  23. Zea, A.; Faion, F.; Hanebeck, U.D. Tracking Extended Objects Using Extrusion Random Hypersurface Models. In Proceedings of the Sensor Data Fusion: Trends, Solutions, Applications (SDF), Bonn, Germany, 8–10 October 2014; pp. 1–6. [Google Scholar]
  24. Hoher, P.; Baur, T.; Reuter, J.; Griesser, D.; Govaers, F.; Koch, W. 3D Extended Object Tracking and Shape Classification with a Lidar Sensor using Random Matrices and Virtual Measurement Models. In Proceedings of the 27th International Conference on Information Fusion (FUSION), Venice, Italy, 8–11 July 2024; pp. 1–6. [Google Scholar]
  25. Baur, T.; Reuter, J.; Zea, A.; Hanebeck, U.D. Extent Estimation of Sailing Boats Applying Elliptic Cones to 3D Lidar Data. In Proceedings of the 25th International Conference on Information Fusion (FUSION), Linköping, Sweden, 4–7 July 2022; pp. 1–8. [Google Scholar]
  26. Kumru, M.; Özkan, E. 3D Extended Object Tracking Using Recursive Gaussian Processes. In Proceedings of the International Conference on Information Fusion (FUSION), Cambridge, UK, 10–13 July 2018; pp. 1–8. [Google Scholar]
  27. Kumru, M.; Özkan, E. Three-Dimensional Extended Object Tracking and Shape Learning Using Gaussian Processes. IEEE Trans. Aerosp. Electron. Syst. 2021, 5, 2795–2814. [Google Scholar] [CrossRef]
  28. Baur, T.; Reuter, J.; Zea, A.; Hanebeck, U.D. Shape Estimation and Tracking using Spherical Double Fourier Series for Three-Dimensional Range Sensors. In Proceedings of the IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI), Karlsruhe, Germany, 23–25 September 2021; pp. 1–6. [Google Scholar]
  29. Hu, H.; Beck, J.; Lauer, M.; Stiller, C. Continuous Fusion of IMU and Pose Data using Uniform B-Spline. In Proceedings of the Multisensor Fusion and Integration for Intelligent System(MFI), Karlsruhe, Germany, 14–16 September 2020; pp. 1–6. [Google Scholar]
  30. Li, X.R.; Jilkov, V.P. Survey of Maneuvering Target Tracking. Part I: Dynamic Models. IEEE Trans. Aerosp. Electron. Syst. 2003, 4, 1333–1364. [Google Scholar]
  31. Wan, E.; Merwe, R. The Unscented Kalman Filter for Nonlinear Estimation. In Proceedings of the IEEE 2000 Adaptive Systems for Signal Processing, Communications, and Control Symposium, Lake Louise, AB, Canada, 4 October 2000; pp. 1–6. [Google Scholar]
  32. Julier, S.J.; Uhlmann, J.K. Unscented filtering and nonlinear estimation. Proc. IEEE 2004, 3, 401–422. [Google Scholar] [CrossRef]
  33. Särkkä, S. Unscented Rauch-Tung-Striebel smoother. IEEE Trans. Autom. Control. 2008, 3, 845–849. [Google Scholar] [CrossRef]
  34. Huang, Y.; Zhang, Y.; Zhao, Y.; Mihaylova, L.; Chambers, J.A. Robust Rauch-Tung-Striebel smoothing framework for heavy-tailed and/or skew noises. IEEE Trans. Aerosp. Electron. Syst. 2020, 1, 415–441. [Google Scholar] [CrossRef]
  35. Feldmann, M.; Fränken, D.; Koch, W. Tracking of extended objects and group targets using random matrices. IEEE Trans. Signal Process. 2011, 4, 1409–1420. [Google Scholar] [CrossRef]
Figure 1. (a) ellipsoid target tracking; (b) cubic target tracking.
Figure 2. Target construction, (a) ellipsoid target, (b) cube target.
Figure 3. Target construction without weight (a) ellipsoid target (b) cube target.
Figure 4. Centroid RMSE (a) ellipsoid target RMSE (b) cube target RMSE.
Figure 5. Velocity RMSE (a) ellipsoid target RMSE (b) cube target RMSE.
Figure 6. IOU (a) ellipsoid target IOU (b) cube target IOU.
Figure 7. Cube target tracking.
Figure 8. Cube target.
Figure 9. Centroid RMSE of cube.
Figure 10. Velocity RMSE of cube.
Figure 11. IOU of cube.
Figure 12. Rotation parameters estimate (a) rotation about the x-axis (b) rotation about the y-axis (c) rotation about the z-axis (d) orientation RMSE.
Figure 13. Average runtime.
Table 1. Summary of the main terms used in the derivation of the algorithm.
P t The error covariance of target state at time t
P t | t − 1 The predicted error covariance of the target state at time t
P j , z z , t | t 1 The j th innovation covariance at time t
P j , x z , t | t 1 The j th cross-covariance at time t
P j , t The j th error covariance at time t
P j , t | T b The j th smoothed covariance at time t
P t | T b The smoothed error covariance at time t
p ( x t | x t 1 ) Transition Probability density of state x t
p ( z j , t | x t , Λ ( i ) ) The likelihood function
p ( X ( i + 1 ) , Z | Λ ( i ) ) The likelihood function of complete data
p ( x 0 ) The prior Probability density of state
P s ( ζ ) The smoothness penalty term of the cost function
i Iteration times of ECM
j Index of measurement number
Table 2. Orientation parameter estimation errors (rad).
Target     Angle   T = 5    T = 10   T = 15
Ellipsoid  ω_x     0.020    0.030    0.030
           ω_y     0.080    0.050    0.060
           ω_z     0.050    0.050    0.050
Cube       ω_x     0.043    0.042    0.043
           ω_y     0.004    0.004    0.003
           ω_z     0.015    0.012    0.016
Table 3. Performance comparison of the two different algorithms.
Target     Algorithm   Centroid RMSE (m)   Velocity RMSE (m/s)   IOU
Ellipsoid  RM          0.27                0.10                  0.90
           DFS-ECM     0.26                0.07                  0.96
Cube       RM          0.27                0.10                  0.70
           DFS-ECM     0.20                0.07                  0.95
Table 4. Performance comparison for different numbers of measurements.
Target     Number of Measurements   Centroid RMSE (m)   IOU
Ellipsoid  10                       0.29                0.91
           20                       0.28                0.92
           40                       0.27                0.96
Cube       10                       0.30                0.90
           20                       0.28                0.92
           40                       0.288               0.95