Correct Degree Selection for Koopman Mode Decomposition

Shin, Kilho; Asaoka, Shodai

doi:10.3390/math14040603


by Kilho Shin ¹,* and Shodai Asaoka ²

¹ Computer Centre, Gakushuin University, Tokyo 171-8588, Japan

² Graduate School of Computer Science and Systems Engineering, Kyushu Institute of Technology, Fukuoka 820-8502, Japan

* Author to whom correspondence should be addressed.

Mathematics 2026, 14(4), 603; https://doi.org/10.3390/math14040603

Submission received: 8 December 2025 / Revised: 25 January 2026 / Accepted: 5 February 2026 / Published: 9 February 2026

(This article belongs to the Section E: Applied Mathematics)


Abstract

Fourier Decomposition (FD) and Koopman Mode Decomposition (KMD) are important tools for time series data analysis, applied across a broad spectrum of applications. Both aim to decompose time series functions into superpositions of countably many wave functions, with strikingly similar mathematical foundations. These methodologies derive from the linear decomposition of functions within specific function spaces: FD uses a fixed basis of sine and cosine functions, while KMD employs eigenfunctions of the Koopman linear operator. A notable distinction lies in their scope: FD is confined to periodic functions, while KMD can decompose functions into exponentially amplifying or damping waveforms, making it potentially better suited for describing phenomena beyond FD’s capabilities. However, practical applications of KMD often show that despite an accurate approximation of training data, its prediction accuracy is limited. This paper clarifies that this issue is closely related to the number of wave components used in decomposition, referred to as the degree of a KMD. Existing methods use predetermined, arbitrary, or ad hoc values for this degree. We demonstrate that using a degree different from a uniquely determined value for the data allows infinitely many KMDs to accurately approximate the training data, which explains why current methods, which select a single KMD from these candidates, struggle with prediction accuracy. Furthermore, we introduce mathematically supported algorithms to determine the correct degree. Simulations verify that our algorithms can identify the right degrees and generate KMDs that can make accurate predictions, even with noisy data.

Keywords:

Koopman mode decomposition; uniquely feasible degree; Hankel matrix; spectral decomposition; model selection; dynamical systems; time series analysis

MSC:

46N10; 37M10

1. Introduction

A wide range of natural and social phenomena are observed as superpositions of multiple nonlinear elemental processes. For example, recorded audio signals typically include not only the target speech—for instance, a conversation between individuals—but also various environmental noise components. Similarly, variations in the geomagnetic field arise from both internal processes, such as temporal fluctuations in the Earth’s main magnetic field, and external perturbations, such as solar flares and solar wind. Decomposing such observations into constituent processes and extracting only the components relevant to the study is a fundamental procedure in scientific research. Among such methods, frequency analysis—where time-series data are decomposed into countably (often finitely) many frequency components—plays a central role in data science.

A foundational principle across the natural sciences involves reducing nonlinear phenomena to linear problems, enabling analysis via linear algebra. For instance, kernel methods in machine learning embed data into high-dimensional (often infinite-dimensional Hilbert) spaces to facilitate linear solutions. Likewise, neural networks—which approximate arbitrary continuous (and thus potentially nonlinear) functions via linear combinations followed by nonlinear activations—rely on efficient linear transformations during training. The backpropagation algorithm, essential for learning from large-scale data, exemplifies this reliance.

A well-established and widely applied technique based on this principle is Fourier decomposition (FD), which forms the foundation of frequency analysis across a wide range of fields. Grounded in functional analysis, it represents a function as a linear combination of frequency components, typically expressed in terms of an orthonormal basis of trigonometric functions. This decomposition facilitates tasks such as signal characterization and noise reduction and has found broad applications in speech recognition and compression, image processing, radar and sonar analysis, time-series forecasting, and medical imaging.

Koopman Mode Decomposition (KMD) has recently attracted considerable theoretical attention as a powerful extension of Fourier decomposition. Its origin traces back to the 1931 work of B.O. Koopman, who formulated a representation of nonlinear dynamical systems through linear operators acting on function spaces—now referred to as Koopman operators. The theoretical foundation of this framework was subsequently formalized, and beginning in the 1990s, research by I. Mezić and collaborators renewed interest in its potential to reveal latent dynamics in nonlinear systems. The development of Dynamic Mode Decomposition (DMD) by P.J. Schmid ignited a new wave of research and led to advanced extensions such as Extended DMD (EDMD), which enable practical estimation of Koopman spectral components from data beyond the original limitations of DMD.

Both FD and KMD can be regarded as mathematical models in the sense that they aim to represent observed phenomena through mathematical expressions. From a regression perspective, both methods identify functions that adequately explain given numerical data by constructing time-series functions fitted to observations. One of the principal advantages of mathematical modeling lies not only in its ability to facilitate accurate and interpretable descriptions of underlying mechanisms and contributing factors, but also in its applicability to prediction.

However, although KMD often provides accurate representations of observed dynamics, it has been noted that its predictive accuracy may deteriorate under certain conditions. The present study aims to address this limitation by identifying the sources of prediction error in KMD and by proposing efficient algorithms to extract those Koopman modes that, when they exist, constitute the only viable candidates for accurate forecasting.

To illustrate Koopman Mode Decomposition and our contributions, we begin by recalling the concept of Fourier Decomposition (FD). Let $f(t)$ be a $2π$-periodic, complex-valued function in $L^{2}$. That is, $f(t + 2π) = f(t)$ for all t, and

\int_{- π}^{π} {| f (t) |}^{2} d t < \infty .

The space of such functions forms a Hilbert space, where the inner product between $g(t)$ and $h(t)$ is defined by

\langle g, h \rangle = \int_{-π}^{π} g(t) \cdot \overline{h(t)} \, dt,

with $\bar{x}$ denoting the complex conjugate of x. Then, $f(t)$ admits the decomposition:

f (t) = \sum_{n = - \infty}^{\infty} c_{n} e^{i n t} .

(1)

This decomposition is justified by the fact that the family $\{\frac{1}{\sqrt{2π}} e^{i n t} \mid n \in \mathbb{Z}\}$ forms a countable orthonormal basis for the Hilbert space of $2π$-periodic functions in $L^{2}([0, 2π])$. The convergence in Equation (1) is understood in the $L^{2}$-norm. Although such convergence does not imply pointwise convergence, Riesz’s theorem ([1] Theorem 3.12) guarantees that a subsequence of the partial sums converges pointwise almost everywhere.

Koopman Mode Decomposition (KMD) [2] is similar to FD in that it expresses a function as a sum of oscillatory components. However, unlike FD, KMD allows for exponentially growing or decaying components. Hence, if a KMD of $f(t)$ exists, it takes the form

f (t) = \sum_{n = 0}^{\infty} c_{n} λ_{n}^{t},

(2)

where $λ_{n}^{t} = |λ_{n}|^{t} e^{i \arg(λ_{n}) t}$. Thus, unless $|λ_{n}| = 1$, each term represents an exponentially growing (if $|λ_{n}| > 1$) or decaying (if $|λ_{n}| < 1$) wave (see Figure 1). The $λ_{n}$ constitute a countable subset of the spectrum of the so-called Koopman operator [2].

KMD is expected to provide a more flexible framework for representing diverse phenomena than traditional Fourier analysis, particularly for systems exhibiting nonlinear dynamics, transient behaviors, or complex spatiotemporal patterns. This flexibility has led to successful applications across numerous scientific and engineering domains. In fluid dynamics, KMD has become a standard tool for analyzing coherent structures in turbulent flows, extracting dominant modes from flow fields, and constructing reduced-order models [3,4,5,6]. The method has proven effective for chaotic systems, where it reveals low-dimensional structures hidden within high-dimensional nonlinear dynamics [7]. In neuroscience, KMD enables the extraction of spatiotemporal patterns from neural activity recordings, providing insights into brain dynamics [8]. Applications in plasma physics include characterizing turbulent behavior in fusion devices and identifying dominant instability modes [9,10,11]. Beyond these scientific domains, KMD has found practical applications in sports analytics for analyzing player and team dynamics [12], in robotics for system identification and model-based control [13], and in video processing for efficient spatiotemporal data representation [14].


In practical settings, both FD and KMD rely on a finite number of observations. Without restricting the summations in Equations (1) and (2) to finitely many terms, the decomposition becomes ill-posed. We therefore approximate the function by a finite superposition of ℓ oscillatory components, where ℓ is called the degree of the decomposition.

In the case of the Discrete Fourier Transform (DFT), we assume observations $f(t_{0}), \dots, f(t_{T-1})$ at $t_{k} = \frac{2πk}{T}$, for $k = 0, \dots, T-1$. Since $e^{i m t_{k}} = e^{i n t_{k}}$ whenever $m \equiv n \pmod{T}$, the problem of finding the coefficients $c_{0}, \dots, c_{T-1}$ reduces to solving the linear system:

(f (t_{0}), \dots, f (t_{T - 1})) = (c_{0} \dots c_{T - 1}) [\begin{matrix} 1 & 1 & \dots & 1 \\ 1 & e^{i \frac{2 π}{T}} & \dots & e^{i \frac{2 π (T - 1)}{T}} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 1 & e^{i \frac{2 π (T - 1)}{T}} & \dots & e^{i \frac{2 π {(T - 1)}^{2}}{T}} \end{matrix}] .

(3)

This system has a unique solution, as the coefficient matrix is a square Vandermonde matrix over distinct T-th roots of unity.

In general, an $m \times n$ matrix

V_{n \mid a_{1}, \dots, a_{m}} = \begin{bmatrix} 1 & a_{1} & a_{1}^{2} & \dots & a_{1}^{n - 1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & a_{m} & a_{m}^{2} & \dots & a_{m}^{n - 1} \end{bmatrix}

is referred to as a Vandermonde matrix, whose determinant when $m = n$ is given by

\det V_{m \mid a_{1}, \dots, a_{m}} = \prod_{i > j} (a_{i} - a_{j}) .

The square matrix on the right-hand side of Equation (3) is a Vandermonde matrix, and Equation (3) can be restated as

(x_{0} \dots x_{T - 1}) = (c_{0} \dots c_{ℓ - 1}) V_{T ∣ α^{0}, α^{1}, \dots, α^{ℓ - 1}},

(4)

where $x_{k} = f(t_{k})$ and $α = e^{2πi/T}$ is a primitive T-th root of unity.

By the aforementioned invertibility of the Vandermonde matrix generated by distinct points $α^{0}, \dots, α^{T-1}$, Equation (3) admits a unique solution:

(c_{0} \dots c_{T - 1}) = \frac{1}{T} (x_{0} \dots x_{T - 1}) V_{T ∣ α^{0}, α^{1}, \dots, α^{T - 1}}^{*} .

(5)

Here, $M^{*}$ denotes the conjugate transpose (i.e., Hermitian transpose) of a matrix $M$.
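The closed-form inverse in Equation (5) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `dft_coefficients` and the sample vector are illustrative choices, not part of the original text. It builds the Vandermonde matrix over the T-th roots of unity, applies Equation (5), and compares the result with NumPy's FFT (which uses the conjugate kernel and no 1/T factor).

```python
import numpy as np

def dft_coefficients(x):
    """Solve x = c V for c, where V[n, k] = alpha**(n*k) and alpha = exp(2*pi*i/T)."""
    x = np.asarray(x, dtype=complex)
    T = x.size
    alpha = np.exp(2j * np.pi / T)
    # Row n of V is (1, alpha^n, alpha^(2n), ..., alpha^((T-1)n)).
    V = np.vander(alpha ** np.arange(T), N=T, increasing=True)
    # V V* = T I, so the inverse of V is V*/T, as in Equation (5).
    c = x @ V.conj().T / T
    return c, V

x = np.array([1.0, 2.0, 0.5, -1.0, 3.0])  # illustrative samples
c, V = dft_coefficients(x)

# The coefficients reproduce the samples and agree with fft(x)/T.
reconstruction_error = np.max(np.abs(c @ V - x))
fft_mismatch = np.max(np.abs(c - np.fft.fft(x) / x.size))
print(reconstruction_error, fft_mismatch)
```

Both printed quantities should be at the level of floating-point round-off, reflecting that the square Vandermonde system has exactly one solution.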

In contrast, the KMD problem can be formulated as

[x_{0} \dots x_{T - 1}] = [m_{1} \dots m_{ℓ}] V_{T ∣ λ_{1}, \dots, λ_{ℓ}} .

(6)

Despite the fact that $x_{t}$ and $m_{i}$ are vectors rather than scalars, Equation (6) for KMD has the same form as Equation (4) for DFT: both decompose an observable matrix (or vector) into the product of a coefficient matrix (or vector) and a Vandermonde matrix. However, there are crucial distinctions between them. In DFT, the structure of the problem logically requires that the Vandermonde matrix be a fixed square matrix, and consequently, Equation (4) determines a uniquely solvable linear equation. In contrast, for KMD, the eigenvalues $λ_{1}, \dots, λ_{ℓ}$ are unknown variables, and thus Equation (6) constitutes a higher-degree algebraic equation. Moreover, the number ℓ of eigenvalues is also a variable, and determining ℓ is both important and non-trivial. For example, the choice $ℓ = T$, which is required for DFT, is entirely unsuitable for KMD. Indeed, if $ℓ \geq T$, then for any set of pairwise distinct eigenvalues $λ_{1}, \dots, λ_{ℓ}$, there always exists a corresponding set of modes $m_{1}, \dots, m_{ℓ}$ such that Equation (6) holds (Proposition 3).

Despite the increased complexity of the KMD problem, several numerical methods exist to solve Equation (6) for a given degree ℓ, including Dynamic Mode Decomposition (DMD), which is typically applicable when $ℓ = \operatorname{rank} X$; the Arnoldi method, applicable when $ℓ = T - 1$; and the vector Prony method, which allows arbitrary ℓ. These methods yield approximate solutions minimizing the residual sum of squares (RSS), especially in the presence of observation noise.

However, what has remained unresolved is how to determine an optimal degree ℓ. We illustrate its importance through the following example, which highlights the predictive risk of an inappropriate choice.

Example 1.

Consider one-dimensional observables given by $X = [1 \; 1 \; 1 \; 1 \; 3 \; 5 \; 7]$. It is readily verified that Equation (6) admits no solution if $ℓ \leq 3$. For $ℓ = 4$, the roots $λ_{1}, \dots, λ_{4}$ of the following equation

f (x; α) = x^{4} - x^{3} - x - 1 + α (x - 1) = 0

(7)

uniquely determine the modes $m_{1}, \dots, m_{4}$ such that Equation (6) is satisfied, thereby yielding a valid KMD for any value of the parameter α. As illustrated in Figure 2, all such KMDs exactly reproduce the observed sequence $X$ for $t = 0, \dots, 6$, yet their extrapolations for $t \geq 7$ differ significantly.

This example highlights a key issue: even if an algorithm happens to return a single quartic KMD, it is merely one among infinitely many KMDs that fit the observed data. Consequently, the forecast made by such a KMD is almost certainly different from the ground truth, and the chance of accurate prediction is negligibly small.
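The non-uniqueness in Example 1 can be reproduced numerically. The sketch below is an illustration under stated assumptions rather than the authors' algorithm: it assumes NumPy, uses hypothetical helper names (`quartic_kmd`, `evaluate`), and picks three arbitrary values of α for which the roots of Equation (7) are pairwise distinct. For each α it computes the roots, solves Equation (6) for the modes by least squares, and evaluates the resulting KMD at t = 7. Expanding Equation (7) into the recurrence $x_{t+4} = x_{t+3} + (1 - α) x_{t+1} + (1 + α) x_{t}$ shows that the forecast at t = 7 equals 11 − 2α, so every α fits the data and yet predicts a different value.

```python
import numpy as np

X = np.array([1, 1, 1, 1, 3, 5, 7], dtype=complex)

def quartic_kmd(alpha):
    """Return eigenvalues and modes of the degree-4 KMD of X for a given alpha."""
    # Coefficients of x^4 - x^3 + (alpha - 1) x - (1 + alpha), i.e. Equation (7).
    lam = np.roots([1.0, -1.0, 0.0, alpha - 1.0, -(1.0 + alpha)])
    # V[i, t] = lam_i ** t for t = 0, ..., 6 (a 4 x 7 Vandermonde matrix).
    V = np.vander(lam, N=X.size, increasing=True)
    # Solve m V = X for the row vector m in the least-squares sense.
    m, *_ = np.linalg.lstsq(V.T, X, rcond=None)
    return lam, m

def evaluate(lam, m, t):
    """Value of the KMD sum_i m_i * lam_i**t at time t."""
    return np.sum(m * lam ** t)

results = {}
for alpha in (0.0, 1.0, 2.0):  # arbitrary illustrative values
    lam, m = quartic_kmd(alpha)
    fit_error = max(abs(evaluate(lam, m, t) - X[t]) for t in range(X.size))
    results[alpha] = (fit_error, evaluate(lam, m, 7).real)
    print(alpha, fit_error, results[alpha][1])
```

The fit error stays at round-off level for each α, while the value at t = 7 changes with α, which is the behavior described for Figure 2.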

Thus, for the sake of predictive accuracy, it is crucial to select ℓ such that it is uniquely feasible, defined as follows:

Definition 1.

Given an observable matrix $X$, a degree ℓ is said to be feasible if there exists at least one solution $(λ_{1}, \dots, λ_{ℓ}; m_{1}, \dots, m_{ℓ})$ to Equation (6). Moreover, if this solution is unique, then ℓ is said to be uniquely feasible.

This paper develops a theoretical framework for uniquely feasible degrees and, based on this foundation, proposes efficient and practical algorithms to determine whether a given set of observables $X$ admits a uniquely feasible degree and, if so, to identify it. We also demonstrate through simulations that the KMD selected by our algorithms can yield highly accurate predictions.

2. Theoretical Frameworks Underlying Koopman Mode Decomposition

A key significance of Koopman Mode Decomposition (KMD) is its ability to analyze the dynamics of a nonlinear system using only methods from linear algebra. In this section, we provide a brief review of the theoretical framework of KMD, which bridges nonlinear dynamics and linear algebra.

2.1. Temporal Transition of States and Semigroup Property

Let Z denote a (possibly unobservable) state space. Under a deterministic assumption, once a state $ζ \in Z$ is observed at some time, the state of the system after an elapsed time $t \geq 0$ is uniquely determined and is denoted by $ζ_{t}$. Accordingly, the temporal evolution of the system is described by the mapping

\hat{ζ} : Z \times [0, \infty) ∋ (ζ, t) \mapsto ζ_{t} \in Z .

In the discrete-time setting, $\hat{ζ}$ is instead defined on $Z \times (\{0\} \cup \mathbb{N})$, which can be regarded as a special case of the continuous-time formulation.

While the notation $\hat{ζ}(ζ, t)$ emphasizes the bivariate nature of the mapping, the map $t \mapsto ζ_{t}$ is essentially regarded as a univariate function of t, with the initial state $ζ$ fixed. The deterministic assumption also requires the identity $ζ_{s + t} = (ζ_{s})_{t}$, which is equivalently expressed as $\hat{ζ}(ζ, s + t) = \hat{ζ}(\hat{ζ}(ζ, s), t)$ for all $s, t \geq 0$. This implies that if we define $σ^{t} := \hat{ζ}(\cdot, t) : Z \to Z$ for $t \geq 0$, then the family $\{σ^{t} : Z \to Z \mid t \in [0, \infty)\}$ forms a one-parameter semigroup under composition, that is, $σ^{s + t} = σ^{t} \circ σ^{s}$ holds for all $s, t \geq 0$.

2.2. Koopman Operator

We denote the space of $\mathbb{C}$-valued functions defined over Z by $\mathbb{C}^{Z}$. The function space $\mathbb{C}^{Z}$ forms a $\mathbb{C}$-algebra, equipped with addition, multiplication, and scalar multiplication, defined as follows for $f, g \in \mathbb{C}^{Z}$ and $z \in \mathbb{C}$:

(f + g)(ζ) = f(ζ) + g(ζ), \quad (f g)(ζ) = f(ζ) g(ζ), \quad \text{and} \quad (z f)(ζ) = z f(ζ) .

In particular, $\mathbb{C}^{Z}$ is a vector space over $\mathbb{C}$.

The Koopman operator parameterized by time t is defined as

U^{t} : \mathbb{C}^{Z} ∋ f \mapsto f \circ σ^{t} \in \mathbb{C}^{Z} .

It is straightforward to verify that the Koopman operator is a $\mathbb{C}$-algebra homomorphism and, in particular, a linear operator. Furthermore, we have the following:

Proposition 1.

The collection of Koopman operators $\{U^{t} \mid t \in [0, \infty)\}$ forms a one-parameter semigroup. That is, $U^{s + t} = U^{t} \circ U^{s}$ holds for any $s, t \in [0, \infty)$.
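The homomorphism and semigroup properties can be illustrated with a concrete flow. The sketch below is a minimal example under assumptions not made in the text: the state space is taken to be Z = R, the flow is the solution map of dz/dt = −z, and the observables f and g as well as the helper names `sigma` and `koopman` are hypothetical. It checks Proposition 1 together with linearity and multiplicativity on a grid of sample states.

```python
import numpy as np

def sigma(t):
    """Assumed flow on Z = R: the solution map of dz/dt = -z after time t."""
    return lambda z: z * np.exp(-t)

def koopman(t):
    """Koopman operator U^t: maps an observable f to f composed with sigma^t."""
    return lambda f: (lambda z: f(sigma(t)(z)))

f = lambda z: z ** 2 + 1.0   # illustrative observables
g = lambda z: np.sin(z)
z = np.linspace(-2.0, 2.0, 9)
s, t = 0.3, 1.1

# Semigroup property (Proposition 1): U^{s+t} f = U^t (U^s f).
semigroup_gap = np.max(np.abs(koopman(s + t)(f)(z) - koopman(t)(koopman(s)(f))(z)))

# Linearity: U^t (2 f + 3 g) = 2 U^t f + 3 U^t g.
lin = lambda z: 2.0 * f(z) + 3.0 * g(z)
linear_gap = np.max(np.abs(koopman(t)(lin)(z)
                           - (2.0 * koopman(t)(f)(z) + 3.0 * koopman(t)(g)(z))))

# Multiplicativity: U^t (f g) = (U^t f)(U^t g).
prod = lambda z: f(z) * g(z)
product_gap = np.max(np.abs(koopman(t)(prod)(z) - koopman(t)(f)(z) * koopman(t)(g)(z)))

print(semigroup_gap, linear_gap, product_gap)
```

Although the observable z² + 1 is nonlinear in the state, the operator acting on observables is linear, which is the point of the Koopman framework.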

2.3. Koopman Generator

In general, when a one-parameter semigroup $T = \{T^{t} \mid t \in [0, \infty)\}$ is defined on a Banach space $B$, it is said to be strongly continuous if, for every $x \in B$, the following norm convergence holds:

lim_{t ↓ 0} ∥ T^{t} x - x ∥ = 0 .

A strongly continuous one-parameter semigroup $T$ has several important properties ([15], Chapter 13):

Property 1.

Each $T^{t}$ is bounded; that is, the operator norm $\| T^{t} \|$ is well-defined. More precisely, there exist constants $M \geq 1$ and $ω \in \mathbb{R}$ such that $\| T^{t} \| \leq M e^{ω t}$ for all $t \geq 0$.

Property 2.

The set $D(A)$ of all $x \in B$ for which

A x := \lim_{t \downarrow 0} \frac{T^{t} x - x}{t}

exists is a dense linear subspace of $B$, and A is a closed linear operator with domain $D(A)$. This operator A is called the infinitesimal generator of $\{T^{t}\}$.

Property 3.

If $\{T^{t}\}$ is bounded, i.e., $\sup_{t \geq 0} \| T^{t} \| < \infty$, then $D(A) = B$.

Property 4.

For every $x \in D(A)$, the derivative

\frac{d}{dt} T^{t} x = \lim_{τ \downarrow 0} \frac{T^{t + τ} x - T^{t} x}{τ}

exists, and we have

\frac{d}{dt} T^{t} x = T^{t} A x = A T^{t} x

for all $t \geq 0$ and $x \in D(A)$.

Property 5.

If $A x = λ x$ for some $x \neq 0$ and $λ \in \mathbb{C}$ (that is, λ is an eigenvalue of A), then

T^{t} x = e^{λ t} x .

When we say that the Koopman operator semigroup $U = \{U^{t} \mid t \in [0, \infty)\}$ is strongly continuous, we assume that it acts on a Banach space $B$ whose elements can be regarded as functions in $\mathbb{C}^{Z}$ in some way (e.g., $B \subseteq \mathbb{C}^{Z}$), and that for each $f \in B$ the limit $\lim_{t \downarrow 0} U^{t} f = f$ holds in the norm of $B$. Pointwise convergence of functions is a more primitive notion, and although these two modes of convergence are generally independent, they are closely related in certain settings.

Example 2.

Let X be a compact topological space and let $C(X)$ denote the space of continuous functions on X. Since every continuous function on a compact space is bounded, we may equip $C(X)$ with the supremum norm:

\| f \|_{\infty} := \sup_{x \in X} | f(x) | .

With this norm, $C(X)$ is a Banach space. In this setting, convergence in norm is equivalent to uniform convergence, and in particular, uniform convergence implies pointwise convergence.

Example 3.

Let $(X, Σ, μ)$ be a measure space, and let $B = L^{p}(μ)$ for $1 \leq p < \infty$. Elements of $L^{p}(μ)$ are equivalence classes of measurable functions that are equal almost everywhere. Thus, any statement about pointwise convergence should be interpreted in terms of representatives of these equivalence classes, that is, convergence almost everywhere. If a sequence $\{f_{n}\}_{n \in \mathbb{N}} \subset L^{p}(μ)$ satisfies

\lim_{n \to \infty} \| f_{n} - f \|_{L^{p}} = 0,

then there exists a subsequence that converges to f pointwise almost everywhere. This follows from the completeness of $L^{p}$ spaces and is sometimes referred to as a version of the Riesz convergence theorem (see [1], Theorem 3.12).

If the Koopman operator semigroup $\{U^{t} \mid t \in [0, \infty)\}$ is bounded, then its infinitesimal generator, referred to as the Koopman generator and defined by

K f = lim_{t ↓ 0} \frac{U^{t} f - f}{t}, f \in D (K),

(8)

is defined on the entire Banach space.

We next consider the case in which the Koopman operators are bounded. Let $(Z, Σ, μ)$ be a measure space, and suppose that for each $t \geq 0$ the map $σ^{t} : Z \to Z$ is measurable. We examine the boundedness of the associated Koopman operator $U^{t}$ acting on $L^{p}(μ)$.

A sufficient condition for $U^{t}$ to be bounded on $L^{p}(μ)$ is that $σ^{t}$ is non-expansive with respect to $μ$, meaning that

μ((σ^{t})^{-1}(A)) \leq μ(A) \quad \text{for all } A \in Σ .

In this case, for any $f \in L^{p}(μ)$,

\| U^{t} f \|_{L^{p}(μ)}^{p} = \int_{Z} | f(σ^{t}(z)) |^{p} \, dμ(z) = \int_{Z} | f(z) |^{p} \, d((σ^{t})_{*} μ)(z),

where $(σ^{t})_{*} μ$ denotes the pushforward of $μ$ by $σ^{t}$. If $σ^{t}$ is non-expansive, then $(σ^{t})_{*} μ \leq μ$ (as measures), and hence

\| U^{t} f \|_{L^{p}(μ)} \leq \| f \|_{L^{p}(μ)},

so in particular

\sup_{t \geq 0} \| U^{t} \|_{L^{p}(μ) \to L^{p}(μ)} \leq 1 < \infty .

Note that, for $p = \infty$, the same argument shows $\| U^{t} \|_{L^{\infty} \to L^{\infty}} \leq 1$.

Conversely, if $σ^{t}$ is expansive, i.e., there exists $A \in Σ$ such that

μ((σ^{t})^{-1}(A)) > μ(A),

then the family $\{U^{t}\}_{t \geq 0}$ may fail to be uniformly bounded in t (even though each fixed $U^{t}$ can still be bounded).

In many applications, especially in ergodic theory and dynamical systems, $σ^{t}$ is assumed to be measure-preserving, that is,

μ((σ^{t})^{-1}(A)) = μ(A) \quad \text{for all } A \in Σ .

This implies $(σ^{t})_{*} μ = μ$, and hence $U^{t}$ acts as an isometry on $L^{p}(μ)$ for every $1 \leq p \leq \infty$. On $L^{2}(μ)$, the Koopman operators are therefore unitary. If, in addition, $\{σ^{t}\}_{t \in \mathbb{R}}$ forms a measure-preserving flow (so that $\{U^{t}\}_{t \in \mathbb{R}}$ is a strongly continuous unitary group), then by Stone’s theorem ([1], Theorem 13.40) the Koopman generator $K$ is skew-adjoint, i.e., $K^{*} = -K$. Since unitary and skew-adjoint operators are normal, the spectral theorem applies and provides the functional-analytic foundation for the Koopman Mode Decomposition.
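A finite toy model makes the measure-preserving case concrete. The sketch below rests on assumptions that are not in the text: the state space is the finite set {0, ..., n−1} with counting measure, and a single time step is the cyclic shift, which is invertible and measure-preserving. Under these assumptions the Koopman operator is a permutation matrix, so it should be unitary with all eigenvalues on the unit circle.

```python
import numpy as np

n = 6
# Assumed dynamics: sigma(z) = (z + 1) mod n, a bijection of {0, ..., n-1}.
sigma = np.roll(np.arange(n), -1)

# Koopman matrix: (U f)(z) = f(sigma(z)), hence U[z, sigma[z]] = 1.
U = np.zeros((n, n))
U[np.arange(n), sigma] = 1.0

# Check the defining relation on an arbitrary observable f.
f = np.random.default_rng(0).normal(size=n)
composition_gap = np.max(np.abs(U @ f - f[sigma]))

# Unitarity (U is real, so U* = U^T) and eigenvalues of modulus one.
unitarity_gap = np.max(np.abs(U.T @ U - np.eye(n)))
eigenvalues = np.linalg.eigvals(U)
modulus_gap = np.max(np.abs(np.abs(eigenvalues) - 1.0))

print(composition_gap, unitarity_gap, modulus_gap)
```

In this toy case the eigenvalues are the n-th roots of unity, so every Koopman mode is a pure oscillation, with no growth or decay.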

2.4. Koopman Mode Decomposition and Spectral Theorem

To introduce the Koopman mode decomposition, we assume that the Koopman operator semigroup $\{U^{t} \mid t \in [0, \infty)\}$ defined on a Banach space $B$ is strongly continuous in the norm of $B$ and induces a Koopman generator defined on the entire $B$. For example, this holds when the semigroup $\{U^{t}\}$ is bounded.

Let $σ_{p}(K) \subset \mathbb{C}$ denote the point spectrum of $K$, and let $V_{λ} \subseteq B$ be the eigenspace corresponding to $λ \in σ_{p}(K)$. When f belongs to the completion in $B$ of the linear span of $\bigcup_{λ \in σ_{p}(K)} V_{λ}$, that is, when

f = \sum_{λ \in σ_{p}(K)} ϕ_{λ}, \quad ϕ_{λ} \in V_{λ},

holds in the norm of $B$, only countably many $ϕ_{λ}$ are nonzero. We denote the corresponding eigenvalues by $\{λ_{n} \mid n = 1, 2, \dots\}$. Then the Koopman mode decomposition of f is expressed as

f = \sum_{n = 1}^{N} ϕ_{λ_{n}}, \quad N \in \mathbb{N} \cup \{\infty\},

and the following relations hold:

K f = \sum_{n = 1}^{N} λ_{n} ϕ_{λ_{n}}, \quad U^{t} f = \sum_{n = 1}^{N} e^{λ_{n} t} ϕ_{λ_{n}}, \quad t \in [0, \infty) .

If, in addition, every element of the one-parameter semigroup $\{σ^{t} \mid t \in [0, \infty)\}$ is measure-preserving on a measure space $(Z, Σ, μ)$, then the Koopman operators $U^{t}$ and the Koopman generator $K$ defined on $L^{2}(Z)$ are unitary and skew-adjoint, respectively; that is, they are normal linear operators defined on the entire $B$. Hence, the Koopman mode decomposition can be understood in the context of the spectral theorem.
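As a worked illustration of these relations, consider a standard example that is assumed here and not taken from the text: the rigid rotation of the circle, $Z = [0, 2π)$ with Lebesgue measure and $σ^{t}(θ) = θ + ω t \pmod{2π}$ for a fixed $ω \neq 0$. The rotation is measure-preserving, its Koopman eigenfunctions are the Fourier modes, and the eigenvalues are purely imaginary, as expected for a skew-adjoint generator. In this special case the Koopman mode decomposition coincides with the Fourier decomposition of Equation (1).

```latex
% Assumed example: rotation of the circle, \sigma^t(\theta) = \theta + \omega t \pmod{2\pi}.
\begin{align*}
\phi_n(\theta) &= e^{i n \theta}, \qquad n \in \mathbb{Z}, \\
(U^t \phi_n)(\theta) &= \phi_n(\theta + \omega t) = e^{i n \omega t}\, \phi_n(\theta), \\
(K \phi_n)(\theta) &= \lim_{t \downarrow 0} \frac{e^{i n \omega t} - 1}{t}\, \phi_n(\theta)
                    = i n \omega\, \phi_n(\theta), \\
f = \sum_{n \in \mathbb{Z}} c_n \phi_n
  \;&\Longrightarrow\;
  U^t f = \sum_{n \in \mathbb{Z}} c_n\, e^{i n \omega t}\, \phi_n .
\end{align*}
```

Here the eigenvalues are $λ_{n} = i n ω$, so $| e^{λ_{n}} | = 1$ for every n and no mode grows or decays.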

The spectral theorem asserts that a normal operator T defined on a Hilbert space $H$ can be represented as

T = \int_{σ(T)} λ \, dE(λ),

where E is a projection-valued measure, which plays the role of a Borel measure defined on the Borel $σ$-algebra $\mathcal{B}$ of $σ(T) \subseteq \mathbb{C}$. For each $B \in \mathcal{B}$, the value $E(B)$ is an orthogonal projection operator on $H$ rather than a real number, and the measure E satisfies the following properties:

Orthogonality: $E(B_{1}) E(B_{2}) = E(B_{1} \cap B_{2})$.
Countable additivity: For any countable mutually disjoint family $\{B_{i}\}_{i \in I} \subseteq \mathcal{B}$,

$E(\bigcup_{i \in I} B_{i}) = \sum_{i \in I} E(B_{i}),$

where the convergence is in the strong operator topology.

Based on the projection-valued measure E, the integral of a measurable function F over $\mathbb{C}$ is defined as

F (T) = \int_{σ (T)} F (λ) d E (λ),

in complete analogy with the Lebesgue integral.

This integral representation is essential because the spectrum $σ(T)$ may be distributed continuously in $\mathbb{C}$. Furthermore, letting $σ_{d}(T) \subseteq σ_{p}(T)$ denote the set of all isolated eigenvalues of T, we can express the decomposition as

F (T) = \sum_{λ \in σ_{d} (T)} F (λ) E ({λ}) + \int_{σ (T) ∖ σ_{d} (T)} F (λ) d E (λ),

(9)

where each $E(\{λ\})$ for $λ \in σ_{d}(T)$ coincides with the orthogonal projection onto the eigenspace of $λ$.

For $f \in L^{2}(Z)$ (an equivalence class of functions equal almost everywhere), the Koopman mode decomposition of f and the actions of the Koopman generator $K$ and the Koopman operators $U^{t}$ are obtained by setting $T = K$ and assuming that the integral part of Equation (9) vanishes (or is negligible). Since $E(\{λ\}) f$ is nonzero only for a countable subset of $σ_{d}(K)$, we label the corresponding eigenvalues as $\{λ_{n}\}_{n = 1}^{N}$ for $N \in \mathbb{N} \cup \{\infty\}$, and write $ϕ_{λ_{n}} := E(\{λ_{n}\}) f$.

Example 4.

For $F(x) = 1$, we obtain the Koopman mode decomposition of f:

f = \sum_{λ \in σ_{d}(T)} E(\{λ\}) f = \sum_{n = 1}^{N} ϕ_{λ_{n}} .

Example 5.

For $F(x) = x$, we obtain the action of $K$:

K f = \sum_{λ \in σ_{d}(T)} λ E(\{λ\}) f = \sum_{n = 1}^{N} λ_{n} ϕ_{λ_{n}} .

Example 6.

For $F(x) = e^{t x}$, we obtain the action of $U^{t}$:

e^{t K} f = \sum_{λ \in σ_{d}(T)} e^{t λ} E(\{λ\}) f = \sum_{n = 1}^{N} e^{t λ_{n}} ϕ_{λ_{n}} .
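Examples 4 to 6 can be checked in finite dimensions, where the spectrum is purely discrete and the integral part of Equation (9) is absent. The sketch below is a toy illustration under assumptions not made in the text: it takes a 2 × 2 skew-symmetric matrix as a stand-in for the generator K, and the helper name `apply_F` is hypothetical. For this matrix, $e^{tK}$ is the rotation by angle t, which provides an independent closed form to compare against.

```python
import numpy as np

# Assumed toy generator: skew-symmetric, hence normal, with eigenvalues +i and -i.
K = np.array([[0.0, -1.0], [1.0, 0.0]])
lam, vecs = np.linalg.eig(K)

# Orthogonal projections E({lambda}) onto the one-dimensional eigenspaces.
E = [np.outer(v, v.conj()) / np.vdot(v, v) for v in vecs.T]

def apply_F(F):
    """F(K) = sum over eigenvalues of F(lambda) * E({lambda})."""
    return sum(F(l) * P for l, P in zip(lam, E))

t = 0.7
# Example 4: F = 1 gives the identity (resolution of the identity).
identity_gap = np.max(np.abs(apply_F(lambda x: 1.0) - np.eye(2)))
# Example 5: F(x) = x recovers K.
generator_gap = np.max(np.abs(apply_F(lambda x: x) - K))
# Example 6: F(x) = exp(t x) gives exp(tK), the rotation by angle t.
rotation = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
exponential_gap = np.max(np.abs(apply_F(lambda x: np.exp(t * x)) - rotation))

print(identity_gap, generator_gap, exponential_gap)
```

The projections are built from eigenvectors that are orthogonal because K is normal with distinct eigenvalues; for a non-normal matrix this construction would not yield orthogonal projections.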

3. Discrete Koopman Mode Decomposition

The objective of Discrete Koopman Mode Decomposition (DKMD) is analogous to that of the Discrete Fourier Transform (DFT): to obtain a decomposition that fits a given finite sequence of samples of an unknown function. We begin by recalling the formulation of the DFT.

3.1. DFT and Vandermonde Matrix

Let $(x_{0}, \dots, x_{T - 1})$ denote the values of an unknown function observed at times $t = 0, 1, \dots, T - 1$. The DFT seeks coefficients $c_{0}, c_{\pm 1}, c_{\pm 2}, \dots$ such that

x_{t} = \sum_{n = - \infty}^{\infty} c_{n} e^{i 2 π n \frac{t}{T}}, t = 0, \dots, T - 1 .

Since this is an underdetermined system with infinitely many unknowns, we restrict to a finite sum. Moreover, since $n \equiv n' \pmod{T}$ implies $e^{i 2π n \frac{t}{T}} = e^{i 2π n' \frac{t}{T}}$, we may assume $n = 0, 1, \dots, T - 1$. Let $ω := e^{i \frac{2π}{T}}$. Then

x_{t} = \sum_{n = 0}^{T - 1} c_{n} ω^{n t}, t = 0, \dots, T - 1 .

In matrix form, we can write

(\begin{matrix} x_{0} & x_{1} & \dots & x_{T - 1} \end{matrix}) = (\begin{matrix} c_{0} & c_{1} & \dots & c_{T - 1} \end{matrix}) [\begin{matrix} 1 & 1 & 1 & \dots & 1 \\ 1 & ω & ω^{2} & \dots & ω^{T - 1} \\ 1 & ω^{2} & ω^{4} & \dots & ω^{2 (T - 1)} \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ 1 & ω^{T - 1} & ω^{2 (T - 1)} & \dots & ω^{{(T - 1)}^{2}} \end{matrix}] .

(10)

The coefficient matrix above is an instance of the Vandermonde matrix.

Definition 2 ([16]). Let $a_{1}, \dots, a_{m} \in \mathbb{C}$ and $n \in \mathbb{N}$. The associated Vandermonde matrix is defined as

V_{n ∣ a_{1}, \dots, a_{m}} = [\begin{matrix} 1 & a_{1} & a_{1}^{2} & \dots & a_{1}^{n - 1} \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ 1 & a_{m} & a_{m}^{2} & \dots & a_{m}^{n - 1} \end{matrix}] .

(11)

Proposition 2.

For pairwise distinct nodes $a_{1}, \dots, a_{m}$, the Vandermonde matrix satisfies:

$\det V_{m \mid a_{1}, \dots, a_{m}} = \prod_{1 \leq j < i \leq m} (a_{i} - a_{j})$.
$\operatorname{rank} V_{n \mid a_{1}, \dots, a_{m}} = \min \{m, n\}$.
Viewing $V_{n \mid a_{1}, \dots, a_{m}}$ as a linear map $F : \mathbb{C}^{m} \to \mathbb{C}^{n}$, F is injective if and only if $m \leq n$, surjective if and only if $m \geq n$, and bijective if and only if $m = n$.

In particular,

det V_{T ∣ ω^{0}, ω^{1}, \dots, ω^{T - 1}} = \prod_{0 \leq m < n < T} (ω^{n} - ω^{m}) \neq 0,

so Equation (10) has a unique solution for $(c_{0}, \dots, c_{T - 1})$.
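The determinant and rank statements of Proposition 2 can be spot-checked numerically. The following sketch assumes NumPy; the node values and the helper name `vandermonde` are arbitrary illustrative choices. It compares the numerical determinant with the product formula and computes the rank for a tall, a square, and a wide column count.

```python
import numpy as np
from itertools import combinations

def vandermonde(nodes, n):
    """V_{n | a_1, ..., a_m}: one row (1, a, a^2, ..., a^(n-1)) per node a."""
    return np.vander(np.asarray(nodes, dtype=complex), N=n, increasing=True)

nodes = [0.5, -1.0, 2.0, 1.0 + 1.0j]  # pairwise distinct, chosen for illustration
m = len(nodes)

# Determinant of the square case versus the product over pairs j < i.
det_numeric = np.linalg.det(vandermonde(nodes, m))
det_formula = np.prod([nodes[i] - nodes[j] for j, i in combinations(range(m), 2)])

# Rank for n < m, n = m, and n > m; Proposition 2 predicts min(m, n).
ranks = {n: int(np.linalg.matrix_rank(vandermonde(nodes, n))) for n in (2, 4, 6)}

print(det_numeric, det_formula, ranks)
```

With m = 4 nodes, the predicted ranks are min(4, n) for each n, so the wide case n = 6 is still rank 4, which corresponds to the injectivity statement for $m \leq n$.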

3.2. Formulation of DKMD

Assume that the Koopman mode decomposition of a function f is expressed as a countably infinite sum of eigenfunctions $\{ϕ_{n}\}_{n \in \mathbb{N}}$ of the Koopman generator $K$. For example, assume that the spectrum of $K$ consists only of a countably infinite set of isolated eigenvalues $\{λ_{n}\}_{n \in \mathbb{N}}$:

f = \sum_{n = 1}^{\infty} ϕ_{n} .

Then, without loss of generality, we may assume that the values $\{e^{λ_{n}}\}_{n \in \mathbb{N}}$ are pairwise distinct.

Given finitely many samples $x_{t} = f(t)$ for $t = 0, 1, \dots, T - 1$, DKMD seeks to satisfy

x_{t} = \sum_{n = 1}^{\infty} c_{n} e^{λ_{n} t}, where c_{n} : = ϕ_{n} (0) .

(12)

In the same way as in the DFT case, since the infinitely many unknowns $\{c_{n}\}_{n = 1}^{\infty}$ and $\{λ_{n}\}_{n = 1}^{\infty}$ make the system underdetermined, we restrict their number to a finite $ℓ < \infty$. Then the system can be written compactly as

(x_{0}, x_{1}, \dots, x_{T - 1}) = (c_{1}, c_{2}, \dots, c_{ℓ}) V_{T ∣ e^{λ_{1}}, \dots, e^{λ_{ℓ}}} .

(13)

If $e^{λ_{1}}, \dots, e^{λ_{ℓ}}$ are pairwise distinct and $ℓ \geq T$, the Vandermonde matrix represents a surjective linear mapping from $\mathbb{C}^{ℓ}$ onto $\mathbb{C}^{T}$. Then we have the following:

Proposition 3.

For any $(x_{0}, \dots, x_{T - 1}) \in \mathbb{C}^{T}$, pairwise distinct $e^{λ_{1}}, \dots, e^{λ_{ℓ}}$, and $ℓ \geq T$, there exists $(c_{1}, \dots, c_{ℓ}) \in \mathbb{C}^{ℓ}$ satisfying Equation (13).
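Proposition 3 can be observed directly: once ℓ ≥ T, essentially any choice of distinct eigenvalues fits the data. The sketch below is a minimal illustration under assumptions, using NumPy, an arbitrary sample vector, and eigenvalues placed on circles of three different radii; the helper name `fit_coefficients` is hypothetical. The minimum-norm least-squares solution is used only as one convenient way to pick a solution of the underdetermined system.

```python
import numpy as np

T, ell = 5, 7  # any degree ell >= T
x = np.array([2.0, -1.0, 0.5, 4.0, 1.0 + 2.0j])  # arbitrary observations

def fit_coefficients(mu, x):
    """Find c with c V_{T | mu} = x, where mu holds the values e^{lambda_i}."""
    V = np.vander(mu, N=x.size, increasing=True)  # shape (ell, T)
    # Underdetermined system: lstsq returns the minimum-norm exact solution.
    c, *_ = np.linalg.lstsq(V.T, x, rcond=None)
    return c, V

residuals = []
for radius in (0.9, 1.0, 1.1):
    # Three unrelated sets of pairwise distinct eigenvalues.
    mu = radius * np.exp(2j * np.pi * np.arange(ell) / ell)
    c, V = fit_coefficients(mu, x)
    residuals.append(np.max(np.abs(c @ V - x)))

print(residuals)
```

All three eigenvalue sets reproduce the same observations, even though one consists of decaying modes, one of pure oscillations, and one of growing modes, so their extrapolations beyond t = T − 1 disagree.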

Thus, in contrast to the DFT case, where the given values $x_{0}, \dots, x_{T - 1}$ are decomposed into exactly T Fourier components, performing DKMD requires determining the number ℓ of Koopman modes within the range $ℓ < T$. Specifically, in the nonlinear system (13), the Koopman degree ℓ itself appears as an additional unknown, alongside $e^{λ_{1}}, \dots, e^{λ_{ℓ}}$ and $c_{1}, \dots, c_{ℓ}$.

In this regard, the primary contribution of this paper is to establish criteria for determining ℓ, and to present computationally efficient methods for estimating it both exactly and approximately, particularly when the given observations are contaminated by noise.

In Section 3.4, we review an existing method from the literature for solving the equation in the special case where ℓ is known.

3.3. Definitions and Notations

The following definitions and descriptions are essential for the subsequent sections.

We extend the aforementioned formalization of DKMD from the decomposition of $\mathbb{C}$-valued functions to that of $\mathbb{C}^{m}$-valued functions for a dimension $m \geq 1$. This modification does not alter the fundamental nature of the problem but better aligns with real-world applications of DKMD, such as fluid dynamics.

Definition 3

(DKMD). Let

[x_{0} \dots x_{T - 1}]

be a matrix of observables at times

t = 0, \dots, T - 1

, where each

x_{t}

is an m-dimensional column vector. The discrete Koopman mode decomposition (DKMD) of this observable matrix is a matrix factorization given by

[x_{0} \dots x_{T - 1}] = [m_{1} \dots m_{ℓ}] V_{T ∣ μ_{1}, \dots, μ_{ℓ}},

(14)

where

μ_{1}, \dots, μ_{ℓ} \in C

are pairwise distinct.

Definition 4

([17] Koopman Eigenvalues, Modes, and Degree). Given a DKMD as in Equation (14), we refer to ℓ as the Koopman degree,

μ_{1}, \dots, μ_{ℓ}

as the Koopman eigenvalues, and

m_{1}, \dots, m_{ℓ}

as the Koopman modes.

For convenience, let

X

and

M

denote the observable matrix and the matrix of modes, respectively:

X = [\begin{matrix} x_{0} & \dots & x_{T - 1} \end{matrix}], M = [\begin{matrix} m_{1} & \dots & m_{ℓ} \end{matrix}] .

Then the DKMD can be expressed compactly as

X = M V_{T ∣ μ_{1}, \dots, μ_{ℓ}} .

(15)
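As a concrete illustration of Equation (15), the following minimal Python sketch builds a Vandermonde matrix and the product of a mode matrix with it. It assumes, in line with Equation (13), that the Vandermonde matrix is the ℓ × T matrix whose (i, t) entry is the t-th power of the i-th eigenvalue. The function names and the numerical values of the eigenvalues and modes are hypothetical and chosen only for illustration.

```python
def vandermonde(mus, T):
    """Return the l x T matrix whose (i, t) entry is mus[i] ** t."""
    return [[mu ** t for t in range(T)] for mu in mus]


def matmul(A, B):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


# Hypothetical example: l = 2 eigenvalues, m = 2 observables, T = 4 samples.
mus = [2, 3]                 # pairwise distinct Koopman eigenvalues
M = [[1, 1],                 # column i of M is the Koopman mode m_i
     [1, -1]]
X = matmul(M, vandermonde(mus, 4))
print(X)  # [[2, 5, 13, 35], [0, -1, -5, -19]]
```

Each column of the resulting observable matrix is the sum of the modes weighted by the corresponding powers of the eigenvalues, which is exactly the factorization that a DKMD tries to recover from data.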

Table 1 summarizes the principal notations used throughout this article.

3.4. Known Methods to Compute DKMD for Known Degrees

In this section, we introduce the vector Prony method [18], which estimates the unknown Koopman eigenvalues

μ_{1}, \dots, μ_{ℓ}

and the Koopman mode matrix

M

that satisfy Equation (15), given the observable matrix

X

and the Koopman degree ℓ.

The procedure first computes the Koopman eigenvalues and subsequently computes the Koopman modes based on the obtained eigenvalues.

3.4.1. Computing the Koopman Eigenvalues

We first introduce the characteristic polynomial associated with a DKMD, whose roots correspond to the Koopman eigenvalues.

Definition 5

([17] Characteristic Polynomial). For a DKMD

X = M V_{T ∣ μ_{1}, \dots, μ_{ℓ}}

, the polynomial

f (X) = \prod_{i = 1}^{ℓ} (X - μ_{i})

is called the characteristic polynomial of the DKMD.

If the characteristic polynomial of a DKMD is expressed as

f (X) = X^{ℓ} + a_{ℓ - 1} X^{ℓ - 1} + \dots + a_{0},

then the following recurrence relation holds.

Proposition 4.

For all integers j with

0 \leq j \leq T - ℓ - 1

,

x_{j + ℓ} + a_{ℓ - 1} x_{j + ℓ - 1} + \dots + a_{0} x_{j} = 0 .

Proof.

The statement follows from

x_{j + ℓ} + a_{ℓ - 1} x_{j + ℓ - 1} + \dots + a_{0} x_{j} = M diag (μ_{1}^{j}, \dots, μ_{ℓ}^{j}) {[\begin{matrix} f (μ_{1}) & \dots & f (μ_{ℓ}) \end{matrix}]}^{T} = 0 .

□

Using Hankel matrices (Definition 6), the statement of Proposition 4 can be compactly expressed as

H_{0 ∣ X_{ℓ}^{T - 1}} = - [\begin{matrix} a_{0} & \dots & a_{ℓ - 1} \end{matrix}] H_{ℓ - 1 ∣ X_{0}^{T - 2}} .

(16)

Definition 6

([16] Hankel Matrix). Given a matrix

X = [x_{0}, \dots, x_{T - 1}]

, the kth Hankel matrix of

X

is defined for

0 \leq k \leq T - 1

as

H_{k ∣ X} = [\begin{matrix} {(X_{0}^{k})}^{T} & {(X_{1}^{k + 1})}^{T} & \dots & {(X_{T - k - 1}^{T - 1})}^{T} \end{matrix}],

where

H_{k ∣ X} \in C^{(k + 1) \times m (T - k)}

.
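The block structure of Definition 6 can be made concrete with a short Python sketch. It assumes that the observable matrix is stored as a list of m rows of length T, and the function name `hankel` is hypothetical. The data used for the demonstration is the observable matrix of Example 8.

```python
def hankel(X, k):
    """Return H_{k|X} as a list of k + 1 rows with m * (T - k) entries each.

    X is a list of m rows of length T, so column t of X is the sample x_t.
    Row i of H_{k|X} lists x_i, x_{i+1}, ..., x_{i+T-k-1} one after another.
    """
    m, T = len(X), len(X[0])
    if not 0 <= k <= T - 1:
        raise ValueError("k must satisfy 0 <= k <= T - 1")
    return [[X[r][j + i] for j in range(T - k) for r in range(m)]
            for i in range(k + 1)]


X8 = [[1, 1, -3, 5, -7]]     # observable matrix of Example 8 (m = 1)
for row in hankel(X8, 2):
    print(row)
# [1, 1, -3]
# [1, -3, 5]
# [-3, 5, -7]
```

For m > 1, every sample contributes m consecutive entries to a row, which is why the matrix has m (T - k) columns.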

Although

(a_{0}, \dots, a_{ℓ - 1})

that satisfy Equation (16) may not exist, and even if they exist they may not be unique, the following least-squares optimality always holds:

- H_{0 ∣ X_{ℓ}^{T - 1}} H_{ℓ - 1 ∣ X_{0}^{T - 2}}^{+} \in \underset{a_{0}, \dots, a_{ℓ - 1}}{arg min} {∥H_{0 ∣ X_{ℓ}^{T - 1}} + [\begin{matrix} a_{0} & \dots & a_{ℓ - 1} \end{matrix}] H_{ℓ - 1 ∣ X_{0}^{T - 2}}∥}_{F} .

This represents the least-squares solution for the characteristic coefficients when Equation (16) is inconsistent.
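The least-squares step can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes real rational data, so that exact arithmetic from the standard `fractions` module can be used, and it solves the normal equations instead of forming the pseudo-inverse, which yields the same minimizer only when the regressor block has full row rank. The regressor block and the target row are taken as the first ℓ rows and the last row of the ℓ-th Hankel matrix of the observable matrix. The helper names `hankel`, `solve`, and `prony_coefficients` are hypothetical, and the data is the observable matrix of Example 9.

```python
from fractions import Fraction


def hankel(X, k):
    """H_{k|X}: row i lists x_i, ..., x_{i+T-k-1} (each an m-vector)."""
    m, T = len(X), len(X[0])
    return [[X[r][j + i] for j in range(T - k) for r in range(m)]
            for i in range(k + 1)]


def solve(A, b):
    """Solve the square system A y = b exactly by Gauss-Jordan elimination."""
    n = len(A)
    aug = [[Fraction(v) for v in row] + [Fraction(bi)]
           for row, bi in zip(A, b)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)  # pivot row
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n] for row in aug]


def prony_coefficients(X, ell):
    """Least-squares coefficients (a_0, ..., a_{ell-1}) of the recurrence."""
    full = hankel(X, ell)
    H, target = full[:ell], full[ell]   # regressor rows; row x_ell..x_{T-1}
    # Normal equations (H H^T) a = -(H target^T); assumes full row rank of H.
    gram = [[sum(p * q for p, q in zip(ri, rj)) for rj in H] for ri in H]
    rhs = [-sum(p * q for p, q in zip(ri, target)) for ri in H]
    return solve(gram, rhs)


X9 = [[1, 1, 2, 3, 5, 8, 13],
      [1, 1, 1, 1, 3, 5, 7]]
coeffs = prony_coefficients(X9, 4)
print([int(c) for c in coeffs])  # [-1, -1, 0, -1], i.e. x^4 - x^3 - x - 1
```

The Koopman eigenvalues would then be obtained as the roots of the resulting polynomial with any polynomial root finder; that step is omitted here. For noisy or complex-valued data, a floating-point least-squares routine with a conjugate transpose would be the natural replacement for the exact solve.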

3.4.2. Computing the Koopman Modes

Once the eigenvalues

μ_{1}, \dots, μ_{ℓ}

are computed, the linear Equation (14) determines

M

.

Since

X V_{T ∣ μ_{1}, \dots, μ_{ℓ}}^{+} \in \underset{M}{arg min} {∥X - M V_{T ∣ μ_{1}, \dots, μ_{ℓ}}∥}_{F}

holds, if a solution exists,

X V_{T ∣ μ_{1}, \dots, μ_{ℓ}}^{+}

provides one possible solution, which is not necessarily unique.

Equation (15) can be viewed as a factorization of linear mappings:

X : C^{T} \to C^{m}, V_{T ∣ μ_{1}, \dots, μ_{ℓ}} : C^{T} \to C^{ℓ}, M : C^{ℓ} \to C^{m} .

The condition for the existence of such an

M

is

ker X \supseteq ker V_{T ∣ μ_{1}, \dots, μ_{ℓ}} .

Under this condition, the solution is unique if and only if

V_{T ∣ μ_{1}, \dots, μ_{ℓ}}

is surjective, which holds if and only if

ℓ \leq T

and the nodes

μ_{1}, \dots, μ_{ℓ}

are pairwise distinct.

Proposition 5.

For an observable matrix

X

and a Koopman degree ℓ with

ℓ \leq T

, a DKMD of

X

with Koopman eigenvalues

μ_{1}, \dots, μ_{ℓ}

, if it exists, is unique.
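The mode computation can be illustrated with a small Python sketch. It is a simplified stand-in for the pseudo-inverse formula above: assuming that a DKMD exists for the given eigenvalues, the modes are determined already by the first ℓ columns of the observable matrix, so a square Vandermonde system is solved and the remaining columns are used as a check. The function names are hypothetical. The data is the observable matrix of Example 9, and the eigenvalues are the roots of x^4 - x^3 - x - 1 = (x^2 + 1)(x^2 - x - 1).

```python
import math


def solve(A, b):
    """Solve the square complex system A y = b by Gauss-Jordan elimination."""
    n = len(A)
    aug = [[complex(v) for v in row] + [complex(bi)] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))  # partial pivoting
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n] for row in aug]


def koopman_modes(X, mus):
    """Return M as m rows of length l, so M[r][i] is entry r of mode m_i."""
    ell = len(mus)
    # Transposed square Vandermonde block: entry (t, i) is mus[i] ** t.
    Vt = [[mu ** t for mu in mus] for t in range(ell)]
    return [solve(Vt, row[:ell]) for row in X]


def reconstruct(M, mus, T):
    """Evaluate M V, i.e. sum_i M[r][i] * mus[i] ** t for t = 0..T-1."""
    return [[sum(c * mu ** t for c, mu in zip(row, mus)) for t in range(T)]
            for row in M]


X9 = [[1, 1, 2, 3, 5, 8, 13],
      [1, 1, 1, 1, 3, 5, 7]]
phi = (1 + math.sqrt(5)) / 2
mus = [1j, -1j, phi, 1 - phi]          # roots of x^4 - x^3 - x - 1
M = koopman_modes(X9, mus)
R = reconstruct(M, mus, 7)
err = max(abs(R[r][t] - X9[r][t]) for r in range(2) for t in range(7))
print(err < 1e-9)  # True: modes fitted on 4 columns reproduce all 7
```

For noisy data a least-squares fit over all T columns, as in the pseudo-inverse formula, would be preferable; the square solve is used here only to keep the sketch free of external dependencies.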

3.5. The Contributions of This Article

Given an observable matrix, the vector Prony method introduced in the previous section computes a DKMD for a specified Koopman degree ℓ. However, a DKMD does not always exist for the given degree, and even when it does, it may not be unique.

Definition 7

(Feasible Degree). A Koopman degree ℓ is said to be feasible for the observables if a DKMD of degree ℓ exists for the given observable matrix.

Although the Koopman degree ℓ must satisfy

ℓ < T

(Proposition 3), a DKMD may exist for multiple values of ℓ. Thus, selecting the optimal feasible degree is essential to obtain an optimal DKMD.

For this selection, we consider two independent principles:

Minimality: The optimal degree should be the smallest among all feasible degrees. This principle is analogous to Occam’s razor, favoring the simplest representation that adequately explains the observations.
Uniqueness: The optimal degree should correspond to a unique DKMD. If multiple DKMDs exist for a given ℓ, as described in a later section, the set of such decompositions forms a continuum, where different DKMDs yield distinct eigenvalues and modes. Consequently, any particular DKMD extracted from this continuum—such as one obtained by the vector Prony method—may fail to reproduce the true dynamics precisely.

In this article, we demonstrate that if a degree satisfies the uniqueness criterion, it also satisfies the minimality criterion, but the converse does not necessarily hold. Thus, between these two principles, we adopt the uniqueness criterion as the standard for selecting the optimal Koopman degree.

Definition 8

(Uniquely Feasible Degree). A Koopman degree is said to be uniquely feasible if a DKMD of that degree exists and is unique for the given observables.

In summary, the objective of this paper is to establish a theoretical framework for uniquely feasible degrees. Specifically, we demonstrate and present the following:

A uniquely feasible degree for a given observable matrix, if it exists, is the smallest among all feasible degrees.
Several structural properties of uniquely feasible degrees lead to computationally efficient algorithms for determining them.
These algorithms are further extended to handle noisy observables via least-squares formulations.

4. Finding Uniquely Feasible Degrees

4.1. Key Indices: Hankel Dimension and Codimension

We first summarize the results of the previous sections in a theorem that provides a necessary and sufficient condition for an observable matrix to admit a DKMD.

For notational convenience, we first introduce the following definition.

Definition 9

(square-free coefficient vector). A vector

(\begin{matrix} a_{0} & \dots & a_{n - 1} & 1 \end{matrix}) \in C^{n + 1}

is called a square-free coefficient vector if the algebraic equation

X^{n} + a_{n - 1} X^{n - 1} + \dots + a_{0} = 0

has no repeated roots.

Theorem 1

(Feasibility condition). Let

X

be an observable matrix and

ℓ < T

. The following statements are equivalent:

1. $X$ admits a DKMD with Koopman degree ℓ; equivalently, ℓ is feasible for $X$.
2. There exists a square-free coefficient vector $(\begin{matrix} a_{0} & \dots & a_{ℓ - 1} & 1 \end{matrix})$ satisfying

$(\begin{matrix} a_{0} & \dots & a_{ℓ - 1} & 1 \end{matrix}) H_{ℓ ∣ X} = 0 .$

(17)

Proof. (1 ⇒ 2)

Suppose

X

admits a DKMD with Koopman eigenvalues

μ_{1}, \dots, μ_{ℓ}

. The coefficient vector

a = (\begin{matrix} a_{0} & \dots & a_{ℓ - 1} & 1 \end{matrix})

of the characteristic polynomial of this DKMD is square-free by the definition of DKMD (Definition 3), and the assertion of Proposition 4 can be restated as

a {(X_{i}^{i + ℓ})}^{T} = 0 for i = 0, \dots, T - ℓ - 1,

which further implies Equation (17).

(2 ⇒ 1) Let

μ_{1}, \dots, μ_{ℓ}

denote the distinct roots of the square-free polynomial

X^{ℓ} + a_{ℓ - 1} X^{ℓ - 1} + \dots + a_{0} = 0 .

We consider

X, V_{T ∣ μ_{1}, \dots, μ_{ℓ}},

and

M

as representing linear mappings

F : C^{T} \to C^{m}

,

G : C^{T} \to C^{ℓ}

, and

H : C^{ℓ} \to C^{m}

, respectively. Then, the rank-nullity theorem implies

ker G = ⨁_{i = 0}^{T - ℓ - 1} C {(\begin{matrix} 0^{i} & a & 0^{T - ℓ - 1 - i} \end{matrix})}^{T},

and

ker F \supseteq ker G

follows from

X {(\begin{matrix} 0^{i} & a & 0^{T - ℓ - 1 - i} \end{matrix})}^{T} = X_{i}^{i + ℓ} a^{T} = 0

for all

i = 0, \dots, T - ℓ - 1

, which implies the existence of H. □

The square-free coefficient vector in Equation (17) lies in the orthogonal complement

V {(H_{ℓ ∣ X})}^{⊥}

, which is the subspace of

C^{ℓ + 1}

consisting of all vectors orthogonal to every column of

H_{ℓ ∣ X}

. Theorem 1 thus shows that the existence of a DKMD depends on the structure of

V (H_{k ∣ X})

and its orthogonal complement. This motivates the following key indices.

Definition 10

(Hankel dimension and codimension). Given an observable matrix

X

, the Hankel dimension and codimension of order k (

k < T

) are defined as

\dim_{H} (k ∣ X) = rank H_{k ∣ X}, {codim}_{H} (k ∣ X) = k + 1 - \dim_{H} (k ∣ X) .
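The two indices of Definition 10 can be computed directly from the definition. The following Python sketch is an illustration under the assumption that the observables are rational, so that the rank can be computed exactly with the standard `fractions` module; the function names and the sample data (a single geometric sequence) are hypothetical.

```python
from fractions import Fraction


def hankel(X, k):
    """H_{k|X}: row i lists x_i, ..., x_{i+T-k-1} (each an m-vector)."""
    m, T = len(X), len(X[0])
    return [[X[r][j + i] for j in range(T - k) for r in range(m)]
            for i in range(k + 1)]


def rank(A):
    """Exact rank of a rational matrix by Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in A]
    rk = 0
    for c in range(len(rows[0])):
        p = next((r for r in range(rk, len(rows)) if rows[r][c] != 0), None)
        if p is None:
            continue                      # no pivot in this column
        rows[rk], rows[p] = rows[p], rows[rk]
        for r in range(rk + 1, len(rows)):
            f = rows[r][c] / rows[rk][c]
            rows[r] = [vr - f * vk for vr, vk in zip(rows[r], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk


def hankel_dim(X, k):
    """Hankel dimension: the rank of H_{k|X}."""
    return rank(hankel(X, k))


def hankel_codim(X, k):
    """Hankel codimension: k + 1 minus the Hankel dimension."""
    return k + 1 - hankel_dim(X, k)


X = [[1, 2, 4, 8, 16]]                    # hypothetical data x_t = 2^t
print([hankel_dim(X, k) for k in range(5)])    # [1, 1, 1, 1, 1]
print([hankel_codim(X, k) for k in range(5)])  # [0, 1, 2, 3, 4]
```

For a single geometric sequence every Hankel matrix has rank one, so the codimension first becomes positive at k = 1, which is the degree of the one-term decomposition that generated the data.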

Using these indices, we can restate Theorem 1 as follows:

Corollary 1.

For an observable matrix

X

and a Koopman degree

ℓ < T

, a necessary condition for the existence of an ℓ-degree DKMD of

X

is

{codim}_{H} (ℓ ∣ X) \geq 1 .

However, the condition

{codim}_{H} (ℓ ∣ X) \geq 1

is not sufficient, for the following two reasons:

Even if we can find $(\begin{matrix} a_{0} & \dots & a_{ℓ - 1} & a_{ℓ} \end{matrix})$ with

$(\begin{matrix} a_{0} & \dots & a_{ℓ - 1} & a_{ℓ} \end{matrix}) H_{ℓ ∣ X} = 0,$

it may happen that $a_{ℓ} = 0$ , meaning that the polynomial corresponding to $(\begin{matrix} a_{0} & \dots & a_{ℓ - 1} & a_{ℓ} \end{matrix})$ is of degree lower than ℓ, which cannot induce a DKMD of degree ℓ;
Even if

$(\begin{matrix} a_{0} & \dots & a_{ℓ - 1} & 1 \end{matrix}) H_{ℓ ∣ X} = 0,$

the resulting polynomial

$X^{ℓ} + a_{ℓ - 1} X^{ℓ - 1} + \dots + a_{0} = 0$

may have repeated roots, rendering it unsuitable as a characteristic polynomial.

Our purpose here is to identify a Koopman degree ℓ that admits a unique DKMD. We can represent the condition for a uniquely feasible degree using the Hankel codimension, assuming that ℓ is feasible, that is, an ℓ-degree DKMD exists.

Corollary 2.

For a feasible degree ℓ of an observable matrix

X

with

ℓ < T

, the following hold:

1. The DKMD is unique if and only if ${codim}_{H} (ℓ ∣ X) = 1$.
2. The set of ℓ-degree DKMDs forms a continuum if and only if ${codim}_{H} (ℓ ∣ X) > 1$.

Proof.

Since ℓ is feasible,

V {(H_{ℓ ∣ X})}^{⊥}

is nontrivial, and there exists a coefficient vector

{(\begin{matrix} a_{0} & \dots & a_{ℓ - 1} & 1 \end{matrix})}^{T} \in V {(H_{ℓ ∣ X})}^{⊥}

which determines the characteristic polynomial of a DKMD of

X

.

If

{codim}_{H} (ℓ ∣ X) = 1

, that is, if

dim V {(H_{ℓ ∣ X})}^{⊥} = 1

, then only

(\begin{matrix} a_{0} & \dots & a_{ℓ - 1} & 1 \end{matrix})

can determine a characteristic polynomial, implying that the DKMD is unique by Proposition 5.

On the other hand, if

{codim}_{H} (ℓ ∣ X) > 1

, there exists a coefficient vector

(\begin{matrix} b_{0} & \dots & b_{ℓ - 1} & 1 \end{matrix}) \in V {(H_{ℓ ∣ X})}^{⊥} ∖ C (\begin{matrix} a_{0} & \dots & a_{ℓ - 1} & 1 \end{matrix}) .

When we define

f_{α} (X) = (1 - α) f (X) + α (X^{ℓ} + b_{ℓ - 1} X^{ℓ - 1} + \dots + b_{0}),

since

f_{0} (X)

is square-free, there exists

ϵ > 0

such that

f_{α} (X)

remains square-free for any

α \in (- ϵ, ϵ)

. In particular, each distinct value of

α

yields a distinct DKMD. □

4.2. The Hankel Dimension and Codimension for $m = 1$

In this section, we investigate the Hankel dimension and codimension in the restricted case where

m = 1

, i.e., when

X

consists of a single row with T components. These results will be extended to the general case

m \geq 1

in Section 4.4.5.

Lemma 1.

Let

X

be an observable matrix with one row and T columns. Furthermore, suppose that

X

admits a DKMD of the form

X = [\begin{matrix} m_{1} & \dots & m_{ℓ} \end{matrix}] V_{T ∣ μ_{1}, \dots, μ_{ℓ}}

such that

ℓ \leq \frac{T + 1}{2}

and each

m_{i}

is nonzero. Then, the leftmost

ℓ \times ℓ

block submatrix of

H_{ℓ - 1 ∣ X}

has nonzero determinant.

Proof.

The leftmost

ℓ \times ℓ

block submatrix is expressed as

\begin{matrix} [\begin{matrix} {(X_{0}^{ℓ - 1})}^{T} & {(X_{1}^{ℓ})}^{T} & \dots & {(X_{ℓ - 1}^{2 ℓ - 2})}^{T} \end{matrix}] & = {(V_{ℓ ∣ μ_{1}, \dots, μ_{ℓ}})}^{T} diag (m_{1}, \dots, m_{ℓ}) V_{ℓ ∣ μ_{1}, \dots, μ_{ℓ}} . \end{matrix}

The determinant of this matrix is nonzero, because

μ_{1}, \dots, μ_{ℓ}

are mutually distinct, implying

det V_{ℓ ∣ μ_{1}, \dots, μ_{ℓ}} \neq 0,

and

m_{1}, \dots, m_{ℓ}

are nonzero, implying

det diag (m_{1}, \dots, m_{ℓ}) \neq 0

. □

Theorem 2.

Let

X

,

m_{1}, \dots, m_{ℓ}

, and

μ_{1}, \dots, μ_{ℓ}

be as in Lemma 1. Assume further that

ℓ \leq \frac{T + 1}{2}

. Then, the Hankel dimension and codimension are given by

\begin{aligned} \dim_{H} (k ∣ X) & = \begin{cases} k + 1, & \text{for } k \in \{0, \dots, ℓ - 1\}; \\ ℓ, & \text{for } k \in \{ℓ, \dots, T - ℓ\}; \\ T - k, & \text{for } k \in \{T - ℓ + 1, \dots, T - 1\}; \end{cases} \\ {codim}_{H} (k ∣ X) & = \begin{cases} 0, & \text{for } k \in \{0, \dots, ℓ - 1\}; \\ k + 1 - ℓ, & \text{for } k \in \{ℓ, \dots, T - ℓ\}; \\ 2 k - T + 1, & \text{for } k \in \{T - ℓ + 1, \dots, T - 1\} . \end{cases} \end{aligned}

Proof.

Note that

H_{k ∣ X}

consists of

k + 1

rows and

T - k

columns.

If

0 \leq k \leq ℓ - 1

, equivalently, if

k + 1 \leq ℓ

and

T - k \geq ℓ

, the leftmost

(k + 1) \times ℓ

submatrix of

H_{k ∣ X}

coincides with the top

(k + 1) \times ℓ

submatrix of

H_{ℓ - 1 ∣ X}

, implying its rank is

k + 1

by Lemma 1. Thus, all

k + 1

rows of

H_{k ∣ X}

are linearly independent, and

\dim_{H} (k ∣ X) = k + 1

follows.

If

T - ℓ + 1 \leq k \leq T - 1

, equivalently, if

k + 1 > ℓ

and

T - k < ℓ

, the entire

H_{k ∣ X}

has

T - k

columns. Its top

ℓ \times (T - k)

submatrix coincides with the leftmost

ℓ \times (T - k)

submatrix of

H_{ℓ - 1 ∣ X}

, implying its rank is

T - k

by Lemma 1. Thus, all

T - k

columns of

H_{k ∣ X}

are linearly independent, and

\dim_{H} (k ∣ X) = T - k

follows.

If

ℓ \leq k \leq T - ℓ

, equivalently, if

k + 1 > ℓ

and

T - k \geq ℓ

, the top-left

ℓ \times ℓ

submatrix of

H_{k ∣ X}

coincides with that of

H_{ℓ - 1 ∣ X}

, and hence, Lemma 1 implies that the leftmost ℓ columns of

H_{k ∣ X}

are linearly independent.

On the other hand, we let

\prod_{i = 1}^{ℓ} (X - μ_{i}) = X^{ℓ} - a_{ℓ - 1} X^{ℓ - 1} - \dots - a_{0}

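denote the characteristic polynomial, where the signs of the lower-order coefficients are reversed relative to Definition 5 for convenience. Then each eigenvalue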

μ_{i}

satisfies

μ_{i}^{ℓ} = a_{ℓ - 1} μ_{i}^{ℓ - 1} + \dots + a_{0}

. For the i-th element

x_{i}

of

X

with

i \geq ℓ

,

\begin{aligned} x_{i} & = \sum_{j = 1}^{ℓ} m_{j} μ_{j}^{i} = \sum_{j = 1}^{ℓ} m_{j} μ_{j}^{i - ℓ} \cdot μ_{j}^{ℓ} = \sum_{j = 1}^{ℓ} m_{j} μ_{j}^{i - ℓ} (\sum_{n = 0}^{ℓ - 1} a_{n} μ_{j}^{n}) \\ & = \sum_{n = 0}^{ℓ - 1} a_{n} (\sum_{j = 1}^{ℓ} m_{j} μ_{j}^{i - ℓ + n}) = \sum_{n = 0}^{ℓ - 1} a_{n} x_{i - ℓ + n} . \end{aligned}

This recurrence relation shows that for any

i \geq ℓ

, the element

x_{i}

is determined by

x_{i - ℓ}, \dots, x_{i - 1}

. Consequently, any column of

H_{k ∣ X}

is a linear combination of the leftmost ℓ columns, which have been proven to be linearly independent. Therefore,

\dim_{H} (k ∣ X) = ℓ

.

The claims about the Hankel codimension follow directly from the definition

{codim}_{H} (k ∣ X) = k + 1 - \dim_{H} (k ∣ X)

. □

Corollary 3.

If

ℓ \leq \frac{T}{2}

, then

k = ℓ

is the unique value satisfying

{codim}_{H} (k ∣ X) = 1

.

If

ℓ = \frac{T + 1}{2}

, then

X

admits no uniquely feasible degree.
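The closed-form values of Theorem 2 and the two cases of Corollary 3 can be checked numerically on small instances. The sketch below is only an illustration: the function names are hypothetical, the sample sequence x_t = 2^t + 3^t (so ℓ = 2 with both modes equal to 1) is chosen arbitrarily, and exact rational arithmetic is assumed.

```python
from fractions import Fraction


def hankel(X, k):
    """H_{k|X}: row i lists x_i, ..., x_{i+T-k-1} (each an m-vector)."""
    m, T = len(X), len(X[0])
    return [[X[r][j + i] for j in range(T - k) for r in range(m)]
            for i in range(k + 1)]


def rank(A):
    """Exact rank of a rational matrix by Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in A]
    rk = 0
    for c in range(len(rows[0])):
        p = next((r for r in range(rk, len(rows)) if rows[r][c] != 0), None)
        if p is None:
            continue
        rows[rk], rows[p] = rows[p], rows[rk]
        for r in range(rk + 1, len(rows)):
            f = rows[r][c] / rows[rk][c]
            rows[r] = [vr - f * vk for vr, vk in zip(rows[r], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk


def theorem2_dim(k, ell, T):
    """Hankel dimension predicted by Theorem 2 (m = 1, ell <= (T + 1) / 2)."""
    if k <= ell - 1:
        return k + 1
    if k <= T - ell:
        return ell
    return T - k


def check(T, ell=2):
    """Compare computed and predicted dimensions; return the codimensions."""
    X = [[2 ** t + 3 ** t for t in range(T)]]
    dims = [rank(hankel(X, k)) for k in range(T)]
    assert dims == [theorem2_dim(k, ell, T) for k in range(T)]
    return [k + 1 - d for k, d in enumerate(dims)]


print(check(7))  # [0, 0, 1, 2, 3, 4, 6]: codimension 1 only at k = 2
print(check(3))  # [0, 0, 2]: ell = (T + 1) / 2, codimension 1 never occurs
```

The first call corresponds to the case ℓ ≤ T/2 of Corollary 3, and the second to the boundary case ℓ = (T + 1)/2.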

4.3. Examples

In this section, we introduce three examples of an observable matrix

X

, each demonstrating different properties regarding the existence of a uniquely feasible degree:

No Koopman degree ℓ satisfies ${codim}_{H} (ℓ ∣ X) = 1$ , meaning that no uniquely feasible degree exists (Example 7).
A Koopman degree ℓ with ${codim}_{H} (ℓ ∣ X) = 1$ exists, but the corresponding characteristic polynomial is not square-free. As a result, a uniquely feasible degree does not exist (Example 8).
A uniquely feasible degree exists, ensuring that a DKMD is uniquely determined for the degree (Example 9).

These examples illustrate the conditions under which a DKMD is uniquely determined and the role of Hankel codimension in establishing uniqueness.

Example 7.

Consider the observable matrix

X = [\begin{matrix} 1 & 1 & 1 & 1 & 3 & 5 & 7 \end{matrix}] .

The Hankel matrices for

ℓ = 1, 2, 3

are computed as

\begin{matrix} H_{1 ∣ X} = [\begin{matrix} 1 & 1 & 1 & 1 & 3 & 5 \\ 1 & 1 & 1 & 3 & 5 & 7 \end{matrix}], H_{2 ∣ X} = [\begin{matrix} 1 & 1 & 1 & 1 & 3 \\ 1 & 1 & 1 & 3 & 5 \\ 1 & 1 & 3 & 5 & 7 \end{matrix}], and H_{3 ∣ X} = [\begin{matrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 3 \\ 1 & 1 & 3 & 5 \\ 1 & 3 & 5 & 7 \end{matrix}] . \end{matrix}

These computations yield

{codim}_{H} (1 ∣ X) = {codim}_{H} (2 ∣ X) = {codim}_{H} (3 ∣ X) = 0,

implying that

X

admits no DKMD for these degrees by Corollary 1.

On the other hand, for

ℓ = 4, 5, 6

, we have

H_{4 ∣ X} = [\begin{matrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 3 \\ 1 & 3 & 5 \\ 3 & 5 & 7 \end{matrix}], H_{5 ∣ X} = [\begin{matrix} 1 & 1 \\ 1 & 1 \\ 1 & 1 \\ 1 & 3 \\ 3 & 5 \\ 5 & 7 \end{matrix}], and H_{6 ∣ X} = [\begin{matrix} 1 \\ 1 \\ 1 \\ 1 \\ 3 \\ 5 \\ 7 \end{matrix}] .

It follows that

{codim}_{H} (4 ∣ X) = 2, {codim}_{H} (5 ∣ X) = 4, and {codim}_{H} (6 ∣ X) = 6,

implying by Corollary 2 that the DKMDs of these degrees, whenever they exist, form a continuum.

For example, for

ℓ = 4

,

V {(H_{4 ∣ X})}^{⊥} = C {(\begin{matrix} - 1 & - 1 & 0 & - 1 & 1 \end{matrix})}^{T} \oplus C {(\begin{matrix} - 1 & 1 & 0 & 0 & 0 \end{matrix})}^{T} .

Thus, the possible candidates for characteristic polynomials take the form

f_{a} (x) = x^{4} - x^{3} - x - 1 + a (x - 1),

(18)

whose discriminant is computed as

D = - 27 a^{4} + 54 a^{3} - 783 a^{2} - 900 a - 500 .

Since

f_{a} (X) = 0

has no repeated roots if and only if

D \neq 0

, it follows that four-degree DKMDs form a continuum.
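To make the consequence of this continuum concrete, the following Python sketch takes two members of the family (18), a = 0 and a = 1, for which D equals -500 and -2156, respectively, so both polynomials are square-free. It checks that both satisfy the recurrence of Proposition 4 on all observed samples and then extends the series by one step with each. Extending by the recurrence is used here as a stand-in for evaluating the corresponding DKMD at t = 7, which is justified because any sequence that is a combination of powers of the roots satisfies the recurrence of its characteristic polynomial. The function names are hypothetical.

```python
X7 = [1, 1, 1, 1, 3, 5, 7]            # observables of Example 7


def coefficients(a):
    """(a_0, ..., a_3, 1) for f_a(x) = x^4 - x^3 - x - 1 + a (x - 1)."""
    return [-1 - a, -1 + a, 0, -1, 1]


def residuals(x, coef):
    """Left-hand sides of the recurrence for j = 0, ..., T - l - 1."""
    ell = len(coef) - 1
    return [sum(c * x[j + n] for n, c in enumerate(coef))
            for j in range(len(x) - ell)]


def predict_next(x, coef):
    """Extend x by one step using the recurrence (leading coefficient 1)."""
    ell = len(coef) - 1
    start = len(x) - ell
    return -sum(c * x[start + n] for n, c in enumerate(coef[:-1]))


for a in (0, 1):
    coef = coefficients(a)
    print(a, residuals(X7, coef), predict_next(X7, coef))
# 0 [0, 0, 0] 11
# 1 [0, 0, 0] 9
```

Both choices reproduce the seven training samples exactly, yet they disagree on the next value (in general the prediction is 11 - 2a). This is the mechanism by which a degree with Hankel codimension greater than one can fit the data perfectly and still predict poorly.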

Example 8.

When we let

X = [\begin{matrix} 1 & 1 & - 3 & 5 & - 7 \end{matrix}]

, the Hankel matrices for

ℓ = 1, 2, 3

are computed as

\begin{matrix} H_{1 ∣ X} = [\begin{matrix} 1 & 1 & - 3 & 5 \\ 1 & - 3 & 5 & - 7 \end{matrix}], H_{2 ∣ X} = [\begin{matrix} 1 & 1 & - 3 \\ 1 & - 3 & 5 \\ - 3 & 5 & - 7 \end{matrix}], and H_{3 ∣ X} = [\begin{matrix} 1 & 1 \\ 1 & - 3 \\ - 3 & 5 \\ 5 & - 7 \end{matrix}], \end{matrix}

implying

{codim}_{H} (1 ∣ X) = 0, {codim}_{H} (2 ∣ X) = 1, and {codim}_{H} (3 ∣ X) = 2 .

In fact, we have

V {(H_{1 ∣ X})}^{⊥} = (0)

, and furthermore,

\begin{matrix} V {(H_{2 ∣ X})}^{⊥} = C {(\begin{matrix} 1 & 2 & 1 \end{matrix})}^{T} and V {(H_{3 ∣ X})}^{⊥} = C {(\begin{matrix} 1 & 2 & 1 & 0 \end{matrix})}^{T} \oplus C {(\begin{matrix} 0 & 1 & 2 & 1 \end{matrix})}^{T} . \end{matrix}

This implies that there exists no DKMD for any of these degrees because

1. For $ℓ = 1$, no characteristic polynomial exists.
2. For $ℓ = 2$, if a DKMD existed, the corresponding characteristic polynomial would be

$X^{2} + 2 X + 1 = {(X + 1)}^{2},$

which is not square-free.
3. For $ℓ = 3$, if a DKMD existed, the corresponding characteristic polynomial would be of the form

$X^{3} + 2 X^{2} + X + a (X^{2} + 2 X + 1) = (X + a) {(X + 1)}^{2},$

which is also not square-free.

On the other hand, for

ℓ = 4

, we have

H_{4 ∣ X} = X^{T}

and

\begin{matrix} V {(H_{4 ∣ X})}^{⊥} & = C {(\begin{matrix} 7 & 0 & 0 & 0 & 1 \end{matrix})}^{T} \oplus C {(\begin{matrix} 1 & 2 & 1 & 0 & 0 \end{matrix})}^{T} \\ \oplus C {(\begin{matrix} 0 & 1 & 2 & 1 & 0 \end{matrix})}^{T} \oplus C {(\begin{matrix} 0 & 0 & 1 & 2 & 1 \end{matrix})}^{T}, \end{matrix}

implying that there exists a DKMD whose characteristic polynomial is

X^{4} + 7 = \prod_{k = 0}^{3} (X - \sqrt[4]{7} e^{\frac{i π}{4} (2 k + 1)}) .

Furthermore, all possible characteristic polynomials of DKMDs for these observables are of the form

f_{a, b, c} (X) = a (X^{4} + 7) + (1 - a) (X^{4} + 2 X^{3} + X^{2}) + b (X^{3} + 2 X^{2} + X) + c (X^{2} + 2 X + 1) .

Since

f_{a, b, c} (X)

is square-free for

(a, b, c) = (1, 0, 0)

, and since the set of

(a, b, c) \in C^{3}

for which

f_{a, b, c} (X)

is square-free is an open subset of

C^{3}

, we conclude that the set of possible four-degree DKMDs for

X

forms a continuum.

Example 9.

We expand

X

of Example 7 by adding one more dimension to the observables. Let the observable matrix

X

be given by

X = [\begin{matrix} 1 & 1 & 2 & 3 & 5 & 8 & 13 \\ 1 & 1 & 1 & 1 & 3 & 5 & 7 \end{matrix}] .

For

ℓ = 1, 2, 3

, the corresponding Hankel matrices are computed as

\begin{matrix} H_{1 ∣ X} = [\begin{matrix} 1 & 1 & 1 & 1 & 2 & 1 & 3 & 1 & 5 & 3 & 8 & 5 \\ 1 & 1 & 2 & 1 & 3 & 1 & 5 & 3 & 8 & 5 & 13 & 7 \end{matrix}], \\ H_{2 ∣ X} = [\begin{matrix} 1 & 1 & 1 & 1 & 2 & 1 & 3 & 1 & 5 & 3 \\ 1 & 1 & 2 & 1 & 3 & 1 & 5 & 3 & 8 & 5 \\ 2 & 1 & 3 & 1 & 5 & 3 & 8 & 5 & 13 & 7 \end{matrix}], and H_{3 ∣ X} = [\begin{matrix} 1 & 1 & 1 & 1 & 2 & 1 & 3 & 1 \\ 1 & 1 & 2 & 1 & 3 & 1 & 5 & 3 \\ 2 & 1 & 3 & 1 & 5 & 3 & 8 & 5 \\ 3 & 1 & 5 & 3 & 8 & 5 & 13 & 7 \end{matrix}], \end{matrix}

implying

{codim}_{H} (1 ∣ X) = {codim}_{H} (2 ∣ X) = {codim}_{H} (3 ∣ X) = 0 .

On the other hand,

{codim}_{H} (4 ∣ X) = 1, {codim}_{H} (5 ∣ X) = 2, and {codim}_{H} (6 ∣ X) = 5

follows from

H_{4 ∣ X} = [\begin{matrix} 1 & 1 & 1 & 1 & 2 & 1 \\ 1 & 1 & 2 & 1 & 3 & 1 \\ 2 & 1 & 3 & 1 & 5 & 3 \\ 3 & 1 & 5 & 3 & 8 & 5 \\ 5 & 3 & 8 & 5 & 13 & 7 \end{matrix}], H_{5 ∣ X} = [\begin{matrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 2 & 1 \\ 2 & 1 & 3 & 1 \\ 3 & 1 & 5 & 3 \\ 5 & 3 & 8 & 5 \\ 8 & 5 & 13 & 7 \end{matrix}], and H_{6 ∣ X} = [\begin{matrix} 1 & 1 \\ 1 & 1 \\ 2 & 1 \\ 3 & 1 \\ 5 & 3 \\ 8 & 5 \\ 13 & 7 \end{matrix}] .

Therefore, if

X

admits a uniquely feasible degree, it must be four. In fact,

V {(H_{4 ∣ X})}^{⊥} = C {(\begin{matrix} - 1 & - 1 & 0 & - 1 & 1 \end{matrix})}^{T}

holds, and the corresponding characteristic polynomial is

f (x) = x^{4} - x^{3} - x - 1,

which is square-free, implying that

ℓ = 4

is uniquely feasible.

4.4. Important Properties of the Hankel Dimension and Codimension

In this subsection, we introduce important properties of the Hankel dimension and Hankel codimension, which play a crucial role in designing algorithms to determine uniquely feasible degrees; the main ones are listed after Equation (19). For convenience, we let L denote the smallest degree with positive Hankel codimension, which by Corollary 1 is the smallest candidate for a feasible degree:

L = \min \{ℓ \in [0, T) ∣ {codim}_{H} (ℓ ∣ X) > 0\} .

(19)

Best Possible Upper Bound of a Uniquely Feasible Degree: Let $r = rank X$ . If $ℓ > \frac{r T}{r + 1}$ , then ${codim}_{H} (ℓ ∣ X) > 1$ . Hence, no uniquely feasible degree can exceed $\frac{r T}{r + 1}$ , which is also the sharpest possible upper bound.
Monotonic Increase of the Hankel Codimension: The Hankel codimension ${codim}_{H} (ℓ ∣ X)$ is strictly increasing with respect to ℓ over the interval $[L, T)$ .
Equivalence Between Unique and Minimal Feasibility: The monotonicity of the codimension implies that, if ℓ is uniquely feasible, then $ℓ = L$ . In particular, if ${codim}_{H} (L ∣ X) > 1$ , no uniquely feasible degree exists.
Saturation of the Hankel Dimension: If $L \leq \frac{T}{2}$ and $ℓ \in [L, T - L]$ , then

$\dim_{H} (ℓ ∣ X) = L, {codim}_{H} (ℓ ∣ X) = ℓ + 1 - L .$

In particular, ${codim}_{H} (L ∣ X) = 1$ holds, which implies L is the only candidate for a uniquely feasible degree.

4.4.1. Invariance Under Basis Transformations

For

X \in C^{m \times T}

,

Y \in C^{n \times T}

, and

A \in C^{n \times m}

, we assume

Y = A X .

Their corresponding Hankel matrices are computed as

H_{k ∣ Y} = H_{k ∣ X} diag (\underbrace{A^{T}, \dots, A^{T}}_{T - k}),

where

diag (\underbrace{A^{T}, \dots, A^{T}}_{T - k})

is defined as the

m (T - k) \times n (T - k)

matrix given by

diag (\underbrace{A^{T}, \dots, A^{T}}_{T - k}) = [\begin{matrix} A^{T} & & 0 \\ & \ddots & \\ 0 & & A^{T} \end{matrix}] .

This implies, in particular,

\dim_{H} (k ∣ Y) \leq \dim_{H} (k ∣ X) and {codim}_{H} (k ∣ Y) \geq {codim}_{H} (k ∣ X) .

Furthermore, if a square-free polynomial

f (x) = x^{k} + a_{k - 1} x^{k - 1} + \dots + a_{0}

is a characteristic polynomial of a DKMD for

X

, it is also a characteristic polynomial of a DKMD for

Y

. In fact, we have

(\begin{matrix} a_{0} & \dots & a_{k - 1} & 1 \end{matrix}) H_{k ∣ Y} = (\begin{matrix} a_{0} & \dots & a_{k - 1} & 1 \end{matrix}) H_{k ∣ X} diag (\underbrace{A^{T}, \dots, A^{T}}_{T - k}) = 0 .

Furthermore, if in addition there exists

B \in C^{m \times n}

such that

X = B Y,

then

\dim_{H} (k ∣ Y) = \dim_{H} (k ∣ X) and {codim}_{H} (k ∣ Y) = {codim}_{H} (k ∣ X)

holds, and the correspondence of characteristic polynomials is bijective.

The condition for

Y = A X

is that the row space of

Y

is contained in the row space of

X

, and symmetrically, the condition for

X = B Y

is that the row space of

Y

contains the row space of

X

. Hence, both

Y = A X

and

X = B Y

hold if and only if the row space of

Y

and the row space of

X

are identical. In other words, under the hypothesis

Y = A X

, the necessary and sufficient condition for the existence of

B \in C^{m \times n}

such that

X = B Y

is

rank Y = rank X

.

If

Y = A X

and

rank Y = rank X

hold, the minimum

{min}_{B} {∥ X - B Y ∥}_{F}

is zero, and hence,

{∥ X - X Y^{+} Y ∥}_{F} = 0

holds, implying

X = X Y^{+} Y .

Furthermore, this bijective correspondence between characteristic polynomials yields a bijective correspondence between DKMDs. In fact,

X = M V_{T ∣ μ_{1}, \dots, μ_{k}},

which is a DKMD of

X

, yields

Y = (A M) V_{T ∣ μ_{1}, \dots, μ_{k}},

which is a DKMD of

Y

. Conversely, a DKMD of

Y

Y = M V_{T ∣ μ_{1}, \dots, μ_{k}}

corresponds to a DKMD of

X

X = (X Y^{+} M) V_{T ∣ μ_{1}, \dots, μ_{k}} .

The latter correspondence is the inverse of the former by Proposition 5.

Thus, we have

Theorem 3.

If

rank Y = rank X

holds for

Y = A X

, then the following statements hold:

1. $\dim_{H} (k ∣ X) = \dim_{H} (k ∣ Y)$ for each $k \in \{0, 1, \dots, T - 1\}$.
2. ${codim}_{H} (k ∣ X) = {codim}_{H} (k ∣ Y)$ for each $k \in \{0, 1, \dots, T - 1\}$.
3. The set of characteristic polynomials for $X$ is identical with that for $Y$.
4. A DKMD $X = M V_{T ∣ μ_{1}, \dots, μ_{k}}$ of $X$ can be converted to a DKMD $Y = (A M) V_{T ∣ μ_{1}, \dots, μ_{k}}$ of $Y$, while a DKMD $Y = M V_{T ∣ μ_{1}, \dots, μ_{k}}$ of $Y$ can be converted to a DKMD $X = (X Y^{+} M) V_{T ∣ μ_{1}, \dots, μ_{k}}$ of $X$. These conversions yield a bijective correspondence between the set of DKMDs for $X$ and that for $Y$.

As an application of Theorem 3, the following two cases are particularly important:

Case where A is nonsingular: When $A$ is a nonsingular $m \times m$ matrix, we have $rank Y = rank X$ automatically, and thus Theorem 3 provides the invariance of the Hankel dimension, Hankel codimension, characteristic polynomials, and DKMDs under basis transformations in $C^{m}$ .
Case where A is fat: When $A \in C^{r \times m}$ with $r = rank X < m$ is selected to satisfy $rank (A X) = rank X$ , the matrix $Y = A X$ has fewer rows than $X$ , and thus $Y$ requires less computation to obtain DKMDs than $X$ .
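The second case can be illustrated numerically. In the Python sketch below, a rank-deficient observable matrix with three rows is compressed to two rows by a fat matrix, and the Hankel codimensions before and after the compression are compared. The concrete matrices are hypothetical: the first two rows are those of Example 9, the third row is their sum, and the compression matrix is one arbitrary choice that preserves the rank. Exact rational arithmetic is assumed.

```python
from fractions import Fraction


def hankel(X, k):
    """H_{k|X}: row i lists x_i, ..., x_{i+T-k-1} (each an m-vector)."""
    m, T = len(X), len(X[0])
    return [[X[r][j + i] for j in range(T - k) for r in range(m)]
            for i in range(k + 1)]


def rank(A):
    """Exact rank of a rational matrix by Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in A]
    rk = 0
    for c in range(len(rows[0])):
        p = next((r for r in range(rk, len(rows)) if rows[r][c] != 0), None)
        if p is None:
            continue
        rows[rk], rows[p] = rows[p], rows[rk]
        for r in range(rk + 1, len(rows)):
            f = rows[r][c] / rows[rk][c]
            rows[r] = [vr - f * vk for vr, vk in zip(rows[r], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk


def codims(X):
    """Hankel codimensions of X for k = 0, ..., T - 1."""
    return [k + 1 - rank(hankel(X, k)) for k in range(len(X[0]))]


def matmul(A, B):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


row1 = [1, 1, 2, 3, 5, 8, 13]
row2 = [1, 1, 1, 1, 3, 5, 7]
X = [row1, row2, [p + q for p, q in zip(row1, row2)]]  # m = 3, rank 2
A = [[1, 1, 0],
     [0, 1, 1]]                                        # hypothetical 2 x 3
Y = matmul(A, X)
print(rank(X), rank(Y))   # 2 2
print(codims(X))          # [0, 0, 0, 0, 1, 2, 5]
print(codims(Y))          # [0, 0, 0, 0, 1, 2, 5]
```

Since the ranks agree, Theorem 3 applies, and the smaller matrix can be used in place of the larger one when searching for a uniquely feasible degree.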

4.4.2. The Best Possible Upper Bound for a Uniquely Feasible Degree

Although we have seen that a uniquely feasible degree is always less than T, a sharper bound holds, and we can determine the best possible one. First, we observe the following.

Proposition 6.

For an observable matrix

X

with T columns, we have

r = rank X < T

if

X

admits a uniquely feasible degree.

Proof.

If ℓ is a uniquely feasible degree, Proposition 3 implies

ℓ < T

. On the other hand,

X = M V_{T ∣ μ_{1}, \dots, μ_{ℓ}}

implies

r \leq ℓ

. The assertion follows. □

Based on this proposition, we now establish the best possible upper bound for a Koopman degree that admits a unique DKMD.

Theorem 4.

If

{codim}_{H} (ℓ ∣ X) = 1

, then

ℓ \leq \frac{r T}{r + 1}

. Furthermore,

⌊ \frac{r T}{r + 1} ⌋

is the best possible upper bound for a uniquely feasible degree.

Proof.

When we take

Y = A X \in C^{r \times T}

with

rank Y = r

, Theorem 3 implies

\dim_{H} (ℓ ∣ X) = \dim_{H} (ℓ ∣ Y) = rank H_{ℓ ∣ Y} \leq r (T - ℓ),

equivalently,

{codim}_{H} (ℓ ∣ X) \geq ℓ + 1 - r (T - ℓ) .

Since

{codim}_{H} (ℓ ∣ X) = 1

by hypothesis,

ℓ + 1 - r (T - ℓ) \leq 1

, which yields

ℓ \leq \frac{r T}{r + 1} .

To show that

⌊ \frac{r T}{r + 1} ⌋

is the best possible upper bound for a uniquely feasible degree, we construct an

r \times T

observable matrix

X

with

rank X = r

and

\dim_{H} (ℓ ∣ X) = ℓ

.

First, we determine a sufficiently long series

Y

with mutually distinct eigenvalues

(μ_{1}, \dots, μ_{ℓ})

and nonzero modes

(m_{1}, \dots, m_{ℓ})

:

Y = [\begin{matrix} x_{0} & \dots & x_{\hat{T} - 1} \end{matrix}] = [\begin{matrix} m_{1} & \dots & m_{ℓ} \end{matrix}] V_{\hat{T} ∣ μ_{1}, \dots, μ_{ℓ}},

where

\hat{T}

is chosen sufficiently large so that we can construct an observable matrix

X = [\begin{matrix} x_{a_{1}} & \dots & x_{a_{1} + T - 1} \\ x_{a_{2}} & \dots & x_{a_{2} + T - 1} \\ ⋮ & ⋱ & ⋮ \\ x_{a_{r}} & \dots & x_{a_{r} + T - 1} \end{matrix}] .

Without loss of generality, we let

0 = a_{1} < a_{2} < \dots < a_{r}

and further suppose

a_{i + 1} - a_{i} \leq T - ℓ

for

i = 1, \dots, r - 1

.

We now verify that the columns of

H_{ℓ ∣ X}

correspond exactly to the first

a_{r} + T - ℓ

columns of

H_{ℓ ∣ Y}

. The i-th row of

X

contributes

T - ℓ

consecutive columns to

H_{ℓ ∣ X}

, where the last such column is

{(x_{a_{i} + T - ℓ - 1} \dots x_{a_{i} + T - 1})}^{T}

. On the other hand, the first column contributed by the

(i + 1)

-th row is

{(x_{a_{i + 1}} \dots x_{a_{i + 1} + ℓ})}^{T}

. Since

a_{i + 1} - a_{i} \leq T - ℓ

implies

a_{i + 1} \leq a_{i} + T - ℓ

, we have

a_{i + 1} \leq a_{i} + T - ℓ - 1 + 1

, which means the columns contributed by the i-th and

(i + 1)

-th rows are contiguous (or overlapping) in

H_{ℓ ∣ Y}

, with no gaps. Consequently, the set of columns of

H_{ℓ ∣ X}

coincides exactly with the first

a_{r} + T - ℓ

columns of

H_{ℓ ∣ Y}

.

Thus,

rank H_{ℓ ∣ X} = ℓ

holds if and only if

a_{r} + T - ℓ \geq ℓ

by Theorem 2. Since

a_{r}

satisfies

r - 1 \leq a_{r} \leq (r - 1) (T - ℓ)

, the condition for such an

a_{r}

to exist is that

2 ℓ - T \leq (r - 1) (T - ℓ), \text{i.e.,} ℓ \leq \frac{r T}{r + 1} .

Since ℓ must be an integer, the largest feasible value is

ℓ = ⌊ \frac{r T}{r + 1} ⌋

, completing the proof. □
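The arithmetic of Theorem 4 is easy to evaluate for the examples of Section 4.3. The following Python sketch is a small illustration; the function names are hypothetical. It computes the integer bound and the lower bound on the Hankel codimension that appears in the first half of the proof.

```python
def degree_upper_bound(r, T):
    """Largest integer l with l <= r T / (r + 1) (Theorem 4)."""
    return (r * T) // (r + 1)


def codim_lower_bound(ell, r, T):
    """Bound l + 1 - r (T - l) on the Hankel codimension, floored at 0."""
    return max(0, ell + 1 - r * (T - ell))


print(degree_upper_bound(2, 7))   # 4  (Example 9: r = 2, T = 7)
print(degree_upper_bound(1, 7))   # 3  (Example 7: r = 1, T = 7)
print([codim_lower_bound(l, 2, 7) for l in (5, 6)])  # [2, 5]
```

In Example 9 the uniquely feasible degree 4 attains the bound, and the lower bounds 2 and 5 for degrees 5 and 6 coincide with the codimensions reported there. In Example 7 the codimension first becomes positive at degree 4, which exceeds the bound 3 and is consistent with the absence of a uniquely feasible degree.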

4.4.3. Monotonic Increase of the Hankel Codimension

In this section, we establish that the Hankel codimension increases strictly monotonically beyond a certain threshold. This property plays a crucial role in identifying uniquely feasible degrees.

Theorem 5.

In the domain

ℓ \in {L, L + 1, \dots, T - 1}

,

{codim}_{H} (ℓ ∣ X)

is strictly monotonically increasing.
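Theorem 5 can be observed on the data of Example 8. The Python sketch below computes all Hankel codimensions, locates the threshold L of Equation (19), and checks strict monotonicity from L onward. It is an illustration under the assumption of exact rational arithmetic, and the function names are hypothetical.

```python
from fractions import Fraction


def hankel(X, k):
    """H_{k|X}: row i lists x_i, ..., x_{i+T-k-1} (each an m-vector)."""
    m, T = len(X), len(X[0])
    return [[X[r][j + i] for j in range(T - k) for r in range(m)]
            for i in range(k + 1)]


def rank(A):
    """Exact rank of a rational matrix by Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in A]
    rk = 0
    for c in range(len(rows[0])):
        p = next((r for r in range(rk, len(rows)) if rows[r][c] != 0), None)
        if p is None:
            continue
        rows[rk], rows[p] = rows[p], rows[rk]
        for r in range(rk + 1, len(rows)):
            f = rows[r][c] / rows[rk][c]
            rows[r] = [vr - f * vk for vr, vk in zip(rows[r], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk


X8 = [[1, 1, -3, 5, -7]]                       # observables of Example 8
T = len(X8[0])
codims = [k + 1 - rank(hankel(X8, k)) for k in range(T)]
L = min(k for k in range(T) if codims[k] > 0)  # threshold of Equation (19)
increasing = all(codims[k] < codims[k + 1] for k in range(L, T - 1))
print(codims, L, increasing)  # [0, 0, 1, 2, 4] 2 True
```

Here the codimension equals one at L = 2, but, as Example 8 shows, the associated polynomial has a repeated root, so a codimension of one at L is not by itself enough to guarantee a uniquely feasible degree.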

Proof.

Let

L \leq ℓ < k < T

. First, for a nonzero vector

w \in V {(H_{L ∣ X})}^{⊥}

, define

w^{0, ℓ - L} = {(w^{T} \underbrace{0 \dots 0}_{ℓ - L})}^{T} \in C^{ℓ + 1} .

This is a nonzero vector in

V {(H_{ℓ ∣ X})}^{⊥}

, so we have

{codim}_{H} (ℓ ∣ X) > 0

.

Next, let

v_{1}, \dots, v_{d}

be a basis of

V {(H_{ℓ ∣ X})}^{⊥}

, where

d = {codim}_{H} (ℓ ∣ X)

. By reordering if necessary, we may assume that

j_{1} : = max {j : {(v_{1})}_{j} \neq 0} \geq max {j : {(v_{i})}_{j} \neq 0}

holds for all

i \in {1, \dots, d}

.

We claim that the following

d + (k - ℓ)

vectors are linearly independent:

v_{1}^{1, k - ℓ - 1}, v_{1}^{2, k - ℓ - 2}, \dots, v_{1}^{k - ℓ, 0}, v_{1}^{0, k - ℓ}, v_{2}^{0, k - ℓ}, \dots, v_{d}^{0, k - ℓ},

which all belong to

V {(H_{k ∣ X})}^{⊥} \subset C^{k + 1}

. To prove the claim, let us consider

\sum_{s = 1}^{k - ℓ} c_{s} v_{1}^{s, k - ℓ - s} + \sum_{i = 1}^{d} c_{i}^{'} v_{i}^{0, k - ℓ} = 0 .

The vector

v_{1}^{s, k - ℓ - s}

has the form

{(0^{s} v_{1}^{T} 0^{k - ℓ - s})}^{T} \in C^{k + 1}

, so its

(j_{1} + s)

-th component equals

{(v_{1})}_{j_{1}} \neq 0

.

First, we prove

c_{s} = 0

for

s = k - ℓ, k - ℓ - 1, \dots, 1

by backward induction on s.

For

s = k - ℓ

, consider the

(j_{1} + k - ℓ)

-th component of the left-hand side. Among all vectors in the sum, only

v_{1}^{k - ℓ, 0}

has a nonzero value at this position, namely

{(v_{1})}_{j_{1}}

. The vectors

v_{i}^{0, k - ℓ}

for

i = 1, \dots, d

have the form

{(v_{i}^{T} 0^{k - ℓ})}^{T}

, and since

k - ℓ \geq 1

, we have

j_{1} + k - ℓ > j_{1}

, which implies that their

(j_{1} + k - ℓ)

-th component is zero. Similarly, for

t < k - ℓ

, the vector

v_{1}^{t, k - ℓ - t}

has nonzero components only up to position

j_{1} + t < j_{1} + k - ℓ

, so its

(j_{1} + k - ℓ)

-th component is also zero. Therefore,

c_{k - ℓ} {(v_{1})}_{j_{1}} = 0

, which gives

c_{k - ℓ} = 0

.

For

s < k - ℓ

, assuming

c_{s + 1} = \dots = c_{k - ℓ} = 0

by the induction hypothesis, we consider the

(j_{1} + s)

-th component of the left-hand side. By the same argument as above, only

v_{1}^{s, k - ℓ - s}

has a nonzero value at position

j_{1} + s

, namely

{(v_{1})}_{j_{1}}

. Therefore,

c_{s} {(v_{1})}_{j_{1}} = 0

, which gives

c_{s} = 0

.

This reduces the equation to

\sum_{i = 1}^{d} c_{i}^{'} v_{i}^{0, k - ℓ} = 0

, which implies

c_{i}^{'} = 0

for

i = 1, \dots, d

by the linear independence of

v_{1}, \dots, v_{d}

.

Therefore,

{codim}_{H} (k ∣ X) \geq d + (k - ℓ) = {codim}_{H} (ℓ ∣ X) + (k - ℓ) > {codim}_{H} (ℓ ∣ X),

which completes the proof. □
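The shift-and-pad construction used in this proof can be checked on a toy example. The sketch below assumes a single real observable x_t = 2^t + 3^t with T = 8 (an illustrative choice, not data from the paper), takes the annihilating vector v = (6, -5, 1) of degree ℓ = 2, and verifies that its k - ℓ + 1 padded shifts annihilate every column of the degree-k Hankel matrix for k = 4. The helper names are hypothetical, and the pairing is the plain bilinear one, which coincides with the Hermitian one for real data.

```python
def hankel_columns(X, ell):
    """Columns of H_{ell|X}: every length-(ell+1) window of every row of X."""
    T = len(X[0])
    return [row[a:a + ell + 1] for row in X for a in range(T - ell)]


def pad(v, s, total):
    """The vector v^{s, total-s}: s leading zeros, v, then total-s trailing zeros."""
    return [0] * s + list(v) + [0] * (total - s)


def annihilates(v, columns):
    """True if v pairs to zero with every column."""
    return all(sum(a * b for a, b in zip(v, col)) == 0 for col in columns)


T, ell, k = 8, 2, 4
X = [[2**t + 3**t for t in range(T)]]   # one observable, eigenvalues 2 and 3
v = [6, -5, 1]                          # coefficients of (x-2)(x-3), low to high

# v^{0,k-ell}, v^{1,k-ell-1}, ..., v^{k-ell,0}
shifted = [pad(v, s, k - ell) for s in range(k - ell + 1)]

# Position of the last nonzero entry of each shifted vector.
last_nonzero = [max(i for i, c in enumerate(w) if c != 0) for w in shifted]
```

Because the last nonzero positions are pairwise distinct, the shifted vectors are linearly independent, which mirrors the backward induction on s in the proof.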

4.4.4. Equivalence Between Unique and Minimal Feasibility

If a uniquely feasible degree exists, it must coincide with the minimum feasible degree L. This is an immediate corollary of Theorem 5.

Corollary 4.

If

X

admits a uniquely feasible degree, it must be L.

Proof.

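By Corollary 2, a uniquely feasible degree ℓ must satisfy

{codim}_{H} (ℓ ∣ X) = 1

. The codimension vanishes for every degree below L by the definition of L, and Theorem 5 gives a codimension of at least 2 for every degree above L, so this condition can be met only when

ℓ = L

. □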

4.4.5. Saturation of $\dim_{H} (ℓ ∣ X)$

In this section, we present a theorem that extends Theorem 2. This result plays a crucial role in the development of algorithms for determining uniquely feasible degrees, particularly in the case where

L \leq \frac{T + 1}{2}

for the minimum feasible degree L.

Theorem 6.

If

L \leq \frac{T + 1}{2}

and an L-degree DKMD exists, then

\begin{matrix} \dim_{H} (ℓ ∣ X) & = \{\begin{matrix} ℓ + 1, & for 0 \leq ℓ \leq L - 1, \\ L, & for L \leq ℓ \leq T - L; \end{matrix} \\ {codim}_{H} (ℓ ∣ X) & = \{\begin{matrix} 0, & for 0 \leq ℓ \leq L - 1, \\ ℓ - L + 1, & for L \leq ℓ \leq T - L . \end{matrix} \end{matrix}

Proof.

We express the L-degree DKMD given by hypothesis in the form

X = M V_{T ∣ μ_{1}, \dots, μ_{L}} .

No column of

M

is a zero vector, since L is the minimum feasible degree. Indeed, if

X = M^{'} V_{T ∣ μ_{1}, \dots, μ_{L - 1}}

held for some

M^{'} \in C^{m \times (L - 1)}

, then the coefficient vector

(\begin{matrix} b_{0} & b_{1} & \dots & b_{L - 2} & 1 \end{matrix})

of the polynomial

(X - μ_{1}) \dots (X - μ_{L - 1}) = X^{L - 1} + b_{L - 2} X^{L - 2} + \dots + b_{0}

would belong to

V {(H_{L - 1 ∣ X})}^{⊥}

, contradicting the definition of L.

By Theorem 3, we can transform

X

by multiplying a matrix

A

such that

rank X = rank (A X)

without changing the assertion. In particular, we may assume the following:

$X \in C^{r \times T}$ with $rank X = r$ . To achieve this, take $i_{1}, \dots, i_{r}$ such that the rows $X 〈 i_{1} 〉, \dots, X 〈 i_{r} 〉$ are linearly independent. Then, define $A \in C^{r \times m}$ so that the j-th row of $A$ has 1 as the $i_{j}$ -th component and 0 for the other components.
The first row of $M$ has no zero component: $m_{1 i} \neq 0$ for $i = 1, 2, \dots, L$ . Since each column of $M$ is nonzero and $X \in C^{r \times T}$ , we can find a nonsingular matrix $A \in C^{r \times r}$ such that the first row of $A M$ has no zero component.

After this transformation, we apply Theorem 2 to the first row of

X

:

X 〈 1 〉 = (\begin{matrix} m_{11} & \dots & m_{1 L} \end{matrix}) V_{T ∣ μ_{1}, \dots, μ_{L}},

and obtain

\begin{matrix} \dim_{H} (ℓ ∣ X) & \geq \dim_{H} (ℓ ∣ X 〈 1 〉) = \{\begin{matrix} ℓ + 1, & for 0 \leq ℓ \leq L - 1, \\ L, & for L \leq ℓ \leq T - L, \end{matrix} \end{matrix}

since the columns of

H_{ℓ ∣ X 〈 1 〉}

are among those of

H_{ℓ ∣ X}

. For

0 \leq ℓ \leq L - 1

, in particular, the equality

\dim_{H} (ℓ ∣ X) = ℓ + 1

holds because

\dim_{H} (ℓ ∣ X) \leq ℓ + 1

by definition.

Leveraging this evaluation of the lower bound, the claim

\dim_{H} (ℓ ∣ X) = L

for

L \leq ℓ \leq T - L

can be proven by induction on ℓ.

For the base case

ℓ = L

, the definition of L gives

{codim}_{H} (L ∣ X) \geq 1

, which implies

\dim_{H} (L ∣ X) = L + 1 - {codim}_{H} (L ∣ X) \leq L

. Combined with

\dim_{H} (L ∣ X) \geq L

shown above, we have

\dim_{H} (L ∣ X) = L

.

For the inductive step, assume

ℓ > L

and

\dim_{H} (ℓ - 1 ∣ X) = L

, i.e.,

{codim}_{H} (ℓ - 1 ∣ X)

=

ℓ - L

. By Theorem 5,

{codim}_{H} (ℓ ∣ X) > {codim}_{H} (ℓ - 1 ∣ X) = ℓ - L

, which gives

\dim_{H} (ℓ ∣ X) = ℓ + 1 - {codim}_{H} (ℓ ∣ X) < ℓ + 1 - (ℓ - L) = L + 1,

and consequently,

\dim_{H} (ℓ ∣ X) \leq L

. Combined with

\dim_{H} (ℓ ∣ X) \geq L

, we conclude

\dim_{H} (ℓ ∣ X) = L

. Finally, the expressions for

{codim}_{H} (ℓ ∣ X)

follow directly from the definition of the Hankel codimension. □
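The saturation profile stated in Theorem 6 can be reproduced exactly on a small example. The sketch below assumes one observable x_t = 1 + 2^t + 3^t with T = 10, so that L = 3, and computes Hankel dimensions as exact matrix ranks over the rationals. The helper names are hypothetical, and exact rational elimination is used here only to avoid floating-point thresholds; it is not the paper's numerical procedure.

```python
from fractions import Fraction


def hankel_columns(X, ell):
    """Columns of H_{ell|X}: every length-(ell+1) window of every row of X."""
    T = len(X[0])
    return [row[a:a + ell + 1] for row in X for a in range(T - ell)]


def rank(rows):
    """Exact rank by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for c in range(len(M[0])):
        p = next((i for i in range(rk, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue                      # no pivot in this column
        M[rk], M[p] = M[p], M[rk]
        for i in range(rk + 1, len(M)):
            f = M[i][c] / M[rk][c]
            M[i] = [x - f * y for x, y in zip(M[i], M[rk])]
        rk += 1
        if rk == len(M):
            break
    return rk


def hankel_dim(X, ell):
    # The rank of H equals the rank of its transpose (the list of columns).
    return rank(hankel_columns(X, ell))


def hankel_codim(X, ell):
    return ell + 1 - hankel_dim(X, ell)


T = 10
X = [[1 + 2**t + 3**t for t in range(T)]]   # three wave components, L = 3
dims = [hankel_dim(X, l) for l in range(T)]
codims = [hankel_codim(X, l) for l in range(T)]
```

In this example the dimension grows as ℓ + 1 up to ℓ = 2 and then stays at 3 for ℓ between L = 3 and T - L = 7, while the codimension increases strictly from ℓ = 3 onward, in agreement with Theorems 5 and 6.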

Although Theorem 6 requires

L \leq \frac{T + 1}{2}

, if

L = \frac{T + 1}{2}

(which occurs only when T is odd), then

L = \frac{T + 1}{2} > \frac{T - 1}{2} = T - L

holds, implying that the range

L \leq ℓ \leq T - L

in Theorem 6 is empty. In this boundary case, the theorem provides information only for

ℓ = 0, 1, \dots, L - 1 = \frac{T - 1}{2}

, namely that

\dim_{H} (ℓ ∣ X) = ℓ + 1

and

{codim}_{H} (ℓ ∣ X) = 0

in this range.

For

L < \frac{T + 1}{2}

, which for integer L is equivalent to

L \leq \frac{T}{2}

, we have

Corollary 5.

If L is feasible and satisfies

L \leq \frac{T}{2}

, then L is a uniquely feasible degree.

Proof.

Since L is feasible, an L-degree DKMD exists. The condition

L \leq \frac{T}{2}

implies

L < \frac{T + 1}{2}

, so Theorem 6 applies and gives

{codim}_{H} (L ∣ X) = 1

. By Corollary 2, this means that L is uniquely feasible. □

4.5. Algorithms

Leveraging the five properties mentioned in Section 4.4, we develop efficient algorithms to search for a uniquely feasible degree L and to determine an L-degree characteristic polynomial, which provides eigenvalues of a DKMD. Once mutually distinct eigenvalues are obtained, the associated Koopman modes are calculated as described in Section 3.4.2. Our algorithms are categorized into the following:

One that applies to the case where $L \leq \frac{T}{2}$ and determines L by $\dim_{H} (⌊ \frac{T}{2} ⌋ ∣ X)$ ;
Another that performs a binary search to determine L in the case when $L > \frac{T}{2}$ .

To introduce the algorithms, we start by addressing a theoretical scenario where the observables exactly consist of a finite number of wave components. In this case, an exact DKMD is obtained. Subsequently, we consider a practical scenario where the observables comprise a finite number of dominant wave components, an infinite number of minor wave components, and noise. Here, our algorithms focus on extracting only the dominant wave components, effectively filtering out the minor components and the noise, resulting in an approximate DKMD.

4.5.1. A Theoretical Scenario

We first present Algorithm 1, which reduces the dimension of each observable so that decomposing the reduced observable matrix is equivalent to decomposing the original one. This reduction is practically useful for more efficient computation and a clearer understanding of the underlying structure. We then present algorithms that decompose an observable matrix for two cases:

L \leq \frac{T}{2}

(Algorithms 2 and 3) and

\frac{T}{2} < L \leq \frac{r T}{r + 1}

(Algorithm 4).

Algorithm 1: Dimension reduction of the observable matrix.

Require:

X \in C^{m \times T}

Ensure: Matrices

A \in C^{r \times m}

and

Y \in C^{r \times T}

for

r = rank X

with

Y = A X

and rank

Y = r

1: Find

i_{1}, \dots, i_{r} \in {1, 2, \dots, m}

such that the row vectors

X 〈 i_{1} 〉, \dots, X 〈 i_{r} 〉

are linearly independent;

2: Determine a matrix

A \in C^{r \times m}

such that the

(j, i_{j})

component is 1 and all other components are 0 for

j = 1, \dots, r

;

3: return

A

and

Y = A X

.

Algorithm 2: Search for an L-degree characteristic polynomial when

L \leq \frac{T}{2}

.

Require:

X \in C^{m \times T}

Ensure: The signal

continue

if

L > \frac{T}{2}

; the characteristic polynomial if

L \leq \frac{T}{2}

is uniquely feasible;

no_solution

if

L \leq \frac{T}{2}

is not uniquely feasible.

1: if

{codim}_{H} (⌊\frac{T}{2}⌋ ∣ X) > 0

then

2: Let

L = \dim_{H} (⌊\frac{T}{2}⌋ ∣ X)

;

3: Execute Algorithm 3;

4: return the return value of Algorithm 3;

5: else

6: return

continue

7: end if

Algorithm 3: Determine the characteristic polynomial.

Require:

X \in C^{m \times T}

and L

Ensure: An L-degree characteristic polynomial or

no_solution

1: if

\exists {(a_{0} a_{1} \dots a_{L - 1} 1)}^{T} \in V {(H_{L ∣ X})}^{⊥}

then

2: Let

f (x) = x^{L} + a_{L - 1} x^{L - 1} + \dots + a_{1} x + a_{0}

;

3: if

f (x) = 0

has no repeated roots then

4: return

f (x)

5: end if

6: end if

7: return

no_solution

Algorithm 4: Search for an L-degree characteristic polynomial when

L \in (\frac{T}{2}, \frac{r T}{r + 1}]

.

Require:

X \in C^{m \times T}

with

{codim}_{H} (⌊\frac{T}{2}⌋ ∣ X) = 0

Ensure: Either the characteristic polynomial of the DKMD of

X

for the uniquely feasible degree

L > \frac{T}{2}

, if present, or

no_solution

, otherwise.

1: if

{codim}_{H} (⌊\frac{r T}{r + 1}⌋ ∣ X) = 0

then

2: return

no_solution

3: end if

4: Let

l = ⌊\frac{T}{2}⌋

;

5: Let

h = ⌊\frac{r T}{r + 1}⌋

;

6: while

h - l > 1

do

7: Let

k = ⌊\frac{l + h}{2}⌋

;

8: if

{codim}_{H} (k ∣ X) = 0

then

9: Let

l = k

;

10: else if

{codim}_{H} (k ∣ X) = 1

then

11: Let

L = k

;

12: Execute Algorithm 3;

13: return the return value of Algorithm 3;

14: else

15: Let

h = k

;

16: end if

17: end while

18: if

{codim}_{H} (h ∣ X) = 1

then

19: Let

L = h

;

20: Execute Algorithm 3;

21: return the return value of Algorithm 3;

22: else

23: return

no_solution

;

24: end if

Dimension Reduction

Although each observable vector, i.e., a column vector of

X

, is of dimension m, the rank r of

X

can be smaller than m. In this case, Algorithm 1 determines a dimensionally reduced observable matrix

Y \in C^{r \times T}

and a conversion matrix

A \in C^{r \times m}

such that

Y = A X

and

rank Y = rank X = r

. By Theorem 3, the Hankel dimensions and codimensions, as well as the set of characteristic polynomials, are invariant under this conversion, and DKMDs for

X

and those for

Y

are mutually converted by matrix multiplication by

A

and

X Y^{+}

.

Such

A

and

Y

can be constructed by selecting r linearly independent rows of

X

and defining

A

so that these rows become the rows of

Y = A X

. Since these r rows of

X

are linearly independent and form all the rows of

Y

, we have

rank Y = r = rank X

.

By applying Algorithm 1, we can reduce the problem of decomposing

X \in C^{m \times T}

to the problem of decomposing

Y \in C^{r \times T}

with

r = rank X \leq m

. This reduction provides benefits in terms of computational efficiency and a clearer understanding of the data structure when executing the algorithms presented below. However, these algorithms are formulated in general terms and do not require that such a dimension reduction has been performed.
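A minimal sketch of the row selection in Algorithm 1 is given below, assuming rational (or integer) input so that linear independence can be tested exactly. The function names and the greedy selection order are illustrative assumptions; any set of r linearly independent rows would serve equally well.

```python
from fractions import Fraction


def independent_row_indices(X):
    """Greedily pick indices of rows of X that are linearly independent."""
    basis, picked = [], []            # basis holds (lead index, reduced row)
    for idx, row in enumerate(X):
        v = [Fraction(x) for x in row]
        for lead, b in basis:         # reduce v against the rows kept so far
            if v[lead] != 0:
                f = v[lead] / b[lead]
                v = [x - f * y for x, y in zip(v, b)]
        lead = next((j for j, x in enumerate(v) if x != 0), None)
        if lead is not None:          # a nonzero remainder means independence
            basis.append((lead, v))
            picked.append(idx)
    return picked


def reduce_observables(X):
    """Return the selection matrix A (r x m) and Y = A X (r x T)."""
    picked = independent_row_indices(X)
    m = len(X)
    A = [[1 if i == j else 0 for i in range(m)] for j in picked]
    Y = [list(X[j]) for j in picked]  # multiplying by A just selects these rows
    return A, Y


# Toy input: the third row is the sum of the first two, so r = 2.
X = [[1, 2, 4, 8], [1, 3, 9, 27], [2, 5, 13, 35]]
A, Y = reduce_observables(X)
```

Since A only selects rows, forming Y requires no arithmetic on the data itself.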

Case $L \leq \frac{T}{2}$

Algorithm 2 first investigates whether

L \leq \frac{T}{2}

by leveraging the equivalence between

L \leq ⌊\frac{T}{2}⌋

and

{codim}_{H} (⌊\frac{T}{2}⌋ ∣ X) > 0

. Indeed, if

L \leq ⌊\frac{T}{2}⌋

, then

{codim}_{H} (⌊\frac{T}{2}⌋ ∣ X) \geq {codim}_{H} (L ∣ X) > 0

holds by Theorem 5. Conversely, if

{codim}_{H} (⌊\frac{T}{2}⌋ ∣ X) > 0

, then

L \leq ⌊\frac{T}{2}⌋

by the definition of L.

If this investigation reveals

L \leq \frac{T}{2}

, the algorithm identifies L as

\dim_{H} (⌊\frac{T}{2}⌋ ∣ X)

by Theorem 6, and then executes Algorithm 3 to determine whether L is uniquely feasible. If

L > \frac{T}{2}

, the algorithm returns the value

continue

, indicating that Algorithm 4 should be used.

When invoked, Algorithm 3 verifies the following:

A vector ${(a_{0}, a_{1}, \dots, a_{L - 1}, 1)}^{T}$ exists in $V {(H_{L ∣ X})}^{⊥}$ . This can be efficiently verified by performing a QR decomposition of $H_{L ∣ X}$ .
If such a vector exists, the polynomial equation

$x^{L} + a_{L - 1} x^{L - 1} + \dots + a_{1} x + a_{0} = 0$

has no repeated roots.

If both conditions are satisfied, this confirms that L is feasible, and we can then apply Theorem 6. As a result, we have

{codim}_{H} (L ∣ X) = 1

, meaning that L is uniquely feasible. Thus, the polynomial obtained in the verification is square-free and serves as the characteristic polynomial of the unique L-degree DKMD of

X

. In this case, the algorithm returns the obtained characteristic polynomial. Otherwise, it returns the value

no_solution

, indicating that no uniquely feasible degree exists.
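The flow of Algorithms 2 and 3 can be sketched for exact rational data as follows. This is only an illustration under stated assumptions: exact Gaussian elimination replaces the QR decomposition mentioned above, the square-free test uses the polynomial GCD of f and its derivative, and all function names are hypothetical. Polynomials are stored as coefficient lists from the constant term upward.

```python
from fractions import Fraction


def hankel_columns(X, ell):
    """Columns of H_{ell|X}: every length-(ell+1) window of every row of X."""
    T = len(X[0])
    return [row[a:a + ell + 1] for row in X for a in range(T - ell)]


def rref(rows):
    """Exact reduced row echelon form; returns (matrix, pivot column indices)."""
    M = [[Fraction(x) for x in r] for r in rows]
    pivots = []
    for c in range(len(M[0])):
        r = len(pivots)
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                M[i] = [x - M[i][c] * y for x, y in zip(M[i], M[r])]
        pivots.append(c)
        if len(pivots) == len(M):
            break
    return M, pivots


def poly_rem(a, b):
    """Remainder of a divided by b (coefficients low to high, b[-1] != 0)."""
    a = list(a)
    while len(a) >= len(b):
        q = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, coeff in enumerate(b):
            a[shift + i] -= q * coeff
        a.pop()                       # the leading coefficient is now exactly zero
    while a and a[-1] == 0:
        a.pop()
    return a


def is_square_free(f):
    """True if f has no repeated roots, i.e. gcd(f, f') is a constant."""
    a, b = list(f), [i * c for i, c in enumerate(f)][1:]
    while b:
        a, b = b, poly_rem(a, b)
    return len(a) == 1


def characteristic_polynomial(X, L):
    """Algorithm 3 sketch: monic degree-L polynomial, or None."""
    M, pivots = rref(hankel_columns(X, L))
    if L in pivots:
        return None                   # every annihilating vector ends in 0
    f = [Fraction(0)] * (L + 1)
    f[L] = Fraction(1)
    for i, c in enumerate(pivots):
        f[c] = -M[i][L]
    return f if is_square_free(f) else None


def algorithm2(X):
    """Algorithm 2 sketch: 'continue', 'no_solution', or the polynomial."""
    half = len(X[0]) // 2
    dim = len(rref(hankel_columns(X, half))[1])
    if half + 1 - dim == 0:
        return "continue"             # L > T/2: hand over to the binary search
    poly = characteristic_polynomial(X, dim)   # L = dim_H(floor(T/2) | X)
    return poly if poly is not None else "no_solution"
```

For the toy observable x_t = 2^t + 3^t with T = 8, this sketch returns the coefficients of (x - 2)(x - 3), whereas x_t = (t + 1) 2^t leads to the repeated-root polynomial (x - 2)^2 and hence to `no_solution`.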

Case $\frac{T}{2} < L \leq \frac{rT}{r + 1}$

Algorithm 4 details the procedure for cases when

L > \frac{T}{2}

. Note that if L is uniquely feasible, then

L \leq \frac{r T}{r + 1}

holds by Theorem 4, and this gives the best possible upper bound.

The algorithm first verifies whether

{codim}_{H} (⌊\frac{r T}{r + 1}⌋ ∣ X) > 0

. If this condition does not hold, no uniquely feasible degree exists, and the algorithm returns the signal

no_solution

.

If

{codim}_{H} (⌊\frac{r T}{r + 1}⌋ ∣ X) > 0

, then L lies in the range

⌊\frac{T}{2}⌋ < L \leq ⌊\frac{r T}{r + 1}⌋

. The algorithm uses a binary search to find L, leveraging the fact that

{codim}_{H} (ℓ ∣ X)

is zero below L and strictly increasing from L onward (Theorem 5).

Since the identified L does not necessarily satisfy

{codim}_{H} (L ∣ X) = 1

, the algorithm must verify

{codim}_{H} (L ∣ X) = 1

before executing Algorithm 3 to confirm that L is uniquely feasible.
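The binary search of Algorithm 4 can be sketched independently of how the Hankel codimension is computed. In the sketch below, `codim` is a hypothetical callable supplied by the caller (for instance a rank computation in the exact setting, or an SVD-based estimate in the noisy setting), and the function returns only the candidate degree; running Algorithm 3 on that degree is left to the caller.

```python
from typing import Callable, Optional


def search_unique_degree(codim: Callable[[int], int], r: int, T: int) -> Optional[int]:
    """Return the degree with codimension 1 in (T/2, rT/(r+1)], or None.

    Assumes codim(T // 2) == 0, which is the precondition of Algorithm 4.
    """
    low = T // 2
    high = (r * T) // (r + 1)
    if codim(high) == 0:
        return None                   # no degree in the range has positive codimension
    while high - low > 1:
        mid = (low + high) // 2
        c = codim(mid)
        if c == 0:
            low = mid                 # L lies strictly above mid
        elif c == 1:
            return mid                # candidate for the uniquely feasible degree
        else:
            high = mid                # L lies at or below mid
    return high if codim(high) == 1 else None


# Hypothetical oracle mimicking L = 30 with T = 50 and r = 2.
found = search_unique_degree(lambda l: max(0, l - 29), r=2, T=50)

# Hypothetical oracle whose codimension jumps from 0 directly to 2 at l = 30.
missing = search_unique_degree(lambda l: 0 if l < 30 else 2 * (l - 29), r=2, T=50)
```

The second oracle illustrates why the final check is needed: the search converges to the smallest degree with positive codimension, which is uniquely feasible only if that codimension equals 1.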

4.5.2. A Practical Scenario

In practice, even if the Koopman operator has only discrete eigenvalues, the number of eigenvalues can be infinite, and in addition, observables can contain error signals. In such situations, the purpose of DKMD is to find a finite number of major wave components that most significantly affect the observables. From the viewpoint of executing our algorithms, the presence of minor components and errors makes direct computation of Hankel dimensions via QR decomposition impractical. In fact,

rank H_{ℓ ∣ X} = ℓ + 1

may always hold, which makes it impossible to identify the Hankel dimensions. In this section, we present a method to estimate Hankel dimensions for the major components via singular value decomposition (SVD) rather than QR decomposition.

We assume that the observables are represented as

x_{t} = \sum_{n = 1}^{ℓ} m_{n} λ_{n}^{t} + \sum_{n = 1}^{\infty} m_{n}^{'} {λ_{n}^{'}}^{t} + ε_{t},

where

λ_{n}^{t}

and

{λ_{n}^{'}}^{t}

represent major and minor wave components, respectively, and

ε_{t}

is noise. We define

\begin{matrix} {\hat{x}}_{t} = \sum_{n = 1}^{ℓ} m_{n} λ_{n}^{t}, \hat{X} = [{\hat{x}}_{0} \dots {\hat{x}}_{T - 1}]; \\ x_{t}^{'} = \sum_{n = 1}^{\infty} m_{n}^{'} {λ_{n}^{'}}^{t} + ε_{t}, X^{'} = [x_{0}^{'} \dots x_{T - 1}^{'}], \end{matrix}

and our basic assumption is that

x_{t}^{'}

is a small perturbation. Our aim is to estimate

rank H_{ℓ ∣ \hat{X}}

from

H_{ℓ ∣ X}

, taking advantage of the fact that

rank H_{ℓ ∣ \hat{X}}

equals the number of positive singular values of

H_{ℓ ∣ \hat{X}}

.

We consider

A = \hat{A} + A^{'} \in C^{m \times n}

with

m \leq n

. Let

σ_{1} \geq \dots \geq σ_{m} \geq 0

,

{\hat{σ}}_{1} \geq \dots \geq {\hat{σ}}_{m} \geq 0

, and

σ_{1}^{'} \geq \dots \geq σ_{m}^{'} \geq 0

be the singular values of

A

,

\hat{A}

, and

A^{'}

, respectively. Then,

|σ_{i} - {\hat{σ}}_{i}| \leq σ_{1}^{'}

holds for all

i \in {1, \dots, m}

. This can be proven as follows. By the Courant–Fischer min–max theorem [16], the i-th singular value of

A

satisfies

σ_{i} = min_{\begin{matrix} V \subseteq C^{n} \\ dim V = n - i + 1 \end{matrix}} max_{\begin{matrix} v \in V \\ ∥ v ∥ = 1 \end{matrix}} ∥ A v ∥ .

Since

A = \hat{A} + A^{'}

, we have

σ_{i} - {\hat{σ}}_{i} \leq σ_{1}^{'}

as follows:

\begin{matrix} σ_{i} & = min_{\begin{matrix} V \subseteq C^{n} \\ dim V = n - i + 1 \end{matrix}} max_{\begin{matrix} v \in V \\ ∥ v ∥ = 1 \end{matrix}} ∥ A v ∥ \\ & \leq min_{\begin{matrix} V \subseteq C^{n} \\ dim V = n - i + 1 \end{matrix}} max_{\begin{matrix} v \in V \\ ∥ v ∥ = 1 \end{matrix}} (∥ \hat{A} v ∥ + ∥ A^{'} v ∥) \\ & \leq min_{\begin{matrix} V \subseteq C^{n} \\ dim V = n - i + 1 \end{matrix}} max_{\begin{matrix} v \in V \\ ∥ v ∥ = 1 \end{matrix}} ∥ \hat{A} v ∥ + σ_{1}^{'} = {\hat{σ}}_{i} + σ_{1}^{'} . \end{matrix}

By the same reasoning, we also obtain

{\hat{σ}}_{i} - σ_{i} \leq σ_{1}^{'}

.

Furthermore, if

rank \hat{A} = r < m

, that is,

{\hat{σ}}_{1} \geq \dots \geq {\hat{σ}}_{r} > 0 = {\hat{σ}}_{r + 1} = \dots = {\hat{σ}}_{m},

then we have

σ_{i} \{\begin{matrix} \geq {\hat{σ}}_{i} - σ_{1}^{'} \geq {\hat{σ}}_{r} - σ_{1}^{'}, & if i \leq r, \\ \leq σ_{1}^{'}, & if i > r . \end{matrix}

Therefore, if

{\hat{σ}}_{r}

is sufficiently larger than

σ_{1}^{'}

, there exists a large gap between

σ_{i}

for

i \leq r

and

σ_{i}

for

i > r

, and hence, we can estimate r from

σ_{1}, \dots, σ_{m}

. Thus, if we can assume that the smallest positive singular value of

H_{ℓ ∣ \hat{X}}

is sufficiently larger than the largest singular value of

H_{ℓ ∣ X^{'}}

, we can apply this method to estimate

\dim_{H} (ℓ ∣ \hat{X})

.

A critical practical challenge in applying SVD-based estimation of Hankel dimensions is the selection of an appropriate threshold to distinguish major components from minor components and noise. While our simulations demonstrate clear gaps in the singular value spectra, establishing a systematic method for threshold selection is essential for robust implementation in diverse applications.

Several heuristic approaches have been proposed in the literature for threshold selection in similar contexts. The elbow method identifies the point where the rate of decrease in singular values changes most dramatically, corresponding to the transition from signal to noise. The relative gap criterion selects the threshold where the ratio

σ_{i} / σ_{i + 1}

is maximized, indicating the largest multiplicative jump in the spectrum. Alternatively, noise floor estimation approaches attempt to characterize the noise level from the smallest singular values and set the threshold based on statistical properties of this noise component.
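The relative gap criterion described above can be sketched in a few lines. Note that this is one of the heuristics from the literature, not the visual inspection used in this work; the function name, the `floor` guard against division by zero, and the example spectrum are all hypothetical.

```python
def rank_by_relative_gap(singular_values, floor=1e-300):
    """Number of singular values above the largest multiplicative gap."""
    s = sorted(singular_values, reverse=True)
    # ratios[i] compares the (i+1)-th and (i+2)-th largest singular values.
    ratios = [s[i] / max(s[i + 1], floor) for i in range(len(s) - 1)]
    best = max(range(len(ratios)), key=ratios.__getitem__)
    return best + 1


# Made-up spectrum: four large values followed by a small tail.
spectrum = [35.0, 30.0, 12.0, 9.0, 0.11, 0.09, 0.05]
estimated_rank = rank_by_relative_gap(spectrum)
```

The input must contain at least two singular values. A spectrum with an exactly zero tail produces an arbitrarily large ratio at the first zero, which is the desired behavior in the noise-free case but can mislead when the tail is merely rounding error.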

In the present work, we employ visual inspection of the singular value spectra to identify gaps. Specifically, we plot the singular values and visually identify the index k where a clear separation between large and small singular values occurs. For our synthetic data with controlled noise levels and well-separated major and minor components, this approach consistently identifies the correct number of major components. However, we acknowledge that visual inspection is subjective and may not be suitable for automated applications or datasets with ambiguous gap structures.

Developing a fully automatic and statistically rigorous threshold selection method remains an important direction for future research. Promising approaches include (1) the relative gap criterion based on maximizing

σ_{i} / σ_{i + 1}

, which provides an objective measure of the most significant drop in the spectrum; (2) statistical hypothesis testing frameworks, such as permutation tests or methods based on random matrix theory, which can assess whether observed gaps exceed what would be expected from noise alone; and (3) information-theoretic criteria such as the Akaike Information Criterion (AIC) or Bayesian Information Criterion (BIC), which balance model complexity against goodness of fit. Such principled methods would significantly enhance the reliability and applicability of our methodology to real-world datasets with complex and unknown noise characteristics.

4.5.3. Time Complexities

The computationally intensive operations in these algorithms are QR decomposition (QRD), singular value decomposition (SVD), and solving equations (EqS). Table 2 shows that the algorithms execute each of these operations only a small number of times, which keeps them efficient.

5. Simulations

The purpose of conducting simulations in this section is to demonstrate how Algorithms 2–4 achieve the objectives of estimating Koopman eigenvalues and making predictions. The simulations serve as empirical validation of the correctness of our theory: our algorithms provide accurate estimation of Koopman eigenvalues and accurate predictions, at least in controlled synthetic environments. This in no way diminishes the necessity and importance of verifying them with real-world data, which we intend to pursue as future work.

5.1. Synthetic Datasets Used in the Simulations

The synthetic datasets of observables are constructed with

m = 2

and

T = 50

, which makes

X

a

2 \times 50

matrix, and are randomly generated by performing the following steps:

Sample as many distinct Koopman eigenvalues, each classified as either major or minor, as specified in Table 3. Let $λ_{1}, \dots, λ_{N}$ denote these eigenvalues. For each $λ_{i}$ , its complex conjugate ${\bar{λ}}_{i}$ must also be in the set. Furthermore, every conjugate pair $(λ, \bar{λ})$ is sampled independently as follows:
- $| λ | = | \bar{λ} |$ is sampled according to a log-normal distribution with parameters $μ = 0$ and $σ = 0.01$ , whose probability density function is $\frac{1}{x σ \sqrt{2 π}} exp (- \frac{{(ln x)}^{2}}{2 σ^{2}})$ . The median, mean, and variance of this distribution are $e^{0} = 1$ , $e^{\frac{σ^{2}}{2}}$ , and $e^{2 σ^{2}} - e^{σ^{2}}$ , respectively.
- $arg λ = - arg \bar{λ}$ is sampled uniformly from the interval $(0, π)$ .
The distribution of $| λ |$ is designed to restrict the occurrence of samples too far from 1, because $| λ |$ much larger than 1 causes the observables to diverge, while a component with $| λ |$ much smaller than 1 decays rapidly.
Determine the Koopman mode $m_{i} = {(m_{1 i}, m_{2 i})}^{⊤}$ corresponding to $λ_{i}$ with the following constraints:
- $m_{1 j} = {\bar{m}}_{1 i}$ and $m_{2 j} = {\bar{m}}_{2 i}$ hold whenever $λ_{j} = {\bar{λ}}_{i}$ ;
- The modes associated with the major eigenvalues must have significant magnitudes, while those associated with the minor eigenvalues must have smaller magnitudes.
To satisfy the second requirement, we use a function $ς : C \to [0, 1]$ defined below, which has sharp peaks only at the sampled major eigenvalues $λ_{i_{1}}, \dots, λ_{i_{k}}$ :

$ς (λ ∣ λ_{i_{1}}, \dots, λ_{i_{k}}) = max_{j = 1, \dots, k} \frac{2}{1 + e^{100 [{(| λ | - | λ_{i_{j}} |)}^{2} + {(arg λ - arg λ_{i_{j}})}^{2}]}} .$

For each Koopman eigenvalue $λ_{i}$ , the mode $m_{i}$ is determined by sampling the argument of each component uniformly at random, while setting the magnitude to $ς (λ_{i} ∣ λ_{i_{1}}, \dots, λ_{i_{k}})$ , i.e., $| m_{1 i} | = | m_{2 i} | = ς (λ_{i} ∣ λ_{i_{1}}, \dots, λ_{i_{k}})$ .
Construct $X$ as $X = [m_{1}, \dots, m_{N}] V_{T ∣ λ_{1}, \dots, λ_{N}}$ . If the inclusion of noise is required, add to $X$ a noise matrix $[ε_{i t}]$ with $i \in {1, 2}$ and $t \in {0, 1, \dots, T - 1}$ , where each $ε_{i t}$ is independently sampled from a normal distribution $N (0, 0.01^{2})$ .
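The generation steps above can be sketched with the Python standard library as follows. This is a hedged illustration rather than the code used for the reported simulations: the function names, the seed handling, and the clipping of the exponent to avoid overflow are assumptions. Only the eigenvalue of each conjugate pair with positive argument is stored, and the contribution of a pair is accumulated as twice the real part, which is what conjugate eigenvalues with conjugate modes produce.

```python
import cmath
import math
import random


def sample_upper_eigenvalues(n_pairs, rng, sigma=0.01):
    """One eigenvalue per conjugate pair: |lam| ~ LogNormal(0, sigma), arg ~ U(0, pi)."""
    return [cmath.rect(rng.lognormvariate(0.0, sigma), rng.uniform(0.0, math.pi))
            for _ in range(n_pairs)]


def peak(lam, majors):
    """The weight function: 1 at a major eigenvalue, decaying quickly away from it."""
    def bump(mu):
        z = 100.0 * ((abs(lam) - abs(mu)) ** 2
                     + (cmath.phase(lam) - cmath.phase(mu)) ** 2)
        return 2.0 / (1.0 + math.exp(min(z, 700.0)))   # clip to avoid overflow
    return max(bump(mu) for mu in majors)


def generate(n_major_pairs, n_minor_pairs, T=50, noise_sd=0.0, seed=0):
    """Return (X, uppers, majors) with X a real 2 x T observable matrix."""
    rng = random.Random(seed)
    uppers = sample_upper_eigenvalues(n_major_pairs + n_minor_pairs, rng)
    majors = uppers[:n_major_pairs]
    X = [[0.0] * T for _ in range(2)]
    for lam in uppers:
        # For arguments in (0, pi), the nearest major eigenvalue is never a
        # conjugate one, so restricting the maximum to `majors` is enough.
        size = peak(lam, majors)
        for row in X:
            mode = cmath.rect(size, rng.uniform(-math.pi, math.pi))
            for t in range(T):
                row[t] += 2.0 * (mode * lam ** t).real
    if noise_sd > 0.0:
        for row in X:
            for t in range(T):
                row[t] += rng.gauss(0.0, noise_sd)
    return X, uppers, majors


# Five major and 45 minor conjugate pairs with additive noise of scale 0.01.
X, uppers, majors = generate(5, 45, noise_sd=0.01, seed=1)
```

Here five major pairs correspond to ten major eigenvalues, and 45 minor pairs to 90 minor eigenvalues, which matches the component counts of the noisy scenarios with L = 10.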

In addition, to evaluate the predictive accuracy of our algorithms, we compute observable values for

t \in {50, 51, \dots, 79}

using the Koopman eigenvalues and Koopman modes determined above.

5.2. Simulation Scenarios

We conduct simulations under the following four distinct scenarios:

Scenarios 1 and 2 investigate the case where an exact DKMD is obtained via QR decomposition (Section 4.5.1).
Scenarios 3 and 4 investigate the case where an approximated DKMD is obtained via singular value decomposition (Section 4.5.2).
Scenarios 1 and 3 are used to investigate Algorithm 2.
Scenarios 2 and 4 are used to investigate Algorithm 4.

For each scenario, we ran simulations several dozen times with different randomly generated observable datasets and obtained consistent results in terms of estimation and prediction accuracy. In the following subsections, we report the results of the simulations with one representative dataset.

5.3. Results of the Simulations

In all four scenarios, the simulations recover the ground truth. In the case of estimating an exact DKMD, the estimated Koopman eigenvalues and the predictions for

t \in {50, 51, \dots, 79}

are identical to the ground truth within numerical precision. In the case of estimating an approximated DKMD, the estimated Koopman eigenvalues and the predictions for

t \in {50, 51, \dots, 79}

show close agreement with the ground truth.

5.3.1. Scenario 1

The observable data in Scenario 1 consist solely of 10 major components and include no minor components or noise. The uniquely feasible degree L is set to 10, which is smaller than half of T, and thus Hankel dimensions and codimensions can be computed by Theorem 6. Figure 3 shows the Hankel dimensions and codimensions obtained by directly computing the ranks of Hankel matrices. We observe that Theorem 6 is confirmed:

\dim_{H} (ℓ ∣ X) = L

and

{codim}_{H} (ℓ ∣ X) = ℓ + 1 - L

hold for

ℓ \in {10, 11, \dots, 40}

, where

10 = L

and

40 = T - L

, and in particular,

\dim_{H} (⌊ \frac{T}{2} ⌋ ∣ X) = L

holds, which enables Algorithm 2 to estimate the value of L for this

X

.

The ground truth 10 Koopman eigenvalues are plotted in Figure 4 as green circle markers. Algorithm 3 determines the corresponding characteristic polynomial, and the Koopman eigenvalues are estimated as the roots of this polynomial. The estimated Koopman eigenvalues are plotted in Figure 4 as red cross markers. The ground truth and the estimation coincide within numerical precision.

In Figure 5, we can observe that the values predicted using the estimated Koopman eigenvalues coincide with the ground truth observables for

t \in {50, 51, \dots, 79}

. Note that these observables were not used when estimating the Koopman eigenvalues. Since the predictions and the observables are real-valued, we can uniquely draw curves that interpolate the predicted values and the ground truth observables, respectively. (In contrast, if observables were complex-valued,

e^{α t}

would not necessarily coincide with

e^{(α + 2 π i) t}

for a real number t, even though

e^{α} = e^{α + 2 π i}

, making the interpolation non-unique.) The corresponding two curves are also displayed in Figure 5, and we can observe that they coincide with each other.

5.3.2. Scenario 2

Scenario 2 differs from Scenario 1 in the data generation: the uniquely feasible degree L is 30, which equals the number of major Koopman eigenvalues. Since L is greater than half of T, the hypotheses of Theorem 6 do not hold, and as a result, we cannot leverage the value of

\dim_{H} (⌊ \frac{T}{2} ⌋ ∣ X)

to estimate L. In fact, as Figure 6 shows,

\dim_{H} (⌊ \frac{T}{2} ⌋ ∣ X)

is 26, which differs from the ground truth

L = 30

. Therefore, Algorithm 4 searches for the degree ℓ such that

{codim}_{H} (ℓ ∣ X) = 1

holds. For this search, because of the strictly monotonically increasing property of the Hankel codimension (Theorem 5), as can be observed in Figure 6, Algorithm 4 can employ a binary search, which significantly improves the time efficiency of the search.

Figure 7 shows that the ground truth 30 Koopman eigenvalues and the estimated ones coincide within numerical precision. When minor components and noise are absent, the estimation of Koopman eigenvalues is highly accurate.

As Figure 8 shows, the predictions based on the estimated Koopman eigenvalues also coincide with the ground truth observables within numerical precision.

5.3.3. Scenario 3

The two preceding scenarios demonstrate that our QR decomposition-based algorithms accurately estimate Koopman eigenvalues and yield accurate predictions when observables contain no minor components or noise. In such cases, the Hankel dimension equals the rank of the corresponding Hankel matrix.

However, when observables include minor components and noise, the rank of a Hankel matrix typically reaches its maximum possible value, exceeding the Hankel dimension associated with major components alone. Therefore, it is impossible to rely on QR decomposition to compute the Hankel dimension, and instead we employ SVD. Specifically, we select singular values of

H_{⌊ \frac{T}{2} ⌋ ∣ X}

that are significantly larger than the remainder and consider the principal subspace spanned by the left singular vectors corresponding to the selected singular values. We decompose each column vector of

H_{⌊ \frac{T}{2} ⌋ ∣ X}

into the sum of its projection onto the principal subspace and its component orthogonal to this subspace. We view the projection as arising from the major components and the orthogonal component as arising from the minor components and noise. Hence, the Hankel dimension is estimated as the dimension of this principal subspace.

The observable matrix

X

used in this scenario includes 90 minor components and noise. Figure 9 displays the singular values of

H_{⌊ \frac{T}{2} ⌋ ∣ X}

on a logarithmic scale. Visual inspection reveals a clear gap: the first 10 singular values substantially exceed the remaining 16. Applying a threshold of

10^{2}

yields the same conclusion. Thus, we estimate

L = 10

, matching the ground truth.

Figure 10 displays both the ground truth 10 Koopman eigenvalues and those estimated through Algorithm 3. From the SVD analysis, we estimated

L = 10

, which implies that for the observable matrix

\hat{X}

arising from the major components alone,

{codim}_{H} (10 ∣ \hat{X}) = 1

holds. Therefore, Algorithm 3 can be applied to determine the characteristic polynomial, whose roots are the estimated Koopman eigenvalues. Despite the presence of 90 minor components and noise, the estimated eigenvalues show close agreement with the ground truth.

Figure 11 depicts the predictions for

50 \leq t < 80

. Despite the presence of minor components and noise, the predictions closely follow the ground truth, accurately capturing both the overall trends and the local peaks.

5.3.4. Scenario 4

This scenario differs from Scenario 3 only in that

L = 30

, which exceeds half of T. Consequently, estimating

\dim_{H} (⌊ \frac{T}{2} ⌋ ∣ \hat{X})

does not directly yield L. Therefore, as in Scenario 2, we employ Algorithm 4, which uses binary search to identify the degree ℓ satisfying

{codim}_{H} (ℓ ∣ \hat{X}) = 1

. At each iteration of the binary search, we estimate

{codim}_{H} (ℓ ∣ \hat{X})

through SVD analysis of

H_{ℓ ∣ X}

.

Figure 12 displays a total of 100 eigenvalues, among which 30 are major (red) and the remaining 70 are minor (gray).

The binary search is conducted in the range $ℓ \in \{25, 26, \dots, 49\}$. Although singular values need to be evaluated for only approximately $\log_{2} 25 \approx 5$ Hankel matrices during the search, Figure 13 illustrates the singular values of all Hankel matrices $H_{ℓ ∣ X}$ across this range to provide a comprehensive view. For clarity, the singular values are displayed on a linear scale rather than a logarithmic scale, and furthermore, values exceeding 1.0 are truncated at 1.0 to emphasize the gap structure. As the figure shows, we observe that ${codim}_{H} (30 ∣ X) = 1$ holds, indicating that $ℓ = 30$ is the uniquely feasible degree L for this $X$.
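The binary search over ℓ can be sketched as follows. This is an illustration under stated assumptions rather than Algorithm 4 itself: it relies on the property that the Hankel codimension is zero below L and positive from L onward, estimates the codimension by thresholding singular values, and uses hypothetical helper names, a hypothetical Hankel row layout, and a small noise-free toy problem with $L > \frac{T}{2}$.

```python
import numpy as np

def hankel_matrix(X, k):
    """Assumed layout of H_{k|X}: (k+1) x m(T-k), row j = windows starting at time j."""
    X = np.atleast_2d(X)
    m, T = X.shape
    width = T - k
    return np.stack([X[:, j:j + width].reshape(-1) for j in range(k + 1)])

def hankel_codim(X, k, threshold):
    """Estimated codim_H(k | X) = (k + 1) - (number of singular values > threshold)."""
    s = np.linalg.svd(hankel_matrix(X, k), compute_uv=False)
    return (k + 1) - int(np.sum(s > threshold))

def search_degree(X, lo, hi, threshold):
    """Smallest ell in [lo, hi] with positive codimension, found by bisection."""
    while lo < hi:
        mid = (lo + hi) // 2
        if hankel_codim(X, mid, threshold) > 0:
            hi = mid          # threshold degree is at mid or below
        else:
            lo = mid + 1      # codimension still zero, look higher
    return lo, hankel_codim(X, lo, threshold)

# Toy data (made up): L = 7 eigenvalues, m = 3 observables, T = 12 samples.
L_true, m, T = 7, 3, 12
mu = 0.95 * np.exp(2j * np.pi * np.arange(L_true) / L_true)
rng = np.random.default_rng(0)
M = rng.standard_normal((m, L_true)) + 1j * rng.standard_normal((m, L_true))
X = M @ (mu[:, None] ** np.arange(T)[None, :])

ell, codim = search_degree(X, T // 2, T - 1, threshold=1e-8)
print("degree:", ell, "codimension:", codim)
```

The returned codimension should still be checked: a value of 1 is the condition used in the text, while a larger value would indicate that no degree in the range satisfies it.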

Once L is estimated, in the same way as Scenario 3, the Koopman eigenvalues can be estimated by solving the characteristic polynomial identified by Algorithm 3.

Figure 14 compares the ground truth Koopman eigenvalues and the estimated ones. We observe that the estimated eigenvalues and the ground truth are close.

Reflecting the fact that the estimated Koopman eigenvalues are close to the ground truth, as Figure 15 depicts, the predictions for $50 \leq t < 80$ closely follow the ground truth and accurately capture both the overall trends and the local peaks, despite the presence of minor components and noise.

6. Conclusions

This paper has addressed a fundamental problem in Discrete Koopman Mode Decomposition (DKMD): determining the appropriate degree, which corresponds to the number of Koopman eigenvalues to include in the decomposition. We have introduced the concept of uniquely feasible degrees and developed both a theoretical framework and practical algorithms for their identification.

Our main theoretical contributions are as follows. We have rigorously characterized uniquely feasible degrees through Hankel dimensions and codimensions, showing that a degree ℓ is uniquely feasible if and only if ${codim}_{H} (ℓ ∣ X) = 1$ holds and the resulting ℓ-degree characteristic polynomial has no repeated roots, where $X$ represents an observable matrix consisting of T column vectors, each of which corresponds to an m-dimensional observable vector. We have established that Hankel codimensions are strictly monotonically increasing beyond a threshold degree $L = \min \{ℓ : {codim}_{H} (ℓ ∣ X) > 0\}$, which enables efficient algorithmic identification. Furthermore, we have derived tight upper bounds on uniquely feasible degrees: when $L \leq \frac{T}{2}$, the degree L is uniquely feasible, while for $L > \frac{T}{2}$, any uniquely feasible degree satisfies $ℓ \leq \frac{r T}{r + 1}$ where $r = \mathrm{rank}\, X$. We have demonstrated through explicit construction that this bound is optimal and cannot be improved in general.
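As a worked instance of the bound, the inequality can be rearranged to show how large the rank must be for a given degree. The value $T = 50$ below is an assumption, inferred from the binary search range $\{25, \dots, 49\}$ and the prediction window starting at $t = 50$ in the simulations; it is not stated in this section.

```latex
\ell \le \frac{rT}{r+1}
\;\Longleftrightarrow\; \ell\,(r+1) \le rT
\;\Longleftrightarrow\; r \ge \frac{\ell}{T-\ell}
\qquad (0 < \ell < T).

% Assumed T = 50:
r = 1:\ \frac{rT}{r+1} = 25, \qquad
r = 2:\ \frac{rT}{r+1} = \frac{100}{3} \approx 33.3, \qquad
r = 3:\ \frac{rT}{r+1} = 37.5.

% For \ell = 30 (Scenarios 2 and 4):
r \ge \frac{30}{50-30} = 1.5 \;\Longrightarrow\; r \ge 2.
```

Under this assumption, a degree of 30 can only be uniquely feasible when the observable matrix has rank at least 2, so a single scalar observable would not suffice.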

A critical consequence of our analysis is that when a degree is not uniquely feasible, infinitely many distinct DKMDs with different Koopman eigenvalues and modes can perfectly fit the same training data. While these alternative decompositions are indistinguishable on the training interval, they yield divergent predictions for future time steps, rendering the forecasts unreliable. This finding underscores the importance of selecting uniquely feasible degrees for obtaining meaningful and reliable decompositions.
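The non-uniqueness can be reproduced on a tiny example in the spirit of Figure 2 (7 observations, $ℓ = 4$). The data values and the parameter `alpha` below are made up for illustration and are not the ones used in the paper. With $m = 1$, the Hankel matrix has 5 rows and only 3 columns, so its codimension is at least 2 and the null vector defining the characteristic polynomial is not unique.

```python
import numpy as np

x = np.array([1.0, 2.0, 0.0, -1.0, 3.0, 1.0, -2.0])       # 7 made-up observations
ell, T = 4, x.size
H = np.stack([x[j:j + T - ell] for j in range(ell + 1)])   # 5 x 3 Hankel matrix
codim = (ell + 1) - np.linalg.matrix_rank(H)               # 2 here: not uniquely feasible

def dkmd_for(alpha):
    """Pick the null vector c with c_3 = alpha and c_4 = 1, then build a 4-degree DKMD."""
    rhs = -(alpha * H[3] + H[4])
    c = np.concatenate([np.linalg.solve(H[:3].T, rhs), [alpha, 1.0]])
    mu = np.roots(c[::-1])                       # eigenvalues = roots of the polynomial
    V = mu[:, None] ** np.arange(T)[None, :]     # 4 x 7 Vandermonde matrix
    a = x @ np.linalg.pinv(V)                    # scalar "modes" (m = 1)
    return mu, a

def evaluate(mu, a, t):
    """Value of the decomposition at the time steps in t (imaginary part is rounding noise)."""
    return (a @ (mu[:, None] ** np.atleast_1d(t)[None, :])).real

mu0, a0 = dkmd_for(0.0)
mu1, a1 = dkmd_for(1.0)
fit0 = evaluate(mu0, a0, np.arange(T))
fit1 = evaluate(mu1, a1, np.arange(T))
pred0 = evaluate(mu0, a0, 7)[0]
pred1 = evaluate(mu1, a1, 7)[0]
print("max fit errors:", np.max(np.abs(fit0 - x)), np.max(np.abs(fit1 - x)))
print("predictions at t = 7:", pred0, pred1)
```

Both decompositions reproduce the seven observations, yet they use different eigenvalues and disagree at $t = 7$. Every other value of `alpha` that yields distinct roots gives a further valid decomposition, which is the infinite family the paragraph above refers to.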

Based on these theoretical results, we have developed practical algorithms for identifying uniquely feasible degrees. For noise-free data, our algorithms employ QR decomposition to compute Hankel dimensions exactly and determine uniquely feasible degrees through the condition ${codim}_{H} (ℓ ∣ X) = 1$. For the more realistic case where observables contain infinitely many minor components and measurement noise, we utilize singular value decomposition (SVD) to estimate Hankel dimensions. The key idea is to identify a gap in the singular value spectrum of the Hankel matrix $H_{ℓ ∣ X}$: when we decompose $X = \hat{X} + X'$ where $\hat{X}$ represents major components and $X'$ represents minor components plus noise, the singular values corresponding to the Hankel matrix $H_{ℓ ∣ \hat{X}}$ should be significantly larger than those corresponding to $H_{ℓ ∣ X'}$, enabling estimation of $\dim_{H} (ℓ ∣ \hat{X})$. Our simulations with synthetically generated data demonstrate the effectiveness of these algorithms across various scenarios, including cases with different numbers of major and minor components, both with and without noise.

However, several important mathematical questions remain open for future investigation. The current approach to handling infinitely many minor components and noise relies on the existence of a gap between the smallest singular value of the Hankel matrix for major components and the largest singular value of the Hankel matrix for minor components plus noise. While our simulations suggest that such gaps exist under reasonable conditions, a rigorous theoretical characterization is lacking. Specifically, it remains to determine under what conditions on the magnitudes and phases of minor eigenvalues and the noise level a detectable gap exists, and how the gap size depends on these parameters. Furthermore, the selection of an appropriate threshold to distinguish major from minor singular values currently relies on heuristics or visual inspection. Developing a mathematically principled method for threshold selection, possibly based on statistical hypothesis testing or information-theoretic criteria, would significantly enhance the reliability and applicability of the methodology.

Beyond these theoretical questions, validation of our methodology on real-world data represents an essential direction for future work. While our synthetic data experiments provide controlled environments to verify the theoretical predictions, real dynamical systems present additional challenges not fully captured by our simulations, including non-stationary dynamics, complex noise structures, and model misspecification. Systematic applications to empirical data from diverse scientific domains—such as fluid flow measurements in turbulence studies, neurophysiological recordings in neuroscience, climate observations in atmospheric sciences, and vibration signals in mechanical engineering—would provide crucial insights into the practical performance, robustness, and limitations of the proposed algorithms. Such empirical validation would also inform the development of domain-specific adaptations and reveal whether additional theoretical developments are needed to handle the complexities of real-world observations.

Other natural directions for future research include extensions to continuous-time Koopman operators and the investigation of the relationship between sampling rates and uniquely feasible degrees, which would broaden the theoretical scope of the framework. The development of adaptive algorithms that automatically determine appropriate thresholds from data properties, rather than requiring manual tuning, would improve practical usability and make the methodology more accessible to researchers across different disciplines.

In conclusion, the framework of uniquely feasible degrees provides both theoretical rigor and practical utility for model selection in Discrete Koopman Mode Decomposition. The algorithms developed enable practitioners to identify the degrees for which the decomposition is uniquely determined by the observable data, thereby ensuring reliable predictions and meaningful interpretations of the underlying dynamics. While important mathematical questions regarding gap characterization and threshold selection remain open, and empirical validation on real-world data is needed, the foundations established in this work offer a solid basis for continued development of data-driven analysis methods for dynamical systems.

Author Contributions

Conceptualization, K.S.; Methodology, K.S.; Software, S.A.; Formal analysis, K.S.; Data curation, S.A.; Writing—original draft, K.S.; Visualization, S.A.; Supervision, K.S.; Project administration, K.S.; Funding acquisition, K.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Japan Society for the Promotion of Science (JSPS) grant number 21H05052.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.


Figure 1. Comparison between FD and KMD. The target function (shown in black) is a superposition of two trigonometric functions (represented in orange and blue). The amplitudes of these trigonometric functions are (a) constant for the FD and (b) decreasing over time for KMD.

Figure 2. Illustration of multiple valid KMDs with divergent predictions. Each KMD exactly fits the 7 observed values (left), while their extrapolations for $t \geq 7$ differ substantially (right). Different colors correspond to different values of $α$, and the black dash–dotted curve indicates the KMD obtained using the vector Prony method with $ℓ = 4$.

Figure 3. Scenario 1: Hankel dimensions and codimensions.

Figure 4. Scenario 1: Estimation of eigenvalues.

Figure 5. Scenario 1: Prediction of observables.

Figure 6. Scenario 2: Hankel dimensions and codimensions.

Figure 7. Scenario 2: Estimation of eigenvalues.

Figure 8. Scenario 2: Prediction of observables.

Figure 9. Scenario 3: Singular values of $H_{⌊ \frac{T}{2} ⌋ ∣ X}$.

Figure 10. Scenario 3: Estimation of eigenvalues.

Figure 11. Scenario 3: Prediction of observables.

Figure 12. Scenario 4: All 100 Koopman eigenvalues (30 major in red, 70 minor in gray).

Figure 13. Scenario 4: Singular values of $H_{ℓ ∣ X}$ for $ℓ = 25, \dots, 49$.

Figure 14. Scenario 4: Estimation of eigenvalues.

Figure 15. Scenario 4: Prediction of observables.

Table 1. Notation summary.

Notation	Description
ℓ	Koopman degree.
$x_{0}, \dots, x_{T - 1} \in C^{m}$	Column vectors of observables.
$μ_{1}, \dots, μ_{ℓ} \in C$	Koopman eigenvalues of an ℓ-degree DKMD.
$m_{1}, \dots, m_{ℓ} \in C^{m}$	Koopman modes of an ℓ-degree DKMD.
$X \in C^{m \times T}$	Observable matrix $[x_{0} \dots x_{T - 1}]$ .
$M \in C^{m \times ℓ}$	Mode matrix $[m_{1} \dots m_{ℓ}]$ .
$X_{i}^{j} \in C^{m \times (j - i + 1)}$	Submatrix $[x_{i} \dots x_{j}]$ for $0 \leq i \leq j < T$ .
$X 〈 i 〉 \in C^{T}$	The ith row vector of $X$ .
$V_{n ∣ a_{1}, \dots, a_{k}} \in C^{k \times n}$	Vandermonde matrix (Definition 2).
$H_{k ∣ X} \in C^{(k + 1) \times m (T - k)}$	Hankel matrix (Definition 6).
$\dim_{H} (k ∣ X)$	kth Hankel dimension of $X$ , defined as $rank H_{k ∣ X}$ (Definition 10).
${codim}_{H} (k ∣ X)$	kth Hankel codimension of $X$ , defined as $k + 1 - \dim_{H} (k ∣ X)$ (Definition 10).
L	The smallest ℓ such that ${codim}_{H} (ℓ ∣ X) > 0$ .
$A_{i j}$	Entry of a matrix $A$ at row i and column j.
${∥ A ∥}_{F}$	Frobenius norm of $A$ : ${∥ A ∥}_{F} = \sqrt{\sum_{i, j} {| A_{i j} |}^{2}}$ .
$A^{+}$	Moore–Penrose pseudoinverse of $A$ ; $B = C A^{+}$ minimizes ${∥ C - B A ∥}_{F}$ .
$X^{T} \in C^{n \times m}$	Transpose of $X \in C^{m \times n}$ .
$X^{*} \in C^{n \times m}$	Conjugate transpose of $X \in C^{m \times n}$ .
$V (M)$	Subspace spanned by the column vectors of $M$ .
$[\begin{matrix} A & B \end{matrix}] \in C^{k \times (m + n)}$	Matrix obtained by appending the n columns of $B \in C^{k \times n}$ to $A \in C^{k \times m}$ .
$V^{⊥} \subset C^{n}$	Orthogonal complement of a subspace $V \subseteq C^{n}$ .
$v_{i}$	The ith component of a vector $v$ .
$0^{n} \in C^{n}$	n-dimensional zero row vector $(0 \dots 0)$ .
$v^{a, b} \in C^{n + a + b}$	Column vector defined as $v^{a, b} = {(0^{a} v^{T} 0^{b})}^{T}$ for $v \in C^{n}$ .
$diag (A_{1}, \dots, A_{k})$	Block diagonal matrix with diagonal blocks $A_{1}, \dots, A_{k}$ .

Table 2. Time complexity of the algorithms in the theoretical and practical scenarios.

Algorithm	Theoretical: QRD	Theoretical: EqS	Practical: SVD	Practical: EqS
Algorithm 1	1	0	1	0
Algorithm 2	1	0	1	0
Algorithm 3	1	1	1	1
Algorithm 4	< ${log}_{2} T$	0	< ${log}_{2} T$	0

Table 3. Features of simulation scenarios.

Scenario Number	Type	No. of Major Eigenvalues	No. of Minor Eigenvalues	Noise Inclusion
1	$L \leq \frac{T}{2}$	10	0	No
2	$L > \frac{T}{2}$	30	0	No
3	$L \leq \frac{T}{2}$	10	90	Yes
4	$L > \frac{T}{2}$	30	70	Yes

