Article

Learning Dynamics from Data by Future-Informed Regression of Evolution

by
Gyurhan Nedzhibov
Faculty of Mathematics and Informatics, Konstantin Preslavsky University of Shumen, 9700 Shumen, Bulgaria
Mathematics 2026, 14(3), 464; https://doi.org/10.3390/math14030464
Submission received: 6 January 2026 / Revised: 23 January 2026 / Accepted: 26 January 2026 / Published: 28 January 2026
(This article belongs to the Special Issue Numerical Methods in Dynamical Systems)

Abstract

The data-driven modeling of nonlinear dynamical systems using the Koopman operator has become a widely adopted framework for spectral analysis, prediction, and control. However, classical Koopman-based methods are typically restricted to observables defined on the system state at a single time instant, which limits their expressivity for systems exhibiting temporal correlations, memory effects, or multi-step interactions. In this work, we introduce a generalized linear mapping operator designed to establish the optimal linear relationship between two complex, trajectory-dependent observables defined over an extended state space that incorporates both past and future dynamics. By allowing heterogeneous input–output observable spaces, the proposed framework systematically captures temporal dependencies, coupled dynamics, and physically informed features, extending the applicability of Koopman-based data-driven modeling. Numerical experiments on benchmark systems, including the SIR epidemic model, a two-mass spring–damper system, and a forced harmonic oscillator, demonstrate improved reconstruction accuracy and spectral representation compared to standard approaches. In particular, the proposed method achieves relative reconstruction errors as low as $5.89 \times 10^{-5}$ for the SIR model and $8.96 \times 10^{-4}$ for the forced harmonic oscillator, representing improvements of several orders of magnitude over classical DMD and EDMD variants. These results confirm the robustness of the new framework in capturing complex nonlinear and transient dynamics.
MSC:
65P99; 37M02; 37L65

1. Introduction

The analysis and prediction of complex nonlinear dynamical systems remains a central challenge across numerous scientific and engineering disciplines. Traditional approaches often rely on linearizations around equilibrium points or computationally intensive nonlinear simulation techniques. The Koopman operator framework [1,2] provides a powerful alternative, lifting nonlinear dynamics into an infinite-dimensional linear function space of observables. A renewed interest in Koopman analysis has emerged since the pioneering work of Mezić and collaborators [3,4,5], enabling spectral analysis and the prediction of nonlinear systems using data-driven approximations.
Dynamic Mode Decomposition (DMD) has become a prominent method for data-driven analysis, offering a linear approximation of the underlying dynamics in terms of spatial modes and associated eigenvalues [6,7,8]. DMD and its extensions have found applications in neuroscience [9], epidemiology [10], video processing [11], cavity flows [6], financial trading [12,13], and robotics [14]. Comprehensive reviews of the DMD literature are provided in [8,15,16], and recent modifications are discussed in [17,18,19,20,21].
Despite their successes, standard DMD methods are limited to linear observables directly associated with the state variables. This can lead to insufficient performance in strongly nonlinear systems. To address this, several extensions have been developed, including Extended Dynamic Mode Decomposition (EDMD) [22], Hankel (delay-embedding) DMD [23,24,25], DMD with Control (DMDc) [26,27,28], Optimized or Robust DMD [29], and Forward–Backward DMD (fbDMD) [30]. EDMD, in particular, generalizes DMD by projecting the dynamics onto a user-defined dictionary of nonlinear observables, enabling approximation of the Koopman operator for a wide range of nonlinear systems, including fluid flows, mechanical oscillators, and reaction–diffusion models [31]. Extensions aimed at enhancing expressivity and numerical stability include multi-resolution DMD [8], sparsity-promoting DMD [19], and kernel-based approaches [32,33,34,35,36,37].
The shift toward continuous-time data-driven modeling has gained significant momentum in recent years, reflecting the need for frameworks that align naturally with physical laws. Recent developments include data-driven continuous-time Hammerstein modeling addressing missing data [38], iterative learning control (ILC) for continuous-time systems [39], and continuous-time surrogate models for dynamic optimization [40]. These advances highlight the growing relevance of continuous-time operator representations and motivate the introduction of the proposed framework, which further enhances numerical stability, reconstruction accuracy, and spectral fidelity.
A common limitation of standard DMD and EDMD formulations is the restriction of observables to functions of the system state at a single time instant. This constraint can reduce the expressive power of the approximation, particularly in systems exhibiting temporal correlations, memory effects, or multi-step interactions. Time-delay embedding techniques, based on the Takens embedding theorem [41] and integrated with DMD or EDMD [24,42], allow observables to depend on sequences of past states, improving spectral approximations in quasi-periodic or chaotic systems. For a thorough review, see [42,43,44].
In this work, we introduce the Future-Informed Regression Evolution (FIRE) framework, a generalized operator-theoretic approach that extends the Koopman operator to map between heterogeneous spaces of multi-state observables. Unlike traditional EDMD or Hankel-DMD approaches [22,23,24], FIRE allows each observable to depend on a sequence of past, current, and future states, forming an extended information state. This enables the operator to capture temporal correlations, nonlinear cross-terms, derivatives, or other physically informed features that may differ between input and target spaces. By formalizing the input–output relationship via a general mapping $\pi : \mathcal{X} \to \mathcal{Y}$, FIRE generalizes the classical Koopman framework and systematically handles partial or scalar measurements while preserving spectral properties.
We demonstrate the effectiveness of FIRE on benchmark systems, including the SIR epidemic model, a two-mass spring–damper system, and the forced harmonic oscillator. The numerical results indicate improved reconstruction accuracy and spectral representation compared to standard DMD and EDMD variants.
The main contributions of this study are summarized as follows:
1. We introduce a rigorous operator-theoretic formulation of the generalized transfer operator $\mathcal{G}$, capturing the evolution of extended observables across heterogeneous spaces while preserving spectral properties under mild assumptions.
2. We provide both continuous-time and discrete-time numerical realizations via finite-dimensional Galerkin approximations, connecting FIRE to EDMD and Hankel-DMD while highlighting its distinct advantages.
3. Through numerical experiments on benchmark systems, we demonstrate that FIRE achieves superior reconstruction accuracy and robust spectral representation, even from partial or scalar observations.
The remainder of this paper is organized as follows. Section 2 reviews Koopman operator preliminaries. Section 3 presents the mathematical formulation of the FIRE framework, followed by the data-driven approximation of the generalized operator in Section 4. Section 5 presents the numerical experiments, and Section 6 concludes the paper.

2. Preliminary Theory

This section reviews the classical Koopman operator framework, discusses its finite-dimensional approximations via DMD and EDMD, and motivates the extension of the observable space to accommodate trajectory-dependent representations, which forms the basis for the generalized operator introduced in the subsequent section.

2.1. The Classical Koopman Framework

The Koopman operator provides a linear, operator-theoretic framework for the analysis of nonlinear dynamical systems by lifting the state evolution into a space of observables. This perspective enables the study of nonlinear dynamics using spectral methods traditionally associated with linear systems.
Let $(\mathcal{M}, \mu)$ denote the phase space equipped with an invariant measure $\mu$. Consider a discrete-time nonlinear dynamical system governed by the following map:
\[ x_{k+1} = F(x_k), \qquad x_k \in \mathcal{M} \subseteq \mathbb{R}^n. \tag{1} \]
Let $\mathcal{F}$ be a linear space of complex-valued observables $g : \mathcal{M} \to \mathbb{C}$ that is closed under composition with $F$, i.e.,
\[ g \circ F \in \mathcal{F} \quad \text{for all } g \in \mathcal{F}, \]
where $\circ$ denotes functional composition. A common choice is $\mathcal{F} = L^2(\mathcal{M}, \mu)$.
The Koopman operator $\mathcal{K} : \mathcal{F} \to \mathcal{F}$ associated with the dynamics (1) is defined by
\[ (\mathcal{K} g)(x) = g(F(x)). \tag{2} \]
Unlike the nonlinear map $F$, which acts on the state space $\mathcal{M}$, the Koopman operator acts on observables. Although $\mathcal{K}$ is generally infinite-dimensional, even when $F$ acts on a finite-dimensional state space, it is linear even for strongly nonlinear systems. Specifically, for any $g_1, g_2 \in \mathcal{F}$ and scalars $c_1, c_2 \in \mathbb{C}$,
\[ \mathcal{K}(c_1 g_1 + c_2 g_2) = c_1 \mathcal{K} g_1 + c_2 \mathcal{K} g_2. \tag{3} \]
A function $\phi \in \mathcal{F}$ is called a Koopman eigenfunction associated with the eigenvalue $\lambda \in \mathbb{C}$ if
\[ \mathcal{K} \phi = \lambda \phi. \tag{4} \]
Combining (2) and (4), the eigenfunctions satisfy
\[ \phi(F(x)) = \lambda \phi(x), \]
which implies that Koopman eigenfunctions evolve linearly along trajectories of the nonlinear system. This property forms the foundation for Koopman spectral analysis.
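The eigenfunction relation $\phi(F(x)) = \lambda \phi(x)$ can be checked numerically on a small example. The sketch below uses an illustrative polynomial map with assumed parameters `a` and `b` (a textbook-style toy system, not one of the article's benchmarks), for which three eigenfunctions are known in closed form.

```python
import numpy as np

a, b = 0.9, 0.5  # illustrative parameters (assumed, not taken from the article)

def F(x):
    """Nonlinear map: x1 -> a*x1, x2 -> b*x2 + (a**2 - b)*x1**2."""
    x1, x2 = x
    return np.array([a * x1, b * x2 + (a**2 - b) * x1**2])

# Candidate Koopman eigenfunctions paired with their eigenvalues.
eigenpairs = [
    (lambda x: x[0], a),
    (lambda x: x[1] - x[0]**2, b),
    (lambda x: x[0]**2, a**2),
]

rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(100, 2))

# Largest violation of phi(F(x)) = lambda * phi(x) over the sample points.
max_residual = max(
    abs(phi(F(x)) - lam * phi(x)) for phi, lam in eigenpairs for x in points
)
print(f"max eigenfunction residual: {max_residual:.2e}")
```

The residual is at the level of floating-point round-off, which reflects that each candidate function evolves linearly along trajectories even though `F` itself is nonlinear.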
Suppose that the observable $g \in \mathcal{F}$ lies in the span of Koopman eigenfunctions $\{\phi_j\}_{j=1}^{\infty}$, allowing the expansion
\[ g(x) = \sum_{j=1}^{\infty} v_j \phi_j(x), \]
where the coefficients $v_j \in \mathbb{C}$ are referred to as Koopman modes. For a vector-valued observable $\mathbf{g} : \mathcal{M} \to \mathbb{C}^n$,
\[ \mathbf{g}(x) = [g_1(x), \ldots, g_n(x)]^T, \]
a similar expansion holds, as follows:
\[ \mathbf{g}(x) = \sum_{j=1}^{\infty} \mathbf{v}_j \phi_j(x), \]
where $\mathbf{v}_j \in \mathbb{C}^n$ denote the Koopman modes associated with $\phi_j$.
Applying the Koopman operator yields
\[ \mathbf{g}(F(x)) = \sum_{j=1}^{\infty} \lambda_j \mathbf{v}_j \phi_j(x), \]
and, by repeated application,
\[ \mathbf{g}(F^k(x)) = \sum_{j=1}^{\infty} \lambda_j^k \mathbf{v}_j \phi_j(x), \]
where $F^k$ denotes the $k$-fold composition of $F$. Each Koopman mode evolves independently with growth rate and oscillatory behavior determined by the corresponding eigenvalue $\lambda_j$.
The collection of triples $\{(\lambda_j, \phi_j, \mathbf{v}_j)\}_{j=1}^{\infty}$ constitutes the Koopman mode decomposition, originally introduced by Mezić [3]. When the full state is observable (i.e., $g_i(x) = x_i$), the Koopman operator provides a complete linear representation of the nonlinear dynamics.
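For a linear map $F(x) = Ax$ with diagonalizable $A$ and the full-state observable $\mathbf{g}(x) = x$, the Koopman mode decomposition is finite and can be written down explicitly: the modes are right eigenvectors of $A$ and the eigenfunctions are linear functionals built from the left eigenvectors. The following minimal sketch, with an illustrative matrix `A` chosen here as an assumption, compares the modal formula against direct iteration.

```python
import numpy as np

A = np.array([[0.9, 0.2],
              [0.0, 0.5]])  # illustrative linear map F(x) = A x (assumed)

lam, V = np.linalg.eig(A)   # columns of V play the role of Koopman modes v_j
W = np.linalg.inv(V)        # rows w_j of W satisfy w_j A = lam_j w_j

def phi(x):
    """Eigenfunction values phi_j(x) = w_j x, stacked into one vector."""
    return W @ x

x0 = np.array([1.0, -2.0])
k = 7

direct = np.linalg.matrix_power(A, k) @ x0  # g(F^k(x0)) with g(x) = x
modal = V @ (lam**k * phi(x0))              # sum_j lam_j^k v_j phi_j(x0)
mode_error = np.linalg.norm(direct - modal)
print(f"difference between direct and modal evolution: {mode_error:.2e}")
```

For genuinely nonlinear systems the expansion is generally infinite, so this finite check only illustrates the structure of the formula.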
Despite its conceptual elegance, the classical Koopman framework restricts observables to functions of the instantaneous system state. This limitation motivates the development of generalized operator constructions capable of incorporating trajectory-dependent and multi-state information, which form the basis of the FIRE framework introduced in the following sections.

2.2. Dynamic Mode Decomposition and Extended Dynamic Mode Decomposition

The Koopman operator is, in general, infinite-dimensional, which renders its direct numerical computation infeasible. Practical implementations therefore rely on finite-dimensional approximations obtained by restricting the operator to suitably chosen subspaces of observables. A variety of data-driven methods have been proposed to construct such approximations, among which Dynamic Mode Decomposition (DMD) and its generalization, Extended Dynamic Mode Decomposition (EDMD), are the most widely used.
Dynamic Mode Decomposition provides a numerical approximation of the Koopman operator based on linear observables of the system state. While DMD has proven effective in many applications, its reliance on linear measurements limits its ability to capture strongly nonlinear dynamics. Extended Dynamic Mode Decomposition was introduced to address this limitation by approximating the Koopman operator in a richer, user-defined space of nonlinear observables.
In EDMD, the infinite-dimensional Koopman operator $\mathcal{K}$ is approximated by its projection onto a finite-dimensional subspace spanned by a prescribed dictionary of observables. Let
\[ \mathcal{D} = \{\psi_1, \ldots, \psi_N\}, \]
where each $\psi_i : \mathcal{M} \to \mathbb{C}$ belongs to the observable space $\mathcal{F}$. We denote by
\[ \mathcal{F}_{\mathcal{D}} = \operatorname{span}\{\psi_1, \ldots, \psi_N\} \subset \mathcal{F} \]
the finite-dimensional subspace generated by the dictionary. The objective of EDMD is to construct a matrix $K \in \mathbb{C}^{N \times N}$ that approximates the action of the Koopman operator restricted to $\mathcal{F}_{\mathcal{D}}$.
Assume that snapshot pairs $\{(x_k, y_k)\}_{k=1}^{M}$ are collected from the dynamical system, where
\[ y_k = F(x_k), \qquad x_k, y_k \in \mathcal{M}. \]
Define the feature map associated with the dictionary as
\[ \boldsymbol{\psi}(x) = \begin{bmatrix} \psi_1(x) \\ \vdots \\ \psi_N(x) \end{bmatrix}. \]
Using this feature map, EDMD constructs the data matrices
\[ X = [\boldsymbol{\psi}(x_1) \; \cdots \; \boldsymbol{\psi}(x_M)], \qquad Y = [\boldsymbol{\psi}(y_1) \; \cdots \; \boldsymbol{\psi}(y_M)], \]
where $X, Y \in \mathbb{C}^{N \times M}$.
The finite-dimensional approximation of the Koopman operator is obtained by solving the least squares problem
\[ \min_{K \in \mathbb{C}^{N \times N}} \| Y - K X \|_F^2, \tag{12} \]
which admits the solution
\[ K = Y X^{\dagger}, \tag{13} \]
where $X^{\dagger}$ denotes the Moore–Penrose pseudoinverse of $X$. The resulting matrix $K$ represents the projection of the Koopman operator onto the subspace $\mathcal{F}_{\mathcal{D}}$ spanned by the chosen dictionary. Its spectral decomposition provides finite-dimensional approximations of Koopman eigenvalues, eigenfunctions, and modes, enabling the reconstruction and prediction of system dynamics.
The quality of the EDMD approximation critically depends on the choice of dictionary. If $\mathcal{F}_{\mathcal{D}}$ is invariant under the action of the Koopman operator, the approximation is exact on this subspace and the residual in (12) vanishes. In practice, however, Koopman-invariant subspaces are rarely known a priori. When the dictionary does not span such a subspace, the projected operator may introduce spurious eigenvalues and eigenfunctions, and important dynamical features may be lost. In the limit of infinite data, the EDMD matrix converges to the Koopman operator projected onto $\mathcal{F}_{\mathcal{D}}$, but this projection error persists unless the dictionary is sufficiently expressive.
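The EDMD steps above (dictionary, data matrices, pseudoinverse solution) can be sketched in a few lines. The example below is a hedged illustration rather than the article's implementation: the polynomial map, its parameters `a` and `b`, and the three-element dictionary are assumptions chosen so that the dictionary spans a Koopman-invariant subspace, which is the case in which the residual of the least squares problem vanishes.

```python
import numpy as np

a, b = 0.9, 0.5  # illustrative parameters (assumed)

def F(x):
    """Toy nonlinear map with a known finite-dimensional invariant subspace."""
    x1, x2 = x
    return np.array([a * x1, b * x2 + (a**2 - b) * x1**2])

def psi(x):
    """Dictionary {x1, x2, x1^2}; its span is invariant under composition with F."""
    return np.array([x[0], x[1], x[0]**2])

rng = np.random.default_rng(1)
xs = rng.uniform(-1.0, 1.0, size=(50, 2))   # sample states x_k
ys = np.array([F(x) for x in xs])           # images y_k = F(x_k)

X = np.stack([psi(x) for x in xs], axis=1)  # N x M matrix of lifted x_k
Y = np.stack([psi(y) for y in ys], axis=1)  # N x M matrix of lifted y_k

K = Y @ np.linalg.pinv(X)                   # least squares solution K = Y X^+
residual = np.linalg.norm(Y - K @ X)        # Frobenius norm of Y - K X
edmd_eigs = np.sort(np.linalg.eigvals(K).real)
print("residual:", residual)
print("EDMD eigenvalues:", edmd_eigs)
```

With this dictionary the computed eigenvalues match $b$, $a^2$, and $a$ up to round-off. Dropping `x[0]**2` from `psi` breaks invariance and leaves a nonzero residual, which illustrates the projection error discussed above.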
Dynamic Mode Decomposition arises as a special case of EDMD when the dictionary consists solely of the identity observable,
\[ \boldsymbol{\psi}(x) = x. \tag{14} \]
In this case, EDMD reduces to a linear approximation of the Koopman operator acting directly on the state space, highlighting DMD as a particular instance within the broader EDMD framework.
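As a brief illustration of this special case, the sketch below applies the same least squares formula to raw state snapshots. The system matrix `A` and the initial condition are illustrative assumptions; because the data are generated by a linear map, the identity dictionary is exact and the fitted matrix reproduces `A`.

```python
import numpy as np

A = np.array([[0.9, 0.2],
              [-0.2, 0.9]])        # illustrative linear dynamics (assumed)

# Generate one trajectory z_0, ..., z_20 and store snapshots as columns.
traj = [np.array([1.0, 0.0])]
for _ in range(20):
    traj.append(A @ traj[-1])
traj = np.array(traj).T            # shape (2, 21)

X, Y = traj[:, :-1], traj[:, 1:]   # snapshot pairs with psi(x) = x
A_dmd = Y @ np.linalg.pinv(X)      # DMD matrix via the pseudoinverse
dmd_error = np.linalg.norm(A_dmd - A)
print(f"||A_dmd - A|| = {dmd_error:.2e}")
```

Practical DMD implementations usually compute a truncated SVD of `X` instead of a full pseudoinverse; the direct formula is used here only to mirror the EDMD expression.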

3. Generalized Sequence-Based Evolution Operator

The Koopman operator framework provides a powerful tool for representing nonlinear dynamics through the linear evolution of observables. Traditional formulations, however, typically consider observables defined on a single state at a fixed time, limiting their applicability to systems with temporal correlations, memory effects, or heterogeneous representations across time. To address these limitations, we introduce a generalized sequence-based evolution operator, designed to establish a linear mapping between observables defined over sequences of system states. By constructing the operator on spatio-temporal windows of feature vectors rather than the state space itself, this approach enables the incorporation of past, present, and future information in a unified framework. The operator preserves linearity in the space of observables and induces a structured evolution that can be interpreted as Markovian in the expanded feature space, providing a principled link between high-fidelity trajectories and reduced-order or aggregated representations.

3.1. Generalized Sequence-Based Operator

We now formalize the central concept of this work: a generalized evolution operator acting on multi-state observables.
Consider a discrete-time dynamical system on a state manifold $\mathcal{M} \subseteq \mathbb{R}^n$, governed by the map
\[ z_{k+1} = F(z_k), \qquad z_k \in \mathcal{M}. \tag{15} \]
Definition 1 (Extended State). Let $\{z_k\}_{k \in \mathbb{Z}} \subset \mathbb{R}^n$ be a discrete-time process. For fixed non-negative integers $d, s \in \mathbb{N}_0$ (memory depth and anticipation depth), the extended state at time $k$ is defined as follows:
\[ \tilde{z}_k = \begin{bmatrix} z_{k-d} \\ \vdots \\ z_k \\ \vdots \\ z_{k+s} \end{bmatrix} \in \mathbb{R}^{(d+s+1)n}. \]
The extended state aggregates a finite window of past, current, and (optionally) future states into a single vector. The corresponding extended state space is defined as a Cartesian product, as follows:
\[ \mathcal{M}^q = \underbrace{\mathcal{M} \times \cdots \times \mathcal{M}}_{q \text{ copies}} \subseteq \mathbb{R}^{qn}. \]
Definition 2 (Multi-State Observable). A multi-state observable is a measurable function,
\[ g : \mathcal{M}^q \to \mathbb{C}, \]
for a fixed integer $q \geq 1$.
The space of all such observables, denoted $\mathcal{F}(\mathcal{M}^q)$, forms the feature space for extended observables. Lifting the analysis to $\mathcal{F}(\mathcal{M}^q)$ enables the incorporation of bidirectional temporal dependence and trajectory-based information.
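In practice, extended states are assembled from a stored trajectory by sliding a window of length $d + s + 1$ over the snapshots. The helper below is a minimal sketch of that construction; the function name `extended_states` and the column-wise snapshot layout are assumptions made for illustration.

```python
import numpy as np

def extended_states(Z, d, s):
    """Stack z_{k-d}, ..., z_k, ..., z_{k+s} into one column per admissible k.

    Z has shape (n, T) with snapshots as columns; the result has shape
    ((d + s + 1) * n, T - d - s).
    """
    n, T = Z.shape
    q = d + s + 1
    if T < q:
        raise ValueError("trajectory shorter than the window length d + s + 1")
    # For each k, take the q consecutive snapshots and flatten them block by block.
    cols = [Z[:, k - d:k + s + 1].T.reshape(-1) for k in range(d, T - s)]
    return np.stack(cols, axis=1)

Z = np.arange(12, dtype=float).reshape(2, 6)  # toy trajectory: n = 2, T = 6
Z_ext = extended_states(Z, d=1, s=2)
print(Z_ext.shape)
print(Z_ext[:, 0])
```

Setting `s=0` yields the purely causal delay embedding, while `s > 0` includes future snapshots and is therefore only available when the trajectory has been recorded in advance.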
To formalize input–output relationships across extended states, we introduce two distinct observation spaces:
\[ \mathcal{X} = \mathcal{M}^{d+s+1}, \qquad \mathcal{Y} = \mathcal{M}^{r+l+1}, \tag{18} \]
with associated observable spaces $\mathcal{F}(\mathcal{X})$ and $\mathcal{F}(\mathcal{Y})$.
Definition 3 (Generalized Operator $\mathcal{G}$). Let $\mathcal{X}$ and $\mathcal{Y}$ be two state spaces as in (18), and let $\pi : \mathcal{X} \to \mathcal{Y}$ be a measurable map encoding temporal alignment. The generalized operator
\[ \mathcal{G} : \mathcal{F}(\mathcal{Y}) \to \mathcal{F}(\mathcal{X}) \]
is defined by
\[ (\mathcal{G} g)(x) := h(x), \]
where $h(x) = g(\pi(x))$ with $x \in \mathcal{X}$ and $g \in \mathcal{F}(\mathcal{Y})$.
In other words, $\mathcal{G}$ maps a target observable $g$ defined on $\mathcal{Y}$ to a corresponding observable $h$ on $\mathcal{X}$ through composition with $\pi$, establishing a linear correspondence between extended-state features:
\[ (\mathcal{G} g)(x) = g(y), \quad \text{where } y = \pi(x). \]
This construction allows multi-state observables to depend on sequences of past, current, and future states while preserving linearity in the lifted feature space.
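Since $\mathcal{G}$ is defined by composition with $\pi$, it can be represented directly as a higher-order function. The sketch below is illustrative only: the window sizes, the choice of `pi` as the map that extracts the last block of the window, and the two test observables are assumptions, not constructions taken from the article.

```python
import numpy as np

n, d, s = 2, 1, 1   # X = M^(d+s+1) = M^3 and Y = M (assumed sizes)

def pi(x):
    """Hypothetical alignment map X -> Y: return the last block z_{k+s}."""
    return x[-n:]

def G(g):
    """Generalized operator: lift an observable g on Y to h = g o pi on X."""
    return lambda x: g(pi(x))

g1 = lambda y: y[0] * y[1]     # two arbitrary observables on Y
g2 = lambda y: np.sin(y[0])
alpha, beta = 2.0 - 1.0j, 0.5  # complex scalars

x = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])  # window (z_{k-1}, z_k, z_{k+1})

combo = lambda y: alpha * g1(y) + beta * g2(y)
lhs = G(combo)(x)                              # G applied to the combination
rhs = alpha * G(g1)(x) + beta * G(g2)(x)       # combination of the lifted parts
print(abs(lhs - rhs))
```

The two sides agree, which is a pointwise numerical instance of the linearity of $\mathcal{G}$ in the observable, even though `g1` and `g2` are nonlinear functions of the state.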
Unlike traditional transfer operators, which describe the temporal evolution of observables within a single state space, the operator $\mathcal{G}$ encodes the informational transfer between distinct extended-state representations of the same system. Equation (19) generalizes the classical Koopman action
\[ (\mathcal{K} g)(x_k) = g(x_{k+1}) \]
by allowing the evolved observable to depend on a local sequence of feature vectors rather than a single state. In this sense, $\mathcal{G}$ extends the classical Koopman composition rule to encompass non-Markovian and sequence-dependent transformations.
The operator $\mathcal{G}$ can be interpreted as a natural generalization of the Koopman operator: it reduces to the classical Koopman operator $\mathcal{K}$ when the map $\pi$ coincides with the graph of the system evolution $F$, the target space $\mathcal{Y}$ is isomorphic to the input space $\mathcal{X}$, and $\mathcal{X} \equiv \mathcal{Y} \equiv \mathcal{M}$. Under these conditions, $\mathcal{G}$ recovers the standard Koopman action on single-time observables.
The incorporation of future states in the extended space $\mathcal{X}$ may appear related to backward-time or forward–backward DMD (fbDMD) formulations. However, the proposed framework differs fundamentally in both purpose and interpretation. Backward-time and fbDMD methods introduce reverse-time operators or symmetric constraints to improve numerical stability of Koopman eigenvalue estimates. In contrast, the anticipation depth in the present work is introduced at the level of state space construction, defining an augmented forward-time dynamical system with induced shift dynamics.
The generalized operator $\mathcal{G}$ does not approximate backward evolution, nor does it combine forward and backward operators. Instead, it acts as an intertwining map between Koopman operators on different state spaces, enabling spectral transfer under a semiconjugacy condition. As a result, future states are treated as components of the lifted state rather than as targets of inverse-time dynamics. This distinction allows the proposed framework to remain valid even for non-invertible or dissipative systems, where backward-time Koopman operators are not well-defined.
On future-informed versus causal constructions. The proposed framework allows extended information states that include both past and future measurements relative to a reference time index. From a real-time prediction perspective, this choice introduces a non-causal element, so it is important to distinguish between two application regimes. In the first regime, offline analysis (for example, system identification, spectral decomposition, or operator inference from archived datasets), future samples are available and two-sided temporal windows are admissible. Including future measurements improves numerical conditioning and derivative estimation accuracy, particularly in continuous-time formulations, and leads to more faithful approximations of the infinitesimal generator. In the second regime, online prediction or forecasting, future measurements are not available at inference time. Once the operator and modal decomposition have been learned, however, the Koopman modes evolve autonomously according to their eigenvalues, and only the current modal coordinates are propagated forward in time.
Moreover, the framework readily simplifies to a purely causal formulation by setting the anticipation depth to zero, which reduces the method to a delay coordinate embedding based solely on past and present states. As such, the framework is applicable both in offline system identification and in online prediction scenarios where future data are unavailable.
In this work, the focus is on the offline analysis regime, where the goal is accurate operator identification and spectral characterization rather than real-time forecasting.

3.2. Theoretical Properties of the Generalized Operator $\mathcal{G}$

The operator $\mathcal{G}$ inherits linearity directly from the structure of the observable spaces. Specifically, since $\mathcal{F}(\mathcal{X})$ is a vector space with pointwise addition and scalar multiplication, the action of $\mathcal{G}$ on linear combinations of observables preserves these operations. This property is formalized in the following result.
Lemma 1 (Linearity of $\mathcal{G}$). The generalized transfer operator $\mathcal{G} : \mathcal{F}(\mathcal{Y}) \to \mathcal{F}(\mathcal{X})$ is linear.
Proof.
Let $g_1, g_2 \in \mathcal{F}(\mathcal{Y})$ be two observables, and let $\alpha, \beta \in \mathbb{C}$ be arbitrary scalars. By definition of the generalized operator $\mathcal{G}$, for all $x \in \mathcal{X}$ we obtain
\[ (\mathcal{G} g_i)(x) = g_i(\pi(x)), \qquad i = 1, 2, \]
where $\pi : \mathcal{X} \to \mathcal{Y}$ is the measurable map encoding temporal alignment.
Consider the linear combination $\alpha g_1 + \beta g_2 \in \mathcal{F}(\mathcal{Y})$. Then, for all $x \in \mathcal{X}$,
\[ (\mathcal{G}(\alpha g_1 + \beta g_2))(x) = (\alpha g_1 + \beta g_2)(\pi(x)) = \alpha g_1(\pi(x)) + \beta g_2(\pi(x)) = \alpha (\mathcal{G} g_1)(x) + \beta (\mathcal{G} g_2)(x). \]
Since this holds for arbitrary $x \in \mathcal{X}$, we conclude that
\[ \mathcal{G}(\alpha g_1 + \beta g_2) = \alpha \mathcal{G} g_1 + \beta \mathcal{G} g_2, \]
which proves that $\mathcal{G}$ is linear.    □
Theorem 1 (Dynamical Consistency via Lifting). Let $\mathcal{Y}$ be the physical state space with dynamics $F_{\mathcal{Y}} : \mathcal{Y} \to \mathcal{Y}$, and let $\mathcal{X}$ be an augmented (lifted) state space. Let $\pi : \mathcal{X} \to \mathcal{Y}$ be a measurable projection map. Assume there exists an induced dynamical map $F_{\mathcal{X}} : \mathcal{X} \to \mathcal{X}$ that is $\pi$-compatible, such that the following diagram commutes:
\[ \pi(F_{\mathcal{X}}(x)) = F_{\mathcal{Y}}(\pi(x)) \quad \text{for all } x \in \mathcal{X}. \]
Let $\mathcal{G} : \mathcal{F}(\mathcal{Y}) \to \mathcal{F}(\mathcal{X})$ be the generalized operator, according to Definition 3, as follows:
\[ (\mathcal{G} g)(x) = g(\pi(x)). \]
Then, the generalized operator preserves the temporal evolution of the observables:
\[ (\mathcal{G} g)(F_{\mathcal{X}}(x)) = (\mathcal{K}_{\mathcal{Y}} g)(\pi(x)), \tag{21} \]
where $\mathcal{K}_{\mathcal{Y}}$ is the Koopman operator on the physical space $\mathcal{Y}$.
Proof.
Let $g \in \mathcal{F}(\mathcal{Y})$ be an observable in the physical space, and let $h = \mathcal{G} g \in \mathcal{F}(\mathcal{X})$ be its lifted counterpart defined by $(\mathcal{G} g)(x) = g(\pi(x))$. We evaluate both sides of the identity (21) for an arbitrary point $x \in \mathcal{X}$.
By applying the definition of the generalized operator $\mathcal{G}$ to the point $F_{\mathcal{X}}(x)$, we obtain the following:
\[ (\mathcal{G} g)(F_{\mathcal{X}}(x)) = g(\pi(F_{\mathcal{X}}(x))). \]
Using the assumption of $\pi$-compatibility ($\pi \circ F_{\mathcal{X}} = F_{\mathcal{Y}} \circ \pi$), this expression becomes
\[ g(\pi(F_{\mathcal{X}}(x))) = g(F_{\mathcal{Y}}(\pi(x))). \]
Now, consider the Koopman operator $\mathcal{K}_{\mathcal{Y}}$ acting on the physical space $\mathcal{Y}$, defined as follows:
\[ (\mathcal{K}_{\mathcal{Y}} g)(y) = g(F_{\mathcal{Y}}(y)). \]
Setting $y = \pi(x)$, we obtain
\[ (\mathcal{K}_{\mathcal{Y}} g)(\pi(x)) = g(F_{\mathcal{Y}}(\pi(x))). \]
Combining the results from (22) and (23) yields
\[ (\mathcal{G} g)(F_{\mathcal{X}}(x)) = g(F_{\mathcal{Y}}(\pi(x))) = (\mathcal{K}_{\mathcal{Y}} g)(\pi(x)) \quad \text{for all } x \in \mathcal{X}, \]
which establishes the claimed dynamical consistency of lifted observables.    □
Remark 1. Although the original dynamics (15) evolve on the state manifold $\mathcal{M}$, the delay-embedded space $\mathcal{X}$ carries induced shift dynamics $F_{\mathcal{X}}$, used to define the associated Koopman operator $\mathcal{K}_{\mathcal{X}}$.
Remark 2. In the context of the data-driven frameworks presented in this work, the spaces $\mathcal{X}$ and $\mathcal{Y}$ are typically delay-embedded manifolds (Hankel structures) constructed from the physical trajectory in $\mathcal{M}$. In such cases, $F_{\mathcal{X}}$ and $F_{\mathcal{Y}}$ are naturally induced shift maps. Consequently, the projection map $\pi : \mathcal{X} \to \mathcal{Y}$, which acts as a truncation or window shift, automatically satisfies the semiconjugacy condition
\[ \pi \circ F_{\mathcal{X}} = F_{\mathcal{Y}} \circ \pi. \tag{24} \]
This ensures that the generalized operator $\mathcal{G}$ is dynamically consistent with the underlying Koopman operator of the system, even under partial observability. Therefore, the dynamical consistency required in Theorem 1 holds automatically for delay coordinate constructions.
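The semiconjugacy condition for a delay construction can be verified numerically. The sketch below assumes a two-block window $x = (z_k, z_{k+1})$, a shift map on windows, and truncation to the first block as $\pi$; the map `F` is an illustrative toy system, and all function names are hypothetical. The check is carried out on windows taken from true trajectories, which is the set on which delay-embedded data live.

```python
import numpy as np

def F(z):
    """Illustrative nonlinear map on M = R^2 (a hypothetical example system)."""
    return np.array([0.9 * z[0], 0.5 * z[1] + 0.31 * z[0]**2])

def F_X(x):
    """Shift map on windows x = (z_k, z_{k+1}): drop z_k, append F(z_{k+1})."""
    z_next = x[2:]
    return np.concatenate([z_next, F(z_next)])

F_Y = F  # Y = M carries the original dynamics

def pi(x):
    """Truncation X -> Y: keep the first block z_k of the window."""
    return x[:2]

rng = np.random.default_rng(2)
gaps = []
for _ in range(100):
    z = rng.uniform(-1.0, 1.0, size=2)
    x = np.concatenate([z, F(z)])  # window taken from a true trajectory
    gaps.append(np.linalg.norm(pi(F_X(x)) - F_Y(pi(x))))
semiconjugacy_gap = max(gaps)
print(f"max |pi(F_X(x)) - F_Y(pi(x))| = {semiconjugacy_gap:.2e}")
```

For a window that does not come from a trajectory (second block not equal to `F` of the first), this particular truncation no longer commutes with the shift, which is one concrete way the condition can fail on noisy data.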
Rather than asserting generic validity, the following result identifies the precise structural condition under which the Koopman spectral information can be transferred between lifted and projected representations.
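Theorem 2 (Commutation of Koopman Operators under Lifting). Let $\mathcal{X}$ and $\mathcal{Y}$ be state spaces equipped with induced discrete-time dynamical systems $F_{\mathcal{X}}$ and $F_{\mathcal{Y}}$, respectively, and let $\pi : \mathcal{X} \to \mathcal{Y}$ be a measurable map that induces a semiconjugacy between the dynamical systems $F_{\mathcal{X}}$ and $F_{\mathcal{Y}}$, i.e., satisfying (24). Let $\mathcal{K}_{\mathcal{X}}$ and $\mathcal{K}_{\mathcal{Y}}$ denote the Koopman operators associated with $F_{\mathcal{X}}$ and $F_{\mathcal{Y}}$, respectively, defined by
\[ (\mathcal{K}_{\mathcal{X}} h)(x) := h(F_{\mathcal{X}}(x)), \qquad (\mathcal{K}_{\mathcal{Y}} g)(y) := g(F_{\mathcal{Y}}(y)). \]
Then, the generalized operator $\mathcal{G} : \mathcal{F}(\mathcal{Y}) \to \mathcal{F}(\mathcal{X})$ satisfies the following commutation relation:
\[ \mathcal{K}_{\mathcal{X}} \mathcal{G} = \mathcal{G} \mathcal{K}_{\mathcal{Y}}. \]
Proof.
To prove the equality of the two operator compositions, we evaluate their action on an arbitrary observable $g \in \mathcal{F}(\mathcal{Y})$ and $x \in \mathcal{X}$.
Applying $\mathcal{K}_{\mathcal{X}}$ to $\mathcal{G} g$ yields
\[ (\mathcal{K}_{\mathcal{X}} \mathcal{G} g)(x) = (\mathcal{G} g)(F_{\mathcal{X}}(x)) = g(\pi(F_{\mathcal{X}}(x))). \]
Using the commutation (semiconjugacy) condition (24), we obtain
\[ g(\pi(F_{\mathcal{X}}(x))) = g(F_{\mathcal{Y}}(\pi(x))). \]
On the other hand, applying $\mathcal{G}$ to $\mathcal{K}_{\mathcal{Y}} g$ results in
\[ (\mathcal{G} \mathcal{K}_{\mathcal{Y}} g)(x) = (\mathcal{K}_{\mathcal{Y}} g)(\pi(x)) = g(F_{\mathcal{Y}}(\pi(x))). \]
Thus,
\[ (\mathcal{K}_{\mathcal{X}} \mathcal{G} g)(x) = (\mathcal{G} \mathcal{K}_{\mathcal{Y}} g)(x), \]
and since this holds for arbitrary $x \in \mathcal{X}$ and any $g \in \mathcal{F}(\mathcal{Y})$, the statement of the theorem follows.    □
The commutation relation $\mathcal{K}_{\mathcal{X}} \mathcal{G} = \mathcal{G} \mathcal{K}_{\mathcal{Y}}$ shows that the generalized transfer operator $\mathcal{G}$, which is linear by Lemma 1, intertwines the Koopman dynamics of the physical and augmented spaces. This is the mechanism by which spectral information is preserved under the lifting induced by $\pi$. Boundedness of $\mathcal{G}$ is not implied by the relation itself and is imposed as a separate assumption where it is needed (Theorem 4).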
Remark 3. The semiconjugacy condition (24) is a structural assumption and does not hold generically for arbitrary embeddings or heterogeneous delay constructions. In particular, when $\mathcal{X}$ and $\mathcal{Y}$ are constructed using unequal delay lengths or distinct observation functions, the induced dynamics may fail to commute under projection. Theorem 2 therefore characterizes the algebraic structure of Koopman operators only in settings where a compatible lifting of the dynamics exists, such as uniform delay embeddings or shift-invariant augmented state spaces.
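Theorem 3 (Spectral Inheritance). Let $\mathcal{X}$ and $\mathcal{Y}$ be state spaces with induced dynamics $F_{\mathcal{X}}$ and $F_{\mathcal{Y}}$, and let $\pi : \mathcal{X} \to \mathcal{Y}$ be a measurable map satisfying the semiconjugacy condition (24). Let $\mathcal{G} : \mathcal{F}(\mathcal{Y}) \to \mathcal{F}(\mathcal{X})$ be defined by $(\mathcal{G} g)(x) = g(\pi(x))$. If $\varphi_\lambda \in \mathcal{F}(\mathcal{Y})$ is an eigenfunction of $\mathcal{K}_{\mathcal{Y}}$ with eigenvalue $\lambda$, then its image under the generalized operator,
\[ \phi_\lambda := \mathcal{G} \varphi_\lambda \in \mathcal{F}(\mathcal{X}), \]
is an eigenfunction of $\mathcal{K}_{\mathcal{X}}$ associated with the same eigenvalue $\lambda$, as follows:
\[ \mathcal{K}_{\mathcal{X}} \phi_\lambda = \lambda \phi_\lambda, \]
provided that $\phi_\lambda$ is nontrivial.
Proof.
From Theorem 2, we have established the commutation relation $\mathcal{G} \mathcal{K}_{\mathcal{Y}} = \mathcal{K}_{\mathcal{X}} \mathcal{G}$. Let $\varphi_\lambda \in \mathcal{F}(\mathcal{Y})$ be a Koopman eigenfunction of $\mathcal{K}_{\mathcal{Y}}$ with eigenvalue $\lambda$. Operating with $\mathcal{G}$ on the eigenvalue equation $\mathcal{K}_{\mathcal{Y}} \varphi_\lambda = \lambda \varphi_\lambda$, we obtain
\[ \mathcal{G}(\mathcal{K}_{\mathcal{Y}} \varphi_\lambda) = \mathcal{G}(\lambda \varphi_\lambda). \]
Using the commutation property on the left side and linearity (Lemma 1) on the right side results in the following:
\[ \mathcal{K}_{\mathcal{X}}(\mathcal{G} \varphi_\lambda) = \lambda (\mathcal{G} \varphi_\lambda). \]
This implies that $\phi_\lambda = \mathcal{G} \varphi_\lambda$ is an eigenfunction of the Koopman operator $\mathcal{K}_{\mathcal{X}}$ acting on the space $\mathcal{X}$ with the same eigenvalue $\lambda$.    □
This result implies that the spectral information of the Koopman operator on $\mathcal{Y}$ is preserved under the generalized transfer operator $\mathcal{G}$, providing a direct link between the eigenfunctions in the two observation spaces.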
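Spectral inheritance can be illustrated on a toy system with a known eigenfunction. In the sketch below, the polynomial map, its parameters, the two-block window, and the truncation map `pi` are all illustrative assumptions; `varphi` is an eigenfunction of the Koopman operator on $\mathcal{Y} = \mathcal{M}$ with eigenvalue `b`, and `phi_lift` is its image under composition with `pi`.

```python
import numpy as np

a, b = 0.9, 0.5  # illustrative parameters (assumed)

def F(z):
    """Toy map on Y = M for which varphi below is a Koopman eigenfunction."""
    return np.array([a * z[0], b * z[1] + (a**2 - b) * z[0]**2])

def F_X(x):
    """Induced shift on windows x = (z_k, z_{k+1})."""
    z_next = x[2:]
    return np.concatenate([z_next, F(z_next)])

def pi(x):
    """Truncation to the first block of the window."""
    return x[:2]

varphi = lambda y: y[1] - y[0]**2    # eigenfunction of K_Y with eigenvalue b
phi_lift = lambda x: varphi(pi(x))   # lifted observable G(varphi) on X

rng = np.random.default_rng(3)
inherit_residual = 0.0
for _ in range(100):
    z = rng.uniform(-1.0, 1.0, size=2)
    x = np.concatenate([z, F(z)])    # window from a true trajectory
    lhs = phi_lift(F_X(x))           # (K_X phi_lift)(x)
    rhs = b * phi_lift(x)            # eigenvalue times phi_lift(x)
    inherit_residual = max(inherit_residual, abs(lhs - rhs))
print(f"max |K_X phi - b phi| = {inherit_residual:.2e}")
```

The lifted function satisfies the eigenvalue relation on trajectory windows with the same eigenvalue `b`, in line with the statement of the theorem under the semiconjugacy condition.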
Remark 4. The semiconjugacy condition $\pi \circ F_{\mathcal{X}} = F_{\mathcal{Y}} \circ \pi$ is essential for the validity of Theorem 3. It guarantees that the dynamics on $\mathcal{X}$ and $\mathcal{Y}$ are compatible under the map $\pi$, allowing eigenfunctions to be transferred faithfully. In general, for arbitrary alignment maps, heterogeneous embeddings, or unequal delay lengths, this condition may fail, and Theorem 3 may no longer hold. In such cases, the spectral transfer may only be approximate or may not preserve eigenvalues, and care must be taken when interpreting the transferred spectral properties.
In practical data-driven applications, the semiconjugacy condition may be violated in several ways, specified below:
  • If the measurements are corrupted by noise or undersampled relative to the fastest time scales of the system, the induced dynamics $F_{\mathcal{X}}$ may fail to accurately represent the underlying system $F_{\mathcal{M}}$.
  • As noted in Remark 3, using heterogeneous delay lengths or non-stationary observation functions can break the semiconjugacy. In such cases, the operator $\mathcal{G}$ no longer intertwines the dynamics, and the eigenvalues found in $\mathcal{X}$ may be spurious or fail to correspond to the physical frequencies in $\mathcal{Y}$.
  • If the span of the dictionary in $\mathcal{X}$ does not contain the lifted versions of the eigenfunctions from $\mathcal{Y}$, the spectral inheritance cannot be numerically realized, resulting in approximation errors.
Theorem 4
(Spectral Representation Mapping). Assume that the generalized operator G : F ( Y ) F ( X ) is a bounded linear operator. Let an observable g F ( Y ) admit a convergent Koopman spectral expansion of the form
g ( y ) = j = 1 v j φ j ( y ) ,
where { φ j } are Koopman eigenfunctions of K Y with eigenvalues { λ j } . Then, under the semiconjugacy assumption of Theorem 2, the lifted observable h = G g admits the representation
h ( x ) = j = 1 v j ϕ j ( x ) ,
where ϕ j = G φ j are Koopman eigenfunctions of K X associated with the same eigenvalues λ j .
Proof. 
Let g F ( Y ) admit the convergent spectral expansion (25) and define h = G g . By the assumed boundedness of the linear operator G , it is necessarily continuous on the observable space. This continuity allows the operator to commute with the limit of the infinite series:
h = G j = 1 v j φ j = j = 1 v j ( G φ j ) ,
where the convergence of the resulting series in F ( X ) is guaranteed by the boundedness of G (i.e., G v j φ j G · v j φ j ). From Theorem 3, under the semiconjugacy condition (24), each image ϕ j = G φ j is a Koopman eigenfunction of the induced operator K X associated with the same eigenvalue λ j . Substituting this result into the series, we obtain
h = j = 1 v j ϕ j .
Evaluating this expression at any point x X yields
h ( x ) = j = 1 v j ϕ j ( x ) ,
which establishes the claimed spectral representation mapping.    □
This result establishes that the generalized transfer operator G preserves the spectral decomposition of observables. Consequently, the eigenvalues and expansion coefficients of g in Y are directly inherited by the transferred observable h in X , providing a rigorous foundation for data-driven spectral reconstruction across different observation spaces.
Remark 5.
The operator $\mathcal{G}$ is assumed to be linear and bounded in the function space $\mathcal{F}(\mathcal{Y})$, ensuring that the termwise mapping of the series (25) is well defined and that convergence is preserved in the image space. Furthermore, it is assumed that the representation (25) converges in $\mathcal{F}(\mathcal{Y})$. Theorem 4 does not assert the existence or completeness of a Koopman eigenbasis, but rather characterizes the transfer of an existing spectral expansion under a compatible lifting.

3.3. Convergence to the Classical Koopman Case

Within the generalized framework introduced above, the transfer operator $\mathcal{G}$ acts between observables defined on potentially distinct observation spaces $\mathcal{X}$ and $\mathcal{Y}$. A fundamental consistency requirement of this construction is that, under appropriate assumptions, $\mathcal{G}$ reduces to the classical Koopman operator. In this subsection, we formalize the conditions under which this reduction occurs and show that the proposed framework recovers the standard Koopman spectral representation as a special case.
  • (i) Identical Observation Spaces. Assume that the observation spaces coincide, i.e.,
$$\mathcal{X} \equiv \mathcal{Y},$$
so that the generalized operator acts as
$$\mathcal{G} : \mathcal{F}(\mathcal{X}) \to \mathcal{F}(\mathcal{X}).$$
Let $\{\phi_j\}_{j=1}^{\infty} \subset \mathcal{F}(\mathcal{X})$ be a set of eigenfunctions of $\mathcal{G}$ with corresponding eigenvalues $\{\lambda_j\}_{j=1}^{\infty}$, satisfying
$$(\mathcal{G} \phi_j)(x) = \phi_j(\pi(x)) = \lambda_j \phi_j(x), \qquad x \in \mathcal{X}.$$
Suppose that an observable $g \in \mathcal{F}(\mathcal{X})$ admits the expansion
$$g(x) = \sum_{j=1}^{\infty} v_j \phi_j(x),$$
with convergence in $\mathcal{F}(\mathcal{X})$. By linearity of $\mathcal{G}$ and the eigenfunction property, it follows that
$$(\mathcal{G} g)(x) = g(\pi(x)) = \sum_{j=1}^{\infty} v_j \lambda_j \phi_j(x).$$
In the particular case where the alignment map coincides with the system dynamics, $\pi = F$, this expression reduces to the classical Koopman relation
$$g(F(x)) = \sum_{j=1}^{\infty} v_j \lambda_j \phi_j(x),$$
and repeated application yields
$$g(F^k(x)) = \sum_{j=1}^{\infty} v_j \lambda_j^k \phi_j(x), \qquad k \in \mathbb{N}.$$
  • (ii) Classical Koopman Limit. Consider now the fully classical setting in which the observation spaces coincide with the state space,
$$\mathcal{X} \equiv \mathcal{Y} \equiv \mathcal{M},$$
and, consequently,
$$\mathcal{F}(\mathcal{X}) = \mathcal{F}(\mathcal{Y}) = \mathcal{F}(\mathcal{M}).$$
If, in addition, the alignment map $\pi$ coincides with the system evolution map $F$, the generalized operator $\mathcal{G}$ reduces exactly to the classical Koopman operator $\mathcal{K}$.
Let $g \in \mathcal{F}(\mathcal{M})$ lie in the span of Koopman eigenfunctions $\{\phi_j\}$ with eigenvalues $\{\lambda_j\}$, and assume convergence of the associated series expansion. Then, the spectral representation induced by $\mathcal{G}$ coincides with the standard Koopman mode decomposition,
$$g(F^k(z)) = \sum_{j=1}^{\infty} \lambda_j^k v_j \phi_j(z), \qquad z \in \mathcal{M}, \quad k \in \mathbb{N}.$$
Here, the eigenvalues $\lambda_j$ encode the temporal growth, decay, or oscillatory behavior of the corresponding Koopman modes, confirming that the generalized framework is fully consistent with classical Koopman theory.

4. Data-Driven Approximation of the Generalized Operator G

Recall that the generalized transfer operator
$$\mathcal{G} : \mathcal{F}(\mathcal{Y}) \to \mathcal{F}(\mathcal{X})$$
acts on an observable $g \in \mathcal{F}(\mathcal{Y})$ according to
$$(\mathcal{G} g)(x) = g(\pi(x)), \qquad x \in \mathcal{X},$$
where $\pi : \mathcal{X} \to \mathcal{Y}$ is a measurable alignment map. Thus, $\mathcal{G}$ advances observables by composition with $\pi$, mapping an observable defined on $\mathcal{Y}$ to a corresponding observable on $\mathcal{X}$. As in the classical Koopman setting, $\mathcal{G}$ is linear but acts on an infinite-dimensional function space, rendering direct numerical computation intractable.
To obtain a practical data-driven realization, we approximate $\mathcal{G}$ on a finite-dimensional subspace spanned by a prescribed dictionary of observables. The objective is to identify a finite-dimensional linear operator that best reproduces the action of $\mathcal{G}$ on sampled data, in the least squares sense.
  • Finite-dimensional approximation. For simplicity of exposition, we consider the case
$$\mathcal{X} \equiv \mathcal{Y} = \mathcal{M}^{s+d+1},$$
corresponding to extended state representations constructed from fixed past and future windows. Let
$$\mathcal{D} = \{\psi_1, \ldots, \psi_N\}, \qquad \psi_i \in \mathcal{F}(\mathcal{X}),$$
be a dictionary of scalar-valued observables spanning a finite-dimensional subspace of $\mathcal{F}(\mathcal{X})$. We collect these observables into the following feature map:
$$\psi(x) = \begin{bmatrix} \psi_1(x) \\ \vdots \\ \psi_N(x) \end{bmatrix} \in \mathbb{C}^{N}, \qquad x \in \mathcal{M}^{s+d+1},$$
which encodes information from both past and future states through the extended state $x$.
  • Snapshot construction. Given a sequence of state measurements $\{z_k\}_{k=1}^{K}$, we construct pairs of extended states,
$$\{(x_k, y_k)\}_{k=1}^{M}, \qquad y_k = \pi(x_k),$$
where $x_k$ and $y_k$ are formed according to the chosen delay and anticipation depths. Evaluating the feature map on these samples yields the following data matrices:
$$\Psi_X = \begin{bmatrix} \psi(x_1) & \cdots & \psi(x_M) \end{bmatrix} \in \mathbb{C}^{N \times M}, \qquad \Psi_Y = \begin{bmatrix} \psi(y_1) & \cdots & \psi(y_M) \end{bmatrix} \in \mathbb{C}^{N \times M}. \tag{36}$$
  • Least squares operator estimation. The finite-dimensional approximation $\mathbf{G} \in \mathbb{C}^{N \times N}$ of the generalized operator $\mathcal{G}$ is defined as the solution of
$$\mathbf{G} \Psi_X \approx \Psi_Y, \tag{37}$$
which represents a discrete realization of the operator relation $(\mathcal{G} g)(x) = g(\pi(x))$, restricted to the span of $\mathcal{D}$. The matrix $\mathbf{G}$ is obtained by solving the following least squares problem:
$$\operatorname*{arg\,min}_{\mathbf{G} \in \mathbb{C}^{N \times N}} \| \Psi_Y - \mathbf{G} \Psi_X \|_F^2, \tag{38}$$
whose minimum norm solution is given by
$$\mathbf{G} = \Psi_Y \Psi_X^{\dagger}, \tag{39}$$
where $\Psi_X^{\dagger}$ denotes the Moore–Penrose pseudoinverse of $\Psi_X$.
The operator $\mathbf{G}$ in (39) is the optimal linear map, in the Frobenius norm sense, that advances feature vectors from $\Psi_X$ to $\Psi_Y$. When an exact solution exists, (39) yields the solution with minimal Frobenius norm. Otherwise, it provides the best least squares approximation to the action of the generalized transfer operator on the chosen finite-dimensional feature space.
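The estimate (39) can be sketched in a few lines of NumPy. The function name `estimate_operator` and the synthetic data are illustrative assumptions, not part of the original method description; the snippet only checks that an exactly linear relation between lifted snapshots is recovered.

```python
import numpy as np

def estimate_operator(Psi_X, Psi_Y):
    """Least squares estimate G = Psi_Y @ pinv(Psi_X), cf. (39)."""
    return Psi_Y @ np.linalg.pinv(Psi_X)

# Synthetic check (illustrative data, not taken from the paper).
rng = np.random.default_rng(0)
N, M = 4, 50                          # dictionary size, number of snapshot pairs
G_true = rng.standard_normal((N, N))
Psi_X = rng.standard_normal((N, M))   # lifted inputs, one column per snapshot
Psi_Y = G_true @ Psi_X                # lifted targets, exactly linear in the inputs

G_hat = estimate_operator(Psi_X, Psi_Y)
residual = np.linalg.norm(Psi_Y - G_hat @ Psi_X, "fro")
```

Because $M > N$ and the random data have full row rank here, the pseudoinverse solution coincides with `G_true`; with fewer snapshots than features it would instead return the minimum norm fit.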
Next, we characterize the precise conditions under which the finite-dimensional operator $\mathbf{G}$ defined in (39) exactly reproduces the action of the generalized transfer operator on the sampled data, i.e., satisfies $\Psi_Y = \mathbf{G} \Psi_X$.
Theorem 5.
Let $\mathbf{G} = \Psi_Y \Psi_X^{\dagger}$ be defined as in (39). Then,
$$\Psi_Y = \mathbf{G} \Psi_X,$$
if, and only if, the null spaces of the data matrices satisfy
$$\mathcal{N}(\Psi_X) \subseteq \mathcal{N}(\Psi_Y),$$
where $\mathcal{N}(\cdot)$ denotes the null space.
Proof. 
We first prove necessity. Suppose that the condition $\mathcal{N}(\Psi_X) \subseteq \mathcal{N}(\Psi_Y)$ does not hold. Then there exists a vector $u \in \mathcal{N}(\Psi_X)$ such that $\Psi_Y u \neq 0$. Since $u \in \mathcal{N}(\Psi_X)$, we have $\Psi_X u = 0$, and therefore
$$\mathbf{G} \Psi_X u = \Psi_Y \Psi_X^{\dagger} \Psi_X u = 0.$$
Hence, $\mathbf{G} \Psi_X u \neq \Psi_Y u$, which implies that $\Psi_Y \neq \mathbf{G} \Psi_X$ for any choice of $\mathbf{G}$. Thus, exact reconstruction is impossible unless $\mathcal{N}(\Psi_X) \subseteq \mathcal{N}(\Psi_Y)$.
We now prove sufficiency. Assume that $\mathcal{N}(\Psi_X) \subseteq \mathcal{N}(\Psi_Y)$. Using the definition $\mathbf{G} = \Psi_Y \Psi_X^{\dagger}$, we compute
$$\Psi_Y - \mathbf{G} \Psi_X = \Psi_Y - \Psi_Y \Psi_X^{\dagger} \Psi_X = \Psi_Y \big( I - \Psi_X^{\dagger} \Psi_X \big).$$
The matrix $\Psi_X^{\dagger} \Psi_X$ is the orthogonal projector onto $\mathcal{R}(\Psi_X^{*})$, the range of the adjoint of $\Psi_X$. Consequently, $I - \Psi_X^{\dagger} \Psi_X$ is the orthogonal projector onto $\mathcal{R}(\Psi_X^{*})^{\perp} = \mathcal{N}(\Psi_X)$. Since $\mathcal{N}(\Psi_X) \subseteq \mathcal{N}(\Psi_Y)$, by assumption, it follows that
$$\Psi_Y \big( I - \Psi_X^{\dagger} \Psi_X \big) = 0,$$
and therefore $\Psi_Y = \mathbf{G} \Psi_X$. This completes the proof.    □
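The null space condition of Theorem 5 can be tested numerically through the projector used in the proof. The helper name `exact_fit_possible`, the tolerance, and the toy matrices below are assumptions made for illustration only.

```python
import numpy as np

def exact_fit_possible(Psi_X, Psi_Y, tol=1e-10):
    """Check N(Psi_X) ⊆ N(Psi_Y) by testing Psi_Y (I - pinv(Psi_X) Psi_X) = 0."""
    M = Psi_X.shape[1]
    P_null = np.eye(M) - np.linalg.pinv(Psi_X) @ Psi_X   # orthogonal projector onto N(Psi_X)
    return np.linalg.norm(Psi_Y @ P_null) <= tol * max(1.0, np.linalg.norm(Psi_Y))

# Two identical input snapshots: u = e1 - e2 lies in N(Psi_X).
Psi_X = np.array([[1.0, 1.0], [2.0, 2.0]])
Psi_Y_ok = np.array([[3.0, 3.0], [1.0, 1.0]])    # identical outputs, condition holds
Psi_Y_bad = np.array([[1.0, 0.0], [0.0, 1.0]])   # different outputs, condition fails

ok = exact_fit_possible(Psi_X, Psi_Y_ok)
bad = exact_fit_possible(Psi_X, Psi_Y_bad)
```

In this toy case the failing example corresponds to two samples with the same lifted input but different lifted outputs, which no single linear map can reproduce.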
Remark 6.
The condition $\mathcal{N}(\Psi_X) \subseteq \mathcal{N}(\Psi_Y)$ guarantees that no nontrivial linear dependency present in the feature evaluations at $x_k$ is violated after alignment by $\pi$. This condition is analogous to consistency requirements appearing in EDMD and related operator-theoretic identification frameworks.
To improve numerical stability and enhance interpretability, it is convenient to reformulate the least squares approximation in terms of normalized correlation matrices in feature space. Specifically, we introduce the following empirical covariance operators:
  • The Gram matrix in feature space,
    $$G_r = \frac{1}{M} \Psi_X \Psi_X^{*}, \tag{40}$$
    which represents the empirical inner product induced by the data on the dictionary $\{\psi_i\}_{i=1}^{N}$;
  • The action matrix,
    $$A_c = \frac{1}{M} \Psi_Y \Psi_X^{*}, \tag{41}$$
    which encodes the action of the generalized operator $\mathcal{G}$ on the feature evaluations.
Theorem 6
(Galerkin approximation of the generalized operator). Let $\mathcal{G}$ denote the generalized sequence-based operator acting on the finite-dimensional multi-state observable space
$$\mathcal{F}_{\mathcal{D}} = \operatorname{span}\{\psi_1, \ldots, \psi_N\}, \qquad \psi_i \in \mathcal{F}(\mathcal{X}),$$
and let the feature space data matrices $\Psi_X$ and $\Psi_Y$ be defined as in (36). Define the Gram matrix $G_r$ and the action matrix $A_c$ by (40)–(41). Then the matrix
$$\mathbf{G} = A_c G_r^{\dagger} \tag{42}$$
provides the least squares approximation of the action of $\mathcal{G}$ restricted to $\mathcal{F}_{\mathcal{D}}$ and admits the interpretation of a data-driven Galerkin projection of $\mathcal{G}$ onto $\operatorname{span}\{\psi_i\}_{i=1}^{N}$.
Proof. 
We seek a finite-dimensional linear operator $\mathbf{G} \in \mathbb{C}^{N \times N}$ such that
$$\Psi_Y \approx \mathbf{G} \Psi_X$$
in the least squares sense (38). Equivalently, we solve
$$\min_{\mathbf{G} \in \mathbb{C}^{N \times N}} \sum_{k=1}^{M} \| \psi(y_k) - \mathbf{G} \psi(x_k) \|_2^2. \tag{43}$$
Expanding (43) and differentiating with respect to $\mathbf{G}$ leads to the following normal equations:
$$\sum_{k=1}^{M} \psi(y_k) \psi(x_k)^{*} = \mathbf{G} \sum_{k=1}^{M} \psi(x_k) \psi(x_k)^{*}.$$
Dividing both sides by $M$ and using the definitions of the action matrix $A_c$ and Gram matrix $G_r$ (see (40)–(41)) yields
$$A_c = \mathbf{G} G_r.$$
If $G_r$ is invertible, the unique solution is
$$\mathbf{G} = A_c G_r^{-1}.$$
In the more general case when $G_r$ is rank-deficient (e.g., due to $M < N$ or correlated snapshots), the Moore–Penrose pseudoinverse $G_r^{\dagger}$ provides the minimum norm solution:
$$\mathbf{G} = A_c G_r^{\dagger}.$$
Galerkin interpretation. Consider the discrete, data-driven inner product over the snapshot set,
$$\langle f, g \rangle_M := \frac{1}{M} \sum_{k=1}^{M} f(x_k)^{*} g(x_k),$$
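which induces the least squares metric in the dictionary space. Projecting the action of $\mathcal{G}$ onto $\operatorname{span}\{\psi_i\}_{i=1}^{N}$ under this inner product, i.e., requiring the residual of each lifted dictionary element to be orthogonal to every dictionary element, leads to the following conditions:
$$\Big\langle \psi_{\ell}, \; \mathcal{G} \psi_j - \sum_{i=1}^{N} G_{ji} \psi_i \Big\rangle_M = 0, \qquad \text{for all } j, \ell = 1, \ldots, N,$$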
which in matrix form exactly reproduces
$$A_c = \mathbf{G} G_r,$$
and, hence,
$$\mathbf{G} = A_c G_r^{\dagger}.$$
This demonstrates that $\mathbf{G}$ is the finite-dimensional least squares Galerkin approximation of $\mathcal{G}$ in the chosen dictionary basis.    □
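As a numerical sanity check of Theorem 6, the Galerkin form $A_c G_r^{\dagger}$ can be compared with the direct least squares form $\Psi_Y \Psi_X^{\dagger}$ on random data. The matrix sizes, the random seed, and the explicit pseudoinverse cutoff `rcond` are assumptions chosen for this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 6, 4                                   # more features than snapshots: G_r is rank-deficient
Psi_X = rng.standard_normal((N, M))
Psi_Y = rng.standard_normal((N, M))

G_r = Psi_X @ Psi_X.conj().T / M              # Gram matrix, cf. (40)
A_c = Psi_Y @ Psi_X.conj().T / M              # action matrix, cf. (41)

rcond = 1e-10                                 # explicit cutoff so numerically zero modes are discarded
G_galerkin = A_c @ np.linalg.pinv(G_r, rcond=rcond)
G_lstsq = Psi_Y @ np.linalg.pinv(Psi_X, rcond=rcond)
gap = np.linalg.norm(G_galerkin - G_lstsq)    # expected to be at round-off level
```

The agreement follows from the identity $\Psi_X^{\dagger} = \Psi_X^{*} (\Psi_X \Psi_X^{*})^{\dagger}$; the factor $1/M$ cancels between $A_c$ and $G_r^{\dagger}$.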
Remark 7
(Empirical Galerkin Approximation). Equation (42) provides a data-driven Galerkin approximation of $\mathcal{G}$ onto $\operatorname{span}\{\psi_i\}$. Strictly speaking, this is a finite-dimensional, snapshot-based projection defined through the empirical inner product $\langle \cdot, \cdot \rangle_M$. Consequently, its convergence to the true Hilbert-space projection is governed by the sample size $M$, the richness of the dictionary $\{\psi_i\}$, and the signal-to-noise ratio. The Moore–Penrose pseudoinverse $G_r^{\dagger}$ always exists, ensuring a unique minimum norm solution is obtained even in cases of rank deficiency or collinearity in the feature space.
In the special case where $\mathcal{X} = \mathcal{Y} = \mathcal{M}$ and the alignment map coincides with the system flow ($\pi = F$), the finite-dimensional approximation (42) recovers the standard Extended Dynamic Mode Decomposition (EDMD) formulation of the Koopman operator. Here, the empirical inner product $\langle \cdot, \cdot \rangle_M$ coincides with the conventional EDMD inner product over snapshots, and $\mathbf{G}$ reduces to the familiar EDMD matrix, confirming that the proposed framework is a consistent generalization of classical data-driven Koopman analysis.

4.1. Spectral Decomposition of G and Data Reconstruction

The reconstruction of nonlinear dynamics proceeds by mapping the linear evolution in the high-dimensional observable space back to the original state space.
We perform an eigendecomposition of the learned matrix $\mathbf{G}$:
$$\mathbf{G} W = W \Lambda,$$
where $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_K)$ contains the dynamic eigenvalues, and the columns of $W$ are the corresponding right eigenvectors. These eigenvalues encode the growth, decay, and oscillation frequencies of the system’s modes.
The relation (37) can be expressed in terms of the eigendecomposition:
$$\psi(y) = \psi(\pi(x)) = W \Lambda W^{-1} \psi(x) = V \Phi(x),$$
where
$$\Phi(x) = W^{-1} \psi(x)$$
approximates the generalized eigenfunctions, and
$$V = W \Lambda$$
contains the dynamic modes.
In the particular case of $\pi = F$, the relation simplifies to
$$\psi(x_k) = \mathbf{G}^k \psi(x_0) = W \Lambda^k W^{-1} \psi(x_0).$$
If the extended state $x$ is representable in the dictionary $\psi(x)$, there exists a reconstruction matrix $R \in \mathbb{R}^{(s+d+1) \times N}$ such that
$$x = R \psi(x).$$
Typically, if $x$ is explicitly included as the first component(s) of $\psi(x)$, $R$ reduces to a simple selection matrix, $R = [\, I \;\; 0 \,]$. Otherwise, $R$ is obtained via least squares regression to best approximate $x$ from the feature vector.
The predicted extended state at any time $k$ is then
$$x_k = R W \Lambda^k W^{-1} \psi(x_0) = V \Lambda^k \Phi(x_0),$$
where
  • $V = R W \in \mathbb{C}^{(s+d+1) \times K}$ is the matrix of dynamic modes, each column representing a spatial structure associated with an eigenvalue;
  • $\Lambda^k \in \mathbb{C}^{K \times K}$ contains the eigenvalues raised to the $k$-th power, determining temporal evolution;
  • $\Phi(x_0) \in \mathbb{C}^{K}$ contains the generalized eigenfunctions evaluated at the initial state.
Finally, the original state $z_k \in \mathcal{M}$ can be recovered via a projection, as follows:
$$z_k = P x_k,$$
where $P$ selects the relevant coordinates from the extended-state vector.
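The modal prediction formula $x_k = R W \Lambda^k W^{-1} \psi(x_0)$ can be sketched as follows. The function name `spectral_predict`, the damped-rotation test matrix, and the identity dictionary (so that $R = I$) are assumptions for illustration, and the sketch presumes that the learned matrix is diagonalizable.

```python
import numpy as np

def spectral_predict(G, psi0, R, steps):
    """Return x_k = R W Lambda^k W^{-1} psi(x_0) for k = 0, ..., steps (columns)."""
    lam, W = np.linalg.eig(G)                  # G W = W Lambda
    Phi0 = np.linalg.solve(W, psi0)            # Phi(x_0) = W^{-1} psi(x_0)
    V = R @ W                                  # dynamic modes V = R W
    k = np.arange(steps + 1)
    powers = lam[:, None] ** k[None, :]        # lambda_j^k, shape (N, steps + 1)
    X = V @ (powers * Phi0[:, None])
    return X.real, lam                         # imaginary parts are round-off for real data

# Toy operator: rotation by 0.3 rad combined with contraction 0.95 (illustrative only).
theta, rho = 0.3, 0.95
G = rho * np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
psi0 = np.array([1.0, 0.0])
R = np.eye(2)                                  # identity dictionary: the state is its own feature vector

X_pred, lam = spectral_predict(G, psi0, R, steps=10)
```

Each column of `X_pred` agrees with direct iteration of `G`, while the eigenvalue moduli (here 0.95) give the per-step decay of the corresponding modes.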

4.2. Continuous-Time System Identification

The generalized transfer operator framework can be naturally extended to continuous-time systems, providing a data-driven methodology for identifying linear representations of nonlinear dynamics. Consider a system governed by
$$\dot{z} = f(z),$$
where $z \in \mathcal{M}$. In analogy to the continuous-time Koopman operator, the generalized operator now forms a semigroup $\{\mathcal{G}^t\}_{t \ge 0}$, generated by
$$\mathcal{L} g = \nabla g \cdot f,$$
which defines the instantaneous evolution of observables.
To obtain a finite-dimensional approximation, a Galerkin projection onto a chosen dictionary of multi-state observables is employed. Specifically, for an extended observable vector $\psi(x)$, we seek a matrix $\mathbf{L}$ such that
$$\dot{\psi}(x) \approx \mathbf{L} \psi(x),$$
capturing the continuous-time dynamics in the feature space. This formulation is independent of a fixed sampling interval and directly aligns with the underlying physical laws expressed as differential equations.
Under this framework, the continuous-time evolution of the observables admits the representation
$$\psi(x(t)) = W e^{t \Lambda} W^{-1} \psi(x_0), \tag{55}$$
where $W$ and $\Lambda$ are obtained from the spectral decomposition of $\mathbf{L}$, analogous to the discrete-time case. The extended state can then be reconstructed as
$$x(t) = V e^{t \Lambda} \Phi(x_0), \tag{56}$$
where $V = R W$ and $\Phi(x_0) = W^{-1} \psi(x_0)$. Equations (55)–(56) provide a continuous-time analog of the discrete-time spectral expansions, enabling the accurate prediction and analysis of nonlinear systems in a physically consistent manner.
This continuous-time formulation provides a natural extension of the discrete-time generalized operator framework. While the discrete-time operator $\mathcal{G}$ maps extended observables between fixed sampling intervals, the continuous-time generator $\mathcal{L}$ characterizes the instantaneous evolution of the same observables, independent of the sampling rate. Importantly, the spectral decomposition of $\mathbf{L}$ yields a continuous analog of the generalized eigenfunctions and dynamic modes, preserving the interpretability and reconstruction properties of the discrete-time FIRE framework. Consequently, the continuous-time extension enables the analysis, prediction, and control of nonlinear systems directly in their natural temporal domain, while maintaining consistency with the multi-state, trajectory-dependent observables central to the proposed operator-theoretic methodology.
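A minimal sketch of the generator fit $\dot{\psi} \approx \mathbf{L} \psi$ and of the reconstruction (55) is given below for a linear test system with a known closed-form trajectory. The test matrix `A`, the step `dt`, the identity dictionary, and the use of central differences are all assumptions of this sketch rather than choices documented in the text.

```python
import numpy as np

dt = 0.01
t = np.arange(0.0, 10.0, dt)

# Closed-form trajectory of z' = A z with z(0) = (1, 0) (illustrative test system).
A = np.array([[-0.1, 2.0], [-2.0, -0.1]])
Z = np.exp(-0.1 * t) * np.vstack([np.cos(2 * t), -np.sin(2 * t)])

Psi = Z[:, 1:-1]                               # identity dictionary psi(x) = x
dPsi = (Z[:, 2:] - Z[:, :-2]) / (2 * dt)       # central differences approximate d(psi)/dt
L = dPsi @ np.linalg.pinv(Psi)                 # least squares fit of d(psi)/dt ≈ L psi

lam, W = np.linalg.eig(L)
Phi0 = np.linalg.solve(W, Z[:, 0])             # Phi(x_0) = W^{-1} psi(x_0)
Z_rec = (W @ (np.exp(np.outer(lam, t)) * Phi0[:, None])).real   # W e^{t Lambda} Phi(x_0)
```

For this example the identified `L` matches `A` up to the finite-difference error, and the eigenvalues of `L` approximate the continuous-time decay rate and frequency directly, without reference to the sampling interval.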

4.3. Algorithm Summary

The key steps required to implement the generalized sequence-based FIRE framework for the data-driven modeling of nonlinear dynamical systems are summarized in Algorithm 1.
From Algorithm 1, the transfer matrix $\mathbf{G}$ and its associated spectrum $\{\lambda_j, w_j\}$ are obtained, enabling the reconstruction of the system’s state evolution via dominant eigenvalues and modes. This approach generalizes standard methods by incorporating temporal correlations and multi-step dependencies, providing a flexible yet linear-algebraically consistent representation of nonlinear dynamics.
Optional stabilization via ridge regularization. In order to stabilize the operator estimation in severely underdetermined settings, the regression problem can be augmented with Tikhonov (ridge) regularization. In this case, the operator $\mathbf{G}$ is computed as
$$\mathbf{G} = \Psi_Y \Psi_X^{*} \big( \Psi_X \Psi_X^{*} + \alpha I \big)^{-1}, \tag{57}$$
where $\alpha > 0$ is a regularization parameter controlling the trade-off between data fidelity and numerical stability. This formulation penalizes large operator norms and improves conditioning when the Gram matrix $\Psi_X \Psi_X^{*}$ is ill-conditioned or rank-deficient. This provides a formal mechanism to handle underdetermined systems beyond simple rank truncation, further safeguarding the spectral estimates against overfitting. Such regularization is particularly relevant in high-dimensional lifted representations with limited data.
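The ridge estimate (57) can be sketched with a linear solve instead of an explicit inverse. The function name `ridge_operator`, the random data, and the two values of $\alpha$ are illustrative assumptions.

```python
import numpy as np

def ridge_operator(Psi_X, Psi_Y, alpha):
    """G = Psi_Y Psi_X^* (Psi_X Psi_X^* + alpha I)^{-1}, cf. (57)."""
    N = Psi_X.shape[0]
    gram = Psi_X @ Psi_X.conj().T + alpha * np.eye(N)   # Hermitian, positive definite for alpha > 0
    cross = Psi_Y @ Psi_X.conj().T
    # Solve G @ gram = cross via gram @ G^H = cross^H (gram is Hermitian).
    return np.linalg.solve(gram, cross.conj().T).conj().T

# Underdetermined toy problem: 8 features, 5 snapshots (illustrative sizes).
rng = np.random.default_rng(2)
Psi_X = rng.standard_normal((8, 5))
Psi_Y = rng.standard_normal((8, 5))

G_weak = ridge_operator(Psi_X, Psi_Y, alpha=1e-3)
G_strong = ridge_operator(Psi_X, Psi_Y, alpha=1.0)
```

Increasing $\alpha$ shrinks the Frobenius norm of the estimate, which is the trade-off between data fidelity and stability described above; how to select $\alpha$ in practice is not specified here.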
Offline versus online usage. The use of extended states that include future observations is confined exclusively to the offline training stage, where the generalized operator $\mathcal{G}$ and its spectral decomposition are identified from data. Once the operator has been learned, online prediction and reconstruction proceed in a purely forward manner by propagating the modal coordinates through the learned eigenvalues $\lambda_j$ and evaluating the associated eigenfunctions $\phi_j$. No access to future states is required during deployment. Moreover, a strictly causal variant of the framework is recovered by setting the anticipation depth to $s = 0$, in which case the method reduces to a past-only, delay-based formulation suitable for real-time forecasting.
Algorithm 1 Future Informed Regression Evolution (FIRE)
1. Data Collection: Gather a sequence of state snapshots $\{z_k\}_{k=1}^{m}$ from the dynamical system of interest.
2. Extended State Construction: Define extended states $x_k$ and $y_k$ according to the chosen memory depth $d$ and anticipation depth $s$, forming
$$x_k = [z_{k-d}, \ldots, z_k, \ldots, z_{k+s}], \qquad y_k = \pi(x_k),$$
where $\pi$ is the alignment map encoding temporal or feature-based correspondence.
3. Dictionary Selection: Choose a set of basis functions (observables) $\mathcal{D} = \{\psi_1, \ldots, \psi_N\}$ to define the finite-dimensional feature space. Form the feature vectors
$$\psi(x_k) = [\psi_1(x_k), \ldots, \psi_N(x_k)]^{T}.$$
4. Data Matrix Assembly: Construct the snapshot matrices
$$\Psi_X = [\psi(x_1), \ldots, \psi(x_M)], \qquad \Psi_Y = [\psi(y_1), \ldots, \psi(y_M)].$$
5. Finite-Dimensional Operator Estimation: Compute the least squares approximation of the generalized operator as follows:
$$\mathbf{G} = \Psi_Y \Psi_X^{\dagger}.$$
Or, equivalently, the Galerkin projection
$$\mathbf{G} = A_c G_r^{\dagger}, \qquad A_c = \frac{1}{M} \Psi_Y \Psi_X^{*}, \qquad G_r = \frac{1}{M} \Psi_X \Psi_X^{*}.$$
6. Spectral Decomposition: Perform an eigendecomposition of $\mathbf{G}$:
$$\mathbf{G} W = W \Lambda,$$
where $\Lambda$ contains the eigenvalues and $W$ contains the corresponding eigenvectors. Compute the generalized eigenfunctions
$$\Phi(x_k) = W^{-1} \psi(x_k),$$
and the dynamic modes
$$V = R W,$$
with $R$ being the reconstruction matrix mapping the observables back to the original (extended) state.
7. State Reconstruction and Prediction: Reconstruct and predict the system state using
$$x_k \approx V \Lambda^k \Phi(x_0), \qquad z_k = P x_k,$$
where $P$ selects the physical coordinates from the extended state.
8. Continuous-Time Extension (Optional): For continuous-time systems, approximate the generator $\mathbf{L}$ via
$$\dot{\psi}(x) \approx \mathbf{L} \psi(x),$$
and reconstruct the state trajectory as
$$x(t) = V \exp(t \Lambda) \Phi(x_0), \qquad z(t) = P x(t).$$
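Step 2 of Algorithm 1 can be sketched as a sliding-window stacking of snapshots. The function name `build_extended_states`, the column-per-snapshot layout, and the choice of a one-step shift as alignment map $\pi$ are assumptions of this sketch; other alignment maps would pair the columns differently.

```python
import numpy as np

def build_extended_states(Z, d, s):
    """Stack x_k = [z_{k-d}, ..., z_k, ..., z_{k+s}] column-wise for every admissible k.

    Z has shape (n, m) with one state snapshot per column.
    The result has shape ((d + s + 1) * n, m - d - s).
    """
    n, m = Z.shape
    width = d + s + 1
    if m < width:
        raise ValueError("not enough snapshots for the requested window")
    cols = [Z[:, j:j + width].T.reshape(-1) for j in range(m - width + 1)]
    return np.stack(cols, axis=1)

# Toy data: six snapshots of a two-dimensional state (illustrative values).
Z = np.arange(12.0).reshape(2, 6)              # columns are z_0 = (0, 6), z_1 = (1, 7), ...
E = build_extended_states(Z, d=1, s=1)         # one past and one future snapshot per window

# Assumed alignment map: a one-step time shift of the extended state.
X_ext, Y_ext = E[:, :-1], E[:, 1:]
```

The pairs `(X_ext, Y_ext)` would then be passed through the dictionary to form $\Psi_X$ and $\Psi_Y$ in steps 3 and 4.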
Novelty of FIRE. Although EDMD on delay-embedded states and Hankel-DMD (or Hankel-EDMD) also operate on augmented state spaces, they are limited to cases where the lifted and target spaces coincide and the mapping between snapshots is a simple time shift. In contrast, the proposed FIRE framework introduces a generalized transfer operator $\mathcal{G} : \mathcal{F}(\mathcal{Y}) \to \mathcal{F}(\mathcal{X})$, which allows arbitrary measurable mappings $\pi : \mathcal{X} \to \mathcal{Y}$ between potentially heterogeneous spaces. This enables the incorporation of heterogeneous observables, the anticipation of future states, or mappings that are not mere temporal shifts. As a result, FIRE extends the spectral analysis and modal decomposition beyond conventional delay embeddings.
In Section 5, we illustrate this capability with the forced harmonic oscillator of Example 2, where $\pi$ is not a simple shift: it incorporates both delayed states and additional observables corresponding to the forcing terms. It is important to distinguish between the raw data snapshots and the functional observable spaces. Although the matrices $\Psi_X$ and $\Psi_Y$ are constructed from the same temporal window of measurements, they span fundamentally distinct observable spaces. The source space $\mathcal{X}$ represents the state configuration (position and velocity), while the target space $\mathcal{Y}$ represents the evolutionary dynamics (velocity and acceleration). By mapping state observables to evolutionary observables, the generalized operator $\mathcal{G}$ acts as a differential bridge between these heterogeneous spaces. FIRE is therefore not merely a temporal shift of snapshots, as in standard delay-DMD, but a projection of the system generator onto a multi-scale dictionary, which allows it to capture dynamics beyond standard EDMD or Hankel-DMD embeddings.

5. Numerical Experiments

In this section, we present numerical experiments that demonstrate the effectiveness of the proposed multi-state decomposition framework for both finite-dimensional ODE systems and high-dimensional PDEs. By constructing observables that depend on multiple time steps, the method captures temporal correlations and multi-step interactions that conventional single-state approaches may fail to resolve. These examples illustrate how the generalized operator $\mathcal{G}$ and the FIRE framework provide a systematic, flexible approach to modeling complex nonlinear dynamics.
Fair baseline comparison. To ensure a meaningful and unbiased evaluation of the FIRE framework, all baseline methods, including classical EDMD and Hankel-DMD, are tested using comparable feature classes and dictionary constructions, where applicable. Specifically, for each example, the same temporal delay embeddings, polynomial orders, and trigonometric components are provided to all methods.
Furthermore, when tuning hyperparameters such as embedding depth or polynomial order, all the methods are optimized equivalently using either cross-validation or systematic parameter sweeps, so that the baseline performance reflects their true potential under fair conditions.
Error Metrics. To evaluate the performance of the proposed framework, we employ two complementary metrics.
First, we define the absolute pointwise reconstruction error at time index $i$ as
$$E_{\mathrm{abs}}(i) = \| z(i) - \tilde{z}(i) \|, \tag{58}$$
where $z(i)$ and $\tilde{z}(i)$ denote the true and reconstructed system states, respectively. This metric captures the instantaneous deviation between the true and predicted trajectories and is particularly useful for identifying transient effects, peak mismatches, and localized reconstruction failures.
Second, we report a global relative reconstruction error over the full time horizon,
$$E_{rr} = \frac{\| Z - \tilde{Z} \|_2}{\| Z \|_2}, \tag{59}$$
where $Z = [z(1), \ldots, z(T)]$ and $\tilde{Z} = [\tilde{z}(1), \ldots, \tilde{z}(T)]$ are the matrices of the true and reconstructed trajectories. This normalized measure provides a scale-invariant assessment of the overall reconstruction quality and enables fair comparison across different models, dictionaries, and experimental settings.
Together, these metrics offer both local (time-resolved) and global (aggregate) perspectives on reconstruction performance.
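The two metrics can be computed as sketched below. The function names are hypothetical, and two interpretation choices are assumptions: the pointwise error is taken as the Euclidean norm of each snapshot difference, and $\| \cdot \|_2$ in (59) is read as the matrix 2-norm (spectral norm).

```python
import numpy as np

def pointwise_error(Z, Z_rec):
    """E_abs(i): Euclidean norm of z(i) - z~(i) for each snapshot (one column per time index)."""
    return np.linalg.norm(Z - Z_rec, axis=0)

def relative_error(Z, Z_rec):
    """E_rr = ||Z - Z~||_2 / ||Z||_2 using the spectral norm."""
    return np.linalg.norm(Z - Z_rec, 2) / np.linalg.norm(Z, 2)

# Toy trajectories: the reconstruction is the truth scaled by 0.9 (illustrative only).
Z = np.array([[3.0, 0.0], [0.0, 4.0]])
Z_rec = 0.9 * Z

e_abs = pointwise_error(Z, Z_rec)    # one value per time index
e_rel = relative_error(Z, Z_rec)     # single scalar for the whole horizon
```

If the Frobenius norm were intended instead, replacing the argument `2` by `"fro"` would be the only change needed.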
Example 1.
Nonlinear Epidemiological Dynamics (SIR Model)
To assess the performance of the generalized transfer operator $\mathcal{G}$ on nonlinear dynamics arising in biological systems, we consider the classical SIR (Susceptible–Infectious–Recovered) model. This compartmental framework is governed by a bilinear infection term and thus represents a canonical benchmark for testing the ability of operator-theoretic methods to linearize and approximate intrinsically nonlinear flows. The evolution of the susceptible ($S$), infectious ($I$), and recovered ($R$) populations is described by the following nonlinear system:
$$\frac{dS}{dt} = -\frac{\beta S I}{N_p}, \qquad \frac{dI}{dt} = \frac{\beta S I}{N_p} - \gamma I, \qquad \frac{dR}{dt} = \gamma I,$$
where $N_p$ denotes the population size, $\beta$ is the transmission rate, and $\gamma$ is the recovery rate.
The numerical experiment is conducted with population size $N_p = 10{,}000$ and initial conditions $I_0 = 10$, $R_0 = 0$, and $S_0 = N_p - I_0$. The epidemiological parameters are set to $\beta = 0.4$ and $\gamma = 0.1$, yielding a basic reproduction number $\mathcal{R}_0 = \beta / \gamma = 4.0$, corresponding to a pronounced epidemic growth regime. The system is simulated over a time horizon of $T_{\mathrm{end}} = 200$ days using high-resolution temporal sampling with step size $\Delta t$ to generate snapshot data. The resulting compartmental dynamics are shown in Figure 1.
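Snapshot data for this setup could be generated as sketched below with a classical fourth-order Runge-Kutta scheme. The integrator and the step size `dt = 0.1` are assumptions of this sketch, since the value of $\Delta t$ and the solver are not specified in the text; the model parameters are those stated above.

```python
import numpy as np

def sir_rhs(z, beta, gamma, Np):
    """Right-hand side of the SIR system for z = (S, I, R)."""
    S, I, R = z
    infection = beta * S * I / Np
    return np.array([-infection, infection - gamma * I, gamma * I])

def simulate_sir(beta=0.4, gamma=0.1, Np=10_000.0, I0=10.0, t_end=200.0, dt=0.1):
    """Integrate the SIR model with RK4; dt = 0.1 is an assumed step size."""
    steps = int(round(t_end / dt))
    Z = np.empty((3, steps + 1))
    Z[:, 0] = [Np - I0, I0, 0.0]
    for k in range(steps):
        z = Z[:, k]
        k1 = sir_rhs(z, beta, gamma, Np)
        k2 = sir_rhs(z + 0.5 * dt * k1, beta, gamma, Np)
        k3 = sir_rhs(z + 0.5 * dt * k2, beta, gamma, Np)
        k4 = sir_rhs(z + dt * k3, beta, gamma, Np)
        Z[:, k + 1] = z + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return np.arange(steps + 1) * dt, Z

t, Z = simulate_sir()                 # rows of Z: S(t), I(t), R(t)
```

Since the three right-hand sides sum to zero, the total population $S + I + R$ stays at $N_p$ along the computed trajectory, which is a convenient check on the generated data.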
Due to the bilinear coupling between the susceptible and infectious compartments, the SIR vector field is inherently nonlinear and non-normal. The generalized transfer operator $\mathcal{G}$ is therefore tasked with approximating the induced flow between epidemiological compartments via a linear evolution in an appropriately lifted feature space. By constructing a dictionary $\mathcal{D}$ of nonlinear observables, the proposed framework captures the dominant epidemic modes and provides a finite-dimensional linear representation of the underlying nonlinear dynamics.
To accurately capture the nonlinear interactions inherent in the SIR dynamics, we employ an augmented lifting dictionary that combines temporal delay embeddings with explicitly constructed nonlinear and differential features. This enriched representation enables the generalized transfer operator $\mathcal{G}$ to approximate the nonlinear epidemic flow by a linear evolution in a suitably high-dimensional feature space, where the dominant compartmental interactions are made explicit.
Let the population state be denoted by $z(t) = [z_1(t), z_2(t), z_3(t)]^{T}$, where $z_1(t) = S(t)$, $z_2(t) = I(t)$, and $z_3(t) = R(t)$. The feature map $\Psi$ is constructed by integrating three complementary sources of information:
  • Temporal delay embedding (Hankel structure). We employ a delay depth of $s = 24$, yielding a multi-step embedding that encodes the recent temporal history of the epidemic. This extended memory allows the operator to implicitly capture time-varying infection and recovery rates and enhances the identifiability of transient epidemic phases.
  • Nonlinear interaction observables. Motivated by the bilinear infection term $S I$ in the governing equations, we explicitly incorporate cross-product observables of the following form:
    $$\psi_1(t_k) = z_1(t_k) \, z_2(t_k),$$
    thereby embedding the dominant nonlinear coupling directly into the lifted space.
  • Differential information. To further align the lifted dynamics with the underlying continuous-time structure, we augment the dictionary with a numerical approximation of the recovered population derivative:
    $$\dot{R} \approx \frac{R_{k+1} - R_k}{\Delta t}.$$
    Since $\dot{R} = \gamma I$, this observable provides a direct linear proxy for the infectious compartment and substantially improves the spectral consistency of the learned operator. Accordingly, we include
    $$\psi_2(t_k) = \frac{z_3(t_{k+1}) - z_3(t_k)}{\Delta t}$$
    as an additional feature.
Collectively, this construction yields an extended state space $\mathcal{M}^{25}$. The resulting lifted feature vector is given by
$$\psi(t_k) = [z(t_k), \ldots, z(t_{k+23}), \psi_1(t_k), \ldots, \psi_1(t_{k+23}), \psi_2(t_k), \ldots, \psi_2(t_{k+23})]^{T},$$
so that, at each time step, the lifted state simultaneously encodes raw compartment values, nonlinear interaction terms, and differential information across all delay coordinates.
The data matrices are assembled via a standard time-shifting procedure,
$$\Psi_X = [\psi(t_1), \ldots, \psi(t_T)] \qquad \text{and} \qquad \Psi_Y = [\psi(t_2), \ldots, \psi(t_{T+1})],$$
resulting in matrices of dimension $1024 \times T$, with $T = 120$ training snapshots. Applying Algorithm 1, we compute the finite-dimensional approximation $\mathbf{G}$ of the generalized transfer operator and use it to reconstruct the system dynamics.
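The structure of the lifted vector and of the time-shifted data matrices can be sketched as follows. The function name `sir_features`, the stacking order within each block, and the toy input are assumptions; in particular, the row count produced by this sketch is a property of the sketch itself and is not claimed to reproduce the feature dimension reported above.

```python
import numpy as np

def sir_features(Z, dt, n_delays):
    """Stack, over n_delays consecutive samples, the raw state, S*I, and a forward difference of R.

    Z has shape (3, T_total) with rows S, I, R. Returns shape (5 * n_delays, K).
    """
    S, I, R = Z
    psi1 = S[:-1] * I[:-1]                        # psi_1(t_k) = z_1(t_k) z_2(t_k)
    psi2 = (R[1:] - R[:-1]) / dt                  # psi_2(t_k) = (z_3(t_{k+1}) - z_3(t_k)) / dt
    base = np.vstack([Z[:, :-1], psi1, psi2])     # shape (5, T_total - 1)
    K = base.shape[1] - n_delays + 1
    raw = [base[:3, j:j + K] for j in range(n_delays)]
    prod = [base[3:4, j:j + K] for j in range(n_delays)]
    diff = [base[4:5, j:j + K] for j in range(n_delays)]
    return np.vstack(raw + prod + diff)

# Toy input with simple integer values so the layout is easy to inspect (not SIR data).
Z_toy = np.array([[8, 7, 6, 5, 4, 3, 2, 1],
                  [1, 2, 3, 4, 5, 6, 7, 8],
                  [0, 1, 3, 6, 10, 15, 21, 28]], dtype=float)
Psi = sir_features(Z_toy, dt=1.0, n_delays=3)
Psi_X, Psi_Y = Psi[:, :-1], Psi[:, 1:]            # standard time shift between columns
```

With simulated SIR data, the same call with `n_delays=24` would produce the delay blocks $t_k, \ldots, t_{k+23}$ used in the lifted vector above.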
Figure 2 compares the reconstructed SIR trajectories obtained using the proposed FIRE framework with those produced by the Hankel-DMD method.
The FIRE-based reconstruction closely follows the true nonlinear dynamics of all compartments, accurately reproducing both the transient growth phase and the decay regime of the infectious population. In particular, the timing and magnitude of the infection peak are well captured, indicating that the learned transfer operator successfully encodes the dominant nonlinear interaction mechanisms of the epidemic process.
Regularization and conditioning. The lifted regression problem arising in the SIR example is inherently high-dimensional, with a feature space dimension of $N = 1024$ and only $T = 120$ training snapshots. As a result, the estimation of the finite-dimensional transfer operator $\mathbf{G}$ from the relation $\Psi_Y \approx \mathbf{G} \Psi_X$ is severely underdetermined and requires explicit regularization to ensure numerical stability and meaningful spectral properties.
In all numerical experiments, the operator $\mathbf{G}$ is computed using a truncated singular value decomposition (SVD) of the feature matrix $\Psi_X$. Only the dominant singular directions are retained, corresponding to singular values satisfying $\sigma_j / \sigma_{\max} \ge 10^{-6}$. This procedure yields a regularized low-rank approximation of the regression operator and effectively filters spurious directions associated with noise and redundant observables. The singular value spectrum of $\Psi_X$ exhibits rapid decay, revealing a strong intrinsic low-rank structure induced by the combined effect of delay embeddings and nonlinear interaction features. Importantly, the dominant eigenvalues of the learned operator are observed to be robust with respect to the chosen truncation threshold.
Although Tikhonov (ridge) regularization (57) was not required for the results reported in this example, it remains a viable option for further stabilizing operator estimation in more severely ill-conditioned or noise-contaminated settings. In practice, SVD truncation alone proved sufficient to control conditioning and prevent overfitting.
To further assess numerical stability, we monitor the conditioning of the regression by examining the singular value decay of $\Psi_X$ and the effective rank of the empirical Gram matrix $\Psi_X \Psi_X^{*}$. Despite the nominally large lifted dimension ($N \gg T$), the data-driven operator effectively acts on a substantially lower-dimensional subspace induced by the correlated temporal structure of the delay coordinates and nonlinear observables. This implicit dimensionality reduction enables stable operator identification and mitigates overfitting, even in the underdetermined regime.
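The truncated-SVD estimate described above can be sketched as follows. The function name `truncated_svd_operator` and the low-rank synthetic data are assumptions; the relative threshold $10^{-6}$ is the one quoted in the text.

```python
import numpy as np

def truncated_svd_operator(Psi_X, Psi_Y, rel_tol=1e-6):
    """Estimate G from Psi_Y ≈ G Psi_X, keeping singular values with sigma_j / sigma_max >= rel_tol."""
    U, sigma, Vh = np.linalg.svd(Psi_X, full_matrices=False)
    keep = sigma >= rel_tol * sigma[0]            # singular values come in descending order
    r = int(np.count_nonzero(keep))               # effective rank after truncation
    U_r, sigma_r, Vh_r = U[:, :r], sigma[:r], Vh[:r, :]
    G = (Psi_Y @ Vh_r.conj().T / sigma_r) @ U_r.conj().T   # Psi_Y V_r Sigma_r^{-1} U_r^*
    return G, r

# Synthetic underdetermined data with intrinsic rank 3 (illustrative sizes only).
rng = np.random.default_rng(5)
Psi_X = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 12))
Psi_Y = rng.standard_normal((20, 20)) @ Psi_X

G, r = truncated_svd_operator(Psi_X, Psi_Y)
fit_residual = np.linalg.norm(Psi_Y - G @ Psi_X)
```

The returned rank `r` is the kind of effective-rank diagnostic mentioned above: it shows how many directions of the lifted space the operator actually acts on.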
To provide a systematic comparison on the same simulated dataset, we evaluate several baseline methods, including Dynamic Mode Decomposition (DMD), Hankel DMD, Extended DMD (EDMD), and continuous-time EDMD (cEDMD). Among these approaches, Hankel DMD yields the best performance. While DMD, EDMD, and cEDMD fail to accurately reconstruct the nonlinear epidemic dynamics, Hankel DMD achieves a reconstruction error on the order of $10^{-4}$ when a delay dimension of $s = 38$ is employed. The corresponding Hankel-DMD reconstruction is shown in Figure 2.
Reconstruction accuracy is quantified using the absolute pointwise error defined in (58). Figure 3 compares the temporal evolution of the reconstruction error for FIRE and Hankel DMD over the training interval. The FIRE framework consistently achieves a lower reconstruction error across all the time steps, with particularly pronounced improvements during the nonlinear growth and saturation phases of the epidemic.
For completeness, Figure 4 reports the reconstructions obtained using DMD, EDMD, and continuous EDMD. These methods fail to capture the essential nonlinear features of the SIR dynamics.
To provide a rigorous baseline comparison, the EDMD and continuous EDMD schemes were equipped with a comprehensive polynomial library, as follows:
$$\mathcal{D}_{\mathrm{EDMD}} = \{\, z,\; z^2,\; z_1 z_2,\; z^3,\; z_1^2 z_2,\; z_1 z_2^2 \,\},$$
including all relevant cross-product terms up to the third order. For the continuous EDMD scheme, the evolution data matrix is constructed as
$$\dot{\Psi}_{\mathrm{cEDMD}} = [\, \dot z,\; 2 z \dot z,\; \dot z_1 z_2 + z_1 \dot z_2,\; 3 z^2 \dot z,\; 2 z_1 \dot z_1 z_2 + z_1^2 \dot z_2,\; \dot z_1 z_2^2 + 2 z_1 z_2 \dot z_2 \,],$$
where the time derivatives are approximated using forward finite differences, as follows:
$$\dot z(t) \approx \frac{z(t + \Delta t) - z(t)}{\Delta t}.$$
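A minimal sketch of how the polynomial dictionary and its chain-rule evolution matrix could be assembled for two state components. The function name `cedmd_matrices` and the componentwise reading of $z$, $z^2$, $z^3$ are assumptions for illustration, not the author's implementation.

```python
import numpy as np

def cedmd_matrices(Z, dt):
    """Z has shape (2, m): rows are z1 and z2 sampled every dt (assumed layout)."""
    Z = np.asarray(Z, dtype=float)
    Zdot = (Z[:, 1:] - Z[:, :-1]) / dt   # forward finite differences
    z1, z2 = Z[:, :-1]                   # the last sample has no forward difference
    d1, d2 = Zdot
    # Dictionary {z, z^2, z1 z2, z^3, z1^2 z2, z1 z2^2}, powers taken componentwise.
    Psi = np.vstack([z1, z2, z1**2, z2**2, z1 * z2,
                     z1**3, z2**3, z1**2 * z2, z1 * z2**2])
    # Chain-rule time derivative of every dictionary element.
    Psi_dot = np.vstack([d1, d2, 2 * z1 * d1, 2 * z2 * d2, d1 * z2 + z1 * d2,
                         3 * z1**2 * d1, 3 * z2**2 * d2,
                         2 * z1 * d1 * z2 + z1**2 * d2,
                         d1 * z2**2 + 2 * z1 * z2 * d2])
    return Psi, Psi_dot
```

Because the derivatives enter through forward differences, each row of `Psi_dot` agrees with a direct forward difference of the corresponding row of `Psi` only up to terms of order $\Delta t$.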
The reconstruction quality is evaluated using the relative reconstruction error defined in (59). Table 1 reports the errors for all methods considered.
Ablation Study. To isolate the contribution of each modeling component in the proposed FIRE framework, we perform an ablation study by systematically removing elements from the lifted representation. Specifically, we consider the following cases:
  1. Full dictionary: Delays + nonlinear cross-terms + derivative features.
  2. No delays: Remove temporal embeddings; retain nonlinear cross-terms and derivatives.
  3. No cross-terms: Remove nonlinear products $S \cdot I$; retain delays and derivatives.
  4. No derivatives: Remove derivative features; retain delays and nonlinear cross-terms.
Table 2 reports the relative reconstruction error for different dictionary configurations.
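The four ablation configurations can be expressed as switches on a single dictionary builder. The sketch below is hypothetical: the function `build_dictionary`, the base feature set $[S, I, R, S \cdot I, \dot{R}]$, and the stacking order are assumptions. They were chosen because 5 base features times $s = 24$ delays gives 120 rows, which matches the 120-dimensional observable space mentioned in the text, but the exact layout is not confirmed by it.

```python
import numpy as np

def build_dictionary(S, I, R, dt, s=24, delays=True, cross=True, deriv=True):
    """Stack base features over s consecutive samples (assumed layout)."""
    S, I, R = (np.asarray(a, dtype=float) for a in (S, I, R))
    m = S.size - 1                            # forward difference uses one extra sample
    rows = [S[:m], I[:m], R[:m]]
    if cross:
        rows.append(S[:m] * I[:m])            # transmission nonlinearity S * I
    if deriv:
        rows.append((R[1:] - R[:-1]) / dt)    # forward-difference estimate of dR/dt
    base = np.vstack(rows)                    # shape (n_base, m)
    depth = s if delays else 1
    cols = m - depth + 1
    # Row block j holds the base features shifted by j samples.
    return np.vstack([base[:, j:j + cols] for j in range(depth)])
```

Calling the builder with one flag set to `False` reproduces the corresponding ablation case, for example `build_dictionary(S, I, R, dt, cross=False)` for the "no cross-terms" configuration.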
The numerical results indicate that the explicit inclusion of the transmission nonlinearity ($S \cdot I$) together with the recovered rate derivative $\dot{R}$ yields the highest reconstruction fidelity. By embedding the functional form of the infection mechanism directly into the observable space, the generalized operator $G$ is able to accurately learn the transition from the initial exponential growth phase to the epidemic peak and the subsequent recovery regime. This hybrid strategy, which combines deep delay embeddings with physics-informed nonlinear observables, enables the proposed framework to significantly outperform standard DMD-based methods in both short- and long-term prediction accuracy.
From a theoretical perspective, this behavior is consistent with Takens’ embedding theorem, which guarantees that the topology of a finite-dimensional dynamical system can be reconstructed from a single observed time series given a sufficiently large delay window. The choice of a deep delay embedding with s = 24 allows G to capture the delayed feedback between susceptible–infectious interactions and the ensuing peak in the infectious population. In effect, the operator implicitly encodes higher-order temporal dependencies that are not explicitly present in the original state variables, providing a fading memory representation of the epidemic dynamics.
By projecting the nonlinear flow onto a 120-dimensional extended observable space, the learned operator becomes markedly less sensitive to local numerical fluctuations in the sampled S, I, and R trajectories. This implicit regularization, induced by the delay structure and correlated nonlinear features, contributes to the observed robustness and accuracy of the FIRE reconstruction in highly nonlinear and transient regimes.
Example 2.
Forced Harmonic Oscillator with Transient Dynamics
A central challenge in data-driven system identification is the recovery of the underlying state space dynamics from limited, and often scalar, measurements. Classical operator-theoretic approaches typically rely on access to the full state or multiple observables and may fail to reconstruct the latent dynamical structure when only a single measurement is available. In contrast, the proposed generalized transfer operator G enables state reconstruction through a lifting strategy that combines delay embeddings with derivative-based observables, thereby recovering the intrinsic manifold from univariate time series data.
To assess the performance of G in this setting, we consider a second-order, linear, non-homogeneous dynamical system: a forced harmonic oscillator exhibiting both transient and steady-state behavior. The governing equation is
$$\ddot z + 5 z = \sin(t), \tag{61}$$
where $z(t)$ denotes the displacement. The natural frequency of the unforced system is $\omega_0 = \sqrt{5} \approx 2.236$ rad/s, while the external forcing acts at frequency $\omega = 1$ rad/s. The coexistence of these frequencies induces a transient regime followed by a forced response, providing a nontrivial test case for operator-based identification from scalar data.
The trajectory is generated over the time interval $t \in [0, 8]$ s, starting from rest with initial conditions $z(0) = 0$ and $\dot z(0) = 0$. The system is integrated using a variable-step Runge–Kutta method (ode45) with tight error tolerances ($\mathrm{RelTol} = 10^{-6}$, $\mathrm{AbsTol} = 10^{-6}$). A fixed sampling interval of $\Delta t = 10^{-3}$ s yields 8001 discrete snapshots $\{z_k\}$, which capture both the initial transient phase and the subsequent forced oscillatory dynamics. This dataset serves as the sole input for constructing the data-driven approximation of the generalized operator $G$.
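As a worked check that is not part of the original derivation, the initial-value problem $\ddot z + 5z = \sin(t)$ with $z(0) = \dot z(0) = 0$ can be solved in closed form. Taking a particular solution $z_p = \sin(t)/(5 - 1)$ and matching the initial conditions with the homogeneous part gives

$$z(t) = \frac{1}{4}\sin(t) - \frac{1}{4\sqrt{5}}\sin\!\big(\sqrt{5}\,t\big), \qquad \dot z(t) = \frac{1}{4}\cos(t) - \frac{1}{4}\cos\!\big(\sqrt{5}\,t\big).$$

The first term is the forced response at $\omega = 1$ rad/s and the second oscillates at the natural frequency $\omega_0 = \sqrt{5}$ rad/s. Substituting back, $\ddot z + 5z = \big(-\tfrac{1}{4} + \tfrac{5}{4}\big)\sin(t) + \big(\tfrac{\sqrt{5}}{4} - \tfrac{\sqrt{5}}{4}\big)\sin(\sqrt{5}\,t) = \sin(t)$, so the expression can serve as a reference when checking a numerically integrated trajectory.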
In this experiment, we assume that only the scalar displacement signal z ( t ) is available for measurement. The velocity z ˙ ( t ) and the external forcing are not directly observed. To compensate for this incomplete state information, we construct a compact yet expressive dictionary that embeds the scalar time series into a low-dimensional feature space enriched with derivative and forcing information. Our objective is to identify a continuous-time representation of the generalized transfer operator G via its infinitesimal generator.
Unlike discrete-time formulations, which approximate a finite-step evolution map, the generator-based approach seeks a linear operator G such that the lifted observables satisfy
$$\dot{\Psi}(t_k) \approx G\, \Psi(t_k), \tag{62}$$
where $\Psi(t)$ denotes the feature map evaluated along the trajectory. This formulation directly approximates the action of the Lie generator associated with the underlying dynamics.
This example illustrates FIRE’s ability to operate on heterogeneous spaces, where the source space X and the target space Y are not identical. Specifically, X corresponds to a reconstructed feature space obtained from scalar measurements, while Y represents the infinitesimal evolution of these features under the system dynamics. The alignment between these spaces is achieved through a localized temporal window.
To this end, we introduce a symmetric delay structure with one step backward and one step forward, corresponding to memory depth s = 1 and anticipation depth q = 1 . The resulting extended state at time index k is
$$x_k = [\, z_{k-1},\; z_k,\; z_{k+1} \,]^T \in \mathcal{M}^3. \tag{63}$$
This local temporal window encodes sufficient information to approximate first- and second-order derivatives while remaining agnostic to the true physical state $(z, \dot z)$.
From the windowed measurements (63), numerical derivatives are computed using centered finite differences:
$$\dot z_{\mathrm{approx}}(t_k) := \frac{z_{k+1} - z_{k-1}}{2 \Delta t}, \qquad \ddot z_{\mathrm{approx}}(t_k) := \frac{z_{k+1} - 2 z_k + z_{k-1}}{\Delta t^2}.$$
These approximations provide local estimates of the velocity and acceleration without requiring explicit access to latent variables.
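The centered differences above can be sketched in a few lines. The helper name `centered_derivatives` and the test signal $\sin(t)$ are illustrative assumptions; the sampling matches the $\Delta t = 10^{-3}$ s and 8001 snapshots stated in the text.

```python
import numpy as np

def centered_derivatives(z, dt):
    """First and second derivative estimates at the interior samples k = 1..len(z)-2."""
    z = np.asarray(z, dtype=float)
    zdot = (z[2:] - z[:-2]) / (2.0 * dt)
    zddot = (z[2:] - 2.0 * z[1:-1] + z[:-2]) / dt**2
    return zdot, zddot

dt = 1e-3
t = np.arange(8001) * dt
zdot, zddot = centered_derivatives(np.sin(t), dt)
err_vel = np.max(np.abs(zdot - np.cos(t[1:-1])))   # error against the exact velocity
err_acc = np.max(np.abs(zddot + np.sin(t[1:-1])))  # error against the exact acceleration
```

Both estimates are second-order accurate in $\Delta t$, and each one consumes the forward sample $z_{k+1}$, which is why the first and last snapshots have no derivative estimate.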
Based on this construction, we define a dictionary of observables $\mathcal{D} = \{\psi_1, \dots, \psi_4\}$ acting on $x_k$ as
$$\psi_1(x_k) = z_k, \quad \psi_2(x_k) = \dot z_{\mathrm{approx}}(t_k), \quad \psi_3(x_k) = \sin(t_k), \quad \psi_4(x_k) = \cos(t_k).$$
The inclusion of the trigonometric forcing terms explicitly encodes the non-autonomous input, while the derivative-based observable enables the reconstruction of the underlying second-order dynamics from the scalar data.
The use of the forward sample z k + 1 in the dictionary represents a localized anticipation depth during operator identification. Importantly, this non-causal information is required only in the offline training phase. Once the operator G is learned, the resulting spectral model is fully causal: the natural and forcing frequencies are embedded in the eigenvalues of G, enabling autonomous prediction without access to future measurements.
The resulting feature map is
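$$\Psi(t_k) = \begin{pmatrix} z(t_k) \\ \dot z_{\mathrm{approx}}(t_k) \\ \sin(t_k) \\ \cos(t_k) \end{pmatrix},$$
with time derivative
$$\dot{\Psi}(t_k) = \begin{pmatrix} \dot z_{\mathrm{approx}}(t_k) \\ \ddot z_{\mathrm{approx}}(t_k) \\ \cos(t_k) \\ -\sin(t_k) \end{pmatrix},$$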
where all derivatives are computed numerically from the sampled data.
Using the training dataset $\{\Psi(t_k), \dot{\Psi}(t_k)\}_{k=1}^{T}$ with $T = 5600$ samples, we compute the continuous-time FIRE approximation by solving a least squares problem for the generator matrix $G \in \mathbb{R}^{4 \times 4}$ in (62).
This dictionary design effectively “unfolds” the scalar signal z ( t ) into a feature space in which the dynamics become linear and autonomous. By explicitly incorporating the driving terms sin ( t ) and cos ( t ) into the observable dictionary, the generalized operator G is able to represent the non-autonomous forcing without introducing explicit time dependence in the operator coefficients. As a result, the learned operator acts on an augmented state in which the external forcing is internalized as part of the state evolution.
We use the first 5600 snapshots as training data. The resulting operator matrix $G$, obtained from the least squares approximation $\dot{\Psi} \approx G \Psi$, provides a finite-dimensional numerical representation of the Lie generator associated with the lifted dynamics. This enables both high-fidelity state reconstruction and meaningful spectral analysis, demonstrating that the proposed framework remains robust even under the severe constraint of univariate measurements.
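The generator regression can be sketched end to end as follows. This is an illustration under stated assumptions rather than the author's code: the trajectory is taken from the closed-form solution of $\ddot z + 5z = \sin(t)$ with zero initial conditions instead of an ode45 run, and the least squares problem is solved with `numpy.linalg.lstsq`.

```python
import numpy as np

dt = 1e-3
t = np.arange(8001) * dt
# Closed-form solution of z'' + 5 z = sin(t), z(0) = z'(0) = 0 (stands in for ode45 output).
w0 = np.sqrt(5.0)
z = np.sin(t) / 4.0 - np.sin(w0 * t) / (4.0 * w0)

# Centered differences on the window [z_{k-1}, z_k, z_{k+1}].
zdot = (z[2:] - z[:-2]) / (2.0 * dt)
zddot = (z[2:] - 2.0 * z[1:-1] + z[:-2]) / dt**2
tk = t[1:-1]

# Feature map and its time derivative, one column per time index.
Psi = np.vstack([z[1:-1], zdot, np.sin(tk), np.cos(tk)])
Psi_dot = np.vstack([zdot, zddot, np.cos(tk), -np.sin(tk)])

T = 5600
# Solve min_G || Psi_dot - G Psi ||_F through the transposed system Psi^T G^T = Psi_dot^T.
G = np.linalg.lstsq(Psi[:, :T].T, Psi_dot[:, :T].T, rcond=None)[0].T
freqs = np.sort(np.abs(np.linalg.eigvals(G).imag))  # identified angular frequencies
```

In this setting the identified matrix is close to the generator implied by the governing equation, with rows $[0, 1, 0, 0]$, $[-5, 0, 1, 0]$, $[0, 0, 0, 1]$, $[0, 0, -1, 0]$, and its eigenvalues lie near $\pm i$ and $\pm i\sqrt{5}$, that is, the forcing and natural frequencies.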
The dynamic reconstruction obtained using the FIRE algorithm is shown in Figure 5. For comparison, the right panel shows the reconstruction obtained using EDMD with a standard polynomial–trigonometric dictionary.
Figure 6 reports the pointwise absolute reconstruction error, computed using (58), for both the training interval and the out-of-sample prediction horizon. The observed error remains on the order of $10^{-4}$, with slightly increased deviations during the prediction phase, reflecting the extrapolative nature of the task.
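Prediction from a generator model amounts to evaluating $\Psi(t) = e^{Gt}\,\Psi(0)$. The sketch below uses the ideal generator implied by the governing equation in the coordinates $[z, \dot z, \sin t, \cos t]$ rather than a learned matrix, and evaluates the exponential through an eigendecomposition; the function name `propagate` is an assumption for illustration.

```python
import numpy as np

# Ideal generator implied by z'' + 5 z = sin(t) in the coordinates [z, z', sin t, cos t].
G = np.array([[0.0, 1.0, 0.0, 0.0],
              [-5.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [0.0, 0.0, -1.0, 0.0]])

def propagate(G, psi0, times):
    """Evaluate Psi(t) = exp(G t) Psi(0) via the eigendecomposition of G."""
    lam, V = np.linalg.eig(G)
    b = np.linalg.solve(V, psi0.astype(complex))        # modal amplitudes
    return (V @ (np.exp(np.outer(lam, times)) * b[:, None])).real

times = np.linspace(0.0, 8.0, 801)
psi0 = np.array([0.0, 0.0, 0.0, 1.0])   # z(0) = 0, z'(0) = 0, sin(0) = 0, cos(0) = 1
Psi_pred = propagate(G, psi0, times)
z_exact = np.sin(times) / 4 - np.sin(np.sqrt(5) * times) / (4 * np.sqrt(5))
max_err = np.max(np.abs(Psi_pred[0] - z_exact))
```

No future samples are needed at this stage: the forcing is carried by the last two coordinates, which evolve autonomously under $G$. With a learned $G$ in place of the ideal one, the same routine would produce the out-of-sample trajectory, and its deviation from the reference would reflect the identification error.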
To provide a rigorous benchmark, we compare FIRE against several standard data-driven decomposition methods: Dynamic Mode Decomposition (DMD), Hankel DMD, Extended DMD (EDMD), and continuous EDMD. Among these approaches, EDMD yields the best performance, although its reconstruction accuracy remains significantly inferior to that of FIRE.
A representative EDMD dictionary augmented with explicit forcing terms is given by
$$\mathcal{D}_{\mathrm{EDMD}} = \{\, z,\; z^2,\; z^3,\; \sin(t),\; \cos(t),\; 1 \,\}. \tag{66}$$
Despite the inclusion of cubic nonlinearities and trigonometric forcing, EDMD fails to accurately reconstruct the system dynamics, as illustrated in Figure 5.
For continuous EDMD, the same dictionary (66) is employed, with the corresponding evolution data matrix
$$\dot{\Psi}_{\mathrm{cEDMD}} = [\, \dot z,\; 2 z \dot z,\; 3 z^2 \dot z,\; \cos(t),\; -\sin(t),\; 0 \,],$$
where time derivatives are approximated numerically.
Table 3 summarizes the relative reconstruction errors (59) for all considered methods.
Unlike EDMD or Hankel-DMD, which rely on fixed temporal shifts within a single observable space, FIRE explicitly permits a general alignment map $\pi : \mathcal{X} \to \mathcal{Y}$ between heterogeneous spaces. In this example, the source space $\mathcal{X}$ encodes local temporal context through windowed scalar measurements, while the target space $\mathcal{Y}$ corresponds to the instantaneous evolution of lifted observables.
Consequently, the learned operator G is not a time-shift operator but a data-driven approximation of the infinitesimal generator governing the induced feature space dynamics. This distinction enables FIRE to recover hidden state information and non-autonomous forcing effects without requiring backward-time data or explicit access to the full system state. As a result, the proposed framework remains applicable in settings where classical Koopman-based methods either fail or require substantial modification.
Example 3.
Coupled Two-Mass Spring–Damper System
To demonstrate the effectiveness of the generalized transfer operator $G$ in higher-dimensional state spaces, we consider a coupled two-degree-of-freedom (2-DOF) system. The model consists of two masses, $m_1$ and $m_2$, coupled through a linear spring with stiffness $k$ and a viscous damping element with coefficient $c$. This example illustrates the ability of the proposed framework to identify coupled dynamics and accurately capture unstable regimes using displacement-only measurements.
The governing equations of motion are given by the following coupled second-order ordinary differential equations:
$$m_1 \ddot z_1 + c\,(\dot z_1 - \dot z_2) + k\,(z_1 - z_2) = 0, \qquad m_2 \ddot z_2 + c\,(\dot z_2 - \dot z_1) + k\,(z_2 - z_1) = 0,$$
where $z_1(t)$ and $z_2(t)$ denote the displacements of masses $m_1$ and $m_2$, respectively.
The numerical experiment is conducted using the parameters $m_1 = 2$, $m_2 = 1$, $k = 6$, and $c = -0.5$. The system is simulated over the time interval $t \in [0, 50]$ with time step $\Delta t = 0.01$ and initial condition $z_0 = [1, 1]^T$, yielding 5001 discrete snapshots. The negative damping coefficient introduces an unstable regime in which energy is continuously injected into the system, causing exponential growth of oscillation amplitudes. This configuration provides a challenging test case for data-driven spectral identification methods.
The available measurements consist solely of the displacement vector $z(t) = [z_1(t), z_2(t)]^T$. To construct the transfer operator $G$, we lift these measurements into an augmented phase-space representation using forward finite differences. Let $z_k = z(t_k)$ denote the sampled displacement at time $t_k$. We define the extended state vectors as follows:
$$x_k = [\, z_k;\; z_{k+1} \,], \qquad y_k = [\, z_{k+1};\; z_{k+2} \,],$$
which implies that the source and target spaces coincide, i.e., $\mathcal{X} = \mathcal{Y} \subseteq \mathcal{M}^4$.
Based on these windowed measurements, we define the lifted observables as follows:
$$\psi_1(x_k) = z_k, \qquad \psi_2(x_k) = \frac{z_{k+1} - z_k}{\Delta t},$$
resulting in the feature vector
$$\psi(t_k) = \begin{pmatrix} z_k \\[2pt] \dfrac{z_{k+1} - z_k}{\Delta t} \end{pmatrix},$$
which pairs displacement measurements with an approximation of the velocity. This lifting transforms the displacement-only observations into a phase space representation suitable for linear operator identification.
The first 70 % of the snapshots are used as training data for Algorithm 1. The reconstruction of the training trajectories obtained using FIRE is shown in Figure 7, while the out-of-sample predictions are presented in Figure 8.
The data matrices $\Psi_X$ and $\Psi_Y \in \mathbb{R}^{4 \times T}$ are constructed by time-shifting the lifted features,
$$\Psi_X = [\, \psi(t_1), \dots, \psi(t_T) \,], \qquad \Psi_Y = [\, \psi(t_2), \dots, \psi(t_{T+1}) \,],$$
where T = 3500 denotes the number of training snapshots. This construction yields a finite-dimensional approximation of the discrete-time flow map in feature space.
The resulting operator G serves as a numerical approximation of the generalized transfer operator, mapping the current lifted state to its future counterpart. Despite the unstable nature of the underlying dynamics, the identified operator accurately captures both the inter-mass coupling and the dominant growth rates of the system modes, enabling reliable long-term prediction.
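A minimal sketch of this discrete-time identification follows. Several choices are assumptions made for illustration and are not taken from the text: the damping is set to $c = -0.5$ in line with the description of a negative damping coefficient, the initial state $[1, -1, 0, 0]$ for $[z_1, z_2, \dot z_1, \dot z_2]$ is chosen only so that the relative mode is excited, the trajectory is advanced with a fourth-order Taylor step (which coincides with one RK4 step for a linear system), and the regression uses a truncated-SVD pseudo-inverse because Algorithm 1 is not reproduced in this section.

```python
import numpy as np

m1, m2, k, c, dt = 2.0, 1.0, 6.0, -0.5, 0.01
A = np.array([[0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [-k / m1, k / m1, -c / m1, c / m1],
              [k / m2, -k / m2, c / m2, -c / m2]])
# One RK4 step of a linear system equals the 4th-order Taylor polynomial of exp(A dt).
hA = dt * A
Phi = np.eye(4) + hA + hA @ hA / 2 + hA @ hA @ hA / 6 + hA @ hA @ hA @ hA / 24

n_snap, T = 5001, 3500
W = np.empty((4, n_snap))
W[:, 0] = [1.0, -1.0, 0.0, 0.0]       # assumed initial state [z1, z2, z1', z2']
for j in range(n_snap - 1):
    W[:, j + 1] = Phi @ W[:, j]
Z = W[:2]                             # displacement-only measurements

# Lifted features psi(t_k) = [z_k; (z_{k+1} - z_k) / dt].
Psi = np.vstack([Z[:, :-1], (Z[:, 1:] - Z[:, :-1]) / dt])
Psi_X, Psi_Y = Psi[:, :T], Psi[:, 1:T + 1]

# Truncated-SVD least squares: G = Psi_Y Psi_X^+ (assumed solver).
G = Psi_Y @ np.linalg.pinv(Psi_X, rcond=1e-10)
lam = np.linalg.eigvals(G)
dominant = lam[np.argmax(np.abs(lam))]
mu = np.log(dominant) / dt            # continuous-time growth rate and frequency
```

For the relative coordinate $r = z_1 - z_2$ the governing equations reduce to $\ddot r + 1.5\,c\,\dot r + 9\,r = 0$, so with $c = -0.5$ the unstable pair is $0.375 \pm i\sqrt{9 - 0.375^2}$, and `mu` should recover this growth rate and frequency.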
A comparative analysis was performed to benchmark the proposed FIRE scheme against several standard decomposition techniques, including DMD, Hankel DMD, Extended DMD (EDMD), and continuous EDMD. Among these methods, Hankel DMD provided the best performance, although still inferior to FIRE in terms of reconstruction fidelity.
For EDMD and continuous EDMD, we employ a nonlinear polynomial library:
$$\mathcal{D}_{\mathrm{EDMD}} = \{\, z,\; z^2,\; z_1 z_2,\; z^3,\; z_1^2 z_2,\; z_1 z_2^2 \,\}.$$
The evolution data matrix for continuous EDMD is constructed as
$$\dot{\Psi}_{\mathrm{cEDMD}} = [\, \dot z,\; 2 z \dot z,\; \dot z_1 z_2 + z_1 \dot z_2,\; 3 z^2 \dot z,\; 2 z_1 \dot z_1 z_2 + z_1^2 \dot z_2,\; \dot z_1 z_2^2 + 2 z_1 z_2 \dot z_2 \,],$$
where the time derivatives are approximated using forward finite differences:
$$\dot z(t) \approx \frac{z(t + \Delta t) - z(t)}{\Delta t}.$$
Hankel DMD achieves the best reconstruction among the classical methods when using a delay coefficient s = 2 . Figure 9 shows the training data reconstruction obtained by Hankel DMD.
Figure 10 depicts the pointwise absolute reconstruction error for both FIRE and Hankel DMD, computed according to Formula (58) over the training snapshots $i = 1, \dots, T$, with $T = 3500$.
These results demonstrate that FIRE successfully reconstructs the full four-dimensional state trajectory, even under challenging conditions including coupled, unstable dynamics and limited measurements. The framework effectively captures the inter-mass coupling and reproduces the growth rates of the dominant modes, outperforming classical DMD-based methods and validating its robustness for high-dimensional, multi-component dynamical systems.
Example 4.
Spatio-Temporal Dynamics of a Nonlinear Soliton (PDE)
To demonstrate the applicability of the generalized transfer operator G in high-dimensional state spaces, we consider the evolution of a coherent structure governed by a partial differential equation. Specifically, we study the nonlinear Schrödinger (NLS) equation,
$$i \frac{\partial q}{\partial t} + \frac{1}{2} \frac{\partial^2 q}{\partial \xi^2} + |q|^2 q = 0, \tag{67}$$
where $q(\xi, t)$ is a complex-valued function of space $\xi$ and time $t$. Following [8], we perform a Fourier transform in space to obtain an evolution equation in the Fourier domain:
$$\frac{d \hat q}{dt} = -\frac{i}{2}\, k^2 \odot \hat q + i\, \mathcal{F}\big(|q|^2 q\big),$$
where ⊙ denotes element-wise multiplication, k is the vector of wavenumbers, and F represents the Fast Fourier Transform (FFT).
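The Fourier-domain right-hand side can be sketched as follows. The domain length `L = 40`, the grid construction, and the function name `nls_rhs` are assumptions for illustration, since the text does not state the spatial extent. The check uses the unit soliton $q = \operatorname{sech}(\xi)\,e^{it/2}$, an exact solution of the NLS equation, rather than the $2\operatorname{sech}(\xi)$ initial condition of the experiment.

```python
import numpy as np

def nls_rhs(q_hat, k):
    """d q_hat / dt = -(i/2) k^2 * q_hat + i * FFT(|q|^2 q), with elementwise products."""
    q = np.fft.ifft(q_hat)
    return -0.5j * k**2 * q_hat + 1j * np.fft.fft(np.abs(q)**2 * q)

n, L = 512, 40.0                                   # L is an assumed domain length
xi = (L / n) * np.arange(-n // 2, n // 2)          # periodic grid on [-L/2, L/2)
k = 2.0 * np.pi * np.fft.fftfreq(n, d=L / n)       # angular wavenumbers in FFT order

q0 = 1.0 / np.cosh(xi)                             # unit soliton at t = 0
qt0 = np.fft.ifft(nls_rhs(np.fft.fft(q0), k))      # time derivative in physical space
max_dev = np.max(np.abs(qt0 - 0.5j * q0))          # exact value is (i/2) sech(xi)
```

A time integrator such as a Runge–Kutta scheme would call `nls_rhs` at each stage; with an explicit fixed-step method, the step size has to resolve the fastest linear oscillation, which scales with $k_{\max}^2/2$.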
We discretize the spatial domain using $n = 512$ points, resulting in a high-dimensional state space $\mathcal{M} \subseteq \mathbb{C}^{512}$. The initial condition is a high-amplitude soliton:
$$q(\xi, 0) = 2\, \mathrm{sech}(\xi),$$
and the system is integrated in time with a fourth-order Runge–Kutta scheme over $t \in [0, \pi)$, sampled at $M = 21$ time points with $\Delta t \approx 0.15$. The resulting dataset consists of a snapshot matrix $Z \in \mathbb{C}^{512 \times 21}$ representing the spatio-temporal evolution of the soliton.
To capture the full dynamics, we employ a derivative-augmented dictionary. Each snapshot is paired with its temporal derivative approximated by forward differences, effectively doubling the dimensionality and supplying the operator with explicit rate-of-change information. Specifically, we define the extended state vectors as follows:
$$x_k = [\, z_k;\; z_{k+1} \,], \qquad y_k = [\, z_{k+1};\; z_{k+2} \,],$$
so that the observation spaces satisfy $\mathcal{X} = \mathcal{Y} \subseteq \mathcal{M}^{2n}$, where $n = 512$. The lifted observables are chosen as
$$\psi_1(x_k) = z_k, \qquad \psi_2(x_k) = \frac{z_{k+1} - z_k}{\Delta t},$$
yielding the feature vector $\Psi(x_k) = [\psi_1(x_k), \psi_2(x_k)]^T$. The forward difference derivative approximates the velocity field at each spatial point:
$$\dot z_k \approx \frac{z_{k+1} - z_k}{\Delta t} = \psi_2(x_k),$$
providing the operator with information about the temporal evolution of the system.
Finally, the lifted snapshot matrices are constructed as
$$\Psi_X = [\, \psi(x_1), \dots, \psi(x_{M-2}) \,], \qquad \Psi_Y = [\, \psi(y_1), \dots, \psi(y_{M-2}) \,],$$
with dimensions $1024 \times 19$, since each pair $(x_k, y_k)$ requires the snapshots up to $z_{k+2}$ and $M = 21$. Using Algorithm 1, we compute the finite-dimensional approximation of the transfer operator $G$, which simultaneously provides the reconstructed state vectors and the corresponding eigenfunctions and eigenvalues of the high-dimensional nonlinear system.
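The dimension count can be checked with a short sketch that builds the lifted matrices from any snapshot matrix. The function name `lift_snapshots`, the random placeholder data, and the value $\Delta t = \pi/21$ (one reading of 21 samples on $[0, \pi)$) are assumptions for illustration.

```python
import numpy as np

def lift_snapshots(Z, dt):
    """Z has shape (n, M); returns Psi_X and Psi_Y, each of shape (2n, M - 2)."""
    Z = np.asarray(Z)
    # psi(x_k) = [z_k; (z_{k+1} - z_k) / dt] exists for k = 1..M-1.
    psi = np.vstack([Z[:, :-1], (Z[:, 1:] - Z[:, :-1]) / dt])
    # psi(y_k) is psi(x_{k+1}), so the pairs exist for k = 1..M-2.
    return psi[:, :-1], psi[:, 1:]

n, M = 512, 21
dt = np.pi / M
rng = np.random.default_rng(0)
Z = rng.standard_normal((n, M)) + 1j * rng.standard_normal((n, M))  # placeholder snapshots
Psi_X, Psi_Y = lift_snapshots(Z, dt)
```

Each forward difference and each one-step shift consumes one snapshot, which is how 21 snapshots yield 19 columns.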
To evaluate the performance of the proposed generalized transfer operator G , we conducted a comparative study against standard decomposition techniques: Dynamic Mode Decomposition (DMD), Hankel DMD, Extended DMD (EDMD), and continuous EDMD (cEDMD). Figure 11 depicts the full simulation data along with reconstructions obtained by the FIRE algorithm and Hankel DMD.
The accuracy of the reconstructions is quantified using the pointwise absolute error metric (58). Continuous EDMD exhibited the poorest performance and is not visualized here. Among the remaining methods, FIRE and Hankel DMD consistently produce the lowest reconstruction errors, as shown in Figure 12.
Hankel DMD achieves its best performance using a temporal stack of only two snapshots, highlighting the importance of encoding velocity information in the feature space. For EDMD, we employed a dictionary consistent with the cubic nonlinearity of the NLS equation:
$$\psi(z) = [\, z,\; |z|^2 z \,]^T.$$
Despite this, the FIRE algorithm provides superior reconstruction, demonstrating the advantage of a carefully designed, derivative-augmented observable set.
The eigenvalue spectra computed by the different methods further illustrate the efficacy of FIRE (Figure 13). FIRE aligns the eigenvalues close to the imaginary axis, as expected for the Hamiltonian-like dynamics of the soliton, whereas deviations are observed for DMD and EDMD.
The inclusion of the derivative feature $\psi_2$ in the dictionary is essential for PDE systems. It allows the operator $G$ to account for the “momentum” of the wave field, effectively capturing the spatio-temporal evolution even with a limited number of temporal snapshots ($M = 21$). This lifting enables the extraction of dominant Koopman modes that characterize the soliton’s stability and shape-preserving propagation, demonstrating the robustness of the proposed framework for distributed parameter systems with continuous-state fields.

6. Conclusions

Standard Koopman-based approaches approximate the evolution operator by projecting the dynamics onto a finite set of observables defined at single time instances. However, many dynamical systems exhibit memory effects, delays, or spatio-temporal dependencies that cannot be adequately captured by single-state observables.
In this work, we introduce a multi-state extension in which each observable depends on multiple system states over time, forming a richer feature space that captures temporal dependencies more effectively. Moreover, the construction of the target data in feature space is generalized to allow functions of multiple observables, enabling more flexible regression and mode extraction. Importantly, the framework retains interpretability: the eigenvalues of the operator characterize temporal scales, while the corresponding modes represent spatial or state patterns. This multi-state formulation improves reconstruction accuracy, captures temporal correlations, and is applicable to both ODE and PDE systems.
The main contributions of this work can be summarized as follows:
  • We generalize the feature space target data to allow functions of multi-state observables, increasing the expressivity of Koopman approximations.
  • We introduce a multi-state extension of the EDMD framework, in which observables are functions of sequences of system states rather than a single state.
  • We provide numerical demonstrations on canonical nonlinear systems, showing improved reconstruction and spectral representation compared to standard EDMD.
The advantages of this framework include its ability to capture temporal correlations and memory effects, its improved representation of nonlinear dynamics, and its applicability to high-dimensional PDE systems. Numerical experiments confirm its effectiveness in reconstructing system trajectories and identifying dominant features, suggesting its potential for applications in model reduction, prediction, and control of complex dynamical systems.
Overall, the proposed framework connects classical Koopman theory and delay embeddings with ideas from temporal convolution models and modern sequence-learning architectures, offering a principled linear-operator perspective for systems with memory, multi-resolution dynamics, or nonlocal temporal dependencies. It thereby bridges the gap between traditional DMD/EDMD methods and the dynamics of multi-step or coupled systems.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Koopman, B.O. Hamiltonian systems and transformation in Hilbert space. Proc. Natl. Acad. Sci. USA 1931, 17, 315–318.
  2. Koopman, B.O.; Neumann, J.v. Dynamical systems of continuous spectra. Proc. Natl. Acad. Sci. USA 1932, 18, 255.
  3. Mezić, I. Spectral properties of dynamical systems, model reduction and decompositions. Nonlinear Dyn. 2005, 41, 309–325.
  4. Budišić, M.; Mezić, I. Geometry of the ergodic quotient reveals coherent structures in flows. Phys. D Nonlinear Phenom. 2012, 241, 1255–1269.
  5. Budišić, M.; Mohr, R.; Mezić, I. Applied Koopmanism. Chaos Interdiscip. J. Nonlinear Sci. 2012, 22, 047510.
  6. Schmid, P.J. Dynamic mode decomposition of numerical and experimental data. J. Fluid Mech. 2010, 656, 5–28.
  7. Tu, J.H.; Rowley, C.W.; Luchtenburg, D.M.; Brunton, S.L.; Kutz, J.N. On dynamic mode decomposition: Theory and applications. J. Comput. Dyn. 2014, 1, 391–421.
  8. Kutz, J.N.; Brunton, S.L.; Brunton, B.W.; Proctor, J.L. Dynamic Mode Decomposition: Data-Driven Modeling of Complex Systems; Society for Industrial and Applied Mathematics (SIAM): Philadelphia, PA, USA, 2016; pp. 1–234. ISBN 978-1-611-97449-2.
  9. Brunton, B.W.; Johnson, L.A.; Ojemann, J.G.; Kutz, J.N. Extracting spatial–temporal coherent patterns in large-scale neural recordings using dynamic mode decomposition. J. Neurosci. Methods 2016, 258, 1–15.
  10. Proctor, J.L.; Eckhoff, P.A. Discovering dynamic patterns from infectious disease data using dynamic mode decomposition. Int. Health 2015, 7, 139–145.
  11. Grosek, J.; Kutz, J.N. Dynamic Mode Decomposition for Real-Time Background/Foreground Separation in Video. arXiv 2014, arXiv:1404.7592.
  12. Mann, J.; Kutz, J.N. Dynamic mode decomposition for financial trading strategies. Quant. Financ. 2016, 16, 1643–1655.
  13. Kuttichira, D.P.; Gopalakrishnan, E.A.; Menon, V.K.; Soman, K.P. Stock price prediction using dynamic mode decomposition. In Proceedings of the 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), Udupi, India, 13–16 September 2017; pp. 55–60.
  14. Berger, E.; Sastuba, M.; Vogt, D.; Jung, B.; Amor, H.B. Estimation of perturbations in robotic behavior using dynamic mode decomposition. J. Adv. Robot. 2015, 29, 331–343.
  15. Mezić, I. Analysis of fluid flows via spectral properties of the Koopman operator. Annu. Rev. Fluid Mech. 2013, 45, 357–378.
  16. Bai, Z.; Kaiser, E.; Proctor, J.L.; Kutz, J.N.; Brunton, S.L. Dynamic Mode Decomposition for Compressive System Identification. AIAA J. 2020, 58, 561–574.
  17. Le Clainche, S.; Vega, J.M.; Soria, J. Higher order dynamic mode decomposition of noisy experimental data: The flow structure of a zero-net-mass-flux jet. Exp. Therm. Fluid Sci. 2017, 88, 336–353.
  18. Li, B.; Garicano-Mena, J.; Valero, E. A dynamic mode decomposition technique for the analysis of non-uniformly sampled flow data. J. Comput. Phys. 2022, 468, 111495.
  19. Jovanović, M.R.; Schmid, P.J.; Nichols, J.W. Sparsity-promoting dynamic mode decomposition. Phys. Fluids 2014, 26, 024103.
  20. Mezić, I. On Numerical Approximations of the Koopman Operator. Mathematics 2022, 10, 1180.
  21. Nedzhibov, G. Extended Online DMD and Weighted Modifications for Streaming Data Analysis. Computation 2023, 11, 114.
  22. Williams, M.O.; Kevrekidis, I.G.; Rowley, C.W. A data-driven approximation of the Koopman operator: Extending dynamic mode decomposition. J. Nonlinear Sci. 2015, 25, 1307–1346.
  23. Arbabi, H.; Mezić, I. Ergodic theory, dynamic mode decomposition, and computation of spectral properties of the Koopman operator. SIAM J. Appl. Dyn. Syst. 2017, 16, 2096–2126.
  24. Brunton, S.L.; Brunton, B.W.; Proctor, J.L.; Kaiser, E.; Kutz, J.N. Chaos as an intermittently forced linear system. Nat. Commun. 2017, 8, 19.
  25. Nedzhibov, G.H. Delay-Embedding Spatio-Temporal Dynamic Mode Decomposition. Mathematics 2024, 12, 762.
  26. Proctor, J.L.; Brunton, S.L.; Kutz, J.N. Dynamic mode decomposition with control. SIAM J. Appl. Dyn. Syst. 2016, 15, 142–161.
  27. Nedzhibov, G. An Improved Approach for Implementing Dynamic Mode Decomposition with Control. Computation 2023, 11, 201.
  28. Nedzhibov, G. An Alternative Framework for Dynamic Mode Decomposition with Control. AppliedMath 2025, 5, 60.
  29. Askham, T.; Kutz, J.N. Variable projection methods for an optimized dynamic mode decomposition. SIAM J. Appl. Dyn. Syst. 2018, 17, 380–416.
  30. Dawson, S.T.M.; Hemati, M.S.; Williams, M.O.; Rowley, C.W. Characterizing and correcting for the effect of sensor noise in the dynamic mode decomposition. Exp. Fluids 2016, 57, 42.
  31. Korda, M.; Mezić, I. Linear predictors for nonlinear dynamical systems: Koopman operator meets model predictive control. Automatica 2018, 93, 149–160.
  32. Williams, M.O.; Rowley, C.W.; Kevrekidis, I.G. A kernel approach to data-driven Koopman spectral analysis. J. Comput. Dyn. 2015, 2, 247–265.
  33. Anantharamu, S.; Mahesh, K. A parallel and streaming Dynamic Mode Decomposition algorithm with finite precision error analysis for large data. J. Comput. Phys. 2019, 380, 355–377.
  34. Maryada, K.R.; Norris, S.E. Reduced-communication parallel dynamic mode decomposition. J. Comput. Sci. 2022, 61, 101599.
  35. Smith, E.; Variansyah, I.; McClarren, R. Variable Dynamic Mode Decomposition for Estimating Time Eigenvalues in Nuclear Systems. arXiv 2022, arXiv:2208.10942v1.
  36. Ngo, T.T.; Nguyen, V.; Pham, X.Q.; Hossain, M.A.; Huh, E.N. Motion Saliency Detection for Surveillance Systems Using Streaming Dynamic Mode Decomposition. Symmetry 2020, 12, 1397.
  37. Duran-Siguenza, J.F.; Minchala, L.I.; Garza-Castanon, L.E.; Zhang, H. Control based on the Koopman operator: A comprehensive review. J. Frankl. Inst. 2025, 362, 108256.
  38. Islam, M.S.; Ahmad, M.A. Data-driven continuous-time Hammerstein modeling with missing data using improved Archimedes optimization algorithm. Results Eng. 2024, 24, 103357.
  39. Chu, B.; Rapisarda, P. Data-Driven Iterative Learning Control for Continuous-Time Systems. In Proceedings of the 2023 62nd IEEE Conference on Decision and Control (CDC), Singapore, 13–15 December 2023; pp. 4626–4631.
  40. Beykal, B.; Diangelakis, N.A.; Pistikopoulos, E.N. Continuous-Time Surrogate Models for Data-Driven Dynamic Optimization; Montastruc, L., Negny, S., Eds.; Computer Aided Chemical Engineering; Elsevier: Amsterdam, The Netherlands, 2022; Volume 51, pp. 205–210.
  41. Takens, F. Detecting strange attractors in turbulence. Lect. Notes Math. 1981, 898, 366–381.
  42. Brunton, S.L.; Budišić, M.; Kaiser, E.; Kutz, J.N. Modern Koopman Theory for Dynamical Systems. arXiv 2021, arXiv:2102.12086.
  43. Klus, S.; Nüske, F.; Koltai, P.; Wu, H.; Kevrekidis, I.; Schütte, C.; Noé, F. Data-driven model reduction and transfer operator approximation. J. Nonlinear Sci. 2018, 28, 985–1010.
  44. Otto, S.E.; Rowley, C.W. Koopman operators for estimation and control of dynamical systems. Annu. Rev. Control Robot. Auton. Syst. 2021, 4, 59–87.
Figure 1. Time evolution of the SIR epidemic model with reproduction number $R_0 = 4$.
Figure 2. Reconstruction of SIR dynamics using FIRE (left) and Hankel-DMD (right).
Figure 3. Pointwise reconstruction error (58) for the SIR model using FIRE and Hankel DMD.
Figure 4. Dynamic reconstruction of the SIR model using DMD, EDMD, and continuous EDMD. The state variables are shown as follows: susceptible S (blue), infected I (red), and recovered R (green).
Figure 5. Dynamics governed by Equation (61). Reconstruction by FIRE (left) and by EDMD (right).
Figure 6. Pointwise reconstruction error for training and predicted trajectories obtained by FIRE.
Figure 7. Dynamic reconstruction of the training data obtained using FIRE.
Figure 8. Dynamic reconstruction of the predicted trajectories obtained using FIRE.
Figure 9. Dynamic reconstruction of training data using Hankel DMD.
Figure 10. Pointwise reconstruction error for FIRE and Hankel DMD.
Figure 11. Spatio-temporal dynamics of the NLS soliton (67): full simulation data (left), reconstruction by FIRE (middle), and reconstruction by Hankel DMD (right).
Figure 12. Pointwise reconstruction error (58) for DMD, Hankel DMD, EDMD, and FIRE.
Figure 13. Eigenvalue spectra computed by FIRE, Hankel DMD, DMD, and EDMD.
Table 1. Relative reconstruction error (Err) of the SIR dynamics (60) for different methods.

| | FIRE | DMD | Hankel DMD | EDMD | Continuous EDMD |
| --- | --- | --- | --- | --- | --- |
| Err | $5.8914 \times 10^{-5}$ | $0.5448$ | $4.4415 \times 10^{-4}$ | $0.4450$ | $3.3922 \times 10^{18}$ |
Table 2. Ablation results for the SIR example. The error is relative reconstruction error (59).

| Dictionary | Relative Error (Err) |
| --- | --- |
| Full FIRE (delays + $S \cdot I$ + $\dot{R}$) | $5.8914 \times 10^{-5}$ |
| No delays ($S \cdot I$ + $\dot{R}$) | $4.2369 \times 10^{4}$ |
| No nonlinear terms (delays + $\dot{R}$) | $7.0787 \times 10^{-4}$ |
| No derivatives (delays + $S \cdot I$) | $7.0786 \times 10^{-4}$ |
| Pure delay (Hankel-DMD) | $4.4415 \times 10^{-4}$ |
Table 3. Relative reconstruction error for the forced oscillator (61).

| | FIRE | DMD | Hankel DMD | EDMD | Continuous EDMD |
| --- | --- | --- | --- | --- | --- |
| Err | $8.96 \times 10^{-4}$ | $1.48$ | $0.73$ | $0.24$ | $1.34$ |
Nedzhibov, G. Learning Dynamics from Data by Future-Informed Regression of Evolution. Mathematics 2026, 14, 464. https://doi.org/10.3390/math14030464