Space-Time Finite Element Tensor Network Approach for the Time-Dependent Convection–Diffusion–Reaction Equation with Variable Coefficients

Adak, Dibyendu; Truong, Duc P.; Vuchkov, Radoslav; De, Saibal; DeSantis, Derek; Roberts, Nathan V.; Rasmussen, Kim Ø.; Alexandrov, Boian S.

doi:10.3390/math13142277


by Dibyendu Adak ¹,*, Duc P. Truong ¹, Radoslav Vuchkov ², Saibal De ³, Derek DeSantis ⁴, Nathan V. Roberts ², Kim Ø. Rasmussen ¹ and Boian S. Alexandrov ¹

¹ Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA

² Sandia National Laboratories, Albuquerque, NM 87185, USA

³ Sandia National Laboratories, Livermore, CA 94551, USA

⁴ Computer, Computational and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA

* Author to whom correspondence should be addressed.

Mathematics 2025, 13(14), 2277; https://doi.org/10.3390/math13142277

Submission received: 13 June 2025 / Revised: 4 July 2025 / Accepted: 9 July 2025 / Published: 15 July 2025


Abstract

In this paper, we present a new space-time Galerkin-like method, where we treat the discretization of spatial and temporal domains simultaneously. This method utilizes a mixed formulation of the tensor-train (TT) and quantized tensor-train (QTT) formats (see Section 3, Tensor-Train Decomposition), designed for the finite element discretization (Q1-FEM) of the time-dependent convection–diffusion–reaction (CDR) equation. We reformulate the assembly process of the finite element discretized CDR to enhance its compatibility with tensor operations and introduce a low-rank tensor structure for the finite element operators. Recognizing the banded structure inherent in the finite element framework’s discrete operators, we further exploit the QTT format of the CDR to achieve greater speed and compression. Additionally, we present a comprehensive approach for integrating variable coefficients of CDR into the global discrete operators within the TT/QTT framework. The effectiveness of the proposed method, in terms of memory efficiency and computational complexity, is demonstrated through a series of numerical experiments, including a semi-linear example.

Keywords:

space-time finite element; tensor network approach; time-dependent problem; convection–diffusion–reaction equation; variable coefficients

MSC:

15A69; 35Q79; 65M70

1. Introduction

1.1. Time-Dependent 3D Convection–Diffusion–Reaction Problem

This paper develops a space-time finite element method for the tensor network low-rank numerical solution of the time-dependent 3D convection–diffusion–reaction (CDR) equation with Dirichlet boundary conditions of the type

$$
\begin{aligned}
\frac{\partial u}{\partial t} - \nabla \cdot \big(\kappa(t, \mathbf{x}) \nabla u\big) + \mathbf{b}(t, \mathbf{x}) \cdot \nabla u + c(t, \mathbf{x})\, u &= f(t, \mathbf{x}) && \text{in } [0, T] \times \Omega, \\
u &= g(t, \mathbf{x}) && \text{on } [0, T] \times \partial\Omega, \\
u(t_0 = 0, \mathbf{x}) &= u_0(\mathbf{x}) && \text{in } \Omega,
\end{aligned} \tag{1}
$$

where $\kappa(t, \mathbf{x})$, $\mathbf{b}(t, \mathbf{x}) := [b_1(t, \mathbf{x}), b_2(t, \mathbf{x}), b_3(t, \mathbf{x})]^T$, and $c(t, \mathbf{x})$ are coefficients that depend on $t$ and $\mathbf{x}$, and $\mathbf{x} = (x, y, z)$. The computational domain in space $\Omega = \Omega_X \times \Omega_Y \times \Omega_Z \subset \mathbb{R}^3$ is a three-dimensional cube, which is a Cartesian product of three intervals, and $T$ is the final time-point. We consider inhomogeneous boundary conditions and a nonzero initial value in (1) and develop tensor-train space-time formats for coefficients that are either constant or separable. Subsequently, we extend the approach to accommodate a more general class of variable coefficients.

1.2. Classical Methods for Solving CDR

The finite element method (FEM) is a powerful technique for solving CDR problems with high accuracy [1,2,3,4,5]. Discontinuous Galerkin [6,7] and Petrov–Galerkin [8,9] methods are widely used for solving CDR, particularly in the context of spectral and virtual element methods [10,11,12,13]. These methods approximate the solution of the partial differential equation (PDE) by projecting it onto a finite-dimensional subspace of trial functions. The key idea is to ensure that the residual is orthogonal to the chosen subspace of test functions.

In the Galerkin approach, the continuous problem is discretized by selecting a finite number of trial functions to approximate the solution. These functions are typically chosen from a function space like polynomials or piecewise-defined functions. In the classical (Bubnov–Galerkin) method, the same set of functions is used for both the trial (approximation of the solution) and test (weighting) functions, which ensures that the residual error is orthogonal to the space spanned by the trial functions. The PDE is transformed into its weak (integral) form by multiplying it by a test function and integrating over the domain. The weak form allows handling problems with lower regularity solutions and dealing with complex geometries.

The Petrov–Galerkin (PG) method generalizes the Bubnov–Galerkin method by allowing test and trial spaces to differ. This added flexibility allows for improved stability and accuracy, especially for convection-dominated problems or where numerical instability is a concern. By selecting appropriate test functions, the Petrov–Galerkin method can introduce stabilization mechanisms, such as upwinding in convection-dominated problems, thereby reducing oscillations or instabilities. In the present work, we do not apply such stabilization mechanisms since our underlying problem is diffusion-dominated. Indeed, we employ the same discrete basis for both test and trial functions. Nevertheless, our method follows the Petrov–Galerkin framework because the underlying continuous function spaces differ, specifically in the norms employed; our stability analysis depends on this difference, and it is based on an inf-sup condition [14,15]. Space-time discretizations of parabolic evolution equations have been studied for finite element methods [16,17,18], the discontinuous Galerkin approach [19], virtual element methods [20,21], and spectral collocation methods [12,22].

Space-time discretizations treat the discretization of spatial and temporal domains simultaneously. Treating space and time together requires more computational resources, particularly due to the memory needed to represent the solution across a time slab. However, this approach offers greater flexibility in temporal discretization, enabling spatially localized refinement in time. Space-time methods can also be easier to analyze and may achieve higher accuracy than traditional time-marching schemes. Our current numerical examples serve as proofs of concept for the new space-time tensor method.

1.3. Mitigating the Curse of Dimensionality

The CDR equations model a wide range of phenomena in physics, chemistry, and engineering, which often demand substantial computational resources due to the need for fine spatial and temporal discretizations. Traditional approaches to solving such problems often lead to prohibitively large systems of equations, making them impractical for many real-world applications, since the number of grid points grows exponentially with the number of dimensions. In real-life applications modeled by time-dependent PDEs, such as full waveform inversion problems, the number of grid points required for solving the PDE can be as large as $M = 6.6 \times 10^{10}$ per time step. With a large number of time steps $N = 4 \times 10^{5}$, one must store a total of $MN$ floating-point numbers [23].
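To put these figures in perspective, the following back-of-the-envelope sketch computes the storage needed for the full space-time solution. The values of $M$ and $N$ are those quoted above; the 8-byte (double precision) storage per number is our assumption and is not stated in the text.

```python
# Storage estimate for the full space-time solution quoted above.
M = 6.6e10            # grid points per time step (from the text)
N = 4e5               # number of time steps (from the text)
BYTES_PER_FLOAT = 8   # assumption: IEEE double precision

n_values = M * N                       # total floating-point numbers to store
n_bytes = n_values * BYTES_PER_FLOAT   # total storage in bytes
n_petabytes = n_bytes / 1e15           # decimal petabytes

print(f"{n_values:.3e} values, roughly {n_petabytes:.0f} PB in double precision")
```

Under this assumption the full solution history occupies on the order of hundreds of petabytes, which is why storing it uncompressed is impractical.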

This exponential growth in the number of unknowns is known as the curse of dimensionality [24]. The curse of dimensionality leads to poor computational scaling in numerical algorithms and poses the main challenge in multidimensional numerical computations, irrespective of the specific problem. Notably, even exascale high-performance computing, with its optimization strategies, cannot overcome the curse of dimensionality. This phenomenon forces algorithm developers to make hard choices: either to reduce the fidelity of the model using, e.g., reduced-order models (ROMs), or to repeat computation using checkpointing-type strategies. ROMs focus on creating a reduced form of the PDE operator and are an excellent solution in some cases. However, they are often tailored to specific models and demand significant domain expertise. Furthermore, ROMs can be incompatible with legacy codes as they often require invasive modification of the simulation software. ROM development is an active area of research and can work well for solving certain classes of problems [25,26].

As an alternative, tensor network techniques have shown promise in alleviating the curse of dimensionality associated with high-dimensional problems. Among these, the tensor-train (TT) format, introduced by Oseledets and Tyrtyshnikov [27], has gained particular attention due to its ability to represent high-dimensional data efficiently while maintaining computational tractability.

Here, we present a novel approach that combines the high accuracy of space-time finite element methods with the computational efficiency of TT decomposition for solving time-dependent CDR equations in three spatial dimensions. Our method exploits the inherent low-rank structure often present in the solutions of such equations to dramatically reduce the computational complexity and storage requirements. Moreover, given that the finite element discrete operators have banded structures, we further exploit the quantized tensor-train (QTT) format for more economical and efficient solvers [28]. Specifically, we develop a mixed TT/QTT decomposition of the four-dimensional (three space plus time) space-time finite element Galerkin discretization, and demonstrate how to construct the TT and QTT representations of the finite element discretization. While Kornev et al. [29] also explore QTT-based FEM, their work focuses on stationary 2D problems with time-stepping, whereas our approach addresses fully time-dependent 3D problems in a unified space-time framework. Our results show that this TT/QTT-FEM-PG approach can achieve high accuracy while dramatically reducing the computational resources required, making it possible to solve previously intractable problems in CDR systems.

The remainder of this paper is organized as follows. In Section 2, we derive the weak formulation of (1), review some basic concepts for space-time finite element methods, and introduce the discretization and its matrix formulation. Section 3 reviews the tensor notations and definitions of the TT format, the TT-matrix format, the QTT format, and the TT-cross interpolation technique. In Section 4, we detail the low-rank structure of the discrete formulation of the model problem. Section 5 describes the assembly process that leads to the global linear system. We have reformulated this process to be more compatible with tensor operations. This reformulation is particularly important as it simplifies and optimizes the tensorization process. In Section 6, we present our mixed TT/QTT design of the numerical solution of the CDR equation and introduce our algorithms in TT/QTT format. Section 7 generalizes the proposed technique for arbitrary order. In Section 8, we present our numerical results and assess the performance of our method. In Section 9, we offer our final remarks and discuss possible future work. Additional details on the CDR tensorization approach are provided in Appendix A, Appendix B, Appendix C, Appendix D and Appendix E.

2. Space-Time Finite Element Formulation of CDR

The numerical approximation of partial differential equations (PDEs) is essential in many scientific and engineering fields, especially when dealing with complex geometries and singular solutions. Finite element methods (FEMs) are renowned for their robustness and accuracy in addressing such problems. However, as the problem size and complexity grow, the computational cost of FEM increases substantially, making high-resolution simulations computationally prohibitive. This challenge motivates the development of a space-time tensor network approach, which aims to preserve the accuracy and convergence order of the original method while significantly reducing the computational cost.

In this section, we provide the background details of the FEM used throughout the text. We employ the following notation: $\Omega_X, \Omega_Y, \Omega_Z \subset \mathbb{R}$ denote fixed intervals of $\mathbb{R}$, while $\Omega = \Omega_X \times \Omega_Y \times \Omega_Z \subset \mathbb{R}^3$. The time interval is denoted by $I_T = [0, T]$, and our space-time domain is $\Omega_T = I_T \times \Omega$.

In FEM, the domain is broken into smaller, simple “elements”, and the solution $u$ is approximated by simple functions on these elements. To do so, the PDE is first transformed into its weak formulation, and the domain is divided into elements on which basis functions are defined. The weak formulation is then discretized in terms of the basis functions, from which the mass and stiffness matrices are computed. The resulting linear system is then assembled and solved. Space-time FEM is formulated over the entire space-time domain $\Omega_T$. In this section, we detail the FEM for the 3D CDR in Equation (1).

2.1. Space-Time Weak Formulation

We let $C_c^\infty(\Omega)$ denote the set of infinitely differentiable functions with compact support within the open set $\Omega$, while $L^2(\Omega)$ and $L^\infty(\Omega)$ denote the sets of (equivalence classes of) square-integrable and essentially bounded functions over $\Omega$, respectively. Given a multi-index $\alpha = (\alpha_1, \alpha_2, \alpha_3)$ consisting of non-negative integers, and a function $\phi \in C_c^\infty(\Omega)$, we let $D^\alpha \phi$ denote the $\alpha$th partial derivative of $\phi$:

$$
D^\alpha \phi = \frac{\partial^{|\alpha|} \phi}{\partial x^{\alpha_1} \partial y^{\alpha_2} \partial z^{\alpha_3}},
$$

where $|\alpha| = \alpha_1 + \alpha_2 + \alpha_3$. If $u$ is a (locally) integrable function, then we let $D^\alpha u$ denote the $\alpha$th partial derivative of $u$ in the weak sense. That is, if there exists a (locally integrable) function $v$ such that

$$
\int_\Omega u \, D^\alpha \phi = (-1)^{|\alpha|} \int_\Omega v \, \phi
$$

for all $\phi \in C_c^\infty(\Omega)$, then we assign $D^\alpha u = v$. We let $H^1(\Omega)$ denote the Sobolev space consisting of all square-integrable functions whose weak derivatives up to first order are also square-integrable. We equip $H^1(\Omega)$ with the norm

$$
\|u\|_{H^1(\Omega)} := \Big( \sum_{|\alpha| \leq 1} \|D^\alpha u\|^2 \Big)^{1/2},
$$

where $\|\cdot\|$ is the standard $L^2(\Omega)$ norm. The space $H_0^1(\Omega)$ denotes the subspace of $H^1(\Omega)$ consisting of functions that vanish on the boundary in the weak sense, and thus satisfy homogeneous boundary conditions. By $H^{-1}(\Omega)$, we mean the dual space of $H_0^1(\Omega)$, i.e., the space of all bounded linear functionals on $H_0^1(\Omega)$. We give $H^{-1}(\Omega)$ the standard operator norm:

$$
\|u\|_{H^{-1}(\Omega)} = \sup_{\phi \in H_0^1(\Omega),\ \|\phi\|_{H^1(\Omega)} \leq 1} |\langle u, \phi \rangle|,
$$

where $\langle u, \phi \rangle := \int_\Omega u \phi$ denotes the duality pairing between $H_0^1(\Omega)$ and $H^{-1}(\Omega)$. Given a function space $X$ with the associated norm $\|\cdot\|_X$, the Bochner space $L^2(0, T; X)$ consists of (equivalence classes of) functions $u$ such that $u(t, \cdot) \in X$ for almost all $t \in [0, T]$. We equip this space with the norm

$$
\|u\|_{L^2(0, T; X)} = \Big( \int_0^T \|u(t, \cdot)\|_X^2 \, dt \Big)^{1/2}.
$$

Similarly, we define the space $H^1(0, T; X)$ as the space of all functions whose weak time derivatives are in $L^2(0, T; X)$; we equip it with the norm

$$
\|u\|_{H^1(0, T; X)} = \Big( \|u\|_{L^2(0, T; X)}^2 + \Big\| \frac{\partial u}{\partial t} \Big\|_{L^2(0, T; X)}^2 \Big)^{1/2}.
$$

These spaces, especially with $X = H^1(\Omega)$ and $X = H^{-1}(\Omega)$, define the appropriate function spaces for posing the weak formulation of CDR. The space $L^2(0, T; H^1(\Omega))$ ensures that a function $u(t, \mathbf{x})$ is square-integrable in time, and each time slice $u(t, \cdot)$ has square-integrable first spatial derivatives. Similarly, the space $H^1(0, T; H^{-1}(\Omega))$ ensures that a function $u(t, \mathbf{x})$ has a first-order time derivative that is weakly square-integrable in time. We define the trial space $U := L^2(0, T; H_0^1(\Omega)) \cap H^1(0, T; H^{-1}(\Omega))$ associated with the norm $\|u\|_U^2 = \|u\|_{L^2(0, T; H_0^1(\Omega))}^2 + \|u\|_{H^1(0, T; H^{-1}(\Omega))}^2$, and the test space $V := L^2(0, T; H_0^1(\Omega))$. Integrating by parts, we obtain the weak formulation of the model problem (1) with homogeneous boundary conditions and a zero initial condition: find a function $u \in U$ such that

$$
\Big\langle \frac{\partial u}{\partial t}, v \Big\rangle + \langle \kappa(t, \mathbf{x}) \nabla u, \nabla v \rangle + \langle \mathbf{b}(t, \mathbf{x}) \cdot \nabla u, v \rangle + \langle c(t, \mathbf{x})\, u, v \rangle = \langle f(t, \mathbf{x}), v \rangle \quad \forall v \in V, \tag{2}
$$

where $\langle u, v \rangle := \int_0^T \big( \int_\Omega u v \, d\Omega \big) \, dt$. Throughout, we assume that $f(t, \mathbf{x}) \in L^2(0, T; H^{-1}(\Omega))$, and that the coefficients satisfy

$$
\begin{aligned}
&\kappa(t, \mathbf{x}) \in L^\infty(\Omega_T); \quad b_i(t, \mathbf{x}) \in W^{1,\infty}(\Omega_T) \ \ \forall i = 1, 2, 3; \\
&c(t, \mathbf{x}) \in W^{1,\infty}(\Omega_T); \quad f(t, \mathbf{x}) \in L^\infty(\Omega_T);
\end{aligned}
$$

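for almost every $t \in I_T$. We define the bilinear forms

$$
\begin{aligned}
D(u, v) &:= \Big\langle \frac{\partial u}{\partial t}, v \Big\rangle + \langle \kappa(t, \mathbf{x}) \nabla u, \nabla v \rangle + \langle \mathbf{b}(t, \mathbf{x}) \cdot \nabla u, v \rangle + \langle c(t, \mathbf{x})\, u, v \rangle, \\
B(u, v) &:= \langle \kappa(t, \mathbf{x}) \nabla u, \nabla v \rangle + \langle \mathbf{b}(t, \mathbf{x}) \cdot \nabla u, v \rangle + \langle c(t, \mathbf{x})\, u, v \rangle,
\end{aligned} \tag{3}
$$

and

$$
F(v) := \langle f(t, \mathbf{x}), v \rangle. \tag{4}
$$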

Note that the assumptions on the coefficients of (2) imply the continuity of $D(u, v)$, i.e., there exists a constant $C > 0$ such that

$$
|D(u, v)| \leq C \|u\|_U \|v\|_V. \tag{5}
$$

Furthermore, to prove the inf-sup condition, we need the following two assumptions:

Assumption 1.

(a) There exists a positive constant $\mu_0$ such that

$$
\mu(t, \mathbf{x}) = c(t, \mathbf{x}) - \frac{1}{2} \nabla \cdot \mathbf{b}(t, \mathbf{x}) \geq \mu_0 > 0, \tag{6}
$$

and

(b) there exist positive constants $\kappa_*$ and $\kappa^*$ such that

$$
\kappa^* \geq \kappa(t, \mathbf{x}) \geq \kappa_* \tag{7}
$$

for almost all $(t, \mathbf{x}) \in \Omega_T$.

These two assumptions allow us to prove the following inf-sup Theorem:

Theorem 1.

If $u \in L^2(0, T; H_0^1(\Omega)) \cap H^1(0, T; H^{-1}(\Omega))$ satisfies $u(0, \mathbf{x}) = 0$ for $\mathbf{x} \in \Omega$, then under conditions (6) and (7) (Assumption 1), there exists a positive constant $C > 0$ such that the following inf-sup stability condition is satisfied:

$$
C \|u\|_U \leq \sup_{0 \neq v \in V} \frac{D(u, v)}{\|v\|_V} \quad \forall u \in U, \tag{8}
$$

where the constant $C$ may depend on $\kappa^*$ and $\mu_0$.

Proof.

To prove (8), we use [16] (Assumption 2.1), which requires two ingredients. First, for $t \in [0, T]$, we define a mapping $A(t) : H_0^1(\Omega) \to H^{-1}(\Omega)$ such that

$$
\begin{aligned}
\langle A(t) u, v \rangle &:= (\kappa(t, \mathbf{x}) \nabla u, \nabla v) + (\mu(t, \mathbf{x})\, u, v) + \frac{1}{2} \big( (\mathbf{b}(t, \mathbf{x}) \cdot \nabla u, v) - (\mathbf{b}(t, \mathbf{x}) \cdot \nabla v, u) \big) \quad \forall u, v \in H_0^1(\Omega), \\
\text{and therefore} \quad \langle A(t) u, u \rangle &= (\kappa(t, \mathbf{x}) \nabla u, \nabla u) + (\mu(t, \mathbf{x})\, u, u),
\end{aligned} \tag{9}
$$

where $(\cdot, \cdot)$ denotes the $L^2(\Omega)$ inner product, i.e., the spatial integration only. Next, under assumptions (6) and (7), we conclude that

$$
\begin{aligned}
\langle A(t) u, u \rangle &= \int_\Omega \kappa(t, \mathbf{x}) |\nabla u|^2 + \int_\Omega \mu(t, \mathbf{x})\, u^2 \\
&\geq \kappa_* \int_\Omega |\nabla u|^2 + \mu_0 \int_\Omega u^2 \\
&= \kappa_* \|\nabla u\|_{L^2(\Omega)}^2 + \mu_0 \|u\|_{L^2(\Omega)}^2 \\
&\geq \min\{\kappa_*, \mu_0\} \big( \|\nabla u\|_{L^2(\Omega)}^2 + \|u\|_{L^2(\Omega)}^2 \big) \\
&\equiv \min\{\kappa_*, \mu_0\} \|u\|_{H_0^1(\Omega)}^2 \quad \forall u \in H_0^1(\Omega).
\end{aligned} \tag{10}
$$

We also deduce that

$$
\begin{aligned}
|\langle A(t) u, v \rangle| &= \Big| \int_\Omega \kappa(t, \mathbf{x}) \nabla u \cdot \nabla v + \int_\Omega \mathbf{b}(t, \mathbf{x}) \cdot \nabla u \, v + \int_\Omega c(t, \mathbf{x})\, u v \Big| \\
&\leq \Big| \int_\Omega \kappa(t, \mathbf{x}) \nabla u \cdot \nabla v \Big| + \Big| \int_\Omega \mathbf{b}(t, \mathbf{x}) \cdot \nabla u \, v \Big| + \Big| \int_\Omega c(t, \mathbf{x})\, u v \Big| \\
&\leq \kappa^* \Big| \int_\Omega \nabla u \cdot \nabla v \Big| + \|\mathbf{b}\|_{\infty, \Omega} \Big( \Big| \int_\Omega \frac{\partial u}{\partial x} v \Big| + \Big| \int_\Omega \frac{\partial u}{\partial y} v \Big| + \Big| \int_\Omega \frac{\partial u}{\partial z} v \Big| \Big) + \|c\|_{\infty, \Omega} \Big| \int_\Omega u v \Big| \\
&\leq \kappa^* \|\nabla u\|_{L^2(\Omega)} \|\nabla v\|_{L^2(\Omega)} + \|\mathbf{b}\|_{\infty, \Omega} \|\nabla u\|_{L^2(\Omega)} \|v\|_{L^2(\Omega)} + \|c\|_{\infty, \Omega} \|u\|_{L^2(\Omega)} \|v\|_{L^2(\Omega)} \\
&\leq \max\{\kappa^*, \|\mathbf{b}\|_{\infty, \Omega}, \|c\|_{\infty, \Omega}\} \, \|u\|_{H_0^1(\Omega)} \|v\|_{H_0^1(\Omega)} \quad \forall u, v \in H_0^1(\Omega).
\end{aligned}
$$

Borrowing the same arguments as [30], we prove the thesis. □
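For readers who want the intermediate step, the following short derivation sketches why the operator $A(t)$ in (9), written with $\mu$ and a skew-symmetric convection part, coincides with the spatial part of the bilinear form containing $\mathbf{b} \cdot \nabla u$ and $c\,u$. It assumes $u, v \in H_0^1(\Omega)$ and $b_i \in W^{1,\infty}$, so that the integration by parts is justified and the boundary terms vanish.

```latex
\begin{aligned}
(\mathbf{b} \cdot \nabla u, v)
  &= -\big(u, \nabla \cdot (\mathbf{b}\, v)\big)
   = -\big((\nabla \cdot \mathbf{b})\, u, v\big) - (\mathbf{b} \cdot \nabla v, u), \\
\Longrightarrow \quad
(\mathbf{b} \cdot \nabla u, v)
  &= \tfrac{1}{2}\big[(\mathbf{b} \cdot \nabla u, v) - (\mathbf{b} \cdot \nabla v, u)\big]
     - \tfrac{1}{2}\big((\nabla \cdot \mathbf{b})\, u, v\big), \\
(\mathbf{b} \cdot \nabla u, v) + (c\, u, v)
  &= \tfrac{1}{2}\big[(\mathbf{b} \cdot \nabla u, v) - (\mathbf{b} \cdot \nabla v, u)\big]
     + (\mu\, u, v),
  \qquad \mu = c - \tfrac{1}{2} \nabla \cdot \mathbf{b}.
\end{aligned}
```

Setting $v = u$ makes the bracketed skew-symmetric term vanish, which gives the second line of (9) and explains why only $\kappa$ and $\mu$ enter the lower bound (10).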

Now, we are in a position to state the unique solvability of (2):

Theorem 2.

Let us assume that $f \in L^2(0, T; H^{-1}(\Omega))$, and that the bilinear form $D(\cdot, \cdot)$ in (3) is bounded, as stated in (5), and satisfies the inf-sup condition (8). Then, under Assumption 1, there exists a unique solution $u \in U$ of (2), satisfying

$$
\|u\|_U \leq C \|f\|_{L^2(0, T; H^{-1}(\Omega))}, \tag{11}
$$

where $C$ is a positive constant that depends on $\kappa(t, \mathbf{x})$, $\mathbf{b}(t, \mathbf{x})$, and $c(t, \mathbf{x})$.

Proof.

We briefly sketch the proof of the theorem. Since $D(\cdot, \cdot)$ satisfies the inf-sup condition, we can write

$$
C \|u\|_U \leq \sup_{0 \neq v \in V} \frac{D(u, v)}{\|v\|_V} = \sup_{0 \neq v \in V} \frac{\langle f(t, \mathbf{x}), v \rangle}{\|v\|_V} = \|f\|_{L^2(0, T; H^{-1}(\Omega))}.
$$

□

Interested readers may refer to [14,15] (Theorem 3.7) for a detailed proof of Theorem 2. In the weak formulation of the CDR in Equation (2), the solution $u$ and test function $v$ are chosen from appropriate function spaces to account for both the spatial and temporal regularities. Contrast this with the classical (Bubnov–Galerkin) method, where the same function space is used for the trial and test functions, which may cause instability. For example, when the velocity field is strong (i.e., large convection), the convective term can cause steep gradients or sharp layers to appear in the solution, which the standard Galerkin method fails to resolve properly. In the Petrov–Galerkin approach, the test function $v$ can come from a different space than the trial function.

Equation (2) is the weak formulation corresponding to (1) with homogeneous boundary conditions [20,30,31]. The weak formulation corresponding to the inhomogeneous boundary conditions can be obtained by following the standard technique of decomposing the solution $u(t, \mathbf{x})$ of (1) as $u(t, \mathbf{x}) = u^{\mathrm{homog}}(t, \mathbf{x}) + u_0(t, \mathbf{x})$, where $u^{\mathrm{homog}}(t, \mathbf{x})$ is the unknown solution of (1) with homogeneous Dirichlet boundary conditions, and $u_0(t, \mathbf{x})$ satisfies the boundary conditions on $[0, T] \times \partial\Omega$, and is different from zero only on the boundaries. In the weak formulation, this simply results in a modified loading term, $f(t, \mathbf{x})$. We refer to [20] (Remark 1) for a detailed treatment of the inhomogeneous initial and boundary conditions.

2.2. Finite Element Approximation

We will now describe the finite element approximation of Equation (2). This is done by picking finite-dimensional subspaces $U_h \subset U$ and $V_h \subset V$ specified by chosen bases described below. Let $\{\phi_i\}_{i=1}^{N_Q}$ be the basis for $U_h$, and $\{\psi_j\}_{j=1}^{M_Q}$ be the basis for $V_h$. Then any approximate solution $u_h \in U_h$ can be written as

$$
u_h(t, \mathbf{x}) = \sum_{i=1}^{N_Q} U_i \, \phi_i(t, \mathbf{x})
$$

and each test function $v_h \in V_h$ is written as

$$
v_h(t, \mathbf{x}) = \sum_{j=1}^{M_Q} V_j \, \psi_j(t, \mathbf{x}).
$$

Substituting $u_h$ and $v_h$ into the weak form in Equation (2) and approximating the coefficients with Lagrange interpolation results in a discrete system. We now describe this system for a particular choice of linear basis functions.

First, the domain $\Omega_T$ is discretized into a set of hypercubes that overlap only on their boundaries. The hypercube elements are defined by dividing each axis into smaller intervals:

$$
\begin{aligned}
\Omega_X &= [x_0, x_1] \cup [x_1, x_2] \cup \dots \cup [x_{n_X - 1}, x_{n_X}], \\
\Omega_Y &= [y_0, y_1] \cup [y_1, y_2] \cup \dots \cup [y_{n_Y - 1}, y_{n_Y}], \\
\Omega_Z &= [z_0, z_1] \cup [z_1, z_2] \cup \dots \cup [z_{n_Z - 1}, z_{n_Z}], \\
I_T &= [t_0, t_1] \cup [t_1, t_2] \cup \dots \cup [t_{n_T - 1}, t_{n_T}].
\end{aligned}
$$

The resulting hypercubes are of the form $[x_i, x_{i+1}] \times [y_j, y_{j+1}] \times [z_k, z_{k+1}] \times [t_l, t_{l+1}]$. Throughout, we adopt a uniform mesh on $\Omega_T$ with $h := x_{i+1} - x_i = y_{j+1} - y_j = z_{k+1} - z_k = t_{l+1} - t_l$ for all $i, j, k, l$. Furthermore, we assume $N = n_X = n_Y = n_Z = n_T$ so that there are $N + 1$ nodes along each dimension, and $N$ intervals. We let $k = 0, 1, \dots, N$ denote the index for the nodes in each of the one-dimensional spaces. There is a total of $N_Q = (N + 1)^4$ space-time nodes in $\Omega_T$. We denote these elements by $\{q_l\}_{l=0}^{N_Q - 1}$, which are raster-ordered with X before Y before Z before T. The bijection between the linear element index $l$ and the individual element indices $(k_x, k_y, k_z, k_t)$ is then given by

$$
l \longleftrightarrow k_x + (N + 1)\, k_y + (N + 1)^2 k_z + (N + 1)^3 k_t. \tag{12}
$$
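The index map (12) can be sketched in a few lines. The function names `to_linear` and `from_linear` are illustrative, and the indices are assumed to range over $0, \dots, N$ in each direction, matching the count $N_Q = (N + 1)^4$ used above.

```python
import numpy as np

def to_linear(kx, ky, kz, kt, N):
    """Raster-ordered linear index of Eq. (12): X runs fastest, then Y, Z, T."""
    n = N + 1
    return kx + n * ky + n**2 * kz + n**3 * kt

def from_linear(l, N):
    """Inverse map: recover (kx, ky, kz, kt) from the linear index l."""
    n = N + 1
    l, kx = divmod(l, n)   # strip the fastest-running index first
    l, ky = divmod(l, n)
    kt, kz = divmod(l, n)
    return kx, ky, kz, kt

N = 3
idx = (2, 0, 3, 1)                     # (kx, ky, kz, kt), arbitrary example
l = to_linear(*idx, N)

# The same ordering is NumPy's column-major (Fortran) flattening.
l_numpy = int(np.ravel_multi_index(idx, (N + 1,) * 4, order="F"))
```

With this ordering, a vector of nodal values can be reshaped into a four-way array with `order="F"` and back without copying index logic by hand, which is convenient when switching between the matrix and tensor views of the discretization.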

Local Estimates

Throughout, we make use of local interpolation, which we describe here. Each 4D element, $q_l$, consists of two nodes per dimension for a total of 16 nodes. For a fixed element, each node can be indexed by four indices, $i_1, i_2, i_3, i_4 \in \{0, 1\}$, with the zero index corresponding to the lower value in the interval. The local node index $i = 0, 1, \dots, 15$ within each element is given by

$$
i = i_1 + 2 i_2 + 4 i_3 + 8 i_4. \tag{13}
$$

Along each dimension, we define the standard piecewise linear hat functions on the one-dimensional elements. For example, on the interval $I_k^x := [x_k, x_{k+1}]$, we define

$$
\phi_{1,k}^x(x) := \begin{cases} \dfrac{x - x_k}{h} & x \in I_k^x \\ 0 & \text{else} \end{cases}
\quad \text{and} \quad
\phi_{0,k}^x(x) := \begin{cases} \dfrac{x_{k+1} - x}{h} & x \in I_k^x \\ 0 & \text{else.} \end{cases} \tag{14}
$$

Clearly, the $\phi_{i_1,k}^x$ have derivatives defined almost everywhere:

$$
\frac{\partial \phi_{1,k}^x}{\partial x}(x) := \begin{cases} \dfrac{1}{h} & x \in (x_k, x_{k+1}) \\ 0 & \text{else} \end{cases}
\quad \text{and} \quad
\frac{\partial \phi_{0,k}^x}{\partial x}(x) := \begin{cases} \dfrac{-1}{h} & x \in (x_k, x_{k+1}) \\ 0 & \text{else.} \end{cases} \tag{15}
$$
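Because the hat functions in (14) are linear and their derivatives in (15) are constant on each interval, the one-dimensional element integrals that feed the mass and stiffness matrices have simple closed forms. The sketch below checks the well-known local matrices $\frac{h}{6}\begin{bmatrix}2&1\\1&2\end{bmatrix}$ and $\frac{1}{h}\begin{bmatrix}1&-1\\-1&1\end{bmatrix}$ numerically; the function names and the values of `h` and `xk` are illustrative choices, not taken from the paper.

```python
import numpy as np

def hat(i, x, xk, h):
    """phi_{i,k}(x) on [xk, xk + h], Eq. (14): i=1 rises to 1, i=0 falls to 0."""
    return (x - xk) / h if i == 1 else (xk + h - x) / h

def dhat(i, h):
    """Constant derivative of phi_{i,k} inside the interval, Eq. (15)."""
    return 1.0 / h if i == 1 else -1.0 / h

h, xk = 0.25, 0.5                      # illustrative element size and left node

# Two-point Gauss-Legendre rule mapped to [xk, xk + h]; exact for cubics.
g, w = np.polynomial.legendre.leggauss(2)
xq = xk + 0.5 * h * (g + 1.0)
wq = 0.5 * h * w

# Local mass matrix: integrals of phi_i * phi_j over the element.
mass = np.array([[np.sum(wq * hat(i, xq, xk, h) * hat(j, xq, xk, h))
                  for j in (0, 1)] for i in (0, 1)])

# Local stiffness matrix: the integrand is constant, so multiply by h.
stiff = np.array([[dhat(i, h) * dhat(j, h) * h
                   for j in (0, 1)] for i in (0, 1)])
```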

We similarly define $\phi_{0,k}^y, \phi_{1,k}^y, \phi_{0,k}^z, \phi_{1,k}^z, \phi_{0,k}^t, \phi_{1,k}^t$. Using the local index notation associated with element $q_l$, the local space-time basis functions are defined as the tensor products of the 1D hat functions:

$$
\phi_{i,l}(t, x, y, z) = \phi_{i_1,k_x}^x(x) \, \phi_{i_2,k_y}^y(y) \, \phi_{i_3,k_z}^z(z) \, \phi_{i_4,k_t}^t(t),
$$

where we use the correspondence (12) between the element $l$ and the interval index $(k_x, k_y, k_z, k_t)$, and the correspondence (13) between the local index $i$ and the node index $(i_1, \dots, i_4)$. For completeness, we note that the 16 functions $\phi_{i,l}$ are nonzero only on the cell $q_l$.
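A minimal sketch of the 16 local basis functions, built from the local index (13) and the 1D hat functions, is given below. It checks two standard properties of such a nodal basis: the functions sum to one inside the cell, and each equals one at its own node and zero at the other 15. The cell corner, mesh size, and evaluation point are arbitrary illustrative values, and the coordinates are passed in the order $(x, y, z, t)$ to match $(i_1, i_2, i_3, i_4)$.

```python
import numpy as np

def hat(i, s, sk, h):
    """1D hat function phi_{i,k}(s) restricted to the interval [sk, sk + h]."""
    return (s - sk) / h if i == 1 else (sk + h - s) / h

def bits_of(i):
    """Split the local index i = i1 + 2*i2 + 4*i3 + 8*i4 into (i1, i2, i3, i4)."""
    return [(i >> d) & 1 for d in range(4)]

def local_basis(i, point, corner, h):
    """phi_{i,l} at point=(x, y, z, t) for the cell whose lower corner is `corner`."""
    return float(np.prod([hat(b, p, c, h)
                          for b, p, c in zip(bits_of(i), point, corner)]))

h = 0.5
corner = (0.0, 0.5, 1.0, 0.0)          # (x_k, y_k, z_k, t_k) of one cell
point = (0.1, 0.8, 1.3, 0.45)          # some point inside that cell

# Partition of unity: the 16 local functions sum to 1 inside the cell.
total = sum(local_basis(i, point, corner, h) for i in range(16))

# Nodal (Kronecker-delta) property: phi_i(node_j) = 1 if i == j, else 0.
nodes = [tuple(c + b * h for c, b in zip(corner, bits_of(j))) for j in range(16)]
delta = np.array([[local_basis(i, nodes[j], corner, h) for j in range(16)]
                  for i in range(16)])
```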

2.3. Local and Global Interpolation Operators

Now we will define the global and local Lagrange interpolation operators, $L_h$ and $L_h^l$, respectively. For cell $q_l$ and local index $i$, we let $V_{i,l}$ denote the $(t, x, y, z)$ node at index $i$ in cell $l$. We define the local interpolation operator at $q_l$, $L_h^l : C^0(q_l) \to U_h|_{q_l}$, as

$$
L_h^l(u)(t, \mathbf{x}) = \sum_{i=0}^{15} u(V_{i,l}) \, \phi_{i,l}(t, \mathbf{x}). \tag{16}
$$

In other words, $L_h^l(u)(t, \mathbf{x})$ approximates the solution locally using the hat functions defined on the vertices of the finite element with index $l$. The global interpolation operator $L_h : U \cap C^0(\Omega_T) \to U_h$ is defined as

$$
L_h(u)(t, \mathbf{x}) = \sum_{l=0}^{N_Q - 1} L_h^l(u)(t, \mathbf{x}) = \sum_{l=0}^{N_Q - 1} \sum_{i=0}^{15} u(V_{i,l}) \, \phi_{i,l}(t, \mathbf{x}). \tag{17}
$$

The global interpolation operator combines the contributions from all elements of the mesh. These operators may be applied to the solution $u$, as well as the coefficient and forcing functions. In fact, (17) yields the following standard approximation property of $L_h$ [5]:

$$
\|u - L_h u\|_{L^2(\Omega_T)} \leq C h^2 \|u\|_{H^2(\Omega_T)}, \tag{18}
$$

where $L^2(\Omega_T) := L^2(0, T; L^2(\Omega))$ and $H^2(\Omega_T) := H^2(0, T; H^2(\Omega))$. Let $V_h$ denote the finite-dimensional subspace spanned by a set of linearly independent functions $\{\psi_j\}_{j=1}^{M_Q} \subset V$. Substituting the approximate solution $u_h$ and a test function $v_h \in V_h$ into the weak form Equation (2) leads to our discrete system of equations. Specifically, the Galerkin discretization of the variational problem seeks to find $u_h \in U_h$ such that

$$
\begin{aligned}
\underbrace{\Big\langle \frac{\partial u_h}{\partial t}, v_h \Big\rangle}_{T_1}
+ \underbrace{\langle L_h(\kappa(t, \mathbf{x})) \nabla u_h, \nabla v_h \rangle}_{T_2}
+ \underbrace{\langle L_h(\mathbf{b}(t, \mathbf{x})) \cdot \nabla u_h, v_h \rangle}_{T_3}
&+ \underbrace{\langle L_h(c(t, \mathbf{x}))\, u_h, v_h \rangle}_{T_4} \\
&= \underbrace{\langle L_h(f(t, \mathbf{x})), v_h \rangle}_{T_5} \quad \forall v_h \in V_h.
\end{aligned} \tag{19}
$$
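The second-order approximation property (18) of the interpolation operator applied to the coefficients in (19) can be observed numerically. The sketch below is a one-dimensional analogue only: it interpolates a smooth function with piecewise-linear hat functions on $[0, 1]$ and estimates the convergence rate of the $L^2$ error under mesh refinement. The test function, mesh sizes, and quadrature grid are illustrative assumptions.

```python
import numpy as np

def interp_l2_error(n_cells, u, n_fine=20481):
    """L2(0,1) error of the piecewise-linear nodal interpolant of u."""
    nodes = np.linspace(0.0, 1.0, n_cells + 1)
    x = np.linspace(0.0, 1.0, n_fine)            # fine grid used for quadrature
    err = u(x) - np.interp(x, nodes, u(nodes))   # u - L_h u on the fine grid
    dx = x[1] - x[0]
    # Composite trapezoidal rule for the integral of err^2.
    integral = dx * (np.sum(err**2) - 0.5 * (err[0]**2 + err[-1]**2))
    return np.sqrt(integral)

def u(x):
    """Smooth illustrative test function."""
    return np.sin(2.0 * np.pi * x)

cells = (8, 16, 32, 64)
errors = [interp_l2_error(n, u) for n in cells]

# Observed order: log2 of the error ratio when h is halved.
rates = [float(np.log2(errors[k] / errors[k + 1])) for k in range(len(cells) - 1)]
print("observed rates:", [round(r, 2) for r in rates])
```

For a smooth function the observed rates should be close to 2, consistent with the $h^2$ factor in (18).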

While $\psi_j$ can have lower regularity than the $\phi_i$ basis for $U_h$, in practice, we will consecutively use each of the $2^4$ local space-time Lagrange basis functions, $\phi_i(t, \mathbf{x})$, as the test function, $v_h$, in Equation (19). We denote each term of Equation (19) by $T_i$. The goal of this paper is to provide a tensor representation for all terms $T_i$, $1 \leq i \leq 5$, in Equation (19). Furthermore, for $u_h \in U_h$, $v_h \in V_h$, we denote the discrete bilinear forms as

$$
\begin{aligned}
D_h(u_h, v_h) &:= \Big\langle \frac{\partial u_h}{\partial t}, v_h \Big\rangle + \langle L_h(\kappa(t, \mathbf{x})) \nabla u_h, \nabla v_h \rangle \\
&\quad + \langle L_h(\mathbf{b}(t, \mathbf{x})) \cdot \nabla u_h, v_h \rangle + \langle L_h(c(t, \mathbf{x}))\, u_h, v_h \rangle, \\
B_h(u_h, v_h) &:= \langle L_h(\kappa(t, \mathbf{x})) \nabla u_h, \nabla v_h \rangle + \langle L_h(\mathbf{b}(t, \mathbf{x})) \cdot \nabla u_h, v_h \rangle \\
&\quad + \langle L_h(c(t, \mathbf{x}))\, u_h, v_h \rangle,
\end{aligned} \tag{20}
$$

and

$$
F_h(v_h) := \langle L_h(f(t, \mathbf{x})), v_h \rangle. \tag{21}
$$

Hence, (19) reduces to finding $u_h \in U_h$ such that

$$
D_h(u_h, v_h) = F_h(v_h) \quad \forall v_h \in V_h.
$$

The discrete inf-sup condition of $D_h(u_h, v_h)$ is stated below in Theorem 3:

Theorem 3.

Let $U_h \subset U$ and $V_h \subset V$ be the discrete spaces satisfying $U_h \subset V_h$, and assume that (6) is satisfied. Then the following discrete stability condition holds:

$$
C \|u_h\|_{U_h} \leq \sup_{0 \neq v_h \in V_h} \frac{D_h(u_h, v_h)}{\|v_h\|_V} \quad \forall u_h \in U_h, \tag{22}
$$

where $C$ is a positive constant that depends on the regularity of $\kappa(t, \mathbf{x})$, $\mathbf{b}(t, \mathbf{x})$, and $c(t, \mathbf{x})$, but is independent of the mesh size $h$. Here, $\|\cdot\|_{U_h}$ denotes the norm in the discrete space $U_h$, which is defined in (A29). The proof can be completed by following the arguments in Theorem 3.1 in [30] and using the approximation property of the interpolation operator $L_h$ discussed in (18). By adding and subtracting $\langle \kappa(t, \mathbf{x}) \nabla u_h, \nabla v_h \rangle + \langle \mathbf{b}(t, \mathbf{x}) \cdot \nabla u_h, v_h \rangle + \langle c(t, \mathbf{x})\, u_h, v_h \rangle$ to (19) and employing the approximation property of $L_h$, we can prove the discrete inf-sup condition for a sufficiently small mesh size $h$. Hence, (19) is well-posed, i.e., it has a unique solution (see [30], Section 3). In Appendix D, we provide a detailed proof of Theorem 3. Moreover, we highlight that the Galerkin orthogonality is not satisfied, i.e.,

$$
D_h(u_h, v_h) - D(u, v_h) \neq 0 \quad \forall v_h \in V_h, \tag{23}
$$

where $u$ and $u_h$ satisfy (2) and (19), respectively. We refer to (23) as an inconsistency error in the discrete scheme. In fact, the error estimate depends on the bound given in (23) as derived below.

Theorem 4.

Let

u \in U

, and

u_{h} \in U_{h}

be the unique solutions of the variational formulations (2) and (19), respectively. Then the following a priori error estimate holds:

\begin{matrix} ∥ u - u_{h} ∥_{L^{2} (0, T; H_{0}^{1} (Ω))} & \leq C (inf_{p_{h} \in U_{h}} {∥ u - p_{h} ∥}_{U} + sup_{0 \neq v_{h} \in V_{h}} \frac{| F_{h} (v_{h}) - F (v_{h}) |}{∥ v_{h} ∥_{V}} \\ + sup_{0 \neq v_{h} \in V_{h}} \frac{| D_{h} (p_{h}, v_{h}) - D (u, v_{h}) |}{∥ v_{h} ∥_{V}}), \end{matrix}

(24)

where C is a positive constant independent of h.

Proof.

The proof is a consequence of the discrete inf-sup condition stated in Theorem 3. For arbitrary

p_{h} \in U_{h}

, we split the error as

u - u_{h} = u - p_{h} + p_{h} - u_{h}

. Since

u_{h} - p_{h} \in U_{h}

, we use the discrete inf-sup condition, and the required result follows directly. In Appendix E, we provide a detailed proof. □

3. Tensor-Train Decomposition

In this section, we introduce the TT format [32], as well as the representation of linear operators in the so-called TT-matrix format, the cross-interpolation method, and the QTT format. All these methods are fundamental in the tensorization of our finite element discretization of the CDR equation.

3.1. Tensor-Train

The TT format, introduced by Oseledets in 2011 [32], represents a tensor as a sequential chain of matrix products involving both two-dimensional matrices and three-dimensional tensors, referred to as TT-cores. We can visualize this chain as in Figure 1.

Given that tensors in our formulation are, at most, four-dimensional (one temporal and three spatial dimensions), we consider the tensor-train format in the context of 4D tensors. Specifically, the TT approximation

X^{T T}

of a four-dimensional tensor

X

is a tensor with elements

\begin{matrix} X^{T T} (i_{1}, i_{2}, i_{3}, i_{4}) = \sum_{α_{1} = 1}^{r_{1}} \sum_{α_{2} = 1}^{r_{2}} \sum_{α_{3} = 1}^{r_{3}} G_{1} (1, i_{1}, α_{1}) G_{2} (α_{1}, i_{2}, α_{2}) G_{3} (α_{2}, i_{3}, α_{3}) G_{4} (α_{3}, i_{4}, 1) . \end{matrix}

(25)

Here, we have

X = X^{T T} + ε

where the error,

ε

, is a tensor with the same dimensions as

X

. The elements of the array

r = [r_{1}, r_{2}, r_{3}]

are the TT-ranks, which quantify the compression effectiveness of the TT approximation. Since each TT-core,

G_{k} (i_{k})

, only depends on a single index of the full tensor,

X

, the TT format effectively embodies a discrete separation of variables [33]. In Figure 1, we show a four-dimensional array

X (t, x, y, z)

, decomposed in the TT format.
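
As a concrete illustration of Eq. (25), the following minimal sketch compresses a small 4D array with sequential truncated SVDs and then contracts the cores back into the full array. It assumes NumPy; the helper names `tt_svd` and `tt_full`, the grid size, and the test function sin(t + x + y + z) are illustrative choices and are not taken from the implementation used in this paper.

```python
import numpy as np

def tt_svd(X, tol=1e-12):
    """Return TT-cores G_k of shape (r_{k-1}, n_k, r_k) via sequential truncated SVDs."""
    dims = X.shape
    cores, r_prev = [], 1
    C = X.reshape(r_prev * dims[0], -1)
    for k in range(len(dims) - 1):
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        r = max(1, int(np.sum(s > tol * s[0])))      # numerical rank of this unfolding
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))
        C = (s[:r, None] * Vt[:r]).reshape(r * dims[k + 1], -1)
        r_prev = r
    cores.append(C.reshape(r_prev, dims[-1], 1))
    return cores

def tt_full(cores):
    """Contract the cores back into a full tensor, following Eq. (25)."""
    X = cores[0]
    for G in cores[1:]:
        X = np.tensordot(X, G, axes=([-1], [0]))     # sum over the shared rank index
    return X[0, ..., 0]                              # drop the boundary ranks r_0 = r_4 = 1

# Illustrative 4D array X(t, x, y, z) sampled on an 8 x 8 x 8 x 8 grid.
g = np.linspace(0.0, 1.0, 8)
X = np.sin(g[:, None, None, None] + g[None, :, None, None]
           + g[None, None, :, None] + g[None, None, None, :])

cores = tt_svd(X)
tt_ranks = [G.shape[2] for G in cores[:-1]]          # [r_1, r_2, r_3]
tt_error = np.linalg.norm(tt_full(cores) - X) / np.linalg.norm(X)
tt_storage = sum(G.size for G in cores)              # entries stored in TT format
```

Because sin(a + b) = sin(a) cos(b) + cos(a) sin(b), every unfolding of this array has rank two, so the cores hold far fewer entries than the 8^4 entries of the full array.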

3.2. Linear Operators in the TT-Matrix Format

If the approximate solution of the CDR equation is a 4D tensor

U

, the linear operator

A

acting on that solution is represented as an 8D tensor. The transformation

A U

is defined as follows:

\begin{matrix} (A U) (i_{1}, i_{2}, i_{3}, i_{4}) = \sum_{j_{1}, j_{2}, j_{3}, j_{4}} A (i_{1}, j_{1}, \dots, i_{4}, j_{4}) U (j_{1}, \dots, j_{4}) . \end{matrix}

The tensor

A

can be related to a matrix operator

A

via

A (i_{1}, j_{1}, \dots, i_{4}, j_{4}) = A (i_{1} i_{2} i_{3} i_{4}, j_{1} j_{2} j_{3} j_{4}) .

(26)

We can construct the tensor

A

by suitably reshaping and permuting the dimensions of the matrix

A

. The linear operator

A

can be further represented in a variant of the TT format, called the TT-matrix, cf. [34]. The component-wise TT-matrix

A^{T T}

is defined as follows:

A^{T T} (i_{1}, j_{1}, \dots, i_{4}, j_{4}) = \sum_{α_{1}, α_{2}, α_{3}} G_{1} (1, (i_{1}, j_{1}), α_{1}) \dots G_{4} (α_{3}, (i_{4}, j_{4}), 1),

(27)

where

G_{k}

are 4D TT-cores. Figure 2 shows the process of transforming a matrix operator

A

to its tensor format,

A

, and, finally, to its TT-matrix format,

A^{TT}

.

The Kronecker product ⊗ of two matrices produces a larger matrix, while the tensor product ∘ produces a higher-dimensional tensor (see Appendix A.2). We can further simplify the TT-matrix representation of the matrix

A

if it is a Kronecker product of matrices, i.e.,

A = A_{1} \otimes A_{2} \otimes A_{3} \otimes A_{4}

. Based on the relationship defined in Equation (26), the tensor

A

can be constructed using the tensor product as

A = A_{1} \circ A_{2} \circ A_{3} \circ A_{4}

. This implies that the internal ranks of the TT format of

A

in (27) are all equal to 1. In such a case, each summation in Equation (27) collapses to a single term, and the TT format of

A

becomes the tensor product of d matrices:

\begin{matrix} A^{T T} & = A_{1} \circ A_{2} \circ \dots \circ A_{d} . \end{matrix}

(28)

This specific structure appears quite often in the matrix discretization, and will be exploited in the tensorization to construct the efficient TT format.
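
The relationship between Eq. (26) and the rank-one form (28) can be checked numerically. The sketch below assumes NumPy, and the mode sizes and random factors are purely illustrative. It builds a Kronecker-product matrix, reshapes and permutes it into the 8D tensor of Eq. (26), and compares the result with the tensor product of the four factors.

```python
import numpy as np

rng = np.random.default_rng(0)
n = (2, 3, 2, 3)                                   # illustrative mode sizes n_1..n_4
A1, A2, A3, A4 = (rng.standard_normal((m, m)) for m in n)

# Matrix operator A = A1 (x) A2 (x) A3 (x) A4 acting on the vectorized solution.
A_mat = np.kron(np.kron(np.kron(A1, A2), A3), A4)

# Eq. (26): reshape to (i1,i2,i3,i4,j1,j2,j3,j4), then permute to (i1,j1,...,i4,j4).
A_ten = A_mat.reshape(*n, *n).transpose(0, 4, 1, 5, 2, 6, 3, 7)

# Rank-one structure: the 8D tensor equals the tensor product of the four factors.
A_outer = np.einsum('ai,bj,ck,dl->aibjckdl', A1, A2, A3, A4)

# Applying the operator factor by factor never forms the large matrix.
U = rng.standard_normal(n)
AU_factored = np.einsum('ai,bj,ck,dl,ijkl->abcd', A1, A2, A3, A4, U)
AU_matrix = (A_mat @ U.reshape(-1)).reshape(n)
```

The factored application touches only the four small matrices, which is the property the tensorized assembly relies on.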

3.3. TT-Cross Interpolation

The original TT algorithm is based on consecutive applications of singular-value decompositions (SVDs) to the unfoldings of a tensor [32]. Although efficient, this algorithm requires access to the full tensor, which is impractical or even impossible for very large tensors. To address this challenge, the cross-interpolation algorithm, TT-cross, was developed [27]. The idea behind TT-cross is essentially to replace the SVD in the TT algorithm with an approximate version of the skeleton/CUR decomposition [35,36]. The CUR decomposition approximates a matrix by selecting a few of its columns

C

, a few of its rows

R

, and a matrix

U

that connects them, as shown in Figure 3.

Mathematically, the CUR decomposition finds an approximation for a matrix

A

, as

A \approx C U R

. The TT-cross algorithm utilizes the maximum volume principle (maxvol algorithm) [37,38] to determine

U

. The maxvol algorithm chooses a few columns,

C

, and rows,

R

, of

A

such that the intersection matrix

U^{- 1}

has maximum volume [39].
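
The skeleton idea can be sketched on a small matrix of exact low rank. The code below assumes NumPy and uses greedy complete pivoting to pick the rows and columns. This is a simple stand-in chosen for brevity, not the maxvol algorithm of [37,38], and the helper name `cross_indices` is hypothetical.

```python
import numpy as np

def cross_indices(A, r):
    """Pick r rows and columns by greedy complete pivoting (a stand-in for maxvol)."""
    E = A.astype(float).copy()                     # residual after removing chosen crosses
    rows, cols = [], []
    for _ in range(r):
        i, j = np.unravel_index(np.argmax(np.abs(E)), E.shape)
        rows.append(int(i))
        cols.append(int(j))
        E -= np.outer(E[:, j], E[i, :]) / E[i, j]  # eliminate the selected cross
    return rows, cols

rng = np.random.default_rng(1)
A = rng.standard_normal((40, 3)) @ rng.standard_normal((3, 30))   # exact rank 3

rows, cols = cross_indices(A, 3)
C = A[:, cols]                                     # selected columns
R = A[rows, :]                                     # selected rows
U = np.linalg.inv(A[np.ix_(rows, cols)])           # inverse of the intersection matrix
cur_error = np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A)
```

For a matrix of exact rank r with a nonsingular r × r intersection, the product C U R reproduces the matrix up to round-off; for matrices that are only approximately low rank, the quality of the approximation depends on how the rows and columns are chosen.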

TT-cross interpolation and its variants can be seen as a heuristic generalization of the CUR decomposition to tensors [40,41]. TT-cross utilizes the maximum volume algorithm iteratively, often beginning with a few randomly chosen fibers, to select an optimal number of specific tensor fibers that capture essential information of the tensor [42]. These fibers are used to construct a lower-rank TT representation. The naive generalization of CUR has proven expensive, which has led to the development of various heuristic optimization techniques, such as TT-ALS [43], DMRG [39,44], and AMEn [45]. Another active direction for further reducing the computational cost of TT-cross interpolation is the development of parallel algorithms to enhance efficiency and scalability [46].

3.4. Quantized Tensor-Train Format

When the TT-cores of linear operators in the tensor-train (TT) format exhibit specific tensor structures, they can undergo further compression to enhance computational efficiency. One such advanced compression technique is the quantized tensor-train (QTT) format, which extends the standard TT decomposition by leveraging hidden low-rank structures in exponentially large spaces.

In a traditional TT decomposition, a high-dimensional tensor is factorized into a sequence of smaller, lower-dimensional tensors (TT-cores). This reduces storage complexity and makes computations more manageable. However, when dealing with extremely large tensors—such as those arising from discretizations with many degrees of freedom—even the standard TT decomposition may become computationally intensive.

To address this, the QTT format introduces an additional layer of compression by applying the TT decomposition recursively. The key idea behind QTT is to reshape the original tensor into a higher-dimensional structure by performing a dyadic (power-of-two) partitioning of its modes. Once reshaped, the resulting smaller tensors are then factorized using the TT format. This hierarchical approach significantly reduces both storage requirements and computational costs, particularly when working with structured tensors that exhibit self-similar patterns across multiple scales. A notable advantage of the QTT format is its ability to efficiently represent structured operators, such as those with Toeplitz structures, which inherently possess low-rank QTT representations [47,48,49,50]. For instance, in the context of solving the CDR equations, the TT-cores of the corresponding linear operators exhibit such structures, making them ideal candidates for QTT-based compression.

Thus, to achieve higher compression and improved computational efficiency, we further transform these TT-cores into the QTT format. We then solve a mixed TT/QTT version of the problem using an appropriate TT optimization technique.
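
The dyadic reshaping behind the QTT format can be illustrated on two small objects: a sampled exponential and a tridiagonal Toeplitz matrix. The sketch assumes NumPy; the helper `tt_ranks`, the size 2^6, and the two examples are illustrative assumptions rather than the operators used in this paper.

```python
import numpy as np

def tt_ranks(X, tol=1e-10):
    """Numerical TT-ranks of a dense array, computed by sequential truncated SVDs."""
    dims, ranks = X.shape, []
    C = X.reshape(dims[0], -1)
    for k in range(len(dims) - 1):
        _, s, Vt = np.linalg.svd(C, full_matrices=False)
        r = max(1, int(np.sum(s > tol * s[0])))
        ranks.append(r)
        C = (s[:r, None] * Vt[:r]).reshape(r * dims[k + 1], -1)
    return ranks

L = 6
N = 2 ** L

# Vector case: quantize a length-2^L vector into an L-dimensional 2 x 2 x ... x 2 array.
x = np.linspace(0.0, 1.0, N)
vec_qtt_ranks = tt_ranks(np.exp(x).reshape((2,) * L))

# Matrix case: tridiagonal Toeplitz matrix with stencil (-1, 2, -1).
T = 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
T_bits = T.reshape((2,) * (2 * L))                     # axes (i_1..i_L, j_1..j_L)
perm = [ax for k in range(L) for ax in (k, L + k)]     # interleave to (i_1, j_1, ..., i_L, j_L)
T_q = T_bits.transpose(perm).reshape((4,) * L)         # merge each pair (i_k, j_k) into one mode
mat_qtt_ranks = tt_ranks(T_q)
```

The exponential separates across the binary digits of the index, and the Toeplitz matrix splits at every level into a sum of three Kronecker terms, so both have small QTT ranks that do not grow with L.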

4. Tensorization of the FEM

In this section, we detail the tensorization of the FEM for the space-time CDR equation. We will begin by deriving the local mass, stiffness, and time derivative matrices associated with the system (2). From there, the Galerkin approximation will be written in terms of a sum of Kronecker products of the local matrices. This will then be used to assemble global matrices for the system. This ultimately leads to the tensorization of the discrete form of the CDR (19).

4.1. Local One-Dimensional Mass, Stiffness, and Time-Derivative Matrices

We now define the local mass (time derivative) and stiffness (diffusion) matrices. Since test functions and local basis functions are tensor products of 1D hat functions, these matrices can be described as Kronecker products of local 1D mass and stiffness matrices. From Equations (14) and (15), we see that the local 1D mass and stiffness matrices at index k for the variable x are given by

M_{I_{k}^{x}} : = [\begin{matrix} (ϕ_{0, k}^{x}, ϕ_{0, k}^{x}) & (ϕ_{1, k}^{x}, ϕ_{0, k}^{x}) \\ (ϕ_{0, k}^{x}, ϕ_{1, k}^{x}) & (ϕ_{1, k}^{x}, ϕ_{1, k}^{x}) \end{matrix}], S_{I_{k}^{x}} : = [\begin{matrix} (\frac{d ϕ_{0, k}^{x}}{d x}, \frac{d ϕ_{0, k}^{x}}{d x}) & (\frac{d ϕ_{1, k}^{x}}{d x}, \frac{d ϕ_{0, k}^{x}}{d x}) \\ (\frac{d ϕ_{0, k}^{x}}{d x}, \frac{d ϕ_{1, k}^{x}}{d x}) & (\frac{d ϕ_{1, k}^{x}}{d x}, \frac{d ϕ_{1, k}^{x}}{d x}) \end{matrix}] .

(29)

The matrices for the other intervals,

I_{k}^{t}, I_{k}^{y}, I_{k}^{z},

are defined in a similar way. For the local time derivative matrix,

D_{I_{k}^{t}}

, we have the following expression:

D_{I_{k}^{t}} : = [\begin{matrix} (\frac{d ϕ_{0, k}^{t}}{d t}, ϕ_{0, k}^{t}) & (\frac{d ϕ_{1, k}^{t}}{d t}, ϕ_{0, k}^{t}) \\ (\frac{d ϕ_{0, k}^{t}}{d t}, ϕ_{1, k}^{t}) & (\frac{d ϕ_{1, k}^{t}}{d t}, ϕ_{1, k}^{t}) \end{matrix}] .

(30)

Importantly, since we are using a uniform space-time grid, the defined mass and stiffness matrices are the same for each four-dimensional hypercube and do not depend on the index k.
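
For reference, the local matrices in (29) and (30) can be evaluated exactly for linear hat functions. The sketch below assumes NumPy and assumes that the two local basis functions on an interval of length h are 1 − s/h and s/h in the local coordinate s; the function name `local_matrices` is illustrative.

```python
import numpy as np
from numpy.polynomial import Polynomial as P

def local_matrices(h):
    """Local 1D mass, stiffness, and derivative matrices on an interval of length h."""
    phi = [P([1.0, -1.0 / h]), P([0.0, 1.0 / h])]    # assumed local hat functions on [0, h]
    dphi = [f.deriv() for f in phi]

    def inner(f, g):
        F = (f * g).integ()                          # exact antiderivative of the product
        return F(h) - F(0.0)

    # Row index = test function j, column index = trial function i, as in Eqs. (29)-(30).
    M = np.array([[inner(phi[i], phi[j]) for i in range(2)] for j in range(2)])
    S = np.array([[inner(dphi[i], dphi[j]) for i in range(2)] for j in range(2)])
    D = np.array([[inner(dphi[i], phi[j]) for i in range(2)] for j in range(2)])
    return M, S, D

h = 0.1
M, S, D = local_matrices(h)
```

Under these assumptions the matrices reduce to the familiar closed forms M = (h/6)[[2, 1], [1, 2]], S = (1/h)[[1, −1], [−1, 1]], and D = (1/2)[[−1, 1], [−1, 1]], none of which depends on the interval index k.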

4.2. Local Discretization of the Variational Form

Using these local matrices, we detail the discretization and matricization of each of the bilinear forms

T_{i}

in Equation (19). Throughout, we will pick the local basis functions

ϕ_{j, l}

for our test functions

v_{h}

.

4.2.1. Discretization of the Time-Derivative Term, $T_{1}$ , on a Local Four-Dimensional Hypercube

The discrete approximation of the time-derivative term, on the local four-dimensional hypercube,

q_{l}

, is

T_{1} : = 〈\frac{\partial u_{h}^{l}}{\partial t}, ϕ_{j, l}〉 = \sum_{i = 0}^{15} U_{i, l} \int_{I_{k_{t}}^{t} \times I_{k_{x}}^{x} \times I_{k_{y}}^{y} \times I_{k_{z}}^{z}} \frac{\partial ϕ_{i, l} (t, x)}{\partial t} ϕ_{j, l} (t, x) d t d x d y d z,

(31)

where

u_{h}^{l} = u_{h} |_{q_{l}}

. For simplification, we suppress the local element index l from the basis functions and

T_{i}

afterward, unless otherwise specified. Using the properties of multiple integrals on the Cartesian product mesh and the fact that the space-time local functions are expressed as products of one-dimensional functions, (31) can be written compactly using one-dimensional mass, stiffness, and time-derivative matrices via the Kronecker product. When we consecutively take

v_{h}

to be each of the 16 space-time functions

ϕ_{j} (t, x)

defined on the hypercube

q_{l}

, Equation (31) yields 16 equations with 16 unknowns, which, in matrix notation, can be expressed as follows:

T_{1} = (D_{I_{k_{t}}^{t}} \otimes M_{I_{k_{z}}^{z}} \otimes M_{I_{k_{y}}^{y}} \otimes M_{I_{k_{x}}^{x}}) U,

(32)

where

U

is a

[16 \times 1]

column vector containing the unknown

2^{4}

components of

u_{h}

on

q_{l}

.
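
A quick sanity check of (32) is sketched below. It assumes NumPy, the same step h in all four directions, the closed-form local matrices for linear hat functions, and a local node ordering in which the time index varies slowest, matching the order of the Kronecker factors in (32). These are illustrative assumptions.

```python
import numpy as np

h = 0.25                                             # illustrative step, equal in t, x, y, z
M = h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])     # local 1D mass matrix
D = 0.5 * np.array([[-1.0, 1.0], [-1.0, 1.0]])       # local 1D time-derivative matrix

# Eq. (32): 16 x 16 local matrix of the time-derivative term, factor order (t, z, y, x).
K1 = np.kron(np.kron(np.kron(D, M), M), M)

# A constant function has zero time derivative, so K1 annihilates the constant vector.
residual_const = K1 @ np.ones(16)

# For u_h = t on the element [0, h]^4, each entry equals the integral of phi_j,
# which is (h / 2)^4. Nodal values of t: 0 at the first time level, h at the second.
U_t = np.kron(np.array([0.0, h]), np.ones(8))
T1_t = K1 @ U_t
```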

4.2.2. Discretization of the Diffusion Term, $T_{2}$ , on the Four-Dimensional Hypercube, with a Constant Diffusion Coefficient, $κ (t, x) = 1$

In this section, we derive the matricization of

T_{2}

with

κ (t, x) = 1

, so that the term reduces to

T_{2} = 〈 \nabla u_{h}, \nabla v_{h} 〉

. By using (16), and

v_{h} = ϕ_{j}

, we can explicitly write

T_{2}

locally on

q_{l}

as

\begin{matrix} T_{2} & : = \sum_{i = 0}^{15} U_{i, l} \int_{I_{k_{t}}^{t} \times I_{k_{x}}^{x} \times I_{k_{y}}^{y} \times I_{k_{z}}^{z}} (\nabla ϕ_{i} (t, x), \nabla ϕ_{j} (t, x)) d t d x d y d z \\ = \sum_{i = 0}^{15} U_{i, l} (\int_{I_{k_{t}}^{t} \times I_{k_{x}}^{x} \times I_{k_{y}}^{y} \times I_{k_{z}}^{z}} (\nabla_{x} ϕ_{i} (t, x) \nabla_{x} ϕ_{j} (t, x) + \nabla_{y} ϕ_{i} (t, x) \nabla_{y} ϕ_{j} (t, x) \\ + \nabla_{z} ϕ_{i} (t, x) \nabla_{z} ϕ_{j} (t, x)) d t d x d y d z) . \end{matrix}

(33)

This results in 16 expressions with 16 unknowns, which in matrix notation becomes

\begin{matrix} T_{2} : = (M_{I_{k_{t}}^{t}} \otimes S_{I_{k_{z}}^{z}} \otimes M_{I_{k_{y}}^{y}} \otimes M_{I_{k_{x}}^{x}} & + M_{I_{k_{t}}^{t}} \otimes M_{I_{k_{z}}^{z}} \otimes S_{I_{k_{y}}^{y}} \otimes M_{I_{k_{x}}^{x}} \\ + M_{I_{k_{t}}^{t}} \otimes M_{I_{k_{z}}^{z}} \otimes M_{I_{k_{y}}^{y}} \otimes S_{I_{k_{x}}^{x}}) U . \end{matrix}

(34)

4.2.3. Discretization of the Diffusion Term, $T_{2}$ , on the Four-Dimensional Hypercube, with a Non-Constant Diffusion Function

For the variable diffusion coefficient,

κ (t, x)

, we employ the Lagrange interpolation operator defined in (16). Assuming that

κ (t, x)

is well-defined at the nodes of

q_{l}

, and once again using

v_{h} = ϕ_{j}

, we derive the matricization of

T_{2}

as

\begin{matrix} T_{2} & = 〈 L_{h}^{l} (κ (t, x)) \nabla u_{h}^{l}, \nabla v_{h} 〉 = \sum_{i = 0}^{15} U_{i, l} \int_{q_{l}} L_{h}^{l} (κ (t, x)) (\nabla ϕ_{i} (t, x), \nabla ϕ_{j} (t, x)) d t d x d y d z \\ = \sum_{i, m = 0}^{15} κ (V_{m, l}) U_{i, l} \int_{I_{k_{t}}^{t} \times I_{k_{x}}^{x} \times I_{k_{y}}^{y} \times I_{k_{z}}^{z}} ϕ_{m} (t, x) (\nabla_{x} ϕ_{i} (t, x) \nabla_{x} ϕ_{j} (t, x) \\ + \nabla_{y} ϕ_{i} (t, x) \nabla_{y} ϕ_{j} (t, x) + \nabla_{z} ϕ_{i} (t, x) \nabla_{z} ϕ_{j} (t, x)) d t d x d y d z, \end{matrix}

(35)

where

κ (V_{m, l})

are the values of the diffusion function

κ (t, x)

at the

2^{4}

nodes of

q_{l}

. Here, in contrast to (33), each one-dimensional integration involves the product of one basis function and the gradients of two basis functions. To formulate this in a way that is structurally similar to (34), for each dimension, we introduce the following modified mass and stiffness matrices:

M_{I_{k}^{x}}^{p} = [\begin{matrix} (ϕ_{p, k}^{x} ϕ_{0, k}^{x}, ϕ_{0, k}^{x}) & (ϕ_{p, k}^{x} ϕ_{1, k}^{x}, ϕ_{0, k}^{x}) \\ (ϕ_{p, k}^{x} ϕ_{0, k}^{x}, ϕ_{1, k}^{x}) & (ϕ_{p, k}^{x} ϕ_{1, k}^{x}, ϕ_{1, k}^{x}) \end{matrix}]; S_{I_{k}^{x}}^{p} = [\begin{matrix} (ϕ_{p, k}^{x} \frac{d ϕ_{0, k}^{x}}{d x}, \frac{d ϕ_{0, k}^{x}}{d x}) & (ϕ_{p, k}^{x} \frac{d ϕ_{1, k}^{x}}{d x}, \frac{d ϕ_{0, k}^{x}}{d x}) \\ (ϕ_{p, k}^{x} \frac{d ϕ_{0, k}^{x}}{d x}, \frac{d ϕ_{1, k}^{x}}{d x}) & (ϕ_{p, k}^{x} \frac{d ϕ_{1, k}^{x}}{d x}, \frac{d ϕ_{1, k}^{x}}{d x}) \end{matrix}],

(36)

where the index p assumes two values, namely, 0 and 1. For

v_{h} = ϕ_{j}

, Equation (35) reduces to a matrix–vector product, where the associated matrix can be constructed by using one-dimensional matrices introduced in (36). Using the properties of multiple integrals on the Cartesian product mesh, and the fact that the space-time local functions are products of one-dimensional functions, we explicitly have

\begin{matrix} T_{2} & : = (\sum_{m_{1}, m_{2}, m_{3}, m_{4} = 0}^{1} κ (V_{m, l}) (M_{I_{k_{t}}^{t}}^{m_{4}} \otimes S_{I_{k_{z}}^{z}}^{m_{3}} \otimes M_{I_{k_{y}}^{y}}^{m_{2}} \otimes M_{I_{k_{x}}^{x}}^{m_{1}} \\ + M_{I_{k_{t}}^{t}}^{m_{4}} \otimes M_{I_{k_{z}}^{z}}^{m_{3}} \otimes S_{I_{k_{y}}^{y}}^{m_{2}} \otimes M_{I_{k_{x}}^{x}}^{m_{1}} + M_{I_{k_{t}}^{t}}^{m_{4}} \otimes M_{I_{k_{z}}^{z}}^{m_{3}} \otimes M_{I_{k_{y}}^{y}}^{m_{2}} \otimes S_{I_{k_{x}}^{x}}^{m_{1}})) U . \end{matrix}

(37)

Here, we employ local index m for each element as defined in (13).
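
The modified matrices in (36) satisfy a useful consistency property: because the two hat functions sum to one, summing over p recovers the unweighted matrices of (29), so (37) with κ ≡ 1 reduces to (34). The sketch below checks this numerically. It assumes NumPy, the hat functions 1 − s/h and s/h on a local interval, and a nodal array `kappa_nodes[m4, m3, m2, m1]` ordered as (t, z, y, x); these names and orderings are illustrative.

```python
import numpy as np
from itertools import product
from numpy.polynomial import Polynomial as P

h = 0.5
phi = [P([1.0, -1.0 / h]), P([0.0, 1.0 / h])]        # assumed local hat functions on [0, h]
dphi = [f.deriv() for f in phi]

def integral(f):
    F = f.integ()
    return F(h) - F(0.0)

def weighted(w, trial, test):
    """2 x 2 matrix with entries int w * trial_i * test_j (row j, column i)."""
    return np.array([[integral(w * trial[i] * test[j]) for i in range(2)]
                     for j in range(2)])

one = P([1.0])
M, S = weighted(one, phi, phi), weighted(one, dphi, dphi)      # Eq. (29)
Mp = [weighted(phi[p], phi, phi) for p in range(2)]            # Eq. (36), mass
Sp = [weighted(phi[p], dphi, dphi) for p in range(2)]          # Eq. (36), stiffness

def kron4(a, b, c, d):
    return np.kron(np.kron(np.kron(a, b), c), d)

def local_diffusion(kappa_nodes):
    """Local 16 x 16 diffusion matrix of Eq. (37) from the 16 nodal values of kappa."""
    K = np.zeros((16, 16))
    for m4, m3, m2, m1 in product(range(2), repeat=4):
        K += kappa_nodes[m4, m3, m2, m1] * (
            kron4(Mp[m4], Sp[m3], Mp[m2], Mp[m1])
            + kron4(Mp[m4], Mp[m3], Sp[m2], Mp[m1])
            + kron4(Mp[m4], Mp[m3], Mp[m2], Sp[m1]))
    return K

K_const = kron4(M, S, M, M) + kron4(M, M, S, M) + kron4(M, M, M, S)   # Eq. (34)
K_unit = local_diffusion(np.ones((2, 2, 2, 2)))                       # Eq. (37), kappa = 1
```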

4.2.4. Discretization of the Convection Term, $T_{3}$ , on the Four-Dimensional Hypercube, with a Non-Constant Convection Function

In this section, we derive the matricization of the convection term with variable coefficients. Following the same approach as in (35), using (17) and

v_{h} = ϕ_{j} (t, x)

,

T_{3}

becomes

\begin{matrix} T_{3} & = 〈 L_{h}^{l} (b (t, x)) \cdot \nabla u_{h}^{l}, v_{h} 〉 \\ = \sum_{m = 0}^{15} b_{1} (V_{m, l}) \sum_{i = 0}^{15} U_{i, l} \int_{q_{l}} ϕ_{m} (t, x) \nabla_{x} ϕ_{i} (t, x) ϕ_{j} (t, x) d t d x d y d z \\ + \sum_{m = 0}^{15} b_{2} (V_{m, l}) \sum_{i = 0}^{15} U_{i, l} \int_{q_{l}} ϕ_{m} (t, x) \nabla_{y} ϕ_{i} (t, x) ϕ_{j} (t, x) d t d x d y d z \\ + \sum_{m = 0}^{15} b_{3} (V_{m, l}) \sum_{i = 0}^{15} U_{i, l} \int_{q_{l}} ϕ_{m} (t, x) \nabla_{z} ϕ_{i} (t, x) ϕ_{j} (t, x) d t d x d y d z . \end{matrix}

(38)

To derive the matrix structure of (38), we introduce the following one-dimensional local space-derivative matrix for each dimension. On the interval

I_{k}^{x}

, we define

D_{I_{k}^{x}}^{p} = [\begin{matrix} (ϕ_{p, k}^{x} \frac{d ϕ_{0, k}^{x}}{d x}, ϕ_{0, k}^{x}) & (ϕ_{p, k}^{x} \frac{d ϕ_{1, k}^{x}}{d x}, ϕ_{0, k}^{x}) \\ (ϕ_{p, k}^{x} \frac{d ϕ_{0, k}^{x}}{d x}, ϕ_{1, k}^{x}) & (ϕ_{p, k}^{x} \frac{d ϕ_{1, k}^{x}}{d x}, ϕ_{1, k}^{x}) \end{matrix}],

(39)

where

p \in {0, 1}

. By using (39) and (36), we find that

\begin{matrix} T_{3} & = (\sum_{m_{1}, m_{2}, m_{3}, m_{4} = 0}^{1} b_{1} (V_{m, l}) (M_{I_{k_{t}}^{t}}^{m_{4}} \otimes M_{I_{k_{z}}^{z}}^{m_{3}} \otimes M_{I_{k_{y}}^{y}}^{m_{2}} \otimes D_{I_{k_{x}}^{x}}^{m_{1}}) \\ + b_{2} (V_{m, l}) (M_{I_{k_{t}}^{t}}^{m_{4}} \otimes M_{I_{k_{z}}^{z}}^{m_{3}} \otimes D_{I_{k_{y}}^{y}}^{m_{2}} \otimes M_{I_{k_{x}}^{x}}^{m_{1}}) \\ + b_{3} (V_{m, l}) (M_{I_{k_{t}}^{t}}^{m_{4}} \otimes D_{I_{k_{z}}^{z}}^{m_{3}} \otimes M_{I_{k_{y}}^{y}}^{m_{2}} \otimes M_{I_{k_{x}}^{x}}^{m_{1}})) U . \end{matrix}

(40)

4.2.5. Discretization of the Reaction Term, $T_{4}$ , on the Four-Dimensional Hypercube

In this section, we build the matricization of the reaction term,

T_{4}

, with the variable reaction coefficient. We should note that for a constant reaction coefficient and

v_{h} = ϕ_{j}

, the matrix associated with

T_{4}

reduces to the mass matrix. In this case, it follows from (29) that

\begin{matrix} T_{4} = (M_{I_{k_{t}}^{t}} \otimes M_{I_{k_{z}}^{z}} \otimes M_{I_{k_{y}}^{y}} \otimes M_{I_{k_{x}}^{x}}) U . \end{matrix}

(41)

For the variable reaction coefficient, we discretize the reaction term using the Lagrange interpolation operator (16). Again, taking

v_{h} = ϕ_{j} (t, x)

,

T_{4}

becomes

T_{4} = 〈 L_{h}^{l} (c (t, x)) u_{h}^{l}, v_{h} 〉 .

Using (36), we derive

T_{4} = (\sum_{m_{1}, m_{2}, m_{3}, m_{4} = 0}^{1} c (V_{m, l}) (M_{I_{k_{t}}^{t}}^{m_{4}} \otimes M_{I_{k_{z}}^{z}}^{m_{3}} \otimes M_{I_{k_{y}}^{y}}^{m_{2}} \otimes M_{I_{k_{x}}^{x}}^{m_{1}})) U .

(42)

4.2.6. Discretization of the Loading Term, $T_{5}$ , on the Four-Dimensional Hypercube

Finally, using the Lagrange interpolation operator (16), we rewrite the load vector into a matrix–vector product. For

v_{h} = ϕ_{j} (t, x)

,

T_{5}

reduces to

T_{5} = 〈 L_{h}^{l} (f (t, x)), v_{h} 〉 = \sum_{m = 0}^{15} f (V_{m, l}) \int_{q_{l}} ϕ_{m} (t, x) ϕ_{j} (t, x) d t d x d y d z .

The final matrix–vector product can be written as follows:

T_{5} : = (M_{I_{k_{t}}^{t}} \otimes M_{I_{k_{z}}^{z}} \otimes M_{I_{k_{y}}^{y}} \otimes M_{I_{k_{x}}^{x}}) F .

(43)

Here, the load vector is defined as

{(F)}_{m} : = f (V_{m, l}),

where m indexes the 16 local nodes of

q_{l}

, as in (13).

5. Assembly of Global Matrices

5.1. Assembly for Terms with Constant Coefficients

The local matrix structures of

T_{1}, T_{2}, T_{3}, T_{4}

, and

T_{5}

can easily be extended to similar global structures when the diffusion, convection, and reaction coefficients are constants. To achieve this, we need to construct the global mass, stiffness, and time-derivative matrices using the local ones for each variable.

First, we construct from (30) the following block matrix,

D_{t}

, for the temporal variable. Recall that the time interval

I_{T}

is decomposed using uniform intervals

I_{T} = \cup_{k_{t} = 1}^{N} I_{k_{t}}^{t}

. We define

D_{t} = {[\begin{matrix} D_{I_{1}^{t}} & 0 & 0 & 0 & \dots & 0 \\ 0 & D_{I_{2}^{t}} & 0 & 0 & \dots & 0 \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ 0 & 0 & 0 & \dots & D_{I_{N - 1}^{t}} & 0 \\ 0 & 0 & 0 & 0 & 0 & D_{I_{N}^{t}} \end{matrix}]}_{2 N \times 2 N},

(44)

where

D_{I_{k_{t}}^{t}}

is defined as in Equation (30). To achieve the proper global assembly of the 1D time derivative matrix, we need to take into account the common boundary points of each interval. This can be accomplished with the help of the binary matrix,

B

:

B = {[\begin{matrix} 1 & 0 & 0 & 0 & 0 & \dots & 0 \\ 0 & 1 & 1 & 0 & 0 & \dots & 0 \\ 0 & 0 & 0 & 1 & 1 & \dots & 0 \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ 0 & 0 & 0 & 0 & 0 & \dots & 1 \end{matrix}]}_{(N + 1) \times 2 N} .

(45)

Then, the global

[(N + 1) \times (N + 1)]

time-derivative matrix,

D_{I_{T}}

, can be determined as

D_{I_{T}} = B D_{t} B^{T} .

(46)

In a similar fashion, using

B

, we can build the global one-dimensional mass matrices

M_{Ω_{Z}}, M_{Ω_{Y}}

, and

M_{Ω_{X}}

. Finally, we obtain the following for

T_{1}^{g}

:

T_{1}^{g} : = (D_{I_{T}} \otimes M_{Ω_{Z}} \otimes M_{Ω_{Y}} \otimes M_{Ω_{X}}) U^{g},

(47)

where

U^{g}

is a global column solution vector of size

[{(N + 1)}^{4} \times 1]

. Similarly, we build the term

T_{5}^{g}

:

T_{5}^{g} : = (M_{I_{T}} \otimes M_{Ω_{Z}} \otimes M_{Ω_{Y}} \otimes M_{Ω_{X}}) F^{g},

(48)

where

F^{g}

is the column load vector of size

[{(N + 1)}^{4} \times 1]

.
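
The assembly in (44)–(47) can be sketched in a few lines. The code assumes NumPy, the same number of intervals N and step h in every direction, and the closed-form local matrices for linear hat functions; the helper names `gather_matrix` and `assemble_1d` are hypothetical.

```python
import numpy as np

def gather_matrix(N):
    """Binary matrix B of Eq. (45): maps the 2N local endpoints to N + 1 global nodes."""
    B = np.zeros((N + 1, 2 * N))
    for k in range(N):
        B[k, 2 * k] = 1.0                # left endpoint of interval k
        B[k + 1, 2 * k + 1] = 1.0        # right endpoint of interval k
    return B

def assemble_1d(local, N):
    """Global 1D matrix B * blockdiag(local, ..., local) * B^T, as in Eq. (46)."""
    B = gather_matrix(N)
    return B @ np.kron(np.eye(N), local) @ B.T

N = 3
h = 1.0 / N
M_loc = h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
D_loc = 0.5 * np.array([[-1.0, 1.0], [-1.0, 1.0]])

M_glob = assemble_1d(M_loc, N)           # (N + 1) x (N + 1) global mass matrix
D_glob = assemble_1d(D_loc, N)           # (N + 1) x (N + 1) global time-derivative matrix

# Eq. (47): operator of the time-derivative term, of size (N + 1)^4 x (N + 1)^4.
A1 = np.kron(np.kron(np.kron(D_glob, M_glob), M_glob), M_glob)
```

Forming `A1` densely is feasible only for very small N; it is shown here to make the Kronecker structure explicit, whereas the TT format stores only the four one-dimensional factors.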

5.2. Assembly for Terms with Variable Coefficients

5.2.1. Assembly of $T_{2}$ , When the Diffusion Coefficient Depends on Space-Time Variables, but Enables a Separation of Variables

First, let us assume that

κ (t, z, y, x) = κ^{t} (t) κ^{z} (z) κ^{y} (y) κ^{x} (x),

(49)

where

κ^{t} (t)

,

κ^{z} (z)

,

κ^{y} (y)

, and

κ^{x} (x)

are the one-dimensional factors of

κ (t, z, y, x)

. Then, locally, (37) becomes

\begin{matrix} T_{2} & = (\sum_{m_{1}, m_{2}, m_{3}, m_{4} = 0}^{1} κ^{t} (t_{m_{4}}) κ^{z} (z_{m_{3}}) κ^{y} (y_{m_{2}}) κ^{x} (x_{m_{1}}) (M_{I_{k_{t}}^{t}}^{m_{4}} \otimes S_{I_{k_{z}}^{z}}^{m_{3}} \otimes M_{I_{k_{y}}^{y}}^{m_{2}} \otimes M_{I_{k_{x}}^{x}}^{m_{1}} \\ + M_{I_{k_{t}}^{t}}^{m_{4}} \otimes M_{I_{k_{z}}^{z}}^{m_{3}} \otimes S_{I_{k_{y}}^{y}}^{m_{2}} \otimes M_{I_{k_{x}}^{x}}^{m_{1}} + M_{I_{k_{t}}^{t}}^{m_{4}} \otimes M_{I_{k_{z}}^{z}}^{m_{3}} \otimes M_{I_{k_{y}}^{y}}^{m_{2}} \otimes S_{I_{k_{x}}^{x}}^{m_{1}})) U . \end{matrix}

(50)

To assemble the global matrix in the x-dimension, let us define the following pairs of local one-dimensional coefficient matrices. For each interval,

I_{k - 1}^{x} = [x_{k - 1}, x_{k}]

, we have

\begin{matrix} κ_{k - 1}^{x} = {[\begin{matrix} κ^{x} (x_{k - 1}) & 0 \\ 0 & κ^{x} (x_{k - 1}) \end{matrix}]}_{2 \times 2}, \end{matrix}

(51)

and similarly for the

y, z

, and t dimensions. By evaluating

κ^{x} (x)

at all points of

Ω_{X} = \cup_{k_{x} = 0}^{N - 1} I_{k_{x}}^{x}

, we define the following pair of global block coefficient matrices in the x-dimension:

C_{0}^{x} = {[\begin{matrix} κ_{0}^{x} & 0 & 0 & 0 & \dots & 0 \\ 0 & κ_{1}^{x} & 0 & 0 & \dots & 0 \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ 0 & 0 & 0 & \dots & κ_{N - 2}^{x} & 0 \\ 0 & 0 & 0 & 0 & 0 & κ_{N - 1}^{x} \end{matrix}]}_{2 N \times 2 N},

(52)

C_{1}^{x} = {[\begin{matrix} κ_{1}^{x} & 0 & 0 & 0 & \dots & 0 \\ 0 & κ_{2}^{x} & 0 & 0 & \dots & 0 \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ 0 & 0 & 0 & \dots & κ_{N - 1}^{x} & 0 \\ 0 & 0 & 0 & 0 & 0 & κ_{N}^{x} \end{matrix}]}_{2 N \times 2 N},

(53)

Using (36), we construct the global pair of stiffness block matrices as follows:

S_{0}^{x} = {[\begin{matrix} S_{I_{0}^{x}}^{0} & 0 & 0 & 0 & \dots & 0 \\ 0 & S_{I_{1}^{x}}^{0} & 0 & 0 & \dots & 0 \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ 0 & 0 & 0 & \dots & S_{I_{N - 2}^{x}}^{0} & 0 \\ 0 & 0 & 0 & 0 & 0 & S_{I_{N - 1}^{x}}^{0} \end{matrix}]}_{2 N \times 2 N},

(54)

S_{1}^{x} = {[\begin{matrix} S_{I_{0}^{x}}^{1} & 0 & 0 & 0 & \dots & 0 \\ 0 & S_{I_{1}^{x}}^{1} & 0 & 0 & \dots & 0 \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ 0 & 0 & 0 & \dots & S_{I_{N - 2}^{x}}^{1} & 0 \\ 0 & 0 & 0 & 0 & 0 & S_{I_{N - 1}^{x}}^{1} \end{matrix}]}_{2 N \times 2 N} .

(55)

By using (45) and (52)–(55), we construct the global stiffness matrices corresponding to the variable x:

S_{x_{0}}^{g} : = B S_{0}^{x} C_{0}^{x} B^{T}, and S_{x_{1}}^{g} : = B S_{1}^{x} C_{1}^{x} B^{T},

(56)

and similarly, the global stiffness matrices

S_{y_{0}}^{g}

,

S_{z_{0}}^{g}

,

S_{y_{1}}^{g}

, and

S_{z_{1}}^{g}

. Now, we focus on the construction of block matrices

M_{0}^{x}

and

M_{1}^{x}

, whose diagonal blocks are the modified local mass matrices from (36):

M_{0}^{x} = {[\begin{matrix} M_{I_{0}^{x}}^{0} & 0 & 0 & 0 & \dots & 0 \\ 0 & M_{I_{1}^{x}}^{0} & 0 & 0 & \dots & 0 \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ 0 & 0 & 0 & \dots & M_{I_{N - 2}^{x}}^{0} & 0 \\ 0 & 0 & 0 & 0 & 0 & M_{I_{N - 1}^{x}}^{0} \end{matrix}]}_{2 N \times 2 N},

(57)

M_{1}^{x} = {[\begin{matrix} M_{I_{0}^{x}}^{1} & 0 & 0 & 0 & \dots & 0 \\ 0 & M_{I_{1}^{x}}^{1} & 0 & 0 & \dots & 0 \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ 0 & 0 & 0 & \dots & M_{I_{N - 2}^{x}}^{1} & 0 \\ 0 & 0 & 0 & 0 & 0 & M_{I_{N - 1}^{x}}^{1} \end{matrix}]}_{2 N \times 2 N} .

(58)

By using (45), (52), (53), (57), and (58), we construct the global mass matrices corresponding to the variable x as follows:

M_{x_{0}}^{g} : = B M_{0}^{x} C_{0}^{x} B^{T}, and M_{x_{1}}^{g} : = B M_{1}^{x} C_{1}^{x} B^{T},

(59)

and similarly for the remaining dimensions. By using the above matrices, we construct the global space-time matricization for the diffusion term as follows:

\begin{matrix} T_{2}^{g} & : = (\sum_{m_{1}, m_{2}, m_{3}, m_{4} = 0}^{1} [M_{t_{m_{4}}}^{g} \otimes S_{z_{m_{3}}}^{g} \otimes M_{y_{m_{2}}}^{g} \otimes M_{x_{m_{1}}}^{g} \\ + M_{t_{m_{4}}}^{g} \otimes M_{z_{m_{3}}}^{g} \otimes S_{y_{m_{2}}}^{g} \otimes M_{x_{m_{1}}}^{g} + M_{t_{m_{4}}}^{g} \otimes M_{z_{m_{3}}}^{g} \otimes M_{y_{m_{2}}}^{g} \otimes S_{x_{m_{1}}}^{g}]) U^{g} . \end{matrix}

(60)

Furthermore, we highlight that we use the same notation for the global mass and stiffness matrices corresponding to the diffusion, convection, and reaction terms, unless otherwise specified.
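
A one-dimensional version of the construction in (52), (53), and (57)–(59) is sketched below. It assumes NumPy, linear hat functions (for which the modified local mass matrices have the closed forms used in the code), and an illustrative factor κ^x(x) = 1 + x²; the helper name `gather_matrix` is hypothetical.

```python
import numpy as np

def gather_matrix(N):
    """Binary matrix B of Eq. (45)."""
    B = np.zeros((N + 1, 2 * N))
    for k in range(N):
        B[k, 2 * k] = 1.0
        B[k + 1, 2 * k + 1] = 1.0
    return B

N = 8
h = 1.0 / N
x = np.linspace(0.0, 1.0, N + 1)
kappa_x = 1.0 + x ** 2                                  # illustrative 1D factor of kappa

# Modified local mass matrices of Eq. (36) for linear hat functions (closed form).
Mp0 = h * np.array([[1 / 4, 1 / 12], [1 / 12, 1 / 12]])
Mp1 = h * np.array([[1 / 12, 1 / 12], [1 / 12, 1 / 4]])

B = gather_matrix(N)
I2 = np.eye(2)
C0 = np.kron(np.diag(kappa_x[:-1]), I2)                 # Eq. (52): left-node values
C1 = np.kron(np.diag(kappa_x[1:]), I2)                  # Eq. (53): right-node values
M0 = np.kron(np.eye(N), Mp0)                            # Eq. (57)
M1 = np.kron(np.eye(N), Mp1)                            # Eq. (58)

Mg0 = B @ M0 @ C0 @ B.T                                 # Eq. (59)
Mg1 = B @ M1 @ C1 @ B.T
M_kappa = Mg0 + Mg1                                     # mass matrix weighted by L_h(kappa^x)

# Summing all entries integrates the piecewise-linear interpolant of kappa^x,
# which equals the trapezoidal rule applied to the nodal values.
total = np.ones(N + 1) @ M_kappa @ np.ones(N + 1)
trapezoid = h * (kappa_x.sum() - 0.5 * (kappa_x[0] + kappa_x[-1]))
```

The two global matrices play the roles of the p = 0 and p = 1 terms in the sums of (60); their sum is the one-dimensional mass matrix weighted by the interpolated coefficient.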

5.3. Assembly for the Convection Term $T_{3}$ , on the Four-Dimensional Hypercube

The coefficient matrices corresponding to the convective part can be constructed following the analogous technique as described in the previous section. First, we write

b (t, x) = [b_{1} (t, x) b_{2} (t, x) b_{3} (t, x)] .

Moreover, we assume that each component

b_{i}

for

1 \leq i \leq 3

admits a separation of variables. Thus, we have

b_{i} (t, x) = b_{i}^{t} (t) b_{i}^{z} (z) b_{i}^{y} (y) b_{i}^{x} (x),

where

b_{i}^{t} (t), b_{i}^{z} (z), b_{i}^{y} (y), b_{i}^{x} (x)

are the one-dimensional factors of the function

b_{i} (t, x)

for each i. Then, we have

\begin{matrix} T_{3} & = (\sum_{m_{1}, m_{2}, m_{3}, m_{4} = 0}^{1} b_{1} (V_{m, l}) (M_{I_{k_{t}}^{t}}^{m_{4}} \otimes M_{I_{k_{z}}^{z}}^{m_{3}} \otimes M_{I_{k_{y}}^{y}}^{m_{2}} \otimes D_{I_{k_{x}}^{x}}^{m_{1}}) \\ + b_{2} (V_{m, l}) (M_{I_{k_{t}}^{t}}^{m_{4}} \otimes M_{I_{k_{z}}^{z}}^{m_{3}} \otimes D_{I_{k_{y}}^{y}}^{m_{2}} \otimes M_{I_{k_{x}}^{x}}^{m_{1}}) \\ + b_{3} (V_{m, l}) (M_{I_{k_{t}}^{t}}^{m_{4}} \otimes D_{I_{k_{z}}^{z}}^{m_{3}} \otimes M_{I_{k_{y}}^{y}}^{m_{2}} \otimes M_{I_{k_{x}}^{x}}^{m_{1}})) U . \end{matrix}

(61)

We define the following pairs of local one-dimensional coefficient matrices to assemble the global matrix in the x-dimension corresponding to the convective term. On each interval

I_{k - 1}^{x} = [x_{k - 1}, x_{k}]

, we have

\begin{matrix} b_{i, k - 1}^{x} = {[\begin{matrix} b_{i}^{x} (x_{k - 1}) & 0 \\ 0 & b_{i}^{x} (x_{k - 1}) \end{matrix}]}_{2 \times 2} \forall i = 1, 2, 3 . \end{matrix}

(62)

We follow the same construction for the y, z, and t dimensions. Using (62), we define the following pair of global block coefficient matrices in the x-dimension:

C_{i, 0}^{b, x} = {[\begin{matrix} b_{i, 0}^{x} & 0 & 0 & 0 & \dots & 0 \\ 0 & b_{i, 1}^{x} & 0 & 0 & \dots & 0 \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ 0 & 0 & 0 & \dots & b_{i, N - 2}^{x} & 0 \\ 0 & 0 & 0 & 0 & 0 & b_{i, N - 1}^{x} \end{matrix}]}_{2 N \times 2 N},

(63)

C_{i, 1}^{b, x} = {[\begin{matrix} b_{i, 1}^{x} & 0 & 0 & 0 & \dots & 0 \\ 0 & b_{i, 2}^{x} & 0 & 0 & \dots & 0 \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ 0 & 0 & 0 & \dots & b_{i, N - 1}^{x} & 0 \\ 0 & 0 & 0 & 0 & 0 & b_{i, N}^{x} \end{matrix}]}_{2 N \times 2 N} .

(64)

Using (39), we construct the global pair of derivative block matrices as follows:

D_{i, 0}^{x} = {[\begin{matrix} D_{I_{0}^{x}}^{0} & 0 & 0 & 0 & \dots & 0 \\ 0 & D_{I_{1}^{x}}^{0} & 0 & 0 & \dots & 0 \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ 0 & 0 & 0 & \dots & D_{I_{N - 2}^{x}}^{0} & 0 \\ 0 & 0 & 0 & 0 & 0 & D_{I_{N - 1}^{x}}^{0} \end{matrix}]}_{2 N \times 2 N},

(65)

D_{i, 1}^{x} = {[\begin{matrix} D_{I_{0}^{x}}^{1} & 0 & 0 & 0 & \dots & 0 \\ 0 & D_{I_{1}^{x}}^{1} & 0 & 0 & \dots & 0 \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ 0 & 0 & 0 & \dots & D_{I_{N - 2}^{x}}^{1} & 0 \\ 0 & 0 & 0 & 0 & 0 & D_{I_{N - 1}^{x}}^{1} \end{matrix}]}_{2 N \times 2 N} .

(66)

The construction of the block matrices consisting of local mass matrices

M_{i, 0}^{x}

and

M_{i, 1}^{x}

,

1 \leq i \leq 3

follows the same techniques as in Section 5.2.1 (Equations (57) and (58)). Using (45) and (63)–(66), we construct the global derivative matrices corresponding to the x variable as follows:

D_{i, x_{0}}^{g} : = B D_{i, 0}^{x} C_{i, 0}^{b, x} B^{T} and D_{i, x_{1}}^{g} : = B D_{i, 1}^{x} C_{i, 1}^{b, x} B^{T} .

(67)

The matrices

D_{i, y_{0}}^{g}

,

D_{i, y_{1}}^{g}

,

D_{i, z_{0}}^{g}

and

D_{i, z_{1}}^{g}

are constructed following the same structure as (67). Furthermore, as in (59), we construct the global mass matrices

M_{i, x_{0}}^{g} : = B M_{i, 0}^{x} C_{i, 0}^{b, x} B^{T}, and M_{i, x_{1}}^{g} : = B M_{i, 1}^{x} C_{i, 1}^{b, x} B^{T} \forall 1 \leq i \leq 3 .

(68)

The construction of the mass matrices for the other variables, y, z, and t, follows the same technique as in (68). By using these global matrices corresponding to the variables x, y, z, and t, we derive the low-rank structure of the convective term as follows:

\begin{matrix} T_{3}^{g} & = (\sum_{m_{1}, m_{2}, m_{3}, m_{4} = 0}^{1} [M_{1, t_{m_{4}}}^{g} \otimes M_{1, z_{m_{3}}}^{g} \otimes M_{1, y_{m_{2}}}^{g} \otimes D_{1, x_{m_{1}}}^{g} \\ + M_{2, t_{m_{4}}}^{g} \otimes M_{2, z_{m_{3}}}^{g} \otimes D_{2, y_{m_{2}}}^{g} \otimes M_{2, x_{m_{1}}}^{g} \\ + M_{3, t_{m_{4}}}^{g} \otimes D_{3, z_{m_{3}}}^{g} \otimes M_{3, y_{m_{2}}}^{g} \otimes M_{3, x_{m_{1}}}^{g}]) U^{g} . \end{matrix}

(69)

5.4. Assembly for the Reaction Term $T_{4}$ on the Four-Dimensional Hypercube

The construction of the global mass matrices in the x, y, z, and t variables corresponding to the reaction term follows the same techniques as in Equations (57) and (58). By using

M_{t_{m_{4}}}^{g}

,

M_{z_{m_{3}}}^{g}

,

M_{y_{m_{2}}}^{g}

, and

M_{x_{m_{1}}}^{g}

, we can construct the matricization corresponding to the reaction term

T_{4}

as

T_{4}^{g} : = \sum_{m_{1}, m_{2}, m_{3}, m_{4} = 0}^{1} (M_{t_{m_{4}}}^{g} \otimes M_{z_{m_{3}}}^{g} \otimes M_{y_{m_{2}}}^{g} \otimes M_{x_{m_{1}}}^{g}) U^{g} .

(70)
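The Kronecker structure in (69) and (70) means that a term of the operator can be applied to a vector without assembling the large matrix. The following NumPy sketch illustrates this for a single term; the helper name `apply_kron4`, the random stand-in matrices, and the sizes are illustrative assumptions, not part of the original construction.

```python
import numpy as np

def apply_kron4(Mt, Mz, My, Mx, u):
    """Return (Mt ⊗ Mz ⊗ My ⊗ Mx) u without forming the Kronecker product."""
    nt, nz, ny, nx = Mt.shape[1], Mz.shape[1], My.shape[1], Mx.shape[1]
    # With row-major ordering the x index runs fastest, which matches the
    # factor ordering t ⊗ z ⊗ y ⊗ x.
    U = u.reshape(nt, nz, ny, nx)
    V = np.einsum("at,bz,cy,dx,tzyx->abcd", Mt, Mz, My, Mx, U)
    return V.reshape(-1)

rng = np.random.default_rng(0)
# Random stand-ins for the one-dimensional global factors (hypothetical sizes).
Mt, Mz, My, Mx = (rng.standard_normal((n, n)) for n in (3, 4, 2, 5))
u = rng.standard_normal(3 * 4 * 2 * 5)

# The dense Kronecker product is formed only to check the result on a tiny case.
dense = np.kron(np.kron(np.kron(Mt, Mz), My), Mx)
kron_matches_dense = np.allclose(dense @ u, apply_kron4(Mt, Mz, My, Mx, u))
```

A sum over several terms, as in (70), would apply this routine once per term and add the results.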

5.5. The Global System

Throughout, we assume that

κ

,

b

, and c are rank one functions in order to derive expressions for

T_{1}^{g}, T_{2}^{g}, T_{3}^{g}, T_{4}^{g}

, and

T_{5}^{g}

. In a general case, we utilize the cross-interpolation technique [51] to approximate the TT format of general functions, which can be interpreted as an approximation in separated-variable form. Hence, the techniques described above can be used to construct the global matrices. Taking into account all expressions

T_{1}^{g}, T_{2}^{g}, T_{3}^{g}, T_{4}^{g}

, and

T_{5}^{g}

results in the following system of equations:

(T_{1}^{g} + T_{2}^{g} + T_{3}^{g} + T_{4}^{g}) : = A^{g} U^{g} = T_{5}^{g} .

(71)

Finally, to deal with boundary conditions, we reformulate the system in Equation (71) to only include interior nodes:

A^{g, i n t} U^{g, i n t} = T_{5}^{g, i n t} - F^{b d},

(72)

where

F^{b d}

is the boundary term incorporating the boundary and initial conditions. To implement the boundary conditions in the TT format, we reduce the linear system to hold only for interior nodes, while the boundary conditions are imposed on the boundary nodes, and the corresponding columns in (71) are subtracted from the right-hand side of the equation. As a result, the load vector is modified to account for the boundary conditions. This standard approach is implemented in the TT format via matrix–vector multiplication [34]. More details of this formulation are described in [12] (Section 2.2.4).
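The reduction to interior nodes can be illustrated on a one-dimensional analogue. The sketch below is a dense NumPy example under simple assumptions (linear elements, zero load, Dirichlet values 1 and 3 chosen arbitrarily); it is not the TT implementation of [34], only the same algebra on a small matrix.

```python
import numpy as np

N = 8                                  # number of elements
h = 1.0 / N
x = np.linspace(0.0, 1.0, N + 1)

# Global stiffness matrix for linear elements on all N + 1 nodes.
A = (np.diag(2.0 * np.ones(N + 1))
     - np.diag(np.ones(N), 1) - np.diag(np.ones(N), -1)) / h
A[0, 0] = A[-1, -1] = 1.0 / h

interior = np.arange(1, N)             # 0-based counterpart of the index set 2:N
boundary = np.array([0, N])

g = np.zeros(N + 1)                    # boundary data, extended by zero inside
g[boundary] = [1.0, 3.0]               # assumed values u(0) = 1, u(1) = 3
load = np.zeros(N + 1)                 # zero load, so the exact solution is linear

# Boundary term: interior rows of A applied to the zero-extended boundary data.
F_bd = A[interior, :] @ g
u = g.copy()
u[interior] = np.linalg.solve(A[np.ix_(interior, interior)], load[interior] - F_bd)
```

Because `g` vanishes at the interior nodes, `A[interior, :] @ g` picks out exactly the boundary columns, which is why the whole step reduces to one matrix–vector product.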

6. Tensorization of the Weak-Form of CDR

6.1. Transformation of the CDR Discretization into TT and QTT Formats

To solve the CDR equation with coefficients

κ (t, x)

,

b (t, x)

, and

c (t, x)

, along with boundary conditions, initial conditions, and loading functions that do not allow separation of variables, we use TT-cross to construct the TT format directly from these inputs. The space-time discretization process leads to the formulation of a linear system for all interior nodes, as outlined in Equation (72). For simplicity, we will omit part of the indices and refer to this equation as follows:

A U = T_{5} - F^{bd}

, where

A

represents the operator matrix,

U

is the solution vector,

T_{5}

is the loading term, and

F^{bd}

accounts for the boundary and initial conditions. In the following section, we will provide a detailed explanation of how to construct the TT and QTT formats for each component of this linear system, utilizing the following three steps:

\begin{matrix} A U = T_{5} - F^{bd} \\ A^{T T} U^{T T} = T_{5}^{T T} - F^{bd, T T} \\ A^{Q T T} U^{Q T T} = T_{5}^{Q T T} - F^{bd, Q T T}, \end{matrix}

(73)

where

A^{T T} U^{T T} : = T_{1}^{T T} + T_{2}^{T T} + T_{3}^{T T} + T_{4}^{T T}

and

A^{Q T T} U^{Q T T} : = T_{1}^{Q T T} + T_{2}^{Q T T} + T_{3}^{Q T T} + T_{4}^{Q T T}

are formulated using

A, U

, and loading terms such as

T_{5}, F^{bd}

, which are expressed in

T T

and

Q T T

tensor formats.

6.2. Construction of the TT Format for Linear System Components

Given that the operators in their matrix representations exhibit Kronecker product structures, as derived in Section 2, their TT formats can be constructed by employing component matrices as TT-cores; see Equation (28). To construct the TT format of the operators acting on the interior nodes, we first define the following index sets:

\begin{matrix} I_{t} : = 2 : N + 1, index set for the time variable corresponding to interior nodes; \\ I_{s} : = 2 : N, index set for the space variable corresponding to interior nodes . \end{matrix}

TT format of $T_{1}^{g, i n t}$ , $T_{1}^{T T}$ : From the formulation in Equation (47), the global temporal operator in the TT-matrix format acting only on the interior nodes is constructed as follows:

$T_{1}^{T T} : = (D_{I_{T}} (I_{t}, I_{t}) \circ M_{Ω_{Z}} (I_{s}, I_{s}) \circ M_{Ω_{Y}} (I_{s}, I_{s}) \circ M_{Ω_{X}} (I_{s}, I_{s})) U^{T T},$

(74)

where the tensor product operator ∘ is defined in Appendix A.
TT format of $T_{2}^{g, i n t}$ , $T_{2}^{T T}$ : From the formulation in Equation (60), the diffusion operator in the TT-matrix format is constructed as follows:

$\begin{matrix} T_{2}^{T T} : = (\sum_{m_{1}, m_{2}, m_{3}, m_{4} = 0}^{1} [ & M_{t_{m_{4}}}^{g} (I_{t}, I_{t}) \circ S_{z_{m_{3}}}^{g} (I_{s}, I_{s}) \circ M_{y_{m_{2}}}^{g} (I_{s}, I_{s}) \circ M_{x_{m_{1}}}^{g} (I_{s}, I_{s}) \\ + & M_{t_{m_{4}}}^{g} (I_{t}, I_{t}) \circ M_{z_{m_{3}}}^{g} (I_{s}, I_{s}) \circ S_{y_{m_{2}}}^{g} (I_{s}, I_{s}) \circ M_{x_{m_{1}}}^{g} (I_{s}, I_{s}) \\ + & M_{t_{m_{4}}}^{g} (I_{t}, I_{t}) \circ M_{z_{m_{3}}}^{g} (I_{s}, I_{s}) \circ M_{y_{m_{2}}}^{g} (I_{s}, I_{s}) \circ S_{x_{m_{1}}}^{g} (I_{s}, I_{s})]) U^{T T} . \end{matrix}$

(75)
TT format of $T_{3}^{g, i n t}$ , $T_{3}^{T T}$ : Following the formulation in Equation (69), the convection operator in the TT-matrix format is constructed as follows:

$\begin{matrix} T_{3}^{T T} = (\sum_{m_{1}, m_{2}, m_{3}, m_{4} = 0}^{1} [ & M_{t_{m_{4}}}^{g} (I_{t}, I_{t}) \circ M_{z_{m_{3}}}^{g} (I_{s}, I_{s}) \circ M_{y_{m_{2}}}^{g} (I_{s}, I_{s}) \circ D_{x_{m_{1}}}^{g} (I_{s}, I_{s}) \\ + & M_{t_{m_{4}}}^{g} (I_{t}, I_{t}) \circ M_{z_{m_{3}}}^{g} (I_{s}, I_{s}) \circ D_{y_{m_{2}}}^{g} (I_{s}, I_{s}) \circ M_{x_{m_{1}}}^{g} (I_{s}, I_{s}) \\ + & M_{t_{m_{4}}}^{g} (I_{t}, I_{t}) \circ D_{z_{m_{3}}}^{g} (I_{s}, I_{s}) \circ M_{y_{m_{2}}}^{g} (I_{s}, I_{s}) \circ M_{x_{m_{1}}}^{g} (I_{s}, I_{s})]) U^{T T} . \end{matrix}$

(76)
TT format of $T_{4}^{g, i n t}$ , $T_{4}^{T T}$ : Following the formulation in Equation (70), the reaction term is constructed as follows:

$T_{4}^{T T} : = \sum_{m_{1}, m_{2}, m_{3}, m_{4} = 0}^{1} (M_{t_{m_{4}}}^{g} (I_{t}, I_{t}) \circ M_{z_{m_{3}}}^{g} (I_{s}, I_{s}) \circ M_{y_{m_{2}}}^{g} (I_{s}, I_{s}) \circ M_{x_{m_{1}}}^{g} (I_{s}, I_{s})) U^{T T} .$

(77)
TT format of $T_{5}^{g, i n t}$ , $T_{5}^{T T}$ : From Equation (48), the loading term is constructed as follows:

$T_{5}^{T T} : = (M_{I_{T}} (I_{t}, I_{t}) \circ M_{Ω_{Z}} (I_{s}, I_{s}) \circ M_{Ω_{Y}} (I_{s}, I_{s}) \circ M_{Ω_{X}} (I_{s}, I_{s})) F^{T T} .$

(78)

The TT format of the loading term $F^{T T}$ is approximated by the cross-interpolation algorithm [52].
TT format of the boundary term, $F^{b d, T T}$ : The boundary and initial conditions are enforced by the boundary term, which maps from all nodes to only the interior nodes. In fact, we construct a map that enforces the boundary condition in each equation of the system corresponding to the interior nodes. This approach is very convenient for the TT modification, as it only requires matrix–vector multiplication. The detailed construction of $F^{b d, T T}$ is included in Appendix B.
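To make the role of the component matrices as TT-cores concrete, the following NumPy sketch builds TT-matrix cores for a sum of Kronecker terms restricted to the interior index sets and checks them against the dense Kronecker sum. The function names, the random stand-in matrices, and the choice of three terms are illustrative assumptions; the block-diagonal core layout is one standard way to represent such a sum and is not claimed to be the layout used by the TT-Toolbox.

```python
import numpy as np
from functools import reduce

def tt_cores_from_kron_sum(terms):
    """Cores of shape (r_prev, m, n, r_next) for sum_r A1^(r) ⊗ ... ⊗ Ad^(r)."""
    R, d = len(terms), len(terms[0])
    cores = []
    for k in range(d):
        m, n = terms[0][k].shape
        G = np.zeros((1 if k == 0 else R, m, n, 1 if k == d - 1 else R))
        for r, term in enumerate(terms):
            # Term r occupies the r-th diagonal rank slot of every core.
            G[0 if k == 0 else r, :, :, 0 if k == d - 1 else r] += term[k]
        cores.append(G)
    return cores

def tt_to_dense(cores):
    """Contract the cores and return the equivalent dense matrix."""
    T = cores[0]
    for G in cores[1:]:
        T = np.tensordot(T, G, axes=1)     # contract the shared rank index
    T = T[0, ..., 0]                       # drop the boundary ranks of size 1
    d = len(cores)
    T = T.transpose([2 * k for k in range(d)] + [2 * k + 1 for k in range(d)])
    return T.reshape(int(np.prod(T.shape[:d])), -1)

N = 4
rng = np.random.default_rng(1)
s = slice(1, N)        # spatial interior nodes (2:N in 1-based notation)
t = slice(1, N + 1)    # temporal interior nodes (2:N+1 in 1-based notation)

terms = []
for _ in range(3):     # three Kronecker terms, factors ordered t, z, y, x
    Mt, Mz, My, Mx = (rng.standard_normal((N + 1, N + 1)) for _ in range(4))
    terms.append((Mt[t, t], Mz[s, s], My[s, s], Mx[s, s]))

cores = tt_cores_from_kron_sum(terms)
dense = sum(reduce(np.kron, term) for term in terms)
tt_matches_dense = np.allclose(tt_to_dense(cores), dense)
tt_ranks = [G.shape[3] for G in cores[:-1]]
```

In this layout the TT-ranks equal the number of terms; a subsequent rounding step would typically reduce them when the terms share factors.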

6.3. Construction of the QTT Format for Linear System Components

Given the banded structure of the component matrices, they can be further compressed through transformation into the QTT format [28]. We will proceed with the construction of the QTT format. The idea is to first convert the component matrices into QTT format and then link them together, as in the case of the TT format. This algorithm is described in [34] (Algorithm 1).
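The effect of the banded structure on QTT compressibility can be checked on a small example. The sketch below is not Algorithm 1 of [34]; it only inspects QTT ranks by reshaping a $2^{L} \times 2^{L}$ matrix into $L$ modes of size 4 (one row bit and one column bit each) and applying successive truncated SVDs. The function names and the tolerance are assumptions made for illustration.

```python
import numpy as np

def qtt_matrix_ranks(A, tol=1e-10):
    """QTT ranks of a 2^L x 2^L matrix, found by successive truncated SVDs."""
    L = int(np.log2(A.shape[0]))
    # Split row and column indices into L binary digits each, then interleave
    # them so that mode k carries the pair (i_k, j_k).
    T = A.reshape([2] * (2 * L))
    order = [ax for k in range(L) for ax in (k, L + k)]
    T = T.transpose(order).reshape([4] * L)
    ranks, r = [], 1
    C = T.reshape(1, -1)
    for _ in range(L - 1):
        C = C.reshape(r * 4, -1)
        _, s, Vt = np.linalg.svd(C, full_matrices=False)
        r = int(np.sum(s > tol * s[0]))
        ranks.append(r)
        C = s[:r, None] * Vt[:r]
    return ranks

def laplacian_1d(n):
    """Tridiagonal (-1, 2, -1) matrix, a simple banded model operator."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

ranks_by_size = {n: qtt_matrix_ranks(laplacian_1d(n)) for n in (16, 64, 256)}
```

For this tridiagonal model matrix every QTT rank stays at 3 while the matrix size grows, so storage grows with the number of cores, that is, with $\log_{2}$ of the matrix size.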

At this point, we have completed constructing the TT format and QTT format of the linear system in Equation (73). To solve the TT/QTT linear system using tensor network optimization techniques, we employed the MATLAB TT-Toolbox [52].

7. Extension of the Method to Higher Orders and Dimensions

In this section, we briefly show how the proposed technique can be extended to arbitrary order

γ

and arbitrary dimension d. We define the set of basis functions on the interval

I^{x}

as follows:

B_{I^{x}}^{x} = {ϕ_{0}^{x}, ϕ_{1}^{x}, ϕ_{2}^{x}, \dots, ϕ_{γ}^{x}},

(79)

where each

ϕ_{i}^{x}

for

0 \leq i \leq γ

is a polynomial of degree

γ

, and satisfies the

δ_{i j}

property on each node on

I^{x}

. For simplicity, we omit the interval index k. On

I^{x}

, we define the mass and stiffness matrices corresponding to a single variable, x, as follows:

\begin{matrix} M_{I^{x}} & : = {[\begin{matrix} (ϕ_{0}^{x}, ϕ_{0}^{x}) & (ϕ_{1}^{x}, ϕ_{0}^{x}) & (ϕ_{2}^{x}, ϕ_{0}^{x}) & \dots & (ϕ_{γ}^{x}, ϕ_{0}^{x}) \\ (ϕ_{0}^{x}, ϕ_{1}^{x}) & (ϕ_{1}^{x}, ϕ_{1}^{x}) & (ϕ_{2}^{x}, ϕ_{1}^{x}) & \dots & (ϕ_{γ}^{x}, ϕ_{1}^{x}) \\ (ϕ_{0}^{x}, ϕ_{2}^{x}) & (ϕ_{1}^{x}, ϕ_{2}^{x}) & (ϕ_{2}^{x}, ϕ_{2}^{x}) & \dots & (ϕ_{γ}^{x}, ϕ_{2}^{x}) \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ (ϕ_{0}^{x}, ϕ_{γ}^{x}) & (ϕ_{1}^{x}, ϕ_{γ}^{x}) & (ϕ_{2}^{x}, ϕ_{γ}^{x}) & \dots & (ϕ_{γ}^{x}, ϕ_{γ}^{x}) \end{matrix}]}_{γ + 1, γ + 1}; \\ S_{I^{x}} & : = {[\begin{matrix} (\frac{d ϕ_{0}^{x}}{d x}, \frac{d ϕ_{0}^{x}}{d x}) & (\frac{d ϕ_{1}^{x}}{d x}, \frac{d ϕ_{0}^{x}}{d x}) & (\frac{d ϕ_{2}^{x}}{d x}, \frac{d ϕ_{0}^{x}}{d x}) & \dots & (\frac{d ϕ_{γ}^{x}}{d x}, \frac{d ϕ_{0}^{x}}{d x}) \\ (\frac{d ϕ_{0}^{x}}{d x}, \frac{d ϕ_{1}^{x}}{d x}) & (\frac{d ϕ_{1}^{x}}{d x}, \frac{d ϕ_{1}^{x}}{d x}) & (\frac{d ϕ_{2}^{x}}{d x}, \frac{d ϕ_{1}^{x}}{d x}) & \dots & (\frac{d ϕ_{γ}^{x}}{d x}, \frac{d ϕ_{1}^{x}}{d x}) \\ (\frac{d ϕ_{0}^{x}}{d x}, \frac{d ϕ_{2}^{x}}{d x}) & (\frac{d ϕ_{1}^{x}}{d x}, \frac{d ϕ_{2}^{x}}{d x}) & (\frac{d ϕ_{2}^{x}}{d x}, \frac{d ϕ_{2}^{x}}{d x}) & \dots & (\frac{d ϕ_{γ}^{x}}{d x}, \frac{d ϕ_{2}^{x}}{d x}) \\ ⋮ & ⋮ & ⋮ & ⋮ & ⋮ \\ (\frac{d ϕ_{0}^{x}}{d x}, \frac{d ϕ_{γ}^{x}}{d x}) & (\frac{d ϕ_{1}^{x}}{d x}, \frac{d ϕ_{γ}^{x}}{d x}) & (\frac{d ϕ_{2}^{x}}{d x}, \frac{d ϕ_{γ}^{x}}{d x}) & \dots & (\frac{d ϕ_{γ}^{x}}{d x}, \frac{d ϕ_{γ}^{x}}{d x}) \end{matrix}]}_{γ + 1, γ + 1} . \end{matrix}

(80)
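The local matrices in (80) can be evaluated numerically for any order $γ$. The sketch below assumes equispaced Lagrange nodes on the interval and a Gauss–Legendre rule with $γ + 1$ points, which integrates the polynomial integrands exactly; the function name `local_matrices` and these choices are assumptions for illustration.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

def local_matrices(gamma, a=0.0, b=1.0):
    """Local mass and stiffness matrices for degree-gamma Lagrange elements on [a, b]."""
    nodes = np.linspace(a, b, gamma + 1)          # equispaced nodes (an assumption)
    # Column i of `coeffs` holds the monomial coefficients of phi_i, obtained
    # from the nodal conditions phi_i(node_j) = delta_ij.
    V = np.vander(nodes, gamma + 1, increasing=True)
    coeffs = np.linalg.solve(V, np.eye(gamma + 1))
    # A Gauss-Legendre rule with gamma + 1 points is exact up to degree 2*gamma + 1.
    xi, w = leggauss(gamma + 1)
    xq = 0.5 * (b - a) * xi + 0.5 * (a + b)
    wq = 0.5 * (b - a) * w
    powers = np.arange(gamma + 1)
    P = xq[:, None] ** powers                     # monomials at quadrature points
    dP = np.zeros_like(P)
    dP[:, 1:] = powers[1:] * xq[:, None] ** (powers[1:] - 1)
    phi, dphi = P @ coeffs, dP @ coeffs           # shape (n_quad, gamma + 1)
    M = phi.T @ (wq[:, None] * phi)
    S = dphi.T @ (wq[:, None] * dphi)
    return M, S

M1, S1 = local_matrices(1, 0.0, 0.5)              # linear elements, cell length 0.5
M3, S3 = local_matrices(3)                        # cubic elements on [0, 1]
```

For $γ = 1$ this reproduces the familiar matrices $\frac{h}{6} [2, 1; 1, 2]$ and $\frac{1}{h} [1, - 1; - 1, 1]$ with $h$ the cell length.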

By using (80), we generate the discretization of

{(\nabla u_{h}, \nabla v_{h})}_{Ω}

as follows:

\begin{matrix} T_{2} & : = (M_{I^{x_{d}}} \otimes S_{I^{x_{d - 1}}} \otimes M_{I^{x_{d - 2}}} \otimes \dots \otimes M_{I^{x_{1}}} \\ + M_{I^{x_{d}}} \otimes M_{I^{x_{d - 1}}} \otimes S_{I^{x_{d - 2}}} \otimes M_{I^{x_{d - 3}}} \otimes \dots \otimes M_{I^{x_{1}}} \\ ⋮ \\ + M_{I^{x_{d}}} \otimes M_{I^{x_{d - 1}}} \otimes \dots \otimes M_{I^{x_{2}}} \otimes S_{I^{x_{1}}}) U, \end{matrix}

(81)

where

M_{I^{x_{j}}}

and

S_{I^{x_{j}}}

are the mass and stiffness matrices corresponding to the variable

x_{j}

for

1 \leq j \leq d

.

U

denotes a column vector of size

[{(γ + 1)}^{d} \times 1]

on each cell

q_{l}

containing the components of

u_{h}

on

q_{l}

. Similarly, we construct the matrix corresponding to the

L^{2}

inner-product on

Ω_{T}

, i.e.,

{(u_{h}, v_{h})}_{Ω_{T}}

as follows:

\begin{matrix} T_{4} = (M_{I^{x_{d}}} \otimes M_{I^{x_{d - 1}}} \otimes \dots \otimes M_{I^{x_{2}}} \otimes M_{I^{x_{1}}}) U . \end{matrix}

(82)
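The local operators in (81) and (82) can be assembled directly from the one-dimensional factors. The sketch below follows (81) as written, so the first factor, $x_{d}$, always carries a mass matrix and the stiffness matrix visits the directions $x_{d - 1}, \dots, x_{1}$. The function name `local_operators`, the linear-element factors, and the cell length are illustrative assumptions.

```python
import numpy as np
from functools import reduce

def local_operators(M_list, S_list):
    """Local matrices of (81) and (82); factors are ordered x_d, ..., x_1."""
    d = len(M_list)
    T4 = reduce(np.kron, M_list)
    T2 = np.zeros_like(T4)
    for j in range(1, d):
        # Stiffness factor in one direction, mass factors in all the others.
        factors = [S_list[k] if k == j else M_list[k] for k in range(d)]
        T2 += reduce(np.kron, factors)
    return T2, T4

h = 0.25                                             # edge length of one cell
M = h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])     # linear-element mass matrix
S = 1.0 / h * np.array([[1.0, -1.0], [-1.0, 1.0]])   # linear-element stiffness matrix
d = 3
T2, T4 = local_operators([M] * d, [S] * d)
```

Two quick consistency checks follow from the definitions: the entries of `T4` sum to the cell volume $h^{d}$, and `T2` annihilates the constant vector.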

In fact, we can extend the tensorization to the remaining terms, including those with variable coefficients, by the same technique. Thus, our method is well suited to the simulation of multi-dimensional PDEs. Moreover, the equalities (81) and (82) are exact, and the practical implementation of these terms involves only quadrature error, which can be minimized by selecting an appropriate quadrature rule. However, for higher-order quadrilateral elements, the interpolation matrix associated with Lagrange basis functions can become ill-conditioned, leading to numerical instability in solving the resulting systems of equations, especially for large-scale problems. Preconditioning techniques can help mitigate these issues. For example, in [53], the authors discussed the QTT-FEM for the wave equation using a hierarchical basis, for which the system becomes ill-conditioned at higher approximation orders, and they showed that preconditioning makes the system uniformly well-conditioned. This technique could be helpful for higher-order approximations in our method and will be explored in future work.

8. Numerical Experiments

In this section, we discuss a series of numerical experiments to assess the performance of our proposed tensor-network space-time finite element method in both the TT and QTT formats. All simulations were conducted using MATLAB R2022b on a Linux system equipped with a 2.1 GHz Intel Gold 6152 processor. Moreover, in all numerical experiments, we chose a TT truncation tolerance smaller than the approximation error in the TT construction. We compared the performance of TT and QTT solvers against the full-grid solver, which performed calculations on full tensors. Furthermore, we computed the relative error in the

L^{2} ([0, T] \times Ω)

norm for all numerical experiments. We report the error quantity as follows:

Relative Error : = \frac{∥ u - u_{h} ∥_{L^{2} ([0, T] \times Ω)}}{{∥ u ∥}_{L^{2} ([0, T] \times Ω)}} .

We used linear polynomials for all experiments; consequently, we have observed a quadratic order of convergence ([20], Figure 3). Furthermore, we highlight that in Theorem 4, we proved the convergence estimate in the

L^{2} (0, T; H_{0}^{1} (Ω))

norm. The theory does not cover convergence analysis in the

L^{2} ([0, T] \times Ω)

norm. However, from the interpolation estimate (18), we expect an optimal order of convergence in the

L^{2} ([0, T] \times Ω)

norm.
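The relative error and the observed order of convergence can be computed as in the following one-dimensional sketch. It uses the nodal interpolant of $sin (π x)$ as a stand-in for a finite element solution, so it illustrates only the interpolation estimate and the rate formula; the function name and the quadrature order are assumptions.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

def relative_l2_error(u_exact, nodal_values, nodes, n_quad=4):
    """Relative L2 error of a piecewise-linear function given by nodal values."""
    xi, w = leggauss(n_quad)
    err2 = norm2 = 0.0
    for k in range(len(nodes) - 1):
        a, b = nodes[k], nodes[k + 1]
        xq = 0.5 * (b - a) * xi + 0.5 * (a + b)
        wq = 0.5 * (b - a) * w
        slope = (nodal_values[k + 1] - nodal_values[k]) / (b - a)
        uh = nodal_values[k] + slope * (xq - a)
        err2 += np.sum(wq * (u_exact(xq) - uh) ** 2)
        norm2 += np.sum(wq * u_exact(xq) ** 2)
    return np.sqrt(err2 / norm2)

u = lambda x: np.sin(np.pi * x)
sizes = [8, 16, 32, 64]
errors = []
for n in sizes:
    nodes = np.linspace(0.0, 1.0, n + 1)
    errors.append(relative_l2_error(u, u(nodes), nodes))   # interpolant as u_h

# Observed order between consecutive grids: log(e_coarse / e_fine) / log(2).
rates = [np.log(errors[i] / errors[i + 1]) / np.log(2.0)
         for i in range(len(errors) - 1)]
```

With linear polynomials the observed rates approach 2 as the grid is refined, which is the quadratic order referred to above.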

8.1. TT-Ranks of the Diffusion Operator

Here, we explore how the TT-ranks of the diffusion operator

\nabla \cdot (κ (x, y, z) \nabla)

in the TT format vary with four different coefficient functions

κ (x, y, z)

. We chose to conduct this investigation because the TT-ranks of the diffusion operator might not be expected to remain low, given the complex assembly process described in Equation (75). The chosen coefficient functions have varying TT-ranks to illustrate their impact on the TT-ranks of the diffusion operator. Moreover, these functions possess exact TT-ranks that are independent of the grid size. The function

1 / (1 + x + y + z)

, however, demonstrates more complex behavior, with TT-ranks that vary depending on the selected TT truncation tolerance, while still remaining independent of the grid size.

The results from Table 1 show that when the coefficient function is simply

κ (x, y, z) = 1

, the TT-ranks of the global diffusion operator are surprisingly low at

[2, 2]

across three discretization grid sizes. Despite the complex assembly process in FEM, this demonstrates that the global diffusion operator itself possesses a low-rank structure, which depends on the TT-ranks of the diffusion coefficient. In general, we observe that the TT-ranks of the diffusion operator are, at most, twice the TT-ranks of the coefficient function

κ (x, y, z)

. Moreover, the number of elements per dimension does not affect the TT-ranks, which will lead to better compression on a larger grid. This result suggests a relationship between the complexity of the coefficient function and the rank of the resulting global diffusion operator in the TT format.
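The grid-independence of the coefficient ranks can be checked with a small TT-SVD on full tensors. The sketch below is a plain NumPy illustration, not the TT-cross algorithm used in the paper; it samples the coefficient from (83) and the function $1 / (1 + x + y + z)$ on uniform grids, and the function names and tolerances are assumptions.

```python
import numpy as np

def tt_ranks(T, tol=1e-10):
    """TT-ranks of a full tensor via successive truncated SVDs (TT-SVD)."""
    dims, ranks, r = T.shape, [], 1
    C = T.reshape(1, -1)
    for n in dims[:-1]:
        C = C.reshape(r * n, -1)
        _, s, Vt = np.linalg.svd(C, full_matrices=False)
        # Keep the smallest rank whose discarded tail is below tol (relative).
        tail = np.sqrt(np.cumsum(s[::-1] ** 2))[::-1]
        r = max(1, int(np.sum(tail > tol * np.linalg.norm(s))))
        ranks.append(r)
        C = s[:r, None] * Vt[:r]
    return ranks

def grid_ranks(kappa, n, tol=1e-10):
    """Sample kappa on an n x n x n uniform grid of [0, 1]^3 and return TT-ranks."""
    x = np.linspace(0.0, 1.0, n)
    X, Y, Z = np.meshgrid(x, x, x, indexing="ij")
    return tt_ranks(kappa(X, Y, Z), tol)

separable = lambda X, Y, Z: 1.0 + np.cos(np.pi * (X + Y)) * np.cos(np.pi * Z)
rational = lambda X, Y, Z: 1.0 / (1.0 + X + Y + Z)

ranks_separable = {n: grid_ranks(separable, n) for n in (9, 17, 33)}
ranks_rational = {tol: grid_ranks(rational, 33, tol) for tol in (1e-3, 1e-6, 1e-9)}
```

The first coefficient expands as $1 + (cos (π x) cos (π y) - sin (π x) sin (π y)) cos (π z)$, so its TT-ranks are $[3, 2]$ on every grid, whereas the ranks found for the rational function depend on the chosen tolerance.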

8.2. Three-Dimensional Poisson Equation

Next, we demonstrate the performance of the TT and QTT solvers on the 3D Poisson equation:

\begin{matrix} - \nabla \cdot (κ (x, y, z) \nabla u) & = f (x, y, z) in Ω, \\ u & = 0 on \partial Ω, \end{matrix}

(83)

and with the manufactured solution

u (x, y, z) = sin (π x) sin (π y) sin (π z)

on the computational domain

[0, 1] \times [0, 1] \times [0, 1]

with

U^{T T}

-rank 1, and

κ (x, y, z) = 1 + cos (π (x + y)) cos (π z)

. The loading term

f (x, y, z)

and the boundary conditions are computed in a way to enforce the manufactured solution. We compute the solutions with three different solvers, namely, the full-grid solver, TT solver, and QTT solver.

The performance of the three solvers is shown in Table 2 and Figure 4, with the convergence analysis in the left panel, the computational cost in the middle panel, and the compression ratios of the TT and QTT formats in the right panel. Figure 4 (left) illustrates that both the TT and QTT formats maintain the same level of accuracy, with second-order convergence, as the full-grid solver up to a grid size of 33 elements per dimension. Beyond this point, the full-grid solver requires more memory than is available. Both the TT and QTT solvers were able to run simulations on finer grids while successfully maintaining the convergence rate.

Figure 4 (middle) illustrates the computational time required by all three solvers. As expected, the full-grid solver is the most computationally expensive, with much steeper scaling compared to the TT and QTT solvers. When comparing the TT and QTT solvers, the TT solver is slightly faster for smaller grids, but begins to show steeper scaling compared to the QTT solver at grid sizes of 129 and 257 elements per dimension.

Figure 4 (right) displays the compression ratio of the diffusion operators in TT and QTT formats across different grid sizes. The compression ratio is defined as the ratio of the number of elements in the TT or QTT formats to the number of elements in the full tensor. The plot indicates that the QTT format becomes increasingly efficient compared to the TT format as the grid size grows. For smaller grids, the compression achieved by the QTT format is similar to that of the TT format, which explains why the QTT solver is slightly slower than the TT solver at these grid sizes. This is because, at similar levels of compression, the QTT format involves calculations with more dimensions, making it more computationally demanding than the TT algorithm. However, as the compression advantage of the QTT format becomes more pronounced (about 2 orders of magnitude better at the grid size of 129), its computational efficiency starts to outweigh that of the TT format, reflected in the reduced computational time.

Furthermore, the performance crossover between TT and QTT formats is primarily due to the scaling behavior of compression and arithmetic complexity. At small grid sizes, the TT format is faster because it involves fewer dimensions (four cores for space-time), resulting in less overhead during construction and computation. In contrast, the QTT format introduces more TT-cores due to binary decomposition, which increases setup costs at small scales. However, as the number of elements increases, QTT exhibits significantly better compression ratios because it captures hierarchical and structured patterns more effectively, especially for operators with regular banded structures (e.g., stiffness matrices). This leads to much lower TT-ranks in QTT format for large problems, reducing both memory and computational costs. Furthermore, QTT operations scale logarithmically with the problem size compared to polynomial scaling in standard TT. As a result, at larger scales, the benefits of deeper compression and lower ranks outweigh the initial overhead, making QTT more efficient than TT.
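The compression ratio defined above can be evaluated from the core sizes alone. In the sketch below, the TT-ranks $[2, 2]$ are those reported in Section 8.1 for $κ = 1$, while the uniform QTT rank of 10 is a purely hypothetical placeholder chosen to show the bookkeeping; the function names are also assumptions.

```python
def tt_storage(mode_shapes, ranks):
    """Number of stored entries for TT-matrix cores of shape (r_prev, m, n, r_next)."""
    bond = [1] + list(ranks) + [1]
    return sum(bond[k] * m * n * bond[k + 1] for k, (m, n) in enumerate(mode_shapes))

def compression_ratio(mode_shapes, ranks):
    """Stored entries in TT/QTT format divided by entries of the full operator."""
    full = 1
    for m, n in mode_shapes:
        full *= m * n
    return tt_storage(mode_shapes, ranks) / full

L = 7
n = 2 ** L                         # nodes per dimension

# TT format: three cores of mode size n x n, ranks [2, 2] as in Section 8.1.
tt_ratio = compression_ratio([(n, n)] * 3, [2, 2])

# QTT format: 3 * L cores of mode size 2 x 2; the rank below is hypothetical.
assumed_qtt_rank = 10
qtt_ratio = compression_ratio([(2, 2)] * (3 * L), [assumed_qtt_rank] * (3 * L - 1))
```

The TT storage grows like $n^{2}$ per core, while the QTT storage grows with the number of cores, which is proportional to $log_{2} n$ when the ranks stay bounded.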

8.3. Three-Dimensional CDR Equation

In this experiment, we investigate the performance of TT and QTT solvers in solving a 3D convection–diffusion–reaction equation with inhomogeneous boundary conditions.

\begin{matrix} \frac{\partial u}{\partial t} - \nabla \cdot (κ (t, x) \nabla u) + b (t, x) \cdot \nabla u + c (t, x) u & = f (t, x) in Ω \times [0, T], \\ u & = g (t, x) on \partial Ω \times [0, T], \\ u (t = 0, x) & = u_{0} (x) in Ω, \end{matrix}

(84)

where

κ (t, x) = 1 + cos (π x) cos (π y) cos (π z)

,

b (x) = [- 2 x / 3, y / 3, z / 3]

, and

c (t, x) = e^{- (x + y + z)}

. Here, it is clear that the convective field satisfies the divergence-free condition, i.e.,

\nabla \cdot b (x) = 0

, and therefore,

μ (t, x) > 0

(see (6)). The manufactured solution is

u (t, x, y, z) = sin (π (t + x + y + z))

with

U^{T T}

-rank 2 and the space-time computational domain is

{[0, 1]}^{4}

.

The performance of three solvers (full grid, TT, and QTT) is presented in Table 3 and Figure 5, with convergence shown in the left panel, computational cost in the middle panel, and compression ratios in the TT and QTT formats displayed in the right panel.

Figure 5 (left) demonstrates that both the TT and QTT formats maintain a similar level of accuracy as the full-grid solver, with second-order convergence, consistent with the findings from the first experiment. Up to a grid size of 17 elements per dimension, all solvers perform comparably. However, beyond this point, the full-grid solver becomes infeasible due to memory limitations. In contrast, both the TT and QTT solvers continue to perform on finer grids, preserving the convergence rate even at higher resolutions.

Figure 5 (middle) shows the computational time required by the three solvers. The full-grid solver is the most expensive, exhibiting significantly steeper scaling with increasing grid size. When comparing the TT and QTT solvers, they initially perform comparably on smaller grids. However, as the grid size increases to 257 and 513 elements per dimension, the QTT solver’s scaling becomes less steep, eventually overtaking the TT solver in computational efficiency.

Figure 5 (right) shows the compression ratios of the diffusion operators in the TT and QTT formats across different grid sizes. The results show that the QTT format becomes increasingly effective as the grid size increases. For example, at a grid size of 513 elements per dimension, the compression ratio of the QTT format is 4 orders of magnitude lower than that of the TT format.

Given that this is a 4D problem, the TT and QTT solvers show superior efficiency compared to the full-grid solver, which potentially allows for much faster and more accurate simulations at much higher resolutions. Furthermore, we highlight that the space-time scheme provides a better numerical approximation than the time-stepping scheme. Interested readers may refer to [12,54] for a comparison between the space-time scheme and the time-stepping scheme.

8.4. Three-Dimensional CDR Equation with a Nonlinear Loading Term

In this section, we present numerical results for a simplified semiconductor problem, where the drift-diffusion physics is replaced with a Poisson–Boltzmann equation approximation [55]. In this example, we benchmark the performance of the TT and QTT solvers on a three-dimensional nonlinear equation:

\begin{matrix} \frac{\partial u}{\partial t} - Δ u & = u - u^{3} + f (x) in Ω \times [0, T], \\ u & = 0 on \partial Ω, \\ u (t = 0, x) & = u_{0}, \end{matrix}

(85)

with the computational domain

{[0, 1]}^{4}

, and the manufactured solution,

u (t, x, y, z) = sin (π x) sin (π y) sin (π z) sin (π t) + sin (2 π x) sin (2 π y) sin (2 π z) sin (2 π t),

with

U^{T T}

-rank 2. The output of the discretization process is a nonlinear equation in TT format, given as follows:

A^{T T} U^{T T} = M^{T T} (U^{T T} - {(U^{T T})}^{3} + F^{T T}),

where

A^{T T} U^{T T} = T_{1}^{T T} + T_{2}^{T T}

and

M^{T T} = M_{I_{T}} (I_{t}, I_{t}) \circ M_{Ω_{Z}} (I_{s}, I_{s}) \circ M_{Ω_{Y}} (I_{s}, I_{s}) \circ M_{Ω_{X}} (I_{s}, I_{s})

. The QTT format of this equation can be achieved by using the process described above in Section 6.3.

The nonlinear TT/QTT equation is solved using the step truncation TT-Newton method developed in [22] (Algorithm 1), with the loss function defined as follows:

L (U^{T T}) = A^{T T} U^{T T} - M^{T T} (U^{T T} - {(U^{T T})}^{3} + F^{T T}) .

The Jacobian of this matrix function is explicitly computable. Figure 6 compares three solvers for this nonlinear problem, showing convergence (left panel), the computational cost (middle panel), and the number of Newton iterations (right panel). The data is shown in Table 4. The diffusion operator compression for the TT and QTT solvers is not shown, as it is similar to that in the third experiment.
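The structure of the loss function and its Jacobian can be seen in a small dense analogue. The sketch below is a plain Newton iteration on a one-dimensional steady problem with linear elements; it is not the step truncation TT-Newton method of [22], and the grid size, reference solution, and stopping rule are illustrative assumptions.

```python
import numpy as np

n = 31                                         # interior nodes of a 1D grid on [0, 1]
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)
off = np.ones(n - 1)
A = (2.0 * np.eye(n) - np.diag(off, 1) - np.diag(off, -1)) / h          # stiffness
M = h / 6.0 * (4.0 * np.eye(n) + np.diag(off, 1) + np.diag(off, -1))    # mass

U_ref = np.sin(np.pi * x)
# Choose F so that U_ref solves the discrete equation exactly.
F = np.linalg.solve(M, A @ U_ref) - U_ref + U_ref ** 3

def residual(U):
    """Loss L(U) = A U - M (U - U^3 + F), with the cube taken entrywise."""
    return A @ U - M @ (U - U ** 3 + F)

def jacobian(U):
    """Derivative of the loss: A - M (I - 3 diag(U^2))."""
    return A - M @ (np.eye(n) - 3.0 * np.diag(U ** 2))

U = np.zeros(n)
iterations = 0
while np.linalg.norm(residual(U)) > 1e-12 and iterations < 50:
    U = U - np.linalg.solve(jacobian(U), residual(U))
    iterations += 1
```

In the TT setting each of these operations (entrywise powers, matrix–vector products, linear solves) would be followed by a rank truncation, which is where the truncation tolerance enters the iteration.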

Figure 6 (left) demonstrates that both TT and QTT maintain accuracy comparable to the full-grid solver up to 17 elements per dimension, beyond which the full-grid solver becomes memory-infeasible. As expected, the TT and QTT solvers efficiently handle finer grids and preserve second-order convergence rates.

Figure 6 (middle) presents the computational time, where the full-grid solver is the most expensive. The TT solver is initially faster for smaller grids, but QTT shows better scaling at larger grid sizes (257 and 513 elements per dimension).

However, as shown in Figure 6 (right), at larger grid sizes (257 and 513), the QTT solver requires a greater number of Newton iterations to converge. For example, at a grid size of 513, QTT requires 8 iterations compared to 4 iterations for the TT solver. This results in QTT being twice as fast in terms of computational time per Newton iteration at this grid size. The results suggest that QTT and TT solvers may take distinct paths to convergence, which is intriguing and likely linked to the choice of truncation errors in these formats. This relationship is particularly important, as prior research has demonstrated that the selection of TT truncation errors significantly influences both the efficiency and accuracy of TT-format PDE solvers [22,56,57,58,59]. This observed difference in solver behavior certainly deserves a more detailed investigation and thorough analysis.

9. Conclusions

In conclusion, we developed a Galerkin TT/QTT space-time formulation specifically tailored to the finite element discretization of convection–diffusion–reaction equations. This approach includes a detailed construction of finite element discrete operators in both TT and QTT formats, accommodating general coefficient functions. By leveraging the QTT format, we have effectively compressed the discrete operators while addressing the banded structure, leading to enhanced computational efficiency. Our numerical experiments demonstrate that our mixed approach not only achieves significant speedup but also significantly reduces memory usage, enabling high-resolution simulations to be conducted in a much shorter time. These advancements highlight the potential of our approach to tackle more complex and large-scale problems efficiently. Moving forward, future work will focus on extending this framework to address nonlinear partial differential equations and adapting it for application to more complex domains, thereby broadening the scope and impact of this method in computational science and engineering.

We also plan to develop a discontinuous Petrov–Galerkin (DPG) formulation—of the type introduced by Demkowicz and Gopalakrishnan [60,61]—by employing the so-called mixed formulation of DPG (see Roberts et al. [62], Appendix A.5.1). The mixed formulation will enable the application of TT formats by preserving the tensor-product structure of the basis functions; the mixed formulation avoids explicit computation of DPG’s optimal test functions and computes an explicit error-representation function, which can be used to evaluate the accuracy of the solution. In the TT context, the latter could enable an iterative refinement approach, whereby error estimates are used to determine the placement of refined elements in the appropriate 1D mesh components. DPG has previously been applied to steady convection–diffusion–reaction problems [8], and to space-time convection [63] and convection–reaction problems [64]. A DPG-based TT formulation of space-time convection–diffusion–reaction can be expected to provide excellent stability properties, even on coarse meshes and for challenging problems.

Author Contributions

Conceptualization, D.A., D.P.T., R.V., S.D., D.D., N.V.R., K.Ø.R. and B.S.A.; methodology, D.A., D.P.T., R.V., S.D., D.D., N.V.R., K.Ø.R. and B.S.A.; writing—original draft, D.A., D.P.T., R.V., S.D., D.D., N.V.R., K.Ø.R. and B.S.A.; writing—review and editing, D.A., D.P.T., R.V., S.D., D.D., N.V.R., K.Ø.R. and B.S.A.; supervision, K.Ø.R. and B.S.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Laboratory Directed Research and Development (LDRD) project number 20230067DR.

Data Availability Statement

The data that support the findings of this research are available from the corresponding author upon reasonable request.

Acknowledgments

The authors gratefully acknowledge the support of the Laboratory Directed Research and Development (LDRD) program of Los Alamos National Laboratory under project number 20230067DR. Los Alamos National Laboratory is operated by Triad National Security, LLC, for the National Nuclear Security Administration of the U.S. Department of Energy (contract no. 89233218CNA000001). The authors gratefully acknowledge the support of the Laboratory Directed Research and Development (LDRD) program of Sandia. Sandia National Laboratories is a multimission laboratory managed and operated by National Technology & Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525. This paper describes objective technical results and analysis. Any subjective views or opinions that might be expressed in the paper do not necessarily represent the views of the U.S. Department of Energy or the United States Government. This article has been authored by an employee of National Technology & Engineering Solutions of Sandia, LLC under Contract No. DE-NA0003525 with the U.S. Department of Energy (DOE). The employee owns all right, title and interest in and to the article and is solely responsible for its contents. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this article or allow others to do so, for United States Government purposes. The DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (https://www.energy.gov/downloads/doe-public-access-plan, accessed on 8 July 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Notations and Definitions

Appendix A.1. Notation Table

Table A1. Summary of notation.

Symbol	Description
$Ω$	Spatial domain in $R^{3}$
$[0, T]$ or $I_{T}$	Time interval
$Ω_{T} = I_{T} \times Ω$	Space-time domain
U	Trial space: $L^{2} (0, T; H_{0}^{1} (Ω)) \cap H^{1} (0, T; H^{- 1} (Ω))$
V	Test space: $L^{2} (0, T; H_{0}^{1} (Ω))$
$T_{h}$	Discrete projection operator
$ϕ_{i} (t, x)$	Nodal basis functions
`TT`	Tensor-Train format
`QTT`	Quantized Tensor-Train format
$G_{k}$	TT-core tensor in TT decomposition
$r_{k}$	TT-rank between the k-th and $(k + 1)$ -th TT-core
$∥ \cdot ∥$	$L^{2}$ norm
$U^{T T}$	TT Tensor format of solution
$U^{Q T T}$	QTT Tensor format of solution
$L_{h}$	Lagrange interpolation operator
$L$	Loss function in TT-Newton method
$\frac{\partial u}{\partial t}$	Partial derivative of u with respect to time t

Appendix A.2. Kronecker Product

The Kronecker product ⨂ of matrix

A = (a_{i j}) \in R^{m_{A} \times n_{A}}

and matrix

B = (b_{i j}) \in R^{m_{B} \times n_{B}}

is the matrix

A \otimes B

of size

N_{A \otimes B} = (m_{A} m_{B}) \times (n_{A} n_{B})

, defined as follows:

A \otimes B = [\begin{matrix} a_{11} B & a_{12} B & \dots & a_{1 n_{A}} B \\ a_{21} B & a_{22} B & \dots & a_{2 n_{A}} B \\ ⋮ & ⋮ & ⋱ & ⋮ \\ a_{m_{A} 1} B & a_{m_{A} 2} B & \dots & a_{m_{A} n_{A}} B \end{matrix}] .

(A1)

Equivalently, it holds that

{(A \otimes B)}_{i j} = a_{i_{A} j_{A}} b_{i_{B} j_{B}}

, where

i = i_{B} + (i_{A} - 1) m_{B}

,

j = j_{B} + (j_{A} - 1) n_{B}

, with

i_{A} = 1, \dots, m_{A}

,

j_{A} = 1, \dots, n_{A}

,

i_{B} = 1, \dots, m_{B}

, and

j_{B} = 1, \dots, n_{B}

.
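The entrywise description of the Kronecker product can be checked against `numpy.kron`. In the sketch below the row index advances in steps of $m_{B}$ and the column index in steps of $n_{B}$, the number of rows and columns of $B$; a non-square $B$ is chosen so that the two strides differ. The sizes and random matrices are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
mA, nA, mB, nB = 2, 3, 4, 5
A = rng.standard_normal((mA, nA))
B = rng.standard_normal((mB, nB))
K = np.kron(A, B)

def entry_matches(iA, jA, iB, jB):
    # Indices are 1-based as in the text; subtract 1 when indexing NumPy arrays.
    i = iB + (iA - 1) * mB
    j = jB + (jA - 1) * nB
    return np.isclose(K[i - 1, j - 1], A[iA - 1, jA - 1] * B[iB - 1, jB - 1])

index_formula_holds = all(
    entry_matches(iA, jA, iB, jB)
    for iA in range(1, mA + 1) for jA in range(1, nA + 1)
    for iB in range(1, mB + 1) for jB in range(1, nB + 1)
)
```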

Appendix A.3. The Tensor Product

The tensor product of matrix

A = (a_{i j}) \in R^{m_{A} \times n_{A}}

and matrix

B = (b_{k l}) \in R^{m_{B} \times n_{B}}

produces the four-dimensional tensor of size

N_{A \circ B} = m_{A} \times n_{A} \times m_{B} \times n_{B}

, with the following elements:

{(A \circ B)}_{i j k l} = a_{i j} b_{k l},

(A2)

for

i = 1, 2, \dots, m_{A}

,

j = 1, 2, \dots, n_{A}

,

k = 1, 2, \dots, m_{B}

,

l = 1, 2, \dots, n_{B}

.
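As a small illustration of (A2), the sketch below forms the four-dimensional tensor with `np.einsum` and shows that regrouping its indices recovers the Kronecker product of Appendix A.2. The sizes and variable names are arbitrary choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 3))   # m_A x n_A
B = rng.standard_normal((4, 5))   # m_B x n_B

# Four-dimensional tensor with entries a_ij * b_kl, as in (A2).
T = np.einsum("ij,kl->ijkl", A, B)          # shape (2, 3, 4, 5)

# Grouping the row indices (i, k) and the column indices (j, l)
# turns the tensor into the Kronecker product matrix.
K = T.transpose(0, 2, 1, 3).reshape(2 * 4, 3 * 5)
same_as_kron = np.allclose(K, np.kron(A, B))
```

The two products therefore carry the same entries and differ only in how the indices are arranged.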

Appendix B. TT Format Construction of the Boundary Term F^{bd,TT}

The boundary term

F^{b d, T T}

is computed as

F^{b d, T T} = A^{m a p, T T} G^{b d, T T}

where

A^{m a p, T T} = A_{t}^{T T} + A_{d}^{T T} + A_{c}^{T T} + A_{r}^{T T} .

Next, we will show how to construct each component

A_{t}^{T T}, A_{d}^{T T}, A_{c}^{T T}, A_{r}^{T T} .

We redefine the sets of indices

I_{t} = 2 : N + 1

and

I_{s} = 2 : N

where N is the number of elements per dimension. Then the TT format of each component is constructed as follows:

A_{t}^{T T} : = D_{I_{T}} (I_{t}, :) \circ M_{Ω_{Z}} (I_{s}, :) \circ M_{Ω_{Y}} (I_{s}, :) \circ M_{Ω_{X}} (I_{s}, :) .

(A3)

\begin{matrix} A_{d}^{T T} : = \sum_{m_{1}, m_{2}, m_{3}, m_{4} = 0}^{1} [ & M_{t_{m_{4}}}^{g} (I_{t}, :) \circ S_{z_{m_{3}}}^{g} (I_{s}, :) \circ M_{y_{m_{2}}}^{g} (I_{s}, :) \circ M_{x_{m_{1}}}^{g} (I_{s}, :) \\ + & M_{t_{m_{4}}}^{g} (I_{t}, :) \circ M_{z_{m_{3}}}^{g} (I_{s}, :) \circ S_{y_{m_{2}}}^{g} (I_{s}, :) \circ M_{x_{m_{1}}}^{g} (I_{s}, :) \\ + & M_{t_{m_{4}}}^{g} (I_{t}, :) \circ M_{z_{m_{3}}}^{g} (I_{s}, :) \circ M_{y_{m_{2}}}^{g} (I_{s}, :) \circ S_{x_{m_{1}}}^{g} (I_{s}, :)] . \end{matrix}

(A4)

\begin{matrix} A_{c}^{T T} = \sum_{m_{1}, m_{2}, m_{3}, m_{4} = 0}^{1} [ & M_{t_{m_{4}}}^{g} (I_{t}, :) \circ M_{z_{m_{3}}}^{g} (I_{s}, :) \circ M_{y_{m_{2}}}^{g} (I_{s}, :) \circ D_{x_{m_{1}}}^{g} (I_{s}, :) \\ + & M_{t_{m_{4}}}^{g} (I_{t}, :) \circ M_{z_{m_{3}}}^{g} (I_{s}, :) \circ D_{y_{m_{2}}}^{g} (I_{s}, :) \circ M_{x_{m_{1}}}^{g} (I_{s}, :) \\ + & M_{t_{m_{4}}}^{g} (I_{t}, :) \circ D_{z_{m_{3}}}^{g} (I_{s}, :) \circ M_{y_{m_{2}}}^{g} (I_{s}, :) \circ M_{x_{m_{1}}}^{g} (I_{s}, :)] . \end{matrix}

(A5)

A_{r}^{T T} : = \sum_{m_{1}, m_{2}, m_{3}, m_{4} = 0}^{1} M_{t_{m_{4}}}^{g} (I_{t}, :) \circ M_{z_{m_{3}}}^{g} (I_{s}, :) \circ M_{y_{m_{2}}}^{g} (I_{s}, :) \circ M_{x_{m_{1}}}^{g} (I_{s}, :) .

(A6)

A^{m a p, T T} = A_{t}^{T T} + A_{d}^{T T} + A_{c}^{T T} + A_{r}^{T T} .

(A7)

Next, we describe the construction of the tensor

G^{b d, T T}

. The full grid tensor

G^{b d}

is a

[(N + 1) \times (N + 1) \times (N + 1) \times (N + 1)]

tensor, where only the initial and boundary elements are nonzero, with all other elements being zeros. The tensor

G^{b d, T T}

in TT format is then constructed using the cross-interpolation technique.

Finally, the TT format of the boundary term is computed as

F^{b d, T T} = A^{m a p, T T} G^{b d, T T}

.
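The role of the row-restricted factors can be illustrated with dense arrays in one time and one space dimension. The sketch below is only an analogue of the term (A3) under stated assumptions: it uses full matrices instead of TT cores, a uniform mesh on the unit interval, standard piecewise-linear element matrices, and a made-up data function `g`. The helper names `mass_1d` and `deriv_1d` are hypothetical.

```python
import numpy as np

def mass_1d(N, h):
    """1D mass matrix for piecewise-linear elements on N uniform cells."""
    M = np.zeros((N + 1, N + 1))
    for e in range(N):
        M[e:e + 2, e:e + 2] += h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
    return M

def deriv_1d(N):
    """1D matrix with entries int(phi_j' * phi_i) on N uniform cells."""
    D = np.zeros((N + 1, N + 1))
    for e in range(N):
        D[e:e + 2, e:e + 2] += 0.5 * np.array([[-1.0, 1.0], [-1.0, 1.0]])
    return D

def g(t, x):
    """Hypothetical initial/boundary data, used only for this example."""
    return np.cos(t) * (1.0 + x)

N = 8
h = 1.0 / N
nodes = np.linspace(0.0, 1.0, N + 1)
I_t = np.arange(1, N + 1)   # 0-based counterpart of 2 : N + 1
I_s = np.arange(1, N)       # 0-based counterpart of 2 : N

# Full-grid array: nonzero only on the initial slice and the spatial boundary.
G_bd = np.zeros((N + 1, N + 1))          # axis 0 = time, axis 1 = space
G_bd[0, :] = g(nodes[0], nodes)
G_bd[:, 0] = g(nodes, nodes[0])
G_bd[:, -1] = g(nodes, nodes[-1])

# Keep only the rows of the unknown nodes, but all columns.
D_rows = deriv_1d(N)[I_t, :]
M_rows = mass_1d(N, h)[I_s, :]

# Mode-wise application of the two factors ...
F_modes = D_rows @ G_bd @ M_rows.T
# ... agrees with the explicit Kronecker-product operator acting on vec(G_bd).
F_kron = (np.kron(D_rows, M_rows) @ G_bd.reshape(-1)).reshape(len(I_t), len(I_s))
```

Applying the factors mode by mode avoids forming the large Kronecker matrix, which is the same idea the TT format exploits core by core.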

Appendix C. QTT Format Construction

T_{1}^{Q T T} : = (D_{I_{T}}^{Q T T} (I_{t}, I_{t}) \circ M_{Ω_{Z}}^{Q T T} (I_{s}, I_{s}) \circ M_{Ω_{Y}}^{Q T T} (I_{s}, I_{s}) \circ M_{Ω_{X}}^{Q T T} (I_{s}, I_{s})) U^{Q T T},

(A8)

\begin{matrix} T_{2}^{Q T T} : = ( & \sum_{m_{1}, m_{2}, m_{3}, m_{4} = 0}^{1} [M_{t_{m_{4}}}^{g, Q T T} (I_{t}, I_{t}) \circ S_{z_{m_{3}}}^{g, Q T T} (I_{s}, I_{s}) \circ M_{y_{m_{2}}}^{g, Q T T} (I_{s}, I_{s}) \circ M_{x_{m_{1}}}^{g, Q T T} (I_{s}, I_{s}) \\ + & M_{t_{m_{4}}}^{g, Q T T} (I_{t}, I_{t}) \circ M_{z_{m_{3}}}^{g, Q T T} (I_{s}, I_{s}) \circ S_{y_{m_{2}}}^{g, Q T T} (I_{s}, I_{s}) \circ M_{x_{m_{1}}}^{g, Q T T} (I_{s}, I_{s}) \\ + & M_{t_{m_{4}}}^{g, Q T T} (I_{t}, I_{t}) \circ M_{z_{m_{3}}}^{g, Q T T} (I_{s}, I_{s}) \circ M_{y_{m_{2}}}^{g, Q T T} (I_{s}, I_{s}) \circ S_{x_{m_{1}}}^{g, Q T T} (I_{s}, I_{s})]) U^{Q T T} . \end{matrix}

(A9)

\begin{matrix} T_{3}^{Q T T} = ( & \sum_{m_{1}, m_{2}, m_{3}, m_{4} = 0}^{1} [M_{t_{m_{4}}}^{g, Q T T} (I_{t}, I_{t}) \circ M_{z_{m_{3}}}^{g, Q T T} (I_{s}, I_{s}) \circ M_{y_{m_{2}}}^{g, Q T T} (I_{s}, I_{s}) \circ D_{x_{m_{1}}}^{g, Q T T} (I_{s}, I_{s}) \\ + & M_{t_{m_{4}}}^{g, Q T T} (I_{t}, I_{t}) \circ M_{z_{m_{3}}}^{g, Q T T} (I_{s}, I_{s}) \circ D_{y_{m_{2}}}^{g, Q T T} (I_{s}, I_{s}) \circ M_{x_{m_{1}}}^{g, Q T T} (I_{s}, I_{s}) \\ + & M_{t_{m_{4}}}^{g, Q T T} (I_{t}, I_{t}) \circ D_{z_{m_{3}}}^{g, Q T T} (I_{s}, I_{s}) \circ M_{y_{m_{2}}}^{g, Q T T} (I_{s}, I_{s}) \circ M_{x_{m_{1}}}^{g, Q T T} (I_{s}, I_{s})]) U^{Q T T}, \end{matrix}

(A10)

\begin{matrix} T_{4}^{Q T T} : = \sum_{m_{1}, m_{2}, m_{3}, m_{4} = 0}^{1} ( & M_{t_{m_{4}}}^{g, Q T T} (I_{t}, I_{t}) \circ M_{z_{m_{3}}}^{g, Q T T} (I_{s}, I_{s}) \\ & \circ M_{y_{m_{2}}}^{g, Q T T} (I_{s}, I_{s}) \circ M_{x_{m_{1}}}^{g, Q T T} (I_{s}, I_{s})) U^{Q T T} . \end{matrix}

(A11)

T_{5}^{Q T T} : = (M_{I_{T}}^{Q T T} (I_{t}, I_{t}) \circ M_{Ω_{Z}}^{Q T T} (I_{s}, I_{s}) \circ M_{Ω_{Y}}^{Q T T} (I_{s}, I_{s}) \circ M_{Ω_{X}}^{Q T T} (I_{s}, I_{s})) F^{Q T T} .

(A12)

The QTT format of the loading term

F^{Q T T}

is obtained by first approximating

F^{T T}

with the cross-interpolation algorithm and then converting

F^{T T}

into

F^{Q T T}

. The QTT format of the boundary term

F^{b d, Q T T}

can be constructed similarly, following Appendix B.
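To make the quantization step concrete, the sketch below reshapes a vector of 2^L grid values into an L-dimensional array with mode size 2 and compresses it with a plain TT-SVD. This is a one-dimensional illustration written for this appendix, not the authors' implementation and not the TT-Toolbox API; the function names `tt_svd` and `tt_to_full`, the tolerance, and the test functions are assumptions.

```python
import numpy as np

def tt_svd(tensor, rel_tol=1e-10):
    """Plain TT-SVD; returns cores of shape (r_prev, n_k, r_next)."""
    shape = tensor.shape
    cores, r_prev = [], 1
    mat = np.asarray(tensor, dtype=float)
    for n_k in shape[:-1]:
        mat = mat.reshape(r_prev * n_k, -1)
        U, s, Vt = np.linalg.svd(mat, full_matrices=False)
        r = max(1, int(np.sum(s > rel_tol * s[0])))   # truncated rank
        cores.append(U[:, :r].reshape(r_prev, n_k, r))
        mat = s[:r, None] * Vt[:r, :]                 # remainder passed to the next step
        r_prev = r
    cores.append(mat.reshape(r_prev, shape[-1], 1))
    return cores

def tt_to_full(cores):
    """Contract the cores back into the full array."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.reshape([c.shape[1] for c in cores])

L = 8                                   # 2**L grid points on [0, 1)
x = np.arange(2 ** L) / 2 ** L
ranks, errors = {}, {}
for name, f in [("exp", np.exp), ("sin", np.sin)]:
    v = f(x)
    q = v.reshape((2,) * L)             # quantization: one binary index per level
    cores = tt_svd(q)
    ranks[name] = [c.shape[2] for c in cores[:-1]]
    errors[name] = np.max(np.abs(tt_to_full(cores).reshape(-1) - v))
```

The exponential factorizes exactly over the binary digits of the grid index, so its QTT ranks are 1, while the sine needs rank 2 by the angle-addition formula. Low ranks of this kind are what make the QTT format attractive for smooth data.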

Appendix D. Discrete Inf-Sup Condition

This appendix establishes the discrete inf-sup condition for the finite element discretization of the time-dependent CDR equation. Namely, we establish the mesh-independence result of Theorem 3. Recall that

V : = L^{2} (0, T; H_{0}^{1} (Ω))

,

U : = V \cap H^{1} (0, T; H^{- 1} (Ω))

, and

V_{h}

,

U_{h}

are the finite-dimensional subspaces specified by the finite elements. The goal is to show that the following discrete stability condition holds:

C ∥ u_{h} ∥_{U_{h}} \leq sup_{0 \neq v_{h} \in V_{h}} \frac{D_{h} (u_{h}, v_{h})}{∥ v_{h} ∥_{V}} \forall u_{h} \in U_{h},

(A13)

where C is a positive constant that depends on the regularity of

κ (t, x)

,

b (t, x)

, and

c (t, x)

, but independent of mesh size h. As in the text, we define

\begin{matrix} B (z, v) & : = 〈 κ (t, x) \nabla z, \nabla v 〉 + 〈 b (t, x) \cdot \nabla z, v 〉 + 〈 c (t, x) z, v 〉, \\ 〈 ϕ, v 〉 & : = \int_{0}^{T} \int_{Ω} ϕ v \end{matrix}

(A14)

for

v, z \in L^{2} (0, T; H_{0}^{1} (Ω))

and

ϕ \in L^{2} (0, T; H^{- 1} (Ω))

. We assume the coefficients

κ, b

, and c satisfy the conditions ensuring coercivity and continuity. In the continuous case, the coercivity constant

C_{*}

can be expressed as follows:

\begin{matrix} C_{*} = min {κ_{*}, μ_{0}} . \end{matrix}

(A15)

Indeed, using the identity

\int_{Ω} (b (t, x) \cdot \nabla w) w = - \frac{1}{2} \int_{Ω} (\nabla \cdot b) w^{2}

, we can write the following:

\begin{matrix} B (w, w) & = \int_{0}^{T} \int_{Ω} κ (t, x) \nabla w \cdot \nabla w + \int_{0}^{T} \int_{Ω} [(b (t, x) \cdot \nabla w) w + c (t, x) w^{2}] \\ & = \int_{0}^{T} \int_{Ω} κ (t, x) \nabla w \cdot \nabla w + \int_{0}^{T} \int_{Ω} μ (t, x) w^{2} \\ & \geq κ_{*} {∥ \nabla w ∥}_{L^{2} (0, T; L^{2} (Ω))}^{2} + μ_{0} {∥ w ∥}_{L^{2} (0, T; L^{2} (Ω))}^{2} \\ & \geq min {κ_{*}, μ_{0}} ({∥ \nabla w ∥}_{L^{2} (0, T; L^{2} (Ω))}^{2} + {∥ w ∥}_{L^{2} (0, T; L^{2} (Ω))}^{2}) \\ & \equiv min {κ_{*}, μ_{0}} {∥ w ∥}_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2} . \end{matrix}

(A16)

The continuous variational problem

B (z, v) = 〈 ϕ, v 〉 \forall v \in L^{2} (0, T; H_{0}^{1} (Ω))

is then well defined. The projection operator

T : L^{2} (0, T; H^{- 1} (Ω)) \to L^{2} (0, T; H_{0}^{1} (Ω))

is defined by

T (ϕ) = z

, where

ϕ

and z satisfy,

B (z, v) = 〈 ϕ, v 〉 \forall v \in L^{2} (0, T; H_{0}^{1} (Ω)) .

(A17)

By assumption of coercivity and continuity,

T

is well defined and bounded by the Lax–Milgram theorem [5].

Appendix D.1. Construction of Approximate Projection Operator

For

V_{h} \subset L^{2} (0, T; H_{0}^{1} (Ω))

, recall

B_{h} (z_{h}, v_{h})

is defined as

\begin{matrix} B_{h} (z_{h}, v_{h}) & : = 〈 L_{h} (κ (t, x)) \nabla z_{h}, \nabla v_{h} 〉 + 〈 L_{h} (b (t, x)) \cdot \nabla z_{h}, v_{h} 〉 + 〈 L_{h} (c (t, x)) z_{h}, v_{h} 〉, \end{matrix}

(A18)

where

L_{h}

is the Lagrange interpolation operator defined in (16). We also recall Definition (20)

D_{h} (u_{h}, v_{h}) : = 〈\frac{\partial u_{h}}{\partial t}, v_{h}〉 + B_{h} (u_{h}, v_{h}) .

(A19)

We define the approximate projection operator,

T_{h} : L^{2} (0, T; H^{- 1} (Ω)) \to V_{h}

[65] by

T_{h} ϕ = z_{h} \in V_{h}

, where

ϕ

and

z_{h}

satisfy the following:

B_{h} (z_{h}, v_{h}) = 〈 ϕ, v_{h} 〉 \forall v_{h} \in V_{h} .

(A20)

To show that the operator defined by (A20) is well defined, we first prove that

B_{h}

is coercive.

Lemma A1.

The discrete bilinear form,

B_{h} (z_{h}, v_{h})

is coercive, that is, there exists a positive constant

\hat{C} > 0

, such that

\hat{C} {∥ w_{h} ∥}_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2} \leq B_{h} (w_{h}, w_{h}) \forall w_{h} \in U_{h} .

(A21)

Moreover, this constant

\hat{C}

is independent of mesh size h for all sufficiently small h.

Proof.

The proof will follow the set of steps outlined in Equation (A16). Notice that we have

\begin{matrix} B_{h} (w_{h}, w_{h}) & : = \int_{0}^{T} \int_{Ω} L_{h} (κ (t, x)) \nabla w_{h} \cdot \nabla w_{h} + \int_{0}^{T} \int_{Ω} L_{h} (b (t, x)) \cdot \nabla w_{h} w_{h} + L_{h} (c (t, x)) w_{h}^{2} . \end{matrix}

(A22)

Previously, at the continuous level, integration by parts was applied to the convective term to write the whole expression in terms of

μ

and

κ

. In this discrete case, we instead need to perform some approximations. We write

\begin{matrix} B_{h} (w_{h}, w_{h}) & = \int_{Ω_{T}} L_{h} (κ (t, x)) \nabla w_{h} \cdot \nabla w_{h} + \int_{Ω_{T}} L_{h} (b (t, x)) \cdot \nabla w_{h} w_{h} + \int_{Ω_{T}} L_{h} (c (t, x)) w_{h}^{2} \\ = \int_{Ω_{T}} L_{h} (κ (t, x)) \nabla w_{h} \cdot \nabla w_{h} + \int_{Ω_{T}} L_{h} (b (t, x)) \cdot \nabla w_{h} w_{h} - b (t, x) \cdot \nabla w_{h} w_{h} \\ + \int_{Ω_{T}} L_{h} (c (t, x)) w_{h}^{2} - c (t, x) w_{h}^{2} + \int_{Ω_{T}} b (t, x) \cdot \nabla w_{h} w_{h} + c (t, x) w_{h}^{2} \\ = \int_{Ω_{T}} L_{h} (κ (t, x)) \nabla w_{h} \cdot \nabla w_{h} + \int_{Ω_{T}} L_{h} (b (t, x)) \cdot \nabla w_{h} w_{h} - b (t, x) \cdot \nabla w_{h} w_{h} \\ + \int_{Ω_{T}} L_{h} (c (t, x)) w_{h}^{2} - c (t, x) w_{h}^{2} + \int_{Ω_{T}} μ (t, x) w_{h}^{2} . \end{matrix}

(A23)

By assumption,

κ (t, x) \geq κ_{*}

and

μ (t, x) \geq μ_{0}

. Therefore,

L_{h} (κ (t, x)) \geq κ_{*}

. Hence, we get

\begin{matrix} B_{h} (w_{h}, w_{h}) & \geq C_{*} {∥ w_{h} ∥}_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2} + \int_{Ω_{T}} [L_{h} (b (t, x)) - b (t, x)] \cdot \nabla w_{h} w_{h} \\ + \int_{Ω_{T}} [L_{h} (c (t, x)) - c (t, x)] w_{h}^{2} . \end{matrix}

(A24)

We now provide bounds for the other two integral terms. To this end, note that applying the Cauchy–Schwarz inequality yields

\begin{matrix} |\int_{Ω_{T}} [L_{h} (b (t, x)) - b (t, x)] \cdot \nabla w_{h} w_{h}| \leq {∥ L_{h} (b) - b ∥}_{L^{\infty} (Ω_{T})} {∥ \nabla w_{h} ∥}_{L^{2} (Ω_{T})} {∥ w_{h} ∥}_{L^{2} (Ω_{T})} . \end{matrix}

(A25)

From the approximation property of

L_{h}

in

L^{\infty}

norm [66] (Corollary (4.4.24)), we get

\begin{matrix} |\int_{Ω_{T}} [L_{h} (b (t, x)) - b (t, x)] \cdot \nabla w_{h} w_{h}| & \leq C_{1} h ∥ \nabla w_{h} ∥_{L^{2} (Ω_{T})} {∥ w_{h} ∥}_{L^{2} (Ω_{T})} \\ \leq C_{1}^{'} h {∥ \nabla w_{h} ∥}_{L^{2} (Ω_{T})}^{2} . \end{matrix}

(A26)

where the second inequality comes from Poincaré’s inequality. Similarly, we observe that

\begin{matrix} |\int_{Ω_{T}} [L_{h} (c (t, x)) - c (t, x)] w_{h}^{2}| & \leq {∥ L_{h} (c) - c ∥}_{L^{\infty} (Ω_{T})} {∥ w_{h} ∥}_{L^{2} (Ω_{T})}^{2} \\ & \leq C_{2} h {∥ w_{h} ∥}_{L^{2} (Ω_{T})}^{2} . \end{matrix}

(A27)

Letting

C = max {C_{1}^{'}, C_{2}}

, we therefore have

\begin{matrix} B_{h} (w_{h}, w_{h}) & \geq C_{*} ∥ w_{h} ∥_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2} - C h (∥ \nabla w_{h} ∥_{L^{2} (Ω_{T})}^{2} + ∥ w_{h} ∥_{L^{2} (Ω_{T})}^{2}) \\ = C_{*} ∥ w_{h} ∥_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2} - C h {∥ w_{h} ∥}_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2} \\ = (C_{*} - C h) {∥ w_{h} ∥}_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2} \end{matrix}

(A28)

This holds for every h. Hence, for sufficiently small h, we can pick a constant

0 < \hat{C} < C_{*}

so that

B_{h} (w_{h}, w_{h}) \geq \hat{C} {∥ w_{h} ∥}_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2}

. This completes the proof. □
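The final step of the proof can be made explicit. As an illustration only (the threshold h_0 is introduced here and is not part of the original statement), restricting the mesh size as follows gives one admissible constant:

```latex
h \leq h_{0} := \frac{C_{*}}{2 C}
\quad \Longrightarrow \quad
C_{*} - C h \geq \frac{C_{*}}{2},
\qquad \text{so that one may take } \hat{C} = \frac{C_{*}}{2} .
```

This choice satisfies the requirement that the constant lies strictly between 0 and C_* and does not depend on h.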

Appendix D.2. Proof of Theorem 3

We now provide the proof of the inf-sup condition. First, we recall the mesh-dependent norm, in which

z_{h} : = T_{h} (\frac{\partial w_{h}}{\partial t})

as in (A31):

∥ w_{h} ∥_{U_{h}}^{2} : = ∥ w_{h} ∥_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2} + {∥ z_{h} ∥}_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2} .

(A29)

Theorem A1.

Let

U_{h} \subset U

, and

V_{h} \subset V

be discrete spaces satisfying

U_{h} \subset V_{h}

, and assume that (6) is satisfied. Then the following discrete stability condition holds:

C ∥ w_{h} ∥_{U_{h}} \leq sup_{0 \neq v_{h} \in V_{h}} \frac{D_{h} (w_{h}, v_{h})}{∥ v_{h} ∥_{V}} \forall w_{h} \in U_{h},

(A30)

where C is a positive constant that depends on the regularity of

κ (t, x)

,

b (t, x)

, and

c (t, x)

, but independent of mesh size h.

Proof.

For each

w_{h} \in U_{h}

, we use the projection operator,

T_{h}

, defined in (A20),

T_{h} (\frac{\partial w_{h}}{\partial t}) = z_{h} \in V_{h} .

(A31)

That is,

B_{h} (z_{h}, v_{h}) = 〈 \frac{\partial w_{h}}{\partial t}, v_{h} 〉

for all

v_{h}

. Since

U_{h} \subset V_{h}

, we conclude that

w_{h} + z_{h} \in V_{h}

. We will show that the test function

v_{h} = w_{h} + z_{h}

satisfies

C ∥ w_{h} ∥_{U_{h}} \leq \frac{D_{h} (w_{h}, v_{h})}{∥ v_{h} ∥_{V}}

, so that the inf-sup stability condition holds. Since

w_{h} (0, x) = 0

, it follows from Lemma A1 that

\begin{matrix} D_{h} (w_{h}, w_{h}) & = 〈 \frac{\partial w_{h}}{\partial t}, w_{h} 〉 + B_{h} (w_{h}, w_{h}) \\ & = \frac{1}{2} {∥ w_{h} (T) ∥}_{L^{2} (Ω)}^{2} + B_{h} (w_{h}, w_{h}) \\ & \geq \frac{1}{2} {∥ w_{h} (T) ∥}_{L^{2} (Ω)}^{2} + \hat{C} {∥ w_{h} ∥}_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2} . \end{matrix}

(A32)

Note that

B_{h} (\cdot, \cdot)

is bounded. Specifically for

w_{h}

and

z_{h}

we have

\begin{matrix} | B_{h} (w_{h}, z_{h}) | & = | 〈 L_{h} (κ (t, x)) \nabla w_{h}, \nabla z_{h} 〉 + 〈 L_{h} (b (t, x)) \cdot \nabla w_{h}, z_{h} 〉 + 〈 L_{h} (c (t, x)) w_{h}, z_{h} 〉 | \\ & \leq | \int_{0}^{T} \int_{Ω} L_{h} (κ (t, x)) \nabla w_{h} \cdot \nabla z_{h} | + | \int_{0}^{T} \int_{Ω} L_{h} (b (t, x)) \cdot \nabla w_{h} z_{h} | + | \int_{0}^{T} \int_{Ω} L_{h} (c (t, x)) w_{h} z_{h} | \\ & \leq κ^{*} {∥ \nabla w_{h} ∥}_{L^{2} (Ω_{T})} {∥ \nabla z_{h} ∥}_{L^{2} (Ω_{T})} + {∥ b ∥}_{\infty, Ω_{T}} {∥ \nabla w_{h} ∥}_{L^{2} (Ω_{T})} {∥ z_{h} ∥}_{L^{2} (Ω_{T})} \\ & \quad + {∥ c ∥}_{\infty, Ω_{T}} {∥ w_{h} ∥}_{L^{2} (Ω_{T})} {∥ z_{h} ∥}_{L^{2} (Ω_{T})} \quad \text{[Cauchy–Schwarz inequality]} \\ & \leq C max {κ^{*}, {∥ b ∥}_{\infty, Ω_{T}}, {∥ c ∥}_{\infty, Ω_{T}}} {∥ w_{h} ∥}_{L^{2} (0, T; H_{0}^{1} (Ω))} {∥ z_{h} ∥}_{L^{2} (0, T; H_{0}^{1} (Ω))} \quad \text{[boundedness of } L_{h} \text{]} . \end{matrix}

(A33)

For simplicity, we denote

\begin{matrix} C^{*} = C max {κ^{*}, {∥ b ∥}_{\infty, Ω_{T}}, {∥ c ∥}_{\infty, Ω_{T}}} . \end{matrix}

(A34)

Using the definition of

z_{h}

, the coercivity of

B_{h}

, as well as (A33), we derive that

\begin{matrix} D_{h} (w_{h}, z_{h}) & = 〈\frac{\partial w_{h}}{\partial t}, z_{h}〉 + B_{h} (w_{h}, z_{h}) \\ = B_{h} (z_{h}, z_{h}) + B_{h} (w_{h}, z_{h}) \\ \geq \hat{C} ∥ z_{h} ∥_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2} - C^{*} ∥ w_{h} ∥_{L^{2} (0, T; H_{0}^{1} (Ω))} {∥ z_{h} ∥}_{L^{2} (0, T; H_{0}^{1} (Ω))} . \end{matrix}

(A35)

Next, by using Young’s inequality (

a b \leq \frac{a^{2}}{2 ε} + \frac{ε b^{2}}{2}

), and using

a : = C^{*} {∥ w_{h} ∥}_{L^{2} (0, T; H_{0}^{1} (Ω))}

;

b : = ∥ z_{h} ∥_{L^{2} (0, T; H_{0}^{1} (Ω))}

; and

ε : = \hat{C}

, we have

C^{*} ∥ w_{h} ∥_{L^{2} (0, T; H_{0}^{1} (Ω))} ∥ z_{h} ∥_{L^{2} (0, T; H_{0}^{1} (Ω))} \leq \frac{\hat{C}}{2} ∥ z_{h} ∥_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2} + \frac{{C^{*}}^{2}}{2 \hat{C}} {∥ w_{h} ∥}_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2} .

(A36)

Hence, we have

\begin{matrix} D_{h} (w_{h}, z_{h}) & \geq \hat{C} {∥ z_{h} ∥}_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2} - (\frac{\hat{C}}{2} ∥ z_{h} ∥_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2} + \frac{{C^{*}}^{2}}{2 \hat{C}} {∥ w_{h} ∥}_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2}) . \end{matrix}

(A37)

Now, we can put it all together. Breaking up the bilinear form and using (A32) and (A37), we get

\begin{matrix} D_{h} (w_{h}, w_{h} + z_{h}) & = D_{h} (w_{h}, w_{h}) + D_{h} (w_{h}, z_{h}) \\ \geq \frac{1}{2} ∥ w_{h} {(T) ∥}_{L^{2} (Ω)}^{2} + \hat{C} {∥ w_{h} ∥}_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2} \\ + \frac{\hat{C}}{2} ∥ z_{h} ∥_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2} - \frac{{C^{*}}^{2}}{2 \hat{C}} {∥ w_{h} ∥}_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2} . \end{matrix}

Dropping the nonnegative term

\frac{1}{2} {∥ w_{h} (T) ∥}_{L^{2} (Ω)}^{2}

yields

\begin{matrix} D_{h} (w_{h}, w_{h} + z_{h}) \geq \frac{\hat{C}}{2} ∥ z_{h} ∥_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2} + \frac{(2 {\hat{C}}^{2} - {C^{*}}^{2})}{2 \hat{C}} {∥ w_{h} ∥}_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2} . \end{matrix}

(A38)

Furthermore, in view of the expressions for

C^{*}

and

\hat{C}

given by (A34) and by (A15) together with Lemma A1, we assume that the coefficients

κ (t, x), b (t, x)

, and

c (t, x)

are such that

2 {\hat{C}}^{2} - {C^{*}}^{2} > 0

. Under this assumption, we obtain

\begin{matrix} D_{h} (w_{h}, w_{h} + z_{h}) & \geq min {\frac{\hat{C}}{2}, \frac{(2 {\hat{C}}^{2} - {C^{*}}^{2})}{2 \hat{C}}} (∥ z_{h} ∥_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2} + {∥ w_{h} ∥}_{L^{2} (0, T; H_{0}^{1} (Ω))}^{2}) \\ \geq \frac{1}{\sqrt{2}} min {\frac{\hat{C}}{2}, \frac{(2 {\hat{C}}^{2} - {C^{*}}^{2})}{2 \hat{C}}} ∥ w_{h} ∥_{U_{h}} {∥ w_{h} + z_{h} ∥}_{L^{2} (0, T; H_{0}^{1} (Ω))} . \end{matrix}

(A39)

This completes the proof of the discrete inf-sup condition. □
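The second inequality in (A39) rests on an elementary estimate that the proof does not spell out. A short derivation, writing the unlabelled norm for the norm of L^2(0, T; H_0^1(Ω)) and using the definition (A29), reads:

```latex
\begin{aligned}
\| w_{h} + z_{h} \| &\leq \| w_{h} \| + \| z_{h} \|
  \leq \sqrt{2} \left( \| w_{h} \|^{2} + \| z_{h} \|^{2} \right)^{1/2}
  = \sqrt{2} \, \| w_{h} \|_{U_{h}} , \\
\| w_{h} \|^{2} + \| z_{h} \|^{2} &= \| w_{h} \|_{U_{h}}^{2}
  \geq \frac{1}{\sqrt{2}} \, \| w_{h} \|_{U_{h}} \, \| w_{h} + z_{h} \| .
\end{aligned}
```

Dividing (A39) by the norm of w_h + z_h, when it is nonzero, then yields (A30) with the test function v_h = w_h + z_h.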

In the above proof, the constant appearing in the discrete inf-sup condition is positive only for sufficiently small values of the mesh size h. This restriction is standard, since the analysis concerns the regime

h \to 0

. Thus, we conclude that our proposed scheme is discrete inf-sup stable and has a unique solution. In the discrete scheme, we interpolate the coefficients instead of evaluating them directly. This technique is well known in finite elements and virtual elements for the analysis of nonlinear PDEs [67,68], and is referred to as the finite element method/virtual element method with interpolated coefficients.

Appendix E. Convergence Analysis (Proof of Theorem 4)

In this section, we will estimate the error

∥ u - u_{h} ∥_{L^{2} (0, T; H_{0}^{1} (Ω))}

. To do that, we need the following result:

Lemma A2.

Let

u \in H^{2} (Ω_{T})

be a given function with a homogeneous boundary condition and

u (0, x) = 0

, and let

L_{h}

be the interpolation operator as defined in (17). Then the following estimate holds:

∥ u - L_{h} {u ∥}_{L^{2} (0, T; H_{0}^{1} (Ω))} \leq C h {| u |}_{H^{2} (Ω_{T})},

(A40)

where C is a positive constant independent of h.

Proof.

The proof follows from the discrete inf-sup condition and the discrete continuity property of the stationary part of (2). For a detailed proof, we refer to [20] (Theorem 4.4). □
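The first-order rate in (A40) can be observed numerically in a simplified setting. The sketch below is a one-dimensional analogue only: it measures the H^1-seminorm error of the piecewise-linear nodal interpolant of a smooth function on the unit interval. The test function, the mesh sizes, and the helper name `h1_interp_error` are assumptions made for this example.

```python
import numpy as np

def h1_interp_error(u, du, N, n_quad=5):
    """H^1-seminorm error of the piecewise-linear interpolant of u on [0, 1]."""
    nodes = np.linspace(0.0, 1.0, N + 1)
    xi, wi = np.polynomial.legendre.leggauss(n_quad)   # Gauss rule on [-1, 1]
    err_sq = 0.0
    for a, b in zip(nodes[:-1], nodes[1:]):
        x = 0.5 * (b - a) * xi + 0.5 * (a + b)          # quadrature points in [a, b]
        slope = (u(b) - u(a)) / (b - a)                 # derivative of the interpolant
        err_sq += 0.5 * (b - a) * np.sum(wi * (du(x) - slope) ** 2)
    return np.sqrt(err_sq)

def u(x):
    """Hypothetical smooth test function vanishing at both endpoints."""
    return np.sin(np.pi * x)

def du(x):
    """Exact derivative of u."""
    return np.pi * np.cos(np.pi * x)

Ns = [8, 16, 32, 64]
errs = [h1_interp_error(u, du, N) for N in Ns]
# Observed convergence order between consecutive refinements.
rates = [np.log2(errs[i] / errs[i + 1]) for i in range(len(errs) - 1)]
```

Halving the mesh size should roughly halve the error, so the entries of `rates` are expected to be close to 1, in line with the factor h in the estimate.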

Next, we start the main proof of Theorem 4.

Proof.

Let

p_{h} \in U_{h}

. Since

u_{h} - p_{h} \in U_{h}

, the discrete inf-sup condition (Theorem 3) gives

\begin{matrix} C ∥ u_{h} - p_{h} ∥_{U_{h}} & \leq sup_{0 \neq v_{h} \in V_{h}} \frac{D_{h} (u_{h} - p_{h}, v_{h})}{∥ v_{h} ∥_{L^{2} (0, T; H_{0}^{1} (Ω))}} . \end{matrix}

(A41)

By using (19), we write

\begin{matrix} D_{h} (u_{h} - p_{h}, v_{h}) & = D_{h} (u_{h}, v_{h}) - D_{h} (p_{h}, v_{h}) \\ & = F_{h} (v_{h}) - D_{h} (p_{h}, v_{h}) \quad \text{[using the equality in (19)]} \\ & = F_{h} (v_{h}) - F (v_{h}) + F (v_{h}) - D_{h} (p_{h}, v_{h}) \quad \text{[adding and subtracting } F (v_{h}) \text{]} \\ & = \underbrace{F_{h} (v_{h}) - F (v_{h})}_{G_{1}} + \underbrace{D (u, v_{h}) - D_{h} (p_{h}, v_{h})}_{G_{2}} \quad \text{[using } F (v_{h}) = D (u, v_{h}) \text{ from (2)]} . \end{matrix}

(A42)

Now, we bound the differences

G_{1}

and

G_{2}

on the right-hand side of (A42). By using the approximation property of

L_{h}

, we bound the term

G_{1}

as follows

\begin{matrix} | G_{1} | & = | F_{h} (v_{h}) - F (v_{h}) | \\ = | \int_{0}^{T} \int_{Ω} (L_{h} (f) - f) v_{h} | \\ \leq C ∥ L_{h} {(f) - f ∥}_{L^{2} (Ω_{T})} {∥ v_{h} ∥}_{L^{2} (Ω_{T})} [Cauchy - Schwarz Inequality] \end{matrix}

(A43)

\begin{matrix} \leq C h {∥ f ∥}_{H^{1} (Ω_{T})} {∥ v_{h} ∥}_{L^{2} (0, T; L^{2} (Ω))} \quad \text{[approximation property of } L_{h} \text{, (18)]} . \end{matrix}

(A44)

Next, we construct a suitable Ritz-type projection operator [30] (Theorem 3.3),

π_{h} : U \to U_{h}

, defined by

D_{h} (π_{h} u, v_{h}) = D (u, v_{h}) \forall v_{h} \in V_{h} .

(A45)

Since

D_{h} (\cdot, \cdot)

satisfies the discrete inf-sup condition, and

D (u, \cdot)

is a bounded functional on

V_{h}

, (A45) is well defined. By using the approximation property of the Lagrange interpolation operator (Lemma A2), we derive that

∥ u - π_{h} {u ∥}_{U} \leq C h {∥ u ∥}_{H^{2} (Ω_{T})} .

(A46)

Choosing

p_{h} = π_{h} u

in (24), where

π_{h}

is the projection operator defined in (A45), and using the bound on

F_{h} (v_{h}) - F (v_{h})

, we will show that

∥ u - u_{h} ∥_{L^{2} (0, T; H_{0}^{1} (Ω))} \leq C h {| u |}_{H^{2} (Ω_{T})},

where u and

u_{h}

are the continuous and discrete solutions satisfying (2) and (19), respectively. Using

p_{h}

, we split the error as follows:

∥ u - u_{h} ∥_{L^{2} (0, T; H_{0}^{1} (Ω))} \leq ∥ u - p_{h} ∥_{L^{2} (0, T; H_{0}^{1} (Ω))} + {∥ p_{h} - u_{h} ∥}_{L^{2} (0, T; H_{0}^{1} (Ω))} .

(A47)

From Equation (A29), we have

∥ p_{h} - u_{h} ∥_{L^{2} (0, T; H_{0}^{1} (Ω))} \leq {∥ p_{h} - u_{h} ∥}_{U_{h}} .

(A48)

By setting

p_{h} = π_{h} u

, and using (A43), (A48), we deduce from (A41) that

∥ p_{h} - u_{h} ∥_{L^{2} (0, T; H_{0}^{1} (Ω))} \leq ∥ p_{h} - u_{h} ∥_{U_{h}} \leq C h {∥ u ∥}_{H^{2} (Ω_{T})} .

(A49)

By inserting (A46), and (A49) in (A47), we derive that

∥ u - u_{h} ∥_{L^{2} (0, T; H_{0}^{1} (Ω))} \leq C h {∥ u ∥}_{H^{2} (Ω_{T})} .

(A50)

Here we have used that

G_{2} = 0

in (A42), which follows from the definition of

π_{h}

in (A45) and the choice

p_{h} = π_{h} u

. This completes the proof. □

References

John, V.; Schmeyer, E. Finite element methods for time-dependent convection–diffusion–reaction equations with small diffusion. Comput. Methods Appl. Mech. Eng. 2008, 198, 475–494.
John, V.; Novo, J. Error analysis of the SUPG finite element discretization of evolutionary convection-diffusion-reaction equations. SIAM J. Numer. Anal. 2011, 49, 1149–1176.
Roos, H.G. Robust Numerical Methods for Singularly Perturbed Differential Equations; Springer: Berlin/Heidelberg, Germany, 2008.
Knobloch, P.; Tobiska, L. The P 1 mod element: A new nonconforming finite element for convection-diffusion problems. SIAM J. Numer. Anal. 2003, 41, 436–456.
Ciarlet, P.G. The Finite Element Method for Elliptic Problems; SIAM: Philadelphia, PA, USA, 2002.
Shu, C.W. Discontinuous Galerkin methods: General approach and stability. Numer. Solut. Partial Differ. Equ. 2009, 201, 149–201.
Ayuso, B.; Marini, L.D. Discontinuous Galerkin methods for advection-diffusion-reaction problems. SIAM J. Numer. Anal. 2009, 47, 1391–1420.
Bui-Thanh, T.; Demkowicz, L.; Ghattas, O. A unified discontinuous Petrov–Galerkin method and its analysis for Friedrichs’ systems. SIAM J. Numer. Anal. 2013, 51, 1933–1958.
Houston, P.; Roggendorf, S.; van der Zee, K.G. Eliminating Gibbs phenomena: A non-linear Petrov–Galerkin method for the convection–diffusion–reaction equation. Comput. Math. Appl. 2020, 80, 851–873.
da Veiga, L.B.; Dassi, F.; Lovadina, C.; Vacca, G. SUPG-stabilized virtual elements for diffusion-convection problems: A robustness analysis. ESAIM Math. Model. Numer. Anal. 2021, 55, 2233–2258.
Gharibi, Z.; Dehghan, M. Convergence analysis of weak Galerkin flux-based mixed finite element method for solving singularly perturbed convection-diffusion-reaction problem. Appl. Numer. Math. 2021, 163, 303–316.
Adak, D.; Truong, D.P.; Manzini, G.; Rasmussen, K.O.; Alexandrov, B.S. Tensor Network Space-Time Spectral Collocation Method for Time Dependent Convection-Diffusion-Reaction Equations. arXiv 2024, arXiv:2402.18073.
Arrutselvi, M.; Natarajan, E.; Natarajan, S. Virtual element method for the quasilinear convection-diffusion-reaction equation on polygonal meshes. Adv. Comput. Math. 2022, 48, 78.
Boffi, D.; Brezzi, F.; Fortin, M. Mixed Finite Element Methods and Applications; Springer: Berlin/Heidelberg, Germany, 2013; Volume 44.
Steinbach, O. Numerical Approximation Methods for Elliptic Boundary Value Problems: Finite and Boundary Elements; Springer Science & Business Media: New York, NY, USA, 2007.
Andreev, R. Stability of sparse space–time finite element discretizations of linear parabolic evolution equations. IMA J. Numer. Anal. 2013, 33, 242–260.
Stevenson, R.; Westerdiep, J. Stability of Galerkin discretizations of a mixed space–time variational formulation of parabolic evolution equations. IMA J. Numer. Anal. 2021, 41, 28–47.
Langer, U.; Steinbach, O.; Tröltzsch, F.; Yang, H. Space-time finite element discretization of parabolic optimal control problems with energy regularization. SIAM J. Numer. Anal. 2021, 59, 675–695.
Cangiani, A.; Dong, Z.; Georgoulis, E.H. hp-version space-time discontinuous Galerkin methods for parabolic problems on prismatic meshes. SIAM J. Sci. Comput. 2017, 39, A1251–A1279.
Gomez, S.; Mascotto, L.; Moiola, A.; Perugia, I. Space-time virtual elements for the heat equation. SIAM J. Numer. Anal. 2024, 62, 199–228.
Gómez, S.; Mascotto, L.; Perugia, I. Design and performance of a space-time virtual element method for the heat equation on prismatic meshes. Comput. Methods Appl. Mech. Eng. 2024, 418, 116491.
Adak, D.; Danis, M.; Truong, D.P.; Rasmussen, K.Ø.; Alexandrov, B.S. Tensor Network Space-Time Spectral Collocation Method for Solving the Nonlinear Convection Diffusion Equation. arXiv 2024, arXiv:2406.02505.
Krebs, J.R.; Anderson, J.E.; Hinkley, D.; Neelamani, R.; Lee, S.; Baumstein, A.; Lacasse, M.D. Fast full-wavefield seismic inversion using encoded sources. Geophysics 2009, 74, WCC177–WCC188.
Bellman, R. Dynamic Programming. Science 1966, 153, 34–37.
Leugering, G.; Benner, P.; Engell, S.; Griewank, A.; Harbrecht, H.; Hinze, M.; Rannacher, R.; Ulbrich, S. Trends in PDE Constrained Optimization; Springer: Cham, Switzerland, 2014; Volume 165.
Gunzburger, M.D.; Peterson, J.S.; Shadid, J.N. Reduced-order modeling of time-dependent PDEs with multiple parameters in the boundary data. Comput. Methods Appl. Mech. Eng. 2007, 196, 1030–1047.
Oseledets, I.; Tyrtyshnikov, E. TT-cross approximation for multidimensional arrays. Linear Algebra Appl. 2010, 432, 70–88.
Kazeev, V.A.; Khoromskij, B.N.; Tyrtyshnikov, E.E. Multilevel Toeplitz matrices generated by tensor-structured vectors and convolution with logarithmic complexity. SIAM J. Sci. Comput. 2013, 35, A1511–A1536.
Kornev, E.; Dolgov, S.; Perelshtein, M.; Melnikov, A. TetraFEM: Numerical Solution of Partial Differential Equations Using Tensor Train Finite Element Method. Mathematics 2024, 12, 3277.
Steinbach, O. Space-time finite element methods for parabolic problems. Comput. Methods Appl. Math. 2015, 15, 551–566.
Schwab, C.; Stevenson, R. Space-time adaptive wavelet methods for parabolic evolution problems. Math. Comput. 2009, 78, 1293–1318.
Oseledets, I.V. Tensor-train decomposition. SIAM J. Sci. Comput. 2011, 33, 2295–2317.
Bachmayr, M.; Schneider, R.; Uschmajew, A. Tensor networks and hierarchical tensors for the solution of high-dimensional partial differential equations. Found. Comput. Math. 2016, 16, 1423–1472.
Truong, D.P.; Ortega, M.I.; Boureima, I.; Manzini, G.; Rasmussen, K.O.; Alexandrov, B.S. Tensor networks for solving the time-independent Boltzmann neutron transport equation. J. Comput. Phys. 2024, 507, 112943.
Goreinov, S.A.; Tyrtyshnikov, E.E.; Zamarashkin, N.L. A theory of pseudoskeleton approximations. Linear Algebra Appl. 1997, 261, 1–21.
Mahoney, M.W.; Drineas, P. CUR matrix decompositions for improved data analysis. Proc. Natl. Acad. Sci. USA 2009, 106, 697–702.
Goreinov, S.A.; Oseledets, I.V.; Savostyanov, D.V.; Tyrtyshnikov, E.E.; Zamarashkin, N.L. How to find a good submatrix. In Matrix Methods: Theory, Algorithms and Applications: Dedicated to the Memory of Gene Golub; World Scientific: Singapore, 2010; pp. 247–256.
Mikhalev, A.; Oseledets, I.V. Rectangular maximum-volume submatrices and their applications. Linear Algebra Appl. 2018, 538, 187–211.
Savostyanov, D.; Oseledets, I. Fast adaptive interpolation of multi-dimensional arrays in tensor train format. In Proceedings of the 2011 International Workshop on Multidimensional (nD) Systems, Poitiers, France, 5–7 September 2011; pp. 1–8.
Oseledets, I.V.; Savostianov, D.V.; Tyrtyshnikov, E.E. Tucker dimensionality reduction of three-dimensional arrays in linear time. SIAM J. Matrix Anal. Appl. 2008, 30, 939–956.
Sozykin, K.; Chertkov, A.; Schutski, R.; Phan, A.H.; Cichocki, A.S.; Oseledets, I. TTOpt: A maximum volume quantized tensor train-based optimization and its application to reinforcement learning. Adv. Neural Inf. Process. Syst. 2022, 35, 26052–26065.
Savostyanov, D.V. Quasioptimality of maximum-volume cross interpolation of tensors. Linear Algebra Appl. 2014, 458, 217–244.
Holtz, S.; Rohwedder, T.; Schneider, R. The alternating linear scheme for tensor optimization in the tensor train format. SIAM J. Sci. Comput. 2012, 34, A683–A713.
Oseledets, I.V.; Dolgov, S.V. Solution of linear systems and matrix inversion in the TT-format. SIAM J. Sci. Comput. 2012, 34, A2718–A2739.
Dolgov, S.V.; Savostyanov, D.V. Alternating minimal energy methods for linear systems in higher dimensions. SIAM J. Sci. Comput. 2014, 36, A2248–A2271.
Dolgov, S.; Savostyanov, D. Parallel cross interpolation for high-precision calculation of high-dimensional integrals. Comput. Phys. Commun. 2019, 246, 106869.
Kazeev, V.; Reichmann, O.; Schwab, C. Low-rank tensor structure of linear diffusion operators in the TT and QTT formats. Linear Algebra Appl. 2013, 438, 4204–4221.
Khoromskij, B.N.; Oseledets, I.V. QTT approximation of elliptic solution operators in higher dimensions. Russ. J. Numer. Anal. Math. Model. 2011, 26, 303–322.
Vysotsky, L.; Rakhuba, M. Tensor rank bounds and explicit QTT representations for the inverses of circulant matrices. Numer. Linear Algebra Appl. 2023, 30, e2461.
Kazeev, V.A.; Khoromskij, B.N. Low-rank explicit QTT representation of the Laplace operator and its inverse. SIAM J. Matrix Anal. Appl. 2012, 33, 742–758.
Oseledets, I.V. Approximation of 2^d × 2^d matrices using tensor decomposition. SIAM J. Matrix Anal. Appl. 2010, 31, 2130–2145.
Oseledets, I. TT-Toolbox, Version 2.2. 2023. Available online: https://github.com/oseledets/TT-Toolbox (accessed on 8 July 2025).
Fraschini, S.; Kazeev, V.; Perugia, I. Symplectic QTT-FEM solution of the one-dimensional acoustic wave equation in the time domain. arXiv 2024, arXiv:2411.11321.
Dolgov, S.V.; Khoromskij, B.N.; Oseledets, I.V. Fast solution of parabolic problems in the tensor train/quantized tensor train format with initial application to the Fokker–Planck equation. SIAM J. Sci. Comput. 2012, 34, A3016–A3038.
Holst, M.J. Multilevel Methods for the Poisson-Boltzmann Equation. Ph.D. Thesis, University of Illinois at Urbana-Champaign, Champaign, IL, USA, 1993.
Rodgers, A.; Dektor, A.; Venturi, D. Adaptive integration of nonlinear evolution equations on tensor manifolds. J. Sci. Comput. 2022, 92, 39.
Rodgers, A.; Venturi, D. Implicit integration of nonlinear evolution equations on tensor manifolds. J. Sci. Comput. 2023, 97, 33.
Danis, M.E.; Truong, D.; Boureima, I.; Korobkin, O.; Rasmussen, K.; Alexandrov, B. Tensor-Train WENO Scheme for Compressible Flows. arXiv 2024, arXiv:2405.12301.
Danis, M.E.; Truong, D.P.; DeSantis, D.; Petersen, M.; Rasmussen, K.O.; Alexandrov, B.S. High-order Tensor-Train Finite Volume Method for Shallow Water Equations. arXiv 2024, arXiv:2408.03483.
Demkowicz, L.; Gopalakrishnan, J. A Class of Discontinuous Petrov-Galerkin Methods. Part I: The Transport Equation. Comput. Methods Appl. Mech. Eng. 2010, 199, 1558–1572.
Demkowicz, L.; Gopalakrishnan, J. A class of discontinuous Petrov–Galerkin methods. Part II: Optimal test functions. Numer. Methods Partial Differ. Equ. 2011, 27, 70–105.
Roberts, N.V.; Miller, S.T.; Bond, S.D.; Cyr, E.C. An implicit-in-time DPG formulation of the 1D1V Vlasov-Poisson equations. Comput. Math. Appl. 2024, 154, 103–119.
Broersen, D.; Dahmen, W.; Stevenson, R.P. On the Stability of DPG Formulations of Transport Equations. Math. Comput. 2018, 87, 1051–1082.
Demkowicz, L.F.; Roberts, N.V.; Matute, J.M. The DPG Method for the Convection-Reaction Problem, Revisited. Comput. Methods Appl. Math. 2023, 23, 93–125.
Sande, E.; Manni, C.; Speleers, H. Explicit error estimates for spline approximation of arbitrary smoothness in isogeometric analysis. Numer. Math. 2020, 144, 889–929. [Google Scholar] [CrossRef]
Brenner, S.C. The Mathematical Theory of Finite Element Methods; Springer: New York, NY, USA, 2008. [Google Scholar]
Chen, C.M.; Larsson, S.; Zhang, N.Y. Error estimates of optimal order for finite element methods with interpolated coefficients for the nonlinear heat equation. IMA J. Numer. Anal. 1989, 9, 507–524. [Google Scholar] [CrossRef]
Adak, D.; Natarajan, S. Virtual element method for semilinear sine–Gordon equation over polygonal mesh using product approximation technique. Math. Comput. Simul. 2020, 172, 224–243. [Google Scholar] [CrossRef]

Figure 1. TT format of a 4D tensor with TT ranks $r = [r_{1}, r_{2}, r_{3}]$ and approximation error $ε$, in accordance with Equation (25).

Figure 2. Representation of a linear matrix $A$ in the TT-matrix format. First, we reshape the operation matrix $A$ and permute its indices to create the tensor $\mathcal{A}$. Then, we factorize the tensor in the tensor-train matrix format according to Equation (27) to obtain $A^{TT}$.
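
The reshape, permute, and factorize steps named in the Figure 2 caption can be sketched in NumPy as follows. This is a minimal illustration that uses a plain truncated TT-SVD, which may differ from the construction behind Equation (27). The function names `matrix_to_tt` and `tt_to_matrix`, the tolerance `tol`, and the small Kronecker-sum test matrix are assumptions made for the example and do not come from the article.

```python
import numpy as np

def matrix_to_tt(A, row_dims, col_dims, tol=1e-12):
    """Factor a matrix into TT-matrix cores of shape (r_prev, n_k, m_k, r_next).

    Hypothetical helper for illustration: a left-to-right TT-SVD sweep.
    """
    d = len(row_dims)
    # Step 1: reshape the matrix into a 2d-way tensor (i_1..i_d, j_1..j_d).
    T = A.reshape(*row_dims, *col_dims)
    # Step 2: permute so each row index i_k sits next to its column index j_k.
    perm = [k for pair in zip(range(d), range(d, 2 * d)) for k in pair]
    T = T.transpose(perm)
    # Each (i_k, j_k) pair is treated as one merged mode of size n_k * m_k.
    modes = [n * m for n, m in zip(row_dims, col_dims)]
    # Step 3: sweep over the modes with truncated SVDs.
    cores, r_prev = [], 1
    C = T.reshape(r_prev * modes[0], -1)
    for k in range(d - 1):
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        r = max(1, int(np.sum(s > tol * s[0])))  # relative rank truncation
        cores.append(U[:, :r].reshape(r_prev, row_dims[k], col_dims[k], r))
        C = (s[:r, None] * Vt[:r]).reshape(r * modes[k + 1], -1)
        r_prev = r
    cores.append(C.reshape(r_prev, row_dims[-1], col_dims[-1], 1))
    return cores

def tt_to_matrix(cores):
    """Contract TT-matrix cores back into a dense matrix (for checking only)."""
    d = len(cores)
    T = cores[0]
    for G in cores[1:]:
        T = np.tensordot(T, G, axes=([-1], [0]))
    T = T.reshape(T.shape[1:-1])  # drop the boundary ranks of size 1
    # Undo the interleaving: all row indices first, then all column indices.
    perm = list(range(0, 2 * d, 2)) + list(range(1, 2 * d, 2))
    T = T.transpose(perm)
    rows = int(np.prod(T.shape[:d]))
    cols = int(np.prod(T.shape[d:]))
    return T.reshape(rows, cols)

# Small test matrix (an assumption for the demo): a 2D Kronecker sum
# built from a 4 x 4 second-difference matrix L.
n = 4
L = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
I = np.eye(n)
A = np.kron(L, I) + np.kron(I, L)

cores = matrix_to_tt(A, row_dims=(n, n), col_dims=(n, n))
print([G.shape for G in cores])
print(np.allclose(tt_to_matrix(cores), A))
```

For this Kronecker-sum matrix the rearranged tensor has TT-matrix rank 2, so the two cores together hold 64 entries instead of the 256 entries of the dense matrix.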

Figure 3. CUR matrix decomposition.

Figure 4. Performance of the full-grid, TT, and QTT solvers on the 3D Poisson equation.

Figure 5. Performance of the full-grid, TT, and QTT solvers on the 3D CDR equation with the manufactured solution.

Figure 6. Performance of the full-grid, TT, and QTT solvers on the 3D nonlinear equation with the manufactured solution.

Table 1. TT-ranks of the diffusion operator. The results show the connection between the TT-ranks of the diffusion coefficient function $κ (x, y, z)$, calculated via TT-cross interpolation with a truncation error (TT-tol), and the TT-ranks of the diffusion operator (i.e., the operator on the left-hand side of Equation (83) in discrete form). Here, the number of elements is $N_{Q} = N_{q}^{3}$, and each element, $q_{l}$, has $N = 2$ nodes.

$κ (x, y, z)$	TT-tol	$κ^{TT}$ ranks	Operator TT-ranks ($N_{q} = 17$)	Operator TT-ranks ($N_{q} = 33$)	Operator TT-ranks ($N_{q} = 65$)
1	$1 \times 10^{- 12}$	[1, 1]	[2, 2]	[2, 2]	[2, 2]
$1 + x y z$	$1 \times 10^{- 12}$	[2, 2]	[4, 4]	[4, 4]	[4, 4]
$1 + cos (π (x + y)) cos (π z)$	$1 \times 10^{- 12}$	[3, 2]	[6, 4]	[6, 4]	[6, 4]
$1 / (1 + x + y + z)$	$1 \times 10^{- 6}$	[5, 5]	[8, 8]	[8, 8]	[8, 8]
$1 / (1 + x + y + z)$	$1 \times 10^{- 12}$	[9, 9]	[15, 15]	[15, 15]	[15, 15]

Table 2. Comparison of the relative numerical error, execution time, and compression ratio for the 3D Poisson problem using the full grid, TT, and QTT solvers. Full grid reports error and time only, while TT and QTT include operator compression ratios (defined as the storage size relative to the full tensor).

N	Full grid error	Full grid time (s)	TT error	TT time (s)	TT comp.	QTT error	QTT time (s)	QTT comp.
9	$1.31 \times 10^{- 2}$	0.20	–	–	–	–	–	–
17	$3.94 \times 10^{- 3}$	1.67	$3.94 \times 10^{- 3}$	0.30	$5.19 \times 10^{- 4}$	$3.94 \times 10^{- 3}$	0.45	$2.38 \times 10^{- 4}$
33	$1.08 \times 10^{- 3}$	102.99	$1.08 \times 10^{- 3}$	0.60	$3.24 \times 10^{- 5}$	$1.08 \times 10^{- 3}$	0.89	$5.34 \times 10^{- 6}$
65	–	–	$2.81 \times 10^{- 4}$	1.64	$2.03 \times 10^{- 6}$	$2.81 \times 10^{- 4}$	2.69	$1.09 \times 10^{- 7}$
129	–	–	$7.15 \times 10^{- 5}$	6.53	$1.27 \times 10^{- 7}$	$7.16 \times 10^{- 5}$	6.97	$2.10 \times 10^{- 9}$
257	–	–	$1.80 \times 10^{- 5}$	60.50	$7.92 \times 10^{- 9}$	$1.81 \times 10^{- 5}$	18.94	$3.90 \times 10^{- 11}$
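
As a quick check on how the errors in Table 2 shrink under refinement, the observed convergence order can be estimated from consecutive rows. The sketch below is only illustrative. It assumes that $N$ counts nodes per direction, so that the mesh width is $h = 1/(N - 1)$; the table does not state this, and the helper name `observed_orders` is hypothetical.

```python
import math

# Relative errors of the TT solver, copied from Table 2 (N = 17, ..., 257).
nodes = [17, 33, 65, 129, 257]
tt_errors = [3.94e-3, 1.08e-3, 2.81e-4, 7.15e-5, 1.80e-5]

def observed_orders(nodes, errors):
    """Estimate p in error ~ C * h**p from consecutive rows.

    Assumption (not stated in the table): h = 1 / (N - 1).
    """
    rows = list(zip(nodes, errors))
    orders = []
    for (n0, e0), (n1, e1) in zip(rows, rows[1:]):
        h0, h1 = 1.0 / (n0 - 1), 1.0 / (n1 - 1)
        orders.append(math.log(e0 / e1) / math.log(h0 / h1))
    return orders

orders = observed_orders(nodes, tt_errors)
for n, p in zip(nodes[1:], orders):
    print(f"N = {n:3d}: observed order ~ {p:.2f}")
```

Under that assumption the estimated orders rise from about 1.87 to about 1.99 as the mesh is refined, which is consistent with roughly second-order convergence of the reported error.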

Table 3. Performance comparison for Test Case 2: Time-dependent convection–diffusion–reaction (CDR) equation in three spatial dimensions. The table reports the relative numerical error, runtime, and compression ratio of the diffusion operator for the full grid, TT, and QTT solvers. The TT and QTT methods show improved scalability through low-rank tensor representations, while full grid results are shown where computationally feasible.

N	Full grid error	Full grid time (s)	TT error	TT time (s)	TT comp.	QTT error	QTT time (s)	QTT comp.
5	$1.83 \times 10^{- 1}$	0.18	–	–	–	–	–	–
9	$6.50 \times 10^{- 2}$	0.67	$6.81 \times 10^{- 2}$	0.30	$4.12 \times 10^{- 4}$	$6.81 \times 10^{- 2}$	0.51	$1.31 \times 10^{- 4}$
17	$1.86 \times 10^{- 2}$	211.07	$2.00 \times 10^{- 2}$	0.52	$6.44 \times 10^{- 6}$	$2.00 \times 10^{- 2}$	0.75	$9.30 \times 10^{- 7}$
33	–	–	$5.39 \times 10^{- 3}$	1.14	$1.01 \times 10^{- 7}$	$5.39 \times 10^{- 3}$	1.84	$5.68 \times 10^{- 9}$
65	–	–	$1.39 \times 10^{- 3}$	2.57	$1.57 \times 10^{- 9}$	$1.39 \times 10^{- 3}$	3.82	$2.93 \times 10^{- 11}$
129	–	–	$3.54 \times 10^{- 4}$	6.20	$2.46 \times 10^{- 11}$	$3.55 \times 10^{- 4}$	7.73	$1.42 \times 10^{- 13}$
257	–	–	$8.93 \times 10^{- 5}$	23.92	$3.84 \times 10^{- 13}$	$9.03 \times 10^{- 5}$	15.04	$6.65 \times 10^{- 16}$
513	–	–	$2.24 \times 10^{- 5}$	143.47	$6.00 \times 10^{- 15}$	$2.45 \times 10^{- 5}$	34.05	$3.02 \times 10^{- 18}$

Table 4. Performance comparison for Test Case 3: nonlinear 3D time-dependent equation solved using the full grid, TT, and QTT solvers. The table reports the relative numerical error, runtime, and number of Newton iterations. The full grid results are shown only for smaller sizes due to memory limitations.

N	Full grid error	Full grid time (s)	TT error	TT time (s)	TT iter.	QTT error	QTT time (s)	QTT iter.
5	$8.70 \times 10^{- 2}$	0.18	–	–	–	–	–	–
9	$2.87 \times 10^{- 2}$	2.16	$2.87 \times 10^{- 2}$	0.29	4	$2.87 \times 10^{- 2}$	0.76	3
17	$8.19 \times 10^{- 3}$	1435.28	$8.19 \times 10^{- 3}$	0.44	4	$8.19 \times 10^{- 3}$	2.65	4
33	–	–	$2.18 \times 10^{- 3}$	1.00	4	$2.18 \times 10^{- 3}$	4.93	4
65	–	–	$5.64 \times 10^{- 4}$	3.29	5	$5.64 \times 10^{- 4}$	16.05	5
129	–	–	$1.43 \times 10^{- 4}$	10.10	5	$1.43 \times 10^{- 4}$	47.84	6
257	–	–	$3.61 \times 10^{- 5}$	22.60	4	$3.61 \times 10^{- 5}$	52.27	6
513	–	–	$9.04 \times 10^{- 6}$	113.68	4	$9.05 \times 10^{- 6}$	76.57	8

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Adak, D.; Truong, D.P.; Vuchkov, R.; De, S.; DeSantis, D.; Roberts, N.V.; Rasmussen, K.Ø.; Alexandrov, B.S. Space-Time Finite Element Tensor Network Approach for the Time-Dependent Convection–Diffusion–Reaction Equation with Variable Coefficients. Mathematics 2025, 13, 2277. https://doi.org/10.3390/math13142277
