Goal oriented time adaptivity using local error estimates

We consider initial value problems where we are interested in a quantity of interest (QoI) given as the integral in time of a functional of the solution of the IVP. For these, we look into local error based time adaptivity. We derive a goal oriented error estimate and timestep controller based on the contribution of the local error to the error in the QoI, and we prove convergence of the error in the QoI as the tolerance tends to zero under weak assumptions. We analyze global error propagation of this method and derive guidelines for predicting its performance. In numerical tests we verify the convergence results and the performance guidelines. Additionally, we compare with the dual-weighted residual (DWR) method and classical local error based time adaptivity. The local error based methods are more efficient than DWR, and the goal oriented method shows good results in most examples, with significant speedups in some cases.


Introduction
A typical situation in numerical simulations based on differential equations is that one is not interested in the solution of the differential equation per se, but in a Quantity of Interest (QoI) that is given as a functional of the solution. For example, when designing an airplane, the QoI would be the lift coefficient divided by the drag coefficient. In simulations of the Greenland ice sheet, one would like to know the net amount of ice loss over a year. When simulating wind turbines, the amount of energy produced during a certain time period is more important than the actual flow solution.
Further examples are found in optimization problems with ODEs or PDEs as constraints. In the turbine example, one may want to optimize blade shape or determine the optimal placement of e.g. tidal turbines [8] for maximal energy output. Inverse problems, e.g. in oceanography [5], can also be considered. Here the aim is to determine model parameters or initial conditions such that goal functions of the simulation results fit measurement data. An example of such an inverse problem is to determine vertical mixing parameters with the QoI being the total inflow of salt water from the North Sea into the Baltic Sea.
In this article, we restrict ourselves to problems where the QoI is given as an integral over time of a functional of the solution. From the examples above, only the steady state problem in airplane design does not qualify. The basic problem we consider is thus: Given the initial value problem

u'(t) = f(t, u(t)), t ∈ [t_0, t_e], u(t_0) = u_0,     (1)

for a sufficiently smooth function f : [t_0, t_e] × R^d → R^d with solution u(t), we are interested in the QoI

J(u) = ∫_{t_0}^{t_e} j(t, u(t)) dt,     (2)

with j : [t_0, t_e] × R^d → R, which we will refer to as the density function, following the notation in [1]. When solving PDEs, the system of ODEs originates from a semi-discretization, thus u consists of the unknowns of the space discretization. Consequently, j can be used to provide spatial weighting and to select only specific points or regions of the spatial discretization.
The goal here is to determine an adaptive discrete approximation u_h ≈ u(t). Our degrees of freedom are the timesteps and we want to use as few as possible. This strategy will not yield an optimal solution, but works well in practice. We adapt the timesteps ∆t using a timestep controller, which is based on local error estimates. The solution process as a whole involves a variety of schemes.
An adaptive method consists of a time-integration scheme for (1), an error estimator, a timestep controller and an initial timestep ∆t_0. If we consider problem (1)–(2), the adaptive method also includes a discrete approximation J_h ≈ J given by a quadrature scheme.
The input for an adaptive method is a tolerance τ, which is used in the timestep controller and possibly to determine ∆t_0. The output is an approximation to the solution; this can be a discrete solution u_h, or J_h(u_h), depending on which problem is considered. Since we have an adaptive method, we cannot use the usual notion of convergence for ∆t → 0 for a time-integration scheme. Instead, we consider the limit of the tolerance going to zero.

Definition 1. An adaptive method for an IVP (1) is called convergent (in the solution), if ‖u(t_e) − u_N‖ → 0 for τ → 0, where u_N is an approximation to u(t_e) and ‖·‖ is an appropriate norm.
An adaptive method for an IVP (1) with QoI (2) is convergent in the QoI, if |J(u) − J_h(u_h)| → 0 for τ → 0.

For a convergent adaptive method we are naturally interested in the convergence rate and will express it in terms of O(τ^q). This definition of adaptive methods and convergence is targeted at local error based methods, but can also be used for methods based on global error estimates.
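In code, the pieces of such an adaptive method fit together as in the following sketch. This is an illustration only: `step`, `estimate` and `control` are placeholders for the time-integration scheme, error estimator and timestep controller, and step rejection is omitted for brevity.

```python
def adapt(f, u0, t0, te, tol, dt0, step, estimate, control):
    """Integrate u' = f(t, u) adaptively; the input is the tolerance tol."""
    t, u, dt = t0, u0, dt0
    ts, us = [t0], [u0]
    while t < te:
        dt = min(dt, te - t)           # do not step past t_e
        m = estimate(f, t, u, dt)      # local error estimate for this step
        u = step(f, t, u, dt)          # one step of the scheme N_{t,dt}
        t += dt
        ts.append(t)
        us.append(u)
        dt = control(dt, m, tol)       # e.g. a deadbeat controller
    return ts, us
```

A concrete instance is obtained by plugging in e.g. an embedded Runge-Kutta pair for `step`/`estimate` and a deadbeat controller for `control`.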
For goal oriented adaptivity, the standard approach is the dual weighted residual (DWR) method [2,18]. Originally, it was developed for spatial problems, but it has been extended to time dependent problems. The basic idea is to use the adjoint (dual) problem to get an estimate of the error in the QoI. In the time dependent case, the adjoint problem is a terminal value problem (an IVP backwards in time). For linear problems, this gives rise to global error bounds; in the nonlinear case, global error estimates are obtained.
The DWR method is based on global a posteriori error estimates. To obtain these error estimates one needs to repeatedly integrate forward and backward in time; here the primal and adjoint solutions need to be stored. The error estimate is obtained from the primal and dual solution and is used to refine the meshes. This iterative process is repeated until a discretization is found for which the error estimate η(u_h) fulfills the prescribed tolerance. The major drawback of this method is its cost, both in implementation and computation. To reduce computational effort, Carey et al. suggested applying the approach in a blockwise manner, thus making it more local [4]. The storage of the primal and dual solution can be problematic for high resolutions. This can be addressed by check-pointing [14,15], but that further increases computational costs. The method requires a full variational formulation, restricting it to Galerkin type schemes in space and time.
An alternative is to use a classical time adaptive method for IVPs based on estimating the local error. Results on convergence are well established and described in standard textbooks [21,11]. This adaptive method is not goal oriented, but can be used to solve problems with QoIs. We do not have global error bounds, since the accumulation of local errors is hard to analyse. The approach works particularly well for stiff problems, since there, local errors typically dissipate with time.
We choose a different approach, aiming to get the best of both methods. To this end we derive a new error estimator for the classical adaptive method to make it goal oriented. We estimate the timestep-wise error contribution to the error in the QoI, which consists of both quadrature and time-integration errors. Neglecting the quadrature contribution, we derive a local error estimate and use it in the deadbeat controller.
We show that convergence in the QoI follows from convergence in the solution, with additional requirements on the timesteps. The derived goal oriented adaptive method fulfills these requirements and is convergent in the QoI under weak assumptions. To obtain high convergence rates in the QoI when using higher order (> 2) time-integration schemes, one needs solutions of sufficiently high order in all quadrature evaluation points. We explain how to obtain these from the stage values of a given RK scheme.
We do our analysis for one-step methods for time-integration, embedded Runge-Kutta schemes [11] for error estimation, the deadbeat controller (13) and simple choices of ∆t_0. These restrictions are made for ease of analysis; it is straightforward to extend the results to other error estimation techniques, such as Richardson extrapolation [11]. For different controllers, such as PID controllers [22], our results allow for simple convergence proofs based on similarity to the deadbeat controller. The results hold for a wide range of initial timesteps and thus for any reasonable scheme used to compute ∆t_0.
Implementation of this method only requires a standard deadbeat controller, an embedded Runge-Kutta scheme and the density function j(t, u(t)). Being based on local error estimates, the method is computationally very cheap. For problems where the density function only regards a small part of the state vector u, the error estimate is even cheaper than the classical one.
A similar method has been proposed in [12,24,25], using various other techniques for error estimation. John and Rang propose it for drag and lift coefficients in incompressible flows, but do not show numerical results [12]. Turek describes a case where using the method for an alternating lift coefficient leads to "catastrophical results" [24]. Wick uses a point-wise evaluation of the displacement field in fluid-structure interaction [25,6]. The author describes inconsistent convergence patterns but concludes that the results are satisfactory.
To be able to make statements on the performance of the goal oriented adaptive method, we analyse the impact of global error dynamics on the error in the QoI. This analysis revolves around the nullspace of the density function j(t, u) and thus of our error estimator. A method performs well if all relevant processes are sufficiently resolved in time. For a process to be sufficiently resolved, its local error, or the local error of a faster process, must appear in the error estimate. Whether a process is relevant for the QoI is a matter of global error dynamics. Thus, with sufficient knowledge of the global error dynamics, we are able to make predictions on the performance of the goal oriented adaptive method.
We use numerical tests with widely different global error behaviors with respect to the QoI. For these we confirm the convergence results and are able to explain the performance results. It turns out to be relatively easy to predict bad performance, but hard to predict good performance. Our results show that the local error based methods are more efficient than the DWR method. The goal oriented adaptive method shows good performance in most cases and significant speedups in some.
The structure of the article is as follows: We first review current adaptive methods in section 2, then we explain and analyse our approach in section 3. Numerical results are presented in section 4.

Current adaptive methods

A posteriori error estimation via the dual weighted residual method
The starting point of the DWR method is an initial value problem in variational formulation: Find u ∈ U, such that A(u; v) = F(v) for all v ∈ V. Here, U and V are appropriate spaces, and A is linear in v and possibly nonlinear in u. In our case we have A(u; v) = (u_t, v) − (f(t, u), v) and F(v) = 0, see (1). Furthermore, there is a discrete approximation to this problem, also in weak form: Find u_h ∈ U_h, such that A(u_h; v_h) = F(v_h) for all v_h ∈ V_h. Here, U_h ⊂ U and V_h ⊂ V are finite element spaces in time.

The error estimate
To obtain an estimate of the error e_J = J(u) − J(u_h) in the QoI (2), one uses the linearised adjoint problem for J(u): Find z ∈ V, such that A′(u)(v, z) = J′(u)(v) for all v ∈ U, and its discrete version for z_h ∈ V_h, where A′ and J′ are the Gâteaux derivatives of A and J with respect to u in direction v.
Note that the adjoint problem is an initial value problem backwards in time.
An approximation of the error in the QoI is given by the residual weighted by the adjoint solution, with equality for linear functionals and approximate upper bounds in the general nonlinear case. Using an approximation z_h^+ ≈ z of higher accuracy than z_h, obtained e.g. by higher order interpolation or a discrete solution on a finer grid [1], one gets a computable estimate η(u_h). This can be further bounded by decomposing it into timestep-wise contributions, thus giving a guide on where and how to adapt. For this to work, it is imperative that the solutions of the primal and adjoint problems are available at all points. This can cause storage problems for long time simulations, which can be dealt with using check-pointing [10].

Adaptation scheme
A large number of different adaptation strategies exist. Here we use a fixed-rate strategy [1], where the fraction r ∈ [0, 1] of elements with the largest error contributions is refined. Summarizing, the following scheme is obtained.
1. Start with an initial grid.
2. Solve the primal problem to obtain u_h.
3. Construct and solve the adjoint problem (4) to obtain z_h.
4. Evaluate the error estimate η(u_h); stop if it fulfills the tolerance.
5. Otherwise refine the grid using the fixed-rate strategy and go to step 2.
The scheme is very expensive due to the need of solving adjoint problems to obtain an error estimate. While one can use generic schemes for grid adaptation, the adjoint problem and the error estimate are specific to a given equation and goal functional. Construction and solution of the adjoint problem can be automated using software such as dolfin-adjoint [7]. An advantage of the method is that the error estimate is global, and one can expect the resulting discretizations to be of high quality.
We use a finer grid to approximate z by z_h^+, making this the most expensive step in the computation of η(u_h).
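The fixed-rate marking and refinement step can be sketched as follows. This is our own illustration, with the mesh represented as a list of time intervals and bisection as the refinement rule:

```python
def fixed_rate_refine(mesh, eta, r):
    """Fixed-rate strategy: bisect the fraction r of elements with the
    largest error indicators eta (one indicator per mesh interval)."""
    n_refine = max(1, int(r * len(mesh)))
    # indices of the n_refine elements with the largest indicators
    order = sorted(range(len(mesh)), key=lambda i: eta[i], reverse=True)
    marked = set(order[:n_refine])
    new_mesh = []
    for i, (a, b) in enumerate(mesh):
        if i in marked:                 # bisect marked elements
            mid = 0.5 * (a + b)
            new_mesh.append((a, mid))
            new_mesh.append((mid, b))
        else:
            new_mesh.append((a, b))
    return new_mesh
```

Each outer iteration of the DWR scheme then solves the primal and adjoint problems on the refined mesh and re-evaluates the indicators.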

Time Adaptivity based on local error estimates
The second adaptive method we discuss is the standard in ODE solvers. It uses local error estimates of the solution and does not take QoIs into account. The results in this section for one-step methods and the deadbeat controller (13) are in principle classical [20].
Here, we present a new convergence proof that separates requirements on the error estimate, the timesteps and ∆t_0, for generic one-step methods. This makes it easier to show convergence for general controllers and estimates, and we use it to show convergence in the QoI for the goal oriented adaptive method in section 3. We first introduce the relevant terminology used in this paper. For readers familiar with time adaptivity for ODEs: we use local extrapolation and Error Per Step (EPS) based control, see [20].
Definition 2. The flow [23] of an IVP (1) is the map

M_{t,∆t} : R^d → R^d, u(t) ↦ u(t + ∆t),

where t ∈ [t_0, t_e] and t + ∆t ≤ t_e for ∆t > 0.

The flow acts as the solution operator for u(t). To numerically solve an IVP means to approximate the flow by a numerical flow map N_{t,∆t} : R^d → R^d defined by some numerical scheme. A timestep can then be written in the form u_{n+1} = N_{t_n,∆t_n} u_n. We generally assume problem (1) to have a unique solution u(t), guaranteeing existence of the flow map M_{t,∆t}.

We define the global error by

e_n := u_n − u(t_n).     (6)

By adding zero we obtain the global error propagation form

e_{n+1} = N_{t_n,∆t_n} u_n − M_{t_n,∆t_n} u_n + M_{t_n,∆t_n} u_n − M_{t_n,∆t_n} u(t_n),     (7)

where the first difference is the global error increment. The dynamics of global error propagation are usually not known. The global error increments, however, have a known structure.

Definition 3. Assume a sufficiently smooth right-hand side f for (1). The principal error function [11] φ of a scheme N_{t,∆t} of order p is defined by

N_{t,∆t} u − M_{t,∆t} u = φ(t, u) ∆t^{p+1} + O(∆t^{p+2}).     (8)

The local error of a scheme N_{t,∆t} of order p is

ℓ_n := N_{t_n,∆t_n} u_n − M_{t_n,∆t_n} u_n = φ(t_n, u_n) ∆t_n^{p+1} + O(∆t_n^{p+2}).     (9)

Here, the local error is equivalent to the global error increment in (7). We will estimate the local error and derive a timestep controller to keep the norm of the local error in check. Then we show that the resulting adaptive method is convergent, that is, the global error can be controlled by the global error increments and goes to zero for τ → 0.
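The ∆t^{p+1} structure of the local error (9) is easy to observe numerically. The following toy check (our own example, not from the text) uses the explicit Euler scheme (p = 1) on u' = u, where the exact flow is M_{t,∆t} u = e^{∆t} u:

```python
import math

def euler_local_error(dt, u=1.0, t=0.0):
    numerical = u + dt * u             # N_{t,dt} u, explicit Euler for u' = u
    exact = math.exp(dt) * u           # M_{t,dt} u, the exact flow
    return abs(numerical - exact)      # local error

# halving dt should reduce the local error by about 2^(p+1) = 4
e1 = euler_local_error(0.01)
e2 = euler_local_error(0.005)
print(e1 / e2)  # close to 4
```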

Error estimation and timestep controller
We now derive an estimate for the local error using two solutions of orders p and p̂. We approximate the local error behaviour by a simplified model focusing on the leading terms. Aiming to keep the norm of the local error equal to a desired tolerance then determines the new timestep. The resulting timestep controller gives us ∆t_{n+1} based on the previous timestep ∆t_n, the local error estimate and a tolerance τ.
Assume two time-integration schemes (N_{t,∆t}, N̂_{t,∆t}) with orders (p, p̂), p̂ < p, and principal error functions (φ, φ̂). Embedded Runge-Kutta schemes [11] are a possible choice, as they have the advantage that the embedded solution uses the same stage derivatives, requiring essentially no extra computation.
We use a local extrapolation approach and estimate the local error by

ℓ_n := N_{t_n,∆t_n} u_n − N̂_{t_n,∆t_n} u_n.     (10)

The leading term of this error estimate, characterized by the principal error function φ̂, matches the leading term of the local error (9) of the lower order scheme; higher order terms with regard to ∆t_n will differ. Note that this local error is not the global error increment from (7), but the one corresponding to N̂_{t,∆t}. We set m_n := ‖ℓ_n‖ and model the local error by

m_n = ‖φ̂(t_n, u_n)‖ ∆t_n^{p̂+1},     (11)

assuming φ̂(t_n, u_n) to be slowly changing. The next step of this model yields

m_{n+1} = ‖φ̂(t_n, u_n)‖ ∆t_{n+1}^{p̂+1}.     (12)

Aiming for m_{n+1} = τ gives the well-known deadbeat controller

∆t_{n+1} = ∆t_n (τ / m_n)^{1/(p̂+1)}.     (13)
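A minimal sketch of the local error estimate (10) and the deadbeat controller (13), assuming a Heun/Euler embedded pair (p = 2, p̂ = 1). Production solvers additionally use step rejection and step-size limiters, which are omitted here:

```python
import math

def heun_euler_step(f, t, u, dt):
    """One step of an embedded Heun(2)/Euler(1) pair: returns the higher
    order solution (local extrapolation) and the estimate m_n = |l_n|."""
    k1 = f(t, u)
    k2 = f(t + dt, u + dt * k1)
    u_high = u + dt * (k1 + k2) / 2    # order p = 2 (Heun)
    u_low = u + dt * k1                # order p_hat = 1 (Euler), same stages
    return u_high, abs(u_high - u_low)

def deadbeat(dt, m, tol, p_hat=1):
    """Deadbeat controller (13): aim for m_{n+1} = tol."""
    return dt * (tol / m) ** (1.0 / (p_hat + 1))

def integrate(tol, u=1.0, t=0.0, te=1.0, dt=0.1):
    """Solve u' = -u on [t, te] adaptively with tolerance tol."""
    while t < te:
        dt = min(dt, te - t)
        u, m = heun_euler_step(lambda s, v: -v, t, u, dt)
        t += dt
        dt = deadbeat(dt, m + 1e-300, tol)   # guard against m = 0
    return u

print(abs(integrate(1e-6) - math.exp(-1.0)))  # error decreases with tol
```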

Convergence in the solution
We now show convergence with e_n = O(τ^{p/(p̂+1)}) for the adaptive method consisting of a time-integration scheme of order p, the error estimate (10), the controller (13) and a suitable initial step-size.
First we establish a relation between the global error and the maximal timestep with Lemma 1. Corollary 2 relaxes this relation to general timesteps in dependence on the tolerance τ. We cannot use the timesteps from controller (13) directly, since their dependence on τ is more involved. Instead, we construct a reference timestep series which fulfills the requirements of both the Lemma and the Corollary and gives the targeted convergence rate. With Theorem 3 we show that the timesteps from the controller (13) converge to the reference timesteps for τ → 0, which gives convergence with the rate O(τ^{p/(p̂+1)}).
Lemma 1. Let problem (1) have a sufficiently smooth right-hand side f, such that a scheme N_{t,∆t} of order p has the global error increment (9), with timesteps

∆t_n = t_{n+1} − t_n = θ(t_n)∆T + O(∆T^{1+ε}), ε > 0,     (14)

and the step-size function θ : [t_0, t_e] → (θ_min, 1], θ_min > 0, for some ∆T > 0. Then the global error (6) fulfills e_n = O(∆T^p).

Proof. We first neglect the O(∆T^{1+ε}) term in (14). Under this assumption a proof of e_n = O(∆T^p) can be found in [9, pp. 68].
To extend this result to ∆t_n = θ(t_n)∆T + O(∆T^{1+ε}), we define θ*(t_n) := ∆t_n / ∆T = θ(t_n) + O(∆T^ε), which fulfills 0 < θ*(t_n) ≤ 1 for ∆T sufficiently small. This means the general case (14) is also covered by the proof in [9, pp. 68], and for ∆T → 0 we get e_n = O(∆T^p). Using this Lemma, we can now link the global error to the tolerance.
Corollary 2. Assume the smoothness requirements of Lemma 1 to be met and assume a scheme of order p to compute u_n. Assume a mesh t_0 < · · · < t_N = t_e with timesteps ∆t_n = t_{n+1} − t_n, n = 0, . . . , N − 1, that fulfill (14) with ∆T = O(τ^{1/q}). Then the global error fulfills e_n = O(τ^{p/q}).

Proof. We meet all assumptions of Lemma 1 and get e_n = O(∆T^p) = O(τ^{p/q}).

We cannot apply Corollary 2 to the timesteps (13) directly, since they have a more complex dependence on τ. Therefore we use reference timesteps ∆t_n^ref = O(τ^{1/q}). We show ∆t_n → ∆t_n^ref for τ → 0 with a difference of at most O(∆T_ref^2) and can then apply Lemma 1. We define the reference timesteps

∆t_n^ref := c_n (τ / ‖φ̂(t_n, u(t_n))‖)^{1/(p̂+1)},     (15)

where φ̂ is the principal error function (8) corresponding to N̂_{t,∆t} and the factors c_n fulfill c_n = O(1) with respect to τ → 0, with c_0 > 0 free. This adds a degree of freedom to choose the initial timestep. For (15) to be well-defined we require f in problem (1) to be sufficiently smooth and define φ̂_min := min_{t ∈ [t_0, t_e]} ‖φ̂(t, u(t))‖, where we assume φ̂_min > 0 (17). This gives the maximal timestep

∆T_ref := max_n ∆t_n^ref = O(τ^{1/(p̂+1)}).     (16)

We have ∆t_n^ref = θ(t_n)∆T_ref with a step-size function θ as in Lemma 1. Applying Lemma 1 gives us e_n = O(τ^{p/(p̂+1)}) for a time-integration scheme of order p. We now show convergence of the adaptive method with timesteps from (13).
Theorem 3. Let the assumptions of Corollary 2 hold and let the timesteps be given by the deadbeat controller (13) with initial timestep ∆t_0 = ∆t_0^ref. Then the adaptive method is convergent with e_n = O(τ^{p/(p̂+1)}).

Proof. By induction we show the timesteps fulfill ∆t_n = ∆t_n^ref + O(∆T_ref^2). We choose c_0 in ∆t_0^ref such that ∆t_0 = ∆t_0^ref, so the induction base is met. The timestep given by the controller is

∆t_{n+1} = ∆t_n (τ / m_n)^{1/(p̂+1)}.

Expanding this expression, separating the O(∆t_n) terms and using the definition of the maximal timestep (16), we obtain ∆t_{n+1} = ∆t_{n+1}^ref + O(∆T_ref^2). Thus we showed the induction step. By Corollary 2 we then get the result e_n = O(τ^{p/(p̂+1)}).
Thus we established convergence of the derived adaptive method for a suitable initial timestep ∆t_0. The assumption (17) is a requirement of controllability in the asymptotic regime: the global error would not be controllable by means of local errors if the local error vanished at some point. Further, we built a structure with which one can prove similar results for different controllers, e.g. PID controllers [22]. To prove convergence, one can either show (14) using suitable reference timesteps, or show a maximal deviation of O(∆T_ref^{1+ε}), ε > 0, of a given controller from the deadbeat controller (13).

Goal oriented adaptivity using local error estimates
We now consider the goal oriented setting (2) for problem (1) and are only interested in the QoI J(u). We approximate the integral in J using quadrature and u(t) by the numerical solution u_h to get

J_h(u_h) = Σ_{n=0}^{N−1} ∆t_n Σ_k σ_k j(t_n^(k), u_n^(k)).     (18)

Here t_n^(k) and σ_k are the evaluation points resp. weights of the quadrature scheme, and u_n^(k) ≈ u(t_n^(k)) are the corresponding solution approximations. We assume an embedded Runge-Kutta scheme for time-integration.
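As an illustration of (18), the following sketch assumes the composite trapezoidal rule (r = 2), whose evaluation points are exactly the grid points of the adaptive grid:

```python
def qoi_trapezoid(j, ts, us):
    """Approximate J(u), the time integral of j(t, u(t)), by the composite
    trapezoidal rule (r = 2) on the non-uniform adaptive grid ts, using the
    numerical solution values us."""
    total = 0.0
    for n in range(len(ts) - 1):
        dt = ts[n + 1] - ts[n]
        total += dt * (j(ts[n], us[n]) + j(ts[n + 1], us[n + 1])) / 2
    return total
```

For higher quadrature orders, evaluation points inside a step additionally require the intermediate solutions u_n^(k) discussed below.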
As we are now only interested in the QoI, we derive an adaptive method that is convergent in the QoI and goal oriented. The method aims to be more efficient by taking the QoI into account in the error estimate. Convergence in the QoI will be shown based on convergence in the solution. With the following Theorem we establish the connection between the convergence rates.

Theorem 4. Assume f in problem (1) and j in the QoI (2) to be sufficiently smooth, a mesh t_0 < · · · < t_N = t_e with timesteps ∆t_n = t_{n+1} − t_n = O(τ^{1/q}) and an approximation J_h ≈ J (18) by a quadrature scheme of order r. Further assume approximations u_n^(k) ≈ u(t_n^(k)) with

u_n^(k) − u(t_n^(k)) = O(τ^{p/q}) for all n, k.     (19)

Then the error in the QoI fulfills

e_J = O(τ^{min(r,p)/q}).     (21)

Proof. By splitting the error,

e_J = J(u) − J_h(u) + J_h(u) − J_h(u_h),     (20)

we can deal with the two errors separately. A standard estimate for numerical quadrature schemes of order r, applied timestep-wise, bounds the quadrature error with a constant c_q. Using the bound j_max := max_{t_0 ≤ t ≤ t_e} |(j(t, u(t)))^(r)|, we get J(u) − J_h(u) = O(τ^{r/q}). For the time-integration error we have

J_h(u) − J_h(u_h) = Σ_{n=0}^{N−1} ∆t_n Σ_k σ_k (j(t_n^(k), u(t_n^(k))) − j(t_n^(k), u_n^(k))),

where we linearise j(t_n^(k), ·) and use assumption (19) to get J_h(u) − J_h(u_h) = O(τ^{p/q}). Summing up quadrature and time-integration error yields (21).

Here, we combined statements on convergence in the QoI and the respective rates. The assumption ∆t_n = O(τ^{1/q}) gives u_n − u(t_n) = O(τ^{p/q}) for all n = 0, . . . , N − 1 by Corollary 2. Using linear interpolation of the solution at an intermediate point t_n^(k) yields an accuracy of O(τ^{2/q}). The requirement (19) therefore becomes relevant for schemes of order p > 2 and is discussed at the end of section 3.2.
Our idea is now to use a goal oriented error estimate to obtain step-sizes more suitable for addressing the error in the QoI. In practical computations this should lead to a gain in efficiency.
We first derive our error estimate and controller; for the resulting goal oriented adaptive method we show convergence in the QoI with Theorem 5. We analyse how to predict its performance in section 3.3.

Error estimate and timestep controller
In the proof of Theorem 4 we see two different error sources, time-integration and quadrature, see (20). While one can estimate the quadrature error, doing so is not necessary.
Using an error estimate based on the time-integration error only, we will get an adaptive method that is convergent in the QoI.
Neglecting the quadrature error, we have e_J ≈ J_h(u) − J_h(u_h). As we generally do not have error estimates for the intermediate points of the quadrature scheme, we approximate the above term-wise by the rectangular rule,

∆t_n (j(t_{n+1}, u_{n+1}) − j(t_{n+1}, u(t_{n+1}))).     (22)

Note that this is an approximation of the time-integration error J_h(u) − J_h(u_h) and does not place general restrictions on choices for quadrature schemes. The global error propagation form of (22) is

∆t_n (j(t_{n+1}, u_{n+1}) − j(t_{n+1}, u(t_{n+1})))     (23)
= ∆t_n (j(t_{n+1}, N_{t_n,∆t_n} u_n) − j(t_{n+1}, M_{t_n,∆t_n} u_n))   [global error increment]     (24)
+ ∆t_n (j(t_{n+1}, M_{t_n,∆t_n} u_n) − j(t_{n+1}, M_{t_n,∆t_n} u(t_n))).   [global error propagation]     (25)

Again, we do not know the global error propagation dynamics, but we can estimate the global error increment and control it using the timesteps. We use local extrapolation with a scheme N̂_{t,∆t} of order p̂ < p and control

ℓ_n^j := ∆t_n (j(t_{n+1}, N_{t_n,∆t_n} u_n) − j(t_{n+1}, N̂_{t_n,∆t_n} u_n)).     (26)

This is the global error increment (24), but corresponding to N̂_{t_n,∆t_n}. We estimate (26) by

m_n^j := ∆t_n (j(t_{n+1}, u_{n+1}) − j(t_{n+1}, û_{n+1})),     (27)

where û_{n+1} = N̂_{t_n,∆t_n} u_n is the embedded solution. To construct a controller we need a model for (26). As j may be non-linear, we linearise in u and get

ℓ_n^j ≈ ∆t_n (∂j/∂u)(t_{n+1}, u_{n+1}) ℓ_n.     (28)

As model we choose the leading term of (28) and assume the derivative term to be slowly changing, giving |m_n^j| = ‖(∂j/∂u)(t_{n+1}, u_{n+1})‖ m_n ∆t_n = O(∆t_n^{p̂+2}), where m_n is the classical error estimate (11). For this model the next step yields |m_{n+1}^j| = ‖(∂j/∂u)(t_{n+1}, u_{n+1})‖ ‖φ̂(t_n, u_n)‖ ∆t_{n+1}^{p̂+2}. We aim to control the error per unit interval, per step, which means aiming for |m_{n+1}^j| = ∆t_{n+1} τ. This is not to be confused with the common Error Per Unit Step (EPUS) approach in classical timestep control. We get the deadbeat controller

∆t_{n+1} = ∆t_n (∆t_n τ / |m_n^j|)^{1/(p̂+1)}.     (29)

We thus constructed a timestep controller to control the error in the QoI (2) using only local error estimates in j(t, u). In the next section we show that the resulting adaptive method is convergent in the solution and in the QoI.
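The estimate (27) and the goal oriented deadbeat controller (29) can be sketched as follows, again assuming an embedded pair with p̂ = 1. Note the extra factor ∆t_n compared with the classical controller, coming from the target |m^j_{n+1}| = ∆t_{n+1} τ:

```python
def goal_estimate(j, t_next, u_high, u_low, dt):
    """Goal oriented local error estimate (27): the density j applied to the
    two embedded solutions, scaled by the timestep."""
    return dt * abs(j(t_next, u_high) - j(t_next, u_low))

def goal_deadbeat(dt, mj, tol, p_hat=1):
    """Goal oriented deadbeat controller (29), aiming for
    |m^j_{n+1}| = dt_{n+1} * tol (error per unit interval, per step)."""
    return dt * (dt * tol / mj) ** (1.0 / (p_hat + 1))
```

Compared with (13), the estimate is scalar regardless of the dimension of u, which makes it cheap when j only involves a few components of the state.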
Comparing the implementation of this adaptive method with the classical one from section 2.2, we additionally only require the density function j. This is needed regardless of the method used, as it is necessary for the evaluation of J(u).

Convergence in the quantity of interest
We now show convergence of the derived goal oriented adaptive scheme in the QoI, using Theorem 4. While we use the same type of controller, we have a different error estimator and cannot use Corollary 2 directly. This is due to the timesteps (29) converging to a different series of reference timesteps. We define these and repeat the steps of Theorem 3, showing convergence of the steps from the controller (29) to our reference. We use reference timesteps

∆t_n^ref = O(τ^{1/(p̂+1)}),     (30)

defined analogously to (15), now based on the principal error term of the goal oriented estimate, with factors c_n = O(1) for τ → 0 and c_0 > 0. This gives a degree of freedom in choosing ∆t_0. For the timesteps to be well-defined we require, analogously to (17),

φ̂_min^j > 0,     (32)

yielding the maximal timestep ∆T_ref := max_n ∆t_n^ref = O(τ^{1/(p̂+1)}). With the following Theorem we show convergence of the timesteps from controller (29) with error estimate (27) to the reference timesteps (30).
Theorem 5. Let f in (1) and j in (2) be sufficiently smooth. Assume an adaptive method consisting of:

1. a pair of schemes N_{t,∆t}, N̂_{t,∆t} with orders p, p̂ and p > p̂,
2. a quadrature scheme of order r to approximate J(u) as in (18),
3. schemes N_{t,∆t}^(k) to obtain solutions of order p − 1 for all quadrature evaluation points that are not part of the resulting grid,
4. the error estimator (27),
5. the deadbeat controller (29),
6. a suitable initial timestep ∆t_0.

If the principal error function of the goal oriented estimate fulfills (32), then the adaptive method is convergent in the QoI.
Proof. We first show convergence in the solution by inductively showing that every step given by the controller (29) fulfills ∆t_n = ∆t_n^ref + O(∆T_ref^2). We choose c_0 such that ∆t_0 = ∆t_0^ref, which gives the induction base.
Here we can pull out the O(∆t_n) terms and use the induction hypothesis to get ∆t_{n+1} = ∆t_{n+1}^ref + O(∆T_ref^2), which shows that the induction step holds, yielding ∆t_{n+1} = O(τ^{1/(p̂+1)}). We thus proved the induction and get ∆t_n = O(τ^{1/(p̂+1)}) for all n. This gives us the assumption on step-sizes as needed by Corollary 2 and thus convergence in the solution in the grid-points, with a rate of O(τ^{p/(p̂+1)}). Now we want to show that we fulfill assumption (19) in Theorem 4. Taking a single step of size ∆t_n^(k) ≤ ∆t_n from t_n to the quadrature evaluation point t_n^(k) with a scheme of order p − 1 gives the error

e_n^(k) = e_n + O((∆t_n^(k))^p).

Here we have ∆t_n^(k) = O(τ^{1/(p̂+1)}). Hence we have e_n^(k) = O(τ^{p/(p̂+1)}) and fulfill assumption (19) of Theorem 4 for q = p̂ + 1, which gives us convergence in the QoI with e_J = O(τ^{min(r,p)/(p̂+1)}).
Thus our adaptive method is convergent in the QoI. The requirement (32) is a requirement on controllability of the global error by means of the local error (26). For a j which is linear in u, it is equivalent to the local error not lying in the nullspace of j(t, ·). Possible consequences of it lying in the nullspace of j(t, u) are shown in the following example. A more general analysis of this is the subject of the next section.

Example 6. In [24] the author describes using the lift coefficient of the flow around a cylinder as density function j(t, u), which changes sign over time, implying it has zeros. It is observed that large timesteps are chosen when the lift coefficient is close to zero, leading to "catastrophical results". From the description of this example we see that criterion (32) is not fulfilled, so convergence in the QoI is not guaranteed.
We now discuss the time-integration schemes for the quadrature evaluation points, as needed in the assumptions of Theorem 5. Here, we only need a solution for the points that are not part of the grid. This is relevant only for quadrature schemes of order r > 2; for r = 2 there is the trapezoidal rule. One can also use linear interpolation of the solution at the grid points, which can be formally expressed using a combination of the identity operator and N_{t_n,∆t_n} with suitable weights. Linear interpolation will, however, yield at most second order accuracy. We instead want to use RK schemes and reuse the already calculated stage derivatives. To determine weights of RK schemes for intermediate points t_n + γ_k ∆t_n, one takes the order conditions for order p, e.g. Σ_s b_s c_s = 1/2 for p = 2, and multiplies the right-hand side by γ_k^p. This becomes clear when looking into the details of a proof of the order conditions [11, pp. 142].
Example 7. Assume the classic 4th order Runge-Kutta scheme for time-integration. Since the convergence rate in the QoI (21) is determined by the minimum of the orders of the quadrature and time-integration schemes, we pick Simpson's rule (r = 4) for quadrature. To get a 3rd order (4th order local error) solution for the point t_n + ∆t_n/2, one can use the RK weights b* = 1/24 (5, 4, 4, −1) with the same stage derivatives.
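These weights can be checked numerically. The sketch below (our own verification) applies the classic RK4 stages to u' = u and confirms third order accuracy at the midpoint: halving ∆t reduces the midpoint error by roughly 2^4 = 16.

```python
import math

def rk4_with_midpoint(u, t, dt, f):
    k1 = f(t, u)
    k2 = f(t + dt / 2, u + dt / 2 * k1)
    k3 = f(t + dt / 2, u + dt / 2 * k2)
    k4 = f(t + dt, u + dt * k3)
    u_full = u + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)      # classic RK4
    u_mid = u + dt / 24 * (5 * k1 + 4 * k2 + 4 * k3 - k4)  # weights b*
    return u_full, u_mid

f = lambda t, u: u
_, mid1 = rk4_with_midpoint(1.0, 0.0, 0.2, f)
_, mid2 = rk4_with_midpoint(1.0, 0.0, 0.1, f)
err1 = abs(mid1 - math.exp(0.1))   # midpoint error for dt = 0.2
err2 = abs(mid2 - math.exp(0.05))  # midpoint error for dt = 0.1
print(err1 / err2)  # roughly 16
```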

The nullspace of j(t, u) and global error propagation
Convergence of a method is a statement about the limit τ → 0 and does not regard global error dynamics. Here, we analyse these dynamics to make a qualitative statement about the grid obtained from the goal oriented adaptive method for τ > 0, away from the limit. We establish guidelines to predict grid quality and thus performance of the goal oriented adaptive method. We assume a QoI with a density function that is linear in u,

j(t, u) = w^T u,

with weights w ∈ R^d. Considering the split (20) of e_J, the quadrature error does not involve the numerical solution.
Global error propagation only appears in the time-integration part of the error, which in this case is given by

J_h(u) − J_h(u_h) = Σ_{n=0}^{N−1} ∆t_n Σ_k σ_k w^T (u(t_n^(k)) − u_n^(k)).

We define a weighted seminorm ‖·‖_w : R^d → R by means of the weights w. This is a seminorm, since it may have a non-trivial nullspace if w_i = 0 for some indices i.
Throughout this section, we assume ‖·‖_w to have a non-trivial nullspace. Using (35) we get the bound

|J_h(u) − J_h(u_h)| ≤ Σ_{n=0}^{N−1} ∆t_n Σ_k σ_k ‖e_n^(k)‖_w,

and we need to further investigate how ‖e_n^(k)‖_w is affected by global error propagation. Starting from (23)–(25), we extract the error corresponding to a single quadrature evaluation point and obtain a global error propagation term (36) and global error increments (37). We now want to find a bound for the global error propagation term (36) depending on e_n. For this we use Lipschitz conditions. Assume a map f : U → W fulfills the Lipschitz condition

‖f(u) − f(v)‖_W ≤ L ‖u − v‖_U

with some constant L and suitable norms for the spaces U and W. The Lipschitz-norm of f is the minimal L fulfilling the Lipschitz condition, cf. [23]. We can define a corresponding Lipschitz-seminorm in the following way.
Definition 4. Assume a map f : U → W, with ‖·‖_w being a seminorm on W and ‖·‖_U being a norm on U. We define the Lipschitz-seminorm (with respect to the given norm on U) by

‖f‖_w := sup_{u ≠ v} ‖f(u) − f(v)‖_w / ‖u − v‖_U,

where u, v ∈ U.

Due to the possible non-trivial nullspace, we still require a norm in the denominator. We can use this definition to bound the global error propagation in (36)–(37) by

‖M_{t_n,∆t_n} u_n − M_{t_n,∆t_n} u(t_n)‖_w ≤ ‖M_{t_n,∆t_n}‖_w ‖e_n‖_U,     (38)

where ‖·‖_U is a norm. The norm ‖·‖_U will be non-zero in the nullspace of ‖·‖_w. For a non-trivial nullspace this can be problematic, which we first illustrate by a simple test case and later using numerical test problems in section 4.
To get a better idea of the dynamics, we look at the linear case f(u) = Au and take ‖·‖_U to be the 1-norm.
Lemma 8. The Lipschitz-seminorm of a linear operator A ∈ R^{n×m}, with respect to the 1-norm, is given by

‖A‖_w = sup_{‖x‖_1 = 1} ‖Ax‖_w = max_j ‖A e_j‖_w,

where e_j is the j-th unit vector.

Proof. The first form using the supremum is obtained by defining x := u − v and scaling x ≠ 0 to ‖x‖_1 = 1 using the homogeneity of the seminorm. For the second part we have

‖Ax‖_w = ‖Σ_j x_j A e_j‖_w ≤ Σ_j |x_j| ‖A e_j‖_w ≤ max_j ‖A e_j‖_w for ‖x‖_1 = 1,

with equality for the unit vector attaining the maximum. Taking the supremum over all ‖x‖_1 = 1 yields the result.
Thus the Lipschitz-seminorm of a linear operator is a (weighted) column max-seminorm of the given matrix.
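For illustration, the following sketch assumes a weighted 1-type seminorm ‖v‖_w = Σ_i |w_i v_i| (the concrete choice of seminorm is our assumption); the column-max identity of Lemma 8 then takes this form:

```python
def seminorm_w(v, w):
    """Weighted 1-type seminorm |v|_w = sum_i |w_i v_i| (our assumed
    concrete realisation of the weighted seminorm)."""
    return sum(abs(wi * vi) for wi, vi in zip(w, v))

def lipschitz_seminorm_1(A, w):
    """Lipschitz-seminorm of x -> Ax w.r.t. the 1-norm (Lemma 8):
    the maximum over columns of the column's w-seminorm."""
    ncols = len(A[0])
    return max(seminorm_w([row[j] for row in A], w) for j in range(ncols))

A = [[1.0, 2.0], [3.0, -4.0]]
w = [1.0, 0.0]                     # second component lies in the nullspace
print(lipschitz_seminorm_1(A, w))  # column w-seminorms are 1 and 2 -> 2.0
```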

Example 9. Consider a matrix A and a vector x with ‖A‖_w = 2, ‖x‖_w = 1, ‖x‖_1 = 3 and ‖Ax‖_w = 4. From (38) we get the inequality ‖Ax‖_w ≤ ‖A‖_w ‖x‖_1, which is fulfilled. However, ‖Ax‖_w > ‖A‖_w ‖x‖_w. This shows that the inequality ‖Ax‖ ≤ ‖A‖ ‖x‖ for norms does not hold for seminorms.

Consider a flow map M = (m_11, m_12; m_21, m_22) and weights w such that the first component spans the nullspace of ‖·‖_w and the second its image.
The diagonal entries of M describe damping or amplification in the nullspace or image, and the off-diagonal entries describe (scaled) transport from the nullspace into the image or vice versa. This can also be formulated in an analogous blockwise manner; then the diagonal blocks m_11 and m_22 may include transport inside the nullspace resp. image. Damping is generally favorable, and amplification may be unavoidable if it is part of the ODE/PDE. Transport from the image into the nullspace is unproblematic, but transport from the nullspace into the image can be highly problematic. In this example the corresponding component is m_21.
We choose timesteps to control the global error increments (37) in the image: controlling the timesteps, we try to keep the seminorm of these increments below a given tolerance. We do not control the error increments in the nullspace, which can be problematic if errors from the nullspace are transported into the image.
As simulating a process results in errors regardless of the step-size, this is a question of sufficiently resolving the relevant processes. Assume a process in the nullspace is faster than a process in the image at a given time. The timesteps are chosen to sufficiently resolve the slow process in the image, so the process in the nullspace remains under-resolved, its error exceeding the tolerance. If this error is then transported into the image, the performance of the goal oriented adaptive method suffers.
Likewise, the goal oriented controller performs well if all processes whose errors end up in the image are sufficiently resolved. This can be due to the image containing the processes which require the smallest timesteps. The other case is that processes in the nullspace remain under-resolved but have negligible impact on $J(u)$. This may be due to strong damping in $m_{22}$ or a lack of transport, $m_{12}$ being small.
Due to the potentially complicated dynamics of the system, it is hard to clearly identify which processes are negligible.
Example 10. In [25] the author simulates flow-driven fracturing of an obstacle. The QoI $j(t, u)$ is the displacement of the obstacle in flow direction, measured at the tip of the outflow edge. Using this density function to control timesteps, there may be a small delay between the flow building up around the obstacle and the displacement occurring. This delay would result in choosing too large timesteps when the displacement is just starting to grow but the flow pattern around the obstacle is already beginning to form. The author observes significant error reductions beyond a certain tolerance, which may be the point at which the inflow is sufficiently resolved.

Numerical Results
We now test the results of Theorem 4 on convergence rates numerically. Further, we compare the performance of the DWR method and the local error based adaptive methods. Verification of the results of section 2.2 is not presented, as these are well established.
The experiments were run on an Intel i7-3930K 3.20 GHz CPU and implemented in Python 2.7 using FEniCS [13].
The following specifications are shared by all test-cases. For local error based methods using timestep-controllers, we bound the rate by which timesteps change [11]. Here $l = l_n$ for (13) resp. $l_n = |lj_n|$ for (29). The purpose of this is to provide more computational stability by preventing too large or too small timestep changes. In practice it will not take effect for $\tau \to 0$. We use $f_{\max} = 3$ and $f_{\min} = 0.01$ and do not reject timesteps. For the initial timestep we use $\Delta t_0 = \tau^{1/(p+1)}$.
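A limited controller of this type can be sketched as follows. The update factor $(\tau/l)^{1/(p+1)}$ is the standard elementary (deadbeat) controller and is an assumption here, since formulas (13) and (29) are not reproduced in this excerpt:

```python
def next_timestep(dt, err, tol, p, f_min=0.01, f_max=3.0):
    """One step of a rate-limited elementary timestep controller.

    dt  : current timestep
    err : local error estimate l (resp. |lj_n| for the goal oriented method)
    tol : tolerance tau
    p   : order of the error estimate
    """
    factor = (tol / err) ** (1.0 / (p + 1))
    # Bound the rate of change to prevent too large or too small steps.
    factor = min(f_max, max(f_min, factor))
    return dt * factor

# A 100x suggested increase is capped at f_max = 3.
print(next_timestep(0.1, 1e-6, 1e-2, 1))  # dt * 3
```

For tolerances approaching zero the limiter rarely activates, so asymptotic behavior is governed by the unlimited controller.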
For the DWR method we use an initial grid with 10 equidistant cells. As refinement strategy we use fixed-rate refinement [1] with X = 0.8 and Y = 0. This means we refine the 80% of cells corresponding to the largest errors, where refinement means splitting a cell into two equally sized cells. To approximate the exact adjoint $z$ we use a finer grid, dividing all time-intervals in half.
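A simplified standalone sketch of this fixed-rate refinement (not the FEniCS-based implementation used in the experiments):

```python
def refine_fixed_rate(grid, errors, fraction=0.8):
    """Bisect the cells carrying the largest error indicators.

    grid   : list of (t_left, t_right) time cells
    errors : per-cell error indicators
    """
    n_refine = int(fraction * len(grid))
    # Indices of the cells with the largest error indicators.
    marked = set(sorted(range(len(grid)),
                        key=lambda i: errors[i], reverse=True)[:n_refine])
    new_grid = []
    for i, (a, b) in enumerate(grid):
        if i in marked:
            mid = 0.5 * (a + b)
            new_grid.extend([(a, mid), (mid, b)])  # split into two equal cells
        else:
            new_grid.append((a, b))
    return new_grid

# With fraction=0.5 on two cells, only the larger-error cell is bisected.
print(refine_fixed_rate([(0.0, 0.5), (0.5, 1.0)], [1.0, 0.1], fraction=0.5))
```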
We refer to the adaptive method from section 2.2 as the "Classic" method and to the one from section 3 as the "Goal oriented" method.

Test problem
As a simple test problem with known global error dynamics we consider the linear system (40). We use $[t_0, t_e] = [0, 2]$ and vary the stiffness by $k < 0$.

DWR estimate
The unique solution to (40) is known analytically. We define the finite element space $V_h := \{u \in C([t_0, t_e]) : u|_{I_n} \in P_q(I_n)\}$, the space of continuous piecewise $q$-th order polynomials. Using $(\cdot, \cdot)_{I_n}$ as the standard $L^2$ scalar product over $I_n$, the entire left-hand side defines the bilinear form $A(u_h, \varphi_h)$. Let $z$ be the exact adjoint solution and $z_h = (z_1, z_2)$ its finite element approximation; we get $e_J = A(u_h, z - z_h)$. Approximating $z$ on a finer grid, we obtain the final error estimate $\eta_h(u_h)$ using the composite trapezoidal rule.

Numerical verification of Theorem 4
We first verify Theorem 4 for the goal oriented method. Figure 1 shows results for the Crank-Nicolson scheme with Implicit Euler for error estimation, the trapezoidal rule for quadrature and a range of different density functions. With $(p, \hat p) = (2, 1)$ and $r = 2$, we expect at least $e_J = O(\tau)$, which the plots clearly show. Further, we consider fourth order schemes $p = r = 4$ for the goal oriented adaptive method with time-dependent density functions $j$. We use Simpson's rule for quadrature and the classical Runge-Kutta scheme for time-integration. As embedded scheme we use the weights $\hat b = \frac{1}{3}(1, 1, 0, 1)^T$, which give a second order (third order for autonomous systems) solution. To get a fourth order solution at $t_n + \Delta t_n/2$, as needed by Simpson's rule, we use the weights $b^* = \frac{1}{24}(5, 4, 4, -1)^T$. As the test problem (40) is autonomous, we have $(p, \hat p) = (4, 3)$. With $r = 4$ and fourth order solutions at all evaluation points of the quadrature scheme, we expect $e_J = O(\tau) = O(N^{-4})$ from Theorem 4. This can be observed in Figure 2.
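The stated properties of the embedded weights can be checked against the standard Runge-Kutta order conditions. The sketch below verifies, for the classical RK4 tableau, that $\hat b = \frac{1}{3}(1,1,0,1)^T$ satisfies the conditions of orders 1 and 2 but violates $\sum_i \hat b_i c_i^2 = 1/3$, and that $b^* = \frac{1}{24}(5,4,4,-1)^T$ satisfies the dense-output conditions up to order 3 at $\theta = 1/2$, i.e. a local error of $O(\Delta t^4)$ at the midpoint:

```python
import numpy as np

# Butcher tableau of the classical Runge-Kutta scheme (RK4)
A = np.array([[0.0, 0.0, 0.0, 0.0],
              [0.5, 0.0, 0.0, 0.0],
              [0.0, 0.5, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
c = np.array([0.0, 0.5, 0.5, 1.0])

b_hat = np.array([1.0, 1.0, 0.0, 1.0]) / 3      # embedded weights
b_star = np.array([5.0, 4.0, 4.0, -1.0]) / 24   # weights for t_n + dt/2

# b_hat: conditions of orders 1 and 2 hold, the order-3 condition fails.
assert np.isclose(b_hat.sum(), 1.0)
assert np.isclose(b_hat @ c, 0.5)
assert not np.isclose(b_hat @ c**2, 1.0 / 3.0)  # 5/12 instead of 1/3

# b_star: dense-output conditions up to order 3 at theta = 1/2.
theta = 0.5
assert np.isclose(b_star.sum(), theta)
assert np.isclose(b_star @ c, theta**2 / 2)
assert np.isclose(b_star @ c**2, theta**3 / 3)
assert np.isclose(b_star @ (A @ c), theta**3 / 6)
print("order conditions verified")
```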

Method comparison and performance tests
We now compare the DWR method with the local error based classic and goal oriented methods. For the DWR method we additionally have the estimate $\eta(u_h)$ of the error $e_J$, which we denote by "DWR Est" in the figures; the actual error is denoted by "DWR Err". We only consider the final grid with DWR Est $= \eta(u_h) \le \tau$. We use second order time-integration for both DWR and the local error methods. As DWR requires a variational formulation, we use the Crank-Nicolson scheme for time-integration, and for the local error based methods Implicit Euler for error estimation. For quadrature we use the trapezoidal rule.
We compare the methods in terms of computational efficiency (error vs. computational time spent) and grid quality (error vs. number of timesteps). We consider the density functions $j(t, u) = u_1$ for $k \in \{-1, -100\}$ (Figures 3, 5) and $j(t, u) = u_2$ for $k = -1$ (Figure 4). Results for DWR are considered first; the local error based adaptive methods are then discussed based on the results of section 3.3.
Looking at Figures 3-5 and considering the actual error (DWR Err), we see that the method converges in the QoI for $\tau \to 0$, even though the timestep taken at $t = 1$ tends to infinity. This trend can be observed when looking at the timesteps over time in Figure 6, which form an upward cusp. We are, however, using an extremely small tolerance of $\tau = 10^{-14}$ and have the error $e_J \approx 4 \cdot 10^{-13}$, which is already close to machine precision. This shows that requirement (32) may, in practice, not be a strict requirement for convergence in the QoI for some problems. In the case of $k = -1$ and $j(t, u) = u_2$, see Figure 4, we do control the error in the fastest process with the goal oriented method, so the chosen timesteps sufficiently resolve all processes. The results show that the two local error based methods produce grids of identical quality and require the same computational effort.
For $k = -100$ and $j(t, u) = u_1$, see Figure 5, similarly to the case of $k = -1$ we do not control the fastest process, but the impact of $u_2$ on $J(u)$ is small. It turns out the efficiency gained by not fully resolving the second component is worth the additional error, resulting in a method that is more efficient by a factor of around two.

Convection-diffusion equation
Moving to a problem involving a spatial component, we look at a linear convection-diffusion equation. We want to model the case of error build-up in the nullspace of $j(t, u)$ which is then transported into its image. We consider the domain $\Omega = [0, 3] \times [0, 1]$ and restrict the source term $f$ to $\Omega_f = [0.25, 0.75] \times [0.25, 0.75]$. As QoI we consider a functional whose density $j(t, u)$ vanishes on the source region. The source term provides a spike-shaped build-up in the first 2 time units. The remaining parameters are $a = 0.5$, $\gamma = 0.01$, $c = 0.15$ and $v = (1, 0)^T$. We use the initial condition $u_0(x) = 1$. Since we do not have an analytical solution, we use as reference the solution from the classic adaptive method with $\tau_{\mathrm{ref}} = \tau_{\min}/10$, where $\tau_{\min}$ is the minimal tolerance for which tests are done.

Discretization
For our convection-diffusion problem we have a weak solution, see [19]. We discretize time along the points $t_n$ with $I_n := [t_n, t_{n+1}]$ and space by $2 \cdot (3 \cdot 32) \cdot 32 = 6144$ regular triangular cells $K$ defining the finite element mesh. We define the global finite element space using $P_q(I_n)$, the space of polynomials on $I_n$ of degree up to $q$, and $Q_1(K)$, the space of polynomials on $K$ with partial degrees up to 1. In this space the variational formulation becomes $A_h(u_h, \varphi_h) = F(\varphi_h)$ for all $\varphi_h \in V_{h,k}$, with the corresponding bilinear form $A_h$.

DWR Estimate
We approximate the exact adjoint $z$ using a finer grid in time and split the resulting error representation by timesteps.
We use the composite trapezoidal rule to get the error estimate.
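As a sketch, the composite trapezoidal rule on a (possibly nonuniform) time grid, as used for the error estimates here, reads:

```python
def composite_trapezoid(t, g):
    """Integrate samples g(t_n) over a possibly nonuniform grid t
    with the composite trapezoidal rule."""
    total = 0.0
    for n in range(len(t) - 1):
        dt = t[n + 1] - t[n]
        total += 0.5 * dt * (g[n] + g[n + 1])
    return total

# Exact for linear integrands: the integral of 2t over [0, 1] is 1.
print(composite_trapezoid([0.0, 0.3, 1.0], [0.0, 0.6, 2.0]))  # approx. 1.0
```

On an adaptive grid the cell widths $\Delta t_n$ vary, which is why the composite form with per-interval weights is needed rather than a fixed-step quadrature.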

Method comparison and performance tests
We use the same schemes for time-integration as in section 4.1.2. We again compare DWR with the two local error based adaptive methods. The way we set up the problem, we expect the goal oriented method to perform poorly: as the source term lies in the nullspace of $j(t, u)$, the resulting timesteps will not sufficiently resolve it. The convection transports the build-up from the source term, and its error, into the image of $j(t, u)$. This leads to an increase in error which can no longer be controlled by the step-size. The results can be seen in Figure 8. One can observe that the classic adaptive method performs fine, while the goal oriented adaptive method shows the expected poor performance. In Figure 9 one can see that the timesteps chosen by the goal oriented method are too large to resolve the source term. Nevertheless, we have convergence in the QoI with $e_J = O(\tau)$, as predicted by Theorem 4, see Figure 9.
The DWR method is computationally expensive, but gives high quality grids. Here, we used it only to adapt the grid in time, to get a fair comparison with the other methods.

Conclusions
We proved convergence of the error in the QoI for tolerance to zero under weak assumptions on the coefficients for RK schemes. The structure of our proof allows us to immediately conclude the same result for closely related controllers.
We further derived guidelines to predict the performance of the goal oriented method in relation to classical adaptive methods. These are based on analyzing global error propagation with respect to the nullspace of the error estimator. The goal oriented adaptive method does not regard errors in the nullspace of the error estimator when choosing timesteps. If processes in the nullspace are not sufficiently resolved by the chosen timesteps and the resulting error affects the QoI due to global error propagation, the performance of the goal oriented method will suffer. The goal oriented method will perform well if all relevant processes are sufficiently resolved. Using these guidelines requires sufficient knowledge of the global error dynamics of a problem.
In numerical experiments designed to test these guidelines, we confirm the results on convergence rates and that the guidelines hold true for our test-cases. We test a linear system of two variables with varying stiffness for various QoIs. As more complex test cases we have a 2D convection-diffusion equation with a source term outside the QoI, and two coupled heat equations with varying coefficients, with the heat transfer over the interface as the QoI.
The tests show that it is easy to correctly predict bad performance of the goal oriented method.It is, however, hard to predict if the goal oriented method will perform better than a classical norm-based adaptive method.
The results further show that the local error based adaptive methods perform better than the DWR method. The goal oriented method performs well in many cases; it is, however, not recommended as a black-box solver for general goal oriented problems.

For stage evaluation points $t_n + \gamma_k \Delta t_n$ with $\gamma_k \in (0, 1]$, one has to modify the RK order conditions in the following way:

Figure 5: Performance comparison of the various methods for problem (40) for k = −100 and j(t, u) = u 1 .